Friday, December 12, 2008

Data access acceleration: HW- or SW-based grid?

For once, I thought I'd blog on a technical subject :-)

The question on my mind is this: which is the better option for large-volume data management, a hardware solution, a software solution, or a combination of the two?

About two months ago, at the Oracle OpenWorld conference, I watched Larry Ellison announce the HP Oracle Database Machine. During the presentation I was bemused, both by the idea that this was Larry's / Oracle's second venture into hardware territory after the NC, and by the fact that for Oracle and HP to come up with this, there had to be a real problem with proliferating data.

As most readers will be aware, the last few quarters have seen tremendous activity around virtualisation, grid computing, cloud computing and high-volume data management. What are the options?


Hardware options: Generally addressed through virtualisation and grid computing efforts, as well as custom-built machines (e.g. the HP Oracle Database Machine). While this is a clean way to get to an array structure (CPU slices + storage arrays, etc.), I am not really sure it is "efficient", simply because the core software / application was never written to take advantage of it.

Software options: For a "pure" software option to work, there have to be two important components: a mechanism for caching (in-memory caches) and a mechanism for load balancing / splitting work into parallel processing threads.
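A minimal sketch of those two ingredients together, using a purely illustrative in-memory cache and a thread pool (the class and function names here are made up for illustration, not any vendor's API):

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative in-memory cache: serve from memory, load from the
# backing store only on a miss.
class InMemoryCache:
    def __init__(self):
        self._store = {}

    def get_or_load(self, key, loader):
        if key not in self._store:      # cache miss: hit the slow backing store
            self._store[key] = loader(key)
        return self._store[key]

# Illustrative load splitter: fan a batch of keys out across worker threads,
# each thread consulting the shared cache.
def parallel_fetch(cache, keys, loader, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda k: cache.get_or_load(k, loader), keys))
```

On a second pass over the same keys, every lookup is served from memory and the backing-store loader is never called again, which is the whole point of the caching half.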


In my opinion, there is going to be a push for a combination of the two: a distributed service architecture that manages the growing SOA / message structures inherently, through a combination of GRID and CACHE (in-memory).
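The grid half of that combination essentially comes down to partitioning the keyspace across nodes, so that each node holds and caches only its own slice of the data. A toy illustration of deterministic key-to-node routing (the node names are made up):

```python
import hashlib

# Toy partitioner: deterministically map a key to one of N grid nodes.
# Because every caller computes the same owner for a given key, all
# reads and writes for that key land on the same node's in-memory cache.
def owner_node(key, nodes):
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

nodes = ["node-a", "node-b", "node-c"]
```

Real products use more elaborate schemes (consistent hashing, backup copies for failover), but the routing idea is the same.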


I see three major players in the market today with fairly similar / competing offerings;
  1. Oracle with its COHERENCE offering (an object-oriented in-memory DB cache), which is actually a product / company acquisition: Tangosol;
  2. Gemstone with its GEMFIRE offering (again an OODB / in-memory DB cache); and
  3. Gigaspaces with its XAP offering.
The only issue I see with these offerings is that all the I/O needs to be re-configured / re-written against product-specific APIs to make use of the new features. This is a big issue. The questions to ask are: what is going to force the developers of COTS data access / reporting / application products (e.g. Oracle, IBM-Cognos, SAP-BO, etc.) to provide this API access? And why would they invest in it during a downturn, when no "new" product licenses are visible?
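To make the re-write concrete: a data access path written against SQL has to be re-expressed against the cache product's own object API. The sketch below contrasts the two styles using a stand-in, hypothetical cache interface (deliberately not Coherence's, GemFire's or GigaSpaces' actual API):

```python
# Traditional style: the application speaks SQL to the database.
def get_balance_sql(conn, account_id):
    cur = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,))
    return cur.fetchone()[0]

# Grid/cache style: the same read goes through an object put/get API,
# so every SQL-based code path like the one above must be rewritten.
def get_balance_cache(cache, account_id):
    account = cache.get(("accounts", account_id))  # object lookup, not SQL
    return account["balance"]
```

Multiply that small change by every query in a large COTS reporting suite and the scale of the vendor's investment becomes obvious.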

You can also see similar postings in the cross link at http://calsoftblog.blogspot.com/
