HANA vs Exalytics: Where two fight, the third one wins? Or is there really a fight?

The recent release of Oracle Exalytics brought a third round of the verbal fight between SAP and Oracle, along with the surrounding commentary from the press, bloggers and forum users. On a closer look you will find most of the arguments and comparisons pretty shallow and based more on rhetoric than on facts, whether they come from the SAP side or the Oracle side. They keep the discussion lively and catch the attention of many, but certainly do not please us left-brained technologists. One thing is for sure: megaplayers like SAP and now Oracle have directed everyone’s eyes at the long-existing, but previously niche, market of in-memory databases. Smaller but established players do not plan to waste the topic’s 15 minutes of fame, like Mike Allen, VP Product Management at Terracotta, cited in “Big Data Buyer’s Guide…” by Enterprise Apps Today:

At the end of the day SAP HANA and Oracle’s solutions are still databases – running on dedicated hardware that has to be sized to the problem in hand – that are accessed across a network. […] Terracotta BigMemory stores data right where it’s used, in the memory where the application runs. This makes it much faster.

Well, good luck to the Terracottas, VoltDBs, Altibases etc., but for the moment let us move back to the SAP HANA vs Oracle Exalytics debate. Indeed, at the moment SAP HANA is much closer in comparison to Oracle Exalytics than to Oracle Exadata or to the generic Oracle Database. I said “in comparison”, and not “in competition”, because to compete the products should stand against each other in customer bids, and not in vendors’ verbal arguments. So far I haven’t seen a single case where HANA and Exalytics stood against each other in a customer’s choice. I obviously do not see everything that is happening on the market, and will therefore greatly appreciate any additional facts in the comments to this post. On the surface there are many similarities between the two:

  • both use the word “in-memory” to describe the main characteristic of their main software component,
  • both are sold only as a software+hardware combo: Oracle Linux and Oracle Sun hardware with Intel Xeon chips in the case of Exalytics, and SuSE Linux and Xeon-based hardware from six (and counting) SAP hardware partners in the case of HANA,
  • SAP’s initial and most promoted use cases for the HANA database were analytical data marts and planning applications; the same (but only these) are the use cases for Exalytics,
  • both are Fast Data solutions that need to be coupled with Big Data solutions: Exadata in the case of Exalytics, or a database from the Sybase family in the case of HANA.

So, what makes them different, and why don’t we really see a fight on the market between the two, although both vendors are surely searching for a prominent example to announce their superiority and victory?

  1. First and foremost: the position of the product on the vendor’s priority list. HANA is the center of SAP’s technology strategy: the future heart of all things SAP, plus a platform with which SAP tries to attract ISVs. For Oracle, Exalytics is just an answer to the first technical use case released by SAP for HANA, i.e. high-performing analytics on data marts.
  2. From the go-to-market perspective, I see both as products to be up-sold or cross-sold, and therefore positioned within each vendor’s own customer base, rather than going into brand new industries and customers and clashing there. There are certainly examples of HANA and Exalytics sales into new accounts, but I do not see these happening in the volumes both vendors are after.
  3. Exalytics has BI engines and tools pre-installed, while HANA doesn’t. Those who have followed HANA’s history right from the beginning may remember that SAP initially wanted to include the BusinessObjects BI tools run-times on the HANA appliance as well, but in the end did not. Instead, today the HANA database has a lightweight application server (called XS Engine) built in. As such, Mark Hurd of Oracle was right in the last Q3/2012 earnings call: “… one of the nice things about Exalytics is you plug it in and your existing Hyperion EPM system runs faster. Your existing Oracle BI runs faster. With HANA, one of the problems is you plug in HANA and then you start to write programs, because it’s not compatible with anything.” The thing is, though, that with HANA you can obviously write programs, but you can also plug it into some of the SAP applications: ERP accelerators, SAP NetWeaver BW, SBO Analytic Applications.
  4. And that’s another difference: Exalytics works only with Oracle software, and HANA’s strongest case is interoperability with SAP products. The HANA database will not work with Hyperion EPM (even if provocatively invited by SAP), and Exalytics cannot be used with SAP BW. Interestingly, though, the SAP BW use case now puts the SAP HANA database into competition with Oracle Exadata, which has been certified for SAP NetWeaver installations.


Filed under Exalytics, HANA, Oracle, SAP

BW 7.3 Orange (powered by SAP HANA) is just the beginning

In the podcast “Debating the Value of SAP HANA” mentioned in my previous post, we got into a discussion of whether SAP NetWeaver BW 7.3 SP5 powered by SAP HANA (project “BW Orange”, or simply “BW-on-HANA”) is the killer app for SAP in-memory technology or not. My answer was “Yes”, but here I would like to go into the shades of gray beyond this binary question.

There are very few customers out there who would not complain about some aspect of SAP BW performance. I have been presenting sessions on BW performance at different conferences since 2008, and they have always resonated with the audience. BW-on-HANA, which comes with the major promise of a performance boost for both the front-end (querying/planning) and data warehousing parts, will resonate with almost every BW customer. I see it in my day job as a consultant.

It is a no-brainer that reading all data at the speed of RAM, instead of disks, will speed up lots of I/O-heavy processing. You may share my memories of frustration while waiting for long-running BW operations (queries, DTPs, extractions), where the only proof that the work session was still alive was to go to transaction DB02, find your running SQL, and watch the slowly but steadily increasing numbers under “blocks read”. This should be a thing of the past with BW running on a HANA system. The advantages of in-memory data retrieval are obvious, and there should be no discussion about this aspect of BW Orange.

But the I/O bottleneck is not the only reason for slow performance. According to SAP’s own statistics presented during SAP TechEd’10 USA, in ERP about 20% of the processing happens in the database layer, and the remaining 80% in the application (ABAP) layer. Obviously, in this case a super fast database can address only that 20% share of the performance issues. Although the 20/80 ratio will be different for BW, as more heavy joins or massive inserts are done there, the problem remains the same. Through the years BW (like most other ABAP-based systems) was built to run lots of the intensive processing in the application layer, the Analytics (OLAP and Planning) Engine being the prominent example. DSO activations, SID calculations and user exits are some more examples from the data warehousing part of BW.
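The 20/80 argument above is essentially Amdahl’s law (my framing, not something SAP presented). Here is a minimal Python sketch, assuming the database share of the runtime is the only part that gets faster; the function name and parameters are illustrative only:

```python
def overall_speedup(db_fraction: float, db_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only the database share gets faster.

    db_fraction -- share of total runtime spent in the database layer (0..1)
    db_speedup  -- how many times faster the database layer becomes
    """
    if not 0.0 <= db_fraction <= 1.0:
        raise ValueError("db_fraction must be between 0 and 1")
    if db_speedup <= 0:
        raise ValueError("db_speedup must be positive")
    # The application (ABAP) share is untouched; the database share shrinks.
    return 1.0 / ((1.0 - db_fraction) + db_fraction / db_speedup)


if __name__ == "__main__":
    # With a 20% database share, the ceiling is 1 / 0.8 = 1.25x,
    # no matter how fast the database becomes.
    for factor in (10, 100, 1000):
        print(factor, round(overall_speedup(0.2, factor), 3))
```

Even an infinitely fast database cannot take a workload with a 20% database share beyond 1.25 times its original speed, which is why pushing application logic down into the database matters so much.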

Therefore the BW Orange release introduces some application optimizations, which are specific to SAP HANA and cannot be reused by BW running on any other database system. You may have heard about those by now:

  1. HANA-optimized InfoCubes, i.e. a different technical design of the underlying database tables and a different querying technology,
  2. HANA-optimized DSOs and the new activation process executed in the db layer,
  3. New Planning Applications Kit, which allows execution of BW-IP standard functions within the HANA db.

As you see, the list of optimizations does not cover all BW components. Additionally, the optimizations are not 100% applicable: in some cases table layouts remain the same as on regular databases and processing is executed in the ABAP layer, as in the case when a DSO is supplied by RDA (real-time data acquisition); see OSS Note 1665322 – Conversion for SAP HANA-optimized DataStore objects. In some other cases “Remodeling might be required to leverage Hana optimization when starting from an existing scenarios” (OSS Note 1637199 – Using the planning applications KIT).

The human brain and hands are not yet completely eliminated, and we are all just at the beginning of an extremely interesting journey.


Filed under BW, HANA, Rant, SAP

Big Data and SAP HANA? Or Sybase IQ?

Like a few other folks, I think there was some kind of misunderstanding in putting Big Data and SAP HANA into one bag. We touched on this topic in the recent podcast “Debating the Value of SAP HANA”, but I would like to spend a few more minutes here to explain my thoughts.

SAP HANA has been created with traditional SAP Business Suite and Business Warehouse (BW) customers in mind. How big is the biggest single SAP software installation in the world in terms of single-store data size? I do not know exactly. The times of the proud “Terabyte Club” are in the past. Four years ago there was a lot of noise about the 60TB BW test SAP did. The biggest customer I worked with had a 72TB database of BW data. So, I would assume that the biggest SAP instance is somewhere close to 120 TB. That’s still a lot of data not just to process, but also to manage (think back-ups, system upgrades, copies, disaster recovery etc.)… Despite its current technical limitations (8TB as the biggest certified hardware configuration and a limit of 2 billion records in a single table partition), SAP HANA is on the way to help SAP ERP and BW customers with those challenges. But those are not what the industry calls “Big Data”.

Here are the main differences as I see them:

  • The data sizes we are discussing with SAP HANA are in the ballpark of a few terabytes, while Big Data currently means something in single-digit petabytes. E.g. HP Vertica has 7 customers with a petabyte or more of user data each, according to Monash Research.
  • The current focus of SAP HANA is structured data, while Big Data issues are generated mostly by unstructured data: web, scientific, machine-generated. It is fair to mention, though, that SAP is working on Enterprise Search powered by HANA, as Stefan Sigg, VP In-Memory Platform at SAP, told me during this TechEd Live interview.
  • Currently Big Data processing is almost synonymous with the MapReduce software framework, where huge data sets are processed by a big cluster of rather cheap computers. SAP in-memory technology, on the other hand, requires “a small number of more powerful high-end [servers]” according to Hasso Plattner’s book “In-Memory Data Management: An Inflection Point for Enterprise Applications”.
  • Related to the point above: the promise of SAP HANA is real time, where a fact is available for analysis sub-seconds after its occurrence. In Big Data, processing is mostly batch-based. My previous blog post became available in Google Search results and in Google Alerts only 4 days after being posted – not quite real-time, huh?
  • SAP HANA data analyses are most often paired with SAP BusinessObjects Explorer – modeless visual data search and exploration. The use of MapReduce libraries on top of Big Data requires advanced programming skills.
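To give a feel for the MapReduce programming model mentioned in the list above, here is a toy, single-process Python sketch of the classic word count. It only mimics the map, shuffle and reduce phases; it is not the API of Hadoop or of any other real framework, and all function names are my own illustration:

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse the list of values of each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence, in one process."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "fast data"]))
```

In a real cluster the map and reduce calls run in parallel on many cheap machines and the shuffle moves data across the network, which is where both the batch character and the required programming skills come from.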

During his SAPPHIRE’11 USA keynote speech, Hasso Plattner mentioned MapReduce as a road map feature for SAP HANA, but since then I haven’t gotten any specifics on what it means. Instead, the quietly announced Release 15.4 of Sybase IQ has introduced some features focused on the analysis of Big Data in its original meaning. Is there a silent revolution going on at the Sybase side of SAP, while all eyes are on the HANA product?


Filed under HANA, SAP

Is SAP HANA about the “in-memory database”?

Disclaimer: this post is not meant to be easily digestible, so please stay with me through the text and let’s have a discussion after that.

What is SAP HANA?

When in May 2010 I first heard Hasso Plattner, Chairman of the Board at SAP, talking about the in-memory revolution they were planning with the SAP HANA product, I scratched my head. I had already been working with SAP NetWeaver BW Accelerator (BWA) for 4 years, and it was obvious that HANA was the continuation of the same technology. But what made me curious was why, out of the three major principles underpinning the technology (massively parallel processing (MPP), a columnar data store, and an in-memory data store), SAP had chosen the last one as the flagship feature of the new product. It was not clear to me at that time. I decided that it must be due to the fact that there were products already strongly identified with columnar data representation (like Sybase IQ or Vertica) and with analytic MPP processing (like Teradata or HP Neoview), while in-memory databases, like TimesTen, Altibase or solidDB, were not that well known to a broader audience.

For the last couple of years we’ve seen SAP’s effort to reclaim the “innovative” adjective next to the company name. So, using “in-memory”, an existing but not that widely known technology, seemed to be a good match for “innovation”. As we saw during the last year, HANA was indeed used successfully by SAP marketing to generate lots of “game-changing”, “revolutionary”, “deliciously disruptive” buzz. This buzz was picked up by many. So, it was quite interesting to read the contradictory statement made by the analyst Dennis Gaughan at Gartner Symposium (source):

… Gaughan said none of the four vendors [IBM, Microsoft, Oracle, SAP] are “re-imagining” IT, as per the theme of the Gartner conference.

“You won’t find innovation in their product portfolio,” he said. “You might find it if you try and talk to the research parts of these organisations.”…

Indeed, for those of us with a broader and deeper technical view, the question remained open: “What makes SAP HANA an innovative product among the many existing in-memory database management systems?” I do not think this question has been fully answered by SAP so far. Let me share my understanding and thoughts here.

Firstly, in my opinion it is not so much the technology as the ultimate promise that is visionary: running transactional and analytic systems on a single platform with a single store of data. The whole of data warehousing, as we know it, was born from a need to remove the analytic workload from the transactional systems. In addition, transactional data structures were transformed into analysis-optimized ones (like star schemas or OLAP cubes) along with data enrichment. Then ETL systems came into place to remove the data transformation workload from the data warehousing systems. Now SAP promises to bring everything back into one system (see graph below), making separate ETL and EDW systems (and much of the related skills and expertise) obsolete. This will be a huge change, yet from my discussions with SAP customers it was not clear whether they had gotten it. Many of them want to have the SAP HANA database just for the sake of running ERP faster. Again, that is not what is revolutionary about the SAP vision to be delivered thanks to the HANA platform.

OLTP and OLAP systems today require not only separate computing resources, but also different data structures optimized for specific profiles of queries. SAP’s promise is that once transactional (e.g. ERP) and analytic (e.g. BW) systems are running on a single HANA platform, they will be using a single copy of the data. All additional data modifications required, for example by the analytics part of the system, like data cleansing, transformation and enrichment, will be done on the fly during each execution of queries [VitalBI: I bet there is going to be some kind of results caching, even if some guys in SAP marketing disagree]. In-memory data storage together with in-database calculations, append-only tables and multi-core processing are all features which are going to help SAP achieve the “single business platform” promise.

What is different compared to other in-memory database management systems is SAP’s ambition to bring in-memory technology to the next level: Enterprise. It means not only specific and limited use cases, but mixed-workload, large-scale, high-volume scenarios.

Secondly, there is not enough information about the innovation in the technology being developed by SAP. You will not find many white papers from SAP describing what is under the hood of the new database. Just storing data in RAM, and treating it as faster storage, is nothing new. Sybase ASE, the database acquired by SAP last year, has an “In-memory database” option. SAP HANA certainly has to offer something better.

My discussion with Franz Faerber, SAP HANA chief architect, at SAP Influencer Summit last summer helped me get a somewhat deeper view into the technology, beyond the obvious things. In a nutshell, the two major drivers behind SAP HANA technology were:

  1. “RAM is slow” (And you thought “in-memory” is about storing data in RAM??)
  2. “CPU clock frequency reaches its growth barrier”

In SAP HANA everything is about performance, which is a prerequisite for real-time data processing. Even if RAM is faster than ‘spindle’ hard drives, CPUs still waste cycles while waiting for data from RAM. Therefore the optimization goal is to reduce the idle cycles by making sure that there is as much useful data in the CPU caches as possible. The HANA database has to be coded using CPU-cache-aware algorithms that process CPU-cache-optimized data structures. Well, back in 2006 Jim Gray from Microsoft discussed this principle in his famous presentation “RAM Locality is King”.

Most of the data in SAP HANA databases is stored in a columnar and compressed format. This data still has to be converted to records during processing, so it is important that this step happens as late as possible – something called late materialization. Ideally, operations on the data should be able to run directly on the compressed data, without the need to decompress it.
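To make the two ideas above more tangible, here is a small Python sketch of a dictionary-encoded column store. It is a generic textbook illustration and not HANA’s actual internal format; the function names and sample data are made up. The filter runs on the integer codes (i.e. on the compressed data), and records are only assembled for the rows that survive the filter:

```python
from typing import Dict, List, Tuple

# One encoded column: (sorted dictionary of distinct values, one code per row).
EncodedColumn = Tuple[List[str], List[int]]


def dictionary_encode(values: List[str]) -> EncodedColumn:
    """Replace each value by a small integer code into a sorted dictionary."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


def matching_rows(column: EncodedColumn, wanted: str) -> List[int]:
    """Filter on the compressed data: translate the predicate to a code once,
    then compare integers only, without decoding any row."""
    dictionary, codes = column
    if wanted not in dictionary:
        return []
    wanted_code = dictionary.index(wanted)
    return [row for row, code in enumerate(codes) if code == wanted_code]


def materialize(columns: Dict[str, EncodedColumn], rows: List[int]) -> List[Dict[str, str]]:
    """Late materialization: build records only for the selected rows."""
    return [
        {name: dictionary[codes[row]] for name, (dictionary, codes) in columns.items()}
        for row in rows
    ]


if __name__ == "__main__":
    table = {
        "country": dictionary_encode(["DE", "US", "DE", "PL"]),
        "product": dictionary_encode(["A", "B", "C", "A"]),
    }
    hits = matching_rows(table["country"], "DE")
    print(materialize(table, hits))
```

The fewer rows pass the filter, the less decoding and record assembly is needed, which is the whole point of materializing late.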

As mentioned in the previous paragraphs, in HANA everything is about performance, so when the growth in clock speed slows down, the search for performance moves to multi-core CPU processing. It is the worst-kept secret on the market that about a dozen developers from Intel spent months in SAP’s offices coding the core of SAP in-memory technology to use all possible features of the Intel Xeon architecture: HyperThreading, Intel Turbo Boost, Threading Building Blocks. That’s why the SAP HANA database can achieve its top performance only when running on bare metal on Intel Xeon CPUs, and not on other platforms or in a virtualized environment.

Last, but not least: the SAP HANA database is in fact a hybrid database. RAM is used as the primary data store, but there are still SSDs or spindle drives used for data persistence, for example in case of a power loss. I saw some customers being surprised when facing SAP HANA hardware with external storage besides lots of RAM.

At SAP’s invitation I am going to attend SAP Influencer Summit on December 13-14, and I am looking forward to it as a chance to get a layer deeper into what makes SAP in-memory technology truly a step forward compared to others, and how they are going to overcome some remaining technology barriers.


Filed under HANA, SAP

About in-memory technology and SAP HANA at SBOUC’11

It’s been an intensive year. It is not over yet (no rush!), but here we are – in the month of October and presenting at the 4th conference this year.

In-memory technology has been a hot subject this year. So, no surprise, this was the topic of Dave’s and my session at the recent ASUG SAP BusinessObjects User Conference (SBOUC) as well.

Below is the presentation for your review and download. We hope you find it useful.

If you cannot open links embedded into the presentation above, here they are:

Any comments or questions – please use the Reply section below.


Filed under HANA, HP, SAP

Hello BI world!

I have arrived 🙂

I mean I’ve been in the blogosphere before with my SCN posts, but now it’s time to bring blogging to a new level with this “Vital BI” site.
The time is perfect: there are lots of discussions going on in our area. Last week I was listening to the announcement of Exalytics at Oracle OpenWorld’11, and now I’m at the ASUG SAP BusinessObjects User Conference with a chance to have some first-hand discussions. Let’s start then.


Filed under Miscellaneous, Rant, Uncategorized