Tag Archives: database

Calculating number π by throwing darts: digitally in SAP HANA

In my previous blog post I promised to explore Big Data on SAP HANA, express edition. But remember, Big Data is not only about Volume, but about Variety (of data types) as well. That is the route I chose first: a look at the fun stuff you can do with spatial data processing in SAP HANA.

Ever since I enjoyed the “Calculating Pi with Darts” video from Physics Girl and Veritasium [which you should watch too!], I have thought about repeating the experiment. The world is going digital, so obviously I meant using SAP HANA for that. I know I should have done it on Pi Day (3/14/16, the 14th of March 2016), but better late than never!

Calculating pi with darts is one of the Monte Carlo methods of approximating its value. According to Wikipedia, “[…] method for computing π is to draw a circle inscribed in a square, and randomly place dots in the square. The ratio of dots inside the circle to the total number of dots will approximately equal π/4”. That ratio is simply the ratio of the two areas: a circle of radius r covers πr², while the enclosing square covers (2r)² = 4r².
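To make the method concrete outside of any database, here is a minimal sketch in plain Python (not SAP HANA code). The function name `estimate_pi` and the `seed` parameter are assumptions made for this illustration only. It throws random points into the unit square and counts those that land inside the circle of radius 0.5 centered at (0.5, 0.5).

```python
import random


def estimate_pi(attempts, seed=None):
    """Monte Carlo estimate of pi from random points in the unit square.

    Hypothetical helper for illustration; it is not part of SAP HANA.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(attempts):
        x, y = rng.random(), rng.random()
        # A dart hits the board if it lies within radius 0.5 of (0.5, 0.5).
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25:
            hits += 1
    # hits / attempts approximates pi / 4, so scale by 4.
    return 4 * hits / attempts


if __name__ == "__main__":
    print(estimate_pi(2000, seed=42))
```

The result changes from run to run unless a seed is fixed, which is exactly the behaviour to expect from the dart-throwing experiment.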

SAP HANA’s spatial capabilities looked like a perfect fit for that. If you are not familiar with spatial processing, I have prepared four introductory tutorials that should take you no more than 20 minutes to complete and that cover all the basic concepts needed to follow the rest of this blog. And if you do not have SAP HANA Express yet, it takes 10 minutes to get it. Alternatively, you can use an SAP HANA MDC instance in your HCP Trial account, as we are still not talking about huge volumes of data here.

  1. Points: http://www.sap.com/developer/tutorials/hana-spatial-intro1-point.html
  2. Lines and strings: http://www.sap.com/developer/tutorials/hana-spatial-intro2-string.html
  3. Areas and polygons: http://www.sap.com/developer/tutorials/hana-spatial-intro3-polygon.html
  4. Spatial columns in tables: http://www.sap.com/developer/tutorials/hana-spatial-intro4-columns.html

Virtual dart hits are points with random X and Y coordinates (ST_Point objects). The dartboard is a disk (ST_Buffer() around the ring’s center point). The estimate then comes from the average share of hits that land within the area of the disk (ST_Within() method).

First I need a table with a spatial column, which will store the coordinates of my digital hits, plus a procedure to populate this table with the required number of attempts.

CREATE SCHEMA "TESTSGEO";
SET SCHEMA "TESTSGEO";

--DROP TABLE "TESTSGEO"."SPATIAL_CALCPI";
CREATE COLUMN TABLE SPATIAL_CALCPI(
	POINT ST_POINT
);

--DROP PROCEDURE "TESTSGEO"."COLLECT_HITS";
CREATE PROCEDURE collect_hits (IN attempts INT)
 LANGUAGE SQLSCRIPT AS
 iter INTEGER;
 BEGIN
    iter := 1;
    -- Each attempt inserts one point with random X and Y between 0 and 1
    WHILE iter<=attempts DO
        INSERT INTO "TESTSGEO"."SPATIAL_CALCPI" VALUES (new st_point(RAND(), RAND()));
        iter := iter+1;
    END WHILE;
    -- Merge the delta storage into the main storage after the inserts
    MERGE DELTA OF "TESTSGEO"."SPATIAL_CALCPI";
 END;

Now let’s throw 2000 virtual darts and check what approximation of pi we get!

--TRUNCATE TABLE "TESTSGEO"."SPATIAL_CALCPI"; 
CALL "TESTSGEO"."COLLECT_HITS"(ATTEMPTS => 2000);

--Check the results of throwing: coordinates and if hit dartboard
SELECT
  POINT.ST_asWKT(), 
  POINT.ST_Within(NEW ST_Point(0.5,0.5).ST_Buffer(0.5)) as IN_CIRCLE 
FROM "TESTSGEO"."SPATIAL_CALCPI";

--Calculating PI using Monte Carlo formula
SELECT
  4*AVG(POINT.ST_Within(NEW ST_Point(0.5,0.5).ST_Buffer(0.5))) as PI 
FROM "TESTSGEO"."SPATIAL_CALCPI";

The results I got in my system were between 3.11 and 3.21. Well, a very rough approximation of the number π 🙂

Let’s visualize the results by generating SVG with a dartboard and all generated hits.

SELECT
  ST_UnionAggr(POINT).ST_Union(NEW ST_CircularString('CIRCULARSTRING(0 0.5, 1 0.5, 0 0.5)')).ST_asSVG() AS DARTBOARD 
FROM "TESTSGEO"."SPATIAL_CALCPI";

I made a minor modification to the SVG to draw the circle in red.


Then I tried 50000 attempts, but the result was 3.1168. So, not much improvement over the previous attempts.
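How much improvement is reasonable to expect from more darts? As a rough guide, and again in plain Python rather than SAP HANA code, the sketch below computes the theoretical standard error of the estimate, assuming every dart is an independent, uniformly random point. The function name `standard_error` is an assumption made for this illustration. Each dart is a hit with probability π/4, so the spread of 4 × (share of hits) shrinks with the square root of the number of attempts.

```python
import math


def standard_error(attempts):
    """Theoretical standard error of the dart estimate of pi.

    Hypothetical helper: assumes independent, uniformly random darts.
    It uses the true value of pi only to gauge the spread.
    """
    p = math.pi / 4  # probability that a single dart lands in the circle
    return 4 * math.sqrt(p * (1 - p) / attempts)


if __name__ == "__main__":
    for n in (2000, 50000):
        print(n, round(standard_error(n), 4))
```

Under these assumptions the typical error is about 0.04 for 2000 darts and about 0.007 for 50000, so 25 times more darts buy only a fivefold gain in precision.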

PS. Obviously, using the SAP HANA spatial method below to calculate the circumference of a circle with a diameter of 1 would be a much faster and more precise way to get pi. But – hey! – it would take away all the fun of throwing digital darts 😉

SELECT 
  NEW ST_CircularString ('CircularString (0 0.5, 0 1.5, 0 0.5)').ST_Length() as PI 
FROM DUMMY;

--Result is PI 3.141592653589793

Please let me know what pi numbers you got by throwing digital darts in your SAP HANA instances.

PS. Republished from my blog https://blogs.sap.com/2016/12/14/calculating-number-%CF%80-by-throwing-darts-digitally-in-sap-hana/


Filed under HANA

HANA vs Exalytics: Where two fight the third one wins? Or is there a fight really?

The recent release of Oracle Exalytics brought a third round of the verbal fight between SAP and Oracle, along with the surrounding social (press, bloggers, forum users) commentary. On a closer look you will find most of the arguments and comparisons pretty shallow and based on rhetoric rather than facts, be it from the SAP side or the Oracle side. Those keep the discussion lively and catch the attention of many, but certainly do not please us – technologists, who are left-brained. One thing is for sure: megaplayers like SAP and now Oracle have directed everyone’s eyes at the long-existing, but previously niche, market of in-memory databases. Smaller, but established, players do not plan to waste the topic’s 15 minutes of fame, like Mike Allen, VP Product Management at Terracotta, cited in “Big Data Buyer’s Guide…” by Enterprise Apps Today:

At the end of the day SAP HANA and Oracle’s solutions are still databases – running on dedicated hardware that has to be sized to the problem in hand – that are accessed across a network. […] Terracotta BigMemory stores data right where it’s used, in the memory where the application runs. This makes it much faster.

Well, good luck to the Terracottas, VoltDBs, Altibases etc., but for the moment let us move back to the SAP HANA vs Oracle Exalytics debate. Indeed, at the moment SAP HANA is much more comparable to Oracle Exalytics than to Oracle Exadata or to the generic Oracle Database. I said “in comparison”, and not “in competition”, because to compete the products should stand against each other in customer bids, and not in vendors’ verbal arguments. So far I haven’t seen a single place where HANA and Exalytics stood against each other in a customer’s choice. I obviously do not see everything that is happening on the market, and will therefore greatly appreciate any additional facts in the comments to this post. On the surface there are many similarities between the two:

  • both use the word “in-memory” to describe the major characteristic of their major software component;
  • both are sold only as a software+hardware combo, although with Oracle Linux and Oracle Sun hardware with Intel Xeon chips in the case of Exalytics, and with SuSE Linux and Xeon-based hardware from six (and counting) of SAP’s hardware partners in the case of HANA;
  • SAP’s initial and most promoted use cases for the HANA database were analytical data marts and planning applications; the same (but only these) are the use cases for Exalytics;
  • both are Fast Data solutions that need to be coupled with Big Data solutions – Exadata in the case of Exalytics, or a database from the Sybase family in the case of HANA.

So, what makes them different, and why don’t we really see a fight on the market between the two, although for sure both vendors are searching for a prominent example to announce their superiority and victory?

  1. First and foremost: the position of the product on each vendor’s priority list. HANA is the center of SAP’s technology strategy: it is to be the heart of all-things-SAP, plus a platform with which SAP tries to attract ISVs. For Oracle, Exalytics is just an answer to the first technical use case released by SAP for HANA, i.e. high-performing analytics on data marts.
  2. From the go-to-market perspective, I see both as products to be up-sold or cross-sold, and therefore positioned in the vendors’ own shops, rather than going to brand new industries and customers and clashing there. There are certainly examples of HANA and Exalytics sales into new accounts, but I do not see these happening in the volumes both vendors are after.
  3. Exalytics has BI engines and tools pre-installed, while HANA doesn’t. Those who have followed HANA’s history right from the beginning may remember that SAP initially wanted to include BusinessObjects BI tools run-times on the HANA appliance as well, but finally did not. Instead, today the HANA database has a lightweight application server (called XS Engine) built in. As such, Mark Hurd of Oracle was right in the last Q3/2012 earnings call: “… one of the nice things about Exalytics is you plug it in and your existing Hyperion EPM system runs faster. Your existing Oracle BI runs faster. With HANA, one of the problems is you plug in HANA and then you start to write programs, because it’s not compatible with anything.” The thing, though, is that with HANA you can obviously write programs, but you can also plug it into some of the SAP applications: ERP accelerators, SAP NetWeaver BW, SBO Analytic Applications.
  4. And that’s another difference: Exalytics works only with Oracle software, and HANA’s strongest case is in interoperability with SAP products. The HANA database will not work with Hyperion EPM (even if provocatively invited by SAP), and Exalytics cannot be used with SAP BW. Interestingly, though, the SAP BW use case now puts the SAP HANA database into competition with Oracle Exadata, which has been certified for SAP NetWeaver installations.


Filed under Exalytics, HANA, Oracle, SAP