The next wave in BI

Until 'Madison' arrives, Microsoft SQL Server runs faster on Linux

11/10/2008 13:51

Visitors to the latest BI summit of the Seattle-based software giant were awed by a first demo of project Madison, which is basically a Windows/SQL Server-based version of the technology from recently acquired appliance vendor DATAllegro. Microsoft showed how a 150 TB (yes, that's Tera, not Giga) database could perform on a 24-node MPP system. Unfortunately, the announced release date is 'first half of 2010', so, considering the source, we shouldn't expect anything ready for production before the holiday season of that year.

So what are your options if you're running a big SQL Server data warehouse and run into performance problems? Wait another two years, migrate to Oracle, or buy an appliance? Well, Microsoft spent $275 million for a reason, so they would hate to see their customers go for either of the latter two alternatives and would much rather have you wait eagerly for a Redmond-backed solution. Fortunately, you don't have to, because there are two interesting solutions available right now. Both let you offload your data to an analytical database that perfectly understands the T-SQL queries you fire at it.

The first one is ParAccel, a column-based, shared-nothing MPP software solution that can coexist with a SQL Server OLTP database and is able to reroute the analytical queries to the ParAccel cluster. ParAccel was the first column-based vendor to take on the TPC-H challenge and shattered all existing scores when they first published their results. Pricing, however, is pretty steep at $100K per terabyte of raw data (which puts that 150 TB machine at $15 million list), but my guess is that if you want to build it they'll probably grant you a nice discount.

An even simpler solution might be to buy a Dataupia box. This is an appliance that hooks into the database gateway, so setting it up and offloading your data is a trivial task. Unlike ParAccel, Dataupia really is an appliance, and it comes in 2 TB building blocks that are actually very cheap: only $20K apiece. Building the same 150 TB powerhouse this way takes 75 blocks and would cost you a mere $1.5 million, and that's even before discounts. Dataupia is not a column-based solution but has some interesting aggregation algorithms running that are supposed to improve performance as well. No TPC-H entries though, so how fast it really is will remain a mystery until you try it.

Now comes the interesting part: both architectures run on Linux! Dataupia on SUSE, ParAccel on Red Hat. Another similarity: both companies spring from the innovative minds of former Netezza founders Barry Zane (ParAccel) and Foster Hinshaw (Dataupia), which might add some extra credibility to these vendors. So if you hit that SQL Server data warehouse performance barrier and can't wait another 2 to 3 years: have a look at the (not so) new kids on the block. You might be surprised and get a lot of bang for the buck along the way.
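
For those who want to check the price tags quoted above, here is a minimal back-of-the-envelope sketch in Python. It uses only the list prices mentioned in this post ($100K per terabyte for ParAccel, $20K per 2 TB Dataupia block) and assumes, purely for illustration, that pricing scales linearly, that raw data size equals the capacity you need to buy, and that Dataupia blocks can only be bought whole. The function names are made up for this example, and real quotes will almost certainly differ.

```python
import math

# List prices as quoted in this post (before any discounts).
PARACCEL_PRICE_PER_TB = 100_000   # USD per terabyte of raw data
DATAUPIA_BLOCK_TB = 2             # capacity of one Dataupia building block
DATAUPIA_BLOCK_PRICE = 20_000     # USD per building block


def paraccel_cost(raw_tb):
    """List price for ParAccel, assuming strictly linear per-TB pricing."""
    return raw_tb * PARACCEL_PRICE_PER_TB


def dataupia_blocks(raw_tb):
    """Number of whole 2 TB blocks needed (assumption: no partial blocks)."""
    return math.ceil(raw_tb / DATAUPIA_BLOCK_TB)


def dataupia_cost(raw_tb):
    """List price for Dataupia, rounding capacity up to whole blocks."""
    return dataupia_blocks(raw_tb) * DATAUPIA_BLOCK_PRICE


if __name__ == "__main__":
    target_tb = 150  # the database size shown in the Madison demo
    print(f"ParAccel: ${paraccel_cost(target_tb):,}")
    print(f"Dataupia: {dataupia_blocks(target_tb)} blocks, "
          f"${dataupia_cost(target_tb):,}")
```

At list price that is a factor of ten between the two, which is why the missing TPC-H numbers for Dataupia matter: price per terabyte says nothing yet about price per query.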