IBM Blu – Very Fast – Very Easy – Very Cool
When I was first asked “Have you heard of IBM Blu?”, I immediately thought that was a redundant statement. Everyone knows that IBM is also known as “Big Blue”, so I was a little caught off guard. But, after doing a little digging and holding a call with the product managers, it became clear to me that IBM Blu is paving the way toward fast database analytics and access.
What do I mean by fast? Well, Handelsbanken, a Swedish bank providing universal banking services, saw a 100X improvement when implementing Blu in their data systems.
What do I mean by easy? Well, gone are the days of tuning, tweaking, and indexing. Blu handles the IO efficiently and intelligently without ANY human intervention.
For today’s companies, fast and accurate business analytics are critical to success. When executives need answers, they do not want to wait. The longer the answers take to arrive, the longer critical decisions are delayed, and by then, it could be too late.
What is Blu?
Blu is a technology, more than a product, that sits inside the database engine and uses many advanced concepts to do the job. Chiefly, it relies on In-Memory Computing, Columnar Processing, and advanced compression to dramatically speed up your data analytics and processing. Let’s talk about these topics one by one.
Did I say it’s fast? It is very fast. On average, IBM has consistently seen an 8x-25x increase in analytical speeds, with the highest increase at 1000x.
Storing data in RAM is still one of the fastest ways to get data in and out of the processor, because it has no moving parts and sits far closer to the CPU than any disk does. RAM also keeps improving in data integrity and density while falling in cost. But, if in-memory computing is so great, why doesn’t everyone use it? Well, it’s volatile, for one. If you lose power or the system reboots, your data is lost. It also does not have the storage capacity that disks do. Some in-memory vendors require you to match the memory size to your database size.
Blu is engineered to get around those deficiencies. First off, it does not store your entire database in memory like some vendors do. If you have a 50GB database, Blu does NOT require 50GB of RAM. Blu keeps the data in memory in compressed form, in almost a “de-dupe” way, saving valuable memory space. It also intelligently prefetches data from disk before you need it. This means that ONLY the data you need is held in memory, and not the entire database.
Many database vendors are moving to a “Columnar Processing” storage format where data is stored in columns rather than rows. The idea here is that most of your queries on tables only deal with a subset of fields, and not the entire row. By storing each column (field) separately rather than keeping the entire row (record) together, you can save valuable IO when retrieving data. Take my image below for example:
The above image shows a query joining two tables: Customers and Territories. A simplified version, of course, as real tables will have MANY more fields. But, this report just wants a count of customers per territory. So, this query is only getting the Territory and the name of the customer. That’s only 3 fields out of, say, 10, that I am retrieving. So, why use all that IO to read all of the fields? Why not just get the 3? That’s columnar storage. You can read more about it on Wikipedia’s page.
The image below shows the difference between row storage and column storage. Columnar storage is not a new idea. It’s been around for quite a while, but with Big Data and the increased demand for high-speed analytics, this type of storage is gaining more traction and visibility.
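To make the IO argument concrete, here is a minimal Python sketch. It is a toy model, not how DB2 or Blu lays out data on disk. The field names, the 10-field customer record, and the "fields touched" counter are all made up for illustration. It just counts how many field values each layout has to read to answer a 3-field query.

```python
# Hypothetical 10-field customer record; the names are illustrative only.
FIELDS = ["id", "name", "territory_id", "street", "city",
          "state", "zip", "phone", "email", "created"]

# Four toy customers, each with a value for every field.
rows = [{f: f"{f}-{i}" for f in FIELDS} for i in range(4)]

# Column layout of the same data: one list per field.
columns = {f: [row[f] for row in rows] for f in FIELDS}


def scan_row_store(rows, wanted):
    """Read whole records, then keep only the wanted fields."""
    touched = 0
    result = []
    for row in rows:
        touched += len(row)  # the entire record is read from storage
        result.append(tuple(row[f] for f in wanted))
    return result, touched


def scan_column_store(columns, wanted):
    """Read only the columns the query asks for."""
    touched = 0
    selected = []
    for f in wanted:
        touched += len(columns[f])  # only this column is read
        selected.append(columns[f])
    return list(zip(*selected)), touched


wanted = ["id", "name", "territory_id"]
row_result, row_touched = scan_row_store(rows, wanted)
col_result, col_touched = scan_column_store(columns, wanted)

print(row_touched, col_touched)  # 40 field values vs. 12
```

Both scans return the same answer, but the row layout reads all 10 fields for every customer while the column layout reads only the 3 that the query needs.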
(Image Source: DB2 10.5 with Blu Acceleration )
John Schlesinger, Chief Enterprise Architect at Temenos, reported that a query against a database with 200M records took 30 seconds using row-based storage. But, the same query using columnar-based storage took only 0.33 seconds. Yes, that is worth repeating. 30 seconds down to 0.33 seconds.
When using compression, you are keeping your data storage down. But, because the data is compressed, there is also less of it to transfer. Smaller data means shorter load times before the system can process it. But, Blu goes one step further. IBM Blu can actually process the data while it is still in compressed format, which IBM calls “Actionable Compression”. No need to uncompress, process, then re-compress, reducing IO even more.
Also, the compression is done in such a way that the encoded data going into the CPU is packed to match the bit width of the CPU’s registers. This decreases IO even more, making the data simpler to “digest”, as it were, for the processor.
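How can a database work on data without uncompressing it? IBM's actual encoding is not described here, so the Python sketch below uses plain dictionary encoding as a stand-in to show the general idea. It assumes each distinct value gets a small integer code, assigned in sorted order. Equality and range filters can then be answered by comparing codes alone. The function names and the territory data are hypothetical.

```python
from bisect import bisect_left


def build_dictionary(values):
    """Assign each distinct value an integer code, in sorted order."""
    distinct = sorted(set(values))
    code_of = {value: code for code, value in enumerate(distinct)}
    return distinct, code_of


def encode(values, code_of):
    """Replace every value with its (much smaller) integer code."""
    return [code_of[v] for v in values]


def count_equal(codes, code_of, literal):
    """Count rows equal to `literal` without decoding any row."""
    target = code_of.get(literal)
    if target is None:  # the literal never occurs in the column
        return 0
    return sum(1 for c in codes if c == target)


def count_less_than(codes, distinct, literal):
    """Count rows whose value sorts before `literal`.

    Codes follow sort order, so value < literal exactly when
    code < (number of distinct values that sort before literal).
    """
    bound = bisect_left(distinct, literal)
    return sum(1 for c in codes if c < bound)


territories = ["West", "East", "West", "North", "East", "West"]
distinct, code_of = build_dictionary(territories)
codes = encode(territories, code_of)

print(codes)                                      # [2, 0, 2, 1, 0, 2]
print(count_equal(codes, code_of, "West"))        # 3
print(count_less_than(codes, distinct, "North"))  # 2
```

The literal in the query is translated to a code once, and after that every comparison is a cheap integer comparison on the encoded column.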
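The register-matching idea can be pictured with a small bit-packing sketch in Python. It is an illustration under stated assumptions, not IBM's format: it assumes a 64-bit register, fixed-width codes, and hypothetical helper names. The point is that when a column has only a few distinct values, many encoded values fit into one register-sized word.

```python
WORD_BITS = 64  # assumed register width for this illustration


def bits_needed(num_distinct):
    """Smallest fixed width that can represent codes 0..num_distinct-1."""
    return max(1, (num_distinct - 1).bit_length())


def pack(codes, width):
    """Pack fixed-width codes into 64-bit words, lowest slot first."""
    per_word = WORD_BITS // width
    words = []
    for start in range(0, len(codes), per_word):
        word = 0
        for slot, code in enumerate(codes[start:start + per_word]):
            word |= code << (slot * width)
        words.append(word)
    return words


def unpack(words, width, count):
    """Recover the first `count` codes from the packed words."""
    per_word = WORD_BITS // width
    mask = (1 << width) - 1
    out = []
    for word in words:
        for slot in range(per_word):
            if len(out) == count:
                return out
            out.append((word >> (slot * width)) & mask)
    return out


codes = [2, 0, 2, 1, 0, 2]      # e.g. three distinct territories
width = bits_needed(3)          # 2 bits per code
words = pack(codes, width)

print(WORD_BITS // width)       # 32 codes fit in one 64-bit word
print(unpack(words, width, len(codes)) == codes)  # True
```

With 2-bit codes, a single 64-bit word carries 32 column values, so one trip to memory feeds the processor 32 values instead of one.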
So, IBM Blu sounds great, how do I get it? Well, the good news is that if you have DB2, you may already have it. As of DB2 version 10.5, IBM Blu is already available to you. All you need to do is follow the 4 easy steps outlined in IBM’s Deploying Blu Page.
Because IBM Blu uses columnar storage, it’s not for everything. If you have a transaction-based application, or a linear logging application, Blu could actually cause your app to suffer. This is because those applications need all fields available and usually require sequential methods. IBM Blu is geared for your Business Intelligence systems and Big Data Analytics. The good news is that you can selectively choose which tables should be row-based, and which tables should be column-based.
Because Blu is so readily available (even in the Cloud), it’s easy to get started. Download the free 90-day trial to test for yourself. The trial version is fully featured with all services enabled. With the kinds of speed increases IBM says they can deliver, and the fact that no human intervention is needed, what more do you need to try it out?
Sound interesting? Of course it does! Heck, even *this* author wants to give it a go. The client testimonials available on IBM Blu’s website are telling enough. See below to hear testimonials from Coca-Cola, BNSF Railways, and Handelsbanken.
“Computer Processor” image courtesy of cooldesign / FreeDigitalPhotos.net