Vertipaq Engine – Data Compression

Introduction

Microsoft introduces a very powerful engine for fast query named Vertipaq. Which is used in the Power BI and SSAS Tabular model? Here we are not going to discuss about Power BI or SSIS Tabular model. Our main concern is to identify how the Vertipaq Engine works and how it provide Fastest Query.

The Vertipaq Engine stores the table in a columnar format and established relation between columns. In this blog post we are going to discuss about Data Compression mechanism and try to explain it.

 

Case Study

We have a table which is imported into Vertipaq engine.
Table Name: Order_Details

 


 

How Vertipaq works on this Table

Vertipaq Compress Text in a Table

Text takes lot of space in data base. So Vertipaq need to compress it. It maintain another table with distinct Text value and Id number like this. It is also called Dictionary/ Hash compression.

 

Product_ID_Dictionary

     


    

 

Value Encoding

Here Vertipaq tries to convert Value into One Digit where as possible. TO understand this we are taking Quantity columns for an example.

 


 

From Quantity columns its tries the Minimum value. In our case it is 10.

Now its takes every value from Quantity and try to minus minimum value to make it single digit where as possible.

 


 

If, we need to get the Original Value, we have to add the Minimum value, i.e. + 10

 


 

 

Run Length Encoding

 

Now look at the Product_ID columns. Now we have to find that how many times the Product ID is repeated.

 


 

(Product_ID, Value Repetes)

(0, 2)
(1, 2)
(2, 1)
(3, 1)

 

Looks like this:

 


 

Hope you like that.

Comments

  1. It is very much informative and as well as interesting also. As per my understanding I need little bit more elaborate discussion in perspective of quey execution to fetch record...

    Thanks

    ReplyDelete

Post a Comment

Popular Posts

Working with Spark – Spark RDD

Activity Dependencies with Logical AND in ADF

Working with Python – Module