Business Intelligence – Oracle

Hyperion Essbase 9.3.1 – OBE Series – Designing Essbase Databases – BSO Outlines & Data Load – Global Schema – Part 4 – Loading Data

Posted by Venkatakrishnan J on December 17, 2008

In the last three blog entries, we saw how to design the outline and load the dimensions for a block storage cube. Today we shall see how to load data. A block storage cube can accept input data at any level; the data need not always be at the lowest level. Loading data into Essbase is again done through rule files. As a general rule, let's create one rule file for each measure (UNITS, UNIT_PRICE and UNIT_COST); in practice, the number of rule files is determined by how your SQL source is structured. The SQL for the UNITS rule file is given below.

SELECT
MONTH_DSC,
'Units',
CHANNEL_DSC,
ITEM_DSC,
SHIP_TO_DSC,
SUM(UNITS) DATA
FROM
UNITS_FACT UF,
TIME_DIM TD,
CHANNEL_DIM CD,
PRODUCT_DIM PD,
CUSTOMER_DIM CD1
WHERE
UF.MONTH_ID = TD.MONTH_ID AND
UF.CHANNEL_ID = CD.CHANNEL_ID AND
UF.ITEM_ID = PD.ITEM_ID AND
UF.SHIP_TO_ID = CD1.SHIP_TO_ID
GROUP BY
MONTH_DSC,
CHANNEL_DSC,
ITEM_DSC,
SHIP_TO_DSC

As a general rule, it is recommended to arrange the columns in the SELECT statement in the same order as the dimensions in the outline; this helps keep the data load fast. The process of creating the rule file remains the same as for the dimensions; the major difference lies in the property settings. Let's start by creating a new rule file: go to Open SQL, enter the above SQL, and then ensure that the Data Load property is set.

[Screenshot: setting the Data Load property on the new rule file]

The next step is to make sure that every column is properly tagged with the corresponding dimension, as shown below.

[Screenshot: tagging each column with its corresponding dimension]

This needs to be done for all the columns apart from the data column; the last column must be tagged with the Data property. Also, by default in a BSO cube, if the existing cells already contain data they are overwritten on load. There are properties that can change this default behavior, as shown below.

[Screenshots: data load properties controlling whether existing values are overwritten, added to, or subtracted from]

Once this is done, validate the rule file and save it. Now let's load the data through this rule file, using the same approach as before. The only difference is that, while doing the data load, we need to choose the Load only option.

[Screenshot: choosing the Load only option in the data load dialog]
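The same load can also be scripted through MaxL instead of the Administration Services console. A minimal sketch, assuming the application and database are both named Global, the rule file was saved on the server as UnitsLd, and admin/password and scott/tiger are stand-in credentials for Essbase and the relational source respectively; rejected records are written to the named error file.

login 'admin' 'password' on 'localhost';

import database 'Global'.'Global' data
connect as 'scott' identified by 'tiger'
using server rules_file 'UnitsLd'
on error write to 'units_load.err';

logout;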

This loads all the data into the Essbase cube, which can be verified from the database properties: you would see that the Input Level 0 Blocks statistic is now non-zero.

[Screenshot: database statistics after the UNITS load]
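The block statistics can also be pulled from the command line through MaxL, without opening the console (again assuming the application and database are named Global):

query database 'Global'.'Global' get dbstats data_block;

This returns the block size, block density and block counts discussed below.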

The various statistics above are very important when loading an Essbase cube. In our case we have a block size of 3 KB, which is on the lower side; it could be increased by making another dimension dense, but I will not do that in this blog entry. Our block density is OK: anything around 25% is considered very good, and this figure will change once we aggregate the cube. The index hit ratio of 1 is also very good; it means every index page request is being served from the index cache, so the cache is adequately sized for the current load. Before we look at calculation scripts, let's create the remaining two rule files, to load the UNIT_PRICE and UNIT_COST measures. The SQL for both rule files is given below, after a quick sanity check on the block-size figure.
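As that sanity check: a BSO block holds one 8-byte cell for every combination of stored members of the dense dimensions, so

block size = (product of the stored member counts of the dense dimensions) x 8 bytes

and 3 KB, roughly 3072 bytes, works out to 3072 / 8 = 384 stored cells per block. Which dimensions are dense here depends on the outline settings chosen in the earlier parts of this series.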

SELECT
MONTH_DSC,
'Unit Price',
'Channels',
ITEM_DSC,
'Customer',
SUM(UNIT_PRICE) DATA
FROM
PRICE_AND_COST_FACT UF,
TIME_DIM TD,
PRODUCT_DIM PD
WHERE
UF.MONTH_ID = TD.MONTH_ID AND
UF.ITEM_ID = PD.ITEM_ID
GROUP BY
MONTH_DSC,
ITEM_DSC


SELECT
MONTH_DSC,
'Unit Cost',
'Channels',
ITEM_DSC,
'Customer',
SUM(UNIT_COST) DATA
FROM
PRICE_AND_COST_FACT UF,
TIME_DIM TD,
PRODUCT_DIM PD
WHERE
UF.MONTH_ID = TD.MONTH_ID AND
UF.ITEM_ID = PD.ITEM_ID
GROUP BY
MONTH_DSC,
ITEM_DSC

Since UNIT_PRICE and UNIT_COST have the grain of only two dimensions, we need to load the data against the topmost member of every other (non-grain) dimension. That is why the topmost members of the non-grain dimensions have been hardcoded as literals in the queries above. The screenshots below act as a reference for creating these rule files.

[Screenshots: rule files for the Unit Price and Unit Cost loads]

Now, let's load the data from these two rule files into our cube, using the same approach as before. Once the data load is done, let us look at the statistics again.
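As before, these loads can also be scripted through MaxL (same assumptions as earlier; the rule-file names UPriceLd and UCostLd are hypothetical):

import database 'Global'.'Global' data
connect as 'scott' identified by 'tiger'
using server rules_file 'UPriceLd'
on error write to 'uprice_load.err';

import database 'Global'.'Global' data
connect as 'scott' identified by 'tiger'
using server rules_file 'UCostLd'
on error write to 'ucost_load.err';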

[Screenshot: database statistics after loading all three measures]

As you see, the number of upper-level blocks has increased, since we are loading data directly against the topmost members of the Channels and Customer dimensions. The next step is to aggregate the cube. Aggregation in block storage cubes is done through calculation scripts. The major advantage of these scripts is that they provide the flexibility to aggregate any portion of the cube. So we shall create a single calculation script that aggregates UNITS across all the dimensions, and UNIT_PRICE and UNIT_COST across only two. The calculation script is provided below.

SET CACHE HIGH;
SET CALCPARALLEL 2;

/* Aggregate Units across all four dimensions */
FIX ("Units")
CALC DIM ("Time");
AGG ("Channels");
AGG ("Product");
AGG ("Customer");
ENDFIX

/* Unit Price and Unit Cost were loaded at the topmost Channels
   and Customer members, so aggregate them across Time and Product only */
FIX ("Unit Price", "Unit Cost", "Channels", "Customer")
CALC DIM ("Time");
AGG ("Product");
ENDFIX

In essence, the first FIX calculates all the dimensions for the Units measure, while for the Unit Price and Unit Cost measures the second FIX calculates only the Time and Product dimensions.

[Screenshot: the calculation script in the calculation script editor]

Now, let's execute this calculation and look at the database properties.
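The execution can be done from the console, or through MaxL if the script was saved on the server (a sketch, assuming the application and database are named Global and the script above was saved as AggAll):

execute calculation 'Global'.'Global'.'AggAll';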

[Screenshots: database properties and statistics after running the calculation]

As you see, the number of upper-level blocks has increased, and the block density has also gone up, which is good. Now, as the last step, let us look at the data.

[Screenshot: retrieving the aggregated data]
