Oracle BI EE 10.1.3.4.1 – Understanding Logical SQL – Part 1
Posted by Venkatakrishnan J on May 7, 2009
I was recently involved in a migration project wherein the customer was migrating from a custom OLAP solution to Oracle OLAP. The customer had around 1200 reports reporting out of a single BI EE repository. I was brought in to build a repository on Oracle OLAP 10g. Since it was a migration project and since only the backend was changing, the customer expected all the reports to work out of the box on Oracle OLAP. So, my job was to build a repository from scratch using SQL Views on OLAP 10g, and at the same time ensure that none of the reports actually break due to this new repository. It was a very interesting exercise since one would have to use the concept of dimension binning on all the dimensions. Though dimension binning itself is a straightforward concept, the implementation itself was pretty tricky due to multiple hierarchies in each dimension and i had to dedicate around 8 hours to build the repository and at the same time get it right in all aspects. There are certain concepts that one would have to be aware of while building a BI EE rpd on Oracle OLAP 10g. On OLAP 11g(which the customer was not willing to migrate to) though, this would be a straightforward exercise as 11g offers out of the box MVs for facts and dimensions. So, in effect, 11g OLAP cube would be treated as a simple star schema. I would cover those concepts like using EVALUATE_PREDICATE, Dimension Binning, Multi Hierarchy fragmentation etc in a future blog entry. While working on this repository, i noticed that there were certain prompt requirements wherein i had to hardcode the logical sql since a direct select on the column would produce a drastically costly query on the OLAP SQL Views. This led me to write today’s blog entry which is about logical SQL.
As you would probably know, Oracle BI Server generates 2 types of queries. Logical and Physical. When columns are chosen in a report, a custom logical sql is generated and BI Server converts this logical sql to the actual physical queries. Now, how different is a Logical SQL from a Physical SQL, in terms of structure as well as in terms of functionality. This blog series would cover this.
Structure of Logical SQL:
A logical SQL is exactly similar to a physical SQL. The major difference is that it follows the ANSI SQL structure. But there are cases wherein it has its own structure. At a high level the Logical SQL Structure would be like the one shown below
SELECT <DISTINCT> <PRESENTATION TABLE>.<PRESENTATION COLUMN LIST> FROM <SUBJECT AREA> WHERE <PRESENTATION TABLE>.<PRESENTATION COLUMN LIST EXPRESSIONS> GROUP BY <PRESENTATION TABLE>.<PRESENTATION COLUMN LIST> ORDER BY <PRESENTATION TABLE>.<PRESENTATION COLUMN LIST>
When do i join and when do i not
So, there is no need for any join in a logical sql when presentation table columns belong to a single Subject Area. For example, consider the logical sql and the correponding subject area organization below
SELECT Time."Fiscal Year" saw_0, Customer.Region saw_1, Facts.UNITS saw_2 FROM "Global - Prod & Dev" ORDER BY saw_0, saw_1
The idea behind the sql is that we need the number of units sold for each year in every region. In normal sql terms this would have involved a join between the Time dimension, customer dimension and the fact table. It would also have involved a group by on Year and Region columns. But in logical sql. joins and group bys are not needed as long as we are reporting against a single subject area. But assume that we have the following requirement wherein we need to show all the products and regions in a single report. Now, in our physical layer we do not have a join between products and regions. Hence when we create such a report in BI EE, the query generated would make a join with the fact table. So, instead of getting every combination of products and regions, we would be getting list of products being sold in every region based on whether transactions have occured for them in the fact table. The actual query generated is given below
select T11023.ITEM_DSC as c1, T10994.REGION_DSC as c2 from PRODUCT_DIM T11023, CUSTOMER_DIM T10994, UNITS_FACT T11080 where ( T10994.SHIP_TO_ID = T11080.SHIP_TO_ID and T11023.ITEM_ID = T11080.ITEM_ID )
But sometimes this is not what we need. All we need is a cartesian product between Customer and Products dimension. Though we can create such a cartesian join in the physical layer, changing the repository to satisfy all such similar cases would not work all the time. To circumvent this, we can make the BI Server to make an in-memory cartesian join. The logical SQL is given below
SELECT B.saw_1 saw_0, A.saw_0 saw_1 FROM (SELECT Customer.Region saw_0 FROM "Global - Prod & Dev") A, (SELECT Product.Family saw_1 FROM "Global - Prod & Dev") B ORDER BY saw_0, saw_1
If you look at the actual query generated, you would notice 2 queries being fired parallely.
In effect we have made BI Server to make an in-memory cartesian join. Now, in some cases, the requirement would be the other way around. In a dashboard prompt, we would have to show the list of all products for which there has been a transaction in the fact table as a drop down. In such a case, there are 2 ways of getting the list of products. One way is to include the following SQL in the sql results section of the dashboard prompt
SELECT Product.Product, Facts.UNITS FROM "Global - Prod & Dev"
In effect, the dashboard prompt would ignore the 2nd column and hence you should see all the products for which there is a transaction in the fact table.
The other approach is to create a logical SQL that will use the above query as a subquery.
SELECT Sub.Prod saw_0 FROM (SELECT Product.Product Prod, Facts.UNITS Units FROM "Global - Prod & Dev") Sub ORDER BY saw_0
Once we understand how the logical SQL works, it would become even more clearer on how BI Server generates the physical queries. Today we just saw some basic logical SQL queries. There are more complex queries possible to cater to different kinds of reporting requirements. The only drawback in using the custom SQL is that, end users lose the capability of doing the native hierarchy drills. But sometimes logical SQL becomes an indispensable tool to have. We shall see more later.