All of the topics discussed in this blog come from my real-life encounters. They serve as references for future research. All of the data, content, and information presented in my entries have been altered and edited to protect the confidentiality and privacy of the clients.

Friday, November 2, 2012

a general approach to performance tuning in OBIEE

A long time ago, I wrote an article about performance tuning in OBIEE, in which I stated the importance of minimizing performance issues by having a good project direction and a clear understanding of the BI concepts. This is to ensure the project has a clean design and optimized processes, so that we won't run into many performance issues later on. Remember, a bad design, a wrong process, and mismanaged expectations are the root causes of many of the issues we potentially face along the way. Fixes, workarounds, and changes applied in an attempt to improve performance are only temporary solutions that ease the symptom; they won't help in the long run.

Although I am a strong believer in dealing with performance issues at the design and project-process level, that doesn't mean there is nothing else we can do within OBIEE to optimize performance. When a report runs very slowly, there are a few areas in OBIEE we can look into to diagnose the issue.

One of the most common places to investigate is the query log, which records the exact physical queries that a particular report generates. When you look into the query log, don't just focus on the physical SQL; also read the steps the log records. Each step has a timestamp telling us when that action started and when it completed and moved on to the next step. Sometimes it is the connection to the DB that takes unusually long, not the query itself. Sometimes we define DB scripts in the connection pool for it to execute before or after queries, and these can cause extra delay. So we need to look into all of these things before getting to the physical query itself.

In order to tune a query, some basic understanding of our tables and objects is very important. We know that most queries consist of some dimension and fact tables. Fact tables are generally huge because they hold time-sensitive information: all of the stats, measures, and keys change over time. When you look at a sales fact, you don't just look at the sales total; you look at the sales total broken down by dimensional information, especially time-related information, for it to make sense. Therefore, fact tables are loaded at a certain frequency, with records at an hourly, daily, or even per-minute grain, and over time they get huge. A full table scan on a fact table will definitely be very costly and slow. Therefore, every query should at least join a day dimension to the fact table and filter on a specific time period to avoid a full table scan. Indexes and partitions are always needed on the fact table. But then again, if a user insists on running a report over two full years of data, there is not much we can do. He should read the myth of performance tuning.
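To illustrate the effect of filtering through a day dimension, here is a small sketch. It uses SQLite from Python purely for illustration (OBIEE would normally generate this SQL against your warehouse database, typically Oracle), and all table and column names are invented. Note that CROSS JOIN is a SQLite-specific trick that pins the join order, so the small day dimension drives the indexed lookup into the fact table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical star-schema fragment: a day dimension and a sales fact
cur.execute("CREATE TABLE day_dim (day_key INTEGER PRIMARY KEY, calendar_date TEXT)")
cur.execute("CREATE TABLE sales_fact (day_key INTEGER, customer_key INTEGER, amount REAL)")
cur.execute("CREATE INDEX idx_fact_day ON sales_fact (day_key)")

# No date filter: the optimizer has no choice but a full scan of the fact table
unfiltered = cur.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(amount) FROM sales_fact"
).fetchall()

# Filter through the day dimension: day_dim drives an indexed SEARCH into
# the fact table (CROSS JOIN pins that join order in SQLite)
filtered = cur.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT SUM(sales_fact.amount) FROM day_dim CROSS JOIN sales_fact "
    "ON sales_fact.day_key = day_dim.day_key "
    "WHERE day_dim.calendar_date BETWEEN '2012-10-01' AND '2012-10-31'"
).fetchall()

print([row[-1] for row in unfiltered])  # plan shows a scan of sales_fact
print([row[-1] for row in filtered])    # plan shows a search via idx_fact_day
```

The same principle applies regardless of database: without a selective date predicate that can reach the fact table through an index or partition, the fact table gets scanned end to end.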

Another thing to look for is whether the report defines column values with conditional expressions, such as a CASE statement, and then filters on that derived column. For example, you may have a column that returns a date based on different calendars, and then a filter on it (like column = current date). The WHERE clause would then look like this: WHERE CASE WHEN calendar = 'retail' THEN retail_date WHEN calendar = 'fiscal' THEN fiscal_date ELSE some_date END = current_date. This type of predicate takes away the index you created on the fact table, because the value is conditional and the database has to evaluate the expression for every row, scanning the entire table.
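To see what such a conditional predicate does to the plan, here is a sketch, again using SQLite from Python just for illustration (names are invented; a real OBIEE deployment would sit on Oracle or similar, but the behavior is analogous). Wrapping the columns in a CASE hides them from the indexes, while rewriting the filter as plain AND/OR terms lets each branch use its own index:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical fact table with two calendar date columns
cur.execute(
    "CREATE TABLE sales_fact "
    "(calendar_type TEXT, retail_date TEXT, fiscal_date TEXT, amount REAL)"
)
cur.execute("CREATE INDEX idx_retail ON sales_fact (retail_date)")
cur.execute("CREATE INDEX idx_fiscal ON sales_fact (fiscal_date)")

# Anti-pattern: the CASE expression wraps the indexed columns, so the
# optimizer cannot use either index and falls back to a full table scan
bad = cur.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(amount) FROM sales_fact "
    "WHERE CASE WHEN calendar_type = 'retail' THEN retail_date "
    "ELSE fiscal_date END = '2012-11-02'"
).fetchall()

# Equivalent filter written as plain predicates: each OR branch exposes a
# bare indexed column, so each branch can be satisfied by its own index
good = cur.execute(
    "EXPLAIN QUERY PLAN SELECT SUM(amount) FROM sales_fact "
    "WHERE (calendar_type = 'retail' AND retail_date = '2012-11-02') "
    "OR (calendar_type <> 'retail' AND fiscal_date = '2012-11-02')"
).fetchall()

print([row[-1] for row in bad])   # plan shows a full table scan
print([row[-1] for row in good])  # plan shows index searches
```

The general rule: keep the raw column alone on one side of the comparison, and push the conditional logic into separate predicates or into the ETL, so the optimizer can still use the index.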

The explain plan is always a good place to look in order to understand what the query is actually doing. However, if the query takes way too long to run, you may not want to run the explain plan and wait for a long time. I'd suggest analyzing the query by looking at how many tables and how many joins are involved in the report. Draw a diagram if you need to. Then start removing columns and joins one after another, rerun the report, and measure the performance. By removing layer after layer of joins, the performance will start to get better, and you will be able to find that one table that, when joined with your other tables, kills the performance. Then focus on tuning a much simpler version of the query. Chances are, you need to set the driving table to the smaller table in your logical and complex join. Understand the sequence in which data is fetched, and make sure the smaller dataset is used as the driving dataset against the much larger fact table to avoid a full table scan.

An example: You have one report that fetches data from the following tables in a snowflake fashion:

Region <--> Supplier <--> Customer <--> Sales fact <--> Date

We know that the sales fact is much bigger. Region has to go through Supplier in order to find Customer, and then go to the sales fact to find the measures for those selected customers based on date and region. So if I go from Region to Supplier and then Customer, I obtain a pretty small list of records. Then I just search the sales fact for this small list of customers instead of scanning the whole fact table. The other way to get the result would be to fetch all of the sales fact records, then take each record and search into Customer and eventually get to the Region. The first way is much faster than the second because the driving dataset is much smaller. The explain plan will outline the flow of the joins; analyze it and see if you are using the wrong dataset as the driver.
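The driving-table idea can be sketched the same way. The schema below is a toy version of the snowflake above (SQLite via Python, invented names, purely illustrative); in SQLite, CROSS JOIN pins the join order, which lets us compare starting from Region against starting from the fact table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Toy snowflake: Region -> Supplier -> Customer -> Sales fact
cur.execute("CREATE TABLE region (region_key INTEGER PRIMARY KEY, region_name TEXT)")
cur.execute("CREATE TABLE supplier (supplier_key INTEGER PRIMARY KEY, region_key INTEGER)")
cur.execute("CREATE TABLE customer (customer_key INTEGER PRIMARY KEY, supplier_key INTEGER)")
cur.execute("CREATE TABLE sales_fact (customer_key INTEGER, amount REAL)")
cur.execute("CREATE INDEX idx_sup_region ON supplier (region_key)")
cur.execute("CREATE INDEX idx_cust_sup ON customer (supplier_key)")
cur.execute("CREATE INDEX idx_fact_cust ON sales_fact (customer_key)")

# Good: the small dimension chain drives indexed lookups into the fact
good = cur.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT SUM(sales_fact.amount) FROM region "
    "CROSS JOIN supplier ON supplier.region_key = region.region_key "
    "CROSS JOIN customer ON customer.supplier_key = supplier.supplier_key "
    "CROSS JOIN sales_fact ON sales_fact.customer_key = customer.customer_key "
    "WHERE region.region_name = 'West'"
).fetchall()

# Bad: the huge fact table drives, so every fact row is visited first
bad = cur.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT SUM(sales_fact.amount) FROM sales_fact "
    "CROSS JOIN customer ON customer.customer_key = sales_fact.customer_key "
    "CROSS JOIN supplier ON supplier.supplier_key = customer.supplier_key "
    "CROSS JOIN region ON region.region_key = supplier.region_key "
    "WHERE region.region_name = 'West'"
).fetchall()

print([row[-1] for row in good])  # fact table reached via idx_fact_cust
print([row[-1] for row in bad])   # plan starts with a scan of sales_fact
```

In Oracle you would read the same story out of the execution plan's join order and, if necessary, influence it with statistics or hints rather than a CROSS JOIN.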

Sometimes, graphs and charts can cost system overhead too. Putting too many charts and graphs on one dashboard will slow down performance.

Although caching and aggregate tables are available as features in OBIEE, don't look at them as magic bullets. Caching especially has its costs, and it won't work well in an ad-hoc environment.

Understanding your hardware resources and system environment is another important aspect of performance tuning. You need to know where your limit is.

In summary, performance tuning is a very common practice in all OBIEE projects because we all face slow performance. Certain issues can be avoided by having a clean design; others will have to be investigated and tuned at the query and DB level, while some have possible solutions using OBIEE features. Still, there will be some issues you just can't fix. As a last resort, a deep understanding of what the customer wants is important before you apply a workaround, so that you can manage expectations properly.


Until next time

