Hello there
I once wrote an entry on performance tuning in OBIEE; it was a high-level discussion of how to deal with performance issues. In fact, slow performance can have many different causes, and it takes effort to diagnose them. It is important not to jump the gun whenever you run into this type of issue by proposing indexes, aggregate tables, or even caching. Every problem has a specific cause, and much of it is reflected in the query that gets generated, though not all of it is. So today, I will discuss something I ran into on a project.
Remember I once wrote an entry about how I implemented some filtering logic in a rather complex way? While it worked at the time, it later created performance problems. The performance was so bad that it would take 30 minutes or more to return 50 records, and it would hang the entire system. Now what do you do about that?
The reason for this sudden slow-down was that we had since loaded more complete data sets, so we were dealing with a more realistic data size than when I first created the report. This suggests the query was not efficient at handling larger tables, even though they were properly indexed.
The real SQL query is quite long, so I won't post it here. But you may remember from the previous design that I had to write some code to get the filtering logic to work:
If you look, you will see quite a bit of code in the expression. Based on the requirement, I was originally thinking of applying conditions before filtering values. In other words, whether the result should fall within the date and hour range would depend on conditions like the current date versus the user-input date, or a comparison against history. However, this type of filter adds a huge case statement to the WHERE clause of the generated SQL.
Case statements and 'OR' operators in the WHERE clause often cause performance problems: such predicates usually can't use an index, so the database falls back to a full table scan. As an experiment, I removed the saved filter and ran the report by itself; it came back within 10 seconds. Therefore, something had to be done to get the case statement out of the WHERE clause while keeping the filter logic working. That was a headache for quite some time.
Luckily, there is always more than one way to solve the same problem. The reason I had to use a case statement was that I was treating the date and the hours as separate columns, filtered separately. Based on the requirement, I first created the filters on the date columns to make sure they worked, then created filters on the hour columns. But the hour filters wouldn't always behave the same way, depending on the date and the history flag, which is why I was writing cases.
What if I could combine date and hours and filter them once? Then hours would no longer be just numbers between 0 and 24; there would be a date implication behind each 0 and 24. That should solve the problem.
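The core idea here, that an hour paired with a date becomes a single point in time, can be sketched in Python. This is illustrative only (the function name is mine, not anything in OBIEE):

```python
from datetime import datetime, timedelta

def to_timestamp(day: datetime, hour: int) -> datetime:
    """Combine a date and an hour (0-24) into one timestamp, so the hour
    carries date meaning instead of being a bare number."""
    return day + timedelta(hours=hour)

# Hour 24 of Aug 1 is the same instant as hour 0 of Aug 2:
print(to_timestamp(datetime(2013, 8, 1), 24) == datetime(2013, 8, 2))  # True
```

Once the date and hour collapse into one value like this, a single range comparison can replace the conditional logic.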
I will skip the thought process and the steps I took to get to the final version, as they would be boring to read. I will just show you what the final filter looks like:
Here the condition is simple:
start date < TIMESTAMPADD(SQL_TSI_DAY, @{endhour}{24}/24, date '@{varConfigHistoryDate}')
On the right side of this condition, we take the user-entered date, passed through the presentation variable 'varConfigHistoryDate', and add a number of hours to it; the actual number of hours added depends on the user-entered end hour. Say the user selects Aug 1 with an end hour of 8 in the filter. Adding 8/24 of a day to Aug 1 gives Aug 1st 8:00:00 AM, so the condition becomes "start date < Aug 1st 8:00:00 AM". A record whose start date is Aug 1st 7:00:00 AM satisfies the filter because it is earlier than Aug 1st 8:00:00 AM, and thus shows up in the report, which is exactly what we want.
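To make the arithmetic concrete, here is a small Python sketch of that upper-bound comparison (the names are mine for illustration; in OBIEE this is the TIMESTAMPADD expression above):

```python
from datetime import datetime, timedelta

def upper_bound(history_date: datetime, end_hour: int) -> datetime:
    # TIMESTAMPADD(SQL_TSI_DAY, end_hour/24, history_date):
    # adding end_hour/24 of a day is the same as adding end_hour hours.
    return history_date + timedelta(days=end_hour / 24)

bound = upper_bound(datetime(2013, 8, 1), 8)   # Aug 1 08:00:00
start_date = datetime(2013, 8, 1, 7)           # record starts Aug 1 07:00
print(start_date < bound)  # True: the record passes the filter
```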
IFNULL(end date, TIMESTAMPADD(SQL_TSI_DAY, 10000000, CURRENT_DATE)) >= TIMESTAMPADD(SQL_TSI_DAY, @{starthour}{0}/24, date '@{varConfigHistoryDate}')
The same idea applies here, except now we evaluate against the hour value passed through the start hour prompt. If the user enters 8 as the start hour and Aug 1 as the 'varConfigHistoryDate' value, the right side becomes Aug 1 plus 8/24 of a day, i.e. Aug 1st 8:00:00 AM. The condition then reads "end date >= Aug 1st 8:00:00 AM": a record whose end date is Aug 1st 10:00:00 AM qualifies, while one that ended at Aug 1st 7:00:00 AM does not. In other words, any record still open at or after the start of the requested window is returned.
On the left side, an IFNULL has to be employed to handle records whose end date timestamp is blank. In that case the blank is replaced with CURRENT_DATE plus 10,000,000 days, a timestamp so far in the future that such records will always satisfy the condition.
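Putting both bounds together, here is a hedged Python sketch of the whole filter, including the stand-in for a blank end date (column and function names are invented for illustration; Python's datetime can't reach 10,000,000 days out, so `datetime.max` plays that role):

```python
from datetime import datetime, timedelta

# Stand-in for CURRENT_DATE + 10,000,000 days in the OBIEE filter: a
# timestamp far enough in the future to always pass the condition.
FAR_FUTURE = datetime.max

def passes_filter(start, end, history_date, start_hour, end_hour):
    """Mirror of the two OBIEE conditions:
       start date < history_date + end_hour/24 of a day
       IFNULL(end date, far future) >= history_date + start_hour/24 of a day
    """
    window_start = history_date + timedelta(days=start_hour / 24)
    window_end = history_date + timedelta(days=end_hour / 24)
    effective_end = end if end is not None else FAR_FUTURE
    return start < window_end and effective_end >= window_start

# An open-ended record (no end date) that started Aug 1 at 07:00
# falls inside the Aug 1, hour 0-8 window:
print(passes_filter(datetime(2013, 8, 1, 7), None, datetime(2013, 8, 1), 0, 8))  # True
```

Note that both predicates are plain range comparisons on the timestamp columns, which is exactly what lets the database use an index instead of evaluating a case statement per row.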
After testing, I confirmed that this query is much simpler and performs far better than the case-statement version. More importantly, none of the desired logic was compromised.
Thank you
Until next time