Recently, I came across a requirement that the users want to query against the transactional database using OBIEE. In our case, they literally want to pull data directly (meaning no ETL, no staging area in between) from the cash register of the retail stores nationwide. The databases of these stores are all sitting on different servers although they all have the same data structure. The goal is to be able to report these store data in 1 subject area in OBIEE with the functionality of letting users choose to report against 1 store data or numbers of store data together for totaling.
So let's explore how this can be implemented.
First, let's say I have successfully imported 2 sets of tables into my physical layer with different connection pool, and I have defined the join criteria among these tables.
First as we can see, there are two groups of tables coming from 2 different datasource, but their definitions identical. Therefore, we need to come up with a way to let OBIEE know how to distinguish these 2 sets of data. Therefore, I decided to add a new field in both sets call 'store number' and hard code this field with different values. This field will be used later to allow us to control the querying targets.
Since these datasource are within the store that we do not have control of, I decided to add a new physical column in each of the physical tables in OBIEE and change these physical tables to become views in order to define the value of the new column 'store number':
Test the data in column 'store number', it appears to be correct:
Now apply the same configuration to all physical tables in both physical database tables. We should have store number column in every table. The one from server 5th-ave is hardcoded 'NY' and the one from Dolphin is hardcoded '001'
Now bring both groups of tables into the same BMM layer folder. In each of the logical table, we should now have 2 different LTS. In each LTS, define the fragmentation content using 'store number':
This will be done in all Logical tables, and then move the wanted columns into presentation layer and after re-arranging, it looks like this:
Now let's go and test the query. In the subject area, the column store number will now content both 'NY' and '001', depending on which value we select, the result will come back accordingly.
So let's just run a simple report of store number and total amount received and see what we get:
If we look at the physical query that's generated, we will see that it fired 2 queries to both datasources and bring back the result :
This goes without saying that we can also just pick one store number and the report will come back with data from just one datasource.
One thing to keep in mind is that although there is no hard limit of how many datasource OBIEE is allowed to defined, meaning that I can have 50+ datasources defined in the same RPD without causing configuration errors (if this is actually required), there is practical limitations on how many datasources OBIEE can handle at the same time. This will depend on a lot of things, such as data size, reporting requirements, physical hardware, network connections. I can say that if the users want to report against 20 stores at the same time and see the grand total, the query will very likely be hanging for a long time. Hence, staging area and ETL effort will be a better option.
Nonetheless, for working around the requirement of having a few datasource in report, it can be done in OBIEE and still be practical.
til next time.
Wednesday, May 12, 2010
Subscribe to:
Post Comments (Atom)
0 comments:
Post a Comment