Originally published on 09-22-2011 02:55 PM
New in expressor 3.4 are lookup tables, which are implemented by a light-weight relational database management system integrated into the expressor Data Integration Engine. The database stores the lookup table content in a file and depending on the file system location of this file, the scope of the lookup table can be limited to a single project (or library) or to all projects (and libraries) in a workspace, or to all projects (and libraries) deployed onto the computer. In fact, if the file system location of the lookup file is mapped to a drive, the lookup table's contents can be accessed by expressor data integration applications running on networked computers. In developing the examples described in this tutorial, the lookup table files will be stored in a directory (C:\lookups) accessible to all expressor applications.
A lookup table first needs to be initialized with data using the Write Lookup Table operator. The data used to initialize a lookup table can come directly from a resource, such as a file or database table, or can be data produced by a data integration application. In keeping with the expressor composite type usage paradigm, you can either manually define the columns in a lookup table or or define the table by referencing an existing composite type.
The first step is to create the lookup table. Lookup tables are expressor artifacts similar to schema or connection artifacts. Under every project or library directory, look for the subdirectory Lookup Tables. This is where lookup table artifacts are stored. To create a new lookup table artifact,
Now that the lookup table has been defined, it will appear under the Lookup Tables subdirectory and you are ready to populate the table with data. This example populates the lookup table with data extracted from a database table that contains a collection of cities and their television station call signs.
mysql> select * from city_station;
| city | call_sign |
| Atlanta | WAGA
| Atlanta | WATL
| Atlanta | WPBA
| Atlanta | WXIA
| Boston | WBZ
| Boston | WCVB
| Boston | WGBH
| Boston | WHDH
| Boston | WLVI
| Dallas | KDAF
| Dallas | KERA
| Dallas | KTVT
| Dallas | KXAS
| Dallas | KXTX
| Dallas | WFAA
| Knoxville | WATE
| Missoula | KECI
| New York | WABC
| New York | WCBS
| New York | WNBC
| New York | WNYW
| New York | WNET
| New York | WPIX
| New York | WWOR
| Philadelphia | KYW
| Philadelphia | WCAU
| Philadelphia | WHYY
| Philadelphia | WPHL
| Philadelphia | WPVI
| Philadelphia | WTXF
| Saginaw | WNEM
31 rows in set (0.08 sec)
A simple dataflow - Read Table, Write Lookup Table - can be used to transfer the data from the database to the lookup table. Since all the details of the structure and file system location of the lookup table are embedded in the Lookup Table artifact, you only need to set the Lookup Table property of the Write Lookup Table operator and indicate whether you want to truncate, or append to, the contents of the table when the dataflow runs.
Now that the lookup table is loaded with data, how do you use it?
Applications access the data in a lookup table through use of a lookup rule, which can only be used within a Transform Operator. In this example, the lookup table has both unique and non-unique keys, so you can find the city in which a specific television station is located or all the televisions stations in a given city. When you use the unique key, zero or one record may be returned from the lookup table; when you use a non-unique key, zero or more records may be returned from the lookup table.
Let's create a small application that reads a list of station call signs and extracts from the lookup table the city in which the station is located. The dataflow would include a Read File operator, a Transform operator in which a lookup rule is used to access the lookup table, and a Write File operator to output the records.
Now, let's run some data through the application.
If the input file contains the call signs
The output file will contain
But what happens if the input file includes a call sign, perhaps ZZZZ, for which there isn't a table entry? The lookup rule gives you control over the outcome.
So much for using a unique key with the lookup. What happens when a non-unique key is used? In this case, more than one record could be returned from the lookup table, for example, New York has seven television stations. The lookup rule can handle this situation as well.
The circled symbol in the upper right-hand corner of the rule indicates that multiple matches may be found in the lookup table. In this situation, the Transform operator will emit multiple records, for example, seven records would be emitted for New York.
If you ever need to read the entire contents of a lookup table, use the Read Lookup Table operator. Just like the Write Lookup Table operator, the only information you need to configure this operator is the lookup table you want to read.
For those situations in which you need absolute control over the contents of the lookup table, you may use a Lookup Function Rule. When you use a function rule, you may write code that can directly manipulate the contents of the table, performing reads, updates, and deletes as well as inserts. Use of Lookup Function Rules and the lookup application programming interface will be discussed in another tutorial.