6 Replies Latest reply: Jan 30, 2013 12:04 PM by James Rowley RSS

    Updating Schemas & Lookup tables

      Hi,

       

      A few questions I'm stuck on;

       

      Q1, During development I am continually changing the output of my final Write Artifact ,i.e number of attributes and every time I do this I have to delete or create a new schema, is there a way of auto updating the output schema based on the upstream output?

       

      I see that I can edit the existing one but I thought there must be an auto update, or is there a more elgant way in which to deal with continual changes to schemas whilst developing?

       

       

      Q2, I have created a lookup table based on a csv input file, created a key column. I wanted to use this within a transform artifact but the Key to lookup on is something that I was going to code as a parameter or similar, it seems I can only drag an input to the lookup rule, any suggestions on how to achieve this?

       

      My only thought would be to create the value required as part of the input source or similar and pass it through to the input, but I'm not sure this will be practical in the end state due to the source data.

       

      As always, suggestions, comments or solutions welcome...

       

      Thanks

      James

        • Re: Updating Schemas & Lookup tables

          Q1.

          As you've discovered, it is always possible to hand edit the composite type in the schema to add or remove attributes, but it is not always possible to edit the field (external) description of the record (left hand side in the schema editor).  With schemas that describe file output, you can manually edit, but with database table schemas you cannot modify the description of the database table.  In this case, you must either create a new schema from the table or, if the table doesn't yet exist, create a schema from the output of the upstream operator.

           

          Another approach, but it's only a half-way measure, is to create a shared composite type from the output of the nearest upstream transformers operator and then add another composite type to the existing schema.  But this does not modify the field (external) description of the record, which you must do manually if possible.

           

          So, I think the most satisfactory approach is to either manually modify, if that's an option, or create a new schema from the output of the upstream operator.

           

          Q2

          You are correct that within the Transform operator in which you use a lookup rule your only option is to associate an existing incoming attribute with the key parameter.  If the incoming record does not include an attribute corresponding to the key, you must add such an attribute to the record in an upstream operator so that you can, as you noted, simply drag that attribute to the rule's input parameter.

           

          So, your idea of creating the value as part of the input source is the correct approach.

            • Re: Updating Schemas & Lookup tables

              Thanks John,

               

               

              I dont have any control over the source files, how do you think I could bring the value in, the only section where I think I could drop it in is after I apply a filter to the source data (this is the point that  I know what the value should be), but how can this be done, possibly create a copy feed this to a transform, then use an expression to create the value then re-join downstream using the Join?

               

              On a different note about udpating and auto creation, I have say Table1, which I also need to do self lookups against, which in essence are just table lookups, but against itself, is the best way to do this; Create a Lookup table from Table1 to use as reference and just use the lookup rule again? My only concern is where I need to use an key for the lookup but I have multiple rows returned and need to filter against (again) a specific value not provided in the source? (Same problem as before)

               

              Note; When I create a lookup table,I notice it doesn't give the option to take the upstream feed to create the table structure and requies it to be done manually?

               

              James

                • Re: Updating Schemas & Lookup tables

                  Not sure I understand the situations in the first two paragraphs.

                   

                  Regarding creating a lookup table.

                  When you are defining the table you can assign the composite type from the schema used to read the source data.  In the lookup table wizard, click on the Actions button next to the label Composite Type and select Assign > Shared from the popup menu.  In the window that opens, select Used by a Schem from the Scope dropdown list and then you will be able to select the type from the schema used to read the source data.

                    • Re: Updating Schemas & Lookup tables

                      Hi,

                       

                      Thanks the Lookup table schema selection works, just takes a bit of getting used to navigating round the tool (still awaiting the training to be released here in Europe).

                       

                      I'll expand on the other issues;

                       

                      I have csv source that has say 100 records, made up of various types. I am processing these one type at a time so I have a Read File >> Filter >> transform >> Write File.

                       

                      In the filter is where I select the records I want, at this point I know that the records are say Type="Swap". This is the value that I wanted to use in the Lookup Rule (previous post about hard coded values), so I was wondering how I could feed this into the Transform as an input, hence using for the Lookup.

                       

                      I know on the face of it it seems simple if the Type is already there (it isn't), the logic behind defining what type a record is can be quite complex and isn't just a case of Type=x.

                       

                      James

                        • Re: Updating Schemas & Lookup tables

                          OK, I think I understand.  A file contains many records that are of different types (financial transactions?).  You need to examine some of the values in each record to determine its type and then use a knowledge of that type as the key to the lookup.

                           

                          If I understand it correctly, there are two approaches you could follow.

                           

                          First:

                          Use a filter operator with multiple output ports.  The code associated with each output port is evaluated to determine its type.  Each type is associated with a different port, so you only return true for one of the ports.  This is illustrated in the following image.

                          filter.png

                          Each output port is followed by a transform, which adds the extra attribute to the record, initializing it with the appropriate value for type.  A second transform then performs the lookup.  Then use a funnel to join the records before writing to the output file.

                           

                          Second:

                          The Transform operator has a helper function named filter.  If this function returns true, the record is processed; if it returns false, the record is not processed.  In this implementation, the filter function always returns true, but only after determining the type and initializing a variable.  The main code in the transform function then initializes the value of the extra attribute to the type.  Again, a second transform performs the lookup.

                           

                          If you need to add more types, you will need to ports to the filter operator and additional Transform operators.

                          transform2.png

                          Note that all records are processed in a single stream.  There's no need to separate the different types of records and then rejoin. 

                           

                          If you need to add more types, you simply change the code in the filter function (and of course the data stored in the lookup).  So this approach would be easier to maintain.

                           

                          Hope I understood correctly and that this helps.