How to union 2 data sources with different schemas

    Suppose you need to union two data files into one output file, but the input files have different schemas. 


    simple1.txt has:


    Mike,Tarallo,123 myplace,mycity,mystate,11211,These are some notes

    John,Smith,345 his place,hiscity,histatte,23233,More notes


    simple2.txt has:


    Sam,Johson,456 a street, po box 1090,boston,MA,78945

    Wayne,Eckerson,659 main street,apt 1a,orlando,FL,90812


    Take a look at the attached project(s), which you can import into Expressor. Do not unzip them - just use the Import Projects... wizard under the Studio tab, or right-click an existing workspace top-level node in the explorer tree and select Import Projects.


    You will have to modify the file connection artifact FileConnection1 so that it points to the location where you saved the sample input files (.txt) and where the output file will be written.

    The first project simply unions the files and leaves ",," (NULLs) for the missing values.


    The union output is:


    Mike,Tarallo,123 myplace,,mycity,mystate,11211,These are some notes

    John,Smith,345 his place,,hiscity,histatte,23233,More notes

    Sam,Johson,456 a street, po box 1090,boston,MA,78945,

    Wayne,Eckerson,659 main street,apt 1a,orlando,FL,90812,
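    Expressor does this padding inside the dataflow itself, but purely as an illustration, the behavior can be sketched in Python. The sample rows and the merged column order (first, last, address1, address2, city, state, zip, notes) are taken from the output above; the function names are invented for this sketch:

    ```python
    import csv
    import io

    # Sample data from the article (inline here so the sketch is self-contained)
    SIMPLE1 = """Mike,Tarallo,123 myplace,mycity,mystate,11211,These are some notes
John,Smith,345 his place,hiscity,histatte,23233,More notes
"""
    SIMPLE2 = """Sam,Johson,456 a street, po box 1090,boston,MA,78945
Wayne,Eckerson,659 main street,apt 1a,orlando,FL,90812
"""

    def pad_simple1(row):
        # simple1.txt lacks address2: insert an empty field at position 3
        return row[:3] + [""] + row[3:]

    def pad_simple2(row):
        # simple2.txt lacks notes: append an empty field at the end
        return row + [""]

    def union():
        # Pad each source to the common 8-column layout, then concatenate
        rows = [pad_simple1(r) for r in csv.reader(io.StringIO(SIMPLE1))]
        rows += [pad_simple2(r) for r in csv.reader(io.StringIO(SIMPLE2))]
        out = io.StringIO()
        csv.writer(out, lineterminator="\n").writerows(rows)
        return out.getvalue()

    print(union())
    ```

    The empty strings in the padded positions are what appear as the ",," NULL placeholders in the union output shown above.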


    If you want to explicitly set a default value for the missing data, you can use a Transform operator to do so - see the attached project.

    Here we use the Transform operator to insert default values:



    The union output becomes:


    Sam,Johson,456 a street, po box 1090,boston,MA,78945,No Notes Today

    Wayne,Eckerson,659 main street,apt 1a,orlando,FL,90812,No Notes Today

    Mike,Tarallo,123 myplace,No Address2,mycity,mystate,11211,These are some notes

    John,Smith,345 his place,No Address2,hiscity,histatte,23233,More notes
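    The Transform operator's rules are configured in the Expressor GUI, but conceptually it substitutes a per-column default wherever the incoming field is empty. A minimal Python analogue, using the defaults and column positions from the output above (the dictionary-based layout is an assumption of this sketch, not Expressor's internal representation):

    ```python
    # Column layout: first,last,address1,address2,city,state,zip,notes
    # Defaults keyed by column index, matching the sample output above
    DEFAULTS = {3: "No Address2", 7: "No Notes Today"}

    def transform(row):
        # Substitute the column's default wherever the incoming field is empty
        return [DEFAULTS.get(i, "") if v == "" else v
                for i, v in enumerate(row)]

    row = ["Mike", "Tarallo", "123 myplace", "", "mycity", "mystate",
           "11211", "These are some notes"]
    print(transform(row))
    ```

    Applied to the NULL-padded union rows, this yields the "No Address2" and "No Notes Today" values shown in the output above.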


    Finally, a more sophisticated way is to take advantage of the semantic type constraint properties (you'll need to understand composite types a bit more, but it is very powerful) - see the attached project.


    This solution uses composite type constraints and NO Transformation Rules in the Dataflow!


    If you open the schemas for both the Read File 1 and Read File 2 operators, you will notice on the right that I manually added the missing attributes BUT did not map them to anything. This allows the structure of the outgoing record layout (which we call a Composite Type) to conform when using the Funnel operator, which requires the record layouts to be the same. Then the Transform operator does just that: transforms the data in a number of ways.
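    Expressor's semantic types and their constraints are defined in the Studio UI, but a rough Python analogue of the idea is a record type whose attributes carry declared defaults: any attribute that is not mapped from the input simply falls back to its constraint value, with no transformation rule needed. The field names and default values below are taken from the examples in this article; the dataclass itself is just an illustration, not Expressor's implementation:

    ```python
    from dataclasses import dataclass

    @dataclass
    class Contact:
        # A rough analogue of a Composite Type whose attributes carry
        # default-value constraints. Attributes not mapped from the
        # input take their declared defaults automatically.
        first: str = ""
        last: str = ""
        address1: str = ""
        address2: str = "No Address2"     # constraint default
        city: str = ""
        state: str = ""
        zipcode: str = ""
        notes: str = "No Notes Today"     # constraint default

    # A simple1.txt row maps every attribute except address2,
    # so address2 is filled in by its constraint default.
    rec = Contact(first="Mike", last="Tarallo", address1="123 myplace",
                  city="mycity", state="mystate", zipcode="11211",
                  notes="These are some notes")
    print(rec.address2)
    ```

    Because the default lives on the type itself, neither reader needs a mapping for its missing attribute, and the dataflow needs no transformation rules.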


    Finally, the Write File operator uses a third schema to define the delimited output file that will be written.


    Read File 1 Schema




    Read File 2 Schema



    Output Write File Schema