19 Replies Latest reply: Jun 12, 2017 3:14 AM by KANCHANA DORAISWAMY RSS

    Selecting data from multiple Excel tables with same field names

    KANCHANA DORAISWAMY

      Hi all,

       

      I have multiple excel files (*.xslx) containing data gathered over different dates. The excel files have the same sheet names and the same field names. The only data that is different from one file to another is the values of the fields.

       

      Sample_File_Date1 :

       

      Feature_Name                                        Value_1              Value_2        

      4D TRAK                                                  98%                    75%

      Automatic Coil Selection\AcsFcc              81%                    65%

      MRE                                                        93%                    40%

      PCA                                                        96%                    21%

       

      Sample_File_Date2 :

       

      Feature_Name                                        Value_1              Value_2        

      4D TRAK                                                  94%                    85%

      Automatic Coil Selection\AcsFcc              82%                    21%

      MRE                                                        0%                    40%

      PCA                                                        67%                    22%


      In my load script, I am loading the excel files dynamically as below :


      LOAD

          Feature_Name

          Value_1 

          Value_2

      FROM [lib://../Sample_File_*.xlsx]

      (ooxml, embedded labels, table is [SampleSheet]);

       

      The problem I have is I have to display/calculate values based on the dates. That is, some of my charts use, for example, average of Value_1 only from the file Sample_File_Date2 , while elsewhere a pivot table might have to display the fields only from Sample_File_Date1 in a pivot table. What function(s) should I use to achieve this?

       

      Any help is truly appreciated.

       

      Regards,

      Kanchana

        • Re: Selecting data from multiple Excel tables with same field names
          Stefan Wühl

          Create a new field in the LOAD that identifies the file source.

           

          Instead of the wildcard '*' in the filename, I would use a FOR EACH loop and use the variable, something like

           

          For Each File in FileList('[lib://../Sample_File_*.xlsx]')

           

              Data:

              LOAD

                  Feature_Name,

                  Value_1,

                  Value_2,

                  Subfield('$(File)', '\', -1) AS Filename

              FROM [$(File)] (ooxml, embedded labels, table is [SampleSheet]);

           

          Next File


          edit: there is also a function FileName() you can have a look into, but there seems to be an open issue with latest QlikSense versions and xlsx files


          Re: Function FileName() not working after upgrade to 3.2.3

            • Re: Selecting data from multiple Excel tables with same field names
              KANCHANA DORAISWAMY

              Hi Stefan,

               

              Thanks for your response.

               

              However, I do not have any use for the file name as there are hundreds of such files that differ only in the date in which they were generated. Therefore, I have to use the date to filter out the data, such as the scenarios as below :

               

              1. Display Feature_Name, Value_1, Value_2 from the table that was last generated (I use Max(date) function here) in a pivot table.

              2. Calculate the Average of all Value_1 values from the table that was generated just previous to the latest table and display in a KPI.

               

              etc.

               

              I need to know specifically how to select the data in my chart - I am not sure whether set analysis would come in handy here, and if it would, how should I create my statements?

               

              Regards,

              Kanchana

                • Re: Selecting data from multiple Excel tables with same field names
                  Stefan Wühl

                  As far as I understood, you need to parse the date from your filename, right?`

                   

                  Above was just a sample, use string functions, like Textbetween() or Subfield() to filter the dates from the filename.

                   

                  After you parsed the dates and stored them into a field, you can filter your date using set analysis or filter panes or conditional functions just like you filter on any other attribute.

                    • Re: Selecting data from multiple Excel tables with same field names
                      KANCHANA DORAISWAMY

                      I have the date information in two separate places, one an xml file which contains some generic information on the excel file as below :

                       

                      <records xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

                        <record>

                            <File_No>1</File_No>

                            <File_Date>05/02/17</File_Date>

                            <File_Stream>Atlmainrecon</File_Stream>

                            <File_SWID>404</File_SWID>

                        </record>

                        <record>

                            <File_No>2</File_No>

                            <File_Date>06/01/17</File_Date>

                            <File_Stream>Atlmain</File_Stream>

                            <File_SWID>339</File_SWID>

                        </record>

                      </records>

                       

                      I have loaded this file also in the script as

                       

                      SampleCrossTable:

                          LOAD

                              File_No as "File No",

                              File_Date as "File Date",

                              File_Stream as "File Stream",

                              File_SWID as "File SWID"

                          FROM [lib://*/Summary.xml]

                      (XmlSimple, table is [records/record]);

                       

                      From this, I can get the date information, say, the latest date as follows :

                       

                      LOAD

                          Date(max("File Date")) as MaxDate

                      resident SampleCrossTable;

                       

                      Two, the same date information is also available from the filename, which I have extracted this way within the load script:

                       

                      Date(Mid(FileName(), 12, 11)) as "File Date"

                       

                      But my question is, how can I correlate the MaxDate from the load script with the corresponding Excel file and use it in my charts to get the data for the scenarios that I mentioned in my previous question.

                       

                      I have tried using conditional statements such as

                       

                      if([File Date] = MaxDate), do something..


                      but this doesn't get me any data. Where could I be going wrong?

                       

                      Regards,

                      Kanchana

                        • Re: Selecting data from multiple Excel tables with same field names
                          Stefan Wühl

                          If something like

                           

                          if([File Date] = MaxDate), do something..

                           

                          does not work, there are two common issues (besides that if() function should show at least two arguments, THEN branch being second, an optional ELSE branch being third. Have a look at the HELP pages for full syntax and examples. I guess above pseudo code is just not based on your real expression syntax)

                           

                          a) both dates should be read in as dates by Qlik, having a numerical representation

                           

                          Why don’t my dates work?

                          Get the Dates Right

                           

                          b) You probably need to use an aggregation function around the if() conditional, i,e, rather use

                           

                          =Sum( If( X=Y, Z))

                           

                          than

                           

                          =If( X=Y, Sum(Z))

                           

                          I say probably, because the correct syntax is depending on the scope of your expression and your requirement.

                          For example, second expression may work when it's used with either X or Z as dimension.

                           

                          Use Aggregation Functions!

                           

                           

                          If this is still not getting you any further, then please add more context to your request, best by posting a small, reloadable sample application.

                          • Re: Selecting data from multiple Excel tables with same field names
                            Prashanth Reddy

                            Hi Kanchana,

                             

                            What I understood in this case..

                            1. You have multiple files and loaded data with the date which is available in file name.
                            2. Also a master table which contains file n date name.
                            3. From above 2, you only bother about Max date.

                             

                            When you have same columns, automatically data will be appended. For ex. In file 1 you have 10 records with date, col1, ..

                            and file 2 with 10 records with diff date, col1, ..

                             

                            Now you will have 20 records after load..

                             

                            To get your desired output have a variable to hold max date from your other table/master.

                             

                            Next, resident load from base data with date condition = max date from the variable.

                             

                            I hope, it will solve your problem.

                              • Re: Selecting data from multiple Excel tables with same field names
                                KANCHANA DORAISWAMY

                                Hi Prashanth,

                                 

                                Thank you for your response. But as I was discussing with Stefan earlier, I have the max date stored in a variable at the time of loading the files. And my charts and calculations work perfectly when only 1 excel file is present in the directory. But as soon as I load more than one file (since they all have the same column names), the charts are unable to display the right data from the date in which I am interested.

                                 

                                I am rewriting the code from the earlier posts for your benefit :

                                 

                                --------------------------------------------------------------------------------------------------------

                                PART 1 : This is my master file with the generic information:

                                 

                                <records xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

                                  <record>

                                      <File_No>1</File_No>

                                      <File_Date>05/02/17</File_Date>

                                      <File_Stream>Atlmainrecon</File_Stream>

                                      <File_SWID>404</File_SWID>

                                  </record>

                                  <record>

                                      <File_No>2</File_No>

                                      <File_Date>06/01/17</File_Date>

                                      <File_Stream>Atlmain</File_Stream>

                                      <File_SWID>339</File_SWID>

                                  </record>

                                </records>

                                 

                                I have loaded this file in the script as :


                                SampleCrossTable:

                                    LOAD

                                        File_No as "File No",

                                        File_Date as "File Date",

                                        File_Stream as "File Stream",

                                        File_SWID as "File SWID"

                                    FROM [lib://*/Summary.xml]

                                (XmlSimple, table is [records/record]);

                                 

                                From this, I can get the date information, the max date as follows :

                                 

                                LOAD

                                    Date(max("File Date")) as MaxDate

                                resident SampleCrossTable;

                                 

                                And the date just previous to the latest as :

                                 

                                LOAD

                                    Date(max("File Date")) as MaxDate

                                resident SampleCrossTable where not exists(MaxDate,"File Date");


                                I have checked that I get both the dates as expected.

                                ---------------------------------------------------------------------------------------------------------------


                                PART 2 : These are my multiple excel data files with the information that has to be selectively used in chart display or for some calculations based on the dates :


                                Sample_File_Date1 :

                                 

                                Feature_Name                                        Value_1              Value_2

                                4D TRAK                                                  98%                    75%

                                Automatic Coil Selection\AcsFcc              81%                    65%

                                MRE                                                        93%                    40%

                                PCA                                                        96%                    21%

                                 

                                Sample_File_Date2 :

                                 

                                Feature_Name                                        Value_1              Value_2

                                4D TRAK                                                  94%                    85%

                                Automatic Coil Selection\AcsFcc              82%                    21%

                                MRE                                                        0%                    40%

                                PCA                                                        67%                    22%


                                In my load script, I am loading the excel files dynamically as below :


                                LOAD

                                    Feature_Name

                                    Value_1

                                    Value_2

                                FROM [lib://../Sample_File_*.xlsx]

                                (ooxml, embedded labels, table is [SampleSheet]);


                                ---------------------------------------------------------------------------------------------------------------

                                 

                                PART 3 : Now, let us say I want to display the contents of the sample file with the latest date in a pivot table, this is the code I have written in the pivot table :


                                ROW :

                                [Feature_Name]


                                MEASURE :

                                Only({<[File Date] = MaxDate>} [Value_1])

                                 

                                This is where the problem begins as the values are not being filtered. It only displays those values in the chart when they are the same across all the excel files. The other values are being displayed as null.

                                 

                                -----------------------------------------------------------------------------------------------------------------------

                                 

                                I don't know if I have missed anything, and any help would be great!

                                 

                                Regards,

                                Kanchana

                                  • Re: Selecting data from multiple Excel tables with same field names
                                    Stefan Wühl

                                    And how do you want to display the values if they are not the same?

                                     

                                    Only() does work as designed to show only identical values:

                                    The Only Function

                                     

                                    You either need to add a dimension (like Value_1) to create more possible rows per combination of dimensional values or use a different aggregation function, like Concat():

                                     

                                    =Concat(DISTINCT {<[File Date] = MaxDate >} [Value_1], ', ')

                                      • Re: Selecting data from multiple Excel tables with same field names
                                        KANCHANA DORAISWAMY

                                        Hi Stefan,

                                         

                                        I do not need to worry if the values are the same or not, the requirement is simply that I display the values as they are, but only from the latest results (that is why I calculate the MaxDate). In addition, I also need perform some more calculations based on the date, such as compare the values from the latest report with those from the last report, determine averages etc.

                                         

                                        I think I cannot use Concat() because I need to display only one value in a pivot table. Just for the sake of simplicity of understanding, assume I have to load all the excel files, but display the data values (as is) only for the latest file in a pivot table.

                                         

                                        Regards,

                                        Kanchana

                                      • Re: Selecting data from multiple Excel tables with same field names
                                        Prashanth Reddy

                                        Hi Kanchana,

                                         

                                        I think you can do this way to have a relationship with dates to do some filter stuff or any in front end.

                                         

                                        Sample_File_Date1 :


                                        load Feature_Name, Value1, Value2, 'Date1 (ex: 5/6/2017)' as FileDate .....;

                                        Note: Date1 can be extracted from the filename

                                         

                                        Feature_Name                                        Value_1              Value_2          FileDate

                                        4D TRAK                                                  98%                    75%          5/6/2017

                                        Automatic Coil Selection\AcsFcc              81%                    65%          5/6/2017

                                        MRE                                                        93%                    40%          5/6/2017

                                        PCA                                                        96%                    21%          5/6/2017

                                         

                                        Sample_File_Date2 :

                                         

                                        Feature_Name                                        Value_1              Value_2          FileDate

                                        4D TRAK                                                  94%                    85%          4/6/2017

                                        Automatic Coil Selection\AcsFcc              82%                    21%          4/6/2017

                                        MRE                                                        0%                    40%          4/6/2017

                                        PCA                                                        67%                    22%          4/6/2017


                                        Sorry, If I'm talking about something esle.