<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Skip rows in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311193#M82255</link>
    <description>Hi Sultania,
&lt;BR /&gt;Does the source table also get deletes and updates, or only new insertions?
&lt;BR /&gt;If there are only new insertions, I have another approach:
&lt;BR /&gt;- Once your initial load is completed, create a copy of your source data (Table A is source and Table B is copy)
&lt;BR /&gt;- During second execution use (A-B) to get additional records in A which are not present in B
&lt;BR /&gt;- Insert these new records in target
&lt;BR /&gt;- Flush out B and make another copy of A
&lt;BR /&gt;If there are not too many records, another approach would be to perform an inner join between A and B and take the rejected records from A, which are the insertions (they could be updates as well).
&lt;BR /&gt;Thanks
&lt;BR /&gt;vaibhav</description>
    <pubDate>Wed, 25 Jun 2014 06:04:10 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2014-06-25T06:04:10Z</dc:date>
    <item>
      <title>Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311188#M82250</link>
      <description>Hello,
&lt;BR /&gt;I have a Job like this
&lt;BR /&gt;
&lt;BR /&gt;Source Table -----&amp;gt;tMap--------&amp;gt;Destination Table
&lt;BR /&gt;I want to skip the first few rows of the source table so that they are not processed. How can I do it?
&lt;BR /&gt;PS: This job runs several times, with new data added to the source table each time. I don't want the data that was already loaded to Destination to be loaded again.
&lt;BR /&gt;Thanks!</description>
      <pubDate>Tue, 24 Jun 2014 11:19:28 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311188#M82250</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-24T11:19:28Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311189#M82251</link>
      <description>Hi
&lt;BR /&gt;If you just want to insert the data that does not exist in the target table, you need to do an inner join between the source data and the target table, get the unmatched rows, and insert them into the target table.
&lt;BR /&gt;Shong</description>
      <pubDate>Tue, 24 Jun 2014 12:15:44 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311189#M82251</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-24T12:15:44Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311190#M82252</link>
      <description>It might do the job for now, but performance won't be good. Or will it? (For millions of rows)</description>
      <pubDate>Tue, 24 Jun 2014 12:46:53 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311190#M82252</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-24T12:46:53Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311191#M82253</link>
      <description>Hi ksultania,
&lt;BR /&gt;What is the column structure of your input table?
&lt;BR /&gt;Do you have a unique identification column which identifies new rows?
&lt;BR /&gt;Do you have a timestamp in your input table?
&lt;BR /&gt;Can you show a snapshot of your data with new rows and old rows?
&lt;BR /&gt;Vaibhav</description>
      <pubDate>Tue, 24 Jun 2014 12:50:03 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311191#M82253</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-24T12:50:03Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311192#M82254</link>
      <description>I do not have a unique key on which I can do an inner join. Also, I don't have a timestamp in the input table.&lt;BR /&gt;What I was thinking is: every time I pass data from the input table to the output table, I will maintain a count of the rows loaded. The next time the job runs, I will start reading from row count+1.&lt;BR /&gt;Is this possible?</description>
      <pubDate>Wed, 25 Jun 2014 05:29:53 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311192#M82254</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-25T05:29:53Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311193#M82255</link>
      <description>Hi Sultania,
&lt;BR /&gt;Does the source table also get deletes and updates, or only new insertions?
&lt;BR /&gt;If there are only new insertions, I have another approach:
&lt;BR /&gt;- Once your initial load is completed, create a copy of your source data (Table A is source and Table B is copy)
&lt;BR /&gt;- During second execution use (A-B) to get additional records in A which are not present in B
&lt;BR /&gt;- Insert these new records in target
&lt;BR /&gt;- Flush out B and make another copy of A
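&lt;BR /&gt;A minimal sketch of the A-B steps above, written in Python with SQLite rather than as a Talend job. The table names (table_a, table_b, target), the two columns and the helper functions are hypothetical and only for illustration. EXCEPT compares whole rows, so no unique key is needed, but identical duplicate rows inside A collapse into one.

```python
import sqlite3

def count_rows(conn, table):
    # Table names are fixed, hypothetical identifiers, not user input
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

def load_new_rows(conn):
    """Insert rows of A that are absent from B into target, then refresh B."""
    before = count_rows(conn, "target")
    # A - B: rows of table_a that have no identical row in table_b
    conn.execute("INSERT INTO target SELECT * FROM table_a EXCEPT SELECT * FROM table_b")
    # Flush out B and make another copy of A for the next run
    conn.execute("DELETE FROM table_b")
    conn.execute("INSERT INTO table_b SELECT * FROM table_a")
    conn.commit()
    return count_rows(conn, "target") - before

conn = sqlite3.connect(":memory:")
initial_rows = [("bolt", 10), ("nut", 20)]
for table in ("table_a", "table_b", "target"):
    conn.execute(f"CREATE TABLE {table} (name TEXT, qty INTEGER)")
    # State after the initial load: A, its copy B and target hold the same rows
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", initial_rows)

# A new file adds one row to the source table before the second execution
conn.execute("INSERT INTO table_a VALUES ('washer', 5)")
new_count = load_new_rows(conn)
print(new_count)
```

&lt;BR /&gt;Running load_new_rows again without new source data inserts nothing, because B then matches A.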
&lt;BR /&gt;If there are not too many records, another approach would be to perform an inner join between A and B and take the rejected records from A, which are the insertions (they could be updates as well).
&lt;BR /&gt;Thanks
&lt;BR /&gt;vaibhav</description>
      <pubDate>Wed, 25 Jun 2014 06:04:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311193#M82255</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-25T06:04:10Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311194#M82256</link>
      <description>No, I don't have to delete anything in the source table.
&lt;BR /&gt;The approach you mentioned will work, but I don't want to create a separate table. A lot of memory will also be wasted with this approach.
&lt;BR /&gt;Won't this approach also be inefficient in terms of performance (when there are millions of rows)?
&lt;BR /&gt;Thanks for the suggestion though.</description>
      <pubDate>Wed, 25 Jun 2014 06:43:29 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311194#M82256</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-25T06:43:29Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311195#M82257</link>
      <description>In design, a lot depends on the basic requirements. If you make the requirements clear at the start, it becomes easy to plan an approach.
&lt;BR /&gt;It is better if you provide the use case scenario with all the details, so that a better approach can be devised from it. Have a look at the problem definition in your first post, and then reformulate it.
&lt;BR /&gt;Thanks
&lt;BR /&gt;Vaibhav</description>
      <pubDate>Wed, 25 Jun 2014 08:21:50 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311195#M82257</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-25T08:21:50Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311196#M82258</link>
      <description>Yeah, I should have given more details. 
&lt;BR /&gt;The scenario is like this. 
&lt;BR /&gt;1. File ---&amp;gt; Source Table 
&lt;BR /&gt;2. Source Table -----&amp;gt;tMap--------&amp;gt;Destination Table 
&lt;BR /&gt; 
&lt;BR /&gt;We need to copy the rows of Source that are not already present in Destination to Destination, after transformation.
&lt;BR /&gt;The source table and destination table do not have a unique key. The data is huge (millions of records/rows), so making an extra table will consume extra memory.
&lt;BR /&gt;The 2nd job will run after a new file loads data to Source table.</description>
      <pubDate>Wed, 25 Jun 2014 09:23:36 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311196#M82258</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-25T09:23:36Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311197#M82259</link>
      <description>Hi Sultania,
&lt;BR /&gt;Do you have a unique column or combination of columns which represents a unique row? If you don't have this, I am not sure how to do it...
&lt;BR /&gt;What is the file size?
&lt;BR /&gt;Vaibhav</description>
      <pubDate>Wed, 25 Jun 2014 09:29:40 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311197#M82259</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-25T09:29:40Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311198#M82260</link>
      <description>Each file is approx. 1 MB (3k-4k rows and around 60 columns).
&lt;BR /&gt;There may be thousands of files coming in.
&lt;BR /&gt;Also, I do not have any combination of columns that forms a unique key. I am thinking of inserting an index column and making it work with that.</description>
      <pubDate>Wed, 25 Jun 2014 09:57:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311198#M82260</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-06-25T09:57:45Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311199#M82261</link>
      <description>Hi Vaibhav,
&lt;BR /&gt;I have inserted a timestamp as a unique key. Is there any way other than A-B to achieve the task? A-B consumes a lot of extra memory.</description>
      <pubDate>Tue, 08 Jul 2014 11:52:35 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311199#M82261</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-07-08T11:52:35Z</dc:date>
    </item>
    <item>
      <title>Re: Skip rows</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311200#M82262</link>
      <description>In the tMap properties, you can use a file in place of system memory for processing records.
&lt;BR /&gt;Vaibhav</description>
      <pubDate>Tue, 08 Jul 2014 14:00:39 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Skip-rows/m-p/2311200#M82262</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2014-07-08T14:00:39Z</dc:date>
    </item>
  </channel>
</rss>

