<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Split CSV to many files based on key in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/Split-CSV-to-many-files-based-on-key/m-p/2358401#M123442</link>
    <description>I have a csv which looks something like this 
&lt;BR /&gt;a, col1, col2, col3 
&lt;BR /&gt;a, col1, col2, col3 
&lt;BR /&gt;a, col1, col2, col3 
&lt;BR /&gt;b, col1, col2, col3 
&lt;BR /&gt;b, col1, col2, col3 
&lt;BR /&gt;c, col1, col2, col3 
&lt;BR /&gt;c, col1, col2, col3 
&lt;BR /&gt;The first column starts with a key (a,b,c), and then the rest of the columns follow. What I want to do is read in the csv (got that covered) and then split the csv based on key, so I have 3 chunks/ groups of data and then convert each of those chunks of data into a separate json file, which I think I can get. 
&lt;BR /&gt;This question is not much different from 
&lt;A href="http://www.talendforge.org/forum/viewtopic.php?pid=101372#p101372" target="_blank" rel="nofollow noopener noreferrer"&gt;http://www.talendforge.org/forum/viewtopic.php?pid=101372#p101372&lt;/A&gt;. 
&lt;BR /&gt;I don't know how many different keys are available so want to build something that doesn't mind about new keys. 
&lt;BR /&gt;Essentially I want to - 
&lt;BR /&gt;Read -&amp;gt; group based on key -&amp;gt; for each group transform to JSON. 
&lt;BR /&gt;The transforming to JSON is something I'm happy to play with, my question really focuses around the grouping. 
&lt;BR /&gt;From the above question I've done the following - 
&lt;BR /&gt;tFileInputDelimited ----row 1 main ---&amp;gt; tFlowToIterate ---iterate---&amp;gt; tFixedFlowInput --- row2 (main) ---&amp;gt; tFileOutputDelimited 
&lt;BR /&gt;This creates lots of keyed filenames, which is good, but the content of the files is the same on each row, when it shouldn't be. 
&lt;BR /&gt;Any ideas? 
&lt;BR /&gt;David</description>
    <pubDate>Thu, 15 Aug 2013 12:58:08 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2013-08-15T12:58:08Z</dc:date>
    <item>
      <title>Split CSV to many files based on key</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Split-CSV-to-many-files-based-on-key/m-p/2358401#M123442</link>
      <description>I have a csv which looks something like this 
&lt;BR /&gt;a, col1, col2, col3 
&lt;BR /&gt;a, col1, col2, col3 
&lt;BR /&gt;a, col1, col2, col3 
&lt;BR /&gt;b, col1, col2, col3 
&lt;BR /&gt;b, col1, col2, col3 
&lt;BR /&gt;c, col1, col2, col3 
&lt;BR /&gt;c, col1, col2, col3 
&lt;BR /&gt;The first column starts with a key (a,b,c), and then the rest of the columns follow. What I want to do is read in the csv (got that covered) and then split the csv based on key, so I have 3 chunks/ groups of data and then convert each of those chunks of data into a separate json file, which I think I can get. 
&lt;BR /&gt;This question is not much different from 
&lt;A href="http://www.talendforge.org/forum/viewtopic.php?pid=101372#p101372" target="_blank" rel="nofollow noopener noreferrer"&gt;http://www.talendforge.org/forum/viewtopic.php?pid=101372#p101372&lt;/A&gt;. 
&lt;BR /&gt;I don't know how many different keys are available so want to build something that doesn't mind about new keys. 
&lt;BR /&gt;Essentially I want to - 
&lt;BR /&gt;Read -&amp;gt; group based on key -&amp;gt; for each group transform to JSON. 
&lt;BR /&gt;The transforming to JSON is something I'm happy to play with, my question really focuses around the grouping. 
&lt;BR /&gt;From the above question I've done the following - 
&lt;BR /&gt;tFileInputDelimited ----row 1 main ---&amp;gt; tFlowToIterate ---iterate---&amp;gt; tFixedFlowInput --- row2 (main) ---&amp;gt; tFileOutputDelimited 
&lt;BR /&gt;This creates lots of keyed filenames, which is good, but the content of the files is the same on each row, when it shouldn't be. 
&lt;BR /&gt;Any ideas? 
&lt;BR /&gt;David</description>
      <pubDate>Thu, 15 Aug 2013 12:58:08 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Split-CSV-to-many-files-based-on-key/m-p/2358401#M123442</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-08-15T12:58:08Z</dc:date>
    </item>
    <item>
      <title>Re: Split CSV to many files based on key</title>
      <link>https://community.qlik.com/t5/Talend-Studio/Split-CSV-to-many-files-based-on-key/m-p/2358402#M123443</link>
      <description>Hi David 
&lt;BR /&gt; 
&lt;BLOCKQUOTE&gt; 
 &lt;TABLE border="1"&gt; 
  &lt;TBODY&gt; 
   &lt;TR&gt; 
    &lt;TD&gt;tFileInputDelimited ----row 1 main ---&amp;gt; tFlowToIterate ---iterate---&amp;gt; tFixedFlowInput --- row2 (main) ---&amp;gt; tFileOutputDelimited&lt;BR /&gt;This creates lots of keyed filenames, which is good, but the content of the files is the same on each row, when it shouldn't be.&lt;/TD&gt; 
   &lt;/TR&gt; 
  &lt;/TBODY&gt; 
 &lt;/TABLE&gt; 
&lt;/BLOCKQUOTE&gt; 
&lt;BR /&gt;Set a dynamic file path based on the first column, for example: 
&lt;BR /&gt;"D:/file/"+(String)globalMap.get("row1.column1")+".csv" 
&lt;BR /&gt;and check the 'append' option on tFileOutputDelimited, so as to append the record to an existing file. 
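&lt;BR /&gt;For readers who want to see the idea outside of Talend, the append-per-key approach can be sketched in plain Java. This is only an illustration under stated assumptions, not what the Talend components generate: the class name SplitByKey, the method splitByKey, and the output directory are hypothetical, the key is assumed to be the first comma-separated field, and each key is assumed to be safe to use as a file name. It needs Java 11 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: routes each CSV line to a file named after its first column.
final class SplitByKey {

    // Appends every non-blank line to outDir/KEY.csv, where KEY is the first field.
    // Opening with CREATE and APPEND mirrors the 'append' option: rows for a key
    // that was already seen are added to the existing file instead of replacing it.
    static void splitByKey(String[] lines, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        for (String line : lines) {
            if (line.isBlank()) {
                continue; // skip empty lines
            }
            // Assumption: the key is everything before the first comma.
            String key = line.split(",", 2)[0].trim();
            Path target = outDir.resolve(key + ".csv");
            Files.writeString(target, line + System.lineSeparator(),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }
}
```

&lt;BR /&gt;Because the target path is computed per row, new keys need no extra configuration. Note that running it twice against the same directory appends the rows again, so clear the output directory first if that matters.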
&lt;BR /&gt;Shong</description>
      <pubDate>Thu, 15 Aug 2013 15:17:27 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/Split-CSV-to-many-files-based-on-key/m-p/2358402#M123443</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2013-08-15T15:17:27Z</dc:date>
    </item>
  </channel>
</rss>

