<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: HDFS file to Hive table - file format mismatch in Talend Studio</title>
    <link>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258673#M40403</link>
    <description>&lt;P&gt;I got escaping to work! You need to quadruple the backslash, so it appears in the tHiveCreateTable component as "\\\\".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://jira.talendforge.org/browse/TBD-7964" target="_blank" rel="nofollow noopener noreferrer"&gt;https://jira.talendforge.org/browse/TBD-7964&lt;/A&gt; created as this feels like a bug to me. Certainly needs to be documented!&lt;/P&gt;</description>
    <pubDate>Wed, 07 Nov 2018 14:49:10 GMT</pubDate>
    <dc:creator>PhilHibbs</dc:creator>
    <dc:date>2018-11-07T14:49:10Z</dc:date>
    <item>
      <title>HDFS file to Hive table - file format mismatch</title>
      <link>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258668#M40398</link>
      <description>&lt;P&gt;I have a job which creates a Hive table, transfers a file to HDFS, and loads the data from the file into the hive table. At least, that's what I want it to do.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;It falls down at the final step, with this error:&lt;/P&gt; 
&lt;P&gt;Error while compiling statement: FAILED: SemanticException Unable to load data to destination table. Error: The file that you are trying to load does not match the file format of the destination table.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;I'm trying a super-minimal case with the table just having a single integer column, and the file just containing the number 3 and a newline.&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tHiveConnection_1.png" style="width: 642px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lzcc.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/131655i43AA1178B411AD8F/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lzcc.png" alt="0683p000009Lzcc.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tHDFSPut_1.png" style="width: 756px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lzmb.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/152699iC3F25B6C6C92BF18/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lzmb.png" alt="0683p000009Lzmb.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tHiveLoad.png" style="width: 767px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009Lzdv.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/149226i133D1F2ABFF34E6B/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009Lzdv.png" alt="0683p000009Lzdv.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 16 Nov 2024 07:43:27 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258668#M40398</guid>
      <dc:creator>PhilHibbs</dc:creator>
      <dc:date>2024-11-16T07:43:27Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS file to Hive table - file format mismatch</title>
      <link>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258669#M40399</link>
      <description>&lt;P&gt;You are using "create table if not exists".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;First, of course, check: does the table have the same structure&amp;nbsp;as the GENERIC schema?&lt;/P&gt;&lt;P&gt;Then: does the table&amp;nbsp;have the same format? (Plain text file, not Parquet,&amp;nbsp;etc.)&lt;/P&gt;&lt;P&gt;Does the table use the same delimiters as the file?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 05 Oct 2018 08:39:07 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258669#M40399</guid>
      <dc:creator>vapukov</dc:creator>
      <dc:date>2018-10-05T08:39:07Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS file to Hive table - file format mismatch</title>
      <link>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258670#M40400</link>
      <description>&lt;P&gt;&lt;A href="https://community.qlik.com/s/profile/0053p000006dyyJAAQ"&gt;@PhilHibbs&lt;/A&gt;, make sure the schema of the Hive table matches the HDFS file. Also, make sure you specify the same path that you specified&amp;nbsp;in tHDFSPut, since you are reading the same file that you loaded into HDFS.&lt;/P&gt;</description>
      <pubDate>Fri, 05 Oct 2018 10:51:11 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258670#M40400</guid>
      <dc:creator>manodwhb</dc:creator>
      <dc:date>2018-10-05T10:51:11Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS file to Hive table - file format mismatch</title>
      <link>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258671#M40401</link>
      <description>&lt;P&gt;I got this working, but I'm not 100% sure what the problem was.&lt;BR /&gt;&lt;BR /&gt;The issue now is that I could only get it working by not having any delimiters (by which I mean quotes, not the comma separator) or escape characters. If I tick the "Escape" box in the tHiveCreateTable component, I get this error:&lt;BR /&gt;&lt;BR /&gt;Error while compiling statement: FAILED: ParseException line 2:20 character '&amp;lt;EOF&amp;gt;' not supported here&lt;/P&gt; 
&lt;P&gt;&lt;BR /&gt;&lt;SPAN class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="tHiveCreateTable.png" style="width: 999px;"&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="0683p000009M0tn.png"&gt;&lt;img src="https://community.qlik.com/t5/image/serverpage/image-id/130393i30C903C5E8B5C3CF/image-size/large?v=v2&amp;amp;px=999" role="button" title="0683p000009M0tn.png" alt="0683p000009M0tn.png" /&gt;&lt;/span&gt;&lt;/SPAN&gt;&lt;/P&gt; 
&lt;P&gt;&amp;nbsp;&lt;/P&gt; 
&lt;P&gt;My ultimate objective is to be able to load an email address such as&amp;nbsp;a","a&lt;SPAN&gt;!#$%&amp;amp;'*+-/=?^_`{|}~&lt;/SPAN&gt;@aaa.net into a Hive table.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Nov 2018 13:08:29 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258671#M40401</guid>
      <dc:creator>PhilHibbs</dc:creator>
      <dc:date>2018-11-07T13:08:29Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS file to Hive table - file format mismatch</title>
      <link>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258672#M40402</link>
      <description>The essential problem seems to be that the tHiveCreateTable has no equivalent to "Text Enclosure" in a tFileOutputDelimited. Although it says "Set Delimited row format", you can only specify a separator, not a delimiter, and the Escape doesn't work (or, I can't get it to work).</description>
      <pubDate>Wed, 07 Nov 2018 13:45:19 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258672#M40402</guid>
      <dc:creator>PhilHibbs</dc:creator>
      <dc:date>2018-11-07T13:45:19Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS file to Hive table - file format mismatch</title>
      <link>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258673#M40403</link>
      <description>&lt;P&gt;I got escaping to work! You need to quadruple the backslash, so it appears in the tHiveCreateTable component as "\\\\".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
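&lt;P&gt;A possible explanation for the four backslashes, sketched in Java. This assumes (it is not confirmed anywhere in this thread) that the component field is a Java string expression whose value is pasted into a single-quoted HiveQL literal after ESCAPED BY, and that Hive then collapses backslash escapes inside that literal. The class EscapeLayers and the helper unescapeSqlLiteral are hypothetical names that only imitate that second step.&lt;/P&gt;

```java
class EscapeLayers {
    // Hypothetical helper (an assumption, not Hive code): imitates how a
    // SQL-style parser collapses backslash escapes inside a quoted literal,
    // so that two backslashes become one.
    static String unescapeSqlLiteral(String body) {
        StringBuilder out = new StringBuilder();
        boolean pending = false; // true right after an unconsumed backslash
        for (char c : body.toCharArray()) {
            if (pending) {
                out.append(c);   // the escaped character is taken literally
                pending = false;
            } else if (c == '\\') {
                pending = true;  // start of an escape sequence
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Layer 1: the Java compiler turns the four-backslash source
        // literal into a two-character string.
        String fieldValue = "\\\\";
        System.out.println(fieldValue.length()); // prints 2

        // Layer 2 (assumed): the value is pasted into the generated DDL.
        String ddl = "ESCAPED BY '" + fieldValue + "'";
        System.out.println(ddl); // prints ESCAPED BY '\\'

        // Layer 3 (assumed): the SQL parser collapses the pair to one backslash.
        System.out.println(unescapeSqlLiteral(fieldValue).length()); // prints 1
    }
}
```

&lt;P&gt;Under the same assumptions, typing only two backslashes in the field would leave a single backslash in front of the closing quote of the literal, which would leave it unterminated. That would be consistent with the ParseException reported earlier in the thread.&lt;/P&gt;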
&lt;P&gt;&lt;A href="https://jira.talendforge.org/browse/TBD-7964" target="_blank" rel="nofollow noopener noreferrer"&gt;https://jira.talendforge.org/browse/TBD-7964&lt;/A&gt; created as this feels like a bug to me. Certainly needs to be documented!&lt;/P&gt;</description>
      <pubDate>Wed, 07 Nov 2018 14:49:10 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258673#M40403</guid>
      <dc:creator>PhilHibbs</dc:creator>
      <dc:date>2018-11-07T14:49:10Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS file to Hive table - file format mismatch</title>
      <link>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258674#M40404</link>
      <description>&lt;P&gt;I'm struggling with this again. I thought I got it working a while back, but I can't get it working now!&lt;/P&gt;
&lt;P&gt;My problem is with a comma in the data. For example, this line of data in the file:&lt;/P&gt;
&lt;PRE&gt;"2019-05-16T10:05:44.399Z","12","400","{ \"statusCode\": \"400\", \"details\": \"Schema validation error\" }"&lt;/PRE&gt;
&lt;P&gt;The last column gets truncated at the first comma so all I get is &lt;FONT face="courier new,courier" color="#0000FF"&gt;{ "statusCode": "400"&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Or this, I can reformat the file if needed:&lt;/P&gt;
&lt;PRE&gt;"2019-05-16T10:05:44.399Z","12","400","{ ""statusCode"": ""400"", ""details"": ""Schema validation error"" }"&lt;/PRE&gt;
&lt;P&gt;So I don't mind if the load file needs Excel-style CSV quoting or C/Java-style escaping. Either will do, but it needs to be able to load quotes, commas, etc.&lt;/P&gt;
&lt;P&gt;Like I said earlier, the escaping can be done by quadrupling the backslash: "\\\\"&lt;/P&gt;
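&lt;P&gt;With a backslash escape character set on the table, one option is to write the load file without quotes and backslash-escape the separator inside each field. A minimal sketch, assuming a comma separator and a backslash escape character. The class UnquotedRow and its methods escapeField and joinRow are hypothetical names, not part of Talend or Hive.&lt;/P&gt;

```java
class UnquotedRow {
    // Hypothetical helper: escape backslashes first, then commas, so that a
    // reader using backslash as its escape character can reverse both steps.
    // String.replace works on literal text, not on regular expressions.
    static String escapeField(String value) {
        return value.replace("\\", "\\\\").replace(",", "\\,");
    }

    // Joins already-separated field values into one unquoted row.
    static String joinRow(String... fields) {
        StringBuilder row = new StringBuilder();
        String sep = "";
        for (String f : fields) {
            row.append(sep).append(escapeField(f));
            sep = ",";
        }
        return row.toString();
    }

    public static void main(String[] args) {
        String json = "{ \"statusCode\": \"400\", \"details\": \"Schema validation error\" }";
        // The comma inside the JSON text is escaped, the separators are not.
        System.out.println(joinRow("2019-05-16T10:05:44.399Z", "12", "400", json));
    }
}
```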
&lt;P&gt;However, there is no quote specifier in the tHiveCreateTable component.&lt;/P&gt;
&lt;P&gt;Is it possible with Serde row format?&lt;/P&gt;</description>
      <pubDate>Thu, 16 May 2019 13:18:14 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258674#M40404</guid>
      <dc:creator>PhilHibbs</dc:creator>
      <dc:date>2019-05-16T13:18:14Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS file to Hive table - file format mismatch</title>
      <link>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258675#M40405</link>
      <description>&lt;P&gt;I have worked around it by writing out the file without quotes, so my row looks like this:&lt;/P&gt;&lt;PRE&gt;2019-05-16T10:05:44.399Z,12,GB,BF0073,400,{ "statusCode": "400"\, "details": "Schema validation error" }&lt;/PRE&gt;&lt;P&gt;I had to manually code the escaping of the comma, so every column that might contain a comma has to have &lt;FONT face="courier new,courier" color="#0000FF"&gt;.replaceAll(",", "\\\\,")&lt;/FONT&gt; applied to it before writing. Not ideal.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 16 May 2019 13:20:45 GMT</pubDate>
      <guid>https://community.qlik.com/t5/Talend-Studio/HDFS-file-to-Hive-table-file-format-mismatch/m-p/2258675#M40405</guid>
      <dc:creator>PhilHibbs</dc:creator>
      <dc:date>2019-05-16T13:20:45Z</dc:date>
    </item>
  </channel>
</rss>

