Hello,
I inherited code that reads QVD data into a text string, separated by a "|" symbol, within a text file on a cloud server. Then an $(INCLUDE= .txt) file is written, after which SQL syntax is used to INSERT OVERWRITE the data into a Databricks table store. This is the code (variable names etc. redacted):
//First the fields are read in from the QVD
[TABLE]:
LOAD
A,
B,
C,
D,
E,
TABLE.Edited_By,
TABLE.Edited_At
FROM [lib://location/file.qvd]
(qvd);
//Then the QVD is transmuted into a text string, and then SQL-inserted into a DataBricks table store.
Unqualify *;
table_qvd:
NoConcatenate
LOAD *
RESIDENT TABLE;
Unqualify *;
table_qvd_str:
NoConcatenate
LOAD
CONCAT(
"A" & '|' &
"B" & '|' &
"C" & '|' &
"D" & '|' &
"E" & '|' &
"TABLE.Edited_By" & '|' &
"TABLE.Edited_At" &
'%') AS [SET DATA =] RESIDENT table_qvd;
STORE [SET DATA =] FROM table_qvd_str INTO "lib://location/table_qvd.txt" (TXT);
$(INCLUDE=[lib://location/table_qvd.txt]);
SQL INSERT OVERWRITE workspace.databricks_table VALUES ('$(DATA)')!EXECUTE_NON_SELECT_QUERY;
DROP TABLES table_qvd, table_qvd_str;
...generating this error at the point at which "Edited_At" and "Edited_By" are separated
Thank you again for all your help. A REPLACE(FIELD, ';', ',') AS FIELD has solved this problem by replacing the problematic semicolon in the automatic audit field with a comma. All good now 🙂
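For anyone who hits the same error: the fix can be applied inside the resident load that builds the string, before the CONCAT runs. A minimal sketch, assuming (as the thread suggests) that the stray semicolon lives in the audit field TABLE.Edited_At; table and field names are the redacted placeholders from the script above:

```
// Swap every ';' in the audit timestamp for ',' so the downstream
// delimiter logic and the SQL statement are no longer broken by it.
table_qvd:
NoConcatenate
LOAD
    *,
    Replace(TABLE.Edited_At, ';', ',') AS Edited_At_Clean
RESIDENT TABLE;
```

The cleaned field (Edited_At_Clean here, a name chosen for illustration) would then be used in place of TABLE.Edited_At inside the CONCAT expression.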
I suggest comparing the three stages (table from the QVD, string from the resident load, txt file per editor) between the working and the failing executions. I assume the differences will be easy to detect. I could imagine that a comma, quoting, or something similar crept in through the content of the data or through different interpretation settings.
Thank you Marcus! Yes, will do and report back. Appreciate it.
Very respectfully,
Jonathan
Hello Marcus,
Something strange is going on in the encoding of one text file, vs. the encoding of the other. I don't know why, because the QVD fields and QVDs from which they derive, via VizLib WriteBack forms, are all formatted identically when it comes to the Audit timestamp column that is causing this unexpected token in one instance.
When I view the successful text file in Qlik's file viewer as a delimited file with a semicolon delimiter, I see the entire concatenated string. When I view the unsuccessful text file the same way, also with a semicolon delimiter, the text is severely truncated, even though the format of every field between the manually concatenated pipe delimiters is exactly the same. Very strange.
Further, when I manually load the unsuccessful text data file via the data load script that the Qlik viewer's "insert script" generates, with a COMMA delimiter (which, unlike the unsuccessful semicolon, lets the whole string be viewed in the Qlik viewer), the DLE runs successfully:
LOAD
@1
FROM [lib:URL/file.txt]
(txt, utf8, no labels, delimiter is ',', no quotes);
However, I don't know how to get the above loaded data/field into my Databricks table via SQL INSERT OVERWRITE. The $(INCLUDE=) doesn't work, and using ('$(DATA)') bypasses the data loaded above in favor of the last $(INCLUDE=) data.
I think if I knew how to get the manually scripted data into Databricks via SQL, this would be solved. Thoughts? Thank you immeasurably.
Very respectfully,
Jonathan
Try looking at the results with an editor like Notepad++ to see all characters; maybe there are unexpected invisible ones, such as line breaks or NULL. It's also possible to check this information within a Qlik load with something like:
load *, ord(Chr) as Ord;
load *, mid(Data, iterno(), 1) as Chr, iterno() as IterNo while iterno() <= len(Data);
load [@1:n] as Data from table_qvd.txt (fix, utf8, embedded labels);
Besides the extra commas/quoting mentioned above, it's also possible that the data causes a line break or EOF marker or similar when the txt is stored. The old Windows libraries used here include various interpretation logic that may trigger such a transformation; \n and similar are quite common.
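If stray control characters turn out to be the culprit, one way to rule them out is to purge them explicitly before the STORE. A hedged sketch using Qlik's PurgeChar() against the table names from the redacted script above (table_qvd_clean is an illustrative name):

```
// Remove CR, LF and NUL from the concatenated string before storing,
// so the txt round-trip cannot introduce unexpected line breaks.
table_qvd_clean:
NoConcatenate
LOAD
    PurgeChar([SET DATA =], chr(13) & chr(10) & chr(0)) AS [SET DATA =]
RESIDENT table_qvd_str;
```

The cleaned table would then be the one passed to STORE instead of table_qvd_str.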
Thank you for this information, and definitely will do, but do you know of any SQL INSERT OVERWRITE syntax that can just get the data loaded with:
LOAD
@1
FROM [lib:URL/file.txt]
(txt, utf8, no labels, delimiter is ',', no quotes);
...stored into a DataBricks table? If so, I think this problem may be solved faster. Thank you!
You could try skipping the store-as-txt and the re-fetch via $(INCLUDE=) by assigning the content directly to a variable, like:
let MyVar = peek('SET DATA =', 0, 'table_qvd_str');
and then using this variable within the SQL.
Besides this, it may be helpful to show the variable content in the progress window as well as in the document log before it is used within a statement, for example:
TRACE $(MyVar);
If anything breaks, this makes it easier to find the reason.
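Putting the two suggestions together, the whole STORE/$(INCLUDE=) round-trip could be replaced by the variable. A sketch against the redacted names from the original script (not tested against a live Databricks connection):

```
// Pull the single aggregated row straight out of the resident table.
LET MyVar = Peek('SET DATA =', 0, 'table_qvd_str');

// Echo the content to the progress window / document log first,
// so a malformed string is visible before the SQL fires.
TRACE $(MyVar);

// Feed the variable directly into the Databricks insert.
SQL INSERT OVERWRITE workspace.databricks_table VALUES ('$(MyVar)')!EXECUTE_NON_SELECT_QUERY;
```

This avoids the txt file entirely, which also sidesteps whatever encoding transformation the store/reload was applying.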
Awesome. Thank you for this information.
Additional question: this is all taking place on a Qlik Cloud, Amazon Enterprise instance, so I can't easily use Notepad++ to investigate hidden characters. Any recommendations? No worries if not. Thanks for all your help!
With the load of the txt (and/or also the QVD) hinted at above, looping through the field and extracting each single character together with its numeric character code, you can investigate the entire content. What's important with such an approach is using RecNo()/RowNo()/IterNo() so you can view the results in their original order and also merge them back properly later (if needed).
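Marcus's character dump, written out with the ordering fields included so every character can be traced back to its source line and position. A sketch assuming the txt file written by the script earlier in the thread; the lib:// path and field names are placeholders:

```
// Preceding load chain, executed bottom-up:
// bottom  - read each raw line of the file, tagging it with RecNo();
// middle  - explode each line into one row per character via IterNo();
// top     - add the numeric code point of each character with Ord().
char_dump:
LOAD *, Ord(Chr) AS Ord;
LOAD *,
    Mid(Data, IterNo(), 1) AS Chr,
    IterNo() AS CharPos
WHILE IterNo() <= Len(Data);
LOAD
    RecNo() AS LinePos,
    [@1:n] AS Data
FROM [lib://location/table_qvd.txt] (fix, utf8, no labels);
```

Sorting the result by LinePos and CharPos reproduces the file in order, and any unexpected code point (10, 13, 0, etc.) stands out immediately.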