We are using the FROM_FIELD clause to parse some JSON documents. Those documents may amount to tens of thousands, or maybe even a few million, lines. We ran some tests, and the log files created look something like this:
```
...
TABLE_NAME: LOAD * FROM_FIELD(JSON_DOCUMENT, JSON_FIELD)(json, utf8, no labels) X fields found: <list of fields> 1 lines fetched
TABLE_NAME: LOAD * FROM_FIELD(JSON_DOCUMENT, JSON_FIELD)(json, utf8, no labels) X fields found: <list of fields> 2 lines fetched
TABLE_NAME: LOAD * FROM_FIELD(JSON_DOCUMENT, JSON_FIELD)(json, utf8, no labels) X fields found: <list of fields> 3 lines fetched
TABLE_NAME: LOAD * FROM_FIELD(JSON_DOCUMENT, JSON_FIELD)(json, utf8, no labels) X fields found: <list of fields> 4 lines fetched
...
```
(and it goes on; if the document has 1000 lines, there will be 1000 log lines like this one)
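For reference, our load looks roughly like the sketch below. The table and field names (`JSON_DOCUMENT`, `JSON_FIELD`, `TABLE_NAME`) are placeholders standing in for our real names; the call and format spec are the ones visible in the log above.

```
// JSON_DOCUMENT: a previously loaded table holding the raw JSON text
// in the field JSON_FIELD. FROM_FIELD re-parses that field as a source.
TABLE_NAME:
LOAD *
FROM_FIELD (JSON_DOCUMENT, JSON_FIELD)
(json, utf8, no labels);
```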
Our understanding is that rows are handled one at a time, and each one is added to the script log. Because of this, the logs become quite heavy: for a 40,000-row document we ended up with a 12 MB txt file. Hence our problem.
We believe an answer to any of these questions would solve our problem. Is there:

- A way to remove/hide the lines generated in the log file for this specific app?
- A better way to use the FROM_FIELD clause?
- A better way to parse JSON documents?