Problem Description:
The data in our dashboard follows a pipeline in which the back-end team creates CSV files in PySpark for us (the front end) to import into QlikView. There is a column that users want to be able to export to Excel, but this column is being truncated to 254 characters. I suspect this may be a Windows issue because:
1. When the CSV file is opened in Excel on Windows, or loaded into QlikView, the column is truncated to 254 characters.
2. But when the same CSV file is read back into a DataFrame in PySpark, the column in question shows all the data (more than 254 characters).
Question & Options:
1. Is there any code I can add to the load script to make it read all the data (more than 254 characters) in that column?
2. If option 1 is not feasible, what other file format would you recommend for storing this data so the issue goes away?
Neither Excel nor Qlik is restricted to a maximum of 254 characters in a field value. What does the file look like if you open the CSV in an editor like Notepad++? You could also enable the option to show all characters, including invisible ones. I could imagine that there are line breaks inside the value and you just aren't seeing the subsequent lines. In both tools you can adjust how many lines are displayed. You could also apply a check like len(YourField) to count how many characters are actually there.
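To rule out the viewers entirely, one way is to measure the field lengths directly from the raw CSV, bypassing Excel and QlikView. A minimal sketch in plain Python (the file path `dashboard.csv` and column name `long_field` are placeholders, not names from the thread):

```python
import csv

def max_field_length(path, column):
    """Return the length (in characters) of the longest value in `column`.

    Reads the raw CSV directly, so any truncation seen here is in the
    file itself, not in Excel or QlikView.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        return max((len(row[column] or "") for row in reader), default=0)

# Example (hypothetical file and column names):
# print(max_field_length("dashboard.csv", "long_field"))
```

If this reports more than 254 characters, the file is intact and the truncation is happening in whatever reads it afterwards; if it reports exactly 254, something upstream has already cut the data.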
- Marcus
Hi Marcus,
We just found out that the CSV file was being manipulated by another application before it was read by QlikView, and that application was the one truncating it. We tested a new CSV taken from before that application's step, and it loaded with no issues.
best,