Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi Experts,
The excel is my source and it has millions of records, wherein I need to split the multiple string values (wrapped text) into the next individual rows not columns. I tried subfield with a space as delimiter, but I didn't get the required output. Doing it in excel is a manual work even using the function as Textsplit(A2,,char(10))
Going ahead with the Textsplit function in excel I need to make sure that there are empty rows beneath the row from where I need to extract the individual values to the next row and all this while I have been manually inserting the rows.
Is there a way to do it on Qlik Sense which would resolve the manual effort. Some rows have single string value which is fine, only the cases where multiple values are wrapped.
my data looks like this:
Questionnaire ID | Supporting Evidence |
SOX-SA-177480 | Test Plan_USA_32.1.1.1.4_Jan12.xls |
SOX-SA-177481 | List of Accruals with Descriptions Q1-2012.xls List of Accruals with Descriptions Q1-2012.xls RCTS Documentation - SOX Compliance - Accruals.msg RE RCTS Documentation - SOX Compliance-Accruals 1.msg RE RCTS Documentation - SOX Compliance-Accruals 2.msg |
SOX-SA-177482 | Support Services Accounting Guidelines.pdf |
SOX-SA-177483 | SigEntries.pdf |
SOX-SA-177484 | Dec 11 GA Doc List.xlsx GA Doc List - Nov Dec Jan.xls RCTS Documentation - SOX Compliance - Supporting Doc.msg RE RCTS Documentation - SOX Compliance.msg |
SOX-SA-177485 | List of Accruals with Descriptions Q1-2012.xls List of Accruals with Descriptions Q1-2012.xls RCTS Documentation - SOX Compliance - Accruals.msg RE RCTS Documentation - SOX Compliance-Accruals 1.msg RE RCTS Documentation - SOX Compliance-Accruals 2.msg |
The desired output is to be:
Questionnaire ID | Supporting Evidence |
SOX-SA-177480 | Test Plan_USA_32.1.1.1.4_Jan12.xls |
SOX-SA-177481 | List of Accruals with Descriptions Q1-2012.xls |
SOX-SA-177481 | RCTS Documentation - SOX Compliance - Accruals.msg |
SOX-SA-177481 | RE RCTS Documentation - SOX Compliance-Accruals 1.msg |
SOX-SA-177481 | RE RCTS Documentation - SOX Compliance-Accruals 2.msg |
SOX-SA-177481 | List of Accruals with Descriptions Q1-2012.xls |
SOX-SA-177482 | Support Services Accounting Guidelines.pdf |
SOX-SA-177483 | SigEntries.pdf |
SOX-SA-177484 | Dec 11 GA Doc List.xlsx |
SOX-SA-177484 | GA Doc List - Nov Dec Jan.xls |
SOX-SA-177484 | RCTS Documentation - SOX Compliance - Supporting Doc.msg |
SOX-SA-177484 | RE RCTS Documentation - SOX Compliance.msg |
SOX-SA-177485 | List of Accruals with Descriptions Q1-2012.xls |
SOX-SA-177485 | List of Accruals with Descriptions Q1-2012.xls |
SOX-SA-177485 | RCTS Documentation - SOX Compliance - Accruals.msg |
SOX-SA-177485 | RE RCTS Documentation - SOX Compliance-Accruals 1.msg |
SOX-SA-177485 | RE RCTS Documentation - SOX Compliance-Accruals 2.msg |
Also, I need to extract the file extension (.xls, .xlsx, .pdf, .msg, .tif) in a different column for all the evidence once these are splitted into different rows. Taking the right substring function is not helping as there is no uniformity in the extensions as they are xlsx or xls and so on. Any help is appreciated.
@appi_priyakarna try below. This will create duplicate rows for each file type.
Data:
Load *,
'.' & SubField("Supporting Evidence",'.',-1) as FileType;
LOAD
"Questionnaire ID",
"Business Unit Function/Region",
"Business Unit Name",
"Assessment Status",
Target,
"Control Question Name",
"SOX Priority",
Assessor,
Reviewer,
"SOX Response",
"SOX Process Name",
"SOX Sub-Process Category",
"SOX Tier",
"Action Item",
"Assessor Due Date",
"Assessor Status",
"Assessor Submit Date",
"Business Unit Division",
"Campaign Name",
"Campaign Status",
"Campaign Type",
"Control Activity ID",
"Control Background Information",
"Control Objective",
"Control Question Text",
"Company Level Control",
"Assessor Comments on Functioning of Control",
"Reviewer's Conclusion on Functioning of Control",
"Deficiency Review Status",
"Description of Local Process and Comments",
"Estimated Financial Risk",
"Financial Risk Description",
"Hard Close?",
"ICD Analysis",
"ICD Assignment",
"ICD Comments",
"ICD Rating",
"ICD Status",
"ICD Submit Date",
"Inherited Permissions",
"Is there a test template sample size?",
"Last Updated",
"Link to Testing Template",
"Manager's SOX Tier",
"Number Compliant in Sample Size",
Period,
"Policy Reference",
"Potential Exposure",
"Current Quarter",
Quarter,
"Reviewer Due Date",
"Reviewer Status",
"Reviewer Submit Date",
"Reviewer Timeliness",
"Root Cause",
"Assessor Root Cause",
"Sample Size",
"SOX Response Type",
SubField("Supporting Evidence",chr(10)) as "Supporting Evidence",
"Target Text String",
"Tracking ID",
"Year",
"Current Year"
FROM [lib://DataFiles/Data1.xlsx]
(ooxml, embedded labels, table is Sheet1);
Can someone please provide their input on the query above. It's a little urgent.
can you post a small sample as excel file?
@appi_priyakarna try below
Data:
LOAD RecNo() as Row,
"Questionnaire ID",
"Supporting Evidence"
FROM [lib://Desktop/test.xlsx]
(ooxml, embedded labels, table is Sheet1);
Final:
NoConcatenate
Load *,
'.'& SubField("Supporting Evidence",'.',-1) as FileType,
if(Len(Trim("Questionnaire ID"))=0,Peek('Questionnaire ID1'),[Questionnaire ID]) as [Questionnaire ID1]
Resident Data
Order by Row;
Drop Table Data;
Drop Fields [Questionnaire ID],Row;
Rename Field [Questionnaire ID1] to [Questionnaire ID];
@Kushal_Chawda Thanks a lot for the solution but is still not working in my case. Below is what I am getting. I am attaching a sample data in the excel and all the fields are to be taken from the excel.
The script:
Data:
LOAD
RecNo() as Row,
"Questionnaire ID",
"Business Unit Function/Region",
"Business Unit Name",
"Assessment Status",
Target,
"Control Question Name",
"SOX Priority",
Assessor,
Reviewer,
"SOX Response",
"SOX Process Name",
"SOX Sub-Process Category",
"SOX Tier",
"Action Item",
"Assessor Due Date",
"Assessor Status",
"Assessor Submit Date",
"Business Unit Division",
"Campaign Name",
"Campaign Status",
"Campaign Type",
"Control Activity ID",
"Control Background Information",
"Control Objective",
"Control Question Text",
"Company Level Control",
"Assessor Comments on Functioning of Control",
"Reviewer's Conclusion on Functioning of Control",
"Deficiency Review Status",
"Description of Local Process and Comments",
"Estimated Financial Risk",
"Financial Risk Description",
"Hard Close?",
"ICD Analysis",
"ICD Assignment",
"ICD Comments",
"ICD Rating",
"ICD Status",
"ICD Submit Date",
"Inherited Permissions",
"Is there a test template sample size?",
"Last Updated",
"Link to Testing Template",
"Manager's SOX Tier",
"Number Compliant in Sample Size",
Period,
"Policy Reference",
"Potential Exposure",
"Current Quarter",
Quarter,
"Reviewer Due Date",
"Reviewer Status",
"Reviewer Submit Date",
"Reviewer Timeliness",
"Root Cause",
"Assessor Root Cause",
"Sample Size",
"SOX Response Type",
"Supporting Evidence",
"Target Text String",
"Tracking ID",
"Year",
"Current Year"
FROM [lib://DataFiles/Data1.xlsx]
(ooxml, embedded labels, table is [Archer Search Report]);
Final:
NoConcatenate
load *,
'.'& SubField("Supporting Evidence",'.',-1) as FileType,
if(Len(Trim("Questionnaire ID"))=0,Peek('Questionnaire ID1'),[Questionnaire ID]) as [Questionnaire ID1]
Resident Data
Order by Row;
Drop Table Data;
Drop Fields [Questionnaire ID],Row;
Rename Field [Questionnaire ID1] to [Questionnaire ID];
let me know what is not working here.
@appi_priyakarna try below. This will create duplicate rows for each file type.
Data:
Load *,
'.' & SubField("Supporting Evidence",'.',-1) as FileType;
LOAD
"Questionnaire ID",
"Business Unit Function/Region",
"Business Unit Name",
"Assessment Status",
Target,
"Control Question Name",
"SOX Priority",
Assessor,
Reviewer,
"SOX Response",
"SOX Process Name",
"SOX Sub-Process Category",
"SOX Tier",
"Action Item",
"Assessor Due Date",
"Assessor Status",
"Assessor Submit Date",
"Business Unit Division",
"Campaign Name",
"Campaign Status",
"Campaign Type",
"Control Activity ID",
"Control Background Information",
"Control Objective",
"Control Question Text",
"Company Level Control",
"Assessor Comments on Functioning of Control",
"Reviewer's Conclusion on Functioning of Control",
"Deficiency Review Status",
"Description of Local Process and Comments",
"Estimated Financial Risk",
"Financial Risk Description",
"Hard Close?",
"ICD Analysis",
"ICD Assignment",
"ICD Comments",
"ICD Rating",
"ICD Status",
"ICD Submit Date",
"Inherited Permissions",
"Is there a test template sample size?",
"Last Updated",
"Link to Testing Template",
"Manager's SOX Tier",
"Number Compliant in Sample Size",
Period,
"Policy Reference",
"Potential Exposure",
"Current Quarter",
Quarter,
"Reviewer Due Date",
"Reviewer Status",
"Reviewer Submit Date",
"Reviewer Timeliness",
"Root Cause",
"Assessor Root Cause",
"Sample Size",
"SOX Response Type",
SubField("Supporting Evidence",chr(10)) as "Supporting Evidence",
"Target Text String",
"Tracking ID",
"Year",
"Current Year"
FROM [lib://DataFiles/Data1.xlsx]
(ooxml, embedded labels, table is Sheet1);
@Kushal_Chawda thank you so much for the above solution. It worked as expected and was able to pull the records for all the file types.
independent question, moved to new thread
Request someone to pls help with a solution on the second query posted.