Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I currently have 5 free-text fields in my source data and I want to reduce these down to two fields. The idea being Name1 contains the first non-null value, and Name2 contains the second non-null.
Using nested if statements, I can get Name1 populated, but I am becoming stumped over Name2.
| RowId | Field1 | Field2 | Field3 | Field4 | Field5 |
|---|---|---|---|---|---|
| 1 | Text1 | Text2 | null | null | Text5 |
| 2 | null | Text2 | Text3 | null | null |
| 3 | Text1 | null | null | null | Text5 |
So in the example above, I would expect the following results:
| RowId | Name1 | Name2 |
|---|---|---|
| 1 | Text1 | Text2 |
| 2 | Text2 | Text3 |
| 3 | Text1 | Text5 |
Has anyone got any ideas how to code this in the script?
Thanks
Hi Shoebox,
Try this ...
Data:
Load RowId, Subfield(NameList,';',1) as Name1, Subfield(NameList,';',2) as Name2;
LOAD RowId,
If(IsNull(Field1),'',Field1 & ';') &
If(IsNull(Field2),'',Field2 & ';') &
If(IsNull(Field3),'',Field3 & ';') &
If(IsNull(Field4),'',Field4 & ';') &
If(IsNull(Field5),'',Field5 & ';') as NameList
FROM
[Data.xls]
(biff, embedded labels, table is Sheet1$);
flipside
Can u explain how you build the second table?
The second table would be a left join of fields Name1 and Name2 back to the first table, with Fields1-5 then being dropped.
Thanks, it didn't quite do what I was after, but it did sow a seed of an idea which has worked for me. Although I'm not sure how efficient it is.
Load the data:
| Key | Field1 | Field2 | Field3 | Field4 | Field5 |
|---|---|---|---|---|---|
| Claim1 | Text1 | Text2 | null | null | Text5 |
| Claim2 | null | Text2 | Text3 | null | null |
| Claim3 | Text1 | null | null | null | Text5 |
Create a temporary table with all non-null values in a single field and use the RowNo() function:
| Key | TempName | RowID |
|---|---|---|
| Claim1 | Text1 | 1 |
| Claim1 | Text2 | 2 |
| Claim1 | Text5 | 3 |
| Claim2 | Text2 | 4 |
| Claim2 | Text3 | 5 |
| Claim3 | Text1 | 6 |
| Claim3 | Text5 | 7 |
Create a minimum RowID per Key as Name1ID:
| Key | TempName | RowID | Name1ID |
|---|---|---|---|
| Claim1 | Text1 | 1 | 1 |
| Claim1 | Text2 | 2 | 1 |
| Claim1 | Text5 | 3 | 1 |
| Claim2 | Text2 | 4 | 4 |
| Claim2 | Text3 | 5 | 4 |
| Claim3 | Text1 | 6 | 6 |
| Claim3 | Text5 | 7 | 6 |
Create a minimum RowID per Key, where RowID<>Name1ID as Name2ID:
| Key | TempName | RowID | Name1ID | Name2ID |
|---|---|---|---|---|
| Claim1 | Text1 | 1 | 1 | 2 |
| Claim1 | Text2 | 2 | 1 | 2 |
| Claim1 | Text5 | 3 | 1 | 2 |
| Claim2 | Text2 | 4 | 4 | 5 |
| Claim2 | Text3 | 5 | 4 | 5 |
| Claim3 | Text1 | 6 | 6 | 7 |
| Claim3 | Text5 | 7 | 6 | 7 |
Left join temporary table back to Data where RowID=Name1ID and again for RowID-Name2ID
| RowId | Field1 | Field2 | Field3 | Field4 | Field5 | Name1 | Name2 |
|---|---|---|---|---|---|---|---|
| 1 | Text1 | Text2 | null | null | Text5 | Text1 | Text2 |
| 2 | null | Text2 | Text3 | null | null | Text2 | Text3 |
| 3 | Text1 | null | null | null | Text5 | Text1 | Text5 |
Drop the temporary table and Fields1-5:
| Key | Name1 | Name2 |
|---|---|---|
| Claim1 | Text1 | Text2 |
| Claim2 | Text2 | Text3 |
| Claim3 | Text1 | Text5 |
Phew! As I say, it works, but it's a little messy.
Hi Shoebox,
Try this ...
Data:
Load RowId, Subfield(NameList,';',1) as Name1, Subfield(NameList,';',2) as Name2;
LOAD RowId,
If(IsNull(Field1),'',Field1 & ';') &
If(IsNull(Field2),'',Field2 & ';') &
If(IsNull(Field3),'',Field3 & ';') &
If(IsNull(Field4),'',Field4 & ';') &
If(IsNull(Field5),'',Field5 & ';') as NameList
FROM
[Data.xls]
(biff, embedded labels, table is Sheet1$);
flipside
Excellent. Thank you. That's much tidier.