topic Data cleansing on existing table in QlikView

Data cleansing on existing table

Sat, 05 Dec 2009 04:53:20 GMT

Dear All,

I've got a problem with creating new data based on existing.

There is a Excel file with just two columns (No and Name):

No, Name
1, Entre-
2, os-Rios
3, Albergavia-
4, a-Velha
5, Albergavia-
6, a-Novo
7, Riba-
8, mandego
9, Puebla de
10, Sanabria
11,
12, ert2
13, 1234
14, Vale
15, de Canba
16, Varzea
17, Cova
18, PORTO
19, FARO
20, Moledo do Minho

The result should look like..

Nr, Name
1, Entre-os-Rios
2, Albergavia-a-Velha
3, Albergavia-a-Novo
4, Ribamandego
5, Puebla de Sanabria
6, Vale de Canba
7, Varzea Cova
8, Porto
9, Faro
10, Moledo do Minho

Any idea how to accomplish this in QV?

Thank you.

Regards,

Nick

Data cleansing on existing table

Sat, 05 Dec 2009 16:57:11 GMT

Hi Nick,

Obviously it is better to organize the source data better, but since it is not a perfect world ....

Maybe my suggestion will help you with your problem.

The steps I took: 1: Determine where a new name (could) starts and calculate a potential namegroup index

2; group the table on the namegroup and concatenate the name-strings

3: check against an existing table if the names are valid

In the script it looks like:

Temp1:

Load * inline

[x, NameString

1, Entre-

2, os-Rios

3, Albergavia-

4, a-Velha

5, Albergavia-

6, a-Novo

7, Riba-

8, mandego

9, Puebla de

10, Sanabria

11,

12, ert2

13, 1234

14, Vale

15, de Canba

16, Varzea

17, Cova

18, PORTO

19, FARO

20, Moledo do Minho];

// Start a new namegroup for all records that start with an Uppercase character.

// this could be any condition of course

Temp2:

Load *

, NameString & spacing as NameString2;

Load *

, if (Upper(Left(NameString,1)) = Left(NameString,1)

, rangesum(1,peek('namegroup'))

, peek('namegroup') ) as namegroup

, if (right(NameString,1)='-'

, ''

, ' ') as spacing

RESIDENT Temp1

where IsText(NameString);

// group and concat the string

Temp3:

Load namegroup

, Concat(NameString2) as newName

resident Temp2

group by namegroup;

// capitalize if the second character is in uppercase.

// check against another table (if applicable)

Temp4:

Load *

, exists(Name, TheName) as nameFound;

Load namegroup

, if (Upper(Mid(newName,2,1)) = Mid(newName,2,1)

, Capitalize(newName)

, newName) as TheName

resident Temp3;

Good Luck,

Regards, Jan

Data cleansing on existing table

Sat, 05 Dec 2009 19:41:43 GMT

Very good solution. This will help!

Thank you very much!

Regards, Nick