Skip to main content
Announcements
Introducing Qlik Answers: A plug-and-play, Generative AI powered RAG solution. READ ALL ABOUT IT!
cancel
Showing results for 
Search instead for 
Did you mean: 
dennisthemenace
Contributor III
Contributor III

Categorizing data in script

Dear all,

my aim is to load a table with problem issues and add a column to categorize each row/data with the help of certain criterias if it's relevant or not.

To make it easier to understand, here is an example of given data:

table_Data:

problem_nr;

problem_description;

problem_short;

table_Criterias:

problem_description;

problem_short;

relevance;

table_Final:

problem_nr;

...

relevance;

The problem is, that the colums problem_short / problem_description are not 100 % identical. The table criteras includes only key words, where the table_data includes more complex descriptions. So, there is no applyMap possible. Any idea, how to solve it the easiest way?

Thanks and kind regards!

 

1 Solution

Accepted Solutions
GaryGiles
Specialist
Specialist

Okay, as long as the problem_description and problem_short fields from table_criteria contain one word per row, we can do something like this:

After loading table_Data and table_Criteria . . .

Join (table_Data)

Load problem_description as problem_description_criteria,

        problem_short as problem_short_criteria,

       relevance

Resident table_Criteria_Detail;

 

table_Data_temp:

Load problem_nr,

         problem_description,

         problem_short,

         relevance

Resident table_Data

Where wildmatch(problem_short,'*'&problem_short_criteria&'*')

       and wildmatch(problem_description,'*'&problem_description_criteria&'*');

 

Drop tables table_Data;

Rename table table_Data_temp to table_Data;

 

View solution in original post

16 Replies
GaryGiles
Specialist
Specialist

For table_Criteria, what is the format of the data in problem_short / problem_description?  And are the rules for matching.  Is it, "if any word in problem_short or problem_description from table_Criteria matches with a row from table_data, include it or is the rows with the most matches?

dennisthemenace
Contributor III
Contributor III
Author

Hello Gary,

one field example for table_criteria:

problem_description      problem_short      relevance

defect                                    Door                           relevant

field example for table_data:

problem_nr                        problem_description              problem_short      

123                                         Both doors are defect              Door

 

Now, IF problem_Description from table_data INCLUDES (wildmatch) ANY problem_description from table_criteria AND problem_short from table_data INCLUDES (wildmatch) ANY problem_Short from table_criteria THEN match with relevance ("relevant").

 

Hope, I could explain it...             

GaryGiles
Specialist
Specialist

Okay, as long as the problem_description and problem_short fields from table_criteria contain one word per row, we can do something like this:

After loading table_Data and table_Criteria . . .

Join (table_Data)

Load problem_description as problem_description_criteria,

        problem_short as problem_short_criteria,

       relevance

Resident table_Criteria_Detail;

 

table_Data_temp:

Load problem_nr,

         problem_description,

         problem_short,

         relevance

Resident table_Data

Where wildmatch(problem_short,'*'&problem_short_criteria&'*')

       and wildmatch(problem_description,'*'&problem_description_criteria&'*');

 

Drop tables table_Data;

Rename table table_Data_temp to table_Data;

 

dennisthemenace
Contributor III
Contributor III
Author

Thanks for your quick response, Gary! 

In general, it seems to be working, but its recognizing/comparing only the problem_short column in the wildmatch. 

So, the output are all problem_nr where problem_short = problem_short_Criteria, but without problem_description = problem_description_criteria

 

I've found the problem: the rule is not recognizing the problem_description, which stands in the SAME line as problem_short .. which was my fault in rule description... but actually the rule should recognize the combination of problem_short_criteria and problem_description_criteria (being in the same line)

GaryGiles
Specialist
Specialist

They are created and compared the exact same way in the load script, so I'm not sure why it wouldn't work.  Did it create the correct outcome for the example data you posted earlier?

dennisthemenace
Contributor III
Contributor III
Author

Hi Gary,

as I wrote in red, its comparing ANY problem_short with ANY problem_description,

but I would like it like Any problem_short with problem_description from same row number.

So, the combination is important.

GaryGiles
Specialist
Specialist

I still don't follow what you are asking.

Where wildmatch(problem_short,'*'&problem_short_criteria&'*')

       and wildmatch(problem_description,'*'&problem_description_criteria&'*');

will give you rows where both problem_short and problem_description criteria.

If you posted a bigger sample data set, it might help.

dennisthemenace
Contributor III
Contributor III
Author

It's working now fine. Of course its not working as long as the titles in tables are not the same.

 

Thanks again!

GaryGiles
Specialist
Specialist

Glad it is working.  I was working on a response to your deleted post.  

One thing I did notice is that you need add a check for the length of problem_short_criteria and problem_description_criteria > 0.  A empty field can affect the results of a wildmatch.

Where wildmatch(problem_short,'*'&problem_short_criteria&'*') and len(problem_short_criteria) >0
and wildmatch(problem_description,'*'&problem_description_criteria&'*') and len(problem_description_criteria) >0;