Skip to main content
Announcements
A fresh, new look for the Data Integration & Quality forums and navigation! Read more about what's changed.
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

[resolved] How to count distinct values in a lot of columns in SQL server

Hello,
I have to count distinct values in a (SQL server) table containing about 100 columns.
I know how to retrieve the schema of the table (with a TMSSQLColumnList) but i don't know how to loop on the columns names.
Can somebody help me ?
Thanks for your help.
(sorry for my bad English)
Labels (2)
1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

I think you will need to loop through the column names with a dynamic string.
You can try the following approach
tMSSQLInput_1 (with 1 column as colName) ---> main (row1) ---> tFlowToIterate ---> Iterate ---> tMSSQLInput_2 ---> t_UniqRow
In tMSSQLInput_1 use the below statement:
"SELECT column
FROM syscolumns
WHERE id=OBJECT_ID('YOUR_TABLE') "
In tMSSQLInput_2 use the below statement:
"select " + ((Integer)globalMap.get("row1.ColName")) + " from tablename"

Sorry I recently switched to Oracle so can't show you a working example.
Hope this helps.
G.

View solution in original post

4 Replies
Anonymous
Not applicable
Author

Hi,
It seems you need use SQL query to achieve that.
Please refer to the article http://stackoverflow.com/questions/16497761/how-to-count-distinct-column-in-sql-server to see if it is what you need.
Best regards
Sabrina
Anonymous
Not applicable
Author

Hello xdshi (and all of you),
Thank's for your answer, the problem I encounter is the fact that I don't manage to link the file which contains the names of the columns with the SQL request.
I would like to do a request like that :
SELECT columnname,COUNT(columnname) FROM myTable
GROUP by columnname
I want to iterate this request for each column of my table.
I think it's impossible to do it with a Taggregaterow.
I hope you understand better my problem.
Thanks.
Anonymous
Not applicable
Author

I think you will need to loop through the column names with a dynamic string.
You can try the following approach
tMSSQLInput_1 (with 1 column as colName) ---> main (row1) ---> tFlowToIterate ---> Iterate ---> tMSSQLInput_2 ---> t_UniqRow
In tMSSQLInput_1 use the below statement:
"SELECT column
FROM syscolumns
WHERE id=OBJECT_ID('YOUR_TABLE') "
In tMSSQLInput_2 use the below statement:
"select " + ((Integer)globalMap.get("row1.ColName")) + " from tablename"

Sorry I recently switched to Oracle so can't show you a working example.
Hope this helps.
G.
Anonymous
Not applicable
Author

Hello,
Thanks a lot G. for your help, it's working with your explanation.
Best regards.
TD