I am new to Talend; I have been using it for about a week, so I would appreciate any help.
How can I get the unique values from several columns without using a database? I am on version 6.4.
I also need to get all of the columns' data into one column, with each value appearing only once across all of the columns, and then post that data.
Welcome to Talend! Can I ask you for a bit more information? Maybe an example of your input data and an example of how you want it output. That should give enough info to help you out.
My input data is GitHub repository topics, and the output should be the unique GitHub topics across all of the repositories, with no repeats.
I'm afraid I will need actual examples of the data. The more information you give about a problem, the better the chance of a response. If people have to search for examples of your data, they tend not to be so willing to respond.
Be sure not to supply anything sensitive. Replace any sensitive or private data with random data.
I am attaching the Excel file for details.
These details are obtained from GitHub, using the GitHub APIs to get the repo names and their topics.
Some of the topics are repeated, and I need to get all the unique topics in a single column, with no repeats.
I also need to post those unique values to another website.
In the Excel file you can see that I have the columns topics 0, topics 1, and topics 2. I need these three columns combined into one column that holds only unique values.
Note: I am doing this for 1,000 to 10,000 records; in the Excel file I have given only around 30 to 40 records.
The flow is:
tLoop ──> tJava ──> tRESTClient ──> tExtractJSONFields ──> tUnite ──> tMap ──> tLogRow
In this flow, tLoop and tJava handle GitHub's pagination so I can fetch more than 100 records.
tRESTClient is used with the GET method to fetch all the repo names and their topics.
tExtractJSONFields extracts only the required fields.
tUnite merges all the paged batches (more than 100 records) into a single output.
tMap maps the data.
tLogRow prints the output.
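For context, the pagination that tLoop and tJava implement can be sketched in plain Java. This is only an illustration, not the job's actual code: `fetchPage` is a hypothetical stand-in for the tRESTClient call (a real job would GET something like `https://api.github.com/orgs/<org>/repos?per_page=100&page=<n>`), and the data is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PaginationSketch {
    // Hypothetical stand-in for the tRESTClient GET; returns one page of
    // repo names, or an empty list once GitHub runs out of pages.
    static List<String> fetchPage(int page) {
        if (page == 1) return Arrays.asList("repo-a", "repo-b");
        if (page == 2) return Arrays.asList("repo-c");
        return new ArrayList<>();
    }

    // Keep requesting pages until an empty result comes back, mirroring
    // what tLoop plus a page counter in tJava do.
    static List<String> collectAll() {
        List<String> all = new ArrayList<>();
        int page = 1;
        List<String> batch;
        while (!(batch = fetchPage(page)).isEmpty()) {
            all.addAll(batch);
            page++;
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println(collectAll()); // all records across all pages
    }
}
```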
Take a look at the last solution on this thread.
First you need to combine the columns. You can do this with a tMap: link all of the columns together using simple Java String concatenation, adding a separator such as a semicolon. Something like this:
row1.column1 + ";" + row1.column2 + ";" + row1.column3
You can then use the code in the thread I linked to above to remove duplicates.
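As a rough illustration of that dedup step, here is a minimal plain-Java sketch (not the linked thread's code) that takes a semicolon-joined string, such as the tMap expression above produces, and keeps only the first occurrence of each value. The sample data is made up.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class DedupSketch {
    // Split a semicolon-joined string, drop repeats while preserving the
    // order of first appearance, and join the survivors back together.
    static String dedupe(String joined) {
        Set<String> unique = new LinkedHashSet<>(Arrays.asList(joined.split(";")));
        return String.join(";", unique);
    }

    public static void main(String[] args) {
        System.out.println(dedupe("api;cli;api;web;cli")); // api;cli;web
    }
}
```

A LinkedHashSet is used rather than a HashSet so the output order matches the input order, which makes the result easier to compare against the source columns.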
Actually, I need all the columns' data in one column, which means not concatenation:
the values should come one after the other. For example, the input is as shown in the link @captureJPG, and the output should look like the link @output1.
Then I want to get the distinct values of that single column, AllTags, from output1.
Also, could you be more specific about how to use the distinct code you provided in the previous link? As I am new, I don't know where to write or copy the code.
I don't know exactly what you want, but I will try to solve your question.
Try this:
I want to concatenate my columns, but in a different way: I want all the columns' data in a single column, i.e. one column's data after another's.
As in the output1 JPG image, where I have marked the start and end of each column's data: if column 1 has 10 values and column 2 has 20 values, combined together I want to get 30 values.
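That requirement, stacking the columns vertically rather than concatenating them row by row, can be sketched in plain Java. This is an illustration only, with made-up topic data; the names (`stackUnique`, `topics0`, ...) are not from the job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StackColumnsSketch {
    // Append the values of each column one after another (all of column
    // 1's rows first, then column 2's, ...) and keep only the first
    // occurrence of each topic - the single "AllTags" column described above.
    static List<String> stackUnique(List<List<String>> columns) {
        Set<String> allTags = new LinkedHashSet<>();
        for (List<String> column : columns) {
            allTags.addAll(column);
        }
        return new ArrayList<>(allTags);
    }

    public static void main(String[] args) {
        List<String> topics0 = Arrays.asList("java", "api");
        List<String> topics1 = Arrays.asList("api", "etl");
        List<String> topics2 = Arrays.asList("etl", "rest");
        System.out.println(stackUnique(Arrays.asList(topics0, topics1, topics2)));
        // [java, api, etl, rest]
    }
}
```

Note that if the columns held 10 and 20 values with no overlap you would get 30 values out, but any topic shared between columns is emitted only once.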
Also, in a previous message you gave some Python code; can you tell me how to implement it?
This is why I asked for the data and for a good explanation of the problem. The data needs to be in a format that people can use; screenshots are not good enough. If you had given your input data in a usable format, along with your desired output, I could have helped you a while ago. As it is, I will need to think about your requirement to try to understand it, since you have only given clues about what is not a normal requirement. I will try to look at this later, as I have a lot I need to work on.