Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I am working on new analysis where I have to check on trend analysis of new words popping up in social media. Presently i am running ngrams on the data per day and next to it assigning the date, when the data was downloaded. Now i want to create table or chart which can show only the words and frequency of the words which were not specified till the date.
E.g: if the data is as below with words and frequencies along with the date
batteries 10 01/01/2017
cover 8 01/01/2017
size 7 01/01/2017
batteries 15 02/01/2017
cover 9 02/01/2017
size 6 02/01/2017
batteries 11 03/01/2017
cover 7 03/01/2017
size 8 03/01/2017
batteries 14 04/01/2017
cover 8 04/01/2017
size 9 04/01/2017
headphones 3 04/01/2017
here it should show headphones as the newly popped word. It will be great if anyone can help me in this.
Thanks in advance
Regards,
Punitha
May be as below:
Thank you Thirumala this is working fine. This was of great help.
If you think you got the solution please close the thread accordingly, If not please let us know.
Hi,
one possible solution could be:
tabWords:
LOAD * INLINE [
Word, Frequency, Date
batteries, 10, 01/01/2017
cover, 8, 01/01/2017
size, 7, 01/01/2017
batteries, 15, 02/01/2017
cover, 9, 02/01/2017
size, 6, 02/01/2017
batteries, 11, 03/01/2017
cover, 7, 03/01/2017
size, 8, 03/01/2017
batteries, 14, 04/01/2017
cover, 8, 04/01/2017
size, 9, 04/01/2017
headphones, 3, 04/01/2017
];
Join
LOAD Word,
Date(Min(Date)) as [Date of first appearance]
Resident tabWords
Group By Word;
Join
LOAD Distinct
Date,
[Date of first appearance],
-(Date = [Date of first appearance]) as [first appearance]
Resident tabWords;
tabCalendar:
LOAD *,
Day([Date of first appearance]) as [Day of first appearance],
WeekDay([Date of first appearance]) as [WeekDay of first appearance],
Week([Date of first appearance]) as [Week of first appearance],
WeekName([Date of first appearance]) as [WeekName of first appearance],
Month([Date of first appearance]) as [Month of first appearance],
MonthName([Date of first appearance]) as [MonthName of first appearance],
Dual('Q'&Ceil(Month([Date of first appearance])/3),Ceil(Month([Date of first appearance])/3)) as [Quarter of first appearance],
QuarterName([Date of first appearance]) as [QuarterName of first appearance],
Year([Date of first appearance]) as [Year of first appearance],
WeekYear([Date of first appearance]) as [WeekYear of first appearance];
LOAD Date(MinDate+IterNo()-1) as [Date of first appearance]
While MinDate+IterNo()-1 <= MaxDate;
LOAD Min([Date of first appearance]) as MinDate,
Max([Date of first appearance]) as MaxDate
Resident tabWords;
hope this helps
regards
Marco
Hi Tirumala,
Thank you, but the words in the list gives all the keywords throughout from the first date which occurs only once. But I want to check the latest words which have never occurred till date referring the same eg:
batteries 10 01/01/2017
cover 8 01/01/2017
size 7 01/01/2017
batteries 15 02/01/2017
cover 9 02/01/2017
size 6 02/01/2017
batteries 11 03/01/2017
cover 7 03/01/2017
size 8 03/01/2017
batteries 14 04/01/2017
cover 8 04/01/2017
size 9 04/01/2017
headphones 3 04/01/2017
batteries 12 05/01/2017
charger 3 05/01/2017
here the answer should be charger not the headphones, as I want to check which are the newly words popping up everyday.
Take a look at the attached file and hope this is what you are looking for: