Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello!
I am trying to populate a sparse column based on values of the same column. I attached an example.
I have a table like the following:
Id, Before, Group
1, None, 1
2, 1, -
3, None, 2
4, 2, -
5, 4, -
6, 3, -
To fill the missing values I intend to lookup the value in Before in the Id column and receive the value of Group. For example in line 2 i have the value 1 in "Before", so it should look in line 1 (as there is "Id" 1) and return 1 for "Group".
I want to receive the following result:
Id, Before, Group
1, None, 1
2, 1, 1,
3, None, 2
4, 2, 1
5, 4, 1
6, 3, 2
I am using LEFT JOIN right now to lookup the missing groups and I understand that this will not work because the column group already exsits. But in fact I do not know how to deal with it. Can anyone help?
Maybe you are mixing up AUTONUMBER( expression [, AutonumberID] ) and AUTONUMBERHASH128( expression {, expression}).
AUTONUMBER will create a unique number per expression value, starting from 1 and incremented by 1 for each new expression value as it occurs.
But you can create several autonumber counters that are used independently from each other, in my example, I am using two counters, the default counter (without second argument) and one counter named 'Lookup'.
I am doing so because the first counter should create a number for each occurence of 'None' in Before field, the second should essentially create and index / counter for each record.
Using your advice for my small dataset works perfectly but not for my production data. I really cannot figure out, what the problem is. For some entries the group is assigned well but for some reason something weird appears. Is it possible that I'll send you my data in a personal message?
You can send me a message with a link where I can download your file.
Thanks a lot!
Right now I could not send you a message?
I used your approach to get the right group-value for each id.
But in the attached file you could see that (for some reason I could not understand) the wrong group is picked. Using the selection I made in the attatched file you see that id "scza140|1|2|22.09. 05:46" has the group-value 606 although the value before (scza140|1|1|22.09. 05:14) belongs to group 1 and for that reason they should belong to the same group.
I hope you can figure out what is going wrong cause i really do not get the point (because 16 values are correctly assigned to group 1)
Dorothea, your sample Id values are not unique (and the above approach relies on unique Ids).
For example these IDs with frequency added:
| SCZB185|26|1|26.09. 03:06 | 3 |
| SCZB152|76|1|26.09. 09:00 | 2 |
| SCZB152|8|1|25.09. 13:34 | 2 |
| SCZC185|59|1|24.09. 03:59 | 2 |
| SCZA155|48|1|23.09. 16:49 | 2 |
| SCZC185|22|1|25.09. 19:52 | 2 |
| SCZA155|63|1|22.09. 18:29 | 2 |
| SCZA152|25|1|22.09. 18:20 | 2 |
| SCZA152|19|1|24.09. 07:22 | 2 |
| SCZA155|27|1|24.09. 12:06 | 2 |
| SCZA145|46|1|22.09. 02:13 | 2 |
Is this what you expect in your data? Maybe the timestamp is just not granular enough?
If you expect this, how would you then like to do the lookup?
Another issue is that you sort your data by Time, but then Before lookup requires to lookup ahead in time:
| Id | Before | Group |
|---|---|---|
| SCZA152|51|1|23.09. 05:26 | SCZA152|51|2|23.09. 06:00 |
Yes I noticed that my Ids are not unique but in fact they should be. The Ids are not given in my raw data, they are created (and that is still one of my big problems).
I attached a sample file of my raw data which explains how "SCZB185|26|1|26.09. 03:06" could be connected trice. Be aware that the columns "Line" and "Linked Line" were added manually by me and is not in the data present.
Each line of my data set does not have a unique identifier, so I created one using a combination of LP-Nr, LPT, Nr and Arrival. Arrival or Departure are both possible, but one timestamp is need as the combination of LP-Nr, LPT and Nr could exist for each day.
Additionally each line has the columns Linked LP-Nr, Linked LPT and Linked Nr to show the following action. But as I mentioned before it is not unique without any timestamp. So the challenge is to finde the right timestamp.
To finde the right timestamp i have to distinguish between 2 cases:
1. Combination of LP-Nr & LPT is equal to combination of Linked LP-Nr & Linked LPT (they only differ concerning the Nr): In this case i use the Arrival of the action whose departure is at the same DATE as the arrival. I could not find any mistakes for those connections.
2. This case is much more difficult and does not work 100% correctly. In this case the LPT and Linked LPT are not equal or LP-Nr and Linked LP-Nr are not equal. This case is displayed in the attached file. The lines 1-3 are all linked to SCZB185|26|1 which exists trice (on 24, 25 and 26 September). In the script I connect each line to all possible lines (1-4, 1-5, 1-6, 2-4, 2-5, 2-6, 3-4, 3-5, 3-6) and compute the difference between arrival of the first action and departure of the connected action and only keep those who are >= 0 (1-5, 1-6, 2-5, 2-6, 3-5, 3-6). And then I'll take the minimum difference. But in this case for lines 1-3 the difference is minimal always for the combination with line 5 and for that reason it is not unique.
The right solution would be 1-4, 2-5, 3-6 as each connection should only be used once.
If you had any ideas for that problem i would be very thankful!
And if i understand you right: if my ids are unique, the assigning of the group should work perfectly?
I think there must be a different key to solve your issues. In your present data or in data you may need to access.
Just looking at your file: Wouldn't Abbhf and Anbhf tell you the connections needed?
| Abbhf | Anbhf |
| A | B |
| C | D |
| E | F |
| B | B |
| D | C |
| F | F |
No the combination of Abbhf and Anbhf is not unique and with that combination i will never find the subsequent action of one action (the goal is to display kind of a timetable) because for the subsequent action i only have the combination of Linked LP-Nr, Linked LPT and Linked Nr as hint.
So you do not have any idea how to solve the issue to get unique connections between my actions?