Skip to main content
Announcements
Global Transformation Awards! Applications are now open. Submit Entry
cancel
Showing results for 
Search instead for 
Did you mean: 
Yulia_Ut
Contributor II
Contributor II

Pseudography in text file. How to load data and get a table?

Hi all!

Is anybody know how to get data if the origin file has a pseudography? An example is below (if you'll copy-paste in new .txt file you'll get the origin file).

-----------------------------------------
|Data statistic               |   Number|
|---------------------------------------|
|Trans records                |      311|
|Amount                       |        1|
-----------------------------------------
25.01.2019                                    vichicules Consumption                                       2
--------------------------------------------------------------------------------------------------------------
vichicules Consumption
--------------------------------------------------------------------------------------------------------------
| Document  |Date      |Text                     |TMv|Mat       |Number        |Tp |CC   |    Sum        |Val|
|------------------------------------------------------------------------------------------------------------|
| 1234567899|24.04.2018|ABC DEF Z1234567891234567|111|WR12345678|            4 |pc.|AA001|         49,68 |USD|
| 1234567899|24.04.2018|ABC DEF Z1234567891234567|111|WR87654321|            3 |pc.|AA001|          9,09 |USD|
| 1234567899|24.04.2018|ABC DEF Z1234567891234567|111|WR12983476|            4 |pc.|AA001|         65,80 |USD|
--------------------------------------------------------------------------------------------------------------
25.01.2019                                    vichicules Consumption                                       3
--------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------
| Document  |Date      |Text                     |TMv|Mat       |    Number    |Tp |CC   |    Sum        |Val|
|------------------------------------------------------------------------------------------------------------|
| 4908708567|24.04.2018|ABC DEF Z1234567891234567|201|DT12345678|            7 |pc.|AA001|         22,54 |USD|
| 4908708567|24.04.2018|ABC DEF Z1234567891234567|201|DT12345678|           10 |pc.|AA002|         44,20 |USD|
| 4908708567|24.04.2018|ABC DEF Z1234567891234567|201|DT12345678|           32 |pc.|A0011|         25,60 |USD|
| 4908708567|24.04.2018|ABC DEF Z1234567891234567|201|DT12345678|            2 |pc.|AA002|          4,28 |USD|
|------------------------------------------------------------------------------------------------------------|
|*          |          |                         |   |          |   17 794,939 |kg |     |        890,86 |USD|
|           |          |                         |   |          |   16 369,675 |l  |     |               |   |
|           |          |                         |   |          |    5 522,120 |m  |     |               |   |
|           |          |                         |   |          |2 253 092,010 |pc.|     |               |   |
--------------------------------------------------------------------------------------------------------------

As a result I should have data like this:

Document Date     Text                     TMvMat       Number   TpCCSum       Val
123456789924.04.2018ABC DEF Z1234567891234567111WR123456784pc.AA00149,68USD
123456789924.04.2018ABC DEF Z1234567891234567111WR876543213pc.AA0019,09USD

 

How we can define 11th string as the name of the columns? And how to combine the data if it's separated by several strings? More over last 6 strings  is a "garbage" and our final table mustn't contain it.

P.S. this is SAP extraction which could contain more than a million strings.

Labels (1)
1 Solution

Accepted Solutions
marcus_sommer

If you just need this extract from the file you could use something like this:

LOAD
trim(@2) as [Document],
@3 as [Date],
@4 as [Text],
@5 as TMv,
@6 as [Mat],
@7 as [Number],
@8 as [Tp],
@9 as [CC],
@10 as [Sum],
@11 as Val
FROM
[Pseudography-in-text-file-How-to-load-data-and-get-a-table.txt]
(txt, codepage is 1252, no labels, delimiter is '|', header is 10 lines) where isnum(trim(@2));

which is a bit more generic then the filter-suggestion from Somasundaram whereby those filter mustn't be mandatory static else be defined with conditions (such approach could become useful if there are more/further things to be transformed).

But I suggest to try to get a real table with structured data instead of to pop out the useful data from a big amount of garbage. Especially if there are really millions of such records from SAP the transformations might take some times and applying of any incremental approaches could be quite difficult (also the efforts from SAP to generate such reports should be more heavier as exporting real tables).

- Marcus

View solution in original post

5 Replies
Somasundaram
Creator III
Creator III

Hi,

we can able to do this using transformation, while loading but we need to do manual things.

 

Please try this code 


LOAD Document,
Date,
Text,
TMv,
Mat,
Number,
Tp,
CC,
Sum,
Val,
F11
FROM
[C:\Users\username\Desktop\test_Data.txt]
(txt, utf8, embedded labels, delimiter is '|', msq, header is 10 lines, filters(
Remove(Col, Pos(Top, 1)),
Remove(Row, Pos(Top, 2)),
Remove(Row, Pos(Top, 5)),
Remove(Row, Pos(Top, 5)),
Remove(Row, Pos(Top, 5)),
Remove(Row, Pos(Top, 5)),
Remove(Row, Pos(Top, 5)),
Remove(Row, Pos(Top, 5)),
Remove(Row, Pos(Top, 9)),
Remove(Row, Pos(Top, 9)),
Remove(Row, Pos(Top, 9)),
Remove(Row, Pos(Top, 11)),
Remove(Row, Pos(Top, 10)),
Remove(Row, Pos(Top, 9))
));


-Somasundaram

If this resolves your Query please like and accept this as an answer.
marcus_sommer

If you just need this extract from the file you could use something like this:

LOAD
trim(@2) as [Document],
@3 as [Date],
@4 as [Text],
@5 as TMv,
@6 as [Mat],
@7 as [Number],
@8 as [Tp],
@9 as [CC],
@10 as [Sum],
@11 as Val
FROM
[Pseudography-in-text-file-How-to-load-data-and-get-a-table.txt]
(txt, codepage is 1252, no labels, delimiter is '|', header is 10 lines) where isnum(trim(@2));

which is a bit more generic then the filter-suggestion from Somasundaram whereby those filter mustn't be mandatory static else be defined with conditions (such approach could become useful if there are more/further things to be transformed).

But I suggest to try to get a real table with structured data instead of to pop out the useful data from a big amount of garbage. Especially if there are really millions of such records from SAP the transformations might take some times and applying of any incremental approaches could be quite difficult (also the efforts from SAP to generate such reports should be more heavier as exporting real tables).

- Marcus

Yulia_Ut
Contributor II
Contributor II
Author

Thanks for answer, @Somasundaram . For my example it works perfectly. But for SAP extractions I should think about  more general solution.

And in your code,  what does the field  "F11" mean?

Yulia_Ut
Contributor II
Contributor II
Author

@marcus_sommer, thanks for your brilliant idea and solution!!Heart

 I think for further action with data manipulation I'll  use .qvd as an intermediate step. E.g.:

.......

store ex into example.qvd;

DROP table ex;

LOAD Document,
Date,
Left(Right(Text,21),3) as Shop,
Left(Right(Text,25),3) as NFC\DT,
Right(Text,17) as VIN,

TMv,
Mat,
Number,
Tp,
CC,
Sum,
Val
FROM
example.qvd
(qvd);

 

Somasundaram
Creator III
Creator III

Hi,    Yulia_Ut

 

Need to remove the columns "F11".

 


-Somasundaram

If this resolves your Query please like and accept this as an answer.