Extract data from 1 column in txt with Power Query

Zigfreud

New member
Joined
May 24, 2022
Messages
6
Reaction score
0
Points
1
Excel Version(s)
Office 365
I have some files from SAP system. They are exported in .txt. They have the same structure. The idea is to arrange the data into a more user friendly way using Power Query, especially since there are many such files weekly, and we are unable to report based on txt file.


After I have done some cleaning, the data looks like this: PQ image


And have no clue how to continue, to extract the data into a different column, like Vendor Number, Vendor Name, Document, Amount etc..

I have attached an example of the source file.
 

Attachments

  • example.txt
    10 KB · Views: 7
Last edited by a moderator:

pinarello

Member
Joined
Jun 21, 2019
Messages
191
Reaction score
2
Points
18
Location
Germany
Excel Version(s)
Office 365
It is probably not a big problem to split the vertically arranged vendor blocks into individual columns, but in the end there will be too many columns to work well with.

So it would be useful if you define which information you want to see at the end.
 

Zigfreud

New member
Joined
May 24, 2022
Messages
6
Reaction score
0
Points
1
Excel Version(s)
Office 365
These are the columns that I need at the end - all information is already contained in the file extracted.

Vendor IDVendor NameCompany CodeDoc NumberGross AmountDeductedNet Amount PaidDoc TypeDoc DateBline DatePay TermsP KeyCurrencyReference
 

pinarello

Member
Joined
Jun 21, 2019
Messages
191
Reaction score
2
Points
18
Location
Germany
Excel Version(s)
Office 365
At least in your test folder, the amounts were not positinonized consistently. Otherwise, it was actually quite simple.

Now you have to test with your data, if everything fits. Maybe you have to make a few adjustments.

Addition:
I see now that you need more columns after "Deducted". This was not visible before due to advertising. Contact me if you can't manage to add the missing columns by yourself.
 

Attachments

  • xlguru - Extract data from 1 column in txt with Power Query (PQ).xlsx
    19.9 KB · Views: 2
Last edited:

pinarello

Member
Joined
Jun 21, 2019
Messages
191
Reaction score
2
Points
18
Location
Germany
Excel Version(s)
Office 365
Here now with all needed columns. You just have to check if I interpreted the columns correctly.
 

Attachments

  • xlguru - Extract data from 1 column in txt with Power Query (PQ).xlsx
    21.4 KB · Views: 2

Zigfreud

New member
Joined
May 24, 2022
Messages
6
Reaction score
0
Points
1
Excel Version(s)
Office 365
OMG, it is perfect! Only on Column vendor name (B) we should have just vendor name, but I can see and fix that. Can you tell me how you managed to do it?
Asking because I am unable to see all the information in the file you shared - first I got some weird Error about Permissions, managed to pass that, and now i got stuck here, because it does not find "p_file":

I tried changing the source to where my file is located, clicked on Refresh but still can't get past this:
source_error.png

If you have the time, it would help to explain a bit the steps. So that I could learn the logic. Really appreciate you taking the time to do this.
 
Last edited:

Zigfreud

New member
Joined
May 24, 2022
Messages
6
Reaction score
0
Points
1
Excel Version(s)
Office 365
Privacy.pngcompatibility.png

these were the encountered errors
 

pinarello

Member
Joined
Jun 21, 2019
Messages
191
Reaction score
2
Points
18
Location
Germany
Excel Version(s)
Office 365
In the worksheet "Query1" I have assigned the name "p_file" to cell C3 and entered the path and the name of the source file in this cell.

The process is actually simple. First I created an index to find the vendor rows in the next step, which I then filled in down. Now next, because so confusing, I filtered out rows that are not needed. Next, with the supplier row filled in down, I grouped them and inserted an index into the grouping, which is then expanded.

Now (from step Col_Vendor) the vendor number and vendor name are determined and filled in downwards. With the step "Filter_relevant" now all lines are filtered, whose grouping index is larger 7, since the relevant documents begin with each vendor from line 8.

Thus, now only document lines are left that already contain supplier number and supplier name. Unnecessary at this point, but as it becomes clearer, columns that are no longer needed are deleted.

Since the amounts in the sample file are provided with different numbers of blanks, these are reduced with the next steps, in order to receive then with the step "Split_Column", the needed columns.

With the last step "Remove_other_Cols" the needed columns are marked in the final order and the rest is deleted.

I have now taken this opportunity to delete some steps from the query that were not necessary. Therefore I upload the current folder again.

Compared to importing websites in html format, this was really just kid's stuff.

About the messages: With the privacy screen, you can select "Ignore ..." And the next message only says that my Power Query is newer than yours. But this is only problematic if I accidentally used functions that your Power Query doesn't know yet.
 

Attachments

  • xlguru - Extract data from 1 column in txt with Power Query (PQ).xlsx
    22.1 KB · Views: 3

p45cal

Super Moderator
Staff member
Joined
Dec 16, 2012
Messages
2,170
Reaction score
12
Points
38
Excel Version(s)
365
I have some files from SAP system.
Surely you can export from SAP in more friendly ways! These files look like they're from the spool queue to a printer (.prn files?) [a dot matrix printer at that].
 

pinarello

Member
Joined
Jun 21, 2019
Messages
191
Reaction score
2
Points
18
Location
Germany
Excel Version(s)
Office 365
There are many ways in SAP to get the information you need quickly and easily.

For example, there is the Quickviewer, with which users can generate selected areas. For fear of inexperienced users constantly starting very computationally intensive queries, the necessary authorization is not readily granted. Even in a well-organized BW, all data is actually available and can be retrieved quickly. A with the vast number of queries, it is not easy to find the right one. Also, there are many old reports whose output is hardly suitable for Excel. With realtive little effort in these, optionally, an alternative output could be programmed. But until this little thing (I know what I am writing about) is programmed and transported into the productive system, it usually takes months. And because of the excessive overhead (Change Request > Offer > Order > Testworkbench > Maintenance cycle - every step with a lot of signatures), even the little things are so expensive internally that they are often not approved by superiors.

In this respect, I can well understand that many users try, from today's point of view, to somehow process idiotic SAP lists in Excel, just to be able to do their tasks better.

Until about 20 years ago, I could use my creativity to make programs better and more efficient. Then it started that I had to use larger and larger parts of my creativity to outsmart the senseless bureaucratic overhead.

The only good thing about this bullshit was that it made it easier for me to leave active professional life.
 
Top