Blackban88s
New member
- Joined
- Dec 31, 2015
- Messages
- 4
- Reaction score
- 0
- Points
- 0
Hello all, I need to perform the Power Query equivalent of an Index Match to identify partial text (substring) within Description column of a source table. For clarity, what I’m doing is essentially labeling thousands of rows of combined .csv bank statement transaction files with a categorization based on either a partial vendor name in the Description column or an account number and a vendor specific remittance text or number substring. Here is an example:
Table1 (Source)
Table2 (lookup table)
*null for the account number means it doesn't matter which account the the substring is found in.
Repeated splitting (parsing) functions to get a clean description are not an option here as there are simply too many variations in the transaction descriptions across multiple accounts and banks to apply effectively. In the past when faced with a similar problem I've just built very long if/then statements in M which gets very cumbersome after about 100 different sub-strings. I’ve found a few related threads here and some tangentially helpful blog posts using Text.Contains, Text.Combine, and List.Select but I’m not able to pull together a solution on my own. I’ve read Ken’s book – M is for Data Monkey - and even though I’m not intimidated by M code anymore, I can’t yet write my own functions and am easily tripped up by passing the wrong data type to a M function. Can anyone help or at least point me to a related thread?
Table1 (Source)
[Account] | [Description] | |
12345 | AMAZON#57982 | |
12345 | 908745-2017 | |
12345 | AMAZONMKTPLACE#12345 | |
12345 | EXXONMOBIL-12345 | |
12345 | SHELL OIL789 | |
78910 | SHELLOIL#12345 | |
78910 | EUREST STARBUC16 | |
78910 | STARBUCKS #12345 | |
78910 |
|
Table2 (lookup table)
[Account] | [Sub-string] | [Category] |
null* | AMAZON | A |
null | EXXON | B |
null | SHELL | B |
null | STARBUCKS | C |
78910 | 908745 | D |
78910 | 908745 | F |
*null for the account number means it doesn't matter which account the the substring is found in.
Repeated splitting (parsing) functions to get a clean description are not an option here as there are simply too many variations in the transaction descriptions across multiple accounts and banks to apply effectively. In the past when faced with a similar problem I've just built very long if/then statements in M which gets very cumbersome after about 100 different sub-strings. I’ve found a few related threads here and some tangentially helpful blog posts using Text.Contains, Text.Combine, and List.Select but I’m not able to pull together a solution on my own. I’ve read Ken’s book – M is for Data Monkey - and even though I’m not intimidated by M code anymore, I can’t yet write my own functions and am easily tripped up by passing the wrong data type to a M function. Can anyone help or at least point me to a related thread?