Mailing Address Cleanup - Query from Source

Will_Texas

New member
Joined
Dec 20, 2021
Messages
16
Reaction score
1
Points
3
Location
Houston, Texas
Excel Version(s)
365
Hello, I have a large data set with US based addresses. I was wondering what the best way would be to parse the addresses so it is:

Street City State Zip Country
Below is an example of three sample addresses, all contained in a single cell, over three rows. parsing out State, Zip and Country isn't a problem.

The issue is parsing out via delimiters (there are none) between the Street and the City. Some City names are one, or two words. Some Streets end with Pkwy, Dr, BLVD, or have no abbreviation at all. My data set is downloaded as a .CSV file type.

6402 Fairmont Pkwy Pasadena TX 77505 US
510 S Interstate 35 Round Rock TX 78681 USA
11800 Hero Way W Leander TX 78641 USA

These addresses all "map" when uploaded to mapping software (google maps, etc.) so, I'm open to suggestions - either trying to parse the data in Power Query by some type of delimeter, or connecting to a web-based data source that will return the proper address format to my excel file.

The goal here is that I can sort/filter/view by City and then take into Power BI, etc. Thanks!
 
One possible way with Power Query
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true), {"Column1.1", "Column1.2"}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Split Column by Delimiter", "Column1.1", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, true), {"Column1.1.1", "Column1.1.2"}),
    #"Split Column by Delimiter2" = Table.SplitColumn(#"Split Column by Delimiter1", "Column1.1.1", Splitter.SplitTextByEachDelimiter({"#(00A0)"}, QuoteStyle.Csv, true), {"Column1.1.1.1", "Column1.1.1.2"}),
    #"Split Column by Delimiter3" = Table.SplitColumn(#"Split Column by Delimiter2", "Column1.1.1.1", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Column1.1.1.1.1", "Column1.1.1.1.2"}),
    #"Split Column by Delimiter4" = Table.SplitColumn(#"Split Column by Delimiter3", "Column1.1.1.1.2", Splitter.SplitTextByEachDelimiter({"#(00A0)"}, QuoteStyle.Csv, true), {"Column1.1.1.1.2.1", "Column1.1.1.1.2.2"}),
    #"Renamed Columns" = Table.RenameColumns(#"Split Column by Delimiter4",{{"Column1.1.1.1.1", "Number"}, {"Column1.1.1.1.2.1", "Street"}, {"Column1.1.1.1.2.2", "City"}, {"Column1.1.1.2", "State"}, {"Column1.1.2", "Zip"}, {"Column1.2", "Country"}})
in
    #"Renamed Columns"
 
Hi,

You might consider using the Splitter.SplitTextByAnyDelimiter function to parse the Street and State fields by delimiters. For optimal results, you may need to provide a list of "City" names to ensure the function splits the text accurately.

BR,
ONG CHEE SENG

Picture_A.JPG
M:
let
    //Part 1 - Source and Declaration (Update if needed i.e. add City for delimiter)
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    CityList = {"Pasadena", "Round Rock", "Leander"},
    ColumnNames = {"Street", "City", "State Zip Country"},
    //Part 2
    CityColumn = Table.AddColumn(Source, "City",
                            (x)=>
                                let
                                    a = List.Transform(CityList, each
                                                        if Text.Contains(x[Column1], _)
                                                        then _
                                                        else null),
                                    b = Text.Combine(List.RemoveNulls(a))
                                in b, type text),
    Split_Column = Table.SplitColumn(CityColumn,
                                    "Column1",
                                    Splitter.SplitTextByAnyDelimiter(CityList, QuoteStyle.Csv),
                                    {"Street", "State Zip Country"}),
    //Part 3
    TextTrim = Table.TransformColumns(Split_Column, {}, Text.Trim),
    Reorder = Table.SelectColumns(TextTrim, ColumnNames, MissingField.Ignore)
in
    Reorder
 
Back
Top