Data Clean Up Help - Concatenate rows (inconsistent pattern)

ehrgeiz

New member
Joined
Jul 7, 2017
Messages
5
Reaction score
0
Points
0
First time poster--please bear with me.

I get these reports that are made for PDFs and so they are not tabular. In excel, I managed to clean up the data using many different steps which include complex IF statements, then pasting the values in a different worksheet, sorting, deleting blanks etc. Example is in the highlighted orange under the "Data" Worksheet. There's several other steps after this and it's time consuming.

I'd like to convert them to a tabular format using Power Query but I'm not sure how to tackle this. I would love some guidance. Attached is an example of the report that's been scrubbed a bit.

The one tricky part I haven't been able to replicate in Power Query would be in Column C, some times the data is on a single row but sometimes it spans into the next room (i.e. Cell C37 and C38) and I can't think of easy logic in Power Query to concatenate them. In my old way, I've been using the IF statements to concatenate by basing whether data exists in a different column.

I know about the solution to stacked data and using modulo but I don't think I can use that for this case.

Another option I thought was maybe duplicating column C and then deleting a single cell so that it's offsetting. Then that way, I can use some sort of IF statement within Power Query to possibly concatenate the lines BUT I don't know if it's possible to just delete a single cell and shift up that data in one specific column within Power Query.

I'm hoping someone can open my eyes to some other things that can be done. I'm approaching this from one angle but I think I may just need someone to steer me in a different direction for cleaning up this data.

Thank you!!
 

Attachments

  • Example.XLSX
    84.3 KB · Views: 9
Not sure I am following, you have names in the start date column, why would they be there, should they be there? What is the logic to determine if two rows should be merged.

Show some examples of what you have, what you want to achieve.
 
I'm with Bob on this? The data does seem a bit "scattered"?

I can supply you with a typical way to get the value offsetted one down (similar to you describing deleting a cell to shift the columns rows up one).
See the M-Code below. The line called: #"Get Offset 1 Down" is the key line.

Hopefully you can make something of the code below...

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type any}, {"Column2", type text}, {"Column3", type text}, {"Column4", type any}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type any}, {"Column9", type text}, {"Column10", type any}, {"Column11", type text}, {"Column12", type text}, {"Column13", type text}, {"Column14", type any}, {"Column15", type date}, {"Column16", type text}, {"Column17", type any}, {"Column18", type text}, {"Column19", type text}, {"Column20", type text}, {"Column21", type text}, {"Column22", type any}, {"Column23", type text}, {"Column24", type any}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column1", "Column4", "Column8", "Column10", "Column12", "Column13", "Column16", "Column18", "Column19", "Column20", "Column21", "Column23", "Column9", "Column5", "Column6"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column2", "Customer Name"}, {"Column3", "Description"}, {"Column7", "Start Date"}, {"Column14", "Transaction Amount"}, {"Column15", "Transaction Date"}, {"Column11", "Expiry Date"}, {"Column17", "Award Amount"}, {"Column24", "Award Balance"}, {"Column22", "Amount Used"}}),
    #"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 0, 1),
    #"Get Offset 1 Down" = Table.AddColumn(#"Added Index", "Custom", each #"Added Index"{[Index] + 1}[Description]),
    #"Remove Index" = Table.RemoveColumns(#"Get Offset 1 Down",{"Index"}),
    #"Merged Columns" = Table.CombineColumns(#"Remove Index",{"Description", "Custom"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Description"),
    #"Reordered Columns" = Table.ReorderColumns(#"Merged Columns",{"Customer Name", "Description", "Start Date", "Expiry Date", "Transaction Amount", "Transaction Date", "Award Amount", "Amount Used", "Award Balance"})
in
    #"Reordered Columns"
 
Sorry, try this file. The different letters in column B on the Data tab represent different people's names and should be unique. I've included what would be the end goal in the last tab "Final Product".

Yes, the data is very scattered which is why I'm hoping Power Query can help reduce the effort to clean it up to a table format.
 

Attachments

  • Example.XLSX
    105.9 KB · Views: 12
This is the closest I can get to your "Final Result". (It was really just a slog through the process with no real direction!! I was just trying to match your final result, so the code is terrible. Please note that I still find no real "stable" logical patterns in this data and although this output is close to the mark, I cannot guarantee that it will work for the set of data.

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type any}, {"Column2", type text}, {"Column3", type text}, {"Column4", type any}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type any}, {"Column9", type text}, {"Column10", type any}, {"Column11", type text}, {"Column12", type text}, {"Column13", type text}, {"Column14", type any}, {"Column15", type date}, {"Column16", type text}, {"Column17", type any}, {"Column18", type text}, {"Column19", type text}, {"Column20", type text}, {"Column21", type text}, {"Column22", type any}, {"Column23", type text}, {"Column24", type any}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column1", "Column4", "Column8", "Column10", "Column12", "Column13", "Column16", "Column18", "Column19", "Column20", "Column21", "Column23", "Column9", "Column5", "Column6"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column2", "Customer Name"}, {"Column3", "Description"}, {"Column7", "Start Date"}, {"Column14", "Transaction Amount"}, {"Column15", "Transaction Date"}, {"Column11", "Expiry Date"}, {"Column17", "Award Amount"}, {"Column24", "Award Balance"}, {"Column22", "Amount Used"}}),
    #"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 0, 1),
    #"Get Offset 1 Down" = Table.AddColumn(#"Added Index", "Desc+1", each #"Added Index"{[Index] + 1}[Description]),
    #"Removed Errors" = Table.RemoveRowsWithErrors(#"Get Offset 1 Down", {"Desc+1"}),
    #"Removed Columns1" = Table.RemoveColumns(#"Removed Errors",{"Index"}),
    #"Merged Columns" = Table.CombineColumns(#"Removed Columns1",{"Description", "Desc+1"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Description"),
    #"Reordered Columns" = Table.ReorderColumns(#"Merged Columns",{"Customer Name", "Description", "Start Date", "Expiry Date", "Transaction Date", "Transaction Amount", "Award Amount", "Amount Used", "Award Balance"}),
    #"Filled Down" = Table.FillDown(#"Reordered Columns",{"Customer Name"}),
    #"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Start Date] <> null)),
    #"Changed Type1" = Table.TransformColumnTypes(#"Filtered Rows",{{"Start Date", type date}}),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type1", {{"Start Date", null}}),
    #"Filled Down1" = Table.FillDown(#"Replaced Errors",{"Award Amount", "Start Date"}),
    #"Filtered Rows2" = Table.SelectRows(#"Filled Down1", each ([Award Amount] <> "Amount Granted")),
    #"Added Conditional Column" = Table.AddColumn(#"Filtered Rows2", "Delete Row", each if [Expiry Date] = null and [Transaction Amount] = null then "1" else null ),
    #"Filtered Rows1" = Table.SelectRows(#"Added Conditional Column", each ([Delete Row] = null)),
    #"Removed Columns2" = Table.RemoveColumns(#"Filtered Rows1",{"Delete Row"})
in
    #"Removed Columns2"
 
Back
Top