Complex Data Cleaning

noushad1ali2000

New member
Joined
Oct 21, 2024
Messages
1
Reaction score
0
Points
1
Excel Version(s)
Excel2021
Greetings,

I am regular watching your vlog related excel, and now i need your help to clean a complex type of data extracted from my company system which i need to clean and provide them in simple form to create some insight from that.

1st sheet as sample extracted (
i have given two sets of data as sample, whereas in my co file having 25860 rows of data for almost 1185 person

2nd sheets need to be normalized

I have attached sample with dummy data as an attachment

it would be much appreciated if you could help me out

Thanks
Naushad Ali
 

Attachments

  • testDataModel.xlsx
    10 KB · Views: 5
cross posted without links:

@noushad1ali2000, for your information, you should always provide links to your cross posts.
This is a requirement, not just a request.
If you have cross posted at other places, please add links to them too.
Why? Have a read of https://excelguru.ca/a-message-to-forum-cross-posters/
 
Hi,

The below is my idea and would like to share with you. Hope its can help to solve your problem.

BR,
ONG

M:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],

    AddedNewCol = Table.AddColumn(Source, "New", each Text.Combine(List.Transform(Record.ToList(_),Text.From), ", ")),
    RemovedCols = Table.SelectColumns(AddedNewCol,{"New"}),
    x = List.Max(Table.TransformColumns(RemovedCols, {"New", each List.Count(Text.PositionOf(_, ",", Occurrence.All)), type number})[New])+1,
    SplitCols = Table.SplitColumn(RemovedCols, "New", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), List.Transform({1..x}, each "New." & Text.From(_))),
    Grouped = Table.Group(SplitCols, {"New.1"}, {{"All", each
            let
                a_Filtered = Table.SelectRows(_, each ([New.1] = "grade" or [New.1] = "ID")),
                a_Tbl = #table({"A"}, List.Transform(Table.ToColumns(a_Filtered), each {_})),
                a_Grouped = Table.Group(a_Tbl, {"A"}, {"All", each Table.FromRows([A]), type table}, 0, (x,y)=>Number.From(y[A]{0}=" role")),
                a_Transform = List.Transform(a_Grouped[All], each Table.ToColumns(_)),
                a_Tranpose = Table.FromColumns(List.Combine(a_Transform)),

                b_RemovedRows = Table.Skip(_,each [New.1]<>"code"),

                c_Combine = List.Transform(Table.ToColumns(a_Tranpose) & Table.ToColumns(b_RemovedRows), each _),
                c_Tbl = Table.FromColumns(List.Select(c_Combine, each List.NonNullCount(_)>0)),
                c_Promoted = Table.PromoteHeaders(c_Tbl, [PromoteAllScalars=true])
            in
                c_Promoted, type table}}, 0, (x,y)=>Number.From(Text.StartsWith(y[New.1], "____"))),
    Combine = Table.Combine(Table.RemoveRowsWithErrors(Grouped)[All]),
    Filtered = Table.SelectRows(Combine, each ([#" title"] <> null)),
    y = {"ID", " role", "grade", "code", " title"},
    Reordered = Table.ReorderColumns(Filtered,y),
    ChangedTypes = Table.TransformColumns(Reordered, {}, Text.From)
in
    ChangedTypes
 
Last edited:
Back
Top