Perform calculation between records

wannes

I have a text file with data (thousands of records).
I read the data into a workbook through Power Query to perform some modifications.
So far so good.

What I need now is to compare a value in a record to a value in the previous record. I managed to do this in Power Query, but it is really very slow.

Therefore I load the data from Power Query to a worksheet, where it is no problem to do calculations between records.
The problem is that when I re-read the data or apply a filter in Power Query, the number of records changes and the calculations are lost (#REF! errors: the formulas don't keep their references).

Can someone give me advice on how to reach my goal?

Txs!

Wannes
 
Not without seeing the workbook. Please attach it here. Desensitise any sensitive data.
 
Hi,

Attached is a sample file of what I want to end up with.

KR
Wannes
 

Attachments

  • Sample PQ.xlsx
    20.4 KB
Here is some sample code to apply to your example...

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="T_SOURCE"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"A", Int64.Type}, {"B", Int64.Type}, {"C", Int64.Type}, {"D", type text}}),
    // Add a 0-based index so each row can locate the row before it
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    // Look up column D of the previous row by position (errors on the first row, where Index - 1 = -1)
    TestVal = Table.AddColumn(#"Added Index", "Offset", each #"Added Index"{[Index] - 1}[D]),
    // Flag rows where D differs from the previous row's D
    #"Added Conditional Column" = Table.AddColumn(TestVal, "Custom", each if [D] <> [Offset] then 1 else 0),
    // The first row's error propagates into Custom; treat it as a change
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Added Conditional Column", {{"Custom", 1}}),
    #"Removed Columns" = Table.RemoveColumns(#"Replaced Errors",{"Index", "Offset"})
in
    #"Removed Columns"
 
I realise this is an old thread - I'm going through these to try and learn from real-life scenarios.

I have a text file with data (thousands of records).<snip>
I managed to do this in Power Query, but it is really very slow.

I heard somewhere that this kind of row-by-row lookup can take a long time and that merging two queries would be faster, although it takes more code.
The attached contains my first gauche attempt at this, and already the difference in time is huge.
Code:
let
    Source = Excel.CurrentWorkbook(){[Name="T_SOURCE"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"A", Int64.Type}, {"B", Int64.Type}, {"C", Int64.Type}, {"D", type text}}),
    // Two index columns, one starting at 0 and one at 1, so each row can be joined to the row above it
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    #"Added Index1" = Table.AddIndexColumn(#"Added Index", "Index.1", 1, 1),
    // Self-join on Index = Index.1: each row is paired with the previous row
    #"Merged Queries" = Table.NestedJoin(#"Added Index1", {"Index"}, #"Added Index1", {"Index.1"}, "Added Index1", JoinKind.LeftOuter),
    // Pull the previous row's D into the current row as D.1 (null on the first row)
    #"Expanded Added Index1" = Table.ExpandTableColumn(#"Merged Queries", "Added Index1", {"D"}, {"D.1"}),
    // Sort back into the original order in case the merge reshuffled the rows
    #"Sorted Rows" = Table.Sort(#"Expanded Added Index1",{{"Index", Order.Ascending}}),
    // 0 where D matches the previous row, 1 where it changed
    #"Added Custom" = Table.AddColumn(#"Sorted Rows", "Change", each if [D]=[D.1] then 0 else 1),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"A", "B", "C", "D", "Change"})
in
    #"Removed Other Columns"

But can it be better put together?
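One possible alternative arrangement, for comparison: since only column D is needed from the previous row, the shifted values can be built directly as a list, which avoids the join, the expand and the sort altogether. A sketch along the same lines, assuming D is the only column being compared:

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="T_SOURCE"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"A", Int64.Type}, {"B", Int64.Type}, {"C", Int64.Type}, {"D", type text}}),
    // Column D as a list, and the same list shifted down one row (null for the first row)
    DList = #"Changed Type"[D],
    PrevList = {null} & List.RemoveLastN(DList, 1),
    // 0 where D matches the previous row, 1 where it changed (the first row compares against null, so it is flagged)
    ChangeList = List.Transform(List.Zip({DList, PrevList}), each if _{0} = _{1} then 0 else 1),
    // Reassemble the table with the extra Change column
    Result = Table.FromColumns(Table.ToColumns(#"Changed Type") & {ChangeList}, Table.ColumnNames(#"Changed Type") & {"Change"})
in
    Result

I haven't timed it against the 100k-row file, so I don't know whether it beats the merge version, but it keeps everything to single-pass list operations.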

It takes less than 5 seconds on wannes' source data expanded to more than 100k rows.
Rudi's code started fast enough, reaching 1,000 records after a few seconds, then started to slow down. After 20 minutes it had processed 15k rows and was handling them at less than 10 rows per second (after 30 minutes, about 6 records per second).
Would Rudi's code benefit from a Table.Buffer line or two in it somewhere? (One place to try it is sketched after the link below.)

The source data's too big to attach here so here's a link to the file: https://app.box.com/s/wlb6qitneexwarjuz9cc568ueiy7o4k4
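On the Table.Buffer question: the slow part of Rudi's version is most likely the per-row lookup #"Added Index"{[Index] - 1}, which can make the engine re-evaluate the steps behind that table for every row. Wrapping the indexed table in Table.Buffer is the usual thing to try, so the lookups read from an in-memory copy. A sketch of where I would put it; whether it actually helps will depend on the data and the Power Query version:

Code:
let
    Source = Excel.CurrentWorkbook(){[Name="T_SOURCE"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"A", Int64.Type}, {"B", Int64.Type}, {"C", Int64.Type}, {"D", type text}}),
    // Buffer the indexed table once so the per-row positional lookups hit memory instead of re-running the upstream steps
    #"Added Index" = Table.Buffer(Table.AddIndexColumn(#"Changed Type", "Index", 0, 1)),
    TestVal = Table.AddColumn(#"Added Index", "Offset", each #"Added Index"{[Index] - 1}[D]),
    #"Added Conditional Column" = Table.AddColumn(TestVal, "Custom", each if [D] <> [Offset] then 1 else 0),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Added Conditional Column", {{"Custom", 1}}),
    #"Removed Columns" = Table.RemoveColumns(#"Replaced Errors",{"Index", "Offset"})
in
    #"Removed Columns"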
 