Small base data files and Out of Memory

UK_GER

New member
Joined: Apr 23, 2016
Messages: 30
Reaction score: 0
Points: 0
Location: GER
Excel Version(s): 2010
Hi all,

I've been using PQ for quite a while now.
It combines work and "fun" and delivers good results.

But - now I'm at a point where it starts to give off bad vibes.

I've built some queries on even a small amount of data (6 files, each less than 1 MB), but they do a lot of filtering, merging, adding, filtering again, merging again....... 40 queries in one Excel file.

And - it is slooooooow or even crashes with "Out of Memory".
(Looking at Task Manager, I saw Excel and some mashup containers growing and growing.)

Is anyone here facing the same problem?
And - is there a solution?

Our IT department is willing to help me, but I don't even know what I'd need from them.

I'm working with Excel 2010 (32-bit) and PQ as an add-in.


Thanks
UK_GER
 
If your IT department has someone with Power Query knowledge, they can try and analyze your code to find out where the bad performance comes from.

Possibly you need some table buffering at some points, to prevent code from being evaluated over and over again.
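For example, a minimal sketch (query and column names are made up here) of buffering a small lookup table once before it gets merged into another query, so its steps aren't re-evaluated on every merge:

    let
        // hold the small lookup table in memory so later steps reuse it
        // instead of re-evaluating the steps behind LookupQuery every time
        BufferedLookup = Table.Buffer(LookupQuery),
        Merged = Table.NestedJoin(SourceQuery, {"Key"}, BufferedLookup, {"Key"}, "LookupData", JoinKind.LeftOuter),
        Expanded = Table.ExpandTableColumn(Merged, "LookupData", {"Category"})
    in
        Expanded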
Otherwise, some usual suspects for slowing down your code would be:
- referencing other rows in the same table (using index values),
- creating subtotals, each summing values from previous rows,
- sorting, grouping, (un)pivoting,
- using List.Generate, List.Accumulate or recursive functions without buffering (see the sketch below).
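To illustrate those last two points, here is a rough sketch of a running total per row (the column name "Amount" is assumed) where the column is buffered once, so the accumulation doesn't re-evaluate the source repeatedly:

    let
        Source = SomeQuery,
        // buffer the column once; without this, the accumulation may trigger
        // fresh evaluations of the steps behind Source
        Amounts = List.Buffer(Source[Amount]),
        // running total per row: {a1, a1+a2, a1+a2+a3, ...}
        RunningTotals = List.Skip(
            List.Accumulate(Amounts, {0}, (totals, amount) => totals & {List.Last(totals) + amount}),
            1
        ),
        WithTotals = Table.FromColumns(
            Table.ToColumns(Source) & {RunningTotals},
            Table.ColumnNames(Source) & {"RunningTotal"}
        )
    in
        WithTotals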

Also "40 queries" sounds as a lot to me: do they all include end results you need for reporting, or do they include intermediate tables (are these loaded with "Create connection only"?).
Possibly you can combine some query codes and reduce the total number of queries.

In other words: it's hard to tell in general.

Otherwise you can take general steps to analyse the problem: check what was recently changed that may be causing the current problems, put your code in a test environment and start taking out parts of the code to try to find the problematic parts, etcetera.

Maybe you can share some code for us to verify (not all 40 queries, but some parts that might be suspicious).
 
Dear Marcel,

Thanks for the fast response.
So it seems the problem sits in front of the PC.

Table buffering is new to me. I will do some work with it.
I have no List.* functions at all, and no indexing either.
I also have no subtotals (as I understand from your description, something like inventory per date, for example).

8 queries load data into one tab at the end to generate charts.
Another 6 load the data into tables for analysing the facts that are shown in 6 of those 8 charts.

What I do have are quite a few more Duplicates and References, as I need the data both in the charts per month and in the tabs.

Not easy to explain.

There is no one in our IT department who can help ;-)

I will do some homework and come back.

Thanks
Uwe
 
I was going to say vote for this, but I see you've already commented on it.

Flattening your query chain can help to avoid this issue, although you would hope that you wouldn't have to do this.

Try turning off your Privacy settings, as this can cause a huge performance hit. You also may want to try setting the Fast Data Load option to on. (Both are found in the PQ settings). The latter may or may not make a difference... it depends on how big your data set is. (The bigger it is, the more likely Fast Data Load will help.)
 
Hi Ken,

Yes, I've written a short comment.
It's take and GIVE.

The options you mentioned I had already set "before".
But that doesn't help much.

Seems that I have to optimize.

One option could be to build intermediate queries to reduce the data (even though the files are small right now).
One of the Excel data files contains bookings for orders that have to be summed up.
That always blows up the loading time.
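Something like this intermediate query is what I have in mind (file path, sheet and column names are just placeholders) - summing the bookings per order once, so the other queries only see the small result:

    let
        Source = Excel.Workbook(File.Contents("C:\Data\Bookings.xlsx")),
        BookingsSheet = Source{[Item = "Bookings", Kind = "Sheet"]}[Data],
        Promoted = Table.PromoteHeaders(BookingsSheet),
        // aggregate once here instead of repeating the summing in every other query
        PerOrder = Table.Group(Promoted, {"OrderNo"}, {{"BookedAmount", each List.Sum([Amount]), type number}})
    in
        PerOrder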

But then I'm back at the point I asked about here approx. 1 year ago: query on query. Grrrr.
That was also the reason for my SAP BW question. That could probably provide me with some more of the tables I need for the different queries.
At the moment we pull the data "by hand"; SAP users will have an idea what that means. So we try to reduce the number of base data files.
But - for some queries we need data from the last 10 years, for others only YTD.

But one question (the 1st step):

Is it better to merge a date in a table with another table containing date conversions, e.g. month, year, mmm.yy, ...?
Or should I do this work in PQ?

Thanks
Uwe
 
I hear you, Uwe. It's very frustrating.


So, on the date thing... I generally create my date table for Power Pivot via Power Query and don't merge it into anything at all. I do the link in Power Pivot and call it a day.
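For what it's worth, a minimal sketch of such a date table (the start date and column names are just examples), built entirely in PQ and then linked in Power Pivot:

    let
        StartDate = #date(2007, 1, 1),
        EndDate = Date.From(DateTime.LocalNow()),
        // one row per day between the two dates
        Dates = List.Dates(StartDate, Duration.Days(EndDate - StartDate) + 1, #duration(1, 0, 0, 0)),
        AsTable = Table.FromList(Dates, Splitter.SplitByNothing(), {"Date"}),
        Typed = Table.TransformColumnTypes(AsTable, {{"Date", type date}}),
        AddYear = Table.AddColumn(Typed, "Year", each Date.Year([Date]), Int64.Type),
        AddMonth = Table.AddColumn(AddYear, "Month", each Date.Month([Date]), Int64.Type),
        AddMonthYear = Table.AddColumn(AddMonth, "MonthYear", each Date.ToText([Date], "MMM.yy"), type text)
    in
        AddMonthYear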


If you can't do that though... which is faster? I'm thinking with the perf issue highlighted in that uservoice item, it's probably faster to add new columns than create a child table that needs to be refreshed.
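As a rough sketch of that add-columns route (the table and column names are assumed), adding the conversions straight onto the data table instead of merging in a conversion table:

    let
        Source = Bookings,    // hypothetical query with a "Date" column
        AddYear = Table.AddColumn(Source, "Year", each Date.Year([Date]), Int64.Type),
        AddMonthYear = Table.AddColumn(AddYear, "MonthYear", each Date.ToText([Date], "MMM.yy"), type text)
    in
        AddMonthYear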
 
Hi Ken,

It's good to know that someone can understand my feelings ;-)

I think I will go for the following solution:

Extract data from SAP in big data sets (as we do now).
Build one intermediate query to split the data into small packages (e.g. Bookings 2017, Bookings 2016, ... bookings summed up per order (for different years), and so on).

No index, no lists, no merges. Only a few short steps, e.g. deriving Month_Year from the date... and bills (all in one row).
These extracts will be loaded into tabs.
This one file will be the DATABASE point for all queries to build on.
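As a rough sketch, one of those package queries could look like this (query and column names are only placeholders) - it filters the raw extract down to one year and gets loaded to its own tab:

    let
        Source = SAP_Extract_Raw,    // staging query over the big extract, connection only
        Bookings2017 = Table.SelectRows(Source, each Date.Year([BookingDate]) = 2017)
    in
        Bookings2017

The reporting queries would then read the small tab via Excel.CurrentWorkbook(){[Name = "Bookings_2017"]}[Content] instead of touching the big extract again.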

In the best case it is one big refresh - one that works without "Out of Memory" - send a smile ;-)

So it's from BIG to small to What.I.Want.

Not nice - but I hope it works out for some time.

Have a nice day, all
Uwe
 