Running out of memory (Exceeding 400GB file)

ste4en

New member
Joined
Jul 24, 2018
Messages
13
Reaction score
0
Points
0
Excel Version(s)
2016
Hello, I am running a power query on three Excel workbooks, each have about 200 sheets. I am pulling about 70 rows and 30 columns from every sheet. I was unable to perform any filtering or column removal in the process as it just errors out. Anyway I have decided to let it run as is and i do the filtering etc after all the data is in one large flat file. Overall about 70,000 rows.

My problem started today (the spreadsheet grows daily and is almost complete, maybe another 5,000 rows to go) in that it gets to about 60,000 rows and my computer runs out of starage. the file size exceeds 400GB.

It doesn't seem right that the query is so big while it is being created. it is also taking about 30 minutes to run.

Any way to speed this up or let it use less memory.

Thanks = query below.


-----------------------------------------------------
let
Source = Folder.Files("C:\Users\sh\Desktop\Gasifier work\Restart Excel Sheets"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content],true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data"}, {"Name.1", "Data"}),
#"Renamed Columns" = Table.RenameColumns(#"Expanded Custom",{{"Name", "FileName"}, {"Name.1", "Name"}}),
#"Expanded Data" = Table.ExpandTableColumn(#"Renamed Columns", "Data", {"Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column2", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column3", "Column30", "Column31", "Column32", "Column33", "Column34", "Column35", "Column36", "Column37", "Column38", "Column39", "Column4", "Column40", "Column41", "Column42", "Column43", "Column44", "Column45", "Column46", "Column47", "Column48", "Column49", "Column5", "Column50", "Column51", "Column52", "Column53", "Column54", "Column55", "Column56", "Column57", "Column58", "Column59", "Column6", "Column60", "Column7", "Column8", "Column9", "SG Solutions - Routine Maintenance Outage Cost Estimating Worksheet"}, {"Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column2", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column3", "Column30", "Column31", "Column32", "Column33", "Column34", "Column35", "Column36", "Column37", "Column38", "Column39", "Column4", "Column40", "Column41", "Column42", "Column43", "Column44", "Column45", "Column46", "Column47", "Column48", "Column49", "Column5", "Column50", "Column51", "Column52", "Column53", "Column54", "Column55", "Column56", "Column57", "Column58", "Column59", "Column6", "Column60", "Column7", "Column8", "Column9", "SG Solutions - Routine Maintenance Outage Cost Estimating Worksheet"})
in
#"Expanded Data"
 
Try this:
Open the Power Query Options window
Global Data Load section
• Check: Fast Data Load

Current Workbook Data Load section
• UN-check: Allow data preview to download in the background
(That option is mis-named/described: When checked, whenever you alter ANY query, the preview for ALL queries is updated. Potentially consuming a huge amount of resources.)

Does that help?
 
Thank you. No change, may be running a little slower, but using same memory. I have decided split the query into three parts, so good for now.
thanks
 
I would also try turning the privacy settings off for that workbook. Options --> Current Workbook --> Privacy --> Ignore. I've found that can help with speed in a big way, and may also reduce the memory pressure.

Out of curiosity, are you on a 32 bit or 64 bit version of Excel?
 
ok (re cross posting)
 
No change on the Privacy-Ignore setting.
I am running 32Bit
 
@ste4en, maybe you want to try running on 64bit and increase your RAM.

Works for me. I was using 32BM & running on 16MB of RAM (took me like 2 hours ish or more to run a query) and when i updated to 64 bits and 32GB RAM, the same queries took me about 10 mins or so :)
 
The killer though, is convincing IT. At least Excel 2019 is supposed to be shipping at a 64 bit default now, so hopefully that will finally force IT to move forward on this stuff!
 
i am running a 64bit machine with 16MB ram. I have broken the quory down into three parts which helps. Yes Excel 2019 64Bit will help but this exercise will be over before it arrives in our organization.
 
Out of curiosity, is there any chance that you can get IT to install the 64 bit version of 2016 for you? They already own it, as it is an install option. Maybe explain the issues you're having and that you'd like to be the organizational guinea pig for end user testing?

My feeling is that you're in a memory issue here, and there isn't much you're going to be able to do to fix this in a single workbook scenario.

The only other method I would look at is breaking things down into a number of sub files (say 5) that process 1/5 of the data files each, then use a master solution to just pull that data in and append it. This way you split the heavy processing into smaller components that can be completed with less memory. The final solution should complete, as it is only retrieving data, not manipulating it (beyond a simple append.)

HTH,
 
Back
Top