Performance Methods

MushroomFace · New member · Joined Jun 12, 2018 · Messages: 6 · Location: UK · Excel Version(s): 2013
Hi

I have two sets of Excel data going through Power Query:
Data 1 - Retrieves a file and manipulates the data. I then reference this (creating another query) and retrieve a unique list of invoices.
Data 2 - This file has 1,000,000+ rows. I would like to filter on the invoices first to reduce the data, then manipulate it, which also requires other merges.

Using the "Unique List of Invoices" from Data 1, what would be the most efficient way to treat Data 2? I'm currently thinking:
  • Merge function, or
  • Filter by list (Table.SelectRows with List.Contains, etc.)

Other suggestions welcome! Thanks
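For reference, the two options might look roughly like this in M. The query name UniqueInvoices and the key column Invoice are assumptions, not from the original post; this is just a sketch of the two shapes:

```m
// Option 1: merge - an inner join keeps only Data2 rows whose invoice
// appears in the unique list (assumed query/column names)
Merged = Table.NestedJoin(Data2, {"Invoice"}, UniqueInvoices, {"Invoice"}, "Match", JoinKind.Inner),

// Option 2: filter by list - buffer the small list once so it is not
// re-evaluated for every one of the 1,000,000+ rows
InvoiceList = List.Buffer(UniqueInvoices[Invoice]),
Filtered = Table.SelectRows(Data2, each List.Contains(InvoiceList, [Invoice]))
```

With the filter-by-list route, List.Buffer on the small side tends to matter most, since without it the list expression may be re-evaluated per row.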
 
I've got that covered. One is a master list of invoices that has all the details but runs to millions of rows; the list of invoices comes from another report that has certain criteria but does not provide enough information. If the master list is the source, I would merge it with the unique list of invoices as a right outer join.
Both methods work, but which is the most efficient, if any? I don't want to break it; work has the 32-bit version.
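If the master list is the source query, that right-outer merge step would look something like this (query and column names assumed):

```m
// Right outer join: keep every invoice from the unique list,
// and drop master rows that have no match
Merged = Table.NestedJoin(MasterList, {"Invoice"}, UniqueInvoices, {"Invoice"}, "Detail", JoinKind.RightOuter)
```

Note that with JoinKind.RightOuter it is the second table (the unique list) whose rows are all kept, so the direction of the merge matters here.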
 
32-bit... ugh. You could try to do it all in one workbook, and it might choke. Given your probable memory limitations under 32-bit Excel, my approach would be:
1. Do as you proposed, but copy-paste the results into a blank workbook and save it as a CSV file.
2. Import the CSV file created in Step 1 as a connection only. Hopefully the Data 2 file with 1,000,000+ rows is also a CSV; import it as a separate connection-only query. Now build another query that merges the two to leave you with what you want, and load that to the worksheet. Hopefully it will not lock up. That's just my take.
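The two-step workflow above might be sketched like this in M. The file paths and column name are hypothetical placeholders, not from the thread:

```m
let
    // Step 2a: connection-only query over the small CSV produced in Step 1
    // (path is a placeholder - substitute your own)
    Invoices = Table.PromoteHeaders(
        Csv.Document(File.Contents("C:\Temp\Invoices.csv"), [Delimiter = ",", Encoding = 65001])),

    // Step 2b: the big 1,000,000+ row file, also loaded as connection only
    Data2 = Table.PromoteHeaders(
        Csv.Document(File.Contents("C:\Temp\Data2.csv"), [Delimiter = ",", Encoding = 65001])),

    // Step 2c: merge so only rows whose invoice appears in the small list survive,
    // then load this final query to the worksheet
    Merged = Table.NestedJoin(Data2, {"Invoice"}, Invoices, {"Invoice"}, "Match", JoinKind.Inner)
in
    Merged
```

Keeping both source queries as connection-only and loading just the merged result is what keeps the 32-bit memory footprint down.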
 