Transform blocks of columns to rows

Shirley Moreman

Excel Version(s): 365
Hi,

I'm fairly new to Power Query, so I've got stumped by this one. I have a set of data for event schedules that I need to transform from blocks of columns relating to dates into a table with rows for the data in each date, including the date and other record information.

So I want to go from this:

[Image: Before.jpg]

to this:


[Image: After.jpg]

I've got as far as using Transpose and Fill Down to set up the dates and create a list of these, but I'm then stuck on how to pick up the other data and get it into the format I need. I've done some searching but haven't found anything like this.
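In M terms, what I've done so far looks something like this (simplified; the real data is in the attached file, so the table name and column positions here are just placeholders):

// Sketch of where I've got to: flip the table so the date header row
// becomes a column, then copy each date down over its block.
let
    // "Schedule" is a placeholder name for the real source table
    Source = Excel.CurrentWorkbook(){[Name = "Schedule"]}[Content],
    Transposed = Table.Transpose(Source),
    // After Transpose the columns are Column1, Column2, ...; the dates land in Column1
    FilledDates = Table.FillDown(Transposed, {"Column1"})
in
    FilledDates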

Any help would be much appreciated.
 

Attachments

  • Event Schedule PQ Challenge.xlsx
    17.8 KB
Hi Shirley,

This is a bit tricky, as it involves both a pivot and an unpivot. Have a look at the attached. :)
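The general shape of it, with made-up stand-in data (the attached workbook has my actual steps), is roughly:

let
    // Stand-in for the raw layout: one block of columns per date
    Source = #table(
        {"Event", "08/05/2022 Start", "08/05/2022 End", "09/05/2022 Start", "09/05/2022 End"},
        {
            {"Setup",     "09:00", "11:00", "08:30", "10:30"},
            {"Rehearsal", "12:00", "14:00", "11:00", "13:00"}
        }
    ),
    // Unpivot everything except the record columns: one row per event/date/type
    Unpivoted = Table.UnpivotOtherColumns(Source, {"Event"}, "Attribute", "Value"),
    // Split the attribute into its date and type parts
    Split = Table.SplitColumn(Unpivoted, "Attribute", Splitter.SplitTextByDelimiter(" "), {"Date", "Type"}),
    // Pivot the types back out, leaving one row per event and date
    Pivoted = Table.Pivot(Split, List.Distinct(Split[Type]), "Type", "Value")
in
    Pivoted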
 

Attachments

  • Shirley.xlsx
    22.3 KB
Ken you are a total star! Thank you so much.
I haven't had a chance to study the query yet, but the output looks like it's just what I need. I will take a look and let you know if I have any questions.
 
Here's another option - very similar to Ken's, but I tried NOT to hard-code the column names in each of the "day" blocks, in case those blocks vary.

@Ken - nice solution! I love the way you used promote headers to make the "Type" column names unique 😊
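For anyone reading along, that trick works because promoting a header row with repeated names makes Power Query auto-suffix the duplicates; a tiny example:

// Promoting a header row of "Start, End, Start, End" gives
// Start, End, Start_1, End_1, which keeps the later unpivot unambiguous
let
    Source = #table(
        {"Column1", "Column2", "Column3", "Column4"},
        {{"Start", "End", "Start", "End"}, {"09:00", "11:00", "08:30", "10:30"}}
    ),
    Promoted = Table.PromoteHeaders(Source)
in
    Promoted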
 

Attachments

  • Event Schedule PQ Challenge.xlsx
    21.2 KB
This was a great solution too, MisterBates! I'd tried to merge columns and pivot but kept tying myself in knots!

Thanks both for taking the time to solve this for me. Happy to close off the thread now.
 
Hi Shirley, feel free to ignore this post, but it may also be of interest...

MisterBates, I was curious to compare our solutions, particularly since you were adding Index columns and performing merges. So what I did was put both our solutions in one workbook, split the data source into a new query, and built a data set that repeats the "data" section of the raw table 1,000 times. I then ran some speed tests over the solutions with my Monkey Tools TimeSleuth.
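The repeating itself is simple enough; something along these lines (with a stand-in source):

let
    MaxRepeats = 1000,
    // Stand-in for the split-out data source query
    Source = #table({"Event", "Date"}, {{"Setup", "08/05/2022"}}),
    // Stack MaxRepeats copies of the source into one tall table
    Repeated = Table.Combine(List.Repeat({Source}, MaxRepeats))
in
    Repeated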

Please be aware that this is not criticism of your solution, but rather genuine academic interest on my side. Having said that, here's a view of the results:
[Image: compare.png]

The key takeaways for me here are that, with privacy off, the time isn't massively different. (My method averages ~1.7 sec, yours ~3.6, but with a bit more variation over several refreshes.) But when privacy checks are active (Power Query's default state), the differences start to become significant. Even with privacy checks on, the methods I used turn out to take little extra time, whereas your method sees significant performance hits and MUCH more variation in refresh times. My guess is that the issues can be laid at the feet of the Index column (which essentially buffers the table, forcing a full load into memory before subsequent processing), followed by the merge operations.
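To show the shape of the pattern I'm talking about (a sketch, not your actual steps):

let
    Source = #table({"Event"}, {{"Setup"}, {"Rehearsal"}}),
    // Adding the index forces the full table to be enumerated up front...
    Indexed = Table.AddIndexColumn(Source, "Index", 0, 1),
    // ...and the merge then has to match rows against that materialised set
    Merged = Table.NestedJoin(Indexed, {"Index"}, Indexed, {"Index"}, "Match", JoinKind.LeftOuter)
in
    Merged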

I've attached a workbook with the side-by-side and the output report. Also, full disclosure: I did tweak mine to remove a Changed Type step (to avoid hard-coding some columns), as well as adding a step to drop the _x suffixes on the "Type" column, as I missed that the first time. Also, in order to get the file small enough to upload to the forum, I had to add a parameter called MaxRepeats and reduce the data to 100 repeats. If you want to feel the experience I had, just jump into Power Query, change that value to 1000, close, and do a full refresh.
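(For the record, the suffix cleanup is just something like this; it assumes none of the real type names contain an underscore of their own:)

let
    // Stand-in values as they come out of the unpivot
    Unpivoted = #table({"Type"}, {{"Start"}, {"End"}, {"Start_1"}, {"End_1"}}),
    // Text.BeforeDelimiter returns the whole text when no "_" is present,
    // so "Start" stays "Start" and "Start_1" becomes "Start"
    CleanType = Table.TransformColumns(Unpivoted, {{"Type", each Text.BeforeDelimiter(_, "_"), type text}})
in
    CleanType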
 

Attachments

  • Compare.xlsx
    105 KB
Wow! ... for me, this is the value of a forum. Being able to compare different solutions to a problem. Being able to see the performance differences (and whether a solution is "variable" in terms of performance). And having access to folk with the expertise (and time) to do this kind of analysis.

A genuine THANKS from me!
 
Since, as is well known, many roads lead to Rome, here is another possibility.
 

Attachments

  • Event Schedule PQ Challenge.xlsx
    22.8 KB

Nice! Many ways to solve the requirement indeed.

Although I'm not liking the dependency that your solution has on the column names - if those changed, or if an additional column was added in each group (or in just one of the groups), the code would need quite some revision.
Just my two pence :)
 
My thoughts as well. Another slick solution, but it isn't scalable, given the OP's indication that there could be more columns.
 
Not sure how I missed all this activity on my original question - apologies to all who took the time to compare and come up with alternative solutions. Particular thanks to Ken Puls for taking the time to do a performance test - although in most cases the data set will be quite small, there's a possibility that there could be a larger set at some point, so it's good to know the potential issues. Thanks again everyone.
 