Splitting rows by regex

Kaggle has a nifty little dataset that contains data on ~5000 movies scraped from the IMDb website. It’s perfect for demonstrating why and how you might want to split rows.

Download the sample workbook (with the demo query) by clicking here. Get QueryStorm from here.

Here’s a snippet of what the data looks like:

It’s one movie per row, nice and tidy for the most part. We can easily use a pivot table to figure out e.g. the average movie duration for each director.

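But what if we wanted to find out the average movie duration per genre? All the genres for a movie are mushed together into a single cell, so a pivot table would treat each distinct combination of genres as its own group instead of grouping by individual genre. That makes the data unusable for pivoting as it stands.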
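To make the idea concrete, here is a minimal Python sketch of splitting rows by regex. It is not the QueryStorm query from the sample workbook, just an illustration of the technique. The sample rows, the column names, and the `|` separator are all assumptions made for the example, and `split_rows` and `average_by` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Hypothetical sample rows; the titles, durations and the "|" separator
# are assumptions for illustration, not values taken from the dataset.
movies = [
    {"title": "Movie A", "duration": 120, "genres": "Action|Adventure"},
    {"title": "Movie B", "duration": 90, "genres": "Comedy"},
    {"title": "Movie C", "duration": 100, "genres": "Action|Comedy"},
]

# Regex for the separator, tolerating stray whitespace around it.
GENRE_SEPARATOR = re.compile(r"\s*\|\s*")


def split_rows(rows, column, pattern):
    """Yield one copy of each row per value found in the delimited column."""
    for row in rows:
        for value in pattern.split(row[column].strip()):
            if value:  # skip empty pieces, e.g. from a trailing separator
                yield {**row, column: value}


def average_by(rows, key, measure):
    """Return the mean of `measure` for each distinct value of `key`."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum, count]
    for row in rows:
        totals[row[key]][0] += row[measure]
        totals[row[key]][1] += 1
    return {k: s / n for k, (s, n) in totals.items()}


split = list(split_rows(movies, "genres", GENRE_SEPARATOR))
avg_duration = average_by(split, "genres", "duration")
```

After the split, each movie appears once per genre, so grouping by the genre column works the same way grouping by director did on the original rows. Note that a movie with several genres now contributes to several groups, which is what we want for a per-genre average but would inflate a plain row count.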
