To combine similar columns multiple tabs into one sheet, there are many ways to do this in google sheets. But here’s the one I used to combine electric scooters, escooters and e-scooters:
=sort(QUERY({'Electric scooter'!A2:B;Escooter!A2:B;'E scooter'!A2:B},"select * where Col2 >40"),2,false)
To categorise your search terms, where they are listed in column A, your category names are in the header row, cells C1:R1 and your MSV (monthly search volume) is in column B:
=arrayformula(if(REGEXMATCH($A2:$A,C$1:R$1),$B2:$B,))
To turn that data in table-form into column form, you combine every combination of category name, search term and MSV with the “@” symbol as the divider. Then you flatten that table of data, split it on the “@” symbol and sort it by MSV, which will now be column 4:
=ArrayFormula(sort(split(FLATTEN(Categorise!C1:S1&"@"& Categorise!A2:A3677&"@"&Categorise!C2:S3677),"@"),3,0)
)
To clean your data, nest some substitutes. Here I get rid of some stop words (a regexreplace would have been better to be honest but I set myself a rule of not using complex regex although see below) as well as cleaning up the scooter terms so we are just dealing with the word scooter. Note that this can go a bit wrong – Pure Scooters ended up as Purscooter but this wasn’t that bit a problem.
=arrayformula(substitute(substitute(substitute(substitute(SUBSTITUTE(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(substitute(SUBSTITUTE(substitute(substitute(substitute(substitute(B2:B,"seats","seat"),"scooters", "scooter"),"are ",""),"-","")," e scooter"," scooter")," e scooters"," scooter"),"electric scooter","scooter"), "e scooter","scooter"),"escooter","scooter"),"adults", "adult"),"kids","kid")," for "," ")," a "," ")," an "," ")," with "," "),"the ",""),"what's","what"),"on ",""),"in ",""),"laws","law"))
To put all search terms into alphabetical order so ones that are identical but diffrently ordered can be combined, we split the term on the space so each word is in a cell of its own on a row. We then transpose that row to a single column and sort it to get it in alphabetical order (required as you can sort columns but nor rows in sheets). Then you transpose it to put it back as a row and join the cells up with spaces to get it all back into a single cell!
=join(" ",transpose(sort(transpose(SPLIT(G2," ")),1,true)))
Now use a pivot table! Thanks to electroheads for permission to use this electric scooter analysis at my talk in Brighton.
If you’re happy to use regex instead of nested substitutes, something like this works:
=substitute(substitute(REGEXREPLACE (B2,"\'s|\bi\b|\ba\b|\bbe\b|\bof\b|\band\b|\bdo\b|\bto\b|\bhas\b|\bthe\b|\bis\b|\bon\b|\bin\b",)," "," ")," "," ")
The /b is the start or end of the word so EG \band\b finds the word and (but not part of words like band or sandy) – the whole thing is then wrapped in some substitutes to eliminate double spaces that arise.
Leave a reply to Malcolm Coles Cancel reply