A couple of days ago there was a post on the lookstat blog titled "top search keywords for energy". It compared search terms in Google in an attempt to estimate the popularity of energy images on microstock sites. Earlier this year I did something similar in a post about seasonal stock images, and at the time I made the point that I wasn't exactly sure how well Google search terms relate to searches on microstock sites.
So that set me thinking... (yes, be very afraid). Just how closely does Google Trends/AdWords data match what people are searching for at microstock sites? Clearly there will be some relationship, but I'd also guess that there are lots of popular terms that will not have a proportionate number of microstock searches. It's difficult to know how similar the two are. Is it reasonable to assume that popular keywords in Google are more likely to lead to microstock sales, because those keywords are popular subjects and so there will be related businesses in need of such images? As they say, "assume makes an ass out of u and me".
Say, for example, we see that holiday flights is a popular subject in Google, and we also see that exchange rates is equally popular. For me it's a stretch to say that images representing exchange rates will sell in the same proportions as ones depicting holiday flights. Surely there are too many variables?
Another Data Set
I have access to the keywords that people used while browsing a free stock photo site (similar to stockxchange but nothing like as big). Of about 1 million searches in 2008 and late 2007 there were some 160,000 unique key phrases. The vast majority of them only got one search (just 48k had 2 searches or more). This is just one of the places in microstock where we see the 'long tail / exponential decay graph'; see my post on how long images continue to sell and, more recently, Microstock Diaries revisited the long tail. A plot of the top 100 is as follows (the full table of the data is below; only every 4th keyword would fit on the graph):
Graph of search terms vs search volume for top 100 searches.
The key phrases were counted exactly as entered, so typos, stemming variants and searches such as "cat." or "cats" have not been grouped with all the other "cat" searches. Likewise, the total for "people" does not include the times users searched for phrases containing it, like "young people"; these are listed separately.
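For anyone wanting to do the grouping I skipped, here is a minimal Python sketch of how it might look. It is not what I ran on the data (that was a pivot table); the function names are made up for illustration, and the plural folding is a deliberately crude assumption rather than proper stemming.

```python
from collections import Counter
import string


def normalise(phrase):
    """Lower-case, strip punctuation and collapse runs of whitespace."""
    no_punct = phrase.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())


def naive_singular(word):
    """Crude plural folding ('cats' -> 'cat'). An assumption, not real stemming."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def group_key(phrase):
    """Key under which variants such as 'cat.', 'Cats' and 'cat' are merged."""
    return " ".join(naive_singular(w) for w in normalise(phrase).split())


def count_searches(raw_searches):
    """Count searches per grouped key phrase, ignoring blank searches."""
    keys = (group_key(s) for s in raw_searches)
    return Counter(k for k in keys if k)


def count_containing(counts, word):
    """Total searches for every phrase that contains `word` as a whole word."""
    return sum(n for phrase, n in counts.items() if word in phrase.split())


# Hypothetical input, only to show the shape of the data.
searches = ["cat", "cat.", "Cats", "young people", "people", "  "]
counts = count_searches(searches)
single_search_phrases = sum(1 for n in counts.values() if n == 1)
print(counts.most_common(3), single_search_phrases)
```

Note that stripping punctuation also turns something like "t-shirt" into "tshirt", and typos still end up in their own bucket, so this would only be a first pass.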
Armed with this data (in the slowest pivot table known to man) I decided to do a bit of analysis to see how these matched the results in the lookstat post:
Left: Google search analysis from the lookstat post and Right: results from the free stock photo site data set.
Looks like they match quite well! Anything below 10 could perhaps be considered error and could easily be skewed by some other factor. I was convinced I was going to be able to prove that "energy jobs" was popular in Google but not a popular stock search. It seems that way, but sadly I don't think I have a large enough data set to be certain (?).
What's quite interesting is that only 45 out of a million searches were made for our 'top' energy keywords. There were also 6 similar phrases with one search each ("solar energy farm", "solar energy panel" etc.), plus many more for the single keywords solar, energy and their related synonyms.
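"Match quite well" above is an eyeball judgement from the two charts. One way to put a number on it would be a rank correlation between the two sets of counts. The sketch below is an assumption about how that could be done in plain Python, not something I ran; the counts in it are invented placeholders, not figures from either data set.

```python
def average_ranks(values):
    """Rank from largest (rank 1) downwards; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; returns NaN when either input has no variation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    return pearson(average_ranks(a), average_ranks(b))


def compare_sources(google_counts, stock_counts):
    """Rank-correlate only the phrases that appear in both sources."""
    shared = sorted(set(google_counts) & set(stock_counts))
    if len(shared) < 2:
        return float("nan")
    return spearman([google_counts[p] for p in shared],
                    [stock_counts[p] for p in shared])


# Placeholder numbers, purely illustrative.
google = {"term a": 90, "term b": 60, "term c": 30, "term d": 10}
stock = {"term a": 20, "term b": 9, "term c": 11, "term d": 2}
print(compare_sources(google, stock))
```

A value near 1 would mean the two sources order the phrases the same way. With only a handful of phrases and counts in single figures, though, the result would be very noisy, which is the same small-sample worry as above.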
The Top 100
For extra comparison, the keywords in my data set look a lot like the top 100 keywords searched on Shutterstock, although I have a definite English language bias. I also have not removed from the top 100 several keywords like 'nude' and 'sex' that probably do not come from image buyers. There is quite a lot more variability in the ordering, and plenty of the keywords Shutterstock have in their top 100 only made it into my top 200.
Note: "blank" searches are probably robots, mistaken users, or users just seeing what an empty search does. Interestingly, if you run a web site with a photo search, a blank search should most likely not be allowed, or should perhaps return a "nothing found" message but ALSO a selection of random or popular images.
Ranked 95 to 99: "chocolate, beer, tv and space cake". Sounds like a good night in, lol.
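As a rough illustration of that blank-search suggestion, here is a small Python sketch. Everything in it is hypothetical: `search_index` is assumed to be a plain dict from keyword to a list of image IDs, and `popular_images` a pre-computed list, which is not how any particular site works.

```python
def handle_search(query, search_index, popular_images, fallback_size=12):
    """Return results, plus popular images when there is nothing to show.

    search_index and popular_images are hypothetical stand-ins for whatever
    the site's real search backend provides.
    """
    cleaned = query.strip().lower()
    if not cleaned:
        # Blank search: don't run it, but don't leave the page empty either.
        return {
            "results": [],
            "message": "Please enter a search term.",
            "suggestions": popular_images[:fallback_size],
        }
    results = search_index.get(cleaned, [])
    if not results:
        # Nothing matched: say so, and offer something to browse.
        return {
            "results": [],
            "message": f"Nothing found for '{cleaned}'.",
            "suggestions": popular_images[:fallback_size],
        }
    return {"results": results, "message": "", "suggestions": []}


# Made-up data to show the three cases.
index = {"cat": ["img_101", "img_102"]}
popular = ["img_001", "img_002", "img_003"]
print(handle_search("   ", index, popular, fallback_size=2))
print(handle_search("tofu", index, popular, fallback_size=2))
print(handle_search("Cat", index, popular))
```

Blank searches could also be logged separately, which would keep them out of the keyword totals altogether.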
Unlucky for some
Here are a few of the 406 terms that had 13 searches each:
baby jesus, voucher, pylon, weed, quebec, ladders, computer chip, emo girl, brussel sprouts, learner driver, woman on phone, lord of the rings, lotto, turf, fashion clothes, sand clock, ghandi, abba, herron, synergy, tofu, hunk, paper plane, miami beach, nylon, andy warhol
Quite an eclectic little bunch and I think this is the first time synergy, tofu and andy warhol have been used in the same sentence. Quite a few of the searches are not what you would call 'traditional stock subjects'.
It seems reasonable that the relative popularity of terms in Google Trends/AdWords will match the relative popularity of searches on a stock photo site, but I still think there are a lot of key phrases for which that is not the case. I plan to analyse the data some more to see if I can pick out a few obvious "search engine popular" keywords that don't match image searches. It would be really great if Google would let us see their "image search" volume alone. I did previously look at using the Google data by combining keywords of interest with the keywords "photos", "images" or "pictures"; it works for very popular single-word searches but not for most key phrases. We have thus far ignored which images actually sell! See picniche for more about that.
I should be able to set up something where you can query this data and my more recent 2009 data set, if anyone is interested?