A couple of days ago there was a post on the lookstat blog titled top search keywords for energy. It compared search terms in Google in an attempt to estimate the popularity of energy images on microstock sites. Earlier this year I did something similar in a post about seasonal stock images, and at the time I made the point that I wasn't sure how well Google search terms related to searches on microstock sites.

So that set me thinking... (yes, be very afraid). Just how closely does Google Trends/AdWords data match what people are searching for at microstock sites? Clearly there will be some relationship, but I'd also guess that there are lots of popular terms that will not have a proportionate number of microstock searches. It's difficult to know how similar the two are. Is it reasonable to assume that popular keywords in Google are more likely to lead to more microstock sales, because those keywords are popular subjects and hence there will be related businesses in need of such images? As they say, "assume makes an ass out of u and me".

Say, for example, we see that holiday flights is a popular subject in Google, and we also see that exchange rates is equally popular. For me it's a stretch to say that images representing exchange rates will sell in the same proportions as ones depicting holiday flights; surely there are too many variables?

 

Another Data Set

I have access to the keywords that people used while browsing a free stock photo site (similar to stockxchange but nothing like as big). Of about 1 million searches in 2008 and late 2007 there were some 160,000 unique key phrases searched. The vast majority of them only got one search (just 48k had 2 searches or more). This is just one of the places in microstock where we see the 'long tail / exponential decay graph'; see my post how long images continue to sell and, more recently, microstock diaries revisited the longtail. A plot of the top 100 follows (only every 4th keyword would fit on the graph); the full table of the data is further below.

 

Graph of search terms vs search volume for top 100 searches.
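
For anyone wanting to repeat this kind of count without a pivot table, here is a minimal sketch in Python. It assumes the search log is simply a list of key phrases, one entry per search; the function name `summarise_searches` and the toy log are made up for illustration and are not the real data.

```python
from collections import Counter

def summarise_searches(queries):
    """Count searches per key phrase, treating every phrase exactly as typed."""
    counts = Counter(queries)
    total = sum(counts.values())                           # all searches
    unique = len(counts)                                   # distinct key phrases
    repeated = sum(1 for n in counts.values() if n >= 2)   # phrases searched 2+ times
    return counts, total, unique, repeated

# Toy log for illustration only (hypothetical, not the site's data).
sample_log = ["people", "music", "people", "cat", "cats",
              "young people", "music", "people"]

counts, total, unique, repeated = summarise_searches(sample_log)
print(total, unique, repeated)
print(counts.most_common(2))  # the head of the long tail
```

Sorting the counts with `most_common()` and plotting them is what gives the long tail shape: a few phrases with many searches, then a very long run of phrases with one search each.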

The keyphrases were sorted as is, so typos, stemming variants and things like the searches "cat." or "cats" have not been grouped with all the other "cat" searches. Likewise the total for "people" does not include the times users searched for terms with people in them, like "young people"; these are listed separately.
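
If you did want to group those variants, a rough sketch might look like the following. This is not what was done for the figures in this post. The `ALIASES` table is a hypothetical hand-made mapping (a real clean-up would need a proper stemmer or spell checker), and the toy log is invented.

```python
import string
from collections import Counter

# Hypothetical hand-made alias table mapping variants to one canonical word.
ALIASES = {"cats": "cat"}

def normalise(phrase):
    """Lowercase, trim whitespace and surrounding punctuation, apply aliases."""
    cleaned = phrase.lower().strip().strip(string.punctuation).strip()
    words = [ALIASES.get(word, word) for word in cleaned.split()]
    return " ".join(words)

def group_searches(queries):
    """Count searches after normalising each key phrase."""
    return Counter(normalise(q) for q in queries)

def count_containing(counts, word):
    """Total searches for any phrase that contains `word` as a whole word."""
    return sum(n for phrase, n in counts.items() if word in phrase.split())

# Toy log for illustration only.
toy_log = ["cat", "cat.", "Cats", "people", "young people", "people "]
grouped = group_searches(toy_log)
print(grouped["cat"])                       # "cat", "cat." and "Cats" together
print(count_containing(grouped, "people"))  # "people" plus "young people"
```

Grouping like this changes the rankings, so any comparison should say which of the two counting methods it used.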

 

Comparison

Armed with this data (in the slowest pivot table known to man) I decided to do a bit of analysis to see how these matched the results in the lookstat post:

[Two images. Left: results from analysis by lookstat from Google data. Right: searches performed on a free stock photo site, 07-08.]

Left: Google search analysis from the lookstat post and Right: results from the free stock photo site data set.

Looks like they match quite well! Anything below 10 could perhaps be considered error and could easily be skewed by some other factor. I was convinced I was going to be able to prove that energy jobs was popular in Google but not a popular stock search. It seems that way, but sadly I don't think I have a large enough data set to be certain.
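
One way to put a number on "match quite well" would be a rank correlation between the two lists. The comparison above was done by eye, so treat this as a sketch of a possible next step, not the method used here. The function names are my own and the figures in the usage lines are placeholders invented to show the mechanics, not values from either data set.

```python
def average_ranks(values):
    """Rank values from largest (rank 1) down, sharing the average rank among ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of equal values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is all ties")
    return cov / (var_x * var_y) ** 0.5

# Placeholder volumes for four imaginary key phrases (NOT real data).
google_volume = [100, 80, 60, 40]
site_searches = [12, 9, 5, 1]
print(spearman(google_volume, site_searches))
```

A value near 1 means the two sources order the phrases the same way, and a value near -1 means opposite orders. With counts as small as the ones here, a handful of searches either way could move the result a lot.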

What's quite interesting is that only 45 out of a million searches were made for our 'top' energy keywords. There were also 6 similar phrases with one search each ("solar energy farm", "solar energy panel" etc.), plus many more for the single keywords solar and energy and their related synonyms.

 

The Top 100

For extra comparison, the keywords in my data set look a lot like the top 100 keywords searched on Shutterstock, although I have a definite English language bias. I also have not removed from the top 100 several keywords like 'nude' and 'sex' that probably do not come from image buyers. There is quite a lot more variability in the ordering, and plenty of the keywords Shutterstock have in their top 100 only made it into my top 200.

Rank Keyword       Frequency  Rank Keyword       Frequency  Rank Keyword       Frequency
1    people        18818      34   doctor        2059       67   animals       1455
2    (blank)       16332      35   nude          1998       68   fish          1448
3    music         8971       36   party         1977       69   construction  1422
4    fruit         8922       37   fire          1905       70   flower        1412
5    christmas     7589       38   medical       1887       71   fruits        1410
6    business      5508       39   hands         1880       72   dancing       1379
7    food          5413       40   child         1864       73   cat           1341
8    woman         5008       41   kids          1818       74   rose          1341
9    family        4787       42   tree          1818       75   sky           1327
10   computer      4647       43   education     1798       76   heart         1317
11   children      3731       44   golf          1787       77   home          1306
12   baby          3708       45   sports        1772       78   camera        1284
13   car           3452       46   wine          1758       79   sun           1281
14   dance         3433       47   massage       1745       80   birthday      1254
15   house         3191       48   coffee        1737       81   shopping      1249
16   money         3162       49   hand          1663       82   paper         1243
17   school        3119       50   fashion       1650       83   girls         1235
18   sex           3093       51   earth         1643       84   eye           1217
19   wedding       3058       52   face          1629       85   students      1202
20   book          2977       53   health        1623       86   beauty        1189
21   girl          2845       54   horse         1621       87   world         1177
22   football      2707       55   phone         1597       88   winter        1161
23   women         2611       56   snow          1587       89   pizza         1107
24   beach         2586       57   nature        1587       90   computers     1076
25   water         2510       58   student       1556       91   film          1076
26   apple         2459       59   smile         1549       92   spa           1075
27   love          2406       60   globe         1532       93   law           1066
28   dog           2384       61   hair          1531       94   dogs          1063
29   books         2349       62   fitness       1530       95   chocolate     1049
30   man           2264       63   soccer        1521       96   beer          1027
31   sport         2193       64   guitar        1509       97   tv            1022
32   office        2177       65   flowers       1473       98   space         1020
33   cars          2101       66   sexy          1471       99   cake          995
                                                            100  london        994

Note: "blank" searches are probably robots, mistaken users, or users just seeing what an empty search does. Interestingly, if you run a web site with a photo search then a blank search should most likely not run at all, or perhaps return a 'nothing found' message but ALSO a selection of random or popular images.
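
As a rough illustration of that suggestion, here is a sketch of a search handler that refuses to run an empty query and offers popular images instead. Everything in it is hypothetical: the function name `handle_search`, the dictionary standing in for a search index, and the file names are placeholders, not any real site's code.

```python
def handle_search(query, search_index, popular_images, max_suggestions=5):
    """Return results for a query, or a message plus suggestions for a blank one."""
    cleaned = query.strip().lower()
    if not cleaned:
        # Blank search: do not hit the index, show something useful instead.
        return {
            "message": "Nothing found. Please enter a search term.",
            "results": [],
            "suggestions": popular_images[:max_suggestions],
        }
    return {
        "message": "",
        "results": search_index.get(cleaned, []),
        "suggestions": [],
    }

# Hypothetical index and popular list for illustration only.
index = {"cat": ["cat_on_sofa.jpg", "kitten.jpg"]}
popular = ["people_1.jpg", "music_1.jpg", "fruit_1.jpg"]

print(handle_search("   ", index, popular)["suggestions"])
print(handle_search("Cat", index, popular)["results"])
```

Skipping the index lookup for blank queries also keeps robots from triggering a full search on every hit.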

Ranked 95 to 99 "chocolate, beer, tv and space cake", sounds like a good night in, lol.

 

Unlucky for some

Here are a few of the 406 terms that had 13 searches each:

baby jesus, voucher, pylon, weed, quebec, ladders, computer chip, emo girl, brussel sprouts, learner driver, woman on phone, lord of the rings, lotto, turf, fashion clothes, sand clock, ghandi, abba, herron, synergy, tofu, hunk, paper plane, miami beach, nylon, andy warhol

Quite an eclectic little bunch and I think this is the first time synergy, tofu and andy warhol have been used in the same sentence. Quite a few of the searches are not what you would call 'traditional stock subjects'.

 

Conclusion

It seems reasonable that a comparison of relative terms in Google Trends/AdWords will match the relationships between searches on a stock photo site, but I still think that there are a lot of keyphrases for which that is not the case. I plan to analyse the data some more to see if I can pick out a few obvious "search engine popular" keywords that don't match image searches. It would be really great if Google would let us search their "image search" volume alone. I did previously look at using the Google data by combining keywords of interest with the keyword "photos", "images" or "pictures"; it works for very popular single word searches but not for most key phrases. We have thus far ignored which images actually sell! See picniche for more about that.

I should be able to set up something where you can query this data and my more recent 2009 dataset, if anyone is interested?

 

Related Links

Best selling images and top search terms at pixmac

 



Really Cool Post

Rahul Pathak (not verified) on Tue, 2009-12-01 09:17
Great post, Steve! I definitely think there are going to be situations where the google data & the microstock data is going to diverge. Still, it was impressive to see the correlation on the energy related items. Very cool stuff. Look forward to seeing more :)
