A couple of days ago there was a post on the lookstat blog titled "top search keywords for energy". It compared search terms in Google in an attempt to estimate the popularity of energy images on microstock sites. Earlier this year I did something similar in a post about seasonal stock images, and at the time I made the point that I wasn't exactly sure how well Google search terms related to searches on microstock sites.

So that set me thinking... (yes, be very afraid). Just exactly how well does Google Trends/AdWords data match what people are searching for at microstock sites? Clearly there will be some relationship, but I'd also guess that there are lots of popular terms that will not have a proportionate number of microstock searches. It's difficult to know how similar the two are. Is it reasonable to assume that popular keywords in Google are more likely to lead to more microstock sales, since those keywords are popular subjects and so there will be related businesses in need of such images? As they say, "assume makes an ass out of u and me".

Say, for example, we see that holiday flights is a popular subject in Google, and we also see that exchange rates is equally popular. For me it's a stretch to say that images representing exchange rates will sell in the same proportions as ones depicting holiday flights. Surely there are too many variables?


Another Data Set

I have access to the keywords that people used while browsing a free stock photo site (similar to stockxchange but nothing like as big). Of about 1 million searches in 2008 and late 2007, there were some 160,000 unique key phrases searched. The vast majority of them only got one search (just 48k had 2 searches or more). This is just one of the places in microstock where we see the 'long tail / exponential decay graph'; see my post how long images continue to sell and, more recently, microstock diaries revisited the longtail. A plot of the top 100 follows (only every 4th keyword would fit on the graph); the full table of the data is further down:


Graph of search terms vs search volume for top 100 searches.
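If you want to run the same kind of tally on your own search log, here is a minimal Python sketch. The function name `tail_summary` and the toy `log` are made up for illustration; my own numbers came out of a pivot table, not this code. Going by the figures above, roughly 160,000 unique phrases minus the 48k with 2 or more searches leaves about 112,000 phrases (around 70%) that were searched only once.

```python
from collections import Counter

def tail_summary(counts):
    """Summarise how 'long tailed' a set of search counts is.

    counts maps each key phrase to the number of times it was searched.
    """
    unique = len(counts)
    repeated = sum(1 for n in counts.values() if n >= 2)
    return {
        "total_searches": sum(counts.values()),
        "unique_phrases": unique,
        "searched_twice_or_more": repeated,
        "searched_once": unique - repeated,
    }

# Toy log (hypothetical): one entry per search made on the site.
log = ["people", "cat", "people", "tofu", "cat", "people", "pylon"]
summary = tail_summary(Counter(log))
print(summary)
```

The same function works unchanged on a real log, as long as it can be loaded as one phrase per search.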

The keyphrases were sorted as is, so typos, stemming and things like entering the search "cat." or "cats" have not been grouped along with all the other "cat" searches. Likewise the total for "people" does not include the times users searched for phrases containing people, like "young people"; these are listed separately.
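For the curious, this is roughly what grouping those variants could look like. It is only a sketch and not something I applied to the data: the helper names are invented, and `naive_singular` is a deliberately crude stand-in for real stemming (it just drops one trailing "s", so it would also mangle "christmas"). The counts in `raw` are taken from the top 100 table further down.

```python
import re
from collections import Counter

def normalize(phrase):
    """Lowercase, strip punctuation and collapse repeated whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", phrase.lower())
    return " ".join(cleaned.split())

def naive_singular(word):
    """Crude plural folding (an assumption, not real stemming)."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def group_key(phrase):
    """Key under which variants such as 'cat.', 'Cats' and 'cat' are merged."""
    return " ".join(naive_singular(w) for w in normalize(phrase).split())

def group_counts(raw_counts):
    """Add up the counts of all phrases that share the same group key."""
    grouped = Counter()
    for phrase, n in raw_counts.items():
        grouped[group_key(phrase)] += n
    return grouped

# Counts copied from the top 100 table in this post.
raw = {"fruit": 8922, "fruits": 1410, "car": 3452, "cars": 2101,
       "dog": 2384, "dogs": 1063, "cat": 1341}
grouped = group_counts(raw)
print(grouped.most_common())
```

Grouped this way, "fruit" plus "fruits" comes to 10,332 searches and "car" plus "cars" to 5,553, which would shuffle the ordering of the top 100 a little.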



Armed with this data (in the slowest pivot table known to man) I decided to do a bit of analysis to see how these matched the results in the lookstat post:

Left: Google search analysis from the lookstat post. Right: results from the free stock photo site data set (searches performed 07-08).

Looks like they match quite well! Any count below 10 could perhaps be considered error and could easily be skewed by some other factor. I was convinced I was going to be able to prove that energy jobs was popular in Google but not a popular stock search. It seems that way, but sadly I don't think I have a large enough data set to be certain.

What's quite interesting is that only 45 out of a million searches were made for our 'top' energy keywords. There were also 6 similar phrases with one search each ("solar energy farm", "solar energy panel" etc.), plus many more for the single keywords solar and energy and their related synonyms.


The Top 100

For extra comparison, the keywords in my data set look a lot like the top 100 keywords searched on Shutterstock, although I have a definite English language bias. I also have not removed from the top 100 several keywords like 'nude' and 'sex' that are probably not from image buyers. There is quite a lot more variability in the ordering, and plenty of the keywords Shutterstock have in their top 100 only made it into my top 200.

Rank  Keyword       Frequency Rank  Keyword       Frequency Rank  Keyword       Frequency
1     people        18818     34    doctor        2059      67    animals       1455
2     (blank)       16332     35    nude          1998      68    fish          1448
3     music         8971      36    party         1977      69    construction  1422
4     fruit         8922      37    fire          1905      70    flower        1412
5     christmas     7589      38    medical       1887      71    fruits        1410
6     business      5508      39    hands         1880      72    dancing       1379
7     food          5413      40    child         1864      73    cat           1341
8     woman         5008      41    kids          1818      74    rose          1341
9     family        4787      42    tree          1818      75    sky           1327
10    computer      4647      43    education     1798      76    heart         1317
11    children      3731      44    golf          1787      77    home          1306
12    baby          3708      45    sports        1772      78    camera        1284
13    car           3452      46    wine          1758      79    sun           1281
14    dance         3433      47    massage       1745      80    birthday      1254
15    house         3191      48    coffee        1737      81    shopping      1249
16    money         3162      49    hand          1663      82    paper         1243
17    school        3119      50    fashion       1650      83    girls         1235
18    sex           3093      51    earth         1643      84    eye           1217
19    wedding       3058      52    face          1629      85    students      1202
20    book          2977      53    health        1623      86    beauty        1189
21    girl          2845      54    horse         1621      87    world         1177
22    football      2707      55    phone         1597      88    winter        1161
23    women         2611      56    snow          1587      89    pizza         1107
24    beach         2586      57    nature        1587      90    computers     1076
25    water         2510      58    student       1556      91    film          1076
26    apple         2459      59    smile         1549      92    spa           1075
27    love          2406      60    globe         1532      93    law           1066
28    dog           2384      61    hair          1531      94    dogs          1063
29    books         2349      62    fitness       1530      95    chocolate     1049
30    man           2264      63    soccer        1521      96    beer          1027
31    sport         2193      64    guitar        1509      97    tv            1022
32    office        2177      65    flowers       1473      98    space         1020
33    cars          2101      66    sexy          1471      99    cake          995
                                                            100   london        994

Note: "blank" searches are probably robots, mistaken users, or users just seeing what an empty search does. Interestingly, if you run a website with a photo search, a blank search should most likely not be allowed, or should perhaps return a "nothing found" message along with a selection of random or popular images.
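As a rough idea of what that could look like on the server side, here is a small Python sketch. Everything in it is hypothetical: `handle_search`, the `index` dictionary mapping keywords to image IDs, and the `popular` list are stand-ins for whatever your own site uses.

```python
import random

def handle_search(query, index, popular, sample_size=3, rng=None):
    """Return (message, image_ids) for a search query.

    Blank or unmatched queries get a message plus a random pick of
    popular images instead of an empty results page.
    """
    rng = rng or random.Random()
    cleaned = query.strip().lower()
    suggestions = rng.sample(popular, min(sample_size, len(popular)))
    if not cleaned:
        return "Please enter a search term. Here are some popular images.", suggestions
    matches = index.get(cleaned, [])
    if not matches:
        return f"Nothing found for '{cleaned}'. Here are some popular images instead.", suggestions
    return f"{len(matches)} image(s) found for '{cleaned}'.", matches

# Hypothetical keyword index and popular list, for illustration only.
index = {"cat": ["img_001", "img_017"], "tofu": ["img_042"]}
popular = ["img_001", "img_042", "img_099", "img_123"]

message, results = handle_search("   ", index, popular)
print(message, results)
```

The blank query never reaches the index, so robots hammering an empty search box cost very little, while a real visitor still gets something to click on.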

Ranked 95 to 99 "chocolate, beer, tv and space cake", sounds like a good night in, lol.


Unlucky for some

Here are a few of the 406 terms that had 13 searches each:

baby jesus, voucher, pylon, weed, quebec, ladders, computer chip, emo girl, brussel sprouts, learner driver, woman on phone, lord of the rings, lotto, turf, fashion clothes, sand clock, ghandi, abba, herron, synergy, tofu, hunk, paper plane, miami beach, nylon, andy warhol

Quite an eclectic little bunch and I think this is the first time synergy, tofu and andy warhol have been used in the same sentence. Quite a few of the searches are not what you would call 'traditional stock subjects'.



It seems reasonable that a comparison of relative terms in Google Trends/AdWords will match the relationships between searches on a stock photo site, but I still think that there are a lot of keyphrases for which that is not the case. I plan to analyse the data some more to see if I can pick out a few obvious "search engine popular" keywords that don't match image searches. It would be really great if Google would let us search their "image search" volume alone. I did previously look at using the Google data by combining keywords of interest with the keyword "photos", "images" or "pictures"; it works for very popular single word searches but not for most key phrases. We have thus far ignored which images actually sell! See picniche for more about that.

I should be able to set up something where you can query this data and my more recent 2009 dataset, if anyone is interested?


Related Links

Best selling images and top search terms at pixmac


Really Cool Post

Rahul Pathak (not verified) on Tue, 2009-12-01 09:17
Great post, Steve! I definitely think there are going to be situations where the google data & the microstock data is going to diverge. Still, it was impressive to see the correlation on the energy related items. Very cool stuff. Look forward to seeing more :)
