Tuesday, January 2, 2018

What are people really interested in?

For several years now, I have been seeking some sort of relatively empirical means of determining what it is that people are interested in and to what degree.

There are, of course, the newspaper headlines. Those, theoretically, ought to reflect what the reading public is interested in, but that interpretation requires several things to be true: 1) editors understand what their readers are interested in, 2) the reporting is straight news, independent of editorializing, and 3) newspaper readership is reflective of the nation's citizens.

Newspaper readership has been plunging for more than a decade. With the collapse of their commercial business model, there has been a dramatic shift from straight news reporting to editorializing (it is cheaper to form an opinion than to dispatch a reporter to the scene of an event). We know that journalists differ dramatically from the average citizen (more urban, more college credentialed, more Democratically partisan, better remunerated, younger, more class conscious, etc.). The same is true of newspaper readership.

Newspapers, magazines, broadcast news, cable news - they are all unrepresentative of the nation at large. What they are interested in writing about has no necessary relationship to what citizens in aggregate are interested in. News headlines are not a good proxy for citizen priorities.

Gallup does a routine survey of some 1,000 participants asking what they consider the most important problems. They do it monthly, and the data goes back to the 1970s (though with less frequency in the earlier years). That is reasonably useful, but it is only 1,000 people, it seems particularly sensitive to news headlines, and it captures only the negative side of things. There are plenty of issues in which people invest time and interest but which are positive developments, not concerns.

And of course there are all the intermittent surveys about particular issues. The drawback with these is that they are often 1) advocacy driven, designed a priori to achieve particular answers, 2) lacking in longitudinal data, and 3) never cast in terms of constraints. In other words, it does little good to know that 25% of people think education is important without also knowing how they trade off education against security, or infrastructure, or the military, or the economy, etc. They may view education as important, but it might trail far behind a host of other important issues when it comes to divvying up a limited budget.

There seems little reliable way to discern what the average citizen is interested in/concerned about in an independent fashion.

Enter Google Trends. It allows you to track the terms or topics people search for on Google by geographical locale and by time frame going back to 2004. This has the advantages of volume (billions of searches), of being a census rather than a survey (a census of all Google users), and of being direct (working with actual searches) rather than self-reported (what people say they searched). These are great advantages.

The drawback is that Google reports an index of searches rather than the actual number of searches (i.e., the results are displayed relative to one another rather than in absolute terms). Further, you can compare only five terms at a time, so it is impossible to look at many items at once. There is also the challenge that we, as users, don't have access to Google's algorithms, so we cannot be certain how faithfully the final reported ratio reflects the original numbers.

There is also the problem that a single term can have multiple meanings. For example, someone might search "security" with different things in mind: home security, personal security, data security, neighborhood security, police, global security, etc. In addition, there are financial securities, an entirely different category of meaning. There was one search term I used that was likely highly skewed due to meaning ambiguity. "Hunger" was a relatively low-frequency search item until 2010, when it began a steep rise. This was more likely due to The Hunger Games book trilogy and the subsequent movies in that time period than anything to do with concern about physical hunger.

Another drawback is that we cannot distinguish breadth of interest from depth of interest in the Google index. If there are 100 million Google users, it makes a difference whether 100 million people google "security" once a year, or whether one million people google "security" a hundred times a year each. The first is a mild but broad concern, the second a narrow but intense one.

The Google index lacks directional context. If people are googling "economy" a lot, we cannot distinguish whether they are concerned about a plunging economy or are interested in a booming economy. Sometimes there is a clear implication but we cannot be certain.

We also lack any matching of participant to outcome. It would be useful to know whether the numbers are being generated by people in particular locales (below the national level), or by class, race, age, educational attainment, etc.

So Google Trends is not a perfect answer, but there is a way to get around the five-term limit by chain ratioing: keep one of the search terms constant across multiple five-term searches. For example, I used "economy" as the constant. If the other four terms are much higher than "economy" in one batch, and lower than "economy" in a second batch, I can create a relative ratio between the batches using "economy" as the anchor.
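A minimal sketch of that chain-ratioing step, with invented index numbers (not real Trends data): each batch comes back on its own relative scale, and the shared anchor term lets you rescale one batch onto the other's scale.

```python
# Sketch of chain ratioing two Google Trends batches via a shared anchor
# term. The index values below are made up for illustration only.

def chain_ratio(batch_a, batch_b, anchor="economy"):
    """Merge two batches of index values that share an anchor term.

    Each batch is a dict mapping term -> index value. Values from
    batch_b are rescaled so that its anchor matches the anchor's
    value in batch_a, putting all terms on batch_a's scale.
    """
    scale = batch_a[anchor] / batch_b[anchor]
    merged = dict(batch_a)
    for term, value in batch_b.items():
        if term != anchor:
            merged[term] = value * scale
    return merged

# Hypothetical batches: "economy" appears in both as the constant term.
batch_a = {"economy": 40, "jobs": 80, "crime": 25}
batch_b = {"economy": 10, "hunger": 5, "inflation": 30}

combined = chain_ratio(batch_a, batch_b)
# "economy" scored 4x higher in batch_a, so every batch_b value is
# multiplied by 4: "hunger" becomes 20 on batch_a's scale.
```

Repeating this batch by batch, always carrying the anchor along, chains a hundred terms onto one common scale. The caveat is that any noise in the anchor's measurement propagates into every rescaled batch.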

That is what I did for nearly a hundred terms. I took the terms from Gallup's problem list, from headline frequencies, and from Maslow's hierarchy of needs, as well as some reasonably random items as tests of the approach.

Results to follow.
