Crowd Wisdom, Public Wisdom: Regarding Misinformation at Large

Could “crowds of regular people” be, in the words of one early research review, “as good at moderating fake news on Facebook as professional fact-checkers?” For a while now, technology practitioners and academics have been working on the question of whether the “wisdom of the crowd” can help curb the spread of misinformation online.

If you are unfamiliar with this line of research, the idea is based on a practical recognition: the firehose of online information cannot be sifted through one comment at a time. Machine learning or algorithmic processes might be able to help improve the situation (think of how well spam filters work on email), but they need to be trained to recognize what misinformation is. The more complicated the task (deciding what is true, for instance), the more data is needed. And judgments about at least some misinformation might be something that members of the crowd, or the general public, can provide.
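To make that division of labor concrete, here is a toy sketch, in the spirit of a spam filter, of how crowd judgments could become training data for a classifier. All headlines, labels, and the use of scikit-learn here are my own illustrative assumptions, not a description of any deployed system.

```python
# Toy sketch: crowd-provided labels train a simple text classifier,
# much as spam filters learn from user flags. All data is invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical crowd judgments: 1 = flagged as misinformation, 0 = not.
headlines = [
    "Miracle cure eliminates virus overnight, doctors stunned",
    "City council approves new budget for road repairs",
    "Secret memo proves election was decided in advance",
    "Local library extends weekend opening hours",
]
crowd_labels = [1, 0, 1, 0]

# Turn text into word counts, then fit a naive Bayes model on the labels.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(headlines)
model = MultinomialNB().fit(X, crowd_labels)

# The model can now score unseen headlines; a real system would need
# vastly more, and far more carefully gathered, crowd labels than this.
test = vectorizer.transform(["Secret cure suppressed, memo proves it"])
print(model.predict_proba(test))
```

The point is not the particular model but the division of labor: people supply judgments at human scale, and the machine extends them toward machine scale.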

At the same time, the potential to crowdsource misinformation itself also exists. So, like any relationship, human or technological, it’s complicated.

Power in Numbers 

The optimism related to crowdsourcing is not unfounded. Aristotle was among the first to think seriously about collective wisdom. More recently, in 2004, journalist James Surowiecki described how averaging the judgments of a crowd can zero in on certain kinds of knowledge much better than a lone expert might. In his book, The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations, he outlines examples of this success, ranging from guessing the weight of an ox to supplying correct answers on Who Wants to Be a Millionaire? Going with the crowd can work.
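Surowiecki’s opening example, Francis Galton’s ox-weighing contest, is easy to simulate. The sketch below uses invented numbers (only the ox’s roughly 1,200-pound weight echoes Galton’s story) to show why averaging many independent guesses tends to beat a typical individual guess.

```python
# Minimal simulation of crowd averaging: many noisy, independent guesses,
# averaged together, land closer to the truth than a typical individual.
import random

random.seed(0)
TRUE_WEIGHT = 1198  # pounds, roughly Galton's ox

# 800 fairgoers, each guessing independently with substantial error.
guesses = [random.gauss(TRUE_WEIGHT, 150) for _ in range(800)]

crowd_estimate = sum(guesses) / len(guesses)
typical_error = sum(abs(g - TRUE_WEIGHT) for g in guesses) / len(guesses)

print(f"Crowd average off by {abs(crowd_estimate - TRUE_WEIGHT):.0f} lb")
print(f"Typical individual off by {typical_error:.0f} lb")
```

Independent errors point in different directions, so they largely cancel in the average; that cancellation is the statistical engine behind crowd wisdom.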

But you can’t go with just any crowd. There are ways and times that crowd judgment becomes the opposite of wisdom. If, for example, the crowd lacks diversity, with members too similar in their backgrounds and experiences, or if members of the crowd are too dependent upon one another, there is simply less knowledge from which the crowd can draw. And if any large crowd could provide good collective intelligence, we wouldn’t need this kind of research in the first place.
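The same simulation shows where this breaks down. If a homogeneous or interdependent crowd shares a common bias (everyone anchoring on one loud early guess, say), averaging washes out the independent noise but leaves the shared error fully intact. Again, all numbers are invented.

```python
# Extending the sketch above: a shared bias (from homogeneity or mutual
# influence) survives averaging, even with a very large crowd.
import random

random.seed(0)
TRUE_WEIGHT = 1198
SHARED_BIAS = 120  # e.g., everyone anchoring on one loud early guess

guesses = [TRUE_WEIGHT + SHARED_BIAS + random.gauss(0, 150) for _ in range(800)]
crowd_estimate = sum(guesses) / len(guesses)

# Independent noise cancels; the common bias of ~120 lb does not.
print(f"Crowd average off by {abs(crowd_estimate - TRUE_WEIGHT):.0f} lb")
```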

That a certain crowd could help identify misinformation at a large scale is a tantalizing possibility, and one that technologists are chasing even today. For public interest technologists working on this conundrum, there are issues related to cognition and prediction research. Questions include, for example, what shapes an individual’s or a group’s capacity for misinformation detection. You might want to know whether age or partisan politics makes one more susceptible to “fake news” (possibly yes, possibly differently than we think) or whether a person’s disposition toward certain kinds of analytical thinking factors in as well. (The good news is that investigations like these, some of which can be found through the resources of the Social Science Research Council’s MediaWell project, abound.)

In addition to asking questions about the crowd’s capacity, it also makes sense to ask about the different kinds of wisdom you want a crowdsourcing system to emulate. My research colleagues and I recently pointed out how different kinds of expertise can affect crowdsourcing. In our paper about the challenges of designing a crowdsourced system for polarizing topics such as climate science journalism, we found that science experts were certainly more attuned to the nuances of hard environmental science than the journalism experts we queried. The scientists’ evaluations of certain news articles seemed to provide a possible ground truth for the system. However, their assessments of articles on, for example, climate-themed summits produced unexpected results. When evidence is not a matter of scientific citation, scientists seem to struggle, whereas journalists can draw on experience with credible coverage of events unfolding in real time.
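One possible design response, sketched below purely as a hypothetical (our paper does not prescribe this, and every name and weight here is invented), is to weight rater pools according to the kind of evidence an article rests on: scientists anchor citation-based claims, journalists anchor real-time event coverage.

```python
# Hypothetical sketch: combine ratings from two expert pools, weighted by
# the kind of evidence the article rests on. Names and weights are invented.
from statistics import mean

def aggregate_ratings(science_ratings, journalism_ratings, evidence_type):
    """Combine two pools of 0-1 credibility ratings for one article."""
    if evidence_type == "scientific_citation":
        w_sci, w_jour = 0.8, 0.2  # defer to domain scientists
    elif evidence_type == "event_coverage":
        w_sci, w_jour = 0.2, 0.8  # defer to journalistic experience
    else:
        w_sci, w_jour = 0.5, 0.5  # no basis to prefer either pool
    return w_sci * mean(science_ratings) + w_jour * mean(journalism_ratings)

# A climate-summit story: journalists' judgment dominates the score.
score = aggregate_ratings(
    science_ratings=[0.4, 0.5],
    journalism_ratings=[0.8, 0.9],
    evidence_type="event_coverage",
)
print(f"{score:.2f}")  # -> 0.77
```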

There are, in other words, many layers to making such systems work, including the deep challenge of factual context, which technologists and laypeople must account for in their work. The truth of a situation, an utterance, or a piece of information often depends on context. The complexity of taking all these variables and dependencies and actually building crowdsourcing systems for misinformation detection raises the question: will all of this effort in fact work?

Finding the Right Path

I believe the answer is yes, at least in part. As complicated as the problem is, many elements of misinformation are at bottom a problem of recognition, or perception. And perception is something that artificial intelligence and machine learning have been shown to be good at. Andrew Ng, founder and former lead of Google Brain (among many other things), wrote in 2016: “If a typical person can do a mental task with less than one second of thought, we can probably automate it using AI either now or in the near future.” This one-second rule is helpful to consider: judging the veracity of statements that I’ve been mulling for years is probably beyond AI’s realm, but there are some images, statements, and videos that I know to be false in even less than a second. Matching a given group of people to the kinds of information they are well equipped to evaluate seems possible.

Success, however sweeping or limited, does not strike me as the true complication behind the crowdsourced detection of misinformation. The true difficulty lies in what to do with this faculty once we have it. One clear purpose for such crowdsourcing is to assist, or replace in part, the laborious effort of professional fact-checking at the scale of current online information flow. And an information flow filled with more accurate information and less junk seems like a critical need for democratic societies.

At the same time, how will deployments of real-time, at-scale recognition of misinformation… feel? Will we see less of this information, from news reports to tweets? Will it be removed altogether? Will there be a world-sized spam box for everyone to see? These questions matter because, as established, it will be a certain kind of crowd that helps to define the system.

However these crowdsourcing solutions end up proving valuable, we need to find meaningful, open ways to think through the options alongside their design. PIT practitioners can help in these efforts, and are already doing so. In the not-quite-post-election period, for example, we have heard of some conservatives decamping to alternative social media platforms. It’s hard to know what such shifts in social spaces mean in the long term (MySpace, anyone?), but some kind of public conversation about what misinformation solutions are and how they would best be used should also be taking place. It should happen now, as solutions are being actively designed and developed, because one thing we all know is that technological systems can be hard to change once they are built.

Finally, recognizing incorrect information is not the same as building repositories of correct knowledge. When it comes to COVID-19 and Wikipedia, for example, members of the online encyclopedia’s community have worked relentlessly to provide the world with reliable information about the pandemic. Those who are knowledgeable about Wikipedia’s processes know that a mix of human curation, automated bots, and other safeguards helps keep its pages free of vandalism. Nevertheless, the quality of information on Wikipedia can vary widely, and Wikipedia itself is often not a reliable source, but a robustly cited article can point people to a world of reliable evidence.

The way that Wikipedia has crowdsourced its knowledge and made those processes transparent is surely one of the reasons it is trusted, or rather, why millions of people look to it for information on a universe of topics. Recognizing Wikipedia’s strengths and weaknesses is why I, among others, think it critical to improve its capacity for reliable information, especially in the area of vaccine safety, and all the more so as we look with hope to next year’s release of a COVID-19 vaccine.

One question I find myself asking is: when it comes to misinformation, what is the path from crowd wisdom to public wisdom? Opportunities for the development of creative responses are still much needed in this space. Perhaps something to look for in 2021.

Among her many hats, Connie Moon Sehat is director of the News Quality Initiative, which seeks to elevate quality journalism in news ranking and recommendation systems.