Towards a Regulatory Framework for Harmful Online Content: Measuring the Problem and Progress

As discussed in my previous article, content recommendation engines (CREs) like the Facebook newsfeed and YouTube’s “watch next” feature appear to be amplifying harmful content.  Further, there may be an inherent conflict of interest in which the business models of the companies behind these CREs may disincentivize them from pursuing adequate measures to solve the harmful content problem.  Given the widespread recognition1 of the social harms due to online dissemination of harmful content, and especially given the potential conflict of interest, greater participation from regulatory bodies is needed to ensure that progress is made.

My view is that a co-regulatory approach is most appropriate for tackling this problem, calling both governments and companies into action. The benefit of this approach is that it harnesses the expertise and insight of companies – who control the data, content, and CRE algorithms at the heart of the problem – while also ensuring effective transparency and accountability, as democratic governments set the guardrails and verify reasonable efforts. More extreme approaches – strict rule-based regulation on the one hand, and pure self-regulation on the other – have both failed to make inroads into the problems with CREs today2.

In a co-regulatory framework, access to the relevant data by privileged third parties (governments, auditors, academics) is essential in order to evaluate the progress companies are making.  We do not set out a vision of who this auditor might be and under exactly what circumstances the data should be provided, but assume that effective public and private law and basic constitutional safeguards are in place to prevent abuse of power by the auditors.

We focus here on the data needed to measure the extent of the problem and how much progress is being made (in a follow-up article, we will focus on the data needed to ensure that reasonable efforts are being made and that conflicts of interest are not hindering progress).  Conceptually, this is simple: we need to measure the prevalence of harmful content on these platforms, and how much of it is being exposed to users, over time.

But there are many subtleties.  We must be clear on the operational definitions of harmful content, which evolve as new laws and policies are written.  We must understand how much content the site hosts, which may change constantly. We must have clear documentation of the methods used to identify harmful content on the site, whether human review or machine learning models.  Then, based on the output of these methods, we need the identified rates of harmful content. It is important to note that human review of only a small (appropriately chosen) sample of a site’s content can allow us to infer the overall rates of harmful content on the platform with reasonable accuracy3.

As we have discussed previously, harmful content is inherently subjective with no single concrete definition.  We can consider various definitions to operationalize the concept, but they will carry their own limitations and biases.  For example, we can consider “illegal content”. In countries where there is less emphasis on freedom of speech than in the US, much of what we would consider harmful content could well be illegal content. However, judicial review is generally needed to establish whether the material is illegal.  As such, “illegal content” is not a practical operational definition.

Other definitions are created by the companies operating the online platforms.  Internet companies have a terms of service (ToS) document that spells out generally what content they allow on their services, although the definitions may still be subject to some interpretation.  Content that violates the ToS can be referred to as “disallowed content”.

Many such companies (especially the large ones) employ contractors to evaluate and rate their content.  In addition to the ToS, they provide written rating policies (like Google’s Search Quality Rating Guidelines) that clearly define particular categories of content.  For example, YouTube refers publicly to “borderline content” and claims specific numerical reductions of views of this content – there must necessarily be a concrete definition that the company has written to classify content as “borderline”.  There may be multiple policies and each single policy may identify multiple categories of content, including multiple rating scales on metrics such as quality, accuracy, or trustworthiness.

Finally, we can also consider user-flagged content.  Most online platforms provide a mechanism for users to flag content that they consider objectionable.  Of course, users may have many reasons to do that, so the rate of flagged content has to be interpreted with care.  Often, flagged content is prioritized for rating by employees or contractors.

These categories are not completely independent.  Some users may flag content simply because they think it violates the ToS; the ToS will probably reflect legal requirements, and if certain types of content are frequently flagged, they may be specifically called out in the ToS or other rating policy.  Ultimately, no definition is going to be adequate – what is important is that a reasonable definition is operationalized to the point that content can be objectively determined to be harmful or not. The existence of such a definition should be a requirement for all but the smallest companies.  They should then reasonably be expected to report on:

  • Rates of removal of content due to reports or findings that it is illegal or disallowed.  This should include the grounds for removal, who requested the removal, and any review or analysis to verify the claim.
    • As well as a measure of the actual rate of illegal content on the site, this can shine a light on censorship: companies often take down content that is flagged as illegal by a government authority without waiting for a court assessment (see here for a discussion of this issue and page 5 of this document for some data and analysis).  Additionally, this kind of data can be valuable for understanding the impact of changing regulations.
  • Rates of flagged content.
  • Rates of content in any categories that the company has the capacity to assess, either through policies (or “rating guides”) used for human review or through machine learning models.

This raises the question of how disallowed content is identified.  If a piece of content is reviewed and found to be disallowed, presumably it would be immediately removed from the service.  However, typically it is only possible to review a small proportion of a service’s content. Imagine a video-sharing site that hosts 100 000 videos.  Perhaps the company hires contractors to assess a random 1000 of those videos – they find that 40 of those 1000 videos are disallowed by the ToS. Because the 1000 videos reviewed were a random sample of the 100 000 on the platform, we can estimate that about 4% (40 out of 1000) of the videos on the site would be disallowed if they were reviewed.  We have only needed to review 1000 of the 100 000 videos, but using a statistical method known as a “confidence interval for a proportion4” we can report that we are 95% confident that the true rate of disallowed content on the platform is between 3.0% and 5.4%.
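
To make the arithmetic concrete, here is a minimal sketch in Python. The figures quoted above are consistent with the Wilson score interval, so that is the formula used here; that choice is my assumption, since the text only names a “confidence interval for a proportion”. The function name wilson_interval is illustrative, and the sketch ignores the small correction for sampling from a finite pool of 100 000 videos.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# 40 disallowed videos found in a random sample of 1000.
low, high = wilson_interval(successes=40, n=1000)
print(f"{low:.1%} to {high:.1%}")  # prints "3.0% to 5.4%"
```

Reviewing more videos narrows the interval, but only slowly: the width shrinks roughly with the square root of the sample size.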

Additionally, many online platforms will make use of statistical models to classify their content.  Such models need training data, so as in our example above, some random sample of the service’s content will be classified by contractors according to a written guide produced by the company (perhaps as “good” / “borderline” / “disallowed” or, in more sophisticated cases, there may be categories for individual types of problematic content, such as “conspiracy theory”, “hate speech”, etc.).  The statistical model can then learn to predict the category of any other piece of content on the service.

These statistical models have limited effectiveness for filtering.  The model predictions will have uncertainty and could have errors or bias.  For example, the model, when applied in an automated content recognition setting, might state that a particular video has a “72% chance of being disallowed”.  This is probably not sufficient grounds for deleting the content preemptively, although content that the model predicts is highly likely to be problematic may be flagged for further review or could be suppressed for more sensitive audiences (children, etc.).  However, the models are quite effective at determining rates of harmful content.  Due to a statistical concept known as the law of large numbers, even if the model is wrong about many individual pieces of content, it is likely to be quite accurate in determining how much of the content is harmful overall, as long as its errors are not systematically skewed in one direction.  This provides an excellent measure for the overall magnitude of the problem that a service has with harmful content.
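
A small simulation can illustrate the point. This is only a sketch under stated assumptions: the scores are drawn from an arbitrary distribution I chose for illustration, and the model is assumed to be calibrated (an item scored p really is disallowed with probability p). If a real model were systematically over- or under-confident, the overall estimate would inherit that bias.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N_ITEMS = 100_000

# Hypothetical model scores: predicted probability that each item is disallowed.
# ASSUMPTION: the model is calibrated, so the true label is drawn with that probability.
scores = [random.betavariate(0.5, 8) for _ in range(N_ITEMS)]
truth = [random.random() < p for p in scores]

# Item by item, a "remove if score >= 0.5" rule misses most disallowed content...
n_disallowed = sum(truth)
missed = sum(1 for t, p in zip(truth, scores) if t and p < 0.5) / n_disallowed

# ...yet the average score is a close estimate of the overall rate.
actual_rate = n_disallowed / N_ITEMS
estimated_rate = sum(scores) / N_ITEMS

print(f"share of disallowed items the threshold rule misses: {missed:.0%}")
print(f"actual rate: {actual_rate:.2%}, estimated rate: {estimated_rate:.2%}")
```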

We have so far remained nonspecific about what harmful content is.  We suggest that various categories should be reported, such as disallowed, user-flagged, illegal, etc.; however, not all harmful content is equal: exposure to child sexual abuse material (CSAM) is likely to be considered much worse than exposure to a conspiracy theory.  We do not set out a full taxonomy of harmful content here (although that would be a worthwhile endeavour), but one can imagine defining various categories such as CSAM, conspiracy theories, medical misinformation, etc. Within each of these categories, there might be different tiers of material; for example, conspiracy theories in the highest tier might be those that could lead to violence against a particular group.

With this taxonomy in place, one could calculate many different harmful content rates: the rate of harmful content of any kind, the rate for a particular category or set of categories, or the rate of harmful content in the highest (or top two) tiers, to give some examples.  Additional categories can be defined as needed: for example, we may define a category of content that perpetuates racial discrimination, another that advocates violence, and another that provides misinformation related to an election.

We must also consider that there are many ways of measuring rates of content in any category.  Take, for example, a video sharing site. We might care about the proportion of videos that are harmful.  But maybe it’s important if longer videos are more likely to be harmful, in which case we might care about the proportion of hours of videos that are harmful.  Next, it may not matter if the site hosts harmful content if no-one is watching it, so we might care about the proportion of videos viewed or hours of videos viewed that are harmful.  We also might instead care about the proportion of the service’s users that view at least one harmful video in a given month.  Finally we may care about videos that are only “impressed”, meaning that the title, description, and perhaps first frame are shown on the screen, but are never played.  Generally speaking, there are many metrics we can use to measure rates of bad content. They all involve a “numerator” (how much harmful content) and a “denominator” (how much content or users total).  For example, we might have a numerator of “hours of harmful content watched” and a denominator of “total hours of content watched”. Alternatively, we might have “users that watched at least one harmful video in February 2020” and “total users in February 2020”.
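
As a rough sketch of how the same data yields different rates depending on the numerator and denominator chosen, consider the toy view log below. The View record, its field names, and the data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    user_id: str
    video_id: str
    hours_watched: float
    harmful: bool  # rating of the video from human review or a model


# Hypothetical view log for one month.
views = [
    View("u1", "v1", 0.5, False),
    View("u1", "v2", 1.5, True),
    View("u2", "v1", 0.5, False),
    View("u3", "v3", 1.0, False),
    View("u3", "v2", 0.5, True),
    View("u4", "v3", 1.0, False),
]


def rate(numerator: float, denominator: float) -> float:
    return numerator / denominator if denominator else 0.0


# Proportion of distinct videos (among those viewed) that are harmful.
videos = {v.video_id: v.harmful for v in views}
share_of_videos = rate(sum(videos.values()), len(videos))

# Proportion of views that were of harmful videos.
share_of_views = rate(sum(v.harmful for v in views), len(views))

# Proportion of hours watched that were spent on harmful videos.
share_of_hours = rate(
    sum(v.hours_watched for v in views if v.harmful),
    sum(v.hours_watched for v in views),
)

# Proportion of users who watched at least one harmful video.
users = {v.user_id for v in views}
exposed_users = {v.user_id for v in views if v.harmful}
share_of_users = rate(len(exposed_users), len(users))

print(share_of_videos, share_of_views, share_of_hours, share_of_users)
```

In this toy log one video in three is harmful, but it accounts for 40% of hours watched and reaches half of the users, which is why the choice of metric matters.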

We now describe, generally, what data these companies might be compelled to make available to auditors.  A technical report specifying the details of this data could be written, but we do not take that on here.

Firstly, we need concrete definitions.  Every company should have, as a minimum:

  • A ToS document that spells out what content is allowed on the platform.
  • A mechanism for users to flag content that they consider problematic – at the very simplest, this might be just an email address that users can send reports to, but typically should be an in-product user interface affordance such as a button close to the content itself.  A document should be provided explaining the functioning of this mechanism.
  • The policy describing how content can be removed from the site due to claims that it is illegal or disallowed from governments or other third-parties, or for any other reason.

Many companies will also define additional content categories and this may be considered mandatory for larger platforms.  These may be “borderline content” that does not strictly violate the ToS but may still be considered harmful. Alternatively, these categories can include different types of ToS violations or different content themes.  Documents defining these categories should also be provided.

As discussed above, typically employees or contractors will review and rate content.  This should be mandatory for all but the smallest platforms, with clear guidelines and instructions provided to the reviewers.  Additionally, reporting should be done on the type of reviewers (contractors, specialist employees, other employees, etc.), what cultures and languages they represent, and the number of reviewers and time spent on rating.

It is also common that statistical models are used to identify harmful content.  This should be a requirement for platforms above a certain size. The performance characteristics of the model should be shared.

Each document and its change history should be provided, as changing definitions can make rates of harmful content appear to vary over time when in reality only the definition has changed.

In order to measure rates of harmful content and also to contextualize any findings, it is necessary to report on the number of users the platform receives and the amount of content that they view or consume.  These measures, covering counts of users and the amount of content they consume, should be reported over the history of the platform.

Then, data on the presence of harmful content is needed.  This should include the results of any human rating of content as well as the output of any models designed to predict content ratings.  In order to support validation of content ratings, the ratings (from both human review and model predictions) should be provided for some reasonable sample of content so that a third party can evaluate the accuracy of the ratings.  Additionally, there should be a full log of any content removed based on requests from governments or any other parties.

Additionally, it should be possible to restrict all this data to particular geographical or linguistic subsets of the site.  It should be possible, for example, to compare the rate of bad content between English and non-English content, or between the USA and Canada.  If the site collects or infers demographics such as age or gender, restriction to various demographics should also be supported.
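
A minimal sketch of what such a restriction could look like, using a hypothetical sample of rated content tagged with language and country (the data and the function name harmful_rate_by are illustrative only):

```python
from collections import defaultdict

# Hypothetical rated sample: (language, country, is_harmful).
rated_sample = [
    ("en", "US", False), ("en", "US", True), ("en", "CA", False),
    ("fr", "CA", False), ("fr", "CA", True), ("es", "US", False),
    ("en", "CA", False), ("es", "US", True),
]


def harmful_rate_by(key_index: int) -> dict[str, float]:
    """Rate of harmful content per subset, keyed by column 0 (language) or 1 (country)."""
    totals: dict[str, int] = defaultdict(int)
    harmful: dict[str, int] = defaultdict(int)
    for row in rated_sample:
        key = row[key_index]
        totals[key] += 1
        harmful[key] += row[2]
    return {key: harmful[key] / totals[key] for key in totals}


by_language = harmful_rate_by(0)
by_country = harmful_rate_by(1)
print(by_language)
print(by_country)
```

With real data, each subset needs a large enough rated sample of its own, otherwise the uncertainty around a subset's rate becomes too wide for comparisons to mean much.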

To summarize, it is quite reasonable to expect that digital platform companies know the overall extent of their problem with harmful content.  By sharing clear definitions, policies for assessment, and data about usage and identified harmful content, greater transparency can be achieved.  Then, in collaboration with regulators and researchers, progress towards a solution can be possible.

      1. See the links in the first paragraph of my previous article.
      2. See this report for an example of a strict approach being ineffective.  The fact that this is still such a problem today makes it clear that self-regulation has not been effective.
      3. Facebook discusses their methods to do this here:
      4.  Note that this method is probably not effective in many relevant cases, but that there are more sophisticated methods that are.


Reflection on a Pandemic

The COVID-19 pandemic has brought our society to a crossroads.  Measures that previously seemed impossible to consider (near-universal remote work and education, drastic reductions in travel, universal basic income, and others) are suddenly in place – made possible by our desire to limit the death toll (and economic consequences) of this disease.

However, it’s possible that the lives these measures save from COVID-19 may pale in comparison to the lives they save in other ways.  I (uncharacteristically) will not attempt to do the math1, but we are almost certainly seeing a massive reduction in traffic fatalities these days and probably a reduction in violent crime, and, although it is harder to measure, we may be saving lives by doing less damage to our environment.

We are forced to consider what we value as a society.  Over a million people a year die in traffic accidents.  This number is likely to be greater than the death toll that COVID-19 will reap.  And yet we are not banning automobiles. But the math is not simple. A total ban on automobiles would cost lives for all sorts of complex reasons, just as the current social distancing measures will inevitably cost some lives as well.

We probably don’t have enough data to really do the math on whether any given action will cost or save lives in the long run.  But we can decide as a society what our priorities are, what kinds of deaths are unacceptable, and how we want to improve ourselves and grow.

I hope that we all can take some time to reflect on this and recognize that all the options have always been on the table.  Climate change might be thought of as being in the early stage of exponential growth, like when the COVID-19 cases numbered in the tens or twenties.  But if we fail to take effective action, the growth will happen and will accelerate2.  The consequences of unchecked climate change will easily dwarf the impact of COVID-19.  We should recognize now that we have options available, rather than wait until it’s too late.

    1. Although others have:
    2. See analysis compiled by the United Nations:


Why ClearView AI was inevitable and is unstoppable

Clearview, a new facial recognition system, has been grabbing headlines recently.  The company provides an app that, given a photograph of a person’s face, can fairly reliably find more photos of the same person as well as a name and other identifying information.  The attention is well-deserved. Systems that can recognize a face from a small to medium database (a collection of mug shots for example) have been around for some time.  But Clearview has amassed a database of, reportedly, 3 billion facial images.  We do not know how many individuals these images come from, but it’s likely that it’s a significant proportion of the internet-using world.  In one example, a journalist determined that the Clearview database held 7 images of her.  If they have, on average, 7 photos of each person in their database, it includes something like 10% of all internet users in the world.
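
The back-of-envelope calculation can be written out as follows. The 3 billion images and the 7 images per person come from the text above; the figure of 4.5 billion internet users is my own rough assumption for 2020, not something stated in the reporting.

```python
images_in_database = 3_000_000_000  # reported size of the database
images_per_person = 7               # one journalist's count, treated here as an average
internet_users = 4_500_000_000      # ASSUMPTION: rough worldwide total for 2020

people_in_database = images_in_database / images_per_person
share_of_internet_users = people_in_database / internet_users

print(f"about {people_in_database / 1e6:.0f} million people")
print(f"about {share_of_internet_users:.0%} of internet users")
```

The result is very sensitive to the images-per-person figure, which here rests on a single example, so it should be read as an order-of-magnitude estimate only.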

Being able to search for a face in such a large database is a technical achievement, but not implausible.  If there is any “secret sauce”, it won’t be secret for long. Perhaps the more impressive achievement is to have managed to acquire such a large database of face images.  To do so, they crawled the web: social media sites, company directories, personal websites, and any other publicly accessible web content. They collected all the photos they found and, as I understand, also collected personally identifying information (especially full name) when available.  This sort of “web scraping” requires some technical ability and significant computational resources, but is something anyone with sufficient funding could pull off.

Whether or not this technology is good for society is something I won’t discuss here.  But, suffice it to say, there are many calls to ban the technology.  Bans may be somewhat effective in preventing use in cases where transparency is required – a court will not accept facial recognition results as evidence if facial recognition is illegal – but preventing individuals or private organizations from making use of this sort of software is probably extremely difficult.

This sort of software is an inevitable consequence of the internet age – specifically the persistence and searchability of personal data.  As an analogy, consider high school yearbooks, which typically include the name and photo of every student in the school. These yearbooks were not thought of as privacy invasions in the past. Technology has made it possible to build a virtual collection of every high school yearbook in the world and search through them all to find a particular face or name in a fraction of a second.  Data that was previously harmless becomes a threat.

So the Clearview database is not going away.  Even if they were compelled to destroy their database, it can be reconstructed – once something is available on the internet, it’s nearly impossible to delete all possible copies of it.  And the only way that the database won’t continue to grow is if people stop posting facial photographs in publicly available places. So no more photo-sharing beyond a strict group of real (and trusted) friends and family, no more photos in company directories, no more names and faces in media articles, etc.  The list goes on. I think it’s highly unlikely that sharing of photographs is going to stop.

Ultimately, Clearview AI is a symptom of a fundamental shift in our notion of privacy.  Previously harmless information can be exploited once it becomes persistent and searchable.  This is something we need to get used to.

Protecting Free Will

I’ve written before about how the dangers of online data collection are broad and that privacy is not an adequate conceptual model to face these challenges.

I came across this article about a talk by Shoshana Zuboff that explains this very well: we need to protect our free will, not just our privacy.

In an upcoming article I’ll go into more depth about how the understanding that people are easily manipulated is the essential starting point for understanding these challenges.

Unintended Consequences: Amplifying Harmful Content

The Internet’s largest user-generated content platforms – including YouTube, Facebook, and others – have a serious problem with harmful content.  Misinformation, radicalization, and exploitation have all found homes on these sites.  These are complex phenomena, reflecting social and psychological issues that predate our era, yet modern technology can amplify them in new and powerful ways.  At least in part, this amplification appears to be inherent in the content recommendation algorithms and in the business models of the companies that build them.  Greater transparency and responsibility are needed in order to ensure that these companies are taking the appropriate steps to avoid harming our society.

Dividing posts and videos into piles of “good” and “bad” content is hard, if not impossible.  This article is not advocating for censorship – laws vary between nations, but within appropriate limits, people should have the right to create and distribute whatever content they want to.  However, ultimately, the platforms choose what content to recommend, even if this choice is obfuscated through algorithms. If content recommendation engines are amplifying voices and broadening audiences for content that is making people feel unsafe online or is otherwise harmful to society, then solving this problem is not censorship.

To understand the possible link between the business models of the content platforms and harmful content, we must understand something about how these business models function.  The types of companies we’re talking about can be classified as “attention merchants”.  There is an excellent exposé written by Dan McComas, the former product head at Reddit, that summarizes the idea succinctly:

“The incentive structure is simply growth at all costs. There was never, in any board meeting that I have ever attended, a conversation about the users, about things that were going on that were bad, about potential dangers, about decisions that might affect potential dangers. There was never a conversation about that stuff.”

For the attention merchants, the primary business goals are to get more users and more engagement from those users.  The more people spending more time with the product, the more ads can be shown and sold. And as users engage with the platform, uploading or sharing content, liking and commenting, the platform collects data that can be sold or used to better target those ads.  This focus on growth and engagement is baked into the core of the algorithms that power the Internet’s largest content platforms.

How is this connected to harmful content?  If the primary goal is to maximize engagement, then we might ask: “can recommending harmful content lead to more engagement for a platform?”  Only the platform companies themselves are in a position to decisively answer this question, but all the evidence points to “yes”; the recommendation engines are very good at recommending content that will lead to engagement, and so the very fact that so much harmful content is recommended is quite telling.  As well, it seems that harmful content can receive a large amount of engagement.  Recommending harmful content may be an unintended consequence of optimizing a recommendation engine for engagement.  Even though these companies have no intent to promote harmful content, their content recommendation engines may be doing exactly that.

Of course there are trade-offs to be made.  The companies care about their long-term success and recognize that surfacing excessive harmful content is not good for business.  But when suppressing harmful content hurts the bottom line, the business logic leads to the question of “how much harmful content can we still recommend without harming our long-term success?”  The appropriate balance here for a business is not necessarily the appropriate balance for preventing harm to our society.

To better understand engagement and how it is measured, let’s get to a few details1.  One of the main tools of the trade for data scientists and quantitative analysts is the “metric”.  A metric reduces complex information about how a product is doing to a number. One common metric is “daily active users”, commonly referred to as “DAU”.  This measures the number of unique people using the product on any given day. Another metric might be “average time in app”, which would measure the time spent using the app, among all users on a given day.  A third metric might be “like button interaction probability”, which might measure the probability of a user clicking on a like button when they view a post.
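
To show how simple these metrics are to compute, here is a sketch over a toy event log. The log format, event names, and values are hypothetical; real products define these metrics in their own ways.

```python
# Hypothetical event log for a single day: (user_id, event, value).
# "session" events carry minutes spent in the app; "impression" and "like"
# events carry the id of the post that was viewed or liked.
events = [
    ("u1", "session", 12.0),
    ("u1", "impression", "p1"),
    ("u1", "like", "p1"),
    ("u2", "session", 30.0),
    ("u2", "impression", "p1"),
    ("u2", "impression", "p2"),
    ("u3", "session", 6.0),
    ("u3", "impression", "p2"),
    ("u1", "session", 12.0),
]

# Daily active users: number of unique people seen that day.
daily_active_users = len({user for user, _, _ in events})

# Average time in app: total minutes divided by the number of active users.
total_minutes = sum(value for _, event, value in events if event == "session")
average_time_in_app = total_minutes / daily_active_users

# Like button interaction probability: likes per post impression.
impressions = sum(1 for _, event, _ in events if event == "impression")
likes = sum(1 for _, event, _ in events if event == "like")
like_probability = likes / impressions

print(daily_active_users, average_time_in_app, like_probability)  # prints "3 20.0 0.25"
```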

As you can imagine, there are many possible metrics.  They also may measure how much content users share, how much they interact with particular features in the product, etc.  But typically, just a few very important metrics are chosen, often referred to as “North Star Metrics” or “Key Performance Indicators”.  Most product development effort focuses on increasing these metrics.

There are two primary ways a product is optimized for a metric, meaning the product is changed in ways that will increase the metric: experimentation (A/B testing is a common type of experiment) and machine learning optimization.  In the case of A/B testing, a change to the product can be tested by showing the changed version to some users and the original version to others. The metrics can then be calculated separately for each group, and if the changed version improves the metrics, it will be “launched” and the product will be updated for everyone.  It’s worth noting that many large tech companies run thousands of such experiments every year.
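
As an illustration of how such an experiment might be analyzed, the sketch below compares a hypothetical metric (the share of users who come back the next day) between the two groups using a standard two-proportion z-test. The numbers are invented, and real experimentation platforms use more elaborate methods.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (difference in rates, z statistic, two-sided p-value) for B versus A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Hypothetical results: group A sees the original product, group B the changed one.
lift, z, p_value = two_proportion_z_test(conv_a=1200, n_a=10_000, conv_b=1300, n_b=10_000)

# Hypothetical launch rule: ship only if the metric went up and the
# difference is unlikely to be due to chance alone.
launch = lift > 0 and p_value < 0.05
print(f"lift: {lift:.1%}, z: {z:.2f}, p-value: {p_value:.3f}, launch: {launch}")
```

Note that the launch rule looks only at the chosen metric. Anything the metric does not capture plays no part in the decision.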

Machine learning models work similarly – you can think of them as continuously running experiments.  The model is tasked with making some decision about how the product operates (for example, which video to suggest that a YouTube user watches next).  The model is constantly receiving feedback (did a user watch the recommended video, what kind of video was it, and what do we know about the user) and adjusting how it makes its recommendations.  This adjustment is always guided by some kind of metric, just like in experimentation.
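
The feedback loop can be sketched with a very simple technique called an epsilon-greedy bandit. I am using it purely as an illustration: it is not a description of how any real recommendation engine works, and the video names and watch rates are made up. The metric being optimized here is the watch rate.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical chance that a user watches each video when it is recommended.
# The model never sees these numbers; it only sees the feedback.
TRUE_WATCH_RATE = {"video_a": 0.05, "video_b": 0.10, "video_c": 0.30}

EPSILON = 0.1  # share of recommendations used to explore at random
shown = {video: 0 for video in TRUE_WATCH_RATE}
watched = {video: 0 for video in TRUE_WATCH_RATE}


def estimated_rate(video: str) -> float:
    """Watch rate observed so far for this video (0.0 if never shown)."""
    return watched[video] / shown[video] if shown[video] else 0.0


for _ in range(20_000):
    if random.random() < EPSILON:
        choice = random.choice(list(TRUE_WATCH_RATE))     # explore
    else:
        choice = max(TRUE_WATCH_RATE, key=estimated_rate)  # exploit the best estimate
    shown[choice] += 1
    if random.random() < TRUE_WATCH_RATE[choice]:          # feedback from the user
        watched[choice] += 1

most_recommended = max(shown, key=shown.get)
print(most_recommended, shown)
```

The loop ends up recommending whichever video gets watched most, without knowing or caring why it gets watched.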

Content platforms are constantly tuning their recommendation engines in order to increase certain metrics.  Of course, the types of metrics that we’ve been talking about (“growth metrics”) are not the only ones used. There are many other types, measuring interactions with user interface elements, product performance in terms of speed and reliability, and views and recommendations of content with different topics or by different creators.

There are even metrics to measure exposure to harmful content.  Typically, a company will have a written policy to describe how content can be classified into defined categories.  Some of these categories will be content that is explicitly unacceptable in the product’s terms of service and will probably be deleted when it is identified.  Another category will be what is considered “borderline content” that does not violate any rules but may still be harmful to show to users in some or all cases.  It is important to make clear that the content platform companies are writing these policies – they make their own definitions of harmful or borderline content. As I mentioned, the true concept of harmful content is complex and contextual, but these companies make their approximate generalizations.

Once the definitions are established, metrics can be developed.  Some sample of content is sent to human raters (usually contractors) for review and classification.  At this point, they now know, for some small subset of the platform’s content, “what is good and what is bad”.  This data can be used to train machine learning models to classify every other piece of content on the platform.  Critically, these models are imperfect: some harmful content will pass as apparently harmless; likewise, some innocent content will be incorrectly flagged as harmful.  But statistically, these models should provide a fairly accurate measure of how much harmful content the users are being exposed to.

What this means is that the platform companies cannot generally say with certainty that any particular piece of content is harmful.  So it is not feasible to simply “filter out” all the bad content. But there are changes to the content recommendation engines that can increase or decrease the overall level of harmful content that users are exposed to, and the platform companies are able to effectively measure the impact of these changes.  Due to a statistical property known as “the law of large numbers”, even if the classification of an individual piece of content is sometimes wrong, the proportion of harmful content in a large sample can be known quite accurately.

Preventing harmful content from being surfaced is not easy, but is not impossible either.  Google Search does an excellent job of preventing inappropriate content from being returned in the list of results.  The fact that YouTube recommendations have so much more of a problem with harmful content than Google Search does suggests that there are some fundamental differences between the two systems.

I would argue that this has to do with objectives: Google Search can surface content that best meets the user’s search query.  YouTube recommendations have no particular search intent to work with and so optimize simply for engagement: getting the user to watch more videos.  As I suggest, it is this optimization for engagement that amplifies harmful content. This is supported by the observation that there is less of a problem with harmful content in YouTube search results as compared to YouTube recommendations.  When there is a search query to work with, the optimization is not purely for engagement.

So now, we get to the core question: what if an experiment shows that a particular change to a content recommendation algorithm will increase the key growth metrics, but also slightly increase the amount of harmful content users are exposed to?  Will the company decide to make that change? We don’t know. We don’t even know for sure if these sorts of situations arise, but given the large scale of the harmful content problem on these services, and given how much engagement harmful content tends to receive, it seems very likely.

Conflicting incentives like these are a major reason why we need greater public awareness and why we need to push for real responsibility and accountability in the implementation of content recommendation engines. The companies behind these platforms claim to be making progress in solving these problems, but we need those claims to be backed up with data and evidence, and we need external researchers and journalists to have the access and data necessary to be part of the solution.

In the next instalment, I will go into more detail about what these companies could (and should) do to demonstrate their commitment to preventing their products from creating social harm.

  1. In this post, I present an oversimplified view that leaves out some technical details; I hope that it is comprehensible for everyone and that experts will forgive the omissions.

Accidental Learned Helplessness – A Thought Experiment

I wrote previously on how Data Science techniques for optimizing product growth might have unintended consequences.  A product becoming more addictive or surfacing more divisive content are typical examples. Here I explore a different type of example.

When I ask Google Maps how to walk from one location in the city to another, there are many possible routes it may recommend.  Some routes may better develop my ability to navigate in the city; for example, a route staying on a few major roads may help me get to know the structure of the city, while a route that makes many turns on many small streets may be too confusing to contribute to my understanding of the shape of the city.

Google presumably runs experiments, testing different methods of selecting routes.  As they develop new methods, they naturally want to test these methods to see which are best, both in terms of user experience and in terms of the success of the product.

It’s likely that these experiments are analyzed using growth metrics – seeing which methods lead to the greatest use of the product.  Generally, we imagine that this measures both the success of the product (more use = more revenue) and the user experience (if people are using it more, it must be because they’re happier with it).  However, what we measure (how much people use the product) and what we want to measure (how good the user experience of the product is) are not quite the same thing.

It’s possible that I may use Google Maps less often over time (or stop using it entirely) if I develop a strong sense of direction in the city I live in. This means that when Google tests different methods, they could find that the methods that lead to more confusing routes, and thus to less development of the user’s sense of direction, actually lead to greater product growth. Those methods would then be the ones selected for use.
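A toy model can make this mechanism explicit. The sketch below is purely hypothetical: the learning rates, the number of trips, and the assumption that a user's need for the app is simply one minus their "skill" are all invented for illustration, and say nothing about how Google Maps actually works or is evaluated.

```python
def expected_app_uses(learning_per_trip, trips=200):
    """Expected number of trips on which the user opens the app.

    Toy assumptions: the user's sense of direction (`skill`, from 0 to 1)
    improves a little on every trip, and the chance that they need the app
    on a given trip is 1 - skill.
    """
    skill = 0.0
    uses = 0.0
    for _ in range(trips):
        uses += 1.0 - skill
        # Skill closes a fixed fraction of the remaining gap on each trip.
        skill += learning_per_trip * (1.0 - skill)
    return uses


# Hypothetical routing methods: simple routes teach the city quickly,
# confusing routes teach it slowly.
variants = {"major-road routes": 0.05, "many-turn routes": 0.01}
usage = {name: expected_app_uses(rate) for name, rate in variants.items()}

# A naive growth metric picks whichever variant produces the most usage.
winner = max(usage, key=usage.get)

if __name__ == "__main__":
    for name, uses in usage.items():
        print(f"{name}: {uses:.1f} expected uses over 200 trips")
    print("Selected by the growth metric:", winner)
```

Under these assumptions, the method that teaches the user less produces more usage, so an experiment scored only on usage would select it, without anyone having intended that outcome.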

Without any intention to do so, the naive optimization of Google Maps for product growth could cause the product to “create learned helplessness” and interfere with the users’ ability to navigate on their own.

Ultimately, I do not believe that Google would intentionally sabotage my sense of direction in order to increase my dependence on their products. I don’t even think it’s likely that the effect I write about could realistically happen, unintentionally or intentionally. I do, however, think it’s an interesting examination of the danger in blindly optimizing products for growth.

Moving Towards Greater Regulation?

A key question in the data ethics world is whether protecting society from the harmful abuses of data requires greater government regulation (the “European approach”) or is possible with industry self-regulation and codes of ethics within a free market context (the “American approach”).  Surprisingly, a recent survey suggests that more than half of Americans “fear the federal government won’t do enough to regulate big tech companies”.  Perhaps Americans are concerned enough to be seeking strong solutions.

The survey certainly suffers from selection and non-response biases, but was appropriately weighted to represent the US population and is probably not too far from the truth.  In any case, it’s an interesting data point demonstrating that the public is taking these issues seriously.