The Effectiveness of Site Blocking as an Anti-Piracy Strategy: Evidence from the U.K.

By Michael Smith
June 3rd, 2015

Brett Danaher, Michael D. Smith, Rahul Telang

It is well established in the academic research that piracy harms sales for entertainment goods;[1] and there is emerging evidence that, by reducing the profitability of content creation, piracy may reduce the quality and quantity of the content that is created.[2]

Given these empirical results, as academic researchers, we have spent considerable effort trying to understand the effectiveness of various anti-piracy strategies that attempt to mitigate the impact of piracy on industry revenues by either making legal content more appealing or making illegal content less appealing (see for example here and here). Our latest research examines an anti-piracy strategy known as “site-blocking” adopted in many countries, including the United Kingdom where we conduct our analysis. In the U.K. courts respond to blocking requests, and where they find cause, order Internet Service Providers (ISPs) to block access to specific piracy-enabling sites.

This approach is notably different than shutting down entire sites that store pirated content: the sites and pirated content remain online worldwide, and within the affected country the blocked sites can still be accessed by technologically sophisticated users. Given these differences we decided to study the effectiveness of site-blocking strategies at changing consumer behavior, focusing on court-ordered blocks in the UK: The May 2012 block of one site, The Pirate Bay, and the October/November 2013 block of 19 major piracy sites.

Our results, which were first presented to an academic audience at the December 2014 Workshop on Information Systems and Economics, used consumer data from an Internet panel tracking company to examine the behavior of a set of UK Internet users before and after these sites were blocked. We considered users who had not visited the site(s) before the block as a control group (since they were largely unaffected by the block) and we asked how treated users – those who had used the site(s) before the block – changed their behavior after the block (relative to the control group).

Our analysis found that blocking The Pirate Bay had little impact on UK users’ consumption through legal channels. Instead blocked users switched to other piracy sites or circumvented the block by using Virtual Private Networks. However, unlike the May 2012 Pirate Bay block, our results showed that when 19 sites were blocked simultaneously, former users of these sites increased their usage of paid legal streaming sites by 12% on average, relative to the control group.[3]  The blocks caused the lightest users of the blocked sites (and thus the users who were least affected by the blocks, other than the control group) to increase their use of paid streaming sites by 3.5% while they caused the heaviest users of the blocked sites to increase paid streaming visits by 23.6%, strengthening the causal inference in our result.

As we discuss in our paper, the most likely explanation for this result — and one supported by other observations in the data — is that when only one site is blocked, most pirates have an easy time finding and switching to other piracy sites. But, blocking many sites can increase the cost of finding alternate sources of piracy enough that a significant number of former pirates will switch their behavior toward legal sources.

As with our other empirical findings, summarized above, this finding suggests that consumers behave like consumers: They make choices based on the costs and benefits of what is available, and will change their behavior based on sufficient changes in those costs and benefits.


[1]       See this paper or this paper for a review of the academic literature on how piracy impacts sales.

[2]       See, for example, this paper or its summary in this blog post

[3]       Importantly, our data did not allow us to determine whether this 12% increase reflected new users coming to these paid sites or simply increased usage of an already existing customer base.

The NABU Network: A Great Lesson, But Not About Openness

By Scott Wallsten
February 5th, 2015

When announcing his plan to regulate Internet Service Providers under Title II in Wired, FCC Chairman Tom Wheeler argued that his experience at NABU Network in the 1980s helped inform his decision. He writes that NABU failed because “The phone network was open whereas the cable networks were closed. End of story.”

But that’s not the whole story, and its lessons aren’t really about openness. Instead, it teaches us about the importance of investment and network effects.

NABU sprang from the mind of Canadian entrepreneur John Kelly, who realized that cable television networks were uniquely suited to high-speed data transmission. The service allowed a home computer to connect to a remote software library and, in principle, play games, shop, bank, and do email. And apparently it could do all that at speeds up to 6.5 Mbps—even more than Chairman Wheeler claimed in his recent Wired article.[1] Not too shabby. NABU first launched in Ottawa in 1983 and Sowa, Japan and Fairfax, VA in 1984. By the time it went out of business it had reached agreements with cable companies in 40 other regions.

As it turned out, the world wasn’t ready for NABU, and it failed in 1986.

Analyses of NABU, however, do not point to issues of openness as the cause of death. After all, other computer networks in the early 1980s that relied on the telephone network also failed.[2]

Instead, post-mortems point to issues we know are important in network industries: network effects and investment, or, rather, the lack thereof in both cases.

As has been written ad nauseam, the Internet is a two-sided (actually, multi-sided) platform. In order to succeed, it must attract both users and applications. In early stages, when uses and users are scarce, it can be difficult to get anyone on board. The presence of indirect network effects makes it worse, since the benefit from each new user or application is greater than the benefit that accrues just to the new subscriber or developer. That is, a new user benefits by being able to access all the available content, but the entire network benefits due to increased incentives to develop new applications. The new user, however, does not realize all those benefits, meaning that adoption, at least in the early stages, may be artificially slow.

Early commercial data networks faced precisely this problem. Why would someone pay to go online if there were nothing to do when he logged on? In order to subscribe to NABU, consumers paid not just a monthly fee, but also had to buy or lease a $299 NABU personal computer. Data networks tried to induce consumers to subscribe by making collections of software available. In the 1980s, however, most commercial data networks just could not provide enough of an incentive to attract or keep subscribers.

Creating reasons to subscribe played an important role in NABU’s failure. As one source put it, “ the NABU Network did not catch on due to lack of accessible resources.”

Another reason for its failure appears to have been the inability of the then-existing infrastructure to fully deliver on NABU’s promises. Cable TV systems were not built to handle much upstream traffic—an issue they still face today. Upgrading the cable infrastructure for two-way communication required significant investment.

Competition also made survival difficult for NABU. NABU faced direct competitors in the form of other data networks like AOL (founded in 1985), Prodigy, and the dominant firm, Compuserve. Additionally, to the extent that consumers would sign up to play games, NABU also faced competition from packaged software games and gaming consoles, and faced the same over-saturation of the market that led to the great video game crash. It even faced potential competition from The Games Network, a firm that was developing a system that used cable networks to distribute video games but failed to get off the ground.

In short, the market wasn’t quite ready for the kind of service NABU was selling, although NABU founder Kelly was right about the potential of cable networks. As Marty McFly might have said to potential subscribers in the 1980s, “your kids are gonna love it.

Openness is a key part of the Internet. It just wasn’t a key part of the NABU story. Instead, it reminds us of the importance of network effects, the economics of multi-sided networks, and network investment. Unlike the 1980s, these are now working together in a virtuous cycle favoring innovation. Let’s make sure any new rules don’t change that.


For a fascinating and detailed history of early data networks, including NABU, see

Zbigniew Stachniak, “Early Commercial Electronic Distribution of Software,” IEEE Annals of the History of Computing 36, no. 1 (2014): 39–51, doi:10.1109/MAHC.2013.55.

[1] Stachniak, “Early Commercial Electronic Distribution of Software”, n. 21.

[6] Stachniak, “Early Commercial Electronic Distribution of Software” Table 1.


A Closer Look at Those FCC Emails

By Scott Wallsten
November 24th, 2014

Recently, Vice News received 623 pages of emails from the FCC in response to a Freedom of Information Act request. Vice News has kindly made the entire PDF file available for download.

We decided to categorize the emails to get a picture of who contacts the FCC and what they want to talk about. This simple categorization is time consuming given the need to review each page to pull out the relevant information. Nevertheless, our intrepid research associate, Nathan Kliewer, managed to slog his way through the pile, leaving us with a clean dataset. The fruits of his labor are printed below.

The statistics derived from this dataset come with important caveats. First, and most importantly, we categorize only the initial email in any given chain of emails. As a result, this analysis tells us nothing about the extent of a given email interaction. Second, it is possible that some emails are mischaracterized (seriously, you try reading 623 pages of blurry PDFs). Third, because the FCC released only selected emails, we do not know if these emails are representative of FCC email correspondence.

Nevertheless, let’s see what we’ve got.

Figure 1 shows the number of emails from different types of organizations.

Figure 1

Number of Emails by Type of Organization


The figure shows that most emails were initiated by news organizations, followed closely by industry. The FCC itself appears as the originator of a good number of these emails, most of which are from one FCC staff member to another. Eleven emails are from law firms (which represent industry clients), nine from people affiliated with universities, eight from other government agencies, seven from consumer advocacy groups, and six from think tanks. Among the unexpected emails is one from a representative of the government of Serbia simply inquiring about “current regulatory challenges,” and another from someone applying for an internship at the FCC (the latter we did not include in the figure).

Figure 2 highlights the general subject or topic of the email. The largest number of emails, not surprisingly, contains the sender’s views on policy issues relevant to net neutrality. The second largest number is news items people forward to FCC staff. Next are requests for comments, followed by information about events and requests for meetings.

Figure 2

emails by subject

Figure 3 combines these two categories to reveal which type of organizations focus on which issues. Industry, consumer groups, and other government agencies tend to send emails discussing views on policy issues. News organizations send requests for comments. Industry and law firms, generally representing industry, send ex parte notices.

Figure 3

email by org and topic


Unfortunately, this meta-analysis tells us little about whether those emails mattered in any real way. I also can’t believe I spent so much time on this.

According to the Vice News story, the FCC plans on releasing more emails on November 26. I look forward to seeing an updated meta-analysis of those emails, but prepared by somebody else.

Google, Search Ranking, and the Fight Against Piracy

By Michael Smith
October 20th, 2014

Last month, Rahul Telang and I blogged about research we conducted with Liron Sivan where we used a field experiment to analyze how the position of pirate links in search results impact consumer behavior. Given this research, we were very interested in Google’s announcement last Friday that they were changing their ranking algorithm to make pirate links harder to find in search results.

According to the announcement, Google changed their ranking algorithm to more aggressively demote links from sites that receive a large number of valid DMCA notices, and to make legal links more prominent in search results. The hope is that these changes will move links from many “notorious” pirate sites off the first page of Google’s search results and will make legal content easier to find.

One might ask whether these changes — moving pirate results from the first to the second page of search results and promoting legal results — could have any effect on user behavior. According to our experimental results, the answer seems to be “yes, they can.”

Specifically, in our experiment we gave users the task of finding a movie of their choosing online. We then randomly assigned users to a control group and to two treatment groups: one where pirate links were removed from the first page of search results and where legal links were highlighted (legal treatment), and one where legal links were removed from the first page of search results (piracy treatment).

Our results show that users are much more likely to purchase legally in the legal treatment condition than in the control. We also found that these results hold even among users who initially search using terms related to piracy (e.g., by including the terms “torrent” or “free” in their search, or by including the name of well-known pirate sites), suggesting that even users with a predisposition to pirate can be persuaded to purchase legally through small changes in search results.

Given our findings, reducing the prominence of pirated links and highlighting legal links seems like a very promising and productive decision by Google. While it remains to be seen just how dramatically Google’s new search algorithm will reduce the prominence of pirate links, we are hopeful that Google’s efforts to fight piracy will usher in a new era of cooperation with the creative industries to improve how consumers discover movies and other copyrighted content, and to encourage users to consume this content through legal channels instead of through theft. If implemented well, both Google and studios stand to benefit significantly from such a partnership.

Using Search Results to Fight Piracy

By Michael Smith
September 15th, 2014

With the growing consensus in the empirical literature that piracy harms sales, and emerging evidence that increased piracy can affect both the quantity and quality of content produced (here and here for example), governments and industry partners are exploring a variety of ways to reduce the harm caused by intellectual property theft. In addition to graduated response efforts and site shutdowns, Internet intermediaries such as Internet Service Providers, hosting companies, and web search engines are increasingly being asked play a role in limiting the availability of pirated content to consumers.

However, for this to be a viable strategy, it must first be the case that these sorts of efforts influence consumers’ decisions to consume legally. Surprisingly, there is very little empirical evidence one way or the other on this question.

In a recent paper, my colleagues Liron Sivan, Rahul Telang and I used a field experiment to address one aspect of this question: Does the prominence of pirate and legal sites in search results impact consumers’ choices for infringing versus legal content? Our results suggest that reducing the prominence of pirate links in search results can reduce copyright infringement.

To conduct our study, we first developed a custom search engine that allows us to experimentally manipulate what results are shown in response to user search queries. We then studied how changing what sites are listed in search results impacted the consumption behavior of a panel of users drawn from a general population, and a separate panel of only college aged participants.

In our experiments, we first randomly assigned users to one of three groups: a control group of users who are shown the same search results they would receive from a major search engine, and two treatment groups where pirate sites are artificially promoted and artificially demoted in the displayed search results. We then asked users to obtain a movie they are interested in watching, and to use our search engine instead of the search engine they would normally use. We observe what queries each set of users issued to search for their chosen movie, and surveyed them regarding what site they used to obtain the movie.

Our results suggest that changing the prominence of pirate and legal links has a strong impact on user choices: Relative to the control condition, users are more likely to consume legally (and less likely to infringe copyright) when legal content is more prominent in search results, and user are more likely to consume pirate content when pirate content is more prominent in search results.

By analyzing users’ initial search terms we find that these results hold even among users with an apparent predisposition to pirate: users whose initial search terms indicate an intention to consume pirated content are more likely to use legal channels when pirated content is harder to find in search results.

Our results suggest that reducing the prominence of pirate links in search results can reduce copyright infringement. We also note that there is both precedent and available data for this sort of response. In terms of precedent, search engines are already required to block a variety of information, including content from non-FDA approved pharmacies in the U.S. and content that violates an individual’s “right to be forgotten” in a variety of EU countries. Likewise, the websites listed in DMCA notices give search engines some of the raw data necessary to determine which sites are most likely to host infringing content.

Thus, while more research and analysis is needed to craft effective policy, we believe that our experimental results provide important initial evidence that users’ choices for legal versus infringing content can be influenced by what information they are shown, and thus that search engines can play a role in the ongoing fight against intellectual property theft.


Does Piracy Undermine Product Creation?

By Michael Smith
September 5th, 2014

(Below is a guest post by my colleague, Rahul Telang from Carnegie Mellon University)

That Piracy undermines demand for products in copyright industries is intuitive and well supported by data. Music, movies, books, software have seen demand degradation due to various forms of piracy. What is not so well supported by data is whether piracy undermines product creation. For example, does piracy reduce the number of movies made, or quality of movies made, or investments in movies? Common sense suggests that this must be true. After all, this is the core principle of copyright. Large scale copyright infringement should affect revenues which in turn should affect producers’ incentives to create.

Despite this compelling argument the data does not support this claim readily. The reasons are many. For one, while the change in demand due to infringement happens more quickly, the production adjustments take time. So unless the infringement is persistent for a period of time, the contraction in production is not readily visible. The technology that leads to widespread infringement (say P2P networks and broadband infra-structure that facilitates online piracy) might also be accompanied by a period where cost of production and distribution declines or new markets open up. The net effect of these two opposing factors is all we can see in the data. And, the net effect could very well be that the production actually has increased!!!. This is not an evidence that piracy does not matter. Finally, there may be distributional bottlenecks (say number of theatres) that may prevent growth in production but might lead to larger investments in movies or in some cases higher input costs (actors and directors become more expensive).

In short, to see the effects of piracy in data, we need a setting where other factors are largely unchanged. With my co-author Joel Waldfogel, we explore Indian movie industry around the diffusion of Cable television and VCR. This phenomenon took place during 1985-2000. The paper is here. The story of our paper from the abstract is essentially that:

The diffusion of the VCR and cable television in India between 1985 and 2000 created substantial opportunities for unpaid movie consumption. We first document, from narrative sources, conditions conducive to piracy as these technologies diffused. We then provide strong circumstantial evidence of piracy in diminished appropriability: movies’ revenues fell by a third to a half, conditional on their ratings by movie-goers and their ranks in their annual revenue distributions. Weaker effective demand undermined creative incentives. While the number of new movies released had grown steadily from 1960 to 1985, it fell markedly between 1985 and 2000, suggesting a supply elasticity in the range of 0.2-0.7.

Even the quality as measured by IMDb ratings declined substantially. Thus, our study provides affirmative evidence on a central tenet of copyright policy, that stronger effective copyright protection effects more creation. For empirical research, sometimes you have to look at the historical context to see the evidence of the effect of a policy. Doing a similar study in post 2000 era for any other country might be tricky because the other competing factors have altered. There will be a need to be more creative in defining and measuring product creation in this new context. And, I am sure we will see such efforts in near future. Needless to say, a lot more research is needed to settle this issue.  However, our paper does provide an evidence that in an appropriate setting, effects of copyright infringement on product creation can be measured.

2014 TPI Aspen Forum has Ended, but the Videos Live On…

By Amy Smorodin
August 22nd, 2014

Did you miss the Aspen Forum this year?  Or, do you just want to watch some of the panels again?  Videos of the panels and keynotes from the 2014 event are now up on the TPI website.

Some highlights from Monday night and Tuesday:

Comcast’s David Cohen was the Monday night dinner speaker.  In front of a packed room, Cohen spoke about the benefits of the Comcast/TWC deal, vertical and horizontal integration in the industry in general, and even revealed what keeps him up at night (hint: it’s not the communications industry).  His speech can be viewed here.

First up on Monday morning was a panel on copyright moderated by Mike Smith, TPI Senior Adjunct Fellow and Professor at Carnegie Mellon.  “Copyright Protection: Government vs. Voluntary Arrangements” featured Robert Brauneis from GW Law School, the Center for Copyright Information’s Jill Lesser, Jeff Lowenstein from the Office of Congressman Schiff, Shira Perlmutter from USPTO and NYU’s Chris Sprigman. Panelists discussed the copyright alert system, the state of the creative market in general, and the perennial question of what can be done to reduce piracy.  Video of the spirited panel can be viewed here.

Next up was the panel, “Internet Governance in Transition:  What’s the Destination?” moderated by Amb. David Gross.  The pretty impressive group of speakers discussed issues surrounding the transition of ICANN away from the loose oversight provided by the U.S. Dept. of Commerce.  Participants were ICANN Chair Steve Crocker, Reinhard Wieck from Deutsche Telekom, Shane Tews from AEI, Amb. Daniel Sepulveda, the U.S. Coordinator for International Communications and Information Policy, and NYU’s Lawrence White.  Video is here.

Finally, the Forum concluded with a panel on “Data and Trade,” moderated by TPI’s Scott Wallsten.  The panelists discussed how cybersecurity, local privacy laws, and national security issues are barriers to digital trade.  Speakers were USITC Chairman Meredith Broadbent, Anupam Chander from University of CA, Davis, PPI’s Michael Mandel, Joshua Meltzer from Brookings, and Facebook’s Matthew Perault.  Video of the discussion is here.

We hope all attendees and participants at the TPI Aspen Forum found it interesting, educational, and enjoyable.  We hope to see you next year!

Dispatch from the TPI Aspen Forum

By Amy Smorodin
August 18th, 2014

Sunday, August 17

Last night, we kicked off our 2014 Aspen Forum in lovely Aspen, Colorado.

Congressman Scott Tipton welcomed attendees to his home state (and his home district).  In his remarks, Tipton discussed the importance of tech in growing small business and the economic impact of regulations, which he estimated to cost $1.8 billion a year.  Rep. Tipton also discussed the importance of broadband penetration in rural areas.

Video of his speech, and short remarks from TPI President Thomas Lenard and TPI Board Member Ray Gifford, can be found here.

Monday, August 18

The first full day of the TPI Aspen Forum began with a discussion on “The Political Economy of Telecom Reform,” moderated by TPI’s Scott Wallsten.

Former Congressman Rick Boucher, now a Partner at Sidley Austin, explained that during the 1996 telecom act, the issues were not partisan in nature.  However, he identified a sticking point that seems to be drawn along party lines: network neutrality.  He would like to see net neutrality dealt with separately prior to the start of any real push for telecom reform in Congress, in hopes that lawmakers will have an easier time finding common ground.

Peter Davidson from Verizon stated that there does not seem to be as much consensus among players in the communications industry as there was during the last push for telecom reform.  However, he did express that the threat of Title II regulation may drive many to band together.

Roger Noll from Stanford University declared the big winners in the ‘96 Act “were people who make a living manipulating regulatory processes.”  He also said such a thing was less likely to happen with any new telecom reform act because there are many more players – not just traditional wired communications companies – who know how to mobilize politically.

Philip Weiser, Dean, University of Colorado Law School stated the communications sector is going to have a lot of innovation in the next few years despite the static telecom reform act. In any new reform act, Congress should stick to high-level principles to enable ongoing innovation.  In other words, Congress needs to show restraint.

Video of the entire discussion can be viewed here.

More summaries of today’s panels and tonight’s keynote dinner speech by Comcast’s David Cohen, will be posted soon. Videos of everything will also be posted on the TPI YouTube page just as soon as we can get them up.

Stay tuned!

The Expendables 3 Leak and the Financial Impact of Pre-Release Piracy

By Michael Smith
July 25th, 2014

This past week a DVD-quality copy of the movie The Expendables 3 leaked online three weeks before its planned U.S. theatrical release. According to Variety, the film was downloaded 189,000 times within 24 hours. As researchers, an immediate question comes to mind: how much of a financial impact could movie-makers face from such pre-release piracy?

The effect of piracy on the sales of movies and other copyrighted works has long been scrutinized, with the vast majority of peer-reviewed academic papers concluding that piracy negatively impacts sales. Indeed, in a recent National Bureau of Economic Research book chapter, my co-authors and I reviewed the academic literature, and showed that 16 of the 19 papers published in peer-reviewed academic journals find that piracy harms media sales.

But less well understood is the impact of pre-release movie piracy, which could be particularly harmful to box office revenue because it appears at a time when there are no legal channels for anxious fans to consume the movie. Because of this, seeing a movie appear online before it appears in theaters sends chills down the spines of studio executives given the investment in human and financial capital necessary to produce the typical studio film.

To better understand the impact of this particular form of piracy, my colleagues and I conducted a study to measure the impact of pre-release piracy on box office revenue. Our study was accepted for publication last month in the peer-reviewed journal Information Systems Research, making it the first peer-reviewed journal article we are aware of to analyze the impact of pre-release movie piracy.

In our study we applied standard statistical models for predicting box office revenue, but added a variable for whether a movie leaked onto pirated networks prior to its release using data obtained from the site Our analysis concluded that, on average, pre-release movie piracy results in a 19% reduction in box office revenue relative to what would have occurred if piracy were only available after the movie’s release. As we discuss in the paper, this result is robust to a variety of different empirical approaches and sensitivity tests.

The growing consensus in the academic literature regarding financial harm from digital piracy provides an important backdrop to active policy debates about the best options for addressing this threat. We have seen governments and industry adopt various anti-piracy measures in recent years, from government sponsored graduated response laws, site blocking and site shutdowns; to market-based responses by rights holders and industry-level partnerships such as the Copyright Alert System in the United States.

At next month’s TPI Aspen Forum I am pleased to be chairing a panel of industry, legal, and policy experts to discuss the effectiveness and appropriateness of these initiatives to better serve the interests of the creative sector, the technology industries, and society as a whole. However, what seems to require no discussion is that digital piracy of this type can dramatically reduce sales.

Takeaways from the White House Big Data Reports

By Tom Lenard
May 5th, 2014

On May 1, the White House released its two eagerly-awaited reports on “big data” resulting from the 90-day study President Obama announced on January 17—one by a team led by Presidential Counselor John Podesta, and a complementary study by the President’s Council of Advisors on Science and Technology (PCAST).  The reports contain valuable detail about the uses of big data in both the public and private sector.  At the risk of oversimplifying, I see three major takeaways from the reports.

First, the reports recognize big data’s enormous benefits and potential.  Indeed, the Podesta report starts out by observing that “properly implemented, big data will become an historic driver of progress.”  It adds, “Unprecedented computational power and sophistication make possible unexpected discoveries, innovations, and advancements in our quality of life.”  The report is filled with examples of the value of big data in medical research and health care delivery, education, homeland security, fraud detection, improving efficiency and reducing costs across the economy, as well as in providing targeted information to consumers and the raw material for the advertising-supported internet ecosystem.  The report states that the “Administration remains committed to supporting the digital economy and the free flow of data that drives its innovation.”

Second, neither report provides any actual evidence of harms from big data.  While the reports provide concrete examples of beneficial uses of big data, the harmful uses are hypothetical.  Perhaps the most publicized conclusion of the Podesta report concerns the possibility of discrimination—that “big data analytics have the potential to [italics added] eclipse longstanding civil rights protections in how personal information is used in housing, credit, employment, health, education, and the marketplace.”  However, the two examples of discrimination cited turn out to be almost non-examples.

The first example involves StreetBump, a mobile application developed to collect information about potholes and other road conditions in Boston.  Even before its launch the city recognized that this app, by itself, would be biased toward identifying problems in wealthier neighborhoods, because wealthier individuals would be more likely to own smartphones and make use of the app.  As a result, the city adjusted accordingly to ensure reporting of road conditions was accurate and consistent throughout the city.

The second example involves the E-Verify program used by employers to check the eligibility of employees to work legally in the United States.  The report cites a study that “found the rate at which U.S. citizens have their authorization to work be initially erroneously unconfirmed by the system was 0.3 percent, compared to 2.1 percent for non-citizens.  However, after a few days many of these workers’ status was confirmed.”  It seems almost inevitable that the error rate for citizens would be lower, since citizens are automatically eligible to work, whereas additional information (i.e., evidence of some sort of work permit) is needed to confirm eligibility for non-citizens.  Hence, it is not clear this is an example of discrimination.

It is notable that both these examples are of government activities.  The reports do not present examples of commercial uses of big data that discriminate against particular groups.  To the contrary, the PCAST report notes the private-sector use of big data to help underserved individuals with loan and credit-building alternatives.

Finally, and perhaps most importantly, both reports indicate that the Fair Information Practice Principles (FIPPs) that focus on limiting data collection are increasingly irrelevant and, indeed, harmful in a big data world.  The Podesta report observes that “these trends may require us to look closely at the notice and consent framework that has been a central pillar of how privacy practices have been organized for more than four decades.”  The PCAST report notes, “The beneficial uses of near-ubiquitous data collection are large, and they fuel an increasingly important set of economic activities.  Taken together, these considerations suggest that a policy focus on limiting data collection will not be a broadly applicable or scalable strategy—nor one likely to achieve the right balance between beneficial results and unintended negative consequences (such as inhibiting economic growth).”  The Podesta report suggests examining “whether a greater focus on how data is used and reused would be a more productive basis for managing privacy rights in a big data environment.”  The PCAST report is even clearer:

Policy attention should focus more on the actual uses of big data and less on its collection and analysis.  By actual uses, we mean the specific events where something happens that can cause an adverse consequence or harm to an individual or class of individuals….By contrast, PCAST judges that policies focused on the regulation of data collection, storage, retention, a priori limitations on applications, and analysis…are unlikely to yield effective strategies for improving privacy.  Such policies would be unlikely to be scalable over time, or to be enforceable by other than severe and economically damaging measures.

In sum, the two reports have much to like:  their acknowledgement of the importance and widespread use of big data, and their attempt, particularly in the PCAST report, to refocus the policy discussion in a more productive direction.  The reports suffer, however, from a lack of evidence to substantiate their claims of harm.