The Production of Information in an Online World: Is Copy Right?

Julia Cagé (Sciences Po Paris)

Nicolas Hervé (Institut National de l'Audiovisuel)

Marie-Luce Viaud (Institut National de l'Audiovisuel)

Abstract: This paper documents the extent of online copyright infringement. We build a unique dataset combining all the online content produced by the universe of news media in France (newspapers, television, radio, pure online media, and a news agency) during 2013 with new micro audience data. We develop a topic detection algorithm that identifies each news event, trace the timeline of each story, and study news propagation. We show that one quarter of news stories are reproduced online in less than 4 minutes. High reactivity comes with verbatim copying, and media rarely name the outlets they copy. We find that only 32.6% of online content is original, but that original content accounts for between 46% and 57.8% of total news consumption. The negative impact of copyright violations on newsgathering incentives may thus be counterbalanced by reputation effects. Using article-level variation (with story and media fixed effects), we show that a 50 percentage point increase in the originality rate of an article leads to a 35% increase in the number of times it is shared on Facebook.
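
To make the notion of an article's originality rate concrete, the following is a minimal illustrative sketch, not the procedure used in the paper. It assumes that originality is measured as the share of an article's characters that do not appear verbatim, in runs of at least a given length, in earlier articles covering the same news event. The function name `originality_rate` and the threshold `min_len` are hypothetical choices made for this illustration only.

```python
from difflib import SequenceMatcher


def originality_rate(article: str, earlier_articles: list[str], min_len: int = 20) -> float:
    """Share of characters in `article` not copied verbatim from earlier articles.

    A character counts as copied if it lies in a matching block of at least
    `min_len` characters shared with any earlier article (an assumed rule).
    """
    if not article:
        raise ValueError("article must be non-empty")

    copied = [False] * len(article)
    for earlier in earlier_articles:
        # autojunk=False disables difflib's heuristic that ignores frequent characters.
        matcher = SequenceMatcher(None, article, earlier, autojunk=False)
        for block in matcher.get_matching_blocks():
            if block.size >= min_len:
                for i in range(block.a, block.a + block.size):
                    copied[i] = True

    return 1.0 - sum(copied) / len(article)


# Hypothetical example texts, invented for illustration.
earlier = "The minister announced a new budget plan on Monday morning."
added = " Our reporters spoke to three local officials."
article = earlier + added

rate = originality_rate(article, [earlier])
print(f"originality rate: {rate:.3f}")
```

Under these assumptions, an article that reproduces an earlier dispatch in full and adds one new sentence has an originality rate equal to the new sentence's share of the total length. A real measure would also need to handle near-verbatim copying and tokenization, which this sketch ignores.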