BSA | The Software Alliance

The information on this website was provided by external sources. The Government of Canada assumes no responsibility for the accuracy, currency, or reliability of information supplied by external sources. Users wishing to rely on this information should consult the source directly. Content provided by external sources is not subject to official languages and privacy requirements.

September 17, 2021

Submission of BSA | The Software Alliance to Innovation, Science and Economic Development Canada

Consultation on a Modern Copyright Framework for Artificial Intelligence and the Internet of Things

Submitted via Email to: copyright-consultation-droitdauteur@canada.ca

BSA | The Software Alliance (BSA) welcomes this opportunity to provide comments to Innovation, Science and Economic Development Canada (ISED) in response to the Consultation on a Modern Copyright Framework for Artificial Intelligence and the Internet of Things (Consultation).Footnote 1 BSA is the leading advocate for the global software industry before governments and in the international marketplace.Footnote 2 Our members are at the forefront of software-enabled innovation that powers the global economy and helps businesses in every industry compete more effectively. BSA is therefore acutely aware of the critical role that copyright policy plays in fostering research and development of cutting-edge technologies and we appreciate the opportunity to provide input as ISED considers potential reforms to the Copyright Act.

This Consultation arrives at a timely moment. Advances in AI and software-enabled data analytics are fueling job and economic growth in Canada, improving how businesses in every sector operate, and producing real societal gains. Across industries, the analysis of data has made Canadian businesses more agile, responsive, and competitive, boosting the underlying productivity of many key pillars of the economy. More than simply benefitting from the adoption of AI, Canadian innovators are on the leading edge in shaping the development of the technology. In fact, many of the recent breakthroughs in AI can be traced back to the trailblazing work of a trio of Canadian computer scientists whose research in the 1980s laid the foundations for today’s “deep learning” techniques.Footnote 3 Canada’s leading academic institutions continue to develop world-class tech talent. In addition to supporting many domestic technology companies, Canada’s deep well of AI experts is also a major draw for foreign direct investment that is contributing to economic growth and job creation. Based on the strength of Canada’s research capabilities, more than 45 multinational companies have now invested in AI R&D centers located in Canada.Footnote 4 Earlier this year, the Government of Canada announced plans to renew the Pan-Canadian AI Strategy by investing $443.8 million to attract the best AI talent and support the commercialization of domestic AI R&D. ISED can play a key role in helping achieve the vision set forth in the recently renewed Pan-Canadian AI Strategy by ensuring that the Copyright Act provides the right balance of incentives for the research and development of cutting-edge technologies.

As BSA and a wide array of other stakeholders testified before the Committee on Industry, Science, and Technology in 2018, the Copyright Act presently creates unnecessary uncertainty about the legal implications of key analytical techniques, such as text and data mining and machine learning, that are foundational to the development of AI.Footnote 5 Based on its review of the record, the Committee on Industry, Science, and Technology concluded the Government of Canada should adopt a statutory exception for “informational analysis of lawfully acquired copyrighted content,” in order to “help Canada’s promising future in artificial intelligence become reality.”Footnote 6 For the reasons outlined below, we urge ISED to act upon the Committee’s recommendation.

AI and Copyright: The Source of Legal Uncertainty

The incredible advances in AI capabilities in recent years have been enabled by a particular subset of the technology referred to as “machine learning.” At its most basic, machine learning involves the computational analysis of large amounts of data (i.e., “training data”) to identify correlations, patterns and other metadata that can be used to develop a “model” capable of making predictions based on future data inputs. For example, in 2020, the World Wildlife Fund used machine learning to transform ongoing conservation efforts. Leveraging partnerships through the Coalition to End Wildlife Trafficking Online, a hackathon team set out to design an algorithm that could assist online companies in automatically identifying pangolin products for sale, and to create a proof of concept for an image recognition algorithm that identifies pangolin leather products online.Footnote 7 Leveraging text and data mining capabilities to source the images, volunteers labelled a training data set of 5,353 pangolin images with 99% accuracy, trained an image recognition model that could identify pangolin leather products on Microsoft Bing Shopping with 86% precision, and designed a user interface that visualizes results in real time.

As the foregoing example demonstrates, some forms of machine learning rely on training data that is derived through the computational analysis of items potentially subject to copyright protection. This “input” stage of the machine learning process may involve two sets of reproductions that potentially implicate the Copyright Act: (1) reproductions necessary to create a corpus of “training data,” and (2) transient reproductions that are incidental to the computational process of training the AI model. In each case, the reproductions are “intermediate” in the sense that they are not visible or otherwise made available to the public. Instead, the reproductions are the necessary byproduct of a technical process that is aimed at identifying non-copyrightable information about the underlying corpus of works – i.e., the correlations and patterns that inform the creation of the AI model and enable it to make predictions based on future data inputs. Such intermediate, non-expressive reproductions have no impact on the economic interests that copyright is intended to protect. Recognizing that not all uses of copyrighted works should require permission, the Copyright Act includes several exceptions that arguably might cover certain forms of information analysis and machine learning. However, because the Copyright Act currently lacks an express exception to enable informational analysis, there is considerable uncertainty about the scope of activity that is permitted under current law. To resolve the existing ambiguity, ISED should introduce an exception for “informational analysis.”
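The two kinds of intermediate reproduction described above can be illustrated with a minimal Python sketch. It is an illustration only, not a description of any system referenced in this submission: the three-sentence corpus, the function names `train_bigram_model` and `predict_next`, and the choice of a simple word-pair count as the "model" are all assumptions made for the example. The corpus list stands in for the training data set, the tokenized copy inside the loop stands in for a transient reproduction, and the value that is kept afterwards consists of aggregate counts.

```python
import re
from collections import Counter


def train_bigram_model(corpus):
    """Count how often each pair of adjacent words occurs across the corpus."""
    counts = Counter()
    for document in corpus:
        # Transient reproduction: a lowercased, tokenized copy of the text
        # that exists only for the duration of this loop iteration.
        tokens = re.findall(r"[a-z']+", document.lower())
        counts.update(zip(tokens, tokens[1:]))
    # Only the aggregate pair counts are returned; the tokenized copies are not kept.
    return counts


def predict_next(model, word):
    """Return the word that most often followed `word` in training, or None."""
    followers = {pair[1]: n for pair, n in model.items() if pair[0] == word}
    if not followers:
        return None
    return max(followers, key=followers.get)


# Hypothetical training corpus, assumed to be lawfully accessed.
corpus = [
    "the pangolin is a protected species",
    "the pangolin is hunted for its scales",
    "trade in pangolin scales is illegal",
]

model = train_bigram_model(corpus)
print(predict_next(model, "pangolin"))
```

Production machine learning systems operate at a vastly larger scale and learn far richer statistical representations, but the sequence is the same: assemble a corpus, process temporary copies of it, and retain the patterns derived from that analysis.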

In contrast to the legal uncertainty surrounding the use of copyrighted works as inputs to AI processes, the Copyright Act remains well equipped to resolve questions about the copyright implications of AI outputs. In most circumstances, the output of an AI system will not implicate copyright at all. In the rare instances where the output of an AI system involves copyrightable expression, existing doctrines will guide courts in assessing whether the output infringes another work and allocating potential liability. Traditional tests for evaluating whether the exclusive rights have been directly infringed are technologically agnostic and would not be impacted by the adoption of an informational analysis exception. As in any other action, courts will compare the AI output to the plaintiff’s work to determine whether there is a substantial similarity that supports a finding of infringement. To the extent the AI system results in an output that is substantially similar to the plaintiff’s work, courts will impose liability on users whose volitional acts resulted in the infringement.

AI and Copyright – International Developments

In contrast to Canada, the copyright laws of several other leading AI nations provide greater legal certainty for AI research and development. Indeed, there is an increasing global awareness about the need to modernize copyright laws to facilitate the development of AI.

1. Japan

Japan first recognized such a need in 2009 when it amended its Copyright Act to create an explicit exception for reproductions that are created as part of an “information analysis” process.Footnote 8 Although the 2009 amendment is heralded as having transformed Japan into a “machine learning paradise,” the Japanese Diet revised the Copyright Act again in 2018 to expand the exception.Footnote 9 In May 2018, the Diet passed the Copyright Law Amendment Act, broadening the existing exception to allow users to “exploit” any copyrighted work for non-consumptive purposes, including for “data analysis (meaning the extraction, comparison, classification, or other statistical analysis of the constituent language, sound, or image data)” and “computer data processing.”Footnote 10 In addition to creating a general purpose exception for non-consumptive uses of copyrighted works, the recent amendment package also authorizes beneficiaries of the information processing exception to make limited public uses of the underlying works, such as the display of snippets.Footnote 11

2. United States

In the United States, courts have confirmed that under the “fair use” doctrine, incidental copies of a work made in the course of informational analysis are non-infringing, even where the analysis is performed for commercial purposes. Creating a corpus of AI training data fits neatly within a long line of fair use rulings. For instance, in Authors Guild v. HathiTrust and Authors Guild v. Google, the Second Circuit determined that the unauthorized copying of tens of millions of books for the purposes of creating a searchable database of those works was a fair use.Footnote 12 Notwithstanding Google’s commercial motivation, the Second Circuit determined that the creation of the database was a fair use because it served a “highly convincing transformative purpose” that did not create “substitute competition” for the works included in the database.Footnote 13 The court reasoned that creating the searchable database was transformative because it enabled users to uncover factual information “about” the works included in the database and that providing access to such information did not implicate the expressive interests that copyright is intended to protect.Footnote 14 This conclusion reflects a strong consensus among the courts, which have consistently ruled that unauthorized copying is permitted when it is undertaken for non-expressive purposesFootnote 15 or to identify non-copyrightable information about the copied works.Footnote 16

3. European Union

The European Union also recently passed legislation to provide clarity for the development of AI. In April 2019, the Council of the European Union formally adopted the Directive on Copyright and Related Rights in the Digital Single Market. Articles 3 and 4 of the Directive create two broad exceptions that authorize AI researchers to make reproductions that are needed for the purposes of carrying out “any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations.” Importantly, the Directive clarifies that Articles 3 and 4 are without prejudice to existing exceptions and limitations that may already allow for reproductions that are necessary for machine learning. The Article 4 exception specifically contemplates that text and data mining may be performed for commercial purposes, but it is limited to circumstances where a rights holder has not “expressly reserved” his or her rights “in an appropriate manner, such as machine-readable means in the case of content made publicly available online.”

4. Singapore

On September 13, 2021, the Parliament of Singapore passed important new legislation to update the country’s Copyright Act and ensure that Singapore remains a global hub for AI research and innovation.Footnote 17 Singapore’s new law includes an important exception for “computational data analysis” that will provide much needed certainty for the researchers and developers fueling Singapore’s AI ambitions. The scope of the exception includes reproductions that are necessary for the purpose of performing a computational data analysisFootnote 18 and communications to the publicFootnote 19 that are necessary for the purposes of: (i) verifying the results of the computational data analysis or (ii) collaborative research and study relating to the purpose of the computational data analysis.Footnote 20 The Singapore exception is subject to a number of important safeguards. Most importantly, it applies only to circumstances in which an entity has “lawful access” to the copy on which computational analysis will be performed, and the exception cannot be invoked if the entity performing the computational data analysis knows either that the source material is infringing or that it was “obtained from an online location that is being or has been used to flagrantly commit or facilitate rights infringements.”Footnote 21

Singapore’s proposal is consistent with elements of forward-looking copyright policies that are important to the development of AI. The proposal is technologically neutral, applying broadly to any reproduction that is needed in order to carry out a “computational data analysis” of a work, including copies made during (and in preparation for) training an AI system. Importantly, the exception also permits the exchange of datasets to facilitate collaborative research. Recognizing that AI research is being driven by a large ecosystem of researchers and developers that spans industries and academic disciplines, the proposal is also purpose- and user-agnostic, permitting all users to perform machine learning techniques irrespective of whether they are associated with an academic institution or a commercial enterprise.

Clearing the Path for Canada’s AI Ambitions

By recommending the adoption of an express exception in the Canadian Copyright Act to cover copying of a lawfully accessed work for the purpose of “information analysis,” ISED can help enhance the competitiveness of Canada’s AI industry. Consistent with international best practice, ISED should introduce an exception that is technologically neutral, sufficiently flexible so as to be future-proof, and agnostic as to purpose and user. With that in mind, the exception should extend to:

  • Commercial and non-commercial uses;
  • All works and other subject matter;
  • All copyright-relevant acts, including retention of data for purposes of verification and validation of results; and,
  • The provision of AI services (i.e., permitting service providers to perform informational analysis on behalf of end-users).

Importantly, such an exception would be consistent with Canada’s international obligations. The TRIPs Agreement and Berne Convention require Member States to ensure that exceptions to copyright are confined to “certain special cases which do not conflict with the normal exploitation of the work and do not unreasonably prejudice the legitimate interests of the rights holder.”Footnote 22 An exception to facilitate information analysis of lawfully accessed works is consistent with each of these requirements. It would meet the “certain special cases” requirement insofar as it is clearly and narrowly tailored to advance a significant public interest in the development of AI. It will not conflict with the normal exploitation of copyrighted works because reproductions made during an informational analysis process are not visible to humans and cannot substitute for or displace markets for the original works. And finally, an informational analysis exception will not prejudice the legitimate interests of a rights holder because the value derived from such processes is unrelated to the expressive content that copyright is intended to protect. Indeed, the very purpose of performing informational analysis is to identify non-copyrightable information – such as the relationships and correlations between large corpuses of works – that can be used to train AI models. The value derived from such activity lies not in the factual information that is gleaned from any single source, but rather in the discovery of entirely new forms of knowledge that emerge from the identification of patterns and correlations that exist between large bodies of disparate sets of data. While copyright protects the specific expression of factual information, it does not extend to facts themselves, and it was never intended to prevent users from analyzing a work to which they have lawful access in order to derive factual, non-copyrightable information. Once lawful access to a work is obtained, it should not matter whether a user analyzes the material manually or extracts the underlying factual information through a digital process.


BSA appreciates the opportunity to provide feedback on these critically important issues and we look forward to remaining engaged as ISED considers next steps.

Sincerely,

Christian Troncoso
Senior Director, Policy