[PATCH v2 3/3] docs: define policy forbidding use of AI code generators

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Daniel P. Berrangé" <berrange@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Thomas Huth" <thuth@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Mark Cave-Ayland" <mark.cave-ayland@ilande.co.uk>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Kevin Wolf" <kwolf@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Alexander Graf" <agraf@csgraf.de>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Markus Armbruster" <armbru@redhat.com>
Subject: [PATCH v2 3/3] docs: define policy forbidding use of AI code generators
Date: Thu, 16 May 2024 17:22:30 +0100	[thread overview]
Message-ID: <20240516162230.937047-4-berrange@redhat.com> (raw)
In-Reply-To: <20240516162230.937047-1-berrange@redhat.com>

There has been an explosion of interest in so called AI code generators
in the past year or two. Thus far though, this is has not been matched
by a broadly accepted legal interpretation of the licensing implications
for code generator outputs. While the vendors may claim there is no
problem and a free choice of license is possible, they have an inherent
conflict of interest in promoting this interpretation. More broadly
there is, as yet, no broad consensus on the licensing implications of
code generators trained on inputs under a wide variety of licenses

The DCO requires contributors to assert they have the right to
contribute under the designated project license. Given the lack of
consensus on the licensing of AI code generator output, it is not
considered credible to assert compliance with the DCO clause (b) or (c)
where a patch includes such generated code.

This patch thus defines a policy that the QEMU project will currently
not accept contributions where use of AI code generators is either
known, or suspected.

This merely reflects the current uncertainty of the field, and should
this situation change, the policy is of course subject to future
relaxation. Meanwhile requests for exceptions can also be considered on
a case by case basis.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 docs/devel/code-provenance.rst | 50 +++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/docs/devel/code-provenance.rst b/docs/devel/code-provenance.rst
index eabb3e7c08..846dda9a35 100644
--- a/docs/devel/code-provenance.rst
+++ b/docs/devel/code-provenance.rst
@@ -264,4 +264,52 @@ boilerplate code template which is then filled in to produce the final patch.
 The output of such a tool would still be considered the "preferred format",
 since it is intended to be a foundation for further human authored changes.
 Such tools are acceptable to use, provided they follow a deterministic process
-and there is clearly defined copyright and licensing for their output.
+and there is clearly defined copyright and licensing for their output. Note
+in particular the caveats applying to AI code generators below.
+
+Use of AI code generators
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+TL;DR:
+
+  **Current QEMU project policy is to DECLINE any contributions which are
+  believed to include or derive from AI generated code. This includes ChatGPT,
+  CoPilot, Llama and similar tools**
+
+The increasing prevalence of AI code generators, most notably but not limited
+to, `Large Language Models <https://en.wikipedia.org/wiki/Large_language_model>`__
+(LLMs) results in a number of difficult legal questions and risks for software
+projects, including QEMU.
+
+The QEMU community requires that contributors certify their patch submissions
+are made in accordance with the rules of the :ref:`dco` (DCO).
+
+To satisfy the DCO, the patch contributor has to fully understand the
+copyright and license status of code they are contributing to QEMU. With AI
+code generators, the copyright and license status of the output is ill-defined
+with no generally accepted, settled legal foundation.
+
+Where the training material is known, it is common for it to include large
+volumes of material under restrictive licensing/copyright terms. Even where
+the training material is all known to be under open source licenses, it is
+likely to be under a variety of terms, not all of which will be compatible
+with QEMU's licensing requirements.
+
+With this in mind, the QEMU project does not consider it is currently possible
+for contributors to comply with DCO terms (b) or (c) for the output of commonly
+available AI code generators.
+
+The QEMU maintainers thus require that contributors refrain from using AI code
+generators on patches intended to be submitted to the project, and will
+decline any contribution if use of AI is either known or suspected.
+
+Examples of tools impacted by this policy includes both GitHub's CoPilot,
+OpenAI's ChatGPT, and Meta's Code Llama, amongst many others which are less
+well known.
+
+This policy may evolve as the legal situation is clarifed. In the meanwhile,
+requests for exceptions to this policy will be evaluated by the QEMU project
+on a case by case basis. To be granted an exception, a contributor will need
+to demonstrate clarity of the license and copyright status for the tool's
+output in relation to its training model and code, to the satisfaction of the
+project maintainers.
-- 
2.43.0

next prev parent reply	other threads:[~2024-05-16 16:23 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-16 16:22 [PATCH v2 0/3] docs: define policy forbidding use of "AI" / LLM code generators Daniel P. Berrangé
2024-05-16 16:22 ` [PATCH v2 1/3] docs: introduce dedicated page about code provenance / sign-off Daniel P. Berrangé
2024-05-16 17:29   ` Peter Maydell
2024-05-16 17:34     ` Michael S. Tsirkin
2024-05-16 17:43       ` Peter Maydell
2024-05-17  5:05         ` Thomas Huth
2024-05-17 10:03           ` Daniel P. Berrangé
2024-05-16 17:33   ` Michael S. Tsirkin
2024-05-17 11:09     ` Daniel P. Berrangé
2024-05-17 18:08   ` Alex Bennée
2024-05-16 16:22 ` [PATCH v2 2/3] docs: define policy limiting the inclusion of generated files Daniel P. Berrangé
2024-05-16 17:04   ` Michael S. Tsirkin
2024-05-17 10:51     ` Daniel P. Berrangé
2024-05-17 18:23   ` Alex Bennée
2024-05-28 15:41   ` Kevin Wolf
2024-05-16 16:22 ` Daniel P. Berrangé [this message]
2024-05-16 17:11   ` [PATCH v2 3/3] docs: define policy forbidding use of AI code generators Michael S. Tsirkin
2024-05-17 10:57     ` Daniel P. Berrangé
2024-05-16 17:20 ` [PATCH v2 0/3] docs: define policy forbidding use of "AI" / LLM " Michael S. Tsirkin
2024-05-16 17:34   ` Peter Maydell
2024-05-16 17:36     ` Michael S. Tsirkin
2024-05-21 14:27 ` Stefan Hajnoczi
2024-05-28 15:41 ` Kevin Wolf

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:eabb3e7c0 dfblob:846dda9a3 )
 OR (
bs:"[PATCH v2 3/3] docs: define policy forbidding use of AI code generators" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240516162230.937047-4-berrange@redhat.com \
    --to=berrange@redhat.com \
    --cc=agraf@csgraf.de \
    --cc=alex.bennee@linaro.org \
    --cc=armbru@redhat.com \
    --cc=kraxel@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mark.cave-ayland@ilande.co.uk \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=stefanha@redhat.com \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).