From: Junio C Hamano <gitster@pobox.com>
To: Jiang Xin <worldhello.net@gmail.com>
Cc: "Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com>,
	"Git List" <git@vger.kernel.org>,
	"Justin Tobler" <jltobler@gmail.com>,
	"Alexander Shopov" <ash@kambanaria.org>,
	"Mikel Forcada" <mikel.forcada@gmail.com>,
	"Ralf Thielow" <ralf.thielow@gmail.com>,
	"Jean-Noël AVILA" <jn.avila@free.fr>,
	"Bagas Sanjaya" <bagasdotme@gmail.com>,
	"Dimitriy Ryazantcev" <DJm00n@mail.ru>,
	"Peter Krefting" <peter@softwolves.pp.se>,
	"Emir SARI" <bitigchi@me.com>, "Arkadii Yakovets" <ark@cho.red>,
	"Vũ Tiến Hưng" <newcomerminecraft@gmail.com>,
	"Teng Long" <dyroneteng@gmail.com>,
	"Yi-Jyun Pan" <pan93412@gmail.com>
Subject: Re: [PATCH 0/2] Fix misaligned output of git repo structure
Date: Fri, 14 Nov 2025 11:22:54 -0800
Message-ID: <xmqqecq0ifld.fsf@gitster.g>
In-Reply-To: <CANYiYbGyGKy=S6a3NJFyrv-bOZos+BXdR=nPXDT3W_dGxeiNPA@mail.gmail.com> (Jiang Xin's message of "Fri, 14 Nov 2025 17:52:34 +0800")

Jiang Xin <worldhello.net@gmail.com> writes:

>> Is `Co-developed-by` supposed to have a different meaning than the more
>> common `Co-authored-by`?
>
> This is a very good question.
>
> **Background**
>
> At Alibaba Cloud, our development team uses a variety of AI coding tools,
> including Cursor, Claude Code, Gemini-CLI, Lingma, and Qoder, etc. To
> measure adoption—specifically, how many developers are using AI coding
> tools and how much code is AI-generated—we needed a unified tracking
> mechanism compatible with all these tools. I chose to implement a git
> commit-msg hook that automatically detects the AI coding tool responsible
> for a commit based on environment variables at commit time.

In other words, the addition of this trailer is solely to help
corporations like Alibaba measure which AI tools are used (and what
correlation there is between the success rate of the patches and
the tools that generated them, etc.).

What is in it for us?  What benefit are we getting in exchange for
tolerating these additional trailer lines in our log messages?

A few random thoughts about generated contents:

 * Disclosing the tools that were used during the development of a
   patch is a good practice in principle, but this is not limited to
   use of AI tools.  We have fixes for issues found with existing
   Coccinelle checks, sanitizers, and static checkers, and it is the
   usual practice for the patches that fix them to disclose how the
   author discovered the issue.  When making mechanical replacement
   changes en masse, it is the usual practice for the patches to
   describe what scripts were used to make the changes in them.  But
   we do not dedicate a trailer line for such a disclosure, and
   there is no reason why AI tools have to be treated specially here.
   Instead of "Co-developed-by" that only tells what tool was used,
   why not disclose what prompts (again, somehow AI tools are
   treated specially here, too---we call the input to these tools
   "scripts" when the changes were made with sed or perl or
   coccinelle) were used?

 * Whether some or all contents in a submitted patch were generated
   by tools, it does not change the obligation of the person who
   submits the patch.  They need to make sure that the changes are
   reviewable, that their goal and implementation are described
   appropriately in the proposed log message, and that the updated
   code does what the proposed log message claims it does.  They
   also need to make sure that they have the right to contribute the
   patch under the DCO, and sign off their patch accordingly.

 * What is made more difficult for a submitter with AI tools is that
   it is often not obvious to the human developer how much of the
   tools' generated output is parroting what the tools saw during
   their training session, and what the licensing terms of these
   training materials are.  Even if a hypothetical AI tool were
   trained only with BSD licensed material, the output from such a
   tool is likely to hold you under certain obligations like
   including the original copyright notice, but unless the tool
   discloses its sources to you, the human developer, you do not
   even know whose copyright notice to include.

 * Worse yet, the above difficulty is only for the submitter of such
   a patch, not the project that, trusting what the sign-off of the
   submitter certifies, reviews and accepts such a patch.  It does
   not make any difference if the original submitter copied and
   pasted proprietary code of their employer in the patch, or
   included code that AI tools "borrowed" from elsewhere without
   following proper procedure to honor the licensing terms.  In
   either case, the project may have accepted what was stolen
   without knowing, and it is very likely that the submitter, not
   the project, is primarily held liable.  In a sense, the project
   would be better off if the patch does not say it was generated
   with AI tools---if the project does not know, it cannot possibly
   be held liable for it, even though the project will have to waste
   engineering resources to rewrite or remove the remnant from such
   a faulty contribution.
