Date: Wed, 27 May 2026 10:41:36 +0200
From: Kevin Wolf
To: Warner Losh
Cc: "Michael S. Tsirkin", qemu-devel@nongnu.org, stefanha@redhat.com
Subject: Re: on ai generated and code provenance

On 26.05.2026 at 21:52, Warner Losh wrote:
> On Tue, May 26, 2026 at 1:32 PM Michael S. Tsirkin wrote:
>
> > On Tue, May 26, 2026 at 08:59:55PM +0200, Kevin Wolf wrote:
> > > So yes, trivial patches are another obvious starting point. The challenge
> > > there is defining the line where a patch stops being trivial. So I'm not
> > > completely sure if making this distinction in a policy is a good idea;
> > > maybe practically speaking it has to be all or nothing in terms of
> > > creativity (for lack of a better word).
> >
> > Let the maintainers decide?
> >
> > Or we can enumerate things:
> > - fixing tool (compiler/checkpatch/smatch) errors/warnings in obvious ways
> >   (e.g. suggested by the tool itself, such as initializing an
> >   uninitialized variable)
> > - propagating API changes (e.g. rebasing a patch after an API change)
> > - anything that could be done by a perl/sed/coccinelle script
> > - adding or fixing code comments
> >
>
> Those are good examples. Perhaps the following words are a good place to start
> to frame what I've seen expressed here:
>
> The QEMU Project currently may accept limited uses of AI that produce
> high quality patches that are limited in the creative content added.
> While maintainers will ultimately decide, changes like the following
> fall within this policy:
> 1. Fixing obvious warnings in the obvious ways suggested by the tool
> 2. Tree-wide API changes, and other similar mechanical changes done
>     today with perl/python/sed/coccinelle

As I said in the paragraph you quoted below, I don't think we should
encourage using AI for tasks that a deterministic tool could do. If you
can use a deterministic tool like sed or Coccinelle for the job, you
should. I know that writing Coccinelle spatches can be challenging; that
is the part that you can ask AI to help with.
(Perl and Python follow the same logic as long as the script is simple,
but obviously you have to stop when the helper script becomes almost as
complex as the change itself.)

Letting AI perform the change directly instead may be an acceptable
shortcut for a one-man hobby project that nobody else will ever look at,
but in the context of a community project like QEMU in which your
changes have to be reviewed and understood by others, it matters a lot
that the output of the tool is reproducible. Otherwise, you're creating
unnecessary work for others, and that isn't acceptable.

So maybe we should even explicitly mention a recommendation like the
following:

    If you can use a deterministic tool, don't use AI instead. If you
    don't know how to use the deterministic tool, use the AI to tell you
    how to use it instead of trying to replace it.

> 3. Limited, small changes to fix bugs or add a small new feature whose
>     scope is less than about 100 lines and the originator can explain
>     them all or the meta issues about the patch.

Not sure if mentioning a number of lines is wise. 100 lines can be
mostly boilerplate and simple sequential code or they can be a deeply
nested complex algorithm.

> Maintainers are free to accept or reject changes outside these
> guidelines, but please check with the maintainers before sending to
> keep the load from AI content to something they can manage. Large and
> Very Large patches, especially ones that have not been deeply
> analyzed and tested by humans, should be avoided.
>
> Though maybe the list of 'exceptions' needs work. But the basic
> framing is that we will accept some high quality patches. Maintainers
> have some discretion for larger pieces to a point, and we still don't
> want to drown in AI slop.

Yes, if we decide that we do want to make patch complexity/creative
expression/whatever you may call it part of the criteria, then having a
list like this looks like a possible approach.
The details of what exactly should be in it would certainly lead to more
discussion, though.

Kevin

> Warner
>
> > > As an aside, personally, I'm not convinced that AI can be a "better
> > > sed". If it's really about mechanical changes, I think the resulting
> > > patch is much more reviewable if the agent doesn't modify the code, but
> > > just generates the sed command line or the Coccinelle patch and that is
> > > included in the commit message. Reviewers can then just review that and
> > > then reproduce the result themselves for comparison. This is impossible
> > > with AI prompts, and agents do tend to forget an instance of something to
> > > replace here and there, so you do have to review the result carefully.
> > >
> > > But none of these "better sed" problems need to be handled in an AI policy.
> > > If a patch is hard to review, the maintainer will already reject it on
> > > those grounds.
> >
> > Absolutely.
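
To make the reproducibility argument from the quoted "better sed"
paragraph concrete, here is a minimal sketch of what a reviewer can do
when the rewrite rule itself is part of the commit message. It uses
Python's re module as a stand-in for sed or Coccinelle, and the names
old_helper and new_helper are hypothetical, chosen only for
illustration; they are not real QEMU functions.

```python
import re

# Hypothetical rename used only for illustration; neither name is a real
# QEMU API. The rule is what would be recorded in the commit message.
PATTERN = re.compile(r"\bold_helper\(")
REPLACEMENT = "new_helper("


def apply_rewrite(source: str) -> str:
    """Apply the recorded substitution to every match in the text."""
    return PATTERN.sub(REPLACEMENT, source)


def reviewer_check(original: str, submitted: str) -> bool:
    """Re-run the recorded rewrite and compare it with the submitted result."""
    return apply_rewrite(original) == submitted


original = (
    "a = old_helper(x);\n"
    "b = my_old_helper(y);\n"
    "c = old_helper(z);\n"
)

# What the deterministic rule produces: both calls renamed, the
# unrelated my_old_helper() left alone because of the word boundary.
complete = (
    "a = new_helper(x);\n"
    "b = my_old_helper(y);\n"
    "c = new_helper(z);\n"
)

# A hand-edited or agent-edited result that forgot one instance.
missed_one = (
    "a = new_helper(x);\n"
    "b = my_old_helper(y);\n"
    "c = old_helper(z);\n"
)

if __name__ == "__main__":
    print("complete patch reproduces:", reviewer_check(original, complete))
    print("patch with a missed instance reproduces:",
          reviewer_check(original, missed_one))
```

The reviewer only has to read the one-line rule and re-run it; a patch
that silently missed an instance fails the comparison instead of having
to be caught by eye.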