From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 10 Mar 2026 08:47:21 -0400
From: "Theodore Tso"
To: EJ Stinson
Cc: "H. Peter Anvin" , Jonathan Corbet , Steven Rostedt ,
	Christian Brauner , tech-board-discuss@lists.linux.dev,
	linux-kernel@vger.kernel.org,
	ksummit-discuss@lists.linuxfoundation.org,
	christianvanbrauner@gmail.com
Subject: Re: LLM based rewrites
Message-ID: <20260310124721.GB14867@macsyma-wired.lan>
References: <20260307-clean-room-6118793eb175@brauner>
 <20260309095705.7a6b6177@gandalf.local.home>
 <20260309121629.21cabc25@gandalf.local.home>
 <871phtvu7r.fsf@trenco.lwn.net>
 <04B897EF-DEEC-42D0-8E00-888CEEA5318E@zytor.com>
 <20260310045210.GA14867@macsyma-wired.lan>

On Mon, Mar 09, 2026 at 10:15:28PM -0700, EJ Stinson wrote:
> Imagine if a rogue AI got access to rewriting the kernel, or was
> exploited, this would lead to near certain catastrophe.
> LLMs should not rewrite the code, as if somehow an AI were to achieve
> singularity or go rogue / be attacked by an anarchistic / foreign
> actor, think about the amount of code it could sneak in without human
> suspicion, or just lead to human ignorance. I think for the time
> being, until we know for certain, there should be no reason to use
> LLMs to help rewrite at scale any sort of code. Even if we were able
> to prove it wasn't stolen code, the time spent on proving such fact,
> and ensuring the security, would already take way too long to merit
> this sort of use.

I think you're misunderstanding the concern that was raised at FOSDEM,
which is that it is now possible for companies to take code that might
be licensed under a license such as the GPL, ask an AI to do a "clean
room rewrite", and then use or relicense the resulting code under a
more permissive license, such as Apache or BSD --- or the company
might take that code and use it in a proprietary codebase.  "It's the
end of the world as we know it....."

There are a few problems with their premise.  The first is that they
demonstrated this on some very simple bits of Javascript.  It's not
clear whether this would work at *all* on something more complicated,
never mind something like the Linux kernel.  The second is the legal
issues: there are multiple dimensions to whether the resulting code
really would be considered free and clear for relicensing.  And the
third is whether it would really result in more secure code (which was
their premise for why some companies might do this, since the people
giving the presentation at FOSDEM were security researchers).  Given
that AI generated code is generally *more* likely to have security
vulnerabilities than human written code, this assumption seems dubious
to me.
Also, if the security vulnerability is inherent in the software
architecture, having the first LLM generate a spec might result in a
*spec* which is buggy / vulnerable, and so when the second LLM
translates that spec back into C code, not only might it introduce new
security vulnerabilities, the original security vulnerability present
in the source implementation might be preserved.

The bottom line is that I rate the FOSDEM talk as 10/10 when they talk
about the history of copyright, 9/10 when they talk about the history
of clean room reimplementation (which has been around for about as
long as humans have), 3/10 when they talk about what's possible in the
present, and 5/10 when they talk about the future --- since their
whole point was to start a conversation, and they certainly did that.

One thing we need to remember, though, is that we don't have the power
to stop people from doing this.  For that matter, it could be that
there are sweatshops in some third world country where people have
been reimplementing open source code into proprietary code, and that
could have been happening for years or even decades --- if the
resulting rewrite gets used in some proprietary code base, we'd never
know about it.  The only thing AI could potentially do is democratize
this, so that any random person with a few thousand dollars of LLM
credits might be able to attempt it.  And even if today's LLMs aren't
really up to the task for non-trivial programs, that could change over
time.

If that happens, though, it's not just Open Source that is going to be
affected.  There are lots of people predicting that people graduating
with CS degrees are going to be left begging in the streets, since
whether we're talking about new proprietary code or new open source
code, an AI bot, perhaps with some help from a senior developer to
guide the LLM, will mean that we won't need all that many (or perhaps
*any*) junior programmers.
Is that hysteria and overblown hyperbole?  Maybe.  The other
possibility is that this will be the beginning of something similar to
what happened to the textile artisans who made cloth by hand in the
early 1800's, when mechanized power looms made their jobs....
obsolete.  Look up "Luddite" on Wikipedia for more details.  What
happened really *sucked* for the people who made cloth the old way,
but the result was the ability for people to buy shirts for
significantly less than a year's worth of wages for the average
laborer.

Will AI do to Software Engineers what the early industrial revolution
in England did to people like Ned Ludd?  Who knows?  But if it
happens, it isn't going to be just Open Source that will be affected.
And in the meantime, people who design clothes and fabric patterns
still have jobs, even today in the 21st century.

Cheers,

					- Ted