public inbox for linux-kernel@vger.kernel.org
From: Roman Gushchin <roman.gushchin@linux.dev>
To: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Theodore Ts'o <tytso@mit.edu>,
	 Guenter Roeck <linux@roeck-us.net>,
	 Konstantin Ryabitsev <konstantin@linuxfoundation.org>,
	 Chris Mason <clm@meta.com>,  SeongJae Park <sj@kernel.org>,
	 elkin@google.com,  Christian Brauner <brauner@kernel.org>,
	 Dmitry Vyukov <dvyukov@google.com>,
	 Sasha Levin <sashal@kernel.org>,
	 Shakeel Butt <shakeel.butt@linux.dev>,
	 Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	 Sean Christopherson <seanjc@google.com>,
	 Ian Rogers <irogers@google.com>
Subject: Re: Introduce Sashiko (agentic review of Linux kernel changes)
Date: Thu, 19 Mar 2026 15:33:38 -0700
Message-ID: <87jyv7a1q5.fsf@linux.dev>
In-Reply-To: <34630bb5-840b-4a99-8e19-51fd4fc8ba96@lucifer.local> (Lorenzo Stoakes's message of "Wed, 18 Mar 2026 18:50:27 +0000")

"Lorenzo Stoakes (Oracle)" <ljs@kernel.org> writes:

> On Wed, Mar 18, 2026 at 11:33:22AM -0700, Roman Gushchin wrote:
>> "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> writes:
>>
>> > On Tue, Mar 17, 2026 at 03:31:11PM +0000, Roman Gushchin wrote:
>> >> Hello,
>> >>
>> >> I'm happy to share something my colleagues and I have been working on
>> >> for the last several months:
>> >> Sashiko - an agentic system for Linux kernel changes.
>> >>
>> >> First, Sashiko is available as a service at:
>> >>   * https://sashiko.dev
>> >>
>> >
>> > ...
>> >
>> > (For one I'm going to go fix some bugs on my series I saw reported there).
>> >
>> > I think over time as the approach/model is refined this will get a LOT
>> > better, it seems these things can accelerate quickly.
>>
>> Hi Lorenzo,
>>
>> Thank you for the kind words!
>
> No problem, thanks for your hard work! :)
>
>>
>> RE false positives: I think Chris's prompts were initially heavily
>> biased towards avoiding false positives, but it comes at the cost of
>> missing real issues (in general, I don't have hard data on % of findings).
>> Now he also is looking to relax it a bit, to my knowledge.
>> But then there are different models in use, different protocols, etc.
>>
>> I also have a notion of issue severity and I was thinking about
>> e.g. sending out only reviews revealing critical & high severity bugs
>> (e.g. memory corruptions & panics). Or maybe send the feedback to the
>> author in any case (e.g. for fixing typos), but cc maintainers only if
>> there are serious concerns.
>>
>> And obviously no pressure, I won't enable any public email sending
>> unless there is a consensus across maintainers of the corresponding
>> subsystem.
>
> I think maybe an opt-in thing might work for some of us?

Absolutely, I think with mm we can start with replying to the author and
a dedicated list of volunteers.
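
For illustration only, the routing idea discussed above (opt-in per
subsystem, always reply to the author and volunteers, cc maintainers only
for serious findings) could be sketched roughly as follows. This is not
Sashiko's actual code or configuration: Severity, pick_recipients and the
HIGH threshold are hypothetical names and choices made up for this sketch.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; the real scale is not specified here."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def pick_recipients(author, severities, volunteers=(), maintainers=(),
                    opted_in=False):
    """Return the addresses a review would be mailed to (possibly none)."""
    # Nothing is sent unless the subsystem opted in and there are findings.
    if not opted_in or not severities:
        return []
    # The author and the volunteers get the feedback in any case.
    recipients = [author, *volunteers]
    # Maintainers are cc'ed only for serious concerns (assumed: HIGH and up).
    if max(severities) >= Severity.HIGH:
        recipients.extend(maintainers)
    # Drop duplicates while preserving order.
    return list(dict.fromkeys(recipients))
```

With this sketch, a review containing only low-severity findings would go
to the author and the volunteers, while one containing a high or critical
finding would also cc the maintainers.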

> But yeah we can take our time with this, Andrew is looking, I am for
> sure.

Thank you!

>
> Oh and one data point -
> https://lore.kernel.org/linux-mm/cover.1773846935.git.ljs@kernel.org/
>
> Read the v3 change log for a list of the issues it correctly raised for that
> series, so it's definitely useful.
>
> It was about maybe 50/50 noise/signal I think?
>
> But as you can see that's already very useful thank you and has fixed a
> bunch of bugs in that code!
>
> I'm not sure what Chris is planning, and I keep not going to the AI
> meetings for various reasons (other stuff clashing/away/tired sometimes :)
> but I wonder how we will sync up with Chris's review bot experiments?

So as Chris said, we're syncing regularly and actively thinking about how
to organize it. I think we both want to share as much stuff as possible.

The hard part is that we can't easily test each other's setup and it's
all very brittle. Initially I tried to use Chris's prompts directly with
only minimal changes, but it was hard to keep Sashiko stable. Plus the
new multi-stage protocol improved the discovery rate by almost 10%,
which was hard to ignore.

My current thinking (and things are evolving quickly, so I might have a
different opinion in a couple of weeks) is that we need to separate
per-subsystem knowledge, make sure it doesn't contain any imperative
instructions or LLM/tool specifics and share it completely. We can move
it to a separate repo or even put into the kernel tree, it's all
debatable. In a way, these prompts should be owned by subsystem
maintainers more than anyone else.

Then there are things which can be shared, but are not subsystem-specific.
E.g. an instruction on how to assess issue severity.

And then there is a specific review protocol, which significantly
depends on the tooling and LLM being used. This part is hard to share,
but also it's the place where a lot of experimentation is happening,
so maybe it's fine to have multiple tools. And they might be optimized
for different use cases: e.g. for personal development it might be
beneficial to have a live interaction with an LLM on the review material
(someone already asked me about this); but for sashiko.dev's mass review
case I do care a lot about stability and token efficiency.

>> >>
>> >> * What's next?
>> >>
>> >> This is our first version and it's obviously not perfect. There is a
>> >> long list of fixes and improvements to make. Please, don't expect it to
>> >> be 100% reliable, even though we'll try hard to keep it up and running.
>> >> Please use github issues or email me any bug reports and feature
>> >> requests, or send PR's.
>> >
>> > Of course, it's all much appreciated!
>> >
>> >>
>> >> As of now, Sashiko only provides a web interface;
>> >> however, Konstantin Ryabitsev is already adding sashiko.dev support to b4,
>> >> and SeongJae Park is adding support to hkml.
>> >> That was really fast, thank you!
>> >
>> > Thanks to Konstantin and SJ too, but the web interface is pretty nice I
>> > must say, so thanks for that! :)
>> >
>> >>
>> >> We're working on adding an email interface to Sashiko, and soon Sashiko
>> >> will be able to send out reviews over email - similar to what the bpf
>> >> subsystem already has. It will be opt-in by subsystem and will have options
>> >
>> > Like I said, I think it's a bit premature for mm at least _at this point_
>> > but I'm sure it'll get there.
>>
>> I'd really appreciate (and actually need) your and other maintainers' and
>> developers' feedback here. Even though I can't fix every single false
>> positive as a code issue, I can hopefully tackle some common themes.
>
> Is there a way for us to point out which parts of a review are signal and
> which are noise?

Not yet. I think answering emails is the easiest part and I plan to
teach Sashiko to recognize these answers and analyze them. Maybe Sashiko
can even adjust its own prompts in a (semi)-automatic way, Idk.

>
> If you could update the web interface for feedback that'd be really handy,
> though I guess there's the painful stuff of having to have users and
> etc. for that :)

Yeah, I'm afraid we might end up trying to build a new JIRA this way...

>
>>
>> Chris did fantastic work on the bpf subsystem (and several others) by
>> manually analyzing replies to the AI feedback and adjusting prompts. Now
>> we need to repeat this for all other subsystems.
>
> Yeah, I'm happy to give feedback if there's a fairly low friction way of doing
> it, but constant workload makes it hard if it requires much more
> effort :)

Can't agree more :)

>
>>
>> >
>> > For now I think we need to get the false positive rate down a fair bit
>> > otherwise it might be a little distracting.
>> >
>> > But people are _already_ integrating the web interface into workflows, I
>> > check it now, and Andrew is already very keen :) see:
>> >
>> > https://lore.kernel.org/all/20260317121736.f73a828de2a989d1a07efea1@linux-foundation.org/
>> > https://lore.kernel.org/all/20260317113730.45d5cef4ba84be4df631677f@linux-foundation.org/
>> >
>> >> to CC only the author of the patch, maintainers, volunteers, or send a
>> >> fully public reply. If you're a maintainer and have a strong preference
>> >> to get reviews over email, please let me know.
>> >
>> > Well as maintainer I think 'not quite yet' but probably soon is the answer
>> > on that one!
>> >
>> >>
>> >> We also desperately need better benchmarks, especially when it comes to
>> >> false positives. Having a decent vetted set of officially perfect
>> >> commits can help with this.
>> >
>> > Not sure perfect commits exist in the kernel, certainly not mine :P
>>
>> Same here :) This is why it's so hard.
>
> Yes, but worthwhile! LLMs are surprisingly good at figuring out issues in
> things, it's a real strength.
>
> And it's already improving the code.
>
>>
>> >
>> >>
>> >> Finally, some subsystems have good prompt coverage and some don't. It
>> >> doesn't have to be lengthy documentation (and it might actually be
>> >> counter-productive), but having a small list of things to look at - some
>> >> high-level concepts which are hard to grasp from the code, etc. - can
>> >> help a lot with both bug discovery and false positives.
>> >
>> > I guess best contributed to Chris's review-prompts repo right?
>>
>> Both work for me now, we'll figure out with Chris how to sync our
>> prompts. The small problem is that we're using various models, tools and
>> review protocols and can barely test each other's setup. And it's all
>> very fragile, so it's not exactly trivial.
>> But we'll figure out something soon.
>
> Yeah, part of the fun I guess :)
>
>>
>> In general we need to carefully separate instructions (like which tools
>> to use, which prompts to load etc) from factual data. Then we can easily
>> use the factual data with various tooling around.
>
> Hopefully I find some time to contribute some mm-specific stuff too :)

Awesome, waiting for it!

Thanks!
