Maintainer workflows discussions


* Re: Stop false review statements
From: SeongJae Park @ 2026-05-18  2:12 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: SeongJae Park, Greg KH, Konstantin Ryabitsev, Guenter Roeck,
	Krzysztof Kozlowski, sashiko-bot, sashiko-reviews, sashiko,
	Linux Kernel Workflows, Linux Kernel Mailing List, devicetree,
	kfree
In-Reply-To: <0902F8E6-C495-40A1-975D-92D3B72D44AE@linux.dev>

On Sat, 16 May 2026 08:49:39 -0700 Roman Gushchin <roman.gushchin@linux.dev> wrote:

> 
> > On May 16, 2026, at 8:45 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> > 
> > On Sat, May 16, 2026 at 08:41:43AM -0700, Roman Gushchin wrote:
> >> 
> >>>> On May 16, 2026, at 8:20 AM, Konstantin Ryabitsev <mricon@kernel.org> wrote:
> >>> 
> >>> On Sat, May 16, 2026 at 05:11:28AM -0700, Guenter Roeck wrote:
> >>>>> On Sat, May 16, 2026 at 10:05:02AM +0200, Krzysztof Kozlowski wrote:
[...]
> >> The goal here is to inform maintainers that sashiko has successfully reviewed the patch
> >> and there were no findings, otherwise maintainers have to go to the web site and check the status.

Yes, this will be helpful.  I think it would also help to send notifications
for review failures (usually caused by the patch failing to apply), or a
general summary of review results in every case (maybe opt-in?).

> > 
> > That's fine.
> > 
> >> I’m not attached to any specific form of it, I thought Reviewed-by is the most obvious form.
> >> And we have used Reported-by: tags with various tooling for years.
> > 
> > Reported-by: shows the existence of a problem that some tool found, a
> > subtle difference here.
> > 
> >> What do you think is the best form?
> >> 
> >> I’ll pause sending reviewed-by tags until we have a discussion and agreement here.
> > 
> > Just say it in some other text form, that our tools will not pick up.
> > Like:
> >    Tool XXXX reports that all is good:
> >        https://....
> > 
> > or something like that?
> 
> Sure, works for me.

+1.  I also felt that Reviewed-by: is at least controversial.


Thanks,
SJ

[...]


* Re: Stop false review statements
From: Laurent Pinchart @ 2026-05-17 22:22 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Greg KH, Mauro Carvalho Chehab, Roman Gushchin,
	Krzysztof Kozlowski, debarbos, Arnaldo Carvalho de Melo,
	Konstantin Ryabitsev, Guenter Roeck, sashiko-bot, sashiko-reviews,
	sashiko, Linux Kernel Workflows, Linux Kernel Mailing List,
	devicetree, kfree
In-Reply-To: <20260517162912.GA51520@macsyma-wired.lan>

On Sun, May 17, 2026 at 12:29:12PM -0400, Theodore Tso wrote:
> It should also be noted that Intel's zero-day bot was (a) closed
> source, and (b) was sending its test regression reports with the
> linux-kernel mailing list cc'ed, and no one really complained because
> it was so useful, and if Intel was willing to use very expensive
> hardware in their data center to contribute reports, so long as the
> reports were useful and the false-positive noise was low enough, we
> decided to be grateful and not worry (too much) about the fact that
> Intel's zero-day bot was closed source.  (There was indeed some
> grumbling in the bar at Plumbers, of course.  :-)

The 0-day bot was a closed-source front-end to orchestrate analysis
tools that are open-source (compilers, static analyzers, ...). Sashiko
is an open-source front-end to orchestrate analysis tools that are
closed-source. That's the complete opposite, so I'm not sure how
relevant the comparison is. Comparing with Coverity may be more
relevant.

> In my opinion, we should be doing the same for Sashiko, and that's the
> decision which the ext4 developers have made --- at least for ext4
> patches, after an experiment where we only sent reviews to the patch
> authors and the maintainer, people were satisfied that false positive
> rate was low enough (with the caveats that I had previously mentioned,
> but we were willing to live with them because at least for us, it was
> useful enough), that we will be requesting that Sashiko reviews be
> cc'ed to the ext4 mailing list.
> 
> I realize that there are some extra sensitivities around AI / LLM's,
> but from the perspective of reviewing patches, I don't see any
> difference between this and other closed source tools that we've used,
> such as Coverity and the Zero-day bot.  Not everyone will agree, of
> course, but at the moment, this is a decision that we are making on a
> subsystem by subsystem basis, which again, has strong historical
> precedence.

-- 
Regards,

Laurent Pinchart


* Re: Stop false review statements
From: Mauro Carvalho Chehab @ 2026-05-17 22:05 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Theodore Tso, Greg KH, Krzysztof Kozlowski, debarbos,
	Arnaldo Carvalho de Melo, Konstantin Ryabitsev, Guenter Roeck,
	sashiko-bot, sashiko-reviews, sashiko, Linux Kernel Workflows,
	Linux Kernel Mailing List, devicetree, kfree
In-Reply-To: <4ECB5626-01E2-4AEB-AD11-524AB224CAA0@linux.dev>

On Sun, 17 May 2026 12:42:12 -0700
Roman Gushchin <roman.gushchin@linux.dev> wrote:

> 
> > On May 17, 2026, at 11:57 AM, Theodore Tso <tytso@mit.edu> wrote:
> > On Sun, May 17, 2026 at 11:17:06AM -0700, Roman Gushchin wrote:  
> >> 
> >> I actually tried to run it with ollama on my
> >> personal framework 13. Adding nominal support is trivial, but the
> >> whole thing is not really useful: I can get maybe a few hundred
> >> tokens per second using a quantized model with reduced quality; an
> >> average sashiko review is consuming 3.5 million tokens (with Gemini
> >> 3.1 pro, it’s also model-dependent).  
> > 
> > I'm curious.  What hardware and LLM model were you using?  A few
> > hundred tokens per second seems surprisingly high.  My initial
> > research[1] shows that an M5 Max Macbook Pro costing 5 or 6 kilobucks
> > can do 31.6 tokens/second on a 27B 4-bit quantized model (Qwen 3.5).
> 
> I have a Framework 13 with an AMD 7840U. I’ve tried several models on both CPU and GPU.
> Sorry, it was a couple of months ago and I don’t remember all the details, so I won’t
> claim any specific numbers, but as I remember the best numbers were around
> a hundred tokens per second. In any case it’s a few orders of magnitude slower than
> what is realistically required.
>
> If someone has powerful hardware and is willing to benchmark sashiko with open-source
> models, I’m very interested in the results.

If you post the patch you used with ollama somewhere, I can try
running it here and do some benchmarks, assuming that it won't
try to consume 3.5 million tokens.


> 
> > [1] https://www.reddit.com/r/LocalLLaMA/comments/1rzkw4x/m5_max_128g_performance_tests_i_just_got_my_new/
> > 
> > The model matters of course.  With Gemma 3 27B and a 6-bit
> > quantization, it's 21 tokens/s, and with Deepseek R1 8B Q6_K, it's
> > 72.8 tokens/second.  But unless you're using a really low-end model,
> > or a really expensive, splufty hardware platform, I haven't seen
> > reports of hundreds of tokens per second on hardware costing a
> > reasonable amount of memory.  (I'll set aside the question of whether
> > spending $6k for a fully spec'ed out M5 Max Macbook Pro, or $15k for a
> > fully spec'ed out M3 Ultra Mac Studio is "reasonable".)
> > 
> > As a result I'm not entirely sure how realistic it is to do reviews
> > using "free" (you still have to pay $$$ for the hardware) local,
> > open-weight LLM's if an average review requires around 3.5 million
> > tokens.  
> 
> Fully agree. But it might change in a few years, things are moving quickly.


Thanks,
Mauro


* Re: Stop false review statements
From: Danilo Krummrich @ 2026-05-17 21:25 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Krzysztof Kozlowski, Greg KH, Konstantin Ryabitsev, Guenter Roeck,
	Miguel Ojeda, sashiko-bot, sashiko-reviews, sashiko,
	Linux Kernel Workflows, Linux Kernel Mailing List, devicetree,
	kfree
In-Reply-To: <DIL2P8CHKVZD.2WVQQRN0FM28N@kernel.org>

On Sun May 17, 2026 at 5:56 PM CEST, Danilo Krummrich wrote:
> However, I still have the same concern I raised previously when it comes to
> email delivery: I think that when sashiko sends feedback to contributors
> (without Cc'ing the mailing list and all other recipients), it should actively
> ask the contributor to raise things on the list with all other recipients,
> reviewers and maintainers before acting on them, such that changes subsequent to
> the first submission on the list are aligned.
>
> Can this be added please?

I'm also happy to send a PR of course.

Thanks,
Danilo


* Re: Stop false review statements
From: Roman Gushchin @ 2026-05-17 19:53 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Greg KH, Krzysztof Kozlowski, debarbos, Arnaldo Carvalho de Melo,
	Konstantin Ryabitsev, Guenter Roeck, sashiko-bot, sashiko-reviews,
	sashiko, Linux Kernel Workflows, Linux Kernel Mailing List,
	devicetree, kfree


> On May 17, 2026, at 11:56 AM, Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> On Sun, 17 May 2026 11:17:06 -0700
> Roman Gushchin <roman.gushchin@linux.dev> wrote:
> 
>>> On May 17, 2026, at 9:40 AM, Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>>> On Sun, 17 May 2026 12:12:00 +0200
>>> Greg KH <gregkh@linuxfoundation.org> wrote:
>>>>> On Sun, May 17, 2026 at 12:05:56PM +0200, Mauro Carvalho Chehab wrote:
>>>>> On Sat, 16 May 2026 14:59:44 -0700
>>>>> Roman Gushchin <roman.gushchin@linux.dev> wrote:
>>>>>>> On May 16, 2026, at 2:33 PM, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>>>>>>> I find it opposite: clogging commits with useless information, because
>>>>>>> some arbitrary and completely closed-source tool did analysis means
>>>>>>> nothing to me one year later when I look at the commit in the Git history.      
>>>>>> This is simply not true: Sashiko is fully open-source, under Apache 2.0 license
>>>>>> and the code belongs to LF.     
>>>>>> Yes, the instance behind sashiko.dev is using
>>>>>> Gemini 3.1 Pro LLM, which is not open-source, but it’s not a fundamental limitation -
>>>>>> Sashiko is supporting various LLMs, including open models - it’s just a practical
>>>>>> choice: to my knowledge the quality of open models is not on par with frontier closed
>>>>>> models     
>>>>> I would very much prefer using an open source LLM, even if not on par
>>>>> with latest paid models.
>>>>>> and it would require a non-trivial amount of hardware and infrastructure to run
>>>>>> an open model at the required scale.    
>>>>> IMHO the best would be to have them running on some infra that would accept
>>>>> open source models (*). If there aren't enough resources to have our own
>>>>> infra, there are offers out there which allows running open source models
>>>>> like https://ollama.com/pricing (I never used myself).
>>>>> (*) For instance, Qwen3.6 is brand new and licensed under apache-2.0.
>>>>>   Not bad on my tests running it locally.    
>>>> You can run the tool locally, with whatever model you want, if you want
>>>> to.
>>>> But for now, let's just take the free credits that Google is willing to
>>>> throw at this thing and let it give us reviews IF the maintainer of the
>>>> subsystem feels it is something they want to do.  No one is forcing
>>>> maintainers to do this.  
>>> If Google and/or others are willing to give free credits on their cloud,
>>> they could instead or in addition give free credits to run ollama
>>> there, allowing us to use different models.
>>> From my side, while I won't personally object getting reviews from
>>> Sashiko/Gemini, this is something I can't reproduce locally. I would
>>> very much want something where I can select my LLM preferred model
>>> and run on my ollama docker container on my own GPU, in a way that
>>> I could run it locally before even sending a patch series.  
>> 
>> 2 thoughts here:
>> 1) I actually tried to run it with ollama on my personal framework 13. Adding nominal support is trivial,
>> but the whole thing is not really useful: I can get maybe a few hundred tokens per second using
>> a quantized model with reduced quality; an average sashiko review is consuming 3.5 million tokens
>> (with Gemini 3.1 pro, it’s also model-dependent).
> 
> Do you mean 3.5 million tokens per patch series? If so, that
> sounds like a lot! Why does it require so many tokens?

It’s an average per patch, not per series. Some are much cheaper, some are much more expensive.
Sashiko posts the token cost next to each review.

Why does it use so many tokens? Because in many cases it has to dig deep into the code.
Long sessions with multiple tool calls are expensive. Also, Sashiko has a multi-stage
architecture: effectively, it reviews every patch multiple times from different angles.
This has a measurable influence on the quality of reviews. The current generation of LLMs
is not good at spotting various types of issues at once: once a model sees a memory leak,
it can’t think anymore about e.g. locking issues. Also, just by running the same thing multiple times
and combining the results you can meaningfully improve the quality.
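
The multi-stage idea described above can be sketched roughly as follows. This is
only an illustration and not Sashiko's actual code: `multi_pass_review`,
`fake_reviewer`, the focus names and the de-duplication rule are all hypothetical,
and the stand-in reviewer replaces what would really be an LLM call.

```python
from typing import Callable, Iterable, List

# Hypothetical reviewer signature: (patch text, focus area) -> findings.
Reviewer = Callable[[str, str], Iterable[str]]


def multi_pass_review(patch: str, focuses: List[str], reviewer: Reviewer,
                      runs_per_focus: int = 2) -> List[str]:
    """Run one focused pass per (focus, run) and merge unique findings in order."""
    merged: List[str] = []
    seen = set()
    for focus in focuses:
        for _ in range(runs_per_focus):
            for finding in reviewer(patch, focus):
                key = finding.strip().lower()
                if key and key not in seen:
                    seen.add(key)
                    merged.append(finding.strip())
    return merged


def fake_reviewer(patch: str, focus: str) -> List[str]:
    """Stand-in for an LLM call; each pass only looks at one class of issue."""
    if focus == "memory" and "kmalloc" in patch and "kfree" not in patch:
        return ["possible memory leak: kmalloc without kfree"]
    if focus == "locking" and "spin_lock" in patch and "spin_unlock" not in patch:
        return ["lock taken but never released"]
    return []


patch = "p = kmalloc(size, GFP_KERNEL); spin_lock(&l);"
findings = multi_pass_review(patch, ["memory", "locking"], fake_reviewer)
for finding in findings:
    print(finding)
```

Each pass costs a full session, so the token cost grows with the number of
focus areas times the number of repeated runs.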

>> I’m personally all in on having the entire thing as open as possible and I believe Sashiko is what
>> is realistically the best at this moment - a fully open-source harness and set of prompts which
>> can work with a variety of models.
>> I’m happy to merge support for any LLM which can produce decent review results.
>> 
>> 2) Due to the probabilistic nature of LLMs, nothing is reproducible in a strict sense of the word.
>> Even with exactly the same model/harness/prompts you’ll get different results every time you run it.
>> It’s unfortunate, but it is what it is at the moment.
> 
> By "reproduce locally", I didn't mean in the strict sense. Sure, LLM answers
> won't be identical, but I suspect that at least most of the major issues
> on a patch series would be reported by any decent model.

I believe we’re not quite there yet. Models do differ in their ability to spot
various types of bugs and in how many false positives they produce. Some types of issues
(e.g. complex locking issues) are really hard even for the best of the current models.

> So, if we have something that one can run locally on their own GPU, being
> able to get an answer in the range of a couple of minutes per patch
> should be enough to catch most of the issues.

I’m happy to be wrong here, but my understanding is that it’s not realistic now.
Sashiko reviews take longer than that even on production-grade hardware.


* Re: Stop false review statements
From: Roman Gushchin @ 2026-05-17 19:42 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Mauro Carvalho Chehab, Greg KH, Krzysztof Kozlowski, debarbos,
	Arnaldo Carvalho de Melo, Konstantin Ryabitsev, Guenter Roeck,
	sashiko-bot, sashiko-reviews, sashiko, Linux Kernel Workflows,
	Linux Kernel Mailing List, devicetree, kfree


> On May 17, 2026, at 11:57 AM, Theodore Tso <tytso@mit.edu> wrote:
> On Sun, May 17, 2026 at 11:17:06AM -0700, Roman Gushchin wrote:
>> 
>> I actually tried to run it with ollama on my
>> personal framework 13. Adding nominal support is trivial, but the
>> whole thing is not really useful: I can get maybe a few hundred
>> tokens per second using a quantized model with reduced quality; an
>> average sashiko review is consuming 3.5 million tokens (with Gemini
>> 3.1 pro, it’s also model-dependent).
> 
> I'm curious.  What hardware and LLM model were you using?  A few
> hundred tokens per second seems surprisingly high.  My initial
> research[1] shows that an M5 Max Macbook Pro costing 5 or 6 kilobucks
> can do 31.6 tokens/second on a 27B 4-bit quantized model (Qwen 3.5).

I have a Framework 13 with an AMD 7840U. I’ve tried several models on both CPU and GPU.
Sorry, it was a couple of months ago and I don’t remember all the details, so I won’t
claim any specific numbers, but as I remember the best numbers were around
a hundred tokens per second. In any case it’s a few orders of magnitude slower than
what is realistically required.

If someone has powerful hardware and is willing to benchmark sashiko with open-source
models, I’m very interested in the results.

> [1] https://www.reddit.com/r/LocalLLaMA/comments/1rzkw4x/m5_max_128g_performance_tests_i_just_got_my_new/
> 
> The model matters of course.  With Gemma 3 27B and a 6-bit
> quantization, it's 21 tokens/s, and with Deepseek R1 8B Q6_K, it's
> 72.8 tokens/second.  But unless you're using a really low-end model,
> or a really expensive, splufty hardware platform, I haven't seen
> reports of hundreds of tokens per second on hardware costing a
> reasonable amount of memory.  (I'll set aside the question of whether
> spending $6k for a fully spec'ed out M5 Max Macbook Pro, or $15k for a
> fully spec'ed out M3 Ultra Mac Studio is "reasonable".)
> 
> As a result I'm not entirely sure how realistic it is to do reviews
> using "free" (you still have to pay $$$ for the hardware) local,
> open-weight LLM's if an average review requires around 3.5 million
> tokens.

Fully agree. But it might change in a few years, things are moving quickly.


* Re: Stop false review statements
From: Mauro Carvalho Chehab @ 2026-05-17 19:36 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Roman Gushchin, Greg KH, Krzysztof Kozlowski, debarbos,
	Arnaldo Carvalho de Melo, Konstantin Ryabitsev, Guenter Roeck,
	sashiko-bot, sashiko-reviews, sashiko, Linux Kernel Workflows,
	Linux Kernel Mailing List, devicetree, kfree
In-Reply-To: <20260517185701.GB53471@macsyma-wired.lan>

On Sun, 17 May 2026 14:57:01 -0400
"Theodore Tso" <tytso@mit.edu> wrote:

> On Sun, May 17, 2026 at 11:17:06AM -0700, Roman Gushchin wrote:
> > 
> > I actually tried to run it with ollama on my
> > personal framework 13. Adding nominal support is trivial, but the
> > whole thing is not really useful: I can get maybe a few hundred
> > tokens per second using a quantized model with reduced quality; an
> > average sashiko review is consuming 3.5 million tokens (with Gemini
> > 3.1 pro, it’s also model-dependent).  
> 
> I'm curious.  What hardware and LLM model were you using?  A few
> hundred tokens per second seems surprisingly high.  My initial
> research[1] shows that an M5 Max Macbook Pro costing 5 or 6 kilobucks
> can do 31.6 tokens/second on a 27B 4-bit quantized model (Qwen 3.5).
> 
> [1] https://www.reddit.com/r/LocalLLaMA/comments/1rzkw4x/m5_max_128g_performance_tests_i_just_got_my_new/
> 
> The model matters of course.  With Gemma 3 27B and a 6-bit
> quantization, it's 21 tokens/s, and with Deepseek R1 8B Q6_K, it's
> 72.8 tokens/second.  But unless you're using a really low-end model,
> or a really expensive, splufty hardware platform, I haven't seen
> reports of hundreds of tokens per second on hardware costing a
> reasonable amount of memory.  (I'll set aside the question of whether
> spending $6k for a fully spec'ed out M5 Max Macbook Pro, or $15k for a
> fully spec'ed out M3 Ultra Mac Studio is "reasonable".)

Ted,

Here, I'm using an RX9060XT, which is relatively budget hardware.

It is also in the range of dozens of tokens per second. If you're
interested, I ran a benchmark this weekend with 3 models (just
for the sake of testing a set of turboquant patches - those aren't
the models I normally use).

You can see results here:
	https://github.com/ollama/ollama/pull/15505#issuecomment-4467278354

llama3.2:3b gives 72.5 decode tokens/s with f16, and 37 decode tokens/s
with tq4 (actually a modified version of it) which, according to the
PR author, has quality almost identical to f16.

The main issue with such hardware is that it has only 16 GB of VRAM, which
makes it a little bit slow for models like qwen3.6:35b, as they will partially
run on the CPU. Still, you can get a pretty decent answer in a couple of
minutes, with thinking enabled.
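
As a rough illustration of why a model of that size spills out of 16 GB of
VRAM, the weight footprint can be estimated from the parameter count and the
bits per weight. The 4-bit quantization, the reading of "35b" as 35 billion
parameters and the helper name `weight_gigabytes` are assumptions made for
this example; KV cache and runtime overhead are ignored, so real usage would
be higher.

```python
def weight_gigabytes(params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


VRAM_GB = 16.0
# Assumed for illustration: 35 billion parameters stored at 4 bits each.
needed = weight_gigabytes(35e9, 4)
print(f"weights: {needed:.1f} GB, VRAM: {VRAM_GB:.0f} GB, fits: {needed <= VRAM_GB}")
```

Under these assumptions the weights alone come to about 17.5 GB, which is
consistent with part of the model being offloaded to the CPU.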

> As a result I'm not entirely sure how realistic it is to do reviews
> using "free" (you still have to pay $$$ for the hardware) local,
> open-weight LLM's if an average review requires around 3.5 million
> tokens.

Yes, 3.5 million tokens is indeed too much. I wonder why. Maybe
Gemini spreads the same query across multiple instances, making it
spend a lot more tokens?

Here, I did some tests asking some LLMs to review code,
getting answers in a reasonable time (but didn't try to use sashiko
prompts).

Thanks,
Mauro


* Re: Stop false review statements
From: Theodore Tso @ 2026-05-17 18:57 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Mauro Carvalho Chehab, Greg KH, Krzysztof Kozlowski, debarbos,
	Arnaldo Carvalho de Melo, Konstantin Ryabitsev, Guenter Roeck,
	sashiko-bot, sashiko-reviews, sashiko, Linux Kernel Workflows,
	Linux Kernel Mailing List, devicetree, kfree
In-Reply-To: <F2FBD939-179D-467B-9FA8-BAA44F6C7524@linux.dev>

On Sun, May 17, 2026 at 11:17:06AM -0700, Roman Gushchin wrote:
> 
> I actually tried to run it with ollama on my
> personal framework 13. Adding nominal support is trivial, but the
> whole thing is not really useful: I can get maybe a few hundred
> tokens per second using a quantized model with reduced quality; an
> average sashiko review is consuming 3.5 million tokens (with Gemini
> 3.1 pro, it’s also model-dependent).

I'm curious.  What hardware and LLM model were you using?  A few
hundred tokens per second seems surprisingly high.  My initial
research[1] shows that an M5 Max Macbook Pro costing 5 or 6 kilobucks
can do 31.6 tokens/second on a 27B 4-bit quantized model (Qwen 3.5).

[1] https://www.reddit.com/r/LocalLLaMA/comments/1rzkw4x/m5_max_128g_performance_tests_i_just_got_my_new/

The model matters of course.  With Gemma 3 27B and a 6-bit
quantization, it's 21 tokens/s, and with Deepseek R1 8B Q6_K, it's
72.8 tokens/second.  But unless you're using a really low-end model,
or a really expensive, splufty hardware platform, I haven't seen
reports of hundreds of tokens per second on hardware costing a
reasonable amount of memory.  (I'll set aside the question of whether
spending $6k for a fully spec'ed out M5 Max Macbook Pro, or $15k for a
fully spec'ed out M3 Ultra Mac Studio is "reasonable".)

As a result I'm not entirely sure how realistic it is to do reviews
using "free" (you still have to pay $$$ for the hardware) local,
open-weight LLM's if an average review requires around 3.5 million
tokens.
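
To put the numbers above in perspective, here is a back-of-the-envelope
estimate of how long one average review would take at the quoted rates. It
is a sketch under a loud assumption: all 3.5 million tokens are treated as
if they passed at the decode rate, which likely overstates the time, since
prompt processing is usually faster than generation. The helper name
`review_hours` is made up for the example.

```python
def review_hours(tokens: float, tokens_per_second: float) -> float:
    """Hours needed to process `tokens` at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second / 3600.0


AVERAGE_REVIEW_TOKENS = 3.5e6  # average per review, as quoted in this thread

# Decode rates quoted above, in tokens per second.
RATES = {
    "Qwen 3.5 27B 4-bit": 31.6,
    "Gemma 3 27B 6-bit": 21.0,
    "Deepseek R1 8B Q6_K": 72.8,
}

for name, rate in RATES.items():
    print(f"{name}: {review_hours(AVERAGE_REVIEW_TOKENS, rate):.1f} h")
```

Under that assumption the three rates work out to roughly 30.8, 46.3 and
13.4 hours per review respectively.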

Cheers,

						- Ted


* Re: Stop false review statements
From: Mauro Carvalho Chehab @ 2026-05-17 18:56 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Greg KH, Krzysztof Kozlowski, debarbos, Arnaldo Carvalho de Melo,
	Konstantin Ryabitsev, Guenter Roeck, sashiko-bot, sashiko-reviews,
	sashiko, Linux Kernel Workflows, Linux Kernel Mailing List,
	devicetree, kfree
In-Reply-To: <F2FBD939-179D-467B-9FA8-BAA44F6C7524@linux.dev>

On Sun, 17 May 2026 11:17:06 -0700
Roman Gushchin <roman.gushchin@linux.dev> wrote:

> > On May 17, 2026, at 9:40 AM, Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> > 
> > On Sun, 17 May 2026 12:12:00 +0200
> > Greg KH <gregkh@linuxfoundation.org> wrote:
> >   
> >>> On Sun, May 17, 2026 at 12:05:56PM +0200, Mauro Carvalho Chehab wrote:
> >>> On Sat, 16 May 2026 14:59:44 -0700
> >>> Roman Gushchin <roman.gushchin@linux.dev> wrote:
> >>>   
> >>>>> On May 16, 2026, at 2:33 PM, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> >>>>> 
> >>>>> I find it opposite: clogging commits with useless information, because
> >>>>> some arbitrary and completely closed-source tool did analysis means
> >>>>> nothing to me one year later when I look at the commit in the Git history.      
> >>>> 
> >>>> This is simply not true: Sashiko is fully open-source, under Apache 2.0 license
> >>>> and the code belongs to LF.     
> >>>   
> >>>> Yes, the instance behind sashiko.dev is using
> >>>> Gemini 3.1 Pro LLM, which is not open-source, but it’s not a fundamental limitation -
> >>>> Sashiko is supporting various LLMs, including open models - it’s just a practical
> >>>> choice: to my knowledge the quality of open models is not on par with frontier closed
> >>>> models     
> >>> 
> >>> I would very much prefer using an open source LLM, even if not on par
> >>> with latest paid models.
> >>>   
> >>>> and it would require a non-trivial amount of hardware and infrastructure to run
> >>>> an open model at the required scale.    
> >>> 
> >>> IMHO the best would be to have them running on some infra that would accept
> >>> open source models (*). If there aren't enough resources to have our own
> >>> infra, there are offers out there which allows running open source models
> >>> like https://ollama.com/pricing (I never used myself).
> >>> 
> >>> (*) For instance, Qwen3.6 is brand new and licensed under apache-2.0.
> >>>    Not bad on my tests running it locally.    
> >> 
> >> You can run the tool locally, with whatever model you want, if you want
> >> to.
> >> 
> >> But for now, let's just take the free credits that Google is willing to
> >> throw at this thing and let it give us reviews IF the maintainer of the
> >> subsystem feels it is something they want to do.  No one is forcing
> >> maintainers to do this.  
> > 
> > If Google and/or others are willing to give free credits on their cloud,
> > they could instead or in addition give free credits to run ollama
> > there, allowing us to use different models.
> > 
> > From my side, while I won't personally object getting reviews from
> > Sashiko/Gemini, this is something I can't reproduce locally. I would
> > very much want something where I can select my LLM preferred model
> > and run on my ollama docker container on my own GPU, in a way that
> > I could run it locally before even sending a patch series.  
> 
> 2 thoughts here:
> 1) I actually tried to run it with ollama on my personal framework 13. Adding nominal support is trivial,
> but the whole thing is not really useful: I can get maybe a few hundred tokens per second using
> a quantized model with reduced quality; an average sashiko review is consuming 3.5 million tokens
> (with Gemini 3.1 pro, it’s also model-dependent).

Do you mean 3.5 million tokens per patch series? If so, that
sounds like a lot! Why does it require so many tokens?

> I’m personally all in on having the entire thing as open as possible and I believe Sashiko is what 
> is realistically the best at this moment - a fully open-source harness and set of prompts which 
> can work with a variety of models.
> I’m happy to merge support for any LLM which can produce decent review results.
> 
> 2) Due to the probabilistic nature of LLMs, nothing is reproducible in a strict sense of the word.
> Even with exactly the same model/harness/prompts you’ll get different results every time you run it.
> It’s unfortunate, but it is what it is at the moment.

By "reproduce locally", I didn't mean in the strict sense. Sure, LLM answers
won't be identical, but I suspect that at least most of the major issues
in a patch series would be reported by any decent model.

So, if we have something that one can run locally on their own GPU, being
able to get an answer in the range of a couple of minutes per patch
should be enough to catch most of the issues.

Thanks,
Mauro


* Re: Stop false review statements
From: Roman Gushchin @ 2026-05-17 18:17 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Greg KH, Krzysztof Kozlowski, debarbos, Arnaldo Carvalho de Melo,
	Konstantin Ryabitsev, Guenter Roeck, sashiko-bot, sashiko-reviews,
	sashiko, Linux Kernel Workflows, Linux Kernel Mailing List,
	devicetree, kfree
In-Reply-To: <20260517183959.37441984@foz.lan>



> On May 17, 2026, at 9:40 AM, Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> 
> On Sun, 17 May 2026 12:12:00 +0200
> Greg KH <gregkh@linuxfoundation.org> wrote:
> 
>>> On Sun, May 17, 2026 at 12:05:56PM +0200, Mauro Carvalho Chehab wrote:
>>> On Sat, 16 May 2026 14:59:44 -0700
>>> Roman Gushchin <roman.gushchin@linux.dev> wrote:
>>> 
>>>>> On May 16, 2026, at 2:33 PM, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>>>>> 
>>>>> I find it opposite: clogging commits with useless information, because
>>>>> some arbitrary and completely closed-source tool did analysis means
>>>>> nothing to me one year later when I look at the commit in the Git history.    
>>>> 
>>>> This is simply not true: Sashiko is fully open-source, under Apache 2.0 license
>>>> and the code belongs to LF.   
>>> 
>>>> Yes, the instance behind sashiko.dev is using
>>>> Gemini 3.1 Pro LLM, which is not open-source, but it’s not a fundamental limitation -
>>>> Sashiko is supporting various LLMs, including open models - it’s just a practical
>>>> choice: to my knowledge the quality of open models is not on par with frontier closed
>>>> models   
>>> 
>>> I would very much prefer using an open source LLM, even if not on par
>>> with latest paid models.
>>> 
>>>> and it would require a non-trivial amount of hardware and infrastructure to run
>>>> an open model at the required scale.  
>>> 
>>> IMHO the best would be to have them running on some infra that would accept
>>> open source models (*). If there aren't enough resources to have our own
>>> infra, there are offers out there which allows running open source models
>>> like https://ollama.com/pricing (I never used myself).
>>> 
>>> (*) For instance, Qwen3.6 is brand new and licensed under apache-2.0.
>>>    Not bad on my tests running it locally.  
>> 
>> You can run the tool locally, with whatever model you want, if you want
>> to.
>> 
>> But for now, let's just take the free credits that Google is willing to
>> throw at this thing and let it give us reviews IF the maintainer of the
>> subsystem feels it is something they want to do.  No one is forcing
>> maintainers to do this.
> 
> If Google and/or others are willing to give free credits on their cloud,
> they could instead or in addition give free credits to run ollama
> there, allowing us to use different models.
> 
> From my side, while I won't personally object getting reviews from
> Sashiko/Gemini, this is something I can't reproduce locally. I would
> very much want something where I can select my LLM preferred model
> and run on my ollama docker container on my own GPU, in a way that
> I could run it locally before even sending a patch series.

2 thoughts here:
1) I actually tried to run it with ollama on my personal framework 13. Adding nominal support is trivial,
but the whole thing is not really useful: I can get maybe a few hundred tokens per second using
a quantized model with reduced quality; an average sashiko review is consuming 3.5 million tokens
(with Gemini 3.1 pro, it’s also model-dependent).
I’m personally all in on having the entire thing as open as possible and I believe Sashiko is what
is realistically the best at this moment - a fully open-source harness and set of prompts which
can work with a variety of models.
I’m happy to merge support for any LLM which can produce decent review results.

2) Due to the probabilistic nature of LLMs, nothing is reproducible in a strict sense of the word.
Even with exactly the same model/harness/prompts you’ll get different results every time you run it.
It’s unfortunate, but it is what it is at the moment.



* Re: Stop false review statements
From: Guenter Roeck @ 2026-05-17 17:03 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Greg KH
  Cc: Roman Gushchin, Krzysztof Kozlowski, debarbos,
	Arnaldo Carvalho de Melo, Konstantin Ryabitsev, sashiko-bot,
	sashiko-reviews, sashiko, Linux Kernel Workflows,
	Linux Kernel Mailing List, devicetree, kfree
In-Reply-To: <20260517183959.37441984@foz.lan>

On 5/17/26 09:39, Mauro Carvalho Chehab wrote:
...
> 
> It is not about the model itself. It is about being able to easily
> install sashiko locally in a container and easily make it use my
> ollama server with the model(s) of my choice. Right now, at least
> from its README.md, it sounds like only closed source services
> are supported.
> 

The README file says:

- **Self-contained**: Doesn't depend on 3rd-party tools and can work with
   various LLM providers (Gemini, Claude, and GitHub Copilot CLI are currently
   supported).

Sashiko is open source. No one prevents you from adding support for different
LLM providers. I would suggest submitting patches to have it support whatever
underlying LLM you want to use that isn't currently supported.

Guenter



* Re: Stop false review statements
From: Mauro Carvalho Chehab @ 2026-05-17 16:39 UTC (permalink / raw)
  To: Greg KH
  Cc: Roman Gushchin, Krzysztof Kozlowski, debarbos,
	Arnaldo Carvalho de Melo, Konstantin Ryabitsev, Guenter Roeck,
	sashiko-bot, sashiko-reviews, sashiko, Linux Kernel Workflows,
	Linux Kernel Mailing List, devicetree, kfree
In-Reply-To: <2026051758-superbowl-baritone-2705@gregkh>

On Sun, 17 May 2026 12:12:00 +0200
Greg KH <gregkh@linuxfoundation.org> wrote:

> On Sun, May 17, 2026 at 12:05:56PM +0200, Mauro Carvalho Chehab wrote:
> > On Sat, 16 May 2026 14:59:44 -0700
> > Roman Gushchin <roman.gushchin@linux.dev> wrote:
> >   
> > > > On May 16, 2026, at 2:33 PM, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > > 
> > > > I find it opposite: clogging commits with useless information, because
> > > > some arbitrary and completely closed-source tool did analysis means
> > > > nothing to me one year later when I look at the commit in the Git history.    
> > > 
> > > > This is simply not true: Sashiko is fully open-source, under Apache 2.0 license
> > > and the code belongs to LF.   
> >   
> > > Yes, the instance behind sashiko.dev is using
> > > Gemini 3.1 Pro LLM, which is not open-source, but it’s not a fundamental limitation - 
> > > Sashiko is supporting various LLMs, including open models - it’s just a practical
> > > choice: to my knowledge the quality of open models is not on par with frontier closed
> > > models   
> > 
> > > I would very much prefer using an open source LLM, even if not on par
> > > with the latest paid models.
> >   
> > > and it would require a non-trivial amount of hardware and infrastructure to run
> > > an open model at the required scale.  
> > 
> > IMHO the best would be to have them running on some infra that would accept
> > open source models (*). If there aren't enough resources to have our own
> > > infra, there are offers out there which allow running open source models
> > > like https://ollama.com/pricing (I have never used it myself).
> > 
> > (*) For instance, Qwen3.6 is brand new and licensed under apache-2.0.
> >     Not bad on my tests running it locally.  
> 
> You can run the tool locally, with whatever model you want, if you want
> to.
> 
> But for now, let's just take the free credits that Google is willing to
> throw at this thing and let it give us reviews IF the maintainer of the
> subsystem feels it is something they want to do.  No one is forcing
> maintainers to do this.

If Google and/or others are willing to give free credits on their cloud,
they could instead or in addition give free credits to run ollama
there, allowing us to use different models.

From my side, while I won't personally object to getting reviews from
Sashiko/Gemini, this is something I can't reproduce locally. I would
very much want something where I can select my preferred LLM model
and run it in my ollama docker container on my own GPU, in a way that
I could run it locally before even sending a patch series.

> The netdev, bpf, and drm developers have been doing much the same for a
> while now, with who-knows-what model behind the thing.  The model
> doesn't matter, we aren't advertising for them, we just want the results
> that they can provide us.

It is not about the model itself. It is about being able to easily
install sashiko locally in a container and easily make it use my
ollama server with the model(s) of my choice. Right now, at least
judging from its README.md, it sounds like only closed source services
are supported.

Thanks,
Mauro


* Re: Stop false review statements
From: Theodore Tso @ 2026-05-17 16:29 UTC (permalink / raw)
  To: Greg KH
  Cc: Mauro Carvalho Chehab, Roman Gushchin, Krzysztof Kozlowski,
	debarbos, Arnaldo Carvalho de Melo, Konstantin Ryabitsev,
	Guenter Roeck, sashiko-bot, sashiko-reviews, sashiko,
	Linux Kernel Workflows, Linux Kernel Mailing List, devicetree,
	kfree
In-Reply-To: <2026051758-superbowl-baritone-2705@gregkh>

It should also be noted that Intel's zero-day bot was (a) closed
source, and (b) was sending its test regression reports with the
linux-kernel mailing list cc'ed, and no one really complained because
it was so useful, and if Intel was willing to use very expensive
hardware in their data center to contribute reports, so long as the
reports were useful and the false-positive noise was low enough, we
decided to be grateful and not worry (too much) about the fact that
Intel's zero-day bot was closed source.  (There was indeed some
grumbling in the bar at Plumbers, of course.  :-)

In my opinion, we should be doing the same for Sashiko, and that's the
decision which the ext4 developers have made --- at least for ext4
patches, after an experiment where we only sent reviews to the patch
authors and the maintainer, people were satisfied that the false positive
rate was low enough (with the caveats that I had previously mentioned,
but we were willing to live with them because at least for us, it was
useful enough), that we will be requesting that Sashiko reviews be
cc'ed to the ext4 mailing list.

I realize that there are some extra sensitivities around AI / LLM's,
but from the perspective of reviewing patches, I don't see any
difference between this and other closed source tools that we've used,
such as Coverity and the Zero-day bot.  Not everyone will agree, of
course, but at the moment, this is a decision that we are making on a
subsystem-by-subsystem basis, which, again, has strong historical
precedent.

Cheers,

						- Ted


* Re: [PATCH] docs: threat-model: add missing closing parenthesis
From: Willy Tarreau @ 2026-05-17 16:04 UTC (permalink / raw)
  To: Baruch Siach; +Cc: Jonathan Corbet, Shuah Khan, workflows, linux-doc
In-Reply-To: <da8ee1e8b4e99261ec11544c4e1a4f81316ae965.1779032501.git.baruch@tkos.co.il>

On Sun, May 17, 2026 at 06:41:41PM +0300, Baruch Siach wrote:
> Fixes: a03ef333fbd6 ("Documentation: security-bugs: explain what is and is not a security bug")
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>

Thank you, and sorry for this mistake!

Obviously: Acked-by: Willy Tarreau <w@1wt.eu>

Willy

> ---
>  Documentation/process/threat-model.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/process/threat-model.rst b/Documentation/process/threat-model.rst
> index f177b8d3c1ca..9dd8011dde82 100644
> --- a/Documentation/process/threat-model.rst
> +++ b/Documentation/process/threat-model.rst
> @@ -176,7 +176,7 @@ regular bug:
>    * problems seen only under development simulators, emulators, or combinations
>      that do not exist on real systems at the time of reporting (issues
>      involving tens of millions of threads, tens of thousands of CPUs,
> -    unrealistic CPU frequencies, RAM sizes or disk capacities, network speeds.
> +    unrealistic CPU frequencies, RAM sizes or disk capacities, network speeds).
>  
>    * issues whose reproduction requires hardware modification or emulation,
>      including fake USB devices that pretend to be another one.
> -- 
> 2.53.0


* Re: Stop false review statements
From: Danilo Krummrich @ 2026-05-17 15:56 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Krzysztof Kozlowski, Greg KH, Konstantin Ryabitsev, Guenter Roeck,
	Miguel Ojeda, sashiko-bot, sashiko-reviews, sashiko,
	Linux Kernel Workflows, Linux Kernel Mailing List, devicetree,
	kfree
In-Reply-To: <FA45D2AD-1135-4480-8423-63C0D37FE78D@linux.dev>

On Sat May 16, 2026 at 9:15 PM CEST, Roman Gushchin wrote:
> I agree, it sometimes gets tricky when a patchset is sent to multiple
> mailing lists, which policy to apply. I have some improvements in my plans,
> but it’s not always possible to say how it should be handled.

Which improvements do you have in mind?

> It’s not fundamentally new: landing changes touching multiple subsystems is
> always harder exactly because maintainers might have different and sometimes
> conflicting views.

It can also be relevant in cases where only a single subsystem is touched.

For instance, in the case of Rust, the rust-for-linux list serves two purposes
-- when it is a Rust subsystem change and when Rust code of any other subsystem
is touched, i.e. the rust-for-linux list has more of a LKML character and also
receives patches for subsystems whose maintainers may not have opted in to
sashiko email delivery.

That said, I personally don't mind too much, I really like sashiko, which is
also why I asked for adding the driver-core list. My experience has been that it
does a very decent job in providing feedback for C code; my feeling is that
feedback for Rust code is not quite on par yet, but of course it also highly
depends on the complexity and scope of the corresponding changes.

However, I still have the same concern I raised previously when it comes to
email delivery: I think that when sashiko sends feedback to contributors
(without Cc'ing the mailing list and all other recipients), it should actively
ask the contributor to raise things on the list with all other recipients,
reviewers and maintainers before acting on them, such that changes subsequent to
the first submission on the list are aligned.

Can this be added please?

Thanks,
Danilo


* [PATCH] docs: threat-model: add missing closing parenthesis
From: Baruch Siach @ 2026-05-17 15:41 UTC (permalink / raw)
  To: Jonathan Corbet, Shuah Khan
  Cc: workflows, linux-doc, Willy Tarreau, Baruch Siach

Fixes: a03ef333fbd6 ("Documentation: security-bugs: explain what is and is not a security bug")
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
---
 Documentation/process/threat-model.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/process/threat-model.rst b/Documentation/process/threat-model.rst
index f177b8d3c1ca..9dd8011dde82 100644
--- a/Documentation/process/threat-model.rst
+++ b/Documentation/process/threat-model.rst
@@ -176,7 +176,7 @@ regular bug:
   * problems seen only under development simulators, emulators, or combinations
     that do not exist on real systems at the time of reporting (issues
     involving tens of millions of threads, tens of thousands of CPUs,
-    unrealistic CPU frequencies, RAM sizes or disk capacities, network speeds.
+    unrealistic CPU frequencies, RAM sizes or disk capacities, network speeds).
 
   * issues whose reproduction requires hardware modification or emulation,
     including fake USB devices that pretend to be another one.
-- 
2.53.0



* Re: Stop false review statements
From: Jonathan Corbet @ 2026-05-17 15:21 UTC (permalink / raw)
  To: Guenter Roeck, Krzysztof Kozlowski
  Cc: sashiko-bot, sashiko-reviews, sashiko, Linux Kernel Workflows,
	Linux Kernel Mailing List, devicetree@vger.kernel.org, kfree
In-Reply-To: <fd3b2ca7-4d64-4c4b-98a3-7d3285fa6826@roeck-us.net>

Guenter Roeck <linux@roeck-us.net> writes:

> On 5/16/26 05:16, Krzysztof Kozlowski wrote:
>> Quotes from the existing policy:
>> 
>> 1. "By offering my Reviewed-by: tag, I state that:"
>> 
>> Tool cannot use first person "I". Tool cannot "state that".
>> 
>> 2. "A Reviewed-by tag is *a statement of opinion* that the patch is an
>>   appropriate modification of the kernel without any remaining serious"
>> 
>> Tool cannot make a statement of opinion.
>> 
>> 3. "Any interested reviewer (who has done the work) can offer a
>> Reviewed-by".
>> 
>> Tool is not a reviewer as a person, thus above does not grant the tool
>> permission to offer a tag.
>
> I'd like to see that explicitly spelled out. Until then it is your opinion.

So I'm the person who wrote that text.  Automated review tools weren't
really on the radar at that time, so I can't argue that it expresses an
opinion either way as to whether an LLM could make such assertions.

That said, I was certainly considering *human* reviewers at the time,
and all of the people who agreed with the suggested policy were too.
Adding bots seems like a stretch to me.

I can't speak for subsystems that require Reviewed-by tags on their
commits, but I'm not sure that their maintainers would accept an
automated review as satisfying that requirement.

If we want to record this sort of processing, perhaps a tag like
"Scanned-by" would be appropriate?

jon


* Re: Stop false review statements
From: Greg KH @ 2026-05-17 10:12 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Roman Gushchin, Krzysztof Kozlowski, debarbos,
	Arnaldo Carvalho de Melo, Konstantin Ryabitsev, Guenter Roeck,
	sashiko-bot, sashiko-reviews, sashiko, Linux Kernel Workflows,
	Linux Kernel Mailing List, devicetree, kfree
In-Reply-To: <20260517120556.248852d8@foz.lan>

On Sun, May 17, 2026 at 12:05:56PM +0200, Mauro Carvalho Chehab wrote:
> On Sat, 16 May 2026 14:59:44 -0700
> Roman Gushchin <roman.gushchin@linux.dev> wrote:
> 
> > > On May 16, 2026, at 2:33 PM, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > 
> > > I find it opposite: clogging commits with useless information, because
> > > some arbitrary and completely closed-source tool did analysis means
> > > nothing to me one year later when I look at the commit in the Git history.  
> > 
> > > This is simply not true: Sashiko is fully open-source, under Apache 2.0 license
> > and the code belongs to LF. 
> 
> > Yes, the instance behind sashiko.dev is using
> > Gemini 3.1 Pro LLM, which is not open-source, but it’s not a fundamental limitation - 
> > Sashiko is supporting various LLMs, including open models - it’s just a practical
> > choice: to my knowledge the quality of open models is not on par with frontier closed
> > models 
> 
> I would very much prefer using an open source LLM, even if not on par
> with the latest paid models.
> 
> > and it would require a non-trivial amount of hardware and infrastructure to run
> > an open model at the required scale.
> 
> IMHO the best would be to have them running on some infra that would accept
> open source models (*). If there aren't enough resources to have our own
> infra, there are offers out there which allow running open source models
> like https://ollama.com/pricing (I have never used it myself).
> 
> (*) For instance, Qwen3.6 is brand new and licensed under apache-2.0.
>     Not bad on my tests running it locally.

You can run the tool locally, with whatever model you want, if you want
to.

But for now, let's just take the free credits that Google is willing to
throw at this thing and let it give us reviews IF the maintainer of the
subsystem feels it is something they want to do.  No one is forcing
maintainers to do this.

The netdev, bpf, and drm developers have been doing much the same for a
while now, with who-knows-what model behind the thing.  The model
doesn't matter, we aren't advertising for them, we just want the results
that they can provide us.

thanks,

greg k-h


* Re: Stop false review statements
From: Willy Tarreau @ 2026-05-17 10:10 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Roman Gushchin, Krzysztof Kozlowski, debarbos,
	Arnaldo Carvalho de Melo, Greg KH, Konstantin Ryabitsev,
	Guenter Roeck, sashiko-bot, sashiko-reviews, sashiko,
	Linux Kernel Workflows, Linux Kernel Mailing List, devicetree,
	kfree
In-Reply-To: <20260517120556.248852d8@foz.lan>

On Sun, May 17, 2026 at 12:05:56PM +0200, Mauro Carvalho Chehab wrote:
> On Sat, 16 May 2026 14:59:44 -0700
> Roman Gushchin <roman.gushchin@linux.dev> wrote:
> 
> > > On May 16, 2026, at 2:33 PM, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > > 
> > > I find it opposite: clogging commits with useless information, because
> > > some arbitrary and completely closed-source tool did analysis means
> > > nothing to me one year later when I look at the commit in the Git history.  
> > 
> > > This is simply not true: Sashiko is fully open-source, under Apache 2.0 license
> > and the code belongs to LF. 
> 
> > Yes, the instance behind sashiko.dev is using
> > Gemini 3.1 Pro LLM, which is not open-source, but it's not a fundamental limitation - 
> > Sashiko is supporting various LLMs, including open models - it's just a practical
> > choice: to my knowledge the quality of open models is not on par with frontier closed
> > models 
> 
> I would very much prefer using an open source LLM, even if not on par
> with the latest paid models.
> 
> > and it would require a non-trivial amount of hardware and infrastructure to run
> > an open model at the required scale.
> 
> IMHO the best would be to have them running on some infra that would accept
> open source models (*). If there aren't enough resources to have our own
> infra, there are offers out there which allow running open source models
> like https://ollama.com/pricing (I have never used it myself).
> 
> (*) For instance, Qwen3.6 is brand new and licensed under apache-2.0.
>     Not bad on my tests running it locally.

FWIW that's what I'm using locally coupled with llama.cpp to find bugs.
And it does. Plenty of valid ones. It's greatly sufficient for most work.

Willy


* Re: Stop false review statements
From: Mauro Carvalho Chehab @ 2026-05-17 10:05 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Krzysztof Kozlowski, debarbos, Arnaldo Carvalho de Melo, Greg KH,
	Konstantin Ryabitsev, Guenter Roeck, sashiko-bot, sashiko-reviews,
	sashiko, Linux Kernel Workflows, Linux Kernel Mailing List,
	devicetree, kfree
In-Reply-To: <07602616-412B-4ED8-95D7-588C0D077EE3@linux.dev>

On Sat, 16 May 2026 14:59:44 -0700
Roman Gushchin <roman.gushchin@linux.dev> wrote:

> > On May 16, 2026, at 2:33 PM, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > 
> > I find it opposite: clogging commits with useless information, because
> > some arbitrary and completely closed-source tool did analysis means
> > nothing to me one year later when I look at the commit in the Git history.  
> 
> This is simply not true: Sashiko is fully open-source, under Apache 2.0 license
> and the code belongs to LF. 

> Yes, the instance behind sashiko.dev is using
> Gemini 3.1 Pro LLM, which is not open-source, but it’s not a fundamental limitation - 
> Sashiko is supporting various LLMs, including open models - it’s just a practical
> choice: to my knowledge the quality of open models is not on par with frontier closed
> models 

I would very much prefer using an open source LLM, even if not on par
with the latest paid models.

> and it would require a non-trivial amount of hardware and infrastructure to run
> an open model at the required scale.

IMHO the best would be to have them running on some infra that would accept
open source models (*). If there aren't enough resources to have our own
infra, there are offers out there which allow running open source models
like https://ollama.com/pricing (I have never used it myself).

(*) For instance, Qwen3.6 is brand new and licensed under apache-2.0.
    Not bad on my tests running it locally.

Thanks,
Mauro


* Re: Stop false review statements
From: Krzysztof Kozlowski @ 2026-05-17  8:25 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: debarbos, Arnaldo Carvalho de Melo, Greg KH, Konstantin Ryabitsev,
	Guenter Roeck, sashiko-bot, sashiko-reviews, sashiko,
	Linux Kernel Workflows, Linux Kernel Mailing List, devicetree,
	kfree
In-Reply-To: <07602616-412B-4ED8-95D7-588C0D077EE3@linux.dev>

On 16/05/2026 23:59, Roman Gushchin wrote:
> 
> 
>> On May 16, 2026, at 2:33 PM, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>>
>> I find it opposite: clogging commits with useless information, because
>> some arbitrary and completely closed-source tool did analysis means
>> nothing to me one year later when I look at the commit in the Git history.
> 
> This is simply not true: Sashiko is fully open-source, under Apache 2.0 license
> and the code belongs to LF. Yes, the instance behind sashiko.dev is using
> Gemini 3.1 Pro LLM, which is not open-source, but it’s not a fundamental limitation - 
> Sashiko is supporting various LLMs, including open models - it’s just a practical
> choice: to my knowledge the quality of open models is not on par with frontier closed
> models and it would require a non-trivial amount of hardware and infrastructure to run
> an open model at the required scale.

Sashiko is open, but it is not Sashiko that performs the review but the
closed-source LLM behind it.

Information that a closed-source LLM did some analysis is no more useful
than all other cases I mentioned - LKP, Smatch, Coverity or checkpatch -
of which most are even open source...

Best regards,
Krzysztof


* Re: [PATCH] docs: submitting-patches: Clarify that in English "reviewer" is a person
From: Greg Kroah-Hartman @ 2026-05-17  6:13 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE)
  Cc: Krzysztof Kozlowski, Jonathan Corbet, Shuah Khan, workflows,
	linux-doc, linux-kernel, Andrew Morton, David Hildenbrand,
	Linus Torvalds, Guenter Roeck
In-Reply-To: <ce1e5e9b-83d0-4971-aee3-dc5a8f85ce22@kernel.org>

On Sat, May 16, 2026 at 04:39:45PM +0200, Vlastimil Babka (SUSE) wrote:
> On 5/16/26 14:38, Krzysztof Kozlowski wrote:
> > Common understanding of word "Reviewer" is: a person performing a review
> > work [1]. Tools are not persons, thus cannot be reviewers in this term.
> > Also tools cannot make statements ("A Reviewed-by tag is a statement of
> > opinion"), since making a statement needs some sort of conscious mind.
> > 
> > Our docs already clearly mark that "Reviewed-by" must come from a
> > person:
> > 
> >  - "By offering my Reviewed-by: tag, I state that:"
> > 
> >    Usage of first person "I" and word "state"
> > 
> >  - "A Reviewed-by tag is *a statement of opinion* that the patch is an
> >     appropriate modification of the kernel without any remaining serious"
> > 
> >    Only a person can make a statement of opinion.
> > 
> >  - "Any interested reviewer (who has done the work) can offer a
> >    Reviewed-by"
> > 
> >    A person can offer a tag thus above does not grant the tool
> >    permission to offer a tag.
> > 
> > However this is not enough and apparently English is not that precise,
> > so let's clarify that only a person can state the "Reviewer's statement
> > of oversight".
> > 
> > Link: https://en.wiktionary.org/wiki/reviewer [1]
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Vlastimil Babka <vbabka@kernel.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: David Hildenbrand <david@kernel.org>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
> 
> I agree with the intent that the tag is for people (whether they use a tool
> or not to help them). We also don't put "Tested-by: kernel test robot" or
> syzkaller on every commit that they test and find no bugs. Review is also
> not just about absence of bugs, but agreeing with the larger design and
> whether the change makes sense to do in the first place.
> 
> So whether that's achieved with this particular wording or differently,
> 
> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


* Re: Stop false review statements
From: Mauro Carvalho Chehab @ 2026-05-16 22:32 UTC (permalink / raw)
  To: Greg KH
  Cc: Roman Gushchin, Konstantin Ryabitsev, Guenter Roeck,
	Krzysztof Kozlowski, sashiko-bot, sashiko-reviews, sashiko,
	Linux Kernel Workflows, Linux Kernel Mailing List, devicetree,
	kfree
In-Reply-To: <2026051631-trolling-juggling-da1c@gregkh>

On Sat, 16 May 2026 17:45:51 +0200
Greg KH <gregkh@linuxfoundation.org> wrote:

> > I’m not attached to any specific form of it, I thought Reviewed-by is the most obvious form. 
> > And we use Reported-by: tags with various tooling for years.  
> 
> Reported-by: shows the existence of a problem that some tool found, a
> subtle difference here.

I'd say that, if an issue was found after a patch is merged,
I don't see a reason to distinguish. I mean:

if a tool or a bot XYZ found a real issue, and a patch fixes it,
Reported-by applies - whether it is an LLM tool/bot or not.

Now, if someone sends a patch series v1, gets a bot report and sends a
v2 of the same patch series due to some CI/bot/LLM/... feedback, IMO
the right approach is to mention it in patch 0, just like we do with
any other feedback. Eventually, if such feedback is more relevant, it
can also be mentioned inside the patch description(s).

That said, I would be fine with either a free text mention or with
some tag.

If one wants/needs to justify if/why some tool is relevant for kernel
development, a simple grep would be enough:

	$ git log|grep -i coverity|wc -l
	4267
	$ git log|grep -i smatch|wc -l
	13140
	$ git log|grep -i sashiko |wc -l
	138

IMO, there's no need for a special tag.

Thanks,
Mauro


* Re: Stop false review statements
From: Roman Gushchin @ 2026-05-16 21:59 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: debarbos, Arnaldo Carvalho de Melo, Greg KH, Konstantin Ryabitsev,
	Guenter Roeck, sashiko-bot, sashiko-reviews, sashiko,
	Linux Kernel Workflows, Linux Kernel Mailing List, devicetree,
	kfree
In-Reply-To: <4f3d7f48-5766-425b-91f6-0acdb5554584@kernel.org>

> On May 16, 2026, at 2:33 PM, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> 
> I find it opposite: clogging commits with useless information, because
> some arbitrary and completely closed-source tool did analysis means
> nothing to me one year later when I look at the commit in the Git history.

This is simply not true: Sashiko is fully open-source, under Apache 2.0 license
and the code belongs to LF. Yes, the instance behind sashiko.dev is using
Gemini 3.1 Pro LLM, which is not open-source, but it’s not a fundamental limitation - 
Sashiko is supporting various LLMs, including open models - it’s just a practical
choice: to my knowledge the quality of open models is not on par with frontier closed
models and it would require a non-trivial amount of hardware and infrastructure to run
an open model at the required scale.

Thanks


* Re: Stop false review statements
From: Krzysztof Kozlowski @ 2026-05-16 21:33 UTC (permalink / raw)
  To: debarbos, Arnaldo Carvalho de Melo
  Cc: Roman Gushchin, Greg KH, Konstantin Ryabitsev, Guenter Roeck,
	sashiko-bot, sashiko-reviews, sashiko, Linux Kernel Workflows,
	Linux Kernel Mailing List, devicetree, kfree
In-Reply-To: <agjb7-q-p2SemgJa@debarbos-thinkpadt14gen5.rmtusma.csb>

On 16/05/2026 23:29, Derek Barbosa wrote:
> On Sat, May 16, 2026 at 03:28:47PM -0300, Arnaldo Carvalho de Melo wrote:
>>
>> Couldn't this be something like:
>>
>> AI-analysed-by: bot-X
> 
> +1

But why? What is the benefit of storing in Git log information that some
tool did work?

We do not store the results of checkpatch (another pattern matching tool),
Coverity, Smatch, LKP or syzkaller.

Instead of just a blank +1, please provide arguments why this is useful for us.

I find it opposite: clogging commits with useless information, because
some arbitrary and completely closed-source tool did analysis means
nothing to me one year later when I look at the commit in the Git history.

Best regards,
Krzysztof

