* Sashiko review emails to the list
From: Fuad Tabba @ 2026-06-19 14:19 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton
  Cc: Roman Gushchin, Will Deacon, Vincent Donnefort, KVMARM

Hi folks,

I really like Sashiko and find it very useful. It's flagged real bugs
in series I and others have posted to the list (e.g. [1][2][3]), and I
run it locally before sending, which has saved me a few respins.

That said, it's been posting a lot lately, so it seemed worth asking
how the review emails to the list are working out, and whether we
should change anything.

A couple of things stand out:

  1. The noise. Some is genuine false positives, where the model
confabulates ARM details (bit positions, ISS field layouts, and the
like) and flags correct code [4]. The rest is repetition: the same
finding posted against several patches in one series, often a real
issue not introduced by the series.

  2. The emails themselves. Automatically mailing reviews to the list
can push contributors to respin many times chasing bot comments before
a human has looked, and pull reviewer attention onto findings that may
be misguided to begin with.

On the noise (1):

  - From the logs it looks like Roman's been working on the
false-positive detection on GitHub [5], and the rate's dropped
noticeably in my local testing. I'm not sure the list version has
those changes yet (I get different results locally, fewer FPs), and he
told me yesterday (offlist) he's working on the repetition too.

  - I've been working on the arm64 prompts to stop the confabulation:
the model doesn't have the ARM ARM, so it hallucinates encodings and
asserts them as bugs. I submitted a PR to review-prompt [6], which
should cover most of these but not all. Longer term I have a local
prototype that gives it the actual spec text, but there are copyright
questions around shipping it. I've offered to share it with Roman,
minus the copyrighted material, to see whether it's something we could
use and extend, including to other architectures.

These only address (1). The second concern stands regardless of
accuracy, since it's about respinning before a human's looked, not
whether the comments are right.

These fixes will take a while to land, so the question is what to do
meanwhile. Some options, and surely others:

  - Leave it as-is while the fixes propagate.
  - Stop the emails but keep the reviews on sashiko.dev, so people can
look rather than have them pushed.
  - Disable the emails until the noise is down to a reasonable level,
then re-enable.
  - Raise the confidence bar so it only emails high-confidence findings.

This seemed worth discussing on the usual lists rather than on GitHub,
which has less visibility. What do you think?

Cheers,
/fuad

[1] https://lore.kernel.org/all/CA+EHjTxLVo=GwderoFxqsOEFXV+DrD17nQCkPbnKZPA6mRNxhg@mail.gmail.com/
[2] https://lore.kernel.org/all/20260612113414.1022901-1-tabba@google.com/
[3] https://lore.kernel.org/all/20260615131116.390977-1-tabba@google.com/
[4] https://sashiko.dev/#/patchset/20260619070719.812227-1-tabba@google.com?part=7
[5] https://github.com/sashiko-dev/
[6] https://github.com/masoncl/review-prompts/pull/81

