public inbox for kernelci@lists.linux.dev
From: Greg KH <gregkh@linuxfoundation.org>
To: Jenny Qu <jenny@pebblebed.com>
Cc: kernelci@lists.linux.dev
Subject: Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
Date: Thu, 5 Feb 2026 08:00:02 +0100
Message-ID: <2026020513-smoking-pureness-b6a0@gregkh>
In-Reply-To: <CAPBP3tRAnaV=NmTZ_yFK+w3GtfTTXtZ3XtpyK+AzvfnCHb8AxQ@mail.gmail.com>

On Wed, Feb 04, 2026 at 06:49:57PM -0800, Jenny Qu wrote:
> Hi,
> 
> I'm a security researcher working on automated kernel vulnerability
> detection. I'd love to present at an upcoming Thursday call if there's
> interest.

Cool, but isn't this a better subject for a conference talk?

> I analyzed every Fixes: tag in the kernel's 20-year git history (125K
> bug-fix pairs) and built a model to catch vulnerabilities at commit
> time. Some findings that might be relevant to KernelCI's testing
> strategy:
> 
> - Security bugs hide for 2.1 years on average; race conditions persist 5.0 years
> - 117 "super-reviewers" (including Dan Carpenter, who invented the
> Fixes: tag) catch bugs 47% faster
> - Subsystems like CAN bus (4.2 years) and SCTP (4.0 years) have
> dramatically longer bug lifetimes than gpu/i915 (1.4 years)
> - Weekend commits are 8% less likely to introduce bugs, but take 45%
> longer to fix (review coverage effect)
> 
> The model (VulnBERT) achieves 92% recall at 1.2% false positive rate
> on held-out 2024 data. I'm also working on SmartKuang, an RL-based
> system that has reproduced CVE-2022-34918 autonomously.

I hate to say "your AI model could be replaced with a SQL statement",
but really, we already have tools that provide all of this data in a
SQLite database that people can mine for the same info.
It's what the kernel CVE team uses to track bug fixes over time for
their work:
	https://git.sr.ht/~gregkh/verhaal
and it is part of the vulns.git repo on git.kernel.org.
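
A minimal sketch of the kind of SQL query this refers to, computing the
average time between a buggy commit and the commit that fixes it.  The
table and column names are hypothetical and made up for illustration;
they are not verhaal's actual schema, and the helper names are likewise
assumptions.

```python
import sqlite3

# Hypothetical schema for illustration only; NOT verhaal's real layout.
SCHEMA = """
CREATE TABLE commits (sha TEXT PRIMARY KEY, committed TEXT);  -- ISO-8601 date
CREATE TABLE fixes (fix_sha TEXT, bug_sha TEXT);  -- one row per Fixes: tag
"""

# Average number of days between the buggy commit and its fix.
LIFETIME_SQL = """
SELECT AVG(julianday(f.committed) - julianday(b.committed))
FROM fixes
JOIN commits AS f ON f.sha = fixes.fix_sha
JOIN commits AS b ON b.sha = fixes.bug_sha
"""


def build_db(commits, fixes):
    """Load (sha, date) and (fix_sha, bug_sha) rows into an in-memory DB."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany("INSERT INTO commits VALUES (?, ?)", commits)
    con.executemany("INSERT INTO fixes VALUES (?, ?)", fixes)
    return con


def mean_bug_lifetime_days(con):
    """Return the mean bug lifetime in days, or None if there are no pairs."""
    return con.execute(LIFETIME_SQL).fetchone()[0]


if __name__ == "__main__":
    # Made-up sample data, not real kernel commits.
    sample_commits = [
        ("aaa", "2020-01-01"), ("bbb", "2020-01-11"),
        ("ccc", "2021-03-01"), ("ddd", "2021-03-31"),
    ]
    sample_fixes = [("bbb", "aaa"), ("ddd", "ccc")]
    print(mean_bug_lifetime_days(build_db(sample_commits, sample_fixes)))
```

With real data, the rows would come from whatever export of commits and
Fixes: tags is at hand; grouping by subsystem or author would then be an
extra GROUP BY on top of the same join.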

Also, for mapping people to employers and tracking who is doing the
work, see the reports on lwn.net that have been documenting this for the
past few decades.  The tool for that is also public (but part of the
employer-mapping database is not, for obvious reasons, sorry).  I think
you undercounted people's employers a lot, as you cannot always rely on
email addresses to convey this.

Anyway, I liked your reports; I'm always interested in more people
mining our public data for stuff like this, and it's great to see.  But
with regard to kernelci, how do you feel this information can help our
project?  What would you like us to do based on what you have found
here?

thanks,

greg k-h
