* Talk proposal: What 125K kernel bugs tell us about testing gaps
@ 2026-02-05  2:49 Jenny Qu
  2026-02-05  7:00 ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread

From: Jenny Qu @ 2026-02-05 2:49 UTC (permalink / raw)
To: kernelci

Hi,

I'm a security researcher working on automated kernel vulnerability
detection. I'd love to present at an upcoming Thursday call if there's
interest.

I analyzed every Fixes: tag in the kernel's 20-year git history (125K
bug-fix pairs) and built a model to catch vulnerabilities at commit
time. Some findings that might be relevant to KernelCI's testing
strategy:

- Security bugs hide for 2.1 years on average; race conditions persist
  5.0 years
- 117 "super-reviewers" (including Dan Carpenter, who invented the
  Fixes: tag) catch bugs 47% faster
- Subsystems like CAN bus (4.2 years) and SCTP (4.0 years) have
  dramatically longer bug lifetimes than gpu/i915 (1.4 years)
- Weekend commits are 8% less likely to introduce bugs, but take 45%
  longer to fix (review coverage effect)

The model (VulnBERT) achieves 92% recall at 1.2% false positive rate
on held-out 2024 data. I'm also working on SmartKuang, an RL-based
system that has reproduced CVE-2022-34918 autonomously.

Happy to do 15-20 min on whatever slice would be most useful—the
dataset findings, the detection approach, or how this could complement
KernelCI's coverage.

Writeups:
- https://pebblebed.com/blog/kernel-bugs
- https://pebblebed.com/blog/kernel-bugs-part2

Jenny
jenny@pebblebed.com

^ permalink raw reply	[flat|nested] 6+ messages in thread
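[Editor's note: for readers curious about the mechanics of the Fixes:-tag mining described above, the descriptive half reduces to pairing each Fixes: trailer with the timestamp of the commit it references. A minimal sketch follows; the record layout and the sample shas are invented for illustration and are not Jenny's actual pipeline. In practice the input would come from something like `git log --format='%H %ct%n%(trailers:key=Fixes)'`.]

```python
from datetime import datetime, timezone  # noqa: F401  (handy for converting timestamps)
import re

def bug_lifetimes(log_text):
    """Pair each Fixes: tag with the introducing commit's timestamp and
    return bug lifetimes in years.  Expects records of the form
    '<full-sha> <unix-timestamp>' optionally followed by
    'Fixes: <abbrev-sha> ("subject")' trailer lines, records separated
    by blank lines."""
    commit_time = {}   # full sha -> commit timestamp
    fixes = []         # (fix timestamp, abbreviated sha of buggy commit)
    for record in log_text.strip().split("\n\n"):
        lines = record.strip().splitlines()
        sha, ts = lines[0].split()
        commit_time[sha] = int(ts)
        for line in lines[1:]:
            m = re.match(r"\s*Fixes:\s+([0-9a-f]{12,40})", line)
            if m:
                fixes.append((int(ts), m.group(1)))
    lifetimes = []
    for fix_ts, prefix in fixes:
        # resolve the abbreviated sha against the known full shas
        for sha, intro_ts in commit_time.items():
            if sha.startswith(prefix):
                lifetimes.append((fix_ts - intro_ts) / (365.25 * 24 * 3600))
                break
    return lifetimes

# Two toy commits two years apart; the second fixes the first.
SAMPLE = """\
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 1600000000

bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb 1663072000
Fixes: aaaaaaaaaaaa ("example: add widget handling")
"""
print([round(x, 1) for x in bug_lifetimes(SAMPLE)])  # -> [2.0]
```

Aggregating these per subsystem (grouped by the paths each commit touches) is all that is needed for lifetime comparisons like the CAN-bus vs. i915 numbers quoted above.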
* Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
  2026-02-05  2:49 Talk proposal: What 125K kernel bugs tell us about testing gaps Jenny Qu
@ 2026-02-05  7:00 ` Greg KH
  2026-02-05  8:58   ` Jenny Qu
  0 siblings, 1 reply; 6+ messages in thread

From: Greg KH @ 2026-02-05 7:00 UTC (permalink / raw)
To: Jenny Qu; +Cc: kernelci

On Wed, Feb 04, 2026 at 06:49:57PM -0800, Jenny Qu wrote:
> Hi,
>
> I'm a security researcher working on automated kernel vulnerability
> detection. I'd love to present at an upcoming Thursday call if there's
> interest.

Cool, but isn't this a better subject for a conference talk?

> I analyzed every Fixes: tag in the kernel's 20-year git history (125K
> bug-fix pairs) and built a model to catch vulnerabilities at commit
> time. Some findings that might be relevant to KernelCI's testing
> strategy:
>
> - Security bugs hide for 2.1 years on average; race conditions persist
>   5.0 years
> - 117 "super-reviewers" (including Dan Carpenter, who invented the
>   Fixes: tag) catch bugs 47% faster
> - Subsystems like CAN bus (4.2 years) and SCTP (4.0 years) have
>   dramatically longer bug lifetimes than gpu/i915 (1.4 years)
> - Weekend commits are 8% less likely to introduce bugs, but take 45%
>   longer to fix (review coverage effect)
>
> The model (VulnBERT) achieves 92% recall at 1.2% false positive rate
> on held-out 2024 data. I'm also working on SmartKuang, an RL-based
> system that has reproduced CVE-2022-34918 autonomously.

I hate to say "your ai model could be replaced with a sql statement",
but really, we do have tools that show this today that give all of this
data in a sqlite database that people can use to mine for the same info.
It's what the kernel CVE team uses to track bug fixes over time for
their work:
	https://git.sr.ht/~gregkh/verhaal
and is part of the vulns.git repo on git.kernel.org

Also for the tracking of employer to people and who is doing the work,
see the reports on lwn.net for the past few decades that have been
documenting this.  The tool for that is also public (but part of the
database of employer mapping is not for obvious reasons, sorry).  I
think you undercounted people's employers a lot as you can not always
rely on email addresses to convey this.

Anyway, I liked your reports as I'm always interested in more people
mining our public data for stuff like this, it's great to see.  But with
regards to kernelci, how do you feel this information can help with our
project?  What would you like us to do based on what you have found
here?

thanks,

greg k-h
* Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
  2026-02-05  7:00 ` Greg KH
@ 2026-02-05  8:58   ` Jenny Qu
  2026-02-05 14:22     ` Greg KH
                       ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread

From: Jenny Qu @ 2026-02-05 8:58 UTC (permalink / raw)
To: Greg KH; +Cc: kernelci

[resending to list - accidentally replied off-list]

On Wed, Feb 04, 2026 at 11:00:00PM, Greg KH wrote:
> I hate to say "your ai model could be replaced with a sql statement"

Fair point on the descriptive statistics. I should have been clearer:
the 125K bug analysis was training data, not the contribution. verhaal
and the LWN employer reports (Jonathan Corbet's per-release stats
using the gitdm database) already cover the descriptive side well.

The part SQL can't do is the predictive model. VulnBERT takes a raw
git diff *before merge* and predicts whether it introduces a
vulnerability. The evaluation is a strict temporal holdout: trained
on commits with Fixes: tags from <=2023, tested on 2024 commits that
later received Fixes: tags. 92% recall, 1.2% FPR on that split.

To be direct about limitations: those numbers are on historical data
where we know ground truth. The model catches patterns it's seen
before (unbalanced refcounts, missing NULL checks, lock/unlock
mismatches). It will miss novel bug classes it hasn't been trained on.
It's a triage tool and not yet an oracle.

And it's not ready for production use yet. I'm reworking the
architecture. The current approach uses CodeBERT embeddings with
handcrafted features, and I think incorporating LLM reasoning traces
over diffs will do substantially better. I don't want to hand anyone
a tool that generates false confidence.

On employer attribution: you're right, email domain mapping
undercounts significantly. Developers using personal emails,
acquisitions (Mellanox -> NVIDIA), and consultants all break the
heuristic.

> how do you feel this information can help with our project? What
> would you like us to do based on what you have found here?

Honestly, I'd rather hear from the KernelCI community what would
actually be useful than prescribe solutions. But two directions I
think are worth discussing:

1. Subsystem-level test prioritization. The lifetime gap between
   CAN bus (4.2 years) and gpu/i915 (1.4 years) almost certainly
   reflects testing coverage differences. i915 has dedicated
   fuzzing infrastructure and active reviewers like Chris Wilson
   and Ville Syrjala. KernelCI could use lifetime data as a signal
   for where to invest in test enablement. This is actionable now,
   no ML required.

2. Longer-term: commit-level risk scoring to allocate CI resources.
   Flag high-risk commits for extra sanitizer runs, longer fuzzing
   passes. Low-risk commits get the standard pipeline. But this
   needs a model I trust enough to deploy, and I'm not there yet.

I'm speaking at BugBash 2026 in April and looking at LPC for a more
technical deep-dive.

kindly,
Jenny
jenny@pebblebed.com
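[Editor's note: the recall and false-positive-rate figures discussed in this exchange are standard confusion-matrix quantities, easy to compute once labels exist. A toy sketch, with invented labels that have nothing to do with VulnBERT's reported numbers:]

```python
def recall_and_fpr(y_true, y_pred):
    """Recall = TP / (TP + FN); false-positive rate = FP / (FP + TN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    return tp / (tp + fn), fp / (fp + tn)

# A temporal holdout means the split is by date, not random: train only
# on commits from <=2023, score only commits from 2024, so no future
# information leaks into training.  Toy 2024 labels:
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # 1 = later received a Fixes: tag
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]   # 1 = model flagged the commit

recall, fpr = recall_and_fpr(y_true, y_pred)
print(recall, round(fpr, 3))  # -> 0.75 0.167
```

The temporal split matters because a random split would let the model "see" 2024 coding conventions during training, inflating the measured recall.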
* Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
  2026-02-05  8:58   ` Jenny Qu
@ 2026-02-05 14:22     ` Greg KH
  2026-02-05 19:31     ` Donald Zickus
       [not found]     ` <CAK18DXbBKCVPFfWMg3DCv_iHiUOWiAvAtVZ-J1nfQJ3fhbdb-g@mail.gmail.com>
  2 siblings, 0 replies; 6+ messages in thread

From: Greg KH @ 2026-02-05 14:22 UTC (permalink / raw)
To: Jenny Qu; +Cc: kernelci

On Thu, Feb 05, 2026 at 12:58:20AM -0800, Jenny Qu wrote:
> [resending to list - accidentally replied off-list]
>
> On Wed, Feb 04, 2026 at 11:00:00PM, Greg KH wrote:
> > I hate to say "your ai model could be replaced with a sql statement"
>
> Fair point on the descriptive statistics. I should have been clearer:
> the 125K bug analysis was training data, not the contribution. verhaal
> and the LWN employer reports (Jonathan Corbet's per-release stats
> using the gitdm database) already cover the descriptive side well.
>
> The part SQL can't do is the predictive model. VulnBERT takes a raw
> git diff *before merge* and predicts whether it introduces a
> vulnerability. The evaluation is a strict temporal holdout: trained
> on commits with Fixes: tags from <=2023, tested on 2024 commits that
> later received Fixes: tags. 92% recall, 1.2% FPR on that split.

Cool!  So you have re-implemented Sasha's AUTOSEL bot?  :)

Note, there are papers and presentations about how that works for the
past 10 years, you might want to look into that as it seems that your
models are the same here (prediction as to what type of commit is a
fix).

> To be direct about limitations: those numbers are on historical data
> where we know ground truth. The model catches patterns it's seen
> before (unbalanced refcounts, missing NULL checks, lock/unlock
> mismatches). It will miss novel bug classes it hasn't been trained on.
> It's a triage tool and not yet an oracle.

That's fine, we need that.  And if you have a pattern that it matches,
let's add it to our coccinelle ruleset so that it does not come back in!

> And it's not ready for production use yet. I'm reworking the
> architecture. The current approach uses CodeBERT embeddings with
> handcrafted features, and I think incorporating LLM reasoning traces
> over diffs will do substantially better. I don't want to hand anyone
> a tool that generates false confidence.

Look at the ebpf "AI" patch reviews that are happening on the mailing
list today already if you want an example of how this could work.

Try running it on the output of the lore.kernel.org git repos (email is
in git format for others to work easily off of, including the tool
'lei').  Then if your tool catches problems, email them to the patch
authors and list to let them know!  That's the best thing we can do now,
catch bugs before they are committed.

> 1. Subsystem-level test prioritization. The lifetime gap between
>    CAN bus (4.2 years) and gpu/i915 (1.4 years) almost certainly
>    reflects testing coverage differences. i915 has dedicated
>    fuzzing infrastructure and active reviewers like Chris Wilson
>    and Ville Syrjala. KernelCI could use lifetime data as a signal
>    for where to invest in test enablement. This is actionable now,
>    no ML required.

Yes, that is directly due to fuzzing issues.  Fuzzers work on a "layer
by layer" basis, working deeper into the kernel and adding different
subsystems all the time.  That's why you will see "waves" of bugfixes
happening like this.  It's normal and to be expected.

> 2. Longer-term: commit-level risk scoring to allocate CI resources.
>    Flag high-risk commits for extra sanitizer runs, longer fuzzing
>    passes. Low-risk commits get the standard pipeline. But this
>    needs a model I trust enough to deploy, and I'm not there yet.

Again, look at what is already happening on these types of reviews and
perhaps plug your model into that as well and see what happens?  We're
always wanting more code review to help alleviate our most limited
resource, maintainers to review changes.

thanks!

greg k-h
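[Editor's note: the "pattern that it matches" idea above can be made concrete even without Coccinelle. Below is a deliberately crude Python sketch that flags lock/unlock imbalance on a diff's added lines; it is illustrative only — real checking needs path-sensitive tools like Coccinelle or smatch — and the driver diff is invented.]

```python
import re

# Simplistic lexical patterns for two common kernel locking primitives.
LOCK = re.compile(r"\b(spin|mutex)_lock\b")
UNLOCK = re.compile(r"\b(spin|mutex)_unlock\b")

def added_lock_imbalance(diff_text):
    """Count lock vs. unlock calls on a unified diff's added lines
    ('+' prefix, excluding the '+++' file header).  A nonzero result is
    a cheap triage signal for a possible lock-held-on-error-path bug,
    nothing more: it ignores control flow entirely."""
    locks = unlocks = 0
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            locks += len(LOCK.findall(line))
            unlocks += len(UNLOCK.findall(line))
    return locks - unlocks

DIFF = """\
--- a/drivers/foo.c
+++ b/drivers/foo.c
@@ -10,4 +10,6 @@
+\tspin_lock(&dev->lock);
 \tdev->count++;
+\tif (dev->count > MAX)
+\t\treturn -EINVAL;   /* error path leaves the lock held */
"""
print(added_lock_imbalance(DIFF))  # -> 1 (one lock added, no unlock)
```

A positive result here is exactly the kind of finding that, once confirmed, is worth encoding as a proper semantic patch so the pattern "does not come back in".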
* Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
  2026-02-05  8:58   ` Jenny Qu
  2026-02-05 14:22     ` Greg KH
@ 2026-02-05 19:31     ` Donald Zickus
       [not found]     ` <CAK18DXbBKCVPFfWMg3DCv_iHiUOWiAvAtVZ-J1nfQJ3fhbdb-g@mail.gmail.com>
  2 siblings, 0 replies; 6+ messages in thread

From: Donald Zickus @ 2026-02-05 19:31 UTC (permalink / raw)
To: Jenny Qu; +Cc: Greg KH, kernelci

(resending in plain text instead of html)

Hi Jenny,

On Thu, Feb 5, 2026 at 4:44 AM Jenny Qu <jenny@pebblebed.com> wrote:
>
> [...]
>
> 1. Subsystem-level test prioritization. The lifetime gap between
>    CAN bus (4.2 years) and gpu/i915 (1.4 years) almost certainly
>    reflects testing coverage differences. i915 has dedicated
>    fuzzing infrastructure and active reviewers like Chris Wilson
>    and Ville Syrjala. KernelCI could use lifetime data as a signal
>    for where to invest in test enablement. This is actionable now,
>    no ML required.
>
> 2. Longer-term: commit-level risk scoring to allocate CI resources.
>    Flag high-risk commits for extra sanitizer runs, longer fuzzing
>    passes. Low-risk commits get the standard pipeline. But this
>    needs a model I trust enough to deploy, and I'm not there yet.
>
> I'm speaking at BugBash 2026 in April and looking at LPC for a more
> technical deep-dive.

Thanks for this.  As a board member of KernelCI, most of the efforts we
have funded or tried to support are ones that have been adopted by the
community.  The work we sponsor needs to provide value to the
community, but the kernel community can be tricky to navigate, as you
can see from Greg's comments.

I would recommend those conferences, but also try attaching your work
as replies to various patches.  Try to show off the value of your work
on mailing lists and let that start conversations on how to steer it
towards something that could be considered useful.  That journey will
lead to overlap with the existing technologies Greg mentioned, but more
importantly it will lead to conversations on how to collaborate around
those technologies to make something valuable to the community.  The
end result being that it becomes a no-brainer to add to kernelci.

A current example we are working with is Thorsten's regzbot[0]: a
difficult social problem around regression tracking that the community
helped him navigate towards something of value, and that now makes
sense for kernelci to sponsor.

Cheers,
Don

[0] - https://linux-regtracking.leemhuis.info/about/
[parent not found: <CAK18DXbBKCVPFfWMg3DCv_iHiUOWiAvAtVZ-J1nfQJ3fhbdb-g@mail.gmail.com>]
* Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
       [not found] ` <CAK18DXbBKCVPFfWMg3DCv_iHiUOWiAvAtVZ-J1nfQJ3fhbdb-g@mail.gmail.com>
@ 2026-02-05 19:57   ` Jenny Qu
  0 siblings, 0 replies; 6+ messages in thread

From: Jenny Qu @ 2026-02-05 19:57 UTC (permalink / raw)
To: Donald Zickus; +Cc: Greg KH, kernelci

Thanks Greg and Don, this is exactly the guidance I needed. I'll dig
into AUTOSEL, the eBPF AI review workflow, and Coccinelle. The path
makes sense: prove value by running on real patches and engaging
on-list, iterate based on feedback, let adoption happen organically.

Will report back when I have something worth showing.

Kindly,
Jenny

On Thu, Feb 5, 2026 at 10:24 AM Donald Zickus <dzickus@redhat.com> wrote:
>
> Hi Jenny,
>
> [...]
>
> I would recommend those conferences but also try attaching your work
> as replies to various patches. Try to show off the value of your work
> on mailing lists and let that start conversations on how to steer it
> towards something that could be considered useful. [...]
>
> Cheers,
> Don
end of thread, other threads:[~2026-02-05 19:57 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-05 2:49 Talk proposal: What 125K kernel bugs tell us about testing gaps Jenny Qu
2026-02-05 7:00 ` Greg KH
2026-02-05 8:58 ` Jenny Qu
2026-02-05 14:22 ` Greg KH
2026-02-05 19:31 ` Donald Zickus
[not found] ` <CAK18DXbBKCVPFfWMg3DCv_iHiUOWiAvAtVZ-J1nfQJ3fhbdb-g@mail.gmail.com>
2026-02-05 19:57 ` Jenny Qu