* Talk proposal: What 125K kernel bugs tell us about testing gaps
@ 2026-02-05  2:49 Jenny Qu
  2026-02-05  7:00 ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread

From: Jenny Qu @ 2026-02-05 2:49 UTC (permalink / raw)
To: kernelci

Hi,

I'm a security researcher working on automated kernel vulnerability
detection. I'd love to present at an upcoming Thursday call if there's
interest.

I analyzed every Fixes: tag in the kernel's 20-year git history (125K
bug-fix pairs) and built a model to catch vulnerabilities at commit
time. Some findings that might be relevant to KernelCI's testing
strategy:

- Security bugs hide for 2.1 years on average; race conditions persist
  5.0 years
- 117 "super-reviewers" (including Dan Carpenter, who invented the
  Fixes: tag) catch bugs 47% faster
- Subsystems like CAN bus (4.2 years) and SCTP (4.0 years) have
  dramatically longer bug lifetimes than gpu/i915 (1.4 years)
- Weekend commits are 8% less likely to introduce bugs, but take 45%
  longer to fix (review coverage effect)

The model (VulnBERT) achieves 92% recall at 1.2% false positive rate
on held-out 2024 data. I'm also working on SmartKuang, an RL-based
system that has reproduced CVE-2022-34918 autonomously.

Happy to do 15-20 min on whatever slice would be most useful—the
dataset findings, the detection approach, or how this could complement
KernelCI's coverage.

Writeups:
- https://pebblebed.com/blog/kernel-bugs
- https://pebblebed.com/blog/kernel-bugs-part2

Jenny
jenny@pebblebed.com

^ permalink raw reply	[flat|nested] 6+ messages in thread
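[Editor's note: for readers curious about the mechanics of the Fixes:-tag mining described above, the descriptive half reduces to pairing each Fixes: trailer with the timestamp of the commit it references. A minimal sketch follows; the record layout and the sample shas are invented for illustration and are not Jenny's actual pipeline. In practice the input would come from something like `git log --format='%H %ct%n%(trailers:key=Fixes)'`.]

```python
from datetime import datetime, timezone  # noqa: F401  (handy for converting timestamps)
import re

def bug_lifetimes(log_text):
    """Pair each Fixes: tag with the introducing commit's timestamp and
    return bug lifetimes in years.  Expects records of the form
    '<full-sha> <unix-timestamp>' optionally followed by
    'Fixes: <abbrev-sha> ("subject")' trailer lines, records separated
    by blank lines."""
    commit_time = {}   # full sha -> commit timestamp
    fixes = []         # (fix timestamp, abbreviated sha of buggy commit)
    for record in log_text.strip().split("\n\n"):
        lines = record.strip().splitlines()
        sha, ts = lines[0].split()
        commit_time[sha] = int(ts)
        for line in lines[1:]:
            m = re.match(r"\s*Fixes:\s+([0-9a-f]{12,40})", line)
            if m:
                fixes.append((int(ts), m.group(1)))
    lifetimes = []
    for fix_ts, prefix in fixes:
        # resolve the abbreviated sha against the known full shas
        for sha, intro_ts in commit_time.items():
            if sha.startswith(prefix):
                lifetimes.append((fix_ts - intro_ts) / (365.25 * 24 * 3600))
                break
    return lifetimes

# Two toy commits two years apart; the second fixes the first.
SAMPLE = """\
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 1600000000

bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb 1663072000
Fixes: aaaaaaaaaaaa ("example: add widget handling")
"""
print([round(x, 1) for x in bug_lifetimes(SAMPLE)])  # -> [2.0]
```

Aggregating these per subsystem (grouped by the paths each commit touches) is all that is needed for lifetime comparisons like the CAN-bus vs. i915 numbers quoted above.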
* Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
  2026-02-05  2:49 Talk proposal: What 125K kernel bugs tell us about testing gaps Jenny Qu
@ 2026-02-05  7:00 ` Greg KH
  2026-02-05  8:58   ` Jenny Qu
  0 siblings, 1 reply; 6+ messages in thread

From: Greg KH @ 2026-02-05 7:00 UTC (permalink / raw)
To: Jenny Qu; +Cc: kernelci

On Wed, Feb 04, 2026 at 06:49:57PM -0800, Jenny Qu wrote:
> Hi,
>
> I'm a security researcher working on automated kernel vulnerability
> detection. I'd love to present at an upcoming Thursday call if there's
> interest.

Cool, but isn't this a better subject for a conference talk?

> I analyzed every Fixes: tag in the kernel's 20-year git history (125K
> bug-fix pairs) and built a model to catch vulnerabilities at commit
> time. Some findings that might be relevant to KernelCI's testing
> strategy:
>
> - Security bugs hide for 2.1 years on average; race conditions persist
>   5.0 years
> - 117 "super-reviewers" (including Dan Carpenter, who invented the
>   Fixes: tag) catch bugs 47% faster
> - Subsystems like CAN bus (4.2 years) and SCTP (4.0 years) have
>   dramatically longer bug lifetimes than gpu/i915 (1.4 years)
> - Weekend commits are 8% less likely to introduce bugs, but take 45%
>   longer to fix (review coverage effect)
>
> The model (VulnBERT) achieves 92% recall at 1.2% false positive rate
> on held-out 2024 data. I'm also working on SmartKuang, an RL-based
> system that has reproduced CVE-2022-34918 autonomously.

I hate to say "your ai model could be replaced with a sql statement",
but really, we do have tools that show this today that give all of this
data in a sqlite database that people can use to mine for the same info.
It's what the kernel CVE team uses to track bug fixes over time for
their work:
	https://git.sr.ht/~gregkh/verhaal
and is part of the vulns.git repo on git.kernel.org

Also for the tracking of employer to people and who is doing the work,
see the reports on lwn.net for the past few decades that have been
documenting this.  The tool for that is also public (but part of the
database of employer mapping is not for obvious reasons, sorry).  I
think you undercounted people's employers a lot as you can not always
rely on email addresses to convey this.

Anyway, I liked your reports as I'm always interested in more people
mining our public data for stuff like this, it's great to see.  But with
regards to kernelci, how do you feel this information can help with our
project?  What would you like us to do based on what you have found
here?

thanks,

greg k-h
* Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
  2026-02-05  7:00 ` Greg KH
@ 2026-02-05  8:58   ` Jenny Qu
  2026-02-05 14:22     ` Greg KH
                       ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread

From: Jenny Qu @ 2026-02-05 8:58 UTC (permalink / raw)
To: Greg KH; +Cc: kernelci

[resending to list - accidentally replied off-list]

On Wed, Feb 04, 2026 at 11:00:00PM, Greg KH wrote:
> I hate to say "your ai model could be replaced with a sql statement"

Fair point on the descriptive statistics. I should have been clearer:
the 125K bug analysis was training data, not the contribution. verhaal
and the LWN employer reports (Jonathan Corbet's per-release stats
using the gitdm database) already cover the descriptive side well.

The part SQL can't do is the predictive model. VulnBERT takes a raw
git diff *before merge* and predicts whether it introduces a
vulnerability. The evaluation is a strict temporal holdout: trained
on commits with Fixes: tags from <=2023, tested on 2024 commits that
later received Fixes: tags. 92% recall, 1.2% FPR on that split.

To be direct about limitations: those numbers are on historical data
where we know ground truth. The model catches patterns it's seen
before (unbalanced refcounts, missing NULL checks, lock/unlock
mismatches). It will miss novel bug classes it hasn't been trained on.
It's a triage tool and not yet an oracle.

And it's not ready for production use yet. I'm reworking the
architecture. The current approach uses CodeBERT embeddings with
handcrafted features, and I think incorporating LLM reasoning traces
over diffs will do substantially better. I don't want to hand anyone
a tool that generates false confidence.

On employer attribution: you're right, email domain mapping
undercounts significantly. Developers using personal emails,
acquisitions (Mellanox -> NVIDIA), and consultants all break the
heuristic.

> how do you feel this information can help with our project? What
> would you like us to do based on what you have found here?

Honestly, I'd rather hear from the KernelCI community what would
actually be useful than prescribe solutions. But two directions I
think are worth discussing:

1. Subsystem-level test prioritization. The lifetime gap between
   CAN bus (4.2 years) and gpu/i915 (1.4 years) almost certainly
   reflects testing coverage differences. i915 has dedicated
   fuzzing infrastructure and active reviewers like Chris Wilson
   and Ville Syrjala. KernelCI could use lifetime data as a signal
   for where to invest in test enablement. This is actionable now,
   no ML required.

2. Longer-term: commit-level risk scoring to allocate CI resources.
   Flag high-risk commits for extra sanitizer runs, longer fuzzing
   passes. Low-risk commits get the standard pipeline. But this
   needs a model I trust enough to deploy, and I'm not there yet.

I'm speaking at BugBash 2026 in April and looking at LPC for a more
technical deep-dive.

kindly,
Jenny
jenny@pebblebed.com
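[Editor's note: the recall and false-positive-rate figures discussed in this exchange are standard confusion-matrix quantities, easy to compute once labels exist. A toy sketch, with invented labels that have nothing to do with VulnBERT's reported numbers:]

```python
def recall_and_fpr(y_true, y_pred):
    """Recall = TP / (TP + FN); false-positive rate = FP / (FP + TN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    return tp / (tp + fn), fp / (fp + tn)

# A temporal holdout means the split is by date, not random: train only
# on commits from <=2023, score only commits from 2024, so no future
# information leaks into training.  Toy 2024 labels:
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # 1 = later received a Fixes: tag
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]   # 1 = model flagged the commit

recall, fpr = recall_and_fpr(y_true, y_pred)
print(recall, round(fpr, 3))  # -> 0.75 0.167
```

The temporal split matters because a random split would let the model "see" 2024 coding conventions during training, inflating the measured recall.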
* Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
  2026-02-05  8:58   ` Jenny Qu
@ 2026-02-05 14:22     ` Greg KH
  2026-02-05 19:31     ` Donald Zickus
       [not found]     ` <CAK18DXbBKCVPFfWMg3DCv_iHiUOWiAvAtVZ-J1nfQJ3fhbdb-g@mail.gmail.com>
  2 siblings, 0 replies; 6+ messages in thread

From: Greg KH @ 2026-02-05 14:22 UTC (permalink / raw)
To: Jenny Qu; +Cc: kernelci

On Thu, Feb 05, 2026 at 12:58:20AM -0800, Jenny Qu wrote:
> [resending to list - accidentally replied off-list]
>
> On Wed, Feb 04, 2026 at 11:00:00PM, Greg KH wrote:
> > I hate to say "your ai model could be replaced with a sql statement"
>
> Fair point on the descriptive statistics. I should have been clearer:
> the 125K bug analysis was training data, not the contribution. verhaal
> and the LWN employer reports (Jonathan Corbet's per-release stats
> using the gitdm database) already cover the descriptive side well.
>
> The part SQL can't do is the predictive model. VulnBERT takes a raw
> git diff *before merge* and predicts whether it introduces a
> vulnerability. The evaluation is a strict temporal holdout: trained
> on commits with Fixes: tags from <=2023, tested on 2024 commits that
> later received Fixes: tags. 92% recall, 1.2% FPR on that split.

Cool!  So you have re-implemented Sasha's AUTOSEL bot?  :)

Note, there are papers and presentations about how that works for the
past 10 years, you might want to look into that as it seems that your
models are the same here (prediction as to what type of commit is a
fix).

> To be direct about limitations: those numbers are on historical data
> where we know ground truth. The model catches patterns it's seen
> before (unbalanced refcounts, missing NULL checks, lock/unlock
> mismatches). It will miss novel bug classes it hasn't been trained on.
> It's a triage tool and not yet an oracle.

That's fine, we need that.  And if you have a pattern that it matches,
let's add it to our coccinelle ruleset so that it does not come back in!

> And it's not ready for production use yet. I'm reworking the
> architecture. The current approach uses CodeBERT embeddings with
> handcrafted features, and I think incorporating LLM reasoning traces
> over diffs will do substantially better. I don't want to hand anyone
> a tool that generates false confidence.

Look at the ebpf "AI" patch reviews that are happening on the mailing
list today already if you want an example of how this could work.

Try running it on the output of the lore.kernel.org git repos (email is
in git format for others to work easily off of, including the tool
'lei').  Then if your tool catches problems, email them to the patch
authors and list to let them know!  That's the best thing we can do now,
catch bugs before they are committed.

> 1. Subsystem-level test prioritization. The lifetime gap between
>    CAN bus (4.2 years) and gpu/i915 (1.4 years) almost certainly
>    reflects testing coverage differences. i915 has dedicated
>    fuzzing infrastructure and active reviewers like Chris Wilson
>    and Ville Syrjala. KernelCI could use lifetime data as a signal
>    for where to invest in test enablement. This is actionable now,
>    no ML required.

Yes, that is directly due to fuzzing issues.  Fuzzers work on a "layer
by layer" basis, working deeper into the kernel and adding different
subsystems all the time.  That's why you will see "waves" of bugfixes
happening like this.  It's normal and to be expected.

> 2. Longer-term: commit-level risk scoring to allocate CI resources.
>    Flag high-risk commits for extra sanitizer runs, longer fuzzing
>    passes. Low-risk commits get the standard pipeline. But this
>    needs a model I trust enough to deploy, and I'm not there yet.

Again, look at what is already happening on these types of reviews and
perhaps plug your model into that as well and see what happens?  We're
always wanting more code review to help alleviate our most limited
resource, maintainers to review changes.

thanks!

greg k-h
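[Editor's note: the "pattern that it matches" idea above can be made concrete even without Coccinelle. Below is a deliberately crude Python sketch that flags lock/unlock imbalance on a diff's added lines; it is illustrative only — real checking needs path-sensitive tools like Coccinelle or smatch — and the driver diff is invented.]

```python
import re

# Simplistic lexical patterns for two common kernel locking primitives.
LOCK = re.compile(r"\b(spin|mutex)_lock\b")
UNLOCK = re.compile(r"\b(spin|mutex)_unlock\b")

def added_lock_imbalance(diff_text):
    """Count lock vs. unlock calls on a unified diff's added lines
    ('+' prefix, excluding the '+++' file header).  A nonzero result is
    a cheap triage signal for a possible lock-held-on-error-path bug,
    nothing more: it ignores control flow entirely."""
    locks = unlocks = 0
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            locks += len(LOCK.findall(line))
            unlocks += len(UNLOCK.findall(line))
    return locks - unlocks

DIFF = """\
--- a/drivers/foo.c
+++ b/drivers/foo.c
@@ -10,4 +10,6 @@
+\tspin_lock(&dev->lock);
 \tdev->count++;
+\tif (dev->count > MAX)
+\t\treturn -EINVAL;   /* error path leaves the lock held */
"""
print(added_lock_imbalance(DIFF))  # -> 1 (one lock added, no unlock)
```

A positive result here is exactly the kind of finding that, once confirmed, is worth encoding as a proper semantic patch so the pattern "does not come back in".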
* Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
  2026-02-05  8:58   ` Jenny Qu
  2026-02-05 14:22     ` Greg KH
@ 2026-02-05 19:31     ` Donald Zickus
       [not found]     ` <CAK18DXbBKCVPFfWMg3DCv_iHiUOWiAvAtVZ-J1nfQJ3fhbdb-g@mail.gmail.com>
  2 siblings, 0 replies; 6+ messages in thread

From: Donald Zickus @ 2026-02-05 19:31 UTC (permalink / raw)
To: Jenny Qu; +Cc: Greg KH, kernelci

(resending in plain text instead of html)

Hi Jenny,

On Thu, Feb 5, 2026 at 4:44 AM Jenny Qu <jenny@pebblebed.com> wrote:
>
> [...]
>
> 1. Subsystem-level test prioritization. The lifetime gap between
>    CAN bus (4.2 years) and gpu/i915 (1.4 years) almost certainly
>    reflects testing coverage differences. i915 has dedicated
>    fuzzing infrastructure and active reviewers like Chris Wilson
>    and Ville Syrjala. KernelCI could use lifetime data as a signal
>    for where to invest in test enablement. This is actionable now,
>    no ML required.
>
> 2. Longer-term: commit-level risk scoring to allocate CI resources.
>    Flag high-risk commits for extra sanitizer runs, longer fuzzing
>    passes. Low-risk commits get the standard pipeline. But this
>    needs a model I trust enough to deploy, and I'm not there yet.
>
> I'm speaking at BugBash 2026 in April and looking at LPC for a more
> technical deep-dive.

Thanks for this.  As a board member of KernelCI, most of the efforts we
have funded or tried to support are ones that have been adopted by the
community.  The work we sponsor needs to provide value to the
community, but the kernel community can be tricky to navigate, as you
can see from Greg's comments.

I would recommend those conferences, but also try attaching your work
as replies to various patches.  Try to show off the value of your work
on mailing lists and let that start conversations on how to steer it
towards something that could be considered useful.  That journey will
lead to overlap with the existing technologies Greg mentioned, but more
importantly it will lead to conversations on how to collaborate around
those technologies to make something valuable to the community.  The
end result being that it becomes a no-brainer to add to kernelci.

A current example we are working with is Thorsten's regzbot[0]: a
difficult social problem around regression tracking that the community
helped him navigate towards something of value, and that now makes
sense for kernelci to sponsor.

Cheers,
Don

[0] - https://linux-regtracking.leemhuis.info/about/
[parent not found: <CAK18DXbBKCVPFfWMg3DCv_iHiUOWiAvAtVZ-J1nfQJ3fhbdb-g@mail.gmail.com>]
* Re: Talk proposal: What 125K kernel bugs tell us about testing gaps
       [not found] ` <CAK18DXbBKCVPFfWMg3DCv_iHiUOWiAvAtVZ-J1nfQJ3fhbdb-g@mail.gmail.com>
@ 2026-02-05 19:57   ` Jenny Qu
  0 siblings, 0 replies; 6+ messages in thread

From: Jenny Qu @ 2026-02-05 19:57 UTC (permalink / raw)
To: Donald Zickus; +Cc: Greg KH, kernelci

Thanks Greg and Don, this is exactly the guidance I needed. I'll dig
into AUTOSEL, the eBPF AI review workflow, and Coccinelle. The path
makes sense: prove value by running on real patches and engaging
on-list, iterate based on feedback, let adoption happen organically.

Will report back when I have something worth showing.

Kindly,
Jenny

On Thu, Feb 5, 2026 at 10:24 AM Donald Zickus <dzickus@redhat.com> wrote:
>
> Hi Jenny,
>
> [...]
>
> I would recommend those conferences but also try attaching your work
> as replies to various patches. Try to show off the value of your work
> on mailing lists and let that start conversations on how to steer it
> towards something that could be considered useful. [...]
>
> Cheers,
> Don
end of thread, other threads:[~2026-02-05 19:57 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-05 2:49 Talk proposal: What 125K kernel bugs tell us about testing gaps Jenny Qu
2026-02-05 7:00 ` Greg KH
2026-02-05 8:58 ` Jenny Qu
2026-02-05 14:22 ` Greg KH
2026-02-05 19:31 ` Donald Zickus
[not found] ` <CAK18DXbBKCVPFfWMg3DCv_iHiUOWiAvAtVZ-J1nfQJ3fhbdb-g@mail.gmail.com>
2026-02-05 19:57 ` Jenny Qu