* Re: [PATCH 3/3] tracing: move tracing declarations from kernel.h to a dedicated header
From: Andy Shevchenko @ 2025-11-30 19:44 UTC
To: Yury Norov
Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, Randy Dunlap,
Ingo Molnar, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Petr Pavlu, Daniel Gomez, Greg Kroah-Hartman,
Rafael J. Wysocki, Danilo Krummrich, Andrew Morton, linux-kernel,
intel-gfx, dri-devel, linux-modules, linux-trace-kernel
In-Reply-To: <aSyJ83v7EEAPHXeU@yury>
On Sun, Nov 30, 2025 at 01:16:19PM -0500, Yury Norov wrote:
> On Sat, Nov 29, 2025 at 10:30:23PM +0200, Andy Shevchenko wrote:
> > On Sat, Nov 29, 2025 at 02:53:02PM -0500, Yury Norov (NVIDIA) wrote:
> > > Tracing is a half of the kernel.h in terms of LOCs, although it's a
> > > self-consistent part. Move it to a separate header.
> > >
> > > This is a pure move, except for removing a few 'extern's.
> >
> > Yeah, I also have something similar (but half-baked) locally, the Q I wanted to
> > ask is why a separate header? We have already some of tracing headers. Doesn't
> > suit well?
>
> Just as said in the commit message - this part is more or less
> self-consistent and debugging-oriented. If someone needs to just
> throw trace_printk() in their driver, they will not have to pull
> all the heavy tracing machinery.
Please, add a summary of this to it. It will be much clearer and based on it
I agree with your judgement.
...
> > > --- a/include/linux/kernel.h
> > > +++ b/include/linux/kernel.h
> > > @@ -27,6 +27,7 @@
> > > #include <linux/math.h>
> > > #include <linux/minmax.h>
> > > #include <linux/typecheck.h>
> >
> > > +#include <linux/tracing.h>
> >
> > There is better place for t*.h, i.e. after static_call_types.h.
>
> They are poorly sorted for seemingly no good reason. I found the first
> t*.h and just put this header next to it. Don't think that placing it
> next to static_call_types.h is any better or worse.
It's better, because the (sparse) chain of sorted includes is longer there.
> > Btw, have you tried to sort alphabetically the bulk in the kernel.h after
> > your series. Does it still build? (Just wondering about state of affairs
> > with the possible cyclic dependencies.)
>
> I didn't try. Sorting #include's is not the purpose of the series.
I know, I'm _just wondering_.
--
With Best Regards,
Andy Shevchenko
* Re: [PATCH 3/3] tracing: move tracing declarations from kernel.h to a dedicated header
From: Steven Rostedt @ 2025-11-30 20:34 UTC
To: Yury Norov (NVIDIA)
Cc: Masami Hiramatsu, Mathieu Desnoyers, Andy Shevchenko,
Randy Dunlap, Ingo Molnar, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Petr Pavlu, Daniel Gomez,
Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
linux-trace-kernel
In-Reply-To: <20251129195304.204082-4-yury.norov@gmail.com>
On Sat, 29 Nov 2025 14:53:02 -0500
"Yury Norov (NVIDIA)" <yury.norov@gmail.com> wrote:
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -27,6 +27,7 @@
> #include <linux/math.h>
> #include <linux/minmax.h>
> #include <linux/typecheck.h>
> +#include <linux/tracing.h>
> #include <linux/panic.h>
> #include <linux/printk.h>
> #include <linux/build_bug.h>
I'm fine with this as long as it's available as much as printk is.
Acked-by: Steven Rostedt <rostedt@goodmis.org>
-- Steve
* Re: [PATCH 3/3] tracing: move tracing declarations from kernel.h to a dedicated header
From: Steven Rostedt @ 2025-11-30 20:36 UTC
To: Andy Shevchenko
Cc: Yury Norov, Masami Hiramatsu, Mathieu Desnoyers, Randy Dunlap,
Ingo Molnar, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Petr Pavlu, Daniel Gomez, Greg Kroah-Hartman,
Rafael J. Wysocki, Danilo Krummrich, Andrew Morton, linux-kernel,
intel-gfx, dri-devel, linux-modules, linux-trace-kernel
In-Reply-To: <aSyertuRRX9Czvyz@smile.fi.intel.com>
On Sun, 30 Nov 2025 21:44:46 +0200
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
> On Sun, Nov 30, 2025 at 01:16:19PM -0500, Yury Norov wrote:
> > On Sat, Nov 29, 2025 at 10:30:23PM +0200, Andy Shevchenko wrote:
> > > On Sat, Nov 29, 2025 at 02:53:02PM -0500, Yury Norov (NVIDIA) wrote:
> > > > Tracing is a half of the kernel.h in terms of LOCs, although it's a
> > > > self-consistent part. Move it to a separate header.
> > > >
> > > > This is a pure move, except for removing a few 'extern's.
> > >
> > > Yeah, I also have something similar (but half-baked) locally, the Q I wanted to
> > > ask is why a separate header? We have already some of tracing headers. Doesn't
> > > suit well?
> >
> > Just as said in the commit message - this part is more or less
> > self-consistent and debugging-oriented. If someone needs to just
> > throw trace_printk() in their driver, they will not have to pull
> > all the heavy tracing machinery.
>
> Please, add a summary of this to it. It will be much clearer and based on it
> I agree with your judgement.
Agreed. Please update the change log stating that the tracing code in
kernel.h is only used for quick debugging purposes and is not used for
the normal tracing utilities.
-- Steve
* Re: [PATCH 3/3] tracing: move tracing declarations from kernel.h to a dedicated header
From: david laight @ 2025-11-30 23:09 UTC
To: Andy Shevchenko
Cc: Yury Norov, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Randy Dunlap, Ingo Molnar, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Petr Pavlu, Daniel Gomez,
Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
linux-trace-kernel
In-Reply-To: <aSyertuRRX9Czvyz@smile.fi.intel.com>
On Sun, 30 Nov 2025 21:44:46 +0200
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
> On Sun, Nov 30, 2025 at 01:16:19PM -0500, Yury Norov wrote:
> > On Sat, Nov 29, 2025 at 10:30:23PM +0200, Andy Shevchenko wrote:
> > > On Sat, Nov 29, 2025 at 02:53:02PM -0500, Yury Norov (NVIDIA) wrote:
> > > > Tracing is a half of the kernel.h in terms of LOCs, although it's a
> > > > self-consistent part. Move it to a separate header.
> > > >
> > > > This is a pure move, except for removing a few 'extern's.
> > >
> > > Yeah, I also have something similar (but half-baked) locally, the Q I wanted to
> > > ask is why a separate header? We have already some of tracing headers. Doesn't
> > > suit well?
> >
> > Just as said in the commit message - this part is more or less
> > self-consistent and debugging-oriented. If someone needs to just
> > throw trace_printk() in their driver, they will not have to pull
> > all the heavy tracing machinery.
>
> Please, add a summary of this to it. It will be much clearer and based on it
> I agree with your judgement.
It is worth checking whether the files get included anyway, and whether it
really makes that much difference.
Fiddling with kernel.h and extracting small 'leaf' headers from it is also
unlikely to make a big difference.
Try adding a syntax error to (say) sys/ioctl.h and see where it is included
from the first time - I suspect you'll be surprised.
Working on that include list might be more fruitful (in reducing build times).
David
* Re: [PATCH 3/3] tracing: move tracing declarations from kernel.h to a dedicated header
From: Andy Shevchenko @ 2025-12-01 2:50 UTC
To: david laight
Cc: Yury Norov, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Randy Dunlap, Ingo Molnar, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Petr Pavlu, Daniel Gomez,
Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
linux-trace-kernel
In-Reply-To: <20251130230925.376b5377@pumpkin>
On Sun, Nov 30, 2025 at 11:09:25PM +0000, david laight wrote:
> On Sun, 30 Nov 2025 21:44:46 +0200
> Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
...
> It is worth checking whether the files get included anyway, and whether it
> really makes that much difference.
>
> Fiddling with kernel.h and extracting small 'leaf' headers from it is also
> unlikely to make a big difference.
It makes a big difference for the kernel.h and its (ab)users.
Especially when we have cyclic dependencies and "include everything"
cases due to other _headers_ including kernel.h.
> Try adding a syntax error to (say) sys/ioctl.h and see where it is included
> from the first time - I suspect you'll be surprised.
> Working on that include list might be more fruitful (in reducing build times).
kernel.h elimination (in the form it exists right now) is very fruitful.
However, you may help with the (say) ioctl.h or whatever you consider
really fruitful, we all will thank you (no jokes).
--
With Best Regards,
Andy Shevchenko
* [PATCH v2] module: Only declare set_module_sig_enforced when CONFIG_MODULE_SIG=y
From: Coiby Xu @ 2025-12-01 3:06 UTC
To: linux-modules
Cc: linux-integrity, kernel test robot, Aaron Tomlin, Daniel Gomez,
Luis Chamberlain, Petr Pavlu, Daniel Gomez, Sami Tolvanen,
open list:MODULE SUPPORT
In-Reply-To: <20251031080949.2001716-1-coxu@redhat.com>
Currently, if set_module_sig_enforced is called with CONFIG_MODULE_SIG=n,
e.g. [1], it can lead to a linking error:
ld: security/integrity/ima/ima_appraise.o: in function `ima_appraise_measurement':
security/integrity/ima/ima_appraise.c:587:(.text+0xbbb): undefined reference to `set_module_sig_enforced'
This happens because the actual implementation of
set_module_sig_enforced comes from CONFIG_MODULE_SIG but both the
function declaration and the empty stub definition are tied to
CONFIG_MODULES.
So bind set_module_sig_enforced to CONFIG_MODULE_SIG instead. This
allows (future) users to call set_module_sig_enforced directly without
the "if IS_ENABLED(CONFIG_MODULE_SIG)" safeguard.
Note this issue hasn't caused a real problem because all current callers
of set_module_sig_enforced, e.g. security/integrity/ima/ima_efi.c,
use the "if IS_ENABLED(CONFIG_MODULE_SIG)" safeguard.
[1] https://lore.kernel.org/lkml/20250928030358.3873311-1-coxu@redhat.com/
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202510030029.VRKgik99-lkp@intel.com/
Reviewed-by: Aaron Tomlin <atomlin@atomlin.com>
Reviewed-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
---
v2
- improve commit message as suggested by Daniel
- add Reviewed-by tags from Aaron and Daniel
include/linux/module.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/linux/module.h b/include/linux/module.h
index e135cc79acee..fa251958b138 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -769,8 +769,6 @@ static inline bool is_livepatch_module(struct module *mod)
#endif
}
-void set_module_sig_enforced(void);
-
void module_for_each_mod(int(*func)(struct module *mod, void *data), void *data);
#else /* !CONFIG_MODULES... */
@@ -865,10 +863,6 @@ static inline bool module_requested_async_probing(struct module *module)
}
-static inline void set_module_sig_enforced(void)
-{
-}
-
/* Dereference module function descriptor */
static inline
void *dereference_module_function_descriptor(struct module *mod, void *ptr)
@@ -924,6 +918,8 @@ static inline bool retpoline_module_ok(bool has_retpoline)
#ifdef CONFIG_MODULE_SIG
bool is_module_sig_enforced(void);
+void set_module_sig_enforced(void);
+
static inline bool module_sig_ok(struct module *module)
{
return module->sig_ok;
@@ -934,6 +930,10 @@ static inline bool is_module_sig_enforced(void)
return false;
}
+static inline void set_module_sig_enforced(void)
+{
+}
+
static inline bool module_sig_ok(struct module *module)
{
return true;
base-commit: ac3fd01e4c1efce8f2c054cdeb2ddd2fc0fb150d
--
2.52.0
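
For readers following along, the declaration-versus-stub arrangement that the
patch above ends up with can be sketched in plain user-space C. This is only an
illustration under stated assumptions: CONFIG_MODULE_SIG is modelled here as an
ordinary preprocessor macro that is left undefined, and enforce_and_query() is
a hypothetical caller that does not exist in the kernel. With the option off,
both functions resolve to inline stubs, so the caller links without needing an
IS_ENABLED() guard.

```c
#include <stdbool.h>

/*
 * Stand-in for the Kconfig symbol (assumption: a plain macro).
 * Leave it undefined to model CONFIG_MODULE_SIG=n.
 */
/* #define CONFIG_MODULE_SIG 1 */

#ifdef CONFIG_MODULE_SIG
/* Real implementations would come from an object built only when the option is on. */
bool is_module_sig_enforced(void);
void set_module_sig_enforced(void);
#else
/* Stubs: declaration and definition are tied to the same option as the implementation. */
static inline bool is_module_sig_enforced(void)
{
	return false;
}

static inline void set_module_sig_enforced(void)
{
}
#endif

/* Hypothetical caller: no guard needed, because a definition always exists. */
static inline bool enforce_and_query(void)
{
	set_module_sig_enforced();
	return is_module_sig_enforced();
}
```

If the stub were instead keyed to a different, broader option than the real
implementation, a configuration with the broader option on and the narrower
one off would see only the bare declaration, which is the undefined-reference
case described in the commit message.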
* Re: [PATCH v8 04/23] slab: add sheaf support for batching kfree_rcu() operations
From: Harry Yoo @ 2025-12-01 6:55 UTC
To: Jon Hunter
Cc: Daniel Gomez, Vlastimil Babka, Suren Baghdasaryan,
Liam R. Howlett, Christoph Lameter, David Rientjes,
Roman Gushchin, Uladzislau Rezki, Sidhartha Kumar, linux-mm,
linux-kernel, rcu, maple-tree, linux-modules, Luis Chamberlain,
Petr Pavlu, Sami Tolvanen, Aaron Tomlin, Lucas De Marchi,
linux-tegra@vger.kernel.org
In-Reply-To: <ad9c6f4d-42de-4251-ab10-579feec7e8d1@nvidia.com>
On Fri, Nov 28, 2025 at 08:57:28AM +0000, Jon Hunter wrote:
>
> On 27/11/2025 12:48, Harry Yoo wrote:
>
> ...
>
> > > > I have been looking into a regression for Linux v6.18-rc where time taken to
> > > > run some internal graphics tests on our Tegra234 device has increased by
> > > > around 35%, causing the tests to time out. Bisect is pointing to this commit
> > > > and I also see we have CONFIG_KVFREE_RCU_BATCHED=y.
> > >
> > > Thanks for reporting! Uh, this has been put aside while I was busy working
> > > on other stuff... but now that we have two people complaining about this,
> > > I'll allocate some time to investigate and improve it.
> > >
> > > It'll take some time though :)
> >
> > By the way, how many CPUs do you have on your system, and does your
> > kernel have CONFIG_CODE_TAGGING enabled?
>
> For this device there are 12 CPUs. I don't see CONFIG_CODE_TAGGING enabled.
Thanks! Then it's probably due to kmem_cache_destroy().
Please let me know if this patch improves your test execution time.
https://lore.kernel.org/linux-mm/20251128113740.90129-1-harry.yoo@oracle.com/
--
Cheers,
Harry / Hyeonggon
* Re: [PATCH v17 44/47] dept: introduce APIs to set page usage and use subclasses_evt for the usage
From: Byungchul Park @ 2025-12-01 7:18 UTC
To: Matthew Wilcox
Cc: linux-kernel, kernel_team, torvalds, damien.lemoal, linux-ide,
adilger.kernel, linux-ext4, mingo, peterz, will, tglx, rostedt,
joel, sashal, daniel.vetter, duyuyang, johannes.berg, tj, tytso,
david, amir73il, gregkh, kernel-team, linux-mm, akpm, mhocko,
minchan, hannes, vdavydov.dev, sj, jglisse, dennis, cl, penberg,
rientjes, vbabka, ngupta, linux-block, josef, linux-fsdevel, jack,
jlayton, dan.j.williams, hch, djwong, dri-devel,
rodrigosiqueiramelo, melissa.srw, hamohammed.sa, harry.yoo,
chris.p.wilson, gwan-gyeong.mun, max.byungchul.park, boqun.feng,
longman, yunseong.kim, ysk, yeoreum.yun, netdev, matthew.brost,
her0gyugyu, corbet, catalin.marinas, bp, dave.hansen, x86, hpa,
luto, sumit.semwal, gustavo, christian.koenig, andi.shyti, arnd,
lorenzo.stoakes, Liam.Howlett, rppt, surenb, mcgrof, petr.pavlu,
da.gomez, samitolvanen, paulmck, frederic, neeraj.upadhyay,
joelagnelf, josh, urezki, mathieu.desnoyers, jiangshanlai,
qiang.zhang, juri.lelli, vincent.guittot, dietmar.eggemann,
bsegall, mgorman, vschneid, chuck.lever, neil, okorniev, Dai.Ngo,
tom, trondmy, anna, kees, bigeasy, clrkwllms, mark.rutland,
ada.coupriediaz, kristina.martsenko, wangkefeng.wang, broonie,
kevin.brodsky, dwmw, shakeel.butt, ast, ziy, yuzhao, baolin.wang,
usamaarif642, joel.granados, richard.weiyang, geert+renesas,
tim.c.chen, linux, alexander.shishkin, lillian, chenhuacai,
francesco, guoweikang.kernel, link, jpoimboe, masahiroy, brauner,
thomas.weissschuh, oleg, mjguzik, andrii, wangfushuai, linux-doc,
linux-arm-kernel, linux-media, linaro-mm-sig, linux-i2c,
linux-arch, linux-modules, rcu, linux-nfs, linux-rt-devel
In-Reply-To: <aR3WHf9QZ_dizNun@casper.infradead.org>
On Wed, Nov 19, 2025 at 02:37:17PM +0000, Matthew Wilcox wrote:
> On Wed, Nov 19, 2025 at 07:53:12PM +0900, Byungchul Park wrote:
> > On Thu, Oct 02, 2025 at 05:12:44PM +0900, Byungchul Park wrote:
> > > False positive reports have been observed since dept works with the
> > > assumption that all the pages have the same dept class, but the class
> > > should be split since the problematic call paths are different depending
> > > on what the page is used for.
> > >
> > > At least, ones in block device's address_space and ones in regular
> > > file's address_space have exclusively different usages.
> > >
> > > Thus, define usage candidates like:
> > >
> > > DEPT_PAGE_REGFILE_CACHE /* page in regular file's address_space */
> > > DEPT_PAGE_BDEV_CACHE /* page in block device's address_space */
> > > DEPT_PAGE_DEFAULT /* the others */
> >
> > 1. I'd like to annotate a page to DEPT_PAGE_REGFILE_CACHE when the page
> > starts to be associated with a page cache for fs data.
> >
> > 2. And I'd like to annotate a page to DEPT_PAGE_BDEV_CACHE when the page
> > starts to be associated with meta data of fs e.g. super block.
> >
> > 3. Lastly, I'd like to reset the annotated value if any, that has been
> > set in the page, when the page ends the association with either page
> > cache or meta block of fs e.g. freeing the page.
> >
> > Can anyone suggest good places in code for the annotation 1, 2, 3? It'd
> > be totally appreciated. :-)
>
> I don't think it makes sense to track lock state in the page (nor
> folio). Partly because there's just so many of them, but also because
> the locking rules don't really apply to individual folios so much as
> they do to the mappings (or anon_vmas) that contain folios.
I've been trying to fully understand what you meant but maybe failed.
FWIW, dept is working based on classification, not instance by instance,
that is similar to lockdep. This patch is for resolving issues that
might come from the fact that there is a **single class** for PG_locked,
by splitting the class to several ones according to their usages.
> If you're looking to find deadlock scenarios, I think it makes more
> sense to track all folio locks in a given mapping as the same lock
> type rather than track each folio's lock status.
>
> For example, let's suppose we did something like this in the
> page fault path:
>
> Look up and lock a folio (we need folios locked to insert them into
> the page tables to avoid a race with truncate)
> Try to allocate a page table
> Go into reclaim, attempt to reclaim a folio from this mapping
I think you are talking about nested lock patterns involving PG_locked.
Even though dept can do much more than just track nested lock
patterns within a single context, nested lock patterns
involving PG_locked should of course be handled appropriately, perhaps using
the useful information you gave. When I work on handling nested locks,
especially those involving PG_locked, I will get back to you. Thanks.
However, I have no choice but to keep this approach for the **single
class** issue. Feel free to ask if you have any questions.
Byungchul
> We ought to detect that as a potential deadlock, regardless of which
> folio in the mapping we attempt to reclaim. So can we track folio
> locking at the mapping/anon_vma level instead?
>
> ---
>
> My current understanding of folio locking rules:
>
> If you hold a lock on folio A, you can take a lock on folio B if:
>
> 1. A->mapping == B->mapping and A->index < B->index
> (for example writeback; we take locks on all folios to be written
> back in order)
> 2. !S_ISBLK(A->mapping->host) and S_ISBLK(B->mapping->host)
> 3. S_ISREG(A->mapping->host) and S_ISREG(B->mapping->host) with
> inode_lock() held on both and A->index < B->index
> (the remap_range code)
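
As a reading aid, the three ordering rules quoted above can be written down as
a single predicate. This is a rough user-space sketch, not kernel code: the
mock_mapping and mock_folio types, the host_type enum, the inode_locked flag
and may_lock_second() are all hypothetical names invented for illustration,
standing in for A->mapping->host, S_ISBLK()/S_ISREG(), inode_lock() and the
folio index.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the S_ISREG()/S_ISBLK() checks on mapping->host. */
enum host_type { HOST_REG, HOST_BLK, HOST_OTHER };

/* Hypothetical mock of an address_space; inode_locked models inode_lock() being held. */
struct mock_mapping {
	enum host_type host;
	bool inode_locked;
};

/* Hypothetical mock of a folio: only the fields the rules mention. */
struct mock_folio {
	const struct mock_mapping *mapping;
	unsigned long index;
};

/* Returns true if, while holding the lock on a, locking b matches one of the rules. */
static bool may_lock_second(const struct mock_folio *a, const struct mock_folio *b)
{
	/* Rule 1: same mapping, ascending index (e.g. writeback). */
	if (a->mapping == b->mapping && a->index < b->index)
		return true;

	/* Rule 2: a is not in a block device mapping, b is. */
	if (a->mapping->host != HOST_BLK && b->mapping->host == HOST_BLK)
		return true;

	/* Rule 3: both regular files, both inodes locked, ascending index. */
	if (a->mapping->host == HOST_REG && b->mapping->host == HOST_REG &&
	    a->mapping->inode_locked && b->mapping->inode_locked &&
	    a->index < b->index)
		return true;

	return false;
}
```

Note that every branch depends only on properties of the mappings and on the
relative index, not on the identity of an individual folio.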
* Re: [PATCH 1/3] kernel.h: drop STACK_MAGIC macro
From: Jani Nikula @ 2025-12-01 7:46 UTC
To: Yury Norov (NVIDIA), Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, Andy Shevchenko, Randy Dunlap, Ingo Molnar,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Petr Pavlu,
Daniel Gomez, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, Andrew Morton, linux-kernel, intel-gfx,
dri-devel, linux-modules, linux-trace-kernel
Cc: Yury Norov (NVIDIA)
In-Reply-To: <20251129195304.204082-2-yury.norov@gmail.com>
On Sat, 29 Nov 2025, "Yury Norov (NVIDIA)" <yury.norov@gmail.com> wrote:
> The macro is only used by i915. Move it to a local header and drop from
> the kernel.h.
>
> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> ---
> drivers/gpu/drm/i915/i915_utils.h | 2 ++
> include/linux/kernel.h | 2 --
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
> index a0c892e4c40d..6c197e968305 100644
> --- a/drivers/gpu/drm/i915/i915_utils.h
> +++ b/drivers/gpu/drm/i915/i915_utils.h
i915_utils.h is on a diet itself. STACK_MAGIC is only used in selftests,
please put this in i915_selftest.h.
I guess you also need to include that from gt/selftest_ring_submission.c,
the only file that uses STACK_MAGIC but doesn't include i915_selftest.h.
BR,
Jani.
> @@ -32,6 +32,8 @@
> #include <linux/workqueue.h>
> #include <linux/sched/clock.h>
>
> +#define STACK_MAGIC 0xdeadbeef
> +
> #ifdef CONFIG_X86
> #include <asm/hypervisor.h>
> #endif
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 5b46924fdff5..61d63c57bc2d 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -40,8 +40,6 @@
>
> #include <uapi/linux/kernel.h>
>
> -#define STACK_MAGIC 0xdeadbeef
> -
> struct completion;
> struct user;
--
Jani Nikula, Intel
* Re: [PATCH 1/3] kernel.h: drop STACK_MAGIC macro
From: Christophe Leroy (CS GROUP) @ 2025-12-01 9:38 UTC
To: Yury Norov (NVIDIA), Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, Andy Shevchenko, Randy Dunlap, Ingo Molnar,
Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
Petr Pavlu, Daniel Gomez, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, Andrew Morton, linux-kernel, intel-gfx,
dri-devel, linux-modules, linux-trace-kernel
In-Reply-To: <20251129195304.204082-2-yury.norov@gmail.com>
On 29/11/2025 at 20:53, Yury Norov (NVIDIA) wrote:
> The macro is only used by i915. Move it to a local header and drop from
> the kernel.h.
At the beginning of the git history we have:
$ git grep STACK_MAGIC 1da177e4c3f41
1da177e4c3f41:arch/h8300/kernel/traps.c: if (STACK_MAGIC != *(unsigned long *)((unsigned long)current+PAGE_SIZE))
1da177e4c3f41:arch/m68k/mac/macints.c: if (STACK_MAGIC != *(unsigned long *)current->kernel_stack_page)
1da177e4c3f41:include/linux/kernel.h:#define STACK_MAGIC 0xdeadbeef
Would be good to know the history of its usage over time.
I see:
- Removed from m68k by 3cd53b14e7c4 ("m68k/mac: Improve NMI handler")
- Removed from h8300 by 1c4b5ecb7ea1 ("remove the h8300 architecture")
- Started being used in i915 selftest by 250f8c8140ac ("drm/i915/gtt:
Read-only pages for insert_entries on bdw+")
Christophe
>
> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> ---
> drivers/gpu/drm/i915/i915_utils.h | 2 ++
> include/linux/kernel.h | 2 --
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
> index a0c892e4c40d..6c197e968305 100644
> --- a/drivers/gpu/drm/i915/i915_utils.h
> +++ b/drivers/gpu/drm/i915/i915_utils.h
> @@ -32,6 +32,8 @@
> #include <linux/workqueue.h>
> #include <linux/sched/clock.h>
>
> +#define STACK_MAGIC 0xdeadbeef
> +
> #ifdef CONFIG_X86
> #include <asm/hypervisor.h>
> #endif
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 5b46924fdff5..61d63c57bc2d 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -40,8 +40,6 @@
>
> #include <uapi/linux/kernel.h>
>
> -#define STACK_MAGIC 0xdeadbeef
> -
> struct completion;
> struct user;
>
* Re: [PATCH 3/3] tracing: move tracing declarations from kernel.h to a dedicated header
From: david laight @ 2025-12-01 10:16 UTC
To: Andy Shevchenko
Cc: Yury Norov, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Randy Dunlap, Ingo Molnar, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Petr Pavlu, Daniel Gomez,
Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
linux-trace-kernel
In-Reply-To: <aS0CgnvRfQtam0uU@smile.fi.intel.com>
On Mon, 1 Dec 2025 04:50:42 +0200
Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
> On Sun, Nov 30, 2025 at 11:09:25PM +0000, david laight wrote:
> > On Sun, 30 Nov 2025 21:44:46 +0200
> > Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
>
> ...
> kernel.h elimination (in the form it exists right now) is very fruitful.
> However, you may help with the (say) ioctl.h or whatever you consider
> really fruitful, we all will thank you (no jokes).
>
This is the first #include path for ioctl.h
In file included from ../include/asm-generic/ioctl.h:5,
from ./arch/x86/include/generated/uapi/asm/ioctl.h:1,
from ../include/uapi/linux/ioctl.h:5,
from ../include/uapi/linux/random.h:12,
from ../include/linux/random.h:10,
from ../include/linux/nodemask.h:94,
from ../include/linux/numa.h:6,
from ../include/linux/cpumask.h:17,
from ../arch/x86/include/asm/paravirt.h:21,
from ../arch/x86/include/asm/irqflags.h:102,
from ../include/linux/irqflags.h:18,
from ../include/linux/spinlock.h:59,
from ../include/linux/swait.h:7,
from ../include/linux/completion.h:12,
from ../include/linux/crypto.h:15,
from ../arch/x86/kernel/asm-offsets.c:9:
Get past that and sched.h => processor.h => cpuid/api.h also
gets you to paravirt.h.
I suspect a lot of headers get pulled in like that.
David
* Re: [PATCH 3/3] tracing: move tracing declarations from kernel.h to a dedicated header
From: Andy Shevchenko @ 2025-12-01 15:37 UTC
To: david laight
Cc: Yury Norov, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Randy Dunlap, Ingo Molnar, Jani Nikula, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Petr Pavlu, Daniel Gomez,
Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
linux-trace-kernel
In-Reply-To: <20251201101658.0b5ab68e@pumpkin>
On Mon, Dec 01, 2025 at 10:16:58AM +0000, david laight wrote:
> On Mon, 1 Dec 2025 04:50:42 +0200
> Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
> > On Sun, Nov 30, 2025 at 11:09:25PM +0000, david laight wrote:
> > > On Sun, 30 Nov 2025 21:44:46 +0200
> > > Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
...
> > kernel.h elimination (in the form it exists right now) is very fruitful.
> > However, you may help with the (say) ioctl.h or whatever you consider
> > really fruitful, we all will thank you (no jokes).
> >
>
> This is the first #include path for ioctl.h
>
> In file included from ../include/asm-generic/ioctl.h:5,
> from ./arch/x86/include/generated/uapi/asm/ioctl.h:1,
> from ../include/uapi/linux/ioctl.h:5,
> from ../include/uapi/linux/random.h:12,
> from ../include/linux/random.h:10,
> from ../include/linux/nodemask.h:94,
> from ../include/linux/numa.h:6,
> from ../include/linux/cpumask.h:17,
> from ../arch/x86/include/asm/paravirt.h:21,
> from ../arch/x86/include/asm/irqflags.h:102,
> from ../include/linux/irqflags.h:18,
> from ../include/linux/spinlock.h:59,
> from ../include/linux/swait.h:7,
> from ../include/linux/completion.h:12,
> from ../include/linux/crypto.h:15,
> from ../arch/x86/kernel/asm-offsets.c:9:
>
> Get past that and sched.h => processor.h => cpuid/api.h also
> gets you to paravirt.h.
> I suspect a lot of headers get pulled in like that.
And there are several headers like ioctl.h that "pull in half of everything";
device.h, for example.
So, you can start untangling them piece by piece.
Not sure how relevant [1] still is right now, but I believe plenty of those
patches can still be used.
[1]: https://lwn.net/ml/linux-kernel/YdIfz+LMewetSaEB@gmail.com/
--
With Best Regards,
Andy Shevchenko
* Re: [PATCH 2/3] kernel.h: move VERIFY_OCTAL_PERMISSIONS() to sysfs.h
From: Petr Pavlu @ 2025-12-01 19:01 UTC
To: Yury Norov (NVIDIA)
Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Andy Shevchenko, Randy Dunlap, Ingo Molnar, Jani Nikula,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Daniel Gomez,
Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
linux-trace-kernel
In-Reply-To: <20251129195304.204082-3-yury.norov@gmail.com>
On 11/29/25 8:53 PM, Yury Norov (NVIDIA) wrote:
> The macro is related to sysfs, but is defined in kernel.h. Move it to
> the proper header, and unload the generic kernel.h.
>
> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> ---
> include/linux/kernel.h | 12 ------------
> include/linux/moduleparam.h | 2 +-
> include/linux/sysfs.h | 13 +++++++++++++
> 3 files changed, 14 insertions(+), 13 deletions(-)
>
> [...]
> diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
> index 6907aedc4f74..4e390a84a8bc 100644
> --- a/include/linux/moduleparam.h
> +++ b/include/linux/moduleparam.h
> @@ -4,7 +4,7 @@
> /* (C) Copyright 2001, 2002 Rusty Russell IBM Corporation */
> #include <linux/init.h>
> #include <linux/stringify.h>
> -#include <linux/kernel.h>
> +#include <linux/sysfs.h>
If you are removing the kernel.h include from
include/linux/moduleparam.h, I think it would be good to update the file
to ensure that all necessary includes are now listed directly.
The following items are present in moduleparam.h:
* __UNIQUE_ID(), __used(), __section(), __aligned(), __always_unused()
-> linux/compiler.h,
* THIS_MODULE -> linux/init.h,
* __stringify() -> linux/stringify.h,
* u8, s8, u16, ... -> linux/types.h,
* static_assert() -> linux/build_bug.h,
* VERIFY_OCTAL_PERMISSIONS() -> linux/sysfs.h,
* ARRAY_SIZE() -> linux/array_size.h.
I suggest then updating the includes in include/linux/moduleparam.h to:
#include <linux/array_size.h>
#include <linux/build_bug.h>
#include <linux/compiler.h>
#include <linux/init.h>
#include <linux/stringify.h>
#include <linux/sysfs.h>
#include <linux/types.h>
--
Thanks,
Petr
* Re: [PATCH 2/3] kernel.h: move VERIFY_OCTAL_PERMISSIONS() to sysfs.h
From: Andy Shevchenko @ 2025-12-01 19:20 UTC
To: Petr Pavlu
Cc: Yury Norov (NVIDIA), Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, Randy Dunlap, Ingo Molnar, Jani Nikula,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Daniel Gomez,
Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
linux-trace-kernel
In-Reply-To: <c45058d5-d690-4731-85d1-434971c16f92@suse.com>
On Mon, Dec 01, 2025 at 08:01:23PM +0100, Petr Pavlu wrote:
> On 11/29/25 8:53 PM, Yury Norov (NVIDIA) wrote:
...
> > -#include <linux/kernel.h>
> > +#include <linux/sysfs.h>
>
> If you are removing the kernel.h include from
> include/linux/moduleparam.h, I think it would be good to update the file
> to ensure that all necessary includes are now listed directly.
>
> The following items are present in moduleparam.h:
>
> * __UNIQUE_ID(), __used(), __section(), __aligned(), __always_unused()
> -> linux/compiler.h,
> * THIS_MODULE -> linux/init.h,
> * __stringify() -> linux/stringify.h,
> * u8, s8, u16, ... -> linux/types.h,
> * static_assert() -> linux/build_bug.h,
> * VERIFY_OCTAL_PERMISSIONS() -> linux/sysfs.h,
> * ARRAY_SIZE() -> linux/array_size.h.
>
> I suggest then updating the includes in include/linux/moduleparam.h to:
>
> #include <linux/array_size.h>
> #include <linux/build_bug.h>
> #include <linux/compiler.h>
> #include <linux/init.h>
> #include <linux/stringify.h>
> #include <linux/sysfs.h>
> #include <linux/types.h>
Good point. And since we are not adding any top-level ones, this shouldn't
make things worse (in terms of potential cyclic dependencies).
--
With Best Regards,
Andy Shevchenko
* Re: [PATCH 2/3] kernel.h: move VERIFY_OCTAL_PERMISSIONS() to sysfs.h
From: Randy Dunlap @ 2025-12-01 19:51 UTC
To: Andy Shevchenko, Yury Norov
Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, Ingo Molnar,
Jani Nikula, Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin,
Petr Pavlu, Daniel Gomez, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, Andrew Morton, linux-kernel, intel-gfx,
dri-devel, linux-modules, linux-trace-kernel
In-Reply-To: <aSydSI-h3KZiYBn6@smile.fi.intel.com>
On 11/30/25 11:38 AM, Andy Shevchenko wrote:
> On Sun, Nov 30, 2025 at 12:42:35PM -0500, Yury Norov wrote:
>> This series was tested by 0-day and LKP. 0-day runs allyesconfig,
>
> AFAICS in the below no configuration had been tested against allYESconfig.
> All of them are allNOconfig.
>
>> as far as I know. It only sends email in case of errors. LKP is OK, find the
>> report below.
>
>> All but XFS include it via linux/module.h -> linux/moduleparam.h path.
>> XFS has a linkage layer: xfs.h -> xfs_linux.h-> linux/module.h, so
>> it's pretty much the same.
>>
>> I think, module.h inclusion path is OK for this macro and definitely
>> better than kernel.h. Notice, none of them, except for vgpu_dbg,
>> include kernel.h directly.
>
> Ideally those (especially and in the first place headers) should follow IWYU
> principle and avoid indirect (non-guaranteed) inclusions.
Can you (or anyone) get IWYU (software) to work?
I tried it a few months ago but didn't have the correct magic
incantation for it.
(no specifics at the moment)
--
~Randy
^ permalink raw reply
* Re: [PATCH 2/3] kernel.h: move VERIFY_OCTAL_PERMISSIONS() to sysfs.h
From: Andy Shevchenko @ 2025-12-01 20:00 UTC (permalink / raw)
To: Randy Dunlap, Jonathan Cameron
Cc: Yury Norov, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Ingo Molnar, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Petr Pavlu, Daniel Gomez, Greg Kroah-Hartman,
Rafael J. Wysocki, Danilo Krummrich, Andrew Morton, linux-kernel,
intel-gfx, dri-devel, linux-modules, linux-trace-kernel
In-Reply-To: <fd755bbf-50a8-46f7-bff1-61cc625118a9@infradead.org>
On Mon, Dec 01, 2025 at 11:51:24AM -0800, Randy Dunlap wrote:
> On 11/30/25 11:38 AM, Andy Shevchenko wrote:
> > On Sun, Nov 30, 2025 at 12:42:35PM -0500, Yury Norov wrote:
>
> >> This series was tested by 0-day and LKP. 0-day runs allyesconfig,
> >
> > AFAICS in the below no configuration had been tested against allYESconfig.
> > All of them are allNOconfig.
> >
> >> as far as I know. It only sends email in case of errors. LKP is OK, find the
> >> report below.
> >
> >> All but XFS include it via linux/module.h -> linux/moduleparam.h path.
> >> XFS has a linkage layer: xfs.h -> xfs_linux.h-> linux/module.h, so
> >> it's pretty much the same.
> >>
> >> I think, module.h inclusion path is OK for this macro and definitely
> >> better than kernel.h. Notice, none of them, except for vgpu_dbg,
> >> include kernel.h directly.
> >
> > Ideally those (especially and in the first place headers) should follow IWYU
> > principle and avoid indirect (non-guaranteed) inclusions.
>
> Can you (or anyone) get IWYU (software) to work?
> I tried it a few months ago but didn't have the correct magic
> incantation for it.
> (no specifics at the moment)
You should talk to Jonathan Cameron (Cc'ed), he was able to run it to some
extent. AFAIR the state of affairs is that it reports a lot of low-level
headers that we should not really dig into (at least for now). That means a
carefully crafted map of guarantees needs to be provided (e.g., if we include
bitmap.h, then bitops.h and/or bits.h are guaranteed, so there is no need to
include them).
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply
* Re: [PATCH 1/3] kernel.h: drop STACK_MAGIC macro
From: Yury Norov @ 2025-12-02 2:50 UTC (permalink / raw)
To: Christophe Leroy (CS GROUP)
Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Andy Shevchenko, Randy Dunlap, Ingo Molnar, Jani Nikula,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Petr Pavlu,
Daniel Gomez, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, Andrew Morton, linux-kernel, intel-gfx,
dri-devel, linux-modules, linux-trace-kernel
In-Reply-To: <3e7ddbea-978f-44f7-abdd-7319908fd83c@kernel.org>
On Mon, Dec 01, 2025 at 10:38:01AM +0100, Christophe Leroy (CS GROUP) wrote:
>
>
> Le 29/11/2025 à 20:53, Yury Norov (NVIDIA) a écrit :
> > The macro is only used by i915. Move it to a local header and drop from
> > the kernel.h.
>
> At the begining of the git history we have:
>
> $ git grep STACK_MAGIC 1da177e4c3f41
> 1da177e4c3f41:arch/h8300/kernel/traps.c: if (STACK_MAGIC !=
> *(unsigned long *)((unsigned long)current+PAGE_SIZE))
> 1da177e4c3f41:arch/m68k/mac/macints.c: if (STACK_MAGIC !=
> *(unsigned long *)current->kernel_stack_page)
> 1da177e4c3f41:include/linux/kernel.h:#define STACK_MAGIC 0xdeadbeef
>
> Would be good to know the history of its usage over time.
>
> I see:
> - Removed from m68k by 3cd53b14e7c4 ("m68k/mac: Improve NMI handler")
> - Removed from h8300 by 1c4b5ecb7ea1 ("remove the h8300 architecture")
> - Started being used in i915 selftest by 250f8c8140ac ("drm/i915/gtt:
> Read-only pages for insert_entries on bdw+")
STACK_MAGIC was added in 1994 in 1.0.2. It was indeed used in a couple
of places in core subsystems back then to detect stack corruption. But
since then people have invented better ways to guard stacks.
You can check commit 4914d770dec4 in this project:
https://archive.org/details/git-history-of-linux
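For readers unfamiliar with the technique, here is a minimal userspace sketch
of how a magic word can detect an overrun. Only the STACK_MAGIC value comes
from the quoted kernel.h; struct fake_stack, fake_stack_init() and
fake_stack_intact() are hypothetical names made up for illustration and are
not kernel API.

```c
#include <stdbool.h>
#include <string.h>

#define STACK_MAGIC      0xdeadbeefUL /* value from the quoted kernel.h */
#define FAKE_STACK_WORDS 64           /* hypothetical size, illustration only */

/* Hypothetical model: a guard word sits right after the usable area. */
struct fake_stack {
	unsigned long words[FAKE_STACK_WORDS];
	unsigned long guard;
};

/* Clear the usable area and plant the magic word in the guard slot. */
static void fake_stack_init(struct fake_stack *st)
{
	memset(st->words, 0, sizeof(st->words));
	st->guard = STACK_MAGIC;
}

/* An overrun that wrote past words[] would have clobbered the guard. */
static bool fake_stack_intact(const struct fake_stack *st)
{
	return st->guard == STACK_MAGIC;
}
```

The check is cheap but only catches corruption after the fact, and only if the
guard word itself was overwritten.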
^ permalink raw reply
* Re: [PATCH 1/3] kernel.h: drop STACK_MAGIC macro
From: Andy Shevchenko @ 2025-12-02 7:37 UTC (permalink / raw)
To: Yury Norov
Cc: Christophe Leroy (CS GROUP), Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, Randy Dunlap, Ingo Molnar, Jani Nikula,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Petr Pavlu,
Daniel Gomez, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, Andrew Morton, linux-kernel, intel-gfx,
dri-devel, linux-modules, linux-trace-kernel
In-Reply-To: <aS5T9-1z7PK32q9R@yury>
On Mon, Dec 01, 2025 at 09:50:31PM -0500, Yury Norov wrote:
> On Mon, Dec 01, 2025 at 10:38:01AM +0100, Christophe Leroy (CS GROUP) wrote:
> > Le 29/11/2025 à 20:53, Yury Norov (NVIDIA) a écrit :
...
> You can check commit 4914d770dec4 in this project:
>
> https://archive.org/details/git-history-of-linux
Side note: we also have the history/history.git tree on kernel.org, which is
handy if anyone wants to check.
Each of the history trees has its own pros and cons:
https://stackoverflow.com/a/51901211/2511795
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply
* Re: [PATCH V1] mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction
From: Jon Hunter @ 2025-12-02 9:29 UTC (permalink / raw)
To: Harry Yoo, surenb
Cc: Liam.Howlett, atomlin, bpf, cl, da.gomez, linux-kernel, linux-mm,
linux-modules, lucas.demarchi, maple-tree, mcgrof, petr.pavlu,
rcu, rientjes, roman.gushchin, samitolvanen, sidhartha.kumar,
urezki, vbabka, linux-tegra@vger.kernel.org
In-Reply-To: <20251128113740.90129-1-harry.yoo@oracle.com>
On 28/11/2025 11:37, Harry Yoo wrote:
> Currently, kvfree_rcu_barrier() flushes RCU sheaves across all slab
> caches when a cache is destroyed. This is unnecessary when destroying
> a slab cache; only the RCU sheaves belonging to the cache being destroyed
> need to be flushed.
>
> As suggested by Vlastimil Babka, introduce a weaker form of
> kvfree_rcu_barrier() that operates on a specific slab cache and call it
> on cache destruction.
>
> The performance benefit is evaluated on a 12 core 24 threads AMD Ryzen
> 5900X machine (1 socket), by loading slub_kunit module.
>
> Before:
> Total calls: 19
> Average latency (us): 8529
> Total time (us): 162069
>
> After:
> Total calls: 19
> Average latency (us): 3804
> Total time (us): 72287
>
> Link: https://lore.kernel.org/linux-mm/0406562e-2066-4cf8-9902-b2b0616dd742@kernel.org
> Link: https://lore.kernel.org/linux-mm/e988eff6-1287-425e-a06c-805af5bbf262@nvidia.com
> Link: https://lore.kernel.org/linux-mm/1bda09da-93be-4737-aef0-d47f8c5c9301@suse.cz
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---
Thanks for the rapid fix. I have been testing this and can confirm that
this does fix the performance regression I was seeing.
BTW shouldn't we add a 'Fixes:' tag above? I would like to ensure that
this gets picked up for v6.18 stable.
Otherwise ...
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Thanks!
Jon
--
nvpublic
^ permalink raw reply
* [PATCH V2] mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction
From: Harry Yoo @ 2025-12-02 10:16 UTC (permalink / raw)
To: vbabka
Cc: surenb, Liam.Howlett, cl, rientjes, roman.gushchin, harry.yoo,
urezki, sidhartha.kumar, linux-mm, linux-kernel, rcu, maple-tree,
linux-modules, mcgrof, petr.pavlu, samitolvanen, atomlin,
lucas.demarchi, akpm, jonathanh, stable, Daniel Gomez
Currently, kvfree_rcu_barrier() flushes RCU sheaves across all slab
caches when a cache is destroyed. This is unnecessary; only the RCU
sheaves belonging to the cache being destroyed need to be flushed.
As suggested by Vlastimil Babka, introduce a weaker form of
kvfree_rcu_barrier() that operates on a specific slab cache.
Factor out flush_rcu_sheaves_on_cache() from flush_all_rcu_sheaves() and
call it from flush_all_rcu_sheaves() and kvfree_rcu_barrier_on_cache().
Call kvfree_rcu_barrier_on_cache() instead of kvfree_rcu_barrier() on
cache destruction.
The performance benefit is evaluated on a 12 core 24 threads AMD Ryzen
5900X machine (1 socket), by loading slub_kunit module.
Before:
Total calls: 19
Average latency (us): 18127
Total time (us): 344414
After:
Total calls: 19
Average latency (us): 10066
Total time (us): 191264
Two performance regressions have been reported:
- stress module loader test's runtime increases by 50-60% (Daniel)
- internal graphics test's runtime on Tegra23 increases by 35% (Jon)
They are fixed by this change.
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: ec66e0d59952 ("slab: add sheaf support for batching kfree_rcu() operations")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/linux-mm/1bda09da-93be-4737-aef0-d47f8c5c9301@suse.cz
Reported-and-tested-by: Daniel Gomez <da.gomez@samsung.com>
Closes: https://lore.kernel.org/linux-mm/0406562e-2066-4cf8-9902-b2b0616dd742@kernel.org
Reported-and-tested-by: Jon Hunter <jonathanh@nvidia.com>
Closes: https://lore.kernel.org/linux-mm/e988eff6-1287-425e-a06c-805af5bbf262@nvidia.com
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---
No code change, added proper tags and updated changelog.
include/linux/slab.h | 5 ++++
mm/slab.h | 1 +
mm/slab_common.c | 52 +++++++++++++++++++++++++++++------------
mm/slub.c | 55 ++++++++++++++++++++++++--------------------
4 files changed, 73 insertions(+), 40 deletions(-)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index cf443f064a66..937c93d44e8c 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -1149,6 +1149,10 @@ static inline void kvfree_rcu_barrier(void)
{
rcu_barrier();
}
+static inline void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
+{
+ rcu_barrier();
+}
static inline void kfree_rcu_scheduler_running(void) { }
#else
@@ -1156,6 +1160,7 @@ void kvfree_rcu_barrier(void);
void kfree_rcu_scheduler_running(void);
#endif
+void kvfree_rcu_barrier_on_cache(struct kmem_cache *s);
/**
* kmalloc_size_roundup - Report allocation bucket size for the given size
diff --git a/mm/slab.h b/mm/slab.h
index f730e012553c..e767aa7e91b0 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -422,6 +422,7 @@ static inline bool is_kmalloc_normal(struct kmem_cache *s)
bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj);
void flush_all_rcu_sheaves(void);
+void flush_rcu_sheaves_on_cache(struct kmem_cache *s);
#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
SLAB_CACHE_DMA32 | SLAB_PANIC | \
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 84dfff4f7b1f..dd8a49d6f9cc 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -492,7 +492,7 @@ void kmem_cache_destroy(struct kmem_cache *s)
return;
/* in-flight kfree_rcu()'s may include objects from our cache */
- kvfree_rcu_barrier();
+ kvfree_rcu_barrier_on_cache(s);
if (IS_ENABLED(CONFIG_SLUB_RCU_DEBUG) &&
(s->flags & SLAB_TYPESAFE_BY_RCU)) {
@@ -2038,25 +2038,13 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr)
}
EXPORT_SYMBOL_GPL(kvfree_call_rcu);
-/**
- * kvfree_rcu_barrier - Wait until all in-flight kvfree_rcu() complete.
- *
- * Note that a single argument of kvfree_rcu() call has a slow path that
- * triggers synchronize_rcu() following by freeing a pointer. It is done
- * before the return from the function. Therefore for any single-argument
- * call that will result in a kfree() to a cache that is to be destroyed
- * during module exit, it is developer's responsibility to ensure that all
- * such calls have returned before the call to kmem_cache_destroy().
- */
-void kvfree_rcu_barrier(void)
+static inline void __kvfree_rcu_barrier(void)
{
struct kfree_rcu_cpu_work *krwp;
struct kfree_rcu_cpu *krcp;
bool queued;
int i, cpu;
- flush_all_rcu_sheaves();
-
/*
* Firstly we detach objects and queue them over an RCU-batch
* for all CPUs. Finally queued works are flushed for each CPU.
@@ -2118,8 +2106,43 @@ void kvfree_rcu_barrier(void)
}
}
}
+
+/**
+ * kvfree_rcu_barrier - Wait until all in-flight kvfree_rcu() complete.
+ *
+ * Note that a single argument of kvfree_rcu() call has a slow path that
+ * triggers synchronize_rcu() following by freeing a pointer. It is done
+ * before the return from the function. Therefore for any single-argument
+ * call that will result in a kfree() to a cache that is to be destroyed
+ * during module exit, it is developer's responsibility to ensure that all
+ * such calls have returned before the call to kmem_cache_destroy().
+ */
+void kvfree_rcu_barrier(void)
+{
+ flush_all_rcu_sheaves();
+ __kvfree_rcu_barrier();
+}
EXPORT_SYMBOL_GPL(kvfree_rcu_barrier);
+/**
+ * kvfree_rcu_barrier_on_cache - Wait for in-flight kvfree_rcu() calls on a
+ * specific slab cache.
+ * @s: slab cache to wait for
+ *
+ * See the description of kvfree_rcu_barrier() for details.
+ */
+void kvfree_rcu_barrier_on_cache(struct kmem_cache *s)
+{
+ if (s->cpu_sheaves)
+ flush_rcu_sheaves_on_cache(s);
+ /*
+ * TODO: Introduce a version of __kvfree_rcu_barrier() that works
+ * on a specific slab cache.
+ */
+ __kvfree_rcu_barrier();
+}
+EXPORT_SYMBOL_GPL(kvfree_rcu_barrier_on_cache);
+
static unsigned long
kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
{
@@ -2215,4 +2238,3 @@ void __init kvfree_rcu_init(void)
}
#endif /* CONFIG_KVFREE_RCU_BATCHED */
-
diff --git a/mm/slub.c b/mm/slub.c
index 785e25a14999..7cec2220712b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4118,42 +4118,47 @@ static void flush_rcu_sheaf(struct work_struct *w)
/* needed for kvfree_rcu_barrier() */
-void flush_all_rcu_sheaves(void)
+void flush_rcu_sheaves_on_cache(struct kmem_cache *s)
{
struct slub_flush_work *sfw;
- struct kmem_cache *s;
unsigned int cpu;
- cpus_read_lock();
- mutex_lock(&slab_mutex);
+ mutex_lock(&flush_lock);
- list_for_each_entry(s, &slab_caches, list) {
- if (!s->cpu_sheaves)
- continue;
+ for_each_online_cpu(cpu) {
+ sfw = &per_cpu(slub_flush, cpu);
- mutex_lock(&flush_lock);
+ /*
+ * we don't check if rcu_free sheaf exists - racing
+ * __kfree_rcu_sheaf() might have just removed it.
+ * by executing flush_rcu_sheaf() on the cpu we make
+ * sure the __kfree_rcu_sheaf() finished its call_rcu()
+ */
- for_each_online_cpu(cpu) {
- sfw = &per_cpu(slub_flush, cpu);
+ INIT_WORK(&sfw->work, flush_rcu_sheaf);
+ sfw->s = s;
+ queue_work_on(cpu, flushwq, &sfw->work);
+ }
- /*
- * we don't check if rcu_free sheaf exists - racing
- * __kfree_rcu_sheaf() might have just removed it.
- * by executing flush_rcu_sheaf() on the cpu we make
- * sure the __kfree_rcu_sheaf() finished its call_rcu()
- */
+ for_each_online_cpu(cpu) {
+ sfw = &per_cpu(slub_flush, cpu);
+ flush_work(&sfw->work);
+ }
- INIT_WORK(&sfw->work, flush_rcu_sheaf);
- sfw->s = s;
- queue_work_on(cpu, flushwq, &sfw->work);
- }
+ mutex_unlock(&flush_lock);
+}
- for_each_online_cpu(cpu) {
- sfw = &per_cpu(slub_flush, cpu);
- flush_work(&sfw->work);
- }
+void flush_all_rcu_sheaves(void)
+{
+ struct kmem_cache *s;
+
+ cpus_read_lock();
+ mutex_lock(&slab_mutex);
- mutex_unlock(&flush_lock);
+ list_for_each_entry(s, &slab_caches, list) {
+ if (!s->cpu_sheaves)
+ continue;
+ flush_rcu_sheaves_on_cache(s);
}
mutex_unlock(&slab_mutex);
--
2.43.0
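A minimal userspace sketch of the idea behind the patch above, not the kernel
implementation: flushing one cache touches only that cache, while the
flush-all path walks every cache and skips those without sheaves. struct
toy_cache and the toy_*() helpers are hypothetical names for illustration;
locking, per-CPU work items and RCU itself are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kmem_cache; not the kernel layout. */
struct toy_cache {
	bool has_sheaves;     /* models s->cpu_sheaves being non-NULL */
	unsigned int pending; /* objects waiting in RCU sheaves */
	unsigned int flushes; /* how many times this cache was flushed */
};

/* Models flush_rcu_sheaves_on_cache(): touch exactly one cache. */
static void toy_flush_on_cache(struct toy_cache *c)
{
	c->pending = 0;
	c->flushes++;
}

/* Models flush_all_rcu_sheaves(): walk all caches, skip those without sheaves. */
static void toy_flush_all(struct toy_cache *caches, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!caches[i].has_sheaves)
			continue;
		toy_flush_on_cache(&caches[i]);
	}
}

/* Models only the sheaf part of kvfree_rcu_barrier_on_cache(). */
static void toy_barrier_on_cache(struct toy_cache *c)
{
	if (c->has_sheaves)
		toy_flush_on_cache(c);
}
```

In this model the cost of destroying one cache no longer grows with the number
of other caches, which is the effect the measurements in the changelog point at.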
^ permalink raw reply related
* Re: [PATCH V1] mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction
From: Harry Yoo @ 2025-12-02 10:18 UTC (permalink / raw)
To: Jon Hunter
Cc: surenb, Liam.Howlett, atomlin, bpf, cl, da.gomez, linux-kernel,
linux-mm, linux-modules, lucas.demarchi, maple-tree, mcgrof,
petr.pavlu, rcu, rientjes, roman.gushchin, samitolvanen,
sidhartha.kumar, urezki, vbabka, linux-tegra@vger.kernel.org
In-Reply-To: <be021cb8-9bff-4bfc-bc79-c84cbb3f4c4e@nvidia.com>
On Tue, Dec 02, 2025 at 09:29:17AM +0000, Jon Hunter wrote:
>
> On 28/11/2025 11:37, Harry Yoo wrote:
> > Currently, kvfree_rcu_barrier() flushes RCU sheaves across all slab
> > caches when a cache is destroyed. This is unnecessary when destroying
> > a slab cache; only the RCU sheaves belonging to the cache being destroyed
> > need to be flushed.
> >
> > As suggested by Vlastimil Babka, introduce a weaker form of
> > kvfree_rcu_barrier() that operates on a specific slab cache and call it
> > on cache destruction.
> >
> > The performance benefit is evaluated on a 12 core 24 threads AMD Ryzen
> > 5900X machine (1 socket), by loading slub_kunit module.
> >
> > Before:
> > Total calls: 19
> > Average latency (us): 8529
> > Total time (us): 162069
> >
> > After:
> > Total calls: 19
> > Average latency (us): 3804
> > Total time (us): 72287
> >
> > Link: https://lore.kernel.org/linux-mm/0406562e-2066-4cf8-9902-b2b0616dd742@kernel.org
> > Link: https://lore.kernel.org/linux-mm/e988eff6-1287-425e-a06c-805af5bbf262@nvidia.com
> > Link: https://lore.kernel.org/linux-mm/1bda09da-93be-4737-aef0-d47f8c5c9301@suse.cz
> > Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> > Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> > ---
>
> Thanks for the rapid fix. I have been testing this and can confirm that this
> does fix the performance regression I was seeing.
Great!
> BTW shouldn't we add a 'Fixes:' tag above? I would like to ensure that this
> gets picked up for v6.18 stable.
Good point, I added Cc: stable and Fixes: tags.
(and your and Daniel's Reported-and-tested-by: tags)
> Otherwise ...
>
> Tested-by: Jon Hunter <jonathanh@nvidia.com>
Thanks a lot, Jon and Daniel, for reporting the regression and testing the fix!
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply
* Re: [PATCH V2] mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction
From: Harry Yoo @ 2025-12-02 10:20 UTC (permalink / raw)
To: vbabka
Cc: surenb, Liam.Howlett, cl, rientjes, roman.gushchin, urezki,
sidhartha.kumar, linux-mm, linux-kernel, rcu, maple-tree,
linux-modules, mcgrof, petr.pavlu, samitolvanen, atomlin,
lucas.demarchi, akpm, jonathanh, stable, Daniel Gomez
In-Reply-To: <20251202101626.783736-1-harry.yoo@oracle.com>
On Tue, Dec 02, 2025 at 07:16:26PM +0900, Harry Yoo wrote:
> Currently, kvfree_rcu_barrier() flushes RCU sheaves across all slab
> caches when a cache is destroyed. This is unnecessary; only the RCU
> sheaves belonging to the cache being destroyed need to be flushed.
>
> As suggested by Vlastimil Babka, introduce a weaker form of
> kvfree_rcu_barrier() that operates on a specific slab cache.
>
> Factor out flush_rcu_sheaves_on_cache() from flush_all_rcu_sheaves() and
> call it from flush_all_rcu_sheaves() and kvfree_rcu_barrier_on_cache().
>
> Call kvfree_rcu_barrier_on_cache() instead of kvfree_rcu_barrier() on
> cache destruction.
>
> The performance benefit is evaluated on a 12 core 24 threads AMD Ryzen
> 5900X machine (1 socket), by loading slub_kunit module.
>
> Before:
> Total calls: 19
> Average latency (us): 18127
> Total time (us): 344414
>
> After:
> Total calls: 19
> Average latency (us): 10066
> Total time (us): 191264
>
> Two performance regressions have been reported:
> - stress module loader test's runtime increases by 50-60% (Daniel)
> - internal graphics test's runtime on Tegra23 increases by 35% (Jon)
^Tegra234
just a minor typo :)
>
> They are fixed by this change.
>
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Fixes: ec66e0d59952 ("slab: add sheaf support for batching kfree_rcu() operations")
> Cc: <stable@vger.kernel.org>
> Link: https://lore.kernel.org/linux-mm/1bda09da-93be-4737-aef0-d47f8c5c9301@suse.cz
> Reported-and-tested-by: Daniel Gomez <da.gomez@samsung.com>
> Closes: https://lore.kernel.org/linux-mm/0406562e-2066-4cf8-9902-b2b0616dd742@kernel.org
> Reported-and-tested-by: Jon Hunter <jonathanh@nvidia.com>
> Closes: https://lore.kernel.org/linux-mm/e988eff6-1287-425e-a06c-805af5bbf262@nvidia.com
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply
* Re: [PATCH 1/3] kernel.h: drop STACK_MAGIC macro
From: Andi Shyti @ 2025-12-02 20:58 UTC (permalink / raw)
To: Jani Nikula
Cc: Yury Norov (NVIDIA), Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, Andy Shevchenko, Randy Dunlap, Ingo Molnar,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Petr Pavlu,
Daniel Gomez, Greg Kroah-Hartman, Rafael J. Wysocki,
Danilo Krummrich, Andrew Morton, linux-kernel, intel-gfx,
dri-devel, linux-modules, linux-trace-kernel
In-Reply-To: <d854dadd78a43f589399e967def37a0eda3655c2@intel.com>
Hi Jani,
On Mon, Dec 01, 2025 at 09:46:47AM +0200, Jani Nikula wrote:
> On Sat, 29 Nov 2025, "Yury Norov (NVIDIA)" <yury.norov@gmail.com> wrote:
> > The macro is only used by i915. Move it to a local header and drop from
> > the kernel.h.
> >
> > Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> > ---
> > drivers/gpu/drm/i915/i915_utils.h | 2 ++
> > include/linux/kernel.h | 2 --
> > 2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
> > index a0c892e4c40d..6c197e968305 100644
> > --- a/drivers/gpu/drm/i915/i915_utils.h
> > +++ b/drivers/gpu/drm/i915/i915_utils.h
>
> i915_utils.h is on a diet itself. STACK_MAGIC is only used in selftests,
> please put this in i915_selftest.h.
>
> I guess also need to include that from gt/selftest_ring_submission.c,
> the only one that uses STACK_MAGIC but doesn't include i915_selftest.h.
Doing this cleanup is a bit out of scope for this patch.
Given that the patch itself has quite good consensus, let's move
it forward and I can take care of the i915 cleanup once it gets
merged.
Andi
^ permalink raw reply
* Re: [PATCH 1/3] kernel.h: drop STACK_MAGIC macro
From: Yury Norov @ 2025-12-02 21:18 UTC (permalink / raw)
To: Andi Shyti
Cc: Jani Nikula, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Andy Shevchenko, Randy Dunlap, Ingo Molnar, Joonas Lahtinen,
Rodrigo Vivi, Tvrtko Ursulin, Petr Pavlu, Daniel Gomez,
Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Andrew Morton, linux-kernel, intel-gfx, dri-devel, linux-modules,
linux-trace-kernel
In-Reply-To: <3m64k5fagw7hp2duo43t5fldyn6argdjripx3nn6onxbr6xu6w@iwiepyn5krf6>
On Tue, Dec 02, 2025 at 09:58:19PM +0100, Andi Shyti wrote:
> Hi Jani,
>
> On Mon, Dec 01, 2025 at 09:46:47AM +0200, Jani Nikula wrote:
> > On Sat, 29 Nov 2025, "Yury Norov (NVIDIA)" <yury.norov@gmail.com> wrote:
> > > The macro is only used by i915. Move it to a local header and drop from
> > > the kernel.h.
> > >
> > > Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> > > ---
> > > drivers/gpu/drm/i915/i915_utils.h | 2 ++
> > > include/linux/kernel.h | 2 --
> > > 2 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
> > > index a0c892e4c40d..6c197e968305 100644
> > > --- a/drivers/gpu/drm/i915/i915_utils.h
> > > +++ b/drivers/gpu/drm/i915/i915_utils.h
> >
> > i915_utils.h is on a diet itself. STACK_MAGIC is only used in selftests,
> > please put this in i915_selftest.h.
> >
> > I guess also need to include that from gt/selftest_ring_submission.c,
> > the only one that uses STACK_MAGIC but doesn't include i915_selftest.h.
>
> Doing this cleanup is a bit out of scope for this patch.
> Given that the patch itself has quite good consensus, let's move
> it forward and I can take care of the i915 cleanup once it gets
> merged.
I'm already testing it in my tree:
https://github.com/norov/linux/tree/sm1
If everything is fine, I'll submit v2 with this change; otherwise I will
schedule it as a future improvement.
^ permalink raw reply
* Re: [PATCH V2] mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction
From: Harry Yoo @ 2025-12-03 2:17 UTC (permalink / raw)
To: vbabka
Cc: surenb, Liam.Howlett, cl, rientjes, roman.gushchin, urezki,
sidhartha.kumar, linux-mm, linux-kernel, rcu, maple-tree,
linux-modules, mcgrof, petr.pavlu, samitolvanen, atomlin,
lucas.demarchi, akpm, jonathanh, stable, Daniel Gomez
In-Reply-To: <20251202101626.783736-1-harry.yoo@oracle.com>
On Tue, Dec 02, 2025 at 07:16:26PM +0900, Harry Yoo wrote:
> Currently, kvfree_rcu_barrier() flushes RCU sheaves across all slab
> caches when a cache is destroyed. This is unnecessary; only the RCU
> sheaves belonging to the cache being destroyed need to be flushed.
>
> As suggested by Vlastimil Babka, introduce a weaker form of
> kvfree_rcu_barrier() that operates on a specific slab cache.
>
> Factor out flush_rcu_sheaves_on_cache() from flush_all_rcu_sheaves() and
> call it from flush_all_rcu_sheaves() and kvfree_rcu_barrier_on_cache().
>
> Call kvfree_rcu_barrier_on_cache() instead of kvfree_rcu_barrier() on
> cache destruction.
>
> The performance benefit is evaluated on a 12 core 24 threads AMD Ryzen
> 5900X machine (1 socket), by loading slub_kunit module.
>
> Before:
> Total calls: 19
> Average latency (us): 18127
> Total time (us): 344414
>
> After:
> Total calls: 19
> Average latency (us): 10066
> Total time (us): 191264
>
> Two performance regressions have been reported:
> - stress module loader test's runtime increases by 50-60% (Daniel)
So I took a look at why this regression is fixed. I didn't expect it to be
fixed, because Daniel said CONFIG_CODE_TAGGING is enabled, and there is still
a heavy kvfree_rcu_barrier() call during module unloading.
As Vlastimil pointed out off-list, there should be kmem_cache_destroy()
calls somewhere.
So I ran kmod.sh and traced kmem_cache_destroy() calls:
> === kmem_cache_destroy Latency Statistics ===
> Total calls: 6346
> Average latency (us): 5156
> Total time (us): 32725981
Oh, it's called 6346 times during the test? That's impressive.
It also spent 32.725 seconds just in kmem_cache_destroy(), out of a total
runtime of 96 seconds.
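As a quick sanity check of the figures above, a small sketch that recomputes
the average latency and the share of the runtime from the reported totals.
The helper names avg_latency_us() and share_percent() are hypothetical, and
truncating integer division is an assumption about how the statistic was
produced.

```c
#include <stdint.h>

/* Integer average in microseconds, truncated (assumed rounding mode). */
static uint64_t avg_latency_us(uint64_t total_us, uint64_t calls)
{
	return calls ? total_us / calls : 0;
}

/* Share of a runtime given in seconds, as a truncated whole percentage. */
static unsigned int share_percent(uint64_t part_us, uint64_t runtime_s)
{
	if (!runtime_s)
		return 0;
	return (unsigned int)(part_us * 100 / (runtime_s * 1000000ULL));
}
```

With 32725981 us over 6346 calls this gives the reported 5156 us average, and
roughly a third of the 96 second run is spent in kmem_cache_destroy().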
> === Top 2 stack traces involving kmem_cache_destroy ===
>
> @stacks[
> kmem_cache_destroy+1
> cleanup_module+118
> __do_sys_delete_module.isra.0+451
> __x64_sys_delete_module+18
> x64_sys_call+7366
> do_syscall_64+128
> entry_SYSCALL_64_after_hwframe+118
> ]: 1840
It seems tools/testing/selftests/kmod/kmod.sh uses the xfs module for testing,
and that module creates and destroys many slab caches (see exit_xfs_fs() ->
xfs_destroy_caches()).
Mystery solved, I guess :D
> @stacks[
> kmem_cache_destroy+1
> rcbagbt_init_cur_cache+4219734
> __do_sys_delete_module.isra.0+451
> __x64_sys_delete_module+18
> x64_sys_call+7366
> do_syscall_64+128
> entry_SYSCALL_64_after_hwframe+118
> ]: 1955
I don't get this one though. Why is the rcbagbt init function (also
from xfs) called during module unloading?
> - internal graphics test's runtime on Tegra23 increases by 35% (Jon)
>
> They are fixed by this change.
>
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Fixes: ec66e0d59952 ("slab: add sheaf support for batching kfree_rcu() operations")
> Cc: <stable@vger.kernel.org>
> Link: https://lore.kernel.org/linux-mm/1bda09da-93be-4737-aef0-d47f8c5c9301@suse.cz
> Reported-and-tested-by: Daniel Gomez <da.gomez@samsung.com>
> Closes: https://lore.kernel.org/linux-mm/0406562e-2066-4cf8-9902-b2b0616dd742@kernel.org
> Reported-and-tested-by: Jon Hunter <jonathanh@nvidia.com>
> Closes: https://lore.kernel.org/linux-mm/e988eff6-1287-425e-a06c-805af5bbf262@nvidia.com
> Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
> ---
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply