Linux Documentation


* Re: [PATCH v6 1/4] mm/memory-failure: report MF_MSG_KERNEL for reserved pages
From: Breno Leitao @ 2026-05-12 13:04 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Miaohe Lin, Naoya Horiguchi, Andrew Morton, Jonathan Corbet,
	Shuah Khan, Lorenzo Stoakes, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Shuah Khan, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Liam R. Howlett, linux-mm,
	linux-kernel, linux-doc, linux-kselftest, linux-trace-kernel,
	kernel-team, Lance Yang
In-Reply-To: <9504c193-8c01-4d03-8f62-c50fd7fbdbc0@kernel.org>

On Tue, May 12, 2026 at 10:17:00AM +0200, David Hildenbrand (Arm) wrote:
> > @@ -2348,6 +2348,7 @@ int memory_failure(unsigned long pfn, int flags)
> >  	unsigned long page_flags;
> >  	bool retry = true;
> >  	int hugetlb = 0;
> > +	bool is_reserved;
> >  
> >  	if (!sysctl_memory_failure_recovery)
> >  		panic("Memory failure on page %lx", pfn);
> > @@ -2411,6 +2412,18 @@ int memory_failure(unsigned long pfn, int flags)
> >  	 * In fact it's dangerous to directly bump up page count from 0,
> >  	 * that may make page_ref_freeze()/page_ref_unfreeze() mismatch.
> >  	 */
> > +	/*
> > +	 * Pages with PG_reserved set are not currently managed by the
> > +	 * page allocator (memblock-reserved memory, driver reservations,
> > +	 * etc.), so classify them as kernel-owned for reporting.
> > +	 *
> > +	 * Sample the flag before get_hwpoison_page(): in the
> > +	 * MF_COUNT_INCREASED path, get_any_page() can drop the caller's
> > +	 * reference before returning -EIO, after which page->flags may
> > +	 * have been reset by the allocator.
> > +	 */
> > +	is_reserved = PageReserved(p);
> > +
> >  	res = get_hwpoison_page(p, flags);
> >  	if (!res) {
> >  		if (is_free_buddy_page(p)) {
> > @@ -2432,7 +2445,11 @@ int memory_failure(unsigned long pfn, int flags)
> >  		}
> >  		goto unlock_mutex;
> >  	} else if (res < 0) {
> > -		res = action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
> > +		if (is_reserved)
> > +			res = action_result(pfn, MF_MSG_KERNEL, MF_IGNORED);
> > +		else
> > +			res = action_result(pfn, MF_MSG_GET_HWPOISON,
> > +					    MF_IGNORED);
> >  		goto unlock_mutex;
> >  	}
> >  
> > 
> 
> It's a bit odd that we need this handling when we already have handling for
> reserved pages in error_states[].
> 
> HWPoisonHandlable() would always essentially reject PG_reserved pages. So
> __get_hwpoison_page() ... would always fail? Making
> get_hwpoison_page()->get_any_page() always fail?
> 
> But then, we never call identify_page_state()? And never call me_kernel()?

From what I read, it seems that error_states[0] = { reserved, reserved, MF_MSG_KERNEL, me_kernel }
has been effectively dead code on the hwpoison-from-MCE path for a
while.

My v6 patch relabels the failure-path output to match what me_kernel() would
have reported anyway.

> This all looks very odd.
> 
> Why would you even want to call get_hwpoison_page() in the first place if you
> find PageReserved?

Are you suggesting we should call the page action as soon as we detect the page
is reserved and get out?

Something like:

    if (PageReserved(p)) {
        res = action_result(pfn, MF_MSG_KERNEL, MF_IGNORED);
        goto unlock_mutex;
    }

    res = get_hwpoison_page(p, flags);
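
For illustration only, the control flow above can be modelled in userspace.
This is a rough sketch, not kernel code: struct mock_page, mock_get_page()
and classify() are hypothetical stand-ins, MF_MSG_OTHER is an invented
placeholder for "some later state", and only the names MF_MSG_KERNEL and
MF_MSG_GET_HWPOISON come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's message types. */
enum mf_msg { MF_MSG_KERNEL, MF_MSG_GET_HWPOISON, MF_MSG_OTHER };

/* Hypothetical page model: only the two properties that matter here. */
struct mock_page {
	bool reserved;   /* models PG_reserved */
	bool handlable;  /* models whether a reference can be taken */
};

/* Models get_hwpoison_page(): negative on failure, 1 on success. */
static int mock_get_page(const struct mock_page *p)
{
	return p->handlable ? 1 : -1;
}

/* Early-return variant: reserved pages never reach mock_get_page(). */
static enum mf_msg classify(const struct mock_page *p)
{
	assert(p);
	if (p->reserved)
		return MF_MSG_KERNEL;
	if (mock_get_page(p) < 0)
		return MF_MSG_GET_HWPOISON;
	return MF_MSG_OTHER;
}
```

With this shape there is no need to snapshot the flag before the call, since
a reserved page is classified before any reference is taken or dropped.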

Thanks for the review,
--breno


* Re: [PATCH v6 1/4] mm/memory-failure: report MF_MSG_KERNEL for reserved pages
From: Lance Yang @ 2026-05-12 12:48 UTC (permalink / raw)
  To: david
  Cc: leitao, linmiaohe, nao.horiguchi, akpm, corbet, skhan, ljs,
	vbabka, rppt, surenb, mhocko, shuah, rostedt, mhiramat,
	mathieu.desnoyers, liam, linux-mm, linux-kernel, linux-doc,
	linux-kselftest, linux-trace-kernel, kernel-team, lance.yang
In-Reply-To: <9504c193-8c01-4d03-8f62-c50fd7fbdbc0@kernel.org>


On Tue, May 12, 2026 at 10:17:00AM +0200, David Hildenbrand (Arm) wrote:
>On 5/11/26 17:38, Breno Leitao wrote:
>> When get_hwpoison_page() returns a negative value, distinguish
>> reserved pages from other failure cases by reporting MF_MSG_KERNEL
>> instead of MF_MSG_GET_HWPOISON. Reserved pages belong to the kernel
>> and should be classified accordingly for proper handling.
>> 
>> Sample PG_reserved before the get_hwpoison_page() call. In the
>> MF_COUNT_INCREASED path get_any_page() can drop the caller's
>> reference before returning -EIO, after which the underlying page may
>> have been freed and reallocated with page->flags reset; reading
>> PageReserved(p) at that point would observe stale or unrelated state.
>> The pre-call snapshot reflects what the page actually was at the
>> time of the failure event.
>> 
>> Acked-by: Miaohe Lin <linmiaohe@huawei.com>
>> Reviewed-by: Lance Yang <lance.yang@linux.dev>
>> Signed-off-by: Breno Leitao <leitao@debian.org>
>> ---
>>  mm/memory-failure.c | 19 ++++++++++++++++++-
>>  1 file changed, 18 insertions(+), 1 deletion(-)
>> 
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 866c4428ac7ef..f112fb27a8ff6 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -2348,6 +2348,7 @@ int memory_failure(unsigned long pfn, int flags)
>>  	unsigned long page_flags;
>>  	bool retry = true;
>>  	int hugetlb = 0;
>> +	bool is_reserved;
>>  
>>  	if (!sysctl_memory_failure_recovery)
>>  		panic("Memory failure on page %lx", pfn);
>> @@ -2411,6 +2412,18 @@ int memory_failure(unsigned long pfn, int flags)
>>  	 * In fact it's dangerous to directly bump up page count from 0,
>>  	 * that may make page_ref_freeze()/page_ref_unfreeze() mismatch.
>>  	 */
>> +	/*
>> +	 * Pages with PG_reserved set are not currently managed by the
>> +	 * page allocator (memblock-reserved memory, driver reservations,
>> +	 * etc.), so classify them as kernel-owned for reporting.
>> +	 *
>> +	 * Sample the flag before get_hwpoison_page(): in the
>> +	 * MF_COUNT_INCREASED path, get_any_page() can drop the caller's
>> +	 * reference before returning -EIO, after which page->flags may
>> +	 * have been reset by the allocator.
>> +	 */
>> +	is_reserved = PageReserved(p);
>> +
>>  	res = get_hwpoison_page(p, flags);
>>  	if (!res) {
>>  		if (is_free_buddy_page(p)) {
>> @@ -2432,7 +2445,11 @@ int memory_failure(unsigned long pfn, int flags)
>>  		}
>>  		goto unlock_mutex;
>>  	} else if (res < 0) {
>> -		res = action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
>> +		if (is_reserved)
>> +			res = action_result(pfn, MF_MSG_KERNEL, MF_IGNORED);
>> +		else
>> +			res = action_result(pfn, MF_MSG_GET_HWPOISON,
>> +					    MF_IGNORED);
>>  		goto unlock_mutex;
>>  	}
>>  
>> 
>
>It's a bit odd that we need this handling when we already have handling for
>reserved pages in error_states[].
>
>HWPoisonHandlable() would always essentially reject PG_reserved pages. So
>__get_hwpoison_page() ... would always fail? Making
>get_hwpoison_page()->get_any_page() always fail?
>
>But then, we never call identify_page_state()? And never call me_kernel()?

Looks like we never get that far ...

>This all looks very odd.
>
>Why would you even want to call get_hwpoison_page() in the first place if you
>find PageReserved?

Ah, I see :)

For a PG_reserved page, I would not expect PageLRU to be set, nor would
I expect it to be in the buddy allocator.

include/linux/page-flags.h also says:

"
Once (if ever) freed, PG_reserved is cleared and they will be given to
the page allocator.
"

So maybe special-case PageReserved() before get_hwpoison_page()?
Something like:

	if (PageReserved(p)) {
		res = action_result(pfn, MF_MSG_KERNEL, MF_IGNORED);
		goto unlock_mutex;
	}

	res = get_hwpoison_page(p, flags);

Cheers, Lance


* Re: [PATCH] docs: reporting-issues: fix advice wording
From: Jonathan Corbet @ 2026-05-12 12:35 UTC (permalink / raw)
  To: Thorsten Leemhuis, Chen-Shi-Hong; +Cc: skhan, linux-doc, linux-kernel
In-Reply-To: <1e7f2b9c-7c04-45d7-83e4-dd13267ae910@leemhuis.info>

Thorsten Leemhuis <linux@leemhuis.info> writes:

> On 5/12/26 03:51, Chen-Shi-Hong wrote:
>> Replace "these advices" with "this advice" in
>> Documentation/admin-guide/reporting-issues.rst.
>
> Thx for this, fixing this is a good idea. It nevertheless makes me go
> "hmmm...", as the wrongly executed and maybe not obvious enough original
> intention of the author (disclaimer: me) was to make it a bit clearer
> that "this advice" does not only mean the one advice right before it,
> but all the pieces of advice in the paragraph. It would be great to
> cover that while fixing it. "pieces of advice" maybe? Not sure. Maybe
> somebody has a better idea. And maybe just ignore my nitpicking, guess
> "this advice" just feels too easy to misinterpret from my point of view
> as someone to whom English is a second language.

"Advice" is an uncountable thing, so the suggested fix certainly
encompasses the meaning you want.  It could be "all of this advice" or
some such if you want to make its coverage explicitly larger.

jon


* Re: [PATCH v4 1/4] kernel: param: initialize module_kset before do_initcalls()
From: Sumit Gupta @ 2026-05-12 12:14 UTC (permalink / raw)
  To: Jon Hunter, Shashank Balaji, Thierry Reding
  Cc: Gary Guo, Suzuki K Poulose, James Clark, Alexander Shishkin,
	Maxime Coquelin, Alexandre Torgue, Greg Kroah-Hartman,
	Rafael J. Wysocki, Danilo Krummrich, Miguel Ojeda, Boqun Feng,
	Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
	Trevor Gross, Richard Cochran, Jonathan Corbet, Shuah Khan,
	Luis Chamberlain, Petr Pavlu, Daniel Gomez, Sami Tolvanen,
	Aaron Tomlin, Mike Leach, Leo Yan, Rahul Bukte, linux-kernel,
	coresight, linux-arm-kernel, driver-core, rust-for-linux,
	linux-doc, Daniel Palmer, Tim Bird, linux-modules, linux-tegra
In-Reply-To: <40c3aab2-b5cf-4297-9b14-3ccfea377c83@nvidia.com>


On 12/05/26 14:25, Jon Hunter wrote:
> Hi Shashank,
>
> On 12/05/2026 03:12, Shashank Balaji wrote:
>
> ...
>
>>> Hi Thierry and Jonathan,
>>>
>>> You can find the context for this email in this patch:
>>> https://lore.kernel.org/all/20260427-acpi_mod_name-v4-1-22b42240c9bf@sony.com/ 
>>>
>>>
>>> TL;DR: tegra194_cbb_driver and tegra234_cbb_driver are the only drivers
>>> registering themselves as early as in a pure_initcall. This is a problem
>>> on two fronts:
>>> 1. Philosophical: As Gary pointed out, pure_initcalls are intended to purely
>>> initialize variables that couldn't be statically initialized. But these
>>> are doing driver registrations.
>>> 2. module_kset not initialized at pure_initcall stage: This is needed to
>>> set the module sysfs symlink. Since module_kset is not alive yet during
>>> pure_initcalls, registering these drivers panics the kernel.
>
> Where exactly is this panic seen? Ie. why are we not seeing this?
>
>>> We would like to do the tegra cbb driver registration in a core_initcall
>>> (or some later initcall works too), and move module_kset initialization
>>> to a pure_initcall. Like this:
>>>
>>> diff --git a/drivers/soc/tegra/cbb/tegra194-cbb.c b/drivers/soc/tegra/cbb/tegra194-cbb.c
>>> index ab75d50cc85c..2f69e104c838 100644
>>> --- a/drivers/soc/tegra/cbb/tegra194-cbb.c
>>> +++ b/drivers/soc/tegra/cbb/tegra194-cbb.c
>>> @@ -2342,7 +2342,7 @@ static int __init tegra194_cbb_init(void)
>>>   {
>>>          return platform_driver_register(&tegra194_cbb_driver);
>>>   }
>>> -pure_initcall(tegra194_cbb_init);
>>> +core_initcall(tegra194_cbb_init);
>>>
>>>   static void __exit tegra194_cbb_exit(void)
>>>   {
>>> diff --git a/drivers/soc/tegra/cbb/tegra234-cbb.c b/drivers/soc/tegra/cbb/tegra234-cbb.c
>>> index fb26f085f691..785072fa4e85 100644
>>> --- a/drivers/soc/tegra/cbb/tegra234-cbb.c
>>> +++ b/drivers/soc/tegra/cbb/tegra234-cbb.c
>>> @@ -1774,7 +1774,7 @@ static int __init tegra234_cbb_init(void)
>>>   {
>>>          return platform_driver_register(&tegra234_cbb_driver);
>>>   }
>>> -pure_initcall(tegra234_cbb_init);
>>> +core_initcall(tegra234_cbb_init);
>>>
>>>   static void __exit tegra234_cbb_exit(void)
>>>   {
>>>
>>> Would this work?
>
>
> I am adding Sumit who has been doing a lot of the Tegra CBB driver work.
>
> Sumit, any concerns here? We could run this change through our internal
> testing to confirm.
>
> Jon
>

The CBB driver can be switched to core_initcall.
pure_initcall was originally used so that its IRQ handler is registered
before other Tegra drivers, to catch and print any bad MMIO error
during their probe.
I looked at the current state of the Tegra drivers:
  - The other early Tegra drivers (PMC, fuse, flowctrl, ARI) all run at
    early_initcall, before either pure_ or core_initcall.
  - The only other Tegra core_initcall is tegra-hsp, and link order keeps
    CBB ahead of it (drivers/soc/ links before drivers/mailbox/).
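
To make the ordering argument concrete, here is a small userspace model
(a sketch only, not kernel code). It assumes the usual rule that
early_initcall runs before the numbered levels, that pure is level 0 and
core is level 1, and that calls within one level run in link order. The
struct, the helper names and the link_order values are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified levels; early_initcall runs before the numbered ones. */
enum level { LVL_EARLY = -1, LVL_PURE = 0, LVL_CORE = 1 };

struct initcall {
	const char *name;
	enum level level;
	int link_order;  /* hypothetical: position in the final link */
};

/* Order by level first, then by link order within a level. */
static int cmp_initcall(const void *a, const void *b)
{
	const struct initcall *x = a, *y = b;

	if (x->level != y->level)
		return x->level < y->level ? -1 : 1;
	return (x->link_order > y->link_order) -
	       (x->link_order < y->link_order);
}

/* Sort the calls into the order this model would run them at boot. */
static void order_initcalls(struct initcall *calls, size_t n)
{
	assert(calls);
	qsort(calls, n, sizeof(*calls), cmp_initcall);
}
```

In this model, moving CBB from pure to core keeps it after the early_initcall
drivers (as before) and ahead of tegra-hsp as long as its link position is
lower.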

Acked-by: Sumit Gupta <sumitg@nvidia.com>

Thank you,
Sumit Gupta




* Re: [PATCH RESEND bpf-next v10 5/8] bpf: Add bpf_list_add to insert node after a given list node
From: Kaitao Cheng @ 2026-05-12 12:05 UTC (permalink / raw)
  To: bot+bpf-ci
  Cc: bpf, ast, corbet, martin.lau, daniel, andrii, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah,
	chengkaitao, skhan, memxor, linux-kernel, linux-doc, vmalik,
	linux-kselftest, martin.lau, clm, ihor.solodrai
In-Reply-To: <f573f7e0a5140fe6254d2ddfd3f69add455fa25c67c020c0312e2c2ceeca7c69@mail.kernel.org>



On 2026/5/12 14:41, bot+bpf-ci@kernel.org wrote:
>> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
>> --- a/kernel/bpf/helpers.c
>> +++ b/kernel/bpf/helpers.c
>> @@ -2570,6 +2570,16 @@ __bpf_kfunc int bpf_list_push_back_impl(struct bpf_list_head *head,
>>  	return bpf_list_push_back(head, node, meta__ign, off);
>>  }
>>
>> +__bpf_kfunc int bpf_list_add(struct bpf_list_head *head, struct bpf_list_node *new,
>> +			     struct bpf_list_node *prev, struct btf_struct_meta *meta,
>> +			     u64 off)
> 
> The commit message states that 'prev' must already be in the list. How can a
> BPF program obtain such a reference and pass it to this kfunc?
> 
> A node already in the list is tracked by the verifier as a non-owning
> reference (PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF with ref_obj_id == 0).
> The typical way to obtain this is via bpf_list_front() or bpf_list_back(),
> which return non-owning references.
> 
> However, check_kfunc_args() in kernel/bpf/verifier.c handles
> KF_ARG_PTR_TO_LIST_NODE by requiring owning references and rejects non-owning
> references.
> 
> Without a similar distinction for KF_ARG_PTR_TO_LIST_NODE (as exists for
> KF_ARG_PTR_TO_RB_NODE), is there any valid BPF register state that satisfies
> both the verifier check and the runtime semantics stated in the commit
> message?
> 
> The BPF CI bot reported this issue in v10. The author responded that this was
> fixed in [PATCH bpf-next v10 7/8], which introduces the __nonown_allowed
> annotation and corresponding verifier logic. However, patch 7/8 is not
> present in this commit (feed006fb969), so the parameter lacks the
> __nonown_allowed suffix and the verifier lacks the is_kfunc_arg_nonown_allowed()
> check.

If this is an issue that must be fixed, I will move [PATCH bpf-next v10 7/8]
earlier in the next version.
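
For readers following along, the intended "insert after prev" semantics can
be sketched with a plain circular doubly-linked list in userspace. This is
not the kfunc or __bpf_list_add(); struct node, list_init() and
list_add_after() are hypothetical names, and the "prev is not linked" check
is only an assumed way to model the requirement that 'prev' must already be
in the list.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	struct node *prev, *next;
	int val;
};

/* An empty circular list points at itself. */
static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

/* Insert 'new' right after 'prev'; 'prev' must already be linked. */
static int list_add_after(struct node *new, struct node *prev)
{
	assert(new && prev);
	if (!prev->next || !prev->prev)
		return -1;  /* 'prev' is not on any list */
	new->prev = prev;
	new->next = prev->next;
	prev->next->prev = new;
	prev->next = new;
	return 0;
}
```

The caller has to hold some reference to a node that is already linked,
which is exactly the non-owning reference case the bot is asking about.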

> Reference: https://lore.kernel.org/bpf/cb7368c6062232b6ad5f9ab1c6a97bc8530566a8b3db5c4deefdd5ea63d31538@mail.kernel.org/
> 
>> +{
>> +	struct bpf_list_node_kern *n = (void *)new, *p = (void *)prev;
>> +	struct list_head *prev_ptr = &p->list_head;
>> +
>> +	return __bpf_list_add(n, head, &prev_ptr, meta ? meta->record : NULL, off);
>> +}
>> +
> 
> [ ... ]
> 
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
> 
> [ ... ]
> 
>> @@ -19235,8 +19238,11 @@ struct bpf_insn addr[2] = { BPF_LD_IMM64(BPF_REG_2, (long)kptr_struct_meta) };
>>  		int struct_meta_reg = BPF_REG_3;
>>  		int node_offset_reg = BPF_REG_4;
>>
>> -		/* rbtree_add has extra 'less' arg, so args-to-fixup are in diff regs */
>> -		if (is_bpf_rbtree_add_kfunc(desc->func_id)) {
>> +		/* list_add/rbtree_add have an extra arg (prev/less),
>> +		 * so args-to-fixup are in diff regs.
>> +		 */
> 
> This isn't a bug, but the multi-line comment format doesn't follow the BPF
> subsystem style. The preferred form in kernel/bpf/ has the opening /* on its
> own line:
> 
>     /*
>      * list_add/rbtree_add have an extra arg (prev/less),
>      * so args-to-fixup are in diff regs.
>      */
> 
>> +		if (desc->func_id == special_kfunc_list[KF_bpf_list_add] ||
>> +		    is_bpf_rbtree_add_kfunc(desc->func_id)) {
>>  			struct_meta_reg = BPF_REG_4;
>>  			node_offset_reg = BPF_REG_5;
>>  		}
> 
> 
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> 
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25716874656

-- 
Thanks
Kaitao Cheng



* Re: [PATCH v12 02/11] lib: kstrtox: add kstrtoudec64() and kstrtodec64()
From: Rodrigo Alencar @ 2026-05-12 11:52 UTC (permalink / raw)
  To: Jonathan Cameron, Rodrigo Alencar via B4 Relay
  Cc: rodrigo.alencar, linux-kernel, linux-iio, devicetree, linux-doc,
	David Lechner, Andy Shevchenko, Lars-Peter Clausen,
	Michael Hennerich, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jonathan Corbet, Andrew Morton, Petr Mladek, Steven Rostedt,
	Andy Shevchenko, Rasmus Villemoes, Sergey Senozhatsky, Shuah Khan,
	David Laight
In-Reply-To: <20260512123953.40d80bc9@jic23-huawei>

On 26/05/12 12:39PM, Jonathan Cameron wrote:
> On Sun, 10 May 2026 13:42:20 +0100
> Rodrigo Alencar via B4 Relay <devnull+rodrigo.alencar.analog.com@kernel.org> wrote:
> 
> > From: Rodrigo Alencar <rodrigo.alencar@analog.com>
> > 
> > > Add helpers that parse decimal numbers into a 64-bit number, i.e., decimal
> > point numbers with pre-defined scale are parsed into a 64-bit value (fixed
> > precision). After the decimal point, digits beyond the specified scale
> > are ignored.
> > 
> > Signed-off-by: Rodrigo Alencar <rodrigo.alencar@analog.com>
> 
> Whilst Rodrigo has already replied to say there will be another version
> I'd like to request final feedback from those who were involved in the parser
> discussions.  
> 
> They got very involved and I'm far from an expert in the right way to do
> this stuff.  
> 
> I don't think David Laight was +CC so I've added that.
> David, Andy - I think you two were most involved in that discussion:
> Any objections to the end result? 

I am also evaluating whether to take Sashiko's feedback here, so it is
indeed a good time to check this again.

> Thanks,
> 
> Jonathan
> 
> 
> > ---
> >  include/linux/kstrtox.h |   3 ++
> >  lib/kstrtox.c           | 107 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 110 insertions(+)
> > 
> > diff --git a/include/linux/kstrtox.h b/include/linux/kstrtox.h
> > index 6ea897222af1..bec2fc17bde0 100644
> > --- a/include/linux/kstrtox.h
> > +++ b/include/linux/kstrtox.h
> > @@ -97,6 +97,9 @@ int __must_check kstrtou8(const char *s, unsigned int base, u8 *res);
> >  int __must_check kstrtos8(const char *s, unsigned int base, s8 *res);
> >  int __must_check kstrtobool(const char *s, bool *res);
> >  
> > +int __must_check kstrtoudec64(const char *s, unsigned int scale, u64 *res);
> > +int __must_check kstrtodec64(const char *s, unsigned int scale, s64 *res);
> > +
> >  int __must_check kstrtoull_from_user(const char __user *s, size_t count, unsigned int base, unsigned long long *res);
> >  int __must_check kstrtoll_from_user(const char __user *s, size_t count, unsigned int base, long long *res);
> >  int __must_check kstrtoul_from_user(const char __user *s, size_t count, unsigned int base, unsigned long *res);
> > diff --git a/lib/kstrtox.c b/lib/kstrtox.c
> > index 97be2a39f537..da7b5f83a3c5 100644
> > --- a/lib/kstrtox.c
> > +++ b/lib/kstrtox.c
> > @@ -17,6 +17,7 @@
> >  #include <linux/export.h>
> >  #include <linux/kstrtox.h>
> >  #include <linux/math64.h>
> > +#include <linux/overflow.h>
> >  #include <linux/types.h>
> >  #include <linux/uaccess.h>
> >  
> > @@ -392,6 +393,112 @@ int kstrtobool(const char *s, bool *res)
> >  }
> >  EXPORT_SYMBOL(kstrtobool);
> >  
> > +static int _kstrtoudec64(const char *s, unsigned int scale, u64 *res)
> > +{
> > +	u64 _res = 0, _frac = 0;
> > +	unsigned int rv;
> > +
> > +	if (scale > 19) /* log10(2^64) = 19.26 */
> > +		return -EINVAL;
> > +
> > +	if (*s != '.') {
> > +		rv = _parse_integer(s, 10, &_res);
> > +		if (rv & KSTRTOX_OVERFLOW)
> > +			return -ERANGE;
> > +		if (rv == 0)
> > +			return -EINVAL;
> > +		s += rv;
> > +	}
> > +
> > +	if (*s == '.' && scale) {

I haven't really considered the scale == 0 case; I suppose that
one could rely on kstrtoull() instead. But as Sashiko points
out, it deviates from the documented behavior. Also, I will
consider accepting "123." as valid input; I see that other
parsers do that and it should not be a problem. So I will add a
small change here. Also will make sure the test cases are ok.
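
As a reference point for those two cases, here is a userspace approximation
of the parsing rules (a sketch under stated assumptions, not the patch
itself). parse_udec64() and pow10_u64() are hypothetical names. Unlike the
quoted v12 code, this sketch accepts a trailing "." and drops fractional
digits when scale == 0; both are assumptions about what a later version
might do. The leading '+' handling of the wrapper is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>

/* 10^n for n <= 19, which still fits in uint64_t. */
static uint64_t pow10_u64(unsigned int n)
{
	uint64_t r = 1;

	while (n--)
		r *= 10;
	return r;
}

/* Parse "[digits][.digits][\n]" into value * 10^scale. */
static int parse_udec64(const char *s, unsigned int scale, uint64_t *res)
{
	uint64_t ip = 0, frac = 0;
	unsigned int ndigits = 0, nfrac = 0;

	assert(s && res);
	if (scale > 19)
		return -EINVAL;

	/* Integer part, with an overflow check before each step. */
	for (; isdigit((unsigned char)*s); s++, ndigits++) {
		unsigned int d = *s - '0';

		if (ip > (UINT64_MAX - d) / 10)
			return -ERANGE;
		ip = ip * 10 + d;
	}

	if (*s == '.') {
		s++;
		for (; isdigit((unsigned char)*s); s++, ndigits++) {
			if (nfrac < scale) {  /* digits beyond scale are dropped */
				frac = frac * 10 + (uint64_t)(*s - '0');
				nfrac++;
			}
		}
	}
	if (ndigits == 0)  /* "", "." and non-numeric input */
		return -EINVAL;
	if (*s == '\n')
		s++;
	if (*s)
		return -EINVAL;

	frac *= pow10_u64(scale - nfrac);  /* pad "1.5" up to scale digits */
	if (ip > (UINT64_MAX - frac) / pow10_u64(scale))
		return -ERANGE;
	*res = ip * pow10_u64(scale) + frac;
	return 0;
}
```

With these rules "123.45" at scale 2 gives 12345, "123." at scale 2 gives
12300, and "1.5" at scale 0 gives 1.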

> > +		s++; /* skip decimal point */
> > +		rv = _parse_integer_limit(s, 10, &_frac, scale);
> > +		if (rv & KSTRTOX_OVERFLOW)
> > +			return -ERANGE;
> > +		if (rv == 0)
> > +			return -EINVAL;
> > +		s += rv;
> > +		if (rv < scale)
> > +			_frac *= int_pow(10, scale - rv);
> > +		while (isdigit(*s)) /* truncate */
> > +			s++;
> > +	}
> > +
> > +	if (*s == '\n')
> > +		s++;
> > +	if (*s)
> > +		return -EINVAL;
> > +
> > +	if (check_mul_overflow(_res, int_pow(10, scale), &_res) ||
> > +	    check_add_overflow(_res, _frac, &_res))
> > +		return -ERANGE;
> > +
> > +	*res = _res;
> > +	return 0;
> > +}
> > +
> > +/**
> > + * kstrtoudec64() - Convert a string to an unsigned 64-bit value that represents
> > + *		    a scaled decimal number.
> > + * @s: The start of the string. The string must be null-terminated, and may also
> > + *  include a single newline before its terminating null. The first character
> > + *  may also be a plus sign, but not a minus sign. Digits beyond the specified
> > + *  scale are ignored.
> > + * @scale: The number of digits to the right of the decimal point. For example,
> > + *  a scale of 2 would mean the number is represented with two decimal places,
> > + *  so "123.45" would be represented as 12345.
> > + * @res: Where to write the result of the conversion on success.
> > + *
> > + * Return: 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
> > + */
> > +noinline
> > +int kstrtoudec64(const char *s, unsigned int scale, u64 *res)
> > +{
> > +	if (s[0] == '+')
> > +		s++;
> > +	return _kstrtoudec64(s, scale, res);
> > +}
> > +EXPORT_SYMBOL(kstrtoudec64);
> > +
> > +/**
> > + * kstrtodec64() - Convert a string to a signed 64-bit value that represents a
> > + *		   scaled decimal number.
> > + * @s: The start of the string. The string must be null-terminated, and may also
> > + *  include a single newline before its terminating null. The first character
> > + *  may also be a plus sign or a minus sign. Digits beyond the specified
> > + *  scale are ignored.
> > + * @scale: The number of digits to the right of the decimal point. For example,
> > + *  a scale of 5 would mean the number is represented with five decimal places,
> > + *  so "-3.141592" would be represented as -314159.
> > + * @res: Where to write the result of the conversion on success.
> > + *
> > + * Return: 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
> > + */
> > +noinline
> > +int kstrtodec64(const char *s, unsigned int scale, s64 *res)
> > +{
> > +	u64 tmp;
> > +	int rv;
> > +
> > +	if (s[0] == '-') {
> > +		rv = _kstrtoudec64(s + 1, scale, &tmp);
> > +		if (rv < 0)
> > +			return rv;
> > +		if ((s64)-tmp > 0)
> > +			return -ERANGE;
> > +		*res = -tmp;
> > +	} else {
> > +		rv = kstrtoudec64(s, scale, &tmp);
> > +		if (rv < 0)
> > +			return rv;
> > +		if ((s64)tmp < 0)
> > +			return -ERANGE;
> > +		*res = tmp;
> > +	}
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL(kstrtodec64);
> > +
> >  /*
> >   * Since "base" would be a nonsense argument, this open-codes the
> >   * _from_user helper instead of using the helper macro below.
> > 
> 

-- 
Kind regards,

Rodrigo Alencar


* Re: [PATCH v12 00/11] ADF41513/ADF41510 PLL frequency synthesizers
From: Jonathan Cameron @ 2026-05-12 11:48 UTC (permalink / raw)
  To: Rodrigo Alencar via B4 Relay
  Cc: rodrigo.alencar, linux-kernel, linux-iio, devicetree, linux-doc,
	David Lechner, Andy Shevchenko, Lars-Peter Clausen,
	Michael Hennerich, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jonathan Corbet, Andrew Morton, Petr Mladek, Steven Rostedt,
	Andy Shevchenko, Rasmus Villemoes, Sergey Senozhatsky, Shuah Khan,
	Krzysztof Kozlowski
In-Reply-To: <20260510-adf41513-iio-driver-v12-0-34af2ed2779f@analog.com>

On Sun, 10 May 2026 13:42:18 +0100
Rodrigo Alencar via B4 Relay <devnull+rodrigo.alencar.analog.com@kernel.org> wrote:

> This patch series adds support for the Analog Devices ADF41513 and ADF41510
> ultralow noise PLL frequency synthesizers. These devices are designed for
> implementing local oscillators (LOs) in high-frequency applications.
> The ADF41513 covers frequencies from 1 GHz to 26.5 GHz, while the ADF41510
> operates from 1 GHz to 10 GHz.
> 
> Key features supported by this driver:
> - Integer-N and fractional-N operation modes
> - High maximum PFD frequency (250 MHz integer-N, 125 MHz fractional-N)
> - 25-bit fixed modulus or 49-bit variable modulus fractional modes
> - Digital lock detect functionality
> - Phase resync capability for consistent output phase
> - Load Enable vs Reference signal synchronization

FWIW I have taken another look through and didn't have anything to add.
So I think it's now you vs Sashiko!

With that in mind I'm fine with you not waiting as long as normal before
sending a v13.  Whilst I still would like some level of tag or informal
'it's fine' for the string parser from those who were feeding back on
earlier versions that bit isn't going to change anyway for v13 and
so probably not worth holding it back for that.

Thanks,

Jonathan



* Re: [PATCH v12 02/11] lib: kstrtox: add kstrtoudec64() and kstrtodec64()
From: Jonathan Cameron @ 2026-05-12 11:39 UTC (permalink / raw)
  To: Rodrigo Alencar via B4 Relay
  Cc: rodrigo.alencar, linux-kernel, linux-iio, devicetree, linux-doc,
	David Lechner, Andy Shevchenko, Lars-Peter Clausen,
	Michael Hennerich, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jonathan Corbet, Andrew Morton, Petr Mladek, Steven Rostedt,
	Andy Shevchenko, Rasmus Villemoes, Sergey Senozhatsky, Shuah Khan,
	David Laight
In-Reply-To: <20260510-adf41513-iio-driver-v12-2-34af2ed2779f@analog.com>

On Sun, 10 May 2026 13:42:20 +0100
Rodrigo Alencar via B4 Relay <devnull+rodrigo.alencar.analog.com@kernel.org> wrote:

> From: Rodrigo Alencar <rodrigo.alencar@analog.com>
> 
> Add helpers that parse decimal numbers into a 64-bit number, i.e., decimal
> point numbers with pre-defined scale are parsed into a 64-bit value (fixed
> precision). After the decimal point, digits beyond the specified scale
> are ignored.
> 
> Signed-off-by: Rodrigo Alencar <rodrigo.alencar@analog.com>

Whilst Rodrigo has already replied to say there will be another version
I'd like to request final feedback from those who were involved in the parser
discussions.  

They got very involved and I'm far from an expert in the right way to do
this stuff.  

I don't think David Laight was +CC so I've added that.
David, Andy - I think you two were most involved in that discussion:
Any objections to the end result? 

Thanks,

Jonathan


> ---
>  include/linux/kstrtox.h |   3 ++
>  lib/kstrtox.c           | 107 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 110 insertions(+)
> 
> diff --git a/include/linux/kstrtox.h b/include/linux/kstrtox.h
> index 6ea897222af1..bec2fc17bde0 100644
> --- a/include/linux/kstrtox.h
> +++ b/include/linux/kstrtox.h
> @@ -97,6 +97,9 @@ int __must_check kstrtou8(const char *s, unsigned int base, u8 *res);
>  int __must_check kstrtos8(const char *s, unsigned int base, s8 *res);
>  int __must_check kstrtobool(const char *s, bool *res);
>  
> +int __must_check kstrtoudec64(const char *s, unsigned int scale, u64 *res);
> +int __must_check kstrtodec64(const char *s, unsigned int scale, s64 *res);
> +
>  int __must_check kstrtoull_from_user(const char __user *s, size_t count, unsigned int base, unsigned long long *res);
>  int __must_check kstrtoll_from_user(const char __user *s, size_t count, unsigned int base, long long *res);
>  int __must_check kstrtoul_from_user(const char __user *s, size_t count, unsigned int base, unsigned long *res);
> diff --git a/lib/kstrtox.c b/lib/kstrtox.c
> index 97be2a39f537..da7b5f83a3c5 100644
> --- a/lib/kstrtox.c
> +++ b/lib/kstrtox.c
> @@ -17,6 +17,7 @@
>  #include <linux/export.h>
>  #include <linux/kstrtox.h>
>  #include <linux/math64.h>
> +#include <linux/overflow.h>
>  #include <linux/types.h>
>  #include <linux/uaccess.h>
>  
> @@ -392,6 +393,112 @@ int kstrtobool(const char *s, bool *res)
>  }
>  EXPORT_SYMBOL(kstrtobool);
>  
> +static int _kstrtoudec64(const char *s, unsigned int scale, u64 *res)
> +{
> +	u64 _res = 0, _frac = 0;
> +	unsigned int rv;
> +
> +	if (scale > 19) /* log10(2^64) = 19.26 */
> +		return -EINVAL;
> +
> +	if (*s != '.') {
> +		rv = _parse_integer(s, 10, &_res);
> +		if (rv & KSTRTOX_OVERFLOW)
> +			return -ERANGE;
> +		if (rv == 0)
> +			return -EINVAL;
> +		s += rv;
> +	}
> +
> +	if (*s == '.' && scale) {
> +		s++; /* skip decimal point */
> +		rv = _parse_integer_limit(s, 10, &_frac, scale);
> +		if (rv & KSTRTOX_OVERFLOW)
> +			return -ERANGE;
> +		if (rv == 0)
> +			return -EINVAL;
> +		s += rv;
> +		if (rv < scale)
> +			_frac *= int_pow(10, scale - rv);
> +		while (isdigit(*s)) /* truncate */
> +			s++;
> +	}
> +
> +	if (*s == '\n')
> +		s++;
> +	if (*s)
> +		return -EINVAL;
> +
> +	if (check_mul_overflow(_res, int_pow(10, scale), &_res) ||
> +	    check_add_overflow(_res, _frac, &_res))
> +		return -ERANGE;
> +
> +	*res = _res;
> +	return 0;
> +}
> +
> +/**
> + * kstrtoudec64() - Convert a string to an unsigned 64-bit value that represents
> + *		    a scaled decimal number.
> + * @s: The start of the string. The string must be null-terminated, and may also
> + *  include a single newline before its terminating null. The first character
> + *  may also be a plus sign, but not a minus sign. Digits beyond the specified
> + *  scale are ignored.
> + * @scale: The number of digits to the right of the decimal point. For example,
> + *  a scale of 2 would mean the number is represented with two decimal places,
> + *  so "123.45" would be represented as 12345.
> + * @res: Where to write the result of the conversion on success.
> + *
> + * Return: 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
> + */
> +noinline
> +int kstrtoudec64(const char *s, unsigned int scale, u64 *res)
> +{
> +	if (s[0] == '+')
> +		s++;
> +	return _kstrtoudec64(s, scale, res);
> +}
> +EXPORT_SYMBOL(kstrtoudec64);
> +
> +/**
> + * kstrtodec64() - Convert a string to a signed 64-bit value that represents a
> + *		   scaled decimal number.
> + * @s: The start of the string. The string must be null-terminated, and may also
> + *  include a single newline before its terminating null. The first character
> + *  may also be a plus sign or a minus sign. Digits beyond the specified
> + *  scale are ignored.
> + * @scale: The number of digits to the right of the decimal point. For example,
> + *  a scale of 5 would mean the number is represented with five decimal places,
> + *  so "-3.141592" would be represented as -314159.
> + * @res: Where to write the result of the conversion on success.
> + *
> + * Return: 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
> + */
> +noinline
> +int kstrtodec64(const char *s, unsigned int scale, s64 *res)
> +{
> +	u64 tmp;
> +	int rv;
> +
> +	if (s[0] == '-') {
> +		rv = _kstrtoudec64(s + 1, scale, &tmp);
> +		if (rv < 0)
> +			return rv;
> +		if ((s64)-tmp > 0)
> +			return -ERANGE;
> +		*res = -tmp;
> +	} else {
> +		rv = kstrtoudec64(s, scale, &tmp);
> +		if (rv < 0)
> +			return rv;
> +		if ((s64)tmp < 0)
> +			return -ERANGE;
> +		*res = tmp;
> +	}
> +	return 0;
> +}
> +EXPORT_SYMBOL(kstrtodec64);
> +
>  /*
>   * Since "base" would be a nonsense argument, this open-codes the
>   * _from_user helper instead of using the helper macro below.
> 
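
For readers following along, the documented semantics above can be illustrated with a small userspace sketch. This is not the kernel implementation: parse_scaled_digits() and parse_dec64() are hypothetical names, and details such as accepting a leading "." are assumptions made for the example. It shows the scaling, the truncation of digits beyond the scale, and the asymmetric range check for negative values.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: parse unsigned digits with an optional '.', no sign. */
static int parse_scaled_digits(const char *s, unsigned int scale, uint64_t *res)
{
	uint64_t val = 0;
	unsigned int frac = 0;
	int seen_digit = 0, seen_point = 0;

	for (; *s; s++) {
		uint64_t d;

		if (*s == '.' && !seen_point) {
			seen_point = 1;
			continue;
		}
		if (*s == '\n' && s[1] == '\0')
			break;		/* a single trailing newline is allowed */
		if (*s < '0' || *s > '9')
			return -EINVAL;
		seen_digit = 1;
		if (seen_point) {
			if (frac == scale)
				continue;	/* digits beyond the scale are ignored */
			frac++;
		}
		d = (uint64_t)(*s - '0');
		if (val > (UINT64_MAX - d) / 10)
			return -ERANGE;
		val = val * 10 + d;
	}
	if (!seen_digit)
		return -EINVAL;

	/* Pad with zeros when fewer than 'scale' fractional digits were given. */
	for (; frac < scale; frac++) {
		if (val > UINT64_MAX / 10)
			return -ERANGE;
		val *= 10;
	}
	*res = val;
	return 0;
}

/* Hypothetical signed wrapper: accepts one leading '+' or '-'. */
static int parse_dec64(const char *s, unsigned int scale, int64_t *res)
{
	uint64_t tmp;
	int neg = 0;
	int rv;

	assert(s && res);

	if (*s == '-') {
		neg = 1;
		s++;
	} else if (*s == '+') {
		s++;
	}

	rv = parse_scaled_digits(s, scale, &tmp);
	if (rv < 0)
		return rv;
	/* The negative range has one more value than the positive range. */
	if (tmp > (uint64_t)INT64_MAX + (neg ? 1 : 0))
		return -ERANGE;
	if (neg)
		*res = tmp ? -(int64_t)(tmp - 1) - 1 : 0;
	else
		*res = (int64_t)tmp;
	return 0;
}
```

With this sketch, parse_dec64("123.45", 2, &v) stores 12345 and parse_dec64("-3.141592", 5, &v) stores -314159, matching the examples in the kernel-doc comments.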


^ permalink raw reply

* Re: [PATCH v12 11/11] Documentation: ABI: testing: add common ABI file for iio/frequency
From: Jonathan Cameron @ 2026-05-12 11:36 UTC (permalink / raw)
  To: Rodrigo Alencar via B4 Relay
  Cc: rodrigo.alencar, linux-kernel, linux-iio, devicetree, linux-doc,
	David Lechner, Andy Shevchenko, Lars-Peter Clausen,
	Michael Hennerich, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jonathan Corbet, Andrew Morton, Petr Mladek, Steven Rostedt,
	Andy Shevchenko, Rasmus Villemoes, Sergey Senozhatsky, Shuah Khan
In-Reply-To: <20260510-adf41513-iio-driver-v12-11-34af2ed2779f@analog.com>

On Sun, 10 May 2026 13:42:29 +0100
Rodrigo Alencar via B4 Relay <devnull+rodrigo.alencar.analog.com@kernel.org> wrote:

> From: Rodrigo Alencar <rodrigo.alencar@analog.com>
> 
> Add ABI documentation file for PLL/DDS devices with frequency_resolution
> sysfs entry attribute used by both ADF4350 and ADF41513.
> 
> Signed-off-by: Rodrigo Alencar <rodrigo.alencar@analog.com>
> ---
>  Documentation/ABI/testing/sysfs-bus-iio-frequency         | 11 +++++++++++
>  Documentation/ABI/testing/sysfs-bus-iio-frequency-adf4350 | 10 ----------
>  2 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-iio-frequency b/Documentation/ABI/testing/sysfs-bus-iio-frequency
> new file mode 100644
> index 000000000000..1ce8ae578fd6
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-iio-frequency
> @@ -0,0 +1,11 @@
> +What:		/sys/bus/iio/devices/iio:deviceX/out_altvoltageY_frequency_resolution
> +KernelVersion:	6.20
FWIW Sashiko correctly points out that moving documentation doesn't change the kernel version
in which it was introduced.  So this should be 3.4.0.

> +Contact:	linux-iio@vger.kernel.org
> +Description:
> +		Stores channel Y frequency resolution/channel spacing in Hz for PLL
> +		devices. The given value directly influences the operating mode when
> +		fractional-N synthesis is required, as it derives values for
> +		configurable modulus parameters used in the calculation of the output
> +		frequency. It is assumed that the algorithm that is used to compute
> +		the various dividers, is able to generate proper values for multiples
> +		of channel spacing.
> diff --git a/Documentation/ABI/testing/sysfs-bus-iio-frequency-adf4350 b/Documentation/ABI/testing/sysfs-bus-iio-frequency-adf4350
> index 1254457a726e..76987a119feb 100644
> --- a/Documentation/ABI/testing/sysfs-bus-iio-frequency-adf4350
> +++ b/Documentation/ABI/testing/sysfs-bus-iio-frequency-adf4350
> @@ -1,13 +1,3 @@
> -What:		/sys/bus/iio/devices/iio:deviceX/out_altvoltageY_frequency_resolution
> -KernelVersion:	3.4.0
> -Contact:	linux-iio@vger.kernel.org
> -Description:
> -		Stores channel Y frequency resolution/channel spacing in Hz.
> -		The value given directly influences the MODULUS used by
> -		the fractional-N PLL. It is assumed that the algorithm
> -		that is used to compute the various dividers, is able to
> -		generate proper values for multiples of channel spacing.
> -
>  What:		/sys/bus/iio/devices/iio:deviceX/out_altvoltageY_refin_frequency
>  KernelVersion:	3.4.0
>  Contact:	linux-iio@vger.kernel.org
> 


^ permalink raw reply

* Re: [PATCH 0/3] mm/zswap: Implement per-cgroup proactive writeback
From: Hao Jia @ 2026-05-12 11:23 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm, tj, hannes, shakeel.butt, mhocko, yosry, nphamcs,
	chengming.zhou, muchun.song, roman.gushchin, cgroups, linux-mm,
	linux-kernel, linux-doc, Hao Jia
In-Reply-To: <agG-gNEclOVf-9WA@localhost.localdomain>

On 2026/5/11 19:39, Michal Koutný wrote:
> On Mon, May 11, 2026 at 06:51:46PM +0800, Hao Jia <jiahao.kernel@gmail.com> wrote:
>> From: Hao Jia <jiahao1@lixiang.com>
>>
>> Zswap currently writes back pages to backing swap devices reactively,
>> triggered either by memory pressure via the shrinker or by the pool
>> reaching its size limit. However, this reactive approach makes writeback
>> timing indeterminate and can disrupt latency-sensitive workloads when
>> eviction happens to coincide with a critical execution window.
>>
>> Furthermore, in certain scenarios, it is desirable to trigger writeback
>> in advance to free up memory. For example, users may want to prepare for
>> an upcoming memory-intensive workload by flushing cold memory to the
>> backing storage when the system is relatively idle.
> 
> I can imagine the zswap writeout can come at the least possible
> moment...
> 
>> To address these issues, this patch series introduces a per-cgroup
>> interface that allows users to proactively write back cold compressed
>> pages from zswap to the backing swap device.
> 
> ...but I see this series is not only per-cgroup proactive reclaim but
> it's also age-based reclaim.
> 
> The per-cg consumption and limits (and regular memory reclaim) are all
> measured in sizes. These age-based invocations don't seem commensurable
> (e.g. how would users in practice determine the desired input
> here).
> 

Thanks Michal — you are right. The series is both per-memcg *and*
age-based.

The interface carries a size budget, like memory.reclaim. The two
parameters play different roles:

   "write back up to <max> bytes, chosen from entries whose residency
    in zswap is at least <age>"

Size stays the unit of *amount*; age is just how we describe *which*
entries are eligible.

> Could you explain more reasoning behind this design?
> 

Context on the use case:

Our deployment runs a userspace proactive reclaimer driven by the
system's runtime state (memory/CPU/IO pressure, refault rate, ...)
and workload-specific policy. It uses memory.reclaim to drive
reclaim, which compresses cold anon pages into zswap as the first
stage. For entries that then remain in zswap past a policy-defined
age threshold, the reclaimer wants to write them back to the backing
swap device at a moment of its own choosing, to further reclaim the
DRAM still held by the compressed data.

Why age is a reasonable selector at this stage:

Pages in zswap have already passed a first-stage coldness judgement
(otherwise they would not have been compressed). For second-level
offloading, the question is which of them are cold *enough*.
Time-in-zswap is a natural proxy for that. A swap-in invalidates the
corresponding zswap entry and resets the clock, so by construction
an entry that has sat in zswap for N seconds has not been faulted in
for at least N seconds. Residency in zswap is therefore a strong
signal that the entry is not about to refault.

In our deployment the userspace reclaimer starts from a conservative 
threshold (the starting value depends on the workload) and adjusts it 
through closed-loop feedback:

   - on one side, the age distribution of zswap entries, to see
     whether there is a meaningful population past the threshold;
   - on the other side, the post-writeback refault rate and related
     signals, to confirm that entries written back were in fact cold
     enough.

Both <age> and max=<bytes> are tuned against these signals until the
realized writeback volume matches the target. This is the same
control-loop style already used to drive the first-stage
memory.reclaim budget.
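
As a rough illustration only, here is how the two parameters combine in a small userspace model. This is not the code in this series: struct entry and select_writeback() are hypothetical, and the model assumes entries are visited oldest first with stored_at never later than now.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a zswap entry; not a kernel structure. */
struct entry {
	uint64_t stored_at;	/* time the entry entered zswap, in seconds */
	size_t bytes;		/* compressed size */
};

/*
 * Return how many bytes would be written back. Age decides which entries
 * are eligible; max_bytes bounds the amount. Entries are assumed to be
 * ordered oldest first.
 */
static size_t select_writeback(const struct entry *e, size_t n, uint64_t now,
			       uint64_t min_age, size_t max_bytes)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (now - e[i].stored_at < min_age)
			break;		/* ordered by age: the rest are younger */
		if (e[i].bytes > max_bytes - total)
			break;		/* stop at the first entry that does not fit */
		total += e[i].bytes;
	}
	return total;
}
```

The age threshold only filters candidates; the return value, like the budget, is a size.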

Thanks,
Hao

^ permalink raw reply

* Re: [PATCH v3 17/20] drm/drv: Switch skeleton to drm_mode_config_create_initial_state()
From: Maxime Ripard @ 2026-05-12 11:20 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Jonathan Corbet, Shuah Khan, Dmitry Baryshkov, Jyri Sarha,
	Tomi Valkeinen, Andrzej Hajda, Neil Armstrong, Robert Foss,
	Jonas Karlman, Jernej Skrabec, Simon Ser, Harry Wentland,
	Melissa Wen, Sebastian Wick, Alex Hung, Jani Nikula, Rodrigo Vivi,
	Joonas Lahtinen, Tvrtko Ursulin, Chen-Yu Tsai, Samuel Holland,
	Dave Stevenson, Maíra Canal, Raspberry Pi Kernel Maintenance,
	dri-devel, linux-doc, linux-kernel, Daniel Stone, intel-gfx,
	intel-xe, linux-arm-kernel, linux-sunxi
In-Reply-To: <20260504180216.GU1344263@killaraus.ideasonboard.com>

[-- Attachment #1: Type: text/plain, Size: 1929 bytes --]

On Mon, May 04, 2026 at 09:02:16PM +0300, Laurent Pinchart wrote:
> On Fri, Apr 24, 2026 at 12:18:57PM +0200, Maxime Ripard wrote:
> > The driver skeleton currently recommends calling
> > drm_mode_config_reset() at probe time to create the initial state.
> > 
> > Now that drm_mode_config_create_initial_state() exists to handle
> > initial state allocation without hardware side effects, update the
> > skeleton to recommend it instead.
> > 
> > Signed-off-by: Maxime Ripard <mripard@kernel.org>
> > ---
> >  drivers/gpu/drm/drm_drv.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> > index 985c283cf59f..f537556b06a8 100644
> > --- a/drivers/gpu/drm/drm_drv.c
> > +++ b/drivers/gpu/drm/drm_drv.c
> > @@ -340,11 +340,13 @@ void drm_minor_release(struct drm_minor *minor)
> >   *
> >   *		// Further setup, display pipeline etc
> >   *
> >   *		platform_set_drvdata(pdev, drm);
> >   *
> > - *		drm_mode_config_reset(drm);
> > + *		ret = drm_mode_config_create_initial_state(drm);
> > + *		if (ret)
> > + *			return ret;
> 
> There's one point I'm still not sure I understand properly. The
> skeleton example (and the tidss driver, which you convert to the new API
> in this series) both call drm_mode_config_helper_resume(). This in turn
> calls drm_atomic_helper_resume(), and drm_mode_config_reset(). For
> drivers that implement .atomic_create_state() instead of .reset() (such
> as tidss, after its conversion in this series), drm_mode_config_reset()
> will call the drm_mode_config_*_create_state() helpers, which allocate
> and initialize a new state (through .atomic_create_state()), and store
> that new state in the object's ->state field. Won't this leak the state
> previously stored there ?

Thanks for spotting this, you're totally right!

I'll fix it in the next version.
Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]

^ permalink raw reply

* Re: [PATCH v3 18/20] drm/tidss: Switch to drm_mode_config_create_initial_state()
From: Maxime Ripard @ 2026-05-12 11:18 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Jonathan Corbet, Shuah Khan, Dmitry Baryshkov, Jyri Sarha,
	Tomi Valkeinen, Andrzej Hajda, Neil Armstrong, Robert Foss,
	Jonas Karlman, Jernej Skrabec, Simon Ser, Harry Wentland,
	Melissa Wen, Sebastian Wick, Alex Hung, Jani Nikula, Rodrigo Vivi,
	Joonas Lahtinen, Tvrtko Ursulin, Chen-Yu Tsai, Samuel Holland,
	Dave Stevenson, Maíra Canal, Raspberry Pi Kernel Maintenance,
	dri-devel, linux-doc, linux-kernel, Daniel Stone, intel-gfx,
	intel-xe, linux-arm-kernel, linux-sunxi
In-Reply-To: <20260504174907.GT1344263@killaraus.ideasonboard.com>

[-- Attachment #1: Type: text/plain, Size: 1326 bytes --]

On Mon, May 04, 2026 at 08:49:07PM +0300, Laurent Pinchart wrote:
> Hi Maxime,
> 
> Thank you for the patch.
> 
> On Fri, Apr 24, 2026 at 12:18:58PM +0200, Maxime Ripard wrote:
> > Now that drm_mode_config_create_initial_state() exists to create the
> > initial state, use it instead of drm_mode_config_reset() during
> > driver probe.
> > 
> > Signed-off-by: Maxime Ripard <mripard@kernel.org>
> > ---
> >  drivers/gpu/drm/tidss/tidss_drv.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/tidss/tidss_drv.c b/drivers/gpu/drm/tidss/tidss_drv.c
> > index 1c8cc18bc53c..f5099d5d6e32 100644
> > --- a/drivers/gpu/drm/tidss/tidss_drv.c
> > +++ b/drivers/gpu/drm/tidss/tidss_drv.c
> > @@ -169,11 +169,15 @@ static int tidss_probe(struct platform_device *pdev)
> >  		goto err_runtime_suspend;
> >  	}
> >  
> >  	drm_kms_helper_poll_init(ddev);
> >  
> > -	drm_mode_config_reset(ddev);
> > +	ret = drm_mode_config_create_initial_state(ddev);
> > +	if (ret) {
> > +		dev_err(dev, "failed to create initial state: %d\n", ret);
> > +		goto err_irq_uninstall;
> > +	}
> 
> There's also a call to drm_mode_config_reset() in tidss_modeset_init(),
> shouldn't it be dropped ?

This has been fixed by f468fef38716, which is in drm-misc-next.

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]

^ permalink raw reply

* Re: [PATCH v3 16/20] drm/mode-config: Create drm_mode_config_create_initial_state()
From: Maxime Ripard @ 2026-05-12 11:12 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Jonathan Corbet, Shuah Khan, Dmitry Baryshkov, Jyri Sarha,
	Tomi Valkeinen, Andrzej Hajda, Neil Armstrong, Robert Foss,
	Jonas Karlman, Jernej Skrabec, Simon Ser, Harry Wentland,
	Melissa Wen, Sebastian Wick, Alex Hung, Jani Nikula, Rodrigo Vivi,
	Joonas Lahtinen, Tvrtko Ursulin, Chen-Yu Tsai, Samuel Holland,
	Dave Stevenson, Maíra Canal, Raspberry Pi Kernel Maintenance,
	dri-devel, linux-doc, linux-kernel, Daniel Stone, intel-gfx,
	intel-xe, linux-arm-kernel, linux-sunxi
In-Reply-To: <20260504174148.GS1344263@killaraus.ideasonboard.com>

[-- Attachment #1: Type: text/plain, Size: 1029 bytes --]

On Mon, May 04, 2026 at 08:41:48PM +0300, Laurent Pinchart wrote:
> > Historically, this was one
> > + *   of drm_mode_config_reset() job, so one might still encounter it in
> > + *   a driver.
> > + *
> > + * - at reset time, for example during suspend/resume,
> > + *   drm_mode_config_reset() will reset the software and hardware state
> > + *   to a known default and will store it in the object's state pointer.
> > + *   Not all objects are affected by drm_mode_config_reset() though.
> 
> Does the reset implementation store a new state in the object's state
> pointer, or does it reset the contents of the already allocated state ?
> I read the documentation here as meaning the former, if it's actually
> the latter it should be reworded.

It's undefined. Both approaches work: most drivers will destroy the old
one and allocate a new one, but mediatek will just clear and
re-initialize the old one:

https://elixir.bootlin.com/linux/v7.1-rc3/source/drivers/gpu/drm/mediatek/mtk_plane.c#L28

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]

^ permalink raw reply

* Re: [PATCH 2/5] io_uring/zcrx: notify user on frag copy fallback
From: Pavel Begunkov @ 2026-05-12 11:02 UTC (permalink / raw)
  To: Clément Léger, io-uring, Jens Axboe
  Cc: linux-doc, linux-kernel, linux-kselftest, netdev, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Jonathan Corbet, Shuah Khan, Vishwanath Seshagiri
In-Reply-To: <20260422112522.3316660-3-cleger@meta.com>

On 4/22/26 12:25, Clément Léger wrote:
> Add a ZCRX_NOTIF_COPY notification type to signal userspace when a
> received fragment could not be delivered using zero-copy and was
> instead copied into a buffer.
> 
> Signed-off-by: Clément Léger <cleger@meta.com>
> ---
>   include/uapi/linux/io_uring/zcrx.h | 1 +
>   io_uring/zcrx.c                    | 7 ++++++-
>   io_uring/zcrx.h                    | 3 ++-
>   3 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/io_uring/zcrx.h b/include/uapi/linux/io_uring/zcrx.h
> index b8596d7d47b6..e0c0079626c8 100644
> --- a/include/uapi/linux/io_uring/zcrx.h
> +++ b/include/uapi/linux/io_uring/zcrx.h
> @@ -70,6 +70,7 @@ enum zcrx_features {
>   
>   enum zcrx_notification_type {
>   	ZCRX_NOTIF_NO_BUFFERS = 1 << 0,
> +	ZCRX_NOTIF_COPY = 1 << 1
>   };
>   
>   struct zcrx_notification_desc {
> diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
> index 35ca28cb6583..732e585aa13a 100644
> --- a/io_uring/zcrx.c
> +++ b/io_uring/zcrx.c
> @@ -1510,8 +1510,13 @@ static int io_zcrx_copy_frag(struct io_kiocb *req, struct io_zcrx_ifq *ifq,
>   			     const skb_frag_t *frag, int off, int len)
>   {
>   	struct page *page = skb_frag_page(frag);
> +	int ret;
> +
> +	ret = io_zcrx_copy_chunk(req, ifq, page, off + skb_frag_off(frag), len);
> +	if (ret > 0)
> +		zcrx_send_notif(ifq, ZCRX_NOTIF_COPY);

We also copy the linear part if present. Depending on the semantics,
it would make sense to add the notification there as well.

Pavel Begunkov


^ permalink raw reply

* Re: [PATCH 1/5] io_uring/zcrx: notify user when out of buffers
From: Pavel Begunkov @ 2026-05-12 10:59 UTC (permalink / raw)
  To: Clément Léger, io-uring, Jens Axboe
  Cc: linux-doc, linux-kernel, linux-kselftest, netdev, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Jonathan Corbet, Shuah Khan, Vishwanath Seshagiri,
	Vishwanath Seshagiri
In-Reply-To: <20260422112522.3316660-2-cleger@meta.com>

On 4/22/26 12:25, Clément Léger wrote:
> From: Pavel Begunkov <asml.silence@gmail.com>
...
> +static void zcrx_notif_tw(struct io_tw_req tw_req, io_tw_token_t tw)
> +{
> +	struct io_kiocb *req = tw_req.req;
> +	struct io_ring_ctx *ctx = req->ctx;
> +
> +	io_post_aux_cqe(ctx, req->cqe.user_data, req->cqe.res, 0);
> +	percpu_ref_put(&ctx->refs);
> +	kfree_rcu(req, rcu_head);
> +}

Note to myself:

io_poison_req(req);
kmem_cache_free(req_cachep, req);

-- 
Pavel Begunkov


^ permalink raw reply

* Re: [PATCH 1/2] Doc: deprecated.rst: add strlcat()
From: Manuel Ebner @ 2026-05-12 10:43 UTC (permalink / raw)
  To: Jani Nikula
  Cc: andy.shevchenko, apw, corbet, dwaipayanray1, joe, kees, linux-doc,
	linux-kernel, lukas.bulwahn, skhan, workflows
In-Reply-To: <748c2c3d549740918e14f29aa25dd475b99c1313@intel.com>

On Tue, 2026-05-12 at 11:52 +0300, Jani Nikula wrote:
> On Sun, 10 May 2026, Manuel Ebner <manuelebner@mailbox.org> wrote:
> > add strlcat and alternatives
> 
> You'd think it's the strlcat() definition that needs a comment above it
> saying it's deprecated. I don't think folks really look at
> deprecated.rst.

arch/s390/lib/string.c
lib/string.c
and
tools/include/nolibc/string.h

do not mention that it is deprecated.

include/linux/fortify-string.h has 

 /* Defined after fortified strlen() to reuse it. */
 extern size_t __real_strlcat(char *p, const char *q, size_t avail) __RENAME(strlcat);
 /**
  * strlcat - Append a string to an existing string
  * [...]
  * Do not use this function. While FORTIFY_SOURCE tries to avoid
  * read and write overflows, this is only possible when the sizes
  * of @p and @q are known to the compiler. Prefer building the
  * string with formatting, via scnprintf(), seq_buf, or similar.

Should I add this to the three files listed above?

Manuel

> 
> BR,
> Jani.
> 
> > 
> > Signed-off-by: Manuel Ebner <manuelebner@mailbox.org>
> > ---
> >  Documentation/process/deprecated.rst | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/Documentation/process/deprecated.rst b/Documentation/process/deprecated.rst
> > index fed56864d036..b8a65c19796c 100644
> > --- a/Documentation/process/deprecated.rst
> > +++ b/Documentation/process/deprecated.rst
> > @@ -162,6 +162,12 @@ if a source string is not NUL-terminated. The safe replacement is strscpy(),
> >  though care must be given to any cases where the return value of strlcpy()
> >  is used, since strscpy() will return negative errno values when it truncates.
> >  
> > +strlcat()
> > +---------
> > +strlcat() must re-scan the destination string from the beginning on each
> > +call (O(n^2) behavior). Alternatives are seq_buf_puts(), seq_buf_printf(),
> > +snprintf() and scnprintf()
> > +
> >  %p format specifier
> >  -------------------
> >  Traditionally, using "%p" in format strings would lead to regular address
> 


^ permalink raw reply

* Re: [PATCH v10 7/9] gpio: Remove unused `chip` and `srcu` in struct gpio_device
From: Bartosz Golaszewski @ 2026-05-12 10:41 UTC (permalink / raw)
  To: Tzung-Bi Shih
  Cc: Benson Leung, linux-kernel, chrome-platform, driver-core,
	linux-doc, linux-gpio, Rafael J. Wysocki, Danilo Krummrich,
	Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
	Jason Gunthorpe, Johan Hovold, Paul E . McKenney, Arnd Bergmann,
	Greg Kroah-Hartman, Linus Walleij
In-Reply-To: <agLhMUtdCPqLb736@google.com>

On Tue, May 12, 2026 at 10:13 AM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
>
> On Mon, May 11, 2026 at 06:18:21AM -0700, Bartosz Golaszewski wrote:
> > On Fri, 8 May 2026 12:54:46 +0200, Tzung-Bi Shih <tzungbi@kernel.org> said:
> > > `chip` and `srcu` in struct gpio_device are unused as their usages are
> > > replaced to use revocable.  Remove them.
> > >
> > > Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
> > > ---
> >
> > I'm thinking that all the GPIO patches could actually be squashed into one. Is
> > there any technical reason for the split or is it just for easier review?
>
> Correct, they are separated only for easier review.  Would you prefer I
> squash the 5 patches into a single patch?

Yes, I think it's less churn that way.

Bartosz

^ permalink raw reply

* [PATCH] docs: netlink: Correct buffer sizing info
From: Konstantin Shabanov @ 2026-05-12 10:30 UTC (permalink / raw)
  To: linux-doc
  Cc: Konstantin Shabanov, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Jonathan Corbet,
	Shuah Khan, netdev, linux-kernel

Update the docs to match the code (include/linux/netlink.h):

  /*
   *	skb should fit one page. This choice is good for headerless malloc.
   *	But we should limit to 8K so that userspace does not have to
   *	use enormous buffer sizes on recvmsg() calls just to avoid
   *	MSG_TRUNC when PAGE_SIZE is very large.
  */
  #if PAGE_SIZE < 8192UL
  #define NLMSG_GOODSIZE	SKB_WITH_OVERHEAD(PAGE_SIZE)
  #else
  #define NLMSG_GOODSIZE	SKB_WITH_OVERHEAD(8192UL)
  #endif
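
As an illustration only (userspace C with hypothetical helper names, not part of this patch), the upper bound on a single message implied by the macro above can be computed as follows. The skb overhead that SKB_WITH_OVERHEAD() subtracts is ignored here, so a buffer of this size is assumed to be sufficient rather than exact.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: min(page size, 8K), mirroring the #if above. */
static size_t nl_min_rcvbuf(long page_size)
{
	const long cap = 8192;	/* the 8K limit from the kernel comment */

	return (size_t)(page_size < cap ? page_size : cap);
}

/* Same bound for the running system; the 4096 fallback is an assumption. */
static size_t nl_min_rcvbuf_current(void)
{
	long ps = sysconf(_SC_PAGESIZE);

	return nl_min_rcvbuf(ps > 0 ? ps : 4096);
}
```

On a 4K-page system this yields 4096, and on a 64K-page system it yields 8192.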

Link: https://lore.kernel.org/all/20220819200221.422801-2-kuba@kernel.org/
Signed-off-by: Konstantin Shabanov <mail@etehtsea.me>
---
 Documentation/userspace-api/netlink/intro.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/userspace-api/netlink/intro.rst b/Documentation/userspace-api/netlink/intro.rst
index aacffade8f84..ca60abe94e3d 100644
--- a/Documentation/userspace-api/netlink/intro.rst
+++ b/Documentation/userspace-api/netlink/intro.rst
@@ -526,8 +526,8 @@ of the recvmsg() system call, *not* a Netlink header).
 
 Upon truncation the remaining part of the message is discarded.
 
-Netlink expects that the user buffer will be at least 8kB or a page
-size of the CPU architecture, whichever is bigger. Particular Netlink
+Netlink expects that the user buffer will be at most 8kB or a page
+size of the CPU architecture, whichever is smaller. Particular Netlink
 families may, however, require a larger buffer. 32kB buffer is recommended
 for most efficient handling of dumps (larger buffer fits more dumped
 objects and therefore fewer recvmsg() calls are needed).

base-commit: 917719c412c48687d4a176965d1fa35320ec457c
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH v3 12/20] drm/crtc: Add new atomic_create_state callback
From: Maxime Ripard @ 2026-05-12 10:16 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Jonathan Corbet, Shuah Khan, Dmitry Baryshkov, Jyri Sarha,
	Tomi Valkeinen, Andrzej Hajda, Neil Armstrong, Robert Foss,
	Jonas Karlman, Jernej Skrabec, Simon Ser, Harry Wentland,
	Melissa Wen, Sebastian Wick, Alex Hung, Jani Nikula, Rodrigo Vivi,
	Joonas Lahtinen, Tvrtko Ursulin, Chen-Yu Tsai, Samuel Holland,
	Dave Stevenson, Maíra Canal, Raspberry Pi Kernel Maintenance,
	dri-devel, linux-doc, linux-kernel, Daniel Stone, intel-gfx,
	intel-xe, linux-arm-kernel, linux-sunxi
In-Reply-To: <20260504172858.GO1344263@killaraus.ideasonboard.com>

[-- Attachment #1: Type: text/plain, Size: 4008 bytes --]

Hi,

On Mon, May 04, 2026 at 08:28:58PM +0300, Laurent Pinchart wrote:
> On Fri, Apr 24, 2026 at 12:18:52PM +0200, Maxime Ripard wrote:
> > Commit 47b5ac7daa46 ("drm/atomic: Add new atomic_create_state callback
> > to drm_private_obj") introduced a new pattern for allocating drm object
> > states.
> > 
> > Instead of relying on the reset() callback, it created a new
> > atomic_create_state hook. This is helpful because reset is a bit
> > overloaded: it's used to create the initial software state, reset it,
> > but also reset the hardware.
> > 
> > It can also be used either at probe time, to create the initial state
> > and possibly reset the hardware to an expected default, but also during
> > suspend/resume.
> > 
> > Both these cases come with different expectations too: during the
> > initialization, we want to initialize all states, but during
> > suspend/resume, drm_private_states for example are expected to be kept
> > around.
> > 
> > reset() also isn't fallible, which makes it harder to handle
> > initialization errors properly. This is only really relevant for some
> > drivers though, since all the helpers for reset only create a new
> > state, and don't touch the hardware at all.
> > 
> > It was thus decided to create a new hook that would allocate and
> > initialize a pristine state without any side effect:
> > atomic_create_state to untangle a bit some of it, and to separate the
> > initialization with the actual reset one might need during a
> > suspend/resume.
> > 
> > Continue the transition to the new pattern with CRTCs.
> > 
> > Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
> > Signed-off-by: Maxime Ripard <mripard@kernel.org>
> > ---
> >  drivers/gpu/drm/drm_atomic_state_helper.c | 47 +++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/drm_mode_config.c         | 21 +++++++++++++-
> >  include/drm/drm_atomic_state_helper.h     |  4 +++
> >  include/drm/drm_crtc.h                    | 16 +++++++++++
> >  4 files changed, 87 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c b/drivers/gpu/drm/drm_atomic_state_helper.c
> > index 9cd8550cabb7..b7da134c8c50 100644
> > --- a/drivers/gpu/drm/drm_atomic_state_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_state_helper.c
> > @@ -103,10 +103,32 @@ __drm_atomic_helper_crtc_reset(struct drm_crtc *crtc,
> >  
> >  	crtc->state = crtc_state;
> >  }
> >  EXPORT_SYMBOL(__drm_atomic_helper_crtc_reset);
> >  
> > +/**
> > + * __drm_atomic_helper_crtc_create_state - initializes crtc state
> 
> "Initialize a CRTC state"

Good catch, thanks.

> The name of the function is misleading ("*_create_*" while you state it
> performs initialization).
>
> > + * @crtc: crtc object
> > + * @state: new state to initialize
> > + *
> > + * Initializes the newly allocated @state, usually required when
> > + * initializing the drivers.
> > + *
> > + * @state is assumed to be zeroed.
> > + *
> > + * This is useful for drivers that subclass @drm_crtc_state.
> > + */
> > +void __drm_atomic_helper_crtc_create_state(struct drm_crtc *crtc,
> > +					   struct drm_crtc_state *state)
> > +{
> > +	__drm_atomic_helper_crtc_state_init(state, crtc);
> > +
> > +	if (drm_dev_has_vblank(crtc->dev))
> > +		drm_crtc_vblank_reset(crtc);
> 
> This is confusing to me (at least before reading the rest of the
> > series), and isn't mentioned in the function documentation or in the
> commit message.
>
> Furthermore, __drm_atomic_helper_crtc_create_state() is later used in
> tidss_crtc_create_state(), which is the
> drm_crtc_funcs.atomic_create_state() implementation of the tidss driver.
> The atomic_create_state documentation states that "This callback must
> have no side effect", and drm_crtc_vblank_reset() has side effects.

That's a good point. I've dropped that function entirely and moved the
drm_crtc_vblank_reset() call into drm_mode_config_crtc_create_state().

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]

^ permalink raw reply

* Re: [PATCH RFC 2/5] dma-heap: charge dma-buf memory via explicit memcg
From: Christian König @ 2026-05-12 10:14 UTC (permalink / raw)
  To: Albert Esteve, Tejun Heo, Johannes Weiner, Michal Koutný,
	Jonathan Corbet, Shuah Khan, Sumit Semwal, Michal Hocko,
	Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton,
	Benjamin Gaignard, Brian Starkey, John Stultz, T.J. Mercier,
	Christian Brauner, Paul Moore, James Morris, Serge E. Hallyn,
	Stephen Smalley, Ondrej Mosnacek, Shuah Khan
  Cc: cgroups, linux-doc, linux-kernel, linux-media, dri-devel,
	linaro-mm-sig, linux-mm, linux-security-module, selinux,
	linux-kselftest, mripard, echanude
In-Reply-To: <20260512-v2_20230123_tjmercier_google_com-v1-2-6326701c3691@redhat.com>

On 5/12/26 11:10, Albert Esteve wrote:
> On embedded platforms a central process often allocates dma-buf
> memory on behalf of client applications. Without a way to
> attribute the charge to the requesting client's cgroup, the
> cost lands on the allocator, making per-cgroup memory limits
> ineffective for the actual consumers.
> 
> Add charge_pid_fd to struct dma_heap_allocation_data. When set to
> a valid pidfd, DMA_HEAP_IOCTL_ALLOC resolves the target task's
> memcg and charges the buffer there via mem_cgroup_charge_dmabuf()
> inside dma_heap_buffer_alloc(). Without charge_pid_fd, and with
> the mem_accounting module parameter enabled, the buffer is charged
> to the allocator's own cgroup.
> 
> Additionally, commit 3c227be90659 ("dma-buf: system_heap: account for
> system heap allocation in memcg") adds __GFP_ACCOUNT to system-heap
> page allocations. Keeping __GFP_ACCOUNT would charge the same pages
> twice (once to kmem, once to MEMCG_DMABUF), thus remove it and route
> all accounting through a single MEMCG_DMABUF path.
> 
> Usage examples:
> 
>   1. Central allocator charging to a client at allocation time.
>      The allocator knows the client's PID (e.g., from binder's
>      sender_pid) and uses pidfd to attribute the charge:
> 
>        pid_t client_pid = txn->sender_pid;
>        int pidfd = pidfd_open(client_pid, 0);
> 
>        struct dma_heap_allocation_data alloc = {
>            .len             = buffer_size,
>            .fd_flags        = O_RDWR | O_CLOEXEC,
>            .charge_pid_fd   = pidfd,
>        };
>        ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &alloc);
>        close(pidfd);
>        /* alloc.fd is now charged to client's cgroup */
> 
>   2. Default allocation (no pidfd, mem_accounting=1).
>      When charge_pid_fd is not set and the mem_accounting module
>      parameter is enabled, the buffer is charged to the allocator's
>      own cgroup:
> 
>        struct dma_heap_allocation_data alloc = {
>            .len      = buffer_size,
>            .fd_flags = O_RDWR | O_CLOEXEC,
>        };
>        ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &alloc);
>        /* charged to current process's cgroup */
> 
> Current limitations:
> 
>  - Single-owner model: a dma-buf carries one memcg charge regardless of
>    how many processes share it. Means only the first owner (and exporter)
>    of the shared buffer bears the charge.
>  - Only memcg accounting supported. While this makes sense for system
>    heap buffers, other heaps (e.g., CMA heaps) will require selectively
>    charging also for the dmem controller.

Well that doesn't looks soo bad, it at least seems to tackle the problem at hand for Android and some of other embedded use cases.

I'm just not sure if this is future prove and will work for all use cases, e.g. cloud gaming, native context for automotive etc...

Essentially the problem boils down to two limitations:
1) a piece of memory can only be charged to one cgroup, the framework doesn't has a concept of charging shared memory to multiple groups
2) when memory references in the form of file descriptors are passed between applications we have no way of changing the accounting to a different cgroup

The passing of the memory reference already has a well-defined uAPI, and if we could solve those two limitations we would not only solve the problem without introducing new uAPI (with potential new security risks) but also solve it for all other use cases that use file descriptors as well, e.g. memfd, accel and GPU drivers.

On the other hand, it is really nice to finally see this tackled for at least DMA-buf heaps. On the GPU side, just a few weeks ago I saw yet another attempt by a driver to do some kind of driver-specific accounting to solve this. And to be honest, such single-driver island approaches tend to break more often than they work correctly.

Regards,
Christian.

> 
> Signed-off-by: Albert Esteve <aesteve@redhat.com>
> ---
>  Documentation/admin-guide/cgroup-v2.rst |  5 ++--
>  drivers/dma-buf/dma-buf.c               | 16 ++++---------
>  drivers/dma-buf/dma-heap.c              | 42 ++++++++++++++++++++++++++++++---
>  drivers/dma-buf/heaps/system_heap.c     |  2 --
>  include/uapi/linux/dma-heap.h           |  6 +++++
>  5 files changed, 53 insertions(+), 18 deletions(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 8bdbc2e866430..824d269531eb1 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1636,8 +1636,9 @@ The following nested keys are defined.
>  		structures.
>  
>  	  dmabuf (npn)
> -		Amount of memory used for exported DMA buffers allocated by the cgroup.
> -		Stays with the allocating cgroup regardless of how the buffer is shared.
> +		Amount of memory used for exported DMA buffers allocated by or on
> +		behalf of the cgroup. Stays with the allocating cgroup regardless
> +		of how the buffer is shared.
>  
>  	  workingset_refault_anon
>  		Number of refaults of previously evicted anonymous pages.
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index ce02377f48908..23fb758b78297 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -181,8 +181,11 @@ static void dma_buf_release(struct dentry *dentry)
>  	 */
>  	BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
>  
> -	mem_cgroup_uncharge_dmabuf(dmabuf->memcg, PAGE_ALIGN(dmabuf->size) / PAGE_SIZE);
> -	mem_cgroup_put(dmabuf->memcg);
> +	if (dmabuf->memcg) {
> +		mem_cgroup_uncharge_dmabuf(dmabuf->memcg,
> +					  PAGE_ALIGN(dmabuf->size) / PAGE_SIZE);
> +		mem_cgroup_put(dmabuf->memcg);
> +	}
>  
>  	dmabuf->ops->release(dmabuf);
>  
> @@ -764,13 +767,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
>  		dmabuf->resv = resv;
>  	}
>  
> -	dmabuf->memcg = get_mem_cgroup_from_mm(current->mm);
> -	if (!mem_cgroup_charge_dmabuf(dmabuf->memcg, PAGE_ALIGN(dmabuf->size) / PAGE_SIZE,
> -				      GFP_KERNEL)) {
> -		ret = -ENOMEM;
> -		goto err_memcg;
> -	}
> -
>  	file->private_data = dmabuf;
>  	file->f_path.dentry->d_fsdata = dmabuf;
>  	dmabuf->file = file;
> @@ -781,8 +777,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
>  
>  	return dmabuf;
>  
> -err_memcg:
> -	mem_cgroup_put(dmabuf->memcg);
>  err_file:
>  	fput(file);
>  err_module:
> diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
> index ac5f8685a6494..ff6e259afcdc0 100644
> --- a/drivers/dma-buf/dma-heap.c
> +++ b/drivers/dma-buf/dma-heap.c
> @@ -7,13 +7,17 @@
>   */
>  
>  #include <linux/cdev.h>
> +#include <linux/cgroup.h>
>  #include <linux/device.h>
>  #include <linux/dma-buf.h>
>  #include <linux/dma-heap.h>
> +#include <linux/memcontrol.h>
> +#include <linux/sched/mm.h>
>  #include <linux/err.h>
>  #include <linux/export.h>
>  #include <linux/list.h>
>  #include <linux/nospec.h>
> +#include <linux/pidfd.h>
>  #include <linux/syscalls.h>
>  #include <linux/uaccess.h>
>  #include <linux/xarray.h>
> @@ -55,10 +59,12 @@ MODULE_PARM_DESC(mem_accounting,
>  		 "Enable cgroup-based memory accounting for dma-buf heap allocations (default=false).");
>  
>  static int dma_heap_buffer_alloc(struct dma_heap *heap, size_t len,
> -				 u32 fd_flags,
> -				 u64 heap_flags)
> +				 u32 fd_flags, u64 heap_flags,
> +				 struct mem_cgroup *charge_to)
>  {
>  	struct dma_buf *dmabuf;
> +	unsigned int nr_pages;
> +	struct mem_cgroup *memcg = charge_to;
>  	int fd;
>  
>  	/*
> @@ -73,6 +79,22 @@ static int dma_heap_buffer_alloc(struct dma_heap *heap, size_t len,
>  	if (IS_ERR(dmabuf))
>  		return PTR_ERR(dmabuf);
>  
> +	nr_pages = len / PAGE_SIZE;
> +
> +	if (memcg)
> +		css_get(&memcg->css);
> +	else if (mem_accounting)
> +		memcg = get_mem_cgroup_from_mm(current->mm);
> +
> +	if (memcg) {
> +		if (!mem_cgroup_charge_dmabuf(memcg, nr_pages, GFP_KERNEL)) {
> +			mem_cgroup_put(memcg);
> +			dma_buf_put(dmabuf);
> +			return -ENOMEM;
> +		}
> +		dmabuf->memcg = memcg;
> +	}
> +
>  	fd = dma_buf_fd(dmabuf, fd_flags);
>  	if (fd < 0) {
>  		dma_buf_put(dmabuf);
> @@ -102,6 +124,9 @@ static long dma_heap_ioctl_allocate(struct file *file, void *data)
>  {
>  	struct dma_heap_allocation_data *heap_allocation = data;
>  	struct dma_heap *heap = file->private_data;
> +	struct mem_cgroup *memcg = NULL;
> +	struct task_struct *task;
> +	unsigned int pidfd_flags;
>  	int fd;
>  
>  	if (heap_allocation->fd)
> @@ -113,9 +138,20 @@ static long dma_heap_ioctl_allocate(struct file *file, void *data)
>  	if (heap_allocation->heap_flags & ~DMA_HEAP_VALID_HEAP_FLAGS)
>  		return -EINVAL;
>  
> +	if (heap_allocation->charge_pid_fd) {
> +		task = pidfd_get_task(heap_allocation->charge_pid_fd, &pidfd_flags);
> +		if (IS_ERR(task))
> +			return PTR_ERR(task);
> +
> +		memcg = get_mem_cgroup_from_mm(task->mm);
> +		put_task_struct(task);
> +	}
> +
>  	fd = dma_heap_buffer_alloc(heap, heap_allocation->len,
>  				   heap_allocation->fd_flags,
> -				   heap_allocation->heap_flags);
> +				   heap_allocation->heap_flags,
> +				   memcg);
> +	mem_cgroup_put(memcg);
>  	if (fd < 0)
>  		return fd;
>  
> diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
> index 03c2b87cb1112..95d7688167b93 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -385,8 +385,6 @@ static struct page *alloc_largest_available(unsigned long size,
>  		if (max_order < orders[i])
>  			continue;
>  		flags = order_flags[i];
> -		if (mem_accounting)
> -			flags |= __GFP_ACCOUNT;
>  		page = alloc_pages(flags, orders[i]);
>  		if (!page)
>  			continue;
> diff --git a/include/uapi/linux/dma-heap.h b/include/uapi/linux/dma-heap.h
> index a4cf716a49fa6..e02b0f8cbc6a1 100644
> --- a/include/uapi/linux/dma-heap.h
> +++ b/include/uapi/linux/dma-heap.h
> @@ -29,6 +29,10 @@
>   *			handle to the allocated dma-buf
>   * @fd_flags:		file descriptor flags used when allocating
>   * @heap_flags:		flags passed to heap
> + * @charge_pid_fd:	optional pidfd of the process whose cgroup should be
> + *			charged for this allocation; 0 means charge the calling
> + *			process's cgroup
> + * @__padding:		reserved, must be zero
>   *
>   * Provided by userspace as an argument to the ioctl
>   */
> @@ -37,6 +41,8 @@ struct dma_heap_allocation_data {
>  	__u32 fd;
>  	__u32 fd_flags;
>  	__u64 heap_flags;
> +	__u32 charge_pid_fd;
> +	__u32 __padding;
>  };
>  
>  #define DMA_HEAP_IOC_MAGIC		'H'
> 


^ permalink raw reply

* Re: [PATCH 00/17] dynamic-debug cleanups refactors maintenance
From: jim.cromie @ 2026-05-12 10:12 UTC (permalink / raw)
  To: Andrew Morton, Linux Documentation List
  Cc: Jason Baron, Luis Chamberlain, Petr Pavlu, Daniel Gomez,
	Sami Tolvanen, Aaron Tomlin, Shuah Khan, Louis Chauvet,
	linux-kernel, linux-modules, linux-kselftest,
	Łukasz Bartosik
In-Reply-To: <20260508190121.3461706b01f6079bbacdd167@linux-foundation.org>

On Fri, May 8, 2026 at 8:01 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 04 May 2026 14:45:06 -0600 Jim Cromie <jim.cromie@gmail.com> wrote:
>
> > This series is nearly all maintenance: it refactors/splits functions,
> > The user visible change to /proc/dynamic_debug/control is s/class
> > unknown/class:_UNKNOWN_/, which is a more visible/greppable indication
> > of incomplete class definitions.
>
> Wait.  We can't make userspace-visible changes?
>

- The code has been marked BROKEN for its first intended user, DRM,
  so no users are affected by this change.
- UNKNOWN is an error condition (an incomplete or incorrect classmap
  definition) and is expected to be caught in implementation or review.
- Phase 2 of the patch set has improved compile-time and modprobe-time
  validation, which would catch this coding error.
- I will drop this patch if these reasons are insufficient.


> > Coder visible change is to drop the enum ddebug_class_map_type's
> > unused vals - namely: DD_CLASS_TYPE_DISJOINT_NAMES
> > & DD_CLASS_TYPE_LEVEL_NAMES
> >
> > These allowed more symbolic named inputs:
> >   echo +DRM_UT_CORE > /sys/module/drm/parameters/debug
> >
> > But theyre unused 3 years later, and probably not worth keeping.
> > With a removal commit in the log, its easy enough to restore them later.
> >
> > ...
> >
> >  MAINTAINERS                                        |   1 +
> >  include/linux/dynamic_debug.h                      | 106 ++---
> >  kernel/module/main.c                               |  12 +-
> >  lib/dynamic_debug.c                                | 504 ++++++++++-----------
> >  lib/test_dynamic_debug.c                           |  28 +-
> >  tools/testing/selftests/Makefile                   |   1 +
> >  tools/testing/selftests/dynamic_debug/Makefile     |   9 +
> >  tools/testing/selftests/dynamic_debug/config       |   7 +
> >  .../selftests/dynamic_debug/dyndbg_selftest.sh     | 257 +++++++++++
> >  9 files changed, 582 insertions(+), 343 deletions(-)
>
> No Documentation/ updates?

I peeled off 2 doc-only updates and sent them to the Linux Documentation List.
Otherwise, there are no behavioral changes here to write about.
Phase 2 has API changes needed to actually fix classmaps for DRM, and
docs to go with it.

I split out phase-1 to lower the barrier to review and apply.
By your response, it seems to have helped.

Regarding the sashiko review: I've made several adjustments, and I'm
still reviewing and working through the others.

Thanks
Jim

^ permalink raw reply

* Re: [PATCH net-next 1/2] net: ti: icssg: Derive stats array lengths from ARRAY_SIZE
From: David CARLIER @ 2026-05-12 10:03 UTC (permalink / raw)
  To: MD Danish Anwar
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Jonathan Corbet, Shuah Khan, Roger Quadros,
	Andrew Lunn, Jacob Keller, Meghana Malladi, Kevin Hao,
	Vadim Fedorenko, netdev, linux-doc, linux-kernel,
	linux-arm-kernel, Vignesh Raghavendra
In-Reply-To: <6a1f411c-d7ed-463b-abf1-277d8cc0c184@ti.com>

Hi Danish,


On Tue, 12 May 2026 at 10:40, MD Danish Anwar <danishanwar@ti.com> wrote:
>
> Hi David,
>
> On 12/05/26 1:28 pm, David CARLIER wrote:
> > Hi MD,
> >
> > On Tue, 12 May 2026 at 07:06, MD Danish Anwar <danishanwar@ti.com> wrote:
> >>
> >> Replace the manually maintained ICSSG_NUM_MIIG_STATS and
> >> ICSSG_NUM_PA_STATS constants with ARRAY_SIZE() expressions derived
> >> directly from the corresponding stat descriptor arrays, so that adding
> >> new entries to icssg_all_miig_stats[] or icssg_all_pa_stats[] no longer
> >> requires a separate update to a numeric constant.
> >>
> >> To make this self-contained, break the circular include dependency
> >> between icssg_stats.h and icssg_prueth.h:
> >>
> >>   - icssg_stats.h previously included icssg_prueth.h (transitively
> >>     pulling in icssg_switch_map.h and ETH_GSTRING_LEN).  Replace that
> >>     with direct includes of <linux/ethtool.h>, <linux/kernel.h> and
> >>     "icssg_switch_map.h".
> >>
> >>   - icssg_prueth.h now includes icssg_stats.h, giving it access to
> >>     the ARRAY_SIZE-based ICSSG_NUM_MIIG_STATS and ICSSG_NUM_PA_STATS
> >>     before they are used in the prueth_emac struct and ICSSG_NUM_STATS.
> >>
> >> Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
> >> ---
> >>  drivers/net/ethernet/ti/icssg/icssg_prueth.h | 3 +--
> >>  drivers/net/ethernet/ti/icssg/icssg_stats.h  | 7 ++++++-
> >>  2 files changed, 7 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.h b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
> >> index df93d15c5b78..e2ccecb0a0dd 100644
> >> --- a/drivers/net/ethernet/ti/icssg/icssg_prueth.h
> >> +++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
> >> @@ -43,6 +43,7 @@
> >>
> >>  #include "icssg_config.h"
> >>  #include "icss_iep.h"
> >> +#include "icssg_stats.h"
> >>  #include "icssg_switch_map.h"
> >>
> >>  #define PRUETH_MAX_MTU          (2000 - ETH_HLEN - ETH_FCS_LEN)
> >> @@ -57,8 +58,6 @@
> >>
> >>  #define ICSSG_MAX_RFLOWS       8       /* per slice */
> >>
> >> -#define ICSSG_NUM_PA_STATS     32
> >> -#define ICSSG_NUM_MIIG_STATS   60
> >>  /* Number of ICSSG related stats */
> >>  #define ICSSG_NUM_STATS (ICSSG_NUM_MIIG_STATS + ICSSG_NUM_PA_STATS)
> >>  #define ICSSG_NUM_STANDARD_STATS 31
> >> diff --git a/drivers/net/ethernet/ti/icssg/icssg_stats.h b/drivers/net/ethernet/ti/icssg/icssg_stats.h
> >> index 5ec0b38e0c67..b854eb587c1e 100644
> >> --- a/drivers/net/ethernet/ti/icssg/icssg_stats.h
> >> +++ b/drivers/net/ethernet/ti/icssg/icssg_stats.h
> >> @@ -8,10 +8,15 @@
> >>  #ifndef __NET_TI_ICSSG_STATS_H
> >>  #define __NET_TI_ICSSG_STATS_H
> >>
> >> -#include "icssg_prueth.h"
> >> +#include <linux/ethtool.h>
> >> +#include <linux/kernel.h>
> >> +#include "icssg_switch_map.h"
> >>
> >>  #define STATS_TIME_LIMIT_1G_MS    25000    /* 25 seconds @ 1G */
> >>
> >> +#define ICSSG_NUM_MIIG_STATS   ARRAY_SIZE(icssg_all_miig_stats)
> >> +#define ICSSG_NUM_PA_STATS     ARRAY_SIZE(icssg_all_pa_stats)
> >> +
> >>  struct miig_stats_regs {
> >>         /* Rx */
> >>         u32 rx_packets;
> >> --
> >> 2.34.1
> >>
> >
> > One thing that caught my eye: icssg_all_miig_stats[] and
> >   icssg_all_pa_stats[] are 'static const' arrays in icssg_stats.h with
> >   ETH_GSTRING_LEN name buffers per entry. Right now only icssg_stats.c
> >   and icssg_ethtool.c pull them in. After this patch icssg_prueth.h
> >   includes icssg_stats.h, so every .c in the driver (classifier,
> >   common, config, mii_cfg, queues, switchdev, ...) ends up with its own
> >   static-const copy of both tables.
> >
> >   Would a static_assert() work for what you're after? Something like:
> >
>
> While adding more stats manually, The ARRAY_SIZE() approach was
> explicitly requested by maintainer [1]:
>
> This patch is a direct response to that feedback. static_assert() would
> still require updating the numeric constant on every array change. The
> goal here is to eliminate the need of manually incrementing stats count
> whenever new stats are added
>
> Your concern about multiple copies of table is noted and valid. Could
> you advise on the preferred way to reconcile these two requirements? I
> am happy to restructure if there is an approach that satisfies both.
>
> [1]
> https://lore.kernel.org/all/20260112181436.4s5ceywwembn674r@skbuf/#:~:text=Can%27t%20this%20be%20expressed%20as%20ARRAY_SIZE(icssg_all_pa_stats)%3F%20It%20is%20very%0Afragile%20to%20have%20to%20count%20and%20update%20this%20manually.
>
>
> >     static const struct icssg_miig_stats icssg_all_miig_stats[] = {
> >         ...
> >     };
> >     static_assert(ARRAY_SIZE(icssg_all_miig_stats) == ICSSG_NUM_MIIG_STATS);
> >
> >   next to each array, keeping the numeric #defines as-is. Then 2/2 fails
> >   to build the moment a new entry is added without bumping the count,
> >   which is the case you're guarding against — without touching the
> >   include graph.
> >
> > What do you think ?
> >
> > Cheers.
>
> --
> Thanks and Regards,
> Danish
>


  Thanks for digging up the context — fair point, I'd missed Vladimir's
  earlier ask. Reading it again though, what he calls fragile is the
  silent miscount, not the keystroke of typing a number. A static_assert
  turns "forgot to bump" into a build error, which I think gets you
  there.

  What about moving the two arrays into icssg_stats.c, declaring them
  extern in the header, and dropping a static_assert next to each
  definition? Numeric #defines stay where they are, icssg_prueth.h
  doesn't need to know about icssg_stats.h, and the tables live in one
  TU instead of every .o in the driver. If the count and the array
  disagree, you get a compile error on the spot.
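
  Roughly, as a standalone userspace sketch of that layout (the demo_*
  names, the three table entries and the local DEMO_ARRAY_SIZE macro
  are placeholders, not the driver's real definitions; in the driver
  the two halves would sit in icssg_stats.h and icssg_stats.c):

```c
#include <assert.h>	/* static_assert (C11) and assert */
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* ---- header part (hypothetical demo_stats.h) ---- */

/* Hand-maintained count, kept as a plain number. */
#define DEMO_NUM_MIIG_STATS 3

struct demo_stat_desc {
	char name[32];		/* stand-in for an ETH_GSTRING_LEN buffer */
	unsigned int offset;
};

/* Declaration only: including the header no longer copies the table. */
extern const struct demo_stat_desc demo_all_miig_stats[];

/* ---- single .c file part (hypothetical demo_stats.c) ---- */

const struct demo_stat_desc demo_all_miig_stats[] = {
	{ "rx_packets",          0x00 },
	{ "rx_broadcast_frames", 0x04 },
	{ "rx_multicast_frames", 0x08 },
};

/* Build breaks here if an entry is added without bumping the count. */
static_assert(DEMO_ARRAY_SIZE(demo_all_miig_stats) == DEMO_NUM_MIIG_STATS,
	      "DEMO_NUM_MIIG_STATS is out of sync with demo_all_miig_stats[]");

/* Other files only need the extern declaration and the numeric count. */
const char *demo_stat_name(size_t idx)
{
	assert(DEMO_NUM_MIIG_STATS > 0);
	return idx < DEMO_NUM_MIIG_STATS ? demo_all_miig_stats[idx].name : NULL;
}
```

  Adding a fourth entry to the table while leaving DEMO_NUM_MIIG_STATS
  at 3 makes the static_assert fire at compile time, which is the
  "forgot to bump" case turned into a build error.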

  Probably worth keeping Vladimir on Cc for v2 in case he had something
  else in mind.

  Cheers,

^ permalink raw reply

* Re: [PATCH v3 10/20] drm/plane: Add new atomic_create_state callback
From: Maxime Ripard @ 2026-05-12  9:55 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Jonathan Corbet, Shuah Khan, Dmitry Baryshkov, Jyri Sarha,
	Tomi Valkeinen, Andrzej Hajda, Neil Armstrong, Robert Foss,
	Jonas Karlman, Jernej Skrabec, Simon Ser, Harry Wentland,
	Melissa Wen, Sebastian Wick, Alex Hung, Jani Nikula, Rodrigo Vivi,
	Joonas Lahtinen, Tvrtko Ursulin, Chen-Yu Tsai, Samuel Holland,
	Dave Stevenson, Maíra Canal, Raspberry Pi Kernel Maintenance,
	dri-devel, linux-doc, linux-kernel, Daniel Stone, intel-gfx,
	intel-xe, linux-arm-kernel, linux-sunxi
In-Reply-To: <20260504165229.GM1344263@killaraus.ideasonboard.com>

Hi,

On Mon, May 04, 2026 at 07:52:29PM +0300, Laurent Pinchart wrote:
> On Fri, Apr 24, 2026 at 12:18:50PM +0200, Maxime Ripard wrote:
> > Commit 47b5ac7daa46 ("drm/atomic: Add new atomic_create_state callback
> > to drm_private_obj") introduced a new pattern for allocating drm object
> > states.
> > 
> > Instead of relying on the reset() callback, it created a new
> > atomic_create_state hook. This is helpful because reset is a bit
> > overloaded: it's used to create the initial software state, reset it,
> > but also reset the hardware.
> > 
> > It can also be used either at probe time, to create the initial state
> > and possibly reset the hardware to an expected default, but also during
> > suspend/resume.
> > 
> > Both these cases come with different expectations too: during the
> > initialization, we want to initialize all states, but during
> > suspend/resume, drm_private_states for example are expected to be kept
> > around.
> > 
> > reset() also isn't fallible, which makes it harder to handle
> > initialization errors properly. This is only really relevant for some
> > drivers though, since all the helpers for reset only create a new
> > state, and don't touch the hardware at all.
> > 
> > It was thus decided to create a new hook that would allocate and
> > initialize a pristine state without any side effect:
> > atomic_create_state to untangle a bit some of it, and to separate the
> > initialization with the actual reset one might need during a
> > suspend/resume.
> > 
> > Continue the transition to the new pattern with planes.
> > 
> > Signed-off-by: Maxime Ripard <mripard@kernel.org>
> > ---
> >  drivers/gpu/drm/drm_atomic_state_helper.c | 25 +++++++++++++++++++++++++
> >  drivers/gpu/drm/drm_mode_config.c         | 21 ++++++++++++++++++++-
> >  include/drm/drm_atomic_state_helper.h     |  2 ++
> >  include/drm/drm_plane.h                   | 16 ++++++++++++++++
> >  4 files changed, 63 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c b/drivers/gpu/drm/drm_atomic_state_helper.c
> > index 285efbf29520..50fe4eec41a8 100644
> > --- a/drivers/gpu/drm/drm_atomic_state_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_state_helper.c
> > @@ -338,10 +338,35 @@ void drm_atomic_helper_plane_reset(struct drm_plane *plane)
> >  	if (plane->state)
> >  		__drm_atomic_helper_plane_reset(plane, plane->state);
> >  }
> >  EXPORT_SYMBOL(drm_atomic_helper_plane_reset);
> >  
> > +/**
> > + * drm_atomic_helper_plane_create_state - default &drm_plane_funcs.atomic_create_state hook for planes
> 
> drm_atomic_helper_colorop_create_state() states "Allocates and
> initializes colorop atomic state", while here you document it as
> "default hook for planes". Consistency would be good.

I don't think it's inconsistent?

colorops don't have a create_state callback, so the only function to
create it is defined as "Allocates and initializes colorop atomic
state". For planes, the hook is documented as "Allocates a pristine,
initialized, state for the plane object and returns it.", and here we
have the default implementation for that hook, which is documented as
such.

It all seems consistent to me?

Maxime


^ permalink raw reply

* Re: [PATCH net-next 1/2] net: ti: icssg: Derive stats array lengths from ARRAY_SIZE
From: MD Danish Anwar @ 2026-05-12  9:40 UTC (permalink / raw)
  To: David CARLIER
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Jonathan Corbet, Shuah Khan, Roger Quadros,
	Andrew Lunn, Jacob Keller, Meghana Malladi, Kevin Hao,
	Vadim Fedorenko, netdev, linux-doc, linux-kernel,
	linux-arm-kernel, Vignesh Raghavendra
In-Reply-To: <CA+XhMqykBWcMdk+iNnOtUxM4MX6jpDyUwfuAVZFbjAShO9_v7Q@mail.gmail.com>

Hi David,

On 12/05/26 1:28 pm, David CARLIER wrote:
> Hi MD,
> 
> On Tue, 12 May 2026 at 07:06, MD Danish Anwar <danishanwar@ti.com> wrote:
>>
>> Replace the manually maintained ICSSG_NUM_MIIG_STATS and
>> ICSSG_NUM_PA_STATS constants with ARRAY_SIZE() expressions derived
>> directly from the corresponding stat descriptor arrays, so that adding
>> new entries to icssg_all_miig_stats[] or icssg_all_pa_stats[] no longer
>> requires a separate update to a numeric constant.
>>
>> To make this self-contained, break the circular include dependency
>> between icssg_stats.h and icssg_prueth.h:
>>
>>   - icssg_stats.h previously included icssg_prueth.h (transitively
>>     pulling in icssg_switch_map.h and ETH_GSTRING_LEN).  Replace that
>>     with direct includes of <linux/ethtool.h>, <linux/kernel.h> and
>>     "icssg_switch_map.h".
>>
>>   - icssg_prueth.h now includes icssg_stats.h, giving it access to
>>     the ARRAY_SIZE-based ICSSG_NUM_MIIG_STATS and ICSSG_NUM_PA_STATS
>>     before they are used in the prueth_emac struct and ICSSG_NUM_STATS.
>>
>> Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
>> ---
>>  drivers/net/ethernet/ti/icssg/icssg_prueth.h | 3 +--
>>  drivers/net/ethernet/ti/icssg/icssg_stats.h  | 7 ++++++-
>>  2 files changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.h b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
>> index df93d15c5b78..e2ccecb0a0dd 100644
>> --- a/drivers/net/ethernet/ti/icssg/icssg_prueth.h
>> +++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
>> @@ -43,6 +43,7 @@
>>
>>  #include "icssg_config.h"
>>  #include "icss_iep.h"
>> +#include "icssg_stats.h"
>>  #include "icssg_switch_map.h"
>>
>>  #define PRUETH_MAX_MTU          (2000 - ETH_HLEN - ETH_FCS_LEN)
>> @@ -57,8 +58,6 @@
>>
>>  #define ICSSG_MAX_RFLOWS       8       /* per slice */
>>
>> -#define ICSSG_NUM_PA_STATS     32
>> -#define ICSSG_NUM_MIIG_STATS   60
>>  /* Number of ICSSG related stats */
>>  #define ICSSG_NUM_STATS (ICSSG_NUM_MIIG_STATS + ICSSG_NUM_PA_STATS)
>>  #define ICSSG_NUM_STANDARD_STATS 31
>> diff --git a/drivers/net/ethernet/ti/icssg/icssg_stats.h b/drivers/net/ethernet/ti/icssg/icssg_stats.h
>> index 5ec0b38e0c67..b854eb587c1e 100644
>> --- a/drivers/net/ethernet/ti/icssg/icssg_stats.h
>> +++ b/drivers/net/ethernet/ti/icssg/icssg_stats.h
>> @@ -8,10 +8,15 @@
>>  #ifndef __NET_TI_ICSSG_STATS_H
>>  #define __NET_TI_ICSSG_STATS_H
>>
>> -#include "icssg_prueth.h"
>> +#include <linux/ethtool.h>
>> +#include <linux/kernel.h>
>> +#include "icssg_switch_map.h"
>>
>>  #define STATS_TIME_LIMIT_1G_MS    25000    /* 25 seconds @ 1G */
>>
>> +#define ICSSG_NUM_MIIG_STATS   ARRAY_SIZE(icssg_all_miig_stats)
>> +#define ICSSG_NUM_PA_STATS     ARRAY_SIZE(icssg_all_pa_stats)
>> +
>>  struct miig_stats_regs {
>>         /* Rx */
>>         u32 rx_packets;
>> --
>> 2.34.1
>>
> 
> One thing that caught my eye: icssg_all_miig_stats[] and
>   icssg_all_pa_stats[] are 'static const' arrays in icssg_stats.h with
>   ETH_GSTRING_LEN name buffers per entry. Right now only icssg_stats.c
>   and icssg_ethtool.c pull them in. After this patch icssg_prueth.h
>   includes icssg_stats.h, so every .c in the driver (classifier,
>   common, config, mii_cfg, queues, switchdev, ...) ends up with its own
>   static-const copy of both tables.
> 
>   Would a static_assert() work for what you're after? Something like:
> 

The ARRAY_SIZE() approach was explicitly requested by the maintainer [1]
when more stats were being added manually.

This patch is a direct response to that feedback. static_assert() would
still require updating the numeric constant on every array change. The
goal here is to eliminate the need to manually increment the stats count
whenever new stats are added.

Your concern about multiple copies of the tables is noted and valid.
Could you advise on the preferred way to reconcile these two
requirements? I am happy to restructure if there is an approach that
satisfies both.

[1]
https://lore.kernel.org/all/20260112181436.4s5ceywwembn674r@skbuf/#:~:text=Can%27t%20this%20be%20expressed%20as%20ARRAY_SIZE(icssg_all_pa_stats)%3F%20It%20is%20very%0Afragile%20to%20have%20to%20count%20and%20update%20this%20manually.


>     static const struct icssg_miig_stats icssg_all_miig_stats[] = {
>         ...
>     };
>     static_assert(ARRAY_SIZE(icssg_all_miig_stats) == ICSSG_NUM_MIIG_STATS);
> 
>   next to each array, keeping the numeric #defines as-is. Then 2/2 fails
>   to build the moment a new entry is added without bumping the count,
>   which is the case you're guarding against — without touching the
>   include graph.
> 
> What do you think ?
> 
> Cheers.

-- 
Thanks and Regards,
Danish


^ permalink raw reply

* Re: [PATCH 2/3] mm/zswap: Implement proactive writeback
From: Hao Jia @ 2026-05-12  9:37 UTC (permalink / raw)
  To: Nhat Pham
  Cc: akpm, tj, hannes, shakeel.butt, mhocko, yosry, mkoutny,
	chengming.zhou, muchun.song, roman.gushchin, cgroups, linux-mm,
	linux-kernel, linux-doc, Hao Jia
In-Reply-To: <CAKEwX=PW2+EN41ANutv4cv+iM+JpwV5V+NSp5ukAt0M6fbHFLg@mail.gmail.com>



On 2026/5/12 03:54, Nhat Pham wrote:
> On Mon, May 11, 2026 at 3:52 AM Hao Jia <jiahao.kernel@gmail.com> wrote:
>> diff --git a/mm/zswap.c b/mm/zswap.c
>> index 19538d6f169a..1173ac6836fa 100644
>> --- a/mm/zswap.c
>> +++ b/mm/zswap.c
>> @@ -36,6 +36,7 @@
>>   #include <linux/workqueue.h>
>>   #include <linux/list_lru.h>
>>   #include <linux/zsmalloc.h>
>> +#include <linux/timekeeping.h>
>>
>>   #include "swap.h"
>>   #include "internal.h"
>> @@ -160,6 +161,12 @@ struct zswap_pool {
>>          char tfm_name[CRYPTO_MAX_ALG_NAME];
>>   };
>>
>> +struct zswap_shrink_walk_arg {
>> +       ktime_t cutoff_time;
>> +       bool proactive;
>> +       bool encountered_page_in_swapcache;
>> +};
>> +
>>   /* Global LRU lists shared by all zswap pools. */
>>   static struct list_lru zswap_list_lru;
>>
>> @@ -183,6 +190,7 @@ static struct shrinker *zswap_shrinker;
>>    * handle - zsmalloc allocation handle that stores the compressed page data
>>    * objcg - the obj_cgroup that the compressed memory is charged to
>>    * lru - handle to the pool's lru used to evict pages.
>> + * store_time - Time when the entry was stored, for proactive writeback.
>>    */
>>   struct zswap_entry {
>>          swp_entry_t swpentry;
>> @@ -192,6 +200,7 @@ struct zswap_entry {
>>          unsigned long handle;
>>          struct obj_cgroup *objcg;
>>          struct list_head lru;
>> +       ktime_t store_time;
> 
> On the implementation side - will this blow up struct zswap_entry
> memory footprint? If so, can you guard this behind a CONFIG option, if
> we are to go this route?

Thanks for the review. I'll address this in v2.
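
One possible shape for such a guard, as a rough userspace sketch rather than the actual v2 change. The DEMO_CONFIG_ZSWAP_STORE_TIME symbol, the demo_* names and the field layout are all hypothetical; int64_t stands in for ktime_t, which is a signed 64-bit nanosecond value in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical config switch; remove this line to drop the field. */
#define DEMO_CONFIG_ZSWAP_STORE_TIME 1

struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* Simplified stand-in for struct zswap_entry. */
struct demo_entry {
	unsigned long swpentry;
	unsigned int length;
	bool referenced;
	void *pool;
	unsigned long handle;
	void *objcg;
	struct demo_list_head lru;
#ifdef DEMO_CONFIG_ZSWAP_STORE_TIME
	int64_t store_time_ns;	/* only costs memory when enabled */
#endif
};

/* Record the store time; compiles to nothing when the option is off. */
static inline void demo_entry_set_store_time(struct demo_entry *e,
					     int64_t now_ns)
{
	assert(e != NULL);
#ifdef DEMO_CONFIG_ZSWAP_STORE_TIME
	e->store_time_ns = now_ns;
#else
	(void)now_ns;
#endif
}

/*
 * True if the entry was stored before the cutoff. Without the
 * timestamp no entry is treated as old (an assumed policy choice).
 */
static inline bool demo_entry_older_than(const struct demo_entry *e,
					 int64_t cutoff_ns)
{
	assert(e != NULL);
#ifdef DEMO_CONFIG_ZSWAP_STORE_TIME
	return e->store_time_ns < cutoff_ns;
#else
	(void)cutoff_ns;
	return false;
#endif
}
```

Keeping every access behind the two helpers means callers need no #ifdef of their own, and the struct only grows for builds that enable the option.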

Thanks,
Hao

^ permalink raw reply
