Linux Documentation
* Re: [PATCH v3 2/3] PM: dpm_watchdog: Allow disabling DPM watchdog by default
From: Tzung-Bi Shih @ 2026-06-09  9:04 UTC
  To: Rafael J. Wysocki
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Danilo Krummrich, Shuah Khan,
	Pavel Machek, Len Brown, linux-doc, linux-kernel, linux-pm,
	driver-core, tfiga, senozhatsky, Randy Dunlap
In-Reply-To: <CAJZ5v0g4VuR20dF+Zw0b75u4=ajFBOBAKokCoyDBtjCETexK3Q@mail.gmail.com>

On Mon, Jun 08, 2026 at 04:14:09PM +0200, Rafael J. Wysocki wrote:
> On Mon, Jun 8, 2026 at 4:16 AM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> >
> > Introduce the CONFIG_DPM_WATCHDOG_DEFAULT_ENABLED Kconfig option to
> > allow the device suspend/resume watchdog (DPM watchdog) to be disabled
> > by default at compile time.
> >
> > Additionally, introduce the "dpm_watchdog_enabled" module parameter to
> > allow the watchdog to be enabled or disabled at boot time (via
> > "power.dpm_watchdog_enabled") and at runtime (via sysfs).
> 
> I think that the new module param is more important because the new
> config option is just its default value, so I'd rearrange the
> changelog.
> 
> Also, I think that the "DEFAULT_" part of the new config option name
> doesn't provide any additional value, so I'd just drop it.

Will fix them in the next version.


* Re: [PATCH v3 1/3] PM: core: Rename module parameters prefix to "power"
From: Tzung-Bi Shih @ 2026-06-09  9:02 UTC
  To: Rafael J. Wysocki
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Danilo Krummrich, Shuah Khan,
	Pavel Machek, Len Brown, linux-doc, linux-kernel, linux-pm,
	driver-core, tfiga, senozhatsky, Randy Dunlap
In-Reply-To: <CAJZ5v0jy75R24NztKJ0w4NMyRB7G+DcsC+gaQ0xZOQMfTfA5Ww@mail.gmail.com>

On Mon, Jun 08, 2026 at 04:11:30PM +0200, Rafael J. Wysocki wrote:
> On Mon, Jun 8, 2026 at 4:16 AM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> >
> > Currently, the module parameters defined in drivers/base/power/main.c
> > use the default prefix "main" (derived from the filename).  The prefix
> > "main" is too generic and non-descriptive for power management
> > parameters.
> >
> > Redefine MODULE_PARAM_PREFIX to "power." at the beginning of the file
> > to group the module parameters under the "power" namespace instead.
> > This makes the parameters more descriptive.
> >
> > Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
> > ---
> > v3:
> > - No changes.
> >
> > v2: https://lore.kernel.org/all/20260604090756.2884671-2-tzungbi@kernel.org
> > - New to the series.
> >
> > v1: Doesn't exist.
> >
> >  drivers/base/power/main.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > index ed48c292f575..cd864f3a2799 100644
> > --- a/drivers/base/power/main.c
> > +++ b/drivers/base/power/main.c
> > @@ -40,6 +40,9 @@
> >  #include "../base.h"
> >  #include "power.h"
> >
> > +#undef MODULE_PARAM_PREFIX
> > +#define MODULE_PARAM_PREFIX "power."
> 
> "power" may be confused with the power supply support, so I'd rather
> use "pm" or even "pm_sleep" (in which case the "dpm_" prefix could be
> dropped from the new module param name in the next patch).

Ack, will use "pm_sleep" in the next version.

Regarding dropping the "dpm_" prefix, should this also apply to the existing
dpm_watchdog_all_cpu_backtrace parameter?  Or should we leave it as-is to
avoid breaking existing configurations?
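
As a rough user-space illustration of the compatibility concern (not part of the patch): a parameter's command-line name is the prefix followed by the parameter name, so changing either part changes what users must type. The helper names below are hypothetical, and the renamed parameter "watchdog_all_cpu_backtrace" is only an assumption about what dropping "dpm_" might look like.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Prefixes discussed in the thread; OLD_PREFIX is the filename-derived default. */
#define OLD_PREFIX "main."
#define NEW_PREFIX "pm_sleep."

/* Compose "<prefix><name>" into buf; returns 0 on success, -1 if truncated. */
static int param_cmdline_name(char *buf, size_t len,
			      const char *prefix, const char *name)
{
	int n;

	assert(buf && len > 0);
	n = snprintf(buf, len, "%s%s", prefix, name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* An existing boot argument keeps working only if the full name is unchanged. */
static int boot_arg_still_matches(const char *old_full, const char *new_full)
{
	return strcmp(old_full, new_full) == 0;
}
```

In this model any change to the prefix already alters the full name, so the open question is only whether to change the part after the dot as well.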


* Re: [PATCH v3 3/3] PM: dpm_watchdog: Add sysctl interface for DPM watchdog timeouts
From: Tzung-Bi Shih @ 2026-06-09  9:02 UTC
  To: Rafael J. Wysocki
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Danilo Krummrich, Shuah Khan,
	Pavel Machek, Len Brown, linux-doc, linux-kernel, linux-pm,
	driver-core, tfiga, senozhatsky, Randy Dunlap
In-Reply-To: <CAJZ5v0j=Uey90jN-TiUkx+FEPKtNUWhDGrfhxke65Em_ycbc+w@mail.gmail.com>

On Mon, Jun 08, 2026 at 04:22:35PM +0200, Rafael J. Wysocki wrote:
> On Mon, Jun 8, 2026 at 4:16 AM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> >
> > Introduce sysctl knobs to allow configuring DPM watchdog timeouts at
> > runtime.
> >
> > Currently, these timeouts are fixed at compile time via
> > CONFIG_DPM_WATCHDOG_TIMEOUT and CONFIG_DPM_WATCHDOG_WARNING_TIMEOUT.
> > This limits flexibility if the timeouts need to be adjusted for
> > different testing scenarios or hardware behaviors without rebuilding
> > the kernel.
> >
> > Add the following sysctl files under /proc/sys/kernel/:
> > - dpm_watchdog_timeout_secs: The total timeout before panic. The
> >   maximum value is capped at CONFIG_DPM_WATCHDOG_TIMEOUT to prevent
> >   unreasonably large timeouts.
> > - dpm_watchdog_warning_timeout_secs: The warning timeout. The maximum
> >   value is capped at the current dpm_watchdog_timeout_secs.
> > Both sysctls have a minimum value of 1.
> >
> > Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
> > ---
> > v3:
> > - No changes.
> >
> > v2: https://lore.kernel.org/all/20260604090756.2884671-4-tzungbi@kernel.org
> > - New to the series.
> >
> > v1: Doesn't exist.
> >
> >  drivers/base/power/main.c | 61 ++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 57 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> > index 7822c29b7c8d..c1a4b30fafb2 100644
> > --- a/drivers/base/power/main.c
> > +++ b/drivers/base/power/main.c
> > @@ -28,6 +28,7 @@
> >  #include <linux/interrupt.h>
> >  #include <linux/sched.h>
> >  #include <linux/sched/debug.h>
> > +#include <linux/sysctl.h>
> >  #include <linux/async.h>
> >  #include <linux/suspend.h>
> >  #include <trace/events/power.h>
> > @@ -539,6 +540,58 @@ static bool __read_mostly dpm_watchdog_enabled =
> >  module_param(dpm_watchdog_enabled, bool, 0644);
> >  MODULE_PARM_DESC(dpm_watchdog_enabled, "Enable DPM watchdog");
> >
> > +static unsigned int __read_mostly dpm_watchdog_timeout = CONFIG_DPM_WATCHDOG_TIMEOUT;
> > +static unsigned int __read_mostly dpm_watchdog_warning_timeout =
> > +                                               CONFIG_DPM_WATCHDOG_WARNING_TIMEOUT;
> > +static const unsigned int dpm_watchdog_timeout_max = CONFIG_DPM_WATCHDOG_TIMEOUT;
> > +
> > +static int proc_dodpm_watchdog_timeout_secs(const struct ctl_table *table,
> > +                                           int write, void *buffer,
> > +                                           size_t *lenp, loff_t *ppos)
> > +{
> > +       struct ctl_table ctl = *table;
> > +       unsigned int val = dpm_watchdog_timeout;
> > +       int ret;
> > +
> > +       ctl.data = &val;
> > +       ret = proc_douintvec_minmax(&ctl, write, buffer, lenp, ppos);
> > +       if (ret || !write)
> > +               return ret;
> > +
> > +       if (val < dpm_watchdog_warning_timeout)
> > +               dpm_watchdog_warning_timeout = val;
> > +       dpm_watchdog_timeout = val;
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct ctl_table dpm_watchdog_sysctls[] = {
> > +       {
> > +               .procname       = "dpm_watchdog_timeout_secs",
> > +               .maxlen         = sizeof(unsigned int),
> > +               .mode           = 0644,
> > +               .proc_handler   = proc_dodpm_watchdog_timeout_secs,
> > +               .extra1         = SYSCTL_ONE,
> > +               .extra2         = (void *)&dpm_watchdog_timeout_max,
> > +       },
> > +       {
> > +               .procname       = "dpm_watchdog_warning_timeout_secs",
> > +               .data           = &dpm_watchdog_warning_timeout,
> > +               .maxlen         = sizeof(unsigned int),
> > +               .mode           = 0644,
> > +               .proc_handler   = proc_douintvec_minmax,
> > +               .extra1         = SYSCTL_ONE,
> > +               .extra2         = (void *)&dpm_watchdog_timeout,
> > +       },
> > +};
> > +
> > +static int __init dpm_watchdog_sysctl_init(void)
> > +{
> > +       register_sysctl_init("kernel", dpm_watchdog_sysctls);
> > +       return 0;
> > +}
> > +subsys_initcall(dpm_watchdog_sysctl_init);
> > +
> >  /**
> >   * dpm_watchdog_handler - Driver suspend / resume watchdog handler.
> >   * @t: The timer that PM watchdog depends on.
> > @@ -564,9 +617,9 @@ static void dpm_watchdog_handler(struct timer_list *t)
> >                         dev_driver_string(wd->dev), dev_name(wd->dev));
> >         }
> >
> > -       time_left = CONFIG_DPM_WATCHDOG_TIMEOUT - CONFIG_DPM_WATCHDOG_WARNING_TIMEOUT;
> > +       time_left = dpm_watchdog_timeout - dpm_watchdog_warning_timeout;
> >         dev_warn(wd->dev, "**** DPM device timeout after %u seconds; %u seconds until panic ****\n",
> > -                CONFIG_DPM_WATCHDOG_WARNING_TIMEOUT, time_left);
> > +                dpm_watchdog_warning_timeout, time_left);
> >         show_stack(wd->tsk, NULL, KERN_WARNING);
> >
> >         wd->fatal = true;
> > @@ -587,11 +640,11 @@ static void dpm_watchdog_set(struct dpm_watchdog *wd, struct device *dev)
> >
> >         wd->dev = dev;
> >         wd->tsk = current;
> > -       wd->fatal = CONFIG_DPM_WATCHDOG_TIMEOUT == CONFIG_DPM_WATCHDOG_WARNING_TIMEOUT;
> > +       wd->fatal = dpm_watchdog_timeout == dpm_watchdog_warning_timeout;
> >
> >         timer_setup_on_stack(timer, dpm_watchdog_handler, 0);
> >         /* use same timeout value for both suspend and resume */
> > -       timer->expires = jiffies + HZ * CONFIG_DPM_WATCHDOG_WARNING_TIMEOUT;
> > +       timer->expires = jiffies + HZ * dpm_watchdog_warning_timeout;
> >         add_timer(timer);
> >  }
> >
> > --
> 
> I think that this can be applied without the other two patches in the
> series, so please let me know if you want me to apply it separately.

Ack.  The patch does have adjacent hunks with the preceding patch, which
might cause minor contextual conflicts if applied independently.

Would you like me to reorder the series in the next version so that this
becomes the first patch?  The remaining two patches may still take some
time to review.
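
For reference, the clamping rules described in the quoted changelog can be modelled in user space roughly as below. This is only a sketch: TIMEOUT_MAX is a made-up stand-in for CONFIG_DPM_WATCHDOG_TIMEOUT, the struct and function names are hypothetical, and returning -EINVAL for out-of-range writes is an assumption about how the min/max handler rejects values.

```c
#include <assert.h>
#include <errno.h>

#define TIMEOUT_MAX 120u	/* stand-in for CONFIG_DPM_WATCHDOG_TIMEOUT; value is made up */

struct wd_timeouts {
	unsigned int timeout;		/* seconds until panic */
	unsigned int warning_timeout;	/* seconds until the warning */
};

/* Model of a write to dpm_watchdog_timeout_secs: range [1, TIMEOUT_MAX]. */
static int set_timeout(struct wd_timeouts *t, unsigned int val)
{
	assert(t);
	if (val < 1 || val > TIMEOUT_MAX)
		return -EINVAL;
	/* Lowering the total timeout drags the warning timeout down with it. */
	if (val < t->warning_timeout)
		t->warning_timeout = val;
	t->timeout = val;
	return 0;
}

/* Model of a write to dpm_watchdog_warning_timeout_secs: range [1, timeout]. */
static int set_warning_timeout(struct wd_timeouts *t, unsigned int val)
{
	assert(t);
	if (val < 1 || val > t->timeout)
		return -EINVAL;
	t->warning_timeout = val;
	return 0;
}
```

The point of the model is the invariant warning_timeout <= timeout <= TIMEOUT_MAX, which both setters preserve.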


* Re: [PATCH mm-unstable v19 11/14] mm/khugepaged: Introduce mTHP collapse support
From: Nico Pache @ 2026-06-09  9:01 UTC
  To: Lorenzo Stoakes
  Cc: linux-doc, linux-kernel, linux-mm, linux-trace-kernel, aarcange,
	akpm, anshuman.khandual, apopple, baohua, baolin.wang, byungchul,
	catalin.marinas, cl, corbet, dave.hansen, david, dev.jain, gourry,
	hannes, hughd, jack, jackmanb, jannh, jglisse, joshua.hahnjy, kas,
	lance.yang, liam, mathieu.desnoyers, matthew.brost, mhiramat,
	mhocko, peterx, pfalcato, rakie.kim, raquini, rdunlap,
	richard.weiyang, rientjes, rostedt, rppt, ryan.roberts, shivankg,
	sunnanyong, surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
	vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
	zokeefe
In-Reply-To: <aiMTuXKQ5qxKYo60@lucifer>

On Fri, Jun 5, 2026 at 12:38 PM Lorenzo Stoakes <ljs@kernel.org> wrote:
>
> On Fri, Jun 05, 2026 at 10:14:18AM -0600, Nico Pache wrote:
> > Enable khugepaged to collapse to mTHP orders. This patch implements the
> > main scanning logic using a bitmap to track occupied pages and the
> > algorithm to find optimal collapse sizes.
> >
> > Previous to this patch, PMD collapse had 3 main phases, a light weight
> > scanning phase (mmap_read_lock) that determines a potential PMD
> > collapse, an alloc phase (mmap unlocked), then finally heavier collapse
> > phase (mmap_write_lock).
> >
> > To enable mTHP collapse we make the following changes:
> >
> > During PMD scan phase, track occupied pages in a bitmap. When mTHP
> > orders are enabled, we remove the restriction of max_ptes_none during the
> > scan phase to avoid missing potential mTHP collapse candidates. Once we
> > have scanned the full PMD range and updated the bitmap to track occupied
> > pages, we use the bitmap to find the optimal mTHP size.
> >
> > Implement mthp_collapse() to walk forward through the bitmap and
> > determine the best eligible order for each naturally-aligned region. The
> > algorithm starts at the beginning of the PMD range and, for each offset,
> > tries the highest order that fits the alignment. If the number of
> > occupied PTEs in that region satisfies the max_ptes_none threshold for
> > that order, a collapse is attempted. On failure, the order is
> > decremented and the same offset is retried at the next smaller size. Once
> > the smallest enabled order is exhausted (or a collapse succeeds), the
> > offset advances past the region just processed, and the next attempt
> > starts at the highest order permitted by the new offset's natural
> > alignment.
>
> I think still it might have been nice to discuss why we are not
> e.g. greedily trying to find the biggest possible mTHP size (if we did, we
> would try the highest offset first), but we can save that for adding some
> documentation somewhere later tbh.

We are, the algorithm tries PMD, then order 8, then order 7, and so
on. Due to the required alignment, if the N-1 order succeeds, we try
the same order at the neighboring offset.

So if we collapse an order 8, the following collapse attempt will be
order 8 at 256. We always try the highest order allowed for a given
offset :)
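
To make the walk concrete, here is a simplified user-space model (a sketch, not the kernel code). It assumes HPAGE_PMD_ORDER is 9, models only max_ptes_none=0 (a region must be fully occupied), assumes every attempted collapse succeeds, and uses the GCC/Clang builtin __builtin_ctz() in place of __ffs(). The name plan_collapses() and its output arrays are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define PMD_ORDER 9			/* assumes 4 KiB pages and 2 MiB PMDs */
#define PMD_NR (1u << PMD_ORDER)
#define MIN_ORDER 2

/* Highest naturally aligned order that fits at offset within the PMD. */
static unsigned int max_order_from_offset(unsigned int offset)
{
	unsigned int order;

	if (offset == 0)
		return PMD_ORDER;
	order = (unsigned int)__builtin_ctz(offset);
	return order < PMD_ORDER ? order : PMD_ORDER;
}

/* Count occupied entries in present[start, start + n). */
static unsigned int count_present(const bool *present, unsigned int start,
				  unsigned int n)
{
	unsigned int i, count = 0;

	for (i = start; i < start + n; i++)
		count += present[i];
	return count;
}

/* Record each (offset, order) that would be collapsed; returns how many. */
static unsigned int plan_collapses(const bool *present,
				   unsigned long enabled_orders,
				   unsigned int *offsets_out,
				   unsigned int *orders_out,
				   unsigned int max_out)
{
	unsigned int offset = 0, order = PMD_ORDER, n = 0;

	while (offset < PMD_NR) {
		unsigned int nr_ptes = 1u << order;
		bool enabled = (enabled_orders >> order) & 1ul;

		if (enabled && count_present(present, offset, nr_ptes) == nr_ptes) {
			if (n < max_out) {
				offsets_out[n] = offset;
				orders_out[n] = order;
			}
			n++;
		} else if (order > MIN_ORDER &&
			   (enabled_orders & ((1ul << order) - 1))) {
			/* Retry the same offset at the next smaller order. */
			order--;
			continue;
		}
		/* Advance; the new offset's alignment caps the next order. */
		offset += nr_ptes;
		order = max_order_from_offset(offset);
	}
	return n;
}
```

With entries 0..259 occupied and all orders from 2 to 9 enabled, the model picks order 8 at offset 0, then starts again at order 8 for offset 256 and works down to order 2 there.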

>
> This commit message is long enough as it is :>)
>
> >
> > The algorithm works as follows:
> >     1) set offset=0 and order=HPAGE_PMD_ORDER
> >     2) if the order is not enabled, go to step (5)
> >     3) count occupied PTEs in the (offset, order) range using
> >        bitmap_weight_from()
> >     4) if the count satisfies the max_ptes_none threshold, attempt
> >        collapse; on success, advance to step (6)
> >     5) if a smaller enabled order exists, decrement order and retry
> >        from step (2) at the same offset
> >     6) advance offset past the current region and compute the next
> >        order from the new offset's natural alignment via __ffs(offset),
> >        capped at HPAGE_PMD_ORDER
> >     7) repeat from step (2) until the full PMD range is covered
> >
> > mTHP collapses reject regions containing swapped out or shared pages.
> > This is because adding new entries can lead to new none pages, and these
> > may lead to constant promotion into a higher order mTHP. A similar
> > issue can occur with "max_ptes_none > HPAGE_PMD_NR/2" due to a collapse
> > introducing at least 2x the number of pages, and on a future scan will
> > satisfy the promotion condition once again. This issue is prevented via
> > the collapse_max_ptes_none() function which imposes the max_ptes_none
> > restrictions above.
> >
> > We currently only support mTHP collapse for max_ptes_none values of 0
> > and HPAGE_PMD_NR - 1, resulting in the following behavior:
> >
> >     - max_ptes_none=0: Never introduce new empty pages during collapse
> >     - max_ptes_none=HPAGE_PMD_NR-1: Always try collapse to the highest
> >       available mTHP order
> >
> > Any other max_ptes_none value will emit a warning and default mTHP
> > collapse to max_ptes_none=0. There should be no behavior change for PMD
> > collapse.
> >
> > Once we determine which mTHP size fits best in that PMD range a collapse
> > is attempted. A minimum collapse order of 2 is used as this is the lowest
> > order supported by anon memory as defined by THP_ORDERS_ALL_ANON.
> >
> > Currently madv_collapse is not supported and will only attempt PMD
> > collapse.
> >
> > We can also remove the check for is_khugepaged inside the PMD scan as
> > the collapse_max_ptes_none() function handles this logic now.
>
> It'd be nice to have kept the ASCII diagram here too :'( but this is fine,
>
> >
> > Signed-off-by: Nico Pache <npache@redhat.com>
>
> This all LGTM, and we can fix up any issues that arise later if anything
> does break. So:
>
> Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>

Thanks for reviewing :)

>
> > ---
> >  mm/khugepaged.c | 146 +++++++++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 138 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index ec886a031952..430047316f43 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -99,6 +99,8 @@ static DEFINE_READ_MOSTLY_HASHTABLE(mm_slots_hash, MM_SLOTS_HASH_BITS);
> >
> >  static struct kmem_cache *mm_slot_cache __ro_after_init;
> >
> > +#define KHUGEPAGED_MIN_MTHP_ORDER    2
> > +
> >  struct collapse_control {
> >       bool is_khugepaged;
> >
> > @@ -110,6 +112,9 @@ struct collapse_control {
> >
> >       /* nodemask for allocation fallback */
> >       nodemask_t alloc_nmask;
> > +
> > +     /* Each bit represents a single occupied (!none/zero) page. */
> > +     DECLARE_BITMAP(mthp_present_ptes, MAX_PTRS_PER_PTE);
> >  };
> >
> >  /**
> > @@ -1440,20 +1445,130 @@ static enum scan_result collapse_huge_page(struct mm_struct *mm, unsigned long s
> >       return result;
> >  }
> >
> > +/* Return the highest naturally aligned order that fits at @offset within a PMD. */
> > +static unsigned int max_order_from_offset(unsigned int offset)
> > +{
> > +     if (offset == 0)
> > +             return HPAGE_PMD_ORDER;
> > +
> > +     return min_t(unsigned int, __ffs(offset), HPAGE_PMD_ORDER);
> > +}
>
> Thanks this is better! I wonder if we can ever actually see an
> __ffs(offset) that's > HPAGE_PMD_ORDER but probably better safe than sorry
> here with the min_t.

I don't think so unless offset somehow exceeds 512 (it shouldn't), but
like you said, better safe than sorry.
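
A quick user-space check of that bound, as a sketch only: it assumes HPAGE_PMD_ORDER is 9 and uses the GCC/Clang builtin __builtin_ctz() as a stand-in for the kernel's __ffs(). The helper names are hypothetical.

```c
#include <assert.h>

#define PMD_ORDER 9	/* assumes 4 KiB pages and 2 MiB PMDs */

/* Index of the lowest set bit; undefined for 0, like __ffs(). */
static unsigned int ffs_order(unsigned int offset)
{
	assert(offset != 0);
	return (unsigned int)__builtin_ctz(offset);
}

/* Largest ffs_order() value over all offsets in [1, limit). */
static unsigned int highest_ffs_order_below(unsigned int limit)
{
	unsigned int best = 0;

	for (unsigned int offset = 1; offset < limit; offset++) {
		unsigned int o = ffs_order(offset);

		if (o > best)
			best = o;
	}
	return best;
}
```

For every non-zero offset below 512 the result stays at or below 8, so under these assumptions the cap only matters for an offset that is a multiple of 1024.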

>
> > +
> > +/*
> > + * mthp_collapse() consumes the bitmap that is generated during
> > + * collapse_scan_pmd() to determine what regions and mTHP orders fit best.
> > + *
> > + * Each bit in cc->mthp_present_ptes represents a single occupied (!none/zero)
> > + * page. We start at the PMD order and check if it is eligible for collapse;
> > + * if not, we check the left and right halves of the PTE page table we are
> > + * examining at a lower order.
> > + *
> > + * For each of these, we determine how many PTE entries are occupied in the
> > + * range of PTE entries we propose to collapse, then we compare this to a
> > + * threshold number of PTE entries which would need to be occupied for a
> > + * collapse to be permitted at that order (accounting for max_ptes_none).
> > + *
> > + * If a collapse is permitted, we attempt to collapse the PTE range into a
> > + * mTHP.
> > + */
> > +static enum scan_result mthp_collapse(struct mm_struct *mm,
> > +             unsigned long address, int referenced, int unmapped,
> > +             struct collapse_control *cc, unsigned long enabled_orders)
> > +{
> > +     unsigned int nr_occupied_ptes, nr_ptes, max_ptes_none;
> > +     enum scan_result last_result = SCAN_FAIL;
> > +     int collapsed = 0;
> > +     bool alloc_failed = false;
> > +     unsigned long collapse_address;
> > +     unsigned int offset = 0;
> > +     unsigned int order = HPAGE_PMD_ORDER;
> > +
> > +     while (offset < HPAGE_PMD_NR) {
> > +             nr_ptes = 1UL << order;
> > +
> > +             if (!test_bit(order, &enabled_orders))
> > +                     goto next_order;
> > +
> > +             max_ptes_none = collapse_max_ptes_none(cc, NULL, order);
> > +             nr_occupied_ptes = bitmap_weight_from(cc->mthp_present_ptes, offset,
> > +                                                   offset + nr_ptes);
> > +
> > +             if (nr_occupied_ptes >= nr_ptes - max_ptes_none) {
> > +                     enum scan_result ret;
> > +
> > +                     collapse_address = address + offset * PAGE_SIZE;
> > +                     ret = collapse_huge_page(mm, collapse_address, referenced,
> > +                                              unmapped, cc, order);
> > +                     switch (ret) {
> > +                     /* Cases where we continue to next collapse candidate */
> > +                     case SCAN_SUCCEED:
> > +                             collapsed += nr_ptes;
> > +                             fallthrough;
> > +                     case SCAN_PTE_MAPPED_HUGEPAGE:
> > +                             goto next_offset;
> > +                     /* Cases where lower orders might still succeed */
> > +                     case SCAN_ALLOC_HUGE_PAGE_FAIL:
> > +                             alloc_failed = true;
> > +                             last_result = ret;
> > +                             goto next_order;
> > +                     /* Cases where no further collapse is possible */
> > +                     case SCAN_PMD_MAPPED:
> > +                             fallthrough;
> > +                     default:
> > +                             last_result = ret;
> > +                             goto done;
> > +                     }
> > +             }
> > +
> > +next_order:
> > +             /*
> > +              * Continue with the next smaller order if there is still
> > +              * any smaller order enabled. When at the smallest order
> > +              * we must always move to the next offset.
> > +              */
> > +             if (order > KHUGEPAGED_MIN_MTHP_ORDER &&
> > +                     (enabled_orders & GENMASK(order - 1, 0))) {
>
> Honestly wasn't aware of GENMASK() before :)

I wasn't either! (thanks David ;) )

>
> > +                     order--;
> > +                     continue;
> > +             }
> > +next_offset:
> > +             /*
> > +              * Advance past the region we just processed and determine the
> > +              * highest order we can attempt next. Since huge pages must be
> > +              * naturally aligned, the max order we can attempt next is
> > +              * limited by the alignment of the new offset.
> > +              * E.g. if we collapsed a order-2 mTHP at offset 0, offset
> > +              * becomes 4 and __ffs(4) == 2, so the next attempt starts at
> > +              * order 2.
> > +              */
>
> Great comment thanks!
>
> > +             offset += nr_ptes;
> > +             order = max_order_from_offset(offset);
> > +     }
> > +done:
> > +     if (collapsed)
> > +             return SCAN_SUCCEED;
> > +     if (alloc_failed)
> > +             return SCAN_ALLOC_HUGE_PAGE_FAIL;
> > +     return last_result;
> > +}
> > +
> >  static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> >               struct vm_area_struct *vma, unsigned long start_addr,
> >               bool *lock_dropped, struct collapse_control *cc)
> >  {
> > -     const unsigned int max_ptes_none = collapse_max_ptes_none(cc, vma, HPAGE_PMD_ORDER);
> >       const unsigned int max_ptes_shared = collapse_max_ptes_shared(cc, HPAGE_PMD_ORDER);
> >       const unsigned int max_ptes_swap = collapse_max_ptes_swap(cc, HPAGE_PMD_ORDER);
> > +     unsigned int max_ptes_none = collapse_max_ptes_none(cc, vma, HPAGE_PMD_ORDER);
> > +     enum tva_type tva_flags = cc->is_khugepaged ? TVA_KHUGEPAGED : TVA_FORCED_COLLAPSE;
> >       pmd_t *pmd;
> > -     pte_t *pte, *_pte;
> > +     pte_t *pte, *_pte, pteval;
> > +     int i;
> >       int none_or_zero = 0, shared = 0, referenced = 0;
> >       enum scan_result result = SCAN_FAIL;
> >       struct page *page = NULL;
> >       struct folio *folio = NULL;
> >       unsigned long addr;
> > +     unsigned long enabled_orders;
> >       spinlock_t *ptl;
> >       int node = NUMA_NO_NODE, unmapped = 0;
> >
> > @@ -1465,8 +1580,19 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> >               goto out;
> >       }
> >
> > +     bitmap_zero(cc->mthp_present_ptes, MAX_PTRS_PER_PTE);
> >       memset(cc->node_load, 0, sizeof(cc->node_load));
> >       nodes_clear(cc->alloc_nmask);
> > +
> > +     enabled_orders = collapse_possible_orders(vma, vma->vm_flags, tva_flags);
> > +
> > +     /*
> > +      * If PMD is the only enabled order, enforce max_ptes_none, otherwise
> > +      * scan all pages to populate the bitmap for mTHP collapse.
> > +      */
> > +     if (enabled_orders != BIT(HPAGE_PMD_ORDER))
> > +             max_ptes_none = KHUGEPAGED_MAX_PTES_LIMIT;
> > +
> >       pte = pte_offset_map_lock(mm, pmd, start_addr, &ptl);
> >       if (!pte) {
> >               cc->progress++;
> > @@ -1474,11 +1600,13 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> >               goto out;
> >       }
> >
> > -     for (addr = start_addr, _pte = pte; _pte < pte + HPAGE_PMD_NR;
> > -          _pte++, addr += PAGE_SIZE) {
> > +     for (i = 0; i < HPAGE_PMD_NR; i++) {
> > +             _pte = pte + i;
> > +             addr = start_addr + i * PAGE_SIZE;
> > +             pteval = ptep_get(_pte);
> > +
> >               cc->progress++;
> >
> > -             pte_t pteval = ptep_get(_pte);
> >               if (pte_none_or_zero(pteval)) {
> >                       if (++none_or_zero > max_ptes_none) {
> >                               result = SCAN_EXCEED_NONE_PTE;
> > @@ -1558,6 +1686,8 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> >                       }
> >               }
> >
> > +             /* Set bit for occupied pages */
> > +             __set_bit(i, cc->mthp_present_ptes);
> >               /*
> >                * Record which node the original page is from and save this
> >                * information to cc->node_load[].
> > @@ -1616,9 +1746,9 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> >       if (result == SCAN_SUCCEED) {
> >               /* collapse_huge_page expects the lock to be dropped before calling */
> >               mmap_read_unlock(mm);
> > -             result = collapse_huge_page(mm, start_addr, referenced,
> > -                                         unmapped, cc, HPAGE_PMD_ORDER);
> > -             /* collapse_huge_page will return with the mmap_lock released */
> > +             result = mthp_collapse(mm, start_addr, referenced,
> > +                                    unmapped, cc, enabled_orders);
> > +             /* mmap_lock was released above, set lock_dropped */
> >               *lock_dropped = true;
> >       }
> >  out:
> > --
> > 2.54.0
> >
>
> Cheers, Lorenzo
>



* Re: [PATCH v2 1/1] dt-bindings: net: dsa: Convert lan9303.txt to yaml format
From: Paolo Abeni @ 2026-06-09  8:59 UTC
  To: Frank.Li, Andrew Lunn, Vladimir Oltean, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Simon Horman, Jonathan Corbet, Shuah Khan, Frank Li,
	open list:NETWORKING DRIVERS,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	open list, open list:DOCUMENTATION
  Cc: imx
In-Reply-To: <20260603210929.3099363-1-Frank.Li@oss.nxp.com>

On 6/3/26 11:09 PM, Frank.Li@oss.nxp.com wrote:
> From: Frank Li <Frank.Li@nxp.com>
> 
> Convert lan9303.txt to yaml format to fix below CHECK_DTBS warnings:
> arch/arm/boot/dts/nxp/imx/imx53-kp-hsc.dtb: /soc/bus@50000000/i2c@53fec000/switch@a: failed to match any schema with compatible: ['smsc,lan9303-i2c']
> 
> Additional changes:
>   - rename switch-phy to switch in example.
> 
> Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
> Signed-off-by: Frank Li <Frank.Li@nxp.com>

Does not apply to net-next cleanly anymore. Please rebase and resend.
While at it, please additionally add the target tree in the subj prefix.

Thanks,

Paolo



* Re: [PATCH v3 2/3] Documentation: security-bugs: explain what is and is not a security bug
From: Greg KH @ 2026-06-09  8:43 UTC
  To: Askar Safin
  Cc: w, corbet, leon, linux-doc, linux-kernel, security, skhan,
	workflows
In-Reply-To: <20260609083305.2382925-1-safinaskar@gmail.com>

On Tue, Jun 09, 2026 at 11:33:05AM +0300, Askar Safin wrote:
> Willy Tarreau <w@1wt.eu>:
> > +in a way that allows multiple local users to get a fair share of the available
> 
> Your "security-bugs.rst" says that we should consult "threat-model.rst" to
> determine whether a bug should be sent to secret mailing list.
> 
> And "threat-model.rst" says that kernel gives everyone "fair share"
> of resources.
> 
> This can be interpreted so: if scheduler is not fair enough, then this is
> security bug and should be reported to secret mailing list. I don't think
> this is what you meant.

Within reason of course, please use your best judgement.

> > +When hardware fails to maintain its specified isolation (e.g., CPU bugs,
> > +side-channels, hardware response to unexpected inputs), the kernel will usually
> > +attempt to implement reasonable mitigations. These are best-effort measures
> > +intended to reduce the attack surface or elevate the cost of an attack within
> > +the limits of the hardware's facilities; they do not constitute a
> > +kernel-provided safety guarantee.
> 
> "best-effort measures" and "they do not constitute a kernel-provided safety
> guarantee" can be interpreted so: if someone finds yet another Meltdown-like
> side-channel CPU bug, then this is not security bug, and should be
> reported openly. I don't think this is what you meant.

Again, please be reasonable.  Hardware bugs have their own reporting
process that we have well documented.

> > +    affect the system's availability (shutdown, reboot, panic, hang, or making
> > +    the system unresponsive via unbounded resource exhaustion).
> 
> So if unprivileged process can crash system, then this is security bug?

Yes.

> Also I'm not sure "unbounded resource exhaustion" is correct here.

Why not?

> As far as I understand, by default kernel and distros don't set any
> memory limits or limits for number of processes for unprivileged processes,
> so unprivileged process can easily cause resource exhaustion by
> allocating a lot of memory or by fork bomb.

That's a distro problem, not a kernel problem.

> So, I think you should instead say that unprivileged process, which
> has memory limit (and other limits) set using cgroups, should not
> be able to cause resource exhaustion.

Patches are always gladly accepted.  But again, be reasonable please,
this isn't a legal document :)

> > +are designed to be accessible to regular local users with a low risk (e.g.
> > +kernel logs via ``/proc/kmsg``), some would expose enough information to
> 
> /proc/kmsg has rights "-r--------", so I think there is error here.
> 
> ---------------
> 
> Finally, I have questions:
> 
> - If unprivileged user created process, which is impossible to kill
> by privileged process, is this security bug?

Sounds like a bug, we can deal with it that way.

> - If unprivileged user prevents privileged user from suspending
> system, is this security bug?

Physical access of suspending a machine feels like an odd threat model
to be worried about :)

If you have bugs that you feel are security issues like the above,
great, please report them and we can take them on a case-by-case basis.

This document is meant as a starting point for that, and to help remove
a huge number of "this is a security bug!" reports that we keep getting
that are obviously not that.

thanks,

greg k-h


* Re: [PATCH v3 2/3] Documentation: security-bugs: explain what is and is not a security bug
From: Askar Safin @ 2026-06-09  8:33 UTC
  To: w
  Cc: corbet, greg, gregkh, leon, linux-doc, linux-kernel, security,
	skhan, workflows
In-Reply-To: <20260509094755.2838-3-w@1wt.eu>

Willy Tarreau <w@1wt.eu>:
> +in a way that allows multiple local users to get a fair share of the available

Your "security-bugs.rst" says that we should consult "threat-model.rst" to
determine whether a bug should be sent to secret mailing list.

And "threat-model.rst" says that kernel gives everyone "fair share"
of resources.

This can be interpreted so: if scheduler is not fair enough, then this is
security bug and should be reported to secret mailing list. I don't think
this is what you meant.

> +When hardware fails to maintain its specified isolation (e.g., CPU bugs,
> +side-channels, hardware response to unexpected inputs), the kernel will usually
> +attempt to implement reasonable mitigations. These are best-effort measures
> +intended to reduce the attack surface or elevate the cost of an attack within
> +the limits of the hardware's facilities; they do not constitute a
> +kernel-provided safety guarantee.

"best-effort measures" and "they do not constitute a kernel-provided safety
guarantee" can be interpreted so: if someone finds yet another Meltdown-like
side-channel CPU bug, then this is not security bug, and should be
reported openly. I don't think this is what you meant.

> +    affect the system's availability (shutdown, reboot, panic, hang, or making
> +    the system unresponsive via unbounded resource exhaustion).

So if an unprivileged process can crash the system, then this is a security bug?

Also, I'm not sure "unbounded resource exhaustion" is correct here.
As far as I understand, by default the kernel and distros don't set any
memory limits or limits on the number of processes for unprivileged processes,
so an unprivileged process can easily cause resource exhaustion by
allocating a lot of memory or with a fork bomb.

So I think you should instead say that an unprivileged process that
has a memory limit (and other limits) set using cgroups should not
be able to cause resource exhaustion.

> +are designed to be accessible to regular local users with a low risk (e.g.
> +kernel logs via ``/proc/kmsg``), some would expose enough information to

/proc/kmsg has rights "-r--------", so I think there is error here.

---------------

Finally, I have questions:

- If unprivileged user created process, which is impossible to kill
by privileged process, is this security bug?

- If unprivileged user prevents privileged user from suspending
system, is this security bug?

-- 
Askar Safin

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH iwl-next v8 08/15] idpf: refactor idpf to use libie_pci APIs
From: Larysa Zaremba @ 2026-06-09  8:30 UTC (permalink / raw)
  To: Loktionov, Aleksandr
  Cc: intel-wired-lan@lists.osuosl.org, Nguyen, Anthony L,
	Lobakin, Aleksander, Samudrala, Sridhar, Michal Swiatkowski,
	Fijalkowski, Maciej, Tantilov, Emil S, Chittim, Madhu,
	Hay, Joshua A, Keller, Jacob E, Shanmugam, Jayaprakash,
	Jiri Pirko, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Jonathan Corbet, Richard Cochran,
	Kitszel, Przemyslaw, Andrew Lunn, netdev@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	Salin, Samuel
In-Reply-To: <IA3PR11MB898641BC991D8EAB20EEA185E51C2@IA3PR11MB8986.namprd11.prod.outlook.com>

On Mon, Jun 08, 2026 at 05:16:57PM +0200, Loktionov, Aleksandr wrote:
> 
> 
> > -----Original Message-----
> > From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> > Of Larysa Zaremba
> > Sent: Monday, June 8, 2026 4:41 PM
> > To: intel-wired-lan@lists.osuosl.org; Nguyen, Anthony L
> > <anthony.l.nguyen@intel.com>
> > Cc: Lobakin, Aleksander <aleksander.lobakin@intel.com>; Samudrala,
> > Sridhar <sridhar.samudrala@intel.com>; Michal Swiatkowski
> > <michal.swiatkowski@linux.intel.com>; Zaremba, Larysa
> > <larysa.zaremba@intel.com>; Fijalkowski, Maciej
> > <maciej.fijalkowski@intel.com>; Tantilov, Emil S
> > <emil.s.tantilov@intel.com>; Chittim, Madhu <madhu.chittim@intel.com>;
> > Hay, Joshua A <joshua.a.hay@intel.com>; Keller, Jacob E
> > <jacob.e.keller@intel.com>; Shanmugam, Jayaprakash
> > <jayaprakash.shanmugam@intel.com>; Jiri Pirko <jiri@resnulli.us>;
> > David S. Miller <davem@davemloft.net>; Eric Dumazet
> > <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> > <pabeni@redhat.com>; Simon Horman <horms@kernel.org>; Jonathan Corbet
> > <corbet@lwn.net>; Richard Cochran <richardcochran@gmail.com>; Kitszel,
> > Przemyslaw <przemyslaw.kitszel@intel.com>; Andrew Lunn
> > <andrew+netdev@lunn.ch>; netdev@vger.kernel.org; linux-
> > doc@vger.kernel.org; linux-kernel@vger.kernel.org; Salin, Samuel
> > <samuel.salin@intel.com>
> > Subject: [Intel-wired-lan] [PATCH iwl-next v8 08/15] idpf: refactor
> > idpf to use libie_pci APIs
> > 
> > From: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
> > 
> > Use libie_pci init and MMIO APIs where possible, struct idpf_hw cannot
> > be deleted for now as it also houses control queues that will be
> > refactored later. Use libie_cp header for libie_ctlq_ctx that contains
> > mmio info from the start in order to not increase the diff later.
> > 
> > Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
> > Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> > Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
> > Tested-by: Samuel Salin <Samuel.salin@intel.com>
> > Co-developed-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > ---
> >  drivers/net/ethernet/intel/idpf/Kconfig       |   1 +
> >  drivers/net/ethernet/intel/idpf/idpf.h        |  70 +-------
> >  .../net/ethernet/intel/idpf/idpf_controlq.c   |  26 ++-
> >  .../net/ethernet/intel/idpf/idpf_controlq.h   |   2 -
> >  drivers/net/ethernet/intel/idpf/idpf_dev.c    |  61 ++++---
> >  drivers/net/ethernet/intel/idpf/idpf_idc.c    |  38 ++--
> >  drivers/net/ethernet/intel/idpf/idpf_lib.c    |   7 +-
> >  drivers/net/ethernet/intel/idpf/idpf_main.c   | 114 ++++++------
> >  drivers/net/ethernet/intel/idpf/idpf_vf_dev.c |  57 +++---
> > >  .../net/ethernet/intel/idpf/idpf_virtchnl.c   | 169 +++++++++---------
> >  .../ethernet/intel/idpf/idpf_virtchnl_ptp.c   |  58 +++---
> >  11 files changed, 288 insertions(+), 315 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/idpf/Kconfig
> > b/drivers/net/ethernet/intel/idpf/Kconfig
> > index adab2154125b..586df3a4afe9 100644
> > --- a/drivers/net/ethernet/intel/idpf/Kconfig
> > +++ b/drivers/net/ethernet/intel/idpf/Kconfig
> > @@ -6,6 +6,7 @@ config IDPF
> >  	depends on PCI_MSI
> >  	depends on PTP_1588_CLOCK_OPTIONAL
> >  	select DIMLIB
> 
> ...
> 
> > +56,14 @@ static void idpf_ctlq_reg_init(struct idpf_adapter *adapter,
> >   */
> >  static void idpf_mb_intr_reg_init(struct idpf_adapter *adapter)  {
> > +	struct libie_mmio_info *mmio = &adapter->ctlq_ctx.mmio_info;
> >  	struct idpf_intr_reg *intr = &adapter->mb_vector.intr_reg;
> >  	u32 dyn_ctl = le32_to_cpu(adapter->caps.mailbox_dyn_ctl);
> > 
> > -	intr->dyn_ctl = idpf_get_reg_addr(adapter, dyn_ctl);
> > +	intr->dyn_ctl = libie_pci_get_mmio_addr(mmio, dyn_ctl);
> Probable NULL dereference: libie_pci_get_mmio_addr(mmio, dyn_ctl) can return NULL.
> It looks like no checks were made.

This is consistent with how idpf_get_reg_addr() behaved, though I see that a
BUG() check is missing in comparison. I think I could add that to
libie_pci_get_mmio_addr().

> 
> >  	intr->dyn_ctl_intena_m = PF_GLINT_DYN_CTL_INTENA_M;
> >  	intr->dyn_ctl_itridx_m = PF_GLINT_DYN_CTL_ITR_INDX_M;
> 
> ...
> 
> > 
> >  	return 0;
> >  }
> > --
> > 2.47.0
> 

^ permalink raw reply

* Re: [PATCH] Docs/damon: add TLB flush policy document
From: KunWu Chan @ 2026-06-09  8:11 UTC (permalink / raw)
  To: SeongJae Park
  Cc: akpm, david, ljs, liam, vbabka, rppt, surenb, mhocko, corbet,
	skhan, damon, linux-mm, linux-doc, linux-kernel, Wang Lian
In-Reply-To: <20260606005431.89186-1-sj@kernel.org>

On Sat, Jun 6, 2026 at 8:54 AM SeongJae Park <sj@kernel.org> wrote:
>
> Hi Kunwu and Lian,
>
> On Fri,  5 Jun 2026 11:10:08 +0800 Kunwu Chan <kunwu.chan@gmail.com> wrote:
>
> > From: Kunwu Chan <kunwu.chan@gmail.com>
> >
> > DAMON avoids TLB flushes after clearing PTE Accessed bits for sampling.
> > The overhead was measured and found significant [1].  Production
> > workloads with large working sets flush TLB buffers naturally, so
> > accuracy impact is negligible.
> >
> > On systems with large TLB buffers and small test workloads, stale TLB
> > entries persist across sampling intervals and produce false negatives.
> > This comes up repeatedly on the mailing list and in private inquiries
> > [2][3].
> >
> > Add a document on the design decision, trade-offs, test environment
> > problems, and recommendations.
> >
> > Link: https://lore.kernel.org/20200403103059.12762-1-sjpark@amazon.com [1]
> > Link: https://lore.kernel.org/20260117020731.226785-3-sj@kernel.org [2]
> > Link: https://lore.kernel.org/all/20260526145034.91594-1-sj@kernel.org [3]
>
> Thank you for this great patch!
>
> >
> > Co-developed-by: Wang Lian <lianux.mm@gmail.com>
> > Signed-off-by: Wang Lian <lianux.mm@gmail.com>
> > Signed-off-by: Kunwu Chan <kunwu.chan@gmail.com>
> > ---
> >  Documentation/mm/damon/index.rst     |   1 +
> >  Documentation/mm/damon/tlb_flush.rst | 131 +++++++++++++++++++++++++++
> >  2 files changed, 132 insertions(+)
> >  create mode 100644 Documentation/mm/damon/tlb_flush.rst
> >
> > diff --git a/Documentation/mm/damon/index.rst b/Documentation/mm/damon/index.rst
> > index 318f6a7bfea4..5e239437dab3 100644
> > --- a/Documentation/mm/damon/index.rst
> > +++ b/Documentation/mm/damon/index.rst
> > @@ -19,6 +19,7 @@ DAMON is a Linux kernel subsystem for efficient :ref:`data access monitoring
> >
> >     faq
> >     design
> > +   tlb_flush
> >     api
> >     maintainer-profile
> >
> > diff --git a/Documentation/mm/damon/tlb_flush.rst b/Documentation/mm/damon/tlb_flush.rst
> > new file mode 100644
> [...]
>
> Great document!  That said, it feels like a good complete article or a paper,
> rather than DAMON documentation, which aims to be short and essential.  I feel
> like this is a better fit for a blog such as the DAMON project blog [1], or a
> news site like LWN.  If you'd like to, please feel free to upload a PR or send
> a patch for the DAMON project blog source [2].
>
> Mainly due to the verbosity, as I mentioned above, I'm not sure if the current
> shape of this patch is the best to be merged as is.  I also find that the
> background part of the document somewhat duplicates information in design.rst.
> What about putting only the essential information, in condensed form, in
> design.rst?
>
Thanks, SJ. I agree — this document reads more like an article than short docs.

I'll follow your suggestion and submit it as a blog post to the DAMON
project blog instead.

Will send a PR to [2] soon. Thanks for the clear guidance.

> [1] https://damonitor.github.io/site_about
> [2] https://github.com/damonitor/damonitor.github.io/tree/master/blog_src
>
>
> Thanks,
> SJ
>
> [...]

^ permalink raw reply

* Re: [PATCH 4/4] block: add configurable error injection
From: Christoph Hellwig @ 2026-06-09  7:47 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, Jonathan Corbet, Damien Le Moal,
	Hannes Reinecke, Keith Busch, linux-block, linux-doc,
	Hannes Reinecke
In-Reply-To: <3b276ff3-2065-4cd5-adcf-6664606d1eea@acm.org>

On Mon, Jun 08, 2026 at 03:08:47PM -0700, Bart Van Assche wrote:
> On 6/7/26 10:14 PM, Christoph Hellwig wrote:
>> +Configurable error injection allows injecting specific block layer status codes
>> +for ranges of a block device.  Errors can be injected unconditionally, or with a
>
> ranges -> sector ranges?
>
>> +static void error_inject_removall(struct gendisk *disk)
>> +{
>
> Is a letter "e" perhaps missing from the above function name? (remov -> 
> remove)

Sure, fixed.


^ permalink raw reply

* Re: [PATCH 3/4] block: add a str_to_blk_op helper
From: Christoph Hellwig @ 2026-06-09  7:45 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, Jonathan Corbet, Damien Le Moal,
	Hannes Reinecke, Keith Busch, linux-block, linux-doc,
	Hannes Reinecke
In-Reply-To: <5e738f79-e1d9-4224-ae85-322967682a1a@acm.org>

On Mon, Jun 08, 2026 at 02:57:40PM -0700, Bart Van Assche wrote:
> On 6/7/26 10:14 PM, Christoph Hellwig wrote:
>> +enum req_op str_to_blk_op(const char *op)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(blk_op_name); i++)
>> +		if (blk_op_name[i] && !strcmp(blk_op_name[i], op))
>> +			return (enum req_op)i;
>> +	return REQ_OP_LAST;
>> +}
> The above function is similar but not identical to
> __sysfs_match_string(). Is __sysfs_match_string() good enough in this
> context?

__sysfs_match_string exits as soon as an array entry is NULL, but
blk_status values are not fully contiguous, so no.


^ permalink raw reply

* Re: [PATCH 2/4] block: add a "tag" for block status codes
From: Christoph Hellwig @ 2026-06-09  7:43 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, Jonathan Corbet, Damien Le Moal,
	Hannes Reinecke, Keith Busch, linux-block, linux-doc,
	Hannes Reinecke
In-Reply-To: <f4c0895b-4758-4eb1-9c3a-38cda0db50d2@acm.org>

On Mon, Jun 08, 2026 at 02:55:20PM -0700, Bart Van Assche wrote:
> On 6/7/26 10:14 PM, Christoph Hellwig wrote:
>> +const char *blk_status_to_tag(blk_status_t status)
>> +{
>> +	int idx = (__force int)status;
>> +
>> +	if (WARN_ON_ONCE(idx >= ARRAY_SIZE(blk_errors)))
>> +		return "<null>";
>> +	return blk_errors[idx].tag;
>> +}
>
> Since designated initializers are used to initialize blk_errors[], it's
> probably a good idea to check the value of blk_errors[idx].tag, e.g. as
> follows:
>
> return blk_errors[idx].tag ?: "<null>";

I'd go for the good old and readable if statement, but yes, I can add
extra error checking here.


^ permalink raw reply

* Re: [PATCH 4/4] block: add configurable error injection
From: Christoph Hellwig @ 2026-06-09  7:41 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Christoph Hellwig, Jonathan Corbet, Damien Le Moal,
	Hannes Reinecke, Keith Busch, linux-block, linux-doc,
	Hannes Reinecke
In-Reply-To: <bac50400-dd86-4c7f-bab3-481c1306877b@kernel.dk>

On Mon, Jun 08, 2026 at 08:53:22AM -0600, Jens Axboe wrote:
> > +	if (!test_bit(GD_ERROR_INJECT, &bio->bi_bdev->bd_disk->state))
> > +		return false;
> > +	return __blk_error_inject(bio);
> > +}
> 
> I really hate this part, that's a pretty deep set of pointer chasings to
> figure out if injection is enabled or not,

It's to the bdev we use everywhere, and then to the disk which we use
in a lot of places in the submission path.

The only easy way to reduce it would be to move the state to the
block_device.  We currently don't do partitions in debugfs, but maybe
we should?

> when in practice error
> injection is only ever enabled for specific test cases and distros
> invariably will set CONFIG_BLK_ERROR_INJECTION because they turn on
> every damn thing under the sun.
> 
> IOW, that won't fly for the hot path. Maybe a static key would be useful
> here?

a static_key makes sense here, probably including the legacy error
injection.


^ permalink raw reply

* Re: [PATCH v8 2/6] mm/memory-failure: surface unhandlable kernel pages as -ENOTRECOVERABLE
From: David Hildenbrand (Arm) @ 2026-06-09  7:09 UTC (permalink / raw)
  To: Miaohe Lin, Breno Leitao
  Cc: linux-mm, linux-kernel, linux-doc, linux-kselftest,
	linux-trace-kernel, kernel-team, Lance Yang, Andrew Morton,
	Lorenzo Stoakes, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Shuah Khan, Naoya Horiguchi,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Liam R. Howlett
In-Reply-To: <4953bcee-5a0f-2bc5-7295-63e5e7513e8b@huawei.com>

On 6/9/26 04:39, Miaohe Lin wrote:
> On 2026/6/8 22:15, Breno Leitao wrote:
>> On Fri, Jun 05, 2026 at 11:42:53AM +0200, David Hildenbrand (Arm) wrote:
>>>
>>> I mean, any such races can currently already happen one way or the other?
>>>
>>> Really, the only way to not get races is to tryget the (compound)page,
>>> revalidate that the page is still part of the compound page.
>>>
>>> I'm not sure if that's really a good idea.
>>>
>>> But my memory is a bit vague in which scenarios we already hold a page reference
>>> here to prevent any concurrent freeing?
>>
>> No, we don't hold one here in the case that matters.
>>
>> HWPoisonKernelOwned() runs at the very top of get_any_page(), before
>> try_again: and before __get_hwpoison_page(). The first refcount taken in
>> the whole path is the folio_try_get() inside __get_hwpoison_page(), which
>> runs *after* the short-circuit.
>>
>> So get_any_page() itself never holds a reference at the check -- the only way
>> one exists is if the caller passed MF_COUNT_INCREASED (count_increased ==
>> true).
>>
>> So on the MCE/GHES path -- the one this panic option exists for -- no
>> reference is held when HWPoisonKernelOwned() does its compound_head() +
>> PageSlab()/PageTable()/PageLargeKmalloc() checks.
>>
>> Given that, I'd rather keep it racy and take no refcount than add a
>> tryget + revalidate purely for this check. As I've said earlier, an operator
> 
> Would it be acceptable to add a simple recheck? Something like below:
> 
> retry:
> head = compound_head(page);
> PageSlab()/PageTable()/PageLargeKmalloc() checks
> if (head != compound_head(page))
> 	goto retry

Sure. I guess it could still be racy in some weird scenarios where we
free+allocate+free in-between.

-- 
Cheers,

David

^ permalink raw reply

* [PATCH v2] docs: Fix minor grammatical error
From: Brigham Campbell @ 2026-06-09  7:06 UTC (permalink / raw)
  To: Thorsten Leemhuis, Jonathan Corbet, Shuah Khan,
	open list:DOCUMENTATION REPORTING ISSUES, open list
  Cc: Brigham Campbell

Fix minor grammatical error in the administration guide.

Signed-off-by: Brigham Campbell <me@brighamcampbell.com>
---

Since v1:
* Drop pedantic line re-wrapping.

In hindsight, I should have guessed that reflowing the paragraph was
overzealous. Thanks for the guidance, Randy, Thorsten. I'll remember it
if I make minor doc fixes in the future.

 Documentation/admin-guide/quickly-build-trimmed-linux.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/quickly-build-trimmed-linux.rst b/Documentation/admin-guide/quickly-build-trimmed-linux.rst
index cb178e0a6208..3432dc8e1a85 100644
--- a/Documentation/admin-guide/quickly-build-trimmed-linux.rst
+++ b/Documentation/admin-guide/quickly-build-trimmed-linux.rst
@@ -217,7 +217,7 @@ again.
 
    There is a catch: 'localmodconfig' is likely to disable kernel features you
    did not use since you booted your Linux -- like drivers for currently
-   disconnected peripherals or a virtualization software not haven't used yet.
+   disconnected peripherals or virtualization software not currently in use.
    You can reduce or nearly eliminate that risk with tricks the reference
    section outlines; but when building a kernel just for quick testing purposes
    it is often negligible if such features are missing. But you should keep that

base-commit: 738bb6e6c8d992f33335b3cbcce051ab118a33dc
-- 
2.54.0


^ permalink raw reply related

* Re: [PATCH v3 3/4] cpufreq: Remove driver default policy->min/max init
From: Pierre Gondois @ 2026-06-09  6:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, Jie Zhan, Lifeng Zheng, Ionela Voinescu,
	Sumit Gupta, Zhongqiu Han, Viresh Kumar, Jonathan Corbet,
	Shuah Khan, Huang Rui, Mario Limonciello, Perry Yuan,
	K Prateek Nayak, Srinivas Pandruvada, Len Brown, Saravana Kannan,
	linux-pm, linux-doc
In-Reply-To: <CAJZ5v0hqjdd79J-Hi=mSLMm2Fxhia+xa6iguJwZy2-pRBSJ6gA@mail.gmail.com>

Hello Rafael,

On 6/8/26 18:50, Rafael J. Wysocki wrote:
> Hi,
>
> On Wed, Jun 3, 2026 at 9:49 AM Pierre Gondois <pierre.gondois@arm.com> wrote:
>> Hello Rafael,
>>
>> On 6/1/26 20:08, Rafael J. Wysocki wrote:
>>> On Thu, May 28, 2026 at 11:10 AM Pierre Gondois <pierre.gondois@arm.com> wrote:
>>>> Prior to [1], drivers were setting policy->min/max and
>>>> the value was used as a QoS constraint. After that change,
>>>> the values were only temporarily used: cpufreq_set_policy()
>>>> ultimately overriding them through:
>>>> cpufreq_policy_online()
>>>> \-cpufreq_init_policy()
>>>>     \-cpufreq_set_policy()
>>>>       \-/* Set policy->min/max */
>>>>
>>>> This patch reinstates the initial behaviour. This will allow
>>>> drivers to request min/max QoS frequencies if desired.
>>>> For instance, the cppc driver advertises a lowest non-linear
>>>> frequency, which should be used as a min QoS value.
>>>>
>>>> To avoid having drivers setting policy->min/max to default
>>>> values which are considered as QoS values (i.e. the reason
>>>> why [1] was introduced), remove the initialization of
>>>> policy->min/max in .init() callbacks wherever the
>>>> policy->min/max values are identical to the
>>>> policy->cpuinfo.min/max_freq.
>>>>
>>>> Indeed, the previous patch ("cpufreq: Set default
>>>> policy->min/max values for all drivers") makes this initialization
>>>> redundant.
>>>>
>>>> The only drivers where these values are different are:
>>>> - gx-suspmod.c (min)
>>>> - cppc-cpufreq.c (min)
>>>> - longrun.c
>>>>
>>>> [1]
>>>> commit 521223d8b3ec ("cpufreq: Fix initialization of min and
>>>> max frequency QoS requests")
>>>>
>>>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
>>>> Acked-by: Jie Zhan <zhanjie9@hisilicon.com>
>>> sashiko.dev has some feedback on this patch and appears to have a point:
>>>
>>> https://sashiko.dev/#/patchset/20260528090913.2759118-1-pierre.gondois%40arm.com
>>>
>>> Can you have a look at it please?
>>>
>> [sashiko]
>>
>>   > Does removing the policy->max = max_freq assignment here break UAPI
>>   > expectations by exposing the unlisted boost frequency in
>>   > scaling_max_freq?
>>   >
>>   > Commit 538b0188da4653 intentionally allowed drivers like acpi-cpufreq
>>   > to set policy->cpuinfo.max_freq to a higher boost frequency while
>>   > relying on cpufreq_frequency_table_cpuinfo() to clamp policy->max to
>>   > the frequency table's nominal maximum (max_freq). This ensured that
>>   > user-space tools saw the nominal maximum in scaling_max_freq.
>>   >
>>   > Although commit 521223d8b3ec temporarily disrupted this by defaulting
>>   > the QoS max to -1, a subsequent patch in this series changes the core
>>   > to initialize the QoS request using policy->max.
>>
>> Effectively PATCH [4/4] cpufreq: Use policy->min/max init as QoS request
>> now uses the policy->max value set by the .init() callback to set
>> the max_freq_req QoS constraint.
>>
>>   >
>>   > If the policy->max = max_freq assignment were preserved, the subsequent
>>   > patch would successfully use the nominal frequency as the QoS max
>>   > request, restoring the correct clamping behavior.
>>
>> IIUC this suggests to use the nominal freq. as the QoS max request.
>> This was behaving like that prior to 521223d8b3ec. However doing
>> that would mean that if boost is enabled and the max_freq_req sysfs
>> is not updated, then the frequency would still be clamped by
>> the max_freq_req. 521223d8b3ec intended to correct that.
>>
>> Sashiko seems to suggest modifications to come back to the
>> pre-521223d8b3ec behaviour, but I think 521223d8b3ec is correct
>> and we should conserve this behaviour.
> So there is some confusion in the patch changelogs of this series, but
> not in the code, regarding the role of the last argument of
> freq_qos_add_request().  Namely, that argument is the initial request
> value for the given request object which is subsequently managed by
> user space.  User space may in fact change it to whatever value it
> wants (either lower or higher) and it is only taken into account along
> with the other requests in the given chain.  IMV it is better to
> clarify that, so I have updated the changelogs when applying the
> patches.
>
> Please see
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=bleeding-edge&id=8c83947c5dbbd49b36d08bb99e344327c6278781
>
> and its ancestors and let me know if there's anything missing in the
> changelogs thereof.

The new commit message is indeed clearer.
Thanks for the update.

Regards,

Pierre


> Thanks!

^ permalink raw reply

* Re: [PATCH v3 5/6] kselftest: alloc_tag: add kselftest for ioctl interface
From: Hao Ge @ 2026-06-09  6:26 UTC (permalink / raw)
  To: Abhishek Bapat
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda, Suren Baghdasaryan, Andrew Morton, Kent Overstreet
In-Reply-To: <49f725a7-577d-4036-bd5a-5a33fc9e17c3@linux.dev>


On 2026/6/9 14:09, Hao Ge wrote:
> Hi Abhishek
>
>
> On 2026/6/6 07:36, Abhishek Bapat wrote:
>> Introduce a kselftest to verify the new IOCTL-based interface for
>> /proc/allocinfo. The test covers:
>>
>> 1. Validation of the filename filter.
>> 2. Validation of the function filter.
>>
>> The first test validates the functionality of the filename filter. Using
>> "mm/memory.c" as the candidate filename filter, it retrieves filtered
>> entries from both procfs and ioctl and matches the first VEC_MAX_ENTRIES
>> entries.
>>
>> The second test validates the functionality of the function filter.
>> It uses "dup_mm" as the candidate function as we do not expect this
>> function name to change frequently and hence won't be needing to modify
>> this test often.
>>
>> Note that both the tests match line no, function name and file name
>> fields. Bytes allocated and calls are not matched as those values may
>> change in the time when the data is being read from procfs and ioctl and
>> hence can lead to false negatives.
>>
>> Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
>> ---
>>   MAINTAINERS                                   |   1 +
>>   tools/testing/selftests/alloc_tag/Makefile    |   9 +
>>   .../alloc_tag/allocinfo_ioctl_test.c          | 313 ++++++++++++++++++
>>   3 files changed, 323 insertions(+)
>>   create mode 100644 tools/testing/selftests/alloc_tag/Makefile
>>   create mode 100644 tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 77f3fc487691..80560f5f1292 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -16713,6 +16713,7 @@ F:    include/linux/alloc_tag.h
>>   F:    include/linux/pgalloc_tag.h
>>   F:    include/uapi/linux/alloc_tag.h
>>   F:    lib/alloc_tag.c
>> +F:    tools/testing/selftests/alloc_tag/
>>     MEMORY CONTROLLER DRIVERS
>>   M:    Krzysztof Kozlowski <krzk@kernel.org>
>> diff --git a/tools/testing/selftests/alloc_tag/Makefile b/tools/testing/selftests/alloc_tag/Makefile
>> new file mode 100644
>> index 000000000000..f2b8fc022c3b
>> --- /dev/null
>> +++ b/tools/testing/selftests/alloc_tag/Makefile
>> @@ -0,0 +1,9 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +
>> +TEST_GEN_PROGS := allocinfo_ioctl_test
>> +
>> +CFLAGS += -Wall
>> +CFLAGS += -I../../../../usr/include
>> +
>> +include ../lib.mk
>> +
>> diff --git a/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c b/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
>> new file mode 100644
>> index 000000000000..5c3c16e86c23
>> --- /dev/null
>> +++ b/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
>> @@ -0,0 +1,313 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +
>> +/* kselftest for allocinfo ioctl
>> + * allocinfo ioctl retrives allocinfo data through ioctl
>
>
> nit: s/retrives/retrieves/
>
>
> I've applied the full patch series locally and ran the kselftest, all 
> 4 tests pass:
>
> [root@localhost alloc_tag]# ./allocinfo_ioctl_test
> 1..4
> ok 1 test_filename_filter
> ok 2 test_function_filter
> ok 3 test_size_filter
> ok 4 test_lineno_filter
> # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
>
> But there are no tests for ALLOCINFO_FILTER_MASK_MODNAME and
>
> ALLOCINFO_FILTER_MASK_INACCURATE.


Sorry, please disregard my suggestion about adding tests for
ALLOCINFO_FILTER_MASK_MODNAME and ALLOCINFO_FILTER_MASK_INACCURATE.

ALLOCINFO_FILTER_MASK_MODNAME depends on the kernel config and also requires
the module to be loaded. ALLOCINFO_FILTER_MASK_INACCURATE entries may not
be common, unless we can find a stable way to produce them.


>
>
> Thanks
>
> Best Regards
>
> Hao
>
>> + * Copyright (C) 2026 Google, Inc.
>> + */
>> +
>> +#include <errno.h>
>> +#include <fcntl.h>
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <string.h>
>> +#include <stdbool.h>
>> +#include <unistd.h>
>> +#include <sys/ioctl.h>
>> +#include <linux/types.h>
>> +#include <linux/alloc_tag.h>
>> +#include "../kselftest.h"
>> +
>> +#define MAX_LINE_LEN        512
>> +#define ALLOCINFO_PROC        "/proc/allocinfo"
>> +
>> +enum ioctl_ret {
>> +    IOCTL_SUCCESS = 0,
>> +    IOCTL_FAILURE = 1,
>> +    IOCTL_INVALID_DATA = 2,
>> +};
>> +
>> +#define VEC_MAX_ENTRIES 32
>> +
>> +struct allocinfo_tag_data_vec {
>> +    struct allocinfo_tag_data tag[VEC_MAX_ENTRIES];
>> +    __u64 count;
>> +};
>> +
>> +static inline int __allocinfo_get_content_id(int dev_fd, struct allocinfo_content_id *params)
>> +{
>> +    return ioctl(dev_fd, ALLOCINFO_IOC_CONTENT_ID, params);
>> +}
>> +
>> +static inline int __allocinfo_get_at(int dev_fd, struct allocinfo_get_at *params)
>> +{
>> +    return ioctl(dev_fd, ALLOCINFO_IOC_GET_AT, params);
>> +}
>> +
>> +static inline int __allocinfo_get_next(int dev_fd, struct allocinfo_tag_data *params)
>> +{
>> +    return ioctl(dev_fd, ALLOCINFO_IOC_GET_NEXT, params);
>> +}
>> +
>> +static bool match_entry(const struct allocinfo_tag_data *procfs_entry,
>> +            const struct allocinfo_tag_data *tag_data,
>> +            bool match_bytes, bool match_calls, bool match_lineno,
>> +            bool match_function, bool match_filename)
>> +{
>> +    if (match_bytes && tag_data->counter.bytes != procfs_entry->counter.bytes) {
>> +        ksft_print_msg("size retrieved through ioctl does not match procfs\n");
>> +        return false;
>> +    }
>> +
>> +    if (match_calls && tag_data->counter.calls != procfs_entry->counter.calls) {
>> +        ksft_print_msg("call count retrieved through ioctl does not match procfs\n");
>> +        return false;
>> +    }
>> +
>> +    if (match_lineno && tag_data->tag.lineno != procfs_entry->tag.lineno) {
>> +        ksft_print_msg("lineno retrieved through ioctl does not match procfs\n");
>> +        return false;
>> +    }
>> +
>> +    if (match_function &&
>> +        strncmp(tag_data->tag.function, procfs_entry->tag.function, ALLOCINFO_STR_SIZE)) {
>> +        ksft_print_msg("function retrieved through ioctl does not match procfs\n");
>> +        return false;
>> +    }
>> +
>> +    if (match_filename &&
>> +        strncmp(tag_data->tag.filename, procfs_entry->tag.filename, ALLOCINFO_STR_SIZE)) {
>> +        ksft_print_msg("filename retrieved through ioctl does not match procfs\n");
>> +        return false;
>> +    }
>> +    return true;
>> +}
>> +
>> +static bool match_entries(const struct allocinfo_tag_data_vec *procfs_entries,
>> +              const struct allocinfo_tag_data_vec *tags,
>> +              bool match_bytes, bool match_calls, bool match_lineno,
>> +              bool match_function, bool match_filename)
>> +{
>> +    __u64 i;
>> +
>> +    if (procfs_entries->count != tags->count) {
>> +        ksft_print_msg("Entry count mismatch. ioctl entries: %llu, proc entries: %llu\n",
>> +                   tags->count, procfs_entries->count);
>> +        return false;
>> +    }
>> +    for (i = 0; i < procfs_entries->count; i++) {
>> +        if (!match_entry(&procfs_entries->tag[i], &tags->tag[i],
>> +                 match_bytes, match_calls, match_lineno,
>> +                 match_function, match_filename)) {
>> +            ksft_print_msg("%lluth entry does not match.\n", i);
>> +            return false;
>> +        }
>> +    }
>> +    return true;
>> +}
>> +
>> +static int get_filtered_procfs_entries(struct allocinfo_tag_data_vec *procfs_entries,
>> +                       const struct allocinfo_filter *filter, int fd)
>> +{
>> +    FILE *fp = fdopen(fd, "r");
>> +    char line[MAX_LINE_LEN];
>> +    int matches;
>> +    struct allocinfo_tag_data procfs_entry;
>> +
>> +    if (!fp) {
>> +        ksft_print_msg("Failed to open " ALLOCINFO_PROC " for reading\n");
>> +        return 1;
>> +    }
>> +    memset(procfs_entries, 0, sizeof(*procfs_entries));
>> +    while (fgets(line, sizeof(line), fp) && procfs_entries->count < VEC_MAX_ENTRIES) {
>> +
>> +        memset(&procfs_entry, 0, sizeof(procfs_entry));
>> +        matches = sscanf(line, "%llu %llu %[^:]:%llu func:%s",
>> +                 &procfs_entry.counter.bytes,
>> +                 &procfs_entry.counter.calls,
>> +                 procfs_entry.tag.filename,
>> +                 &procfs_entry.tag.lineno,
>> +                 procfs_entry.tag.function);
>> +
>> +        if (matches != 5)
>> +            continue;
>> +
>> +        if (filter->mask & ALLOCINFO_FILTER_MASK_FILENAME) {
>> +            if (strncmp(procfs_entry.tag.filename,
>> +                    filter->fields.filename, ALLOCINFO_STR_SIZE))
>> +                continue;
>> +        }
>> +        if (filter->mask & ALLOCINFO_FILTER_MASK_FUNCTION) {
>> +            if (strncmp(procfs_entry.tag.function,
>> +                    filter->fields.function, ALLOCINFO_STR_SIZE))
>> +                continue;
>> +        }
>> +        if (filter->mask & ALLOCINFO_FILTER_MASK_LINENO) {
>> +            if (procfs_entry.tag.lineno != filter->fields.lineno)
>> +                continue;
>> +        }
>> +        if (filter->mask & ALLOCINFO_FILTER_MASK_MIN_SIZE) {
>> +            if (procfs_entry.counter.bytes < filter->min_size)
>> +                continue;
>> +        }
>> +        if (filter->mask & ALLOCINFO_FILTER_MASK_MAX_SIZE) {
>> +            if (procfs_entry.counter.bytes > filter->max_size)
>> +                continue;
>> +        }
>> +
>> +        memcpy(&procfs_entries->tag[procfs_entries->count++], &procfs_entry,
>> +               sizeof(procfs_entry));
>> +    }
>> +    return 0;
>> +}
>> +
>> +static enum ioctl_ret get_filtered_ioctl_entries(struct allocinfo_tag_data_vec *tags,
>> +                         const struct allocinfo_filter *filter, int fd,
>> +                         __u64 start_pos)
>> +{
>> +    struct allocinfo_content_id start_cont_id, end_cont_id;
>> +    struct allocinfo_get_at get_at_params;
>> +    const int max_retries = 10;
>> +    int retry_count = 0;
>> +    int status;
>> +
>> +    /*
>> +     * __allocinfo_get_content_id may return different values if a kernel module was loaded
>> +     * between the two calls. If that happens, the data gathered cannot be considered consistent
>> +     * and hence needs to be fetched again to avoid flakiness.
>> +     */
>> +    do {
>> +        if (__allocinfo_get_content_id(fd, &start_cont_id)) {
>> +            ksft_print_msg("allocinfo_get_content_id failed\n");
>> +            return IOCTL_FAILURE;
>> +        }
>> +
>> +        memset(tags, 0, sizeof(*tags));
>> +        memset(&get_at_params, 0, sizeof(get_at_params));
>> +        memcpy(&get_at_params.filter, filter, sizeof(*filter));
>> +        get_at_params.pos = start_pos;
>> +        if (__allocinfo_get_at(fd, &get_at_params)) {
>> +            ksft_print_msg("allocinfo_get_at failed\n");
>> +            return IOCTL_FAILURE;
>> +        }
>> +        memcpy(&tags->tag[tags->count++], &get_at_params.data, sizeof(get_at_params.data));
>> +
>> +        while (tags->count < VEC_MAX_ENTRIES &&
>> +               __allocinfo_get_next(fd, &tags->tag[tags->count]) == 0)
>> +            tags->count++;
>> +
>> +        if (__allocinfo_get_content_id(fd, &end_cont_id)) {
>> +            ksft_print_msg("allocinfo_get_content_id failed\n");
>> +            return IOCTL_FAILURE;
>> +        }
>> +
>> +        if (start_cont_id.id == end_cont_id.id) {
>> +            status = IOCTL_SUCCESS;
>> +        } else {
>> +            ksft_print_msg("allocinfo_get_content_id mismatch, retrying...\n");
>> +            status = IOCTL_INVALID_DATA;
>> +        }
>> +    } while (status == IOCTL_INVALID_DATA && retry_count++ < max_retries);
>> +
>> +    return status;
>> +}
>> +
>> +static int run_filter_test(const struct allocinfo_filter *filter)
>> +{
>> +    int fd;
>> +    struct allocinfo_tag_data_vec *tags = malloc(sizeof(*tags));
>> +    struct allocinfo_tag_data_vec *procfs_entries = malloc(sizeof(*procfs_entries));
>> +    int ioctl_status;
>> +    int ret = KSFT_PASS;
>> +
>> +    if (!tags || !procfs_entries) {
>> +        ksft_print_msg("Memory allocation failed.\n");
>> +        ret = KSFT_FAIL;
>> +        goto freemem;
>> +    }
>> +
>> +    fd = open(ALLOCINFO_PROC, O_RDONLY);
>> +    if (fd < 0) {
>> +        ksft_exit_skip("Failed to open " ALLOCINFO_PROC ": %s\n", strerror(errno));
>> +        ret = KSFT_FAIL;
>> +        goto freemem;
>> +    }
>> +
>> +    if (get_filtered_procfs_entries(procfs_entries, filter, fd)) {
>> +        ksft_print_msg("Error retrieving entries from " ALLOCINFO_PROC "\n");
>> +        ret = KSFT_FAIL;
>> +        goto exit;
>> +    }
>> +
>> +    if (procfs_entries->count == 0) {
>> +        ksft_print_msg("No entries found in " ALLOCINFO_PROC ", skipping test\n");
>> +        ret = KSFT_SKIP;
>> +        goto exit;
>> +    }
>> +
>> +    ioctl_status = get_filtered_ioctl_entries(tags, filter, fd, 0);
>> +    if (ioctl_status == IOCTL_INVALID_DATA) {
>> +        ksft_print_msg("Trouble retrieving valid IOCTL entries, skipping.\n");
>> +        ret = KSFT_SKIP;
>> +        goto exit;
>> +    }
>> +    if (ioctl_status == IOCTL_FAILURE) {
>> +        ksft_print_msg("Error retrieving IOCTL entries.\n");
>> +        ret = KSFT_FAIL;
>> +        goto exit;
>> +    }
>> +
>> +    if (!match_entries(procfs_entries, tags, false, false, true, true, true))
>> +        ret = KSFT_FAIL;
>> +
>> +exit:
>> +    close(fd);
>> +freemem:
>> +    free(tags);
>> +    free(procfs_entries);
>> +    return ret;
>> +}
>> +
>> +static int test_filename_filter(void)
>> +{
>> +    struct allocinfo_filter filter;
>> +    const char *target_filename = "mm/memory.c";
>> +
>> +    memset(&filter, 0, sizeof(filter));
>> +    filter.mask |= ALLOCINFO_FILTER_MASK_FILENAME;
>> +    strncpy(filter.fields.filename, target_filename, ALLOCINFO_STR_SIZE);
>> +
>> +    return run_filter_test(&filter);
>> +}
>> +
>> +static int test_function_filter(void)
>> +{
>> +    struct allocinfo_filter filter;
>> +    const char *target_function = "dup_mm";
>> +
>> +    memset(&filter, 0, sizeof(filter));
>> +    filter.mask |= ALLOCINFO_FILTER_MASK_FUNCTION;
>> +    strncpy(filter.fields.function, target_function, ALLOCINFO_STR_SIZE);
>> +
>> +    return run_filter_test(&filter);
>> +}
>> +
>> +int main(int argc, char *argv[])
>> +{
>> +    int ret;
>> +
>> +    ksft_set_plan(2);
>> +
>> +    ret = test_filename_filter();
>> +    if (ret == KSFT_SKIP)
>> +        ksft_test_result_skip("Skipping test_filename_filter\n");
>> +    else
>> +        ksft_test_result(ret == KSFT_PASS, "test_filename_filter\n");
>> +
>> +    ret = test_function_filter();
>> +    if (ret == KSFT_SKIP)
>> +        ksft_test_result_skip("Skipping test_function_filter\n");
>> +    else
>> +        ksft_test_result(ret == KSFT_PASS, "test_function_filter\n");
>> +
>> +    ksft_finished();
>> +}
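
The retry loop in get_filtered_ioctl_entries() above is a
read-validate-retry pattern: sample an id, read the data, sample the id
again, and retry if the two ids differ. Below is a minimal standalone
sketch of that pattern, assuming nothing about the real ioctl interface.
All names here (snapshot_ops, fake_source, read_consistent) are
hypothetical stand-ins, and the fake source only simulates a module load
by bumping its id during a read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes, mirroring the three outcomes in the test. */
enum snap_ret { SNAP_OK = 0, SNAP_FAILURE = 1, SNAP_UNSTABLE = 2 };

/* Hypothetical callbacks: both return 0 on success, non-zero on error. */
struct snapshot_ops {
	int (*get_id)(void *ctx, uint64_t *id);
	int (*read_all)(void *ctx);
};

/* Fake data source: its id changes during the first bumps_left reads. */
struct fake_source {
	uint64_t id;
	int bumps_left;
	int reads;
};

static int fake_get_id(void *ctx, uint64_t *id)
{
	struct fake_source *src = ctx;

	*id = src->id;
	return 0;
}

static int fake_read_all(void *ctx)
{
	struct fake_source *src = ctx;

	src->reads++;
	if (src->bumps_left > 0) {
		/* Simulate the content changing while we were reading. */
		src->bumps_left--;
		src->id++;
	}
	return 0;
}

static const struct snapshot_ops fake_ops = {
	.get_id = fake_get_id,
	.read_all = fake_read_all,
};

/*
 * Try the read once, then retry up to max_retries more times while the
 * id sampled before the read differs from the id sampled after it.
 */
static enum snap_ret read_consistent(const struct snapshot_ops *ops,
				     void *ctx, int max_retries)
{
	uint64_t before, after;
	int attempt;

	assert(ops && ops->get_id && ops->read_all);
	for (attempt = 0; attempt <= max_retries; attempt++) {
		if (ops->get_id(ctx, &before))
			return SNAP_FAILURE;
		if (ops->read_all(ctx))
			return SNAP_FAILURE;
		if (ops->get_id(ctx, &after))
			return SNAP_FAILURE;
		if (before == after)
			return SNAP_OK;
	}
	return SNAP_UNSTABLE;
}
```

A caller would pass &fake_ops and a struct fake_source. Like the quoted
do/while loop, the sketch makes one initial attempt plus at most
max_retries retries before reporting the data as unstable.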

^ permalink raw reply

* Re: [PATCH 1/1] docs: Fix minor grammatical error
From: Thorsten Leemhuis @ 2026-06-09  6:15 UTC (permalink / raw)
  To: Brigham Campbell, Randy Dunlap, Jonathan Corbet, Shuah Khan,
	open list:DOCUMENTATION REPORTING ISSUES, open list
In-Reply-To: <DJ4A839X388E.376Q6KVB6JE14@brighamcampbell.com>

On 6/9/26 07:50, Brigham Campbell wrote:
> On Sat Jun 6, 2026 at 1:10 PM MDT, Randy Dunlap wrote:
>> Can't you just modify the first line only and leave the other 3 changed lines
>> intact?
> 
> I'm not as familiar with the kernel documentation project as I am with
> the code itself. I figured that it's generally preferred to maintain
> 80-character hard wrapping consistently across all documentation. Is it
> actually preferable to _not_ reflow text after editing in order to avoid
> munging the git history?

It's an "80-character wrapping" vs "keep patches small/simple/obvious"
situation where one has to weigh things up against each other.

If you changed one thing in a line and that left the line only something
like 60 characters long or less, then reflowing would be the right
thing, as the paragraph would otherwise look odd.

But in this case it won't matter much, so it's likely better to not
reflow to keep the patch smaller.

In the end it's a judgement call that the maintainer has to make. I
don't care much, but if I were forced to decide one way or the other,
I'd go with what Randy suggested.

BTW, thx for fixing this!

Ciao, Thorsten

^ permalink raw reply

* Re: [PATCH v3 5/6] kselftest: alloc_tag: add kselftest for ioctl interface
From: Hao Ge @ 2026-06-09  6:09 UTC (permalink / raw)
  To: Abhishek Bapat
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda, Suren Baghdasaryan, Andrew Morton, Kent Overstreet
In-Reply-To: <2e55b3b1388a4f7a59f670a83f222ba6c836ac4e.1780701922.git.abhishekbapat@google.com>

Hi Abhishek


On 2026/6/6 07:36, Abhishek Bapat wrote:
> Introduce a kselftest to verify the new IOCTL-based interface for
> /proc/allocinfo. The test covers:
>
> 1. Validation of the filename filter.
> 2. Validation of the function filter.
>
> The first test validates the functionality of the filename filter. Using
> "mm/memory.c" as the candidate filename filter, it retrieves filtered
> entries from both procfs and ioctl and matches the first VEC_MAX_ENTRIES
> entries.
>
> The second test validates the functionality of the function filter.
> It uses "dup_mm" as the candidate function as we do not expect this
> function name to change frequently and hence won't be needing to modify
> this test often.
>
> Note that both the tests match line no, function name and file name
> fields. Bytes allocated and calls are not matched as those values may
> change in the time when the data is being read from procfs and ioctl and
> hence can lead to false negatives.
>
> Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
> ---
>   MAINTAINERS                                   |   1 +
>   tools/testing/selftests/alloc_tag/Makefile    |   9 +
>   .../alloc_tag/allocinfo_ioctl_test.c          | 313 ++++++++++++++++++
>   3 files changed, 323 insertions(+)
>   create mode 100644 tools/testing/selftests/alloc_tag/Makefile
>   create mode 100644 tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 77f3fc487691..80560f5f1292 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16713,6 +16713,7 @@ F:	include/linux/alloc_tag.h
>   F:	include/linux/pgalloc_tag.h
>   F:	include/uapi/linux/alloc_tag.h
>   F:	lib/alloc_tag.c
> +F:	tools/testing/selftests/alloc_tag/
>   
>   MEMORY CONTROLLER DRIVERS
>   M:	Krzysztof Kozlowski <krzk@kernel.org>
> diff --git a/tools/testing/selftests/alloc_tag/Makefile b/tools/testing/selftests/alloc_tag/Makefile
> new file mode 100644
> index 000000000000..f2b8fc022c3b
> --- /dev/null
> +++ b/tools/testing/selftests/alloc_tag/Makefile
> @@ -0,0 +1,9 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +TEST_GEN_PROGS := allocinfo_ioctl_test
> +
> +CFLAGS += -Wall
> +CFLAGS += -I../../../../usr/include
> +
> +include ../lib.mk
> +
> diff --git a/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c b/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
> new file mode 100644
> index 000000000000..5c3c16e86c23
> --- /dev/null
> +++ b/tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
> @@ -0,0 +1,313 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +/* kselftest for allocinfo ioctl
> + * allocinfo ioctl retrives allocinfo data through ioctl


nit: s/retrives/retrieves/


I applied the full patch series locally and ran the kselftest; all 4
tests pass:

[root@localhost alloc_tag]# ./allocinfo_ioctl_test
1..4
ok 1 test_filename_filter
ok 2 test_function_filter
ok 3 test_size_filter
ok 4 test_lineno_filter
# Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0

But there are no tests for ALLOCINFO_FILTER_MASK_MODNAME and
ALLOCINFO_FILTER_MASK_INACCURATE.


Thanks

Best Regards

Hao

> + * Copyright (C) 2026 Google, Inc.
> + */
> +
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <stdbool.h>
> +#include <unistd.h>
> +#include <sys/ioctl.h>
> +#include <linux/types.h>
> +#include <linux/alloc_tag.h>
> +#include "../kselftest.h"
> +
> +#define MAX_LINE_LEN		512
> +#define ALLOCINFO_PROC		"/proc/allocinfo"
> +
> +enum ioctl_ret {
> +	IOCTL_SUCCESS = 0,
> +	IOCTL_FAILURE = 1,
> +	IOCTL_INVALID_DATA = 2,
> +};
> +
> +#define VEC_MAX_ENTRIES 32
> +
> +struct allocinfo_tag_data_vec {
> +	struct allocinfo_tag_data tag[VEC_MAX_ENTRIES];
> +	__u64 count;
> +};
> +
> +static inline int __allocinfo_get_content_id(int dev_fd, struct allocinfo_content_id *params)
> +{
> +	return ioctl(dev_fd, ALLOCINFO_IOC_CONTENT_ID, params);
> +}
> +
> +static inline int __allocinfo_get_at(int dev_fd, struct allocinfo_get_at *params)
> +{
> +	return ioctl(dev_fd, ALLOCINFO_IOC_GET_AT, params);
> +}
> +
> +static inline int __allocinfo_get_next(int dev_fd, struct allocinfo_tag_data *params)
> +{
> +	return ioctl(dev_fd, ALLOCINFO_IOC_GET_NEXT, params);
> +}
> +
> +static bool match_entry(const struct allocinfo_tag_data *procfs_entry,
> +			const struct allocinfo_tag_data *tag_data,
> +			bool match_bytes, bool match_calls, bool match_lineno,
> +			bool match_function, bool match_filename)
> +{
> +	if (match_bytes && tag_data->counter.bytes != procfs_entry->counter.bytes) {
> +		ksft_print_msg("size retrieved through ioctl does not match procfs\n");
> +		return false;
> +	}
> +
> +	if (match_calls && tag_data->counter.calls != procfs_entry->counter.calls) {
> +		ksft_print_msg("call count retrieved through ioctl does not match procfs\n");
> +		return false;
> +	}
> +
> +	if (match_lineno && tag_data->tag.lineno != procfs_entry->tag.lineno) {
> +		ksft_print_msg("lineno retrieved through ioctl does not match procfs\n");
> +		return false;
> +	}
> +
> +	if (match_function &&
> +	    strncmp(tag_data->tag.function, procfs_entry->tag.function, ALLOCINFO_STR_SIZE)) {
> +		ksft_print_msg("function retrieved through ioctl does not match procfs\n");
> +		return false;
> +	}
> +
> +	if (match_filename &&
> +	    strncmp(tag_data->tag.filename, procfs_entry->tag.filename, ALLOCINFO_STR_SIZE)) {
> +		ksft_print_msg("filename retrieved through ioctl does not match procfs\n");
> +		return false;
> +	}
> +	return true;
> +}
> +
> +static bool match_entries(const struct allocinfo_tag_data_vec *procfs_entries,
> +			  const struct allocinfo_tag_data_vec *tags,
> +			  bool match_bytes, bool match_calls, bool match_lineno,
> +			  bool match_function, bool match_filename)
> +{
> +	__u64 i;
> +
> +	if (procfs_entries->count != tags->count) {
> +		ksft_print_msg("Entry count mismatch. ioctl entries: %llu, proc entries: %llu\n",
> +			       tags->count, procfs_entries->count);
> +		return false;
> +	}
> +	for (i = 0; i < procfs_entries->count; i++) {
> +		if (!match_entry(&procfs_entries->tag[i], &tags->tag[i],
> +				 match_bytes, match_calls, match_lineno,
> +				 match_function, match_filename)) {
> +			ksft_print_msg("%lluth entry does not match.\n", i);
> +			return false;
> +		}
> +	}
> +	return true;
> +}
> +
> +static int get_filtered_procfs_entries(struct allocinfo_tag_data_vec *procfs_entries,
> +				       const struct allocinfo_filter *filter, int fd)
> +{
> +	FILE *fp = fdopen(fd, "r");
> +	char line[MAX_LINE_LEN];
> +	int matches;
> +	struct allocinfo_tag_data procfs_entry;
> +
> +	if (!fp) {
> +		ksft_print_msg("Failed to open " ALLOCINFO_PROC " for reading\n");
> +		return 1;
> +	}
> +	memset(procfs_entries, 0, sizeof(*procfs_entries));
> +	while (fgets(line, sizeof(line), fp) && procfs_entries->count < VEC_MAX_ENTRIES) {
> +
> +		memset(&procfs_entry, 0, sizeof(procfs_entry));
> +		matches = sscanf(line, "%llu %llu %[^:]:%llu func:%s",
> +				 &procfs_entry.counter.bytes,
> +				 &procfs_entry.counter.calls,
> +				 procfs_entry.tag.filename,
> +				 &procfs_entry.tag.lineno,
> +				 procfs_entry.tag.function);
> +
> +		if (matches != 5)
> +			continue;
> +
> +		if (filter->mask & ALLOCINFO_FILTER_MASK_FILENAME) {
> +			if (strncmp(procfs_entry.tag.filename,
> +				    filter->fields.filename, ALLOCINFO_STR_SIZE))
> +				continue;
> +		}
> +		if (filter->mask & ALLOCINFO_FILTER_MASK_FUNCTION) {
> +			if (strncmp(procfs_entry.tag.function,
> +				    filter->fields.function, ALLOCINFO_STR_SIZE))
> +				continue;
> +		}
> +		if (filter->mask & ALLOCINFO_FILTER_MASK_LINENO) {
> +			if (procfs_entry.tag.lineno != filter->fields.lineno)
> +				continue;
> +		}
> +		if (filter->mask & ALLOCINFO_FILTER_MASK_MIN_SIZE) {
> +			if (procfs_entry.counter.bytes < filter->min_size)
> +				continue;
> +		}
> +		if (filter->mask & ALLOCINFO_FILTER_MASK_MAX_SIZE) {
> +			if (procfs_entry.counter.bytes > filter->max_size)
> +				continue;
> +		}
> +
> +		memcpy(&procfs_entries->tag[procfs_entries->count++], &procfs_entry,
> +		       sizeof(procfs_entry));
> +	}
> +	return 0;
> +}
> +
> +static enum ioctl_ret get_filtered_ioctl_entries(struct allocinfo_tag_data_vec *tags,
> +						 const struct allocinfo_filter *filter, int fd,
> +						 __u64 start_pos)
> +{
> +	struct allocinfo_content_id start_cont_id, end_cont_id;
> +	struct allocinfo_get_at get_at_params;
> +	const int max_retries = 10;
> +	int retry_count = 0;
> +	int status;
> +
> +	/*
> +	 * __allocinfo_get_content_id may return different values if a kernel module was loaded
> +	 * between the two calls. If that happens, the data gathered cannot be considered consistent
> +	 * and hence needs to be fetched again to avoid flakiness.
> +	 */
> +	do {
> +		if (__allocinfo_get_content_id(fd, &start_cont_id)) {
> +			ksft_print_msg("allocinfo_get_content_id failed\n");
> +			return IOCTL_FAILURE;
> +		}
> +
> +		memset(tags, 0, sizeof(*tags));
> +		memset(&get_at_params, 0, sizeof(get_at_params));
> +		memcpy(&get_at_params.filter, filter, sizeof(*filter));
> +		get_at_params.pos = start_pos;
> +		if (__allocinfo_get_at(fd, &get_at_params)) {
> +			ksft_print_msg("allocinfo_get_at failed\n");
> +			return IOCTL_FAILURE;
> +		}
> +		memcpy(&tags->tag[tags->count++], &get_at_params.data, sizeof(get_at_params.data));
> +
> +		while (tags->count < VEC_MAX_ENTRIES &&
> +		       __allocinfo_get_next(fd, &tags->tag[tags->count]) == 0)
> +			tags->count++;
> +
> +		if (__allocinfo_get_content_id(fd, &end_cont_id)) {
> +			ksft_print_msg("allocinfo_get_content_id failed\n");
> +			return IOCTL_FAILURE;
> +		}
> +
> +		if (start_cont_id.id == end_cont_id.id) {
> +			status = IOCTL_SUCCESS;
> +		} else {
> +			ksft_print_msg("allocinfo_get_content_id mismatch, retrying...\n");
> +			status = IOCTL_INVALID_DATA;
> +		}
> +	} while (status == IOCTL_INVALID_DATA && retry_count++ < max_retries);
> +
> +	return status;
> +}
> +
> +static int run_filter_test(const struct allocinfo_filter *filter)
> +{
> +	int fd;
> +	struct allocinfo_tag_data_vec *tags = malloc(sizeof(*tags));
> +	struct allocinfo_tag_data_vec *procfs_entries = malloc(sizeof(*procfs_entries));
> +	int ioctl_status;
> +	int ret = KSFT_PASS;
> +
> +	if (!tags || !procfs_entries) {
> +		ksft_print_msg("Memory allocation failed.\n");
> +		ret = KSFT_FAIL;
> +		goto freemem;
> +	}
> +
> +	fd = open(ALLOCINFO_PROC, O_RDONLY);
> +	if (fd < 0) {
> +		ksft_exit_skip("Failed to open " ALLOCINFO_PROC ": %s\n", strerror(errno));
> +		ret = KSFT_FAIL;
> +		goto freemem;
> +	}
> +
> +	if (get_filtered_procfs_entries(procfs_entries, filter, fd)) {
> +		ksft_print_msg("Error retrieving entries from " ALLOCINFO_PROC "\n");
> +		ret = KSFT_FAIL;
> +		goto exit;
> +	}
> +
> +	if (procfs_entries->count == 0) {
> +		ksft_print_msg("No entries found in " ALLOCINFO_PROC ", skipping test\n");
> +		ret = KSFT_SKIP;
> +		goto exit;
> +	}
> +
> +	ioctl_status = get_filtered_ioctl_entries(tags, filter, fd, 0);
> +	if (ioctl_status == IOCTL_INVALID_DATA) {
> +		ksft_print_msg("Trouble retrieving valid IOCTL entries, skipping.\n");
> +		ret = KSFT_SKIP;
> +		goto exit;
> +	}
> +	if (ioctl_status == IOCTL_FAILURE) {
> +		ksft_print_msg("Error retrieving IOCTL entries.\n");
> +		ret = KSFT_FAIL;
> +		goto exit;
> +	}
> +
> +	if (!match_entries(procfs_entries, tags, false, false, true, true, true))
> +		ret = KSFT_FAIL;
> +
> +exit:
> +	close(fd);
> +freemem:
> +	free(tags);
> +	free(procfs_entries);
> +	return ret;
> +}
> +
> +static int test_filename_filter(void)
> +{
> +	struct allocinfo_filter filter;
> +	const char *target_filename = "mm/memory.c";
> +
> +	memset(&filter, 0, sizeof(filter));
> +	filter.mask |= ALLOCINFO_FILTER_MASK_FILENAME;
> +	strncpy(filter.fields.filename, target_filename, ALLOCINFO_STR_SIZE);
> +
> +	return run_filter_test(&filter);
> +}
> +
> +static int test_function_filter(void)
> +{
> +	struct allocinfo_filter filter;
> +	const char *target_function = "dup_mm";
> +
> +	memset(&filter, 0, sizeof(filter));
> +	filter.mask |= ALLOCINFO_FILTER_MASK_FUNCTION;
> +	strncpy(filter.fields.function, target_function, ALLOCINFO_STR_SIZE);
> +
> +	return run_filter_test(&filter);
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	int ret;
> +
> +	ksft_set_plan(2);
> +
> +	ret = test_filename_filter();
> +	if (ret == KSFT_SKIP)
> +		ksft_test_result_skip("Skipping test_filename_filter\n");
> +	else
> +		ksft_test_result(ret == KSFT_PASS, "test_filename_filter\n");
> +
> +	ret = test_function_filter();
> +	if (ret == KSFT_SKIP)
> +		ksft_test_result_skip("Skipping test_function_filter\n");
> +	else
> +		ksft_test_result(ret == KSFT_PASS, "test_function_filter\n");
> +
> +	ksft_finished();
> +}
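
The procfs side of the comparison in the quoted test parses each
/proc/allocinfo line with sscanf(). Below is a minimal standalone sketch
of that parsing step. It assumes the line layout implied by the quoted
format string ("bytes calls file:line func:name"). STR_SIZE and struct
allocinfo_line are hypothetical placeholders, not the uapi definitions.
Unlike the quoted call, the sketch bounds the two string conversions
with field widths so an overlong name cannot overflow the buffers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Placeholder size; not the real value of ALLOCINFO_STR_SIZE. */
#define STR_SIZE 64

/* Hypothetical container for one parsed row. */
struct allocinfo_line {
	unsigned long long bytes;
	unsigned long long calls;
	unsigned long long lineno;
	char filename[STR_SIZE];
	char function[STR_SIZE];
};

/*
 * Parse one row. Returns 0 on success and -1 when the line does not
 * have all five fields (for example, a header line).
 */
static int parse_allocinfo_line(const char *line, struct allocinfo_line *out)
{
	int matches;

	assert(line != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	/* The widths (63) leave room for the NUL in STR_SIZE-byte buffers. */
	matches = sscanf(line, "%llu %llu %63[^:]:%llu func:%63s",
			 &out->bytes, &out->calls, out->filename,
			 &out->lineno, out->function);

	return matches == 5 ? 0 : -1;
}
```

Rows that fail to parse return -1, which corresponds to the
"matches != 5" check that makes the quoted loop skip a line.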

^ permalink raw reply

* Re: [RFC PATCH v1 00/13] exec: add spawn templates for repeated executable startup
From: Florian Weimer @ 2026-06-09  6:08 UTC (permalink / raw)
  To: Jann Horn
  Cc: Mateusz Guzik, Christian Brauner, Li Chen, Kees Cook,
	Alexander Viro, linux-fsdevel, linux-api, linux-kernel, linux-mm,
	linux-arch, linux-doc, linux-kselftest, x86, Arnd Bergmann,
	Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Jan Kara, Jonathan Corbet,
	Shuah Khan
In-Reply-To: <CAG48ez38OEE8ZPLyU6nr9=cYx-hMsdoh5WRrv-GMZGMDKyyOTA@mail.gmail.com>

* Jann Horn:

>> Per the above, the primary win would stem from *NOT* messing with mm.
>
> As you write below, I think we have that with CLONE_MM? The C function
> vfork() is kind of a terrible API because of its returns-twice
> behavior, but I think if process cloning with CLONE_VM|CLONE_VFORK was
> wrapped by libc in a way similar to clone() (with the child executing
> a separate handler function), or if it was used in the implementation
> of some higher-level process-spawning API, it would be a perfectly
> fine API?

No, there is still a problem with SIGTSTP handling because we cannot
atomically unmask the signal during execve.  We need to unblock SIGTSTP
before execve in the new process, but this means that it can get
suspended by SIGTSTP.  Consequently, the execve never happens and the
original process is stuck in vfork:

  posix_spawn: parent can get stuck in uninterruptible sleep if child
  receives SIGTSTP early enough
  <https://inbox.sourceware.org/libc-help/2921668c-773e-465d-9480-0abb6f979bf9@www.fastmail.com/>
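
To make the window concrete, here is a minimal sketch of a vfork+execve
helper, assuming Linux and a single-threaded caller. spawn_and_wait is a
hypothetical name, not a libc interface, and this is not how any
particular libc implements posix_spawn. The parent blocks all signals so
that no handler runs in the child while it shares the parent's memory.
The child then has to restore the mask before execve, and that is where
a SIGTSTP can stop it while the parent is still suspended in vfork.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with vfork+execve and return the
 * wait status, or -1 on error. Calling sigprocmask() in the vfork child
 * relies on Linux behaviour; POSIX leaves it undefined.
 */
static int spawn_and_wait(char *const argv[], char *const envp[])
{
	sigset_t all, saved;
	pid_t pid;
	int status;

	assert(argv != NULL && argv[0] != NULL);

	/* Block everything so no handler runs in the child's shared mm. */
	sigfillset(&all);
	if (sigprocmask(SIG_BLOCK, &all, &saved))
		return -1;

	pid = vfork();
	if (pid == 0) {
		/*
		 * Child: the new program should start with the original
		 * mask, so unblock here. From this point until execve
		 * succeeds, a pending SIGTSTP can stop the child, and the
		 * parent stays blocked in vfork() the whole time.
		 */
		sigprocmask(SIG_SETMASK, &saved, NULL);
		execve(argv[0], argv, envp);
		_exit(127);
	}

	/* Parent resumes only after the child has exec'd or exited. */
	sigprocmask(SIG_SETMASK, &saved, NULL);
	if (pid < 0)
		return -1;

	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return status;
}
```

Closing that window would need a way to restore the signal mask as part
of execve itself, which is the atomic unmasking that is missing today.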

More on the low-level side, it's difficult to make sure that execve gets
a consistent snapshot of the environ vector.  Both vfork and execve need
to be async-signal-safe.  Any locking or memory allocation (except for
the stack …) persists in the original process after vfork returns.  The
environ vector can be large, so making a copy on the stack is not ideal.
It's even harder for getenv/setenv/unsetenv implementations that use
locking instead of software transactional memory.

In general, I prefer the vfork+execve API over things like posix_spawn
because eventually, you have dependencies between the syslets, or need
control flow.  This introduces a lot of complexity.  Conceptually,
vfork+execve is much simpler, and in many ways quite safe (even mutexes
work as long as they do not need a correct TID).

Thanks,
Florian


^ permalink raw reply

* Re: [PATCH 1/1] docs: Fix minor grammatical error
From: Brigham Campbell @ 2026-06-09  5:50 UTC (permalink / raw)
  To: Randy Dunlap, Brigham Campbell, Thorsten Leemhuis,
	Jonathan Corbet, Shuah Khan,
	open list:DOCUMENTATION REPORTING ISSUES, open list
In-Reply-To: <4c9b2927-29e0-40a6-bed4-14142dedd2ef@infradead.org>

On Sat Jun 6, 2026 at 1:10 PM MDT, Randy Dunlap wrote:
> Can't you just modify the first line only and leave the other 3 changed lines
> intact?

I'm not as familiar with the kernel documentation project as I am with
the code itself. I figured that it's generally preferred to maintain
80-character hard wrapping consistently across all documentation. Is it
actually preferable to _not_ reflow text after editing in order to avoid
munging the git history?

Thanks for your time, Randy,
Brigham

^ permalink raw reply

* [PATCH iproute2-next 7/7] devlink: add scope filter to resource show
From: Tariq Toukan @ 2026-06-09  5:39 UTC (permalink / raw)
  To: Stephen Hemminger, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, David S. Miller
  Cc: David Ahern, Donald Hunter, Simon Horman, Jiri Pirko,
	Jonathan Corbet, Shuah Khan, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Shuah Khan, Matthieu Baerts (NGI0),
	Chuck Lever, Or Har-Toov, Carolina Jubran, Moshe Shemesh,
	Shay Drori, Dragos Tatulea, Daniel Zahka, Shahar Shitrit,
	Jacob Keller, Cosmin Ratiu, Parav Pandit, Kees Cook,
	Adithya Jayachandran, Daniel Jurgens, netdev, linux-kernel,
	linux-doc, linux-rdma, linux-kselftest, Gal Pressman,
	Ido Schimmel, Jiri Pirko, Petr Machata
In-Reply-To: <20260609053953.487152-1-tariqt@nvidia.com>

From: Or Har-Toov <ohartoov@nvidia.com>

Add an optional 'scope { dev | port }' argument to 'devlink resource
show'. It is given in place of a device handle and filters the full dump
to device-level or port-level resources only.

Example - dump only device-level resources:

  $ devlink resource show scope dev
  pci/0000:03:00.0:
    name max_local_SFs size 128 unit entry dpipe_tables none
    name max_external_SFs size 128 unit entry dpipe_tables none
  pci/0000:03:00.1:
    name max_local_SFs size 128 unit entry dpipe_tables none
    name max_external_SFs size 128 unit entry dpipe_tables none

Example - dump only port-level resources:

  $ devlink resource show scope port
  pci/0000:03:00.0/196608:
    name max_SFs size 128 unit entry dpipe_tables none
  pci/0000:03:00.0/196609:
    name max_SFs size 128 unit entry dpipe_tables none
  pci/0000:03:00.1/196708:
    name max_SFs size 128 unit entry dpipe_tables none
  pci/0000:03:00.1/196709:
    name max_SFs size 128 unit entry dpipe_tables none
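
The argument handling for the usage above can be sketched standalone as
follows. This is an illustration only: parse_show_args, struct show_args
and the SCOPE_* values are hypothetical placeholders and do not reflect
the real DEVLINK_RESOURCE_SCOPE_* uapi values or the dl_argv helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Placeholder bit values; the real uapi constants may differ. */
enum {
	SCOPE_DEV = 1u << 0,
	SCOPE_PORT = 1u << 1,
};

/* Hypothetical parse result for "resource show" arguments. */
struct show_args {
	const char *handle;	/* DEV or DEV/PORT_INDEX, NULL for a dump */
	uint32_t scope_mask;	/* 0 means no scope filter was given */
};

/*
 * Accept either nothing, a handle, or "scope { dev | port }".
 * Returns 0 on success and -EINVAL on a missing or unknown scope.
 */
static int parse_show_args(int argc, const char *const argv[],
			   struct show_args *out)
{
	assert(out != NULL);
	memset(out, 0, sizeof(*out));

	if (argc == 0)
		return 0;	/* plain dump of everything */

	if (strcmp(argv[0], "scope") == 0) {
		if (argc < 2)
			return -EINVAL;
		if (strcmp(argv[1], "dev") == 0)
			out->scope_mask = SCOPE_DEV;
		else if (strcmp(argv[1], "port") == 0)
			out->scope_mask = SCOPE_PORT;
		else
			return -EINVAL;
		return 0;	/* scope implies a filtered dump, no handle */
	}

	out->handle = argv[0];
	return 0;
}
```

In this sketch, as in the usage line, a handle and a scope filter are
alternatives and are never combined.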

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 bash-completion/devlink     |  7 ++++++
 devlink/devlink.c           | 47 ++++++++++++++++++++++++++++++++-----
 man/man8/devlink-resource.8 | 11 ++++++++-
 3 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/bash-completion/devlink b/bash-completion/devlink
index 3d8452a8869e..bfc0083f647b 100644
--- a/bash-completion/devlink
+++ b/bash-completion/devlink
@@ -702,9 +702,16 @@ _devlink_resource()
 {
     case $command in
         show)
+            case $prev in
+                scope)
+                    COMPREPLY=( $( compgen -W "dev port" -- "$cur" ) )
+                    return
+                    ;;
+            esac
             if [[ $cword -eq 3 ]]; then
                 _devlink_direct_complete "dev"
                 _devlink_direct_complete "port"
+                COMPREPLY+=( $( compgen -W "scope" -- "$cur" ) )
             fi
             return
             ;;
diff --git a/devlink/devlink.c b/devlink/devlink.c
index 4224b7fa6792..1a94b6413048 100644
--- a/devlink/devlink.c
+++ b/devlink/devlink.c
@@ -314,6 +314,7 @@ static int ifname_map_update(struct ifname_map *ifname_map, const char *ifname)
 #define DL_OPT_PORT_FN_RATE_TC_BWS	BIT(59)
 #define DL_OPT_HEALTH_REPORTER_BURST_PERIOD	BIT(60)
 #define DL_OPT_PARAM_SET_DEFAULT	BIT(61)
+#define DL_OPT_RESOURCE_SCOPE		BIT(62)
 
 struct dl_opts {
 	uint64_t present; /* flags of present items */
@@ -382,6 +383,7 @@ struct dl_opts {
 	bool selftests_opt[DEVLINK_ATTR_SELFTEST_ID_MAX + 1];
 	struct nla_bitfield32 port_fn_caps;
 	uint32_t port_fn_max_io_eqs;
+	uint32_t resource_scope_mask;
 };
 
 struct dl {
@@ -1467,6 +1469,19 @@ static int flash_overwrite_section_get(const char *sectionstr, uint32_t *mask)
 	return 0;
 }
 
+static int resource_scope_get(const char *scopestr, uint32_t *scope)
+{
+	if (strcmp(scopestr, "dev") == 0) {
+		*scope = DEVLINK_RESOURCE_SCOPE_DEV;
+	} else if (strcmp(scopestr, "port") == 0) {
+		*scope = DEVLINK_RESOURCE_SCOPE_PORT;
+	} else {
+		pr_err("Unknown resource scope \"%s\"\n", scopestr);
+		return -EINVAL;
+	}
+	return 0;
+}
+
 static int param_cmode_get(const char *cmodestr,
 			   enum devlink_param_cmode *cmode)
 {
@@ -1647,6 +1662,7 @@ static const struct dl_args_metadata dl_args_required[] = {
 	{DL_OPT_ESWITCH_ENCAP_MODE,   "E-Switch encapsulation option expected."},
 	{DL_OPT_RESOURCE_PATH,	      "Resource path expected."},
 	{DL_OPT_RESOURCE_SIZE,	      "Resource size expected."},
+	{DL_OPT_RESOURCE_SCOPE,	      "Resource scope expected."},
 	{DL_OPT_PARAM_NAME,	      "Parameter name expected."},
 	{DL_OPT_PARAM_VALUE,	      "Value to set expected."},
 	{DL_OPT_PARAM_CMODE,	      "Configuration mode expected."},
@@ -2662,6 +2678,9 @@ static void dl_opts_put(struct nlmsghdr *nlh, struct dl *dl)
 	if (opts->present & DL_OPT_RESOURCE_SIZE)
 		mnl_attr_put_u64(nlh, DEVLINK_ATTR_RESOURCE_SIZE,
 				 opts->resource_size);
+	if (opts->present & DL_OPT_RESOURCE_SCOPE)
+		mnl_attr_put_u32(nlh, DEVLINK_ATTR_RESOURCE_SCOPE_MASK,
+				 opts->resource_scope_mask);
 	if (opts->present & DL_OPT_PARAM_NAME)
 		mnl_attr_put_strz(nlh, DEVLINK_ATTR_PARAM_NAME,
 				  opts->param_name);
@@ -9010,13 +9029,29 @@ static int cmd_resource_show(struct dl *dl)
 	uint16_t flags = NLM_F_REQUEST | NLM_F_ACK;
 	struct nlmsghdr *nlh;
 	struct resource_ctx resource_ctx = {};
+	struct dl_opts *opts = &dl->opts;
 	int err;
 
-	err = dl_argv_parse_with_selector(dl, &flags, DEVLINK_CMD_RESOURCE_DUMP,
-					  DL_OPT_HANDLE | DL_OPT_HANDLEP,
-					  0, 0, 0);
-	if (err)
-		return err;
+	if (dl_argv_match(dl, "scope")) {
+		const char *scopestr;
+
+		dl_arg_inc(dl);
+		err = dl_argv_str(dl, &scopestr);
+		if (err)
+			return err;
+		err = resource_scope_get(scopestr, &opts->resource_scope_mask);
+		if (err)
+			return err;
+		opts->present |= DL_OPT_RESOURCE_SCOPE;
+		flags |= NLM_F_DUMP;
+	} else {
+		err = dl_argv_parse_with_selector(dl, &flags,
+						  DEVLINK_CMD_RESOURCE_DUMP,
+						  DL_OPT_HANDLE | DL_OPT_HANDLEP,
+						  0, 0, 0);
+		if (err)
+			return err;
+	}
 
 	err = resource_ctx_init(&resource_ctx, dl);
 	if (err)
@@ -9036,7 +9071,7 @@ static int cmd_resource_show(struct dl *dl)
 
 static void cmd_resource_help(void)
 {
-	pr_err("Usage: devlink resource show [ DEV[/PORT_INDEX] ]\n"
+	pr_err("Usage: devlink resource show [ DEV[/PORT_INDEX] | scope { dev | port } ]\n"
 	       "       devlink resource set DEV path PATH size SIZE\n");
 }
 
diff --git a/man/man8/devlink-resource.8 b/man/man8/devlink-resource.8
index 1e7d96126ce5..04cde2bf8958 100644
--- a/man/man8/devlink-resource.8
+++ b/man/man8/devlink-resource.8
@@ -19,7 +19,7 @@ devlink-resource \- devlink device resource configuration
 
 .ti -8
 .B devlink resource show
-.RI "[ " DEV "[/" PORT_INDEX "] ]"
+.RI "[ " DEV "[/" PORT_INDEX "] | " scope " { " dev " | " port " } ]"
 
 .ti -8
 .B devlink resource help
@@ -53,6 +53,15 @@ Format is:
 .in +2
 BUS_NAME/BUS_ADDRESS/PORT_INDEX
 
+.TP
+.BI scope " { dev | port }"
+Filter resources by scope.
+.B dev
+shows only device-level resources.
+.B port
+shows only port-level resources.
+When omitted, resources of both scopes are shown.
+
 .SS devlink resource set - sets resource size of specific resource
 
 .PP
-- 
2.44.0


^ permalink raw reply related

* [PATCH iproute2-next 5/7] devlink: show port resources in resource dump
From: Tariq Toukan @ 2026-06-09  5:39 UTC (permalink / raw)
  To: Stephen Hemminger, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, David S. Miller
  Cc: David Ahern, Donald Hunter, Simon Horman, Jiri Pirko,
	Jonathan Corbet, Shuah Khan, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Shuah Khan, Matthieu Baerts (NGI0),
	Chuck Lever, Or Har-Toov, Carolina Jubran, Moshe Shemesh,
	Shay Drori, Dragos Tatulea, Daniel Zahka, Shahar Shitrit,
	Jacob Keller, Cosmin Ratiu, Parav Pandit, Kees Cook,
	Adithya Jayachandran, Daniel Jurgens, netdev, linux-kernel,
	linux-doc, linux-rdma, linux-kselftest, Gal Pressman,
	Ido Schimmel, Jiri Pirko, Petr Machata
In-Reply-To: <20260609053953.487152-1-tariqt@nvidia.com>

From: Or Har-Toov <ohartoov@nvidia.com>

When the kernel returns port-level resource messages during a
DEVLINK_CMD_RESOURCE_DUMP, display them alongside device-level
resources.

For example:

$ devlink resource show
pci/0000:03:00.0:
  name max_local_SFs size 32 unit entry dpipe_tables none
  name max_external_SFs size 32 unit entry dpipe_tables none
pci/0000:03:00.0/196608:
  name max_SFs size 32 unit entry dpipe_tables none
pci/0000:03:00.1:
  name max_local_SFs size 32 unit entry dpipe_tables none
  name max_external_SFs size 32 unit entry dpipe_tables none
pci/0000:03:00.1/262144:
  name max_SFs size 32 unit entry dpipe_tables none

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 devlink/devlink.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/devlink/devlink.c b/devlink/devlink.c
index 0962ffd861ad..737cfc7437f9 100644
--- a/devlink/devlink.c
+++ b/devlink/devlink.c
@@ -8954,18 +8954,23 @@ static void resources_dpipe_tables_fini(struct dpipe_ctx *dpipe_ctx,
 static void
 resources_show(struct resource_ctx *ctx, struct nlattr **tb)
 {
-	struct resources *resources = ctx->resources;
+	bool is_port = !!tb[DEVLINK_ATTR_PORT_INDEX];
 	struct dpipe_ctx dpipe_ctx = {};
 	struct resource *resource;
+	struct dl *dl = ctx->dl;
 
 	resources_dpipe_tables_init(&dpipe_ctx, ctx, tb);
-
-	list_for_each_entry(resource, &resources->resource_list, list) {
-		pr_out_handle_start_arr(ctx->dl, tb);
+	list_for_each_entry(resource, &ctx->resources->resource_list, list) {
+		if (is_port)
+			pr_out_port_handle_start_arr(dl, tb, false);
+		else
+			pr_out_handle_start_arr(dl, tb);
 		resource_show(resource, ctx);
-		pr_out_handle_end(ctx->dl);
+		if (is_port)
+			pr_out_port_handle_end(dl);
+		else
+			pr_out_handle_end(dl);
 	}
-
 	resources_dpipe_tables_fini(&dpipe_ctx, ctx);
 }
 
-- 
2.44.0
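
Note for readers: the choice between a device handle and a port handle
made in resources_show() can be sketched as below. This is a simplified
stand-in, not the devlink code: format_handle() and its parameters are
hypothetical, and it only builds the header line seen in the example
output, assuming the BUS_NAME/BUS_ADDRESS[/PORT_INDEX] format.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build the header line printed before a group of resources:
 * "BUS/ADDR:" for a device, "BUS/ADDR/PORT_INDEX:" for a port.
 * has_port mirrors the is_port test on DEVLINK_ATTR_PORT_INDEX.
 * Returns the value of snprintf().
 */
static int format_handle(char *buf, size_t len, const char *bus,
			 const char *addr, bool has_port,
			 uint32_t port_index)
{
	if (has_port)
		return snprintf(buf, len, "%s/%s/%u:", bus, addr,
				(unsigned int)port_index);
	return snprintf(buf, len, "%s/%s:", bus, addr);
}
```

The point of the sketch is that the same message layout serves both
scopes; only the presence of the port index attribute decides which
header is printed.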



* [PATCH iproute2-next 6/7] devlink: add per-port resource show support
From: Tariq Toukan @ 2026-06-09  5:39 UTC (permalink / raw)
  To: Stephen Hemminger, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, David S. Miller
  Cc: David Ahern, Donald Hunter, Simon Horman, Jiri Pirko,
	Jonathan Corbet, Shuah Khan, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Shuah Khan, Matthieu Baerts (NGI0),
	Chuck Lever, Or Har-Toov, Carolina Jubran, Moshe Shemesh,
	Shay Drori, Dragos Tatulea, Daniel Zahka, Shahar Shitrit,
	Jacob Keller, Cosmin Ratiu, Parav Pandit, Kees Cook,
	Adithya Jayachandran, Daniel Jurgens, netdev, linux-kernel,
	linux-doc, linux-rdma, linux-kselftest, Gal Pressman,
	Ido Schimmel, Jiri Pirko, Petr Machata
In-Reply-To: <20260609053953.487152-1-tariqt@nvidia.com>

From: Or Har-Toov <ohartoov@nvidia.com>

Extend 'devlink resource show' to accept DEV/PORT_INDEX. When a port is
given, DEVLINK_ATTR_PORT_INDEX is sent to the kernel so that it returns
only that port's resources.

For example:

$ devlink resource show pci/0000:03:00.0/196608
pci/0000:03:00.0/196608:
  name max_SFs size 128 unit entry dpipe_tables none

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 bash-completion/devlink     |  1 +
 devlink/devlink.c           |  5 +++--
 man/man8/devlink-resource.8 | 17 ++++++++++++++++-
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/bash-completion/devlink b/bash-completion/devlink
index 7ec6a7cb6abd..3d8452a8869e 100644
--- a/bash-completion/devlink
+++ b/bash-completion/devlink
@@ -704,6 +704,7 @@ _devlink_resource()
         show)
             if [[ $cword -eq 3 ]]; then
                 _devlink_direct_complete "dev"
+                _devlink_direct_complete "port"
             fi
             return
             ;;
diff --git a/devlink/devlink.c b/devlink/devlink.c
index 737cfc7437f9..4224b7fa6792 100644
--- a/devlink/devlink.c
+++ b/devlink/devlink.c
@@ -9013,7 +9013,8 @@ static int cmd_resource_show(struct dl *dl)
 	int err;
 
 	err = dl_argv_parse_with_selector(dl, &flags, DEVLINK_CMD_RESOURCE_DUMP,
-					  DL_OPT_HANDLE, 0, 0, 0);
+					  DL_OPT_HANDLE | DL_OPT_HANDLEP,
+					  0, 0, 0);
 	if (err)
 		return err;
 
@@ -9035,7 +9036,7 @@ static int cmd_resource_show(struct dl *dl)
 
 static void cmd_resource_help(void)
 {
-	pr_err("Usage: devlink resource show [ DEV ]\n"
+	pr_err("Usage: devlink resource show [ DEV[/PORT_INDEX] ]\n"
 	       "       devlink resource set DEV path PATH size SIZE\n");
 }
 
diff --git a/man/man8/devlink-resource.8 b/man/man8/devlink-resource.8
index b55138d950c7..1e7d96126ce5 100644
--- a/man/man8/devlink-resource.8
+++ b/man/man8/devlink-resource.8
@@ -19,7 +19,7 @@ devlink-resource \- devlink device resource configuration
 
 .ti -8
 .B devlink resource show
-.RI "[ " DEV " ]"
+.RI "[ " DEV "[/" PORT_INDEX "] ]"
 
 .ti -8
 .B devlink resource help
@@ -43,6 +43,16 @@ Format is:
 .in +2
 BUS_NAME/BUS_ADDRESS
 
+.PP
+.I "PORT_INDEX"
+- specifies the port to show resources for.
+When given, only port-level resources for that port are shown.
+
+.in +4
+Format is:
+.in +2
+BUS_NAME/BUS_ADDRESS/PORT_INDEX
+
 .SS devlink resource set - sets resource size of specific resource
 
 .PP
@@ -69,6 +79,11 @@ devlink resource show pci/0000:01:00.0
 Shows the resources of the specified devlink device.
 .RE
 .PP
+devlink resource show pci/0000:01:00.0/1
+.RS 4
+Shows port-level resources for port 1 of the specified devlink device.
+.RE
+.PP
 devlink resource set pci/0000:01:00.0 path /kvd/linear size 98304
 .RS 4
 Sets the size of the specified resource for the specified devlink device.
-- 
2.44.0



* [PATCH iproute2-next 4/7] devlink: add dump support for resource show
From: Tariq Toukan @ 2026-06-09  5:39 UTC (permalink / raw)
  To: Stephen Hemminger, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, David S. Miller
  Cc: David Ahern, Donald Hunter, Simon Horman, Jiri Pirko,
	Jonathan Corbet, Shuah Khan, Saeed Mahameed, Leon Romanovsky,
	Tariq Toukan, Mark Bloch, Shuah Khan, Matthieu Baerts (NGI0),
	Chuck Lever, Or Har-Toov, Carolina Jubran, Moshe Shemesh,
	Shay Drori, Dragos Tatulea, Daniel Zahka, Shahar Shitrit,
	Jacob Keller, Cosmin Ratiu, Parav Pandit, Kees Cook,
	Adithya Jayachandran, Daniel Jurgens, netdev, linux-kernel,
	linux-doc, linux-rdma, linux-kselftest, Gal Pressman,
	Ido Schimmel, Jiri Pirko, Petr Machata
In-Reply-To: <20260609053953.487152-1-tariqt@nvidia.com>

From: Or Har-Toov <ohartoov@nvidia.com>

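Allow 'devlink resource show' to be run without a device argument, in
which case resources of all devlink devices are dumped.

Since a dump can deliver more than one message, reset the resource list
after each message is printed so that resources already shown are not
printed again.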

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 devlink/devlink.c           | 18 ++++++++++++++----
 man/man8/devlink-resource.8 | 10 ++++++++--
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/devlink/devlink.c b/devlink/devlink.c
index ba14c0056b1c..0962ffd861ad 100644
--- a/devlink/devlink.c
+++ b/devlink/devlink.c
@@ -7429,6 +7429,12 @@ static void resources_free(struct resources *resources)
 		resource_free(resource);
 }
 
+static void resources_reset(struct resources *resources)
+{
+	resources_free(resources);
+	INIT_LIST_HEAD(&resources->resource_list);
+}
+
 static int resource_ctx_init(struct resource_ctx *ctx, struct dl *dl)
 {
 	ctx->resources = resources_alloc();
@@ -8986,19 +8992,23 @@ static int cmd_resource_dump_cb(const struct nlmsghdr *nlh, void *data)
 		return MNL_CB_ERROR;
 	}
 
-	if (ctx->print_resources)
+	if (ctx->print_resources) {
 		resources_show(ctx, tb);
+		resources_reset(ctx->resources);
+	}
 
 	return MNL_CB_OK;
 }
 
 static int cmd_resource_show(struct dl *dl)
 {
+	uint16_t flags = NLM_F_REQUEST | NLM_F_ACK;
 	struct nlmsghdr *nlh;
 	struct resource_ctx resource_ctx = {};
 	int err;
 
-	err = dl_argv_parse(dl, DL_OPT_HANDLE, 0);
+	err = dl_argv_parse_with_selector(dl, &flags, DEVLINK_CMD_RESOURCE_DUMP,
+					  DL_OPT_HANDLE, 0, 0, 0);
 	if (err)
 		return err;
 
@@ -9008,7 +9018,7 @@ static int cmd_resource_show(struct dl *dl)
 
 	resource_ctx.print_resources = true;
 	nlh = mnlu_gen_socket_cmd_prepare(&dl->nlg, DEVLINK_CMD_RESOURCE_DUMP,
-			       NLM_F_REQUEST | NLM_F_ACK);
+					  flags);
 	dl_opts_put(nlh, dl);
 	pr_out_section_start(dl, "resources");
 	err = mnlu_gen_socket_sndrcv(&dl->nlg, nlh, cmd_resource_dump_cb,
@@ -9020,7 +9030,7 @@ static int cmd_resource_show(struct dl *dl)
 
 static void cmd_resource_help(void)
 {
-	pr_err("Usage: devlink resource show DEV\n"
+	pr_err("Usage: devlink resource show [ DEV ]\n"
 	       "       devlink resource set DEV path PATH size SIZE\n");
 }
 
diff --git a/man/man8/devlink-resource.8 b/man/man8/devlink-resource.8
index c4f6918c9b03..b55138d950c7 100644
--- a/man/man8/devlink-resource.8
+++ b/man/man8/devlink-resource.8
@@ -19,7 +19,7 @@ devlink-resource \- devlink device resource configuration
 
 .ti -8
 .B devlink resource show
-.IR DEV
+.RI "[ " DEV " ]"
 
 .ti -8
 .B devlink resource help
@@ -31,11 +31,12 @@ devlink-resource \- devlink device resource configuration
 .BI size " RESOURCE_SIZE"
 
 .SH "DESCRIPTION"
-.SS devlink resource show - display devlink device's resosources
+.SS devlink resource show - display devlink device resources
 
 .PP
 .I "DEV"
 - specifies the devlink device to show.
+If omitted, all devices are listed.
 
 .in +4
 Format is:
@@ -58,6 +59,11 @@ The new resource's size.
 
 .SH "EXAMPLES"
 .PP
+devlink resource show
+.RS 4
+Shows resources for all devlink devices.
+.RE
+.PP
 devlink resource show pci/0000:01:00.0
 .RS 4
 Shows the resources of the specified devlink device.
-- 
2.44.0



