Linux Documentation

* [PATCH v2 08/11] hugetlb: make hugetlb_add_to_page_cache() to take PAGE_SIZE granularity index
From: Jane Chu @ 2026-06-17 17:25 UTC
  To: akpm
  Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
	baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
	corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
In-Reply-To: <20260617172534.1740152-1-jane.chu@oracle.com>

hugetlb_add_to_page_cache() is partly a wrapper around the generic
__filemap_add_folio(), which takes a PAGE_SIZE granularity index.
Make hugetlb_add_to_page_cache() consistent by having it take a
PAGE_SIZE granularity index as well, and convert its callers to pass
one.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
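Illustration only: the index conversion that callers now perform
before calling hugetlb_add_to_page_cache() can be sketched in
user-space C as below. DEMO_HPAGE_ORDER and the demo_* helpers are
hypothetical stand-ins for huge_page_order(h) and the open-coded
shifts in the diff; the 2 MiB huge page and 4 KiB base page sizes are
assumptions.

```c
#include <stdint.h>

/* Assumed: 2 MiB huge pages on 4 KiB base pages, so order = 21 - 12. */
#define DEMO_HPAGE_ORDER 9u

/* Huge page index -> PAGE_SIZE granularity index (what callers pass now). */
static inline uint64_t demo_hpage_to_page_index(uint64_t idx)
{
	return idx << DEMO_HPAGE_ORDER;
}

/* PAGE_SIZE granularity index -> huge page index (for hugetlb-internal use). */
static inline uint64_t demo_page_to_hpage_index(uint64_t index)
{
	return index >> DEMO_HPAGE_ORDER;
}
```

Any PAGE_SIZE index inside a huge page shifts back down to the same
huge page index, which is why hugetlb-internal code can still recover
its own index from what the caller passes.
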
 fs/hugetlbfs/inode.c | 13 ++++++++-----
 mm/hugetlb.c         | 18 +++++++++---------
 mm/memfd.c           |  2 +-
 3 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 6c883478f7e7..0b49a79efb08 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -599,7 +599,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 	struct mm_struct *mm = current->mm;
 	loff_t hpage_size = huge_page_size(h);
 	unsigned long hpage_shift = huge_page_shift(h);
-	pgoff_t start, index, end;
+	pgoff_t start, idx, end;
 	int error;
 	u32 hash;
 
@@ -639,7 +639,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 	vm_flags_init(&pseudo_vma, VM_HUGETLB | VM_MAYSHARE | VM_SHARED);
 	pseudo_vma.vm_file = file;
 
-	for (index = start; index < end; index++) {
+	for (idx = start; idx < end; idx++) {
+		pgoff_t index = idx << huge_page_order(h);
+
 		/*
 		 * This is supposed to be the vaddr where the page is being
 		 * faulted in, but we have no vaddr here.
@@ -659,14 +661,14 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		}
 
 		/* addr is the offset within the file (zero based) */
-		addr = index * hpage_size;
+		addr = idx * hpage_size;
 
 		/* mutex taken here, fault path and hole punch */
-		hash = hugetlb_fault_mutex_hash(mapping, index << huge_page_order(h));
+		hash = hugetlb_fault_mutex_hash(mapping, index);
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 		/* See if already present in mapping to avoid alloc/free */
-		folio = filemap_get_folio(mapping, index << huge_page_order(h));
+		folio = filemap_get_folio(mapping, index);
 		if (!IS_ERR(folio)) {
 			folio_put(folio);
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
@@ -690,6 +692,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		folio_zero_user(folio, addr);
 		__folio_mark_uptodate(folio);
 		error = hugetlb_add_to_page_cache(folio, mapping, index);
+
 		if (unlikely(error)) {
 			restore_reserve_on_error(h, &pseudo_vma, addr, folio);
 			folio_put(folio);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5484e78fe72e..b41e7b8df094 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5621,15 +5621,14 @@ bool hugetlbfs_pagecache_present(struct hstate *h,
 }
 
 int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
-			   pgoff_t idx)
+			   pgoff_t index)
 {
 	struct inode *inode = mapping->host;
 	struct hstate *h = hstate_inode(inode);
 	int err;
 
-	idx <<= huge_page_order(h);
 	__folio_set_locked(folio);
-	err = __filemap_add_folio(mapping, folio, idx, GFP_KERNEL, NULL);
+	err = __filemap_add_folio(mapping, folio, index, GFP_KERNEL, NULL);
 
 	if (unlikely(err)) {
 		__folio_clear_locked(folio);
@@ -5696,7 +5695,6 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 	struct folio *folio;
 	unsigned long size;
 	pte_t new_pte;
-	pgoff_t idx = vmf->pgoff >> huge_page_order(h);
 
 	/*
 	 * Currently, we are forced to kill the process in the event the
@@ -5779,7 +5777,8 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 		new_folio = true;
 
 		if (vma->vm_flags & VM_MAYSHARE) {
-			int err = hugetlb_add_to_page_cache(folio, mapping, idx);
+			int err = hugetlb_add_to_page_cache(folio, mapping, vmf->pgoff);
+
 			if (err) {
 				/*
 				 * err can't be -EEXIST which implies someone
@@ -6170,7 +6169,8 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 	bool wp_enabled = (flags & MFILL_ATOMIC_WP);
 	struct hstate *h = hstate_vma(dst_vma);
 	struct address_space *mapping = dst_vma->vm_file->f_mapping;
-	pgoff_t idx = vma_hugecache_offset(h, dst_vma, dst_addr);
+	pgoff_t index = linear_page_index(dst_vma, dst_addr);
+
 	unsigned long size = huge_page_size(h);
 	int vm_shared = dst_vma->vm_flags & VM_SHARED;
 	pte_t _dst_pte;
@@ -6201,7 +6201,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 
 	if (is_continue) {
 		ret = -EFAULT;
-		folio = filemap_lock_folio(mapping, idx << huge_page_order(h));
+		folio = filemap_lock_folio(mapping, index);
 		if (IS_ERR(folio))
 			goto out;
 		folio_in_pagecache = true;
@@ -6297,7 +6297,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 	/* Add shared, newly allocated pages to the page cache. */
 	if (vm_shared && !is_continue) {
 		ret = -EFAULT;
-		if (idx >= (i_size_read(mapping->host) >> huge_page_shift(h)))
+		if (index >= (i_size_read(mapping->host) >> PAGE_SHIFT))
 			goto out_release_nounlock;
 
 		/*
@@ -6306,7 +6306,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 		 * hugetlb_fault_mutex_table that here must be hold by
 		 * the caller.
 		 */
-		ret = hugetlb_add_to_page_cache(folio, mapping, idx);
+		ret = hugetlb_add_to_page_cache(folio, mapping, index);
 		if (ret)
 			goto out_release_nounlock;
 		folio_in_pagecache = true;
diff --git a/mm/memfd.c b/mm/memfd.c
index b0ec0b12b98d..0b5e8f111b39 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -122,7 +122,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t index)
 
 			err = hugetlb_add_to_page_cache(folio,
 							memfd->f_mapping,
-							idx);
+							index);
 
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 
-- 
2.43.5



* [PATCH v2 07/11] hugetlb: replace filemap_lock_hugetlb_folio with filemap_lock_folio
From: Jane Chu @ 2026-06-17 17:25 UTC
  To: akpm
  Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
	baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
	corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
In-Reply-To: <20260617172534.1740152-1-jane.chu@oracle.com>

filemap_lock_hugetlb_folio() is redundant: it only shifts a huge page
index into a PAGE_SIZE granularity index before calling
filemap_lock_folio(). Remove it and call the generic
filemap_lock_folio() directly.

Suggested-by: David Hildenbrand <david@kernel.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 fs/hugetlbfs/inode.c    |  3 +--
 include/linux/hugetlb.h | 12 ------------
 mm/hugetlb.c            |  4 ++--
 3 files changed, 3 insertions(+), 16 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 02cb265a580e..6c883478f7e7 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -518,10 +518,9 @@ static void hugetlbfs_zero_partial_page(struct hstate *h,
 					loff_t start,
 					loff_t end)
 {
-	pgoff_t idx = start >> huge_page_shift(h);
 	struct folio *folio;
 
-	folio = filemap_lock_hugetlb_folio(h, mapping, idx);
+	folio = filemap_lock_folio(mapping, start >> PAGE_SHIFT);
 	if (IS_ERR(folio))
 		return;
 
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index cae5cdd3ea00..e78d0f706681 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -824,12 +824,6 @@ static inline unsigned int blocks_per_huge_page(struct hstate *h)
 	return huge_page_size(h) / 512;
 }
 
-static inline struct folio *filemap_lock_hugetlb_folio(struct hstate *h,
-				struct address_space *mapping, pgoff_t idx)
-{
-	return filemap_lock_folio(mapping, idx << huge_page_order(h));
-}
-
 #include <asm/hugetlb.h>
 
 #ifndef is_hugepage_only_range
@@ -1096,12 +1090,6 @@ static inline struct hugepage_subpool *hugetlb_folio_subpool(struct folio *folio
 	return NULL;
 }
 
-static inline struct folio *filemap_lock_hugetlb_folio(struct hstate *h,
-				struct address_space *mapping, pgoff_t idx)
-{
-	return NULL;
-}
-
 static inline int isolate_or_dissolve_huge_folio(struct folio *folio,
 						struct list_head *list)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ecd1d1322fda..5484e78fe72e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5715,7 +5715,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 	 * before we get page_table_lock.
 	 */
 	new_folio = false;
-	folio = filemap_lock_hugetlb_folio(h, mapping, idx);
+	folio = filemap_lock_folio(mapping, vmf->pgoff);
 	if (IS_ERR(folio)) {
 		size = i_size_read(mapping->host) >> PAGE_SHIFT;
 		if (vmf->pgoff >= size)
@@ -6201,7 +6201,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 
 	if (is_continue) {
 		ret = -EFAULT;
-		folio = filemap_lock_hugetlb_folio(h, mapping, idx);
+		folio = filemap_lock_folio(mapping, idx << huge_page_order(h));
 		if (IS_ERR(folio))
 			goto out;
 		folio_in_pagecache = true;
-- 
2.43.5



* [PATCH v2 06/11] hugetlb: make hugetlb_fault_mutex_hash() to take PAGE_SIZE index
From: Jane Chu @ 2026-06-17 17:25 UTC
  To: akpm
  Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
	baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
	corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
In-Reply-To: <20260617172534.1740152-1-jane.chu@oracle.com>

Make hugetlb_fault_mutex_hash() take a PAGE_SIZE-based index.
This makes the helper interface consistent with filemap_get_folio()
and linear_page_index(), while preserving the same lock selection for
a given hugetlb file offset.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
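Illustration only: a user-space sketch of why lock selection is
preserved. The helper takes a PAGE_SIZE index but hashes the huge page
index derived from it, so every base-page index inside one huge page
selects the same mutex. demo_mix() is a hypothetical stand-in for
jhash2(), and the order and table size are assumptions.

```c
#include <stdint.h>

#define DEMO_HPAGE_ORDER 9u   /* assumed: 2 MiB huge pages, 4 KiB base pages */
#define DEMO_NUM_MUTEXES 64u  /* assumed table size, must be a power of two */

/* Hypothetical 64-bit mixer; the kernel helper uses jhash2() instead. */
static uint32_t demo_mix(uint64_t a, uint64_t b)
{
	uint64_t x = (a * 0x9E3779B97F4A7C15ull) ^ b;

	x ^= x >> 33;
	x *= 0xFF51AFD7ED558CCDull;
	x ^= x >> 33;
	return (uint32_t)x;
}

/* Takes a PAGE_SIZE index; the hash key is the huge page index. */
static uint32_t demo_fault_mutex_hash(uint64_t mapping_id, uint64_t index)
{
	uint64_t hpage_idx = index >> DEMO_HPAGE_ORDER;

	return demo_mix(mapping_id, hpage_idx) & (DEMO_NUM_MUTEXES - 1);
}
```

Doing the shift inside the helper lets callers pass vmf->pgoff or
folio->index unchanged.
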
 fs/hugetlbfs/inode.c    |  9 ++++-----
 include/linux/hugetlb.h |  2 +-
 mm/hugetlb.c            | 23 ++++++++++++-----------
 mm/memfd.c              |  9 +++++----
 mm/userfaultfd.c        |  6 +++---
 5 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 1c25485c91b9..02cb265a580e 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -448,7 +448,7 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 	struct address_space *mapping = &inode->i_data;
 	const pgoff_t end = lend >> PAGE_SHIFT;
 	struct folio_batch fbatch;
-	pgoff_t next, index;
+	pgoff_t next;
 	int i, freed = 0;
 	bool truncate_op = (lend == LLONG_MAX);
 
@@ -459,15 +459,14 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
 			struct folio *folio = fbatch.folios[i];
 			u32 hash = 0;
 
-			index = folio->index >> huge_page_order(h);
-			hash = hugetlb_fault_mutex_hash(mapping, index);
+			hash = hugetlb_fault_mutex_hash(mapping, folio->index);
 			mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 			/*
 			 * Remove folio that was part of folio_batch.
 			 */
 			remove_inode_single_folio(h, inode, mapping, folio,
-						  index, truncate_op);
+						  folio->index, truncate_op);
 			freed++;
 
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
@@ -664,7 +663,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		addr = index * hpage_size;
 
 		/* mutex taken here, fault path and hole punch */
-		hash = hugetlb_fault_mutex_hash(mapping, index);
+		hash = hugetlb_fault_mutex_hash(mapping, index << huge_page_order(h));
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 		/* See if already present in mapping to avoid alloc/free */
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 218284e80451..cae5cdd3ea00 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -159,7 +159,7 @@ void folio_putback_hugetlb(struct folio *folio);
 void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason);
 void hugetlb_fix_reserve_counts(struct inode *inode);
 extern struct mutex *hugetlb_fault_mutex_table;
-u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx);
+u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t index);
 
 pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 		      unsigned long addr, pud_t *pud);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3255f6b762c9..ecd1d1322fda 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5505,7 +5505,7 @@ static vm_fault_t hugetlb_wp(struct vm_fault *vmf)
 		 */
 		if (cow_from_owner) {
 			struct address_space *mapping = vma->vm_file->f_mapping;
-			pgoff_t idx;
+			pgoff_t index;
 			u32 hash;
 
 			folio_put(old_folio);
@@ -5518,8 +5518,8 @@ static vm_fault_t hugetlb_wp(struct vm_fault *vmf)
 			 *
 			 * Reacquire both after unmap operation.
 			 */
-			idx = vma_hugecache_offset(h, vma, vmf->address);
-			hash = hugetlb_fault_mutex_hash(mapping, idx);
+			index = linear_page_index(vma, vmf->address);
+			hash = hugetlb_fault_mutex_hash(mapping, index);
 			hugetlb_vma_unlock_read(vma);
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 
@@ -5654,8 +5654,6 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_fault *vmf,
 						  unsigned long reason)
 {
 	u32 hash;
-	struct hstate *h = hstate_vma(vmf->vma);
-	pgoff_t idx = vmf->pgoff >> huge_page_order(h);
 
 	/*
 	 * vma_lock and hugetlb_fault_mutex must be dropped before handling
@@ -5663,7 +5661,7 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_fault *vmf,
 	 * userfault, any vma operation should be careful from here.
 	 */
 	hugetlb_vma_unlock_read(vmf->vma);
-	hash = hugetlb_fault_mutex_hash(mapping, idx);
+	hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff);
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 	return handle_userfault(vmf, reason);
 }
@@ -5896,7 +5894,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 	if (unlikely(ret & VM_FAULT_RETRY))
 		vma_end_read(vma);
 
-	hash = hugetlb_fault_mutex_hash(mapping, idx);
+	hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff);
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 	return ret;
 
@@ -5913,13 +5911,16 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 }
 
 #ifdef CONFIG_SMP
-u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx)
+u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t index)
 {
 	unsigned long key[2];
+	struct hstate *h;
 	u32 hash;
 
 	key[0] = (unsigned long) mapping;
-	key[1] = idx;
+
+	h = hstate_inode(mapping->host);
+	key[1] = index >> huge_page_order(h);
 
 	hash = jhash2((u32 *)&key, sizeof(key)/(sizeof(u32)), 0);
 
@@ -5930,7 +5931,7 @@ u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx)
  * For uniprocessor systems we always use a single mutex, so just
  * return 0 and avoid the hashing overhead.
  */
-u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx)
+u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t index)
 {
 	return 0;
 }
@@ -5965,7 +5966,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * the same page in the page cache.
 	 */
 	mapping = vma->vm_file->f_mapping;
-	hash = hugetlb_fault_mutex_hash(mapping, vmf.pgoff >> huge_page_order(h));
+	hash = hugetlb_fault_mutex_hash(mapping, vmf.pgoff);
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 	/*
diff --git a/mm/memfd.c b/mm/memfd.c
index abe13b291ddc..b0ec0b12b98d 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -64,7 +64,7 @@ static void memfd_tag_pins(struct xa_state *xas)
  * (memfd_pin_folios()) cannot find a folio in the page cache at a given
  * index in the mapping.
  */
-struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
+struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t index)
 {
 #ifdef CONFIG_HUGETLB_PAGE
 	struct folio *folio;
@@ -79,12 +79,13 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
 		 */
 		struct inode *inode = file_inode(memfd);
 		struct hstate *h = hstate_file(memfd);
+		pgoff_t idx;
 		int err = -ENOMEM;
 		long nr_resv;
 
 		gfp_mask = htlb_alloc_mask(h);
 		gfp_mask &= ~(__GFP_HIGHMEM | __GFP_MOVABLE);
-		idx >>= huge_page_order(h);
+		idx = index >> huge_page_order(h);
 
 		nr_resv = hugetlb_reserve_pages(inode, idx, idx + 1, NULL, EMPTY_VMA_FLAGS);
 		if (nr_resv < 0)
@@ -116,7 +117,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
 			 * races with concurrent allocations, as required by all other
 			 * callers of hugetlb_add_to_page_cache().
 			 */
-			hash = hugetlb_fault_mutex_hash(memfd->f_mapping, idx);
+			hash = hugetlb_fault_mutex_hash(memfd->f_mapping, index);
 			mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 			err = hugetlb_add_to_page_cache(folio,
@@ -140,7 +141,7 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
 		return ERR_PTR(err);
 	}
 #endif
-	return shmem_read_folio(memfd->f_mapping, idx);
+	return shmem_read_folio(memfd->f_mapping, index);
 }
 
 /*
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 180bad42fc79..95fb94b697a4 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -707,7 +707,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
 	long copied;
 	struct folio *folio;
 	unsigned long vma_hpagesize;
-	pgoff_t idx;
+	pgoff_t index;
 	u32 hash;
 	struct address_space *mapping;
 
@@ -776,9 +776,9 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
 		 * in the case of shared pmds.  fault mutex prevents
 		 * races with other faulting threads.
 		 */
-		idx = hugetlb_linear_page_index(dst_vma, dst_addr);
+		index = linear_page_index(dst_vma, dst_addr);
 		mapping = dst_vma->vm_file->f_mapping;
-		hash = hugetlb_fault_mutex_hash(mapping, idx);
+		hash = hugetlb_fault_mutex_hash(mapping, index);
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
 		hugetlb_vma_lock_read(dst_vma);
 
-- 
2.43.5



* [PATCH v2 05/11] hugetlb: Convert the vmf->pgoff to PAGE_SIZE granularity
From: Jane Chu @ 2026-06-17 17:25 UTC
  To: akpm
  Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
	baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
	corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
In-Reply-To: <20260617172534.1740152-1-jane.chu@oracle.com>

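Everywhere else in MM, the page fault vmf->pgoff is in PAGE_SIZE
granularity; only the hugetlb fault handler keeps it in huge page size
granularity. The exception is unnecessary, so set vmf->pgoff with
linear_page_index() and derive the huge page index locally where it is
still needed.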

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
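Illustration only: a user-space sketch of the two index calculations,
assuming 4 KiB base pages, 2 MiB huge pages, and a mapping whose
vm_start and vm_pgoff are huge page aligned. struct demo_vma and the
demo_* functions are hypothetical stand-ins that mirror the arithmetic
of linear_page_index() and vma_hugecache_offset().

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT  12u
#define DEMO_HPAGE_SHIFT 21u
#define DEMO_HPAGE_ORDER (DEMO_HPAGE_SHIFT - DEMO_PAGE_SHIFT)

/* Hypothetical subset of struct vm_area_struct. */
struct demo_vma {
	uint64_t vm_start;  /* start address of the mapping */
	uint64_t vm_pgoff;  /* file offset of vm_start, in PAGE_SIZE units */
};

/* PAGE_SIZE granularity file index of addr. */
static uint64_t demo_linear_page_index(const struct demo_vma *vma, uint64_t addr)
{
	return ((addr - vma->vm_start) >> DEMO_PAGE_SHIFT) + vma->vm_pgoff;
}

/* Huge page granularity file index of addr. */
static uint64_t demo_hugecache_offset(const struct demo_vma *vma, uint64_t addr)
{
	return ((addr - vma->vm_start) >> DEMO_HPAGE_SHIFT) +
	       (vma->vm_pgoff >> DEMO_HPAGE_ORDER);
}
```

Under those alignment assumptions, shifting the PAGE_SIZE index right
by the huge page order yields the huge page index, which is the
conversion the local idx variables in this patch perform.
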
 mm/hugetlb.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4b80b167cc9c..3255f6b762c9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5654,6 +5654,8 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_fault *vmf,
 						  unsigned long reason)
 {
 	u32 hash;
+	struct hstate *h = hstate_vma(vmf->vma);
+	pgoff_t idx = vmf->pgoff >> huge_page_order(h);
 
 	/*
 	 * vma_lock and hugetlb_fault_mutex must be dropped before handling
@@ -5661,7 +5663,7 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_fault *vmf,
 	 * userfault, any vma operation should be careful from here.
 	 */
 	hugetlb_vma_unlock_read(vmf->vma);
-	hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff);
+	hash = hugetlb_fault_mutex_hash(mapping, idx);
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 	return handle_userfault(vmf, reason);
 }
@@ -5686,7 +5688,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, unsigned
 static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 			struct vm_fault *vmf)
 {
-	u32 hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff);
+	u32 hash;
 	bool new_folio, new_anon_folio = false;
 	struct vm_area_struct *vma = vmf->vma;
 	struct mm_struct *mm = vma->vm_mm;
@@ -5696,6 +5698,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 	struct folio *folio;
 	unsigned long size;
 	pte_t new_pte;
+	pgoff_t idx = vmf->pgoff >> huge_page_order(h);
 
 	/*
 	 * Currently, we are forced to kill the process in the event the
@@ -5714,9 +5717,9 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 	 * before we get page_table_lock.
 	 */
 	new_folio = false;
-	folio = filemap_lock_hugetlb_folio(h, mapping, vmf->pgoff);
+	folio = filemap_lock_hugetlb_folio(h, mapping, idx);
 	if (IS_ERR(folio)) {
-		size = i_size_read(mapping->host) >> huge_page_shift(h);
+		size = i_size_read(mapping->host) >> PAGE_SHIFT;
 		if (vmf->pgoff >= size)
 			goto out;
 		/* Check for page in userfault range */
@@ -5778,8 +5781,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 		new_folio = true;
 
 		if (vma->vm_flags & VM_MAYSHARE) {
-			int err = hugetlb_add_to_page_cache(folio, mapping,
-							vmf->pgoff);
+			int err = hugetlb_add_to_page_cache(folio, mapping, idx);
 			if (err) {
 				/*
 				 * err can't be -EEXIST which implies someone
@@ -5894,6 +5896,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 	if (unlikely(ret & VM_FAULT_RETRY))
 		vma_end_read(vma);
 
+	hash = hugetlb_fault_mutex_hash(mapping, idx);
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
 	return ret;
 
@@ -5947,8 +5950,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		.address = address & huge_page_mask(h),
 		.real_address = address,
 		.flags = flags,
-		.pgoff = vma_hugecache_offset(h, vma,
-				address & huge_page_mask(h)),
+		.pgoff = linear_page_index(vma, address),
 		/* TODO: Track hugetlb faults using vm_fault */
 
 		/*
@@ -5963,7 +5965,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * the same page in the page cache.
 	 */
 	mapping = vma->vm_file->f_mapping;
-	hash = hugetlb_fault_mutex_hash(mapping, vmf.pgoff);
+	hash = hugetlb_fault_mutex_hash(mapping, vmf.pgoff >> huge_page_order(h));
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
 	/*
-- 
2.43.5



* [PATCH v2 04/11] hugetlbfs,filemap: replace hugetlbfs_read_iter() with generic_file_read_iter()
From: Jane Chu @ 2026-06-17 17:25 UTC
  To: akpm
  Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
	baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
	corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
In-Reply-To: <20260617172534.1740152-1-jane.chu@oracle.com>

Replace hugetlbfs_read_iter() with generic_file_read_iter(), as
suggested in [1]. Teach filemap_get_pages() to use the hugetlb page
size when calculating 'last_index', and to skip the readahead and
folio creation paths for hugetlbfs files.

[1] https://lore.kernel.org/linux-mm/aeZwAz6PcdlqSnJ2@casper.infradead.org/

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
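Illustration only: a user-space sketch of the 'last_index' calculation
with a per-file minimum folio size. demo_round_up() and
demo_last_index() are hypothetical helpers; 4 KiB base pages and
power-of-two folio sizes are assumed.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12u

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t demo_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Index, in PAGE_SIZE units, of the first page past the end of the read. */
static uint64_t demo_last_index(uint64_t pos, uint64_t count,
				uint64_t min_folio_bytes)
{
	return demo_round_up(pos + count, min_folio_bytes) >> DEMO_PAGE_SHIFT;
}
```

For a 100-byte read at offset 4096, this gives 2 with a 4 KiB minimum
folio size and 512 with a 2 MiB huge page, so the lookup range covers
the whole huge folio.
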
 fs/hugetlbfs/inode.c | 84 +-------------------------------------------
 mm/filemap.c         | 15 ++++++--
 2 files changed, 14 insertions(+), 85 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index f1f8c3f7388f..1c25485c91b9 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -187,88 +187,6 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	return mm_get_unmapped_area_vmflags(file, addr0, len, pgoff, flags, 0);
 }
 
-/*
- * Support for read() - Find the page attached to f_mapping and copy out the
- * data. This provides functionality similar to filemap_read().
- */
-static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
-{
-	struct file *file = iocb->ki_filp;
-	struct hstate *h = hstate_file(file);
-	struct address_space *mapping = file->f_mapping;
-	struct inode *inode = mapping->host;
-	unsigned long index = iocb->ki_pos >> huge_page_shift(h);
-	unsigned long offset = iocb->ki_pos & ~huge_page_mask(h);
-	unsigned long end_index;
-	loff_t isize;
-	ssize_t retval = 0;
-
-	while (iov_iter_count(to)) {
-		struct folio *folio;
-		size_t nr, copied, want;
-
-		/* nr is the maximum number of bytes to copy from this page */
-		nr = huge_page_size(h);
-		isize = i_size_read(inode);
-		if (!isize)
-			break;
-		end_index = (isize - 1) >> huge_page_shift(h);
-		if (index > end_index)
-			break;
-		if (index == end_index) {
-			nr = ((isize - 1) & ~huge_page_mask(h)) + 1;
-			if (nr <= offset)
-				break;
-		}
-		nr = nr - offset;
-
-		/* Find the folio */
-		folio = filemap_lock_hugetlb_folio(h, mapping, index);
-		if (IS_ERR(folio)) {
-			/*
-			 * We have a HOLE, zero out the user-buffer for the
-			 * length of the hole or request.
-			 */
-			copied = iov_iter_zero(nr, to);
-		} else {
-			folio_unlock(folio);
-
-			if (!folio_test_hwpoison(folio))
-				want = nr;
-			else {
-				/*
-				 * Adjust how many bytes safe to read without
-				 * touching the 1st raw HWPOISON page after
-				 * offset.
-				 */
-				want = adjust_range_hwpoison(folio, offset, nr);
-				if (want == 0) {
-					folio_put(folio);
-					retval = -EIO;
-					break;
-				}
-			}
-
-			/*
-			 * We have the folio, copy it to user space buffer.
-			 */
-			copied = copy_folio_to_iter(folio, offset, want, to);
-			folio_put(folio);
-		}
-		offset += copied;
-		retval += copied;
-		if (copied != nr && iov_iter_count(to)) {
-			if (!retval)
-				retval = -EFAULT;
-			break;
-		}
-		index += offset >> huge_page_shift(h);
-		offset &= ~huge_page_mask(h);
-	}
-	iocb->ki_pos = ((loff_t)index << huge_page_shift(h)) + offset;
-	return retval;
-}
-
 static int hugetlbfs_write_begin(const struct kiocb *iocb,
 			struct address_space *mapping,
 			loff_t pos, unsigned len,
@@ -1181,7 +1099,7 @@ static void init_once(void *foo)
 }
 
 static const struct file_operations hugetlbfs_file_operations = {
-	.read_iter		= hugetlbfs_read_iter,
+	.read_iter		= generic_file_read_iter,
 	.mmap			= hugetlbfs_file_mmap,
 	.fsync			= noop_fsync,
 	.get_unmapped_area	= hugetlb_get_unmapped_area,
diff --git a/mm/filemap.c b/mm/filemap.c
index df8543573570..eb03b31791fc 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2672,20 +2672,30 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
 {
 	struct file *filp = iocb->ki_filp;
 	struct address_space *mapping = filp->f_mapping;
+	bool is_hugetlbfs = is_file_hugepages(filp);
 	pgoff_t index = iocb->ki_pos >> PAGE_SHIFT;
 	pgoff_t last_index;
 	struct folio *folio;
 	unsigned int flags;
+	size_t min_folio_bytes;
 	int err = 0;
 
 	/* "last_index" is the index of the folio beyond the end of the read */
-	last_index = round_up(iocb->ki_pos + count,
-			mapping_min_folio_nrbytes(mapping)) >> PAGE_SHIFT;
+	if (is_hugetlbfs)
+		min_folio_bytes = huge_page_size(hstate_file(filp));
+	else
+		min_folio_bytes = mapping_min_folio_nrbytes(mapping);
+	last_index = round_up(iocb->ki_pos + count, min_folio_bytes) >> PAGE_SHIFT;
+
 retry:
 	if (fatal_signal_pending(current))
 		return -EINTR;
 
 	filemap_get_read_batch(mapping, index, last_index - 1, fbatch);
+
+	if (is_hugetlbfs)
+		goto done;
+
 	if (!folio_batch_count(fbatch)) {
 		DEFINE_READAHEAD(ractl, filp, &filp->f_ra, mapping, index);
 
@@ -2724,6 +2734,7 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
 			goto err;
 	}
 
+done:
 	trace_mm_filemap_get_pages(mapping, index, last_index - 1);
 	return 0;
 err:
-- 
2.43.5



* [PATCH v2 03/11] mm/filemap: add hwpoison handling to filemap_read()
From: Jane Chu @ 2026-06-17 17:25 UTC
  To: akpm
  Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
	baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
	corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
In-Reply-To: <20260617172534.1740152-1-jane.chu@oracle.com>

Add hwpoison handling to filemap_read() so that .read_iter() can make
a best-effort copy of data out of clean pages without risking an MCE
when the page cache contains hwpoisoned pages, as suggested in [1].

[1] https://lore.kernel.org/linux-mm/aeZwAz6PcdlqSnJ2@casper.infradead.org/

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
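Illustration only: a user-space sketch of trimming a read so that it
stops before the first poisoned page. It models the folio as a
hypothetical bool array with one entry per base page, instead of
struct folio and the kernel's raw hwpoison test, and it assumes 4 KiB
base pages. It is not the kernel's adjust_range_hwpoison().

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u

/*
 * Return how many of @bytes, starting at @offset, can be read without
 * touching a poisoned page. poisoned[i] is true if page i is poisoned.
 */
static size_t demo_adjust_range(const bool *poisoned, size_t offset,
				size_t bytes)
{
	size_t page = offset / DEMO_PAGE_SIZE;
	size_t safe_bytes;

	/* The read starts inside a poisoned page: nothing is safe. */
	if (poisoned[page])
		return 0;

	/* The rest of the first page is safe to read. */
	safe_bytes = DEMO_PAGE_SIZE - (offset % DEMO_PAGE_SIZE);
	page++;

	/* Extend page by page until the request is covered or poison is hit. */
	for (; safe_bytes < bytes; safe_bytes += DEMO_PAGE_SIZE, page++)
		if (poisoned[page])
			break;

	return safe_bytes < bytes ? safe_bytes : bytes;
}
```

A zero result corresponds to the -EIO case in the diff; a short
non-zero result lets the read return the clean bytes before the
poisoned page.
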
 mm/filemap.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index a27ce4ad6247..df8543573570 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2475,6 +2475,8 @@ static void filemap_get_read_batch(struct address_space *mapping,
 
 		if (!folio_batch_add(fbatch, folio))
 			break;
+		if (folio_contain_hwpoisoned_page(folio))
+			break;
 		if (!folio_test_uptodate(folio))
 			break;
 		if (folio_test_readahead(folio))
@@ -2871,6 +2873,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 			size_t offset = iocb->ki_pos & (fsize - 1);
 			size_t bytes = min_t(loff_t, end_offset - iocb->ki_pos,
 					     fsize - offset);
+			size_t adjusted;
 			size_t copied;
 
 			if (end_offset < folio_pos(folio))
@@ -2885,13 +2888,22 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 			if (writably_mapped)
 				flush_dcache_folio(folio);
 
-			copied = copy_folio_to_iter(folio, offset, bytes, iter);
+			adjusted = bytes;
+			if (folio_contain_hwpoisoned_page(folio)) {
+				adjusted = adjust_range_hwpoison(folio, offset, bytes);
+				if (adjusted == 0) {
+					error = -EIO;
+					break;
+				}
+			}
+
+			copied = copy_folio_to_iter(folio, offset, adjusted, iter);
 
 			already_read += copied;
 			iocb->ki_pos += copied;
 			last_pos = iocb->ki_pos;
 
-			if (copied < bytes) {
+			if (copied < adjusted) {
 				error = -EFAULT;
 				break;
 			}
-- 
2.43.5



* [PATCH v2 00/11] hugetlb: Use PAGE granularity index in exported i/f and adopt the common read_iter
From: Jane Chu @ 2026-06-17 17:25 UTC
  To: akpm
  Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
	baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
	corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel

Changes in v2:
 - new patches 1-4: add hwpoison handling to filemap_read(), then
   replace hugetlbfs_read_iter() with generic_file_read_iter(),
   suggested by Matthew [2];
 - new patch 5: convert the hugetlb fault handler's vmf->pgoff to
   PAGE_SIZE granularity, following the convention used by the rest of
   mm fault handling, suggested by Matthew [2];
 - patch 6: fixed a bug in v1 pointed out by Usama Arif and by syzbot;
 - patch 8: did not pick up the Acked-by from Oscar (for 5/6 in v1) due
   to updates to the patch;
 - patch 11: add VM_WARN_ON in hugetlb_unreserve_pages(), per Oscar.

v1:
This series stems from a discussion with David [1].
The series makes a small cleanup to a few hugetlb interfaces used
outside the subsystem by standardizing them on base-page indices.
Hopefully this makes the interface semantics a bit more coherent with
the rest of mm, while the internal hugetlb code continues to use
hugepage indices where that remains the more natural fit.

[1] https://lore.kernel.org/linux-mm/9ec9edd1-0f4c-4da2-ae78-0e7b251a9e25@kernel.org/
[2] https://lore.kernel.org/linux-mm/aeZwAz6PcdlqSnJ2@casper.infradead.org/


Jane Chu (11):
  mm/memory-failure: make is_raw_hwpoison_page_in_hugepage() general
    purpose
  mm: factor out adjust_range_hwpoison() from hugetlbfs
  mm/filemap: add hwpoison handling to filemap_read()
  hugetlbfs,filemap: replace hugetlbfs_read_iter() with
    generic_file_read_iter()
  hugetlb: Convert the vmf->pgoff to PAGE_SIZE granularity
  hugetlb: make hugetlb_fault_mutex_hash() to take PAGE_SIZE index
  hugetlb: replace filemap_lock_hugetlb_folio with filemap_lock_folio
  hugetlb: make hugetlb_add_to_page_cache() to take PAGE_SIZE
    granularity index
  hugetlb: remove the hugetlb_linear_page_index() helper
  hugetlb: drop vma_hugecache_offset() in favor of linear_page_index()
  hugetlb: make hugetlb_[un]reserve_pages() to take PAGE granularity
    index

 Documentation/mm/hugetlbfs_reserv.rst |  19 ++--
 fs/hugetlbfs/inode.c                  | 155 ++++----------------------
 include/linux/fs.h                    |   2 +
 include/linux/hugetlb.h               |  36 +-----
 mm/filemap.c                          |  62 ++++++++++-
 mm/hugetlb.c                          |  87 ++++++++-------
 mm/memfd.c                            |  25 ++---
 mm/memory-failure.c                   |  12 +-
 mm/userfaultfd.c                      |   6 +-
 9 files changed, 164 insertions(+), 240 deletions(-)

-- 
2.43.5


^ permalink raw reply

* [PATCH v2 02/11] mm: factor out adjust_range_hwpoison() from hugetlbfs
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
  To: akpm
  Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
	baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
	corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
In-Reply-To: <20260617172534.1740152-1-jane.chu@oracle.com>

The functionality and implementation of adjust_range_hwpoison() are
generic, so factor it out and make it ready for generic use.

[1] https://lore.kernel.org/linux-mm/aeZwAz6PcdlqSnJ2@casper.infradead.org/

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
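Reader's note, not part of the commit: the following userspace sketch
models what adjust_range_hwpoison() computes, for anyone who wants to
check the arithmetic by hand. It assumes 4 KiB pages; the poisoned[]
array is a hypothetical stand-in for is_raw_hwpoison_page_in_folio(),
and model_adjust_range() is not a kernel symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumption: 4 KiB base pages */

/*
 * Userspace model of adjust_range_hwpoison(): poisoned[i] stands in for
 * is_raw_hwpoison_page_in_folio() on the i-th page of the folio.
 * The caller must keep [offset, offset + bytes) inside the folio.
 */
size_t model_adjust_range(const bool *poisoned, size_t nr_pages,
			  size_t offset, size_t bytes)
{
	size_t page = offset / MODEL_PAGE_SIZE;
	size_t safe_bytes;

	assert(bytes > 0 && offset + bytes <= nr_pages * MODEL_PAGE_SIZE);

	if (poisoned[page])
		return 0;

	/* The rest of the first page is safe to read. */
	safe_bytes = MODEL_PAGE_SIZE - (offset % MODEL_PAGE_SIZE);
	page++;

	/* Extend page by page until the request is covered or poison is hit. */
	for (; safe_bytes < bytes; safe_bytes += MODEL_PAGE_SIZE, page++)
		if (poisoned[page])
			break;

	return safe_bytes < bytes ? safe_bytes : bytes;
}
```

For a four-page folio whose third page is poisoned, a 10000-byte read
starting at offset 100 is trimmed to 8092 bytes: 3996 bytes from the
first page plus all 4096 bytes of the second.
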
 fs/hugetlbfs/inode.c    | 25 -------------------------
 include/linux/fs.h      |  2 ++
 include/linux/hugetlb.h |  5 -----
 mm/filemap.c            | 31 +++++++++++++++++++++++++++++++
 4 files changed, 33 insertions(+), 30 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 66520f7c53c6..f1f8c3f7388f 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -187,31 +187,6 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	return mm_get_unmapped_area_vmflags(file, addr0, len, pgoff, flags, 0);
 }
 
-/*
- * Someone wants to read @bytes from a HWPOISON hugetlb @folio from @offset.
- * Returns the maximum number of bytes one can read without touching the 1st raw
- * HWPOISON page.
- */
-static size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
-		size_t bytes)
-{
-	struct page *page = folio_page(folio, offset / PAGE_SIZE);
-	size_t safe_bytes;
-
-	if (is_raw_hwpoison_page_in_folio(page))
-		return 0;
-	/* Safe to read the remaining bytes in this page. */
-	safe_bytes = PAGE_SIZE - (offset % PAGE_SIZE);
-	page++;
-
-	/* Check each remaining page as long as we are not done yet. */
-	for (; safe_bytes < bytes; safe_bytes += PAGE_SIZE, page++)
-		if (is_raw_hwpoison_page_in_folio(page))
-			break;
-
-	return min(safe_bytes, bytes);
-}
-
 /*
  * Support for read() - Find the page attached to f_mapping and copy out the
  * data. This provides functionality similar to filemap_read().
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 11559c513dfb..3876d5beda58 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3052,6 +3052,8 @@ int generic_write_checks_count(struct kiocb *iocb, loff_t *count);
 extern int generic_write_check_limits(struct file *file, loff_t pos,
 		loff_t *count);
 extern int generic_file_rw_checks(struct file *file_in, struct file *file_out);
+bool is_raw_hwpoison_page_in_folio(struct page *page);
+size_t adjust_range_hwpoison(struct folio *folio, size_t offset, size_t bytes);
 ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *to,
 		ssize_t already_read);
 extern ssize_t generic_file_read_iter(struct kiocb *, struct iov_iter *);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index a9846f043712..218284e80451 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1078,11 +1078,6 @@ void hugetlb_register_node(struct node *node);
 void hugetlb_unregister_node(struct node *node);
 #endif
 
-/*
- * Check if a given raw @page is HWPOISON in a folio of any kind
- */
-bool is_raw_hwpoison_page_in_folio(struct page *page);
-
 static inline unsigned long huge_page_mask_align(struct file *file)
 {
 	return PAGE_MASK & ~huge_page_mask(hstate_file(file));
diff --git a/mm/filemap.c b/mm/filemap.c
index 4e636647100c..a27ce4ad6247 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2753,6 +2753,37 @@ static void filemap_end_dropbehind_read(struct folio *folio)
 	}
 }
 
+/**
+ * adjust_range_hwpoison - adjust clean readable range to avoid hwpoison.
+ * @folio: folio that contains hwpoison(s).
+ * @offset: bytes into the folio where subsequent read starts.
+ * @bytes: number of bytes wish to read.
+ *
+ * Return: adjusted total number of bytes starting off @offset that can be
+ * safely read from the @folio.
+ */
+size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
+		size_t bytes)
+{
+	struct page *page = folio_page(folio, offset / PAGE_SIZE);
+	size_t safe_bytes;
+
+	if (is_raw_hwpoison_page_in_folio(page))
+		return 0;
+
+	/* Safe to read the remaining bytes in this page. */
+	safe_bytes = PAGE_SIZE - (offset % PAGE_SIZE);
+	page++;
+
+	/* Check each remaining page as long as we are not done yet. */
+	for (; safe_bytes < bytes; safe_bytes += PAGE_SIZE, page++)
+		if (is_raw_hwpoison_page_in_folio(page))
+			break;
+
+	return min(safe_bytes, bytes);
+}
+EXPORT_SYMBOL_GPL(adjust_range_hwpoison);
+
 /**
  * filemap_read - Read data from the page cache.
  * @iocb: The iocb to read.
-- 
2.43.5


^ permalink raw reply related

* [PATCH v2 01/11] mm/memory-failure: make is_raw_hwpoison_page_in_hugepage() general purpose
From: Jane Chu @ 2026-06-17 17:25 UTC (permalink / raw)
  To: akpm
  Cc: willy, jack, viro, brauner, muchun.song, osalvador, david, hughd,
	baolin.wang, linmiaohe, nao.horiguchi, lorenzo, rppt, peterx,
	corbet, linux-doc, linux-mm, linux-kernel, linux-fsdevel
In-Reply-To: <20260617172534.1740152-1-jane.chu@oracle.com>

Make is_raw_hwpoison_page_in_hugepage() general for checking whether
a given raw page within any kind of folio is HW poisoned. Thus,
replace folio_test_hwpoison() with folio_contain_hwpoisoned_page().
Also rename to is_raw_hwpoison_page_in_folio().

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
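Reader's note, not part of the commit: a userspace sketch of why the
folio-level gate matters. It rests on my understanding, which the
commit message does not spell out, that folio_test_hwpoison() only
reflects a folio-level flag while folio_contain_hwpoisoned_page() also
accepts a large folio with a poisoned page somewhere inside. The
struct, its fields and all function names are hypothetical, and the
per-page lookup is collapsed into a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PAGES 8

/* Hypothetical stand-in for a folio; not the kernel's struct folio. */
struct model_folio {
	size_t nr_pages;
	bool head_hwpoison;	/* models the flag the old gate reads */
	bool has_hwpoisoned;	/* models "some page in a large folio is poisoned" */
	bool raw_poison[MODEL_MAX_PAGES];	/* per-page ground truth */
};

/* Old-style gate: looks only at the folio-level hwpoison flag. */
bool model_gate_old(const struct model_folio *f)
{
	return f->head_hwpoison;
}

/* New-style gate: also accepts a large folio with a poisoned page inside. */
bool model_gate_new(const struct model_folio *f)
{
	return f->head_hwpoison || (f->nr_pages > 1 && f->has_hwpoisoned);
}

/* Per-page answer, reached only when the folio-level gate passes. */
bool model_is_raw_hwpoison(const struct model_folio *f, size_t idx,
			   bool use_new_gate)
{
	assert(f->nr_pages <= MODEL_MAX_PAGES && idx < f->nr_pages);

	if (!(use_new_gate ? model_gate_new(f) : model_gate_old(f)))
		return false;
	return f->raw_poison[idx];
}
```

In this model a large folio with only a non-head page poisoned is
missed by the old gate and caught by the new one, which is the case the
generalization is meant to cover.
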
 fs/hugetlbfs/inode.c    |  4 ++--
 include/linux/hugetlb.h |  4 ++--
 mm/memory-failure.c     | 12 ++++++++++--
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 78d61bf2bd9b..66520f7c53c6 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -198,7 +198,7 @@ static size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
 	struct page *page = folio_page(folio, offset / PAGE_SIZE);
 	size_t safe_bytes;
 
-	if (is_raw_hwpoison_page_in_hugepage(page))
+	if (is_raw_hwpoison_page_in_folio(page))
 		return 0;
 	/* Safe to read the remaining bytes in this page. */
 	safe_bytes = PAGE_SIZE - (offset % PAGE_SIZE);
@@ -206,7 +206,7 @@ static size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
 
 	/* Check each remaining page as long as we are not done yet. */
 	for (; safe_bytes < bytes; safe_bytes += PAGE_SIZE, page++)
-		if (is_raw_hwpoison_page_in_hugepage(page))
+		if (is_raw_hwpoison_page_in_folio(page))
 			break;
 
 	return min(safe_bytes, bytes);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5957bc25efa8..a9846f043712 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1079,9 +1079,9 @@ void hugetlb_unregister_node(struct node *node);
 #endif
 
 /*
- * Check if a given raw @page in a hugepage is HWPOISON.
+ * Check if a given raw @page is HWPOISON in a folio of any kind
  */
-bool is_raw_hwpoison_page_in_hugepage(struct page *page);
+bool is_raw_hwpoison_page_in_folio(struct page *page);
 
 static inline unsigned long huge_page_mask_align(struct file *file)
 {
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ee42d4361309..40129e0b8213 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1834,14 +1834,21 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
 	return (struct llist_head *)&folio->_hugetlb_hwpoison;
 }
 
-bool is_raw_hwpoison_page_in_hugepage(struct page *page)
+/**
+ * is_raw_hwpoison_page_in_folio - answers the question whether a given
+ * page is indeed hwpoisoned.
+ * @page: given page, maybe base page, part of a large folio or hugetlb.
+ *
+ * Return: true if @page is the raw hwpoisoned page; else, false.
+ */
+bool is_raw_hwpoison_page_in_folio(struct page *page)
 {
 	struct llist_head *raw_hwp_head;
 	struct raw_hwp_page *p;
 	struct folio *folio = page_folio(page);
 	bool ret = false;
 
-	if (!folio_test_hwpoison(folio))
+	if (!folio_contain_hwpoisoned_page(folio))
 		return false;
 
 	if (!folio_test_hugetlb(folio))
@@ -1868,6 +1875,7 @@ bool is_raw_hwpoison_page_in_hugepage(struct page *page)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(is_raw_hwpoison_page_in_folio);
 
 static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 {
-- 
2.43.5


^ permalink raw reply related

* Re: [PATCH v6 07/10] ACPI: APEI: introduce GHES helper
From: Julian Braha @ 2026-06-17 17:17 UTC (permalink / raw)
  To: Ahmed Tiba, Rafael J. Wysocki, Tony Luck, Borislav Petkov,
	Hanjun Guo, Mauro Carvalho Chehab, Shuai Xue, Len Brown,
	Saket Dumbre, Davidlohr Bueso, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ira Weiny, Dan Williams,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jonathan Corbet,
	Shuah Khan
  Cc: linux-kernel, linux-acpi, acpica-devel, linux-cxl, devicetree,
	linux-edac, linux-doc, Dmitry.Lamerov
In-Reply-To: <20260617-topics-ahmtib01-ras_ffh_arm_internal_review-v6-7-91f725174aa0@arm.com>

Hi Ahmed,

On 6/17/26 14:54, Ahmed Tiba wrote:

> +config GHES_CPER_HELPERS
> +	bool
> +	select UEFI_CPER

This config option should probably also depend on ACPI (could just move
it into the if ACPI..endif block), or at least have a comment that
selector options ensure ACPI is enabled.

- Julian Braha

^ permalink raw reply

* Re: [PATCH v4 00/31] Introduce SCMI Telemetry FS support
From: Cristian Marussi @ 2026-06-17 17:11 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Cristian Marussi, linux-kernel, linux-arm-kernel, arm-scmi,
	linux-fsdevel, linux-doc, sudeep.holla, james.quinlan, f.fainelli,
	vincent.guittot, etienne.carriere, peng.fan, michal.simek, d-gole,
	jic23, elif.topuz, lukasz.luba, philip.radford,
	souvik.chakravarty, leitao, kas, puranjay, usama.arif,
	kernel-team
In-Reply-To: <20260617-waten-allabendlich-zueinander-93d4b1367b8c@brauner>

On Wed, Jun 17, 2026 at 02:58:09PM +0200, Christian Brauner wrote:
> On Fri, Jun 12, 2026 at 11:37:30PM +0100, Cristian Marussi wrote:
> > Hi all,
> > 

Hi Christian,

thanks for having had a look at this, first of all.

A few remarks down below...

> > --------------------------------------------------------------------------------
> > [TLDR Summary]
> > This series introduces a new SCMI driver which uses a new Telemetry FS to expose
> > and configure SCMI Telemetry Data Events retrieved from the platform SCMI FW
> > at runtime. The patches carrying the new STLMFS Filesystem support are tagged
> > with 'stlmfs'.
> > --------------------------------------------------------------------------------
> > 
> > the upcoming SCMI v4.0 specification [0] introduces a new SCMI protocol
> > dedicated to System Telemetry.
> > 
> > In a nutshell, the SCMI Telemetry protocol allows an agent to discover at
> > runtime the set of Telemetry Data Events (DEs) available on a specific
> > platform and provides the means to configure the set of DEs that a user is
> > interested in, while reading them back using the collection method that
> > is deemed more suitable for the use case at hand. (...amongst the various
> > possible collection methods allowed by SCMI specification)
> > 
> > Without delving into the gory details of the whole SCMI Telemetry protocol
> > let's just say that the SCMI platform/server firmware advertises a number
> > of Telemetry Data Events, each one identified by a 32bit unique ID, and an
> > SCMI agent/client, like Linux, can discover them and read back at will the
> > associated data value in a number of ways.
> > Data collection is mainly intended to happen on demand via shared memory
> > areas exposed by the platform firmware, discovered dynamically via SCMI
> > Telemetry and accessed by Linux on-demand, but some DEs can also be reported
> > via SCMI Notifications asynchronous messages or via direct dedicated
> > FastChannels (another kind of SCMI memory based access): all of this
> > underlying mechanism is anyway hidden from the user since it is mediated by
> > the kernel driver which will return the proper data value when queried.
> > 
> > Anyway, the set of well-known architected DE IDs defined by the spec is
> > limited to a dozen IDs, which means that the vast majority of DE IDs are
> > customizable per-platform: as a consequence, though, the same ID, say
> > '0x1234', could represent completely different things on different systems.
> > 
> > Precise definitions and semantics of such custom Data Event IDs are out of
> > the scope of the SCMI Telemetry specification and of this implementation:
> > they are supposed to be provided using some kind of JSON-like description
> > file that will have to be consumed by a userspace tool which would be
> > finally in charge of making sense of the set of available DEs.
> > 
> > IOW, in turn, this means that even though the DEs enumerated via SCMI come
> > with some sort of topological and qualitative description provided by the
> > protocol (like unit of measurements, name, topology info etc), kernel-wise
> > we CANNOT be completely sure of "what is what" without being fed-back some
> > sort of information about the DEs by the aforementioned userspace tool.
> > 
> > For these reasons, currently this series does NOT attempt to register any
> > of these DEs with any of the usual in-kernel subsystems (like HWMON, IIO,
> > PERF etc), simply because we cannot be sure which DE is suitable, or even
> > desirable, for a given subsystem. This also means there are NO in-kernel
> > users of these Telemetry data events as of now.
> > 
> > So, while we do not exclude, for the future, to feed/register some of the
> > discovered DEs to/with some of the above mentioned Kernel subsystems, as
> > of now we have ONLY modeled a custom userspace API to make SCMI Telemetry
> > available to userspace tools.
> > 
> > In deciding which kind of interface to expose SCMI Telemetry data to a
> > user, this new SCMI Telemetry driver aims at satisfying 2 main reqs:
> > 
> >  - exposing an FS-based human-readable interface that can be used to
> >    discover, configure and access our Telemetry data directly also from
> >    the shell without special tools
> > 
> >  - exposing alternative machine-friendly, more-performant, binary
> >    interfaces that can be used to avoid the overhead of multiple accesses
> >    to the VFS and that can be more suitable to access with custom tools
> > 
> > In the initial RFC posted a few months ago [1], the above was achieved
> > with a combination of a SysFS interface, for the human-readable side of
> > the story, and a classic chardev/ioctl for the plain binary access.
> > 
> > Since V1, instead, we moved away from this combined approach, especially
> > away from SysFS, for the following reasons:
> > 
> >  1. "Abusing SysFS": SysFS is a handy way to expose device related
> >       properties in a common way, using a few common helpers built on
> >       kernfs; this means, though, that unfortunately in our scenario I had
> >       to generate a dummy simple device for EACH SCMI Telemetry DataEvent
> >       that I got to discover at runtime and attach to them, all of the
> >       properties I need.
> >       This by itself seemed to me abusing the SysFS framework, but, even
> >       ignoring this, the impact on the system when we have to deal with
> >       hundreds or tens of thousands of DEs is considerable.
> >       In some test scenarios I ended up with 50k DE devices and half-a-million
> >       related property files ... O_o
> > 
> >  2. "SysFS constraints": SysFS usage itself has its well-known constraints
> >       and best practices, like the one-file/one-value rule, and due to the
> >       fact that any virtual file with a complex structure or handling logic
> >       is frowned upon, you can forget about IOCTLs and mmap'ing to provide
> >       a more performant interface within SysFs, which is the reason why,
> >       in the previous RFC, there was an additional alternative chardev
> >       interface.
> >       These latter limitations around the implementation of files with a
> >       more complex semantic (i.e. with a broader set of file_operations)
> >       derive from the underlying KernFS support, so KernFS is equally not
> >       suitable as a building block for our implementation.
> > 
> >  3. "Chardev limitations": Given the nature of the protocol, the hybrid
> >       approach employing character devices was itself problematic: first
> >       of all because there is an upper limit on the number of chardevs we
> >       can create, dictated by the range of available minor numbers, and
> >       then because having to maintain 2 completely different interfaces
> >       (FS + chardev) is itself painful.
> > 
> > As a final remark, please NOTE THAT all of this is supposed to be available
> > in production systems across a number of heterogeneous platforms: for these
> > reasons the easy choice, debugFS, is NOT an option here.
> > 
> > Due to the above reasoning, since V1 we opted for a new approach with the
> > proposed interfaces now based on a full fledged, unified, virtual pseudo
> > filesystem implemented from scratch, so that we can:
> > 
> >  - expose all the DE properties we like as before with SysFS, but without
> >    any of the constraints imposed by the usage of SysFs or kernfs.
> > 
> >  - easily expose additional alternative views of the same set of DEs
> >    using symlinking capabilities (e.g. alternative topological view)
> > 
> >  - additionally expose a few alternative and more performant interfaces
> >    by embedding in that same FS, a few special virtual files:
> > 
> >    + 'control': to issue IOCTLs for quicker discovery and on-demand access
> >    		to data
> >    + 'pipe' [TBD]: to provide a stream of events using a virtual
> >    		   infinite-style file
> >    + 'raw_<N>' [TBD]: to provide direct memory mapped access to the raw
> >    		      SCMI Telemetry data from userspace
> 
> A filesystem driver for telemetry like this is really misguided. I think
> shell access is really not an argument for adding a filesystem into the
> kernel like this. That's just not appropriate justification to push
> thousands and thousands of lines of code into the kernel.
> 
> You're building completely new infrastructure. The format is whatever it
> is. If you stream it somehow just add a binary that userspace can use to
> consume or translate it. If you need a filesystem interface for
> convenience build it via FUSE on top of whatever streams that data and
> get it out of the kernel's way.
> 

...I would not say that this was the kind of feedback I was hoping for,
but I am NOT gonna argue, given that you shot down already what I thought
were all my best selling points :P

At this point my understanding is that the way forward must be to use
a custom tool to configure/extract/translate the raw Telemetry data and
move up into userspace the whole human readable FS layer via FUSE, if
really needed.

I suppose that the new kernel/user interface has to be some dedicated char
device implementing proper fops. (like I did previously in early versions
of this series and then abandoned...)

Is this what you have in mind? Dedicated character device(s) with enough fops
to be able to configure/extract Telemetry data with a custom tool?

Should/could such a tool live in the kernel tree (tools/) at least for
ease of development/deployment ?

> You also buy into all kinds of really wonky properties. If you split it
> over multiple files you can never get a snapshot of data that is
> consistent if it's across multiple files.

There were also bulk read-all-together interfaces but it does not matter
at this point...

> 
> Telemetry over a filesystem is just not a great idea. If you did it via
> sysfs I really wouldn't care because the infrastructure
> already exists and I couldn't be bothered if this grew yet another wart
> but as a separate massive hand-rolled pseudofs, no I'm not seeing it.

...well, I am sure Greg would have shot down the SysFS approach even
more quickly, given the amount of abuse I had to carry out to be able
to fit my Telemetry sources on top of a zillion dummy devices created
just to be representable in SysFS; indeed I did try to use SysFS in the
initial RFC and it was also a lot of pain to play that game in my
context from the implementation point of view...

...so I thought that adopting this new unified FS approach would have
been much cleaner and acceptable upstream... :D

Thanks again,
Cristian


^ permalink raw reply

* Re: [swap tier discussion] Re: [PATCH v3 2/4] mm/zswap: Implement proactive writeback
From: Nhat Pham @ 2026-06-17 17:11 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: YoungJun Park, Shakeel Butt, Hao Jia, Johannes Weiner, mhocko, tj,
	mkoutny, roman.gushchin, akpm, chengming.zhou, muchun.song,
	cgroups, linux-mm, linux-kernel, linux-doc, Hao Jia, chrisl,
	kasong, baoquan.he, joshua.hahnjy
In-Reply-To: <CAO9r8zOg0OP1Ak1v7CRzSfQq0D8b4Dw+_T0Jui6YTM_KwQQNOA@mail.gmail.com>

On Tue, Jun 16, 2026 at 4:27 PM Yosry Ahmed <yosry@kernel.org> wrote:
>
> On Tue, Jun 16, 2026 at 1:24 PM Nhat Pham <nphamcs@gmail.com> wrote:
>
> Ohh I thought you meant we shouldn't allow zswap to be a tier at all,
> not the *only* tier.
>
> > Or are you suggesting that if we set zswap as the only tier then we
> > can allocate from any swapfile (since we're not doing any IO anyway)?
>
> Hmm, technically having zswap as the only tier should be equivalent to
> disabling writeback, but you're right that if zswap is the only tier
> then the memcg is not allowed to use swap slots from any swapfile, so
> zswap cannot be used. Very good point :)

Yeah the coupling of swap/zswap makes reasoning about these kinds of
things so annoying. :)

If anything, with vswap, I'll stop having to explain to folks why they
have to provision on-disk swapfile when they only want to use
in-memory compressed swap, and that's a win in my book.

>
> In this case I think yes, we need vswap to be enabled to allow making
> zswap the only tier. That's one gap between zswap being the only tier
> and disabling zswap writeback, the former requires vswap while the
> latter doesn't.

Yup! Anyway, I think Youngjun sent out v8 - let's take a look.

^ permalink raw reply

* [PATCH RFC v2 9/9] platform/x86: ideapad-laptop: Fully support auto keyboard backlight
From: Rong Zhang @ 2026-06-17 16:48 UTC (permalink / raw)
  To: Lee Jones, Pavel Machek, Jonathan Corbet, Shuah Khan,
	Thomas Weißschuh, Benson Leung, Guenter Roeck,
	Marek Behún, Mark Pearson, Derek J. Clark, Hans de Goede,
	Ilpo Järvinen, Ike Panhc
  Cc: Andrew Lunn, Jakub Kicinski, Vishnu Sankar, Vishnu Sankar,
	linux-leds, netdev, linux-doc, linux-kernel, chrome-platform,
	platform-driver-x86, Rong Zhang
In-Reply-To: <20260618-leds-trigger-hw-changed-v2-0-c28c44053cf3@rong.moe>

Currently, the auto brightness mode of keyboard backlight maps to
brightness=0 in LED classdev. The only method to switch to such a mode
is by pressing the manufacturer-defined shortcut (Fn+Space). However, 0
is a multiplexed brightness value; writing 0 simply results in the
backlight being turned off.

With brightness processing code decoupled from LED classdev, we can now
fully support the auto brightness mode. In this mode, the keyboard
backlight is controlled by the EC according to the ambient light sensor
(ALS).

To utilize this, a private hardware control trigger "ideapad-auto" is
added, with the event handling procedure calling the
led_trigger_notify_hw_control_changed() interface to activate/deactivate
the private trigger according to the current LED trigger state.

Meanwhile, block brightness changes on exit to prevent the side effect
of LED device unregistration when the private trigger is active from
resetting the brightness to zero, so that we can retain the state of
auto mode among boots.

Signed-off-by: Rong Zhang <i@rong.moe>
---
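Reader's note, not part of the commit: a userspace sketch of the edge
detection performed by ideapad_kbd_bl_notify_hw_control(). It assumes
hardware brightness 3 means auto mode, as stated in patch 7/9. The
struct and the model_* functions are hypothetical stand-ins; the real
notification goes through led_trigger_notify_hw_control_changed().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* From the description in patch 7/9: hardware brightness 3 means "auto". */
#define MODEL_AUTO_HW_BRIGHTNESS 3

/* Hypothetical record of what would have been reported to the LED core. */
struct model_led {
	int notify_calls;
	bool last_reported;
};

/* Stand-in for led_trigger_notify_hw_control_changed(); only records the call. */
void model_notify_hw_control_changed(struct model_led *led, bool hw_control)
{
	assert(led != NULL);
	led->notify_calls++;
	led->last_reported = hw_control;
}

/* Notify only when the "EC is in control" state flips, not on every change. */
void model_notify_hw_control(struct model_led *led, int hw_brightness,
			     int last_hw_brightness)
{
	bool hw_control = hw_brightness == MODEL_AUTO_HW_BRIGHTNESS;
	bool last_hw_control = last_hw_brightness == MODEL_AUTO_HW_BRIGHTNESS;

	if (hw_control != last_hw_control)
		model_notify_hw_control_changed(led, hw_control);
}
```

A change between two manual levels (for example high to low) produces
no trigger notification in this model; only entering or leaving auto
mode does.
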
 drivers/platform/x86/lenovo/ideapad-laptop.c | 63 ++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/drivers/platform/x86/lenovo/ideapad-laptop.c b/drivers/platform/x86/lenovo/ideapad-laptop.c
index 97949094ead4..a83af9bf843c 100644
--- a/drivers/platform/x86/lenovo/ideapad-laptop.c
+++ b/drivers/platform/x86/lenovo/ideapad-laptop.c
@@ -1714,9 +1714,56 @@ static int ideapad_kbd_bl_led_cdev_brightness_set(struct led_classdev *led_cdev,
 {
 	struct ideapad_private *priv = container_of(led_cdev, struct ideapad_private, kbd_bl.led);
 
+	/*
+	 * When deinitializing: It must be the side effect of led_cdev
+	 * unregistration when our private trigger is active. We've set
+	 * LED_RETAIN_AT_SHUTDOWN to retain led_cdev brightness level.
+	 * To do the same for auto mode, gate changes and return early.
+	 */
+	if (unlikely(!priv->kbd_bl.initialized))
+		return 0;
+
 	return ideapad_kbd_bl_brightness_set(priv, brightness);
 }
 
+static bool ideapad_kbd_bl_auto_trigger_offloaded(struct led_classdev *led_cdev)
+{
+	struct ideapad_private *priv = container_of(led_cdev, struct ideapad_private, kbd_bl.led);
+
+	return atomic_read(&priv->kbd_bl.last_hw_brightness) == KBD_BL_AUTO_MODE_HW_BRIGHTNESS;
+}
+
+static int ideapad_kbd_bl_auto_trigger_activate(struct led_classdev *led_cdev)
+{
+	struct ideapad_private *priv = container_of(led_cdev, struct ideapad_private, kbd_bl.led);
+
+	return ideapad_kbd_bl_hw_brightness_set(priv, KBD_BL_AUTO_MODE_HW_BRIGHTNESS);
+}
+
+static struct led_hw_trigger_type ideapad_kbd_bl_auto_trigger_type;
+
+static struct led_trigger ideapad_kbd_bl_auto_trigger = {
+	.name = "ideapad-auto",
+	.trigger_type = &ideapad_kbd_bl_auto_trigger_type,
+	.activate = ideapad_kbd_bl_auto_trigger_activate,
+	.offloaded = ideapad_kbd_bl_auto_trigger_offloaded,
+};
+
+static void ideapad_kbd_bl_notify_hw_control(struct ideapad_private *priv,
+					     int hw_brightness, int last_hw_brightness)
+{
+	bool hw_control, last_hw_control;
+
+	if (priv->kbd_bl.type != KBD_BL_TRISTATE_AUTO)
+		return;
+
+	hw_control = hw_brightness == KBD_BL_AUTO_MODE_HW_BRIGHTNESS;
+	last_hw_control = last_hw_brightness == KBD_BL_AUTO_MODE_HW_BRIGHTNESS;
+
+	if (hw_control != last_hw_control)
+		led_trigger_notify_hw_control_changed(&priv->kbd_bl.led, hw_control);
+}
+
 static void ideapad_kbd_bl_notify(struct ideapad_private *priv)
 {
 	int hw_brightness, brightness, last_brightness, last_hw_brightness;
@@ -1738,6 +1785,8 @@ static void ideapad_kbd_bl_notify(struct ideapad_private *priv)
 	if (hw_brightness == last_hw_brightness)
 		return;
 
+	ideapad_kbd_bl_notify_hw_control(priv, hw_brightness, last_hw_brightness);
+
 	last_brightness = ideapad_kbd_bl_brightness_parse(priv, last_hw_brightness);
 	if (last_brightness < 0 || brightness != last_brightness)
 		led_classdev_notify_brightness_hw_changed(&priv->kbd_bl.led, brightness);
@@ -1770,6 +1819,20 @@ static int ideapad_kbd_bl_init(struct ideapad_private *priv)
 
 	switch (priv->kbd_bl.type) {
 	case KBD_BL_TRISTATE_AUTO:
+		err = devm_led_trigger_register(&priv->platform_device->dev,
+						&ideapad_kbd_bl_auto_trigger);
+		if (err)
+			return err;
+
+		priv->kbd_bl.led.flags             |= LED_TRIG_HW_CHANGED;
+		priv->kbd_bl.led.hw_control_trigger = ideapad_kbd_bl_auto_trigger.name;
+		priv->kbd_bl.led.trigger_type       = &ideapad_kbd_bl_auto_trigger_type;
+
+		/* Hardware remembers the last brightness level, including auto mode. */
+		if (hw_brightness == KBD_BL_AUTO_MODE_HW_BRIGHTNESS)
+			priv->kbd_bl.led.default_trigger = ideapad_kbd_bl_auto_trigger.name;
+
+		fallthrough;
 	case KBD_BL_TRISTATE:
 		priv->kbd_bl.led.max_brightness = 2;
 		break;

-- 
2.53.0


^ permalink raw reply related

* [PATCH RFC v2 8/9] platform/x86: ideapad-laptop: Serialize keyboard backlight notifications
From: Rong Zhang @ 2026-06-17 16:48 UTC (permalink / raw)
  To: Lee Jones, Pavel Machek, Jonathan Corbet, Shuah Khan,
	Thomas Weißschuh, Benson Leung, Guenter Roeck,
	Marek Behún, Mark Pearson, Derek J. Clark, Hans de Goede,
	Ilpo Järvinen, Ike Panhc
  Cc: Andrew Lunn, Jakub Kicinski, Vishnu Sankar, Vishnu Sankar,
	linux-leds, netdev, linux-doc, linux-kernel, chrome-platform,
	platform-driver-x86, Rong Zhang
In-Reply-To: <20260618-leds-trigger-hw-changed-v2-0-c28c44053cf3@rong.moe>

ACPI notifications are delivered in dedicated work contexts and may
arrive simultaneously. In the following change, much work will be done
while handling the notification, which could lead to potential race
conditions.

Introduce a new mutex to serialize keyboard backlight notifications to
prevent potential race conditions.

Signed-off-by: Rong Zhang <i@rong.moe>
---
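Reader's note, not part of the commit: a userspace sketch of the race
the mutex closes. It uses a pthread mutex where the driver uses
guard(mutex), which drops the lock automatically at scope exit. The
struct layout and the changes_reported counter are hypothetical, and the
model assumes the handler both compares against and updates the last
seen hardware brightness, which this hunk does not show.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace state; the driver keeps its own in struct ideapad_private. */
struct model_kbd_bl {
	pthread_mutex_t notif_mutex;	/* serializes notification handling */
	int last_hw_brightness;
	int changes_reported;
};

/*
 * Handle one notification carrying the freshly read hardware brightness.
 * The compare-and-update sequence runs under the mutex, so two concurrent
 * notifications cannot both act on the same stale last_hw_brightness.
 */
void model_kbd_bl_notify(struct model_kbd_bl *bl, int hw_brightness)
{
	assert(bl != NULL);

	pthread_mutex_lock(&bl->notif_mutex);
	if (hw_brightness != bl->last_hw_brightness) {
		bl->last_hw_brightness = hw_brightness;
		bl->changes_reported++;
	}
	pthread_mutex_unlock(&bl->notif_mutex);
}
```

Without the lock, two handlers could both read the old value, both see
a change, and both report it.
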
 drivers/platform/x86/lenovo/ideapad-laptop.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/platform/x86/lenovo/ideapad-laptop.c b/drivers/platform/x86/lenovo/ideapad-laptop.c
index 40153dc9a5f2..97949094ead4 100644
--- a/drivers/platform/x86/lenovo/ideapad-laptop.c
+++ b/drivers/platform/x86/lenovo/ideapad-laptop.c
@@ -26,7 +26,9 @@
 #include <linux/jiffies.h>
 #include <linux/kernel.h>
 #include <linux/leds.h>
+#include <linux/lockdep.h>
 #include <linux/module.h>
+#include <linux/mutex.h>
 #include <linux/platform_device.h>
 #include <linux/platform_profile.h>
 #include <linux/power_supply.h>
@@ -228,6 +230,8 @@ struct ideapad_private {
 		int type;
 		struct led_classdev led;
 		atomic_t last_hw_brightness;
+
+		struct mutex notif_mutex; /* protects notifications */
 	} kbd_bl;
 	struct {
 		bool initialized;
@@ -1720,6 +1724,8 @@ static void ideapad_kbd_bl_notify(struct ideapad_private *priv)
 	if (!priv->kbd_bl.initialized)
 		return;
 
+	guard(mutex)(&priv->kbd_bl.notif_mutex);
+
 	hw_brightness = ideapad_kbd_bl_hw_brightness_get(priv);
 	if (hw_brightness < 0)
 		return;
@@ -1747,6 +1753,10 @@ static int ideapad_kbd_bl_init(struct ideapad_private *priv)
 	if (WARN_ON(priv->kbd_bl.initialized))
 		return -EEXIST;
 
+	err = devm_mutex_init(&priv->platform_device->dev, &priv->kbd_bl.notif_mutex);
+	if (err)
+		return err;
+
 	hw_brightness = ideapad_kbd_bl_hw_brightness_get(priv);
 	if (hw_brightness < 0)
 		return hw_brightness;

-- 
2.53.0


^ permalink raw reply related

* [PATCH RFC v2 7/9] platform/x86: ideapad-laptop: Decouple hardware & classdev brightness for keyboard backlight
From: Rong Zhang @ 2026-06-17 16:48 UTC (permalink / raw)
  To: Lee Jones, Pavel Machek, Jonathan Corbet, Shuah Khan,
	Thomas Weißschuh, Benson Leung, Guenter Roeck,
	Marek Behún, Mark Pearson, Derek J. Clark, Hans de Goede,
	Ilpo Järvinen, Ike Panhc
  Cc: Andrew Lunn, Jakub Kicinski, Vishnu Sankar, Vishnu Sankar,
	linux-leds, netdev, linux-doc, linux-kernel, chrome-platform,
	platform-driver-x86, Rong Zhang
In-Reply-To: <20260618-leds-trigger-hw-changed-v2-0-c28c44053cf3@rong.moe>

Some recent models come with an ambient light sensor (ALS). On these
models, their EC will automatically set the keyboard backlight to an
appropriate brightness when the effective "hardware brightness" is 3.
"Hardware brightness" can't be perfectly mapped to an LED classdev
brightness, but the EC does use this predefined brightness value to
represent auto mode.

Currently, the code processing keyboard backlight is coupled with LED
classdev, making it hard to expose the auto brightness (ALS) mode to the
userspace.

As the first step toward the goal, decouple hardware brightness from LED
classdev brightness, and update comments about corresponding backlight
modes.

Since upcoming changes will heavily rely on kbd_bl.last_hw_brightness,
also convert it into an atomic_t to prevent potential race conditions.

To minimize the diff in upcoming changes, a trivial refactor
also converts the initialization path into another equivalent form.

Signed-off-by: Rong Zhang <i@rong.moe>
---
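Reader's note, not part of the commit: the hardware brightness table
that this patch adds as a comment can be read as a lookup keyed by
keyboard type and hardware brightness. The sketch below encodes that
table in userspace C. The enum values here are hypothetical (the real
ones double as the KBLC command parameter), and the model_* names are
not driver symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum values; only the ordering of the rows matters here. */
enum model_kbd_bl_type {
	MODEL_KBD_BL_STANDARD,
	MODEL_KBD_BL_TRISTATE,
	MODEL_KBD_BL_TRISTATE_AUTO,
	MODEL_KBD_BL_NR_TYPES,
};

#define MODEL_HW_BRIGHTNESS_LEVELS 4

/* Row = type, column = hardware brightness; NULL marks an "N/A" cell. */
static const char *const
model_hw_brightness_names[MODEL_KBD_BL_NR_TYPES][MODEL_HW_BRIGHTNESS_LEVELS] = {
	[MODEL_KBD_BL_STANDARD]      = { "off", "on",  NULL,   NULL   },
	[MODEL_KBD_BL_TRISTATE]      = { "off", "low", "high", NULL   },
	[MODEL_KBD_BL_TRISTATE_AUTO] = { "off", "low", "high", "auto" },
};

/* Meaning of a hardware brightness value, or NULL if invalid for the type. */
const char *model_hw_brightness_name(enum model_kbd_bl_type type,
				     int hw_brightness)
{
	assert((unsigned int)type < MODEL_KBD_BL_NR_TYPES);

	if (hw_brightness < 0 || hw_brightness >= MODEL_HW_BRIGHTNESS_LEVELS)
		return NULL;
	return model_hw_brightness_names[type][hw_brightness];
}
```

Only the TRISTATE_AUTO row has a meaning for hardware brightness 3,
which is why that value cannot be expressed as a plain LED classdev
brightness level.
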
 drivers/platform/x86/lenovo/Kconfig          |   1 +
 drivers/platform/x86/lenovo/ideapad-laptop.c | 148 ++++++++++++++++++---------
 2 files changed, 103 insertions(+), 46 deletions(-)

diff --git a/drivers/platform/x86/lenovo/Kconfig b/drivers/platform/x86/lenovo/Kconfig
index 09b1b055d2e0..76ed1593e2aa 100644
--- a/drivers/platform/x86/lenovo/Kconfig
+++ b/drivers/platform/x86/lenovo/Kconfig
@@ -16,6 +16,7 @@ config IDEAPAD_LAPTOP
 	select INPUT_SPARSEKMAP
 	select NEW_LEDS
 	select LEDS_CLASS
+	select LEDS_TRIGGERS
 	help
 	  This is a driver for Lenovo IdeaPad netbooks contains drivers for
 	  rfkill switch, hotkey, fan control and backlight control.
diff --git a/drivers/platform/x86/lenovo/ideapad-laptop.c b/drivers/platform/x86/lenovo/ideapad-laptop.c
index 4fbc904f1fc3..40153dc9a5f2 100644
--- a/drivers/platform/x86/lenovo/ideapad-laptop.c
+++ b/drivers/platform/x86/lenovo/ideapad-laptop.c
@@ -9,6 +9,7 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/acpi.h>
+#include <linux/atomic.h>
 #include <linux/backlight.h>
 #include <linux/bitfield.h>
 #include <linux/bitops.h>
@@ -134,10 +135,31 @@ enum {
 };
 
 /*
- * These correspond to the number of supported states - 1
- * Future keyboard types may need a new system, if there's a collision
- * KBD_BL_TRISTATE_AUTO has no way to report or set the auto state
- * so it effectively has 3 states, but needs to handle 4
+ * The enumeration has two purposes:
+ *   - as an internal identifier for all known types of keyboard backlight
+ *   - as a mandatory parameter of the KBLC command
+ *
+ * For each type, the hardware brightness values are defined as follows:
+ * +--------------------------+----------+-----+------+------+
+ * |      Hardware brightness |        0 |   1 |    2 |    3 |
+ * | Type                     |          |     |      |      |
+ * +--------------------------+----------+-----+------+------+
+ * | KBD_BL_STANDARD          |      off |  on |  N/A |  N/A |
+ * +--------------------------+----------+-----+------+------+
+ * | KBD_BL_TRISTATE          |      off | low | high |  N/A |
+ * +--------------------------+----------+-----+------+------+
+ * | KBD_BL_TRISTATE_AUTO     |      off | low | high | auto |
+ * +--------------------------+----------+-----+------+------+
+ *
+ * We map LED classdev brightness for KBD_BL_TRISTATE_AUTO as follows:
+ * +--------------------------+----------+-----+------+
+ * |  LED classdev brightness |        0 |   1 |    2 |
+ * | Operation                |          |     |      |
+ * +--------------------------+----------+-----+------+
+ * | Read                     | off/auto | low | high |
+ * +--------------------------+----------+-----+------+
+ * | Write                    |      off | low | high |
+ * +--------------------------+----------+-----+------+
  */
 enum {
 	KBD_BL_STANDARD      = 1,
@@ -145,6 +167,8 @@ enum {
 	KBD_BL_TRISTATE_AUTO = 3,
 };
 
+#define KBD_BL_AUTO_MODE_HW_BRIGHTNESS	3
+
 #define KBD_BL_QUERY_TYPE		0x1
 #define KBD_BL_TRISTATE_TYPE		0x5
 #define KBD_BL_TRISTATE_AUTO_TYPE	0x7
@@ -203,7 +227,7 @@ struct ideapad_private {
 		bool initialized;
 		int type;
 		struct led_classdev led;
-		unsigned int last_brightness;
+		atomic_t last_hw_brightness;
 	} kbd_bl;
 	struct {
 		bool initialized;
@@ -1592,7 +1616,24 @@ static int ideapad_kbd_bl_check_tristate(int type)
 	return (type == KBD_BL_TRISTATE) || (type == KBD_BL_TRISTATE_AUTO);
 }
 
-static int ideapad_kbd_bl_brightness_get(struct ideapad_private *priv)
+static int ideapad_kbd_bl_brightness_parse(struct ideapad_private *priv, int hw_brightness)
+{
+	/* Off, low or high */
+	if (hw_brightness <= priv->kbd_bl.led.max_brightness)
+		return hw_brightness;
+
+	/* Auto (controlled by EC according to ALS), report as off */
+	if (priv->kbd_bl.type == KBD_BL_TRISTATE_AUTO &&
+	    hw_brightness == KBD_BL_AUTO_MODE_HW_BRIGHTNESS)
+		return 0;
+
+	/* Unknown value */
+	dev_warn(&priv->platform_device->dev,
+		 "Unknown keyboard backlight value: %d", hw_brightness);
+	return -EINVAL;
+}
+
+static int ideapad_kbd_bl_hw_brightness_get(struct ideapad_private *priv)
 {
 	unsigned long value;
 	int err;
@@ -1606,21 +1647,7 @@ static int ideapad_kbd_bl_brightness_get(struct ideapad_private *priv)
 		if (err)
 			return err;
 
-		/* Convert returned value to brightness level */
-		value = FIELD_GET(KBD_BL_GET_BRIGHTNESS, value);
-
-		/* Off, low or high */
-		if (value <= priv->kbd_bl.led.max_brightness)
-			return value;
-
-		/* Auto, report as off */
-		if (value == priv->kbd_bl.led.max_brightness + 1)
-			return 0;
-
-		/* Unknown value */
-		dev_warn(&priv->platform_device->dev,
-			 "Unknown keyboard backlight value: %lu", value);
-		return -EINVAL;
+		return FIELD_GET(KBD_BL_GET_BRIGHTNESS, value);
 	}
 
 	err = eval_hals(priv->adev->handle, &value);
@@ -1630,6 +1657,16 @@ static int ideapad_kbd_bl_brightness_get(struct ideapad_private *priv)
 	return !!test_bit(HALS_KBD_BL_STATE_BIT, &value);
 }
 
+static int ideapad_kbd_bl_brightness_get(struct ideapad_private *priv)
+{
+	int hw_brightness = ideapad_kbd_bl_hw_brightness_get(priv);
+
+	if (hw_brightness < 0)
+		return hw_brightness;
+
+	return ideapad_kbd_bl_brightness_parse(priv, hw_brightness);
+}
+
 static enum led_brightness ideapad_kbd_bl_led_cdev_brightness_get(struct led_classdev *led_cdev)
 {
 	struct ideapad_private *priv = container_of(led_cdev, struct ideapad_private, kbd_bl.led);
@@ -1637,32 +1674,37 @@ static enum led_brightness ideapad_kbd_bl_led_cdev_brightness_get(struct led_cla
 	return ideapad_kbd_bl_brightness_get(priv);
 }
 
-static int ideapad_kbd_bl_brightness_set(struct ideapad_private *priv, unsigned int brightness)
+static int ideapad_kbd_bl_hw_brightness_set(struct ideapad_private *priv, int hw_brightness)
 {
-	int err;
 	unsigned long value;
 	int type = priv->kbd_bl.type;
+	int err;
 
 	if (ideapad_kbd_bl_check_tristate(type)) {
-		if (brightness > priv->kbd_bl.led.max_brightness)
-			return -EINVAL;
-
-		value = FIELD_PREP(KBD_BL_SET_BRIGHTNESS, brightness) |
+		value = FIELD_PREP(KBD_BL_SET_BRIGHTNESS, hw_brightness) |
 			FIELD_PREP(KBD_BL_COMMAND_TYPE, type) |
 			KBD_BL_COMMAND_SET;
 		err = exec_kblc(priv->adev->handle, value);
 	} else {
-		err = exec_sals(priv->adev->handle, brightness ? SALS_KBD_BL_ON : SALS_KBD_BL_OFF);
+		value = hw_brightness ? SALS_KBD_BL_ON : SALS_KBD_BL_OFF;
+		err = exec_sals(priv->adev->handle, value);
 	}
-
 	if (err)
 		return err;
 
-	priv->kbd_bl.last_brightness = brightness;
+	atomic_set(&priv->kbd_bl.last_hw_brightness, hw_brightness);
 
 	return 0;
 }
 
+static int ideapad_kbd_bl_brightness_set(struct ideapad_private *priv, int brightness)
+{
+	if (brightness > priv->kbd_bl.led.max_brightness)
+		return -EINVAL;
+
+	return ideapad_kbd_bl_hw_brightness_set(priv, brightness);
+}
+
 static int ideapad_kbd_bl_led_cdev_brightness_set(struct led_classdev *led_cdev,
 						  enum led_brightness brightness)
 {
@@ -1673,26 +1715,31 @@ static int ideapad_kbd_bl_led_cdev_brightness_set(struct led_classdev *led_cdev,
 
 static void ideapad_kbd_bl_notify(struct ideapad_private *priv)
 {
-	int brightness;
+	int hw_brightness, brightness, last_brightness, last_hw_brightness;
 
 	if (!priv->kbd_bl.initialized)
 		return;
 
-	brightness = ideapad_kbd_bl_brightness_get(priv);
-	if (brightness < 0)
+	hw_brightness = ideapad_kbd_bl_hw_brightness_get(priv);
+	if (hw_brightness < 0)
 		return;
 
-	if (brightness == priv->kbd_bl.last_brightness)
-		return;
+	brightness = ideapad_kbd_bl_brightness_parse(priv, hw_brightness);
+	if (brightness < 0)
+		return; /* Reject insane values early. */
 
-	priv->kbd_bl.last_brightness = brightness;
+	last_hw_brightness = atomic_xchg(&priv->kbd_bl.last_hw_brightness, hw_brightness);
+	if (hw_brightness == last_hw_brightness)
+		return;
 
-	led_classdev_notify_brightness_hw_changed(&priv->kbd_bl.led, brightness);
+	last_brightness = ideapad_kbd_bl_brightness_parse(priv, last_hw_brightness);
+	if (last_brightness < 0 || brightness != last_brightness)
+		led_classdev_notify_brightness_hw_changed(&priv->kbd_bl.led, brightness);
 }
 
 static int ideapad_kbd_bl_init(struct ideapad_private *priv)
 {
-	int brightness, err;
+	int hw_brightness, err;
 
 	if (!priv->features.kbd_bl)
 		return -ENODEV;
@@ -1700,21 +1747,30 @@ static int ideapad_kbd_bl_init(struct ideapad_private *priv)
 	if (WARN_ON(priv->kbd_bl.initialized))
 		return -EEXIST;
 
-	if (ideapad_kbd_bl_check_tristate(priv->kbd_bl.type))
-		priv->kbd_bl.led.max_brightness = 2;
-	else
-		priv->kbd_bl.led.max_brightness = 1;
+	hw_brightness = ideapad_kbd_bl_hw_brightness_get(priv);
+	if (hw_brightness < 0)
+		return hw_brightness;
 
-	brightness = ideapad_kbd_bl_brightness_get(priv);
-	if (brightness < 0)
-		return brightness;
+	atomic_set(&priv->kbd_bl.last_hw_brightness, hw_brightness);
 
-	priv->kbd_bl.last_brightness = brightness;
 	priv->kbd_bl.led.name                    = "platform::" LED_FUNCTION_KBD_BACKLIGHT;
 	priv->kbd_bl.led.brightness_get          = ideapad_kbd_bl_led_cdev_brightness_get;
 	priv->kbd_bl.led.brightness_set_blocking = ideapad_kbd_bl_led_cdev_brightness_set;
 	priv->kbd_bl.led.flags                   = LED_BRIGHT_HW_CHANGED | LED_RETAIN_AT_SHUTDOWN;
 
+	switch (priv->kbd_bl.type) {
+	case KBD_BL_TRISTATE_AUTO:
+	case KBD_BL_TRISTATE:
+		priv->kbd_bl.led.max_brightness = 2;
+		break;
+	case KBD_BL_STANDARD:
+		priv->kbd_bl.led.max_brightness = 1;
+		break;
+	default:
+		/* This has already been validated by ideapad_check_features(). */
+		unreachable();
+	}
+
 	err = led_classdev_register(&priv->platform_device->dev, &priv->kbd_bl.led);
 	if (err)
 		return err;

-- 
2.53.0



* [PATCH RFC v2 6/9] leds: trigger: Add led_trigger_notify_hw_control_changed() interface
From: Rong Zhang @ 2026-06-17 16:48 UTC (permalink / raw)
  To: Lee Jones, Pavel Machek, Jonathan Corbet, Shuah Khan,
	Thomas Weißschuh, Benson Leung, Guenter Roeck,
	Marek Behún, Mark Pearson, Derek J. Clark, Hans de Goede,
	Ilpo Järvinen, Ike Panhc
  Cc: Andrew Lunn, Jakub Kicinski, Vishnu Sankar, Vishnu Sankar,
	linux-leds, netdev, linux-doc, linux-kernel, chrome-platform,
	platform-driver-x86, Rong Zhang
In-Reply-To: <20260618-leds-trigger-hw-changed-v2-0-c28c44053cf3@rong.moe>

Some hardware can autonomously activate/deactivate hardware control.
After that, the LED hardware notifies the LED driver. Currently, there
is no mechanism for LED drivers to notify the LED core about such events
and initiate a trigger transition to reflect the hardware state.

Add a new interface called led_trigger_notify_hw_control_changed(), so
that LED drivers can call it to notify the LED core about the
transition.

The interface only allows two transitions:

1. "none" => private trigger
2. private trigger => "none"

If the current trigger is neither the private trigger nor "none", no
transition will be made. This protects the currently selected software
trigger.
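
The allowed transitions can be summarized as a small state function.
The following is a userspace sketch only: the enum and function names
are hypothetical, and it ignores the case where activating the private
trigger fails.

```c
#include <stdbool.h>

/* Hypothetical classification of the currently selected trigger. */
enum led_trig_state {
	TRIG_NONE,	/* "none": no trigger selected */
	TRIG_PRIVATE,	/* the LED's private hw control trigger */
	TRIG_OTHER,	/* any other (software) trigger */
};

/* Return the trigger state after a hardware control change notification. */
static enum led_trig_state hw_control_transition(enum led_trig_state cur,
						 bool activate)
{
	switch (cur) {
	case TRIG_NONE:
		/* "none" => private trigger, or stay at "none". */
		return activate ? TRIG_PRIVATE : TRIG_NONE;
	case TRIG_PRIVATE:
		/* private trigger => "none", or stay at the private trigger. */
		return activate ? TRIG_PRIVATE : TRIG_NONE;
	default:
		/* A software trigger is selected: never touch it. */
		return cur;
	}
}
```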

Note that LED_OFF won't be emitted during the #2 transition, as some
hardware may have selected a new brightness level during its hardware
state transition (e.g., laptop keyboards with a shortcut cycling through
different backlight brightnesses and auto mode).

The interface is designed as a void function as any failure should be
non-fatal and the result of transition should not have any impact on the
LED drivers' event handling procedures.

To use the interface, LEDS_TRIGGERS_HW_CHANGED must be enabled in
Kconfig, and the LED driver must set the LED_TRIG_HW_CHANGED flag for
the classdev.

Signed-off-by: Rong Zhang <i@rong.moe>
---
 Documentation/leds/leds-class.rst | 61 +++++++++++++++++++++++++++
 drivers/leds/led-triggers.c       | 86 +++++++++++++++++++++++++++++++++++++--
 drivers/leds/trigger/Kconfig      |  9 ++++
 include/linux/leds.h              |  8 ++++
 4 files changed, 161 insertions(+), 3 deletions(-)

diff --git a/Documentation/leds/leds-class.rst b/Documentation/leds/leds-class.rst
index 41342ecb5f6b..f250dc938e1f 100644
--- a/Documentation/leds/leds-class.rst
+++ b/Documentation/leds/leds-class.rst
@@ -261,9 +261,70 @@ the end use hw_control_set to activate hw control.
 A trigger can use hw_control_get to check if a LED is already in hw control
 and init their flags.
 
+Alternatively, a private trigger can be implemented along with the LED driver if
+the LED's hw control doesn't fit any generic trigger. To associate the private
+trigger with the LED classdev, their `trigger_type` must be the same. The name
+of the private trigger must be the same as `hw_control_trigger`. Since both the
+LED classdev and the private trigger are in the same LED driver, it's not
+necessary for them to coordinate via `hw_control_*` callbacks.
+
 When the LED is in hw control, no software blink is possible and doing so
 will effectively disable hw control.
 
+Hardware-initiated trigger transition
+=====================================
+
+Some hardware can autonomously activate/deactivate hardware control. After that,
+the LED hardware notifies the LED driver.
+
+If the driver can detect such transitions and thus wants to notify the LED core
+to update the current trigger then the `LED_TRIG_HW_CHANGED` flag must be set in
+flags before registering. To update the current trigger accordingly, call
+`led_trigger_notify_hw_control_changed` on the LED classdev. Calling the method
+on a classdev not registered with the `LED_TRIG_HW_CHANGED` flag or an
+appropriate `hw_control_trigger` string is a bug and will trigger a WARN_ON.
+
+This capability is restricted to the LED device's private trigger. The private
+trigger must have been properly registered (see above) and named after
+`hw_control_trigger`, or else a dev_err() will be triggered.
+
+Only two transitions are defined:
+
+- "none" => private trigger
+        This happens when the hardware autonomously activates hardware control
+        and when "none" (i.e., no trigger) is currently active. If the private
+        trigger is already active when the method is called, this is essentially
+        a no-op.
+
+        The activation sequence for the private trigger will be executed as
+        normal.
+
+        The LED driver and its private trigger must be able to handle the
+        activation sequence even if the hardware is currently in hardware
+        control.
+
+        If an error occurs in the activation sequence, the LED Trigger core
+        reverts the effective trigger to "none".
+
+- private trigger => "none"
+        This happens when the hardware autonomously deactivates hardware control
+        and when the private trigger is currently active. If "none" (i.e., no
+        trigger) is active when the method is called, this is essentially a
+        no-op.
+
+        The deactivation sequence for the private trigger will be executed as
+        normal, except that the current LED brightness is retained. The reason
+        for keeping the brightness unchanged is that some hardware may choose a
+        specific brightness instead of simply turning off the LED after
+        autonomously deactivating hardware control.
+
+        The LED driver and its private trigger must be able to handle the
+        deactivation sequence even if the hardware is not currently in hardware
+        control.
+
+If the current trigger is neither the private trigger nor "none", no transition
+will be made.
+
 Known Issues
 ============
 
diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
index c43229d9c4c1..73e9ce376d02 100644
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -7,6 +7,7 @@
  * Author: Richard Purdie <rpurdie@openedhand.com>
  */
 
+#include <linux/bug.h>
 #include <linux/export.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
@@ -162,8 +163,8 @@ ssize_t led_trigger_read(struct file *filp, struct kobject *kobj,
 }
 EXPORT_SYMBOL_GPL(led_trigger_read);
 
-/* Caller must ensure led_cdev->trigger_lock held */
-int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
+static int __led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig,
+			     bool hw_triggered)
 {
 	char *event = NULL;
 	char *envp[2];
@@ -194,7 +195,21 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
 		led_cdev->trigger_data = NULL;
 		led_cdev->activated = false;
 		led_cdev->flags &= ~LED_INIT_DEFAULT_TRIGGER;
-		led_set_brightness(led_cdev, LED_OFF);
+
+		/*
+		 * Hardware may have selected a new brightness level during its
+		 * hardware control transition, so only reset brightness if we
+		 * are switching to another trigger or if the switching is not
+		 * hardware triggered.
+		 *
+		 * Note that this does not apply to the error path, as running
+		 * into the error path implies a none => private trigger
+		 * transition. This hints that the LED driver and its private
+		 * trigger must have some fundamental bugs, so don't bother
+		 * leaving the LED in an undefined state.
+		 */
+		if (trig || !hw_triggered)
+			led_set_brightness(led_cdev, LED_OFF);
 	}
 	if (trig) {
 		spin_lock(&trig->leddev_list_lock);
@@ -258,6 +273,12 @@ int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
 
 	return ret;
 }
+
+/* Caller must ensure led_cdev->trigger_lock held */
+int led_trigger_set(struct led_classdev *led_cdev, struct led_trigger *trig)
+{
+	return __led_trigger_set(led_cdev, trig, false);
+}
 EXPORT_SYMBOL_GPL(led_trigger_set);
 
 void led_trigger_remove(struct led_classdev *led_cdev)
@@ -448,6 +469,65 @@ int devm_led_trigger_register(struct device *dev,
 }
 EXPORT_SYMBOL_GPL(devm_led_trigger_register);
 
+#ifdef CONFIG_LEDS_TRIGGERS_HW_CHANGED
+static void led_trigger_do_hw_control_transition(struct led_classdev *led_cdev, bool activate,
+						 struct led_trigger *hc_trig)
+{
+	int err = 0;
+
+	if (!led_cdev->trigger) {
+		/* "none" => private trigger. */
+		if (activate)
+			err = __led_trigger_set(led_cdev, hc_trig, true);
+	} else if (led_cdev->trigger == hc_trig) {
+		/* private trigger => "none". */
+		if (!activate)
+			err = __led_trigger_set(led_cdev, NULL, true);
+	} else {
+		/* Other trigger is active. */
+		dev_dbg(led_cdev->dev,
+			"Ignoring hw control transition (%s %s) while %s is active\n",
+			activate ? "activate" : "deactivate", hc_trig->name,
+			led_cdev->trigger->name);
+
+		return;
+	}
+
+	if (err)
+		dev_warn(led_cdev->dev, "Failed to %s %s in hw control transition: %d\n",
+			 activate ? "activate" : "deactivate", hc_trig->name, err);
+}
+
+void led_trigger_notify_hw_control_changed(struct led_classdev *led_cdev, bool activate)
+{
+	struct led_trigger *trig;
+
+	/* Restricted to private triggers. */
+	if (WARN_ON(!(led_cdev->flags & LED_TRIG_HW_CHANGED) ||
+		    !led_cdev->hw_control_trigger || !led_cdev->trigger_type))
+		return;
+
+	down_read(&triggers_list_lock);
+	list_for_each_entry(trig, &trigger_list, next_trig) {
+		if (trig->trigger_type == led_cdev->trigger_type &&
+		    !strcmp(trig->name, led_cdev->hw_control_trigger)) {
+			down_write(&led_cdev->trigger_lock);
+			led_trigger_do_hw_control_transition(led_cdev, activate, trig);
+			up_write(&led_cdev->trigger_lock);
+
+			up_read(&triggers_list_lock);
+			return;
+		}
+	}
+	up_read(&triggers_list_lock);
+
+	dev_err(led_cdev->dev,
+		"%s() is called, but the private trigger (%s) is never registered\n",
+		__func__, led_cdev->hw_control_trigger);
+}
+EXPORT_SYMBOL_GPL(led_trigger_notify_hw_control_changed);
+#endif /* CONFIG_LEDS_TRIGGERS_HW_CHANGED */
+
 /* Simple LED Trigger Interface */
 
 void led_trigger_event(struct led_trigger *trig,
diff --git a/drivers/leds/trigger/Kconfig b/drivers/leds/trigger/Kconfig
index c11282a74b5a..798122154049 100644
--- a/drivers/leds/trigger/Kconfig
+++ b/drivers/leds/trigger/Kconfig
@@ -9,6 +9,15 @@ menuconfig LEDS_TRIGGERS
 
 if LEDS_TRIGGERS
 
+config LEDS_TRIGGERS_HW_CHANGED
+	bool "LED hardware-initiated trigger transition support"
+	help
+	  This option enables support for hardware-initiated hardware control
+	  transitions, where the LED hardware autonomously switches between
+	  "none" (i.e., no trigger) and its private trigger.
+
+	  See Documentation/leds/leds-class.rst for details.
+
 config LEDS_TRIGGER_TIMER
 	tristate "LED Timer Trigger"
 	help
diff --git a/include/linux/leds.h b/include/linux/leds.h
index 7332034a43c8..479391ddf5e5 100644
--- a/include/linux/leds.h
+++ b/include/linux/leds.h
@@ -109,6 +109,7 @@ struct led_classdev {
 #define LED_INIT_DEFAULT_TRIGGER BIT(23)
 #define LED_REJECT_NAME_CONFLICT BIT(24)
 #define LED_MULTI_COLOR		BIT(25)
+#define LED_TRIG_HW_CHANGED	BIT(26)
 
 	/* set_brightness_work / blink_timer flags, atomic, private. */
 	unsigned long		work_flags;
@@ -599,6 +600,13 @@ led_trigger_get_brightness(const struct led_trigger *trigger)
 
 #endif /* CONFIG_LEDS_TRIGGERS */
 
+#ifdef CONFIG_LEDS_TRIGGERS_HW_CHANGED
+void led_trigger_notify_hw_control_changed(struct led_classdev *led_cdev, bool activate);
+#else
+static inline void led_trigger_notify_hw_control_changed(struct led_classdev *led_cdev,
+							 bool activate) {}
+#endif
+
 /* Trigger specific enum */
 enum led_trigger_netdev_modes {
 	TRIGGER_NETDEV_LINK = 0,

-- 
2.53.0



* [PATCH RFC v2 5/9] leds: Add trigger_may_offload attribute
From: Rong Zhang @ 2026-06-17 16:47 UTC (permalink / raw)
  To: Lee Jones, Pavel Machek, Jonathan Corbet, Shuah Khan,
	Thomas Weißschuh, Benson Leung, Guenter Roeck,
	Marek Behún, Mark Pearson, Derek J. Clark, Hans de Goede,
	Ilpo Järvinen, Ike Panhc
  Cc: Andrew Lunn, Jakub Kicinski, Vishnu Sankar, Vishnu Sankar,
	linux-leds, netdev, linux-doc, linux-kernel, chrome-platform,
	platform-driver-x86, Rong Zhang
In-Reply-To: <20260618-leds-trigger-hw-changed-v2-0-c28c44053cf3@rong.moe>

There are multiple triggers implementing hardware control. Only "netdev"
provides a custom attribute to determine if it's offloaded to hardware
(i.e., in hardware control). For other triggers, there is no obvious way
for userspace to determine the trigger state programmatically. Moreover,
userspace can't query whether an LED device supports hardware control,
or identify these triggers.

Add a new attribute "trigger_may_offload" to the LED core, so that
userspace can determine:

- if the LED device supports hardware control (supported => visible)
- which trigger is the hardware control trigger selected by the LED
  device
- if the trigger is selected ("<foo_trigger>")
- if the trigger is offloaded ("[foo_trigger]")

Note: the documentation describes the attribute as "returning a list"
even though the LED core currently supports only one hardware control
trigger per LED device. This is intentional, so that the attribute can
be extended in the future without breaking userspace.
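
For illustration, a userspace consumer could classify one entry of the
attribute as sketched below. The enum and function names are
hypothetical, and the caller is assumed to have already read the file,
stripped the trailing newline, and split the content on whitespace.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the three states encoded by the brackets. */
enum offload_state {
	OFFLOAD_NOT_SELECTED,	/* foo_trigger */
	OFFLOAD_SELECTED,	/* <foo_trigger> */
	OFFLOAD_OFFLOADED,	/* [foo_trigger] */
};

/*
 * Classify one whitespace-free token and copy the bare trigger name into
 * name (at most name_len - 1 characters). Returns 0 on success, or -1 if
 * the token is empty, malformed, or does not fit into name.
 */
static int parse_offload_token(const char *tok, char *name, size_t name_len,
			       enum offload_state *state)
{
	size_t len = strlen(tok);
	char first = len ? tok[0] : '\0';
	char last = len ? tok[len - 1] : '\0';

	if (first == '[' || first == '<') {
		/* Need a matching closing bracket and a non-empty name. */
		if (len < 3 || last != (first == '[' ? ']' : '>'))
			return -1;
		*state = first == '[' ? OFFLOAD_OFFLOADED : OFFLOAD_SELECTED;
		tok++;
		len -= 2;
	} else {
		if (len == 0)
			return -1;
		*state = OFFLOAD_NOT_SELECTED;
	}

	if (len >= name_len)
		return -1;

	memcpy(name, tok, len);
	name[len] = '\0';
	return 0;
}
```

Handling the content token by token, rather than as a single name,
keeps such a consumer working if more than one trigger is listed later.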

Signed-off-by: Rong Zhang <i@rong.moe>
---
 .../ABI/obsolete/sysfs-class-led-trigger-netdev    | 16 ++++++++
 Documentation/ABI/testing/sysfs-class-led          | 22 +++++++++++
 .../ABI/testing/sysfs-class-led-trigger-netdev     | 13 -------
 Documentation/leds/leds-class.rst                  |  8 ++++
 drivers/leds/led-class.c                           | 23 +++++++++++
 drivers/leds/led-triggers.c                        | 45 ++++++++++++++++++++++
 drivers/leds/leds.h                                |  2 +
 drivers/leds/trigger/ledtrig-netdev.c              |  2 +
 8 files changed, 118 insertions(+), 13 deletions(-)

diff --git a/Documentation/ABI/obsolete/sysfs-class-led-trigger-netdev b/Documentation/ABI/obsolete/sysfs-class-led-trigger-netdev
new file mode 100644
index 000000000000..8d2fbfaf50c3
--- /dev/null
+++ b/Documentation/ABI/obsolete/sysfs-class-led-trigger-netdev
@@ -0,0 +1,16 @@
+What:		/sys/class/leds/<led>/offloaded
+Date:		June 2026
+KernelVersion:	7.3
+Contact:	linux-leds@vger.kernel.org
+Description:
+		Communicate whether the LED trigger modes are offloaded to
+		hardware or whether software fallback is used.
+
+		If 0, the LED is using software fallback to blink.
+
+		If 1, the LED blinking in requested mode is offloaded to
+		hardware.
+
+		/sys/class/leds/<led>/trigger_may_offload provides a generic
+		method to query the offloaded state of supported triggers,
+		superseding this attribute.
diff --git a/Documentation/ABI/testing/sysfs-class-led b/Documentation/ABI/testing/sysfs-class-led
index 0313b82644f2..edd5a9a74dfd 100644
--- a/Documentation/ABI/testing/sysfs-class-led
+++ b/Documentation/ABI/testing/sysfs-class-led
@@ -78,6 +78,28 @@ Description:
 		(which would often be configured in the device tree for the
 		hardware).
 
+What:		/sys/class/leds/<led>/trigger_may_offload
+Date:		June 2026
+KernelVersion:	7.3
+Contact:	linux-leds@vger.kernel.org
+Description:
+		Names and states of triggers that may be offloaded to hardware.
+		Such triggers are also called "hw control triggers" in some
+		contexts.
+
+		Only exists when the LED supports trigger offload.
+
+		Reading this file returns a list of triggers that are capable of
+		being offloaded. The optional brackets around the trigger name
+		indicate the state of the current trigger:
+
+		- `foo_trigger`: the trigger is not selected.
+		- `<foo_trigger>`: the trigger is selected, but falls back to
+		  software blink for some reason (e.g., incompatible trigger
+		  parameters)
+		- `[foo_trigger]`: the trigger is selected and offloaded to
+		  hardware.
+
 What:		/sys/class/leds/<led>/inverted
 Date:		January 2011
 KernelVersion:	2.6.38
diff --git a/Documentation/ABI/testing/sysfs-class-led-trigger-netdev b/Documentation/ABI/testing/sysfs-class-led-trigger-netdev
index ed46b37ab8a2..396d37a4b820 100644
--- a/Documentation/ABI/testing/sysfs-class-led-trigger-netdev
+++ b/Documentation/ABI/testing/sysfs-class-led-trigger-netdev
@@ -62,19 +62,6 @@ Description:
 		When offloaded is true, the blink interval is controlled by
 		hardware and won't reflect the value set in interval.
 
-What:		/sys/class/leds/<led>/offloaded
-Date:		Jun 2023
-KernelVersion:	6.5
-Contact:	linux-leds@vger.kernel.org
-Description:
-		Communicate whether the LED trigger modes are offloaded to
-		hardware or whether software fallback is used.
-
-		If 0, the LED is using software fallback to blink.
-
-		If 1, the LED blinking in requested mode is offloaded to
-		hardware.
-
 What:		/sys/class/leds/<led>/link_10
 Date:		Jun 2023
 KernelVersion:	6.5
diff --git a/Documentation/leds/leds-class.rst b/Documentation/leds/leds-class.rst
index 84665200a88d..41342ecb5f6b 100644
--- a/Documentation/leds/leds-class.rst
+++ b/Documentation/leds/leds-class.rst
@@ -179,6 +179,9 @@ ops and needs to declare specific support for the supported triggers.
 
 With hw control we refer to the LED driven by hardware.
 
+A sysfs attribute `trigger_may_offload` is provided for userspace to
+query supported triggers and their states.
+
 LED driver must define the following value to support hw control:
 
     - hw_control_trigger:
@@ -240,6 +243,11 @@ LED trigger must implement the following API to support hw control:
                 return a boolean indicating if the trigger is offloaded to
                 hardware.
 
+                If an LED driver specifies a hw control trigger but the
+                latter doesn't implement this callback, a dev_err_once will
+                be emitted and the LED trigger will be assumed to be not
+                offloaded.
+
 LED driver can activate additional modes by default to workaround the
 impossibility of supporting each different mode on the supported trigger.
 Examples are hardcoding the blink speed to a set interval, enable special
diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c
index 9e14ae588f78..0ac80b93b8b5 100644
--- a/drivers/leds/led-class.c
+++ b/drivers/leds/led-class.c
@@ -90,8 +90,31 @@ static const struct bin_attribute *const led_trigger_bin_attrs[] = {
 	&bin_attr_trigger,
 	NULL,
 };
+
+static DEVICE_ATTR(trigger_may_offload, 0444, led_trigger_may_offload_show, NULL);
+static struct attribute *led_trigger_attrs[] = {
+	&dev_attr_trigger_may_offload.attr,
+	NULL,
+};
+
+static umode_t led_trigger_is_visible(struct kobject *kobj,
+				      struct attribute *attr,
+				      int idx)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct led_classdev *led_cdev = dev_get_drvdata(dev);
+
+	if (attr == &dev_attr_trigger_may_offload.attr &&
+	    !led_cdev->hw_control_trigger)
+		return 0;
+
+	return attr->mode;
+}
+
 static const struct attribute_group led_trigger_group = {
 	.bin_attrs = led_trigger_bin_attrs,
+	.attrs = led_trigger_attrs,
+	.is_visible = led_trigger_is_visible,
 };
 #endif
 
diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
index b1223218bda1..c43229d9c4c1 100644
--- a/drivers/leds/led-triggers.c
+++ b/drivers/leds/led-triggers.c
@@ -313,6 +313,51 @@ void led_trigger_set_default(struct led_classdev *led_cdev)
 }
 EXPORT_SYMBOL_GPL(led_trigger_set_default);
 
+/*
+ * Caller must ensure led_cdev->trigger_lock held,
+ * and led_cdev->trigger->name must match led_cdev->hw_control_trigger.
+ */
+static bool led_trigger_get_offloaded(struct led_classdev *led_cdev)
+{
+	if (likely(led_cdev->trigger->offloaded))
+		return led_cdev->trigger->offloaded(led_cdev);
+
+	dev_err_once(led_cdev->dev,
+		     "hw control trigger %s doesn't implement offloaded(), this is a bug\n",
+		     led_cdev->trigger->name);
+	return false;
+}
+
+ssize_t led_trigger_may_offload_show(struct device *dev,
+				     struct device_attribute *attr, char *buf)
+{
+	struct led_classdev *led_cdev = dev_get_drvdata(dev);
+	bool hit, offloaded = false;
+	struct led_trigger *trig;
+	int len;
+
+	mutex_lock(&led_cdev->led_access);
+	down_read(&led_cdev->trigger_lock);
+
+	trig = led_cdev->trigger;
+
+	hit = trig && !strcmp(led_cdev->hw_control_trigger, trig->name);
+	if (hit)
+		offloaded = led_trigger_get_offloaded(led_cdev);
+
+	/* [offloaded] <active_but_not_offloaded> inactive */
+	len = sysfs_emit(buf, "%s%s%s\n",
+			 offloaded ? "[" : (hit ? "<" : ""),
+			 led_cdev->hw_control_trigger,
+			 offloaded ? "]" : (hit ? ">" : ""));
+
+	up_read(&led_cdev->trigger_lock);
+	mutex_unlock(&led_cdev->led_access);
+
+	return len;
+}
+EXPORT_SYMBOL_GPL(led_trigger_may_offload_show);
+
 /* LED Trigger Interface */
 
 int led_trigger_register(struct led_trigger *trig)
diff --git a/drivers/leds/leds.h b/drivers/leds/leds.h
index bee46651e068..9177e098989b 100644
--- a/drivers/leds/leds.h
+++ b/drivers/leds/leds.h
@@ -27,6 +27,8 @@ ssize_t led_trigger_read(struct file *filp, struct kobject *kobj,
 ssize_t led_trigger_write(struct file *filp, struct kobject *kobj,
 			const struct bin_attribute *bin_attr, char *buf,
 			loff_t pos, size_t count);
+ssize_t led_trigger_may_offload_show(struct device *dev,
+				     struct device_attribute *attr, char *buf);
 
 extern struct rw_semaphore leds_list_lock;
 extern struct list_head leds_list;
diff --git a/drivers/leds/trigger/ledtrig-netdev.c b/drivers/leds/trigger/ledtrig-netdev.c
index a26109ca4b1c..21f22eea4ab8 100644
--- a/drivers/leds/trigger/ledtrig-netdev.c
+++ b/drivers/leds/trigger/ledtrig-netdev.c
@@ -487,6 +487,8 @@ static ssize_t offloaded_show(struct device *dev,
 {
 	struct led_netdev_data *trigger_data = led_trigger_get_drvdata(dev);
 
+	dev_warn_once(dev, "offloaded attribute has been deprecated, see trigger_may_offload.\n");
+
 	return sprintf(buf, "%d\n", trigger_data->hw_control);
 }
 

-- 
2.53.0



* [PATCH RFC v2 4/9] leds: trigger: netdev: Implement offloaded() callback
From: Rong Zhang @ 2026-06-17 16:47 UTC (permalink / raw)
  To: Lee Jones, Pavel Machek, Jonathan Corbet, Shuah Khan,
	Thomas Weißschuh, Benson Leung, Guenter Roeck,
	Marek Behún, Mark Pearson, Derek J. Clark, Hans de Goede,
	Ilpo Järvinen, Ike Panhc
  Cc: Andrew Lunn, Jakub Kicinski, Vishnu Sankar, Vishnu Sankar,
	linux-leds, netdev, linux-doc, linux-kernel, chrome-platform,
	platform-driver-x86, Rong Zhang
In-Reply-To: <20260618-leds-trigger-hw-changed-v2-0-c28c44053cf3@rong.moe>

"netdev" can run in hardware control according to hardware capabilities
and trigger options. Implement offloaded() callback to provide its
hardware control state to the LED core.

Signed-off-by: Rong Zhang <i@rong.moe>
---
 drivers/leds/trigger/ledtrig-netdev.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/leds/trigger/ledtrig-netdev.c b/drivers/leds/trigger/ledtrig-netdev.c
index 64c078e997f2..a26109ca4b1c 100644
--- a/drivers/leds/trigger/ledtrig-netdev.c
+++ b/drivers/leds/trigger/ledtrig-netdev.c
@@ -754,10 +754,18 @@ static void netdev_trig_deactivate(struct led_classdev *led_cdev)
 	kfree(trigger_data);
 }
 
+static bool netdev_trig_offloaded(struct led_classdev *led_cdev)
+{
+	struct led_netdev_data *trigger_data = led_get_trigger_data(led_cdev);
+
+	return trigger_data->hw_control;
+}
+
 static struct led_trigger netdev_led_trigger = {
 	.name = "netdev",
 	.activate = netdev_trig_activate,
 	.deactivate = netdev_trig_deactivate,
+	.offloaded = netdev_trig_offloaded,
 	.groups = netdev_trig_groups,
 };
 

-- 
2.53.0



* [PATCH RFC v2 3/9] leds: turris-omnia: Implement offloaded() callback for trigger
From: Rong Zhang @ 2026-06-17 16:47 UTC (permalink / raw)
  To: Lee Jones, Pavel Machek, Jonathan Corbet, Shuah Khan,
	Thomas Weißschuh, Benson Leung, Guenter Roeck,
	Marek Behún, Mark Pearson, Derek J. Clark, Hans de Goede,
	Ilpo Järvinen, Ike Panhc
  Cc: Andrew Lunn, Jakub Kicinski, Vishnu Sankar, Vishnu Sankar,
	linux-leds, netdev, linux-doc, linux-kernel, chrome-platform,
	platform-driver-x86, Rong Zhang
In-Reply-To: <20260618-leds-trigger-hw-changed-v2-0-c28c44053cf3@rong.moe>

"omnia-mcu" is a private hardware control trigger which always stays in
hardware control mode. Implement the offloaded() callback so that it
always returns true to reflect this.

Meanwhile, declare it as a hardware control trigger, which was forgotten
previously.

Signed-off-by: Rong Zhang <i@rong.moe>
---
 drivers/leds/leds-turris-omnia.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/leds/leds-turris-omnia.c b/drivers/leds/leds-turris-omnia.c
index 25ee5c1eb820..8e016ca86403 100644
--- a/drivers/leds/leds-turris-omnia.c
+++ b/drivers/leds/leds-turris-omnia.c
@@ -195,10 +195,16 @@ static void omnia_hwtrig_deactivate(struct led_classdev *cdev)
 			err);
 }
 
+static bool omnia_hwtrig_offloaded(struct led_classdev *cdev)
+{
+	return true;
+}
+
 static struct led_trigger omnia_hw_trigger = {
 	.name		= "omnia-mcu",
 	.activate	= omnia_hwtrig_activate,
 	.deactivate	= omnia_hwtrig_deactivate,
+	.offloaded	= omnia_hwtrig_offloaded,
 	.trigger_type	= &omnia_hw_trigger_type,
 };
 
@@ -251,6 +257,7 @@ static int omnia_led_register(struct i2c_client *client, struct omnia_led *led,
 	 * by LED class from the linux,default-trigger property.
 	 */
 	cdev->default_trigger = omnia_hw_trigger.name;
+	cdev->hw_control_trigger = omnia_hw_trigger.name;
 
 	/* Put the LED into software mode */
 	ret = omnia_cmd_write_u8(client, OMNIA_CMD_LED_MODE, OMNIA_CMD_LED_MODE_LED(led->reg) |

-- 
2.53.0



* [PATCH RFC v2 2/9] leds: cros_ec: Implement offloaded() callback for trigger
From: Rong Zhang @ 2026-06-17 16:47 UTC (permalink / raw)
  To: Lee Jones, Pavel Machek, Jonathan Corbet, Shuah Khan,
	Thomas Weißschuh, Benson Leung, Guenter Roeck,
	Marek Behún, Mark Pearson, Derek J. Clark, Hans de Goede,
	Ilpo Järvinen, Ike Panhc
  Cc: Andrew Lunn, Jakub Kicinski, Vishnu Sankar, Vishnu Sankar,
	linux-leds, netdev, linux-doc, linux-kernel, chrome-platform,
	platform-driver-x86, Rong Zhang
In-Reply-To: <20260618-leds-trigger-hw-changed-v2-0-c28c44053cf3@rong.moe>

"chromeos-auto" is a private hardware control trigger which always stays
in hardware control. Implement the offloaded() callback so that it always
returns true to reflect this.

Signed-off-by: Rong Zhang <i@rong.moe>
---
 drivers/leds/leds-cros_ec.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/leds/leds-cros_ec.c b/drivers/leds/leds-cros_ec.c
index bea3cc3fbfd2..f48e3cf6ccf6 100644
--- a/drivers/leds/leds-cros_ec.c
+++ b/drivers/leds/leds-cros_ec.c
@@ -86,12 +86,18 @@ static int cros_ec_led_trigger_activate(struct led_classdev *led_cdev)
 	return cros_ec_led_send_cmd(priv->cros_ec, &arg);
 }
 
+static bool cros_ec_led_trigger_offloaded(struct led_classdev *led_cdev)
+{
+	return true;
+}
+
 static struct led_hw_trigger_type cros_ec_led_trigger_type;
 
 static struct led_trigger cros_ec_led_trigger = {
 	.name = "chromeos-auto",
 	.trigger_type = &cros_ec_led_trigger_type,
 	.activate = cros_ec_led_trigger_activate,
+	.offloaded = cros_ec_led_trigger_offloaded,
 };
 
 static int cros_ec_led_brightness_set_blocking(struct led_classdev *led_cdev,

-- 
2.53.0



* [PATCH RFC v2 1/9] leds: Add callback offloaded() to query the state of hardware control trigger
From: Rong Zhang @ 2026-06-17 16:47 UTC (permalink / raw)
  To: Lee Jones, Pavel Machek, Jonathan Corbet, Shuah Khan,
	Thomas Weißschuh, Benson Leung, Guenter Roeck,
	Marek Behún, Mark Pearson, Derek J. Clark, Hans de Goede,
	Ilpo Järvinen, Ike Panhc
  Cc: Andrew Lunn, Jakub Kicinski, Vishnu Sankar, Vishnu Sankar,
	linux-leds, netdev, linux-doc, linux-kernel, chrome-platform,
	platform-driver-x86, Rong Zhang
In-Reply-To: <20260618-leds-trigger-hw-changed-v2-0-c28c44053cf3@rong.moe>

There are multiple triggers implementing hardware control. However, the
LED core doesn't really know the hardware control state since the
coordination is done directly between the trigger and the LED device.

Add an offloaded() callback so that the LED core can query the hardware
control state.

Signed-off-by: Rong Zhang <i@rong.moe>
---
 Documentation/leds/leds-class.rst | 5 +++++
 include/linux/leds.h              | 1 +
 2 files changed, 6 insertions(+)

diff --git a/Documentation/leds/leds-class.rst b/Documentation/leds/leds-class.rst
index 5db620ed27aa..84665200a88d 100644
--- a/Documentation/leds/leds-class.rst
+++ b/Documentation/leds/leds-class.rst
@@ -235,6 +235,11 @@ LED driver must implement the following API to support hw control:
                 Returns a pointer to a struct device or NULL if nothing
                 is currently attached.
 
+LED trigger must implement the following API to support hw control:
+    - offloaded:
+                return a boolean indicating if the trigger is offloaded to
+                hardware.
+
 LED driver can activate additional modes by default to workaround the
 impossibility of supporting each different mode on the supported trigger.
 Examples are hardcoding the blink speed to a set interval, enable special
diff --git a/include/linux/leds.h b/include/linux/leds.h
index b16b803cc1ac..7332034a43c8 100644
--- a/include/linux/leds.h
+++ b/include/linux/leds.h
@@ -485,6 +485,7 @@ struct led_trigger {
 	const char	 *name;
 	int		(*activate)(struct led_classdev *led_cdev);
 	void		(*deactivate)(struct led_classdev *led_cdev);
+	bool		(*offloaded)(struct led_classdev *led_cdev);
 
 	/* Brightness set by led_trigger_event */
 	enum led_brightness brightness;

-- 
2.53.0



* [PATCH RFC v2 0/9] leds: Add support for hardware-initiated hardware control trigger transition
From: Rong Zhang @ 2026-06-17 16:47 UTC (permalink / raw)
  To: Lee Jones, Pavel Machek, Jonathan Corbet, Shuah Khan,
	Thomas Weißschuh, Benson Leung, Guenter Roeck,
	Marek Behún, Mark Pearson, Derek J. Clark, Hans de Goede,
	Ilpo Järvinen, Ike Panhc
  Cc: Andrew Lunn, Jakub Kicinski, Vishnu Sankar, Vishnu Sankar,
	linux-leds, netdev, linux-doc, linux-kernel, chrome-platform,
	platform-driver-x86, Rong Zhang

Some laptops can tune their keyboard backlight according to ambient
light sensors (auto mode). This capability is essentially a hardware
control trigger. Meanwhile, such laptops also offer a shortcut for
cycling through brightness levels and auto mode. For example, on
ThinkBook, pressing Fn+Space cycles keyboard backlight levels in the
following sequence:

  1 => 2 => 0 => auto => 1 ...

Recent ThinkPad models should have a similar sequence too.

However, there are some issues preventing us from using a private
hardware control trigger:

1. We want a mechanism to tell userspace which trigger is the hardware
   control one, so that userspace can determine if auto mode is on/off,
   as well as turning it on/off programmatically without obtaining the
   trigger's name via other channels
2. Turning auto mode on/off via the shortcut cannot activate/deactivate
   the corresponding hardware control trigger, making the software state
   out of sync
3. Even with #1 resolved, deactivating the hardware control trigger has
   a side effect of emitting LED_OFF, breaking the shortcut cycle, where
   "auto => 1" requires the driver to deactivate the trigger

This RFC series tries to demonstrate a path toward solving these issues:

- Introduce an attribute "trigger_may_offload", so that userspace can
  determine:
  - if the LED device supports hardware control (supported => visible)
  - which trigger is the hardware control trigger selected by the LED
    device
  - if the trigger is selected ("<foo_trigger>")
  - if the trigger is offloaded ("[foo_trigger]")
    - A callback offloaded() is added so that LED triggers can report
      their hardware control state
- Add led_trigger_notify_hw_control_changed() interface, so that LED
  drivers can notify the LED core about hardware-initiated hardware
  control transitions. The LED core will then determine if the
  transition is allowed and switch between "none" (i.e., no trigger)
  and the device's private trigger accordingly
  - This capability is restricted to the device's private trigger. If
    the current trigger is neither the private trigger nor "none", no
    transition will be made
  - This interface is gated behind Kconfig LEDS_TRIGGERS_HW_CHANGED and
    LED device flag LED_TRIG_HW_CHANGED
- Tune the logic of trigger deactivation so that it won't emit LED_OFF
  when the deactivation is triggered by hardware
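
As a rough illustration of how userspace might consume the proposed
"trigger_may_offload" attribute, here is a minimal sketch in C. It assumes
the attribute holds a single trigger name followed by a newline, wrapped
in "<>" when selected and in "[]" when offloaded, as described above.
Treating a bare name as "supported but not selected" is an assumption, as
are the function name parse_may_offload(), the enum, and the trigger name
"kbd-auto" used in the usage note; none of them come from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace view of the attribute value. */
enum offload_state {
	OFFLOAD_INVALID,	/* empty or oversized value */
	OFFLOAD_AVAILABLE,	/* "foo": assumed to mean "not selected" */
	OFFLOAD_SELECTED,	/* "<foo>": selected, not offloaded */
	OFFLOAD_OFFLOADED,	/* "[foo]": selected and offloaded */
};

/*
 * Parse one attribute value and copy the bare trigger name into
 * name[]. Returns OFFLOAD_INVALID if the name is empty or does not
 * fit into name_len bytes (including the terminating NUL).
 */
static enum offload_state parse_may_offload(const char *value,
					    char *name, size_t name_len)
{
	enum offload_state state = OFFLOAD_AVAILABLE;
	size_t len;

	assert(value && name && name_len > 0);

	/* Drop the trailing newline (and stray spaces), if any. */
	len = strlen(value);
	while (len > 0 && (value[len - 1] == '\n' || value[len - 1] == ' '))
		len--;

	if (len >= 2 && value[0] == '<' && value[len - 1] == '>')
		state = OFFLOAD_SELECTED;
	else if (len >= 2 && value[0] == '[' && value[len - 1] == ']')
		state = OFFLOAD_OFFLOADED;

	/* Strip the surrounding brackets. */
	if (state != OFFLOAD_AVAILABLE) {
		value++;
		len -= 2;
	}

	if (len == 0 || len >= name_len)
		return OFFLOAD_INVALID;

	memcpy(name, value, len);
	name[len] = '\0';
	return state;
}
```

With this sketch, a value of "[kbd-auto]\n" would be reported as
OFFLOAD_OFFLOADED with the name "kbd-auto", which userspace could then
write to the "trigger" attribute without learning the name elsewhere.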

The last three patches are included in the RFC series to demonstrate how
these interfaces are supposed to be utilized, so that ideapad-laptop
can expose the auto mode of ThinkBook's keyboard backlight. They can be
submitted separately once the dust settles, if preferred.

[ Summary of other approaches ]

< custom attribute >

Pros:
- simplicity, KISS
- no need to touch the LED core
- extensible as long as it has a sensor-neutral name
  - a sensor-related name could potentially lead to a mess if a future
    device implements auto mode based on multiple different sensors

Cons:
- must have zero influence on brightness_set[_blocking] callbacks
  in order not to break triggers
  - potential interference with triggers and the brightness attribute
- weird semantics (an attribute other than "brightness" and "trigger"
  changes the brightness)

< private hardware control trigger (this series) >

Pros:
- mutually exclusive with other triggers (hence less chaos)
- semantic correctness
- acts as an aggregate switch to turn on/off auto mode even if a future
  device implements auto mode based on multiple different sensors
  - extensibility (through trigger attributes)

Cons:
- complexity

[ Previous discussion threads ]

https://lore.kernel.org/r/08580ec5-1d7b-4612-8a3f-75bc2f40aad2@app.fastmail.com
https://lore.kernel.org/r/1dbfcf656cdb4af0299f90d7426d2ec7e2b8ac9e.camel@rong.moe

Signed-off-by: Rong Zhang <i@rong.moe>
---
Changes in v2:
- Restrict the led_trigger_notify_hw_control_changed() interface to
  private triggers only
  - Drop PATCH v1 1/9 ("leds: Load trigger modules on-demand if used as
    hw control trigger"), not relevant any more
- Gate the led_trigger_notify_hw_control_changed() interface behind
  Kconfig LEDS_TRIGGERS_HW_CHANGED and LED device flag
  LED_TRIG_HW_CHANGED
- Fix lock ordering inversion
- ideapad-laptop:
  - Only call led_trigger_notify_hw_control_changed() when needed
  - Serialize keyboard backlight notifications
- Reword commit messages and documentation
- Link to v1: https://patch.msgid.link/20260227190617.271388-1-i@rong.moe

---
Rong Zhang (9):
      leds: Add callback offloaded() to query the state of hardware control trigger
      leds: cros_ec: Implement offloaded() callback for trigger
      leds: turris-omnia: Implement offloaded() callback for trigger
      leds: trigger: netdev: Implement offloaded() callback
      leds: Add trigger_may_offload attribute
      leds: trigger: Add led_trigger_notify_hw_control_changed() interface
      platform/x86: ideapad-laptop: Decouple hardware & classdev brightness for keyboard backlight
      platform/x86: ideapad-laptop: Serialize keyboard backlight notifications
      platform/x86: ideapad-laptop: Fully support auto keyboard backlight

 .../ABI/obsolete/sysfs-class-led-trigger-netdev    |  16 ++
 Documentation/ABI/testing/sysfs-class-led          |  22 +++
 .../ABI/testing/sysfs-class-led-trigger-netdev     |  13 --
 Documentation/leds/leds-class.rst                  |  74 +++++++
 drivers/leds/led-class.c                           |  23 +++
 drivers/leds/led-triggers.c                        | 131 +++++++++++-
 drivers/leds/leds-cros_ec.c                        |   6 +
 drivers/leds/leds-turris-omnia.c                   |   7 +
 drivers/leds/leds.h                                |   2 +
 drivers/leds/trigger/Kconfig                       |   9 +
 drivers/leds/trigger/ledtrig-netdev.c              |  10 +
 drivers/platform/x86/lenovo/Kconfig                |   1 +
 drivers/platform/x86/lenovo/ideapad-laptop.c       | 219 ++++++++++++++++-----
 include/linux/leds.h                               |   9 +
 14 files changed, 481 insertions(+), 61 deletions(-)
---
base-commit: 66affa37cfac0aec061cc4bcf4a065b0c52f7e19
change-id: 20260506-leds-trigger-hw-changed-96a62188cbdf

Thanks,
Rong



* Re: [PATCH v5 4/6] alloc_tag: add accuracy based filtering to ioctl
From: Suren Baghdasaryan @ 2026-06-17 16:31 UTC (permalink / raw)
  To: Abhishek Bapat
  Cc: Andrew Morton, Kent Overstreet, Hao Ge, Shuah Khan,
	Jonathan Corbet, linux-doc, linux-kernel, linux-mm, Sourav Panda
In-Reply-To: <db41f6b4a1ec7429be79b3b342f1ac8cf1300e72.1781564384.git.abhishekbapat@google.com>

On Mon, Jun 15, 2026 at 4:04 PM Abhishek Bapat <abhishekbapat@google.com> wrote:
>
> Extend the allocinfo filtering mechanism to allow users to filter tags
> based on their accuracy.
>
> Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
> Acked-by: Hao Ge <hao.ge@linux.dev>

Acked-by: Suren Baghdasaryan <surenb@google.com>

> ---
>  include/uapi/linux/alloc_tag.h | 4 ++++
>  lib/alloc_tag.c                | 8 ++++++++
>  2 files changed, 12 insertions(+)
>
> diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
> index 7f5acbb44c14..6ea39c4869fe 100644
> --- a/include/uapi/linux/alloc_tag.h
> +++ b/include/uapi/linux/alloc_tag.h
> @@ -26,6 +26,8 @@ struct allocinfo_tag {
>         char function[ALLOCINFO_STR_SIZE];
>         char filename[ALLOCINFO_STR_SIZE];
>         __u64 lineno;
> +       /* filter criteria only; see allocinfo_counter.accurate for actual accuracy */
> +       __u64 inaccurate;
>  };
>
>  /* The alignment ensures 32-bit compatible interfaces are not broken */
> @@ -45,6 +47,7 @@ enum {
>         ALLOCINFO_FILTER_FUNCTION,
>         ALLOCINFO_FILTER_FILENAME,
>         ALLOCINFO_FILTER_LINENO,
> +       ALLOCINFO_FILTER_INACCURATE,
>         ALLOCINFO_FILTER_MIN_SIZE,
>         ALLOCINFO_FILTER_MAX_SIZE,
>         __ALLOCINFO_FILTER_LAST = ALLOCINFO_FILTER_MAX_SIZE
> @@ -54,6 +57,7 @@ enum {
>  #define ALLOCINFO_FILTER_MASK_FUNCTION         (1 << ALLOCINFO_FILTER_FUNCTION)
>  #define ALLOCINFO_FILTER_MASK_FILENAME         (1 << ALLOCINFO_FILTER_FILENAME)
>  #define ALLOCINFO_FILTER_MASK_LINENO           (1 << ALLOCINFO_FILTER_LINENO)
> +#define ALLOCINFO_FILTER_MASK_INACCURATE       (1 << ALLOCINFO_FILTER_INACCURATE)
>  #define ALLOCINFO_FILTER_MASK_MIN_SIZE         (1 << ALLOCINFO_FILTER_MIN_SIZE)
>  #define ALLOCINFO_FILTER_MASK_MAX_SIZE         (1 << ALLOCINFO_FILTER_MAX_SIZE)
>
> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> index b3d21834b61e..4fb3653cb876 100644
> --- a/lib/alloc_tag.c
> +++ b/lib/alloc_tag.c
> @@ -253,6 +253,8 @@ static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter,
>                            struct alloc_tag_counters *counters,
>                            bool *fetched_counters)
>  {
> +       bool inaccurate;
> +
>         if (!filter || !filter->mask)
>                 return true;
>
> @@ -278,6 +280,12 @@ static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter,
>             ct->lineno != filter->fields.lineno)
>                 return false;
>
> +       if (filter->mask & ALLOCINFO_FILTER_MASK_INACCURATE) {
> +               inaccurate = !!(ct->flags & CODETAG_FLAG_INACCURATE);
> +               if (inaccurate != !!(filter->fields.inaccurate))
> +                       return false;
> +       }
> +
>         if (filter->mask & (ALLOCINFO_FILTER_MASK_MIN_SIZE | ALLOCINFO_FILTER_MASK_MAX_SIZE)) {
>                 if (!*fetched_counters) {
>                         *counters = allocinfo_prefetch_counters(ct);
> --
> 2.54.0.1136.gdb2ca164c4-goog
>


* Re: [PATCH v5 3/6] alloc_tag: add size-based filtering to ioctl
From: Suren Baghdasaryan @ 2026-06-17 16:29 UTC (permalink / raw)
  To: Abhishek Bapat
  Cc: Andrew Morton, Kent Overstreet, Hao Ge, Shuah Khan,
	Jonathan Corbet, linux-doc, linux-kernel, linux-mm, Sourav Panda
In-Reply-To: <7d98db60ab0fddab230b1a7a32140f3361ab42cf.1781564384.git.abhishekbapat@google.com>

On Mon, Jun 15, 2026 at 4:04 PM Abhishek Bapat <abhishekbapat@google.com> wrote:
>
> Extend the allocinfo filtering mechanism to allow users to filter tags
> based on the total number of bytes allocated [min_size, max_size]. The
> size range is inclusive.
>
> Filtering by size involves retrieving allocinfo per-CPU counters, which
> is an expensive operation. Hence, the performance of size-based
> filtering will be worse than other filters.
>
> Signed-off-by: Abhishek Bapat <abhishekbapat@google.com>
> Acked-by: Hao Ge <hao.ge@linux.dev>
> ---
>  include/uapi/linux/alloc_tag.h |  8 ++++-
>  lib/alloc_tag.c                | 63 ++++++++++++++++++++++++++++------
>  2 files changed, 59 insertions(+), 12 deletions(-)
>
> diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
> index 3b11877955b9..7f5acbb44c14 100644
> --- a/include/uapi/linux/alloc_tag.h
> +++ b/include/uapi/linux/alloc_tag.h
> @@ -45,13 +45,17 @@ enum {
>         ALLOCINFO_FILTER_FUNCTION,
>         ALLOCINFO_FILTER_FILENAME,
>         ALLOCINFO_FILTER_LINENO,
> -       __ALLOCINFO_FILTER_LAST = ALLOCINFO_FILTER_LINENO
> +       ALLOCINFO_FILTER_MIN_SIZE,
> +       ALLOCINFO_FILTER_MAX_SIZE,
> +       __ALLOCINFO_FILTER_LAST = ALLOCINFO_FILTER_MAX_SIZE
>  };
>
>  #define ALLOCINFO_FILTER_MASK_MODNAME          (1 << ALLOCINFO_FILTER_MODNAME)
>  #define ALLOCINFO_FILTER_MASK_FUNCTION         (1 << ALLOCINFO_FILTER_FUNCTION)
>  #define ALLOCINFO_FILTER_MASK_FILENAME         (1 << ALLOCINFO_FILTER_FILENAME)
>  #define ALLOCINFO_FILTER_MASK_LINENO           (1 << ALLOCINFO_FILTER_LINENO)
> +#define ALLOCINFO_FILTER_MASK_MIN_SIZE         (1 << ALLOCINFO_FILTER_MIN_SIZE)
> +#define ALLOCINFO_FILTER_MASK_MAX_SIZE         (1 << ALLOCINFO_FILTER_MAX_SIZE)
>
>  #define ALLOCINFO_FILTER_MASKS \
>         ((1 << (__ALLOCINFO_FILTER_LAST + 1)) - 1)
> @@ -59,6 +63,8 @@ enum {
>  struct allocinfo_filter {
>         __u64 mask; /* bitmask of the filter fields used */
>         struct allocinfo_tag fields;
> +       __u64 min_size;
> +       __u64 max_size;
>  };
>
>  struct allocinfo_get_at {
> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
> index 5feb61d9fb92..b3d21834b61e 100644
> --- a/lib/alloc_tag.c
> +++ b/lib/alloc_tag.c
> @@ -195,15 +195,26 @@ static int allocinfo_cmp_str(const char *str, const char *template)
>         return strncmp(allocinfo_str(str), template, ALLOCINFO_STR_SIZE);
>  }
>
> +/* Fetch the per-CPU counters */
> +static inline struct alloc_tag_counters allocinfo_prefetch_counters(struct codetag *ct)
> +{
> +       return alloc_tag_read(ct_to_alloc_tag(ct));
> +}
> +
>  /*
>   * Populates the UAPI allocinfo_tag_data structure with active runtime
>   * profiling counters extracted from the given kernel codetag.
>   */
>  static void allocinfo_to_params(struct codetag *ct,
> -                               struct allocinfo_tag_data *data)
> +                               struct allocinfo_tag_data *data,
> +                               struct alloc_tag_counters *counters)
>  {
> -       struct alloc_tag *tag = ct_to_alloc_tag(ct);
> -       struct alloc_tag_counters counter = alloc_tag_read(tag);
> +       struct alloc_tag_counters local_counters;
> +
> +       if (!counters) {
> +               local_counters = allocinfo_prefetch_counters(ct);
> +               counters = &local_counters;
> +       }
>
>         if (ct->modname)
>                 allocinfo_copy_str(data->tag.modname, ct->modname);
> @@ -212,9 +223,9 @@ static void allocinfo_to_params(struct codetag *ct,
>         allocinfo_copy_str(data->tag.function, ct->function);
>         allocinfo_copy_str(data->tag.filename, ct->filename);
>         data->tag.lineno = ct->lineno;
> -       data->counter.bytes = counter.bytes;
> -       data->counter.calls = counter.calls;
> -       data->counter.accurate = !alloc_tag_is_inaccurate(tag);
> +       data->counter.bytes = counters->bytes;
> +       data->counter.calls = counters->calls;
> +       data->counter.accurate = !alloc_tag_is_inaccurate(ct_to_alloc_tag(ct));
>  }
>
>  /*
> @@ -238,7 +249,9 @@ static int allocinfo_ioctl_get_content_id(struct seq_file *m, void __user *arg)
>   * Verifies whether a given codetag satisfies the active filtering criteria by
>   * matching its characteristics against the specified filter.
>   */
> -static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter)
> +static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter,
> +                          struct alloc_tag_counters *counters,
> +                          bool *fetched_counters)
>  {
>         if (!filter || !filter->mask)
>                 return true;
> @@ -265,6 +278,19 @@ static bool matches_filter(struct codetag *ct, struct allocinfo_filter *filter)
>             ct->lineno != filter->fields.lineno)
>                 return false;
>
> +       if (filter->mask & (ALLOCINFO_FILTER_MASK_MIN_SIZE | ALLOCINFO_FILTER_MASK_MAX_SIZE)) {
> +               if (!*fetched_counters) {
> +                       *counters = allocinfo_prefetch_counters(ct);
> +                       *fetched_counters = true;
> +               }
> +               if ((filter->mask & ALLOCINFO_FILTER_MASK_MIN_SIZE) &&
> +                   counters->bytes < filter->min_size)
> +                       return false;
> +               if ((filter->mask & ALLOCINFO_FILTER_MASK_MAX_SIZE) &&
> +                   counters->bytes > filter->max_size)
> +                       return false;
> +       }
> +
>         return true;
>  }
>
> @@ -278,6 +304,8 @@ static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
>         struct codetag *ct;
>         struct allocinfo_get_at params = {0};
>         __u64 skip_count;
> +       struct alloc_tag_counters counters;
> +       bool fetched_counters;
>
>         if (copy_from_user(&params, arg, sizeof(params)))
>                 return -EFAULT;
> @@ -285,6 +313,11 @@ static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
>         if (params.filter.mask & ~ALLOCINFO_FILTER_MASKS)
>                 return -EINVAL;
>
> +       if ((params.filter.mask & ALLOCINFO_FILTER_MASK_MIN_SIZE) &&
> +           (params.filter.mask & ALLOCINFO_FILTER_MASK_MAX_SIZE) &&
> +           params.filter.min_size > params.filter.max_size)
> +               return -EINVAL;
> +
>         priv = m->private;
>
>         mutex_lock(&priv->ioctl_lock);
> @@ -308,7 +341,8 @@ static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
>         ct = codetag_next_ct(&priv->ioctl_iter);
>
>         while (ct) {
> -               if (matches_filter(ct, &priv->filter)) {
> +               fetched_counters = false;
> +               if (matches_filter(ct, &priv->filter, &counters, &fetched_counters)) {

Do we really need this "fetched_counters" parameter? Here are the
possible cases:
1. If the filter does not include ALLOCINFO_FILTER_MASK_MIN_SIZE |
ALLOCINFO_FILTER_MASK_MAX_SIZE then counters would not be fetched.
2. If the filter includes ALLOCINFO_FILTER_MASK_MIN_SIZE |
ALLOCINFO_FILTER_MASK_MAX_SIZE and
2.1. matches_filter() returns true then we know counters were fetched
because they had to be validated.
2.2. matches_filter() returns false then we don't care if the counters
were fetched. We do not report that tag anyway.

So, instead of passing fetched_counters to matches_filter() we could do this:

bool filter_by_size = (params.filter.mask &
        (ALLOCINFO_FILTER_MASK_MIN_SIZE | ALLOCINFO_FILTER_MASK_MAX_SIZE)) != 0;

while (ct) {
        if (matches_filter(ct, &priv->filter, &counters)) {
        ...
}
if (ct) {
        allocinfo_to_params(ct, &params.data,
                            filter_by_size ? &counters : NULL);
        ...
}

Wouldn't that work?
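
To make the argument above concrete, here is a small standalone model of
the suggested flow. It is only a sketch: struct tag, struct filter,
read_counters(), first_match() and the FILTER_* masks are hypothetical
stand-ins for the kernel types and helpers, and fetch_count exists purely
to show that a matching tag is never read twice when a size filter is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ALLOCINFO_FILTER_MASK_* bits. */
#define FILTER_MIN_SIZE		(1u << 0)
#define FILTER_MAX_SIZE		(1u << 1)
#define FILTER_SIZE_MASK	(FILTER_MIN_SIZE | FILTER_MAX_SIZE)

struct counters { uint64_t bytes, calls; };
struct tag { uint64_t bytes, calls; };	/* stand-in for struct codetag */
struct filter { uint32_t mask; uint64_t min_size, max_size; };

static int fetch_count;	/* number of (expensive) counter reads */

static struct counters read_counters(const struct tag *t)
{
	fetch_count++;
	return (struct counters){ .bytes = t->bytes, .calls = t->calls };
}

/* No fetched_counters flag: counters are read iff a size bit is set. */
static bool matches_filter(const struct tag *t, const struct filter *f,
			   struct counters *c)
{
	if (!f || !f->mask)
		return true;

	if (f->mask & FILTER_SIZE_MASK) {
		*c = read_counters(t);
		if ((f->mask & FILTER_MIN_SIZE) && c->bytes < f->min_size)
			return false;
		if ((f->mask & FILTER_MAX_SIZE) && c->bytes > f->max_size)
			return false;
	}
	return true;
}

/* Report the first matching tag, reusing counters read by the filter. */
static bool first_match(const struct tag *tags, size_t n,
			const struct filter *f, struct counters *out)
{
	bool filter_by_size = f && (f->mask & FILTER_SIZE_MASK) != 0;
	struct counters c;

	assert(out != NULL);

	for (size_t i = 0; i < n; i++) {
		if (!matches_filter(&tags[i], f, &c))
			continue;
		/* Case 2.1: a match with a size filter implies c is valid. */
		*out = filter_by_size ? c : read_counters(&tags[i]);
		return true;
	}
	return false;
}
```

In this model, a rejected tag (case 2.2) may leave stale data in the
local counters, but that data is never reported, so the caller only needs
to know whether a size bit was set in the mask.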

>                         if (skip_count == 0)
>                                 break;
>                         skip_count--;
> @@ -317,7 +351,7 @@ static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
>         }
>
>         if (ct) {
> -               allocinfo_to_params(ct, &params.data);
> +               allocinfo_to_params(ct, &params.data, fetched_counters ? &counters : NULL);
>                 priv->positioned = true;
>         }
>
> @@ -343,6 +377,8 @@ static int allocinfo_ioctl_get_next(struct seq_file *m, void __user *arg)
>         struct codetag *ct;
>         struct allocinfo_tag_data params;
>         int ret = 0;
> +       struct alloc_tag_counters counters;
> +       bool fetched_counters;
>
>         memset(&params, 0, sizeof(params));
>         priv = m->private;
> @@ -356,10 +392,15 @@ static int allocinfo_ioctl_get_next(struct seq_file *m, void __user *arg)
>         }
>
>         ct = codetag_next_ct(&priv->ioctl_iter);
> -       while (ct && !matches_filter(ct, &priv->filter))
> +       while (ct) {
> +               fetched_counters = false;
> +               if (matches_filter(ct, &priv->filter, &counters, &fetched_counters))
> +                       break;
>                 ct = codetag_next_ct(&priv->ioctl_iter);
> +       }
> +
>         if (ct)
> -               allocinfo_to_params(ct, &params);
> +               allocinfo_to_params(ct, &params, fetched_counters ? &counters : NULL);
>
>         if (!ct) {
>                 priv->positioned = false;
> --
> 2.54.0.1136.gdb2ca164c4-goog
>


* Re: [PATCH v3 01/12] x86/resctrl: Support Privilege-Level Zero Association (PLZA)
From: Babu Moger @ 2026-06-17 16:28 UTC (permalink / raw)
  To: Reinette Chatre, Moger, Babu, corbet, tony.luck, Dave.Martin,
	james.morse, tglx, bp, dave.hansen
  Cc: skhan, x86, mingo, hpa, akpm, rdunlap, pawan.kumar.gupta,
	feng.tang, dapeng1.mi, kees, elver, lirongqing, paulmck, bhelgaas,
	seanjc, alexandre.chartre, yazen.ghannam, peterz, chang.seok.bae,
	kim.phillips, xin, naveen, thomas.lendacky, linux-doc,
	linux-kernel, eranian, peternewman
In-Reply-To: <353185bd-2b3e-484e-bf4c-e774c70ea63c@intel.com>

Hi Reinette,


On 6/16/26 19:00, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/12/26 9:56 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 6/11/2026 6:23 PM, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 4/30/26 4:24 PM, Babu Moger wrote:
>>>> Customers have identified an issue while using the QoS resource Control
>>>
>>> "Control" -> "control"?
>>>
>>
>> ack
>>
>>>> feature. If a memory bandwidth associated with a CLOSID is aggressively
>>>
>>> "a memory bandwidth" -> "memory bandwidth"?
>>
>> ack.
>>
>>>
>>>> throttled, and it moves into Kernel mode, the Kernel operations are also
>>>
>>> What does "it" refer to here? From text it seems to be the "CLOSID" but that
>>> does not sound right? Should "it" instead be something like "a task with that
>>> CLOSID"?
>>
>> sure.
>>
>>>
>>> "Kernel" -> "kernel"?
>>
>> ack.
>>>
>>>> aggressively throttled. This can stall forward progress and eventually
>>>> degrade overall system performance. AMD hardware supports a feature
>>>> Privilege-Level Zero Association (PLZA) to change the association of the
>>>> thread as soon as it begins executing.
>>>
>>> "change the association of the thread as soon as it begins executing." I am
>>> not able to parse this.
>>
>> How about ?
>>
>> Customers have identified an issue while using the QoS resource Control
>> feature. If memory bandwidth associated with a CLOSID is aggressively
>> throttled, and a task with that CLOSID moves into kernel mode, the
>> kernel operations are also aggressively throttled. This can stall
>> forward progress and eventually degrade overall system performance.
>> AMD hardware supports a feature Privilege-Level Zero Association (PLZA)
>> to change the CPU association at the user-to-kernel transition, so the
>> kernel execution can use a different association than user mode.
> 
> "change the CPU association at the user-to-kernel transition" -> What is this
> trying to describe? CPU association of what?
> 
> "a different association"? What does this mean?
> 

Will change it to:

AMD hardware supports a feature, Privilege-Level Zero Association (PLZA),
which allows the CPU’s CLOSID association to be changed during the
transition from user mode to kernel mode. This enables the kernel to
operate with a different CLOSID than user mode.


>>
>> Privilege-Level Zero Association (PLZA) allows the user to specify a
>> CLOSID and/or RMID associated with execution in Privilege-Level
>> Zero. When enabled on a CPU, as the CPU enters Privilege-Level Zero,
>> allocation and monitoring for that CPU will be associated with the
>> PLZA CLOSID and/or RMID. Otherwise, the CPU will be associated with
>> the CLOSID and RMID given by PQR_ASSOC.
> 
> 
> Sounds like this is vague because MSR_IA32_PQR_PLZA_ASSOC has not been
> introduced yet. Could it help to introduce MSR_IA32_PQR_PLZA_ASSOC as
> part of this patch and then the changelog can be specific about PLZA
> feature introducing this new MSR and how it complements MSR_IA32_PQR_ASSOC?

It's probably better to remove the second paragraph. This text can go
with the patch which introduces MSR_IA32_PQR_PLZA_ASSOC.

After splitting the patch, this one will only have the cpufeatures changes.

> 
> ...
> 
>>>>    Documentation/admin-guide/kernel-parameters.txt | 2 +-
>>>>    arch/x86/include/asm/cpufeatures.h              | 1 +
>>>>    arch/x86/kernel/cpu/resctrl/core.c              | 2 ++
>>>>    arch/x86/kernel/cpu/scattered.c                 | 1 +
>>>
>>> Please split changes to other subsystems and make these changes
>>> obvious with their own subject prefix to avoid sneaking changes into
>>> other subsystems via resctrl.
>>>
>>
>> Ok. Will be two patches.
>> 1. For Documentation/admin-guide/kernel-parameters.txt
>> 2.  arch/x86/include/asm/cpufeatures.h
>>      arch/x86/kernel/cpu/resctrl/core.c
>>      arch/x86/kernel/cpu/scattered.c
> 
> The resctrl changes found in (2) would be documented in (1)? That does not
> look right. Why not just split the resctrl changes from the cpufeatures changes?
> This would be similar to how you did ABMC enabling.
> 

Sounds good.

Thanks
Babu

