Linux Power Management development
* [PATCH v12 06/16] mm: introduce AS_NO_DIRECT_MAP
From: Kalyazin, Nikita @ 2026-04-10 15:18 UTC
  To: kvm@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	bpf@vger.kernel.org, linux-kselftest@vger.kernel.org,
	kernel@xen0n.name, linux-riscv@lists.infradead.org,
	linux-s390@vger.kernel.org, loongarch@lists.linux.dev,
	linux-pm@vger.kernel.org
  Cc: pbonzini@redhat.com, corbet@lwn.net, maz@kernel.org,
	oupton@kernel.org, joey.gouly@arm.com, suzuki.poulose@arm.com,
	yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org,
	seanjc@google.com, tglx@kernel.org, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	hpa@zytor.com, luto@kernel.org, peterz@infradead.org,
	willy@infradead.org, akpm@linux-foundation.org, david@kernel.org,
	lorenzo.stoakes@oracle.com, vbabka@kernel.org, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, ast@kernel.org,
	daniel@iogearbox.net, andrii@kernel.org, martin.lau@linux.dev,
	eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, jgg@ziepe.ca,
	jhubbard@nvidia.com, peterx@redhat.com, jannh@google.com,
	pfalcato@suse.de, skhan@linuxfoundation.org, riel@surriel.com,
	ryan.roberts@arm.com, jgross@suse.com, yu-cheng.yu@intel.com,
	kas@kernel.org, coxu@redhat.com, ackerleytng@google.com,
	yosry@kernel.org, ajones@ventanamicro.com, maobibo@loongson.cn,
	tabba@google.com, prsampat@amd.com, wu.fei9@sanechips.com.cn,
	mlevitsk@redhat.com, jmattson@google.com, jthoughton@google.com,
	agordeev@linux.ibm.com, alex@ghiti.fr, aou@eecs.berkeley.edu,
	borntraeger@linux.ibm.com, chenhuacai@kernel.org,
	baolu.lu@linux.intel.com, dev.jain@arm.com, gor@linux.ibm.com,
	hca@linux.ibm.com, palmer@dabbelt.com, pjw@kernel.org,
	shijie@os.amperecomputing.com, svens@linux.ibm.com,
	thuth@redhat.com, yang@os.amperecomputing.com,
	Liam.Howlett@oracle.com, urezki@gmail.com,
	zhengqi.arch@bytedance.com, gerald.schaefer@linux.ibm.com,
	jiayuan.chen@shopee.com, lenb@kernel.org, pavel@kernel.org,
	rafael@kernel.org, yangyicong@hisilicon.com,
	vannapurve@google.com, jackmanb@google.com, patrick.roy@linux.dev,
	Thomson, Jack, Itazuri, Takahiro, Manwaring, Derek,
	Kalyazin, Nikita, Vlastimil Babka
In-Reply-To: <20260410151746.61150-1-kalyazin@amazon.com>

From: Patrick Roy <patrick.roy@linux.dev>

Add AS_NO_DIRECT_MAP for mappings where direct map entries of folios are
set to not present. Currently, mappings that match this description are
secretmem mappings (memfd_secret()). Later, some guest_memfd
configurations will also fall into this category.

Reject this new type of mapping in all locations that currently reject
secretmem mappings, on the assumption that wherever secretmem mappings
are rejected, it is precisely because the code cannot deal with folios
that lack direct map entries. Then make memfd_secret() set
AS_NO_DIRECT_MAP on its address_space, so that the special
vma_is_secretmem()/secretmem_mapping() checks can be dropped.

Use a new flag instead of overloading AS_INACCESSIBLE (which is already
set by guest_memfd) because not all guest_memfd mappings will end up
with their direct map entries removed (e.g. in pKVM setups, the parts of
guest_memfd that can be mapped to userspace should also be GUP-able, and
generally should not have restrictions on who can access them).

Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Signed-off-by: Patrick Roy <patrick.roy@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Nikita Kalyazin <nikita.kalyazin@linux.dev>
---
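Editor's sketch, not part of the patch: a userspace model of the three
helpers this patch adds to pagemap.h.  The struct layouts are
stripped-down stand-ins, and a plain bit-or replaces the kernel's atomic
set_bit(); only the bit number (12) and the helper names are taken from
the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit number taken from the diff; everything else is a userspace stand-in. */
enum { AS_NO_DIRECT_MAP = 12 };

struct address_space { unsigned long flags; };
struct file { struct address_space *f_mapping; };
struct vm_area_struct { struct file *vm_file; };

/* Non-atomic stand-in for the kernel's set_bit(). */
static void mapping_set_no_direct_map(struct address_space *mapping)
{
	assert(mapping != NULL);	/* model-only sanity check */
	mapping->flags |= 1UL << AS_NO_DIRECT_MAP;
}

static bool mapping_no_direct_map(const struct address_space *mapping)
{
	return mapping->flags & (1UL << AS_NO_DIRECT_MAP);
}

/* A VMA without a backing file (vm_file == NULL) is never flagged. */
static bool vma_has_no_direct_map(const struct vm_area_struct *vma)
{
	return vma->vm_file && mapping_no_direct_map(vma->vm_file->f_mapping);
}
```

The point of the model is that callers such as check_vma_flags() no
longer need to know which subsystem owns the mapping; they only test one
bit on the address_space.
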
 include/linux/pagemap.h   | 16 ++++++++++++++++
 include/linux/secretmem.h | 18 ------------------
 lib/buildid.c             |  8 ++++++--
 mm/gup.c                  |  9 ++++-----
 mm/mlock.c                |  2 +-
 mm/secretmem.c            |  8 ++------
 6 files changed, 29 insertions(+), 32 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ec442af3f886..68c075502d91 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -211,6 +211,7 @@ enum mapping_flags {
 	AS_KERNEL_FILE = 10,	/* mapping for a fake kernel file that shouldn't
 				   account usage to user cgroups */
 	AS_NO_DATA_INTEGRITY = 11, /* no data integrity guarantees */
+	AS_NO_DIRECT_MAP = 12,	/* Folios in the mapping are not in the direct map */
 	/* Bits 16-25 are used for FOLIO_ORDER */
 	AS_FOLIO_ORDER_BITS = 5,
 	AS_FOLIO_ORDER_MIN = 16,
@@ -356,6 +357,21 @@ static inline bool mapping_no_data_integrity(const struct address_space *mapping
 	return test_bit(AS_NO_DATA_INTEGRITY, &mapping->flags);
 }
 
+static inline void mapping_set_no_direct_map(struct address_space *mapping)
+{
+	set_bit(AS_NO_DIRECT_MAP, &mapping->flags);
+}
+
+static inline bool mapping_no_direct_map(const struct address_space *mapping)
+{
+	return test_bit(AS_NO_DIRECT_MAP, &mapping->flags);
+}
+
+static inline bool vma_has_no_direct_map(const struct vm_area_struct *vma)
+{
+	return vma->vm_file && mapping_no_direct_map(vma->vm_file->f_mapping);
+}
+
 static inline gfp_t mapping_gfp_mask(const struct address_space *mapping)
 {
 	return mapping->gfp_mask;
diff --git a/include/linux/secretmem.h b/include/linux/secretmem.h
index e918f96881f5..0ae1fb057b3d 100644
--- a/include/linux/secretmem.h
+++ b/include/linux/secretmem.h
@@ -4,28 +4,10 @@
 
 #ifdef CONFIG_SECRETMEM
 
-extern const struct address_space_operations secretmem_aops;
-
-static inline bool secretmem_mapping(struct address_space *mapping)
-{
-	return mapping->a_ops == &secretmem_aops;
-}
-
-bool vma_is_secretmem(struct vm_area_struct *vma);
 bool secretmem_active(void);
 
 #else
 
-static inline bool vma_is_secretmem(struct vm_area_struct *vma)
-{
-	return false;
-}
-
-static inline bool secretmem_mapping(struct address_space *mapping)
-{
-	return false;
-}
-
 static inline bool secretmem_active(void)
 {
 	return false;
diff --git a/lib/buildid.c b/lib/buildid.c
index c4b737640621..ba79bf28f7e6 100644
--- a/lib/buildid.c
+++ b/lib/buildid.c
@@ -47,6 +47,10 @@ static int freader_get_folio(struct freader *r, loff_t file_off)
 
 	freader_put_folio(r);
 
+	/* reject folios without direct map entries (e.g. from memfd_secret() or guest_memfd()) */
+	if (mapping_no_direct_map(r->file->f_mapping))
+		return -EFAULT;
+
 	/* only use page cache lookup - fail if not already cached */
 	r->folio = filemap_get_folio(r->file->f_mapping, file_off >> PAGE_SHIFT);
 
@@ -87,8 +91,8 @@ const void *freader_fetch(struct freader *r, loff_t file_off, size_t sz)
 		return r->data + file_off;
 	}
 
-	/* reject secretmem folios created with memfd_secret() */
-	if (secretmem_mapping(r->file->f_mapping)) {
+	/* reject folios without direct map entries (e.g. from memfd_secret() or guest_memfd()) */
+	if (mapping_no_direct_map(r->file->f_mapping)) {
 		r->err = -EFAULT;
 		return NULL;
 	}
diff --git a/mm/gup.c b/mm/gup.c
index 41eb64783e03..c1b4fb1eaee7 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -11,7 +11,6 @@
 #include <linux/rmap.h>
 #include <linux/swap.h>
 #include <linux/swapops.h>
-#include <linux/secretmem.h>
 
 #include <linux/sched/signal.h>
 #include <linux/rwsem.h>
@@ -1216,7 +1215,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
 	if ((gup_flags & FOLL_SPLIT_PMD) && is_vm_hugetlb_page(vma))
 		return -EOPNOTSUPP;
 
-	if (vma_is_secretmem(vma))
+	if (vma_has_no_direct_map(vma))
 		return -EFAULT;
 
 	if (write) {
@@ -2724,7 +2723,7 @@ EXPORT_SYMBOL(get_user_pages_unlocked);
  * This call assumes the caller has pinned the folio, that the lowest page table
  * level still points to this folio, and that interrupts have been disabled.
  *
- * GUP-fast must reject all secretmem folios.
+ * GUP-fast must reject all folios without direct map entries (such as secretmem).
  *
  * Writing to pinned file-backed dirty tracked folios is inherently problematic
  * (see comment describing the writable_file_mapping_allowed() function). We
@@ -2744,7 +2743,7 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags)
 	if (WARN_ON_ONCE(folio_test_slab(folio)))
 		return false;
 
-	/* hugetlb neither requires dirty-tracking nor can be secretmem. */
+	/* hugetlb neither requires dirty-tracking nor can be without direct map. */
 	if (folio_test_hugetlb(folio))
 		return true;
 
@@ -2786,7 +2785,7 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags)
 	 * At this point, we know the mapping is non-null and points to an
 	 * address_space object.
 	 */
-	if (secretmem_mapping(mapping))
+	if (mapping_no_direct_map(mapping))
 		return false;
 
 	/*
diff --git a/mm/mlock.c b/mm/mlock.c
index 2f699c3497a5..a6f4b3df4f3f 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -474,7 +474,7 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
 
 	if (newflags == oldflags || (oldflags & VM_SPECIAL) ||
 	    is_vm_hugetlb_page(vma) || vma == get_gate_vma(current->mm) ||
-	    vma_is_dax(vma) || vma_is_secretmem(vma) || (oldflags & VM_DROPPABLE))
+	    vma_is_dax(vma) || vma_has_no_direct_map(vma) || (oldflags & VM_DROPPABLE))
 		/* don't set VM_LOCKED or VM_LOCKONFAULT and don't count */
 		goto out;
 
diff --git a/mm/secretmem.c b/mm/secretmem.c
index 27b176af8fc4..d32e1be1eb35 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -129,11 +129,6 @@ static int secretmem_mmap_prepare(struct vm_area_desc *desc)
 	return 0;
 }
 
-bool vma_is_secretmem(struct vm_area_struct *vma)
-{
-	return vma->vm_ops == &secretmem_vm_ops;
-}
-
 static const struct file_operations secretmem_fops = {
 	.release	= secretmem_release,
 	.mmap_prepare	= secretmem_mmap_prepare,
@@ -151,7 +146,7 @@ static void secretmem_free_folio(struct folio *folio)
 	folio_zero_segment(folio, 0, folio_size(folio));
 }
 
-const struct address_space_operations secretmem_aops = {
+static const struct address_space_operations secretmem_aops = {
 	.dirty_folio	= noop_dirty_folio,
 	.free_folio	= secretmem_free_folio,
 	.migrate_folio	= secretmem_migrate_folio,
@@ -200,6 +195,7 @@ static struct file *secretmem_file_create(unsigned long flags)
 
 	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
 	mapping_set_unevictable(inode->i_mapping);
+	mapping_set_no_direct_map(inode->i_mapping);
 
 	inode->i_op = &secretmem_iops;
 	inode->i_mapping->a_ops = &secretmem_aops;
-- 
2.50.1



* [PATCH v12 05/16] mm/gup: drop local variable in gup_fast_folio_allowed
From: Kalyazin, Nikita @ 2026-04-10 15:18 UTC
In-Reply-To: <20260410151746.61150-1-kalyazin@amazon.com>

From: Nikita Kalyazin <nikita.kalyazin@linux.dev>

Drop the reject_file_backed local variable by moving the check for
pinning to the point where its result is used.  No functional change.

Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Nikita Kalyazin <nikita.kalyazin@linux.dev>
---
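Editor's sketch, not part of the patch: a userspace check that the
rewritten tail of gup_fast_folio_allowed() agrees with the old one on
every input.  The FOLL_* values below are illustrative placeholders (the
kernel's real constants differ), and the shmem_mapping() result is
modelled as a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real FOLL_* constants differ. */
#define FOLL_WRITE	0x1u
#define FOLL_LONGTERM	0x2u
#define FOLL_PIN	0x4u
#define FOLL_WLP	(FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE)

/* Tail of the function before the patch. */
static bool allowed_old(unsigned int flags, bool is_shmem)
{
	bool reject_file_backed = false;

	if ((flags & FOLL_WLP) == FOLL_WLP)
		reject_file_backed = true;

	return !reject_file_backed || is_shmem;
}

/* Tail after the patch: the flag test sits next to its only user. */
static bool allowed_new(unsigned int flags, bool is_shmem)
{
	if ((flags & FOLL_WLP) != FOLL_WLP)
		return true;

	return is_shmem;
}

/* Number of (flags, is_shmem) combinations on which the versions differ. */
static int count_mismatches(void)
{
	int bad = 0;

	for (unsigned int flags = 0; flags < 8; flags++)
		for (int shmem = 0; shmem <= 1; shmem++)
			bad += allowed_old(flags, shmem) !=
			       allowed_new(flags, shmem);
	return bad;
}
```

count_mismatches() walks all 16 combinations, which is what "no
functional change" amounts to for this part of the function.
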
 mm/gup.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index e8367564d636..41eb64783e03 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2737,18 +2737,9 @@ EXPORT_SYMBOL(get_user_pages_unlocked);
  */
 static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags)
 {
-	bool reject_file_backed = false;
 	struct address_space *mapping;
 	unsigned long mapping_flags;
 
-	/*
-	 * If we aren't pinning then no problematic write can occur. A long term
-	 * pin is the most egregious case so this is the one we disallow.
-	 */
-	if ((flags & (FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE)) ==
-	    (FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE))
-		reject_file_backed = true;
-
 	/* We hold a folio reference, so we can safely access folio fields. */
 	if (WARN_ON_ONCE(folio_test_slab(folio)))
 		return false;
@@ -2797,8 +2788,18 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags)
 	 */
 	if (secretmem_mapping(mapping))
 		return false;
-	/* The only remaining allowed file system is shmem. */
-	return !reject_file_backed || shmem_mapping(mapping);
+
+	/*
+	 * If we aren't pinning then no problematic write can occur. A writable
+	 * long term pin is the most egregious case, so this is the one we
+	 * allow only for ...
+	 */
+	if ((flags & (FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE)) !=
+	    (FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE))
+		return true;
+
+	/* ... hugetlb (which we allowed above already) and shared memory. */
+	return shmem_mapping(mapping);
 }
 
 #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
-- 
2.50.1



* [PATCH v12 04/16] mm/gup: drop secretmem optimization from gup_fast_folio_allowed
From: Kalyazin, Nikita @ 2026-04-10 15:18 UTC
In-Reply-To: <20260410151746.61150-1-kalyazin@amazon.com>

From: Patrick Roy <patrick.roy@linux.dev>

Drop an optimization in gup_fast_folio_allowed() where
secretmem_mapping() was only called if CONFIG_SECRETMEM=y. secretmem has
been enabled by default since commit b758fe6df50d ("mm/secretmem: make
it on by default"), so in most configurations the secretmem check was
not being elided anyway.

To make sure the fast path for ZONE_DEVICE pages (like Device DAX and
PCI P2PDMA) is still allowed, check for folio_is_zone_device() if
mapping is NULL.

This is in preparation for generalizing the handling of mappings where
direct map entries of folios are set to not present.  Currently,
mappings that match this description are secretmem mappings
(memfd_secret()).  Later, some guest_memfd configurations will also fall
into this category.

Signed-off-by: Patrick Roy <patrick.roy@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Signed-off-by: Nikita Kalyazin <nikita.kalyazin@linux.dev>
---
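Editor's sketch, not part of the patch: a heavily reduced userspace
model of why the NULL-mapping case needs the folio_is_zone_device()
check once the early exit is gone.  The slab, hugetlb, anonymous and
shmem checks are deliberately omitted, struct model_folio is
hypothetical, and "wlp" stands for FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE
all being set.

```c
#include <assert.h>
#include <stdbool.h>

struct model_folio {
	bool large;		/* order > 0 */
	bool has_mapping;	/* false models folio->mapping == NULL */
	bool zone_device;
	bool secretmem;
};

/* Reduced model of the logic before the patch. */
static bool allowed_old(const struct model_folio *f, bool wlp,
			bool config_secretmem)
{
	bool check_secretmem = config_secretmem && !f->large;

	if (!wlp && !check_secretmem)
		return true;		/* the early exit this patch removes */
	if (!f->has_mapping)
		return false;
	if (check_secretmem && f->secretmem)
		return false;
	return !wlp;			/* shmem exception omitted */
}

/* Reduced model of the logic after the patch. */
static bool allowed_new(const struct model_folio *f, bool wlp)
{
	if (!f->has_mapping)
		return f->zone_device;	/* keeps ZONE_DEVICE on the fast path */
	if (f->secretmem)
		return false;
	return !wlp;			/* shmem exception omitted */
}
```

In this model a large, mapping-less ZONE_DEVICE folio used to leave via
the early exit; without that exit it reaches the NULL-mapping test, so
that test has to let it through explicitly.
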
 mm/gup.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 8e7dc2c6ee73..e8367564d636 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2739,7 +2739,6 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags)
 {
 	bool reject_file_backed = false;
 	struct address_space *mapping;
-	bool check_secretmem = false;
 	unsigned long mapping_flags;
 
 	/*
@@ -2751,14 +2750,6 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags)
 		reject_file_backed = true;
 
 	/* We hold a folio reference, so we can safely access folio fields. */
-
-	/* secretmem folios are always order-0 folios. */
-	if (IS_ENABLED(CONFIG_SECRETMEM) && !folio_test_large(folio))
-		check_secretmem = true;
-
-	if (!reject_file_backed && !check_secretmem)
-		return true;
-
 	if (WARN_ON_ONCE(folio_test_slab(folio)))
 		return false;
 
@@ -2787,9 +2778,13 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags)
 	 * The mapping may have been truncated, in any case we cannot determine
 	 * if this mapping is safe - fall back to slow path to determine how to
 	 * proceed.
+	 *
+	 * ZONE_DEVICE folios (e.g. Device DAX, PCI P2PDMA) may legitimately
+	 * have a NULL mapping. They are never secretmem/no-direct-map folios,
+	 * so let them through.
 	 */
 	if (!mapping)
-		return false;
+		return folio_is_zone_device(folio);
 
 	/* Anonymous folios pose no problem. */
 	mapping_flags = (unsigned long)mapping & FOLIO_MAPPING_FLAGS;
@@ -2800,7 +2795,7 @@ static bool gup_fast_folio_allowed(struct folio *folio, unsigned int flags)
 	 * At this point, we know the mapping is non-null and points to an
 	 * address_space object.
 	 */
-	if (check_secretmem && secretmem_mapping(mapping))
+	if (secretmem_mapping(mapping))
 		return false;
 	/* The only remaining allowed file system is shmem. */
 	return !reject_file_backed || shmem_mapping(mapping);
-- 
2.50.1



* [PATCH v12 03/16] mm/secretmem: make use of folio_{zap,restore}_direct_map
From: Kalyazin, Nikita @ 2026-04-10 15:18 UTC
In-Reply-To: <20260410151746.61150-1-kalyazin@amazon.com>

From: Nikita Kalyazin <nikita.kalyazin@linux.dev>

Replace the set_direct_map_*_noflush() calls with the newly available
folio_zap_direct_map() and folio_restore_direct_map() helpers, which
derive the folio's address internally.  A side effect is that the TLB is
now flushed even if filemap_add_folio() fails.  That failure path is not
expected to be hot, so the extra flush should not matter.

Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Nikita Kalyazin <nikita.kalyazin@linux.dev>
---
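Editor's sketch, not part of the patch: a userspace model of the side
effect described above.  Counters stand in for the direct map state and
for flush_tlb_kernel_range(); the -EEXIST retry loop and all folio
handling are left out, and the function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static int tlb_flushes;			/* modelled TLB flush calls */
static bool in_direct_map = true;	/* is the modelled folio still mapped? */

static void model_reset(void)
{
	tlb_flushes = 0;
	in_direct_map = true;
}

/* Before the patch: flush only once page cache insertion has succeeded. */
static int fault_path_old(int add_folio_err)
{
	in_direct_map = false;		/* set_direct_map_invalid_noflush() */
	if (add_folio_err) {
		in_direct_map = true;	/* set_direct_map_default_noflush() */
		return add_folio_err;
	}
	tlb_flushes++;			/* flush_tlb_kernel_range() */
	return 0;
}

/* After the patch: the zap helper flushes before insertion is attempted. */
static int fault_path_new(int add_folio_err)
{
	in_direct_map = false;		/* folio_zap_direct_map() ... */
	tlb_flushes++;			/* ... which flushes internally */
	if (add_folio_err) {
		in_direct_map = true;	/* folio_restore_direct_map() */
		return add_folio_err;
	}
	return 0;
}
```

With add_folio_err set to -ENOMEM, the old path records no flush while
the new one records one; on success both record exactly one.
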
 mm/secretmem.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/mm/secretmem.c b/mm/secretmem.c
index fd29b33c6764..27b176af8fc4 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -53,7 +53,6 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf)
 	struct inode *inode = file_inode(vmf->vma->vm_file);
 	pgoff_t offset = vmf->pgoff;
 	gfp_t gfp = vmf->gfp_mask;
-	unsigned long addr;
 	struct folio *folio;
 	vm_fault_t ret;
 	int err;
@@ -72,7 +71,7 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf)
 			goto out;
 		}
 
-		err = set_direct_map_invalid_noflush(folio_address(folio));
+		err = folio_zap_direct_map(folio);
 		if (err) {
 			folio_put(folio);
 			ret = vmf_error(err);
@@ -87,7 +86,7 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf)
 			 * already happened when we marked the page invalid
 			 * which guarantees that this call won't fail
 			 */
-			set_direct_map_default_noflush(folio_address(folio));
+			folio_restore_direct_map(folio);
 			folio_put(folio);
 			if (err == -EEXIST)
 				goto retry;
@@ -95,9 +94,6 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf)
 			ret = vmf_error(err);
 			goto out;
 		}
-
-		addr = (unsigned long)folio_address(folio);
-		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
 	}
 
 	vmf->page = folio_file_page(folio, vmf->pgoff);
-- 
2.50.1



* [PATCH v12 02/16] set_memory: add folio_{zap,restore}_direct_map helpers
From: Kalyazin, Nikita @ 2026-04-10 15:18 UTC
In-Reply-To: <20260410151746.61150-1-kalyazin@amazon.com>

From: Nikita Kalyazin <nikita.kalyazin@linux.dev>

Let's provide folio_{zap,restore}_direct_map() helpers as preparation
for supporting removal of the direct map for guest_memfd folios.
folio_zap_direct_map() flushes the TLB to make sure the data is no
longer accessible through the direct map.  On some architectures this
may result in a double TLB flush, because
set_direct_map_valid_noflush() already performs a flush internally.

The new helpers need to be accessible to KVM on architectures that
support guest_memfd (x86 and arm64).

Direct map removal gives guest_memfd the same protection that
memfd_secret does, such as hardening against Spectre-like attacks
through in-kernel gadgets.

Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Nikita Kalyazin <nikita.kalyazin@linux.dev>
---
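Editor's sketch, not part of the patch: a userspace model of the control
flow in folio_zap_direct_map() as it appears in the diff below.  Every
model_* name, struct model_folio and the 4 KiB page size are assumptions
made for illustration; the real helper works on struct folio and calls
the architecture's set_direct_map_valid_noflush().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for the model */

struct model_folio {
	unsigned long addr;	/* modelled direct map address */
	unsigned int order;	/* 0 for a single page */
};

static unsigned long last_flush_start, last_flush_end;
static int flush_count;
static int next_set_result;	/* what the modelled arch hook returns */

static int model_set_direct_map_valid_noflush(unsigned long addr,
					      unsigned long nr, bool valid)
{
	(void)addr; (void)nr; (void)valid;
	return next_set_result;
}

static void model_flush_tlb_kernel_range(unsigned long start,
					 unsigned long end)
{
	last_flush_start = start;
	last_flush_end = end;
	flush_count++;
}

static int model_folio_zap_direct_map(const struct model_folio *folio)
{
	unsigned long nr = 1UL << folio->order;
	int ret;

	if (folio->order > 0)	/* only order-0 folios are supported so far */
		return -EINVAL;

	ret = model_set_direct_map_valid_noflush(folio->addr, nr, false);
	/* The flush is issued whether or not the update succeeded. */
	model_flush_tlb_kernel_range(folio->addr,
				     folio->addr + nr * MODEL_PAGE_SIZE);
	return ret;
}
```

Two properties are worth noting in the model: a large folio is rejected
before any flush happens, and a failing direct map update still results
in a flush before the error is returned.
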
 include/linux/set_memory.h | 13 +++++++++++
 mm/memory.c                | 45 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index 1a2563f525fc..24caea2931f9 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -41,6 +41,15 @@ static inline int set_direct_map_valid_noflush(const void *addr,
 	return 0;
 }
 
+static inline int folio_zap_direct_map(struct folio *folio)
+{
+	return 0;
+}
+
+static inline void folio_restore_direct_map(struct folio *folio)
+{
+}
+
 static inline bool kernel_page_present(struct page *page)
 {
 	return true;
@@ -57,6 +66,10 @@ static inline bool can_set_direct_map(void)
 }
 #define can_set_direct_map can_set_direct_map
 #endif
+
+int folio_zap_direct_map(struct folio *folio);
+void folio_restore_direct_map(struct folio *folio);
+
 #endif /* CONFIG_ARCH_HAS_SET_DIRECT_MAP */
 
 #ifdef CONFIG_X86_64
diff --git a/mm/memory.c b/mm/memory.c
index 2f815a34d924..3b9ada2cc19c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -78,6 +78,7 @@
 #include <linux/sched/sysctl.h>
 #include <linux/pgalloc.h>
 #include <linux/uaccess.h>
+#include <linux/set_memory.h>
 
 #include <trace/events/kmem.h>
 
@@ -7479,3 +7480,47 @@ void vma_pgtable_walk_end(struct vm_area_struct *vma)
 	if (is_vm_hugetlb_page(vma))
 		hugetlb_vma_unlock_read(vma);
 }
+
+#ifdef CONFIG_ARCH_HAS_SET_DIRECT_MAP
+/**
+ * folio_zap_direct_map - remove a folio from the kernel direct map
+ * @folio: folio to remove from the direct map
+ *
+ * Removes the folio from the kernel direct map and flushes the TLB.  This may
+ * require splitting huge pages in the direct map, which can fail due to memory
+ * allocation.  So far, only order-0 folios are supported.
+ *
+ * Return: 0 on success, or a negative error code on failure.
+ */
+int folio_zap_direct_map(struct folio *folio)
+{
+	const void *addr = folio_address(folio);
+	int ret;
+
+	if (folio_test_large(folio))
+		return -EINVAL;
+
+	ret = set_direct_map_valid_noflush(addr, folio_nr_pages(folio), false);
+	flush_tlb_kernel_range((unsigned long)addr,
+			       (unsigned long)addr + folio_size(folio));
+
+	return ret;
+}
+EXPORT_SYMBOL_FOR_MODULES(folio_zap_direct_map, "kvm");
+
+/**
+ * folio_restore_direct_map - restore the kernel direct map entry for a folio
+ * @folio: folio whose direct map entry is to be restored
+ *
+ * This may only be called after a prior successful folio_zap_direct_map() on
+ * the same folio.  Because the zap will have already split any huge pages in
+ * the direct map, restoration here only updates protection bits and cannot
+ * fail.
+ */
+void folio_restore_direct_map(struct folio *folio)
+{
+	WARN_ON_ONCE(set_direct_map_valid_noflush(folio_address(folio),
+						  folio_nr_pages(folio), true));
+}
+EXPORT_SYMBOL_FOR_MODULES(folio_restore_direct_map, "kvm");
+#endif /* CONFIG_ARCH_HAS_SET_DIRECT_MAP */
-- 
2.50.1



* [PATCH v12 01/16] set_memory: set_direct_map_* to take address
From: Kalyazin, Nikita @ 2026-04-10 15:17 UTC
In-Reply-To: <20260410151746.61150-1-kalyazin@amazon.com>

From: Nikita Kalyazin <nikita.kalyazin@linux.dev>

Convert set_direct_map_*() to take an address instead of a page, in
preparation for adding helpers that operate on folios.  A folio can then
be converted directly to an address without going through a page first,
which is more efficient.

Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Nikita Kalyazin <nikita.kalyazin@linux.dev>
---
 arch/arm64/include/asm/set_memory.h     |  7 ++++---
 arch/arm64/mm/pageattr.c                | 19 +++++++++--------
 arch/loongarch/include/asm/set_memory.h |  7 ++++---
 arch/loongarch/mm/pageattr.c            | 25 ++++++++++-------------
 arch/riscv/include/asm/set_memory.h     |  7 ++++---
 arch/riscv/mm/pageattr.c                | 17 ++++++++--------
 arch/s390/include/asm/set_memory.h      |  7 ++++---
 arch/s390/mm/pageattr.c                 | 13 ++++++------
 arch/x86/include/asm/set_memory.h       |  7 ++++---
 arch/x86/mm/pat/set_memory.c            | 27 +++++++++++++------------
 include/linux/set_memory.h              |  9 +++++----
 kernel/power/snapshot.c                 |  4 ++--
 mm/execmem.c                            |  6 ++++--
 mm/secretmem.c                          |  6 +++---
 mm/vmalloc.c                            | 11 ++++++----
 15 files changed, 91 insertions(+), 81 deletions(-)

diff --git a/arch/arm64/include/asm/set_memory.h b/arch/arm64/include/asm/set_memory.h
index 90f61b17275e..c71a2a6812c4 100644
--- a/arch/arm64/include/asm/set_memory.h
+++ b/arch/arm64/include/asm/set_memory.h
@@ -11,9 +11,10 @@ bool can_set_direct_map(void);
 
 int set_memory_valid(unsigned long addr, int numpages, int enable);
 
-int set_direct_map_invalid_noflush(struct page *page);
-int set_direct_map_default_noflush(struct page *page);
-int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid);
+int set_direct_map_invalid_noflush(const void *addr);
+int set_direct_map_default_noflush(const void *addr);
+int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
+				 bool valid);
 bool kernel_page_present(struct page *page);
 
 int set_memory_encrypted(unsigned long addr, int numpages);
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 358d1dc9a576..5aff94e1f8b2 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -245,7 +245,7 @@ int set_memory_valid(unsigned long addr, int numpages, int enable)
 					__pgprot(PTE_VALID));
 }
 
-int set_direct_map_invalid_noflush(struct page *page)
+int set_direct_map_invalid_noflush(const void *addr)
 {
 	pgprot_t clear_mask = __pgprot(PTE_VALID);
 	pgprot_t set_mask = __pgprot(0);
@@ -253,11 +253,11 @@ int set_direct_map_invalid_noflush(struct page *page)
 	if (!can_set_direct_map())
 		return 0;
 
-	return update_range_prot((unsigned long)page_address(page),
-				 PAGE_SIZE, set_mask, clear_mask);
+	return update_range_prot((unsigned long)addr, PAGE_SIZE, set_mask,
+				 clear_mask);
 }
 
-int set_direct_map_default_noflush(struct page *page)
+int set_direct_map_default_noflush(const void *addr)
 {
 	pgprot_t set_mask = __pgprot(PTE_VALID | PTE_WRITE);
 	pgprot_t clear_mask = __pgprot(PTE_RDONLY);
@@ -265,8 +265,8 @@ int set_direct_map_default_noflush(struct page *page)
 	if (!can_set_direct_map())
 		return 0;
 
-	return update_range_prot((unsigned long)page_address(page),
-				 PAGE_SIZE, set_mask, clear_mask);
+	return update_range_prot((unsigned long)addr, PAGE_SIZE, set_mask,
+				 clear_mask);
 }
 
 static int __set_memory_enc_dec(unsigned long addr,
@@ -349,14 +349,13 @@ int realm_register_memory_enc_ops(void)
 	return arm64_mem_crypt_ops_register(&realm_crypt_ops);
 }
 
-int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
+int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
+				 bool valid)
 {
-	unsigned long addr = (unsigned long)page_address(page);
-
 	if (!can_set_direct_map())
 		return 0;
 
-	return set_memory_valid(addr, nr, valid);
+	return set_memory_valid((unsigned long)addr, numpages, valid);
 }
 
 #ifdef CONFIG_DEBUG_PAGEALLOC
diff --git a/arch/loongarch/include/asm/set_memory.h b/arch/loongarch/include/asm/set_memory.h
index 55dfaefd02c8..5e9b67b2fea1 100644
--- a/arch/loongarch/include/asm/set_memory.h
+++ b/arch/loongarch/include/asm/set_memory.h
@@ -15,8 +15,9 @@ int set_memory_ro(unsigned long addr, int numpages);
 int set_memory_rw(unsigned long addr, int numpages);
 
 bool kernel_page_present(struct page *page);
-int set_direct_map_default_noflush(struct page *page);
-int set_direct_map_invalid_noflush(struct page *page);
-int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid);
+int set_direct_map_invalid_noflush(const void *addr);
+int set_direct_map_default_noflush(const void *addr);
+int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
+				 bool valid);
 
 #endif /* _ASM_LOONGARCH_SET_MEMORY_H */
diff --git a/arch/loongarch/mm/pageattr.c b/arch/loongarch/mm/pageattr.c
index f5e910b68229..9e08905d3624 100644
--- a/arch/loongarch/mm/pageattr.c
+++ b/arch/loongarch/mm/pageattr.c
@@ -198,32 +198,29 @@ bool kernel_page_present(struct page *page)
 	return pte_present(ptep_get(pte));
 }
 
-int set_direct_map_default_noflush(struct page *page)
+int set_direct_map_default_noflush(const void *addr)
 {
-	unsigned long addr = (unsigned long)page_address(page);
-
-	if (addr < vm_map_base)
+	if ((unsigned long)addr < vm_map_base)
 		return 0;
 
-	return __set_memory(addr, 1, PAGE_KERNEL, __pgprot(0));
+	return __set_memory((unsigned long)addr, 1, PAGE_KERNEL, __pgprot(0));
 }
 
-int set_direct_map_invalid_noflush(struct page *page)
+int set_direct_map_invalid_noflush(const void *addr)
 {
-	unsigned long addr = (unsigned long)page_address(page);
-
-	if (addr < vm_map_base)
+	if ((unsigned long)addr < vm_map_base)
 		return 0;
 
-	return __set_memory(addr, 1, __pgprot(0), __pgprot(_PAGE_PRESENT | _PAGE_VALID));
+	return __set_memory((unsigned long)addr, 1, __pgprot(0),
+			    __pgprot(_PAGE_PRESENT | _PAGE_VALID));
 }
 
-int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
+int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
+				 bool valid)
 {
-	unsigned long addr = (unsigned long)page_address(page);
 	pgprot_t set, clear;
 
-	if (addr < vm_map_base)
+	if ((unsigned long)addr < vm_map_base)
 		return 0;
 
 	if (valid) {
@@ -234,5 +231,5 @@ int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
 		clear = __pgprot(_PAGE_PRESENT | _PAGE_VALID);
 	}
 
-	return __set_memory(addr, 1, set, clear);
+	return __set_memory((unsigned long)addr, 1, set, clear);
 }
diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
index 87389e93325a..a87eabd7fc78 100644
--- a/arch/riscv/include/asm/set_memory.h
+++ b/arch/riscv/include/asm/set_memory.h
@@ -40,9 +40,10 @@ static inline int set_kernel_memory(char *startp, char *endp,
 }
 #endif
 
-int set_direct_map_invalid_noflush(struct page *page);
-int set_direct_map_default_noflush(struct page *page);
-int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid);
+int set_direct_map_invalid_noflush(const void *addr);
+int set_direct_map_default_noflush(const void *addr);
+int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
+				 bool valid);
 bool kernel_page_present(struct page *page);
 
 #endif /* __ASSEMBLER__ */
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 3f76db3d2769..0a457177a88c 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -374,19 +374,20 @@ int set_memory_nx(unsigned long addr, int numpages)
 	return __set_memory(addr, numpages, __pgprot(0), __pgprot(_PAGE_EXEC));
 }
 
-int set_direct_map_invalid_noflush(struct page *page)
+int set_direct_map_invalid_noflush(const void *addr)
 {
-	return __set_memory((unsigned long)page_address(page), 1,
-			    __pgprot(0), __pgprot(_PAGE_PRESENT));
+	return __set_memory((unsigned long)addr, 1, __pgprot(0),
+			    __pgprot(_PAGE_PRESENT));
 }
 
-int set_direct_map_default_noflush(struct page *page)
+int set_direct_map_default_noflush(const void *addr)
 {
-	return __set_memory((unsigned long)page_address(page), 1,
-			    PAGE_KERNEL, __pgprot(_PAGE_EXEC));
+	return __set_memory((unsigned long)addr, 1, PAGE_KERNEL,
+			    __pgprot(_PAGE_EXEC));
 }
 
-int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
+int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
+				 bool valid)
 {
 	pgprot_t set, clear;
 
@@ -398,7 +399,7 @@ int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
 		clear = __pgprot(_PAGE_PRESENT);
 	}
 
-	return __set_memory((unsigned long)page_address(page), nr, set, clear);
+	return __set_memory((unsigned long)addr, numpages, set, clear);
 }
 
 #ifdef CONFIG_DEBUG_PAGEALLOC
diff --git a/arch/s390/include/asm/set_memory.h b/arch/s390/include/asm/set_memory.h
index 94092f4ae764..3e43c3c96e67 100644
--- a/arch/s390/include/asm/set_memory.h
+++ b/arch/s390/include/asm/set_memory.h
@@ -60,9 +60,10 @@ __SET_MEMORY_FUNC(set_memory_rox, SET_MEMORY_RO | SET_MEMORY_X)
 __SET_MEMORY_FUNC(set_memory_rwnx, SET_MEMORY_RW | SET_MEMORY_NX)
 __SET_MEMORY_FUNC(set_memory_4k, SET_MEMORY_4K)
 
-int set_direct_map_invalid_noflush(struct page *page);
-int set_direct_map_default_noflush(struct page *page);
-int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid);
+int set_direct_map_invalid_noflush(const void *addr);
+int set_direct_map_default_noflush(const void *addr);
+int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
+				 bool valid);
 bool kernel_page_present(struct page *page);
 
 #endif
diff --git a/arch/s390/mm/pageattr.c b/arch/s390/mm/pageattr.c
index bb29c38ae624..8e90ff5cf50d 100644
--- a/arch/s390/mm/pageattr.c
+++ b/arch/s390/mm/pageattr.c
@@ -383,17 +383,18 @@ int __set_memory(unsigned long addr, unsigned long numpages, unsigned long flags
 	return rc;
 }
 
-int set_direct_map_invalid_noflush(struct page *page)
+int set_direct_map_invalid_noflush(const void *addr)
 {
-	return __set_memory((unsigned long)page_to_virt(page), 1, SET_MEMORY_INV);
+	return __set_memory((unsigned long)addr, 1, SET_MEMORY_INV);
 }
 
-int set_direct_map_default_noflush(struct page *page)
+int set_direct_map_default_noflush(const void *addr)
 {
-	return __set_memory((unsigned long)page_to_virt(page), 1, SET_MEMORY_DEF);
+	return __set_memory((unsigned long)addr, 1, SET_MEMORY_DEF);
 }
 
-int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
+int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
+				 bool valid)
 {
 	unsigned long flags;
 
@@ -402,7 +403,7 @@ int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
 	else
 		flags = SET_MEMORY_INV;
 
-	return __set_memory((unsigned long)page_to_virt(page), nr, flags);
+	return __set_memory((unsigned long)addr, numpages, flags);
 }
 
 bool kernel_page_present(struct page *page)
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 4362c26aa992..b6a4173ff249 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -86,9 +86,10 @@ int set_pages_wb(struct page *page, int numpages);
 int set_pages_ro(struct page *page, int numpages);
 int set_pages_rw(struct page *page, int numpages);
 
-int set_direct_map_invalid_noflush(struct page *page);
-int set_direct_map_default_noflush(struct page *page);
-int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid);
+int set_direct_map_invalid_noflush(const void *addr);
+int set_direct_map_default_noflush(const void *addr);
+int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
+				 bool valid);
 bool kernel_page_present(struct page *page);
 
 extern int kernel_set_to_readonly;
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 40581a720fe8..7517195b75b9 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2587,9 +2587,9 @@ int set_pages_rw(struct page *page, int numpages)
 	return set_memory_rw(addr, numpages);
 }
 
-static int __set_pages_p(struct page *page, int numpages)
+static int __set_pages_p(const void *addr, int numpages)
 {
-	unsigned long tempaddr = (unsigned long) page_address(page);
+	unsigned long tempaddr = (unsigned long)addr;
 	struct cpa_data cpa = { .vaddr = &tempaddr,
 				.pgd = NULL,
 				.numpages = numpages,
@@ -2606,9 +2606,9 @@ static int __set_pages_p(struct page *page, int numpages)
 	return __change_page_attr_set_clr(&cpa, 1);
 }
 
-static int __set_pages_np(struct page *page, int numpages)
+static int __set_pages_np(const void *addr, int numpages)
 {
-	unsigned long tempaddr = (unsigned long) page_address(page);
+	unsigned long tempaddr = (unsigned long)addr;
 	struct cpa_data cpa = { .vaddr = &tempaddr,
 				.pgd = NULL,
 				.numpages = numpages,
@@ -2625,22 +2625,23 @@ static int __set_pages_np(struct page *page, int numpages)
 	return __change_page_attr_set_clr(&cpa, 1);
 }
 
-int set_direct_map_invalid_noflush(struct page *page)
+int set_direct_map_invalid_noflush(const void *addr)
 {
-	return __set_pages_np(page, 1);
+	return __set_pages_np(addr, 1);
 }
 
-int set_direct_map_default_noflush(struct page *page)
+int set_direct_map_default_noflush(const void *addr)
 {
-	return __set_pages_p(page, 1);
+	return __set_pages_p(addr, 1);
 }
 
-int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
+int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
+				 bool valid)
 {
 	if (valid)
-		return __set_pages_p(page, nr);
+		return __set_pages_p(addr, numpages);
 
-	return __set_pages_np(page, nr);
+	return __set_pages_np(addr, numpages);
 }
 
 #ifdef CONFIG_DEBUG_PAGEALLOC
@@ -2659,9 +2660,9 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 	 * and hence no memory allocations during large page split.
 	 */
 	if (enable)
-		__set_pages_p(page, numpages);
+		__set_pages_p(page_address(page), numpages);
 	else
-		__set_pages_np(page, numpages);
+		__set_pages_np(page_address(page), numpages);
 
 	/*
 	 * We should perform an IPI and flush all tlbs,
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index 3030d9245f5a..1a2563f525fc 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -25,17 +25,18 @@ static inline int set_memory_rox(unsigned long addr, int numpages)
 #endif
 
 #ifndef CONFIG_ARCH_HAS_SET_DIRECT_MAP
-static inline int set_direct_map_invalid_noflush(struct page *page)
+static inline int set_direct_map_invalid_noflush(const void *addr)
 {
 	return 0;
 }
-static inline int set_direct_map_default_noflush(struct page *page)
+static inline int set_direct_map_default_noflush(const void *addr)
 {
 	return 0;
 }
 
-static inline int set_direct_map_valid_noflush(struct page *page,
-					       unsigned nr, bool valid)
+static inline int set_direct_map_valid_noflush(const void *addr,
+					       unsigned long numpages,
+					       bool valid)
 {
 	return 0;
 }
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 6e1321837c66..6eddfb22c0ff 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -88,7 +88,7 @@ static inline int hibernate_restore_unprotect_page(void *page_address) {return 0
 static inline void hibernate_map_page(struct page *page)
 {
 	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
-		int ret = set_direct_map_default_noflush(page);
+		int ret = set_direct_map_default_noflush(page_address(page));
 
 		if (ret)
 			pr_warn_once("Failed to remap page\n");
@@ -101,7 +101,7 @@ static inline void hibernate_unmap_page(struct page *page)
 {
 	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
 		unsigned long addr = (unsigned long)page_address(page);
-		int ret  = set_direct_map_invalid_noflush(page);
+		int ret  = set_direct_map_invalid_noflush(page_address(page));
 
 		if (ret)
 			pr_warn_once("Failed to remap page\n");
diff --git a/mm/execmem.c b/mm/execmem.c
index 810a4ba9c924..220298ec87c8 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -119,7 +119,8 @@ static int execmem_set_direct_map_valid(struct vm_struct *vm, bool valid)
 	int err = 0;
 
 	for (int i = 0; i < vm->nr_pages; i += nr) {
-		err = set_direct_map_valid_noflush(vm->pages[i], nr, valid);
+		err = set_direct_map_valid_noflush(page_address(vm->pages[i]),
+						   nr, valid);
 		if (err)
 			goto err_restore;
 		updated += nr;
@@ -129,7 +130,8 @@ static int execmem_set_direct_map_valid(struct vm_struct *vm, bool valid)
 
 err_restore:
 	for (int i = 0; i < updated; i += nr)
-		set_direct_map_valid_noflush(vm->pages[i], nr, !valid);
+		set_direct_map_valid_noflush(page_address(vm->pages[i]), nr,
+					     !valid);
 
 	return err;
 }
diff --git a/mm/secretmem.c b/mm/secretmem.c
index 11a779c812a7..fd29b33c6764 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -72,7 +72,7 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf)
 			goto out;
 		}
 
-		err = set_direct_map_invalid_noflush(folio_page(folio, 0));
+		err = set_direct_map_invalid_noflush(folio_address(folio));
 		if (err) {
 			folio_put(folio);
 			ret = vmf_error(err);
@@ -87,7 +87,7 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf)
 			 * already happened when we marked the page invalid
 			 * which guarantees that this call won't fail
 			 */
-			set_direct_map_default_noflush(folio_page(folio, 0));
+			set_direct_map_default_noflush(folio_address(folio));
 			folio_put(folio);
 			if (err == -EEXIST)
 				goto retry;
@@ -151,7 +151,7 @@ static int secretmem_migrate_folio(struct address_space *mapping,
 
 static void secretmem_free_folio(struct folio *folio)
 {
-	set_direct_map_default_noflush(folio_page(folio, 0));
+	set_direct_map_default_noflush(folio_address(folio));
 	folio_zero_segment(folio, 0, folio_size(folio));
 }
 
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 61caa55a4402..8822f73957d9 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3342,14 +3342,17 @@ struct vm_struct *remove_vm_area(const void *addr)
 }
 
 static inline void set_area_direct_map(const struct vm_struct *area,
-				       int (*set_direct_map)(struct page *page))
+				       int (*set_direct_map)(const void *addr))
 {
 	int i;
 
 	/* HUGE_VMALLOC passes small pages to set_direct_map */
-	for (i = 0; i < area->nr_pages; i++)
-		if (page_address(area->pages[i]))
-			set_direct_map(area->pages[i]);
+	for (i = 0; i < area->nr_pages; i++) {
+		const void *addr = page_address(area->pages[i]);
+
+		if (addr)
+			set_direct_map(addr);
+	}
 }
 
 /*
-- 
2.50.1
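
The call-site pattern this patch introduces (resolve the address once, then pass a const void * to the helper or to a callback) can be illustrated with a small userspace model. This is a sketch under stated assumptions: the model_* names, the flat byte array standing in for memory, and the returned count of updated pages are all hypothetical and exist only for illustration.

```c
/* Userspace model only: the model_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL
#define MODEL_NR_PAGES  4

struct model_page {
	void *vaddr;	/* NULL models a page with no direct map address */
};

static unsigned char model_memory[MODEL_NR_PAGES * MODEL_PAGE_SIZE];
static int model_invalid[MODEL_NR_PAGES];

/* Stand-in for page_address(): page descriptor to virtual address. */
static const void *model_page_address(const struct model_page *page)
{
	return page->vaddr;
}

/* Address-based helper, mirroring the new prototype in this patch. */
static int model_set_direct_map_invalid_noflush(const void *addr)
{
	size_t off = (size_t)((const unsigned char *)addr - model_memory);

	model_invalid[off / MODEL_PAGE_SIZE] = 1;
	return 0;
}

/* Same shape as set_area_direct_map(): look the address up once and
 * skip pages that have none.  Returning a count is a model-only addition.
 */
static int model_set_area(const struct model_page *pages, size_t nr,
			  int (*set_direct_map)(const void *addr))
{
	int updated = 0;

	for (size_t i = 0; i < nr; i++) {
		const void *addr = model_page_address(&pages[i]);

		if (addr) {
			set_direct_map(addr);
			updated++;
		}
	}

	return updated;
}
```

Callers that already hold an address, such as those starting from a folio, can call the helper directly, while page-based callers perform the lookup themselves as the loop above does.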



* [PATCH v12 00/16] Direct Map Removal Support for guest_memfd
From: Kalyazin, Nikita @ 2026-04-10 15:17 UTC (permalink / raw)

From: Nikita Kalyazin <nikita.kalyazin@linux.dev>

[ based on kvm/next ]

Unmapping virtual machine guest memory from the host kernel's direct map
is an effective mitigation against Spectre-style transient execution
issues: if the kernel page tables do not contain entries pointing to
guest memory, then any attempted speculative read through the direct map
will necessarily be blocked by the MMU before any observable
microarchitectural side effects happen.  This means that Spectre gadgets
and similar cannot be used to target virtual machine memory.  Roughly
60% of speculative execution issues fall into this category [1, Table
1].

This patch series extends guest_memfd with the ability to remove its
memory from the host kernel's direct map, so that KVM guests backed by
guest_memfd can attain the above protection.

Additionally, a Firecracker branch with support for these VMs can be
found on GitHub [2].

For more details, please refer to the v5 cover letter.  No substantial
changes in design have taken place since.

See also the related write() syscall support in guest_memfd [3], which
describes how the two features interoperate.

Changes since v11:
 - Ackerley/Sashiko: fix previously missed __set_pages_* argument update
   in __kernel_map_pages (patch 1)
 - David: disallow large folios in folio_zap_direct_map (patch 2)
 - David/Sashiko: check for folio_is_zone_device if mapping is NULL in
   gup_fast_folio_allowed (patch 4)
 - Ackerley/Sashiko: kvm_arch_gmem_supports_no_direct_map to return
   false for SEV-SNP (patch 8)
 - David: replace a redundant check for GUEST_MEMFD_FLAG_NO_DIRECT_MAP
   with a WARN_ON_ONCE (patch 10)
 - David: assert the folio is locked when zapping direct map (patch 10)
 - Ackerley/Sashiko: reorder operations to "zap then prepare" and
   "invalidate then restore" (patch 10)

v11: https://lore.kernel.org/kvm/20260317141031.514-1-kalyazin@amazon.com
v10: https://lore.kernel.org/kvm/20260126164445.11867-1-kalyazin@amazon.com
v9: https://lore.kernel.org/kvm/20260114134510.1835-1-kalyazin@amazon.com
v8: https://lore.kernel.org/kvm/20251205165743.9341-1-kalyazin@amazon.com
v7: https://lore.kernel.org/kvm/20250924151101.2225820-1-patrick.roy@campus.lmu.de
v6: https://lore.kernel.org/kvm/20250912091708.17502-1-roypat@amazon.co.uk
v5: https://lore.kernel.org/kvm/20250828093902.2719-1-roypat@amazon.co.uk
v4: https://lore.kernel.org/kvm/20250221160728.1584559-1-roypat@amazon.co.uk
RFCv3: https://lore.kernel.org/kvm/20241030134912.515725-1-roypat@amazon.co.uk
RFCv2: https://lore.kernel.org/kvm/20240910163038.1298452-1-roypat@amazon.co.uk
RFCv1: https://lore.kernel.org/kvm/20240709132041.3625501-1-roypat@amazon.co.uk

[1] https://download.vusec.net/papers/quarantine_raid23.pdf
[2] https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding
[3] https://lore.kernel.org/kvm/20251114151828.98165-1-kalyazin@amazon.com

Nikita Kalyazin (4):
  set_memory: set_direct_map_* to take address
  set_memory: add folio_{zap,restore}_direct_map helpers
  mm/secretmem: make use of folio_{zap,restore}_direct_map
  mm/gup: drop local variable in gup_fast_folio_allowed

Patrick Roy (12):
  mm/gup: drop secretmem optimization from gup_fast_folio_allowed
  mm: introduce AS_NO_DIRECT_MAP
  KVM: guest_memfd: Add stub for kvm_arch_gmem_invalidate
  KVM: x86: define kvm_arch_gmem_supports_no_direct_map()
  KVM: arm64: define kvm_arch_gmem_supports_no_direct_map()
  KVM: guest_memfd: Add flag to remove from direct map
  KVM: selftests: load elf via bounce buffer
  KVM: selftests: set KVM_MEM_GUEST_MEMFD in vm_mem_add() if guest_memfd
    != -1
  KVM: selftests: Add guest_memfd based vm_mem_backing_src_types
  KVM: selftests: cover GUEST_MEMFD_FLAG_NO_DIRECT_MAP in existing
    selftests
  KVM: selftests: stuff vm_mem_backing_src_type into vm_shape
  KVM: selftests: Test guest execution from direct map removed gmem

 Documentation/virt/kvm/api.rst                | 21 +++---
 arch/arm64/include/asm/kvm_host.h             | 13 ++++
 arch/arm64/include/asm/set_memory.h           |  7 +-
 arch/arm64/mm/pageattr.c                      | 19 +++--
 arch/loongarch/include/asm/set_memory.h       |  7 +-
 arch/loongarch/mm/pageattr.c                  | 25 +++----
 arch/riscv/include/asm/set_memory.h           |  7 +-
 arch/riscv/mm/pageattr.c                      | 17 +++--
 arch/s390/include/asm/set_memory.h            |  7 +-
 arch/s390/mm/pageattr.c                       | 13 ++--
 arch/x86/include/asm/kvm_host.h               |  6 ++
 arch/x86/include/asm/set_memory.h             |  7 +-
 arch/x86/kvm/x86.c                            |  7 ++
 arch/x86/mm/pat/set_memory.c                  | 27 +++----
 include/linux/kvm_host.h                      | 14 ++++
 include/linux/pagemap.h                       | 16 ++++
 include/linux/secretmem.h                     | 18 -----
 include/linux/set_memory.h                    | 22 +++++-
 include/uapi/linux/kvm.h                      |  1 +
 kernel/power/snapshot.c                       |  4 +-
 lib/buildid.c                                 |  8 +-
 mm/execmem.c                                  |  6 +-
 mm/gup.c                                      | 47 ++++++------
 mm/memory.c                                   | 45 +++++++++++
 mm/mlock.c                                    |  2 +-
 mm/secretmem.c                                | 18 ++---
 mm/vmalloc.c                                  | 11 ++-
 .../testing/selftests/kvm/guest_memfd_test.c  | 17 ++++-
 .../testing/selftests/kvm/include/kvm_util.h  | 37 ++++++---
 .../testing/selftests/kvm/include/test_util.h |  8 ++
 tools/testing/selftests/kvm/lib/elf.c         |  8 +-
 tools/testing/selftests/kvm/lib/io.c          | 23 ++++++
 tools/testing/selftests/kvm/lib/kvm_util.c    | 59 ++++++++-------
 tools/testing/selftests/kvm/lib/test_util.c   |  8 ++
 tools/testing/selftests/kvm/lib/x86/sev.c     |  1 +
 .../selftests/kvm/pre_fault_memory_test.c     |  1 +
 .../selftests/kvm/set_memory_region_test.c    | 52 ++++++++++++-
 .../kvm/x86/private_mem_conversions_test.c    |  7 +-
 virt/kvm/guest_memfd.c                        | 75 +++++++++++++++++--
 39 files changed, 489 insertions(+), 202 deletions(-)


base-commit: 24f9515de8778410e4b84c85b196c9850d2c1e18
-- 
2.50.1



* Re: [PATCH v2 7/7] platform/x86/intel/pmc: Add Nova Lake support to intel_pmc_core driver
From: Ilpo Järvinen @ 2026-04-10 14:38 UTC (permalink / raw)
  To: Xi Pardee
  Cc: irenic.rajneesh, david.e.box, platform-driver-x86, LKML, linux-pm
In-Reply-To: <20260408222144.3288928-8-xi.pardee@linux.intel.com>

On Wed, 8 Apr 2026, Xi Pardee wrote:

> Add Nova Lake support in intel_pmc_core driver
> 
> Signed-off-by: Xi Pardee <xi.pardee@linux.intel.com>
> ---
>  drivers/platform/x86/intel/pmc/Makefile |    3 +-
>  drivers/platform/x86/intel/pmc/core.c   |    2 +
>  drivers/platform/x86/intel/pmc/core.h   |   31 +
>  drivers/platform/x86/intel/pmc/nvl.c    | 1539 +++++++++++++++++++++++
>  drivers/platform/x86/intel/pmc/ptl.c    |    2 +-
>  5 files changed, 1575 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/platform/x86/intel/pmc/nvl.c
> 
> diff --git a/drivers/platform/x86/intel/pmc/Makefile b/drivers/platform/x86/intel/pmc/Makefile
> index bb960c8721d77..23853e867c912 100644
> --- a/drivers/platform/x86/intel/pmc/Makefile
> +++ b/drivers/platform/x86/intel/pmc/Makefile
> @@ -4,7 +4,8 @@
>  #
>  
>  intel_pmc_core-y			:= core.o spt.o cnp.o icl.o \
> -					   tgl.o adl.o mtl.o arl.o lnl.o ptl.o wcl.o
> +					   tgl.o adl.o mtl.o arl.o \
> +					   lnl.o ptl.o wcl.o nvl.o
>  obj-$(CONFIG_INTEL_PMC_CORE)		+= intel_pmc_core.o
>  intel_pmc_core_pltdrv-y			:= pltdrv.o
>  obj-$(CONFIG_INTEL_PMC_CORE)		+= intel_pmc_core_pltdrv.o
> diff --git a/drivers/platform/x86/intel/pmc/core.c b/drivers/platform/x86/intel/pmc/core.c
> index c84e75b19aac3..207708f4ceb94 100644
> --- a/drivers/platform/x86/intel/pmc/core.c
> +++ b/drivers/platform/x86/intel/pmc/core.c
> @@ -1849,6 +1849,8 @@ static const struct x86_cpu_id intel_pmc_core_ids[] = {
>  	X86_MATCH_VFM(INTEL_LUNARLAKE_M,	&lnl_pmc_dev),
>  	X86_MATCH_VFM(INTEL_PANTHERLAKE_L,	&ptl_pmc_dev),
>  	X86_MATCH_VFM(INTEL_WILDCATLAKE_L,	&wcl_pmc_dev),
> +	X86_MATCH_VFM(INTEL_NOVALAKE,		&nvl_s_pmc_dev),
> +	X86_MATCH_VFM(INTEL_NOVALAKE_L,		&nvl_h_pmc_dev),
>  	{}
>  };
>  
> diff --git a/drivers/platform/x86/intel/pmc/core.h b/drivers/platform/x86/intel/pmc/core.h
> index a741e4698f195..f2b4a20d2ff44 100644
> --- a/drivers/platform/x86/intel/pmc/core.h
> +++ b/drivers/platform/x86/intel/pmc/core.h
> @@ -307,6 +307,29 @@ enum ppfear_regs {
>  #define WCL_NUM_S0IX_BLOCKER			94
>  #define WCL_BLK_REQ_OFFSET			50
>  
> +/* Nova Lake */
> +#define NVL_PCDH_PPFEAR_NUM_ENTRIES		13
> +#define NVL_PCDH_PMC_MMIO_REG_LEN		0x363c
> +#define NVL_PCDS_PMC_MMIO_REG_LEN		0x3118
> +#define NVL_PCHS_PMC_MMIO_REG_LEN		0x30d8
> +#define NVL_LPM_PRI_OFFSET			0x17a4
> +#define NVL_LPM_EN_OFFSET			0x17a0
> +#define NVL_LPM_RESIDENCY_OFFSET		0x17a8
> +#define NVL_LPM_LIVE_STATUS_OFFSET		0x1760
> +#define NVL_LPM_NUM_MAPS			15
> +#define NVL_PCDH_NUM_S0IX_BLOCKER		107
> +#define NVL_PCDS_NUM_S0IX_BLOCKER		71
> +#define NVL_PCHS_NUM_S0IX_BLOCKER		54
> +#define NVL_PCDS_PMC_LTR_RESERVED		0x1bac
> +#define NVL_PCDH_BLK_REQ_OFFSET			53
> +#define NVL_PCDS_BLK_REQ_OFFSET			18
> +#define NVL_PCHS_BLK_REQ_OFFSET			46
> +#define NVL_PMT_PC_GUID				0x13000101
> +#define NVL_PMT_DMU_GUID			0x1a000101
> +#define NVL_LTR_BLK_OFFSET			64
> +#define NVL_PKGC_BLK_OFFSET			4
> +#define NVL_PMT_DMU_DIE_C6_OFFSET		25
> +
>  /* SSRAM PMC Device ID */
>  /* LNL */
>  #define PMC_DEVID_LNL_SOCM	0xa87f
> @@ -329,6 +352,11 @@ enum ppfear_regs {
>  #define PMC_DEVID_MTL_IOEP	0x7ecf
>  #define PMC_DEVID_MTL_IOEM	0x7ebf
>  
> +/* NVL */
> +#define PMC_DEVID_NVL_PCDH	0xd37e
> +#define PMC_DEVID_NVL_PCDS	0xd47e
> +#define PMC_DEVID_NVL_PCHS	0x6e27
> +
>  extern const char *pmc_lpm_modes[];
>  
>  struct pmc_bit_map {
> @@ -558,6 +586,7 @@ extern const struct pmc_reg_map mtl_ioep_reg_map;
>  extern const struct pmc_bit_map ptl_pcdp_clocksource_status_map[];
>  extern const struct pmc_bit_map ptl_pcdp_vnn_req_status_3_map[];
>  extern const struct pmc_bit_map ptl_pcdp_signal_status_map[];
> +extern const struct pmc_bit_map ptl_pcdp_ltr_show_map[];
>  
>  void pmc_core_get_tgl_lpm_reqs(struct platform_device *pdev);
>  int pmc_core_send_ltr_ignore(struct pmc_dev *pmcdev, u32 value, int ignore);
> @@ -581,6 +610,8 @@ extern struct pmc_dev_info arl_h_pmc_dev;
>  extern struct pmc_dev_info lnl_pmc_dev;
>  extern struct pmc_dev_info ptl_pmc_dev;
>  extern struct pmc_dev_info wcl_pmc_dev;
> +extern struct pmc_dev_info nvl_s_pmc_dev;
> +extern struct pmc_dev_info nvl_h_pmc_dev;
>  
>  void cnl_suspend(struct pmc_dev *pmcdev);
>  int cnl_resume(struct pmc_dev *pmcdev);
> diff --git a/drivers/platform/x86/intel/pmc/nvl.c b/drivers/platform/x86/intel/pmc/nvl.c
> new file mode 100644
> index 0000000000000..96f4244d602be
> --- /dev/null
> +++ b/drivers/platform/x86/intel/pmc/nvl.c
> @@ -0,0 +1,1539 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This file contains platform specific structure definitions
> + * and init function used by Nova Lake PCH.
> + *
> + * Copyright (c) 2026, Intel Corporation.
> + */
> +
> +#include <linux/pci.h>
> +
> +#include "core.h"
> +
> +/* PMC SSRAM PMT Telemetry GUIDS */
> +#define PCDH_LPM_REQ_GUID 0x01093101
> +#define PCHS_LPM_REQ_GUID 0x01092101
> +#define PCDS_LPM_REQ_GUID 0x01091102
> +
> +/*
> + * Die Mapping to Product.
> + * Product PCDDie PCHDie
> + * NVL-H   PCD-H  None
> + * NVL-S   PCD-S  PCH-S
> + */
> +
> +static const struct pmc_bit_map nvl_pcdh_pfear_map[] = {
> +	{"PMC_PGD0",                 BIT(0)},

Add #include <linux/bits.h> for BIT().

> +	{"FUSE_OSSE_PGD0",           BIT(1)},
> +	{"SPI_PGD0",                 BIT(2)},
> +	{"XHCI_PGD0",                BIT(3)},
> +	{"SPA_PGD0",                 BIT(4)},
> +	{"SPB_PGD0",                 BIT(5)},
> +	{"MPFPW2_PGD0",              BIT(6)},
> +	{"GBE_PGD0",                 BIT(7)},
> +
> +	{"SBR16B20_PGD0",            BIT(0)},
> +	{"DBG_SBR_PGD0",             BIT(1)},
> +	{"SBR16B7_PGD0",             BIT(2)},
> +	{"STRC_PGD0",                BIT(3)},
> +	{"SBR16B8_PGD0",             BIT(4)},
> +	{"D2D_DISP_PGD1",            BIT(5)},
> +	{"LPSS_PGD0",                BIT(6)},
> +	{"LPC_PGD0",                 BIT(7)},
> +
> +	{"SMB_PGD0",                 BIT(0)},
> +	{"ISH_PGD0",                 BIT(1)},
> +	{"SBR16B2_PGD0",             BIT(2)},
> +	{"NPK_PGD0",                 BIT(3)},
> +	{"D2D_NOC_PGD1",             BIT(4)},
> +	{"DBG_SBR16B_PGD0",          BIT(5)},
> +	{"FUSE_PGD0",                BIT(6)},
> +	{"SBR16B0_PGD0",             BIT(7)},
> +
> +	{"P2SB0_PGD0",               BIT(0)},
> +	{"OTG_PGD0",                 BIT(1)},
> +	{"EXI_PGD0",                 BIT(2)},
> +	{"CSE_PGD0",                 BIT(3)},
> +	{"CSME_KVM_PGD0",            BIT(4)},
> +	{"CSME_PMT_PGD0",            BIT(5)},
> +	{"CSME_CLINK_PGD0",          BIT(6)},
> +	{"SBR16B21_PGD0",            BIT(7)},
> +
> +	{"CSME_USBR_PGD0",           BIT(0)},
> +	{"SBR16B22_PGD0",            BIT(1)},
> +	{"CSME_SMT1_PGD0",           BIT(2)},
> +	{"MPFPW1_PGD0",              BIT(3)},
> +	{"CSME_SMS2_PGD0",           BIT(4)},
> +	{"CSME_SMS_PGD0",            BIT(5)},
> +	{"CSME_RTC_PGD0",            BIT(6)},
> +	{"CSMEPSF_PGD0",             BIT(7)},
> +
> +	{"D2D_NOC_PGD0",             BIT(0)},
> +	{"ESE_PGD0",                 BIT(1)},
> +	{"SBR16B6_PGD0",             BIT(2)},
> +	{"P2SB1_PGD0",               BIT(3)},
> +	{"SBR16B3_PGD0",             BIT(4)},
> +	{"OSSE_SMT1_PGD0",           BIT(5)},
> +	{"D2D_DISP_PGD0",            BIT(6)},
> +	{"SNPS_USB2_A_PGD0",         BIT(7)},
> +
> +	{"U3FPW1_PGD0",              BIT(0)},
> +	{"FIA_X_PGD0",               BIT(1)},
> +	{"PSF4_PGD0",                BIT(2)},
> +	{"CNVI_PGD0",                BIT(3)},
> +	{"UFSX2_PGD0",               BIT(4)},
> +	{"ENDBG_PGD0",               BIT(5)},
> +	{"DBC_PGD0",                 BIT(6)},
> +	{"FIA_PG_PGD0",              BIT(7)},
> +
> +	{"D2D_IPU_PGD0",             BIT(0)},
> +	{"NPK_PGD1",                 BIT(1)},
> +	{"FIACPCB_X_PGD0",           BIT(2)},
> +	{"SBR8B4_PGD0",              BIT(3)},
> +	{"DBG_PSF_PGD0",             BIT(4)},
> +	{"PSF6_PGD0",                BIT(5)},
> +	{"UFSPW1_PGD0",              BIT(6)},
> +	{"FIA_U_PGD0",               BIT(7)},
> +
> +	{"PSF8_PGD0",                BIT(0)},
> +	{"SBR16B9_PGD0",             BIT(1)},
> +	{"PSF0_PGD0",                BIT(2)},
> +	{"FIACPCB_U_PGD0",           BIT(3)},
> +	{"TAM_PGD0",                 BIT(4)},
> +	{"D2D_NOC_PGD2",             BIT(5)},
> +	{"SBR8B2_PGD0",              BIT(6)},
> +	{"THC0_PGD0",                BIT(7)},
> +
> +	{"THC1_PGD0",                BIT(0)},
> +	{"PMC_PGD1",                 BIT(1)},
> +	{"DISP_PGA1_PGD0",           BIT(2)},
> +	{"TCSS_PGD0",                BIT(3)},
> +	{"DISP_PGA_PGD0",            BIT(4)},
> +	{"SBR16B1_PGD0",             BIT(5)},
> +	{"SBRG_PGD0",                BIT(6)},
> +	{"PSF5_PGD0",                BIT(7)},
> +
> +	{"SBR8B3_PGD0",              BIT(0)},
> +	{"ACE_PGD0",                 BIT(1)},
> +	{"ACE_PGD1",                 BIT(2)},
> +	{"ACE_PGD2",                 BIT(3)},
> +	{"ACE_PGD3",                 BIT(4)},
> +	{"ACE_PGD4",                 BIT(5)},
> +	{"ACE_PGD5",                 BIT(6)},
> +	{"ACE_PGD6",                 BIT(7)},
> +
> +	{"ACE_PGD7",                 BIT(0)},
> +	{"ACE_PGD8",                 BIT(1)},
> +	{"ACE_PGD9",                 BIT(2)},
> +	{"ACE_PGD10",                BIT(3)},
> +	{"FIACPCB_PG_PGD0",          BIT(4)},
> +	{"SNPS_USB2_B_PGD0",         BIT(5)},
> +	{"OSSE_PGD0",                BIT(6)},
> +	{"SBR8B0_PGD0",              BIT(7)},
> +
> +	{"SBR16B4_PGD0",             BIT(0)},
> +	{"CSME_PTIO_PGD0",           BIT(1)},
> +	{}
> +};
> +
> +static const struct pmc_bit_map *ext_nvl_pcdh_pfear_map[] = {
> +	nvl_pcdh_pfear_map,
> +	NULL
> +};
> +
> +const struct pmc_bit_map nvl_pcdh_clocksource_status_map[] = {
> +	{"AON2_OFF_STS",                 BIT(0),	1},
> +	{"AON3_OFF_STS",                 BIT(1),	0},
> +	{"AON4_OFF_STS",                 BIT(2),	1},
> +	{"AON5_OFF_STS",                 BIT(3),	1},
> +	{"AON1_OFF_STS",                 BIT(4),	0},
> +	{"XTAL_LVM_OFF_STS",             BIT(5),	0},
> +	{"MPFPW1_0_PLL_OFF_STS",         BIT(6),	1},
> +	{"D2D_PLL_OFF_STS",              BIT(7),	1},
> +	{"USB3_PLL_OFF_STS",             BIT(8),	1},
> +	{"AON3_SPL_OFF_STS",             BIT(9),	1},
> +	{"MPFPW2_0_PLL_OFF_STS",         BIT(12),	1},
> +	{"XTAL_AGGR_OFF_STS",            BIT(17),	1},
> +	{"USB2_PLL_OFF_STS",             BIT(18),	0},
> +	{"DDI2_PLL_OFF_STS",             BIT(19),	1},
> +	{"SE_TCSS_PLL_OFF_STS",          BIT(20),	1},
> +	{"DDI_PLL_OFF_STS",              BIT(21),	1},
> +	{"FILTER_PLL_OFF_STS",           BIT(22),	1},
> +	{"ACE_PLL_OFF_STS",              BIT(24),	0},
> +	{"FABRIC_PLL_OFF_STS",           BIT(25),	1},
> +	{"SOC_PLL_OFF_STS",              BIT(26),	1},
> +	{"REF_PLL_OFF_STS",              BIT(28),	1},
> +	{"IMG_PLL_OFF_STS",              BIT(29),	1},
> +	{"GENLOCK_FILTER_PLL_OFF_STS",   BIT(30),	1},
> +	{"RTC_PLL_OFF_STS",              BIT(31),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_power_gating_status_0_map[] = {
> +	{"PMC_PGD0_PG_STS",              BIT(0),	0},
> +	{"FUSE_OSSE_PGD0_PG_STS",	 BIT(1),	0},
> +	{"ESPISPI_PGD0_PG_STS",          BIT(2),	0},
> +	{"XHCI_PGD0_PG_STS",             BIT(3),	1},
> +	{"SPA_PGD0_PG_STS",              BIT(4),	1},
> +	{"SPB_PGD0_PG_STS",              BIT(5),	1},
> +	{"MPFPW2_PGD0_PG_STS",           BIT(6),	0},
> +	{"GBE_PGD0_PG_STS",              BIT(7),	1},
> +	{"SBR16B20_PGD0_PG_STS",         BIT(8),	0},
> +	{"DBG_PGD0_PG_STS",              BIT(9),	0},
> +	{"SBR16B7_PGD0_PG_STS",          BIT(10),	0},
> +	{"STRC_PGD0_PG_STS",             BIT(11),	0},
> +	{"SBR16B8_PGD0_PG_STS",          BIT(12),	0},
> +	{"D2D_DISP_PGD1_PG_STS",         BIT(13),	1},
> +	{"LPSS_PGD0_PG_STS",             BIT(14),	1},
> +	{"LPC_PGD0_PG_STS",              BIT(15),	0},
> +	{"SMB_PGD0_PG_STS",              BIT(16),	0},
> +	{"ISH_PGD0_PG_STS",              BIT(17),	0},
> +	{"SBR16B2_PGD0_PG_STS",          BIT(18),	0},
> +	{"NPK_PGD0_PG_STS",              BIT(19),	0},
> +	{"D2D_NOC_PGD1_PG_STS",          BIT(20),	1},
> +	{"DBG_SBR16B_PGD0_PG_STS",       BIT(21),	0},
> +	{"FUSE_PGD0_PG_STS",             BIT(22),	0},
> +	{"SBR16B0_PGD0_PG_STS",          BIT(23),	0},
> +	{"P2SB0_PGD0_PG_STS",            BIT(24),	1},
> +	{"XDCI_PGD0_PG_STS",             BIT(25),	1},
> +	{"EXI_PGD0_PG_STS",              BIT(26),	0},
> +	{"CSE_PGD0_PG_STS",              BIT(27),	1},
> +	{"KVMCC_PGD0_PG_STS",            BIT(28),	1},
> +	{"PMT_PGD0_PG_STS",              BIT(29),	1},
> +	{"CLINK_PGD0_PG_STS",            BIT(30),	1},
> +	{"SBR16B21_PGD0_PG_STS",         BIT(31),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_power_gating_status_1_map[] = {
> +	{"USBR0_PGD0_PG_STS",            BIT(0),	1},
> +	{"SBR16B22_PGD0_PG_STS",         BIT(1),	0},
> +	{"SMT1_PGD0_PG_STS",             BIT(2),	1},
> +	{"MPFPW1_PGD0_PG_STS",           BIT(3),	0},
> +	{"SMS2_PGD0_PG_STS",             BIT(4),	1},
> +	{"SMS1_PGD0_PG_STS",             BIT(5),	1},
> +	{"CSMERTC_PGD0_PG_STS",          BIT(6),	0},
> +	{"CSMEPSF_PGD0_PG_STS",          BIT(7),	0},
> +	{"D2D_NOC_PGD0_PG_STS",          BIT(8),	0},
> +	{"ESE_PGD0_PG_STS",              BIT(9),	1},
> +	{"SBR16B6_PGD0_PG_STS",          BIT(10),	0},
> +	{"P2SB1_PGD0_PG_STS",            BIT(11),	1},
> +	{"SBR16B3_PGD0_PG_STS",          BIT(12),	0},
> +	{"OSSE_SMT1_PGD0_PG_STS",        BIT(13),	1},
> +	{"D2D_DISP_PGD0_PG_STS",         BIT(14),	1},
> +	{"SNPA_USB2_A_PGD0_PG_STS",      BIT(15),	0},
> +	{"U3FPW1_PGD0_PG_STS",           BIT(16),	0},
> +	{"FIA_X_PGD0_PG_STS",            BIT(17),	0},
> +	{"PSF4_PGD0_PG_STS",             BIT(18),	0},
> +	{"CNVI_PGD0_PG_STS",             BIT(19),	0},
> +	{"UFSX2_PGD0_PG_STS",            BIT(20),	1},
> +	{"ENDBG_PGD0_PG_STS",            BIT(21),	0},
> +	{"DBC_PGD0_PG_STS",              BIT(22),	0},
> +	{"FIA_PG_PGD0_PG_STS",           BIT(23),	0},
> +	{"D2D_IPU_PGD0_PG_STS",          BIT(24),	1},
> +	{"NPK_PGD1_PG_STS",              BIT(25),	0},
> +	{"FIACPCB_X_PGD0_PG_STS",        BIT(26),	0},
> +	{"SBR8B4_PGD0_PG_STS",           BIT(27),	0},
> +	{"DBG_PSF_PGD0_PG_STS",          BIT(28),	0},
> +	{"PSF6_PGD0_PG_STS",             BIT(29),	0},
> +	{"UFSPW1_PGD0_PG_STS",           BIT(30),	0},
> +	{"FIA_U_PGD0_PG_STS",            BIT(31),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_power_gating_status_2_map[] = {
> +	{"PSF8_PGD0_PG_STS",             BIT(0),	0},
> +	{"SBR16B9_PGD0_PG_STS",          BIT(1),	0},
> +	{"PSF0_PGD0_PG_STS",             BIT(2),	0},
> +	{"FIACPCB_U_PGD0_PG_STS",        BIT(3),	0},
> +	{"TAM_PGD0_PG_STS",              BIT(4),	1},
> +	{"D2D_NOC_PGD2_PG_STS",          BIT(5),	1},
> +	{"SBR8B2_PGD0_PG_STS",           BIT(6),	0},
> +	{"THC0_PGD0_PG_STS",             BIT(7),	1},
> +	{"THC1_PGD0_PG_STS",             BIT(8),	1},
> +	{"PMC_PGD1_PG_STS",              BIT(9),	0},
> +	{"DISP_PGA1_PGD0_PG_STS",        BIT(10),	0},
> +	{"TCSS_PGD0_PG_STS",             BIT(11),	0},
> +	{"DISP_PGA_PGD0_PG_STS",         BIT(12),	0},
> +	{"SBR16B1_PGD0_PG_STS",          BIT(13),	0},
> +	{"SBRG_PGD0_PG_STS",             BIT(14),	0},
> +	{"PSF5_PGD0_PG_STS",             BIT(15),	0},
> +	{"SBR8B3_PGD0_PG_STS",           BIT(16),	0},
> +	{"ACE_PGD0_PG_STS",              BIT(17),	0},
> +	{"ACE_PGD1_PG_STS",              BIT(18),	0},
> +	{"ACE_PGD2_PG_STS",              BIT(19),	0},
> +	{"ACE_PGD3_PG_STS",              BIT(20),	0},
> +	{"ACE_PGD4_PG_STS",              BIT(21),	0},
> +	{"ACE_PGD5_PG_STS",              BIT(22),	0},
> +	{"ACE_PGD6_PG_STS",              BIT(23),	0},
> +	{"ACE_PGD7_PG_STS",              BIT(24),	0},
> +	{"ACE_PGD8_PG_STS",              BIT(25),	0},
> +	{"ACE_PGD9_PG_STS",              BIT(26),	0},
> +	{"ACE_PGD10_PG_STS",             BIT(27),	0},
> +	{"FIACPCB_PG_PGD0_PG_STS",       BIT(28),	0},
> +	{"SNPS_USB2_B_PGD0_PG_STS",      BIT(29),	0},
> +	{"OSSE_PGD0_PG_STS",             BIT(30),	1},
> +	{"SBR8B0_PGD0_PG_STS",           BIT(31),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_power_gating_status_3_map[] = {
> +	{"SBR16B4_PGD0_PG_STS",          BIT(0),	0},
> +	{"PTIO_PGD0_PG_STS",             BIT(1),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_d3_status_0_map[] = {
> +	{"LPSS_D3_STS",                  BIT(3),	1},
> +	{"XDCI_D3_STS",                  BIT(4),	1},
> +	{"XHCI_D3_STS",                  BIT(5),	1},
> +	{"OSSE_D3_STS",                  BIT(6),	0},
> +	{"SPA_D3_STS",                   BIT(12),	0},
> +	{"SPB_D3_STS",                   BIT(13),	0},
> +	{"ESPISPI_D3_STS",               BIT(18),	0},
> +	{"PSTH_D3_STS",                  BIT(21),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_d3_status_1_map[] = {
> +	{"OSSE_SMT1_D3_STS",             BIT(0),	0},
> +	{"GBE_D3_STS",                   BIT(19),	0},
> +	{"ITSS_D3_STS",                  BIT(23),	0},
> +	{"CNVI_D3_STS",                  BIT(27),	0},
> +	{"UFSX2_D3_STS",                 BIT(28),	0},
> +	{"ESE_D3_STS",                   BIT(29),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_d3_status_2_map[] = {
> +	{"CSMERTC_D3_STS",               BIT(1),	0},
> +	{"CSE_D3_STS",                   BIT(4),	0},
> +	{"KVMCC_D3_STS",                 BIT(5),	0},
> +	{"USBR0_D3_STS",                 BIT(6),	0},
> +	{"ISH_D3_STS",                   BIT(7),	0},
> +	{"SMT1_D3_STS",                  BIT(8),	0},
> +	{"SMT2_D3_STS",                  BIT(9),	0},
> +	{"SMT3_D3_STS",                  BIT(10),	0},
> +	{"OSSE_SMT2_D3_STS",             BIT(11),	0},
> +	{"CLINK_D3_STS",                 BIT(14),	0},
> +	{"PTIO_D3_STS",                  BIT(16),	0},
> +	{"PMT_D3_STS",                   BIT(17),	0},
> +	{"SMS1_D3_STS",                  BIT(18),	0},
> +	{"SMS2_D3_STS",                  BIT(19),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_d3_status_3_map[] = {
> +	{"THC0_D3_STS",                  BIT(14),	1},
> +	{"THC1_D3_STS",                  BIT(15),	1},
> +	{"OSSE_SMT3_D3_STS",             BIT(16),	0},
> +	{"ACE_D3_STS",                   BIT(23),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_vnn_req_status_0_map[] = {
> +	{"LPSS_VNN_REQ_STS",             BIT(3),	1},
> +	{"OSSE_VNN_REQ_STS",             BIT(6),	1},
> +	{"ESPISPI_VNN_REQ_STS",          BIT(18),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_vnn_req_status_1_map[] = {
> +	{"OSSE_SMT1_VNN_REQ_STS",        BIT(0),	1},
> +	{"NPK_VNN_REQ_STS",              BIT(4),	1},
> +	{"DFXAGG_VNN_REQ_STS",           BIT(8),	0},
> +	{"EXI_VNN_REQ_STS",              BIT(9),	1},
> +	{"P2D_VNN_REQ_STS",              BIT(18),	1},
> +	{"GBE_VNN_REQ_STS",              BIT(19),	1},
> +	{"SMB_VNN_REQ_STS",              BIT(25),	1},
> +	{"LPC_VNN_REQ_STS",              BIT(26),	0},
> +	{"ESE_VNN_REQ_STS",              BIT(29),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_vnn_req_status_2_map[] = {
> +	{"CSMERTC_VNN_REQ_STS",          BIT(1),	1},
> +	{"CSE_VNN_REQ_STS",              BIT(4),	1},
> +	{"ISH_VNN_REQ_STS",              BIT(7),	1},
> +	{"SMT1_VNN_REQ_STS",             BIT(8),	1},
> +	{"CLINK_VNN_REQ_STS",            BIT(14),	1},
> +	{"SMS1_VNN_REQ_STS",             BIT(18),	1},
> +	{"SMS2_VNN_REQ_STS",             BIT(19),	1},
> +	{"GPIOCOM4_VNN_REQ_STS",         BIT(20),	1},
> +	{"GPIOCOM3_VNN_REQ_STS",         BIT(21),	1},
> +	{"DISP_SHIM_VNN_REQ_STS",        BIT(22),	1},
> +	{"GPIOCOM1_VNN_REQ_STS",         BIT(23),	1},
> +	{"GPIOCOM0_VNN_REQ_STS",         BIT(24),	1},
> +	{}
> +};
> +
> +const struct pmc_bit_map nvl_pcdh_vnn_req_status_3_map[] = {
> +	{"DTS0_VNN_REQ_STS",             BIT(7),	0},
> +	{"GPIOCOM5_VNN_REQ_STS",         BIT(11),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_vnn_misc_status_map[] = {
> +	{"CPU_C10_REQ_STS",              BIT(0),	0},
> +	{"TS_OFF_REQ_STS",               BIT(1),	0},
> +	{"PNDE_MET_REQ_STS",             BIT(2),	1},
> +	{"PG5_PMA0_REQ_STS",             BIT(3),	1},
> +	{"FW_THROTTLE_ALLOWED_REQ_STS",  BIT(4),	0},
> +	{"VNN_SOC_REQ_STS",              BIT(6),	1},
> +	{"ISH_VNNAON_REQ_STS",           BIT(7),	0},
> +	{"D2D_NOC_CFI_QACTIVE_REQ_STS",	 BIT(8),	1},
> +	{"D2D_NOC_GPSB_QACTIVE_REQ_STS", BIT(9),	1},
> +	{"D2D_IPU_QACTIVE_REQ_STS",      BIT(10),	1},
> +	{"PLT_GREATER_REQ_STS",          BIT(11),	1},
> +	{"ALL_SBR_IDLE_REQ_STS",         BIT(12),	0},
> +	{"PMC_IDLE_FB_OCP_REQ_STS",      BIT(13),	0},
> +	{"PM_SYNC_STATES_REQ_STS",       BIT(14),	0},
> +	{"EA_REQ_STS",                   BIT(15),	0},
> +	{"MPHY_CORE_OFF_REQ_STS",        BIT(16),	0},
> +	{"BRK_EV_EN_REQ_STS",            BIT(17),	0},
> +	{"AUTO_DEMO_EN_REQ_STS",         BIT(18),	0},
> +	{"ITSS_CLK_SRC_REQ_STS",         BIT(19),	1},
> +	{"ARC_IDLE_REQ_STS",             BIT(21),	0},
> +	{"PG5_PMA1_REQ_STS",             BIT(22),	1},
> +	{"FIA_DEEP_PM_REQ_STS",          BIT(23),	0},
> +	{"XDCI_ATTACHED_REQ_STS",        BIT(24),	1},
> +	{"ARC_INTERRUPT_WAKE_REQ_STS",   BIT(25),	0},
> +	{"D2D_DISP_DDI_QACTIVE_REQ_STS", BIT(26),	1},
> +	{"PRE_WAKE0_REQ_STS",            BIT(27),	1},
> +	{"PRE_WAKE1_REQ_STS",            BIT(28),	1},
> +	{"PRE_WAKE2_REQ_STS",            BIT(29),	1},
> +	{"PG5_PMA2_GVNN",                BIT(30),	1},
> +	{"D2D_DISP_EDP_QACTIVE_REQ_STS", BIT(31),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcdh_rsc_status_map[] = {
> +	{"CORE",		0,		1},
> +	{"Memory",		0,		1},
> +	{"PRIM_D2D",		0,		1},
> +	{"PSF0",		0,		1},
> +	{"PSF4",		0,		1},
> +	{"PSF6",		0,		1},
> +	{"PSF8",		0,		1},
> +	{"SB",			0,		1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map *nvl_pcdh_lpm_maps[] = {
> +	nvl_pcdh_clocksource_status_map,
> +	nvl_pcdh_power_gating_status_0_map,
> +	nvl_pcdh_power_gating_status_1_map,
> +	nvl_pcdh_power_gating_status_2_map,
> +	nvl_pcdh_power_gating_status_3_map,
> +	nvl_pcdh_d3_status_0_map,
> +	nvl_pcdh_d3_status_1_map,
> +	nvl_pcdh_d3_status_2_map,
> +	nvl_pcdh_d3_status_3_map,
> +	nvl_pcdh_vnn_req_status_0_map,
> +	nvl_pcdh_vnn_req_status_1_map,
> +	nvl_pcdh_vnn_req_status_2_map,
> +	nvl_pcdh_vnn_req_status_3_map,
> +	nvl_pcdh_vnn_misc_status_map,
> +	ptl_pcdp_signal_status_map,
> +	NULL
> +};
> +
> +static const struct pmc_bit_map *nvl_pcdh_blk_maps[] = {
> +	nvl_pcdh_power_gating_status_0_map,
> +	nvl_pcdh_power_gating_status_1_map,
> +	nvl_pcdh_power_gating_status_2_map,
> +	nvl_pcdh_power_gating_status_3_map,
> +	nvl_pcdh_rsc_status_map,
> +	nvl_pcdh_vnn_req_status_0_map,
> +	nvl_pcdh_vnn_req_status_1_map,
> +	nvl_pcdh_vnn_req_status_2_map,
> +	nvl_pcdh_vnn_req_status_3_map,
> +	nvl_pcdh_d3_status_0_map,
> +	nvl_pcdh_d3_status_1_map,
> +	nvl_pcdh_d3_status_2_map,
> +	nvl_pcdh_d3_status_3_map,
> +	nvl_pcdh_clocksource_status_map,
> +	nvl_pcdh_vnn_misc_status_map,
> +	ptl_pcdp_signal_status_map,
> +	NULL
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_pfear_map[] = {
> +	{"PMC_PGD0",                 BIT(0)},
> +	{"FUSE_OSSE_PGD0",           BIT(1)},
> +	{"SPI_PGD0",                 BIT(2)},
> +	{"XHCI_PGD0",                BIT(3)},
> +	{"SPA_PGD0",                 BIT(4)},
> +	{"SPB_PGD0",                 BIT(5)},
> +	{"RSVD6",                    BIT(6)},
> +	{"GBE_PGD0",                 BIT(7)},
> +
> +	{"RSVD8",                    BIT(0)},
> +	{"RSVD9",                    BIT(1)},
> +	{"SBR16B7_PGD0",             BIT(2)},
> +	{"SBR16B21_PGD0",            BIT(3)},
> +	{"RSVD12",                   BIT(4)},
> +	{"D2D_DISP_PGD1",            BIT(5)},
> +	{"LPSS_PGD0",                BIT(6)},
> +	{"LPC_PGD0",                 BIT(7)},
> +
> +	{"SMB_PGD0",                 BIT(0)},
> +	{"ISH_PGD0",                 BIT(1)},
> +	{"SBR16B1_PGD0",             BIT(2)},
> +	{"NPK_PGD0",                 BIT(3)},
> +	{"D2D_NOC_PGD1",             BIT(4)},
> +	{"DBG_SBR16B_PGD0",          BIT(5)},
> +	{"FUSE_PGD0",                BIT(6)},
> +	{"RSVD23",                   BIT(7)},
> +
> +	{"P2SB0_PGD0",               BIT(0)},
> +	{"OTG_PGD0",                 BIT(1)},
> +	{"EXI_PGD0",                 BIT(2)},
> +	{"CSE_PGD0",                 BIT(3)},
> +	{"CSME_KVM_PGD0",            BIT(4)},
> +	{"CSME_PMT_PGD0",            BIT(5)},
> +	{"CSME_CLINK_PGD0",          BIT(6)},
> +	{"CSME_PTIO_PGD0",           BIT(7)},
> +
> +	{"CSME_USBR_PGD0",           BIT(0)},
> +	{"SBR16B22_PGD0",            BIT(1)},
> +	{"CSME_SMT1_PGD0",           BIT(2)},
> +	{"P2SB1_PGD0",               BIT(3)},
> +	{"CSME_SMS2_PGD0",           BIT(4)},
> +	{"CSME_SMS_PGD0",            BIT(5)},
> +	{"CSME_RTC_PGD0",            BIT(6)},
> +	{"CSMEPSF_PGD0",             BIT(7)},
> +
> +	{"D2D_NOC_PGD0",             BIT(0)},
> +	{"RSVD41",                   BIT(1)},
> +	{"RSVD42",                   BIT(2)},
> +	{"RSVD43",                   BIT(3)},
> +	{"SBR16B2_PGD0",             BIT(4)},
> +	{"OSSE_SMT1_PGD0",           BIT(5)},
> +	{"D2D_DISP_PGD0",            BIT(6)},
> +	{"RSVD47_PGD0",              BIT(7)},
> +
> +	{"RSVD48",                   BIT(0)},
> +	{"DBG_PSF_PGD0",             BIT(1)},
> +	{"RSVD50",                   BIT(2)},
> +	{"CNVI_PGD0",                BIT(3)},
> +	{"UFSX2_PGD0",               BIT(4)},
> +	{"ENDBG_PGD0",               BIT(5)},
> +	{"DBC_PGD0",                 BIT(6)},
> +	{"SBR16B4_PGD0",             BIT(7)},
> +
> +	{"RSVD56",                   BIT(0)},
> +	{"NPK_PGD1",                 BIT(1)},
> +	{"RSVD58",                   BIT(2)},
> +	{"SBR16B20_PGD0",            BIT(3)},
> +	{"RSVD60",                   BIT(4)},
> +	{"SBR8B20_PGD0",             BIT(5)},
> +	{"RSVD62",                   BIT(6)},
> +	{"FIA_U_PGD0",               BIT(7)},
> +
> +	{"PSF8_PGD0",                BIT(0)},
> +	{"RSVD65",                   BIT(1)},
> +	{"RSVD66",                   BIT(2)},
> +	{"FIACPCB_U_PGD0",           BIT(3)},
> +	{"TAM_PGD0",                 BIT(4)},
> +	{"D2D_NOC_PGD2",             BIT(5)},
> +	{"SBR8B2_PGD0",              BIT(6)},
> +	{"THC0_PGD0",                BIT(7)},
> +
> +	{"THC1_PGD0",                BIT(0)},
> +	{"PMC_PGD1",                 BIT(1)},
> +	{"SBR16B3_PGD0",             BIT(2)},
> +	{"TCSS_PGD0",                BIT(3)},
> +	{"DISP_PGA_PGD0",            BIT(4)},
> +	{"RSVD77",                   BIT(5)},
> +	{"RSVD78",                   BIT(6)},
> +	{"RSVD79",                   BIT(7)},
> +
> +	{"SBRG_PGD0",                BIT(0)},
> +	{"RSVD81",                   BIT(1)},
> +	{"SBR16B0_PGD0",             BIT(2)},
> +	{"SBR8B0_PGD0",              BIT(3)},
> +	{"PSF7_PGD0",                BIT(4)},
> +	{"RSVD85",                   BIT(5)},
> +	{"RSVD86",                   BIT(6)},
> +	{"RSVD87",                   BIT(7)},
> +
> +	{"SBR16B6_PGD0",             BIT(0)},
> +	{"PSD0_PGD0",                BIT(1)},
> +	{"STRC_PGD0",                BIT(2)},
> +	{"RSVD91",                   BIT(3)},
> +	{"DBG_SBR_PGD0",             BIT(4)},
> +	{"RSVD93",                   BIT(5)},
> +	{"OSSE_PGD0",                BIT(6)},
> +	{"DISP_PGA1_PGD0",           BIT(7)},
> +	{}
> +};
> +
> +static const struct pmc_bit_map *ext_nvl_pcds_pfear_map[] = {
> +	nvl_pcds_pfear_map,
> +	NULL
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_ltr_show_map[] = {
> +	{"SOUTHPORT_A",		CNP_PMC_LTR_SPA},
> +	{"SOUTHPORT_B",		CNP_PMC_LTR_SPB},
> +	{"SATA",		CNP_PMC_LTR_SATA},
> +	{"GIGABIT_ETHERNET",	CNP_PMC_LTR_GBE},
> +	{"XHCI",		CNP_PMC_LTR_XHCI},
> +	{"SOUTHPORT_F",		ADL_PMC_LTR_SPF},
> +	{"ME",			CNP_PMC_LTR_ME},
> +	{"SATA1",		CNP_PMC_LTR_EVA},
> +	{"SOUTHPORT_C",		CNP_PMC_LTR_SPC},
> +	{"HD_AUDIO",		CNP_PMC_LTR_AZ},
> +	{"CNV",			CNP_PMC_LTR_CNV},
> +	{"LPSS",		CNP_PMC_LTR_LPSS},
> +	{"SOUTHPORT_D",		CNP_PMC_LTR_SPD},
> +	{"SOUTHPORT_E",		CNP_PMC_LTR_SPE},
> +	{"SATA2",		PTL_PMC_LTR_SATA2},
> +	{"ESPI",		CNP_PMC_LTR_ESPI},
> +	{"SCC",			CNP_PMC_LTR_SCC},
> +	{"ISH",			CNP_PMC_LTR_ISH},
> +	{"UFSX2",		CNP_PMC_LTR_UFSX2},
> +	{"EMMC",		CNP_PMC_LTR_EMMC},
> +	{"WIGIG",		ICL_PMC_LTR_WIGIG},
> +	{"THC0",		TGL_PMC_LTR_THC0},
> +	{"THC1",		TGL_PMC_LTR_THC1},
> +	{"SOUTHPORT_G",		MTL_PMC_LTR_SPG},
> +	{"RSVD",		NVL_PCDS_PMC_LTR_RESERVED},
> +	{"IOE_PMC",		MTL_PMC_LTR_IOE_PMC},
> +	{"DMI3",		ARL_PMC_LTR_DMI3},
> +	{"OSSE",		LNL_PMC_LTR_OSSE},
> +
> +	/* Below two cannot be used for LTR_IGNORE */
> +	{"CURRENT_PLATFORM",	PTL_PMC_LTR_CUR_PLT},
> +	{"AGGREGATED_SYSTEM",	PTL_PMC_LTR_CUR_ASLT},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_clocksource_status_map[] = {
> +	{"AON2_OFF_STS",                 BIT(0),	1},
> +	{"AON3_OFF_STS",                 BIT(1),	0},
> +	{"AON4_OFF_STS",                 BIT(2),	1},
> +	{"AON5_OFF_STS",                 BIT(3),	1},
> +	{"AON1_OFF_STS",                 BIT(4),	0},
> +	{"XTAL_LVM_OFF_STS",             BIT(5),	0},
> +	{"D2D_OFF_STS",                  BIT(8),	1},
> +	{"AON3_SPL_OFF_STS",             BIT(9),	1},
> +	{"XTAL_AGGR_OFF_STS",            BIT(17),	1},
> +	{"BCLK_EXT_INJ_OFF_STS",         BIT(18),	1},
> +	{"DDI2_PLL_OFF_STS",             BIT(19),	1},
> +	{"SE_TCSS_PLL_OFF_STS",          BIT(20),	1},
> +	{"DDI_PLL_OFF_STS",              BIT(21),	1},
> +	{"FILTER_PLL_OFF_STS",           BIT(22),	1},
> +	{"PHY_OC_EXT_INJ_OFF_STS",       BIT(23),	1},
> +	{"ACE_PLL_OFF_STS",              BIT(24),	0},
> +	{"FABRIC_PLL_OFF_STS",           BIT(25),	1},
> +	{"SOC_PLL_OFF_STS",              BIT(26),	1},
> +	{"REF_PLL_OFF_STS",              BIT(28),	1},
> +	{"GENLOCK_FILTER_PLL_OFF_STS",   BIT(30),	1},
> +	{"RTC_PLL_OFF_STS",              BIT(31),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_power_gating_status_0_map[] = {
> +	{"PMC_PGD0_PG_STS",              BIT(0),	0},
> +	{"FUSE_OSSE_PGD0_PG_STS",	 BIT(1),	0},
> +	{"ESPISPI_PGD0_PG_STS",          BIT(2),	0},
> +	{"XHCI_PGD0_PG_STS",             BIT(3),	0},
> +	{"SPA_PGD0_PG_STS",              BIT(4),	0},
> +	{"SPB_PGD0_PG_STS",              BIT(5),	0},
> +	{"RSVD_6",                       BIT(6),	0},
> +	{"GBE_PGD0_PG_STS",              BIT(7),	0},
> +	{"RSVD_8",                       BIT(8),	0},
> +	{"RSVD_9",                       BIT(9),	0},
> +	{"SBR16B7_PGD0_PG_STS",          BIT(10),	0},
> +	{"SBR16B21_PGD0_PG_STS",         BIT(11),	0},
> +	{"RSVD_12",                      BIT(12),	0},
> +	{"D2D_DISP_PGD1_PG_STS",         BIT(13),	1},
> +	{"LPSS_PGD0_PG_STS",             BIT(14),	0},
> +	{"LPC_PGD0_PG_STS",              BIT(15),	0},
> +	{"SMB_PGD0_PG_STS",              BIT(16),	0},
> +	{"ISH_PGD0_PG_STS",              BIT(17),	0},
> +	{"SBR16B1_PGD0_PG_STS",          BIT(18),	0},
> +	{"NPK_PGD0_PG_STS",              BIT(19),	0},
> +	{"D2D_NOC_PGD1_PG_STS",          BIT(20),	1},
> +	{"DBG_SBR16B_PGD0_PG_STS",       BIT(21),	0},
> +	{"FUSE_PGD0_PG_STS",             BIT(22),	0},
> +	{"RSVD_23",                      BIT(23),	0},
> +	{"P2SB0_PGD0_PG_STS",            BIT(24),	1},
> +	{"XDCI_PGD0_PG_STS",             BIT(25),	0},
> +	{"EXI_PGD0_PG_STS",              BIT(26),	0},
> +	{"CSE_PGD0_PG_STS",              BIT(27),	1},
> +	{"KVMCC_PGD0_PG_STS",            BIT(28),	0},
> +	{"PMT_PGD0_PG_STS",              BIT(29),	0},
> +	{"CLINK_PGD0_PG_STS",            BIT(30),	0},
> +	{"PTIO_PGD0_PG_STS",             BIT(31),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_power_gating_status_1_map[] = {
> +	{"USBR0_PGD0_PG_STS",            BIT(0),	0},
> +	{"SBR16B22_PGD0_PG_STS",         BIT(1),	0},
> +	{"SMT1_PGD0_PG_STS",             BIT(2),	0},
> +	{"P2SB1_PGD0_PG_STS",            BIT(3),	1},
> +	{"SMS2_PGD0_PG_STS",             BIT(4),	0},
> +	{"SMS1_PGD0_PG_STS",             BIT(5),	0},
> +	{"CSMERTC_PGD0_PG_STS",          BIT(6),	0},
> +	{"CSMEPSF_PGD0_PG_STS",          BIT(7),	0},
> +	{"D2D_NOC_PGD0_PG_STS",          BIT(8),	0},
> +	{"RSVD_9",                       BIT(9),	0},
> +	{"RSVD_10",                      BIT(10),	0},
> +	{"RSVD_11",                      BIT(11),	0},
> +	{"SBR16B2_PGD0_PG_STS",          BIT(12),	0},
> +	{"OSSE_SMT1_PGD0_PG_STS",        BIT(13),	1},
> +	{"D2D_DISP_PGD0_PG_STS",         BIT(14),	1},
> +	{"RSVD_15",                      BIT(15),	0},
> +	{"RSVD_16",                      BIT(16),	0},
> +	{"DBG_PSF_PGD0_PG_STS",          BIT(17),	0},
> +	{"RSVD_18",                      BIT(18),	0},
> +	{"CNVI_PGD0_PG_STS",             BIT(19),	0},
> +	{"UFSX2_PGD0_PG_STS",            BIT(20),	0},
> +	{"ENDBG_PGD0_PG_STS",            BIT(21),	0},
> +	{"DBC_PGD0_PG_STS",              BIT(22),	0},
> +	{"SBR16B4_PGD0_PG_STS",          BIT(23),	0},
> +	{"RSVD_24",                      BIT(24),	0},
> +	{"NPK_PGD1_PG_STS",              BIT(25),	0},
> +	{"RSVD_26",                      BIT(26),	0},
> +	{"SBR16B20_PGD0_PG_STS",         BIT(27),	0},
> +	{"RSVD_28",                      BIT(28),	0},
> +	{"SBR8B20_PGD0_PG_STS",          BIT(29),	0},
> +	{"RSVD_30",                      BIT(30),	0},
> +	{"FIA_U_PGD0_PG_STS",            BIT(31),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_power_gating_status_2_map[] = {
> +	{"PSF8_PGD0_PG_STS",             BIT(0),	0},
> +	{"RSVD_1",                       BIT(1),	0},
> +	{"RSVD_2",                       BIT(2),	0},
> +	{"FIACPCB_U_PGD0_PG_STS",        BIT(3),	0},
> +	{"TAM_PGD0_PG_STS",              BIT(4),	1},
> +	{"D2D_NOC_PGD2_PG_STS",          BIT(5),	1},
> +	{"SBR8B2_PGD0_PG_STS",           BIT(6),	0},
> +	{"THC0_PGD0_PG_STS",             BIT(7),	0},
> +	{"THC1_PGD0_PG_STS",             BIT(8),	0},
> +	{"PMC_PGD1_PG_STS",              BIT(9),	0},
> +	{"SBR16B3_PGD0_PG_STS",          BIT(10),	0},
> +	{"TCSS_PGD0_PG_STS",             BIT(11),	0},
> +	{"DISP_PGA_PGD0_PG_STS",         BIT(12),	0},
> +	{"RSVD_13",                      BIT(13),	0},
> +	{"RSVD_14",                      BIT(14),	0},
> +	{"RSVD_15",                      BIT(15),	0},
> +	{"SBRG_PGD0_PG_STS",             BIT(16),	0},
> +	{"RSVD_17",                      BIT(17),	0},
> +	{"SBR16B0_PGD0_PG_STS",          BIT(18),	0},
> +	{"SBR8B0_PGD0_PG_STS",           BIT(19),	0},
> +	{"PSF7_PGD0_PG_STS",             BIT(20),	0},
> +	{"RSVD_21",                      BIT(21),	0},
> +	{"RSVD_22",                      BIT(22),	0},
> +	{"RSVD_23",                      BIT(23),	0},
> +	{"SBR16B6_PGD0_PG_STS",          BIT(24),	0},
> +	{"PSF0_PGD0_PG_STS",             BIT(25),	0},
> +	{"STRC_PGD0_PG_STS",             BIT(26),	0},
> +	{"RSVD_27",                      BIT(27),	0},
> +	{"DBG_SBR_PGD0_PG_STS",          BIT(28),	0},
> +	{"RSVD_29",                      BIT(29),	0},
> +	{"OSSE_PGD0_PG_STS",             BIT(30),	1},
> +	{"DISP_PGA1_PGD0_PG_STS",        BIT(31),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_d3_status_0_map[] = {
> +	{"LPSS_D3_STS",                  BIT(3),	1},
> +	{"XDCI_D3_STS",                  BIT(4),	1},
> +	{"XHCI_D3_STS",                  BIT(5),	1},
> +	{"SPA_D3_STS",                   BIT(12),	0},
> +	{"SPB_D3_STS",                   BIT(13),	0},
> +	{"ESPISPI_D3_STS",               BIT(18),	0},
> +	{"PSTH_D3_STS",                  BIT(21),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_d3_status_1_map[] = {
> +	{"OSSE_D3_STS",                  BIT(14),	0},
> +	{"GBE_D3_STS",                   BIT(19),	0},
> +	{"ITSS_D3_STS",                  BIT(23),	0},
> +	{"CNVI_D3_STS",                  BIT(27),	0},
> +	{"UFSX2_D3_STS",                 BIT(28),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_d3_status_2_map[] = {
> +	{"CSMERTC_D3_STS",               BIT(1),	0},
> +	{"CSE_D3_STS",                   BIT(4),	0},
> +	{"KVMCC_D3_STS",                 BIT(5),	0},
> +	{"USBR0_D3_STS",                 BIT(6),	0},
> +	{"ISH_D3_STS",                   BIT(7),	0},
> +	{"SMT1_D3_STS",                  BIT(8),	0},
> +	{"SMT2_D3_STS",                  BIT(9),	0},
> +	{"SMT3_D3_STS",                  BIT(10),	0},
> +	{"OSSE_SMT1_D3_STS",             BIT(12),	0},
> +	{"CLINK_D3_STS",                 BIT(14),	0},
> +	{"PTIO_D3_STS",                  BIT(16),	0},
> +	{"PMT_D3_STS",                   BIT(17),	0},
> +	{"SMS1_D3_STS",                  BIT(18),	0},
> +	{"SMS2_D3_STS",                  BIT(19),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_d3_status_3_map[] = {
> +	{"OSSE_SMT2_D3_STS",             BIT(0),	0},
> +	{"THC0_D3_STS",                  BIT(14),	1},
> +	{"THC1_D3_STS",                  BIT(15),	1},
> +	{"OSSE_SMT3_D3_STS",             BIT(19),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_vnn_req_status_0_map[] = {
> +	{"LPSS_VNN_REQ_STS",             BIT(3),	0},
> +	{"ESPISPI_VNN_REQ_STS",          BIT(18),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_vnn_req_status_1_map[] = {
> +	{"NPK_VNN_REQ_STS",              BIT(4),	1},
> +	{"DFXAGG_VNN_REQ_STS",           BIT(8),	0},
> +	{"EXI_VNN_REQ_STS",              BIT(9),	1},
> +	{"OSSE_VNN_REQ_STS",             BIT(14),	1},
> +	{"P2D_VNN_REQ_STS",              BIT(18),	1},
> +	{"GBE_VNN_REQ_STS",              BIT(19),	0},
> +	{"SMB_VNN_REQ_STS",              BIT(25),	1},
> +	{"LPC_VNN_REQ_STS",              BIT(26),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_vnn_req_status_2_map[] = {
> +	{"CSMERTC_VNN_REQ_STS",          BIT(1),	0},
> +	{"CSE_VNN_REQ_STS",              BIT(4),	1},
> +	{"ISH_VNN_REQ_STS",              BIT(7),	0},
> +	{"SMT1_VNN_REQ_STS",             BIT(8),	0},
> +	{"OSSE_SMT1_VNN_REQ_STS",        BIT(12),	1},
> +	{"CLINK_VNN_REQ_STS",            BIT(14),	0},
> +	{"SMS1_VNN_REQ_STS",             BIT(18),	0},
> +	{"SMS2_VNN_REQ_STS",             BIT(19),	0},
> +	{"GPIOCOM4_VNN_REQ_STS",         BIT(20),	0},
> +	{"GPIOCOM3_VNN_REQ_STS",         BIT(21),	1},
> +	{"GPIOCOM1_VNN_REQ_STS",         BIT(23),	1},
> +	{"GPIOCOM0_VNN_REQ_STS",         BIT(24),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_vnn_req_status_3_map[] = {
> +	{"DISP_SHIM_VNN_REQ_STS",        BIT(4),	1},
> +	{"DTS0_VNN_REQ_STS",             BIT(7),	0},
> +	{"GPIOCOM5_VNN_REQ_STS",         BIT(11),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_vnn_misc_status_map[] = {
> +	{"CPU_C10_REQ_STS",              BIT(0),	0},
> +	{"TS_OFF_REQ_STS",               BIT(1),	0},
> +	{"PNDE_MET_REQ_STS",             BIT(2),	1},
> +	{"PG5_PMA0_REQ_STS",             BIT(3),	1},
> +	{"FW_THROTTLE_ALLOWED_REQ_STS",  BIT(4),	0},
> +	{"VNN_SOC_REQ_STS",              BIT(6),	1},
> +	{"ISH_VNNAON_REQ_STS",           BIT(7),	0},
> +	{"D2D_NOC_CFI_QACTIVE_REQ_STS",	 BIT(8),	1},
> +	{"D2D_NOC_GPSB_QACTIVE_REQ_STS", BIT(9),	1},
> +	{"PLT_GREATER_REQ_STS",          BIT(11),	1},
> +	{"ALL_SBR_IDLE_REQ_STS",         BIT(12),	0},
> +	{"PMC_IDLE_FB_OCP_REQ_STS",      BIT(13),	0},
> +	{"PM_SYNC_STATES_REQ_STS",       BIT(14),	0},
> +	{"EA_REQ_STS",                   BIT(15),	0},
> +	{"MPHY_CORE_OFF_REQ_STS",        BIT(16),	0},
> +	{"BRK_EV_EN_REQ_STS",            BIT(17),	0},
> +	{"AUTO_DEMO_EN_REQ_STS",         BIT(18),	0},
> +	{"ITSS_CLK_SRC_REQ_STS",         BIT(19),	1},
> +	{"ARC_IDLE_REQ_STS",             BIT(21),	0},
> +	{"PG5_PMA1_REQ_STS",             BIT(22),	1},
> +	{"DG5_PMA0_REQ_STS",             BIT(23),	1},
> +	{"ARC_INTERRUPT_WAKE_REQ_STS",   BIT(25),	0},
> +	{"D2D_DISP_DDI_QACTIVE_REQ_STS", BIT(26),	1},
> +	{"PRE_WAKE0_REQ_STS",            BIT(27),	1},
> +	{"PRE_WAKE1_REQ_STS",            BIT(28),	1},
> +	{"PRE_WAKE2_REQ_STS",            BIT(29),	1},
> +	{"D2D_DISP_EDP_QACTIVE_REQ_STS", BIT(31),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_rsc_status_map[] = {
> +	{"CORE",		0,		1},
> +	{"Memory",		0,		1},
> +	{"PRIM_D2D",		0,		1},
> +	{"PSF0",		0,		1},
> +	{"SB",			0,		1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pcds_signal_status_map[] = {
> +	{"LSX_Wake0_STS",		 BIT(0),	0},
> +	{"LSX_Wake1_STS",		 BIT(1),	0},
> +	{"LSX_Wake2_STS",		 BIT(2),	0},
> +	{"LSX_Wake3_STS",		 BIT(3),	0},
> +	{"LSX_Wake4_STS",		 BIT(4),	0},
> +	{"LSX_Wake5_STS",		 BIT(5),	0},
> +	{"LSX_Wake6_STS",		 BIT(6),	0},
> +	{"LSX_Wake7_STS",		 BIT(7),	0},
> +	{"LPSS_Wake0_STS",		 BIT(8),	1},
> +	{"LPSS_Wake1_STS",		 BIT(9),	1},
> +	{"Int_Timer_SS_Wake0_STS",	 BIT(10),	1},
> +	{"Int_Timer_SS_Wake1_STS",	 BIT(11),	1},
> +	{"Int_Timer_SS_Wake2_STS",	 BIT(12),	1},
> +	{"Int_Timer_SS_Wake3_STS",	 BIT(13),	1},
> +	{"Int_Timer_SS_Wake4_STS",	 BIT(14),	1},
> +	{"Int_Timer_SS_Wake5_STS",	 BIT(15),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map *nvl_pcds_lpm_maps[] = {
> +	nvl_pcds_clocksource_status_map,
> +	nvl_pcds_power_gating_status_0_map,
> +	nvl_pcds_power_gating_status_1_map,
> +	nvl_pcds_power_gating_status_2_map,
> +	nvl_pcds_d3_status_0_map,
> +	nvl_pcds_d3_status_1_map,
> +	nvl_pcds_d3_status_2_map,
> +	nvl_pcds_d3_status_3_map,
> +	nvl_pcds_vnn_req_status_0_map,
> +	nvl_pcds_vnn_req_status_1_map,
> +	nvl_pcds_vnn_req_status_2_map,
> +	nvl_pcds_vnn_req_status_3_map,
> +	nvl_pcds_vnn_misc_status_map,
> +	nvl_pcds_signal_status_map,
> +	NULL
> +};
> +
> +static const struct pmc_bit_map *nvl_pcds_blk_maps[] = {
> +	nvl_pcds_power_gating_status_0_map,
> +	nvl_pcds_power_gating_status_1_map,
> +	nvl_pcds_power_gating_status_2_map,
> +	nvl_pcds_rsc_status_map,
> +	nvl_pcds_vnn_req_status_0_map,
> +	nvl_pcds_vnn_req_status_1_map,
> +	nvl_pcds_vnn_req_status_2_map,
> +	nvl_pcds_vnn_req_status_3_map,
> +	nvl_pcds_d3_status_0_map,
> +	nvl_pcds_d3_status_1_map,
> +	nvl_pcds_d3_status_2_map,
> +	nvl_pcds_d3_status_3_map,
> +	nvl_pcds_clocksource_status_map,
> +	nvl_pcds_vnn_misc_status_map,
> +	nvl_pcds_signal_status_map,
> +	NULL
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_pfear_map[] = {
> +	{"PMC_PGD0",                 BIT(0)},
> +	{"FIA_D_PGD0",               BIT(1)},
> +	{"SPI_PGD0",                 BIT(2)},
> +	{"XHCI_PGD0",                BIT(3)},
> +	{"SPA_PGD0",                 BIT(4)},
> +	{"SPB_PGD0",                 BIT(5)},
> +	{"MPFPW2_PGD0",              BIT(6)},
> +	{"GBE_PGD0",                 BIT(7)},
> +
> +	{"RSVD8",                    BIT(0)},
> +	{"PSF3_PGD0",                BIT(1)},
> +	{"SBR5_PGD0",                BIT(2)},
> +	{"SBR0_PGD0",                BIT(3)},
> +	{"RSVD12",                   BIT(4)},
> +	{"D2D_DISP_PGD1",            BIT(5)},
> +	{"LPSS_PGD0",                BIT(6)},
> +	{"LPC_PGD0",                 BIT(7)},
> +
> +	{"SMB_PGD0",                 BIT(0)},
> +	{"ISH_PGD0",                 BIT(1)},
> +	{"P2SB_PGD0",                BIT(2)},
> +	{"NPK_PGD0",                 BIT(3)},
> +	{"D2D_NOC_PGD1",             BIT(4)},
> +	{"EAH_PGD0",                 BIT(5)},
> +	{"FUSE_PGD0",                BIT(6)},
> +	{"SBR8_PGD0",                BIT(7)},
> +
> +	{"PSF7_PGD0",                BIT(0)},
> +	{"OTG_PGD0",                 BIT(1)},
> +	{"EXI_PGD0",                 BIT(2)},
> +	{"CSE_PGD0",                 BIT(3)},
> +	{"CSME_KVM_PGD0",            BIT(4)},
> +	{"CSME_PMT_PGD0",            BIT(5)},
> +	{"CSME_CLINK_PGD0",          BIT(6)},
> +	{"CSME_PTIO_PGD0",           BIT(7)},
> +
> +	{"CSME_USBR_PGD0",           BIT(0)},
> +	{"SBR1_PGD0",                BIT(1)},
> +	{"CSME_SMT1_PGD0",           BIT(2)},
> +	{"MPFPW1_PGD0",              BIT(3)},
> +	{"CSME_SMS2_PGD0",           BIT(4)},
> +	{"CSME_SMS_PGD0",            BIT(5)},
> +	{"CSME_RTC_PGD0",            BIT(6)},
> +	{"CSMEPSF_PGD0",             BIT(7)},
> +
> +	{"D2D_NOC_PGD0",             BIT(0)},
> +	{"ESE_PGD0",                 BIT(1)},
> +	{"SBR2_PGD0",                BIT(2)},
> +	{"SBR3_PGD0",                BIT(3)},
> +	{"SBR4_PGD0",                BIT(4)},
> +	{"RSVD45",                   BIT(5)},
> +	{"D2D_DISP_PGD0",            BIT(6)},
> +	{"PSF1_PGD0",                BIT(7)},
> +
> +	{"U3FPW1_PGD0",              BIT(0)},
> +	{"DMI3FPW_PGD0",             BIT(1)},
> +	{"PSF4_PGD0",                BIT(2)},
> +	{"CNVI_PGD0",                BIT(3)},
> +	{"RSVD52",                   BIT(4)},
> +	{"ENDBG_PGD0",               BIT(5)},
> +	{"DBC_PGD0",                 BIT(6)},
> +	{"SMT4_PGD0",                BIT(7)},
> +
> +	{"RSVD56",                   BIT(0)},
> +	{"NPK_PGD1",                 BIT(1)},
> +	{"RSVD58",                   BIT(2)},
> +	{"DMI3_PGD0",                BIT(3)},
> +	{"RSVD60",                   BIT(4)},
> +	{"FIACPCB_D_PGD0",           BIT(5)},
> +	{"RSVD62",                   BIT(6)},
> +	{"FIA_U_PGD0",               BIT(7)},
> +
> +	{"FIACPCB_PGS_PGD0",         BIT(0)},
> +	{"FIA_PGS_PGD0",             BIT(1)},
> +	{"RSVD66",                   BIT(2)},
> +	{"FIACPCB_U_PGD0",           BIT(3)},
> +	{"TAM_PGD0",                 BIT(4)},
> +	{"D2D_NOC_PGD2",             BIT(5)},
> +	{"PSF2_PGD0",                BIT(6)},
> +	{"THC0_PGD0",                BIT(7)},
> +
> +	{"THC1_PGD0",                BIT(0)},
> +	{"PMC_PGD1",                 BIT(1)},
> +	{"SBR9_PGD0",                BIT(2)},
> +	{"U3FPW2_PGD0",              BIT(3)},
> +	{"RSVD76",                   BIT(4)},
> +	{"DBG_PSF_PGD0",             BIT(5)},
> +	{"DBG_SBR_PGD0",             BIT(6)},
> +	{"SBR6_PGD0",                BIT(7)},
> +
> +	{"SPC_PGD0",                 BIT(0)},
> +	{"ACE_PGD0",                 BIT(1)},
> +	{"ACE_PGD1",                 BIT(2)},
> +	{"ACE_PGD2",                 BIT(3)},
> +	{"ACE_PGD3",                 BIT(4)},
> +	{"ACE_PGD4",                 BIT(5)},
> +	{"ACE_PGD5",                 BIT(6)},
> +	{"ACE_PGD6",                 BIT(7)},
> +
> +	{"ACE_PGD7",                 BIT(0)},
> +	{"ACE_PGD8",                 BIT(1)},
> +	{"ACE_PGD9",                 BIT(2)},
> +	{"ACE_PGD10",                BIT(3)},
> +	{"U3FPW3_PGD0",              BIT(4)},
> +	{"SBR7_PGD0",                BIT(5)},
> +	{"OSSE_PGD0",                BIT(6)},
> +	{"ST_PGD0",                  BIT(7)},
> +	{}
> +};
> +
> +static const struct pmc_bit_map *ext_nvl_pchs_pfear_map[] = {
> +	nvl_pchs_pfear_map,
> +	NULL
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_clocksource_status_map[] = {
> +	{"AON2_OFF_STS",                 BIT(0),	1},
> +	{"AON3_OFF_STS",                 BIT(1),	0},
> +	{"AON4_OFF_STS",                 BIT(2),	0},
> +	{"AON2_SPL_OFF_STS",             BIT(3),	0},
> +	{"AONL_OFF_STS",                 BIT(4),	0},
> +	{"XTAL_LVM_OFF_STS",             BIT(5),	0},
> +	{"AON5_OFF_STS",                 BIT(6),	0},
> +	{"USB3_PLL_OFF_STS",             BIT(8),	1},
> +	{"MAIN_CRO_OFF_STS",             BIT(11),	0},
> +	{"MAIN_DIVIDER_OFF_STS",         BIT(12),	1},
> +	{"REF_PLL_NON_OC_OFF_STS",       BIT(13),	1},
> +	{"DMI_PLL_OFF_STS",              BIT(14),	1},
> +	{"PHY_EXT_INJ_OFF_STS",          BIT(15),	1},
> +	{"AON6_MCRO_OFF_STS",            BIT(16),	0},
> +	{"XTAL_AGGR_OFF_STS",            BIT(17),	0},
> +	{"USB2_PLL_OFF_STS",             BIT(18),	1},
> +	{"GBE_PLL_OFF_STS",              BIT(21),	1},
> +	{"SATA_PLL_OFF_STS",             BIT(22),	1},
> +	{"PCIE0_PLL_OFF_STS",            BIT(23),	1},
> +	{"PCIE1_PLL_OFF_STS",            BIT(24),	1},
> +	{"FABRIC_PLL_OFF_STS",           BIT(25),	1},
> +	{"PCIE2_PLL_OFF_STS",            BIT(26),	1},
> +	{"REF_PLL_OFF_STS",              BIT(28),	1},
> +	{"REF38P4_PLL_OFF_STS",          BIT(31),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_power_gating_status_0_map[] = {
> +	{"PMC_PGD0_PG_STS",              BIT(0),	0},
> +	{"FIA_D_PGD0_PG_STS",            BIT(1),	0},
> +	{"ESPISPI_PGD0_PG_STS",          BIT(2),	0},
> +	{"XHCI_PGD0_PG_STS",             BIT(3),	0},
> +	{"SPA_PGD0_PG_STS",              BIT(4),	1},
> +	{"SPB_PGD0_PG_STS",              BIT(5),	1},
> +	{"MPFPW2_PGD0_PG_STS",           BIT(6),	0},
> +	{"GBE_PGD0_PG_STS",              BIT(7),	1},
> +	{"RSVD_8",                       BIT(8),	0},
> +	{"PSF3_PGD0_PG_STS",             BIT(9),	0},
> +	{"SBR5_PGD0_PG_STS",             BIT(10),	0},
> +	{"SBR0_PGD0_PG_STS",             BIT(11),	0},
> +	{"RSVD_12",                      BIT(12),	0},
> +	{"D2D_DISP_PGD1_PG_STS",         BIT(13),	0},
> +	{"LPSS_PGD0_PG_STS",             BIT(14),	1},
> +	{"LPC_PGD0_PG_STS",              BIT(15),	0},
> +	{"SMB_PGD0_PG_STS",              BIT(16),	0},
> +	{"ISH_PGD0_PG_STS",              BIT(17),	0},
> +	{"P2S_PGD0_PG_STS",              BIT(18),	0},
> +	{"NPK_PGD0_PG_STS",              BIT(19),	0},
> +	{"D2D_NOC_PGD1_PG_STS",          BIT(20),	0},
> +	{"EAH_PGD0_PG_STS",              BIT(21),	0},
> +	{"FUSE_PGD0_PG_STS",             BIT(22),	0},
> +	{"SBR8_PGD0_PG_STS",             BIT(23),	0},
> +	{"PSF7_PGD0_PG_STS",             BIT(24),	0},
> +	{"XDCI_PGD0_PG_STS",             BIT(25),	1},
> +	{"EXI_PGD0_PG_STS",              BIT(26),	0},
> +	{"CSE_PGD0_PG_STS",              BIT(27),	1},
> +	{"KVMCC_PGD0_PG_STS",            BIT(28),	1},
> +	{"PMT_PGD0_PG_STS",              BIT(29),	1},
> +	{"CLINK_PGD0_PG_STS",            BIT(30),	1},
> +	{"PTIO_PGD0_PG_STS",             BIT(31),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_power_gating_status_1_map[] = {
> +	{"USBR0_PGD0_PG_STS",            BIT(0),	1},
> +	{"SBR1_PGD0_PG_STS",             BIT(1),	0},
> +	{"SMT1_PGD0_PG_STS",             BIT(2),	1},
> +	{"MPFPW1_PGD0_PG_STS",           BIT(3),	0},
> +	{"SMS2_PGD0_PG_STS",             BIT(4),	1},
> +	{"SMS1_PGD0_PG_STS",             BIT(5),	1},
> +	{"CSMERTC_PGD0_PG_STS",          BIT(6),	0},
> +	{"CSMEPSF_PGD0_PG_STS",          BIT(7),	0},
> +	{"D2D_NOC_PGD0_PG_STS",          BIT(8),	0},
> +	{"ESE_PGD0_PG_STS",              BIT(9),	1},
> +	{"SBR2_PGD0_PG_STS",             BIT(10),	0},
> +	{"SBR3_PGD0_PG_STS",             BIT(11),	0},
> +	{"SBR4_PGD0_PG_STS",             BIT(12),	0},
> +	{"RSVD_13",                      BIT(13),	0},
> +	{"D2D_DISP_PGD0_PG_STS",         BIT(14),	0},
> +	{"PSF1_PGD0_PG_STS",             BIT(15),	0},
> +	{"U3FPW1_PGD0_PG_STS",           BIT(16),	0},
> +	{"DMI3FPW_PGD0_PG_STS",          BIT(17),	0},
> +	{"PSF4_PGD0_PG_STS",             BIT(18),	0},
> +	{"CNVI_PGD0_PG_STS",             BIT(19),	0},
> +	{"RSVD_20",                      BIT(20),	0},
> +	{"ENDBG_PGD0_PG_STS",            BIT(21),	0},
> +	{"DBC_PGD0_PG_STS",              BIT(22),	0},
> +	{"SMT4_PGD0_PG_STS",             BIT(23),	1},
> +	{"RSVD_24",                      BIT(24),	0},
> +	{"NPK_PGD1_PG_STS",              BIT(25),	0},
> +	{"RSVD_26",                      BIT(26),	0},
> +	{"DMI3_PGD0_PG_STS",             BIT(27),	1},
> +	{"RSVD_28",                      BIT(28),	0},
> +	{"FIACPCB_D_PGD0_PG_STS",        BIT(29),	0},
> +	{"RSVD_30",                      BIT(30),	0},
> +	{"FIA_U_PGD0_PG_STS",            BIT(31),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_power_gating_status_2_map[] = {
> +	{"FIACPCB_PGS_PGD0_PG_STS",      BIT(0),	0},
> +	{"FIA_PGS_PGD0_PG_STS",          BIT(1),	0},
> +	{"RSVD_2",                       BIT(2),	0},
> +	{"FIACPCB_U_PGD0_PG_STS",        BIT(3),	0},
> +	{"TAM_PGD0_PG_STS",              BIT(4),	0},
> +	{"D2D_NOC_PGD2_PG_STS",          BIT(5),	0},
> +	{"PSF2_PGD0_PG_STS",             BIT(6),	0},
> +	{"THC0_PGD0_PG_STS",             BIT(7),	1},
> +	{"THC1_PGD0_PG_STS",             BIT(8),	1},
> +	{"PMC_PGD1_PG_STS",              BIT(9),	0},
> +	{"SBR9_PGA0_PGD0_PG_STS",        BIT(10),	0},
> +	{"U3FPW2_PGD0_PG_STS",           BIT(11),	0},
> +	{"RSVD_12",                      BIT(12),	0},
> +	{"DBG_PSF_PGD0_PG_STS",          BIT(13),	0},
> +	{"DBG_SBR_PGD0_PG_STS",          BIT(14),	0},
> +	{"SBR6_PGD0_PG_STS",             BIT(15),	0},
> +	{"SPC_PGD0_PG_STS",              BIT(16),	1},
> +	{"ACE_PGD0_PG_STS",              BIT(17),	0},
> +	{"ACE_PGD1_PG_STS",              BIT(18),	0},
> +	{"ACE_PGD2_PG_STS",              BIT(19),	0},
> +	{"ACE_PGD3_PG_STS",              BIT(20),	0},
> +	{"ACE_PGD4_PG_STS",              BIT(21),	0},
> +	{"ACE_PGD5_PG_STS",              BIT(22),	0},
> +	{"ACE_PGD6_PG_STS",              BIT(23),	0},
> +	{"ACE_PGD7_PG_STS",              BIT(24),	0},
> +	{"ACE_PGD8_PG_STS",              BIT(25),	0},
> +	{"ACE_PGD9_PG_STS",              BIT(26),	0},
> +	{"ACE_PGD10_PG_STS",             BIT(27),	0},
> +	{"U3FPW3_PGD0_PG_STS",           BIT(28),	0},
> +	{"SBR7_PGD0_PG_STS",             BIT(29),	0},
> +	{"OSSE_PGD0_PG_STS",             BIT(30),	0},
> +	{"SATA_PGD0_PG_STS",             BIT(31),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_d3_status_0_map[] = {
> +	{"LPSS_D3_STS",                  BIT(3),	1},
> +	{"XDCI_D3_STS",                  BIT(4),	1},
> +	{"XHCI_D3_STS",                  BIT(5),	0},
> +	{"SPA_D3_STS",                   BIT(12),	0},
> +	{"SPB_D3_STS",                   BIT(13),	0},
> +	{"SPC_D3_STS",                   BIT(14),	0},
> +	{"ESPISPI_D3_STS",               BIT(18),	0},
> +	{"SATA_D3_STS",                  BIT(20),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_d3_status_1_map[] = {
> +	{"OSSE_D3_STS",                  BIT(6),	0},
> +	{"GBE_D3_STS",                   BIT(19),	0},
> +	{"ITSS_D3_STS",                  BIT(23),	0},
> +	{"P2S_D3_STS",                   BIT(24),	0},
> +	{"CNVI_D3_STS",                  BIT(27),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_d3_status_2_map[] = {
> +	{"CSMERTC_D3_STS",               BIT(1),	0},
> +	{"CSE_D3_STS",                   BIT(4),	0},
> +	{"KVMCC_D3_STS",                 BIT(5),	0},
> +	{"USBR0_D3_STS",                 BIT(6),	0},
> +	{"ISH_D3_STS",                   BIT(7),	0},
> +	{"SMT1_D3_STS",                  BIT(8),	0},
> +	{"SMT2_D3_STS",                  BIT(9),	0},
> +	{"SMT3_D3_STS",                  BIT(10),	0},
> +	{"SMT4_D3_STS",                  BIT(11),	0},
> +	{"SMT5_D3_STS",                  BIT(12),	0},
> +	{"SMT6_D3_STS",                  BIT(13),	0},
> +	{"CLINK_D3_STS",                 BIT(14),	0},
> +	{"PTIO_D3_STS",                  BIT(16),	0},
> +	{"PMT_D3_STS",                   BIT(17),	0},
> +	{"SMS1_D3_STS",                  BIT(18),	0},
> +	{"SMS2_D3_STS",                  BIT(19),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_d3_status_3_map[] = {
> +	{"THC0_D3_STS",                  BIT(14),	0},
> +	{"THC1_D3_STS",                  BIT(15),	0},
> +	{"ACE_D3_STS",                   BIT(23),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_vnn_req_status_1_map[] = {
> +	{"NPK_VNN_REQ_STS",              BIT(4),	0},
> +	{"OSSE_VNN_REQ_STS",             BIT(6),	0},
> +	{"DFXAGG_VNN_REQ_STS",           BIT(8),	0},
> +	{"EXI_VNN_REQ_STS",              BIT(9),	0},
> +	{"GBE_VNN_REQ_STS",              BIT(19),	0},
> +	{"SMB_VNN_REQ_STS",              BIT(25),	0},
> +	{"LPC_VNN_REQ_STS",              BIT(26),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_vnn_req_status_2_map[] = {
> +	{"CSMERTC_VNN_REQ_STS",          BIT(1),	0},
> +	{"CSE_VNN_REQ_STS",              BIT(4),	0},
> +	{"ISH_VNN_REQ_STS",              BIT(7),	0},
> +	{"SMT1_VNN_REQ_STS",             BIT(8),	0},
> +	{"SMT4_VNN_REQ_STS",             BIT(11),	0},
> +	{"CLINK_VNN_REQ_STS",            BIT(14),	0},
> +	{"SMS1_VNN_REQ_STS",             BIT(18),	0},
> +	{"SMS2_VNN_REQ_STS",             BIT(19),	0},
> +	{"GPIOCOM4_VNN_REQ_STS",         BIT(20),	0},
> +	{"GPIOCOM3_VNN_REQ_STS",         BIT(21),	0},
> +	{"GPIOCOM2_VNN_REQ_STS",         BIT(22),	0},
> +	{"GPIOCOM1_VNN_REQ_STS",         BIT(23),	0},
> +	{"GPIOCOM0_VNN_REQ_STS",         BIT(24),	0},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_vnn_misc_status_map[] = {
> +	{"CPU_C10_REQ_STS",              BIT(0),	0},
> +	{"TS_OFF_REQ_STS",               BIT(1),	0},
> +	{"PNDE_MET_REQ_STS",             BIT(2),	1},
> +	{"PG5_PMA0_GVNN_REQ_STS",        BIT(3),	1},
> +	{"FW_THROTTLE_ALLOWED_REQ_STS",  BIT(4),	0},
> +	{"DMI_IN_L1_REQ_STS",            BIT(6),	0},
> +	{"ISH_VNNAON_REQ_STS",           BIT(7),	0},
> +	{"PLT_GREATER_REQ_STS",          BIT(11),	1},
> +	{"ALL_SBR_IDLE_REQ_STS",         BIT(12),	0},
> +	{"PMC_IDLE_FB_OCP_REQ_STS",      BIT(13),	0},
> +	{"PM_SYNC_STATES_REQ_STS",       BIT(14),	0},
> +	{"EA_REQ_STS",                   BIT(15),	0},
> +	{"DMI_CLKREQ_B_REQ_STS",         BIT(16),	0},
> +	{"BRK_EV_EN_REQ_STS",            BIT(17),	0},
> +	{"AUTO_DEMO_EN_REQ_STS",         BIT(18),	0},
> +	{"ITSS_CLK_SRC_REQ_STS",         BIT(19),	1},
> +	{"ARC_IDLE_REQ_STS",             BIT(21),	0},
> +	{"PG5_PMA1_GVNN_REQ_STS",        BIT(22),	1},
> +	{"FIA_DEEP_PM_REQ_STS",          BIT(23),	0},
> +	{"XDCI_ATTACHED_REQ_STS",        BIT(24),	0},
> +	{"ARC_INTERRUPT_WAKE_REQ_STS",   BIT(25),	0},
> +	{"PRE_WAKE0_REQ_STS",            BIT(27),	1},
> +	{"PRE_WAKE1_REQ_STS",            BIT(28),	1},
> +	{"PRE_WAKE2_EN_REQ_STS",         BIT(29),	0},
> +	{"PG5_PMA2_GVNN_REQ_STS",        BIT(30),	1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map nvl_pchs_rsc_status_map[] = {
> +	{"Memory",		0,		1},
> +	{"Memory_NS",		0,		1},
> +	{"PSF1",		0,		1},
> +	{"PSF2",		0,		1},
> +	{"PSF3",		0,		1},
> +	{"REF_PLL",		0,		1},
> +	{"SB",			0,		1},
> +	{}
> +};
> +
> +static const struct pmc_bit_map *nvl_pchs_lpm_maps[] = {
> +	nvl_pchs_clocksource_status_map,
> +	nvl_pchs_power_gating_status_0_map,
> +	nvl_pchs_power_gating_status_1_map,
> +	nvl_pchs_power_gating_status_2_map,
> +	nvl_pchs_d3_status_0_map,
> +	nvl_pchs_d3_status_1_map,
> +	nvl_pchs_d3_status_2_map,
> +	nvl_pchs_d3_status_3_map,
> +	nvl_pcds_vnn_req_status_0_map,
> +	nvl_pchs_vnn_req_status_1_map,
> +	nvl_pchs_vnn_req_status_2_map,
> +	nvl_pcdh_vnn_req_status_3_map,
> +	nvl_pchs_vnn_misc_status_map,
> +	ptl_pcdp_signal_status_map,
> +	NULL
> +};
> +
> +static const struct pmc_bit_map *nvl_pchs_blk_maps[] = {
> +	nvl_pchs_power_gating_status_0_map,
> +	nvl_pchs_power_gating_status_1_map,
> +	nvl_pchs_power_gating_status_2_map,
> +	nvl_pchs_rsc_status_map,
> +	nvl_pchs_d3_status_0_map,
> +	nvl_pchs_clocksource_status_map,
> +	nvl_pchs_vnn_misc_status_map,
> +	NULL
> +};
> +
> +static const struct pmc_reg_map nvl_pcdh_reg_map = {
> +	.pfear_sts = ext_nvl_pcdh_pfear_map,
> +	.slp_s0_offset = CNP_PMC_SLP_S0_RES_COUNTER_OFFSET,
> +	.slp_s0_res_counter_step = TGL_PMC_SLP_S0_RES_COUNTER_STEP,
> +	.ltr_show_sts = ptl_pcdp_ltr_show_map,
> +	.msr_sts = msr_map,
> +	.ltr_ignore_offset = CNP_PMC_LTR_IGNORE_OFFSET,
> +	.regmap_length = NVL_PCDH_PMC_MMIO_REG_LEN,
> +	.ppfear0_offset = CNP_PMC_HOST_PPFEAR0A,
> +	.ppfear_buckets = NVL_PCDH_PPFEAR_NUM_ENTRIES,
> +	.pm_cfg_offset = CNP_PMC_PM_CFG_OFFSET,
> +	.pm_read_disable_bit = CNP_PMC_READ_DISABLE_BIT,
> +	.lpm_num_maps = NVL_LPM_NUM_MAPS,
> +	.ltr_ignore_max = LNL_NUM_IP_IGN_ALLOWED,
> +	.lpm_res_counter_step_x2 = TGL_PMC_LPM_RES_COUNTER_STEP_X2,
> +	.etr3_offset = ETR3_OFFSET,
> +	.lpm_sts_latch_en_offset = MTL_LPM_STATUS_LATCH_EN_OFFSET,
> +	.lpm_priority_offset = NVL_LPM_PRI_OFFSET,
> +	.lpm_en_offset = NVL_LPM_EN_OFFSET,
> +	.lpm_residency_offset = NVL_LPM_RESIDENCY_OFFSET,
> +	.lpm_sts = nvl_pcdh_lpm_maps,
> +	.lpm_status_offset = MTL_LPM_STATUS_OFFSET,
> +	.lpm_live_status_offset = NVL_LPM_LIVE_STATUS_OFFSET,
> +	.s0ix_blocker_maps = nvl_pcdh_blk_maps,
> +	.s0ix_blocker_offset = LNL_S0IX_BLOCKER_OFFSET,
> +	.num_s0ix_blocker = NVL_PCDH_NUM_S0IX_BLOCKER,
> +	.blocker_req_offset = NVL_PCDH_BLK_REQ_OFFSET,
> +	.lpm_req_guid = PCDH_LPM_REQ_GUID,
> +};
> +
> +static const struct pmc_reg_map nvl_pcds_reg_map = {
> +	.pfear_sts = ext_nvl_pcds_pfear_map,
> +	.slp_s0_offset = CNP_PMC_SLP_S0_RES_COUNTER_OFFSET,
> +	.slp_s0_res_counter_step = TGL_PMC_SLP_S0_RES_COUNTER_STEP,
> +	.ltr_show_sts = nvl_pcds_ltr_show_map,
> +	.msr_sts = msr_map,
> +	.ltr_ignore_offset = CNP_PMC_LTR_IGNORE_OFFSET,
> +	.regmap_length = NVL_PCDS_PMC_MMIO_REG_LEN,
> +	.ppfear0_offset = CNP_PMC_HOST_PPFEAR0A,
> +	.ppfear_buckets = LNL_PPFEAR_NUM_ENTRIES,
> +	.pm_cfg_offset = CNP_PMC_PM_CFG_OFFSET,
> +	.pm_read_disable_bit = CNP_PMC_READ_DISABLE_BIT,
> +	.lpm_num_maps = PTL_LPM_NUM_MAPS,
> +	.ltr_ignore_max = LNL_NUM_IP_IGN_ALLOWED,
> +	.lpm_res_counter_step_x2 = TGL_PMC_LPM_RES_COUNTER_STEP_X2,
> +	.etr3_offset = ETR3_OFFSET,
> +	.lpm_sts_latch_en_offset = MTL_LPM_STATUS_LATCH_EN_OFFSET,
> +	.lpm_priority_offset = MTL_LPM_PRI_OFFSET,
> +	.lpm_en_offset = MTL_LPM_EN_OFFSET,
> +	.lpm_residency_offset = MTL_LPM_RESIDENCY_OFFSET,
> +	.lpm_sts = nvl_pcds_lpm_maps,
> +	.lpm_status_offset = MTL_LPM_STATUS_OFFSET,
> +	.lpm_live_status_offset = MTL_LPM_LIVE_STATUS_OFFSET,
> +	.s0ix_blocker_maps = nvl_pcds_blk_maps,
> +	.s0ix_blocker_offset = LNL_S0IX_BLOCKER_OFFSET,
> +	.num_s0ix_blocker = NVL_PCDS_NUM_S0IX_BLOCKER,
> +	.lpm_req_guid = PCDS_LPM_REQ_GUID,
> +	.blocker_req_offset = NVL_PCDS_BLK_REQ_OFFSET,
> +};
> +
> +static const struct pmc_reg_map nvl_pchs_reg_map = {
> +	.pfear_sts = ext_nvl_pchs_pfear_map,
> +	.slp_s0_offset = CNP_PMC_SLP_S0_RES_COUNTER_OFFSET,
> +	.slp_s0_res_counter_step = TGL_PMC_SLP_S0_RES_COUNTER_STEP,
> +	.ltr_show_sts = ptl_pcdp_ltr_show_map,
> +	.msr_sts = msr_map,
> +	.ltr_ignore_offset = CNP_PMC_LTR_IGNORE_OFFSET,
> +	.regmap_length = NVL_PCHS_PMC_MMIO_REG_LEN,
> +	.ppfear0_offset = CNP_PMC_HOST_PPFEAR0A,
> +	.ppfear_buckets = LNL_PPFEAR_NUM_ENTRIES,
> +	.pm_cfg_offset = CNP_PMC_PM_CFG_OFFSET,
> +	.pm_read_disable_bit = CNP_PMC_READ_DISABLE_BIT,
> +	.lpm_num_maps = PTL_LPM_NUM_MAPS,
> +	.ltr_ignore_max = LNL_NUM_IP_IGN_ALLOWED,
> +	.lpm_res_counter_step_x2 = TGL_PMC_LPM_RES_COUNTER_STEP_X2,
> +	.etr3_offset = ETR3_OFFSET,
> +	.lpm_sts_latch_en_offset = MTL_LPM_STATUS_LATCH_EN_OFFSET,
> +	.lpm_priority_offset = MTL_LPM_PRI_OFFSET,
> +	.lpm_en_offset = MTL_LPM_EN_OFFSET,
> +	.lpm_residency_offset = MTL_LPM_RESIDENCY_OFFSET,
> +	.lpm_sts = nvl_pchs_lpm_maps,
> +	.lpm_status_offset = MTL_LPM_STATUS_OFFSET,
> +	.lpm_live_status_offset = MTL_LPM_LIVE_STATUS_OFFSET,
> +	.s0ix_blocker_maps = nvl_pchs_blk_maps,
> +	.s0ix_blocker_offset = LNL_S0IX_BLOCKER_OFFSET,
> +	.num_s0ix_blocker = NVL_PCHS_NUM_S0IX_BLOCKER,
> +	.blocker_req_offset = NVL_PCHS_BLK_REQ_OFFSET,
> +	.lpm_req_guid = PCHS_LPM_REQ_GUID,
> +};
> +
> +static struct pmc_info nvl_pmc_info_list[] = {
> +	{
> +		.devid	= PMC_DEVID_NVL_PCDH,
> +		.map	= &nvl_pcdh_reg_map,
> +	},
> +	{
> +		.devid  = PMC_DEVID_NVL_PCDS,
> +		.map    = &nvl_pcds_reg_map,
> +	},
> +	{
> +		.devid  = PMC_DEVID_NVL_PCHS,
> +		.map    = &nvl_pchs_reg_map,
> +	},
> +	{}
> +};
> +
> +const char *nvl_ltr_block_counter_arr[] = {
> +	"PKGC_PREVENT_LTR_IADOMAIN",
> +	"PKGC_PREVENT_LTR_GDIE",
> +	"PKGC_PREVENT_LTR_PCH",
> +	"PKGC_PREVENT_LTR_DISPLAY",
> +	"PKGC_PREVENT_LTR_IPU",
> +	NULL
> +};
> +
> +const char *nvl_pkgc_blocker_residency[] = {
> +	"PKGC_BLOCK_RESIDENCY_INVALID",
> +	"PKGC_BLOCK_RESIDENCY_MISC",
> +	"PKGC_BLOCK_RESIDENCY_CDIE_MISC",
> +	"PKGC_BLOCK_RESIDENCY_MEDIA_MISC",
> +	"PKGC_BLOCK_RESIDENCY_GT_MISC",
> +	"PKGC_BLOCK_RESIDENCY_HUBATOM_MISC",
> +	"PKGC_BLOCK_RESIDENCY_IPU_BUSY",
> +	"PKGC_BLOCK_RESIDENCY_IPU_LTR",
> +	"PKGC_BLOCK_RESIDENCY_IPU_TIMER",
> +	"PKGC_BLOCK_RESIDENCY_DISP_BUSY",
> +	"PKGC_BLOCK_RESIDENCY_DISP_LTR",
> +	"PKGC_BLOCK_RESIDENCY_DISP_TIMER",
> +	"PKGC_BLOCK_RESIDENCY_VPU_BUSY",
> +	"PKGC_BLOCK_RESIDENCY_VPU_TIMER",
> +	"PKGC_BLOCK_RESIDENCY_PMC_BUSY",
> +	"PKGC_BLOCK_RESIDENCY_PMC_LTR",
> +	"PKGC_BLOCK_RESIDENCY_PMC_TIMER",
> +	"PKGC_BLOCK_RESIDENCY_HUBATOM_ARAT",
> +	"PKGC_BLOCK_RESIDENCY_CDIE0_ARAT",
> +	"PKGC_BLOCK_RESIDENCY_CDIE1_ARAT",
> +	"PKGC_BLOCK_RESIDENCY_GT_ARAT",
> +	"PKGC_BLOCK_RESIDENCY_MEDIA_ARAT",
> +	"PKGC_BLOCK_RESIDENCY_DEMOTION",
> +	"PKGC_BLOCK_RESIDENCY_THERMALS",
> +	"PKGC_BLOCK_RESIDENCY_SNCU",
> +	"PKGC_BLOCK_RESIDENCY_SVTU",
> +	"PKGC_BLOCK_RESIDENCY_IAA",
> +	"PKGC_BLOCK_RESIDENCY_IOC",
> +	NULL,
> +};
> +
> +static u8 nvl_pmc_list[] = {PMC_IDX_MAIN, PMC_IDX_PCH};
> +static u8 nvl_h_pmc_list[] = {PMC_IDX_MAIN, PMC_IDX_PCH};

These are identical, so why are both needed?

These arrays could generally be const, no? (Applies to the other patch as 
well.)
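
As a rough illustration of the two remarks above (one shared table instead
of two identical ones, and const-qualifying it), here is a minimal
userspace sketch. It assumes the pmc_list member of struct pmc_dev_info can
be const-qualified, which the patch does not confirm. The struct, the enum
values, the *_sketch names and sketch_has_pmc() are hypothetical stand-ins,
not the driver's definitions, and uint8_t replaces the kernel's u8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in values; the driver defines its own PMC_IDX_* constants. */
enum { PMC_IDX_MAIN, PMC_IDX_PCH };

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, trimmed-down stand-in for struct pmc_dev_info. */
struct pmc_dev_info_sketch {
	const uint8_t *pmc_list;	/* const: the table is never written */
	size_t num_pmcs;
};

/* One read-only table shared by both variants. */
static const uint8_t nvl_pmc_list[] = { PMC_IDX_MAIN, PMC_IDX_PCH };

static const struct pmc_dev_info_sketch nvl_s_sketch = {
	.pmc_list = nvl_pmc_list,
	.num_pmcs = ARRAY_SIZE(nvl_pmc_list),
};

static const struct pmc_dev_info_sketch nvl_h_sketch = {
	.pmc_list = nvl_pmc_list,
	.num_pmcs = ARRAY_SIZE(nvl_pmc_list),
};

/* Returns 1 if idx appears in the variant's PMC list, 0 otherwise. */
static int sketch_has_pmc(const struct pmc_dev_info_sketch *info, uint8_t idx)
{
	size_t i;

	assert(info && info->pmc_list);
	for (i = 0; i < info->num_pmcs; i++)
		if (info->pmc_list[i] == idx)
			return 1;
	return 0;
}
```

Both variants point at the same array, so a future change to the list is
made in one place only. If the two lists are expected to diverge later,
keeping them separate may still be the better choice.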

> +
> +#define NVL_NPU_PCI_DEV                0xd71d
> +
> +/*
> + * Set power state of select devices that do not have drivers to D3
> + * so that they do not block Package C entry.
> + */
> +static void nvl_d3_fixup(void)
> +{
> +	pmc_core_set_device_d3(NVL_NPU_PCI_DEV);
> +}
> +
> +static int nvl_resume(struct pmc_dev *pmcdev)
> +{
> +	nvl_d3_fixup();
> +	return cnl_resume(pmcdev);
> +}
> +
> +static int nvl_core_init(struct pmc_dev *pmcdev, struct pmc_dev_info *pmc_dev_info)
> +{
> +	nvl_d3_fixup();
> +	return generic_core_init(pmcdev, pmc_dev_info);
> +}
> +
> +static u32 nvl_pmt_dmu_guids[] = {NVL_PMT_DMU_GUID, 0x0};
> +struct pmc_dev_info nvl_s_pmc_dev = {
> +	.num_pmcs = ARRAY_SIZE(nvl_pmc_list),
> +	.pmc_list = nvl_pmc_list,
> +	.regmap_list = nvl_pmc_info_list,
> +	.map = &nvl_pcds_reg_map,
> +	.sub_req_show = &pmc_core_substate_blk_req_fops,
> +	.suspend = cnl_suspend,
> +	.resume = nvl_resume,
> +	.init = nvl_core_init,
> +	.sub_req = pmc_core_pmt_get_blk_sub_req,
> +	.dmu_guids = nvl_pmt_dmu_guids,
> +	.pc_guid = NVL_PMT_PC_GUID,
> +	.pkgc_ltr_blocker_offset = NVL_LTR_BLK_OFFSET,
> +	.pkgc_ltr_blocker_counters = nvl_ltr_block_counter_arr,
> +	.pkgc_blocker_offset = NVL_PKGC_BLK_OFFSET,
> +	.pkgc_blocker_counters = nvl_pkgc_blocker_residency,
> +	.ssram_hidden = false,
> +	.die_c6_offset = NVL_PMT_DMU_DIE_C6_OFFSET,
> +};
> +
> +struct pmc_dev_info nvl_h_pmc_dev = {
> +	.num_pmcs = ARRAY_SIZE(nvl_h_pmc_list),
> +	.pmc_list = nvl_h_pmc_list,
> +	.regmap_list = nvl_pmc_info_list,
> +	.map = &nvl_pcdh_reg_map,
> +	.sub_req_show = &pmc_core_substate_blk_req_fops,
> +	.suspend = cnl_suspend,
> +	.resume = nvl_resume,
> +	.init = nvl_core_init,
> +	.sub_req = pmc_core_pmt_get_blk_sub_req,
> +	.dmu_guids = nvl_pmt_dmu_guids,
> +	.pc_guid = NVL_PMT_PC_GUID,
> +	.pkgc_ltr_blocker_offset = NVL_LTR_BLK_OFFSET,
> +	.pkgc_ltr_blocker_counters = nvl_ltr_block_counter_arr,
> +	.pkgc_blocker_offset = NVL_PKGC_BLK_OFFSET,
> +	.pkgc_blocker_counters = nvl_pkgc_blocker_residency,
> +	.ssram_hidden = false,
> +	.die_c6_offset = NVL_PMT_DMU_DIE_C6_OFFSET,
> +};
> diff --git a/drivers/platform/x86/intel/pmc/ptl.c b/drivers/platform/x86/intel/pmc/ptl.c
> index 7aa39db256770..3e1cf6905e111 100644
> --- a/drivers/platform/x86/intel/pmc/ptl.c
> +++ b/drivers/platform/x86/intel/pmc/ptl.c
> @@ -137,7 +137,7 @@ static const struct pmc_bit_map *ext_ptl_pcdp_pfear_map[] = {
>  	NULL
>  };
>  
> -static const struct pmc_bit_map ptl_pcdp_ltr_show_map[] = {
> +const struct pmc_bit_map ptl_pcdp_ltr_show_map[] = {
>  	{"SOUTHPORT_A",		CNP_PMC_LTR_SPA},
>  	{"SOUTHPORT_B",		CNP_PMC_LTR_SPB},
>  	{"SATA",		CNP_PMC_LTR_SATA},
> 

-- 
 i.



* [GIT PULL] Thermal control updates for v7.1-rc1
From: Rafael J. Wysocki @ 2026-04-10 14:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux PM, Linux Kernel Mailing List, Daniel Lezcano

Hi Linus,

This goes early because I will be traveling next week.

Please pull from the tag

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 thermal-7.1-rc1

with top-most commit cd1a3b2ff0553e987de71ff0aa675e418de22898

 Merge tag 'thermal-v7.1-rc1' of
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux

on top of commit 9e07e3b81807edd356e1f794cffa00a428eff443

 thermal: core: Fix thermal zone device registration error path

to receive thermal control updates for 7.1-rc1.

These include thermal core fixes and simplifications, driver fixes and
new hardware support (SDM670, Eliza SoC), new driver features (hwmon
support in imx91, DDR data rate on Nova Lake in int340x), and a handful
of cleanups:

 - Fix thermal core issues related to thermal zone removal and
   registration errors that may lead to a use-after-free or a memory
   leak in some cases (Rafael Wysocki)

 - Drop a redundant check from thermal_zone_device_update(), adjust
   thermal workqueue allocation flags, and switch over thermal_class
   allocation to static (Rafael Wysocki)

 - Relocate the suspend and resume of thermal zones closer to the
   suspend and resume of devices, respectively (Rafael Wysocki)

 - Remove a pointless variable used in the thermal core when
   registering a cooling device (Daniel Lezcano)

 - Replace sprintf() in thermal_bind_cdev_to_trip() and use
   str_enabled_disabled() helper in mode_show() (Thorsten Blum)

 - Replace cpumask_weight() in intel_hfi_offline() with cpumask_empty()
   which is generally more efficient (Yury Norov)

 - Add support for reading DDR data rate from PCI config space on Nova
   Lake platforms to the int340x thermal driver (Srinivas Pandruvada)

 - Add the OF node address to the output message to make sensor names
   more distinguishable (Alexander Stein)

 - Add hwmon support for the i.MX91 thermal sensor (Alexander Stein)

 - Correctly clamp the results of value/temperature conversion in the
   Spreadtrum driver (Thorsten Blum)

 - Add SDM670 compatible DT bindings for the Tsens and the LMH thermal
   drivers (Richard Acayan)

 - Add SM8750 compatible DT bindings for the Tsens thermal driver (Manaf
   Meethalavalappu Pallikunhi)

 - Add Eliza SoC compatible DT bindings for the Tsens driver (Krzysztof
   Kozlowski)

 - Fix inverted condition check on error in the Spear thermal control
   driver (Gopi Krishna Menon)

 - Convert DT bindings documentation into DT schema (Gopi Krishna Menon)

 - Use max() macro to increase readability in the Broadcom STB thermal
   sensor (Thorsten Blum)

 - Remove a stale @trim_offset kernel-doc entry (John Madieu)
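
As a rough illustration of the cpumask_weight() to cpumask_empty() item
above, here is a minimal user-space sketch. It is not the kernel code:
struct demo_mask and both demo_mask_*() helpers are hypothetical stand-ins
that only show why an emptiness test can stop early while a weight
computation has to visit every word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CPU mask: a fixed array of 64-bit words. */
#define DEMO_MASK_WORDS 4

struct demo_mask {
	unsigned long long bits[DEMO_MASK_WORDS];
};

/* Counts every set bit, so it always walks the whole mask. */
static unsigned int demo_mask_weight(const struct demo_mask *m)
{
	unsigned int count = 0;
	size_t i;

	assert(m);
	for (i = 0; i < DEMO_MASK_WORDS; i++) {
		unsigned long long w = m->bits[i];

		while (w) {
			w &= w - 1;	/* clear the lowest set bit */
			count++;
		}
	}

	return count;
}

/* Only answers "is any bit set?", so it stops at the first non-zero word. */
static bool demo_mask_empty(const struct demo_mask *m)
{
	size_t i;

	assert(m);
	for (i = 0; i < DEMO_MASK_WORDS; i++) {
		if (m->bits[i])
			return false;
	}

	return true;
}
```

With these helpers, a check written as demo_mask_weight(&m) == 0 gives the
same answer as demo_mask_empty(&m), but the latter does less work whenever
the mask is not empty.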

Thanks!


---------------

Alexander Stein (2):
      thermal/of: Add OF node address to output message
      thermal/drivers/imx91: Add hwmon support

Anas Iqbal (1):
      thermal: devfreq_cooling: avoid unnecessary kfree of freq_table

Daniel Lezcano (1):
      thermal/core: Remove pointless variable when registering a cooling device

Gopi Krishna Menon (2):
      thermal/drivers/spear: Fix error condition for reading st,thermal-flags
      dt-bindings: thermal: st,thermal-spear1340: convert to dtschema

John Madieu (1):
      thermal: renesas: rzg3e: Remove stale @trim_offset kernel-doc entry

Krzysztof Kozlowski (1):
      dt-bindings: thermal: qcom-tsens: Add Eliza SoC TSENS

Manaf Meethalavalappu Pallikunhi (1):
      dt-bindings: thermal: qcom-tsens: Document the SM8750 Temperature Sensor

Rafael J. Wysocki (6):
      thermal: core: Fix thermal zone governor cleanup issues
      thermal: core: Free thermal zone ID later during removal
      thermal: core: Drop redundant check from thermal_zone_device_update()
      thermal: core: Adjust thermal_wq allocation flags
      thermal: core: Allocate thermal_class statically
      thermal: core: Suspend thermal zones later and resume them earlier

Richard Acayan (2):
      dt-bindings: thermal: tsens: add SDM670 compatible
      dt-bindings: thermal: lmh: Add SDM670 compatible

Srinivas Pandruvada (1):
      thermal: intel: int340x: Read DDR data rate for Nova Lake

Thorsten Blum (6):
      thermal: core: Replace sprintf() in thermal_bind_cdev_to_trip()
      thermal/drivers/sprd: Fix temperature clamping in sprd_thm_temp_to_rawdata
      thermal/drivers/sprd: Fix raw temperature clamping in sprd_thm_rawdata_to_temp
      thermal/drivers/sprd: Use min instead of clamp in sprd_thm_temp_to_rawdata
      thermal: sysfs: Use str_enabled_disabled() helper in mode_show()
      thermal/drivers/brcmstb_thermal: Use max to simplify brcmstb_get_temp

Yury Norov (1):
      thermal: intel: hfi: use cpumask_empty() in intel_hfi_offline()

---------------

 .../devicetree/bindings/thermal/qcom-lmh.yaml      |   3 +
 .../devicetree/bindings/thermal/qcom-tsens.yaml    |   3 +
 .../devicetree/bindings/thermal/spear-thermal.txt  |  14 ---
 .../bindings/thermal/st,thermal-spear1340.yaml     |  36 ++++++
 drivers/base/power/main.c                          |   5 +
 drivers/thermal/broadcom/brcmstb_thermal.c         |   8 +-
 drivers/thermal/devfreq_cooling.c                  |   3 +-
 drivers/thermal/imx91_thermal.c                    |   4 +
 .../intel/int340x_thermal/processor_thermal_rfim.c |  25 ++++-
 drivers/thermal/intel/intel_hfi.c                  |   2 +-
 drivers/thermal/renesas/rzg3e_thermal.c            |   1 -
 drivers/thermal/spear_thermal.c                    |   2 +-
 drivers/thermal/sprd_thermal.c                     |   6 +-
 drivers/thermal/thermal_core.c                     | 121 ++++++++-------------
 drivers/thermal/thermal_of.c                       |  20 ++--
 drivers/thermal/thermal_sysfs.c                    |   7 +-
 include/linux/thermal.h                            |   6 +
 17 files changed, 143 insertions(+), 123 deletions(-)


* Re: [PATCH v2 2/7] platform/x86/intel/pmc: Enable PkgC LTR blocking counter
From: Ilpo Järvinen @ 2026-04-10 14:28 UTC (permalink / raw)
  To: Xi Pardee
  Cc: irenic.rajneesh, david.e.box, platform-driver-x86, LKML, linux-pm
In-Reply-To: <20260408222144.3288928-3-xi.pardee@linux.intel.com>

On Wed, 8 Apr 2026, Xi Pardee wrote:

> Enable the Package C-state LTR blocking counter in the PMT telemetry
> region. This counter records how many times any Package C-state entry
> is blocked for the specified reasons.
> 
> Signed-off-by: Xi Pardee <xi.pardee@linux.intel.com>
> ---
>  drivers/platform/x86/intel/pmc/core.c | 74 +++++++++++++++++++++++----
>  drivers/platform/x86/intel/pmc/core.h | 15 +++++-
>  2 files changed, 77 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/platform/x86/intel/pmc/core.c b/drivers/platform/x86/intel/pmc/core.c
> index c8a92d6235203..5c519942ec58c 100644
> --- a/drivers/platform/x86/intel/pmc/core.c
> +++ b/drivers/platform/x86/intel/pmc/core.c
> @@ -1071,6 +1071,29 @@ static int pmc_core_die_c6_us_show(struct seq_file *s, void *unused)
>  }
>  DEFINE_SHOW_ATTRIBUTE(pmc_core_die_c6_us);
>  
> +static int pmc_core_pkgc_ltr_blocker_show(struct seq_file *s, void *unused)
> +{
> +	struct pmc_dev *pmcdev = s->private;
> +	const char **pkgc_ltr_blocker_counters;
> +	unsigned int i;
> +	u32 counter;
> +	int ret;
> +
> +	pkgc_ltr_blocker_counters = pmcdev->pkgc_ltr_blocker_counters;
> +	for (i = 0; pkgc_ltr_blocker_counters[i]; i++) {
> +		ret = pmt_telem_read32(pmcdev->pc_ep,
> +				       pmcdev->pkgc_ltr_blocker_offset + i,
> +				       &counter, 1);
> +
> +		if (ret)

Don't leave empty lines between a call and its error handling.

> +			return ret;

Maybe put the empty line here instead?

> +		seq_printf(s, "%-30s %-30u\n", pkgc_ltr_blocker_counters[i], counter);
> +	}
> +
> +	return 0;
> +}
> +DEFINE_SHOW_ATTRIBUTE(pmc_core_pkgc_ltr_blocker);
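
As an illustration of the suggested layout, here is a minimal user-space
sketch. It is not the driver code: demo_read32() is a hypothetical stand-in
for the telemetry read, and demo_counters_show() only mirrors the shape of
the loop over a NULL-terminated array of counter names.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a telemetry read: copies one value from a
 * table into *val and returns 0, or returns -1 past the end of the table.
 */
static int demo_read32(const unsigned int *table, unsigned int len,
		       unsigned int idx, unsigned int *val)
{
	assert(table && val);
	if (idx >= len)
		return -1;

	*val = table[idx];
	return 0;
}

/* Prints one line per name in a NULL-terminated array; returns 0 or an error. */
static int demo_counters_show(FILE *out, const char *const *names,
			      const unsigned int *table, unsigned int len)
{
	unsigned int counter;
	unsigned int i;
	int ret;

	for (i = 0; names[i]; i++) {
		ret = demo_read32(table, len, i, &counter);
		if (ret)	/* the error check directly follows the call */
			return ret;

		/* the blank line sits after the error path instead */
		fprintf(out, "%-30s %-30u\n", names[i], counter);
	}

	return 0;
}
```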

-- 
 i.


* Re: [PATCH v2 3/7] platform/x86/intel/pmc: Enable Pkgc blocking residency counter
From: Ilpo Järvinen @ 2026-04-10 14:27 UTC (permalink / raw)
  To: Xi Pardee
  Cc: irenic.rajneesh, david.e.box, platform-driver-x86, LKML, linux-pm
In-Reply-To: <20260408222144.3288928-4-xi.pardee@linux.intel.com>


On Wed, 8 Apr 2026, Xi Pardee wrote:

> Enable the Package C-state blocking counter in the PMT telemetry
> region. This counter reports the number of 10 µs intervals during
> which a Package C-state 10.2/3 entry was blocked for the specified
> reasons.
> 
> Create a common helper for pmc_core_pkgc_ltr_blocker_show() and
> pmc_core_pkgc_blocker_residency_show() as these two functions
> share similar logic.

Please don't do back and forth changes like this within a series. You 
should add it in the right form from the beginning.

-- 
 i.

> Signed-off-by: Xi Pardee <xi.pardee@linux.intel.com>
> ---
>  drivers/platform/x86/intel/pmc/core.c | 40 ++++++++++++++++++++-------
>  drivers/platform/x86/intel/pmc/core.h |  8 ++++++
>  2 files changed, 38 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/platform/x86/intel/pmc/core.c b/drivers/platform/x86/intel/pmc/core.c
> index 5c519942ec58c..94ae098a155a6 100644
> --- a/drivers/platform/x86/intel/pmc/core.c
> +++ b/drivers/platform/x86/intel/pmc/core.c
> @@ -1071,29 +1071,44 @@ static int pmc_core_die_c6_us_show(struct seq_file *s, void *unused)
>  }
>  DEFINE_SHOW_ATTRIBUTE(pmc_core_die_c6_us);
>  
> -static int pmc_core_pkgc_ltr_blocker_show(struct seq_file *s, void *unused)
> +static int pmc_core_pkgc_counters_show(struct seq_file *s,
> +				       struct telem_endpoint *ep,
> +				       u32 offset, const char **counters)
>  {
> -	struct pmc_dev *pmcdev = s->private;
> -	const char **pkgc_ltr_blocker_counters;
>  	unsigned int i;
>  	u32 counter;
>  	int ret;
>  
> -	pkgc_ltr_blocker_counters = pmcdev->pkgc_ltr_blocker_counters;
> -	for (i = 0; pkgc_ltr_blocker_counters[i]; i++) {
> -		ret = pmt_telem_read32(pmcdev->pc_ep,
> -				       pmcdev->pkgc_ltr_blocker_offset + i,
> -				       &counter, 1);
> -
> +	for (i = 0; counters[i]; i++) {
> +		ret = pmt_telem_read32(ep, offset + i, &counter, 1);
>  		if (ret)
>  			return ret;
> -		seq_printf(s, "%-30s %-30u\n", pkgc_ltr_blocker_counters[i], counter);
> +		seq_printf(s, "%-30s %-30u\n", counters[i], counter);
>  	}
>  
>  	return 0;
>  }
> +
> +static int pmc_core_pkgc_ltr_blocker_show(struct seq_file *s, void *unused)
> +{
> +	struct pmc_dev *pmcdev = s->private;
> +
> +	return pmc_core_pkgc_counters_show(s, pmcdev->pc_ep,
> +					   pmcdev->pkgc_ltr_blocker_offset,
> +					   pmcdev->pkgc_ltr_blocker_counters);
> +}
>  DEFINE_SHOW_ATTRIBUTE(pmc_core_pkgc_ltr_blocker);
>  
> +static int pmc_core_pkgc_blocker_residency_show(struct seq_file *s, void *unused)
> +{
> +	struct pmc_dev *pmcdev = s->private;
> +
> +	return pmc_core_pkgc_counters_show(s, pmcdev->pc_ep,
> +					   pmcdev->pkgc_blocker_offset,
> +					   pmcdev->pkgc_blocker_counters);
> +}
> +DEFINE_SHOW_ATTRIBUTE(pmc_core_pkgc_blocker_residency);
> +
>  static int pmc_core_lpm_latch_mode_show(struct seq_file *s, void *unused)
>  {
>  	struct pmc_dev *pmcdev = s->private;
> @@ -1381,6 +1396,8 @@ void pmc_core_punit_pmt_init(struct pmc_dev *pmcdev, struct pmc_dev_info *pmc_de
>  		pmcdev->pc_ep = ep;
>  		pmcdev->pkgc_ltr_blocker_counters = pmc_dev_info->pkgc_ltr_blocker_counters;
>  		pmcdev->pkgc_ltr_blocker_offset = pmc_dev_info->pkgc_ltr_blocker_offset;
> +		pmcdev->pkgc_blocker_counters = pmc_dev_info->pkgc_blocker_counters;
> +		pmcdev->pkgc_blocker_offset = pmc_dev_info->pkgc_blocker_offset;
>  	}
>  }
>  
> @@ -1510,6 +1527,9 @@ static void pmc_core_dbgfs_register(struct pmc_dev *pmcdev, struct pmc_dev_info
>  		debugfs_create_file("pkgc_ltr_blocker_show", 0444,
>  				    pmcdev->dbgfs_dir, pmcdev,
>  				    &pmc_core_pkgc_ltr_blocker_fops);
> +		debugfs_create_file("pkgc_blocker_residency_show", 0444,
> +				    pmcdev->dbgfs_dir, pmcdev,
> +				    &pmc_core_pkgc_blocker_residency_fops);
>  	}
>  
>  }
> diff --git a/drivers/platform/x86/intel/pmc/core.h b/drivers/platform/x86/intel/pmc/core.h
> index a20aab73c1409..829b1dee3f636 100644
> --- a/drivers/platform/x86/intel/pmc/core.h
> +++ b/drivers/platform/x86/intel/pmc/core.h
> @@ -455,6 +455,8 @@ struct pmc {
>   *
>   * @pkgc_ltr_blocker_counters: Array of PKGC LTR blocker counters
>   * @pkgc_ltr_blocker_offset: Offset to PKGC LTR blockers in telemetry region
> + * @pkgc_blocker_counters: Array of PKGC blocker counters
> + * @pkgc_blocker_offset: Offset to PKGC blocker in telemetry region
>   *
>   * pmc_dev contains info about power management controller device.
>   */
> @@ -480,6 +482,8 @@ struct pmc_dev {
>  
>  	const char **pkgc_ltr_blocker_counters;
>  	u32 pkgc_ltr_blocker_offset;
> +	const char **pkgc_blocker_counters;
> +	u32 pkgc_blocker_offset;
>  };
>  
>  enum pmc_index {
> @@ -495,6 +499,7 @@ enum pmc_index {
>   * @dmu_guids:		List of Die Management Unit GUID
>   * @pc_guid:		GUID for telemetry region to read PKGC blocker info
>   * @pkgc_ltr_blocker_offset: Offset to PKGC LTR blockers in telemetry region
> + * @pkgc_blocker_offset:Offset to PKGC blocker in telemetry region
>   * @regmap_list:	Pointer to a list of pmc_info structure that could be
>   *			available for the platform. When set, this field implies
>   *			SSRAM support.
> @@ -502,6 +507,7 @@ enum pmc_index {
>   *			specific attributes of the primary PMC
>   * @sub_req_show:	File operations to show substate requirements
>   * @pkgc_ltr_blocker_counters: Array of PKGC LTR blocker counters
> + * @pkgc_blocker_counters: Array of PKGC blocker counters
>   * @suspend:		Function to perform platform specific suspend
>   * @resume:		Function to perform platform specific resume
>   * @init:		Function to perform platform specific init action
> @@ -512,10 +518,12 @@ struct pmc_dev_info {
>  	u32 *dmu_guids;
>  	u32 pc_guid;
>  	u32 pkgc_ltr_blocker_offset;
> +	u32 pkgc_blocker_offset;
>  	struct pmc_info *regmap_list;
>  	const struct pmc_reg_map *map;
>  	const struct file_operations *sub_req_show;
>  	const char **pkgc_ltr_blocker_counters;
> +	const char **pkgc_blocker_counters;
>  	void (*suspend)(struct pmc_dev *pmcdev);
>  	int (*resume)(struct pmc_dev *pmcdev);
>  	int (*init)(struct pmc_dev *pmcdev, struct pmc_dev_info *pmc_dev_info);
> 


* [GIT PULL] ACPI support updates for v7.1-rc1
From: Rafael J. Wysocki @ 2026-04-10 14:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: ACPI Devel Mailing List, Linux PM, Linux Kernel Mailing List

Hi Linus,

This goes early because I will be traveling next week.

Please pull from the tag

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 acpi-7.1-rc1

with top-most commit 8e937866b425248fa375b2138c19c117a87c6be0

 Merge branch 'acpi-apei'

on top of commit 591cd656a1bf5ea94a222af5ef2ee76df029c1d2

 Linux 7.0-rc7

to receive ACPI support updates for 7.1-rc1.

These include an update of the CMOS RTC driver and the related ACPI
and x86 code that, among other things, switches it over to using the
platform device interface for device binding on x86 instead of the PNP
device driver interface (which allows the code in question to be
simplified quite a bit), a major update of the ACPI Time and Alarm
Device (TAD) driver adding an RTC class device interface to it, and
updates of core ACPI drivers that remove some unnecessary and not
really useful code from them.

Apart from that, two drivers are converted to using the platform driver
interface for device binding instead of the ACPI driver one, which is
slated for removal, support for the Performance Limited register is
added to the ACPI CPPC library and there are some janitorial updates
of it and the related cpufreq CPPC driver, the ACPI processor driver is
fixed and cleaned up, and an NVIDIA vendor CPER record handler is added
to the APEI GHES code.

Also, the interface for obtaining a CPU UID from ACPI is consolidated
across architectures and used for fixing a problem with the PCI TPH
Steering Tag on ARM64, there are two updates related to ACPICA, a
minor ACPI OS Services Layer (OSL) update, and a few assorted updates
related to ACPI tables parsing.

Specifics:

 - Update maintainers information regarding ACPICA (Rafael Wysocki)

 - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy() (Kees
   Cook)

 - Trigger an ordered system power off after encountering a fatal error
   operator in AML (Armin Wolf)

 - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao)

 - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure from
   the ACPI PPTT parser (Ben Horgan)

 - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate
   DeSimone)

 - Address multiple assorted issues and clean up the code in the ACPI
   processor idle driver (Huisong Li)

 - Replace strlcat() in the ACPI processor idle driver with a better
   alternative (Andy Shevchenko)

 - Rearrange and clean up acpi_processor_errata_piix4() (Rafael Wysocki)

 - Move reference performance to capabilities and fix an uninitialized
   variable in the ACPI CPPC library (Pengjie Zhang)

 - Add support for the Performance Limited Register to the ACPI CPPC
   library (Sumit Gupta)

 - Add cppc_get_perf() API to read performance controls, extend
   cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC
   library warn on missing mandatory DESIRED_PERF register (Sumit Gupta)

 - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in target
   callbacks to allow it to control performance bounds via standard
   scaling_min_freq and scaling_max_freq sysfs attributes and add sysfs
   documentation for the Performance Limited Register to it (Sumit Gupta)

 - Add ACPI support to the platform device interface in the CMOS RTC
   driver, make the ACPI core device enumeration code create a platform
   device for the CMOS RTC, and drop CMOS RTC PNP device support (Rafael
   Wysocki)

 - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD
   driver and clean up the CMOS RTC ACPI address space handler (Rafael
   Wysocki)

 - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT
   and allow that driver to work without a dedicated IRQ if the ACPI
   alarm is used (Rafael Wysocki)

 - Clean up the ACPI TAD driver in various ways and add an RTC class
   device interface, including both the RTC setting/reading and alarm
   timer support, to it (Rafael Wysocki)

 - Clean up the ACPI AC and ACPI PAD (processor aggregator device)
   drivers (Rafael Wysocki)

 - Rework checking for duplicate video bus devices and consolidate
   pnp.bus_id workarounds handling in the ACPI video bus driver (Rafael
   Wysocki)

 - Update the ACPI core device drivers to stop setting acpi_device_name()
   unnecessarily (Rafael Wysocki)

 - Rearrange code using acpi_device_class() in the ACPI core device
   drivers and update them to stop setting acpi_device_class()
   unnecessarily (Rafael Wysocki)

 - Define ACPI_AC_CLASS in one place (Rafael Wysocki)

 - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver to
   bind to platform devices instead of ACPI devices (Rafael Wysocki)

 - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI
   hisi driver, and add an NVIDIA vendor CPER record handler (Kai-Heng
   Feng)

 - Consolidate the interface for obtaining a CPU UID from ACPI across
   architectures and use it to address incorrect PCI TPH Steering Tag
   on ARM64 resulting from the invalid assumption that the ACPI
   Processor UID would always be the same as the corresponding logical
   CPU ID in Linux (Chengwen Feng)
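
As a rough illustration of the strncpy() to strscpy_pad() item above, here
is a minimal user-space sketch of the semantics involved. It is not the
kernel implementation: demo_strscpy_pad() is a hypothetical analogue that
copies at most size - 1 bytes, always NUL-terminates, zero-fills the rest
of the buffer, and returns -E2BIG on truncation. By contrast, strncpy()
leaves the destination unterminated when the source does not fit.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical user-space analogue of strscpy_pad() semantics.
 * Returns the number of bytes copied (excluding the NUL), or -E2BIG
 * if size is 0 or the source had to be truncated.
 */
static long demo_strscpy_pad(char *dst, const char *src, size_t size)
{
	size_t len = 0;
	int truncated;

	assert(dst && src);
	if (size == 0)
		return -E2BIG;

	/* Find how much of src fits while leaving room for the NUL. */
	while (len < size - 1 && src[len])
		len++;
	truncated = src[len] != '\0';

	memcpy(dst, src, len);
	/* Terminates the string and zero-fills the unused tail. */
	memset(dst + len, 0, size - len);

	return truncated ? -E2BIG : (long)len;
}
```

The zero-fill matters when the destination is later copied out as a whole
buffer, since no stale bytes remain after the terminator.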

Thanks!


---------------

Andy Shevchenko (1):
      ACPI: processor: idle: Replace strlcat() with better alternative

Armin Wolf (1):
      ACPI: OSL: Poweroff when encountering a fatal ACPI error

Ben Horgan (1):
      ACPI: PPTT: Remove duplicate structure, acpi_pptt_cache_v1_full

Chengwen Feng (8):
      arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
      LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
      RISC-V: ACPI: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
      x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
      ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h
      perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu()
      ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu()
      PCI/TPH: Pass ACPI Processor UID to Cache Locality _DSM

Huisong Li (7):
      ACPI: processor: idle: Remove redundant cstate check in acpi_processor_power_init
      ACPI: processor: idle: Move max_cstate update out of the loop
      ACPI: processor: idle: Remove redundant static variable and rename cstate check function
      ACPI: processor: idle: Reset power_setup_done flag on initialization failure
      ACPI: processor: idle: Fix NULL pointer dereference in hotplug path
      cpuidle: Extract and export no-lock variants of cpuidle_unregister_device()
      ACPI: processor: idle: Reset cpuidle on C-state list changes

Jingkai Tan (1):
      ACPI: processor: idle: Add missing bounds check in flatten_lpi_states()

Kai-Heng Feng (3):
      ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()
      PCI: hisi: Use devm_ghes_register_vendor_record_notifier()
      ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler

Kees Cook (1):
      ACPICA: Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy()

Nate DeSimone (2):
      ACPI: FPDT: expose FBPT and S3PT subtables via sysfs
      Documentation: ABI: add FBPT and S3PT entries to sysfs-firmware-acpi

Pengjie Zhang (2):
      ACPI: CPPC: Move reference performance to capabilities
      ACPI: CPPC: Fix uninitialized ref variable in cppc_get_perf_caps()

Rafael J. Wysocki (37):
      ACPI: x86: cmos_rtc: Clean up address space handler driver
      ACPI: x86: cmos_rtc: Improve coordination with ACPI TAD driver
      ACPI: x86: cmos_rtc: Create a CMOS RTC platform device
      ACPI: x86/rtc-cmos: Use platform device for driver binding
      ACPI: PNP: Drop CMOS RTC PNP device support
      x86: rtc: Drop PNP device check
      rtc: cmos: Drop PNP device support
      ACPI: TAD/x86: cmos_rtc: Consolidate address space handler setup
      ACPI: AC: Get rid of unnecessary declarations
      ACPI: PAD: Rearrange notify handler installation and removal
      driver core: auxiliary bus: Introduce dev_is_auxiliary()
      ACPI: video: Rework checking for duplicate video bus devices
      ACPI: video: Consolidate pnp.bus_id workarounds handling
      ACPI: driver: Do not set acpi_device_name() unnecessarily
      ACPI: event: Redefine acpi_notifier_call_chain()
      ACPI: driver: Avoid using pnp.device_class for netlink handling
      ACPI: driver: Do not set acpi_device_class() unnecessarily
      ACPI: AC: Define ACPI_AC_CLASS in one place
      rtc: cmos: Enable ACPI alarm if advertised in ACPI FADT
      rtc: cmos: Do not require IRQ if ACPI alarm is used
      ACPI: TAD: Create one attribute group
      ACPI: TAD: Support RTC without wakeup
      ACPI: TAD: Use __free() for cleanup in time_store()
      ACPI: TAD: Rearrange RT data validation checking
      ACPI: TAD: Clear unused RT data in acpi_tad_set_real_time()
      ACPI: TAD: Add RTC class device interface
      ACPI: TAD: Update the driver description comment
      ACPI: TAD: Use dev_groups in struct device_driver
      ACPI: TAD: Use DC wakeup only if AC wakeup is supported
      ACPI: processor: Rearrange and clean up acpi_processor_errata_piix4()
      ACPI: TAD: Split three functions to untangle runtime PM handling
      ACPI: TAD: Relocate two functions
      ACPI: TAD: Split acpi_tad_rtc_read_time()
      ACPI: TAD: Add alarm support to the RTC class device interface
      ACPI: PAD: xen: Convert to a platform driver
      watchdog: ni903x_wdt: Convert to a platform driver
      ACPICA: Update maintainers information

Sumit Gupta (8):
      ACPI: CPPC: Add cppc_get_perf() API to read performance controls
      ACPI: CPPC: Warn on missing mandatory DESIRED_PERF register
      ACPI: CPPC: Extend cppc_set_epp_perf() for FFH/SystemMemory
      cpufreq: CPPC: Update cached perf_ctrls on sysfs write
      cpufreq: cppc: Update MIN_PERF/MAX_PERF in target callbacks
      ACPI: CPPC: add APIs and sysfs interface for perf_limited
      cpufreq: CPPC: Add sysfs documentation for perf_limited
      ACPI: CPPC: Check cpc_read() return values consistently

Xi Ruoyao (1):
      ACPI: tables: Enable FPDT on LoongArch

---------------

 Documentation/ABI/testing/sysfs-devices-system-cpu |  18 +
 Documentation/ABI/testing/sysfs-firmware-acpi      |   6 +
 Documentation/PCI/tph.rst                          |   4 +-
 Documentation/admin-guide/kernel-parameters.txt    |   8 +
 MAINTAINERS                                        |   8 +-
 arch/arm64/include/asm/acpi.h                      |  17 +-
 arch/arm64/kernel/acpi.c                           |  30 ++
 arch/loongarch/include/asm/acpi.h                  |   5 -
 arch/loongarch/kernel/acpi.c                       |   9 +
 arch/riscv/include/asm/acpi.h                      |   4 -
 arch/riscv/kernel/acpi.c                           |  16 +
 arch/riscv/kernel/acpi_numa.c                      |   9 +-
 arch/x86/include/asm/cpu.h                         |   1 -
 arch/x86/include/asm/smp.h                         |   1 -
 arch/x86/kernel/acpi/boot.c                        |  20 +
 arch/x86/kernel/rtc.c                              |  21 +-
 arch/x86/xen/enlighten_hvm.c                       |   5 +-
 drivers/acpi/Kconfig                               |   2 +-
 drivers/acpi/ac.c                                  |  31 +-
 drivers/acpi/acpi_fpdt.c                           |  28 ++
 drivers/acpi/acpi_memhotplug.c                     |   4 -
 drivers/acpi/acpi_pad.c                            |  28 +-
 drivers/acpi/acpi_pnp.c                            |  22 +-
 drivers/acpi/acpi_processor.c                      |  31 +-
 drivers/acpi/acpi_tad.c                            | 492 ++++++++++++++-------
 drivers/acpi/acpi_video.c                          | 100 ++---
 drivers/acpi/acpica/utnonansi.c                    |   3 +-
 drivers/acpi/apei/Kconfig                          |  14 +
 drivers/acpi/apei/Makefile                         |   1 +
 drivers/acpi/apei/ghes-nvidia.c                    | 149 +++++++
 drivers/acpi/apei/ghes.c                           |  18 +
 drivers/acpi/battery.c                             |   9 +-
 drivers/acpi/button.c                              |  11 +-
 drivers/acpi/cppc_acpi.c                           | 293 ++++++++++--
 drivers/acpi/ec.c                                  |   6 -
 drivers/acpi/event.c                               |   7 +-
 drivers/acpi/osl.c                                 |  19 +-
 drivers/acpi/pci_link.c                            |   4 -
 drivers/acpi/pci_root.c                            |   9 +-
 drivers/acpi/power.c                               |   4 -
 drivers/acpi/pptt.c                                |  81 ++--
 drivers/acpi/processor_driver.c                    |  22 +-
 drivers/acpi/processor_idle.c                      |  82 ++--
 drivers/acpi/riscv/rhct.c                          |   7 +-
 drivers/acpi/sbs.c                                 |   4 -
 drivers/acpi/sbshc.c                               |   6 -
 drivers/acpi/thermal.c                             |  13 +-
 drivers/acpi/x86/cmos_rtc.c                        |  86 ++--
 drivers/base/auxiliary.c                           |  10 +
 drivers/cpufreq/cppc_cpufreq.c                     | 104 ++++-
 drivers/cpuidle/cpuidle.c                          |  22 +-
 drivers/gpu/drm/amd/include/amd_acpi.h             |   2 -
 drivers/gpu/drm/radeon/radeon_acpi.c               |   2 -
 drivers/pci/controller/pcie-hisi-error.c           |  12 +-
 drivers/pci/tph.c                                  |  16 +-
 drivers/perf/arm_cspmu/arm_cspmu.c                 |   6 +-
 drivers/platform/x86/hp/hp-wmi.c                   |   2 -
 drivers/platform/x86/lenovo/wmi-capdata.c          |   1 -
 drivers/rtc/rtc-cmos.c                             | 143 ++----
 drivers/watchdog/ni903x_wdt.c                      |  27 +-
 drivers/xen/xen-acpi-pad.c                         |  23 +-
 include/acpi/acpi_bus.h                            |  14 +-
 include/acpi/cppc_acpi.h                           |  22 +-
 include/acpi/ghes.h                                |  11 +
 include/acpi/processor.h                           |   2 -
 include/linux/acpi.h                               |  21 +
 include/linux/auxiliary_bus.h                      |   2 +
 include/linux/cpuidle.h                            |   2 +
 include/linux/pci-tph.h                            |   4 +-
 69 files changed, 1450 insertions(+), 766 deletions(-)


* [GIT PULL] Power management updates for v7.1-rc1
From: Rafael J. Wysocki @ 2026-04-10 14:24 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux PM, Linux Kernel Mailing List, ACPI Devel Mailing List,
	Viresh Kumar, Shuah Khan, Chanwoo Choi (samsung.com),
	Mario Limonciello

Hi Linus,

This goes early because I will be traveling next week.

Please pull from the tag

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 pm-7.1-rc1

with top-most commit d923f70e37310fe613883a3a4c2ea2f31246253b

 Merge branch 'pm-devfreq'

on top of commit 591cd656a1bf5ea94a222af5ef2ee76df029c1d2

 Linux 7.0-rc7

to receive power management updates for 7.1-rc1.

Once again, cpufreq is the most active development area, mostly because
of the new feature additions and documentation updates in the amd-pstate
driver, but there are also changes in the cpufreq core related to boost
support and other assorted updates elsewhere.

Next up are power capping changes due to the major cleanup of the Intel
RAPL driver.

On the cpuidle front, a new C-states table for Intel Panther Lake is
added to the intel_idle driver, the stopped tick handling in the menu
and teo governors is updated, and there are a couple of cleanups.

Apart from the above, support for Tegra114 is added to devfreq and
there are assorted cleanups of that code, there are also two updates
of the operating performance points (OPP) library, two minor updates
related to hibernation, and cpupower utility man pages updates and
cleanups.

Specifics:

 - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa)

 - Update cpufreq-dt-platdev blocklist (Faruque Ansari)

 - Minor updates to driver and dt-bindings for Tegra (Thierry Reding,
   Rosen Penev)

 - Add MAINTAINERS entry for CPPC driver (Viresh Kumar)

 - Add support for new features: CPPC performance priority, Dynamic EPP,
   Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy,
   Mario Limonciello)

 - Fix sysfs files being present when the hardware is missing and fix
   broken/outdated documentation in the amd-pstate driver (Ninad Naik,
   Gautham Shenoy)

 - Pass the policy to cpufreq_driver->adjust_perf() to avoid using
   cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which
   leads to a scheduling-while-atomic bug (K Prateek Nayak)

 - Clean up dead code in Kconfig for cpufreq (Julian Braha)

 - Remove max_freq_req update for pre-existing cpufreq policy and add a
   boost_freq_req QoS request to save the boost constraint instead of
   overwriting the last scaling_max_freq constraint (Pierre Gondois)

 - Embed cpufreq QoS freq_req objects in cpufreq policy so they all
   are allocated in one go along with the policy to simplify lifetime
   rules and avoid error handling issues (Viresh Kumar)

 - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq
   scaling driver (Henry Tseng)

 - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead
   of cpumask_weight() because the former is more efficient (Yury Norov)

 - Use sysfs_emit() in sysfs show functions for cpufreq governor
   attributes (Thorsten Blum)

 - Update intel_pstate to stop returning an error when "off" is written
   to its status sysfs attribute while the driver is already off (Fabio
   De Francesco)

 - Include current frequency in the debug message printed by
   __cpufreq_driver_target() (Pengjie Zhang)

 - Refine stopped tick handling in the menu cpuidle governor and
   rearrange stopped tick handling in the teo cpuidle governor (Rafael
   Wysocki)

 - Add Panther Lake C-states table to the intel_idle driver (Artem
   Bityutskiy)

 - Clean up dead dependencies on CPU_IDLE in Kconfig (Julian Braha)

 - Simplify cpuidle_register_device() with guard() (Huisong Li)

 - Use performance level if available to distinguish between rates in
   OPP debugfs (Manivannan Sadhasivam)

 - Fix scoped_guard in dev_pm_opp_xlate_required_opp() (Viresh Kumar)

 - Return -ENODATA if the snapshot image is not loaded (Alberto Garcia)

 - Remove inclusion of crypto/hash.h from hibernate_64.c on x86 (Eric
   Biggers)

 - Clean up and rearrange the intel_rapl power capping driver to make
   the respective interface drivers (TPMI, MSR, and MMIO) hold their
   own settings and primitives and consolidate PL4 and PMU support
   flags into rapl_defaults (Kuppuswamy Sathyanarayanan)

 - Correct kernel-doc function parameter names in the power capping core
   code (Randy Dunlap)

 - Remove unneeded casting for HZ_PER_KHZ in devfreq (Andy Shevchenko)

 - Use _visible attribute to replace create/remove_sysfs_files() in
   devfreq (Pengjie Zhang)

 - Add Tegra114 support to activity monitor device in tegra30-devfreq as
   a preparation to upcoming EMC controller support (Svyatoslav Ryhel)

 - Fix mistakes in cpupower man pages, add the boost and epp options to
   the cpupower-frequency-info man page, and add the perf-bias option to
   the cpupower-info man page (Roberto Ricci)

 - Remove unnecessary extern declarations from getopt.h in arguments
   parsing functions in cpufreq-set, cpuidle-info, cpuidle-set,
   cpupower-info, and cpupower-set utilities (Kaushlendra Kumar)

Thanks!


---------------

Abel Vesa (1):
      dt-bindings: cpufreq: qcom-hw: document Eliza cpufreq hardware

Alberto Garcia (1):
      PM: hibernate: return -ENODATA if the snapshot image is not loaded

Andy Shevchenko (1):
      PM / devfreq: Remove unneeded casting for HZ_PER_KHZ

Artem Bityutskiy (1):
      intel_idle: Add Panther Lake C-states table

Eric Biggers (1):
      PM: hibernate: x86: Remove inclusion of crypto/hash.h

Fabio M. De Francesco (1):
      cpufreq: intel_pstate: Allow repeated intel_pstate disable

Faruque Ansari (1):
      cpufreq: Add QCS8300 to cpufreq-dt-platdev blocklist

Gautham R. Shenoy (13):
      amd-pstate: Fix memory leak in amd_pstate_epp_cpu_init()
      amd-pstate: Update cppc_req_cached in fast_switch case
      amd-pstate: Make certain freq_attrs conditionally visible
      x86/cpufeatures: Add AMD CPPC Performance Priority feature.
      amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
      amd-pstate: Add sysfs support for floor_freq and floor_count
      amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
      amd-pstate-ut: Add module parameter to select testcases
      amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
      Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
      Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
      Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
      MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer

Henry Tseng (1):
      cpufreq: acpi-cpufreq: use DMI max speed when CPPC is unavailable

Huisong Li (1):
      cpuidle: Simplify cpuidle_register_device() with guard()

Julian Braha (2):
      cpufreq: clean up dead code in Kconfig
      cpuidle: clean up dead dependencies on CPU_IDLE in Kconfig

K Prateek Nayak (2):
      cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
      cpufreq: Pass the policy to cpufreq_driver->adjust_perf()

Kaushlendra Kumar (1):
      cpupower: remove extern declarations in cmd functions

Kuppuswamy Sathyanarayanan (18):
      powercap: intel_rapl: Add a symbol namespace for intel_rapl exports
      powercap: intel_rapl: Cleanup coding style
      powercap: intel_rapl: Remove unused TIME_WINDOW macros
      powercap: intel_rapl: Simplify rapl_compute_time_window_atom()
      powercap: intel_rapl: Use shifts for power-of-2 operations
      powercap: intel_rapl: Use GENMASK() and BIT() macros
      powercap: intel_rapl: Use unit conversion macros from units.h
      powercap: intel_rapl: Allow interface drivers to configure rapl_defaults
      powercap: intel_rapl: Move TPMI default settings into TPMI interface driver
      thermal: intel: int340x: processor: Move RAPL defaults to MMIO driver
      powercap: intel_rapl: Remove unused AVERAGE_POWER primitive
      powercap: intel_rapl: Move MSR default settings into MSR interface driver
      powercap: intel_rapl: Remove unused macro definitions
      powercap: intel_rapl: Move primitive info to header for interface drivers
      powercap: intel_rapl: Move TPMI primitives to TPMI driver
      thermal: intel: int340x: processor: Move MMIO primitives to MMIO driver
      powercap: intel_rapl: Move MSR primitives to MSR driver
      powercap: intel_rapl: Consolidate PL4 and PMU support flags into rapl_defaults

Manivannan Sadhasivam (1):
      OPP: debugfs: Use performance level if available to distinguish between rates

Mario Limonciello (AMD) (7):
      cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP
      cpufreq/amd-pstate: Cache the max frequency in cpudata
      cpufreq/amd-pstate: Add dynamic energy performance preference
      cpufreq/amd-pstate: add kernel command line to override dynamic epp
      cpufreq/amd-pstate: Add support for platform profile class
      cpufreq/amd-pstate: Add support for raw EPP writes
      cpufreq/amd-pstate-ut: Add a unit test for raw EPP

Ninad Naik (1):
      Documentation: amd-pstate: fix dead links in the reference section

Pengjie Zhang (2):
      cpufreq: Add debug print for current frequency in __cpufreq_driver_target()
      PM / devfreq: use _visible attribute to replace create/remove_sysfs_files()

Pierre Gondois (2):
      cpufreq: Remove max_freq_req update for pre-existing policy
      cpufreq: Add boost_freq_req QoS request

Rafael J. Wysocki (2):
      cpuidle: governors: menu: Refine stopped tick handling
      cpuidle: governors: teo: Rearrange stopped tick handling

Randy Dunlap (1):
      powercap: correct kernel-doc function parameter names

Roberto Ricci (4):
      cpupower-idle-info.1: fix short option names
      cpupower-frequency-info.1: use the proper name of the --perf option
      cpupower-frequency-info.1: document --boost and --epp options
      cpupower-info.1: describe the --perf-bias option

Rosen Penev (1):
      cpufreq: tegra194: remove COMPILE_TEST

Svyatoslav Ryhel (1):
      PM / devfreq: tegra30-devfreq: add support for Tegra114

Thierry Reding (2):
      dt-bindings: arm: nvidia: Document the Tegra238 CCPLEX cluster
      cpufreq: tegra194: Rename Tegra239 to Tegra238

Thorsten Blum (1):
      cpufreq: governor: Use sysfs_emit() in sysfs show functions

Viresh Kumar (3):
      OPP: Move break out of scoped_guard in dev_pm_opp_xlate_required_opp()
      cpufreq: Add MAINTAINERS entry for CPPC driver
      cpufreq: Allocate QoS freq_req objects with policy

Yury Norov (1):
      cpufreq: optimize policy_is_shared()

---------------

 Documentation/admin-guide/kernel-parameters.txt    |   7 +
 Documentation/admin-guide/pm/amd-pstate.rst        |  87 ++-
 .../arm/tegra/nvidia,tegra-ccplex-cluster.yaml     |   1 +
 .../bindings/cpufreq/cpufreq-qcom-hw.yaml          |   1 +
 MAINTAINERS                                        |  25 +-
 arch/x86/include/asm/cpufeatures.h                 |   2 +-
 arch/x86/include/asm/msr-index.h                   |   5 +
 arch/x86/kernel/cpu/scattered.c                    |   1 +
 arch/x86/power/hibernate_64.c                      |   2 -
 drivers/acpi/cppc_acpi.c                           |   3 +-
 drivers/cpufreq/Kconfig                            |   5 +-
 drivers/cpufreq/Kconfig.arm                        |   2 +-
 drivers/cpufreq/Kconfig.x86                        |  14 +
 drivers/cpufreq/acpi-cpufreq.c                     |  31 +-
 drivers/cpufreq/amd-pstate-trace.h                 |  35 ++
 drivers/cpufreq/amd-pstate-ut.c                    | 279 ++++++++-
 drivers/cpufreq/amd-pstate.c                       | 627 ++++++++++++++++++---
 drivers/cpufreq/amd-pstate.h                       |  37 +-
 drivers/cpufreq/cppc_cpufreq.c                     |  10 +-
 drivers/cpufreq/cpufreq-dt-platdev.c               |   1 +
 drivers/cpufreq/cpufreq.c                          |  85 ++-
 drivers/cpufreq/cpufreq_governor.h                 |   5 +-
 drivers/cpufreq/intel_pstate.c                     |   6 +-
 drivers/cpufreq/tegra194-cpufreq.c                 |   4 +-
 drivers/cpuidle/Kconfig                            |   2 +-
 drivers/cpuidle/Kconfig.mips                       |   2 +-
 drivers/cpuidle/Kconfig.powerpc                    |   2 -
 drivers/cpuidle/cpuidle.c                          |  12 +-
 drivers/cpuidle/governors/gov.h                    |   5 +
 drivers/cpuidle/governors/menu.c                   |  15 +-
 drivers/cpuidle/governors/teo.c                    |  81 ++-
 drivers/devfreq/devfreq.c                          | 108 ++--
 drivers/devfreq/tegra30-devfreq.c                  |  17 +-
 drivers/idle/intel_idle.c                          |  42 ++
 drivers/opp/core.c                                 |   2 +-
 drivers/opp/debugfs.c                              |  20 +-
 drivers/powercap/intel_rapl_common.c               | 565 ++-----------------
 drivers/powercap/intel_rapl_msr.c                  | 393 ++++++++++++-
 drivers/powercap/intel_rapl_tpmi.c                 | 101 ++++
 .../intel/int340x_thermal/processor_thermal_rapl.c |  81 +++
 include/acpi/cppc_acpi.h                           |   1 +
 include/linux/cpufreq.h                            |  11 +-
 include/linux/intel_rapl.h                         |  52 +-
 include/linux/powercap.h                           |   4 +-
 include/linux/units.h                              |   3 +
 kernel/power/user.c                                |   7 +-
 kernel/sched/cpufreq_schedutil.c                   |   5 +-
 rust/kernel/cpufreq.rs                             |  13 +-
 tools/arch/x86/include/asm/cpufeatures.h           |   2 +-
 tools/power/cpupower/man/cpupower-frequency-info.1 |   8 +-
 tools/power/cpupower/man/cpupower-idle-info.1      |   4 +-
 tools/power/cpupower/man/cpupower-info.1           |   9 +-
 tools/power/cpupower/utils/cpufreq-info.c          |   2 -
 tools/power/cpupower/utils/cpufreq-set.c           |   2 -
 tools/power/cpupower/utils/cpuidle-info.c          |   2 -
 tools/power/cpupower/utils/cpuidle-set.c           |   2 -
 tools/power/cpupower/utils/cpupower-info.c         |   2 -
 tools/power/cpupower/utils/cpupower-set.c          |   2 -
 58 files changed, 1951 insertions(+), 903 deletions(-)


* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support
From: Pierre Gondois @ 2026-04-10 13:47 UTC (permalink / raw)
  To: Sumit Gupta
  Cc: linux-tegra, linux-kernel, linux-doc, zhenglifeng1, treding,
	viresh.kumar, jonathanh, vsethi, ionela.voinescu, ksitaraman,
	sanjayc, zhanjie9, corbet, mochs, skhan, bbasu, rdunlap, linux-pm,
	mario.limonciello, rafael
In-Reply-To: <b8debb30-67a5-4d2b-8c08-8fd287f7258e@nvidia.com>

Hello Sumit,

On 4/6/26 20:08, Sumit Gupta wrote:
> Hi Pierre,
>
> Thank you for the comments.
> Sorry for the late reply; I was on vacation.
>
No worries
>
> On 24/03/26 23:48, Pierre Gondois wrote:
>> Hello Sumit,
>>
>> On 3/17/26 16:10, Sumit Gupta wrote:
>>> Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable CPPC
>>> autonomous performance selection on all CPUs at system startup without
>>> requiring runtime sysfs manipulation. When autonomous mode is enabled,
>>> the hardware automatically adjusts CPU performance based on workload
>>> demands using Energy Performance Preference (EPP) hints.
>>>
>>> When auto_sel_mode=1:
>>> - Configure all CPUs for autonomous operation on first init
>>> - Set EPP to performance preference (0x0)
>>> - Use HW min/max when set; otherwise program from policy limits (caps)
>>> - Clamp desired_perf to bounds before enabling autonomous mode
>>> - Hardware controls frequency instead of the OS governor
>>>
>>> The boot parameter is applied only during first policy initialization.
>>> On hotplug, skip applying it so that the user's runtime sysfs
>>> configuration is preserved.
>>>
>>> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation)
>>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>>> ---
>>> Part 1 [1] of this series was applied for 7.1 and is present in next.
>>> Sending this patch as a reworked version of 'patch 11' from [2], based
>>> on next.
>>>
>>> [1] https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/
>>> [2] https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/
>>> ---
>>>   .../admin-guide/kernel-parameters.txt         | 13 +++
>>>   drivers/cpufreq/cppc_cpufreq.c                | 84 +++++++++++++++++--
>>>   2 files changed, 92 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>> index fa6171b5fdd5..de4b4c89edfe 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -1060,6 +1060,19 @@ Kernel parameters
>>>                       policy to use. This governor must be registered in the
>>>                       kernel before the cpufreq driver probes.
>>>
>>> +     cppc_cpufreq.auto_sel_mode=
>>> +                     [CPU_FREQ] Enable ACPI CPPC autonomous performance
>>> +                     selection. When enabled, hardware automatically adjusts
>>> +                     CPU frequency on all CPUs based on workload demands.
>>> +                     In Autonomous mode, Energy Performance Preference (EPP)
>>> +                     hints guide hardware toward performance (0x0) or energy
>>> +                     efficiency (0xff).
>>> +                     Requires ACPI CPPC autonomous selection register support.
>>> +                     Format: <bool>
>>> +                     Default: 0 (disabled)
>>> +                     0: use cpufreq governors
>>> +                     1: enable if supported by hardware
>>> +
>>>       cpu_init_udelay=N
>>>                       [X86,EARLY] Delay for N microsec between assert and de-assert
>>>                       of APIC INIT to start processors.  This delay occurs
>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>> index 5dfb109cf1f4..49c148b2a0a4 100644
>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>> @@ -28,6 +28,9 @@
>>>
>>>   static struct cpufreq_driver cppc_cpufreq_driver;
>>>
>>> +/* Autonomous Selection boot parameter */
>>> +static bool auto_sel_mode;
>>> +
>>>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>>>   static enum {
>>>       FIE_UNSET = -1,
>>> @@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>>       policy->cur = cppc_perf_to_khz(caps, caps->highest_perf);
>>>       cpu_data->perf_ctrls.desired_perf = caps->highest_perf;
>>>
>>> -     ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>> -     if (ret) {
>>> -             pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
>>> -                      caps->highest_perf, cpu, ret);
>>> -             goto out;
>>> +     /*
>>> +      * Enable autonomous mode on first init if boot param is set.
>>> +      * Check last_governor to detect first init and skip if auto_sel
>>> +      * is already enabled.
>>> +      */
>> If the goal is to set autosel only once at the driver init,
>> shouldn't this be done in cppc_cpufreq_init() ?
>> I understand that cpu_data doesn't exist yet in
>> cppc_cpufreq_init(), but this seems more appropriate to do
>> it there IMO.
>>
>> This means the cpudata should be updated accordingly
>> in this cppc_cpufreq_cpu_init() function.
>
> In an earlier version [1], the setup was in cppc_cpufreq_init() but
> was moved to cppc_cpufreq_cpu_init() to improve per-CPU error handling.
> Keeping the setup in cppc_cpufreq_init() helps to avoid the last_governor
> check. We can warn for a CPU failing to enable and continue so other
> CPUs keep autonomous mode.
> cppc_cpufreq_cpu_init() would then just check the auto_sel state
> from register and sync policy limits from min/max_perf registers when
> autonomous mode is active.
> Please let me know your thoughts.

FWIU, the auto_sel_mode module parameter configures the default
auto_sel_mode when the driver is first loaded, so there should be
no need to check it again whenever cppc_cpufreq_cpu_init() is
called.
Maybe Ionela saw something we didn't see?

Also, just to be sure: should it still be possible to change
auto_sel_mode through sysfs if the driver was loaded with
auto_sel_mode=1?


>
> [1] https://lore.kernel.org/lkml/5593d364-ca37-41c5-b33f-f7e245d6d626@nvidia.com/
>
>
>>
>>> +     if (auto_sel_mode && policy->last_governor[0] == '\0' &&
>>> +         !cpu_data->perf_ctrls.auto_sel) {
>>> +             /* Enable CPPC - optional register, some platforms need it */
>> The documentation of the CPPC Enable Register is subject to
>> interpretation, but IIUC the field should be set to use the CPPC
>> controls, so I assume this should be set in cppc_cpufreq_init()
>> instead ?
>
> Agree that the CPPC Enable is about using the CPPC control path
> in general and not only for autonomous selection.
> Will move cppc_set_enable() into cppc_cpufreq_init() or outside the
> autonomous mode block in cppc_cpufreq_cpu_init() as per conclusion
> of previous comment.
>
>>> +             ret = cppc_set_enable(cpu, true);
>>> +             if (ret && ret != -EOPNOTSUPP)
>>> +                     pr_warn("Failed to enable CPPC for CPU%d (%d)\n", cpu, ret);
>>> +
>>> +             /*
>>> +              * Prefer HW min/max_perf when set; otherwise program from
>>> +              * policy limits derived earlier from caps.
>>> +              * Clamp desired_perf to bounds and sync policy->cur.
>>> +              */
>>> +             if (!cpu_data->perf_ctrls.min_perf || !cpu_data->perf_ctrls.max_perf)
>>
>> The function doesn't seem to exist.
>
> It is newly added in [2].
> Don't need to call it if we move the setup to cppc_cpufreq_init().

Ah ok right thanks.


>
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=ea3db45ae476889a1ba0ab3617e6afdeeefbda3d
>
>
>
>>
>>> +                     cppc_cpufreq_update_perf_limits(cpu_data, policy);
>>> +
>>> +             cpu_data->perf_ctrls.desired_perf =
>>> +                     clamp_t(u32, cpu_data->perf_ctrls.desired_perf,
>>> +                             cpu_data->perf_ctrls.min_perf,
>>> +                             cpu_data->perf_ctrls.max_perf);
>>> +
>>> +             policy->cur = cppc_perf_to_khz(caps,
>>> +                                            cpu_data->perf_ctrls.desired_perf);
>>> +
>>
>> Maybe this should also be done in cppc_cpufreq_init()
>> if the auto_sel_mode parameter is set ?
>
> Yes.
>
>>
>>> +             /* EPP is optional - some platforms may not support it */
>>> +             ret = cppc_set_epp(cpu, CPPC_EPP_PERFORMANCE_PREF);
>>> +             if (ret && ret != -EOPNOTSUPP)
>>> +                     pr_warn("Failed to set EPP for CPU%d (%d)\n", cpu, ret);
>>> +             else if (!ret)
>>> +                     cpu_data->perf_ctrls.energy_perf = CPPC_EPP_PERFORMANCE_PREF;
>>> +
>>> +             ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>> +             if (ret) {
>>> +                     pr_debug("Err setting perf for autonomous mode CPU:%d ret:%d\n",
>>> +                              cpu, ret);
>>> +                     goto out;
>>> +             }
>>> +
>>> +             ret = cppc_set_auto_sel(cpu, true);
>>> +             if (ret && ret != -EOPNOTSUPP) {
>>> +                     pr_warn("Failed autonomous config for CPU%d (%d)\n",
>>> +                             cpu, ret);
>>> +                     goto out;
>>> +             }
>>> +             if (!ret)
>>> +                     cpu_data->perf_ctrls.auto_sel = true;
>>> +     }
>>> +
>>> +     if (cpu_data->perf_ctrls.auto_sel) {
>>
>> There is an ongoing patchset which tries to remove
>> setting policy->min/max from driver initialization.
>> Indeed, these values are only temporarily valid,
>> until the governor overrides them.
>> It is not yet certain that the patch will be accepted, though.
>>
>> https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/ 
>>
>
>
> You are right that policy->min/max from .init() are temporary today
> as cpufreq_set_policy() overwrites them before the governor starts.
>
> On my test platform (highest == nominal, lowest_nonlinear == lowest),
> this had no visible effect because the BIOS bounds and cpuinfo range
> end up identical. But on platforms where they differ, the governor
> would widen the range to full cpuinfo limits.
>
> I think your patch [3] fixes this by giving these the right semantic as
> initial QoS requests. With it, cpufreq_set_policy() preserves the policy
> limits set from min/max_perf registers in .init(), which can either be
> BIOS values on first boot or last user configured values before hotplug.
>
> I will update the comment in v2 to reflect QoS seeding intent.
>
> I see that the first two patches of your series [3] are applied for 7.1.
> Do you plan to send the pending patch (3/4) from [3]?
>
I need to ping Viresh to check if this is still relevant.


> [3] https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/
>
>
>>
>>
>>> +             /* Sync policy limits from HW when autonomous mode is active */
>>> +             policy->min = cppc_perf_to_khz(caps,
>>> +                                            cpu_data->perf_ctrls.min_perf ?:
>>> +                                            caps->lowest_nonlinear_perf);
>>> +             policy->max = cppc_perf_to_khz(caps,
>>> +                                            cpu_data->perf_ctrls.max_perf ?:
>>> +                                            caps->nominal_perf);
>>> +     } else {
>>> +             /* Normal mode: governors control frequency */
>>> +             ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>> +             if (ret) {
>>> +                     pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
>>> +                              caps->highest_perf, cpu, ret);
>>> +                     goto out;
>>> +             }
>>>       }
>>>
>>>       cppc_cpufreq_cpu_fie_init(policy);
>>> @@ -1038,10 +1104,18 @@ static int __init cppc_cpufreq_init(void)
>>>
>>>   static void __exit cppc_cpufreq_exit(void)
>>>   {
>>> +     unsigned int cpu;
>>> +
>>> +     for_each_present_cpu(cpu)
>>> +             cppc_set_auto_sel(cpu, false);
>>
>> If the firmware has a default EPP value, it means that loading
>> and then unloading the driver will reset this default EPP value.
>> Maybe the initial EPP value and/or the auto_sel value should be
>> cached somewhere and restored on exit?
>> I don't know if this is actually an issue; I am just flagging it.
>
> The auto_sel_mode boot path programs EPP to performance preference (0),
> not the firmware’s previous value. On unload we only call
> cppc_set_auto_sel(false); we do not restore EPP, min/max perf,
> or other CPPC fields to firmware defaults.

Yes right, so loading/unloading the driver might change the
default EPP value.


>
> Thank you,
> Sumit Gupta
>
> ....
>
>


* Re: [patch V2 08/11] fs/timerfd: Use the new alarm/hrtimer functions
From: Frederic Weisbecker @ 2026-04-10 13:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Alexander Viro, Christian Brauner, Jan Kara,
	Anna-Maria Behnsen, linux-fsdevel, Calvin Owens,
	Peter Zijlstra (Intel), John Stultz, Stephen Boyd,
	Sebastian Reichel, linux-pm, Pablo Neira Ayuso, Florian Westphal,
	Phil Sutter, netfilter-devel, coreteam
In-Reply-To: <20260408114952.469141112@kernel.org>

Le Wed, Apr 08, 2026 at 01:54:20PM +0200, Thomas Gleixner a écrit :
> Like any other user controlled interface, timerfd based timers can be
> programmed with expiry times in the past or very small intervals.
> 
> Both hrtimer and alarmtimer provide new interfaces which return the queued
> state of the timer. If the timer was already expired, then let the callsite
> handle the timerfd context update so that the full round trip through the
> hrtimer interrupt is avoided.
> 
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: linux-fsdevel@vger.kernel.org

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

-- 
Frederic Weisbecker
SUSE Labs


* [GIT PULL] turbostat fixes for 7.0
From: Len Brown @ 2026-04-10 13:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux PM list, Linux Kernel Mailing List

Hi Linus,

Please pull these turbostat-fixes-for-7.0 patches.

thanks!
Len Brown, Intel Open Source Technology Center

The following changes since commit 1f318b96cc84d7c2ab792fcc0bfd42a7ca890681:

  Linux 7.0-rc3 (2026-03-08 16:56:54 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git tags/turbostat-fixes-for-7.0

for you to fetch changes up to ba893caead54745595e29953f0531cf3651610aa:

  tools/power turbostat: Allow execution to continue after perf_l2_init() failure (2026-04-10 09:04:32 -0400)

----------------------------------------------------------------
Turbostat Fixes

Fix a memory allocation issue that could corrupt output values or cause a SEGV

Fix a perf initialization issue that could cause an exit on some HW + kernels.

Minor fixes.

----------------------------------------------------------------
Artem Bityutskiy (4):
      tools/power turbostat: Consistently use print_float_value()
      tools/power turbostat: Fix incorrect format variable
      tools/power turbostat: Fix --show/--hide for individual cpuidle counters
      tools/power turbostat: Fix delimiter bug in print functions

David Arcari (1):
      tools/power turbostat: Allow execution to continue after perf_l2_init() failure

Len Brown (1):
      tools/power turbostat: Fix swidle header vs data display

Serhii Pievniev (1):
      tools/power/turbostat: Fix microcode patch level output for AMD/Hygon

Zhang Rui (2):
      tools/power turbostat: Fix illegal memory access when SMT is present and disabled
      tools/power turbostat: Eliminate unnecessary data structure allocation

 tools/power/x86/turbostat/turbostat.c | 100 ++++++++++++++++++----------------
 1 file changed, 54 insertions(+), 46 deletions(-)


* [PATCH 9/9] tools/power turbostat: Allow execution to continue after perf_l2_init() failure
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
  To: linux-pm; +Cc: David Arcari, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>

From: David Arcari <darcari@redhat.com>

Currently, if perf_l2_init() fails, turbostat exits after issuing the
following error (which was encountered on Alder Lake):

turbostat: perf_l2_init(cpu0, 0x0, 0xff24) REFS: Invalid argument

This occurs because perf_l2_init() calls err(). However, the code has been
written in such a manner that it is able to perform cleanup and continue.
Therefore, this issue can be addressed by changing the appropriate calls
to err() to warnx().
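
As an illustration of the difference described above, here is a minimal
sketch. It is not taken from turbostat: open_counter_stub() and
optional_counter_init() are hypothetical names, and the stub always fails
with EINVAL. It shows warnx() from <err.h> printing a message and
returning, so the caller can clean up and carry on, where err() would
terminate the process.

```c
#include <err.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a counter open that always fails. */
static int open_counter_stub(int cpu)
{
	(void)cpu;
	errno = EINVAL;
	return -1;
}

/*
 * Hypothetical init path for an optional feature.
 * Returns 0 on success, -1 if the feature had to be skipped.
 */
static int optional_counter_init(int cpu)
{
	int fd = open_counter_stub(cpu);

	if (fd == -1) {
		/*
		 * warnx() writes to stderr and returns to the caller.
		 * err(-1, ...) at this point would call exit() instead,
		 * so the cleanup and return below would never run.
		 */
		warnx("%s(cpu%d) REFS", __func__, cpu);
		return -1;
	}

	return 0;
}
```

Calling optional_counter_init(0) prints one line to stderr and returns -1,
after which the caller keeps running. Note that warnx(), unlike err() and
warn(), does not append the strerror(errno) text, so a suffix such as
": Invalid argument" would not be printed by this sketch.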

Additionally, correct the PMU type arguments passed to the warning strings
in the ecore and lcore blocks so the logs accurately reflect the failing
counter type.

Signed-off-by: David Arcari <darcari@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 34e2143cd4b3..e9e8ef72395a 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -9405,13 +9405,13 @@ void perf_l2_init(void)
 		if (!is_hybrid) {
 			fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.uniform, perf_model_support->first.refs, -1, PERF_FORMAT_GROUP);
 			if (fd_l2_percpu[cpu] == -1) {
-				err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.uniform, perf_model_support->first.refs);
+				warnx("%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.uniform, perf_model_support->first.refs);
 				free_fd_l2_percpu();
 				return;
 			}
 			retval = open_perf_counter(cpu, perf_pmu_types.uniform, perf_model_support->first.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP);
 			if (retval == -1) {
-				err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.uniform, perf_model_support->first.hits);
+				warnx("%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.uniform, perf_model_support->first.hits);
 				free_fd_l2_percpu();
 				return;
 			}
@@ -9420,39 +9420,39 @@ void perf_l2_init(void)
 		if (perf_pcore_set && CPU_ISSET_S(cpu, cpu_possible_setsize, perf_pcore_set)) {
 			fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.pcore, perf_model_support->first.refs, -1, PERF_FORMAT_GROUP);
 			if (fd_l2_percpu[cpu] == -1) {
-				err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->first.refs);
+				warnx("%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->first.refs);
 				free_fd_l2_percpu();
 				return;
 			}
 			retval = open_perf_counter(cpu, perf_pmu_types.pcore, perf_model_support->first.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP);
 			if (retval == -1) {
-				err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->first.hits);
+				warnx("%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->first.hits);
 				free_fd_l2_percpu();
 				return;
 			}
 		} else if (perf_ecore_set && CPU_ISSET_S(cpu, cpu_possible_setsize, perf_ecore_set)) {
 			fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.ecore, perf_model_support->second.refs, -1, PERF_FORMAT_GROUP);
 			if (fd_l2_percpu[cpu] == -1) {
-				err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->second.refs);
+				warnx("%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.ecore, perf_model_support->second.refs);
 				free_fd_l2_percpu();
 				return;
 			}
 			retval = open_perf_counter(cpu, perf_pmu_types.ecore, perf_model_support->second.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP);
 			if (retval == -1) {
-				err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->second.hits);
+				warnx("%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.ecore, perf_model_support->second.hits);
 				free_fd_l2_percpu();
 				return;
 			}
 		} else if (perf_lcore_set && CPU_ISSET_S(cpu, cpu_possible_setsize, perf_lcore_set)) {
 			fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.lcore, perf_model_support->third.refs, -1, PERF_FORMAT_GROUP);
 			if (fd_l2_percpu[cpu] == -1) {
-				err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->third.refs);
+				warnx("%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.lcore, perf_model_support->third.refs);
 				free_fd_l2_percpu();
 				return;
 			}
 			retval = open_perf_counter(cpu, perf_pmu_types.lcore, perf_model_support->third.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP);
 			if (retval == -1) {
-				err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->third.hits);
+				warnx("%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.lcore, perf_model_support->third.hits);
 				free_fd_l2_percpu();
 				return;
 			}
-- 
2.45.2



* [PATCH 8/9] tools/power turbostat: Fix delimiter bug in print functions
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
  To: linux-pm; +Cc: Artem Bityutskiy, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

Commands that add counters, such as 'turbostat --show C1,C1+',
display merged columns without a delimiter.

This is caused by an operator-precedence bug in '(*printed++ ? delim : "")',
which advances the pointer instead of incrementing the value it points to.
The expression is shared by print_name(), print_hex_value(),
print_decimal_value() and print_float_value().

Use '((*printed)++ ? delim : "")' to correctly increment the value at *printed.
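
The difference between the two expressions can be reproduced outside
turbostat. The following is a minimal sketch, not the turbostat code:
delim_buggy(), delim_fixed() and join3() are hypothetical names made up
for illustration, and the column names are only sample data.

```c
#include <stdio.h>

/*
 * Buggy form: '*printed++' parses as '*(printed++)'. It reads *printed,
 * then advances the local pointer, so the caller's counter never changes.
 */
const char *delim_buggy(int *printed, const char *delim)
{
	return (*printed++ ? delim : "");
}

/* Fixed form: '(*printed)++' increments the integer that printed points to. */
const char *delim_fixed(int *printed, const char *delim)
{
	return ((*printed)++ ? delim : "");
}

/* Hypothetical helper: join three column names using the given picker. */
void join3(char *out, size_t len, const char *(*pick)(int *, const char *))
{
	const char *names[] = { "C1", "C1+", "C1-" };
	int printed = 0;
	size_t off = 0;

	for (int i = 0; i < 3 && off < len; i++)
		off += (size_t)snprintf(out + off, len - off, "%s%s", pick(&printed, "\t"), names[i]);
}
```

Calling join3() with delim_buggy yields the names run together, since
the counter stays at zero; with delim_fixed the names are separated by
tabs.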

[lenb: fix code and commit message typo, re-word]
Fixes: 56dbb878507b ("tools/power turbostat: Refactor added column header printing")
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 3487548841e1..34e2143cd4b3 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -2837,29 +2837,29 @@ static inline int print_name(int width, int *printed, char *delim, char *name, e
 	UNUSED(type);
 
 	if (format == FORMAT_RAW && width >= 64)
-		return (sprintf(outp, "%s%-8s", (*printed++ ? delim : ""), name));
+		return (sprintf(outp, "%s%-8s", ((*printed)++ ? delim : ""), name));
 	else
-		return (sprintf(outp, "%s%s", (*printed++ ? delim : ""), name));
+		return (sprintf(outp, "%s%s", ((*printed)++ ? delim : ""), name));
 }
 
 static inline int print_hex_value(int width, int *printed, char *delim, unsigned long long value)
 {
 	if (width <= 32)
-		return (sprintf(outp, "%s%08x", (*printed++ ? delim : ""), (unsigned int)value));
+		return (sprintf(outp, "%s%08x", ((*printed)++ ? delim : ""), (unsigned int)value));
 	else
-		return (sprintf(outp, "%s%016llx", (*printed++ ? delim : ""), value));
+		return (sprintf(outp, "%s%016llx", ((*printed)++ ? delim : ""), value));
 }
 
 static inline int print_decimal_value(int width, int *printed, char *delim, unsigned long long value)
 {
 	UNUSED(width);
 
-	return (sprintf(outp, "%s%lld", (*printed++ ? delim : ""), value));
+	return (sprintf(outp, "%s%lld", ((*printed)++ ? delim : ""), value));
 }
 
 static inline int print_float_value(int *printed, char *delim, double value)
 {
-	return (sprintf(outp, "%s%0.2f", (*printed++ ? delim : ""), value));
+	return (sprintf(outp, "%s%0.2f", ((*printed)++ ? delim : ""), value));
 }
 
 void print_header(char *delim)
-- 
2.45.2



* [PATCH 7/9] tools/power turbostat: Fix --show/--hide for individual cpuidle counters
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
  To: linux-pm; +Cc: Artem Bityutskiy, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

Problem: individual swidle counter names (C1, C1+, C1-, etc.) cannot be
selected via --show/--hide due to two bugs in probe_cpuidle_counts():
1. The function returns immediately when BIC_cpuidle is not enabled,
   without checking deferred_add_index.
2. The deferred name check runs against name_buf before the trailing
   newline is stripped, so is_deferred_add("C1\n") never matches "C1".

Fix:
1. Relax the early return to pass through when deferred names are
   queued.
2. Strip the trailing newline from name_buf before performing deferred
   name checks.
3. Check each suffixed variant (C1+, C1, C1-) individually so that
   e.g. "--show C1+" enables only the requested metric.

In addition, introduce a helper function to avoid repeating the
condition (readability cleanup).
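
The name handling described above can be sketched in isolation. This is
an illustration under stated assumptions, not the turbostat
implementation: the requested[] table stands in for the deferred-add
list, and name_requested() and wanted_variants() are hypothetical
helpers. It cuts the name at the first '-' or newline in one step,
which is a simplification of the strchr() sequence in the patch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the names queued by '--show C1+,C6'. */
static const char *const requested[] = { "C1+", "C6" };

bool name_requested(const char *name)
{
	for (size_t i = 0; i < sizeof(requested) / sizeof(requested[0]); i++)
		if (strcmp(requested[i], name) == 0)
			return true;
	return false;
}

/*
 * Return a bitmask of the requested variants for one sysfs state name:
 * bit 0 = "<name>+", bit 1 = "<name>", bit 2 = "<name>-".
 */
int wanted_variants(const char *sysfs_name)
{
	static const char *const suffix[] = { "+", "", "-" };
	char base[16], variant[18];
	int mask = 0;

	/* Copy, then cut at the first '-' or '\n': "C1-HSW\n" and "C1\n" become "C1". */
	snprintf(base, sizeof(base), "%s", sysfs_name);
	base[strcspn(base, "-\n")] = '\0';

	/* Test each suffixed variant on its own, so only requested ones match. */
	for (int i = 0; i < 3; i++) {
		snprintf(variant, sizeof(variant), "%s%s", base, suffix[i]);
		if (name_requested(variant))
			mask |= 1 << i;
	}
	return mask;
}
```

A raw name such as "C6\n" does not match "C6" until the newline is
removed, which mirrors bug 2 above; checking each variant separately
means a request for "C1+" does not also enable "C1" or "C1-".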

Fixes: ec4acd3166d8 ("tools/power turbostat: disable "cpuidle" invocation counters, by default")
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 35 ++++++++++++++++-----------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 4d954533c71d..3487548841e1 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -11285,6 +11285,14 @@ void probe_cpuidle_residency(void)
 	}
 }
 
+static bool cpuidle_counter_wanted(char *name)
+{
+	if (is_deferred_skip(name))
+		return false;
+
+	return DO_BIC(BIC_cpuidle) || is_deferred_add(name);
+}
+
 void probe_cpuidle_counts(void)
 {
 	char path[64];
@@ -11294,7 +11302,7 @@ void probe_cpuidle_counts(void)
 	int min_state = 1024, max_state = 0;
 	char *sp;
 
-	if (!DO_BIC(BIC_cpuidle))
+	if (!DO_BIC(BIC_cpuidle) && !deferred_add_index)
 		return;
 
 	for (state = 10; state >= 0; --state) {
@@ -11309,12 +11317,6 @@ void probe_cpuidle_counts(void)
 
 		remove_underbar(name_buf);
 
-		if (!DO_BIC(BIC_cpuidle) && !is_deferred_add(name_buf))
-			continue;
-
-		if (is_deferred_skip(name_buf))
-			continue;
-
 		/* truncate "C1-HSW\n" to "C1", or truncate "C1\n" to "C1" */
 		sp = strchr(name_buf, '-');
 		if (!sp)
@@ -11329,16 +11331,19 @@ void probe_cpuidle_counts(void)
 			 * Add 'C1+' for C1, and so on. The 'below' sysfs file always contains 0 for
 			 * the last state, so do not add it.
 			 */
-
 			*sp = '+';
 			*(sp + 1) = '\0';
-			sprintf(path, "cpuidle/state%d/below", state);
-			add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+			if (cpuidle_counter_wanted(name_buf)) {
+				sprintf(path, "cpuidle/state%d/below", state);
+				add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+			}
 		}
 
 		*sp = '\0';
-		sprintf(path, "cpuidle/state%d/usage", state);
-		add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+		if (cpuidle_counter_wanted(name_buf)) {
+			sprintf(path, "cpuidle/state%d/usage", state);
+			add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+		}
 
 		/*
 		 * The 'above' sysfs file always contains 0 for the shallowest state (smallest
@@ -11347,8 +11352,10 @@ void probe_cpuidle_counts(void)
 		if (state != min_state) {
 			*sp = '-';
 			*(sp + 1) = '\0';
-			sprintf(path, "cpuidle/state%d/above", state);
-			add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+			if (cpuidle_counter_wanted(name_buf)) {
+				sprintf(path, "cpuidle/state%d/above", state);
+				add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+			}
 		}
 	}
 }
-- 
2.45.2



* [PATCH 6/9] tools/power turbostat: Fix incorrect format variable
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
  To: linux-pm; +Cc: Artem Bityutskiy, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

In the perf thread, core, and package counter loops, an incorrect
'mp->format' variable is used instead of 'pp->format'.

[lenb: edit commit message]
Fixes: 696d15cbd8c2 ("tools/power turbostat: Refactor floating point printout code")
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 9744f9caac9a..4d954533c71d 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -3468,7 +3468,7 @@ int format_counters(PER_THREAD_PARAMS)
 	for (i = 0, pp = sys.perf_tp; pp; ++i, pp = pp->next) {
 		if (pp->format == FORMAT_RAW)
 			outp += print_hex_value(pp->width, &printed, delim, t->perf_counter[i]);
-		else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
+		else if (pp->format == FORMAT_DELTA || pp->format == FORMAT_AVERAGE)
 			outp += print_decimal_value(pp->width, &printed, delim, t->perf_counter[i]);
 		else if (pp->format == FORMAT_PERCENT) {
 			if (pp->type == COUNTER_USEC)
@@ -3538,7 +3538,7 @@ int format_counters(PER_THREAD_PARAMS)
 	for (i = 0, pp = sys.perf_cp; pp; i++, pp = pp->next) {
 		if (pp->format == FORMAT_RAW)
 			outp += print_hex_value(pp->width, &printed, delim, c->perf_counter[i]);
-		else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
+		else if (pp->format == FORMAT_DELTA || pp->format == FORMAT_AVERAGE)
 			outp += print_decimal_value(pp->width, &printed, delim, c->perf_counter[i]);
 		else if (pp->format == FORMAT_PERCENT)
 			outp += print_float_value(&printed, delim, pct(c->perf_counter[i], tsc));
@@ -3694,7 +3694,7 @@ int format_counters(PER_THREAD_PARAMS)
 			outp += print_hex_value(pp->width, &printed, delim, p->perf_counter[i]);
 		else if (pp->type == COUNTER_K2M)
 			outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), (unsigned int)p->perf_counter[i] / 1000);
-		else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
+		else if (pp->format == FORMAT_DELTA || pp->format == FORMAT_AVERAGE)
 			outp += print_decimal_value(pp->width, &printed, delim, p->perf_counter[i]);
 		else if (pp->format == FORMAT_PERCENT)
 			outp += print_float_value(&printed, delim, pct(p->perf_counter[i], tsc));
-- 
2.45.2



* [PATCH 5/9] tools/power turbostat: Consistently use print_float_value()
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
  To: linux-pm; +Cc: Artem Bityutskiy, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

Fix the PMT thread code to use print_float_value(),
to be consistent with the PMT core and package code.

[lenb: commit message]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index b985bce69142..9744f9caac9a 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -3489,12 +3489,12 @@ int format_counters(PER_THREAD_PARAMS)
 
 		case PMT_TYPE_XTAL_TIME:
 			value_converted = pct(value_raw / crystal_hz, interval_float);
-			outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted);
+			outp += print_float_value(&printed, delim, value_converted);
 			break;
 
 		case PMT_TYPE_TCORE_CLOCK:
 			value_converted = pct(value_raw / tcore_clock_freq_hz, interval_float);
-			outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted);
+			outp += print_float_value(&printed, delim, value_converted);
 		}
 	}
 
-- 
2.45.2



* [PATCH 4/9] tools/power/turbostat: Fix microcode patch level output for AMD/Hygon
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
  To: linux-pm; +Cc: Serhii Pievniev, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>

From: Serhii Pievniev <spevnev16@gmail.com>

turbostat always used the same logic to read the microcode patch level,
which is correct for Intel but not for AMD/Hygon.
While Intel stores the patch level in the upper 32 bits of the MSR, AMD
stores it in the lower 32 bits, which causes turbostat to report the
microcode version as 0x0 on AMD/Hygon.

Fix by shifting right by 32 for non-AMD/Hygon, preserving the existing
behavior for Intel and unknown vendors.
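
As a rough sketch of the vendor-dependent extraction, assuming the raw
64-bit MSR value has already been read: ucode_patch_level() and its
is_amd_or_hygon parameter are hypothetical names standing in for
turbostat's vendor flags, and any MSR values passed to it in an example
would be invented.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Extract the 32-bit patch level from the raw microcode revision MSR value.
 * Intel (and unknown vendors): the patch level is in bits 63:32.
 * AMD/Hygon: the patch level is in bits 31:0.
 */
uint32_t ucode_patch_level(uint64_t msr_value, bool is_amd_or_hygon)
{
	if (!is_amd_or_hygon)
		msr_value >>= 32;

	return (uint32_t)msr_value;
}
```

Shifting an AMD-style value right by 32 leaves zero, which is the 0x0
that the old code printed.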

Fixes: 3e4048466c39 ("tools/power turbostat: Add --no-msr option")
Signed-off-by: Serhii Pievniev <spevnev16@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 14021a6ed717..b985bce69142 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -9121,10 +9121,13 @@ void process_cpuid()
 	cpuid_has_hv = ecx_flags & (1 << 31);
 
 	if (!no_msr) {
-		if (get_msr(sched_getcpu(), MSR_IA32_UCODE_REV, &ucode_patch))
+		if (get_msr(sched_getcpu(), MSR_IA32_UCODE_REV, &ucode_patch)) {
 			warnx("get_msr(UCODE)");
-		else
+		} else {
 			ucode_patch_valid = true;
+			if (!authentic_amd && !hygon_genuine)
+				ucode_patch >>= 32;
+		}
 	}
 
 	/*
@@ -9138,7 +9141,7 @@ void process_cpuid()
 	if (!quiet) {
 		fprintf(outf, "CPUID(1): family:model:stepping 0x%x:%x:%x (%d:%d:%d)", family, model, stepping, family, model, stepping);
 		if (ucode_patch_valid)
-			fprintf(outf, " microcode 0x%x", (unsigned int)((ucode_patch >> 32) & 0xFFFFFFFF));
+			fprintf(outf, " microcode 0x%x", (unsigned int)ucode_patch);
 		fputc('\n', outf);
 
 		fprintf(outf, "CPUID(0x80000000): max_extended_levels: 0x%x\n", max_extended_level);
-- 
2.45.2



* [PATCH 3/9] tools/power turbostat: Eliminate unnecessary data structure allocation
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
  To: linux-pm; +Cc: Zhang Rui, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>

From: Zhang Rui <rui.zhang@intel.com>

Linux core_id's are a per-package namespace, not a per-node namespace.

Rename topo.cores_per_node to topo.cores_per_pkg to reflect this.

Eliminate topo.nodes_per_pkg from the sizing for core data structures,
since it has no role except to unnecessarily bloat the allocation.
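
The effect on the allocation size can be sketched as follows. The
struct is a hypothetical, reduced stand-in for turbostat's topo_params,
and the function names are made up for illustration.

```c
/* Hypothetical, reduced stand-in for turbostat's struct topo_params. */
struct topo_sketch {
	int max_core_id;	/* largest core_id seen within a package */
	int nodes_per_pkg;
	int num_packages;
};

/* Old sizing: multiplies by nodes_per_pkg although core_ids are per package. */
int num_cores_old(const struct topo_sketch *t)
{
	return (t->max_core_id + 1) * t->nodes_per_pkg * t->num_packages;
}

/* New sizing: one slot per possible core_id in each package. */
int num_cores_new(const struct topo_sketch *t)
{
	return (t->max_core_id + 1) * t->num_packages;
}
```

For an invented system with max_core_id 63, two nodes per package and
two packages, the old formula sizes for 256 cores and the new one for
128.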

Validated on multiple Intel platforms (ICX/SPR/SRF/EMR/GNR/CWF) with
various CPU online/offline configurations and SMT enabled/disabled
scenarios.

No functional changes.

[lenb: commit message]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 791b9154f662..14021a6ed717 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -2409,7 +2409,7 @@ struct topo_params {
 	int max_l3_id;
 	int max_node_num;
 	int nodes_per_pkg;
-	int cores_per_node;
+	int cores_per_pkg;
 	int threads_per_core;
 } topo;
 
@@ -9633,9 +9633,9 @@ void topology_probe(bool startup)
 	topo.max_core_id = max_core_id;	/* within a package */
 	topo.max_package_id = max_package_id;
 
-	topo.cores_per_node = max_core_id + 1;
+	topo.cores_per_pkg = max_core_id + 1;
 	if (debug > 1)
-		fprintf(outf, "max_core_id %d, sizing for %d cores per package\n", max_core_id, topo.cores_per_node);
+		fprintf(outf, "max_core_id %d, sizing for %d cores per package\n", max_core_id, topo.cores_per_pkg);
 	if (!summary_only)
 		BIC_PRESENT(BIC_Core);
 
@@ -9700,7 +9700,7 @@ void allocate_counters_1(struct counters *counters)
 void allocate_counters(struct counters *counters)
 {
 	int i;
-	int num_cores = topo.cores_per_node * topo.nodes_per_pkg * topo.num_packages;
+	int num_cores = topo.cores_per_pkg * topo.num_packages;
 
 	counters->threads = calloc(topo.max_cpu_num + 1, sizeof(struct thread_data));
 	if (counters->threads == NULL)
-- 
2.45.2



* [PATCH 2/9] tools/power turbostat: Fix swidle header vs data display
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
  To: linux-pm; +Cc: Len Brown, Artem Bityutskiy
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>

From: Len Brown <len.brown@intel.com>

I changed my mind about displaying swidle statistics,
which are "added counters".  Recently I reverted the
column headers to 8-columns, but kept print_decimal_value()
padding out to 16-columns for all 64-bit counters.

Simplify by keeping print_decimal_value() at %lld -- which
will often fit into 8-columns, and live with the fact
that it can overflow and shift the other columns,
which continue to be tab-delimited.

This is a better compromise than inserting a bunch
of space characters that most users don't like.
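
The two formats can be compared with a small sketch. format_padded()
and format_plain() are hypothetical wrappers written for this
illustration; only the "%-8lld" and "%lld" format strings come from the
patch.

```c
#include <stdio.h>

/* Old 64-bit path: left-justify and pad the value to at least 8 characters. */
int format_padded(char *out, size_t len, long long value)
{
	return snprintf(out, len, "%-8lld", value);
}

/* New path: print the value with no padding. */
int format_plain(char *out, size_t len, long long value)
{
	return snprintf(out, len, "%lld", value);
}
```

Assuming a terminal with 8-column tab stops, a field padded to 8
characters followed by a tab occupies 16 columns, while an unpadded
value of up to 7 digits plus the tab fits in 8. A longer value, such as
a 10-digit count, spills past the tab stop and shifts the columns that
follow.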

Fixes: 1a23ba6a1ba2 ("tools/power turbostat: Print wide names only for RAW 64-bit columns")
Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index ae827485950d..791b9154f662 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -2852,10 +2852,9 @@ static inline int print_hex_value(int width, int *printed, char *delim, unsigned
 
 static inline int print_decimal_value(int width, int *printed, char *delim, unsigned long long value)
 {
-	if (width <= 32)
-		return (sprintf(outp, "%s%d", (*printed++ ? delim : ""), (unsigned int)value));
-	else
-		return (sprintf(outp, "%s%-8lld", (*printed++ ? delim : ""), value));
+	UNUSED(width);
+
+	return (sprintf(outp, "%s%lld", (*printed++ ? delim : ""), value));
 }
 
 static inline int print_float_value(int *printed, char *delim, double value)
-- 
2.45.2



* [PATCH 1/9] tools/power turbostat: Fix illegal memory access when SMT is present and disabled
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
  To: linux-pm; +Cc: Zhang Rui, Len Brown
In-Reply-To: <20260410132836.398255-1-lenb@kernel.org>

From: Zhang Rui <rui.zhang@intel.com>

When SMT is present and disabled, turbostat may under-size
the thread_data array.  This can corrupt results or
cause turbostat to exit with a segmentation fault.
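
A minimal sketch of the sizing problem, assuming the array is indexed
by CPU number (as the use of topo.max_cpu_num + 1 in the fix suggests).
struct thread_sketch and the three functions are hypothetical stand-ins,
not turbostat code, and the CPU numbering in the note below is an
invented example.

```c
#include <stdlib.h>

/* Hypothetical, reduced stand-in for turbostat's struct thread_data. */
struct thread_sketch {
	int cpu_id;
};

/* Old sizing: threads per core times number of cores. */
size_t thread_count_old(int threads_per_core, int num_cores)
{
	return (size_t)threads_per_core * (size_t)num_cores;
}

/* New sizing: one slot for every CPU number from 0 to max_cpu_num. */
size_t thread_count_new(int max_cpu_num)
{
	return (size_t)max_cpu_num + 1;
}

/* Allocate the array and mark every slot unused; the caller frees the result. */
struct thread_sketch *alloc_threads(int max_cpu_num)
{
	size_t n = thread_count_new(max_cpu_num);
	struct thread_sketch *t = calloc(n, sizeof(*t));

	if (t == NULL)
		return NULL;

	for (size_t i = 0; i < n; i++)
		t[i].cpu_id = -1;

	return t;
}
```

Suppose 4 cores with SMT disabled, so one thread per core is seen, and
the remaining CPUs are numbered 0, 2, 4 and 6. The old formula gives 4
slots, yet CPU 6 needs index 6; sizing by max_cpu_num + 1 gives 7 slots
and keeps every CPU number in bounds.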

[lenb: commit message]
Fixes: a2b4d0f8bf07 ("tools/power turbostat: Favor cpu# over core#")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 1a2671c28209..ae827485950d 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -9702,13 +9702,12 @@ void allocate_counters(struct counters *counters)
 {
 	int i;
 	int num_cores = topo.cores_per_node * topo.nodes_per_pkg * topo.num_packages;
-	int num_threads = topo.threads_per_core * num_cores;
 
-	counters->threads = calloc(num_threads, sizeof(struct thread_data));
+	counters->threads = calloc(topo.max_cpu_num + 1, sizeof(struct thread_data));
 	if (counters->threads == NULL)
 		goto error;
 
-	for (i = 0; i < num_threads; i++)
+	for (i = 0; i < topo.max_cpu_num + 1; i++)
 		(counters->threads)[i].cpu_id = -1;
 
 	counters->cores = calloc(num_cores, sizeof(struct core_data));
-- 
2.45.2



