Linux Documentation
* [PATCH v5 1/2] cpufreq: CPPC: Set CPPC Enable register in cpu_init
From: Sumit Gupta @ 2026-06-23  8:06 UTC (permalink / raw)
  To: rafael, viresh.kumar, pierre.gondois, ionela.voinescu,
	zhenglifeng1, zhanjie9, corbet, skhan, rdunlap, mario.limonciello,
	linux-kernel, linux-pm, linux-doc, linux-tegra
  Cc: treding, jonathanh, vsethi, ksitaraman, sanjayc, mochs, bbasu,
	sumitg
In-Reply-To: <20260623080652.3353386-1-sumitg@nvidia.com>

As per ACPI 6.x s8.4.6.1.4 (CPPC Enable register):
  "If supported by the platform, OSPM writes a one to this register
   to enable CPPC on this processor. If not implemented, OSPM assumes
   the platform always has CPPC enabled."

Call cppc_set_enable() at the start of cppc_cpufreq_cpu_init() so
this is done for both OS-driven and autonomous CPPC control modes.
Errors are logged but non-fatal as the register is optional.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
---
 drivers/cpufreq/cppc_cpufreq.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index f6cea0c54dd9..f7a47576717a 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -655,6 +655,14 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 	caps = &cpu_data->perf_caps;
 	policy->driver_data = cpu_data;
 
+	/*
+	 * Enable CPPC for both OS-driven and autonomous modes.
+	 * The Enable register is optional - some platforms may not support it
+	 */
+	ret = cppc_set_enable(cpu, true);
+	if (ret && ret != -EOPNOTSUPP)
+		pr_warn("Failed to enable CPPC for CPU%d (%d)\n", cpu, ret);
+
 	/*
 	 * Set min to lowest nonlinear perf to avoid any efficiency penalty (see
 	 * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
-- 
2.34.1
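
The error policy described in the commit message (warn on real failures,
stay silent when the optional register is absent, never fail init) can be
sketched in userspace C. This is only an illustration: try_enable_cppc()
and the enable_* callbacks are hypothetical stand-ins, not the kernel's
cppc_set_enable().

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-CPU "write 1 to CPPC Enable" call. */
typedef int (*enable_fn)(int cpu);

/*
 * Try to enable CPPC on one CPU. The caller's init path never fails
 * because of this step. Returns 1 if a warning was logged, 0 otherwise.
 */
static int try_enable_cppc(int cpu, enable_fn enable)
{
	int ret = enable(cpu);

	/* -EOPNOTSUPP means the optional register is not implemented. */
	if (ret && ret != -EOPNOTSUPP) {
		fprintf(stderr, "Failed to enable CPPC for CPU%d (%d)\n",
			cpu, ret);
		return 1;
	}
	return 0;
}

/* Illustrative back ends covering the three interesting outcomes. */
static int enable_ok(int cpu) { (void)cpu; return 0; }
static int enable_unsupported(int cpu) { (void)cpu; return -EOPNOTSUPP; }
static int enable_io_error(int cpu) { (void)cpu; return -EIO; }
```

Only the -EIO style case produces a warning; success and "not supported"
are both treated as CPPC being usable.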



* [PATCH v5 0/2] cpufreq: CPPC: add autonomous mode boot parameter support
From: Sumit Gupta @ 2026-06-23  8:06 UTC (permalink / raw)
  To: rafael, viresh.kumar, pierre.gondois, ionela.voinescu,
	zhenglifeng1, zhanjie9, corbet, skhan, rdunlap, mario.limonciello,
	linux-kernel, linux-pm, linux-doc, linux-tegra
  Cc: treding, jonathanh, vsethi, ksitaraman, sanjayc, mochs, bbasu,
	sumitg

This series adds a kernel boot parameter 'cppc_cpufreq.auto_sel_mode'
to enable CPPC autonomous performance selection on all CPUs at system
startup, avoiding per-CPU sysfs scripting at every boot.

When autonomous mode is enabled, the hardware automatically adjusts
CPU performance based on workload demands using Energy Performance
Preference (EPP) hints.

Patch 1: Sets CPPC Enable Register for both OS-driven and autonomous
CPPC control modes. It can be applied independently of patch 2.

Patch 2: Adds the auto_sel_mode boot parameter with three modes:
  - performance (or 1):         override EPP to performance (0x0)
  - balance_performance (or 2): override EPP to balance_performance (0x80)
  - default_epp (or 3):         preserve EPP value programmed by
                                BIOS/firmware
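
The mode table above can be sketched as a small userspace parser. This is
an illustration under assumptions, not the code from patch 2: the enum,
the function names and the EPP_* macros are hypothetical, and only the
accepted strings and EPP values come from this cover letter (including
the v5 rule that "disabled", "0" and unrecognized values mean disabled).

```c
#include <string.h>

/* Hypothetical mode identifiers; numeric values follow the cover letter. */
enum auto_sel_mode {
	AUTO_SEL_DISABLED = 0,
	AUTO_SEL_PERFORMANCE = 1,
	AUTO_SEL_BALANCE_PERFORMANCE = 2,
	AUTO_SEL_DEFAULT_EPP = 3,
};

#define EPP_PERFORMANCE         0x00u
#define EPP_BALANCE_PERFORMANCE 0x80u

/* Map a boot parameter string to a mode; anything unknown is disabled. */
static enum auto_sel_mode parse_auto_sel_mode(const char *arg)
{
	if (!strcmp(arg, "performance") || !strcmp(arg, "1"))
		return AUTO_SEL_PERFORMANCE;
	if (!strcmp(arg, "balance_performance") || !strcmp(arg, "2"))
		return AUTO_SEL_BALANCE_PERFORMANCE;
	if (!strcmp(arg, "default_epp") || !strcmp(arg, "3"))
		return AUTO_SEL_DEFAULT_EPP;
	return AUTO_SEL_DISABLED;
}

/*
 * Returns 1 and stores the EPP value when the mode overrides EPP.
 * Returns 0 when EPP is left untouched (disabled or default_epp).
 */
static int auto_sel_epp_override(enum auto_sel_mode mode, unsigned int *epp)
{
	switch (mode) {
	case AUTO_SEL_PERFORMANCE:
		*epp = EPP_PERFORMANCE;
		return 1;
	case AUTO_SEL_BALANCE_PERFORMANCE:
		*epp = EPP_BALANCE_PERFORMANCE;
		return 1;
	default:
		return 0;
	}
}
```

Keeping the "does this mode override EPP" decision separate from parsing
mirrors the switch-plus-boolean cleanup mentioned in the v4 changelog.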

Patch 2 relies on commit 8c83947c5dbb ("cpufreq: Use policy->min/max
init as QoS request") so that policy->min/max set during
cppc_cpufreq_cpu_init() are not overridden by cpufreq_set_policy().

v4[4] -> v5:
- Accept "disabled/0" and treat unrecognized auto_sel_mode as disabled.
- Rebased on the merged QoS-constraints and updated commit dependency.

v3[3] -> v4:
- Add 'balance_performance' mode which sets EPP to 0x80.
- Add CPPC_EPP_BALANCE_PERFORMANCE_PREF (0x80) constant in cppc_acpi.h.
- Clean up EPP mode selection with switch + boolean flag in cpu_init.
- Use local variable for kp->arg in auto_sel_mode_set/get to avoid
  repeated casts.


Sumit Gupta (2):
  cpufreq: CPPC: Set CPPC Enable register in cpu_init
  cpufreq: CPPC: add autonomous mode boot parameter support

 .../admin-guide/kernel-parameters.txt         |  22 +++
 drivers/cpufreq/cppc_cpufreq.c                | 159 +++++++++++++++++-
 include/acpi/cppc_acpi.h                      |   1 +
 3 files changed, 177 insertions(+), 5 deletions(-)

[1] v1: https://lore.kernel.org/lkml/20260317151053.2361475-1-sumitg@nvidia.com/
[2] v2: https://lore.kernel.org/lkml/20260424201814.230071-1-sumitg@nvidia.com/
[3] v3: https://lore.kernel.org/lkml/20260515122624.1920637-1-sumitg@nvidia.com/
[4] v4: https://lore.kernel.org/lkml/20260527202550.206828-1-sumitg@nvidia.com/

-- 
2.34.1



* Re: [PATCH v2 1/3] dt-bindings: watchdog: npcm: add GCR syscon property
From: Krzysztof Kozlowski @ 2026-06-23  8:05 UTC (permalink / raw)
  To: Tomer Maimon
  Cc: andrew, wim, linux, robh, krzk+dt, conor+dt, openbmc,
	linux-watchdog, linux-doc, devicetree, linux-kernel, avifishman70,
	tali.perry1, venture, yuenn, benjaminfair, corbet, skhan, joel
In-Reply-To: <20260622083046.3189603-2-tmaimon77@gmail.com>

On Mon, Jun 22, 2026 at 11:30:44AM +0300, Tomer Maimon wrote:
> Describe syscon property that handles general control registers (GCR) in
> Nuvoton BMC NPCM watchdog driver.

Why? Well, you try to answer by saying something about driver, but we do
not add bindings for drivers. Instead hardware should be the reason.

Anyway, why is this needed now?

> 
> Signed-off-by: Tomer Maimon <tmaimon77@gmail.com>
> ---
>  .../devicetree/bindings/watchdog/nuvoton,npcm750-wdt.yaml   | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/watchdog/nuvoton,npcm750-wdt.yaml b/Documentation/devicetree/bindings/watchdog/nuvoton,npcm750-wdt.yaml
> index 7aa30f5b5c49..4f00f099b2d2 100644
> --- a/Documentation/devicetree/bindings/watchdog/nuvoton,npcm750-wdt.yaml
> +++ b/Documentation/devicetree/bindings/watchdog/nuvoton,npcm750-wdt.yaml
> @@ -40,6 +40,12 @@ properties:
>    clock-frequency:
>      description: Frequency in Hz of the clock that drives the NPCM timer.
>  
> +  nuvoton,sysgcr:
> +    $ref: /schemas/types.yaml#/definitions/phandle
> +    description:
> +      a phandle to access GCR registers on NPCM750 and NPCM845 watchdog
> +      instances.

Here you write also for what purpose.


Best regards,
Krzysztof



* htmldocs: Documentation/core-api/list:775: ./include/linux/list.h:793: WARNING: Definition list ends without a blank line; unexpected unindent. [docutils]
From: kernel test robot @ 2026-06-23  8:00 UTC (permalink / raw)
  To: Kaitao Cheng; +Cc: oe-kbuild-all, 0day robot, linux-doc

tree:   https://github.com/intel-lab-lkp/linux/commits/Kaitao-Cheng/list-Add-mutable-iterator-variants/20260622-193620
head:   72faf2855f60a21ac98d3e63d13a20dc1db9d8a1
commit: 7858c2cc567eace0010e1727aabb1967f14c98f8 list: Add mutable iterator variants
date:   20 hours ago
compiler: clang version 22.1.8 (https://github.com/llvm/llvm-project ca7933e47d3a3451d81e72ac174dcb5aa28b59d1)
docutils: docutils (Docutils 0.21.2, Python 3.13.5, on linux)
reproduce: (https://download.01.org/0day-ci/archive/20260623/202606230940.yeWFIO56-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202606230940.yeWFIO56-lkp@intel.com/

All warnings (new ones prefixed by >>):

   1             automatic adjustment of input current limit
   0             no adjustment of input current limit. This
   helps for more unusual power sources like
   solar modules. [docutils]
   WARNING: ./block/blk-map.c:366 Excess function parameter 'op' description in 'bio_copy_kern'
>> Documentation/core-api/list:775: ./include/linux/list.h:793: WARNING: Definition list ends without a blank line; unexpected unindent. [docutils]
   Documentation/core-api/list:775: ./include/linux/list.h:826: WARNING: Definition list ends without a blank line; unexpected unindent. [docutils]
   Documentation/core-api/list:775: ./include/linux/list.h:971: WARNING: Definition list ends without a blank line; unexpected unindent. [docutils]
   Documentation/core-api/list:775: ./include/linux/list.h:1009: WARNING: Definition list ends without a blank line; unexpected unindent. [docutils]
   Documentation/core-api/list:775: ./include/linux/list.h:1048: WARNING: Definition list ends without a blank line; unexpected unindent. [docutils]
   Documentation/core-api/list:775: ./include/linux/list.h:1089: WARNING: Definition list ends without a blank line; unexpected unindent. [docutils]

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Issue cloning kernel-doc-zh from HUST mirror
From: Siwei Chen @ 2026-06-23  7:39 UTC (permalink / raw)
  To: linux-doc; +Cc: si.yanteng, dzm91, wy

Hello,

I am following the documentation at:

https://docs.kernel.org/translations/zh_CN/how-to.html#id3

When trying to clone the repository from the recommended mirror:

git clone https://mirrors.hust.edu.cn/git/kernel-doc-zh.git linux

I consistently get the following error:

error: RPC failed; curl 52 Empty reply from server
fatal: expected 'packfile'

My environment is:

Ubuntu 26.04
git version 2.53

I have verified that the URL is reachable from my network, but the clone 
operation still fails.

Could anyone help me understand whether this is a mirror-side issue, a Git 
compatibility issue, or something wrong with my setup?

Thank you for your time.

Best regards,
Siwei Chen




* Re: [PATCH v8 13/46] KVM: guest_memfd: Add base support for KVM_SET_MEMORY_ATTRIBUTES2
From: Binbin Wu @ 2026-06-23  7:38 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-13-9d2959357853@google.com>

On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
> From: Ackerley Tng <ackerleytng@google.com>
> 
> Introduce base support for KVM_SET_MEMORY_ATTRIBUTES2 in guest_memfd, which
> just updates attributes tracked by guest_memfd.
> 
> Validate input fields in general. Guard usage of KVM_SET_MEMORY_ATTRIBUTES2
> by making sure requested attributes are supported for this instance of kvm.
> 
> A new KVM_SET_MEMORY_ATTRIBUTES2 is defined to support writes (unlike
> KVM_SET_MEMORY_ATTRIBUTES) in addition to reads so it can provide error
> details to userspace. This will be used in a later patch.
> 
> The two ioctls use their corresponding structs with no overlap, but
> backward compatibility is baked in for future support of
> KVM_SET_MEMORY_ATTRIBUTES2 and struct kvm_memory_attributes2 in the VM
> ioctl.
> 
> The process of setting memory attributes is set up such that the later half
> will not fail due to allocation. Any necessary checks are performed before
> the point of no return.
> 
> Co-developed-by: Vishal Annapurve <vannapurve@google.com>
> Signed-off-by: Vishal Annapurve <vannapurve@google.com>
> Co-developed-by: Sean Christoperson <seanjc@google.com>
> Signed-off-by: Sean Christoperson <seanjc@google.com>

s/Christoperson /Christopherson

> Reviewed-by: Fuad Tabba <tabba@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> ---
>  include/uapi/linux/kvm.h |  13 ++++++
>  virt/kvm/Kconfig         |   1 +
>  virt/kvm/guest_memfd.c   | 116 +++++++++++++++++++++++++++++++++++++++++++++++
>  virt/kvm/kvm_main.c      |  12 +++++
>  4 files changed, 142 insertions(+)
> 
>

[...]

> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index 297e4399fbd49..cfa2c78ba5fb9 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -102,6 +102,7 @@ config KVM_MMU_LOCKLESS_AGING
>  
>  config KVM_GUEST_MEMFD
>         select XARRAY_MULTI
> +       select KVM_MEMORY_ATTRIBUTES

What's this?
This config is gone.

>         bool
>  


* Re: [PATCH v8 12/46] KVM: guest_memfd: Only prepare folios for private pages
From: Binbin Wu @ 2026-06-23  6:48 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-12-9d2959357853@google.com>

On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
> From: Ackerley Tng <ackerleytng@google.com>
> 
> All-shared guest_memfd used to be only supported for non-CoCo VMs where
> preparation doesn't apply. INIT_SHARED is about to be supported for CoCo
> VMs in a later patch in this series.
> 
> In addition, KVM_SET_MEMORY_ATTRIBUTES2 is about to be supported in
> guest_memfd in a later patch in this series.
> 
> This means that the kvm fault handler may now call kvm_gmem_get_pfn() on a
> shared folio for a CoCo VM where preparation applies.
> 
> Add a check to make sure that preparation is only performed for private
> folios.
> 
> Preparation will be undone on freeing (see kvm_gmem_free_folio()) and on
> conversion to shared.
> 
> Suggested-by: Michael Roth <michael.roth@amd.com>
> Reviewed-by: Fuad Tabba <tabba@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>



* Re: [PATCH v8 11/46] KVM: Consolidate private memory and guest_memfd ifdeffery in kvm_host.h
From: Binbin Wu @ 2026-06-23  6:19 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-11-9d2959357853@google.com>

On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> Move the kvm_arch_has_private_mem() stub and a few guest_memfd function
> definitions/declarations "down" in kvm_host.h to utilize existing #ifdefs,
> and so that related code is clustered together.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

After fixing SoB ...

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>



* Re: [PATCH v8 10/46] KVM: guest_memfd: Wire up core private/shared attribute interfaces
From: Binbin Wu @ 2026-06-23  6:15 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-10-9d2959357853@google.com>

On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:

[...]

> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index bca912db5be6e..e0e544ef47d69 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -926,6 +926,24 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>  EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_gmem_get_pfn);
>  
>  #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_POPULATE
> +static bool kvm_gmem_range_is_private(struct file *file, pgoff_t index,
> +				      size_t nr_pages, struct kvm *kvm, gfn_t gfn)
> +{
> +	struct maple_tree *mt = &GMEM_I(file_inode(file))->attributes;
> +	pgoff_t end = index + nr_pages - 1;
> +	void *entry;
> +
> +	if (!gmem_in_place_conversion)
> +		return kvm_range_has_vm_memory_attributes(kvm, gfn, gfn + nr_pages,
> +							  KVM_MEMORY_ATTRIBUTE_PRIVATE,
> +							  KVM_MEMORY_ATTRIBUTE_PRIVATE);
> +
> +	mt_for_each(mt, entry, index, end) {
> +		if (xa_to_value(entry) != KVM_MEMORY_ATTRIBUTE_PRIVATE)
> +			return false;
> +	}

Patch 1 notes "Ensuring every index is represented in the maple tree at all times",
so I think the queried range should never contain a hole in the maple tree.
However, there is an inconsistency: in patch 1, kvm_gmem_get_attributes()
explicitly checks for holes, but this patch does not.

> +	return true;
> +}
>  
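
To make the hole concern concrete, here is a userspace sketch of a range
check that treats an unrepresented index as "not private". It is not the
maple tree API: struct attr_range, ATTR_PRIVATE and range_is_private() are
hypothetical, and the attribute store is modeled as a sorted array of
non-overlapping inclusive ranges.

```c
#include <stdbool.h>
#include <stddef.h>

#define ATTR_PRIVATE 1UL	/* hypothetical attribute value */

/* One stored entry: indices first..last (inclusive) share one attribute. */
struct attr_range {
	unsigned long first;
	unsigned long last;
	unsigned long attr;
};

/*
 * Return true only if every index in [index, index + nr_pages - 1] is
 * covered by an entry AND every covering entry is private. Entries must
 * be sorted by 'first' and must not overlap.
 */
static bool range_is_private(const struct attr_range *r, size_t n,
			     unsigned long index, unsigned long nr_pages)
{
	unsigned long end, next = index;
	size_t i;

	if (!nr_pages)
		return false;
	end = index + nr_pages - 1;

	for (i = 0; i < n; i++) {
		if (r[i].last < index)
			continue;	/* entirely before the query */
		if (r[i].first > end)
			break;		/* entirely after the query */
		if (r[i].first > next)
			return false;	/* hole before this entry */
		if (r[i].attr != ATTR_PRIVATE)
			return false;
		if (r[i].last >= end)
			return true;	/* query fully covered */
		next = r[i].last + 1;
	}
	return false;			/* ran out of entries: trailing hole */
}
```

A loop that only inspects the entries it finds would return true for a
range made up entirely of holes; tracking the next expected index is what
closes that gap in this sketch.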


* Re: [PATCH 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks()
From: Waiman Long @ 2026-06-23  5:58 UTC (permalink / raw)
  To: Ridong Chen, Tejun Heo, Johannes Weiner, Michal Koutný,
	Jonathan Corbet, Shuah Khan
  Cc: cgroups, linux-kernel, linux-doc
In-Reply-To: <e24b8145-7a67-4cc0-8ba0-24bd89243c04@linux.dev>

On 6/22/26 9:14 PM, Ridong Chen wrote:
>
>
> On 6/23/2026 6:45 AM, Waiman Long wrote:
>> As reported by sashiko [1], cpuset_hotplug_update_tasks() may perform
>> unnecessary task iteration and updating of tasks' CPU and node masks
>> when mems_allowed and/or cpus_allowed are not set in cpuset v2. It is
>> due to the fact that the temporary new_cpus and new_mems masks do not
>> inherit parent's effective_cpus/mems when they are empty which is the
>> expected behavior for cpuset v2 since commit 4ec22e9c5a90 ("cpuset:
>> Enable cpuset controller in default hierarchy").
>>
>> Fix that and avoid unnecessary work by adding the empty mask checks and
>> inheriting the parent's versions if empty.
>>
>> [1] 
>> https://sashiko.dev/#/patchset/20260621032816.1806773-1-longman%40redhat.com
>>
>> Fixes: 4ec22e9c5a90 ("cpuset: Enable cpuset controller in default 
>> hierarchy")
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   kernel/cgroup/cpuset.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>> index aff86acea701..bc0207fd6e57 100644
>> --- a/kernel/cgroup/cpuset.c
>> +++ b/kernel/cgroup/cpuset.c
>> @@ -3925,6 +3925,14 @@ static void cpuset_hotplug_update_tasks(struct 
>> cpuset *cs, struct tmpmasks *tmp)
>>       compute_effective_cpumask(&new_cpus, cs, parent);
>>       nodes_and(new_mems, cs->mems_allowed, parent->effective_mems);
>>   +    if (is_in_v2_mode()) {
>> +        /* Inherit parent's effective_cpus/mems if empty */
>> +        if (cpumask_empty(&new_cpus))
>> +            cpumask_copy(&new_cpus, parent->effective_cpus);
>> +        if (nodes_empty(new_mems))
>> +            new_mems = parent->effective_mems;
>> +    }
>> +
>>       if (!tmp || !cs->partition_root_state)
>>           goto update_tasks;
>
> I noticed that compute_effective_cpumask(...) is called in several 
> places, so I think the logic should be consolidated into that function.
>
> ```
> static void compute_effective_cpumask(struct cpumask *new_cpus,
>                       struct cpuset *cs, struct cpuset *parent)
> {
>     cpumask_and(new_cpus, cs->cpus_allowed, parent->effective_cpus);
>     if (cpumask_empty(new_cpus) && is_in_v2_mode())
>         cpumask_copy(new_cpus, parent->effective_cpus);
> }
>
> ```
>
> Similarly, for new_mems, should we introduce a dedicated helper like 
> compute_effective_nodemask? The same fallback logic is needed in 
> update_nodemasks_hier:
>
>
> ```
> static void update_nodemasks_hier(struct cpuset *cs, nodemask_t 
> *new_mems)
> {
> ...
>         bool has_mems = nodes_and(*new_mems, cp->mems_allowed, 
> parent->effective_mems);
>
>         /*
>          * If it becomes empty, inherit the effective mask of the
>          * parent, which is guaranteed to have some MEMs.
>          */
>         if (is_in_v2_mode() && !has_mems)
>             *new_mems = parent->effective_mems;
> ...
> ```
>
Yes, that makes sense. Will adopt this approach in the next version.

Cheers,
Longman
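
The consolidated fallback discussed above can be sketched with plain
64-bit masks in userspace. This is an illustration only:
compute_effective_mask() is a hypothetical helper, and uint64_t stands in
for both struct cpumask and nodemask_t.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Effective mask = allowed & parent's effective mask. In v2 mode an
 * empty result inherits the parent's effective mask, so a cpuset with
 * no explicit cpus/mems still ends up with a usable set.
 */
static uint64_t compute_effective_mask(uint64_t allowed,
				       uint64_t parent_effective,
				       bool v2_mode)
{
	uint64_t effective = allowed & parent_effective;

	if (!effective && v2_mode)
		effective = parent_effective;
	return effective;
}
```

Putting the fallback inside the helper means every caller gets the same
result, which is the point of moving it out of
cpuset_hotplug_update_tasks().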



* Re: [PATCH v8 23/46] KVM: TDX: Make source page optional for KVM_TDX_INIT_MEM_REGION
From: Yan Zhao @ 2026-06-23  5:16 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: ackerleytng, aik, andrew.jones, binbin.wu, brauner, chao.p.peng,
	david, jmattson, jthoughton, michael.roth, oupton, pankaj.gupta,
	qperret, rick.p.edgecombe, rientjes, shivankg, steven.price,
	tabba, willy, wyihan, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen,
	Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt,
	Kiryl Shutsemau, Baoquan He, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <ajnf5Z9nWZxoLS4x@google.com>

On Mon, Jun 22, 2026 at 06:22:45PM -0700, Sean Christopherson wrote:
> On Mon, Jun 22, 2026, Yan Zhao wrote:
> > On Thu, Jun 18, 2026 at 05:32:00PM -0700, Ackerley Tng via B4 Relay wrote:
> > > From: Ackerley Tng <ackerleytng@google.com>
> > > 
> > > Update tdx_gmem_post_populate() to handle cases where a source page is
> > > not explicitly provided. Instead of returning -EOPNOTSUPP when src_page
> > > is NULL, default to using the page associated with the destination PFN.
> > > 
> > > This change allows for in-place memory conversion where the data is
> > > already present in the target PFN, ensuring the TDX module has a valid
> > > source page reference for the TDH.MEM.PAGE.ADD operation.
> > > 
> > > Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >  Documentation/virt/kvm/x86/intel-tdx.rst |  4 ++++
> > >  arch/x86/kvm/vmx/tdx.c                   | 11 ++++++++---
> > >  2 files changed, 12 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/Documentation/virt/kvm/x86/intel-tdx.rst b/Documentation/virt/kvm/x86/intel-tdx.rst
> > > index 6a222e9d09541..74357fe87f9ec 100644
> > > --- a/Documentation/virt/kvm/x86/intel-tdx.rst
> > > +++ b/Documentation/virt/kvm/x86/intel-tdx.rst
> > > @@ -158,6 +158,10 @@ KVM_TDX_INIT_MEM_REGION
> > >  Initialize @nr_pages TDX guest private memory starting from @gpa with userspace
> > >  provided data from @source_addr. @source_addr must be PAGE_SIZE-aligned.
> > >  
> > > +If guest_memfd in-place conversion is enabled, pass NULL for @source_addr to
> > > +initialize the memory region using memory contents already populated in
> > > +guest_memfd memory.
> > > +
> > >  Note, before calling this sub command, memory attribute of the range
> > >  [gpa, gpa + nr_pages] needs to be private.  Userspace can use
> > >  KVM_SET_MEMORY_ATTRIBUTES to set the attribute.
> > > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> > > index ffe9d0db58c59..56d10333c61a7 100644
> > > --- a/arch/x86/kvm/vmx/tdx.c
> > > +++ b/arch/x86/kvm/vmx/tdx.c
> > > @@ -3198,8 +3198,12 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
> > >  	if (KVM_BUG_ON(kvm_tdx->page_add_src, kvm))
> > >  		return -EIO;
> > >  
> > > -	if (!src_page)
> > > -		return -EOPNOTSUPP;
> > > +	if (!src_page) {
> > > +		if (!gmem_in_place_conversion)
> > When userspace turns on gmem_in_place_conversion while creating guest_memfd
> > without the MMAP flag, the absence of src_page should still be treated as an
> > error.
> 
> Why MMAP?
Hmm, I was showing a scenario in which in-place conversion couldn't occur.
I didn't mean that with the MMAP flag, mmap() and user write must occur.

> Shouldn't this be a general "if (!src_page && !up-to-date)"?  Just
> because userspace _can_ mmap() the memory doesn't mean userspace _has_ mmap()'d
> and written memory.  And when write() lands, MMAP wouldn't be necessary to
> initialize the memory.
Do you mean using up-to-date flag as below?

if (!src_page) {
	src_page = pfn_to_page(pfn);
	if (!folio_test_uptodate(page_folio(src_page)))
		return -EOPNOTSUPP;
}

One concern is that TDX currently does not care much about the up-to-date flag,
since TDX doesn't rely on the flag to clear pages on conversions.
I'm not sure the flag can be reliably checked in this case; e.g., the whole
folio is currently marked up-to-date even if only part of it has been faulted
in by user access.
Making the up-to-date flag work correctly with huge page support seems to take
more effort than introducing a dedicated flag for TDX.

> > Additionally, to properly enable in-place copying for the TDX initial memory
> > region, userspace must not only specify source_addr to NULL, but also follow
> > a specific sequence (where steps 1/2/3/7 are required only for in-place copy):
> > 1. create guest_memfd with MMAP flag
> > 2. mmap the guest_memfd.
> > 3. convert the initial memory range to shared.
> > 4. copy initial content to the source page.
> > 5. convert the initial memory range to private
> > 6. invoke ioctl KVM_TDX_INIT_MEM_REGION.
> > 7. do not unmap the source backend.
> > 
> > So, would it be reasonable to introduce a dedicated flag that allows userspace
> > to explicitly opt into the in-place copy functionality? e.g.,
> 
> Why?  It's userspace's responsibility to get the above right.  If userspace fails
> to provide a src_page when it doesn't want in-place copy, that's a userspace bug.
I mean that if userspace specifies a NULL source_addr by mistake, it's better
for the kernel to detect the mistake, similar to how it validates whether
source_addr is PAGE_ALIGNED.
Since userspace already needs to perform additional steps to enable in-place
copy, specifying a dedicated flag to indicate that the NULL source_addr is
intentional seems like a reasonable burden.
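
A userspace sketch of the validation being proposed, assuming a dedicated
opt-in flag. Everything named here is hypothetical: INIT_MEM_F_IN_PLACE,
validate_init_mem_source() and the 4 KiB page size are illustrative and
not part of the KVM_TDX_INIT_MEM_REGION uAPI. Rejecting a non-NULL
source_addr combined with the flag is an extra assumption of this sketch.

```c
#include <errno.h>
#include <stdint.h>

#define ASSUMED_PAGE_SIZE   4096ULL	/* assumption for this sketch */
#define INIT_MEM_F_IN_PLACE (1u << 0)	/* hypothetical opt-in flag */

/*
 * Accept exactly two shapes of request:
 *   - a page-aligned, non-NULL source_addr without the flag (copy), or
 *   - a NULL source_addr with the flag set (in-place).
 * Anything else is treated as a userspace mistake.
 */
static int validate_init_mem_source(uint64_t source_addr, uint32_t flags)
{
	int in_place = !!(flags & INIT_MEM_F_IN_PLACE);

	if (source_addr % ASSUMED_PAGE_SIZE)
		return -EINVAL;		/* misaligned source */
	if (!source_addr && !in_place)
		return -EINVAL;		/* NULL without explicit opt-in */
	if (source_addr && in_place)
		return -EINVAL;		/* contradictory request */
	return 0;
}
```

With this shape, an accidentally zeroed source_addr fails up front
instead of silently measuring whatever the destination page contains.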


* [PATCH 1/3] Documentation/hwmon: Add onsemi's FD5121 controllers' documentation
From: Selvamani Rajagopal via B4 Relay @ 2026-06-23  5:55 UTC (permalink / raw)
  To: Guenter Roeck, Jonathan Corbet, Shuah Khan, Selva Rajagopal,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley
  Cc: linux-hwmon, linux-doc, linux-kernel, devicetree,
	Selvamani Rajagopal
In-Reply-To: <20260622-support-fd5121-from-onsemi-v1-0-b31767689c65@onsemi.com>

From: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>

Document the hardware monitoring support for the FD5121, FD5123, and
FD5125 devices.

Documentation describes the supported telemetry data exposed via
the sysfs for the hwmon subsystem, including voltage, current,
power and temperature measurements.

Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
---
 Documentation/hwmon/fd5121.rst | 93 ++++++++++++++++++++++++++++++++++++++++++
 Documentation/hwmon/index.rst  |  1 +
 2 files changed, 94 insertions(+)

diff --git a/Documentation/hwmon/fd5121.rst b/Documentation/hwmon/fd5121.rst
new file mode 100644
index 000000000000..c279db4641e4
--- /dev/null
+++ b/Documentation/hwmon/fd5121.rst
@@ -0,0 +1,93 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+Kernel driver fd5121
+=====================
+
+Supported chips:
+
+  * onsemi FD5121
+
+    Prefix: 'fd5121'
+
+    Datasheet: Datasheet is not publicly available.
+
+  * onsemi FD5123
+
+    Prefix: 'fd5121'
+
+    Datasheet: Datasheet is not publicly available.
+
+  * onsemi FD5125
+
+    Prefix: 'fd5121'
+
+    Datasheet: Datasheet is not publicly available.
+
+Author: Selva Rajagopal <selvamani.rajagopal@onsemi.com>
+
+Description
+-----------
+
+FD5121, FD5123, FD5125 are dual-rail, multi-phase controllers
+and compliant to
+
+  - AVS and AMD proprietary SVI3 protocol.
+  - PMBus rev 1.4.1 interface.
+
+Used as multi-phase voltage regulators for CPUs, high performance
+ASICs, SoCs or graphic cores.
+
+Gives full telemetry options including input/output voltage
+and current, as well as fault handling and identifications.
+
+Usage Notes
+-----------
+
+This driver does not probe for PMBus devices. You will have
+to instantiate devices explicitly.
+
+Example: the following commands will load the driver for the
+controller at address 0x50 on I2C bus #1::
+
+    # modprobe fd5121
+    # echo fd5121 0x50 > /sys/bus/i2c/devices/i2c-1/new_device
+
+It can also be instantiated by declaring in device tree
+
+Sysfs attributes
+----------------
+
+The following attributes are supported:
+
+======================  ====================================
+curr[1-2]_label		"iin[1-2]"
+curr[3-4]_label		"iout[1-2]"
+curr[1-2]_input		Measured input current.
+curr[3-4]_input		Measured output current.
+curr[1-4]_crit_alarm	Output current critical high alarm.
+curr[1-4]_max_alarm	Output current high alarm.
+
+in[1-2]_label		"vin[1-2]"
+in[3-4]_label		"vout[1-2]"
+in[1-4]_lcrit_alarm	Input voltage critical low alarm.
+in[1-4]_crit_alarm	Input voltage critical high alarm.
+in[1-2]_max_alarm	Input voltage high alarm.
+in[1-2]_input           Measured input voltage.
+in[3-4]_input           Measured output voltage.
+
+power[1-2]_label	"pin[1-2]"
+power[3-4]_label	"pout[1-2]"
+power[3-4]_crit_alarm	Output power critical high alarm.
+power[1-2]_max_alarm	Output power high alarm.
+power[1-4]_max          Power limit.
+power[1-4]_input        Measured input power.
+power[3-4]_crit         Critical maximum output power.
+
+temp[1-2]_crit_alarm	Chip temperature critical high alarm.
+temp[1-2]_max_alarm	Chip temperature high alarm.
+temp[1-2]_input         Measured temperature.
+temp[1-2]_max           Maximum temperature.
+temp[1-2]_crit          Critical high temperature.
+
+======================  ====================================
+
diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst
index 4aa910569c31..451f5433fa60 100644
--- a/Documentation/hwmon/index.rst
+++ b/Documentation/hwmon/index.rst
@@ -79,6 +79,7 @@ Hardware Monitoring Kernel Drivers
    f71805f
    f71882fg
    fam15h_power
+   fd5121
    fsp-3y
    ftsteutates
    g760a

-- 
2.43.0




* [PATCH 2/3] dt-bindings: hwmon: pmbus: Support for onsemi's FD5121
From: Selvamani Rajagopal via B4 Relay @ 2026-06-23  5:55 UTC (permalink / raw)
  To: Guenter Roeck, Jonathan Corbet, Shuah Khan, Selva Rajagopal,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley
  Cc: linux-hwmon, linux-doc, linux-kernel, devicetree,
	Selvamani Rajagopal
In-Reply-To: <20260622-support-fd5121-from-onsemi-v1-0-b31767689c65@onsemi.com>

From: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>

Add a devicetree schema for the onsemi FD5121, FD5123, and
FD5125 dual-rail, multi-phase digital controllers.

Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
---
 .../bindings/hwmon/pmbus/onnn,fd5121.yaml          | 41 ++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/Documentation/devicetree/bindings/hwmon/pmbus/onnn,fd5121.yaml b/Documentation/devicetree/bindings/hwmon/pmbus/onnn,fd5121.yaml
new file mode 100644
index 000000000000..b0453b0634f0
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/pmbus/onnn,fd5121.yaml
@@ -0,0 +1,41 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/hwmon/pmbus/onnn,fd5121.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: onsemi's multi-phase digital controllers
+
+maintainers:
+  - Selvamani Rajagopal <selvamani.rajagopal@onsemi.com>
+
+description:
+  onsemi multi-phase digital controllers with PMBus.
+
+properties:
+  compatible:
+    enum:
+      - onnn,fd5121
+      - onnn,fd5123
+      - onnn,fd5125
+
+  reg:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+
+additionalProperties: false
+
+examples:
+  - |
+    i2c {
+      #address-cells = <1>;
+      #size-cells = <0>;
+
+      fd5121@50 {
+        compatible = "onnn,fd5121";
+        reg = <0x50>;
+      };
+    };

-- 
2.43.0




* [PATCH 0/3] Support onsemi's FD5121 multiphase digital controller
From: Selvamani Rajagopal via B4 Relay @ 2026-06-23  5:55 UTC (permalink / raw)
  To: Guenter Roeck, Jonathan Corbet, Shuah Khan, Selva Rajagopal,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley
  Cc: linux-hwmon, linux-doc, linux-kernel, devicetree,
	Selvamani Rajagopal

FD5121 is a dual-rail, multi-phase controller designed to
power a CPU, ASIC or SoC with fully configurable rails.

This series adds support for FD5121, FD5123 and FD5125. These
controllers are configurable through PMBus 1.4.1.

Add documentation for these controllers.
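
For anyone who wants to try the documented attributes from userspace,
here is a minimal sketch in C of reading one hwmon value. It is an
illustration only and not part of this series. The function names
parse_hwmon_value() and read_hwmon_attr() are hypothetical, and a path
such as /sys/class/hwmon/hwmon0/in1_input is an assumed example, since
the hwmon index depends on the system. By hwmon sysfs convention,
voltages are reported in millivolts as a decimal integer.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the text of a hwmon attribute such as in1_input: a decimal
 * integer, optionally followed by a newline.
 * Returns 0 on success, -1 if the text is not a plain decimal integer.
 */
int parse_hwmon_value(const char *text, long *out)
{
	char *end;
	long val;

	assert(text != NULL && out != NULL);

	errno = 0;
	val = strtol(text, &end, 10);
	if (end == text || errno == ERANGE)
		return -1;
	if (*end != '\n' && *end != '\0')
		return -1;

	*out = val;
	return 0;
}

/*
 * Read one attribute file. The caller supplies the path; nothing here
 * assumes a particular hwmon index.
 * Returns 0 on success, -1 if the file cannot be read or parsed.
 */
int read_hwmon_attr(const char *path, long *out)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_hwmon_value(buf, out);
}
```

A caller would pass the attribute path found on its own system and
divide the result by 1000 to get volts.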

Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
---
Selvamani Rajagopal (3):
      Documentation/hwmon: Add onsemi's FD5121 controllers' documentation
      dt-bindings: hwmon: pmbus: Support for onsemi's FD5121
      hwmon: (pmbus/fd5121): Add support FD5121, FD5123 and FD5125

 .../bindings/hwmon/pmbus/onnn,fd5121.yaml          |   41 +
 Documentation/hwmon/fd5121.rst                     |   93 ++
 Documentation/hwmon/index.rst                      |    1 +
 MAINTAINERS                                        |    8 +
 drivers/hwmon/pmbus/Kconfig                        |    9 +
 drivers/hwmon/pmbus/Makefile                       |    1 +
 drivers/hwmon/pmbus/fd5121.c                       | 1004 ++++++++++++++++++++
 7 files changed, 1157 insertions(+)
---
base-commit: 1a3746ccbb0a97bed3c06ccde6b880013b1dddc1
change-id: 20260622-support-fd5121-from-onsemi-6fa9f98b5bf0

Best regards,
-- 
Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>




* [PATCH 3/3] hwmon: (pmbus/fd5121): Add support FD5121, FD5123 and FD5125
From: Selvamani Rajagopal via B4 Relay @ 2026-06-23  5:55 UTC (permalink / raw)
  To: Guenter Roeck, Jonathan Corbet, Shuah Khan, Selva Rajagopal,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley
  Cc: linux-hwmon, linux-doc, linux-kernel, devicetree,
	Selvamani Rajagopal
In-Reply-To: <20260622-support-fd5121-from-onsemi-v1-0-b31767689c65@onsemi.com>

From: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>

FD5121 is a dual-rail, multi-phase digital controller that offers
full telemetry, including input/output voltage and current, as
well as fault handling and identification.

These controllers are compliant with the PMBus specification.
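
The identification registers are read as block reads, where the first
byte of the response carries the payload length. As a rough,
standalone illustration of that format (not the driver code itself),
the sketch below unpacks a length-prefixed response and assembles a
little-endian 32-bit value. The helper names unpack_block() and
le32_to_host() are hypothetical, and the byte order of a real device
response is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Unpack a length-prefixed block response: raw[0] holds the payload
 * length and the payload follows. Copies at most out_cap bytes and
 * never reads past raw_len. Returns the number of bytes copied, or -1
 * if the response is empty.
 */
int unpack_block(const uint8_t *raw, size_t raw_len,
		 uint8_t *out, size_t out_cap)
{
	size_t n;

	assert(raw != NULL && out != NULL);
	if (raw_len == 0)
		return -1;

	n = raw[0];
	if (n > raw_len - 1)	/* device claimed more than was transferred */
		n = raw_len - 1;
	if (n > out_cap)	/* caller's buffer is smaller than the payload */
		n = out_cap;
	memcpy(out, raw + 1, n);
	return (int)n;
}

/* Assemble a 32-bit value from four little-endian bytes. */
uint32_t le32_to_host(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```

The point of clamping against both the transferred length and the
output capacity is that a short or malformed response can never cause
a read or write beyond either buffer.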

Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
---
 MAINTAINERS                  |    8 +
 drivers/hwmon/pmbus/Kconfig  |    9 +
 drivers/hwmon/pmbus/Makefile |    1 +
 drivers/hwmon/pmbus/fd5121.c | 1004 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 1022 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d95d3ef77773..c0664c33324a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -20135,6 +20135,14 @@ L:	linux-mips@vger.kernel.org
 S:	Maintained
 F:	arch/mips/boot/dts/ralink/omega2p.dts
 
+ONSEMI HARDWARE MONITOR DRIVER
+M:	Selva Rajagopal <selvamani.rajagopal@onsemi.com>
+L:	linux-hwmon@vger.kernel.org
+S:	Supported
+W:	https://www.onsemi.com
+F:	Documentation/devicetree/bindings/hwmon/pmbus/onnn,fd5121.yaml
+F:	drivers/hwmon/pmbus/fd5121.c
+
 ONSEMI ETHERNET PHY DRIVERS
 M:	Piergiorgio Beruto <piergiorgio.beruto@gmail.com>
 L:	netdev@vger.kernel.org
diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig
index c8cda160b5f8..3a06ed83539e 100644
--- a/drivers/hwmon/pmbus/Kconfig
+++ b/drivers/hwmon/pmbus/Kconfig
@@ -179,6 +179,15 @@ config SENSORS_E50SN12051
 	  This driver can also be built as a module. If so, the module will
 	  be called e50sn12051.
 
+config SENSORS_FD5121
+	tristate "FD5121/FD5123/FD5125 controllers from onsemi"
+	help
+	  If you say yes here, you get support for onsemi
+	  controllers FD5121, FD5123, FD5125.
+
+	  This driver can also be built as a module. If so, the module will
+	  be called fd5121.
+
 config SENSORS_INA233
 	tristate "Texas Instruments INA233 and compatibles"
 	help
diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile
index ffc05f493213..70f4afb41fe0 100644
--- a/drivers/hwmon/pmbus/Makefile
+++ b/drivers/hwmon/pmbus/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_SENSORS_APS_379)	+= aps-379.o
 obj-$(CONFIG_SENSORS_BEL_PFE)	+= bel-pfe.o
 obj-$(CONFIG_SENSORS_BPA_RS600)	+= bpa-rs600.o
 obj-$(CONFIG_SENSORS_DELTA_AHE50DC_FAN) += delta-ahe50dc-fan.o
+obj-$(CONFIG_SENSORS_FD5121)	+= fd5121.o
 obj-$(CONFIG_SENSORS_FSP_3Y)	+= fsp-3y.o
 obj-$(CONFIG_SENSORS_HAC300S)	+= hac300s.o
 obj-$(CONFIG_SENSORS_IBM_CFFPS)	+= ibm-cffps.o
diff --git a/drivers/hwmon/pmbus/fd5121.c b/drivers/hwmon/pmbus/fd5121.c
new file mode 100644
index 000000000000..e68c6d6cabbd
--- /dev/null
+++ b/drivers/hwmon/pmbus/fd5121.c
@@ -0,0 +1,1004 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright 2026 Semiconductor Components Industries, LLC ("onsemi").
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/err.h>
+#include <linux/slab.h>
+#include <linux/delay.h>
+#include <linux/i2c.h>
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/unaligned.h>
+
+#include "pmbus.h"
+
+enum chips { chip_fd5121, chip_fd5123, chip_fd5125 };
+
+#define CTLR_ID_UNKNOWN				0
+#define CTLR_ID_FD5121				0xFD5121
+#define CTLR_ID_FD5123				0xFD5123
+#define CTLR_ID_FD5125				0xFD5125
+
+#define FD5121_NUM_PAGES			2
+
+/* Custom PMBUS commands */
+#define PMBUS_REG_VOUT_MIN			0x2B
+#define PMBUS_REG_POWER_MODE			0x34
+#define PMBUS_REG_VIN_ON			0x35
+#define PMBUS_REG_VIN_OFF			0x36
+#define PMBUS_REG_VIN_UV_FAULT_RESPONSE		0x5A
+#define PMBUS_REG_IIN_OC_FAULT_RESPONSE		0x5C
+#define PMBUS_REG_TON_DELAY			0x60
+#define PMBUS_REG_POUT_OP_FAULT_RESPONSE	0x69
+#define PMBUS_REG_READ_VAUX			0x85
+
+#define PMBUS_REG_IKNEE_SET			0x2D
+#define PMBUS_REG_PIN_COUNTER			0x2E
+#define PMBUS_REG_VMIN_AWARE			0x2F
+#define PMBUS_REG_VAUX_UV_FAULT_LIMIT		0x6C
+#define PMBUS_REG_VAUX_OV_FAULT_LIMIT		0x6D
+#define PMBUS_REG_VAUX_UV_FAULT_RESPONSE	0x6E
+#define PMBUS_REG_VAUX_OV_FAULT_RESPONSE	0x6F
+#define PMBUS_REG_VAUX_UV_WARNING		0x75
+#define PMBUS_REG_VAUX_OV_WARNING		0x76
+#define PMBUS_REG_MFR_FREE_USER_CONFIG_TABLES	0xCF
+#define PMBUS_REG_MFR_ADDRESS_TABLE		0xD0
+#define PMBUS_REG_MFR_STATUS_ONSEMI		0xD1
+#define PMBUS_REG_MFR_UNLOCK			0xD2
+#define PMBUS_REG_MFR_FAULTY_SPS		0xD3
+#define PMBUS_REG_TLVR_FAULTS			0xD4
+#define PMBUS_REG_MFR_USER_STORE_CONFIG_TAB	0xD5
+#define PMBUS_REG_MFR_USER_CONFIG_INDEX		0xD6
+#define PMBUS_REG_MFR_PWM_DISCONNECTION		0xD7
+#define PMBUS_REG_MFR_VR_DISCONNECTION		0xD8
+#define PMBUS_REG_MFR_TON_SLEW			0xD9
+#define PMBUS_REG_MFR_TOFF_SLEW			0xDA
+#define PMBUS_REG_MFR_RAIL_NAME			0xDB
+#define PMBUS_REG_MFR_VOUT_DROOP		0xDC
+#define PMBUS_REG_MFR_USER_RESTORE_CONFIG_TAB	0xDD
+#define PMBUS_REG_MFR_SVR_GO			0xDE
+#define PMBUS_REG_MFR_SET_PWD			0xDF
+#define PMBUS_REG_MFR_CONFIG_ACTIVATE		0xE0
+#define PMBUS_REG_MFR_CONFIG_RECOVER		0xE1
+#define PMBUS_REG_MFR_OTP_DUMP			0xE2
+#define PMBUS_REG_MFR_BBR_EN			0xE3
+#define PMBUS_REG_MFR_DPM_MIN			0xE4
+#define PMBUS_REG_MFR_VBOOT			0xE5
+#define PMBUS_REG_MFR_PRECLAMP_OFFSET		0xE6
+#define PMBUS_REG_MFR_TLVR_DIAGN		0xE7
+#define PMBUS_REG_MFR_READ_VSYS			0xE8
+#define PMBUS_REG_MFR_SPECIFIC_E9		0xE9
+#define PMBUS_REG_MFR_SPECIFIC_EA		0xEA
+#define PMBUS_REG_MFR_SS_CBC			0xEB
+#define PMBUS_REG_MFR_AMD_STATUS		0xEC
+#define PMBUS_REG_MFR_CHECKSUM			0xEE
+#define PMBUS_REG_CSE_INDEX			0xF0
+#define PMBUS_REG_COUT_MEASURE			0xF1
+#define PMBUS_REG_VR_COUT			0xF2
+#define PMBUS_REG_BBR_RAM			0xF3
+#define PMBUS_REG_BBR_OTP			0xF4
+#define PMBUS_REG_READ_PSYS			0xF5
+#define PMBUS_REG_POSTCLAMP_OFFSET		0xF6
+#define PMBUS_REG_PGOOD_DELAY			0xF7
+#define PMBUS_REG_MFR_SPECIFIC_F8		0xF8
+#define PMBUS_REG_MFR_SPECIFIC_F9		0xF9
+#define PMBUS_REG_MFR_PWD_PROGRAM_RAM		0xFA
+#define PMBUS_REG_MFR_PWD_PROGRAM_I2C		0xFB
+#define PMBUS_REG_MFR_PWD_ENABLE_OTP_STORE	0xFC
+
+/* List of recognized commands */
+static const u8 cc_list[] = {
+	PMBUS_PAGE,
+	PMBUS_OPERATION,
+	PMBUS_ON_OFF_CONFIG,
+	PMBUS_CLEAR_FAULTS,
+	PMBUS_WRITE_PROTECT,
+	PMBUS_CAPABILITY,
+	PMBUS_VOUT_MODE,
+	PMBUS_VOUT_COMMAND,
+	PMBUS_VOUT_MAX,
+	PMBUS_VOUT_MARGIN_HIGH,
+	PMBUS_VOUT_MARGIN_LOW,
+	PMBUS_VOUT_TRANSITION_RATE,
+	PMBUS_REG_VOUT_MIN,
+	PMBUS_REG_IKNEE_SET,
+	PMBUS_REG_PIN_COUNTER,
+	PMBUS_REG_VMIN_AWARE,
+	PMBUS_REG_POWER_MODE,
+	PMBUS_REG_VIN_ON,
+	PMBUS_REG_VIN_OFF,
+	PMBUS_VOUT_OV_FAULT_LIMIT,
+	PMBUS_VOUT_OV_FAULT_RESPONSE,
+	PMBUS_VOUT_UV_FAULT_LIMIT,
+	PMBUS_VOUT_UV_FAULT_RESPONSE,
+	PMBUS_IOUT_OC_FAULT_LIMIT,
+	PMBUS_IOUT_OC_FAULT_RESPONSE,
+	PMBUS_IOUT_OC_WARN_LIMIT,
+	PMBUS_OT_FAULT_LIMIT,
+	PMBUS_OT_FAULT_RESPONSE,
+	PMBUS_OT_WARN_LIMIT,
+	PMBUS_VIN_OV_FAULT_LIMIT,
+	PMBUS_VIN_OV_FAULT_RESPONSE,
+	PMBUS_VIN_OV_WARN_LIMIT,
+	PMBUS_VIN_UV_WARN_LIMIT,
+	PMBUS_VIN_UV_FAULT_LIMIT,
+	PMBUS_REG_VIN_UV_FAULT_RESPONSE,
+	PMBUS_IIN_OC_FAULT_LIMIT,
+	PMBUS_REG_IIN_OC_FAULT_RESPONSE,
+	PMBUS_IIN_OC_WARN_LIMIT,
+	PMBUS_REG_TON_DELAY,
+	PMBUS_POUT_OP_FAULT_LIMIT,
+	PMBUS_REG_POUT_OP_FAULT_RESPONSE,
+	PMBUS_POUT_OP_WARN_LIMIT,
+	PMBUS_PIN_OP_WARN_LIMIT,
+	PMBUS_REG_VAUX_UV_FAULT_LIMIT,
+	PMBUS_REG_VAUX_OV_FAULT_LIMIT,
+	PMBUS_REG_VAUX_UV_FAULT_RESPONSE,
+	PMBUS_REG_VAUX_OV_FAULT_RESPONSE,
+	PMBUS_REG_VAUX_UV_WARNING,
+	PMBUS_REG_VAUX_OV_WARNING,
+	PMBUS_STATUS_BYTE,
+	PMBUS_STATUS_WORD,
+	PMBUS_STATUS_VOUT,
+	PMBUS_STATUS_IOUT,
+	PMBUS_STATUS_INPUT,
+	PMBUS_STATUS_TEMPERATURE,
+	PMBUS_STATUS_CML,
+	PMBUS_STATUS_OTHER,
+	PMBUS_STATUS_MFR_SPECIFIC,
+	PMBUS_REG_READ_VAUX,
+	PMBUS_READ_VIN,
+	PMBUS_READ_IIN,
+	PMBUS_READ_VOUT,
+	PMBUS_READ_IOUT,
+	PMBUS_READ_TEMPERATURE_1,
+	PMBUS_READ_POUT,
+	PMBUS_READ_PIN,
+	PMBUS_REVISION,
+	PMBUS_MFR_ID,
+	PMBUS_MFR_MODEL,
+	PMBUS_MFR_REVISION,
+	PMBUS_IC_DEVICE_ID,
+	PMBUS_REG_MFR_FREE_USER_CONFIG_TABLES,
+	PMBUS_REG_MFR_ADDRESS_TABLE,
+	PMBUS_REG_MFR_STATUS_ONSEMI,
+	PMBUS_REG_MFR_UNLOCK,
+	PMBUS_REG_MFR_FAULTY_SPS,
+	PMBUS_REG_TLVR_FAULTS,
+	PMBUS_REG_MFR_USER_STORE_CONFIG_TAB,
+	PMBUS_REG_MFR_USER_CONFIG_INDEX,
+	PMBUS_REG_MFR_PWM_DISCONNECTION,
+	PMBUS_REG_MFR_VR_DISCONNECTION,
+	PMBUS_REG_MFR_TON_SLEW,
+	PMBUS_REG_MFR_TOFF_SLEW,
+	PMBUS_REG_MFR_RAIL_NAME,
+	PMBUS_REG_MFR_VOUT_DROOP,
+	PMBUS_REG_MFR_USER_RESTORE_CONFIG_TAB,
+	PMBUS_REG_MFR_SVR_GO,
+	PMBUS_REG_MFR_SET_PWD,
+	PMBUS_REG_MFR_CONFIG_ACTIVATE,
+	PMBUS_REG_MFR_CONFIG_RECOVER,
+	PMBUS_REG_MFR_OTP_DUMP,
+	PMBUS_REG_MFR_BBR_EN,
+	PMBUS_REG_MFR_DPM_MIN,
+	PMBUS_REG_MFR_VBOOT,
+	PMBUS_REG_MFR_PRECLAMP_OFFSET,
+	PMBUS_REG_MFR_TLVR_DIAGN,
+	PMBUS_REG_MFR_READ_VSYS,
+	PMBUS_REG_MFR_SPECIFIC_E9,
+	PMBUS_REG_MFR_SPECIFIC_EA,
+	PMBUS_REG_MFR_SS_CBC,
+	PMBUS_REG_MFR_AMD_STATUS,
+	PMBUS_REG_MFR_CHECKSUM,
+	PMBUS_REG_CSE_INDEX,
+	PMBUS_REG_COUT_MEASURE,
+	PMBUS_REG_VR_COUT,
+	PMBUS_REG_BBR_RAM,
+	PMBUS_REG_BBR_OTP,
+	PMBUS_REG_READ_PSYS,
+	PMBUS_REG_POSTCLAMP_OFFSET,
+	PMBUS_REG_PGOOD_DELAY,
+	PMBUS_REG_MFR_SPECIFIC_F8,
+	PMBUS_REG_MFR_SPECIFIC_F9,
+	PMBUS_REG_MFR_PWD_PROGRAM_RAM,
+	PMBUS_REG_MFR_PWD_PROGRAM_I2C,
+	PMBUS_REG_MFR_PWD_ENABLE_OTP_STORE,
+};
+
+/* Following registers expect block read */
+static const u8 blk_rd_cc[] = {
+	PMBUS_SMBALERT_MASK,
+	PMBUS_MFR_DATE,
+	PMBUS_IC_DEVICE_REV,
+};
+
+struct fd5121_data {
+	struct attribute_group *groups[3];
+	struct pmbus_driver_info info;
+	struct device *dev;
+	u32 id;
+};
+
+static s32 fd5121_read_block_data(const struct i2c_client *client,
+				  u8 cmd_code, u8 len, u8 *pbuf)
+{
+	s32 ret = 0;
+
+	if (!i2c_check_functionality(client->adapter,
+				     I2C_FUNC_SMBUS_READ_BLOCK_DATA)) {
+		/* Length byte first: read len + 1 bytes into pbuf. */
+		ret = i2c_smbus_read_i2c_block_data(client, cmd_code,
+						    len + 1, pbuf);
+		if (ret < 0)
+			return ret;
+		if (ret == 0)
+			return -EIO;
+
+		/* Copy only payload bytes that were actually read. */
+		ret = min3((s32)pbuf[0], ret - 1, (s32)len);
+		memmove(pbuf, pbuf + 1, ret);
+		return ret;
+	}
+	ret = i2c_smbus_read_block_data(client, cmd_code, pbuf);
+	if (ret < 0)
+		return ret;
+	return min_t(s32, ret, len);
+}
+
+/* Command code that expects block read, not word read */
+static bool fd5121_blk_rd_reg(u8 cmd_code)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(blk_rd_cc); i++) {
+		if (cmd_code == blk_rd_cc[i])
+			return true;
+	}
+	return false;
+}
+
+static ssize_t fd5121_send_byte_store(struct device *dev,
+				      struct device_attribute *da,
+				      const char *buf, size_t count)
+{
+	struct i2c_client *client = to_i2c_client(dev->parent);
+	u8 val = 0;
+	int ret;
+
+	ret = kstrtou8(buf, 10, &val);
+	if (ret < 0)
+		return ret;
+	ret = i2c_smbus_write_byte(client, val);
+	if (ret < 0)
+		return ret;
+	return count;
+}
+
+static int fd5121_config_activate(struct i2c_client *client)
+{
+	return i2c_smbus_write_byte_data(client,
+					 PMBUS_REG_MFR_CONFIG_ACTIVATE, 0xAA);
+}
+
+static ssize_t fd5121_byte_store(struct device *dev,
+				 struct device_attribute *da,
+				 const char *buf, size_t count)
+{
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	struct i2c_client *client = to_i2c_client(dev->parent);
+	u8 reg = attr->index;
+	int ret = 0;
+	u8 val = 0;
+
+	switch (reg) {
+	case PMBUS_REG_MFR_CONFIG_ACTIVATE:
+		ret = fd5121_config_activate(client);
+		if (ret < 0)
+			return ret;
+		return count;
+	default:
+		ret = kstrtou8(buf, 10, &val);
+		if (ret < 0)
+			return ret;
+		break;
+	}
+	if (reg == PMBUS_PAGE && val != 0 && val != 1 &&
+	    val != GENMASK(7, 0))
+		return -EINVAL;
+	ret = i2c_smbus_write_byte_data(client, reg, val);
+	if (ret < 0)
+		return ret;
+	return count;
+}
+
+static ssize_t fd5121_byte_show(struct device *dev,
+				struct device_attribute *da, char *buf)
+{
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	struct i2c_client *client = to_i2c_client(dev->parent);
+	u8 reg = attr->index;
+	s32 ret;
+
+	ret = i2c_smbus_read_byte_data(client, reg);
+	if (ret < 0)
+		return ret;
+	return scnprintf(buf, PAGE_SIZE, "%d\n", ret & 0xFF);
+}
+
+static ssize_t fd5121_word_store(struct device *dev,
+				 struct device_attribute *da,
+				 const char *buf, size_t count)
+{
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	struct i2c_client *client = to_i2c_client(dev->parent);
+	u8 reg = attr->index;
+	s16 val = 0;
+	int ret = 0;
+
+	switch (reg) {
+	case PMBUS_REG_MFR_PWD_PROGRAM_RAM:
+		val = 0xC93F;
+		break;
+	default:
+		ret = kstrtos16(buf, 10, &val);
+		if (ret)
+			return ret;
+		break;
+	}
+	ret = i2c_smbus_write_word_data(client, reg, val);
+	if (ret < 0)
+		return ret;
+	return count;
+}
+
+static ssize_t fd5121_word_show(struct device *dev,
+				struct device_attribute *da, char *buf)
+{
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	struct i2c_client *client = to_i2c_client(dev->parent);
+	u8 data[I2C_SMBUS_BLOCK_MAX] = { 0 };
+	u8 reg = attr->index;
+	s32 ret = 0;
+
+	if (fd5121_blk_rd_reg(reg)) {
+		ret = fd5121_read_block_data(client, reg, 2, data);
+		if (ret >= 0)
+			ret = get_unaligned_le16(data);
+	} else
+		ret = i2c_smbus_read_word_data(client, reg);
+	if (ret < 0)
+		return ret;
+	return scnprintf(buf, PAGE_SIZE, "%d\n", ret & 0xFFFF);
+}
+
+static s32 fd5121_write_block_data(const struct i2c_client *client,
+				   u8 cmd_code, u8 len, u8 *pbuf)
+{
+	s32 ret = 0;
+
+	if (!i2c_check_functionality(client->adapter,
+				     I2C_FUNC_SMBUS_WRITE_BLOCK_DATA))
+		ret = i2c_smbus_write_i2c_block_data(client, cmd_code,
+						     len, pbuf);
+	else
+		ret = i2c_smbus_write_block_data(client, cmd_code,
+						 len, pbuf);
+	return ret;
+}
+
+static s32 fd5121_read_long(struct i2c_client *client, u8 cmd_code, u32 *pval)
+{
+	u8 buffer[I2C_SMBUS_BLOCK_MAX] = { 0 };
+	s32 ret;
+
+	ret = fd5121_read_block_data(client, cmd_code, 4, buffer);
+	if (ret < 0)
+		return ret;
+	if (ret < 4)
+		return -EIO;
+
+	*pval = get_unaligned_le32(buffer);
+	return 0;
+}
+
+static ssize_t fd5121_long_store(struct device *dev,
+				 struct device_attribute *da,
+				 const char *buf, size_t count)
+{
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	struct i2c_client *client = to_i2c_client(dev->parent);
+	u8 reg = attr->index;
+	u8 buffer[4];
+	u32 val = 0;
+	int ret = 0;
+	u8 len;
+
+	ret = kstrtou32(buf, 10, &val);
+	if (ret)
+		return ret;
+
+	len = (u8) sizeof(buffer);
+	for (u8 i = 0; i < len; i++) {
+		buffer[i] = val & 0xFF;
+		val >>= 8;
+	}
+	ret = fd5121_write_block_data(client, reg, len, buffer);
+	if (ret < 0)
+		return ret;
+	return count;
+}
+
+static ssize_t fd5121_long_show(struct device *dev,
+				struct device_attribute *da, char *buf)
+{
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	struct i2c_client *client = to_i2c_client(dev->parent);
+	u8 reg = attr->index;
+	u32 val = 0;
+	s32 ret = 0;
+
+	ret = fd5121_read_long(client, reg, &val);
+	if (ret < 0)
+		return ret;
+	return scnprintf(buf, PAGE_SIZE, "%u\n", val);
+}
+
+static ssize_t fd5121_block_show(struct device *dev,
+				 struct device_attribute *da, char *buf)
+{
+	struct i2c_client *client = to_i2c_client(dev->parent);
+	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+	u8 buffer[I2C_SMBUS_BLOCK_MAX] = { 0 };
+	u8 reg = attr->index;
+	int printed = 0;
+	s32 ret = 0;
+	u8 len = 0;
+	int i = 0;
+
+	if (reg == PMBUS_REG_MFR_FAULTY_SPS) {
+		int to_print = 0;
+
+		len = 7;
+		ret = fd5121_read_block_data(client, reg, len, buffer);
+		if (ret < 0)
+			return ret;
+		printed = 0;
+		to_print = (ret < len) ? ret : len;
+		for (i = 0; i < to_print; i++)
+			printed += scnprintf(buf + printed,
+					     PAGE_SIZE - printed,
+					     "%02x", buffer[i]);
+		printed += scnprintf(buf + printed,
+				     PAGE_SIZE - printed, "\n");
+		return printed;
+	} else if (reg == PMBUS_REG_BBR_RAM ||
+		   reg == PMBUS_REG_BBR_OTP) {
+		u32 len = (reg == PMBUS_REG_BBR_OTP) ? 165 : 164;
+
+		/* Extra byte may be needed in case we need to store
+		 * the length of the data
+		 */
+		u8 *tmp_in = kcalloc(len+1, sizeof(u8), GFP_KERNEL);
+
+		if (tmp_in == NULL)
+			return -ENOMEM;
+		ret = fd5121_read_block_data(client, reg, len, tmp_in);
+		if (ret < 0) {
+			kfree(tmp_in);
+			return ret;
+		}
+
+		printed = 0;
+		for (i = 0; i < ret; i++)
+			printed += scnprintf(buf + printed,
+					     PAGE_SIZE - printed, "%02x",
+					     tmp_in[i]);
+		printed += scnprintf(buf + printed,
+				     PAGE_SIZE - printed, "\n");
+
+		kfree(tmp_in);
+		return printed;
+	} else
+		return -ENODATA;
+}
+
+static SENSOR_DEVICE_ATTR_RW(page, fd5121_byte,
+			     PMBUS_PAGE);
+static SENSOR_DEVICE_ATTR_RO(vout_raw, fd5121_word,
+			     PMBUS_READ_VOUT);
+static SENSOR_DEVICE_ATTR_RW(operation, fd5121_byte,
+			     PMBUS_OPERATION);
+static SENSOR_DEVICE_ATTR_RW(on_off_config, fd5121_byte,
+			     PMBUS_ON_OFF_CONFIG);
+static SENSOR_DEVICE_ATTR_WO(clear_faults, fd5121_byte,
+			     PMBUS_CLEAR_FAULTS);
+static SENSOR_DEVICE_ATTR_RW(write_protect, fd5121_byte,
+			     PMBUS_WRITE_PROTECT);
+static SENSOR_DEVICE_ATTR_RO(capability, fd5121_byte,
+			     PMBUS_CAPABILITY);
+static SENSOR_DEVICE_ATTR_RW(smbalert_mask, fd5121_word,
+			     PMBUS_SMBALERT_MASK);
+static SENSOR_DEVICE_ATTR_RO(vout_mode, fd5121_byte,
+			     PMBUS_VOUT_MODE);
+static SENSOR_DEVICE_ATTR_RW(vout_command, fd5121_word,
+			     PMBUS_VOUT_COMMAND);
+static SENSOR_DEVICE_ATTR_RW(vout_max, fd5121_word,
+			     PMBUS_VOUT_MAX);
+static SENSOR_DEVICE_ATTR_RW(vout_margin_high, fd5121_word,
+			     PMBUS_VOUT_MARGIN_HIGH);
+static SENSOR_DEVICE_ATTR_RW(vout_margin_low, fd5121_word,
+			     PMBUS_VOUT_MARGIN_LOW);
+static SENSOR_DEVICE_ATTR_RW(vout_transition_rate, fd5121_word,
+			     PMBUS_VOUT_TRANSITION_RATE);
+static SENSOR_DEVICE_ATTR_RW(vout_min, fd5121_word,
+			     PMBUS_REG_VOUT_MIN);
+static SENSOR_DEVICE_ATTR_RW(power_mode, fd5121_byte,
+			     PMBUS_REG_POWER_MODE);
+static SENSOR_DEVICE_ATTR_RW(vin_on, fd5121_word,
+			     PMBUS_REG_VIN_ON);
+static SENSOR_DEVICE_ATTR_RW(vin_off, fd5121_word,
+			     PMBUS_REG_VIN_OFF);
+static SENSOR_DEVICE_ATTR_RW(vin_uv_fault_response, fd5121_byte,
+			     PMBUS_REG_VIN_UV_FAULT_RESPONSE);
+static SENSOR_DEVICE_ATTR_RW(iin_oc_fault_response, fd5121_byte,
+			     PMBUS_REG_IIN_OC_FAULT_RESPONSE);
+static SENSOR_DEVICE_ATTR_RW(ton_delay, fd5121_word,
+			     PMBUS_REG_TON_DELAY);
+static SENSOR_DEVICE_ATTR_RW(pout_op_fault_response, fd5121_byte,
+			     PMBUS_REG_POUT_OP_FAULT_RESPONSE);
+static SENSOR_DEVICE_ATTR_RO(read_vaux, fd5121_word,
+			     PMBUS_REG_READ_VAUX);
+static SENSOR_DEVICE_ATTR_RW(iknee_set, fd5121_word,
+			     PMBUS_REG_IKNEE_SET);
+static SENSOR_DEVICE_ATTR_RW(pin_counter, fd5121_byte,
+			     PMBUS_REG_PIN_COUNTER);
+static SENSOR_DEVICE_ATTR_RW(vmin_aware, fd5121_word,
+			     PMBUS_REG_VMIN_AWARE);
+static SENSOR_DEVICE_ATTR_RW(vout_ov_fault_response, fd5121_byte,
+			     PMBUS_VOUT_OV_FAULT_RESPONSE);
+static SENSOR_DEVICE_ATTR_RW(vout_uv_fault_response, fd5121_byte,
+			     PMBUS_VOUT_UV_FAULT_RESPONSE);
+static SENSOR_DEVICE_ATTR_RW(iout_oc_fault_response, fd5121_byte,
+			     PMBUS_IOUT_OC_FAULT_RESPONSE);
+static SENSOR_DEVICE_ATTR_RW(ot_fault_response, fd5121_byte,
+			     PMBUS_OT_FAULT_RESPONSE);
+static SENSOR_DEVICE_ATTR_RW(vin_ov_fault_response, fd5121_byte,
+			     PMBUS_VIN_OV_FAULT_RESPONSE);
+static SENSOR_DEVICE_ATTR_RW(vaux_uv_fault_limit, fd5121_word,
+			     PMBUS_REG_VAUX_UV_FAULT_LIMIT);
+static SENSOR_DEVICE_ATTR_RW(vaux_ov_fault_limit, fd5121_word,
+			     PMBUS_REG_VAUX_OV_FAULT_LIMIT);
+static SENSOR_DEVICE_ATTR_RW(vaux_uv_fault_response, fd5121_byte,
+			     PMBUS_REG_VAUX_UV_FAULT_RESPONSE);
+static SENSOR_DEVICE_ATTR_RW(vaux_ov_fault_response, fd5121_byte,
+			     PMBUS_REG_VAUX_OV_FAULT_RESPONSE);
+static SENSOR_DEVICE_ATTR_RW(vaux_uv_warning, fd5121_word,
+			     PMBUS_REG_VAUX_UV_WARNING);
+static SENSOR_DEVICE_ATTR_RW(vaux_ov_warning, fd5121_word,
+			     PMBUS_REG_VAUX_OV_WARNING);
+static SENSOR_DEVICE_ATTR_RO(free_user_config_tables, fd5121_byte,
+			     PMBUS_REG_MFR_FREE_USER_CONFIG_TABLES);
+static SENSOR_DEVICE_ATTR_RW(address_table, fd5121_byte,
+			     PMBUS_REG_MFR_ADDRESS_TABLE);
+static SENSOR_DEVICE_ATTR_RW(status_onsemi, fd5121_word,
+			     PMBUS_REG_MFR_STATUS_ONSEMI);
+static SENSOR_DEVICE_ATTR_RO(status_byte, fd5121_byte,
+			     PMBUS_STATUS_BYTE);
+static SENSOR_DEVICE_ATTR_RO(status_cml, fd5121_byte,
+			     PMBUS_STATUS_CML);
+static SENSOR_DEVICE_ATTR_RO(status_other, fd5121_byte,
+			     PMBUS_STATUS_OTHER);
+static SENSOR_DEVICE_ATTR_RO(status_mfr_specific, fd5121_byte,
+			     PMBUS_STATUS_MFR_SPECIFIC);
+static SENSOR_DEVICE_ATTR_RO(revision, fd5121_byte,
+			     PMBUS_REVISION);
+static SENSOR_DEVICE_ATTR_RO(id, fd5121_long,
+			     PMBUS_MFR_ID);
+static SENSOR_DEVICE_ATTR_RO(model, fd5121_long,
+			     PMBUS_MFR_MODEL);
+static SENSOR_DEVICE_ATTR_RO(mfr_revision, fd5121_long,
+			     PMBUS_MFR_REVISION);
+static SENSOR_DEVICE_ATTR_RW(date, fd5121_word,
+			     PMBUS_MFR_DATE);
+static SENSOR_DEVICE_ATTR_RO(ic_device_id, fd5121_long,
+			     PMBUS_IC_DEVICE_ID);
+static SENSOR_DEVICE_ATTR_RO(ic_device_rev, fd5121_word,
+			     PMBUS_IC_DEVICE_REV);
+static SENSOR_DEVICE_ATTR_WO(unlock, fd5121_byte,
+			     PMBUS_REG_MFR_UNLOCK);
+static SENSOR_DEVICE_ATTR_RO(faulty_sps, fd5121_block,
+			     PMBUS_REG_MFR_FAULTY_SPS);
+static SENSOR_DEVICE_ATTR_RO(tlvr_faults, fd5121_word,
+			     PMBUS_REG_TLVR_FAULTS);
+static SENSOR_DEVICE_ATTR_RW(user_store_config_tab, fd5121_byte,
+			     PMBUS_REG_MFR_USER_STORE_CONFIG_TAB);
+static SENSOR_DEVICE_ATTR_RO(user_config_index, fd5121_byte,
+			     PMBUS_REG_MFR_USER_CONFIG_INDEX);
+static SENSOR_DEVICE_ATTR_RO(pwm_disconnection, fd5121_word,
+			     PMBUS_REG_MFR_PWM_DISCONNECTION);
+static SENSOR_DEVICE_ATTR_RO(vr_disconnection, fd5121_byte,
+			     PMBUS_REG_MFR_VR_DISCONNECTION);
+static SENSOR_DEVICE_ATTR_RW(ton_slew, fd5121_byte,
+			     PMBUS_REG_MFR_TON_SLEW);
+static SENSOR_DEVICE_ATTR_RW(toff_slew, fd5121_byte,
+			     PMBUS_REG_MFR_TOFF_SLEW);
+static SENSOR_DEVICE_ATTR_RW(rail_name, fd5121_word,
+			     PMBUS_REG_MFR_RAIL_NAME);
+static SENSOR_DEVICE_ATTR_RW(vout_droop, fd5121_byte,
+			     PMBUS_REG_MFR_VOUT_DROOP);
+static SENSOR_DEVICE_ATTR_WO(svr_go, fd5121_send_byte,
+			     PMBUS_REG_MFR_SVR_GO);
+static SENSOR_DEVICE_ATTR_RW(user_restore_config_tab, fd5121_byte,
+			     PMBUS_REG_MFR_USER_RESTORE_CONFIG_TAB);
+static SENSOR_DEVICE_ATTR_WO(set_pwd, fd5121_byte,
+			     PMBUS_REG_MFR_SET_PWD);
+static SENSOR_DEVICE_ATTR_RW(config_activate, fd5121_byte,
+			     PMBUS_REG_MFR_CONFIG_ACTIVATE);
+static SENSOR_DEVICE_ATTR_RW(config_recover, fd5121_byte,
+			     PMBUS_REG_MFR_CONFIG_RECOVER);
+static SENSOR_DEVICE_ATTR_RW(otp_dump, fd5121_byte,
+			     PMBUS_REG_MFR_OTP_DUMP);
+static SENSOR_DEVICE_ATTR_RW(bbr_en, fd5121_byte,
+			     PMBUS_REG_MFR_BBR_EN);
+static SENSOR_DEVICE_ATTR_RW(dpm_min, fd5121_byte,
+			     PMBUS_REG_MFR_DPM_MIN);
+static SENSOR_DEVICE_ATTR_RW(vboot, fd5121_word,
+			     PMBUS_REG_MFR_VBOOT);
+static SENSOR_DEVICE_ATTR_RW(preclamp_offset, fd5121_word,
+			     PMBUS_REG_MFR_PRECLAMP_OFFSET);
+static SENSOR_DEVICE_ATTR_RW(tlvr_diagn, fd5121_word,
+			     PMBUS_REG_MFR_TLVR_DIAGN);
+static SENSOR_DEVICE_ATTR_RO(vsys, fd5121_word,
+			     PMBUS_REG_MFR_READ_VSYS);
+static SENSOR_DEVICE_ATTR_RW(specific_e9, fd5121_word,
+			     PMBUS_REG_MFR_SPECIFIC_E9);
+static SENSOR_DEVICE_ATTR_RW(specific_ea, fd5121_long,
+			     PMBUS_REG_MFR_SPECIFIC_EA);
+static SENSOR_DEVICE_ATTR_RO(ss_cbc, fd5121_word,
+			     PMBUS_REG_MFR_SS_CBC);
+static SENSOR_DEVICE_ATTR_RO(amd_status, fd5121_byte,
+			     PMBUS_REG_MFR_AMD_STATUS);
+static SENSOR_DEVICE_ATTR_RO(checksum, fd5121_word,
+			     PMBUS_REG_MFR_CHECKSUM);
+static SENSOR_DEVICE_ATTR_RO(cse_index, fd5121_word,
+			     PMBUS_REG_CSE_INDEX);
+static SENSOR_DEVICE_ATTR_RW(cout_measure, fd5121_word,
+			     PMBUS_REG_COUT_MEASURE);
+static SENSOR_DEVICE_ATTR_RO(vr_cout, fd5121_word,
+			     PMBUS_REG_VR_COUT);
+static SENSOR_DEVICE_ATTR_RO(bbr_ram, fd5121_block,
+			     PMBUS_REG_BBR_RAM);
+static SENSOR_DEVICE_ATTR_RO(bbr_otp, fd5121_block,
+			     PMBUS_REG_BBR_OTP);
+static SENSOR_DEVICE_ATTR_RO(psys, fd5121_word,
+			     PMBUS_REG_READ_PSYS);
+static SENSOR_DEVICE_ATTR_RW(postclamp_offset, fd5121_word,
+			     PMBUS_REG_POSTCLAMP_OFFSET);
+static SENSOR_DEVICE_ATTR_RW(pgood_delay, fd5121_byte,
+			     PMBUS_REG_PGOOD_DELAY);
+static SENSOR_DEVICE_ATTR_RW(specific_f8, fd5121_word,
+			     PMBUS_REG_MFR_SPECIFIC_F8);
+static SENSOR_DEVICE_ATTR_RW(specific_f9, fd5121_long,
+			     PMBUS_REG_MFR_SPECIFIC_F9);
+static SENSOR_DEVICE_ATTR_RW(pwd_program_ram, fd5121_word,
+			     PMBUS_REG_MFR_PWD_PROGRAM_RAM);
+static SENSOR_DEVICE_ATTR_RW(pwd_program_i2c, fd5121_word,
+			     PMBUS_REG_MFR_PWD_PROGRAM_I2C);
+static SENSOR_DEVICE_ATTR_RW(pwd_enable_otp_store, fd5121_word,
+			     PMBUS_REG_MFR_PWD_ENABLE_OTP_STORE);
+
+static struct attribute *fd5121_non_paged_attrs[] = {
+	&sensor_dev_attr_page.dev_attr.attr,
+	&sensor_dev_attr_capability.dev_attr.attr,
+	&sensor_dev_attr_pin_counter.dev_attr.attr,
+	&sensor_dev_attr_vaux_uv_fault_limit.dev_attr.attr,
+	&sensor_dev_attr_vaux_ov_fault_limit.dev_attr.attr,
+	&sensor_dev_attr_vaux_uv_warning.dev_attr.attr,
+	&sensor_dev_attr_vaux_ov_warning.dev_attr.attr,
+	&sensor_dev_attr_free_user_config_tables.dev_attr.attr,
+	&sensor_dev_attr_address_table.dev_attr.attr,
+	&sensor_dev_attr_unlock.dev_attr.attr,
+	&sensor_dev_attr_faulty_sps.dev_attr.attr,
+	&sensor_dev_attr_tlvr_faults.dev_attr.attr,
+	&sensor_dev_attr_user_store_config_tab.dev_attr.attr,
+	&sensor_dev_attr_user_config_index.dev_attr.attr,
+	&sensor_dev_attr_pwm_disconnection.dev_attr.attr,
+	&sensor_dev_attr_vr_disconnection.dev_attr.attr,
+	&sensor_dev_attr_user_restore_config_tab.dev_attr.attr,
+	&sensor_dev_attr_svr_go.dev_attr.attr,
+	&sensor_dev_attr_set_pwd.dev_attr.attr,
+	&sensor_dev_attr_config_activate.dev_attr.attr,
+	&sensor_dev_attr_config_recover.dev_attr.attr,
+	&sensor_dev_attr_otp_dump.dev_attr.attr,
+	&sensor_dev_attr_bbr_en.dev_attr.attr,
+	&sensor_dev_attr_vboot.dev_attr.attr,
+	&sensor_dev_attr_vsys.dev_attr.attr,
+	&sensor_dev_attr_specific_e9.dev_attr.attr,
+	&sensor_dev_attr_specific_ea.dev_attr.attr,
+	&sensor_dev_attr_ss_cbc.dev_attr.attr,
+	&sensor_dev_attr_checksum.dev_attr.attr,
+	&sensor_dev_attr_cse_index.dev_attr.attr,
+	&sensor_dev_attr_cout_measure.dev_attr.attr,
+	&sensor_dev_attr_vr_cout.dev_attr.attr,
+	&sensor_dev_attr_bbr_ram.dev_attr.attr,
+	&sensor_dev_attr_bbr_otp.dev_attr.attr,
+	&sensor_dev_attr_psys.dev_attr.attr,
+	&sensor_dev_attr_specific_f8.dev_attr.attr,
+	&sensor_dev_attr_specific_f9.dev_attr.attr,
+	&sensor_dev_attr_pwd_program_ram.dev_attr.attr,
+	&sensor_dev_attr_pwd_program_i2c.dev_attr.attr,
+	&sensor_dev_attr_pwd_enable_otp_store.dev_attr.attr,
+	&sensor_dev_attr_revision.dev_attr.attr,
+	&sensor_dev_attr_id.dev_attr.attr,
+	&sensor_dev_attr_model.dev_attr.attr,
+	&sensor_dev_attr_mfr_revision.dev_attr.attr,
+	&sensor_dev_attr_date.dev_attr.attr,
+	&sensor_dev_attr_ic_device_id.dev_attr.attr,
+	&sensor_dev_attr_ic_device_rev.dev_attr.attr,
+	NULL
+};
+
+static struct attribute *fd5121_paged_attrs[] = {
+	&sensor_dev_attr_operation.dev_attr.attr,
+	&sensor_dev_attr_vout_raw.dev_attr.attr,
+	&sensor_dev_attr_on_off_config.dev_attr.attr,
+	&sensor_dev_attr_clear_faults.dev_attr.attr,
+	&sensor_dev_attr_write_protect.dev_attr.attr,
+	&sensor_dev_attr_smbalert_mask.dev_attr.attr,
+	&sensor_dev_attr_vout_mode.dev_attr.attr,
+	&sensor_dev_attr_vout_command.dev_attr.attr,
+	&sensor_dev_attr_vout_margin_high.dev_attr.attr,
+	&sensor_dev_attr_vout_margin_low.dev_attr.attr,
+	&sensor_dev_attr_vout_min.dev_attr.attr,
+	&sensor_dev_attr_vin_on.dev_attr.attr,
+	&sensor_dev_attr_vin_off.dev_attr.attr,
+	&sensor_dev_attr_vout_ov_fault_response.dev_attr.attr,
+	&sensor_dev_attr_vout_uv_fault_response.dev_attr.attr,
+	&sensor_dev_attr_iout_oc_fault_response.dev_attr.attr,
+	&sensor_dev_attr_ot_fault_response.dev_attr.attr,
+	&sensor_dev_attr_vin_ov_fault_response.dev_attr.attr,
+	&sensor_dev_attr_status_byte.dev_attr.attr,
+	&sensor_dev_attr_iknee_set.dev_attr.attr,
+	&sensor_dev_attr_vmin_aware.dev_attr.attr,
+	&sensor_dev_attr_power_mode.dev_attr.attr,
+	&sensor_dev_attr_vin_uv_fault_response.dev_attr.attr,
+	&sensor_dev_attr_iin_oc_fault_response.dev_attr.attr,
+	&sensor_dev_attr_ton_delay.dev_attr.attr,
+	&sensor_dev_attr_pout_op_fault_response.dev_attr.attr,
+	&sensor_dev_attr_vaux_uv_fault_response.dev_attr.attr,
+	&sensor_dev_attr_vaux_ov_fault_response.dev_attr.attr,
+	&sensor_dev_attr_status_onsemi.dev_attr.attr,
+	&sensor_dev_attr_status_cml.dev_attr.attr,
+	&sensor_dev_attr_status_other.dev_attr.attr,
+	&sensor_dev_attr_status_mfr_specific.dev_attr.attr,
+	&sensor_dev_attr_read_vaux.dev_attr.attr,
+	&sensor_dev_attr_ton_slew.dev_attr.attr,
+	&sensor_dev_attr_toff_slew.dev_attr.attr,
+	&sensor_dev_attr_rail_name.dev_attr.attr,
+	&sensor_dev_attr_vout_droop.dev_attr.attr,
+	&sensor_dev_attr_dpm_min.dev_attr.attr,
+	&sensor_dev_attr_preclamp_offset.dev_attr.attr,
+	&sensor_dev_attr_tlvr_diagn.dev_attr.attr,
+	&sensor_dev_attr_amd_status.dev_attr.attr,
+	&sensor_dev_attr_postclamp_offset.dev_attr.attr,
+	&sensor_dev_attr_pgood_delay.dev_attr.attr,
+	&sensor_dev_attr_vout_max.dev_attr.attr,
+	&sensor_dev_attr_vout_transition_rate.dev_attr.attr,
+	NULL
+};
+
+static struct attribute_group fd5121_groups[2] = {
+	{ .name = "global", .attrs = fd5121_non_paged_attrs },
+	{ .name = "paged", .attrs = fd5121_paged_attrs }
+};
+
+/* Regulator descriptors for VOUT rails (VID encoded) */
+static struct regulator_desc fd5121_reg_desc[] = {
+	PMBUS_REGULATOR_STEP_ONE("vout1", 3001, 1000, 200000),
+	PMBUS_REGULATOR_STEP_ONE("vout2", 3001, 1000, 200000),
+};
+
+static int fd5121_valid_reg(struct i2c_client *client, int reg)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(cc_list); i++) {
+		if (reg == cc_list[i])
+			return 0;
+	}
+
+	if (fd5121_blk_rd_reg(reg))
+		return 0;
+	return -ENXIO;
+}
+
+static int fd5121_read_word_data(struct i2c_client *client, int page,
+				 int phase, int reg)
+{
+	int ret;
+
+	ret = fd5121_valid_reg(client, reg);
+	if (ret < 0)
+		return ret;
+
+	ret = pmbus_read_word_data(client, page, phase, reg);
+	if (ret < 0)
+		return ret;
+
+	/* The chip reports VOUT_MODE as VID but returns READ_VOUT as a raw
+	 * value at 1 mV per bit. Encode it as a VID code so that the PMBus
+	 * core decodes and reports it correctly.
+	 */
+	if (reg == PMBUS_READ_VOUT)
+		ret = DIV_ROUND_CLOSEST(155000 - ret * 100, 625);
+	return ret;
+}
+
+static int fd5121_read_byte_data(struct i2c_client *client, int page, int reg)
+{
+	int ret;
+
+	ret = fd5121_valid_reg(client, reg);
+	if (ret < 0)
+		return ret;
+
+	return pmbus_read_byte_data(client, page, reg);
+}
+
+static int fd5121_write_byte_data(struct i2c_client *client, int page,
+				  int reg, u8 value)
+{
+	int ret;
+
+	ret = fd5121_valid_reg(client, reg);
+	if (ret < 0)
+		return ret;
+	return pmbus_write_byte_data(client, page, reg, value);
+}
+
+static int fd5121_write_byte(struct i2c_client *client, int page, u8 byte)
+{
+	return pmbus_write_byte(client, page, byte);
+}
+
+static int fd5121_write_word_data(struct i2c_client *client, int page,
+				    int reg, u16 word)
+{
+	int ret;
+
+	ret = fd5121_valid_reg(client, reg);
+	if (ret < 0)
+		return ret;
+	ret = pmbus_write_word_data(client, page, reg, word);
+	return ret;
+}
+
+static u32 fd5121_get_dev_id(struct i2c_client *client)
+{
+	u32 dev_id = 0;
+	s32 ret = 0;
+
+	ret = fd5121_read_long(client, PMBUS_IC_DEVICE_ID, &dev_id);
+	if (ret < 0)
+		return CTLR_ID_UNKNOWN;
+
+	switch (dev_id) {
+	case CTLR_ID_FD5121:
+	case CTLR_ID_FD5123:
+	case CTLR_ID_FD5125:
+		break;
+	default:
+		if (dev_id != 0)
+			dev_err(&client->dev, "Unknown device 0x%x\n",
+				dev_id);
+		return CTLR_ID_UNKNOWN;
+	}
+	return dev_id;
+}
+
+static int fd5121_probe(struct i2c_client *client)
+{
+	struct pmbus_driver_info *info;
+	struct fd5121_data *pdata;
+	u32 id;
+
+	if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C))
+		return -EOPNOTSUPP;
+
+	pdata = devm_kzalloc(&client->dev, sizeof(*pdata),
+			     GFP_KERNEL);
+	if (!pdata)
+		return -ENOMEM;
+
+	pdata->dev = &client->dev;
+	pdata->groups[0] = &fd5121_groups[0];
+	pdata->groups[1] = &fd5121_groups[1];
+
+	id = fd5121_get_dev_id(client);
+	if (id == CTLR_ID_UNKNOWN)
+		return -ENODEV;
+
+	pdata->id = id;
+
+	switch (id) {
+	case CTLR_ID_FD5121:
+	case CTLR_ID_FD5123:
+	case CTLR_ID_FD5125:
+		break;
+	default:
+		dev_err(&client->dev, "Failed to read device ID\n");
+		return -ENODEV;
+	}
+
+	info = &pdata->info;
+	info->groups = (const struct attribute_group **)&pdata->groups[0];
+	info->write_word_data = fd5121_write_word_data;
+	info->write_byte = fd5121_write_byte;
+	info->write_byte_data = fd5121_write_byte_data;
+	info->read_word_data = fd5121_read_word_data;
+	info->read_byte_data = fd5121_read_byte_data;
+
+	info->pages = FD5121_NUM_PAGES;
+	info->format[PSC_VOLTAGE_IN] = linear;
+	info->format[PSC_VOLTAGE_OUT] = vid;
+
+	fd5121_reg_desc[0].id = 0;
+	fd5121_reg_desc[1].id = 1;
+
+	/* Device implements VID coding with 1 mV steps from 0.200 V
+	 * up to 3.200 V
+	 */
+	info->num_regulators = FD5121_NUM_PAGES;
+	info->reg_desc = fd5121_reg_desc;
+	info->format[PSC_CURRENT_IN] = linear;
+	info->format[PSC_CURRENT_OUT] = linear;
+	info->format[PSC_POWER] = linear;
+	info->format[PSC_TEMPERATURE] = linear;
+	for (u8 idx = 0; idx < info->pages; idx++) {
+		info->func[idx] = PMBUS_HAVE_IOUT | PMBUS_HAVE_STATUS_IOUT;
+		info->func[idx] |= PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT;
+		info->func[idx] |= PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP;
+		info->func[idx] |= PMBUS_HAVE_PIN | PMBUS_HAVE_POUT;
+		info->func[idx] |= PMBUS_HAVE_VIN | PMBUS_HAVE_IIN;
+		info->func[idx] |= PMBUS_HAVE_STATUS_INPUT;
+		info->vrm_version[idx] = amd625mv;
+	}
+	return pmbus_do_probe(client, info);
+}
+
+#ifdef CONFIG_OF
+static const struct of_device_id fd5121_of_match[] = {
+	{ .compatible = "onnn,fd5121" },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, fd5121_of_match);
+#endif
+
+static const struct i2c_device_id fd5121_id[] = {
+	{ "fd5121", chip_fd5121 },
+	{ "fd5123", chip_fd5123 },
+	{ "fd5125", chip_fd5125 },
+	{ }
+};
+MODULE_DEVICE_TABLE(i2c, fd5121_id);
+
+static struct i2c_driver fd5121_driver = {
+	.driver = {
+		.name = "fd5121",
+		.of_match_table = of_match_ptr(fd5121_of_match),
+	},
+	.probe = fd5121_probe,
+	.id_table = fd5121_id,
+};
+
+module_i2c_driver(fd5121_driver);
+
+MODULE_AUTHOR("Selva Rajagopal <selvamani.rajagopal@onsemi.com>");
+MODULE_DESCRIPTION("PMBus driver for FD5121");
+MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS("PMBUS");
+

-- 
2.43.0
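
The READ_VOUT fix-up in fd5121_read_word_data() can be checked in user
space. What follows is a minimal standalone model, not driver code. It
assumes that the PMBus core decodes amd625mv codes as
DIV_ROUND_CLOSEST(155000 - code * 625, 100) for codes 0x00..0xd8, which
corresponds to 200 mV..1550 mV. The toy_* names and the user-space
rounding helper are made up for illustration.

```c
#include <assert.h>

/* Round-to-nearest division for non-negative operands (user-space stand-in). */
static int toy_div_round_closest(int x, int d)
{
	return (x + d / 2) / d;
}

/* Encode millivolts as a 6.25 mV/step VID code, as the patch does for READ_VOUT. */
static int toy_vid_encode_mv(int mv)
{
	assert(mv >= 200 && mv <= 1550);	/* assumed valid range */
	return toy_div_round_closest(155000 - mv * 100, 625);
}

/* Decode a VID code back to millivolts (assumed amd625mv formula). */
static int toy_vid_decode_mv(int code)
{
	assert(code >= 0 && code <= 0xd8);	/* assumed valid code range */
	return toy_div_round_closest(155000 - code * 625, 100);
}
```

Under these assumptions a raw reading of 1003 mV encodes to code 88 and
decodes back to 1000 mV, so reported values are quantized to the 6.25 mV
VID step even though the chip itself reports 1 mV per bit.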



^ permalink raw reply related

* Re: [PATCH v8 09/46] KVM: guest_memfd: Introduce function to check GFN private/shared status
From: Binbin Wu @ 2026-06-23  5:25 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-9-9d2959357853@google.com>



On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
> From: Ackerley Tng <ackerleytng@google.com>
> 
> Introduce function for KVM to check the private/shared status of guest
           ^
Nit:       a
> memory at a given GFN.
> 
> This will be used in a later patch.

[...]

>  
> +bool kvm_gmem_is_private(struct kvm *kvm, gfn_t gfn)
> +{
> +	struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
> +	struct inode *inode;
> +
> +	/*
> +	 * If this gfn has no associated memslot, there's no chance of the gfn
> +	 * being backed by private memory, since guest_memfd must be used for
> +	 * private memory,

"guest_memfd must be used for private memory" is a bit confusing to me.


> and guest_memfd must be associated with some memslot.
> +	 */
> +	if (!slot)
> +		return 0;
> +
> +	CLASS(gmem_get_file, file)(slot);
> +	if (!file)
> +		return 0;
> +
> +	inode = file_inode(file);
> +
> +	/*
> +	 * Rely on the maple tree's internal RCU lock to ensure a
> +	 * stable result. This result can become stale as soon as the
> +	 * lock is dropped, so the caller _must_ still protect
> +	 * consumption of private vs. shared by checking
> +	 * mmu_invalidate_retry_gfn() under mmu_lock to serialize
> +	 * against ongoing attribute updates.
> +	 */
> +	return kvm_gmem_is_private_mem(inode, kvm_gmem_get_index(slot, gfn));
> +}
> +EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_gmem_is_private);
> +
>  static struct file_operations kvm_gmem_fops = {
>  	.mmap		= kvm_gmem_mmap,
>  	.open		= generic_file_open,
> 


^ permalink raw reply

* Re: [RFC PATCH 0/2] kasan: hw_tags: Add option to tag only at allocation time
From: Dev Jain @ 2026-06-23  5:02 UTC (permalink / raw)
  To: Catalin Marinas, Harry Yoo
  Cc: ryabinin.a.a, akpm, corbet, glider, andreyknvl, dvyukov,
	vincenzo.frascino, kasan-dev, linux-mm, linux-kernel, skhan,
	workflows, linux-doc, linux-arm-kernel, ryan.roberts,
	anshuman.khandual, kaleshsingh, 21cnbao, david, will
In-Reply-To: <ajltLd6FQg1aMge_@arm.com>



On 22/06/26 10:43 pm, Catalin Marinas wrote:
> Hi Harry,
> 
> On Mon, Jun 22, 2026 at 09:42:10PM +0900, Harry Yoo wrote:
>> On 6/19/26 10:19 PM, Catalin Marinas wrote:
>>> On Thu, Jun 18, 2026 at 10:35:15PM +0900, Harry Yoo wrote:
>>>> On 6/12/26 1:44 PM, Dev Jain wrote:
>>>>> Now, when a memory object is freed, it retains the random tag it
>>>>> had at allocation time. This compromises on catching UAF bugs until
>>>>> the object is reallocated, at which point it gets a new random
>>>>> tag.
>>>>>
>>>>> Hence, not catching "use-after-free-before-reallocation" and not catching
>>>>> "double-free" will be the compromise for reduced KASAN overhead.
>>>>
>>>> I doubt users who care about security enough to enable HW_TAGS KASAN
>>>> are willing to compromise on security just to save a few instructions
>>>> to store tags in the free path.
>>>>
>>>> To me, it looks like too much of a compromise on security for little
>>>> performance gain.
>>>
>>> I don't think there's much compromise on security for use-after-free.
>>
>> I think it depends... OH, WAIT! I see what you mean.
>>
>> You mean use-after-free before reallocation does not lead to much
>> compromise on security because objects are initialized after allocation?
>>
>> You're probably right.
>>
>> Hmm, but stores to e.g. the free pointer, fields initialized by a
>> constructor, or fields accessed under SLAB_TYPESAFE_BY_RCU semantics
>> after free will go undetected if they happen before reallocation.
> 
> Even with SLAB_TYPESAFE_BY_RCU, the object isn't tagged on free either
> (or realloc, only if the actual slab page ends up freed). But we don't
> get type confusion for such slab.
> 
> However, without tagging on free, one could argue that it reduces
> security for cases where the page is re-allocated as untagged - e.g. all
> user pages mapped without PROT_MTE. Currently we have a deterministic
> tag check fault if the page is coloured as KASAN_TAG_INVALID. I think

So you are saying that a stale kernel pointer can keep using the
reallocated page, because in the non-PROT_MTE case the page does not get
a new tag. Makes sense.

> for this patch, it might be better to only do such skip on free in
> kasan_poison_slab() rather than kasan_poison(). Freed pages would then
> be tagged.

I think you mean to say, "skip tag on free when freeing pages into buddy"?
That would mean kasan_poison() also does the poisoning when
value == KASAN_PAGE_FREE.

> 
> An alternative would be tagging on free only with a new tag and skipping
> it on re-alloc. But we'd need to track whether it's a completely new
> allocation or a reused object (I haven't looked, but I'm pretty sure it's
> doable).

That was our original approach, and IIRC we had concluded there was no
security compromise. However, it is difficult to implement: for example,
what happens when two differently tagged pages are coalesced by buddy
and someone gets that large page as an allocation?
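
To make the trade-off discussed in this thread concrete, here is a toy
user-space model of tag checking. It is only a sketch under loud
assumptions: one tag per object, tags passed in by the caller instead of
being random, and an invalid tag value of 0xFE chosen for illustration.
None of the toy_* names exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_TAG_INVALID 0xFEu	/* illustrative "poisoned" tag value */

struct toy_obj {
	uint8_t mem_tag;	/* tag currently stored for the object's memory */
};

struct toy_ptr {
	struct toy_obj *obj;
	uint8_t tag;		/* tag carried in the pointer */
};

/* Allocation always retags the memory and returns a matching pointer. */
static struct toy_ptr toy_alloc(struct toy_obj *obj, uint8_t new_tag)
{
	obj->mem_tag = new_tag;
	return (struct toy_ptr){ .obj = obj, .tag = new_tag };
}

/* Free either poisons the memory tag or leaves the old tag in place. */
static void toy_free(struct toy_obj *obj, bool poison_on_free)
{
	if (poison_on_free)
		obj->mem_tag = TOY_TAG_INVALID;
}

/* An access faults when the pointer tag and the memory tag differ. */
static bool toy_access_faults(struct toy_ptr p)
{
	return p.tag != p.obj->mem_tag;
}
```

In this model a stale pointer faults immediately after a poisoning free,
but after a non-poisoning free it keeps working until the object is
reallocated with a different tag. If the memory is reused without being
retagged at all, as in the untagged user page case raised above, the
stale pointer never faults.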

> 


^ permalink raw reply

* Re: [PATCH v8 07/46] KVM: Rename memory attribute APIs to prepare for in-place gmem conversion
From: Binbin Wu @ 2026-06-23  4:55 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-7-9d2959357853@google.com>

On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:

> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index d370e834d619e..eb26d4ea8945a 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2534,13 +2534,13 @@ static inline bool kvm_memslot_is_gmem_only(const struct kvm_memory_slot *slot)
>  }
>  
>  #ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
> -static inline unsigned long kvm_get_memory_attributes(struct kvm *kvm, gfn_t gfn)
> +static inline unsigned long kvm_get_vm_memory_attributes(struct kvm *kvm, gfn_t gfn)
>  {
>  	return xa_to_value(xa_load(&kvm->mem_attr_array, gfn));
>  }
>  
> -bool kvm_range_has_memory_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
> -				     unsigned long mask, unsigned long attrs);
> +bool kvm_range_has_vm_memory_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
> +					unsigned long mask, unsigned long attrs);
>  bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm,
>  					struct kvm_gfn_range *range);
>  bool kvm_arch_post_set_memory_attributes(struct kvm *kvm,
> @@ -2548,7 +2548,14 @@ bool kvm_arch_post_set_memory_attributes(struct kvm *kvm,
>  
>  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>  {
> -	return kvm_get_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
> +	return kvm_get_vm_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
> +}
> +static inline bool kvm_mem_range_is_private(struct kvm *kvm, gfn_t start,
> +					    gfn_t end)
> +{
> +	return kvm_range_has_vm_memory_attributes(kvm, start, end,
> +						  KVM_MEMORY_ATTRIBUTE_PRIVATE,
> +						  KVM_MEMORY_ATTRIBUTE_PRIVATE);
>  }

This function is added, but never used in this patch series.
Is it intended to be called only when CONFIG_KVM_VM_MEMORY_ATTRIBUTES is
enabled?



>  #else
>  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)

^ permalink raw reply

* Re: [PATCH v8 06/46] KVM: Enumerate support for PRIVATE memory iff kvm_arch_has_private_mem is defined
From: Binbin Wu @ 2026-06-23  3:10 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-6-9d2959357853@google.com>

On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
> From: Ackerley Tng <ackerleytng@google.com>
> 
> Explicitly guard reporting support for KVM_MEMORY_ATTRIBUTE_PRIVATE based
> on kvm_arch_has_private_mem being #defined in anticipation of decoupling
> kvm_supported_mem_attributes() from CONFIG_KVM_VM_MEMORY_ATTRIBUTES.
> guest_memfd support for memory attributes will be unconditional to avoid
> yet more macros (all architectures that support guest_memfd are expected to
> use per-gmem attributes at some point), at which point enumerating support
> for KVM_MEMORY_ATTRIBUTE_PRIVATE based solely on memory attributes being
> supported _somewhere_ would result in KVM over-reporting support on arm64.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Reviewed-by: Fuad Tabba <tabba@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

> ---
>  virt/kvm/kvm_main.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 1ccc4895a4c26..7b989b659cf82 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2421,8 +2421,10 @@ static int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm,
>  #ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>  static u64 kvm_supported_mem_attributes(struct kvm *kvm)
>  {
> +#ifdef kvm_arch_has_private_mem
>  	if (!kvm || kvm_arch_has_private_mem(kvm))
>  		return KVM_MEMORY_ATTRIBUTE_PRIVATE;
> +#endif
>  
>  	return 0;
>  }
> 


^ permalink raw reply

* Re: [PATCH v8 04/46] KVM: Decouple kvm_has_arch_private_mem from CONFIG_KVM_VM_MEMORY_ATTRIBUTES
From: Binbin Wu @ 2026-06-23  2:51 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-4-9d2959357853@google.com>

On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> When memory attributes become trackable in guest_memfd, the concept of
> having private memory is no longer dependent on
> CONFIG_KVM_VM_MEMORY_ATTRIBUTES.
> 
> With this, on x86, kvm_arch_has_private_mem() is defined if some CoCo
> platform support (or the testing CONFIG_KVM_SW_PROTECTED_VM) is compiled
> in.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Co-developed-by: Ackerley Tng <ackerleytng@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

One nit below.

> ---
>  arch/x86/include/asm/kvm_host.h | 4 +++-
>  include/linux/kvm_host.h        | 2 +-
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 8e8eb8a5e8a6b..1bde67cf6eb0e 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -2394,7 +2394,9 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
>  		       int tdp_max_root_level, int tdp_huge_page_level);
>  
>  
> -#ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
> +#if defined(CONFIG_KVM_SW_PROTECTED_VM) ||	\
> +	defined(CONFIG_KVM_INTEL_TDX) ||	\
> +	defined(CONFIG_KVM_AMD_SEV)

Nit:
Vertically align the defined(XXX) statements for better readability?


>  #define kvm_arch_has_private_mem(kvm) ((kvm)->arch.has_private_mem)
>  #endif
>  
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 201d0f2143976..d370e834d619e 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -722,7 +722,7 @@ static inline int kvm_arch_vcpu_memslots_id(struct kvm_vcpu *vcpu)
>  }
>  #endif
>  
> -#ifndef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
> +#ifndef kvm_arch_has_private_mem
>  static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
>  {
>  	return false;
> 
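
For readers unfamiliar with the pattern used here, the following
standalone sketch shows how an arch header can opt in by defining a
function-like macro, how a generic header supplies a fallback with
#ifndef, and how a caller can key off the same name with #ifdef. All
toy_* and TOY_* names are hypothetical stand-ins, and the attribute bit
value is chosen for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_vm {
	bool has_private_mem;
};

/* "Arch header": defined only when some protected-VM backend is built in. */
#if defined(TOY_SW_PROTECTED_VM) || \
    defined(TOY_TDX) || \
    defined(TOY_SEV)
#define toy_arch_has_private_mem(vm) ((vm)->has_private_mem)
#endif

/* "Generic header": fallback when the arch did not define the macro. */
#ifndef toy_arch_has_private_mem
static inline bool toy_arch_has_private_mem(struct toy_vm *vm)
{
	(void)vm;
	return false;
}
#endif

#define TOY_ATTRIBUTE_PRIVATE (1ULL << 3)	/* illustrative bit */

/* Report the private attribute only if the arch opted in via the macro. */
static unsigned long long toy_supported_mem_attributes(struct toy_vm *vm)
{
#ifdef toy_arch_has_private_mem
	if (!vm || toy_arch_has_private_mem(vm))
		return TOY_ATTRIBUTE_PRIVATE;
#endif
	(void)vm;
	return 0;
}
```

Built without any of the TOY_* options, the fallback is an inline
function rather than a macro, so the #ifdef block compiles out and no
attribute is reported even for a NULL vm. Building with, say,
-DTOY_SEV flips both behaviours without touching the caller.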


^ permalink raw reply

* Re: [PATCH v8 03/46] KVM: Move KVM_VM_MEMORY_ATTRIBUTES config definition to x86
From: Binbin Wu @ 2026-06-23  2:48 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-3-9d2959357853@google.com>

On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> Bury KVM_VM_MEMORY_ATTRIBUTES in x86 to discourage other architectures
> from adding support for per-VM memory attributes, because tracking private
> vs. shared memory on a per-VM basis is now deprecated in favor of tracking
> on a per-guest_memfd basis, and while RWX memory attributes are on the
> horizon, they too are expected to be x86-only.
> 
> This will also allow modifying KVM_VM_MEMORY_ATTRIBUTES to be
> user-selectable (in x86) without creating weirdness in KVM's Kconfigs.
> Now that guest_memfd supports in-place conversions, it's entirely possible
> to run x86 CoCo VMs without support for KVM_VM_MEMORY_ATTRIBUTES.
> 
> Leave the code itself in common KVM so that it's trivial to undo this
> change if new per-VM attributes do come along.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Reviewed-by: Fuad Tabba <tabba@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

> ---
>  arch/x86/kvm/Kconfig | 3 +++
>  virt/kvm/Kconfig     | 3 ---
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 26f6afd51bbdc..24f96396cfa1c 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -80,6 +80,9 @@ config KVM_WERROR
>  
>  	  If in doubt, say "N".
>  
> +config KVM_VM_MEMORY_ATTRIBUTES
> +	bool
> +
>  config KVM_SW_PROTECTED_VM
>  	bool "Enable support for KVM software-protected VMs"
>  	depends on EXPERT
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index 5119cb37145fc..297e4399fbd49 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -100,9 +100,6 @@ config KVM_ELIDE_TLB_FLUSH_IF_YOUNG
>  config KVM_MMU_LOCKLESS_AGING
>         bool
>  
> -config KVM_VM_MEMORY_ATTRIBUTES
> -       bool
> -
>  config KVM_GUEST_MEMFD
>         select XARRAY_MULTI
>         bool
> 


^ permalink raw reply

* Re: [PATCH v8 02/46] KVM: Rename KVM_GENERIC_MEMORY_ATTRIBUTES to KVM_VM_MEMORY_ATTRIBUTES
From: Binbin Wu @ 2026-06-23  2:48 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-2-9d2959357853@google.com>

On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> Rename the per-VM memory attributes Kconfig to make it explicitly about
> per-VM attributes in anticipation of adding memory attributes support to
> guest_memfd, at which point it will be possible (and desirable) to have
> memory attributes without the per-VM support, even in x86.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Reviewed-by: Fuad Tabba <tabba@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>


^ permalink raw reply

* Re: [PATCH v8 00/46] guest_memfd: In-place conversion support
From: Xiaoyao Li @ 2026-06-23  2:39 UTC (permalink / raw)
  To: ackerleytng, aik, andrew.jones, binbin.wu, brauner, chao.p.peng,
	david, jmattson, jthoughton, michael.roth, oupton, pankaj.gupta,
	qperret, rick.p.edgecombe, rientjes, shivankg, steven.price,
	tabba, willy, wyihan, yan.y.zhao, forkloop, pratyush,
	suzuki.poulose, aneesh.kumar, liam, Paolo Bonzini,
	Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen,
	Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt,
	Kiryl Shutsemau, Baoquan He, Jason Gunthorpe, Vlastimil Babka
  Cc: kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-0-9d2959357853@google.com>

On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
> TODOs
> 
> + Retest with TDX selftests. v7 was tested with TDX [12], but the setup there was
>    wrong. Conversions were successful (no errors), but the shared memory being
>    tested is actually in a completely different host physical page.

Glad to see you already knew about this (I was going to report it
against the original POC TDX patch).

^ permalink raw reply

* Re: [PATCH v8 01/46] KVM: guest_memfd: Introduce per-gmem attributes, use to guard user mappings
From: Binbin Wu @ 2026-06-23  2:14 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: ackerleytng, aik, andrew.jones, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen,
	Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt,
	Kiryl Shutsemau, Baoquan He, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <ajnjTJdQKD1Kz3tf@google.com>



On 6/23/2026 9:37 AM, Sean Christopherson wrote:
> On Mon, Jun 22, 2026, Binbin Wu wrote:
>> On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
>>
>> [...]
>>
>>>  
>>> +static u64 kvm_gmem_get_attributes(struct inode *inode, pgoff_t index)
>>> +{
>>> +	struct maple_tree *mt = &GMEM_I(inode)->attributes;
>>> +	void *entry = mtree_load(mt, index);
>>> +
>>> +	return WARN_ON_ONCE(!entry) ? 0 : xa_to_value(entry);
>>
>> If the entry is unexpectedly missing, returning 0 means the attribute would
>> be treated as shared.  And then in kvm_gmem_fault_user_mapping(), it would
>> allow the userspace to fault in the folio.
>>
>> Should gmem deny such an edge case?
> 
> After several bugs this year where a WARN_ON_ONCE() fired, but was entirely
> insufficient to prevent true badness, I'm definitely sensitive to making the "bad"
> behavior as harmless as possible.
> 
> However, in this case I think we're just hosed.  If KVM treats the memory as
> private, KVM will incorrectly do prepare(), incorrectly allow populate(), and
> will cause missed invalidations (though I suppose __kvm_gmem_set_attributes()
> "only" lies to userspace in that case).
> 
> That said, assuming SHARED is definitely odd for cases where guest_memfd *can't*
> hold shared memory.  Ditto for assuming PRIVATE.  

Indeed.

> What if we instead fall back to
> the "init" state, e.g.?

LGTM.

> 
> static u64 kvm_gmem_get_attributes(struct inode *inode, pgoff_t index)
> {
> 	struct maple_tree *mt = &GMEM_I(inode)->attributes;
> 	void *entry = mtree_load(mt, index);
> 
> 	if (WARN_ON_ONCE(!entry)) {
> 		bool shared = GMEM_I(inode)->flags & GUEST_MEMFD_FLAG_INIT_SHARED;
> 
> 		return shared ? 0 : KVM_MEMORY_ATTRIBUTE_PRIVATE;
> 	}
> 
> 	return xa_to_value(entry);
> }
> 


^ permalink raw reply

* [PATCH v7 10/10] tracing/probes: Add a new testcase for BTF typecasts
From: Masami Hiramatsu (Google) @ 2026-06-23  1:45 UTC (permalink / raw)
  To: Steven Rostedt, Mathieu Desnoyers
  Cc: Jonathan Corbet, Shuah Khan, Masami Hiramatsu, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <178217904992.643090.15726197350652241270.stgit@devnote2>

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

With the introduction of container_of-style BTF typecasting and
per-CPU variable access support in trace probes, we need a way to
verify their functionality and prevent regressions.

Add a new ftrace kselftest and update the trace event sample module
to test and validate these features.

Specifically, update the trace-events-sample module to set up a
periodic timer whose callback accesses a per-CPU counter. Introduce
a new sample trace event, foo_timer_fn, to trace this callback
and log the current counter value.

Then, add a new test case, btf_probe_event.tc, which defines a
dynamic probe on the timer callback. The probe uses BTF typecasting
to recover the parent structure from the timer argument and
this_cpu_read() to fetch the per-CPU counter. The test verifies
the integrity of the implementation by ensuring the values
recorded by the dynamic probe match those from the static tracepoint.

Assisted-by: Antigravity:gemini-3.5-flash
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v6:
  - Update testcase according to changes.
 Changes in v5:
  - Add more syntax test cases.
 Changes in v4:
  - Fix uprobe $current test.
 Changes in v3:
  - Add syntax test case.
  - Update testcase to use this_cpu_read()
 Changes in v2:
  - Use timer_shutdown_sync() instead of timer_delete_sync() for teardown.
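
 For reference, the container_of-style recovery that the new probe
 relies on can be sketched in plain user-space C. This is only an
 illustration: the toy_* structures are hypothetical stand-ins for
 struct timer_list and struct foo_timer_data, and the counter is a
 plain int here instead of a per-CPU pointer.

```c
#include <stddef.h>

struct toy_timer {
	unsigned long expires;
};

struct toy_timer_data {
	const char *name;
	int counter;
	struct toy_timer timer;	/* embedded member handed to the callback */
};

/* Step back from a member pointer to the start of the enclosing struct. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* What a timer callback does: recover the parent from the timer argument. */
static struct toy_timer_data *toy_from_timer(struct toy_timer *t)
{
	return toy_container_of(t, struct toy_timer_data, timer);
}
```

 The probe-side typecast performs the same subtraction, using the member
 offset taken from BTF instead of from offsetof().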
---
 samples/trace_events/trace-events-sample.c         |   40 +++++++++++++++-
 samples/trace_events/trace-events-sample.h         |   34 ++++++++++++-
 .../ftrace/test.d/dynevent/btf_probe_event.tc      |   51 ++++++++++++++++++++
 .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |   11 ++++
 .../ftrace/test.d/kprobe/kprobe_syntax_errors.tc   |   11 ++++
 .../ftrace/test.d/kprobe/uprobe_syntax_errors.tc   |    5 ++
 6 files changed, 147 insertions(+), 5 deletions(-)
 create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/btf_probe_event.tc

diff --git a/samples/trace_events/trace-events-sample.c b/samples/trace_events/trace-events-sample.c
index 0b7a6efdb247..ca5d98c360cb 100644
--- a/samples/trace_events/trace-events-sample.c
+++ b/samples/trace_events/trace-events-sample.c
@@ -94,6 +94,20 @@ static int simple_thread_fn(void *arg)
 static DEFINE_MUTEX(thread_mutex);
 static int simple_thread_cnt;
 
+static struct foo_timer_data *foo_timer_data;
+
+static void sample_timer_cb(struct timer_list *t)
+{
+	struct foo_timer_data *data = container_of(t, struct foo_timer_data, timer);
+
+	get_cpu();
+	trace_foo_timer_fn(data);
+	(*this_cpu_ptr(data->counter))++;
+	put_cpu();
+
+	mod_timer(t, jiffies + HZ);
+}
+
 int foo_bar_reg(void)
 {
 	mutex_lock(&thread_mutex);
@@ -132,9 +146,27 @@ void foo_bar_unreg(void)
 
 static int __init trace_event_init(void)
 {
+	foo_timer_data = kzalloc_obj(*foo_timer_data, GFP_KERNEL);
+	if (!foo_timer_data)
+		return -ENOMEM;
+
+	foo_timer_data->name = "sample_timer_counter";
+	foo_timer_data->counter = alloc_percpu(int);
+	if (!foo_timer_data->counter) {
+		kfree(foo_timer_data);
+		return -ENOMEM;
+	}
+
+	timer_setup(&foo_timer_data->timer, sample_timer_cb, 0);
+	mod_timer(&foo_timer_data->timer, jiffies + HZ);
+
 	simple_tsk = kthread_run(simple_thread, NULL, "event-sample");
-	if (IS_ERR(simple_tsk))
-		return -1;
+	if (IS_ERR(simple_tsk)) {
+		timer_shutdown_sync(&foo_timer_data->timer);
+		free_percpu(foo_timer_data->counter);
+		kfree(foo_timer_data);
+		return PTR_ERR(simple_tsk);
+	}
 
 	return 0;
 }
@@ -147,6 +179,10 @@ static void __exit trace_event_exit(void)
 		kthread_stop(simple_tsk_fn);
 	simple_tsk_fn = NULL;
 	mutex_unlock(&thread_mutex);
+
+	timer_shutdown_sync(&foo_timer_data->timer);
+	free_percpu(foo_timer_data->counter);
+	kfree(foo_timer_data);
 }
 
 module_init(trace_event_init);
diff --git a/samples/trace_events/trace-events-sample.h b/samples/trace_events/trace-events-sample.h
index 1a05fc153353..816848a456a2 100644
--- a/samples/trace_events/trace-events-sample.h
+++ b/samples/trace_events/trace-events-sample.h
@@ -247,12 +247,14 @@
  */
 
 /*
- * It is OK to have helper functions in the file, but they need to be protected
- * from being defined more than once. Remember, this file gets included more
- * than once.
+ * It is OK to have helper functions and data structures in the file, but they
+ * need to be protected from being defined more than once. Remember, this file
+ * gets included more than once.
  */
 #ifndef __TRACE_EVENT_SAMPLE_HELPER_FUNCTIONS
 #define __TRACE_EVENT_SAMPLE_HELPER_FUNCTIONS
+#include <linux/timer.h>
+
 static inline int __length_of(const int *list)
 {
 	int i;
@@ -270,6 +272,13 @@ enum {
 	TRACE_SAMPLE_BAR = 4,
 	TRACE_SAMPLE_ZOO = 8,
 };
+
+struct foo_timer_data {
+	const char		*name;
+	struct timer_list	timer;
+	int __percpu		*counter;
+};
+
 #endif
 
 /*
@@ -595,6 +604,25 @@ TRACE_EVENT(foo_rel_loc,
 		  __get_rel_bitmask(bitmask),
 		  __get_rel_cpumask(cpumask))
 );
+
+TRACE_EVENT(foo_timer_fn,
+
+	TP_PROTO(struct foo_timer_data *data),
+
+	TP_ARGS(data),
+
+	TP_STRUCT__entry(
+		__string(	name,			data->name	)
+		__field(	int,			count		)
+	),
+
+	TP_fast_assign(
+		__assign_str(name);
+		__entry->count	= *this_cpu_ptr(data->counter);
+	),
+
+	TP_printk("name=%s count=%d", __get_str(name), __entry->count)
+);
 #endif
 
 /***** NOTICE! The #if protection ends here. *****/
diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/btf_probe_event.tc b/tools/testing/selftests/ftrace/test.d/dynevent/btf_probe_event.tc
new file mode 100644
index 000000000000..96791e120b7d
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/dynevent/btf_probe_event.tc
@@ -0,0 +1,51 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: BTF event with typecast and percpu access
+# requires: dynamic_events "this_cpu_read(<fetcharg>)":README "[(structname[,field])]<argname>[->field[->field|.field...]]":README
+
+# Check if the sample module is loaded
+if ! lsmod | grep -q trace_events_sample; then
+  modprobe trace-events-sample || exit_unsupported
+fi
+
+echo 0 > events/enable
+echo > dynamic_events
+
+# sample_timer_cb(struct timer_list *t) is called once per second.
+# Check the (STRUCT,FIELD)VAR typecast and this_cpu_read() access:
+# (foo_timer_data,timer)t converts t to struct foo_timer_data * using container_of.
+# The ->counter field of that structure is a per-cpu pointer to int.
+# this_cpu_read(...->counter) should give this CPU's value of the counter.
+
+echo 'f:mysample/myevent sample_timer_cb name=(foo_timer_data,timer)t->name:string count=this_cpu_read((foo_timer_data,timer)t->counter)' >> dynamic_events
+
+echo 1 > events/mysample/myevent/enable
+echo 1 > events/sample-trace/foo_timer_fn/enable
+
+sleep 2
+
+echo 0 > events/mysample/myevent/enable
+echo 0 > events/sample-trace/foo_timer_fn/enable
+
+# Compare the values.
+MATCH=0
+while read line; do
+  if echo $line | grep -q "foo_timer_fn:"; then
+    NAME=`echo $line | sed 's/.*name=\([^ ]*\) .*/\1/'`
+    COUNT=`echo $line | sed 's/.*count=\([^ ]*\).*/\1/'`
+    if grep -q "myevent:.*name=\"${NAME}\" count=$COUNT" trace; then
+       MATCH=$((MATCH+1))
+    fi
+  fi
+done < trace
+
+if [ $MATCH -eq 0 ]; then
+  echo "No matching events found"
+  exit_fail
+fi
+
+# Clean up
+echo 0 > events/mysample/myevent/enable
+echo 0 > events/sample-trace/foo_timer_fn/enable
+echo > dynamic_events
+clear_trace
diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc
index fee479295e2f..e111d426a984 100644
--- a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc
+++ b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc
@@ -112,6 +112,17 @@ check_error 'f vfs_read%return $retval->^foo'	# NO_PTR_STRCT
 check_error 'f vfs_read file->^foo'		# NO_BTF_FIELD
 check_error 'f vfs_read file^-.foo'		# BAD_HYPHEN
 check_error 'f vfs_read ^file:string'		# BAD_TYPE4STR
+if grep -qF "[(structname" README ; then
+check_error 'f vfs_read arg1=(task_struct)file^'		# TYPECAST_REQ_FIELD
+check_error 'f vfs_read arg1=(a)((b)((c)(^(d)file->d)->c)->b)->a'	# TOO_MANY_NESTED
+check_error 'f vfs_read arg1=(task_struct,^in_execve)file->comm'	# TYPECAST_NOT_ALIGNED
+check_error 'f vfs_read arg1=(task_struct,^foo_bar)file->pid'	# NO_BTF_FIELD
+check_error 'f vfs_read arg1=(^task_struct1234)file->pid'	# NO_PTR_STRCT
+check_error 'f vfs_read arg1=(task_struct,se^->group_node)file->comm'	# TYPECAST_BAD_ARROW
+check_error 'f vfs_read arg1=(task_struct,^->pid)file->comm'	# NO_BTF_FIELD
+check_error 'f vfs_read arg1=(task_struct,^.pid)file->comm'	# NO_BTF_FIELD
+check_error 'f vfs_read arg1=(task_struct,^.)file->comm'	# NO_BTF_FIELD
+fi
 fi
 
 else
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc
index 8f1c58f0c239..626adeb2e840 100644
--- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc
@@ -115,6 +115,17 @@ check_error 'p vfs_read+20 ^$arg*'		# NOFENTRY_ARGS
 check_error 'p vfs_read ^hoge'			# NO_BTFARG
 check_error 'p kfree ^$arg10'			# NO_BTFARG (exceed the number of parameters)
 check_error 'r kfree ^$retval'			# NO_RETVAL
+if grep -qF "[(structname" README ; then
+check_error 'p vfs_read arg1=(task_struct)file^'		# TYPECAST_REQ_FIELD
+check_error 'p vfs_read arg1=(a)((b)((c)(^(d)file->d)->c)->b)->a'	# TOO_MANY_NESTED
+check_error 'p vfs_read arg1=(task_struct,^in_execve)file->comm'	# TYPECAST_NOT_ALIGNED
+check_error 'p vfs_read arg1=(task_struct,^foo_bar)file->pid'	# NO_BTF_FIELD
+check_error 'p vfs_read arg1=(^task_struct1234)file->pid'		# NO_PTR_STRCT
+check_error 'p vfs_read arg1=(task_struct,se^->group_node)file->comm'	# TYPECAST_BAD_ARROW
+check_error 'p vfs_read arg1=(task_struct,^->pid)file->comm'	# NO_BTF_FIELD
+check_error 'p vfs_read arg1=(task_struct,^.pid)file->comm'	# NO_BTF_FIELD
+check_error 'p vfs_read arg1=(task_struct,^.)file->comm'	# NO_BTF_FIELD
+fi
 else
 check_error 'p vfs_read ^$arg*'			# NOSUP_BTFARG
 fi
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/uprobe_syntax_errors.tc b/tools/testing/selftests/ftrace/test.d/kprobe/uprobe_syntax_errors.tc
index c817158b99db..e12dc967ec76 100644
--- a/tools/testing/selftests/ftrace/test.d/kprobe/uprobe_syntax_errors.tc
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/uprobe_syntax_errors.tc
@@ -28,4 +28,9 @@ if grep -q ".*symstr.*" README; then
 check_error 'p /bin/sh:10 $stack0:^symstr'	# BAD_TYPE
 fi
 
+# $current is not supported by uprobe
+if grep -q "\$current.*" README; then
+check_error 'p /bin/sh:10 ^$current:u8'	# BAD_VAR
+fi
+
 exit 0

