Linux Documentation
 help / color / mirror / Atom feed
* Re: [PATCH v8 01/46] KVM: guest_memfd: Introduce per-gmem attributes, use to guard user mappings
From: Ackerley Tng @ 2026-06-24  0:09 UTC (permalink / raw)
  To: Sean Christopherson, Binbin Wu
  Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
	jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen,
	Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt,
	Kiryl Shutsemau, Baoquan He, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <ajnjTJdQKD1Kz3tf@google.com>

Sean Christopherson <seanjc@google.com> writes:

> On Mon, Jun 22, 2026, Binbin Wu wrote:
>> On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
>>
>> [...]
>>
>> >
>> > +static u64 kvm_gmem_get_attributes(struct inode *inode, pgoff_t index)
>> > +{
>> > +	struct maple_tree *mt = &GMEM_I(inode)->attributes;
>> > +	void *entry = mtree_load(mt, index);
>> > +
>> > +	return WARN_ON_ONCE(!entry) ? 0 : xa_to_value(entry);
>>
>> If the entry is unexpectedly missing, returning 0 means the attribute would
>> be treated as shared.  And then in kvm_gmem_fault_user_mapping(), it would
>> allow the userspace to fault in the folio.
>>
>> Should gmem deny such edge case?
>
> After several bugs this year where a WARN_ON_ONCE() fired, but was entirely
> insufficient to prevent true badness, I'm definitely senstive to making the "bad"
> behavior as harmless as possible.
>

I guess both are indeed awkward.

> However, in this case I think we're just hosed.  If KVM treats the memory as
> private, KVM will incorrectly do prepare(), incorrectly allow populate(), and
> will caused missed invalidations (though I suppose __kvm_gmem_set_attributes()
> "only" lies to userspace in that case).
>
> That said, assuming SHARED is definitely odd for cases where guest_memfd *can't*
> hold shared memory.  Ditto for assuming PRIVATE.  What if we instead fall back to
> the "init" state, e.g.?
>
> static u64 kvm_gmem_get_attributes(struct inode *inode, pgoff_t index)
> {
> 	struct maple_tree *mt = &GMEM_I(inode)->attributes;
> 	void *entry = mtree_load(mt, index);
>
> 	if (WARN_ON_ONCE(!entry)) {
> 		bool shared = GMEM_I(inode)->flags & GUEST_MEMFD_FLAG_INIT_SHARED;
>
> 		return shared ? 0 : KVM_MEMORY_ATTRIBUTE_PRIVATE;

I was wondering if we should not only return the init state but also set
the init state, but that would involve performing a conversion to the
init state... Too complicated for an edge case.

> 	}
>
> 	return xa_to_value(entry);
> }

Thanks Binbin and Sean!

^ permalink raw reply

* RE: [PATCH 3/3] hwmon: (pmbus/fd5121): Add support FD5121, FD5123 and FD5125
From: Selvamani Rajagopal @ 2026-06-23 23:12 UTC (permalink / raw)
  To: Guenter Roeck, Jonathan Corbet, Shuah Khan, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: linux-hwmon@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, devicetree@vger.kernel.org
In-Reply-To: <320c44c9-924f-4b7e-a46a-37a72fa7267f@roeck-us.net>


> -----Original Message-----
> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
> Subject: Re: [PATCH 3/3] hwmon: (pmbus/fd5121): Add support FD5121, FD5123 and
> FD5125
> 
> 
> > + &sensor_dev_attr_vout_transition_rate.dev_attr.attr,
> > + NULL
> 
> This is a complete no-go. We do not explose raw register data as sysfs
> attributes. You may expose essential register data as debugfs files,
> but only those deemed necessary. The above is just "let's blindly
> expose everything". Most of the above should be programmed in manufacturing
> and not be touched subsequently, much less as writeable attributes.
> Writing bad/unexpected values into many of those attributes can turn
> a board into a brick. It is bad enough that/if this is even possible,
> but exposing it as sysfs attribute would be a terrible idea.
> 
> I am not going to review this driver any further at this point.
> 
> Guenter
> 

Thanks for your comment. Understood and agree on comment on exposing everything without enforcing what is being written/read.
We will discuss internally to see how to go about it. We need some custom registers for customers. We will identify those and 
expose under debugfs, if needed.


> >


^ permalink raw reply

* [PATCH v2 2/2] cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in cpuset_update_tasks_nodemask()
From: Waiman Long @ 2026-06-23 23:04 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Ridong Chen,
	Jonathan Corbet, Shuah Khan
  Cc: cgroups, linux-kernel, linux-doc, linux-kselftest, Waiman Long
In-Reply-To: <20260623230413.1984188-1-longman@redhat.com>

As reported by sashiko [1], cpuset_update_tasks_nodemask() will do
mpol_rebind_mm() and possibly cpuset_migrate_mm() for all threads of
a multithreaded process. Since commit 3df9ca0a2b8b ("cpuset: migrate
memory only for threadgroup leaders"), cpuset_attach() had been updated
to rebind and migrate memory only for threadgroup leaders to mark the
group leader as the owner of the mm_struct.

To be consistent and avoid unnecessary performance overhead for heavily
multithreaded processes, follow the cpuset_attach() example and perform
memory rebind and migration only for threadgroup leaders.

Also add a paragraph in cgroup-v2.rst under cpuset.mems that the
threadgroup leader is the memory owner of that threadgroup. Therefore
the non-leading threads shouldn't be in other cgroups whose "cpuset.mems"
doesn't fully overlap that of the group leader.

[1] https://sashiko.dev/#/patchset/20260621032816.1806773-1-longman%40redhat.com

Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Ridong Chen <ridong.chen@linux.dev>
---
 Documentation/admin-guide/cgroup-v2.rst | 7 +++++++
 kernel/cgroup/cpuset.c                  | 4 ++++
 2 files changed, 11 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 993446ab66d0..f9c353174a7e 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2527,6 +2527,13 @@ Cpuset Interface Files
 	a need to change "cpuset.mems" with active tasks, it shouldn't
 	be done frequently.
 
+	For a multithreaded process, the threadgroup leader is
+	considered the owner of the group's memory. Memory policy
+	rebinding and migration will only happen with respect to the
+	threadgroup leader. To avoid unexpected result, non-leading
+	threads shouldn't be put into another cgroup whose "cpuset.mems"
+	doesn't fully overlap that of the threadgroup leader.
+
   cpuset.mems.effective
 	A read-only multiple values file which exists on all
 	cpuset-enabled cgroups.
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 044ddbf66f8e..055ae54a040a 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2673,6 +2673,10 @@ void cpuset_update_tasks_nodemask(struct cpuset *cs)
 
 		cpuset_change_task_nodemask(task, &newmems);
 
+		/* Rebind and migrate mm only for thread group leader */
+		if (!thread_group_leader(task))
+			continue;
+
 		mm = get_task_mm(task);
 		if (!mm)
 			continue;
-- 
2.54.0


^ permalink raw reply related

* [PATCH v2 1/2] cgroup/cpuset: Avoid unnecessary cpus & mems update in cpuset_hotplug_update_tasks()
From: Waiman Long @ 2026-06-23 23:04 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Ridong Chen,
	Jonathan Corbet, Shuah Khan
  Cc: cgroups, linux-kernel, linux-doc, linux-kselftest, Waiman Long
In-Reply-To: <20260623230413.1984188-1-longman@redhat.com>

As reported by sashiko [1], cpuset_hotplug_update_tasks() may perform
unnecessary task iteration and updating of tasks' CPU and node masks
when mems_allowed and/or cpus_allowed are not set in cpuset v2. It is
due to the fact that the temporary new_cpus and new_mems masks do not
inherit parent's effective_cpus/mems when they are empty which is the
expected behavior for cpuset v2 since commit 4ec22e9c5a90 ("cpuset:
Enable cpuset controller in default hierarchy").

Fix that and avoid unnecessay work by enhancing
compute_effective_cpumask() to add the empty cpumask check
and inheriting the parent's versions if empty when in v2. A new
compute_effective_nodemask() helper is also added to perform similar
function for new effective_mems.

Add new test_cpuset_prs.sh test cases to confirm that effective_cpus
will inherit the parent's version if cpuset.cpus is empty.

[1] https://sashiko.dev/#/patchset/20260621032816.1806773-1-longman%40redhat.com

Suggested-by: Ridong Chen <ridong.chen@linux.dev>
Fixes: 4ec22e9c5a90 ("cpuset: Enable cpuset controller in default hierarchy")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset.c                        | 45 +++++++++++--------
 .../selftests/cgroup/test_cpuset_prs.sh       | 11 ++++-
 2 files changed, 35 insertions(+), 21 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index aff86acea701..044ddbf66f8e 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1094,12 +1094,35 @@ void cpuset_update_tasks_cpumask(struct cpuset *cs, struct cpumask *new_cpus)
  * @cs: the cpuset the need to recompute the new effective_cpus mask
  * @parent: the parent cpuset
  *
+ * For v2, the parent's effective_cpus is inherited if cpumask empty.
  * The result is valid only if the given cpuset isn't a partition root.
  */
 static void compute_effective_cpumask(struct cpumask *new_cpus,
 				      struct cpuset *cs, struct cpuset *parent)
 {
-	cpumask_and(new_cpus, cs->cpus_allowed, parent->effective_cpus);
+	bool has_cpus;
+
+	has_cpus = cpumask_and(new_cpus, cs->cpus_allowed, parent->effective_cpus);
+	if (!has_cpus && is_in_v2_mode())
+		cpumask_copy(new_cpus, parent->effective_cpus);
+}
+
+/**
+ * compute_effective_nodemask - Compute the effective nodemask of the cpuset
+ * @new_cpus: the temp variable for the new effective_mems mask
+ * @cs: the cpuset the need to recompute the new effective_mems mask
+ * @parent: the parent cpuset
+ *
+ * For v2, the parent's effective_mems is inherited if nodemask empty.
+ */
+static void compute_effective_nodemask(nodemask_t *new_mems,
+				       struct cpuset *cs, struct cpuset *parent)
+{
+	bool has_mems;
+
+	has_mems = nodes_and(*new_mems, cs->mems_allowed, parent->effective_mems);
+	if (!has_mems && is_in_v2_mode())
+		nodes_copy(*new_mems, parent->effective_mems);
 }
 
 /*
@@ -2148,15 +2171,6 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
 			goto update_parent_effective;
 		}
 
-		/*
-		 * If it becomes empty, inherit the effective mask of the
-		 * parent, which is guaranteed to have some CPUs unless
-		 * it is a partition root that has explicitly distributed
-		 * out all its CPUs.
-		 */
-		if (is_in_v2_mode() && !remote && cpumask_empty(tmp->new_cpus))
-			cpumask_copy(tmp->new_cpus, parent->effective_cpus);
-
 		/*
 		 * Skip the whole subtree if
 		 * 1) the cpumask remains the same,
@@ -2704,14 +2718,7 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
 	cpuset_for_each_descendant_pre(cp, pos_css, cs) {
 		struct cpuset *parent = parent_cs(cp);
 
-		bool has_mems = nodes_and(*new_mems, cp->mems_allowed, parent->effective_mems);
-
-		/*
-		 * If it becomes empty, inherit the effective mask of the
-		 * parent, which is guaranteed to have some MEMs.
-		 */
-		if (is_in_v2_mode() && !has_mems)
-			*new_mems = parent->effective_mems;
+		compute_effective_nodemask(new_mems, cp, parent);
 
 		/* Skip the whole subtree if the nodemask remains the same. */
 		if (nodes_equal(*new_mems, cp->effective_mems)) {
@@ -3923,7 +3930,7 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
 
 	parent = parent_cs(cs);
 	compute_effective_cpumask(&new_cpus, cs, parent);
-	nodes_and(new_mems, cs->mems_allowed, parent->effective_mems);
+	compute_effective_nodemask(&new_mems, cs, parent);
 
 	if (!tmp || !cs->partition_root_state)
 		goto update_tasks;
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index 0d41aa0d343d..ca9bc38fdb95 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -495,13 +495,20 @@ REMOTE_TEST_MATRIX=(
 	# Narrowing cpuset.cpus to previously sibling-excluded CPUs should
 	# not return CPUs that were never actually owned.
 	"  C1-4:P1   .   C1-2:P1  C1-3:P2  .       .  \
-	      .      .     .         C3    .       .     p1:4|c11:1-2|c12:3 \
+	      .      .     .       C3      .       .     p1:4|c11:1-2|c12:3 \
 							 p1:P1|c11:P1|c12:P2 3"
 	# Expanding cpuset.cpus to include a previously sibling-excluded CPU
 	# after the sibling has become a member should correctly request it.
 	"  C1-4:P1   .   C1-2:P1  C1-3:P2  .       .  \
-	      .      .      P0     C2-3    .       .     p1:1,4|c11:1|c12:2-3 \
+	      .      .     P0      C2-3    .       .     p1:1,4|c11:1|c12:2-3 \
 							 p1:P1|c11:P0|c12:P2 2-3"
+	# Cpusets with empty cpuset.cpus should inherit parent's effective_cpus
+	"  C1-4:P1 C5-6   C1-2     .       C5      .  \
+	      .      P1    P1      .       .       .     p1:3-4|p2:5-6|c11:1-2|c12:3-4|c21:5|c22:5-6 \
+							 p1:P1|p2:P1|c11:P1"
+	"  C1-4:P1 C5-6   C1-2     .       C5      .  \
+	      .      P1    P1      .      O5=0     .     p1:3-4|p2:6|c11:1-2|c12:3-4|c21:6|c22:6 \
+							 p1:P1|p2:P1|c11:P1"
 )
 
 #
-- 
2.54.0


^ permalink raw reply related

* [PATCH v2 0/2] cgroup/cpuset: Miscellaneous fixes and cleanups
From: Waiman Long @ 2026-06-23 23:04 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Ridong Chen,
	Jonathan Corbet, Shuah Khan
  Cc: cgroups, linux-kernel, linux-doc, linux-kselftest, Waiman Long

 v2:
  - Update patch 1 as suggested by Ridong Chen and add new test cases.
  - Minor update to patch 2 code and comment log.

Patch 1 updates compute_effective_cpumask() and adds new
compute_effective_nodemask() helper to make sure that effective_cpus
and effective_mems will inherit parent's versions for v2 if
cpuset.cpus/cpuset.mems is empty.

Patch 2 makes cpuset_update_tasks_nodemask() to perform memory rebind
and migration only for thread group leader like cpuset_attach().

Waiman Long (2):
  cgroup/cpuset: Avoid unnecessary cpus & mems update in
    cpuset_hotplug_update_tasks()
  cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in
    cpuset_update_tasks_nodemask()

 Documentation/admin-guide/cgroup-v2.rst       |  7 +++
 kernel/cgroup/cpuset.c                        | 49 ++++++++++++-------
 .../selftests/cgroup/test_cpuset_prs.sh       | 11 ++++-
 3 files changed, 46 insertions(+), 21 deletions(-)

-- 
2.54.0


^ permalink raw reply

* Re: [PATCH 3/3] hwmon: (pmbus/fd5121): Add support FD5121, FD5123 and FD5125
From: Guenter Roeck @ 2026-06-23 22:07 UTC (permalink / raw)
  To: Selvamani.Rajagopal, Jonathan Corbet, Shuah Khan, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley
  Cc: linux-hwmon, linux-doc, linux-kernel, devicetree
In-Reply-To: <20260622-support-fd5121-from-onsemi-v1-3-b31767689c65@onsemi.com>

On 6/22/26 22:55, Selvamani Rajagopal via B4 Relay wrote:
> From: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
> 
> FD5121 is a dual-rail, multi-phase, digital controller that offers
> full telemtry options including input/output voltage, current as
> well as fault handling and identifications.
> 
> These controllers are compliant with PMBus specification.
> 
> Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
> ---
>   MAINTAINERS                  |    8 +
>   drivers/hwmon/pmbus/Kconfig  |    9 +
>   drivers/hwmon/pmbus/Makefile |    1 +
>   drivers/hwmon/pmbus/fd5121.c | 1004 ++++++++++++++++++++++++++++++++++++++++++
>   4 files changed, 1022 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d95d3ef77773..c0664c33324a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -20135,6 +20135,14 @@ L:	linux-mips@vger.kernel.org
>   S:	Maintained
>   F:	arch/mips/boot/dts/ralink/omega2p.dts
>   
> +ONSEMI HARDWARE MONITOR DRIVER
> +M:	Selva Rajagopal <selvamani.rajagopal@onsemi.com>
> +L:	linux-hwmon@vger.kernel.org
> +S:	Supported
> +W:	https://www.onsemi.com
> +F:	Documentation/devicetree/bindings/hwmon/pmbus/onnn,fd5121.yaml
> +F:	drivers/hwmon/pmbus/fd5121.c
> +
>   ONSEMI ETHERNET PHY DRIVERS
>   M:	Piergiorgio Beruto <piergiorgio.beruto@gmail.com>
>   L:	netdev@vger.kernel.org
> diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig
> index c8cda160b5f8..3a06ed83539e 100644
> --- a/drivers/hwmon/pmbus/Kconfig
> +++ b/drivers/hwmon/pmbus/Kconfig
> @@ -179,6 +179,15 @@ config SENSORS_E50SN12051
>   	  This driver can also be built as a module. If so, the module will
>   	  be called e50sn12051.
>   
> +config SENSORS_FD5121
> +	tristate "FD5121/FD5123/FD5125 controllers from onsemi"
> +	help
> +	  If you say yes here, you get support for onsemi
> +	  controllers FD5121, FD5123, FD5125.
> +
> +	  This driver can also be built as a module. If so, the module will
> +	  be called fd5121.
> +
>   config SENSORS_INA233
>   	tristate "Texas Instruments INA233 and compatibles"
>   	help
> diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile
> index ffc05f493213..70f4afb41fe0 100644
> --- a/drivers/hwmon/pmbus/Makefile
> +++ b/drivers/hwmon/pmbus/Makefile
> @@ -13,6 +13,7 @@ obj-$(CONFIG_SENSORS_APS_379)	+= aps-379.o
>   obj-$(CONFIG_SENSORS_BEL_PFE)	+= bel-pfe.o
>   obj-$(CONFIG_SENSORS_BPA_RS600)	+= bpa-rs600.o
>   obj-$(CONFIG_SENSORS_DELTA_AHE50DC_FAN) += delta-ahe50dc-fan.o
> +obj-$(CONFIG_SENSORS_FD5121)	+= fd5121.o
>   obj-$(CONFIG_SENSORS_FSP_3Y)	+= fsp-3y.o
>   obj-$(CONFIG_SENSORS_HAC300S)	+= hac300s.o
>   obj-$(CONFIG_SENSORS_IBM_CFFPS)	+= ibm-cffps.o
> diff --git a/drivers/hwmon/pmbus/fd5121.c b/drivers/hwmon/pmbus/fd5121.c
> new file mode 100644
> index 000000000000..e68c6d6cabbd
> --- /dev/null
> +++ b/drivers/hwmon/pmbus/fd5121.c
> @@ -0,0 +1,1004 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Copyright 2026 Semiconductor Components Industries, LLC ("onsemi").
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/err.h>
> +#include <linux/slab.h>
> +#include <linux/delay.h>
> +#include <linux/i2c.h>
> +#include <linux/hwmon.h>
> +#include <linux/hwmon-sysfs.h>
> +#include <linux/unaligned.h>
> +
> +#include "pmbus.h"
> +
> +enum chips { chip_fd5121, chip_fd5123, chip_fd5125 };
> +
> +#define CTLR_ID_UNKNOWN				0
> +#define CTLR_ID_FD5121				0xFD5121
> +#define CTLR_ID_FD5123				0xFD5123
> +#define CTLR_ID_FD5125				0xFD5125
> +
> +#define FD5121_NUM_PAGES			2
> +
> +/* Custom PMBUS commands */
> +#define PMBUS_REG_VOUT_MIN			0x2B
> +#define PMBUS_REG_POWER_MODE			0x34
> +#define PMBUS_REG_VIN_ON			0x35
> +#define PMBUS_REG_VIN_OFF			0x36
> +#define PMBUS_REG_VIN_UV_FAULT_RESPONSE		0x5A
> +#define PMBUS_REG_IIN_OC_FAULT_RESPONSE		0x5C
> +#define PMBUS_REG_TON_DELAY			0x60
> +#define PMBUS_REG_POUT_OP_FAULT_RESPONSE	0x69
> +#define PMBUS_REG_READ_VAUX			0x85
> +
> +#define PMBUS_REG_IKNEE_SET			0x2D
> +#define PMBUS_REG_PIN_COUNTER			0x2E
> +#define PMBUS_REG_VMIN_AWARE			0x2F
> +#define PMBUS_REG_VAUX_UV_FAULT_LIMIT		0x6C
> +#define PMBUS_REG_VAUX_OV_FAULT_LIMIT		0x6D
> +#define PMBUS_REG_VAUX_UV_FAULT_RESPONSE	0x6E
> +#define PMBUS_REG_VAUX_OV_FAULT_RESPONSE	0x6F
> +#define PMBUS_REG_VAUX_UV_WARNING		0x75
> +#define PMBUS_REG_VAUX_OV_WARNING		0x76
> +#define PMBUS_REG_MFR_FREE_USER_CONFIG_TABLES	0xCF
> +#define PMBUS_REG_MFR_ADDRESS_TABLE		0xD0
> +#define PMBUS_REG_MFR_STATUS_ONSEMI		0xD1
> +#define PMBUS_REG_MFR_UNLOCK			0xD2
> +#define PMBUS_REG_MFR_FAULTY_SPS		0xD3
> +#define PMBUS_REG_TLVR_FAULTS			0xD4
> +#define PMBUS_REG_MFR_USER_STORE_CONFIG_TAB	0xD5
> +#define PMBUS_REG_MFR_USER_CONFIG_INDEX		0xD6
> +#define PMBUS_REG_MFR_PWM_DISCONNECTION		0xD7
> +#define PMBUS_REG_MFR_VR_DISCONNECTION		0xD8
> +#define PMBUS_REG_MFR_TON_SLEW			0xD9
> +#define PMBUS_REG_MFR_TOFF_SLEW			0xDA
> +#define PMBUS_REG_MFR_RAIL_NAME			0xDB
> +#define PMBUS_REG_MFR_VOUT_DROOP		0xDC
> +#define PMBUS_REG_MFR_USER_RESTORE_CONFIG_TAB	0xDD
> +#define PMBUS_REG_MFR_SVR_GO			0xDE
> +#define PMBUS_REG_MFR_SET_PWD			0xDF
> +#define PMBUS_REG_MFR_CONFIG_ACTIVATE		0xE0
> +#define PMBUS_REG_MFR_CONFIG_RECOVER		0xE1
> +#define PMBUS_REG_MFR_OTP_DUMP			0xE2
> +#define PMBUS_REG_MFR_BBR_EN			0xE3
> +#define PMBUS_REG_MFR_DPM_MIN			0xE4
> +#define PMBUS_REG_MFR_VBOOT			0xE5
> +#define PMBUS_REG_MFR_PRECLAMP_OFFSET		0xE6
> +#define PMBUS_REG_MFR_TLVR_DIAGN		0xE7
> +#define PMBUS_REG_MFR_READ_VSYS			0xE8
> +#define PMBUS_REG_MFR_SPECIFIC_E9		0xE9
> +#define PMBUS_REG_MFR_SPECIFIC_EA		0xEA
> +#define PMBUS_REG_MFR_SS_CBC			0xEB
> +#define PMBUS_REG_MFR_AMD_STATUS		0xEC
> +#define PMBUS_REG_MFR_CHECKSUM			0xEE
> +#define PMBUS_REG_CSE_INDEX			0xF0
> +#define PMBUS_REG_COUT_MEASURE			0xF1
> +#define PMBUS_REG_VR_COUT			0xF2
> +#define PMBUS_REG_BBR_RAM			0xF3
> +#define PMBUS_REG_BBR_OTP			0xF4
> +#define PMBUS_REG_READ_PSYS			0xF5
> +#define PMBUS_REG_POSTCLAMP_OFFSET		0xF6
> +#define PMBUS_REG_PGOOD_DELAY			0xF7
> +#define PMBUS_REG_MFR_SPECIFIC_F8		0xF8
> +#define PMBUS_REG_MFR_SPECIFIC_F9		0xF9
> +#define PMBUS_REG_MFR_PWD_PROGRAM_RAM		0xFA
> +#define PMBUS_REG_MFR_PWD_PROGRAM_I2C		0xFB
> +#define PMBUS_REG_MFR_PWD_ENABLE_OTP_STORE	0xFC
> +
> +/* List of recognized commands */
> +static const u8 cc_list[] = {
> +	PMBUS_PAGE,
> +	PMBUS_OPERATION,
> +	PMBUS_ON_OFF_CONFIG,
> +	PMBUS_CLEAR_FAULTS,
> +	PMBUS_WRITE_PROTECT,
> +	PMBUS_CAPABILITY,
> +	PMBUS_VOUT_MODE,
> +	PMBUS_VOUT_COMMAND,
> +	PMBUS_VOUT_MAX,
> +	PMBUS_VOUT_MARGIN_HIGH,
> +	PMBUS_VOUT_MARGIN_LOW,
> +	PMBUS_VOUT_TRANSITION_RATE,
> +	PMBUS_REG_VOUT_MIN,
> +	PMBUS_REG_IKNEE_SET,
> +	PMBUS_REG_PIN_COUNTER,
> +	PMBUS_REG_VMIN_AWARE,
> +	PMBUS_REG_POWER_MODE,
> +	PMBUS_REG_VIN_ON,
> +	PMBUS_REG_VIN_OFF,
> +	PMBUS_VOUT_OV_FAULT_LIMIT,
> +	PMBUS_VOUT_OV_FAULT_RESPONSE,
> +	PMBUS_VOUT_UV_FAULT_LIMIT,
> +	PMBUS_VOUT_UV_FAULT_RESPONSE,
> +	PMBUS_IOUT_OC_FAULT_LIMIT,
> +	PMBUS_IOUT_OC_FAULT_RESPONSE,
> +	PMBUS_IOUT_OC_WARN_LIMIT,
> +	PMBUS_OT_FAULT_LIMIT,
> +	PMBUS_OT_FAULT_RESPONSE,
> +	PMBUS_OT_WARN_LIMIT,
> +	PMBUS_VIN_OV_FAULT_LIMIT,
> +	PMBUS_VIN_OV_FAULT_RESPONSE,
> +	PMBUS_VIN_OV_WARN_LIMIT,
> +	PMBUS_VIN_UV_WARN_LIMIT,
> +	PMBUS_VIN_UV_FAULT_LIMIT,
> +	PMBUS_REG_VIN_UV_FAULT_RESPONSE,
> +	PMBUS_IIN_OC_FAULT_LIMIT,
> +	PMBUS_REG_IIN_OC_FAULT_RESPONSE,
> +	PMBUS_IIN_OC_WARN_LIMIT,
> +	PMBUS_REG_TON_DELAY,
> +	PMBUS_POUT_OP_FAULT_LIMIT,
> +	PMBUS_REG_POUT_OP_FAULT_RESPONSE,
> +	PMBUS_POUT_OP_WARN_LIMIT,
> +	PMBUS_PIN_OP_WARN_LIMIT,
> +	PMBUS_REG_VAUX_UV_FAULT_LIMIT,
> +	PMBUS_REG_VAUX_OV_FAULT_LIMIT,
> +	PMBUS_REG_VAUX_UV_FAULT_RESPONSE,
> +	PMBUS_REG_VAUX_OV_FAULT_RESPONSE,
> +	PMBUS_REG_VAUX_UV_WARNING,
> +	PMBUS_REG_VAUX_OV_WARNING,
> +	PMBUS_STATUS_BYTE,
> +	PMBUS_STATUS_WORD,
> +	PMBUS_STATUS_VOUT,
> +	PMBUS_STATUS_IOUT,
> +	PMBUS_STATUS_INPUT,
> +	PMBUS_STATUS_TEMPERATURE,
> +	PMBUS_STATUS_CML,
> +	PMBUS_STATUS_OTHER,
> +	PMBUS_STATUS_MFR_SPECIFIC,
> +	PMBUS_REG_READ_VAUX,
> +	PMBUS_READ_VIN,
> +	PMBUS_READ_IIN,
> +	PMBUS_READ_VOUT,
> +	PMBUS_READ_IOUT,
> +	PMBUS_READ_TEMPERATURE_1,
> +	PMBUS_READ_POUT,
> +	PMBUS_READ_PIN,
> +	PMBUS_REVISION,
> +	PMBUS_MFR_ID,
> +	PMBUS_MFR_MODEL,
> +	PMBUS_MFR_REVISION,
> +	PMBUS_IC_DEVICE_ID,
> +	PMBUS_REG_MFR_FREE_USER_CONFIG_TABLES,
> +	PMBUS_REG_MFR_ADDRESS_TABLE,
> +	PMBUS_REG_MFR_STATUS_ONSEMI,
> +	PMBUS_REG_MFR_UNLOCK,
> +	PMBUS_REG_MFR_FAULTY_SPS,
> +	PMBUS_REG_TLVR_FAULTS,
> +	PMBUS_REG_MFR_USER_STORE_CONFIG_TAB,
> +	PMBUS_REG_MFR_USER_CONFIG_INDEX,
> +	PMBUS_REG_MFR_PWM_DISCONNECTION,
> +	PMBUS_REG_MFR_VR_DISCONNECTION,
> +	PMBUS_REG_MFR_TON_SLEW,
> +	PMBUS_REG_MFR_TOFF_SLEW,
> +	PMBUS_REG_MFR_RAIL_NAME,
> +	PMBUS_REG_MFR_VOUT_DROOP,
> +	PMBUS_REG_MFR_USER_RESTORE_CONFIG_TAB,
> +	PMBUS_REG_MFR_SVR_GO,
> +	PMBUS_REG_MFR_SET_PWD,
> +	PMBUS_REG_MFR_CONFIG_ACTIVATE,
> +	PMBUS_REG_MFR_CONFIG_RECOVER,
> +	PMBUS_REG_MFR_OTP_DUMP,
> +	PMBUS_REG_MFR_BBR_EN,
> +	PMBUS_REG_MFR_DPM_MIN,
> +	PMBUS_REG_MFR_VBOOT,
> +	PMBUS_REG_MFR_PRECLAMP_OFFSET,
> +	PMBUS_REG_MFR_TLVR_DIAGN,
> +	PMBUS_REG_MFR_READ_VSYS,
> +	PMBUS_REG_MFR_SPECIFIC_E9,
> +	PMBUS_REG_MFR_SPECIFIC_EA,
> +	PMBUS_REG_MFR_SS_CBC,
> +	PMBUS_REG_MFR_AMD_STATUS,
> +	PMBUS_REG_MFR_CHECKSUM,
> +	PMBUS_REG_CSE_INDEX,
> +	PMBUS_REG_COUT_MEASURE,
> +	PMBUS_REG_VR_COUT,
> +	PMBUS_REG_BBR_RAM,
> +	PMBUS_REG_BBR_OTP,
> +	PMBUS_REG_READ_PSYS,
> +	PMBUS_REG_POSTCLAMP_OFFSET,
> +	PMBUS_REG_PGOOD_DELAY,
> +	PMBUS_REG_MFR_SPECIFIC_F8,
> +	PMBUS_REG_MFR_SPECIFIC_F9,
> +	PMBUS_REG_MFR_PWD_PROGRAM_RAM,
> +	PMBUS_REG_MFR_PWD_PROGRAM_I2C,
> +	PMBUS_REG_MFR_PWD_ENABLE_OTP_STORE,
> +};
> +
> +/* Following registers expect block read */
> +static const u8 blk_rd_cc[] = {
> +	PMBUS_SMBALERT_MASK,
> +	PMBUS_MFR_DATE,
> +	PMBUS_IC_DEVICE_REV,
> +};
> +
> +struct fd5121_data {
> +	struct attribute_group *groups[3];
> +	struct pmbus_driver_info info;
> +	struct device *dev;
> +	u32 id;
> +};
> +
> +static s32 fd5121_read_block_data(const struct i2c_client *client,
> +				  u8 cmd_code, u8 len, u8 *pbuf)
> +{
> +	s32 ret = 0;
> +
> +	if (!i2c_check_functionality(client->adapter,
> +				     I2C_FUNC_SMBUS_READ_BLOCK_DATA)) {
> +
> +		/* Payload length is in the first byte. */
> +		ret = i2c_smbus_read_i2c_block_data(client, cmd_code,
> +						    len, pbuf);
> +		if (ret < 0)
> +			return ret;
> +		ret = pbuf[0];
> +		if (ret > len)
> +			ret = len;
> +		for (int idx = 0; idx < ret; idx++)
> +			pbuf[idx] = pbuf[idx + 1];
> +		return ret;
> +	}
> +	ret = i2c_smbus_read_block_data(client, cmd_code, pbuf);
> +	if (ret < 0)
> +		return ret;
> +	return min_t(s32, ret, len);
> +}
> +
> +/* Command code that expects block read, not word read */
> +static bool fd5121_blk_rd_reg(u8 cmd_code)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(blk_rd_cc); i++) {
> +		if (cmd_code == blk_rd_cc[i])
> +			return true;
> +	}
> +	return false;
> +}
> +
> +static ssize_t fd5121_send_byte_store(struct device *dev,
> +				      struct device_attribute *da,
> +				      const char *buf, size_t count)
> +{
> +	struct i2c_client *client = to_i2c_client(dev->parent);
> +	u8 val = 0;
> +	int ret;
> +
> +	ret = kstrtou8(buf, 10, &val);
> +	if (ret < 0)
> +		return ret;
> +	ret = i2c_smbus_write_byte(client, val);
> +	if (ret < 0)
> +		return ret;
> +	return count;
> +}
> +
> +static int fd5121_config_activate(struct i2c_client *client)
> +{
> +	return i2c_smbus_write_byte_data(client,
> +					 PMBUS_REG_MFR_CONFIG_ACTIVATE, 0xAA);
> +}
> +
> +static ssize_t fd5121_byte_store(struct device *dev,
> +				 struct device_attribute *da,
> +				 const char *buf, size_t count)
> +{
> +	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
> +	struct i2c_client *client = to_i2c_client(dev->parent);
> +	u8 reg = attr->index;
> +	int ret = 0;
> +	u8 val = 0;
> +
> +	switch (reg) {
> +	case PMBUS_REG_MFR_CONFIG_ACTIVATE:
> +		ret = fd5121_config_activate(client);
> +		if (ret < 0)
> +			return ret;
> +		return count;
> +	default:
> +		ret = kstrtou8(buf, 10, &val);
> +		if (ret < 0)
> +			return ret;
> +		break;
> +	}
> +	if (reg == PMBUS_PAGE && ((val != 0 && val != 1 &&
> +	    val != GENMASK(7, 0))))
> +		return -EINVAL;
> +	ret = i2c_smbus_write_byte_data(client, reg, val);
> +	if (ret < 0)
> +		return ret;
> +	return count;
> +}
> +
> +static ssize_t fd5121_byte_show(struct device *dev,
> +				struct device_attribute *da, char *buf)
> +{
> +	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
> +	struct i2c_client *client = to_i2c_client(dev->parent);
> +	u8 reg = attr->index;
> +	s32 ret;
> +
> +	ret = i2c_smbus_read_byte_data(client, reg);
> +	if (ret < 0)
> +		return ret;
> +	return scnprintf(buf, PAGE_SIZE, "%d\n", ret & 0xFF);
> +}
> +
> +static ssize_t fd5121_word_store(struct device *dev,
> +				 struct device_attribute *da,
> +				 const char *buf, size_t count)
> +{
> +	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
> +	struct i2c_client *client = to_i2c_client(dev->parent);
> +	u8 reg = attr->index;
> +	s16 val = 0;
> +	int ret = 0;
> +
> +	switch (reg) {
> +	case PMBUS_REG_MFR_PWD_PROGRAM_RAM:
> +		val = 0xC93F;
> +		break;
> +	default:
> +		ret = kstrtos16(buf, 10, &val);
> +		if (ret)
> +			return ret;
> +		break;
> +	}
> +	ret = i2c_smbus_write_word_data(client, reg, val);
> +	if (ret < 0)
> +		return ret;
> +	return count;
> +}
> +
> +static ssize_t fd5121_word_show(struct device *dev,
> +				struct device_attribute *da, char *buf)
> +{
> +	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
> +	struct i2c_client *client = to_i2c_client(dev->parent);
> +	u8 data[I2C_SMBUS_BLOCK_MAX] = { 0 };
> +	u8 reg = attr->index;
> +	s32 ret = 0;
> +
> +	if (fd5121_blk_rd_reg(reg)) {
> +		ret = fd5121_read_block_data(client, reg, 2, data);
> +		if (ret >= 0)
> +			ret = get_unaligned_le16(data);
> +	} else
> +		ret = i2c_smbus_read_word_data(client, reg);
> +	if (ret < 0)
> +		return ret;
> +	return scnprintf(buf, PAGE_SIZE, "%d\n", ret & 0xFFFF);
> +}
> +
> +static s32 fd5121_write_block_data(const struct i2c_client *client,
> +				   u8 cmd_code, u8 len, u8 *pbuf)
> +{
> +	s32 ret = 0;
> +
> +	if (!i2c_check_functionality(client->adapter,
> +				     I2C_FUNC_SMBUS_WRITE_BLOCK_DATA))
> +		ret = i2c_smbus_write_i2c_block_data(client, cmd_code,
> +						     len, pbuf);
> +	else
> +		ret = i2c_smbus_write_block_data(client, cmd_code,
> +						 len, pbuf);
> +	return ret;
> +}
> +
> +static s32 fd5121_read_long(struct i2c_client *client, u8 cmd_code, u32 *pval)
> +{
> +	u8 buffer[I2C_SMBUS_BLOCK_MAX] = { 0 };
> +	s32 ret;
> +
> +	ret = fd5121_read_block_data(client, cmd_code, 4, buffer);
> +	if (ret < 0)
> +		return ret;
> +	if (ret < 4)
> +		return -EIO;
> +
> +	*pval = get_unaligned_le32(buffer);
> +	return 0;
> +}
> +
> +static ssize_t fd5121_long_store(struct device *dev,
> +				 struct device_attribute *da,
> +				 const char *buf, size_t count)
> +{
> +	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
> +	struct i2c_client *client = to_i2c_client(dev->parent);
> +	u8 reg = attr->index;
> +	u8 buffer[4];
> +	u32 val = 0;
> +	int ret = 0;
> +	u8 len;
> +
> +	ret = kstrtou32(buf, 10, &val);
> +	if (ret)
> +		return ret;
> +
> +	len = (u8) sizeof(buffer);
> +	for (u8 i = 0; i < len; i++) {
> +		buffer[i] = val & 0xFF;
> +		val >>= 8;
> +	}
> +	ret = fd5121_write_block_data(client, reg, len, buffer);
> +	if (ret < 0)
> +		return ret;
> +	return count;
> +}
> +
> +static ssize_t fd5121_long_show(struct device *dev,
> +				struct device_attribute *da, char *buf)
> +{
> +	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
> +	struct i2c_client *client = to_i2c_client(dev->parent);
> +	u8 reg = attr->index;
> +	u32 val = 0;
> +	s32 ret = 0;
> +
> +	ret = fd5121_read_long(client, reg, &val);
> +	if (ret < 0)
> +		return ret;
> +	return scnprintf(buf, PAGE_SIZE, "%d\n", val);
> +}
> +
> +static ssize_t fd5121_block_show(struct device *dev,
> +				 struct device_attribute *da, char *buf)
> +{
> +	struct i2c_client *client = to_i2c_client(dev->parent);
> +	struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
> +	u8 buffer[I2C_SMBUS_BLOCK_MAX] = { 0 };
> +	u8 reg = attr->index;
> +	int printed = 0;
> +	s32 ret = 0;
> +	u8 len = 0;
> +	int i = 0;
> +
> +	if (reg == PMBUS_REG_MFR_FAULTY_SPS) {
> +		int to_print = 0;
> +
> +		len = 7;
> +		ret = fd5121_read_block_data(client, reg, len, buffer);
> +		if (ret < 0)
> +			return ret;
> +		printed = 0;
> +		to_print = (ret < len) ? ret : len;
> +		for (i = 0; i < to_print; i++)
> +			printed += scnprintf(buf + printed,
> +					     PAGE_SIZE - printed,
> +					     "%02x", buffer[i]);
> +		printed += scnprintf(buf + printed,
> +				     PAGE_SIZE - printed, "\n");
> +		return printed;
> +	} else if (reg == PMBUS_REG_BBR_RAM ||
> +		   reg == PMBUS_REG_BBR_OTP) {
> +		u32 len = (reg == PMBUS_REG_BBR_OTP) ? 165 : 164;
> +
> +		/* Extra byte may be needed in case we need to store
> +		 * the length of the data
> +		 */
> +		u8 *tmp_in = kcalloc(len+1, sizeof(u8), GFP_KERNEL);
> +
> +		if (tmp_in == NULL)
> +			return -ENOMEM;
> +		ret = fd5121_read_block_data(client, reg, len, tmp_in);
> +		if (ret < 0) {
> +			kfree(tmp_in);
> +			return ret;
> +		}
> +
> +		printed = 0;
> +		for (i = 0; i < ret; i++)
> +			printed += scnprintf(buf + printed,
> +					     PAGE_SIZE - printed, "%02x",
> +					     tmp_in[i]);
> +		printed += scnprintf(buf + printed,
> +				     PAGE_SIZE - printed, "\n");
> +
> +		kfree(tmp_in);
> +		return printed;
> +	} else
> +		return -ENODATA;
> +}
> +
> +static SENSOR_DEVICE_ATTR_RW(page, fd5121_byte,
> +			     PMBUS_PAGE);
> +static SENSOR_DEVICE_ATTR_RO(vout_raw, fd5121_word,
> +			     PMBUS_READ_VOUT);
> +static SENSOR_DEVICE_ATTR_RW(operation, fd5121_byte,
> +			     PMBUS_OPERATION);
> +static SENSOR_DEVICE_ATTR_RW(on_off_config, fd5121_byte,
> +			     PMBUS_ON_OFF_CONFIG);
> +static SENSOR_DEVICE_ATTR_WO(clear_faults, fd5121_byte,
> +			     PMBUS_CLEAR_FAULTS);
> +static SENSOR_DEVICE_ATTR_RW(write_protect, fd5121_byte,
> +			     PMBUS_WRITE_PROTECT);
> +static SENSOR_DEVICE_ATTR_RO(capability, fd5121_byte,
> +			     PMBUS_CAPABILITY);
> +static SENSOR_DEVICE_ATTR_RW(smbalert_mask, fd5121_word,
> +			     PMBUS_SMBALERT_MASK);
> +static SENSOR_DEVICE_ATTR_RO(vout_mode, fd5121_byte,
> +			     PMBUS_VOUT_MODE);
> +static SENSOR_DEVICE_ATTR_RW(vout_command, fd5121_word,
> +			     PMBUS_VOUT_COMMAND);
> +static SENSOR_DEVICE_ATTR_RW(vout_max, fd5121_word,
> +			     PMBUS_VOUT_MAX);
> +static SENSOR_DEVICE_ATTR_RW(vout_margin_high, fd5121_word,
> +			     PMBUS_VOUT_MARGIN_HIGH);
> +static SENSOR_DEVICE_ATTR_RW(vout_margin_low, fd5121_word,
> +			     PMBUS_VOUT_MARGIN_LOW);
> +static SENSOR_DEVICE_ATTR_RW(vout_transition_rate, fd5121_word,
> +			     PMBUS_VOUT_TRANSITION_RATE);
> +static SENSOR_DEVICE_ATTR_RW(vout_min, fd5121_word,
> +			     PMBUS_REG_VOUT_MIN);
> +static SENSOR_DEVICE_ATTR_RW(power_mode, fd5121_byte,
> +			     PMBUS_REG_POWER_MODE);
> +static SENSOR_DEVICE_ATTR_RW(vin_on, fd5121_word,
> +			     PMBUS_REG_VIN_ON);
> +static SENSOR_DEVICE_ATTR_RW(vin_off, fd5121_word,
> +			     PMBUS_REG_VIN_OFF);
> +static SENSOR_DEVICE_ATTR_RW(vin_uv_fault_response, fd5121_byte,
> +			     PMBUS_REG_VIN_UV_FAULT_RESPONSE);
> +static SENSOR_DEVICE_ATTR_RW(iin_oc_fault_response, fd5121_byte,
> +			     PMBUS_REG_IIN_OC_FAULT_RESPONSE);
> +static SENSOR_DEVICE_ATTR_RW(ton_delay, fd5121_word,
> +			     PMBUS_REG_TON_DELAY);
> +static SENSOR_DEVICE_ATTR_RW(pout_op_fault_response, fd5121_byte,
> +			     PMBUS_REG_POUT_OP_FAULT_RESPONSE);
> +static SENSOR_DEVICE_ATTR_RO(read_vaux, fd5121_word,
> +			     PMBUS_REG_READ_VAUX);
> +static SENSOR_DEVICE_ATTR_RW(iknee_set, fd5121_word,
> +			     PMBUS_REG_IKNEE_SET);
> +static SENSOR_DEVICE_ATTR_RW(pin_counter, fd5121_byte,
> +			     PMBUS_REG_PIN_COUNTER);
> +static SENSOR_DEVICE_ATTR_RW(vmin_aware, fd5121_word,
> +			     PMBUS_REG_VMIN_AWARE);
> +static SENSOR_DEVICE_ATTR_RW(vout_ov_fault_response, fd5121_byte,
> +			     PMBUS_VOUT_OV_FAULT_RESPONSE);
> +static SENSOR_DEVICE_ATTR_RW(vout_uv_fault_response, fd5121_byte,
> +			     PMBUS_VOUT_UV_FAULT_RESPONSE);
> +static SENSOR_DEVICE_ATTR_RW(iout_oc_fault_response, fd5121_byte,
> +			     PMBUS_IOUT_OC_FAULT_RESPONSE);
> +static SENSOR_DEVICE_ATTR_RW(ot_fault_response, fd5121_byte,
> +			     PMBUS_OT_FAULT_RESPONSE);
> +static SENSOR_DEVICE_ATTR_RW(vin_ov_fault_response, fd5121_byte,
> +			     PMBUS_VIN_OV_FAULT_RESPONSE);
> +static SENSOR_DEVICE_ATTR_RW(vaux_uv_fault_limit, fd5121_word,
> +			     PMBUS_REG_VAUX_UV_FAULT_LIMIT);
> +static SENSOR_DEVICE_ATTR_RW(vaux_ov_fault_limit, fd5121_word,
> +			     PMBUS_REG_VAUX_OV_FAULT_LIMIT);
> +static SENSOR_DEVICE_ATTR_RW(vaux_uv_fault_response, fd5121_byte,
> +			     PMBUS_REG_VAUX_UV_FAULT_RESPONSE);
> +static SENSOR_DEVICE_ATTR_RW(vaux_ov_fault_response, fd5121_byte,
> +			     PMBUS_REG_VAUX_OV_FAULT_RESPONSE);
> +static SENSOR_DEVICE_ATTR_RW(vaux_uv_warning, fd5121_word,
> +			     PMBUS_REG_VAUX_UV_WARNING);
> +static SENSOR_DEVICE_ATTR_RW(vaux_ov_warning, fd5121_word,
> +			     PMBUS_REG_VAUX_OV_WARNING);
> +static SENSOR_DEVICE_ATTR_RO(free_user_config_tables, fd5121_byte,
> +			     PMBUS_REG_MFR_FREE_USER_CONFIG_TABLES);
> +static SENSOR_DEVICE_ATTR_RW(address_table, fd5121_byte,
> +			     PMBUS_REG_MFR_ADDRESS_TABLE);
> +static SENSOR_DEVICE_ATTR_RW(status_onsemi, fd5121_word,
> +			     PMBUS_REG_MFR_STATUS_ONSEMI);
> +static SENSOR_DEVICE_ATTR_RO(status_byte, fd5121_byte,
> +			     PMBUS_STATUS_BYTE);
> +static SENSOR_DEVICE_ATTR_RO(status_cml, fd5121_byte,
> +			     PMBUS_STATUS_CML);
> +static SENSOR_DEVICE_ATTR_RO(status_other, fd5121_byte,
> +			     PMBUS_STATUS_OTHER);
> +static SENSOR_DEVICE_ATTR_RO(status_mfr_specific, fd5121_byte,
> +			     PMBUS_STATUS_MFR_SPECIFIC);
> +static SENSOR_DEVICE_ATTR_RO(revision, fd5121_byte,
> +			     PMBUS_REVISION);
> +static SENSOR_DEVICE_ATTR_RO(id, fd5121_long,
> +			     PMBUS_MFR_ID);
> +static SENSOR_DEVICE_ATTR_RO(model, fd5121_long,
> +			     PMBUS_MFR_MODEL);
> +static SENSOR_DEVICE_ATTR_RO(mfr_revision, fd5121_long,
> +			     PMBUS_MFR_REVISION);
> +static SENSOR_DEVICE_ATTR_RW(date, fd5121_word,
> +			     PMBUS_MFR_DATE);
> +static SENSOR_DEVICE_ATTR_RO(ic_device_id, fd5121_long,
> +			     PMBUS_IC_DEVICE_ID);
> +static SENSOR_DEVICE_ATTR_RO(ic_device_rev, fd5121_word,
> +			     PMBUS_IC_DEVICE_REV);
> +static SENSOR_DEVICE_ATTR_WO(unlock, fd5121_byte,
> +			     PMBUS_REG_MFR_UNLOCK);
> +static SENSOR_DEVICE_ATTR_RO(faulty_sps, fd5121_block,
> +			     PMBUS_REG_MFR_FAULTY_SPS);
> +static SENSOR_DEVICE_ATTR_RO(tlvr_faults, fd5121_word,
> +			     PMBUS_REG_TLVR_FAULTS);
> +static SENSOR_DEVICE_ATTR_RW(user_store_config_tab, fd5121_byte,
> +			     PMBUS_REG_MFR_USER_STORE_CONFIG_TAB);
> +static SENSOR_DEVICE_ATTR_RO(user_config_index, fd5121_byte,
> +			     PMBUS_REG_MFR_USER_CONFIG_INDEX);
> +static SENSOR_DEVICE_ATTR_RO(pwm_disconnection, fd5121_word,
> +			     PMBUS_REG_MFR_PWM_DISCONNECTION);
> +static SENSOR_DEVICE_ATTR_RO(vr_disconnection, fd5121_byte,
> +			     PMBUS_REG_MFR_VR_DISCONNECTION);
> +static SENSOR_DEVICE_ATTR_RW(ton_slew, fd5121_byte,
> +			     PMBUS_REG_MFR_TON_SLEW);
> +static SENSOR_DEVICE_ATTR_RW(toff_slew, fd5121_byte,
> +			     PMBUS_REG_MFR_TOFF_SLEW);
> +static SENSOR_DEVICE_ATTR_RW(rail_name, fd5121_word,
> +			     PMBUS_REG_MFR_RAIL_NAME);
> +static SENSOR_DEVICE_ATTR_RW(vout_droop, fd5121_byte,
> +			     PMBUS_REG_MFR_VOUT_DROOP);
> +static SENSOR_DEVICE_ATTR_WO(svr_go, fd5121_send_byte,
> +			     PMBUS_REG_MFR_SVR_GO);
> +static SENSOR_DEVICE_ATTR_RW(user_restore_config_tab, fd5121_byte,
> +			     PMBUS_REG_MFR_USER_RESTORE_CONFIG_TAB);
> +static SENSOR_DEVICE_ATTR_WO(set_pwd, fd5121_byte,
> +			     PMBUS_REG_MFR_SET_PWD);
> +static SENSOR_DEVICE_ATTR_RW(config_activate, fd5121_byte,
> +			     PMBUS_REG_MFR_CONFIG_ACTIVATE);
> +static SENSOR_DEVICE_ATTR_RW(config_recover, fd5121_byte,
> +			     PMBUS_REG_MFR_CONFIG_RECOVER);
> +static SENSOR_DEVICE_ATTR_RW(otp_dump, fd5121_byte,
> +			     PMBUS_REG_MFR_OTP_DUMP);
> +static SENSOR_DEVICE_ATTR_RW(bbr_en, fd5121_byte,
> +			     PMBUS_REG_MFR_BBR_EN);
> +static SENSOR_DEVICE_ATTR_RW(dpm_min, fd5121_byte,
> +			     PMBUS_REG_MFR_DPM_MIN);
> +static SENSOR_DEVICE_ATTR_RW(vboot, fd5121_word,
> +			     PMBUS_REG_MFR_VBOOT);
> +static SENSOR_DEVICE_ATTR_RW(preclamp_offset, fd5121_word,
> +			     PMBUS_REG_MFR_PRECLAMP_OFFSET);
> +static SENSOR_DEVICE_ATTR_RW(tlvr_diagn, fd5121_word,
> +			     PMBUS_REG_MFR_TLVR_DIAGN);
> +static SENSOR_DEVICE_ATTR_RO(vsys, fd5121_word,
> +			     PMBUS_REG_MFR_READ_VSYS);
> +static SENSOR_DEVICE_ATTR_RW(specific_e9, fd5121_word,
> +			     PMBUS_REG_MFR_SPECIFIC_E9);
> +static SENSOR_DEVICE_ATTR_RW(specific_ea, fd5121_long,
> +			     PMBUS_REG_MFR_SPECIFIC_EA);
> +static SENSOR_DEVICE_ATTR_RO(ss_cbc, fd5121_word,
> +			     PMBUS_REG_MFR_SS_CBC);
> +static SENSOR_DEVICE_ATTR_RO(amd_status, fd5121_byte,
> +			     PMBUS_REG_MFR_AMD_STATUS);
> +static SENSOR_DEVICE_ATTR_RO(checksum, fd5121_word,
> +			     PMBUS_REG_MFR_CHECKSUM);
> +static SENSOR_DEVICE_ATTR_RO(cse_index, fd5121_word,
> +			     PMBUS_REG_CSE_INDEX);
> +static SENSOR_DEVICE_ATTR_RW(cout_measure, fd5121_word,
> +			     PMBUS_REG_COUT_MEASURE);
> +static SENSOR_DEVICE_ATTR_RO(vr_cout, fd5121_word,
> +			     PMBUS_REG_VR_COUT);
> +static SENSOR_DEVICE_ATTR_RO(bbr_ram, fd5121_block,
> +			     PMBUS_REG_BBR_RAM);
> +static SENSOR_DEVICE_ATTR_RO(bbr_otp, fd5121_block,
> +			     PMBUS_REG_BBR_OTP);
> +static SENSOR_DEVICE_ATTR_RO(psys, fd5121_word,
> +			     PMBUS_REG_READ_PSYS);
> +static SENSOR_DEVICE_ATTR_RW(postclamp_offset, fd5121_word,
> +			     PMBUS_REG_POSTCLAMP_OFFSET);
> +static SENSOR_DEVICE_ATTR_RW(pgood_delay, fd5121_byte,
> +			     PMBUS_REG_PGOOD_DELAY);
> +static SENSOR_DEVICE_ATTR_RW(specific_f8, fd5121_word,
> +			     PMBUS_REG_MFR_SPECIFIC_F8);
> +static SENSOR_DEVICE_ATTR_RW(specific_f9, fd5121_long,
> +			     PMBUS_REG_MFR_SPECIFIC_F9);
> +static SENSOR_DEVICE_ATTR_RW(pwd_program_ram, fd5121_word,
> +			     PMBUS_REG_MFR_PWD_PROGRAM_RAM);
> +static SENSOR_DEVICE_ATTR_RW(pwd_program_i2c, fd5121_word,
> +			     PMBUS_REG_MFR_PWD_PROGRAM_I2C);
> +static SENSOR_DEVICE_ATTR_RW(pwd_enable_otp_store, fd5121_word,
> +			     PMBUS_REG_MFR_PWD_ENABLE_OTP_STORE);
> +
> +static struct attribute *fd5121_non_paged_attrs[] = {
> +	&sensor_dev_attr_page.dev_attr.attr,
> +	&sensor_dev_attr_capability.dev_attr.attr,
> +	&sensor_dev_attr_pin_counter.dev_attr.attr,
> +	&sensor_dev_attr_vaux_uv_fault_limit.dev_attr.attr,
> +	&sensor_dev_attr_vaux_ov_fault_limit.dev_attr.attr,
> +	&sensor_dev_attr_vaux_uv_warning.dev_attr.attr,
> +	&sensor_dev_attr_vaux_ov_warning.dev_attr.attr,
> +	&sensor_dev_attr_free_user_config_tables.dev_attr.attr,
> +	&sensor_dev_attr_address_table.dev_attr.attr,
> +	&sensor_dev_attr_unlock.dev_attr.attr,
> +	&sensor_dev_attr_faulty_sps.dev_attr.attr,
> +	&sensor_dev_attr_tlvr_faults.dev_attr.attr,
> +	&sensor_dev_attr_user_store_config_tab.dev_attr.attr,
> +	&sensor_dev_attr_user_config_index.dev_attr.attr,
> +	&sensor_dev_attr_pwm_disconnection.dev_attr.attr,
> +	&sensor_dev_attr_vr_disconnection.dev_attr.attr,
> +	&sensor_dev_attr_user_restore_config_tab.dev_attr.attr,
> +	&sensor_dev_attr_svr_go.dev_attr.attr,
> +	&sensor_dev_attr_set_pwd.dev_attr.attr,
> +	&sensor_dev_attr_config_activate.dev_attr.attr,
> +	&sensor_dev_attr_config_recover.dev_attr.attr,
> +	&sensor_dev_attr_otp_dump.dev_attr.attr,
> +	&sensor_dev_attr_bbr_en.dev_attr.attr,
> +	&sensor_dev_attr_vboot.dev_attr.attr,
> +	&sensor_dev_attr_vsys.dev_attr.attr,
> +	&sensor_dev_attr_specific_e9.dev_attr.attr,
> +	&sensor_dev_attr_specific_ea.dev_attr.attr,
> +	&sensor_dev_attr_ss_cbc.dev_attr.attr,
> +	&sensor_dev_attr_checksum.dev_attr.attr,
> +	&sensor_dev_attr_cse_index.dev_attr.attr,
> +	&sensor_dev_attr_cout_measure.dev_attr.attr,
> +	&sensor_dev_attr_vr_cout.dev_attr.attr,
> +	&sensor_dev_attr_bbr_ram.dev_attr.attr,
> +	&sensor_dev_attr_bbr_otp.dev_attr.attr,
> +	&sensor_dev_attr_psys.dev_attr.attr,
> +	&sensor_dev_attr_specific_f8.dev_attr.attr,
> +	&sensor_dev_attr_specific_f9.dev_attr.attr,
> +	&sensor_dev_attr_pwd_program_ram.dev_attr.attr,
> +	&sensor_dev_attr_pwd_program_i2c.dev_attr.attr,
> +	&sensor_dev_attr_pwd_enable_otp_store.dev_attr.attr,
> +	&sensor_dev_attr_revision.dev_attr.attr,
> +	&sensor_dev_attr_id.dev_attr.attr,
> +	&sensor_dev_attr_model.dev_attr.attr,
> +	&sensor_dev_attr_mfr_revision.dev_attr.attr,
> +	&sensor_dev_attr_date.dev_attr.attr,
> +	&sensor_dev_attr_ic_device_id.dev_attr.attr,
> +	&sensor_dev_attr_ic_device_rev.dev_attr.attr,
> +	NULL
> +};
> +
> +static struct attribute *fd5121_paged_attrs[] = {
> +	&sensor_dev_attr_operation.dev_attr.attr,
> +	&sensor_dev_attr_vout_raw.dev_attr.attr,
> +	&sensor_dev_attr_on_off_config.dev_attr.attr,
> +	&sensor_dev_attr_clear_faults.dev_attr.attr,
> +	&sensor_dev_attr_write_protect.dev_attr.attr,
> +	&sensor_dev_attr_smbalert_mask.dev_attr.attr,
> +	&sensor_dev_attr_vout_mode.dev_attr.attr,
> +	&sensor_dev_attr_vout_command.dev_attr.attr,
> +	&sensor_dev_attr_vout_margin_high.dev_attr.attr,
> +	&sensor_dev_attr_vout_margin_low.dev_attr.attr,
> +	&sensor_dev_attr_vout_min.dev_attr.attr,
> +	&sensor_dev_attr_vin_on.dev_attr.attr,
> +	&sensor_dev_attr_vin_off.dev_attr.attr,
> +	&sensor_dev_attr_vout_ov_fault_response.dev_attr.attr,
> +	&sensor_dev_attr_vout_uv_fault_response.dev_attr.attr,
> +	&sensor_dev_attr_iout_oc_fault_response.dev_attr.attr,
> +	&sensor_dev_attr_ot_fault_response.dev_attr.attr,
> +	&sensor_dev_attr_vin_ov_fault_response.dev_attr.attr,
> +	&sensor_dev_attr_status_byte.dev_attr.attr,
> +	&sensor_dev_attr_iknee_set.dev_attr.attr,
> +	&sensor_dev_attr_vmin_aware.dev_attr.attr,
> +	&sensor_dev_attr_power_mode.dev_attr.attr,
> +	&sensor_dev_attr_vin_uv_fault_response.dev_attr.attr,
> +	&sensor_dev_attr_iin_oc_fault_response.dev_attr.attr,
> +	&sensor_dev_attr_ton_delay.dev_attr.attr,
> +	&sensor_dev_attr_pout_op_fault_response.dev_attr.attr,
> +	&sensor_dev_attr_vaux_uv_fault_response.dev_attr.attr,
> +	&sensor_dev_attr_vaux_ov_fault_response.dev_attr.attr,
> +	&sensor_dev_attr_status_onsemi.dev_attr.attr,
> +	&sensor_dev_attr_status_cml.dev_attr.attr,
> +	&sensor_dev_attr_status_other.dev_attr.attr,
> +	&sensor_dev_attr_status_mfr_specific.dev_attr.attr,
> +	&sensor_dev_attr_read_vaux.dev_attr.attr,
> +	&sensor_dev_attr_ton_slew.dev_attr.attr,
> +	&sensor_dev_attr_toff_slew.dev_attr.attr,
> +	&sensor_dev_attr_rail_name.dev_attr.attr,
> +	&sensor_dev_attr_vout_droop.dev_attr.attr,
> +	&sensor_dev_attr_dpm_min.dev_attr.attr,
> +	&sensor_dev_attr_preclamp_offset.dev_attr.attr,
> +	&sensor_dev_attr_tlvr_diagn.dev_attr.attr,
> +	&sensor_dev_attr_amd_status.dev_attr.attr,
> +	&sensor_dev_attr_postclamp_offset.dev_attr.attr,
> +	&sensor_dev_attr_pgood_delay.dev_attr.attr,
> +	&sensor_dev_attr_vout_max.dev_attr.attr,
> +	&sensor_dev_attr_vout_transition_rate.dev_attr.attr,
> +	NULL

This is a complete no-go. We do not explose raw register data as sysfs
attributes. You may expose essential register data as debugfs files,
but only those deemed necessary. The above is just "let's blindly
expose everything". Most of the above should be programmed in manufacturing
and not be touched subsequently, much less as writeable attributes.
Writing bad/unexpected values into many of those attributes can turn
a board into a brick. It is bad enough that/if this is even possible,
but exposing it as sysfs attribute would be a terrible idea.

I am not going to review this driver any further at this point.

Guenter

> +};
> +
> +static struct attribute_group fd5121_groups[2] = {
> +	{ .name = "global", .attrs = fd5121_non_paged_attrs },
> +	{ .name = "paged", .attrs = fd5121_paged_attrs }
> +};
> +
> +/* Regulator descriptors for VOUT rails (VID encoded) */
> +static struct regulator_desc fd5121_reg_desc[] = {
> +	PMBUS_REGULATOR_STEP_ONE("vout1", 3001, 1000, 200000),
> +	PMBUS_REGULATOR_STEP_ONE("vout2", 3001, 1000, 200000),
> +};
> +
> +static int fd5121_valid_reg(struct i2c_client *client, int reg)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(cc_list); i++) {
> +		if (reg == cc_list[i])
> +			return 0;
> +	}
> +
> +	if (fd5121_blk_rd_reg(reg))
> +		return 0;
> +	return -ENXIO;
> +}
> +
> +static int fd5121_read_word_data(struct i2c_client *client, int page,
> +				 int phase, int reg)
> +{
> +	int ret;
> +
> +	ret = fd5121_valid_reg(client, reg);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = pmbus_read_word_data(client, page, phase, reg);
> +	if (ret < 0)
> +		return ret;
> +
> +	/* Chip reports VOUT_MODE as vid. But gives raw value 1mV per bit.
> +	 * So, encode the READ_VOUT value so that it gets decoded and
> +	 * reported correctly.
> +	 */
> +	if (reg == PMBUS_READ_VOUT)
> +		ret = DIV_ROUND_CLOSEST(155000 - ret * 100, 625);
> +	return ret;
> +}
> +
> +static int fd5121_read_byte_data(struct i2c_client *client, int page, int reg)
> +{
> +	int ret;
> +
> +	ret = fd5121_valid_reg(client, reg);
> +	if (ret < 0)
> +		return ret;
> +
> +	return pmbus_read_byte_data(client, page, reg);
> +}
> +
> +static int fd5121_write_byte_data(struct i2c_client *client, int page,
> +				  int reg, u8 value)
> +{
> +	int ret;
> +
> +	ret = fd5121_valid_reg(client, reg);
> +	if (ret < 0)
> +		return ret;
> +	return pmbus_write_byte_data(client, page, reg, value);
> +}
> +
> +static int fd5121_write_byte(struct i2c_client *client, int page, u8 byte)
> +{
> +	return pmbus_write_byte(client, page, byte);
> +}
> +
> +static int fd5121_write_word_data(struct i2c_client *client, int page,
> +				    int reg, u16 word)
> +{
> +	int ret;
> +
> +	ret = fd5121_valid_reg(client, reg);
> +	if (ret < 0)
> +		return ret;
> +	ret = pmbus_write_word_data(client, page, reg, word);
> +	return ret;
> +}
> +
> +static u32 fd5121_get_dev_id(struct i2c_client *client)
> +{
> +	u32 dev_id = 0;
> +	s32 ret = 0;
> +
> +	ret = fd5121_read_long(client, PMBUS_IC_DEVICE_ID, &dev_id);
> +	if (ret < 0)
> +		return CTLR_ID_UNKNOWN;
> +
> +	switch (dev_id) {
> +	case CTLR_ID_FD5121:
> +	case CTLR_ID_FD5123:
> +	case CTLR_ID_FD5125:
> +		break;
> +	default:
> +		if (dev_id != 0)
> +			dev_err(&client->dev, "Unknown device 0x%x",
> +				dev_id);
> +		return CTLR_ID_UNKNOWN;
> +	}
> +	return dev_id;
> +}
> +
> +static int fd5121_probe(struct i2c_client *client)
> +{
> +	struct pmbus_driver_info *info;
> +	struct fd5121_data *pdata;
> +	u32 id;
> +
> +	if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C))
> +		return -EOPNOTSUPP;
> +
> +	pdata = devm_kzalloc(&client->dev, sizeof(struct fd5121_data),
> +			     GFP_KERNEL);
> +	if (pdata == NULL)
> +		return -ENOMEM;
> +
> +	pdata->dev = &client->dev;
> +	pdata->groups[0] = &fd5121_groups[0];
> +	pdata->groups[1] = &fd5121_groups[1];
> +
> +	id = fd5121_get_dev_id(client);
> +	if (id == CTLR_ID_UNKNOWN)
> +		return -ENODEV;
> +
> +	pdata->id = id;
> +
> +	switch (id) {
> +	case CTLR_ID_FD5121:
> +	case CTLR_ID_FD5123:
> +	case CTLR_ID_FD5125:
> +		break;
> +	default:
> +		dev_err(&client->dev, "Failed to read device ID");
> +		return -ENODEV;
> +	}
> +
> +	info = &pdata->info;
> +	info->groups = (const struct attribute_group **)&pdata->groups[0];
> +	info->write_word_data = fd5121_write_word_data;
> +	info->write_byte = fd5121_write_byte;
> +	info->write_byte_data = fd5121_write_byte_data;
> +	info->read_word_data = fd5121_read_word_data;
> +	info->read_byte_data = fd5121_read_byte_data;
> +
> +	info->pages = FD5121_NUM_PAGES;
> +	info->format[PSC_VOLTAGE_IN] = linear;
> +	info->format[PSC_VOLTAGE_OUT] = vid;
> +
> +	fd5121_reg_desc[0].id = 0;
> +	fd5121_reg_desc[1].id = 1;
> +
> +	/* Device implements VID coding with 1 mV steps from 0.200 V
> +	 * up to 3.200 V
> +	 */
> +	info->num_regulators = FD5121_NUM_PAGES;
> +	info->reg_desc = fd5121_reg_desc;
> +	info->format[PSC_CURRENT_IN] = linear;
> +	info->format[PSC_CURRENT_OUT] = linear;
> +	info->format[PSC_POWER] = linear;
> +	info->format[PSC_TEMPERATURE] = linear;
> +	for (u8 idx = 0; idx < info->pages; idx++) {
> +		info->func[idx] = PMBUS_HAVE_IOUT | PMBUS_HAVE_STATUS_IOUT;
> +		info->func[idx] |= PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT;
> +		info->func[idx] |= PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP;
> +		info->func[idx] |= PMBUS_HAVE_PIN | PMBUS_HAVE_POUT;
> +		info->func[idx] |= PMBUS_HAVE_VIN | PMBUS_HAVE_IIN;
> +		info->func[idx] |= PMBUS_HAVE_STATUS_INPUT;
> +		info->vrm_version[idx] = amd625mv;
> +	}
> +	return pmbus_do_probe(client, info);
> +}
> +
> +#ifdef CONFIG_OF
> +static const struct of_device_id fd5121_of_match[] = {
> +	{ .compatible = "onnn,fd5121" },
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(of, fd5121_of_match);
> +#endif
> +
> +static const struct i2c_device_id fd5121_id[] = {
> +	{ "fd5121", chip_fd5121 },
> +	{ "fd5123", chip_fd5123 },
> +	{ "fd5125", chip_fd5125 },
> +	{ }
> +};
> +MODULE_DEVICE_TABLE(i2c, fd5121_id);
> +
> +static struct i2c_driver fd5121_driver = {
> +	.driver = {
> +		.name = "fd5121",
> +		.of_match_table = of_match_ptr(fd5121_of_match),
> +	},
> +	.probe = fd5121_probe,
> +	.id_table = fd5121_id,
> +};
> +
> +module_i2c_driver(fd5121_driver);
> +
> +MODULE_AUTHOR("Selva Rajagopal <selvamani.rajagopal@onsemi.com>");
> +MODULE_DESCRIPTION("PMBus driver for FD5121");
> +MODULE_LICENSE("GPL");
> +MODULE_IMPORT_NS("PMBUS");
> +
> 


^ permalink raw reply

* Re: [PATCH] crypto: af_alg - Add af_alg_restrict sysctl, defaulting to 1
From: Rosen Penev @ 2026-06-23 22:06 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-crypto, Herbert Xu, linux-kernel, linux-doc,
	linux-bluetooth, iwd, linux-hardening, Milan Broz,
	Demi Marie Obenour, Andy Lutomirski
In-Reply-To: <20260623214730.GA3281861@google.com>

On Tue, Jun 23, 2026 at 2:47 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Tue, Jun 23, 2026 at 02:28:17PM -0700, Rosen Penev wrote:
> > > +static const struct af_alg_allowlist_entry hash_allowlist[] = {
> > > +       { "cmac(aes)", true }, /* iwd, bluez */
> > > +       { "hmac(md5)", true }, /* iwd */
> > > +       { "hmac(sha1)", true }, /* iwd */
> > > +       { "hmac(sha224)", true }, /* iwd */
> > > +       { "hmac(sha256)", true }, /* iwd */
> > > +       { "hmac(sha384)", true }, /* iwd */
> > > +       { "hmac(sha512)", true }, /* iwd, sha512hmac */
> > > +       { "md4", true }, /* iwd */
> > > +       { "md5", true }, /* iwd */
> > > +       { "sha1", false }, /* iwd, iproute2 < 7.0 */
> > > +       { "sha224", true }, /* iwd */
> > > +       { "sha256", true }, /* iwd */
> > > +       { "sha384", true }, /* iwd */
> > > +       { "sha512", true }, /* iwd */
> > > +       {},
> > In OpenWrt, https://gitlab.com/linux-afs/kafs-client and strongswan
> > seem to be the other users of the user API. I haven't looked into what
> > they need.
>
> [Please trim your replies, thanks!]
Not sure what you mean.
>
> https://gitlab.com/linux-afs/kafs-client uses AF_ALG only for
> "hmac(md5)", which I already put on the privileged allowlist due to iwd
> also using it.  So it would still work by default with the current
> patch, unless it needs to use it unprivileged.
>
> (FWIW, a use of a single obsolete algorithm like this is also a good
> candidate for just replacing with local code...)
>
> https://github.com/strongswan/strongswan already supports userspace
> crypto libraries.
Oh lovely. Looks like this needs fixing in OpenWrt.
>
> - Eric

^ permalink raw reply

* Re: [PATCH] crypto: af_alg - Add af_alg_restrict sysctl, defaulting to 1
From: Eric Biggers @ 2026-06-23 21:47 UTC (permalink / raw)
  To: Rosen Penev
  Cc: linux-crypto, Herbert Xu, linux-kernel, linux-doc,
	linux-bluetooth, iwd, linux-hardening, Milan Broz,
	Demi Marie Obenour, Andy Lutomirski
In-Reply-To: <CAKxU2N_EGTWkvtPOxQXBroxGVXDf1atPoFVyRRu0wHOtEXVWaA@mail.gmail.com>

On Tue, Jun 23, 2026 at 02:28:17PM -0700, Rosen Penev wrote:
> > +static const struct af_alg_allowlist_entry hash_allowlist[] = {
> > +       { "cmac(aes)", true }, /* iwd, bluez */
> > +       { "hmac(md5)", true }, /* iwd */
> > +       { "hmac(sha1)", true }, /* iwd */
> > +       { "hmac(sha224)", true }, /* iwd */
> > +       { "hmac(sha256)", true }, /* iwd */
> > +       { "hmac(sha384)", true }, /* iwd */
> > +       { "hmac(sha512)", true }, /* iwd, sha512hmac */
> > +       { "md4", true }, /* iwd */
> > +       { "md5", true }, /* iwd */
> > +       { "sha1", false }, /* iwd, iproute2 < 7.0 */
> > +       { "sha224", true }, /* iwd */
> > +       { "sha256", true }, /* iwd */
> > +       { "sha384", true }, /* iwd */
> > +       { "sha512", true }, /* iwd */
> > +       {},
> In OpenWrt, https://gitlab.com/linux-afs/kafs-client and strongswan
> seem to be the other users of the user API. I haven't looked into what
> they need.

[Please trim your replies, thanks!]

https://gitlab.com/linux-afs/kafs-client uses AF_ALG only for
"hmac(md5)", which I already put on the privileged allowlist due to iwd
also using it.  So it would still work by default with the current
patch, unless it needs to use it unprivileged.

(FWIW, a use of a single obsolete algorithm like this is also a good
candidate for just replacing with local code...)

https://github.com/strongswan/strongswan already supports userspace
crypto libraries.

- Eric

^ permalink raw reply

* Re: [PATCH] crypto: af_alg - Add af_alg_restrict sysctl, defaulting to 1
From: Rosen Penev @ 2026-06-23 21:28 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-crypto, Herbert Xu, linux-kernel, linux-doc,
	linux-bluetooth, iwd, linux-hardening, Milan Broz,
	Demi Marie Obenour, Andy Lutomirski
In-Reply-To: <20260622234803.6982-1-ebiggers@kernel.org>

On Mon, Jun 22, 2026 at 4:49 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> AF_ALG is a frequent source of vulnerabilities and a maintenance
> nightmare.  It exposes far more functionality to userspace than ever
> should have been exposed, especially to unprivileged processes.  Recent
> exploits have targeted kernel internal implementation details like
> "authencesn" that have zero use case for userspace access.
>
> Fortunately, AF_ALG is rarely used in practice, as userspace crypto
> libraries exist.  And when it is used, only some functionality is known
> to be used, and many users are known to hold capabilities already.
> iwd for example requires CAP_NET_ADMIN and has a known algorithm list
> (https://lore.kernel.org/linux-crypto/bcbbef00-5881-421b-8892-7be6c04b832d@gmail.com/).
>
> Thus, let's restrict the set of allowed algorithms by default, depending
> on the capabilities held.
>
> Add a sysctl /proc/sys/crypto/af_alg_restrict with meaning:
>
>     0: unrestricted
>     1: limited functionality
>     2: completely disabled
>
> Set the default value to 1, which enables an algorithm allowlist for
> unprivileged processes and a slightly longer allowlist for privileged
> processes.
>
> Note that the list may be tweaked in the future.  However, the common
> use cases such as iwd and bluez are taken into account already.  I've
> tested that iwd still works with the default value of 1.
>
> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
> ---
>  Documentation/admin-guide/sysctl/crypto.rst | 36 +++++++++++
>  Documentation/crypto/userspace-if.rst       | 13 +++-
>  crypto/af_alg.c                             | 72 +++++++++++++++++++--
>  crypto/algif_aead.c                         | 11 ++++
>  crypto/algif_hash.c                         | 24 +++++++
>  crypto/algif_rng.c                          |  9 +++
>  crypto/algif_skcipher.c                     | 20 ++++++
>  include/crypto/if_alg.h                     |  8 +++
>  8 files changed, 184 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/admin-guide/sysctl/crypto.rst b/Documentation/admin-guide/sysctl/crypto.rst
> index b707bd314a64..9a1bd53287f4 100644
> --- a/Documentation/admin-guide/sysctl/crypto.rst
> +++ b/Documentation/admin-guide/sysctl/crypto.rst
> @@ -5,10 +5,46 @@
>  These files show up in ``/proc/sys/crypto/``, depending on the
>  kernel configuration:
>
>  .. contents:: :local:
>
> +.. _af_alg_restrict:
> +
> +af_alg_restrict
> +===============
> +
> +Controls the level of restriction of AF_ALG.
> +
> +AF_ALG is a deprecated and rarely-used userspace interface that is a
> +frequent source of vulnerabilities. It also unnecessarily exposes a
> +large number of kernel implementation details. For more information
> +about AF_ALG, see :ref:`Documentation/crypto/userspace-if.rst
> +<crypto_userspace_interface>`.
> +
> +Starting in Linux v7.3, AF_ALG supports only a limited set of
> +algorithms by default. This sysctl allows the system administrator to
> +remove this restriction when needed for compatibility reasons, or to
> +go further and disable AF_ALG entirely. The default value is 1.
> +
> +===  ==================================================================
> +0    AF_ALG is unrestricted.
> +
> +1    AF_ALG is supported with a limited list of algorithms. The list
> +     is designed for compatibility with known users such as iwd and
> +     bluez that haven't yet been fixed to use userspace crypto code.
> +
> +     Specifically, there is an allowlist for unprivileged processes
> +     and a somewhat longer allowlist for processes that hold
> +     CAP_SYS_ADMIN or CAP_NET_ADMIN in the initial user namespace.
> +
> +     Attempts to bind() an AF_ALG socket with a disallowed algorithm
> +     fail with ENOENT.
> +
> +2    AF_ALG is completely disabled. Attempts to create an AF_ALG
> +     socket fail with EAFNOSUPPORT.
> +===  ==================================================================
> +
>  fips_enabled
>  ============
>
>  Read-only flag that indicates whether FIPS mode is enabled.
>
> diff --git a/Documentation/crypto/userspace-if.rst b/Documentation/crypto/userspace-if.rst
> index ab93300c8e04..d6194346e366 100644
> --- a/Documentation/crypto/userspace-if.rst
> +++ b/Documentation/crypto/userspace-if.rst
> @@ -1,5 +1,7 @@
> +.. _crypto_userspace_interface:
> +
>  User Space Interface
>  ====================
>
>  Introduction
>  ------------
> @@ -10,13 +12,18 @@ code.
>
>  AF_ALG is insecure and is deprecated. Originally added to the kernel in 2010,
>  most kernel developers now consider it to be a mistake. Support for hardware
>  accelerators, which was the original purpose of AF_ALG, has been removed.
>
> -AF_ALG continues to be supported only for backwards compatibility. On systems
> -where no programs using AF_ALG remain, the support for it should be disabled by
> -disabling ``CONFIG_CRYPTO_USER_API_*``.
> +AF_ALG continues to be supported only for backwards compatibility.
> +
> +Starting in Linux v7.3, the set of algorithms supported by AF_ALG is limited by
> +default. See :ref:`/proc/sys/crypto/af_alg_restrict <af_alg_restrict>`.
> +
> +On systems where no programs using AF_ALG remain, the support for it should be
> +disabled entirely by setting ``/proc/sys/crypto/af_alg_restrict`` to 2 or by
> +disabling ``CONFIG_CRYPTO_USER_API_*`` in the kernel configuration.
>
>  Deprecation
>  -----------
>
>  AF_ALG was originally intended to provide userspace programs access to crypto
> diff --git a/crypto/af_alg.c b/crypto/af_alg.c
> index cce000e8590e..34b801568fba 100644
> --- a/crypto/af_alg.c
> +++ b/crypto/af_alg.c
> @@ -6,10 +6,11 @@
>   *
>   * Copyright (c) 2010 Herbert Xu <herbert@gondor.apana.org.au>
>   */
>
>  #include <linux/atomic.h>
> +#include <linux/capability.h>
>  #include <crypto/if_alg.h>
>  #include <linux/crypto.h>
>  #include <linux/init.h>
>  #include <linux/kernel.h>
>  #include <linux/key.h>
> @@ -20,14 +21,32 @@
>  #include <linux/rwsem.h>
>  #include <linux/sched.h>
>  #include <linux/sched/signal.h>
>  #include <linux/security.h>
>  #include <linux/string.h>
> +#include <linux/sysctl.h>
> +#include <linux/user_namespace.h>
>  #include <keys/user-type.h>
>  #include <keys/trusted-type.h>
>  #include <keys/encrypted-type.h>
>
> +static int af_alg_restrict = 1;
> +
> +static const struct ctl_table af_alg_table[] = {
> +       {
> +               .procname       = "af_alg_restrict",
> +               .data           = &af_alg_restrict,
> +               .maxlen         = sizeof(int),
> +               .mode           = 0644,
> +               .proc_handler   = proc_dointvec_minmax,
> +               .extra1         = SYSCTL_ZERO,
> +               .extra2         = SYSCTL_TWO,
> +       },
> +};
> +
> +static struct ctl_table_header *af_alg_header;
> +
>  struct alg_type_list {
>         const struct af_alg_type *type;
>         struct list_head list;
>  };
>
> @@ -108,10 +127,43 @@ int af_alg_unregister_type(const struct af_alg_type *type)
>
>         return err;
>  }
>  EXPORT_SYMBOL_GPL(af_alg_unregister_type);
>
> +static bool af_alg_capable(void)
> +{
> +       return ns_capable_noaudit(&init_user_ns, CAP_NET_ADMIN) ||
> +              capable(CAP_SYS_ADMIN);
> +}
> +
> +int af_alg_check_restriction(const char *name,
> +                            const struct af_alg_allowlist_entry allowlist[])
> +{
> +       int level = READ_ONCE(af_alg_restrict);
> +
> +       if (level == 0)
> +               return 0;
> +       if (level == 1) {
> +               for (const struct af_alg_allowlist_entry *ent = allowlist;
> +                    ent->name; ent++) {
> +                       if (strcmp(name, ent->name) == 0 &&
> +                           (!ent->privileged || af_alg_capable()))
> +                               return 0;
> +               }
> +       }
> +       /*
> +        * Use -ENOENT (the error code for "algorithm not found") instead of
> +        * -EACCES or -EPERM, for the highest chance of correctly triggering
> +        * fallback code paths in userspace programs.
> +        *
> +        * Don't log a warning, since it would be noisy.  iwd tries to bind a
> +        * bunch of algorithms that it never uses.
> +        */
> +       return -ENOENT;
> +}
> +EXPORT_SYMBOL_GPL(af_alg_check_restriction);
> +
>  static void alg_do_release(const struct af_alg_type *type, void *private)
>  {
>         if (!type)
>                 return;
>
> @@ -504,10 +556,13 @@ static int alg_create(struct net *net, struct socket *sock, int protocol,
>                       int kern)
>  {
>         struct sock *sk;
>         int err;
>
> +       if (READ_ONCE(af_alg_restrict) == 2)
> +               return -EAFNOSUPPORT;
> +
>         if (sock->type != SOCK_SEQPACKET)
>                 return -ESOCKTNOSUPPORT;
>         if (protocol != 0)
>                 return -EPROTONOSUPPORT;
>
> @@ -1220,31 +1275,36 @@ int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags,
>  }
>  EXPORT_SYMBOL_GPL(af_alg_get_rsgl);
>
>  static int __init af_alg_init(void)
>  {
> -       int err = proto_register(&alg_proto, 0);
> +       int err;
> +
> +       af_alg_header = register_sysctl("crypto", af_alg_table);
>
> +       err = proto_register(&alg_proto, 0);
>         if (err)
> -               goto out;
> +               goto out_unregister_sysctl;
>
>         err = sock_register(&alg_family);
> -       if (err != 0)
> +       if (err)
>                 goto out_unregister_proto;
>
> -out:
> -       return err;
> +       return 0;
>
>  out_unregister_proto:
>         proto_unregister(&alg_proto);
> -       goto out;
> +out_unregister_sysctl:
> +       unregister_sysctl_table(af_alg_header);
> +       return err;
>  }
>
>  static void __exit af_alg_exit(void)
>  {
>         sock_unregister(PF_ALG);
>         proto_unregister(&alg_proto);
> +       unregister_sysctl_table(af_alg_header);
>  }
>
>  module_init(af_alg_init);
>  module_exit(af_alg_exit);
>  MODULE_DESCRIPTION("Crypto userspace interface");
> diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
> index 787aac8aeb24..b9217f9086aa 100644
> --- a/crypto/algif_aead.c
> +++ b/crypto/algif_aead.c
> @@ -32,10 +32,15 @@
>  #include <linux/mm.h>
>  #include <linux/module.h>
>  #include <linux/net.h>
>  #include <net/sock.h>
>
> +static const struct af_alg_allowlist_entry aead_allowlist[] = {
> +       { "ccm(aes)", true }, /* bluez */
> +       {},
> +};
> +
>  static inline bool aead_sufficient_data(struct sock *sk)
>  {
>         struct alg_sock *ask = alg_sk(sk);
>         struct sock *psk = ask->parent;
>         struct alg_sock *pask = alg_sk(psk);
> @@ -342,10 +347,16 @@ static struct proto_ops algif_aead_ops_nokey = {
>         .poll           =       af_alg_poll,
>  };
>
>  static void *aead_bind(const char *name)
>  {
> +       int err;
> +
> +       err = af_alg_check_restriction(name, aead_allowlist);
> +       if (err)
> +               return ERR_PTR(err);
> +
>         return crypto_alloc_aead(name, 0, AF_ALG_CRYPTOAPI_MASK);
>  }
>
>  static void aead_release(void *private)
>  {
> diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
> index 5452ad6c1506..a8d958d51ece 100644
> --- a/crypto/algif_hash.c
> +++ b/crypto/algif_hash.c
> @@ -14,10 +14,28 @@
>  #include <linux/mm.h>
>  #include <linux/module.h>
>  #include <linux/net.h>
>  #include <net/sock.h>
>
> +static const struct af_alg_allowlist_entry hash_allowlist[] = {
> +       { "cmac(aes)", true }, /* iwd, bluez */
> +       { "hmac(md5)", true }, /* iwd */
> +       { "hmac(sha1)", true }, /* iwd */
> +       { "hmac(sha224)", true }, /* iwd */
> +       { "hmac(sha256)", true }, /* iwd */
> +       { "hmac(sha384)", true }, /* iwd */
> +       { "hmac(sha512)", true }, /* iwd, sha512hmac */
> +       { "md4", true }, /* iwd */
> +       { "md5", true }, /* iwd */
> +       { "sha1", false }, /* iwd, iproute2 < 7.0 */
> +       { "sha224", true }, /* iwd */
> +       { "sha256", true }, /* iwd */
> +       { "sha384", true }, /* iwd */
> +       { "sha512", true }, /* iwd */
> +       {},
In OpenWrt, https://gitlab.com/linux-afs/kafs-client and strongswan
seem to be the other users of the user API. I haven't looked into what
they need.
> +};
> +
>  struct hash_ctx {
>         struct af_alg_sgl sgl;
>
>         u8 *result;
>
> @@ -380,10 +398,16 @@ static struct proto_ops algif_hash_ops_nokey = {
>         .accept         =       hash_accept_nokey,
>  };
>
>  static void *hash_bind(const char *name)
>  {
> +       int err;
> +
> +       err = af_alg_check_restriction(name, hash_allowlist);
> +       if (err)
> +               return ERR_PTR(err);
> +
>         return crypto_alloc_ahash(name, 0, AF_ALG_CRYPTOAPI_MASK);
>  }
>
>  static void hash_release(void *private)
>  {
> diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
> index 4dfe7899f8fa..bd522915d56d 100644
> --- a/crypto/algif_rng.c
> +++ b/crypto/algif_rng.c
> @@ -48,10 +48,14 @@
>
>  MODULE_LICENSE("GPL");
>  MODULE_AUTHOR("Stephan Mueller <smueller@chronox.de>");
>  MODULE_DESCRIPTION("User-space interface for random number generators");
>
> +static const struct af_alg_allowlist_entry rng_allowlist[] = {
> +       {},
> +};
> +
>  struct rng_ctx {
>  #define MAXSIZE 128
>         unsigned int len;
>         struct crypto_rng *drng;
>         u8 *addtl;
> @@ -199,10 +203,15 @@ static struct proto_ops __maybe_unused algif_rng_test_ops = {
>
>  static void *rng_bind(const char *name)
>  {
>         struct rng_parent_ctx *pctx;
>         struct crypto_rng *rng;
> +       int err;
> +
> +       err = af_alg_check_restriction(name, rng_allowlist);
> +       if (err)
> +               return ERR_PTR(err);
>
>         pctx = kzalloc_obj(*pctx);
>         if (!pctx)
>                 return ERR_PTR(-ENOMEM);
>
> diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
> index df20bdfe1f1f..2b8069667974 100644
> --- a/crypto/algif_skcipher.c
> +++ b/crypto/algif_skcipher.c
> @@ -32,10 +32,24 @@
>  #include <linux/mm.h>
>  #include <linux/module.h>
>  #include <linux/net.h>
>  #include <net/sock.h>
>
> +static const struct af_alg_allowlist_entry skcipher_allowlist[] = {
> +       { "adiantum(xchacha12,aes)", false }, /* cryptsetup */
> +       { "adiantum(xchacha20,aes)", false }, /* cryptsetup */
> +       { "cbc(aes)", true }, /* iwd */
> +       { "cbc(des)", true }, /* iwd */
> +       { "cbc(des3_ede)", true }, /* iwd */
> +       { "ctr(aes)", true }, /* iwd */
> +       { "ecb(aes)", true }, /* iwd, bluez */
> +       { "ecb(des)", true }, /* iwd */
> +       { "hctr2(aes)", false }, /* cryptsetup */
> +       { "xts(aes)", false }, /* cryptsetup benchmark */
> +       {},
> +};
> +
>  static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg,
>                             size_t size)
>  {
>         struct sock *sk = sock->sk;
>         struct alg_sock *ask = alg_sk(sk);
> @@ -307,10 +321,16 @@ static struct proto_ops algif_skcipher_ops_nokey = {
>         .poll           =       af_alg_poll,
>  };
>
>  static void *skcipher_bind(const char *name)
>  {
> +       int err;
> +
> +       err = af_alg_check_restriction(name, skcipher_allowlist);
> +       if (err)
> +               return ERR_PTR(err);
> +
>         return crypto_alloc_skcipher(name, 0, AF_ALG_CRYPTOAPI_MASK);
>  }
>
>  static void skcipher_release(void *private)
>  {
> diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
> index 7643ba954125..4e9ed8e73403 100644
> --- a/include/crypto/if_alg.h
> +++ b/include/crypto/if_alg.h
> @@ -159,13 +159,21 @@ struct af_alg_ctx {
>         unsigned int len;
>
>         unsigned int inflight;
>  };
>
> +struct af_alg_allowlist_entry {
> +       const char *name;
> +       bool privileged;
> +};
> +
>  int af_alg_register_type(const struct af_alg_type *type);
>  int af_alg_unregister_type(const struct af_alg_type *type);
>
> +int af_alg_check_restriction(const char *name,
> +                            const struct af_alg_allowlist_entry allowlist[]);
> +
>  int af_alg_release(struct socket *sock);
>  void af_alg_release_parent(struct sock *sk);
>  int af_alg_accept(struct sock *sk, struct socket *newsock,
>                   struct proto_accept_arg *arg);
>
>
> base-commit: 1dc18801be29bc54709aa355b8acd80e183b03cd
> --
> 2.54.0
>
>

^ permalink raw reply

* Re: [PATCH 2/3] dt-bindings: hwmon: pmbus: Support for onsemi's FD5121
From: Conor Dooley @ 2026-06-23 21:14 UTC (permalink / raw)
  To: Selvamani Rajagopal
  Cc: Guenter Roeck, Jonathan Corbet, Shuah Khan, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-hwmon@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	devicetree@vger.kernel.org
In-Reply-To: <CYYPR02MB98280DF78A07EADACFD084EE83EE2@CYYPR02MB9828.namprd02.prod.outlook.com>

[-- Attachment #1: Type: text/plain, Size: 1717 bytes --]

On Tue, Jun 23, 2026 at 09:01:32PM +0000, Selvamani Rajagopal wrote:
> 
> > -----Original Message-----
> > From: Conor Dooley <conor@kernel.org>
> > Subject: Re: [PATCH 2/3] dt-bindings: hwmon: pmbus: Support for onsemi's FD5121
> > 
> >
> > > +
> > > +title: onsemi's multi-phase digital controllers
> > 
> > Can someone explain to me what a "digital controller" actually is?
> > Seems very generi and that a word may have been left out, were it not
> > for the fact that this wording is used several times in the patch.
> > 
> 
> Thanks for reviewing.
> 
> According to me, "digital controller" means the controller uses digital circuits to implement 
> the features and functionality. We can remove "digital" and keep only controller. It won't make any
> difference for Linux documentation.

My point is that what's actually being controlled is missing. Maybe it
is obvious to you, but it is not to me. Your nodename in your example is
> +      fd5121@50 {
which doesn't comply with node naming requirements and I wanted to come
up with a suggestion for what it should be.
I am assuming that its power or voltage that you're controlling so
either it should be hwmon@ or regulator@. 

> 
> > > +
> > > + enum:
> > > + - onnn,fd5121
> > > + - onnn,fd5123
> > > + - onnn,fd5125
> > 
> > Your /OF/ match data in your driver suggests that you intended to permit
> > fallback compatibles here?
> 
> Agree. Sorry about the discrepancy. Will fix it.
> 
> > 
> > |+#ifdef CONFIG_OF
> > |+static const struct of_device_id fd5121_of_match[] = {
> > |+ { .compatible = "onnn,fd5121" },
> > |+ { }
> > |+};
> > |+MODULE_DEVICE_TABLE(of, fd5121_of_match);
> > |+#endif
> > 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply

* Re: [PATCH v2] usbcore: Add quirk for 255-bytes initial config read
From: Nikhil Solanke @ 2026-06-23 21:14 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-usb, gregkh, linux-kernel, michal.pecio, stable, corbet,
	skhan, linux-doc
In-Reply-To: <567e8866-4308-4e5f-819c-fe778dbf74f8@rowland.harvard.edu>

> Moving this delay up here changes the behavior when the quirk flag isn't
> set.  While it agrees with the intention of the USB_QUIRK_DELAY_INIT
> flag, such a change should be mentioned in the patch description.

How should I mention it then? Nothing comes to mind besides the
obvious: "Also move the USB_QUIRK_DELAY_INIT sleep to before the
initial descriptor read, so the delay applies consistently regardless
of whether USB_QUIRK_CONFIG_SIZE is set.". Or should i revert it back
to original position?

> > +
> > +             /*
> > +              * Grab just the first descriptor so we know how long the whole
> > +              * configuration is. In case of quirky firmware, try to grab the
> > +              * whole thing in one go by asking for a 255-bytes sized buffer
> > +              * mirroring Windows behavior.
> > +              */
>
> This needs to be rewritten, as it is self-contradictory.  When the quirk
> flag is set we issue a 255-byte request to mimic the Windows behavior,
> and only when the flag isn't set do we grab just the first descriptor.

I am sorry I didn't understand how it is self contradictory. The
comment does say, "in case of quirky firmware..."? Am i missing
something?

> >               result = usb_get_descriptor(dev, USB_DT_CONFIG, cfgno,
> > -                 desc, USB_DT_CONFIG_SIZE);
> > +                                             desc, usb_config_req_size);
>
> Don't make extraneous changes to the existing indentation (or whitespace
> in general), here and below.

Well the linux coding style guidelines mention that those descendants
should preferably be aligned with the function open parenthesis. Since
i did "touch" that line/part of code I though might as well indent it
a bit accordingly. Should i revert the indent then (in this and the
other place)?

> >                       if (result != -EPIPE)
> >                               goto err;
> >                       dev_notice(ddev, "chopping to %d config(s)\n", cfgno);
> > @@ -957,13 +976,25 @@ int usb_get_configuration(struct usb_device *dev)
> >                       break;
> >               } else if (result < 4) {
> >                       dev_err(ddev, "config index %d descriptor too short "
> > -                         "(expected %i, got %i)\n", cfgno,
> > -                         USB_DT_CONFIG_SIZE, result);
> > +                             "(asked for %zu, got %i, expected at least %i)\n",
> > +                             cfgno, usb_config_req_size, result, 4);
> >                       result = -EINVAL;
> >                       goto err;
> >               }
> > +
> >               length = max_t(int, le16_to_cpu(desc->wTotalLength),
> > -                 USB_DT_CONFIG_SIZE);
> > +                             USB_DT_CONFIG_SIZE);
>
> This is another example of a change that has nothing to do with the
> purpose of the patch.

Isn't that what you told me to change? So the logs are accurate? I
made that change because you suggested it. :')

> > +
> > +             /*
> > +              * If the device returns the full length configuration
> > +              * descriptor, skip the second read. Otherwise, send a second
>
> Strictly speaking, the configuration descriptor is only 9 bytes long.
> What you mean here is the entire configuration descriptor set.

Alright i'll reword it.

> > +              * request asking for the full length.
> > +              */
> > +             if (result >= le16_to_cpu(desc->wTotalLength)) {
>
> Shouldn't this be: result >= length?  No point in repeating the
> le16_to_cpu calculation.

Yess. initially the length assignment was happening afterwards in my
patch. then i decided to move it before the "if" statement since the
outcome of length was going to be similar in any case (within if and
after if). but then i forgot to modify the if too. Will fix it.

> Like above, this string should all be on one line.

Will fix all the strings as well

Nikhil Solanke

^ permalink raw reply

* RE: [PATCH 2/3] dt-bindings: hwmon: pmbus: Support for onsemi's FD5121
From: Selvamani Rajagopal @ 2026-06-23 21:01 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Guenter Roeck, Jonathan Corbet, Shuah Khan, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-hwmon@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	devicetree@vger.kernel.org
In-Reply-To: <20260623-anybody-gutter-e6ca04f53bdb@spud>


> -----Original Message-----
> From: Conor Dooley <conor@kernel.org>
> Subject: Re: [PATCH 2/3] dt-bindings: hwmon: pmbus: Support for onsemi's FD5121
> 
>
> > +
> > +title: onsemi's multi-phase digital controllers
> 
> Can someone explain to me what a "digital controller" actually is?
> Seems very generi and that a word may have been left out, were it not
> for the fact that this wording is used several times in the patch.
> 

Thanks for reviewing.

According to me, "digital controller" means the controller uses digital circuits to implement 
the features and functionality. We can remove "digital" and keep only controller. It won't make any
difference for Linux documentation.

> > +
> > + enum:
> > + - onnn,fd5121
> > + - onnn,fd5123
> > + - onnn,fd5125
> 
> Your /OF/ match data in your driver suggests that you intended to permit
> fallback compatibles here?

Agree. Sorry about the discrepancy. Will fix it.

> 
> |+#ifdef CONFIG_OF
> |+static const struct of_device_id fd5121_of_match[] = {
> |+ { .compatible = "onnn,fd5121" },
> |+ { }
> |+};
> |+MODULE_DEVICE_TABLE(of, fd5121_of_match);
> |+#endif
> 


^ permalink raw reply

* Re: [PATCH] docs: tools: Fix typo in unittest.rst
From: Jonathan Corbet @ 2026-06-23 20:48 UTC (permalink / raw)
  To: Declan Wale; +Cc: skhan, mchehab+huawei, linux-doc, linux-kernel
In-Reply-To: <CADz3o9mbM60-p1PV8t=nOm7099KnFeYQOyo5J+bC2iiP9PtBJQ@mail.gmail.com>

Declan Wale <decwale37@gmail.com> writes:

> From 355e0c80a9ba337d0cac106c8cb66859927a501c Mon Sep 17 00:00:00 2001
> From: Declan Wale <decwale37@gmail.com>
> Date: Sun, 31 May 2026 19:03:06 +0100
> Subject: [PATCH] docs: tools: Fix typo 'ackward' to 'awkward' in unittest.rst
>
> Signed-off-by: Declan Wale <decwale37@gmail.com>
> ---
>  Documentation/tools/unittest.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/tools/unittest.rst b/Documentation/tools/unittest.rst
> index 14a2b2a65236..0fa8716741df 100644
> --- a/Documentation/tools/unittest.rst
> +++ b/Documentation/tools/unittest.rst
> @@ -11,7 +11,7 @@ While the actual test implementation is usecase dependent, Python already
>  provides a standard way to add unit tests by using ``import unittest``.
>  
>  Using such class, requires setting up a test suite. Also, the default format
> -is a little bit ackward. To improve it and provide a more uniform way to
> +is a little bit awkward. To improve it and provide a more uniform way to
>  report errors, some unittest classes and functions are defined.

Applied, thanks.

jon

^ permalink raw reply

* Re: [PATCH v2] kdoc: xforms: ignore special static/inline macros
From: Jonathan Corbet @ 2026-06-23 20:45 UTC (permalink / raw)
  To: Randy Dunlap, linux-doc
  Cc: Randy Dunlap, Shuah Khan, Mauro Carvalho Chehab, Harry Wentland,
	Alex Hung, Ivan Lipski, Dan Wheeler, Alex Deucher,
	Christian König, amd-gfx
In-Reply-To: <20260612234458.1084156-1-rdunlap@infradead.org>

Randy Dunlap <rdunlap@infradead.org> writes:

> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c contains 7 (for
> now) functions that use STATIC_IFN_KUNIT or INLINE_IFN_KUNIT macros for
> function qualifiers (static or not, inline or not).
>
> These cause parse warnings from kernel-doc:
> Invalid C declaration: Expected identifier in nested name, got keyword:
>   struct [error at 29]
> STATIC_IFN_KUNIT const struct drm_color_lut * __extract_blob_lut (const
>   struct drm_property_blob *blob, uint32_t *size)
>
> Handle these in kernel-doc to prevent multiple warnings.
>
> Fixes: 647d1fd04652 ("drm/amd/display: Add KUnit test for color helpers")
> Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
> ---
> v2: drop an unsubmitted patch so that this one applies with no problem
>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> Cc: Harry Wentland <harry.wentland@amd.com>
> Cc: Alex Hung <alex.hung@amd.com>
> Cc: Ivan Lipski <ivan.lipski@amd.com>
> Cc: Dan Wheeler <daniel.wheeler@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: amd-gfx@lists.freedesktop.org
>
>  tools/lib/python/kdoc/xforms_lists.py |    2 ++
>  1 file changed, 2 insertions(+)
>
> --- linext-2026-0610.orig/tools/lib/python/kdoc/xforms_lists.py
> +++ linext-2026-0610/tools/lib/python/kdoc/xforms_lists.py
> @@ -102,6 +102,8 @@ class CTransforms:
>          (CMatch("__no_context_analysis"), ""),
>          (CMatch("__attribute_const__"), ""),
>          (CMatch("__attribute__"), ""),
> +        (CMatch("STATIC_IFN_KUNIT"), ""),
> +        (CMatch("INLINE_IFN_KUNIT"), ""),
>  

Applied, thanks.

jon

^ permalink raw reply

* Re: [PATCH] kdoc: xforms_lists: handle DECLARE_PER_CPU() in kernel-doc
From: Jonathan Corbet @ 2026-06-23 20:41 UTC (permalink / raw)
  To: Randy Dunlap, linux-doc
  Cc: Randy Dunlap, Shuah Khan, Mauro Carvalho Chehab, netfilter-devel
In-Reply-To: <20260614052452.1557987-1-rdunlap@infradead.org>

Randy Dunlap <rdunlap@infradead.org> writes:

> Add support for DECLARE_PER_CPU() as a var (variable) as used in
> <linux/netfilter/x_tables.h>.
>
> Warning: include/linux/netfilter/x_tables.h:345 function parameter 'seqcount_t' not described in 'DECLARE_PER_CPU'
> Warning: include/linux/netfilter/x_tables.h:345 function parameter 'xt_recseq' not described in 'DECLARE_PER_CPU'
>
> Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
> ---
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> Cc: netfilter-devel@vger.kernel.org
>
>  tools/lib/python/kdoc/xforms_lists.py |    1 +
>  1 file changed, 1 insertion(+)
>
> --- linext-2026-0610.orig/tools/lib/python/kdoc/xforms_lists.py
> +++ linext-2026-0610/tools/lib/python/kdoc/xforms_lists.py
> @@ -118,6 +118,7 @@ class CTransforms:
>          (CMatch("__guarded_by"), ""),
>          (CMatch("__pt_guarded_by"), ""),
>          (CMatch("LIST_HEAD"), r"struct list_head \1"),
> +        (CMatch("DECLARE_PER_CPU"), r"\1 \2[PER_CPU]; }"),
>  
Applied, thanks.

jon

^ permalink raw reply

* Re: [PATCH] MAINTAINERS: Fix regex for kdoc
From: Jonathan Corbet @ 2026-06-23 20:36 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Matthew Wilcox (Oracle), Mauro Carvalho Chehab, Shuah Khan,
	linux-doc
In-Reply-To: <20260615154057.2156589-1-willy@infradead.org>

"Matthew Wilcox (Oracle)" <willy@infradead.org> writes:

> The trailing '*' means "all files in this directory, but not
> subdirectories" which excluded tools/lib/python/kdoc/.  This is surely
> not intended.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0d94420eae3d..999957a3e0ca 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7653,7 +7653,7 @@ S:	Maintained
>  P:	Documentation/doc-guide/maintainer-profile.rst
>  T:	git git://git.lwn.net/linux.git docs-next
>  F:	Documentation/
> -F:	tools/lib/python/*
> +F:	tools/lib/python/
>  F:	tools/docs/

Applied, thanks.

jon

^ permalink raw reply

* Re: [PATCH] Documentation: tracing: fix typo in events documentation
From: Jonathan Corbet @ 2026-06-23 20:34 UTC (permalink / raw)
  To: Yudistira Putra, Steven Rostedt, Masami Hiramatsu
  Cc: Mathieu Desnoyers, Shuah Khan, linux-trace-kernel, linux-doc,
	linux-kernel, Yudistira Putra
In-Reply-To: <20260622143735.71778-1-pyudistira519@gmail.com>

Yudistira Putra <pyudistira519@gmail.com> writes:

> Fix a typo in the tracing events documentation: "can by built up"
> should be "can be built up".
>
> Signed-off-by: Yudistira Putra <pyudistira519@gmail.com>
> ---
>  Documentation/trace/events.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
> index 18d112963dec..581f2260614b 100644
> --- a/Documentation/trace/events.rst
> +++ b/Documentation/trace/events.rst
> @@ -1064,7 +1064,7 @@ correct command type, and a pointer to an event-specific run_command()
>  callback that will be called to actually execute the event-specific
>  command function.
>  
> -Once that's done, the command string can by built up by successive
> +Once that's done, the command string can be built up by successive
>  calls to argument-adding functions.

Applied, thanks.

jon

^ permalink raw reply

* Re: [PATCH] Docs/driver-api/uio-howto: document mmap_prepare callback
From: Jonathan Corbet @ 2026-06-23 20:33 UTC (permalink / raw)
  To: Doehyun Baek, Greg Kroah-Hartman, Shuah Khan
  Cc: Andrew Morton, Vlastimil Babka, Lorenzo Stoakes, linux-doc,
	linux-kernel, Doehyun Baek
In-Reply-To: <20260622181821.1195257-1-doehyunbaek@gmail.com>

Doehyun Baek <doehyunbaek@gmail.com> writes:

> The UIO howto still documents an mmap callback in struct uio_info.
> That field was replaced by mmap_prepare, which takes a struct
> vm_area_desc.
>
> A UIO driver following the current howto no longer builds because
> struct uio_info has no mmap member. Update the documented callback
> signature and matching text to match the current API.
>
> Fixes: 933f05f58ac6 ("uio: replace deprecated mmap hook with mmap_prepare in uio_info")
> Signed-off-by: Doehyun Baek <doehyunbaek@gmail.com>
> ---
>  Documentation/driver-api/uio-howto.rst | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/driver-api/uio-howto.rst b/Documentation/driver-api/uio-howto.rst
> index 907ffa3b38f5..c08472dfbcfe 100644
> --- a/Documentation/driver-api/uio-howto.rst
> +++ b/Documentation/driver-api/uio-howto.rst
> @@ -246,10 +246,10 @@ the members are required, others are optional.
>     hardware interrupt number. The flags given here will be used in the
>     call to :c:func:`request_irq()`.
>  
> --  ``int (*mmap)(struct uio_info *info, struct vm_area_struct *vma)``:
> +-  ``int (*mmap_prepare)(struct uio_info *info, struct vm_area_desc *desc)``:
>     Optional. If you need a special :c:func:`mmap()`
>     function, you can set it here. If this pointer is not NULL, your
> -   :c:func:`mmap()` will be called instead of the built-in one.
> +   ``mmap_prepare`` will be called instead of the built-in one.
>  
Applied, thanks.

jon

^ permalink raw reply

* Re: [PATCH] docs/mm: clarify that we are not looking for LLM generated content
From: Jonathan Corbet @ 2026-06-23 20:33 UTC (permalink / raw)
  To: David Hildenbrand (Arm), linux-doc
  Cc: Andrew Morton, Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Shuah Khan,
	Matthew Wilcox, Harry Yoo, linux-mm, linux-kernel
In-Reply-To: <1cbdb3a3-b8db-4ae2-aa66-9042c1033745@kernel.org>

"David Hildenbrand (Arm)" <david@kernel.org> writes:

>>> I assume this was not picked up yet? (via documentation or mm tree?)
>> 
>> I had figured Andrew would grab it; I can certainly do so if you'd like.
>
> yes please. I guess I'll soon start grabbing stuff myself. Stay tuned. :)

OK, done.

Let me know how you would like to handle things going forward.  I'm
happy to pick up MM docs changes or to leave them for you, whatever
works best.

Thanks,

jon

^ permalink raw reply

* Re: [PATCH] kernel-doc: xforms: support __SYSFS_FUNCTION_ALTERNATIVE()
From: Jonathan Corbet @ 2026-06-23 20:31 UTC (permalink / raw)
  To: Randy Dunlap, linux-kernel
  Cc: Randy Dunlap, Thomas Weißschuh, Rafael J. Wysocki,
	Danilo Krummrich, driver-core, Greg Kroah-Hartman, Shuah Khan,
	linux-doc, Mauro Carvalho Chehab, stable
In-Reply-To: <20260623190006.406571-1-rdunlap@infradead.org>

Randy Dunlap <rdunlap@infradead.org> writes:

> Add support for __SYSFS_FUNCTION_ALTERNATIVE() to create a union of its
> members (as though CONFIG_CFI is unset).
>
> Fixes these docs build warnings:
>
> WARNING: include/linux/device.h:117 Invalid param: __SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*show)(struct device *dev, struct device_attribute *attr, char *buf)
> WARNING: include/linux/device.h:117 struct member '__SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*show' not described in 'device_attribute'
> WARNING: include/linux/device.h:117 Invalid param: __SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*store)(struct device *dev, struct device_attribute *attr, const char *buf, size_t count)
> WARNING: include/linux/device.h:117 struct member '__SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*store' not described in 'device_attribute'
>
> Fixes: 434506b86a6c ("driver core: Allow the constification of device attributes")
> Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
> ---
> Cc: Thomas Weißschuh <linux@weissschuh.net>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: driver-core@lists.linux.dev
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: linux-doc@vger.kernel.org
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> Cc: stable@vger.kernel.org
>
>  tools/lib/python/kdoc/xforms_lists.py |    1 +
>  1 file changed, 1 insertion(+)
>
> --- linux-next-20260619.orig/tools/lib/python/kdoc/xforms_lists.py
> +++ linux-next-20260619/tools/lib/python/kdoc/xforms_lists.py
> @@ -49,6 +49,7 @@ class CTransforms:
>          (CMatch("DEFINE_DMA_UNMAP_ADDR"), r"dma_addr_t \1"),
>          (CMatch("DEFINE_DMA_UNMAP_LEN"), r"__u32 \1"),
>          (CMatch("VIRTIO_DECLARE_FEATURES"), r"union { u64 \1; u64 \1_array[VIRTIO_FEATURES_U64S]; }"),
> +        (CMatch("__SYSFS_FUNCTION_ALTERNATIVE"), r"union { \1+ }"),
>          (CMatch("__attribute__"), ""),

Applied, thanks.

jon

^ permalink raw reply

* Re: [PATCH v2] usbcore: Add quirk for 255-bytes initial config read
From: Alan Stern @ 2026-06-23 20:24 UTC (permalink / raw)
  To: Nikhil Solanke
  Cc: linux-usb, gregkh, linux-kernel, michal.pecio, stable, corbet,
	skhan, linux-doc
In-Reply-To: <20260623161035.5792-1-nikhilsolanke5@gmail.com>

This is looking better.  A few more things to mention below, but 
otherwise okay.

On Tue, Jun 23, 2026 at 09:40:35PM +0530, Nikhil Solanke wrote:
> Certain third-party USB game controllers exposing (or spoofing) an Xbox
> 360-compatible interface (VID:PID 045e:028e) fail to enumerate under Linux.
> The device disconnects from the bus without responding to the initial
> GET_DESCRIPTOR(CONFIGURATION) request, and the kernel logs 'unable to read
> config index 0 descriptor/start: -71'.
> 
> The device then falls back to a secondary Android HID mode (with a
> different VID:PID), losing XInput functionality including rumble support.
> The failure reproduces across multiple machines, host controller types, and
> kernel versions including current mainline and LTS. The device enumerates
> correctly and remains in XInput mode under Windows. Notably, the device
> enumerates correctly in Android mode when the same 9-byte request
> is issued for that mode's configuration descriptor, confirming the firmware
> bug is specific to the XInput mode.
> 
> usbmon traces from Linux and Wireshark/USBPcap traces from Windows are
> identical up to the point of failure, with no visible protocol-level
> difference explaining the divergence. The root cause was identified when
> Michal Pecio discovered via a QEMU bus-level capture that Windows does not
> use wLength=9 for the initial config descriptor request; it uses
> wLength=255. Alan Stern subsequently confirmed this with a bus
> analyzer on a different USB 2.0 device, and Michal verified the behavior
> goes back to Windows 95 OSR2.1.
> 
> So, add a new quirk flag USB_QUIRK_CONFIG_SIZE which causes
> usb_get_configuration() to issue a 255 byte sized configuration request
> instead of USB_DT_CONFIG_SIZE (9) for the initial
> GET_DESCRIPTOR(CONFIGURATION) request, mimicking long-standing Windows
> behavior.
> 
> Suggested-by: Alan Stern <stern@rowland.harvard.edu>
> Suggested-by: Michal Pecio <michal.pecio@gmail.com>
> Closes: https://lore.kernel.org/linux-usb/CAFgddh+JWdT4LLwMc5qjM8q_pBu-fRo2qADR5ovAKoGHWMQrRw@mail.gmail.com/
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Cc: stable@vger.kernel.org
> 
> Signed-off-by: Nikhil Solanke <nikhilsolanke5@gmail.com>

Normally people don't leave a blank line before their Signed-off-by: 
(unless it's the first tag to appear).  But I don't think it makes any 
difference, especially if the checkpatch.pl script doesn't object.

> ---
> Changes in v2:
> - Add Documentation
> - Naming changes
> - Refactored to have a better flow with existing code.
> 
>  .../admin-guide/kernel-parameters.txt         |  9 +++
>  drivers/usb/core/config.c                     | 61 ++++++++++++++-----
>  drivers/usb/core/hub.c                        |  6 +-
>  drivers/usb/core/quirks.c                     |  4 ++
>  include/linux/usb/quirks.h                    |  3 +
>  5 files changed, 67 insertions(+), 16 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 97007f4f69d4..af4bf0ef2c7b 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -8158,6 +8158,15 @@ Kernel parameters
>  				q = USB_QUIRK_FORCE_ONE_CONFIG (Device
>  					claims zero configurations,
>  					forcing to 1);
> +                r = USB_QUIRK_WINDOWS_CONFIG_REQ_SIZE (Device
> +                    fails during initialization when asked for
> +                    9-bytes configuration desciptor request. Ask
> +                    for 255-bytes request instead to mirror
> +                    Windows' behavior. This quirk is originally
> +                    meant to fix some quirky gamepads that refuse
> +                    to connect in their XInput mode. But it can also
> +                    potentially fix issues with other USB devices
> +                    that work on Windows but not on Linux)
>  			Example: quirks=0781:5580:bk,0a5c:5834:gij

As Randy said, use tabs instead of spaces.  And this new entry should be 
aligned with all the preceding entries, and it should end with a ';' 
like they do.

I would leave out the two sentences of explanation, but that's more of a 
personal choice.  We'll see if Greg KH objects.

>  
>  	usbhid.mousepoll=
> diff --git a/drivers/usb/core/config.c b/drivers/usb/core/config.c
> index 45e20c6d76c0..4fc3145404d6 100644
> --- a/drivers/usb/core/config.c
> +++ b/drivers/usb/core/config.c
> @@ -19,6 +19,9 @@
>  
>  #define USB_MAXCONFIG			8	/* Arbitrary limit */
>  
> +/* config req size if USB_QUIRK_WINDOWS_CONFIG_REQ_SIZE is set */
> +#define USB_CONFIG_WINDOWS_REQ_SIZE	255
> +
>  static int find_next_descriptor(unsigned char *buffer, int size,
>      int dt1, int dt2, int *num_skipped)
>  {
> @@ -912,6 +915,13 @@ int usb_get_configuration(struct usb_device *dev)
>  	unsigned char *bigbuffer;
>  	struct usb_config_descriptor *desc;
>  	int result;
> +	/*
> +	 * Devices with quirky firmware will stall or reset when asked only for
> +	 * the configuration header. This variable decides which size to use in
> +	 * that case, if the quirk for that device was set.
> +	 */
> +	size_t usb_config_req_size = (dev->quirks & USB_QUIRK_WINDOWS_CONFIG_REQ_SIZE)
> +		? USB_CONFIG_WINDOWS_REQ_SIZE : USB_DT_CONFIG_SIZE;

It's a little unusual to have a multiline comment in the middle of a 
bunch of variable definitions.  The alternative is to do the assignment 
later on and move the comment there.

>  
>  	if (ncfg > USB_MAXCONFIG) {
>  		dev_notice(ddev, "too many configurations: %d, "
> @@ -938,18 +948,27 @@ int usb_get_configuration(struct usb_device *dev)
>  	if (!dev->rawdescriptors)
>  		return -ENOMEM;
>  
> -	desc = kmalloc(USB_DT_CONFIG_SIZE, GFP_KERNEL);
> +	desc = kmalloc(usb_config_req_size, GFP_KERNEL);
> +
>  	if (!desc)
>  		return -ENOMEM;
>  
>  	for (cfgno = 0; cfgno < ncfg; cfgno++) {
> -		/* We grab just the first descriptor so we know how long
> -		 * the whole configuration is */
> +
> +		if (dev->quirks & USB_QUIRK_DELAY_INIT)
> +			msleep(200);

Moving this delay up here changes the behavior when the quirk flag isn't 
set.  While it agrees with the intention of the USB_QUIRK_DELAY_INIT 
flag, such a change should be mentioned in the patch description.

> +
> +		/*
> +		 * Grab just the first descriptor so we know how long the whole
> +		 * configuration is. In case of quirky firmware, try to grab the
> +		 * whole thing in one go by asking for a 255-bytes sized buffer
> +		 * mirroring Windows behavior.
> +		 */

This needs to be rewritten, as it is self-contradictory.  When the quirk 
flag is set we issue a 255-byte request to mimic the Windows behavior, 
and only when the flag isn't set do we grab just the first descriptor.

>  		result = usb_get_descriptor(dev, USB_DT_CONFIG, cfgno,
> -		    desc, USB_DT_CONFIG_SIZE);
> +						desc, usb_config_req_size);

Don't make extraneous changes to the existing indentation (or whitespace 
in general), here and below.

>  		if (result < 0) {
>  			dev_err(ddev, "unable to read config index %d "
> -			    "descriptor/%s: %d\n", cfgno, "start", result);
> +				"descriptor/%s: %d\n", cfgno, "start", result);

At the time this code was originally written, the policy for kernel code 
was to break lines before 80 columns.  Since then the policy has changed 
to avoid splitting long literal strings into pieces, even when that 
means exceeding 80 columns.  So your new string here should all go on 
one line.

>  			if (result != -EPIPE)
>  				goto err;
>  			dev_notice(ddev, "chopping to %d config(s)\n", cfgno);
> @@ -957,13 +976,25 @@ int usb_get_configuration(struct usb_device *dev)
>  			break;
>  		} else if (result < 4) {
>  			dev_err(ddev, "config index %d descriptor too short "
> -			    "(expected %i, got %i)\n", cfgno,
> -			    USB_DT_CONFIG_SIZE, result);
> +				"(asked for %zu, got %i, expected at least %i)\n",
> +				cfgno, usb_config_req_size, result, 4);
>  			result = -EINVAL;
>  			goto err;
>  		}
> +
>  		length = max_t(int, le16_to_cpu(desc->wTotalLength),
> -		    USB_DT_CONFIG_SIZE);
> +				USB_DT_CONFIG_SIZE);

This is another example of a change that has nothing to do with the 
purpose of the patch.

> +
> +		/*
> +		 * If the device returns the full length configuration
> +		 * descriptor, skip the second read. Otherwise, send a second

Strictly speaking, the configuration descriptor is only 9 bytes long.  
What you mean here is the entire configuration descriptor set.

> +		 * request asking for the full length.
> +		 */
> +		if (result >= le16_to_cpu(desc->wTotalLength)) {

Shouldn't this be: result >= length?  No point in repeating the 
le16_to_cpu calculation.

> +			bigbuffer = (unsigned char *) desc;
> +			desc = NULL;
> +			goto store_and_parse;
> +		}
>  
>  		/* Now that we know the length, get the whole thing */
>  		bigbuffer = kmalloc(length, GFP_KERNEL);
> @@ -972,23 +1003,25 @@ int usb_get_configuration(struct usb_device *dev)
>  			goto err;
>  		}
>  
> -		if (dev->quirks & USB_QUIRK_DELAY_INIT)
> -			msleep(200);
> -
>  		result = usb_get_descriptor(dev, USB_DT_CONFIG, cfgno,
> -		    bigbuffer, length);
> +						bigbuffer, length);
> +
>  		if (result < 0) {
>  			dev_err(ddev, "unable to read config index %d "
> -			    "descriptor/%s\n", cfgno, "all");
> +				"descriptor/%s\n", cfgno, "all");
>  			kfree(bigbuffer);
>  			goto err;
>  		}
> +

More examples of unnecessary whitespace changes.

>  		if (result < length) {
>  			dev_notice(ddev, "config index %d descriptor too short "
> -			    "(expected %i, got %i)\n", cfgno, length, result);
> +				"(asked for %i, got %i)\n",
> +				cfgno, length, result);
>  			length = result;
>  		}
>  
> +store_and_parse:
> +		krealloc(bigbuffer, length, GFP_KERNEL);
>  		dev->rawdescriptors[cfgno] = bigbuffer;
>  
>  		result = usb_parse_configuration(dev, cfgno,
> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> index 24960ba9caa9..9acd278666fc 100644
> --- a/drivers/usb/core/hub.c
> +++ b/drivers/usb/core/hub.c
> @@ -2527,8 +2527,10 @@ static int usb_enumerate_device(struct usb_device *udev)
>  		err = usb_get_configuration(udev);
>  		if (err < 0) {
>  			if (err != -ENODEV)
> -				dev_err(&udev->dev, "can't read configurations, error %d\n",
> -						err);
> +				dev_err(&udev->dev, "can't read configurations, "
> +					"for device %04x:%04x, error %d\n",

Like above, this string should all be on one line.

Alan Stern

> +					le16_to_cpu(udev->descriptor.idVendor),
> +					le16_to_cpu(udev->descriptor.idProduct), err);
>  			return err;
>  		}
>  	}
> diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
> index 87810eff974e..df670b0b66fe 100644
> --- a/drivers/usb/core/quirks.c
> +++ b/drivers/usb/core/quirks.c
> @@ -142,6 +142,10 @@ static int quirks_param_set(const char *value, const struct kernel_param *kp)
>  				break;
>  			case 'q':
>  				flags |= USB_QUIRK_FORCE_ONE_CONFIG;
> +				break;
> +			case 'r':
> +				flags |= USB_QUIRK_WINDOWS_CONFIG_REQ_SIZE;
> +				break;
>  			/* Ignore unrecognized flag characters */
>  			}
>  		}
> diff --git a/include/linux/usb/quirks.h b/include/linux/usb/quirks.h
> index b3cc7beab4a3..a4043b33c2c2 100644
> --- a/include/linux/usb/quirks.h
> +++ b/include/linux/usb/quirks.h
> @@ -81,4 +81,7 @@
>  /* Device claims zero configurations, forcing to 1 */
>  #define USB_QUIRK_FORCE_ONE_CONFIG		BIT(18)
>  
> +/* Use a 255 bytes config descriptor request mirroring windows behavior */
> +#define USB_QUIRK_WINDOWS_CONFIG_REQ_SIZE	BIT(19)
> +
>  #endif /* __LINUX_USB_QUIRKS_H */
> -- 
> 2.54.0
> 

^ permalink raw reply

* Re: [PATCH] kernel-doc: xforms: support __SYSFS_FUNCTION_ALTERNATIVE()
From: Thomas Weißschuh @ 2026-06-23 20:10 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: linux-kernel, Rafael J. Wysocki, Danilo Krummrich, driver-core,
	Greg Kroah-Hartman, Jonathan Corbet, Shuah Khan, linux-doc,
	Mauro Carvalho Chehab, stable
In-Reply-To: <20260623190006.406571-1-rdunlap@infradead.org>

On 2026-06-23 12:00:04-0700, Randy Dunlap wrote:
> Add support for __SYSFS_FUNCTION_ALTERNATIVE() to create a union of its
> members (as though CONFIG_CFI is unset).
> 
> Fixes these docs build warnings:
> 
> WARNING: include/linux/device.h:117 Invalid param: __SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*show)(struct device *dev, struct device_attribute *attr, char *buf)
> WARNING: include/linux/device.h:117 struct member '__SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*show' not described in 'device_attribute'
> WARNING: include/linux/device.h:117 Invalid param: __SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*store)(struct device *dev, struct device_attribute *attr, const char *buf, size_t count)
> WARNING: include/linux/device.h:117 struct member '__SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*store' not described in 'device_attribute'
> 
> Fixes: 434506b86a6c ("driver core: Allow the constification of device attributes")
> Signed-off-by: Randy Dunlap <rdunlap@infradead.org>

Thanks for the fix. I would have expected 0day to catch this :-(

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>

> ---
> Cc: Thomas Weißschuh <linux@weissschuh.net>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: driver-core@lists.linux.dev
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: linux-doc@vger.kernel.org
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> Cc: stable@vger.kernel.org
> 
>  tools/lib/python/kdoc/xforms_lists.py |    1 +
>  1 file changed, 1 insertion(+)
> 
> --- linux-next-20260619.orig/tools/lib/python/kdoc/xforms_lists.py
> +++ linux-next-20260619/tools/lib/python/kdoc/xforms_lists.py
> @@ -49,6 +49,7 @@ class CTransforms:
>          (CMatch("DEFINE_DMA_UNMAP_ADDR"), r"dma_addr_t \1"),
>          (CMatch("DEFINE_DMA_UNMAP_LEN"), r"__u32 \1"),
>          (CMatch("VIRTIO_DECLARE_FEATURES"), r"union { u64 \1; u64 \1_array[VIRTIO_FEATURES_U64S]; }"),
> +        (CMatch("__SYSFS_FUNCTION_ALTERNATIVE"), r"union { \1+ }"),
>          (CMatch("__attribute__"), ""),
>  
>          #

^ permalink raw reply

* Re: [RFC PATCH v2 06/10] kvm: guest_memfd: Add support for freezing and unfreezing mappings
From: tarunsahu @ 2026-06-23 20:06 UTC (permalink / raw)
  To: Pratyush Yadav
  Cc: Ackerley Tng, Jonathan Corbet, vannapurve, fvdl, Pasha Tatashin,
	Shuah Khan, sagis, aneesh.kumar, skhawaja, vipinsh,
	Pratyush Yadav, david, dmatlack, mark.rutland, Paolo Bonzini,
	Mike Rapoport, Alexander Graf, seanjc, axelrasmussen,
	linux-kselftest, kexec, linux-kernel, linux-doc, kvm, linux-mm
In-Reply-To: <2vxz8q85mdyh.fsf@kernel.org>

Pratyush Yadav <pratyush@kernel.org> writes:

> On Tue, Jun 23 2026, tarunsahu@google.com wrote:
>
>> Ackerley Tng <ackerleytng@google.com> writes:
>>
>>> Tarun Sahu <tarunsahu@google.com> writes:
>>>
>>>>  static long kvm_gmem_fallocate(struct file *file, int mode, loff_t offset,
>>>>  			       loff_t len)
>>>>  {
>>>> +	struct inode *inode = file_inode(file);
>>>>  	int ret;
>>>> +	int idx;
>>>>
>>>> -	if (!(mode & FALLOC_FL_KEEP_SIZE))
>>>> -		return -EOPNOTSUPP;
>>>> +	idx = srcu_read_lock(&kvm_gmem_freeze_srcu);
>>>> +	if (kvm_gmem_is_frozen(inode)) {
>>>> +		srcu_read_unlock(&kvm_gmem_freeze_srcu, idx);
>>>> +		return -EPERM;
>>>> +	}
>>>
>>> fallocate may eventually go to kvm_gmem_get_folio(), so that would check
>>> kvm_gmem_is_frozen() twice. Is this meant to catch the punch hole case?
>
> Yeah, I reckon you can get away with doing this check only in
> kvm_gmem_get_folio(). Normally you'd like to fail early, but as of now I
> don't see much of a problem. If you drop the check here and fail in
> kvm_gmem_get_folio() you'd end up taking and releasing the mapping
> invalidate_lock, but this isn't a fast path anyway so I don't think it
> should matter much.

No, Don't agree.
kvm_gmem_get_folios already have the is_frozen check. which blocks the
kvm_gmem_allocate. But not kvm_gmem_punch_hole. Your argument is correct
for kvm_gmem_allocate only. So is_frozen check in fallocate is to
block the punch hole as well. What ackerley said is correct.

>
> I think either way can work just as fine...
>
>>>
>>>>
>>>> -	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
>>>> -		return -EOPNOTSUPP;
>>>> +	if (!(mode & FALLOC_FL_KEEP_SIZE)) {
>>>> +		ret = -EOPNOTSUPP;
>>>> +		goto out;
>>>> +	}
>>>>
>>>> -	if (!PAGE_ALIGNED(offset) || !PAGE_ALIGNED(len))
>>>> -		return -EINVAL;
>>>> +	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) {
>>>> +		ret = -EOPNOTSUPP;
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	if (!PAGE_ALIGNED(offset) || !PAGE_ALIGNED(len)) {
>>>> +		ret = -EINVAL;
>>>> +		goto out;
>>>> +	}
>>>
>>> There's some reordering here. Why not let the validation happen like
>>> before, then check kvm_gmem_is_frozen()?
>
> There is no reordering, if I am reading the diff correctly. The diff is
> somewhat misleading. The kvm_gmem_is_frozen() call is added at the top
> of the function, and then all the later checks are in the same place but
> get a goto out (and hence a full body to the if block). So the diff
> reads like reordering, but there is none.
>

That is right. I thought Ackerley was asking why not add the check after
the all the if condition and before the kvm_gmem_punch_hole/allocate calls.

> It would be very neat if scru had a cleanup.h style scope-based locking
> function, but on a quick glance I can't see one.
>
>>
>> To align with design. "stop the fallocate call if inode is frozen, No
>> need to go further". I dont have strict opinion on this. I am fine with
>> taking it across punch hole as well to make it more fine grained. But it
>> will no longer claims stop the fallocate call (allocation one is stopped
>> in separate path: fault path) , though functionally it does the same
>> thing.
>>
>> WDYT?
>>
>> ~Tarun
>
> -- 
> Regards,
> Pratyush Yadav

^ permalink raw reply

* Re: [RFC PATCH v2 10/10] selftests: kvm: Add guest_memfd_preservation_test
From: tarunsahu @ 2026-06-23 19:50 UTC (permalink / raw)
  To: Ackerley Tng, Jonathan Corbet, vannapurve, fvdl, Pasha Tatashin,
	Shuah Khan, sagis, aneesh.kumar, skhawaja, vipinsh,
	Pratyush Yadav, david, dmatlack, mark.rutland, Paolo Bonzini,
	Mike Rapoport, Alexander Graf, seanjc, axelrasmussen
  Cc: linux-kselftest, kexec, linux-kernel, linux-doc, kvm, linux-mm
In-Reply-To: <CAEvNRgE2GZNiDg_g6SP_H9CDsDDAnpN7KTRWJEK18wxpTZFJZw@mail.gmail.com>

Ackerley Tng <ackerleytng@google.com> writes:

> Tarun Sahu <tarunsahu@google.com> writes:
>
>> Add a new KVM selftest `guest_memfd_preservation_test` to verify that
>> guest memory backed by guest_memfd is preserved properly.
>>
>
> Don't think using backticks in commit messages is a common practice but
> I might be wrong here.
>
>> The test leverages the Live Update Orchestrator (LUO) infrastructure
>> to validate that memory folios and configuration layouts are
>> successfully saved and then restored during kernel live updates,
>> preventing any memory loss for the guest.
>>
>> Here, I have used the kvm selftests framework by creating a new
>> vm and mapping two memory slots to it. One is the code that is executed
>> inside the vm and other is the guest_memfd whose memory is being
>> written by the guest code.
>>
>
> Don't think commit messages with "I" are common either


Will take care of commit message everywhere as per the guidelines,
Sorry about that.
>
>> In Phase 1: Once data is written the vm exits and wait for the user
>> to trigger the kexec.
>>
>> In Phase 2: A new vm is created with retrieved kvm and again two
>> memory slots are assigned. Once for guest code, and another is for
>> retrieved guest_memfd where guest_memfd memory is verified by the
>> executed guest code. If verification succeeds, The test passes.
>>
>>
>> [...snip...]
>>
>> +#define SESSION_NAME "gmem_vm_preservation_session"
>> +#define VM_TOKEN 0x1001
>> +#define GMEM_TOKEN 0x1002
>> +
>> +#define GMEM_SIZE (16ULL * 1024 * 1024)
>> +#define DATA_SIZE (5ULL * 1024 * 1024)
>> +
>> +static size_t page_size;
>> +
>> +/* Deterministic byte pattern generation based on offset */
>> +static inline uint8_t get_pattern_byte(size_t offset)
>> +{
>> +	return (uint8_t)(offset ^ 0x5A);
>> +}
>> +
>> +static void guest_code_phase1(uint64_t gpa, uint64_t size, uint64_t data_size)
>> +{
>> +	uint8_t *mem = (uint8_t *)gpa;
>> +	size_t i;
>> +
>> +	for (i = 0; i < data_size; i++)
>> +		mem[i] = get_pattern_byte(i);
>> +
>> +	GUEST_DONE();
>> +}
>> +
>> +static void guest_code_phase2(uint64_t gpa, uint64_t size, uint64_t data_size)
>> +{
>> +	uint8_t *mem = (uint8_t *)gpa;
>> +	size_t i;
>> +
>> +	for (i = 0; i < data_size; i++) {
>> +		uint8_t val = get_pattern_byte(i);
>> +
>> +		__GUEST_ASSERT(mem[i] == val,
>> +			       "Data mismatch at offset %lu! Expected 0x%x, got 0x%x",
>> +			       i, val, mem[i]);
>> +	}
>> +
>> +	GUEST_DONE();
>> +}
>> +
>> +static void do_phase1(void)
>> +{
>> +	uint64_t flags = GUEST_MEMFD_FLAG_MMAP | GUEST_MEMFD_FLAG_INIT_SHARED;
>
> Is there a reason to set GUEST_MEMFD_FLAG_MMAP? We're not really
> accessing that memory from the host in this test.

Right, We can skip it.

>
>> +	int gmem_fd, dev_luo_fd, session_fd, ret;
>> +	const uint64_t gpa = SZ_4G;
>> +	struct kvm_vcpu *vcpu;
>> +	const int slot = 1;
>> +	struct kvm_vm *vm;
>> +
>> +	vm = __vm_create_shape_with_one_vcpu(VM_SHAPE_DEFAULT, &vcpu, 1,
>> +					guest_code_phase1);
>> +	gmem_fd = vm_create_guest_memfd(vm, GMEM_SIZE, flags);
>> +	vm_set_user_memory_region2(vm, slot, KVM_MEM_GUEST_MEMFD, gpa, GMEM_SIZE, NULL,
>> +				 gmem_fd, 0);
>> +
>> +	for (size_t i = 0; i < GMEM_SIZE; i += page_size)
>> +		virt_pg_map(vm, gpa + i, gpa + i);
>> +
>> +	vcpu_args_set(vcpu, 3, gpa, GMEM_SIZE, DATA_SIZE);
>
> If GMEM_SIZE and DATA_SIZE are static I think we don't have to set those
> as vcpu_args_set(), they can be used as macros from within the guest.

Yes, There are multiple places we can skip it, Like passing them as the
argument in the guest_code_phase1/2. Will update it.

>
>> +
>> +	vcpu_run(vcpu);
>> +	TEST_ASSERT_EQ(get_ucall(vcpu, NULL), UCALL_DONE);
>> +
>> +	dev_luo_fd = luo_open_device();
>> +	TEST_ASSERT(dev_luo_fd >= 0, "Failed to open /dev/liveupdate");
>> +
>> +	session_fd = luo_create_session(dev_luo_fd, SESSION_NAME);
>> +	TEST_ASSERT(session_fd >= 0, "Failed to create LUO session");
>> +
>> +	ret = luo_session_preserve_fd(session_fd, vm->fd, VM_TOKEN);
>> +	TEST_ASSERT(ret == 0, "Failed to preserve VM file descriptor");
>> +
>> +	ret = luo_session_preserve_fd(session_fd, gmem_fd, GMEM_TOKEN);
>> +	TEST_ASSERT(ret == 0, "Failed to preserve guest_memfd file descriptor");
>> +
>
> Thanks for showing how this works :)

Glad to know. it helped.
. .
 v


>
>> +	printf("\n============================================================\n");
>> +	printf("Phase 1 Complete Successfully!\n");
>> +	printf("VM file and guest_memfd file have been preserved via LUO.\n");
>> +	printf("Tokens: VM_TOKEN=0x%x, GMEM_TOKEN=0x%x\n", VM_TOKEN, GMEM_TOKEN);
>> +	printf("Machine Size: %llu MB, Data Size: %llu MB\n", GMEM_SIZE / SZ_1M,
>> +				 DATA_SIZE / SZ_1M);
>> +	printf("------------------------------------------------------------\n");
>> +
>> +	daemonize_and_wait();
>> +}
>> +
>> +static struct kvm_vm *vm_create_from_fd(int resurrected_vm_fd,
>> +					struct vm_shape shape)
>> +{
>> +	struct kvm_vm *vm;
>> +
>> +	vm = calloc(1, sizeof(*vm));
>> +	TEST_ASSERT(vm != NULL, "Insufficient Memory");
>> +
>> +	vm_init_fields(vm, shape);
>
> What would happen if the shape was changed between preserving and
> restoring?
Shape must be consistent across kexec. It may break. struct vm_shape
includes two fields
1. mode: 4k, 16k or 64k mapping, how many page table bit etc.
         which are used to setup mapping with guest code, and memory.
         guest_memfd does not support mapping other than PAGE_SIZE.
2. type: This is userspace side of the vm type. And it will not have
         information about the preserved vm_type via vm_file (kernel).
         If userspace changes this on stage2, some action might not
         work. for example, if userspace is expecting vm_type to be
         COCO VM and preserved vm_type is shared, this will have
         conflict when userspace will try to perform some operation that
         only works with COCO VM.


My Question is: Why should someone change the shape, unless they are
                planning to make the vm fail?

>
>> +
>> +	vm->kvm_fd = open_path_or_exit(KVM_DEV_PATH, O_RDWR);
>> +	vm->fd = resurrected_vm_fd;
>> +
>> +	if (kvm_has_cap(KVM_CAP_BINARY_STATS_FD))
>> +		vm->stats.fd = vm_get_stats_fd(vm);
>> +	else
>> +		vm->stats.fd = -1;
>> +
>> +	vm_init_memory_properties(vm);
>> +
>> +	return vm;
>> +}
>> +
>
> I think vm_create_from_fd() could be introduced in an earlier patch to
> reduce the amount of new code in this patch. Also, I think it could
> perhaps be moved to kvm_util.c assuming that other test will use it too.
>
Makes sense, Will take of it next revision v4.


>> +static void do_phase2(void)
>> +{
>> +	int retrieved_vm_fd, retrieved_gmem_fd, dev_luo_fd, session_fd;
>> +	struct vm_shape shape = VM_SHAPE_DEFAULT;
>> +	const uint64_t gpa = SZ_4G;
>> +	struct kvm_vcpu *vcpu;
>> +	const int slot = 1;
>> +	struct kvm_vm *vm;
>> +
>> +	dev_luo_fd = luo_open_device();
>> +	TEST_ASSERT(dev_luo_fd >= 0, "Failed to open /dev/liveupdate");
>> +
>> +	session_fd = luo_retrieve_session(dev_luo_fd, SESSION_NAME);
>> +	TEST_ASSERT(session_fd >= 0, "Failed to retrieve LUO session");
>> +
>> +	retrieved_vm_fd = luo_session_retrieve_fd(session_fd, VM_TOKEN);
>> +	TEST_ASSERT(retrieved_vm_fd >= 0, "Failed to retrieve VM file descriptor");
>> +
>> +	retrieved_gmem_fd = luo_session_retrieve_fd(session_fd, GMEM_TOKEN);
>> +	TEST_ASSERT(retrieved_gmem_fd >= 0, "Failed to retrieve guest_memfd file descriptor");
>> +
>> +	vm = vm_create_from_fd(retrieved_vm_fd, shape);
>> +
>> +	u64 nr_pages = 2048; /* 8MB is plenty for slot0 pages */
>> +
>
> I don't think declarations are usually mixed with regular code.
Okay, will udpate that.

>
>> +	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, 0, 0, nr_pages, 0);
>> +	kvm_vm_elf_load(vm, program_invocation_name);
>> +
>> +	for (int i = 0; i < NR_MEM_REGIONS; i++)
>> +		vm->memslots[i] = 0;
>> +
>> +	struct userspace_mem_region *slot0 = memslot2region(vm, 0);
>> +
>> +	ucall_init(vm, slot0->region.guest_phys_addr + slot0->region.memory_size);
>> +
>> +	vm_set_user_memory_region2(vm, slot, KVM_MEM_GUEST_MEMFD, gpa, GMEM_SIZE, NULL,
>> +				   retrieved_gmem_fd, 0);
>> +
>> +	for (size_t i = 0; i < GMEM_SIZE; i += page_size)
>> +		virt_pg_map(vm, gpa + i, gpa + i);
>> +
>> +	vcpu = vm_vcpu_add(vm, 0, guest_code_phase2);
>> +	kvm_arch_vm_finalize_vcpus(vm);
>> +
>> +	vcpu_args_set(vcpu, 3, gpa, GMEM_SIZE, DATA_SIZE);
>> +
>> +	printf("Resuming / Running VM in Phase 2...\n");
>> +	vcpu_run(vcpu);
>> +	TEST_ASSERT_EQ(get_ucall(vcpu, NULL), UCALL_DONE);
>> +
>> +	printf("\nSUCCESS: Phase 2 Complete! All 5MB complex data verified intact!\n");
>> +
>> +	luo_session_finish(session_fd);
>> +	close(session_fd);
>> +	close(dev_luo_fd);
>> +	/* This will also close the vm_fd */
>> +	kvm_vm_free(vm);
>> +	close(retrieved_gmem_fd);
>> +}
>> +
>> +int main(int argc, char *argv[])
>> +{
>> +	bool phase2 = false;
>> +
>> +	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
>> +	page_size = getpagesize();
>> +
>> +	for (int i = 1; i < argc; i++) {
>> +		if (strcmp(argv[i], "--phase2") == 0)
>> +			phase2 = true;
>> +	}
>> +
>
> Maybe use getopt() here?

In V3, it is update to use liveupdate library.
>
>> +	if (phase2)
>> +		do_phase2();
>> +	else
>> +		do_phase1();
>> +
>> +	return 0;
>> +}
>> --
>> 2.54.0.1032.g2f8565e1d1-goog
>
> I think we also need tests for trying to allocate while frozen, and
> conversion while frozen, and trying to preserve while preservation is
> not allowed.

Yes, We need those tests. For this series, I wanted to focus on design.
Now that we are aligned, next revision, I will send with more tests.

~Tarun

^ permalink raw reply

* Re: [PATCH] crypto: af_alg - Add af_alg_restrict sysctl, defaulting to 1
From: Eric Biggers @ 2026-06-23 19:49 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-crypto, Herbert Xu, linux-kernel, linux-doc,
	linux-bluetooth, iwd, linux-hardening, Milan Broz,
	Demi Marie Obenour, Andy Lutomirski
In-Reply-To: <202606231216.14A774833@keescook>

On Tue, Jun 23, 2026 at 12:24:28PM -0700, Kees Cook wrote:
> On Mon, Jun 22, 2026 at 04:48:03PM -0700, Eric Biggers wrote:
> > AF_ALG is a frequent source of vulnerabilities and a maintenance
> > nightmare.  It exposes far more functionality to userspace than ever
> > should have been exposed, especially to unprivileged processes.  Recent
> > exploits have targeted kernel internal implementation details like
> > "authencesn" that have zero use case for userspace access.
> 
> I absolutely want to see this attack surface reduction.
> 
> > Add a sysctl /proc/sys/crypto/af_alg_restrict with meaning:
> > [...]
> > Note that the list may be tweaked in the future.  However, the common
> > use cases such as iwd and bluez are taken into account already.  I've
> > tested that iwd still works with the default value of 1.
> 
> I wince at this bit, though. This is a "security policy in the kernel"
> which we try to avoid, and it's could be done already in userspace with
> modprobe blacklist.
> 
> But, as you say, AF_ALG is deprecated. I understand that to mean that
> the alg list is only ever going to *shrink* in the future.
> 
> Using a sysctl means monolithic kernels are protected, but wouldn't
> those systems just compile AF_ALG out?
> 
> So, I guess, I would want a more clear rationale for why we do it this
> way instead of via modprobe blacklist. I see a few reasons, but they
> don't really convince me that we should ignore the "no security policy
> in the kernel" rule to do it this way.

As we saw when distros tried to mitigate copy.fail, a lot of distros
have CONFIG_CRYPTO_USER_API_* set to 'y', so algif_aead.ko couldn't be
blacklisted.  (Ironically because of FIPS 140, which is yet another
example of how FIPS 140 harms real-world security.)

But even when 'm', the module blacklist is just a binary choice for each
algorithm type: aead, skcipher, hash, and rng.  Loading algif_aead.ko
allows not just "ccm(aes)" that bluez needs, but also bizarre things
like "authencesn(hmac(sha256),cbc(aes))" that are used only in exploits.

And sure, userspace could theoretically gather the complete list of
algorithm modules (e.g. authencesn.ko) and blacklist them individually.
But no one does that, and many are built-in anyway -- and this time not
just because of FIPS.

So we need an allowlist at the algorithm level, not just the algorithm
type level.  Putting the allowlist in the kernel, taking into account
the real use cases like iwd and bluez, and having a simple tristate
sysctl similar to some of the existing ones, is the simplest and most
practical way to achieve this by default across Linux distros.

If we did something like delegate the algorithm allowlist to LSMs, I
think that in practice it's just going to be almost never used.

- Eric

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox