* Re: [PATCH v2] Documentation: Refactored watchdog old doc
From: Guenter Roeck @ 2026-04-11 19:07 UTC
To: Randy Dunlap, Sunny Patel, Jonathan Corbet
Cc: Wim Van Sebroeck, Shuah Khan, linux-watchdog, linux-doc,
linux-kernel
In-Reply-To: <303dcd9e-ca40-48b7-851e-6cd283cb96ad@infradead.org>
On 4/11/26 10:22, Randy Dunlap wrote:
>
>
> On 4/11/26 8:09 AM, Sunny Patel wrote:
>> Mark WDIOC_GETTEMP and WDIOS_TEMPPANIC as deprecated since
>> neither is implemented by the watchdog core and both are only
>> present in a small number of legacy drivers.
>>
>> Add documentation for previously undocumented status bits
>> WDIOF_MAGICCLOSE and WDIOF_ALARMONLY in the options field.
>>
>> Add documentation for WDIOF_PRETIMEOUT and WDIOF_SETTIMEOUT
>> status bits describing their respective ioctls.
>>
>> Fix the following issues in existing documentation:
>> - Remove version-specific reference to Linux 2.4.18 from
>> the GETTIMEOUT ioctl description
>> - Fix duplicate "was is" in printf format strings
>> - Replace [FIXME] placeholder with proper descriptions for
>> WDIOS_DISABLECARD, WDIOS_ENABLECARD and WDIOS_TEMPPANIC
>>
>> Signed-off-by: Sunny Patel <nueralspacetech@gmail.com>
>> ---
>>
>> Changes in v2:
>> - Fixed typos: "tiemout" -> "timeout", "characted" -> "character"
>> - Fixed "small number if legacy" -> "of legacy"
>> - Fixed capitalization: "New Drivers" -> "New drivers", "USE" -> "Use"
>> - Fixed spacing: "WDIOS_DISABLECARD,this" -> "WDIOS_DISABLECARD, this"
>> - Fixed double spaces in two places
>> - Added missing newline at end of file
>> - Rewrote commit message
>
> However, you failed to fix a malformed table warning that I reported here:
> https://lore.kernel.org/linux-doc/9e3403a0-4ec2-4fbe-a50f-53f939c1d841@infradead.org/
>
On top of that, it should have been v3, not v2.
Guenter
> Documentation/watchdog/watchdog-api.rst:250: ERROR: Malformed table.
> Text in column margin in table line 2.
>
> ================ ================================
> WDIOF_ALARMONLY Not a reboot watchdog
> ================ ================================
>
>
> So I repeat, please test your patches.
>
>>
>> Documentation/watchdog/watchdog-api.rst | 59 +++++++++++++++++++++----
>> 1 file changed, 51 insertions(+), 8 deletions(-)
* Re: (sashiko review) [PATCH] Docs/mm/damon/maintainer-profile: add AI review usage guideline
From: SeongJae Park @ 2026-04-11 18:51 UTC
To: SeongJae Park
Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand,
Jonathan Corbet, Lorenzo Stoakes, Michal Hocko, Mike Rapoport,
Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, damon, linux-doc,
linux-kernel, linux-mm
In-Reply-To: <20260411184802.81804-1-sj@kernel.org>
On Sat, 11 Apr 2026 11:48:01 -0700 SeongJae Park <sj@kernel.org> wrote:
> # sashiko review suggestions
> #
> # 1. Consider reducing recipients. Maybe the author,
> # maintainers, reviewers, and mailing list of the
> # direct subsystem and parent subsystem mailing lists
> # could be a starting point.
> # 2. Add short summary of your opinion at the beginning.
> # For example:
> #
> # TL;DR: sashiko found an issue. I will respin.
> # TL;DR: sashiko found no issue on this patch.
> #
> # Please DON'T FORGET removing this comment block before
> # sending this!
I forgot to follow my own guideline above. Sorry for the noise.
TL;DR: sashiko found an issue that is better to be fixed. I will respin.
Thanks,
SJ
[...]
* Re: (sashiko review) [PATCH] Docs/mm/damon/maintainer-profile: add AI review usage guideline
From: SeongJae Park @ 2026-04-11 18:48 UTC
To: SeongJae Park
Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand,
Jonathan Corbet, Lorenzo Stoakes, Michal Hocko, Mike Rapoport,
Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, damon, linux-doc,
linux-kernel, linux-mm
In-Reply-To: <20260411174906.79491-1-sj@kernel.org>
# sashiko review suggestions
#
# 1. Consider reducing recipients. Maybe the author,
# maintainers, reviewers, and mailing list of the
# direct subsystem and parent subsystem mailing lists
# could be a starting point.
# 2. Add short summary of your opinion at the beginning.
# For example:
#
# TL;DR: sashiko found an issue. I will respin.
# TL;DR: sashiko found no issue on this patch.
#
# Please DON'T FORGET removing this comment block before
# sending this!
Forwarding full sashiko review in a reply format with my
inline comments below, for sharing details of my view and
doing followup discussions via mails if needed.
> # review url: https://sashiko.dev/#/patchset/20260411174906.79491-1-sj@kernel.org
> # start of sashiko.dev inline review
> commit eaf13e650de39668ddb68467088d425fa3e8be04
> Author: SeongJae Park <sj@kernel.org>
>
> Docs/mm/damon/maintainer-profile: add AI review usage guideline
>
> This commit clarifies how the DAMON maintainer profile uses sashiko.dev for
> AI patch scanning and email delivery. It adds a new AI Review section
> explaining how patch authors can check reviews and share their opinions.
>
> > diff --git a/Documentation/mm/damon/maintainer-profile.rst b/Documentation/mm/damon/maintainer-profile.rst
> > index bcb9798a27a86..5c23b1281c50c 100644
> > --- a/Documentation/mm/damon/maintainer-profile.rst
> > +++ b/Documentation/mm/damon/maintainer-profile.rst
> > @@ -100,3 +100,24 @@ There is also a public Google `calendar
> > <https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_
> > that has the events. Anyone can subscribe to it. DAMON maintainer will also
> > provide periodic reminders to the mailing list (damon@lists.linux.dev).
> > +
> > +AI Review
> > +---------
> > +
> > +For patches that are publicly posted to DAMON mailing list
> > +(damon@lists.linux.dev), AI reviews of the patches will be available at
> > +sashiko.dev. The reviews could also be sent as mails to the author of the
> > +patch.
> > +
> > +Patch authors are encouraged to check the AI reviews and share their opinions.
> > +The sharing could be done as a reply to the mail thread. Consider reducing the
> > +recipients list for such sharing, since some people are not really interested
> > +in AI reviews. As a rule of thumb, drop individuals except DAMON maintainer
> > +and stable@vger.kernel.org from the recipients list.
>
> Is stable@vger.kernel.org the correct mailing list to keep here?
>
> The stable mailing list is exclusively for stable kernel backports. Sending
> ongoing patch AI review discussions to that list could result in spamming it.
> Should this perhaps be damon@lists.linux.dev instead?
I meant to say: drop individuals except the DAMON maintainer, and also
drop stable@. I will reword it as below:
'''
--- a/Documentation/mm/damon/maintainer-profile.rst
+++ b/Documentation/mm/damon/maintainer-profile.rst
@@ -112,8 +112,8 @@ patch.
Patch authors are encouraged to check the AI reviews and share their opinions.
The sharing could be done as a reply to the mail thread. Consider reducing the
recipients list for such sharing, since some people are not really interested
-in AI reviews. As a rule of thumb, drop individuals except DAMON maintainer
-and stable@vger.kernel.org from the recipients list.
+in AI reviews. As a rule of thumb, drop stable@vger.kernel.org and individuals
+except DAMON maintainer.
`hkml` also provides a `feature
<https://github.com/sjp38/hackermail/blob/master/USAGE.md#forwarding-sashikodev-statuscomments-to-mailing-list>`_
'''
>
> > +
> > +`hkml` also provides a `feature
> > +<https://github.com/sjp38/hackermail/blob/master/USAGE.md#forwarding-sashikodev-statuscomments-to-mailing-list>`_
> > +for such sharing. Please feel free to use the feature.
> > +
> > +It is only a non-mandastory recommendation. DAMON maintainer could also ask
>
> This isn't a bug, but there's a typo (non-mandastory) here. It should be
> non-mandatory.
Good eye. Randy also found this [1] and I already promised to fix it. Sashiko
is better than Google Docs [2]!
>
> > +any question about the AI reviews, though.
>
>
> # end of sashiko.dev inline review
> # review url: https://sashiko.dev/#/patchset/20260411174906.79491-1-sj@kernel.org
[1] https://lore.kernel.org/20aa1471-5257-4913-ae46-7b40ca0aa128@infradead.org
[2] https://lore.kernel.org/20260411183029.81030-1-sj@kernel.org
Thanks,
SJ
# hkml [1] generated a draft of this mail. You can regenerate
# this using below command:
#
# hkml patch sashiko_dev --for_forwarding \
# 20260411174906.79491-1-sj@kernel.org
#
# [1] https://github.com/sjp38/hackermail
* Re: [PATCH] Docs/mm/damon/maintainer-profile: add AI review usage guideline
From: SeongJae Park @ 2026-04-11 18:30 UTC
To: Randy Dunlap
Cc: SeongJae Park, Andrew Morton, Liam R. Howlett, David Hildenbrand,
Jonathan Corbet, Lorenzo Stoakes, Michal Hocko, Mike Rapoport,
Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, damon, linux-doc,
linux-kernel, linux-mm
In-Reply-To: <20aa1471-5257-4913-ae46-7b40ca0aa128@infradead.org>
On Sat, 11 Apr 2026 11:27:14 -0700 Randy Dunlap <rdunlap@infradead.org> wrote:
>
>
> On 4/11/26 10:49 AM, SeongJae Park wrote:
> > DAMON is opted-in for DAMON patches scanning [1] and email delivery [2].
> > Clarify how that could be used on DAMON maintainer profile.
> >
> > [1] https://github.com/sashiko-dev/sashiko/commit/ad9f4a98f958
> > [2] https://github.com/sashiko-dev/sashiko/commit/b554c7b6e733
> >
> > Signed-off-by: SeongJae Park <sj@kernel.org>
> > ---
> > Documentation/mm/damon/maintainer-profile.rst | 21 +++++++++++++++++++
> > 1 file changed, 21 insertions(+)
> >
> > diff --git a/Documentation/mm/damon/maintainer-profile.rst b/Documentation/mm/damon/maintainer-profile.rst
> > index bcb9798a27a86..5c23b1281c50c 100644
> > --- a/Documentation/mm/damon/maintainer-profile.rst
> > +++ b/Documentation/mm/damon/maintainer-profile.rst
> > @@ -100,3 +100,24 @@ There is also a public Google `calendar
> > <https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_
> > that has the events. Anyone can subscribe to it. DAMON maintainer will also
> > provide periodic reminders to the mailing list (damon@lists.linux.dev).
> > +
> > +AI Review
> > +---------
> > +
> > +For patches that are publicly posted to DAMON mailing list
> > +(damon@lists.linux.dev), AI reviews of the patches will be available at
> > +sashiko.dev. The reviews could also be sent as mails to the author of the
> > +patch.
> > +
> > +Patch authors are encouraged to check the AI reviews and share their opinions.
> > +The sharing could be done as a reply to the mail thread. Consider reducing the
> > +recipients list for such sharing, since some people are not really interested
> > +in AI reviews. As a rule of thumb, drop individuals except DAMON maintainer
> > +and stable@vger.kernel.org from the recipients list.
> > +
> > +`hkml` also provides a `feature
> > +<https://github.com/sjp38/hackermail/blob/master/USAGE.md#forwarding-sashikodev-statuscomments-to-mailing-list>`_
> > +for such sharing. Please feel free to use the feature.
> > +
> > +It is only a non-mandastory recommendation. DAMON maintainer could also ask
>
> non-mandatory
Good eyes! Thank you, Randy. I use Google Docs for typo checks, but it seems
the use of "-" kept Google Docs from complaining about this.
> or maybe
> This is an optional recommendation.
Sounds better. I will respin with this, unless Andrew picks up this patch with
the change.
Thanks,
SJ
[...]
* Re: [PATCH] Docs/mm/damon/maintainer-profile: add AI review usage guideline
From: Randy Dunlap @ 2026-04-11 18:27 UTC
To: SeongJae Park, Andrew Morton
Cc: Liam R. Howlett, David Hildenbrand, Jonathan Corbet,
Lorenzo Stoakes, Michal Hocko, Mike Rapoport, Shuah Khan,
Suren Baghdasaryan, Vlastimil Babka, damon, linux-doc,
linux-kernel, linux-mm
In-Reply-To: <20260411174906.79491-1-sj@kernel.org>
On 4/11/26 10:49 AM, SeongJae Park wrote:
> DAMON is opted-in for DAMON patches scanning [1] and email delivery [2].
> Clarify how that could be used on DAMON maintainer profile.
>
> [1] https://github.com/sashiko-dev/sashiko/commit/ad9f4a98f958
> [2] https://github.com/sashiko-dev/sashiko/commit/b554c7b6e733
>
> Signed-off-by: SeongJae Park <sj@kernel.org>
> ---
> Documentation/mm/damon/maintainer-profile.rst | 21 +++++++++++++++++++
> 1 file changed, 21 insertions(+)
>
> diff --git a/Documentation/mm/damon/maintainer-profile.rst b/Documentation/mm/damon/maintainer-profile.rst
> index bcb9798a27a86..5c23b1281c50c 100644
> --- a/Documentation/mm/damon/maintainer-profile.rst
> +++ b/Documentation/mm/damon/maintainer-profile.rst
> @@ -100,3 +100,24 @@ There is also a public Google `calendar
> <https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_
> that has the events. Anyone can subscribe to it. DAMON maintainer will also
> provide periodic reminders to the mailing list (damon@lists.linux.dev).
> +
> +AI Review
> +---------
> +
> +For patches that are publicly posted to DAMON mailing list
> +(damon@lists.linux.dev), AI reviews of the patches will be available at
> +sashiko.dev. The reviews could also be sent as mails to the author of the
> +patch.
> +
> +Patch authors are encouraged to check the AI reviews and share their opinions.
> +The sharing could be done as a reply to the mail thread. Consider reducing the
> +recipients list for such sharing, since some people are not really interested
> +in AI reviews. As a rule of thumb, drop individuals except DAMON maintainer
> +and stable@vger.kernel.org from the recipients list.
> +
> +`hkml` also provides a `feature
> +<https://github.com/sjp38/hackermail/blob/master/USAGE.md#forwarding-sashikodev-statuscomments-to-mailing-list>`_
> +for such sharing. Please feel free to use the feature.
> +
> +It is only a non-mandastory recommendation. DAMON maintainer could also ask
non-mandatory
or maybe
This is an optional recommendation.
> +any question about the AI reviews, though.
>
> base-commit: aeaae01df7d17b5742e22b65b06f666ddea76816
--
~Randy
* [PATCH] Docs/mm/damon/maintainer-profile: add AI review usage guideline
From: SeongJae Park @ 2026-04-11 17:49 UTC
To: Andrew Morton
Cc: SeongJae Park, Liam R. Howlett, David Hildenbrand,
Jonathan Corbet, Lorenzo Stoakes, Michal Hocko, Mike Rapoport,
Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, damon, linux-doc,
linux-kernel, linux-mm
DAMON is opted-in for DAMON patches scanning [1] and email delivery [2].
Clarify how that could be used on DAMON maintainer profile.
[1] https://github.com/sashiko-dev/sashiko/commit/ad9f4a98f958
[2] https://github.com/sashiko-dev/sashiko/commit/b554c7b6e733
Signed-off-by: SeongJae Park <sj@kernel.org>
---
Documentation/mm/damon/maintainer-profile.rst | 21 +++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/Documentation/mm/damon/maintainer-profile.rst b/Documentation/mm/damon/maintainer-profile.rst
index bcb9798a27a86..5c23b1281c50c 100644
--- a/Documentation/mm/damon/maintainer-profile.rst
+++ b/Documentation/mm/damon/maintainer-profile.rst
@@ -100,3 +100,24 @@ There is also a public Google `calendar
<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_
that has the events. Anyone can subscribe to it. DAMON maintainer will also
provide periodic reminders to the mailing list (damon@lists.linux.dev).
+
+AI Review
+---------
+
+For patches that are publicly posted to DAMON mailing list
+(damon@lists.linux.dev), AI reviews of the patches will be available at
+sashiko.dev. The reviews could also be sent as mails to the author of the
+patch.
+
+Patch authors are encouraged to check the AI reviews and share their opinions.
+The sharing could be done as a reply to the mail thread. Consider reducing the
+recipients list for such sharing, since some people are not really interested
+in AI reviews. As a rule of thumb, drop individuals except DAMON maintainer
+and stable@vger.kernel.org from the recipients list.
+
+`hkml` also provides a `feature
+<https://github.com/sjp38/hackermail/blob/master/USAGE.md#forwarding-sashikodev-statuscomments-to-mailing-list>`_
+for such sharing. Please feel free to use the feature.
+
+It is only a non-mandastory recommendation. DAMON maintainer could also ask
+any question about the AI reviews, though.
base-commit: aeaae01df7d17b5742e22b65b06f666ddea76816
--
2.47.3
* Re: [PATCH v2] Documentation: Refactored watchdog old doc
From: Randy Dunlap @ 2026-04-11 17:22 UTC
To: Sunny Patel, Jonathan Corbet
Cc: Wim Van Sebroeck, Guenter Roeck, Shuah Khan, linux-watchdog,
linux-doc, linux-kernel
In-Reply-To: <20260411150922.20536-1-nueralspacetech@gmail.com>
On 4/11/26 8:09 AM, Sunny Patel wrote:
> Mark WDIOC_GETTEMP and WDIOS_TEMPPANIC as deprecated since
> neither is implemented by the watchdog core and both are only
> present in a small number of legacy drivers.
>
> Add documentation for previously undocumented status bits
> WDIOF_MAGICCLOSE and WDIOF_ALARMONLY in the options field.
>
> Add documentation for WDIOF_PRETIMEOUT and WDIOF_SETTIMEOUT
> status bits describing their respective ioctls.
>
> Fix the following issues in existing documentation:
> - Remove version-specific reference to Linux 2.4.18 from
> the GETTIMEOUT ioctl description
> - Fix duplicate "was is" in printf format strings
> - Replace [FIXME] placeholder with proper descriptions for
> WDIOS_DISABLECARD, WDIOS_ENABLECARD and WDIOS_TEMPPANIC
>
> Signed-off-by: Sunny Patel <nueralspacetech@gmail.com>
> ---
>
> Changes in v2:
> - Fixed typos: "tiemout" -> "timeout", "characted" -> "character"
> - Fixed "small number if legacy" -> "of legacy"
> - Fixed capitalization: "New Drivers" -> "New drivers", "USE" -> "Use"
> - Fixed spacing: "WDIOS_DISABLECARD,this" -> "WDIOS_DISABLECARD, this"
> - Fixed double spaces in two places
> - Added missing newline at end of file
> - Rewrote commit message
However, you failed to fix a malformed table warning that I reported here:
https://lore.kernel.org/linux-doc/9e3403a0-4ec2-4fbe-a50f-53f939c1d841@infradead.org/
Documentation/watchdog/watchdog-api.rst:250: ERROR: Malformed table.
Text in column margin in table line 2.
================ ================================
WDIOF_ALARMONLY Not a reboot watchdog
================ ================================
So I repeat, please test your patches.
>
> Documentation/watchdog/watchdog-api.rst | 59 +++++++++++++++++++++----
> 1 file changed, 51 insertions(+), 8 deletions(-)
--
~Randy
* [RFC PATCH v5.1 06/11] Docs/admin-guide/mm/damon/usage: document fail_charge_{num,denom} files
From: SeongJae Park @ 2026-04-11 16:48 UTC
Cc: SeongJae Park, Liam R. Howlett, Andrew Morton, David Hildenbrand,
Jonathan Corbet, Lorenzo Stoakes, Michal Hocko, Mike Rapoport,
Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, damon, linux-doc,
linux-kernel, linux-mm
In-Reply-To: <20260411164908.77189-1-sj@kernel.org>
Update DAMON usage document for the DAMOS action failed regions quota
charge ratio control sysfs files.
Signed-off-by: SeongJae Park <sj@kernel.org>
---
Documentation/admin-guide/mm/damon/usage.rst | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
index bfdb717441f05..d5548e460857c 100644
--- a/Documentation/admin-guide/mm/damon/usage.rst
+++ b/Documentation/admin-guide/mm/damon/usage.rst
@@ -84,7 +84,9 @@ comma (",").
│ │ │ │ │ │ │ │ sz/min,max
│ │ │ │ │ │ │ │ nr_accesses/min,max
│ │ │ │ │ │ │ │ age/min,max
- │ │ │ │ │ │ │ :ref:`quotas <sysfs_quotas>`/ms,bytes,reset_interval_ms,effective_bytes,goal_tuner
+ │ │ │ │ │ │ │ :ref:`quotas <sysfs_quotas>`/ms,bytes,reset_interval_ms,
+ │ │ │ │ │ │ │ effective_bytes,goal_tuner,
+ │ │ │ │ │ │ │ fail_charge_num,fail_charge_denom
│ │ │ │ │ │ │ │ weights/sz_permil,nr_accesses_permil,age_permil
│ │ │ │ │ │ │ │ :ref:`goals <sysfs_schemes_quota_goals>`/nr_goals
│ │ │ │ │ │ │ │ │ 0/target_metric,target_value,current_value,nid,path
@@ -381,9 +383,10 @@ schemes/<N>/quotas/
The directory for the :ref:`quotas <damon_design_damos_quotas>` of the given
DAMON-based operation scheme.
-Under ``quotas`` directory, five files (``ms``, ``bytes``,
-``reset_interval_ms``, ``effective_bytes`` and ``goal_tuner``) and two
-directories (``weights`` and ``goals``) exist.
+Under ``quotas`` directory, seven files (``ms``, ``bytes``,
+``reset_interval_ms``, ``effective_bytes``, ``goal_tuner``, ``fail_charge_num``
+and ``fail_charge_denom``) and two directories (``weights`` and ``goals``)
+exist.
You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and
``reset interval`` in milliseconds by writing the values to the three files,
@@ -402,6 +405,13 @@ the background design of the feature and the name of the selectable algorithms.
Refer to :ref:`goals directory <sysfs_schemes_quota_goals>` for the goals
setup.
+You can set the action-failed memory quota charging ratio by writing the
+numerator and the denominator for the ratio to ``fail_charge_num`` and
+``fail_charge_denom`` files, respectively. Reading those files will return the
+current set values. Refer to :ref:`design
+<damon_design_damos_quotas_failed_memory_charging_ratio>` for more details of
+the ratio feature.
+
The time quota is internally transformed to a size quota. Between the
transformed size quota and user-specified size quota, smaller one is applied.
Based on the user-specified :ref:`goal <sysfs_schemes_quota_goals>`, the
--
2.47.3
* [RFC PATCH v5.1 05/11] Docs/mm/damon/design: document fail_charge_{num,denom}
From: SeongJae Park @ 2026-04-11 16:48 UTC
Cc: SeongJae Park, Liam R. Howlett, Andrew Morton, David Hildenbrand,
Jonathan Corbet, Lorenzo Stoakes, Michal Hocko, Mike Rapoport,
Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, damon, linux-doc,
linux-kernel, linux-mm
In-Reply-To: <20260411164908.77189-1-sj@kernel.org>
Update DAMON design document for the DAMOS action failed region quota
charge ratio.
Signed-off-by: SeongJae Park <sj@kernel.org>
---
Documentation/mm/damon/design.rst | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/Documentation/mm/damon/design.rst b/Documentation/mm/damon/design.rst
index 622d24e35961e..fa7392b5a331d 100644
--- a/Documentation/mm/damon/design.rst
+++ b/Documentation/mm/damon/design.rst
@@ -576,6 +576,28 @@ interface <sysfs_interface>`, refer to :ref:`weights <sysfs_quotas>` part of
the documentation.
+.. _damon_design_damos_quotas_failed_memory_charging_ratio:
+
+Action-failed Memory Charging Ratio
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+DAMOS action to a given region can fail for some subsets of the memory of the
+region. For example, if the action is ``pageout`` and the region has some
+unreclaimable pages, applying the action to the pages will fail. The amount of
+system resource that is taken for such failed action applications is usually
+different from that for successful action applications. For such cases, users
+can set different charging ratio for such failed memory. The ratio can be
+specified using ``fail_charge_num`` and ``fail_charge_denom`` parameters. The
+two parameters represent the numerator and denominator of the ratio. The
+feature is enabled only if ``fail_charge_denom`` is not zero.
+
+For example, let's suppose a DAMOS action is applied to a region of 1,000 MiB
+size. The action is successfully applied to only 700 MiB of the region.
+``fail_charge_num`` and ``fail_charge_denom`` are set to ``1`` and ``1024``,
+respectively. Then only 700 MiB and 300 KiB of size (``700 MiB + 300 MiB * 1 /
+1024``) will be charged.
+
+
.. _damon_design_damos_quotas_auto_tuning:
Aim-oriented Feedback-driven Auto-tuning
--
2.47.3
* [RFC PATCH v5.1 00/11] mm/damon: introduce DAMOS failed region quota charge ratio
From: SeongJae Park @ 2026-04-11 16:48 UTC
Cc: SeongJae Park, Liam R. Howlett, Andrew Morton, Brendan Higgins,
David Gow, David Hildenbrand, Jonathan Corbet, Lorenzo Stoakes,
Michal Hocko, Mike Rapoport, Shuah Khan, Shuah Khan,
Suren Baghdasaryan, Vlastimil Babka, damon, kunit-dev, linux-doc,
linux-kernel, linux-kselftest, linux-mm
TL;DR: Let users set different DAMOS quota charge ratios for DAMOS
action-failed regions, for deterministic and consistent DAMOS action
progress.
Common Reports: Unexpectedly Slow DAMOS
=======================================
One common issue that DAMON users report is that DAMOS actions
sometimes make progress much more slowly than expected. A common root
cause is that the DAMOS quota is exhausted by memory regions for which
applying the action failed.
For example, a group of users tried to run DAMOS-based proactive memory
reclamation (DAMON_RECLAIM) with a DAMOS quota of 100 MiB per second.
They ran it on a system with no active workload, which means all memory
of the system is cold. The expectation was that the system would show
100 MiB per second reclamation until (nearly) all memory is reclaimed.
But what they found was that the speed was quite inconsistent: sometimes
it was much slower than expected, and sometimes there was no reclamation
at all for tens of seconds. The upper limit of the speed (100 MiB per
second) was kept as expected, though.
By monitoring the qt_exceeds (number of DAMOS quota exceed events) DAMOS
stat, we found DAMOS quota is always exceeded when the speed is slow. By
monitoring sz_tried and sz_applied (the total amount of DAMOS action
tried memory and succeeded memory) DAMOS stats together, we found the
reclamation attempts nearly always failed when the speed is slow.
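The check described above can be sketched in a few lines of Python. This is only an illustration, not part of the series: the helper names are hypothetical, and the stats directory path (for example `/sys/kernel/mm/damon/admin/kdamonds/0/contexts/0/schemes/0/stats`) is an assumption about the test setup. The stat files may also need to be refreshed before reading, as the DAMON usage document describes.

```python
from pathlib import Path

# Stat file names taken from the text above.
STAT_NAMES = ("sz_tried", "sz_applied", "qt_exceeds")


def read_damos_stats(stats_dir):
    """Read the integer DAMOS stat files from a scheme's stats directory."""
    return {name: int(Path(stats_dir, name).read_text().strip())
            for name in STAT_NAMES}


def failed_fraction(stats):
    """Fraction of action-tried bytes that the action was not applied to."""
    if stats["sz_tried"] == 0:
        return 0.0
    return 1.0 - stats["sz_applied"] / stats["sz_tried"]
```

A failed fraction close to 1.0 together with a growing qt_exceeds would match the slow periods described above.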
DAMOS quota charges DAMOS action tried regions regardless of whether
the attempt succeeded. In the reported case, unreclaimable memory was
spread around the system memory. Sometimes nearly 100 MiB of the memory
that DAMOS tried to reclaim in a given quota interval was reclaimable,
and therefore the speed was nearly 100 MiB per second. Sometimes nearly
99 MiB of the memory that DAMOS tried to reclaim in a given quota
interval was unreclaimable, and therefore the speed was only about 1 MiB
per second.
We explained it is an expected behavior of the feature rather than a
bug, as DAMOS quota is there for only the upper-limit of the speed. The
users agreed and later reported a huge win from the adoption of
DAMON_RECLAIM on their products.
It is Not a Bug but a Feature; But...
=====================================
So nothing is broken. DAMOS quota is working as intended, as the upper
limit of the speed. It also makes its behavior observable via DAMOS
stats. In real-world production environments that run long-term active
workloads and where stability matters, the speed sometimes being slow is
not a real problem.
But the non-deterministic behavior is sometimes annoying, especially in
lab environments. Even in a realistic production environment, when
there is a huge amount of memory that the DAMOS action cannot be applied
to, the speed could be problematically slow. For example, suppose a
virtual machine provider sets up 99% of the host memory as hugetlb pages
that cannot be reclaimed, to give it to virtual machines. Also, when
aim-oriented DAMOS auto-tuning is applied, this could confuse the
internal feedback loop.
The intention of the current behavior was that trying a DAMOS action on
regions would impose some overhead anyway, and should therefore be
charged somehow. But in the real world, the overhead of a failed action
is much lighter than that of a successful action. Charging those at the
same ratio may be unfair, or at least suboptimal in some environments.
DAMOS Action Failed Region Quota Charge Ratio
=============================================
Let users set the charge ratio for the action-failed memory, for more
optimal and deterministic use of DAMOS. It allows users to specify the
numerator and the denominator of the ratio for flexible setup. For
example, let's suppose the numerator and the denominator are set to 1
and 4,096, respectively. The ratio is 1 / 4,096. A DAMOS scheme action
is applied to 5 GiB of memory. For 1 GiB of the memory, the action
succeeds. For the rest (4 GiB), the action fails. Then, only 1 GiB and
1 MiB of quota is charged (1 GiB + 4 GiB * 1 / 4,096).
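The arithmetic of this example can be checked with a minimal Python sketch. The function name is hypothetical and the integer rounding is an assumption. The kernel implementation in this series may round or guard against overflow differently. The zero-denominator branch follows the design document's statement that the feature is enabled only if `fail_charge_denom` is not zero.

```python
KIB, MIB, GIB = 1 << 10, 1 << 20, 1 << 30


def charged_bytes(sz_tried, sz_applied, num, denom):
    """Quota charge for one action attempt, in bytes.

    Succeeded memory is charged in full. Failed memory is charged at
    num/denom of its size. A zero denominator means the feature is
    disabled, so all tried memory is charged (the current behavior).
    """
    if denom == 0:
        return sz_tried
    sz_failed = sz_tried - sz_applied
    # Integer division is an assumption of this sketch.
    return sz_applied + sz_failed * num // denom


# Cover letter example: 5 GiB tried, 1 GiB succeeded, ratio 1/4096.
print(charged_bytes(5 * GIB, 1 * GIB, 1, 4096) == 1 * GIB + 1 * MIB)
```

The same helper reproduces the design document example: 1,000 MiB tried, 700 MiB succeeded, and a 1/1024 ratio give 700 MiB + 300 KiB.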
The optimal charge ratio will depend on the use case and
system/workload. I'd recommend starting by setting the numerator to 1
and the denominator to PAGE_SIZE, and then tuning based on the results,
because many DAMOS actions are applied at page level.
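As a rough sketch of that starting point, the helper below writes 1 and the system page size to the `fail_charge_num` and `fail_charge_denom` files under a scheme's `quotas` directory. The helper name and the example directory are assumptions for illustration. The files exist only on kernels carrying this series, and whether new values reach a running kdamond without a further commit step depends on the setup.

```python
import os
from pathlib import Path


def set_fail_charge_ratio(quotas_dir, num=1, denom=None):
    """Write the charge ratio to fail_charge_num and fail_charge_denom.

    When denom is not given, the system page size is used, following
    the recommended starting point above.
    """
    if denom is None:
        denom = os.sysconf("SC_PAGE_SIZE")
    Path(quotas_dir, "fail_charge_num").write_text(str(num))
    Path(quotas_dir, "fail_charge_denom").write_text(str(denom))
    return num, denom
```

On a test system, `quotas_dir` could be something like `/sys/kernel/mm/damon/admin/kdamonds/0/contexts/0/schemes/0/quotas`, run as root.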
Tests
=====
I tested this feature in the steps below.
1. Allocate 50% of system memory and mlock() it using a test program.
2. Fill up the page cache to exhaust nearly all free memory.
3. Start DAMON-based proactive reclamation with 100 MiB/second DAMOS
hard-quota. Auto-tune the DAMOS soft-quota under the hard-quota for
achieving 40% free memory of the system with 'temporal' tuner.
For step 1, I run a simple C program that is written by Gemini. It is
quite straightforward, so I'm not sharing the code here.
For step 2, I use dd command like below:
dd if=/dev/zero of=foo bs=1M count=$50_percent_of_system_memory
For step 3, I use the latest version of DAMON user-space tool (damo)
like below.
sudo damo start --damos_action pageout \
` # Do the pageout only up to 100 MiB per second ` \
--damos_quota_space 100M --damos_quota_interval 1s \
` # Auto-tune the quota below the hard quota aiming` \
` # 40% free memory of the node 0 ` \
` # (entire node of the test system)` \
--damos_quota_goal node_mem_free_bp 40% 0 \
` # use temporal tuner, which is easy to understand ` \
--damos_quota_goal_tuner temporal
As expected, the progress of the reclamation is not consistent, because
the quota is exceeded for the failed reclamation of the unreclaimable
memory.
I do this again, but with the failed region charge ratio feature. For
this, the above 'damo' command is used, after appending command line
option for setup of the charge ratio like below. Note that the option
was added to 'damo' after v3.1.9.
sudo ./damo start --damos_action pageout \
[...]
` # quota-charge only 1/4096 for pageout-failed regions ` \
--damos_quota_fail_charge_ratio 1 4096
The progress of the reclamation was nearly 100 MiB per second until the
goal was achieved, meeting the expectation.
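One way to see why the speed stays close to the quota is a simplified back-of-the-envelope model. It is an assumption made for illustration, not code from this series. It assumes a fixed fraction of the tried memory fails, charges failed bytes at num/denom, and ignores region granularity and auto-tuning.

```python
MIB = 1 << 20


def reclaimed_per_interval(quota, failed_fraction, num=1, denom=1):
    """Bytes successfully reclaimed before the quota is fully charged.

    Simplified model: every tried byte fails with the given probability,
    and failed bytes are charged at num/denom of their size.
    """
    succeeded = 1.0 - failed_fraction
    charge_per_tried_byte = succeeded + failed_fraction * num / denom
    tried = quota / charge_per_tried_byte
    return tried * succeeded


# 99% of the tried memory is unreclaimable, 100 MiB quota per interval.
full_charge = reclaimed_per_interval(100 * MIB, 0.99)
with_ratio = reclaimed_per_interval(100 * MIB, 0.99, 1, 4096)
print(round(full_charge / MIB, 2), round(with_ratio / MIB, 2))
```

Under this model, charging failed memory in full leaves about 1 MiB of successful reclamation per interval, while a 1/4096 ratio leaves most of the 100 MiB quota for memory that is actually reclaimed.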
Patches Sequence
================
The first two patches make preparatory changes. Patch 1 updates the
fully charged quota check to handle <min_region_sz remaining quota,
which can exist after this series is applied. Patch 2 merges regions
after applying schemes is done, as long as it is ok to do so, since
region split operations for quota could happen much more frequently
under a corner case that this series makes possible.
Patch 3 implements the feature and exposes it via DAMON core API. Patch
4 implements DAMON sysfs ABI for the feature. Three following patches
(5-7) document the feature and ABI on design, usage, and ABI documents,
respectively. Four patches for testing of the new feature follow.
Patch 8 implements a kunit test for the feature. Patches 9 and 10
extend DAMON selftest helpers for DAMON sysfs control and internal state
dumping for adding a new selftest for the feature. Patch 11 extends
existing DAMON sysfs interface selftest to test the new feature using
the extended helper scripts.
Changelog
=========
Changes from RFC v5
(https://lore.kernel.org/20260410142034.83798-1-sj@kernel.org)
- Merge back: merge whatever regions, as long as doing so doesn't lose
monitoring information or violate min_nr_regions.
Changes from RFC v4
(https://lore.kernel.org/20260409142148.60652-1-sj@kernel.org)
- Fix quota-sliced region merge-back issues.
- Use damon_for_each_region() instead of damon_for_each_region_safe().
- Avoid merging back of sliced but scheme unapplied regions, to keep
the monitoring information.
Changes from RFC v3
(https://lore.kernel.org/20260407010536.83603-1-sj@kernel.org)
- Make damos_quota_is_full() safe from overflow and easier to read.
- Avoid quota-based region split making too many new regions.
Changes from RFC v2
(https://lore.kernel.org/20260405151232.102690-1-sj@kernel.org)
- Handle <min_region_sz remaining quota.
- Document zero denom behavior.
- Fix typos: s/selftets/selftests/
Changes from RFC v1
(https://lore.kernel.org/20260404163943.89278-1-sj@kernel.org)
- Avoid overflows in charge amount calculation.
- Fix/wordsmith documentation for grammar, typo, and wrong examples.
- Improve unit test for more consistent comparison source use.
SeongJae Park (11):
mm/damon/core: handle <min_region_sz remaining quota as empty
mm/damon/core: merge regions after applying DAMOS schemes
mm/damon/core: introduce failed region quota charge ratio
mm/damon/sysfs-schemes: implement fail_charge_{num,denom} files
Docs/mm/damon/design: document fail_charge_{num,denom}
Docs/admin-guide/mm/damon/usage: document fail_charge_{num,denom}
files
Docs/ABI/damon: document fail_charge_{num,denom}
mm/damon/tests/core-kunit: test fail_charge_{num,denom} committing
selftests/damon/_damon_sysfs: support failed region quota charge ratio
selftests/damon/drgn_dump_damon_status: support failed region quota
charge ratio
selftests/damon/sysfs.py: test failed region quota charge ratio
.../ABI/testing/sysfs-kernel-mm-damon | 12 +++
Documentation/admin-guide/mm/damon/usage.rst | 18 ++++-
Documentation/mm/damon/design.rst | 22 +++++
include/linux/damon.h | 9 +++
mm/damon/core.c | 80 ++++++++++++++++---
mm/damon/sysfs-schemes.c | 54 +++++++++++++
mm/damon/tests/core-kunit.h | 6 ++
tools/testing/selftests/damon/_damon_sysfs.py | 21 ++++-
.../selftests/damon/drgn_dump_damon_status.py | 2 +
tools/testing/selftests/damon/sysfs.py | 6 ++
10 files changed, 213 insertions(+), 17 deletions(-)
base-commit: 45df8a80cb5d9b548f8586bf6dee79b6c77c3703
--
2.47.3
* Re: [PATCH v9 2/3] hwmon: ltc4283: Add support for the LTC4283 Swap Controller
From: Guenter Roeck @ 2026-04-11 15:54 UTC (permalink / raw)
To: Nuno Sá, nuno.sa, linux-gpio, linux-hwmon, devicetree,
linux-doc
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jonathan Corbet,
Linus Walleij, Bartosz Golaszewski
In-Reply-To: <2653dc70f42fd015b88e2744da257f6200603b50.camel@gmail.com>
On 4/11/26 05:38, Nuno Sá wrote:
> On Fri, 2026-04-10 at 16:27 -0700, Guenter Roeck wrote:
>> On 4/6/26 07:31, Nuno Sá via B4 Relay wrote:
>>> From: Nuno Sá <nuno.sa@analog.com>
>>>
>>> Support the LTC4283 Hot Swap Controller. The device features programmable
>>> current limit with foldback and independently adjustable inrush current to
>>> optimize the MOSFET safe operating area (SOA). The SOA timer limits MOSFET
>>> temperature rise for reliable protection against overstresses.
>>>
>>> An I2C interface and onboard ADC allow monitoring of board current,
>>> voltage, power, energy, and fault status.
>>>
>>> Signed-off-by: Nuno Sá <nuno.sa@analog.com>
>>
>> The patch still has some issues. Please see
>>
>> https://sashiko.dev/#/patchset/20260406-ltc4283-support-v9-0-b66cfc749261%40analog.com
>>
>> Specifically:
>>
>> - regmap_clear_bits() may not cause problems, but it is not the best
>> choice either because the register was already read.
>> It might be better to just write the value to be masked since
>> both the register value and the mask are known.
>
> Fair enough.
>
>>
>> - I can't comment on the energy accuracy lost. That is your call.
>>
>
> The AI might have a point. Maybe you know better but if I understood correctly,
> mul_u64_u64_div_u64() will handle the multiplication by using 128bits (when
> available) or if not, using clever tricks. And it should also handle overflows.
>
> So my feeling is that we can simplify all of those check_overflow paths with the
> suggested API.
>
>> - Clamping after multiplying is indeed wrong.
>> You'll need to clamp before multiplying (and then possibly
>> clamp again).
>
> Yeah, the clamp change was just nonsense from me. What about
>
> val = clamp_val((u64)val * MILLI, ...)
>
> ?
>
I don't think that will work on systems where long is 64 bits wide
(sizeof(long) == 8). I'd suggest to just bite the bullet and clamp
against LONG_MAX/MILLI first.
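A minimal userspace sketch of that ordering, for illustration only: the
helper name scale_to_milli() is hypothetical, MILLI is a local stand-in
for the kernel constant, and the real driver would follow this with
whatever hardware-specific clamp it needs.

```c
#include <limits.h>

#define MILLI 1000L	/* local stand-in for the kernel's MILLI */

/* Hypothetical helper: scale val by MILLI without overflowing long. */
static long scale_to_milli(long val)
{
	/* Clamp first, so that the product below always fits in a long. */
	if (val > LONG_MAX / MILLI)
		val = LONG_MAX / MILLI;
	else if (val < LONG_MIN / MILLI)
		val = LONG_MIN / MILLI;

	return val * MILLI;
}
```

Because the bounds are derived from LONG_MAX and LONG_MIN, this holds
regardless of whether long is 32 or 64 bits wide.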
>
>> - %*ph: The AI seems to have a point.
>
> Indeed!
>
> FWIW, I was already aware of the AI feedback but I'll just set things up locally and
> run the review before submitting again.
>
The AI now copies you on new revisions. Please feel free to rely on that
(unless you have tokens to burn, of course ;-). Those AI reviews are cheap
for what they do, but they are expensive in absolute terms.
Thanks,
Guenter
* [PATCH] Documentation: core-api: real-time: correct spelling
From: Sukrut Heroorkar @ 2026-04-11 15:51 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
Jonathan Corbet, Shuah Khan,
open list:Real-time Linux (PREEMPT_RT), open list:DOCUMENTATION,
open list
Cc: Sukrut Heroorkar
Fix typo: replace "excpetion" with "exception".
Signed-off-by: Sukrut Heroorkar <hsukrut3@gmail.com>
---
Documentation/core-api/real-time/architecture-porting.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/core-api/real-time/architecture-porting.rst b/Documentation/core-api/real-time/architecture-porting.rst
index c90a426d8062..c9a39d708866 100644
--- a/Documentation/core-api/real-time/architecture-porting.rst
+++ b/Documentation/core-api/real-time/architecture-porting.rst
@@ -74,7 +74,7 @@ Exception handlers
Enabling interrupts is especially important on PREEMPT_RT, where certain
locks, such as spinlock_t, become sleepable. For example, handling an
invalid opcode may result in sending a SIGILL signal to the user task. A
- debug excpetion will send a SIGTRAP signal.
+ debug exception will send a SIGTRAP signal.
In both cases, if the exception occurred in user space, it is safe to enable
interrupts early. Sending a signal requires both interrupts and kernel
preemption to be enabled.
--
2.43.0
* [PATCH v2] Documentation: Refactored watchdog old doc
From: Sunny Patel @ 2026-04-11 15:09 UTC (permalink / raw)
To: Jonathan Corbet
Cc: Wim Van Sebroeck, Guenter Roeck, Shuah Khan, linux-watchdog,
linux-doc, linux-kernel, Sunny Patel
In-Reply-To: <3e25ae54-e62d-484e-8d90-4f7825705e4f@roeck-us.net>
Mark WDIOC_GETTEMP and WDIOS_TEMPPANIC as deprecated since
neither is implemented by the watchdog core and both are only
present in a small number of legacy drivers.
Add documentation for previously undocumented status bits
WDIOF_MAGICCLOSE and WDIOF_ALARMONLY in the options field.
Add documentation for WDIOF_PRETIMEOUT and WDIOF_SETTIMEOUT
status bits describing their respective ioctls.
Fix the following issues in existing documentation:
- Remove version-specific reference to Linux 2.4.18 from
the GETTIMEOUT ioctl description
- Fix duplicate "was is" in printf format strings
- Replace [FIXME] placeholder with proper descriptions for
WDIOS_DISABLECARD, WDIOS_ENABLECARD and WDIOS_TEMPPANIC
Signed-off-by: Sunny Patel <nueralspacetech@gmail.com>
---
Changes in v2:
- Fixed typos: "tiemout" -> "timeout", "characted" -> "character"
- Fixed "small number if legacy" -> "of legacy"
- Fixed capitalization: "New Drivers" -> "New drivers", "USE" -> "Use"
- Fixed spacing: "WDIOS_DISABLECARD,this" -> "WDIOS_DISABLECARD, this"
- Fixed double spaces in two places
- Added missing newline at end of file
- Rewrote commit message
Documentation/watchdog/watchdog-api.rst | 59 +++++++++++++++++++++----
1 file changed, 51 insertions(+), 8 deletions(-)
diff --git a/Documentation/watchdog/watchdog-api.rst b/Documentation/watchdog/watchdog-api.rst
index 78e228c272cf..3e9021a79671 100644
--- a/Documentation/watchdog/watchdog-api.rst
+++ b/Documentation/watchdog/watchdog-api.rst
@@ -2,7 +2,7 @@
The Linux Watchdog driver API
=============================
-Last reviewed: 10/05/2007
+Last reviewed: 04/08/2026
@@ -106,11 +106,10 @@ the requested one due to limitation of the hardware::
This example might actually print "The timeout was set to 60 seconds"
if the device has a granularity of minutes for its timeout.
-Starting with the Linux 2.4.18 kernel, it is possible to query the
-current timeout using the GETTIMEOUT ioctl::
+It is also possible to get the current timeout with the GETTIMEOUT ioctl::
ioctl(fd, WDIOC_GETTIMEOUT, &timeout);
- printf("The timeout was is %d seconds\n", timeout);
+ printf("The timeout is %d seconds\n", timeout);
Pretimeouts
===========
@@ -133,7 +132,7 @@ seconds. Setting a pretimeout to zero disables it.
There is also a get function for getting the pretimeout::
ioctl(fd, WDIOC_GETPRETIMEOUT, &timeout);
- printf("The pretimeout was is %d seconds\n", timeout);
+ printf("The pretimeout is %d seconds\n", timeout);
Not all watchdog drivers will support a pretimeout.
@@ -145,7 +144,7 @@ before the system will reboot. The WDIOC_GETTIMELEFT is the ioctl
that returns the number of seconds before reboot::
ioctl(fd, WDIOC_GETTIMELEFT, &timeleft);
- printf("The timeout was is %d seconds\n", timeleft);
+ printf("The timeout is %d seconds\n", timeleft);
Environmental monitoring
========================
@@ -227,12 +226,33 @@ The watchdog saw a keepalive ping since it was last queried.
WDIOF_SETTIMEOUT Can set/get the timeout
================ =======================
-The watchdog can do pretimeouts.
+The watchdog supports timeout set/get via the WDIOC_SETTIMEOUT and
+WDIOC_GETTIMEOUT ioctls.
================ ================================
WDIOF_PRETIMEOUT Pretimeout (in seconds), get/set
================ ================================
+The watchdog supports a pretimeout, a warning interrupt that fires before
+the actual reboot timeout. Use WDIOC_SETPRETIMEOUT and WDIOC_GETPRETIMEOUT
+to set/get the pretimeout.
+
+ ================ ================================
+ WDIOF_MAGICCLOSE Supports magic close char
+ ================ ================================
+
+The driver supports the Magic Close feature. The watchdog is only disabled
+if the character 'V' is written to /dev/watchdog before the file descriptor
+is closed. Without this, closing the device disables the watchdog
+unconditionally.
+
+ ================ ================================
+ WDIOF_ALARMONLY Not a reboot watchdog
+ ================ ================================
+
+The watchdog will not reboot the system when it expires. Instead it
+triggers a management or other external alarm. Userspace should not
+rely on a system reboot occurring.
For those drivers that return any bits set in the option field, the
GETSTATUS and GETBOOTSTATUS ioctls can be used to ask for the current
@@ -254,6 +274,11 @@ returned value is the temperature in degrees Fahrenheit::
int temperature;
ioctl(fd, WDIOC_GETTEMP, &temperature);
+.. deprecated::
+ ``WDIOC_GETTEMP`` is not implemented by the watchdog core. It is only
+ supported by a small number of legacy drivers. New drivers should not
+ implement it.
+
Finally the SETOPTIONS ioctl can be used to control some aspects of
the cards operation::
@@ -268,4 +293,22 @@ The following options are available:
WDIOS_TEMPPANIC Kernel panic on temperature trip
================= ================================
-[FIXME -- better explanations]
+``WDIOS_DISABLECARD`` stops the watchdog timer. The driver will cease
+pinging the hardware watchdog, allowing a controlled shutdown without
+a forced reboot. This is equivalent to the watchdog being disarmed.
+
+``WDIOS_ENABLECARD`` starts the watchdog timer. If the watchdog was
+previously stopped via ``WDIOS_DISABLECARD``, this will re-enable it. The
+hardware watchdog will begin counting down from the configured timeout.
+
+``WDIOS_TEMPPANIC`` enables temperature-based kernel panic. When set,
+the driver will call ``panic()`` (or ``kernel_power_off()`` on some
+drivers) if the hardware temperature sensor exceeds its threshold,
+rather than only setting the ``WDIOF_OVERHEAT`` status bit. Support
+for this option is driver-specific, not all watchdog drivers implement
+temperature monitoring.
+
+.. deprecated::
+ ``WDIOS_TEMPPANIC`` is not implemented by the watchdog core and is only
+ present in a small number of legacy drivers. New drivers should not
+ implement it.
--
2.43.0
* Re: [PATCH 1/6] hugetlb: open-code hugetlb folio lookup index conversion
From: Mike Rapoport @ 2026-04-11 14:14 UTC (permalink / raw)
To: Jane Chu
Cc: akpm, david, muchun.song, osalvador, lorenzo.stoakes,
Liam.Howlett, vbabka, surenb, mhocko, corbet, skhan, hughd,
baolin.wang, peterx, linux-mm, linux-doc, linux-kernel
In-Reply-To: <20260409234158.837786-2-jane.chu@oracle.com>
Hi,
On Thu, Apr 09, 2026 at 05:41:52PM -0600, Jane Chu wrote:
> This patch removes `filemap_lock_hugetlb_folio()` and open-codes
> the index conversion at each call site, making it explicit when
> hugetlb code is translating a hugepage index into the base-page index
> expected by `filemap_lock_folio()`. As part of that cleanup,
> it also uses a base-page index directly in `hugetlbfs_zero_partial_page()`,
> where the byte offset is already page-granular. Overall, the change
> makes the indexing model more obvious at the call sites and avoids
> hiding the huge-index to base-index conversion inside a helper.
>
> Suggested-by: David Hildenbrand <david@kernel.org>
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> ---
> fs/hugetlbfs/inode.c | 20 ++++++++++----------
> include/linux/hugetlb.h | 12 ------------
> mm/hugetlb.c | 4 ++--
> 3 files changed, 12 insertions(+), 24 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index cd6b22f6e2b1..cf79fb830377 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -242,9 +242,9 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
> struct hstate *h = hstate_file(file);
> struct address_space *mapping = file->f_mapping;
> struct inode *inode = mapping->host;
> - unsigned long index = iocb->ki_pos >> huge_page_shift(h);
> + unsigned long idx = iocb->ki_pos >> huge_page_shift(h);
Is it necessary to rename index to idx?
> unsigned long offset = iocb->ki_pos & ~huge_page_mask(h);
> - unsigned long end_index;
> + unsigned long end_idx;
> loff_t isize;
> ssize_t retval = 0;
...
> @@ -652,10 +652,10 @@ static void hugetlbfs_zero_partial_page(struct hstate *h,
> loff_t start,
> loff_t end)
> {
> - pgoff_t idx = start >> huge_page_shift(h);
> + pgoff_t index = start >> PAGE_SHIFT;
And idx to index?
Maybe let's pick one and rename the other or just leave them be.
> struct folio *folio;
>
--
Sincerely yours,
Mike.
* Re: [PATCH v9 2/3] hwmon: ltc4283: Add support for the LTC4283 Swap Controller
From: Nuno Sá @ 2026-04-11 12:38 UTC (permalink / raw)
To: Guenter Roeck, nuno.sa, linux-gpio, linux-hwmon, devicetree,
linux-doc
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jonathan Corbet,
Linus Walleij, Bartosz Golaszewski
In-Reply-To: <29b207c8-10ab-42b4-a1c8-988aacc75154@roeck-us.net>
On Fri, 2026-04-10 at 16:27 -0700, Guenter Roeck wrote:
> On 4/6/26 07:31, Nuno Sá via B4 Relay wrote:
> > From: Nuno Sá <nuno.sa@analog.com>
> >
> > Support the LTC4283 Hot Swap Controller. The device features programmable
> > current limit with foldback and independently adjustable inrush current to
> > optimize the MOSFET safe operating area (SOA). The SOA timer limits MOSFET
> > temperature rise for reliable protection against overstresses.
> >
> > An I2C interface and onboard ADC allow monitoring of board current,
> > voltage, power, energy, and fault status.
> >
> > Signed-off-by: Nuno Sá <nuno.sa@analog.com>
>
> The patch still has some issues. Please see
>
> https://sashiko.dev/#/patchset/20260406-ltc4283-support-v9-0-b66cfc749261%40analog.com
>
> Specifically:
>
> - regmap_clear_bits() may not cause problems, but it is not the best
> choice either because the register was already read.
> It might be better to just write the value to be masked since
> both the register value and the mask are known.
Fair enough.
>
> - I can't comment on the energy accuracy lost. That is your call.
>
The AI might have a point. Maybe you know better but if I understood correctly,
mul_u64_u64_div_u64() will handle the multiplication by using 128bits (when
available) or if not, using clever tricks. And it should also handle overflows.
So my feeling is that we can simplify all of those check_overflow paths with the
suggested API.
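To illustrate the idea, here is a userspace stand-in that computes
a * b / c through a 128-bit intermediate. It is only a sketch: the name
mul_div_u64() is made up, it assumes a 64-bit GCC or Clang target with
the unsigned __int128 extension, and it says nothing about how the kernel
helper behaves when the quotient does not fit in 64 bits.

```c
#include <stdint.h>

/*
 * Hypothetical userspace model of a * b / c with a 128-bit intermediate.
 * c must be non-zero; a quotient wider than 64 bits is simply truncated
 * by the cast in this sketch.
 */
static uint64_t mul_div_u64(uint64_t a, uint64_t b, uint64_t c)
{
	/* Widen before multiplying so the product cannot wrap at 64 bits. */
	unsigned __int128 prod = (unsigned __int128)a * b;

	return (uint64_t)(prod / c);
}
```

With this shape, a product that would wrap in plain 64-bit arithmetic
still gives the expected quotient, so the separate check_overflow paths
around the multiplication are not needed in the sketch.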
> - Clamping after multiplying is indeed wrong.
> You'll need to clamp before multiplying (and then possibly
> clamp again).
Yeah, the clamp change was just nonsense from me. What about
val = clamp_val((u64)val * MILLI, ...)
?
> - %*ph: The AI seems to have a point.
Indeed!
FWIW, I was already aware of the AI feedback but I'll just set things up locally and
run the review before submitting again.
- Nuno Sá
>
> - debugfs: False positive. I'll need to check if the guidance ever made it into the
> Agent's prompts.
>
> Thanks,
> Guenter
>
> > ---
> > Documentation/hwmon/index.rst | 1 +
> > Documentation/hwmon/ltc4283.rst | 266 ++++++
> > MAINTAINERS | 1 +
> > drivers/hwmon/Kconfig | 12 +
> > drivers/hwmon/Makefile | 1 +
> > drivers/hwmon/ltc4283.c | 1808 +++++++++++++++++++++++++++++++++++++++
> > 6 files changed, 2089 insertions(+)
> >
> > diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst
> > index 199f35a75282..d54dda83ab6e 100644
> > --- a/Documentation/hwmon/index.rst
> > +++ b/Documentation/hwmon/index.rst
> > @@ -144,6 +144,7 @@ Hardware Monitoring Kernel Drivers
> > ltc4260
> > ltc4261
> > ltc4282
> > + ltc4283
> > ltc4286
> > macsmc-hwmon
> > max127
> > diff --git a/Documentation/hwmon/ltc4283.rst b/Documentation/hwmon/ltc4283.rst
> > new file mode 100644
> > index 000000000000..ba88445e45f4
> > --- /dev/null
> > +++ b/Documentation/hwmon/ltc4283.rst
> > @@ -0,0 +1,266 @@
> > +.. SPDX-License-Identifier: GPL-2.0-only
> > +
> > +Kernel drivers ltc4283
> > +==========================================
> > +
> > +Supported chips:
> > +
> > + * Analog Devices LTC4283
> > +
> > + Prefix: 'ltc4283'
> > +
> > + Addresses scanned: -
> > +
> > + Datasheet:
> > +
> > + https://www.analog.com/media/en/technical-documentation/data-sheets/ltc4283.pdf
> > +
> > +Author: Nuno Sá <nuno.sa@analog.com>
> > +
> > +Description
> > +___________
> > +
> > +The LTC4283 negative voltage hot swap controller drives an external N-channel
> > +MOSFET to allow a board to be safely inserted and removed from a live backplane.
> > +The device features programmable current limit with foldback and independently
> > +adjustable inrush current to optimize the MOSFET safe operating area (SOA). The
> > +SOA timer limits MOSFET temperature rise for reliable protection against
> > +overstresses. An I2C interface and onboard gear-shift ADC allow monitoring of
> > +board current, voltage, power, energy, and fault status. Additional features
> > +respond to input UV/OV, interrupt the host when a fault has occurred, notify
> > +when output power is good, detect insertion of a board, turn off the MOSFET
> > +if an external supply monitor fails to indicate power good within a timeout
> > +period, and auto-reboot after a programmable delay following a host commanded
> > +turn-off.
> > +
> > +Sysfs entries
> > +_____________
> > +
> > +The following attributes are supported. Limits are read-write and all the other
> > +attributes are read-only. Note that the VADIOx channels might not be available
> > +if the ADIO pins are used as GPIOs (naturally also affects the respective
> > +differential channels).
> > +
> > +======================= ==========================================
> > +in0_lcrit_alarm Critical Undervoltage alarm
> > +in0_crit_alarm Critical Overvoltage alarm
> > +in0_label Channel label (VIN)
> > +
> > +in1_input Output voltage (mV).
> > +in1_min Undervoltage threshold
> > +in1_max Overvoltage threshold
> > +in1_lowest Lowest measured voltage
> > +in1_highest Highest measured voltage
> > +in1_reset_history Write 1 to reset history.
> > +in1_min_alarm Undervoltage alarm
> > +in1_max_alarm Overvoltage alarm
> > +in1_label Channel label (VPWR)
> > +
> > +in2_input Output voltage (mV).
> > +in2_min Undervoltage threshold
> > +in2_max Overvoltage threshold
> > +in2_lowest Lowest measured voltage
> > +in2_highest Highest measured voltage
> > +in2_reset_history Write 1 to reset history.
> > +in2_min_alarm Undervoltage alarm
> > +in2_max_alarm Overvoltage alarm
> > +in2_enable Enable/Disable monitoring.
> > +in2_label Channel label (VADI1)
> > +
> > +in3_input Output voltage (mV).
> > +in3_min Undervoltage threshold
> > +in3_max Overvoltage threshold
> > +in3_lowest Lowest measured voltage
> > +in3_highest Highest measured voltage
> > +in3_reset_history Write 1 to reset history.
> > +in3_min_alarm Undervoltage alarm
> > +in3_max_alarm Overvoltage alarm
> > +in3_enable Enable/Disable monitoring.
> > +in3_label Channel label (VADI2)
> > +
> > +in4_input Output voltage (mV).
> > +in4_min Undervoltage threshold
> > +in4_max Overvoltage threshold
> > +in4_lowest Lowest measured voltage
> > +in4_highest Highest measured voltage
> > +in4_reset_history Write 1 to reset history.
> > +in4_min_alarm Undervoltage alarm
> > +in4_max_alarm Overvoltage alarm
> > +in4_enable Enable/Disable monitoring.
> > +in4_label Channel label (VADI3)
> > +
> > +in5_input Output voltage (mV).
> > +in5_min Undervoltage threshold
> > +in5_max Overvoltage threshold
> > +in5_lowest Lowest measured voltage
> > +in5_highest Highest measured voltage
> > +in5_reset_history Write 1 to reset history.
> > +in5_min_alarm Undervoltage alarm
> > +in5_max_alarm Overvoltage alarm
> > +in5_enable Enable/Disable monitoring.
> > +in5_label Channel label (VADI4)
> > +
> > +in6_input Output voltage (mV).
> > +in6_min Undervoltage threshold
> > +in6_max Overvoltage threshold
> > +in6_lowest Lowest measured voltage
> > +in6_highest Highest measured voltage
> > +in6_reset_history Write 1 to reset history.
> > +in6_min_alarm Undervoltage alarm
> > +in6_max_alarm Overvoltage alarm
> > +in6_enable Enable/Disable monitoring.
> > +in6_label Channel label (VADIO1)
> > +
> > +in7_input Output voltage (mV).
> > +in7_min Undervoltage threshold
> > +in7_max Overvoltage threshold
> > +in7_lowest Lowest measured voltage
> > +in7_highest Highest measured voltage
> > +in7_reset_history Write 1 to reset history.
> > +in7_min_alarm Undervoltage alarm
> > +in7_max_alarm Overvoltage alarm
> > +in7_enable Enable/Disable monitoring.
> > +in7_label Channel label (VADIO2)
> > +
> > +in8_input Output voltage (mV).
> > +in8_min Undervoltage threshold
> > +in8_max Overvoltage threshold
> > +in8_lowest Lowest measured voltage
> > +in8_highest Highest measured voltage
> > +in8_reset_history Write 1 to reset history.
> > +in8_min_alarm Undervoltage alarm
> > +in8_max_alarm Overvoltage alarm
> > +in8_enable Enable/Disable monitoring.
> > +in8_label Channel label (VADIO3)
> > +
> > +in9_input Output voltage (mV).
> > +in9_min Undervoltage threshold
> > +in9_max Overvoltage threshold
> > +in9_lowest Lowest measured voltage
> > +in9_highest Highest measured voltage
> > +in9_reset_history Write 1 to reset history.
> > +in9_min_alarm Undervoltage alarm
> > +in9_max_alarm Overvoltage alarm
> > +in9_enable Enable/Disable monitoring.
> > +in9_label Channel label (VADIO4)
> > +
> > +in10_input Output voltage (mV).
> > +in10_min Undervoltage threshold
> > +in10_max Overvoltage threshold
> > +in10_lowest Lowest measured voltage
> > +in10_highest Highest measured voltage
> > +in10_reset_history Write 1 to reset history.
> > +in10_min_alarm Undervoltage alarm
> > +in10_max_alarm Overvoltage alarm
> > +in10_enable Enable/Disable monitoring.
> > +in10_label Channel label (DRNS)
> > +
> > +in11_input Output voltage (mV).
> > +in11_min Undervoltage threshold
> > +in11_max Overvoltage threshold
> > +in11_lowest Lowest measured voltage
> > +in11_highest Highest measured voltage
> > +in11_reset_history Write 1 to reset history.
> > + Also clears fet bad and short fault logs.
> > +in11_min_alarm Undervoltage alarm
> > +in11_max_alarm Overvoltage alarm
> > +in11_enable Enable/Disable monitoring
> > +in11_fault Failure in the MOSFET. Either bad or shorted FET.
> > +in11_label Channel label (DRAIN)
> > +
> > +in12_input Output voltage (mV).
> > +in12_min Undervoltage threshold
> > +in12_max Overvoltage threshold
> > +in12_lowest Lowest measured voltage
> > +in12_highest Highest measured voltage
> > +in12_reset_history Write 1 to reset history.
> > +in12_min_alarm Undervoltage alarm
> > +in12_max_alarm Overvoltage alarm
> > +in12_enable Enable/Disable monitoring.
> > +in12_label Channel label (ADIN2-ADIN1)
> > +
> > +in13_input Output voltage (mV).
> > +in13_min Undervoltage threshold
> > +in13_max Overvoltage threshold
> > +in13_lowest Lowest measured voltage
> > +in13_highest Highest measured voltage
> > +in13_reset_history Write 1 to reset history.
> > +in13_min_alarm Undervoltage alarm
> > +in13_max_alarm Overvoltage alarm
> > +in13_enable Enable/Disable monitoring.
> > +in13_label Channel label (ADIN4-ADIN3)
> > +
> > +in14_input Output voltage (mV).
> > +in14_min Undervoltage threshold
> > +in14_max Overvoltage threshold
> > +in14_lowest Lowest measured voltage
> > +in14_highest Highest measured voltage
> > +in14_reset_history Write 1 to reset history.
> > +in14_min_alarm Undervoltage alarm
> > +in14_max_alarm Overvoltage alarm
> > +in14_enable Enable/Disable monitoring.
> > +in14_label Channel label (ADIO2-ADIO1)
> > +
> > +in15_input Output voltage (mV).
> > +in15_min Undervoltage threshold
> > +in15_max Overvoltage threshold
> > +in15_lowest Lowest measured voltage
> > +in15_highest Highest measured voltage
> > +in15_reset_history Write 1 to reset history.
> > +in15_min_alarm Undervoltage alarm
> > +in15_max_alarm Overvoltage alarm
> > +in15_enable Enable/Disable monitoring.
> > +in15_label Channel label (ADIO4-ADIO3)
> > +
> > +curr1_input Sense current (mA)
> > +curr1_min Undercurrent threshold
> > +curr1_max Overcurrent threshold
> > +curr1_lowest Lowest measured current
> > +curr1_highest Highest measured current
> > +curr1_reset_history Write 1 to reset curr1 history.
> > + Also clears overcurrent fault logs.
> > +curr1_min_alarm Undercurrent alarm
> > +curr1_max_alarm Overcurrent alarm
> > +curr1_crit_alarm Critical Overcurrent alarm
> > +curr1_label Channel label (ISENSE)
> > +
> > +power1_input Power (in uW)
> > +power1_min Low power threshold
> > +power1_max High power threshold
> > +power1_input_lowest Historical minimum power use
> > +power1_input_highest Historical maximum power use
> > +power1_reset_history Write 1 to reset power1 history.
> > + Also clears power fault logs.
> > +power1_min_alarm Low power alarm
> > +power1_max_alarm High power alarm
> > +power1_label Channel label (Power)
> > +
> > +energy1_input Measured energy over time (in microJoule)
> > +energy1_enable Enable/Disable Energy accumulation
> > +======================= ==========================================
> > +
> > +DebugFs entries
> > +_______________
> > +
> > +The chip also has a fault log register where failures can be logged. Hence,
> > +as these are logging events, we give access to them in debugfs. Note that
> > +even if some failure is detected in these logs, it does not necessarily mean
> > +that the failure is still present. As mentioned in the proper Sysfs entries,
> > +these logs can be cleared by writing in the proper reset_history attribute.
> > +
> > +.. warning:: The debugfs interface is subject to change without notice
> > + and is only available when the kernel is compiled with
> > + ``CONFIG_DEBUG_FS`` defined.
> > +
> > +``/sys/kernel/debug/i2c/i2c-[X]/[X]-addr/``
> > +contains the following attributes:
> > +
> > +======================= ==========================================
> > +power1_failed_fault_log Set to 1 by a power1 fault occurring.
> > +power1_good_input_fault_log Set to 1 by a power1 good input fault occurring at PGIO3.
> > +in11_fet_short_fault_log Set to 1 when a FET-short fault occurs.
> > +in11_fet_bad_fault_log Set to 1 when a FET-BAD fault occurs.
> > +in0_lcrit_fault_log Set to 1 by a VIN undervoltage fault occurring.
> > +in0_crit_fault_log Set to 1 by a VIN overvoltage fault occurring.
> > +curr1_crit_fault_log Set to 1 by an overcurrent fault occurring.
> > +======================= ==========================================
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 3f727d7fdfa4..a63833b6fe8b 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -15166,6 +15166,7 @@ M: Nuno Sá <nuno.sa@analog.com>
> > L: linux-hwmon@vger.kernel.org
> > S: Supported
> > F: Documentation/devicetree/bindings/hwmon/adi,ltc4283.yaml
> > +F: drivers/hwmon/ltc4283.c
> >
> > LTC4286 HARDWARE MONITOR DRIVER
> > M: Delphine CC Chiu <Delphine_CC_Chiu@Wiwynn.com>
> > diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> > index fb847ab40ab4..4d9f500ae6ee 100644
> > --- a/drivers/hwmon/Kconfig
> > +++ b/drivers/hwmon/Kconfig
> > @@ -1157,6 +1157,18 @@ config SENSORS_LTC4282
> > This driver can also be built as a module. If so, the module will
> > be called ltc4282.
> >
> > +config SENSORS_LTC4283
> > + tristate "Analog Devices LTC4283"
> > + depends on I2C
> > + select REGMAP_I2C
> > + select AUXILIARY_BUS
> > + help
> > + If you say yes here you get support for Analog Devices LTC4283
> > + Negative Voltage Hot Swap Controller I2C interface.
> > +
> > + This driver can also be built as a module. If so, the module will
> > + be called ltc4283.
> > +
> > config SENSORS_LTQ_CPUTEMP
> > bool "Lantiq cpu temperature sensor driver"
> > depends on SOC_XWAY
> > diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> > index 0fce31b43eb1..b9d7b0287b9c 100644
> > --- a/drivers/hwmon/Makefile
> > +++ b/drivers/hwmon/Makefile
> > @@ -147,6 +147,7 @@ obj-$(CONFIG_SENSORS_LTC4245) += ltc4245.o
> > obj-$(CONFIG_SENSORS_LTC4260) += ltc4260.o
> > obj-$(CONFIG_SENSORS_LTC4261) += ltc4261.o
> > obj-$(CONFIG_SENSORS_LTC4282) += ltc4282.o
> > +obj-$(CONFIG_SENSORS_LTC4283) += ltc4283.o
> > obj-$(CONFIG_SENSORS_LTQ_CPUTEMP) += ltq-cputemp.o
> > obj-$(CONFIG_SENSORS_MACSMC_HWMON) += macsmc-hwmon.o
> > obj-$(CONFIG_SENSORS_MAX1111) += max1111.o
> > diff --git a/drivers/hwmon/ltc4283.c b/drivers/hwmon/ltc4283.c
> > new file mode 100644
> > index 000000000000..2a2674a55167
> > --- /dev/null
> > +++ b/drivers/hwmon/ltc4283.c
> > @@ -0,0 +1,1808 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Analog Devices LTC4283 I2C Negative Voltage Hot Swap Controller (HWMON)
> > + *
> > + * Copyright 2025 Analog Devices Inc.
> > + */
> > +#include <linux/auxiliary_bus.h>
> > +#include <linux/bitfield.h>
> > +#include <linux/bitmap.h>
> > +#include <linux/bitops.h>
> > +#include <linux/bits.h>
> > +
> > +#include <linux/debugfs.h>
> > +#include <linux/device.h>
> > +#include <linux/device/devres.h>
> > +#include <linux/hwmon.h>
> > +#include <linux/i2c.h>
> > +#include <linux/math.h>
> > +#include <linux/math64.h>
> > +#include <linux/minmax.h>
> > +#include <linux/module.h>
> > +
> > +#include <linux/mod_devicetable.h>
> > +#include <linux/overflow.h>
> > +#include <linux/property.h>
> > +#include <linux/regmap.h>
> > +#include <linux/unaligned.h>
> > +#include <linux/units.h>
> > +
> > +#define LTC4283_SYSTEM_STATUS 0x00
> > +#define LTC4283_FAULT_STATUS 0x03
> > +#define LTC4283_OV_MASK BIT(0)
> > +#define LTC4283_UV_MASK BIT(1)
> > +#define LTC4283_OC_MASK BIT(2)
> > +#define LTC4283_FET_BAD_MASK BIT(3)
> > +#define LTC4283_FET_SHORT_MASK BIT(6)
> > +#define LTC4283_FAULT_LOG 0x04
> > +#define LTC4283_OV_FAULT_MASK BIT(0)
> > +#define LTC4283_UV_FAULT_MASK BIT(1)
> > +#define LTC4283_OC_FAULT_MASK BIT(2)
> > +#define LTC4283_FET_BAD_FAULT_MASK BIT(3)
> > +#define LTC4283_PGI_FAULT_MASK BIT(4)
> > +#define LTC4283_PWR_FAIL_FAULT_MASK BIT(5)
> > +#define LTC4283_FET_SHORT_FAULT_MASK BIT(6)
> > +#define LTC4283_ADC_ALM_LOG_1 0x05
> > +#define LTC4283_POWER_LOW_ALM BIT(0)
> > +#define LTC4283_POWER_HIGH_ALM BIT(1)
> > +#define LTC4283_SENSE_LOW_ALM BIT(4)
> > +#define LTC4283_SENSE_HIGH_ALM BIT(5)
> > +#define LTC4283_ADC_ALM_LOG_2 0x06
> > +#define LTC4283_ADC_ALM_LOG_3 0x07
> > +#define LTC4283_ADC_ALM_LOG_4 0x08
> > +#define LTC4283_ADC_ALM_LOG_5 0x09
> > +#define LTC4283_CONTROL_1 0x0a
> > +#define LTC4283_RW_PAGE_MASK BIT(0)
> > +#define LTC4283_PIGIO2_ACLB_MASK BIT(2)
> > +#define LTC4283_PWRGD_RST_CTRL_MASK BIT(3)
> > +#define LTC4283_FET_BAD_OFF_MASK BIT(4)
> > +#define LTC4283_THERM_TMR_MASK BIT(5)
> > +#define LTC4283_DVDT_MASK BIT(6)
> > +#define LTC4283_CONTROL_2 0x0b
> > +#define LTC4283_OV_RETRY_MASK BIT(0)
> > +#define LTC4283_UV_RETRY_MASK BIT(1)
> > +#define LTC4283_OC_RETRY_MASK GENMASK(3, 2)
> > +#define LTC4283_FET_BAD_RETRY_MASK GENMASK(5, 4)
> > +#define LTC4283_EXT_FAULT_RETRY_MASK BIT(7)
> > +#define LTC4283_RESERVED_OC 0x0c
> > +#define LTC4283_CONFIG_1 0x0d
> > +#define LTC4283_FB_MASK GENMASK(3, 2)
> > +#define LTC4283_ILIM_MASK GENMASK(7, 4)
> > +#define LTC4283_CONFIG_2 0x0e
> > +#define LTC4283_COOLING_DL_MASK GENMASK(3, 1)
> > +#define LTC4283_FTBD_DL_MASK GENMASK(5, 4)
> > +#define LTC4283_CONFIG_3 0x0f
> > +#define LTC4283_VPWR_DRNS_MASK BIT(6)
> > +#define LTC4283_EXTFLT_TURN_OFF_MASK BIT(7)
> > +#define LTC4283_PGIO_CONFIG 0x10
> > +#define LTC4283_PGIO1_CFG_MASK GENMASK(1, 0)
> > +#define LTC4283_PGIO2_CFG_MASK GENMASK(3, 2)
> > +#define LTC4283_PGIO3_CFG_MASK GENMASK(5, 4)
> > +#define LTC4283_PGIO4_CFG_MASK GENMASK(7, 6)
> > +#define LTC4283_PGIO_CONFIG_2 0x11
> > +#define LTC4283_ADC_MASK GENMASK(2, 0)
> > +#define LTC4283_ADC_SELECT(c) (0x13 + (c) / 8)
> > +#define LTC4283_ADC_SELECT_MASK(c) BIT((c) % 8)
> > +#define LTC4283_SENSE_MIN_TH 0x1b
> > +#define LTC4283_SENSE_MAX_TH 0x1c
> > +#define LTC4283_VPWR_MIN_TH 0x1d
> > +#define LTC4283_VPWR_MAX_TH 0x1e
> > +#define LTC4283_POWER_MIN_TH 0x1f
> > +#define LTC4283_POWER_MAX_TH 0x20
> > +#define LTC4283_ADC_2_MIN_TH(c) (0x21 + (c) * 2)
> > +#define LTC4283_ADC_2_MAX_TH(c) (0x22 + (c) * 2)
> > +#define LTC4283_ADC_2_MIN_TH_DIFF(c) (0x39 + (c) * 2)
> > +#define LTC4283_ADC_2_MAX_TH_DIFF(c) (0x3a + (c) * 2)
> > +#define LTC4283_SENSE 0x41
> > +#define LTC4283_SENSE_MIN 0x42
> > +#define LTC4283_SENSE_MAX 0x43
> > +#define LTC4283_VPWR 0x44
> > +#define LTC4283_VPWR_MIN 0x45
> > +#define LTC4283_VPWR_MAX 0x46
> > +#define LTC4283_POWER 0x47
> > +#define LTC4283_POWER_MIN 0x48
> > +#define LTC4283_POWER_MAX 0x49
> > +#define LTC4283_RESERVED_68 0x68
> > +#define LTC4283_RESERVED_6D 0x6D
> > +/* get channels from ADC 2 */
> > +#define LTC4283_ADC_2(c) (0x4a + (c) * 3)
> > +#define LTC4283_ADC_2_MIN(c) (0x4b + (c) * 3)
> > +#define LTC4283_ADC_2_MAX(c) (0x4c + (c) * 3)
> > +#define LTC4283_ADC_2_DIFF(c) (0x6e + (c) * 3)
> > +#define LTC4283_ADC_2_MIN_DIFF(c) (0x6f + (c) * 3)
> > +#define LTC4283_ADC_2_MAX_DIFF(c) (0x70 + (c) * 3)
> > +#define LTC4283_ENERGY 0x7a
> > +#define LTC4283_METER_CONTROL 0x84
> > +#define LTC4283_INTEGRATE_I_MASK BIT(0)
> > +#define LTC4283_METER_HALT_MASK BIT(6)
> > +#define LTC4283_RESERVED_86 0x86
> > +#define LTC4283_RESERVED_8F 0x8F
> > +#define LTC4283_FAULT_LOG_CTRL 0x90
> > +#define LTC4283_FAULT_LOG_EN_MASK BIT(7)
> > +#define LTC4283_RESERVED_91 0x91
> > +#define LTC4283_RESERVED_A1 0xA1
> > +#define LTC4283_RESERVED_A3 0xA3
> > +#define LTC4283_RESERVED_AC 0xAC
> > +#define LTC4283_POWER_PLAY_MSB 0xE7
> > +#define LTC4283_POWER_PLAY_LSB 0xE8
> > +#define LTC4283_RESERVED_F1 0xF1
> > +#define LTC4283_RESERVED_FF 0xFF
> > +
> > +/* also applies for differential channels */
> > +#define LTC4283_ADC1_FS_uV 32768
> > +#define LTC4283_ADC2_FS_mV 2048
> > +#define LTC4283_TCONV_uS 64103
> > +#define LTC4283_VILIM_MIN_uV 15000
> > +#define LTC4283_VILIM_MAX_uV 30000
> > +#define LTC4283_VILIM_RANGE \
> > + (LTC4283_VILIM_MAX_uV - LTC4283_VILIM_MIN_uV + 1)
> > +
> > +#define LTC4283_PGIO_FUNC_GPIO 2
> > +#define LTC4283_PGIO2_FUNC_ACLB 3
> > +
> > +/*
> > + * Maximum value for rsense in nano ohms. The reasoning for this value is that
> > + * it's the max value for which multiplying by 256 does not overflow long on
> > + * 32bits. The minimum value is a sane lower bound for rsense for which
> > + * power_max does not overflow 32bits.
> > + */
> > +#define LTC4283_MAX_RSENSE 1677721599
> > +#define LTC4283_MIN_RSENSE 50000
> > +
> > +/* voltage channels */
> > +enum {
> > + LTC4283_CHAN_VIN,
> > + LTC4283_CHAN_VPWR,
> > + LTC4283_CHAN_ADI_1,
> > + LTC4283_CHAN_ADI_2,
> > + LTC4283_CHAN_ADI_3,
> > + LTC4283_CHAN_ADI_4,
> > + LTC4283_CHAN_ADIO_1,
> > + LTC4283_CHAN_ADIO_2,
> > + LTC4283_CHAN_ADIO_3,
> > + LTC4283_CHAN_ADIO_4,
> > + LTC4283_CHAN_DRNS,
> > + LTC4283_CHAN_DRAIN,
> > + /* differential channels */
> > + LTC4283_CHAN_ADIN12,
> > + LTC4283_CHAN_ADIN34,
> > + LTC4283_CHAN_ADIO12,
> > + LTC4283_CHAN_ADIO34,
> > + LTC4283_CHAN_MAX
> > +};
> > +
> > +/* Just for ease of use on the regmap */
> > +#define LTC4283_ADIO34_MAX \
> > + LTC4283_ADC_2_MAX_DIFF(LTC4283_CHAN_ADIO34 - LTC4283_CHAN_ADIN12)
> > +
> > +struct ltc4283_hwmon {
> > + struct regmap *map;
> > + struct i2c_client *client;
> > + unsigned long gpio_mask;
> > + unsigned long ch_enable_mask;
> > + /* in microwatt */
> > + long power_max;
> > + /* in millivolt */
> > + u32 vsense_max;
> > + /* in tenths of microohm */
> > + u32 rsense;
> > + bool energy_en;
> > + bool ext_fault;
> > +};
> > +
> > +static int ltc4283_read_voltage_word(const struct ltc4283_hwmon *st,
> > + u32 reg, u32 fs, long *val)
> > +{
> > + unsigned int __raw;
> > + int ret;
> > +
> > + ret = regmap_read(st->map, reg, &__raw);
> > + if (ret)
> > + return ret;
> > +
> > + *val = DIV_ROUND_CLOSEST(__raw * fs, BIT(16));
> > + return 0;
> > +}
> > +
> > +static int ltc4283_read_voltage_byte(const struct ltc4283_hwmon *st,
> > + u32 reg, u32 fs, long *val)
> > +{
> > + int ret;
> > + u32 in;
> > +
> > + ret = regmap_read(st->map, reg, &in);
> > + if (ret)
> > + return ret;
> > +
> > + *val = DIV_ROUND_CLOSEST(in * fs, BIT(8));
> > + return 0;
> > +}
> > +
> > +static u32 ltc4283_in_reg(u32 attr, u32 channel)
> > +{
> > + switch (attr) {
> > + case hwmon_in_input:
> > + if (channel == LTC4283_CHAN_VPWR)
> > + return LTC4283_VPWR;
> > + if (channel >= LTC4283_CHAN_ADI_1 && channel <= LTC4283_CHAN_DRAIN)
> > + return LTC4283_ADC_2(channel - LTC4283_CHAN_ADI_1);
> > + return LTC4283_ADC_2_DIFF(channel - LTC4283_CHAN_ADIN12);
> > + case hwmon_in_highest:
> > + if (channel == LTC4283_CHAN_VPWR)
> > + return LTC4283_VPWR_MAX;
> > + if (channel >= LTC4283_CHAN_ADI_1 && channel <= LTC4283_CHAN_DRAIN)
> > + return LTC4283_ADC_2_MAX(channel - LTC4283_CHAN_ADI_1);
> > + return LTC4283_ADC_2_MAX_DIFF(channel - LTC4283_CHAN_ADIN12);
> > + case hwmon_in_lowest:
> > + if (channel == LTC4283_CHAN_VPWR)
> > + return LTC4283_VPWR_MIN;
> > + if (channel >= LTC4283_CHAN_ADI_1 && channel <= LTC4283_CHAN_DRAIN)
> > + return LTC4283_ADC_2_MIN(channel - LTC4283_CHAN_ADI_1);
> > + return LTC4283_ADC_2_MIN_DIFF(channel - LTC4283_CHAN_ADIN12);
> > + case hwmon_in_max:
> > + if (channel == LTC4283_CHAN_VPWR)
> > + return LTC4283_VPWR_MAX_TH;
> > + if (channel >= LTC4283_CHAN_ADI_1 && channel <= LTC4283_CHAN_DRAIN)
> > + return LTC4283_ADC_2_MAX_TH(channel - LTC4283_CHAN_ADI_1);
> > + return LTC4283_ADC_2_MAX_TH_DIFF(channel - LTC4283_CHAN_ADIN12);
> > + default:
> > + if (channel == LTC4283_CHAN_VPWR)
> > + return LTC4283_VPWR_MIN_TH;
> > + if (channel >= LTC4283_CHAN_ADI_1 && channel <= LTC4283_CHAN_DRAIN)
> > + return LTC4283_ADC_2_MIN_TH(channel - LTC4283_CHAN_ADI_1);
> > + return LTC4283_ADC_2_MIN_TH_DIFF(channel - LTC4283_CHAN_ADIN12);
> > + }
> > +}
> > +
> > +static int ltc4283_read_in_vals(const struct ltc4283_hwmon *st,
> > + u32 attr, u32 channel, long *val)
> > +{
> > + u32 reg = ltc4283_in_reg(attr, channel);
> > + int ret;
> > +
> > + if (channel < LTC4283_CHAN_ADIN12) {
> > + if (attr != hwmon_in_max && attr != hwmon_in_min)
> > + return ltc4283_read_voltage_word(st, reg,
> > + LTC4283_ADC2_FS_mV,
> > + val);
> > +
> > + return ltc4283_read_voltage_byte(st, reg,
> > + LTC4283_ADC2_FS_mV, val);
> > + }
> > +
> > + if (attr != hwmon_in_max && attr != hwmon_in_min)
> > + ret = ltc4283_read_voltage_word(st, reg,
> > + LTC4283_ADC1_FS_uV, val);
> > + else
> > + ret = ltc4283_read_voltage_byte(st, reg,
> > + LTC4283_ADC1_FS_uV, val);
> > + if (ret)
> > + return ret;
> > +
> > + *val = DIV_ROUND_CLOSEST(*val, MILLI);
> > + return 0;
> > +}
> > +
> > +static int ltc4283_read_alarm(struct ltc4283_hwmon *st, u32 reg,
> > + u32 mask, long *val)
> > +{
> > + u32 alarm;
> > + int ret;
> > +
> > + ret = regmap_read(st->map, reg, &alarm);
> > + if (ret)
> > + return ret;
> > +
> > + *val = !!(alarm & mask);
> > +
> > + /* If not status/fault logs, clear the alarm after reading it. */
> > + if (reg != LTC4283_FAULT_STATUS && reg != LTC4283_FAULT_LOG)
> > + return regmap_clear_bits(st->map, reg, mask);
> > +
> > + return 0;
> > +}
> > +
> > +static int ltc4283_read_in_alarm(struct ltc4283_hwmon *st, u32 channel,
> > + bool max_alm, long *val)
> > +{
> > + if (channel == LTC4283_CHAN_VPWR)
> > + return ltc4283_read_alarm(st, LTC4283_ADC_ALM_LOG_1,
> > + BIT(2 + max_alm), val);
> > +
> > + if (channel >= LTC4283_CHAN_ADI_1 && channel <= LTC4283_CHAN_ADI_4) {
> > + u32 bit = (channel - LTC4283_CHAN_ADI_1) * 2;
> > + /*
> > + * Lower channels go to higher bits. We also want to go +1 down
> > + * in the min_alarm case.
> > + */
> > + return ltc4283_read_alarm(st, LTC4283_ADC_ALM_LOG_2,
> > + BIT(7 - bit - !max_alm), val);
> > + }
> > +
> > + if (channel >= LTC4283_CHAN_ADIO_1 && channel <= LTC4283_CHAN_ADIO_4) {
> > + u32 bit = (channel - LTC4283_CHAN_ADIO_1) * 2;
> > +
> > + return ltc4283_read_alarm(st, LTC4283_ADC_ALM_LOG_3,
> > + BIT(7 - bit - !max_alm), val);
> > + }
> > +
> > + if (channel >= LTC4283_CHAN_ADIN12 && channel <= LTC4283_CHAN_ADIO34) {
> > + u32 bit = (channel - LTC4283_CHAN_ADIN12) * 2;
> > +
> > + return ltc4283_read_alarm(st, LTC4283_ADC_ALM_LOG_5,
> > + BIT(7 - bit - !max_alm), val);
> > + }
> > +
> > + if (channel == LTC4283_CHAN_DRNS)
> > + return ltc4283_read_alarm(st, LTC4283_ADC_ALM_LOG_4,
> > + BIT(6 + max_alm), val);
> > +
> > + return ltc4283_read_alarm(st, LTC4283_ADC_ALM_LOG_4, BIT(4 + max_alm),
> > + val);
> > +}
> > +
> > +static int ltc4283_read_in(struct ltc4283_hwmon *st, u32 attr, u32 channel,
> > + long *val)
> > +{
> > + switch (attr) {
> > + case hwmon_in_input:
> > + if (!test_bit(channel, &st->ch_enable_mask))
> > + return -ENODATA;
> > +
> > + return ltc4283_read_in_vals(st, attr, channel, val);
> > + case hwmon_in_highest:
> > + case hwmon_in_lowest:
> > + case hwmon_in_max:
> > + case hwmon_in_min:
> > + return ltc4283_read_in_vals(st, attr, channel, val);
> > + case hwmon_in_max_alarm:
> > + return ltc4283_read_in_alarm(st, channel, true, val);
> > + case hwmon_in_min_alarm:
> > + return ltc4283_read_in_alarm(st, channel, false, val);
> > + case hwmon_in_crit_alarm:
> > + return ltc4283_read_alarm(st, LTC4283_FAULT_STATUS,
> > + LTC4283_OV_MASK, val);
> > + case hwmon_in_lcrit_alarm:
> > + return ltc4283_read_alarm(st, LTC4283_FAULT_STATUS,
> > + LTC4283_UV_MASK, val);
> > + case hwmon_in_fault:
> > + /*
> > + * We report failure if we detect either a fet_bad or a
> > + * fet_short in the status register.
> > + */
> > + return ltc4283_read_alarm(st, LTC4283_FAULT_STATUS,
> > + LTC4283_FET_BAD_MASK | LTC4283_FET_SHORT_MASK, val);
> > + case hwmon_in_enable:
> > + *val = test_bit(channel, &st->ch_enable_mask);
> > + return 0;
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > + return 0;
> > +}
> > +
> > +static int ltc4283_read_current_word(const struct ltc4283_hwmon *st, u32 reg,
> > + long *val)
> > +{
> > + u64 temp = (u64)LTC4283_ADC1_FS_uV * DECA * MILLI;
> > + unsigned int __raw;
> > + int ret;
> > +
> > + ret = regmap_read(st->map, reg, &__raw);
> > + if (ret)
> > + return ret;
> > +
> > + *val = DIV64_U64_ROUND_CLOSEST(__raw * temp,
> > + BIT_ULL(16) * st->rsense);
> > +
> > + return 0;
> > +}
> > +
> > +static int ltc4283_read_current_byte(const struct ltc4283_hwmon *st, u32 reg,
> > + long *val)
> > +{
> > + u64 temp = (u64)LTC4283_ADC1_FS_uV * DECA * MILLI;
> > + u32 curr;
> > + int ret;
> > +
> > + ret = regmap_read(st->map, reg, &curr);
> > + if (ret)
> > + return ret;
> > +
> > + *val = DIV_ROUND_CLOSEST_ULL(curr * temp, BIT(8) * st->rsense);
> > + return 0;
> > +}
> > +
> > +static int ltc4283_read_curr(struct ltc4283_hwmon *st, u32 attr, long *val)
> > +{
> > + switch (attr) {
> > + case hwmon_curr_input:
> > + return ltc4283_read_current_word(st, LTC4283_SENSE, val);
> > + case hwmon_curr_highest:
> > + return ltc4283_read_current_word(st, LTC4283_SENSE_MAX, val);
> > + case hwmon_curr_lowest:
> > + return ltc4283_read_current_word(st, LTC4283_SENSE_MIN, val);
> > + case hwmon_curr_max:
> > + return ltc4283_read_current_byte(st, LTC4283_SENSE_MAX_TH, val);
> > + case hwmon_curr_min:
> > + return ltc4283_read_current_byte(st, LTC4283_SENSE_MIN_TH, val);
> > + case hwmon_curr_max_alarm:
> > + return ltc4283_read_alarm(st, LTC4283_ADC_ALM_LOG_1,
> > + LTC4283_SENSE_HIGH_ALM, val);
> > + case hwmon_curr_min_alarm:
> > + return ltc4283_read_alarm(st, LTC4283_ADC_ALM_LOG_1,
> > + LTC4283_SENSE_LOW_ALM, val);
> > + case hwmon_curr_crit_alarm:
> > + return ltc4283_read_alarm(st, LTC4283_FAULT_STATUS,
> > + LTC4283_OC_MASK, val);
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > +}
> > +
> > +static int ltc4283_read_power_word(const struct ltc4283_hwmon *st,
> > + u32 reg, long *val)
> > +{
> > + u64 temp = (u64)LTC4283_ADC1_FS_uV * LTC4283_ADC2_FS_mV * DECA * MILLI;
> > + unsigned int __raw;
> > + int ret;
> > +
> > + ret = regmap_read(st->map, reg, &__raw);
> > + if (ret)
> > + return ret;
> > +
> > + /*
> > + * Power is given by:
> > + * P = CODE(16b) * 32.768mV * 2.048V / (2^16 * Rsense)
> > + */
> > + *val = DIV64_U64_ROUND_CLOSEST(temp * __raw, BIT_ULL(16) * st->rsense);
> > +
> > + return 0;
> > +}
> > +
> > +static int ltc4283_read_power_byte(const struct ltc4283_hwmon *st,
> > + u32 reg, long *val)
> > +{
> > + u64 temp = (u64)LTC4283_ADC1_FS_uV * LTC4283_ADC2_FS_mV * DECA * MILLI;
> > + u32 power;
> > + int ret;
> > +
> > + ret = regmap_read(st->map, reg, &power);
> > + if (ret)
> > + return ret;
> > +
> > + *val = DIV_ROUND_CLOSEST_ULL(power * temp, BIT(8) * st->rsense);
> > +
> > + return 0;
> > +}
> > +
> > +static int ltc4283_read_power(struct ltc4283_hwmon *st, u32 attr, long *val)
> > +{
> > + switch (attr) {
> > + case hwmon_power_input:
> > + return ltc4283_read_power_word(st, LTC4283_POWER, val);
> > + case hwmon_power_input_highest:
> > + return ltc4283_read_power_word(st, LTC4283_POWER_MAX, val);
> > + case hwmon_power_input_lowest:
> > + return ltc4283_read_power_word(st, LTC4283_POWER_MIN, val);
> > + case hwmon_power_max_alarm:
> > + return ltc4283_read_alarm(st, LTC4283_ADC_ALM_LOG_1,
> > + LTC4283_POWER_HIGH_ALM, val);
> > + case hwmon_power_min_alarm:
> > + return ltc4283_read_alarm(st, LTC4283_ADC_ALM_LOG_1,
> > + LTC4283_POWER_LOW_ALM, val);
> > + case hwmon_power_max:
> > + return ltc4283_read_power_byte(st, LTC4283_POWER_MAX_TH, val);
> > + case hwmon_power_min:
> > + return ltc4283_read_power_byte(st, LTC4283_POWER_MIN_TH, val);
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > +}
> > +
> > +static int ltc4283_read_energy(struct ltc4283_hwmon *st, u32 attr, s64 *val)
> > +{
> > + u64 temp = LTC4283_ADC1_FS_uV * LTC4283_ADC2_FS_mV, energy, temp_2;
> > + u8 raw[8] = {};
> > + int ret;
> > +
> > + if (!st->energy_en)
> > + return -ENODATA;
> > +
> > + ret = i2c_smbus_read_i2c_block_data(st->client, LTC4283_ENERGY, 6, raw);
> > + if (ret < 0)
> > + return ret;
> > + if (ret != 6)
> > + return -EIO;
> > +
> > + energy = get_unaligned_be64(raw) >> 16;
> > +
> > + /*
> > + * The formula for energy is given by:
> > + * E = CODE(48b) * 32.768mV * 2.048V * Tconv / 2^24 * Rsense
> > + *
> > + * As Rsense can have tenths of micro-ohm resolution, we need to
> > + * multiply by DECA to get microjoule.
> > + */
> > + if (check_mul_overflow(temp * LTC4283_TCONV_uS, energy, &temp_2)) {
> > + /*
> > + * We multiply again by 1000 to make sure that we don't get 0
> > + * in the following division which could happen for big rsense
> > + * values. OTOH, we then divide energy first by 1000 so that
> > + * we do not overflow u64 again for very small rsense values.
> > + * We add a factor of 100 for proper conversion to microjoule.
> > + */
> > + temp_2 = DIV64_U64_ROUND_CLOSEST(temp * LTC4283_TCONV_uS * MILLI,
> > + BIT_ULL(24) * st->rsense);
> > + energy = DIV_ROUND_CLOSEST_ULL(energy, MILLI * CENTI) * temp_2;
> > + } else {
> > + /* Put rsense back into nanoohm so we get microjoule. */
> > + energy = DIV64_U64_ROUND_CLOSEST(temp_2, BIT_ULL(24) * st->rsense * CENTI);
> > + }
> > +
> > + *val = energy;
> > + return 0;
> > +}
> > +
> > +static int ltc4283_read(struct device *dev, enum hwmon_sensor_types type,
> > + u32 attr, int channel, long *val)
> > +{
> > + struct ltc4283_hwmon *st = dev_get_drvdata(dev);
> > +
> > + switch (type) {
> > + case hwmon_in:
> > + return ltc4283_read_in(st, attr, channel, val);
> > + case hwmon_curr:
> > + return ltc4283_read_curr(st, attr, val);
> > + case hwmon_power:
> > + return ltc4283_read_power(st, attr, val);
> > + case hwmon_energy:
> > + *val = st->energy_en;
> > + return 0;
> > + case hwmon_energy64:
> > + return ltc4283_read_energy(st, attr, (s64 *)val);
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > +}
> > +
> > +static int ltc4283_write_power_byte(const struct ltc4283_hwmon *st, u32 reg,
> > + long val)
> > +{
> > + u64 temp = (u64)LTC4283_ADC1_FS_uV * LTC4283_ADC2_FS_mV * DECA * MILLI;
> > + u32 __raw;
> > +
> > + val = clamp_val(val, 0, st->power_max);
> > + __raw = DIV64_U64_ROUND_CLOSEST(val * BIT_ULL(8) * st->rsense, temp);
> > +
> > + return regmap_write(st->map, reg, __raw);
> > +}
> > +
> > +static int ltc4283_write_power_word(const struct ltc4283_hwmon *st,
> > + u32 reg, long val)
> > +{
> > + u64 temp = st->rsense * BIT_ULL(16), temp_2;
> > + u16 __raw;
> > +
> > + if (check_mul_overflow(val, temp, &temp_2)) {
> > + temp = DIV_ROUND_CLOSEST_ULL(temp, DECA * MILLI);
> > + __raw = DIV_ROUND_CLOSEST_ULL(temp * val, LTC4283_ADC1_FS_uV * LTC4283_ADC2_FS_mV);
> > + } else {
> > + temp = (u64)LTC4283_ADC1_FS_uV * LTC4283_ADC2_FS_mV * DECA * MILLI;
> > + __raw = DIV64_U64_ROUND_CLOSEST(temp_2, temp);
> > + }
> > +
> > + return regmap_write(st->map, reg, __raw);
> > +}
> > +
> > +static int ltc4283_reset_power_hist(struct ltc4283_hwmon *st)
> > +{
> > + int ret;
> > +
> > + ret = ltc4283_write_power_word(st, LTC4283_POWER_MIN, st->power_max);
> > + if (ret)
> > + return ret;
> > +
> > + ret = ltc4283_write_power_word(st, LTC4283_POWER_MAX, 0);
> > + if (ret)
> > + return ret;
> > +
> > + /* Clear possible power faults. */
> > + return regmap_clear_bits(st->map, LTC4283_FAULT_LOG,
> > + LTC4283_PWR_FAIL_FAULT_MASK | LTC4283_PGI_FAULT_MASK);
> > +}
> > +
> > +static int ltc4283_write_power(struct ltc4283_hwmon *st, u32 attr, long val)
> > +{
> > + switch (attr) {
> > + case hwmon_power_max:
> > + return ltc4283_write_power_byte(st, LTC4283_POWER_MAX_TH, val);
> > + case hwmon_power_min:
> > + return ltc4283_write_power_byte(st, LTC4283_POWER_MIN_TH, val);
> > + case hwmon_power_reset_history:
> > + return ltc4283_reset_power_hist(st);
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > +}
> > +
> > +static int ltc4283_write_in_history(struct ltc4283_hwmon *st, u32 reg,
> > + long lowest, u32 fs)
> > +{
> > + u32 __raw;
> > + int ret;
> > +
> > + __raw = DIV_ROUND_CLOSEST(BIT(16) * lowest, fs);
> > + if (__raw == BIT(16))
> > + __raw = U16_MAX;
> > +
> > + ret = regmap_write(st->map, reg, __raw);
> > + if (ret)
> > + return ret;
> > +
> > + return regmap_write(st->map, reg + 1, 0);
> > +}
> > +
> > +static int ltc4283_write_in_byte(const struct ltc4283_hwmon *st,
> > + u32 reg, u32 fs, long val)
> > +{
> > + u32 __raw;
> > +
> > + val = clamp_val(val, 0, fs);
> > + __raw = DIV_ROUND_CLOSEST(val * BIT(8), fs);
> > + if (__raw == BIT(8))
> > + __raw = U8_MAX;
> > +
> > + return regmap_write(st->map, reg, __raw);
> > +}
> > +
> > +static int ltc4283_reset_in_hist(struct ltc4283_hwmon *st, u32 channel)
> > +{
> > + u32 reg, fs;
> > + int ret;
> > +
> > + /*
> > + * Make sure to clear possible under/over voltage faults. Otherwise the
> > + * chip won't latch on again.
> > + */
> > + if (channel == LTC4283_CHAN_VIN)
> > + return regmap_clear_bits(st->map, LTC4283_FAULT_LOG,
> > + LTC4283_OV_FAULT_MASK | LTC4283_UV_FAULT_MASK);
> > +
> > + if (channel == LTC4283_CHAN_VPWR)
> > + return ltc4283_write_in_history(st, LTC4283_VPWR_MIN,
> > + LTC4283_ADC2_FS_mV,
> > + LTC4283_ADC2_FS_mV);
> > +
> > + if (channel >= LTC4283_CHAN_ADI_1 && channel <= LTC4283_CHAN_DRAIN) {
> > + fs = LTC4283_ADC2_FS_mV;
> > + reg = LTC4283_ADC_2_MIN(channel - LTC4283_CHAN_ADI_1);
> > + } else {
> > + fs = LTC4283_ADC1_FS_uV;
> > + reg = LTC4283_ADC_2_MIN_DIFF(channel - LTC4283_CHAN_ADIN12);
> > + }
> > +
> > + ret = ltc4283_write_in_history(st, reg, fs, fs);
> > + if (ret)
> > + return ret;
> > + if (channel != LTC4283_CHAN_DRAIN)
> > + return 0;
> > +
> > + /* Then, let's also clear possible fet faults. Same as above. */
> > + return regmap_clear_bits(st->map, LTC4283_FAULT_LOG,
> > + LTC4283_FET_BAD_FAULT_MASK | LTC4283_FET_SHORT_FAULT_MASK);
> > +}
> > +
> > +static int ltc4283_write_in_en(struct ltc4283_hwmon *st, u32 channel, bool en)
> > +{
> > + unsigned int bit, adc_idx = channel - LTC4283_CHAN_ADI_1;
> > + unsigned int reg = LTC4283_ADC_SELECT(adc_idx);
> > + int ret;
> > +
> > + bit = LTC4283_ADC_SELECT_MASK(adc_idx);
> > + if (channel > LTC4283_CHAN_DRAIN)
> > + /* Account for two reserved fields after DRAIN. */
> > + bit <<= 2;
> > +
> > + if (en)
> > + ret = regmap_set_bits(st->map, reg, bit);
> > + else
> > + ret = regmap_clear_bits(st->map, reg, bit);
> > + if (ret)
> > + return ret;
> > +
> > + __assign_bit(channel, &st->ch_enable_mask, en);
> > + return 0;
> > +}
> > +
> > +static int ltc4283_write_minmax(struct ltc4283_hwmon *st, long val,
> > + u32 channel, bool is_max)
> > +{
> > + u32 reg;
> > +
> > + if (channel == LTC4283_CHAN_VPWR) {
> > + if (is_max)
> > + return ltc4283_write_in_byte(st, LTC4283_VPWR_MAX_TH,
> > + LTC4283_ADC2_FS_mV, val);
> > +
> > + return ltc4283_write_in_byte(st, LTC4283_VPWR_MIN_TH,
> > + LTC4283_ADC2_FS_mV, val);
> > + }
> > +
> > + if (channel >= LTC4283_CHAN_ADI_1 && channel <= LTC4283_CHAN_DRAIN) {
> > + if (is_max) {
> > + reg = LTC4283_ADC_2_MAX_TH(channel - LTC4283_CHAN_ADI_1);
> > + return ltc4283_write_in_byte(st, reg,
> > + LTC4283_ADC2_FS_mV, val);
> > + }
> > +
> > + reg = LTC4283_ADC_2_MIN_TH(channel - LTC4283_CHAN_ADI_1);
> > + return ltc4283_write_in_byte(st, reg, LTC4283_ADC2_FS_mV, val);
> > + }
> > +
> > + /* Just sanity check we do not overflow val for 32bit */
> > + val = clamp_val(val * MILLI, 0, LTC4283_ADC1_FS_uV);
> > +
> > + if (is_max) {
> > + reg = LTC4283_ADC_2_MAX_TH_DIFF(channel - LTC4283_CHAN_ADIN12);
> > + return ltc4283_write_in_byte(st, reg, LTC4283_ADC1_FS_uV, val);
> > + }
> > +
> > + reg = LTC4283_ADC_2_MIN_TH_DIFF(channel - LTC4283_CHAN_ADIN12);
> > + return ltc4283_write_in_byte(st, reg, LTC4283_ADC1_FS_uV, val);
> > +}
> > +
> > +static int ltc4283_write_in(struct ltc4283_hwmon *st, u32 attr, long val,
> > + int channel)
> > +{
> > + switch (attr) {
> > + case hwmon_in_max:
> > + return ltc4283_write_minmax(st, val, channel, true);
> > + case hwmon_in_min:
> > + return ltc4283_write_minmax(st, val, channel, false);
> > + case hwmon_in_reset_history:
> > + return ltc4283_reset_in_hist(st, channel);
> > + case hwmon_in_enable:
> > + return ltc4283_write_in_en(st, channel, !!val);
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > +}
> > +
> > +static int ltc4283_write_curr_byte(const struct ltc4283_hwmon *st,
> > + u32 reg, long val)
> > +{
> > + u32 temp = LTC4283_ADC1_FS_uV * DECA * MILLI;
> > + u32 reg_val, isense_max;
> > +
> > + isense_max = DIV_ROUND_CLOSEST(st->vsense_max * MICRO * DECA, st->rsense);
> > + val = clamp_val(val, 0, isense_max);
> > + reg_val = DIV_ROUND_CLOSEST_ULL(val * BIT_ULL(8) * st->rsense, temp);
> > +
> > + return regmap_write(st->map, reg, reg_val);
> > +}
> > +
> > +static int ltc4283_write_curr_history(struct ltc4283_hwmon *st)
> > +{
> > + int ret;
> > +
> > + ret = ltc4283_write_in_history(st, LTC4283_SENSE_MIN,
> > + st->vsense_max * MILLI,
> > + LTC4283_ADC1_FS_uV);
> > + if (ret)
> > + return ret;
> > +
> > + /* Now, let's also clear possible overcurrent logs. */
> > + return regmap_clear_bits(st->map, LTC4283_FAULT_LOG,
> > + LTC4283_OC_FAULT_MASK);
> > +}
> > +
> > +static int ltc4283_write_curr(struct ltc4283_hwmon *st, u32 attr, long val)
> > +{
> > + switch (attr) {
> > + case hwmon_curr_max:
> > + return ltc4283_write_curr_byte(st, LTC4283_SENSE_MAX_TH, val);
> > + case hwmon_curr_min:
> > + return ltc4283_write_curr_byte(st, LTC4283_SENSE_MIN_TH, val);
> > + case hwmon_curr_reset_history:
> > + return ltc4283_write_curr_history(st);
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > +}
> > +
> > +static int ltc4283_energy_enable_set(struct ltc4283_hwmon *st, long val)
> > +{
> > + int ret;
> > +
> > + /* Setting the bit halts the meter. */
> > + val = !!val;
> > + ret = regmap_update_bits(st->map, LTC4283_METER_CONTROL,
> > + LTC4283_METER_HALT_MASK,
> > + FIELD_PREP(LTC4283_METER_HALT_MASK, !val));
> > + if (ret)
> > + return ret;
> > +
> > + st->energy_en = val;
> > +
> > + return 0;
> > +}
> > +
> > +static int ltc4283_write(struct device *dev, enum hwmon_sensor_types type,
> > + u32 attr, int channel, long val)
> > +{
> > + struct ltc4283_hwmon *st = dev_get_drvdata(dev);
> > +
> > + switch (type) {
> > + case hwmon_power:
> > + return ltc4283_write_power(st, attr, val);
> > + case hwmon_in:
> > + return ltc4283_write_in(st, attr, val, channel);
> > + case hwmon_curr:
> > + return ltc4283_write_curr(st, attr, val);
> > + case hwmon_energy:
> > + return ltc4283_energy_enable_set(st, val);
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > +}
> > +
> > +static umode_t ltc4283_in_is_visible(const struct ltc4283_hwmon *st,
> > + u32 attr, int channel)
> > +{
> > + /* If ADIO is set as a GPIO, don't make it visible. */
> > + if (channel >= LTC4283_CHAN_ADIO_1 && channel <= LTC4283_CHAN_ADIO_4) {
> > + /* ADIOX pins come at index 0 in the gpio mask. */
> > + channel -= LTC4283_CHAN_ADIO_1;
> > + if (test_bit(channel, &st->gpio_mask))
> > + return 0;
> > + }
> > +
> > + /* Also take care of differential channels. */
> > + if (channel >= LTC4283_CHAN_ADIO12 && channel <= LTC4283_CHAN_ADIO34) {
> > + channel -= LTC4283_CHAN_ADIO12;
> > + /* If one channel in the pair is used, make it invisible. */
> > + if (test_bit(channel * 2, &st->gpio_mask) ||
> > + test_bit(channel * 2 + 1, &st->gpio_mask))
> > + return 0;
> > + }
> > +
> > + switch (attr) {
> > + case hwmon_in_input:
> > + case hwmon_in_highest:
> > + case hwmon_in_lowest:
> > + case hwmon_in_max_alarm:
> > + case hwmon_in_min_alarm:
> > + case hwmon_in_label:
> > + case hwmon_in_lcrit_alarm:
> > + case hwmon_in_crit_alarm:
> > + case hwmon_in_fault:
> > + return 0444;
> > + case hwmon_in_max:
> > + case hwmon_in_min:
> > + case hwmon_in_enable:
> > + return 0644;
> > + case hwmon_in_reset_history:
> > + return 0200;
> > + default:
> > + return 0;
> > + }
> > +}
> > +
> > +static umode_t ltc4283_curr_is_visible(u32 attr)
> > +{
> > + switch (attr) {
> > + case hwmon_curr_input:
> > + case hwmon_curr_highest:
> > + case hwmon_curr_lowest:
> > + case hwmon_curr_max_alarm:
> > + case hwmon_curr_min_alarm:
> > + case hwmon_curr_crit_alarm:
> > + case hwmon_curr_label:
> > + return 0444;
> > + case hwmon_curr_max:
> > + case hwmon_curr_min:
> > + return 0644;
> > + case hwmon_curr_reset_history:
> > + return 0200;
> > + default:
> > + return 0;
> > + }
> > +}
> > +
> > +static umode_t ltc4283_power_is_visible(u32 attr)
> > +{
> > + switch (attr) {
> > + case hwmon_power_input:
> > + case hwmon_power_input_highest:
> > + case hwmon_power_input_lowest:
> > + case hwmon_power_label:
> > + case hwmon_power_max_alarm:
> > + case hwmon_power_min_alarm:
> > + return 0444;
> > + case hwmon_power_max:
> > + case hwmon_power_min:
> > + return 0644;
> > + case hwmon_power_reset_history:
> > + return 0200;
> > + default:
> > + return 0;
> > + }
> > +}
> > +
> > +static umode_t ltc4283_is_visible(const void *data,
> > + enum hwmon_sensor_types type,
> > + u32 attr, int channel)
> > +{
> > + switch (type) {
> > + case hwmon_in:
> > + return ltc4283_in_is_visible(data, attr, channel);
> > + case hwmon_curr:
> > + return ltc4283_curr_is_visible(attr);
> > + case hwmon_power:
> > + return ltc4283_power_is_visible(attr);
> > + case hwmon_energy:
> > + /* hwmon_energy_enable */
> > + return 0644;
> > + case hwmon_energy64:
> > + /* hwmon_energy_input */
> > + return 0444;
> > + default:
> > + return 0;
> > + }
> > +}
> > +
> > +static const char * const ltc4283_in_strs[] = {
> > + "VIN", "VPWR", "VADI1", "VADI2", "VADI3", "VADI4", "VADIO1", "VADIO2",
> > + "VADIO3", "VADIO4", "DRNS", "DRAIN", "ADIN2-ADIN1", "ADIN4-ADIN3",
> > + "ADIO2-ADIO1", "ADIO4-ADIO3"
> > +};
> > +
> > +static int ltc4283_read_labels(struct device *dev,
> > + enum hwmon_sensor_types type,
> > + u32 attr, int channel, const char **str)
> > +{
> > + switch (type) {
> > + case hwmon_in:
> > + *str = ltc4283_in_strs[channel];
> > + return 0;
> > + case hwmon_curr:
> > + *str = "ISENSE";
> > + return 0;
> > + case hwmon_power:
> > + *str = "Power";
> > + return 0;
> > + default:
> > + return -EOPNOTSUPP;
> > + }
> > +}
> > +
> > +/*
> > + * Set max limits for ISENSE and Power as that depends on the max voltage on
> > + * rsense that is defined in ILIM_ADJUST. This is especially important for power
> > + * because for some rsense and vfsout values, if we allow the default raw 255
> > + * value, that would overflow long in 32bit archs when reading back the max
> > + * power limit.
> > + */
> > +static int ltc4283_set_max_limits(struct ltc4283_hwmon *st, struct device *dev)
> > +{
> > + u32 temp = st->vsense_max * DECA * MICRO;
> > + int ret;
> > +
> > + ret = ltc4283_write_in_byte(st, LTC4283_SENSE_MAX_TH, LTC4283_ADC1_FS_uV,
> > + st->vsense_max * MILLI);
> > + if (ret)
> > + return ret;
> > +
> > + /* Power is given by ISENSE * Vout. */
> > + st->power_max = DIV_ROUND_CLOSEST(temp, st->rsense) * LTC4283_ADC2_FS_mV;
> > + return ltc4283_write_power_byte(st, LTC4283_POWER_MAX_TH, st->power_max);
> > +}
> > +
> > +static int ltc4283_parse_array_prop(const struct ltc4283_hwmon *st,
> > + struct device *dev, const char *prop,
> > + const u32 *vals, u32 n_vals)
> > +{
> > + u32 prop_val;
> > + int ret;
> > + u32 i;
> > +
> > + ret = device_property_read_u32(dev, prop, &prop_val);
> > + if (ret)
> > + return n_vals;
> > +
> > + for (i = 0; i < n_vals; i++) {
> > + if (prop_val != vals[i])
> > + continue;
> > +
> > + return i;
> > + }
> > +
> > + return dev_err_probe(dev, -EINVAL,
> > + "Invalid %s property value %u, expected one of: %*ph\n",
> > + prop, prop_val, n_vals, vals);
> > +}
> > +
> > +static int ltc4283_get_defaults(struct ltc4283_hwmon *st)
> > +{
> > + u32 reg_val, ilm_adjust, c;
> > + int ret;
> > +
> > + ret = regmap_read(st->map, LTC4283_METER_CONTROL, &reg_val);
> > + if (ret)
> > + return ret;
> > +
> > + st->energy_en = !FIELD_GET(LTC4283_METER_HALT_MASK, reg_val);
> > +
> > + ret = regmap_read(st->map, LTC4283_CONFIG_1, &reg_val);
> > + if (ret)
> > + return ret;
> > +
> > + ilm_adjust = FIELD_GET(LTC4283_ILIM_MASK, reg_val);
> > + st->vsense_max = LTC4283_VILIM_MIN_uV / MILLI + ilm_adjust;
> > +
> > + ret = regmap_read(st->map, LTC4283_PGIO_CONFIG, &reg_val);
> > + if (ret)
> > + return ret;
> > +
> > + /* Can be later overwritten in ltc4283_pgio_config() */
> > + if (FIELD_GET(LTC4283_PGIO4_CFG_MASK, reg_val) < LTC4283_PGIO_FUNC_GPIO)
> > + st->ext_fault = true;
> > +
> > + /* VPWR and VIN are always enabled */
> > + __set_bit(LTC4283_CHAN_VIN, &st->ch_enable_mask);
> > + __set_bit(LTC4283_CHAN_VPWR, &st->ch_enable_mask);
> > + for (c = LTC4283_CHAN_ADI_1; c < LTC4283_CHAN_MAX; c++) {
> > + u32 chan = c - LTC4283_CHAN_ADI_1, bit;
> > +
> > + ret = regmap_read(st->map, LTC4283_ADC_SELECT(chan), &reg_val);
> > + if (ret)
> > + return ret;
> > +
> > + bit = LTC4283_ADC_SELECT_MASK(chan);
> > + if (c > LTC4283_CHAN_DRAIN)
> > + /* account for two reserved fields after DRAIN */
> > + bit <<= 2;
> > +
> > + if (!(bit & reg_val))
> > + continue;
> > +
> > + __set_bit(c, &st->ch_enable_mask);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static const char * const ltc4283_pgio1_funcs[] = {
> > + "inverted_power_good", "power_good", "gpio"
> > +};
> > +
> > +static const char * const ltc4283_pgio2_funcs[] = {
> > + "inverted_power_good", "power_good", "gpio", "active_current_limiting"
> > +};
> > +
> > +static const char * const ltc4283_pgio3_funcs[] = {
> > + "inverted_power_good_input", "power_good_input", "gpio"
> > +};
> > +
> > +static const char * const ltc4283_pgio4_funcs[] = {
> > + "inverted_external_fault", "external_fault", "gpio"
> > +};
> > +
> > +enum {
> > + LTC4283_PIN_ADIO1,
> > + LTC4283_PIN_ADIO2,
> > + LTC4283_PIN_ADIO3,
> > + LTC4283_PIN_ADIO4,
> > + LTC4283_PIN_PGIO1,
> > + LTC4283_PIN_PGIO2,
> > + LTC4283_PIN_PGIO3,
> > + LTC4283_PIN_PGIO4,
> > +};
> > +
> > +static int ltc4283_pgio_config(struct ltc4283_hwmon *st, struct device *dev)
> > +{
> > + int ret, func;
> > +
> > + func = device_property_match_property_string(dev, "adi,pgio1-func",
> > + ltc4283_pgio1_funcs,
> > +
> > ARRAY_SIZE(ltc4283_pgio1_funcs));
> > + if (func < 0 && func != -EINVAL)
> > + return dev_err_probe(dev, func,
> > + "Invalid adi,pgio1-func property\n");
> > + if (func >= 0) {
> > + if (func == LTC4283_PGIO_FUNC_GPIO) {
> > + __set_bit(LTC4283_PIN_PGIO1, &st->gpio_mask);
> > + /* If GPIO, default to an input pin. */
> > + func++;
> > + }
> > +
> > + ret = regmap_update_bits(st->map, LTC4283_PGIO_CONFIG,
> > + LTC4283_PGIO1_CFG_MASK,
> > + FIELD_PREP(LTC4283_PGIO1_CFG_MASK,
> > func));
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + func = device_property_match_property_string(dev, "adi,pgio2-func",
> > + ltc4283_pgio2_funcs,
> > +
> > ARRAY_SIZE(ltc4283_pgio2_funcs));
> > +
> > + if (func < 0 && func != -EINVAL)
> > + return dev_err_probe(dev, func,
> > + "Invalid adi,pgio2-func property\n");
> > + if (func >= 0) {
> > + if (func != LTC4283_PGIO2_FUNC_ACLB) {
> > + if (func == LTC4283_PGIO_FUNC_GPIO) {
> > + __set_bit(LTC4283_PIN_PGIO2, &st->gpio_mask);
> > + func++;
> > + }
> > +
> > + ret = regmap_update_bits(st->map, LTC4283_PGIO_CONFIG,
> > + LTC4283_PGIO2_CFG_MASK,
> > +
> > FIELD_PREP(LTC4283_PGIO2_CFG_MASK, func));
> > + } else {
> > + ret = regmap_set_bits(st->map, LTC4283_CONTROL_1,
> > + LTC4283_PIGIO2_ACLB_MASK);
> > + }
> > +
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + func = device_property_match_property_string(dev, "adi,pgio3-func",
> > + ltc4283_pgio3_funcs,
> > +
> > ARRAY_SIZE(ltc4283_pgio3_funcs));
> > +
> > + if (func < 0 && func != -EINVAL)
> > + return dev_err_probe(dev, func,
> > + "Invalid adi,pgio3-func property\n");
> > + if (func >= 0) {
> > + if (func == LTC4283_PGIO_FUNC_GPIO) {
> > + __set_bit(LTC4283_PIN_PGIO3, &st->gpio_mask);
> > + func++;
> > + }
> > +
> > + ret = regmap_update_bits(st->map, LTC4283_PGIO_CONFIG,
> > + LTC4283_PGIO3_CFG_MASK,
> > + FIELD_PREP(LTC4283_PGIO3_CFG_MASK,
> > func));
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + func = device_property_match_property_string(dev, "adi,pgio4-func",
> > + ltc4283_pgio4_funcs,
> > +
> > ARRAY_SIZE(ltc4283_pgio4_funcs));
> > +
> > + if (func < 0 && func != -EINVAL)
> > + return dev_err_probe(dev, func,
> > + "Invalid adi,pgio4-func property\n");
> > + if (func >= 0) {
> > + if (func == LTC4283_PGIO_FUNC_GPIO) {
> > + __set_bit(LTC4283_PIN_PGIO4, &st->gpio_mask);
> > + func++;
> > + st->ext_fault = false;
> > + } else {
> > + st->ext_fault = true;
> > + }
> > +
> > + ret = regmap_update_bits(st->map, LTC4283_PGIO_CONFIG,
> > + LTC4283_PGIO4_CFG_MASK,
> > + FIELD_PREP(LTC4283_PGIO4_CFG_MASK,
> > func));
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int ltc4283_adio_config(struct ltc4283_hwmon *st, struct device *dev,
> > + const char *prop, u32 pin)
> > +{
> > + u32 adc_idx;
> > + int ret;
> > +
> > + if (!device_property_read_bool(dev, prop))
> > + return 0;
> > +
> > + adc_idx = LTC4283_CHAN_ADIO_1 - LTC4283_CHAN_ADI_1 + pin;
> > + ret = regmap_clear_bits(st->map, LTC4283_ADC_SELECT(adc_idx),
> > + LTC4283_ADC_SELECT_MASK(adc_idx));
> > + if (ret)
> > + return ret;
> > +
> > + __set_bit(pin, &st->gpio_mask);
> > + return 0;
> > +}
> > +
> > +static int ltc4283_pin_config(struct ltc4283_hwmon *st, struct device *dev)
> > +{
> > + int ret;
> > +
> > + ret = ltc4283_pgio_config(st, dev);
> > + if (ret)
> > + return ret;
> > +
> > + ret = ltc4283_adio_config(st, dev, "adi,gpio-on-adio1",
> > LTC4283_PIN_ADIO1);
> > + if (ret)
> > + return ret;
> > +
> > + ret = ltc4283_adio_config(st, dev, "adi,gpio-on-adio2",
> > LTC4283_PIN_ADIO2);
> > + if (ret)
> > + return ret;
> > +
> > + ret = ltc4283_adio_config(st, dev, "adi,gpio-on-adio3",
> > LTC4283_PIN_ADIO3);
> > + if (ret)
> > + return ret;
> > +
> > + return ltc4283_adio_config(st, dev, "adi,gpio-on-adio4",
> > LTC4283_PIN_ADIO4);
> > +}
> > +
> > +static const char * const ltc4283_oc_fet_retry[] = {
> > + "latch-off", "1", "7", "unlimited"
> > +};
> > +
> > +static const u32 ltc4283_fb_factor[] = {
> > + 100, 50, 20, 10
> > +};
> > +
> > +static const u32 ltc4283_cooling_dl[] = {
> > + 512, 1002, 2005, 4100, 8190, 16400, 32800, 65600
> > +};
> > +
> > +static const u32 ltc4283_fet_bad_delay[] = {
> > + 256, 512, 1002, 2005
> > +};
> > +
> > +static int ltc4283_setup(struct ltc4283_hwmon *st, struct device *dev)
> > +{
> > + u32 val;
> > + int ret;
> > +
> > + /* The part has an eeprom so let's get the needed defaults from it */
> > + ret = ltc4283_get_defaults(st);
> > + if (ret)
> > + return ret;
> > +
> > + /*
> > + * Default to LTC4283_MIN_RSENSE so we can probe without FW properties.
> > + */
> > + st->rsense = LTC4283_MIN_RSENSE;
> > + ret = device_property_read_u32(dev, "adi,rsense-nano-ohms",
> > + &st->rsense);
> > + if (!ret) {
> > + if (st->rsense < LTC4283_MIN_RSENSE || st->rsense >
> > LTC4283_MAX_RSENSE)
> > + return dev_err_probe(dev, -EINVAL,
> > + "adi,rsense-nano-ohms(%u) too small or too large [%u %u]\n",
> > + st->rsense, LTC4283_MIN_RSENSE, LTC4283_MAX_RSENSE);
> > + }
> > +
> > + /*
> > + * The resolution for rsense is tenths of micro (eg: 62.5 uOhm) which
> > + * means we need nano in the bindings. However, to make things easier to
> > + * handle (with respect to overflows) we divide it by 100 as we don't
> > + * really need the last two digits.
> > + */
> > + st->rsense /= CENTI;
> > +
> > + ret = device_property_read_u32(dev, "adi,current-limit-sense-microvolt",
> > + &st->vsense_max);
> > + if (!ret) {
> > + u32 reg_val;
> > +
> > + if (!in_range(st->vsense_max, LTC4283_VILIM_MIN_uV,
> > + LTC4283_VILIM_RANGE)) {
> > + return dev_err_probe(dev, -EINVAL,
> > + "adi,current-limit-sense-microvolt (%u) out of range [%u %u]\n",
> > + st->vsense_max, LTC4283_VILIM_MIN_uV,
> > + LTC4283_VILIM_MAX_uV);
> > + }
> > +
> > + st->vsense_max /= MILLI;
> > + reg_val = FIELD_PREP(LTC4283_ILIM_MASK,
> > + st->vsense_max - LTC4283_VILIM_MIN_uV /
> > MILLI);
> > + ret = regmap_update_bits(st->map, LTC4283_CONFIG_1,
> > + LTC4283_ILIM_MASK, reg_val);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + ret = ltc4283_parse_array_prop(st, dev, "adi,current-limit-foldback-factor",
> > + ltc4283_fb_factor, ARRAY_SIZE(ltc4283_fb_factor));
> > + if (ret < 0)
> > + return ret;
> > + if (ret < ARRAY_SIZE(ltc4283_fb_factor)) {
> > + ret = regmap_update_bits(st->map, LTC4283_CONFIG_1,
> > LTC4283_FB_MASK,
> > + FIELD_PREP(LTC4283_FB_MASK, ret));
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + ret = ltc4283_parse_array_prop(st, dev, "adi,cooling-delay-ms",
> > + ltc4283_cooling_dl,
> > ARRAY_SIZE(ltc4283_cooling_dl));
> > + if (ret < 0)
> > + return ret;
> > + if (ret < ARRAY_SIZE(ltc4283_cooling_dl)) {
> > + ret = regmap_update_bits(st->map, LTC4283_CONFIG_2,
> > LTC4283_COOLING_DL_MASK,
> > + FIELD_PREP(LTC4283_COOLING_DL_MASK,
> > ret));
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + ret = ltc4283_parse_array_prop(st, dev, "adi,fet-bad-timer-delay-ms",
> > + ltc4283_fet_bad_delay,
> > ARRAY_SIZE(ltc4283_fet_bad_delay));
> > + if (ret < 0)
> > + return ret;
> > + if (ret < ARRAY_SIZE(ltc4283_fet_bad_delay)) {
> > + ret = regmap_update_bits(st->map, LTC4283_CONFIG_2,
> > LTC4283_FTBD_DL_MASK,
> > + FIELD_PREP(LTC4283_FTBD_DL_MASK, ret));
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + ret = ltc4283_set_max_limits(st, dev);
> > + if (ret)
> > + return ret;
> > +
> > + ret = ltc4283_pin_config(st, dev);
> > + if (ret)
> > + return ret;
> > +
> > + if (device_property_read_bool(dev, "adi,power-good-reset-on-fet")) {
> > + ret = regmap_clear_bits(st->map, LTC4283_CONTROL_1,
> > + LTC4283_PWRGD_RST_CTRL_MASK);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (device_property_read_bool(dev, "adi,fet-turn-off-disable")) {
> > + ret = regmap_clear_bits(st->map, LTC4283_CONTROL_1,
> > + LTC4283_FET_BAD_OFF_MASK);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (device_property_read_bool(dev, "adi,tmr-pull-down-disable")) {
> > + ret = regmap_set_bits(st->map, LTC4283_CONTROL_1,
> > + LTC4283_THERM_TMR_MASK);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (device_property_read_bool(dev, "adi,dvdt-inrush-control-disable")) {
> > + ret = regmap_clear_bits(st->map, LTC4283_CONTROL_1,
> > + LTC4283_DVDT_MASK);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (device_property_read_bool(dev, "adi,undervoltage-retry-disable")) {
> > + ret = regmap_clear_bits(st->map, LTC4283_CONTROL_2,
> > + LTC4283_UV_RETRY_MASK);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (device_property_read_bool(dev, "adi,overvoltage-retry-disable")) {
> > + ret = regmap_clear_bits(st->map, LTC4283_CONTROL_2,
> > + LTC4283_OV_RETRY_MASK);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (device_property_read_bool(dev, "adi,external-fault-retry-enable")) {
> > + if (!st->ext_fault)
> > + return dev_err_probe(dev, -EINVAL,
> > + "adi,external-fault-retry-enable set but PGIO4 not configured\n");
> > + ret = regmap_set_bits(st->map, LTC4283_CONTROL_2,
> > + LTC4283_EXT_FAULT_RETRY_MASK);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (device_property_read_bool(dev, "adi,fault-log-enable")) {
> > + ret = regmap_set_bits(st->map, LTC4283_FAULT_LOG_CTRL,
> > + LTC4283_FAULT_LOG_EN_MASK);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + ret = device_property_match_property_string(dev, "adi,overcurrent-retries",
> > + ltc4283_oc_fet_retry,
> > + ARRAY_SIZE(ltc4283_oc_fet_retry));
> > + /* We still want to catch when an invalid string is given. */
> > + if (ret < 0 && ret != -EINVAL)
> > + return dev_err_probe(dev, ret,
> > + "adi,overcurrent-retries invalid value\n");
> > + if (ret >= 0) {
> > + ret = regmap_update_bits(st->map, LTC4283_CONTROL_2,
> > + LTC4283_OC_RETRY_MASK,
> > + FIELD_PREP(LTC4283_OC_RETRY_MASK,
> > ret));
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + ret = device_property_match_property_string(dev, "adi,fet-bad-retries",
> > + ltc4283_oc_fet_retry,
> > +
> > ARRAY_SIZE(ltc4283_oc_fet_retry));
> > + if (ret < 0 && ret != -EINVAL)
> > + return dev_err_probe(dev, ret,
> > + "adi,fet-bad-retries invalid value\n");
> > + if (ret >= 0) {
> > + ret = regmap_update_bits(st->map, LTC4283_CONTROL_2,
> > + LTC4283_FET_BAD_RETRY_MASK,
> > + FIELD_PREP(LTC4283_FET_BAD_RETRY_MASK,
> > ret));
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (device_property_read_bool(dev, "adi,external-fault-fet-off-enable")) {
> > + if (!st->ext_fault)
> > + return dev_err_probe(dev, -EINVAL,
> > + "adi,external-fault-fet-off-enable set but PGIO4 not configured\n");
> > + ret = regmap_set_bits(st->map, LTC4283_CONFIG_3,
> > + LTC4283_EXTFLT_TURN_OFF_MASK);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (device_property_read_bool(dev, "adi,vpower-drns-enable")) {
> > + u32 chan = LTC4283_CHAN_DRNS - LTC4283_CHAN_ADI_1;
> > +
> > + __clear_bit(LTC4283_CHAN_DRNS, &st->ch_enable_mask);
> > + /*
> > + * Then, let's by default disable DRNS from ADC2 given that it
> > + * is already being monitored by the VPWR channel. One can still
> > + * enable it later on if needed.
> > + */
> > + ret = regmap_clear_bits(st->map, LTC4283_ADC_SELECT(chan),
> > + LTC4283_ADC_SELECT_MASK(chan));
> > + if (ret)
> > + return ret;
> > +
> > + val = 1;
> > + } else {
> > + val = 0;
> > + }
> > +
> > + ret = regmap_update_bits(st->map, LTC4283_CONFIG_3,
> > + LTC4283_VPWR_DRNS_MASK,
> > + FIELD_PREP(LTC4283_VPWR_DRNS_MASK, val));
> > + if (ret)
> > + return ret;
> > +
> > + /* Make sure the ADC has 12bit resolution since we're assuming that. */
> > + ret = regmap_update_bits(st->map, LTC4283_PGIO_CONFIG_2,
> > + LTC4283_ADC_MASK,
> > + FIELD_PREP(LTC4283_ADC_MASK, 3));
> > + if (ret)
> > + return ret;
> > +
> > + /* Energy reads (which are 6 byte block reads) rely on page access */
> > + ret = regmap_set_bits(st->map, LTC4283_CONTROL_1, LTC4283_RW_PAGE_MASK);
> > + if (ret)
> > + return ret;
> > +
> > + /*
> > + * Make sure we are integrating power as we only support reporting
> > + * consumed energy.
> > + */
> > + return regmap_clear_bits(st->map, LTC4283_METER_CONTROL,
> > + LTC4283_INTEGRATE_I_MASK);
> > +}
> > +
> > +static const struct hwmon_channel_info * const ltc4283_info[] = {
> > + HWMON_CHANNEL_INFO(in,
> > + HWMON_I_LCRIT_ALARM | HWMON_I_CRIT_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_MAX_ALARM | HWMON_I_RESET_HISTORY |
> > + HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_FAULT | HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL,
> > + HWMON_I_INPUT | HWMON_I_LOWEST | HWMON_I_HIGHEST |
> > + HWMON_I_MAX | HWMON_I_MIN | HWMON_I_MIN_ALARM |
> > + HWMON_I_RESET_HISTORY | HWMON_I_MAX_ALARM |
> > + HWMON_I_ENABLE | HWMON_I_LABEL),
> > + HWMON_CHANNEL_INFO(curr,
> > + HWMON_C_INPUT | HWMON_C_LOWEST | HWMON_C_HIGHEST |
> > + HWMON_C_MAX | HWMON_C_MIN | HWMON_C_MIN_ALARM |
> > + HWMON_C_MAX_ALARM | HWMON_C_CRIT_ALARM |
> > + HWMON_C_RESET_HISTORY | HWMON_C_LABEL),
> > + HWMON_CHANNEL_INFO(power,
> > + HWMON_P_INPUT | HWMON_P_INPUT_LOWEST |
> > + HWMON_P_INPUT_HIGHEST | HWMON_P_MAX | HWMON_P_MIN |
> > + HWMON_P_MAX_ALARM | HWMON_P_MIN_ALARM |
> > + HWMON_P_RESET_HISTORY | HWMON_P_LABEL),
> > + HWMON_CHANNEL_INFO(energy,
> > + HWMON_E_ENABLE),
> > + HWMON_CHANNEL_INFO(energy64,
> > + HWMON_E_INPUT),
> > + NULL
> > +};
> > +
> > +static const struct hwmon_ops ltc4283_ops = {
> > + .read = ltc4283_read,
> > + .write = ltc4283_write,
> > + .is_visible = ltc4283_is_visible,
> > + .read_string = ltc4283_read_labels,
> > +};
> > +
> > +static const struct hwmon_chip_info ltc4283_chip_info = {
> > + .ops = &ltc4283_ops,
> > + .info = ltc4283_info,
> > +};
> > +
> > +static int ltc4283_show_fault_log(void *arg, u64 *val, u32 mask)
> > +{
> > + struct ltc4283_hwmon *st = arg;
> > + long alarm;
> > + int ret;
> > +
> > + ret = ltc4283_read_alarm(st, LTC4283_FAULT_LOG, mask, &alarm);
> > + if (ret)
> > + return ret;
> > +
> > + *val = alarm;
> > +
> > + return 0;
> > +}
> > +
> > +static int ltc4283_show_in0_lcrit_fault_log(void *arg, u64 *val)
> > +{
> > + return ltc4283_show_fault_log(arg, val, LTC4283_UV_FAULT_MASK);
> > +}
> > +DEFINE_DEBUGFS_ATTRIBUTE(ltc4283_in0_lcrit_fault_log,
> > + ltc4283_show_in0_lcrit_fault_log, NULL, "%llu\n");
> > +
> > +static int ltc4283_show_in0_crit_fault_log(void *arg, u64 *val)
> > +{
> > + return ltc4283_show_fault_log(arg, val, LTC4283_OV_FAULT_MASK);
> > +}
> > +DEFINE_DEBUGFS_ATTRIBUTE(ltc4283_in0_crit_fault_log,
> > + ltc4283_show_in0_crit_fault_log, NULL, "%llu\n");
> > +
> > +static int ltc4283_show_fet_bad_fault_log(void *arg, u64 *val)
> > +{
> > + return ltc4283_show_fault_log(arg, val, LTC4283_FET_BAD_FAULT_MASK);
> > +}
> > +DEFINE_DEBUGFS_ATTRIBUTE(ltc4283_fet_bad_fault_log,
> > + ltc4283_show_fet_bad_fault_log, NULL, "%llu\n");
> > +
> > +static int ltc4283_show_fet_short_fault_log(void *arg, u64 *val)
> > +{
> > + return ltc4283_show_fault_log(arg, val, LTC4283_FET_SHORT_FAULT_MASK);
> > +}
> > +DEFINE_DEBUGFS_ATTRIBUTE(ltc4283_fet_short_fault_log,
> > + ltc4283_show_fet_short_fault_log, NULL, "%llu\n");
> > +
> > +static int ltc4283_show_curr1_crit_fault_log(void *arg, u64 *val)
> > +{
> > + return ltc4283_show_fault_log(arg, val, LTC4283_OC_FAULT_MASK);
> > +}
> > +DEFINE_DEBUGFS_ATTRIBUTE(ltc4283_curr1_crit_fault_log,
> > + ltc4283_show_curr1_crit_fault_log, NULL, "%llu\n");
> > +
> > +static int ltc4283_show_power1_failed_fault_log(void *arg, u64 *val)
> > +{
> > + return ltc4283_show_fault_log(arg, val, LTC4283_PWR_FAIL_FAULT_MASK);
> > +}
> > +DEFINE_DEBUGFS_ATTRIBUTE(ltc4283_power1_failed_fault_log,
> > + ltc4283_show_power1_failed_fault_log, NULL, "%llu\n");
> > +
> > +static int ltc4283_show_power1_good_input_fault_log(void *arg, u64 *val)
> > +{
> > + return ltc4283_show_fault_log(arg, val, LTC4283_PGI_FAULT_MASK);
> > +}
> > +DEFINE_DEBUGFS_ATTRIBUTE(ltc4283_power1_good_input_fault_log,
> > + ltc4283_show_power1_good_input_fault_log, NULL,
> > "%llu\n");
> > +
> > +static void ltc4283_debugfs_init(struct ltc4283_hwmon *st, struct i2c_client *i2c)
> > +{
> > + debugfs_create_file_unsafe("in0_crit_fault_log", 0400, i2c->debugfs, st,
> > + &ltc4283_in0_crit_fault_log);
> > + debugfs_create_file_unsafe("in0_lcrit_fault_log", 0400, i2c->debugfs, st,
> > + &ltc4283_in0_lcrit_fault_log);
> > + debugfs_create_file_unsafe("in0_fet_bad_fault_log", 0400, i2c->debugfs, st,
> > + &ltc4283_fet_bad_fault_log);
> > + debugfs_create_file_unsafe("in0_fet_short_fault_log", 0400, i2c->debugfs, st,
> > + &ltc4283_fet_short_fault_log);
> > + debugfs_create_file_unsafe("curr1_crit_fault_log", 0400, i2c->debugfs, st,
> > + &ltc4283_curr1_crit_fault_log);
> > + debugfs_create_file_unsafe("power1_failed_fault_log", 0400, i2c->debugfs, st,
> > + &ltc4283_power1_failed_fault_log);
> > + debugfs_create_file_unsafe("power1_good_input_fault_log", 0400, i2c->debugfs,
> > + st, &ltc4283_power1_good_input_fault_log);
> > +}
> > +
> > +static bool ltc4283_is_word_reg(unsigned int reg)
> > +{
> > + return reg >= LTC4283_SENSE && reg <= LTC4283_ADIO34_MAX;
> > +}
> > +
> > +static int ltc4283_reg_read(void *context, unsigned int reg, unsigned int *val)
> > +{
> > + struct i2c_client *client = context;
> > + int ret;
> > +
> > + if (ltc4283_is_word_reg(reg))
> > + ret = i2c_smbus_read_word_swapped(client, reg);
> > + else
> > + ret = i2c_smbus_read_byte_data(client, reg);
> > +
> > + if (ret < 0)
> > + return ret;
> > +
> > + *val = ret;
> > + return 0;
> > +}
> > +
> > +static int ltc4283_reg_write(void *context, unsigned int reg, unsigned int val)
> > +{
> > + struct i2c_client *client = context;
> > +
> > + if (ltc4283_is_word_reg(reg))
> > + return i2c_smbus_write_word_swapped(client, reg, val);
> > +
> > + return i2c_smbus_write_byte_data(client, reg, val);
> > +}
> > +
> > +static const struct regmap_bus ltc4283_regmap_bus = {
> > + .reg_read = ltc4283_reg_read,
> > + .reg_write = ltc4283_reg_write,
> > +};
> > +
> > +static bool ltc4283_writable_reg(struct device *dev, unsigned int reg)
> > +{
> > + switch (reg) {
> > + case LTC4283_SYSTEM_STATUS ... LTC4283_FAULT_STATUS:
> > + return false;
> > + case LTC4283_RESERVED_OC:
> > + return false;
> > + case LTC4283_RESERVED_86 ... LTC4283_RESERVED_8F:
> > + return false;
> > + case LTC4283_RESERVED_91 ... LTC4283_RESERVED_A1:
> > + return false;
> > + case LTC4283_RESERVED_A3:
> > + return false;
> > + case LTC4283_RESERVED_AC:
> > + return false;
> > + case LTC4283_POWER_PLAY_MSB ... LTC4283_POWER_PLAY_LSB:
> > + return false;
> > + case LTC4283_RESERVED_F1 ... LTC4283_RESERVED_FF:
> > + return false;
> > + default:
> > + return true;
> > + }
> > +}
> > +
> > +static const struct regmap_config ltc4283_regmap_config = {
> > + .reg_bits = 8,
> > + .val_bits = 16,
> > + .max_register = 0xFF,
> > + .writeable_reg = ltc4283_writable_reg,
> > +};
> > +
> > +static int ltc4283_probe(struct i2c_client *client)
> > +{
> > + struct device *dev = &client->dev, *hwmon;
> > + struct auxiliary_device *adev;
> > + struct ltc4283_hwmon *st;
> > + int ret, id;
> > +
> > + st = devm_kzalloc(dev, sizeof(*st), GFP_KERNEL);
> > + if (!st)
> > + return -ENOMEM;
> > +
> > + if (!i2c_check_functionality(client->adapter,
> > + I2C_FUNC_SMBUS_BYTE_DATA |
> > + I2C_FUNC_SMBUS_WORD_DATA |
> > + I2C_FUNC_SMBUS_READ_I2C_BLOCK))
> > + return -EOPNOTSUPP;
> > +
> > + st->client = client;
> > + st->map = devm_regmap_init(dev, &ltc4283_regmap_bus, client,
> > + &ltc4283_regmap_config);
> > + if (IS_ERR(st->map))
> > + return dev_err_probe(dev, PTR_ERR(st->map),
> > + "Failed to create regmap\n");
> > +
> > + ret = ltc4283_setup(st, dev);
> > + if (ret)
> > + return ret;
> > +
> > + hwmon = devm_hwmon_device_register_with_info(dev, "ltc4283", st,
> > + &ltc4283_chip_info, NULL);
> > +
> > + if (IS_ERR(hwmon))
> > + return PTR_ERR(hwmon);
> > +
> > + ltc4283_debugfs_init(st, client);
> > +
> > + if (!st->gpio_mask)
> > + return 0;
> > +
> > + id = (client->adapter->nr << 10) | client->addr;
> > + adev = __devm_auxiliary_device_create(dev, KBUILD_MODNAME, "gpio",
> > + NULL, id);
> > + if (!adev)
> > + return dev_err_probe(dev, -ENODEV, "Failed to add GPIO device\n");
> > +
> > + return 0;
> > +}
> > +
> > +static const struct of_device_id ltc4283_of_match[] = {
> > + { .compatible = "adi,ltc4283" },
> > + { }
> > +};
> > +
> > +static const struct i2c_device_id ltc4283_i2c_id[] = {
> > + { "ltc4283" },
> > + { }
> > +};
> > +MODULE_DEVICE_TABLE(i2c, ltc4283_i2c_id);
> > +
> > +static struct i2c_driver ltc4283_driver = {
> > + .driver = {
> > + .name = "ltc4283",
> > + .of_match_table = ltc4283_of_match,
> > + },
> > + .probe = ltc4283_probe,
> > + .id_table = ltc4283_i2c_id,
> > +};
> > +module_i2c_driver(ltc4283_driver);
> > +
> > +MODULE_AUTHOR("Nuno Sá <nuno.sa@analog.com>");
> > +MODULE_DESCRIPTION("LTC4283 Hot Swap Controller driver");
> > +MODULE_LICENSE("GPL");
> >
^ permalink raw reply
* Re: [RFC PATCH] Documentation: Add managed interrupts
From: Jonathan Corbet @ 2026-04-11 13:08 UTC (permalink / raw)
To: Thomas Gleixner, Sebastian Andrzej Siewior, linux-doc,
linux-kernel
Cc: Aaron Tomlin, Christoph Hellwig, Frederic Weisbecker, Jens Axboe,
Ming Lei, Valentin Schneider, Waiman Long, Peter Zijlstra,
John Ogness
In-Reply-To: <87qzomzv8q.ffs@tglx>
Thomas Gleixner <tglx@kernel.org> writes:
> On Thu, Apr 09 2026 at 08:32, Jonathan Corbet wrote:
>>> This documents what we have as of today and how it works. I added some
>>> examples how the parameter affects the configuration. Did I miss
>>> something?
>>
>> There's been a lot of silence on this one... should I pick this one up,
>> or are there other plans for it...?
>
> Looks good to me.
>
> Acked-by: Thomas Gleixner <tglx@kernel.org>
OK, I've applied it, thanks.
jon
^ permalink raw reply
* Re: [RFC PATCH] Documentation: Add managed interrupts
From: Ming Lei @ 2026-04-11 12:18 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: linux-doc, linux-kernel, Aaron Tomlin, Christoph Hellwig,
Frederic Weisbecker, Jens Axboe, Jonathan Corbet, Thomas Gleixner,
Valentin Schneider, Waiman Long, Peter Zijlstra, John Ogness
In-Reply-To: <20260401110232.ET5RxZfl@linutronix.de>
On Wed, Apr 1, 2026 at 7:02 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> I stumbled upon "isolcpus=managed_irq" which is the last piece which
> can only be handled by isolcpus= and has no runtime knob. I knew roughly
> what managed interrupts should do but I lacked some details how it is
> used and what the managed_irq sub parameter means in practise.
>
> This documents what we have as of today and how it works. I added some
> examples how the parameter affects the configuration. Did I miss
> something?
>
> Given that the spreading as computed by group_cpus_evenly() does not take
> the mask of isolated CPUs into account I'm not sure how relevant the
> managed_irq argument is. The virtio_scsi driver has no way to limit the
> interrupts and I don't see this for the nvme. Even if the number of
> queues can be reduced to two (as in the example) it is still spread
> evenly in the system instead and the isolated CPUs are not taken into
> account.
> To make this worse, you can even argue further whether or not the
> application on the isolated CPU wants to receive the interrupt directly
> or would prefer not to.
>
> Given all this, I am not sure if it makes sense to add 'io_queue' to the
> mix or if it could be incorporated into 'managed_irq'.
>
> One more point: Given that isolcpus= is marked deprecated as of commit
> b0d40d2b22fe4 ("sched/isolation: Document isolcpus= boot parameter flags, mark it deprecated")
>
> and the 'managed_irq' is evaluated at device's probe time it would
> require additional callbacks to re-evaluate the situation. Probably for
> 'io_queue', too. Does it make sense or should we simply drop the
> "deprecation" notice and allow using it long term?
> Dynamic partitions work with cpusets, and there is this (managed_irq)
> limitation, but is it really one? And if a static partition is the use case,
> why bother?
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> Documentation/core-api/irq/index.rst | 1 +
> Documentation/core-api/irq/managed_irq.rst | 116 +++++++++++++++++++++
> 2 files changed, 117 insertions(+)
> create mode 100644 Documentation/core-api/irq/managed_irq.rst
>
> diff --git a/Documentation/core-api/irq/index.rst b/Documentation/core-api/irq/index.rst
> index 0d65d11e54200..13bd24dd2b1cc 100644
> --- a/Documentation/core-api/irq/index.rst
> +++ b/Documentation/core-api/irq/index.rst
> @@ -9,3 +9,4 @@ IRQs
> irq-affinity
> irq-domain
> irqflags-tracing
> + managed_irq
> diff --git a/Documentation/core-api/irq/managed_irq.rst b/Documentation/core-api/irq/managed_irq.rst
> new file mode 100644
> index 0000000000000..05e295f3c289d
> --- /dev/null
> +++ b/Documentation/core-api/irq/managed_irq.rst
> @@ -0,0 +1,116 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===========================
> +Affinity managed interrupts
> +===========================
> +
> +The IRQ core provides support for managing interrupts according to a specified
> +CPU affinity. Under normal operation, an interrupt is associated with a
> +particular CPU. If that CPU is taken offline, the interrupt is migrated to
> +another online CPU.
> +
> +Devices with large numbers of interrupt vectors can stress the available vector
> +space. For example, an NVMe device with 128 I/O queues typically requests one
> +interrupt per queue on systems with at least 128 CPUs. Two such devices
> +therefore request 256 interrupts. On x86, the interrupt vector space is
> +notoriously low, providing only 256 vectors per CPU, and the kernel reserves a
> +subset of these, further reducing the number available for device interrupts.
> +In practice this is not an issue because the interrupts are distributed across
> +many CPUs, so each CPU only receives a small number of vectors.
> +
> +During system suspend, however, all secondary CPUs are taken offline and all
> +interrupts are migrated to the single CPU that remains online. This can exhaust
> +the available interrupt vectors on that CPU and cause the suspend operation to
> +fail.
> +
> +Affinity‑managed interrupts address this limitation. Each interrupt is assigned
> +a CPU affinity mask that specifies the set of CPUs on which the interrupt may
> +be targeted. When a CPU in the mask goes offline, the interrupt is moved to the
> +next CPU in the mask. If the last CPU in the mask goes offline, the interrupt
> +is shut down. Drivers using affinity‑managed interrupts must ensure that the
> +associated queue is quiesced before the interrupt is disabled so that no
> +further interrupts are generated. When a CPU in the affinity mask comes back
> +online, the interrupt is re‑enabled.
> +
> +Implementation
> +--------------
> +
> +Devices must provide per‑instance interrupts, such as per‑I/O‑queue interrupts
> +for storage devices like NVMe. The driver allocates interrupt vectors with the
> +required affinity settings using struct irq_affinity. For MSI‑X devices, this
> +is done via pci_alloc_irq_vectors_affinity() with the PCI_IRQ_AFFINITY flag
> +set.
> +
> +Based on the provided affinity information, the IRQ core attempts to spread the
> +interrupts evenly across the system. The affinity masks are computed during
> +this allocation step, but the final IRQ assignment is performed when
> +request_irq() is invoked.
> +
> +Isolated CPUs
> +-------------
> +
> +The affinity of managed interrupts is handled entirely in the kernel and cannot
> +be modified from user space through the /proc interfaces. The managed_irq
> +sub‑parameter of the isolcpus boot option specifies a CPU mask that managed
> +interrupts should attempt to avoid. This isolation is best‑effort and only
> +applies if the automatically assigned interrupt mask also contains online CPUs
> +outside the avoided mask. If the requested mask contains only isolated CPUs,
> +the setting has no effect.
> +
> +CPUs listed in the avoided mask remain part of the interrupt’s affinity mask.
> +This means that if all non‑isolated CPUs go offline while isolated CPUs remain
> +online, the interrupt will be assigned to one of the isolated CPUs.
Maybe you can add:
In practice this is fine because I/O is not supposed to be submitted from
isolated CPUs.
> +
> +The following examples assume a system with 8 CPUs.
> +
> +- A QEMU instance is booted with "-device virtio-scsi-pci".
> + The MSI‑X device exposes 11 interrupts: 3 "management" interrupts and 8
> + "queue" interrupts. The driver requests the 8 queue interrupts, each of which
> + is affine to exactly one CPU. If that CPU goes offline, the interrupt is shut
> + down.
> +
> + Assuming interrupt 48 is one of the queue interrupts, the following appears::
> +
> + /proc/irq/48/effective_affinity_list:7
> + /proc/irq/48/smp_affinity_list:7
> +
> + This indicates that the interrupt is served only by CPU7. Shutting down CPU7
> + does not migrate the interrupt to another CPU::
> +
> + /proc/irq/48/effective_affinity_list:0
> + /proc/irq/48/smp_affinity_list:7
> +
> + This can be verified via the debugfs interface
> + (/sys/kernel/debug/irq/irqs/48). The dstate field will include
> + IRQD_IRQ_DISABLED, IRQD_IRQ_MASKED and IRQD_MANAGED_SHUTDOWN.
> +
> +- A QEMU instance is booted with "-device virtio-scsi-pci,num_queues=2"
> + and the kernel command line includes:
> + "irqaffinity=0,1 isolcpus=domain,2-7 isolcpus=managed_irq,1-3,5-7".
> + The MSI‑X device exposes 5 interrupts: 3 management interrupts and 2 queue
> + interrupts. The management interrupts follow the irqaffinity= setting. The
> + queue interrupts are spread across available CPUs::
> +
> + /proc/irq/47/effective_affinity_list:0
> + /proc/irq/47/smp_affinity_list:0-3
> + /proc/irq/48/effective_affinity_list:4
> + /proc/irq/48/smp_affinity_list:4-7
> +
> + The two queue interrupts are evenly distributed. Interrupt 48 is placed on CPU4
> + because the managed_irq mask avoids CPUs 5–7 when possible.
> +
> + Replacing the managed_irq argument with "isolcpus=managed_irq,1-3,4-5,7"
> + results in::
> +
> + /proc/irq/48/effective_affinity_list:6
> + /proc/irq/48/smp_affinity_list:4-7
> +
> + Interrupt 48 is now served on CPU6 because the system avoids CPUs 4, 5 and
> + 7. If CPU6 is taken offline, the interrupt migrates to one of the "isolated"
> + CPUs::
> +
> + /proc/irq/48/effective_affinity_list:7
> + /proc/irq/48/smp_affinity_list:4-7
> +
> + The interrupt is shut down once all CPUs listed in its smp_affinity mask are
> + offline.
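
For readers following the examples, the selection rule described in the
quoted text can be modelled with plain bitmasks. This is only a sketch under
assumptions: managed_irq_candidates() is a hypothetical name, CPUs are bits
in a 64-bit word, and the function returns the set of CPUs the interrupt
could be served on rather than the single CPU the kernel would finally pick.

```c
#include <stdint.h>

/*
 * Hypothetical model, not the kernel implementation. Bit n of each mask
 * stands for CPU n. Returns the CPUs a managed interrupt may be served on,
 * or 0 if every CPU in its affinity mask is offline (interrupt shut down).
 */
static uint64_t managed_irq_candidates(uint64_t affinity, uint64_t online,
				       uint64_t avoided)
{
	uint64_t usable = affinity & online;
	uint64_t preferred = usable & ~avoided;

	/* Best effort: honor the avoided mask only if a CPU is left over. */
	return preferred ? preferred : usable;
}
```

With affinity 4-7 (0xF0) and managed_irq=1-3,5-7 (0xEE) the result is CPU4
only; with managed_irq=1-3,4-5,7 (0xBE) it is CPU6 only, and once CPU6 is
offline the result falls back to the isolated CPUs 4, 5 and 7.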
Nice document, with or without the above change:
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Thanks,
^ permalink raw reply
* [PATCH v5 0/4] KVM: arm64: PMU: Use multiple host PMUs
From: Akihiko Odaki @ 2026-04-11 10:20 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Kees Cook,
Gustavo A. R. Silva, Paolo Bonzini, Jonathan Corbet, Shuah Khan
Cc: linux-arm-kernel, kvmarm, linux-kernel, linux-hardening, devel,
kvm, linux-doc, linux-kselftest, Akihiko Odaki
On a heterogeneous arm64 system, KVM's PMU emulation is based on the
features of a single host PMU instance. When a vCPU is migrated to a
pCPU with an incompatible PMU, counters such as PMCCNTR_EL0 stop
incrementing.
Although this behavior is permitted by the architecture, Windows does
not handle it gracefully and may crash with a division-by-zero error.
The current workaround requires VMMs to pin vCPUs to a set of pCPUs
that share a compatible PMU. This is difficult to implement correctly in
QEMU/libvirt, where pinning occurs after vCPU initialization, and it
also restricts the guest to a subset of available pCPUs.
This series introduces the KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
attribute. If set, PMUv3 will be emulated without programmable event
counters. KVM will be able to run VCPUs on any physical CPUs with a
compatible hardware PMU.
This allows Windows guests to run reliably on heterogeneous systems
without crashing, even without vCPU pinning, and enables VMMs to
schedule vCPUs across all available pCPUs, making full use of the host
hardware.
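As a rough illustration of how a VMM might opt in, the sketch below sets the
attribute on a vCPU file descriptor with KVM_SET_DEVICE_ATTR. The helper name
vcpu_set_fixed_counters_only() is hypothetical, and the fallback defines are
assumptions that mirror the arm64 UAPI values (the second one is the value
this series adds) so that the snippet also builds against headers that lack
them.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Fallbacks for headers without the arm64 definitions (assumed values). */
#ifndef KVM_ARM_VCPU_PMU_V3_CTRL
#define KVM_ARM_VCPU_PMU_V3_CTRL 0
#endif
#ifndef KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
#define KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY 5
#endif

/* Hypothetical helper: returns 0 on success or -errno on failure. */
static int vcpu_set_fixed_counters_only(int vcpu_fd)
{
	struct kvm_device_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.group = KVM_ARM_VCPU_PMU_V3_CTRL;
	attr.attr = KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY;
	/* attr.addr stays 0: the attribute takes no additional parameter. */

	if (ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr) < 0)
		return -errno;

	return 0;
}
```

A VMM would presumably call this before KVM_ARM_VCPU_PMU_V3_INIT and before
any vCPU runs, since setting the attribute later is documented to fail with
EBUSY.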
A QEMU patch that demonstrates the usage of the new attribute is
available at:
https://lore.kernel.org/qemu-devel/20260225-kvm-v2-1-b8d743db0f73@rsg.ci.i.u-tokyo.ac.jp/
("[PATCH RFC v2] target/arm/kvm: Choose PMU backend")
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
Changes in v5:
- Rebased.
- Fixed the order to clear KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY in
kvm_arm_pmu_v3_set_pmu().
- Fixed the setting of KVM_ARM_VCPU_PMU_V3_IRQ in
test_fixed_counters_only().
- Changed to WARN_ON_ONCE() when kvm_pmu_probe_armpmu() returns NULL in
kvm_pmu_create_perf_event(), which is no longer supposed to happen.
- Link to v4: https://lore.kernel.org/r/20260317-hybrid-v4-0-bd62bcd48644@rsg.ci.i.u-tokyo.ac.jp
Changes in v4:
- Extracted kvm_pmu_enabled_counter_mask() into a separate patch.
- Added patch "KVM: arm64: PMU: Protect the list of PMUs with RCU".
- Merged KVM_REQ_CREATE_PMU into KVM_REQ_RELOAD_PMU.
- Added a check to avoid unnecessary KVM_REQ_RELOAD_PMU requests.
- Dropped the change to avoid setting kvm_arm_set_default_pmu() when
KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY is not set.
- Link to v3: https://lore.kernel.org/r/20260225-hybrid-v3-0-46e8fe220880@rsg.ci.i.u-tokyo.ac.jp
Changes in v3:
- Renamed the attribute to KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY.
- Changed to request the creation of perf counters when loading vCPU.
- Link to v2: https://lore.kernel.org/r/20250806-hybrid-v2-0-0661aec3af8c@rsg.ci.i.u-tokyo.ac.jp
Changes in v2:
- Added the KVM_ARM_VCPU_PMU_V3_COMPOSITION attribute to opt in the
feature.
- Added code to handle overflow.
- Link to v1: https://lore.kernel.org/r/20250319-hybrid-v1-1-4d1ada10e705@daynix.com
---
Akihiko Odaki (4):
KVM: arm64: PMU: Add kvm_pmu_enabled_counter_mask()
KVM: arm64: PMU: Protect the list of PMUs with RCU
KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY
KVM: arm64: selftests: Test PMU_V3_FIXED_COUNTERS_ONLY
Documentation/virt/kvm/devices/vcpu.rst | 29 ++++
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/include/uapi/asm/kvm.h | 1 +
arch/arm64/kvm/arm.c | 1 +
arch/arm64/kvm/pmu-emul.c | 188 ++++++++++++++-------
include/kvm/arm_pmu.h | 2 +
.../selftests/kvm/arm64/vpmu_counter_access.c | 148 +++++++++++++---
7 files changed, 288 insertions(+), 83 deletions(-)
---
base-commit: 9a9c8ce300cd3859cc87b408ef552cd697cc2ab7
change-id: 20250224-hybrid-01d5ff47edd2
Best regards,
--
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
^ permalink raw reply
* [PATCH v5 3/4] KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY
From: Akihiko Odaki @ 2026-04-11 10:20 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Kees Cook,
Gustavo A. R. Silva, Paolo Bonzini, Jonathan Corbet, Shuah Khan
Cc: linux-arm-kernel, kvmarm, linux-kernel, linux-hardening, devel,
kvm, linux-doc, linux-kselftest, Akihiko Odaki
In-Reply-To: <20260411-hybrid-v5-0-b043b4d9f49e@rsg.ci.i.u-tokyo.ac.jp>
On a heterogeneous arm64 system, KVM's PMU emulation is based on the
features of a single host PMU instance. When a vCPU is migrated to a
pCPU with an incompatible PMU, counters such as PMCCNTR_EL0 stop
incrementing.
Although this behavior is permitted by the architecture, Windows does
not handle it gracefully and may crash with a division-by-zero error.
The current workaround requires VMMs to pin vCPUs to a set of pCPUs
that share a compatible PMU. This is difficult to implement correctly in
QEMU/libvirt, where pinning occurs after vCPU initialization, and it
also restricts the guest to a subset of available pCPUs.
Introduce the KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY attribute to
create a "fixed-counters-only" PMU. When set, KVM exposes a PMU that is
compatible with all pCPUs but that does not support programmable
event counters which may have different feature sets on different PMUs.
This allows Windows guests to run reliably on heterogeneous systems
without crashing, even without vCPU pinning, and enables VMMs to
schedule vCPUs across all available pCPUs, making full use of the host
hardware.
Much like KVM_ARM_VCPU_PMU_V3_IRQ and other read-write attributes, this
attribute provides a getter that facilitates kernel and userspace
debugging/testing.
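The effect on CPU selection can be pictured with a small userspace model.
This is a sketch only: struct host_pmu, fixed_counters_only_cpus() and
probe_pmu() are hypothetical stand-ins for the kernel structures and helpers
touched below, with CPU sets reduced to 64-bit masks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct arm_pmu: bit n set means CPU n. */
struct host_pmu {
	uint64_t supported_cpus;
};

/* A fixed-counters-only VM may run on the union of all PMUs' CPUs. */
static uint64_t fixed_counters_only_cpus(const struct host_pmu *pmus, size_t n)
{
	uint64_t cpus = 0;

	for (size_t i = 0; i < n; i++)
		cpus |= pmus[i].supported_cpus;

	return cpus;
}

/* Find the PMU that covers @cpu, or NULL if no PMU does. */
static const struct host_pmu *probe_pmu(const struct host_pmu *pmus, size_t n,
					unsigned int cpu)
{
	for (size_t i = 0; i < n; i++)
		if (cpu < 64 && ((pmus[i].supported_cpus >> cpu) & 1))
			return &pmus[i];

	return NULL;
}
```

In this model a perf event is backed by whichever PMU probe_pmu() returns for
the CPU the vCPU is currently loaded on, which is why the event has to be
recreated when the vCPU moves to a CPU covered by a different PMU.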
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
Documentation/virt/kvm/devices/vcpu.rst | 29 ++++++
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/include/uapi/asm/kvm.h | 1 +
arch/arm64/kvm/arm.c | 1 +
arch/arm64/kvm/pmu-emul.c | 156 +++++++++++++++++++++++---------
include/kvm/arm_pmu.h | 2 +
6 files changed, 148 insertions(+), 43 deletions(-)
diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 60bf205cb373..e0aeb1897d77 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -161,6 +161,35 @@ explicitly selected, or the number of counters is out of range for the
selected PMU. Selecting a new PMU cancels the effect of setting this
attribute.
+1.6 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
+------------------------------------------------------
+
+:Parameters: no additional parameter in kvm_device_attr.addr
+
+:Returns:
+
+ ======= =====================================================
+ -EBUSY Attempted to set after initializing PMUv3 or running
+ VCPU, or attempted to set for the first time after
+ setting an event filter
+ -ENXIO Attempted to get before setting
+ -ENODEV Attempted to set while PMUv3 not supported
+ ======= =====================================================
+
+If set, PMUv3 will be emulated without programmable event counters. The VCPU
+will use any compatible hardware PMU. This attribute is particularly useful on
+heterogeneous systems where different hardware PMUs cover different physical
+CPUs. The compatibility of hardware PMUs can be checked with
+KVM_ARM_VCPU_PMU_V3_SET_PMU. All VCPUs in a VM share this attribute. It isn't
+possible to set it for the first time if a PMU event filter is already present.
+
+Note that KVM will not make any attempts to run the VCPU on the physical CPUs
+with compatible hardware PMUs. This is entirely left to userspace. However,
+attempting to run the VCPU on an unsupported CPU will fail and KVM_RUN will
+return with exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct
+by setting hardware_entry_failure_reason field to
+KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED and the cpu field to the processor id.
+
2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
=================================
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 59f25b85be2b..b59e0182472c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -353,6 +353,8 @@ struct kvm_arch {
#define KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS 10
/* Unhandled SEAs are taken to userspace */
#define KVM_ARCH_FLAG_EXIT_SEA 11
+ /* PMUv3 is emulated without programmable event counters */
+#define KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY 12
unsigned long flags;
/* VM-wide vCPU feature set */
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index a792a599b9d6..474c84fa757f 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -436,6 +436,7 @@ enum {
#define KVM_ARM_VCPU_PMU_V3_FILTER 2
#define KVM_ARM_VCPU_PMU_V3_SET_PMU 3
#define KVM_ARM_VCPU_PMU_V3_SET_NR_COUNTERS 4
+#define KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY 5
#define KVM_ARM_VCPU_TIMER_CTRL 1
#define KVM_ARM_VCPU_TIMER_IRQ_VTIMER 0
#define KVM_ARM_VCPU_TIMER_IRQ_PTIMER 1
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 620a465248d1..dca16ca26d32 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -634,6 +634,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (has_vhe())
kvm_vcpu_load_vhe(vcpu);
kvm_arch_vcpu_load_fp(vcpu);
+ kvm_vcpu_load_pmu(vcpu);
kvm_vcpu_pmu_restore_guest(vcpu);
if (kvm_arm_is_pvtime_enabled(&vcpu->arch))
kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu);
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index ef5140bbfe28..c827e66af0a2 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -326,7 +326,10 @@ u64 kvm_pmu_implemented_counter_mask(struct kvm_vcpu *vcpu)
static void kvm_pmc_enable_perf_event(struct kvm_pmc *pmc)
{
- if (!pmc->perf_event) {
+ struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
+
+ if (!pmc->perf_event ||
+ !cpumask_test_cpu(vcpu->cpu, &to_arm_pmu(pmc->perf_event->pmu)->supported_cpus)) {
kvm_pmu_create_perf_event(pmc);
return;
}
@@ -667,10 +670,8 @@ static bool kvm_pmc_counts_at_el2(struct kvm_pmc *pmc)
return kvm_pmc_read_evtreg(pmc) & ARMV8_PMU_INCLUDE_EL2;
}
-static int kvm_map_pmu_event(struct kvm *kvm, unsigned int eventsel)
+static int kvm_map_pmu_event(struct arm_pmu *pmu, unsigned int eventsel)
{
- struct arm_pmu *pmu = kvm->arch.arm_pmu;
-
/*
* The CPU PMU likely isn't PMUv3; let the driver provide a mapping
* for the guest's PMUv3 event ID.
@@ -681,6 +682,23 @@ static int kvm_map_pmu_event(struct kvm *kvm, unsigned int eventsel)
return eventsel;
}
+static struct arm_pmu *kvm_pmu_probe_armpmu(int cpu)
+{
+ struct arm_pmu_entry *entry;
+ struct arm_pmu *pmu;
+
+ guard(rcu)();
+
+ list_for_each_entry_rcu(entry, &arm_pmus, entry) {
+ pmu = entry->arm_pmu;
+
+ if (cpumask_test_cpu(cpu, &pmu->supported_cpus))
+ return pmu;
+ }
+
+ return NULL;
+}
+
/**
* kvm_pmu_create_perf_event - create a perf event for a counter
* @pmc: Counter context
@@ -694,6 +712,12 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
int eventsel;
u64 evtreg;
+ if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &vcpu->kvm->arch.flags)) {
+ arm_pmu = kvm_pmu_probe_armpmu(vcpu->cpu);
+ if (WARN_ON_ONCE(!arm_pmu))
+ return;
+ }
+
evtreg = kvm_pmc_read_evtreg(pmc);
kvm_pmu_stop_counter(pmc);
@@ -722,7 +746,7 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
* Don't create an event if we're running on hardware that requires
* PMUv3 event translation and we couldn't find a valid mapping.
*/
- eventsel = kvm_map_pmu_event(vcpu->kvm, eventsel);
+ eventsel = kvm_map_pmu_event(arm_pmu, eventsel);
if (eventsel < 0)
return;
@@ -810,42 +834,6 @@ void kvm_host_pmu_init(struct arm_pmu *pmu)
list_add_tail_rcu(&entry->entry, &arm_pmus);
}
-static struct arm_pmu *kvm_pmu_probe_armpmu(void)
-{
- struct arm_pmu_entry *entry;
- struct arm_pmu *pmu;
- int cpu;
-
- guard(rcu)();
-
- /*
- * It is safe to use a stale cpu to iterate the list of PMUs so long as
- * the same value is used for the entirety of the loop. Given this, and
- * the fact that no percpu data is used for the lookup there is no need
- * to disable preemption.
- *
- * It is still necessary to get a valid cpu, though, to probe for the
- * default PMU instance as userspace is not required to specify a PMU
- * type. In order to uphold the preexisting behavior KVM selects the
- * PMU instance for the core during vcpu init. A dependent use
- * case would be a user with disdain of all things big.LITTLE that
- * affines the VMM to a particular cluster of cores.
- *
- * In any case, userspace should just do the sane thing and use the UAPI
- * to select a PMU type directly. But, be wary of the baggage being
- * carried here.
- */
- cpu = raw_smp_processor_id();
- list_for_each_entry_rcu(entry, &arm_pmus, entry) {
- pmu = entry->arm_pmu;
-
- if (cpumask_test_cpu(cpu, &pmu->supported_cpus))
- return pmu;
- }
-
- return NULL;
-}
-
static u64 __compute_pmceid(struct arm_pmu *pmu, bool pmceid1)
{
u32 hi[2], lo[2];
@@ -888,6 +876,9 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
u64 val, mask = 0;
int base, i, nr_events;
+ if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &vcpu->kvm->arch.flags))
+ return 0;
+
if (!pmceid1) {
val = compute_pmceid0(cpu_pmu);
base = 0;
@@ -915,6 +906,26 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
return val & mask;
}
+void kvm_vcpu_load_pmu(struct kvm_vcpu *vcpu)
+{
+ unsigned long mask = kvm_pmu_enabled_counter_mask(vcpu);
+ struct kvm_pmc *pmc;
+ struct arm_pmu *cpu_pmu;
+ int i;
+
+ for_each_set_bit(i, &mask, 32) {
+ pmc = kvm_vcpu_idx_to_pmc(vcpu, i);
+ if (!pmc->perf_event)
+ continue;
+
+ cpu_pmu = to_arm_pmu(pmc->perf_event->pmu);
+ if (!cpumask_test_cpu(vcpu->cpu, &cpu_pmu->supported_cpus)) {
+ kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu);
+ break;
+ }
+ }
+}
+
void kvm_vcpu_reload_pmu(struct kvm_vcpu *vcpu)
{
u64 mask = kvm_pmu_implemented_counter_mask(vcpu);
@@ -1016,6 +1027,9 @@ u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm)
{
struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
+ if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &kvm->arch.flags))
+ return 0;
+
/*
* PMUv3 requires that all event counters are capable of counting any
* event, though the same may not be true of non-PMUv3 hardware.
@@ -1070,7 +1084,24 @@ static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
*/
int kvm_arm_set_default_pmu(struct kvm *kvm)
{
- struct arm_pmu *arm_pmu = kvm_pmu_probe_armpmu();
+ /*
+ * It is safe to use a stale cpu to iterate the list of PMUs so long as
+ * the same value is used for the entirety of the loop. Given this, and
+ * the fact that no percpu data is used for the lookup there is no need
+ * to disable preemption.
+ *
+ * It is still necessary to get a valid cpu, though, to probe for the
+ * default PMU instance as userspace is not required to specify a PMU
+ * type. In order to uphold the preexisting behavior KVM selects the
+ * PMU instance for the core during vcpu init. A dependent use
+ * case would be a user with disdain of all things big.LITTLE that
+ * affines the VMM to a particular cluster of cores.
+ *
+ * In any case, userspace should just do the sane thing and use the UAPI
+ * to select a PMU type directly. But, be wary of the baggage being
+ * carried here.
+ */
+ struct arm_pmu *arm_pmu = kvm_pmu_probe_armpmu(raw_smp_processor_id());
if (!arm_pmu)
return -ENODEV;
@@ -1098,6 +1129,7 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
break;
}
+ clear_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &kvm->arch.flags);
kvm_arm_set_pmu(kvm, arm_pmu);
cpumask_copy(kvm->arch.supported_cpus, &arm_pmu->supported_cpus);
ret = 0;
@@ -1108,11 +1140,42 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
return ret;
}
+static int kvm_arm_pmu_v3_set_pmu_fixed_counters_only(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct arm_pmu_entry *entry;
+ struct arm_pmu *arm_pmu;
+ struct cpumask *supported_cpus = kvm->arch.supported_cpus;
+
+ lockdep_assert_held(&kvm->arch.config_lock);
+
+ if (kvm_vm_has_ran_once(kvm) ||
+ (kvm->arch.pmu_filter &&
+ !test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &kvm->arch.flags)))
+ return -EBUSY;
+
+ set_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &kvm->arch.flags);
+ kvm_arm_set_nr_counters(kvm, 0);
+ cpumask_clear(supported_cpus);
+
+ guard(rcu)();
+
+ list_for_each_entry_rcu(entry, &arm_pmus, entry) {
+ arm_pmu = entry->arm_pmu;
+ cpumask_or(supported_cpus, supported_cpus, &arm_pmu->supported_cpus);
+ }
+
+ return 0;
+}
+
static int kvm_arm_pmu_v3_set_nr_counters(struct kvm_vcpu *vcpu, unsigned int n)
{
struct kvm *kvm = vcpu->kvm;
- if (!kvm->arch.arm_pmu)
+ lockdep_assert_held(&kvm->arch.config_lock);
+
+ if (!kvm->arch.arm_pmu &&
+ !test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &kvm->arch.flags))
return -EINVAL;
if (n > kvm_arm_pmu_get_max_counters(kvm))
@@ -1227,6 +1290,8 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
return kvm_arm_pmu_v3_set_nr_counters(vcpu, n);
}
+ case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
+ return kvm_arm_pmu_v3_set_pmu_fixed_counters_only(vcpu);
case KVM_ARM_VCPU_PMU_V3_INIT:
return kvm_arm_pmu_v3_init(vcpu);
}
@@ -1253,6 +1318,10 @@ int kvm_arm_pmu_v3_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
irq = vcpu->arch.pmu.irq_num;
return put_user(irq, uaddr);
}
+ case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
+ lockdep_assert_held(&vcpu->kvm->arch.config_lock);
+ if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &vcpu->kvm->arch.flags))
+ return 0;
}
return -ENXIO;
@@ -1266,6 +1335,7 @@ int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
case KVM_ARM_VCPU_PMU_V3_FILTER:
case KVM_ARM_VCPU_PMU_V3_SET_PMU:
case KVM_ARM_VCPU_PMU_V3_SET_NR_COUNTERS:
+ case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
if (kvm_vcpu_has_pmu(vcpu))
return 0;
}
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 96754b51b411..1375cbaf97b2 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -56,6 +56,7 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val);
void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val);
void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
u64 select_idx);
+void kvm_vcpu_load_pmu(struct kvm_vcpu *vcpu);
void kvm_vcpu_reload_pmu(struct kvm_vcpu *vcpu);
int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu,
struct kvm_device_attr *attr);
@@ -161,6 +162,7 @@ static inline u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
static inline void kvm_pmu_update_vcpu_events(struct kvm_vcpu *vcpu) {}
static inline void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu) {}
static inline void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu) {}
+static inline void kvm_vcpu_load_pmu(struct kvm_vcpu *vcpu) {}
static inline void kvm_vcpu_reload_pmu(struct kvm_vcpu *vcpu) {}
static inline u8 kvm_arm_pmu_get_pmuver_limit(void)
{
--
2.53.0
^ permalink raw reply related
* [PATCH v5 4/4] KVM: arm64: selftests: Test PMU_V3_FIXED_COUNTERS_ONLY
From: Akihiko Odaki @ 2026-04-11 10:20 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Kees Cook,
Gustavo A. R. Silva, Paolo Bonzini, Jonathan Corbet, Shuah Khan
Cc: linux-arm-kernel, kvmarm, linux-kernel, linux-hardening, devel,
kvm, linux-doc, linux-kselftest, Akihiko Odaki
In-Reply-To: <20260411-hybrid-v5-0-b043b4d9f49e@rsg.ci.i.u-tokyo.ac.jp>
Assert the following:
- KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY is unset at initialization.
- KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY can be set.
- Setting KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY for the first time
after setting an event filter results in EBUSY.
- KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY can be set again even if an
event filter has already been set.
- Setting KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY after running a VCPU
results in EBUSY.
- The existing test cases pass with
KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY set.
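The rules asserted above can be summarized as a tiny state model. This is a
sketch under assumptions: struct vm_model and the model_*() functions are
hypothetical and only mirror the documented return values, not the kernel
code itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the VM-wide state the checks depend on. */
struct vm_model {
	bool has_run;             /* some vCPU has run at least once */
	bool has_filter;          /* a PMU event filter has been set */
	bool fixed_counters_only; /* the attribute has been set */
};

/* Setting fails once a vCPU has run, or for the first time after a filter. */
static int model_set_fixed_counters_only(struct vm_model *vm)
{
	if (vm->has_run || (vm->has_filter && !vm->fixed_counters_only))
		return -EBUSY;

	vm->fixed_counters_only = true;
	return 0;
}

/* Getting succeeds only after the attribute has been set. */
static int model_get_fixed_counters_only(const struct vm_model *vm)
{
	return vm->fixed_counters_only ? 0 : -ENXIO;
}
```

Each bullet maps to one transition: get on a fresh VM yields -ENXIO, set then
get succeeds, set after a filter fails unless the flag is already set, and
set after running fails unconditionally.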
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
.../selftests/kvm/arm64/vpmu_counter_access.c | 148 +++++++++++++++++----
1 file changed, 122 insertions(+), 26 deletions(-)
diff --git a/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c b/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
index ae36325c022f..50189fb56374 100644
--- a/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
+++ b/tools/testing/selftests/kvm/arm64/vpmu_counter_access.c
@@ -403,12 +403,7 @@ static void create_vpmu_vm(void *guest_code)
{
struct kvm_vcpu_init init;
uint8_t pmuver, ec;
- uint64_t dfr0, irq = 23;
- struct kvm_device_attr irq_attr = {
- .group = KVM_ARM_VCPU_PMU_V3_CTRL,
- .attr = KVM_ARM_VCPU_PMU_V3_IRQ,
- .addr = (uint64_t)&irq,
- };
+ uint64_t dfr0;
/* The test creates the vpmu_vm multiple times. Ensure a clean state */
memset(&vpmu_vm, 0, sizeof(vpmu_vm));
@@ -434,8 +429,6 @@ static void create_vpmu_vm(void *guest_code)
TEST_ASSERT(pmuver != ID_AA64DFR0_EL1_PMUVer_IMP_DEF &&
pmuver >= ID_AA64DFR0_EL1_PMUVer_IMP,
"Unexpected PMUVER (0x%x) on the vCPU with PMUv3", pmuver);
-
- vcpu_ioctl(vpmu_vm.vcpu, KVM_SET_DEVICE_ATTR, &irq_attr);
}
static void destroy_vpmu_vm(void)
@@ -461,15 +454,25 @@ static void run_vcpu(struct kvm_vcpu *vcpu, uint64_t pmcr_n)
}
}
-static void test_create_vpmu_vm_with_nr_counters(unsigned int nr_counters, bool expect_fail)
+static void test_create_vpmu_vm_with_nr_counters(unsigned int nr_counters,
+ bool fixed_counters_only,
+ bool expect_fail)
{
struct kvm_vcpu *vcpu;
unsigned int prev;
int ret;
+ uint64_t irq = 23;
create_vpmu_vm(guest_code);
vcpu = vpmu_vm.vcpu;
+ if (fixed_counters_only)
+ vcpu_device_attr_set(vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+
+ vcpu_device_attr_set(vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_IRQ, &irq);
+
prev = get_pmcr_n(vcpu_get_reg(vcpu, KVM_ARM64_SYS_REG(SYS_PMCR_EL0)));
ret = __vcpu_device_attr_set(vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
@@ -489,15 +492,15 @@ static void test_create_vpmu_vm_with_nr_counters(unsigned int nr_counters, bool
* Create a guest with one vCPU, set the PMCR_EL0.N for the vCPU to @pmcr_n,
* and run the test.
*/
-static void run_access_test(uint64_t pmcr_n)
+static void run_access_test(uint64_t pmcr_n, bool fixed_counters_only)
{
uint64_t sp;
struct kvm_vcpu *vcpu;
struct kvm_vcpu_init init;
- pr_debug("Test with pmcr_n %lu\n", pmcr_n);
+ pr_debug("Test with pmcr_n %lu, fixed_counters_only %d\n", pmcr_n, fixed_counters_only);
- test_create_vpmu_vm_with_nr_counters(pmcr_n, false);
+ test_create_vpmu_vm_with_nr_counters(pmcr_n, fixed_counters_only, false);
vcpu = vpmu_vm.vcpu;
/* Save the initial sp to restore them later to run the guest again */
@@ -531,14 +534,14 @@ static struct pmreg_sets validity_check_reg_sets[] = {
* Create a VM, and check if KVM handles the userspace accesses of
* the PMU register sets in @validity_check_reg_sets[] correctly.
*/
-static void run_pmregs_validity_test(uint64_t pmcr_n)
+static void run_pmregs_validity_test(uint64_t pmcr_n, bool fixed_counters_only)
{
int i;
struct kvm_vcpu *vcpu;
uint64_t set_reg_id, clr_reg_id, reg_val;
uint64_t valid_counters_mask, max_counters_mask;
- test_create_vpmu_vm_with_nr_counters(pmcr_n, false);
+ test_create_vpmu_vm_with_nr_counters(pmcr_n, fixed_counters_only, false);
vcpu = vpmu_vm.vcpu;
valid_counters_mask = get_counters_mask(pmcr_n);
@@ -588,11 +591,11 @@ static void run_pmregs_validity_test(uint64_t pmcr_n)
* the vCPU to @pmcr_n, which is larger than the host value.
* The attempt should fail as @pmcr_n is too big to set for the vCPU.
*/
-static void run_error_test(uint64_t pmcr_n)
+static void run_error_test(uint64_t pmcr_n, bool fixed_counters_only)
{
pr_debug("Error test with pmcr_n %lu (larger than the host)\n", pmcr_n);
- test_create_vpmu_vm_with_nr_counters(pmcr_n, true);
+ test_create_vpmu_vm_with_nr_counters(pmcr_n, fixed_counters_only, true);
destroy_vpmu_vm();
}
@@ -622,22 +625,115 @@ static bool kvm_supports_nr_counters_attr(void)
return supported;
}
-int main(void)
+static void test_config(uint64_t pmcr_n, bool fixed_counters_only)
{
- uint64_t i, pmcr_n;
-
- TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_PMU_V3));
- TEST_REQUIRE(kvm_supports_vgic_v3());
- TEST_REQUIRE(kvm_supports_nr_counters_attr());
+ uint64_t i;
- pmcr_n = get_pmcr_n_limit();
for (i = 0; i <= pmcr_n; i++) {
- run_access_test(i);
- run_pmregs_validity_test(i);
+ run_access_test(i, fixed_counters_only);
+ run_pmregs_validity_test(i, fixed_counters_only);
}
for (i = pmcr_n + 1; i < ARMV8_PMU_MAX_COUNTERS; i++)
- run_error_test(i);
+ run_error_test(i, fixed_counters_only);
+}
+
+static void test_fixed_counters_only(void)
+{
+ struct kvm_pmu_event_filter filter = { .nevents = 0 };
+ struct kvm_vm *vm;
+ struct kvm_vcpu *running_vcpu;
+ struct kvm_vcpu *stopped_vcpu;
+ struct kvm_vcpu_init init;
+ int ret;
+ uint64_t irq = 23;
+
+ create_vpmu_vm(guest_code);
+ ret = __vcpu_has_device_attr(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY);
+ if (ret) {
+ TEST_ASSERT(ret == -1 && errno == ENXIO,
+ KVM_IOCTL_ERROR(KVM_HAS_DEVICE_ATTR, ret));
+ destroy_vpmu_vm();
+ return;
+ }
+
+ /* Assert that FIXED_COUNTERS_ONLY is unset at initialization. */
+ ret = __vcpu_device_attr_get(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+ TEST_ASSERT(ret == -1 && errno == ENXIO,
+ KVM_IOCTL_ERROR(KVM_GET_DEVICE_ATTR, ret));
+
+ /* Assert that setting FIXED_COUNTERS_ONLY succeeds. */
+ vcpu_device_attr_set(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+
+ /* Assert that getting FIXED_COUNTERS_ONLY succeeds. */
+ vcpu_device_attr_get(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+
+ /*
+ * Assert that setting FIXED_COUNTERS_ONLY again succeeds even if an
+ * event filter has already been set.
+ */
+ vcpu_device_attr_set(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FILTER, &filter);
+
+ vcpu_device_attr_set(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+
+ destroy_vpmu_vm();
+
+ create_vpmu_vm(guest_code);
+
+ /*
+ * Assert that setting FIXED_COUNTERS_ONLY results in EBUSY if an event
+ * filter has already been set while FIXED_COUNTERS_ONLY has not.
+ */
+ vcpu_device_attr_set(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FILTER, &filter);
+
+ ret = __vcpu_device_attr_set(vpmu_vm.vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+ TEST_ASSERT(ret == -1 && errno == EBUSY,
+ KVM_IOCTL_ERROR(KVM_SET_DEVICE_ATTR, ret));
+
+ destroy_vpmu_vm();
+
+ /*
+ * Assert that setting FIXED_COUNTERS_ONLY after running a VCPU results
+ * in EBUSY.
+ */
+ vm = vm_create(2);
+ vm_ioctl(vm, KVM_ARM_PREFERRED_TARGET, &init);
+ init.features[0] |= (1 << KVM_ARM_VCPU_PMU_V3);
+ running_vcpu = aarch64_vcpu_add(vm, 0, &init, guest_code);
+ stopped_vcpu = aarch64_vcpu_add(vm, 1, &init, guest_code);
+ kvm_arch_vm_finalize_vcpus(vm);
+ vcpu_device_attr_set(running_vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_IRQ, &irq);
+ vcpu_device_attr_set(running_vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_INIT, NULL);
+ vcpu_run(running_vcpu);
+
+ ret = __vcpu_device_attr_set(stopped_vcpu, KVM_ARM_VCPU_PMU_V3_CTRL,
+ KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY, NULL);
+ TEST_ASSERT(ret == -1 && errno == EBUSY,
+ KVM_IOCTL_ERROR(KVM_SET_DEVICE_ATTR, ret));
+
+ kvm_vm_free(vm);
+
+ test_config(0, true);
+}
+
+int main(void)
+{
+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_PMU_V3));
+ TEST_REQUIRE(kvm_supports_vgic_v3());
+ TEST_REQUIRE(kvm_supports_nr_counters_attr());
+
+ test_config(get_pmcr_n_limit(), false);
+ test_fixed_counters_only();
return 0;
}
--
2.53.0
^ permalink raw reply related
* [PATCH v5 1/4] KVM: arm64: PMU: Add kvm_pmu_enabled_counter_mask()
From: Akihiko Odaki @ 2026-04-11 10:20 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Kees Cook,
Gustavo A. R. Silva, Paolo Bonzini, Jonathan Corbet, Shuah Khan
Cc: linux-arm-kernel, kvmarm, linux-kernel, linux-hardening, devel,
kvm, linux-doc, linux-kselftest, Akihiko Odaki
In-Reply-To: <20260411-hybrid-v5-0-b043b4d9f49e@rsg.ci.i.u-tokyo.ac.jp>
This function will be useful to enumerate enabled counters.
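The mask computation can be sketched outside the kernel as follows. The
function name and parameters are assumptions for illustration: the register
values and the hypervisor counter mask are passed in directly instead of
being read from vCPU state.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model. @cntenset is PMCNTENSET_EL0, @hyp_mask the
 * counters reserved for EL2, @hpme is MDCR_EL2.HPME and @pmcr_e is PMCR_EL0.E.
 */
static uint64_t enabled_counter_mask(uint64_t cntenset, uint64_t hyp_mask,
				     bool hpme, bool pmcr_e)
{
	uint64_t mask = 0;

	/* Hypervisor counters are gated by MDCR_EL2.HPME. */
	if (hpme)
		mask |= hyp_mask;

	/* All remaining counters are gated by PMCR_EL0.E. */
	if (pmcr_e)
		mask |= ~hyp_mask;

	return cntenset & mask;
}
```

Testing a single counter then reduces to checking its bit in the returned
mask, which is what the reworked kvm_pmu_counter_is_enabled() does.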
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
arch/arm64/kvm/pmu-emul.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index b03dbda7f1ab..59ec96e09321 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -619,18 +619,24 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)
}
}
-static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc)
+static u64 kvm_pmu_enabled_counter_mask(struct kvm_vcpu *vcpu)
{
- struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
- unsigned int mdcr = __vcpu_sys_reg(vcpu, MDCR_EL2);
+ u64 mask = 0;
- if (!(__vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & BIT(pmc->idx)))
- return false;
+ if (__vcpu_sys_reg(vcpu, MDCR_EL2) & MDCR_EL2_HPME)
+ mask |= kvm_pmu_hyp_counter_mask(vcpu);
- if (kvm_pmu_counter_is_hyp(vcpu, pmc->idx))
- return mdcr & MDCR_EL2_HPME;
+ if (kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E)
+ mask |= ~kvm_pmu_hyp_counter_mask(vcpu);
+
+ return __vcpu_sys_reg(vcpu, PMCNTENSET_EL0) & mask;
+}
+
+static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc)
+{
+ struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
- return kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E;
+ return kvm_pmu_enabled_counter_mask(vcpu) & BIT(pmc->idx);
}
static bool kvm_pmc_counts_at_el0(struct kvm_pmc *pmc)
--
2.53.0
* [PATCH v5 2/4] KVM: arm64: PMU: Protect the list of PMUs with RCU
From: Akihiko Odaki @ 2026-04-11 10:20 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Kees Cook,
Gustavo A. R. Silva, Paolo Bonzini, Jonathan Corbet, Shuah Khan
Cc: linux-arm-kernel, kvmarm, linux-kernel, linux-hardening, devel,
kvm, linux-doc, linux-kselftest, Akihiko Odaki
In-Reply-To: <20260411-hybrid-v5-0-b043b4d9f49e@rsg.ci.i.u-tokyo.ac.jp>
Convert the list of PMUs to an RCU-protected list so that readers can
traverse it without taking arm_pmus_lock, avoiding read-side contention.
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
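The publish/read ordering that an RCU-protected list relies on can be
illustrated in userspace with C11 atomics. This is only an analogy under
stated assumptions: struct pmu_entry, pmu_list_add_tail() and
pmu_list_find() are hypothetical names, writers are assumed to be
serialized with each other, and entries are never removed in the sketch,
so the grace-period and reclamation side of real RCU is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct pmu_entry {
	int type;                       /* stand-in for arm_pmu->pmu.type */
	struct pmu_entry *_Atomic next; /* link that readers follow */
};

static struct pmu_entry *_Atomic pmu_head;

/* Writer side: callers are assumed to be serialized with each other. */
static void pmu_list_add_tail(struct pmu_entry *e)
{
	struct pmu_entry *_Atomic *link = &pmu_head;
	struct pmu_entry *cur;

	assert(e != NULL);
	atomic_store_explicit(&e->next, NULL, memory_order_relaxed);
	while ((cur = atomic_load_explicit(link, memory_order_relaxed)) != NULL)
		link = &cur->next;
	/* Release: the entry is fully initialized before it becomes visible. */
	atomic_store_explicit(link, e, memory_order_release);
}

/* Reader side: traverses the list without taking any lock. */
static struct pmu_entry *pmu_list_find(int type)
{
	struct pmu_entry *e =
		atomic_load_explicit(&pmu_head, memory_order_acquire);

	while (e != NULL) {
		if (e->type == type)
			return e;
		e = atomic_load_explicit(&e->next, memory_order_acquire);
	}
	return NULL;
}
```

A reader either sees a fully initialized entry or does not see it at
all, which is why the read side needs no mutex.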
arch/arm64/kvm/pmu-emul.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 59ec96e09321..ef5140bbfe28 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -7,9 +7,9 @@
#include <linux/cpu.h>
#include <linux/kvm.h>
#include <linux/kvm_host.h>
-#include <linux/list.h>
#include <linux/perf_event.h>
#include <linux/perf/arm_pmu.h>
+#include <linux/rculist.h>
#include <linux/uaccess.h>
#include <asm/kvm_emulate.h>
#include <kvm/arm_pmu.h>
@@ -26,7 +26,6 @@ static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc);
bool kvm_supports_guest_pmuv3(void)
{
- guard(mutex)(&arm_pmus_lock);
return !list_empty(&arm_pmus);
}
@@ -808,7 +807,7 @@ void kvm_host_pmu_init(struct arm_pmu *pmu)
return;
entry->arm_pmu = pmu;
- list_add_tail(&entry->entry, &arm_pmus);
+ list_add_tail_rcu(&entry->entry, &arm_pmus);
}
static struct arm_pmu *kvm_pmu_probe_armpmu(void)
@@ -817,7 +816,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void)
struct arm_pmu *pmu;
int cpu;
- guard(mutex)(&arm_pmus_lock);
+ guard(rcu)();
/*
* It is safe to use a stale cpu to iterate the list of PMUs so long as
@@ -837,7 +836,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void)
* carried here.
*/
cpu = raw_smp_processor_id();
- list_for_each_entry(entry, &arm_pmus, entry) {
+ list_for_each_entry_rcu(entry, &arm_pmus, entry) {
pmu = entry->arm_pmu;
if (cpumask_test_cpu(cpu, &pmu->supported_cpus))
@@ -1088,9 +1087,9 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
int ret = -ENXIO;
lockdep_assert_held(&kvm->arch.config_lock);
- mutex_lock(&arm_pmus_lock);
+ guard(rcu)();
- list_for_each_entry(entry, &arm_pmus, entry) {
+ list_for_each_entry_rcu(entry, &arm_pmus, entry) {
arm_pmu = entry->arm_pmu;
if (arm_pmu->pmu.type == pmu_id) {
if (kvm_vm_has_ran_once(kvm) ||
@@ -1106,7 +1105,6 @@ static int kvm_arm_pmu_v3_set_pmu(struct kvm_vcpu *vcpu, int pmu_id)
}
}
- mutex_unlock(&arm_pmus_lock);
return ret;
}
--
2.53.0
* [PATCH net-next v05 6/6] hinic3: Remove unneeded coalesce parameters
From: Fan Gong @ 2026-04-11 3:37 UTC (permalink / raw)
To: Fan Gong, Zhu Yikai, netdev, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Andrew Lunn,
Ioana Ciornei, Mohsin Bashir
Cc: linux-kernel, linux-doc, luosifu, Xin Guo, Zhou Shuai, Wu Like,
Shi Jing, Zheng Jiezhen, Maxime Chevallier
In-Reply-To: <cover.1775711066.git.zhuyikai1@h-partners.com>
Remove unneeded coalesce parameters in irq handling.
Co-developed-by: Zhu Yikai <zhuyikai1@h-partners.com>
Signed-off-by: Zhu Yikai <zhuyikai1@h-partners.com>
Signed-off-by: Fan Gong <gongfan1@huawei.com>
---
drivers/net/ethernet/huawei/hinic3/hinic3_irq.c | 6 +-----
drivers/net/ethernet/huawei/hinic3/hinic3_rx.h | 3 ---
2 files changed, 1 insertion(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c b/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
index d3b3927b5408..42464c007174 100644
--- a/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_irq.c
@@ -156,13 +156,9 @@ static int hinic3_set_interrupt_moder(struct net_device *netdev, u16 q_id,
spin_unlock_irqrestore(&nic_dev->channel_res_lock, flags);
err = hinic3_set_interrupt_cfg(nic_dev->hwdev, info);
- if (err) {
+ if (err)
netdev_err(netdev,
"Failed to modify moderation for Queue: %u\n", q_id);
- } else {
- nic_dev->rxqs[q_id].last_coalesc_timer_cfg = coalesc_timer_cfg;
- nic_dev->rxqs[q_id].last_pending_limit = pending_limit;
- }
return err;
}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_rx.h b/drivers/net/ethernet/huawei/hinic3/hinic3_rx.h
index c11d080408a7..2ab691ed11a9 100644
--- a/drivers/net/ethernet/huawei/hinic3/hinic3_rx.h
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_rx.h
@@ -111,9 +111,6 @@ struct hinic3_rxq {
dma_addr_t cqe_start_paddr;
struct dim dim;
-
- u8 last_coalesc_timer_cfg;
- u8 last_pending_limit;
} ____cacheline_aligned;
struct hinic3_dyna_rxq_res {
--
2.43.0
* [PATCH net-next v05 4/6] hinic3: Add ethtool rss ops
From: Fan Gong @ 2026-04-11 3:37 UTC (permalink / raw)
To: Fan Gong, Zhu Yikai, netdev, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Andrew Lunn,
Ioana Ciornei, Mohsin Bashir
Cc: linux-kernel, linux-doc, luosifu, Xin Guo, Zhou Shuai, Wu Like,
Shi Jing, Zheng Jiezhen, Maxime Chevallier
In-Reply-To: <cover.1775711066.git.zhuyikai1@h-partners.com>
Implement the following ethtool callback functions:
.get_rxnfc
.set_rxnfc
.get_channels
.set_channels
.get_rxfh_indir_size
.get_rxfh_key_size
.get_rxfh
.set_rxfh
These callbacks let users configure and query RSS parameters
(flow hash fields, channel count, indirection table, hash key and
hash function) through ethtool.
Co-developed-by: Zhu Yikai <zhuyikai1@h-partners.com>
Signed-off-by: Zhu Yikai <zhuyikai1@h-partners.com>
Signed-off-by: Fan Gong <gongfan1@huawei.com>
---
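The flow-hash validation rules applied by the set path can be sketched
as a standalone function. This is an illustration under assumptions, not
the driver code: the SK_RXH_* macros are local stand-ins for the RXH_*
flags from <linux/ethtool.h>, and sk_parse_flow_hash() is a hypothetical
name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the RXH_* flags (assumed to mirror the uapi bits). */
#define SK_RXH_IP_SRC   (1u << 4)
#define SK_RXH_IP_DST   (1u << 5)
#define SK_RXH_L4_B_0_1 (1u << 6) /* bytes 0-1 of the L4 header */
#define SK_RXH_L4_B_2_3 (1u << 7) /* bytes 2-3 of the L4 header */

/*
 * Returns 0 and sets *l4_en when the requested tuple is acceptable,
 * -EINVAL otherwise.
 */
static int sk_parse_flow_hash(uint64_t data, int *l4_en)
{
	assert(l4_en != NULL);

	/* Only IP addresses and L4 ports may be hashed. */
	if (data & ~(uint64_t)(SK_RXH_IP_SRC | SK_RXH_IP_DST |
			       SK_RXH_L4_B_0_1 | SK_RXH_L4_B_2_3))
		return -EINVAL;

	/* Both IP addresses must be part of the tuple. */
	if (!(data & SK_RXH_IP_SRC) || !(data & SK_RXH_IP_DST))
		return -EINVAL;

	/* L4 ports are all-or-nothing: both halves or neither. */
	switch (data & (SK_RXH_L4_B_0_1 | SK_RXH_L4_B_2_3)) {
	case 0:
		*l4_en = 0;
		break;
	case (SK_RXH_L4_B_0_1 | SK_RXH_L4_B_2_3):
		*l4_en = 1;
		break;
	default:
		return -EINVAL;
	}

	return 0;
}
```

Under these rules, a request such as
"ethtool -N eth0 rx-flow-hash tcp4 sdfn" (eth0 is a placeholder device
name) would be accepted, while "sdf" alone would be rejected because
only one half of the L4 ports is selected.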
.../ethernet/huawei/hinic3/hinic3_ethtool.c | 9 +
.../huawei/hinic3/hinic3_mgmt_interface.h | 2 +
.../net/ethernet/huawei/hinic3/hinic3_rss.c | 487 +++++++++++++++++-
.../net/ethernet/huawei/hinic3/hinic3_rss.h | 19 +
4 files changed, 515 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c b/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c
index f0fb9a30840b..69663ee70cbd 100644
--- a/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_ethtool.c
@@ -15,6 +15,7 @@
#include "hinic3_hw_comm.h"
#include "hinic3_nic_dev.h"
#include "hinic3_nic_cfg.h"
+#include "hinic3_rss.h"
#define HINIC3_MGMT_VERSION_MAX_LEN 32
/* Coalesce time properties in microseconds */
@@ -1231,6 +1232,14 @@ static const struct ethtool_ops hinic3_ethtool_ops = {
.get_pause_stats = hinic3_get_pause_stats,
.get_coalesce = hinic3_get_coalesce,
.set_coalesce = hinic3_set_coalesce,
+ .get_rxnfc = hinic3_get_rxnfc,
+ .set_rxnfc = hinic3_set_rxnfc,
+ .get_channels = hinic3_get_channels,
+ .set_channels = hinic3_set_channels,
+ .get_rxfh_indir_size = hinic3_get_rxfh_indir_size,
+ .get_rxfh_key_size = hinic3_get_rxfh_key_size,
+ .get_rxfh = hinic3_get_rxfh,
+ .set_rxfh = hinic3_set_rxfh,
};
void hinic3_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_mgmt_interface.h b/drivers/net/ethernet/huawei/hinic3/hinic3_mgmt_interface.h
index 76c691f82703..3c1263ff99ff 100644
--- a/drivers/net/ethernet/huawei/hinic3/hinic3_mgmt_interface.h
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_mgmt_interface.h
@@ -282,6 +282,7 @@ enum l2nic_cmd {
L2NIC_CMD_SET_VLAN_FILTER_EN = 26,
L2NIC_CMD_SET_RX_VLAN_OFFLOAD = 27,
L2NIC_CMD_CFG_RSS = 60,
+ L2NIC_CMD_GET_RSS_CTX_TBL = 62,
L2NIC_CMD_CFG_RSS_HASH_KEY = 63,
L2NIC_CMD_CFG_RSS_HASH_ENGINE = 64,
L2NIC_CMD_SET_RSS_CTX_TBL = 65,
@@ -301,6 +302,7 @@ enum l2nic_ucode_cmd {
L2NIC_UCODE_CMD_MODIFY_QUEUE_CTX = 0,
L2NIC_UCODE_CMD_CLEAN_QUEUE_CTX = 1,
L2NIC_UCODE_CMD_SET_RSS_INDIR_TBL = 4,
+ L2NIC_UCODE_CMD_GET_RSS_INDIR_TBL = 6,
};
/* hilink mac group command */
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_rss.c b/drivers/net/ethernet/huawei/hinic3/hinic3_rss.c
index 25db74d8c7dd..b40d5fa885c2 100644
--- a/drivers/net/ethernet/huawei/hinic3/hinic3_rss.c
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_rss.c
@@ -155,7 +155,7 @@ static int hinic3_set_rss_type(struct hinic3_hwdev *hwdev,
L2NIC_CMD_SET_RSS_CTX_TBL, &msg_params);
if (ctx_tbl.msg_head.status == MGMT_STATUS_CMD_UNSUPPORTED) {
- return MGMT_STATUS_CMD_UNSUPPORTED;
+ return -EOPNOTSUPP;
} else if (err || ctx_tbl.msg_head.status) {
dev_err(hwdev->dev, "mgmt Failed to set rss context offload, err: %d, status: 0x%x\n",
err, ctx_tbl.msg_head.status);
@@ -165,6 +165,39 @@ static int hinic3_set_rss_type(struct hinic3_hwdev *hwdev,
return 0;
}
+static int hinic3_get_rss_type(struct hinic3_hwdev *hwdev,
+ struct hinic3_rss_type *rss_type)
+{
+ struct l2nic_cmd_rss_ctx_tbl ctx_tbl = {};
+ struct mgmt_msg_params msg_params = {};
+ int err;
+
+ ctx_tbl.func_id = hinic3_global_func_id(hwdev);
+
+ mgmt_msg_params_init_default(&msg_params, &ctx_tbl, sizeof(ctx_tbl));
+
+ err = hinic3_send_mbox_to_mgmt(hwdev, MGMT_MOD_L2NIC,
+ L2NIC_CMD_GET_RSS_CTX_TBL,
+ &msg_params);
+ if (err || ctx_tbl.msg_head.status) {
+ dev_err(hwdev->dev, "Failed to get hash type, err: %d, status: 0x%x\n",
+ err, ctx_tbl.msg_head.status);
+ return -EINVAL;
+ }
+
+ rss_type->ipv4 = L2NIC_RSS_TYPE_GET(ctx_tbl.context, IPV4);
+ rss_type->ipv6 = L2NIC_RSS_TYPE_GET(ctx_tbl.context, IPV6);
+ rss_type->ipv6_ext = L2NIC_RSS_TYPE_GET(ctx_tbl.context, IPV6_EXT);
+ rss_type->tcp_ipv4 = L2NIC_RSS_TYPE_GET(ctx_tbl.context, TCP_IPV4);
+ rss_type->tcp_ipv6 = L2NIC_RSS_TYPE_GET(ctx_tbl.context, TCP_IPV6);
+ rss_type->tcp_ipv6_ext = L2NIC_RSS_TYPE_GET(ctx_tbl.context,
+ TCP_IPV6_EXT);
+ rss_type->udp_ipv4 = L2NIC_RSS_TYPE_GET(ctx_tbl.context, UDP_IPV4);
+ rss_type->udp_ipv6 = L2NIC_RSS_TYPE_GET(ctx_tbl.context, UDP_IPV6);
+
+ return 0;
+}
+
static int hinic3_rss_cfg_hash_type(struct hinic3_hwdev *hwdev, u8 opcode,
enum hinic3_rss_hash_type *type)
{
@@ -264,7 +297,8 @@ static int hinic3_set_hw_rss_parameters(struct net_device *netdev, u8 rss_en)
if (err)
return err;
- hinic3_fillout_indir_tbl(netdev, nic_dev->rss_indir);
+ if (!netif_is_rxfh_configured(netdev))
+ hinic3_fillout_indir_tbl(netdev, nic_dev->rss_indir);
err = hinic3_config_rss_hw_resource(netdev, nic_dev->rss_indir);
if (err)
@@ -334,3 +368,452 @@ void hinic3_try_to_enable_rss(struct net_device *netdev)
clear_bit(HINIC3_RSS_ENABLE, &nic_dev->flags);
nic_dev->q_params.num_qps = nic_dev->max_qps;
}
+
+static int hinic3_set_l4_rss_hash_ops(const struct ethtool_rxnfc *cmd,
+ struct hinic3_rss_type *rss_type)
+{
+ u8 rss_l4_en;
+
+ switch (cmd->data & (RXH_L4_B_0_1 | RXH_L4_B_2_3)) {
+ case 0:
+ rss_l4_en = 0;
+ break;
+ case (RXH_L4_B_0_1 | RXH_L4_B_2_3):
+ rss_l4_en = 1;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ switch (cmd->flow_type) {
+ case TCP_V4_FLOW:
+ rss_type->tcp_ipv4 = rss_l4_en;
+ break;
+ case TCP_V6_FLOW:
+ rss_type->tcp_ipv6 = rss_l4_en;
+ break;
+ case UDP_V4_FLOW:
+ rss_type->udp_ipv4 = rss_l4_en;
+ break;
+ case UDP_V6_FLOW:
+ rss_type->udp_ipv6 = rss_l4_en;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int hinic3_update_rss_hash_opts(struct net_device *netdev,
+ struct ethtool_rxnfc *cmd,
+ struct hinic3_rss_type *rss_type)
+{
+ int err;
+
+ switch (cmd->flow_type) {
+ case TCP_V4_FLOW:
+ case TCP_V6_FLOW:
+ case UDP_V4_FLOW:
+ case UDP_V6_FLOW:
+ err = hinic3_set_l4_rss_hash_ops(cmd, rss_type);
+ if (err)
+ return err;
+
+ break;
+ case IPV4_FLOW:
+ rss_type->ipv4 = 1;
+ break;
+ case IPV6_FLOW:
+ rss_type->ipv6 = 1;
+ break;
+ default:
+ netdev_err(netdev, "Unsupported flow type\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int hinic3_set_rss_hash_opts(struct net_device *netdev,
+ struct ethtool_rxnfc *cmd)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_rss_type rss_type;
+ int err;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+ cmd->data = 0;
+ netdev_err(netdev, "RSS is disabled, setting flow-hash is not supported\n");
+ return -EOPNOTSUPP;
+ }
+
+ /* RSS only supports hashing of IP addresses and L4 ports */
+ if (cmd->data & ~(RXH_IP_SRC | RXH_IP_DST |
+ RXH_L4_B_0_1 | RXH_L4_B_2_3))
+ return -EINVAL;
+
+ /* Both IP addresses must be part of the hash tuple */
+ if (!(cmd->data & RXH_IP_SRC) || !(cmd->data & RXH_IP_DST))
+ return -EINVAL;
+
+ err = hinic3_get_rss_type(nic_dev->hwdev, &rss_type);
+ if (err) {
+ netdev_err(netdev, "Failed to get rss type\n");
+ return err;
+ }
+
+ err = hinic3_update_rss_hash_opts(netdev, cmd, &rss_type);
+ if (err)
+ return err;
+
+ err = hinic3_set_rss_type(nic_dev->hwdev, rss_type);
+ if (err) {
+ netdev_err(netdev, "Failed to set rss type\n");
+ return err;
+ }
+
+ nic_dev->rss_type = rss_type;
+
+ return 0;
+}
+
+static void convert_rss_type(u8 rss_opt, struct ethtool_rxnfc *cmd)
+{
+ if (rss_opt)
+ cmd->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+}
+
+static int hinic3_convert_rss_type(struct net_device *netdev,
+ struct hinic3_rss_type *rss_type,
+ struct ethtool_rxnfc *cmd)
+{
+ cmd->data = RXH_IP_SRC | RXH_IP_DST;
+ switch (cmd->flow_type) {
+ case TCP_V4_FLOW:
+ convert_rss_type(rss_type->tcp_ipv4, cmd);
+ break;
+ case TCP_V6_FLOW:
+ convert_rss_type(rss_type->tcp_ipv6, cmd);
+ break;
+ case UDP_V4_FLOW:
+ convert_rss_type(rss_type->udp_ipv4, cmd);
+ break;
+ case UDP_V6_FLOW:
+ convert_rss_type(rss_type->udp_ipv6, cmd);
+ break;
+ case IPV4_FLOW:
+ case IPV6_FLOW:
+ break;
+ default:
+ netdev_err(netdev, "Unsupported flow type\n");
+ cmd->data = 0;
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int hinic3_get_rss_hash_opts(struct net_device *netdev,
+ struct ethtool_rxnfc *cmd)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ struct hinic3_rss_type rss_type;
+ int err;
+
+ cmd->data = 0;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags))
+ return 0;
+
+ err = hinic3_get_rss_type(nic_dev->hwdev, &rss_type);
+ if (err) {
+ netdev_err(netdev, "Failed to get rss type\n");
+ return err;
+ }
+
+ return hinic3_convert_rss_type(netdev, &rss_type, cmd);
+}
+
+int hinic3_get_rxnfc(struct net_device *netdev,
+ struct ethtool_rxnfc *cmd, u32 *rule_locs)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err = 0;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_GRXRINGS:
+ cmd->data = nic_dev->q_params.num_qps;
+ break;
+ case ETHTOOL_GRXFH:
+ err = hinic3_get_rss_hash_opts(netdev, cmd);
+ break;
+ default:
+ err = -EOPNOTSUPP;
+ break;
+ }
+
+ return err;
+}
+
+int hinic3_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd)
+{
+ int err;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_SRXFH:
+ err = hinic3_set_rss_hash_opts(netdev, cmd);
+ break;
+ default:
+ err = -EOPNOTSUPP;
+ break;
+ }
+
+ return err;
+}
+
+static u16 hinic3_max_channels(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ u8 tcs = netdev_get_num_tc(netdev);
+
+ return tcs ? nic_dev->max_qps / tcs : nic_dev->max_qps;
+}
+
+static u16 hinic3_curr_channels(struct net_device *netdev)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+
+ if (netif_running(netdev))
+ return nic_dev->q_params.num_qps ?
+ nic_dev->q_params.num_qps : 1;
+ else
+ return min_t(u16, hinic3_max_channels(netdev),
+ nic_dev->q_params.num_qps);
+}
+
+void hinic3_get_channels(struct net_device *netdev,
+ struct ethtool_channels *channels)
+{
+ channels->max_rx = 0;
+ channels->max_tx = 0;
+ channels->max_other = 0;
+ /* report maximum channels */
+ channels->max_combined = hinic3_max_channels(netdev);
+ channels->rx_count = 0;
+ channels->tx_count = 0;
+ channels->other_count = 0;
+ /* report current channel count */
+ channels->combined_count = hinic3_curr_channels(netdev);
+}
+
+static int
+hinic3_validate_channel_parameter(struct net_device *netdev,
+ const struct ethtool_channels *channels)
+{
+ u16 max_channel = hinic3_max_channels(netdev);
+ unsigned int count = channels->combined_count;
+
+ if (!count) {
+ netdev_err(netdev, "Unsupported combined_count=0\n");
+ return -EINVAL;
+ }
+
+ if (channels->tx_count || channels->rx_count || channels->other_count) {
+ netdev_err(netdev, "Setting rx/tx/other count not supported\n");
+ return -EINVAL;
+ }
+
+ if (count > max_channel) {
+ netdev_err(netdev, "Combined count %u exceeds limit %u\n", count,
+ max_channel);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int hinic3_set_channels(struct net_device *netdev,
+ struct ethtool_channels *channels)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ unsigned int count = channels->combined_count;
+ struct hinic3_dyna_txrxq_params q_params;
+ int err;
+
+ if (hinic3_validate_channel_parameter(netdev, channels))
+ return -EINVAL;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+ netdev_err(netdev, "This function doesn't support RSS, only 1 queue pair is supported\n");
+ return -EOPNOTSUPP;
+ }
+
+ netdev_dbg(netdev, "Set max combined queue number from %u to %u\n",
+ nic_dev->q_params.num_qps, count);
+
+ if (netif_running(netdev)) {
+ q_params = nic_dev->q_params;
+ q_params.num_qps = (u16)count;
+ q_params.txqs_res = NULL;
+ q_params.rxqs_res = NULL;
+ q_params.irq_cfg = NULL;
+
+ err = hinic3_change_channel_settings(netdev, &q_params);
+ if (err) {
+ netdev_err(netdev, "Failed to change channel settings\n");
+ return err;
+ }
+ } else {
+ nic_dev->q_params.num_qps = (u16)count;
+ }
+
+ return 0;
+}
+
+u32 hinic3_get_rxfh_indir_size(struct net_device *netdev)
+{
+ return L2NIC_RSS_INDIR_SIZE;
+}
+
+static int hinic3_set_rss_rxfh(struct net_device *netdev,
+ const u32 *indir, u8 *key)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err;
+ u32 i;
+
+ if (indir) {
+ for (i = 0; i < L2NIC_RSS_INDIR_SIZE; i++)
+ nic_dev->rss_indir[i] = (u16)indir[i];
+
+ err = hinic3_rss_set_indir_tbl(nic_dev->hwdev,
+ nic_dev->rss_indir);
+ if (err) {
+ netdev_err(netdev, "Failed to set rss indir table\n");
+ return err;
+ }
+ }
+
+ if (key) {
+ err = hinic3_rss_set_hash_key(nic_dev->hwdev, key);
+ if (err) {
+ netdev_err(netdev, "Failed to set rss key\n");
+ return err;
+ }
+
+ memcpy(nic_dev->rss_hkey, key, L2NIC_RSS_KEY_SIZE);
+ }
+
+ return 0;
+}
+
+u32 hinic3_get_rxfh_key_size(struct net_device *netdev)
+{
+ return L2NIC_RSS_KEY_SIZE;
+}
+
+static int hinic3_rss_get_indir_tbl(struct hinic3_hwdev *hwdev,
+ u32 *indir_table)
+{
+ struct hinic3_cmd_buf_pair pair;
+ __le16 *indir_tbl = NULL;
+ int err, i;
+
+ err = hinic3_cmd_buf_pair_init(hwdev, &pair);
+ if (err) {
+ dev_err(hwdev->dev, "Failed to allocate cmd_buf.\n");
+ return err;
+ }
+
+ err = hinic3_cmdq_detail_resp(hwdev, MGMT_MOD_L2NIC,
+ L2NIC_UCODE_CMD_GET_RSS_INDIR_TBL,
+ pair.in, pair.out, NULL);
+ if (err) {
+ dev_err(hwdev->dev, "Failed to get rss indir table\n");
+ goto err_get_indir_tbl;
+ }
+
+ indir_tbl = (__le16 *)pair.out->buf;
+ for (i = 0; i < L2NIC_RSS_INDIR_SIZE; i++)
+ indir_table[i] = le16_to_cpu(*(indir_tbl + i));
+
+err_get_indir_tbl:
+ hinic3_cmd_buf_pair_uninit(hwdev, &pair);
+
+ return err;
+}
+
+int hinic3_get_rxfh(struct net_device *netdev,
+ struct ethtool_rxfh_param *rxfh)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err = 0;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+ netdev_err(netdev, "Rss is disabled\n");
+ return -EOPNOTSUPP;
+ }
+
+ rxfh->hfunc =
+ nic_dev->rss_hash_type == HINIC3_RSS_HASH_ENGINE_TYPE_XOR ?
+ ETH_RSS_HASH_XOR : ETH_RSS_HASH_TOP;
+
+ if (rxfh->indir) {
+ err = hinic3_rss_get_indir_tbl(nic_dev->hwdev, rxfh->indir);
+ if (err)
+ return err;
+ }
+
+ if (rxfh->key)
+ memcpy(rxfh->key, nic_dev->rss_hkey, L2NIC_RSS_KEY_SIZE);
+
+ return err;
+}
+
+static int hinic3_update_hash_func_type(struct net_device *netdev, u8 hfunc)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ enum hinic3_rss_hash_type new_rss_hash_type;
+
+ switch (hfunc) {
+ case ETH_RSS_HASH_NO_CHANGE:
+ return 0;
+ case ETH_RSS_HASH_XOR:
+ new_rss_hash_type = HINIC3_RSS_HASH_ENGINE_TYPE_XOR;
+ break;
+ case ETH_RSS_HASH_TOP:
+ new_rss_hash_type = HINIC3_RSS_HASH_ENGINE_TYPE_TOEP;
+ break;
+ default:
+ netdev_err(netdev, "Unsupported hash func %u\n", hfunc);
+ return -EOPNOTSUPP;
+ }
+
+ if (new_rss_hash_type == nic_dev->rss_hash_type)
+ return 0;
+
+ nic_dev->rss_hash_type = new_rss_hash_type;
+ return hinic3_rss_set_hash_type(nic_dev->hwdev, nic_dev->rss_hash_type);
+}
+
+int hinic3_set_rxfh(struct net_device *netdev,
+ struct ethtool_rxfh_param *rxfh,
+ struct netlink_ext_ack *extack)
+{
+ struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
+ int err;
+
+ if (!test_bit(HINIC3_RSS_ENABLE, &nic_dev->flags)) {
+ netdev_err(netdev, "Setting RSS parameters is not supported when RSS is disabled\n");
+ return -EOPNOTSUPP;
+ }
+
+ err = hinic3_update_hash_func_type(netdev, rxfh->hfunc);
+ if (err)
+ return err;
+
+ err = hinic3_set_rss_rxfh(netdev, rxfh->indir, rxfh->key);
+
+ return err;
+}
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_rss.h b/drivers/net/ethernet/huawei/hinic3/hinic3_rss.h
index 78d82c2aca06..9f1b77780cd4 100644
--- a/drivers/net/ethernet/huawei/hinic3/hinic3_rss.h
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_rss.h
@@ -5,10 +5,29 @@
#define _HINIC3_RSS_H_
#include <linux/netdevice.h>
+#include <linux/ethtool.h>
int hinic3_rss_init(struct net_device *netdev);
void hinic3_rss_uninit(struct net_device *netdev);
void hinic3_try_to_enable_rss(struct net_device *netdev);
void hinic3_clear_rss_config(struct net_device *netdev);
+int hinic3_get_rxnfc(struct net_device *netdev,
+ struct ethtool_rxnfc *cmd, u32 *rule_locs);
+int hinic3_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd);
+
+void hinic3_get_channels(struct net_device *netdev,
+ struct ethtool_channels *channels);
+int hinic3_set_channels(struct net_device *netdev,
+ struct ethtool_channels *channels);
+
+u32 hinic3_get_rxfh_indir_size(struct net_device *netdev);
+u32 hinic3_get_rxfh_key_size(struct net_device *netdev);
+
+int hinic3_get_rxfh(struct net_device *netdev,
+ struct ethtool_rxfh_param *rxfh);
+int hinic3_set_rxfh(struct net_device *netdev,
+ struct ethtool_rxfh_param *rxfh,
+ struct netlink_ext_ack *extack);
+
#endif
--
2.43.0
* [PATCH net-next v05 5/6] hinic3: Configure netdev->watchdog_timeo to set nic tx timeout
From: Fan Gong @ 2026-04-11 3:37 UTC (permalink / raw)
To: Fan Gong, Zhu Yikai, netdev, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Andrew Lunn,
Ioana Ciornei, Mohsin Bashir
Cc: linux-kernel, linux-doc, luosifu, Xin Guo, Zhou Shuai, Wu Like,
Shi Jing, Zheng Jiezhen, Maxime Chevallier
In-Reply-To: <cover.1775711066.git.zhuyikai1@h-partners.com>
Configure netdev watchdog timeout to improve transmission reliability.
Co-developed-by: Zhu Yikai <zhuyikai1@h-partners.com>
Signed-off-by: Zhu Yikai <zhuyikai1@h-partners.com>
Signed-off-by: Fan Gong <gongfan1@huawei.com>
---
drivers/net/ethernet/huawei/hinic3/hinic3_main.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/huawei/hinic3/hinic3_main.c b/drivers/net/ethernet/huawei/hinic3/hinic3_main.c
index 3b470978714a..7e09b4b2da9f 100644
--- a/drivers/net/ethernet/huawei/hinic3/hinic3_main.c
+++ b/drivers/net/ethernet/huawei/hinic3/hinic3_main.c
@@ -33,6 +33,8 @@
#define HINIC3_RX_PENDING_LIMIT_LOW 2
#define HINIC3_RX_PENDING_LIMIT_HIGH 8
+#define HINIC3_WATCHDOG_TIMEOUT 5
+
static void init_intr_coal_param(struct net_device *netdev)
{
struct hinic3_nic_dev *nic_dev = netdev_priv(netdev);
@@ -246,6 +248,8 @@ static void hinic3_assign_netdev_ops(struct net_device *netdev)
{
hinic3_set_netdev_ops(netdev);
hinic3_set_ethtool_ops(netdev);
+
+ netdev->watchdog_timeo = HINIC3_WATCHDOG_TIMEOUT * HZ;
}
static void netdev_feature_init(struct net_device *netdev)
--
2.43.0