From: Greg KH <gregkh@linuxfoundation.org>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Muralidhara M K <muralidhara.mk@amd.com>,
ilpo.jarvinen@linux.intel.com, rafael@kernel.org,
platform-driver-x86@vger.kernel.org,
linux-kernel@vger.kernel.org, driver-core@lists.linux.dev
Subject: Re: [PATCH v2 4/7] sysfs: Add SYSFS_HUGE_BIN_FILE flag for binary attributes larger than PAGE_SIZE
Date: Wed, 13 May 2026 08:24:15 +0200
Message-ID: <2026051354-succulent-acting-e006@gregkh>
In-Reply-To: <47c6134b-c0ed-44af-b77e-145bfceded74@amd.com>

On Wed, May 13, 2026 at 09:43:57AM +0530, K Prateek Nayak wrote:
> Hello Greg,
>
> On 5/12/2026 5:31 PM, Greg KH wrote:
> > On Mon, Apr 27, 2026 at 09:21:26PM +0530, Muralidhara M K wrote:
> >> Historically, sysfs read buffers were allocated with get_zeroed_page(),
> >> limiting reads to PAGE_SIZE. Commit 13c589d5b0ac ("sysfs: use seq_file
> >> when reading regular files") transitioned regular (text) attribute reads
> >> to seq_file, which can dynamically grow buffers beyond PAGE_SIZE.
> >> However, the PAGE_SIZE limit was intentionally preserved for
> >> compatibility. When binary attribute handling was later unified into
> >> the same codebase, the non-seq_file read path (kernfs_file_read_iter)
> >> retained this PAGE_SIZE cap for binary files as well.
> >>
> >> Drivers that expose binary attributes larger than PAGE_SIZE — such as
> >> the AMD HSMP metric table (~13 KB) — cannot deliver the full content
> >> in a single read() call through the existing path.
> >
> > That's fine, userspace must be able to handle a "short" read, and will
> > just continue on and read everything afterward, right? You can't rely
> > on userspace always asking for more data.
>
> I think this is complicated by the HSMP driver bits that require the
> read to issue an HSMP command to the hardware first to update the
> table before copying from the MMIO region.
Then you have bigger problems here :(
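
For reference, the short-read handling mentioned in the quoted text above can be sketched in userspace as follows. This is a minimal illustration, not code from the patch series; the helper name read_full() is hypothetical. It shows the usual loop a reader needs when read() may return fewer bytes than requested.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to `len` bytes from `fd` into `buf`,
 * looping over short reads. Returns the number of bytes read (less
 * than `len` only if EOF was reached), or -1 with errno set on error.
 */
ssize_t read_full(int fd, void *buf, size_t len)
{
	unsigned char *p = buf;
	size_t done = 0;

	assert(buf != NULL || len == 0);

	while (done < len) {
		ssize_t n = read(fd, p + done, len - done);

		if (n > 0)
			done += (size_t)n;	/* short read: keep going */
		else if (n == 0)
			break;			/* EOF */
		else if (errno != EINTR)
			return -1;		/* real error */
		/* EINTR: retry the read */
	}
	return (ssize_t)done;
}
```

Note that a loop like this only collects all the bytes. It does nothing to guarantee that the bytes from separate read() calls describe the same snapshot of the underlying data.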
> If a concurrent reader arrives, they'll refresh the table for their
> PAGE_SIZE chunk read and the prior user will see a torn value. For
> the most part it shouldn't be a problem, but folks try to correlate the
> Temperature and Power data from the first chunk with the Throttle
> Indicators in the second chunk and sometimes they don't match
> expectations.
Again, this is a problem, perhaps do not use sysfs for this? You can't
control userspace, and to expect it to always work properly is not going
to end well. This change isn't going to fix your problems listed above
at all.
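
The torn read described in the quoted text can be modeled with a small single-threaded userspace simulation. Everything here is hypothetical (struct fake_dev, refresh(), the two-chunk table, the generation stamp), and it assumes the table is refreshed only when a read starts at offset 0. The real driver may behave differently; this is only a sketch of the interleaving.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CHUNK 4096u             /* stands in for PAGE_SIZE */
#define TABLE_SIZE (2 * CHUNK)  /* hypothetical table spanning two chunks */

/* Hypothetical device state: `hw` is rewritten on every refresh. */
struct fake_dev {
	uint32_t generation;
	unsigned char hw[TABLE_SIZE];
};

/* Models "issue the command, hardware updates the table". */
void refresh(struct fake_dev *d)
{
	d->generation++;
	/* Stamp the generation at the start of each chunk. */
	memcpy(d->hw, &d->generation, sizeof(d->generation));
	memcpy(d->hw + CHUNK, &d->generation, sizeof(d->generation));
}

/* Chunked read. ASSUMPTION: the refresh happens only at offset 0. */
void read_chunk(struct fake_dev *d, size_t off, unsigned char *out)
{
	assert(off + CHUNK <= TABLE_SIZE);
	if (off == 0)
		refresh(d);
	memcpy(out, d->hw + off, CHUNK);
}

/* One-shot read: refresh once, then copy the whole table. */
void read_snapshot(struct fake_dev *d, unsigned char *out)
{
	refresh(d);
	memcpy(out, d->hw, TABLE_SIZE);
}

/* Returns 1 if both chunk stamps in `buf` carry the same generation. */
int is_consistent(const unsigned char *buf)
{
	return memcmp(buf, buf + CHUNK, sizeof(uint32_t)) == 0;
}
```

If reader A calls read_chunk() at offset 0, reader B then calls read_chunk() at offset 0, and reader A finishes with read_chunk() at offset CHUNK, the two halves of A's buffer carry different generation stamps. A single read_snapshot() call does not have that problem in this model, but nothing forces userspace to read the whole table in one call.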
> The table should never have grown this big but some folks decided it
> was a good idea and we can't fix it for a while and have hit the
> PAGE_SIZE limit now.

Just delete it and use a different interface to the kernel instead,
please. If you need atomic reads/writes, use an ioctl. Don't try to turn
sysfs into something that it was not designed for at all.

> If there is a better alternative, we are all ears, and more than happy
> to try out an alternative suggestion for the described problem.

A misc device sounds like the proper solution.

> >> Introduce a new opt-in flag SYSFS_HUGE_BIN_FILE (040000) that drivers
> >> can OR into their bin_attribute mode. When set, sysfs selects a new
> >> kernfs_ops (sysfs_bin_kfops_huge_file_ro) whose .seq_show callback
> >> pipes the bin_attribute ->read() result through seq_file, allowing
> >> reads of arbitrary size in one shot. Existing binary attributes
> >> without the flag continue using the legacy capped path.
> >
> > If this is such a big issue, why not just do it always for binary files?
> > What is the benefit of keeping two different code paths just for this
> > "new" flag?
>
> We can do that! For bin attributes that specify .size or a size
> function, we can use a flexible buffer and for the ones that don't, we
> can enforce a PAGE_SIZE cap like today.
>
> Would that be okay?

Overall, yes, but again, I don't think this is going to fix your
problem.

thanks,

greg k-h