From: Al Viro <viro@zeniv.linux.org.uk>
To: linux-fsdevel@vger.kernel.org
Cc: gregkh@linuxfoundation.org
Subject: [PATCHES][RFC][CFT] debugfs cleanups
Date: Sun, 29 Dec 2024 08:09:48 +0000
Message-ID: <20241229080948.GY1977892@ZenIV>
Debugfs has several unpleasant problems, all with the same root
cause - use of bare struct inode. Most of the filesystems embed struct
inode into fs-specific objects used to store whatever extra state is
needed for that filesystem. struct inode itself has one opaque pointer
(->i_private), and for simple cases that's enough; unfortunately,
debugfs case is not that simple.
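For readers unfamiliar with the embedding pattern mentioned above, here is a minimal userspace sketch of it. Everything in it is a stand-in: struct inode is a two-field mock, and demo_inode_info, DEMO_I() and the aux field are hypothetical names chosen for illustration, not the debugfs definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Mock of the generic object; the real struct inode is far larger. */
struct inode {
	unsigned long i_ino;
	void *i_private;	/* the single opaque pointer */
};

/* Hypothetical fs-specific container with the inode embedded in it. */
struct demo_inode_info {
	struct inode vfs_inode;
	const void *aux;	/* extra per-inode state, impossible with a bare inode */
};

/* Go from the embedded inode back to its container. */
static struct demo_inode_info *DEMO_I(struct inode *inode)
{
	return (struct demo_inode_info *)
		((char *)inode - offsetof(struct demo_inode_info, vfs_inode));
}

/* Allocate the container, hand out a pointer to the embedded inode. */
static struct inode *demo_alloc_inode(void)
{
	struct demo_inode_info *info = calloc(1, sizeof(*info));

	return info ? &info->vfs_inode : NULL;
}

/* Free the whole container, not just the inode. */
static void demo_free_inode(struct inode *inode)
{
	free(DEMO_I(inode));
}
```

Generic code keeps passing struct inode pointers around, while fs-specific code calls DEMO_I() to reach whatever extra fields it needs.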
A debugfs file needs to carry at least the following:
* which driver-supplied callbacks should be used by read/write/etc.
* which driver-supplied object should they act upon, if driver has
several objects of the same kind (a very common situation)
What's more, since driver may remove the underlying object, we
need some exclusion between the methods and removal; that is to say, for
anything opened we need to keep track of the number of threads currently
in driver-supplied callbacks and we need some way for removal to make
sure that no further callbacks will be attempted and to wait for the
ones in progress to finish.
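The exclusion requirement above can be modelled in userspace with C11 atomics. This is only a sketch of the idea (a use count with a creation bias plus a "dead" flag); the names use_track_* are made up here, and the blocking wait that removal would do is reduced to a boolean return value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct use_track {
	atomic_bool dead;	/* set once removal has started */
	atomic_int active;	/* 1 (creation bias) + threads inside callbacks */
};

static void use_track_init(struct use_track *t)
{
	atomic_init(&t->dead, false);
	atomic_init(&t->active, 1);
}

/* Called before entering a driver callback; false means "refuse". */
static bool use_track_get(struct use_track *t)
{
	int old;

	if (atomic_load(&t->dead))
		return false;
	old = atomic_load(&t->active);
	while (old != 0) {
		/* increment unless the count has already dropped to zero */
		if (atomic_compare_exchange_weak(&t->active, &old, old + 1))
			return true;
	}
	return false;
}

/* Called after leaving a callback; true means removal may now proceed. */
static bool use_track_put(struct use_track *t)
{
	return atomic_fetch_sub(&t->active, 1) == 1;
}

/*
 * Removal side: refuse new callers and drop the creation bias.
 * Returns true if nobody is inside a callback; on false the remover
 * has to wait until some use_track_put() returns true.
 */
static bool use_track_kill(struct use_track *t)
{
	atomic_store(&t->dead, true);
	return atomic_fetch_sub(&t->active, 1) == 1;
}
```

A caller that slips in between the flag being set and the bias being dropped is harmless: it holds a count, so removal still waits for it.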
Worse yet, if we have several similar files for access to
different fields of the same driver's object, we need either a separate
set of callbacks for each field, or some way for a callback to tell
which field it's being used for.
Result is kludges galore. The pointer to driver's object is
stored in inode->i_private. The pointer to use-tracking state (struct
debugfs_fsdata) is stored in dentry->d_fsdata... and we have run out
of opaque fields.
The pointer to driver-supplied callback table has nowhere to go -
we can't store it in ->i_fop (that points to debugfs wrappers that would
handle use counts and call the real driver-supplied callbacks). OK, that
can be kludged around - we allocate debugfs_fsdata on the first use,
and we can keep that pointer in it afterwards. So we have it stashed
into ->d_fsdata until that point and move it over into debugfs_fsdata
afterwards. Since pointers are aligned, we can use the lowest bit to
tell one from another.
That has grown even nastier when an option to use a trimmed-down
variant of method table had been added (for most of those files we only
need read/write/lseek). Now we have 3 states - ->d_fsdata pointing to
struct debugfs_fsdata, ->d_fsdata used to stash a pointer to full struct
file_operations and ->d_fsdata used to stash a pointer to trimmed-down
variant. Not to worry - all those pointers are at least 32bit-aligned,
so we can use the lower couple of bits to tell one state from another
(00, 01 and 11 resp.)
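A userspace sketch of that low-bits encoding follows. Only the tag values (00, 01, 11) come from the description above; the struct names and the stash_* helpers are hypothetical and exist just to make the trick concrete.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three things the field may point to. */
struct fsdata { int active_users; };
struct full_fops { int dummy; };
struct short_fops { int dummy; };

enum stash_kind {
	STASH_FSDATA = 0x0,	/* 00: pointer to the use-tracking state */
	STASH_FULL   = 0x1,	/* 01: stashed full method table */
	STASH_SHORT  = 0x3,	/* 11: stashed trimmed-down method table */
};

#define STASH_MASK ((uintptr_t)0x3)

/* Fold the tag into the two low bits of an aligned pointer. */
static void *stash_encode(const void *p, enum stash_kind kind)
{
	/* relies on the pointer being at least 4-byte aligned */
	assert(((uintptr_t)p & STASH_MASK) == 0);
	return (void *)((uintptr_t)p | (uintptr_t)kind);
}

/* Which of the three states is this value in? */
static enum stash_kind stash_kind_of(const void *v)
{
	return (enum stash_kind)((uintptr_t)v & STASH_MASK);
}

/* Strip the tag to recover the original pointer. */
static void *stash_decode(const void *v)
{
	return (void *)((uintptr_t)v & ~STASH_MASK);
}
```

Every reader of the field has to remember to mask before dereferencing, which is exactly the kind of fragility the series sets out to remove.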
It's still not the end, though - some drivers have fuckloads of
files for each underlying object, and while debugfs has an unsanitary
degree of fondness for templates, apparently there are some limits.
That gets kludged over in two fairly disgusting ways. For one thing,
we already store a reference to driver-supplied file_operations, so
we can always have several (dozens of) identical copies of that thing
and have callback ask debugfs to give it the pointer. Voila - we can
tell whether we are trying to read driver_object->plonk or driver_object->puke
by checking which copy does that point to. Another kludge is to look
at dentry name - compare it with a bunch of strings until we find the
one we want.
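The first of those kludges can be illustrated with a small userspace model. The types below (mock_fops, field_desc, widget) are invented for the sketch; the point is only that each field gets its own, otherwise identical, copy of the method table, and the callback works out which field it serves from the address of the copy it was reached through.

```c
#include <assert.h>
#include <stddef.h>

/* Mock method table; the callback is told which table it came from. */
struct mock_fops {
	int (*read)(const struct mock_fops *self, const void *obj);
};

/* Hypothetical driver object with two fields exposed as files. */
struct widget {
	int plonk;
	int puke;
};

/* Per-field descriptor with its own copy of the method table embedded. */
struct field_desc {
	struct mock_fops fops;	/* identical contents in every descriptor */
	size_t offset;		/* which field of struct widget this file is */
};

/* The only thing distinguishing the files is which copy we were handed. */
static int field_read(const struct mock_fops *real_fops, const void *obj)
{
	const struct field_desc *d = (const struct field_desc *)
		((const char *)real_fops - offsetof(struct field_desc, fops));

	return *(const int *)((const char *)obj + d->offset);
}

static const struct field_desc plonk_desc = {
	.fops = { .read = field_read },
	.offset = offsetof(struct widget, plonk),
};

static const struct field_desc puke_desc = {
	.fops = { .read = field_read },
	.offset = offsetof(struct widget, puke),
};
```

It works, but costs one full method table per file and ties the field selection to the identity of that table.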
All of that could be avoided if we augmented debugfs inodes with
a couple of pointers - no more stashing crap in ->d_fsdata, etc.
And it's really not hard to do. The series below attempts to untangle
that mess; it can be found in
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #work.debugfs
It's very lightly tested; review and more testing would be
very welcome.
Shortlog:
Al Viro (20):
debugfs: fix missing mutex_destroy() in short_fops case
debugfs: separate cache for debugfs inodes
debugfs: move ->automount into debugfs_inode_info
debugfs: get rid of dynamically allocation proxy_ops
debugfs: don't mess with bits in ->d_fsdata
debugfs: allow to store an additional opaque pointer at file creation
carl9170: stop embedding file_operations into their objects
b43: stop embedding struct file_operations into their objects
b43legacy: make use of debugfs_get_aux()
netdevsim: don't embed file_operations into your structs
mediatek: stop messing with ->d_iname
[not even compile-tested] greybus/camera - stop messing with ->d_iname
mtu3: don't mess wiht ->d_iname
xhci: don't mess with ->d_iname
qat: don't mess with ->d_name
sof-client-ipc-flood-test: don't mess with ->d_name
slub: don't mess with ->d_name
arm_scmi: don't mess with ->d_parent->d_name
octeontx2: don't mess with ->d_parent or ->d_parent->d_name
saner replacement for debugfs_rename()
Diffstat:
Documentation/filesystems/debugfs.rst | 12 +-
.../crypto/intel/qat/qat_common/adf_tl_debugfs.c | 36 +---
drivers/firmware/arm_scmi/raw_mode.c | 12 +-
drivers/net/bonding/bond_debugfs.c | 9 +-
drivers/net/ethernet/amd/xgbe/xgbe-debugfs.c | 19 +-
.../ethernet/marvell/octeontx2/af/rvu_debugfs.c | 76 +++-----
drivers/net/ethernet/marvell/skge.c | 5 +-
drivers/net/ethernet/marvell/sky2.c | 5 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +-
drivers/net/netdevsim/hwstats.c | 29 ++-
drivers/net/wireless/ath/carl9170/debug.c | 28 ++-
drivers/net/wireless/broadcom/b43/debugfs.c | 27 ++-
drivers/net/wireless/broadcom/b43legacy/debugfs.c | 26 ++-
drivers/opp/debugfs.c | 10 +-
drivers/phy/mediatek/phy-mtk-tphy.c | 40 +---
drivers/staging/greybus/camera.c | 17 +-
drivers/usb/host/xhci-debugfs.c | 25 +--
drivers/usb/mtu3/mtu3_debugfs.c | 40 +---
fs/debugfs/file.c | 167 ++++++++--------
fs/debugfs/inode.c | 213 ++++++++++-----------
fs/debugfs/internal.h | 55 +++---
include/linux/debugfs.h | 34 +++-
mm/shrinker_debug.c | 16 +-
mm/slub.c | 13 +-
net/hsr/hsr_debugfs.c | 9 +-
net/mac80211/debugfs_netdev.c | 11 +-
net/wireless/core.c | 5 +-
sound/soc/sof/sof-client-ipc-flood-test.c | 39 ++--
28 files changed, 399 insertions(+), 585 deletions(-)
Patches' overview:
01/20) debugfs: fix missing mutex_destroy() in short_fops case
-stable fodder; in one place the logic for "what does ->d_fsdata
contain for this one" got confused.
Meat of the series:
02/20) debugfs: separate cache for debugfs inodes
Embed them into container (struct debugfs_inode_info, with nothing
else in it at the moment), set the cache up, etc.
Just the infrastructure changes letting us augment debugfs inodes
here; adding stuff will come at the next step.
03/20) debugfs: move ->automount into debugfs_inode_info
... and don't bother with debugfs_fsdata for those. Life's
simpler that way...
04/20) debugfs: get rid of dynamically allocation proxy_ops
All it takes is having full_proxy_open() collect the information
about available methods and store it in debugfs_fsdata.
Wrappers are called only after full_proxy_open() has succeeded
calling debugfs_file_get(), so they are guaranteed to have ->d_fsdata
already pointing to debugfs_fsdata.
As a result, they can check if method is absent and bugger off
early, without any atomic operations, etc. - same effect as what we'd
have from NULL method. Which makes the entire proxy_fops contents
unconditional, making it completely pointless - we can just put those
methods (unconditionally) into debugfs_full_proxy_file_operations and
forget about dynamic allocation, replace_fops, etc.
05/20) debugfs: don't mess with bits in ->d_fsdata
The reason we need that crap is the dual use ->d_fsdata has there -
it's both holding a debugfs_fsdata reference after the first
debugfs_file_get() (actually, after the call of proxy ->open())
*and* it serves as a place to stash a reference to real file_operations
from object creation to the first open. Oh, and it's triple use,
actually - that stashed reference might be to debugfs_short_fops.
Bugger that for a game of soldiers - just put the operations
reference into debugfs-private augmentation of inode. And split
debugfs_full_proxy_file_operations into full and short cases, so that
debugfs_file_get() could tell one from another.
Voila - ->d_fsdata holds NULL until the first (successful)
debugfs_file_get() and a reference to struct debugfs_fsdata afterwards.
06/20) debugfs: allow to store an additional opaque pointer at file creation
Set by debugfs_create_file_aux(name, mode, parent, data, aux, fops).
Plain debugfs_create_file() has it set to NULL.
Accessed by debugfs_get_aux(file).
Convenience macros for numeric opaque data - debugfs_create_file_aux_num
and debugfs_get_aux_num, resp.
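To show what the extra pointer buys a driver, here is a userspace model of the idea. The mock_* helpers only imitate the shape of "create with data and aux" and "get aux from file"; they are not the debugfs API, and struct widget with its plonk/puke fields is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for an open file; not the kernel's struct file. */
struct mock_file {
	void *data;		/* models the driver object pointer */
	const void *aux;	/* models the additional opaque pointer */
};

struct widget {
	int plonk;
	int puke;
};

enum widget_field { FIELD_PLONK, FIELD_PUKE };

/* Models a creation call that takes both data and aux. */
static struct mock_file mock_create_file_aux(void *data, const void *aux)
{
	struct mock_file f = { .data = data, .aux = aux };

	return f;
}

/* Models the accessor a callback would use to fetch aux. */
static const void *mock_get_aux(const struct mock_file *file)
{
	return file->aux;
}

/* One callback serves every field; aux says which one is wanted. */
static int widget_read(const struct mock_file *file)
{
	const struct widget *w = file->data;

	switch ((enum widget_field)(uintptr_t)mock_get_aux(file)) {
	case FIELD_PLONK:
		return w->plonk;
	case FIELD_PUKE:
		return w->puke;
	}
	return -1;
}
```

One method table and one callback now cover all fields, with no copies of file_operations and no peeking at dentry names.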
Now we can trim the crap from drivers:
A bunch of "let's make a bunch of identical copies of struct file_operations
and use debugfs_real_fops() to tell which field are we asked to operate upon" cases:
07/20) carl9170: stop embedding file_operations into their objects
08/20) b43: stop embedding struct file_operations into their objects
09/20) b43legacy: make use of debugfs_get_aux()
10/20) netdevsim: don't embed file_operations into your structs
BTW, that crack about several dozens? Literal truth - carl9170 has
forty-three such identical copies. Hidden by creative uses of preprocessor...
A bunch of "let's play with dentry name" cases:
11/20) mediatek: stop messing with ->d_iname
12/20) [not even compile-tested] greybus/camera - stop messing with ->d_iname
depends on BROKEN, so...
13/20) mtu3: don't mess wiht ->d_iname
14/20) xhci: don't mess with ->d_iname
15/20) qat: don't mess with ->d_name
16/20) sof-client-ipc-flood-test: don't mess with ->d_name
17/20) slub: don't mess with ->d_name
Even nastier - these look at the name of parent directory instead.
18/20) arm_scmi: don't mess with ->d_parent->d_name
19/20) octeontx2: don't mess with ->d_parent or ->d_parent->d_name
The last commit is pretty much independent from the rest of series:
20/20) saner replacement for debugfs_rename()
Existing primitive has several problems:
1) calling conventions are clumsy - it returns a dentry reference
that is either identical to its second argument or is an ERR_PTR(-E...);
in both cases no refcount changes happen. Inconvenient for users and
bug-prone; it would be better to have it return 0 on success and -E... on
failure.
2) it allows cross-directory moves; however, no such caller has
ever materialized and considering the way debugfs is used, it's unlikely
to happen in the future. What's more, any such caller would have fun
issues to deal with wrt interplay with recursive removal. It also makes
the calling conventions clumsier...
3) tautological rename fails; the callers have no race-free way
to deal with that.
4) new name must have been formed by the caller; quite a few
callers have it done by sprintf/kasprintf/etc., ending up with considerable
boilerplate.
Proposed replacement: int debugfs_change_name(dentry, fmt, ...).
All callers convert to that easily, and it's simpler internally.
IMO debugfs_rename() should go; if we ever get a real-world use
case for cross-directory moves in debugfs, we can always look into
the right way to handle that.
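The calling convention proposed for the replacement (format string in, 0 or -E... out, tautological rename succeeds) can be sketched in userspace like this. mock_dentry and mock_change_name() are hypothetical; the sketch ignores everything a real implementation has to deal with, such as locking and name collisions with siblings.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in: an object that just owns its name. */
struct mock_dentry {
	char *name;
};

/* Format the new name and install it; returns 0 or a negative errno. */
static int mock_change_name(struct mock_dentry *d, const char *fmt, ...)
{
	va_list ap;
	char *new_name;
	int len;

	/* first pass: find out how long the formatted name is */
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		return -EINVAL;

	new_name = malloc((size_t)len + 1);
	if (!new_name)
		return -ENOMEM;

	/* second pass: actually format it */
	va_start(ap, fmt);
	vsnprintf(new_name, (size_t)len + 1, fmt, ap);
	va_end(ap);

	/* renaming to the current name is not an error here */
	free(d->name);
	d->name = new_name;
	return 0;
}
```

A caller then writes something like mock_change_name(&d, "bond%d", idx) and checks a plain int, with no sprintf boilerplate and no returned pointer to compare against.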
Thread overview: 53+ messages
2024-12-29 8:09 Al Viro [this message]
2024-12-29 8:12 ` [PATCH 01/20] debugfs: fix missing mutex_destroy() in short_fops case Al Viro
2024-12-29 8:12 ` [PATCH 02/20] debugfs: separate cache for debugfs inodes Al Viro
2024-12-29 8:12 ` [PATCH 03/20] debugfs: move ->automount into debugfs_inode_info Al Viro
2024-12-29 8:12 ` [PATCH 04/20] debugfs: get rid of dynamically allocation proxy_ops Al Viro
2024-12-29 8:12 ` [PATCH 05/20] debugfs: don't mess with bits in ->d_fsdata Al Viro
2024-12-29 8:12 ` [PATCH 06/20] debugfs: allow to store an additional opaque pointer at file creation Al Viro
2024-12-29 8:12 ` [PATCH 07/20] carl9170: stop embedding file_operations into their objects Al Viro
2024-12-29 8:12 ` [PATCH 08/20] b43: stop embedding struct " Al Viro
2024-12-29 8:12 ` [PATCH 09/20] b43legacy: make use of debugfs_get_aux() Al Viro
2024-12-29 8:12 ` [PATCH 10/20] netdevsim: don't embed file_operations into your structs Al Viro
2024-12-29 8:12 ` [PATCH 11/20] mediatek: stop messing with ->d_iname Al Viro
2024-12-29 8:12 ` [PATCH 12/20] [not even compile-tested] greybus/camera - " Al Viro
2024-12-29 8:12 ` [PATCH 13/20] mtu3: don't mess wiht ->d_iname Al Viro
2024-12-29 8:12 ` [PATCH 14/20] xhci: don't mess with ->d_iname Al Viro
2024-12-29 8:12 ` [PATCH 15/20] qat: don't mess with ->d_name Al Viro
2024-12-29 8:12 ` [PATCH 16/20] sof-client-ipc-flood-test: " Al Viro
2024-12-29 8:12 ` [PATCH 17/20] slub: " Al Viro
2024-12-29 8:12 ` [PATCH 18/20] arm_scmi: don't mess with ->d_parent->d_name Al Viro
2024-12-29 8:12 ` [PATCH 19/20] octeontx2: don't mess with ->d_parent or ->d_parent->d_name Al Viro
2024-12-29 8:12 ` [PATCH 20/20] saner replacement for debugfs_rename() Al Viro
2024-12-29 16:38 ` Al Viro
2025-01-07 15:20 ` [PATCH 01/20] debugfs: fix missing mutex_destroy() in short_fops case Greg KH
2024-12-29 20:58 ` [PATCHES][RFC][CFT] debugfs cleanups Al Viro
2025-01-07 14:56 ` Greg KH
2025-01-12 8:05 ` [PATCHES v2][RFC][CFT] " Al Viro
2025-01-12 8:06 ` [PATCH v2 01/21] debugfs: separate cache for debugfs inodes Al Viro
2025-01-12 8:06 ` [PATCH v2 02/21] debugfs: move ->automount into debugfs_inode_info Al Viro
2025-01-13 14:49 ` Christian Brauner
2025-01-12 8:06 ` [PATCH v2 03/21] debugfs: get rid of dynamically allocation proxy_ops Al Viro
2025-01-13 14:51 ` Christian Brauner
2025-01-12 8:06 ` [PATCH v2 04/21] debugfs: don't mess with bits in ->d_fsdata Al Viro
2025-01-13 14:55 ` Christian Brauner
2025-01-12 8:06 ` [PATCH v2 05/21] debugfs: allow to store an additional opaque pointer at file creation Al Viro
2025-01-13 14:56 ` Christian Brauner
2025-01-12 8:06 ` [PATCH v2 06/21] debugfs: take debugfs_short_fops definition out of ifdef Al Viro
2025-01-13 14:57 ` Christian Brauner
2025-01-12 8:06 ` [PATCH v2 07/21] carl9170: stop embedding file_operations into their objects Al Viro
2025-01-12 8:06 ` [PATCH v2 08/21] b43: stop embedding struct " Al Viro
2025-01-12 8:06 ` [PATCH v2 09/21] b43legacy: make use of debugfs_get_aux() Al Viro
2025-01-12 8:06 ` [PATCH v2 10/21] netdevsim: don't embed file_operations into your structs Al Viro
2025-01-12 8:06 ` [PATCH v2 11/21] mediatek: stop messing with ->d_iname Al Viro
2025-01-12 8:06 ` [PATCH v2 12/21] [not even compile-tested] greybus/camera - " Al Viro
2025-01-12 8:06 ` [PATCH v2 13/21] mtu3: don't mess wiht ->d_iname Al Viro
2025-01-12 8:06 ` [PATCH v2 14/21] xhci: don't mess with ->d_iname Al Viro
2025-01-12 8:06 ` [PATCH v2 15/21] qat: don't mess with ->d_name Al Viro
2025-01-12 8:07 ` [PATCH v2 16/21] sof-client-ipc-flood-test: " Al Viro
2025-01-12 8:07 ` [PATCH v2 17/21] slub: " Al Viro
2025-01-12 8:07 ` [PATCH v2 18/21] arm_scmi: don't mess with ->d_parent->d_name Al Viro
2025-01-12 8:07 ` [PATCH v2 19/21] octeontx2: don't mess with ->d_parent or ->d_parent->d_name Al Viro
2025-01-12 8:07 ` [PATCH v2 20/21] orangefs-debugfs: don't mess with ->d_name Al Viro
2025-01-12 8:07 ` [PATCH v2 21/21] saner replacement for debugfs_rename() Al Viro
2025-01-13 14:48 ` [PATCH v2 01/21] debugfs: separate cache for debugfs inodes Christian Brauner