From: Mateusz Guzik <mjguzik@gmail.com>
To: brauner@kernel.org
Cc: viro@zeniv.linux.org.uk, jack@suse.cz,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Mateusz Guzik <mjguzik@gmail.com>
Subject: [PATCH v2] fs: scale opening of character devices
Date: Fri, 21 Nov 2025 08:28:18 +0100 [thread overview]
Message-ID: <20251121072818.3230541-1-mjguzik@gmail.com> (raw)
chrdev_open() always takes cdev_lock, which is only needed to synchronize
against cd_forget(). But the latter is only ever called by inode evict(),
meaning these two can never legally race. Solidify this with asserts.
More cleanups are needed here but this is enough to get the thing out of
the way.
Rationale is funny-sounding at first: opening of /dev/zero happens to be
a contention point in large-scale package building (think 100+ packages
at the same with a thread count to support it). Such a workload is not
only very fork+exec heavy, but frequently involves scripts which use the
idiom of silencing output by redirecting it to /dev/null.
A non-large-scale microbenchmark of opening /dev/null in a loop in 16
processes:
before: 2865472
after: 4011960 (+40%)
Code goes from being bottlenecked on the spinlock to being bottlenecked
on lockref.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
v2:
- add back new = NULL lost in refactoring
I'll note for interested my experience with the workload at hand comes
from FreeBSD and was surprised to find /dev/null on the profile. Given
that Linux is globally serializing on it, it has to be a factor as well
in this case.
fs/char_dev.c | 20 +++++++++++---------
fs/inode.c | 2 +-
include/linux/cdev.h | 2 +-
3 files changed, 13 insertions(+), 11 deletions(-)
diff --git a/fs/char_dev.c b/fs/char_dev.c
index c2ddb998f3c9..9a6dfab084d1 100644
--- a/fs/char_dev.c
+++ b/fs/char_dev.c
@@ -374,15 +374,15 @@ static int chrdev_open(struct inode *inode, struct file *filp)
{
const struct file_operations *fops;
struct cdev *p;
- struct cdev *new = NULL;
int ret = 0;
- spin_lock(&cdev_lock);
- p = inode->i_cdev;
+ VFS_BUG_ON_INODE(icount_read(inode) < 1, inode);
+
+ p = READ_ONCE(inode->i_cdev);
if (!p) {
struct kobject *kobj;
+ struct cdev *new = NULL;
int idx;
- spin_unlock(&cdev_lock);
kobj = kobj_lookup(cdev_map, inode->i_rdev, &idx);
if (!kobj)
return -ENXIO;
@@ -392,19 +392,19 @@ static int chrdev_open(struct inode *inode, struct file *filp)
we dropped the lock. */
p = inode->i_cdev;
if (!p) {
- inode->i_cdev = p = new;
+ p = new;
+ WRITE_ONCE(inode->i_cdev, p);
list_add(&inode->i_devices, &p->list);
new = NULL;
} else if (!cdev_get(p))
ret = -ENXIO;
+ spin_unlock(&cdev_lock);
+ cdev_put(new);
} else if (!cdev_get(p))
ret = -ENXIO;
- spin_unlock(&cdev_lock);
- cdev_put(new);
if (ret)
return ret;
- ret = -ENXIO;
fops = fops_get(p->ops);
if (!fops)
goto out_cdev_put;
@@ -423,8 +423,10 @@ static int chrdev_open(struct inode *inode, struct file *filp)
return ret;
}
-void cd_forget(struct inode *inode)
+void inode_cdev_forget(struct inode *inode)
{
+ VFS_BUG_ON_INODE(!(inode_state_read_once(inode) & I_FREEING), inode);
+
spin_lock(&cdev_lock);
list_del_init(&inode->i_devices);
inode->i_cdev = NULL;
diff --git a/fs/inode.c b/fs/inode.c
index a62032864ddf..88be1f20782d 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -840,7 +840,7 @@ static void evict(struct inode *inode)
clear_inode(inode);
}
if (S_ISCHR(inode->i_mode) && inode->i_cdev)
- cd_forget(inode);
+ inode_cdev_forget(inode);
remove_inode_hash(inode);
diff --git a/include/linux/cdev.h b/include/linux/cdev.h
index 0e8cd6293deb..bed99967ad90 100644
--- a/include/linux/cdev.h
+++ b/include/linux/cdev.h
@@ -34,6 +34,6 @@ void cdev_device_del(struct cdev *cdev, struct device *dev);
void cdev_del(struct cdev *);
-void cd_forget(struct inode *);
+void inode_cdev_forget(struct inode *);
#endif
--
2.48.1
next reply other threads:[~2025-11-21 7:28 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-21 7:28 Mateusz Guzik [this message]
2025-11-25 11:26 ` [PATCH v2] fs: scale opening of character devices Jan Kara
2025-11-25 11:33 ` Mateusz Guzik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251121072818.3230541-1-mjguzik@gmail.com \
--to=mjguzik@gmail.com \
--cc=brauner@kernel.org \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).