From: Sam Edwards <cfsworks@gmail.com>
To: Xiubo Li <xiubli@redhat.com>, Ilya Dryomov <idryomov@gmail.com>
Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>,
Christian Brauner <brauner@kernel.org>,
Milind Changire <mchangir@redhat.com>,
Jeff Layton <jlayton@kernel.org>,
ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
Sam Edwards <CFSworks@gmail.com>,
stable@vger.kernel.org
Subject: [PATCH v2 6/6] ceph: Fix write storm on fscrypted files
Date: Wed, 7 Jan 2026 13:01:39 -0800 [thread overview]
Message-ID: <20260107210139.40554-7-CFSworks@gmail.com> (raw)
In-Reply-To: <20260107210139.40554-1-CFSworks@gmail.com>
CephFS stores file data across multiple RADOS objects. An object is the
atomic unit of storage, so the writeback code must clean only folios
that belong to the same object with each OSD request.
CephFS also supports RAID0-style striping of file contents: if enabled,
each object stores multiple unbroken "stripe units" covering different
portions of the file; if disabled, a "stripe unit" is simply the whole
object. The stripe unit is (usually) reported as the inode's block size.
Though the writeback logic could, in principle, lock all dirty folios
belonging to the same object, its current design is to lock only a
single stripe unit at a time. Ever since this code was first written,
it has determined this size by checking the inode's block size.
However, the relatively-new fscrypt support needed to reduce the block
size for encrypted inodes to the crypto block size (see 'fixes' commit),
which causes an unnecessarily high number of write operations (~1024x as
many, with 4MiB objects) and correspondingly degraded performance.
Fix this (and clarify intent) by using i_layout.stripe_unit directly in
ceph_define_write_size() so that encrypted inodes are written back with
the same number of operations as if they were unencrypted.
Fixes: 94af0470924c ("ceph: add some fscrypt guardrails")
Cc: stable@vger.kernel.org
Signed-off-by: Sam Edwards <CFSworks@gmail.com>
---
fs/ceph/addr.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index f2db05b51a3b..b97a6120d4b9 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1000,7 +1000,8 @@ unsigned int ceph_define_write_size(struct address_space *mapping)
{
struct inode *inode = mapping->host;
struct ceph_fs_client *fsc = ceph_inode_to_fs_client(inode);
- unsigned int wsize = i_blocksize(inode);
+ struct ceph_inode_info *ci = ceph_inode(inode);
+ unsigned int wsize = ci->i_layout.stripe_unit;
if (fsc->mount_options->wsize < wsize)
wsize = fsc->mount_options->wsize;
--
2.51.2
prev parent reply other threads:[~2026-01-07 21:02 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260107210139.40554-1-CFSworks@gmail.com>
2026-01-07 21:01 ` [PATCH v2 1/6] ceph: Do not propagate page array emplacement errors as batch errors Sam Edwards
2026-01-08 20:05 ` Viacheslav Dubeyko
2026-01-07 21:01 ` [PATCH v2 3/6] ceph: Free page array when ceph_submit_write fails Sam Edwards
2026-01-07 21:01 ` Sam Edwards [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260107210139.40554-7-CFSworks@gmail.com \
--to=cfsworks@gmail.com \
--cc=Slava.Dubeyko@ibm.com \
--cc=brauner@kernel.org \
--cc=ceph-devel@vger.kernel.org \
--cc=idryomov@gmail.com \
--cc=jlayton@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mchangir@redhat.com \
--cc=stable@vger.kernel.org \
--cc=xiubli@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox