From: Sam Edwards <cfsworks@gmail.com>
To: Xiubo Li <xiubli@redhat.com>, Ilya Dryomov <idryomov@gmail.com>
Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>,
	Christian Brauner <brauner@kernel.org>,
	Milind Changire <mchangir@redhat.com>,
	Jeff Layton <jlayton@kernel.org>,
	ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Sam Edwards <CFSworks@gmail.com>
Subject: [PATCH 0/5] ceph: CephFS writeback correctness and performance fixes
Date: Tue, 30 Dec 2025 18:43:11 -0800
Message-ID: <20251231024316.4643-1-CFSworks@gmail.com>

Hello list,

This series addresses several interrelated CephFS writeback issues,
particularly for fscrypted files. My work began with a performance problem:
encrypted files caused a write storm during writeback because the writeback
code was inadvertently selecting the crypto block instead of the stripe unit as
the maximum write unit size.
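
To give a feel for the scale of the write storm, here is a rough userspace
sketch of the arithmetic only; it is not code from fs/ceph/addr.c. The helper
name writes_needed() is hypothetical, and the sizes used in the note below
(4 KiB crypto block, 4 MiB stripe unit, 64 MiB of dirty data) are illustrative
assumptions rather than values taken from this series.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of write requests needed to flush
 * dirty_bytes when each request carries at most max_write_bytes.
 * Rounds up; returns 0 for a zero write unit to avoid dividing by zero.
 */
static uint64_t writes_needed(uint64_t dirty_bytes, uint64_t max_write_bytes)
{
	if (max_write_bytes == 0)
		return 0;
	return (dirty_bytes + max_write_bytes - 1) / max_write_bytes;
}
```

Under those assumed sizes, writes_needed(64 << 20, 4096) is 16384, while
writes_needed(64 << 20, 4 << 20) is 16, so capping each write at the crypto
block would issue roughly a thousand times more requests for the same data.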

While testing that fix, I encountered a correctness bug: failures to allocate
bounce pages during writeback were incorrectly propagated as batch errors,
which trigger kernel oopses/panics due to poor handling in the writeback loop.
While investigating that, I discovered that the same oopses could be triggered
by a failure in ceph_submit_write() as well.

The patches in this series:

1. Prevent bounce page allocation failures from aborting the writeback batch
   and causing a kernel oops/panic due to the page array not being freed.
2. Remove the now-redundant error return from ceph_process_folio_batch().
3. Free page arrays during failure in ceph_submit_write(), preventing another
   path to the same kernel oops/panic. This was not an issue I encountered in
   testing, and it is tricky to trigger organically. I used the fault injection
   framework to confirm it and verify the fix.
4. Assert writeback loop invariants explicitly to help prevent regressions and
   aid debugging should the problem reappear.
5. Fix the write storm on fscrypted files by using the correct stripe unit.
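
The pattern behind patches 1, 3, and 4 can be sketched in userspace roughly as
follows. This is a simplified model under stated assumptions, not the kernel
code: struct batch and all function names are hypothetical, the fail_submit
flag stands in for an injected failure, and the specific invariant asserted
here (the page array is NULL at the top of every loop iteration) is my
shorthand for the kind of condition the series checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-batch writeback state. */
struct batch {
	void **pages;    /* page array; must be NULL between batches */
	size_t nr_pages;
};

/* Allocate the page array for a new batch. */
static int fill_batch(struct batch *b, size_t n)
{
	assert(b->pages == NULL);        /* no stale array may be left over */
	b->pages = calloc(n, sizeof(*b->pages));
	if (!b->pages)
		return -1;
	b->nr_pages = n;
	return 0;
}

/* Free the page array and reset the batch to its idle state. */
static void release_batch(struct batch *b)
{
	free(b->pages);
	b->pages = NULL;
	b->nr_pages = 0;
}

/*
 * Submit the batch. On failure the array is released here, so the
 * caller never re-enters the loop with a stale array. On success this
 * model also releases it, standing in for handing it to the request.
 */
static int submit_batch(struct batch *b, int fail_submit)
{
	release_batch(b);
	return fail_submit ? -1 : 0;
}

/* Run several batches; returns how many failed. */
static int run_writeback(size_t rounds, size_t fail_round)
{
	struct batch b = { 0 };
	int failures = 0;

	for (size_t i = 0; i < rounds; i++) {
		/* Loop invariant: every iteration starts with no array. */
		assert(b.pages == NULL && b.nr_pages == 0);
		if (fill_batch(&b, 8) != 0) {
			failures++;
			continue;
		}
		if (submit_batch(&b, i == fail_round) != 0)
			failures++;
	}
	assert(b.pages == NULL);
	return failures;
}
```

In this model a failed submission is counted and the loop continues with a
clean batch, instead of tripping over an array left behind by the failed
iteration.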

Note that this series follows a "fix-then-refactor" cadence: patches 1, 3, and
5 fix bugs and are intended for stable, while patches 2 and 4 represent code
cleanup and are intended only for next.

Wishing you all a prosperous 2026 ahead,
Sam

Sam Edwards (5):
  ceph: Do not propagate page array emplacement errors as batch errors
  ceph: Remove error return from ceph_process_folio_batch()
  ceph: Free page array when ceph_submit_write fails
  ceph: Assert writeback loop invariants
  ceph: Fix write storm on fscrypted files

 fs/ceph/addr.c | 35 +++++++++++++++++++----------------
 1 file changed, 19 insertions(+), 16 deletions(-)

-- 
2.51.2


Thread overview: 24+ messages
2025-12-31  2:43 Sam Edwards [this message]
2025-12-31  2:43 ` [PATCH 1/5] ceph: Do not propagate page array emplacement errors as batch errors Sam Edwards
2026-01-05 20:23   ` Viacheslav Dubeyko
2026-01-06  6:52     ` Sam Edwards
2026-01-06 21:08       ` Viacheslav Dubeyko
2026-01-06 23:50         ` Sam Edwards
2025-12-31  2:43 ` [PATCH 2/5] ceph: Remove error return from ceph_process_folio_batch() Sam Edwards
2026-01-05 20:36   ` Viacheslav Dubeyko
2026-01-06  6:52     ` Sam Edwards
2026-01-06 22:47       ` Viacheslav Dubeyko
2026-01-07  0:15         ` Sam Edwards
2025-12-31  2:43 ` [PATCH 3/5] ceph: Free page array when ceph_submit_write fails Sam Edwards
2026-01-05 21:09   ` Viacheslav Dubeyko
2026-01-06  6:52     ` Sam Edwards
2025-12-31  2:43 ` [PATCH 4/5] ceph: Assert writeback loop invariants Sam Edwards
2026-01-05 22:28   ` Viacheslav Dubeyko
2026-01-06  6:53     ` Sam Edwards
2026-01-06 23:00       ` Viacheslav Dubeyko
2026-01-07  0:33         ` Sam Edwards
2025-12-31  2:43 ` [PATCH 5/5] ceph: Fix write storm on fscrypted files Sam Edwards
2026-01-05 22:34   ` Viacheslav Dubeyko
2026-01-06  6:53     ` Sam Edwards
2026-01-06 23:11       ` Viacheslav Dubeyko
2026-01-07  0:05         ` Sam Edwards
