From: Viacheslav Dubeyko <slava@dubeyko.com>
To: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>,
"idryomov@gmail.com" <idryomov@gmail.com>,
Alex Markuze <amarkuze@redhat.com>,
"cfsworks@gmail.com" <cfsworks@gmail.com>
Cc: Milind Changire <mchangir@redhat.com>,
"stable@vger.kernel.org" <stable@vger.kernel.org>,
Xiubo Li <xiubli@redhat.com>,
"jlayton@kernel.org" <jlayton@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
"regressions@lists.linux.dev" <regressions@lists.linux.dev>
Subject: Re: [REGRESSION] [PATCH v2] ceph: fix num_ops OBOE when crypto allocation fails
Date: Thu, 19 Mar 2026 12:14:01 -0700 [thread overview]
Message-ID: <27650d0c70f8ccfb8846c03de10d521173e06212.camel@dubeyko.com> (raw)
In-Reply-To: <bae7a16910a7b2cff6b9f8996d93ea72dabb9a6b.camel@ibm.com>
On Wed, 2026-03-18 at 19:41 +0000, Viacheslav Dubeyko wrote:
> On Tue, 2026-03-17 at 19:37 -0700, Sam Edwards wrote:
> > move_dirty_folio_in_page_array() may fail if the file is encrypted,
> > the dirty folio is not the first in the batch, and it fails to
> > allocate a bounce buffer to hold the ciphertext. When that happens,
> > ceph_process_folio_batch() simply redirties the folio and flushes the
> > current batch -- it can retry that folio in a future batch.
> >
> > However, if this failed folio is not contiguous with the last folio
> > that did make it into the batch, then ceph_process_folio_batch() has
> > already incremented `ceph_wbc->num_ops`; because it doesn't follow
> > through and add the discontiguous folio to the array,
> > ceph_submit_write() -- which expects that `ceph_wbc->num_ops`
> > accurately reflects the number of contiguous ranges (and therefore
> > the required number of "write extent" ops) in the writeback -- will
> > panic the kernel:
> >
> > BUG_ON(ceph_wbc->op_idx + 1 != req->r_num_ops);
> >
> > This issue can be reproduced on affected kernels by writing to
> > fscrypt-enabled CephFS file(s) with a 4KiB-written/4KiB-skipped/repeat
> > pattern (total filesize should not matter) and gradually increasing
> > the system's memory pressure until a bounce buffer allocation fails.
> >
> > Fix this crash by decrementing `ceph_wbc->num_ops` back to the
> > correct value when move_dirty_folio_in_page_array() fails, but the
> > folio already started counting a new (i.e. still-empty) extent.
> >
> > The defect corrected by this patch has existed since 2022 (see first
> > `Fixes:`), but another bug blocked multi-folio encrypted writeback
> > until recently (see second `Fixes:`). The second commit made it into
> > 6.18.16, 6.19.6, and 7.0-rc1, unmasking the panic in those versions.
> > This patch therefore fixes a regression (panic) introduced by
> > cac190c7674f.
> >
> > Cc: stable@vger.kernel.org # v6.18+
> > Fixes: d55207717ded ("ceph: add encryption support to writepage and writepages")
> > Fixes: cac190c7674f ("ceph: fix write storm on fscrypted files")
> > Signed-off-by: Sam Edwards <CFSworks@gmail.com>
> > ---
> >
> > Changes v1->v2:
> > - Added a paragraph to the commit log briefly explaining the I/O
> >   pattern to reproduce the issue (thanks Slava)
> >
> > - Additionally Cc'd regressions@lists.linux.dev as required when
> >   handling regressions
> >
> > Feedback not addressed:
> > - "Commit message should link to the mentioned BUG_ON line in a
> >   source listing"
> >   (link would not really help anyone, and the line is a moving
> >   target anyway)
>
> My request was to identify the location of:
>
> BUG_ON(ceph_wbc->op_idx + 1 != req->r_num_ops);
>
> because the commit message gives no indication of where this code
> pattern lives.
>
> There are two possible ways:
> (1) Link to
> https://elixir.bootlin.com/linux/v7.0-rc4/source/fs/ceph/addr.c#L1555.
> As you can see, the link includes the kernel version, so even if the
> line number changes over time, the link will always identify the
> position of this code pattern in v7.0-rc4.
>
> (2) You can show the function that contains this code pattern:
>
> static
> int ceph_submit_write(struct address_space *mapping,
>                       struct writeback_control *wbc,
>                       struct ceph_writeback_ctl *ceph_wbc)
> {
> <skipped>
>
>         BUG_ON(ceph_wbc->op_idx + 1 != req->r_num_ops);
>
> <skipped>
> }
>
> >
> > - "Commit message should indicate that ceph_wbc->num_ops is passed
> >   to ceph_osdc_new_request() to explain why
> >   ceph_wbc->num_ops == req->r_num_ops"
> >   (ceph_wbc->num_ops is easy enough to search; and the cause->effect
> >   of the BUG_ON() is secondary to the central point that
> >   ceph_process_folio_batch() is responsible for ensuring
> >   ceph_wbc->num_ops is correct before returning)
> >
> > - "An issue should be filed in the Ceph Redmine, linked via Closes:"
> >   (thanks Ilya for clarifying this is unnecessary)
> >
> > ---
> > fs/ceph/addr.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> > index e87b3bb94ee8..f366e159ffa6 100644
> > --- a/fs/ceph/addr.c
> > +++ b/fs/ceph/addr.c
> > @@ -1366,6 +1366,10 @@ void ceph_process_folio_batch(struct address_space *mapping,
> >  		rc = move_dirty_folio_in_page_array(mapping, wbc,
> >  						    ceph_wbc, folio);
> >  		if (rc) {
> > +			/* Did we just begin a new contiguous op? Nevermind! */
> > +			if (ceph_wbc->len == 0)
> > +				ceph_wbc->num_ops--;
> > +
> >  			folio_redirty_for_writepage(wbc, folio);
> >  			folio_unlock(folio);
> >  			break;
>
> Let me run the xfstests for the patch. I'll be back with the result
> ASAP.
>
> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
>
I don't see any new issues during the xfstests run.

Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Thanks,
Slava.
Thread overview: 5+ messages
2026-03-18 2:37 [REGRESSION] [PATCH v2] ceph: fix num_ops OBOE when crypto allocation fails Sam Edwards
2026-03-18 19:41 ` Viacheslav Dubeyko
2026-03-19 19:14 ` Viacheslav Dubeyko [this message]
2026-03-25 2:56 ` Sam Edwards
2026-03-25 11:55 ` Ilya Dryomov