From: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
To: Xiubo Li <xiubli@redhat.com>,
"idryomov@gmail.com" <idryomov@gmail.com>,
"cfsworks@gmail.com" <cfsworks@gmail.com>
Cc: Milind Changire <mchangir@redhat.com>,
"stable@vger.kernel.org" <stable@vger.kernel.org>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
"brauner@kernel.org" <brauner@kernel.org>,
"jlayton@kernel.org" <jlayton@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 1/6] ceph: Do not propagate page array emplacement errors as batch errors
Date: Thu, 8 Jan 2026 20:05:31 +0000
Message-ID: <7cf9d1f42ed1ffa16dfd54b8d1356dbb5f10e650.camel@ibm.com>
In-Reply-To: <20260107210139.40554-2-CFSworks@gmail.com>
On Wed, 2026-01-07 at 13:01 -0800, Sam Edwards wrote:
> When fscrypt is enabled, move_dirty_folio_in_page_array() may fail
> because it needs to allocate bounce buffers to store the encrypted
> versions of each folio. Each folio beyond the first allocates its bounce
> buffer with GFP_NOWAIT. Failures are common (and expected) under this
> allocation mode; they should flush (not abort) the batch.
>
> However, ceph_process_folio_batch() uses the same `rc` variable for its
> own return code and for capturing the return codes of its routine calls;
> failing to reset `rc` back to 0 results in the error being propagated
> out to the main writeback loop, which cannot actually tolerate any
> errors here: once `ceph_wbc.pages` is allocated, it must be passed to
> ceph_submit_write() to be freed. If it survives until the next iteration
> (e.g. due to the goto being followed), ceph_allocate_page_array()'s
> BUG_ON() will oops the worker. (Subsequent patches in this series make
> the loop more robust.)
>
> Note that this failure mode is currently masked due to another bug
> (addressed later in this series) that prevents multiple encrypted folios
> from being selected for the same write.
>
> For now, just reset `rc` when redirtying the folio to prevent errors in
> move_dirty_folio_in_page_array() from propagating. (Note that
> move_dirty_folio_in_page_array() is careful never to return errors on
> the first folio, so there is no need to check for that.) After this
> change, ceph_process_folio_batch() no longer returns errors; its only
> remaining failure indicator is `locked_pages == 0`, which the caller
> already handles correctly. The next patch in this series addresses this.
>
> Fixes: ce80b76dd327 ("ceph: introduce ceph_process_folio_batch() method")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sam Edwards <CFSworks@gmail.com>
> ---
> fs/ceph/addr.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index 63b75d214210..3462df35d245 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -1369,6 +1369,7 @@ int ceph_process_folio_batch(struct address_space *mapping,
>  		rc = move_dirty_folio_in_page_array(mapping, wbc, ceph_wbc,
>  				folio);
>  		if (rc) {
> +			rc = 0;
>  			folio_redirty_for_writepage(wbc, folio);
>  			folio_unlock(folio);
>  			break;
I have already shared my opinion on this patch: it should be the last one in
the series, because another patch fixes the issue that currently hides this
bug. It makes more sense to uncover the bug first and fix it afterwards. My
opinion has not changed.
Thanks,
Slava.
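For readers following the thread, the shared-`rc` problem described in the quoted commit message can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not the kernel code: `emplace_folio()`, `process_batch()`, `fail_at` and `reset_rc` are hypothetical names invented for illustration, and the redirty/unlock steps are reduced to a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for move_dirty_folio_in_page_array(): the first
 * folio never fails; a later folio fails at index `fail_at`, modelling a
 * GFP_NOWAIT bounce-buffer allocation failure.
 */
static int emplace_folio(int index, int fail_at)
{
	if (index > 0 && index == fail_at)
		return -ENOMEM;
	return 0;
}

/*
 * Simplified model of the batch loop. `rc` doubles as the function's own
 * return code and as the holder of the callee's result, which is the
 * pattern the commit message describes. `*locked` counts the folios that
 * were emplaced before the loop stopped.
 */
static int process_batch(int nr_folios, int fail_at, bool reset_rc,
			 int *locked)
{
	int rc = 0;
	int i;

	assert(locked != NULL);
	*locked = 0;

	for (i = 0; i < nr_folios; i++) {
		rc = emplace_folio(i, fail_at);
		if (rc) {
			if (reset_rc)
				rc = 0;	/* models the one-line fix */
			/* the real code redirties and unlocks the folio here */
			break;
		}
		(*locked)++;
	}

	return rc;
}
```

In this model, `process_batch(4, 2, false, &locked)` returns `-ENOMEM` with `locked == 2`, so a caller would see an error even though two folios are ready to be written. With `reset_rc` set to `true` the same call returns 0 with `locked == 2`, which corresponds to flushing the partial batch rather than aborting it.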
Thread overview: 12+ messages
2026-01-07 21:01 [PATCH v2 0/6] ceph: CephFS writeback correctness and performance fixes Sam Edwards
2026-01-07 21:01 ` [PATCH v2 1/6] ceph: Do not propagate page array emplacement errors as batch errors Sam Edwards
2026-01-08 20:05 ` Viacheslav Dubeyko [this message]
2026-01-07 21:01 ` [PATCH v2 2/6] ceph: Remove error return from ceph_process_folio_batch() Sam Edwards
2026-01-08 20:08 ` Viacheslav Dubeyko
2026-01-09 0:29 ` Sam Edwards
2026-01-07 21:01 ` [PATCH v2 3/6] ceph: Free page array when ceph_submit_write fails Sam Edwards
2026-01-07 21:01 ` [PATCH v2 4/6] ceph: Split out page-array discarding to a function Sam Edwards
2026-01-08 20:11 ` Viacheslav Dubeyko
2026-01-07 21:01 ` [PATCH v2 5/6] ceph: Assert writeback loop invariants Sam Edwards
2026-01-08 20:12 ` Viacheslav Dubeyko
2026-01-07 21:01 ` [PATCH v2 6/6] ceph: Fix write storm on fscrypted files Sam Edwards