From: Xiubo Li <xiubli@redhat.com>
To: jlayton@kernel.org, Luis Henriques <lhenriques@suse.de>
Cc: idryomov@gmail.com, vshankar@redhat.com, ukernel@gmail.com,
ceph-devel@vger.kernel.org
Subject: Re: [PATCH v2] ceph: do not update snapshot context when there is no new snapshot
Date: Sat, 19 Feb 2022 14:30:33 +0800
Message-ID: <f8a283ad-dabe-9268-491e-5d3aaa65e893@redhat.com>
In-Reply-To: <20220218024722.7952-1-xiubli@redhat.com>
On 2/18/22 10:47 AM, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
>
> We will only track the topmost parent snapshot realm from which we
> need to rebuild the snapshot contexts _downward_ in the hierarchy. For
> all the other realms, which have no new snapshot, we will do nothing.
>
> This fix will avoid calling ceph_queue_cap_snap() on some inodes
> inappropriately. For example, with the code in mainline, suppose there
> are 2 directory hierarchies (with 6 directories total), like this:
>
> /dir_X1/dir_X2/dir_X3/
> /dir_Y1/dir_Y2/dir_Y3/
>
> Firstly, make a snapshot under /dir_X1/dir_X2/.snap/snap_X2, then make a
> root snapshot under /.snap/root_snap. Every time we make snapshots under
> /dir_Y1/..., the kclient will always try to rebuild the snap context for
> snap_X2 realm and finally will always try to queue cap snaps for dir_Y2
> and dir_Y3, which makes no sense.
>
> That's because snap_X2's seq is 2 and root_snap's seq is 3. So when
> creating a new snapshot under /dir_Y1/... the new seq will be 4, and
> the mds will send the kclient a snapshot backtrace in _downward_
> order: seqs 4, 3.
>
> When ceph_update_snap_trace() is called, it will always rebuild from
> the last realm, which is root_snap. So later when rebuilding the snap
> context, the current logic will always cause it to rebuild the snap_X2
> realm and then try to queue cap snaps for all the inodes related to that
> realm, even though it's not necessary.
>
> This is accompanied by a lot of these sorts of dout messages:
>
> "ceph: queue_cap_snap 00000000a42b796b nothing dirty|writing"
>
> Fix the logic to avoid this situation.
>
> The name 'invalidate' is not precise here: the code actually rebuilds
> the existing snapshot contexts or builds ones that do not exist yet.
> Rename it to 'rebuild_snapcs'.
>
> URL: https://tracker.ceph.com/issues/44100
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>
> Changed in V2:
> - Thanks to Zheng for the feedback; switched to Zheng's patch.
> - Rename invalidate to rebuild_snapcs.
>
>
>
> fs/ceph/snap.c | 28 +++++++++++++++++++---------
> 1 file changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
> index dbf34f212596..6d55b8ba79d8 100644
> --- a/fs/ceph/snap.c
> +++ b/fs/ceph/snap.c
> @@ -735,7 +735,8 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
>  	__le64 *prior_parent_snaps; /* encoded */
>  	struct ceph_snap_realm *realm = NULL;
>  	struct ceph_snap_realm *first_realm = NULL;
> -	int invalidate = 0;
> +	struct ceph_snap_realm *realm_to_rebuild = NULL;
> +	int rebuild_snapcs;
>  	int err = -ENOMEM;
>  	LIST_HEAD(dirty_realms);
>
> @@ -743,6 +744,7 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
>
>  	dout("update_snap_trace deletion=%d\n", deletion);
>  more:
> +	rebuild_snapcs = 0;
>  	ceph_decode_need(&p, e, sizeof(*ri), bad);
>  	ri = p;
>  	p += sizeof(*ri);
> @@ -766,7 +768,7 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
>  	err = adjust_snap_realm_parent(mdsc, realm, le64_to_cpu(ri->parent));
>  	if (err < 0)
>  		goto fail;
> -	invalidate += err;
> +	rebuild_snapcs += err;
>
>  	if (le64_to_cpu(ri->seq) > realm->seq) {
>  		dout("update_snap_trace updating %llx %p %lld -> %lld\n",
> @@ -791,22 +793,30 @@ int ceph_update_snap_trace(struct ceph_mds_client *mdsc,
>  		if (realm->seq > mdsc->last_snap_seq)
>  			mdsc->last_snap_seq = realm->seq;
>
> -		invalidate = 1;
> +		rebuild_snapcs = 1;
>  	} else if (!realm->cached_context) {
>  		dout("update_snap_trace %llx %p seq %lld new\n",
>  		     realm->ino, realm, realm->seq);
> -		invalidate = 1;
> +		rebuild_snapcs = 1;
>  	} else {
>  		dout("update_snap_trace %llx %p seq %lld unchanged\n",
>  		     realm->ino, realm, realm->seq);
>  	}
>
> -	dout("done with %llx %p, invalidated=%d, %p %p\n", realm->ino,
> -	     realm, invalidate, p, e);
> +	dout("done with %llx %p, rebuild_snapcs=%d, %p %p\n", realm->ino,
> +	     realm, rebuild_snapcs, p, e);
>
> -	/* invalidate when we reach the _end_ (root) of the trace */
> -	if (invalidate && p >= e)
> -		rebuild_snap_realms(realm, &dirty_realms);
> +	/*
> +	 * this will always track the uppest parent realm from which
> +	 * we need to rebuild the snapshot contexts _downward_ in
> +	 * hierarchy.
> +	 */
> +	if (rebuild_snapcs)
> +		realm_to_rebuild = realm;
> +
> +	/* rebuild_snapcs when we reach the _end_ (root) of the trace */
> +	if (rebuild_snapcs && p >= e)

s/rebuild_snapcs/realm_to_rebuild/

This will fix the bug Luís Henriques reported. I have sent a V3 with
the fix. Thanks.

- Xiubo

> +		rebuild_snap_realms(realm_to_rebuild, &dirty_realms);
>
>  	if (!first_realm)
>  		first_realm = realm;
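
To make the effect of the one-word substitution above easier to see, here is a
small userspace sketch. It is not kernel code: struct toy_realm, the "changed"
flag and the three pick_*() helpers are hypothetical stand-ins that model only
the control flow around the end-of-trace test, under the assumption that a
trace is walked in order and its last entry is the root end. Each helper
returns the index of the realm a rebuild would start from, or -1 if no rebuild
would be triggered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one realm record in a snap trace. */
struct toy_realm {
	const char *name;
	bool changed;	/* models: seq advanced, parent changed, or no cached context */
};

/* Mainline before the patch: one sticky flag, rebuild from the last realm. */
static int pick_mainline(const struct toy_realm *trace, size_t n)
{
	int invalidate = 0;	/* never reset between realms */
	size_t i;

	assert(trace || n == 0);
	for (i = 0; i < n; i++) {
		if (trace[i].changed)
			invalidate = 1;
		if (invalidate && i + 1 == n)	/* end of trace reached */
			return (int)i;
	}
	return -1;
}

/* v2 as posted: the flag is reset per realm but still tested at the end. */
static int pick_v2(const struct toy_realm *trace, size_t n)
{
	int realm_to_rebuild = -1;
	size_t i;

	assert(trace || n == 0);
	for (i = 0; i < n; i++) {
		int rebuild_snapcs = trace[i].changed;	/* fresh for each realm */

		if (rebuild_snapcs)
			realm_to_rebuild = (int)i;
		if (rebuild_snapcs && i + 1 == n)
			return realm_to_rebuild;
	}
	return -1;
}

/* With s/rebuild_snapcs/realm_to_rebuild/ applied to the final test. */
static int pick_v3(const struct toy_realm *trace, size_t n)
{
	int realm_to_rebuild = -1;
	size_t i;

	assert(trace || n == 0);
	for (i = 0; i < n; i++) {
		if (trace[i].changed)
			realm_to_rebuild = (int)i;
		if (realm_to_rebuild >= 0 && i + 1 == n)
			return realm_to_rebuild;
	}
	return -1;
}
```

For a two-entry trace in which only the first realm changed and the last one is
unchanged, pick_mainline() selects the last realm (index 1), pick_v2() selects
nothing (-1) because the per-realm flag has already been reset when the end of
the trace is reached, and pick_v3() selects the changed realm (index 0). This
message does not describe the symptom of the reported bug, so the sketch only
illustrates why the final test has to look at realm_to_rebuild.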