Message-ID: <97ec811665eee254627c46d7999818fd94bab9ed.camel@kernel.org>
Subject: Re: [PATCH v2] ceph: defer flushing the capsnap if the Fb is used
From: Jeff Layton
To: xiubli@redhat.com
Cc: idryomov@gmail.com, pdonnell@redhat.com, ceph-devel@vger.kernel.org
Date: Fri, 08 Jan 2021 13:24:22 -0500
In-Reply-To: <20210107023051.119063-1-xiubli@redhat.com>
References: <20210107023051.119063-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

On Thu, 2021-01-07 at 10:30 +0800, xiubli@redhat.com wrote:
> From: Xiubo Li
>
> If the Fb cap is used it means the client is flushing the dirty
> data to OSD, just defer flushing the capsnap.
>
> URL: https://tracker.ceph.com/issues/48679
> URL: https://tracker.ceph.com/issues/48640
> Signed-off-by: Xiubo Li
> ---
>
> V2:
> - Fix inode reference leak bug
>
>  fs/ceph/caps.c | 32 +++++++++++++++++++-------------
>  fs/ceph/snap.c |  6 +++---
>  2 files changed, 22 insertions(+), 16 deletions(-)
>
> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> index abbf48fc6230..2f2451d563bd 100644
> --- a/fs/ceph/caps.c
> +++ b/fs/ceph/caps.c
> @@ -3047,6 +3047,7 @@ static void __ceph_put_cap_refs(struct ceph_inode_info *ci, int had,
>  {
>  	struct inode *inode = &ci->vfs_inode;
>  	int last = 0, put = 0, flushsnaps = 0, wake = 0;
> +	bool check_flushsnaps = false;
>  
>  	spin_lock(&ci->i_ceph_lock);
>  	if (had & CEPH_CAP_PIN)
> @@ -3064,25 +3065,15 @@ static void __ceph_put_cap_refs(struct ceph_inode_info *ci, int had,
>  		if (--ci->i_wb_ref == 0) {
>  			last++;
>  			put++;
> +			check_flushsnaps = true;
>  		}
>  		dout("put_cap_refs %p wb %d -> %d (?)\n",
>  		     inode, ci->i_wb_ref+1, ci->i_wb_ref);
>  	}
> -	if (had & CEPH_CAP_FILE_WR)
> +	if (had & CEPH_CAP_FILE_WR) {
>  		if (--ci->i_wr_ref == 0) {
>  			last++;
> -			if (__ceph_have_pending_cap_snap(ci)) {
> -				struct ceph_cap_snap *capsnap =
> -					list_last_entry(&ci->i_cap_snaps,
> -							struct ceph_cap_snap,
> -							ci_item);
> -				capsnap->writing = 0;
> -				if (ceph_try_drop_cap_snap(ci, capsnap))
> -					put++;
> -				else if (__ceph_finish_cap_snap(ci, capsnap))
> -					flushsnaps = 1;
> -				wake = 1;
> -			}
> +			check_flushsnaps = true;
>  			if (ci->i_wrbuffer_ref_head == 0 &&
>  			    ci->i_dirty_caps == 0 &&
>  			    ci->i_flushing_caps == 0) {
> @@ -3094,6 +3085,21 @@ static void __ceph_put_cap_refs(struct ceph_inode_info *ci, int had,
>  		if (!__ceph_is_any_real_caps(ci) && ci->i_snap_realm)
>  			drop_inode_snap_realm(ci);
>  	}
> +	}
> +	if (check_flushsnaps) {
> +		if (__ceph_have_pending_cap_snap(ci)) {
> +			struct ceph_cap_snap *capsnap =
> +				list_last_entry(&ci->i_cap_snaps,
> +						struct ceph_cap_snap,
> +						ci_item);
> +			capsnap->writing = 0;
> +			if (ceph_try_drop_cap_snap(ci, capsnap))
> +				put++;
> +			else if (__ceph_finish_cap_snap(ci, capsnap))
> +				flushsnaps = 1;
> +			wake = 1;
> +		}
> +	}

Ok, so let's assume you're putting Fb. You increment "put" and set
check_flushsnaps to true. Later, you get down to here and call
ceph_try_drop_cap_snap(); it returns true, and now you've incremented
"put" twice. Is that right? Do Fb caps hold two inode references?

Either way, I think this function needs some better documentation and
comments, particularly since you're making a significant change to how
it works.

>  	spin_unlock(&ci->i_ceph_lock);
>  
>  	dout("put_cap_refs %p had %s%s%s\n", inode, ceph_cap_string(had),
> diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
> index b611f829cb61..639fb91cc9db 100644
> --- a/fs/ceph/snap.c
> +++ b/fs/ceph/snap.c
> @@ -561,10 +561,10 @@ void ceph_queue_cap_snap(struct ceph_inode_info *ci)
>  	capsnap->context = old_snapc;
>  	list_add_tail(&capsnap->ci_item, &ci->i_cap_snaps);
>  
> -	if (used & CEPH_CAP_FILE_WR) {
> +	if (used & (CEPH_CAP_FILE_WR | CEPH_CAP_FILE_BUFFER)) {
>  		dout("queue_cap_snap %p cap_snap %p snapc %p"
> -		     " seq %llu used WR, now pending\n", inode,
> -		     capsnap, old_snapc, old_snapc->seq);
> +		     " seq %llu used WR | BUFFFER, now pending\n",
> +		     inode, capsnap, old_snapc, old_snapc->seq);
>  		capsnap->writing = 1;
>  	} else {
>  		/* note mtime, size NOW. */

-- 
Jeff Layton