From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 31D1E17AA6 for ; Wed, 9 Aug 2023 11:41:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A7D34C433C7; Wed, 9 Aug 2023 11:41:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1691581266; bh=zxzhvY/D/PYJEspPjj4s+zTdSCY4VjdWjAMx6CTSzSU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=eRUQPpHJly5NQ/UyO2prk+n7dW9fnKFrnNx2zsug6LKrvSlbLO412VS6R1Ud3EH/F re25cwTNLns9ZEGKOQsYem2AgcgPAJ053aUyBpeGGsi0yumcuv7KBFQ3BtSxfu8l4H nFX69hNWazILFggpduhCs+0/8TJJXL7pi/vkaBxA= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Xiubo Li , Milind Changire , Ilya Dryomov Subject: [PATCH 5.10 165/201] ceph: defer stopping mdsc delayed_work Date: Wed, 9 Aug 2023 12:42:47 +0200 Message-ID: <20230809103649.258742973@linuxfoundation.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230809103643.799166053@linuxfoundation.org> References: <20230809103643.799166053@linuxfoundation.org> User-Agent: quilt/0.67 Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Xiubo Li commit e7e607bd00481745550389a29ecabe33e13d67cf upstream. Flushing the dirty buffer may take a long time if the cluster is overloaded or if there is network issue. So we should ping the MDSs periodically to keep alive, else the MDS will blocklist the kclient. Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/61843 Signed-off-by: Xiubo Li Reviewed-by: Milind Changire Signed-off-by: Ilya Dryomov Signed-off-by: Greg Kroah-Hartman --- fs/ceph/mds_client.c | 4 ++-- fs/ceph/mds_client.h | 5 +++++ fs/ceph/super.c | 10 ++++++++++ 3 files changed, 17 insertions(+), 2 deletions(-) --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4511,7 +4511,7 @@ static void delayed_work(struct work_str dout("mdsc delayed_work\n"); - if (mdsc->stopping) + if (mdsc->stopping >= CEPH_MDSC_STOPPING_FLUSHED) return; mutex_lock(&mdsc->mutex); @@ -4701,7 +4701,7 @@ void send_flush_mdlog(struct ceph_mds_se void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc) { dout("pre_umount\n"); - mdsc->stopping = 1; + mdsc->stopping = CEPH_MDSC_STOPPING_BEGIN; ceph_mdsc_iterate_sessions(mdsc, send_flush_mdlog, true); ceph_mdsc_iterate_sessions(mdsc, lock_unlock_session, false); --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -372,6 +372,11 @@ struct cap_wait { int want; }; +enum { + CEPH_MDSC_STOPPING_BEGIN = 1, + CEPH_MDSC_STOPPING_FLUSHED = 2, +}; + /* * mds client state */ --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -1222,6 +1222,16 @@ static void ceph_kill_sb(struct super_bl ceph_mdsc_pre_umount(fsc->mdsc); flush_fs_workqueues(fsc); + /* + * Though the kill_anon_super() will finally trigger the + * sync_filesystem() anyway, we still need to do it here + * and then bump the stage of shutdown to stop the work + * queue as earlier as possible. + */ + sync_filesystem(s); + + fsc->mdsc->stopping = CEPH_MDSC_STOPPING_FLUSHED; + kill_anon_super(s); fsc->client->extra_mon_dispatch = NULL;