Message-ID: <956fe608b456141b1b43f9dfe65581f168247cb8.camel@kernel.org>
Subject: Re: [PATCH v3 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work
From: Jeff Layton
To: Dai Ngo, chuck.lever@oracle.com
Cc:
 efault@gmx.de, linux-nfs@vger.kernel.org
Date: Wed, 11 Jan 2023 05:47:52 -0500
In-Reply-To: <1673432658-4140-1-git-send-email-dai.ngo@oracle.com>
References: <1673432658-4140-1-git-send-email-dai.ngo@oracle.com>
X-Mailing-List: linux-nfs@vger.kernel.org

On Wed, 2023-01-11 at 02:24 -0800, Dai Ngo wrote:
> Currently nfsd4_state_shrinker_worker can be scheduled multiple times
> from nfsd4_state_shrinker_count when memory is low. This causes
> the WARN_ON_ONCE in __queue_delayed_work to trigger.
>
> This patch allows only one instance of nfsd4_state_shrinker_worker
> at a time using the nfsd_shrinker_active flag, protected by the
> client_lock.
>
> Change nfsd_shrinker_work from delayed_work to work_struct since we
> don't use the delay.
>
> Replace mod_delayed_work in nfsd4_state_shrinker_count with queue_work.
>
> Cancel work_struct nfsd_shrinker_work after unregistering shrinker
> in nfs4_state_shutdown_net
>
> Fixes: 44df6f439a17 ("NFSD: add delegation reaper to react to low memory condition")
> Reported-by: Mike Galbraith
> Signed-off-by: Dai Ngo
> ---
> v2:
> . Change nfsd_shrinker_work from delayed_work to work_struct
> . Replace mod_delayed_work in nfsd4_state_shrinker_count with queue_work
> . Cancel work_struct nfsd_shrinker_work after unregistering shrinker
> v3:
> . set nfsd_shrinker_active earlier in nfsd4_state_shrinker_count
>
>  fs/nfsd/netns.h     |  3 ++-
>  fs/nfsd/nfs4state.c | 24 +++++++++++++++++++-----
>  2 files changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index 8c854ba3285b..b0c7b657324b 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -195,7 +195,8 @@ struct nfsd_net {
>
>  	atomic_t nfsd_courtesy_clients;
>  	struct shrinker nfsd_client_shrinker;
> -	struct delayed_work nfsd_shrinker_work;
> +	struct work_struct nfsd_shrinker_work;
> +	bool nfsd_shrinker_active;
>  };
>
>  /* Simple check to find out if a given net was properly initialized */
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index a7cfefd7c205..35ec4cba88b3 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -4407,11 +4407,22 @@ nfsd4_state_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
>  	struct nfsd_net *nn = container_of(shrink,
>  			struct nfsd_net, nfsd_client_shrinker);
>
> +	spin_lock(&nn->client_lock);
> +	if (nn->nfsd_shrinker_active) {
> +		spin_unlock(&nn->client_lock);
> +		return 0;
> +	}
> +	nn->nfsd_shrinker_active = true;
>  	count = atomic_read(&nn->nfsd_courtesy_clients);
>  	if (!count)
>  		count = atomic_long_read(&num_delegations);
> -	if (count)
> -		mod_delayed_work(laundry_wq, &nn->nfsd_shrinker_work, 0);
> +	if (count) {
> +		spin_unlock(&nn->client_lock);
> +		queue_work(laundry_wq, &nn->nfsd_shrinker_work);
> +	} else {
> +		nn->nfsd_shrinker_active = false;
> +		spin_unlock(&nn->client_lock);
> +	}

The change to normal work_struct is an improvement, but NAK on this
patch. The spinlocking and flag are not needed here. I seriously doubt
that we have a clear understanding of this problem.
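
For illustration only, and not part of the patch or of any proposed fix:
a minimal userspace C sketch of the pending-bit behaviour that makes a
separate "active" flag look redundant. In the kernel, queue_work()
returns false and does nothing when the work item is already pending,
and the pending bit is cleared before the handler runs. The names
model_work, model_queue_work, model_run_work and model_count_queued are
hypothetical; they only model that behaviour with a C11 atomic_flag and
are not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct work_struct: a pending bit and a run counter. */
struct model_work {
	atomic_flag pending;	/* models the work item's PENDING bit */
	int runs;		/* how many times the handler body has run */
};

/* Models queue_work(): true only for the caller that flips pending from clear to set. */
static bool model_queue_work(struct model_work *w)
{
	return !atomic_flag_test_and_set(&w->pending);
}

/* Models the worker: clear pending first, then run the handler body. */
static void model_run_work(struct model_work *w)
{
	atomic_flag_clear(&w->pending);
	w->runs++;
}

/* Call model_queue_work() n times back to back; return how many calls actually queued. */
static int model_count_queued(struct model_work *w, int n)
{
	int queued = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++)
		if (model_queue_work(w))
			queued++;
	return queued;
}
```

In this model, repeated calls from the count callback queue the item
once until the worker picks it up, after which it can be queued again.
It models only the deduplication and says nothing about what actually
triggers the WARN_ON_ONCE.
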
>  	return (unsigned long)count;
>  }
>
> @@ -6233,12 +6244,14 @@ deleg_reaper(struct nfsd_net *nn)
>  static void
>  nfsd4_state_shrinker_worker(struct work_struct *work)
>  {
> -	struct delayed_work *dwork = to_delayed_work(work);
> -	struct nfsd_net *nn = container_of(dwork, struct nfsd_net,
> +	struct nfsd_net *nn = container_of(work, struct nfsd_net,
>  				nfsd_shrinker_work);
>
>  	courtesy_client_reaper(nn);
>  	deleg_reaper(nn);
> +	spin_lock(&nn->client_lock);
> +	nn->nfsd_shrinker_active = false;
> +	spin_unlock(&nn->client_lock);
>  }
>
>  static inline __be32 nfs4_check_fh(struct svc_fh *fhp, struct nfs4_stid *stp)
> @@ -8064,7 +8077,7 @@ static int nfs4_state_create_net(struct net *net)
>  	INIT_LIST_HEAD(&nn->blocked_locks_lru);
>
>  	INIT_DELAYED_WORK(&nn->laundromat_work, laundromat_main);
> -	INIT_DELAYED_WORK(&nn->nfsd_shrinker_work, nfsd4_state_shrinker_worker);
> +	INIT_WORK(&nn->nfsd_shrinker_work, nfsd4_state_shrinker_worker);
>  	get_net(net);
>
>  	nn->nfsd_client_shrinker.scan_objects = nfsd4_state_shrinker_scan;
> @@ -8171,6 +8184,7 @@ nfs4_state_shutdown_net(struct net *net)
>  	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>
>  	unregister_shrinker(&nn->nfsd_client_shrinker);
> +	cancel_work(&nn->nfsd_shrinker_work);
>  	cancel_delayed_work_sync(&nn->laundromat_work);
>  	locks_end_grace(&nn->nfsd4_manager);
>

-- 
Jeff Layton