From: Kinglong Mee <kinglongmee@gmail.com>
To: Jeffrey Layton <jeff.layton@primarydata.com>,
Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Bruce Fields <bfields@fieldses.org>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client
Date: Sun, 03 Aug 2014 22:15:05 +0800 [thread overview]
Message-ID: <53DE43E9.9030704@gmail.com> (raw)
In-Reply-To: <CAPakX04ctGujqpJ48WqT1-r5=q0T8D1Ji=q1P5UTuwdw_hTsGg@mail.gmail.com>
On 8/3/2014 19:35, Jeffrey Layton wrote:
> On Sat, Aug 2, 2014 at 9:59 PM, Trond Myklebust <trond.myklebust@primarydata.com <mailto:trond.myklebust@primarydata.com>> wrote:
>
> On Sat, Aug 2, 2014 at 6:59 PM, Jeff Layton <jeff.layton@primarydata.com <mailto:jeff.layton@primarydata.com>> wrote:
> > On Sat, 2 Aug 2014 10:47:27 -0400
> > Trond Myklebust <trond.myklebust@primarydata.com <mailto:trond.myklebust@primarydata.com>> wrote:
> >
> >> On Sat, Aug 2, 2014 at 10:20 AM, Kinglong Mee <kinglongmee@gmail.com <mailto:kinglongmee@gmail.com>> wrote:
> >> > On 8/2/2014 22:05, Trond Myklebust wrote:
> >> >> On Sat, Aug 2, 2014 at 9:43 AM, Kinglong Mee <kinglongmee@gmail.com <mailto:kinglongmee@gmail.com>> wrote:
> >> >>> On 8/2/2014 21:11, Trond Myklebust wrote:
> >> >>>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <kinglongmee@gmail.com <mailto:kinglongmee@gmail.com>> wrote:
> >> >>>>> On 7/30/2014 09:34, Jeff Layton wrote:
> >> >>>>>> From: Trond Myklebust <trond.myklebust@primarydata.com <mailto:trond.myklebust@primarydata.com>>
> >> >>>>>>
> >> >>>>>> Preparation for removing the client_mutex.
> >> >>>>>>
> >> >>>>>> Convert the open owner hash table into a per-client table and protect it
> >> >>>>>> using the nfs4_client->cl_lock spin lock.
> >> >>>>>>
> >> >>>>>> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com <mailto:trond.myklebust@primarydata.com>>
> >> >>>>>> ---
> >> >>>>>> fs/nfsd/netns.h | 1 -
> >> >>>>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
> >> >>>>>> fs/nfsd/state.h | 1 +
> >> >>>>>> 3 files changed, 86 insertions(+), 103 deletions(-)
> >> >>>>>>
> >> >>>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> >> >>>>>> index a71d14413d39..e1f479c162b5 100644
> >> >>>>>> --- a/fs/nfsd/netns.h
> >> >>>>>> +++ b/fs/nfsd/netns.h
> >> >>>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
> >> >>>>>> struct rb_root conf_name_tree;
> >> >>>>>> struct list_head *unconf_id_hashtbl;
> >> >>>>>> struct rb_root unconf_name_tree;
> >> >>>>>> - struct list_head *ownerstr_hashtbl;
> >> >>>>>
> >> >>>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
> >> >>>>> http://comments.gmane.org/gmane.linux.nfs/64382
> >> >>>>>
> >> >>>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
> >> >>>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
> >> >>>>
> >> >>>> Why? We're not currently doing that.
> >> >>>
> >> >>> Although not doing that right now, but there is a bug for getting the right ld_owner
> >> >>> in nfs4_set_lock_denied.
> >> >>>
> >> >>> If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
> >> >>> fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.
> >> >>>
> >> >>> If we want fix this problem, needs finding the lockowner from struct file_lock.
> >> >>
> >> >> Do we really care enough about fixing nfs4_set_lock_denied enough to
> >> >> do so at the cost of reducing overall scalability of locking state?
> >> >
> >> > I just report this problem, don't think enough about the scalability.
> >> >
> >> >> We will always be faking up the clientid etc for local locks. Are
> >> >> there any clients out there that actually inspect the clientid on a
> >> >> result of NFS4ERR_DENIED and that will break if we give them a fake
> >> >> for non-local locks?
> >> >
> >> > Jeff has point the same problem of a non-nfs4_lockowner.
> >> > Maybe we should copy fl_lmops to conflock as before, nfsd can distinguish
> >> > the lockowner stored in struct file_lock by checking fl_lmops.
> >> >
> >>
> >> Alternatively, set a flag in fl_flags. Back in the days, we used to
> >> have a FL_NFSD, perhaps it is time to resurrect that?
> >>
> >
> > Would we need a similar flag for lockd too?
> >
> > I'm not sure a flag is the correct approach for this. A "fl_lmtype" field
> > or something similar might make sense, but I'm not sure that really
> > adds much over just ensuring that fl_lmops is set properly for these locks.
>
> It avoids adding an extra copy of fl_lmops that would only be useful
> to knfsd as a flag.
>
> > That said, we definitely will need to ensure that there are no harmful
> > effects from setting fl_lmops on a conflock container. I don't see any
> > right offhand, but it's probably reasonable to put something like that
> > in -next for a bit of soak time...
>
> The real problem here is that you are copying the lock, and then
> trying to dereference the copied pointer without ensuring that the
> thing the pointer is referencing still exists. Given that we're
> removing the global state mutex, then it is now perfectly possible for
> a competing knfsd thread to free the conflicting lock and the
> lockowner pointed to by fl->fl_owner before your thread hits
> nfs4_set_lock_denied.
> The check for fl_lmtype does nothing to fix that race (nor does FL_NFSD).
>
> Cheers
> Trond
>
>
> Agreed. That's a significant bit of nastiness, and I think you're correct that the
> client_mutex removal will exacerbate the problem. The only way I can see to
> properly fix that would be to make it so that the conflock holds a reference to the
> lockowner, and then put it when the conflock is freed.
>
I will check the reference to lockowner in nfsd, and try to simulate client-id expire.
> That will require some new fl_ops or fl_lmops. It would also be good to clean up
> conflock generation a bit, and turn it into a more explicit process.
> __locks_copy_lock doesn't really fully describe what's going on there, IMO...
I will have a try.
thanks,
Kinglong Mee
next prev parent reply other threads:[~2014-08-03 14:15 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-30 1:34 [PATCH v3 00/38] nfsd: stateid and stateowner refcounting overhaul Jeff Layton
2014-07-30 1:34 ` [PATCH v3 01/38] nfsd: Add reference counting to the lock and open stateids Jeff Layton
2014-07-30 1:34 ` [PATCH v3 02/38] nfsd: Cleanup the freeing of stateids Jeff Layton
2014-07-30 20:57 ` Christoph Hellwig
2014-07-31 16:50 ` J. Bruce Fields
2014-07-30 1:34 ` [PATCH v3 03/38] nfsd: Add a struct nfs4_file field to struct nfs4_stid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 04/38] nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file Jeff Layton
2014-07-30 20:59 ` Christoph Hellwig
2014-07-31 16:52 ` J. Bruce Fields
2014-07-30 1:34 ` [PATCH v3 05/38] nfsd4: use cl_lock to synchronize all stateid idr calls Jeff Layton
2014-07-30 20:59 ` Christoph Hellwig
2014-07-30 1:34 ` [PATCH v3 06/38] nfsd: do filp_close in sc_free callback for lock stateids Jeff Layton
2014-07-30 21:08 ` Christoph Hellwig
2014-07-30 1:34 ` [PATCH v3 07/38] nfsd: Add locking to protect the state owner lists Jeff Layton
2014-07-30 1:34 ` [PATCH v3 08/38] nfsd: clean up races in lock stateid searching and creation Jeff Layton
2014-07-30 1:34 ` [PATCH v3 09/38] nfsd: ensure atomicity in nfsd4_free_stateid and nfsd4_validate_stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 10/38] nfsd: Add reference counting to lock stateids Jeff Layton
2014-07-30 1:34 ` [PATCH v3 11/38] nfsd: nfsd4_locku() must reference the lock stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 12/38] nfsd: Ensure that nfs4_open_delegation() references the delegation stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 13/38] nfsd: nfsd4_process_open2() must reference " Jeff Layton
2014-07-30 1:34 ` [PATCH v3 14/38] nfsd: nfsd4_process_open2() must reference the open stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 15/38] nfsd: Prepare nfsd4_close() for open stateid referencing Jeff Layton
2014-07-30 1:34 ` [PATCH v3 16/38] nfsd: nfsd4_open_confirm() must reference the open stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 17/38] nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op Jeff Layton
2014-07-30 1:34 ` [PATCH v3 18/38] nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op Jeff Layton
2014-07-30 1:34 ` [PATCH v3 19/38] nfsd: Migrate the stateid reference into nfs4_lookup_stateid() Jeff Layton
2014-07-30 1:34 ` [PATCH v3 20/38] nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type() Jeff Layton
2014-07-31 20:04 ` J. Bruce Fields
2014-07-30 1:34 ` [PATCH v3 21/38] nfsd: Add reference counting to state owners Jeff Layton
2014-08-01 16:48 ` J. Bruce Fields
2014-08-01 16:52 ` Jeff Layton
2014-07-30 1:34 ` [PATCH v3 22/38] nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache Jeff Layton
2014-07-30 1:34 ` [PATCH v3 23/38] nfsd: clean up lockowner refcounting when finding them Jeff Layton
2014-07-30 1:34 ` [PATCH v3 24/38] nfsd: add an operation for unhashing a stateowner Jeff Layton
2014-07-30 1:34 ` [PATCH v3 25/38] nfsd: Make lock stateid take a reference to the lockowner Jeff Layton
2014-07-30 1:34 ` [PATCH v3 26/38] nfsd: clean up refcounting for lockowners Jeff Layton
2014-07-30 1:34 ` [PATCH v3 27/38] nfsd: make openstateids hold references to their openowners Jeff Layton
2014-08-01 19:29 ` J. Bruce Fields
2014-07-30 1:34 ` [PATCH v3 28/38] nfsd: don't allow CLOSE to proceed until refcount on stateid drops Jeff Layton
2014-07-30 1:34 ` [PATCH v3 29/38] nfsd: Protect adding/removing open state owners using client_lock Jeff Layton
2014-07-30 1:34 ` [PATCH v3 30/38] nfsd: Protect adding/removing lock " Jeff Layton
2014-08-01 19:44 ` J. Bruce Fields
2014-07-30 1:34 ` [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client Jeff Layton
2014-08-02 10:39 ` Kinglong Mee
2014-08-02 13:11 ` Trond Myklebust
2014-08-02 13:23 ` Jeff Layton
2014-08-02 13:51 ` Kinglong Mee
2014-08-02 13:43 ` Kinglong Mee
2014-08-02 14:05 ` Trond Myklebust
2014-08-02 14:20 ` Kinglong Mee
2014-08-02 14:47 ` Trond Myklebust
2014-08-02 22:59 ` Jeff Layton
2014-08-03 1:59 ` Trond Myklebust
[not found] ` <CAPakX04ctGujqpJ48WqT1-r5=q0T8D1Ji=q1P5UTuwdw_hTsGg@mail.gmail.com>
2014-08-03 14:15 ` Kinglong Mee [this message]
2014-08-03 14:49 ` Jeff Layton
2014-08-05 18:46 ` Bruce Fields
2014-07-30 1:34 ` [PATCH v3 32/38] nfsd: clean up and reorganize release_lockowner Jeff Layton
2014-07-30 1:34 ` [PATCH v3 33/38] nfsd: add locking to stateowner release Jeff Layton
2014-07-30 1:34 ` [PATCH v3 34/38] nfsd: optimize destroy_lockowner cl_lock thrashing Jeff Layton
2014-07-30 1:34 ` [PATCH v3 35/38] nfsd: close potential race in nfsd4_free_stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 36/38] nfsd: reduce cl_lock thrashing in release_openowner Jeff Layton
2014-07-30 1:34 ` [PATCH v3 37/38] nfsd: don't thrash the cl_lock while freeing an open stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 38/38] nfsd: rename unhash_generic_stateid to unhash_ol_stateid Jeff Layton
2014-08-01 20:25 ` [PATCH v3 00/38] nfsd: stateid and stateowner refcounting overhaul J. Bruce Fields
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53DE43E9.9030704@gmail.com \
--to=kinglongmee@gmail.com \
--cc=bfields@fieldses.org \
--cc=hch@infradead.org \
--cc=jeff.layton@primarydata.com \
--cc=linux-nfs@vger.kernel.org \
--cc=trond.myklebust@primarydata.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).