From: Kinglong Mee <kinglongmee@gmail.com>
To: Jeffrey Layton <jeff.layton@primarydata.com>,
Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Bruce Fields <bfields@fieldses.org>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client
Date: Sun, 03 Aug 2014 22:15:05 +0800 [thread overview]
Message-ID: <53DE43E9.9030704@gmail.com> (raw)
In-Reply-To: <CAPakX04ctGujqpJ48WqT1-r5=q0T8D1Ji=q1P5UTuwdw_hTsGg@mail.gmail.com>
On 8/3/2014 19:35, Jeffrey Layton wrote:
> On Sat, Aug 2, 2014 at 9:59 PM, Trond Myklebust <trond.myklebust@primarydata.com <mailto:trond.myklebust@primarydata.com>> wrote:
>
> On Sat, Aug 2, 2014 at 6:59 PM, Jeff Layton <jeff.layton@primarydata.com <mailto:jeff.layton@primarydata.com>> wrote:
> > On Sat, 2 Aug 2014 10:47:27 -0400
> > Trond Myklebust <trond.myklebust@primarydata.com <mailto:trond.myklebust@primarydata.com>> wrote:
> >
> >> On Sat, Aug 2, 2014 at 10:20 AM, Kinglong Mee <kinglongmee@gmail.com <mailto:kinglongmee@gmail.com>> wrote:
> >> > On 8/2/2014 22:05, Trond Myklebust wrote:
> >> >> On Sat, Aug 2, 2014 at 9:43 AM, Kinglong Mee <kinglongmee@gmail.com <mailto:kinglongmee@gmail.com>> wrote:
> >> >>> On 8/2/2014 21:11, Trond Myklebust wrote:
> >> >>>> On Sat, Aug 2, 2014 at 6:39 AM, Kinglong Mee <kinglongmee@gmail.com <mailto:kinglongmee@gmail.com>> wrote:
> >> >>>>> On 7/30/2014 09:34, Jeff Layton wrote:
> >> >>>>>> From: Trond Myklebust <trond.myklebust@primarydata.com <mailto:trond.myklebust@primarydata.com>>
> >> >>>>>>
> >> >>>>>> Preparation for removing the client_mutex.
> >> >>>>>>
> >> >>>>>> Convert the open owner hash table into a per-client table and protect it
> >> >>>>>> using the nfs4_client->cl_lock spin lock.
> >> >>>>>>
> >> >>>>>> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com <mailto:trond.myklebust@primarydata.com>>
> >> >>>>>> ---
> >> >>>>>> fs/nfsd/netns.h | 1 -
> >> >>>>>> fs/nfsd/nfs4state.c | 187 ++++++++++++++++++++++++----------------------------
> >> >>>>>> fs/nfsd/state.h | 1 +
> >> >>>>>> 3 files changed, 86 insertions(+), 103 deletions(-)
> >> >>>>>>
> >> >>>>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> >> >>>>>> index a71d14413d39..e1f479c162b5 100644
> >> >>>>>> --- a/fs/nfsd/netns.h
> >> >>>>>> +++ b/fs/nfsd/netns.h
> >> >>>>>> @@ -63,7 +63,6 @@ struct nfsd_net {
> >> >>>>>> struct rb_root conf_name_tree;
> >> >>>>>> struct list_head *unconf_id_hashtbl;
> >> >>>>>> struct rb_root unconf_name_tree;
> >> >>>>>> - struct list_head *ownerstr_hashtbl;
> >> >>>>>
> >> >>>>> I send a patch "NFSD: Rervert "knfsd: locks: flag NFSv4-owned locks"" before,
> >> >>>>> http://comments.gmane.org/gmane.linux.nfs/64382
> >> >>>>>
> >> >>>>> nfsd needs the hashtbl to find the lockowner for locking by owner from
> >> >>>>> fl->fl_owner stored in struct file_lock, but without nfs_client.
> >> >>>>
> >> >>>> Why? We're not currently doing that.
> >> >>>
> >> >>> Although not doing that right now, but there is a bug for getting the right ld_owner
> >> >>> in nfs4_set_lock_denied.
> >> >>>
> >> >>> If denying locks, vfs don't copy fl->fl_lmops to the returned file_lock, so that,
> >> >>> fl->fl_lmops always be NULL, nfsd never return the owner who holds the conflock.
> >> >>>
> >> >>> If we want fix this problem, needs finding the lockowner from struct file_lock.
> >> >>
> >> >> Do we really care enough about fixing nfs4_set_lock_denied enough to
> >> >> do so at the cost of reducing overall scalability of locking state?
> >> >
> >> > I just report this problem, don't think enough about the scalability.
> >> >
> >> >> We will always be faking up the clientid etc for local locks. Are
> >> >> there any clients out there that actually inspect the clientid on a
> >> >> result of NFS4ERR_DENIED and that will break if we give them a fake
> >> >> for non-local locks?
> >> >
> >> > Jeff has point the same problem of a non-nfs4_lockowner.
> >> > Maybe we should copy fl_lmops to conflock as before, nfsd can distinguish
> >> > the lockowner stored in struct file_lock by checking fl_lmops.
> >> >
> >>
> >> Alternatively, set a flag in fl_flags. Back in the days, we used to
> >> have a FL_NFSD, perhaps it is time to resurrect that?
> >>
> >
> > Would we need a similar flag for lockd too?
> >
> > I'm not sure a flag is the correct approach for this. A "fl_lmtype" field
> > or something similar might make sense, but I'm not sure that really
> > adds much over just ensuring that fl_lmops is set properly for these locks.
>
> It avoids adding an extra copy of fl_lmops that would only be useful
> to knfsd as a flag.
>
> > That said, we definitely will need to ensure that there are no harmful
> > effects from setting fl_lmops on a conflock container. I don't see any
> > right offhand, but it's probably reasonable to put something like that
> > in -next for a bit of soak time...
>
> The real problem here is that you are copying the lock, and then
> trying to dereference the copied pointer without ensuring that the
> thing the pointer is referencing still exists. Given that we're
> removing the global state mutex, then it is now perfectly possible for
> a competing knfsd thread to free the conflicting lock and the
> lockowner pointed to by fl->fl_owner before your thread hits
> nfs4_set_lock_denied.
> The check for fl_lmtype does nothing to fix that race (nor does FL_NFSD).
>
> Cheers
> Trond
>
>
> Agreed. That's a significant bit of nastiness, and I think you're correct that the
> client_mutex removal will exacerbate the problem. The only way I can see to
> properly fix that would be to make it so that the conflock holds a reference to the
> lockowner, and then put it when the conflock is freed.
>
I will check the reference to lockowner in nfsd, and try to simulate client-id expire.
> That will require some new fl_ops or fl_lmops. It would also be good to clean up
> conflock generation a bit, and turn it into a more explicit process.
> __locks_copy_lock doesn't really fully describe what's going on there, IMO...
I will have a try.
thanks,
Kinglong Mee
next prev parent reply other threads:[~2014-08-03 14:15 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-30 1:34 [PATCH v3 00/38] nfsd: stateid and stateowner refcounting overhaul Jeff Layton
2014-07-30 1:34 ` [PATCH v3 01/38] nfsd: Add reference counting to the lock and open stateids Jeff Layton
2014-07-30 1:34 ` [PATCH v3 02/38] nfsd: Cleanup the freeing of stateids Jeff Layton
2014-07-30 20:57 ` Christoph Hellwig
2014-07-31 16:50 ` J. Bruce Fields
2014-07-30 1:34 ` [PATCH v3 03/38] nfsd: Add a struct nfs4_file field to struct nfs4_stid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 04/38] nfsd: Replace nfs4_ol_stateid->st_file with the st_stid.sc_file Jeff Layton
2014-07-30 20:59 ` Christoph Hellwig
2014-07-31 16:52 ` J. Bruce Fields
2014-07-30 1:34 ` [PATCH v3 05/38] nfsd4: use cl_lock to synchronize all stateid idr calls Jeff Layton
2014-07-30 20:59 ` Christoph Hellwig
2014-07-30 1:34 ` [PATCH v3 06/38] nfsd: do filp_close in sc_free callback for lock stateids Jeff Layton
2014-07-30 21:08 ` Christoph Hellwig
2014-07-30 1:34 ` [PATCH v3 07/38] nfsd: Add locking to protect the state owner lists Jeff Layton
2014-07-30 1:34 ` [PATCH v3 08/38] nfsd: clean up races in lock stateid searching and creation Jeff Layton
2014-07-30 1:34 ` [PATCH v3 09/38] nfsd: ensure atomicity in nfsd4_free_stateid and nfsd4_validate_stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 10/38] nfsd: Add reference counting to lock stateids Jeff Layton
2014-07-30 1:34 ` [PATCH v3 11/38] nfsd: nfsd4_locku() must reference the lock stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 12/38] nfsd: Ensure that nfs4_open_delegation() references the delegation stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 13/38] nfsd: nfsd4_process_open2() must reference " Jeff Layton
2014-07-30 1:34 ` [PATCH v3 14/38] nfsd: nfsd4_process_open2() must reference the open stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 15/38] nfsd: Prepare nfsd4_close() for open stateid referencing Jeff Layton
2014-07-30 1:34 ` [PATCH v3 16/38] nfsd: nfsd4_open_confirm() must reference the open stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 17/38] nfsd: Add reference counting to nfs4_preprocess_confirmed_seqid_op Jeff Layton
2014-07-30 1:34 ` [PATCH v3 18/38] nfsd: Migrate the stateid reference into nfs4_preprocess_seqid_op Jeff Layton
2014-07-30 1:34 ` [PATCH v3 19/38] nfsd: Migrate the stateid reference into nfs4_lookup_stateid() Jeff Layton
2014-07-30 1:34 ` [PATCH v3 20/38] nfsd: Migrate the stateid reference into nfs4_find_stateid_by_type() Jeff Layton
2014-07-31 20:04 ` J. Bruce Fields
2014-07-30 1:34 ` [PATCH v3 21/38] nfsd: Add reference counting to state owners Jeff Layton
2014-08-01 16:48 ` J. Bruce Fields
2014-08-01 16:52 ` Jeff Layton
2014-07-30 1:34 ` [PATCH v3 22/38] nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache Jeff Layton
2014-07-30 1:34 ` [PATCH v3 23/38] nfsd: clean up lockowner refcounting when finding them Jeff Layton
2014-07-30 1:34 ` [PATCH v3 24/38] nfsd: add an operation for unhashing a stateowner Jeff Layton
2014-07-30 1:34 ` [PATCH v3 25/38] nfsd: Make lock stateid take a reference to the lockowner Jeff Layton
2014-07-30 1:34 ` [PATCH v3 26/38] nfsd: clean up refcounting for lockowners Jeff Layton
2014-07-30 1:34 ` [PATCH v3 27/38] nfsd: make openstateids hold references to their openowners Jeff Layton
2014-08-01 19:29 ` J. Bruce Fields
2014-07-30 1:34 ` [PATCH v3 28/38] nfsd: don't allow CLOSE to proceed until refcount on stateid drops Jeff Layton
2014-07-30 1:34 ` [PATCH v3 29/38] nfsd: Protect adding/removing open state owners using client_lock Jeff Layton
2014-07-30 1:34 ` [PATCH v3 30/38] nfsd: Protect adding/removing lock " Jeff Layton
2014-08-01 19:44 ` J. Bruce Fields
2014-07-30 1:34 ` [PATCH v3 31/38] nfsd: Move the open owner hash table into struct nfs4_client Jeff Layton
2014-08-02 10:39 ` Kinglong Mee
2014-08-02 13:11 ` Trond Myklebust
2014-08-02 13:23 ` Jeff Layton
2014-08-02 13:51 ` Kinglong Mee
2014-08-02 13:43 ` Kinglong Mee
2014-08-02 14:05 ` Trond Myklebust
2014-08-02 14:20 ` Kinglong Mee
2014-08-02 14:47 ` Trond Myklebust
2014-08-02 22:59 ` Jeff Layton
2014-08-03 1:59 ` Trond Myklebust
[not found] ` <CAPakX04ctGujqpJ48WqT1-r5=q0T8D1Ji=q1P5UTuwdw_hTsGg@mail.gmail.com>
2014-08-03 14:15 ` Kinglong Mee [this message]
2014-08-03 14:49 ` Jeff Layton
2014-08-05 18:46 ` Bruce Fields
2014-07-30 1:34 ` [PATCH v3 32/38] nfsd: clean up and reorganize release_lockowner Jeff Layton
2014-07-30 1:34 ` [PATCH v3 33/38] nfsd: add locking to stateowner release Jeff Layton
2014-07-30 1:34 ` [PATCH v3 34/38] nfsd: optimize destroy_lockowner cl_lock thrashing Jeff Layton
2014-07-30 1:34 ` [PATCH v3 35/38] nfsd: close potential race in nfsd4_free_stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 36/38] nfsd: reduce cl_lock thrashing in release_openowner Jeff Layton
2014-07-30 1:34 ` [PATCH v3 37/38] nfsd: don't thrash the cl_lock while freeing an open stateid Jeff Layton
2014-07-30 1:34 ` [PATCH v3 38/38] nfsd: rename unhash_generic_stateid to unhash_ol_stateid Jeff Layton
2014-08-01 20:25 ` [PATCH v3 00/38] nfsd: stateid and stateowner refcounting overhaul J. Bruce Fields
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53DE43E9.9030704@gmail.com \
--to=kinglongmee@gmail.com \
--cc=bfields@fieldses.org \
--cc=hch@infradead.org \
--cc=jeff.layton@primarydata.com \
--cc=linux-nfs@vger.kernel.org \
--cc=trond.myklebust@primarydata.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.