public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Chuck Lever <chuck.lever@oracle.com>
To: Jeff Layton <jlayton@kernel.org>, NeilBrown <neilb@suse.de>
Cc: Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	Salvatore Bonaccorso <carnil@debian.org>,
	linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] nfsd: validate the nfsd_serv pointer before calling svc_wake_up
Date: Mon, 27 Jan 2025 08:39:40 -0500	[thread overview]
Message-ID: <ac965f67-db15-4f93-be03-878e6a3d171b@oracle.com> (raw)
In-Reply-To: <5923519a4c8f6bb6d5ccd2697447b313fb61bba3.camel@kernel.org>

On 1/27/25 8:32 AM, Jeff Layton wrote:
> On Mon, 2025-01-27 at 08:22 -0500, Chuck Lever wrote:
>> On 1/27/25 8:07 AM, Jeff Layton wrote:
>>> On Mon, 2025-01-27 at 11:15 +1100, NeilBrown wrote:
>>>> On Mon, 27 Jan 2025, Jeff Layton wrote:
>>>>> On Mon, 2025-01-27 at 08:53 +1100, NeilBrown wrote:
>>>>>> On Sun, 26 Jan 2025, Jeff Layton wrote:
>>>>>>> On Sun, 2025-01-26 at 13:39 +1100, NeilBrown wrote:
>>>>>>>> On Sun, 26 Jan 2025, Jeff Layton wrote:
>>>>>>>>> nfsd_file_dispose_list_delayed can be called from the filecache
>>>>>>>>> laundrette, which is shut down after the nfsd threads are shut down and
>>>>>>>>> the nfsd_serv pointer is cleared. If nn->nfsd_serv is NULL then there
>>>>>>>>> are no threads to wake.
>>>>>>>>>
>>>>>>>>> Ensure that the nn->nfsd_serv pointer is non-NULL before calling
>>>>>>>>> svc_wake_up in nfsd_file_dispose_list_delayed. This is safe since the
>>>>>>>>> svc_serv is not freed until after the filecache laundrette is cancelled.
>>>>>>>>>
>>>>>>>>> Fixes: ffb402596147 ("nfsd: Don't leave work of closing files to a work queue")
>>>>>>>>> Reported-by: Salvatore Bonaccorso <carnil@debian.org>
>>>>>>>>> Closes: https://lore.kernel.org/linux-nfs/7d9f2a8aede4f7ca9935a47e1d405643220d7946.camel@kernel.org/
>>>>>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>>>>>>>> ---
>>>>>>>>> This is only lightly tested, but I think it will fix the bug that
>>>>>>>>> Salvatore reported.
>>>>>>>>> ---
>>>>>>>>>    fs/nfsd/filecache.c | 11 ++++++++++-
>>>>>>>>>    1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>>>>>>>> index e91c164b5ea21507659904690533a19ca43b1b64..fb2a4469b7a3c077de2dd750f43239b4af6d37b0 100644
>>>>>>>>> --- a/fs/nfsd/filecache.c
>>>>>>>>> +++ b/fs/nfsd/filecache.c
>>>>>>>>> @@ -445,11 +445,20 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
>>>>>>>>>    						struct nfsd_file, nf_gc);
>>>>>>>>>    		struct nfsd_net *nn = net_generic(nf->nf_net, nfsd_net_id);
>>>>>>>>>    		struct nfsd_fcache_disposal *l = nn->fcache_disposal;
>>>>>>>>> +		struct svc_serv *serv;
>>>>>>>>>    
>>>>>>>>>    		spin_lock(&l->lock);
>>>>>>>>>    		list_move_tail(&nf->nf_gc, &l->freeme);
>>>>>>>>>    		spin_unlock(&l->lock);
>>>>>>>>> -		svc_wake_up(nn->nfsd_serv);
>>>>>>>>> +
>>>>>>>>> +		/*
>>>>>>>>> +		 * The filecache laundrette is shut down after the
>>>>>>>>> +		 * nn->nfsd_serv pointer is cleared, but before the
>>>>>>>>> +		 * svc_serv is freed.
>>>>>>>>> +		 */
>>>>>>>>> +		serv = nn->nfsd_serv;
>>>>>>>>
>>>>>>>> I wonder if this should be READ_ONCE() to tell the compiler that we
>>>>>>>> could race with clearing nn->nfsd_serv.  Would the comment still be
>>>>>>>> needed?
>>>>>>>>
>>>>>>>
>>>>>>> I think we need a comment at least. The linkage between the laundrette
>>>>>>> and the nfsd_serv being set to NULL is very subtle. A READ_ONCE()
>>>>>>> doesn't convey that well, and is unnecessary here.
>>>>>>
>>>>>> Why do you say "is unnecessary here" ?
>>>>>> If the code were
>>>>>>      if (nn->nfsd_serv)
>>>>>>               svc_wake_up(nn->nfsd_serv);
>>>>>> that would be wrong as nn->nfds_serv could be set to NULL between the
>>>>>> two.
>>>>>> And the C compile is allowed to load the value twice because the C memory
>>>>>> model declares that would have the same effect.
>>>>>> While I doubt it would actually change how the code is compiled, I think
>>>>>> we should have READ_ONCE() here (and I've been wrong before about what
>>>>>> the compiler will actually do).
>>>>>>
>>>>>>
>>>>>
>>>>> It's unnecessary because the outcome of either case is acceptable.
>>>>>
>>>>> When racing with shutdown, either it's NULL and the laundrette won't
>>>>> call svc_wake_up(), or it's non-NULL and it will. In the non-NULL case,
>>>>> the call to svc_wake_up() will be a no-op because the threads are shut
>>>>> down.
>>>>>
>>>>> The vastly common case in this code is that this pointer will be non-
>>>>> NULL, because the server is running (i.e. not racing with shutdown). I
>>>>> don't see the need in making all of those accesses volatile.
>>>>
>>>> One of us is confused.  I hope it isn't me.
>>>>
>>>
>>> It's probably me. I think you have a much better understanding of
>>> compiler design than I do. Still...
>>>
>>>> The hypothetical problem I see is that the C compiler could generate
>>>> code to load the value "nn->nfsd_serv" twice.  The first time it is not
>>>> NULL, the second time it is NULL.
>>>> The first is used for the test, the second is passed to svc_wake_up().
>>>>
>>>> Unlikely though this is, it is possible and READ_ONCE() is designed
>>>> precisely to prevent this.
>>>> To quote from include/asm-generic/rwonce.h it will
>>>>    "Prevent the compiler from merging or refetching reads"
>>>>
>>>> A "volatile" access does not add any cost (in this case).  What it does
>>>> is break any aliasing that the compile might have deduced.
>>>> Even if the compiler thinks it has "nn->nfsd_serv" in a register, it
>>>> won't think it has the result of READ_ONCE(nn->nfsd_serv) in that register.
>>>> And if it needs the result of a previous READ_ONCE(nn->nfsd_serv) it
>>>> won't decide that it can just read nn->nfsd_serv again.  It MUST keep
>>>> the result of READ_ONCE(nn->nfsd_serv) somewhere until it is not needed
>>>> any more.
>>>
>>> I'm mainly just considering the resulting pointer. There are two
>>> possible outcomes to the fetch of nn->nfsd_serv. Either it's a valid
>>> pointer that points to the svc_serv, or it's NULL. The resulting code
>>> can handle either case, so it doesn't seem like adding READ_ONCE() will
>>> create any material difference here.
>>>
>>> Maybe I should ask it this way: What bad outcome could result if we
>>> don't add READ_ONCE() here?
>>
>> Neil just described it. The compiler would generate two load operations,
>> one for the test and one for the function call argument. The first load
>> can retrieve a non-NULL address, and the second a NULL address.
>>
>> I agree a READ_ONCE() is necessary.
>>
>>
> 
> Now I'm confused:
> 
>                  struct svc_serv *serv;
> 
> 		[...]
> 
>                  /*
>                   * The filecache laundrette is shut down after the
>                   * nn->nfsd_serv pointer is cleared, but before the
>                   * svc_serv is freed.
>                   */
>                  serv = nn->nfsd_serv;
>                  if (serv)
>                          svc_wake_up(serv);
> 
> This code is explicitly asking to fetch nn->nfsd_serv into the serv
> variable, and then is testing that copy of the pointer and passing it
> into svc_wake_up().
> 
> How is the compiler allowed to suddenly refetch a NULL pointer into
> serv after testing that serv is non-NULL?

There's nothing here that tells the compiler it is not allowed to
optimize this into two separate fetches if it feels that is better
code. READ_ONCE is what tells the compiler we do not want that re-
organization /ever/.


-- 
Chuck Lever

  reply	other threads:[~2025-01-27 13:40 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-26  1:13 [PATCH] nfsd: validate the nfsd_serv pointer before calling svc_wake_up Jeff Layton
2025-01-26  2:39 ` NeilBrown
2025-01-26 12:36   ` Jeff Layton
2025-01-26 21:53     ` NeilBrown
2025-01-26 22:48       ` Jeff Layton
2025-01-27  0:15         ` NeilBrown
2025-01-27 13:07           ` Jeff Layton
2025-01-27 13:22             ` Chuck Lever
2025-01-27 13:32               ` Jeff Layton
2025-01-27 13:39                 ` Chuck Lever [this message]
2025-01-27 14:03                   ` Chuck Lever
2025-01-27 14:34                     ` Jeff Layton
2025-01-27 22:11                       ` NeilBrown
2025-01-27 22:16                     ` NeilBrown
2025-01-26 18:58 ` cel
2025-01-28 17:07 ` cel
2025-01-29 21:13   ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ac965f67-db15-4f93-be03-878e6a3d171b@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=Dai.Ngo@oracle.com \
    --cc=carnil@debian.org \
    --cc=jlayton@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=okorniev@redhat.com \
    --cc=tom@talpey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox