* [bug report] deploying both NFS client and server on the same machine triggers hung task
@ 2024-11-25 11:17 Li Lingfeng
2024-11-25 17:32 ` Mark Liam Brown
2024-11-28 7:22 ` Li Lingfeng
0 siblings, 2 replies; 6+ messages in thread
From: Li Lingfeng @ 2024-11-25 11:17 UTC (permalink / raw)
To: Dai.Ngo, Chuck Lever, Jeff Layton, NeilBrown, okorniev, tom,
trond.myklebust
Cc: linux-nfs, linux-kernel, Yu Kuai, Hou Tao, zhangyi (F), yangerkun,
chengzhihao1, Li Lingfeng, Li Lingfeng
Hi, we recently found a hung-task issue.
Commit 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low
memory condition") adds a shrinker to NFSD, which causes NFSD to try to
obtain shrinker_rwsem when starting and stopping services.
Deploying both NFS client and server on the same machine may lead to the
following issue, since they will share the global shrinker_rwsem.
nfsd                                  nfs
                                      drop_cache // hold shrinker_rwsem
                                        write back, wait for rpc_task to exit
// stop nfsd threads
svc_set_num_threads
  // clean up xprts
  svc_xprt_destroy_all
                                        rpc_check_timeout
                                          rpc_check_connected
                                          // wait for the connection to be disconnected
  unregister_shrinker
  // wait for shrinker_rwsem
Normally, the client's rpc_task will exit after the server's nfsd thread
has processed the request.
When all the server's nfsd threads exit, the client’s rpc_task is expected
to detect the network connection being disconnected and exit.
However, although the server has executed svc_xprt_destroy_all before
waiting for shrinker_rwsem, the network connection is not actually
disconnected. Instead, the operation to close the socket is simply added
to the task_works queue.
svc_xprt_destroy_all
  ...
  svc_sock_free
    sockfd_put
      fput_many
        init_task_work // ____fput
        task_work_add // add to task->task_works
The actual disconnection of the network connection will only occur after
the current process finishes.
do_exit
  exit_task_work
    task_work_run
      ...
      ____fput // close sock
Although it is not a common practice to deploy NFS client and server on
the same machine, I think this issue still needs to be addressed,
otherwise it will cause all processes trying to acquire the shrinker_rwsem
to hang.
I don't have any ideas yet on how to solve this problem, does anyone have
any suggestions?
Thanks.
^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [bug report] deploying both NFS client and server on the same machine triggers hung task
2024-11-25 11:17 [bug report] deploying both NFS client and server on the same machine triggers hung task Li Lingfeng
@ 2024-11-25 17:32 ` Mark Liam Brown
2024-11-26 2:28 ` Li Lingfeng
2024-11-28 7:22 ` Li Lingfeng
1 sibling, 1 reply; 6+ messages in thread
From: Mark Liam Brown @ 2024-11-25 17:32 UTC (permalink / raw)
To: linux-nfs, linux-kernel
On Mon, Nov 25, 2024 at 1:48 PM Li Lingfeng <lilingfeng3@huawei.com> wrote:
>
> Hi, we have found a hungtask issue recently.
>
> Commit 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low
> memory condition") adds a shrinker to NFSD, which causes NFSD to try to
> obtain shrinker_rwsem when starting and stopping services.
>
> Deploying both NFS client and server on the same machine may lead to the
> following issue, since they will share the global shrinker_rwsem.
>
> nfsd                                  nfs
>                                       drop_cache // hold shrinker_rwsem
>                                         write back, wait for rpc_task to exit
> // stop nfsd threads
> svc_set_num_threads
>   // clean up xprts
>   svc_xprt_destroy_all
>                                         rpc_check_timeout
>                                           rpc_check_connected
>                                           // wait for the connection to be disconnected
>   unregister_shrinker
>   // wait for shrinker_rwsem
>
> Normally, the client's rpc_task will exit after the server's nfsd thread
> has processed the request.
> When all the server's nfsd threads exit, the client's rpc_task is expected
> to detect the network connection being disconnected and exit.
> However, although the server has executed svc_xprt_destroy_all before
> waiting for shrinker_rwsem, the network connection is not actually
> disconnected. Instead, the operation to close the socket is simply added
> to the task_works queue.
>
> svc_xprt_destroy_all
>   ...
>   svc_sock_free
>     sockfd_put
>       fput_many
>         init_task_work // ____fput
>         task_work_add // add to task->task_works
>
> The actual disconnection of the network connection will only occur after
> the current process finishes.
> do_exit
>   exit_task_work
>     task_work_run
>       ...
>       ____fput // close sock
>
> Although it is not a common practice to deploy NFS client and server on
> the same machine, I think this issue still needs to be addressed,
> otherwise it will cause all processes trying to acquire the shrinker_rwsem
> to hang.
I disagree with that comment. Many small companies run the NFS client and
NFS server on the same machine, with the client used to allow user logins
or to support schroot or containers.
Mark
--
IT Infrastructure Consultant
Windows, Linux
^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [bug report] deploying both NFS client and server on the same machine triggers hung task
2024-11-25 17:32 ` Mark Liam Brown
@ 2024-11-26 2:28 ` Li Lingfeng
0 siblings, 0 replies; 6+ messages in thread
From: Li Lingfeng @ 2024-11-26 2:28 UTC (permalink / raw)
To: Mark Liam Brown, linux-nfs, linux-kernel
Cc: yangerkun, zhangyi (F), yukuai (C), chengzhihao1, Hou Tao
On 2024/11/26 01:32, Mark Liam Brown wrote:
> On Mon, Nov 25, 2024 at 1:48 PM Li Lingfeng <lilingfeng3@huawei.com> wrote:
>> Hi, we have found a hungtask issue recently.
>>
>> Commit 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low
>> memory condition") adds a shrinker to NFSD, which causes NFSD to try to
>> obtain shrinker_rwsem when starting and stopping services.
>>
>> Deploying both NFS client and server on the same machine may lead to the
>> following issue, since they will share the global shrinker_rwsem.
>>
>> nfsd                                  nfs
>>                                       drop_cache // hold shrinker_rwsem
>>                                         write back, wait for rpc_task to exit
>> // stop nfsd threads
>> svc_set_num_threads
>>   // clean up xprts
>>   svc_xprt_destroy_all
>>                                         rpc_check_timeout
>>                                           rpc_check_connected
>>                                           // wait for the connection to be disconnected
>>   unregister_shrinker
>>   // wait for shrinker_rwsem
>>
>> Normally, the client's rpc_task will exit after the server's nfsd thread
>> has processed the request.
>> When all the server's nfsd threads exit, the client's rpc_task is expected
>> to detect the network connection being disconnected and exit.
>> However, although the server has executed svc_xprt_destroy_all before
>> waiting for shrinker_rwsem, the network connection is not actually
>> disconnected. Instead, the operation to close the socket is simply added
>> to the task_works queue.
>>
>> svc_xprt_destroy_all
>>   ...
>>   svc_sock_free
>>     sockfd_put
>>       fput_many
>>         init_task_work // ____fput
>>         task_work_add // add to task->task_works
>>
>> The actual disconnection of the network connection will only occur after
>> the current process finishes.
>> do_exit
>>   exit_task_work
>>     task_work_run
>>       ...
>>       ____fput // close sock
>>
>> Although it is not a common practice to deploy NFS client and server on
>> the same machine, I think this issue still needs to be addressed,
>> otherwise it will cause all processes trying to acquire the shrinker_rwsem
>> to hang.
> I disagree with that comment. Most small companies have NFS client and
> NFS server on the same machine, the client being used to allow logins
> by users, or to support schroot or containers.
>
> Mark
Sorry for my hasty conclusion.
By the way, nfsd_reply_cache_shrinker triggers this too.
Li
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [bug report] deploying both NFS client and server on the same machine triggers hung task
2024-11-25 11:17 [bug report] deploying both NFS client and server on the same machine triggers hung task Li Lingfeng
2024-11-25 17:32 ` Mark Liam Brown
@ 2024-11-28 7:22 ` Li Lingfeng
2024-12-02 16:05 ` Chuck Lever III
1 sibling, 1 reply; 6+ messages in thread
From: Li Lingfeng @ 2024-11-28 7:22 UTC (permalink / raw)
To: Dai.Ngo, Chuck Lever, Jeff Layton, NeilBrown, okorniev, tom,
trond.myklebust
Cc: linux-nfs, linux-kernel, Yu Kuai, Hou Tao, zhangyi (F), yangerkun,
chengzhihao1, Li Lingfeng
Besides nfsd_file_shrinker, the nfsd_client_shrinker added by commit
7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low memory
condition") in 2022 and the nfsd_reply_cache_shrinker added by commit
3ba75830ce17 ("nfsd4: drc containerization") in 2019 may also trigger such
an issue.
Was this scenario not considered when designing the shrinkers for NFSD, or
was it deemed unreasonable and not worth considering?
On 2024/11/25 19:17, Li Lingfeng wrote:
> Hi, we have found a hungtask issue recently.
>
> Commit 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low
> memory condition") adds a shrinker to NFSD, which causes NFSD to try to
> obtain shrinker_rwsem when starting and stopping services.
>
> Deploying both NFS client and server on the same machine may lead to the
> following issue, since they will share the global shrinker_rwsem.
>
> nfsd                                  nfs
>                                       drop_cache // hold shrinker_rwsem
>                                         write back, wait for rpc_task to exit
> // stop nfsd threads
> svc_set_num_threads
>   // clean up xprts
>   svc_xprt_destroy_all
>                                         rpc_check_timeout
>                                           rpc_check_connected
>                                           // wait for the connection to be disconnected
>   unregister_shrinker
>   // wait for shrinker_rwsem
>
> Normally, the client's rpc_task will exit after the server's nfsd thread
> has processed the request.
> When all the server's nfsd threads exit, the client's rpc_task is expected
> to detect the network connection being disconnected and exit.
> However, although the server has executed svc_xprt_destroy_all before
> waiting for shrinker_rwsem, the network connection is not actually
> disconnected. Instead, the operation to close the socket is simply added
> to the task_works queue.
>
> svc_xprt_destroy_all
>   ...
>   svc_sock_free
>     sockfd_put
>       fput_many
>         init_task_work // ____fput
>         task_work_add // add to task->task_works
>
> The actual disconnection of the network connection will only occur after
> the current process finishes.
> do_exit
>   exit_task_work
>     task_work_run
>       ...
>       ____fput // close sock
>
> Although it is not a common practice to deploy NFS client and server on
> the same machine, I think this issue still needs to be addressed,
> otherwise it will cause all processes trying to acquire the
> shrinker_rwsem
> to hang.
>
> I don't have any ideas yet on how to solve this problem, does anyone have
> any suggestions?
>
> Thanks.
>
^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [bug report] deploying both NFS client and server on the same machine triggers hung task
2024-11-28 7:22 ` Li Lingfeng
@ 2024-12-02 16:05 ` Chuck Lever III
2024-12-03 2:32 ` Li Lingfeng
0 siblings, 1 reply; 6+ messages in thread
From: Chuck Lever III @ 2024-12-02 16:05 UTC (permalink / raw)
To: Li Lingfeng
Cc: Dai Ngo, Jeff Layton, Neil Brown, Olga Kornievskaia, Tom Talpey,
Trond Myklebust, Linux NFS Mailing List,
Linux Kernel Mailing List, Yu Kuai, Hou Tao, zhangyi (F),
yangerkun, chengzhihao1@huawei.com, Li Lingfeng
> On Nov 28, 2024, at 2:22 AM, Li Lingfeng <lilingfeng3@huawei.com> wrote:
>
> Besides nfsd_file_shrinker, the nfsd_client_shrinker added by commit
> 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low memory
> condition") in 2022 and the nfsd_reply_cache_shrinker added by commit
> 3ba75830ce17 ("nfsd4: drc containerization") in 2019 may also trigger such
> an issue.
> Was this scenario not considered when designing the shrinkers for NFSD, or
> was it deemed unreasonable and not worth considering?
I'm speculating, but it is possible that the issue was
introduced by another patch in an area related to the
rwsem. Seems like there is a testing gap in this area.
Can you file a bugzilla report on bugzilla.kernel.org
under Filesystems/NFSD?
> On 2024/11/25 19:17, Li Lingfeng wrote:
>> Hi, we have found a hungtask issue recently.
>>
>> Commit 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low
>> memory condition") adds a shrinker to NFSD, which causes NFSD to try to
>> obtain shrinker_rwsem when starting and stopping services.
>>
>> Deploying both NFS client and server on the same machine may lead to the
>> following issue, since they will share the global shrinker_rwsem.
>>
>> nfsd                                  nfs
>>                                       drop_cache // hold shrinker_rwsem
>>                                         write back, wait for rpc_task to exit
>> // stop nfsd threads
>> svc_set_num_threads
>>   // clean up xprts
>>   svc_xprt_destroy_all
>>                                         rpc_check_timeout
>>                                           rpc_check_connected
>>                                           // wait for the connection to be disconnected
>>   unregister_shrinker
>>   // wait for shrinker_rwsem
>>
>> Normally, the client's rpc_task will exit after the server's nfsd thread
>> has processed the request.
>> When all the server's nfsd threads exit, the client's rpc_task is expected
>> to detect the network connection being disconnected and exit.
>> However, although the server has executed svc_xprt_destroy_all before
>> waiting for shrinker_rwsem, the network connection is not actually
>> disconnected. Instead, the operation to close the socket is simply added
>> to the task_works queue.
>>
>> svc_xprt_destroy_all
>>   ...
>>   svc_sock_free
>>     sockfd_put
>>       fput_many
>>         init_task_work // ____fput
>>         task_work_add // add to task->task_works
>>
>> The actual disconnection of the network connection will only occur after
>> the current process finishes.
>> do_exit
>>   exit_task_work
>>     task_work_run
>>       ...
>>       ____fput // close sock
>>
>> Although it is not a common practice to deploy NFS client and server on
>> the same machine, I think this issue still needs to be addressed,
>> otherwise it will cause all processes trying to acquire the shrinker_rwsem
>> to hang.
>>
>> I don't have any ideas yet on how to solve this problem, does anyone have
>> any suggestions?
>>
>> Thanks.
>>
--
Chuck Lever
^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [bug report] deploying both NFS client and server on the same machine triggers hung task
2024-12-02 16:05 ` Chuck Lever III
@ 2024-12-03 2:32 ` Li Lingfeng
0 siblings, 0 replies; 6+ messages in thread
From: Li Lingfeng @ 2024-12-03 2:32 UTC (permalink / raw)
To: Chuck Lever III
Cc: Dai Ngo, Jeff Layton, Neil Brown, Olga Kornievskaia, Tom Talpey,
Trond Myklebust, Linux NFS Mailing List,
Linux Kernel Mailing List, Yu Kuai, Hou Tao, zhangyi (F),
yangerkun, chengzhihao1@huawei.com, Li Lingfeng
On 2024/12/3 00:05, Chuck Lever III wrote:
>
>> On Nov 28, 2024, at 2:22 AM, Li Lingfeng <lilingfeng3@huawei.com> wrote:
>>
>> Besides nfsd_file_shrinker, the nfsd_client_shrinker added by commit
>> 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low memory
>> condition") in 2022 and the nfsd_reply_cache_shrinker added by commit
>> 3ba75830ce17 ("nfsd4: drc containerization") in 2019 may also trigger such
>> an issue.
>> Was this scenario not considered when designing the shrinkers for NFSD, or
>> was it deemed unreasonable and not worth considering?
> I'm speculating, but it is possible that the issue was
> introduced by another patch in an area related to the
> rwsem. Seems like there is a testing gap in this area.
>
> Can you file a bugzilla report on bugzilla.kernel.org
> under Filesystems/NFSD?
Hi Chuck,
I have uploaded the dmesg log and some information from the vmcore here:
https://bugzilla.kernel.org/show_bug.cgi?id=219550
Thanks,
Li
>
>> On 2024/11/25 19:17, Li Lingfeng wrote:
>>> Hi, we have found a hungtask issue recently.
>>>
>>> Commit 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low
>>> memory condition") adds a shrinker to NFSD, which causes NFSD to try to
>>> obtain shrinker_rwsem when starting and stopping services.
>>>
>>> Deploying both NFS client and server on the same machine may lead to the
>>> following issue, since they will share the global shrinker_rwsem.
>>>
>>> nfsd                                  nfs
>>>                                       drop_cache // hold shrinker_rwsem
>>>                                         write back, wait for rpc_task to exit
>>> // stop nfsd threads
>>> svc_set_num_threads
>>>   // clean up xprts
>>>   svc_xprt_destroy_all
>>>                                         rpc_check_timeout
>>>                                           rpc_check_connected
>>>                                           // wait for the connection to be disconnected
>>>   unregister_shrinker
>>>   // wait for shrinker_rwsem
>>>
>>> Normally, the client's rpc_task will exit after the server's nfsd thread
>>> has processed the request.
>>> When all the server's nfsd threads exit, the client's rpc_task is expected
>>> to detect the network connection being disconnected and exit.
>>> However, although the server has executed svc_xprt_destroy_all before
>>> waiting for shrinker_rwsem, the network connection is not actually
>>> disconnected. Instead, the operation to close the socket is simply added
>>> to the task_works queue.
>>>
>>> svc_xprt_destroy_all
>>>   ...
>>>   svc_sock_free
>>>     sockfd_put
>>>       fput_many
>>>         init_task_work // ____fput
>>>         task_work_add // add to task->task_works
>>>
>>> The actual disconnection of the network connection will only occur after
>>> the current process finishes.
>>> do_exit
>>>   exit_task_work
>>>     task_work_run
>>>       ...
>>>       ____fput // close sock
>>>
>>> Although it is not a common practice to deploy NFS client and server on
>>> the same machine, I think this issue still needs to be addressed,
>>> otherwise it will cause all processes trying to acquire the shrinker_rwsem
>>> to hang.
>>>
>>> I don't have any ideas yet on how to solve this problem, does anyone have
>>> any suggestions?
>>>
>>> Thanks.
>>>
> --
> Chuck Lever
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-12-03 2:32 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-25 11:17 [bug report] deploying both NFS client and server on the same machine triggers hung task Li Lingfeng
2024-11-25 17:32 ` Mark Liam Brown
2024-11-26 2:28 ` Li Lingfeng
2024-11-28 7:22 ` Li Lingfeng
2024-12-02 16:05 ` Chuck Lever III
2024-12-03 2:32 ` Li Lingfeng