public inbox for linux-kernel@vger.kernel.org
* PROBLEM: NFS client IO fails with ERESTARTSYS when another mount point with the same export is unmounted with force [NFS] [SUNRPC]
@ 2024-02-21  8:20 Zhitao Li
From: Zhitao Li @ 2024-02-21  8:20 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker, Chuck Lever, Jeff Layton,
	Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, linux-kernel, Ping Huang

Hi, everyone,

- Facts:
I have a remote NFS export mounted on two different directories on
the same client, with identical mount options. While an IO is in
flight under one mount point, I force-unmount the other one. The
in-flight IO then fails with "Unknown error 512", which is
ERESTARTSYS.

OS: Linux kernel v6.7.0
NFS mount options: vers=4.1
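
To illustrate why the failure shows up as "Unknown error 512": the
value 512 is defined only in the kernel's internal errno header, so
user-space strerror() has no name for it (glibc prints "Unknown error
512" for it). Below is a minimal sketch of how a test program could
flag the leaked value explicitly. The macro KERNEL_ERESTARTSYS and the
helper describe_write_error() are hypothetical names made up for this
example, not part of any library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Assumption: mirrors the kernel-internal ERESTARTSYS value (512).
 * User-space <errno.h> does not define it. */
#define KERNEL_ERESTARTSYS 512

/* Hypothetical helper: format a readable message for a failed write().
 * Writes into buf and returns buf. */
static const char *describe_write_error(int err, char *buf, size_t len)
{
    if (err == KERNEL_ERESTARTSYS)
        snprintf(buf, len,
                 "errno %d: kernel-internal ERESTARTSYS leaked to user space",
                 err);
    else
        snprintf(buf, len, "errno %d: %s", err, strerror(err));
    return buf;
}
```

A caller would pass the errno saved right after the failing write(),
for example describe_write_error(errno, buf, sizeof(buf)).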


- My speculation:
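When the same export is mounted on different directories with the same
options, the superblock and the SUNRPC client (struct rpc_clnt) are
shared between the mounts. A forced unmount calls
rpc_killall_tasks(), which kills every rpc_task on that shared client
with ERESTARTSYS, including tasks that belong to the other mount.
No signal is involved in this case, so the signal-handling code that
normally turns ERESTARTSYS into EINTR or a syscall restart never runs,
and the raw value is returned to user mode.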
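
The reasoning above can be sketched as a small user-space model. This
is not kernel code. It is a simplified illustration, under my own
assumptions, of how the exit-to-user path treats ERESTARTSYS: the
value is only rewritten when signal work is pending. All names here
(model_syscall_exit, struct exit_result, the three flags) are
hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumption: mirrors the kernel-internal ERESTARTSYS value (512). */
#define KERNEL_ERESTARTSYS 512

enum exit_action { RETURN_TO_USER, RESTART_SYSCALL };

struct exit_result {
    enum exit_action action;
    long user_visible_ret;   /* meaningful only for RETURN_TO_USER */
};

/* Simplified model of what happens to a syscall return value on the
 * way back to user mode.
 *   signal_pending: signal work is queued for the task
 *   handler_runs:   a user-space signal handler will be invoked
 *   sa_restart:     that handler was installed with SA_RESTART */
static struct exit_result model_syscall_exit(long ret, bool signal_pending,
                                             bool handler_runs,
                                             bool sa_restart)
{
    struct exit_result r = { RETURN_TO_USER, ret };

    if (ret != -KERNEL_ERESTARTSYS)
        return r;                    /* ordinary result, passed through */

    if (!signal_pending)
        return r;                    /* nothing rewrites -512: it leaks */

    if (!handler_runs || sa_restart) {
        r.action = RESTART_SYSCALL;  /* syscall is transparently re-issued */
        r.user_visible_ret = 0;      /* unused in this case */
        return r;
    }

    r.user_visible_ret = -EINTR;     /* handler without SA_RESTART */
    return r;
}
```

In this model, the forced-unmount case corresponds to
model_syscall_exit(-512, false, false, false): no signal is pending,
so -512 is returned to the caller unchanged.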

I think two things are unexpected here:
1. The in-flight IO should not fail when I unmount the other
directory, even though the two directories share the same export.
2. ERESTARTSYS should never be visible in user space. EIO may be a
better choice.


- Reproduction:
1. Prepare an NFS export served by either nfsd or nfs-ganesha. For
example, the export is "ip:/export_path".
2. On Linux kernel v6.7.0 (the latest stable mainline release at the
time of writing), mount the export on two different directories with
the same options:
      mount -t nfs -o vers=4.1 ip:/export_path  /mnt/test1
      mount -t nfs -o vers=4.1 ip:/export_path  /mnt/test2
3. Start an inflight IO in "/mnt/test1":
      dd if=/dev/urandom of=/mnt/test1/1G bs=1M count=1024 oflag=direct
4. Force-unmount "/mnt/test2" while the IO from step 3 is still running:
      umount -f /mnt/test2
5. The "dd" from step 3 then fails with the following output:
       # dd if=/dev/urandom of=/mnt/test1/1G bs=1M count=1024 oflag=direct
       dd: error writing '/mnt/test1/1G': Unknown error 512
       214+0 records in
       213+0 records out
       223346688 bytes (223 MB, 213 MiB) copied, 7.87017 s, 28.4 MB/s


- Helpful links
1. v6.7.0 rpc_killall_tasks():
https://elixir.bootlin.com/linux/v6.7/source/net/sunrpc/clnt.c#L869
2. Commit "SUNRPC: Fix up task signalling" (v5.2-rc1) changed the
error code set on rpc_tasks by rpc_killall_tasks() from EIO to
ERESTARTSYS:
https://github.com/torvalds/linux/commit/ae67bd3821bb0a54d97e7883d211196637d487a9?diff=split&w=0


Looking forward to your early reply :)

Best regards,
Zhitao Li, in SmartX.


Thread overview: 11+ messages
2024-02-21  8:20 PROBLEM: NFS client IO fails with ERESTARTSYS when another mount point with the same export is unmounted with force [NFS] [SUNRPC] Zhitao Li
2024-02-21 13:48 ` Trond Myklebust
2024-02-22  3:05   ` Zhitao Li
2024-02-22 11:05   ` Jeff Layton
2024-02-22 15:20     ` Trond Myklebust
2024-02-23  3:44       ` Zhitao Li
2024-02-23 10:31       ` Jeff Layton
2024-02-27  2:35         ` Zhitao Li
2024-02-23  3:20     ` Zhitao Li
2024-02-27 23:55       ` NeilBrown
2024-02-28  3:37         ` Zhitao Li
