linux-fsdevel.vger.kernel.org archive mirror
* [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
@ 2025-03-24 10:40 Nicolas Baranger
  2025-03-27 11:15 ` Nicolas Baranger
  0 siblings, 1 reply; 16+ messages in thread
From: Nicolas Baranger @ 2025-03-24 10:40 UTC (permalink / raw)
  To: Christoph Hellwig, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel
  Cc: Steve French, Jeff Layton, Christian Brauner

Hi Christoph, David

Sorry, my last mail didn't arrive at the top of the list, so I am resending
it with a new title.

I don't know if this has already been reported, but after building Linux
6.14-rc1 I observe the following behaviour:

The 'cat' command goes into a loop when I cat a file that resides on a CIFS
share.

The 'cp' command does the same: it copies the content of a file on the CIFS
share and loops writing it to the destination.
I tested with a file named 'toto' containing only the ASCII string
'toto'.

When I copied it from the CIFS share to a local filesystem, I had to
CTRL+C the copy of this 5-byte file after some time because the
destination file was using all the free space on the filesystem and
contained billions of 'toto' lines.

Here is an example with cat:

CIFS SHARE is mounted as /mnt/fbx/FBX-24T

CIFS mount options:
grep cifs /proc/mounts
//10.0.10.100/FBX24T /mnt/fbx/FBX-24T cifs rw,nosuid,nodev,noexec,relatime,vers=3.1.1,cache=none,upcall_target=app,username=fbx,domain=HOMELAN,uid=0,noforceuid,gid=0,noforcegid,addr=10.0.10.100,file_mode=0666,dir_mode=0755,iocharset=utf8,soft,nounix,serverino,mapposix,mfsymlinks,reparse=nfs,nativesocket,symlink=mfsymlinks,rsize=65536,wsize=65536,bsize=16777216,retrans=1,echo_interval=60,actimeo=1,closetimeo=1 0 0
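
For readers who want to compare these options against their own mount, here is a minimal Python sketch that splits one /proc/mounts line into its fields and an options dictionary. The function name parse_mount_line is hypothetical (it is not part of any tool mentioned in this thread); the input is the mount line quoted above.

```python
def parse_mount_line(line: str) -> dict:
    """Split one /proc/mounts line into its six fields and an options dict."""
    # /proc/mounts escapes embedded spaces (as \040), so a plain split() is enough.
    device, mountpoint, fstype, options, _dump, _passno = line.split()
    opts = {}
    for item in options.split(","):
        key, sep, value = item.partition("=")
        # Flags without "=" (rw, soft, nounix, ...) are stored as True.
        opts[key] = value if sep else True
    return {"device": device, "mountpoint": mountpoint,
            "fstype": fstype, "options": opts}


# The mount line from this report, joined onto one logical line.
MOUNT_LINE = (
    "//10.0.10.100/FBX24T /mnt/fbx/FBX-24T cifs "
    "rw,nosuid,nodev,noexec,relatime,vers=3.1.1,cache=none,upcall_target=app,"
    "username=fbx,domain=HOMELAN,uid=0,noforceuid,gid=0,noforcegid,"
    "addr=10.0.10.100,file_mode=0666,dir_mode=0755,iocharset=utf8,soft,nounix,"
    "serverino,mapposix,mfsymlinks,reparse=nfs,nativesocket,symlink=mfsymlinks,"
    "rsize=65536,wsize=65536,bsize=16777216,retrans=1,echo_interval=60,"
    "actimeo=1,closetimeo=1 0 0"
)

if __name__ == "__main__":
    info = parse_mount_line(MOUNT_LINE)
    # Print only the options that relate to caching and I/O sizes.
    for name in ("vers", "cache", "rsize", "wsize", "bsize"):
        print(name, "=", info["options"][name])
```

The values are kept as strings on purpose, so the sketch does not guess at which options are numeric.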

KERNEL: uname -a
Linux 14RV-SERVER.14rv.lan 6.14.0.1-ast-rc2-amd64 #0 SMP PREEMPT_DYNAMIC Wed Feb 12 18:23:00 CET 2025 x86_64 GNU/Linux


To reproduce:
echo toto >/mnt/fbx/FBX-24T/toto

ls -l /mnt/fbx/FBX-24T/toto
-rw-rw-rw- 1 root root 5 20 mars  09:20 /mnt/fbx/FBX-24T/toto

cat /mnt/fbx/FBX-24T/toto
toto
toto
toto
toto
toto
toto
toto
^C
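
A 5-byte file should produce 5 bytes and then end-of-file. The following is a minimal sketch (not taken from this thread) of a read loop that stops as soon as it has received more bytes than fstat reports, so a suspect file can be checked without filling the terminal or a disk. bounded_cat is a hypothetical helper name, and the demo runs against a local temporary file rather than a CIFS share.

```python
import os
import tempfile


def bounded_cat(path, chunk=1 << 20):
    """Read a file like cat does, but give up once more bytes arrive than st_size.

    Returns (bytes_read, reached_eof). reached_eof is False when the loop was
    aborted because read() kept returning data past the reported file size.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        total = 0
        while True:
            data = os.read(fd, chunk)
            if not data:
                # read() returned 0 bytes: the normal end-of-file condition.
                return total, True
            total += len(data)
            if total > size:
                # More data than the file is supposed to hold: stop reading.
                return total, False
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Demo on a local file with the same content as the 'toto' test file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"toto\n")
    print(bounded_cat(tmp.name))
    os.unlink(tmp.name)
```

Note that a file which legitimately grows while it is being read would also trip the size check, so a False result is only a hint that something is wrong.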

strace cat /mnt/fbx/FBX-24T/toto
execve("/usr/bin/cat", ["cat", "/mnt/fbx/FBX-24T/toto"], 0x7ffc39b41848 
/* 19 vars */) = 0
brk(NULL)                               = 0x55755b1c1000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x7f55f95d6000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (Aucun fichier ou 
dossier de ce type)
openat(AT_FDCWD, "glibc-hwcaps/x86-64-v3/libc.so.6", O_RDONLY|O_CLOEXEC) 
= -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "glibc-hwcaps/x86-64-v2/libc.so.6", O_RDONLY|O_CLOEXEC) 
= -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "tls/haswell/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = 
-1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "tls/haswell/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 
ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "tls/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT 
(Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun 
fichier ou dossier de ce type)
openat(AT_FDCWD, "haswell/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 
ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "haswell/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT 
(Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT 
(Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun 
fichier ou dossier de ce type)
openat(AT_FDCWD, 
"/usr/local/cuda-12.6/lib64/glibc-hwcaps/x86-64-v3/libc.so.6", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun fichier ou dossier de ce type)
newfstatat(AT_FDCWD, 
"/usr/local/cuda-12.6/lib64/glibc-hwcaps/x86-64-v3", 0x7fff25937800, 0) 
= -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, 
"/usr/local/cuda-12.6/lib64/glibc-hwcaps/x86-64-v2/libc.so.6", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun fichier ou dossier de ce type)
newfstatat(AT_FDCWD, 
"/usr/local/cuda-12.6/lib64/glibc-hwcaps/x86-64-v2", 0x7fff25937800, 0) 
= -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, 
"/usr/local/cuda-12.6/lib64/tls/haswell/x86_64/libc.so.6", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun fichier ou dossier de ce type)
newfstatat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/tls/haswell/x86_64", 
0x7fff25937800, 0) = -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/tls/haswell/libc.so.6", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun fichier ou dossier de ce type)
newfstatat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/tls/haswell", 
0x7fff25937800, 0) = -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/tls/x86_64/libc.so.6", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun fichier ou dossier de ce type)
newfstatat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/tls/x86_64", 
0x7fff25937800, 0) = -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/tls/libc.so.6", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun fichier ou dossier de ce type)
newfstatat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/tls", 0x7fff25937800, 
0) = -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/haswell/x86_64/libc.so.6", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun fichier ou dossier de ce type)
newfstatat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/haswell/x86_64", 
0x7fff25937800, 0) = -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/haswell/libc.so.6", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun fichier ou dossier de ce type)
newfstatat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/haswell", 
0x7fff25937800, 0) = -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/x86_64/libc.so.6", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun fichier ou dossier de ce type)
newfstatat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/x86_64", 
0x7fff25937800, 0) = -1 ENOENT (Aucun fichier ou dossier de ce type)
openat(AT_FDCWD, "/usr/local/cuda-12.6/lib64/libc.so.6", 
O_RDONLY|O_CLOEXEC) = -1 ENOENT (Aucun fichier ou dossier de ce type)
newfstatat(AT_FDCWD, "/usr/local/cuda-12.6/lib64", 
{st_mode=S_IFDIR|S_ISGID|0755, st_size=4570, ...}, 0) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=148466, ...}, 
AT_EMPTY_PATH) = 0
mmap(NULL, 148466, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f55f95b1000
close(3)                                = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) 
= 3
read(3, 
"\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20t\2\0\0\0\0\0"..., 
832) = 832
pread64(3, 
"\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 
64) = 784
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=1922136, ...}, 
AT_EMPTY_PATH) = 0
pread64(3, 
"\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 
64) = 784
mmap(NULL, 1970000, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x7f55f93d0000
mmap(0x7f55f93f6000, 1396736, PROT_READ|PROT_EXEC, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x26000) = 0x7f55f93f6000
mmap(0x7f55f954b000, 339968, PROT_READ, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17b000) = 0x7f55f954b000
mmap(0x7f55f959e000, 24576, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ce000) = 0x7f55f959e000
mmap(0x7f55f95a4000, 53072, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f55f95a4000
close(3)                                = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 
0) = 0x7f55f93cd000
arch_prctl(ARCH_SET_FS, 0x7f55f93cd740) = 0
set_tid_address(0x7f55f93cda10)         = 38427
set_robust_list(0x7f55f93cda20, 24)     = 0
rseq(0x7f55f93ce060, 0x20, 0, 0x53053053) = 0
mprotect(0x7f55f959e000, 16384, PROT_READ) = 0
mprotect(0x55754475e000, 4096, PROT_READ) = 0
mprotect(0x7f55f960e000, 8192, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, 
rlim_max=RLIM64_INFINITY}) = 0
munmap(0x7f55f95b1000, 148466)          = 0
getrandom("\x19\x6b\x9e\x55\x7e\x09\x74\x5f", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x55755b1c1000
brk(0x55755b1e2000)                     = 0x55755b1e2000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 
3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=3048928, ...}, 
AT_EMPTY_PATH) = 0
mmap(NULL, 3048928, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f55f9000000
close(3)                                = 0
newfstatat(1, "", {st_mode=S_IFCHR|0600, st_rdev=makedev(0x88, 0), ...}, 
AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/mnt/fbx/FBX-24T/toto", O_RDONLY) = 3
newfstatat(3, "", {st_mode=S_IFREG|0666, st_size=5, ...}, AT_EMPTY_PATH) 
= 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
mmap(NULL, 16785408, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 
-1, 0) = 0x7f55f7ffe000
read(3, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16777216) = 16711680
write(1, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16711680toto
) = 16711680
read(3, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16777216) = 16711680
write(1, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16711680toto
) = 16711680
read(3, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16777216) = 16711680
write(1, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16711680toto
) = 16711680
read(3, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16777216) = 16711680
write(1, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16711680toto
) = 16711680
read(3, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16777216) = 16711680
write(1, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16711680toto
) = 16711680
read(3, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16777216) = 16711680
write(1, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16711680toto
) = 16711680
read(3, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16777216) = 16711680
write(1, 
"toto\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
16711680toto
^Cstrace: Process 38427 detached
  <detached ...>
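
The numbers in the trace and in the mount options can be related with simple arithmetic. The sketch below only restates values quoted in this message; whether the relationship matters for the root cause is not established at this point in the thread. The constant and function names are chosen for illustration.

```python
RSIZE = 65536            # rsize= in the mount options
BSIZE = 16777216         # bsize= in the mount options
READ_REQUEST = 16777216  # byte count cat passed to each read()
READ_RETURN = 16711680   # value every read() in the trace returned
FILE_SIZE = 5            # st_size reported by newfstatat() for 'toto'


def summarize():
    """Relate the read() sizes from the strace to the mount options."""
    return {
        "request_equals_bsize": READ_REQUEST == BSIZE,
        "shortfall": READ_REQUEST - READ_RETURN,
        "rsize_multiples": divmod(READ_RETURN, RSIZE),
        "excess_over_file": READ_RETURN - FILE_SIZE,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

Each read() comes back exactly one rsize (65536 bytes) short of the 16 MiB request, for a file that holds 5 bytes. Because no read() in the trace ever returns 0, cat never sees end-of-file and keeps looping.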


Please let me know if this has already been fixed or reported, and whether
you are able to reproduce the issue.

Thanks for your help

Kind regards
Nicolas Baranger


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-03-24 10:40 [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share Nicolas Baranger
@ 2025-03-27 11:15 ` Nicolas Baranger
  2025-03-28 10:45   ` Christoph Hellwig
  0 siblings, 1 reply; 16+ messages in thread
From: Nicolas Baranger @ 2025-03-27 11:15 UTC (permalink / raw)
  To: hch, Christoph Hellwig, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel
  Cc: Steve French, Jeff Layton, Christian Brauner

Dear maintainers

I got no answer from the linux-cifs and netfs lists, so I am sending this
new mail.
Today I downloaded and built the mainline kernel: Linux 6.14.0 (not an
-rc).

I still observe the bug described in my previous mail.
What can I do to be able to use the CIFS share?

Thanks for your help,

Kind regards
Nicolas Baranger


* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-03-27 11:15 ` Nicolas Baranger
@ 2025-03-28 10:45   ` Christoph Hellwig
  2025-04-04  8:50     ` Nicolas Baranger
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2025-03-28 10:45 UTC (permalink / raw)
  To: Nicolas Baranger
  Cc: hch, Christoph Hellwig, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

Hi Nicolas,

please wait a bit, many file system developers were at a conference
this week.



* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-03-28 10:45   ` Christoph Hellwig
@ 2025-04-04  8:50     ` Nicolas Baranger
  2025-04-04 13:54       ` Paulo Alcantara
  0 siblings, 1 reply; 16+ messages in thread
From: Nicolas Baranger @ 2025-04-04  8:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: hch, David Howells, netfs, linux-cifs, linux-fsdevel,
	linux-kernel, Steve French, Jeff Layton, Christian Brauner

Hi Christoph

Thanks for your answer and help.
Has anyone reproduced the issue (it is very easy)?


CIFS SHARE is mounted as /mnt/fbx/FBX-24T
echo toto >/mnt/fbx/FBX-24T/toto

ls -l /mnt/fbx/FBX-24T/toto
-rw-rw-rw- 1 root root 5 20 mars  09:20 /mnt/fbx/FBX-24T/toto

cat /mnt/fbx/FBX-24T/toto
toto
toto
toto
toto
toto
toto
toto
^C


CIFS mount options:
grep cifs /proc/mounts
//10.0.10.100/FBX24T /mnt/fbx/FBX-24T cifs rw,nosuid,nodev,noexec,relatime,vers=3.1.1,cache=none,upcall_target=app,username=fbx,domain=HOMELAN,uid=0,noforceuid,gid=0,noforcegid,addr=10.0.10.100,file_mode=0666,dir_mode=0755,iocharset=utf8,soft,nounix,serverino,mapposix,mfsymlinks,reparse=nfs,nativesocket,symlink=mfsymlinks,rsize=65536,wsize=65536,bsize=16777216,retrans=1,echo_interval=60,actimeo=1,closetimeo=1 0 0

KERNEL: uname -a
Linux 14RV-SERVER.14rv.lan 6.14.0-rc2-amd64 #0 SMP PREEMPT_DYNAMIC Wed Feb 12 18:23:00 CET 2025 x86_64 GNU/Linux


Kind regards
Nicolas Baranger


On 2025-03-28 11:45, Christoph Hellwig wrote:

> Hi Nicolas,
> 
> please wait a bit, many file system developers were at a conference
> this week.


* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-04  8:50     ` Nicolas Baranger
@ 2025-04-04 13:54       ` Paulo Alcantara
  2025-04-10  8:43         ` Nicolas Baranger
  0 siblings, 1 reply; 16+ messages in thread
From: Paulo Alcantara @ 2025-04-04 13:54 UTC (permalink / raw)
  To: Nicolas Baranger, Christoph Hellwig
  Cc: hch, David Howells, netfs, linux-cifs, linux-fsdevel,
	linux-kernel, Steve French, Jeff Layton, Christian Brauner

Hi Nicolas,

I'll look into it as soon as I recover from my illness.  Sorry for the delay.



* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-04 13:54       ` Paulo Alcantara
@ 2025-04-10  8:43         ` Nicolas Baranger
  2025-04-15 18:28           ` Paulo Alcantara
  0 siblings, 1 reply; 16+ messages in thread
From: Nicolas Baranger @ 2025-04-10  8:43 UTC (permalink / raw)
  To: Paulo Alcantara
  Cc: Christoph Hellwig, hch, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

Hi Paulo

Thanks for your answer and help

> I'll look into it as soon as I recover from my illness.
I hope you're doing better.

I had to roll back to Linux 6.13.8 to be able to use the SMB share, and
here is what I observe
(I don't know if this is normal behaviour, but if it is, SMB seems to be a
very, very inefficient protocol).

I think the issue may be buffer related:
On Linux 6.13.8, copying and cat'ing the 5-byte 'toto' file containing
only the ASCII string 'toto' works fine, but here is what I captured with
tcpdump during the transfer of the toto file:
https://xba.soartist.net/t6.pcap
131 TCP packets to transfer a 5-byte file...
Isn't that a problem?
Opening the pcap file with Wireshark shows a lot of lines like:
25	0.005576	10.0.10.100	10.0.10.25	SMB2	1071	Read Response, Error: STATUS_END_OF_FILE
These lines seem to appear after the 5-byte 'toto' file has been
transferred, and they continue until the last ACK is received.
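
To quantify how many such responses a capture contains, one option is to export the packet list as tab-separated text and tally the Info column for SMB2 rows. The sketch below assumes Wireshark's default column order (No., Time, Source, Destination, Protocol, Length, Info), as in the line quoted above. count_smb2_info is a hypothetical helper, and the second sample row is made up purely for illustration; it is not taken from the capture.

```python
from collections import Counter


def count_smb2_info(lines):
    """Tally the Info column of SMB2 rows in a tab-separated packet list."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # not a full packet row (header, blank line, ...)
        protocol, info = fields[4], fields[6]
        if protocol == "SMB2":
            counts[info] += 1
    return counts


SAMPLE = [
    # Row quoted in this thread.
    "25\t0.005576\t10.0.10.100\t10.0.10.25\tSMB2\t1071\t"
    "Read Response, Error: STATUS_END_OF_FILE",
    # Hypothetical non-SMB2 row, included only to show that it is skipped.
    "26\t0.005600\t10.0.10.25\t10.0.10.100\tTCP\t66\t[ACK]",
]

if __name__ == "__main__":
    for info, n in count_smb2_info(SAMPLE).most_common():
        print(n, info)
```

Run against a full export, the count for "Read Response, Error: STATUS_END_OF_FILE" gives a concrete figure for how many reads were issued past the end of the file.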

I will try to reboot into Linux 6.14.0 mainline to see whether I get the
same behaviour and what the packet capture shows
(the system is in production, so I cannot reboot into a failing kernel
whenever I want; it has to be scheduled... sorry).

Let me know if you have reproduced the issue.

Kind regards
Nicolas Baranger





* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-10  8:43         ` Nicolas Baranger
@ 2025-04-15 18:28           ` Paulo Alcantara
  2025-04-17 10:10             ` Nicolas Baranger
  0 siblings, 1 reply; 16+ messages in thread
From: Paulo Alcantara @ 2025-04-15 18:28 UTC (permalink / raw)
  To: Nicolas Baranger
  Cc: Christoph Hellwig, hch, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

Hi Nicolas,

Sorry for the delay; I got busy with some downstream work.

Nicolas Baranger <nicolas.baranger@3xo.fr> writes:

>> I'll look into it as soon as I recover from my illness.
> Hope you're doing better

I'm fully recovered now, thanks :-)

> I had to roll back to Linux 6.13.8 to be able to use the SMB share, and
> here is what I observe
> (I don't know if this is normal behaviour, but if it is, SMB seems to be a
> very, very inefficient protocol).
>
> I think the issue may be buffer related:
> On Linux 6.13.8, copying and cat'ing the 5-byte 'toto' file containing
> only the ASCII string 'toto' works fine, but here is what I captured with
> tcpdump during the transfer of the toto file:
> https://xba.soartist.net/t6.pcap
> 131 TCP packets to transfer a 5-byte file...
> Isn't that a problem?
> Opening the pcap file with Wireshark shows a lot of lines like:
> 25	0.005576	10.0.10.100	10.0.10.25	SMB2	1071	Read Response, Error: STATUS_END_OF_FILE
> These lines seem to appear after the 5-byte 'toto' file has been
> transferred, and they continue until the last ACK is received.

Thanks for the trace.  I was finally able to reproduce your issue and
will provide you with a fix soon.


* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-15 18:28           ` Paulo Alcantara
@ 2025-04-17 10:10             ` Nicolas Baranger
  2025-04-21 23:45               ` Paulo Alcantara
  0 siblings, 1 reply; 16+ messages in thread
From: Nicolas Baranger @ 2025-04-17 10:10 UTC (permalink / raw)
  To: Paulo Alcantara
  Cc: Christoph Hellwig, hch, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

Hi Paulo

Resending this mail with content-type: text, sorry !

Thanks again for answer and help, it's good to hear you're back to 
health.

> Thanks for the trace.  I was finally able to reproduce your issue and 
> will provide you with a fix soon.

Perfect... And thanks !

If you need more traces or details on either (or both?) of these issues:

- 1) the infinite loop during 'cat' or 'cp' since Linux 6.14.0

- 2) (I don't know if it's related) the very high number of small TCP 
packets (more than a hundred) exchanged in the SMB transaction to 
transfer a 5-byte file under Linux 6.13.8

do not hesitate to ask; I would be happy to help.

Kind regards
Nicolas





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-17 10:10             ` Nicolas Baranger
@ 2025-04-21 23:45               ` Paulo Alcantara
  2025-04-23 16:28                 ` Nicolas Baranger
  0 siblings, 1 reply; 16+ messages in thread
From: Paulo Alcantara @ 2025-04-21 23:45 UTC (permalink / raw)
  To: Nicolas Baranger
  Cc: Christoph Hellwig, hch, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

Nicolas Baranger <nicolas.baranger@3xo.fr> writes:

> If you need more traces or details on either (or both?) of these issues:
>
> - 1) the infinite loop during 'cat' or 'cp' since Linux 6.14.0
>
> - 2) (I don't know if it's related) the very high number of small TCP
> packets (more than a hundred) exchanged in the SMB transaction to
> transfer a 5-byte file under Linux 6.13.8

According to your mount options and network traces, cat(1) is attempting
to read 16M from 'toto' file, in which case netfslib will create 256
subrequests to handle 64K (rsize=65536) reads from 'toto' file.

The first 64K read at offset 0 succeeds and server returns 5 bytes, the
client then sets NETFS_SREQ_HIT_EOF to indicate that this subrequest hit
the EOF.  The next subrequests will still be processed by netfslib and
sent to the server, but they all fail with STATUS_END_OF_FILE.

So, the problem is that short DIO reads are not being handled correctly
in netfslib.  It returns a fixed number of bytes read to every read(2)
call in your cat command: 16711680 bytes, which is the offset of the
last subrequest.  This makes cat(1) retry forever, as netfslib never
returns the correct number of bytes read and never reports EOF.
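
The figures above can be checked with a little shell arithmetic.  This
is only a sketch: the helper names are hypothetical (they are not part
of netfslib), and it assumes the request is split evenly into
rsize-sized subrequests as described.

```shell
#!/bin/sh
# Illustrative only: helper names are hypothetical, not netfslib APIs.

# Number of rsize-sized subrequests needed to cover one read request.
# $1: request size in bytes, $2: rsize in bytes
subreq_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# File offset at which the last subrequest starts.
# $1: request size in bytes, $2: rsize in bytes
last_subreq_offset() {
    n=$(subreq_count "$1" "$2")
    echo $(( (n - 1) * $2 ))
}

subreq_count 16777216 65536         # prints 256
last_subreq_offset 16777216 65536   # prints 16711680
```

A reader such as cat(1) only stops when read(2) returns 0, so a read
that keeps reporting 16711680 bytes for a 5-byte file never ends.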

While testing a potential fix, I also found other problems with DIO in
cifs.ko, so I'm working with Dave to get the proper fixes for both
netfslib and cifs.ko.

I've noticed that you disabled caching with 'cache=none', is there any
particular reason for that?

Have you also set rsize, wsize and bsize mount options?  If so, why?

If you want to keep 'cache=none', then a possible workaround for you
would be making rsize and wsize always greater than bsize.  The default
values (rsize=4194304,wsize=4194304,bsize=1048576) would do it.
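
Whether a given set of mount options satisfies this workaround can be
checked with a small script.  A minimal sketch, assuming the options
are passed as a comma-separated string; the helper names are
hypothetical:

```shell
#!/bin/sh
# Illustrative only: helper names are hypothetical.

# Print the value of option $2 from a comma-separated option string $1.
mount_opt() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n "s/^$2=//p"
}

# Succeed only if both rsize and wsize are greater than bsize.
# Fails (with a shell error message) if any of the three is missing.
workaround_ok() {
    r=$(mount_opt "$1" rsize)
    w=$(mount_opt "$1" wsize)
    b=$(mount_opt "$1" bsize)
    [ "$r" -gt "$b" ] && [ "$w" -gt "$b" ]
}

for opts in \
    "cache=none,rsize=4194304,wsize=4194304,bsize=1048576" \
    "cache=none,rsize=65536,wsize=65536,bsize=16777216"
do
    if workaround_ok "$opts"; then
        echo "ok: $opts"
    else
        echo "rsize/wsize not greater than bsize: $opts"
    fi
done
```

On a machine with a single CIFS mount, the option string could be taken
from the fourth field of the matching line in /proc/mounts.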

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-21 23:45               ` Paulo Alcantara
@ 2025-04-23 16:28                 ` Nicolas Baranger
  2025-04-24  7:40                   ` Nicolas Baranger
  0 siblings, 1 reply; 16+ messages in thread
From: Nicolas Baranger @ 2025-04-23 16:28 UTC (permalink / raw)
  To: Paulo Alcantara
  Cc: Christoph Hellwig, hch, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

Hi Paulo

Thanks for your answer and for all the explanations and help.

I'm happy you found those 2 bugs and have started to patch them.
Reading your answer, I'd like to point out that I already found a bug 
in cifs DIO starting with Linux 6.10 (when cifs started to use netfs 
for its I/O), and it was fixed by David and Christoph.
Full story here: 
https://lore.kernel.org/all/14271ed82a5be7fcc5ceea5f68a10bbd@manguebit.com/T/

> I've noticed that you disabled caching with 'cache=none', is there any
> particular reason for that?


Yes, it's related to the use case described in that earlier bug 
report:
To back up servers, I have KSMBD cifs shares holding 4TB+ sparse files 
(back-files) which are LUKS + BTRFS formatted.
The share is mounted on the servers; each server attaches its own 
back-file as a block device and writes its backup inside this encrypted 
disk file.
For performance reasons, the disk files must use 4KB blocks and be 
attached on the servers using the losetup DIO option (plus the 4K 
block size options).
When I use anything other than 'cache=none', the BTRFS filesystem on 
the back-file sometimes gets corrupted, and I also need to mount the 
BTRFS filesystem with 'space_cache=v2' to avoid filesystem corruption.

> Have you also set rsize, wsize and bsize mount options?  If so, why?

After a lot of testing, the mount buffer values rsize=65536, 
wsize=65536, bsize=16777216 are the ones which provide the best 
performance with no corruption of the back-file filesystem; with 
these options a ~2TB backup is possible in a few hours, during the 
~1 AM to ~5 AM window each night.

For me it's important that kernel async DIO on netfs continues to work, 
as it's used by my whole production backup system (the transfer speed 
ratio with versus without DIO is between 10 and 25).

I will try the patch "[PATCH] netfs: Fix setting of transferred bytes 
with short DIO reads", thanks

Let me know if you need further explanations,

Kind regards
Nicolas Baranger



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-23 16:28                 ` Nicolas Baranger
@ 2025-04-24  7:40                   ` Nicolas Baranger
  2025-04-24  8:39                     ` Nicolas Baranger
  2025-04-24 13:58                     ` Steve French
  0 siblings, 2 replies; 16+ messages in thread
From: Nicolas Baranger @ 2025-04-24  7:40 UTC (permalink / raw)
  To: Paulo Alcantara
  Cc: Christoph Hellwig, hch, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

Hi Paulo

Thanks again for your help.

I'm sorry, I made a mistake in my answer yesterday:

> After a lot of testing, the mounts buffers values: rsize=65536, 
> wsize=65536, bsize=16777216,...

The actual values in /etc/fstab are:
rsize=4194304,wsize=4194304,bsize=16777216

But the negotiated values in /proc/mounts are:
rsize=65536,wsize=65536,bsize=16777216

And I don't know if it's related, but I have:
grep -i maxbuf /proc/fs/cifs/DebugData
CIFSMaxBufSize: 16384

I've just forced a manual 'mount -o remount' and now /proc/mounts shows 
the correct values (SMB version is 3.1.1).
Where does this behavior come from?

After some searching, it appears that when the CIFS share is mounted 
via the systemd x-systemd.automount option (for example by doing 'ls' 
in the mount point directory), the negotiated values are:
rsize=65536,wsize=65536,bsize=16777216
If I umount / remount manually, the negotiated values are those defined 
in /etc/fstab!

I don't know if it's normal behavior, but it is a source of errors and 
mistakes, and it makes troubleshooting performance issues harder.
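
One way to spot this kind of mismatch is to compare the size options 
requested in /etc/fstab with those reported in /proc/mounts. A minimal 
sketch, with hypothetical helper names, working on option strings 
passed as arguments so that it can be tried without a CIFS mount:

```shell
#!/bin/sh
# Illustrative only: helper names are hypothetical.

# Print the value of option $2 from a comma-separated option string $1.
mount_opt() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n "s/^$2=//p"
}

# Report each size option whose mounted value differs from the fstab one.
# $1: options from /etc/fstab, $2: options from /proc/mounts
diff_sizes() {
    for opt in rsize wsize bsize; do
        want=$(mount_opt "$1" "$opt")
        have=$(mount_opt "$2" "$opt")
        [ "$want" = "$have" ] || echo "$opt: fstab=$want mounted=$have"
    done
}

diff_sizes "rsize=4194304,wsize=4194304,bsize=16777216" \
           "rsize=65536,wsize=65536,bsize=16777216"
```

With the values from this mail, it reports rsize and wsize as 
differing and stays silent about bsize.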

Kind regards
Nicolas




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-24  7:40                   ` Nicolas Baranger
@ 2025-04-24  8:39                     ` Nicolas Baranger
  2025-04-24 14:25                       ` Paulo Alcantara
  2025-04-24 13:58                     ` Steve French
  1 sibling, 1 reply; 16+ messages in thread
From: Nicolas Baranger @ 2025-04-24  8:39 UTC (permalink / raw)
  To: Paulo Alcantara
  Cc: Christoph Hellwig, hch, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

[Resending mail in plain text version, sorry !]

Hi Paulo

Thanks again for your help, and sorry for yet another mail, but I think 
it could be relevant.

In fact, I think something is wrong:

After a remount, I successfully get the correct buffer size values in 
/proc/mounts (those defined in /etc/fstab).

grep cifs /proc/mounts
//10.0.10.100/FBX24T /mnt/fbx/FBX-24T cifs 
rw,nosuid,nodev,noexec,relatime,vers=3.1.1,cache=none,upcall_target=app,username=*****,domain=*****,uid=0,noforceuid,gid=0,noforcegid,addr=10.0.10.100,file_mode=0666,dir_mode=0755,iocharset=utf8,soft,nounix,serverino,mapposix,mfsymlinks,reparse=nfs,rsize=4194304,wsize=4194304,bsize=16777216,retrans=1,echo_interval=60,actimeo=1,closetimeo=1 
0 0

uname -r
6.13.8.1-ast-nba0-amd64



But here is what I observe: a 'dd' with a block size of 65536 or less 
works fine:
LANG=en_US.UTF-8

dd if=/dev/urandom of=/mnt/fbx/FBX-24T/dd.test3 bs=65536 status=progress conv=notrunc oflag=direct count=128
128+0 records in
128+0 records out
8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.100398 s, 83.6 MB/s



But a 'dd' with a block size larger than 65536 does not work:
LANG=en_US.UTF-8

dd if=/dev/urandom of=/mnt/fbx/FBX-24T/dd.test3 bs=65537 status=progress conv=notrunc oflag=direct count=128
dd: error writing '/mnt/fbx/FBX-24T/dd.test3'
dd: closing output file '/mnt/fbx/FBX-24T/dd.test3': Invalid argument

And the kernel reports:
Apr 24 10:01:37 14RV-SERVER.14rv.lan kernel: CIFS: VFS: \\10.0.10.100 Error -32 sending data on socket to server



If I let the systemd x-systemd.automount option mount the share, 
/proc/mounts shows rsize=65536,wsize=65536 and I'm able to send data 
whatever block size is used in the transfer.
Example:

grep cifs /proc/mounts
//10.0.10.100/FBX24T /mnt/fbx/FBX-24T cifs 
rw,nosuid,nodev,noexec,relatime,vers=3.1.1,cache=none,upcall_target=app,username=*****,domain=*****,uid=0,noforceuid,gid=0,noforcegid,addr=10.0.10.100,file_mode=0666,dir_mode=0755,iocharset=utf8,soft,nounix,serverino,mapposix,mfsymlinks,reparse=nfs,rsize=65536,wsize=65536,bsize=16777216,retrans=1,echo_interval=60,actimeo=1,closetimeo=1 
0 0

dd if=/dev/urandom of=/mnt/fbx/FBX-24T/dd.test3 bs=64M status=progress conv=notrunc oflag=direct count=128
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 42 s, 203 MB/s
128+0 records in
128+0 records out
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 42.2399 s, 203 MB/s


To conclude: if I force an fstab value larger than 65536 to be taken 
into account (visible in /proc/mounts), the transfer fails unless I 
write in blocks of at most 65536 bytes; if I let systemd configure 
rsize and wsize at 65536, I can write in blocks of any size, including 
much larger ones (1024 times larger in the example).


Let me know if you need further testing

Kind regards
Nicolas





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-24  7:40                   ` Nicolas Baranger
  2025-04-24  8:39                     ` Nicolas Baranger
@ 2025-04-24 13:58                     ` Steve French
  1 sibling, 0 replies; 16+ messages in thread
From: Steve French @ 2025-04-24 13:58 UTC (permalink / raw)
  To: Nicolas Baranger
  Cc: Paulo Alcantara, Christoph Hellwig, hch, David Howells, netfs,
	linux-cifs, linux-fsdevel, linux-kernel, Jeff Layton,
	Christian Brauner

> when the CIFS share is mounted by
> systemd option x-systemd.automount (for example doing 'ls' in the mount
> point directory), negotiated values are:
> rsize=65536,wsize=65536,bsize=16777216
> If I umount / remount manually, the negotiated values are those defined
> in /etc/fstab !

That does seem broken (and obviously can hurt performance
significantly, as most servers negotiate an rsize and wsize of at least
4MB).

It looks like it can be overridden by creating a file in
/etc/systemd/system to configure the smb3 systemd automounts, but it is
odd that it overrides the default that would be used for normal mounts
or /etc/fstab automounts.




-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-24  8:39                     ` Nicolas Baranger
@ 2025-04-24 14:25                       ` Paulo Alcantara
  2025-05-06 22:53                         ` Paulo Alcantara
  0 siblings, 1 reply; 16+ messages in thread
From: Paulo Alcantara @ 2025-04-24 14:25 UTC (permalink / raw)
  To: Nicolas Baranger
  Cc: Christoph Hellwig, hch, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

Hi Nicolas,

Thanks for the very detailed information and testing.

Nicolas Baranger <nicolas.baranger@3xo.fr> writes:

> In fact, I think something is wrong:
>
> After a remount, I successfully get the correct buffer size values in 
> /proc/mounts (those defined in /etc/fstab).
>
> grep cifs /proc/mounts
> //10.0.10.100/FBX24T /mnt/fbx/FBX-24T cifs 
> rw,nosuid,nodev,noexec,relatime,vers=3.1.1,cache=none,upcall_target=app,username=*****,domain=*****,uid=0,noforceuid,gid=0,noforcegid,addr=10.0.10.100,file_mode=0666,dir_mode=0755,iocharset=utf8,soft,nounix,serverino,mapposix,mfsymlinks,reparse=nfs,rsize=4194304,wsize=4194304,bsize=16777216,retrans=1,echo_interval=60,actimeo=1,closetimeo=1 
> 0 0

Interesting.  When you do 'mount -o remount ...' but don't pass rsize=
and wsize=, the client is supposed to reuse the existing values of
rsize and wsize set in the current superblock.  The above values of
rsize, wsize and bsize are also the default ones in case you don't pass
them at all.

I'll look into that when time allows it.

> But here is what I observe: a 'dd' with a block size of 65536 or less 
> works fine:
> LANG=en_US.UTF-8
>
> dd if=/dev/urandom of=/mnt/fbx/FBX-24T/dd.test3 bs=65536 status=progress conv=notrunc oflag=direct count=128
> 128+0 records in
> 128+0 records out
> 8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.100398 s, 83.6 MB/s
>
>
>
> But a 'dd' with a block size larger than 65536 does not work:
> LANG=en_US.UTF-8
>
> dd if=/dev/urandom of=/mnt/fbx/FBX-24T/dd.test3 bs=65537 status=progress conv=notrunc oflag=direct count=128
> dd: error writing '/mnt/fbx/FBX-24T/dd.test3'
> dd: closing output file '/mnt/fbx/FBX-24T/dd.test3': Invalid argument
>
> And the kernel reports:
> Apr 24 10:01:37 14RV-SERVER.14rv.lan kernel: CIFS: VFS: \\10.0.10.100 Error -32 sending data on socket to server

This seems related to unaligned DIO reads and writes.  With O_DIRECT,
the client will set FILE_NO_INTERMEDIATE_BUFFERING when opening the
file, telling the server to not do any buffering when reading from or
writing to the file.  Some servers will fail the read or write request
if the file offset or length isn't a multiple of block size, where the
block size is >= 512 && <= PAGE_SIZE, as specified in MS-FSA 2.1.5.[34].

Since you're passing bs= with a value that is not a multiple of the
block size, the server is failing the request with
STATUS_INVALID_PARAMETER as specified in MS-FSA.
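
The alignment rule above can be sketched as a small check.  This is
only an illustration: the 512-byte block size is an assumption (the
value a given server actually uses may differ, anywhere from 512 up to
PAGE_SIZE), and the helper names are hypothetical.

```python
# Illustrative check of the alignment rule; BLOCK_SIZE is an assumed value.
BLOCK_SIZE = 512


def is_aligned(value, block_size=BLOCK_SIZE):
    """True if value is a multiple of block_size."""
    return value % block_size == 0


def request_ok(offset, length, block_size=BLOCK_SIZE):
    """True if both the file offset and the length are block aligned."""
    return is_aligned(offset, block_size) and is_aligned(length, block_size)


for bs in (65536, 65537, 65536 + 512):
    print(bs, request_ok(0, bs))
# 65536 True
# 65537 False
# 66048 True
```

With these assumptions, a 4096-byte request at offset 1 is rejected as
well, since the offset itself is not block aligned.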

I've tested it against Windows Server 2022 and it seems to enforce the
alignment only for DIO reads, while Samba doesn't enforce it at all.

win2k22:

$ dd if=/mnt/1/foo of=/dev/null status=none iflag=direct count=128 bs=65536
$ dd if=/mnt/1/foo of=/dev/null status=none iflag=direct count=128 bs=65537
dd: error reading '/mnt/1/foo': Invalid argument
$ dd if=/mnt/1/foo of=/dev/null status=none iflag=direct count=128 bs=$((65536+512))

$ xfs_io -d -f -c "pread 0 4096" /mnt/1/foo
read 4096/4096 bytes at offset 0
4 KiB, 1 ops; 0.0009 sec (4.260 MiB/sec and 1090.5125 ops/sec)
$ xfs_io -d -f -c "pread 1 4096" /mnt/1/foo
pread: Invalid argument

samba:

$ dd if=/mnt/1/foo of=/dev/null status=none iflag=direct count=128 bs=65536
$ dd if=/mnt/1/foo of=/dev/null status=none iflag=direct count=128 bs=65537
$ dd if=/mnt/1/foo of=/dev/null status=none iflag=direct count=128 bs=$((65536+512))

$ xfs_io -d -f -c "pread 0 4096" /mnt/1/foo
read 4096/4096 bytes at offset 0
4 KiB, 1 ops; 0.0071 sec (557.880 KiB/sec and 139.4700 ops/sec)
$ xfs_io -d -f -c "pread 1 4096" /mnt/1/foo
read 4096/4096 bytes at offset 1
4 KiB, 1 ops; 0.0010 sec (3.864 MiB/sec and 989.1197 ops/sec)

Note that the netfslib fix only addresses short DIO reads.  This bug is
a separate one, related to unaligned DIO reads and writes, and needs to
be fixed in the client.  I'll let you know when I have patches for that.
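
One way a client could keep individual request sizes aligned is to
round the maximum I/O size down to a multiple of the block size before
splitting a transfer.  The sketch below only illustrates that idea and
is not taken from any patch; the function name, the 512-byte block size
and the assumption that the starting offset is already aligned are all
hypothetical.

```python
# Illustration only: split a transfer into block-aligned requests.
def aligned_chunks(offset, length, max_io, block_size=512):
    """Yield (offset, size) pairs covering [offset, offset + length).

    Every chunk except possibly the last has a size that is a multiple
    of block_size.  The starting offset is assumed to be aligned.
    """
    step = (max_io // block_size) * block_size  # round max_io down
    if step == 0:
        raise ValueError("max_io is smaller than one block")
    end = offset + length
    while offset < end:
        size = min(step, end - offset)
        yield offset, size
        offset += size


# A maximum I/O size of 65537 is rounded down to 65536 per request.
for off, size in aligned_chunks(0, 2 * 65537, max_io=65537):
    print(off, size)
```

Any unaligned tail (2 bytes in this example) still ends up in the last
chunk, so a server that enforces alignment would need that case handled
separately.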


* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-04-24 14:25                       ` Paulo Alcantara
@ 2025-05-06 22:53                         ` Paulo Alcantara
  2025-05-07 15:58                           ` Nicolas Baranger
  0 siblings, 1 reply; 16+ messages in thread
From: Paulo Alcantara @ 2025-05-06 22:53 UTC (permalink / raw)
  To: Nicolas Baranger
  Cc: Christoph Hellwig, hch, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

Hi Nicolas,

Could you try my cifs.dio branch [1], which contains the following fixes?

	afea8b581c75 ("netfs: Fix wait/wake to be consistent about the waitqueue used")
	ae9f3deaa17a ("netfs: Fix the request's work item to not require a ref")
	b2a47dc3ead6 ("netfs: Fix setting of transferred bytes with short DIO reads")
	c59f7c9661b9 ("smb: client: ensure aligned IO sizes")

Let me know if you find any issues with it.  Thanks.

[1] https://git.manguebit.com/linux.git


* Re: [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share
  2025-05-06 22:53                         ` Paulo Alcantara
@ 2025-05-07 15:58                           ` Nicolas Baranger
  0 siblings, 0 replies; 16+ messages in thread
From: Nicolas Baranger @ 2025-05-07 15:58 UTC (permalink / raw)
  To: Paulo Alcantara
  Cc: Christoph Hellwig, hch, David Howells, netfs, linux-cifs,
	linux-fsdevel, linux-kernel, Steve French, Jeff Layton,
	Christian Brauner

Hi Paulo

I'm testing this branch and will get back to you with the results.

Thanks again!
Nicolas

On 2025-05-07 00:53, Paulo Alcantara wrote:

> Hi Nicolas,
> 
> Could you try my cifs.dio branch [1] which contains the following fixes
> 
> afea8b581c75 ("netfs: Fix wait/wake to be consistent about the 
> waitqueue used")
> ae9f3deaa17a ("netfs: Fix the request's work item to not require a 
> ref")
> b2a47dc3ead6 ("netfs: Fix setting of transferred bytes with short DIO 
> reads")
> c59f7c9661b9 ("smb: client: ensure aligned IO sizes")
> 
> Let me know if you find any issues with it.  Thanks.
> 
> [1] https://git.manguebit.com/linux.git


end of thread, other threads:[~2025-05-07 16:05 UTC | newest]

Thread overview: 16+ messages
2025-03-24 10:40 [netfs/cifs - Linux 6.14] loop on file cat + file copy when files are on CIFS share Nicolas Baranger
2025-03-27 11:15 ` Nicolas Baranger
2025-03-28 10:45   ` Christoph Hellwig
2025-04-04  8:50     ` Nicolas Baranger
2025-04-04 13:54       ` Paulo Alcantara
2025-04-10  8:43         ` Nicolas Baranger
2025-04-15 18:28           ` Paulo Alcantara
2025-04-17 10:10             ` Nicolas Baranger
2025-04-21 23:45               ` Paulo Alcantara
2025-04-23 16:28                 ` Nicolas Baranger
2025-04-24  7:40                   ` Nicolas Baranger
2025-04-24  8:39                     ` Nicolas Baranger
2025-04-24 14:25                       ` Paulo Alcantara
2025-05-06 22:53                         ` Paulo Alcantara
2025-05-07 15:58                           ` Nicolas Baranger
2025-04-24 13:58                     ` Steve French
