* Unexpected low pNFS IO performance with parallel workload
@ 2025-02-28 18:13 Mkrtchyan, Tigran
2025-03-03 17:31 ` Mkrtchyan, Tigran
0 siblings, 1 reply; 3+ messages in thread
From: Mkrtchyan, Tigran @ 2025-02-28 18:13 UTC (permalink / raw)
To: linux-nfs; +Cc: trondmy, Olga Kornievskaia
Dear NFS fellows,
During HPC workloads we notice that the Linux NFSv4.2/pNFS client demonstrates unexpectedly low performance.
The application opens 55 files in parallel and reads the data with multiple threads. The server issues a flexfile
layout with tightly coupled NFSv4.1 DSes.
Observations:
- despite the 1MB rsize/wsize returned by the layout, the client never issues reads bigger than 512k (often much smaller)
- the client always uses slot 0 on the DS, and
- reads happen sequentially, i.e. there is only one in-flight READ request
- subsequent reads often just read the next 512k block
- if a simple dd is run instead of the parallel application, then multiple slots are used and 1MB READs are sent
$ dd if=/pnfs/xxxx/00054.h5 of=/dev/null
45753381+1 records in
45753381+1 records out
23425731171 bytes (23 GB, 22 GiB) copied, 69.702 s, 336 MB/s
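As a quick sanity check on the dd figures above, the record counts are consistent with dd's default 512-byte block size (no bs= was given), so the 1MB READs seen on the wire were not the result of large application reads. This is only a sketch that redoes the arithmetic from the output shown; the variable names are made up for illustration.

```shell
# Numbers copied from the dd output above.
bytes=23425731171
full_records=45753381
elapsed_ms=69702

# dd uses a 512-byte block size when bs= is not specified.
bs=512

# Bytes covered by the full records, and the size of the "+1" partial record.
full_bytes=$((full_records * bs))
tail_bytes=$((bytes - full_bytes))

# Throughput in decimal MB/s (integer arithmetic, rounded down).
mb_per_s=$((bytes * 1000 / elapsed_ms / 1000000))

echo "full records cover $full_bytes bytes, partial record is $tail_bytes bytes"
echo "throughput is about $mb_per_s MB/s"
```

The result matches the 336 MB/s that dd reports.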
The client has 80 cores on 2 sockets, 512GB of RAM and runs RHEL 9.4
$ uname -r
5.14.0-427.26.1.el9_4.x86_64
$ free -g
               total        used        free      shared  buff/cache   available
Mem:             503          84         392           0          29         419
$ lscpu | head
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 80
On-line CPU(s) list: 0-79
Vendor ID: GenuineIntel
BIOS Vendor ID: Intel(R) Corporation
Model name: Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz
BIOS Model name: Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz
The client and all DSes are equipped with 10GB/s NICs.
Any ideas where to look?
Best regards,
Tigran.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Unexpected low pNFS IO performance with parallel workload
2025-02-28 18:13 Unexpected low pNFS IO performance with parallel workload Mkrtchyan, Tigran
@ 2025-03-03 17:31 ` Mkrtchyan, Tigran
2025-03-25 11:33 ` Lionel Cons
0 siblings, 1 reply; 3+ messages in thread
From: Mkrtchyan, Tigran @ 2025-03-03 17:31 UTC (permalink / raw)
To: linux-nfs; +Cc: trondmy, Olga Kornievskaia
I was able to reproduce low throughput with the fio command. The examples below read 200GB from multiple files.
The --offset=98% is there just to read a small portion of a file, as our files are 33GB each. In 'case 1', the data is read from a single
file, and when it reaches EOF, it switches to the next one. In 'case 2', all files are opened in advance, and data is read round-robin through
all files.
case 1: read files sequentially
fio --name test --opendir=/pnfs/data --rw=randread:8 --bssplit=4k/25:512k --offset=98% --io_size=200G --file_service_type=sequential
case 2: open all files and select round-robin from which to read
fio --name test --opendir=/pnfs/data --rw=randread:8 --bssplit=4k/25:512k --offset=98% --io_size=200G --file_service_type=roundrobin
Case 1 takes a couple of minutes (2-3).
Case 2 takes two hours.
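To put the two timings side by side, here is a rough conversion into average throughput. It is a sketch only: it assumes fio's default 1024-based units (so 200G is taken as 200 GiB) and uses the approximate durations quoted above; the variable names are made up for illustration.

```shell
# Total data requested via --io_size=200G, in MiB (assumes 1024-based "G").
total_mib=$((200 * 1024))

# Case 1: roughly 2 to 3 minutes.
case1_fast=$((total_mib / 120))
case1_slow=$((total_mib / 180))

# Case 2: roughly 2 hours.
case2=$((total_mib / 7200))

# Slowdown of case 2 relative to case 1.
slowdown_min=$((7200 / 180))
slowdown_max=$((7200 / 120))

echo "case 1: about $case1_slow to $case1_fast MiB/s"
echo "case 2: about $case2 MiB/s"
echo "case 2 is about ${slowdown_min}x to ${slowdown_max}x slower"
```

Under these assumptions, the only difference between the two runs is --file_service_type, yet the round-robin case comes out 40 to 60 times slower.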
Tigran.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Unexpected low pNFS IO performance with parallel workload
2025-03-03 17:31 ` Mkrtchyan, Tigran
@ 2025-03-25 11:33 ` Lionel Cons
0 siblings, 0 replies; 3+ messages in thread
From: Lionel Cons @ 2025-03-25 11:33 UTC (permalink / raw)
To: linux-nfs
Has there been any progress, or has a solution been found?
Lionel
^ permalink raw reply [flat|nested] 3+ messages in thread