Linux NFS development
From: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
To: linux-nfs <linux-nfs@vger.kernel.org>
Cc: trondmy <trondmy@kernel.org>, Olga Kornievskaia <aglo@umich.edu>
Subject: Re: Unexpected low pNFS IO performance with parallel workload
Date: Mon, 3 Mar 2025 18:31:45 +0100 (CET)	[thread overview]
Message-ID: <732824542.7754132.1741023105596.JavaMail.zimbra@desy.de> (raw)
In-Reply-To: <319477679.6763859.1740766422175.JavaMail.zimbra@desy.de>

I was able to reproduce the low throughput with the fio commands below, which read 200GB from multiple files.
The --offset=98% option is there just to read a small portion of each file, as our files are 33GB each. In 'case 1', data is read from a single
file and, on reaching EOF, fio switches to the next one. In 'case 2', all files are opened in advance and data is read round-robin
across all of them.

case 1: read files sequentially
fio --name test --opendir=/pnfs/data --rw=randread:8 --bssplit=4k/25:512k --offset=98% --io_size=200G --file_service_type=sequential


case 2: open all files and select round-robin from which to read
fio --name test --opendir=/pnfs/data --rw=randread:8 --bssplit=4k/25:512k --offset=98% --io_size=200G --file_service_type=roundrobin

Case 1 takes a couple of minutes (2-3).
Case 2 takes two (2) hours.
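To check whether the round-robin case also changes the on-the-wire READ size, the client's per-mount RPC statistics can be inspected while each fio run is active. A minimal sketch, assuming the usual field layout of the per-op "READ:" line in /proc/self/mountstats (ops, transmissions, major timeouts, bytes sent, bytes received, ...):

```shell
# Print the average on-the-wire READ size per NFS mount.
# Field layout of the per-op "READ:" line is assumed:
#   $2 = completed ops, $6 = bytes received.
avg_read_size() {
  f=${1:-/proc/self/mountstats}
  [ -r "$f" ] || return 0
  awk '
    /^device / { dev = $2 }          # remember which mount the stats belong to
    $1 == "READ:" && $2 > 0 {
      printf "%s avg READ %.0f bytes over %d ops\n", dev, $6 / $2, $2
    }
  ' "$f"
}
avg_read_size
```

Run once before and once after each fio case; a large gap between the averages would confirm that the round-robin pattern shrinks the READs.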

Tigran.


----- Original Message -----
> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> To: "linux-nfs" <linux-nfs@vger.kernel.org>
> Cc: "trondmy" <trondmy@kernel.org>, "Olga Kornievskaia" <aglo@umich.edu>
> Sent: Friday, 28 February, 2025 19:13:42
> Subject: Unexpected low pNFS IO performance with parallel workload

> Dear NFS fellows,
> 
> During HPC workloads we notice that the Linux NFSv4.2/pNFS client demonstrates
> unexpectedly low performance.
> The application opens 55 files in parallel and reads the data with multiple threads.
> The server issues flexfiles
> layouts with tightly coupled NFSv4.1 DSes.
> 
> Observations:
> 
> - despite the 1MB rsize/wsize returned by the layout, the client never issues READs bigger
> than 512k (often much smaller)
> - the client always uses slot 0 on the DS, and
> - reads happen sequentially, i.e. there is only one in-flight READ request
> - subsequent reads often just read the next 512k block
> - if, instead of the parallel application, a simple dd is run, then multiple slots
> are used and 1MB READs are sent
> 
> $ dd if=/pnfs/xxxx/00054.h5 of=/dev/null
> 45753381+1 records in
> 45753381+1 records out
> 23425731171 bytes (23 GB, 22 GiB) copied, 69.702 s, 336 MB/s
> 
> 
> The client has 80 cores on 2 sockets, 512GB of RAM and runs RHEL 9.4
> 
> $ uname -r
> 5.14.0-427.26.1.el9_4.x86_64
> 
> $ free -g
>               total        used        free      shared  buff/cache   available
> Mem:             503          84         392           0          29         419
> 
> $ lscpu | head
> Architecture:                       x86_64
> CPU op-mode(s):                     32-bit, 64-bit
> Address sizes:                      46 bits physical, 48 bits virtual
> Byte Order:                         Little Endian
> CPU(s):                             80
> On-line CPU(s) list:                0-79
> Vendor ID:                          GenuineIntel
> BIOS Vendor ID:                     Intel(R) Corporation
> Model name:                         Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz
> BIOS Model name:                    Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz
> 
> The client and all DSes are equipped with 10Gb/s NICs.
> 
> Any ideas where to look?
> 
> Best regards,
>    Tigran.
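A quick way to contrast the quoted single-stream dd with the parallel application's access pattern is to run several dd readers concurrently; a hedged sketch (paths and job count are hypothetical, this mimics a multi-threaded reader rather than fio itself):

```shell
# Run up to $2 concurrent dd readers over the files in $1.
# Purely illustrative: batches of background readers, not a fio replacement.
parallel_read() {
  dir=$1
  jobs=${2:-8}
  n=0
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    dd if="$f" of=/dev/null bs=1M 2>/dev/null &
    n=$((n + 1))
    if [ "$n" -ge "$jobs" ]; then
      wait        # let the current batch finish before starting more
      n=0
    fi
  done
  wait
}
```

With nfsstat or mountstats running alongside, this makes it easy to compare slot usage and READ sizes between one reader and many.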


Thread overview: 3+ messages
2025-02-28 18:13 Unexpected low pNFS IO performance with parallel workload Mkrtchyan, Tigran
2025-03-03 17:31 ` Mkrtchyan, Tigran [this message]
2025-03-25 11:33   ` Lionel Cons
