From: "J. Bruce Fields" <bfields@fieldses.org>
To: Daire Byrne <daire@dneg.com>
Cc: linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: parallel file create rates (+high latency)
Date: Mon, 24 Jan 2022 15:50:45 -0500
Message-ID: <20220124205045.GB4975@fieldses.org>
In-Reply-To: <CAPt2mGOCn5OaeZm24+zh92qRcWTF8h-H2WXqScz9RMfo4r_-Qw@mail.gmail.com>
On Mon, Jan 24, 2022 at 08:10:07PM +0000, Daire Byrne wrote:
> On Mon, 24 Jan 2022 at 19:38, J. Bruce Fields <bfields@fieldses.org> wrote:
> >
> > On Sun, Jan 23, 2022 at 11:53:08PM +0000, Daire Byrne wrote:
> > > I've been experimenting a bit more with high latency NFSv4.2 (200ms).
> > > I've noticed a difference between the file creation rates when you
> > > have parallel processes running against a single client mount creating
> > > files in multiple directories compared to in one shared directory.
> >
> > The Linux VFS requires an exclusive lock on the directory while you're
> > creating a file.
>
> Right. So when I mounted the same server/dir multiple times using
> namespaces, all I was really doing was making the VFS *think* I wanted
> locks on different directories even though the remote server directory
> was actually the same?

In that scenario the client-side locks are probably all different, but
they'd all have to wait for the same lock on the server side, yes.

> > So, if L is the time in seconds required to create a single file, you're
> > never going to be able to create more than 1/L files per second, because
> > there's no parallelism.
>
> And things like directory delegations can't help with this kind of
> workload? You can't batch directory locks or file creates I guess.

Alas, there are directory delegations specified in RFC 8881, but they
are read-only, and nobody's implemented them.

Directory write delegations could help a lot, if they existed.

> > So, it's not surprising you'd get a higher rate when creating in
> > multiple directories.
> >
> > Also, that lock's taken on both client and server. So it makes sense
> > that you might get a little more parallelism from multiple clients.
> >
> > So the usual advice is just to try to get that latency number as low as
> > possible, by using a low-latency network and storage that can commit
> > very quickly. (An NFS server isn't permitted to reply to the RPC
> > creating the new file until the new file actually hits stable storage.)
> >
> > Are you really seeing 200ms in production?
>
> Yea, it's just a (crazy) test for now. This is the latency between two
> of our offices. Running batch jobs over this kind of latency with a
> NFS re-export server doing all the caching works surprisingly well.
>
> It's just these file creations that are the deal breaker. A batch job
> might create 100,000+ files in a single directory across many clients.
>
> Maybe many containerised re-export servers in round-robin with a
> common cache is the only way to get more directory locks and file
> creates in flight at the same time.

ssh into the original server and create the files there?

I've got no help, sorry.

The client-side locking does seem redundant to some degree, but I don't
know what to do about it.

--b.
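
To put rough numbers on the 1/L bound discussed above, here is a minimal
sketch. It assumes L is exactly one 200ms round trip (a real create may
take more than one round trip, plus the server's commit time), and it
treats creates in different directories as fully independent. The
function names and the `dirs` parameter are hypothetical, chosen only
for this illustration.

```python
def max_create_rate(latency_s: float, dirs: int = 1) -> float:
    """Upper bound on creates per second.

    Each create holds its directory's exclusive lock for latency_s
    seconds, so one directory sustains at most 1/latency_s creates per
    second. `dirs` is the number of distinct server-side directories
    being written to in parallel (an idealised assumption: no other
    bottleneck is modelled).
    """
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    if dirs < 1:
        raise ValueError("dirs must be at least 1")
    return dirs / latency_s


def min_seconds_for(n_files: int, latency_s: float, dirs: int = 1) -> float:
    """Lower bound on the wall-clock time needed to create n_files."""
    return n_files / max_create_rate(latency_s, dirs)


if __name__ == "__main__":
    latency = 0.2       # 200ms, the figure from the thread
    n_files = 100_000   # the batch size mentioned in the thread

    for dirs in (1, 10):
        rate = max_create_rate(latency, dirs)
        hours = min_seconds_for(n_files, latency, dirs) / 3600
        print(f"{dirs:2d} dir(s): at most {rate:.0f} creates/s, "
              f"at least {hours:.2f} hours for {n_files} files")
```

Under these assumptions a single shared directory tops out at 5 creates
per second, so 100,000 files need at least 20,000 seconds (about five and
a half hours). Mounting the same server directory several times does not
change that figure, since every create still waits on the one
server-side lock.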