From: Mateusz Guzik <mguzik@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Yann Droneaud <ydroneaud@opteya.com>,
Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] fs: use a sequence counter instead of file_lock in fd_install
Date: Fri, 17 Apr 2015 00:00:03 +0200
Message-ID: <20150416220002.GB20615@mguzik>
In-Reply-To: <1429217739.7346.218.camel@edumazet-glaptop2.roam.corp.google.com>
On Thu, Apr 16, 2015 at 01:55:39PM -0700, Eric Dumazet wrote:
> On Thu, 2015-04-16 at 13:42 -0700, Eric Dumazet wrote:
> > On Thu, 2015-04-16 at 19:09 +0100, Al Viro wrote:
> > > On Thu, Apr 16, 2015 at 02:16:31PM +0200, Mateusz Guzik wrote:
> > > > @@ -165,8 +165,10 @@ static int expand_fdtable(struct files_struct *files, int nr)
> > > >      cur_fdt = files_fdtable(files);
> > > >      if (nr >= cur_fdt->max_fds) {
> > > >          /* Continue as planned */
> > > > +        write_seqcount_begin(&files->fdt_seqcount);
> > > >          copy_fdtable(new_fdt, cur_fdt);
> > > >          rcu_assign_pointer(files->fdt, new_fdt);
> > > > +        write_seqcount_end(&files->fdt_seqcount);
> > > >          if (cur_fdt != &files->fdtab)
> > > >              call_rcu(&cur_fdt->rcu, free_fdtable_rcu);
> > >
> > > Interesting. AFAICS, your test doesn't step anywhere near that path,
> > > does it? So basically you never hit the retries during that...
> >
> > Right, but then the table is almost never changed for a given process,
> > as we only increase it by power of two steps.
> >
> > (So I scratch my initial comment, fdt_seqcount is really mostly read)
>
> I tested Mateusz patch with my opensock program, mimicking a bit more
> what a server does (having lot of sockets)
>
> 24 threads running, doing close(randomfd())/socket() calls like crazy.
>
> Before patch :
>
> # time ./opensock
>
> real 0m10.863s
> user 0m0.954s
> sys 2m43.659s
>
>
> After patch :
>
> # time ./opensock
>
> real 0m9.750s
> user 0m0.804s
> sys 2m18.034s
>
> So this is an improvement for sure, but not massive.
>
> perf record ./opensock ; report
>
> 87.80% opensock [kernel.kallsyms] [k] _raw_spin_lock
> |--52.70%-- __close_fd
> |--46.41%-- __alloc_fd

My crap benchmark is here: http://people.redhat.com/~mguzik/pipebench.c
(compile with -pthread; run with -s 10 -n 16 for a 10-second test with 16
threads).

As noted earlier, it tends to go from roughly 300k ops/s to 400k.

The fundamental problem here seems to be the pesky POSIX requirement of
returning the lowest available fd on each allocation. (As a side note,
Linux breaks this with parallel fd allocs when one of them backs off
its reservation; not that I believe this causes trouble.)
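
To make the requirement concrete, here is a minimal user-space sketch of
the lowest-fd rule in a single-threaded process. The helper name
lowest_fd_is_reused() is made up for illustration, and using /dev/null
as the file to open is an assumption; any openable file would do.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open three descriptors, close the middle one,
 * then open again. Returns 1 if the new descriptor reuses the number
 * that was just freed, 0 if not, -1 if any open() failed.
 */
static int lowest_fd_is_reused(void)
{
    int a = open("/dev/null", O_RDONLY);
    int b = open("/dev/null", O_RDONLY);
    int c = open("/dev/null", O_RDONLY);
    int d, reused;

    if (a < 0 || b < 0 || c < 0)
        return -1;

    /* Every fd below b is in use, so b becomes the lowest free slot. */
    close(b);

    d = open("/dev/null", O_RDONLY);
    if (d < 0)
        return -1;
    reused = (d == b);

    close(a);
    close(c);
    close(d);
    return reused;
}
```

With no other thread touching the fd table, the helper should return 1.
That guarantee is what forces every allocation to scan for the lowest
free slot under a shared lock.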

Ideally a process-wide switch could be implemented (e.g.
prctl(SCRATCH_LOWEST_FD_REQ)) which would grant the kernel the freedom
to return any fd it wants, so it would be possible to have fd ranges
per thread and the like.

Having only an O_SCRATCH_POSIX flag passed to syscalls would still leave
close() as a bottleneck.

In the meantime I consider the approach taken in my patch an OK
temporary improvement.
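
For anyone not familiar with the primitive, the general sequence-counter
pattern can be sketched in user space roughly as below. This is a generic
illustration using C11 atomics, not the kernel's seqcount_t and not the
code from the patch. The names seq_pair, seq_pair_init(),
seq_pair_write() and seq_pair_read() are made up, and writers are assumed
to be serialized by some outer lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical structure: two values guarded by a sequence counter. */
struct seq_pair {
    atomic_uint seq;   /* even: stable, odd: a write is in progress */
    atomic_int a;
    atomic_int b;
};

static void seq_pair_init(struct seq_pair *p)
{
    atomic_init(&p->seq, 0);
    atomic_init(&p->a, 0);
    atomic_init(&p->b, 0);
}

/* Writer side; callers must already exclude other writers. */
static void seq_pair_write(struct seq_pair *p, int a, int b)
{
    atomic_fetch_add(&p->seq, 1);   /* counter becomes odd */
    atomic_store(&p->a, a);
    atomic_store(&p->b, b);
    atomic_fetch_add(&p->seq, 1);   /* counter becomes even again */
}

/* Reader side; takes no lock and retries if it raced with a writer. */
static void seq_pair_read(struct seq_pair *p, int *a, int *b)
{
    unsigned int start, end;

    do {
        start = atomic_load(&p->seq);
        *a = atomic_load(&p->a);
        *b = atomic_load(&p->b);
        end = atomic_load(&p->seq);
    } while ((start & 1) || start != end);
}
```

The sketch uses the default sequentially consistent ordering to keep it
simple; a tuned implementation would use weaker barriers. The point is
that readers pay only for two counter loads when no writer is active,
which fits a table that is resized rarely.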
--
Mateusz Guzik