From: Mateusz Guzik <mguzik@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Yann Droneaud <ydroneaud@opteya.com>,
Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] fs: use a sequence counter instead of file_lock in fd_install
Date: Fri, 17 Apr 2015 00:00:03 +0200
Message-ID: <20150416220002.GB20615@mguzik>
In-Reply-To: <1429217739.7346.218.camel@edumazet-glaptop2.roam.corp.google.com>
On Thu, Apr 16, 2015 at 01:55:39PM -0700, Eric Dumazet wrote:
> On Thu, 2015-04-16 at 13:42 -0700, Eric Dumazet wrote:
> > On Thu, 2015-04-16 at 19:09 +0100, Al Viro wrote:
> > > On Thu, Apr 16, 2015 at 02:16:31PM +0200, Mateusz Guzik wrote:
> > > > @@ -165,8 +165,10 @@ static int expand_fdtable(struct files_struct *files, int nr)
> > > >          cur_fdt = files_fdtable(files);
> > > >          if (nr >= cur_fdt->max_fds) {
> > > >                  /* Continue as planned */
> > > > +                write_seqcount_begin(&files->fdt_seqcount);
> > > >                  copy_fdtable(new_fdt, cur_fdt);
> > > >                  rcu_assign_pointer(files->fdt, new_fdt);
> > > > +                write_seqcount_end(&files->fdt_seqcount);
> > > >                  if (cur_fdt != &files->fdtab)
> > > >                          call_rcu(&cur_fdt->rcu, free_fdtable_rcu);
> > >
> > > Interesting. AFAICS, your test doesn't step anywhere near that path,
> > > does it? So basically you never hit the retries during that...
> >
> > Right, but then the table is almost never changed for a given process,
> > as we only increase it by power of two steps.
> >
> > (So I scratch my initial comment, fdt_seqcount is really mostly read)
>
> I tested Mateusz patch with my opensock program, mimicking a bit more
> what a server does (having lot of sockets)
>
> 24 threads running, doing close(randomfd())/socket() calls like crazy.
>
> Before patch :
>
> # time ./opensock
>
> real 0m10.863s
> user 0m0.954s
> sys 2m43.659s
>
>
> After patch :
>
> # time ./opensock
>
> real 0m9.750s
> user 0m0.804s
> sys 2m18.034s
>
> So this is an improvement for sure, but not massive.
>
> perf record ./opensock ; report
>
> 87.80% opensock [kernel.kallsyms] [k] _raw_spin_lock
> |--52.70%-- __close_fd
> |--46.41%-- __alloc_fd
My crap benchmark is here: http://people.redhat.com/~mguzik/pipebench.c
(compile with -pthread, run with -s 10 -n 16 for 10 second test + 16
threads)
As noted earlier, it tends to go from roughly 300k ops/s to 400k.
The fundamental problem here seems to be this pesky POSIX requirement of
providing the lowest possible fd on each allocation (as a side note
Linux breaks this with parallel fd allocs, where one of these backs off
the reservation, not that I believe this causes trouble).
Ideally a process-wide switch could be implemented (e.g.
prctl(SCRATCH_LOWEST_FD_REQ)) which would grant the kernel the freedom
to return any fd it wants, so it would be possible to have fd ranges
per thread and the like.
Having only an O_SCRATCH_POSIX flag passed to syscalls would still leave
close() as a bottleneck.
In the meantime I consider the approach taken in my patch as an ok
temporary improvement.
--
Mateusz Guzik