From: Jan Kara <jack@suse.cz>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>,
linux-ext4@vger.kernel.org, Eryu Guan <eguan@redhat.com>,
stable@vger.kernel.org
Subject: Re: [PATCH 1/4] ext4: Fix deadlock during page writeback
Date: Wed, 6 Jul 2016 09:51:16 +0200
Message-ID: <20160706075116.GB14067@quack2.suse.cz>
In-Reply-To: <20160705033824.GD15193@thunk.org>

On Mon 04-07-16 23:38:24, Ted Tso wrote:
> On Mon, Jul 04, 2016 at 05:51:07PM +0200, Jan Kara wrote:
> > On Mon 04-07-16 10:14:35, Ted Tso wrote:
> > > This is what I'm currently testing; do you have objections to this?
> >
> > Meh, I don't like it but it should work... Did you see any improvement with
> > your change, or are you just operating on the assumption that you want as
> > little code as possible while the handle is running?
>
> I haven't had a chance to try to benchmark it yet. I've been working at
> home over the long (US) holiday weekend, and the high core-count
> servers I need are on the internal work network, and it's a pain to
> access them from home.
>
> I've just been tired of seeing the sort of analysis that can be found
> in papers like:
>
> https://www.usenix.org/system/files/conference/fast14/fast14-paper_kang.pdf

So the biggest gap shown in this paper is for buffered writes, where I suspect
ext4 suffers because it starts a handle for each write. There is some
low-hanging fruit though - we just need to start it when we may be updating
i_size. I'll try to look into this when I have time to set up a proper
benchmark.
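
To put a rough number on that idea, here is a toy sketch in Python (not ext4
code). It assumes, as a simplification, that the only reason a buffered write
needs a handle is that it may move i_size, and it ignores everything else a
real write path does. The names may_update_i_size() and count_handle_starts()
are made up for this illustration.

```python
def may_update_i_size(pos, length, i_size):
    """True if a write of `length` bytes at offset `pos` ends past `i_size`."""
    return pos + length > i_size


def count_handle_starts(writes, i_size=0):
    """Compare two hypothetical policies over a list of (pos, length) writes.

    Returns (per_write_starts, size_only_starts, final_i_size), where
    per_write_starts counts one handle per write and size_only_starts counts
    a handle only for writes that may extend i_size.
    """
    per_write = 0
    size_only = 0
    for pos, length in writes:
        per_write += 1  # policy A: always start a handle
        if may_update_i_size(pos, length, i_size):
            size_only += 1  # policy B: start one only when i_size may change
            i_size = pos + length
    return per_write, size_only, i_size


# Four 4 KiB appends followed by four overwrites of the same range.
appends = [(i * 4096, 4096) for i in range(4)]
overwrites = [(i * 4096, 4096) for i in range(4)]
print(count_handle_starts(appends + overwrites))
```

In this model an overwrite-heavy workload starts far fewer handles under the
second policy, while a purely appending workload sees no difference.
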
> (And there's an ATC 2016 paper which shows that things haven't gotten
> any better as well.)
>
> Given that our massive lock bottlenecks come from the j_list_lock and
> j_state_lock, and that most of the contention happens when we are
> closing down a transaction for a commit, there is a pretty direct
> correlation between handle lifetimes and the contention statistics on
> the journal spinlocks. Enough so that I've instrumented the handle
> type and handle line number in the jbd2_handle_stats tracepoint, and
> work to push down on the handle hold times has definitely helped our
> contention numbers.

Yeah, JBD2 scalability sucks. I suspect you are conflating two issues here
though. One issue is j_list_lock and j_state_lock contention - that is
exposed by starting handles often, doing lots of operations with buffers
etc. This is what the above paper shows. Another issue is that while a
transaction is preparing for commit, we have to wait for all outstanding
handles against that transaction and while we do that, we have no running
transaction and the whole journalling machinery is stalled. For this
problem, the time each handle runs is essential. This is what you've likely
seen in your testing.
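
The second issue can be sketched with a toy model in Python (not JBD2 code).
It assumes that once the transaction starts preparing for commit at time
lock_time, no new handle may start, and the stall lasts until the last handle
that was already running finishes. The function name commit_stall() and the
(start, duration) representation of a handle are invented for this sketch.

```python
def commit_stall(lock_time, handles):
    """Time the journal is stalled waiting for outstanding handles.

    `handles` is a list of (start, duration) pairs. A handle is outstanding
    if it started at or before `lock_time` and has not finished by then.
    Handles that would start later are ignored: in this model they block
    until the next transaction exists.
    """
    ends = [start + duration for start, duration in handles
            if start <= lock_time < start + duration]
    # With no outstanding handles the stall is zero.
    return max(ends, default=lock_time) - lock_time


long_handles = [(0, 5), (2, 10), (6, 1)]
short_handles = [(0, 2), (2, 3), (6, 1)]
print(commit_stall(4, long_handles), commit_stall(4, short_handles))
```

The stall is set by the longest-running outstanding handle, not by how many
handles were started, which is why handle hold time matters for this problem
and handle count matters for the lock contention one.
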
Reducing j_list_lock and j_state_lock contention is IMO doable, although
the low-hanging fruit is probably eaten these days ;). Fixing the second
problem is harder as that is an inherent problem with block-level journalling.
I suspect we could allow starting another transaction while the previous
one is in the "preparing for commit" phase, but that would lead to two
transactions getting updates at one point in time, which JBD2 currently does
not expect.

> So I do have experimental evidence that reducing code while the handle
> is running does matter in general. I just don't have it for this
> specific case yet....

OK.

Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR