From: Jan Kara <jack@suse.cz>
To: Nikola Pajkovsky <nikola.pajkovsky@gooddata.com>
Cc: "Jan Kara" <jack@suse.cz>,
"Holger Hoffstätte" <holger@applied-asynchrony.com>,
linux-ext4@vger.kernel.org, "Jan Kara" <jack@suse.com>
Subject: Re: xfstests generic/130 hang with non-4k block size ext4 on 4.7-rc1 kernel
Date: Mon, 20 Jun 2016 13:39:50 +0200
Message-ID: <20160620113950.GD6882@quack2.suse.cz>
In-Reply-To: <8737odw5xp.fsf@gooddata.com>

On Thu 16-06-16 16:42:58, Nikola Pajkovsky wrote:
> Jan Kara <jack@suse.cz> writes:
>
> > On Fri 10-06-16 07:52:56, Nikola Pajkovsky wrote:
> >> Jan Kara <jack@suse.cz> writes:
> >> > On Thu 09-06-16 09:23:29, Nikola Pajkovsky wrote:
> >> >> Holger Hoffstätte <holger@applied-asynchrony.com> writes:
> >> >>
> >> >> > On Wed, 08 Jun 2016 14:56:31 +0200, Jan Kara wrote:
> >> >> > (snip)
> >> >> >> Attached patch fixes the issue for me. I'll submit it once a full xfstests
> >> >> >> run finishes for it (which may take a while as our server room is currently
> >> >> >> moving to a different place).
> >> >> >>
> >> >> >> Honza
> >> >> >> --
> >> >> >> Jan Kara <jack@suse.com>
> >> >> >> SUSE Labs, CR
> >> >> >> From 3a120841a5d9a6c42bf196389467e9e663cf1cf8 Mon Sep 17 00:00:00 2001
> >> >> >> From: Jan Kara <jack@suse.cz>
> >> >> >> Date: Wed, 8 Jun 2016 10:01:45 +0200
> >> >> >> Subject: [PATCH] ext4: Fix deadlock during page writeback
> >> >> >>
> >> >> >> Commit 06bd3c36a733 (ext4: fix data exposure after a crash) uncovered a
> >> >> >> deadlock in ext4_writepages() which was previously much harder to hit.
> >> >> >> After this commit xfstest generic/130 reproduces the deadlock on small
> >> >> >> filesystems.
> >> >> >
> >> >> > Since you marked this for -stable, just a heads-up that the previous patch
> >> >> > for the data exposure was rejected from -stable (see [1]) because it
> >> >> > has the mismatching "!IS_NOQUOTA(inode) &&" line, which didn't exist
> >> >> > until 4.6. I removed it locally but Greg probably wants an official patch.
> >> >> >
> >> >> > So both this and the previous patch need to be submitted.
> >> >> >
> >> >> > [1] http://permalink.gmane.org/gmane.linux.kernel.stable/18074{4,5,6}
> >> >>
> >> >> I'm just wondering whether Jan's patch is related to the blocked
> >> >> processes in the following trace. It is very hard to hit and I don't have
> >> >> any reproducer.
> >> >
> >> > This looks like a different issue. Does the machine recover itself or is it
> >> > a hard hang and you have to press a reset button?
> >>
> >> The machine is a bit bigger than I suggested. It has 18 vCPUs and 160 GB
> >> of RAM, and it has a dedicated mount point just for PostgreSQL data.
> >>
> >> Nevertheless, I was always able to ssh to the machine, so the machine itself
> >> was not hard hung, and ext4 mostly recovers by itself (it took
> >> 30 min). But I have seen situations where every process that touches the ext4
> >> filesystem goes immediately to D state and does not recover even after an hour.
> >
> > If such a situation happens, can you run 'echo w >/proc/sysrq-trigger' to
> > dump stuck processes and also run 'iostat -x 1' for a while to see how much
> > IO is happening in the system? That should tell us more.
>
>
> The output of 'echo w >/proc/sysrq-trigger' is at the link below, because
> it's a bit too big to mail.
>
> http://expirebox.com/download/68c26e396feb8c9abb0485f857ccea3a.html

Can you upload it again please? I only got to looking at the file today
and it has already been deleted. Thanks!

> I was running iotop and there was roughly 20 KB/s of write traffic.
>
> What was a bit more interesting was looking at
>
> cat /proc/vmstat | egrep "nr_dirty|nr_writeback"
>
> nr_dirty was around 240 and slowly counting up, but nr_writeback was at
> ~8800 and stayed stuck there for 120 s.

Hum, interesting. This would suggest that IO completion got stuck for some
reason. Hopefully we'll see more from the stack traces.
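
If the hang happens again, it may help to record the two counters over time
rather than reading them once. Below is a minimal sketch of such a sampler,
assuming the usual "name value" line format of /proc/vmstat. The function
names parse_vmstat and sample are made up for illustration, and the one-second
interval and 120 samples are arbitrary choices, not anything taken from this
thread. Unlike the egrep pattern above, it matches the counter names exactly,
so related counters that merely share the prefix are skipped.

```python
import time


def parse_vmstat(text, keys=("nr_dirty", "nr_writeback")):
    """Return {counter: value} for the requested counters in /proc/vmstat text."""
    wanted = set(keys)
    result = {}
    for line in text.splitlines():
        parts = line.split()
        # Each line is expected to look like "<name> <integer>".
        if len(parts) == 2 and parts[0] in wanted:
            result[parts[0]] = int(parts[1])
    return result


def sample(path="/proc/vmstat", interval=1.0, count=120):
    """Print a timestamped snapshot of the counters every `interval` seconds."""
    for _ in range(count):
        with open(path) as f:
            counters = parse_vmstat(f.read())
        print(time.strftime("%H:%M:%S"), counters, flush=True)
        time.sleep(interval)


if __name__ == "__main__":
    sample()
```

An nr_writeback value that stays flat across many samples while nr_dirty keeps
growing would match the pattern described above.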
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR