From: "Lukáš Czerner" <lczerner@redhat.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH v4 15/20] ext4: use ext4_zero_partial_blocks in punch_hole
Date: Mon, 17 Jun 2013 14:46:29 +0200 (CEST) [thread overview]
Message-ID: <alpine.LFD.2.00.1306171435301.3270@localhost.localdomain> (raw)
In-Reply-To: <20130617122518.GA24403@thunk.org>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 3807 bytes --]
On Mon, 17 Jun 2013, Theodore Ts'o wrote:
> Date: Mon, 17 Jun 2013 08:25:18 -0400
> From: Theodore Ts'o <tytso@mit.edu>
> To: Lukáš Czerner <lczerner@redhat.com>
> Cc: linux-ext4@vger.kernel.org
> Subject: Re: [PATCH v4 15/20] ext4: use ext4_zero_partial_blocks in punch_hole
>
> On Mon, Jun 17, 2013 at 11:08:32AM +0200, Lukáš Czerner wrote:
> > > Correction... reverting patches #15 through #19 (which is what I did in
> > > the dev-with-revert branch found on ext4.git) causes the problem to go
> > > away in the nojournal case, but it causes a huge number of other
> > > problems. Some of the reverts weren't clean, so it's possible I
> > > screwed up one of the reverts. It's also possible that only applying
> > > part of this series leaves the tree in an unstable state.
> > >
> > > I'd much rather figure out how to fix the problem on the dev branch,
> > > so thank you for looking into this!
> >
> > Wow, this looks bad. Theoretically reverting patches %15 through
> > #19 should not have any real impact. So far I do not see what is
> > causing that, but I am looking into this.
>
> I've been looking into this more intensively over the weekend. I'm
> now beginning to think we have had a pre-existing race, and the
> changes in question has simply changed the timing. I tried a version
> of the dev branch (you can find it as the branch dev2 in my
> kernel.org's ext4.git tree) which only had patches 1 through 10 of the
> invalidate page range patches (dropping patches 11 through 20), and I
> found that generic/300 was failing in the configuration ext3 (a file
> system with nodelalloc, no flex_bg, and no extents). I also found
> the same failure with a 3.10-rc2 configuration.
>
> The your changes seem to make generic/300 failure consistently for me
> using the nojournal configuration, but looking at patches in question,
> I don't think they could have directly caused the problem. Instead, I
> think they just changed the timing to unmask the problem.
Ok, I though that there is something weird because patches #1-#14
should not cause anything like that and from my testing (see my
previous mail) it really seems it does not cause it, at least not
directly.
>
> Given that I've seen generic/300 test failures in various different
> baselines going all the way back to 3.9-rc4, this isn't a recent
> regression. And given that it does seem to be timing sensitive,
> bisecting it is going to be difficult. On the other hand, given that
> using the dev (or master) branch, generic/300 is failing with a
> greater than 70% probability using kvm with 2 cpu's, 2 megs of RAM and
> 5400 rpm laptop drives in nojournal mode, the fact that it's
> reproducing relatively reliably hopefully will make it easier to find
> the problem.
As mentioned in previous email test generic/300 runs without any
problems (even in the loop) without journal with patches #1 through
#14 applied on 3.10-rc2 (c7788792a5e7b0d5d7f96d0766b4cb6112d47d75).
This is on kvm with 24 cpu's, 8GB of RAM (I suppose you're not using
2MB of ram in your setup, but rather 2GB :) and server drives with
linear lvm on top of it.
-Lukas
>
> > I see that there are problems in other mode, not just nojournal. Are
> > those caused by this as well, or are you seeing those even without
> > the patchset ?
>
> I think the other problems in my dev-with-revert branch was caused by
> some screw up on my part when did the revert using git. I found that
> dropping the patches from a copy of the guilt patch stack, and then
> applying all of the patches except for the last half of the invalidate
> page range patch series, resulted in a clean branch that didn't have
> any of these failures. It's what I should have done late last week,
> instead of trying to use "git revert".
>
> Cheers,
>
> - Ted
>
next prev parent reply other threads:[~2013-06-17 12:46 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-05-14 16:37 [PATCH v4 00/20] change invalidatepage prototype to accept length Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 01/20] mm: " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 02/20] jbd2: change jbd2_journal_invalidatepage " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 03/20] ext4: use ->invalidatepage() length argument Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 04/20] jbd: change journal_invalidatepage() to accept length Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 05/20] xfs: use ->invalidatepage() length argument Lukas Czerner
2013-05-15 7:30 ` Dave Chinner
2013-05-14 16:37 ` [PATCH v4 06/20] ocfs2: " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 07/20] ceph: " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 08/20] gfs2: " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 09/20] reiserfs: " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 10/20] mm: teach truncate_inode_pages_range() to handle non page aligned ranges Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 11/20] Revert "ext4: remove no longer used functions in inode.c" Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 12/20] ext4: Call ext4_jbd2_file_inode() after zeroing block Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 13/20] Revert "ext4: fix fsx truncate failure" Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 14/20] ext4: truncate_inode_pages() in orphan cleanup path Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 15/20] ext4: use ext4_zero_partial_blocks in punch_hole Lukas Czerner
2013-06-14 3:01 ` Theodore Ts'o
2013-06-14 10:16 ` Lukáš Czerner
2013-06-14 13:39 ` Ted Ts'o
2013-06-17 9:08 ` Lukáš Czerner
2013-06-17 12:25 ` Theodore Ts'o
2013-06-17 12:46 ` Lukáš Czerner [this message]
2013-06-17 12:30 ` Lukáš Czerner
2013-06-19 16:37 ` Lukáš Czerner
2013-06-19 23:42 ` Theodore Ts'o
2013-06-20 9:12 ` Lukáš Czerner
2013-06-20 9:14 ` [PATCH] ext4: Only zero partial blocks in ext4_zero_partial_blocks() Lukas Czerner
2013-06-20 15:01 ` Theodore Ts'o
2013-05-14 16:37 ` [PATCH v4 16/20] ext4: remove unused discard_partial_page_buffers Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 17/20] ext4: remove unused code from ext4_remove_blocks() Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 18/20] ext4: update ext4_ext_remove_space trace point Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 19/20] ext4: make punch hole code path work with bigalloc Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 20/20] ext4: Allow punch hole with bigalloc enabled Lukas Czerner
2013-05-31 15:14 ` Theodore Ts'o
2013-06-05 10:04 ` Lukáš Czerner
2013-06-11 12:54 ` Lukáš Czerner
2013-06-18 12:34 ` Tomas Racek
2013-05-21 14:34 ` [PATCH v4 00/20] change invalidatepage prototype to accept length Lukáš Czerner
2013-05-28 11:21 ` Theodore Ts'o
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.00.1306171435301.3270@localhost.localdomain \
--to=lczerner@redhat.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox