linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Theodore Ts'o <tytso@mit.edu>
To: "Lukáš Czerner" <lczerner@redhat.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH v4 15/20] ext4: use ext4_zero_partial_blocks in punch_hole
Date: Mon, 17 Jun 2013 08:25:18 -0400	[thread overview]
Message-ID: <20130617122518.GA24403@thunk.org> (raw)
In-Reply-To: <alpine.LFD.2.00.1306171104540.3270@localhost.localdomain>

On Mon, Jun 17, 2013 at 11:08:32AM +0200, Lukáš Czerner wrote:
> > Correction...  reverting patches #15 through #19 (which is what I did in
> > the dev-with-revert branch found on ext4.git) causes the problem to go
> > away in the nojournal case, but it causes a huge number of other
> > problems.  Some of the reverts weren't clean, so it's possible I
> > screwed up one of the reverts.  It's also possible that only applying
> > part of this series leaves the tree in an unstable state.
> > 
> > I'd much rather figure out how to fix the problem on the dev branch,
> > so thank you for looking into this!
> 
> Wow, this looks bad. Theoretically reverting patches %15 through
> #19 should not have any real impact. So far I do not see what is
> causing that, but I am looking into this.

I've been looking into this more intensively over the weekend.  I'm
now beginning to think we have had a pre-existing race, and the
changes in question has simply changed the timing.  I tried a version
of the dev branch (you can find it as the branch dev2 in my
kernel.org's ext4.git tree) which only had patches 1 through 10 of the
invalidate page range patches (dropping patches 11 through 20), and I
found that generic/300 was failing in the configuration ext3 (a file
system with nodelalloc, no flex_bg, and no extents).  I also found
the same failure with a 3.10-rc2 configuration.

The your changes seem to make generic/300 failure consistently for me
using the nojournal configuration, but looking at patches in question,
I don't think they could have directly caused the problem.  Instead, I
think they just changed the timing to unmask the problem.

Given that I've seen generic/300 test failures in various different
baselines going all the way back to 3.9-rc4, this isn't a recent
regression.  And given that it does seem to be timing sensitive,
bisecting it is going to be difficult.  On the other hand, given that
using the dev (or master) branch, generic/300 is failing with a
greater than 70% probability using kvm with 2 cpu's, 2 megs of RAM and
5400 rpm laptop drives in nojournal mode, the fact that it's
reproducing relatively reliably hopefully will make it easier to find
the problem.

> I see that there are problems in other mode, not just nojournal. Are
> those caused by this as well, or are you seeing those even without
> the patchset ?

I think the other problems in my dev-with-revert branch was caused by
some screw up on my part when did the revert using git.  I found that
dropping the patches from a copy of the guilt patch stack, and then
applying all of the patches except for the last half of the invalidate
page range patch series, resulted in a clean branch that didn't have
any of these failures.  It's what I should have done late last week,
instead of trying to use "git revert".

Cheers,

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2013-06-17 12:25 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-14 16:37 [PATCH v4 00/20] change invalidatepage prototype to accept length Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 01/20] mm: " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 02/20] jbd2: change jbd2_journal_invalidatepage " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 03/20] ext4: use ->invalidatepage() length argument Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 04/20] jbd: change journal_invalidatepage() to accept length Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 05/20] xfs: use ->invalidatepage() length argument Lukas Czerner
2013-05-15  7:30   ` Dave Chinner
2013-05-14 16:37 ` [PATCH v4 06/20] ocfs2: " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 07/20] ceph: " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 08/20] gfs2: " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 09/20] reiserfs: " Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 10/20] mm: teach truncate_inode_pages_range() to handle non page aligned ranges Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 11/20] Revert "ext4: remove no longer used functions in inode.c" Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 12/20] ext4: Call ext4_jbd2_file_inode() after zeroing block Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 13/20] Revert "ext4: fix fsx truncate failure" Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 14/20] ext4: truncate_inode_pages() in orphan cleanup path Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 15/20] ext4: use ext4_zero_partial_blocks in punch_hole Lukas Czerner
2013-06-14  3:01   ` Theodore Ts'o
2013-06-14 10:16     ` Lukáš Czerner
2013-06-14 13:39       ` Ted Ts'o
2013-06-17  9:08         ` Lukáš Czerner
2013-06-17 12:25           ` Theodore Ts'o [this message]
2013-06-17 12:46             ` Lukáš Czerner
2013-06-17 12:30         ` Lukáš Czerner
2013-06-19 16:37     ` Lukáš Czerner
2013-06-19 23:42       ` Theodore Ts'o
2013-06-20  9:12         ` Lukáš Czerner
2013-06-20  9:14         ` [PATCH] ext4: Only zero partial blocks in ext4_zero_partial_blocks() Lukas Czerner
2013-06-20 15:01           ` Theodore Ts'o
2013-05-14 16:37 ` [PATCH v4 16/20] ext4: remove unused discard_partial_page_buffers Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 17/20] ext4: remove unused code from ext4_remove_blocks() Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 18/20] ext4: update ext4_ext_remove_space trace point Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 19/20] ext4: make punch hole code path work with bigalloc Lukas Czerner
2013-05-14 16:37 ` [PATCH v4 20/20] ext4: Allow punch hole with bigalloc enabled Lukas Czerner
2013-05-31 15:14   ` Theodore Ts'o
2013-06-05 10:04     ` Lukáš Czerner
2013-06-11 12:54     ` Lukáš Czerner
2013-06-18 12:34       ` Tomas Racek
2013-05-21 14:34 ` [PATCH v4 00/20] change invalidatepage prototype to accept length Lukáš Czerner
2013-05-28 11:21   ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130617122518.GA24403@thunk.org \
    --to=tytso@mit.edu \
    --cc=lczerner@redhat.com \
    --cc=linux-ext4@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).