* data=journal regressions in 3.16-rc1
From: Eric Whitney @ 2014-06-23 20:24 UTC
To: linux-ext4; +Cc: tytso, matthew.r.wilcox
My regression test results for 3.16-rc1 on x86_64 show three new xfstests
failures since 3.15 final when running on an ext4 filesystem mounted with the
data=journal and block_validity mount options (xfstests-bld's data_journal
scenario). These are generic/075, /112, and /231. All three tests fail
consistently.
These failures bisect to this kernel patch:
7fc34a62ca mm/msync.c: sync only the requested range in msync()
These failures also appear when running on 3.16-rc2, and disappear if the
aforementioned patch is reverted. I've not seen the failures in any of the
other test scenarios I've run on 3.16-rc1 (4k, ext3, nojournal, etc.).
No error messages appear in the kernel log, and little useful information is
reported when a test fails. For reference, here's the result of a generic/075
failure:
generic/075 62s ... [15:33:07] [15:33:09] [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/075.out.bad)
--- tests/generic/075.out 2014-06-16 13:14:27.233891460 -0400
+++ /root/xfstests/results//generic/075.out.bad 2014-06-23 15:33:09.654212783 -0400
@@ -4,15 +4,5 @@
-----------------------------------------------
fsx.0 : -d -N numops -S 0
-----------------------------------------------
-
------------------------------------------------
-fsx.1 : -d -N numops -S 0 -x
------------------------------------------------
...
(Run 'diff -u tests/generic/075.out /root/xfstests/results//generic/075.out.bad' to see the entire diff)
Ran: generic/075
Failures: generic/075
Failed 1 of 1 tests
And the contents of xfstests/results/generic/075.out.bad:
QA output created by 075
brevity is wit...
-----------------------------------------------
fsx.0 : -d -N numops -S 0
-----------------------------------------------
fsx (-d -N 1000 -S 0) failed, 0 - compare /root/xfstests/results//generic/075.0.{good,bad,fsxlog}
od: /root/xfstests/results//generic/075.0.fsxgood: No such file or directory
Additional test configuration info:
e2fsprogs master branch: bb9cca2ca9
xfstests master branch: 45d1fac130
Perhaps data=journal has an unexpected dependency on the old msync behavior,
given the patch comment?
Thanks,
Eric

* RE: data=journal regressions in 3.16-rc1
From: Wilcox, Matthew R @ 2014-06-23 21:11 UTC
To: Eric Whitney, linux-ext4@vger.kernel.org; +Cc: tytso@mit.edu

Which test in 075.0.fsxlog indicates failure?

* Re: data=journal regressions in 3.16-rc1
From: Eric Whitney @ 2014-06-23 21:55 UTC
To: Wilcox, Matthew R; +Cc: Eric Whitney, linux-ext4@vger.kernel.org, tytso@mit.edu

The first invocation of fsx causes generic/075 to fail. Within 075.0.fsxlog,
bad reads appear to be the cause:

READ BAD DATA: offset = 0xb7f0, size = 0x8111, fname = 075.0
OFFSET   GOOD    BAD     RANGE
0x13000  0x1aee  0x0000  0x    0
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x13001  0xee1a  0x0000  0x    1
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x13002  0x1a01  0x0000  0x    2

(etc. - goes on until RANGE = 0xf)

This code is new to me, but it looks like fsx is getting zeros where it
expects other values.

Eric

* Wilcox, Matthew R <matthew.r.wilcox@intel.com>:
> Which test in 075.0.fsxlog indicates failure?
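
As a quick cross-check of the fsx report quoted in this thread, the first bad offset can be compared against the range of the failing read. The sketch below is illustrative only: the helper names and DEMO_PAGE_SIZE are hypothetical, and 4 KiB pages are assumed (the x86_64 default).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86_64. */
#define DEMO_PAGE_SIZE 4096u

/* True if byte offset `off` lies inside the read [read_off, read_off + read_len). */
bool offset_in_read(uint64_t read_off, uint64_t read_len, uint64_t off)
{
    return off >= read_off && off < read_off + read_len;
}

/* True if `off` is the first byte of a page. */
bool is_page_aligned(uint64_t off)
{
    return (off % DEMO_PAGE_SIZE) == 0;
}
```

With the values from the log (offset = 0xb7f0, size = 0x8111), the read covers [0xb7f0, 0x13901), so the first bad offset 0x13000 falls inside it. 0x13000 is also a multiple of 4096, so the zeros reported by fsx begin exactly at a page boundary.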

* Re: data=journal regressions in 3.16-rc1
From: Lukáš Czerner @ 2014-06-26 15:12 UTC
To: Eric Whitney; +Cc: Wilcox, Matthew R, linux-ext4@vger.kernel.org, tytso@mit.edu, namjae.jeon

On Mon, 23 Jun 2014, Eric Whitney wrote:

> The first invocation of fsx causes generic/075 to fail. Within 075.0.fsxlog,
> bad reads appear to be the cause:
>
> READ BAD DATA: offset = 0xb7f0, size = 0x8111, fname = 075.0
>
> (etc. - goes on until RANGE = 0xf)
>
> This code is new to me, but it looks like fsx is getting zeros where it
> expects other values.

This seems to be related to the collapse range feature. When adding -C
to fsx the problem goes away. Also, when comparing the fsxgood file
with the real file, it seems that there is a big chunk of the file
missing in the middle.

We need to investigate further. Namjae, any idea what might be
causing it?

-Lukas

* RE: data=journal regressions in 3.16-rc1
From: Namjae Jeon @ 2014-06-27 4:20 UTC
To: 'Lukáš Czerner', 'Eric Whitney'; +Cc: 'Wilcox, Matthew R', linux-ext4, tytso

> This seems to be related to the collapse range feature. When adding -C
> to fsx the problem goes away. Also, when comparing the fsxgood file
> with the real file, it seems that there is a big chunk of the file
> missing in the middle.
>
> We need to investigate further. Namjae, any idea what might be
> causing it?

Hi, Lukas.

Based on the test result with the fsx -C option, this issue does seem to be
related to collapse range. I will check it.

Thanks!

* RE: data=journal regressions in 3.16-rc1
From: Namjae Jeon @ 2014-06-27 6:03 UTC
To: 'Lukáš Czerner', 'Wilcox, Matthew R'; +Cc: 'Eric Whitney', linux-ext4, tytso, Christoph Hellwig, Ashish Sangwan

> Hi, Lukas.
>
> Based on the test result with the fsx -C option, this issue does seem to be
> related to collapse range. I will check it.

It looks like fstart is calculated incorrectly:

-	fstart = start + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
+	fstart = (start - vma->vm_start) + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);

Matthew, could you confirm this change is correct?

Thanks.

* RE: data=journal regressions in 3.16-rc1
From: Lukáš Czerner @ 2014-06-27 10:48 UTC
To: Namjae Jeon; +Cc: 'Wilcox, Matthew R', 'Eric Whitney', linux-ext4, tytso, Christoph Hellwig, Ashish Sangwan

On Fri, 27 Jun 2014, Namjae Jeon wrote:

> It looks like fstart is calculated incorrectly:
>
> -	fstart = start + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
> +	fstart = (start - vma->vm_start) + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);

Good catch, this is indeed a bug. And this change fixes the problem
we're seeing with data=journal in ext4.

Will you send a proper patch?

Thanks!
-Lukas
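
For readers following the fstart change: in msync(), start is a user virtual address, while vm_pgoff records the file offset (in pages) at which the mapping begins. The file offset of start is therefore its distance into the mapping plus the mapping's own file offset. The sketch below is a user-space illustration under stated assumptions, not the kernel code: struct demo_vma, DEMO_PAGE_SHIFT, and both helper names are hypothetical stand-ins, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86_64. */
#define DEMO_PAGE_SHIFT 12

/* Hypothetical stand-in for the two vm_area_struct fields used here. */
struct demo_vma {
    uint64_t vm_start;  /* first user virtual address covered by the mapping */
    uint64_t vm_pgoff;  /* file offset of vm_start, counted in pages */
};

/* Original formula: adds the raw virtual address to the mapping's file offset. */
uint64_t fstart_buggy(const struct demo_vma *vma, uint64_t start)
{
    return start + (vma->vm_pgoff << DEMO_PAGE_SHIFT);
}

/* Proposed formula: offset within the mapping plus the mapping's file offset. */
uint64_t fstart_fixed(const struct demo_vma *vma, uint64_t start)
{
    assert(start >= vma->vm_start);  /* start must lie inside the mapping */
    return (start - vma->vm_start) + (vma->vm_pgoff << DEMO_PAGE_SHIFT);
}
```

For a mapping at 0x7f0000000000 with vm_pgoff = 2, an address one page into the mapping (0x7f0000001000) gives file offset 0x3000 with the proposed formula. The original formula gives 0x7f0000003000, which lies far beyond the end of any test file, so a sync limited to that range would not cover the pages the caller asked for.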
* RE: data=journal regressions in 3.16-rc1 2014-06-27 10:48 ` Lukáš Czerner @ 2014-06-27 11:14 ` Namjae Jeon 0 siblings, 0 replies; 8+ messages in thread From: Namjae Jeon @ 2014-06-27 11:14 UTC (permalink / raw) To: 'Lukáš Czerner' Cc: 'Wilcox, Matthew R', 'Eric Whitney', linux-ext4, tytso, 'Christoph Hellwig', 'Ashish Sangwan' > > On Fri, 27 Jun 2014, Namjae Jeon wrote: > > > Date: Fri, 27 Jun 2014 15:03:38 +0900 > > From: Namjae Jeon <namjae.jeon@samsung.com> > > To: 'Lukáš Czerner' <lczerner@redhat.com>, > > "'Wilcox, Matthew R'" <matthew.r.wilcox@intel.com> > > Cc: 'Eric Whitney' <enwlinux@gmail.com>, linux-ext4@vger.kernel.org, > > tytso@mit.edu, Christoph Hellwig <hch@infradead.org>, > > Ashish Sangwan <a.sangwan@samsung.com> > > Subject: RE: data=journal regressions in 3.16-rc1 > > > > > > On Mon, 23 Jun 2014, Eric Whitney wrote: > > > > > > > > > Date: Mon, 23 Jun 2014 17:55:14 -0400 > > > > > From: Eric Whitney <enwlinux@gmail.com> > > > > > To: "Wilcox, Matthew R" <matthew.r.wilcox@intel.com> > > > > > Cc: Eric Whitney <enwlinux@gmail.com>, > > > > > "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>, > > > > > "tytso@mit.edu" <tytso@mit.edu> > > > > > Subject: Re: data=journal regressions in 3.16-rc1 > > > > > > > > > > The first invocation of fsx causes generic/075 to fail. Within 075.0.fsxlog, > > > > > bad reads appear to be the cause: > > > > > > > > > > READ BAD DATA: offset = 0xb7f0, size = 0x8111, fname = 075.0 > > > > > OFFSET GOOD BAD RANGE > > > > > 0x13000 0x1aee 0x0000 0x 0 > > > > > operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops > > > > > 0x13001 0xee1a 0x0000 0x 1 > > > > > operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops > > > > > 0x13002 0x1a01 0x0000 0x 2 > > > > > > > > > > (etc. - goes on until RANGE = 0xf) > > > > > > > > > > This code is new to me, but it looks like fsx is getting zeros where it > > > > > expects other values. 
> > > >
> > > > This seems to be related to the collapse range feature. When adding -C
> > > > to fsx the problem goes away. Also, when comparing the fsxgood
> > > > with the real file, it seems that there is a big chunk of the file
> > > > missing in the middle.
> > > >
> > > > We need to investigate further. Namjae, any idea what might be
> > > > causing it?
> > > Hi, Lukas.
> > >
> > > It seems this issue is related to collapse range, given the test result (with the fsx -C option).
> > > I will check it.
> >
> > Looks like fstart is wrongly calculated.
> >
> > - fstart = start + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
> > + fstart = (start - vma->vm_start) + ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
>
> Good catch, this indeed is a bug. And this change fixes the problem
> we're seeing with data=journal in ext4.
>
> Will you send a proper patch?
Okay, I will send the patch soon.

Thanks!
>
> Thanks!
> -Lukas
>
> >
> > Could you confirm this change is correct, Matthew?
> >
> > Thanks.
> > > >
> > > > Thanks!
> > > >
> > > > -Lukas
> > > >
> > > > >
> > > > > Eric
> > > > >
> > > > >
> > > > > * Wilcox, Matthew R <matthew.r.wilcox@intel.com>:
> > > > > > Which test in 075.0.fsxlog indicates failure?
> > > > > > ________________________________________
> > > > > > From: Eric Whitney [enwlinux@gmail.com]
> > > > > > Sent: June 23, 2014 1:24 PM
> > > > > > To: linux-ext4@vger.kernel.org
> > > > > > Cc: tytso@mit.edu; Wilcox, Matthew R
> > > > > > Subject: data=journal regressions in 3.16-rc1
> > > > > >
> > > > > > My regression test results for 3.16-rc1 on x86_64 show three new xfstests
> > > > > > failures since 3.15 final when running on an ext4 filesystem mounted with the
> > > > > > data=journal and block_validity mount options (xfstests-bld's data_journal
> > > > > > scenario). These are generic/075, /112, and /231. All three tests fail
> > > > > > consistently.
> > > > > >
> > > > > > These failures bisect to this kernel patch:
> > > > > > 7fc34a62ca mm/msync.c: sync only the requested range in msync()
> > > > > >
> > > > > > These failures also appear when running on 3.16-rc2, and disappear if the
> > > > > > aforementioned patch is reverted. I've not seen the failures in any of the
> > > > > > other test scenarios I've run on 3.16-rc1 (4k, ext3, nojournal, etc.).
> > > > > >
> > > > > > No error messages appear in the kernel log, and not a lot useful is reported
> > > > > > when a test fails. Just for reference, here's the result of a generic/075
> > > > > > failure:
> > > > > >
> > > > > > generic/075 62s ... [15:33:07] [15:33:09] [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/075.out.bad)
> > > > > > --- tests/generic/075.out 2014-06-16 13:14:27.233891460 -0400
> > > > > > +++ /root/xfstests/results//generic/075.out.bad 2014-06-23 15:33:09.654212783 -0400
> > > > > > @@ -4,15 +4,5 @@
> > > > > > -----------------------------------------------
> > > > > > fsx.0 : -d -N numops -S 0
> > > > > > -----------------------------------------------
> > > > > > -
> > > > > > ------------------------------------------------
> > > > > > -fsx.1 : -d -N numops -S 0 -x
> > > > > > ------------------------------------------------
> > > > > > ...
> > > > > > (Run 'diff -u tests/generic/075.out /root/xfstests/results//generic/075.out.bad' to see the entire diff)
> > > > > > Ran: generic/075
> > > > > > Failures: generic/075
> > > > > > Failed 1 of 1 tests
> > > > > >
> > > > > >
> > > > > > And the contents of xfstests/results/generic/075.out.bad:
> > > > > >
> > > > > >
> > > > > > QA output created by 075
> > > > > > brevity is wit...
> > > > > >
> > > > > > -----------------------------------------------
> > > > > > fsx.0 : -d -N numops -S 0
> > > > > > -----------------------------------------------
> > > > > > fsx (-d -N 1000 -S 0) failed, 0 - compare /root/xfstests/results//generic/075.0.{good,bad,fsxlog}
> > > > > > od: /root/xfstests/results//generic/075.0.fsxgood: No such file or directory
> > > > > >
> > > > > >
> > > > > > Additional test configuration info:
> > > > > >
> > > > > > e2fsprogs master branch: bb9cca2ca9
> > > > > > xfstests master branch: 45d1fac130
> > > > > >
> > > > > > Perhaps data=journal has an unexpected dependency on the old msync behavior,
> > > > > > given the patch comment?
> > > > > >
> > > > > > Thanks,
> > > > > > Eric
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > > > > the body of a message to majordomo@vger.kernel.org
> > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > >
> > > >
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
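The one-line fstart change discussed in this thread can be modeled in user space. What follows is only a minimal sketch, assuming 4 KiB pages; struct sketch_vma, SKETCH_PAGE_SHIFT, fstart_old() and fstart_new() are hypothetical stand-ins for illustration and are not the kernel's definitions. It contrasts the expression on the '-' line of the quoted diff with the one on the '+' line:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: assumes 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in holding only the two fields the expression uses. */
struct sketch_vma {
    uint64_t vm_start;  /* first virtual address covered by the mapping */
    uint64_t vm_pgoff;  /* file offset of vm_start, counted in pages */
};

/*
 * Expression from the '-' line: the raw virtual address is added to the
 * mapping's file offset, so the result is not a file offset at all.
 */
static int64_t fstart_old(const struct sketch_vma *vma, uint64_t start)
{
    return (int64_t)(start + (vma->vm_pgoff << SKETCH_PAGE_SHIFT));
}

/*
 * Expression from the '+' line: the virtual address is first converted to
 * an offset within the mapping, then the mapping's file offset is added.
 */
static int64_t fstart_new(const struct sketch_vma *vma, uint64_t start)
{
    assert(start >= vma->vm_start);  /* start must lie inside the mapping */
    return (int64_t)((start - vma->vm_start) +
                     (vma->vm_pgoff << SKETCH_PAGE_SHIFT));
}
```

For a mapping at 0x7f0000000000 with vm_pgoff = 2, a start address 0x3000 bytes into the mapping gives a file offset of 0x5000 with the '+' expression, while the '-' expression gives 0x7f0000005000. A start offset that far from the intended one would plausibly fit the symptoms reported earlier in the thread, although the thread itself only confirms that the change makes the failures go away.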
end of thread, other threads:[~2014-06-27 11:14 UTC | newest]

Thread overview: 8+ messages
2014-06-23 20:24 data=journal regressions in 3.16-rc1 Eric Whitney
2014-06-23 21:11 ` Wilcox, Matthew R
2014-06-23 21:55 ` Eric Whitney
2014-06-26 15:12 ` Lukáš Czerner
2014-06-27 4:20 ` Namjae Jeon
2014-06-27 6:03 ` Namjae Jeon
2014-06-27 10:48 ` Lukáš Czerner
2014-06-27 11:14 ` Namjae Jeon