diff for duplicates of <20101216033718.GM9925@dastard>
diff --git a/a/1.txt b/N1/1.txt
index a7ac93c..0f9ff02 100644
--- a/a/1.txt
+++ b/N1/1.txt
@@ -1,69 +1,91 @@
 On Wed, Dec 08, 2010 at 07:20:24AM -0500, Chris Mason wrote:
 > Excerpts from Jon Nelson's message of 2010-12-07 22:29:26 -0500:
-> > On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason <chris.mason@oracle.com> wrote:
+> > On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason <chris.mason@oracle.com=
+> wrote:
 > > > Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500:
-> > >> On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason <chris.mason@oracle.com> wrote:
-> > >> > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500:
-> > >> >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason <chris.mason@oracle.com> wrote:
-> > >> >> > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500:
-> > >> >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason <chris.mason@oracle.com> wrote:
-> > >> >> >> >> postgresql errors. Typically, header corruption but from the limited
-> > >> >> >> >> visibility I've had into this via strace, what I see is zeroed pages
+> > >> On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason <chris.mason@oracle.=
+com> wrote:
+> > >> > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -050=
+0:
+> > >> >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason <chris.mason@orac=
+le.com> wrote:
+> > >> >> > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -=
+0500:
+> > >> >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason <chris.mason@=
+oracle.com> wrote:
+> > >> >> >> >> postgresql errors. Typically, header corruption but fro=
+m the limited
+> > >> >> >> >> visibility I've had into this via strace, what I see is=
+ zeroed pages
 > > >> >> >> >> where there shouldn't be.
 > > >> >> >> >
-> > >> >> >> > This sounds a lot like a bug higher up than dm-crypt.  Zeros tend to
-> > >> >> >> > come from some piece of code explicitly filling a page with zeros, and
-> > >> >> >> > that often happens in the corner cases for O_DIRECT and a few other
+> > >> >> >> > This sounds a lot like a bug higher up than dm-crypt. =A0=
+Zeros tend to
+> > >> >> >> > come from some piece of code explicitly filling a page w=
+ith zeros, and
+> > >> >> >> > that often happens in the corner cases for O_DIRECT and =
+a few other
 > > >> >> >> > places in the filesystem.
 > > >> >> >> >
-> > >> >> >> > Have you tried triggering this with a regular block device?
+> > >> >> >> > Have you tried triggering this with a regular block devi=
+ce?
 > > >> >> >>
-> > >> >> >> I just tried the whole set of tests, but with /dev/sdb directly (as
+> > >> >> >> I just tried the whole set of tests, but with /dev/sdb dir=
+ectly (as
 > > >> >> >> ext4) without any crypt-y bits.
-> > >> >> >> It takes more iterations but out of 6 tests I had one failure: same
+> > >> >> >> It takes more iterations but out of 6 tests I had one fail=
+ure: same
 > > >> >> >> type of thing, 'invalid page header in block ....'.
 > > >> >> >>
-> > >> >> >> I can't guarantee that it is a full-page of zeroes, just what I saw
+> > >> >> >> I can't guarantee that it is a full-page of zeroes, just w=
+hat I saw
 > > >> >> >> from the (limited) stracing I did.
 > > >> >> >
 > > >> >> > Fantastic. Now for our usual suspects:
-> > 
+> >=20
 > > Maybe not so fantastic. I kept testing and had no more failures. At
 > > all. After 40+ iterations I gave up.
-> > I went back to trying ext4 on a LUKS volume. The 'hit' ratio went to
+> > I went back to trying ext4 on a LUKS volume. The 'hit' ratio went t=
+o
 > > something like 1 in 3, or better.
-> > 
+> >=20
 > > I will continue to do testing with and without LUKS. I did /not/
-> > reboot between tests, but I do start with a fresh postgres database.
-> > 
-> 
-> Once we trigger once without dm-crypt, dm-crypt is off the hook. Just
-> to verify, when you say without luks, you mean without any crypto bits
+> > reboot between tests, but I do start with a fresh postgres database=
+=2E
+> >=20
+>=20
+> Once we trigger once without dm-crypt, dm-crypt is off the hook. Jus=
+t
+> to verify, when you say without luks, you mean without any crypto bit=
+s
 > in use at all on the filesystems postgres uses?
-> 
-> Usually the trick to reproducing filesystem corruptions is adding memory
+>=20
+> Usually the trick to reproducing filesystem corruptions is adding mem=
+ory
 > pressure. The corruption is probably a bad interaction between reads
 > and writes, and we need to make sure the reads actually happen.
-> 
+>=20
 > http://oss.oracle.com/~mason/pin_ram.c
-> 
+>=20
 > gcc -Wall -o pin_ram pin_ram.c
-> 
+>=20
 > pin_ram -m 80%-of-your-ram-in-mb
 
 Implemented in xfstests about 10 years ago:
 
-http://git.kernel.org/?p=fs/xfs/xfstests-dev.git;a=blob;f=src/usemem.c;h=b8794a6b209cebf8dbf312a8ef131e2e54b18d29;hb=HEAD
+http://git.kernel.org/?p=3Dfs/xfs/xfstests-dev.git;a=3Dblob;f=3Dsrc/use=
+mem.c;h=3Db8794a6b209cebf8dbf312a8ef131e2e54b18d29;hb=3DHEAD
 
 :P
 
 Cheers,
 
 Dave.
--- 
+--=20
 Dave Chinner
 david@fromorbit.com
 --
-To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
+To unsubscribe from this list: send the line "unsubscribe linux-ext4" i=
+n
 the body of a message to majordomo@vger.kernel.org
 More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/a/content_digest b/N1/content_digest
index 3fe187b..b1e5db2 100644
--- a/a/content_digest
+++ b/N1/content_digest
@@ -27,72 +27,94 @@
  "b\0"
  "On Wed, Dec 08, 2010 at 07:20:24AM -0500, Chris Mason wrote:\n"
  "> Excerpts from Jon Nelson's message of 2010-12-07 22:29:26 -0500:\n"
- "> > On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason <chris.mason@oracle.com> wrote:\n"
+ "> > On Tue, Dec 7, 2010 at 3:02 PM, Chris Mason <chris.mason@oracle.com=\n"
+ "> wrote:\n"
  "> > > Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500:\n"
- "> > >> On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason <chris.mason@oracle.com> wrote:\n"
- "> > >> > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500:\n"
- "> > >> >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason <chris.mason@oracle.com> wrote:\n"
- "> > >> >> > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500:\n"
- "> > >> >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason <chris.mason@oracle.com> wrote:\n"
- "> > >> >> >> >> postgresql errors. Typically, header corruption but from the limited\n"
- "> > >> >> >> >> visibility I've had into this via strace, what I see is zeroed pages\n"
+ "> > >> On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason <chris.mason@oracle.=\n"
+ "com> wrote:\n"
+ "> > >> > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -050=\n"
+ "0:\n"
+ "> > >> >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason <chris.mason@orac=\n"
+ "le.com> wrote:\n"
+ "> > >> >> > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -=\n"
+ "0500:\n"
+ "> > >> >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason <chris.mason@=\n"
+ "oracle.com> wrote:\n"
+ "> > >> >> >> >> postgresql errors. Typically, header corruption but fro=\n"
+ "m the limited\n"
+ "> > >> >> >> >> visibility I've had into this via strace, what I see is=\n"
+ " zeroed pages\n"
  "> > >> >> >> >> where there shouldn't be.\n"
  "> > >> >> >> >\n"
- "> > >> >> >> > This sounds a lot like a bug higher up than dm-crypt. \302\240Zeros tend to\n"
- "> > >> >> >> > come from some piece of code explicitly filling a page with zeros, and\n"
- "> > >> >> >> > that often happens in the corner cases for O_DIRECT and a few other\n"
+ "> > >> >> >> > This sounds a lot like a bug higher up than dm-crypt. =A0=\n"
+ "Zeros tend to\n"
+ "> > >> >> >> > come from some piece of code explicitly filling a page w=\n"
+ "ith zeros, and\n"
+ "> > >> >> >> > that often happens in the corner cases for O_DIRECT and =\n"
+ "a few other\n"
  "> > >> >> >> > places in the filesystem.\n"
  "> > >> >> >> >\n"
- "> > >> >> >> > Have you tried triggering this with a regular block device?\n"
+ "> > >> >> >> > Have you tried triggering this with a regular block devi=\n"
+ "ce?\n"
  "> > >> >> >>\n"
- "> > >> >> >> I just tried the whole set of tests, but with /dev/sdb directly (as\n"
+ "> > >> >> >> I just tried the whole set of tests, but with /dev/sdb dir=\n"
+ "ectly (as\n"
  "> > >> >> >> ext4) without any crypt-y bits.\n"
- "> > >> >> >> It takes more iterations but out of 6 tests I had one failure: same\n"
+ "> > >> >> >> It takes more iterations but out of 6 tests I had one fail=\n"
+ "ure: same\n"
  "> > >> >> >> type of thing, 'invalid page header in block ....'.\n"
  "> > >> >> >>\n"
- "> > >> >> >> I can't guarantee that it is a full-page of zeroes, just what I saw\n"
+ "> > >> >> >> I can't guarantee that it is a full-page of zeroes, just w=\n"
+ "hat I saw\n"
  "> > >> >> >> from the (limited) stracing I did.\n"
  "> > >> >> >\n"
  "> > >> >> > Fantastic. Now for our usual suspects:\n"
- "> > \n"
+ "> >=20\n"
  "> > Maybe not so fantastic. I kept testing and had no more failures. At\n"
  "> > all. After 40+ iterations I gave up.\n"
- "> > I went back to trying ext4 on a LUKS volume. The 'hit' ratio went to\n"
+ "> > I went back to trying ext4 on a LUKS volume. The 'hit' ratio went t=\n"
+ "o\n"
  "> > something like 1 in 3, or better.\n"
- "> > \n"
+ "> >=20\n"
  "> > I will continue to do testing with and without LUKS. I did /not/\n"
- "> > reboot between tests, but I do start with a fresh postgres database.\n"
- "> > \n"
- "> \n"
- "> Once we trigger once without dm-crypt, dm-crypt is off the hook. Just\n"
- "> to verify, when you say without luks, you mean without any crypto bits\n"
+ "> > reboot between tests, but I do start with a fresh postgres database=\n"
+ "=2E\n"
+ "> >=20\n"
+ ">=20\n"
+ "> Once we trigger once without dm-crypt, dm-crypt is off the hook. Jus=\n"
+ "t\n"
+ "> to verify, when you say without luks, you mean without any crypto bit=\n"
+ "s\n"
  "> in use at all on the filesystems postgres uses?\n"
- "> \n"
- "> Usually the trick to reproducing filesystem corruptions is adding memory\n"
+ ">=20\n"
+ "> Usually the trick to reproducing filesystem corruptions is adding mem=\n"
+ "ory\n"
  "> pressure. The corruption is probably a bad interaction between reads\n"
  "> and writes, and we need to make sure the reads actually happen.\n"
- "> \n"
+ ">=20\n"
  "> http://oss.oracle.com/~mason/pin_ram.c\n"
- "> \n"
+ ">=20\n"
  "> gcc -Wall -o pin_ram pin_ram.c\n"
- "> \n"
+ ">=20\n"
  "> pin_ram -m 80%-of-your-ram-in-mb\n"
  "\n"
  "Implemented in xfstests about 10 years ago:\n"
  "\n"
- "http://git.kernel.org/?p=fs/xfs/xfstests-dev.git;a=blob;f=src/usemem.c;h=b8794a6b209cebf8dbf312a8ef131e2e54b18d29;hb=HEAD\n"
+ "http://git.kernel.org/?p=3Dfs/xfs/xfstests-dev.git;a=3Dblob;f=3Dsrc/use=\n"
+ "mem.c;h=3Db8794a6b209cebf8dbf312a8ef131e2e54b18d29;hb=3DHEAD\n"
  "\n"
  ":P\n"
  "\n"
  "Cheers,\n"
  "\n"
  "Dave.\n"
- "-- \n"
+ "--=20\n"
  "Dave Chinner\n"
  "david@fromorbit.com\n"
  "--\n"
- "To unsubscribe from this list: send the line \"unsubscribe linux-ext4\" in\n"
+ "To unsubscribe from this list: send the line \"unsubscribe linux-ext4\" i=\n"
+ "n\n"
  "the body of a message to majordomo@vger.kernel.org\n"
  More majordomo info at http://vger.kernel.org/majordomo-info.html
 
-668d992ee7e9d7e4abbf18b35378b281d4cffc058b201d970c3cefec3f901dd9
+30e59e19724de253cfcf5f18a7f3ec7d106f00d2e52f3e75a677d7605bd5d711
diff --git a/a/1.txt b/N2/1.txt
index a7ac93c..3bfe223 100644
--- a/a/1.txt
+++ b/N2/1.txt
@@ -63,7 +63,3 @@ Dave.
 -- 
 Dave Chinner
 david@fromorbit.com
---
-To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
-the body of a message to majordomo@vger.kernel.org
-More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/a/content_digest b/N2/content_digest
index 3fe187b..7aab234 100644
--- a/a/content_digest
+++ b/N2/content_digest
@@ -89,10 +89,6 @@
  "Dave.\n"
  "-- \n"
  "Dave Chinner\n"
- "david@fromorbit.com\n"
- "--\n"
- "To unsubscribe from this list: send the line \"unsubscribe linux-ext4\" in\n"
- "the body of a message to majordomo@vger.kernel.org\n"
- More majordomo info at http://vger.kernel.org/majordomo-info.html
+ david@fromorbit.com
 
-668d992ee7e9d7e4abbf18b35378b281d4cffc058b201d970c3cefec3f901dd9
+f893c39e0a10a2cfd9c7d33f99cd56ec799d60d3aa2b4ec194f82c8582c52dd1