diff for duplicates of <1291755258-sup-8760@think> diff --git a/a/1.txt b/N1/1.txt index 19012d0..f47b290 100644 --- a/a/1.txt +++ b/N1/1.txt @@ -1,53 +1,66 @@ Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500: -> On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason <chris.mason@oracle.com> wrote: +> On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason <chris.mason@oracle.com> = +wrote: > > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500: -> >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason <chris.mason@oracle.com> wrote: +> >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason <chris.mason@oracle.co= +m> wrote: > >> > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500: -> >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason <chris.mason@oracle.com> wrote: -> >> >> >> postgresql errors. Typically, header corruption but from the limited -> >> >> >> visibility I've had into this via strace, what I see is zeroed pages +> >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason <chris.mason@oracl= +e.com> wrote: +> >> >> >> postgresql errors. Typically, header corruption but from the= + limited +> >> >> >> visibility I've had into this via strace, what I see is zero= +ed pages > >> >> >> where there shouldn't be. > >> >> > -> >> >> > This sounds a lot like a bug higher up than dm-crypt. Â Zeros tend to -> >> >> > come from some piece of code explicitly filling a page with zeros, and -> >> >> > that often happens in the corner cases for O_DIRECT and a few other +> >> >> > This sounds a lot like a bug higher up than dm-crypt. =C2=A0Z= +eros tend to +> >> >> > come from some piece of code explicitly filling a page with z= +eros, and +> >> >> > that often happens in the corner cases for O_DIRECT and a few= + other > >> >> > places in the filesystem. > >> >> > > >> >> > Have you tried triggering this with a regular block device? 
> >> >> -> >> >> I just tried the whole set of tests, but with /dev/sdb directly (as +> >> >> I just tried the whole set of tests, but with /dev/sdb directly= + (as > >> >> ext4) without any crypt-y bits. -> >> >> It takes more iterations but out of 6 tests I had one failure: same +> >> >> It takes more iterations but out of 6 tests I had one failure: = +same > >> >> type of thing, 'invalid page header in block ....'. > >> >> -> >> >> I can't guarantee that it is a full-page of zeroes, just what I saw +> >> >> I can't guarantee that it is a full-page of zeroes, just what I= + saw > >> >> from the (limited) stracing I did. > >> > > >> > Fantastic. Now for our usual suspects: > >> > -> >> > 1) Is postgres using O_DIRECT? Â If yes, please turn it off +> >> > 1) Is postgres using O_DIRECT? =C2=A0If yes, please turn it off > >> > >> According to strace, O_DIRECT didn't show up once during the test. > >> -> >> > 2) Is postgres allocating sparse files? Â If yes, please have it fully +> >> > 2) Is postgres allocating sparse files? =C2=A0If yes, please hav= +e it fully > >> > allocate the file instead. > >> -> >> That's a tough one. I don't think postgresql does that, but I'm not an +> >> That's a tough one. I don't think postgresql does that, but I'm no= +t an > >> expert here. > > > > Ok, please compare du -k and du -k --apparent-size for each of the > > files involved in the postgres run. -> +>=20 > Because this is all done in a transaction (which fails), and because > the table is a TEMPORARY table, there *are* no files once the > transaction fails because postgresql unlinks them. -> +>=20 > I can modify the test to use real tables and do things outside of a > transaction, however. That would really help. -> +>=20 > I was using fdatasync[1] and now I'm using sync. I'm on 9 iterations > without a failure (on ext4 - no crypt). Theoretically, these settings > only make a difference in the event of a crash. 
However, could they diff --git a/a/content_digest b/N1/content_digest index 680615e..065d3f5 100644 --- a/a/content_digest +++ b/N1/content_digest @@ -34,55 +34,68 @@ "\00:1\0" "b\0" "Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500:\n" - "> On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason <chris.mason@oracle.com> wrote:\n" + "> On Tue, Dec 7, 2010 at 2:41 PM, Chris Mason <chris.mason@oracle.com> =\n" + "wrote:\n" "> > Excerpts from Jon Nelson's message of 2010-12-07 15:25:47 -0500:\n" - "> >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason <chris.mason@oracle.com> wrote:\n" + "> >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason <chris.mason@oracle.co=\n" + "m> wrote:\n" "> >> > Excerpts from Jon Nelson's message of 2010-12-07 14:34:40 -0500:\n" - "> >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason <chris.mason@oracle.com> wrote:\n" - "> >> >> >> postgresql errors. Typically, header corruption but from the limited\n" - "> >> >> >> visibility I've had into this via strace, what I see is zeroed pages\n" + "> >> >> On Tue, Dec 7, 2010 at 12:52 PM, Chris Mason <chris.mason@oracl=\n" + "e.com> wrote:\n" + "> >> >> >> postgresql errors. Typically, header corruption but from the=\n" + " limited\n" + "> >> >> >> visibility I've had into this via strace, what I see is zero=\n" + "ed pages\n" "> >> >> >> where there shouldn't be.\n" "> >> >> >\n" - "> >> >> > This sounds a lot like a bug higher up than dm-crypt. \303\202\302\240Zeros tend to\n" - "> >> >> > come from some piece of code explicitly filling a page with zeros, and\n" - "> >> >> > that often happens in the corner cases for O_DIRECT and a few other\n" + "> >> >> > This sounds a lot like a bug higher up than dm-crypt. 
=C2=A0Z=\n" + "eros tend to\n" + "> >> >> > come from some piece of code explicitly filling a page with z=\n" + "eros, and\n" + "> >> >> > that often happens in the corner cases for O_DIRECT and a few=\n" + " other\n" "> >> >> > places in the filesystem.\n" "> >> >> >\n" "> >> >> > Have you tried triggering this with a regular block device?\n" "> >> >>\n" - "> >> >> I just tried the whole set of tests, but with /dev/sdb directly (as\n" + "> >> >> I just tried the whole set of tests, but with /dev/sdb directly=\n" + " (as\n" "> >> >> ext4) without any crypt-y bits.\n" - "> >> >> It takes more iterations but out of 6 tests I had one failure: same\n" + "> >> >> It takes more iterations but out of 6 tests I had one failure: =\n" + "same\n" "> >> >> type of thing, 'invalid page header in block ....'.\n" "> >> >>\n" - "> >> >> I can't guarantee that it is a full-page of zeroes, just what I saw\n" + "> >> >> I can't guarantee that it is a full-page of zeroes, just what I=\n" + " saw\n" "> >> >> from the (limited) stracing I did.\n" "> >> >\n" "> >> > Fantastic. Now for our usual suspects:\n" "> >> >\n" - "> >> > 1) Is postgres using O_DIRECT? \303\202\302\240If yes, please turn it off\n" + "> >> > 1) Is postgres using O_DIRECT? =C2=A0If yes, please turn it off\n" "> >>\n" "> >> According to strace, O_DIRECT didn't show up once during the test.\n" "> >>\n" - "> >> > 2) Is postgres allocating sparse files? \303\202\302\240If yes, please have it fully\n" + "> >> > 2) Is postgres allocating sparse files? =C2=A0If yes, please hav=\n" + "e it fully\n" "> >> > allocate the file instead.\n" "> >>\n" - "> >> That's a tough one. I don't think postgresql does that, but I'm not an\n" + "> >> That's a tough one. 
I don't think postgresql does that, but I'm no=\n" + "t an\n" "> >> expert here.\n" "> >\n" "> > Ok, please compare du -k and du -k --apparent-size for each of the\n" "> > files involved in the postgres run.\n" - "> \n" + ">=20\n" "> Because this is all done in a transaction (which fails), and because\n" "> the table is a TEMPORARY table, there *are* no files once the\n" "> transaction fails because postgresql unlinks them.\n" - "> \n" + ">=20\n" "> I can modify the test to use real tables and do things outside of a\n" "> transaction, however.\n" "\n" "That would really help.\n" "\n" - "> \n" + ">=20\n" "> I was using fdatasync[1] and now I'm using sync. I'm on 9 iterations\n" "> without a failure (on ext4 - no crypt). Theoretically, these settings\n" "> only make a difference in the event of a crash. However, could they\n" @@ -93,4 +106,4 @@ "\n" -chris -53a938c52d65b4f948a545149149baa8ced4cab3f1f79428661113f4837530dc +9b1d1c4146f6c2f0230a4141fdd9457272f6f77fd760033b523c0f04ccf7f1a7
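Note on the N1 hunks above: almost every changed line differs from the a/ copy only in transfer encoding. The trailing `=` at line ends, `=C2=A0`, and `=20` are quoted-printable escapes. The following is a minimal sketch, assuming Python's standard `quopri` module, of how those fragments decode back to the text shown on the `-` side. The helper name `decode_qp` is hypothetical, and the sample strings are copied from the hunks above.

```python
import quopri


def decode_qp(raw: bytes) -> str:
    """Decode a quoted-printable fragment and interpret the result as UTF-8."""
    return quopri.decodestring(raw).decode("utf-8")


# Soft line break: a trailing "=" joins the next physical line back onto this one.
wrapped = (
    b"> >> On Tue, Dec 7, 2010 at 2:02 PM, Chris Mason <chris.mason@oracle.co=\n"
    b"m> wrote:\n"
)
joined = decode_qp(wrapped)

# "=C2=A0" is the UTF-8 byte pair for U+00A0 (no-break space).
nbsp_line = decode_qp(
    b"> >> > 1) Is postgres using O_DIRECT? =C2=A0If yes, please turn it off\n"
)

# "=20" protects a trailing space from being stripped in transit.
trailing = decode_qp(b">=20\n")

if __name__ == "__main__":
    print(repr(joined))
    print(repr(nbsp_line))
    print(repr(trailing))
```

Decoded this way, the N1 body should carry the same words as the a/ copy; the remaining difference is the no-break space itself, which the a/ copy stores in damaged form.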
diff --git a/a/1.txt b/N2/1.txt index 19012d0..8df5e65 100644 --- a/a/1.txt +++ b/N2/1.txt @@ -8,7 +8,7 @@ Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500: > >> >> >> visibility I've had into this via strace, what I see is zeroed pages > >> >> >> where there shouldn't be. > >> >> > -> >> >> > This sounds a lot like a bug higher up than dm-crypt. Â Zeros tend to +> >> >> > This sounds a lot like a bug higher up than dm-crypt. Zeros tend to > >> >> > come from some piece of code explicitly filling a page with zeros, and > >> >> > that often happens in the corner cases for O_DIRECT and a few other > >> >> > places in the filesystem. @@ -25,11 +25,11 @@ Excerpts from Jon Nelson's message of 2010-12-07 15:48:58 -0500: > >> > > >> > Fantastic. Now for our usual suspects: > >> > -> >> > 1) Is postgres using O_DIRECT? Â If yes, please turn it off +> >> > 1) Is postgres using O_DIRECT? If yes, please turn it off > >> > >> According to strace, O_DIRECT didn't show up once during the test. > >> -> >> > 2) Is postgres allocating sparse files? Â If yes, please have it fully +> >> > 2) Is postgres allocating sparse files? If yes, please have it fully > >> > allocate the file instead. > >> > >> That's a tough one. I don't think postgresql does that, but I'm not an diff --git a/a/content_digest b/N2/content_digest index 680615e..c3fa573 100644 --- a/a/content_digest +++ b/N2/content_digest @@ -15,7 +15,6 @@ "ref\01291751698-sup-9297@think\0" "ref\0AANLkTin79GzUbfuZNKyTtqcyoUSO9AJimO77_ZOvqggH@mail.gmail.com\0" "ref\01291754340-sup-1631@think\0" - "ref\0 AANLkTim8uCmFK=LjkMmq_1O0KE3AiN_7g41AO0woxMv7@mail.gmail.com\0" "ref\0AANLkTim8uCmFK=LjkMmq_1O0KE3AiN_7g41AO0woxMv7@mail.gmail.com\0" "From\0Chris Mason <chris.mason@oracle.com>\0" "Subject\0Re: hunt for 2.6.37 dm-crypt+ext4 corruption? 
(was: Re: dm-crypt barrier support is effective)\0" @@ -43,7 +42,7 @@ "> >> >> >> visibility I've had into this via strace, what I see is zeroed pages\n" "> >> >> >> where there shouldn't be.\n" "> >> >> >\n" - "> >> >> > This sounds a lot like a bug higher up than dm-crypt. \303\202\302\240Zeros tend to\n" + "> >> >> > This sounds a lot like a bug higher up than dm-crypt. \302\240Zeros tend to\n" "> >> >> > come from some piece of code explicitly filling a page with zeros, and\n" "> >> >> > that often happens in the corner cases for O_DIRECT and a few other\n" "> >> >> > places in the filesystem.\n" @@ -60,11 +59,11 @@ "> >> >\n" "> >> > Fantastic. Now for our usual suspects:\n" "> >> >\n" - "> >> > 1) Is postgres using O_DIRECT? \303\202\302\240If yes, please turn it off\n" + "> >> > 1) Is postgres using O_DIRECT? \302\240If yes, please turn it off\n" "> >>\n" "> >> According to strace, O_DIRECT didn't show up once during the test.\n" "> >>\n" - "> >> > 2) Is postgres allocating sparse files? \303\202\302\240If yes, please have it fully\n" + "> >> > 2) Is postgres allocating sparse files? \302\240If yes, please have it fully\n" "> >> > allocate the file instead.\n" "> >>\n" "> >> That's a tough one. I don't think postgresql does that, but I'm not an\n" @@ -93,4 +92,4 @@ "\n" -chris -53a938c52d65b4f948a545149149baa8ced4cab3f1f79428661113f4837530dc +03722d5e2be4f4fe768f5922110cd28c9b74e11583c771628b0b5038841c6ad9
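Note on the N2 hunks above: the only body change is `\303\202\302\240` becoming `\302\240` (shown as `Â ` versus a plain space-like character in 1.txt). That pattern is what results when the UTF-8 bytes of a no-break space are misread as a single-byte charset and then encoded as UTF-8 a second time. The sketch below reproduces the byte sequences under the assumption that the misreading was Latin-1; the page itself does not say how the a/ copy was produced, and the helper names `octal_escape` and `repair` are hypothetical.

```python
NBSP = "\u00a0"  # no-break space

# Correct encoding, as seen on the "+" side of the content_digest hunk.
correct = NBSP.encode("utf-8")

# Assumed failure mode: the two UTF-8 bytes are read as Latin-1 characters
# ("Â" followed by a no-break space) and then encoded as UTF-8 again.
mojibake = correct.decode("latin-1")
double_encoded = mojibake.encode("utf-8")


def octal_escape(data: bytes) -> str:
    """Render bytes as backslash-octal escapes, the style used in content_digest."""
    return "".join(f"\\{b:03o}" for b in data)


def repair(text: str) -> str:
    """Undo one round of UTF-8-read-as-Latin-1 damage (assumes that is what happened)."""
    return text.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    print(octal_escape(correct))
    print(octal_escape(double_encoded))
    print(repr(repair("\u00c2\u00a0Zeros tend to")))
```

`repair` raises `UnicodeEncodeError` or `UnicodeDecodeError` on text that was not damaged in this particular way, so it is only suitable for strings already known to be double-encoded.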