* Possible ext3 corruption with 1K block size
@ 2008-10-15 3:24 Andrey Borzenkov
2008-10-15 12:49 ` Eric Sandeen
0 siblings, 1 reply; 9+ messages in thread
From: Andrey Borzenkov @ 2008-10-15 3:24 UTC (permalink / raw)
To: linux-ext4; +Cc: Linux Kernel Mailing List
There is a long-standing open bug report on Mandriva which is currently
believed to have its root cause in file system corruption. It shows itself
as RPM DB corruption (at least, there is no other known method to trigger
it). So far all reported cases happened on filesystems with a 1K block size
and stopped when the RPM DB was moved to a filesystem with a 4K block size.
There are similar RH reports as well.
Here are references:
https://qa.mandriva.com/show_bug.cgi?id=32547
This one is rather long. Interesting bits are probably around
https://qa.mandriva.com/show_bug.cgi?id=32547#c177
https://qa.mandriva.com/show_bug.cgi?id=32547#c148 (many users reporting
dumpe2fs)
https://bugzilla.redhat.com/show_bug.cgi?id=230362
https://bugzilla.redhat.com/show_bug.cgi?id=375931
https://bugzilla.redhat.com/show_bug.cgi?id=305301
The Mandriva bugzilla also mentions this mail from Stephen Tweedie
http://lkml.org/lkml/2007/9/18/232
which indicates some issues with 1K blocks, but according to the last comment:
https://qa.mandriva.com/show_bug.cgi?id=32547#c300
the problem is still present in 2.6.27 (at least it was present in -rc6).
There was also a kernel bug report, http://bugzilla.kernel.org/show_bug.cgi?id=11564,
but in that case it was identified as a hardware issue.
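Since every reported case involves a 1K block size, a quick way to see whether a given RPM DB sits on such a filesystem is to ask the kernel for the block size of the filesystem holding it. The following is a minimal sketch, not part of the original report: the helper names are hypothetical, `/var/lib/rpm` is assumed to be the RPM DB location, and on ext3 the value from `os.statvfs` is expected (but not verified here) to match the "Block size" line that dumpe2fs prints.

```python
import os


def block_size(path):
    """Return the block size in bytes reported for the filesystem holding `path`."""
    return os.statvfs(path).f_bsize


def is_small_block(path, threshold=1024):
    """True if the filesystem holding `path` uses blocks of `threshold` bytes or fewer."""
    return block_size(path) <= threshold


if __name__ == "__main__":
    # Assumed default RPM DB location; fall back to "/" if it does not exist.
    rpm_db = "/var/lib/rpm"
    target = rpm_db if os.path.exists(rpm_db) else "/"
    print(target, block_size(target), "small blocks:", is_small_block(target))
```

A result of 1024 for the RPM DB directory would put the machine in the affected group described above; 4096 would match the reported workaround.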
* Re: Possible ext3 corruption with 1K block size
2008-10-15 3:24 Possible ext3 corruption with 1K block size Andrey Borzenkov
@ 2008-10-15 12:49 ` Eric Sandeen
2008-10-15 14:24 ` Andrey Borzenkov
0 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2008-10-15 12:49 UTC (permalink / raw)
To: Andrey Borzenkov; +Cc: linux-ext4, Linux Kernel Mailing List
Andrey Borzenkov wrote:
> There is long standing open bug report on Mandriva which is currently
> believed to have root cause in file system corruption. It shows itself
> in RPM DB corruption (at least, there is no other known method to trigger
> it). So far all reported cases happened on filesystem with 1K block size
> and stopped when RPM DB was moved to FS with 4K block size.
>
> There are also similar RH reports as well.
>
> Here are references:
>
> https://qa.mandriva.com/show_bug.cgi?id=32547
>
> This one is rather long.
yep, unfortunately IIRC most of the bug is "me too's" and "how do I do
the workaround" :)
> Interesting bits are probably around
>
> https://qa.mandriva.com/show_bug.cgi?id=32547#c177
> https://qa.mandriva.com/show_bug.cgi?id=32547#c148 (many users reporting
> dumpe2fs)
>
> https://bugzilla.redhat.com/show_bug.cgi?id=230362
> https://bugzilla.redhat.com/show_bug.cgi?id=375931
> https://bugzilla.redhat.com/show_bug.cgi?id=305301
>
> The Mandriva bugzilla also mentions this mail from Stephen Tweedie
> http://lkml.org/lkml/2007/9/18/232
I don't think this is related, in the end... there was some possibility
of corruption from that, but I think it's doubtful it'd hit 1k block
filesystems more, and in any case, the corruption has been seen since
then if I read it right.
> which indicates some issues with 1K blocks, but according to last comment:
> https://qa.mandriva.com/show_bug.cgi?id=32547#c300
>
> it is still present in 2.6.27 (at least was present on -rc6)
>
> There was a kernel bug report http://bugzilla.kernel.org/show_bug.cgi?id=11564,
> but in this case it was identified as hardware issue.
My kingdom for a testcase... does anyone have simple steps to reproduce
this? Or do they all start with "install mandriva on a 1k block size
system?" :)
-Eric
* Re: Possible ext3 corruption with 1K block size
2008-10-15 12:49 ` Eric Sandeen
@ 2008-10-15 14:24 ` Andrey Borzenkov
2008-10-15 14:43 ` Eric Sandeen
0 siblings, 1 reply; 9+ messages in thread
From: Andrey Borzenkov @ 2008-10-15 14:24 UTC (permalink / raw)
To: Eric Sandeen, pterjan; +Cc: linux-ext4, Linux Kernel Mailing List
On Wednesday 15 October 2008, Eric Sandeen wrote:
> Andrey Borzenkov wrote:
> > There is long standing open bug report on Mandriva which is currently
> > believed to have root cause in file system corruption. It shows itself
> > in RPM DB corruption (at least, there is no other known method to trigger
> > it). So far all reported cases happened on filesystem with 1K block size
> > and stopped when RPM DB was moved to FS with 4K block size.
> >
> > There are also similar RH reports as well.
> >
> > Here are references:
> >
> > https://qa.mandriva.com/show_bug.cgi?id=32547
> >
> > This one is rather long.
>
> yep, unfortunately IIRC most of the bug is "me too's" and "how do I do
> the workaround" :)
>
> > Interesting bits are probably around
> >
> > https://qa.mandriva.com/show_bug.cgi?id=32547#c177
> > https://qa.mandriva.com/show_bug.cgi?id=32547#c148 (many users reporting
> > dumpe2fs)
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=230362
> > https://bugzilla.redhat.com/show_bug.cgi?id=375931
> > https://bugzilla.redhat.com/show_bug.cgi?id=305301
> >
> > The Mandriva bugzilla also mentions this mail from Stephen Tweedie
> > http://lkml.org/lkml/2007/9/18/232
>
> I don't think this is related, in the end... there was some possibility
> of corruption from that, but I think it's doubtful it'd hit 1k block
> filesystems more, and in any case, the corruption has been seen since
> then if I read it right.
>
> > which indicates some issues with 1K blocks, but according to last comment:
> > https://qa.mandriva.com/show_bug.cgi?id=32547#c300
> >
> > it is still present in 2.6.27 (at least was present on -rc6)
> >
> > There was a kernel bug report http://bugzilla.kernel.org/show_bug.cgi?id=11564,
> > but in this case it was identified as hardware issue.
>
> My kingdom for a testcase... does anyone have simple steps to reproduce
> this? Or do they all start with "install mandriva on a 1k block size
> system?" :)
>
Maybe RH will do? :)
As indicated by the last comment, Pascal has some ways to trigger it; I
forgot to Cc him initially, so I am doing it now.
* Re: Possible ext3 corruption with 1K block size
2008-10-15 14:24 ` Andrey Borzenkov
@ 2008-10-15 14:43 ` Eric Sandeen
2008-10-16 13:47 ` Pascal Terjan
0 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2008-10-15 14:43 UTC (permalink / raw)
To: Andrey Borzenkov; +Cc: pterjan, linux-ext4, Linux Kernel Mailing List
Andrey Borzenkov wrote:
> On Wednesday 15 October 2008, Eric Sandeen wrote:
>> My kingdom for a testcase... does anyone have simple steps to reproduce
>> this? Or do they all start with "install mandriva on a 1k block size
>> system?" :)
>>
>
> May be RH will do? :)
I did try a 1k-block root fs Fedora install, and didn't see any problems...
> As indicated by last comment, Pascal has some ways to trigger it; I
> forgot to Cc to him initially; doing it now.
Ok, good deal.
-Eric
* Re: Possible ext3 corruption with 1K block size
2008-10-15 14:43 ` Eric Sandeen
@ 2008-10-16 13:47 ` Pascal Terjan
2008-10-16 14:38 ` Eric Sandeen
0 siblings, 1 reply; 9+ messages in thread
From: Pascal Terjan @ 2008-10-16 13:47 UTC (permalink / raw)
To: Eric Sandeen
Cc: Andrey Borzenkov, pterjan, linux-ext4, Linux Kernel Mailing List
On Wednesday 15 October 2008 at 09:43 -0500, Eric Sandeen wrote:
> Andrey Borzenkov wrote:
> > On Wednesday 15 October 2008, Eric Sandeen wrote:
>
>
> >> My kingdom for a testcase... does anyone have simple steps to reproduce
> >> this? Or do they all start with "install mandriva on a 1k block size
> >> system?" :)
> >>
> >
> > May be RH will do? :)
>
> I did try a 1k-block root fs Fedora install, and didn't see any problems...
>
> > As indicated by last comment, Pascal has some ways to trigger it; I
> > forgot to Cc to him initially; doing it now.
>
> Ok, good deal.
>
On my test machine I can reproduce it easily: run rpm --rebuilddb, and if
the db is not detected as corrupted yet, it will be after installing a few
packages (tested again with 2.6.27).
If I do the rebuilddb on 2.6.17 and then reboot into a recent kernel,
then I can install/uninstall thousands of packages without any
corruption.
I wanted to try a few things, including copying the partition to a file
and trying to reproduce it in a VM.
Given how I can reproduce and repair it, I can even write a bisecting
script, which would basically be an initscript that does the following:
if on test kernel
    - rebuild the db
    - install 10 rpms
    - remove the 10 rpms
    - check the db
    - do the good/bad
    - reboot onto 2.6.17
else if on 2.6.17
    - rebuild the db
    - build the kernel
    - reboot on test kernel
and let it run :)
All I need is to find some time with nothing more urgent...
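The branch logic of the bisecting initscript outlined above can be sketched as follows. This is only an illustration under stated assumptions, not Pascal's script: the function names are hypothetical, the steps are returned as descriptions rather than executed, and "do the good/bad" is assumed to mean marking the current commit with `git bisect good` or `git bisect bad`.

```python
import platform

KNOWN_GOOD = "2.6.17"  # kernel reported in the thread as not corrupting the db


def on_known_good(release, known_good=KNOWN_GOOD):
    """True if `release` is the known-good kernel (e.g. "2.6.17" or "2.6.17-5mdv")."""
    return release == known_good or release.startswith((known_good + ".", known_good + "-"))


def plan(release, known_good=KNOWN_GOOD):
    """Return the ordered steps for the current boot of the bisect loop."""
    if on_known_good(release, known_good):
        # Safe kernel: repair the db, then prepare the next candidate kernel.
        return ["rebuild the db", "build the kernel", "reboot on test kernel"]
    # Test kernel: try to trigger the corruption, record the result, go back.
    return [
        "rebuild the db",
        "install 10 rpms",
        "remove the 10 rpms",
        "check the db",
        "do the good/bad",
        "reboot onto " + known_good,
    ]


def verdict(db_is_corrupt):
    """Map the db check result to the (assumed) git bisect verdict."""
    return "bad" if db_is_corrupt else "good"


if __name__ == "__main__":
    # Print the plan for the kernel this script is currently running on.
    for step in plan(platform.release()):
        print("-", step)
```

Keeping the decision separate from the commands makes the loop easy to dry-run before wiring each step to a real command such as `rpm --rebuilddb`.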
* Re: Possible ext3 corruption with 1K block size
2008-10-16 13:47 ` Pascal Terjan
@ 2008-10-16 14:38 ` Eric Sandeen
2008-10-16 14:40 ` Pascal Terjan
2008-12-18 18:12 ` Jan Kara
0 siblings, 2 replies; 9+ messages in thread
From: Eric Sandeen @ 2008-10-16 14:38 UTC (permalink / raw)
To: Pascal Terjan
Cc: Andrey Borzenkov, pterjan, linux-ext4, Linux Kernel Mailing List
Pascal Terjan wrote:
> On Wednesday 15 October 2008 at 09:43 -0500, Eric Sandeen wrote:
>> Andrey Borzenkov wrote:
>>> On Wednesday 15 October 2008, Eric Sandeen wrote:
>>
>>>> My kingdom for a testcase... does anyone have simple steps to reproduce
>>>> this? Or do they all start with "install mandriva on a 1k block size
>>>> system?" :)
>>>>
>>> May be RH will do? :)
>> I did try a 1k-block root fs Fedora install, and didn't see any problems...
>>
>>> As indicated by last comment, Pascal has some ways to trigger it; I
>>> forgot to Cc to him initially; doing it now.
>> Ok, good deal.
>>
>
> On my test machine I reproduce it easily : rpm --rebuilddb and if the db
> is not detected to be corrupted yet it will be after installing a few
> packages (tested again with 2.6.27).
>
> If I do the rebuilddb on a 2.6.17 and then reboot on a recent kernel,
> then I can install/uninstall thousands of packages without any
> corruption.
So it seems to be the database rebuilding, under a recent kernel, which
causes the problem? Installing under a recent kernel is ok, as long as
the db was created on an older kernel?
Ok that's a good clue...
-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Possible ext3 corruption with 1K block size
2008-10-16 14:38 ` Eric Sandeen
@ 2008-10-16 14:40 ` Pascal Terjan
2008-12-18 18:12 ` Jan Kara
1 sibling, 0 replies; 9+ messages in thread
From: Pascal Terjan @ 2008-10-16 14:40 UTC (permalink / raw)
To: Eric Sandeen
Cc: Andrey Borzenkov, pterjan, linux-ext4, Linux Kernel Mailing List
On Thursday 16 October 2008 at 09:38 -0500, Eric Sandeen wrote:
> Pascal Terjan wrote:
> > On Wednesday 15 October 2008 at 09:43 -0500, Eric Sandeen wrote:
> >> Andrey Borzenkov wrote:
> >>> On Wednesday 15 October 2008, Eric Sandeen wrote:
> >>
> >>>> My kingdom for a testcase... does anyone have simple steps to reproduce
> >>>> this? Or do they all start with "install mandriva on a 1k block size
> >>>> system?" :)
> >>>>
> >>> May be RH will do? :)
> >> I did try a 1k-block root fs Fedora install, and didn't see any problems...
> >>
> >>> As indicated by last comment, Pascal has some ways to trigger it; I
> >>> forgot to Cc to him initially; doing it now.
> >> Ok, good deal.
> >>
> >
> > On my test machine I reproduce it easily : rpm --rebuilddb and if the db
> > is not detected to be corrupted yet it will be after installing a few
> > packages (tested again with 2.6.27).
> >
> > If I do the rebuilddb on a 2.6.17 and then reboot on a recent kernel,
> > then I can install/uninstall thousands of packages without any
> > corruption.
>
> so it seems to be the database rebuilding, under a recent kernel, which
> causes the problem? installing under a recent kernel is ok, as long as
> the db was created on an older kernel?
Yes
* Re: Possible ext3 corruption with 1K block size
2008-10-16 14:38 ` Eric Sandeen
2008-10-16 14:40 ` Pascal Terjan
@ 2008-12-18 18:12 ` Jan Kara
2008-12-18 18:20 ` Eric Sandeen
1 sibling, 1 reply; 9+ messages in thread
From: Jan Kara @ 2008-12-18 18:12 UTC (permalink / raw)
To: Eric Sandeen
Cc: Pascal Terjan, Andrey Borzenkov, pterjan, linux-ext4,
Linux Kernel Mailing List
Hi Eric,
> Pascal Terjan wrote:
> > On Wednesday 15 October 2008 at 09:43 -0500, Eric Sandeen wrote:
> >> Andrey Borzenkov wrote:
> >>> On Wednesday 15 October 2008, Eric Sandeen wrote:
> >>
> >>>> My kingdom for a testcase... does anyone have simple steps to reproduce
> >>>> this? Or do they all start with "install mandriva on a 1k block size
> >>>> system?" :)
> >>>>
> >>> May be RH will do? :)
> >> I did try a 1k-block root fs Fedora install, and didn't see any problems...
> >>
> >>> As indicated by last comment, Pascal has some ways to trigger it; I
> >>> forgot to Cc to him initially; doing it now.
> >> Ok, good deal.
> >>
> >
> > On my test machine I reproduce it easily : rpm --rebuilddb and if the db
> > is not detected to be corrupted yet it will be after installing a few
> > packages (tested again with 2.6.27).
> >
> > If I do the rebuilddb on a 2.6.17 and then reboot on a recent kernel,
> > then I can install/uninstall thousands of packages without any
> > corruption.
>
> so it seems to be the database rebuilding, under a recent kernel, which
> causes the problem? installing under a recent kernel is ok, as long as
> the db was created on an older kernel?
>
> Ok that's a good clue...
Have you been able to track this down? Anything interesting?
Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
end of thread, other threads:[~2008-12-18 18:20 UTC | newest]
Thread overview: 9+ messages
2008-10-15 3:24 Possible ext3 corruption with 1K block size Andrey Borzenkov
2008-10-15 12:49 ` Eric Sandeen
2008-10-15 14:24 ` Andrey Borzenkov
2008-10-15 14:43 ` Eric Sandeen
2008-10-16 13:47 ` Pascal Terjan
2008-10-16 14:38 ` Eric Sandeen
2008-10-16 14:40 ` Pascal Terjan
2008-12-18 18:12 ` Jan Kara
2008-12-18 18:20 ` Eric Sandeen