linux-btrfs.vger.kernel.org archive mirror
* Another checksum error bugreport
@ 2010-09-29 10:48 Sebastian 'gonX' Jensen
  2010-09-29 11:12 ` Francis Galiegue
  2010-09-29 17:35 ` Lubos Kolouch
  0 siblings, 2 replies; 9+ messages in thread
From: Sebastian 'gonX' Jensen @ 2010-09-29 10:48 UTC (permalink / raw)
  To: linux-btrfs

Hey guys,

Today I experienced my first checksum error just out of the blue - and
it's not just the 'csum + 1 = private' issue, it's a completely
different one. Because of this, I am unable to retrieve the data off
the drive, even with nodatasum enabled - I simply get an I/O error.
Here's the dmesg output:

[149423.845177] btrfs: setting nodatasum
[149423.850339] Btrfs detected SSD devices, enabling SSD mode
[149432.094728] btrfs csum failed ino 259 off 26701824 csum 3875867041
private 371726550
[149432.117938] btrfs csum failed ino 259 off 26701824 csum 3875867041
private 371726550
[149432.118340] btrfs csum failed ino 259 off 26701824 csum 3875867041
private 371726550
[149432.125671] btrfs csum failed ino 259 off 26701824 csum 3875867041
private 371726550
[149432.126075] btrfs csum failed ino 259 off 26701824 csum 3875867041
private 371726550
[149432.135671] btrfs csum failed ino 259 off 26701824 csum 3875867041
private 371726550
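
For anyone comparing reports like this one, here is a minimal sketch (not a btrfs tool) that pulls the fields out of such dmesg lines and checks whether a failure is the 'csum + 1 = private' pattern. The 4 KiB block size and the helper names are assumptions made for illustration.

```python
import re

# Matches one "csum failed" entry; \s+ tolerates the mail client's line wrap
# between the csum value and the word "private".
CSUM_RE = re.compile(
    r"btrfs csum failed ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<csum>\d+)\s+private (?P<private>\d+)"
)

BLOCK_SIZE = 4096  # assumption: 4 KiB blocks


def parse_csum_failures(log_text):
    """Return one dict of integer fields per 'csum failed' entry."""
    return [
        {name: int(value) for name, value in m.groupdict().items()}
        for m in CSUM_RE.finditer(log_text)
    ]


def describe(failure):
    """Derive a few facts that help compare one failure with other reports."""
    return {
        "block_index": failure["off"] // BLOCK_SIZE,
        "block_aligned": failure["off"] % BLOCK_SIZE == 0,
        "off_by_one": failure["private"] == failure["csum"] + 1,
    }


log = """\
[149432.094728] btrfs csum failed ino 259 off 26701824 csum 3875867041
private 371726550
[149432.117938] btrfs csum failed ino 259 off 26701824 csum 3875867041
private 371726550
"""

failures = parse_csum_failures(log)
print(len(failures), describe(failures[0]))
```

All six entries above name the same inode and offset, so they are repeated reads of one block, and the two checksum values differ by far more than one.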

I would really like to have the files on the drive retrieved in their
entirety, but if that is not possible then that is also OK. Consider
this a bugreport and a question on how to retrieve the data now.
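
One generic way to salvage everything except the failing block is to copy the file block by block and zero-fill whatever returns an I/O error. This is only a sketch under assumptions: the function name and the 4 KiB block size are made up for illustration, it is not a btrfs-specific recovery method, and nobody in this thread has tested it against this failure.

```python
import errno
import os

BLOCK = 4096  # assumed block size; the failing offset above is a multiple of it


def salvage_copy(src, dst, block=BLOCK):
    """Copy src to dst block by block, zero-filling blocks that fail with EIO.

    Returns the list of byte offsets that could not be read.
    """
    bad = []
    fd = os.open(src, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        with open(dst, "wb") as out:
            for off in range(0, size, block):
                want = min(block, size - off)
                try:
                    data = os.pread(fd, want, off)
                except OSError as exc:
                    if exc.errno != errno.EIO:
                        raise  # only I/O errors are treated as bad blocks
                    data = b""
                    bad.append(off)
                # Pad failed (or short) reads so offsets in dst match src.
                out.write(data.ljust(want, b"\0"))
    finally:
        os.close(fd)
    return bad
```

Calling `salvage_copy("/mnt/data/file", "/tmp/file.salvaged")` (both paths hypothetical) would return the offsets that were zero-filled, which can then be compared with the `off` values in dmesg.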

Thanks,
Sebastian J.


* Re: Another checksum error bugreport
  2010-09-29 10:48 Another checksum error bugreport Sebastian 'gonX' Jensen
@ 2010-09-29 11:12 ` Francis Galiegue
  2010-09-29 11:37   ` Sebastian 'gonX' Jensen
  2010-09-29 17:35 ` Lubos Kolouch
  1 sibling, 1 reply; 9+ messages in thread
From: Francis Galiegue @ 2010-09-29 11:12 UTC (permalink / raw)
  To: Sebastian 'gonX' Jensen; +Cc: linux-btrfs

On Wed, Sep 29, 2010 at 12:48, Sebastian 'gonX' Jensen
<gonx@overclocked.net> wrote:
> Hey guys,
>
> Today I experienced my first checksum error just out of the blue - and
> it's not just the 'csum + 1 = private' issue, it's a completely
> different one. Because of this, I am unable to retrieve the data off
> the drive, even with nodatasum enabled - I simply get an I/O error.
> Here's the dmesg output:
>
> [149423.845177] btrfs: setting nodatasum
> [149423.850339] Btrfs detected SSD devices, enabling SSD mode
> [149432.094728] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> [149432.117938] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> [149432.118340] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> [149432.125671] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> [149432.126075] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> [149432.135671] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
>
> I would really like to have the files on the drive retrieved in their
> entirety, but if that is not possible then that is also OK. Consider
> this a bugreport and a question on how to retrieve the data now.
>

Which kernel is that?

A patch made it into 2.6.36-rc6 which fixed an important bug in the
block layer code, wherein write requests and discard requests could be
merged, turning the merged write requests into discard requests.

And you use an SSD... Hmmm.

-- 
Francis Galiegue, fgaliegue@gmail.com
"It seems obvious [...] that at least some 'business intelligence'
tools invest so much intelligence on the business side that they have
nothing left for generating SQL queries" (Stéphane Faroult, in "The
Art of SQL", ISBN 0-596-00894-5)
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Another checksum error bugreport
  2010-09-29 11:12 ` Francis Galiegue
@ 2010-09-29 11:37   ` Sebastian 'gonX' Jensen
  2010-09-29 12:50     ` Francis Galiegue
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian 'gonX' Jensen @ 2010-09-29 11:37 UTC (permalink / raw)
  To: Francis Galiegue; +Cc: linux-btrfs

On 29 September 2010 13:12, Francis Galiegue <fgaliegue@gmail.com> wrote:
> On Wed, Sep 29, 2010 at 12:48, Sebastian 'gonX' Jensen
> <gonx@overclocked.net> wrote:
>> Hey guys,
>>
>> Today I experienced my first checksum error just out of the blue - and
>> it's not just the 'csum + 1 = private' issue, it's a completely
>> different one. Because of this, I am unable to retrieve the data off
>> the drive, even with nodatasum enabled - I simply get an I/O error.
>> Here's the dmesg output:
>>
>> [149423.845177] btrfs: setting nodatasum
>> [149423.850339] Btrfs detected SSD devices, enabling SSD mode
>> [149432.094728] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>> [149432.117938] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>> [149432.118340] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>> [149432.125671] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>> [149432.126075] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>> [149432.135671] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>>
>> I would really like to have the files on the drive retrieved in their
>> entirety, but if that is not possible then that is also OK. Consider
>> this a bugreport and a question on how to retrieve the data now.
>>
>
> Which kernel is that?
It was one of the 2.6.35 versions from the Ubuntu repository. I'm
running Ubuntu 10.04 Server.

> A patch made it into 2.6.36-rc6 which fixed an important bug in the
> block layer code, wherein write requests and discard requests could be
> merged, turning the merged write requests into discard requests.
>
> And you use an SSD... Hmmm.
>
> --
> Francis Galiegue, fgaliegue@gmail.com
> "It seems obvious [...] that at least some 'business intelligence'
> tools invest so much intelligence on the business side that they have
> nothing left for generating SQL queries" (Stéphane Faroult, in "The
> Art of SQL", ISBN 0-596-00894-5)
>

Well, overall it seems to work now. I downgraded to the .32 version in
the Ubuntu 10.04 repository and now I do not get any errors from
dmesg. I don't know what caused it, but I think I'll stick to stable
kernel versions instead. Since this is a production system it's not
very easy for me to troubleshoot this any further if it requires a
reboot. I can unmount and mount the drive from time to time, but not
reboot. If you want btrfs-debug-tree output or something, let me know.

Regards,
Sebastian J.

* Re: Another checksum error bugreport
  2010-09-29 11:37   ` Sebastian 'gonX' Jensen
@ 2010-09-29 12:50     ` Francis Galiegue
  2010-09-29 13:15       ` cwillu
  0 siblings, 1 reply; 9+ messages in thread
From: Francis Galiegue @ 2010-09-29 12:50 UTC (permalink / raw)
  To: Sebastian 'gonX' Jensen; +Cc: linux-btrfs

On Wed, Sep 29, 2010 at 13:37, Sebastian 'gonX' Jensen
<gonx@overclocked.net> wrote:
[...]
>>
>> Which kernel is that?
> It was one of the 2.6.35 versions from the Ubuntu repository. I'm
> running Ubuntu 10.04 Server.
>

Since 2.6.32 works, you should report that bug to Ubuntu.

The upstream commit is f281fb5fe54e15a7ab802945e42f8e24fceb56b2,
pasted below, merged Sep 25:

----
commit f281fb5fe54e15a7ab802945e42f8e24fceb56b2
Author: Adrian Hunter <adrian.hunter@nokia.com>
Date:   Sat Sep 25 12:42:55 2010 +0200

    block: prevent merges of discard and write requests

    Add logic to prevent two I/O requests being merged if
    only one of them is a discard.  Ditto secure discard.

    Without this fix, it is possible for write requests
    to transform into discard requests.  For example:

      Submit bio 1 to discard 8 sectors from sector n
      Submit bio 2 to write 8 sectors from sector n + 16
      Submit bio 3 to write 8 sectors from sector n + 8

    Bio 1 becomes request 1.  Bio 2 becomes request 2.
    Bio 3 is merged with request 2, and then subsequently
    request 2 is merged with request 1 resulting in just
    one I/O request which discards all 24 sectors.

    Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>

    (Moved the checks above the position checks /Jens)

    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
----
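
To make the example in the commit message concrete, here is a toy simulation of the three bios. It is only a sketch: `Request`, `can_merge`, `submit` and `run` are names invented for illustration, not kernel functions, and the merge logic is a deliberate simplification of what the block layer does.

```python
from dataclasses import dataclass


@dataclass
class Request:
    op: str     # "write" or "discard"
    start: int  # first sector
    end: int    # one past the last sector


def can_merge(a, b, check_discard):
    """True if b starts exactly where a ends and the merge is allowed."""
    if check_discard and (a.op == "discard") != (b.op == "discard"):
        return False  # the fix: never merge a discard with a non-discard
    return a.end == b.start


def submit(queue, bio, check_discard):
    """Queue a bio, then merge adjacent requests until nothing changes."""
    queue.append(bio)
    merged = True
    while merged:
        merged = False
        queue.sort(key=lambda r: r.start)
        # Walk from the highest sectors down, so bio 3 meets request 2 first.
        for i in reversed(range(len(queue) - 1)):
            a, b = queue[i], queue[i + 1]
            if can_merge(a, b, check_discard):
                # The merged request keeps the operation of the one it joins.
                queue[i:i + 2] = [Request(a.op, a.start, b.end)]
                merged = True
                break
    return queue


def run(check_discard, n=0):
    """Replay the three bios from the commit message, starting at sector n."""
    queue = []
    submit(queue, Request("discard", n, n + 8), check_discard)
    submit(queue, Request("write", n + 16, n + 24), check_discard)
    submit(queue, Request("write", n + 8, n + 16), check_discard)
    return queue


print("without check:", run(check_discard=False))
print("with check:   ", run(check_discard=True))
```

Without the check, the queue collapses into a single discard covering all 24 sectors; with it, the discard of 8 sectors and the merged 16-sector write stay separate.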

-- 
Francis Galiegue, fgaliegue@gmail.com
"It seems obvious [...] that at least some 'business intelligence'
tools invest so much intelligence on the business side that they have
nothing left for generating SQL queries" (Stéphane Faroult, in "The
Art of SQL", ISBN 0-596-00894-5)

* Re: Another checksum error bugreport
  2010-09-29 12:50     ` Francis Galiegue
@ 2010-09-29 13:15       ` cwillu
  2010-09-29 14:31         ` Sebastian 'gonX' Jensen
  0 siblings, 1 reply; 9+ messages in thread
From: cwillu @ 2010-09-29 13:15 UTC (permalink / raw)
  To: Francis Galiegue; +Cc: Sebastian 'gonX' Jensen, linux-btrfs

>>> Which kernel is that?
>> It was one of the 2.6.35 versions from the Ubuntu repository. I'm
>> running Ubuntu 10.04 Server.
>>
>
> Since 2.6.32 works, you should report that bug to Ubuntu.

Alternatively, retest using ubuntu's mainline kernel ppa
(http://kernel.ubuntu.com/~kernel-ppa/mainline/), which doesn't
include any ubuntu patches.


* Re: Another checksum error bugreport
  2010-09-29 13:15       ` cwillu
@ 2010-09-29 14:31         ` Sebastian 'gonX' Jensen
  2010-09-29 15:05           ` cwillu
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian 'gonX' Jensen @ 2010-09-29 14:31 UTC (permalink / raw)
  To: cwillu; +Cc: Francis Galiegue, linux-btrfs

On 29 September 2010 15:15, cwillu <cwillu@cwillu.com> wrote:
>>>> Which kernel is that?
>>> It was one of the 2.6.35 versions from the Ubuntu repository. I'm
>>> running Ubuntu 10.04 Server.
>>>
>>
>> Since 2.6.32 works, you should report that bug to Ubuntu.
>
> Alternatively, retest using ubuntu's mainline kernel ppa
> (http://kernel.ubuntu.com/~kernel-ppa/mainline/), which doesn't
> include any ubuntu patches.
>

I used the mainline ppa when I used 2.6.35. That is where I had the
issue. Forgive me for saying it was in the repository, but I did not
realize they were not the same thing.

Thanks,
Sebastian J.


* Re: Another checksum error bugreport
  2010-09-29 14:31         ` Sebastian 'gonX' Jensen
@ 2010-09-29 15:05           ` cwillu
  0 siblings, 0 replies; 9+ messages in thread
From: cwillu @ 2010-09-29 15:05 UTC (permalink / raw)
  To: Sebastian 'gonX' Jensen; +Cc: Francis Galiegue, linux-btrfs

On Wed, Sep 29, 2010 at 8:31 AM, Sebastian 'gonX' Jensen
<gonx@overclocked.net> wrote:
> On 29 September 2010 15:15, cwillu <cwillu@cwillu.com> wrote:
>>>>> Which kernel is that?
>>>> It was one of the 2.6.35 versions from the Ubuntu repository. I'm
>>>> running Ubuntu 10.04 Server.
>>>>
>>>
>>> Since 2.6.32 works, you should report that bug to Ubuntu.
>>
>> Alternatively, retest using ubuntu's mainline kernel ppa
>> (http://kernel.ubuntu.com/~kernel-ppa/mainline/), which doesn't
>> include any ubuntu patches.
>
> I used the mainline ppa when I used 2.6.35. That is where I had the
> issue. Forgive me for saying it was in the repository, but I did not
> realize they were not the same thing.

Well, it is a repository, but that one specifically doesn't include
Ubuntu patches, as opposed to what's in the default repositories.
Given that you used the mainline kernels, it's unlikely to be an
Ubuntu bug.


* Re: Another checksum error bugreport
  2010-09-29 10:48 Another checksum error bugreport Sebastian 'gonX' Jensen
  2010-09-29 11:12 ` Francis Galiegue
@ 2010-09-29 17:35 ` Lubos Kolouch
  2010-09-29 18:36   ` Sebastian 'gonX' Jensen
  1 sibling, 1 reply; 9+ messages in thread
From: Lubos Kolouch @ 2010-09-29 17:35 UTC (permalink / raw)
  To: linux-btrfs

Sebastian 'gonX' Jensen, Wed, 29 Sep 2010 12:48:56 +0200:

> Hey guys,
> 
> Today I experienced my first checksum error just out of the blue - and
> it's not just the 'csum + 1 = private' issue, it's a completely
> different one. Because of this, I am unable to retrieve the data off the
> drive, even with nodatasum enabled - I simply get an I/O error. Here's
> the dmesg output:
> 
> [149423.845177] btrfs: setting nodatasum
> [149423.850339] Btrfs detected SSD devices, enabling SSD mode
> [149432.094728] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> [149432.117938] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> [149432.118340] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> [149432.125671] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> [149432.126075] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> [149432.135671] btrfs csum failed ino 259 off 26701824 csum 3875867041
> private 371726550
> 
> I would really like to have the files on the drive retrieved in their
> entirety, but if that is not possible then that is also OK. Consider
> this a bugreport and a question on how to retrieve the data now.
> 
> Thanks,
> Sebastian J.

I have seen this now too, on a USB flash drive (yes, with LUKS on it) that
has always been correctly unmounted and luksClosed. In fact, I created
the fs on it a couple of days ago and now I cannot read one file with the
same error.

I do not need to recover the file; it's just that you are not the only
one with this error.

Lubos



* Re: Another checksum error bugreport
  2010-09-29 17:35 ` Lubos Kolouch
@ 2010-09-29 18:36   ` Sebastian 'gonX' Jensen
  0 siblings, 0 replies; 9+ messages in thread
From: Sebastian 'gonX' Jensen @ 2010-09-29 18:36 UTC (permalink / raw)
  To: Lubos Kolouch; +Cc: linux-btrfs

On 29 September 2010 19:35, Lubos Kolouch <lubos.kolouch@gmail.com> wrote:
> Sebastian 'gonX' Jensen, Wed, 29 Sep 2010 12:48:56 +0200:
>
>> Hey guys,
>>
>> Today I experienced my first checksum error just out of the blue - and
>> it's not just the 'csum + 1 = private' issue, it's a completely
>> different one. Because of this, I am unable to retrieve the data off the
>> drive, even with nodatasum enabled - I simply get an I/O error. Here's
>> the dmesg output:
>>
>> [149423.845177] btrfs: setting nodatasum
>> [149423.850339] Btrfs detected SSD devices, enabling SSD mode
>> [149432.094728] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>> [149432.117938] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>> [149432.118340] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>> [149432.125671] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>> [149432.126075] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>> [149432.135671] btrfs csum failed ino 259 off 26701824 csum 3875867041
>> private 371726550
>>
>> I would really like to have the files on the drive retrieved in their
>> entirety, but if that is not possible then that is also OK. Consider
>> this a bugreport and a question on how to retrieve the data now.
>>
>> Thanks,
>> Sebastian J.
>
> I have seen this now too, on a USB flash drive (yes, with LUKS on it) that
> has always been correctly unmounted and luksClosed. In fact, I created
> the fs on it a couple of days ago and now I cannot read one file with the
> same error.
>
> I do not need to recover the file; it's just that you are not the only
> one with this error.
>
> Lubos
>

Good to hear I am not alone with this. It seemed more like a fluke
than an actual issue, since I have no issues reading the drive in
2.6.32.

Regards,
Sebastian J.

Thread overview: 9+ messages
2010-09-29 10:48 Another checksum error bugreport Sebastian 'gonX' Jensen
2010-09-29 11:12 ` Francis Galiegue
2010-09-29 11:37   ` Sebastian 'gonX' Jensen
2010-09-29 12:50     ` Francis Galiegue
2010-09-29 13:15       ` cwillu
2010-09-29 14:31         ` Sebastian 'gonX' Jensen
2010-09-29 15:05           ` cwillu
2010-09-29 17:35 ` Lubos Kolouch
2010-09-29 18:36   ` Sebastian 'gonX' Jensen
