[PATCH] Btrfs: relocate csums properly with prealloc extents

linux-btrfs.vger.kernel.org archive mirror

* [PATCH] Btrfs: relocate csums properly with prealloc extents
@ 2013-09-27 13:37 Josef Bacik
  2013-10-04 21:19 ` Johannes Hirte
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Josef Bacik @ 2013-09-27 13:37 UTC (permalink / raw)
  To: linux-btrfs

A user reported a problem where they were getting csum errors when running a
balance and running systemd's journal.  This is because systemd is awesome and
fallocate()'s its log space and writes into it.  Unfortunately we assume that
when we read in all the csums for an extent that they are sequential starting at
the bytenr we care about.  This obviously isn't the case for prealloc extents,
where we could have written to the middle of the prealloc extent only, which
means the csum would be for the bytenr in the middle of our range and not the
front of our range.  Fix this by offsetting the new bytenr we are logging to
based on the original bytenr the csum was for.  With this patch I no longer see
the csum errors I was seeing.  Thanks,

Cc: stable@vger.kernel.org
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
---
 fs/btrfs/relocation.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 5ca7ea9..b7afeaa 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4472,6 +4472,7 @@ int btrfs_reloc_clone_csums(struct inode *inode, u64 file_pos, u64 len)
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	int ret;
 	u64 disk_bytenr;
+	u64 new_bytenr;
 	LIST_HEAD(list);
 
 	ordered = btrfs_lookup_ordered_extent(inode, file_pos);
@@ -4483,13 +4484,24 @@ int btrfs_reloc_clone_csums(struct inode *inode, u64 file_pos, u64 len)
 	if (ret)
 		goto out;
 
-	disk_bytenr = ordered->start;
 	while (!list_empty(&list)) {
 		sums = list_entry(list.next, struct btrfs_ordered_sum, list);
 		list_del_init(&sums->list);
 
-		sums->bytenr = disk_bytenr;
-		disk_bytenr += sums->len;
+		/*
+		 * We need to offset the new_bytenr based on where the csum is.
+		 * We need to do this because we will read in entire prealloc
+		 * extents but we may have written to say the middle of the
+		 * prealloc extent, so we need to make sure the csum goes with
+		 * the right disk offset.
+		 *
+		 * We can do this because the data reloc inode refers strictly
+		 * to the on disk bytes, so we don't have to worry about
+		 * disk_len vs real len like with real inodes since it's all
+		 * disk length.
+		 */
+		new_bytenr = ordered->start + (sums->bytenr - disk_bytenr);
+		sums->bytenr = new_bytenr;
 
 		btrfs_add_ordered_sum(inode, ordered, sums);
 	}
-- 
1.8.3.1
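
The effect of the change can be seen in a small userspace model. This is only a sketch, not kernel code: struct sum_run is a simplified stand-in for struct btrfs_ordered_sum, and the function names remap_sequential() and remap_with_offset() are made up for illustration. The first function mirrors the removed lines (csums packed back to back from the start of the new extent), the second mirrors the added line (each csum keeps its distance from the start of the old extent).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct btrfs_ordered_sum (hypothetical). */
struct sum_run {
	uint64_t bytenr;	/* disk byte offset the checksums describe */
	uint64_t len;		/* number of bytes covered by this run */
};

/* Old behaviour: pack the runs back to back, starting at new_start. */
static void remap_sequential(struct sum_run *runs, size_t n, uint64_t new_start)
{
	uint64_t next = new_start;

	for (size_t i = 0; i < n; i++) {
		runs[i].bytenr = next;
		next += runs[i].len;
	}
}

/* Patched behaviour: preserve each run's offset within the old extent. */
static void remap_with_offset(struct sum_run *runs, size_t n,
			      uint64_t old_start, uint64_t new_start)
{
	for (size_t i = 0; i < n; i++) {
		/* A csum can never describe bytes before the extent start. */
		assert(runs[i].bytenr >= old_start);
		runs[i].bytenr = new_start + (runs[i].bytenr - old_start);
	}
}
```

With a single csum run that sits 65536 bytes into the old extent, remap_sequential() places it at the very start of the new extent, while remap_with_offset() places it 65536 bytes in, which is where the relocated data actually lives.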


* Re: [PATCH] Btrfs: relocate csums properly with prealloc extents
  2013-09-27 13:37 [PATCH] Btrfs: relocate csums properly with prealloc extents Josef Bacik
@ 2013-10-04 21:19 ` Johannes Hirte
  2013-10-23 21:24   ` Hans-Kristian Bakke
  2013-10-24 14:08 ` [PATCH] Btrfs: relocate csums properly with prealloc extents - for 3.12-rc David Sterba
  2013-11-25 16:51 ` [PATCH] Btrfs: relocate csums properly with prealloc extents David Sterba
  2 siblings, 1 reply; 9+ messages in thread
From: Johannes Hirte @ 2013-10-04 21:19 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs

On Fri, 27 Sep 2013 09:37:00 -0400
Josef Bacik <jbacik@fusionio.com> wrote:

> A user reported a problem where they were getting csum errors when
> running a balance and running systemd's journal.  This is because
> systemd is awesome and fallocate()'s its log space and writes into
> it.  Unfortunately we assume that when we read in all the csums for
> an extent that they are sequential starting at the bytenr we care
> about.  This obviously isn't the case for prealloc extents, where we
> could have written to the middle of the prealloc extent only, which
> means the csum would be for the bytenr in the middle of our range and
> not the front of our range.  Fix this by offsetting the new bytenr we
> are logging to based on the original bytenr the csum was for.  With
> this patch I no longer see the csum errors I was seeing.  Thanks,

Any estimate of when this goes upstream? Until it hits Linus' tree it
won't appear in stable. And this seems rather important.

* Re: [PATCH] Btrfs: relocate csums properly with prealloc extents
  2013-10-04 21:19 ` Johannes Hirte
@ 2013-10-23 21:24   ` Hans-Kristian Bakke
  2013-10-23 21:49     ` Hans-Kristian Bakke
  0 siblings, 1 reply; 9+ messages in thread
From: Hans-Kristian Bakke @ 2013-10-23 21:24 UTC (permalink / raw)
  To: linux-btrfs

I was hit by this when trying to rebalance a 16TB RAID10 to 32TB
RAID10 going from 4 to 8 WD SE 4TB drives today. I cannot finish a
rebalance because of failed csum.

[10228.850910] BTRFS info (device sdq): csum failed ino 487 off 65536 csum 2566472073 private 151366068
[10228.850967] BTRFS info (device sdq): csum failed ino 487 off 69632 csum 2566472073 private 3056924305
[10228.850973] BTRFS info (device sdq): csum failed ino 487 off 593920 csum 2566472073 private 906093395
[10228.851004] BTRFS info (device sdq): csum failed ino 487 off 73728 csum 2566472073 private 2680502892
[10228.851014] BTRFS info (device sdq): csum failed ino 487 off 598016 csum 2566472073 private 1940162924
[10228.851029] BTRFS info (device sdq): csum failed ino 487 off 77824 csum 2566472073 private 2939385278
[10228.851051] BTRFS info (device sdq): csum failed ino 487 off 602112 csum 2566472073 private 645310077
[10228.851055] BTRFS info (device sdq): csum failed ino 487 off 81920 csum 2566472073 private 3600741549
[10228.851078] BTRFS info (device sdq): csum failed ino 487 off 86016 csum 2566472073 private 200201951
[10228.851091] BTRFS info (device sdq): csum failed ino 487 off 606208 csum 2566472073 private 1002916440

The system is running a scrub now and I will return with more
details later. I do not think systemd is logging to this volume, but
the scrub will probably show which files are affected.

As this is a very serious issue for those hit by the corruption (it
basically makes it impossible to run a rebalance, with all the
consequences that has), hopefully this will go upstream soon.
I am on kernel 3.11.6, by the way.
Best regards

Hans-Kristian Bakke
Mob: 91 76 17 38


* Re: [PATCH] Btrfs: relocate csums properly with prealloc extents
  2013-10-23 21:24   ` Hans-Kristian Bakke
@ 2013-10-23 21:49     ` Hans-Kristian Bakke
  2013-10-24 16:19       ` Hans-Kristian Bakke
  0 siblings, 1 reply; 9+ messages in thread
From: Hans-Kristian Bakke @ 2013-10-23 21:49 UTC (permalink / raw)
  To: linux-btrfs

OK. btrfs scrub and dmesg are hitting me with lots of unfixable errors,
all in the same file. Example:

[13313.441091] btrfs: unable to fixup (regular) error at logical 560107954176 on dev /dev/sdn
[13321.532223] scrub_handle_errored_block: 1510 callbacks suppressed
[13321.532309] btrfs_dev_stat_print_on_error: 1510 callbacks suppressed
[13321.532314] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40016, gen 0
[13321.532420] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40017, gen 0
[13321.532545] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40018, gen 0
[13321.532605] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40019, gen 0
[13321.533039] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40020, gen 0
[13321.537519] scrub_handle_errored_block: 1508 callbacks suppressed
[13321.537525] btrfs: unable to fixup (regular) error at logical 560630136832 on dev /dev/sdq
[13321.537821] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40021, gen 0
[13321.538081] btrfs: unable to fixup (regular) error at logical 560630140928 on dev /dev/sdq
[13321.538438] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40022, gen 0
[13321.538715] btrfs: unable to fixup (regular) error at logical 560630145024 on dev /dev/sdq
[13321.539016] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40023, gen 0
[13321.539234] btrfs: unable to fixup (regular) error at logical 560630149120 on dev /dev/sdq
[13321.539522] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40024, gen 0
[13321.539739] btrfs: unable to fixup (regular) error at logical 560630153216 on dev /dev/sdq
[13321.540027] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40025, gen 0
[13321.540242] btrfs: unable to fixup (regular) error at logical 560630157312 on dev /dev/sdq
[13321.540620] btrfs: unable to fixup (regular) error at logical 560630161408 on dev /dev/sdq
[13321.541140] btrfs: unable to fixup (regular) error at logical 560630165504 on dev /dev/sdq
[13321.541571] btrfs: unable to fixup (regular) error at logical 560630169600 on dev /dev/sdq
[13321.541931] btrfs: unable to fixup (regular) error at logical 560630173696 on dev /dev/sdq

Luckily all the corruption seems to be in a single very large file,
but in different parts of it on different disks. The file was written
by rtorrent, which has the option "system.file_allocate.set = yes"
configured.
I also have Samba configured with "strict allocate = yes" because it
is recommended for best performance on extent-based filesystems. Does
that mean Samba files are vulnerable to this corruption too?
If so, this could become very ugly very fast on certain systems.
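
For reference, the I/O pattern under discussion (preallocate a file, then write into the middle of it) can be sketched as below. This is an illustration under assumptions, not a reproducer: the function name prealloc_and_write_middle() and the 4096-byte block size are made up, and posix_fallocate() is used for portability where the original report mentions fallocate(). Triggering the bug would additionally need a balance on an unpatched kernel; whether a given application's preallocation setting results in this exact pattern is not established in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Preallocate total_len bytes for fd, then write one 4096-byte block of
 * data at mid_off, leaving the rest of the preallocated range unwritten.
 * Returns 0 on success, -1 on failure.
 */
static int prealloc_and_write_middle(int fd, off_t total_len, off_t mid_off)
{
	unsigned char block[4096];

	/* The written block must lie entirely inside the preallocation. */
	assert(mid_off >= 0 && mid_off + (off_t)sizeof(block) <= total_len);

	/* Reserve space for the whole range without writing any data. */
	if (posix_fallocate(fd, 0, total_len) != 0)
		return -1;

	/* Write only in the middle of the preallocated range. */
	memset(block, 0xab, sizeof(block));
	if (pwrite(fd, block, sizeof(block), mid_off) != (ssize_t)sizeof(block))
		return -1;

	/* Flush so the written block reaches the disk. */
	return fsync(fd);
}
```

Calling it with a 1 MiB length and a 65536-byte offset on a freshly created file leaves a file whose only written block sits in the middle of the preallocated space, which is the layout the patch description says the old relocation code mishandled.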

Best regards

Hans-Kristian Bakke


* Re: [PATCH] Btrfs: relocate csums properly with prealloc extents - for 3.12-rc
  2013-09-27 13:37 [PATCH] Btrfs: relocate csums properly with prealloc extents Josef Bacik
  2013-10-04 21:19 ` Johannes Hirte
@ 2013-10-24 14:08 ` David Sterba
  2013-11-25 16:51 ` [PATCH] Btrfs: relocate csums properly with prealloc extents David Sterba
  2 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2013-10-24 14:08 UTC (permalink / raw)
  To: chris.mason; +Cc: Josef Bacik, linux-btrfs

Hi Chris,

This needs to go to 3.12; the patch is only in btrfs-next. The bug can
happen with systemd journal + balance, and the fix helps quite a lot of
users out there (https://bugzilla.kernel.org/show_bug.cgi?id=63411).

I have cherry-picked the patch to current master; it applies cleanly and
the test btrfs/013 passes. Here's my

Tested-by: David Sterba <dsterba@suse.cz>

david

* Re: [PATCH] Btrfs: relocate csums properly with prealloc extents
  2013-10-23 21:49     ` Hans-Kristian Bakke
@ 2013-10-24 16:19       ` Hans-Kristian Bakke
  0 siblings, 0 replies; 9+ messages in thread
From: Hans-Kristian Bakke @ 2013-10-24 16:19 UTC (permalink / raw)
  To: linux-btrfs

The result of the scrubbing came back today and it was not pretty:
...
scrub done for b64daec7-6c14-4996-94b3-80c6abfa26ce
        scrub started at Wed Oct 23 23:01:22 2013 and finished after 34990 seconds
        total bytes scrubbed: 12.55TB with 3859542 errors
        error details: csum=3859542
        corrected errors: 0, uncorrectable errors: 3859542, unverified errors: 0
---
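
As a rough sense of scale, the error count can be turned into a byte figure. The sketch below assumes each csum error corresponds to one 4096-byte block; that matches the 4096-byte steps between the "off" values in the earlier dmesg output, but it is an assumption, and the helper names are made up. If scrub counts both RAID10 copies of a block separately, the amount of distinct file data affected would be smaller than this figure.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes covered by n_errors csum failures, one failure per block (assumed). */
static uint64_t affected_bytes(uint64_t n_errors, uint64_t block_size)
{
	assert(block_size > 0);
	return n_errors * block_size;
}

/* Convert a byte count to GiB (2^30 bytes). */
static double bytes_to_gib(uint64_t bytes)
{
	return (double)bytes / (1024.0 * 1024.0 * 1024.0);
}
```

For the 3859542 errors reported above and a 4096-byte block, affected_bytes() gives 15808684032 bytes, roughly 14.7 GiB out of the 12.55TB scrubbed.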

Still only two folder structures are affected, but they are seemingly
unrecoverable.
I noticed the mail asking to include the fix in 3.12. Yay!
Until it is included I will have to postpone rebalancing over the
four new drives.


Best regards

Hans-Kristian Bakke


* Re: [PATCH] Btrfs: relocate csums properly with prealloc extents
  2013-09-27 13:37 [PATCH] Btrfs: relocate csums properly with prealloc extents Josef Bacik
  2013-10-04 21:19 ` Johannes Hirte
  2013-10-24 14:08 ` [PATCH] Btrfs: relocate csums properly with prealloc extents - for 3.12-rc David Sterba
@ 2013-11-25 16:51 ` David Sterba
  2013-11-25 21:01   ` Greg KH
  2 siblings, 1 reply; 9+ messages in thread
From: David Sterba @ 2013-11-25 16:51 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, chris.mason, stable

On Fri, Sep 27, 2013 at 09:37:00AM -0400, Josef Bacik wrote:
> A user reported a problem where they were getting csum errors when running a
> balance and running systemd's journal.  This is because systemd is awesome and
> fallocate()'s its log space and writes into it.  Unfortunately we assume that
> when we read in all the csums for an extent that they are sequential starting at
> the bytenr we care about.  This obviously isn't the case for prealloc extents,
> where we could have written to the middle of the prealloc extent only, which
> means the csum would be for the bytenr in the middle of our range and not the
> front of our range.  Fix this by offsetting the new bytenr we are logging to
> based on the original bytenr the csum was for.  With this patch I no longer see
> the csum errors I was seeing.  Thanks,
> 
> Cc: stable@vger.kernel.org

The patch had the right Cc: tag, but I don't see stable in the mail's CC list
(now added by me). I'm afraid this never reached stable, which would explain
why the patch did not end up in 3.12.1.

Stable team, please add this patch to 3.12.x, the commit id is

 4577b014d1bc3db386da3246f625888fc48083a9

thanks,
david

* Re: [PATCH] Btrfs: relocate csums properly with prealloc extents
  2013-11-25 16:51 ` [PATCH] Btrfs: relocate csums properly with prealloc extents David Sterba
@ 2013-11-25 21:01   ` Greg KH
  2013-11-30  7:39     ` Goffredo Baroncelli
  0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2013-11-25 21:01 UTC (permalink / raw)
  To: dsterba, Josef Bacik, linux-btrfs, chris.mason, stable

On Mon, Nov 25, 2013 at 05:51:16PM +0100, David Sterba wrote:
> On Fri, Sep 27, 2013 at 09:37:00AM -0400, Josef Bacik wrote:
> > A user reported a problem where they were getting csum errors when running a
> > balance and running systemd's journal.  This is because systemd is awesome and
> > fallocate()'s its log space and writes into it.  Unfortunately we assume that
> > when we read in all the csums for an extent that they are sequential starting at
> > the bytenr we care about.  This obviously isn't the case for prealloc extents,
> > where we could have written to the middle of the prealloc extent only, which
> > means the csum would be for the bytenr in the middle of our range and not the
> > front of our range.  Fix this by offsetting the new bytenr we are logging to
> > based on the original bytenr the csum was for.  With this patch I no longer see
> > the csum errors I was seeing.  Thanks,
> > 
> > Cc: stable@vger.kernel.org
> 
> The patch had the right CC but I don't see it in the mail's CC list (now
> added by me). I'm afraid that this never reached stable and explains why
> the patch did not end up in 3.12.1.

No, it made it to my list, I was waiting for 3.13-rc1 to come out with
this patch in it before I could queue it up.  Don't worry, it's not
lost.

thanks,

greg k-h

* Re: [PATCH] Btrfs: relocate csums properly with prealloc extents
  2013-11-25 21:01   ` Greg KH
@ 2013-11-30  7:39     ` Goffredo Baroncelli
  0 siblings, 0 replies; 9+ messages in thread
From: Goffredo Baroncelli @ 2013-11-30  7:39 UTC (permalink / raw)
  To: Greg KH; +Cc: dsterba, Josef Bacik, linux-btrfs, chris.mason, stable

On 2013-11-25 22:01, Greg KH wrote:
> On Mon, Nov 25, 2013 at 05:51:16PM +0100, David Sterba wrote:
>> On Fri, Sep 27, 2013 at 09:37:00AM -0400, Josef Bacik wrote:
>>> A user reported a problem where they were getting csum errors when running a
>>> balance and running systemd's journal.  This is because systemd is awesome and
>>> fallocate()'s its log space and writes into it.  Unfortunately we assume that
>>> when we read in all the csums for an extent that they are sequential starting at
>>> the bytenr we care about.  This obviously isn't the case for prealloc extents,
>>> where we could have written to the middle of the prealloc extent only, which
>>> means the csum would be for the bytenr in the middle of our range and not the
>>> front of our range.  Fix this by offsetting the new bytenr we are logging to
>>> based on the original bytenr the csum was for.  With this patch I no longer see
>>> the csum errors I was seeing.  Thanks,
>>>
>>> Cc: stable@vger.kernel.org
>>
>> The patch had the right CC but I don't see it in the mail's CC list (now
>> added by me). I'm afraid that this never reached stable and explains why
>> the patch did not end up in 3.12.1.
> 
> No, it made it to my list, I was waiting for 3.13-rc1 to come out with
> this patch in it before I could queue it up.  Don't worry, it's not
> lost.
> 
The patch landed in 3.12.2.

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
