* btrfs send and kernel 3.17
@ 2014-10-06 18:50 David Arendt
2014-10-06 19:06 ` Chris Mason
0 siblings, 1 reply; 15+ messages in thread
From: David Arendt @ 2014-10-06 18:50 UTC (permalink / raw)
To: linux-btrfs
Hi,
After upgrading to kernel 3.17 btrfs send has stopped working.
ERROR: send ioctl failed with -5: Input/output error
The following message is printed by kernel:
[75322.782197] BTRFS error (device sda2): did not find backref in
send_root. inode=461, offset=0, disk_byte=1094713344 found extent=1094713344
btrfs inspect-internal inode-resolve -v 461 /u00/root.snapshot returns:
/var/log/emerge-fetch.log
After removing this file, the error moves on to another file.
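Editorial sketch, not part of the original message: the kernel line above carries the inode number that was then passed to inode-resolve. Assuming the message keeps the format quoted here, the fields can be pulled out as below. The function names are hypothetical, and the command is only built, not run.

```python
import re

# Matches the "did not find backref in send_root" message as quoted in this report.
# \s+ is used between words because the log line is wrapped in the mail.
BACKREF_RE = re.compile(
    r"did not find backref in\s+send_root\.\s*"
    r"inode=(?P<inode>\d+),\s*offset=(?P<offset>\d+),\s*"
    r"disk_byte=(?P<disk_byte>\d+)\s+found\s+extent=(?P<extent>\d+)"
)


def parse_backref_errors(log_text):
    """Return one dict of integer fields per backref error found in log_text."""
    return [
        {name: int(value) for name, value in match.groupdict().items()}
        for match in BACKREF_RE.finditer(log_text)
    ]


def inode_resolve_command(inode, subvolume_path):
    """Build (but do not execute) the inode-resolve invocation used in the report."""
    return ["btrfs", "inspect-internal", "inode-resolve", "-v", str(inode), subvolume_path]


sample = (
    "[75322.782197] BTRFS error (device sda2): did not find backref in\n"
    "send_root. inode=461, offset=0, disk_byte=1094713344 found extent=1094713344\n"
)
errors = parse_backref_errors(sample)
commands = [inode_resolve_command(e["inode"], "/u00/root.snapshot") for e in errors]
```

The resulting argument list could be handed to subprocess.run on a machine where the snapshot is mounted.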
btrfs scrub output:
scrub status for bc31b068-2c36-4ff2-ac5c-7ce55af5371d
scrub started at Mon Oct 6 19:49:25 2014 and finished after 1748
seconds
total bytes scrubbed: 94.21GiB with 0 errors
Other than the btrfs send problem, the filesystem works normally.
Is this a bug in btrfs-send or is my filesystem corrupted and should be
restored from backup ?
Please tell me if I can do anything else to help debugging this issue.
Thanks in advance,
David Arendt
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-06 18:50 btrfs send and kernel 3.17 David Arendt
@ 2014-10-06 19:06 ` Chris Mason
2014-10-06 19:48 ` David Arendt
2014-10-06 20:51 ` David Arendt
0 siblings, 2 replies; 15+ messages in thread
From: Chris Mason @ 2014-10-06 19:06 UTC (permalink / raw)
To: David Arendt; +Cc: linux-btrfs
On Mon, Oct 6, 2014 at 2:50 PM, David Arendt <admin@prnet.org> wrote:
> After upgrading to kernel 3.17 btrfs send has stopped working.
>
> ERROR: send ioctl failed with -5: Input/output error
>
> [ ... ]
>
> Is this a bug in btrfs-send or is my filesystem corrupted and should
> be restored from backup ?
>
> Please tell me if I can do anything else to help debugging this issue.
Which kernel did you upgrade from? I don't think we have changes in 3.17 that should impact this.
Is emerge-fetch.log just a simple append-only log file?
-chris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-06 19:06 ` Chris Mason
@ 2014-10-06 19:48 ` David Arendt
2014-10-06 20:51 ` David Arendt
1 sibling, 0 replies; 15+ messages in thread
From: David Arendt @ 2014-10-06 19:48 UTC (permalink / raw)
To: Chris Mason; +Cc: linux-btrfs
I have upgraded from kernel 3.16.3. emerge-fetch.log is a simple append-only log file.
As I deleted the affected files one by one, the error moved on to emerge.log, mysql.log, freshclam.log and main.cvd from clamav. At that point, I stopped deleting.
On 10/06/2014 09:06 PM, Chris Mason wrote:
> Which kernel did you upgrade from? I don't think we have changes in
> 3.17 that should impact this.
>
> Is emerge-fetch.log just a simple append-only log file?
>
> -chris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-06 19:06 ` Chris Mason
2014-10-06 19:48 ` David Arendt
@ 2014-10-06 20:51 ` David Arendt
2014-10-06 22:22 ` Chris Mason
1 sibling, 1 reply; 15+ messages in thread
From: David Arendt @ 2014-10-06 20:51 UTC (permalink / raw)
To: Chris Mason; +Cc: linux-btrfs
I just tried downgrading to 3.16.3 again. In 3.16.3 btrfs send is working without any problem. Afterwards I upgraded again to 3.17 and the problem reappeared. So the problem seems to be kernel version related.
On 10/06/2014 09:06 PM, Chris Mason wrote:
> Which kernel did you upgrade from? I don't think we have changes in
> 3.17 that should impact this.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-06 20:51 ` David Arendt
@ 2014-10-06 22:22 ` Chris Mason
0 siblings, 0 replies; 15+ messages in thread
From: Chris Mason @ 2014-10-06 22:22 UTC (permalink / raw)
To: David Arendt; +Cc: linux-btrfs
On Mon, Oct 6, 2014 at 4:51 PM, David Arendt <admin@prnet.org> wrote:
> I just tried downgrading to 3.16.3 again. In 3.16.3 btrfs send is
> working without any problem. Afterwards I upgraded again to 3.17 and
> the problem reappeared. So the problem seems to be kernel version
> related.
[ backref errors during btrfs-send ]
Ok then, our list of suspects is pretty short. Can you easily build test kernels?
I'd like to try reverting this commit:
51f395ad4058883e4273b02fdebe98072dbdc0d2
-chris
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <DC336054-F307-4A86-AD6D-204E700DE9AA@prnet.org>]
* Re: btrfs send and kernel 3.17
[not found] <DC336054-F307-4A86-AD6D-204E700DE9AA@prnet.org>
@ 2014-10-07 13:19 ` Chris Mason
2014-10-07 20:45 ` David Arendt
0 siblings, 1 reply; 15+ messages in thread
From: Chris Mason @ 2014-10-07 13:19 UTC (permalink / raw)
To: David Arendt; +Cc: linux-btrfs
On Tue, Oct 7, 2014 at 1:25 AM, David Arendt <admin@prnet.org> wrote:
> I did a revert of this commit. After creating a snapshot, the
> filesystem was no longer usable, even with kernel 3.16.3 (crashes 10
> seconds after mount without error message). Maybe there was some
> previous damage that just appeared now. This evening, I will restore
> from backup and report back.
>
> On October 7, 2014 12:22:11 AM CEST, Chris Mason <clm@fb.com> wrote:
>> Ok then, our list of suspects is pretty short. Can you easily build
>> test kernels?
>>
>> I'd like to try reverting this commit:
>>
>> 51f395ad4058883e4273b02fdebe98072dbdc0d2
Oh no! Reverting this definitely should not have caused corruptions, so I think the problem was already there. Do you still have the filesystem image?
Please let us know if you're missing files off the backup, we'll help pull them out.
-chris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-07 13:19 ` Chris Mason
@ 2014-10-07 20:45 ` David Arendt
2014-10-07 20:46 ` Chris Mason
0 siblings, 1 reply; 15+ messages in thread
From: David Arendt @ 2014-10-07 20:45 UTC (permalink / raw)
To: Chris Mason; +Cc: linux-btrfs
On 10/07/2014 03:19 PM, Chris Mason wrote:
> Oh no! Reverting this definitely should not have caused corruptions,
> so I think the problem was already there. Do you still have the
> filesystem image?
>
> Please let us know if you're missing files off the backup, we'll help
> pull them out.
>
> -chris
Due to space constraints, it was not possible to take an image of the corrupted filesystem. As I do backups daily, and the problems occurred 5 hours after backup, no file was lost. Thanks for offering your help.
In 4 days I will do some send tests on the newly created filesystem and report back.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-07 20:45 ` David Arendt
@ 2014-10-07 20:46 ` Chris Mason
2014-10-12 11:11 ` David Arendt
0 siblings, 1 reply; 15+ messages in thread
From: Chris Mason @ 2014-10-07 20:46 UTC (permalink / raw)
To: David Arendt; +Cc: linux-btrfs
On Tue, Oct 7, 2014 at 4:45 PM, David Arendt <admin@prnet.org> wrote:
> Due to space constraints, it was not possible to take an image of the
> corrupted filesystem. As I do backups daily, and the problems
> occurred 5 hours after backup, no file was lost. Thanks for offering
> your help. In 4 days I will do some send tests on the newly created
> filesystem and report back.
Ok, if you have the kernel messages from the panic, please send them along.
-chris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-07 20:46 ` Chris Mason
@ 2014-10-12 11:11 ` David Arendt
2014-10-12 15:24 ` john terragon
2014-10-13 17:22 ` Rich Freeman
0 siblings, 2 replies; 15+ messages in thread
From: David Arendt @ 2014-10-12 11:11 UTC (permalink / raw)
To: Chris Mason; +Cc: linux-btrfs
This weekend I finally had time to try btrfs send again on the newly created fs. Now I am running into another problem:
btrfs send returns: ERROR: send ioctl failed with -12: Cannot allocate memory
In dmesg I see only the following output:
parent transid verify failed on 21325004800 wanted 2620 found 8325
On 10/07/2014 10:46 PM, Chris Mason wrote:
> Ok, if you have the kernel messages from the panic, please send them
> along.
>
> -chris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-12 11:11 ` David Arendt
@ 2014-10-12 15:24 ` john terragon
2014-10-12 21:35 ` David Arendt
2014-10-13 17:22 ` Rich Freeman
1 sibling, 1 reply; 15+ messages in thread
From: john terragon @ 2014-10-12 15:24 UTC (permalink / raw)
To: David Arendt; +Cc: Chris Mason, Btrfs BTRFS
Hi.
I just wanted to "confirm David's story" so to speak :)
-kernel 3.17-rc7 (didn't bother to compile 3.17 as there weren't any btrfs fixes, I think)
-btrfs-progs 3.16.2 (also compiled from source, so no distribution-specific patches)
-fresh fs
-I get the same two errors David got (first I got the I/O error one and then the memory allocation one)
-plus now when I ls -la the fs top volume this is what I get
drwxrwsr-x 1 root staff 30 Sep 11 16:15 home
d????????? ? ? ? ? ? home-backup
drwxr-xr-x 1 root root 250 Oct 10 15:37 root
d????????? ? ? ? ? ? root-backup
drwxr-xr-x 1 root root 88 Sep 15 16:02 vms
drwxr-xr-x 1 root root 88 Sep 15 16:02 vms-backup
yes, the question marks on those two *-backup snapshots are really there. I can't access the snapshots, I can't delete them, I can't do anything with them.
-btrfs check segfaults
-the events that led to this situation are these:
1) btrfs su snap -r root root-backup
2) send |receive (the entire root-backup, not an incremental send)
immediate I/O error
3) move on to home: btrfs su snap -r home home-backup
4) send|receive (again not an incremental send)
everything goes well (!)
5) retry with root: btrfs su snap -r root root-backup
6) send|receive
and it goes seemingly well
7) apt-get dist-upgrade just to modify root and try an incremental send
8) reboot after the dist-upgrade
9) ls -la the fs top volume: first I get the memory allocation error and after that any ls -la gives the output I pasted above. (notice that besides the ls -la, the two snapshots were not touched in any way since the two send|receive)
A few final notes. I haven't tried send/receive in a while (they were unreliable) so I can't tell which is the last version they worked for me (well, no version actually :) ).
I've never had any problem with just snapshots. I make them regularly, I use them, I modify them and I've never had one problem (with 3.17 too, it's just send/receive that murders them).
Best regards
John
^ permalink raw reply [flat|nested] 15+ messages in thread
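Editorial sketch, not part of the original message: the snapshot and send|receive steps listed in John's report can be written out as explicit command lines. The mount points /mnt/top and /mnt/backup are hypothetical, as are the function names. The code only builds the commands as text so they can be reviewed before anyone runs them against a real filesystem.

```python
import shlex


def snapshot_cmd(source, snapshot):
    """Command for a read-only snapshot ("btrfs su snap -r" spelled out)."""
    return ["btrfs", "subvolume", "snapshot", "-r", source, snapshot]


def send_receive_pipeline(snapshot, dest_dir, parent=None):
    """Shell pipeline text for a full send, or an incremental one if parent is given."""
    send = ["btrfs", "send"]
    if parent is not None:
        send += ["-p", parent]  # incremental send relative to an earlier snapshot
    send.append(snapshot)
    receive = ["btrfs", "receive", dest_dir]
    return shlex.join(send) + " | " + shlex.join(receive)


# Steps 1 and 2 of the report, with hypothetical paths.
step1 = shlex.join(snapshot_cmd("/mnt/top/root", "/mnt/top/root-backup"))
step2 = send_receive_pipeline("/mnt/top/root-backup", "/mnt/backup")
# The incremental send planned in step 7, with a hypothetical second snapshot.
step7 = send_receive_pipeline(
    "/mnt/top/root-backup-2", "/mnt/backup", parent="/mnt/top/root-backup"
)
```

Given the corruption described in this thread, printing the commands first and running them only on a test filesystem seems the safer choice.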
* Re: btrfs send and kernel 3.17
2014-10-12 15:24 ` john terragon
@ 2014-10-12 21:35 ` David Arendt
2014-10-13 4:11 ` David Arendt
0 siblings, 1 reply; 15+ messages in thread
From: David Arendt @ 2014-10-12 21:35 UTC (permalink / raw)
To: john terragon; +Cc: Chris Mason, Btrfs BTRFS
Just to let you know, I just tried an ls -l on 2 machines running kernel 3.17 and btrfs-progs 3.16.2.
Here is my ls -l output:
Machine 1:
ls: cannot access root.20141009.000503.backup: Cannot allocate memory
total 0
d????????? ? ? ? ? ? root.20141009.000503.backup
drwxr-xr-x 1 root root 182 Oct 7 20:35 root.20141012.095526.backup
drwxr-xr-x 1 root root 182 Oct 7 20:35 root.20141012.000503.backup
drwxr-xr-x 1 root root 182 Oct 7 20:35 root.20141011.000502.backup
drwxr-xr-x 1 root root 182 Oct 7 20:35 root.20141010.000502.backup
root.20141009.000503.backup is not deletable.
Machine 2:
ls: cannot access root.20141006.003239.backup: Cannot allocate memory
ls: cannot access root.20141007.001616.backup: Cannot allocate memory
ls: cannot access root.20141008.000501.backup: Cannot allocate memory
ls: cannot access root.20141009.052436.backup: Cannot allocate memory
total 0
d????????? ? ? ? ? ? root.20141009.052436.backup
d????????? ? ? ? ? ? root.20141008.000501.backup
d????????? ? ? ? ? ? root.20141007.001616.backup
d????????? ? ? ? ? ? root.20141006.003239.backup
drwxr-xr-x 1 root root 232 Aug 3 15:00 root.20140925.001125.backup
drwxr-xr-x 1 root root 232 Aug 3 15:00 root.20140924.001017.backup
drwxr-xr-x 1 root root 232 Aug 3 15:00 root.20140923.001008.backup
drwxr-xr-x 1 root root 232 Aug 3 15:00 root.20140922.001836.backup
drwxr-xr-x 1 root root 232 Aug 3 15:00 root.20140921.001029.backup
drwxr-xr-x 1 root root 232 Aug 3 15:00 root.20140920.001020.backup
The ? ones are also not deletable.
Both machines are giving transid verify failed errors.
I checked my logfiles and this problem never occurred with previous kernel versions. On machine 1, earlier corruption can also be ruled out, as this filesystem was created with btrfs-progs 3.16.2 under kernel 3.17.
On 10/12/2014 05:24 PM, john terragon wrote:
> -plus now when I ls -la the fs top volume this is what I get
> [ ... ]
> yes, the question marks on those two *-backup snapshots are really
> there. I can't access the snapshots, I can't delete them, I can't do
> anything with them.
^ permalink raw reply [flat|nested] 15+ messages in thread
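Editorial sketch, not part of the original message: ls prints "d?????????" for an entry whose name can be listed but whose metadata cannot be read. A small script can find such entries by calling lstat on every name and recording the failures. The function name is hypothetical; the demo at the bottom uses a temporary directory only to show the healthy case.

```python
import os
import tempfile


def find_inaccessible_entries(directory):
    """Return (name, errno, message) for every entry whose lstat() fails."""
    broken = []
    for name in sorted(os.listdir(directory)):
        try:
            os.lstat(os.path.join(directory, name))
        except OSError as exc:
            # On the affected machines this would be expected to report
            # errno 12 (ENOMEM, "Cannot allocate memory"), as in the ls output.
            broken.append((name, exc.errno, exc.strerror))
    return broken


# Demo on a healthy temporary directory: nothing should be reported.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "root.backup"))
    healthy = find_inaccessible_entries(top)
```

On an affected filesystem, the returned list would be expected to name the same snapshots that ls marks with question marks.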
* Re: btrfs send and kernel 3.17
2014-10-12 21:35 ` David Arendt
@ 2014-10-13 4:11 ` David Arendt
2014-10-13 12:40 ` john terragon
0 siblings, 1 reply; 15+ messages in thread
From: David Arendt @ 2014-10-13 4:11 UTC (permalink / raw)
To: john terragon; +Cc: Chris Mason, Btrfs BTRFS
Some more info I thought of.
For me, the corruption problem seems not to be send related but snapshot creation related. On machine 2 send was never used. However both filesystems are stored on SSDs (of different brands). Another filesystem stored on a normal HDD didn't experience the problem. Maybe this is pure coincidence and has nothing to do with whether it is on SSD or HDD.
Another thing I noticed is that for me, the problem only seems to occur for root subvolumes with many small files. I have no root subvolumes on HDD, so it might not be SSD related.
On 10/12/2014 11:35 PM, David Arendt wrote:
> Just to let you know, I just tried an ls -l on 2 machines running kernel
> 3.17 and btrfs-progs 3.16.2.
> [ ... ]
> Both machines are giving transid verify failed errors.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-13 4:11 ` David Arendt
@ 2014-10-13 12:40 ` john terragon
2014-10-13 15:40 ` David Arendt
0 siblings, 1 reply; 15+ messages in thread
From: john terragon @ 2014-10-13 12:40 UTC (permalink / raw)
To: David Arendt; +Cc: Chris Mason, Btrfs BTRFS
Actually it seems strange that a send operation could corrupt the source subvolume or fs. Why would the send modify the source subvolume in any significant way? The only way I can find to reconcile your observations with mine is that maybe the snapshots get corrupted not by the send operation by itself but when they are generated with -r (readonly, as it is needed to send them). Are the corrupted snapshots you have in machine 2 (the one in which send was never used) readonly?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-13 12:40 ` john terragon
@ 2014-10-13 15:40 ` David Arendt
0 siblings, 0 replies; 15+ messages in thread
From: David Arendt @ 2014-10-13 15:40 UTC (permalink / raw)
To: john terragon; +Cc: Chris Mason, Btrfs BTRFS
On 10/13/2014 02:40 PM, john terragon wrote:
> Actually it seems strange that a send operation could corrupt the
> source subvolume or fs. Why would the send modify the source subvolume
> in any significant way? The only way I can find to reconcile your
> observations with mine is that maybe the snapshots get corrupted not
> by the send operation by itself but when they are generated with -r
> (readonly, as it is needed to send them). Are the corrupted snapshots
> you have in machine 2 (the one in which send was never used) readonly?
Yes, on both machines there are only readonly snapshots.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: btrfs send and kernel 3.17
2014-10-12 11:11 ` David Arendt
2014-10-12 15:24 ` john terragon
@ 2014-10-13 17:22 ` Rich Freeman
1 sibling, 0 replies; 15+ messages in thread
From: Rich Freeman @ 2014-10-13 17:22 UTC (permalink / raw)
To: David Arendt; +Cc: Chris Mason, Btrfs BTRFS
On Sun, Oct 12, 2014 at 7:11 AM, David Arendt <admin@prnet.org> wrote:
> This weekend I finally had time to try btrfs send again on the newly
> created fs. Now I am running into another problem:
>
> btrfs send returns: ERROR: send ioctl failed with -12: Cannot allocate
> memory
>
> In dmesg I see only the following output:
>
> parent transid verify failed on 21325004800 wanted 2620 found 8325
I'm not using send at all, but I've been running into parent transid verify failed messages where the wanted is way smaller than the found when trying to balance a raid1 after adding a new drive.
Originally I had gotten a BUG, and after reboot the drive finished balancing (interestingly enough without moving any chunks to the new drive - just consolidating everything on the old drives), and then when I try to do another balance I get:
[ 4426.987177] BTRFS info (device sdc2): relocating block group 10367073779712 flags 17
[ 4446.287998] BTRFS info (device sdc2): found 13 extents
[ 4451.330887] parent transid verify failed on 10063286579200 wanted 987432 found 993678
[ 4451.350663] parent transid verify failed on 10063286579200 wanted 987432 found 993678
The btrfs program itself outputs:
btrfs balance start -v /data
Dumping filters: flags 0x7, state 0x0, force is off
DATA (flags 0x0): balancing
METADATA (flags 0x0): balancing
SYSTEM (flags 0x0): balancing
ERROR: error during balancing '/data' - Cannot allocate memory
There may be more info in syslog - try dmesg | tail
This is also on 3.17. This may be completely unrelated, but it seemed similar enough to be worth mentioning.
The filesystem otherwise seems to work fine, other than the new drive not having any data on it:
Label: 'datafs' uuid: cd074207-9bc3-402d-bee8-6a8c77d56959
Total devices 6 FS bytes used 2.16TiB
devid 1 size 2.73TiB used 2.40TiB path /dev/sdc2
devid 2 size 931.32GiB used 695.03GiB path /dev/sda2
devid 3 size 931.32GiB used 700.00GiB path /dev/sdb2
devid 4 size 931.32GiB used 700.00GiB path /dev/sdd2
devid 5 size 931.32GiB used 699.00GiB path /dev/sde2
devid 6 size 2.73TiB used 0.00 path /dev/sdf2
This is btrfs-progs-3.16.2.
--
Rich
^ permalink raw reply [flat|nested] 15+ messages in thread
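Editorial sketch, not part of the original message: both reports in this thread show "parent transid verify failed" lines in which the wanted generation is lower than the found one. Assuming the message format quoted above, the numbers can be extracted and the gap computed as below. The function name and the "gap" field are hypothetical.

```python
import re

# Matches the message format quoted in this thread.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (?P<bytenr>\d+) "
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def parse_transid_failures(log_text):
    """Return one dict per failure, including found minus wanted as 'gap'."""
    failures = []
    for match in TRANSID_RE.finditer(log_text):
        wanted = int(match["wanted"])
        found = int(match["found"])
        failures.append(
            {
                "bytenr": int(match["bytenr"]),
                "wanted": wanted,
                "found": found,
                "gap": found - wanted,
            }
        )
    return failures


# One line from each of the two reports in this thread.
sample = (
    "parent transid verify failed on 21325004800 wanted 2620 found 8325\n"
    "[ 4451.330887] parent transid verify failed on 10063286579200 "
    "wanted 987432 found 993678\n"
)
failures = parse_transid_failures(sample)
```

For the two lines above the gaps come out as 5705 and 6246 generations.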