From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from vs2.lukas-pirl.de ([5.45.100.90]:46918 "EHLO pim.lukas-pirl.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1761215AbbKUAhp (ORCPT ); Fri, 20 Nov 2015 19:37:45 -0500
Received: from [192.168.1.5] (unknown [119.224.19.150]) by pim.lukas-pirl.de (Postfix) with ESMTPSA id C0EFD1FD931E for ; Sat, 21 Nov 2015 00:37:41 +0000 (UTC)
Subject: Re: 4.2.6: livelock in recovery (free_reloc_roots)?
To: linux-btrfs@vger.kernel.org
References: <564EE213.3060007@lukas-pirl.de>
From: Lukas Pirl
Message-ID: <564FBCD1.1020009@lukas-pirl.de>
Date: Sat, 21 Nov 2015 13:37:37 +1300
MIME-Version: 1.0
In-Reply-To: <564EE213.3060007@lukas-pirl.de>
Content-Type: text/plain; charset=utf-8
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

A follow-up question:

Can "btrfs_recover_relocation" be prevented from running? I would not
mind losing a few recent writes (they were part of a balance) if that
lets the file system go rw again, so that I can restart the balance.

From what I have read, btrfs-zero-log would not help in this case (?),
so I have not run it so far.

By the way, I can confirm the defect of "btrfs device remove missing …"
mentioned here:
http://www.spinics.net/lists/linux-btrfs/msg48383.html :

$ btrfs device delete missing /mnt/data
ERROR: missing is not a block device
$ btrfs device delete 5 /mnt/data
ERROR: 5 is not a block device

Thanks and best regards,

Lukas

On 11/20/2015 10:04 PM, Lukas Pirl wrote as excerpted:
> Dear list,
>
> I am (still) trying to recover a RAID1 that can only be mounted
> recovery,degraded,ro.
>
> I experienced an issue that might be interesting for you: I tried to
> mount the file system rw,recovery and the kernel ended up burning one
> core (and only one specific core, never scheduled to another one).
>
> The watchdog printed a stack trace roughly every 20 seconds. There were
> only a few stack traces that were printed alternating (see below).
> After a few hours with the mount command still being blocked and without
> visible IO activity, the system was power-cycled.
>
> Summary:
>
> Call Trace:
> [] ? free_reloc_roots+0x11/0x30 [btrfs]
> [] ? free_reloc_roots+0x1d/0x30 [btrfs]
> [] ? merge_reloc_roots+0x165/0x220 [btrfs]
> [] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
> [] ? open_ctree+0x20d2/0x23b0 [btrfs]
> [] ? btrfs_mount+0x87b/0x990 [btrfs]
> [] ? pcpu_next_unpop+0x3f/0x50
> [] ? mount_fs+0x36/0x170
> [] ? vfs_kern_mount+0x68/0x110
> [] ? btrfs_mount+0x1bb/0x990 [btrfs]
> …
>
> Call Trace:
> [] ? rcu_dump_cpu_stacks+0x80/0xb0
> [] ? rcu_check_callbacks+0x421/0x6e0
> [] ? sched_clock+0x5/0x10
> [] ? notifier_call_chain+0x45/0x70
> [] ? timekeeping_update+0xf1/0x150
> [] ? tick_sched_do_timer+0x40/0x40
> [] ? update_process_times+0x36/0x60
> [] ? tick_sched_do_timer+0x40/0x40
> [] ? tick_sched_handle.isra.15+0x24/0x60
> [] ? tick_sched_do_timer+0x40/0x40
> [] ? tick_sched_timer+0x3b/0x70
> [] ? __hrtimer_run_queues+0xdc/0x210
> [] ? read_tsc+0x5/0x10
> [] ? read_tsc+0x5/0x10
> [] ? hrtimer_interrupt+0x9a/0x190
> [] ? smp_apic_timer_interrupt+0x39/0x50
> [] ? apic_timer_interrupt+0x6b/0x70
> [] ? _raw_spin_lock+0x10/0x20
> [] ? __del_reloc_root+0x2f/0x100 [btrfs]
> [] ? __add_reloc_root+0xe0/0xe0 [btrfs]
> [] ? free_reloc_roots+0x1d/0x30 [btrfs]
> [] ? merge_reloc_roots+0x165/0x220 [btrfs]
> [] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
> [] ? open_ctree+0x20d2/0x23b0 [btrfs]
> [] ? btrfs_mount+0x87b/0x990 [btrfs]
> [] ? pcpu_next_unpop+0x3f/0x50
> [] ? mount_fs+0x36/0x170
> [] ? vfs_kern_mount+0x68/0x110
> [] ? btrfs_mount+0x1bb/0x990 [btrfs]
> …
>
> Call Trace:
> [] ? __del_reloc_root+0x2f/0x100 [btrfs]
> [] ? free_reloc_roots+0x1d/0x30 [btrfs]
> [] ? merge_reloc_roots+0x165/0x220 [btrfs]
> [] ? btrfs_recover_relocation+0x293/0x380 [btrfs]
> [] ? open_ctree+0x20d2/0x23b0 [btrfs]
> [] ? btrfs_mount+0x87b/0x990 [btrfs]
> [] ? pcpu_next_unpop+0x3f/0x50
> [] ? mount_fs+0x36/0x170
> [] ? vfs_kern_mount+0x68/0x110
> [] ? btrfs_mount+0x1bb/0x990 [btrfs]
> …
>
> A longer excerpt can be found here: http://pastebin.com/NPM0Ckfy
>
> I am using kernel 4.2.6 (Debian backports) and btrfs-tools 4.3.
>
> btrfs check --readonly gave no errors.
> (except the probably false positives mentioned here
> http://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg48325.html)
>
> Reading the whole file system worked also.
>
> If you need more information to trace this back, let me know and I'll
> try to get it.
> If you have suggestions regarding the recovery, please let me know as well.
>
> Best regards,
>
> Lukas
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
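
For anyone who wants to reduce a long watchdog log to a summary like the one quoted above (a few distinct traces that keep alternating), here is a minimal sketch in Python. It is not part of any btrfs tooling. The function names split_traces and count_traces are made up for illustration, and the sketch assumes the log uses "Call Trace:" headers followed by frames of the form "[<address>] ? function+0xOFF/0xSIZE". Traces are compared by function names only, so two traces that differ only in offsets are counted as the same one.

```python
import re
from collections import Counter

# One stack frame, e.g. "[<ffffffffa0123456>] ? free_reloc_roots+0x11/0x30 [btrfs]".
# The address inside the brackets may be missing ("[]") and is ignored.
FRAME_RE = re.compile(
    r"\[(?:<[0-9a-f]+>)?\]\s+(?:\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def split_traces(log_text):
    """Split a kernel log into traces; each trace is a tuple of function names."""
    traces = []
    current = None  # None means we are not inside a trace
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            # A new header closes any trace that is still open.
            if current:
                traces.append(tuple(current))
            current = []
            continue
        if current is None:
            continue
        match = FRAME_RE.search(line)
        if match:
            current.append(match.group(1))
        elif current:
            # The first non-frame line after at least one frame ends the trace.
            traces.append(tuple(current))
            current = None
    if current:
        traces.append(tuple(current))
    return traces


def count_traces(log_text):
    """Count how often each distinct trace (by function names) occurs."""
    return Counter(split_traces(log_text))
```

Feeding it the saved kernel log (for example the text behind the pastebin link) and printing count_traces(text).most_common() should show how many distinct traces there are and how often each one repeats.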