From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:28599 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752519AbbBITHU (ORCPT ); Mon, 9 Feb 2015 14:07:20 -0500
Date: Mon, 9 Feb 2015 14:07:12 -0500
From: Chris Mason
Subject: Re: [PATCH] Btrfs: fix scrub race leading to use-after-free
To: Filipe Manana
CC:
Message-ID: <1423508832.26622.2@mail.thefacebook.com>
In-Reply-To: <1422375078-6916-1-git-send-email-fdmanana@suse.com>
References: <1422375078-6916-1-git-send-email-fdmanana@suse.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On Tue, Jan 27, 2015 at 11:11 AM, Filipe Manana wrote:
> While running a scrub on a kernel with CONFIG_DEBUG_PAGEALLOC=y, I got
> the following trace:

This actually trades one bug for another:

[ 1928.950319] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:621
[ 1928.967334] in_atomic(): 1, irqs_disabled(): 0, pid: 149670, name: fsstress
[ 1928.981324] INFO: lockdep is turned off.
[ 1928.989244] CPU: 24 PID: 149670 Comm: fsstress Tainted: G W 3.19.0-rc7-mason+ #41
[ 1929.006418] Hardware name: ZTSYSTEMS Echo Ridge T4 /A9DRPF-10D, BIOS 1.07 05/10/2012
[ 1929.022207] ffffffff81a22cf8 ffff881076e03b78 ffffffff816b8dd9 ffff881076e03b78
[ 1929.037267] ffff880d8e828710 ffff881076e03ba8 ffffffff810856c4 ffff881076e03bc8
[ 1929.052315] 0000000000000000 000000000000026d ffffffff81a22cf8 ffff881076e03bd8
[ 1929.067381] Call Trace:
[ 1929.072344] [] dump_stack+0x4f/0x6e
[ 1929.083968] [] ___might_sleep+0x174/0x230
[ 1929.095352] [] __might_sleep+0x52/0x90
[ 1929.106223] [] mutex_lock_nested+0x2f/0x3b0
[ 1929.117951] [] ? trace_hardirqs_on+0xd/0x10
[ 1929.129708] [] scrub_pending_bio_dec+0x38/0x70 [btrfs]
[ 1929.143370] [] scrub_parity_bio_endio+0x50/0x70 [btrfs]
[ 1929.157191] [] bio_endio+0x53/0xa0
[ 1929.167382] [] rbio_orig_end_io+0x7c/0xa0 [btrfs]
[ 1929.180161] [] raid_write_parity_end_io+0x5a/0x80 [btrfs]
[ 1929.194318] [] bio_endio+0x53/0xa0
[ 1929.204496] [] blk_update_request+0x1eb/0x450
[ 1929.216569] [] ? trigger_load_balance+0x78/0x500
[ 1929.229176] [] scsi_end_request+0x3d/0x1f0
[ 1929.240740] [] scsi_io_completion+0xac/0x5b0
[ 1929.252654] [] scsi_finish_command+0xf0/0x150
[ 1929.264725] [] scsi_softirq_done+0x147/0x170
[ 1929.276635] [] blk_done_softirq+0x86/0xa0
[ 1929.288014] [] __do_softirq+0xde/0x600
[ 1929.298885] [] irq_exit+0xbd/0xd0
[ 1929.308879] [] smp_call_function_single_interrupt+0x35/0x40
[ 1929.323455] [] call_function_single_interrupt+0x6f/0x80
[ 1929.337270] [] ? sync_inodes_sb+0x1b5/0x2a0
[ 1929.350261] [] ? sync_inodes_sb+0x198/0x2a0
[ 1929.361991] [] ? wait_for_completion+0xef/0x120
[ 1929.374423] [] ? fdatawrite_one_bdev+0x20/0x20
[ 1929.386671] [] ? fdatawrite_one_bdev+0x20/0x20
[ 1929.398930] [] sync_inodes_one_sb+0x1d/0x30
[ 1929.410668] [] iterate_supers+0xb6/0xf0
[ 1929.421712] [] sys_sync+0x35/0x90
[ 1929.431704] [] system_call_fastpath+0x12/0x17

So we'll have to either put in a refcount or a spinlock instead.

-chris
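
For illustration only, here is a rough userspace sketch of the refcount option. The trace shows scrub_pending_bio_dec() taking a mutex from a bio completion that runs in softirq context, where sleeping is not allowed. A reference count lets the completion path drop its hold on the context with nothing but atomic operations. Everything here is an assumption for the sake of the example: struct scrub_ctx_sketch, ctx_get(), ctx_put(), submit_bio_sketch() and endio_sketch() are hypothetical names, C11 atomics stand in for the kernel's atomic helpers, and none of this is the actual btrfs code or the fix that was eventually merged.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scrub context; NOT the real btrfs struct. */
struct scrub_ctx_sketch {
	atomic_int refs;           /* one reference per holder of the pointer */
	atomic_int bios_in_flight; /* I/O submitted but not yet completed */
};

/* Allocate a context owned by the caller (refs == 1). */
static struct scrub_ctx_sketch *ctx_alloc(void)
{
	struct scrub_ctx_sketch *ctx = malloc(sizeof(*ctx));

	if (!ctx)
		return NULL;
	atomic_init(&ctx->refs, 1);
	atomic_init(&ctx->bios_in_flight, 0);
	return ctx;
}

/* Take an extra reference; the caller must already hold one. */
static void ctx_get(struct scrub_ctx_sketch *ctx)
{
	atomic_fetch_add(&ctx->refs, 1);
}

/*
 * Drop one reference and free the context when the last one goes away.
 * Uses only atomic operations, so it never sleeps.
 * Returns true if this call freed the context.
 */
static bool ctx_put(struct scrub_ctx_sketch *ctx)
{
	int old = atomic_fetch_sub(&ctx->refs, 1);

	assert(old > 0); /* catch refcount underflow */
	if (old != 1)
		return false;
	free(ctx);
	return true;
}

/* Submission path: pin the context for the lifetime of the bio. */
static void submit_bio_sketch(struct scrub_ctx_sketch *ctx)
{
	ctx_get(ctx);
	atomic_fetch_add(&ctx->bios_in_flight, 1);
}

/*
 * Completion path, imagined to run in atomic (softirq) context:
 * account for the finished bio, then drop the bio's reference.
 * No mutex is taken. Returns true if the context was freed here.
 */
static bool endio_sketch(struct scrub_ctx_sketch *ctx)
{
	atomic_fetch_sub(&ctx->bios_in_flight, 1);
	return ctx_put(ctx);
}
```

The point of the sketch is that whoever drops the last reference frees the context, so a late completion can no longer touch freed memory, and the completion handler has no reason to sleep. The spinlock alternative would keep the existing structure but replace the sleeping lock with one that is safe to take in atomic context.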