From: Jan Kara <jack@suse.cz>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Ext4 Developers List <linux-ext4@vger.kernel.org>,
linux@sciencehorizons.net, stable@vger.kernel.org, #@thunk.org
Subject: Re: [PATCH 2/4] ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea()
Date: Fri, 20 Jan 2017 14:53:17 +0100 [thread overview]
Message-ID: <20170120135317.GC10446@quack2.suse.cz> (raw)
In-Reply-To: <20170112034938.5934-4-tytso@mit.edu>
On Wed 11-01-17 22:49:36, Ted Tso wrote:
> The xattr_sem deadlock problems fixed in commit 2e81a4eeedca: "ext4:
> avoid deadlock when expanding inode size" didn't include the use of
> xattr_sem in fs/ext4/inline.c. With the addition of project quota
> which added a new extra inode field, this exposed deadlocks in the
> inline_data code similar to the ones fixed by 2e81a4eeedca.
>
> The deadlock can be reproduced via:
>
> dmesg -n 7
> mke2fs -t ext4 -O inline_data -Fq -I 256 /dev/vdc 32768
> mount -t ext4 -o debug_want_extra_isize=24 /dev/vdc /vdc
> mkdir /vdc/a
> umount /vdc
> mount -t ext4 /dev/vdc /vdc
> echo foo > /vdc/a/foo
>
> and looks like this:
>
> [ 11.158815]
> [ 11.160276] =============================================
> [ 11.161960] [ INFO: possible recursive locking detected ]
> [ 11.161960] 4.10.0-rc3-00015-g011b30a8a3cf #160 Tainted: G W
> [ 11.161960] ---------------------------------------------
> [ 11.161960] bash/2519 is trying to acquire lock:
> [ 11.161960] (&ei->xattr_sem){++++..}, at: [<c1225a4b>] ext4_expand_extra_isize_ea+0x3d/0x4cd
> [ 11.161960]
> [ 11.161960] but task is already holding lock:
> [ 11.161960] (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152
> [ 11.161960]
> [ 11.161960] other info that might help us debug this:
> [ 11.161960] Possible unsafe locking scenario:
> [ 11.161960]
> [ 11.161960] CPU0
> [ 11.161960] ----
> [ 11.161960] lock(&ei->xattr_sem);
> [ 11.161960] lock(&ei->xattr_sem);
> [ 11.161960]
> [ 11.161960] *** DEADLOCK ***
> [ 11.161960]
> [ 11.161960] May be due to missing lock nesting notation
> [ 11.161960]
> [ 11.161960] 4 locks held by bash/2519:
> [ 11.161960] #0: (sb_writers#3){.+.+.+}, at: [<c11a2414>] mnt_want_write+0x1e/0x3e
> [ 11.161960] #1: (&type->i_mutex_dir_key){++++++}, at: [<c119508b>] path_openat+0x338/0x67a
> [ 11.161960] #2: (jbd2_handle){++++..}, at: [<c123314a>] start_this_handle+0x582/0x622
> [ 11.161960] #3: (&ei->xattr_sem){++++..}, at: [<c1227941>] ext4_try_add_inline_entry+0x3a/0x152
> [ 11.161960]
> [ 11.161960] stack backtrace:
> [ 11.161960] CPU: 0 PID: 2519 Comm: bash Tainted: G W 4.10.0-rc3-00015-g011b30a8a3cf #160
> [ 11.161960] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1 04/01/2014
> [ 11.161960] Call Trace:
> [ 11.161960] dump_stack+0x72/0xa3
> [ 11.161960] __lock_acquire+0xb7c/0xcb9
> [ 11.161960] ? kvm_clock_read+0x1f/0x29
> [ 11.161960] ? __lock_is_held+0x36/0x66
> [ 11.161960] ? __lock_is_held+0x36/0x66
> [ 11.161960] lock_acquire+0x106/0x18a
> [ 11.161960] ? ext4_expand_extra_isize_ea+0x3d/0x4cd
> [ 11.161960] down_write+0x39/0x72
> [ 11.161960] ? ext4_expand_extra_isize_ea+0x3d/0x4cd
> [ 11.161960] ext4_expand_extra_isize_ea+0x3d/0x4cd
> [ 11.161960] ? _raw_read_unlock+0x22/0x2c
> [ 11.161960] ? jbd2_journal_extend+0x1e2/0x262
> [ 11.161960] ? __ext4_journal_get_write_access+0x3d/0x60
> [ 11.161960] ext4_mark_inode_dirty+0x17d/0x26d
> [ 11.161960] ? ext4_add_dirent_to_inline.isra.12+0xa5/0xb2
> [ 11.161960] ext4_add_dirent_to_inline.isra.12+0xa5/0xb2
> [ 11.161960] ext4_try_add_inline_entry+0x69/0x152
> [ 11.161960] ext4_add_entry+0xa3/0x848
> [ 11.161960] ? __brelse+0x14/0x2f
> [ 11.161960] ? _raw_spin_unlock_irqrestore+0x44/0x4f
> [ 11.161960] ext4_add_nondir+0x17/0x5b
> [ 11.161960] ext4_create+0xcf/0x133
> [ 11.161960] ? ext4_mknod+0x12f/0x12f
> [ 11.161960] lookup_open+0x39e/0x3fb
> [ 11.161960] ? __wake_up+0x1a/0x40
> [ 11.161960] ? lock_acquire+0x11e/0x18a
> [ 11.161960] path_openat+0x35c/0x67a
> [ 11.161960] ? sched_clock_cpu+0xd7/0xf2
> [ 11.161960] do_filp_open+0x36/0x7c
> [ 11.161960] ? _raw_spin_unlock+0x22/0x2c
> [ 11.161960] ? __alloc_fd+0x169/0x173
> [ 11.161960] do_sys_open+0x59/0xcc
> [ 11.161960] SyS_open+0x1d/0x1f
> [ 11.161960] do_int80_syscall_32+0x4f/0x61
> [ 11.161960] entry_INT80_32+0x2f/0x2f
> [ 11.161960] EIP: 0xb76ad469
> [ 11.161960] EFLAGS: 00000286 CPU: 0
> [ 11.161960] EAX: ffffffda EBX: 08168ac8 ECX: 00008241 EDX: 000001b6
> [ 11.161960] ESI: b75e46bc EDI: b7755000 EBP: bfbdb108 ESP: bfbdafc0
> [ 11.161960] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
>
> Cc: stable@vger.kernel.org # 3.10 (requires 2e81a4eeedca as a prereq)
> Reported-by: George Spelvin <linux@sciencehorizons.net>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Looks mostly good, just two comments below.
> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> index 5a94fa52b74f..c40bd55b6400 100644
> --- a/fs/ext4/xattr.c
> +++ b/fs/ext4/xattr.c
> @@ -1188,16 +1188,14 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index,
> struct ext4_xattr_block_find bs = {
> .s = { .not_found = -ENODATA, },
> };
> - unsigned long no_expand;
> + int no_expand;
> int error;
>
> if (!name)
> return -EINVAL;
> if (strlen(name) > 255)
> return -ERANGE;
> - down_write(&EXT4_I(inode)->xattr_sem);
> - no_expand = ext4_test_inode_state(inode, EXT4_STATE_NO_EXPAND);
> - ext4_set_inode_state(inode, EXT4_STATE_NO_EXPAND);
> + ext4_write_lock_xattr(inode, &no_expand);
>
> error = ext4_reserve_inode_write(handle, inode, &is.iloc);
> if (error)
> @@ -1264,7 +1262,7 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index,
> ext4_xattr_update_super_block(handle, inode->i_sb);
> inode->i_ctime = current_time(inode);
> if (!value)
> - ext4_clear_inode_state(inode, EXT4_STATE_NO_EXPAND);
> + no_expand = 0;
OK. I suppose you can use ext4_mark_inode_dirty() instead of
ext4_mark_iloc_dirty() because the deadlock on xattr_sem is now taken care
of by your changes, right?
> error = ext4_mark_iloc_dirty(handle, inode, &is.iloc);
> /*
> * The bh is consumed by ext4_mark_iloc_dirty, even with
> @@ -1278,9 +1276,7 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index,
> cleanup:
> brelse(is.iloc.bh);
> brelse(bs.bh);
> - if (no_expand == 0)
> - ext4_clear_inode_state(inode, EXT4_STATE_NO_EXPAND);
> - up_write(&EXT4_I(inode)->xattr_sem);
> + ext4_write_unlock_xattr(inode, &no_expand);
> return error;
> }
>
> @@ -1497,12 +1493,11 @@ int ext4_expand_extra_isize_ea(struct inode *inode, int new_extra_isize,
> int error = 0, tried_min_extra_isize = 0;
> int s_min_extra_isize = le16_to_cpu(EXT4_SB(inode->i_sb)->s_es->s_min_extra_isize);
> int isize_diff; /* How much do we need to grow i_extra_isize */
> + int no_expand;
> +
> + if (ext4_write_trylock_xattr(inode, &no_expand) == 0)
> + return 0;
Why do you play tricks with trylock here? ext4_mark_inode_dirty() checks
EXT4_STATE_NO_EXPAND and thus we should not ever get here if we already
hold xattr_sem...
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2017-01-20 13:53 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-01-12 3:49 [PATCH 0/4] ext4 inline cleanup patches Theodore Ts'o
2017-01-12 3:49 ` [PATCH 1/4] ext4: add debug_want_extra_isize mount option Theodore Ts'o
2017-01-12 19:19 ` Andreas Dilger
2017-01-13 2:58 ` Theodore Ts'o
2017-01-12 3:49 ` [PATCH -v2] overlay: stress test changes to top and bottom layers simultaneously Theodore Ts'o
2017-01-12 3:54 ` Theodore Ts'o
2017-01-12 3:49 ` [PATCH 2/4] ext4: fix deadlock between inline_data and ext4_expand_extra_isize_ea() Theodore Ts'o
2017-01-20 13:53 ` Jan Kara [this message]
2017-01-22 22:25 ` Theodore Ts'o
2017-01-24 12:27 ` Jan Kara
2017-01-25 1:32 ` Theodore Ts'o
2017-01-12 3:49 ` [PATCH 3/4] ext4: avoid calling ext4_mark_inode_dirty() under unneeded semaphores Theodore Ts'o
2017-01-20 13:37 ` Jan Kara
2017-01-12 3:49 ` [PATCH 4/4] ext4: propagate error values from ext4_inline_data_truncate() Theodore Ts'o
2017-01-20 13:39 ` Jan Kara
2017-01-12 9:12 ` [PATCH 0/4] ext4 inline cleanup patches George Spelvin
2017-01-12 14:46 ` Theodore Ts'o
2017-01-18 8:21 ` kernel BUG at fs/ext4/inline.c:1943! George Spelvin
2017-01-19 17:50 ` ext4_iget:4740: inode #%ld: block 48: comm find: invalid block George Spelvin
2017-01-19 22:58 ` kernel BUG at fs/ext4/inline.c:1943! Theodore Ts'o
2017-01-20 19:14 ` Damien Guibouret
2017-01-20 19:17 ` Andreas Dilger
2017-01-20 22:24 ` Damien Guibouret
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170120135317.GC10446@quack2.suse.cz \
--to=jack@suse.cz \
--cc=#@thunk.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux@sciencehorizons.net \
--cc=stable@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.