* [PULL] Btrfs, updates for 4.12
From: David Sterba @ 2017-04-19 11:35 UTC
To: clm; +Cc: David Sterba, linux-btrfs

Hi,

this is the main part of my 4.12 pull, condensed changelog below. I might send
another pull with low-risk patches, mostly cleanups, but so far I'm done with
base testing now. We had a high-churn cycle last time, so this could be a small
one and we can concentrate on testing & fixing the raid56 updates.

The qgroup patches have been in for-next but I haven't seen any new review for
the core part.

Updates:
* raid56:
  * fix mirror name in warning message after repair
  * scrub fixes: calculate parity correctly
  * scrub recheck and dev replace race fix
  * enabled auto-repair during read
  * fix warnings during recovery, due to races, bogus reports can appear
* switch to refcount_t where atomic_t was used for plain refcounting
* new and updated tracepoints
* split __btrfs_map_block, clean up
* minor qgroup fixes
* usual cleanups

----------------------------------------------------------------
The following changes since commit 4f7d029b9bf009fbee76bb10c0c4351a1870d2f3:

  Linux 4.11-rc7 (2017-04-16 13:00:18 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-chris-4.12

for you to fetch changes up to c2a9c7ab475bc3aaf06521a39ac65bc48c8cad4f:

  btrfs: check if the device is flush capable (2017-04-18 16:13:27 +0200)

----------------------------------------------------------------
Adam Borowski (1):
      btrfs: fix a bogus warning when converting only data or metadata

Anand Jain (3):
      btrfs: use q which is already obtained from bdev_get_queue
      btrfs: delete unused member nobarriers
      btrfs: check if the device is flush capable

Dan Carpenter (1):
      Btrfs: handle only applicable errors returned by btrfs_get_extent

David Sterba (12):
      btrfs: preallocate radix tree node for readahead
      btrfs: preallocate radix tree node for global readahead tree
      btrfs: remove redundant parameter from btree_readahead_hook
      btrfs: remove redundant parameter from reada_find_zone
      btrfs: remove redundant parameter from reada_start_machine_dev
      btrfs: remove local blocksize variable in reada_find_extent
      btrfs: remove unused qgroup members from btrfs_trans_handle
      btrfs: track exclusive filesystem operation in flags
      btrfs: sink GFP flags parameter to tree_mod_log_insert_move
      btrfs: sink GFP flags parameter to tree_mod_log_insert_root
      btrfs: drop redundant parameters from btrfs_map_sblock
      btrfs: use clear_page where appropriate

Deepa Dinamani (1):
      btrfs: Use ktime_get_real_ts for root ctime

Dmitry V. Levin (1):
      MAINTAINERS: add btrfs file entries for include directories

Edmund Nadolski (2):
      btrfs: provide enumeration for __merge_refs mode argument
      btrfs: replace hardcoded value with SEQ_LAST macro

Elena Reshetova (16):
      btrfs: convert btrfs_bio.refs from atomic_t to refcount_t
      btrfs: convert btrfs_transaction.use_count from atomic_t to refcount_t
      btrfs: convert extent_map.refs from atomic_t to refcount_t
      btrfs: convert btrfs_ordered_extent.refs from atomic_t to refcount_t
      btrfs: convert btrfs_caching_control.count from atomic_t to refcount_t
      btrfs: convert btrfs_delayed_ref_node.refs from atomic_t to refcount_t
      btrfs: convert btrfs_delayed_node.refs from atomic_t to refcount_t
      btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t
      btrfs: convert btrfs_root.refs from atomic_t to refcount_t
      btrfs: convert extent_state.refs from atomic_t to refcount_t
      btrfs: convert compressed_bio.pending_bios from atomic_t to refcount_t
      btrfs: convert scrub_recover.refs from atomic_t to refcount_t
      btrfs: convert scrub_block.refs from atomic_t to refcount_t
      btrfs: convert scrub_parity.refs from atomic_t to refcount_t
      btrfs: convert scrub_ctx.refs from atomic_t to refcount_t
      btrfs: convert btrfs_raid_bio.refs from atomic_t to refcount_t

Goldwyn Rodrigues (2):
      btrfs: No need to check !(flags & MS_RDONLY) twice
      btrfs: qgroups: Retry after commit on getting EDQUOT

Hans van Kranenburg (1):
      Btrfs: consistent usage of types in balance_args

Liu Bo (15):
      Btrfs: remove ASSERT in btrfs_truncate_inode_items
      Btrfs: add file item tracepoints
      Btrfs: create a helper for getting chunk map
      Btrfs: separate DISCARD from __btrfs_map_block
      Btrfs: introduce a function to get extra mirror from replace
      Btrfs: handle operations for device replace separately
      Btrfs: do not add extra mirror when dev_replace target dev is not available
      Btrfs: helper for ops that requires full stripe
      Btrfs: convert BUG_ON to WARN_ON
      Btrfs: update comments in cache_save_setup
      Btrfs: set scrub page's io_error if failing to submit io
      Btrfs: fix wrong failed mirror_num of read-repair on raid56
      Btrfs: enable repair during read for raid56 profile
      Btrfs: update scrub_parity to use u64 stripe_len
      Btrfs: switch to div64_u64 if with a u64 divisor

Qu Wenruo (8):
      btrfs: qgroup: Add trace point for qgroup reserved space
      btrfs: qgroup: Re-arrange tracepoint timing to co-operate with reserved space tracepoint
      btrfs: scrub: Don't append on-disk pages for raid56 scrub
      btrfs: Wait for in-flight bios before freeing target device for raid56
      btrfs: Prevent scrub recheck from racing with dev replace
      btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount option
      btrfs: scrub: Introduce full stripe lock for RAID56
      btrfs: scrub: Fix RAID56 recovery race condition

 MAINTAINERS                  |   2 +
 fs/btrfs/backref.c           |  41 ++-
 fs/btrfs/compression.c       |  18 +-
 fs/btrfs/ctree.c             |  20 +-
 fs/btrfs/ctree.h             |  34 +-
 fs/btrfs/delayed-inode.c     |  46 +--
 fs/btrfs/delayed-inode.h     |   6 +-
 fs/btrfs/delayed-ref.c       |   8 +-
 fs/btrfs/delayed-ref.h       |   8 +-
 fs/btrfs/dev-replace.c       |   9 +-
 fs/btrfs/disk-io.c           |  15 +-
 fs/btrfs/disk-io.h           |   4 +-
 fs/btrfs/extent-tree.c       |  35 +-
 fs/btrfs/extent_io.c         |  59 +--
 fs/btrfs/extent_io.h         |   3 +-
 fs/btrfs/extent_map.c        |  10 +-
 fs/btrfs/extent_map.h        |   3 +-
 fs/btrfs/file.c              |  16 +-
 fs/btrfs/free-space-cache.c  |   2 +-
 fs/btrfs/inode.c             |  47 +--
 fs/btrfs/ioctl.c             |  33 +-
 fs/btrfs/ordered-data.c      |  20 +-
 fs/btrfs/ordered-data.h      |   2 +-
 fs/btrfs/qgroup.c            | 102 ++----
 fs/btrfs/qgroup.h            |  51 ++-
 fs/btrfs/raid56.c            |  38 +-
 fs/btrfs/reada.c             |  37 +-
 fs/btrfs/root-tree.c         |   3 +-
 fs/btrfs/scrub.c             | 331 +++++++++++++++--
 fs/btrfs/super.c             |   3 +-
 fs/btrfs/tests/btrfs-tests.c |   1 -
 fs/btrfs/transaction.c       |  48 ++-
 fs/btrfs/transaction.h       |   6 +-
 fs/btrfs/tree-log.c          |   2 +-
 fs/btrfs/volumes.c           | 856 +++++++++++++++++++++++--------------------
 fs/btrfs/volumes.h           |   8 +-
 include/trace/events/btrfs.h | 187 +++++++++-
 include/uapi/linux/btrfs.h   |  10 +-
 38 files changed, 1354 insertions(+), 770 deletions(-)

* Re: [PULL] Btrfs, updates for 4.12
From: Filipe Manana @ 2017-04-26 15:06 UTC
To: David Sterba; +Cc: Chris Mason, linux-btrfs@vger.kernel.org

On Wed, Apr 19, 2017 at 12:35 PM, David Sterba <dsterba@suse.com> wrote:
> Hi,
>
> this is the main part of my 4.12 pull, condensed changelog below. I might send
> another pull with low-risk patches, mostly cleanups, but so far I'm done with
> base testing now. We had a high-churn cycle last time, so this could be a small
> one and we can concentrate on testing & fixing the raid56 updates.
>
> The qgroup patches have been in for-next but I haven't seen any new review for
> the core part.
>
> Updates:
> * raid56:
>   * fix mirror name in warning message after repair
>   * scrub fixes: calculate parity correctly
>   * scrub recheck and dev replace race fix
>   * enabled auto-repair during read
>   * fix warnings during recovery, due to races, bogus reports can appear
> * switch to refcount_t where atomic_t was used for plain refcounting
> * new and updated tracepoints
> * split __btrfs_map_block, clean up
> * minor qgroup fixes
> * usual cleanups
>
> ----------------------------------------------------------------
> The following changes since commit 4f7d029b9bf009fbee76bb10c0c4351a1870d2f3:
>
>   Linux 4.11-rc7 (2017-04-16 13:00:18 -0700)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-chris-4.12
>
> for you to fetch changes up to c2a9c7ab475bc3aaf06521a39ac65bc48c8cad4f:
>
>   btrfs: check if the device is flush capable (2017-04-18 16:13:27 +0200)
>
> ----------------------------------------------------------------
> Adam Borowski (1):
>       btrfs: fix a bogus warning when converting only data or metadata
>
> Anand Jain (3):
>       btrfs: use q which is already obtained from bdev_get_queue
>       btrfs: delete unused member nobarriers
>       btrfs: check if the device is flush capable
>
> Dan Carpenter (1):
>       Btrfs: handle only applicable errors returned by btrfs_get_extent
>
> David Sterba (12):
>       btrfs: preallocate radix tree node for readahead
>       btrfs: preallocate radix tree node for global readahead tree
>       btrfs: remove redundant parameter from btree_readahead_hook
>       btrfs: remove redundant parameter from reada_find_zone
>       btrfs: remove redundant parameter from reada_start_machine_dev
>       btrfs: remove local blocksize variable in reada_find_extent
>       btrfs: remove unused qgroup members from btrfs_trans_handle
>       btrfs: track exclusive filesystem operation in flags
>       btrfs: sink GFP flags parameter to tree_mod_log_insert_move
>       btrfs: sink GFP flags parameter to tree_mod_log_insert_root
>       btrfs: drop redundant parameters from btrfs_map_sblock
>       btrfs: use clear_page where appropriate

Hi,

Did you actually run xfstests with those readahead patches to
preallocate radix tree nodes?

With those 2 patches applied (Chris' for-linus.4,12 branch) this
breaks things and many btrfs specific tests (at least, since I can't
get past them) result in tons of traces like the following in a debug
kernel:

[ 8180.696804] BUG: sleeping function called from invalid context at mm/slab.h:432
[ 8180.703584] in_atomic(): 1, irqs_disabled(): 0, pid: 28583, name: btrfs
[ 8180.724146] 2 locks held by btrfs/28583:
[ 8180.726427] #0: (sb_writers#12){.+.+.+}, at: [<ffffffff811c1e33>] mnt_want_write_file+0x25/0x4d
[ 8180.736742] #1: (&(&fs_info->reada_lock)->rlock){+.+.+.}, at: [<ffffffffa02306eb>] reada_add_block+0x2fe/0x6cd [btrfs]
[ 8180.766321] Preemption disabled at:
[ 8180.766326] [<ffffffff8107ac54>] preempt_count_add+0x65/0x68
[ 8180.794837] CPU: 5 PID: 28583 Comm: btrfs Tainted: G W 4.11.0-rc8-btrfs-next-39+ #1
[ 8180.798818] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
[ 8180.798818] Call Trace:
[ 8180.798818] dump_stack+0x68/0x92
[ 8180.798818] ? preempt_count_add+0x65/0x68
[ 8180.798818] ___might_sleep+0x20f/0x226
[ 8180.798818] __might_sleep+0x77/0x7e
[ 8180.798818] slab_pre_alloc_hook+0x32/0x4f
[ 8180.798818] kmem_cache_alloc+0x39/0x233
[ 8180.798818] ? radix_tree_node_alloc.constprop.12+0x9d/0xdf
[ 8180.798818] radix_tree_node_alloc.constprop.12+0x9d/0xdf
[ 8180.798818] __radix_tree_create+0xc3/0x143
[ 8180.798818] __radix_tree_insert+0x32/0xc0
[ 8180.798818] reada_add_block+0x318/0x6cd [btrfs]
[ 8180.798818] btrfs_reada_add+0xf5/0x122 [btrfs]
[ 8180.798818] scrub_stripe+0x34b/0xdf0 [btrfs]
[ 8180.798818] ? __lock_acquire+0x69b/0xf38
[ 8180.798818] ? scrub_chunk+0x48/0x13b [btrfs]
[ 8180.798818] scrub_chunk+0x10b/0x13b [btrfs]
[ 8180.798818] ? scrub_chunk+0x10b/0x13b [btrfs]
[ 8180.798818] scrub_enumerate_chunks+0x31e/0x59b [btrfs]
[ 8180.798818] ? add_wait_queue+0x44/0x44
[ 8180.798818] btrfs_scrub_dev+0x2e3/0x494 [btrfs]
[ 8180.798818] ? __mnt_want_write+0x65/0x7c
[ 8180.798818] btrfs_ioctl+0x1498/0x1fb9 [btrfs]
[ 8180.798818] vfs_ioctl+0x21/0x38
[ 8180.945015] ? vfs_ioctl+0x21/0x38
[ 8180.945015] do_vfs_ioctl+0x611/0x645
[ 8180.945015] ? rcu_read_unlock+0x5b/0x5d
[ 8180.945015] ? __fget+0x6d/0x79
[ 8180.945015] SyS_ioctl+0x57/0x7b
[ 8180.945015] entry_SYSCALL_64_fastpath+0x18/0xad
[ 8180.945015] RIP: 0033:0x7f2b83eefc47
[ 8180.945015] RSP: 002b:00007f2b82e0bd68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 8180.945015] RAX: ffffffffffffffda RBX: ffffffff8109612f RCX: 00007f2b83eefc47
[ 8180.945015] RDX: 00000000008a4df0 RSI: 00000000c400941b RDI: 0000000000000003
[ 8181.012957] RBP: ffffc90003613f98 R08: 00007f2b82e0c700 R09: 0000000000000000
[ 8181.012957] R10: 00007f2b82e0c700 R11: 0000000000000246 R12: 0000000000000046
[ 8181.012957] R13: ffffc90003613f78 R14: 0000000000000000 R15: 00007f2b84e75040

thanks

>
> Deepa Dinamani (1):
>       btrfs: Use ktime_get_real_ts for root ctime
>
> Dmitry V. Levin (1):
>       MAINTAINERS: add btrfs file entries for include directories
>
> Edmund Nadolski (2):
>       btrfs: provide enumeration for __merge_refs mode argument
>       btrfs: replace hardcoded value with SEQ_LAST macro
>
> Elena Reshetova (16):
>       btrfs: convert btrfs_bio.refs from atomic_t to refcount_t
>       btrfs: convert btrfs_transaction.use_count from atomic_t to refcount_t
>       btrfs: convert extent_map.refs from atomic_t to refcount_t
>       btrfs: convert btrfs_ordered_extent.refs from atomic_t to refcount_t
>       btrfs: convert btrfs_caching_control.count from atomic_t to refcount_t
>       btrfs: convert btrfs_delayed_ref_node.refs from atomic_t to refcount_t
>       btrfs: convert btrfs_delayed_node.refs from atomic_t to refcount_t
>       btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t
>       btrfs: convert btrfs_root.refs from atomic_t to refcount_t
>       btrfs: convert extent_state.refs from atomic_t to refcount_t
>       btrfs: convert compressed_bio.pending_bios from atomic_t to refcount_t
>       btrfs: convert scrub_recover.refs from atomic_t to refcount_t
>       btrfs: convert scrub_block.refs from atomic_t to refcount_t
>       btrfs: convert scrub_parity.refs from atomic_t to refcount_t
>       btrfs: convert scrub_ctx.refs from atomic_t to refcount_t
>       btrfs: convert btrfs_raid_bio.refs from atomic_t to refcount_t
>
> Goldwyn Rodrigues (2):
>       btrfs: No need to check !(flags & MS_RDONLY) twice
>       btrfs: qgroups: Retry after commit on getting EDQUOT
>
> Hans van Kranenburg (1):
>       Btrfs: consistent usage of types in balance_args
>
> Liu Bo (15):
>       Btrfs: remove ASSERT in btrfs_truncate_inode_items
>       Btrfs: add file item tracepoints
>       Btrfs: create a helper for getting chunk map
>       Btrfs: separate DISCARD from __btrfs_map_block
>       Btrfs: introduce a function to get extra mirror from replace
>       Btrfs: handle operations for device replace separately
>       Btrfs: do not add extra mirror when dev_replace target dev is not available
>       Btrfs: helper for ops that requires full stripe
>       Btrfs: convert BUG_ON to WARN_ON
>       Btrfs: update comments in cache_save_setup
>       Btrfs: set scrub page's io_error if failing to submit io
>       Btrfs: fix wrong failed mirror_num of read-repair on raid56
>       Btrfs: enable repair during read for raid56 profile
>       Btrfs: update scrub_parity to use u64 stripe_len
>       Btrfs: switch to div64_u64 if with a u64 divisor
>
> Qu Wenruo (8):
>       btrfs: qgroup: Add trace point for qgroup reserved space
>       btrfs: qgroup: Re-arrange tracepoint timing to co-operate with reserved space tracepoint
>       btrfs: scrub: Don't append on-disk pages for raid56 scrub
>       btrfs: Wait for in-flight bios before freeing target device for raid56
>       btrfs: Prevent scrub recheck from racing with dev replace
>       btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount option
>       btrfs: scrub: Introduce full stripe lock for RAID56
>       btrfs: scrub: Fix RAID56 recovery race condition
>
>  MAINTAINERS                  |   2 +
>  fs/btrfs/backref.c           |  41 ++-
>  fs/btrfs/compression.c       |  18 +-
>  fs/btrfs/ctree.c             |  20 +-
>  fs/btrfs/ctree.h             |  34 +-
>  fs/btrfs/delayed-inode.c     |  46 +--
>  fs/btrfs/delayed-inode.h     |   6 +-
>  fs/btrfs/delayed-ref.c       |   8 +-
>  fs/btrfs/delayed-ref.h       |   8 +-
>  fs/btrfs/dev-replace.c       |   9 +-
>  fs/btrfs/disk-io.c           |  15 +-
>  fs/btrfs/disk-io.h           |   4 +-
>  fs/btrfs/extent-tree.c       |  35 +-
>  fs/btrfs/extent_io.c         |  59 +--
>  fs/btrfs/extent_io.h         |   3 +-
>  fs/btrfs/extent_map.c        |  10 +-
>  fs/btrfs/extent_map.h        |   3 +-
>  fs/btrfs/file.c              |  16 +-
>  fs/btrfs/free-space-cache.c  |   2 +-
>  fs/btrfs/inode.c             |  47 +--
>  fs/btrfs/ioctl.c             |  33 +-
>  fs/btrfs/ordered-data.c      |  20 +-
>  fs/btrfs/ordered-data.h      |   2 +-
>  fs/btrfs/qgroup.c            | 102 ++----
>  fs/btrfs/qgroup.h            |  51 ++-
>  fs/btrfs/raid56.c            |  38 +-
>  fs/btrfs/reada.c             |  37 +-
>  fs/btrfs/root-tree.c         |   3 +-
>  fs/btrfs/scrub.c             | 331 +++++++++++++++--
>  fs/btrfs/super.c             |   3 +-
>  fs/btrfs/tests/btrfs-tests.c |   1 -
>  fs/btrfs/transaction.c       |  48 ++-
>  fs/btrfs/transaction.h       |   6 +-
>  fs/btrfs/tree-log.c          |   2 +-
>  fs/btrfs/volumes.c           | 856 +++++++++++++++++++++++--------------------
>  fs/btrfs/volumes.h           |   8 +-
>  include/trace/events/btrfs.h | 187 +++++++++-
>  include/uapi/linux/btrfs.h   |  10 +-
>  38 files changed, 1354 insertions(+), 770 deletions(-)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

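The report in the message above ("sleeping function called from invalid context", with in_atomic(): 1 and reada_lock held) is the kind of output the kernel's might_sleep debugging check produces when an allocation that is allowed to block is attempted in atomic context. What follows is a user-space sketch of that decision only, not kernel code. The names `task_ctx`, `model_spin_lock`, `model_spin_unlock` and `alloc_would_warn` are hypothetical, and the model assumes that holding a spinlock is represented by a non-zero preempt count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-task state the debug check inspects. */
struct task_ctx {
	int preempt_count;	/* > 0 while a (modeled) spinlock is held */
	bool irqs_disabled;
};

/* Taking a spinlock disables preemption, modeled as a counter bump. */
static void model_spin_lock(struct task_ctx *t)
{
	t->preempt_count++;
}

static void model_spin_unlock(struct task_ctx *t)
{
	assert(t->preempt_count > 0);
	t->preempt_count--;
}

/*
 * Returns true when an allocation made in this context would be reported:
 * the gfp mask permits blocking, but the context is atomic.
 */
static bool alloc_would_warn(const struct task_ctx *t, bool gfp_allows_blocking)
{
	bool atomic = t->preempt_count > 0 || t->irqs_disabled;

	return gfp_allows_blocking && atomic;
}
```

In this model the same blocking-capable allocation is fine before the lock is taken and is flagged once it is held, which mirrors the trace: the insert happens inside reada_add_block with reada_lock held.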
* Re: [PULL] Btrfs, updates for 4.12
From: Chris Mason @ 2017-04-26 15:12 UTC
To: fdmanana, David Sterba; +Cc: linux-btrfs@vger.kernel.org

On 04/26/2017 11:06 AM, Filipe Manana wrote:

> Hi,
>
> Did you actually run xfstests with those readahead patches to
> preallocate radix tree nodes?
>
> With those 2 patches applied (Chris' for-linus.4,12 branch) this
> breaks things and many btrfs specific tests (at least, since I can't
> get past them) result in tons of traces like the following in a debug
> kernel:

Huh, I did and these didn't come up. I'll double check I have
preemption enabled.

-chris

* Re: [PULL] Btrfs, updates for 4.12
From: David Sterba @ 2017-04-26 16:08 UTC
To: Filipe Manana; +Cc: Chris Mason, linux-btrfs@vger.kernel.org

On Wed, Apr 26, 2017 at 04:06:29PM +0100, Filipe Manana wrote:
> On Wed, Apr 19, 2017 at 12:35 PM, David Sterba <dsterba@suse.com> wrote:
> > Adam Borowski (1):
> >       btrfs: fix a bogus warning when converting only data or metadata
> >
> > Anand Jain (3):
> >       btrfs: use q which is already obtained from bdev_get_queue
> >       btrfs: delete unused member nobarriers
> >       btrfs: check if the device is flush capable
> >
> > Dan Carpenter (1):
> >       Btrfs: handle only applicable errors returned by btrfs_get_extent
> >
> > David Sterba (12):
> >       btrfs: preallocate radix tree node for readahead
> >       btrfs: preallocate radix tree node for global readahead tree
> >       btrfs: remove redundant parameter from btree_readahead_hook
> >       btrfs: remove redundant parameter from reada_find_zone
> >       btrfs: remove redundant parameter from reada_start_machine_dev
> >       btrfs: remove local blocksize variable in reada_find_extent
> >       btrfs: remove unused qgroup members from btrfs_trans_handle
> >       btrfs: track exclusive filesystem operation in flags
> >       btrfs: sink GFP flags parameter to tree_mod_log_insert_move
> >       btrfs: sink GFP flags parameter to tree_mod_log_insert_root
> >       btrfs: drop redundant parameters from btrfs_map_sblock
> >       btrfs: use clear_page where appropriate
>
> Did you actually run xfstests with those readahead patches to
> preallocate radix tree nodes?
>
> With those 2 patches applied (Chris' for-linus.4,12 branch) this
> breaks things and many btrfs specific tests (at least, since I can't
> get past them) result in tons of traces like the following in a debug
> kernel:

I did, no such reports appeared in my setup. There are several debugging options
enabled in the config but I don't see the one to catch sleep in atomic. I'll fix
my setup scripts to enable it before each build. There might be other surprises
that I missed though. Thanks for catching it.

* Re: [PULL] Btrfs, updates for 4.12
From: Chris Mason @ 2017-04-26 17:26 UTC
To: fdmanana, David Sterba; +Cc: linux-btrfs@vger.kernel.org

On 04/26/2017 11:06 AM, Filipe Manana wrote:

> Hi,
>
> Did you actually run xfstests with those readahead patches to
> preallocate radix tree nodes?
>
> With those 2 patches applied (Chris' for-linus.4,12 branch) this
> breaks things and many btrfs specific tests (at least, since I can't
> get past them) result in tons of traces like the following in a debug
> kernel:
>
> [ 8180.696804] BUG: sleeping function called from invalid context at mm/slab.h:432
> [ 8180.703584] in_atomic(): 1, irqs_disabled(): 0, pid: 28583, name: btrfs
> [ 8180.724146] 2 locks held by btrfs/28583:
> [ 8180.726427] #0: (sb_writers#12){.+.+.+}, at: [<ffffffff811c1e33>] mnt_want_write_file+0x25/0x4d
> [ 8180.736742] #1: (&(&fs_info->reada_lock)->rlock){+.+.+.}, at: [<ffffffffa02306eb>] reada_add_block+0x2fe/0x6cd [btrfs]
> [ 8180.766321] Preemption disabled at:
> [ 8180.766326] [<ffffffff8107ac54>] preempt_count_add+0x65/0x68
> [ 8180.794837] CPU: 5 PID: 28583 Comm: btrfs Tainted: G W 4.11.0-rc8-btrfs-next-39+ #1
> [ 8180.798818] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
> [ 8180.798818] Call Trace:
> [ 8180.798818] dump_stack+0x68/0x92
> [ 8180.798818] ? preempt_count_add+0x65/0x68
> [ 8180.798818] ___might_sleep+0x20f/0x226
> [ 8180.798818] __might_sleep+0x77/0x7e
> [ 8180.798818] slab_pre_alloc_hook+0x32/0x4f
> [ 8180.798818] kmem_cache_alloc+0x39/0x233
> [ 8180.798818] ? radix_tree_node_alloc.constprop.12+0x9d/0xdf
> [ 8180.798818] radix_tree_node_alloc.constprop.12+0x9d/0xdf
> [ 8180.798818] __radix_tree_create+0xc3/0x143
> [ 8180.798818] __radix_tree_insert+0x32/0xc0
> [ 8180.798818] reada_add_block+0x318/0x6cd [btrfs]

So radix_tree_preload doesn't work the way I thought it did. It
populates a per-cpu pool of radix tree nodes so the allocation is sure
not to fail. But, when we go to actually allocate the node during
radix_tree_insert:

static struct radix_tree_node *
radix_tree_node_alloc(gfp_t gfp_mask, struct radix_tree_node *parent,
			struct radix_tree_root *root,
			unsigned int shift, unsigned int offset,
			unsigned int count, unsigned int exceptional)
{
	struct radix_tree_node *ret = NULL;

	/*
	 * Preload code isn't irq safe and it doesn't make sense to use
	 * preloading during an interrupt anyway as all the allocations have
	 * to be atomic. So just do normal allocation when in interrupt.
	 */
	if (!gfpflags_allow_blocking(gfp_mask) && !in_interrupt()) {
		struct radix_tree_preload *rtp;

		/*
		 * Even if the caller has preloaded, try to allocate from the
		 * cache first for the new node to get accounted to the memory
		 * cgroup.
		 */
		ret = kmem_cache_alloc(radix_tree_node_cachep,
				       gfp_mask | __GFP_NOWARN);
		if (ret)
			goto out;

		/*
		 * Provided the caller has preloaded here, we will always
		 * succeed in getting a node here (and never reach
		 * kmem_cache_alloc)
		 */
		rtp = this_cpu_ptr(&radix_tree_preloads);
		if (rtp->nr) {
			ret = rtp->nodes;
			rtp->nodes = ret->parent;
			rtp->nr--;
		}
		/*
		 * Update the allocation stack trace as this is more useful
		 * for debugging.
		 */
		kmemleak_update_trace(ret);
		goto out;
	}
	ret = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);

We only jump into the preload pool if our gfp_mask for the root doesn't
allow blocking. And even if we don't allow blocking we'll still hit the
pool as a last resort.

So I think the right answer is to keep the sleeping flag off the root
and also keep the preload GFP_KERNEL.

-chris

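The branch structure of the quoted radix_tree_node_alloc excerpt can be restated as a small user-space model, which may make the conclusion easier to check. This is a sketch under stated assumptions, not kernel code: `alloc_model`, `node_source` and `model_node_alloc` are hypothetical names, a blocking allocation is assumed always to succeed, and slab availability is reduced to a single boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Where the modeled node comes from; names are illustrative only. */
enum node_source {
	SRC_NONE,		/* no node obtained */
	SRC_SLAB_NOBLOCK,	/* slab cache, non-blocking attempt */
	SRC_PRELOAD_POOL,	/* per-cpu preload pool */
	SRC_SLAB_DIRECT,	/* plain slab allocation with the root's mask */
};

struct alloc_model {
	bool root_mask_allows_blocking;	/* gfp mask stored in the tree root */
	bool in_interrupt;
	bool slab_has_memory;		/* would a non-blocking slab alloc succeed? */
	int preloaded_nodes;		/* nodes left in the per-cpu pool */
};

static enum node_source model_node_alloc(struct alloc_model *m)
{
	assert(m->preloaded_nodes >= 0);

	if (!m->root_mask_allows_blocking && !m->in_interrupt) {
		/* Slab cache is tried first, the preload pool is the fallback. */
		if (m->slab_has_memory)
			return SRC_SLAB_NOBLOCK;
		if (m->preloaded_nodes > 0) {
			m->preloaded_nodes--;
			return SRC_PRELOAD_POOL;
		}
		return SRC_NONE;
	}

	/*
	 * Fallthrough path: the preload pool is never consulted here. If the
	 * root's mask allows blocking, this allocation may sleep (assumed to
	 * succeed in this model).
	 */
	if (m->root_mask_allows_blocking || m->slab_has_memory)
		return SRC_SLAB_DIRECT;
	return SRC_NONE;
}
```

In this model a root whose mask allows blocking goes straight to the direct slab allocation and leaves the preloaded nodes untouched, which is consistent with the trace, where the insert under reada_lock reached a potentially sleeping kmem_cache_alloc despite the preload.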