linux-btrfs.vger.kernel.org archive mirror
From: Jeff Mahoney <jeffm@suse.de>
To: Andi Kleen <andi@firstfloor.org>
Cc: Jeff Mahoney <jeffm@suse.com>, Btrfs List <linux-btrfs@vger.kernel.org>
Subject: Re: [patch 07/99] btrfs: Use mempools for extent_state structures
Date: Thu, 01 Dec 2011 14:55:11 -0500
Message-ID: <4ED7DB9F.4030502@suse.de>
In-Reply-To: <4ED42171.6090704@suse.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/28/2011 07:04 PM, Jeff Mahoney wrote:
> On 11/28/2011 06:53 PM, Andi Kleen wrote:
>> Jeff Mahoney <jeffm@suse.com> writes:
> 
>>> The extent_state structure is used at the core of the extent
>>> i/o code for managing flags, locking, etc. It requires
>>> allocations deep in the write code and if failures occur they
>>> are difficult to recover from.
>>> 
>>> We avoid most of the failures by using a mempool, which can
>>> sleep when required, to honor the allocations. This allows
>>> future patches to convert most of the
>>> {set,clear,convert}_extent_bit and derivatives to return void.
> 
>> Is this really safe?
> 
>> iirc if there's any operation that needs multiple mempool
>> objects you can get into the following ABBA style deadlock:
> 
>> thread 1                     thread 2
> 
>> get object A from pool 1
>>                              get object C from pool 2
>> get object B from pool 2
>> pool 2 full -> block
>>                              get object D from pool 2
>                                                  ^ pool1, i assume
>>                              pool1 full -> block
> 
> 
>> Now for thread 1 to make progress it needs thread 2 to free its 
>> object first, but that needs thread 1 to free its object also 
>> first, which is a deadlock.
> 
>> It would still work if there are other users which eventually
>> make progress, but if you're really unlucky either pool 1 or 2
>> is completely used by threads doing a multiple object operation. So
>> you got a nasty rare deadlock ...
> 
> Yes, I think you're right. I think the risk is probably there and
> I'm not sure it can be totally mitigated. We'd be stuck if there is
> a string of pathological cases and both mempools are empty. The only
> way I can see to try to help the situation is to make the mempool
> fairly substantial, which is what I did here. I'd prefer to make
> the mempools per-fs but that would require pretty heavy
> modifications in order to pass around a per-fs struct. In any case,
> the risk isn't totally eliminated.

The more I look into it, the more I don't think this is an uncommon
scenario in the kernel. Device mapper draws from a number of mempools
that can be interspersed with allocations from the bio pool. Even
without stacking different types of dm devices, I think we can run
into this scenario. The DM scenario is probably even worse since the
allocs are GFP_NOIO instead of the (relatively) more relaxed GFP_NOFS.

I'm not saying it's ok, just that similar sites already exist that
seem easier to hit, yet we haven't heard of issues there either.
The behavior seems to be largely mitigated by establishing mempools
of sufficient size.
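
To make the sizing argument concrete, here is a toy userspace model of
the two-pool scenario Andi describes. It is only a sketch, not kernel
code: the function name worst_case_deadlocks and its parameters are
made up for illustration, and it assumes the worst case in which the
backing allocator always fails, so only the reserved pool elements are
available. Half of the threads take from pool 1 and then pool 2, the
other half take from pool 2 and then pool 1, and the scheduler is
adversarial: every thread takes its first object before any thread
asks for its second.

```python
def worst_case_deadlocks(pool_size, threads_per_order):
    """Return True if the adversarial interleaving leaves threads stuck.

    Toy model only: each pool holds `pool_size` reserved objects and the
    backing allocator is assumed to fail every time.
    """
    free = {1: pool_size, 2: pool_size}
    # Allocation order per thread: (1, 2) is "thread 1", (2, 1) is "thread 2".
    orders = [(1, 2)] * threads_per_order + [(2, 1)] * threads_per_order
    held = [[] for _ in orders]   # pools each thread holds an object from
    done = [False] * len(orders)

    # Phase 1 (adversarial): every thread grabs its first object before
    # any thread asks for its second.
    for i, order in enumerate(orders):
        first = order[0]
        if free[first] > 0:
            free[first] -= 1
            held[i].append(first)

    # Phase 2: any thread that can take its next object does so. A thread
    # holding both objects finishes and returns them to the pools.
    progress = True
    while progress:
        progress = False
        for i, order in enumerate(orders):
            if done[i]:
                continue
            nxt = order[len(held[i])]
            if free[nxt] > 0:
                free[nxt] -= 1
                held[i].append(nxt)
                progress = True
            if len(held[i]) == 2:
                for pool in held[i]:
                    free[pool] += 1
                done[i] = True
                progress = True

    # Anything still unfinished is blocked forever: that is the deadlock.
    return not all(done)


if __name__ == "__main__":
    for size, threads in [(1, 1), (2, 1), (2, 2), (64, 8), (64, 64)]:
        print(size, threads, worst_case_deadlocks(size, threads))
```

In this model the deadlock appears once the number of threads per
allocation order reaches the pool size, and it goes away when the pool
is larger than that. That matches the point above: a bigger pool makes
the bad interleaving harder to reach, but it does not rule it out.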

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJO19ueAAoJEB57S2MheeWyDb4QAKZ3T9JGTXZN9MikO0Q8aH08
KE0X4NSnagn/goS/88+6Hj9Z165gQ3ERUaoW4+VsdpLOAaJuNo4YtMShoN7pgC9X
HeaddX1cQb8aducWPt9o7sglkpERJHzdSNKO4rJaYQbJLZFBQ1KLQEISRjS1ZAiu
74Y/t3tESiJO8W5XV1b06pythUKG4sLPK8+ssBvpYt+qrfhfp//dse/sKPE3L1k5
zHP0GIePRazhkE9RUB8+5rjItUR7MSvECJviCKKhxyY/TsQRm4g27AhbL717Ew/9
V3LyyZe5ojhXWNDyaCeLUu4j/xOcYVftkxuQLHcVltTDaAzxCkaRBnzxoz5nouc+
SusUzVYy/aR14QyNbtFpfJVAB3E9djXsaA3EZO2ar+NPeBch28LEJRfC8hoItej3
O39PAFpviYRSHa12GDIqJ0x3q9auTeU5DRPt3gnpstT26t4vICkFn4B/cp9EzpQI
G1Gm09ftkehpSdBiFFSBXCpPg/7qTAMPnaF1Yq3ueQoWbAqVEtnPnaWQLfH4IBBj
5fp6ei0/zAS2qr1iBKZnWUKJKCIAyvCkEeLusTcLymFzLzVMpywYWchU+EW0AZ5G
98tx8Dgzi77ImHG4uF9uWuNJ80G7iR+nGgE+8dKIoEKLETHdoQMRgTCc05o9tTHM
Gk0GUwRESn97r/ytUm8n
=zdAv
-----END PGP SIGNATURE-----
