linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* question about transaction-abort-related commits
@ 2013-06-23 18:52 Alex Lyakas
  2013-06-24 16:24 ` Josef Bacik
  0 siblings, 1 reply; 7+ messages in thread
From: Alex Lyakas @ 2013-06-23 18:52 UTC (permalink / raw)
  To: Josef Bacik, bo.li.liu; +Cc: linux-btrfs

Hello Josef, Liu,
I am reviewing commits in the mainline tree:

e4a2bcaca9643e7430207810653222fc5187f2be Btrfs: if we aren't
committing just end the transaction if we error out
(call end_transaction() instead of goto cleanup_transaction) - Josef

f094ac32aba3a51c00e970a2ea029339af2ca048 Btrfs: fix NULL pointer after
aborting a transaction
(wait until all writers detach, before setting running_transaction to
NULL) - Liu

66b6135b7cf741f6f44ba938b27583ea3b83bd12 Btrfs: avoid deadlock on
transaction waiting list
(check if transaction was already removed from the transactions list) - Liu

Josef, according to your fix, if the committer encounters a problem
early, it just doesn't commit. Instead it aborts the transaction
(possibly setting FS to read-only) and detaches from the transaction.
So if this was the only committer (e.g., the transaction_kthread),
then transaction commit will not happen at all. Is this what you
intended? So then the user will notice that FS went read-only, and she
will unmount the FS, and transaction will be cleaned up in
btrfs_error_commit_super()=>btrfs_cleanup_transaction(), and not in
cleanup_transaction() via btrfs_commit_transaction(). Is my
understanding correct?

Liu, it looks like after having Josef's fix, the above two commits of
yours are not strictly needed, right? Because Josef's fix ensures that
only the "real" committer will call cleanup_transaction(), so at this
point there is only one writer attached to the transaction, which is
the committer itself (your fixes doesn't hurt though). Is that
correct?

Thanks for helping,
Alex.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: question about transaction-abort-related commits
  2013-06-23 18:52 question about transaction-abort-related commits Alex Lyakas
@ 2013-06-24 16:24 ` Josef Bacik
  2013-06-24 18:56   ` Alex Lyakas
  0 siblings, 1 reply; 7+ messages in thread
From: Josef Bacik @ 2013-06-24 16:24 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: Josef Bacik, bo.li.liu, linux-btrfs

On Sun, Jun 23, 2013 at 09:52:14PM +0300, Alex Lyakas wrote:
> Hello Josef, Liu,
> I am reviewing commits in the mainline tree:
> 
> e4a2bcaca9643e7430207810653222fc5187f2be Btrfs: if we aren't
> committing just end the transaction if we error out
> (call end_transaction() instead of goto cleanup_transaction) - Josef
> 
> f094ac32aba3a51c00e970a2ea029339af2ca048 Btrfs: fix NULL pointer after
> aborting a transaction
> (wait until all writers detach, before setting running_transaction to
> NULL) - Liu
> 
> 66b6135b7cf741f6f44ba938b27583ea3b83bd12 Btrfs: avoid deadlock on
> transaction waiting list
> (check if transaction was already removed from the transactions list) - Liu
> 
> Josef, according to your fix, if the committer encounters a problem
> early, it just doesn't commit. Instead it aborts the transaction
> (possibly setting FS to read-only) and detaches from the transaction.
> So if this was the only committer (e.g., the transaction_kthread),
> then transaction commit will not happen at all. Is this what you
> intended? So then the user will notice that FS went read-only, and she
> will unmount the FS, and transaction will be cleaned up in
> btrfs_error_commit_super()=>btrfs_cleanup_transaction(), and not in
> cleanup_transaction() via btrfs_commit_transaction(). Is my
> understanding correct?
> 
> Liu, it looks like after having Josef's fix, the above two commits of
> yours are not strictly needed, right? Because Josef's fix ensures that
> only the "real" committer will call cleanup_transaction(), so at this
> point there is only one writer attached to the transaction, which is
> the committer itself (your fixes doesn't hurt though). Is that
> correct?
> 

I've looked at the patches and I'm going to say yes with the caveat that I
stopped thinking about it when my head started hurting :).  Thanks,

Josef

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: question about transaction-abort-related commits
  2013-06-24 16:24 ` Josef Bacik
@ 2013-06-24 18:56   ` Alex Lyakas
  2013-06-26 14:16     ` Alex Lyakas
  0 siblings, 1 reply; 7+ messages in thread
From: Alex Lyakas @ 2013-06-24 18:56 UTC (permalink / raw)
  To: Josef Bacik; +Cc: bo.li.liu, linux-btrfs

Thanks for commenting Josef. I hope your head will get better:)
Actually, while re-looking at the code, I see that there are couple of
"goto cleanup;", before we ensure that all the writers have detached
from the committing transaction. So Liu's code is still needed, looks
like.

Thanks,
Alex.

On Mon, Jun 24, 2013 at 7:24 PM, Josef Bacik <jbacik@fusionio.com> wrote:
> On Sun, Jun 23, 2013 at 09:52:14PM +0300, Alex Lyakas wrote:
>> Hello Josef, Liu,
>> I am reviewing commits in the mainline tree:
>>
>> e4a2bcaca9643e7430207810653222fc5187f2be Btrfs: if we aren't
>> committing just end the transaction if we error out
>> (call end_transaction() instead of goto cleanup_transaction) - Josef
>>
>> f094ac32aba3a51c00e970a2ea029339af2ca048 Btrfs: fix NULL pointer after
>> aborting a transaction
>> (wait until all writers detach, before setting running_transaction to
>> NULL) - Liu
>>
>> 66b6135b7cf741f6f44ba938b27583ea3b83bd12 Btrfs: avoid deadlock on
>> transaction waiting list
>> (check if transaction was already removed from the transactions list) - Liu
>>
>> Josef, according to your fix, if the committer encounters a problem
>> early, it just doesn't commit. Instead it aborts the transaction
>> (possibly setting FS to read-only) and detaches from the transaction.
>> So if this was the only committer (e.g., the transaction_kthread),
>> then transaction commit will not happen at all. Is this what you
>> intended? So then the user will notice that FS went read-only, and she
>> will unmount the FS, and transaction will be cleaned up in
>> btrfs_error_commit_super()=>btrfs_cleanup_transaction(), and not in
>> cleanup_transaction() via btrfs_commit_transaction(). Is my
>> understanding correct?
>>
>> Liu, it looks like after having Josef's fix, the above two commits of
>> yours are not strictly needed, right? Because Josef's fix ensures that
>> only the "real" committer will call cleanup_transaction(), so at this
>> point there is only one writer attached to the transaction, which is
>> the committer itself (your fixes doesn't hurt though). Is that
>> correct?
>>
>
> I've looked at the patches and I'm going to say yes with the caveat that I
> stopped thinking about it when my head started hurting :).  Thanks,
>
> Josef

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: question about transaction-abort-related commits
  2013-06-24 18:56   ` Alex Lyakas
@ 2013-06-26 14:16     ` Alex Lyakas
  2013-06-30  8:29       ` Alex Lyakas
  0 siblings, 1 reply; 7+ messages in thread
From: Alex Lyakas @ 2013-06-26 14:16 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, Shyam Kaushik

Hi Josef,
Can you please help me with another question.

I am looking at your patch:
Btrfs: fix chunk allocation error handling
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0448748849ef7c593be40e2c1404f7974bd3aac6

Here you changed the order of btrfs_make_block_group() vs
btrfs_alloc_dev_extent(), because we could have allocated from the
in-memory block group, before we have inserted the dev extent into a
tree. However, with this fix, I hit the deadlock[1] of
btrfs_alloc_dev_extent() that also wants to allocate a chunk and
recursively calls do_chunk_alloc, but then is stuck on chunk_mutex.

Was this patch:
Btrfs: don't re-enter when allocating a chunk
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c6b305a89b1903d63652691ad5eb9f05aa0326b8
introduced to fix this deadlock?

Thanks,
Alex.

[1]
[<ffffffffa044e57d>] do_chunk_alloc+0x8d/0x510 [btrfs]
[<ffffffffa04532ad>] find_free_extent+0x9cd/0xb90 [btrfs]
[<ffffffffa0453510>] btrfs_reserve_extent+0xa0/0x1b0 [btrfs]
[<ffffffffa0453bc9>] btrfs_alloc_free_block+0xf9/0x570 [btrfs]
[<ffffffffa043d9e6>] __btrfs_cow_block+0x126/0x500 [btrfs]
[<ffffffffa043dfba>] btrfs_cow_block+0x17a/0x230 [btrfs]
[<ffffffffa04425b1>] btrfs_search_slot+0x381/0x820 [btrfs]
[<ffffffffa044463c>] btrfs_insert_empty_items+0x7c/0x120 [btrfs]
[<ffffffffa048f31b>] btrfs_alloc_dev_extent+0x9b/0x1c0 [btrfs]
[<ffffffffa048f9ca>] __btrfs_alloc_chunk+0x58a/0x850 [btrfs]
[<ffffffffa049239f>] btrfs_alloc_chunk+0xbf/0x160 [btrfs]
[<ffffffffa044e81b>] do_chunk_alloc+0x32b/0x510 [btrfs]
[<ffffffffa04532ad>] find_free_extent+0x9cd/0xb90 [btrfs]
[<ffffffffa0453510>] btrfs_reserve_extent+0xa0/0x1b0 [btrfs]
[<ffffffffa0453bc9>] btrfs_alloc_free_block+0xf9/0x570 [btrfs]
[<ffffffffa043d9e6>] __btrfs_cow_block+0x126/0x500 [btrfs]
[<ffffffffa043dfba>] btrfs_cow_block+0x17a/0x230 [btrfs]
[<ffffffffa0441613>] push_leaf_right+0x133/0x1a0 [btrfs]
[<ffffffffa0441d51>] split_leaf+0x5e1/0x770 [btrfs]
[<ffffffffa04429b5>] btrfs_search_slot+0x785/0x820 [btrfs]
[<ffffffffa0449c0e>] lookup_inline_extent_backref+0x8e/0x5b0 [btrfs]
[<ffffffffa044a193>] insert_inline_extent_backref+0x63/0x130 [btrfs]
[<ffffffffa044abaf>] __btrfs_inc_extent_ref+0x9f/0x240 [btrfs]
[<ffffffffa0451841>] run_clustered_refs+0x971/0xd00 [btrfs]
[<ffffffffa0455db0>] btrfs_run_delayed_refs+0xd0/0x330 [btrfs]
[<ffffffffa0467a87>] __btrfs_end_transaction+0xf7/0x440 [btrfs]
[<ffffffffa0467e20>] btrfs_end_transaction+0x10/0x20 [btrfs]




On Mon, Jun 24, 2013 at 9:56 PM, Alex Lyakas
<alex.btrfs@zadarastorage.com> wrote:
>
> Thanks for commenting Josef. I hope your head will get better:)
> Actually, while re-looking at the code, I see that there are couple of
> "goto cleanup;", before we ensure that all the writers have detached
> from the committing transaction. So Liu's code is still needed, looks
> like.
>
> Thanks,
> Alex.
>
> On Mon, Jun 24, 2013 at 7:24 PM, Josef Bacik <jbacik@fusionio.com> wrote:
> > On Sun, Jun 23, 2013 at 09:52:14PM +0300, Alex Lyakas wrote:
> >> Hello Josef, Liu,
> >> I am reviewing commits in the mainline tree:
> >>
> >> e4a2bcaca9643e7430207810653222fc5187f2be Btrfs: if we aren't
> >> committing just end the transaction if we error out
> >> (call end_transaction() instead of goto cleanup_transaction) - Josef
> >>
> >> f094ac32aba3a51c00e970a2ea029339af2ca048 Btrfs: fix NULL pointer after
> >> aborting a transaction
> >> (wait until all writers detach, before setting running_transaction to
> >> NULL) - Liu
> >>
> >> 66b6135b7cf741f6f44ba938b27583ea3b83bd12 Btrfs: avoid deadlock on
> >> transaction waiting list
> >> (check if transaction was already removed from the transactions list) -
> >> Liu
> >>
> >> Josef, according to your fix, if the committer encounters a problem
> >> early, it just doesn't commit. Instead it aborts the transaction
> >> (possibly setting FS to read-only) and detaches from the transaction.
> >> So if this was the only committer (e.g., the transaction_kthread),
> >> then transaction commit will not happen at all. Is this what you
> >> intended? So then the user will notice that FS went read-only, and she
> >> will unmount the FS, and transaction will be cleaned up in
> >> btrfs_error_commit_super()=>btrfs_cleanup_transaction(), and not in
> >> cleanup_transaction() via btrfs_commit_transaction(). Is my
> >> understanding correct?
> >>
> >> Liu, it looks like after having Josef's fix, the above two commits of
> >> yours are not strictly needed, right? Because Josef's fix ensures that
> >> only the "real" committer will call cleanup_transaction(), so at this
> >> point there is only one writer attached to the transaction, which is
> >> the committer itself (your fixes doesn't hurt though). Is that
> >> correct?
> >>
> >
> > I've looked at the patches and I'm going to say yes with the caveat that
> > I
> > stopped thinking about it when my head started hurting :).  Thanks,
> >
> > Josef

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: question about transaction-abort-related commits
  2013-06-26 14:16     ` Alex Lyakas
@ 2013-06-30  8:29       ` Alex Lyakas
  2013-06-30 11:36         ` Josef Bacik
  0 siblings, 1 reply; 7+ messages in thread
From: Alex Lyakas @ 2013-06-30  8:29 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, Shyam Kaushik

Hi Josef,

On Wed, Jun 26, 2013 at 5:16 PM, Alex Lyakas
<alex.btrfs@zadarastorage.com> wrote:
> Hi Josef,
> Can you please help me with another question.
>
> I am looking at your patch:
> Btrfs: fix chunk allocation error handling
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0448748849ef7c593be40e2c1404f7974bd3aac6
>
> Here you changed the order of btrfs_make_block_group() vs
> btrfs_alloc_dev_extent(), because we could have allocated from the
> in-memory block group, before we have inserted the dev extent into a
> tree. However, with this fix, I hit the deadlock[1] of
> btrfs_alloc_dev_extent() that also wants to allocate a chunk and
> recursively calls do_chunk_alloc, but then is stuck on chunk_mutex.
>
> Was this patch:
> Btrfs: don't re-enter when allocating a chunk
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c6b305a89b1903d63652691ad5eb9f05aa0326b8
> introduced to fix this deadlock?

With these two patches ("Btrfs: fix chunk allocation error handling"
and "Btrfs: don't re-enter when allocating a chunk"), I am hitting
ENOSPC during metadata chunk allocation.

Upon entry into "do_chunk_alloc", I have only one METADATA block-group
as follows:
total_bytes=8388608
bytes_used=7938048
bytes_pinned=446464
bytes_reserved=4096
bytes_readonly=0
bytes_may_use=3362816

As we see bytes_used+bytes_pinned+bytes_reserved==total_bytes

What happens next is that within __btrfs_alloc_chunk():
- find_free_dev_extent() finds a free extent (metadata policy is SINGLE)
- btrfs_alloc_dev_extent() fails with ENOSPC

(btrfs_make_block_group() is called after btrfs_alloc_dev_extent()
with these patches).

What should be done in such situation, when there is not enough
METADATA to allocate a device extent item, but we still don't allow
allocating from the newly-created METADATA block group?

Thanks,
Alex.




>
> Thanks,
> Alex.
>
> [1]
> [<ffffffffa044e57d>] do_chunk_alloc+0x8d/0x510 [btrfs]
> [<ffffffffa04532ad>] find_free_extent+0x9cd/0xb90 [btrfs]
> [<ffffffffa0453510>] btrfs_reserve_extent+0xa0/0x1b0 [btrfs]
> [<ffffffffa0453bc9>] btrfs_alloc_free_block+0xf9/0x570 [btrfs]
> [<ffffffffa043d9e6>] __btrfs_cow_block+0x126/0x500 [btrfs]
> [<ffffffffa043dfba>] btrfs_cow_block+0x17a/0x230 [btrfs]
> [<ffffffffa04425b1>] btrfs_search_slot+0x381/0x820 [btrfs]
> [<ffffffffa044463c>] btrfs_insert_empty_items+0x7c/0x120 [btrfs]
> [<ffffffffa048f31b>] btrfs_alloc_dev_extent+0x9b/0x1c0 [btrfs]
> [<ffffffffa048f9ca>] __btrfs_alloc_chunk+0x58a/0x850 [btrfs]
> [<ffffffffa049239f>] btrfs_alloc_chunk+0xbf/0x160 [btrfs]
> [<ffffffffa044e81b>] do_chunk_alloc+0x32b/0x510 [btrfs]
> [<ffffffffa04532ad>] find_free_extent+0x9cd/0xb90 [btrfs]
> [<ffffffffa0453510>] btrfs_reserve_extent+0xa0/0x1b0 [btrfs]
> [<ffffffffa0453bc9>] btrfs_alloc_free_block+0xf9/0x570 [btrfs]
> [<ffffffffa043d9e6>] __btrfs_cow_block+0x126/0x500 [btrfs]
> [<ffffffffa043dfba>] btrfs_cow_block+0x17a/0x230 [btrfs]
> [<ffffffffa0441613>] push_leaf_right+0x133/0x1a0 [btrfs]
> [<ffffffffa0441d51>] split_leaf+0x5e1/0x770 [btrfs]
> [<ffffffffa04429b5>] btrfs_search_slot+0x785/0x820 [btrfs]
> [<ffffffffa0449c0e>] lookup_inline_extent_backref+0x8e/0x5b0 [btrfs]
> [<ffffffffa044a193>] insert_inline_extent_backref+0x63/0x130 [btrfs]
> [<ffffffffa044abaf>] __btrfs_inc_extent_ref+0x9f/0x240 [btrfs]
> [<ffffffffa0451841>] run_clustered_refs+0x971/0xd00 [btrfs]
> [<ffffffffa0455db0>] btrfs_run_delayed_refs+0xd0/0x330 [btrfs]
> [<ffffffffa0467a87>] __btrfs_end_transaction+0xf7/0x440 [btrfs]
> [<ffffffffa0467e20>] btrfs_end_transaction+0x10/0x20 [btrfs]
>
>
>
>
> On Mon, Jun 24, 2013 at 9:56 PM, Alex Lyakas
> <alex.btrfs@zadarastorage.com> wrote:
>>
>> Thanks for commenting Josef. I hope your head will get better:)
>> Actually, while re-looking at the code, I see that there are couple of
>> "goto cleanup;", before we ensure that all the writers have detached
>> from the committing transaction. So Liu's code is still needed, looks
>> like.
>>
>> Thanks,
>> Alex.
>>
>> On Mon, Jun 24, 2013 at 7:24 PM, Josef Bacik <jbacik@fusionio.com> wrote:
>> > On Sun, Jun 23, 2013 at 09:52:14PM +0300, Alex Lyakas wrote:
>> >> Hello Josef, Liu,
>> >> I am reviewing commits in the mainline tree:
>> >>
>> >> e4a2bcaca9643e7430207810653222fc5187f2be Btrfs: if we aren't
>> >> committing just end the transaction if we error out
>> >> (call end_transaction() instead of goto cleanup_transaction) - Josef
>> >>
>> >> f094ac32aba3a51c00e970a2ea029339af2ca048 Btrfs: fix NULL pointer after
>> >> aborting a transaction
>> >> (wait until all writers detach, before setting running_transaction to
>> >> NULL) - Liu
>> >>
>> >> 66b6135b7cf741f6f44ba938b27583ea3b83bd12 Btrfs: avoid deadlock on
>> >> transaction waiting list
>> >> (check if transaction was already removed from the transactions list) -
>> >> Liu
>> >>
>> >> Josef, according to your fix, if the committer encounters a problem
>> >> early, it just doesn't commit. Instead it aborts the transaction
>> >> (possibly setting FS to read-only) and detaches from the transaction.
>> >> So if this was the only committer (e.g., the transaction_kthread),
>> >> then transaction commit will not happen at all. Is this what you
>> >> intended? So then the user will notice that FS went read-only, and she
>> >> will unmount the FS, and transaction will be cleaned up in
>> >> btrfs_error_commit_super()=>btrfs_cleanup_transaction(), and not in
>> >> cleanup_transaction() via btrfs_commit_transaction(). Is my
>> >> understanding correct?
>> >>
>> >> Liu, it looks like after having Josef's fix, the above two commits of
>> >> yours are not strictly needed, right? Because Josef's fix ensures that
>> >> only the "real" committer will call cleanup_transaction(), so at this
>> >> point there is only one writer attached to the transaction, which is
>> >> the committer itself (your fixes doesn't hurt though). Is that
>> >> correct?
>> >>
>> >
>> > I've looked at the patches and I'm going to say yes with the caveat that
>> > I
>> > stopped thinking about it when my head started hurting :).  Thanks,
>> >
>> > Josef

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: question about transaction-abort-related commits
  2013-06-30  8:29       ` Alex Lyakas
@ 2013-06-30 11:36         ` Josef Bacik
  2013-07-02 16:35           ` Alex Lyakas
  0 siblings, 1 reply; 7+ messages in thread
From: Josef Bacik @ 2013-06-30 11:36 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: Josef Bacik, linux-btrfs, Shyam Kaushik

On Sun, Jun 30, 2013 at 11:29:14AM +0300, Alex Lyakas wrote:
> Hi Josef,
> 
> On Wed, Jun 26, 2013 at 5:16 PM, Alex Lyakas
> <alex.btrfs@zadarastorage.com> wrote:
> > Hi Josef,
> > Can you please help me with another question.
> >
> > I am looking at your patch:
> > Btrfs: fix chunk allocation error handling
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0448748849ef7c593be40e2c1404f7974bd3aac6
> >
> > Here you changed the order of btrfs_make_block_group() vs
> > btrfs_alloc_dev_extent(), because we could have allocated from the
> > in-memory block group, before we have inserted the dev extent into a
> > tree. However, with this fix, I hit the deadlock[1] of
> > btrfs_alloc_dev_extent() that also wants to allocate a chunk and
> > recursively calls do_chunk_alloc, but then is stuck on chunk_mutex.
> >
> > Was this patch:
> > Btrfs: don't re-enter when allocating a chunk
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c6b305a89b1903d63652691ad5eb9f05aa0326b8
> > introduced to fix this deadlock?
> 
> With these two patches ("Btrfs: fix chunk allocation error handling"
> and "Btrfs: don't re-enter when allocating a chunk"), I am hitting
> ENOSPC during metadata chunk allocation.
> 
> Upon entry into "do_chunk_alloc", I have only one METADATA block-group
> as follows:
> total_bytes=8388608
> bytes_used=7938048
> bytes_pinned=446464
> bytes_reserved=4096
> bytes_readonly=0
> bytes_may_use=3362816
> 
> As we see bytes_used+bytes_pinned+bytes_reserved==total_bytes
> 
> What happens next is that within __btrfs_alloc_chunk():
> - find_free_dev_extent() finds a free extent (metadata policy is SINGLE)
> - btrfs_alloc_dev_extent() fails with ENOSPC
> 
> (btrfs_make_block_group() is called after btrfs_alloc_dev_extent()
> with these patches).
> 
> What should be done in such situation, when there is not enough
> METADATA to allocate a device extent item, but we still don't allow
> allocating from the newly-created METADATA block group?
> 

So I had a third patch that you are likely missing that makes sure we try and
allocate chunks sooner specifically for this case

96f1bb57771f71bf1d55d5031a1cf47908494330

and then Miao made it better I think with this

3c76cd84e0c0d3ceb094a1020f8c55c2417e18d3

Thanks,

Josef

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: question about transaction-abort-related commits
  2013-06-30 11:36         ` Josef Bacik
@ 2013-07-02 16:35           ` Alex Lyakas
  0 siblings, 0 replies; 7+ messages in thread
From: Alex Lyakas @ 2013-07-02 16:35 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, Shyam Kaushik

On Sun, Jun 30, 2013 at 2:36 PM, Josef Bacik <jbacik@fusionio.com> wrote:
> On Sun, Jun 30, 2013 at 11:29:14AM +0300, Alex Lyakas wrote:
>> Hi Josef,
>>
>> On Wed, Jun 26, 2013 at 5:16 PM, Alex Lyakas
>> <alex.btrfs@zadarastorage.com> wrote:
>> > Hi Josef,
>> > Can you please help me with another question.
>> >
>> > I am looking at your patch:
>> > Btrfs: fix chunk allocation error handling
>> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0448748849ef7c593be40e2c1404f7974bd3aac6
>> >
>> > Here you changed the order of btrfs_make_block_group() vs
>> > btrfs_alloc_dev_extent(), because we could have allocated from the
>> > in-memory block group, before we have inserted the dev extent into a
>> > tree. However, with this fix, I hit the deadlock[1] of
>> > btrfs_alloc_dev_extent() that also wants to allocate a chunk and
>> > recursively calls do_chunk_alloc, but then is stuck on chunk_mutex.
>> >
>> > Was this patch:
>> > Btrfs: don't re-enter when allocating a chunk
>> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c6b305a89b1903d63652691ad5eb9f05aa0326b8
>> > introduced to fix this deadlock?
>>
>> With these two patches ("Btrfs: fix chunk allocation error handling"
>> and "Btrfs: don't re-enter when allocating a chunk"), I am hitting
>> ENOSPC during metadata chunk allocation.
>>
>> Upon entry into "do_chunk_alloc", I have only one METADATA block-group
>> as follows:
>> total_bytes=8388608
>> bytes_used=7938048
>> bytes_pinned=446464
>> bytes_reserved=4096
>> bytes_readonly=0
>> bytes_may_use=3362816
>>
>> As we see bytes_used+bytes_pinned+bytes_reserved==total_bytes
>>
>> What happens next is that within __btrfs_alloc_chunk():
>> - find_free_dev_extent() finds a free extent (metadata policy is SINGLE)
>> - btrfs_alloc_dev_extent() fails with ENOSPC
>>
>> (btrfs_make_block_group() is called after btrfs_alloc_dev_extent()
>> with these patches).
>>
>> What should be done in such situation, when there is not enough
>> METADATA to allocate a device extent item, but we still don't allow
>> allocating from the newly-created METADATA block group?
>>
>
> So I had a third patch that you are likely missing that makes sure we try and
> allocate chunks sooner specifically for this case
>
> 96f1bb57771f71bf1d55d5031a1cf47908494330
>
> and then Miao made it better I think with this
>
> 3c76cd84e0c0d3ceb094a1020f8c55c2417e18d3
>
> Thanks,
>
> Josef

Thank you Josef, I didn't realize that.

Alex.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-07-02 16:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-23 18:52 question about transaction-abort-related commits Alex Lyakas
2013-06-24 16:24 ` Josef Bacik
2013-06-24 18:56   ` Alex Lyakas
2013-06-26 14:16     ` Alex Lyakas
2013-06-30  8:29       ` Alex Lyakas
2013-06-30 11:36         ` Josef Bacik
2013-07-02 16:35           ` Alex Lyakas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).