All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <jaxboe@fusionio.com>
To: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Alasdair Kergon <agk@redhat.com>,
	"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
	device-mapper development <dm-devel@redhat.com>
Subject: Re: [PATCH] Fix deadlock with request based dm and some drivers
Date: Wed, 7 Nov 2012 09:00:57 +0100	[thread overview]
Message-ID: <509A1539.3060203@fusionio.com> (raw)
In-Reply-To: <5099A80B.8070900@ce.jp.nec.com>

On 2012-11-07 01:15, Jun'ichi Nomura wrote:
> On 11/06/12 21:35, Jens Axboe wrote:
>> On 2012-11-06 13:14, Jun'ichi Nomura wrote:
>>> (Cc to dm-devel)
>>>
>>> On 11/06/12 20:29, Jens Axboe wrote:
>>>> I recently fixed a deadlock for a customer we have with
>>>> the below patch. I have queued it up in my tree as not
>>>> to lose it. Can I have an ack from you, or do you want
>>>> to submit it yourself? I've marked it stable as well.
>>>
>>> Could you tell details about how the deadlock happened?
>>>
>>> dm's end_io always throws the completion handling to softirq:
>>>   end_clone_request()
>>>     dm_complete_request()
>>>       blk_complete_request()
>>>
>>> then it is processed in softirq context:
>>>   dm_softirq_done()
>>>     dm_done()
>>>       dm_end_request()
>>>         rq_completed(run_queue=true)
>>>           blk_run_queue()
>>>
>>> Since queue_lock is always held with interrupt disabled,
>>> I couldn't see why it could deadlock.
>>>
>>> Request-based dm followed the completion model of scsi
>>> mid layer. So similar code path exists in scsi, too.
>>> For example:
>>>   scsi_request_fn()
>>>     scsi_kill_request()
>>>       blk_complete_request()
>>> then:
>>>   scsi_softirq_done()
>>>     scsi_io_completion()
>>>       scsi_next_command()
>>>         scsi_run_queue()
>>>           spin_lock queue_lock
>>>           __blk_run_queue()
>>>
>>> If calling run-queue from softirq_done_fn() can cause deadlock,
>>> I'm afraid the problem is not limited to dm.
>>
>> dm_softirq_done()
>>     rq_completed()
>>         blk_run_queue()
>           ^^ this is for dm's queue
>>             dm_request_fn()
>>                 dm_dispatch_request()
>>                     blk_insert_cloned_request()
>>                         __elv_add_request()
>>                             elv_insert()
>>                                 blk_run_queue()
>                                   ^^ this is for lower device's queue
>>                                     ...
>>
>> Basically you recurse back into the request handler, since it ends up
>> running the queue.
> 
> But the queues are different as commented inline above.
> So it should be ok. (from deadlock point of view)

It's still not OK, some drivers end up doing spin_unlock_irq() in their
request_fn. Running unknown request_fn from ipi/irq is a bad idea, imho,
it'll quickly cause problems.

>> But I see I've been too focused on the older release,
>> since we don't actually do that anymore after the plugging rewrite. So
>> it should actually be safe in current kernels and hence the patch only
>> needed for the stable series where that is not the case.
> 
> BTW, using blk_run_queue_async() in softirq_done_fn() might be good
> from performance point of view? It may reduce latency of the softirq
> and have an effect of batching request_fn calls.

Yes, I argued the same for the original people who saw the problem. It
is quite an extensive and expensive path to have off the IPI handler. So
from that point of view, I'd recommend the patch as well.

-- 
Jens Axboe


  reply	other threads:[~2012-11-07  8:00 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-06 11:29 [PATCH] Fix deadlock with request based dm and some drivers Jens Axboe
2012-11-06 12:14 ` Jun'ichi Nomura
2012-11-06 12:35   ` Jens Axboe
2012-11-07  0:15     ` Jun'ichi Nomura
2012-11-07  8:00       ` Jens Axboe [this message]
2012-11-07  8:35         ` Jun'ichi Nomura
2012-11-07 10:59           ` [dm-devel] " Alasdair G Kergon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=509A1539.3060203@fusionio.com \
    --to=jaxboe@fusionio.com \
    --cc=agk@redhat.com \
    --cc=dm-devel@redhat.com \
    --cc=j-nomura@ce.jp.nec.com \
    --cc=linux-raid@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.