* Re: [PATCH] mmc: block: handle complete_work on the system_highpri workqueue
[not found] <1548965800-3080-1-git-send-email-zhays@lexmark.com>
@ 2019-02-04 6:24 ` Ulf Hansson
2019-02-04 8:02 ` Adrian Hunter
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Ulf Hansson @ 2019-02-04 6:24 UTC (permalink / raw)
To: Zachary Hays, Jens Axboe, Christoph Hellwig, Adrian Hunter,
Linus Walleij
Cc: linux-mmc@vger.kernel.org, linux-block
+ Jens, Christoph, Adrian, Linus
On Thu, 31 Jan 2019 at 21:16, Zachary Hays <zhays@lexmark.com> wrote:
>
> The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> This generates a rescuer thread for that queue that will trigger when
> the CPU is under heavy load and collect the uncompleted work.
>
> In the case of mmc, this creates the possibility of a deadlock as
> other blk-mq is also run on the same queue. For example:
>
> - worker 0 claims the mmc host
> - worker 1 attempts to claim the host
> - worker 0 schedules complete_work to release the host
> - rescuer thread is triggered after time-out and collects the dangling
> work
> - rescuer thread attempts to complete the work in order starting with
> claim host
> - the task to release host is now blocked by a task to claim it and
> will never be called
>
> The above results in multiple hung tasks that lead to failures to boot.
>
> Switching complete_work to the system_highpri queue avoids this
> because system_highpri is not flagged with WQ_MEM_RECLAIM. This allows
> the host to be released without getting blocked by other claims tasks.
>
Thanks for fix and the detailed description to the problem!
> Signed-off-by: Zachary Hays <zhays@lexmark.com>
> ---
> drivers/mmc/core/block.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index aef1185f383d..59b6b41b84c6 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -2112,7 +2112,7 @@ static void mmc_blk_mq_req_done(struct mmc_request *mrq)
> if (waiting)
> wake_up(&mq->wait);
> else
> - kblockd_schedule_work(&mq->complete_work);
> + queue_work(system_highpri_wq, &mq->complete_work);
Even if this solves the problem, I think we need some input from some
of the block experts/maintainers to understand if this is the correct
way to fix the problem. So, I have lopped them in.
I am guessing MMC is not the only block device driver that have this
kind of locking issue. Or perhaps it is..
>
> return;
> }
> --
> 2.7.4
>
Kind regards
Uffe
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mmc: block: handle complete_work on the system_highpri workqueue
2019-02-04 6:24 ` [PATCH] mmc: block: handle complete_work on the system_highpri workqueue Ulf Hansson
@ 2019-02-04 8:02 ` Adrian Hunter
2019-02-04 14:30 ` Christoph Hellwig
2019-02-04 8:25 ` Christoph Hellwig
2019-02-04 12:52 ` Ulf Hansson
2 siblings, 1 reply; 8+ messages in thread
From: Adrian Hunter @ 2019-02-04 8:02 UTC (permalink / raw)
To: Ulf Hansson, Zachary Hays, Jens Axboe, Christoph Hellwig,
Linus Walleij
Cc: linux-mmc@vger.kernel.org, linux-block
On 4/02/19 8:24 AM, Ulf Hansson wrote:
> + Jens, Christoph, Adrian, Linus
>
> On Thu, 31 Jan 2019 at 21:16, Zachary Hays <zhays@lexmark.com> wrote:
>>
>> The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
>> This generates a rescuer thread for that queue that will trigger when
>> the CPU is under heavy load and collect the uncompleted work.
>>
>> In the case of mmc, this creates the possibility of a deadlock as
>> other blk-mq is also run on the same queue. For example:
>>
>> - worker 0 claims the mmc host
>> - worker 1 attempts to claim the host
>> - worker 0 schedules complete_work to release the host
>> - rescuer thread is triggered after time-out and collects the dangling
>> work
>> - rescuer thread attempts to complete the work in order starting with
>> claim host
>> - the task to release host is now blocked by a task to claim it and
>> will never be called
>>
>> The above results in multiple hung tasks that lead to failures to boot.
>>
>> Switching complete_work to the system_highpri queue avoids this
>> because system_highpri is not flagged with WQ_MEM_RECLAIM. This allows
>> the host to be released without getting blocked by other claims tasks.
>>
>
> Thanks for fix and the detailed description to the problem!
>
>> Signed-off-by: Zachary Hays <zhays@lexmark.com>
>> ---
>> drivers/mmc/core/block.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
>> index aef1185f383d..59b6b41b84c6 100644
>> --- a/drivers/mmc/core/block.c
>> +++ b/drivers/mmc/core/block.c
>> @@ -2112,7 +2112,7 @@ static void mmc_blk_mq_req_done(struct mmc_request *mrq)
>> if (waiting)
>> wake_up(&mq->wait);
>> else
>> - kblockd_schedule_work(&mq->complete_work);
>> + queue_work(system_highpri_wq, &mq->complete_work);
>
> Even if this solves the problem, I think we need some input from some
> of the block experts/maintainers to understand if this is the correct
> way to fix the problem. So, I have lopped them in.
>
> I am guessing MMC is not the only block device driver that have this
> kind of locking issue. Or perhaps it is..
WRT kblockd_workqueue, there is also still this issue outstanding:
https://lore.kernel.org/lkml/20170921140729.GA17333@lst.de/
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mmc: block: handle complete_work on the system_highpri workqueue
2019-02-04 6:24 ` [PATCH] mmc: block: handle complete_work on the system_highpri workqueue Ulf Hansson
2019-02-04 8:02 ` Adrian Hunter
@ 2019-02-04 8:25 ` Christoph Hellwig
2019-02-04 12:30 ` Ulf Hansson
2019-02-04 12:52 ` Ulf Hansson
2 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2019-02-04 8:25 UTC (permalink / raw)
To: Ulf Hansson
Cc: Zachary Hays, Jens Axboe, Christoph Hellwig, Adrian Hunter,
Linus Walleij, linux-mmc@vger.kernel.org, linux-block
On Mon, Feb 04, 2019 at 07:24:43AM +0100, Ulf Hansson wrote:
> > The above results in multiple hung tasks that lead to failures to boot.
> >
> > Switching complete_work to the system_highpri queue avoids this
> > because system_highpri is not flagged with WQ_MEM_RECLAIM. This allows
> > the host to be released without getting blocked by other claims tasks.
> >
>
> Thanks for fix and the detailed description to the problem!
I don't think this is correct, though. If you think of the swap
(or memory reclaim of file backed pages) case on mmc, completing
a mmc request can very much be critical for completing that memory
reclaim. Thus we absolutely do need a WQ_MEM_RECLAIM workqueue.
You probably want to create a private WQ_MEM_RECLAIM workqueue in the
mmc code for that.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mmc: block: handle complete_work on the system_highpri workqueue
2019-02-04 8:25 ` Christoph Hellwig
@ 2019-02-04 12:30 ` Ulf Hansson
2019-02-04 14:28 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Ulf Hansson @ 2019-02-04 12:30 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Zachary Hays, Jens Axboe, Adrian Hunter, Linus Walleij,
linux-mmc@vger.kernel.org, linux-block
On Mon, 4 Feb 2019 at 09:25, Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Feb 04, 2019 at 07:24:43AM +0100, Ulf Hansson wrote:
> > > The above results in multiple hung tasks that lead to failures to boot.
> > >
> > > Switching complete_work to the system_highpri queue avoids this
> > > because system_highpri is not flagged with WQ_MEM_RECLAIM. This allows
> > > the host to be released without getting blocked by other claims tasks.
> > >
> >
> > Thanks for fix and the detailed description to the problem!
>
> I don't think this is correct, though. If you think of the swap
> (or memory reclaim of file backed pages) case on mmc, completing
> a mmc request can very much be critical for completing that memory
> reclaim. Thus we absolutely do need a WQ_MEM_RECLAIM workqueue.
> You probably want to create a private WQ_MEM_RECLAIM workqueue in the
> mmc code for that.
Thanks for your input!
Although, I am not sure why having our own mmc workqueue, would fix
this problem. Couldn't we hit the same kind of deadlock anyways you
think?
Kind regards
Uffe
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mmc: block: handle complete_work on the system_highpri workqueue
2019-02-04 6:24 ` [PATCH] mmc: block: handle complete_work on the system_highpri workqueue Ulf Hansson
2019-02-04 8:02 ` Adrian Hunter
2019-02-04 8:25 ` Christoph Hellwig
@ 2019-02-04 12:52 ` Ulf Hansson
2 siblings, 0 replies; 8+ messages in thread
From: Ulf Hansson @ 2019-02-04 12:52 UTC (permalink / raw)
To: Zachary Hays, Jens Axboe, Christoph Hellwig, Adrian Hunter,
Linus Walleij
Cc: linux-mmc@vger.kernel.org, linux-block
On Mon, 4 Feb 2019 at 07:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> + Jens, Christoph, Adrian, Linus
>
> On Thu, 31 Jan 2019 at 21:16, Zachary Hays <zhays@lexmark.com> wrote:
> >
> > The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> > This generates a rescuer thread for that queue that will trigger when
> > the CPU is under heavy load and collect the uncompleted work.
> >
> > In the case of mmc, this creates the possibility of a deadlock as
> > other blk-mq is also run on the same queue. For example:
> >
> > - worker 0 claims the mmc host
> > - worker 1 attempts to claim the host
> > - worker 0 schedules complete_work to release the host
> > - rescuer thread is triggered after time-out and collects the dangling
> > work
> > - rescuer thread attempts to complete the work in order starting with
> > claim host
> > - the task to release host is now blocked by a task to claim it and
> > will never be called
A second thought about this.
Claiming and releasing the host, is a bit special managed in case the
claiming is done to serve a block I/O request. The mmc host is
actually re-claimable for these cases, which is needed to allow us to
operate on two I/O requests simultaneously - for the same mmc host.
mmc_claim_host() shouldn't even have to wait to retrieve access to the
mmc host for these cases. So, it's a bit weird that you observes this
deadlock/hang.
Perhaps there is a problem internally with __mmc_claim_host() and
mmc_release_host(), that we have overlooked when we introduced the
re-claimable host for the block I/O path. There is a wait queue in
there, perhaps that isn't working as we expect with this scenario...
> >
> > The above results in multiple hung tasks that lead to failures to boot.
> >
> > Switching complete_work to the system_highpri queue avoids this
> > because system_highpri is not flagged with WQ_MEM_RECLAIM. This allows
> > the host to be released without getting blocked by other claims tasks.
> >
>
[...]
Kind regards
Uffe
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mmc: block: handle complete_work on the system_highpri workqueue
2019-02-04 12:30 ` Ulf Hansson
@ 2019-02-04 14:28 ` Christoph Hellwig
2019-02-05 14:41 ` Zak Hays
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2019-02-04 14:28 UTC (permalink / raw)
To: Ulf Hansson
Cc: Christoph Hellwig, Zachary Hays, Jens Axboe, Adrian Hunter,
Linus Walleij, linux-mmc@vger.kernel.org, linux-block
On Mon, Feb 04, 2019 at 01:30:37PM +0100, Ulf Hansson wrote:
> Although, I am not sure why having our own mmc workqueue, would fix
> this problem. Couldn't we hit the same kind of deadlock anyways you
> think?
Maybe I misunderstood the issue. I thought the problem was that
the one rescuer kblockd thread is executing some block submission
work item, which is blocking because it waits for something in
completion handling.
Now with another workqueue we have two rescuer threads, one that
exectures all completions, which shouldn't depend on items in
the submission queue, and another one executing the block layer
submission, which might or might not depend on something happening
to another request in the completion workqueue.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mmc: block: handle complete_work on the system_highpri workqueue
2019-02-04 8:02 ` Adrian Hunter
@ 2019-02-04 14:30 ` Christoph Hellwig
0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2019-02-04 14:30 UTC (permalink / raw)
To: Adrian Hunter
Cc: Ulf Hansson, Zachary Hays, Jens Axboe, Christoph Hellwig,
Linus Walleij, linux-mmc@vger.kernel.org, linux-block
On Mon, Feb 04, 2019 at 10:02:18AM +0200, Adrian Hunter wrote:
> On 4/02/19 8:24 AM, Ulf Hansson wrote:
> > + Jens, Christoph, Adrian, Linus
> >
> > On Thu, 31 Jan 2019 at 21:16, Zachary Hays <zhays@lexmark.com> wrote:
> >>
> >> The kblockd workqueue is created with the WQ_MEM_RECLAIM flag set.
> >> This generates a rescuer thread for that queue that will trigger when
> >> the CPU is under heavy load and collect the uncompleted work.
> >>
> >> In the case of mmc, this creates the possibility of a deadlock as
> >> other blk-mq is also run on the same queue. For example:
> >>
> >> - worker 0 claims the mmc host
> >> - worker 1 attempts to claim the host
> >> - worker 0 schedules complete_work to release the host
> >> - rescuer thread is triggered after time-out and collects the dangling
> >> work
> >> - rescuer thread attempts to complete the work in order starting with
> >> claim host
> >> - the task to release host is now blocked by a task to claim it and
> >> will never be called
> >>
> >> The above results in multiple hung tasks that lead to failures to boot.
> >>
> >> Switching complete_work to the system_highpri queue avoids this
> >> because system_highpri is not flagged with WQ_MEM_RECLAIM. This allows
> >> the host to be released without getting blocked by other claims tasks.
> >>
> >
> > Thanks for fix and the detailed description to the problem!
> >
> >> Signed-off-by: Zachary Hays <zhays@lexmark.com>
> >> ---
> >> drivers/mmc/core/block.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> >> index aef1185f383d..59b6b41b84c6 100644
> >> --- a/drivers/mmc/core/block.c
> >> +++ b/drivers/mmc/core/block.c
> >> @@ -2112,7 +2112,7 @@ static void mmc_blk_mq_req_done(struct mmc_request *mrq)
> >> if (waiting)
> >> wake_up(&mq->wait);
> >> else
> >> - kblockd_schedule_work(&mq->complete_work);
> >> + queue_work(system_highpri_wq, &mq->complete_work);
> >
> > Even if this solves the problem, I think we need some input from some
> > of the block experts/maintainers to understand if this is the correct
> > way to fix the problem. So, I have lopped them in.
> >
> > I am guessing MMC is not the only block device driver that have this
> > kind of locking issue. Or perhaps it is..
>
> WRT kblockd_workqueue, there is also still this issue outstanding:
>
> https://lore.kernel.org/lkml/20170921140729.GA17333@lst.de/
Did you post your summary to the block list? Or even looked into a
prototype that uses a kthread for the deferred submission in blk-mq
instead of a workqueue?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mmc: block: handle complete_work on the system_highpri workqueue
2019-02-04 14:28 ` Christoph Hellwig
@ 2019-02-05 14:41 ` Zak Hays
0 siblings, 0 replies; 8+ messages in thread
From: Zak Hays @ 2019-02-05 14:41 UTC (permalink / raw)
To: Christoph Hellwig, Ulf Hansson
Cc: Jens Axboe, Adrian Hunter, Linus Walleij,
linux-mmc@vger.kernel.org, linux-block
> On Mon, Feb 04, 2019 at 01:30:37PM +0100, Ulf Hansson wrote:
> > Although, I am not sure why having our own mmc workqueue, would fix
> > this problem. Couldn't we hit the same kind of deadlock anyways you
> > think?
>
> Maybe I misunderstood the issue. I thought the problem was that
> the one rescuer kblockd thread is executing some block submission
> work item, which is blocking because it waits for something in
> completion handling.
>
> Now with another workqueue we have two rescuer threads, one that
> exectures all completions, which shouldn't depend on items in
> the submission queue, and another one executing the block layer
> submission, which might or might not depend on something happening
> to another request in the completion workqueue.
You understood the issue correctly. Having a second private workqueue should also avoid the deadlock as it avoids having claim and release tasks on the same queue.
I have generated a test build with a private workqueue and it appears to be equivalent to my original patch. If this is the preferred solution, I will clean up my changes and resubmit.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2019-02-05 14:41 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1548965800-3080-1-git-send-email-zhays@lexmark.com>
2019-02-04 6:24 ` [PATCH] mmc: block: handle complete_work on the system_highpri workqueue Ulf Hansson
2019-02-04 8:02 ` Adrian Hunter
2019-02-04 14:30 ` Christoph Hellwig
2019-02-04 8:25 ` Christoph Hellwig
2019-02-04 12:30 ` Ulf Hansson
2019-02-04 14:28 ` Christoph Hellwig
2019-02-05 14:41 ` Zak Hays
2019-02-04 12:52 ` Ulf Hansson
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.