* [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
@ 2017-09-20 10:17 Stefan Hajnoczi
2017-09-20 10:34 ` Manos Pitsidianakis
2017-09-20 11:08 ` Alberto Garcia
0 siblings, 2 replies; 6+ messages in thread
From: Stefan Hajnoczi @ 2017-09-20 10:17 UTC (permalink / raw)
To: qemu-devel
Cc: Max Reitz, Kevin Wolf, qemu-block, Manos Pitsidianakis,
Alberto Garcia, Stefan Hajnoczi
Clear tg->any_timer_armed[] when throttling timers are destroyed during
AioContext attach/detach. Failure to do so causes throttling to hang
because we believe the timer is already scheduled!
The following was broken at least since QEMU 2.10.0 with -drive
iops=100:
$ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
(qemu) stop
(qemu) cont
...I/O is stuck...
Reported-by: Yongxue Hong <yhong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v2:
* Use timer_pending(tt->timers[i]) instead of token [Berto]
* Fix s/destroy/destroyed/ typo [Eric]
block/throttle-groups.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 6ba992c8d7..0a49f3c409 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -592,6 +592,17 @@ void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
{
ThrottleTimers *tt = &tgm->throttle_timers;
+ ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
+
+ qemu_mutex_lock(&tg->lock);
+ if (timer_pending(tt->timers[0])) {
+ tg->any_timer_armed[0] = false;
+ }
+ if (timer_pending(tt->timers[1])) {
+ tg->any_timer_armed[1] = false;
+ }
+ qemu_mutex_unlock(&tg->lock);
+
throttle_timers_detach_aio_context(tt);
tgm->aio_context = NULL;
}
--
2.13.5
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
2017-09-20 10:17 [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach Stefan Hajnoczi
@ 2017-09-20 10:34 ` Manos Pitsidianakis
2017-09-20 11:08 ` Alberto Garcia
1 sibling, 0 replies; 6+ messages in thread
From: Manos Pitsidianakis @ 2017-09-20 10:34 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: qemu-devel, Max Reitz, Kevin Wolf, qemu-block, Alberto Garcia
[-- Attachment #1: Type: text/plain, Size: 1662 bytes --]
On Wed, Sep 20, 2017 at 11:17:40AM +0100, Stefan Hajnoczi wrote:
>Clear tg->any_timer_armed[] when throttling timers are destroyed during
>AioContext attach/detach. Failure to do so causes throttling to hang
>because we believe the timer is already scheduled!
>
>The following was broken at least since QEMU 2.10.0 with -drive
>iops=100:
>
> $ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
> (qemu) stop
> (qemu) cont
> ...I/O is stuck...
>
>Reported-by: Yongxue Hong <yhong@redhat.com>
>Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
>v2:
> * Use timer_pending(tt->timers[i]) instead of token [Berto]
> * Fix s/destroy/destroyed/ typo [Eric]
>
> block/throttle-groups.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
>diff --git a/block/throttle-groups.c b/block/throttle-groups.c
>index 6ba992c8d7..0a49f3c409 100644
>--- a/block/throttle-groups.c
>+++ b/block/throttle-groups.c
>@@ -592,6 +592,17 @@ void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
> void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
> {
> ThrottleTimers *tt = &tgm->throttle_timers;
>+ ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
>+
>+ qemu_mutex_lock(&tg->lock);
>+ if (timer_pending(tt->timers[0])) {
>+ tg->any_timer_armed[0] = false;
>+ }
>+ if (timer_pending(tt->timers[1])) {
>+ tg->any_timer_armed[1] = false;
>+ }
>+ qemu_mutex_unlock(&tg->lock);
>+
> throttle_timers_detach_aio_context(tt);
> tgm->aio_context = NULL;
> }
>--
>2.13.5
>
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
2017-09-20 10:17 [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach Stefan Hajnoczi
2017-09-20 10:34 ` Manos Pitsidianakis
@ 2017-09-20 11:08 ` Alberto Garcia
2017-09-20 11:39 ` Manos Pitsidianakis
1 sibling, 1 reply; 6+ messages in thread
From: Alberto Garcia @ 2017-09-20 11:08 UTC (permalink / raw)
To: Stefan Hajnoczi, qemu-devel
Cc: Max Reitz, Kevin Wolf, qemu-block, Manos Pitsidianakis
On Wed 20 Sep 2017 12:17:40 PM CEST, Stefan Hajnoczi wrote:
> @@ -592,6 +592,17 @@ void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
> void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
> {
> ThrottleTimers *tt = &tgm->throttle_timers;
> + ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
> +
> + qemu_mutex_lock(&tg->lock);
> + if (timer_pending(tt->timers[0])) {
> + tg->any_timer_armed[0] = false;
> + }
> + if (timer_pending(tt->timers[1])) {
> + tg->any_timer_armed[1] = false;
> + }
> + qemu_mutex_unlock(&tg->lock);
> +
> throttle_timers_detach_aio_context(tt);
> tgm->aio_context = NULL;
> }
I'm sorry that I didn't noticed this in my previous e-mail, but after
this call you might have destroyed the timer that was set for that
throttling group, so if there are pending requests waiting it can happen
that no one wakes them up.
I think that the queue needs to be restarted after this, probably after
having reattached the context (or actually after detaching it already,
but then what happens if you try to restart the queue while aio_context
is NULL?).
Berto
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
2017-09-20 11:08 ` Alberto Garcia
@ 2017-09-20 11:39 ` Manos Pitsidianakis
2017-09-20 12:17 ` Alberto Garcia
0 siblings, 1 reply; 6+ messages in thread
From: Manos Pitsidianakis @ 2017-09-20 11:39 UTC (permalink / raw)
To: Alberto Garcia
Cc: Stefan Hajnoczi, qemu-devel, Max Reitz, Kevin Wolf, qemu-block
[-- Attachment #1: Type: text/plain, Size: 1478 bytes --]
On Wed, Sep 20, 2017 at 01:08:52PM +0200, Alberto Garcia wrote:
>On Wed 20 Sep 2017 12:17:40 PM CEST, Stefan Hajnoczi wrote:
>> @@ -592,6 +592,17 @@ void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
>> void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
>> {
>> ThrottleTimers *tt = &tgm->throttle_timers;
>> + ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
>> +
>> + qemu_mutex_lock(&tg->lock);
>> + if (timer_pending(tt->timers[0])) {
>> + tg->any_timer_armed[0] = false;
>> + }
>> + if (timer_pending(tt->timers[1])) {
>> + tg->any_timer_armed[1] = false;
>> + }
>> + qemu_mutex_unlock(&tg->lock);
>> +
>> throttle_timers_detach_aio_context(tt);
>> tgm->aio_context = NULL;
>> }
>
>I'm sorry that I didn't noticed this in my previous e-mail, but after
>this call you might have destroyed the timer that was set for that
>throttling group, so if there are pending requests waiting it can happen
>that no one wakes them up.
>
>I think that the queue needs to be restarted after this, probably after
>having reattached the context (or actually after detaching it already,
>but then what happens if you try to restart the queue while aio_context
>is NULL?).
aio_co_enter in the restart queue function requires that aio_context is
non-NULL. Perhaps calling throttle_group_restart_tgm in
throttle_group_attach_aio_context will suffice.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
2017-09-20 11:39 ` Manos Pitsidianakis
@ 2017-09-20 12:17 ` Alberto Garcia
2017-09-20 14:16 ` Stefan Hajnoczi
0 siblings, 1 reply; 6+ messages in thread
From: Alberto Garcia @ 2017-09-20 12:17 UTC (permalink / raw)
To: Manos Pitsidianakis
Cc: Stefan Hajnoczi, qemu-devel, Max Reitz, Kevin Wolf, qemu-block
On Wed 20 Sep 2017 01:39:02 PM CEST, Manos Pitsidianakis wrote:
>>> void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
>>> {
>>> ThrottleTimers *tt = &tgm->throttle_timers;
>>> + ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
>>> +
>>> + qemu_mutex_lock(&tg->lock);
>>> + if (timer_pending(tt->timers[0])) {
>>> + tg->any_timer_armed[0] = false;
>>> + }
>>> + if (timer_pending(tt->timers[1])) {
>>> + tg->any_timer_armed[1] = false;
>>> + }
>>> + qemu_mutex_unlock(&tg->lock);
>>> +
>>> throttle_timers_detach_aio_context(tt);
>>> tgm->aio_context = NULL;
>>> }
>>
>>I'm sorry that I didn't noticed this in my previous e-mail, but after
>>this call you might have destroyed the timer that was set for that
>>throttling group, so if there are pending requests waiting it can
>>happen that no one wakes them up.
>>
>>I think that the queue needs to be restarted after this, probably
>>after having reattached the context (or actually after detaching it
>>already, but then what happens if you try to restart the queue while
>>aio_context is NULL?).
>
> aio_co_enter in the restart queue function requires that aio_context
> is non-NULL. Perhaps calling throttle_group_restart_tgm in
> throttle_group_attach_aio_context will suffice.
But can we guarantee that everything is safe between the _detach() and
_attach() calls?
Berto
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
2017-09-20 12:17 ` Alberto Garcia
@ 2017-09-20 14:16 ` Stefan Hajnoczi
0 siblings, 0 replies; 6+ messages in thread
From: Stefan Hajnoczi @ 2017-09-20 14:16 UTC (permalink / raw)
To: Alberto Garcia
Cc: Manos Pitsidianakis, qemu-devel, Max Reitz, Kevin Wolf,
qemu-block
On Wed, Sep 20, 2017 at 02:17:51PM +0200, Alberto Garcia wrote:
> On Wed 20 Sep 2017 01:39:02 PM CEST, Manos Pitsidianakis wrote:
> >>> void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
> >>> {
> >>> ThrottleTimers *tt = &tgm->throttle_timers;
> >>> + ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
> >>> +
> >>> + qemu_mutex_lock(&tg->lock);
> >>> + if (timer_pending(tt->timers[0])) {
> >>> + tg->any_timer_armed[0] = false;
> >>> + }
> >>> + if (timer_pending(tt->timers[1])) {
> >>> + tg->any_timer_armed[1] = false;
> >>> + }
> >>> + qemu_mutex_unlock(&tg->lock);
> >>> +
> >>> throttle_timers_detach_aio_context(tt);
> >>> tgm->aio_context = NULL;
> >>> }
> >>
> >>I'm sorry that I didn't noticed this in my previous e-mail, but after
> >>this call you might have destroyed the timer that was set for that
> >>throttling group, so if there are pending requests waiting it can
> >>happen that no one wakes them up.
> >>
> >>I think that the queue needs to be restarted after this, probably
> >>after having reattached the context (or actually after detaching it
> >>already, but then what happens if you try to restart the queue while
> >>aio_context is NULL?).
> >
> > aio_co_enter in the restart queue function requires that aio_context
> > is non-NULL. Perhaps calling throttle_group_restart_tgm in
> > throttle_group_attach_aio_context will suffice.
>
> But can we guarantee that everything is safe between the _detach() and
> _attach() calls?
Here is a second patch that I didn't send because I was unsure whether it's
needed.
The idea is to move the throttle_group_detach/attach_aio_context() into
bdrv_set_aio_context() so it becomes part of the bdrv_drained_begin/end()
region.
The guarantees we have inside the drained region:
1. Throttling has been temporarily disabled.
2. throttle_group_restart_tgm() has been called to kick throttled reqs.
3. All requests are completed.
This is a nice, controlled environment to do the detach/attach. That said,
Berto's point still stands: what about other ThrottleGroupMembers who have
throttled requests queued?
The main reason I didn't publish this patch is because Manos' "block: remove
legacy I/O throttling" series ought to remove this code anyway soon.
diff --git a/block/block-backend.c b/block/block-backend.c
index 45d9101be3..624eb3dc15 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1747,10 +1747,6 @@ void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
if (bs) {
- if (tgm->throttle_state) {
- throttle_group_detach_aio_context(tgm);
- throttle_group_attach_aio_context(tgm, new_context);
- }
bdrv_set_aio_context(bs, new_context);
}
}
@@ -2029,6 +2025,13 @@ static void blk_root_drained_end(BdrvChild *child)
BlockBackend *blk = child->opaque;
assert(blk->quiesce_counter);
+ if (tgm->throttle_state) {
+ AioContext *new_context = bdrv_aio_context(blk_bs(blk));
+
+ throttle_group_detach_aio_context(tgm);
+ throttle_group_attach_aio_context(tgm, new_context);
+ }
+
assert(blk->public.throttle_group_member.io_limits_disabled);
atomic_dec(&blk->public.throttle_group_member.io_limits_disabled);
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-09-20 14:17 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-09-20 10:17 [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach Stefan Hajnoczi
2017-09-20 10:34 ` Manos Pitsidianakis
2017-09-20 11:08 ` Alberto Garcia
2017-09-20 11:39 ` Manos Pitsidianakis
2017-09-20 12:17 ` Alberto Garcia
2017-09-20 14:16 ` Stefan Hajnoczi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).