public inbox for qemu-devel@nongnu.org
From: Kevin Wolf <kwolf@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: qemu-block@nongnu.org, hreitz@redhat.com, xeor@yandex-team.ru,
	vsementsov@yandex-team.ru, pkrempa@redhat.com,
	qemu-devel@nongnu.org, qemu-stable@nongnu.org
Subject: Re: [PATCH v2] block: Never drop BLOCK_IO_ERROR with action=stop for rate limiting
Date: Wed, 4 Mar 2026 16:39:32 +0100
Message-ID: <aahSNLncXvcnHEFU@redhat.com>
In-Reply-To: <aagoLvb3zpLlVtED@redhat.com>

On 04.03.2026 at 13:40, Daniel P. Berrangé wrote:
> On Wed, Mar 04, 2026 at 01:28:00PM +0100, Kevin Wolf wrote:
> > Commit 2155d2dd introduced rate limiting for BLOCK_IO_ERROR to emit an
> > event only once a second. This makes sense for cases in which the guest
> > keeps running and can submit more requests that would possibly also fail
> > because there is a problem with the backend.
> > 
> > However, if the error policy is configured so that the VM is stopped on
> > errors, this is both unnecessary because stopping the VM means that the
> > guest can't issue more requests and in fact harmful because stopping the
> > VM is an important state change that management tools need to keep track
> > of even if it happens more than once in a given second. If an event is
> > dropped, the management tool would see a VM randomly going to paused
> > state without an associated error, so it has a hard time deciding how to
> > handle the situation.
> > 
> > This patch disables rate limiting for action=stop by not relying on the
> > event type alone any more in monitor_qapi_event_queue_no_reenter(), but
> > checking action for BLOCK_IO_ERROR, too. If the error is reported to the
> > guest or ignored, the rate limiting stays in place.
> > 
> > Fixes: 2155d2dd7f73 ('block-backend: per-device throttling of BLOCK_IO_ERROR reports')
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  qapi/block-core.json |  2 +-
> >  monitor/monitor.c    | 21 ++++++++++++++++++++-
> >  2 files changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index b66bf316e2f..da0b36a3751 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -5794,7 +5794,7 @@
> >  # .. note:: If action is "stop", a `STOP` event will eventually follow
> >  #    the `BLOCK_IO_ERROR` event.
> >  #
> > -# .. note:: This event is rate-limited.
> > +# .. note:: This event is rate-limited, except if action is "stop".
> >  #
> >  # Since: 0.13
> >  #
> > diff --git a/monitor/monitor.c b/monitor/monitor.c
> > index 1273eb72605..37fa674cfe6 100644
> > --- a/monitor/monitor.c
> > +++ b/monitor/monitor.c
> > @@ -367,14 +367,33 @@ monitor_qapi_event_queue_no_reenter(QAPIEvent event, QDict *qdict)
> >  {
> >      MonitorQAPIEventConf *evconf;
> >      MonitorQAPIEventState *evstate;
> > +    bool throttled;
> >  
> >      assert(event < QAPI_EVENT__MAX);
> >      evconf = &monitor_qapi_event_conf[event];
> >      trace_monitor_protocol_event_queue(event, qdict, evconf->rate);
> > +    throttled = evconf->rate;
> > +
> > +    /*
> > +     * Rate limit BLOCK_IO_ERROR only for action != "stop".
> > +     *
> > +     * If the VM is stopped after an I/O error, this is important information
> > +     * for the management tool to keep track of the state of QEMU and we can't
> > +     * merge any events. At the same time, stopping the VM means that the guest
> > +     * can't send additional requests and the number of events is already
> > +     * limited, so we can do without rate limiting.
> > +     */
> > +    if (event == QAPI_EVENT_BLOCK_IO_ERROR) {
> > +        QDict *data = qobject_to(QDict, qdict_get(qdict, "data"));
> > +        const char *action = qdict_get_str(data, "action");
> > +        if (!strcmp(action, "stop")) {
> > +            throttled = false;
> > +        }
> > +    }
> 
> Can this be handled in the same way as other events, via the
> qapi_event_throttle_hash & qapi_event_throttle_equal methods?
> 
> E.g. if action is "stop", then ensure "equal" is always false?
> Possibly add a random token to the hash, but that might not be needed
> if 'equal' is always false.

That was v1 and cost me a day debugging the crashes resulting from
events not comparing equal to themselves (which in turn means that
removing them from the hash table fails silently and you get use after
free).
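
To illustrate the failure mode, here is a minimal standalone sketch. It
is not the v1 code and not QEMU's monitor code: Event, Table and all
function names below are hypothetical, and a linear scan stands in for
real hashing. It only assumes a table that finds entries purely through
its equality callback. With a callback that never returns true for
action=stop, removing the very entry that was just added finds nothing,
so a stale pointer stays behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a queued event (not a QEMU type). */
typedef struct {
    const char *device;
    const char *action;
} Event;

#define SLOTS 8

/* Toy table: entries are located only via the equality callback. */
typedef struct {
    const Event *slot[SLOTS];
    bool (*equal)(const Event *a, const Event *b);
} Table;

/* Well-behaved equality: an event always equals itself. */
static bool equal_sane(const Event *a, const Event *b)
{
    return !strcmp(a->device, b->device);
}

/* Broken equality: action=stop events never compare equal, even to themselves. */
static bool equal_never_for_stop(const Event *a, const Event *b)
{
    if (!strcmp(a->action, "stop") || !strcmp(b->action, "stop")) {
        return false;
    }
    return equal_sane(a, b);
}

static void table_add(Table *t, const Event *ev)
{
    for (size_t i = 0; i < SLOTS; i++) {
        if (!t->slot[i]) {
            t->slot[i] = ev;
            return;
        }
    }
    assert(!"table full");
}

/* Returns true if an entry was removed; false is the "silent" failure. */
static bool table_remove(Table *t, const Event *ev)
{
    for (size_t i = 0; i < SLOTS; i++) {
        if (t->slot[i] && t->equal(t->slot[i], ev)) {
            t->slot[i] = NULL;
            return true;
        }
    }
    return false;
}

static size_t table_count(const Table *t)
{
    size_t n = 0;
    for (size_t i = 0; i < SLOTS; i++) {
        n += t->slot[i] != NULL;
    }
    return n;
}

/* Adds one event, removes the same event, and reports what is left over. */
static size_t stale_after_remove(bool (*equal)(const Event *, const Event *),
                                 const char *action)
{
    Table t = { .equal = equal };
    Event ev = { "drive0", action };

    table_add(&t, &ev);
    table_remove(&t, &ev);   /* return value ignored, as a careless caller would */
    return table_count(&t);
}
```

Calling stale_after_remove(equal_never_for_stop, "stop") leaves one
entry in the table. If the caller frees the event after the "removal",
that leftover entry is a dangling pointer, which is the use after free
described above.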

Kevin


