From: Cornelia Huck <cohuck@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, "Michael S . Tsirkin" <mst@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
"Halil Pasic" <pasic@linux.ibm.com>
Subject: Re: [Qemu-devel] [PATCH v4 1/2] qemu-error: introduce {error|warn}_report_once
Date: Wed, 30 May 2018 15:36:00 +0200 [thread overview]
Message-ID: <20180530153600.63f16405.cohuck@redhat.com> (raw)
In-Reply-To: <20180530033045.GC25245@xz-mi>
On Wed, 30 May 2018 11:30:45 +0800
Peter Xu <peterx@redhat.com> wrote:
> On Tue, May 29, 2018 at 11:30:00AM +0200, Cornelia Huck wrote:
> > On Thu, 24 May 2018 12:44:53 +0800
> > Peter Xu <peterx@redhat.com> wrote:
> >
> > > There are many error_report()s that can be used in frequently called
> > > functions, especially on IO paths. That can be unideal in that
> > > malicious guest can try to trigger the error tons of time which might
> > > use up the log space on the host (e.g., libvirt can capture the stderr
> > > of QEMU and put it persistently onto disk). In VT-d emulation code, we
> > > have trace_vtd_error() tracer. AFAIU all those places can be replaced
> > > by something like error_report() but trace points are mostly used to
> > > avoid the DDOS attack that mentioned above. However using trace points
> > > mean that errors are not dumped if trace not enabled.
> > >
> > > It's not a big deal in most modern server managements since we have
> > > things like logrotate to maintain the logs and make sure the quota is
> > > expected. However it'll still be nice that we just provide another way
> > > to restrict message generations. In most cases, this kind of
> > > error_report()s will only provide valid information on the first message
> > > sent, and all the rest of similar messages will be mostly talking about
> > > the same thing. This patch introduces *_report_once() helpers to allow
> > > a message to be dumped only once during one QEMU process's life cycle.
> > > It will make sure: (1) it's on by deffault, so we can even get something
> > > without turning the trace on and reproducing, and (2) it won't be
> > > affected by DDOS attack.
> >
> > This is good for something (sub-)system wide, where it is enough to
> > alert the user once; but we may want to print something e.g. once per
> > the device where it happens (see v3 of "vfio-ccw: add force unlimited
> > prefetch property" for an example).
>
> I'm glad that we start to have more users of it, no matter which
> implementation we'll choose. At least it means it makes some sense to
> have such a thing.
>
> For me this patch works nicely enough. Of course there can be
> per-device errors, but AFAICT mostly we don't have any error at
> all... and my debugging experience is that when multiple error happens
> on different devices simutaneously, they'll very possible that it's
> caused by the same problem (again, errors are rare after all), or, the
> rest of the problems are caused by the first error only (so the first
> error might cause a collapse of the rest). That's why I wanted to
> always debug with the first error I encounter, because that's mostly
> always the root cause. In that sense, current patch works nicely for
> me (note that we can have device ID embeded in the error message with
> this one).
I think we have slightly different use cases here. In your case,
something (an error) happened and you don't really care about any
subsequent messages. In the vfio-ccw case, we want to log when a guest
tries to do things on a certain device, so we can possibly throw a
magic switch for that device. We still want messages after that so we
can catch further devices for which we may want to throw the magic
switch. (Similar for the other message, we want to give a hint why a
certain device may not work as expected.)
>
> At the same time, I think this patch should be easier to use of course
> - no extra variables to define, and very self contained. So I would
> slightly prefer current patch.
>
> However I'm also fine with the approach proposed in the vfio-ccw patch
> too. Though if so I would possibly drop the 2nd patch too since if
> with the vfio-ccw patch I'll need to introduce one bool for every
> trace_vtd_err() place... then I'd not bother with it any more but
> instead live with that trace_*(). ;) Or I can define one per-IOMMU
> error boolean and pass it in for each of the error_report_once(), but
> it seems a bit awkward too.
I think we can have both the fine-grained control and convenience
macros for those cases where we really just want to print a message
once.
>
> >
> > >
> > > To implement it, I stole the printk_once() macro from Linux.
> >
> > Would something akin to printk_ratelimited() also make sense to avoid
> > log flooding?
>
> Yes it will. IMHO we can have that too as follow up if we want, and
> it does not conflict with this print_once(). I'd say currently this
> error_report_once() is good enough for me, especially lightweighted.
> I suspect we'll need more lines to implement a ratelimited version.
Yes, and I agree that wants to be a separate patch should we find a use
case for it.
next prev parent reply other threads:[~2018-05-30 13:36 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-24 4:44 [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Peter Xu
2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 1/2] qemu-error: " Peter Xu
2018-05-29 9:30 ` Cornelia Huck
2018-05-30 3:30 ` Peter Xu
2018-05-30 13:36 ` Cornelia Huck [this message]
2018-06-13 7:56 ` Markus Armbruster
2018-05-30 4:47 ` Michael S. Tsirkin
2018-05-30 6:39 ` Peter Xu
2018-05-30 13:53 ` Cornelia Huck
2018-06-13 7:59 ` Markus Armbruster
2018-05-30 15:15 ` Halil Pasic
2018-05-30 15:19 ` Michael S. Tsirkin
2018-05-30 15:30 ` Halil Pasic
2018-06-13 8:01 ` Markus Armbruster
2018-06-13 9:08 ` Peter Xu
2018-05-24 4:44 ` [Qemu-devel] [PATCH v4 2/2] intel-iommu: start to use error_report_once Peter Xu
2018-06-13 8:05 ` Markus Armbruster
2018-06-13 9:36 ` Auger Eric
2018-06-14 12:51 ` Markus Armbruster
2018-06-14 12:56 ` Peter Maydell
2018-08-15 5:58 ` [Qemu-devel] [PATCH v4 0/2] error-report: introduce {error|warn}_report_once Markus Armbruster
2018-08-15 6:10 ` Peter Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180530153600.63f16405.cohuck@redhat.com \
--to=cohuck@redhat.com \
--cc=armbru@redhat.com \
--cc=f4bug@amsat.org \
--cc=jasowang@redhat.com \
--cc=mst@redhat.com \
--cc=pasic@linux.ibm.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.