Re: [Qemu-devel] [PATCH for-1.5 0/9] Disable expensive QOM cast debugging for official releases

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Paolo Bonzini <pbonzini@redhat.com>
To: Anthony Liguori <aliguori@us.ibm.com>
Cc: qemu-devel@nongnu.org, "Andreas Färber" <afaerber@suse.de>,
	"Aurelien Jarno" <aurelien@aurel32.net>,
	mst@redhat.com
Subject: Re: [Qemu-devel] [PATCH for-1.5 0/9] Disable expensive QOM cast debugging for official releases
Date: Fri, 10 May 2013 22:59:22 +0200	[thread overview]
Message-ID: <518D5FAA.5050507@redhat.com> (raw)
In-Reply-To: <87ip2q91ho.fsf@codemonkey.ws>

Il 10/05/2013 19:41, Anthony Liguori ha scritto:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> Il 10/05/2013 16:39, Anthony Liguori ha scritto:
>>> I just oppose the notion of disabling casts and *especially* only
>>> disabling casts for official builds.
>>
>> This actually happens all the time.  Exactly this kind of type-safe cast
>> is disabled in releases of GCC, but enabled when building from svn trunk.
> 
> Let's assume for a moment that you are right and this behavior is what
> we should have.  Let's also assume there is a real regression here
> which has yet to have been established.

Aurelien timed the effect of my patch two hours before you sent this
message.  If it's not a regression in 1.5 (which is quite obvious from
the profile), it is a regression from the introduction of CPU classes
(1.3 or 1.4), when this code didn't exist at all.

And in 1.5 we introduced virtio-net casts as well (or did mst sneak in
his change anyway? ;)). If 10% is the effect of a few hundred
interrupts/sec, perhaps the same effect is visible on a few thousand
packets/sec.  I wouldn't bet against that one week from release.

(In fact, I'm not going to bet against that after release, either.  I'll
propose to disable QOM cast checking in Fedora and RHEL).

> As soon as we open up 1.6 for development, we face an immediate
> regression in performance.  So we need to fix the real problem here
> anyway.

No, we don't.  We will simply have to live with a debugging vs.
production tradeoff.  You will need to remember using the appropriate
option when doing performance tests on development releases.  We can
print a message if necessary, but it's really common practice.  What
about lockdep or might_sleep debugging or similar checks that Fedora
only enables for the kernel during development phases?

> I strongly suspect that if there is a problem, that optimizing
> leaf/concrete casts will take care of it.  Otherwise, a small lookup
> cache ought to do the trick too.  We can even use pointer comparisons
> for the lookup cache which should make it extremely cheap.

Still more expensive than no code at all.

> Let's independently evaluate your proposal for 1.6.

No, my proposal has to be evaluated for 1.5, and perhaps reverted later.
 We have a potentially large *set* of regressions (large amounts of QOM
casts were introduced in 1.5) that we found after -rc1, and the big
hammer is the only way to deal with it.

> Of course, these changes never make it into the tree which is an
> indication that the mechanism works very well :-)

Yes, it works *great*.  I don't want to give the impression that I think
the opposite.  But how often have you seen them abort in the distro
QEMU, as opposed to a development version?  And how often have you seen
"CRITICAL: foo != NULL" or "CRITICAL: GTK_IS_WIDGET (foo)" from a GTK+
app?  For me, it is never vs. quite often.

It works great simply because it's very cheap to add and makes debugging
much nicer.

> Note that the cast macros are a big improvement in code readability.
> The only real replacement would be static casts which would downgrade
> safety.

Yes, we should keep the cast macros, no doubt about that.  And the
checks, for development.  But adding tracing, while removing the actual
checks, is a pretty good speed compromise IMHO.  And leaving the checks
in before the freeze matches what most developers probably desire, and
avoids bitrotting of the check code.

> If you want to debate the merits of the safety, that's fine.
> 
>> Type-safe casts make sense in GTK+/GObject where: 1) type-safe casts
>> return NULL and log a "critical" error, they do not abort;  2) all
>> functions fail with another "critical" error if they are passed NULL.
>> We do neither, so we're just trading a crash now for a crash very soon
>> after (our call stacks tend to be relatively shallow, with some
>> exceptions such as the block layer).
> 
> The big assumption here is that dereference a NULL results in
> predictable failure.  This is not always the case and can lead to
> security vulnerabilities.

That's not what I was talking about.  I was talking about code like this:

void
gtk_window_set_wmclass (GtkWindow *window,
                        const gchar *wmclass_name,
                        const gchar *wmclass_class)
{
  GtkWindowPrivate *priv;

  g_return_if_fail (GTK_IS_WINDOW (window));
  priv = window->priv;

  ...
}

that avoids NULL pointer dereferences before they happen, while still
doing the checks.

Anyway, yes: the checks can catch some security vulnerabilities.  But
they won't catch many uses-after-free, they won't catch wrong
refcounting, they won't catch the _actual_ vulnerabilities that we had,
nor probably the ones that we could have.  Every segfault we fix could
potentially become a guest->host exploit with enough skill, the guest
has control of an enormous part of the QEMU address space.  But we fix
segfaults, not QOM cast failures.

Before QOM casts, I don't remember seeing many such failures _in the
tree_.  They of course happen during development, but QOM casts really
help before patches are posted.

> If you are concerned about the performance, provide a concrete example
> of poor performance and I will fix it.
> 
> If we find one that is unfixable, then we should consider your
> proposal.

Nothing is unfixable.  Worst of all, you can just stick static C casts
in the hot paths.  But that's what I want to avoid, I want my brain to
concentrate on the code on the hot paths, not on pointless differences
between those parts and the rest of a device.

Paolo

next prev parent reply	other threads:[~2013-05-10 20:59 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-10 12:16 [Qemu-devel] [PATCH for-1.5 0/9] Disable expensive QOM cast debugging for official releases Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 1/9] qom: improve documentation of cast functions Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 2/9] qom: allow casting of a NULL class Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 3/9] qom: add a fast path to object_class_dynamic_cast Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 4/9] qom: pass file/line/function to asserting casts Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 5/9] qom: trace " Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 6/9] qom: allow turning cast debugging off Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 7/9] build: disable QOM cast debugging for official releases Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 8/9] qom: simplify object_class_dynamic_cast, part 1 Paolo Bonzini
2013-05-10 17:52   ` Anthony Liguori
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 9/9] qom: simplify object_class_dynamic_cast, part 2 Paolo Bonzini
2013-05-10 13:01 ` [Qemu-devel] [PATCH for-1.5 0/9] Disable expensive QOM cast debugging for official releases Anthony Liguori
2013-05-10 13:08   ` Paolo Bonzini
2013-05-10 13:23     ` Andreas Färber
2013-05-10 13:30       ` Paolo Bonzini
2013-05-10 14:22         ` Aurelien Jarno
2013-05-10 14:39         ` Anthony Liguori
2013-05-10 14:59           ` Paolo Bonzini
2013-05-10 17:41             ` Anthony Liguori
2013-05-10 19:00               ` Aurelien Jarno
2013-05-10 19:35                 ` Anthony Liguori
2013-05-10 20:59               ` Paolo Bonzini [this message]
2013-05-10 23:06                 ` Anthony Liguori
2013-05-11  7:59                   ` Paolo Bonzini
2013-05-11  7:23                 ` Aurelien Jarno
2013-05-10 14:27     ` Anthony Liguori
2013-05-10 14:46       ` Aurelien Jarno
2013-05-10 15:04         ` Andreas Färber
2013-05-10 15:56 ` Aurelien Jarno
2013-05-10 16:18   ` Andreas Färber
2013-05-10 16:40     ` Paolo Bonzini
2013-05-10 18:50   ` Anthony Liguori
2013-05-13 16:46 ` Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=518D5FAA.5050507@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=afaerber@suse.de \
    --cc=aliguori@us.ibm.com \
    --cc=aurelien@aurel32.net \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.