qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Anthony Liguori <aliguori@us.ibm.com>
Cc: qemu-devel@nongnu.org, "Andreas Färber" <afaerber@suse.de>,
	"Aurelien Jarno" <aurelien@aurel32.net>,
	mst@redhat.com
Subject: Re: [Qemu-devel] [PATCH for-1.5 0/9] Disable expensive QOM cast debugging for official releases
Date: Fri, 10 May 2013 22:59:22 +0200	[thread overview]
Message-ID: <518D5FAA.5050507@redhat.com> (raw)
In-Reply-To: <87ip2q91ho.fsf@codemonkey.ws>

Il 10/05/2013 19:41, Anthony Liguori ha scritto:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> Il 10/05/2013 16:39, Anthony Liguori ha scritto:
>>> I just oppose the notion of disabling casts and *especially* only
>>> disabling casts for official builds.
>>
>> This actually happens all the time.  Exactly this kind of type-safe cast
>> is disabled in releases of GCC, but enabled when building from svn trunk.
> 
> Let's assume for a moment that you are right and this behavior is what
> we should have.  Let's also assume there is a real regression here
> which has yet to have been established.

Aurelien timed the effect of my patch two hours before you sent this
message.  If it's not a regression in 1.5 (which is quite obvious from
the profile), it is a regression from the introduction of CPU classes
(1.3 or 1.4), when this code didn't exist at all.

And in 1.5 we introduced virtio-net casts as well (or did mst sneak in
his change anyway? ;)). If 10% is the effect of a few hundred
interrupts/sec, perhaps the same effect is visible on a few thousand
packets/sec.  I wouldn't bet against that one week from release.

(In fact, I'm not going to bet against that after release, either.  I'll
propose to disable QOM cast checking in Fedora and RHEL).

> As soon as we open up 1.6 for development, we face an immediate
> regression in performance.  So we need to fix the real problem here
> anyway.

No, we don't.  We will simply have to live with a debugging vs.
production tradeoff.  You will need to remember using the appropriate
option when doing performance tests on development releases.  We can
print a message if necessary, but it's really common practice.  What
about lockdep or might_sleep debugging or similar checks that Fedora
only enables for the kernel during development phases?

> I strongly suspect that if there is a problem, that optimizing
> leaf/concrete casts will take care of it.  Otherwise, a small lookup
> cache ought to do the trick too.  We can even use pointer comparisons
> for the lookup cache which should make it extremely cheap.

Still more expensive than no code at all.

> Let's independently evaluate your proposal for 1.6.

No, my proposal has to be evaluated for 1.5, and perhaps reverted later.
 We have a potentially large *set* of regressions (large amounts of QOM
casts were introduced in 1.5) that we found after -rc1, and the big
hammer is the only way to deal with it.

> Of course, these changes never make it into the tree which is an
> indication that the mechanism works very well :-)

Yes, it works *great*.  I don't want to give the impression that I think
the opposite.  But how often have you seen them abort in the distro
QEMU, as opposed to a development version?  And how often have you seen
"CRITICAL: foo != NULL" or "CRITICAL: GTK_IS_WIDGET (foo)" from a GTK+
app?  For me, it is never vs. quite often.

It works great simply because it's very cheap to add and makes debugging
much nicer.

> Note that the cast macros are a big improvement in code readability.
> The only real replacement would be static casts which would downgrade
> safety.

Yes, we should keep the cast macros, no doubt about that.  And the
checks, for development.  But adding tracing, while removing the actual
checks, is a pretty good speed compromise IMHO.  And leaving the checks
in before the freeze matches what most developers probably desire, and
avoids bitrotting of the check code.

> If you want to debate the merits of the safety, that's fine.
> 
>> Type-safe casts make sense in GTK+/GObject where: 1) type-safe casts
>> return NULL and log a "critical" error, they do not abort;  2) all
>> functions fail with another "critical" error if they are passed NULL.
>> We do neither, so we're just trading a crash now for a crash very soon
>> after (our call stacks tend to be relatively shallow, with some
>> exceptions such as the block layer).
> 
> The big assumption here is that dereference a NULL results in
> predictable failure.  This is not always the case and can lead to
> security vulnerabilities.

That's not what I was talking about.  I was talking about code like this:

void
gtk_window_set_wmclass (GtkWindow *window,
                        const gchar *wmclass_name,
                        const gchar *wmclass_class)
{
  GtkWindowPrivate *priv;

  g_return_if_fail (GTK_IS_WINDOW (window));
  priv = window->priv;

  ...
}

that avoids NULL pointer dereferences before they happen, while still
doing the checks.

Anyway, yes: the checks can catch some security vulnerabilities.  But
they won't catch many uses-after-free, they won't catch wrong
refcounting, they won't catch the _actual_ vulnerabilities that we had,
nor probably the ones that we could have.  Every segfault we fix could
potentially become a guest->host exploit with enough skill, the guest
has control of an enormous part of the QEMU address space.  But we fix
segfaults, not QOM cast failures.

Before QOM casts, I don't remember seeing many such failures _in the
tree_.  They of course happen during development, but QOM casts really
help before patches are posted.

> If you are concerned about the performance, provide a concrete example
> of poor performance and I will fix it.
> 
> If we find one that is unfixable, then we should consider your
> proposal.

Nothing is unfixable.  Worst of all, you can just stick static C casts
in the hot paths.  But that's what I want to avoid, I want my brain to
concentrate on the code on the hot paths, not on pointless differences
between those parts and the rest of a device.

Paolo

  parent reply	other threads:[~2013-05-10 20:59 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-10 12:16 [Qemu-devel] [PATCH for-1.5 0/9] Disable expensive QOM cast debugging for official releases Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 1/9] qom: improve documentation of cast functions Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 2/9] qom: allow casting of a NULL class Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 3/9] qom: add a fast path to object_class_dynamic_cast Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 4/9] qom: pass file/line/function to asserting casts Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 5/9] qom: trace " Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 6/9] qom: allow turning cast debugging off Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 7/9] build: disable QOM cast debugging for official releases Paolo Bonzini
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 8/9] qom: simplify object_class_dynamic_cast, part 1 Paolo Bonzini
2013-05-10 17:52   ` Anthony Liguori
2013-05-10 12:16 ` [Qemu-devel] [PATCH for-1.5 9/9] qom: simplify object_class_dynamic_cast, part 2 Paolo Bonzini
2013-05-10 13:01 ` [Qemu-devel] [PATCH for-1.5 0/9] Disable expensive QOM cast debugging for official releases Anthony Liguori
2013-05-10 13:08   ` Paolo Bonzini
2013-05-10 13:23     ` Andreas Färber
2013-05-10 13:30       ` Paolo Bonzini
2013-05-10 14:22         ` Aurelien Jarno
2013-05-10 14:39         ` Anthony Liguori
2013-05-10 14:59           ` Paolo Bonzini
2013-05-10 17:41             ` Anthony Liguori
2013-05-10 19:00               ` Aurelien Jarno
2013-05-10 19:35                 ` Anthony Liguori
2013-05-10 20:59               ` Paolo Bonzini [this message]
2013-05-10 23:06                 ` Anthony Liguori
2013-05-11  7:59                   ` Paolo Bonzini
2013-05-11  7:23                 ` Aurelien Jarno
2013-05-10 14:27     ` Anthony Liguori
2013-05-10 14:46       ` Aurelien Jarno
2013-05-10 15:04         ` Andreas Färber
2013-05-10 15:56 ` Aurelien Jarno
2013-05-10 16:18   ` Andreas Färber
2013-05-10 16:40     ` Paolo Bonzini
2013-05-10 18:50   ` Anthony Liguori
2013-05-13 16:46 ` Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=518D5FAA.5050507@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=afaerber@suse.de \
    --cc=aliguori@us.ibm.com \
    --cc=aurelien@aurel32.net \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).