From: Zachary Amsden <zach@vmware.com>
To: tglx@linutronix.de
Cc: Daniel Arai <arai@vmware.com>,
Virtualization Mailing List <virtualization@lists.osdl.org>,
john stultz <johnstul@us.ibm.com>, Ingo Molnar <mingo@elte.hu>,
akpm@linux-foundation.org, LKML <linux-kernel@vger.kernel.org>,
Daniel Hecht <dhecht@vmware.com>,
Rusty Russell <rusty@rustcorp.com.au>,
Jeremy Fitzhardinge <jeremy@goop.org>
Subject: Re: + stupid-hack-to-make-mainline-build.patch added to -mm tree
Date: Thu, 08 Mar 2007 00:01:13 -0800
Message-ID: <45EFC2C9.10603@vmware.com>
In-Reply-To: <1173338928.24738.979.camel@localhost.localdomain>

Thomas Gleixner wrote:
> On Wed, 2007-03-07 at 17:01 -0800, Daniel Arai wrote:
>
>> But more importantly, we want a kernel that can run both on native hardware and
>> in a paravirtualized environment. Linux doesn't really provide abstractions for
>> replacing the appropriate code. We tried to hook into the source code at a
>> level that seemed possible.
>>
>
> Again. You just refuse to change your implementation and you want to
> keep it by arguing how hard it is because there are no abstractions.
>

It is no longer possible to change our _hypervisor_ implementation. The
Linux side of our code is entirely flexible, and we are trying to change
it, but it hasn't always been clear what you want us to do.

> Your prayer wheel argument of missing abstractions and easiness of
> emulating things is annoying. If you think it is better to emulate APIC,
> please emulate it without paravirt ops. If you want the speed
> improvement, work with us to create the interfaces and abstractions
> which are necessary to have a sane, maintainable and useful for all
> hypervisors implementation.
>

That's what we are doing. Our prayer wheel would be more easily appeased if
you actually told us which parts of the VMI timer you objected to. As I
understand it now:

1) We should not call into external functions in other time sources; any
   common code should be merged up
2) We should not be using global_clock_event; it is a horrible hack
   which you want to remove
3) We should not use the smp_apic_timer_interrupt assembly code which
   calls up to the lapic timer handlers
4) We should not add our own assembly code to call out to a local timer
   handler (from Ingo)

These last two points create a conflict which is a little tricky to
solve. We can't add our own custom timer handler, and we can't re-use
the APIC timer handler. But there is no timer handler available on i386
that works, since the handlers will fall back to either PIC or IO-APIC
edge handling. Using either of those for the local timer interrupt on
SMP does not work because they assume traditional IRQ semantics - an IRQ
raised from the bus should be serviced by one processor. Re-raises of
the same IRQ on remote processors are locked out by the handler, and
dropped. Thus simultaneous local timers firing on multiple CPUs cause
only one to be serviced.
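
To make the failure mode concrete, here is a toy user-space model of it. It is not kernel code: struct toy_irq and every toy_* function are made up for illustration, and the single in_progress flag only approximates the "locked out and dropped" behaviour described above. The point is just that one shared flag lets only one of two simultaneous local timers run its handler, while a flow with no shared state services both.

```c
#include <stdbool.h>

#define TOY_NR_CPUS 2

/* Toy model only. One descriptor shared by all CPUs, as with a bus IRQ. */
struct toy_irq {
    bool in_progress;               /* set while some CPU is in the handler */
    int serviced[TOY_NR_CPUS];      /* per-CPU count of handler runs */
};

/*
 * Edge-style entry: a CPU that arrives while the IRQ is already being
 * handled elsewhere is locked out and its local event is lost.
 */
static bool toy_edge_enter(struct toy_irq *irq, int cpu)
{
    if (irq->in_progress)
        return false;
    irq->in_progress = true;
    irq->serviced[cpu]++;
    return true;
}

static void toy_edge_exit(struct toy_irq *irq)
{
    irq->in_progress = false;
}

/* Per-CPU style entry: no shared state, so every CPU runs its handler. */
static void toy_percpu_handle(struct toy_irq *irq, int cpu)
{
    irq->serviced[cpu]++;
}

/* Both CPUs' local timers fire before either handler has finished. */
static int toy_simultaneous_edge(void)
{
    struct toy_irq irq = { .in_progress = false };
    bool cpu0_ran = toy_edge_enter(&irq, 0);
    bool cpu1_ran = toy_edge_enter(&irq, 1);   /* arrives while CPU 0 is busy */

    if (cpu0_ran || cpu1_ran)
        toy_edge_exit(&irq);
    return irq.serviced[0] + irq.serviced[1];  /* CPUs actually serviced */
}

static int toy_simultaneous_percpu(void)
{
    struct toy_irq irq = { .in_progress = false };

    toy_percpu_handle(&irq, 0);
    toy_percpu_handle(&irq, 1);
    return irq.serviced[0] + irq.serviced[1];
}
```

In this model toy_simultaneous_edge() services one CPU and toy_simultaneous_percpu() services both, which is the difference that matters for the local timer.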

This does not work for local timer interrupts in NO_HZ mode, because
they must always be serviced so that they can reschedule the next local
timer. I have a proposed solution to this issue, but it fails to work
when the IO-APIC assumes control of all IRQs based on ACPI results
(which we control, but can't change because of compatibility issues with
other operating systems).

My proposal is to keep IRQ-0 as the timer interrupt, on all CPUs, but
fire it from the LAPIC after local apic timers get initialized. We
would do this by converting the irq handler using set_irq_handler(0,
handle_percpu_irq). The only problem is the IO-APIC code will want to
take over IRQ0 and convert it to an edge triggered IO-APIC interrupt.
But for the local irq handlers to work, we have to keep them using the
handle_percpu_irq handler, and can't let the IO-APIC steal these
vectors. There is no way to do this conditionally for just a specific set of
IRQs in tree today, so we would need to add a special case to io_apic.c
to allow early boot code to reserve specific vectors so they are not
subsumed by the IO-APIC. This seems reasonable, but is a special case.
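
As a rough illustration of what such a special case might look like, here is a toy user-space model. Nothing in it exists in the tree: the reserved mask, the toy_* names and the flow enum are all hypothetical stand-ins. It only shows the shape of the idea, where early boot code marks an IRQ as reserved and the IO-APIC takeover loop skips anything so marked.

```c
#include <stdbool.h>

#define TOY_NR_IRQS 16

/* Hypothetical flow types, standing in for the real flow handlers. */
enum toy_flow { TOY_FLOW_PIC, TOY_FLOW_PERCPU, TOY_FLOW_IOAPIC_EDGE };

/* Zero-initialised, so every IRQ starts out as TOY_FLOW_PIC. */
static enum toy_flow toy_irq_flow[TOY_NR_IRQS];
static unsigned long toy_reserved_mask;    /* bit n set: IRQ n is reserved */

/* Early boot code: switch an IRQ to per-CPU handling and reserve it. */
static void toy_reserve_percpu_irq(unsigned int irq)
{
    if (irq >= TOY_NR_IRQS)
        return;
    toy_irq_flow[irq] = TOY_FLOW_PERCPU;
    toy_reserved_mask |= 1UL << irq;
}

static bool toy_irq_is_reserved(unsigned int irq)
{
    return (toy_reserved_mask >> irq) & 1UL;
}

/* Stand-in for the IO-APIC setup pass that normally claims every IRQ. */
static void toy_ioapic_take_over(void)
{
    for (unsigned int irq = 0; irq < TOY_NR_IRQS; irq++) {
        if (toy_irq_is_reserved(irq))
            continue;                       /* leave reserved IRQs alone */
        toy_irq_flow[irq] = TOY_FLOW_IOAPIC_EDGE;
    }
}
```

Calling toy_reserve_percpu_irq(0) before toy_ioapic_take_over() leaves IRQ 0 on the per-CPU flow while every other IRQ is converted, which is the behaviour the reservation is meant to guarantee.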

If, on the other hand, we are allowed to use our own assembly code to
call out to our local timer handler (dropping constraint #4 above), we
can simply rewire LOCAL_TIMER_VECTOR to point to this code, but now we
must emulate the semantics of irq_enter / irq_exit / etc inside our code,
which is also not the cleanest solution. We used to do this, and it
caught flak, I believe from Ingo.

The basic problem is that a local IRQ doesn't behave like a global IRQ,
and the i386 backend is unaware of how to set up any local IRQs except
in the case of local APIC, but you have told us we should not re-use the
APIC handlers by overloading global_clock_event. The patches we sent
out recently did just this, but seemed to meet even more violence than
our previous way of doing things.

So the question is, which approach do you prefer?

Zach