* [Qemu-devel] interrupt handling in qemu
From: Xin Tong @ 2011-12-27 23:12 UTC
To: qemu-devel

QEMU does not exit and handle interrupts within translation blocks. It
only exits after the translation block is finished. Assuming a
translation block is very long, is it possible that QEMU could exceed
the interrupt's "timing window" and yield unexpected behavior?

The reason I ask is that I am searching for alternatives to QEMU's
current way of handling interrupts (unlinking translation blocks on
interrupt). However, an obvious approach, checking for an interrupt in
every basic block, seems to be too heavy (too many TB enters/exits).
Maybe checking for an interrupt every few basic blocks would be better,
but what is a good measure for the number of basic blocks to execute
before checking for an interrupt?

Thanks
Xin

* Re: [Qemu-devel] interrupt handling in qemu
From: Peter Maydell @ 2011-12-27 23:36 UTC
To: Xin Tong; +Cc: qemu-devel

On 27 December 2011 23:12, Xin Tong <xerox.time.tech@gmail.com> wrote:
> The reason I ask is that I am searching for alternatives to QEMU
> current way of handling interrupt (unlink translation blocks on
> interrupt). However, an obvious approach - checking for interrupt in
> every basic block, seems to be too heavy ( too many tb enters/exits
> ).

It's not awful -- an extra load-test-branch-not-taken per
TB, which IIRC from last time I tried to measure it was ~3%
speed penalty, obv. very variable with what the guest code is.
I have a half-finished patch for this but since I don't have a
decent benchmarking setup I've never got round to submitting it.

> Maybe checking interrupt in a few basic blocks might be better, but
> what is a good measure for the number of basic blocks to execute
> before checking for interrupt ?

The trouble is that you can't tell when you're translating the
TB whether it's just one in a sequence A->B->C (where you could
perhaps skip the check at the start of B), or if you're
actually looking at a tight loop B->B (or B->C->B). So you don't
have the information conveniently to hand to tell you whether
you can skip compiling the interrupt check into this TB.

(One heuristic for how often we need to check would be "every
N instructions, or at every backwards branch or indirect-branch",
but this doesn't fit with the idea of putting the checks at the
start of the TB, they'd have to go in the middle of the TB which
is probably awkward.)

-- PMM

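The per-TB check described in the message above (one load, test, and normally-not-taken branch at the head of each block) can be illustrated with a small model. This is only a sketch under stated assumptions, not QEMU code: `ToyTB`, `ToyCPU`, `exit_request`, and `toy_run_chain` are hypothetical names, and the `max_blocks` bound exists only so that a tight loop such as B->B terminates in the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a translated block (not QEMU's real type). */
typedef struct ToyTB {
    struct ToyTB *next;  /* chained successor; NULL means "back to main loop" */
    int guest_insns;     /* guest instructions this block represents */
} ToyTB;

/* Hypothetical per-CPU state. */
typedef struct ToyCPU {
    volatile bool exit_request;  /* set asynchronously when an interrupt is pending */
    long insns_executed;
} ToyCPU;

/*
 * Run chained blocks until the chain ends, the block budget is used up,
 * or an exit has been requested. Returns the block at which execution
 * stopped (NULL if the chain ended).
 */
static ToyTB *toy_run_chain(ToyCPU *cpu, ToyTB *tb, long max_blocks)
{
    while (tb != NULL && max_blocks-- > 0) {
        /* The load-test-branch placed at the head of every block. */
        if (cpu->exit_request) {
            return tb;  /* leave before executing this block */
        }
        cpu->insns_executed += tb->guest_insns;
        tb = tb->next;
    }
    return tb;
}
```

Because the check sits at the head of every block, a tight loop B->B is interrupted within one block of the flag being set, which is the case the message says cannot be distinguished from a straight-line sequence at translation time.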
* Re: [Qemu-devel] interrupt handling in qemu
From: Xin Tong @ 2011-12-28 0:43 UTC
To: Peter Maydell; +Cc: qemu-devel

On Tue, Dec 27, 2011 at 4:36 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 27 December 2011 23:12, Xin Tong <xerox.time.tech@gmail.com> wrote:
>> The reason I ask is that I am searching for alternatives to QEMU
>> current way of handling interrupt (unlink translation blocks on
>> interrupt). However, an obvious approach - checking for interrupt in
>> every basic block, seems to be too heavy ( too many tb enters/exits
>> ).
>
> It's not awful -- an extra load-test-branch-not-taken per
> TB, which IIRC from last time I tried to measure it was ~3%
> speed penalty, obv. very variable with what the guest code is.
> I have a half-finished patch for this but since I don't have a
> decent benchmarking setup I've never got round to submitting it.
>
>> Maybe checking interrupt in a few basic blocks might be better, but
>> what is a good measure for the number of basic blocks to execute
>> before checking for interrupt ?
>

Which version of QEMU did you do your test on, and what are the tests?
I modified QEMU to check for interrupt status at the end of every TB
and ran it on SPECINT2000 benchmarks with QEMU 0.15.0. The performance
is 70% of the unmodified one for some benchmarks on an x86_64 host. I
agree that the extra load-test-branch-not-taken per TB is minimal, but
what I found is that the average number of TBs executed per TB enter is
low (~3.5 TBs), while the unmodified approach has ~10 TBs per TB
enter. This makes me wonder why. Maybe the mechanism I used to gather
these statistics is flawed, but the performance is indeed hindered.

> The trouble is that you can't tell when you're translating the
> TB whether it's just one in a sequence A->B->C (where you could
> perhaps skip the check at the start of B), or if you're
> actually looking at a tight loop B->B (or B->C->B). So you don't
> have the information conveniently to hand to tell you whether
> you can skip compiling the interrupt check into this TB.
>
> (One heuristic for how often we need to check would be "every
> N instructions, or at every backwards branch or indirect-branch",
> but this doesn't fit with the idea of putting the checks at the
> start of the TB, they'd have to go in the middle of the TB which
> is probably awkward.)

We could keep a counter that decrements on every TB; when the counter
reaches 0, the currently executing TB checks the interrupt status.
This way, we can control the dynamic number of TBs executed per
interrupt check. But it is going to introduce some more overhead,
probably 3% - 5%.

---Xin

>
> -- PMM

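The counter idea proposed above can be modelled in a few lines. This is a minimal sketch, not a patch: `ToyCounterCPU`, `toy_tb_prologue`, and `toy_run` are hypothetical names, and the reload value is an arbitrary illustration of "check every N blocks".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for the countdown scheme. */
typedef struct ToyCounterCPU {
    bool irq_pending;   /* set when an interrupt is waiting */
    int countdown;      /* blocks remaining until the next flag check */
    int reload;         /* N: check the flag once every N blocks */
    long flag_checks;   /* how many times the flag was actually read */
} ToyCounterCPU;

/*
 * Code that would run at the head of every block.
 * Returns true if the block should exit to the main loop.
 */
static bool toy_tb_prologue(ToyCounterCPU *cpu)
{
    if (--cpu->countdown > 0) {
        return false;             /* common case: only the counter is touched */
    }
    cpu->countdown = cpu->reload; /* counter expired: reload and test the flag */
    cpu->flag_checks++;
    return cpu->irq_pending;
}

/* Run up to max_blocks blocks; returns how many ran before an exit. */
static long toy_run(ToyCounterCPU *cpu, long max_blocks)
{
    long ran = 0;
    while (ran < max_blocks) {
        if (toy_tb_prologue(cpu)) {
            break;
        }
        ran++;
    }
    return ran;
}
```

In this model a pending interrupt is noticed only when the counter expires, so with a reload value of N up to N-1 further blocks run first. Every block still performs a decrement, store, and branch, so the per-block cost does not disappear; it only changes what the prologue does.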
* Re: [Qemu-devel] interrupt handling in qemu
From: Peter Maydell @ 2011-12-28 1:10 UTC
To: Xin Tong; +Cc: qemu-devel

On 28 December 2011 00:43, Xin Tong <xerox.time.tech@gmail.com> wrote:
> Which version of QEMU did you do your test on, and what are the tests.

It was whatever trunk qemu was a year or so ago, and the test was
just "time to login prompt for ARM guest in system mode".

> I modified QEMU to check for interrupt status at the end of every TB
> and ran it on SPECINT2000 benchmarks with QEMU 0.15.0. The performance
> is 70% of the unmodified one for some benchmarks on a x86_64 host.

I don't suppose you could provide a brief set of instructions for
setting up a benchmark setup like this? (Is this user-mode or
system-mode?)

> I
> agree that the extra load-test-branch-not-taken per TB is minimal, but
> what I found is that the average number of TB executed per TB enter is
> low (~3.5 TBs), while the unmodified approach has ~10 TBs per TB
> enter. this makes me wonder why. Maybe the mechanism i used to gather
> this statistics is flawed. but the performance is indeed hindered.

Odd.

> By keeping a counter that decrements on every TB, and when the counter
> reaches 0, the current executing TB checks for interrupt status.

Decrementing a counter on every TB is going to be slower than
just checking a flag, so you might as well just check the flag.
(If the flag is set you need to handle the interrupt anyway
so there's no point delaying it.)

-- PMM

* Re: [Qemu-devel] interrupt handling in qemu
From: Xin Tong @ 2011-12-28 1:23 UTC
To: Peter Maydell; +Cc: qemu-devel

On Tue, Dec 27, 2011 at 6:10 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 28 December 2011 00:43, Xin Tong <xerox.time.tech@gmail.com> wrote:
>> Which version of QEMU did you do your test on, and what are the tests.
>
> It was whatever trunk qemu was a year or so ago, and the test was
> just "time to login prompt for ARM guest in system mode".
>
>> I modified QEMU to check for interrupt status at the end of every TB
>> and ran it on SPECINT2000 benchmarks with QEMU 0.15.0. The performance
>> is 70% of the unmodified one for some benchmarks on a x86_64 host.
>
> I don't suppose you could provide a brief set of instructions for
> setting up a benchmark setup like this? (Is this user-mode or
> system-mode?)

It is in system mode; the SPECINT2000 benchmarks are running on top of
Ubuntu Linux. It is not trivial to set up.

>
>> I
>> agree that the extra load-test-branch-not-taken per TB is minimal, but
>> what I found is that the average number of TB executed per TB enter is
>> low (~3.5 TBs), while the unmodified approach has ~10 TBs per TB
>> enter. this makes me wonder why. Maybe the mechanism i used to gather
>> this statistics is flawed. but the performance is indeed hindered.
>
> Odd.
>
>> By keeping a counter that decrements on every TB, and when the counter
>> reaches 0, the current executing TB checks for interrupt status.
>
> Decrementing a counter on every TB is going to be slower than
> just checking a flag, so you might as well just check the flag.
> (If the flag is set you need to handle the interrupt anyway
> so there's no point delaying it.)
>

The point I am trying to make here is that if QEMU exits the TB for
every interrupt, there are going to be too many exits. What if QEMU
exits and handles a few interrupts in one exit?

> -- PMM

* Re: [Qemu-devel] interrupt handling in qemu
From: Peter Maydell @ 2011-12-28 21:10 UTC
To: Xin Tong; +Cc: qemu-devel

On 28 December 2011 00:43, Xin Tong <xerox.time.tech@gmail.com> wrote:
> I modified QEMU to check for interrupt status at the end of every TB
> and ran it on SPECINT2000 benchmarks with QEMU 0.15.0. The performance
> is 70% of the unmodified one for some benchmarks on a x86_64 host. I
> agree that the extra load-test-branch-not-taken per TB is minimal, but
> what I found is that the average number of TB executed per TB enter is
> low (~3.5 TBs), while the unmodified approach has ~10 TBs per TB
> enter. this makes me wonder why. Maybe the mechanism i used to gather
> this statistics is flawed. but the performance is indeed hindered.

Since you said you're using system mode, here's my guess. The
unlink-tbs method of interrupting the guest CPU thread runs
in a second thread (the io thread), and doesn't stop the guest
CPU thread. So while the io thread is trying to unlink TBs,
the CPU thread is still running on, and might well execute
a few more TBs before the io thread's traversal of the TB
graph catches up with it and manages to unlink the TB link
the CPU thread is about to traverse.

More generally: are we really taking an interrupt every 3 to
5 TBs? This seems very high -- surely we will be spending more
time in the OS servicing interrupts than running useful guest
userspace code...

-- PMM

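The effect guessed at above, where execution keeps following links that the unlinking pass has not yet severed, can be shown with a single-threaded model. This is a sketch under assumptions: `ToyLinkedTB`, `toy_unlink`, `toy_unlink_all`, and `toy_run_linked` are hypothetical names, and a real implementation involves two threads and patched jumps rather than a pointer field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block with one direct link to its successor. */
typedef struct ToyLinkedTB {
    struct ToyLinkedTB *jmp_next;  /* NULL means "unlinked: return to main loop" */
} ToyLinkedTB;

/* Sever the outgoing link of a single block. */
static void toy_unlink(ToyLinkedTB *tb)
{
    tb->jmp_next = NULL;
}

/* Sever the outgoing links of every block in an array. */
static void toy_unlink_all(ToyLinkedTB *tbs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        toy_unlink(&tbs[i]);
    }
}

/*
 * Execute blocks by following links until an unlinked block has run
 * or the budget is exhausted. Returns the number of blocks executed.
 */
static size_t toy_run_linked(ToyLinkedTB *tb, size_t max_blocks)
{
    size_t ran = 0;
    while (tb != NULL && ran < max_blocks) {
        ran++;               /* "execute" this block */
        tb = tb->jmp_next;   /* follow the direct link, if still present */
    }
    return ran;
}
```

With a loop A->B->C->A, unlinking only C lets A, B, and C all run before control returns, while unlinking every block stops execution after the current one. How many extra blocks run therefore depends on which links have already been severed when execution reaches them.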
* Re: [Qemu-devel] interrupt handling in qemu
From: Xin Tong @ 2011-12-29 0:48 UTC
To: Peter Maydell; +Cc: qemu-devel

That was my guess as well at first, but my QEMU is built with
CONFIG_IOTHREAD set to 0. I am not 100% sure how interrupts are
delivered in QEMU. My guess is that some kind of timer device has to
fire, and that QEMU might have installed a signal handler which takes
the signal and invokes unlink_tb. I hope you can enlighten me on that.

Thanks
Xin

On Wed, Dec 28, 2011 at 2:10 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 28 December 2011 00:43, Xin Tong <xerox.time.tech@gmail.com> wrote:
>> I modified QEMU to check for interrupt status at the end of every TB
>> and ran it on SPECINT2000 benchmarks with QEMU 0.15.0. The performance
>> is 70% of the unmodified one for some benchmarks on a x86_64 host. I
>> agree that the extra load-test-branch-not-taken per TB is minimal, but
>> what I found is that the average number of TB executed per TB enter is
>> low (~3.5 TBs), while the unmodified approach has ~10 TBs per TB
>> enter. this makes me wonder why. Maybe the mechanism i used to gather
>> this statistics is flawed. but the performance is indeed hindered.
>
> Since you said you're using system mode, here's my guess. The
> unlink-tbs method of interrupting the guest CPU thread runs
> in a second thread (the io thread), and doesn't stop the guest
> CPU thread. So while the io thread is trying to unlink TBs,
> the CPU thread is still running on, and might well execute
> a few more TBs before the io thread's traversal of the TB
> graph catches up with it and manages to unlink the TB link
> the CPU thread is about to traverse.
>
> More generally: are we really taking an interrupt every 3 to
> 5 TBs? This seems very high -- surely we will be spending more
> time in the OS servicing interrupts than running useful guest
> userspace code...
>
> -- PMM

* Re: [Qemu-devel] interrupt handling in qemu
From: Peter Maydell @ 2011-12-29 1:31 UTC
To: Xin Tong; +Cc: qemu-devel

On 29 December 2011 00:48, Xin Tong <xerox.time.tech@gmail.com> wrote:
> That is my guess as well in the first place, but my QEMU is built with
> CONFIG_IOTHREAD set to 0.

Your QEMU is old -- iothread is now the only option (the config
option to use not-iothread has gone away).

> I am not 100% sure about how interrupts are delivered in QEMU, my
> guess is that some kind of timer devices will have to fire and qemu
> might have installed a signal handler and the signal handler takes the
> signal and invokes unlink_tb. I hope you can enlighten me on that.

I think the non-iothread config used to use a signal handler, yes.
However I don't recall the details and it's all a bit irrelevant
now anyway. I recommend using an up to date source tree to do your
experiments with...

-- PMM

* Re: [Qemu-devel] interrupt handling in qemu
From: Avi Kivity @ 2011-12-28 10:42 UTC
To: Xin Tong; +Cc: qemu-devel

On 12/28/2011 01:12 AM, Xin Tong wrote:
> QEMU does not exit and handle interrupt within translation blocks. it
> only exits after the translation block is finished. Assuming a
> translation block is very long, is it possible that QEMU could have
> exceeded the interrupt's "timing window" and yields unexpected
> behavior.
>
> The reason I ask is that I am searching for alternatives to QEMU
> current way of handling interrupt (unlink translation blocks on
> interrupt). However, an obvious approach - checking for interrupt in
> every basic block, seems to be too heavy ( too many tb enters/exits
> ). Maybe checking interrupt in a few basic blocks might be better, but
> what is a good measure for the number of basic blocks to execute
> before checking for interrupt ?
>

It's possible to check for an interrupt before every instruction,
without any overhead:

- when a signal arrives, check the instruction pointer.  If it points
  outside tcg code, set a flag and return.
- consult a table indexed by the instruction pointer, that gives the
  number of bytes to the next guest instruction boundary
- if nonzero, set a breakpoint at that boundary, and resume
- remove the breakpoint (if set)
- adjust the TB to return on the current instruction pointer
- return

-- 
error compiling committee.c: too many arguments to function

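The table lookup in the second step of the list above can be sketched as follows. This is an illustration under assumptions, not code from QEMU: instead of a table directly indexed by the host instruction pointer, it stores one sorted array of host-code offsets per block, and `toy_bytes_to_boundary`, `starts`, and `code_size` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * starts[] holds, in ascending order, the host-code offsets (relative to
 * the start of one block's generated code) at which a guest instruction
 * begins. code_size is the total size of that block's host code.
 *
 * Returns the number of bytes from pc_off to the next guest instruction
 * boundary: 0 if pc_off is already on a boundary, otherwise the distance
 * to the next start, or to the end of the block if none follows.
 */
static size_t toy_bytes_to_boundary(const size_t *starts, size_t n,
                                    size_t code_size, size_t pc_off)
{
    assert(pc_off <= code_size);
    for (size_t i = 0; i < n; i++) {
        if (starts[i] >= pc_off) {
            return starts[i] - pc_off;
        }
    }
    return code_size - pc_off;
}
```

A signal handler following the list would take the nonzero result as the place to set the temporary breakpoint, and a zero result as permission to stop immediately. A linear scan keeps the sketch short; a binary search or a dense per-byte table would trade memory for lookup time.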
* Re: [Qemu-devel] interrupt handling in qemu
From: Peter Maydell @ 2011-12-28 11:40 UTC
To: Avi Kivity; +Cc: qemu-devel, Xin Tong

On 28 December 2011 10:42, Avi Kivity <avi@redhat.com> wrote:
> It's possible to check for an interrupt before every instruction,
> without any overhead:
>
> - when a signal arrives, check the instruction pointer. If it points
>   outside tcg code, set a flag and return.
> - consult a table indexed by the instruction pointer, that gives the
>   number of bytes to the next guest instruction boundary
> - if nonzero, set a breakpoint at that boundary, and resume
> - remove the breakpoint (if set)
> - adjust the TB to return on the current instruction pointer
> - return

This assumes you have hardware breakpoints on your host, so
it's not portable.

(You also need to add a check-and-handle-flag for every return
from a helper function to TCG code, and of course you need to
actually create the instruction-boundary table. These are both
overheads.)

-- PMM

* Re: [Qemu-devel] interrupt handling in qemu
From: Avi Kivity @ 2011-12-28 12:04 UTC
To: Peter Maydell; +Cc: qemu-devel, Xin Tong

On 12/28/2011 01:40 PM, Peter Maydell wrote:
> On 28 December 2011 10:42, Avi Kivity <avi@redhat.com> wrote:
> > It's possible to check for an interrupt before every instruction,
> > without any overhead:
> >
> > - when a signal arrives, check the instruction pointer. If it points
> >   outside tcg code, set a flag and return.
> > - consult a table indexed by the instruction pointer, that gives the
> >   number of bytes to the next guest instruction boundary
> > - if nonzero, set a breakpoint at that boundary, and resume
> > - remove the breakpoint (if set)
> > - adjust the TB to return on the current instruction pointer
> > - return
>
> This assumes you have hardware breakpoints on your host, so
> it's not portable.

You could also use software breakpoints.  Or just temporarily replace
the host instruction on the next guest instruction boundary with a return.

> (You also need to add a check-and-handle-flag for every return
> from a helper function to TCG code,

Ah yes - didn't consider that.

You could put all helpers in their own section, and do something around
that - but that assumes no callouts from helpers to the standard library.

> and of course you need to
> actually create the instruction-boundary table.

This should be well amortized.

> These are both
> overheads.)

-- 
error compiling committee.c: too many arguments to function

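The suggestion above of temporarily replacing the host instruction at the next guest instruction boundary can be modelled as a save, patch, and restore of one byte. This is a sketch on an ordinary byte buffer, under assumptions: `ToyPatch`, `toy_patch_install`, and `toy_patch_remove` are hypothetical names, the x86 near-RET opcode is used purely as an example value, and a real implementation would also need writable code memory and, on some hosts, instruction-cache maintenance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RET_OPCODE 0xC3u  /* x86 near RET, chosen only as an example byte */

/* Remembers one temporarily patched location so it can be restored. */
typedef struct ToyPatch {
    uint8_t *site;   /* patched address, valid only while active */
    uint8_t saved;   /* original byte at that address */
    bool active;
} ToyPatch;

/* Overwrite the byte at code + boundary_off, remembering the original. */
static void toy_patch_install(ToyPatch *p, uint8_t *code, size_t boundary_off)
{
    assert(!p->active);  /* this model supports one patch at a time */
    p->site = code + boundary_off;
    p->saved = *p->site;
    *p->site = (uint8_t)TOY_RET_OPCODE;
    p->active = true;
}

/* Restore the original byte, if a patch is currently installed. */
static void toy_patch_remove(ToyPatch *p)
{
    if (p->active) {
        *p->site = p->saved;
        p->active = false;
    }
}
```

Restoring the byte corresponds to the "remove the breakpoint (if set)" step, so the generated code is unchanged the next time the block runs.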
* Re: [Qemu-devel] interrupt handling in qemu
From: Xin Tong @ 2011-12-28 17:00 UTC
To: Avi Kivity, Peter Maydell; +Cc: qemu-devel

My main concern here is not how timely the interrupt handling is; I am
more interested in reducing the number of TB enters/exits due to
interrupts. Returning to the QEMU main loop requires saving and
restoring register contexts, which is expensive. What I am thinking
is: can we check and handle interrupts every few TBs executed? The
drawback is that I do not know how many TBs would be a good number
such that the interrupts do not get delayed too much.

Thanks

On Wed, Dec 28, 2011 at 5:04 AM, Avi Kivity <avi@redhat.com> wrote:
> On 12/28/2011 01:40 PM, Peter Maydell wrote:
>> On 28 December 2011 10:42, Avi Kivity <avi@redhat.com> wrote:
>> > It's possible to check for an interrupt before every instruction,
>> > without any overhead:
>> >
>> > - when a signal arrives, check the instruction pointer. If it points
>> >   outside tcg code, set a flag and return.
>> > - consult a table indexed by the instruction pointer, that gives the
>> >   number of bytes to the next guest instruction boundary
>> > - if nonzero, set a breakpoint at that boundary, and resume
>> > - remove the breakpoint (if set)
>> > - adjust the TB to return on the current instruction pointer
>> > - return
>>
>> This assumes you have hardware breakpoints on your host, so
>> it's not portable.
>
> You could also use software breakpoints. Or just temporarily replace
> the host instruction on the next guest instruction boundary with a return.
>
>> (You also need to add a check-and-handle-flag for every return
>> from a helper function to TCG code,
>
> ah yes - didn't consider that.
>
> you could put all helper in their own section, an do something around
> that - but that assumes no callouts from helpers to the standard library.
>
>> and of course you need to
>> actually create the instruction-boundary table.
>
> This should be well amortized.
>
>> These are both
>> overheads.)
>
> --
> error compiling committee.c: too many arguments to function
>

* Re: [Qemu-devel] interrupt handling in qemu
From: Lluís Vilanova @ 2011-12-28 19:07 UTC
To: Xin Tong; +Cc: Peter Maydell, Avi Kivity, qemu-devel

Xin Tong writes:

> My main concern here is not how timely the interrupts can be handled,
> i am more interested in reducing the number of TB enters/exits due to
> interrupt. Returning to qemu mainloop requires saving and restoring
> register contexts which are expensive, what i am thinking is that can
> we check and handle interrupts every few TBs executed. But the
> drawback is that I do not know how many TBs would be a good number
> such that the interrupts do not get delayed too much.

I think a maximum amount of guest time for which interrupts can be delayed would
provide a better response (maybe together with a maximum number of delayed
interrupts).

For that you could program a "special" timer that forces a return-from-guest,
whatever the mechanism. But I'm sure that's going to make the system slower than
just checking every fixed number of TBs.

Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth
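The policy suggested above, a bound on how long interrupts may be delayed combined with a bound on how many may be queued, can be sketched as a small decision function. This is an illustration only: `ToyIrqBatch`, `toy_irq_raise`, `toy_must_exit`, and `toy_drain` are hypothetical names, time is passed in explicitly as nanoseconds, and how the exit is actually forced (timer, flag, or unlinking) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for batched interrupt delivery. */
typedef struct ToyIrqBatch {
    uint64_t first_pending_ns;  /* when the oldest undelivered interrupt arrived */
    unsigned pending;           /* undelivered interrupts queued so far */
    uint64_t max_delay_ns;      /* longest time any interrupt may wait */
    unsigned max_pending;       /* most interrupts allowed to accumulate */
} ToyIrqBatch;

/* Record a newly raised interrupt at time now_ns. */
static void toy_irq_raise(ToyIrqBatch *b, uint64_t now_ns)
{
    if (b->pending == 0) {
        b->first_pending_ns = now_ns;  /* start the delay clock */
    }
    b->pending++;
}

/* Decide whether the guest must be forced back to the main loop now. */
static bool toy_must_exit(const ToyIrqBatch *b, uint64_t now_ns)
{
    if (b->pending == 0) {
        return false;
    }
    if (b->pending >= b->max_pending) {
        return true;
    }
    return now_ns - b->first_pending_ns >= b->max_delay_ns;
}

/* Deliver everything queued; returns how many interrupts were handled. */
static unsigned toy_drain(ToyIrqBatch *b)
{
    unsigned n = b->pending;
    b->pending = 0;
    return n;
}
```

Measuring from the oldest pending interrupt keeps the worst-case delay bounded regardless of how many translation blocks run in that time, which is the property a fixed block count cannot guarantee.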