* [Qemu-devel] [PATCH] Fix freezing bug in curses console
@ 2009-01-21 15:51 Matthew Bloch
2009-02-27 19:49 ` [Qemu-devel] " Anthony Liguori
0 siblings, 1 reply; 9+ messages in thread
From: Matthew Bloch @ 2009-01-21 15:51 UTC
To: qemu-devel; +Cc: kvm
Hi there,
We are running lots of kvm processes in screen and found that about 1 in
5 froze shortly after startup with a backtrace like this one:
#0 0xf7c7fcd9 in pthread_exit () from /lib/tls/libc.so.6
#1 0xf7cfbe62 in wresize () from /lib/libncurses.so.5
#2 0xf7cfb7ab in is_term_resized () from /lib/libncurses.so.5
#3 0xf7cfb877 in is_term_resized () from /lib/libncurses.so.5
#4 0xf7cfba31 in resize_term () from /lib/libncurses.so.5
#5 0x080d3dd9 in vga_init ()
#6 <signal handler called>
#7 0xf7c0da5b in free () from /lib/tls/libc.so.6
#8 0xf7c0effe in calloc () from /lib/tls/libc.so.6
#9 0xf7cf222e in newpad () from /lib/libncurses.so.5
#10 0x080d3549 in vga_init ()
We're just using the lenny version of kvm from 2008-12-16.
On casual inspection, the SIGWINCH signal handling looked ropey to me -
grandpa always told me not to do any real work in a signal handler, and
the backtrace suggested re-entrancy problems in curses, so I changed the
behaviour to set a flag and do the work in the main loop instead. Maybe
I'm reading the backtrace wrong.
So far that means that when you resize the window, the display is
corrupt until the VM outputs some text, or the user hits a key. But I
think it has solved the freezing / crashing bug too - would appreciate
any comments on my analysis or proposed solution.
Index: curses.c
===================================================================
--- curses.c (revision 6374)
+++ curses.c (working copy)
@@ -41,6 +41,7 @@
#define FONT_HEIGHT 16
#define FONT_WIDTH 8
+static int winch_flag = 0;
static console_ch_t screen[160 * 100];
static WINDOW *screenpad = NULL;
static int width, height, gwidth, gheight, invalidate;
@@ -110,7 +111,7 @@
#ifndef _WIN32
#if defined(SIGWINCH) && defined(KEY_RESIZE)
-static void curses_winch_handler(int signum)
+static void curses_winch_handler_real(void)
{
struct winsize {
unsigned short ws_row;
@@ -126,7 +127,13 @@
resize_term(ws.ws_row, ws.ws_col);
curses_calc_pad();
invalidate = 1;
+ winch_flag = 0;
+}
+static void curses_winch_handler(int sig)
+{
+ winch_flag = 1;
+
/* some systems require this */
signal(SIGWINCH, curses_winch_handler);
}
@@ -179,6 +186,12 @@
nextchr = ERR;
while (1) {
+
+#if !defined(_WIN32) && defined(SIGWINCH) && defined(KEY_RESIZE)
+ if (winch_flag)
+ curses_winch_handler_real();
+#endif
+
/* while there are any pending key strokes to process */
if (nextchr == ERR)
chr = getch();
--
Matthew
* [Qemu-devel] Re: [PATCH] Fix freezing bug in curses console
2009-01-21 15:51 [Qemu-devel] [PATCH] Fix freezing bug in curses console Matthew Bloch
@ 2009-02-27 19:49 ` Anthony Liguori
2009-02-27 21:01 ` andrzej zaborowski
0 siblings, 1 reply; 9+ messages in thread
From: Anthony Liguori @ 2009-02-27 19:49 UTC
To: Matthew Bloch; +Cc: qemu-devel, kvm
Matthew Bloch wrote:
> Hi there,
>
> We are running lots of kvm processes in screen and found that about 1 in
> 5 froze shortly after startup with a backtrace like this one:
>
> #0 0xf7c7fcd9 in pthread_exit () from /lib/tls/libc.so.6
> #1 0xf7cfbe62 in wresize () from /lib/libncurses.so.5
> #2 0xf7cfb7ab in is_term_resized () from /lib/libncurses.so.5
> #3 0xf7cfb877 in is_term_resized () from /lib/libncurses.so.5
> #4 0xf7cfba31 in resize_term () from /lib/libncurses.so.5
> #5 0x080d3dd9 in vga_init ()
> #6 <signal handler called>
> #7 0xf7c0da5b in free () from /lib/tls/libc.so.6
> #8 0xf7c0effe in calloc () from /lib/tls/libc.so.6
> #9 0xf7cf222e in newpad () from /lib/libncurses.so.5
> #10 0x080d3549 in vga_init ()
>
> We're just using the lenny version of kvm from 2008-12-16.
>
> On casual inspection, the SIGWINCH signal handling looked ropey to me -
> grandpa always told me not to do any real work in a signal handler, and
> the backtrace suggested re-entrancy problems in curses, so I changed the
> behaviour to set a flag and do the work in the main loop instead. Maybe
> I'm reading the backtrace wrong.
>
> So far that means that when you resize the window, the display is
> corrupt until the VM outputs some text, or the user hits a key. But I
> think it has solved the freezing / crashing bug too - would appreciate
> any comments on my analysis or proposed solution.
>
It's racy with select(). A better fix would be to create a pipe and
write to that pipe in the SIGWINCH handler. You should then register an
io callback using qemu_set_fd_handler2() that does the actions for SIGWINCH.
Regards,
Anthony Liguori
* Re: [Qemu-devel] Re: [PATCH] Fix freezing bug in curses console
2009-02-27 19:49 ` [Qemu-devel] " Anthony Liguori
@ 2009-02-27 21:01 ` andrzej zaborowski
2009-02-27 21:04 ` Anthony Liguori
0 siblings, 1 reply; 9+ messages in thread
From: andrzej zaborowski @ 2009-02-27 21:01 UTC
To: qemu-devel; +Cc: Matthew Bloch, kvm
2009/2/27 Anthony Liguori <aliguori@us.ibm.com>:
> Matthew Bloch wrote:
>>
>> Hi there,
>>
>> We are running lots of kvm processes in screen and found that about 1 in
>> 5 froze shortly after startup with a backtrace like this one:
>>
>> #0 0xf7c7fcd9 in pthread_exit () from /lib/tls/libc.so.6
>> #1 0xf7cfbe62 in wresize () from /lib/libncurses.so.5
>> #2 0xf7cfb7ab in is_term_resized () from /lib/libncurses.so.5
>> #3 0xf7cfb877 in is_term_resized () from /lib/libncurses.so.5
>> #4 0xf7cfba31 in resize_term () from /lib/libncurses.so.5
>> #5 0x080d3dd9 in vga_init ()
>> #6 <signal handler called>
>> #7 0xf7c0da5b in free () from /lib/tls/libc.so.6
>> #8 0xf7c0effe in calloc () from /lib/tls/libc.so.6
>> #9 0xf7cf222e in newpad () from /lib/libncurses.so.5
>> #10 0x080d3549 in vga_init ()
>>
>> We're just using the lenny version of kvm from 2008-12-16.
>>
>> On casual inspection, the SIGWINCH signal handling looked ropey to me -
>> grandpa always told me not to do any real work in a signal handler, and
>> the backtrace suggested re-entrancy problems in curses, so I changed the
>> behaviour to set a flag and do the work in the main loop instead. Maybe
>> I'm reading the backtrace wrong.
>>
>> So far that means that when you resize the window, the display is
>> corrupt until the VM outputs some text, or the user hits a key. But I
>> think it has solved the freezing / crashing bug too - would appreciate
>> any comments on my analysis or proposed solution.
>>
>
> It's racy with select(). A better fix would be to create a pipe and write
> to that pipe in the SIGWINCH handler. You should then register an io
> callback using qemu_set_fd_handler2() that does the actions for SIGWINCH.
Maybe a bottom half would work? The scheduling of a bh shouldn't
constitute "real work".
Cheers
* Re: [Qemu-devel] Re: [PATCH] Fix freezing bug in curses console
2009-02-27 21:01 ` andrzej zaborowski
@ 2009-02-27 21:04 ` Anthony Liguori
2009-02-28 21:21 ` Jamie Lokier
0 siblings, 1 reply; 9+ messages in thread
From: Anthony Liguori @ 2009-02-27 21:04 UTC
To: andrzej zaborowski; +Cc: Matthew Bloch, qemu-devel, kvm
andrzej zaborowski wrote:
> 2009/2/27 Anthony Liguori <aliguori@us.ibm.com>:
>
>> Matthew Bloch wrote:
>>
>>> Hi there,
>>>
>>> We are running lots of kvm processes in screen and found that about 1 in
>>> 5 froze shortly after startup with a backtrace like this one:
>>>
>>> #0 0xf7c7fcd9 in pthread_exit () from /lib/tls/libc.so.6
>>> #1 0xf7cfbe62 in wresize () from /lib/libncurses.so.5
>>> #2 0xf7cfb7ab in is_term_resized () from /lib/libncurses.so.5
>>> #3 0xf7cfb877 in is_term_resized () from /lib/libncurses.so.5
>>> #4 0xf7cfba31 in resize_term () from /lib/libncurses.so.5
>>> #5 0x080d3dd9 in vga_init ()
>>> #6 <signal handler called>
>>> #7 0xf7c0da5b in free () from /lib/tls/libc.so.6
>>> #8 0xf7c0effe in calloc () from /lib/tls/libc.so.6
>>> #9 0xf7cf222e in newpad () from /lib/libncurses.so.5
>>> #10 0x080d3549 in vga_init ()
>>>
>>> We're just using the lenny version of kvm from 2008-12-16.
>>>
>>> On casual inspection, the SIGWINCH signal handling looked ropey to me -
>>> grandpa always told me not to do any real work in a signal handler, and
>>> the backtrace suggested re-entrancy problems in curses, so I changed the
>>> behaviour to set a flag and do the work in the main loop instead. Maybe
>>> I'm reading the backtrace wrong.
>>>
>>> So far that means that when you resize the window, the display is
>>> corrupt until the VM outputs some text, or the user hits a key. But I
>>> think it has solved the freezing / crashing bug too - would appreciate
>>> any comments on my analysis or proposed solution.
>>>
>>>
>> It's racy with select(). A better fix would be to create a pipe and write
>> to that pipe in the SIGWINCH handler. You should then register an io
>> callback using qemu_set_fd_handler2() that does the actions for SIGWINCH.
>>
>
> Maybe a bottom half would work? The scheduling of a bh shouldn't
> constitute "real work".
>
I think it still suffers from the same race condition so today it
wouldn't work. You could fix the bottom half scheduling though so that
you could safely schedule a bottom half from a signal handler (using
roughly the same trick).
Regards,
Anthony Liguori
> Cheers
* Re: [Qemu-devel] Re: [PATCH] Fix freezing bug in curses console
2009-02-27 21:04 ` Anthony Liguori
@ 2009-02-28 21:21 ` Jamie Lokier
2009-03-01 11:36 ` Daniel P. Berrange
0 siblings, 1 reply; 9+ messages in thread
From: Jamie Lokier @ 2009-02-28 21:21 UTC
To: qemu-devel; +Cc: Matthew Bloch, kvm
Anthony Liguori wrote:
> >>It's racy with select(). A better fix would be to create a pipe and write
> >>to that pipe in the SIGWINCH handler. You should then register an io
> >>
> >
> >Maybe a bottom half would work? The scheduling of a bh shouldn't
> >constitute "real work".
>
> I think it still suffers from the same race condition so today it
> wouldn't work. You could fix the bottom half scheduling though so that
> you could safely schedule a bottom half from a signal handler (using
> roughly the same trick).
Fwiw, it's perfectly sensible to have a single pipe which is shared by
all signal handlers, just used to say "check for work flags set".
-- Jamie
* Re: [Qemu-devel] Re: [PATCH] Fix freezing bug in curses console
2009-02-28 21:21 ` Jamie Lokier
@ 2009-03-01 11:36 ` Daniel P. Berrange
2009-03-01 13:03 ` Paul Brook
0 siblings, 1 reply; 9+ messages in thread
From: Daniel P. Berrange @ 2009-03-01 11:36 UTC
To: qemu-devel; +Cc: Matthew Bloch, kvm
On Sat, Feb 28, 2009 at 09:21:16PM +0000, Jamie Lokier wrote:
> Anthony Liguori wrote:
> > >>It's racy with select(). A better fix would be to create a pipe and write
> > >>to that pipe in the SIGWINCH handler. You should then register an io
> > >>
> > >
> > >Maybe a bottom half would work? The scheduling of a bh shouldn't
> > >constitute "real work".
> >
> > I think it still suffers from the same race condition so today it
> > wouldn't work. You could fix the bottom half scheduling though so that
> > you could safely schedule a bottom half from a signal handler (using
> > roughly the same trick).
>
> Fwiw, it's perfectly sensible to have a single pipe which is shared by
> all signal handlers, just used to say "check for work flags set".
And if you need the main loop to be able to distinguish signals coming
out of the pipe, then just write the signum into the pipe as a byte,
instead of a single dummy byte. Or even write the whole 'siginfo_t'
struct passed to the signal handler, and read it out in sizeof(siginfo_t)
sized chunks for processing.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
* Re: [Qemu-devel] Re: [PATCH] Fix freezing bug in curses console
2009-03-01 11:36 ` Daniel P. Berrange
@ 2009-03-01 13:03 ` Paul Brook
2009-03-01 14:07 ` Anthony Liguori
0 siblings, 1 reply; 9+ messages in thread
From: Paul Brook @ 2009-03-01 13:03 UTC
To: qemu-devel, Daniel P. Berrange; +Cc: Matthew Bloch, kvm
> > > I think it still suffers from the same race condition so today it
> > > wouldn't work. You could fix the bottom half scheduling though so that
> > > you could safely schedule a bottom half from a signal handler (using
> > > roughly the same trick).
> >
> > Fwiw, it's perfectly sensible to have a single pipe which is shared by
> > all signal handlers, just used to say "check for work flags set".
>
> And if you need the main loop to be able to distinguish signals coming
> out of the pipe, then just write the signum into the pipe as a byte,
> instead of a single dummy byte. Or even write the whole 'siginfo_t'
> struct passed to the signal handler, and read it out in sizeof(siginfo_t)
> sized chunks for processing.
I don't think this will work. If the pipe buffer gets full, the write will
either block or you'll lose signals.
When using the pipe as a simple semaphore, all you care about is the presence
or absence of data. It doesn't matter if subsequent writes lose data (e.g.
by not retrying a nonblocking write) as long as a write to an empty pipe
succeeds.
Paul
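
The semaphore-style use of the pipe described above can be sketched as below. The names sem_pipe, sem_pipe_init, sem_pipe_post and sem_pipe_take are hypothetical. The point illustrated is that a non-blocking write failing with EAGAIN on a full pipe is harmless, because a full pipe already means "signalled".

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

static int sem_pipe[2] = { -1, -1 };

static int sem_pipe_init(void)
{
    if (pipe(sem_pipe) < 0)
        return -1;
    if (fcntl(sem_pipe[0], F_SETFL, O_NONBLOCK) < 0 ||
        fcntl(sem_pipe[1], F_SETFL, O_NONBLOCK) < 0)
        return -1;
    return 0;
}

/* Post: safe to call from a signal handler. The write is never
 * retried; EAGAIN means data is already waiting for the reader.
 * Returns -1 only for unexpected errors. */
static int sem_pipe_post(void)
{
    int saved_errno = errno;
    char byte = 0;
    int ret = 0;

    if (write(sem_pipe[1], &byte, 1) < 0 && errno != EAGAIN)
        ret = -1;
    errno = saved_errno;
    return ret;
}

/* Take: empties the pipe and reports only whether anything was there.
 * The number of bytes read carries no meaning. */
static int sem_pipe_take(void)
{
    char buf[256];
    int seen = 0;

    while (read(sem_pipe[0], buf, sizeof(buf)) > 0)
        seen = 1;
    return seen;
}
```

Posting far more often than the pipe can buffer still yields a single "something happened" result on the reading side, which is all a semaphore of this kind needs to convey.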
* Re: [Qemu-devel] Re: [PATCH] Fix freezing bug in curses console
2009-03-01 13:03 ` Paul Brook
@ 2009-03-01 14:07 ` Anthony Liguori
2009-03-02 16:57 ` Jamie Lokier
0 siblings, 1 reply; 9+ messages in thread
From: Anthony Liguori @ 2009-03-01 14:07 UTC
To: Paul Brook; +Cc: Matthew Bloch, qemu-devel, kvm
Paul Brook wrote:
>>>> I think it still suffers from the same race condition so today it
>>>> wouldn't work. You could fix the bottom half scheduling though so that
>>>> you could safely schedule a bottom half from a signal handler (using
>>>> roughly the same trick).
>>>>
>>> Fwiw, it's perfectly sensible to have a single pipe which is shared by
>>> all signal handlers, just used to say "check for work flags set".
>>>
>> And if you need the main loop to be able to distinguish signals coming
>> out of the pipe, then just write the signum into the pipe as a byte,
>> instead of a single dummy byte. Or even write the whole 'siginfo_t'
>> struct passed to the signal handler, and read it out in sizeof(siginfo_t)
>> sized chunks for processing.
>>
>
> I don't think this will work. If the pipe buffer gets full, the write will
> either block or you'll lose signals.
>
> When using the pipe as a simple semaphore, all you care about is the presence
> or absence of data. It doesn't matter if subsequent writes lose data (e.g.
> by not retrying a nonblocking write) as long as a write to an empty pipe
> succeeds.
>
Yup. You need to use a global flag to distinguish the type of signal.
Regards,
Anthony Liguori
> Paul
* Re: [Qemu-devel] Re: [PATCH] Fix freezing bug in curses console
2009-03-01 14:07 ` Anthony Liguori
@ 2009-03-02 16:57 ` Jamie Lokier
0 siblings, 0 replies; 9+ messages in thread
From: Jamie Lokier @ 2009-03-02 16:57 UTC
To: qemu-devel; +Cc: Matthew Bloch, Paul Brook, kvm
Anthony Liguori wrote:
> >When using the pipe as a simple semaphore all you care about is the
> >presence or absence of data. It doesn't matter if subsequent writes lose
> >data (e.g. by not retrying a nonblocking write) as long as a write to an
> >empty pipe succeeds.
>
> Yup. You need to use a global flag to distinguish the type of signal.
If you have a set of BHs which can be scheduled from a signal handler,
set a flag in the BH when it's scheduled, prior to the non-blocking
pipe write. The select-pipe reader can then look at all eligible BHs
looking for ones with the flag set.
If you can enqueue them in the signal handler that's even better, but
obviously beware of race conditions.
Don't forget to completely drain the pipe when reading.
Maybe use an eventfd instead of a pipe if you have eventfd. :-)
If the signal handler might be run in different threads, you'll need
to take care of memory ordering. The flag must be set before writing
to the pipe, as observed by the pipe-reading thread.
-- Jamie
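
The eventfd variant, with the flag published before the wakeup, might be sketched like this. It is Linux-specific and rests on assumptions: C11 atomics stand in for whatever ordering primitive the code base would really use, and wake_efd, resize_flag, efd_setup, efd_notify_resize and efd_collect are hypothetical names.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

static int wake_efd = -1;
static atomic_int resize_flag;

static int efd_setup(void)
{
    wake_efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    return wake_efd < 0 ? -1 : 0;
}

/* Signal-handler side: set the flag first (release), then wake the
 * reader, so a reader that sees the wakeup also sees the flag. */
static void efd_notify_resize(void)
{
    int saved_errno = errno;
    uint64_t one = 1;
    ssize_t r;

    atomic_store_explicit(&resize_flag, 1, memory_order_release);
    r = write(wake_efd, &one, sizeof(one));
    (void)r;
    errno = saved_errno;
}

/* Main-loop side: a single read resets the eventfd counter to zero,
 * which drains it completely; then the flag is taken with acquire
 * ordering. Returns 1 if a resize was pending. */
static int efd_collect(void)
{
    uint64_t count;
    ssize_t r = read(wake_efd, &count, sizeof(count));

    (void)r;                 /* EAGAIN just means no wakeup was queued */
    return atomic_exchange_explicit(&resize_flag, 0, memory_order_acquire);
}
```

With an eventfd the "drain completely" step is one read() rather than a loop, since the kernel keeps a counter instead of a byte queue, and only one descriptor is needed instead of two.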