All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Mini-OS update of events initialisation
@ 2006-11-15 19:39 Grzegorz Milos
  2006-11-16  7:45 ` Keir Fraser
  0 siblings, 1 reply; 5+ messages in thread
From: Grzegorz Milos @ 2006-11-15 19:39 UTC (permalink / raw)
  To: xen-devel

[-- Attachment #1: Type: text/plain, Size: 1003 bytes --]

This patch moves initialisation of events (masking event channels) 
earlier during the boot process. Otherwise 64bit guests would sometimes 
crash.

Signed-off-by: Grzegorz Milos <gm281@cam.ac.uk>

Admittedly, I cannot explain the behavior I observed fully. This patch 
seems to fix it well, but I'd be grateful if someone on the list could 
explain it.

The problem only appears in 64 bit guest, and only if MiniOS keeps 
accessing the console ring page (which it does, as all printks write to 
the console, even before events are initialised. Of course without 
sending notifications).
What happens is that after a while, usually in memory initialisation 
stage, but at fairly random place MiniOS gets "restarted" i.e. 
start_kernel is called again. This leads to a crash as soon as MiniOS 
tries to remap shared info page. It looks like if Xen or xend decided 
that MiniOS domain should have masked the event channels before writing 
to the console page, and reboot it ...

Any ideas?

Thanks
Gregor

[-- Attachment #2: mini-os_events.patch --]
[-- Type: text/plain, Size: 2413 bytes --]

diff -r 8ed53f24279e -r 2ba83e480bd6 extras/mini-os/include/x86/os.h
--- a/extras/mini-os/include/x86/os.h	Wed Nov 15 16:27:37 2006 +0000
+++ b/extras/mini-os/include/x86/os.h	Wed Nov 15 19:10:33 2006 +0000
@@ -18,6 +18,8 @@
 #ifndef __ASSEMBLY__
 #include <types.h>
 #include <hypervisor.h>
+
+#define USED    __attribute__ ((used))
 
 extern void do_exit(void);
 #define BUG do_exit
diff -r 8ed53f24279e -r 2ba83e480bd6 extras/mini-os/kernel.c
--- a/extras/mini-os/kernel.c	Wed Nov 15 16:27:37 2006 +0000
+++ b/extras/mini-os/kernel.c	Wed Nov 15 19:10:33 2006 +0000
@@ -116,6 +116,9 @@ void start_kernel(start_info_t *si)
     printk("  cmd_line:   %s\n",  
            si->cmd_line ? (const char *)si->cmd_line : "NULL");
 
+    /* Set up events. */
+    init_events();
+    
     arch_print_info();
 
     setup_xen_features();
@@ -123,9 +126,6 @@ void start_kernel(start_info_t *si)
     /* Init memory management. */
     init_mm();
 
-    /* Set up events. */
-    init_events();
-    
     /* Init time and timers. */
     init_time();
 
diff -r 8ed53f24279e -r 2ba83e480bd6 extras/mini-os/mm.c
--- a/extras/mini-os/mm.c	Wed Nov 15 16:27:37 2006 +0000
+++ b/extras/mini-os/mm.c	Wed Nov 15 19:10:33 2006 +0000
@@ -148,7 +148,7 @@ static chunk_head_t  free_tail[FREELIST_
  * Prints allocation[0/1] for @nr_pages, starting at @start
  * address (virtual).
  */
-static void print_allocation(void *start, int nr_pages)
+USED static void print_allocation(void *start, int nr_pages)
 {
     unsigned long pfn_start = virt_to_pfn(start);
     int count;
@@ -163,7 +163,7 @@ static void print_allocation(void *start
  * Prints chunks (making them with letters) for @nr_pages starting
  * at @start (virtual).
  */
-static void print_chunks(void *start, int nr_pages)
+USED static void print_chunks(void *start, int nr_pages)
 {
     char chunks[1001], current='A';
     int order, count;
@@ -408,7 +408,6 @@ void new_pt_frame(unsigned long *pt_pfn,
          do_exit();
          break;
     }
-
     /* Update the entry */
 #if defined(__x86_64__)
     tab = pte_to_virt(tab[l4_table_offset(pt_page)]);
@@ -446,7 +445,6 @@ void new_pt_frame(unsigned long *pt_pfn,
        printk("ERROR: mmu_update failed\n");
        do_exit();
     }
-
     *pt_pfn += 1;
 }
 
@@ -581,7 +579,6 @@ void build_pagetable(unsigned long *star
         }
         start_address += PAGE_SIZE;
     }
-
     *start_pfn = pt_pfn;
 }
 

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Mini-OS update of events initialisation
  2006-11-15 19:39 [PATCH] Mini-OS update of events initialisation Grzegorz Milos
@ 2006-11-16  7:45 ` Keir Fraser
  0 siblings, 0 replies; 5+ messages in thread
From: Keir Fraser @ 2006-11-16  7:45 UTC (permalink / raw)
  To: Grzegorz Milos, xen-devel




On 15/11/06 7:39 pm, "Grzegorz Milos" <gm281@cam.ac.uk> wrote:

> What happens is that after a while, usually in memory initialisation
> stage, but at fairly random place MiniOS gets "restarted" i.e.
> start_kernel is called again. This leads to a crash as soon as MiniOS
> tries to remap shared info page. It looks like if Xen or xend decided
> that MiniOS domain should have masked the event channels before writing
> to the console page, and reboot it ...

If you enable event delivery before registering a callback handler then your
domain will crash on first event delivery. But then you would be restarted
in a fresh domain with a new shared_info page; wou wouldn't jump back to the
entry code of the existing domain.

 -- Keir

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Mini-OS update of events initialisation
@ 2006-11-17 13:53 Melvin Anderson
  2006-11-17 15:29 ` Grzegorz Milos
  0 siblings, 1 reply; 5+ messages in thread
From: Melvin Anderson @ 2006-11-17 13:53 UTC (permalink / raw)
  To: xen-devel; +Cc: Grzegorz Milos, Micha Moffie

We have been seeing the same problem with a 32bit Mini-OS guest.

I suspect the problem is in the initialisation order.  Looking at
start_kernel() in kernel.c:

    arch_init(si);

    trap_init();

    /* ENABLE EVENT DELIVERY. This is disabled at start of day. */
    __sti();
    
    <code omitted...>

    /* Set up events. */
    init_events();

The function arch_init() registers hypervisor_callback, and __sti()
enables events to be delivered.  Entry point hypervisor_callback is in
the assembly code in x86_32.S which calls do_hypervisor_callback() in
hypervisor.c, which in turn calls do_event() in events.c.

Suppose a callback occurs between the calls of __sti() and
init_events().  The function do_event() calls the handler indirectly via
the array ev_actions.  But ev_actions is initialised in init_events(),
so if do_event() is called too early, the function pointer "handler" in
ev_actions will still be 0 (default for static storage).  We start again
at virtual address 0, which is the entry point of Mini-OS, but %esi will
not now point to the start_info page.  I think this explains why Mini-OS
sometimes gets "restarted", and when it does the start_info page seems
to be garbage.

I am not convinced that Grzegorz' patch closes the window of opportunity
for a misplaced callback, but it does reduce it.  Shouldn't __sti() be
after init_events(), not before it?

Thanks are due to Micha Moffie, who came up with key insights.

Regards,

Melvin Anderson.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Mini-OS update of events initialisation
  2006-11-17 13:53 Melvin Anderson
@ 2006-11-17 15:29 ` Grzegorz Milos
  2006-11-17 17:31   ` Melvin Anderson
  0 siblings, 1 reply; 5+ messages in thread
From: Grzegorz Milos @ 2006-11-17 15:29 UTC (permalink / raw)
  To: Melvin Anderson; +Cc: xen-devel, Micha Moffie

[-- Attachment #1: Type: text/plain, Size: 2369 bytes --]

> We have been seeing the same problem with a 32bit Mini-OS guest.
> 
> I suspect the problem is in the initialisation order.  Looking at
> start_kernel() in kernel.c:
> 
>     arch_init(si);
> 
>     trap_init();
> 
>     /* ENABLE EVENT DELIVERY. This is disabled at start of day. */
>     __sti();
>     
>     <code omitted...>
> 
>     /* Set up events. */
>     init_events();
> 
> The function arch_init() registers hypervisor_callback, and __sti()
> enables events to be delivered.  Entry point hypervisor_callback is in
> the assembly code in x86_32.S which calls do_hypervisor_callback() in
> hypervisor.c, which in turn calls do_event() in events.c.

That's all correct so far.

> Suppose a callback occurs between the calls of __sti() and
> init_events().  The function do_event() calls the handler indirectly via
> the array ev_actions.  But ev_actions is initialised in init_events(),
> so if do_event() is called too early, the function pointer "handler" in
> ev_actions will still be 0 (default for static storage).  We start again
> at virtual address 0, which is the entry point of Mini-OS, but %esi will
> not now point to the start_info page.  I think this explains why Mini-OS
> sometimes gets "restarted", and when it does the start_info page seems
> to be garbage.

This seems to explain the "restart". However, my understanding was that 
  an event port has to be explicitly unmasked in order for any event to 
be delivered (i.e. that all event ports are masked by default). 
Unmasking is done by bind_evtchn() or bind_virq() in event.s. All calls 
to these functions happen after init_events() (in init_time(), 
init_console() and init_xenbus()). I guess the best way to verify if 
your scenario is correct is to put 'printk' in do_events(). Could you 
try this Melvin?

> I am not convinced that Grzegorz' patch closes the window of opportunity
> for a misplaced callback, but it does reduce it.  Shouldn't __sti() be
> after init_events(), not before it?

That certainly cannot hurt. I'm attaching a patch. Keir could you apply?
Patch description:
Events enabled after init_events() call. This guarantees that internal 
datastructures are set up correctly. Problem spotted/explained by Melvin 
Anderson and Micha Moffie.

Signed-off-by: Grzegorz Milos <gm281@cam.ac.uk>


> 
> Thanks are due to Micha Moffie, who came up with key insights.



[-- Attachment #2: mini-os_events.patch --]
[-- Type: text/plain, Size: 718 bytes --]

diff -r f7cc9ce481d9 -r 669ea1ab20d4 extras/mini-os/kernel.c
--- a/extras/mini-os/kernel.c	Wed Nov 15 22:35:24 2006 +0000
+++ b/extras/mini-os/kernel.c	Fri Nov 17 15:22:54 2006 +0000
@@ -100,10 +100,7 @@ void start_kernel(start_info_t *si)
     arch_init(si);
 
     trap_init();
-
-    /* ENABLE EVENT DELIVERY. This is disabled at start of day. */
-    __sti();
-    
+   
     /* print out some useful information  */
     printk("Xen Minimal OS!\n");
     printk("start_info:   %p\n",    si);
@@ -118,6 +115,9 @@ void start_kernel(start_info_t *si)
 
     /* Set up events. */
     init_events();
+ 
+    /* ENABLE EVENT DELIVERY. This is disabled at start of day. */
+    __sti();
     
     arch_print_info();
 

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Mini-OS update of events initialisation
  2006-11-17 15:29 ` Grzegorz Milos
@ 2006-11-17 17:31   ` Melvin Anderson
  0 siblings, 0 replies; 5+ messages in thread
From: Melvin Anderson @ 2006-11-17 17:31 UTC (permalink / raw)
  To: Grzegorz Milos; +Cc: xen-devel, Micha Moffie

On Fri, 2006-11-17 at 15:29 +0000, Grzegorz Milos wrote:
> > Suppose a callback occurs between the calls of __sti() and
> > init_events().  The function do_event() calls the handler indirectly via
> > the array ev_actions.  But ev_actions is initialised in init_events(),
> > so if do_event() is called too early, the function pointer "handler" in
> > ev_actions will still be 0 (default for static storage).  We start again
> > at virtual address 0, which is the entry point of Mini-OS, but %esi will
> > not now point to the start_info page.  I think this explains why Mini-OS
> > sometimes gets "restarted", and when it does the start_info page seems
> > to be garbage.
> 
> This seems to explain the "restart". However, my understanding was that 
>   an event port has to be explicitly unmasked in order for any event to 
> be delivered (i.e. that all event ports are masked by default). 
> Unmasking is done by bind_evtchn() or bind_virq() in event.s. All calls 
> to these functions happen after init_events() (in init_time(), 
> init_console() and init_xenbus()). I guess the best way to verify if 
> your scenario is correct is to put 'printk' in do_events(). Could you 
> try this Melvin?

Very good question.  I have just tried calling a modified version of
init_events() to set up the default_handler before __sti() is called,
but my modified init_events() did not call mask_evtchn().

I find that I occasionally see an event on port 2.  Looking at the debug
output from xend (I ran xend with XEND_DEBUG=1 set in the environment),
I see that this is the console event port.

I have just looked at the domain builder code in libxc,
tools/libxc/xc_linux_build.c, function setup_guest().  I see that this
initialises the shared_info page with:

    memset(shared_info, 0, PAGE_SIZE);
    /* Mask all upcalls... */
    for ( i = 0; i < MAX_VIRT_CPUS; i++ )
        shared_info->vcpu_info[i].evtchn_upcall_mask = 1;

So, the evtchn_mask array is cleared (unmasked) by the memset() call,
but the per-cpu evtchn_upcall_mask is set to hold off callbacks until we
are properly initialised.  Function __sti() clears evtchn_upcall_mask,
and as evtchn_mask array is cleared by default, that explains why the
upcall on port 2 is not masked.

Regards,

Melvin Anderson.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-11-17 17:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-11-15 19:39 [PATCH] Mini-OS update of events initialisation Grzegorz Milos
2006-11-16  7:45 ` Keir Fraser
  -- strict thread matches above, loose matches on Subject: below --
2006-11-17 13:53 Melvin Anderson
2006-11-17 15:29 ` Grzegorz Milos
2006-11-17 17:31   ` Melvin Anderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.