From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
	by ozlabs.org (Postfix) with ESMTP id 42014B6F98;
	Mon, 5 Dec 2011 21:55:06 +1100 (EST)
Date: Mon, 5 Dec 2011 16:24:52 +0530
From: Amit Shah
To: Miche Baker-Harvey
Subject: Re: [PATCH v3 2/3] hvc_init(): Enforce one-time initialization.
Message-ID: <20111205105452.GB27683@amit-x200.redhat.com>
References: <20111108214452.28884.14840.stgit@miche.sea.corp.google.com>
	<20111108214504.28884.61814.stgit@miche.sea.corp.google.com>
	<874nybqo0o.fsf@rustcorp.com.au>
	<20111123103852.GG16665@amit-x200.redhat.com>
	<20111129142149.GE2822@amit-x200.redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Stephen Rothwell, xen-devel@lists.xensource.com,
	Konrad Rzeszutek Wilk, Rusty Russell,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, Anton Blanchard,
	Mike Waychison, ppc-dev, Greg Kroah-Hartman, Eric Northrup
List-Id: Linux on PowerPC Developers Mail List

On (Tue) 29 Nov 2011 [09:50:41], Miche Baker-Harvey wrote:
> Good grief! Sorry for the spacing mess-up! Here's a resend with reformatting.
>
> Amit,
> We aren't using either QEMU or kvmtool, but we are using KVM. All

So it's a different userspace? Any chance this different userspace is
causing these problems to appear? Esp. since I couldn't reproduce
with qemu.

> the issues we are seeing happen when we try to establish multiple
> virtioconsoles at boot time. The command line isn't relevant, but I
> can tell you the protocol that's passing between the host (kvm) and
> the guest (see the end of this message).
>
> We do go through the control_work_handler(), but it's not
> providing synchronization. Here's a trace of the
> control_work_handler() and handle_control_message() calls; note that
> there are two concurrent calls to control_work_handler().

Ah; how does that happen?
control_work_handler() should just be invoked once, and if there are
any more pending work items to be consumed, they should be done within
the loop inside control_work_handler().

> I decorated control_work_handler() with a "lifetime" marker, and
> passed this value to handle_control_message(), so we can see which
> control messages are being handled from which instance of
> the control_work_handler() thread.
>
> Notice that we enter control_work_handler() a second time before
> the handling of the second PORT_ADD message is complete. The
> first CONSOLE_PORT message is handled by the second
> control_work_handler() call, but the second is handled by the first
> control_work_handler() call.
>
> root@myubuntu:~# dmesg | grep MBH
> [3371055.808738] control_work_handler #1
> [3371055.809372] + #1 handle_control_message PORT_ADD
> [3371055.810169] - handle_control_message PORT_ADD
> [3371055.810170] + #1 handle_control_message PORT_ADD
> [3371055.810244] control_work_handler #2
> [3371055.810245] + #2 handle_control_message CONSOLE_PORT
> [3371055.810246] got hvc_ports_mutex
> [3371055.810578] - handle_control_message PORT_ADD
> [3371055.810579] + #1 handle_control_message CONSOLE_PORT
> [3371055.810580] trylock of hvc_ports_mutex failed
> [3371055.811352] got hvc_ports_mutex
> [3371055.811370] - handle_control_message CONSOLE_PORT
> [3371055.816609] - handle_control_message CONSOLE_PORT
>
> So, I'm guessing the bug is that there shouldn't be two instances of
> control_work_handler() running simultaneously?

Yep, I assumed we did that but apparently not. Do you plan to chase
this one down?

		Amit
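
For illustration only, the behaviour expected above (one handler
draining all pending control messages in a loop, with no second
instance running alongside it) can be modelled in userspace roughly as
below. This is a sketch, not the virtio_console code: struct ctrl_queue,
queue_push(), queue_pop() and control_work_handler_model() are
hypothetical names, and a pthread mutex stands in for whatever
serialization the kernel side would actually use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_LEN 16

/* Event names borrowed from the trace; the values are arbitrary. */
enum ctrl_event { PORT_ADD, CONSOLE_PORT };

/* Hypothetical userspace stand-in for the pending control messages. */
struct ctrl_queue {
    pthread_mutex_t lock;            /* protects buf, head and tail */
    pthread_mutex_t drain;           /* held by the single active handler */
    enum ctrl_event buf[QUEUE_LEN];
    size_t head, tail;               /* head: next to pop, tail: next free slot */
    enum ctrl_event log[QUEUE_LEN];  /* order in which events were handled */
    size_t handled;
};

static void queue_init(struct ctrl_queue *q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    pthread_mutex_init(&q->drain, NULL);
    q->head = q->tail = 0;
    q->handled = 0;
}

/* Producer side: returns 0 on success, -1 if the queue is full. */
static int queue_push(struct ctrl_queue *q, enum ctrl_event ev)
{
    int ret = -1;

    pthread_mutex_lock(&q->lock);
    if (q->tail - q->head < QUEUE_LEN) {
        q->buf[q->tail % QUEUE_LEN] = ev;
        q->tail++;
        ret = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return ret;
}

/* Returns 1 and stores the oldest event in *ev, or 0 if the queue is empty. */
static int queue_pop(struct ctrl_queue *q, enum ctrl_event *ev)
{
    int ok = 0;

    pthread_mutex_lock(&q->lock);
    if (q->head != q->tail) {
        *ev = q->buf[q->head % QUEUE_LEN];
        q->head++;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/*
 * Model of the handler: take the drain mutex, then consume everything
 * that is pending.  A second caller blocks until the first has finished,
 * so two events are never handled at the same time.
 * Returns the number of events handled by this call.
 */
static int control_work_handler_model(struct ctrl_queue *q)
{
    enum ctrl_event ev;
    int n = 0;

    pthread_mutex_lock(&q->drain);
    while (queue_pop(q, &ev)) {
        if (q->handled < QUEUE_LEN)
            q->log[q->handled] = ev;
        q->handled++;
        n++;
    }
    pthread_mutex_unlock(&q->drain);
    return n;
}
```

With this shape, a second invocation that arrives while the first is
still draining waits its turn and then finds either nothing left or
only messages queued in the meantime, so the PORT_ADD and CONSOLE_PORT
handling from the trace could not interleave.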