From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=42836 helo=eggs.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.43) id 1Oxn3l-00058A-Ll
 for qemu-devel@nongnu.org; Mon, 20 Sep 2010 16:33:58 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
 (envelope-from ) id 1Oxn3i-0000jw-Oz
 for qemu-devel@nongnu.org; Mon, 20 Sep 2010 16:33:55 -0400
Received: from mx1.redhat.com ([209.132.183.28]:20527)
 by eggs.gnu.org with esmtp (Exim 4.69)
 (envelope-from ) id 1Oxn3i-0000js-Gb
 for qemu-devel@nongnu.org; Mon, 20 Sep 2010 16:33:54 -0400
Date: Mon, 20 Sep 2010 22:27:56 +0200
From: "Michael S. Tsirkin"
Message-ID: <20100920202755.GA821@redhat.com>
References: <20100920164758.GB29862@redhat.com>
 <4C979258.9020701@codemonkey.ws> <20100920171439.GF29862@redhat.com>
 <4C97A474.8040900@codemonkey.ws> <20100920182459.GE30611@redhat.com>
 <4C97AA44.8000403@codemonkey.ws> <20100920191500.GG30611@redhat.com>
 <4C97B5EA.9060809@codemonkey.ws> <20100920194415.GK30611@redhat.com>
 <4C97C22B.3010502@codemonkey.ws>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4C97C22B.3010502@codemonkey.ws>
Subject: [Qemu-devel] Re: [PATCH] net: delay peer host device delete
List-Id: qemu-devel.nongnu.org
To: Anthony Liguori
Cc: qemu-devel@nongnu.org

On Mon, Sep 20, 2010 at 03:20:59PM -0500, Anthony Liguori wrote:
> On 09/20/2010 02:44 PM, Michael S. Tsirkin wrote:
> >
> >>I think the only workable approach that doesn't involve new commands
> >>is to change the semantics of the existing ones.
> >>
> >>Make netdev_del work regardless of whether the device is still present.
> >>
> >>You would need to reference count the actual netdev structure and
> >>have each device using it unref on delete. You make netdev_del mark
> >>the device as deleted and when a device is deleted, any calls into
> >>the device effectively become nops.
> >>
> >>You have to go through most of the cleanup process to ensure that
> >>tap device gets closed even before your reference count goes to
> >>zero.
> >I think you mean 'does not get closed': we need the fd to get the flags etc.
>
> No, I actually meant does get closed.
>
> When you do netdev_del, it should result in the fd getting closed.
>
> The actual netdev structure then becomes a zombie that's completely
> useless until the device goes away.
>
> >Note that it will mostly work unless when it'll crash.
> >Issue is we don't have any documentation so
> >people get the command set by trial and error.
> >
> >So how can we prove it's a user bug and not qemu bug?
> >I guess we should blame ourselves until proven innocent.
>
> Here's what I'm now suggesting:
>
> device_del -> may or may not unplug a device from a guest when it
> returns. To figure out if it does, you have to run info qdm.

I think it should also always unplug on guest reset.

> netdev_del -> always destroys a netdev device when it returns. May
> be called at any point in time. If you destroy a netdev while the
> device is still using it, all packets go into the bit bucket and the
> link status is modified to be unplugged.

One issue here is that we can't allow a new device with the same name
to be created until the nic is destroyed.

> You're suggesting:
>
> netdev_del -> may or may not destroy a netdev depending on when the
> device delete completes. Eventually, when there's a reset, we will
> kill the device. Even though the netdev is still active, we'll hide
> it from the management tools.

This last point is a bug in my patch. I now think we should not hide it;
run info net to figure out when it is removed.

> I think the suggested semantics are totally unusable. If we can
> make something deterministic for a management tool, that should be
> the path we take.
>
> Regards,
>
> Anthony Liguori

So I basically propose that netdev_del and device_del behave
identically. Why is this unusable?

-- 
MST
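
For reference, the "zombie netdev" semantics discussed above could be modelled roughly as follows. This is only a sketch under assumptions: the Netdev struct, the fixed-size table, and every function body here are hypothetical and are not QEMU's actual code. It shows netdev_del closing the fd and marking the backend deleted right away, sends becoming no-ops, and the name staying reserved until the last reference (the NIC's) is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of a host network backend; not QEMU's real structure. */
typedef struct Netdev {
    char name[32];
    int refcount;   /* one reference for the monitor, one per attached NIC */
    int fd;         /* stands in for the tap fd; -1 once "closed" */
    bool deleted;   /* set by netdev_del; the backend is a zombie afterwards */
    bool link_up;
    bool in_use;    /* table slot occupied; the name is reserved while true */
} Netdev;

#define MAX_NETDEVS 8
static Netdev netdevs[MAX_NETDEVS];

static Netdev *netdev_find(const char *name)
{
    for (size_t i = 0; i < MAX_NETDEVS; i++) {
        if (netdevs[i].in_use && strcmp(netdevs[i].name, name) == 0) {
            return &netdevs[i];
        }
    }
    return NULL;
}

/* Create a backend; fails while a live or zombie netdev still holds the name. */
Netdev *netdev_add(const char *name, int fd)
{
    if (netdev_find(name)) {
        return NULL;
    }
    for (size_t i = 0; i < MAX_NETDEVS; i++) {
        if (!netdevs[i].in_use) {
            Netdev *nd = &netdevs[i];
            memset(nd, 0, sizeof(*nd));
            snprintf(nd->name, sizeof(nd->name), "%s", name);
            nd->refcount = 1;   /* the monitor's reference */
            nd->fd = fd;
            nd->link_up = true;
            nd->in_use = true;
            return nd;
        }
    }
    return NULL;
}

void netdev_ref(Netdev *nd)
{
    nd->refcount++;
}

void netdev_unref(Netdev *nd)
{
    assert(nd->refcount > 0);
    if (--nd->refcount == 0) {
        nd->in_use = false;     /* slot and name become free again */
    }
}

/* Always tears the backend down before returning, even if a NIC still uses it. */
bool netdev_del(const char *name)
{
    Netdev *nd = netdev_find(name);
    if (!nd || nd->deleted) {
        return false;
    }
    nd->deleted = true;
    nd->link_up = false;        /* guest sees the link as unplugged */
    nd->fd = -1;                /* real code would close() the tap fd here */
    netdev_unref(nd);           /* drop the monitor's reference */
    return true;
}

/* Returns the number of bytes accepted; a zombie drops everything. */
size_t netdev_send(Netdev *nd, const void *buf, size_t len)
{
    (void)buf;
    return nd->deleted ? 0 : len;
}
```

In this model a NIC would call netdev_ref() when it attaches and netdev_unref() when its own delete completes; between netdev_del() and that final unref, netdev_add() with the same name fails, which is the naming issue raised above.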