From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4C97C22B.3010502@codemonkey.ws>
Date: Mon, 20 Sep 2010 15:20:59 -0500
From: Anthony Liguori
References: <20100920163042.GA29466@redhat.com>
 <4C978EC9.20907@codemonkey.ws> <20100920164758.GB29862@redhat.com>
 <4C979258.9020701@codemonkey.ws> <20100920171439.GF29862@redhat.com>
 <4C97A474.8040900@codemonkey.ws> <20100920182459.GE30611@redhat.com>
 <4C97AA44.8000403@codemonkey.ws> <20100920191500.GG30611@redhat.com>
 <4C97B5EA.9060809@codemonkey.ws> <20100920194415.GK30611@redhat.com>
In-Reply-To: <20100920194415.GK30611@redhat.com>
Subject: [Qemu-devel] Re: [PATCH] net: delay peer host device delete
List-Id: qemu-devel.nongnu.org
To: "Michael S. Tsirkin"
Cc: qemu-devel@nongnu.org

On 09/20/2010 02:44 PM, Michael S. Tsirkin wrote:
>
>> I think the only workable approach that doesn't involve new commands
>> is to change the semantics of the existing ones.
>>
>> Make netdev_del work regardless of whether the device is still present.
>>
>> You would need to reference count the actual netdev structure and
>> have each device using it unref on delete.
>> You make netdev_del mark the device as deleted and when a device is
>> deleted, any calls into the device effectively become nops.
>>
>> You have to go through most of the cleanup process to ensure that
>> tap device gets closed even before your reference count goes to
>> zero.
>>
> I think you mean 'does not get closed': we need the fd to get the flags etc.
>

No, I actually meant does get closed.

When you do netdev_del, it should result in the fd getting closed. The
actual netdev structure then becomes a zombie that's completely useless
until the device goes away.

> Note that it will mostly work unless when it'll crash.
> Issue is we don't have any documentation so
> people get the command set by trial and error.
>
> So how can we prove it's a user bug and not qemu bug?
> I guess we should blame ourselves until proven innocent.
>

Here's what I'm now suggesting:

device_del -> may or may not unplug a device from a guest when it
returns. To figure out if it does, you have to run info qdm.

netdev_del -> always destroys a netdev device when it returns. May be
called at any point in time. If you destroy a netdev while the device is
still using it, all packets go into the bit bucket and the link status
is modified to be unplugged.

You're suggesting:

netdev_del -> may or may not destroy a netdev depending on when the
device delete completes. Eventually, when there's a reset, we will kill
the device. Even though the netdev is still active, we'll hide it from
the management tools.

I think the suggested semantics are totally unusable. If we can make
something deterministic for a management tool, that should be the path
we take.

Regards,

Anthony Liguori
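
For illustration, here is a rough sketch of the netdev_del semantics proposed above: a reference-counted backend whose fd is closed as soon as netdev_del runs, after which the structure lingers as a zombie that drops packets until the last device lets go of it. This is not QEMU code. NetBackend and every backend_* function are hypothetical names made up for this example, and a plain file descriptor stands in for the tap device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical backend state; not QEMU's actual netdev structure. */
typedef struct NetBackend {
    int refcount;          /* one ref for the monitor name, one per device */
    int fd;                /* stand-in for the tap fd; -1 once closed */
    bool deleted;          /* set by netdev_del; backend is a zombie after */
    bool link_down;        /* link status the attached device would see */
    unsigned long dropped; /* packets sent to the bit bucket */
} NetBackend;

/* Create a backend owning fd; the initial reference is the monitor's. */
static NetBackend *backend_new(int fd)
{
    NetBackend *nb = calloc(1, sizeof(*nb));
    if (!nb) {
        return NULL;
    }
    nb->refcount = 1;
    nb->fd = fd;
    return nb;
}

/* A device attaching to the backend takes its own reference. */
static void backend_ref(NetBackend *nb)
{
    nb->refcount++;
}

/* Drop one reference; returns true if the structure was freed. */
static bool backend_unref(NetBackend *nb)
{
    assert(nb->refcount > 0);
    if (--nb->refcount > 0) {
        return false;
    }
    if (nb->fd >= 0) {
        close(nb->fd);
    }
    free(nb);
    return true;
}

/*
 * netdev_del: takes effect before returning, whether or not a device is
 * still attached. The fd is closed right away and the link is reported
 * as unplugged; only the memory outlives the call if a device still
 * holds a reference. Returns true if the structure was freed here.
 */
static bool backend_netdev_del(NetBackend *nb)
{
    assert(!nb->deleted); /* a real monitor could no longer look it up */
    if (nb->fd >= 0) {
        close(nb->fd);
        nb->fd = -1;
    }
    nb->deleted = true;
    nb->link_down = true;
    return backend_unref(nb); /* drop the monitor's reference */
}

/* Transmit path: calls into a deleted backend become no-ops. */
static ssize_t backend_send(NetBackend *nb, const void *buf, size_t len)
{
    if (nb->deleted) {
        nb->dropped++;
        return (ssize_t)len; /* pretend success; packet is discarded */
    }
    return write(nb->fd, buf, len);
}

/* Called when the guest-visible device has finally gone away. */
static bool backend_device_detached(NetBackend *nb)
{
    return backend_unref(nb);
}
```

The point of the split is that the management tool sees a deterministic result (fd closed, link down) the moment netdev_del returns, while the reference count only protects the memory from being freed under a device that has not finished unplugging.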