From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=33979 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Oxn8W-0007qd-Q7 for qemu-devel@nongnu.org; Mon, 20 Sep 2010 16:38:53 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1Oxn8V-0001Qb-DQ for qemu-devel@nongnu.org; Mon, 20 Sep 2010 16:38:52 -0400 Received: from mail-gx0-f173.google.com ([209.85.161.173]:40223) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Oxn8V-0001QM-6v for qemu-devel@nongnu.org; Mon, 20 Sep 2010 16:38:51 -0400 Received: by gxk22 with SMTP id 22so1912843gxk.4 for ; Mon, 20 Sep 2010 13:38:50 -0700 (PDT) Message-ID: <4C97C64C.6010607@codemonkey.ws> Date: Mon, 20 Sep 2010 15:38:36 -0500 From: Anthony Liguori MIME-Version: 1.0 References: <20100920164758.GB29862@redhat.com> <4C979258.9020701@codemonkey.ws> <20100920171439.GF29862@redhat.com> <4C97A474.8040900@codemonkey.ws> <20100920182459.GE30611@redhat.com> <4C97AA44.8000403@codemonkey.ws> <20100920191500.GG30611@redhat.com> <4C97B5EA.9060809@codemonkey.ws> <20100920194415.GK30611@redhat.com> <4C97C22B.3010502@codemonkey.ws> <20100920202755.GA821@redhat.com> In-Reply-To: <20100920202755.GA821@redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: [Qemu-devel] Re: [PATCH] net: delay peer host device delete List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: qemu-devel@nongnu.org On 09/20/2010 03:27 PM, Michael S. Tsirkin wrote: > On Mon, Sep 20, 2010 at 03:20:59PM -0500, Anthony Liguori wrote: > >> On 09/20/2010 02:44 PM, Michael S. Tsirkin wrote: >> >>> >>>> I think the only workable approach that doesn't involve new commands >>>> is to change the semantics of the existing ones. >>>> >>>> Make netdev_del work regardless of whether the device is still present. >>>> >>>> You would need to reference count the actual netdev structure and >>>> have each device using it unref on delete. You make netdev_del mark >>>> the device as deleted and when a device is deleted, any calls into >>>> the device effectively become nops. >>>> >>>> You have to go through most of the cleanup process to ensure that >>>> tap device gets closed even before your reference count goes to >>>> zero. >>>> >>> I think you mean 'does not get closed': we need the fd to get the flags etc. >>> >> No, I actually meant does get closed. >> >> When you do netdev_del, it should result in the fd getting closed. >> >> The actual netdev structure then becomes a zombie that's completely >> useless until the device goes away. >> >> >>> Note that it will mostly work unless when it'll crash. >>> Issue is we don't have any documentation so >>> people get the command set by trial and error. >>> >>> So how can we prove it's a user bug and not qemu bug? >>> I guess we should blame ourselves until proven innocent. >>> >> Here's what I'm now suggesting: >> >> device_del -> may or may not unplug a device from a guest when it >> returns. To figure out if it does, you have to run info qdm. >> > I think it should also always unplug on guest reset. > > >> netdev_del -> always destroys a netdev device when it returns. May >> be called at any point in time. If you destroy a netdev while the >> device is still using it, all packets go into the bit bucket and the >> link status is modified to be unplugged. >> > One issue here is that we can't allow a new device with same name > to be created until the nic is destroyed. > A new netdev device? Why not? We just need the struct VLANClientState to remain valid, it doesn't need to be in the linked list of clients.. >> I think the suggested semantics are totally unusable. If we can >> make something deterministic for a management tool, that should be >> the path we take. >> >> Regards, >> >> Anthony Liguori >> > So I basically propose that netdev_del and device_del behave > identically. Why is this unusable? > Because the fundamental problem is that device_del is too difficult to use. You're just making netdev_del equally difficult to use. Try your patch with libvirt, don't load acpiphp in the guest, and then play around with virsh device_detach and device_attach. All sorts of badness will ensue as libvirt tries to manage assignment of PCI slot numbers and such. Regards, Anthony Liguori