From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=41500 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Oxjfz-0007h6-8w for qemu-devel@nongnu.org; Mon, 20 Sep 2010 12:57:12 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1Oxjfy-0001im-2X for qemu-devel@nongnu.org; Mon, 20 Sep 2010 12:57:11 -0400 Received: from mail-vw0-f45.google.com ([209.85.212.45]:65042) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Oxjfx-0001ih-VT for qemu-devel@nongnu.org; Mon, 20 Sep 2010 12:57:10 -0400 Received: by vws19 with SMTP id 19so3734661vws.4 for ; Mon, 20 Sep 2010 09:57:09 -0700 (PDT) Message-ID: <4C979258.9020701@codemonkey.ws> Date: Mon, 20 Sep 2010 11:56:56 -0500 From: Anthony Liguori MIME-Version: 1.0 Subject: Re: [Qemu-devel] [PATCH] net: delay peer host device delete References: <20100920163042.GA29466@redhat.com> <4C978EC9.20907@codemonkey.ws> <20100920164758.GB29862@redhat.com> In-Reply-To: <20100920164758.GB29862@redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: qemu-devel@nongnu.org On 09/20/2010 11:47 AM, Michael S. Tsirkin wrote: > On Mon, Sep 20, 2010 at 11:41:45AM -0500, Anthony Liguori wrote: > >> On 09/20/2010 11:30 AM, Michael S. Tsirkin wrote: >> >>> With -netdev, virtio devices present offload >>> features to guest, depending on the backend used. >>> Thus, removing host ntedev peer while guest is >>> active leads to guest-visible inconsistency and/or crashes. >>> See e.g. https://bugzilla.redhat.com/show_bug.cgi?id=623735 >>> >>> As a solution, while guest (NIC) peer device exists, >>> we must prevent the host peer from being deleted. >>> >>> This patch does this by adding peer_deleted flag in nic state: >>> if host device is going away while guest device >>> is around, set this flag and keep host device around >>> for as long as guest device exists. >>> >> Having an unclear life cycle really worries me. >> >> Wouldn't the more correct solution be to avoid removing the netdev >> device until after the peer has successfully been removed? >> >> Regards, >> >> Anthony Liguori >> > This is exactly what the patch does. > At the management layer instead of doing it magically in the backend. IOW, if device_del returns and the device isn't actually deleted, that's a bug and addressing it like this just means we'll trip over it somewhere else. We'll have the same problem with drive_del. Regards, Anthony Liguori > >>> Signed-off-by: Michael S. Tsirkin >>> --- >>> net.c | 21 ++++++++++++++++++++- >>> net.h | 1 + >>> 2 files changed, 21 insertions(+), 1 deletions(-) >>> >>> diff --git a/net.c b/net.c >>> index 3d0fde7..10855d1 100644 >>> --- a/net.c >>> +++ b/net.c >>> @@ -286,12 +286,31 @@ void qemu_del_vlan_client(VLANClientState *vc) >>> if (vc->vlan) { >>> QTAILQ_REMOVE(&vc->vlan->clients, vc, next); >>> } else { >>> + /* Even if client will not be deleted yet, remove it from list so it >>> + * does not appear in monitor. */ >>> + QTAILQ_REMOVE(&non_vlan_clients, vc, next); >>> + /* Detect that guest-visible (NIC) peer is active, and delay deletion. >>> + * */ >>> + if (vc->peer&& vc->peer->info->type == NET_CLIENT_TYPE_NIC) { >>> + NICState *nic = DO_UPCAST(NICState, nc, vc->peer); >>> + assert(!nic->peer_deleted); >>> + nic->peer_deleted = true; >>> + return; >>> + } >>> if (vc->send_queue) { >>> qemu_del_net_queue(vc->send_queue); >>> } >>> - QTAILQ_REMOVE(&non_vlan_clients, vc, next); >>> if (vc->peer) { >>> vc->peer->peer = NULL; >>> + /* If this is a guest-visible (NIC) device, >>> + * and peer has already been removed from monitor, >>> + * delete it here. */ >>> + if (vc->info->type == NET_CLIENT_TYPE_NIC) { >>> + NICState *nic = DO_UPCAST(NICState, nc, vc); >>> + if (nic->peer_deleted) { >>> + qemu_del_vlan_client(vc->peer); >>> + } >>> + } >>> } >>> } >>> >>> diff --git a/net.h b/net.h >>> index 518cf9c..44c31a9 100644 >>> --- a/net.h >>> +++ b/net.h >>> @@ -72,6 +72,7 @@ typedef struct NICState { >>> VLANClientState nc; >>> NICConf *conf; >>> void *opaque; >>> + bool peer_deleted; >>> } NICState; >>> >>> struct VLANState { >>>