All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anthony Liguori <anthony@codemonkey.ws>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: [Qemu-devel] Re: [PATCH] net: delay peer host device delete
Date: Mon, 20 Sep 2010 14:28:42 -0500	[thread overview]
Message-ID: <4C97B5EA.9060809@codemonkey.ws> (raw)
In-Reply-To: <20100920191500.GG30611@redhat.com>

On 09/20/2010 02:15 PM, Michael S. Tsirkin wrote:
> On Mon, Sep 20, 2010 at 01:39:00PM -0500, Anthony Liguori wrote:
>    
>> On 09/20/2010 01:24 PM, Michael S. Tsirkin wrote:
>>      
>>> On Mon, Sep 20, 2010 at 01:14:12PM -0500, Anthony Liguori wrote:
>>>        
>>>> On 09/20/2010 12:14 PM, Michael S. Tsirkin wrote:
>>>>          
>>>>> On Mon, Sep 20, 2010 at 11:56:56AM -0500, Anthony Liguori wrote:
>>>>>            
>>>>>> On 09/20/2010 11:47 AM, Michael S. Tsirkin wrote:
>>>>>>              
>>>>>>> On Mon, Sep 20, 2010 at 11:41:45AM -0500, Anthony Liguori wrote:
>>>>>>>                
>>>>>>>> On 09/20/2010 11:30 AM, Michael S. Tsirkin wrote:
>>>>>>>>                  
>>>>>>>>> With -netdev, virtio devices present offload
>>>>>>>>> features to guest, depending on the backend used.
>>>>>>>>> Thus, removing host ntedev peer while guest is
>>>>>>>>> active leads to guest-visible inconsistency and/or crashes.
>>>>>>>>> See e.g. https://bugzilla.redhat.com/show_bug.cgi?id=623735
>>>>>>>>>
>>>>>>>>> As a solution, while guest (NIC) peer device exists,
>>>>>>>>> we must prevent the host peer from being deleted.
>>>>>>>>>
>>>>>>>>> This patch does this by adding peer_deleted flag in nic state:
>>>>>>>>> if host device is going away while guest device
>>>>>>>>> is around, set this flag and keep host device around
>>>>>>>>> for as long as guest device exists.
>>>>>>>>>                    
>>>>>>>> Having an unclear life cycle really worries me.
>>>>>>>>
>>>>>>>> Wouldn't the more correct solution be to avoid removing the netdev
>>>>>>>> device until after the peer has successfully been removed?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Anthony Liguori
>>>>>>>>                  
>>>>>>> This is exactly what the patch does.
>>>>>>>                
>>>>>> At the management layer instead of doing it magically in the backend.
>>>>>>              
>>>>> The amount of pain this inflicts on management would be considerable.
>>>>> Hotplug commands were designed to be asynchronous
>>>>> (starts the process, does not wait for it to complete), maybe that
>>>>> was a mistake but we can not change semantics at will now.
>>>>>
>>>>> Add new commands, okay, but existing ones should work and get fixed
>>>>> if there's a bug.
>>>>>            
>>>> But having commands that are impossible to use correctly is not very good.
>>>>          
>>> So we will have to fix the existing commands so they can be used
>>> correctly. Since the device is removed from the list
>>> shown to the monitor, I do not really see why the user
>>> cares that the backend is actually still around
>>> until the device is removed.
>>>        
>> That's even more wrong and maybe I don't understand what you're saying.
>>
>> But the test case is easy.  acpiphp is not loaded.  You do a
>> device_del of a device.  What happens?
>>      
> Backend will stay active until guest reset (once we fix guest reset).
> An alternative is to only lock the backend when guest driver loads,
> and unlock on nic reset in addition to hotplug.
> It would work as long as the backend only affects the driver, not the
> guest OS itself.
>
> I discarded this approach as it seemed even less deterministic ...
> do you like this better?
>    

I think the only workable approach that doesn't involve new commands is 
to change the semantics of the existing ones.

Make netdev_del work regardless of whether the device is still present.

You would need to reference count the actual netdev structure and have 
each device using it unref on delete.  You make netdev_del mark the 
device as deleted and when a device is deleted, any calls into the 
device effectively become nops.

You have to go through most of the cleanup process to ensure that tap 
device gets closed even before your reference count goes to zero.

>>> have qemu do the right thing.
>>>        
>> The only way to do this correctly is to make device_del block until
>> the operation completes.  Unfortunately, this becomes a libvirt DoS
>> which would cause all sorts of problems.
>>
>> I don't see a lot of options that allow the management tools to
>> continue doing what they're doing.  This cannot work properly unless
>> there is a management interface that is explicitly aware that 1) pci
>> hot unplug can and will not be successful 2) the device is still
>> there until it's successful (which may be forever).
>>
>> Regards,
>>
>> Anthony Liguori
>>      
> Let's start with disabling hotplug in 0.13 then?
> It's clearly not feasible to add new commands there.
>    

To be fair, this is really a bug in libvirt IMHO.  libvirt shouldn't 
assume that after device_del is called, the device is gone.

Regards,

Anthony Liguori

  reply	other threads:[~2010-09-20 19:29 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-20 16:30 [Qemu-devel] [PATCH] net: delay peer host device delete Michael S. Tsirkin
2010-09-20 16:41 ` Anthony Liguori
2010-09-20 16:47   ` Michael S. Tsirkin
2010-09-20 16:56     ` Anthony Liguori
2010-09-20 17:14       ` Michael S. Tsirkin
2010-09-20 18:14         ` Anthony Liguori
2010-09-20 18:19           ` Anthony Liguori
2010-09-20 18:59             ` [Qemu-devel] " Michael S. Tsirkin
2010-09-20 19:22               ` Anthony Liguori
2010-09-20 19:37                 ` Michael S. Tsirkin
2010-09-20 20:15                   ` Anthony Liguori
2010-09-20 20:15                     ` Michael S. Tsirkin
2010-09-21  8:58                   ` Daniel P. Berrange
2010-09-21  9:20                     ` Michael S. Tsirkin
2010-09-21 12:47                       ` Anthony Liguori
2010-09-20 18:24           ` Michael S. Tsirkin
2010-09-20 18:39             ` Anthony Liguori
2010-09-20 19:15               ` Michael S. Tsirkin
2010-09-20 19:28                 ` Anthony Liguori [this message]
2010-09-20 19:44                   ` Michael S. Tsirkin
2010-09-20 20:20                     ` Anthony Liguori
2010-09-20 20:27                       ` Michael S. Tsirkin
2010-09-20 20:38                         ` Anthony Liguori
2010-09-20 20:37                           ` Michael S. Tsirkin
2010-09-20 20:50                             ` Anthony Liguori
2010-09-21  9:18                               ` Michael S. Tsirkin
2010-09-21 12:42                                 ` Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C97B5EA.9060809@codemonkey.ws \
    --to=anthony@codemonkey.ws \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.