From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57234)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <chen.fan.fnst@cn.fujitsu.com>) id 1aInbo-0002YU-CD
	for qemu-devel@nongnu.org; Mon, 11 Jan 2016 20:18:53 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <chen.fan.fnst@cn.fujitsu.com>) id 1aInbl-0001Sb-1x
	for qemu-devel@nongnu.org; Mon, 11 Jan 2016 20:18:52 -0500
Received: from [59.151.112.132] (port=14627 helo=heian.cn.fujitsu.com)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <chen.fan.fnst@cn.fujitsu.com>) id 1aInbk-0001QZ-3I
	for qemu-devel@nongnu.org; Mon, 11 Jan 2016 20:18:48 -0500
Message-ID: <56945348.9010207@cn.fujitsu.com>
Date: Tue, 12 Jan 2016 09:13:44 +0800
From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
MIME-Version: 1.0
References: <cover.1451901804.git.chen.fan.fnst@cn.fujitsu.com>		
	<93bca47b31cf1abd661fbe778d415abbb08b680a.1451901804.git.chen.fan.fnst@cn.fujitsu.com>	
	<1452023901.22223.48.camel@redhat.com>
	<568C782C.8050001@cn.fujitsu.com>
	<1452098674.29599.106.camel@redhat.com>
In-Reply-To: <1452098674.29599.106.camel@redhat.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [v15 12/15] vfio: add bus in reset flag
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Alex Williamson <alex.williamson@redhat.com>, Cao jin <caoj.fnst@cn.fujitsu.com>, qemu-devel@nongnu.org
Cc: mst@redhat.com


On 01/07/2016 12:44 AM, Alex Williamson wrote:
> On Wed, 2016-01-06 at 10:13 +0800, Chen Fan wrote:
>> On 01/06/2016 03:58 AM, Alex Williamson wrote:
>>> On Tue, 2016-01-05 at 09:20 +0800, Cao jin wrote:
>>>> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>>>
>>>> mark the host bus be in reset. avoid multiple devices trigger the
>>>> host bus reset many times.
>>>>
>>>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>>> ---
>>>>    hw/vfio/pci.c                 | 6 ++++++
>>>>    include/hw/vfio/vfio-common.h | 1 +
>>>>    2 files changed, 7 insertions(+)
>>>>
>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>>> index ee88db3..aa0d945 100644
>>>> --- a/hw/vfio/pci.c
>>>> +++ b/hw/vfio/pci.c
>>>> @@ -2249,6 +2249,11 @@ static int
>>>> vfio_pci_hot_reset(VFIOPCIDevice
>>>> *vdev, bool single)
>>>>    
>>>>        trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ?
>>>> "one" :
>>>> "multi");
>>>>    
>>>> +    if (vdev->vbasedev.bus_in_reset) {
>>>> +        vdev->vbasedev.bus_in_reset = false;
>>>> +        return 0;
>>>> +    }
>>>> +
>>>>        vfio_pci_pre_reset(vdev);
>>>>        vdev->vbasedev.needs_reset = false;
>>>>    
>>>> @@ -2312,6 +2317,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice
>>>> *vdev, bool single)
>>>>                    }
>>>>                    vfio_pci_pre_reset(tmp);
>>>>                    tmp->vbasedev.needs_reset = false;
>>>> +                tmp->vbasedev.bus_in_reset = true;
>>>>                    multi = true;
>>>>                    break;
>>>>                }
>>>> diff --git a/include/hw/vfio/vfio-common.h
>>>> b/include/hw/vfio/vfio-
>>>> common.h
>>>> index f037f3c..44b19d7 100644
>>>> --- a/include/hw/vfio/vfio-common.h
>>>> +++ b/include/hw/vfio/vfio-common.h
>>>> @@ -95,6 +95,7 @@ typedef struct VFIODevice {
>>>>        bool reset_works;
>>>>        bool needs_reset;
>>>>        bool no_mmap;
>>>> +    bool bus_in_reset;
>>>>        VFIODeviceOps *ops;
>>>>        unsigned int num_irqs;
>>>>        unsigned int num_regions;
>>> I imagine this should be a VFIOPCIDevice field, it has no use in
>>> the
>>> common code.  The name is also a bit confusing; when I suggested a
>>> bus_in_reset flag, I was referring to a property on the bus itself
>>> that
>>> the existing device_reset could query to switch modes rather than
>>> add a
>>> separate callback as you've done in this series.  This works, but
>>> it's
>>> perhaps more intrusive than I was thinking.  It will need to get
>>> approval by qdev folks.
>> maybe I don't get your point. I just think add a bus_in_reset flag in
>> bus
>> has no much sense. for instance, if assigning device A and B from
>> different host bus into a same virtual bus. assume all check passed.
>> then if device A aer occurs. we should reset virtual bus to recover
>> the device A, we also need to reset the device B and do device B host
>> bus reset. but here the bus_in_reset just denote the device B not
>> need
>> to do host bus reset, it's incorrect. right?
> Let's take an example of the state of this flag on the device to see
> why the name doesn't make sense to me.  We have a dual port card with
> devices A and B, both on the same host bus.  We start a hot reset on A
> and we have the following states:
>
> A.bus_in_reset = false
> B.bus_in_reset = false
>
> Well, that's not accurate.  As we complete the hot reset we tag device
> B as already being reset:
>
> A.bus_in_reset = false
> B.bus_in_reset = true
>
> That's not accurate either, they're both in the same bus hierarchy,
> they should not be different.  Later hot reset is called on B and we're
> back to:
>
> A.bus_in_reset = false
> B.bus_in_reset = false
>
> So I agree with your algorithm, but the variable is not tracking the
> state of the bus being in reset, it's tracking whether to skip the next
> reset call.
>
> The separate hot (bus) reset in DeviceClass seems unnecessary too, this
> is where I think we could work entirely within the PCI code w/o new
> qbus/qdev interfaces.  Imagine pci_bridge_write_config()
> calls qbus_walk_children() directly instead of calling
> qbus_reset_all().  The pre_busfn() could set a flag on the PCIBus to
> indicate the bus is in reset.  qdev_reset_one() could be used as the
> post_devfn() and the post_busfn() would call qdev_reset_one() followed
> by a clear of the flag.  The modification to vfio is then simply that
> if we're resetting an AER device and the bus is in reset, we know we
> can do a hot reset.  It simply allows us to test which reset type is
> occurring within the existing reset callback rather than adding a new
> one.
>
> Going back to my idea of a sequence ID, if we had:
>
> bool PCIBus.bus_in_reset
> uint PCIBus.bus_reset_seqid
>
> The pre_busfn would do:
>
> PCIBus.bus_in_reset = true
> PCIBus.bus_reset_seqid++
>
> Then we could add:
>
> uint VFIOPCIDevice.last_bus_reset_seqid
>
> vfio_pci_reset() would test (PCIBus.bus_in_reset && VFIOPCIDevice.AER)
> to know whether to do a hot reset.  vfio_pci_hot_reset() would skip
> devices for which (VFIOPCIDevice.last_bus_reset_seqid ==
> PCIBus.bus_reset_seqid) and for each device reset would set
> VFIOPCIDevice.last_bus_reset_seqid = PCIBus.bus_reset_seqid.
>
> That feels like a much more deterministic solution if MST is willing to
> support it in the PCI specific BusState.  Thanks,

Thanks for your suggestion. I will send out a new version soon.

Chen

>
> Alex
>
>
> .
>