Linux-HyperV List
 help / color / mirror / Atom feed
* Re: [PATCH] PCI: hv: Detect and fix Hyper-V PCI domain number collision
From: Sasha Levin @ 2019-08-05 18:36 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: bhelgaas@google.com, lorenzo.pieralisi@arm.com,
	linux-hyperv@vger.kernel.org, linux-pci@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, olaf@aepfle.de, vkuznets,
	linux-kernel@vger.kernel.org
In-Reply-To: <1564771954-9181-1-git-send-email-haiyangz@microsoft.com>

On Fri, Aug 02, 2019 at 06:52:56PM +0000, Haiyang Zhang wrote:
>Due to Azure host agent settings, the device instance ID's bytes 8 and 9
>are no longer unique. This causes some of the PCI devices not showing up
>in VMs with multiple passthrough devices, such as GPUs. So, as recommended
>by Azure host team, we now use the bytes 4 and 5 which usually provide
>unique numbers.
>
>In the rare cases of collision, we will detect and find another number
>that is not in use.
>Thanks to Michael Kelley <mikelley@microsoft.com> for proposing this idea.
>
>Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>

Acked-by: Sasha Levin <sashal@kernel.org>

Bjorn, will you take it through the PCI tree or do you want me to take
it through hyper-v?

--
Thanks,
Sasha

^ permalink raw reply

* RE: [PATCH v2 net] hv_sock: Fix hang when a connection is closed
From: Dexuan Cui @ 2019-08-03  0:49 UTC (permalink / raw)
  To: David Miller
  Cc: Sunil Muthuswamy, netdev@vger.kernel.org, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, sashal@kernel.org,
	Michael Kelley, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, olaf@aepfle.de, apw@canonical.com,
	jasowang@redhat.com, vkuznets, marcelo.cerri@canonical.com
In-Reply-To: <20190802.172729.1656276508211556851.davem@davemloft.net>

> From: linux-hyperv-owner@vger.kernel.org
> Sent: Friday, August 2, 2019 5:27 PM
> ...
> Applied and queued up for -stable.
> 
> Do not ever CC: stable for networking patches, we submit to -stable manually.
 
Thanks, David!
I'll remember to not add the stable tag for network patches.

Thanks,
-- Dexuan

^ permalink raw reply

* Re: [PATCH v2 net] hv_sock: Fix hang when a connection is closed
From: David Miller @ 2019-08-03  0:27 UTC (permalink / raw)
  To: decui
  Cc: sunilmut, netdev, kys, haiyangz, sthemmin, sashal, mikelley,
	linux-hyperv, linux-kernel, olaf, apw, jasowang, vkuznets,
	marcelo.cerri
In-Reply-To: <PU1P153MB01696DDD3A3F601370701DD2BFDF0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

From: Dexuan Cui <decui@microsoft.com>
Date: Wed, 31 Jul 2019 01:25:45 +0000

> 
> There is a race condition for an established connection that is being closed
> by the guest: the refcnt is 4 at the end of hvs_release() (Note: here the
> 'remove_sock' is false):
> 
> 1 for the initial value;
> 1 for the sk being in the bound list;
> 1 for the sk being in the connected list;
> 1 for the delayed close_work.
> 
> After hvs_release() finishes, __vsock_release() -> sock_put(sk) *may*
> decrease the refcnt to 3.
> 
> Concurrently, hvs_close_connection() runs in another thread:
>   calls vsock_remove_sock() to decrease the refcnt by 2;
>   call sock_put() to decrease the refcnt to 0, and free the sk;
>   next, the "release_sock(sk)" may hang due to use-after-free.
> 
> In the above, after hvs_release() finishes, if hvs_close_connection() runs
> faster than "__vsock_release() -> sock_put(sk)", then there is not any issue,
> because at the beginning of hvs_close_connection(), the refcnt is still 4.
> 
> The issue can be resolved if an extra reference is taken when the
> connection is established.
> 
> Fixes: a9eeb998c28d ("hv_sock: Add support for delayed close")
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

Applied and queued up for -stable.

Do not ever CC: stable for networking patches, we submit to -stable manually.

Thank you.

^ permalink raw reply

* [PATCH v2] PCI: hv: Fix panic by calling hv_pci_remove_slots() earlier
From: Dexuan Cui @ 2019-08-02 22:50 UTC (permalink / raw)
  To: lorenzo.pieralisi@arm.com, bhelgaas@google.com,
	linux-pci@vger.kernel.org, Michael Kelley, Stephen Hemminger
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	driverdev-devel@linuxdriverproject.org, Sasha Levin,
	Haiyang Zhang, KY Srinivasan, olaf@aepfle.de, apw@canonical.com,
	jasowang@redhat.com, vkuznets, marcelo.cerri@canonical.com,
	jackm@mellanox.com, Dexuan Cui


The slot must be removed before the pci_dev is removed, otherwise a panic
can happen due to use-after-free.

Fixes: 15becc2b56c6 ("PCI: hv: Add hv_pci_remove_slots() when we unload the driver")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: stable@vger.kernel.org
---

Changes in v2:
  Improved the changelog accordign to the discussion with Bjorn Helgaas:
	  https://lkml.org/lkml/2019/8/1/1173
	  https://lkml.org/lkml/2019/8/2/1559

 drivers/pci/controller/pci-hyperv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 6b9cc6e60a..68c611d 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2757,8 +2757,8 @@ static int hv_pci_remove(struct hv_device *hdev)
 		/* Remove the bus from PCI's point of view. */
 		pci_lock_rescan_remove();
 		pci_stop_root_bus(hbus->pci_bus);
-		pci_remove_root_bus(hbus->pci_bus);
 		hv_pci_remove_slots(hbus);
+		pci_remove_root_bus(hbus->pci_bus);
 		pci_unlock_rescan_remove();
 		hbus->state = hv_pcibus_removed;
 	}
-- 
1.8.3.1


^ permalink raw reply related

* Re: [PATCH] PCI: hv: Fix panic by calling hv_pci_remove_slots() earlier
From: Bjorn Helgaas @ 2019-08-02 22:15 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Stephen Hemminger, lorenzo.pieralisi@arm.com,
	linux-pci@vger.kernel.org, Michael Kelley,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	driverdev-devel@linuxdriverproject.org, Sasha Levin,
	Haiyang Zhang, KY Srinivasan, olaf@aepfle.de, apw@canonical.com,
	jasowang@redhat.com, vkuznets, marcelo.cerri@canonical.com,
	jackm@mellanox.com
In-Reply-To: <PU1P153MB01698F51FE22C39086CC8353BFD90@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

On Fri, Aug 02, 2019 at 08:31:26PM +0000, Dexuan Cui wrote:
> > From: Bjorn Helgaas <helgaas@kernel.org>
> > Sent: Friday, August 2, 2019 12:41 PM
> > The subject line only describes the mechanical code change, which is
> > obvious from the patch.  It would be better if we could say something
> > about *why* we need this.
> 
> Hi Bjorn,
> Sorry. I'll try to write a better changelog in v2. :-)
>  
> > On Fri, Aug 02, 2019 at 01:32:28AM +0000, Dexuan Cui wrote:
> > >
> > > When a slot is removed, the pci_dev must still exist.
> > >
> > > pci_remove_root_bus() removes and free all the pci_devs, so
> > > hv_pci_remove_slots() must be called before pci_remove_root_bus(),
> > > otherwise a general protection fault can happen, if the kernel is built
> > 
> > "general protection fault" is an x86 term that doesn't really say what
> > the issue is.  I suspect this would be a "use-after-free" problem.
> 
> Yes, it's use-after-free. I'll fix the the wording.
>  
> > > --- a/drivers/pci/controller/pci-hyperv.c
> > > +++ b/drivers/pci/controller/pci-hyperv.c
> > > @@ -2757,8 +2757,8 @@ static int hv_pci_remove(struct hv_device *hdev)
> > >  		/* Remove the bus from PCI's point of view. */
> > >  		pci_lock_rescan_remove();
> > >  		pci_stop_root_bus(hbus->pci_bus);
> > > -		pci_remove_root_bus(hbus->pci_bus);
> > >  		hv_pci_remove_slots(hbus);
> > > +		pci_remove_root_bus(hbus->pci_bus);
> > 
> > I'm curious about why we need hv_pci_remove_slots() at all.  None of
> > the other callers of pci_stop_root_bus() and pci_remove_root_bus() do
> > anything similar to hv_pci_remove_slots().
> > 
> > Surely some of those callers also support slots, so there must be some
> > other path that calls pci_destroy_slot() in those cases.  Can we use a
> > similar strategy here?
> 
> Originally Stephen Heminger added the slot code for pci-hyperv.c:
> a15f2c08c708 ("PCI: hv: support reporting serial number as slot information")
> So he may know this better. My understanding is: we can not use the similar
> stragegy used in the 2 other users of pci_create_slot():
> 
> drivers/pci/hotplug/pci_hotplug_core.c calls pci_create_slot().
> It looks drivers/pci/hotplug/ is quite different from pci-hyperv.c because
> pci-hyper-v uses a simple *private* hot-plug protocol, making it impossible
> to use the API pci_hp_register() and pci_hp_destroy() -> pci_destroy_slot().
> 
> drivers/acpi/pci_slot.c calls pci_create_slot(), and saves the created slots in
> the static "slot_list" list in the same file. Again, since pci-hyper-v uses a private
> PCI-device-discovery protocol (which is based on VMBus rather the emulated
> ACPI and PCI), acpi_pci_slot_enumerate() can not find the PCI devices that are
> discovered by pci-hyperv, so we can not use the standard register_slot() ->
> pci_create_slot() to create the slots and hence acpi_pci_slot_remove() -> 
> pci_destroy_slot() can not work for pci-hyperv.

Hmm, ok.  This still doesn't seem right to me, but I think the bottom
line will be that the current slot registration interfaces just don't
work quite right for all the cases we want them to.

Maybe it would be a good project for somebody to rethink them, but it
doesn't seem practical for *this* patch.  Thanks for looking into it
this far!

> I think I can use this as the v2 changelog:
> 
> The slot must be removed before the pci_dev is removed, otherwise a panic
> can happen due to use-after-free.

Sounds good.

Bjorn

^ permalink raw reply

* RE: [PATCH] PCI: hv: Fix panic by calling hv_pci_remove_slots() earlier
From: Dexuan Cui @ 2019-08-02 20:31 UTC (permalink / raw)
  To: Bjorn Helgaas, Stephen Hemminger
  Cc: lorenzo.pieralisi@arm.com, linux-pci@vger.kernel.org,
	Michael Kelley, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	driverdev-devel@linuxdriverproject.org, Sasha Levin,
	Haiyang Zhang, KY Srinivasan, olaf@aepfle.de, apw@canonical.com,
	jasowang@redhat.com, vkuznets, marcelo.cerri@canonical.com,
	jackm@mellanox.com
In-Reply-To: <20190802194053.GL151852@google.com>

> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: Friday, August 2, 2019 12:41 PM
> The subject line only describes the mechanical code change, which is
> obvious from the patch.  It would be better if we could say something
> about *why* we need this.

Hi Bjorn,
Sorry. I'll try to write a better changelog in v2. :-)
 
> On Fri, Aug 02, 2019 at 01:32:28AM +0000, Dexuan Cui wrote:
> >
> > When a slot is removed, the pci_dev must still exist.
> >
> > pci_remove_root_bus() removes and free all the pci_devs, so
> > hv_pci_remove_slots() must be called before pci_remove_root_bus(),
> > otherwise a general protection fault can happen, if the kernel is built
> 
> "general protection fault" is an x86 term that doesn't really say what
> the issue is.  I suspect this would be a "use-after-free" problem.

Yes, it's use-after-free. I'll fix the the wording.
 
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -2757,8 +2757,8 @@ static int hv_pci_remove(struct hv_device *hdev)
> >  		/* Remove the bus from PCI's point of view. */
> >  		pci_lock_rescan_remove();
> >  		pci_stop_root_bus(hbus->pci_bus);
> > -		pci_remove_root_bus(hbus->pci_bus);
> >  		hv_pci_remove_slots(hbus);
> > +		pci_remove_root_bus(hbus->pci_bus);
> 
> I'm curious about why we need hv_pci_remove_slots() at all.  None of
> the other callers of pci_stop_root_bus() and pci_remove_root_bus() do
> anything similar to hv_pci_remove_slots().
> 
> Surely some of those callers also support slots, so there must be some
> other path that calls pci_destroy_slot() in those cases.  Can we use a
> similar strategy here?

Originally Stephen Heminger added the slot code for pci-hyperv.c:
a15f2c08c708 ("PCI: hv: support reporting serial number as slot information")
So he may know this better. My understanding is: we can not use the similar
stragegy used in the 2 other users of pci_create_slot():

drivers/pci/hotplug/pci_hotplug_core.c calls pci_create_slot().
It looks drivers/pci/hotplug/ is quite different from pci-hyperv.c because
pci-hyper-v uses a simple *private* hot-plug protocol, making it impossible
to use the API pci_hp_register() and pci_hp_destroy() -> pci_destroy_slot().

drivers/acpi/pci_slot.c calls pci_create_slot(), and saves the created slots in
the static "slot_list" list in the same file. Again, since pci-hyper-v uses a private
PCI-device-discovery protocol (which is based on VMBus rather the emulated
ACPI and PCI), acpi_pci_slot_enumerate() can not find the PCI devices that are
discovered by pci-hyperv, so we can not use the standard register_slot() ->
pci_create_slot() to create the slots and hence acpi_pci_slot_remove() -> 
pci_destroy_slot() can not work for pci-hyperv.

I think I can use this as the v2 changelog:

The slot must be removed before the pci_dev is removed, otherwise a panic
can happen due to use-after-free.

Thanks,
Dexuan

^ permalink raw reply

* Re: [PATCH] PCI: hv: Fix panic by calling hv_pci_remove_slots() earlier
From: Bjorn Helgaas @ 2019-08-02 19:40 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: lorenzo.pieralisi@arm.com, linux-pci@vger.kernel.org,
	Michael Kelley, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	driverdev-devel@linuxdriverproject.org, Sasha Levin,
	Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de,
	apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com, jackm@mellanox.com
In-Reply-To: <PU1P153MB0169DBCFEE7257F5BB93580ABFD90@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

Hi Dexuan,

The subject line only describes the mechanical code change, which is
obvious from the patch.  It would be better if we could say something
about *why* we need this.

On Fri, Aug 02, 2019 at 01:32:28AM +0000, Dexuan Cui wrote:
> 
> When a slot is removed, the pci_dev must still exist.
> 
> pci_remove_root_bus() removes and free all the pci_devs, so
> hv_pci_remove_slots() must be called before pci_remove_root_bus(),
> otherwise a general protection fault can happen, if the kernel is built

"general protection fault" is an x86 term that doesn't really say what
the issue is.  I suspect this would be a "use-after-free" problem.

> with the memory debugging options.
> 
> Fixes: 15becc2b56c6 ("PCI: hv: Add hv_pci_remove_slots() when we unload the driver")
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> Cc: stable@vger.kernel.org
> 
> ---
> 
> When pci-hyperv is unloaded, this panic can happen:
> 
>  general protection fault:
>  CPU: 2 PID: 1091 Comm: rmmod Not tainted 5.2.0+
>  RIP: 0010:pci_slot_release+0x30/0xd0
>  Call Trace:
>   kobject_release+0x65/0x190
>   pci_destroy_slot+0x25/0x60
>   hv_pci_remove+0xec/0x110 [pci_hyperv]
>   vmbus_remove+0x20/0x30 [hv_vmbus]
>   device_release_driver_internal+0xd5/0x1b0
>   driver_detach+0x44/0x7c
>   bus_remove_driver+0x75/0xc7
>   vmbus_driver_unregister+0x50/0xbd [hv_vmbus]
>   __x64_sys_delete_module+0x136/0x200
>   do_syscall_64+0x5e/0x220
> 
>  drivers/pci/controller/pci-hyperv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 6b9cc6e60a..68c611d 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -2757,8 +2757,8 @@ static int hv_pci_remove(struct hv_device *hdev)
>  		/* Remove the bus from PCI's point of view. */
>  		pci_lock_rescan_remove();
>  		pci_stop_root_bus(hbus->pci_bus);
> -		pci_remove_root_bus(hbus->pci_bus);
>  		hv_pci_remove_slots(hbus);
> +		pci_remove_root_bus(hbus->pci_bus);

I'm curious about why we need hv_pci_remove_slots() at all.  None of
the other callers of pci_stop_root_bus() and pci_remove_root_bus() do
anything similar to hv_pci_remove_slots().

Surely some of those callers also support slots, so there must be some
other path that calls pci_destroy_slot() in those cases.  Can we use a
similar strategy here?

>  		pci_unlock_rescan_remove();
>  		hbus->state = hv_pcibus_removed;
>  	}
> -- 
> 1.8.3.1
> 

^ permalink raw reply

* Re: [PATCH 2/3] drivers: hv: vmbus: add fuzz test attributes to sysfs
From: Branden Bonaby @ 2019-08-02 19:02 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-hyperv, linux-kernel, kys, haiyangz, sthemmin, sashal
In-Reply-To: <87a7csggvj.fsf@vitty.brq.redhat.com>

On Fri, Aug 02, 2019 at 09:34:40AM +0200, Vitaly Kuznetsov wrote:
> Branden Bonaby <brandonbonaby94@gmail.com> writes:
> 
> > Expose the test parameters as part of the sysfs channel attributes.
> > We will control the testing state via these attributes.
> >
> > Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
> > ---
> >  Documentation/ABI/stable/sysfs-bus-vmbus | 22 ++++++
> >  drivers/hv/vmbus_drv.c                   | 97 +++++++++++++++++++++++-
> >  2 files changed, 118 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus
> > index 8e8d167eca31..239fcb6fdc75 100644
> > --- a/Documentation/ABI/stable/sysfs-bus-vmbus
> > +++ b/Documentation/ABI/stable/sysfs-bus-vmbus
> > @@ -185,3 +185,25 @@ Contact:        Michael Kelley <mikelley@microsoft.com>
> >  Description:    Total number of write operations that encountered an outbound
> >  		ring buffer full condition
> >  Users:          Debugging tools
> > +
> > +What:           /sys/bus/vmbus/devices/<UUID>/fuzz_test_state
> 
> I would prefer this to go under /sys/kernel/debug/ as this is clearly a
> debug/test feature.
> -- 
> Vitaly

Alright, it is testing so I see what you mean and why the code in 
this patch should go in debugfs. Will fix that and resend.

Branden Bonaby 

^ permalink raw reply

* Re: [PATCH 1/3] drivers: hv: vmbus: Introduce latency testing
From: Branden Bonaby @ 2019-08-02 19:02 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kys, haiyangz, sthemmin, sashal, linux-hyperv, linux-kernel
In-Reply-To: <87d0hoggyc.fsf@vitty.brq.redhat.com>

On Fri, Aug 02, 2019 at 09:32:59AM +0200, Vitaly Kuznetsov wrote:
> Branden Bonaby <brandonbonaby94@gmail.com> writes:
> 
> > Introduce user specified latency in the packet reception path.
> >
> > Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
> > ---
> >  drivers/hv/connection.c  |  5 +++++
> >  drivers/hv/ring_buffer.c | 10 ++++++++++
> >  include/linux/hyperv.h   | 14 ++++++++++++++
> >  3 files changed, 29 insertions(+)
> >
> > diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> > index 09829e15d4a0..2a2c22f5570e 100644
> > --- a/drivers/hv/connection.c
> > +++ b/drivers/hv/connection.c
> > @@ -354,9 +354,14 @@ void vmbus_on_event(unsigned long data)
> >  {
> >  	struct vmbus_channel *channel = (void *) data;
> >  	unsigned long time_limit = jiffies + 2;
> > +	struct vmbus_channel *test_channel = !channel->primary_channel ?
> > +						channel :
> > +						channel->primary_channel;
> >  
> >  	trace_vmbus_on_event(channel);
> >  
> > +	if (unlikely(test_channel->fuzz_testing_buffer_delay > 0))
> > +		udelay(test_channel->fuzz_testing_buffer_delay);
> >  	do {
> >  		void (*callback_fn)(void *);
> >  
> > diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
> > index 9a03b163cbbd..d7627c9023d6 100644
> > --- a/drivers/hv/ring_buffer.c
> > +++ b/drivers/hv/ring_buffer.c
> > @@ -395,7 +395,12 @@ struct vmpacket_descriptor *hv_pkt_iter_first(struct vmbus_channel *channel)
> >  {
> >  	struct hv_ring_buffer_info *rbi = &channel->inbound;
> >  	struct vmpacket_descriptor *desc;
> > +	struct vmbus_channel *test_channel = !channel->primary_channel ?
> > +						channel :
> > +						channel->primary_channel;
> >  
> > +	if (unlikely(test_channel->fuzz_testing_message_delay > 0))
> > +		udelay(test_channel->fuzz_testing_message_delay);
> >  	if (hv_pkt_iter_avail(rbi) < sizeof(struct vmpacket_descriptor))
> >  		return NULL;
> >  
> > @@ -420,7 +425,12 @@ __hv_pkt_iter_next(struct vmbus_channel *channel,
> >  	struct hv_ring_buffer_info *rbi = &channel->inbound;
> >  	u32 packetlen = desc->len8 << 3;
> >  	u32 dsize = rbi->ring_datasize;
> > +	struct vmbus_channel *test_channel = !channel->primary_channel ?
> > +						channel :
> > +						channel->primary_channel;
> 
> This pattern is repeated 3 times so a define is justified. I would also
> reversed the logic:
> 
>    test_channel = channel->primary_channel ? channel->primary_channel : channel;
> 
> >  
> > +	if (unlikely(test_channel->fuzz_testing_message_delay > 0))
> > +		udelay(test_channel->fuzz_testing_message_delay);
> 
> unlikely() is good but if it was under #ifdef it would've been even better.
> 
> >  	/* bump offset to next potential packet */
> -- 
> Vitaly

Makes sense, I'll address the repeated code and will change the way I
handled that if statement. Using an ifdef CONFIG_HYPERV_TESTING
seems like a good thing to add in here like you suggested.

^ permalink raw reply

* Re: [PATCH 0/3] hv: vmbus: add fuzz testing to hv devices
From: Branden Bonaby @ 2019-08-02 19:02 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: linux-hyperv, linux-kernel, kys, haiyangz, sthemmin, sashal
In-Reply-To: <87ftmkgh2t.fsf@vitty.brq.redhat.com>

On Fri, Aug 02, 2019 at 09:30:18AM +0200, Vitaly Kuznetsov wrote:
> Branden Bonaby <brandonbonaby94@gmail.com> writes:
> 
> > This patchset introduces a testing framework for Hyper-V drivers.
> > This framework allows us to introduce delays in the packet receive
> > path on a per-device basis. While the current code only supports 
> > introducing arbitrary delays in the host/guest communication path,
> > we intend to expand this to support error injection in the future.
> >
> > Branden Bonaby (3):
> >   drivers: hv: vmbus: Introduce latency testing
> >   drivers: hv: vmbus: add fuzz test attributes to sysfs
> >   tools: hv: add vmbus testing tool
> >
> >  Documentation/ABI/stable/sysfs-bus-vmbus |  22 ++
> >  drivers/hv/connection.c                  |   5 +
> >  drivers/hv/ring_buffer.c                 |  10 +
> >  drivers/hv/vmbus_drv.c                   |  97 ++++++-
> 
> Can we have something like CONFIG_HYPERV_TESTING and put this new 
> code under #ifdef?
> 
> >  include/linux/hyperv.h                   |  14 +
> >  tools/hv/vmbus_testing                   | 326 +++++++++++++++++++++++
> >  6 files changed, 473 insertions(+), 1 deletion(-)
> >  create mode 100644 tools/hv/vmbus_testing
> 
> -- 
> Vitaly

You're right, it would be better to do it that way with ifdef's.
Will edit my patches and resend.

Branden Bonaby

^ permalink raw reply

* [PATCH] PCI: hv: Detect and fix Hyper-V PCI domain number collision
From: Haiyang Zhang @ 2019-08-02 18:52 UTC (permalink / raw)
  To: sashal@kernel.org, bhelgaas@google.com, lorenzo.pieralisi@arm.com,
	linux-hyperv@vger.kernel.org, linux-pci@vger.kernel.org
  Cc: Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de,
	vkuznets, linux-kernel@vger.kernel.org

Due to Azure host agent settings, the device instance ID's bytes 8 and 9
are no longer unique. This causes some of the PCI devices not showing up
in VMs with multiple passthrough devices, such as GPUs. So, as recommended
by Azure host team, we now use the bytes 4 and 5 which usually provide
unique numbers.

In the rare cases of collision, we will detect and find another number
that is not in use.
Thanks to Michael Kelley <mikelley@microsoft.com> for proposing this idea.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/pci/controller/pci-hyperv.c | 91 +++++++++++++++++++++++++++++++------
 1 file changed, 78 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 82acd61..6b9cc6e60a 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -37,6 +37,8 @@
  * the PCI back-end driver in Hyper-V.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/pci.h>
@@ -2507,6 +2509,47 @@ static void put_hvpcibus(struct hv_pcibus_device *hbus)
 		complete(&hbus->remove_event);
 }
 
+#define HVPCI_DOM_MAP_SIZE (64 * 1024)
+static DECLARE_BITMAP(hvpci_dom_map, HVPCI_DOM_MAP_SIZE);
+
+/* PCI domain number 0 is used by emulated devices on Gen1 VMs, so define 0
+ * as invalid for passthrough PCI devices of this driver.
+ */
+#define HVPCI_DOM_INVALID 0
+
+/**
+ * hv_get_dom_num() - Get a valid PCI domain number
+ * Check if the PCI domain number is in use, and return another number if
+ * it is in use.
+ *
+ * @dom: Requested domain number
+ *
+ * return: domain number on success, HVPCI_DOM_INVALID on failure
+ */
+static u16 hv_get_dom_num(u16 dom)
+{
+	unsigned int i;
+
+	if (test_and_set_bit(dom, hvpci_dom_map) == 0)
+		return dom;
+
+	for_each_clear_bit(i, hvpci_dom_map, HVPCI_DOM_MAP_SIZE) {
+		if (test_and_set_bit(i, hvpci_dom_map) == 0)
+			return i;
+	}
+
+	return HVPCI_DOM_INVALID;
+}
+
+/**
+ * hv_put_dom_num() - Mark the PCI domain number as free
+ * @dom: Domain number to be freed
+ */
+static void hv_put_dom_num(u16 dom)
+{
+	clear_bit(dom, hvpci_dom_map);
+}
+
 /**
  * hv_pci_probe() - New VMBus channel probe, for a root PCI bus
  * @hdev:	VMBus's tracking struct for this root PCI bus
@@ -2518,6 +2561,7 @@ static int hv_pci_probe(struct hv_device *hdev,
 			const struct hv_vmbus_device_id *dev_id)
 {
 	struct hv_pcibus_device *hbus;
+	u16 dom_req, dom;
 	int ret;
 
 	/*
@@ -2532,19 +2576,32 @@ static int hv_pci_probe(struct hv_device *hdev,
 	hbus->state = hv_pcibus_init;
 
 	/*
-	 * The PCI bus "domain" is what is called "segment" in ACPI and
-	 * other specs.  Pull it from the instance ID, to get something
-	 * unique.  Bytes 8 and 9 are what is used in Windows guests, so
-	 * do the same thing for consistency.  Note that, since this code
-	 * only runs in a Hyper-V VM, Hyper-V can (and does) guarantee
-	 * that (1) the only domain in use for something that looks like
-	 * a physical PCI bus (which is actually emulated by the
-	 * hypervisor) is domain 0 and (2) there will be no overlap
-	 * between domains derived from these instance IDs in the same
-	 * VM.
+	 * The PCI bus "domain" is what is called "segment" in ACPI and other
+	 * specs. Pull it from the instance ID, to get something usually
+	 * unique. In rare cases of collision, we will find out another number
+	 * not in use.
+	 * Note that, since this code only runs in a Hyper-V VM, Hyper-V
+	 * together with this guest driver can guarantee that (1) The only
+	 * domain used by Gen1 VMs for something that looks like a physical
+	 * PCI bus (which is actually emulated by the hypervisor) is domain 0.
+	 * (2) There will be no overlap between domains (after fixing possible
+	 * collisions) in the same VM.
 	 */
-	hbus->sysdata.domain = hdev->dev_instance.b[9] |
-			       hdev->dev_instance.b[8] << 8;
+	dom_req = hdev->dev_instance.b[5] << 8 | hdev->dev_instance.b[4];
+	dom = hv_get_dom_num(dom_req);
+
+	if (dom == HVPCI_DOM_INVALID) {
+		pr_err("Unable to use dom# 0x%hx or other numbers",
+		       dom_req);
+		ret = -EINVAL;
+		goto free_bus;
+	}
+
+	if (dom != dom_req)
+		pr_info("PCI dom# 0x%hx has collision, using 0x%hx",
+			dom_req, dom);
+
+	hbus->sysdata.domain = dom;
 
 	hbus->hdev = hdev;
 	refcount_set(&hbus->remove_lock, 1);
@@ -2559,7 +2616,7 @@ static int hv_pci_probe(struct hv_device *hdev,
 					   hbus->sysdata.domain);
 	if (!hbus->wq) {
 		ret = -ENOMEM;
-		goto free_bus;
+		goto free_dom;
 	}
 
 	ret = vmbus_open(hdev->channel, pci_ring_size, pci_ring_size, NULL, 0,
@@ -2636,6 +2693,8 @@ static int hv_pci_probe(struct hv_device *hdev,
 	vmbus_close(hdev->channel);
 destroy_wq:
 	destroy_workqueue(hbus->wq);
+free_dom:
+	hv_put_dom_num(hbus->sysdata.domain);
 free_bus:
 	free_page((unsigned long)hbus);
 	return ret;
@@ -2717,6 +2776,9 @@ static int hv_pci_remove(struct hv_device *hdev)
 	put_hvpcibus(hbus);
 	wait_for_completion(&hbus->remove_event);
 	destroy_workqueue(hbus->wq);
+
+	hv_put_dom_num(hbus->sysdata.domain);
+
 	free_page((unsigned long)hbus);
 	return 0;
 }
@@ -2744,6 +2806,9 @@ static void __exit exit_hv_pci_drv(void)
 
 static int __init init_hv_pci_drv(void)
 {
+	/* Set the invalid domain number's bit, so it will not be used */
+	set_bit(HVPCI_DOM_INVALID, hvpci_dom_map);
+
 	return vmbus_driver_register(&hv_pci_drv);
 }
 
-- 
1.8.3.1


^ permalink raw reply related

* RE: [PATCH] PCI: hv: Fix panic by calling hv_pci_remove_slots() earlier
From: Dexuan Cui @ 2019-08-02 18:13 UTC (permalink / raw)
  To: Sasha Levin, lorenzo.pieralisi@arm.com
  Cc: linux-hyperv@vger.kernel.org, stable@vger.kernel.org,
	stable@vger.kernel.org
In-Reply-To: <20190802180817.A206520578@mail.kernel.org>

> From: Sasha Levin <sashal@kernel.org>
> Sent: Friday, August 2, 2019 11:08 AM
> 
> Hi,
> 
> [This is an automated email]
> 
> This commit has been processed because it contains a "Fixes:" tag,
> fixing commit: 15becc2b56c6 PCI: hv: Add hv_pci_remove_slots() when we
> unload the driver.
> 
> The bot has tested the following trees: v5.2.5, v4.19.63, v4.14.135.
> 
> v5.2.5: Build OK!
> v4.19.63: Build OK!
> v4.14.135: Failed to apply! Possible dependencies:
>     Unable to calculate
> 
> NOTE: The patch will not be queued to stable trees until it is upstream.
> 
> How should we proceed with this patch?
> Sasha

I'm waiting the patch to be accepted into the pci tree first.
We do not need to do anything at this moment.

Thanks,
-- Dexuan

^ permalink raw reply

* Re: [PATCH 2/3] drivers: hv: vmbus: add fuzz test attributes to sysfs
From: Vitaly Kuznetsov @ 2019-08-02  7:34 UTC (permalink / raw)
  To: Branden Bonaby
  Cc: linux-hyperv, linux-kernel, kys, haiyangz, sthemmin, sashal
In-Reply-To: <20f96dba927eaa42fceeebfc7a6a37f3b1a9ee65.1564527684.git.brandonbonaby94@gmail.com>

Branden Bonaby <brandonbonaby94@gmail.com> writes:

> Expose the test parameters as part of the sysfs channel attributes.
> We will control the testing state via these attributes.
>
> Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
> ---
>  Documentation/ABI/stable/sysfs-bus-vmbus | 22 ++++++
>  drivers/hv/vmbus_drv.c                   | 97 +++++++++++++++++++++++-
>  2 files changed, 118 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus
> index 8e8d167eca31..239fcb6fdc75 100644
> --- a/Documentation/ABI/stable/sysfs-bus-vmbus
> +++ b/Documentation/ABI/stable/sysfs-bus-vmbus
> @@ -185,3 +185,25 @@ Contact:        Michael Kelley <mikelley@microsoft.com>
>  Description:    Total number of write operations that encountered an outbound
>  		ring buffer full condition
>  Users:          Debugging tools
> +
> +What:           /sys/bus/vmbus/devices/<UUID>/fuzz_test_state

I would prefer this to go under /sys/kernel/debug/ as this is clearly a
debug/test feature.


> +Date:           July 2019
> +KernelVersion:  5.2
> +Contact:        Branden Bonaby <brandonbonaby94@gmail.com>
> +Description:    Fuzz testing status of a vmbus device, whether its in an ON
> +		 state or a OFF state
> +Users:          Debugging tools
> +
> +What:           /sys/bus/vmbus/devices/<UUID>/fuzz_test_buffer_delay
> +Date:           July 2019
> +KernelVersion:  5.2
> +Contact:        Branden Bonaby <brandonbonaby94@gmail.com>
> +Description:    Fuzz testing buffer delay value between 0 - 1000
> +Users:          Debugging tools
> +
> +What:           /sys/bus/vmbus/devices/<UUID>/fuzz_test_message_delay
> +Date:           July 2019
> +KernelVersion:  5.2
> +Contact:        Branden Bonaby <brandonbonaby94@gmail.com>
> +Description:    Fuzz testing message delay value between 0 - 1000
> +Users:          Debugging tools
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 92b1874b3eb3..0c71fd66ef81 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -22,7 +22,7 @@
>  #include <linux/clockchips.h>
>  #include <linux/cpu.h>
>  #include <linux/sched/task_stack.h>
> -
> +#include <linux/kernel.h>
>  #include <asm/mshyperv.h>
>  #include <linux/notifier.h>
>  #include <linux/ptrace.h>
> @@ -584,6 +584,98 @@ static ssize_t driver_override_show(struct device *dev,
>  }
>  static DEVICE_ATTR_RW(driver_override);
>  
> +static ssize_t fuzz_test_state_store(struct device *dev,
> +				     struct device_attribute *attr,
> +				     const char *buf, size_t count)
> +{
> +	struct hv_device *hv_dev = device_to_hv_device(dev);
> +	struct vmbus_channel *channel = hv_dev->channel;
> +	int state;
> +	int delay = kstrtoint(buf, 0, &state);
> +
> +	if (delay)
> +		return count;
> +	if (state)
> +		channel->fuzz_testing_state = 1;
> +	else
> +		channel->fuzz_testing_state = 0;
> +	return count;
> +}
> +
> +static ssize_t fuzz_test_state_show(struct device *dev,
> +				    struct device_attribute *dev_attr,
> +				    char *buf)
> +{
> +	struct hv_device *hv_dev = device_to_hv_device(dev);
> +	struct vmbus_channel *channel = hv_dev->channel;
> +
> +	return sprintf(buf, "%u\n", channel->fuzz_testing_state);
> +}
> +static DEVICE_ATTR_RW(fuzz_test_state);
> +
> +static ssize_t fuzz_test_buffer_delay_store(struct device *dev,
> +					    struct device_attribute *attr,
> +					    const char *buf, size_t count)
> +{
> +	struct hv_device *hv_dev = device_to_hv_device(dev);
> +	struct vmbus_channel *channel = hv_dev->channel;
> +	int val;
> +	int delay = kstrtoint(buf, 0, &val);
> +
> +	if (delay)
> +		return count;
> +	if (val >= 1 && val <= 1000)
> +		channel->fuzz_testing_buffer_delay = val;
> +	/*Best to not use else statement here since we want
> +	 *the buffer delay to remain the same if val > 1000
> +	 */
> +	else if (val <= 0)
> +		channel->fuzz_testing_buffer_delay = 0;
> +	return count;
> +}
> +
> +static ssize_t fuzz_test_buffer_delay_show(struct device *dev,
> +					   struct device_attribute *dev_attr,
> +					   char *buf)
> +{
> +	struct hv_device *hv_dev = device_to_hv_device(dev);
> +	struct vmbus_channel *channel = hv_dev->channel;
> +
> +	return sprintf(buf, "%u\n", channel->fuzz_testing_buffer_delay);
> +}
> +static DEVICE_ATTR_RW(fuzz_test_buffer_delay);
> +
> +static ssize_t fuzz_test_message_delay_store(struct device *dev,
> +					     struct device_attribute *attr,
> +					     const char *buf, size_t count)
> +{
> +	struct hv_device *hv_dev = device_to_hv_device(dev);
> +	struct vmbus_channel *channel = hv_dev->channel;
> +	int val;
> +	int delay = kstrtoint(buf, 0, &val);
> +
> +	if (delay)
> +		return count;
> +	if (val >= 1 && val <= 1000)
> +		channel->fuzz_testing_message_delay = val;
> +	/*Best to not use else statement here since we want
> +	 *the message delay to remain the same if val > 1000
> +	 */
> +	else if (val <= 0)
> +		channel->fuzz_testing_message_delay = 0;
> +	return count;
> +}
> +
> +static ssize_t fuzz_test_message_delay_show(struct device *dev,
> +					    struct device_attribute *dev_attr,
> +					    char *buf)
> +{
> +	struct hv_device *hv_dev = device_to_hv_device(dev);
> +	struct vmbus_channel *channel = hv_dev->channel;
> +
> +	return sprintf(buf, "%u\n", channel->fuzz_testing_message_delay);
> +}
> +static DEVICE_ATTR_RW(fuzz_test_message_delay);
>  /* Set up per device attributes in /sys/bus/vmbus/devices/<bus device> */
>  static struct attribute *vmbus_dev_attrs[] = {
>  	&dev_attr_id.attr,
> @@ -615,6 +707,9 @@ static struct attribute *vmbus_dev_attrs[] = {
>  	&dev_attr_vendor.attr,
>  	&dev_attr_device.attr,
>  	&dev_attr_driver_override.attr,
> +	&dev_attr_fuzz_test_state.attr,
> +	&dev_attr_fuzz_test_buffer_delay.attr,
> +	&dev_attr_fuzz_test_message_delay.attr,
>  	NULL,
>  };

-- 
Vitaly

^ permalink raw reply

* Re: [PATCH 1/3] drivers: hv: vmbus: Introduce latency testing
From: Vitaly Kuznetsov @ 2019-08-02  7:32 UTC (permalink / raw)
  To: Branden Bonaby, kys, haiyangz, sthemmin, sashal
  Cc: Branden Bonaby, linux-hyperv, linux-kernel
In-Reply-To: <18193677a879c402d00955c445ae7ce461b4198f.1564527684.git.brandonbonaby94@gmail.com>

Branden Bonaby <brandonbonaby94@gmail.com> writes:

> Introduce user specified latency in the packet reception path.
>
> Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
> ---
>  drivers/hv/connection.c  |  5 +++++
>  drivers/hv/ring_buffer.c | 10 ++++++++++
>  include/linux/hyperv.h   | 14 ++++++++++++++
>  3 files changed, 29 insertions(+)
>
> diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> index 09829e15d4a0..2a2c22f5570e 100644
> --- a/drivers/hv/connection.c
> +++ b/drivers/hv/connection.c
> @@ -354,9 +354,14 @@ void vmbus_on_event(unsigned long data)
>  {
>  	struct vmbus_channel *channel = (void *) data;
>  	unsigned long time_limit = jiffies + 2;
> +	struct vmbus_channel *test_channel = !channel->primary_channel ?
> +						channel :
> +						channel->primary_channel;
>  
>  	trace_vmbus_on_event(channel);
>  
> +	if (unlikely(test_channel->fuzz_testing_buffer_delay > 0))
> +		udelay(test_channel->fuzz_testing_buffer_delay);
>  	do {
>  		void (*callback_fn)(void *);
>  
> diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
> index 9a03b163cbbd..d7627c9023d6 100644
> --- a/drivers/hv/ring_buffer.c
> +++ b/drivers/hv/ring_buffer.c
> @@ -395,7 +395,12 @@ struct vmpacket_descriptor *hv_pkt_iter_first(struct vmbus_channel *channel)
>  {
>  	struct hv_ring_buffer_info *rbi = &channel->inbound;
>  	struct vmpacket_descriptor *desc;
> +	struct vmbus_channel *test_channel = !channel->primary_channel ?
> +						channel :
> +						channel->primary_channel;
>  
> +	if (unlikely(test_channel->fuzz_testing_message_delay > 0))
> +		udelay(test_channel->fuzz_testing_message_delay);
>  	if (hv_pkt_iter_avail(rbi) < sizeof(struct vmpacket_descriptor))
>  		return NULL;
>  
> @@ -420,7 +425,12 @@ __hv_pkt_iter_next(struct vmbus_channel *channel,
>  	struct hv_ring_buffer_info *rbi = &channel->inbound;
>  	u32 packetlen = desc->len8 << 3;
>  	u32 dsize = rbi->ring_datasize;
> +	struct vmbus_channel *test_channel = !channel->primary_channel ?
> +						channel :
> +						channel->primary_channel;

This pattern is repeated 3 times so a define is justified. I would also
reversed the logic:

   test_channel = channel->primary_channel ? channel->primary_channel : channel;

>  
> +	if (unlikely(test_channel->fuzz_testing_message_delay > 0))
> +		udelay(test_channel->fuzz_testing_message_delay);

unlikely() is good but if it was under #ifdef it would've been even better.

>  	/* bump offset to next potential packet */
>  	rbi->priv_read_index += packetlen + VMBUS_PKT_TRAILER;
>  	if (rbi->priv_read_index >= dsize)
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 6256cc34c4a6..8d068956dd67 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -23,6 +23,7 @@
>  #include <linux/mod_devicetable.h>
>  #include <linux/interrupt.h>
>  #include <linux/reciprocal_div.h>
> +#include <linux/delay.h>
>  
>  #define MAX_PAGE_BUFFER_COUNT				32
>  #define MAX_MULTIPAGE_BUFFER_COUNT			32 /* 128K */
> @@ -926,6 +927,19 @@ struct vmbus_channel {
>  	 * full outbound ring buffer.
>  	 */
>  	u64 out_full_first;
> +
> +	/* enabling/disabling fuzz testing on the channel (default is false)*/
> +	bool fuzz_testing_state;
> +
> +	/* Buffer delay will delay the guest from emptying the ring buffer
> +	 * for a specific amount of time. The delay is in microseconds and will
> +	 * be between 1 to a maximum of 1000, its default is 0 (no delay).
> +	 * The  Message delay will delay guest reading on a per message basis
> +	 * in microseconds between 1 to 1000 with the default being 0
> +	 * (no delay).
> +	 */
> +	u32 fuzz_testing_buffer_delay;
> +	u32 fuzz_testing_message_delay;
>  };
>  
>  static inline bool is_hvsock_channel(const struct vmbus_channel *c)

-- 
Vitaly

^ permalink raw reply

* Re: [PATCH 0/3] hv: vmbus: add fuzz testing to hv devices
From: Vitaly Kuznetsov @ 2019-08-02  7:30 UTC (permalink / raw)
  To: Branden Bonaby
  Cc: linux-hyperv, linux-kernel, kys, haiyangz, sthemmin, sashal
In-Reply-To: <cover.1564527684.git.brandonbonaby94@gmail.com>

Branden Bonaby <brandonbonaby94@gmail.com> writes:

> This patchset introduces a testing framework for Hyper-V drivers.
> This framework allows us to introduce delays in the packet receive
> path on a per-device basis. While the current code only supports 
> introducing arbitrary delays in the host/guest communication path,
> we intend to expand this to support error injection in the future.
>
> Branden Bonaby (3):
>   drivers: hv: vmbus: Introduce latency testing
>   drivers: hv: vmbus: add fuzz test attributes to sysfs
>   tools: hv: add vmbus testing tool
>
>  Documentation/ABI/stable/sysfs-bus-vmbus |  22 ++
>  drivers/hv/connection.c                  |   5 +
>  drivers/hv/ring_buffer.c                 |  10 +
>  drivers/hv/vmbus_drv.c                   |  97 ++++++-

Can we have something like CONFIG_HYPERV_TESTING and put this new 
code under #ifdef?

>  include/linux/hyperv.h                   |  14 +
>  tools/hv/vmbus_testing                   | 326 +++++++++++++++++++++++
>  6 files changed, 473 insertions(+), 1 deletion(-)
>  create mode 100644 tools/hv/vmbus_testing

-- 
Vitaly

^ permalink raw reply

* [PATCH] PCI: hv: Fix panic by calling hv_pci_remove_slots() earlier
From: Dexuan Cui @ 2019-08-02  1:32 UTC (permalink / raw)
  To: lorenzo.pieralisi@arm.com, bhelgaas@google.com,
	linux-pci@vger.kernel.org, Michael Kelley
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	driverdev-devel@linuxdriverproject.org, Sasha Levin,
	Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de,
	apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com, jackm@mellanox.com, Dexuan Cui


When a slot is removed, the pci_dev must still exist.

pci_remove_root_bus() removes and free all the pci_devs, so
hv_pci_remove_slots() must be called before pci_remove_root_bus(),
otherwise a general protection fault can happen, if the kernel is built
with the memory debugging options.

Fixes: 15becc2b56c6 ("PCI: hv: Add hv_pci_remove_slots() when we unload the driver")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: stable@vger.kernel.org

---

When pci-hyperv is unloaded, this panic can happen:

 general protection fault:
 CPU: 2 PID: 1091 Comm: rmmod Not tainted 5.2.0+
 RIP: 0010:pci_slot_release+0x30/0xd0
 Call Trace:
  kobject_release+0x65/0x190
  pci_destroy_slot+0x25/0x60
  hv_pci_remove+0xec/0x110 [pci_hyperv]
  vmbus_remove+0x20/0x30 [hv_vmbus]
  device_release_driver_internal+0xd5/0x1b0
  driver_detach+0x44/0x7c
  bus_remove_driver+0x75/0xc7
  vmbus_driver_unregister+0x50/0xbd [hv_vmbus]
  __x64_sys_delete_module+0x136/0x200
  do_syscall_64+0x5e/0x220

 drivers/pci/controller/pci-hyperv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 6b9cc6e60a..68c611d 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2757,8 +2757,8 @@ static int hv_pci_remove(struct hv_device *hdev)
 		/* Remove the bus from PCI's point of view. */
 		pci_lock_rescan_remove();
 		pci_stop_root_bus(hbus->pci_bus);
-		pci_remove_root_bus(hbus->pci_bus);
 		hv_pci_remove_slots(hbus);
+		pci_remove_root_bus(hbus->pci_bus);
 		pci_unlock_rescan_remove();
 		hbus->state = hv_pcibus_removed;
 	}
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH 3/3] tools: hv: add vmbus testing tool
From: Branden Bonaby @ 2019-08-01 20:00 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal
  Cc: Branden Bonaby, linux-hyperv, linux-kernel
In-Reply-To: <cover.1564527684.git.brandonbonaby94@gmail.com>

This is a userspace tool to drive the testing. Currently it supports
introducing user specified delay in the host to guest communication
path on a per-channel basis.

Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
---
 tools/hv/vmbus_testing | 326 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 326 insertions(+)
 create mode 100644 tools/hv/vmbus_testing

diff --git a/tools/hv/vmbus_testing b/tools/hv/vmbus_testing
new file mode 100644
index 000000000000..8f5cd997952b
--- /dev/null
+++ b/tools/hv/vmbus_testing
@@ -0,0 +1,326 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+#
+#Program to allow users to fuzz test Hyper-V drivers
+#by interfacing with Hyper-V sysfs directories
+#author: Branden Bonaby <brandonbonaby94@gmail.com>
+
+import os
+import cmd
+import argparse
+from collections import defaultdict
+from argparse import RawDescriptionHelpFormatter
+
+#sysfs paths for vmbus must exist (same as in lsvmbus)
+vmbus_sys_path = '/sys/bus/vmbus/devices'
+if not os.path.isdir(vmbus_sys_path):
+            print("{} doesn't exist: exiting...".format(vmbus_sys_path))
+            exit(-1)
+
+#Do not change unless, you change the sysfs attributes
+#in "/sys/bus/vmbus/devices/<UUID>/". All fuzz testing
+#attributes will start with "fuzz_test".
+pathlen = len(vmbus_sys_path)
+fuzz_state_location = "fuzz_test_state"
+fuzz_states = {0 : "Disable", 1 : "Enable"}
+fuzz_methods ={1 : "Delay_testing"}
+fuzz_delay_types = {1 : "fuzz_test_buffer_delay", 2 :"fuzz_test_message_delay"}
+
+def parse_args():
+        parser = argparse.ArgumentParser(description = "vmbus_testing "\
+                "[-s] [0|1] [-q] [-p] <sysfs-path>\n""vmbus_testing [-s]"\
+                " [0|1] [-q] [-p] <sysfs-path> delay [-d] [val] [val] [-E|-D]\n"
+                "vmbus_testing [-q] disable-all\n"
+                "vmbus_testing [-q] view [-v|-V]\n"\
+                "vmbus_testing --version",
+                epilog = "Current testing options {}".format(fuzz_methods),
+                prog = 'vmbus_testing',
+                formatter_class = RawDescriptionHelpFormatter)
+        subparsers = parser.add_subparsers(dest="action")
+        parser.add_argument('--version', action='version',\
+                        version = '%(prog)s 1.0')
+        parser.add_argument("-q","--quiet",action = "store_true",\
+                        help = "silence none important test messages")
+        parser.add_argument("-s","--state",metavar = "", type = int,\
+                        choices = range(0,2),\
+                        help = "Turn testing ON or OFF for a single device."\
+                        " The value (1) will turn testing ON. The value"\
+                        " of (0) will turn testing OFF with the default set"\
+                        " to (0).")
+        parser.add_argument("-p","--path", metavar = "",\
+                        help = "Refers to the sysfs path to a vmbus device."
+                        " If the path is not a valid path to a sysfs device,"\
+                        " the program will exit. The path must be the"\
+                        " absolute path; use the lsvmbus command to find"\
+                        " the path.")
+        parser_delay = subparsers.add_parser("delay",\
+                        help = "Delay buffer/message reads in microseconds.",
+                        description = "vmbus_testing -s [0|1] [-q] -p "\
+                        "<sysfs-path> delay -d "\
+                        "[buffer-delay-value] [message-delay-value]\n"
+                        "vmbus_testing [-q] delay [buffer-delay-value] "\
+                                "[message-delay-value] -E\n"
+                        "vmbus_testing [-q] delay [buffer-delay-value] "\
+                                "[message-delay-value] -D",
+                        formatter_class = RawDescriptionHelpFormatter)
+        delay_group = parser_delay.add_mutually_exclusive_group()
+        delay_group.add_argument("-E","--en-all-delay", action = "store_true",\
+                        help = "Enable Buffer/Message Delay testing on ALL"\
+                        " devices. Use -d option with this to set the values"\
+                        " for both the buffer delay and the message delay. No"\
+                        " value can be (0) or less than (-1). If testing is"\
+                        " disabled on a device prior to running this command,"\
+                        " testing will be enabled on the device as a result"\
+                        " of this command.")
+        delay_group.add_argument("-D","--dis-all-delay", action="store_true",\
+                        help = "Disable Buffer/Message delay testing on ALL"\
+                        " devices. A  value equal to (-1) will keep the"\
+                        " current delay value, and a value equal to (0) will"\
+                        " remove delay testing for the specfied delay column."\
+                        " only values (-1) and (0) will be accepted but at"\
+                        " least one value must be a (0) or a (-1).")
+        parser_delay.add_argument("-d","--delay-time", metavar="", nargs=2,\
+                        type = check_range, default =[0,0], required = (True),\
+                        help = "Buffer/message delay time. A value of (0) will"\
+                        "disable delay testing on the specified delay column,"\
+                        " while a value of (-1) will ignore the specfied"\
+                        " delay column. The default values are [0] & [0]."\
+                        " The first column represents the buffer delay value"\
+                        " and the second represents the message delay value."\
+                        " Value constraints: -1 <= value <= 1000.")
+        parser_dis_all = subparsers.add_parser("disable-all",\
+                        help = "Disable ALL testing on all vmbus devices.",
+                        description = "vmbus_testing disable-all",
+                        formatter_class = RawDescriptionHelpFormatter)
+        parser_view = subparsers.add_parser("view",\
+                        help = "View testing on vmbus devices.",
+                        description = "vmbus_testing view -V\n"
+                        "vmbus_testing -p <sysfs-path> view -v",
+                        formatter_class = RawDescriptionHelpFormatter)
+        view_group = parser_view.add_mutually_exclusive_group()
+        view_group.add_argument("-V","--view-all-states",action = "store_true",\
+                        help = "View the test status for all vmbus devices.")
+        view_group.add_argument("-v","--view-single-device",\
+                        action = "store_true",help = "View test values for a"\
+                        " single vmbus device.")
+
+        return  parser.parse_args()
+
+#value checking for range checking input in parser
+def check_range(arg1):
+        try:
+                val = int(arg1)
+        except ValueError as err:
+                raise argparse.ArgumentTypeError(str(err))
+        if val < -1 or val > 1000:
+                message = ("\n\nError, Expected -1 <= value <= 1000, got value"\
+                        " {}\n").format(val)
+                raise argparse.ArgumentTypeError(message)
+        return val
+
+def main():
+        try:
+                dev_list = []
+                for dir in os.listdir(vmbus_sys_path):\
+                        dev_list.append(os.path.join(vmbus_sys_path,dir))
+                #key value, pairs
+                #key = sysfs device path
+                #value = list of fuzz testing attributes.
+                device_and_files = defaultdict(list)
+                for dev in dev_list:
+                        for f in os.listdir(dev):
+                                if (f.startswith("fuzz_test")):
+                                        device_and_files[dev].append(f)
+
+                device_and_files.default_factory = None
+                args = parse_args()
+                path = args.path
+                state = args.state
+                quiet = args.quiet
+                if (not quiet):
+                        print("*** Use lsvmbus to get vmbus device"\
+                                " information.*** ")
+                if (state is not None and validate_args_path(path,dev_list)):
+                        if (state is not get_test_state(path)):
+                                change_test_state(path,quiet)
+                        state = get_test_state(path)
+                if (state is 0 and path is not None):
+                        disable_testing_single_device(path,0,quiet)
+                        return
+                #Use subparsers as the key for different fuzz testing methods
+                if (args.action == "delay"):
+                        delay = args.delay_time
+                        if (validate_delay_values(args,delay)):
+                                delay_test_all_devices(dev_list,delay,quiet)
+                        elif (validate_args_path(path,dev_list)):
+                                if(get_test_state(path) is 1):
+                                        delay_test_store(path,delay,quiet)
+                                        return
+                                print("device testing OFF, use -s 1 to turn ON")
+                elif (args.action == "disable-all"):
+                        disable_all_testing(dev_list,quiet)
+                elif (args.action == "view"):
+                        if (args.view_all_states):
+                                all_devices_test_status(dev_list)
+                        elif (args.view_single_device):
+                                if (validate_args_path(path,dev_list)):
+                                        device_test_values(device_and_files,\
+                                                path)
+                                        return
+                                print("Error,(check path) usage: -p [path]"\
+                                            " view -v")
+        except AttributeError:
+                print("check usage, 1 or more elements not provided")
+                exit(-1)
+
+#Validate delay values to make sure they are acceptable to
+#to either enable all delays on a device or disable all
+#delays on a device
+def validate_delay_values(args,delay):
+        if (args.en_all_delay):
+                for i in delay:
+                        if (i < -1 or i == 0):
+                                print("\nError, Values must be"\
+                                        " equal to -1 or be > 0, use"\
+                                        " -d option")
+                                exit(-1)
+                return True
+        elif (args.dis_all_delay):
+                for i in delay:
+                        if (i < -1 or i > 0):
+                                print("\nError, at least 1 value"
+                                        " is not a (0) or a (-1)")
+                                exit(-1)
+                return True
+        else:
+                return False
+
+
+#Validate argument path
+def validate_args_path(path,dev_list):
+        if (path in dev_list):
+                return True
+        else:
+                return False
+
+#display Testing status of single device
+def device_test_values(device_and_files,path):
+        for test in  device_and_files.get(path):
+                print("{}".format(test), end = '')
+                print((" value =  {}")\
+                        .format(read_test_files(os.path.join(path,test))))
+
+#display Testing state of devices
+def all_devices_test_status(dev_list):
+    for device in dev_list:
+        if (get_test_state(device) is 1):
+                print("Testing = ON for: {}".format(device.split("/")[5]))
+        else:
+                print("Testing = OFF for: {}".format(device.split("/")[5]))
+
+#read the vmbus device files, path must be absolute path before calling
+def read_test_files(path):
+        try:
+                with open(path,"r") as f:
+                        state = f.readline().replace("\n","")
+                return int(state)
+
+        except IOError as e:
+                errno, strerror = e.args
+                print("I/O error({0}): {1} on file {2}"\
+                        .format(errno,strerror,path))
+                exit(-1)
+        except ValueError:
+                print ("Element to int conversion error in: \n{}".format(path))
+                exit(-1)
+
+#writing to vmbus device files, path must be absolute path before calling
+def write_test_files(path,value):
+        try:
+                with open(path,"w") as f:
+                        f.write("{}".format(value))
+        except IOError as e:
+                errno, strerror = e.args
+                print("I/O error({0}): {1} on file {2}"\
+                        .format(errno,strerror,path))
+                exit(-1)
+
+#change testing state of device
+def change_test_state(device,quiet):
+        state_path = os.path.join(device,fuzz_state_location)
+        if (get_test_state(device) is 0):
+                write_test_files(state_path,1)
+                if (not quiet):
+                            print("Testing = ON for device: {}"\
+                                    .format(state_path.split("/")[5]))
+        else:
+                write_test_files(state_path,0)
+                if (not quiet):
+                            print("Testing = OFF for device: {}"\
+                                    .format(state_path.split("/")[5]))
+
+#get testing state of device
+def get_test_state(device):
+        #state == 1 - test = ON
+        #state == 0 - test = OFF
+        return  read_test_files(os.path.join(device,fuzz_state_location))
+
+#Enter 1 - 1000 microseconds, into a single device using the
+#fuzz_test_buffer_delay and fuzz_test_message_delay
+#sysfs attributes
+def delay_test_store(device,delay_length,quiet):
+
+        try:
+                # delay[0]- buffer delay, delay[1]- message delay
+                buff_test = os.path.join(device,fuzz_delay_types.get(1))
+                mess_test = os.path.join(device,fuzz_delay_types.get(2))
+
+                if (delay_length[0] >= 0):
+                        write_test_files(buff_test,delay_length[0])
+                if (delay_length[1] >= 0):
+                        write_test_files(mess_test,delay_length[1])
+                if (not quiet):
+                        print("Buffer delay testing = {} for: {}"\
+                                .format(read_test_files(buff_test),\
+                                buff_test.split("/")[5]))
+                        print("Message delay testing = {} for: {}"\
+                                .format(read_test_files(mess_test),\
+                                mess_test.split("/")[5]))
+        except IOError as e:
+                errno, strerror = e.args
+                print("I/O error({0}): {1} on files {2}{3}"\
+                        .format(errno,strerror,buff_test,mess_test))
+                exit(-1)
+
+#enabling/disabling delay testing on all devices
+def delay_test_all_devices(dev_list,delay,quiet):
+
+        for device in (dev_list):
+                if (get_test_state(device) is 0):
+                        change_test_state(device,quiet)
+                delay_test_store(device,delay,quiet)
+
+#disabling testing on single device
+def disable_testing_single_device(device,test_type,quiet):
+
+        #test_type represents corresponding key
+        #delay method in delay_methods dict.
+        #special type 0 , used to disable all
+        #testing on SINGLE device.
+
+        if (test_type is 1 or test_type is 0):
+                #disable list [buffer,message]
+                disable_delay = [0,0]
+                if (get_test_state(device) is 1):
+                        change_test_state(device,quiet)
+                delay_test_store(device,disable_delay,quiet)
+
+#disabling testing on ALL devices
+def disable_all_testing(dev_list,quiet):
+
+        #delay disable list [buffer,message]
+        for device in dev_list:
+                disable_testing_single_device(device,0,quiet)
+
+if __name__ == "__main__":
+        main()
-- 
2.17.1


^ permalink raw reply related

* [PATCH 2/3] drivers: hv: vmbus: add fuzz test attributes to sysfs
From: Branden Bonaby @ 2019-08-01 20:00 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal
  Cc: Branden Bonaby, linux-hyperv, linux-kernel
In-Reply-To: <cover.1564527684.git.brandonbonaby94@gmail.com>

Expose the test parameters as part of the sysfs channel attributes.
We will control the testing state via these attributes.

Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
---
 Documentation/ABI/stable/sysfs-bus-vmbus | 22 ++++++
 drivers/hv/vmbus_drv.c                   | 97 +++++++++++++++++++++++-
 2 files changed, 118 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus
index 8e8d167eca31..239fcb6fdc75 100644
--- a/Documentation/ABI/stable/sysfs-bus-vmbus
+++ b/Documentation/ABI/stable/sysfs-bus-vmbus
@@ -185,3 +185,25 @@ Contact:        Michael Kelley <mikelley@microsoft.com>
 Description:    Total number of write operations that encountered an outbound
 		ring buffer full condition
 Users:          Debugging tools
+
+What:           /sys/bus/vmbus/devices/<UUID>/fuzz_test_state
+Date:           July 2019
+KernelVersion:  5.2
+Contact:        Branden Bonaby <brandonbonaby94@gmail.com>
+Description:    Fuzz testing status of a vmbus device, whether its in an ON
+		 state or a OFF state
+Users:          Debugging tools
+
+What:           /sys/bus/vmbus/devices/<UUID>/fuzz_test_buffer_delay
+Date:           July 2019
+KernelVersion:  5.2
+Contact:        Branden Bonaby <brandonbonaby94@gmail.com>
+Description:    Fuzz testing buffer delay value between 0 - 1000
+Users:          Debugging tools
+
+What:           /sys/bus/vmbus/devices/<UUID>/fuzz_test_message_delay
+Date:           July 2019
+KernelVersion:  5.2
+Contact:        Branden Bonaby <brandonbonaby94@gmail.com>
+Description:    Fuzz testing message delay value between 0 - 1000
+Users:          Debugging tools
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 92b1874b3eb3..0c71fd66ef81 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -22,7 +22,7 @@
 #include <linux/clockchips.h>
 #include <linux/cpu.h>
 #include <linux/sched/task_stack.h>
-
+#include <linux/kernel.h>
 #include <asm/mshyperv.h>
 #include <linux/notifier.h>
 #include <linux/ptrace.h>
@@ -584,6 +584,98 @@ static ssize_t driver_override_show(struct device *dev,
 }
 static DEVICE_ATTR_RW(driver_override);
 
+static ssize_t fuzz_test_state_store(struct device *dev,
+				     struct device_attribute *attr,
+				     const char *buf, size_t count)
+{
+	struct hv_device *hv_dev = device_to_hv_device(dev);
+	struct vmbus_channel *channel = hv_dev->channel;
+	int state;
+	int delay = kstrtoint(buf, 0, &state);
+
+	if (delay)
+		return count;
+	if (state)
+		channel->fuzz_testing_state = 1;
+	else
+		channel->fuzz_testing_state = 0;
+	return count;
+}
+
+static ssize_t fuzz_test_state_show(struct device *dev,
+				    struct device_attribute *dev_attr,
+				    char *buf)
+{
+	struct hv_device *hv_dev = device_to_hv_device(dev);
+	struct vmbus_channel *channel = hv_dev->channel;
+
+	return sprintf(buf, "%u\n", channel->fuzz_testing_state);
+}
+static DEVICE_ATTR_RW(fuzz_test_state);
+
+static ssize_t fuzz_test_buffer_delay_store(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t count)
+{
+	struct hv_device *hv_dev = device_to_hv_device(dev);
+	struct vmbus_channel *channel = hv_dev->channel;
+	int val;
+	int delay = kstrtoint(buf, 0, &val);
+
+	if (delay)
+		return count;
+	if (val >= 1 && val <= 1000)
+		channel->fuzz_testing_buffer_delay = val;
+	/*Best to not use else statement here since we want
+	 *the buffer delay to remain the same if val > 1000
+	 */
+	else if (val <= 0)
+		channel->fuzz_testing_buffer_delay = 0;
+	return count;
+}
+
+static ssize_t fuzz_test_buffer_delay_show(struct device *dev,
+					   struct device_attribute *dev_attr,
+					   char *buf)
+{
+	struct hv_device *hv_dev = device_to_hv_device(dev);
+	struct vmbus_channel *channel = hv_dev->channel;
+
+	return sprintf(buf, "%u\n", channel->fuzz_testing_buffer_delay);
+}
+static DEVICE_ATTR_RW(fuzz_test_buffer_delay);
+
+static ssize_t fuzz_test_message_delay_store(struct device *dev,
+					     struct device_attribute *attr,
+					     const char *buf, size_t count)
+{
+	struct hv_device *hv_dev = device_to_hv_device(dev);
+	struct vmbus_channel *channel = hv_dev->channel;
+	int val;
+	int delay = kstrtoint(buf, 0, &val);
+
+	if (delay)
+		return count;
+	if (val >= 1 && val <= 1000)
+		channel->fuzz_testing_message_delay = val;
+	/*Best to not use else statement here since we want
+	 *the message delay to remain the same if val > 1000
+	 */
+	else if (val <= 0)
+		channel->fuzz_testing_message_delay = 0;
+	return count;
+}
+
+static ssize_t fuzz_test_message_delay_show(struct device *dev,
+					    struct device_attribute *dev_attr,
+					    char *buf)
+{
+	struct hv_device *hv_dev = device_to_hv_device(dev);
+	struct vmbus_channel *channel = hv_dev->channel;
+
+	return sprintf(buf, "%u\n", channel->fuzz_testing_message_delay);
+}
+static DEVICE_ATTR_RW(fuzz_test_message_delay);
 /* Set up per device attributes in /sys/bus/vmbus/devices/<bus device> */
 static struct attribute *vmbus_dev_attrs[] = {
 	&dev_attr_id.attr,
@@ -615,6 +707,9 @@ static struct attribute *vmbus_dev_attrs[] = {
 	&dev_attr_vendor.attr,
 	&dev_attr_device.attr,
 	&dev_attr_driver_override.attr,
+	&dev_attr_fuzz_test_state.attr,
+	&dev_attr_fuzz_test_buffer_delay.attr,
+	&dev_attr_fuzz_test_message_delay.attr,
 	NULL,
 };
 
-- 
2.17.1


^ permalink raw reply related

* [PATCH 1/3] drivers: hv: vmbus: Introduce latency testing
From: Branden Bonaby @ 2019-08-01 20:00 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal
  Cc: Branden Bonaby, linux-hyperv, linux-kernel
In-Reply-To: <cover.1564527684.git.brandonbonaby94@gmail.com>

Introduce user specified latency in the packet reception path.

Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
---
 drivers/hv/connection.c  |  5 +++++
 drivers/hv/ring_buffer.c | 10 ++++++++++
 include/linux/hyperv.h   | 14 ++++++++++++++
 3 files changed, 29 insertions(+)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 09829e15d4a0..2a2c22f5570e 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -354,9 +354,14 @@ void vmbus_on_event(unsigned long data)
 {
 	struct vmbus_channel *channel = (void *) data;
 	unsigned long time_limit = jiffies + 2;
+	struct vmbus_channel *test_channel = !channel->primary_channel ?
+						channel :
+						channel->primary_channel;
 
 	trace_vmbus_on_event(channel);
 
+	if (unlikely(test_channel->fuzz_testing_buffer_delay > 0))
+		udelay(test_channel->fuzz_testing_buffer_delay);
 	do {
 		void (*callback_fn)(void *);
 
diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 9a03b163cbbd..d7627c9023d6 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -395,7 +395,12 @@ struct vmpacket_descriptor *hv_pkt_iter_first(struct vmbus_channel *channel)
 {
 	struct hv_ring_buffer_info *rbi = &channel->inbound;
 	struct vmpacket_descriptor *desc;
+	struct vmbus_channel *test_channel = !channel->primary_channel ?
+						channel :
+						channel->primary_channel;
 
+	if (unlikely(test_channel->fuzz_testing_message_delay > 0))
+		udelay(test_channel->fuzz_testing_message_delay);
 	if (hv_pkt_iter_avail(rbi) < sizeof(struct vmpacket_descriptor))
 		return NULL;
 
@@ -420,7 +425,12 @@ __hv_pkt_iter_next(struct vmbus_channel *channel,
 	struct hv_ring_buffer_info *rbi = &channel->inbound;
 	u32 packetlen = desc->len8 << 3;
 	u32 dsize = rbi->ring_datasize;
+	struct vmbus_channel *test_channel = !channel->primary_channel ?
+						channel :
+						channel->primary_channel;
 
+	if (unlikely(test_channel->fuzz_testing_message_delay > 0))
+		udelay(test_channel->fuzz_testing_message_delay);
 	/* bump offset to next potential packet */
 	rbi->priv_read_index += packetlen + VMBUS_PKT_TRAILER;
 	if (rbi->priv_read_index >= dsize)
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 6256cc34c4a6..8d068956dd67 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -23,6 +23,7 @@
 #include <linux/mod_devicetable.h>
 #include <linux/interrupt.h>
 #include <linux/reciprocal_div.h>
+#include <linux/delay.h>
 
 #define MAX_PAGE_BUFFER_COUNT				32
 #define MAX_MULTIPAGE_BUFFER_COUNT			32 /* 128K */
@@ -926,6 +927,19 @@ struct vmbus_channel {
 	 * full outbound ring buffer.
 	 */
 	u64 out_full_first;
+
+	/* enabling/disabling fuzz testing on the channel (default is false)*/
+	bool fuzz_testing_state;
+
+	/* Buffer delay will delay the guest from emptying the ring buffer
+	 * for a specific amount of time. The delay is in microseconds and will
+	 * be between 1 to a maximum of 1000, its default is 0 (no delay).
+	 * The  Message delay will delay guest reading on a per message basis
+	 * in microseconds between 1 to 1000 with the default being 0
+	 * (no delay).
+	 */
+	u32 fuzz_testing_buffer_delay;
+	u32 fuzz_testing_message_delay;
 };
 
 static inline bool is_hvsock_channel(const struct vmbus_channel *c)
-- 
2.17.1


^ permalink raw reply related

* [PATCH 0/3] hv: vmbus: add fuzz testing to hv devices
From: Branden Bonaby @ 2019-08-01 20:00 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal
  Cc: Branden Bonaby, linux-hyperv, linux-kernel

This patchset introduces a testing framework for Hyper-V drivers.
This framework allows us to introduce delays in the packet receive
path on a per-device basis. While the current code only supports 
introducing arbitrary delays in the host/guest communication path,
we intend to expand this to support error injection in the future.

Branden Bonaby (3):
  drivers: hv: vmbus: Introduce latency testing
  drivers: hv: vmbus: add fuzz test attributes to sysfs
  tools: hv: add vmbus testing tool

 Documentation/ABI/stable/sysfs-bus-vmbus |  22 ++
 drivers/hv/connection.c                  |   5 +
 drivers/hv/ring_buffer.c                 |  10 +
 drivers/hv/vmbus_drv.c                   |  97 ++++++-
 include/linux/hyperv.h                   |  14 +
 tools/hv/vmbus_testing                   | 326 +++++++++++++++++++++++
 6 files changed, 473 insertions(+), 1 deletion(-)
 create mode 100644 tools/hv/vmbus_testing

-- 
2.17.1


^ permalink raw reply

* RE: [PATCH v2 net] hv_sock: Fix hang when a connection is closed
From: Sunil Muthuswamy @ 2019-08-01  0:50 UTC (permalink / raw)
  To: Dexuan Cui, David Miller, netdev@vger.kernel.org
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, Michael Kelley, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, olaf@aepfle.de, apw@canonical.com,
	jasowang@redhat.com, vkuznets, marcelo.cerri@canonical.com
In-Reply-To: <PU1P153MB01696DDD3A3F601370701DD2BFDF0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

> -----Original Message-----
> From: Dexuan Cui <decui@microsoft.com>
> Sent: Tuesday, July 30, 2019 6:26 PM
> To: Sunil Muthuswamy <sunilmut@microsoft.com>; David Miller <davem@davemloft.net>; netdev@vger.kernel.org
> Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley <mikelley@microsoft.com>; linux-hyperv@vger.kernel.org; linux-
> kernel@vger.kernel.org; olaf@aepfle.de; apw@canonical.com; jasowang@redhat.com; vkuznets <vkuznets@redhat.com>;
> marcelo.cerri@canonical.com
> Subject: [PATCH v2 net] hv_sock: Fix hang when a connection is closed
> 
> 
> There is a race condition for an established connection that is being closed
> by the guest: the refcnt is 4 at the end of hvs_release() (Note: here the
> 'remove_sock' is false):
> 
> 1 for the initial value;
> 1 for the sk being in the bound list;
> 1 for the sk being in the connected list;
> 1 for the delayed close_work.
> 
> After hvs_release() finishes, __vsock_release() -> sock_put(sk) *may*
> decrease the refcnt to 3.
> 
> Concurrently, hvs_close_connection() runs in another thread:
>   calls vsock_remove_sock() to decrease the refcnt by 2;
>   call sock_put() to decrease the refcnt to 0, and free the sk;
>   next, the "release_sock(sk)" may hang due to use-after-free.
> 
> In the above, after hvs_release() finishes, if hvs_close_connection() runs
> faster than "__vsock_release() -> sock_put(sk)", then there is not any issue,
> because at the beginning of hvs_close_connection(), the refcnt is still 4.
> 
> The issue can be resolved if an extra reference is taken when the
> connection is established.
> 
> Fixes: a9eeb998c28d ("hv_sock: Add support for delayed close")
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> Cc: stable@vger.kernel.org
> 
> ---
> 
> Changes in v2:
>   Changed the location of the sock_hold() call.
>   Updated the changelog accordingly.
> 
>   Thanks Sunil for the suggestion!
> 
> 
> With the proper kernel debugging options enabled, first a warning can
> appear:
> 
> kworker/1:0/4467 is freeing memory ..., with a lock still held there!
> stack backtrace:
> Workqueue: events vmbus_onmessage_work [hv_vmbus]
> Call Trace:
>  dump_stack+0x67/0x90
>  debug_check_no_locks_freed.cold.52+0x78/0x7d
>  slab_free_freelist_hook+0x85/0x140
>  kmem_cache_free+0xa5/0x380
>  __sk_destruct+0x150/0x260
>  hvs_close_connection+0x24/0x30 [hv_sock]
>  vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
>  process_one_work+0x241/0x600
>  worker_thread+0x3c/0x390
>  kthread+0x11b/0x140
>  ret_from_fork+0x24/0x30
> 
> and then the following release_sock(sk) can hang:
> 
> watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:4467]
> ...
> irq event stamp: 62890
> CPU: 1 PID: 4467 Comm: kworker/1:0 Tainted: G        W         5.2.0+ #39
> Workqueue: events vmbus_onmessage_work [hv_vmbus]
> RIP: 0010:queued_spin_lock_slowpath+0x2b/0x1e0
> ...
> Call Trace:
>  do_raw_spin_lock+0xab/0xb0
>  release_sock+0x19/0xb0
>  vmbus_onmessage_work+0x1d/0x30 [hv_vmbus]
>  process_one_work+0x241/0x600
>  worker_thread+0x3c/0x390
>  kthread+0x11b/0x140
>  ret_from_fork+0x24/0x30
> 
>  net/vmw_vsock/hyperv_transport.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
> index f2084e3f7aa4..9d864ebeb7b3 100644
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -312,6 +312,11 @@ static void hvs_close_connection(struct vmbus_channel *chan)
>  	lock_sock(sk);
>  	hvs_do_close_lock_held(vsock_sk(sk), true);
>  	release_sock(sk);
> +
> +	/* Release the refcnt for the channel that's opened in
> +	 * hvs_open_connection().
> +	 */
> +	sock_put(sk);
>  }
> 
>  static void hvs_open_connection(struct vmbus_channel *chan)
> @@ -407,6 +412,9 @@ static void hvs_open_connection(struct vmbus_channel *chan)
>  	}
> 
>  	set_per_channel_state(chan, conn_from_host ? new : sk);
> +
> +	/* This reference will be dropped by hvs_close_connection(). */
> +	sock_hold(conn_from_host ? new : sk);
>  	vmbus_set_chn_rescind_callback(chan, hvs_close_connection);
> 
>  	/* Set the pending send size to max packet size to always get
> --
> 2.19.1

Reviewed-by: Sunil Muthuswamy <sunilmut@microsoft.com>

^ permalink raw reply

* [PATCH v2 3/7] Drivers: hv: vmbus: Break out synic enable and disable operations
From: Dexuan Cui @ 2019-07-31 17:52 UTC (permalink / raw)
  To: linux-hyperv@vger.kernel.org, gregkh@linuxfoundation.org,
	Stephen Hemminger, Sasha Levin, sashal@kernel.org, Haiyang Zhang,
	KY Srinivasan, Michael Kelley, tglx@linutronix.de
  Cc: linux-kernel@vger.kernel.org, Dexuan Cui
In-Reply-To: <1564595464-56520-1-git-send-email-decui@microsoft.com>

Break out synic enable and disable operations into separate
hv_synic_disable_regs() and hv_synic_enable_regs() functions for use by a
later patch to support hibernation.

There is no functional change except the unnecessary check
"if (sctrl.enable != 1) return -EFAULT;" which is removed, because when
we're in hv_synic_cleanup(), we're absolutely sure sctrl.enable must be 1.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
---
 drivers/hv/hv.c           | 66 ++++++++++++++++++++++++++---------------------
 drivers/hv/hyperv_vmbus.h |  2 ++
 2 files changed, 39 insertions(+), 29 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 6188fb7..fcc5279 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -154,7 +154,7 @@ void hv_synic_free(void)
  * retrieve the initialized message and event pages.  Otherwise, we create and
  * initialize the message and event pages.
  */
-int hv_synic_init(unsigned int cpu)
+void hv_synic_enable_regs(unsigned int cpu)
 {
 	struct hv_per_cpu_context *hv_cpu
 		= per_cpu_ptr(hv_context.cpu_context, cpu);
@@ -196,6 +196,11 @@ int hv_synic_init(unsigned int cpu)
 	sctrl.enable = 1;
 
 	hv_set_synic_state(sctrl.as_uint64);
+}
+
+int hv_synic_init(unsigned int cpu)
+{
+	hv_synic_enable_regs(cpu);
 
 	hv_stimer_init(cpu);
 
@@ -205,20 +210,45 @@ int hv_synic_init(unsigned int cpu)
 /*
  * hv_synic_cleanup - Cleanup routine for hv_synic_init().
  */
-int hv_synic_cleanup(unsigned int cpu)
+void hv_synic_disable_regs(unsigned int cpu)
 {
 	union hv_synic_sint shared_sint;
 	union hv_synic_simp simp;
 	union hv_synic_siefp siefp;
 	union hv_synic_scontrol sctrl;
+
+	hv_get_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
+
+	shared_sint.masked = 1;
+
+	/* Need to correctly cleanup in the case of SMP!!! */
+	/* Disable the interrupt */
+	hv_set_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
+
+	hv_get_simp(simp.as_uint64);
+	simp.simp_enabled = 0;
+	simp.base_simp_gpa = 0;
+
+	hv_set_simp(simp.as_uint64);
+
+	hv_get_siefp(siefp.as_uint64);
+	siefp.siefp_enabled = 0;
+	siefp.base_siefp_gpa = 0;
+
+	hv_set_siefp(siefp.as_uint64);
+
+	/* Disable the global synic bit */
+	hv_get_synic_state(sctrl.as_uint64);
+	sctrl.enable = 0;
+	hv_set_synic_state(sctrl.as_uint64);
+}
+
+int hv_synic_cleanup(unsigned int cpu)
+{
 	struct vmbus_channel *channel, *sc;
 	bool channel_found = false;
 	unsigned long flags;
 
-	hv_get_synic_state(sctrl.as_uint64);
-	if (sctrl.enable != 1)
-		return -EFAULT;
-
 	/*
 	 * Search for channels which are bound to the CPU we're about to
 	 * cleanup. In case we find one and vmbus is still connected we need to
@@ -249,29 +279,7 @@ int hv_synic_cleanup(unsigned int cpu)
 
 	hv_stimer_cleanup(cpu);
 
-	hv_get_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
-
-	shared_sint.masked = 1;
-
-	/* Need to correctly cleanup in the case of SMP!!! */
-	/* Disable the interrupt */
-	hv_set_synint_state(VMBUS_MESSAGE_SINT, shared_sint.as_uint64);
-
-	hv_get_simp(simp.as_uint64);
-	simp.simp_enabled = 0;
-	simp.base_simp_gpa = 0;
-
-	hv_set_simp(simp.as_uint64);
-
-	hv_get_siefp(siefp.as_uint64);
-	siefp.siefp_enabled = 0;
-	siefp.base_siefp_gpa = 0;
-
-	hv_set_siefp(siefp.as_uint64);
-
-	/* Disable the global synic bit */
-	sctrl.enable = 0;
-	hv_set_synic_state(sctrl.as_uint64);
+	hv_synic_disable_regs(cpu);
 
 	return 0;
 }
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 362e70e..26ea161 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -171,8 +171,10 @@ extern int hv_post_message(union hv_connection_id connection_id,
 
 extern void hv_synic_free(void);
 
+extern void hv_synic_enable_regs(unsigned int cpu);
 extern int hv_synic_init(unsigned int cpu);
 
+extern void hv_synic_disable_regs(unsigned int cpu);
 extern int hv_synic_cleanup(unsigned int cpu);
 
 /* Interface */
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH v2 2/7] clocksource/drivers: Suspend/resume Hyper-V clocksource for hibernation
From: Dexuan Cui @ 2019-07-31 17:52 UTC (permalink / raw)
  To: linux-hyperv@vger.kernel.org, gregkh@linuxfoundation.org,
	Stephen Hemminger, Sasha Levin, sashal@kernel.org, Haiyang Zhang,
	KY Srinivasan, Michael Kelley, tglx@linutronix.de
  Cc: linux-kernel@vger.kernel.org, Dexuan Cui
In-Reply-To: <1564595464-56520-1-git-send-email-decui@microsoft.com>

This is needed for hibernation, e.g. when we resume the old kernel, we need
to disable the "current" kernel's TSC page and then resume the old kernel's.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
---
 drivers/clocksource/hyperv_timer.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index ba2c79e6..41c31a7 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -237,12 +237,37 @@ static u64 read_hv_clock_tsc(struct clocksource *arg)
 	return read_hv_sched_clock_tsc();
 }
 
+static void suspend_hv_clock_tsc(struct clocksource *arg)
+{
+	u64 tsc_msr;
+
+	/* Disable the TSC page */
+	hv_get_reference_tsc(tsc_msr);
+	tsc_msr &= ~BIT_ULL(0);
+	hv_set_reference_tsc(tsc_msr);
+}
+
+
+static void resume_hv_clock_tsc(struct clocksource *arg)
+{
+	phys_addr_t phys_addr = page_to_phys(vmalloc_to_page(tsc_pg));
+	u64 tsc_msr;
+
+	/* Re-enable the TSC page */
+	hv_get_reference_tsc(tsc_msr);
+	tsc_msr &= GENMASK_ULL(11, 0);
+	tsc_msr |= BIT_ULL(0) | (u64)phys_addr;
+	hv_set_reference_tsc(tsc_msr);
+}
+
 static struct clocksource hyperv_cs_tsc = {
 	.name	= "hyperv_clocksource_tsc_page",
 	.rating	= 400,
 	.read	= read_hv_clock_tsc,
 	.mask	= CLOCKSOURCE_MASK(64),
 	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
+	.suspend= suspend_hv_clock_tsc,
+	.resume	= resume_hv_clock_tsc,
 };
 #endif
 
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH v2 5/7] Drivers: hv: vmbus: Ignore the offers when resuming from hibernation
From: Dexuan Cui @ 2019-07-31 17:52 UTC (permalink / raw)
  To: linux-hyperv@vger.kernel.org, gregkh@linuxfoundation.org,
	Stephen Hemminger, Sasha Levin, sashal@kernel.org, Haiyang Zhang,
	KY Srinivasan, Michael Kelley, tglx@linutronix.de
  Cc: linux-kernel@vger.kernel.org, Dexuan Cui
In-Reply-To: <1564595464-56520-1-git-send-email-decui@microsoft.com>

When the VM resumes, the host re-sends the offers. We should not add the
offers to the global vmbus_connection.chn_list again.

Added some debug code, in case the host screws up the exact info related to
the offers.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 drivers/hv/channel_mgmt.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index addcef5..165f125 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -854,12 +854,39 @@ void vmbus_initiate_unload(bool crash)
 static void vmbus_onoffer(struct vmbus_channel_message_header *hdr)
 {
 	struct vmbus_channel_offer_channel *offer;
-	struct vmbus_channel *newchannel;
+	struct vmbus_channel *oldchannel, *newchannel;
+	size_t offer_sz;
 
 	offer = (struct vmbus_channel_offer_channel *)hdr;
 
 	trace_vmbus_onoffer(offer);
 
+	mutex_lock(&vmbus_connection.channel_mutex);
+	oldchannel = relid2channel(offer->child_relid);
+	mutex_unlock(&vmbus_connection.channel_mutex);
+
+	if (oldchannel != NULL) {
+		atomic_dec(&vmbus_connection.offer_in_progress);
+
+		/*
+		 * We're resuming from hibernation: we expect the host to send
+		 * exactly the same offers that we had before the hibernation.
+		 */
+		offer_sz = sizeof(*offer);
+		if (memcmp(offer, &oldchannel->offermsg, offer_sz) == 0)
+			return;
+
+		pr_err("Mismatched offer from the host (relid=%d)!\n",
+		       offer->child_relid);
+
+		print_hex_dump_debug("Old vmbus offer: ", DUMP_PREFIX_OFFSET,
+				     16, 4, &oldchannel->offermsg, offer_sz,
+				     false);
+		print_hex_dump_debug("New vmbus offer: ", DUMP_PREFIX_OFFSET,
+				     16, 4, offer, offer_sz, false);
+		return;
+	}
+
 	/* Allocate the channel object and save this offer. */
 	newchannel = alloc_channel();
 	if (!newchannel) {
-- 
1.8.3.1


^ permalink raw reply related

* [PATCH v2 6/7] Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation
From: Dexuan Cui @ 2019-07-31 17:52 UTC (permalink / raw)
  To: linux-hyperv@vger.kernel.org, gregkh@linuxfoundation.org,
	Stephen Hemminger, Sasha Levin, sashal@kernel.org, Haiyang Zhang,
	KY Srinivasan, Michael Kelley, tglx@linutronix.de
  Cc: linux-kernel@vger.kernel.org, Dexuan Cui
In-Reply-To: <1564595464-56520-1-git-send-email-decui@microsoft.com>

Before Linux enters hibernation, it sends the CHANNELMSG_UNLOAD message to
the host so all the offers are gone. After hibernation, Linux needs to
re-negotiate with the host using the same vmbus protocol version (which
was in use before hibernation), and ask the host to re-offer the vmbus
devices.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 drivers/hv/connection.c   |  3 +--
 drivers/hv/hyperv_vmbus.h |  2 ++
 drivers/hv/vmbus_drv.c    | 50 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 09829e1..806319c 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -59,8 +59,7 @@ static __u32 vmbus_get_next_version(__u32 current_version)
 	}
 }
 
-static int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo,
-					__u32 version)
+int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo, u32 version)
 {
 	int ret = 0;
 	unsigned int cur_cpu;
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 26ea161..e657197 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -274,6 +274,8 @@ struct vmbus_msginfo {
 
 extern struct vmbus_connection vmbus_connection;
 
+int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo, u32 version);
+
 static inline void vmbus_send_interrupt(u32 relid)
 {
 	sync_set_bit(relid, vmbus_connection.send_int_page);
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 2ef375c..ca6f4c8 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2043,6 +2043,51 @@ static int vmbus_acpi_add(struct acpi_device *device)
 	return ret_val;
 }
 
+static int vmbus_bus_suspend(struct device *dev)
+{
+	vmbus_initiate_unload(false);
+
+	vmbus_connection.conn_state = DISCONNECTED;
+
+	return 0;
+}
+
+static int vmbus_bus_resume(struct device *dev)
+{
+	struct vmbus_channel_msginfo *msginfo;
+	size_t msgsize;
+	int ret;
+
+	/*
+	 * We only use the 'vmbus_proto_version', which was in use before
+	 * hibernation, to re-negotiate with the host.
+	 */
+	if (vmbus_proto_version == VERSION_INVAL ||
+	    vmbus_proto_version == 0) {
+		pr_err("Invalid proto version = 0x%x\n", vmbus_proto_version);
+		return -EINVAL;
+	}
+
+	msgsize = sizeof(*msginfo) +
+		  sizeof(struct vmbus_channel_initiate_contact);
+
+	msginfo = kzalloc(msgsize, GFP_KERNEL);
+
+	if (msginfo == NULL)
+		return -ENOMEM;
+
+	ret = vmbus_negotiate_version(msginfo, vmbus_proto_version);
+
+	kfree(msginfo);
+
+	if (ret != 0)
+		return ret;
+
+	vmbus_request_offers();
+
+	return 0;
+}
+
 static const struct acpi_device_id vmbus_acpi_device_ids[] = {
 	{"VMBUS", 0},
 	{"VMBus", 0},
@@ -2050,6 +2095,10 @@ static int vmbus_acpi_add(struct acpi_device *device)
 };
 MODULE_DEVICE_TABLE(acpi, vmbus_acpi_device_ids);
 
+static const struct dev_pm_ops vmbus_bus_pm = {
+	SET_SYSTEM_SLEEP_PM_OPS(vmbus_bus_suspend, vmbus_bus_resume)
+};
+
 static struct acpi_driver vmbus_acpi_driver = {
 	.name = "vmbus",
 	.ids = vmbus_acpi_device_ids,
@@ -2057,6 +2106,7 @@ static int vmbus_acpi_add(struct acpi_device *device)
 		.add = vmbus_acpi_add,
 		.remove = vmbus_acpi_remove,
 	},
+	.drv.pm = &vmbus_bus_pm,
 };
 
 static void hv_kexec_handler(void)
-- 
1.8.3.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox