Linux-HyperV List


* Re: [PATCH net] hvsock: fix epollout hang from race condition
From: David Miller @ 2019-06-15  2:14 UTC (permalink / raw)
  To: sunilmut
  Cc: kys, haiyangz, sthemmin, sashal, decui, mikelley, netdev,
	linux-hyperv, linux-kernel
In-Reply-To: <MW2PR2101MB11164C6EEAA5C511B395EF3AC0EC0@MW2PR2101MB1116.namprd21.prod.outlook.com>


This adds lots of new warnings:

net/vmw_vsock/hyperv_transport.c: In function ‘hvs_probe’:
net/vmw_vsock/hyperv_transport.c:205:20: warning: ‘vnew’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   remote->svm_port = host_ephemeral_port++;
   ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
net/vmw_vsock/hyperv_transport.c:332:21: note: ‘vnew’ was declared here
  struct vsock_sock *vnew;
                     ^~~~
net/vmw_vsock/hyperv_transport.c:406:22: warning: ‘hvs_new’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   hvs_new->vm_srv_id = *if_type;
   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
net/vmw_vsock/hyperv_transport.c:333:23: note: ‘hvs_new’ was declared here
  struct hvsock *hvs, *hvs_new;
                       ^~~~~~~


* [PATCH] scsi: storvsc: Add ability to change scsi queue depth
From: Branden Bonaby @ 2019-06-14 23:48 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal, jejb, martin.petersen
  Cc: Branden Bonaby, linux-hyperv, linux-scsi, linux-kernel

Add the ability to change the SCSI queue depth by implementing the
change_queue_depth callback with scsi_change_queue_depth(). A requested
depth above the host template's can_queue is capped at can_queue.

Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
---
 drivers/scsi/storvsc_drv.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 8472de1007ff..719ca9906fc2 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -387,6 +387,7 @@ enum storvsc_request_type {
 
 static int storvsc_ringbuffer_size = (128 * 1024);
 static u32 max_outstanding_req_per_channel;
+static int storvsc_change_queue_depth(struct scsi_device *sdev, int queue_depth);
 
 static int storvsc_vcpus_per_sub_channel = 4;
 
@@ -1711,6 +1712,7 @@ static struct scsi_host_template scsi_driver = {
 	.dma_boundary =		PAGE_SIZE-1,
 	.no_write_same =	1,
 	.track_queue_depth =	1,
+	.change_queue_depth =	storvsc_change_queue_depth,
 };
 
 enum {
@@ -1917,6 +1919,15 @@ static int storvsc_probe(struct hv_device *device,
 	return ret;
 }
 
+/* Change a scsi target's queue depth */
+static int storvsc_change_queue_depth(struct scsi_device *sdev, int queue_depth)
+{
+	if (queue_depth > scsi_driver.can_queue) {
+		queue_depth = scsi_driver.can_queue;
+	}
+	return scsi_change_queue_depth(sdev, queue_depth);
+}
+
 static int storvsc_remove(struct hv_device *dev)
 {
 	struct storvsc_device *stor_device = hv_get_drvdata(dev);
-- 
2.17.1
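
For readers following the patch, the clamping step can be sketched in
isolation. This is a minimal userspace model, not the driver code:
clamp_queue_depth() and its can_queue parameter are hypothetical names
standing in for storvsc_change_queue_depth() and scsi_driver.can_queue.

```c
#include <assert.h>

/*
 * Hypothetical userspace model of the clamp in the patch above:
 * a requested depth larger than the host limit is capped at the limit.
 */
static int clamp_queue_depth(int requested, int can_queue)
{
	assert(can_queue > 0);	/* assumption: the host limit is positive */

	if (requested > can_queue)
		requested = can_queue;
	return requested;
}
```

With a limit of 32, a request for 64 is capped at 32 and a request for 16
passes through unchanged. In the patch, the resulting value is then handed
to scsi_change_queue_depth().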



* RE: [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Dexuan Cui @ 2019-06-14 23:34 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Michael Kelley, linux-acpi@vger.kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com,
	Russ Dill, Sebastian Capella, Pavel Machek, Lorenzo Pieralisi,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <20190614223310.pwwoefu5qdvcuaiy@shell.armlinux.org.uk>

> From: linux-hyperv-owner@vger.kernel.org
> <linux-hyperv-owner@vger.kernel.org> On Behalf Of Russell King
> On Fri, Jun 14, 2019 at 10:19:02PM +0000, Dexuan Cui wrote:
> > It looks like ARM does not support the ACPI S4 state, so how do we know
> > if an ARM host supports hibernation or not?
> 
> Don't forget that Linux does not support ACPI on 32-bit ARM, which is
> quite different from the situation on 64-bit ARM.
> 
> arch/arm/kernel/hibernate.c is only for 32-bit ARM, and is written with
> the assumption that there is no interaction required with any firmware
> to save state, and later restore state upon resuming.
> 
> Or am I missing something?

Hi Russell,
Thanks for your reply, and please excuse my ignorance of ARM.

So 32-bit ARM Linux can hibernate even if it doesn't support ACPI, but
I guess not all 32-bit ARM machines support hibernation? If my guess
is correct, is there any standard capability bit or something that can be
used to tell if an ARM machine supports hibernation? I'm purely curious. :-)

Are you implying that 64-bit ARM Linux supports ACPI and the ACPI S4 state?
If not, how can we tell whether a 64-bit ARM machine supports hibernation?

Thanks,
-- Dexuan


* RE: [PATCH 2/2] hv_balloon: Reorganize the probe function
From: Dexuan Cui @ 2019-06-14 23:08 UTC (permalink / raw)
  To: Michael Kelley, linux-hyperv@vger.kernel.org,
	gregkh@linuxfoundation.org, Stephen Hemminger, Sasha Levin,
	Haiyang Zhang, KY Srinivasan, linux-kernel@vger.kernel.org,
	Tianyu Lan
  Cc: olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <BL0PR2101MB13487B8D2A157AA7FCFD159DD7EE0@BL0PR2101MB1348.namprd21.prod.outlook.com>

> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Friday, June 14, 2019 2:56 PM
> > ...
> > +	ret = balloon_connect_vsp(dev);
> > +	if (ret != 0)
> > +		return ret;
> > +
> >  	dm_device.state = DM_INITIALIZED;
> > -	last_post_time = jiffies;
> 
> I was curious about the above deletion.  But I guess the line
> is not needed as the time_after() check in post_status() should
> handle an initial value of 0 for last_post_time just fine.

In a 32-bit kernel, sizeof(unsigned long) is 4, and the global 32-bit
variable "jiffies" overflows after about 49.7 days if HZ is defined as 1000.
So in theory there is a tiny chance that time_after() does not work as
expected here (i.e. we load the hv_balloon driver just as "jiffies" is
about to overflow, which is highly unlikely in practice). Even if that
happens, we do not care, since the only consequence is that the memory
pressure reporting is delayed by 1 second. :-)
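
The wraparound behaviour described above can be checked with a small
userspace model. This is only a sketch: time_after32_model() and
wrap_days() are hypothetical helpers, and the comparison mirrors the
signed-difference idiom used by the kernel's time_after() macro,
specialised here to a 32-bit counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of time_after(a, b) for a 32-bit tick counter:
 * true when a is later than b, even if the counter wrapped in between.
 * The unsigned subtraction wraps modulo 2^32; the cast to int32_t then
 * reads the result as a signed distance (two's complement assumed).
 */
static bool time_after32_model(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Days until a 32-bit tick counter wraps at the given tick rate (HZ). */
static double wrap_days(unsigned int hz)
{
	assert(hz > 0);
	return 4294967296.0 / hz / 86400.0;
}
```

For HZ = 1000, wrap_days() gives roughly 49.7 days. A counter value of 5
compares as "after" 0xFFFFFFF0 because the wrap is accounted for, while a
value just below the overflow point compares as "not after" a small
deadline such as 1000, which is the corner case discussed above.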

> > +
> > +	dm_device.thread =
> > +		 kthread_run(dm_thread_func, &dm_device, "hv_balloon");
> > +	if (IS_ERR(dm_device.thread)) {
> > +		ret = PTR_ERR(dm_device.thread);
> > +		goto probe_error;
> > +	}
> 
> Just an observation:  this thread creation now happens at the end of the
> probing process.  But that's good, because in the old code, the thread
> was started and could run before the protocol version had been
> negotiated.  So I'll assume your change here is intentional.

Yes, this is intentional.
 
> >
> >  	return 0;
> >
> > -probe_error2:
> > +probe_error:
> > +	vmbus_close(dev->channel);
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> > +	unregister_memory_notifier(&hv_memory_nb);
> 
> Hmmm. Evidently the above cleanup was missing in the
> old code.

Yes.
 
> >  	restore_online_page_callback(&hv_online_page);
> >  #endif
> > -	kthread_stop(dm_device.thread);
> > -
> > -probe_error1:
> > -	vmbus_close(dev->channel);
> >  	return ret;
> >  }
> >
> > @@ -1734,11 +1742,11 @@ static int balloon_remove(struct hv_device *dev)
> >  	cancel_work_sync(&dm->balloon_wrk.wrk);
> >  	cancel_work_sync(&dm->ha_wrk.wrk);
> >
> > -	vmbus_close(dev->channel);
> >  	kthread_stop(dm->thread);
> > +	vmbus_close(dev->channel);
> 
> Presumably this is an intentional ordering change as well.
> The kthread should be stopped before closing the channel.

Yes. The old code is buggy: after the vmbus_close(), there is
a small window in which the old code can still try to send
messages to the host via a freed ringbuffer, causing a panic.
 
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> > -	restore_online_page_callback(&hv_online_page);
> >  	unregister_memory_notifier(&hv_memory_nb);
> > +	restore_online_page_callback(&hv_online_page);
> 
> And you've changed the ordering of these steps so they are
> the inverse of when they are set up.  Also a good cleanup ....

Yes. The change is not really necessary, but let's just do it
in a better manner.
 
> 
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>

Thanks for the detailed comments!

Thanks,
-- Dexuan


* Re: [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Russell King - ARM Linux admin @ 2019-06-14 22:33 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Michael Kelley, linux-acpi@vger.kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com,
	Russ Dill, Sebastian Capella, Pavel Machek, Lorenzo Pieralisi,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <PU1P153MB01699020B5BC4287C58F5335BFEE0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

Hi,

On Fri, Jun 14, 2019 at 10:19:02PM +0000, Dexuan Cui wrote:
> > -----Original Message-----
> > From: Michael Kelley <mikelley@microsoft.com>
> > Sent: Friday, June 14, 2019 1:48 PM
> > To: Dexuan Cui <decui@microsoft.com>; linux-acpi@vger.kernel.org;
> > rjw@rjwysocki.net; lenb@kernel.org; robert.moore@intel.com;
> > erik.schmauss@intel.com
> > Cc: linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org; KY Srinivasan
> > <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> > Haiyang Zhang <haiyangz@microsoft.com>; Sasha Levin
> > <Alexander.Levin@microsoft.com>; olaf@aepfle.de; apw@canonical.com;
> > jasowang@redhat.com; vkuznets <vkuznets@redhat.com>;
> > marcelo.cerri@canonical.com
> > Subject: RE: [PATCH] ACPI: PM: Export the function
> > acpi_sleep_state_supported()
> > 
> > It seems that sleep.c isn't built when on the ARM64 architecture.  Using
> > acpi_sleep_state_supported() directly in hv_balloon.c will be problematic
> > since hv_balloon.c needs to be architecture independent when the
> > Hyper-V ARM64 support is added.  If that doesn't change, a per-architecture
> > wrapper will be needed to give hv_balloon.c the correct information.  This
> > may affect whether acpi_sleep_state_supported() needs to be exported vs.
> > just removing the "static".   I'm not sure what the best approach is.
> > 
> > Michael
> 
> + some ARM experts who worked on arch/arm/kernel/hibernate.c.
> 
> drivers/acpi/sleep.c is only built if ACPI_SYSTEM_POWER_STATES_SUPPORT
> is defined, but it looks like this option is not defined on ARM.
> 
> It looks like ARM does not support the ACPI S4 state, so how do we know
> if an ARM host supports hibernation or not?

Don't forget that Linux does not support ACPI on 32-bit ARM, which is
quite different from the situation on 64-bit ARM.

arch/arm/kernel/hibernate.c is only for 32-bit ARM, and is written with
the assumption that there is no interaction required with any firmware
to save state, and later restore state upon resuming.

Or am I missing something?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


* RE: [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Dexuan Cui @ 2019-06-14 22:19 UTC (permalink / raw)
  To: Michael Kelley, linux-acpi@vger.kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com,
	Russell King, Russ Dill, Sebastian Capella, Pavel Machek,
	Lorenzo Pieralisi
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <BL0PR2101MB134895BADA1D8E0FA631D532D7EE0@BL0PR2101MB1348.namprd21.prod.outlook.com>

> -----Original Message-----
> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Friday, June 14, 2019 1:48 PM
> To: Dexuan Cui <decui@microsoft.com>; linux-acpi@vger.kernel.org;
> rjw@rjwysocki.net; lenb@kernel.org; robert.moore@intel.com;
> erik.schmauss@intel.com
> Cc: linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org; KY Srinivasan
> <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> Haiyang Zhang <haiyangz@microsoft.com>; Sasha Levin
> <Alexander.Levin@microsoft.com>; olaf@aepfle.de; apw@canonical.com;
> jasowang@redhat.com; vkuznets <vkuznets@redhat.com>;
> marcelo.cerri@canonical.com
> Subject: RE: [PATCH] ACPI: PM: Export the function
> acpi_sleep_state_supported()
> 
> From: Dexuan Cui <decui@microsoft.com>  Sent: Friday, June 14, 2019 11:19 AM
> >
> > In a Linux VM running on Hyper-V, when ACPI S4 is enabled, the balloon
> > driver (drivers/hv/hv_balloon.c) needs to ask the host not to do memory
> > hot-add/remove.
> >
> > So let's export acpi_sleep_state_supported() for the hv_balloon driver.
> > This might also be useful to the other drivers in the future.
> >
> > Signed-off-by: Dexuan Cui <decui@microsoft.com>
> > ---
> >  drivers/acpi/sleep.c    | 3 ++-
> >  include/acpi/acpi_bus.h | 2 ++
> >  2 files changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
> > index a34deccd7317..69755411e008 100644
> > --- a/drivers/acpi/sleep.c
> > +++ b/drivers/acpi/sleep.c
> > @@ -79,7 +79,7 @@ static int acpi_sleep_prepare(u32 acpi_state)
> >  	return 0;
> >  }
> >
> > -static bool acpi_sleep_state_supported(u8 sleep_state)
> > +bool acpi_sleep_state_supported(u8 sleep_state)
> >  {
> >  	acpi_status status;
> >  	u8 type_a, type_b;
> > @@ -89,6 +89,7 @@ static bool acpi_sleep_state_supported(u8 sleep_state)
> >  		|| (acpi_gbl_FADT.sleep_control.address
> >  			&& acpi_gbl_FADT.sleep_status.address));
> >  }
> > +EXPORT_SYMBOL_GPL(acpi_sleep_state_supported);
> >
> >  #ifdef CONFIG_ACPI_SLEEP
> >  static u32 acpi_target_sleep_state = ACPI_STATE_S0;
> > diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
> > index 31b6c87d6240..5b102e7bbf25 100644
> > --- a/include/acpi/acpi_bus.h
> > +++ b/include/acpi/acpi_bus.h
> > @@ -651,6 +651,8 @@ static inline int acpi_pm_set_bridge_wakeup(struct device *dev, bool enable)
> >  }
> >  #endif
> >
> > +bool acpi_sleep_state_supported(u8 sleep_state);
> > +
> >  #ifdef CONFIG_ACPI_SLEEP
> >  u32 acpi_target_system_state(void);
> >  #else
> > --
> > 2.19.1
> 
> It seems that sleep.c isn't built when on the ARM64 architecture.  Using
> acpi_sleep_state_supported() directly in hv_balloon.c will be problematic
> since hv_balloon.c needs to be architecture independent when the
> Hyper-V ARM64 support is added.  If that doesn't change, a per-architecture
> wrapper will be needed to give hv_balloon.c the correct information.  This
> may affect whether acpi_sleep_state_supported() needs to be exported vs.
> just removing the "static".   I'm not sure what the best approach is.
> 
> Michael

+ some ARM experts who worked on arch/arm/kernel/hibernate.c.

drivers/acpi/sleep.c is only built if ACPI_SYSTEM_POWER_STATES_SUPPORT
is defined, but it looks like this option is not defined on ARM.

It looks like ARM does not support the ACPI S4 state, so how do we know
if an ARM host supports hibernation or not?

Thanks,
-- Dexuan


* RE: [PATCH 2/2] hv_balloon: Reorganize the probe function
From: Michael Kelley @ 2019-06-14 21:56 UTC (permalink / raw)
  To: Dexuan Cui, linux-hyperv@vger.kernel.org,
	gregkh@linuxfoundation.org, Stephen Hemminger, Sasha Levin,
	Haiyang Zhang, KY Srinivasan, linux-kernel@vger.kernel.org,
	Tianyu Lan
  Cc: olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <1560537692-37400-2-git-send-email-decui@microsoft.com>

From: Dexuan Cui <decui@microsoft.com>  Sent: Friday, June 14, 2019 11:43 AM
> 
> Move the code that negotiates with the host to a new function
> balloon_connect_vsp() and improve the error handling.
> 
> This makes the code more readable and paves the way for
> hibernation support in the future.
> 
> No real logic change is made here.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
>  drivers/hv/hv_balloon.c | 124 +++++++++++++++++++++-------------------
>  1 file changed, 66 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> index 13381ea3e3e7..111ea3599659 100644
> --- a/drivers/hv/hv_balloon.c
> +++ b/drivers/hv/hv_balloon.c
> @@ -1574,50 +1574,18 @@ static void balloon_onchannelcallback(void *context)
> 
>  }
> 
> -static int balloon_probe(struct hv_device *dev,
> -			const struct hv_vmbus_device_id *dev_id)
> +static int balloon_connect_vsp(struct hv_device *dev)
>  {
> -	int ret;
> -	unsigned long t;
>  	struct dm_version_request version_req;
>  	struct dm_capabilities cap_msg;
> -
> -#ifdef CONFIG_MEMORY_HOTPLUG
> -	do_hot_add = hot_add;
> -#else
> -	do_hot_add = false;
> -#endif
> +	unsigned long t;
> +	int ret;
> 
>  	ret = vmbus_open(dev->channel, dm_ring_size, dm_ring_size, NULL, 0,
> -			balloon_onchannelcallback, dev);
> -
> +			 balloon_onchannelcallback, dev);
>  	if (ret)
>  		return ret;
> 
> -	dm_device.dev = dev;
> -	dm_device.state = DM_INITIALIZING;
> -	dm_device.next_version = DYNMEM_PROTOCOL_VERSION_WIN8;
> -	init_completion(&dm_device.host_event);
> -	init_completion(&dm_device.config_event);
> -	INIT_LIST_HEAD(&dm_device.ha_region_list);
> -	spin_lock_init(&dm_device.ha_lock);
> -	INIT_WORK(&dm_device.balloon_wrk.wrk, balloon_up);
> -	INIT_WORK(&dm_device.ha_wrk.wrk, hot_add_req);
> -	dm_device.host_specified_ha_region = false;
> -
> -	dm_device.thread =
> -		 kthread_run(dm_thread_func, &dm_device, "hv_balloon");
> -	if (IS_ERR(dm_device.thread)) {
> -		ret = PTR_ERR(dm_device.thread);
> -		goto probe_error1;
> -	}
> -
> -#ifdef CONFIG_MEMORY_HOTPLUG
> -	set_online_page_callback(&hv_online_page);
> -	register_memory_notifier(&hv_memory_nb);
> -#endif
> -
> -	hv_set_drvdata(dev, &dm_device);
>  	/*
>  	 * Initiate the hand shake with the host and negotiate
>  	 * a version that the host can support. We start with the
> @@ -1633,16 +1601,15 @@ static int balloon_probe(struct hv_device *dev,
>  	dm_device.version = version_req.version.version;
> 
>  	ret = vmbus_sendpacket(dev->channel, &version_req,
> -				sizeof(struct dm_version_request),
> -				(unsigned long)NULL,
> -				VM_PKT_DATA_INBAND, 0);
> +			       sizeof(struct dm_version_request),
> +			       (unsigned long)NULL, VM_PKT_DATA_INBAND, 0);
>  	if (ret)
> -		goto probe_error2;
> +		goto out;
> 
>  	t = wait_for_completion_timeout(&dm_device.host_event, 5*HZ);
>  	if (t == 0) {
>  		ret = -ETIMEDOUT;
> -		goto probe_error2;
> +		goto out;
>  	}
> 
>  	/*
> @@ -1650,8 +1617,8 @@ static int balloon_probe(struct hv_device *dev,
>  	 * fail the probe function.
>  	 */
>  	if (dm_device.state == DM_INIT_ERROR) {
> -		ret = -ETIMEDOUT;
> -		goto probe_error2;
> +		ret = -EPROTO;
> +		goto out;
>  	}
> 
>  	pr_info("Using Dynamic Memory protocol version %u.%u\n",
> @@ -1684,16 +1651,15 @@ static int balloon_probe(struct hv_device *dev,
>  	cap_msg.max_page_number = -1;
> 
>  	ret = vmbus_sendpacket(dev->channel, &cap_msg,
> -				sizeof(struct dm_capabilities),
> -				(unsigned long)NULL,
> -				VM_PKT_DATA_INBAND, 0);
> +			       sizeof(struct dm_capabilities),
> +			       (unsigned long)NULL, VM_PKT_DATA_INBAND, 0);
>  	if (ret)
> -		goto probe_error2;
> +		goto out;
> 
>  	t = wait_for_completion_timeout(&dm_device.host_event, 5*HZ);
>  	if (t == 0) {
>  		ret = -ETIMEDOUT;
> -		goto probe_error2;
> +		goto out;
>  	}
> 
>  	/*
> @@ -1701,23 +1667,65 @@ static int balloon_probe(struct hv_device *dev,
>  	 * fail the probe function.
>  	 */
>  	if (dm_device.state == DM_INIT_ERROR) {
> -		ret = -ETIMEDOUT;
> -		goto probe_error2;
> +		ret = -EPROTO;
> +		goto out;
>  	}
> 
> +	return 0;
> +out:
> +	vmbus_close(dev->channel);
> +	return ret;
> +}
> +
> +static int balloon_probe(struct hv_device *dev,
> +			 const struct hv_vmbus_device_id *dev_id)
> +{
> +	int ret;
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +	do_hot_add = hot_add;
> +#else
> +	do_hot_add = false;
> +#endif
> +	dm_device.dev = dev;
> +	dm_device.state = DM_INITIALIZING;
> +	dm_device.next_version = DYNMEM_PROTOCOL_VERSION_WIN8;
> +	init_completion(&dm_device.host_event);
> +	init_completion(&dm_device.config_event);
> +	INIT_LIST_HEAD(&dm_device.ha_region_list);
> +	spin_lock_init(&dm_device.ha_lock);
> +	INIT_WORK(&dm_device.balloon_wrk.wrk, balloon_up);
> +	INIT_WORK(&dm_device.ha_wrk.wrk, hot_add_req);
> +	dm_device.host_specified_ha_region = false;
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +	set_online_page_callback(&hv_online_page);
> +	register_memory_notifier(&hv_memory_nb);
> +#endif
> +
> +	hv_set_drvdata(dev, &dm_device);
> +
> +	ret = balloon_connect_vsp(dev);
> +	if (ret != 0)
> +		return ret;
> +
>  	dm_device.state = DM_INITIALIZED;
> -	last_post_time = jiffies;

I was curious about the above deletion.  But I guess the line
is not needed as the time_after() check in post_status() should
handle an initial value of 0 for last_post_time just fine.

> +
> +	dm_device.thread =
> +		 kthread_run(dm_thread_func, &dm_device, "hv_balloon");
> +	if (IS_ERR(dm_device.thread)) {
> +		ret = PTR_ERR(dm_device.thread);
> +		goto probe_error;
> +	}

Just an observation:  this thread creation now happens at the end of the
probing process.  But that's good, because in the old code, the thread
was started and could run before the protocol version had been
negotiated.  So I'll assume your change here is intentional.

> 
>  	return 0;
> 
> -probe_error2:
> +probe_error:
> +	vmbus_close(dev->channel);
>  #ifdef CONFIG_MEMORY_HOTPLUG
> +	unregister_memory_notifier(&hv_memory_nb);

Hmmm. Evidently the above cleanup was missing in the
old code.

>  	restore_online_page_callback(&hv_online_page);
>  #endif
> -	kthread_stop(dm_device.thread);
> -
> -probe_error1:
> -	vmbus_close(dev->channel);
>  	return ret;
>  }
> 
> @@ -1734,11 +1742,11 @@ static int balloon_remove(struct hv_device *dev)
>  	cancel_work_sync(&dm->balloon_wrk.wrk);
>  	cancel_work_sync(&dm->ha_wrk.wrk);
> 
> -	vmbus_close(dev->channel);
>  	kthread_stop(dm->thread);
> +	vmbus_close(dev->channel);

Presumably this is an intentional ordering change as well.
The kthread should be stopped before closing the channel.

>  #ifdef CONFIG_MEMORY_HOTPLUG
> -	restore_online_page_callback(&hv_online_page);
>  	unregister_memory_notifier(&hv_memory_nb);
> +	restore_online_page_callback(&hv_online_page);

And you've changed the ordering of these steps so they are
the inverse of when they are set up.  Also a good cleanup ....

>  #endif
>  	spin_lock_irqsave(&dm_device.ha_lock, flags);
>  	list_for_each_entry_safe(has, tmp, &dm->ha_region_list, list) {
> --
> 2.19.1

Reviewed-by: Michael Kelley <mikelley@microsoft.com>


* Re: [PATCH] x86/hyperv: Disable preemption while setting reenlightenment vector
From: Vitaly Kuznetsov @ 2019-06-14 21:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Prasanna Panchamukhi, Andy Lutomirski,
	Borislav Petkov, Cathy Avery, Haiyang Zhang, H. Peter Anvin,
	Ingo Molnar, K. Y. Srinivasan, Michael Kelley (EOSG),
	Mohammed Gamal, Paolo Bonzini, Radim Krčmář,
	Roman Kagan, Sasha Levin, Stephen Hemminger, Thomas Gleixner,
	devel, kvm, linux-hyperv, x86, Dmitry Safonov
In-Reply-To: <20190614122726.GL3436@hirez.programming.kicks-ass.net>

Peter Zijlstra <peterz@infradead.org> writes:

>
> I know you probably can't change the HV interface, but I'm thinking its
> rather daft you have to specify a CPU at all for this. The HV can just
> pick one and send the notification there, who cares.

Generally speaking, the hypervisor can't know whether a CPU is offline (or
e.g. 'isolated') from the guest's perspective, so I think having an option to
specify the affinity of the reenlightenment notification is a good thing,
not a bad one.

(Actually, I don't remember if I tried specifying 'HV_ANY' (U32_MAX-1)
here to see what happens. But then I doubt it'll notice the fact that we 
offlined some CPU so we may get a totally unexpected IRQ there).

-- 
Vitaly


* Re: [PATCH] x86/hyperv: Disable preemption while setting reenlightenment vector
From: Vitaly Kuznetsov @ 2019-06-14 21:36 UTC (permalink / raw)
  To: Dmitry Safonov, Peter Zijlstra
  Cc: linux-kernel, Prasanna Panchamukhi, Andy Lutomirski,
	Borislav Petkov, Cathy Avery, Haiyang Zhang, H. Peter Anvin,
	Ingo Molnar, K. Y. Srinivasan, Michael Kelley (EOSG),
	Mohammed Gamal, Paolo Bonzini, Radim Krčmář,
	Roman Kagan, Sasha Levin, Stephen Hemminger, Thomas Gleixner,
	devel, kvm, linux-hyperv, x86
In-Reply-To: <cb9e1645-98c2-4341-d6da-4effa4f57fb1@arista.com>

Dmitry Safonov <dima@arista.com> writes:

> On 6/14/19 11:08 AM, Vitaly Kuznetsov wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>> 
>>> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
>>>  	struct hv_reenlightenment_control re_ctrl = {
>>>  		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
>>>  		.enabled = 1,
>>> -		.target_vp = hv_vp_index[smp_processor_id()]
>>> +		.target_vp = hv_vp_index[raw_smp_processor_id()]
>>>  	};
>>>  	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>>>  
>> 
>> Yes, this should do, thanks! I'd also suggest to leave a comment like
>> 	/*
>> 	 * This function can get preempted and migrate to a different CPU
>> 	 * but this doesn't matter. We just need to assign
>> 	 * reenlightenment notification to some online CPU. In case this
>> 	 * CPU goes offline, hv_cpu_die() will re-assign it to some
>> 	 * other online CPU.
>> 	 */
>
> What if the cpu goes down just before wrmsrl()?
> I mean, hv_cpu_die() will reassign another cpu, but this thread will be
> resumed on some other cpu and will write cpu number which is at that
> moment already down?
>

Right you are, we need to guarantee wrmsr() happens before the CPU gets
a chance to go offline: we don't save the cpu number anywhere, we just
read it with rdmsr() in hv_cpu_die().

>
> And I presume it's guaranteed that during hv_cpu_die() no other cpu may
> go down:
> :	new_cpu = cpumask_any_but(cpu_online_mask, cpu);
> :	re_ctrl.target_vp = hv_vp_index[new_cpu];
> :	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));

I *think* I got convinced that CPUs don't go offline simultaneously when
I was writing this.

-- 
Vitaly


* RE: [PATCH 1/2] hv_balloon: Use a static page for the balloon_up send buffer
From: Michael Kelley @ 2019-06-14 20:56 UTC (permalink / raw)
  To: Dexuan Cui, linux-hyperv@vger.kernel.org,
	gregkh@linuxfoundation.org, Stephen Hemminger, Sasha Levin,
	Haiyang Zhang, KY Srinivasan, linux-kernel@vger.kernel.org,
	Tianyu Lan
  Cc: olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <1560537692-37400-1-git-send-email-decui@microsoft.com>

From: Dexuan Cui <decui@microsoft.com>  Sent: Friday, June 14, 2019 11:42 AM
> 
> It's unnecessary to dynamically allocate the buffer.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
>  drivers/hv/hv_balloon.c | 19 ++++---------------
>  1 file changed, 4 insertions(+), 15 deletions(-)
> 

Reviewed-by: Michael Kelley <mikelley@microsoft.com>


* RE: [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Michael Kelley @ 2019-06-14 20:48 UTC (permalink / raw)
  To: Dexuan Cui, linux-acpi@vger.kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <1560536224-35338-1-git-send-email-decui@microsoft.com>

From: Dexuan Cui <decui@microsoft.com>  Sent: Friday, June 14, 2019 11:19 AM
> 
> In a Linux VM running on Hyper-V, when ACPI S4 is enabled, the balloon
> driver (drivers/hv/hv_balloon.c) needs to ask the host not to do memory
> hot-add/remove.
> 
> So let's export acpi_sleep_state_supported() for the hv_balloon driver.
> This might also be useful to the other drivers in the future.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
>  drivers/acpi/sleep.c    | 3 ++-
>  include/acpi/acpi_bus.h | 2 ++
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
> index a34deccd7317..69755411e008 100644
> --- a/drivers/acpi/sleep.c
> +++ b/drivers/acpi/sleep.c
> @@ -79,7 +79,7 @@ static int acpi_sleep_prepare(u32 acpi_state)
>  	return 0;
>  }
> 
> -static bool acpi_sleep_state_supported(u8 sleep_state)
> +bool acpi_sleep_state_supported(u8 sleep_state)
>  {
>  	acpi_status status;
>  	u8 type_a, type_b;
> @@ -89,6 +89,7 @@ static bool acpi_sleep_state_supported(u8 sleep_state)
>  		|| (acpi_gbl_FADT.sleep_control.address
>  			&& acpi_gbl_FADT.sleep_status.address));
>  }
> +EXPORT_SYMBOL_GPL(acpi_sleep_state_supported);
> 
>  #ifdef CONFIG_ACPI_SLEEP
>  static u32 acpi_target_sleep_state = ACPI_STATE_S0;
> diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
> index 31b6c87d6240..5b102e7bbf25 100644
> --- a/include/acpi/acpi_bus.h
> +++ b/include/acpi/acpi_bus.h
> @@ -651,6 +651,8 @@ static inline int acpi_pm_set_bridge_wakeup(struct device *dev, bool enable)
>  }
>  #endif
> 
> +bool acpi_sleep_state_supported(u8 sleep_state);
> +
>  #ifdef CONFIG_ACPI_SLEEP
>  u32 acpi_target_system_state(void);
>  #else
> --
> 2.19.1

It seems that sleep.c isn't built when on the ARM64 architecture.  Using
acpi_sleep_state_supported() directly in hv_balloon.c will be problematic
since hv_balloon.c needs to be architecture independent when the
Hyper-V ARM64 support is added.  If that doesn't change, a per-architecture
wrapper will be needed to give hv_balloon.c the correct information.  This
may affect whether acpi_sleep_state_supported() needs to be exported vs.
just removing the "static".   I'm not sure what the best approach is.
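
One possible shape for such a per-architecture wrapper, as a userspace
sketch only. All names here (SKETCH_ARCH_HAS_ACPI_SLEEP,
sketch_acpi_state_supported(), sketch_hibernation_supported()) are
hypothetical and not taken from the kernel, and the stand-in query simply
pretends that firmware advertises S4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical switch: defined on architectures that build the ACPI sleep code. */
#define SKETCH_ARCH_HAS_ACPI_SLEEP 1

#define SKETCH_ACPI_STATE_S4 4	/* hypothetical stand-in for ACPI_STATE_S4 */

#ifdef SKETCH_ARCH_HAS_ACPI_SLEEP
/* Stand-in for the ACPI query; this model pretends firmware advertises only S4. */
static bool sketch_acpi_state_supported(uint8_t sleep_state)
{
	assert(sleep_state <= 5);	/* ACPI defines S0 through S5 */
	return sleep_state == SKETCH_ACPI_STATE_S4;
}
#endif

/* Architecture-neutral entry point that a driver could call. */
static bool sketch_hibernation_supported(void)
{
#ifdef SKETCH_ARCH_HAS_ACPI_SLEEP
	return sketch_acpi_state_supported(SKETCH_ACPI_STATE_S4);
#else
	/* Placeholder: a real wrapper needs an architecture-specific answer. */
	return false;
#endif
}
```

The driver would then call only the neutral entry point, and each
architecture decides how to answer. Whether the underlying ACPI function
must be exported or merely made non-static depends on where such a
wrapper ends up living.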

Michael


* [PATCH 2/2] hv_balloon: Reorganize the probe function
From: Dexuan Cui @ 2019-06-14 18:42 UTC (permalink / raw)
  To: linux-hyperv@vger.kernel.org, gregkh@linuxfoundation.org,
	Stephen Hemminger, Sasha Levin, Haiyang Zhang, KY Srinivasan,
	linux-kernel@vger.kernel.org, Michael Kelley, Tianyu Lan
  Cc: olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com, Dexuan Cui
In-Reply-To: <1560537692-37400-1-git-send-email-decui@microsoft.com>

Move the code that negotiates with the host to a new function
balloon_connect_vsp() and improve the error handling.

This makes the code more readable and paves the way for
hibernation support in the future.

No real logic change is made here.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 drivers/hv/hv_balloon.c | 124 +++++++++++++++++++++-------------------
 1 file changed, 66 insertions(+), 58 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 13381ea3e3e7..111ea3599659 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -1574,50 +1574,18 @@ static void balloon_onchannelcallback(void *context)
 
 }
 
-static int balloon_probe(struct hv_device *dev,
-			const struct hv_vmbus_device_id *dev_id)
+static int balloon_connect_vsp(struct hv_device *dev)
 {
-	int ret;
-	unsigned long t;
 	struct dm_version_request version_req;
 	struct dm_capabilities cap_msg;
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-	do_hot_add = hot_add;
-#else
-	do_hot_add = false;
-#endif
+	unsigned long t;
+	int ret;
 
 	ret = vmbus_open(dev->channel, dm_ring_size, dm_ring_size, NULL, 0,
-			balloon_onchannelcallback, dev);
-
+			 balloon_onchannelcallback, dev);
 	if (ret)
 		return ret;
 
-	dm_device.dev = dev;
-	dm_device.state = DM_INITIALIZING;
-	dm_device.next_version = DYNMEM_PROTOCOL_VERSION_WIN8;
-	init_completion(&dm_device.host_event);
-	init_completion(&dm_device.config_event);
-	INIT_LIST_HEAD(&dm_device.ha_region_list);
-	spin_lock_init(&dm_device.ha_lock);
-	INIT_WORK(&dm_device.balloon_wrk.wrk, balloon_up);
-	INIT_WORK(&dm_device.ha_wrk.wrk, hot_add_req);
-	dm_device.host_specified_ha_region = false;
-
-	dm_device.thread =
-		 kthread_run(dm_thread_func, &dm_device, "hv_balloon");
-	if (IS_ERR(dm_device.thread)) {
-		ret = PTR_ERR(dm_device.thread);
-		goto probe_error1;
-	}
-
-#ifdef CONFIG_MEMORY_HOTPLUG
-	set_online_page_callback(&hv_online_page);
-	register_memory_notifier(&hv_memory_nb);
-#endif
-
-	hv_set_drvdata(dev, &dm_device);
 	/*
 	 * Initiate the hand shake with the host and negotiate
 	 * a version that the host can support. We start with the
@@ -1633,16 +1601,15 @@ static int balloon_probe(struct hv_device *dev,
 	dm_device.version = version_req.version.version;
 
 	ret = vmbus_sendpacket(dev->channel, &version_req,
-				sizeof(struct dm_version_request),
-				(unsigned long)NULL,
-				VM_PKT_DATA_INBAND, 0);
+			       sizeof(struct dm_version_request),
+			       (unsigned long)NULL, VM_PKT_DATA_INBAND, 0);
 	if (ret)
-		goto probe_error2;
+		goto out;
 
 	t = wait_for_completion_timeout(&dm_device.host_event, 5*HZ);
 	if (t == 0) {
 		ret = -ETIMEDOUT;
-		goto probe_error2;
+		goto out;
 	}
 
 	/*
@@ -1650,8 +1617,8 @@ static int balloon_probe(struct hv_device *dev,
 	 * fail the probe function.
 	 */
 	if (dm_device.state == DM_INIT_ERROR) {
-		ret = -ETIMEDOUT;
-		goto probe_error2;
+		ret = -EPROTO;
+		goto out;
 	}
 
 	pr_info("Using Dynamic Memory protocol version %u.%u\n",
@@ -1684,16 +1651,15 @@ static int balloon_probe(struct hv_device *dev,
 	cap_msg.max_page_number = -1;
 
 	ret = vmbus_sendpacket(dev->channel, &cap_msg,
-				sizeof(struct dm_capabilities),
-				(unsigned long)NULL,
-				VM_PKT_DATA_INBAND, 0);
+			       sizeof(struct dm_capabilities),
+			       (unsigned long)NULL, VM_PKT_DATA_INBAND, 0);
 	if (ret)
-		goto probe_error2;
+		goto out;
 
 	t = wait_for_completion_timeout(&dm_device.host_event, 5*HZ);
 	if (t == 0) {
 		ret = -ETIMEDOUT;
-		goto probe_error2;
+		goto out;
 	}
 
 	/*
@@ -1701,23 +1667,65 @@ static int balloon_probe(struct hv_device *dev,
 	 * fail the probe function.
 	 */
 	if (dm_device.state == DM_INIT_ERROR) {
-		ret = -ETIMEDOUT;
-		goto probe_error2;
+		ret = -EPROTO;
+		goto out;
 	}
 
+	return 0;
+out:
+	vmbus_close(dev->channel);
+	return ret;
+}
+
+static int balloon_probe(struct hv_device *dev,
+			 const struct hv_vmbus_device_id *dev_id)
+{
+	int ret;
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+	do_hot_add = hot_add;
+#else
+	do_hot_add = false;
+#endif
+	dm_device.dev = dev;
+	dm_device.state = DM_INITIALIZING;
+	dm_device.next_version = DYNMEM_PROTOCOL_VERSION_WIN8;
+	init_completion(&dm_device.host_event);
+	init_completion(&dm_device.config_event);
+	INIT_LIST_HEAD(&dm_device.ha_region_list);
+	spin_lock_init(&dm_device.ha_lock);
+	INIT_WORK(&dm_device.balloon_wrk.wrk, balloon_up);
+	INIT_WORK(&dm_device.ha_wrk.wrk, hot_add_req);
+	dm_device.host_specified_ha_region = false;
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+	set_online_page_callback(&hv_online_page);
+	register_memory_notifier(&hv_memory_nb);
+#endif
+
+	hv_set_drvdata(dev, &dm_device);
+
+	ret = balloon_connect_vsp(dev);
+	if (ret != 0)
+		return ret;
+
 	dm_device.state = DM_INITIALIZED;
-	last_post_time = jiffies;
+
+	dm_device.thread =
+		 kthread_run(dm_thread_func, &dm_device, "hv_balloon");
+	if (IS_ERR(dm_device.thread)) {
+		ret = PTR_ERR(dm_device.thread);
+		goto probe_error;
+	}
 
 	return 0;
 
-probe_error2:
+probe_error:
+	vmbus_close(dev->channel);
 #ifdef CONFIG_MEMORY_HOTPLUG
+	unregister_memory_notifier(&hv_memory_nb);
 	restore_online_page_callback(&hv_online_page);
 #endif
-	kthread_stop(dm_device.thread);
-
-probe_error1:
-	vmbus_close(dev->channel);
 	return ret;
 }
 
@@ -1734,11 +1742,11 @@ static int balloon_remove(struct hv_device *dev)
 	cancel_work_sync(&dm->balloon_wrk.wrk);
 	cancel_work_sync(&dm->ha_wrk.wrk);
 
-	vmbus_close(dev->channel);
 	kthread_stop(dm->thread);
+	vmbus_close(dev->channel);
 #ifdef CONFIG_MEMORY_HOTPLUG
-	restore_online_page_callback(&hv_online_page);
 	unregister_memory_notifier(&hv_memory_nb);
+	restore_online_page_callback(&hv_online_page);
 #endif
 	spin_lock_irqsave(&dm_device.ha_lock, flags);
 	list_for_each_entry_safe(has, tmp, &dm->ha_region_list, list) {
-- 
2.19.1
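
The error-handling shape used by the patch above (a connect helper that
closes what it opened, and a probe function that unwinds through labels)
can be sketched in userspace. This is only a model under stated
assumptions: every `model_*` name, the `trace` buffer and the return codes
are hypothetical and not part of hv_balloon.c. The sketch unwinds the
notifier on both failure paths, which is a choice of the sketch and not a
description of the patch.

```c
#include <assert.h>
#include <string.h>

char trace[16];			/* one letter per modelled call, in order */

void trace_reset(void)
{
	trace[0] = '\0';
}

void step(char c)
{
	size_t n = strlen(trace);

	if (n + 1 < sizeof(trace)) {
		trace[n] = c;
		trace[n + 1] = '\0';
	}
}

/*
 * Models balloon_connect_vsp(): opens the channel and, if the
 * handshake fails, closes it again before returning the error.
 */
int model_connect_vsp(int handshake_fails)
{
	step('O');			/* stands in for vmbus_open() */
	if (handshake_fails) {
		step('C');		/* stands in for vmbus_close() */
		return -1;
	}
	return 0;
}

/* Models the probe function: later failures undo earlier steps in reverse. */
int model_probe(int handshake_fails, int thread_fails)
{
	int ret;

	trace_reset();
	step('R');			/* stands in for registering the notifier */

	ret = model_connect_vsp(handshake_fails);
	if (ret)
		goto unregister;

	if (thread_fails) {		/* stands in for a kthread_run() error */
		ret = -2;
		goto close_channel;
	}
	step('T');			/* thread started */
	return 0;

close_channel:
	step('C');
unregister:
	step('U');
	return ret;
}
```

After a call to `model_probe()`, `trace` holds the order in which the
modelled steps ran, so the teardown order on each path can be inspected
directly.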


^ permalink raw reply related

* [PATCH 1/2] hv_balloon: Use a static page for the balloon_up send buffer
From: Dexuan Cui @ 2019-06-14 18:42 UTC (permalink / raw)
  To: linux-hyperv@vger.kernel.org, gregkh@linuxfoundation.org,
	Stephen Hemminger, Sasha Levin, Haiyang Zhang, KY Srinivasan,
	linux-kernel@vger.kernel.org, Michael Kelley, Tianyu Lan
  Cc: olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com, Dexuan Cui

It's unnecessary to dynamically allocate the buffer.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 drivers/hv/hv_balloon.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index dd475f3bcc8a..13381ea3e3e7 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -504,7 +504,7 @@ enum hv_dm_state {
 
 
 static __u8 recv_buffer[PAGE_SIZE];
-static __u8 *send_buffer;
+static __u8 balloon_up_send_buffer[PAGE_SIZE];
 #define PAGES_IN_2M	512
 #define HA_CHUNK (32 * 1024)
 
@@ -1302,8 +1302,8 @@ static void balloon_up(struct work_struct *dummy)
 	}
 
 	while (!done) {
-		bl_resp = (struct dm_balloon_response *)send_buffer;
-		memset(send_buffer, 0, PAGE_SIZE);
+		memset(balloon_up_send_buffer, 0, PAGE_SIZE);
+		bl_resp = (struct dm_balloon_response *)balloon_up_send_buffer;
 		bl_resp->hdr.type = DM_BALLOON_RESPONSE;
 		bl_resp->hdr.size = sizeof(struct dm_balloon_response);
 		bl_resp->more_pages = 1;
@@ -1588,19 +1588,11 @@ static int balloon_probe(struct hv_device *dev,
 	do_hot_add = false;
 #endif
 
-	/*
-	 * First allocate a send buffer.
-	 */
-
-	send_buffer = kmalloc(PAGE_SIZE, GFP_KERNEL);
-	if (!send_buffer)
-		return -ENOMEM;
-
 	ret = vmbus_open(dev->channel, dm_ring_size, dm_ring_size, NULL, 0,
 			balloon_onchannelcallback, dev);
 
 	if (ret)
-		goto probe_error0;
+		return ret;
 
 	dm_device.dev = dev;
 	dm_device.state = DM_INITIALIZING;
@@ -1726,8 +1718,6 @@ static int balloon_probe(struct hv_device *dev,
 
 probe_error1:
 	vmbus_close(dev->channel);
-probe_error0:
-	kfree(send_buffer);
 	return ret;
 }
 
@@ -1746,7 +1736,6 @@ static int balloon_remove(struct hv_device *dev)
 
 	vmbus_close(dev->channel);
 	kthread_stop(dm->thread);
-	kfree(send_buffer);
 #ifdef CONFIG_MEMORY_HOTPLUG
 	restore_online_page_callback(&hv_online_page);
 	unregister_memory_notifier(&hv_memory_nb);
-- 
2.19.1


^ permalink raw reply related

* [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Dexuan Cui @ 2019-06-14 18:19 UTC (permalink / raw)
  To: linux-acpi@vger.kernel.org, rjw@rjwysocki.net, lenb@kernel.org,
	robert.moore@intel.com, erik.schmauss@intel.com, Michael Kelley
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com, Dexuan Cui

In a Linux VM running on Hyper-V, when ACPI S4 is enabled, the balloon
driver (drivers/hv/hv_balloon.c) needs to ask the host not to do memory
hot-add/remove.

So let's export acpi_sleep_state_supported() for the hv_balloon driver.
This might also be useful to other drivers in the future.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 drivers/acpi/sleep.c    | 3 ++-
 include/acpi/acpi_bus.h | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index a34deccd7317..69755411e008 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -79,7 +79,7 @@ static int acpi_sleep_prepare(u32 acpi_state)
 	return 0;
 }
 
-static bool acpi_sleep_state_supported(u8 sleep_state)
+bool acpi_sleep_state_supported(u8 sleep_state)
 {
 	acpi_status status;
 	u8 type_a, type_b;
@@ -89,6 +89,7 @@ static bool acpi_sleep_state_supported(u8 sleep_state)
 		|| (acpi_gbl_FADT.sleep_control.address
 			&& acpi_gbl_FADT.sleep_status.address));
 }
+EXPORT_SYMBOL_GPL(acpi_sleep_state_supported);
 
 #ifdef CONFIG_ACPI_SLEEP
 static u32 acpi_target_sleep_state = ACPI_STATE_S0;
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 31b6c87d6240..5b102e7bbf25 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -651,6 +651,8 @@ static inline int acpi_pm_set_bridge_wakeup(struct device *dev, bool enable)
 }
 #endif
 
+bool acpi_sleep_state_supported(u8 sleep_state);
+
 #ifdef CONFIG_ACPI_SLEEP
 u32 acpi_target_system_state(void);
 #else
-- 
2.19.1


^ permalink raw reply related

* Re: [PATCH] x86/hyperv: Disable preemption while setting reenlightenment vector
From: Dmitry Safonov @ 2019-06-14 14:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vitaly Kuznetsov, linux-kernel, Prasanna Panchamukhi,
	Andy Lutomirski, Borislav Petkov, Cathy Avery, Haiyang Zhang,
	H. Peter Anvin, Ingo Molnar, K. Y. Srinivasan,
	Michael Kelley (EOSG), Mohammed Gamal, Paolo Bonzini,
	Radim Krčmář, Roman Kagan, Sasha Levin,
	Stephen Hemminger, Thomas Gleixner, devel, kvm, linux-hyperv, x86
In-Reply-To: <20190614122726.GL3436@hirez.programming.kicks-ass.net>



On 6/14/19 1:27 PM, Peter Zijlstra wrote:
> On Fri, Jun 14, 2019 at 12:50:51PM +0100, Dmitry Safonov wrote:
>> On 6/14/19 11:08 AM, Vitaly Kuznetsov wrote:
>>> Peter Zijlstra <peterz@infradead.org> writes:
>>>
>>>> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
>>>>  	struct hv_reenlightenment_control re_ctrl = {
>>>>  		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
>>>>  		.enabled = 1,
>>>> -		.target_vp = hv_vp_index[smp_processor_id()]
>>>> +		.target_vp = hv_vp_index[raw_smp_processor_id()]
>>>>  	};
>>>>  	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>>>>  
>>>
>>> Yes, this should do, thanks! I'd also suggest to leave a comment like
>>> 	/* 
>>>          * This function can get preempted and migrate to a different CPU
>>> 	 * but this doesn't matter. We just need to assign
>>> 	 * reenlightenment notification to some online CPU. In case this
>>>          * CPU goes offline, hv_cpu_die() will re-assign it to some
>>>  	 * other online CPU.
>>> 	 */
>>
>> What if the cpu goes down just before wrmsrl()?
>> I mean, hv_cpu_die() will reassign another cpu, but this thread will be
>> resumed on some other cpu and will write cpu number which is at that
>> moment already down?
>>
>> (probably I miss something)
>>
>> And I presume it's guaranteed that during hv_cpu_die() no other cpu may
>> go down:
>> :	new_cpu = cpumask_any_but(cpu_online_mask, cpu);
>> :	re_ctrl.target_vp = hv_vp_index[new_cpu];
>> :	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
> 
> Then cpus_read_lock() is the right interface, not preempt_disable().
> 
> I know you probably can't change the HV interface, but I'm thinking it's
> rather daft you have to specify a CPU at all for this. The HV can just
> pick one and send the notification there, who cares.

Heh, I thought cpus_read_lock() is a more "internal" API and
preempt_disable() is preferred ;-)

Will send v2 with the suggested comment and cpus_read_lock().

-- 
          Dima
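
The reasoning in this subthread (hold off CPU hotplug while the target CPU
is chosen and written) can be modelled in userspace. This is a toy sketch
under loud assumptions: all `model_*` names are hypothetical, the "lock" is
a plain counter, and a real hotplug writer would block instead of failing.
It is not the v2 patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

bool model_cpu_online[MODEL_NR_CPUS] = { true, true, true, true };
int model_hotplug_readers;	/* > 0 models "CPU hotplug is held off" */
int model_target_cpu = -1;	/* models the CPU written to the MSR */

void model_cpus_read_lock(void)
{
	model_hotplug_readers++;
}

void model_cpus_read_unlock(void)
{
	model_hotplug_readers--;
}

/*
 * Models taking a CPU offline. The request is refused while a reader
 * holds the lock; the real writer would wait instead of failing.
 */
bool model_cpu_down(int cpu)
{
	if (model_hotplug_readers > 0)
		return false;
	model_cpu_online[cpu] = false;
	return true;
}

/*
 * Models the tail of set_hv_tscchange_cb(): choose a CPU and "write"
 * it while hotplug is held off, so the chosen CPU is still online at
 * the time of the write.
 */
void model_set_target(int current_cpu)
{
	model_cpus_read_lock();
	assert(model_cpu_online[current_cpu]);
	model_target_cpu = current_cpu;
	model_cpus_read_unlock();
}
```

In this model, disabling preemption alone would correspond to fixing
`current_cpu` without touching `model_hotplug_readers`, which is why the
read lock is the piece that addresses the "CPU goes down just before
wrmsrl()" question.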

^ permalink raw reply

* Re: [PATCH] x86/hyperv: Disable preemption while setting reenlightenment vector
From: Peter Zijlstra @ 2019-06-14 12:27 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Vitaly Kuznetsov, linux-kernel, Prasanna Panchamukhi,
	Andy Lutomirski, Borislav Petkov, Cathy Avery, Haiyang Zhang,
	H. Peter Anvin, Ingo Molnar, K. Y. Srinivasan,
	Michael Kelley (EOSG), Mohammed Gamal, Paolo Bonzini,
	Radim Krčmář, Roman Kagan, Sasha Levin,
	Stephen Hemminger, Thomas Gleixner, devel, kvm, linux-hyperv, x86
In-Reply-To: <cb9e1645-98c2-4341-d6da-4effa4f57fb1@arista.com>

On Fri, Jun 14, 2019 at 12:50:51PM +0100, Dmitry Safonov wrote:
> On 6/14/19 11:08 AM, Vitaly Kuznetsov wrote:
> > Peter Zijlstra <peterz@infradead.org> writes:
> > 
> >> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
> >>  	struct hv_reenlightenment_control re_ctrl = {
> >>  		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
> >>  		.enabled = 1,
> >> -		.target_vp = hv_vp_index[smp_processor_id()]
> >> +		.target_vp = hv_vp_index[raw_smp_processor_id()]
> >>  	};
> >>  	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
> >>  
> > 
> > Yes, this should do, thanks! I'd also suggest to leave a comment like
> > 	/* 
> >          * This function can get preempted and migrate to a different CPU
> > 	 * but this doesn't matter. We just need to assign
> > 	 * reenlightenment notification to some online CPU. In case this
> >          * CPU goes offline, hv_cpu_die() will re-assign it to some
> >  	 * other online CPU.
> > 	 */
> 
> What if the cpu goes down just before wrmsrl()?
> I mean, hv_cpu_die() will reassign another cpu, but this thread will be
> resumed on some other cpu and will write cpu number which is at that
> moment already down?
> 
> (probably I miss something)
> 
> And I presume it's guaranteed that during hv_cpu_die() no other cpu may
> go down:
> :	new_cpu = cpumask_any_but(cpu_online_mask, cpu);
> :	re_ctrl.target_vp = hv_vp_index[new_cpu];
> :	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));

Then cpus_read_lock() is the right interface, not preempt_disable().

I know you probably can't change the HV interface, but I'm thinking it's
rather daft you have to specify a CPU at all for this. The HV can just
pick one and send the notification there, who cares.

^ permalink raw reply

* Re: [PATCH] x86/hyperv: Disable preemption while setting reenlightenment vector
From: Dmitry Safonov @ 2019-06-14 11:50 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Peter Zijlstra
  Cc: linux-kernel, Prasanna Panchamukhi, Andy Lutomirski,
	Borislav Petkov, Cathy Avery, Haiyang Zhang, H. Peter Anvin,
	Ingo Molnar, K. Y. Srinivasan, Michael Kelley (EOSG),
	Mohammed Gamal, Paolo Bonzini, Radim Krčmář,
	Roman Kagan, Sasha Levin, Stephen Hemminger, Thomas Gleixner,
	devel, kvm, linux-hyperv, x86
In-Reply-To: <877e9o7a4e.fsf@vitty.brq.redhat.com>

On 6/14/19 11:08 AM, Vitaly Kuznetsov wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
> 
>> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
>>  	struct hv_reenlightenment_control re_ctrl = {
>>  		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
>>  		.enabled = 1,
>> -		.target_vp = hv_vp_index[smp_processor_id()]
>> +		.target_vp = hv_vp_index[raw_smp_processor_id()]
>>  	};
>>  	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>>  
> 
> Yes, this should do, thanks! I'd also suggest to leave a comment like
> 	/* 
>          * This function can get preempted and migrate to a different CPU
> 	 * but this doesn't matter. We just need to assign
> 	 * reenlightenment notification to some online CPU. In case this
>          * CPU goes offline, hv_cpu_die() will re-assign it to some
>  	 * other online CPU.
> 	 */

What if the cpu goes down just before wrmsrl()?
I mean, hv_cpu_die() will reassign another cpu, but this thread will be
resumed on some other cpu and will write cpu number which is at that
moment already down?

(probably I miss something)

And I presume it's guaranteed that during hv_cpu_die() no other cpu may
go down:
:	new_cpu = cpumask_any_but(cpu_online_mask, cpu);
:	re_ctrl.target_vp = hv_vp_index[new_cpu];
:	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));

-- 
          Dima

^ permalink raw reply

* Re: [PATCH] x86/hyperv: Disable preemption while setting reenlightenment vector
From: Vitaly Kuznetsov @ 2019-06-14 10:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dmitry Safonov, linux-kernel, Prasanna Panchamukhi,
	Andy Lutomirski, Borislav Petkov, Cathy Avery, Haiyang Zhang,
	H. Peter Anvin, Ingo Molnar, K. Y. Srinivasan,
	Michael Kelley (EOSG), Mohammed Gamal, Paolo Bonzini,
	Radim Krčmář, Roman Kagan, Sasha Levin,
	Stephen Hemminger, Thomas Gleixner, devel, kvm, linux-hyperv, x86
In-Reply-To: <20190614082807.GV3436@hirez.programming.kicks-ass.net>

Peter Zijlstra <peterz@infradead.org> writes:

> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
>  	struct hv_reenlightenment_control re_ctrl = {
>  		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
>  		.enabled = 1,
> -		.target_vp = hv_vp_index[smp_processor_id()]
> +		.target_vp = hv_vp_index[raw_smp_processor_id()]
>  	};
>  	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>  

Yes, this should do, thanks! I'd also suggest to leave a comment like
	/* 
         * This function can get preempted and migrate to a different CPU
	 * but this doesn't matter. We just need to assign
	 * reenlightenment notification to some online CPU. In case this
         * CPU goes offline, hv_cpu_die() will re-assign it to some
 	 * other online CPU.
	 */
  
-- 
Vitaly

^ permalink raw reply

* Re: [PATCH] x86/hyperv: Disable preemption while setting reenlightenment vector
From: Peter Zijlstra @ 2019-06-14  8:28 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Dmitry Safonov, linux-kernel, Prasanna Panchamukhi,
	Andy Lutomirski, Borislav Petkov, Cathy Avery, Haiyang Zhang,
	H. Peter Anvin, Ingo Molnar, K. Y. Srinivasan,
	Michael Kelley (EOSG), Mohammed Gamal, Paolo Bonzini,
	Radim Krčmář, Roman Kagan, Sasha Levin,
	Stephen Hemminger, Thomas Gleixner, devel, kvm, linux-hyperv, x86
In-Reply-To: <8736kff6q3.fsf@vitty.brq.redhat.com>

On Wed, Jun 12, 2019 at 12:17:24PM +0200, Vitaly Kuznetsov wrote:
> Dmitry Safonov <dima@arista.com> writes:
> 
> > KVM support may be compiled as dynamic module, which triggers the
> > following splat on modprobe:
> >
> >  KVM: vmx: using Hyper-V Enlightened VMCS
> >  BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/466 caller is debug_smp_processor_id+0x17/0x19
> >  CPU: 0 PID: 466 Comm: modprobe Kdump: loaded Not tainted 4.19.43 #1
> >  Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
> >  Call Trace:
> >   dump_stack+0x61/0x7e
> >   check_preemption_disabled+0xd4/0xe6
> >   debug_smp_processor_id+0x17/0x19
> >   set_hv_tscchange_cb+0x1b/0x89
> >   kvm_arch_init+0x14a/0x163 [kvm]
> >   kvm_init+0x30/0x259 [kvm]
> >   vmx_init+0xed/0x3db [kvm_intel]
> >   do_one_initcall+0x89/0x1bc
> >   do_init_module+0x5f/0x207
> >   load_module+0x1b34/0x209b
> >   __ia32_sys_init_module+0x17/0x19
> >   do_fast_syscall_32+0x121/0x1fa
> >   entry_SYSENTER_compat+0x7f/0x91
> 
> Hm, I never noticed this one, you probably need something like
> > CONFIG_PREEMPT enabled to see it.

CONFIG_DEBUG_PREEMPT

> > @@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
> >  static int hv_cpu_init(unsigned int cpu)
> >  {
> >  	u64 msr_vp_index;
> > -	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
> > +	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
> >  	void **input_arg;
> >  	struct page *pg;
> >  
> > @@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
> >  
> >  	hv_get_vp_index(msr_vp_index);
> >  
> > -	hv_vp_index[smp_processor_id()] = msr_vp_index;
> > +	hv_vp_index[cpu] = msr_vp_index;
> >  
> >  	if (msr_vp_index > hv_max_vp_index)
> >  		hv_max_vp_index = msr_vp_index;
> 
> The above is unrelated cleanup (as cpu == smp_processor_id() for
> > CPUHP_AP_ONLINE_DYN callbacks), right? As I'm pretty sure these can't be
> preempted.

Yeah, makes sense though.

> > @@ -182,7 +182,6 @@ void set_hv_tscchange_cb(void (*cb)(void))
> >  	struct hv_reenlightenment_control re_ctrl = {
> >  		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
> >  		.enabled = 1,
> > -		.target_vp = hv_vp_index[smp_processor_id()]
> >  	};
> >  	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
> >  
> > @@ -196,7 +195,11 @@ void set_hv_tscchange_cb(void (*cb)(void))
> >  	/* Make sure callback is registered before we write to MSRs */
> >  	wmb();
> >  
> > +	preempt_disable();
> > +	re_ctrl.target_vp = hv_vp_index[smp_processor_id()];
> >  	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
> > +	preempt_enable();
> > +
> 
> My personal preference would be to do something like
>    int cpu = get_cpu();
> 
>    ... set things up ...
> 
>    put_cpu();

If it doesn't matter, how about this then?

---
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 1608050e9df9..e58c693a9fce 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
 static int hv_cpu_init(unsigned int cpu)
 {
 	u64 msr_vp_index;
-	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
+	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
 	void **input_arg;
 	struct page *pg;
 
@@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
 
 	hv_get_vp_index(msr_vp_index);
 
-	hv_vp_index[smp_processor_id()] = msr_vp_index;
+	hv_vp_index[cpu] = msr_vp_index;
 
 	if (msr_vp_index > hv_max_vp_index)
 		hv_max_vp_index = msr_vp_index;
@@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
 	struct hv_reenlightenment_control re_ctrl = {
 		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
 		.enabled = 1,
-		.target_vp = hv_vp_index[smp_processor_id()]
+		.target_vp = hv_vp_index[raw_smp_processor_id()]
 	};
 	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
 

^ permalink raw reply related

* Re: [PATCH] x86/hyperv: Disable preemption while setting reenlightenment vector
From: Vitaly Kuznetsov @ 2019-06-14  8:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dmitry Safonov, linux-kernel, Prasanna Panchamukhi,
	Andy Lutomirski, Borislav Petkov, Cathy Avery, Haiyang Zhang,
	H. Peter Anvin, Ingo Molnar, K. Y. Srinivasan,
	Michael Kelley (EOSG), Mohammed Gamal, Paolo Bonzini,
	Radim Krčmář, Roman Kagan, Sasha Levin,
	Stephen Hemminger, devel, kvm, linux-hyperv, x86
In-Reply-To: <alpine.DEB.2.21.1906132059020.1791@nanos.tec.linutronix.de>

Thomas Gleixner <tglx@linutronix.de> writes:

> On Wed, 12 Jun 2019, Vitaly Kuznetsov wrote:
>> Dmitry Safonov <dima@arista.com> writes:
>> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
>> > index 1608050e9df9..0bdd79ecbff8 100644
>> > --- a/arch/x86/hyperv/hv_init.c
>> > +++ b/arch/x86/hyperv/hv_init.c
>> > @@ -91,7 +91,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
>> >  static int hv_cpu_init(unsigned int cpu)
>> >  {
>> >  	u64 msr_vp_index;
>> > -	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
>> > +	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
>> >  	void **input_arg;
>> >  	struct page *pg;
>> >  
>> > @@ -103,7 +103,7 @@ static int hv_cpu_init(unsigned int cpu)
>> >  
>> >  	hv_get_vp_index(msr_vp_index);
>> >  
>> > -	hv_vp_index[smp_processor_id()] = msr_vp_index;
>> > +	hv_vp_index[cpu] = msr_vp_index;
>> >  
>> >  	if (msr_vp_index > hv_max_vp_index)
>> >  		hv_max_vp_index = msr_vp_index;
>> 
>> The above is unrelated cleanup (as cpu == smp_processor_id() for
>> CPUHP_AP_ONLINE_DYN callbacks), right? As I'm pretty sure these can't be
>> preempted.
>
> They can be preempted, but they are guaranteed to run on the upcoming CPU,
> i.e. smp_processor_id() is allowed even in preemptible context as the task
> cannot migrate.
>

Ah, right, thanks! The guarantee that they don't migrate should be enough.

-- 
Vitaly

^ permalink raw reply

* RE: [PATCH v2 4/5] HID: hv: Remove dependencies on PAGE_SIZE for ring buffer
From: Vitaly Kuznetsov @ 2019-06-14  7:53 UTC (permalink / raw)
  To: Michael Kelley, m.maya.nakamura
  Cc: x86@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger, sashal@kernel.org
In-Reply-To: <BL0PR2101MB134877ED5DCB9F23033C92D9D7EF0@BL0PR2101MB1348.namprd21.prod.outlook.com>

Michael Kelley <mikelley@microsoft.com> writes:

> From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Wednesday, June 12, 2019 3:40 AM
>> Maya Nakamura <m.maya.nakamura@gmail.com> writes:
>> 
>> > Define the ring buffer size as a constant expression because it should
>> > not depend on the guest page size.
>> >
>> > Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
>> > ---
>> >  drivers/hid/hid-hyperv.c | 4 ++--
>> >  1 file changed, 2 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/hid/hid-hyperv.c b/drivers/hid/hid-hyperv.c
>> > index d3311d714d35..e8b154fa38e2 100644
>> > --- a/drivers/hid/hid-hyperv.c
>> > +++ b/drivers/hid/hid-hyperv.c
>> > @@ -112,8 +112,8 @@ struct synthhid_input_report {
>> >
>> >  #pragma pack(pop)
>> >
>> > -#define INPUTVSC_SEND_RING_BUFFER_SIZE		(10*PAGE_SIZE)
>> > -#define INPUTVSC_RECV_RING_BUFFER_SIZE		(10*PAGE_SIZE)
>> > +#define INPUTVSC_SEND_RING_BUFFER_SIZE		(40 * 1024)
>> > +#define INPUTVSC_RECV_RING_BUFFER_SIZE		(40 * 1024)
>> >
>> 
>> My understanding is that this size is pretty arbitrary and as I see you
>> use it for hyperv-keyboard.c as well. It may make sense to have a
>> define, something like HYPERV_STD_RINGBUFFER_SIZE.
>
> Yes, the size is pretty arbitrary because it hasn't been important enough
> from a memory consumption or performance standpoint to run experiments
> to see if a smaller value could be used.  That said, I would not want to
> link these two devices (keyboard and mouse) by using a shared ring buffer
> size definition.  Logically, the ring buffer sizes are independent of each other,
> and using a common #define implies that they are somehow linked.

Ok, makes sense, let's keep them separate.

-- 
Vitaly
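
To make the page-size dependence discussed above concrete, here is a
minimal userspace sketch (not kernel code). `ring_size_from_pages()` is a
hypothetical helper that mirrors the old `(10*PAGE_SIZE)` definition, and
the 64 KiB guest page size is an assumed example value.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed size proposed by the patch: independent of the guest page size. */
#define INPUTVSC_RING_BUFFER_SIZE_FIXED ((size_t)40 * 1024)

/* Hypothetical helper modelling the old definition, (10*PAGE_SIZE). */
size_t ring_size_from_pages(size_t page_size)
{
	return 10 * page_size;
}
```

With 4 KiB pages both definitions give 40960 bytes, so the patch changes
nothing there. With an assumed 64 KiB page size the old definition would
grow to 655360 bytes, while the fixed constant stays at 40960.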

^ permalink raw reply

* RE: [PATCH net] hvsock: fix epollout hang from race condition
From: Dexuan Cui @ 2019-06-13 23:35 UTC (permalink / raw)
  To: Sunil Muthuswamy, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Sasha Levin, David S. Miller, Michael Kelley
  Cc: netdev@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <MW2PR2101MB11164C6EEAA5C511B395EF3AC0EC0@MW2PR2101MB1116.namprd21.prod.outlook.com>

> From: Sunil Muthuswamy <sunilmut@microsoft.com>
> Sent: Wednesday, June 12, 2019 2:19 PM
>  ...
> The fix is to set the pending size to the default size and never change it.
> This way the host will always notify the guest whenever the writable space
> is bigger than the pending size. The host is already optimized to *only*
> notify the guest when the pending size threshold boundary is crossed and
> not every time.
> 
> This change also reduces the cpu usage somewhat since hv_stream_has_space()
> is in the hotpath of send:
> vsock_stream_sendmsg()->hv_stream_has_space()
> Earlier hv_stream_has_space was setting/clearing the pending size on every
> call.
> 
> Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>

Hi Sunil, thanks for the fix! It looks good.

Reviewed-by: Dexuan Cui <decui@microsoft.com>


^ permalink raw reply

* [PATCH net] hv_netvsc: Set probe mode to sync
From: Haiyang Zhang @ 2019-06-13 21:06 UTC (permalink / raw)
  To: sashal@kernel.org, linux-hyperv@vger.kernel.org,
	netdev@vger.kernel.org
  Cc: Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de,
	vkuznets, davem@davemloft.net, linux-kernel@vger.kernel.org

For better consistency of synthetic NIC names, we set the probe mode to
PROBE_FORCE_SYNCHRONOUS, so that the names are aligned with the vmbus
channel offer sequence.

Fixes: af0a5646cb8d ("use the new async probing feature for the hyperv drivers")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/net/hyperv/netvsc_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 03ea5a7..afdcc56 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -2407,7 +2407,7 @@ static int netvsc_remove(struct hv_device *dev)
 	.probe = netvsc_probe,
 	.remove = netvsc_remove,
 	.driver = {
-		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
+		.probe_type = PROBE_FORCE_SYNCHRONOUS,
 	},
 };
 
-- 
1.8.3.1


^ permalink raw reply related

* RE: [PATCH 10/13] megaraid_sas: set virt_boundary_mask in the scsi host
From: Kashyap Desai @ 2019-06-13 20:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Sebastian Ott, Sagi Grimberg, Max Gurtovoy,
	Bart Van Assche, Ulf Hansson, Alan Stern, Oliver Neukum,
	linux-block, linux-rdma, linux-mmc, linux-nvme, linux-scsi,
	PDL,MEGARAIDLINUX, PDL-MPT-FUSIONLINUX, linux-hyperv, linux-usb,
	usb-storage, linux-kernel
In-Reply-To: <20190613084458.GB13221@lst.de>

>
> So before I respin this series, can you help with a way to figure out for
> mpt3sas and megaraid if a given controller supports NVMe devices at all, so
> that we don't have to set the virt boundary if not?


In MegaRaid we have the enum below; VENTURA_SERIES and AERO_SERIES
support NVMe:

enum MR_ADAPTER_TYPE {
        MFI_SERIES = 1,
        THUNDERBOLT_SERIES = 2,
        INVADER_SERIES = 3,
        VENTURA_SERIES = 4,
        AERO_SERIES = 5,
};

In the mpt3sas driver we have the method below: if the IOC facts report
NVMe device support in the Protocol Flags, we can consider it an HBA with
NVMe drive support.

ioc->facts.ProtocolFlags & MPI2_IOCFACTS_PROTOCOL_NVME_DEVICES

Kashyap
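
The two checks described above could be wrapped as small predicates. This
is a standalone sketch, not driver code: the helper names are hypothetical,
the enum is copied from this message, and `EXAMPLE_PROTOCOL_NVME_DEVICES`
is a placeholder value standing in for MPI2_IOCFACTS_PROTOCOL_NVME_DEVICES,
whose real definition lives in the mpt3sas headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Copied from the message above. */
enum MR_ADAPTER_TYPE {
	MFI_SERIES = 1,
	THUNDERBOLT_SERIES = 2,
	INVADER_SERIES = 3,
	VENTURA_SERIES = 4,
	AERO_SERIES = 5,
};

/* Placeholder bit for illustration only; see the lead-in. */
#define EXAMPLE_PROTOCOL_NVME_DEVICES 0x0008

/* Hypothetical helper: true for the series named above as supporting NVMe. */
bool megaraid_type_has_nvme(enum MR_ADAPTER_TYPE type)
{
	return type == VENTURA_SERIES || type == AERO_SERIES;
}

/* Hypothetical helper: tests the NVMe bit in a copy of ProtocolFlags. */
bool mpt3sas_flags_have_nvme(uint16_t protocol_flags)
{
	return (protocol_flags & EXAMPLE_PROTOCOL_NVME_DEVICES) != 0;
}
```

A driver could then set the virt boundary only when the matching predicate
is true, which is the condition asked for in the quoted question.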

^ permalink raw reply

* RE: [PATCH 10/13] megaraid_sas: set virt_boundary_mask in the scsi host
From: Kashyap Desai @ 2019-06-13 19:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Sebastian Ott, Sagi Grimberg, Max Gurtovoy,
	Bart Van Assche, Ulf Hansson, Alan Stern, Oliver Neukum,
	linux-block, linux-rdma, linux-mmc, linux-nvme, linux-scsi,
	PDL,MEGARAIDLINUX, PDL-MPT-FUSIONLINUX, linux-hyperv, linux-usb,
	usb-storage, linux-kernel
In-Reply-To: <20190608081400.GA19573@lst.de>

>
> On Thu, Jun 06, 2019 at 09:07:27PM +0530, Kashyap Desai wrote:
> > Hi Christoph, the changes for <megaraid_sas> and <mpt3sas> look good. We
> > want to confirm a few sanity checks before ACK. BTW, what benefit will we
> > see from moving the virt_boundary setting to the SCSI mid layer? Is it
> > just a modular approach, or a functional fix?
>
> The big difference is that virt_boundary now also changes the
> max_segment_size, and this ensures that this limit is also communicated to
> the DMA mapping layer.
Are there any changes in the blk_queue_virt_boundary API? I could not find
the relevant code that accounts for this. Can you help?
Which git repo should I use for testing? That way I can confirm I didn't
miss relevant changes.

From your explanation above, it means that (after this patch) the max
segment size of the MR controller will be set to 4K.
Earlier it was possible to receive a single SGE with a 64K data length
(since the max seg size was 64K), but now the same buffer will reach the
driver as 16 SGEs (each SGE containing 4K).
Right?

Kashyap
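
The arithmetic behind the question above can be written down as a short
standalone sketch. `sge_count()` is a hypothetical helper, not block-layer
code, and it assumes a buffer that starts on a segment boundary. It does
not confirm that the series caps the segment size at 4K; that is exactly
what is being asked.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of scatter-gather elements needed for len bytes when each
 * element carries at most max_seg bytes (rounded up).
 */
size_t sge_count(size_t len, size_t max_seg)
{
	return (len + max_seg - 1) / max_seg;
}
```

Under those assumptions a 64K buffer needs one element with a 64K segment
limit and 16 elements with a 4K limit, matching the numbers in the message.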

^ permalink raw reply
