Linux-HyperV List

Linux-HyperV List
 help / color / mirror / Atom feed

* RE: [PATHC v6] video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver
From: Michael Kelley @ 2019-09-20 17:26 UTC (permalink / raw)
  To: Michael Kelley, Wei Hu, rdunlap@infradead.org, shc_work@mail.ru,
	gregkh@linuxfoundation.org, lee.jones@linaro.org,
	alexandre.belloni@bootlin.com, baijiaju1990@gmail.com,
	info@metux.net, linux-hyperv@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linux-fbdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, sashal@kernel.org,
	Stephen Hemminger, Haiyang Zhang, KY Srinivasan, Dexuan Cui
In-Reply-To: <DM5PR21MB0137DA408FE59E8C1171CFFCD78E0@DM5PR21MB0137.namprd21.prod.outlook.com>

From: Michael Kelley <mikelley@microsoft.com>  Sent: Wednesday, September 18, 2019 2:48 PM
> >
> > Without deferred IO support, hyperv_fb driver informs the host to refresh
> > the entire guest frame buffer at fixed rate, e.g. at 20Hz, no matter there
> > is screen update or not. This patch supports deferred IO for screens in
> > graphics mode and also enables the frame buffer on-demand refresh. The
> > highest refresh rate is still set at 20Hz.
> >
> > Currently Hyper-V only takes a physical address from guest as the starting
> > address of frame buffer. This implies the guest must allocate contiguous
> > physical memory for frame buffer. In addition, Hyper-V Gen 2 VMs only
> > accept address from MMIO region as frame buffer address. Due to these
> > limitations on Hyper-V host, we keep a shadow copy of frame buffer
> > in the guest. This means one more copy of the dirty rectangle inside
> > guest when doing the on-demand refresh. This can be optimized in the
> > future with help from host. For now the host performance gain from deferred
> > IO outweighs the shadow copy impact in the guest.
> >
> > Signed-off-by: Wei Hu <weh@microsoft.com>

Sasha -- this patch and one other from Wei Hu for the Hyper-V frame buffer
driver should be ready.  Both patches affect only the Hyper-V frame buffer
driver so can go through the Hyper-V tree.  Can you pick these up?  Thx.

Michael

^ permalink raw reply

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
From: David Hildenbrand @ 2019-09-20  8:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton,
	Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Michal Hocko,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang
In-Reply-To: <20190909114830.662-1-david@redhat.com>

On 09.09.19 13:48, David Hildenbrand wrote:
> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()"
> 
> Let's replace the __online_page...() functions by generic_online_page().
> Hyper-V only wants to delay the actual onlining of un-backed pages, so we
> can simpy re-use the generic function.
> 
> Only compile-tested.
> 
> Cc: Souptick Joarder <jrdr.linux@gmail.com>
> 
> David Hildenbrand (3):
>   mm/memory_hotplug: Export generic_online_page()
>   hv_balloon: Use generic_online_page()
>   mm/memory_hotplug: Remove __online_page_free() and
>     __online_page_increment_counters()
> 
>  drivers/hv/hv_balloon.c        |  3 +--
>  include/linux/memory_hotplug.h |  4 +---
>  mm/memory_hotplug.c            | 17 ++---------------
>  3 files changed, 4 insertions(+), 20 deletions(-)
> 

Ping, any comments on this one?

-- 

Thanks,

David / dhildenb

^ permalink raw reply

* Re: [PATCH] Drivers: hv: vmbus: Fix harmless building warnings without CONFIG_PM
From: Arnd Bergmann @ 2019-09-20  7:33 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, Michael Kelley
In-Reply-To: <PU1P153MB016971CD922FC453F3E31E04BF890@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

On Thu, Sep 19, 2019 at 11:38 PM Dexuan Cui <decui@microsoft.com> wrote:
> > Sent: Thursday, September 19, 2019 5:11 AM
> > On Thu, Sep 19, 2019 at 7:19 AM Dexuan Cui <decui@microsoft.com> wrote:

> > I think this will still produce a warning if CONFIG_PM is set but
> > CONFIG_PM_SLEEP is not, possibly in other configurations as
> > well.
>
> You're correct. Thanks!
>
> I'll use " #ifdef CONFIG_PM_SLEEP ... #endif" instead.
>
> The mentioned functions are only used in the micros
> SET_NOIRQ_SYSTEM_SLEEP_PM_OPS, which is empty if CONFIG_PM_SLEEP
> is not defined. So it looks to me using "#ifdef CONFIG_PM_SLEEP ..." should
> resolve the issue.

Probably, yes. There are sometimes surprising effects, such as when one of the
functions inside of the #ifdef call another function that is otherwise unused.

I would normally try to build a few hundred randconfig builds to be fairly sure
of a change like this, or use __maybe_unused like most other drivers do here.

       Arnd

^ permalink raw reply

* RE: [PATCH v5 1/2] drivers: hv: vmbus: Introduce latency testing
From: Michael Kelley @ 2019-09-19 22:52 UTC (permalink / raw)
  To: brandonbonaby94, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org
  Cc: brandonbonaby94, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <83b5fc34e8f25c882f2502931f766ef547c6c950.1568320416.git.brandonbonaby94@gmail.com>

From: Branden Bonaby <brandonbonaby94@gmail.com> Sent: Thursday, September 12, 2019 7:32 PM
> 
> +
> +static int hv_debugfs_delay_set(void *data, u64 val)
> +{
> +	int ret = 0;
> +
> +	if (val >= 0 && val <= 1000)
> +		*(u32 *)data = val;
> +	else
> +		ret = -EINVAL;
> +
> +	return ret;
> +}

I should probably quit picking at your code, but I'm going to
do it one more time. :-)

The above test for val >=0 is redundant as 'val' is declared
as 'u64'.  As an unsigned value, it will always be >= 0.  More
broadly, the above function could be written as follows
with no loss of clarity.  This accomplishes the same thing in
only 4 lines of code instead of 6, and the main execution path
is in the sequential execution flow, not in an 'if' statement.

{
	if (val > 1000)
		return -EINVAL;
	*(u32 *)data = val;
	return 0;
}

Your code is correct as written, so this is arguably more a
matter of style, but Linux generally likes to do things clearly
and compactly with no extra motion.

> +/* Delay buffer/message reads on a vmbus channel */
> +void hv_debug_delay_test(struct vmbus_channel *channel, enum delay delay_type)
> +{
> +	struct vmbus_channel *test_channel =    channel->primary_channel ?
> +						channel->primary_channel :
> +						channel;
> +	bool state = test_channel->fuzz_testing_state;
> +
> +	if (state) {
> +		if (delay_type == 0)
> +			udelay(test_channel->fuzz_testing_interrupt_delay);
> +		else
> +			udelay(test_channel->fuzz_testing_message_delay);

This 'if/else' statement got me thinking.  You have an enum declared below
that lists the two options -- INTERRUPT_DELAY or MESSAGE_DELAY.  The
implication is that we might add more options in the future.  But the
above 'if/else' statement isn't really set up to easily add more options, and
the individual fields for fuzz_testing_interrupt_delay and
fuzz_testing_message_delay mean adding more branches to the 'if/else'
statement whenever a new DELAY type is added to the enum.   And the
same is true when adding the entries into debugfs.  A more general
solution might use arrays and loops, and treat the enum value as an
index into an array of delay values.  Extending to add another delay type
could be as easy as adding another entry to the enum declaration.

The current code is for the case where n=2 (i.e., two different delay
types), and as such probably doesn't warrant the full index/looping
treatment.  But in the future, if we add additional delay types, we'll
probably revise the code to do the index/looping approach.

So to be clear, at this point I'm not asking you to change the existing
code.  My comments are more of an observation and something to
think about in the future.

> 
> +enum delay {
> +	INTERRUPT_DELAY = 0,
> +	MESSAGE_DELAY   = 1,
> +};
> +

Michael

^ permalink raw reply

* [PATCH v2] Drivers: hv: vmbus: Fix harmless building warnings without CONFIG_PM_SLEEP
From: Dexuan Cui @ 2019-09-19 21:46 UTC (permalink / raw)
  To: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, Michael Kelley, arnd@arndb.de
  Cc: Dexuan Cui

If CONFIG_PM_SLEEP is not set, we can comment out these functions to avoid
the below warnings:

drivers/hv/vmbus_drv.c:2208:12: warning: ‘vmbus_bus_resume’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:2128:12: warning: ‘vmbus_bus_suspend’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:937:12: warning: ‘vmbus_resume’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:918:12: warning: ‘vmbus_suspend’ defined but not used [-Wunused-function]

Fixes: 271b2224d42f ("Drivers: hv: vmbus: Implement suspend/resume for VSC drivers for hibernation")
Fixes: f53335e3289f ("Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---

In v2:
	test CONFIG_PM_SLEEP rather than CONFIG_PM. Thanks, Arnd!

 drivers/hv/vmbus_drv.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 391f0b225c9a..53a60c81e220 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -912,6 +912,7 @@ static void vmbus_shutdown(struct device *child_device)
 		drv->shutdown(dev);
 }
 
+#ifdef CONFIG_PM_SLEEP
 /*
  * vmbus_suspend - Suspend a vmbus device
  */
@@ -949,6 +950,7 @@ static int vmbus_resume(struct device *child_device)
 
 	return drv->resume(dev);
 }
+#endif /* CONFIG_PM_SLEEP */
 
 /*
  * vmbus_device_release - Final callback release of the vmbus child device
@@ -1070,6 +1072,7 @@ void vmbus_on_msg_dpc(unsigned long data)
 	vmbus_signal_eom(msg, message_type);
 }
 
+#ifdef CONFIG_PM_SLEEP
 /*
  * Fake RESCIND_CHANNEL messages to clean up hv_sock channels by force for
  * hibernation, because hv_sock connections can not persist across hibernation.
@@ -1105,6 +1108,7 @@ static void vmbus_force_channel_rescinded(struct vmbus_channel *channel)
 		      vmbus_connection.work_queue,
 		      &ctx->work);
 }
+#endif /* CONFIG_PM_SLEEP */
 
 /*
  * Direct callback for channels using other deferred processing
@@ -2125,6 +2129,7 @@ static int vmbus_acpi_add(struct acpi_device *device)
 	return ret_val;
 }
 
+#ifdef CONFIG_PM_SLEEP
 static int vmbus_bus_suspend(struct device *dev)
 {
 	struct vmbus_channel *channel, *sc;
@@ -2247,6 +2252,7 @@ static int vmbus_bus_resume(struct device *dev)
 
 	return 0;
 }
+#endif /* CONFIG_PM_SLEEP */
 
 static const struct acpi_device_id vmbus_acpi_device_ids[] = {
 	{"VMBUS", 0},
-- 
2.19.1


^ permalink raw reply related

* RE: [PATCH] Drivers: hv: vmbus: Fix harmless building warnings without CONFIG_PM
From: Dexuan Cui @ 2019-09-19 21:38 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, Michael Kelley
In-Reply-To: <CAK8P3a0oi2MQwt-P8taBt+VS+RTaoeNBgjoYNE7_L2VoQUSaEA@mail.gmail.com>

> Sent: Thursday, September 19, 2019 5:11 AM
> On Thu, Sep 19, 2019 at 7:19 AM Dexuan Cui <decui@microsoft.com> wrote:
> >
> > If CONFIG_PM is not set, we can comment out these functions to avoid the
> > below warnings:
> >
> > drivers/hv/vmbus_drv.c:2208:12: warning: ‘vmbus_bus_resume’ defined
> but not used [-Wunused-function]
> > drivers/hv/vmbus_drv.c:2128:12: warning: ‘vmbus_bus_suspend’ defined
> but not used [-Wunused-function]
> > drivers/hv/vmbus_drv.c:937:12: warning: ‘vmbus_resume’ defined but not
> used [-Wunused-function]
> > drivers/hv/vmbus_drv.c:918:12: warning: ‘vmbus_suspend’ defined but not
> used [-Wunused-function]
> >
> > Fixes: 271b2224d42f ("Drivers: hv: vmbus: Implement suspend/resume for
> VSC drivers for hibernation")
> > Fixes: f53335e3289f ("Drivers: hv: vmbus: Suspend/resume the vmbus itself
> for hibernation")
> > Reported-by: Arnd Bergmann <arnd@arndb.de>
> > Signed-off-by: Dexuan Cui <decui@microsoft.com>
> 
> I think this will still produce a warning if CONFIG_PM is set but
> CONFIG_PM_SLEEP is not, possibly in other configurations as
> well.
> 
>  Arnd

You're correct. Thanks! 

I'll use " #ifdef CONFIG_PM_SLEEP ... #endif" instead.

The mentioned functions are only used in the micros
SET_NOIRQ_SYSTEM_SLEEP_PM_OPS, which is empty if CONFIG_PM_SLEEP
is not defined. So it looks to me using "#ifdef CONFIG_PM_SLEEP ..." should
resolve the issue.

BTW, CONFIG_PM_SLEEP depends on CONFIG_PM, so if CONFIG_PM is not
defined, CONFIG_PM_SLEEP is not defined either.

Thanks,
-- Dexuan

^ permalink raw reply

* Re: [PATCH] Input: hyperv-keyboard: Add the support of hibernation
From: dmitry.torokhov @ 2019-09-19 16:17 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, linux-hyperv@vger.kernel.org,
	linux-input@vger.kernel.org, linux-kernel@vger.kernel.org,
	Michael Kelley
In-Reply-To: <1568244975-66795-1-git-send-email-decui@microsoft.com>

Hi Dexuan,

On Wed, Sep 11, 2019 at 11:36:20PM +0000, Dexuan Cui wrote:
> We need hv_kbd_pm_notify() to make sure the pm_wakeup_hard_event() call
> does not prevent the system from entering hibernation: the hibernation
> is a relatively long process, which can be aborted by the call
> pm_wakeup_hard_event(), which is invoked upon keyboard events.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
> 
> This patch is basically a pure Hyper-V specific change and it has a
> build dependency on the commit 271b2224d42f ("Drivers: hv: vmbus: Implement
> suspend/resume for VSC drivers for hibernation"), which is on Sasha Levin's
> Hyper-V tree's hyperv-next branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-next
> 
> I request this patch should go through Sasha's tree rather than the
> input subsystemi's tree.
> 
> Hi Dmitry, can you please Ack?
> 
>  drivers/input/serio/hyperv-keyboard.c | 68 ++++++++++++++++++++++++++++++++---
>  1 file changed, 63 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/input/serio/hyperv-keyboard.c b/drivers/input/serio/hyperv-keyboard.c
> index 88ae7c2..277dc4c 100644
> --- a/drivers/input/serio/hyperv-keyboard.c
> +++ b/drivers/input/serio/hyperv-keyboard.c
> @@ -10,6 +10,7 @@
>  #include <linux/hyperv.h>
>  #include <linux/serio.h>
>  #include <linux/slab.h>
> +#include <linux/suspend.h>
>  
>  /*
>   * Current version 1.0
> @@ -95,6 +96,9 @@ struct hv_kbd_dev {
>  	struct completion wait_event;
>  	spinlock_t lock; /* protects 'started' field */
>  	bool started;
> +
> +	struct notifier_block pm_nb;
> +	bool hibernation_in_progress;

Why do you use notifier block instead of exposing proper PM methods if
you want to support hibernation?

Thanks.

-- 
Dmitry

^ permalink raw reply

* Re: [PATCH] Drivers: hv: vmbus: Fix harmless building warnings without CONFIG_PM
From: Arnd Bergmann @ 2019-09-19 12:10 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, Michael Kelley
In-Reply-To: <1568870297-108679-1-git-send-email-decui@microsoft.com>

On Thu, Sep 19, 2019 at 7:19 AM Dexuan Cui <decui@microsoft.com> wrote:
>
> If CONFIG_PM is not set, we can comment out these functions to avoid the
> below warnings:
>
> drivers/hv/vmbus_drv.c:2208:12: warning: ‘vmbus_bus_resume’ defined but not used [-Wunused-function]
> drivers/hv/vmbus_drv.c:2128:12: warning: ‘vmbus_bus_suspend’ defined but not used [-Wunused-function]
> drivers/hv/vmbus_drv.c:937:12: warning: ‘vmbus_resume’ defined but not used [-Wunused-function]
> drivers/hv/vmbus_drv.c:918:12: warning: ‘vmbus_suspend’ defined but not used [-Wunused-function]
>
> Fixes: 271b2224d42f ("Drivers: hv: vmbus: Implement suspend/resume for VSC drivers for hibernation")
> Fixes: f53335e3289f ("Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation")
> Reported-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

I think this will still produce a warning if CONFIG_PM is set but
CONFIG_PM_SLEEP is not, possibly in other configurations as
well.

      Arnd

^ permalink raw reply

* RE: [PATCH 1/3] hv_utils: Add the support of hibernation
From: Vitaly Kuznetsov @ 2019-09-19 10:27 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, Michael Kelley
In-Reply-To: <PU1P153MB01694ABFED7ADAE8B40A65F7BF890@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

Dexuan Cui <decui@microsoft.com> writes:

> BTW, for vss, maybe the VM should not hibernate if there is a backup 
> ongoing? -- if the file system is frozen by hv_vss_daemon, and the VM
> hibernates, then when the VM resumes back, it's almost always true that 
> the VM won't receive the host's VSS_OP_THAW request, and the VM will
> end up in an unusable state.

Makes sense. Or, alternatively, can we postpone hibernation until after
VSS_OP_THAW?

-- 
Vitaly


^ permalink raw reply

* RE: [PATCH 1/3] hv_utils: Add the support of hibernation
From: Dexuan Cui @ 2019-09-19  6:34 UTC (permalink / raw)
  To: vkuznets
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, Michael Kelley
In-Reply-To: <87pnk0bpe8.fsf@vitty.brq.redhat.com>

> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> Sent: Monday, September 16, 2019 1:46 AM
> Dexuan Cui <decui@microsoft.com> writes:
> 
> >> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> >> Sent: Thursday, September 12, 2019 9:37 AM
> >
> >> > +static int util_suspend(struct hv_device *dev)
> >> > +{
> >> > +	struct hv_util_service *srv = hv_get_drvdata(dev);
> >> > +
> >> > +	if (srv->util_cancel_work)
> >> > +		srv->util_cancel_work();
> >> > +
> >> > +	vmbus_close(dev->channel);
> >>
> >> And what happens if you receive a real reply from userspace after you
> >> close the channel? You've only cancelled timeout work so the driver
> >> will not try to reply by itself but this doesn't mean we won't try to
> >> write to the channel on legitimate replies from daemons.
> >>
> >> I think you need to block all userspace requests (hang in kernel until
> >> util_resume()).
> >>
> >> While I'm not sure we can't get away without it but I'd suggest we add a
> >> new HVUTIL_SUSPENDED state to the hvutil state machine.
> >> Vitaly
> >
> > When we reach util_suspend(), all the userspace processes have been
> > frozen: see kernel/power/hibernate.c: hibernate() -> freeze_processes(),
> > so here we can not receive a reply from the userspace daemon.
> >
> 
> Let's add a WARN() or something then as if this ever happens this is
> going to be realy tricky to catch.

Looking at the path hibernate() -> freeze_processes() -> 
try_to_freeze_tasks(true) -> freeze_task() -> fake_signal_wake_up(), I'm
sure when try_to_freeze_tasks(true) returns 0, all the user-space processes
must be frozen in do_signal() -> get_signal() -> try_to_freeze() -> ... -> 
__refrigerator().

hibernate () -> hibernation_snapshot () -> dpm_suspend() -> ... -> 
util_suspend() only runs after hibernate() -> freeze_processes(), so I'm
pretty sure we're safe here.

> > However, it looks there is indeed some tricky corner cases we need to deal
> > with: in util_resume(), before we call vmbus_open(), all the userspace
> > processes are still frozen, and the userspace daemon (e.g. hv_kvp_daemon)
> > can be in any of these states:
> >
> > 1. the driver has not buffered any message for the daemon. This is good.
> >
> > 2. the driver has buffered a message for the daemon, and
> > kvp_transaction.state is HVUTIL_USERSPACE_REQ. Later, the kvp daemon
> > writes the response to the driver, and in kvp_on_msg()
> > kvp_transaction.state is moved to HVUTIL_USERSPACE_RECV, but since
> > cancel_delayed_work_sync(&kvp_timeout_work) is false (the work has
> > been cancelled by util_suspend()), the driver reports nothing to the host,
> > which is good as the VM has undergone a hibernation event and IMO the
> > response may be stale and I believe the host is not expecting this
> > response from the VM at all (the host side application that requested the
> > KVP must have failed or timed out), but the bad thing is: the "state"
> > remains in HVUTIL_USERSPACE_RECV, preventing
> > hv_kvp_onchannelcallback() from reading from the channel ringbuffer.
> >
> > I suggest we work around this race condition by the below patch:
> >
> > --- a/drivers/hv/hv_kvp.c
> > +++ b/drivers/hv/hv_kvp.c
> > @@ -250,8 +250,8 @@ static int kvp_on_msg(void *msg, int len)
> >          */
> >         if (cancel_delayed_work_sync(&kvp_timeout_work)) {
> >                 kvp_respond_to_host(message, error);
> > -               hv_poll_channel(kvp_transaction.recv_channel,
> kvp_poll_wrapper);
> >         }
> > +       hv_poll_channel(kvp_transaction.recv_channel, kvp_poll_wrapper);
> >
> >         return 0;
> >  }
> >
> > How do you like this?
> >
> 
> Is it safe to call hv_poll_channel() simultaneously on several CPUs? It
> seems it is as we're doing
> 
> smp_call_function_single(channel->target_cpu, cb, channel, true);

Looks safe to me. The code has been there for years. :-)
 
> (I'm asking because if it's not, than doing what you suggest will open
> the following window: timeout function kvp_timeout_func() is already
> running but the daemon is trying to reply at the same moment).
> 
> > I can imagine there is still a small chance that the state machine can run
> > out of order, and the kvp daemon may exit due to the return values of
> > read()/write() being -EINVAL, but the chance should be small enough in
> > practice, and IMO the issue even exists in the current code when
> > hibernation is not involved, e.g. kvp_timeout_func() and kvp_on_msg()
> > can run concurrently; if kvp_on_msg() runs slowly due to some reason
> > (e.g. the kvp daemon is stopped as I'm gdb'ing it), kvp_timeout_func()
> > fires and moves the state to HVUTIL_READY; later kvp_on_msg() starts
> > to run and returns -EINVAL, and the kvp daemon will exit().
> >
> > IMHO here it looks extremely difficult to make things flawless (if that's
> > even possible), so it's necessary to ask the daemons to auto-restart once
> > they exit() unexpectedly. This can be achieved by configuring systemd
> > properly for the kvp/vss/fcopy services.
> 
> I think we can also teach daemons to ignore -EINVAL or switch to
> something like -EAGAIN in non-fatal cases.

Maybe the driver should return 0 rather than -EINVAL for the kvp daemon
in some scenarios, since kvp is never 100% reliable.

> > I'm not sure introducing a HVUTIL_SUSPENDED state would solve all
> > of the corner cases, but I'm sure that would further complicate the
> > current code, at least to me. :-)
> >
> 
> Maybe, if we don't need it than we don't. Basically, I see the only
> advantage in having such state: it makes our tricks to support
> hibernation more visible in the code.
> Vitaly

Let me think about this. 

BTW, for vss, maybe the VM should not hibernate if there is a backup 
ongoing? -- if the file system is frozen by hv_vss_daemon, and the VM
hibernates, then when the VM resumes back, it's almost always true that 
the VM won't receive the host's VSS_OP_THAW request, and the VM will
end up in an unusable state.

Thanks,
-- Dexuan


^ permalink raw reply

* [PATCH] Drivers: hv: vmbus: Fix harmless building warnings without CONFIG_PM
From: Dexuan Cui @ 2019-09-19  5:19 UTC (permalink / raw)
  To: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, Michael Kelley, arnd@arndb.de
  Cc: Dexuan Cui

If CONFIG_PM is not set, we can comment out these functions to avoid the
below warnings:

drivers/hv/vmbus_drv.c:2208:12: warning: ‘vmbus_bus_resume’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:2128:12: warning: ‘vmbus_bus_suspend’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:937:12: warning: ‘vmbus_resume’ defined but not used [-Wunused-function]
drivers/hv/vmbus_drv.c:918:12: warning: ‘vmbus_suspend’ defined but not used [-Wunused-function]

Fixes: 271b2224d42f ("Drivers: hv: vmbus: Implement suspend/resume for VSC drivers for hibernation")
Fixes: f53335e3289f ("Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 drivers/hv/vmbus_drv.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 391f0b225c9a..8bfb36695413 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -912,6 +912,7 @@ static void vmbus_shutdown(struct device *child_device)
 		drv->shutdown(dev);
 }
 
+#ifdef CONFIG_PM
 /*
  * vmbus_suspend - Suspend a vmbus device
  */
@@ -949,6 +950,7 @@ static int vmbus_resume(struct device *child_device)
 
 	return drv->resume(dev);
 }
+#endif /* CONFIG_PM */
 
 /*
  * vmbus_device_release - Final callback release of the vmbus child device
@@ -1070,6 +1072,7 @@ void vmbus_on_msg_dpc(unsigned long data)
 	vmbus_signal_eom(msg, message_type);
 }
 
+#ifdef CONFIG_PM
 /*
  * Fake RESCIND_CHANNEL messages to clean up hv_sock channels by force for
  * hibernation, because hv_sock connections can not persist across hibernation.
@@ -1105,6 +1108,7 @@ static void vmbus_force_channel_rescinded(struct vmbus_channel *channel)
 		      vmbus_connection.work_queue,
 		      &ctx->work);
 }
+#endif /* CONFIG_PM */
 
 /*
  * Direct callback for channels using other deferred processing
@@ -2125,6 +2129,7 @@ static int vmbus_acpi_add(struct acpi_device *device)
 	return ret_val;
 }
 
+#ifdef CONFIG_PM
 static int vmbus_bus_suspend(struct device *dev)
 {
 	struct vmbus_channel *channel, *sc;
@@ -2247,6 +2252,7 @@ static int vmbus_bus_resume(struct device *dev)
 
 	return 0;
 }
+#endif /* CONFIG_PM */
 
 static const struct acpi_device_id vmbus_acpi_device_ids[] = {
 	{"VMBUS", 0},
-- 
2.19.1


^ permalink raw reply related

* RE: [PATCH] hv: vmbus: mark PM functions as __maybe_unused
From: Dexuan Cui @ 2019-09-19  5:15 UTC (permalink / raw)
  To: Arnd Bergmann, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Sasha Levin
  Cc: Michael Kelley, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20190918200052.2261769-1-arnd@arndb.de>

> From: Arnd Bergmann <arnd@arndb.de>
> Sent: Wednesday, September 18, 2019 1:01 PM
> 
> When CONFIG_PM is disabled, we get a couple of harmless warnings:
> 
> drivers/hv/vmbus_drv.c:918:12: error: unused function 'vmbus_suspend'
> [-Werror,-Wunused-function]
> drivers/hv/vmbus_drv.c:937:12: error: unused function 'vmbus_resume'
> [-Werror,-Wunused-function]
> drivers/hv/vmbus_drv.c:2128:12: error: unused function 'vmbus_bus_suspend'
> [-Werror,-Wunused-function]
> drivers/hv/vmbus_drv.c:2208:12: error: unused function 'vmbus_bus_resume'
> [-Werror,-Wunused-function]
> 
> Mark these functions __maybe_unused to let gcc drop them silently.

Hi Arnd,
Thanks for reporting the issue!

If CONFIG_PM is not set, IMO it's better to comment out these functions. :-)

I'll post a patch for this with you Cc'd.

Thanks,
-- Dexuan

^ permalink raw reply

* RE: [PATHC v6] video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver
From: Michael Kelley @ 2019-09-18 21:48 UTC (permalink / raw)
  To: Wei Hu, rdunlap@infradead.org, shc_work@mail.ru,
	gregkh@linuxfoundation.org, lee.jones@linaro.org,
	alexandre.belloni@bootlin.com, baijiaju1990@gmail.com,
	fthain@telegraphics.com.au, info@metux.net,
	linux-hyperv@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-fbdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	sashal@kernel.org, Stephen Hemminger, Haiyang Zhang,
	KY Srinivasan, Dexuan Cui
In-Reply-To: <20190918060227.6834-1-weh@microsoft.com>

From: Wei Hu <weh@microsoft.com> Sent: Tuesday, September 17, 2019 11:03 PM

> 
> Without deferred IO support, hyperv_fb driver informs the host to refresh
> the entire guest frame buffer at fixed rate, e.g. at 20Hz, no matter there
> is screen update or not. This patch supports deferred IO for screens in
> graphics mode and also enables the frame buffer on-demand refresh. The
> highest refresh rate is still set at 20Hz.
> 
> Currently Hyper-V only takes a physical address from guest as the starting
> address of frame buffer. This implies the guest must allocate contiguous
> physical memory for frame buffer. In addition, Hyper-V Gen 2 VMs only
> accept address from MMIO region as frame buffer address. Due to these
> limitations on Hyper-V host, we keep a shadow copy of frame buffer
> in the guest. This means one more copy of the dirty rectangle inside
> guest when doing the on-demand refresh. This can be optimized in the
> future with help from host. For now the host performance gain from deferred
> IO outweighs the shadow copy impact in the guest.
> 
> Signed-off-by: Wei Hu <weh@microsoft.com>
> ---
>     v2: Incorporated review comments from Michael Kelley
>     - Increased dirty rectangle by one row in deferred IO case when sending
>     to Hyper-V.
>     - Corrected the dirty rectangle size in the text mode.
>     - Added more comments.
>     - Other minor code cleanups.
> 
>     v3: Incorporated more review comments
>     - Removed a few unnecessary variable tests
> 
>     v4: Incorporated test and review feedback from Dexuan Cui
>     - Not disable interrupt while acquiring docopy_lock in
>       hvfb_update_work(). This avoids significant bootup delay in
>       large vCPU count VMs.
> 
>     v5: Completely remove the unnecessary docopy_lock after discussing
>     with Dexuan Cui.
> 
>     v6: Do not request host refresh when the VM guest screen is
>     closed or minimized.
> 
>  drivers/video/fbdev/Kconfig     |   1 +
>  drivers/video/fbdev/hyperv_fb.c | 210 ++++++++++++++++++++++++++++----
>  2 files changed, 190 insertions(+), 21 deletions(-)
> 

Reviewed-by: Michael Kelley <mikelley@microsoft.com>

^ permalink raw reply

* [PATCH] hv: vmbus: mark PM functions as __maybe_unused
From: Arnd Bergmann @ 2019-09-18 20:00 UTC (permalink / raw)
  To: K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, Sasha Levin
  Cc: Arnd Bergmann, Dexuan Cui, Michael Kelley, linux-hyperv,
	linux-kernel

When CONFIG_PM is disabled, we get a couple of harmless warnings:

drivers/hv/vmbus_drv.c:918:12: error: unused function 'vmbus_suspend' [-Werror,-Wunused-function]
drivers/hv/vmbus_drv.c:937:12: error: unused function 'vmbus_resume' [-Werror,-Wunused-function]
drivers/hv/vmbus_drv.c:2128:12: error: unused function 'vmbus_bus_suspend' [-Werror,-Wunused-function]
drivers/hv/vmbus_drv.c:2208:12: error: unused function 'vmbus_bus_resume' [-Werror,-Wunused-function]

Mark these functions __maybe_unused to let gcc drop them silently.

Fixes: f53335e3289f ("Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation")
Fixes: 271b2224d42f ("Drivers: hv: vmbus: Implement suspend/resume for VSC drivers for hibernation")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/hv/vmbus_drv.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 391f0b225c9a..2542eb1f872b 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -915,7 +915,7 @@ static void vmbus_shutdown(struct device *child_device)
 /*
  * vmbus_suspend - Suspend a vmbus device
  */
-static int vmbus_suspend(struct device *child_device)
+static int __maybe_unused vmbus_suspend(struct device *child_device)
 {
 	struct hv_driver *drv;
 	struct hv_device *dev = device_to_hv_device(child_device);
@@ -934,7 +934,7 @@ static int vmbus_suspend(struct device *child_device)
 /*
  * vmbus_resume - Resume a vmbus device
  */
-static int vmbus_resume(struct device *child_device)
+static int __maybe_unused vmbus_resume(struct device *child_device)
 {
 	struct hv_driver *drv;
 	struct hv_device *dev = device_to_hv_device(child_device);
@@ -2125,7 +2125,7 @@ static int vmbus_acpi_add(struct acpi_device *device)
 	return ret_val;
 }
 
-static int vmbus_bus_suspend(struct device *dev)
+static int __maybe_unused vmbus_bus_suspend(struct device *dev)
 {
 	struct vmbus_channel *channel, *sc;
 	unsigned long flags;
@@ -2205,7 +2205,7 @@ static int vmbus_bus_suspend(struct device *dev)
 	return 0;
 }
 
-static int vmbus_bus_resume(struct device *dev)
+static int __maybe_unused vmbus_bus_resume(struct device *dev)
 {
 	struct vmbus_channel_msginfo *msginfo;
 	size_t msgsize;
-- 
2.20.0


^ permalink raw reply related

* RE: [PATHC v6] video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver
From: Dexuan Cui @ 2019-09-18 16:45 UTC (permalink / raw)
  To: Wei Hu, Michael Kelley, rdunlap@infradead.org, shc_work@mail.ru,
	gregkh@linuxfoundation.org, lee.jones@linaro.org,
	alexandre.belloni@bootlin.com, baijiaju1990@gmail.com,
	fthain@telegraphics.com.au, info@metux.net,
	linux-hyperv@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-fbdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	sashal@kernel.org, Stephen Hemminger, Haiyang Zhang,
	KY Srinivasan
In-Reply-To: <20190918060227.6834-1-weh@microsoft.com>

> From: Wei Hu <weh@microsoft.com>
> Sent: Tuesday, September 17, 2019 11:03 PM
> [...]
> ---
>     v6: Do not request host refresh when the VM guest screen is
>     closed or minimized.

Looks good to me. Thanks!

Reviewed-by: Dexuan Cui <decui@microsoft.com>

^ permalink raw reply

* [PATHC v6] video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver
From: Wei Hu @ 2019-09-18  6:03 UTC (permalink / raw)
  To: Michael Kelley, rdunlap@infradead.org, shc_work@mail.ru,
	gregkh@linuxfoundation.org, lee.jones@linaro.org,
	alexandre.belloni@bootlin.com, baijiaju1990@gmail.com,
	fthain@telegraphics.com.au, info@metux.net,
	linux-hyperv@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-fbdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	sashal@kernel.org, Stephen Hemminger, Haiyang Zhang,
	KY Srinivasan, Dexuan Cui
  Cc: Wei Hu

Without deferred IO support, hyperv_fb driver informs the host to refresh
the entire guest frame buffer at fixed rate, e.g. at 20Hz, no matter there
is screen update or not. This patch supports deferred IO for screens in
graphics mode and also enables the frame buffer on-demand refresh. The
highest refresh rate is still set at 20Hz.

Currently Hyper-V only takes a physical address from guest as the starting
address of frame buffer. This implies the guest must allocate contiguous
physical memory for frame buffer. In addition, Hyper-V Gen 2 VMs only
accept address from MMIO region as frame buffer address. Due to these
limitations on Hyper-V host, we keep a shadow copy of frame buffer
in the guest. This means one more copy of the dirty rectangle inside
guest when doing the on-demand refresh. This can be optimized in the
future with help from host. For now the host performance gain from deferred
IO outweighs the shadow copy impact in the guest.

Signed-off-by: Wei Hu <weh@microsoft.com>
---
    v2: Incorporated review comments from Michael Kelley
    - Increased dirty rectangle by one row in deferred IO case when sending
    to Hyper-V.
    - Corrected the dirty rectangle size in the text mode.
    - Added more comments.
    - Other minor code cleanups.

    v3: Incorporated more review comments
    - Removed a few unnecessary variable tests

    v4: Incorporated test and review feedback from Dexuan Cui
    - Not disable interrupt while acquiring docopy_lock in
      hvfb_update_work(). This avoids significant bootup delay in
      large vCPU count VMs.

    v5: Completely remove the unnecessary docopy_lock after discussing
    with Dexuan Cui.

    v6: Do not request host refresh when the VM guest screen is
    closed or minimized.

 drivers/video/fbdev/Kconfig     |   1 +
 drivers/video/fbdev/hyperv_fb.c | 210 ++++++++++++++++++++++++++++----
 2 files changed, 190 insertions(+), 21 deletions(-)

diff --git a/drivers/video/fbdev/Kconfig b/drivers/video/fbdev/Kconfig
index 1b2f5f31fb6f..e781f89a1824 100644
--- a/drivers/video/fbdev/Kconfig
+++ b/drivers/video/fbdev/Kconfig
@@ -2241,6 +2241,7 @@ config FB_HYPERV
 	select FB_CFB_FILLRECT
 	select FB_CFB_COPYAREA
 	select FB_CFB_IMAGEBLIT
+	select FB_DEFERRED_IO
 	help
 	  This framebuffer driver supports Microsoft Hyper-V Synthetic Video.
 
diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
index fe319fc39bec..0c57445f8357 100644
--- a/drivers/video/fbdev/hyperv_fb.c
+++ b/drivers/video/fbdev/hyperv_fb.c
@@ -237,6 +237,7 @@ struct synthvid_msg {
 #define RING_BUFSIZE (256 * 1024)
 #define VSP_TIMEOUT (10 * HZ)
 #define HVFB_UPDATE_DELAY (HZ / 20)
+#define HVFB_ONDEMAND_THROTTLE (HZ / 20)
 
 struct hvfb_par {
 	struct fb_info *info;
@@ -256,6 +257,16 @@ struct hvfb_par {
 	bool synchronous_fb;
 
 	struct notifier_block hvfb_panic_nb;
+
+	/* Memory for deferred IO and frame buffer itself */
+	unsigned char *dio_vp;
+	unsigned char *mmio_vp;
+	unsigned long mmio_pp;
+
+	/* Dirty rectangle, protected by delayed_refresh_lock */
+	int x1, y1, x2, y2;
+	bool delayed_refresh;
+	spinlock_t delayed_refresh_lock;
 };
 
 static uint screen_width = HVFB_WIDTH;
@@ -264,6 +275,7 @@ static uint screen_width_max = HVFB_WIDTH;
 static uint screen_height_max = HVFB_HEIGHT;
 static uint screen_depth;
 static uint screen_fb_size;
+static uint dio_fb_size; /* FB size for deferred IO */
 
 /* Send message to Hyper-V host */
 static inline int synthvid_send(struct hv_device *hdev,
@@ -350,28 +362,88 @@ static int synthvid_send_ptr(struct hv_device *hdev)
 }
 
 /* Send updated screen area (dirty rectangle) location to host */
-static int synthvid_update(struct fb_info *info)
+static int
+synthvid_update(struct fb_info *info, int x1, int y1, int x2, int y2)
 {
 	struct hv_device *hdev = device_to_hv_device(info->device);
 	struct synthvid_msg msg;
 
 	memset(&msg, 0, sizeof(struct synthvid_msg));
+	if (x2 == INT_MAX)
+		x2 = info->var.xres;
+	if (y2 == INT_MAX)
+		y2 = info->var.yres;
 
 	msg.vid_hdr.type = SYNTHVID_DIRT;
 	msg.vid_hdr.size = sizeof(struct synthvid_msg_hdr) +
 		sizeof(struct synthvid_dirt);
 	msg.dirt.video_output = 0;
 	msg.dirt.dirt_count = 1;
-	msg.dirt.rect[0].x1 = 0;
-	msg.dirt.rect[0].y1 = 0;
-	msg.dirt.rect[0].x2 = info->var.xres;
-	msg.dirt.rect[0].y2 = info->var.yres;
+	msg.dirt.rect[0].x1 = (x1 > x2) ? 0 : x1;
+	msg.dirt.rect[0].y1 = (y1 > y2) ? 0 : y1;
+	msg.dirt.rect[0].x2 =
+		(x2 < x1 || x2 > info->var.xres) ? info->var.xres : x2;
+	msg.dirt.rect[0].y2 =
+		(y2 < y1 || y2 > info->var.yres) ? info->var.yres : y2;
 
 	synthvid_send(hdev, &msg);
 
 	return 0;
 }
 
+static void hvfb_docopy(struct hvfb_par *par,
+			unsigned long offset,
+			unsigned long size)
+{
+	if (!par || !par->mmio_vp || !par->dio_vp || !par->fb_ready ||
+	    size == 0 || offset >= dio_fb_size)
+		return;
+
+	if (offset + size > dio_fb_size)
+		size = dio_fb_size - offset;
+
+	memcpy(par->mmio_vp + offset, par->dio_vp + offset, size);
+}
+
+/* Deferred IO callback */
+static void synthvid_deferred_io(struct fb_info *p,
+				 struct list_head *pagelist)
+{
+	struct hvfb_par *par = p->par;
+	struct page *page;
+	unsigned long start, end;
+	int y1, y2, miny, maxy;
+
+	miny = INT_MAX;
+	maxy = 0;
+
+	/*
+	 * Merge dirty pages. It is possible that last page cross
+	 * over the end of frame buffer row yres. This is taken care of
+	 * in synthvid_update function by clamping the y2
+	 * value to yres.
+	 */
+	list_for_each_entry(page, pagelist, lru) {
+		start = page->index << PAGE_SHIFT;
+		end = start + PAGE_SIZE - 1;
+		y1 = start / p->fix.line_length;
+		y2 = end / p->fix.line_length;
+		miny = min_t(int, miny, y1);
+		maxy = max_t(int, maxy, y2);
+
+		/* Copy from dio space to mmio address */
+		if (par->fb_ready)
+			hvfb_docopy(par, start, PAGE_SIZE);
+	}
+
+	if (par->fb_ready && par->update)
+		synthvid_update(p, 0, miny, p->var.xres, maxy + 1);
+}
+
+static struct fb_deferred_io synthvid_defio = {
+	.delay		= HZ / 20,
+	.deferred_io	= synthvid_deferred_io,
+};
 
 /*
  * Actions on received messages from host:
@@ -618,7 +690,7 @@ static int synthvid_send_config(struct hv_device *hdev)
 	msg->vid_hdr.type = SYNTHVID_VRAM_LOCATION;
 	msg->vid_hdr.size = sizeof(struct synthvid_msg_hdr) +
 		sizeof(struct synthvid_vram_location);
-	msg->vram.user_ctx = msg->vram.vram_gpa = info->fix.smem_start;
+	msg->vram.user_ctx = msg->vram.vram_gpa = par->mmio_pp;
 	msg->vram.is_vram_gpa_specified = 1;
 	synthvid_send(hdev, msg);
 
@@ -628,7 +700,7 @@ static int synthvid_send_config(struct hv_device *hdev)
 		ret = -ETIMEDOUT;
 		goto out;
 	}
-	if (msg->vram_ack.user_ctx != info->fix.smem_start) {
+	if (msg->vram_ack.user_ctx != par->mmio_pp) {
 		pr_err("Unable to set VRAM location\n");
 		ret = -ENODEV;
 		goto out;
@@ -645,19 +717,77 @@ static int synthvid_send_config(struct hv_device *hdev)
 
 /*
  * Delayed work callback:
- * It is called at HVFB_UPDATE_DELAY or longer time interval to process
- * screen updates. It is re-scheduled if further update is necessary.
+ * It is scheduled to call whenever update request is received and it has
+ * not been called in last HVFB_ONDEMAND_THROTTLE time interval.
  */
 static void hvfb_update_work(struct work_struct *w)
 {
 	struct hvfb_par *par = container_of(w, struct hvfb_par, dwork.work);
 	struct fb_info *info = par->info;
+	unsigned long flags;
+	int x1, x2, y1, y2;
+	int j;
+
+	spin_lock_irqsave(&par->delayed_refresh_lock, flags);
+	/* Reset the request flag */
+	par->delayed_refresh = false;
+
+	/* Store the dirty rectangle to local variables */
+	x1 = par->x1;
+	x2 = par->x2;
+	y1 = par->y1;
+	y2 = par->y2;
+
+	/* Clear dirty rectangle */
+	par->x1 = par->y1 = INT_MAX;
+	par->x2 = par->y2 = 0;
+
+	spin_unlock_irqrestore(&par->delayed_refresh_lock, flags);
+
+	if (x1 > info->var.xres || x2 > info->var.xres ||
+	    y1 > info->var.yres || y2 > info->var.yres || x2 <= x1)
+		return;
+
+	/* Copy the dirty rectangle to frame buffer memory */
+	for (j = y1; j < y2; j++) {
+		hvfb_docopy(par,
+			    j * info->fix.line_length +
+			    (x1 * screen_depth / 8),
+			    (x2 - x1) * screen_depth / 8);
+	}
+
+	/* Refresh */
+	if (par->fb_ready && par->update)
+		synthvid_update(info, x1, y1, x2, y2);
+}
 
-	if (par->fb_ready)
-		synthvid_update(info);
+/*
+ * Control the on-demand refresh frequency. It schedules a delayed
+ * screen update if it has not yet.
+ */
+static void hvfb_ondemand_refresh_throttle(struct hvfb_par *par,
+					   int x1, int y1, int w, int h)
+{
+	unsigned long flags;
+	int x2 = x1 + w;
+	int y2 = y1 + h;
+
+	spin_lock_irqsave(&par->delayed_refresh_lock, flags);
+
+	/* Merge dirty rectangle */
+	par->x1 = min_t(int, par->x1, x1);
+	par->y1 = min_t(int, par->y1, y1);
+	par->x2 = max_t(int, par->x2, x2);
+	par->y2 = max_t(int, par->y2, y2);
+
+	/* Schedule a delayed screen update if not yet */
+	if (par->delayed_refresh == false) {
+		schedule_delayed_work(&par->dwork,
+				      HVFB_ONDEMAND_THROTTLE);
+		par->delayed_refresh = true;
+	}
 
-	if (par->update)
-		schedule_delayed_work(&par->dwork, HVFB_UPDATE_DELAY);
+	spin_unlock_irqrestore(&par->delayed_refresh_lock, flags);
 }
 
 static int hvfb_on_panic(struct notifier_block *nb,
@@ -669,7 +799,8 @@ static int hvfb_on_panic(struct notifier_block *nb,
 	par = container_of(nb, struct hvfb_par, hvfb_panic_nb);
 	par->synchronous_fb = true;
 	info = par->info;
-	synthvid_update(info);
+	hvfb_docopy(par, 0, dio_fb_size);
+	synthvid_update(info, 0, 0, INT_MAX, INT_MAX);
 
 	return NOTIFY_DONE;
 }
@@ -730,7 +861,10 @@ static void hvfb_cfb_fillrect(struct fb_info *p,
 
 	cfb_fillrect(p, rect);
 	if (par->synchronous_fb)
-		synthvid_update(p);
+		synthvid_update(p, 0, 0, INT_MAX, INT_MAX);
+	else
+		hvfb_ondemand_refresh_throttle(par, rect->dx, rect->dy,
+					       rect->width, rect->height);
 }
 
 static void hvfb_cfb_copyarea(struct fb_info *p,
@@ -740,7 +874,10 @@ static void hvfb_cfb_copyarea(struct fb_info *p,
 
 	cfb_copyarea(p, area);
 	if (par->synchronous_fb)
-		synthvid_update(p);
+		synthvid_update(p, 0, 0, INT_MAX, INT_MAX);
+	else
+		hvfb_ondemand_refresh_throttle(par, area->dx, area->dy,
+					       area->width, area->height);
 }
 
 static void hvfb_cfb_imageblit(struct fb_info *p,
@@ -750,7 +887,10 @@ static void hvfb_cfb_imageblit(struct fb_info *p,
 
 	cfb_imageblit(p, image);
 	if (par->synchronous_fb)
-		synthvid_update(p);
+		synthvid_update(p, 0, 0, INT_MAX, INT_MAX);
+	else
+		hvfb_ondemand_refresh_throttle(par, image->dx, image->dy,
+					       image->width, image->height);
 }
 
 static struct fb_ops hvfb_ops = {
@@ -809,6 +949,9 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info)
 	resource_size_t pot_start, pot_end;
 	int ret;
 
+	dio_fb_size =
+		screen_width * screen_height * screen_depth / 8;
+
 	if (gen2vm) {
 		pot_start = 0;
 		pot_end = -1;
@@ -843,9 +986,14 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info)
 	if (!fb_virt)
 		goto err2;
 
+	/* Allocate memory for deferred IO */
+	par->dio_vp = vzalloc(round_up(dio_fb_size, PAGE_SIZE));
+	if (par->dio_vp == NULL)
+		goto err3;
+
 	info->apertures = alloc_apertures(1);
 	if (!info->apertures)
-		goto err3;
+		goto err4;
 
 	if (gen2vm) {
 		info->apertures->ranges[0].base = screen_info.lfb_base;
@@ -857,16 +1005,23 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info)
 		info->apertures->ranges[0].size = pci_resource_len(pdev, 0);
 	}
 
+	/* Physical address of FB device */
+	par->mmio_pp = par->mem->start;
+	/* Virtual address of FB device */
+	par->mmio_vp = (unsigned char *) fb_virt;
+
 	info->fix.smem_start = par->mem->start;
-	info->fix.smem_len = screen_fb_size;
-	info->screen_base = fb_virt;
-	info->screen_size = screen_fb_size;
+	info->fix.smem_len = dio_fb_size;
+	info->screen_base = par->dio_vp;
+	info->screen_size = dio_fb_size;
 
 	if (!gen2vm)
 		pci_dev_put(pdev);
 
 	return 0;
 
+err4:
+	vfree(par->dio_vp);
 err3:
 	iounmap(fb_virt);
 err2:
@@ -884,6 +1039,7 @@ static void hvfb_putmem(struct fb_info *info)
 {
 	struct hvfb_par *par = info->par;
 
+	vfree(par->dio_vp);
 	iounmap(info->screen_base);
 	vmbus_free_mmio(par->mem->start, screen_fb_size);
 	par->mem = NULL;
@@ -909,6 +1065,11 @@ static int hvfb_probe(struct hv_device *hdev,
 	init_completion(&par->wait);
 	INIT_DELAYED_WORK(&par->dwork, hvfb_update_work);
 
+	par->delayed_refresh = false;
+	spin_lock_init(&par->delayed_refresh_lock);
+	par->x1 = par->y1 = INT_MAX;
+	par->x2 = par->y2 = 0;
+
 	/* Connect to VSP */
 	hv_set_drvdata(hdev, info);
 	ret = synthvid_connect_vsp(hdev);
@@ -960,6 +1121,10 @@ static int hvfb_probe(struct hv_device *hdev,
 	info->fbops = &hvfb_ops;
 	info->pseudo_palette = par->pseudo_palette;
 
+	/* Initialize deferred IO */
+	info->fbdefio = &synthvid_defio;
+	fb_deferred_io_init(info);
+
 	/* Send config to host */
 	ret = synthvid_send_config(hdev);
 	if (ret)
@@ -981,6 +1146,7 @@ static int hvfb_probe(struct hv_device *hdev,
 	return 0;
 
 error:
+	fb_deferred_io_cleanup(info);
 	hvfb_putmem(info);
 error2:
 	vmbus_close(hdev->channel);
@@ -1003,6 +1169,8 @@ static int hvfb_remove(struct hv_device *hdev)
 	par->update = false;
 	par->fb_ready = false;
 
+	fb_deferred_io_cleanup(info);
+
 	unregister_framebuffer(info);
 	cancel_delayed_work_sync(&par->dwork);
 
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH V4 0/3] KVM/Hyper-V: Add Hyper-V direct tlb flush support
From: Tianyu Lan @ 2019-09-18  1:55 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Tianyu Lan, kvm, linux-doc, linux-hyperv,
	linux-kernel@vger kernel org, Radim Krcmar, corbet, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Sasha Levin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	the arch/x86 maintainers, michael.h.kelley
In-Reply-To: <874l1baqnf.fsf@vitty.brq.redhat.com>

On Tue, Sep 17, 2019 at 11:28 PM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
> Paolo Bonzini <pbonzini@redhat.com> writes:
>
> > On 22/08/19 16:30, lantianyu1986@gmail.com wrote:
> >> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
> >>
> >> This patchset is to add Hyper-V direct tlb support in KVM. Hyper-V
> >> in L0 can delegate L1 hypervisor to handle tlb flush request from
> >> L2 guest when direct tlb flush is enabled in L1.
> >>
> >> Patch 2 introduces new cap KVM_CAP_HYPERV_DIRECT_TLBFLUSH to enable
> >> feature from user space. User space should enable this feature only
> >> when Hyper-V hypervisor capability is exposed to guest and KVM profile
> >> is hided. There is a parameter conflict between KVM and Hyper-V hypercall.
> >> We hope L2 guest doesn't use KVM hypercall when the feature is
> >> enabled. Detail please see comment of new API "KVM_CAP_HYPERV_DIRECT_TLBFLUSH"
> >>
> >> Change since v3:
> >>        - Update changelog in each patches.
> >>
> >> Change since v2:
> >>        - Move hv assist page(hv_pa_pg) from struct kvm  to struct kvm_hv.
> >>
> >> Change since v1:
> >>        - Fix offset issue in the patch 1.
> >>        - Update description of KVM KVM_CAP_HYPERV_DIRECT_TLBFLUSH.
> >>
> >> Tianyu Lan (2):
> >>   x86/Hyper-V: Fix definition of struct hv_vp_assist_page
> >>   KVM/Hyper-V: Add new KVM capability KVM_CAP_HYPERV_DIRECT_TLBFLUSH
> >>
> >> Vitaly Kuznetsov (1):
> >>   KVM/Hyper-V/VMX: Add direct tlb flush support
> >>
> >>  Documentation/virtual/kvm/api.txt  | 13 +++++++++++++
> >>  arch/x86/include/asm/hyperv-tlfs.h | 24 ++++++++++++++++++-----
> >>  arch/x86/include/asm/kvm_host.h    |  4 ++++
> >>  arch/x86/kvm/vmx/evmcs.h           |  2 ++
> >>  arch/x86/kvm/vmx/vmx.c             | 39 ++++++++++++++++++++++++++++++++++++++
> >>  arch/x86/kvm/x86.c                 |  8 ++++++++
> >>  include/uapi/linux/kvm.h           |  1 +
> >>  7 files changed, 86 insertions(+), 5 deletions(-)
> >>
> >
> > Queued, thanks.
> >
>
> I had a suggestion how we can get away without the new capability (like
> direct tlb flush gets automatically enabled when Hyper-V hypercall page
> is activated and we know we can't handle KVM hypercalls any more)
> but this can probably be done as a follow-up.
>

Hi Vital'y：
              Actually, I have tried your proposal but it turns out
KVM in L1 fails to
enable direct tlb flush most time after nested VM starts.
"hv_enlightenments_control.
nested_flush_hypercall" flag in evmcs is cleared by Hyper-V after run
nested VM. I still
wait answer from Hyper-V team. So far, it looks like enabling direct
tlb flush before start
nested VM is a safe way.Once get more infomration from Hyper-V team and we may
have a look to how to enable your proposal.
-- 
Best regards
Tianyu Lan

^ permalink raw reply

* Re: [PATCH V4 0/3] KVM/Hyper-V: Add Hyper-V direct tlb flush support
From: Vitaly Kuznetsov @ 2019-09-17 15:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Tianyu Lan, kvm, linux-doc, linux-hyperv, linux-kernel,
	lantianyu1986, rkrcmar, corbet, kys, haiyangz, sthemmin, sashal,
	tglx, mingo, bp, hpa, x86, michael.h.kelley
In-Reply-To: <7ea7fa06-f100-1507-8507-1c701877c8ab@redhat.com>

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 22/08/19 16:30, lantianyu1986@gmail.com wrote:
>> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
>> 
>> This patchset is to add Hyper-V direct tlb support in KVM. Hyper-V
>> in L0 can delegate L1 hypervisor to handle tlb flush request from
>> L2 guest when direct tlb flush is enabled in L1.
>> 
>> Patch 2 introduces new cap KVM_CAP_HYPERV_DIRECT_TLBFLUSH to enable
>> feature from user space. User space should enable this feature only
>> when Hyper-V hypervisor capability is exposed to guest and KVM profile
>> is hided. There is a parameter conflict between KVM and Hyper-V hypercall.
>> We hope L2 guest doesn't use KVM hypercall when the feature is
>> enabled. Detail please see comment of new API "KVM_CAP_HYPERV_DIRECT_TLBFLUSH"
>> 
>> Change since v3:
>>        - Update changelog in each patches. 
>> 
>> Change since v2:
>>        - Move hv assist page(hv_pa_pg) from struct kvm  to struct kvm_hv.
>> 
>> Change since v1:
>>        - Fix offset issue in the patch 1.
>>        - Update description of KVM KVM_CAP_HYPERV_DIRECT_TLBFLUSH.
>> 
>> Tianyu Lan (2):
>>   x86/Hyper-V: Fix definition of struct hv_vp_assist_page
>>   KVM/Hyper-V: Add new KVM capability KVM_CAP_HYPERV_DIRECT_TLBFLUSH
>> 
>> Vitaly Kuznetsov (1):
>>   KVM/Hyper-V/VMX: Add direct tlb flush support
>> 
>>  Documentation/virtual/kvm/api.txt  | 13 +++++++++++++
>>  arch/x86/include/asm/hyperv-tlfs.h | 24 ++++++++++++++++++-----
>>  arch/x86/include/asm/kvm_host.h    |  4 ++++
>>  arch/x86/kvm/vmx/evmcs.h           |  2 ++
>>  arch/x86/kvm/vmx/vmx.c             | 39 ++++++++++++++++++++++++++++++++++++++
>>  arch/x86/kvm/x86.c                 |  8 ++++++++
>>  include/uapi/linux/kvm.h           |  1 +
>>  7 files changed, 86 insertions(+), 5 deletions(-)
>> 
>
> Queued, thanks.
>

I had a suggestion how we can get away without the new capability (like
direct tlb flush gets automatically enabled when Hyper-V hypercall page
is activated and we know we can't handle KVM hypercalls any more)
but this can probably be done as a follow-up.

-- 
Vitaly

^ permalink raw reply

* Re: [PATCH 1/3] cpu/SMT: create and export cpu_smt_possible()
From: Vitaly Kuznetsov @ 2019-09-17 15:11 UTC (permalink / raw)
  To: Jim Mattson
  Cc: kvm list, LKML, linux-hyperv, the arch/x86 maintainers,
	Paolo Bonzini, Radim Krčmář, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Peter Zijlstra (Intel), Michael Kelley, Roman Kagan
In-Reply-To: <CALMp9eQP7Dup+mMuAiShNtH754R_Wwuvf63hezygh3TGR=a9rg@mail.gmail.com>

Jim Mattson <jmattson@google.com> writes:

> On Mon, Sep 16, 2019 at 9:23 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>
>> KVM needs to know if SMT is theoretically possible, this means it is
>> supported and not forcefully disabled ('nosmt=force'). Create and
>> export cpu_smt_possible() answering this question.
>
> It seems to me that KVM really just wants to know if the scheduler can
> be trusted to avoid violating the invariant expressed by the Hyper-V
> enlightenment, NoNonArchitecturalCoreSharing. It is possible to do
> that even when SMT is enabled, if the scheduler is core-aware.
> Wouldn't it be better to implement a scheduler API that told you
> exactly what you wanted to know, rather than trying to infer the
> answer from various breadcrumbs?

(I know not that much about scheduler so please bear with me)

Having a trustworthy scheduler not placing unrelated (not exposed as
sibling SMT threads to a guest or vCPUs from different guests) on
sibling SMT threads when it's not limited with affinity is definitely a
good thing. We, however, also need to know if vCPU pinning is planned
for the guest: like when QEMU vCPU threads are created they're not
pinned, however, libvirt pins them if needed before launching the
guest. So 'NoNonArchitecturalCoreSharing' can also be set in two cases:
- No vCPU pinning is planned but the scheduler is aware of the problem
(I'm not sure it is nowadays)
- The upper layer promises to do the right pinning.

This patch series, however, doesn't go that deep, it only covers the
simplest case: SMT is unavailable or forcefully disabled. I'll try to
learn more about scheduler though.

-- 
Vitaly

^ permalink raw reply

* Re: [PATCH 2/3] KVM: x86: hyper-v: set NoNonArchitecturalCoreSharing CPUID bit when SMT is impossible
From: Paolo Bonzini @ 2019-09-17 14:08 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Jim Mattson
  Cc: kvm list, LKML, linux-hyperv, the arch/x86 maintainers,
	Radim Krčmář, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Peter Zijlstra (Intel), Michael Kelley, Roman Kagan
In-Reply-To: <87ef0fb72x.fsf@vitty.brq.redhat.com>

On 17/09/19 11:33, Vitaly Kuznetsov wrote:
> Jim Mattson <jmattson@google.com> writes:
> 
>> On Mon, Sep 16, 2019 at 9:23 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>>
>>> Hyper-V 2019 doesn't expose MD_CLEAR CPUID bit to guests when it cannot
>>> guarantee that two virtual processors won't end up running on sibling SMT
>>> threads without knowing about it. This is done as an optimization as in
>>> this case there is nothing the guest can do to protect itself against MDS
>>> and issuing additional flush requests is just pointless. On bare metal the
>>> topology is known, however, when Hyper-V is running nested (e.g. on top of
>>> KVM) it needs an additional piece of information: a confirmation that the
>>> exposed topology (wrt vCPU placement on different SMT threads) is
>>> trustworthy.
>>>
>>> NoNonArchitecturalCoreSharing (CPUID 0x40000004 EAX bit 18) is described in
>>> TLFS as follows: "Indicates that a virtual processor will never share a
>>> physical core with another virtual processor, except for virtual processors
>>> that are reported as sibling SMT threads." From KVM we can give such
>>> guarantee in two cases:
>>> - SMT is unsupported or forcefully disabled (just 'disabled' doesn't work
>>>  as it can become re-enabled during the lifetime of the guest).
>>> - vCPUs are properly pinned so the scheduler won't put them on sibling
>>> SMT threads (when they're not reported as such).
>>
>> That's a nice bit of information. Have you considered a mechanism for
>> communicating this information to kvm guests in a way that doesn't
>> require Hyper-V enlightenments?
>>
> 
> (I haven't put much thought in this) but can we re-use MD_CLEAR CPUID
> bit for that? Like if the hypervisor can't guarantee usefulness
> (e.g. when two random vCPUs can be put on sibling SMT threads) of
> flushing, is there any reason to still make the guest think the feature
> is there?

Yes, that's a good idea.

Paolo

^ permalink raw reply

* Re: [PATCH 1/3] cpu/SMT: create and export cpu_smt_possible()
From: Paolo Bonzini @ 2019-09-17 14:07 UTC (permalink / raw)
  To: Jim Mattson, Vitaly Kuznetsov
  Cc: kvm list, LKML, linux-hyperv, the arch/x86 maintainers,
	Radim Krčmář, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Peter Zijlstra (Intel), Michael Kelley, Roman Kagan
In-Reply-To: <CALMp9eQP7Dup+mMuAiShNtH754R_Wwuvf63hezygh3TGR=a9rg@mail.gmail.com>

On 16/09/19 19:16, Jim Mattson wrote:
>> KVM needs to know if SMT is theoretically possible, this means it is
>> supported and not forcefully disabled ('nosmt=force'). Create and
>> export cpu_smt_possible() answering this question.
> It seems to me that KVM really just wants to know if the scheduler can
> be trusted to avoid violating the invariant expressed by the Hyper-V
> enlightenment, NoNonArchitecturalCoreSharing. It is possible to do
> that even when SMT is enabled, if the scheduler is core-aware.
> Wouldn't it be better to implement a scheduler API that told you
> exactly what you wanted to know, rather than trying to infer the
> answer from various breadcrumbs?

Yes, however that scheduler API could also rely on something like
cpu_smt_possible(), at least in the case where core scheduling is not
active, so this is still a step in the right direction.

Paolo

^ permalink raw reply

* Re: [PATCH V4 0/3] KVM/Hyper-V: Add Hyper-V direct tlb flush support
From: Paolo Bonzini @ 2019-09-17 13:27 UTC (permalink / raw)
  To: lantianyu1986, rkrcmar, corbet, kys, haiyangz, sthemmin, sashal,
	tglx, mingo, bp, hpa, x86, michael.h.kelley
  Cc: Tianyu Lan, kvm, linux-doc, linux-hyperv, linux-kernel, vkuznets
In-Reply-To: <20190822143021.7518-1-Tianyu.Lan@microsoft.com>

On 22/08/19 16:30, lantianyu1986@gmail.com wrote:
> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
> 
> This patchset is to add Hyper-V direct tlb support in KVM. Hyper-V
> in L0 can delegate L1 hypervisor to handle tlb flush request from
> L2 guest when direct tlb flush is enabled in L1.
> 
> Patch 2 introduces new cap KVM_CAP_HYPERV_DIRECT_TLBFLUSH to enable
> feature from user space. User space should enable this feature only
> when Hyper-V hypervisor capability is exposed to guest and KVM profile
> is hided. There is a parameter conflict between KVM and Hyper-V hypercall.
> We hope L2 guest doesn't use KVM hypercall when the feature is
> enabled. Detail please see comment of new API "KVM_CAP_HYPERV_DIRECT_TLBFLUSH"
> 
> Change since v3:
>        - Update changelog in each patches. 
> 
> Change since v2:
>        - Move hv assist page(hv_pa_pg) from struct kvm  to struct kvm_hv.
> 
> Change since v1:
>        - Fix offset issue in the patch 1.
>        - Update description of KVM KVM_CAP_HYPERV_DIRECT_TLBFLUSH.
> 
> Tianyu Lan (2):
>   x86/Hyper-V: Fix definition of struct hv_vp_assist_page
>   KVM/Hyper-V: Add new KVM capability KVM_CAP_HYPERV_DIRECT_TLBFLUSH
> 
> Vitaly Kuznetsov (1):
>   KVM/Hyper-V/VMX: Add direct tlb flush support
> 
>  Documentation/virtual/kvm/api.txt  | 13 +++++++++++++
>  arch/x86/include/asm/hyperv-tlfs.h | 24 ++++++++++++++++++-----
>  arch/x86/include/asm/kvm_host.h    |  4 ++++
>  arch/x86/kvm/vmx/evmcs.h           |  2 ++
>  arch/x86/kvm/vmx/vmx.c             | 39 ++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c                 |  8 ++++++++
>  include/uapi/linux/kvm.h           |  1 +
>  7 files changed, 86 insertions(+), 5 deletions(-)
> 

Queued, thanks.

Paolo

^ permalink raw reply

* Re: [PATCH 2/3] KVM: x86: hyper-v: set NoNonArchitecturalCoreSharing CPUID bit when SMT is impossible
From: Vitaly Kuznetsov @ 2019-09-17  9:33 UTC (permalink / raw)
  To: Jim Mattson
  Cc: kvm list, LKML, linux-hyperv, the arch/x86 maintainers,
	Paolo Bonzini, Radim Krčmář, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Peter Zijlstra (Intel), Michael Kelley, Roman Kagan
In-Reply-To: <CALMp9eRa0-HO+JWGDoAFO1zOtNjrutfT7d4pLxjsxn-XiAJwwQ@mail.gmail.com>

Jim Mattson <jmattson@google.com> writes:

> On Mon, Sep 16, 2019 at 9:23 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>>
>> Hyper-V 2019 doesn't expose MD_CLEAR CPUID bit to guests when it cannot
>> guarantee that two virtual processors won't end up running on sibling SMT
>> threads without knowing about it. This is done as an optimization as in
>> this case there is nothing the guest can do to protect itself against MDS
>> and issuing additional flush requests is just pointless. On bare metal the
>> topology is known, however, when Hyper-V is running nested (e.g. on top of
>> KVM) it needs an additional piece of information: a confirmation that the
>> exposed topology (wrt vCPU placement on different SMT threads) is
>> trustworthy.
>>
>> NoNonArchitecturalCoreSharing (CPUID 0x40000004 EAX bit 18) is described in
>> TLFS as follows: "Indicates that a virtual processor will never share a
>> physical core with another virtual processor, except for virtual processors
>> that are reported as sibling SMT threads." From KVM we can give such
>> guarantee in two cases:
>> - SMT is unsupported or forcefully disabled (just 'disabled' doesn't work
>>  as it can become re-enabled during the lifetime of the guest).
>> - vCPUs are properly pinned so the scheduler won't put them on sibling
>> SMT threads (when they're not reported as such).
>
> That's a nice bit of information. Have you considered a mechanism for
> communicating this information to kvm guests in a way that doesn't
> require Hyper-V enlightenments?
>

(I haven't put much thought in this) but can we re-use MD_CLEAR CPUID
bit for that? Like if the hypervisor can't guarantee usefulness
(e.g. when two random vCPUs can be put on sibling SMT threads) of
flushing, is there any reason to still make the guest think the feature
is there?

-- 
Vitaly

^ permalink raw reply

* RE: [PATCH v4] video: hyperv: hyperv_fb: Obtain screen resolution from Hyper-V host
From: Dexuan Cui @ 2019-09-16 21:48 UTC (permalink / raw)
  To: Michael Kelley, Wei Hu, b.zolnierkie@samsung.com,
	linux-hyperv@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-fbdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Stephen Hemminger, sashal@kernel.org, Haiyang Zhang,
	KY Srinivasan
  Cc: Iouri Tarassov
In-Reply-To: <PU1P153MB0169E5FA3D359C6BDD50EC34BF8C0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

> From: Dexuan Cui
> Sent: Monday, September 16, 2019 2:46 PM
> To: Michael Kelley <mikelley@microsoft.com>; Wei Hu <weh@microsoft.com>;
> b.zolnierkie@samsung.com; linux-hyperv@vger.kernel.org;
> dri-devel@lists.freedesktop.org; linux-fbdev@vger.kernel.org;
> linux-kernel@vger.kernel.org; Stephen Hemminger
> <sthemmin@microsoft.com>; sashal@kernel.org; Haiyang Zhang
> <haiyangz@microsoft.com>; KY Srinivasan <kys@microsoft.com>
> Cc: Iouri Tarassov <iourit@microsoft.com>
> Subject: RE: [PATCH v4] video: hyperv: hyperv_fb: Obtain screen resolution
> from Hyper-V host
> 
> > From: linux-hyperv-owner@vger.kernel.org
> > <linux-hyperv-owner@vger.kernel.org> On Behalf Of Dexuan Cui
> > Sent: Thursday, September 12, 2019 11:39 PM
> > To: Michael Kelley <mikelley@microsoft.com>; Wei Hu
> <weh@microsoft.com>;
> > b.zolnierkie@samsung.com; linux-hyperv@vger.kernel.org;
> > dri-devel@lists.freedesktop.org; linux-fbdev@vger.kernel.org;
> > linux-kernel@vger.kernel.org; Stephen Hemminger
> > <sthemmin@microsoft.com>; sashal@kernel.org; Haiyang Zhang
> > <haiyangz@microsoft.com>; KY Srinivasan <kys@microsoft.com>
> > Cc: Iouri Tarassov <iourit@microsoft.com>
> > Subject: RE: [PATCH v4] video: hyperv: hyperv_fb: Obtain screen resolution
> > from Hyper-V host
> >
> > > From: Michael Kelley <mikelley@microsoft.com>
> > > Sent: Thursday, September 5, 2019 7:06 AM
> > >
> > > From: Wei Hu <weh@microsoft.com> Sent: Thursday, September 5, 2019
> > 2:12
> > > AM
> > > >
> > > > Beginning from Windows 10 RS5+, VM screen resolution is obtained from
> > > host.
> > > > The "video=hyperv_fb" boot time option is not needed, but still can be
> > > > used to overwrite what the host specifies. The VM resolution on the host
> > > > could be set by executing the powershell "set-vmvideo" command.
> > > >
> > > > Signed-off-by: Iouri Tarassov <iourit@microsoft.com>
> > > > Signed-off-by: Wei Hu <weh@microsoft.com>
> > > > ---
> > > >     v2:
> > > >     - Implemented fallback when version negotiation failed.
> > > >     - Defined full size for supported_resolution array.
> > > >
> > > >     v3:
> > > >     - Corrected the synthvid major and minor version comparison
> > problem.
> > > >
> > > >     v4:
> > > >     - Changed function name to synthvid_ver_ge().
> > > >
> > > >  drivers/video/fbdev/hyperv_fb.c | 159
> > > +++++++++++++++++++++++++++++---
> > > >  1 file changed, 147 insertions(+), 12 deletions(-)
> > > >
> > >
> > > Reviewed-by: Michael Kelley <mikelley@microsoft.com>
> >
> > Looks good to me.
> >
> > Reviewed-by: Dexuan Cui <decui@microsoft.com>
> 
> Hi Wei,
> It turns out we need to make a further fix. :-)
> 
> The patch forgets to take par->update into consideration.
> 
> When the VM Connection window is closed (or minimized?),
> the host sends a message to the guest, and the guest sets
> par->update to false in synthvid_recv_sub().
> 
> If par->update is false, the guest doesn't need to call
> synthvid_update().
> 
> Thanks,
> -- Dexuan

Please ignore the last reply from me. 

It was meant to reply another mail:
RE: [PATCH v5] video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver

Sorry for the confusion.

Thanks,
-- Dexuan

^ permalink raw reply

* RE: [PATCH v5] video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver
From: Dexuan Cui @ 2019-09-16 21:46 UTC (permalink / raw)
  To: Dexuan Cui, Wei Hu, Michael Kelley, rdunlap@infradead.org,
	shc_work@mail.ru, gregkh@linuxfoundation.org,
	lee.jones@linaro.org, alexandre.belloni@bootlin.com,
	baijiaju1990@gmail.com, fthain@telegraphics.com.au,
	info@metux.net, linux-hyperv@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linux-fbdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, sashal@kernel.org,
	Stephen Hemminger, Haiyang Zhang, KY Srinivasan
In-Reply-To: <PU1P153MB0169E5E73D258A034B4869DCBFB30@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>



Thanks,
-- Dexuan

> -----Original Message-----
> From: linux-hyperv-owner@vger.kernel.org
> <linux-hyperv-owner@vger.kernel.org> On Behalf Of Dexuan Cui
> Sent: Thursday, September 12, 2019 11:38 PM
> To: Wei Hu <weh@microsoft.com>; Michael Kelley <mikelley@microsoft.com>;
> rdunlap@infradead.org; shc_work@mail.ru; gregkh@linuxfoundation.org;
> lee.jones@linaro.org; alexandre.belloni@bootlin.com;
> baijiaju1990@gmail.com; fthain@telegraphics.com.au; info@metux.net;
> linux-hyperv@vger.kernel.org; dri-devel@lists.freedesktop.org;
> linux-fbdev@vger.kernel.org; linux-kernel@vger.kernel.org; sashal@kernel.org;
> Stephen Hemminger <sthemmin@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; KY Srinivasan <kys@microsoft.com>
> Subject: RE: [PATCH v5] video: hyperv: hyperv_fb: Support deferred IO for
> Hyper-V frame buffer driver
> 
> > From: Wei Hu <weh@microsoft.com>
> > Sent: Thursday, September 12, 2019 11:03 PM
> >
> > Without deferred IO support, hyperv_fb driver informs the host to refresh
> > the entire guest frame buffer at fixed rate, e.g. at 20Hz, no matter there
> > is screen update or not. This patch supports deferred IO for screens in
> > graphics mode and also enables the frame buffer on-demand refresh. The
> > highest refresh rate is still set at 20Hz.
> >
> > Currently Hyper-V only takes a physical address from guest as the starting
> > address of frame buffer. This implies the guest must allocate contiguous
> > physical memory for frame buffer. In addition, Hyper-V Gen 2 VMs only
> > accept address from MMIO region as frame buffer address. Due to these
> > limitations on Hyper-V host, we keep a shadow copy of frame buffer
> > in the guest. This means one more copy of the dirty rectangle inside
> > guest when doing the on-demand refresh. This can be optimized in the
> > future with help from host. For now the host performance gain from deferred
> > IO outweighs the shadow copy impact in the guest.
> >
> > Signed-off-by: Wei Hu <weh@microsoft.com>
> > ---
> >     v2: Incorporated review comments from Michael Kelley
> >     - Increased dirty rectangle by one row in deferred IO case when sending
> >     to Hyper-V.
> >     - Corrected the dirty rectangle size in the text mode.
> >     - Added more comments.
> >     - Other minor code cleanups.
> >
> >     v3: Incorporated more review comments
> >     - Removed a few unnecessary variable tests
> >
> >     v4: Incorporated test and review feedback from Dexuan Cui
> >     - Not disable interrupt while acquiring docopy_lock in
> >       hvfb_update_work(). This avoids significant bootup delay in
> >       large vCPU count VMs.
> >
> >     v5: Completely remove the unnecessary docopy_lock after discussing
> >     with Dexuan Cui.
> 
> Thanks! Looks good to me.
> 
> Reviewed-by: Dexuan Cui <decui@microsoft.com>


Hi Wei,
It turns out we need to make a further fix. :-)

The patch forgets to take par->update into consideration.

When the VM Connection window is closed (or minimized?),
the host sends a message to the guest, and the guest sets
par->update to false in synthvid_recv_sub().

If par->update is false, the guest doesn't need to call
synthvid_update().

Thanks,
-- Dexuan

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox