Linux-HyperV List

Linux-HyperV List
 help / color / mirror / Atom feed

* [PATCH 1/8] scsi: add a host / host template field for the virt boundary
From: Christoph Hellwig @ 2019-06-17 12:19 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Sagi Grimberg, Max Gurtovoy, Bart Van Assche, linux-rdma,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-1-hch@lst.de>

This allows drivers setting it up easily instead of branching out to
block layer calls in slave_alloc, and ensures the upgraded
max_segment_size setting gets picked up by the DMA layer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/hosts.c     | 3 +++
 drivers/scsi/scsi_lib.c  | 3 ++-
 include/scsi/scsi_host.h | 3 +++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index ff0d8c6a8d0c..55522b7162d3 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -462,6 +462,9 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	else
 		shost->dma_boundary = 0xffffffff;
 
+	if (sht->virt_boundary_mask)
+		shost->virt_boundary_mask = sht->virt_boundary_mask;
+
 	device_initialize(&shost->shost_gendev);
 	dev_set_name(&shost->shost_gendev, "host%d", shost->host_no);
 	shost->shost_gendev.bus = &scsi_bus_type;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 65d0a10c76ad..d333bb6b1c59 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1775,7 +1775,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 	dma_set_seg_boundary(dev, shost->dma_boundary);
 
 	blk_queue_max_segment_size(q, shost->max_segment_size);
-	dma_set_max_seg_size(dev, shost->max_segment_size);
+	blk_queue_virt_boundary(q, shost->virt_boundary_mask);
+	dma_set_max_seg_size(dev, queue_max_segment_size(q));
 
 	/*
 	 * Set a reasonable default alignment:  The larger of 32-byte (dword),
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index a5fcdad4a03e..cc139dbd71e5 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -369,6 +369,8 @@ struct scsi_host_template {
 	 */
 	unsigned long dma_boundary;
 
+	unsigned long virt_boundary_mask;
+
 	/*
 	 * This specifies "machine infinity" for host templates which don't
 	 * limit the transfer size.  Note this limit represents an absolute
@@ -587,6 +589,7 @@ struct Scsi_Host {
 	unsigned int max_sectors;
 	unsigned int max_segment_size;
 	unsigned long dma_boundary;
+	unsigned long virt_boundary_mask;
 	/*
 	 * In scsi-mq mode, the number of hardware queues supported by the LLD.
 	 *
-- 
2.20.1


^ permalink raw reply related

* RE: [PATCH 10/13] megaraid_sas: set virt_boundary_mask in the scsi host
From: Kashyap Desai @ 2019-06-17  9:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Sebastian Ott, Sagi Grimberg, Max Gurtovoy,
	Bart Van Assche, Ulf Hansson, Alan Stern, Oliver Neukum,
	linux-block, linux-rdma, linux-mmc, linux-nvme, linux-scsi,
	PDL,MEGARAIDLINUX, PDL-MPT-FUSIONLINUX, linux-hyperv, linux-usb,
	usb-storage, linux-kernel
In-Reply-To: <20190617084433.GA7969@lst.de>

>
> On Fri, Jun 14, 2019 at 01:28:47AM +0530, Kashyap Desai wrote:
> > Is there any changes in API  blk_queue_virt_boundary? I could not find
> > relevant code which account for this. Can you help ?
> > Which git repo shall I use for testing ? That way I can confirm, I
> > didn't miss relevant changes.
>
> Latest mainline plus the series (which is about to get resent).
> blk_queue_virt_boundary now forced an unlimited max_hw_sectors as that
is
> how PRP-like schemes work, to work around a block driver merging bug.
But
> we also need to communicate that limit to the DMA layer so that we don't
set
> a smaller iommu segment size limitation.
>
> > >From your above explanation, it means (after this patch) max segment
> > >size
> > of the MR controller will be set to 4K.
> > Earlier it is possible to receive single SGE of 64K datalength (Since
> > max seg size was 64K), but now the same buffer will reach the driver
> > having 16 SGEs (Each SGE will contain 4K length).
>
> No, there is no more limit for the size of the segment at all, as for
PRPs each
> PRP is sort of a segment from the hardware perspective.
> We just require the segments to not have gaps, as PRPs don't allow for
that.
Thanks for clarification. I have also observed that max_segment_size Is
unchanged and it is 64K.
>
> That being said I think these patches are wrong for the case of megaraid
or
> mpt having both NVMe and SAS/ATA devices behind a single controller.
> Is that a valid configuration?
Yes. This is a valid configuration.

^ permalink raw reply

* Re: [PATCH 10/13] megaraid_sas: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-17  8:44 UTC (permalink / raw)
  To: Kashyap Desai
  Cc: Christoph Hellwig, Jens Axboe, Sebastian Ott, Sagi Grimberg,
	Max Gurtovoy, Bart Van Assche, Ulf Hansson, Alan Stern,
	Oliver Neukum, linux-block, linux-rdma, linux-mmc, linux-nvme,
	linux-scsi, PDL,MEGARAIDLINUX, PDL-MPT-FUSIONLINUX, linux-hyperv,
	linux-usb, usb-storage, linux-kernel
In-Reply-To: <98f6557ae91a7cdfe8069fcf7d788c88@mail.gmail.com>

On Fri, Jun 14, 2019 at 01:28:47AM +0530, Kashyap Desai wrote:
> Is there any changes in API  blk_queue_virt_boundary? I could not find
> relevant code which account for this. Can you help ?
> Which git repo shall I use for testing ? That way I can confirm, I didn't
> miss relevant changes.

Latest mainline plus the series (which is about to get resent).
blk_queue_virt_boundary now forced an unlimited max_hw_sectors as that
is how PRP-like schemes work, to work around a block driver merging
bug.  But we also need to communicate that limit to the DMA layer so
that we don't set a smaller iommu segment size limitation.

> >From your above explanation, it means (after this patch) max segment size
> of the MR controller will be set to 4K.
> Earlier it is possible to receive single SGE of 64K datalength (Since max
> seg size was 64K), but now the same buffer will reach the driver having 16
> SGEs (Each SGE will contain 4K length).

No, there is no more limit for the size of the segment at all,
as for PRPs each PRP is sort of a segment from the hardware perspective.
We just require the segments to not have gaps, as PRPs don't allow for
that.

That being said I think these patches are wrong for the case of megaraid
or mpt having both NVMe and SAS/ATA devices behind a single controller.
Is that a valid configuration?

^ permalink raw reply

* Re: [PATCH v2 2/5] x86: hv: hv_init.c: Add functions to allocate/deallocate page for Hyper-V
From: Maya Nakamura @ 2019-06-17  3:59 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: x86, linux-hyperv, linux-kernel, mikelley, kys, haiyangz,
	sthemmin, sashal
In-Reply-To: <87muindr9c.fsf@vitty.brq.redhat.com>

On Wed, Jun 12, 2019 at 12:36:47PM +0200, Vitaly Kuznetsov wrote:
> Maya Nakamura <m.maya.nakamura@gmail.com> writes:
> 
> > Introduce two new functions, hv_alloc_hyperv_page() and
> > hv_free_hyperv_page(), to allocate/deallocate memory with the size and
> > alignment that Hyper-V expects as a page. Although currently they are
> > not used, they are ready to be used to allocate/deallocate memory on x86
> > when their ARM64 counterparts are implemented, keeping symmetry between
> > architectures with potentially different guest page sizes.
> >
> > Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
> > ---
> >  arch/x86/hyperv/hv_init.c | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)
> >
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index e4ba467a9fc6..84baf0e9a2d4 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -98,6 +98,20 @@ EXPORT_SYMBOL_GPL(hyperv_pcpu_input_arg);
> >  u32 hv_max_vp_index;
> >  EXPORT_SYMBOL_GPL(hv_max_vp_index);
> >  
> > +void *hv_alloc_hyperv_page(void)
> > +{
> > +	BUILD_BUG_ON(!(PAGE_SIZE == HV_HYP_PAGE_SIZE));
> 
> (nit)
> 
> PAGE_SIZE != HV_HYP_PAGE_SIZE ?
> 
> > +
> > +	return (void *)__get_free_page(GFP_KERNEL);
> > +}
> > +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page);
> > +
> > +void hv_free_hyperv_page(unsigned long addr)
> > +{
> > +	free_page(addr);
> > +}
> > +EXPORT_SYMBOL_GPL(hv_free_hyperv_page);
> > +
> >  static int hv_cpu_init(unsigned int cpu)
> >  {
> >  	u64 msr_vp_index;
> 
> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> -- 
> Vitaly

Agreed. I will resubmit the patch set with this correction.

^ permalink raw reply

* Re: [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings
From: David Miller @ 2019-06-16 21:01 UTC (permalink / raw)
  To: decui
  Cc: netdev, mikelley, linux-hyperv, linux-kernel, kys, sthemmin,
	haiyangz, Alexander.Levin, olaf, apw, jasowang, vkuznets,
	marcelo.cerri
In-Reply-To: <1560574826-99551-1-git-send-email-decui@microsoft.com>

From: Dexuan Cui <decui@microsoft.com>
Date: Sat, 15 Jun 2019 05:00:57 +0000

> gcc 8.2.0 may report these bogus warnings under some condition:
> 
> warning: ‘vnew’ may be used uninitialized in this function
> warning: ‘hvs_new’ may be used uninitialized in this function
> 
> Actually, the 2 pointers are only initialized and used if the variable
> "conn_from_host" is true. The code is not buggy here.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] hvsock: fix epollout hang from race condition
From: David Miller @ 2019-06-16 20:54 UTC (permalink / raw)
  To: decui
  Cc: sunilmut, kys, haiyangz, sthemmin, sashal, mikelley, netdev,
	linux-hyperv, linux-kernel
In-Reply-To: <PU1P153MB0169BACDA500F94910849770BFE90@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

From: Dexuan Cui <decui@microsoft.com>
Date: Sat, 15 Jun 2019 03:22:32 +0000

> These warnings are not introduced by this patch from Sunil.
> 
> I'm not sure why I didn't notice these warnings before.  
> Probably my gcc version is not new eought? 
> 
> Actually these warnings are bogus, as I checked the related functions,
> which may confuse the compiler's static analysis.
> 
> I'm going to make a patch to initialize the pointers to NULL to suppress
> the warnings. My patch will be based on the latest's net.git + this patch
> from Sunil.

Sunil should then resubmit his patch against something that has the
warning suppression patch applied.

^ permalink raw reply

* RE: [PATCH] scsi: storvsc: Add ability to change scsi queue depth
From: Michael Kelley @ 2019-06-15 18:19 UTC (permalink / raw)
  To: brandonbonaby94, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, jejb@linux.ibm.com, martin.petersen@oracle.com
  Cc: brandonbonaby94, linux-hyperv@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20190614234822.5193-1-brandonbonaby94@gmail.com>

From: Branden Bonaby <brandonbonaby94@gmail.com> Sent: Friday, June 14, 2019 4:48 PM
> 
> Adding functionality to allow the SCSI queue depth to be changed,
> by utilizing the "scsi_change_queue_depth" function.
> 
> Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
> ---
>  drivers/scsi/storvsc_drv.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> index 8472de1007ff..719ca9906fc2 100644
> --- a/drivers/scsi/storvsc_drv.c
> +++ b/drivers/scsi/storvsc_drv.c
> @@ -387,6 +387,7 @@ enum storvsc_request_type {
> 
>  static int storvsc_ringbuffer_size = (128 * 1024);
>  static u32 max_outstanding_req_per_channel;
> +static int storvsc_change_queue_depth(struct scsi_device *sdev, int queue_depth);
> 
>  static int storvsc_vcpus_per_sub_channel = 4;
> 
> @@ -1711,6 +1712,7 @@ static struct scsi_host_template scsi_driver = {
>  	.dma_boundary =		PAGE_SIZE-1,
>  	.no_write_same =	1,
>  	.track_queue_depth =	1,
> +	.change_queue_depth =	storvsc_change_queue_depth,
>  };
> 
>  enum {
> @@ -1917,6 +1919,15 @@ static int storvsc_probe(struct hv_device *device,
>  	return ret;
>  }
> 
> +/* Change a scsi target's queue depth */
> +static int storvsc_change_queue_depth(struct scsi_device *sdev, int queue_depth)
> +{
> +	if (queue_depth > scsi_driver.can_queue){
> +		queue_depth = scsi_driver.can_queue;
> +	}
> +	return scsi_change_queue_depth(sdev, queue_depth);
> +}
> +
>  static int storvsc_remove(struct hv_device *dev)
>  {
>  	struct storvsc_device *stor_device = hv_get_drvdata(dev);
> --
> 2.17.1

Reviewed-by: Michael Kelley <mikelley@microsoft.com>


^ permalink raw reply

* RE: [PATCH net] hvsock: fix epollout hang from race condition
From: Dexuan Cui @ 2019-06-15 17:01 UTC (permalink / raw)
  To: Sunil Muthuswamy, David Miller
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, Michael Kelley, netdev@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <MW2PR2101MB111678811A8761515B669A47C0E90@MW2PR2101MB1116.namprd21.prod.outlook.com>

> From: Sunil Muthuswamy <sunilmut@microsoft.com>
> Sent: Saturday, June 15, 2019 12:23 AM
> To: Dexuan Cui <decui@microsoft.com>; David Miller <davem@davemloft.net>
> > ...
> > It looks a simple inline assembly code can confuse gcc. I'm not sure if I should
> > report a bug for gcc...
> >
> > I posted a patch to suppress these bogus warnings just now. The Subject is:
> >
> > [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings

David, as I described, these warnings are spurious and can be safely ignored.

Please consider not applying my two-line patch to avoid merge conflict
with Sunil's another patch in net-next.git. 

> Yes, these warnings are not specific to this patch. And, additionally these
> should already addressed in the commit ...
> I was trying to avoid making the same changes here to avoid merge
> conflicts when 'net-next' merges with 'net' next.

Yeah, I agree.

Thanks,
-- Dexuan

^ permalink raw reply

* RE: [PATCH net] hvsock: fix epollout hang from race condition
From: Sunil Muthuswamy @ 2019-06-15  7:23 UTC (permalink / raw)
  To: Dexuan Cui, David Miller
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, Michael Kelley, netdev@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <PU1P153MB0169810F29C473D44C090F6ABFE90@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>



> -----Original Message-----
> From: Dexuan Cui <decui@microsoft.com>
> Sent: Friday, June 14, 2019 10:04 PM
> To: David Miller <davem@davemloft.net>; Sunil Muthuswamy <sunilmut@microsoft.com>
> Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley <mikelley@microsoft.com>; netdev@vger.kernel.org; linux-
> hyperv@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH net] hvsock: fix epollout hang from race condition
> 
> > From: linux-hyperv-owner@vger.kernel.org
> > <linux-hyperv-owner@vger.kernel.org> On Behalf Of Dexuan Cui
> > Sent: Friday, June 14, 2019 8:23 PM
> > To: David Miller <davem@davemloft.net>; Sunil Muthuswamy
> > <sunilmut@microsoft.com>
> > Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> > <haiyangz@microsoft.com>; Stephen Hemminger
> > <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley
> > <mikelley@microsoft.com>; netdev@vger.kernel.org;
> > linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: RE: [PATCH net] hvsock: fix epollout hang from race condition
> >
> > > From: linux-hyperv-owner@vger.kernel.org
> > > <linux-hyperv-owner@vger.kernel.org> On Behalf Of David Miller
> > > Sent: Friday, June 14, 2019 7:15 PM
> > > To: Sunil Muthuswamy <sunilmut@microsoft.com>
> > >
> > > This adds lots of new warnings:
> > >
> > > net/vmw_vsock/hyperv_transport.c: In function ‘hvs_probe’:
> > > net/vmw_vsock/hyperv_transport.c:205:20: warning: ‘vnew’ may be used
> > > uninitialized in this function [-Wmaybe-uninitialized]
> > >    remote->svm_port = host_ephemeral_port++;
> > >    ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
> > > net/vmw_vsock/hyperv_transport.c:332:21: note: ‘vnew’ was declared
> > here
> > >   struct vsock_sock *vnew;
> > >                      ^~~~
> > > net/vmw_vsock/hyperv_transport.c:406:22: warning: ‘hvs_new’ may be
> > > used uninitialized in this function [-Wmaybe-uninitialized]
> > >    hvs_new->vm_srv_id = *if_type;
> > >    ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
> > > net/vmw_vsock/hyperv_transport.c:333:23: note: ‘hvs_new’ was declared
> > > here
> > >   struct hvsock *hvs, *hvs_new;
> > >                        ^~~~~~~
> >
> > Hi David,
> > These warnings are not introduced by this patch from Sunil.
> 
> Well, technically speaking, the warnings are caused by Suni's patch, but I think it should
> be a bug of gcc (I'm using "gcc (Ubuntu 8.2.0-12ubuntu1) 8.2.0"). As you can see, the
> line numbers given by gcc, i.e. line 205, line 406, are not related to the warnings.
> 
> Actually, the same warnings can repro with the below one-line patch on this commit of
> today's net.git:
>     9a33629ba6b2 ("hv_netvsc: Set probe mode to sync")
> 
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -403,6 +403,7 @@ static void hvs_open_connection(struct vmbus_channel *chan)
> 
>         set_per_channel_state(chan, conn_from_host ? new : sk);
>         vmbus_set_chn_rescind_callback(chan, hvs_close_connection);
> +       asm ("nop");
> 
>         if (conn_from_host) {
>                 new->sk_state = TCP_ESTABLISHED;
> 
> It looks a simple inline assembly code can confuse gcc. I'm not sure if I should
> report a bug for gcc...
> 
> I posted a patch to suppress these bogus warnings just now. The Subject is:
> 
> [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings
> 
> Thanks,
> -- Dexuan

Yes, these warnings are not specific to this patch. And, additionally these
should already addressed in the commit 
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=ac383f58f3c98de37fa67452acc5bd677396e9f3
I was trying to avoid making the same changes here to avoid merge
conflicts when 'net-next' merges with 'net' next.

^ permalink raw reply

* RE: [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings
From: Sunil Muthuswamy @ 2019-06-15  7:19 UTC (permalink / raw)
  To: Dexuan Cui, netdev@vger.kernel.org, davem@davemloft.net,
	Michael Kelley
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <1560574826-99551-1-git-send-email-decui@microsoft.com>



> -----Original Message-----
> From: linux-hyperv-owner@vger.kernel.org <linux-hyperv-owner@vger.kernel.org> On Behalf Of Dexuan Cui
> Sent: Friday, June 14, 2019 10:01 PM
> To: netdev@vger.kernel.org; davem@davemloft.net; Michael Kelley <mikelley@microsoft.com>
> Cc: linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Sasha Levin <Alexander.Levin@microsoft.com>;
> olaf@aepfle.de; apw@canonical.com; jasowang@redhat.com; vkuznets <vkuznets@redhat.com>; marcelo.cerri@canonical.com;
> Dexuan Cui <decui@microsoft.com>
> Subject: [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings
> 
> gcc 8.2.0 may report these bogus warnings under some condition:
> 
> warning: ‘vnew’ may be used uninitialized in this function
> warning: ‘hvs_new’ may be used uninitialized in this function
> 
> Actually, the 2 pointers are only initialized and used if the variable
> "conn_from_host" is true. The code is not buggy here.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
>  net/vmw_vsock/hyperv_transport.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
> index 8d1ea9eda8a2..cd3f47f54fa7 100644
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -329,8 +329,8 @@ static void hvs_open_connection(struct vmbus_channel *chan)
> 
>  	struct sockaddr_vm addr;
>  	struct sock *sk, *new = NULL;
> -	struct vsock_sock *vnew;
> -	struct hvsock *hvs, *hvs_new;
> +	struct vsock_sock *vnew = NULL;
> +	struct hvsock *hvs, *hvs_new = NULL;
>  	int ret;
> 
These are all already fixed under 
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=ac383f58f3c98de37fa67452acc5bd677396e9f3 
Its just that that commit hasn't merged with the 'net' branch yet.
>  	if_type = &chan->offermsg.offer.if_type;
> --
> 2.19.1


^ permalink raw reply

* Re: [PATCH] PCI: hv: Fix build error without CONFIG_SYSFS
From: Yuehaibing @ 2019-06-15  6:48 UTC (permalink / raw)
  To: Michael Kelley, bhelgaas@google.com, Stephen Hemminger,
	sashal@kernel.org, Dexuan Cui, linux-hyperv@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
In-Reply-To: <7d8ca05e-7519-45d8-e694-d31e221696d5@huawei.com>

+cc Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

On 2019/6/15 14:18, Yuehaibing wrote:
> 
> On 2019/6/2 6:59, Michael Kelley wrote:
>> From: YueHaibing <yuehaibing@huawei.com>  Sent: Friday, May 31, 2019 8:09 AM
>>>
>>> while building without CONFIG_SYSFS, fails as below:
>>>
>>> drivers/pci/controller/pci-hyperv.o: In function 'hv_pci_assign_slots':
>>> pci-hyperv.c:(.text+0x40a): undefined reference to 'pci_create_slot'
>>> drivers/pci/controller/pci-hyperv.o: In function 'pci_devices_present_work':
>>> pci-hyperv.c:(.text+0xc02): undefined reference to 'pci_destroy_slot'
>>> drivers/pci/controller/pci-hyperv.o: In function 'hv_pci_remove':
>>> pci-hyperv.c:(.text+0xe50): undefined reference to 'pci_destroy_slot'
>>> drivers/pci/controller/pci-hyperv.o: In function 'hv_eject_device_work':
>>> pci-hyperv.c:(.text+0x11f9): undefined reference to 'pci_destroy_slot'
>>>
>>> Select SYSFS while PCI_HYPERV is set to fix this.
>>>
>>
>> I'm wondering if is the right way to fix the problem.  Conceptually
>> is it possible to setup & operate virtual PCI devices like 
>> pci-hyperv.c does, even if sysfs is not present?  Or is it right to
>> always required sysfs?
>>
>> The function pci_dev_assign_slot() in slot.c has a null implementation
>> in include/linux/pci.h when CONFIG_SYSFS is not defined, which
>> seems to be trying to solve the same problem for that function.  And
>> if CONFIG_HOTPLUG_PCI is defined but CONFIG_SYSFS is not,
>> pci_hp_create_module_link() and pci_hp_remove_module_link()
>> look like they would have the same problem.  Maybe there should
>> be degenerate implementations of pci_create_slot() and
>> pci_destroy_slot() for cases when CONFIG_SYSFS is not defined?
>>
>> But I'll admit I don't know the full story behind how PCI slots
>> are represented and used, so maybe I'm off base.  I just noticed
>> the inconsistency in how other functions in slot.c are handled.
>>
>> Thoughts?
> 
> 268a03a42d33 ("PCI: drivers/pci/slot.c should depend on CONFIG_SYSFS")
> 
> make slot.o depends CONFIG_SYSFS
> 
> commit 268a03a42d3377d5fb41e6e7cbdec4e0b65cab2e
> Author: Alex Chiang <achiang@hp.com>
> Date:   Wed Jun 17 19:03:57 2009 -0600
> 
>     PCI: drivers/pci/slot.c should depend on CONFIG_SYSFS
> 
>     There is no way to interact with a physical PCI slot without
>     sysfs, so encode the dependency and prevent this build error:
> 
>         drivers/pci/slot.c: In function 'pci_hp_create_module_link':
>         drivers/pci/slot.c:327: error: 'module_kset' undeclared
> 
>     This patch _should_ make pci-sysfs.o depend on CONFIG_SYSFS too,
>     but we cannot (yet) because the PCI core merrily assumes the
>     existence of sysfs:
> 
>         drivers/built-in.o: In function `pci_bus_add_device':
>         drivers/pci/bus.c:89: undefined reference to `pci_create_sysfs_dev_files'
>         drivers/built-in.o: In function `pci_stop_dev':
>         drivers/pci/remove.c:24: undefined reference to `pci_remove_sysfs_dev_files'
> 
>     So do the minimal bit for now and figure out how to untangle it
>     later.
> 
> If No CONFIG_SYSFS, slot.o is not build
> 
>>
>> Michael
>>
>>
> 
> 
> .
> 


^ permalink raw reply

* Re: [PATCH] PCI: hv: Fix build error without CONFIG_SYSFS
From: Yuehaibing @ 2019-06-15  6:18 UTC (permalink / raw)
  To: Michael Kelley, bhelgaas@google.com, Stephen Hemminger,
	sashal@kernel.org, Dexuan Cui, linux-hyperv@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
In-Reply-To: <BYAPR21MB12211EEA95200F437C8E37ECD71A0@BYAPR21MB1221.namprd21.prod.outlook.com>


On 2019/6/2 6:59, Michael Kelley wrote:
> From: YueHaibing <yuehaibing@huawei.com>  Sent: Friday, May 31, 2019 8:09 AM
>>
>> while building without CONFIG_SYSFS, fails as below:
>>
>> drivers/pci/controller/pci-hyperv.o: In function 'hv_pci_assign_slots':
>> pci-hyperv.c:(.text+0x40a): undefined reference to 'pci_create_slot'
>> drivers/pci/controller/pci-hyperv.o: In function 'pci_devices_present_work':
>> pci-hyperv.c:(.text+0xc02): undefined reference to 'pci_destroy_slot'
>> drivers/pci/controller/pci-hyperv.o: In function 'hv_pci_remove':
>> pci-hyperv.c:(.text+0xe50): undefined reference to 'pci_destroy_slot'
>> drivers/pci/controller/pci-hyperv.o: In function 'hv_eject_device_work':
>> pci-hyperv.c:(.text+0x11f9): undefined reference to 'pci_destroy_slot'
>>
>> Select SYSFS while PCI_HYPERV is set to fix this.
>>
> 
> I'm wondering if is the right way to fix the problem.  Conceptually
> is it possible to setup & operate virtual PCI devices like 
> pci-hyperv.c does, even if sysfs is not present?  Or is it right to
> always required sysfs?
> 
> The function pci_dev_assign_slot() in slot.c has a null implementation
> in include/linux/pci.h when CONFIG_SYSFS is not defined, which
> seems to be trying to solve the same problem for that function.  And
> if CONFIG_HOTPLUG_PCI is defined but CONFIG_SYSFS is not,
> pci_hp_create_module_link() and pci_hp_remove_module_link()
> look like they would have the same problem.  Maybe there should
> be degenerate implementations of pci_create_slot() and
> pci_destroy_slot() for cases when CONFIG_SYSFS is not defined?
> 
> But I'll admit I don't know the full story behind how PCI slots
> are represented and used, so maybe I'm off base.  I just noticed
> the inconsistency in how other functions in slot.c are handled.
> 
> Thoughts?

268a03a42d33 ("PCI: drivers/pci/slot.c should depend on CONFIG_SYSFS")

make slot.o depends CONFIG_SYSFS

commit 268a03a42d3377d5fb41e6e7cbdec4e0b65cab2e
Author: Alex Chiang <achiang@hp.com>
Date:   Wed Jun 17 19:03:57 2009 -0600

    PCI: drivers/pci/slot.c should depend on CONFIG_SYSFS

    There is no way to interact with a physical PCI slot without
    sysfs, so encode the dependency and prevent this build error:

        drivers/pci/slot.c: In function 'pci_hp_create_module_link':
        drivers/pci/slot.c:327: error: 'module_kset' undeclared

    This patch _should_ make pci-sysfs.o depend on CONFIG_SYSFS too,
    but we cannot (yet) because the PCI core merrily assumes the
    existence of sysfs:

        drivers/built-in.o: In function `pci_bus_add_device':
        drivers/pci/bus.c:89: undefined reference to `pci_create_sysfs_dev_files'
        drivers/built-in.o: In function `pci_stop_dev':
        drivers/pci/remove.c:24: undefined reference to `pci_remove_sysfs_dev_files'

    So do the minimal bit for now and figure out how to untangle it
    later.

If No CONFIG_SYSFS, slot.o is not build

> 
> Michael
> 
> 


^ permalink raw reply

* RE: [PATCH net] hvsock: fix epollout hang from race condition
From: Dexuan Cui @ 2019-06-15  5:03 UTC (permalink / raw)
  To: David Miller, Sunil Muthuswamy
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, Michael Kelley, netdev@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <PU1P153MB0169BACDA500F94910849770BFE90@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

> From: linux-hyperv-owner@vger.kernel.org
> <linux-hyperv-owner@vger.kernel.org> On Behalf Of Dexuan Cui
> Sent: Friday, June 14, 2019 8:23 PM
> To: David Miller <davem@davemloft.net>; Sunil Muthuswamy
> <sunilmut@microsoft.com>
> Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley
> <mikelley@microsoft.com>; netdev@vger.kernel.org;
> linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH net] hvsock: fix epollout hang from race condition
> 
> > From: linux-hyperv-owner@vger.kernel.org
> > <linux-hyperv-owner@vger.kernel.org> On Behalf Of David Miller
> > Sent: Friday, June 14, 2019 7:15 PM
> > To: Sunil Muthuswamy <sunilmut@microsoft.com>
> >
> > This adds lots of new warnings:
> >
> > net/vmw_vsock/hyperv_transport.c: In function ‘hvs_probe’:
> > net/vmw_vsock/hyperv_transport.c:205:20: warning: ‘vnew’ may be used
> > uninitialized in this function [-Wmaybe-uninitialized]
> >    remote->svm_port = host_ephemeral_port++;
> >    ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
> > net/vmw_vsock/hyperv_transport.c:332:21: note: ‘vnew’ was declared
> here
> >   struct vsock_sock *vnew;
> >                      ^~~~
> > net/vmw_vsock/hyperv_transport.c:406:22: warning: ‘hvs_new’ may be
> > used uninitialized in this function [-Wmaybe-uninitialized]
> >    hvs_new->vm_srv_id = *if_type;
> >    ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
> > net/vmw_vsock/hyperv_transport.c:333:23: note: ‘hvs_new’ was declared
> > here
> >   struct hvsock *hvs, *hvs_new;
> >                        ^~~~~~~
> 
> Hi David,
> These warnings are not introduced by this patch from Sunil.

Well, technically speaking, the warnings are caused by Suni's patch, but I think it should
be a bug of gcc (I'm using "gcc (Ubuntu 8.2.0-12ubuntu1) 8.2.0"). As you can see, the
line numbers given by gcc, i.e. line 205, line 406, are not related to the warnings.

Actually, the same warnings can repro with the below one-line patch on this commit of
today's net.git:
    9a33629ba6b2 ("hv_netvsc: Set probe mode to sync")

--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -403,6 +403,7 @@ static void hvs_open_connection(struct vmbus_channel *chan)

        set_per_channel_state(chan, conn_from_host ? new : sk);
        vmbus_set_chn_rescind_callback(chan, hvs_close_connection);
+       asm ("nop");

        if (conn_from_host) {
                new->sk_state = TCP_ESTABLISHED;

It looks a simple inline assembly code can confuse gcc. I'm not sure if I should
report a bug for gcc...

I posted a patch to suppress these bogus warnings just now. The Subject is:

[PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings

Thanks,
-- Dexuan

^ permalink raw reply

* [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings
From: Dexuan Cui @ 2019-06-15  5:00 UTC (permalink / raw)
  To: netdev@vger.kernel.org, davem@davemloft.net, Michael Kelley
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com, Dexuan Cui

gcc 8.2.0 may report these bogus warnings under some condition:

warning: ‘vnew’ may be used uninitialized in this function
warning: ‘hvs_new’ may be used uninitialized in this function

Actually, the 2 pointers are only initialized and used if the variable
"conn_from_host" is true. The code is not buggy here.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 net/vmw_vsock/hyperv_transport.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 8d1ea9eda8a2..cd3f47f54fa7 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -329,8 +329,8 @@ static void hvs_open_connection(struct vmbus_channel *chan)
 
 	struct sockaddr_vm addr;
 	struct sock *sk, *new = NULL;
-	struct vsock_sock *vnew;
-	struct hvsock *hvs, *hvs_new;
+	struct vsock_sock *vnew = NULL;
+	struct hvsock *hvs, *hvs_new = NULL;
 	int ret;
 
 	if_type = &chan->offermsg.offer.if_type;
-- 
2.19.1


^ permalink raw reply related

* RE: [PATCH net] hvsock: fix epollout hang from race condition
From: Dexuan Cui @ 2019-06-15  3:22 UTC (permalink / raw)
  To: David Miller, Sunil Muthuswamy
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, Michael Kelley, netdev@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20190614.191456.407433636343988177.davem@davemloft.net>

> From: linux-hyperv-owner@vger.kernel.org
> <linux-hyperv-owner@vger.kernel.org> On Behalf Of David Miller
> Sent: Friday, June 14, 2019 7:15 PM
> To: Sunil Muthuswamy <sunilmut@microsoft.com>
> 
> This adds lots of new warnings:
> 
> net/vmw_vsock/hyperv_transport.c: In function ‘hvs_probe’:
> net/vmw_vsock/hyperv_transport.c:205:20: warning: ‘vnew’ may be used
> uninitialized in this function [-Wmaybe-uninitialized]
>    remote->svm_port = host_ephemeral_port++;
>    ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
> net/vmw_vsock/hyperv_transport.c:332:21: note: ‘vnew’ was declared here
>   struct vsock_sock *vnew;
>                      ^~~~
> net/vmw_vsock/hyperv_transport.c:406:22: warning: ‘hvs_new’ may be
> used uninitialized in this function [-Wmaybe-uninitialized]
>    hvs_new->vm_srv_id = *if_type;
>    ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
> net/vmw_vsock/hyperv_transport.c:333:23: note: ‘hvs_new’ was declared
> here
>   struct hvsock *hvs, *hvs_new;
>                        ^~~~~~~

Hi David,
These warnings are not introduced by this patch from Sunil.

I'm not sure why I didn't notice these warnings before.  
Probably my gcc version is not new eought? 

Actually these warnings are bogus, as I checked the related functions,
which may confuse the compiler's static analysis.

I'm going to make a patch to initialize the pointers to NULL to suppress
the warnings. My patch will be based on the latest's net.git + this patch
from Sunil.

Thanks,
-- Dexuan

^ permalink raw reply

* Re: [PATCH net] hv_netvsc: Set probe mode to sync
From: David Miller @ 2019-06-15  2:47 UTC (permalink / raw)
  To: haiyangz
  Cc: sashal, linux-hyperv, netdev, kys, sthemmin, olaf, vkuznets,
	linux-kernel
In-Reply-To: <1560459955-37900-1-git-send-email-haiyangz@microsoft.com>

From: Haiyang Zhang <haiyangz@microsoft.com>
Date: Thu, 13 Jun 2019 21:06:53 +0000

> For better consistency of synthetic NIC names, we set the probe mode to
> PROBE_FORCE_SYNCHRONOUS. So the names can be aligned with the vmbus
> channel offer sequence.
> 
> Fixes: af0a5646cb8d ("use the new async probing feature for the hyperv drivers")
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH net] hvsock: fix epollout hang from race condition
From: David Miller @ 2019-06-15  2:14 UTC (permalink / raw)
  To: sunilmut
  Cc: kys, haiyangz, sthemmin, sashal, decui, mikelley, netdev,
	linux-hyperv, linux-kernel
In-Reply-To: <MW2PR2101MB11164C6EEAA5C511B395EF3AC0EC0@MW2PR2101MB1116.namprd21.prod.outlook.com>


This adds lots of new warnings:

net/vmw_vsock/hyperv_transport.c: In function ‘hvs_probe’:
net/vmw_vsock/hyperv_transport.c:205:20: warning: ‘vnew’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   remote->svm_port = host_ephemeral_port++;
   ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
net/vmw_vsock/hyperv_transport.c:332:21: note: ‘vnew’ was declared here
  struct vsock_sock *vnew;
                     ^~~~
net/vmw_vsock/hyperv_transport.c:406:22: warning: ‘hvs_new’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   hvs_new->vm_srv_id = *if_type;
   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
net/vmw_vsock/hyperv_transport.c:333:23: note: ‘hvs_new’ was declared here
  struct hvsock *hvs, *hvs_new;
                       ^~~~~~~

^ permalink raw reply

* [PATCH] scsi: storvsc: Add ability to change scsi queue depth
From: Branden Bonaby @ 2019-06-14 23:48 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, sashal, jejb, martin.petersen
  Cc: Branden Bonaby, linux-hyperv, linux-scsi, linux-kernel

Adding functionality to allow the SCSI queue depth to be changed,
by utilizing the "scsi_change_queue_depth" function.

Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
---
 drivers/scsi/storvsc_drv.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 8472de1007ff..719ca9906fc2 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -387,6 +387,7 @@ enum storvsc_request_type {
 
 static int storvsc_ringbuffer_size = (128 * 1024);
 static u32 max_outstanding_req_per_channel;
+static int storvsc_change_queue_depth(struct scsi_device *sdev, int queue_depth);
 
 static int storvsc_vcpus_per_sub_channel = 4;
 
@@ -1711,6 +1712,7 @@ static struct scsi_host_template scsi_driver = {
 	.dma_boundary =		PAGE_SIZE-1,
 	.no_write_same =	1,
 	.track_queue_depth =	1,
+	.change_queue_depth =	storvsc_change_queue_depth,
 };
 
 enum {
@@ -1917,6 +1919,15 @@ static int storvsc_probe(struct hv_device *device,
 	return ret;
 }
 
+/* Change a scsi target's queue depth */
+static int storvsc_change_queue_depth(struct scsi_device *sdev, int queue_depth)
+{
+	if (queue_depth > scsi_driver.can_queue){
+		queue_depth = scsi_driver.can_queue;
+	}
+	return scsi_change_queue_depth(sdev, queue_depth);
+}
+
 static int storvsc_remove(struct hv_device *dev)
 {
 	struct storvsc_device *stor_device = hv_get_drvdata(dev);
-- 
2.17.1


^ permalink raw reply related

* RE: [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Dexuan Cui @ 2019-06-14 23:34 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Michael Kelley, linux-acpi@vger.kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com,
	Russ Dill, Sebastian Capella, Pavel Machek, Lorenzo Pieralisi,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <20190614223310.pwwoefu5qdvcuaiy@shell.armlinux.org.uk>

> From: linux-hyperv-owner@vger.kernel.org
> <linux-hyperv-owner@vger.kernel.org> On Behalf Of Russell King
> On Fri, Jun 14, 2019 at 10:19:02PM +0000, Dexuan Cui wrote:
> > It looks ARM does not support the ACPI S4 state, then how do we know
> > if an ARM host supports hibernation or not?
> 
> Don't forget that Linux does not support ACPI on 32-bit ARM, which is
> quite different from the situation on 64-bit ARM.
> 
> arch/arm/kernel/hibernate.c is only for 32-bit ARM, and is written with
> the assumption that there is no interaction required with any firmware
> to save state, and later restore state upon resuming.
> 
> Or am I missing something?

Hi Russell,
Thanks for your reply and please excuse me for my ignorance of ARM. 

So 32-bit ARM Linux can hibernate even if it doesn't support ACPI, but
I guess not all 32-bit ARM machines support hibernation? If my guess
is correct, is there any standard capability bit or something that can be
used to tell if an ARM machine supports hibernation? I'm purely curious. :-)

Do you imply 64-bit ARM Linux supports ACPI and the ACPI S4 state?
If not, how can we tell if a 64-bit ARM machine supports hibernation or not?

Thanks,
-- Dexuan

^ permalink raw reply

* RE: [PATCH 2/2] hv_balloon: Reorganize the probe function
From: Dexuan Cui @ 2019-06-14 23:08 UTC (permalink / raw)
  To: Michael Kelley, linux-hyperv@vger.kernel.org,
	gregkh@linuxfoundation.org, Stephen Hemminger, Sasha Levin,
	Haiyang Zhang, KY Srinivasan, linux-kernel@vger.kernel.org,
	Tianyu Lan
  Cc: olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <BL0PR2101MB13487B8D2A157AA7FCFD159DD7EE0@BL0PR2101MB1348.namprd21.prod.outlook.com>

> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Friday, June 14, 2019 2:56 PM
> > ...
> > +	ret = balloon_connect_vsp(dev);
> > +	if (ret != 0)
> > +		return ret;
> > +
> >  	dm_device.state = DM_INITIALIZED;
> > -	last_post_time = jiffies;
> 
> I was curious about the above deletion.  But I guess the line
> is not needed as the time_after() check in post_status() should
> handle an initial value of 0 for last_post_time just fine.

In a 32-bit kernel, sizeof(unsigned long) is 4, and the global 32-bit
varilable "jiffies" can overflow in 49.7 days if HZ is defined as 1000;
so in theory there is a tiny chance time_after() can not work as
expected here (i.e. we're loading hv_balloon driver when the
"jiffies" is just about to overflow, which is highly unlikely in practice);
even if that happens, we do not care, since the consequence is
just that the memory pressure reporting is delayed by 1 second. :-)

> > +
> > +	dm_device.thread =
> > +		 kthread_run(dm_thread_func, &dm_device, "hv_balloon");
> > +	if (IS_ERR(dm_device.thread)) {
> > +		ret = PTR_ERR(dm_device.thread);
> > +		goto probe_error;
> > +	}
> 
> Just an observation:  this thread creation now happens at the end of the
> probing process.  But that's good, because in the old code, the thread
> was started and could run before the protocol version had been
> negotiated.  So I'll assume your change here is intentional.

Yes, this is intentional.
 
> >
> >  	return 0;
> >
> > -probe_error2:
> > +probe_error:
> > +	vmbus_close(dev->channel);
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> > +	unregister_memory_notifier(&hv_memory_nb);
> 
> Hmmm. Evidently the above cleanup was missing in the
> old code.

Yes.
 
> >  	restore_online_page_callback(&hv_online_page);
> >  #endif
> > -	kthread_stop(dm_device.thread);
> > -
> > -probe_error1:
> > -	vmbus_close(dev->channel);
> >  	return ret;
> >  }
> >
> > @@ -1734,11 +1742,11 @@ static int balloon_remove(struct hv_device
> *dev)
> >  	cancel_work_sync(&dm->balloon_wrk.wrk);
> >  	cancel_work_sync(&dm->ha_wrk.wrk);
> >
> > -	vmbus_close(dev->channel);
> >  	kthread_stop(dm->thread);
> > +	vmbus_close(dev->channel);
> 
> Presumably this is an intentional ordering change as well.
> The kthread should be stopped before closing the channel.

Yes. The old code is buggy: after the vmbus_close(), there is
a small window in which the old code can still try to send
messages to the host via a freed ringbuffer, causing panic.
 
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> > -	restore_online_page_callback(&hv_online_page);
> >  	unregister_memory_notifier(&hv_memory_nb);
> > +	restore_online_page_callback(&hv_online_page);
> 
> And you've changed the ordering of these steps so they are
> the inverse of when they are set up.  Also a good cleanup ....

Yes. The change is not really necessary, but let's just do it
in a better manner.
 
> 
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>

Thaks for the detailed comments!

Thanks,
-- Dexuan

^ permalink raw reply

* Re: [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Russell King - ARM Linux admin @ 2019-06-14 22:33 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Michael Kelley, linux-acpi@vger.kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com,
	Russ Dill, Sebastian Capella, Pavel Machek, Lorenzo Pieralisi,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <PU1P153MB01699020B5BC4287C58F5335BFEE0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

Hi,

On Fri, Jun 14, 2019 at 10:19:02PM +0000, Dexuan Cui wrote:
> > -----Original Message-----
> > From: Michael Kelley <mikelley@microsoft.com>
> > Sent: Friday, June 14, 2019 1:48 PM
> > To: Dexuan Cui <decui@microsoft.com>; linux-acpi@vger.kernel.org;
> > rjw@rjwysocki.net; lenb@kernel.org; robert.moore@intel.com;
> > erik.schmauss@intel.com
> > Cc: linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org; KY Srinivasan
> > <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> > Haiyang Zhang <haiyangz@microsoft.com>; Sasha Levin
> > <Alexander.Levin@microsoft.com>; olaf@aepfle.de; apw@canonical.com;
> > jasowang@redhat.com; vkuznets <vkuznets@redhat.com>;
> > marcelo.cerri@canonical.com
> > Subject: RE: [PATCH] ACPI: PM: Export the function
> > acpi_sleep_state_supported()
> > 
> > It seems that sleep.c isn't built when on the ARM64 architecture.  Using
> > acpi_sleep_state_supported() directly in hv_balloon.c will be problematic
> > since hv_balloon.c needs to be architecture independent when the
> > Hyper-V ARM64 support is added.  If that doesn't change, a per-architecture
> > wrapper will be needed to give hv_balloon.c the correct information.  This
> > may affect whether acpi_sleep_state_supported() needs to be exported vs.
> > just removing the "static".   I'm not sure what the best approach is.
> > 
> > Michael
> 
> + some ARM experts who worked on arch/arm/kernel/hibernate.c.
> 
> drivers/acpi/sleep.c is only built if ACPI_SYSTEM_POWER_STATES_SUPPORT
> is defined, but it looks this option is not defined on ARM.
> 
> It looks ARM does not support the ACPI S4 state, then how do we know 
> if an ARM host supports hibernation or not?

Don't forget that Linux does not support ACPI on 32-bit ARM, which is
quite different from the situation on 64-bit ARM.

arch/arm/kernel/hibernate.c is only for 32-bit ARM, and is written with
the assumption that there is no interaction required with any firmware
to save state, and later restore state upon resuming.

Or am I missing something?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply

* RE: [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Dexuan Cui @ 2019-06-14 22:19 UTC (permalink / raw)
  To: Michael Kelley, linux-acpi@vger.kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com,
	Russell King, Russ Dill, Sebastian Capella, Pavel Machek,
	Lorenzo Pieralisi
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <BL0PR2101MB134895BADA1D8E0FA631D532D7EE0@BL0PR2101MB1348.namprd21.prod.outlook.com>

> -----Original Message-----
> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Friday, June 14, 2019 1:48 PM
> To: Dexuan Cui <decui@microsoft.com>; linux-acpi@vger.kernel.org;
> rjw@rjwysocki.net; lenb@kernel.org; robert.moore@intel.com;
> erik.schmauss@intel.com
> Cc: linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org; KY Srinivasan
> <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> Haiyang Zhang <haiyangz@microsoft.com>; Sasha Levin
> <Alexander.Levin@microsoft.com>; olaf@aepfle.de; apw@canonical.com;
> jasowang@redhat.com; vkuznets <vkuznets@redhat.com>;
> marcelo.cerri@canonical.com
> Subject: RE: [PATCH] ACPI: PM: Export the function
> acpi_sleep_state_supported()
> 
> From: Dexuan Cui <decui@microsoft.com>  Sent: Friday, June 14, 2019 11:19
> AM
> >
> > In a Linux VM running on Hyper-V, when ACPI S4 is enabled, the balloon
> > driver (drivers/hv/hv_balloon.c) needs to ask the host not to do memory
> > hot-add/remove.
> >
> > So let's export acpi_sleep_state_supported() for the hv_balloon driver.
> > This might also be useful to the other drivers in the future.
> >
> > Signed-off-by: Dexuan Cui <decui@microsoft.com>
> > ---
> >  drivers/acpi/sleep.c    | 3 ++-
> >  include/acpi/acpi_bus.h | 2 ++
> >  2 files changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
> > index a34deccd7317..69755411e008 100644
> > --- a/drivers/acpi/sleep.c
> > +++ b/drivers/acpi/sleep.c
> > @@ -79,7 +79,7 @@ static int acpi_sleep_prepare(u32 acpi_state)
> >  	return 0;
> >  }
> >
> > -static bool acpi_sleep_state_supported(u8 sleep_state)
> > +bool acpi_sleep_state_supported(u8 sleep_state)
> >  {
> >  	acpi_status status;
> >  	u8 type_a, type_b;
> > @@ -89,6 +89,7 @@ static bool acpi_sleep_state_supported(u8 sleep_state)
> >  		|| (acpi_gbl_FADT.sleep_control.address
> >  			&& acpi_gbl_FADT.sleep_status.address));
> >  }
> > +EXPORT_SYMBOL_GPL(acpi_sleep_state_supported);
> >
> >  #ifdef CONFIG_ACPI_SLEEP
> >  static u32 acpi_target_sleep_state = ACPI_STATE_S0;
> > diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
> > index 31b6c87d6240..5b102e7bbf25 100644
> > --- a/include/acpi/acpi_bus.h
> > +++ b/include/acpi/acpi_bus.h
> > @@ -651,6 +651,8 @@ static inline int acpi_pm_set_bridge_wakeup(struct
> device *dev,
> > bool enable)
> >  }
> >  #endif
> >
> > +bool acpi_sleep_state_supported(u8 sleep_state);
> > +
> >  #ifdef CONFIG_ACPI_SLEEP
> >  u32 acpi_target_system_state(void);
> >  #else
> > --
> > 2.19.1
> 
> It seems that sleep.c isn't built when on the ARM64 architecture.  Using
> acpi_sleep_state_supported() directly in hv_balloon.c will be problematic
> since hv_balloon.c needs to be architecture independent when the
> Hyper-V ARM64 support is added.  If that doesn't change, a per-architecture
> wrapper will be needed to give hv_balloon.c the correct information.  This
> may affect whether acpi_sleep_state_supported() needs to be exported vs.
> just removing the "static".   I'm not sure what the best approach is.
> 
> Michael

+ some ARM experts who worked on arch/arm/kernel/hibernate.c.

drivers/acpi/sleep.c is only built if ACPI_SYSTEM_POWER_STATES_SUPPORT
is defined, but it looks this option is not defined on ARM.

It looks ARM does not support the ACPI S4 state, then how do we know 
if an ARM host supports hibernation or not?

Thanks,
-- Dexuan

^ permalink raw reply

* RE: [PATCH 2/2] hv_balloon: Reorganize the probe function
From: Michael Kelley @ 2019-06-14 21:56 UTC (permalink / raw)
  To: Dexuan Cui, linux-hyperv@vger.kernel.org,
	gregkh@linuxfoundation.org, Stephen Hemminger, Sasha Levin,
	Haiyang Zhang, KY Srinivasan, linux-kernel@vger.kernel.org,
	Tianyu Lan
  Cc: olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <1560537692-37400-2-git-send-email-decui@microsoft.com>

From: Dexuan Cui <decui@microsoft.com>  Sent: Friday, June 14, 2019 11:43 AM
> 
> Move the code that negotiates with the host to a new function
> balloon_connect_vsp() and improve the error handling.
> 
> This makes the code more readable and paves the way for the
> support of hibernation in future.
> 
> Makes no real logic change here.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
>  drivers/hv/hv_balloon.c | 124 +++++++++++++++++++++-------------------
>  1 file changed, 66 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> index 13381ea3e3e7..111ea3599659 100644
> --- a/drivers/hv/hv_balloon.c
> +++ b/drivers/hv/hv_balloon.c
> @@ -1574,50 +1574,18 @@ static void balloon_onchannelcallback(void *context)
> 
>  }
> 
> -static int balloon_probe(struct hv_device *dev,
> -			const struct hv_vmbus_device_id *dev_id)
> +static int balloon_connect_vsp(struct hv_device *dev)
>  {
> -	int ret;
> -	unsigned long t;
>  	struct dm_version_request version_req;
>  	struct dm_capabilities cap_msg;
> -
> -#ifdef CONFIG_MEMORY_HOTPLUG
> -	do_hot_add = hot_add;
> -#else
> -	do_hot_add = false;
> -#endif
> +	unsigned long t;
> +	int ret;
> 
>  	ret = vmbus_open(dev->channel, dm_ring_size, dm_ring_size, NULL, 0,
> -			balloon_onchannelcallback, dev);
> -
> +			 balloon_onchannelcallback, dev);
>  	if (ret)
>  		return ret;
> 
> -	dm_device.dev = dev;
> -	dm_device.state = DM_INITIALIZING;
> -	dm_device.next_version = DYNMEM_PROTOCOL_VERSION_WIN8;
> -	init_completion(&dm_device.host_event);
> -	init_completion(&dm_device.config_event);
> -	INIT_LIST_HEAD(&dm_device.ha_region_list);
> -	spin_lock_init(&dm_device.ha_lock);
> -	INIT_WORK(&dm_device.balloon_wrk.wrk, balloon_up);
> -	INIT_WORK(&dm_device.ha_wrk.wrk, hot_add_req);
> -	dm_device.host_specified_ha_region = false;
> -
> -	dm_device.thread =
> -		 kthread_run(dm_thread_func, &dm_device, "hv_balloon");
> -	if (IS_ERR(dm_device.thread)) {
> -		ret = PTR_ERR(dm_device.thread);
> -		goto probe_error1;
> -	}
> -
> -#ifdef CONFIG_MEMORY_HOTPLUG
> -	set_online_page_callback(&hv_online_page);
> -	register_memory_notifier(&hv_memory_nb);
> -#endif
> -
> -	hv_set_drvdata(dev, &dm_device);
>  	/*
>  	 * Initiate the hand shake with the host and negotiate
>  	 * a version that the host can support. We start with the
> @@ -1633,16 +1601,15 @@ static int balloon_probe(struct hv_device *dev,
>  	dm_device.version = version_req.version.version;
> 
>  	ret = vmbus_sendpacket(dev->channel, &version_req,
> -				sizeof(struct dm_version_request),
> -				(unsigned long)NULL,
> -				VM_PKT_DATA_INBAND, 0);
> +			       sizeof(struct dm_version_request),
> +			       (unsigned long)NULL, VM_PKT_DATA_INBAND, 0);
>  	if (ret)
> -		goto probe_error2;
> +		goto out;
> 
>  	t = wait_for_completion_timeout(&dm_device.host_event, 5*HZ);
>  	if (t == 0) {
>  		ret = -ETIMEDOUT;
> -		goto probe_error2;
> +		goto out;
>  	}
> 
>  	/*
> @@ -1650,8 +1617,8 @@ static int balloon_probe(struct hv_device *dev,
>  	 * fail the probe function.
>  	 */
>  	if (dm_device.state == DM_INIT_ERROR) {
> -		ret = -ETIMEDOUT;
> -		goto probe_error2;
> +		ret = -EPROTO;
> +		goto out;
>  	}
> 
>  	pr_info("Using Dynamic Memory protocol version %u.%u\n",
> @@ -1684,16 +1651,15 @@ static int balloon_probe(struct hv_device *dev,
>  	cap_msg.max_page_number = -1;
> 
>  	ret = vmbus_sendpacket(dev->channel, &cap_msg,
> -				sizeof(struct dm_capabilities),
> -				(unsigned long)NULL,
> -				VM_PKT_DATA_INBAND, 0);
> +			       sizeof(struct dm_capabilities),
> +			       (unsigned long)NULL, VM_PKT_DATA_INBAND, 0);
>  	if (ret)
> -		goto probe_error2;
> +		goto out;
> 
>  	t = wait_for_completion_timeout(&dm_device.host_event, 5*HZ);
>  	if (t == 0) {
>  		ret = -ETIMEDOUT;
> -		goto probe_error2;
> +		goto out;
>  	}
> 
>  	/*
> @@ -1701,23 +1667,65 @@ static int balloon_probe(struct hv_device *dev,
>  	 * fail the probe function.
>  	 */
>  	if (dm_device.state == DM_INIT_ERROR) {
> -		ret = -ETIMEDOUT;
> -		goto probe_error2;
> +		ret = -EPROTO;
> +		goto out;
>  	}
> 
> +	return 0;
> +out:
> +	vmbus_close(dev->channel);
> +	return ret;
> +}
> +
> +static int balloon_probe(struct hv_device *dev,
> +			 const struct hv_vmbus_device_id *dev_id)
> +{
> +	int ret;
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +	do_hot_add = hot_add;
> +#else
> +	do_hot_add = false;
> +#endif
> +	dm_device.dev = dev;
> +	dm_device.state = DM_INITIALIZING;
> +	dm_device.next_version = DYNMEM_PROTOCOL_VERSION_WIN8;
> +	init_completion(&dm_device.host_event);
> +	init_completion(&dm_device.config_event);
> +	INIT_LIST_HEAD(&dm_device.ha_region_list);
> +	spin_lock_init(&dm_device.ha_lock);
> +	INIT_WORK(&dm_device.balloon_wrk.wrk, balloon_up);
> +	INIT_WORK(&dm_device.ha_wrk.wrk, hot_add_req);
> +	dm_device.host_specified_ha_region = false;
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +	set_online_page_callback(&hv_online_page);
> +	register_memory_notifier(&hv_memory_nb);
> +#endif
> +
> +	hv_set_drvdata(dev, &dm_device);
> +
> +	ret = balloon_connect_vsp(dev);
> +	if (ret != 0)
> +		return ret;
> +
>  	dm_device.state = DM_INITIALIZED;
> -	last_post_time = jiffies;

I was curious about the above deletion.  But I guess the line
is not needed as the time_after() check in post_status() should
handle an initial value of 0 for last_post_time just fine.

> +
> +	dm_device.thread =
> +		 kthread_run(dm_thread_func, &dm_device, "hv_balloon");
> +	if (IS_ERR(dm_device.thread)) {
> +		ret = PTR_ERR(dm_device.thread);
> +		goto probe_error;
> +	}

Just an observation:  this thread creation now happens at the end of the
probing process.  But that's good, because in the old code, the thread
was started and could run before the protocol version had been
negotiated.  So I'll assume your change here is intentional.

> 
>  	return 0;
> 
> -probe_error2:
> +probe_error:
> +	vmbus_close(dev->channel);
>  #ifdef CONFIG_MEMORY_HOTPLUG
> +	unregister_memory_notifier(&hv_memory_nb);

Hmmm. Evidently the above cleanup was missing in the
old code.

>  	restore_online_page_callback(&hv_online_page);
>  #endif
> -	kthread_stop(dm_device.thread);
> -
> -probe_error1:
> -	vmbus_close(dev->channel);
>  	return ret;
>  }
> 
> @@ -1734,11 +1742,11 @@ static int balloon_remove(struct hv_device *dev)
>  	cancel_work_sync(&dm->balloon_wrk.wrk);
>  	cancel_work_sync(&dm->ha_wrk.wrk);
> 
> -	vmbus_close(dev->channel);
>  	kthread_stop(dm->thread);
> +	vmbus_close(dev->channel);

Presumably this is an intentional ordering change as well.
The kthread should be stopped before closing the channel.

>  #ifdef CONFIG_MEMORY_HOTPLUG
> -	restore_online_page_callback(&hv_online_page);
>  	unregister_memory_notifier(&hv_memory_nb);
> +	restore_online_page_callback(&hv_online_page);

And you've changed the ordering of these steps so they are
the inverse of when they are set up.  Also a good cleanup ....

>  #endif
>  	spin_lock_irqsave(&dm_device.ha_lock, flags);
>  	list_for_each_entry_safe(has, tmp, &dm->ha_region_list, list) {
> --
> 2.19.1

Reviewed-by: Michael Kelley <mikelley@microsoft.com>

^ permalink raw reply

* Re: [PATCH] x86/hyperv: Disable preemption while setting reenlightenment vector
From: Vitaly Kuznetsov @ 2019-06-14 21:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Prasanna Panchamukhi, Andy Lutomirski,
	Borislav Petkov, Cathy Avery, Haiyang Zhang, H. Peter Anvin,
	Ingo Molnar, K. Y. Srinivasan, Michael Kelley (EOSG),
	Mohammed Gamal, Paolo Bonzini, Radim Krčmář,
	Roman Kagan, Sasha Levin, Stephen Hemminger, Thomas Gleixner,
	devel, kvm, linux-hyperv, x86, Dmitry Safonov
In-Reply-To: <20190614122726.GL3436@hirez.programming.kicks-ass.net>

Peter Zijlstra <peterz@infradead.org> writes:

>
> I know you probably can't change the HV interface, but I'm thinking its
> rather daft you have to specify a CPU at all for this. The HV can just
> pick one and send the notification there, who cares.

Generally speaking, hypervisor can't know if the CPU is offline (or
e.g. 'isolated') from guest's perspective so I think having an option to
specify affinity for reenlightenment notification is rather a good
thing, not bad.

(Actually, I don't remember if I tried specifying 'HV_ANY' (U32_MAX-1)
here to see what happens. But then I doubt it'll notice the fact that we 
offlined some CPU so we may get a totally unexpected IRQ there).

-- 
Vitaly

^ permalink raw reply

* Re: [PATCH] x86/hyperv: Disable preemption while setting reenlightenment vector
From: Vitaly Kuznetsov @ 2019-06-14 21:36 UTC (permalink / raw)
  To: Dmitry Safonov, Peter Zijlstra
  Cc: linux-kernel, Prasanna Panchamukhi, Andy Lutomirski,
	Borislav Petkov, Cathy Avery, Haiyang Zhang, H. Peter Anvin,
	Ingo Molnar, K. Y. Srinivasan, Michael Kelley (EOSG),
	Mohammed Gamal, Paolo Bonzini, Radim Krčmář,
	Roman Kagan, Sasha Levin, Stephen Hemminger, Thomas Gleixner,
	devel, kvm, linux-hyperv, x86
In-Reply-To: <cb9e1645-98c2-4341-d6da-4effa4f57fb1@arista.com>

Dmitry Safonov <dima@arista.com> writes:

> On 6/14/19 11:08 AM, Vitaly Kuznetsov wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>> 
>>> @@ -182,7 +182,7 @@ void set_hv_tscchange_cb(void (*cb)(void))
>>>  	struct hv_reenlightenment_control re_ctrl = {
>>>  		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
>>>  		.enabled = 1,
>>> -		.target_vp = hv_vp_index[smp_processor_id()]
>>> +		.target_vp = hv_vp_index[raw_smp_processor_id()]
>>>  	};
>>>  	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
>>>  
>> 
>> Yes, this should do, thanks! I'd also suggest to leave a comment like
>> 	/* 
>>          * This function can get preemted and migrate to a different CPU
>> 	 * but this doesn't matter. We just need to assign
>> 	 * reenlightenment notification to some online CPU. In case this
>>          * CPU goes offline, hv_cpu_die() will re-assign it to some
>>  	 * other online CPU.
>> 	 */
>
> What if the cpu goes down just before wrmsrl()?
> I mean, hv_cpu_die() will reassign another cpu, but this thread will be
> resumed on some other cpu and will write cpu number which is at that
> moment already down?
>

Right you are, we need to guarantee wrmsr() happens before the CPU gets
a chance to go offline: we don't save the cpu number anywhere, we just
read it with rdmsr() in hv_cpu_die().

>
> And I presume it's guaranteed that during hv_cpu_die() no other cpu may
> go down:
> :	new_cpu = cpumask_any_but(cpu_online_mask, cpu);
> :	re_ctrl.target_vp = hv_vp_index[new_cpu];
> :	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));

I *think* I got convinced that CPUs don't go offline simultaneously when
I was writing this.

-- 
Vitaly

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox