Linux-HyperV List

Linux-HyperV List
 help / color / mirror / Atom feed

* RE: [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings
From: Sunil Muthuswamy @ 2019-06-15  7:19 UTC (permalink / raw)
  To: Dexuan Cui, netdev@vger.kernel.org, davem@davemloft.net,
	Michael Kelley
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <1560574826-99551-1-git-send-email-decui@microsoft.com>



> -----Original Message-----
> From: linux-hyperv-owner@vger.kernel.org <linux-hyperv-owner@vger.kernel.org> On Behalf Of Dexuan Cui
> Sent: Friday, June 14, 2019 10:01 PM
> To: netdev@vger.kernel.org; davem@davemloft.net; Michael Kelley <mikelley@microsoft.com>
> Cc: linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Sasha Levin <Alexander.Levin@microsoft.com>;
> olaf@aepfle.de; apw@canonical.com; jasowang@redhat.com; vkuznets <vkuznets@redhat.com>; marcelo.cerri@canonical.com;
> Dexuan Cui <decui@microsoft.com>
> Subject: [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings
> 
> gcc 8.2.0 may report these bogus warnings under some condition:
> 
> warning: ‘vnew’ may be used uninitialized in this function
> warning: ‘hvs_new’ may be used uninitialized in this function
> 
> Actually, the 2 pointers are only initialized and used if the variable
> "conn_from_host" is true. The code is not buggy here.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
>  net/vmw_vsock/hyperv_transport.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
> index 8d1ea9eda8a2..cd3f47f54fa7 100644
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -329,8 +329,8 @@ static void hvs_open_connection(struct vmbus_channel *chan)
> 
>  	struct sockaddr_vm addr;
>  	struct sock *sk, *new = NULL;
> -	struct vsock_sock *vnew;
> -	struct hvsock *hvs, *hvs_new;
> +	struct vsock_sock *vnew = NULL;
> +	struct hvsock *hvs, *hvs_new = NULL;
>  	int ret;
> 
These are all already fixed under 
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=ac383f58f3c98de37fa67452acc5bd677396e9f3 
Its just that that commit hasn't merged with the 'net' branch yet.
>  	if_type = &chan->offermsg.offer.if_type;
> --
> 2.19.1


^ permalink raw reply

* RE: [PATCH net] hvsock: fix epollout hang from race condition
From: Sunil Muthuswamy @ 2019-06-15  7:23 UTC (permalink / raw)
  To: Dexuan Cui, David Miller
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, Michael Kelley, netdev@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <PU1P153MB0169810F29C473D44C090F6ABFE90@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>



> -----Original Message-----
> From: Dexuan Cui <decui@microsoft.com>
> Sent: Friday, June 14, 2019 10:04 PM
> To: David Miller <davem@davemloft.net>; Sunil Muthuswamy <sunilmut@microsoft.com>
> Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley <mikelley@microsoft.com>; netdev@vger.kernel.org; linux-
> hyperv@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH net] hvsock: fix epollout hang from race condition
> 
> > From: linux-hyperv-owner@vger.kernel.org
> > <linux-hyperv-owner@vger.kernel.org> On Behalf Of Dexuan Cui
> > Sent: Friday, June 14, 2019 8:23 PM
> > To: David Miller <davem@davemloft.net>; Sunil Muthuswamy
> > <sunilmut@microsoft.com>
> > Cc: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> > <haiyangz@microsoft.com>; Stephen Hemminger
> > <sthemmin@microsoft.com>; sashal@kernel.org; Michael Kelley
> > <mikelley@microsoft.com>; netdev@vger.kernel.org;
> > linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: RE: [PATCH net] hvsock: fix epollout hang from race condition
> >
> > > From: linux-hyperv-owner@vger.kernel.org
> > > <linux-hyperv-owner@vger.kernel.org> On Behalf Of David Miller
> > > Sent: Friday, June 14, 2019 7:15 PM
> > > To: Sunil Muthuswamy <sunilmut@microsoft.com>
> > >
> > > This adds lots of new warnings:
> > >
> > > net/vmw_vsock/hyperv_transport.c: In function ‘hvs_probe’:
> > > net/vmw_vsock/hyperv_transport.c:205:20: warning: ‘vnew’ may be used
> > > uninitialized in this function [-Wmaybe-uninitialized]
> > >    remote->svm_port = host_ephemeral_port++;
> > >    ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
> > > net/vmw_vsock/hyperv_transport.c:332:21: note: ‘vnew’ was declared
> > here
> > >   struct vsock_sock *vnew;
> > >                      ^~~~
> > > net/vmw_vsock/hyperv_transport.c:406:22: warning: ‘hvs_new’ may be
> > > used uninitialized in this function [-Wmaybe-uninitialized]
> > >    hvs_new->vm_srv_id = *if_type;
> > >    ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
> > > net/vmw_vsock/hyperv_transport.c:333:23: note: ‘hvs_new’ was declared
> > > here
> > >   struct hvsock *hvs, *hvs_new;
> > >                        ^~~~~~~
> >
> > Hi David,
> > These warnings are not introduced by this patch from Sunil.
> 
> Well, technically speaking, the warnings are caused by Suni's patch, but I think it should
> be a bug of gcc (I'm using "gcc (Ubuntu 8.2.0-12ubuntu1) 8.2.0"). As you can see, the
> line numbers given by gcc, i.e. line 205, line 406, are not related to the warnings.
> 
> Actually, the same warnings can repro with the below one-line patch on this commit of
> today's net.git:
>     9a33629ba6b2 ("hv_netvsc: Set probe mode to sync")
> 
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -403,6 +403,7 @@ static void hvs_open_connection(struct vmbus_channel *chan)
> 
>         set_per_channel_state(chan, conn_from_host ? new : sk);
>         vmbus_set_chn_rescind_callback(chan, hvs_close_connection);
> +       asm ("nop");
> 
>         if (conn_from_host) {
>                 new->sk_state = TCP_ESTABLISHED;
> 
> It looks a simple inline assembly code can confuse gcc. I'm not sure if I should
> report a bug for gcc...
> 
> I posted a patch to suppress these bogus warnings just now. The Subject is:
> 
> [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings
> 
> Thanks,
> -- Dexuan

Yes, these warnings are not specific to this patch. And, additionally these
should already addressed in the commit 
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=ac383f58f3c98de37fa67452acc5bd677396e9f3
I was trying to avoid making the same changes here to avoid merge
conflicts when 'net-next' merges with 'net' next.

^ permalink raw reply

* RE: [PATCH net] hvsock: fix epollout hang from race condition
From: Dexuan Cui @ 2019-06-15 17:01 UTC (permalink / raw)
  To: Sunil Muthuswamy, David Miller
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, Michael Kelley, netdev@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <MW2PR2101MB111678811A8761515B669A47C0E90@MW2PR2101MB1116.namprd21.prod.outlook.com>

> From: Sunil Muthuswamy <sunilmut@microsoft.com>
> Sent: Saturday, June 15, 2019 12:23 AM
> To: Dexuan Cui <decui@microsoft.com>; David Miller <davem@davemloft.net>
> > ...
> > It looks a simple inline assembly code can confuse gcc. I'm not sure if I should
> > report a bug for gcc...
> >
> > I posted a patch to suppress these bogus warnings just now. The Subject is:
> >
> > [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings

David, as I described, these warnings are spurious and can be safely ignored.

Please consider not applying my two-line patch to avoid merge conflict
with Sunil's another patch in net-next.git. 

> Yes, these warnings are not specific to this patch. And, additionally these
> should already addressed in the commit ...
> I was trying to avoid making the same changes here to avoid merge
> conflicts when 'net-next' merges with 'net' next.

Yeah, I agree.

Thanks,
-- Dexuan

^ permalink raw reply

* RE: [PATCH] scsi: storvsc: Add ability to change scsi queue depth
From: Michael Kelley @ 2019-06-15 18:19 UTC (permalink / raw)
  To: brandonbonaby94, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, jejb@linux.ibm.com, martin.petersen@oracle.com
  Cc: brandonbonaby94, linux-hyperv@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20190614234822.5193-1-brandonbonaby94@gmail.com>

From: Branden Bonaby <brandonbonaby94@gmail.com> Sent: Friday, June 14, 2019 4:48 PM
> 
> Adding functionality to allow the SCSI queue depth to be changed,
> by utilizing the "scsi_change_queue_depth" function.
> 
> Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
> ---
>  drivers/scsi/storvsc_drv.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> index 8472de1007ff..719ca9906fc2 100644
> --- a/drivers/scsi/storvsc_drv.c
> +++ b/drivers/scsi/storvsc_drv.c
> @@ -387,6 +387,7 @@ enum storvsc_request_type {
> 
>  static int storvsc_ringbuffer_size = (128 * 1024);
>  static u32 max_outstanding_req_per_channel;
> +static int storvsc_change_queue_depth(struct scsi_device *sdev, int queue_depth);
> 
>  static int storvsc_vcpus_per_sub_channel = 4;
> 
> @@ -1711,6 +1712,7 @@ static struct scsi_host_template scsi_driver = {
>  	.dma_boundary =		PAGE_SIZE-1,
>  	.no_write_same =	1,
>  	.track_queue_depth =	1,
> +	.change_queue_depth =	storvsc_change_queue_depth,
>  };
> 
>  enum {
> @@ -1917,6 +1919,15 @@ static int storvsc_probe(struct hv_device *device,
>  	return ret;
>  }
> 
> +/* Change a scsi target's queue depth */
> +static int storvsc_change_queue_depth(struct scsi_device *sdev, int queue_depth)
> +{
> +	if (queue_depth > scsi_driver.can_queue){
> +		queue_depth = scsi_driver.can_queue;
> +	}
> +	return scsi_change_queue_depth(sdev, queue_depth);
> +}
> +
>  static int storvsc_remove(struct hv_device *dev)
>  {
>  	struct storvsc_device *stor_device = hv_get_drvdata(dev);
> --
> 2.17.1

Reviewed-by: Michael Kelley <mikelley@microsoft.com>


^ permalink raw reply

* Re: [PATCH net] hvsock: fix epollout hang from race condition
From: David Miller @ 2019-06-16 20:54 UTC (permalink / raw)
  To: decui
  Cc: sunilmut, kys, haiyangz, sthemmin, sashal, mikelley, netdev,
	linux-hyperv, linux-kernel
In-Reply-To: <PU1P153MB0169BACDA500F94910849770BFE90@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

From: Dexuan Cui <decui@microsoft.com>
Date: Sat, 15 Jun 2019 03:22:32 +0000

> These warnings are not introduced by this patch from Sunil.
> 
> I'm not sure why I didn't notice these warnings before.  
> Probably my gcc version is not new eought? 
> 
> Actually these warnings are bogus, as I checked the related functions,
> which may confuse the compiler's static analysis.
> 
> I'm going to make a patch to initialize the pointers to NULL to suppress
> the warnings. My patch will be based on the latest's net.git + this patch
> from Sunil.

Sunil should then resubmit his patch against something that has the
warning suppression patch applied.

^ permalink raw reply

* Re: [PATCH net] hv_sock: Suppress bogus "may be used uninitialized" warnings
From: David Miller @ 2019-06-16 21:01 UTC (permalink / raw)
  To: decui
  Cc: netdev, mikelley, linux-hyperv, linux-kernel, kys, sthemmin,
	haiyangz, Alexander.Levin, olaf, apw, jasowang, vkuznets,
	marcelo.cerri
In-Reply-To: <1560574826-99551-1-git-send-email-decui@microsoft.com>

From: Dexuan Cui <decui@microsoft.com>
Date: Sat, 15 Jun 2019 05:00:57 +0000

> gcc 8.2.0 may report these bogus warnings under some condition:
> 
> warning: ‘vnew’ may be used uninitialized in this function
> warning: ‘hvs_new’ may be used uninitialized in this function
> 
> Actually, the 2 pointers are only initialized and used if the variable
> "conn_from_host" is true. The code is not buggy here.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

Applied.

^ permalink raw reply

* Re: [PATCH v2 2/5] x86: hv: hv_init.c: Add functions to allocate/deallocate page for Hyper-V
From: Maya Nakamura @ 2019-06-17  3:59 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: x86, linux-hyperv, linux-kernel, mikelley, kys, haiyangz,
	sthemmin, sashal
In-Reply-To: <87muindr9c.fsf@vitty.brq.redhat.com>

On Wed, Jun 12, 2019 at 12:36:47PM +0200, Vitaly Kuznetsov wrote:
> Maya Nakamura <m.maya.nakamura@gmail.com> writes:
> 
> > Introduce two new functions, hv_alloc_hyperv_page() and
> > hv_free_hyperv_page(), to allocate/deallocate memory with the size and
> > alignment that Hyper-V expects as a page. Although currently they are
> > not used, they are ready to be used to allocate/deallocate memory on x86
> > when their ARM64 counterparts are implemented, keeping symmetry between
> > architectures with potentially different guest page sizes.
> >
> > Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
> > ---
> >  arch/x86/hyperv/hv_init.c | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)
> >
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index e4ba467a9fc6..84baf0e9a2d4 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -98,6 +98,20 @@ EXPORT_SYMBOL_GPL(hyperv_pcpu_input_arg);
> >  u32 hv_max_vp_index;
> >  EXPORT_SYMBOL_GPL(hv_max_vp_index);
> >  
> > +void *hv_alloc_hyperv_page(void)
> > +{
> > +	BUILD_BUG_ON(!(PAGE_SIZE == HV_HYP_PAGE_SIZE));
> 
> (nit)
> 
> PAGE_SIZE != HV_HYP_PAGE_SIZE ?
> 
> > +
> > +	return (void *)__get_free_page(GFP_KERNEL);
> > +}
> > +EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page);
> > +
> > +void hv_free_hyperv_page(unsigned long addr)
> > +{
> > +	free_page(addr);
> > +}
> > +EXPORT_SYMBOL_GPL(hv_free_hyperv_page);
> > +
> >  static int hv_cpu_init(unsigned int cpu)
> >  {
> >  	u64 msr_vp_index;
> 
> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> -- 
> Vitaly

Agreed. I will resubmit the patch set with this correction.

^ permalink raw reply

* Re: [PATCH 10/13] megaraid_sas: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-17  8:44 UTC (permalink / raw)
  To: Kashyap Desai
  Cc: Christoph Hellwig, Jens Axboe, Sebastian Ott, Sagi Grimberg,
	Max Gurtovoy, Bart Van Assche, Ulf Hansson, Alan Stern,
	Oliver Neukum, linux-block, linux-rdma, linux-mmc, linux-nvme,
	linux-scsi, PDL,MEGARAIDLINUX, PDL-MPT-FUSIONLINUX, linux-hyperv,
	linux-usb, usb-storage, linux-kernel
In-Reply-To: <98f6557ae91a7cdfe8069fcf7d788c88@mail.gmail.com>

On Fri, Jun 14, 2019 at 01:28:47AM +0530, Kashyap Desai wrote:
> Is there any changes in API  blk_queue_virt_boundary? I could not find
> relevant code which account for this. Can you help ?
> Which git repo shall I use for testing ? That way I can confirm, I didn't
> miss relevant changes.

Latest mainline plus the series (which is about to get resent).
blk_queue_virt_boundary now forced an unlimited max_hw_sectors as that
is how PRP-like schemes work, to work around a block driver merging
bug.  But we also need to communicate that limit to the DMA layer so
that we don't set a smaller iommu segment size limitation.

> >From your above explanation, it means (after this patch) max segment size
> of the MR controller will be set to 4K.
> Earlier it is possible to receive single SGE of 64K datalength (Since max
> seg size was 64K), but now the same buffer will reach the driver having 16
> SGEs (Each SGE will contain 4K length).

No, there is no more limit for the size of the segment at all,
as for PRPs each PRP is sort of a segment from the hardware perspective.
We just require the segments to not have gaps, as PRPs don't allow for
that.

That being said I think these patches are wrong for the case of megaraid
or mpt having both NVMe and SAS/ATA devices behind a single controller.
Is that a valid configuration?

^ permalink raw reply

* RE: [PATCH 10/13] megaraid_sas: set virt_boundary_mask in the scsi host
From: Kashyap Desai @ 2019-06-17  9:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Sebastian Ott, Sagi Grimberg, Max Gurtovoy,
	Bart Van Assche, Ulf Hansson, Alan Stern, Oliver Neukum,
	linux-block, linux-rdma, linux-mmc, linux-nvme, linux-scsi,
	PDL,MEGARAIDLINUX, PDL-MPT-FUSIONLINUX, linux-hyperv, linux-usb,
	usb-storage, linux-kernel
In-Reply-To: <20190617084433.GA7969@lst.de>

>
> On Fri, Jun 14, 2019 at 01:28:47AM +0530, Kashyap Desai wrote:
> > Is there any changes in API  blk_queue_virt_boundary? I could not find
> > relevant code which account for this. Can you help ?
> > Which git repo shall I use for testing ? That way I can confirm, I
> > didn't miss relevant changes.
>
> Latest mainline plus the series (which is about to get resent).
> blk_queue_virt_boundary now forced an unlimited max_hw_sectors as that
is
> how PRP-like schemes work, to work around a block driver merging bug.
But
> we also need to communicate that limit to the DMA layer so that we don't
set
> a smaller iommu segment size limitation.
>
> > >From your above explanation, it means (after this patch) max segment
> > >size
> > of the MR controller will be set to 4K.
> > Earlier it is possible to receive single SGE of 64K datalength (Since
> > max seg size was 64K), but now the same buffer will reach the driver
> > having 16 SGEs (Each SGE will contain 4K length).
>
> No, there is no more limit for the size of the segment at all, as for
PRPs each
> PRP is sort of a segment from the hardware perspective.
> We just require the segments to not have gaps, as PRPs don't allow for
that.
Thanks for clarification. I have also observed that max_segment_size Is
unchanged and it is 64K.
>
> That being said I think these patches are wrong for the case of megaraid
or
> mpt having both NVMe and SAS/ATA devices behind a single controller.
> Is that a valid configuration?
Yes. This is a valid configuration.

^ permalink raw reply

* [PATCH 1/8] scsi: add a host / host template field for the virt boundary
From: Christoph Hellwig @ 2019-06-17 12:19 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Sagi Grimberg, Max Gurtovoy, Bart Van Assche, linux-rdma,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-1-hch@lst.de>

This allows drivers setting it up easily instead of branching out to
block layer calls in slave_alloc, and ensures the upgraded
max_segment_size setting gets picked up by the DMA layer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/hosts.c     | 3 +++
 drivers/scsi/scsi_lib.c  | 3 ++-
 include/scsi/scsi_host.h | 3 +++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index ff0d8c6a8d0c..55522b7162d3 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -462,6 +462,9 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	else
 		shost->dma_boundary = 0xffffffff;
 
+	if (sht->virt_boundary_mask)
+		shost->virt_boundary_mask = sht->virt_boundary_mask;
+
 	device_initialize(&shost->shost_gendev);
 	dev_set_name(&shost->shost_gendev, "host%d", shost->host_no);
 	shost->shost_gendev.bus = &scsi_bus_type;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 65d0a10c76ad..d333bb6b1c59 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1775,7 +1775,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 	dma_set_seg_boundary(dev, shost->dma_boundary);
 
 	blk_queue_max_segment_size(q, shost->max_segment_size);
-	dma_set_max_seg_size(dev, shost->max_segment_size);
+	blk_queue_virt_boundary(q, shost->virt_boundary_mask);
+	dma_set_max_seg_size(dev, queue_max_segment_size(q));
 
 	/*
 	 * Set a reasonable default alignment:  The larger of 32-byte (dword),
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index a5fcdad4a03e..cc139dbd71e5 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -369,6 +369,8 @@ struct scsi_host_template {
 	 */
 	unsigned long dma_boundary;
 
+	unsigned long virt_boundary_mask;
+
 	/*
 	 * This specifies "machine infinity" for host templates which don't
 	 * limit the transfer size.  Note this limit represents an absolute
@@ -587,6 +589,7 @@ struct Scsi_Host {
 	unsigned int max_sectors;
 	unsigned int max_segment_size;
 	unsigned long dma_boundary;
+	unsigned long virt_boundary_mask;
 	/*
 	 * In scsi-mq mode, the number of hardware queues supported by the LLD.
 	 *
-- 
2.20.1


^ permalink raw reply related

* [PATCH 4/8] storvsc: set virt_boundary_mask in the scsi host template
From: Christoph Hellwig @ 2019-06-17 12:19 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Sagi Grimberg, Max Gurtovoy, Bart Van Assche, linux-rdma,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-1-hch@lst.de>

This ensures all proper DMA layer handling is taken care of by the
SCSI midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/storvsc_drv.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index b89269120a2d..7ed6f2fc1446 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1422,9 +1422,6 @@ static int storvsc_device_configure(struct scsi_device *sdevice)
 {
 	blk_queue_rq_timeout(sdevice->request_queue, (storvsc_timeout * HZ));
 
-	/* Ensure there are no gaps in presented sgls */
-	blk_queue_virt_boundary(sdevice->request_queue, PAGE_SIZE - 1);
-
 	sdevice->no_write_same = 1;
 
 	/*
@@ -1697,6 +1694,8 @@ static struct scsi_host_template scsi_driver = {
 	.this_id =		-1,
 	/* Make sure we dont get a sg segment crosses a page boundary */
 	.dma_boundary =		PAGE_SIZE-1,
+	/* Ensure there are no gaps in presented sgls */
+	.virt_boundary_mask =	PAGE_SIZE-1,
 	.no_write_same =	1,
 	.track_queue_depth =	1,
 };
-- 
2.20.1


^ permalink raw reply related

* [PATCH 8/8] megaraid_sas: set an unlimited max_segment_size
From: Christoph Hellwig @ 2019-06-17 12:20 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Sagi Grimberg, Max Gurtovoy, Bart Van Assche, linux-rdma,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-1-hch@lst.de>

When using a virt_boundary_mask, as done for NVMe devices attached to
megaraid_sas controllers we require an unlimited max_segment_size, as
the virt boundary merging code assumes that.  But we also need to
propagate that to the DMA mapping layer to make dma-debug happy.  The
SCSI layer takes care of that when using the per-host virt_boundary
setting, but given that megaraid_sas only wants to set the virt_boundary
for actual NVMe devices we can't rely on that.  The DMA layer maximum
segment is global to the HBA however, so we have to set it explicitly.
This patch assumes that megaraid_sas does not have a segment size
limitation, which seems true based on the SGL format, but will need
to be verified.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 3dd1df472dc6..59f709dbbab9 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -3207,6 +3207,7 @@ static struct scsi_host_template megasas_template = {
 	.shost_attrs = megaraid_host_attrs,
 	.bios_param = megasas_bios_param,
 	.change_queue_depth = scsi_change_queue_depth,
+	.max_segment_size = 0xffffffff,
 	.no_write_same = 1,
 };

-- 
2.20.1

^ permalink raw reply related

* [PATCH 7/8] mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAs
From: Christoph Hellwig @ 2019-06-17 12:19 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Sagi Grimberg, Max Gurtovoy, Bart Van Assche, linux-rdma,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-1-hch@lst.de>

When using a virt_boundary_mask, as done for NVMe devices attached to
mpt3sas controllers we require an unlimited max_segment_size, as the
virt boundary merging code assumes that.  But we also need to propagate
that to the DMA mapping layer to make dma-debug happy.  The SCSI layer
takes care of that when using the per-host virt_boundary setting, but
given that mpt3sas only wants to set the virt_boundary for actual
NVMe devices we can't rely on that.  The DMA layer maximum segment
is global to the HBA however, so we have to set it explicitly.  This
patch assumes that mpt3sas does not have a segment size limitation,
which seems true based on the SGL format, but will need to be verified.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 1ccfbc7eebe0..c719b807f6d8 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -10222,6 +10222,7 @@ static struct scsi_host_template mpt3sas_driver_template = {
 	.this_id			= -1,
 	.sg_tablesize			= MPT3SAS_SG_DEPTH,
 	.max_sectors			= 32767,
+	.max_segment_size		= 0xffffffff,
 	.cmd_per_lun			= 7,
 	.shost_attrs			= mpt3sas_host_attrs,
 	.sdev_attrs			= mpt3sas_dev_attrs,
-- 
2.20.1

^ permalink raw reply related

* [PATCH 6/8] IB/srp: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-17 12:19 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Sagi Grimberg, Max Gurtovoy, Bart Van Assche, linux-rdma,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-1-hch@lst.de>

This ensures all proper DMA layer handling is taken care of by the
SCSI midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 4305da2c9037..b3a4ebd85046 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -3063,20 +3063,6 @@ static int srp_target_alloc(struct scsi_target *starget)
 	return 0;
 }
 
-static int srp_slave_alloc(struct scsi_device *sdev)
-{
-	struct Scsi_Host *shost = sdev->host;
-	struct srp_target_port *target = host_to_target(shost);
-	struct srp_device *srp_dev = target->srp_host->srp_dev;
-	struct ib_device *ibdev = srp_dev->dev;
-
-	if (!(ibdev->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG))
-		blk_queue_virt_boundary(sdev->request_queue,
-					~srp_dev->mr_page_mask);
-
-	return 0;
-}
-
 static int srp_slave_configure(struct scsi_device *sdev)
 {
 	struct Scsi_Host *shost = sdev->host;
@@ -3279,7 +3265,6 @@ static struct scsi_host_template srp_template = {
 	.name				= "InfiniBand SRP initiator",
 	.proc_name			= DRV_NAME,
 	.target_alloc			= srp_target_alloc,
-	.slave_alloc			= srp_slave_alloc,
 	.slave_configure		= srp_slave_configure,
 	.info				= srp_target_info,
 	.queuecommand			= srp_queuecommand,
@@ -3814,6 +3799,9 @@ static ssize_t srp_create_target(struct device *dev,
 	target_host->max_cmd_len = sizeof ((struct srp_cmd *) (void *) 0L)->cdb;
 	target_host->max_segment_size = ib_dma_max_seg_size(ibdev);
 
+	if (!(ibdev->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG))
+		target_host->virt_boundary_mask = ~srp_dev->mr_page_mask;
+
 	target = host_to_target(target_host);
 
 	target->net		= kobj_ns_grab_current(KOBJ_NS_TYPE_NET);
-- 
2.20.1


^ permalink raw reply related

* [PATCH 5/8] IB/iser: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-17 12:19 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Sagi Grimberg, Max Gurtovoy, Bart Van Assche, linux-rdma,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-1-hch@lst.de>

This ensures all proper DMA layer handling is taken care of by the
SCSI midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/infiniband/ulp/iser/iscsi_iser.c | 35 +++++-------------------
 1 file changed, 7 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 9c185a8dabd3..841b66397a57 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -613,6 +613,7 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
 	struct Scsi_Host *shost;
 	struct iser_conn *iser_conn = NULL;
 	struct ib_conn *ib_conn;
+	struct ib_device *ib_dev;
 	u32 max_fr_sectors;
 
 	shost = iscsi_host_alloc(&iscsi_iser_sht, 0, 0);
@@ -643,16 +644,19 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
 		}
 
 		ib_conn = &iser_conn->ib_conn;
+		ib_dev = ib_conn->device->ib_device;
 		if (ib_conn->pi_support) {
-			u32 sig_caps = ib_conn->device->ib_device->attrs.sig_prot_cap;
+			u32 sig_caps = ib_dev->attrs.sig_prot_cap;
 
 			scsi_host_set_prot(shost, iser_dif_prot_caps(sig_caps));
 			scsi_host_set_guard(shost, SHOST_DIX_GUARD_IP |
 						   SHOST_DIX_GUARD_CRC);
 		}
 
-		if (iscsi_host_add(shost,
-				   ib_conn->device->ib_device->dev.parent)) {
+		if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG))
+			shost->virt_boundary_mask = ~MASK_4K;
+
+		if (iscsi_host_add(shost, ib_dev->dev.parent)) {
 			mutex_unlock(&iser_conn->state_mutex);
 			goto free_host;
 		}
@@ -958,30 +962,6 @@ static umode_t iser_attr_is_visible(int param_type, int param)
 	return 0;
 }
 
-static int iscsi_iser_slave_alloc(struct scsi_device *sdev)
-{
-	struct iscsi_session *session;
-	struct iser_conn *iser_conn;
-	struct ib_device *ib_dev;
-
-	mutex_lock(&unbind_iser_conn_mutex);
-
-	session = starget_to_session(scsi_target(sdev))->dd_data;
-	iser_conn = session->leadconn->dd_data;
-	if (!iser_conn) {
-		mutex_unlock(&unbind_iser_conn_mutex);
-		return -ENOTCONN;
-	}
-	ib_dev = iser_conn->ib_conn.device->ib_device;
-
-	if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG))
-		blk_queue_virt_boundary(sdev->request_queue, ~MASK_4K);
-
-	mutex_unlock(&unbind_iser_conn_mutex);
-
-	return 0;
-}
-
 static struct scsi_host_template iscsi_iser_sht = {
 	.module                 = THIS_MODULE,
 	.name                   = "iSCSI Initiator over iSER",
@@ -994,7 +974,6 @@ static struct scsi_host_template iscsi_iser_sht = {
 	.eh_device_reset_handler= iscsi_eh_device_reset,
 	.eh_target_reset_handler = iscsi_eh_recover_target,
 	.target_alloc		= iscsi_target_alloc,
-	.slave_alloc            = iscsi_iser_slave_alloc,
 	.proc_name              = "iscsi_iser",
 	.this_id                = -1,
 	.track_queue_depth	= 1,
-- 
2.20.1


^ permalink raw reply related

* [PATCH 3/8] ufshcd: set max_segment_size in the scsi host template
From: Christoph Hellwig @ 2019-06-17 12:19 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Sagi Grimberg, Max Gurtovoy, Bart Van Assche, linux-rdma,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-1-hch@lst.de>

We need to also mirror the value to the device to ensure IOMMU merging
doesn't undo it, and the SCSI host level parameter will ensure that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/ufs/ufshcd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 3fe3029617a8..505d625ed28d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4587,8 +4587,6 @@ static int ufshcd_slave_configure(struct scsi_device *sdev)
 	struct request_queue *q = sdev->request_queue;
 
 	blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1);
-	blk_queue_max_segment_size(q, PRDT_DATA_BYTE_COUNT_MAX);
-
 	return 0;
 }
 
@@ -6991,6 +6989,7 @@ static struct scsi_host_template ufshcd_driver_template = {
 	.sg_tablesize		= SG_ALL,
 	.cmd_per_lun		= UFSHCD_CMD_PER_LUN,
 	.can_queue		= UFSHCD_CAN_QUEUE,
+	.max_segment_size	= PRDT_DATA_BYTE_COUNT_MAX,
 	.max_host_blocked	= 1,
 	.track_queue_depth	= 1,
 	.sdev_groups		= ufshcd_driver_groups,
-- 
2.20.1


^ permalink raw reply related

* properly communicate queue limits to the DMA layer v2
From: Christoph Hellwig @ 2019-06-17 12:19 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Sagi Grimberg, Max Gurtovoy, Bart Van Assche, linux-rdma,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel

Hi Martin,

we've always had a bit of a problem communicating the block layer
queue limits to the DMA layer, which needs to respect them when
an IOMMU that could merge segments is used.  Unfortunately most
drivers don't get this right.  Oddly enough we've been mostly
getting away with it, although lately dma-debug has been catching
a few of those issues.

The segment merging fix for devices with PRP-like structures seems
to have escalated this a bit.  The first patch fixes the actual
report from Sebastian, while the rest fix every drivers that appears
to have the problem based on a code audit looking for drivers using
blk_queue_max_segment_size, blk_queue_segment_boundary or
blk_queue_virt_boundary and calling dma_map_sg eventually.

For SCSI drivers I've taken the blk_queue_virt_boundary setting
to the SCSI core, similar to how we did it for the other two settings
a while ago.  This also deals with the fact that the DMA layer
settings are on a per-device granularity, so the per-device settings
in a few SCSI drivers can't actually work in an IOMMU environment.

It would be nice to eventually pass these limits as arguments to
dma_map_sg, but that is a far too big series for the 5.2 merge
window.

Changes since v1:
 - dropped block layer parts merged by Jens
 - dropped the usb-storage / uas changes, as the virt_boundary usage
   there will be dropped soon
 - reworked the mpt3sas / megaraid_sas changes to keep per-device
   settings

^ permalink raw reply

* [PATCH 2/8] scsi: take the DMA max mapping size into account
From: Christoph Hellwig @ 2019-06-17 12:19 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: Sagi Grimberg, Max Gurtovoy, Bart Van Assche, linux-rdma,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-1-hch@lst.de>

We need to limit the devices max_sectors to what the DMA mapping
implementation can support.  If not we risk running out of swiotlb
buffers easily.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/scsi_lib.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d333bb6b1c59..f233bfd84cd7 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1768,6 +1768,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 		blk_queue_max_integrity_segments(q, shost->sg_prot_tablesize);
 	}
 
+	shost->max_sectors = min_t(unsigned int, shost->max_sectors,
+			dma_max_mapping_size(dev) << SECTOR_SHIFT);
 	blk_queue_max_hw_sectors(q, shost->max_sectors);
 	if (shost->unchecked_isa_dma)
 		blk_queue_bounce_limit(q, BLK_BOUNCE_ISA);
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Pavel Machek @ 2019-06-17 13:08 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Michael Kelley, linux-acpi@vger.kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com,
	Russell King, Russ Dill, Sebastian Capella, Lorenzo Pieralisi,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <PU1P153MB01699020B5BC4287C58F5335BFEE0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

[-- Attachment #1: Type: text/plain, Size: 1672 bytes --]

Hi!

> > > In a Linux VM running on Hyper-V, when ACPI S4 is enabled, the balloon
> > > driver (drivers/hv/hv_balloon.c) needs to ask the host not to do memory
> > > hot-add/remove.
> > >
> > > So let's export acpi_sleep_state_supported() for the hv_balloon driver.
> > > This might also be useful to the other drivers in the future.
> > >
> > > Signed-off-by: Dexuan Cui <decui@microsoft.com>

> > 
> > It seems that sleep.c isn't built when on the ARM64 architecture.  Using
> > acpi_sleep_state_supported() directly in hv_balloon.c will be problematic
> > since hv_balloon.c needs to be architecture independent when the
> > Hyper-V ARM64 support is added.  If that doesn't change, a per-architecture
> > wrapper will be needed to give hv_balloon.c the correct information.  This
> > may affect whether acpi_sleep_state_supported() needs to be exported vs.
> > just removing the "static".   I'm not sure what the best approach is.
> > 
> > Michael
> 
> + some ARM experts who worked on arch/arm/kernel/hibernate.c.
> 
> drivers/acpi/sleep.c is only built if ACPI_SYSTEM_POWER_STATES_SUPPORT
> is defined, but it looks this option is not defined on ARM.
> 
> It looks ARM does not support the ACPI S4 state, then how do we know 
> if an ARM host supports hibernation or not?

You should be able to do hibernation without ACPI S4 support. All you
need is ability to powerdown...

It is well possible that noone tested hibernation on ARM.. people
usually do suspend-to-ram there.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply

* Re: [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Pavel Machek @ 2019-06-17 13:09 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Michael Kelley, linux-acpi@vger.kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com,
	Russell King, Russ Dill, Sebastian Capella, Lorenzo Pieralisi,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <PU1P153MB01699020B5BC4287C58F5335BFEE0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

[-- Attachment #1: Type: text/plain, Size: 1189 bytes --]


> > It seems that sleep.c isn't built when on the ARM64 architecture.  Using
> > acpi_sleep_state_supported() directly in hv_balloon.c will be problematic
> > since hv_balloon.c needs to be architecture independent when the
> > Hyper-V ARM64 support is added.  If that doesn't change, a per-architecture
> > wrapper will be needed to give hv_balloon.c the correct information.  This
> > may affect whether acpi_sleep_state_supported() needs to be exported vs.
> > just removing the "static".   I'm not sure what the best approach is.
> > 
> > Michael
> 
> + some ARM experts who worked on arch/arm/kernel/hibernate.c.
> 
> drivers/acpi/sleep.c is only built if ACPI_SYSTEM_POWER_STATES_SUPPORT
> is defined, but it looks this option is not defined on ARM.
> 
> It looks ARM does not support the ACPI S4 state, then how do we know 
> if an ARM host supports hibernation or not?

But actually... I remember ELCE talk about hibernation or ARM32. Not
sure if patches are mainline, but someone was working on that.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply

* Re: [PATCH] ACPI: PM: Export the function acpi_sleep_state_supported()
From: Lorenzo Pieralisi @ 2019-06-17 16:14 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Michael Kelley, linux-acpi@vger.kernel.org, rjw@rjwysocki.net,
	lenb@kernel.org, robert.moore@intel.com, erik.schmauss@intel.com,
	Russell King, Russ Dill, Sebastian Capella, Pavel Machek,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Sasha Levin,
	olaf@aepfle.de, apw@canonical.com, jasowang@redhat.com, vkuznets,
	marcelo.cerri@canonical.com
In-Reply-To: <PU1P153MB01699020B5BC4287C58F5335BFEE0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>

On Fri, Jun 14, 2019 at 10:19:02PM +0000, Dexuan Cui wrote:
> > -----Original Message-----
> > From: Michael Kelley <mikelley@microsoft.com>
> > Sent: Friday, June 14, 2019 1:48 PM
> > To: Dexuan Cui <decui@microsoft.com>; linux-acpi@vger.kernel.org;
> > rjw@rjwysocki.net; lenb@kernel.org; robert.moore@intel.com;
> > erik.schmauss@intel.com
> > Cc: linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org; KY Srinivasan
> > <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> > Haiyang Zhang <haiyangz@microsoft.com>; Sasha Levin
> > <Alexander.Levin@microsoft.com>; olaf@aepfle.de; apw@canonical.com;
> > jasowang@redhat.com; vkuznets <vkuznets@redhat.com>;
> > marcelo.cerri@canonical.com
> > Subject: RE: [PATCH] ACPI: PM: Export the function
> > acpi_sleep_state_supported()
> > 
> > From: Dexuan Cui <decui@microsoft.com>  Sent: Friday, June 14, 2019 11:19
> > AM
> > >
> > > In a Linux VM running on Hyper-V, when ACPI S4 is enabled, the balloon
> > > driver (drivers/hv/hv_balloon.c) needs to ask the host not to do memory
> > > hot-add/remove.
> > >
> > > So let's export acpi_sleep_state_supported() for the hv_balloon driver.
> > > This might also be useful to the other drivers in the future.
> > >
> > > Signed-off-by: Dexuan Cui <decui@microsoft.com>
> > > ---
> > >  drivers/acpi/sleep.c    | 3 ++-
> > >  include/acpi/acpi_bus.h | 2 ++
> > >  2 files changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
> > > index a34deccd7317..69755411e008 100644
> > > --- a/drivers/acpi/sleep.c
> > > +++ b/drivers/acpi/sleep.c
> > > @@ -79,7 +79,7 @@ static int acpi_sleep_prepare(u32 acpi_state)
> > >  	return 0;
> > >  }
> > >
> > > -static bool acpi_sleep_state_supported(u8 sleep_state)
> > > +bool acpi_sleep_state_supported(u8 sleep_state)
> > >  {
> > >  	acpi_status status;
> > >  	u8 type_a, type_b;
> > > @@ -89,6 +89,7 @@ static bool acpi_sleep_state_supported(u8 sleep_state)
> > >  		|| (acpi_gbl_FADT.sleep_control.address
> > >  			&& acpi_gbl_FADT.sleep_status.address));
> > >  }
> > > +EXPORT_SYMBOL_GPL(acpi_sleep_state_supported);
> > >
> > >  #ifdef CONFIG_ACPI_SLEEP
> > >  static u32 acpi_target_sleep_state = ACPI_STATE_S0;
> > > diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
> > > index 31b6c87d6240..5b102e7bbf25 100644
> > > --- a/include/acpi/acpi_bus.h
> > > +++ b/include/acpi/acpi_bus.h
> > > @@ -651,6 +651,8 @@ static inline int acpi_pm_set_bridge_wakeup(struct
> > device *dev,
> > > bool enable)
> > >  }
> > >  #endif
> > >
> > > +bool acpi_sleep_state_supported(u8 sleep_state);
> > > +
> > >  #ifdef CONFIG_ACPI_SLEEP
> > >  u32 acpi_target_system_state(void);
> > >  #else
> > > --
> > > 2.19.1
> > 
> > It seems that sleep.c isn't built when on the ARM64 architecture.  Using
> > acpi_sleep_state_supported() directly in hv_balloon.c will be problematic
> > since hv_balloon.c needs to be architecture independent when the
> > Hyper-V ARM64 support is added.  If that doesn't change, a per-architecture
> > wrapper will be needed to give hv_balloon.c the correct information.  This
> > may affect whether acpi_sleep_state_supported() needs to be exported vs.
> > just removing the "static".   I'm not sure what the best approach is.
> > 
> > Michael
> 
> + some ARM experts who worked on arch/arm/kernel/hibernate.c.
> 
> drivers/acpi/sleep.c is only built if ACPI_SYSTEM_POWER_STATES_SUPPORT
> is defined, but it looks this option is not defined on ARM.
> 
> It looks ARM does not support the ACPI S4 state, then how do we know 
> if an ARM host supports hibernation or not?

Maybe we should start from understanding why you need to know whether
Hibernate is possible to answer your question ?

On ARM64 platforms system states are entered through PSCI firmware
interface that works for ACPI and device tree alike.

Lorenzo

^ permalink raw reply

* [PATCHv2] x86/hyperv: Hold cpus_read_lock() on assigning reenlightenment vector
From: Dmitry Safonov @ 2019-06-17 16:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Prasanna Panchamukhi,
	Andy Lutomirski, Borislav Petkov, Cathy Avery, Haiyang Zhang,
	H. Peter Anvin, Ingo Molnar, K. Y. Srinivasan,
	Michael Kelley (EOSG), Mohammed Gamal, Paolo Bonzini,
	Peter Zijlstra, Radim Krčmář, Roman Kagan,
	Sasha Levin, Stephen Hemminger, Thomas Gleixner, Vitaly Kuznetsov,
	devel, kvm, linux-hyperv, x86

KVM support may be compiled as dynamic module, which triggers the
following splat on modprobe (under CONFIG_DEBUG_PREEMPT):

 KVM: vmx: using Hyper-V Enlightened VMCS
 BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/466
 caller is debug_smp_processor_id+0x17/0x19
 CPU: 0 PID: 466 Comm: modprobe Kdump: loaded Not tainted 4.19.43 #1
 Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
 Call Trace:
  dump_stack+0x61/0x7e
  check_preemption_disabled+0xd4/0xe6
  debug_smp_processor_id+0x17/0x19
  set_hv_tscchange_cb+0x1b/0x89
  kvm_arch_init+0x14a/0x163 [kvm]
  kvm_init+0x30/0x259 [kvm]
  vmx_init+0xed/0x3db [kvm_intel]
  do_one_initcall+0x89/0x1bc
  do_init_module+0x5f/0x207
  load_module+0x1b34/0x209b
  __ia32_sys_init_module+0x17/0x19
  do_fast_syscall_32+0x121/0x1fa
  entry_SYSENTER_compat+0x7f/0x91

Hold cpus_read_lock() so that MSR will be written for an online CPU,
even if set_hv_tscchange_cb() gets being preempted.
While at it, cleanup smp_processor_id()'s in hv_cpu_init() and add a
lockdep assert into hv_cpu_die().

Fixes: 93286261de1b4 ("x86/hyperv: Reenlightenment notifications
support")

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Cathy Avery <cavery@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: "Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>
Cc: Mohammed Gamal <mmorsy@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Roman Kagan <rkagan@virtuozzo.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>

Cc: devel@linuxdriverproject.org
Cc: kvm@vger.kernel.org
Cc: linux-hyperv@vger.kernel.org
Cc: x86@kernel.org
Reported-by: Prasanna Panchamukhi <panchamukhi@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
v1 link: lkml.kernel.org/r/20190611212003.26382-1-dima@arista.com

NOTE that I hadn't a chance to test v2 on hyperv machine so far,
ONLY BUILD TESTED. (In hope that the patch still makes sense and Kbuild
bot will report any issue).

 arch/x86/hyperv/hv_init.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 1608050e9df9..ec7fd7d6c125 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -20,6 +20,7 @@
 #include <linux/clockchips.h>
 #include <linux/hyperv.h>
 #include <linux/slab.h>
+#include <linux/cpu.h>
 #include <linux/cpuhotplug.h>
 
 #ifdef CONFIG_HYPERV_TSCPAGE
@@ -91,7 +92,7 @@ EXPORT_SYMBOL_GPL(hv_max_vp_index);
 static int hv_cpu_init(unsigned int cpu)
 {
 	u64 msr_vp_index;
-	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
+	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
 	void **input_arg;
 	struct page *pg;
 
@@ -103,7 +104,7 @@ static int hv_cpu_init(unsigned int cpu)
 
 	hv_get_vp_index(msr_vp_index);
 
-	hv_vp_index[smp_processor_id()] = msr_vp_index;
+	hv_vp_index[cpu] = msr_vp_index;
 
 	if (msr_vp_index > hv_max_vp_index)
 		hv_max_vp_index = msr_vp_index;
@@ -182,7 +183,6 @@ void set_hv_tscchange_cb(void (*cb)(void))
 	struct hv_reenlightenment_control re_ctrl = {
 		.vector = HYPERV_REENLIGHTENMENT_VECTOR,
 		.enabled = 1,
-		.target_vp = hv_vp_index[smp_processor_id()]
 	};
 	struct hv_tsc_emulation_control emu_ctrl = {.enabled = 1};
 
@@ -196,7 +196,16 @@ void set_hv_tscchange_cb(void (*cb)(void))
 	/* Make sure callback is registered before we write to MSRs */
 	wmb();
 
+	/*
+	 * As reenlightenment vector is global, there is no difference which
+	 * CPU will register MSR, though it should be an online CPU.
+	 * hv_cpu_die() callback guarantees that on CPU teardown
+	 * another CPU will re-register MSR back.
+	 */
+	cpus_read_lock();
+	re_ctrl.target_vp = hv_vp_index[raw_smp_processor_id()];
 	wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
+	cpus_read_unlock();
 	wrmsrl(HV_X64_MSR_TSC_EMULATION_CONTROL, *((u64 *)&emu_ctrl));
 }
 EXPORT_SYMBOL_GPL(set_hv_tscchange_cb);
@@ -239,6 +248,7 @@ static int hv_cpu_die(unsigned int cpu)
 
 	rdmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
 	if (re_ctrl.target_vp == hv_vp_index[cpu]) {
+		lockdep_assert_cpus_held();
 		/* Reassign to some other online CPU */
 		new_cpu = cpumask_any_but(cpu_online_mask, cpu);
 
-- 
2.22.0


^ permalink raw reply related

* RE: [PATCH] scsi: storvsc: Add ability to change scsi queue depth
From: KY Srinivasan @ 2019-06-17 17:08 UTC (permalink / raw)
  To: Michael Kelley, brandonbonaby94, Haiyang Zhang, Stephen Hemminger,
	sashal@kernel.org, jejb@linux.ibm.com, martin.petersen@oracle.com
  Cc: brandonbonaby94, linux-hyperv@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <BL0PR2101MB13482F73342F77866D6A6AABD7E90@BL0PR2101MB1348.namprd21.prod.outlook.com>



> -----Original Message-----
> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Saturday, June 15, 2019 11:19 AM
> To: brandonbonaby94 <brandonbonaby94@gmail.com>; KY Srinivasan
> <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen
> Hemminger <sthemmin@microsoft.com>; sashal@kernel.org;
> jejb@linux.ibm.com; martin.petersen@oracle.com
> Cc: brandonbonaby94 <brandonbonaby94@gmail.com>; linux-
> hyperv@vger.kernel.org; linux-scsi@vger.kernel.org; linux-
> kernel@vger.kernel.org
> Subject: RE: [PATCH] scsi: storvsc: Add ability to change scsi queue depth
> 
> From: Branden Bonaby <brandonbonaby94@gmail.com> Sent: Friday, June
> 14, 2019 4:48 PM
> >
> > Adding functionality to allow the SCSI queue depth to be changed,
> > by utilizing the "scsi_change_queue_depth" function.
> >
> > Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
> > ---
> >  drivers/scsi/storvsc_drv.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> > index 8472de1007ff..719ca9906fc2 100644
> > --- a/drivers/scsi/storvsc_drv.c
> > +++ b/drivers/scsi/storvsc_drv.c
> > @@ -387,6 +387,7 @@ enum storvsc_request_type {
> >
> >  static int storvsc_ringbuffer_size = (128 * 1024);
> >  static u32 max_outstanding_req_per_channel;
> > +static int storvsc_change_queue_depth(struct scsi_device *sdev, int
> queue_depth);
> >
> >  static int storvsc_vcpus_per_sub_channel = 4;
> >
> > @@ -1711,6 +1712,7 @@ static struct scsi_host_template scsi_driver = {
> >  	.dma_boundary =		PAGE_SIZE-1,
> >  	.no_write_same =	1,
> >  	.track_queue_depth =	1,
> > +	.change_queue_depth =	storvsc_change_queue_depth,
> >  };
> >
> >  enum {
> > @@ -1917,6 +1919,15 @@ static int storvsc_probe(struct hv_device
> *device,
> >  	return ret;
> >  }
> >
> > +/* Change a scsi target's queue depth */
> > +static int storvsc_change_queue_depth(struct scsi_device *sdev, int
> queue_depth)
> > +{
> > +	if (queue_depth > scsi_driver.can_queue){
> > +		queue_depth = scsi_driver.can_queue;
> > +	}
> > +	return scsi_change_queue_depth(sdev, queue_depth);
> > +}
> > +
> >  static int storvsc_remove(struct hv_device *dev)
> >  {
> >  	struct storvsc_device *stor_device = hv_get_drvdata(dev);
> > --
> > 2.17.1
> 
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>


^ permalink raw reply

* Re: [PATCH 5/8] IB/iser: set virt_boundary_mask in the scsi host
From: Sagi Grimberg @ 2019-06-17 17:34 UTC (permalink / raw)
  To: Christoph Hellwig, Martin K . Petersen
  Cc: Max Gurtovoy, Bart Van Assche, linux-rdma, linux-scsi,
	megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-6-hch@lst.de>

Acked-by: Sagi Grimberg <sagi@grimberg.me>

^ permalink raw reply

* Re: [PATCH 6/8] IB/srp: set virt_boundary_mask in the scsi host
From: Sagi Grimberg @ 2019-06-17 17:35 UTC (permalink / raw)
  To: Christoph Hellwig, Martin K . Petersen
  Cc: Max Gurtovoy, Bart Van Assche, linux-rdma, linux-scsi,
	megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-kernel
In-Reply-To: <20190617122000.22181-7-hch@lst.de>

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox