Linux-HyperV List
 help / color / mirror / Atom feed
* [PATCH v2 5/5] Input: hv: Remove dependencies on PAGE_SIZE for ring buffer
From: Maya Nakamura @ 2019-06-06  8:09 UTC (permalink / raw)
  To: mikelley, kys, haiyangz, sthemmin, sashal; +Cc: x86, linux-hyperv, linux-kernel
In-Reply-To: <cover.1559807514.git.m.maya.nakamura@gmail.com>

Define the ring buffer size as a constant expression because it should
not depend on the guest page size.

Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
---
 drivers/input/serio/hyperv-keyboard.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/serio/hyperv-keyboard.c b/drivers/input/serio/hyperv-keyboard.c
index 7935e52b5435..a3480dbfadd2 100644
--- a/drivers/input/serio/hyperv-keyboard.c
+++ b/drivers/input/serio/hyperv-keyboard.c
@@ -83,8 +83,8 @@ struct synth_kbd_keystroke {
 
 #define HK_MAXIMUM_MESSAGE_SIZE 256
 
-#define KBD_VSC_SEND_RING_BUFFER_SIZE		(10 * PAGE_SIZE)
-#define KBD_VSC_RECV_RING_BUFFER_SIZE		(10 * PAGE_SIZE)
+#define KBD_VSC_SEND_RING_BUFFER_SIZE		(40 * 1024)
+#define KBD_VSC_RECV_RING_BUFFER_SIZE		(40 * 1024)
 
 #define XTKBD_EMUL0     0xe0
 #define XTKBD_EMUL1     0xe1
-- 
2.17.1


^ permalink raw reply related

* [PATCH v2 4/5] HID: hv: Remove dependencies on PAGE_SIZE for ring buffer
From: Maya Nakamura @ 2019-06-06  8:07 UTC (permalink / raw)
  To: mikelley, kys, haiyangz, sthemmin, sashal; +Cc: x86, linux-hyperv, linux-kernel
In-Reply-To: <cover.1559807514.git.m.maya.nakamura@gmail.com>

Define the ring buffer size as a constant expression because it should
not depend on the guest page size.

Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
---
 drivers/hid/hid-hyperv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/hid/hid-hyperv.c b/drivers/hid/hid-hyperv.c
index d3311d714d35..e8b154fa38e2 100644
--- a/drivers/hid/hid-hyperv.c
+++ b/drivers/hid/hid-hyperv.c
@@ -112,8 +112,8 @@ struct synthhid_input_report {
 
 #pragma pack(pop)
 
-#define INPUTVSC_SEND_RING_BUFFER_SIZE		(10*PAGE_SIZE)
-#define INPUTVSC_RECV_RING_BUFFER_SIZE		(10*PAGE_SIZE)
+#define INPUTVSC_SEND_RING_BUFFER_SIZE		(40 * 1024)
+#define INPUTVSC_RECV_RING_BUFFER_SIZE		(40 * 1024)
 
 
 enum pipe_prot_msg_type {
-- 
2.17.1


^ permalink raw reply related

* [PATCH v2 3/5] hv: vmbus: Replace page definition with Hyper-V specific one
From: Maya Nakamura @ 2019-06-06  8:06 UTC (permalink / raw)
  To: mikelley, kys, haiyangz, sthemmin, sashal; +Cc: x86, linux-hyperv, linux-kernel
In-Reply-To: <cover.1559807514.git.m.maya.nakamura@gmail.com>

Replace PAGE_SIZE with HV_HYP_PAGE_SIZE because the guest page size may
not be 4096 on all architectures and Hyper-V always runs with a page
size of 4096.

Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
---
 drivers/hv/hyperv_vmbus.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index e5467b821f41..5489b061d261 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -208,11 +208,11 @@ int hv_ringbuffer_read(struct vmbus_channel *channel,
 		       u64 *requestid, bool raw);
 
 /*
- * Maximum channels is determined by the size of the interrupt page
- * which is PAGE_SIZE. 1/2 of PAGE_SIZE is for send endpoint interrupt
- * and the other is receive endpoint interrupt
+ * Maximum channels, 16348, is determined by the size of the interrupt page,
+ * which is HV_HYP_PAGE_SIZE. 1/2 of HV_HYP_PAGE_SIZE is to send endpoint
+ * interrupt, and the other is to receive endpoint interrupt.
  */
-#define MAX_NUM_CHANNELS	((PAGE_SIZE >> 1) << 3)	/* 16348 channels */
+#define MAX_NUM_CHANNELS	((HV_HYP_PAGE_SIZE >> 1) << 3)
 
 /* The value here must be in multiple of 32 */
 /* TODO: Need to make this configurable */
-- 
2.17.1


^ permalink raw reply related

* [PATCH v2 2/5] x86: hv: hv_init.c: Add functions to allocate/deallocate page for Hyper-V
From: Maya Nakamura @ 2019-06-06  8:05 UTC (permalink / raw)
  To: mikelley, kys, haiyangz, sthemmin, sashal; +Cc: x86, linux-hyperv, linux-kernel
In-Reply-To: <cover.1559807514.git.m.maya.nakamura@gmail.com>

Introduce two new functions, hv_alloc_hyperv_page() and
hv_free_hyperv_page(), to allocate/deallocate memory with the size and
alignment that Hyper-V expects as a page. Although currently they are
not used, they are ready to be used to allocate/deallocate memory on x86
when their ARM64 counterparts are implemented, keeping symmetry between
architectures with potentially different guest page sizes.

Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
---
 arch/x86/hyperv/hv_init.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index e4ba467a9fc6..84baf0e9a2d4 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -98,6 +98,20 @@ EXPORT_SYMBOL_GPL(hyperv_pcpu_input_arg);
 u32 hv_max_vp_index;
 EXPORT_SYMBOL_GPL(hv_max_vp_index);
 
+void *hv_alloc_hyperv_page(void)
+{
+	BUILD_BUG_ON(!(PAGE_SIZE == HV_HYP_PAGE_SIZE));
+
+	return (void *)__get_free_page(GFP_KERNEL);
+}
+EXPORT_SYMBOL_GPL(hv_alloc_hyperv_page);
+
+void hv_free_hyperv_page(unsigned long addr)
+{
+	free_page(addr);
+}
+EXPORT_SYMBOL_GPL(hv_free_hyperv_page);
+
 static int hv_cpu_init(unsigned int cpu)
 {
 	u64 msr_vp_index;
-- 
2.17.1


^ permalink raw reply related

* [PATCH v2 1/5] x86: hv: hyperv-tlfs.h: Create and use Hyper-V page definitions
From: Maya Nakamura @ 2019-06-06  8:03 UTC (permalink / raw)
  To: mikelley, kys, haiyangz, sthemmin, sashal; +Cc: x86, linux-hyperv, linux-kernel
In-Reply-To: <cover.1559807514.git.m.maya.nakamura@gmail.com>

Define HV_HYP_PAGE_SHIFT, HV_HYP_PAGE_SIZE, and HV_HYP_PAGE_MASK because
the Linux guest page size and hypervisor page size concepts are
different, even though they happen to be the same value on x86.

Also, replace PAGE_SIZE with HV_HYP_PAGE_SIZE.

Signed-off-by: Maya Nakamura <m.maya.nakamura@gmail.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index cdf44aa9a501..44bd68aefd00 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -12,6 +12,16 @@
 #include <linux/types.h>
 #include <asm/page.h>
 
+/*
+ * While not explicitly listed in the TLFS, Hyper-V always runs with a page size
+ * of 4096. These definitions are used when communicating with Hyper-V using
+ * guest physical pages and guest physical page addresses, since the guest page
+ * size may not be 4096 on all architectures.
+ */
+#define HV_HYP_PAGE_SHIFT	12
+#define HV_HYP_PAGE_SIZE	BIT(HV_HYP_PAGE_SHIFT)
+#define HV_HYP_PAGE_MASK	(~(HV_HYP_PAGE_SIZE - 1))
+
 /*
  * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent
  * is set by CPUID(HvCpuIdFunctionVersionAndFeatures).
@@ -841,7 +851,7 @@ union hv_gpa_page_range {
  * count is equal with how many entries of union hv_gpa_page_range can
  * be populated into the input parameter page.
  */
-#define HV_MAX_FLUSH_REP_COUNT ((PAGE_SIZE - 2 * sizeof(u64)) /	\
+#define HV_MAX_FLUSH_REP_COUNT ((HV_HYP_PAGE_SIZE - 2 * sizeof(u64)) /	\
 				sizeof(union hv_gpa_page_range))
 
 struct hv_guest_mapping_flush_list {
-- 
2.17.1


^ permalink raw reply related

* [PATCH v2 0/5] hv: Remove dependencies on guest page size
From: Maya Nakamura @ 2019-06-06  8:00 UTC (permalink / raw)
  To: mikelley, kys, haiyangz, sthemmin, sashal; +Cc: x86, linux-hyperv, linux-kernel

The Linux guest page size and hypervisor page size concepts are
different, even though they happen to be the same value on x86. Hyper-V
code mixes up the two, so this patchset begins to address that by
creating and using a set of Hyper-V specific page definitions.

A major benefit of those new definitions is that they support non-x86
architectures, such as ARM64, that use different page sizes. On ARM64,
the guest page size may not be 4096, and Hyper-V always runs with a page
size of 4096.

In this patchset, the first two patches lay the foundation for the
others, creating definitions and preparing for allocation of memory with
the size and alignment that Hyper-V expects as a page. Patch 3 applies
the page size definition where the guest VM and Hyper-V communicate, and
where the code intends to use the Hyper-V page size. The last two
patches set the ring buffer size to a fixed value, removing the
dependency on the guest page size.

This is the initial set of changes to the Hyper-V code, and future
patches will make additional changes using the same foundation, for
example, replace __vmalloc() and related functions when Hyper-V pages
are intended.

Changes in v2:
- [PATCH 2/5] Replace with a new patch.

Maya Nakamura (5):
  x86: hv: hyperv-tlfs.h: Create and use Hyper-V page definitions
  x86: hv: hv_init.c: Add functions to allocate/deallocate page for
    Hyper-V
  hv: vmbus: Replace page definition with Hyper-V specific one
  HID: hv: Remove dependencies on PAGE_SIZE for ring buffer
  Input: hv: Remove dependencies on PAGE_SIZE for ring buffer

 arch/x86/hyperv/hv_init.c             | 14 ++++++++++++++
 arch/x86/include/asm/hyperv-tlfs.h    | 12 +++++++++++-
 drivers/hid/hid-hyperv.c              |  4 ++--
 drivers/hv/hyperv_vmbus.h             |  8 ++++----
 drivers/input/serio/hyperv-keyboard.c |  4 ++--
 5 files changed, 33 insertions(+), 9 deletions(-)

-- 
2.17.1


^ permalink raw reply

* Re: [PATCH 10/13] megaraid_sas: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-06  6:41 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Jens Axboe, Sebastian Ott, Sagi Grimberg,
	Max Gurtovoy, Bart Van Assche, Ulf Hansson, Alan Stern,
	Oliver Neukum, linux-block, linux-rdma, linux-mmc, linux-nvme,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-usb, usb-storage, linux-kernel
In-Reply-To: <345c3931-0940-7d59-ebc6-fa1ea56c60ac@suse.de>

On Thu, Jun 06, 2019 at 08:02:07AM +0200, Hannes Reinecke wrote:
> >  	scsi_change_queue_depth(sdev, device_qd);
> >  
> What happened to the NOMERGES queue flag?

Quote from the patch description:

"Also remove the bogus nomerges flag, merges do take the virt_boundary
 into account."

^ permalink raw reply

* Re: [PATCH 08/13] IB/iser: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-06  6:24 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, Jens Axboe, Sebastian Ott, Sagi Grimberg,
	Max Gurtovoy, Bart Van Assche, Ulf Hansson, Alan Stern,
	Oliver Neukum, linux-block, linux-rdma, linux-mmc, linux-nvme,
	linux-scsi, megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv,
	linux-usb, usb-storage, linux-kernel
In-Reply-To: <20190605202235.GC3273@ziepe.ca>

On Wed, Jun 05, 2019 at 05:22:35PM -0300, Jason Gunthorpe wrote:
> On Wed, Jun 05, 2019 at 09:08:31PM +0200, Christoph Hellwig wrote:
> > This ensures all proper DMA layer handling is taken care of by the
> > SCSI midlayer.
> 
> Maybe not entirely related to this series, but it looks like the SCSI
> layer is changing the device global dma_set_max_seg_size() - at least
> in RDMA the dma device is being shared between many users, so we
> really don't want SCSI to make this value smaller.
> 
> Can we do something about this?

We could do something about it as outlined in my mail - pass the
dma_params explicitly to the dma_map_sg call.  But that isn't really
suitable for a short term fix and will take a little more time.

Until we've sorted that out the device paramter needs to be set to
the smallest value supported.

^ permalink raw reply

* Re: [PATCH 10/13] megaraid_sas: set virt_boundary_mask in the scsi host
From: Hannes Reinecke @ 2019-06-06  6:02 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Sebastian Ott, Sagi Grimberg, Max Gurtovoy, Bart Van Assche,
	Ulf Hansson, Alan Stern, Oliver Neukum, linux-block, linux-rdma,
	linux-mmc, linux-nvme, linux-scsi, megaraidlinux.pdl,
	MPT-FusionLinux.pdl, linux-hyperv, linux-usb, usb-storage,
	linux-kernel
In-Reply-To: <20190605190836.32354-11-hch@lst.de>

On 6/5/19 9:08 PM, Christoph Hellwig wrote:
> This ensures all proper DMA layer handling is taken care of by the
> SCSI midlayer.  Note that the effect is global, as the IOMMU merging
> is based off a paramters in struct device.  We could still turn if off
> if no PCIe devices are present, but I don't know how to find that out.
> 
> Also remove the bogus nomerges flag, merges do take the virt_boundary
> into account.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/scsi/megaraid/megaraid_sas_base.c   | 46 +++++----------------
>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  7 ++++
>  2 files changed, 18 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> index 3dd1df472dc6..20b3b3f8bc16 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -1870,39 +1870,6 @@ void megasas_set_dynamic_target_properties(struct scsi_device *sdev,
>  	}
>  }
>  
> -/*
> - * megasas_set_nvme_device_properties -
> - * set nomerges=2
> - * set virtual page boundary = 4K (current mr_nvme_pg_size is 4K).
> - * set maximum io transfer = MDTS of NVME device provided by MR firmware.
> - *
> - * MR firmware provides value in KB. Caller of this function converts
> - * kb into bytes.
> - *
> - * e.a MDTS=5 means 2^5 * nvme page size. (In case of 4K page size,
> - * MR firmware provides value 128 as (32 * 4K) = 128K.
> - *
> - * @sdev:				scsi device
> - * @max_io_size:				maximum io transfer size
> - *
> - */
> -static inline void
> -megasas_set_nvme_device_properties(struct scsi_device *sdev, u32 max_io_size)
> -{
> -	struct megasas_instance *instance;
> -	u32 mr_nvme_pg_size;
> -
> -	instance = (struct megasas_instance *)sdev->host->hostdata;
> -	mr_nvme_pg_size = max_t(u32, instance->nvme_page_size,
> -				MR_DEFAULT_NVME_PAGE_SIZE);
> -
> -	blk_queue_max_hw_sectors(sdev->request_queue, (max_io_size / 512));
> -
> -	blk_queue_flag_set(QUEUE_FLAG_NOMERGES, sdev->request_queue);
> -	blk_queue_virt_boundary(sdev->request_queue, mr_nvme_pg_size - 1);
> -}
> -
> -
>  /*
>   * megasas_set_static_target_properties -
>   * Device property set by driver are static and it is not required to be
> @@ -1961,8 +1928,10 @@ static void megasas_set_static_target_properties(struct scsi_device *sdev,
>  		max_io_size_kb = le32_to_cpu(instance->tgt_prop->max_io_size_kb);
>  	}
>  
> -	if (instance->nvme_page_size && max_io_size_kb)
> -		megasas_set_nvme_device_properties(sdev, (max_io_size_kb << 10));
> +	if (instance->nvme_page_size && max_io_size_kb) {
> +		blk_queue_max_hw_sectors(sdev->request_queue,
> +				(max_io_size_kb << 10) / 512);
> +	}
>  
>  	scsi_change_queue_depth(sdev, device_qd);
>  
What happened to the NOMERGES queue flag?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* Re: [PATCH 08/13] IB/iser: set virt_boundary_mask in the scsi host
From: Sagi Grimberg @ 2019-06-05 23:35 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: Jens Axboe, Sebastian Ott, Max Gurtovoy, Bart Van Assche,
	Ulf Hansson, Alan Stern, Oliver Neukum, linux-block, linux-rdma,
	linux-mmc, linux-nvme, linux-scsi, megaraidlinux.pdl,
	MPT-FusionLinux.pdl, linux-hyperv, linux-usb, usb-storage,
	linux-kernel
In-Reply-To: <20190605202235.GC3273@ziepe.ca>


>> This ensures all proper DMA layer handling is taken care of by the
>> SCSI midlayer.
> 
> Maybe not entirely related to this series, but it looks like the SCSI
> layer is changing the device global dma_set_max_seg_size() - at least
> in RDMA the dma device is being shared between many users, so we
> really don't want SCSI to make this value smaller.
> 
> Can we do something about this?

srp seems to do the right thing:
target_host->max_segment_size = ib_dma_max_seg_size(ibdev);

But iser does not, which means that scsi limits it to:
BLK_MAX_SEGMENT_SIZE (64k)

I can send a fix to iser.

> Wondering about other values too, and the interaction with the new
> combining stuff in umem.c

The only other values AFAICT is the dma_boundary that rdma llds don't
set...

^ permalink raw reply

* Re: [PATCH] revert async probing of VMBus scsi device
From: Stephen Hemminger @ 2019-06-05 21:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-scsi, linux-hyperv, Stephen Hemminger
In-Reply-To: <20190605192647.GA25034@infradead.org>

On Wed, 5 Jun 2019 12:26:47 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Wed, Jun 05, 2019 at 12:10:20PM -0700, Stephen Hemminger wrote:
> > > Sure.  But they should not get a way out for just one specific driver.  
> > 
> > There are people running new kernels on 6 year old distributions.
> > Was every distribution smart enough then? If you think so, then
> > this not necessary.  
> 
> I think you are missing my point.  If we want a way to disable this,
> we:
> 
>  a) want it opt-in
>  b) it needs to for the whole SCSI layer and not just one driver

Based on feedback from the Program Manager who handles the distros
the patch for storvsc is not needed.

SCSI device naming was never deterministic anyway.  
Cloud-init relies on the same mechanism as the Azure agent to detect the OS and ephemeral disk. 

^ permalink raw reply

* Re: [PATCH 08/13] IB/iser: set virt_boundary_mask in the scsi host
From: Jason Gunthorpe @ 2019-06-05 20:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Sebastian Ott, Sagi Grimberg, Max Gurtovoy,
	Bart Van Assche, Ulf Hansson, Alan Stern, Oliver Neukum,
	linux-block, linux-rdma, linux-mmc, linux-nvme, linux-scsi,
	megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv, linux-usb,
	usb-storage, linux-kernel
In-Reply-To: <20190605190836.32354-9-hch@lst.de>

On Wed, Jun 05, 2019 at 09:08:31PM +0200, Christoph Hellwig wrote:
> This ensures all proper DMA layer handling is taken care of by the
> SCSI midlayer.

Maybe not entirely related to this series, but it looks like the SCSI
layer is changing the device global dma_set_max_seg_size() - at least
in RDMA the dma device is being shared between many users, so we
really don't want SCSI to make this value smaller.

Can we do something about this?

Wondering about other values too, and the interaction with the new
combining stuff in umem.c

Thanks,
Jason

^ permalink raw reply

* Re: [PATCH] revert async probing of VMBus scsi device
From: Stephen Hemminger @ 2019-06-05 20:06 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-scsi, linux-hyperv, Stephen Hemminger
In-Reply-To: <20190605192647.GA25034@infradead.org>

On Wed, 5 Jun 2019 12:26:47 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Wed, Jun 05, 2019 at 12:10:20PM -0700, Stephen Hemminger wrote:
> > > Sure.  But they should not get a way out for just one specific driver.  
> > 
> > There are people running new kernels on 6 year old distributions.
> > Was every distribution smart enough then? If you think so, then
> > this not necessary.  
> 
> I think you are missing my point.  If we want a way to disable this,
> we:
> 
>  a) want it opt-in
>  b) it needs to for the whole SCSI layer and not just one driver

There is some possible issues with how initial images are deployed
with temporary disks.  The temp disks come pre-formatted with a blank
NTFS and cloudinit reformats them to ext4 the first time.

Not sure if ordering matters, or if cloudinit is smart enough to do
the right thing. I am trying to get an answer.

It might just be that the first boot should turn off async probing
for the whole scsi layer via kernel cmdline. Or it might not be an issue
if cloudinit was written by developers smart enough to handle moving
disks.

^ permalink raw reply

* Re: [PATCH] revert async probing of VMBus scsi device
From: David Miller @ 2019-06-05 19:27 UTC (permalink / raw)
  To: hch; +Cc: stephen, linux-scsi, linux-hyperv, sthemmin
In-Reply-To: <20190605190722.GA19684@infradead.org>

From: Christoph Hellwig <hch@infradead.org>
Date: Wed, 5 Jun 2019 12:07:23 -0700

> On Wed, Jun 05, 2019 at 12:06:40PM -0700, Stephen Hemminger wrote:
>> > Which is true for every device, and why we use UUIDs or label for
>> > mounts that are supposed to be stable.
>> 
>> Not everyone is smart enough to do that.
> 
> Sure.  But they should not get a way out for just one specific driver.

Agreed.

^ permalink raw reply

* Re: [PATCH] revert async probing of VMBus scsi device
From: Christoph Hellwig @ 2019-06-05 19:26 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Christoph Hellwig, linux-scsi, linux-hyperv, Stephen Hemminger
In-Reply-To: <20190605121020.1a41b753@hermes.lan>

On Wed, Jun 05, 2019 at 12:10:20PM -0700, Stephen Hemminger wrote:
> > Sure.  But they should not get a way out for just one specific driver.
> 
> There are people running new kernels on 6 year old distributions.
> Was every distribution smart enough then? If you think so, then
> this not necessary.

I think you are missing my point.  If we want a way to disable this,
we:

 a) want it opt-in
 b) it needs to for the whole SCSI layer and not just one driver

^ permalink raw reply

* Re: properly communicate queue limits to the DMA layer
From: Christoph Hellwig @ 2019-06-05 19:24 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Christoph Hellwig, Sebastian Ott, Sagi Grimberg, Max Gurtovoy,
	Bart Van Assche, Ulf Hansson, Alan Stern, Oliver Neukum,
	linux-block, linux-rdma, linux-mmc, linux-nvme, linux-scsi,
	megaraidlinux.pdl, MPT-FusionLinux.pdl, linux-hyperv, linux-usb,
	usb-storage, linux-kernel
In-Reply-To: <591cfa1e-fecb-7d00-c855-3b9eb8eb8a2a@kernel.dk>

On Wed, Jun 05, 2019 at 01:17:15PM -0600, Jens Axboe wrote:
> Since I'm heading out shortly and since I think this should make
> the next -rc, I'll tentatively queue this up.

The SCSI bits will need a bit more review, and possibly tweaking
fo megaraid and mpt3sas.  But they are really independent of the
other patches, so maybe skip them for now and leave them for Martin
to deal with.

^ permalink raw reply

* Re: properly communicate queue limits to the DMA layer
From: Jens Axboe @ 2019-06-05 19:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sebastian Ott, Sagi Grimberg, Max Gurtovoy, Bart Van Assche,
	Ulf Hansson, Alan Stern, Oliver Neukum, linux-block, linux-rdma,
	linux-mmc, linux-nvme, linux-scsi, megaraidlinux.pdl,
	MPT-FusionLinux.pdl, linux-hyperv, linux-usb, usb-storage,
	linux-kernel
In-Reply-To: <20190605190836.32354-1-hch@lst.de>

On 6/5/19 1:08 PM, Christoph Hellwig wrote:
> Hi Jens,
> 
> we've always had a bit of a problem communicating the block layer
> queue limits to the DMA layer, which needs to respect them when
> an IOMMU that could merge segments is used.  Unfortunately most
> drivers don't get this right.  Oddly enough we've been mostly
> getting away with it, although lately dma-debug has been catching
> a few of those issues.
> 
> The segment merging fix for devices with PRP-like structures seems
> to have escalated this a bit.  The first patch fixes the actual
> report from Sebastian, while the rest fix every drivers that appears
> to have the problem based on a code audit looking for drivers using
> blk_queue_max_segment_size, blk_queue_segment_boundary or
> blk_queue_virt_boundary and calling dma_map_sg eventually.  Note
> that for SCSI drivers I've taken the blk_queue_virt_boundary setting
> to the SCSI core, similar to how we did it for the other two settings
> a while ago.  This also deals with the fact that the DMA layer
> settings are on a per-device granularity, so the per-device settings
> in a few SCSI drivers can't actually work in an IOMMU environment.
> 
> It would be nice to eventually pass these limits as arguments to
> dma_map_sg, but that is a far too big series for the 5.2 merge
> window.

Since I'm heading out shortly and since I think this should make
the next -rc, I'll tentatively queue this up.

-- 
Jens Axboe


^ permalink raw reply

* [PATCH 01/13] nvme-pci: don't limit DMA segement size
From: Christoph Hellwig @ 2019-06-05 19:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Sebastian Ott, Sagi Grimberg, Max Gurtovoy, Bart Van Assche,
	Ulf Hansson, Alan Stern, Oliver Neukum, linux-block, linux-rdma,
	linux-mmc, linux-nvme, linux-scsi, megaraidlinux.pdl,
	MPT-FusionLinux.pdl, linux-hyperv, linux-usb, usb-storage,
	linux-kernel
In-Reply-To: <20190605190836.32354-1-hch@lst.de>

NVMe uses PRPs (or optionally unlimited SGLs) for data transfers and
has no specific limit for a single DMA segement.  Limiting the size
will cause problems because the block layer assumes PRP-ish devices
using a virt boundary mask don't have a segment limit.  And while this
is true, we also really need to tell the DMA mapping layer about it,
otherwise dma-debug will trip over it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Sebastian Ott <sebott@linux.ibm.com>
---
 drivers/nvme/host/pci.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f562154551ce..524d6bd6d095 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2513,6 +2513,12 @@ static void nvme_reset_work(struct work_struct *work)
 	 */
 	dev->ctrl.max_hw_sectors = NVME_MAX_KB_SZ << 1;
 	dev->ctrl.max_segments = NVME_MAX_SEGS;
+
+	/*
+	 * Don't limit the IOMMU merged segment size.
+	 */
+	dma_set_max_seg_size(dev->dev, 0xffffffff);
+
 	mutex_unlock(&dev->shutdown_lock);
 
 	/*
-- 
2.20.1


^ permalink raw reply related

* properly communicate queue limits to the DMA layer
From: Christoph Hellwig @ 2019-06-05 19:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Sebastian Ott, Sagi Grimberg, Max Gurtovoy, Bart Van Assche,
	Ulf Hansson, Alan Stern, Oliver Neukum, linux-block, linux-rdma,
	linux-mmc, linux-nvme, linux-scsi, megaraidlinux.pdl,
	MPT-FusionLinux.pdl, linux-hyperv, linux-usb, usb-storage,
	linux-kernel

Hi Jens,

we've always had a bit of a problem communicating the block layer
queue limits to the DMA layer, which needs to respect them when
an IOMMU that could merge segments is used.  Unfortunately most
drivers don't get this right.  Oddly enough we've been mostly
getting away with it, although lately dma-debug has been catching
a few of those issues.

The segment merging fix for devices with PRP-like structures seems
to have escalated this a bit.  The first patch fixes the actual
report from Sebastian, while the rest fix every drivers that appears
to have the problem based on a code audit looking for drivers using
blk_queue_max_segment_size, blk_queue_segment_boundary or
blk_queue_virt_boundary and calling dma_map_sg eventually.  Note
that for SCSI drivers I've taken the blk_queue_virt_boundary setting
to the SCSI core, similar to how we did it for the other two settings
a while ago.  This also deals with the fact that the DMA layer
settings are on a per-device granularity, so the per-device settings
in a few SCSI drivers can't actually work in an IOMMU environment.

It would be nice to eventually pass these limits as arguments to
dma_map_sg, but that is a far too big series for the 5.2 merge
window.

^ permalink raw reply

* [PATCH 09/13] IB/srp: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-05 19:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Sebastian Ott, Sagi Grimberg, Max Gurtovoy, Bart Van Assche,
	Ulf Hansson, Alan Stern, Oliver Neukum, linux-block, linux-rdma,
	linux-mmc, linux-nvme, linux-scsi, megaraidlinux.pdl,
	MPT-FusionLinux.pdl, linux-hyperv, linux-usb, usb-storage,
	linux-kernel
In-Reply-To: <20190605190836.32354-1-hch@lst.de>

This ensures all proper DMA layer handling is taken care of by the
SCSI midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index be9ddcad8f28..944fe8eee1ea 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -3061,20 +3061,6 @@ static int srp_target_alloc(struct scsi_target *starget)
 	return 0;
 }
 
-static int srp_slave_alloc(struct scsi_device *sdev)
-{
-	struct Scsi_Host *shost = sdev->host;
-	struct srp_target_port *target = host_to_target(shost);
-	struct srp_device *srp_dev = target->srp_host->srp_dev;
-	struct ib_device *ibdev = srp_dev->dev;
-
-	if (!(ibdev->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG))
-		blk_queue_virt_boundary(sdev->request_queue,
-					~srp_dev->mr_page_mask);
-
-	return 0;
-}
-
 static int srp_slave_configure(struct scsi_device *sdev)
 {
 	struct Scsi_Host *shost = sdev->host;
@@ -3277,7 +3263,6 @@ static struct scsi_host_template srp_template = {
 	.name				= "InfiniBand SRP initiator",
 	.proc_name			= DRV_NAME,
 	.target_alloc			= srp_target_alloc,
-	.slave_alloc			= srp_slave_alloc,
 	.slave_configure		= srp_slave_configure,
 	.info				= srp_target_info,
 	.queuecommand			= srp_queuecommand,
@@ -3812,6 +3797,9 @@ static ssize_t srp_create_target(struct device *dev,
 	target_host->max_cmd_len = sizeof ((struct srp_cmd *) (void *) 0L)->cdb;
 	target_host->max_segment_size = ib_dma_max_seg_size(ibdev);
 
+	if (!(ibdev->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG))
+		target_host->virt_boundary_mask = ~srp_dev->mr_page_mask;
+
 	target = host_to_target(target_host);
 
 	target->net		= kobj_ns_grab_current(KOBJ_NS_TYPE_NET);
-- 
2.20.1


^ permalink raw reply related

* [PATCH 10/13] megaraid_sas: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-05 19:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Sebastian Ott, Sagi Grimberg, Max Gurtovoy, Bart Van Assche,
	Ulf Hansson, Alan Stern, Oliver Neukum, linux-block, linux-rdma,
	linux-mmc, linux-nvme, linux-scsi, megaraidlinux.pdl,
	MPT-FusionLinux.pdl, linux-hyperv, linux-usb, usb-storage,
	linux-kernel
In-Reply-To: <20190605190836.32354-1-hch@lst.de>

This ensures all proper DMA layer handling is taken care of by the
SCSI midlayer.  Note that the effect is global, as the IOMMU merging
is based off a paramters in struct device.  We could still turn if off
if no PCIe devices are present, but I don't know how to find that out.

Also remove the bogus nomerges flag, merges do take the virt_boundary
into account.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/megaraid/megaraid_sas_base.c   | 46 +++++----------------
 drivers/scsi/megaraid/megaraid_sas_fusion.c |  7 ++++
 2 files changed, 18 insertions(+), 35 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 3dd1df472dc6..20b3b3f8bc16 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -1870,39 +1870,6 @@ void megasas_set_dynamic_target_properties(struct scsi_device *sdev,
 	}
 }
 
-/*
- * megasas_set_nvme_device_properties -
- * set nomerges=2
- * set virtual page boundary = 4K (current mr_nvme_pg_size is 4K).
- * set maximum io transfer = MDTS of NVME device provided by MR firmware.
- *
- * MR firmware provides value in KB. Caller of this function converts
- * kb into bytes.
- *
- * e.a MDTS=5 means 2^5 * nvme page size. (In case of 4K page size,
- * MR firmware provides value 128 as (32 * 4K) = 128K.
- *
- * @sdev:				scsi device
- * @max_io_size:				maximum io transfer size
- *
- */
-static inline void
-megasas_set_nvme_device_properties(struct scsi_device *sdev, u32 max_io_size)
-{
-	struct megasas_instance *instance;
-	u32 mr_nvme_pg_size;
-
-	instance = (struct megasas_instance *)sdev->host->hostdata;
-	mr_nvme_pg_size = max_t(u32, instance->nvme_page_size,
-				MR_DEFAULT_NVME_PAGE_SIZE);
-
-	blk_queue_max_hw_sectors(sdev->request_queue, (max_io_size / 512));
-
-	blk_queue_flag_set(QUEUE_FLAG_NOMERGES, sdev->request_queue);
-	blk_queue_virt_boundary(sdev->request_queue, mr_nvme_pg_size - 1);
-}
-
-
 /*
  * megasas_set_static_target_properties -
  * Device property set by driver are static and it is not required to be
@@ -1961,8 +1928,10 @@ static void megasas_set_static_target_properties(struct scsi_device *sdev,
 		max_io_size_kb = le32_to_cpu(instance->tgt_prop->max_io_size_kb);
 	}
 
-	if (instance->nvme_page_size && max_io_size_kb)
-		megasas_set_nvme_device_properties(sdev, (max_io_size_kb << 10));
+	if (instance->nvme_page_size && max_io_size_kb) {
+		blk_queue_max_hw_sectors(sdev->request_queue,
+				(max_io_size_kb << 10) / 512);
+	}
 
 	scsi_change_queue_depth(sdev, device_qd);
 
@@ -6258,6 +6227,7 @@ static int megasas_start_aen(struct megasas_instance *instance)
 static int megasas_io_attach(struct megasas_instance *instance)
 {
 	struct Scsi_Host *host = instance->host;
+	u32 nvme_page_size = instance->nvme_page_size;
 
 	/*
 	 * Export parameters required by SCSI mid-layer
@@ -6298,6 +6268,12 @@ static int megasas_io_attach(struct megasas_instance *instance)
 	host->max_lun = MEGASAS_MAX_LUN;
 	host->max_cmd_len = 16;
 
+	if (nvme_page_size) {
+		if (nvme_page_size > MR_DEFAULT_NVME_PAGE_SIZE)
+			nvme_page_size = MR_DEFAULT_NVME_PAGE_SIZE;
+		host->virt_boundary_mask = nvme_page_size - 1;
+	}
+
 	/*
 	 * Notify the mid-layer about the new controller
 	 */
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 4dfa0685a86c..a9ff3a648e7b 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -1935,6 +1935,13 @@ megasas_is_prp_possible(struct megasas_instance *instance,
 			build_prp = true;
 	}
 
+/*
+ * XXX: All the code following should go away.  The block layer guarantees
+ * merging according to the virt boundary.  And while we might have had some
+ * issues with that in the past we fixed them, and any new bug should be fixed
+ * in the core code as well.
+ */
+
 /*
  * Below code detects gaps/holes in IO data buffers.
  * What does holes/gaps mean?
-- 
2.20.1


^ permalink raw reply related

* [PATCH 08/13] IB/iser: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-05 19:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Sebastian Ott, Sagi Grimberg, Max Gurtovoy, Bart Van Assche,
	Ulf Hansson, Alan Stern, Oliver Neukum, linux-block, linux-rdma,
	linux-mmc, linux-nvme, linux-scsi, megaraidlinux.pdl,
	MPT-FusionLinux.pdl, linux-hyperv, linux-usb, usb-storage,
	linux-kernel
In-Reply-To: <20190605190836.32354-1-hch@lst.de>

This ensures all proper DMA layer handling is taken care of by the
SCSI midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/infiniband/ulp/iser/iscsi_iser.c | 35 +++++-------------------
 1 file changed, 7 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 9c185a8dabd3..841b66397a57 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -613,6 +613,7 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
 	struct Scsi_Host *shost;
 	struct iser_conn *iser_conn = NULL;
 	struct ib_conn *ib_conn;
+	struct ib_device *ib_dev;
 	u32 max_fr_sectors;
 
 	shost = iscsi_host_alloc(&iscsi_iser_sht, 0, 0);
@@ -643,16 +644,19 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
 		}
 
 		ib_conn = &iser_conn->ib_conn;
+		ib_dev = ib_conn->device->ib_device;
 		if (ib_conn->pi_support) {
-			u32 sig_caps = ib_conn->device->ib_device->attrs.sig_prot_cap;
+			u32 sig_caps = ib_dev->attrs.sig_prot_cap;
 
 			scsi_host_set_prot(shost, iser_dif_prot_caps(sig_caps));
 			scsi_host_set_guard(shost, SHOST_DIX_GUARD_IP |
 						   SHOST_DIX_GUARD_CRC);
 		}
 
-		if (iscsi_host_add(shost,
-				   ib_conn->device->ib_device->dev.parent)) {
+		if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG))
+			shost->virt_boundary_mask = ~MASK_4K;
+
+		if (iscsi_host_add(shost, ib_dev->dev.parent)) {
 			mutex_unlock(&iser_conn->state_mutex);
 			goto free_host;
 		}
@@ -958,30 +962,6 @@ static umode_t iser_attr_is_visible(int param_type, int param)
 	return 0;
 }
 
-static int iscsi_iser_slave_alloc(struct scsi_device *sdev)
-{
-	struct iscsi_session *session;
-	struct iser_conn *iser_conn;
-	struct ib_device *ib_dev;
-
-	mutex_lock(&unbind_iser_conn_mutex);
-
-	session = starget_to_session(scsi_target(sdev))->dd_data;
-	iser_conn = session->leadconn->dd_data;
-	if (!iser_conn) {
-		mutex_unlock(&unbind_iser_conn_mutex);
-		return -ENOTCONN;
-	}
-	ib_dev = iser_conn->ib_conn.device->ib_device;
-
-	if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG))
-		blk_queue_virt_boundary(sdev->request_queue, ~MASK_4K);
-
-	mutex_unlock(&unbind_iser_conn_mutex);
-
-	return 0;
-}
-
 static struct scsi_host_template iscsi_iser_sht = {
 	.module                 = THIS_MODULE,
 	.name                   = "iSCSI Initiator over iSER",
@@ -994,7 +974,6 @@ static struct scsi_host_template iscsi_iser_sht = {
 	.eh_device_reset_handler= iscsi_eh_device_reset,
 	.eh_target_reset_handler = iscsi_eh_recover_target,
 	.target_alloc		= iscsi_target_alloc,
-	.slave_alloc            = iscsi_iser_slave_alloc,
 	.proc_name              = "iscsi_iser",
 	.this_id                = -1,
 	.track_queue_depth	= 1,
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH] revert async probing of VMBus scsi device
From: Stephen Hemminger @ 2019-06-05 19:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-scsi, linux-hyperv, Stephen Hemminger
In-Reply-To: <20190605190722.GA19684@infradead.org>

On Wed, 5 Jun 2019 12:07:23 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Wed, Jun 05, 2019 at 12:06:40PM -0700, Stephen Hemminger wrote:
> > > Which is true for every device, and why we use UUIDs or label for
> > > mounts that are supposed to be stable.  
> > 
> > Not everyone is smart enough to do that.  
> 
> Sure.  But they should not get a way out for just one specific driver.

There are people running new kernels on 6 year old distributions.
Was every distribution smart enough then? If you think so, then
this not necessary.

^ permalink raw reply

* [PATCH 12/13] usb-storage: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-05 19:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Sebastian Ott, Sagi Grimberg, Max Gurtovoy, Bart Van Assche,
	Ulf Hansson, Alan Stern, Oliver Neukum, linux-block, linux-rdma,
	linux-mmc, linux-nvme, linux-scsi, megaraidlinux.pdl,
	MPT-FusionLinux.pdl, linux-hyperv, linux-usb, usb-storage,
	linux-kernel
In-Reply-To: <20190605190836.32354-1-hch@lst.de>

This ensures all proper DMA layer handling is taken care of by the
SCSI midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/usb/storage/scsiglue.c | 10 ----------
 drivers/usb/storage/usb.c      | 10 ++++++++++
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 59190d88fa9f..02c3b66b3f78 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -65,7 +65,6 @@ static const char* host_info(struct Scsi_Host *host)
 static int slave_alloc (struct scsi_device *sdev)
 {
 	struct us_data *us = host_to_us(sdev->host);
-	int maxp;
 
 	/*
 	 * Set the INQUIRY transfer length to 36.  We don't use any of
@@ -74,15 +73,6 @@ static int slave_alloc (struct scsi_device *sdev)
 	 */
 	sdev->inquiry_len = 36;
 
-	/*
-	 * USB has unusual scatter-gather requirements: the length of each
-	 * scatterlist element except the last must be divisible by the
-	 * Bulk maxpacket value.  Fortunately this value is always a
-	 * power of 2.  Inform the block layer about this requirement.
-	 */
-	maxp = usb_maxpacket(us->pusb_dev, us->recv_bulk_pipe, 0);
-	blk_queue_virt_boundary(sdev->request_queue, maxp - 1);
-
 	/*
 	 * Some host controllers may have alignment requirements.
 	 * We'll play it safe by requiring 512-byte alignment always.
diff --git a/drivers/usb/storage/usb.c b/drivers/usb/storage/usb.c
index 9a79cd9762f3..b0f23f4f58e3 100644
--- a/drivers/usb/storage/usb.c
+++ b/drivers/usb/storage/usb.c
@@ -1050,6 +1050,16 @@ int usb_stor_probe2(struct us_data *us)
 	usb_autopm_get_interface_no_resume(us->pusb_intf);
 	snprintf(us->scsi_name, sizeof(us->scsi_name), "usb-storage %s",
 					dev_name(&us->pusb_intf->dev));
+
+	/*
+	 * USB has unusual scatter-gather requirements: the length of each
+	 * scatterlist element except the last must be divisible by the
+	 * Bulk maxpacket value.  Fortunately this value is always a
+	 * power of 2.  Inform the block layer about this requirement.
+	 */
+	us_to_host(us)->virt_boundary_mask =
+		usb_maxpacket(us->pusb_dev, us->recv_bulk_pipe, 0) - 1;
+
 	result = scsi_add_host(us_to_host(us), dev);
 	if (result) {
 		dev_warn(dev,
-- 
2.20.1


^ permalink raw reply related

* [PATCH 13/13] uas: set virt_boundary_mask in the scsi host
From: Christoph Hellwig @ 2019-06-05 19:08 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Sebastian Ott, Sagi Grimberg, Max Gurtovoy, Bart Van Assche,
	Ulf Hansson, Alan Stern, Oliver Neukum, linux-block, linux-rdma,
	linux-mmc, linux-nvme, linux-scsi, megaraidlinux.pdl,
	MPT-FusionLinux.pdl, linux-hyperv, linux-usb, usb-storage,
	linux-kernel
In-Reply-To: <20190605190836.32354-1-hch@lst.de>

This ensures all proper DMA layer handling is taken care of by the
SCSI midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/usb/storage/uas.c | 36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 047c5922618f..d20919e7bbf4 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -789,29 +789,9 @@ static int uas_slave_alloc(struct scsi_device *sdev)
 {
 	struct uas_dev_info *devinfo =
 		(struct uas_dev_info *)sdev->host->hostdata;
-	int maxp;
 
 	sdev->hostdata = devinfo;
 
-	/*
-	 * We have two requirements here. We must satisfy the requirements
-	 * of the physical HC and the demands of the protocol, as we
-	 * definitely want no additional memory allocation in this path
-	 * ruling out using bounce buffers.
-	 *
-	 * For a transmission on USB to continue we must never send
-	 * a package that is smaller than maxpacket. Hence the length of each
-         * scatterlist element except the last must be divisible by the
-         * Bulk maxpacket value.
-	 * If the HC does not ensure that through SG,
-	 * the upper layer must do that. We must assume nothing
-	 * about the capabilities off the HC, so we use the most
-	 * pessimistic requirement.
-	 */
-
-	maxp = usb_maxpacket(devinfo->udev, devinfo->data_in_pipe, 0);
-	blk_queue_virt_boundary(sdev->request_queue, maxp - 1);
-
 	/*
 	 * The protocol has no requirements on alignment in the strict sense.
 	 * Controllers may or may not have alignment restrictions.
@@ -1004,6 +984,22 @@ static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id)
 	 */
 	shost->can_queue = devinfo->qdepth - 2;
 
+	/*
+	 * We have two requirements here. We must satisfy the requirements of
+	 * the physical HC and the demands of the protocol, as we definitely
+	 * want no additional memory allocation in this path ruling out using
+	 * bounce buffers.
+	 *
+	 * For a transmission on USB to continue we must never send a package
+	 * that is smaller than maxpacket.  Hence the length of each scatterlist
+	 * element except the last must be divisible by the Bulk maxpacket
+	 * value.  If the HC does not ensure that through SG, the upper layer
+	 * must do that.  We must assume nothing about the capabilities off the
+	 * HC, so we use the most pessimistic requirement.
+	 */
+	shost->virt_boundary_mask =
+		usb_maxpacket(udev, devinfo->data_in_pipe, 0) - 1;
+
 	usb_set_intfdata(intf, shost);
 	result = scsi_add_host(shost, &intf->dev);
 	if (result)
-- 
2.20.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox