Linux-ARM-Kernel Archive on lore.kernel.org

Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 6/6] arm64: uaccess: suppress spurious clang warning
From: Mark Rutland @ 2017-05-03 15:09 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493824178-7399-1-git-send-email-mark.rutland@arm.com>

Clang tries to warn when there's a mismatch between an operand's size,
and the size of the register it is held in, as this may indicate a bug.
Specifically, clang warns when the operand's type is less than 64 bits
wide, and the register is used unqualified (i.e. %N rather than %xN or
%wN).

Unfortunately clang can generate these warnings for unreachable code.
For example, for code like:

do {                                            \
        typeof(*(ptr)) __v = (v);               \
        switch(sizeof(*(ptr))) {                \
        case 1:                                 \
                // assume __v is 1 byte wide    \
                asm ("{op}b %w0" : : "r" (v));  \
                break;                          \
        case 8:                                 \
                // assume __v is 8 bytes wide   \
                asm ("{op} %0" : : "r" (v));    \
                break;                          \
        }
while (0)

... if op() were passed a char value and pointer to char, clang may
produce a warning for the unreachable case where sizeof(*(ptr)) is 8.

For the same reasons, clang produces warnings when __put_user_err() is
used for types that are less than 64 bits wide.

We could avoid this with a cast to a fixed-width type in each of the
cases. However, GCC will then warn that pointer types are being cast to
mismatched integer sizes (in unreachable paths).

Another option would be to use the same union trickery as we do for
__smp_store_release() and __smp_load_acquire(), but this is fairly
invasive.

Instead, this patch suppresses the clang warning by using an x modifier
in the assembly for the 8 byte case of __put_user_err(). No additional
work is necessary as the value has been cast to typeof(*(ptr)), so the
compiler will have performed any necessary extension for the reachable
case.

For consistency, __get_user_err() is also updated to use the x modifier
for its 8 byte case.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/uaccess.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index ed3ecf1..2c7822c 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -257,7 +257,7 @@ static inline void uaccess_enable_not_uao(void)
 			       (err), ARM64_HAS_UAO);			\
 		break;							\
 	case 8:								\
-		__get_user_asm("ldr", "ldtr", "%",  __gu_val, (ptr),	\
+		__get_user_asm("ldr", "ldtr", "%x",  __gu_val, (ptr),	\
 			       (err), ARM64_HAS_UAO);			\
 		break;							\
 	default:							\
@@ -324,7 +324,7 @@ static inline void uaccess_enable_not_uao(void)
 			       (err), ARM64_HAS_UAO);			\
 		break;							\
 	case 8:								\
-		__put_user_asm("str", "sttr", "%", __pu_val, (ptr),	\
+		__put_user_asm("str", "sttr", "%x", __pu_val, (ptr),	\
 			       (err), ARM64_HAS_UAO);			\
 		break;							\
 	default:							\
-- 
1.9.1

^ permalink raw reply related

* [PATCH v2] drivers: dma-mapping: Do not leave an invalid area->pages pointer in dma_common_contiguous_remap()
From: Greg Kroah-Hartman @ 2017-05-03 15:19 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493823468-19470-1-git-send-email-catalin.marinas@arm.com>

On Wed, May 03, 2017 at 03:57:48PM +0100, Catalin Marinas wrote:
> The dma_common_pages_remap() function allocates a vm_struct object and
> initialises the pages pointer to value passed as argument. However, when
> this function is called dma_common_contiguous_remap(), the pages array
> is only temporarily allocated, being freed shortly after
> dma_common_contiguous_remap() returns. Architecture code checking the
> validity of an area->pages pointer would incorrectly dereference already
> freed pointers. This has been exposed by the arm64 commit 44176bb38fa4
> ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU").
> 
> Fixes: 513510ddba96 ("common: dma-mapping: introduce common remapping functions")
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Reported-by: Andrzej Hajda <a.hajda@samsung.com>
> Acked-by: Laura Abbott <labbott@redhat.com>
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
> 
> Greg,
> 
> Please merge this patch via your tree (and therefore I haven't added
> your ack). Thanks.

Ok, will queue it up after 4.12-rc1 is out.

thanks,

greg k-h

^ permalink raw reply

* [PATCH] iommu/rockchip: Enable Rockchip IOMMU on ARM64
From: Matthias Brugger @ 2017-05-03 15:19 UTC (permalink / raw)
  To: linux-arm-kernel

From: Simon Xue <xxm@rock-chips.com>

This patch makes it possible to compile the rockchip-iommu driver on
ARM64, so that it can be used with 64-bit SoCs equipped with this type
of IOMMU.

Signed-off-by: Simon Xue <xxm@rock-chips.com>
Signed-off-by: Shunqian Zheng <zhengsq@rock-chips.com>
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Matthias Brugger <mbrugger@suse.com>
---
 drivers/iommu/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25ae731..99c6366a2551 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -219,7 +219,7 @@ config OMAP_IOMMU_DEBUG
 
 config ROCKCHIP_IOMMU
 	bool "Rockchip IOMMU Support"
-	depends on ARM
+	depends on ARM || ARM64
 	depends on ARCH_ROCKCHIP || COMPILE_TEST
 	select IOMMU_API
 	select ARM_DMA_USE_IOMMU
-- 
2.12.0

^ permalink raw reply related

* [PATCH v5 20/22] KVM: arm64: vgic-its: Device table save/restore
From: Christoffer Dall @ 2017-05-03 15:29 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <9f446caa-d62c-1a01-b916-9531a8034989@redhat.com>

On Wed, May 03, 2017 at 04:07:45PM +0200, Auger Eric wrote:
> Hi Christoffer,
> On 30/04/2017 22:55, Christoffer Dall wrote:
> > On Fri, Apr 14, 2017 at 12:15:32PM +0200, Eric Auger wrote:
> >> This patch saves the device table entries into guest RAM.
> >> Both flat table and 2 stage tables are supported. DeviceId
> >> indexing is used.
> >>
> >> For each device listed in the device table, we also save
> >> the translation table using the vgic_its_save/restore_itt
> >> routines.
> >>
> >> On restore, devices are re-allocated and their ite are
> >> re-built.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>
> >> ---
> >> v4 -> v5:
> >> - sort the device list by deviceid on device table save
> >> - use defines for shifts and masks
> >> - use abi->dte_esz
> >> - clatify entry sizes for L1 and L2 tables
> >>
> >> v3 -> v4:
> >> - use the new proto for its_alloc_device
> >> - compute_next_devid_offset, vgic_its_flush/restore_itt
> >>   become static in this patch
> >> - change in the DTE entry format with the introduction of the
> >>   valid bit and next field width decrease; ittaddr encoded
> >>   on its full range
> >> - fix handle_l1_entry entry handling
> >> - correct vgic_its_table_restore error handling
> >>
> >> v2 -> v3:
> >> - fix itt_addr bitmask in vgic_its_restore_dte
> >> - addition of return 0 in vgic_its_restore_ite moved to
> >>   the ITE related patch
> >>
> >> v1 -> v2:
> >> - use 8 byte format for DTE and ITE
> >> - support 2 stage format
> >> - remove kvm parameter
> >> - ITT flush/restore moved in a separate patch
> >> - use deviceid indexing
> >> ---
> >>  virt/kvm/arm/vgic/vgic-its.c | 183 +++++++++++++++++++++++++++++++++++++++++--
> >>  virt/kvm/arm/vgic/vgic.h     |   7 ++
> >>  2 files changed, 185 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> >> index b02fc3f..86dfc6c 100644
> >> --- a/virt/kvm/arm/vgic/vgic-its.c
> >> +++ b/virt/kvm/arm/vgic/vgic-its.c
> >> @@ -1682,7 +1682,8 @@ int vgic_its_attr_regs_access(struct kvm_device *dev,
> >>  	return ret;
> >>  }
> >>  
> >> -u32 compute_next_devid_offset(struct list_head *h, struct its_device *dev)
> >> +static u32 compute_next_devid_offset(struct list_head *h,
> >> +				     struct its_device *dev)
> >>  {
> >>  	struct list_head *e = &dev->dev_list;
> >>  	struct its_device *next;
> >> @@ -1858,7 +1859,7 @@ static int vgic_its_ite_cmp(void *priv, struct list_head *a,
> >>  		return 1;
> >>  }
> >>  
> >> -int vgic_its_save_itt(struct vgic_its *its, struct its_device *device)
> >> +static int vgic_its_save_itt(struct vgic_its *its, struct its_device *device)
> >>  {
> >>  	const struct vgic_its_abi *abi = vgic_its_get_abi(its);
> >>  	gpa_t base = device->itt_addr;
> >> @@ -1877,7 +1878,7 @@ int vgic_its_save_itt(struct vgic_its *its, struct its_device *device)
> >>  	return 0;
> >>  }
> >>  
> >> -int vgic_its_restore_itt(struct vgic_its *its, struct its_device *dev)
> >> +static int vgic_its_restore_itt(struct vgic_its *its, struct its_device *dev)
> >>  {
> >>  	const struct vgic_its_abi *abi = vgic_its_get_abi(its);
> >>  	gpa_t base = dev->itt_addr;
> >> @@ -1895,12 +1896,161 @@ int vgic_its_restore_itt(struct vgic_its *its, struct its_device *dev)
> >>  }
> >>  
> >>  /**
> >> + * vgic_its_save_dte - Save a device table entry at a given GPA
> >> + *
> >> + * @its: ITS handle
> >> + * @dev: ITS device
> >> + * @ptr: GPA
> >> + */
> >> +static int vgic_its_save_dte(struct vgic_its *its, struct its_device *dev,
> >> +			     gpa_t ptr, int dte_esz)
> >> +{
> >> +	struct kvm *kvm = its->dev->kvm;
> >> +	u64 val, itt_addr_field;
> >> +	int ret;
> >> +	u32 next_offset;
> >> +
> >> +	itt_addr_field = dev->itt_addr >> 8;
> >> +	next_offset = compute_next_devid_offset(&its->device_list, dev);
> >> +	val = (1ULL << KVM_ITS_DTE_VALID_SHIFT |
> >> +	       ((u64)next_offset << KVM_ITS_DTE_NEXT_SHIFT) |
> > 
> > I think this implies that the next field in your ABI points to the next
> > offset, regardless of whether or not this is in a a level 2 or lavel 1
> > table.  See more comments on this below (I reviewed this patch from the
> > bottom up).
> Not sure I understand your comment.
> 
> Doc says:
>  - next: equals to 0 if this entry is the last one; otherwise it
>    corresponds to the deviceid offset to the next DTE, capped by
>    2^14 -1.
> 
> This is independent on 1 or 2 levels as we sort the devices by
> deviceid's and compute the delta between those id.

see below.

> > 
> > I have a feeling this wasn't tested with 2 level device tables.  Could
> > that be true?
> No this was tested with 1 & and 2 levels (I hacked the guest to force 2
> levels). 1 test hole I have though is all my dte's currently belong to
> the same 2d level page, ie. my deviceid are not scattered enough.
> > 
> >> +	       (itt_addr_field << KVM_ITS_DTE_ITTADDR_SHIFT) |
> >> +		(dev->nb_eventid_bits - 1));
> >> +	val = cpu_to_le64(val);
> >> +	ret = kvm_write_guest(kvm, ptr, &val, dte_esz);
> >> +	return ret;
> >> +}
> >> +
> >> +/**
> >> + * vgic_its_restore_dte - restore a device table entry
> >> + *
> >> + * @its: its handle
> >> + * @id: device id the DTE corresponds to
> >> + * @ptr: kernel VA where the 8 byte DTE is located
> >> + * @opaque: unused
> >> + * @next: offset to the next valid device id
> >> + *
> >> + * Return: < 0 on error, 0 otherwise
> >> + */
> >> +static int vgic_its_restore_dte(struct vgic_its *its, u32 id,
> >> +				void *ptr, void *opaque, u32 *next)
> >> +{
> >> +	struct its_device *dev;
> >> +	gpa_t itt_addr;
> >> +	u8 nb_eventid_bits;
> >> +	u64 entry = *(u64 *)ptr;
> >> +	bool valid;
> >> +	int ret;
> >> +
> >> +	entry = le64_to_cpu(entry);
> >> +
> >> +	valid = entry >> KVM_ITS_DTE_VALID_SHIFT;
> >> +	nb_eventid_bits = (entry & KVM_ITS_DTE_SIZE_MASK) + 1;
> >> +	itt_addr = ((entry & KVM_ITS_DTE_ITTADDR_MASK)
> >> +			>> KVM_ITS_DTE_ITTADDR_SHIFT) << 8;
> >> +	*next = 1;
> >> +
> >> +	if (!valid)
> >> +		return 0;
> >> +
> >> +	/* dte entry is valid */
> >> +	*next = (entry & KVM_ITS_DTE_NEXT_MASK) >> KVM_ITS_DTE_NEXT_SHIFT;
> >> +
> >> +	ret = vgic_its_alloc_device(its, &dev, id,
> >> +				    itt_addr, nb_eventid_bits);
> >> +	if (ret)
> >> +		return ret;
> >> +	ret = vgic_its_restore_itt(its, dev);
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +static int vgic_its_device_cmp(void *priv, struct list_head *a,
> >> +			       struct list_head *b)
> >> +{
> >> +	struct its_device *deva = container_of(a, struct its_device, dev_list);
> >> +	struct its_device *devb = container_of(b, struct its_device, dev_list);
> >> +
> >> +	if (deva->device_id < devb->device_id)
> >> +		return -1;
> >> +	else
> >> +		return 1;
> >> +}
> >> +
> >> +/**
> >>   * vgic_its_save_device_tables - Save the device table and all ITT
> >>   * into guest RAM
> >> + *
> >> + * L1/L2 handling is hidden by vgic_its_check_id() helper which directly
> >> + * returns the GPA of the device entry
> >>   */
> >>  static int vgic_its_save_device_tables(struct vgic_its *its)
> >>  {
> >> -	return -ENXIO;
> >> +	const struct vgic_its_abi *abi = vgic_its_get_abi(its);
> >> +	struct its_device *dev;
> >> +	int dte_esz = abi->dte_esz;
> >> +	u64 baser;
> >> +
> >> +	baser = its->baser_device_table;
> >> +
> >> +	list_sort(NULL, &its->device_list, vgic_its_device_cmp);
> >> +
> >> +	list_for_each_entry(dev, &its->device_list, dev_list) {
> >> +		int ret;
> >> +		gpa_t eaddr;
> >> +
> >> +		if (!vgic_its_check_id(its, baser,
> >> +				       dev->device_id, &eaddr))
> >> +			return -EINVAL;
> >> +
> >> +		ret = vgic_its_save_itt(its, dev);
> >> +		if (ret)
> >> +			return ret;
> >> +
> >> +		ret = vgic_its_save_dte(its, dev, eaddr, dte_esz);
> >> +		if (ret)
> >> +			return ret;
> >> +	}
> >> +	return 0;
> >> +}
> >> +
> >> +/**
> >> + * handle_l1_entry - callback used for L1 entries (2 stage case)
> >> + *
> >> + * @its: its handle
> >> + * @id: id
> > 
> > IIUC, this is actually the index of the entry in the L1 table.  I think
> > this should be clarified.
> yep
> > 
> >> + * @addr: kernel VA
> >> + * @opaque: unused
> >> + * @next_offset: offset to the next L1 entry: 0 if the last element
> >> + * was found, 1 otherwise
> >> + */
> >> +static int handle_l1_entry(struct vgic_its *its, u32 id, void *addr,
> >> +			   void *opaque, u32 *next_offset)
> > 
> > nit: shouldn't this be called handle_l1_device_table_entry ?
> renamed into handle_l1_dte
> > 
> >> +{
> >> +	const struct vgic_its_abi *abi = vgic_its_get_abi(its);
> >> +	int l2_start_id = id * (SZ_64K / GITS_LVL1_ENTRY_SIZE);
> > 
> > Hmmm, is this not actually supposed to be (SZ_64K / abi->dte_esz) ?
> no because 1st level entries have a fixed size of GITS_LVL1_ENTRY_SIZE bytes

yes, but the ID you calculate is a result of how many IDs each 64K 2nd
level table can hold, which depends on the size of each entry in the 2nd
level table, right?  Or am I misunderstanding how this works completely.

> > 
> >> +	u64 entry = *(u64 *)addr;
> >> +	int ret, ite_esz = abi->ite_esz;
> > 
> > Should this be ite_esz or dte_esz?
> 
> you're correct. dte_esz should be used.
> > 
> >> +	gpa_t gpa;
> > 
> > nit: put declarations with initialization on separate lines.
> OK
> > 
> >> +
> >> +	entry = le64_to_cpu(entry);
> >> +	*next_offset = 1;
> > 
> > I think you could attach a comment here saying that level 1 tables have
> > to be scanned entirely.
> added. note we exit as soon as the last element is found when scanning
> l2 tables.
> > 
> > But this also reminds me.  Does that mean that the next field in the DTE
> > in your documented ABI format points to the next DTE within that level-2
> > table, or does it point across to different level-2 tables?  I think
> > this needs to be clarified in the ABI unless I'm missing something.
> see above comment on next_index semantic. In the doc I talk about
> deviceid offset and not of table index.
> 

ok, I see, I was misled by the definition of lookup_table saying that it
returns 1 if the last element is identified, which is only true when you
actually find an element that is valid and where the next field is zero.
I understood it to mean if it found the last item in the table it was
scanning.  So it is implied that lookup table can be called in two
levels and the return value indicates if the element was the last from
the point of view of the highest level, not in the context the last
instance was called.

Note that it's further confusing that the handler function has the
return value the other way around, where 0 means it's the last element.
Perhaps you could make this much more readable by introducing a define
for the return values.

Thanks,
-Christoffer

^ permalink raw reply

* [PATCH 07/31] KVM: arm/arm64: vgic-v3: Add accessors for the ICH_APxRn_EL2 registers
From: Mark Rutland @ 2017-05-03 15:32 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170503104606.19342-8-marc.zyngier@arm.com>

On Wed, May 03, 2017 at 11:45:42AM +0100, Marc Zyngier wrote:
> As we're about to access the Active Priority registers a lot more,
> let's define accessors that take the register number as a parameter.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  virt/kvm/arm/hyp/vgic-v3-sr.c | 116 ++++++++++++++++++++++++++++++++++++------
>  1 file changed, 100 insertions(+), 16 deletions(-)
> 
> diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
> index 32c3295929b0..990d9d1e85d0 100644
> --- a/virt/kvm/arm/hyp/vgic-v3-sr.c
> +++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
> @@ -118,6 +118,90 @@ static void __hyp_text __gic_v3_set_lr(u64 val, int lr)
>  	}
>  }
>  
> +static void __hyp_text __vgic_v3_write_ap0rn(u32 val, int n)
> +{
> +	switch (n) {
> +	case 0:
> +		write_gicreg(val, ICH_AP0R0_EL2);
> +		break;
> +	case 1:
> +		write_gicreg(val, ICH_AP0R1_EL2);
> +		break;
> +	case 2:
> +		write_gicreg(val, ICH_AP0R2_EL2);
> +		break;
> +	case 3:
> +		write_gicreg(val, ICH_AP0R3_EL2);
> +		break;
> +	}

Is there any way we can get a build or runtime failure for an
out-of-bounds n value?

> +}

Given this is used with a constant n, you could make this:

#define __vgic_v3_write_ap0rn(v, n) \
	write_gicreg(v, ICH_AP0R##n##_EL2)

... which should also give you a warning for an out-of-bounds n.

Similar could apply for the other helpers here.

That would require some function -> macro conversion in later patches
though, so I can understand if you're not keen on that.

Thanks,
Mark.

^ permalink raw reply

* [PATCH] iommu/arm-smmu-v3: Poll for CMDQ drain completion more effectively
From: Robin Murphy @ 2017-05-03 15:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493291587-23488-1-git-send-email-sunil.kovvuri@gmail.com>

On 27/04/17 12:13, sunil.kovvuri at gmail.com wrote:
> From: Sunil Goutham <sgoutham@cavium.com>
> 
> Modified polling on CMDQ consumer similar to how polling is done for TLB SYNC
> completion in SMMUv2 driver. Code changes are done with reference to
> 
> 8513c8930069 iommu/arm-smmu: Poll for TLB sync completion more effectively
> 
> Poll timeout has been increased which addresses issue of 100us timeout not
> sufficient, when command queue is full with TLB invalidation commands.
> 
> Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
> Signed-off-by: Geetha <gakula@cavium.com>
> ---
>  drivers/iommu/arm-smmu-v3.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d412bdd..34599d4 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -379,6 +379,9 @@
>  #define CMDQ_SYNC_0_CS_NONE		(0UL << CMDQ_SYNC_0_CS_SHIFT)
>  #define CMDQ_SYNC_0_CS_SEV		(2UL << CMDQ_SYNC_0_CS_SHIFT)
>  
> +#define CMDQ_DRAIN_TIMEOUT_US		1000
> +#define CMDQ_SPIN_COUNT			10
> +
>  /* Event queue */
>  #define EVTQ_ENT_DWORDS			4
>  #define EVTQ_MAX_SZ_SHIFT		7
> @@ -737,7 +740,8 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
>   */
>  static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
>  {
> -	ktime_t timeout = ktime_add_us(ktime_get(), ARM_SMMU_POLL_TIMEOUT_US);
> +	ktime_t timeout = ktime_add_us(ktime_get(), CMDQ_DRAIN_TIMEOUT_US);
> +	unsigned int spin_cnt, delay = 1;
>  
>  	while (queue_sync_cons(q), (drain ? !queue_empty(q) : queue_full(q))) {
>  		if (ktime_compare(ktime_get(), timeout) > 0)
> @@ -746,8 +750,13 @@ static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
>  		if (wfe) {
>  			wfe();
>  		} else {
> -			cpu_relax();
> -			udelay(1);
> +			for (spin_cnt = 0;
> +			     spin_cnt < CMDQ_SPIN_COUNT; spin_cnt++) {
> +				cpu_relax();
> +				continue;
> +			}
> +			udelay(delay);
> +			delay *= 2;

Sorry, I can't make sense of this. The referenced commit uses the spin
loop to poll opportunistically a few times before delaying. This loop
just adds a short open-coded udelay to an exponential udelay, and it's
not really clear that that's any better than a fixed udelay (especially
as the two cases in which we poll are somewhat different).

What's wrong with simply increasing the timeout value alone?

Robin.

>  		}
>  	}
>  
> 

^ permalink raw reply

* [PATCH 08/31] arm64: Add a facility to turn an ESR syndrome into a sysreg encoding
From: Mark Rutland @ 2017-05-03 15:35 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170503104606.19342-9-marc.zyngier@arm.com>

On Wed, May 03, 2017 at 11:45:43AM +0100, Marc Zyngier wrote:
> It is often useful to compare an ESR syndrome reporting the trapping
> of a system register with a value matching that system register.
> 
> Since encoding both the sysreg and the ESR version seem to be a bit
> overkill, let's add a set of macros that convert an ESR value into
> the corresponding sysreg encoding.
> 
> We handle both AArch32 and AArch64, taking advantage of identical
> encodings between system registers and CP15 accessors.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Nice!

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  arch/arm64/include/asm/esr.h | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
> index ad42e79a5d4d..db8f13137443 100644
> --- a/arch/arm64/include/asm/esr.h
> +++ b/arch/arm64/include/asm/esr.h
> @@ -19,6 +19,7 @@
>  #define __ASM_ESR_H
>  
>  #include <asm/memory.h>
> +#include <asm/sysreg.h>
>  
>  #define ESR_ELx_EC_UNKNOWN	(0x00)
>  #define ESR_ELx_EC_WFx		(0x01)
> @@ -177,6 +178,30 @@
>  
>  #define ESR_ELx_SYS64_ISS_SYS_CNTVCT	(ESR_ELx_SYS64_ISS_SYS_VAL(3, 3, 2, 14, 0) | \
>  					 ESR_ELx_SYS64_ISS_DIR_READ)
> +
> +#define esr_sys64_to_sysreg(e)					\
> +	sys_reg((((e) & ESR_ELx_SYS64_ISS_OP0_MASK) >>		\
> +		 ESR_ELx_SYS64_ISS_OP0_SHIFT),			\
> +		(((e) & ESR_ELx_SYS64_ISS_OP1_MASK) >>		\
> +		 ESR_ELx_SYS64_ISS_OP1_SHIFT),			\
> +		(((e) & ESR_ELx_SYS64_ISS_CRN_MASK) >>		\
> +		 ESR_ELx_SYS64_ISS_CRN_SHIFT),			\
> +		(((e) & ESR_ELx_SYS64_ISS_CRM_MASK) >>		\
> +		 ESR_ELx_SYS64_ISS_CRM_SHIFT),			\
> +		(((e) & ESR_ELx_SYS64_ISS_OP2_MASK) >>		\
> +		 ESR_ELx_SYS64_ISS_OP2_SHIFT))
> +
> +#define esr_cp15_to_sysreg(e)					\
> +	sys_reg(3,						\
> +		(((e) & ESR_ELx_SYS64_ISS_OP1_MASK) >>		\
> +		 ESR_ELx_SYS64_ISS_OP1_SHIFT),			\
> +		(((e) & ESR_ELx_SYS64_ISS_CRN_MASK) >>		\
> +		 ESR_ELx_SYS64_ISS_CRN_SHIFT),			\
> +		(((e) & ESR_ELx_SYS64_ISS_CRM_MASK) >>		\
> +		 ESR_ELx_SYS64_ISS_CRM_SHIFT),			\
> +		(((e) & ESR_ELx_SYS64_ISS_OP2_MASK) >>		\
> +		 ESR_ELx_SYS64_ISS_OP2_SHIFT))
> +
>  #ifndef __ASSEMBLY__
>  #include <asm/types.h>
>  
> -- 
> 2.11.0
> 

^ permalink raw reply

* [PATCH] iommu/arm-smmu-v3: Poll for CMDQ drain completion more effectively
From: Will Deacon @ 2017-05-03 15:37 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CA+sq2CdR0vC76F1m-LvYGfZW32nKbQg9dQsfJ_jtvYgZ7yyt3g@mail.gmail.com>

On Wed, May 03, 2017 at 06:49:09PM +0530, Sunil Kovvuri wrote:
> On Thu, Apr 27, 2017 at 4:43 PM,  <sunil.kovvuri@gmail.com> wrote:
> > From: Sunil Goutham <sgoutham@cavium.com>
> >
> > Modified polling on CMDQ consumer similar to how polling is done for TLB SYNC
> > completion in SMMUv2 driver. Code changes are done with reference to
> >
> > 8513c8930069 iommu/arm-smmu: Poll for TLB sync completion more effectively
> >
> > Poll timeout has been increased which addresses issue of 100us timeout not
> > sufficient, when command queue is full with TLB invalidation commands.
> >
> > Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
> > Signed-off-by: Geetha <gakula@cavium.com>
> > ---
> >  drivers/iommu/arm-smmu-v3.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> > index d412bdd..34599d4 100644
> > --- a/drivers/iommu/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm-smmu-v3.c
> > @@ -379,6 +379,9 @@
> >  #define CMDQ_SYNC_0_CS_NONE            (0UL << CMDQ_SYNC_0_CS_SHIFT)
> >  #define CMDQ_SYNC_0_CS_SEV             (2UL << CMDQ_SYNC_0_CS_SHIFT)
> >
> > +#define CMDQ_DRAIN_TIMEOUT_US          1000
> > +#define CMDQ_SPIN_COUNT                        10
> > +
> >  /* Event queue */
> >  #define EVTQ_ENT_DWORDS                        4
> >  #define EVTQ_MAX_SZ_SHIFT              7
> > @@ -737,7 +740,8 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
> >   */
> >  static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
> >  {
> > -       ktime_t timeout = ktime_add_us(ktime_get(), ARM_SMMU_POLL_TIMEOUT_US);
> > +       ktime_t timeout = ktime_add_us(ktime_get(), CMDQ_DRAIN_TIMEOUT_US);
> > +       unsigned int spin_cnt, delay = 1;
> >
> >         while (queue_sync_cons(q), (drain ? !queue_empty(q) : queue_full(q))) {
> >                 if (ktime_compare(ktime_get(), timeout) > 0)
> > @@ -746,8 +750,13 @@ static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
> >                 if (wfe) {
> >                         wfe();
> >                 } else {
> > -                       cpu_relax();
> > -                       udelay(1);
> > +                       for (spin_cnt = 0;
> > +                            spin_cnt < CMDQ_SPIN_COUNT; spin_cnt++) {
> > +                               cpu_relax();
> > +                               continue;
> > +                       }
> > +                       udelay(delay);
> > +                       delay *= 2;
> >                 }
> >         }
> >
> > --
> > 2.7.4
> >
> 
> Sorry for the ignorance.
> Is there a patchwork where I can check current status of ARM IOMMU
> related patches ?
> 
> And is this patch accepted, if not any comments / feedback ?

Please be patient: the merge window is open and it's not been long since you
posted the patch, which looks pretty bonkers at first glance.

Will

^ permalink raw reply

* [PATCH v2 0/4] arm64: improve tagged pointer handling
From: Kristina Martsenko @ 2017-05-03 15:37 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

Here are some patches to fix a few issues related to tagged pointer
handling.

Tagged pointers from userspace can end up in the kernel in a number of
ways. I most likely have not found all of them, but they include at
least the following:

 - Passing tagged pointers in system call arguments. This would be a
   userspace bug, as documented in tagged-pointers.txt.

 - Through FAR_EL1 when we take a data abort or watchpoint exception.
   Watchpoint handling is currently broken if we get a tagged pointer,
   patch #2 in this series fixes it. We already do the right thing for
   data aborts but patch #3 tries to improve on it a little.

 - Reading a tagged pointer from a GPR when trapping and emulating
   instructions, e.g. cache maintenance or uprobes. Patch #1 fixes the
   cache maintenance case.

 - The user stack pointer, frame pointer (x29), frame records, and link
   register (x30) can contain tagged pointers. Patch #4 documents that
   some kernel features do not currently work with tagged pointers in
   the first three of these.

 - A tagged pointer can end up in the PC on an illegal exception return
   (see D4.1.4 ARMARM A.k_iss10775), and from there in ELR on exception
   entry. As I understand it, this can only be caused by a bad eret at
   EL1 or a bad debug state exit by an external debugger, so only by a
   bug in Linux/firmware or the external debugger. So I don't think we
   need to handle this.

Note that the above applies to Linux only. I have spoken to Marc Zyngier
about KVM, and so far he hasn't found any problems there.

Thanks,
Kristina

v2:
 - Patch #3: changed clear_address_tag macro arguments, swapped bic and
   tst


Kristina Martsenko (4):
  arm64: traps: fix userspace cache maintenance emulation on a tagged
    pointer
  arm64: hw_breakpoint: fix watchpoint matching for tagged pointers
  arm64: entry: improve data abort handling of tagged pointers
  arm64: documentation: document tagged pointer stack constraints

 Documentation/arm64/tagged-pointers.txt | 62 +++++++++++++++++++++++++--------
 arch/arm64/include/asm/asm-uaccess.h    |  9 +++++
 arch/arm64/include/asm/uaccess.h        |  6 ++--
 arch/arm64/kernel/entry.S               |  5 +--
 arch/arm64/kernel/hw_breakpoint.c       |  3 ++
 arch/arm64/kernel/traps.c               |  4 +--
 6 files changed, 67 insertions(+), 22 deletions(-)

-- 
2.1.4

^ permalink raw reply

* [PATCH v2 1/4] arm64: traps: fix userspace cache maintenance emulation on a tagged pointer
From: Kristina Martsenko @ 2017-05-03 15:37 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493825868-30872-1-git-send-email-kristina.martsenko@arm.com>

When we emulate userspace cache maintenance in the kernel, we can
currently send the task a SIGSEGV even though the maintenance was done
on a valid address. This happens if the address has a non-zero address
tag, and happens to not be mapped in.

When we get the address from a user register, we don't currently remove
the address tag before performing cache maintenance on it. If the
maintenance faults, we end up in either __do_page_fault, where find_vma
can't find the VMA if the address has a tag, or in do_translation_fault,
where the tagged address will appear to be above TASK_SIZE. In both
cases, the address is not mapped in, and the task is sent a SIGSEGV.

This patch removes the tag from the address before using it. With this
patch, the fault is handled correctly, the address gets mapped in, and
the cache maintenance succeeds.

As a second bug, if cache maintenance (correctly) fails on an invalid
tagged address, the address gets passed into arm64_notify_segfault,
where find_vma fails to find the VMA due to the tag, and the wrong
si_code may be sent as part of the siginfo_t of the segfault. With this
patch, the correct si_code is sent.

Fixes: 7dd01aef0557 ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
---

Note that patch #3 would also fix the first bug (incorrect segfault),
but not the second (wrong si_code), hence this patch.

 arch/arm64/kernel/traps.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index e52be6aa44ee..45c8eca951bc 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -443,7 +443,7 @@ int cpu_enable_cache_maint_trap(void *__unused)
 }

 #define __user_cache_maint(insn, address, res)			\
-	if (untagged_addr(address) >= user_addr_max()) {	\
+	if (address >= user_addr_max()) {			\
 		res = -EFAULT;					\
 	} else {						\
 		uaccess_ttbr0_enable();				\
@@ -469,7 +469,7 @@ static void user_cache_maint_handler(unsigned int esr, struct pt_regs *regs)
 	int crm = (esr & ESR_ELx_SYS64_ISS_CRM_MASK) >> ESR_ELx_SYS64_ISS_CRM_SHIFT;
 	int ret = 0;

-	address = pt_regs_read_reg(regs, rt);
+	address = untagged_addr(pt_regs_read_reg(regs, rt));

 	switch (crm) {
 	case ESR_ELx_SYS64_ISS_CRM_DC_CVAU:	/* DC CVAU, gets promoted */
-- 
2.1.4

^ permalink raw reply related

* [PATCH v2 2/4] arm64: hw_breakpoint: fix watchpoint matching for tagged pointers
From: Kristina Martsenko @ 2017-05-03 15:37 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493825868-30872-1-git-send-email-kristina.martsenko@arm.com>

When we take a watchpoint exception, the address that triggered the
watchpoint is found in FAR_EL1. We compare it to the address of each
configured watchpoint to see which one was hit.

The configured watchpoint addresses are untagged, while the address in
FAR_EL1 will have an address tag if the data access was done using a
tagged address. The tag needs to be removed to compare the address to
the watchpoints.

Currently we don't remove it, and as a result can report the wrong
watchpoint as being hit (specifically, always either the highest TTBR0
watchpoint or lowest TTBR1 watchpoint). This patch removes the tag.

Fixes: d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0")
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
---
 arch/arm64/include/asm/uaccess.h  | 6 +++---
 arch/arm64/kernel/hw_breakpoint.c | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 5308d696311b..0221029e27ff 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -106,9 +106,9 @@ static inline void set_fs(mm_segment_t fs)
 })
 
 /*
- * When dealing with data aborts or instruction traps we may end up with
- * a tagged userland pointer. Clear the tag to get a sane pointer to pass
- * on to access_ok(), for instance.
+ * When dealing with data aborts, watchpoints, or instruction traps we may end
+ * up with a tagged userland pointer. Clear the tag to get a sane pointer to
+ * pass on to access_ok(), for instance.
  */
 #define untagged_addr(addr)		sign_extend64(addr, 55)
 
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index 0296e7924240..749f81779420 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -36,6 +36,7 @@
 #include <asm/traps.h>
 #include <asm/cputype.h>
 #include <asm/system_misc.h>
+#include <asm/uaccess.h>
 
 /* Breakpoint currently in use for each BRP. */
 static DEFINE_PER_CPU(struct perf_event *, bp_on_reg[ARM_MAX_BRP]);
@@ -721,6 +722,8 @@ static u64 get_distance_from_watchpoint(unsigned long addr, u64 val,
 	u64 wp_low, wp_high;
 	u32 lens, lene;
 
+	addr = untagged_addr(addr);
+
 	lens = __ffs(ctrl->len);
 	lene = __fls(ctrl->len);
 
-- 
2.1.4

^ permalink raw reply related

* [PATCH v2 3/4] arm64: entry: improve data abort handling of tagged pointers
From: Kristina Martsenko @ 2017-05-03 15:37 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493825868-30872-1-git-send-email-kristina.martsenko@arm.com>

When handling a data abort from EL0, we currently zero the top byte of
the faulting address, as we assume the address is a TTBR0 address, which
may contain a non-zero address tag. However, the address may be a TTBR1
address, in which case we should not zero the top byte. This patch fixes
that. The effect is that the full TTBR1 address is passed to the task's
signal handler (or printed out in the kernel log).

When handling a data abort from EL1, we leave the faulting address
intact, as we assume it's either a TTBR1 address or a TTBR0 address with
tag 0x00. This is true as far as I'm aware, we don't seem to access a
tagged TTBR0 address anywhere in the kernel. Regardless, it's easy to
forget about address tags, and code added in the future may not always
remember to remove tags from addresses before accessing them. So add tag
handling to the EL1 data abort handler as well. This also makes it
consistent with the EL0 data abort handler.

Fixes: d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0")
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
---
 arch/arm64/include/asm/asm-uaccess.h | 9 +++++++++
 arch/arm64/kernel/entry.S            | 5 +++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h
index df411f3e083c..ecd9788cd298 100644
--- a/arch/arm64/include/asm/asm-uaccess.h
+++ b/arch/arm64/include/asm/asm-uaccess.h
@@ -62,4 +62,13 @@ alternative_if ARM64_ALT_PAN_NOT_UAO
 alternative_else_nop_endif
 	.endm

+/*
+ * Remove the address tag from a virtual address, if present.
+ */
+	.macro	clear_address_tag, dst, addr
+	tst	\addr, #(1 << 55)
+	bic	\dst, \addr, #(0xff << 56)
+	csel	\dst, \dst, \addr, eq
+	.endm
+
 #endif
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 43512d4d7df2..b738880350f9 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -428,12 +428,13 @@ el1_da:
 	/*
 	 * Data abort handling
 	 */
-	mrs	x0, far_el1
+	mrs	x3, far_el1
 	enable_dbg
 	// re-enable interrupts if they were enabled in the aborted context
 	tbnz	x23, #7, 1f			// PSR_I_BIT
 	enable_irq
 1:
+	clear_address_tag x0, x3
 	mov	x2, sp				// struct pt_regs
 	bl	do_mem_abort

@@ -594,7 +595,7 @@ el0_da:
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
 	ct_user_exit
-	bic	x0, x26, #(0xff << 56)
+	clear_address_tag x0, x26
 	mov	x1, x25
 	mov	x2, sp
 	bl	do_mem_abort
-- 
2.1.4

^ permalink raw reply related

* [PATCH v2 4/4] arm64: documentation: document tagged pointer stack constraints
From: Kristina Martsenko @ 2017-05-03 15:37 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493825868-30872-1-git-send-email-kristina.martsenko@arm.com>

Some kernel features don't currently work if a task puts a non-zero
address tag in its stack pointer, frame pointer, or frame record entries
(FP, LR).

For example, with a tagged stack pointer, the kernel can't deliver
signals to the process, and the task is killed instead. As another
example, with a tagged frame pointer or frame records, perf fails to
generate call graphs or resolve symbols.

For now, just document these limitations, instead of finding and fixing
everything that doesn't work, as it's not known if anyone needs to use
tags in these places anyway.

In addition, as requested by Dave Martin, generalize the limitations
into a general kernel address tag policy, and refactor
tagged-pointers.txt to include it.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
---
 Documentation/arm64/tagged-pointers.txt | 62 +++++++++++++++++++++++++--------
 1 file changed, 47 insertions(+), 15 deletions(-)

diff --git a/Documentation/arm64/tagged-pointers.txt b/Documentation/arm64/tagged-pointers.txt
index d9995f1f51b3..a25a99e82bb1 100644
--- a/Documentation/arm64/tagged-pointers.txt
+++ b/Documentation/arm64/tagged-pointers.txt
@@ -11,24 +11,56 @@ in AArch64 Linux.
 The kernel configures the translation tables so that translations made
 via TTBR0 (i.e. userspace mappings) have the top byte (bits 63:56) of
 the virtual address ignored by the translation hardware. This frees up
-this byte for application use, with the following caveats:
+this byte for application use.
 
-	(1) The kernel requires that all user addresses passed to EL1
-	    are tagged with tag 0x00. This means that any syscall
-	    parameters containing user virtual addresses *must* have
-	    their top byte cleared before trapping to the kernel.
 
-	(2) Non-zero tags are not preserved when delivering signals.
-	    This means that signal handlers in applications making use
-	    of tags cannot rely on the tag information for user virtual
-	    addresses being maintained for fields inside siginfo_t.
-	    One exception to this rule is for signals raised in response
-	    to watchpoint debug exceptions, where the tag information
-	    will be preserved.
+Passing tagged addresses to the kernel
+--------------------------------------
 
-	(3) Special care should be taken when using tagged pointers,
-	    since it is likely that C compilers will not hazard two
-	    virtual addresses differing only in the upper byte.
+All interpretation of userspace memory addresses by the kernel assumes
+an address tag of 0x00.
+
+This includes, but is not limited to, addresses found in:
+
+ - pointer arguments to system calls, including pointers in structures
+   passed to system calls,
+
+ - the stack pointer (sp), e.g. when interpreting it to deliver a
+   signal,
+
+ - the frame pointer (x29) and frame records, e.g. when interpreting
+   them to generate a backtrace or call graph.
+
+Using non-zero address tags in any of these locations may result in an
+error code being returned, a (fatal) signal being raised, or other modes
+of failure.
+
+For these reasons, passing non-zero address tags to the kernel via
+system calls is forbidden, and using a non-zero address tag for sp is
+strongly discouraged.
+
+Programs maintaining a frame pointer and frame records that use non-zero
+address tags may suffer impaired or inaccurate debug and profiling
+visibility.
+
+
+Preserving tags
+---------------
+
+Non-zero tags are not preserved when delivering signals. This means that
+signal handlers in applications making use of tags cannot rely on the
+tag information for user virtual addresses being maintained for fields
+inside siginfo_t. One exception to this rule is for signals raised in
+response to watchpoint debug exceptions, where the tag information will
+be preserved.
 
 The architecture prevents the use of a tagged PC, so the upper byte will
 be set to a sign-extension of bit 55 on exception return.
+
+
+Other considerations
+--------------------
+
+Special care should be taken when using tagged pointers, since it is
+likely that C compilers will not hazard two virtual addresses differing
+only in the upper byte.
-- 
2.1.4

^ permalink raw reply related

* [PATCH] iommu/arm-smmu-v3: Poll for CMDQ drain completion more effectively
From: Will Deacon @ 2017-05-03 15:40 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <a9d1bb4d-6578-93bb-66cf-5ed55952b85a@arm.com>

On Wed, May 03, 2017 at 04:33:57PM +0100, Robin Murphy wrote:
> On 27/04/17 12:13, sunil.kovvuri at gmail.com wrote:
> > From: Sunil Goutham <sgoutham@cavium.com>
> > 
> > Modified polling on CMDQ consumer similar to how polling is done for TLB SYNC
> > completion in SMMUv2 driver. Code changes are done with reference to
> > 
> > 8513c8930069 iommu/arm-smmu: Poll for TLB sync completion more effectively
> > 
> > Poll timeout has been increased which addresses issue of 100us timeout not
> > sufficient, when command queue is full with TLB invalidation commands.
> > 
> > Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
> > Signed-off-by: Geetha <gakula@cavium.com>
> > ---
> >  drivers/iommu/arm-smmu-v3.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> > index d412bdd..34599d4 100644
> > --- a/drivers/iommu/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm-smmu-v3.c
> > @@ -379,6 +379,9 @@
> >  #define CMDQ_SYNC_0_CS_NONE		(0UL << CMDQ_SYNC_0_CS_SHIFT)
> >  #define CMDQ_SYNC_0_CS_SEV		(2UL << CMDQ_SYNC_0_CS_SHIFT)
> >  
> > +#define CMDQ_DRAIN_TIMEOUT_US		1000
> > +#define CMDQ_SPIN_COUNT			10
> > +
> >  /* Event queue */
> >  #define EVTQ_ENT_DWORDS			4
> >  #define EVTQ_MAX_SZ_SHIFT		7
> > @@ -737,7 +740,8 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
> >   */
> >  static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
> >  {
> > -	ktime_t timeout = ktime_add_us(ktime_get(), ARM_SMMU_POLL_TIMEOUT_US);
> > +	ktime_t timeout = ktime_add_us(ktime_get(), CMDQ_DRAIN_TIMEOUT_US);
> > +	unsigned int spin_cnt, delay = 1;
> >  
> >  	while (queue_sync_cons(q), (drain ? !queue_empty(q) : queue_full(q))) {
> >  		if (ktime_compare(ktime_get(), timeout) > 0)
> > @@ -746,8 +750,13 @@ static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
> >  		if (wfe) {
> >  			wfe();
> >  		} else {
> > -			cpu_relax();
> > -			udelay(1);
> > +			for (spin_cnt = 0;
> > +			     spin_cnt < CMDQ_SPIN_COUNT; spin_cnt++) {
> > +				cpu_relax();
> > +				continue;
> > +			}
> > +			udelay(delay);
> > +			delay *= 2;
> 
> Sorry, I can't make sense of this. The referenced commit uses the spin
> loop to poll opportunistically a few times before delaying. This loop
> just adds a short open-coded udelay to an exponential udelay, and it's
> not really clear that that's any better than a fixed udelay (especially
> as the two cases in which we poll are somewhat different).
> 
> What's wrong with simply increasing the timeout value alone?

I asked that the timeout is only increased for the drain case, and that
we fix the issue here where we udelat if cons didn't move immediately:

http://lists.infradead.org/pipermail/linux-arm-kernel/2017-April/503389.html

but I don't think the patch above actually achieves any of that.

Will

^ permalink raw reply

* [PATCH 0/9] [v3] arm64: defconfig: enable several options useful for ARM64 server platforms
From: Will Deacon @ 2017-05-03 15:48 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493413563-18375-1-git-send-email-timur@codeaurora.org>

Hi Timur,

On Fri, Apr 28, 2017 at 04:05:54PM -0500, Timur Tabi wrote:
> ACPI-based ARM64 server platforms based, like the Qualcomm Datacenter
> Technologies QDF2400, need several features and drivers enabled for
> full functionality.  This patchset enables many of those features.
> 
> The first patch, "resynchronize the defconfig" refreshes the ARM64
> defconfig so that it's aligned with savedefconfig.  This needs to be
> done periodically, so that new patches avoid merge conflicts.  
> 
> If the first patch does not apply cleanly, I ask the maintainer to 
> refresh it himself manually, following the instructions in the commit
> message.  The remaining 8 patches should apply cleanly on top of that.

The arm-soc folks usually deal with defconfig updates, so please resend this
to arm at kernel.org, although I suspect it's too late for 4.12 given that the
merge window is now open. The "resynchronize" patch is also likely to
conflict with any other pending changes to the file.

Will

> Timur Tabi (9):
>   [v3] arm64: defconfig: resynchronize the defconfig
>   arm64: defconfig: enable ACPI_CPPC_CPUFREQ
>   arm64: defconfig: enable BLK_DEV_NVME
>   [v2] arm64: defconfig: enable EFI_CAPSULE_LOADER
>   arm64: defconfig: enable support for PCIe hotplug
>   arm64: defconfig: enable APEI and GHES features
>   [v2] arm64: defconfig: enable EDAC options
>   arm64: defconfig: enable QCOM_L2_PMU and QCOM_L3_PMU
>   arm64: defconfig: enable the Qualcomm Technologies EMAC Ethernet
>     driver
> 
>  arch/arm64/configs/defconfig | 116 ++++++++++++++++++++-----------------------
>  1 file changed, 55 insertions(+), 61 deletions(-)
> 
> -- 
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
> Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.
> 

^ permalink raw reply

* [PATCH] ARM64/PCI: Allow userspace to mmap PCI resources
From: Bharat Bhushan @ 2017-05-03 15:48 UTC (permalink / raw)
  To: linux-arm-kernel

This patch allows user-space to mmap PCI resources. This
patch is inline to arm32 bit implementation.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com>
---
 arch/arm64/include/asm/pci.h |  4 ++++
 arch/arm64/kernel/pci.c      | 19 +++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
index b9a7ba9..8a18915 100644
--- a/arch/arm64/include/asm/pci.h
+++ b/arch/arm64/include/asm/pci.h
@@ -31,6 +31,10 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
 	return -ENODEV;
 }
 
+#define HAVE_PCI_MMAP
+extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
+			       enum pci_mmap_state mmap_state,
+			       int write_combine);
 static inline int pci_proc_domain(struct pci_bus *bus)
 {
 	return 1;
diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
index ce20f4c..2e45a83 100644
--- a/arch/arm64/kernel/pci.c
+++ b/arch/arm64/kernel/pci.c
@@ -97,6 +97,25 @@ int pcibios_check_service_irqs(struct pci_dev *dev, int *irqs, int mask)
 	return count;
 }
 
+int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
+			enum pci_mmap_state mmap_state, int write_combine)
+{
+	if (mmap_state == pci_mmap_io)
+		return -EINVAL;
+
+	/*
+	 * Mark this as IO
+	 */
+	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+	if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
+			     vma->vm_end - vma->vm_start,
+			     vma->vm_page_prot))
+		return -EAGAIN;
+
+	return 0;
+}
+
 /*
  * raw_pci_read/write - Platform-specific PCI config space access.
  */
-- 
1.9.3

^ permalink raw reply related

* [PATCH 0/9] [v3] arm64: defconfig: enable several options useful for ARM64 server platforms
From: Timur Tabi @ 2017-05-03 15:53 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170503154852.GS8233@arm.com>

On 05/03/2017 10:48 AM, Will Deacon wrote:
> The arm-soc folks usually deal with defconfig updates, so please resend this
> to arm at kernel.org, although I suspect it's too late for 4.12 given that the
> merge window is now open. The "resynchronize" patch is also likely to
> conflict with any other pending changes to the file.

I'm sure it will conflict, but the defconfig has to be synchronized sooner
or later.  Frankly, I think Linus should just do a "batch resynchronize" of
every defconfig in every release (e.g. make it the last commit).
arch/arm64/configs/defconfig is way off today, and it's only going to get worse.

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply

* [PATCH] iommu/arm-smmu-v3: Poll for CMDQ drain completion more effectively
From: Sunil Kovvuri @ 2017-05-03 15:54 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170503153738.GP8233@arm.com>

On Wed, May 3, 2017 at 9:07 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, May 03, 2017 at 06:49:09PM +0530, Sunil Kovvuri wrote:
>> On Thu, Apr 27, 2017 at 4:43 PM,  <sunil.kovvuri@gmail.com> wrote:
>> > From: Sunil Goutham <sgoutham@cavium.com>
>> >
>> > Modified polling on CMDQ consumer similar to how polling is done for TLB SYNC
>> > completion in SMMUv2 driver. Code changes are done with reference to
>> >
>> > 8513c8930069 iommu/arm-smmu: Poll for TLB sync completion more effectively
>> >
>> > Poll timeout has been increased which addresses issue of 100us timeout not
>> > sufficient, when command queue is full with TLB invalidation commands.
>> >
>> > Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
>> > Signed-off-by: Geetha <gakula@cavium.com>
>> > ---
>> >  drivers/iommu/arm-smmu-v3.c | 15 ++++++++++++---
>> >  1 file changed, 12 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> > index d412bdd..34599d4 100644
>> > --- a/drivers/iommu/arm-smmu-v3.c
>> > +++ b/drivers/iommu/arm-smmu-v3.c
>> > @@ -379,6 +379,9 @@
>> >  #define CMDQ_SYNC_0_CS_NONE            (0UL << CMDQ_SYNC_0_CS_SHIFT)
>> >  #define CMDQ_SYNC_0_CS_SEV             (2UL << CMDQ_SYNC_0_CS_SHIFT)
>> >
>> > +#define CMDQ_DRAIN_TIMEOUT_US          1000
>> > +#define CMDQ_SPIN_COUNT                        10
>> > +
>> >  /* Event queue */
>> >  #define EVTQ_ENT_DWORDS                        4
>> >  #define EVTQ_MAX_SZ_SHIFT              7
>> > @@ -737,7 +740,8 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
>> >   */
>> >  static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
>> >  {
>> > -       ktime_t timeout = ktime_add_us(ktime_get(), ARM_SMMU_POLL_TIMEOUT_US);
>> > +       ktime_t timeout = ktime_add_us(ktime_get(), CMDQ_DRAIN_TIMEOUT_US);
>> > +       unsigned int spin_cnt, delay = 1;
>> >
>> >         while (queue_sync_cons(q), (drain ? !queue_empty(q) : queue_full(q))) {
>> >                 if (ktime_compare(ktime_get(), timeout) > 0)
>> > @@ -746,8 +750,13 @@ static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
>> >                 if (wfe) {
>> >                         wfe();
>> >                 } else {
>> > -                       cpu_relax();
>> > -                       udelay(1);
>> > +                       for (spin_cnt = 0;
>> > +                            spin_cnt < CMDQ_SPIN_COUNT; spin_cnt++) {
>> > +                               cpu_relax();
>> > +                               continue;
>> > +                       }
>> > +                       udelay(delay);
>> > +                       delay *= 2;
>> >                 }
>> >         }
>> >
>> > --
>> > 2.7.4
>> >
>>
>> Sorry for the ignorance.
>> Is there a patchwork where I can check current status of ARM IOMMU
>> related patches ?
>>
>> And is this patch accepted, if not any comments / feedback ?
>
> Please be patient: the merge window is open and it's not been long since you
> posted the patch, which looks pretty bonkers at first glance.
>
> Will

Look at this
https://lkml.org/lkml/2017/4/3/605
The same thing, i pinged after a week and you said you already picked it up.
All I am asking is how do i know the current status, how many days
would normally
be considered being patient ?
Instead of troubling you, is there a patchwork where i can check the status ?

Thanks,
Sunil.

^ permalink raw reply

* [PATCH 07/31] KVM: arm/arm64: vgic-v3: Add accessors for the ICH_APxRn_EL2 registers
From: Marc Zyngier @ 2017-05-03 15:58 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170503153201.GC4951@leverpostej>

On 03/05/17 16:32, Mark Rutland wrote:
> On Wed, May 03, 2017 at 11:45:42AM +0100, Marc Zyngier wrote:
>> As we're about to access the Active Priority registers a lot more,
>> let's define accessors that take the register number as a parameter.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  virt/kvm/arm/hyp/vgic-v3-sr.c | 116 ++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 100 insertions(+), 16 deletions(-)
>>
>> diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
>> index 32c3295929b0..990d9d1e85d0 100644
>> --- a/virt/kvm/arm/hyp/vgic-v3-sr.c
>> +++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
>> @@ -118,6 +118,90 @@ static void __hyp_text __gic_v3_set_lr(u64 val, int lr)
>>  	}
>>  }
>>  
>> +static void __hyp_text __vgic_v3_write_ap0rn(u32 val, int n)
>> +{
>> +	switch (n) {
>> +	case 0:
>> +		write_gicreg(val, ICH_AP0R0_EL2);
>> +		break;
>> +	case 1:
>> +		write_gicreg(val, ICH_AP0R1_EL2);
>> +		break;
>> +	case 2:
>> +		write_gicreg(val, ICH_AP0R2_EL2);
>> +		break;
>> +	case 3:
>> +		write_gicreg(val, ICH_AP0R3_EL2);
>> +		break;
>> +	}
> 
> Is there any way we can get a build or runtime failure for an
> out-of-bounds n value?

I'd rather avoid runtime failure on this path, because that's pretty
terminal. Build-time is a possibility, to some extent.

> 
>> +}
> 
> Given this is used with a constant n, you could make this:
> 
> #define __vgic_v3_write_ap0rn(v, n) \
> 	write_gicreg(v, ICH_AP0R##n##_EL2)
> 
> ... which should also give you a warning for an out-of-bounds n.
> 
> Similar could apply for the other helpers here.
> 
> That would require some function -> macro conversion in later patches
> though, so I can understand if you're not keen on that.

I don't mind reworking this if that makes it safer. But the real problem
is that the register number and the group are not necessarily constants
(see how this is used in __vgic_v3_get_highest_active_priority).

I'll have a look at how I can make that look a bit better.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH] ARM64/PCI: Allow userspace to mmap PCI resources
From: Lorenzo Pieralisi @ 2017-05-03 15:58 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493826538-23785-1-git-send-email-Bharat.Bhushan@nxp.com>

On Wed, May 03, 2017 at 09:18:58PM +0530, Bharat Bhushan wrote:
> This patch allows user-space to mmap PCI resources. This
> patch is inline to arm32 bit implementation.
> 
> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com>
> ---
>  arch/arm64/include/asm/pci.h |  4 ++++
>  arch/arm64/kernel/pci.c      | 19 +++++++++++++++++++
>  2 files changed, 23 insertions(+)

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/arch/arm64/include/asm/pci.h?id=f719582435afe9c7985206e42d804ea6aa315d33

> 
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b9a7ba9..8a18915 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -31,6 +31,10 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  	return -ENODEV;
>  }
>  
> +#define HAVE_PCI_MMAP
> +extern int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
> +			       enum pci_mmap_state mmap_state,
> +			       int write_combine);
>  static inline int pci_proc_domain(struct pci_bus *bus)
>  {
>  	return 1;
> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index ce20f4c..2e45a83 100644
> --- a/arch/arm64/kernel/pci.c
> +++ b/arch/arm64/kernel/pci.c
> @@ -97,6 +97,25 @@ int pcibios_check_service_irqs(struct pci_dev *dev, int *irqs, int mask)
>  	return count;
>  }
>  
> +int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
> +			enum pci_mmap_state mmap_state, int write_combine)
> +{
> +	if (mmap_state == pci_mmap_io)
> +		return -EINVAL;
> +
> +	/*
> +	 * Mark this as IO
> +	 */
> +	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> +
> +	if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> +			     vma->vm_end - vma->vm_start,
> +			     vma->vm_page_prot))
> +		return -EAGAIN;
> +
> +	return 0;
> +}
> +
>  /*
>   * raw_pci_read/write - Platform-specific PCI config space access.
>   */
> -- 
> 1.9.3
> 

^ permalink raw reply

* [PATCH] iommu/arm-smmu-v3: Poll for CMDQ drain completion more effectively
From: Will Deacon @ 2017-05-03 15:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CA+sq2CeU7RN_UEiN1fuRtFpkHwgORHv_qU8yvziHqteLdz+_rw@mail.gmail.com>

On Wed, May 03, 2017 at 09:24:13PM +0530, Sunil Kovvuri wrote:
> On Wed, May 3, 2017 at 9:07 PM, Will Deacon <will.deacon@arm.com> wrote:
> > On Wed, May 03, 2017 at 06:49:09PM +0530, Sunil Kovvuri wrote:
> >> On Thu, Apr 27, 2017 at 4:43 PM,  <sunil.kovvuri@gmail.com> wrote:
> >> > From: Sunil Goutham <sgoutham@cavium.com>
> >> >
> >> > Modified polling on CMDQ consumer similar to how polling is done for TLB SYNC
> >> > completion in SMMUv2 driver. Code changes are done with reference to
> >> >
> >> > 8513c8930069 iommu/arm-smmu: Poll for TLB sync completion more effectively
> >> >
> >> > Poll timeout has been increased which addresses issue of 100us timeout not
> >> > sufficient, when command queue is full with TLB invalidation commands.
> >> >
> >> > Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
> >> > Signed-off-by: Geetha <gakula@cavium.com>
> >> > ---
> >> >  drivers/iommu/arm-smmu-v3.c | 15 ++++++++++++---
> >> >  1 file changed, 12 insertions(+), 3 deletions(-)
> >> >
> >> > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> >> > index d412bdd..34599d4 100644
> >> > --- a/drivers/iommu/arm-smmu-v3.c
> >> > +++ b/drivers/iommu/arm-smmu-v3.c
> >> > @@ -379,6 +379,9 @@
> >> >  #define CMDQ_SYNC_0_CS_NONE            (0UL << CMDQ_SYNC_0_CS_SHIFT)
> >> >  #define CMDQ_SYNC_0_CS_SEV             (2UL << CMDQ_SYNC_0_CS_SHIFT)
> >> >
> >> > +#define CMDQ_DRAIN_TIMEOUT_US          1000
> >> > +#define CMDQ_SPIN_COUNT                        10
> >> > +
> >> >  /* Event queue */
> >> >  #define EVTQ_ENT_DWORDS                        4
> >> >  #define EVTQ_MAX_SZ_SHIFT              7
> >> > @@ -737,7 +740,8 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
> >> >   */
> >> >  static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
> >> >  {
> >> > -       ktime_t timeout = ktime_add_us(ktime_get(), ARM_SMMU_POLL_TIMEOUT_US);
> >> > +       ktime_t timeout = ktime_add_us(ktime_get(), CMDQ_DRAIN_TIMEOUT_US);
> >> > +       unsigned int spin_cnt, delay = 1;
> >> >
> >> >         while (queue_sync_cons(q), (drain ? !queue_empty(q) : queue_full(q))) {
> >> >                 if (ktime_compare(ktime_get(), timeout) > 0)
> >> > @@ -746,8 +750,13 @@ static int queue_poll_cons(struct arm_smmu_queue *q, bool drain, bool wfe)
> >> >                 if (wfe) {
> >> >                         wfe();
> >> >                 } else {
> >> > -                       cpu_relax();
> >> > -                       udelay(1);
> >> > +                       for (spin_cnt = 0;
> >> > +                            spin_cnt < CMDQ_SPIN_COUNT; spin_cnt++) {
> >> > +                               cpu_relax();
> >> > +                               continue;
> >> > +                       }
> >> > +                       udelay(delay);
> >> > +                       delay *= 2;
> >> >                 }
> >> >         }
> >> >
> >> > --
> >> > 2.7.4
> >> >
> >>
> >> Sorry for the ignorance.
> >> Is there a patchwork where I can check current status of ARM IOMMU
> >> related patches ?
> >>
> >> And is this patch accepted, if not any comments / feedback ?
> >
> > Please be patient: the merge window is open and it's not been long since you
> > posted the patch, which looks pretty bonkers at first glance.
> >
> > Will
> 
> Look at this
> https://lkml.org/lkml/2017/4/3/605
> The same thing, i pinged after a week and you said you already picked it up.
> All I am asking is how do i know the current status, how many days
> would normally
> be considered being patient ?

At least wait until the merge window is over if it's not a fix, or keep an
eye on the relevant branches (see below).

> Instead of troubling you, is there a patchwork where i can check the status ?

No, but I pick patches up on my iommu/devel branch here:

https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/

and at some point they appear on for-joerg/arm-smmu/updates, which I send
to Joerg (who is the iommu maintainer). He then puts them into linux-next
before they get sent for inclusion in mainline during the next merge window.

Will

^ permalink raw reply

* [PATCH] ARM: dts: imx6sx-sdb: Remove cpufreq OPP override
From: Marek Vasut @ 2017-05-03 15:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493823533.11226.28.camel@nxp.com>

On 05/03/2017 04:58 PM, Leonard Crestez wrote:
> On Wed, 2017-05-03 at 16:26 +0200, Marek Vasut wrote:
> 
>> On 05/03/2017 03:57 PM, Shawn Guo wrote:
>>>
>>> On Tue, Apr 25, 2017 at 07:28:06PM +0200, Marek Vasut wrote:
>>>>
>>>> On 04/25/2017 07:23 PM, Leonard Crestez wrote:
>>>>>
>>>>> Anyway, that version also sets the supply for reg_arm and reg_soc. It
>>>>> is not necessary for fixing the crash I'm seeing but is good because it
>>>>> will result in the minimum voltage on VDD_ARM_SOC_IN rather than a fix
>>>>> 1375mv. I tested Marek's patch and it works fine on my rev B board
>>>>> (which otherwise fails to boot upstream).
>>>> Oh that's nice , thanks ! I don't have SDB and I hacked it up after a
>>>> brief discussion with Fabio without even compile-testing it, thus RFC.
>>>> Glad to hear it works and thanks for testing it ! Can you add a formal
>>>> Tested-by please ?
>>> Hi Marek,
>> Hi Shawn,
>>
>>> Thanks for your patch.  But I prefer Leonard's version because: 1) it
>>> has a better commit log; 2) it sticks to one-patch-does-one-thing
>>> policy.
> 
>> 2) It actually fixes a problem with the voltage rails such that the DVFS
>>    works without leaving the system in unstable or dead state. You do
>>    need the second part of my patch if you drop the OPP hackery, without
>>    it the power framework cannot correctly configure the core voltages,
>>    so the patch from Leonard makes things worse.
> 
> No, I think there is a misunderstanding here. The second part of your
> patch will cause cpufreq poking at LDOs to indirectly adjust the input
> from the PMIC to the minimum required (this is LDO target +
> min_dropout_uv). Without it by default VDD_ARM_SOC_IN will remain fixed
> as 1375mV from boot.

Who sets / guarantees that default value for ARM and SOC rails ?

With the OPP override in place, there's at least the guarantee that both
rails will have the same voltage requirement. If you remove the OPP
override without modeling the actual regulator wiring, the guarantee is
gone.

> That default value should be high enough for all cpufreq settings.
> Setting the LDO parent will cause this voltage to be dynamically
> reduced when possible (at low frequencies). This is not required for
> proper operation, it is just an optimization to do more of the
> regulation in the PMIC instead. It should work just fine without the
> second part.
> 
> That OPP override exists for "LDO bypass" mode, a feature not present
> in upstream. In that mode the internal regulators are set to bypass
> mode and voltage is controlled directly from the PMIC. Since VDD_ARM
> and VDD_SOC have different minimum requirements but are joined on the
> board the OPP voltages are adjusted to max(VDD_ARM, VDD_SOC). If LDOs
> are enabled there is no good reason to do this.
> 
> I don't care which patch goes in but the effect of the patch should be
> clarified.
> 
> -- 
> Regards,
> Leonard
> 


-- 
Best regards,
Marek Vasut

^ permalink raw reply

* [PATCH] ARM64/PCI: Allow userspace to mmap PCI resources
From: Bharat Bhushan @ 2017-05-03 16:00 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170503155858.GA11986@red-moon>



> -----Original Message-----
> From: Lorenzo Pieralisi [mailto:lorenzo.pieralisi at arm.com]
> Sent: Wednesday, May 03, 2017 9:29 PM
> To: Bharat Bhushan <bharat.bhushan@nxp.com>
> Cc: catalin.marinas at arm.com; will.deacon at arm.com;
> bhelgaas at google.com; linux-arm-kernel at lists.infradead.org; linux-
> kernel at vger.kernel.org
> Subject: Re: [PATCH] ARM64/PCI: Allow userspace to mmap PCI resources
> 
> On Wed, May 03, 2017 at 09:18:58PM +0530, Bharat Bhushan wrote:
> > This patch allows user-space to mmap PCI resources. This patch is
> > inline to arm32 bit implementation.
> >
> > Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com>
> > ---
> >  arch/arm64/include/asm/pci.h |  4 ++++
> >  arch/arm64/kernel/pci.c      | 19 +++++++++++++++++++
> >  2 files changed, 23 insertions(+)
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-
> next.git/commit/arch/arm64/include/asm/pci.h?id=f719582435afe9c7985206
> e42d804ea6aa315d33

Oh, I missed this patch series,

Thanks
-Bharat

> 
> >
> > diff --git a/arch/arm64/include/asm/pci.h
> > b/arch/arm64/include/asm/pci.h index b9a7ba9..8a18915 100644
> > --- a/arch/arm64/include/asm/pci.h
> > +++ b/arch/arm64/include/asm/pci.h
> > @@ -31,6 +31,10 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev
> *dev, int channel)
> >  	return -ENODEV;
> >  }
> >
> > +#define HAVE_PCI_MMAP
> > +extern int pci_mmap_page_range(struct pci_dev *dev, struct
> vm_area_struct *vma,
> > +			       enum pci_mmap_state mmap_state,
> > +			       int write_combine);
> >  static inline int pci_proc_domain(struct pci_bus *bus)  {
> >  	return 1;
> > diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c index
> > ce20f4c..2e45a83 100644
> > --- a/arch/arm64/kernel/pci.c
> > +++ b/arch/arm64/kernel/pci.c
> > @@ -97,6 +97,25 @@ int pcibios_check_service_irqs(struct pci_dev *dev,
> int *irqs, int mask)
> >  	return count;
> >  }
> >
> > +int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct
> *vma,
> > +			enum pci_mmap_state mmap_state, int
> write_combine) {
> > +	if (mmap_state == pci_mmap_io)
> > +		return -EINVAL;
> > +
> > +	/*
> > +	 * Mark this as IO
> > +	 */
> > +	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> > +
> > +	if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> > +			     vma->vm_end - vma->vm_start,
> > +			     vma->vm_page_prot))
> > +		return -EAGAIN;
> > +
> > +	return 0;
> > +}
> > +
> >  /*
> >   * raw_pci_read/write - Platform-specific PCI config space access.
> >   */
> > --
> > 1.9.3
> >

^ permalink raw reply

* [PATCH 1/2] PCI: Add new PCIe Fabric End Node flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING
From: Alexander Duyck @ 2017-05-03 16:02 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <MWHPR12MB160095DF53CE981C923FE65DC8160@MWHPR12MB1600.namprd12.prod.outlook.com>

On Tue, May 2, 2017 at 9:30 PM, Casey Leedom <leedom@chelsio.com> wrote:
> | From: Alexander Duyck <alexander.duyck@gmail.com>
> | Date: Tuesday, May 2, 2017 11:10 AM
> | ...
> | So for example, in the case of x86 it seems like there are multiple
> | root complexes that have issues, and the gains for enabling it with
> | standard DMA to host memory are small. As such we may want to default
> | it to off via the architecture specific PCIe code and then look at
> | having "white-list" cases where we enable it for things like
> | peer-to-peer accesses. In the case of SPARC we could look at
> | defaulting it to on, and only "black-list" any cases where there might
> | be issues since SPARC relies on this in a significant way for
> | performance. In the case of ARM and other architectures it is a bit of
> | a toss-up. I would say we could just default it to on for now and
> | "black-list" anything that doesn't work, or we could go the other way
> | like I suggested for x86. It all depends on what the ARM community
> | would want to agree on for this. I would say unless it makes a
> | significant difference like it does in the case of SPARC we are
> | probably better off just defaulting it to off.
>
>   Sorry, I forgot to respond to this earlier when someone was rushing me out
> to a customer dinner.
>
>   I'm unaware of any other x86 Root Complex Port that has a problem with
> Relaxed Ordering other than the performance issue with the current Intel
> implementation.  Ashok tells me that Intel is in the final stages of putting
> together a technical notice on this issue but I don't know when that will
> come out.  Hopefully that will shed much more light on the cause and actual
> use of Relaxed Ordering when directed to Coherent Memory on current and past
> Intel platforms.  (Note that the performance bug seems to limit us to
> ~75-85Gb/s DMA Write speed to Coherent Host Memory.)

So my concern isn't so much about existing issues as it is about where
is the advantage in enabling it. We have had support in the Intel
hardware for enabling relaxed ordering for about 10 years. In all that
time I have yet to see an x86 platform that sees any real benefit from
enabling it for standard DMA. That is why my preference would be to
leave it disabled by default on x86 and we white list it in at some
point when hardware shows that there is a benefit to be had for
enabling it.

>   The only other Device that I currently know of which has issues with
> Relaxed Ordering TLPs directed at it, is also a Root Complex Port on the new
> AMD A1100 ARM SoC ("SEATTLE").  There we have an actual bug which could lead
> to Data Corruption.
>
>   But I don't know anything about other x86 Root Complex Ports having issues
> with this -- we've been using it ever since commit ef306b50b from December
> 2010.

So the question I would have for you then, is what benefits have you
seen from enabling it on x86? In our case we haven't seen any for
transactions that go through the root complex. If you are seeing
benefits would I be correct in assuming it is for your peer-to-peer
case or were there some x86 platforms that showed gains?

>   Also, I'm not aware of any performance data which has been gathered on the
> use of Relaxed Ordering when directed at Host Memory.  From your note, it
> sounds like it's important on SPARC architectures.  But it could conceivably
> be important on any architecture or Root Complex/Memory Controller
> implementation.  We use it to direct Ingress Packet Data to Free List
> Buffers, followed by a TLP without Relaxed Ordering directed at a Host
> Memory Message Queue.  The idea here is to give the Root Complex options on
> which DMA Memory Write TLPs to process in order to optimize data placement
> in memory.  And with the next Ethernet speed bump to 200Gb/s this may be
> even more important.

The operative term here is "may be". That is my concern. We are
leaving this enabled by default and there are known hardware that have
issues that can be pretty serious. I'm not saying we have to disable
it and keep it disabled, but I would like to see us intelligently
enable this feature so that it is enabled on the platforms that show
benefit and disabled on the ones that don't.

My biggest concern with all this is introducing regressions as drivers
like igb and ixgbe are used on a wide range of platforms beyond even
what is covered by x86 and I would prefer not to suddenly have a
deluge of bugs to sort out triggered by us enabling relaxed ordering
on platforms that historically have not had it enabled on them.

>   Basically, I'd hate to come up with a solution where we write off the use
> of Relaxed Ordering for DMA Writes to Host Memory.  I don't think you're
> suggesting that, but there are a number of x86 Root Complex implementations
> out there -- and some like the new AMD Ryzen have yet to be tested -- as
> well as other architectures.

I'm not saying that we have to write off the use of Relaxed Ordering,
what I am saying is that we should probably be more judicious about
how we go about enabling it. If a platform and/or architecture has no
benefit to enabling it what is the point in adding the possible risk?

Hopefully the AMD Ryzen platform has already been tested and doesn't
need a quirk to disable relaxed ordering. Really it shouldn't fall on
the likes of us to be testing for those kind of things.

>   And, as Ashok and I have both nothed, any solution we come up with needs
> to cope with a heterogeneous situation where, on the same PCIe Fabric, it
> may be necessesary/desireable to support Relaxed Ordering TLPs directed at
> some End Nodes but not others.
>
> Casey

It sounds like we are more or less in agreement. My only concern is
really what we default this to. On x86 I would say we could probably
default this to disabled for existing platforms since my understanding
is that relaxed ordering doesn't provide much benefit on what is out
there right now when performing DMA through the root complex. As far
as peer-to-peer I would say we should probably look at enabling the
ability to have Relaxed Ordering enabled for some channels but not
others. In those cases the hardware needs to be smart enough to allow
for you to indicate you want it disabled by default for most of your
DMA channels, and then enabled for the select channels that are
handling the peer-to-peer traffic.

- Alex

^ permalink raw reply

* [PATCH] [RFC] ARM: dts: imx6sx-sdb: Drop OPP hackery
From: Fabio Estevam @ 2017-05-03 16:07 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170425164224.26537-1-marex@denx.de>

On Tue, Apr 25, 2017 at 1:42 PM, Marek Vasut <marex@denx.de> wrote:
> The imx6sx-sdb has one power supply that drives both VDDARM_IN
> and VDDSOC_IN, which is the sw1a regulator on the PFUZE PMIC.
> Connect both inputs to the sw1a regulator on the PMIC and drop
> the OPP hackery which is no longer needed as the power framework
> will take care of the regulator configuration as needed.
>
> Signed-off-by: Marek Vasut <marex@denx.de>
> Cc: Fabio Estevam <fabio.estevam@nxp.com>
> Cc: Shawn Guo <shawnguo@kernel.org>

Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox