Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH 0/3 v2] xen: Implement EFI reset_system callback
From: Juergen Gross @ 2017-05-02 10:13 UTC
  To: linux-arm-kernel
In-Reply-To: <20170424175839.5262-1-julien.grall@arm.com>

On 24/04/17 19:58, Julien Grall wrote:
> Hi all,
> 
> This small patch series implements EFI reset_system callback when using EFI
> Xen. Without this, it will not be possible to reboot/power off ARM64 DOM0
> when using ACPI.
> 
> Cheers,
> 
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Juergen Gross <jgross@suse.com>

Series rebased and pushed to xen/tip for-linus-4.12b


Juergen


* [PATCH] arm64: Fix multiple 'asm-operand-widths' warnings
From: Mark Rutland @ 2017-05-02 10:27 UTC
  To: linux-arm-kernel
In-Reply-To: <20170501212622.153720-1-mka@chromium.org>

Hi Matthias,

On Mon, May 01, 2017 at 02:26:22PM -0700, Matthias Kaehlcke wrote:
> clang raises 'asm-operand-widths' warnings in inline assembly code when
> the size of an operand is < 64 bits and the operand width is unspecified.
> Most warnings are raised in macros, i.e. the datatype of the operand may
> vary. Forcing the use of an x register through the 'x' operand modifier
> would silence the warning however it involves the risk that for operands
> < 64 bits 'unused' bits may be assigned to 64-bit values (more details at
> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-April/503816.html).
> Instead we cast the operand to 64 bits, which also forces the use of a
> x register, but without the unexpected behavior.
> 
> In gic_write_bpr1() use write_sysreg_s() to write the register. This
> aligns the functions with others in this header and fixes an
> 'asm-operand-widths' warning.
> 
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> ---
>  arch/arm64/include/asm/arch_gicv3.h  | 2 +-
>  arch/arm64/include/asm/barrier.h     | 2 +-
>  arch/arm64/include/asm/uaccess.h     | 2 +-
>  arch/arm64/kernel/armv8_deprecated.c | 2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)

Thanks for putting this together.

Just to check, are these the only instances that you see clang warning
about?

There are a number of other cases where we can see similar problems
(e.g. passing a u8 value to an smp_store_release() on a u32 variable),
so we need to fix more than the clang warnings.
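
As a rough illustration of the cast approach from the patch description (a standalone sketch, not kernel code; the helper name copy_via_xreg is made up here): giving the operand a 64-bit type before it reaches the asm means the compiler prints an x register and zero-extends the narrow value, so no stale upper bits are involved. On anything other than arm64 the sketch falls back to plain C so that it still builds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: pass a narrow value
 * through a 64-bit register operand and return what the asm saw.
 */
static inline uint64_t copy_via_xreg(uint8_t v)
{
	uint64_t out;

#if defined(__aarch64__)
	/*
	 * The (uint64_t) cast gives the operand a 64-bit type, so %1 is
	 * printed as an x register and there is no width mismatch for
	 * clang to warn about. The cast zero-extends v.
	 */
	__asm__ volatile("mov %0, %1" : "=r" (out) : "r" ((uint64_t)v));
#else
	/* Non-arm64 fallback so the sketch builds anywhere. */
	out = (uint64_t)v;
#endif
	/* The widened value must not carry anything above bit 7. */
	assert(out <= UINT8_MAX);
	return out;
}
```

Note this only covers the operand-width side; a narrow value stored through a wider pointer is a separate problem.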

I'm currently attempting a systematic audit of our inline asm to correct
all instances, looking at:

	git grep -e asm \
	--and --not -e 'include' \
	--and --not -e 'asmlinkage' \
	-- arch/arm64

I hope to have patches shortly, and will keep you informed.

Thanks,
Mark.

> 
> diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
> index f37e3a21f6e7..9092d612d8c2 100644
> --- a/arch/arm64/include/asm/arch_gicv3.h
> +++ b/arch/arm64/include/asm/arch_gicv3.h
> @@ -166,7 +166,7 @@ static inline void gic_write_sre(u32 val)
>  
>  static inline void gic_write_bpr1(u32 val)
>  {
> -	asm volatile("msr_s " __stringify(ICC_BPR1_EL1) ", %0" : : "r" (val));
> +	write_sysreg_s(val, ICC_BPR1_EL1);
>  }
>  
>  #define gic_read_typer(c)		readq_relaxed(c)
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index 4e0497f581a0..1248401b07ab 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -60,7 +60,7 @@ do {									\
>  		break;							\
>  	case 8:								\
>  		asm volatile ("stlr %1, %0"				\
> -				: "=Q" (*p) : "r" (v) : "memory");	\
> +			      : "=Q" (*p) : "r" ((__u64)v) : "memory");	\
>  		break;							\
>  	}								\
>  } while (0)
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index 5308d696311b..7db143689694 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -302,7 +302,7 @@ do {									\
>  	"	.previous\n"						\
>  	_ASM_EXTABLE(1b, 3b)						\
>  	: "+r" (err)							\
> -	: "r" (x), "r" (addr), "i" (-EFAULT))
> +	: "r" ((__u64)x), "r" (addr), "i" (-EFAULT))
>  
>  #define __put_user_err(x, ptr, err)					\
>  do {									\
> diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
> index 657977e77ec8..79b9fef48b14 100644
> --- a/arch/arm64/kernel/armv8_deprecated.c
> +++ b/arch/arm64/kernel/armv8_deprecated.c
> @@ -306,7 +306,7 @@ do {								\
>  	_ASM_EXTABLE(0b, 4b)					\
>  	_ASM_EXTABLE(1b, 4b)					\
>  	: "=&r" (res), "+r" (data), "=&r" (temp), "=&r" (temp2)	\
> -	: "r" (addr), "i" (-EAGAIN), "i" (-EFAULT),		\
> +	: "r" ((__u64)addr), "i" (-EAGAIN), "i" (-EFAULT),	\
>  	  "i" (__SWP_LL_SC_LOOPS)				\
>  	: "memory");						\
>  	uaccess_disable();					\
> -- 
> 2.13.0.rc0.306.g87b477812d-goog
> 


* [PATCH/RFT renesas-devel] arm64: dts: ulcb: Set drive-strength for ravb pins
From: Simon Horman @ 2017-05-02 10:30 UTC
  To: linux-arm-kernel

The EthernetAVB should not depend on the bootloader to set up correct
drive-strength values.  Values for drive-strength were found by
examining the registers after the bootloader has configured the
registers and successfully used the EthernetAVB.

Based on:
* commit 7d73a4da2681 ("arm64: dts: r8a7795: salvator-x: Set drive-strength
  for ravb pins")
* commit 4903987033be ("arm64: dts: r8a7796: salvator-x: Set drive-strength
  for ravb pins")

Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
---
* Compile tested only due to lack of hardware access
---
 arch/arm64/boot/dts/renesas/ulcb.dtsi | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/renesas/ulcb.dtsi b/arch/arm64/boot/dts/renesas/ulcb.dtsi
index 7669b282bde2..a38d68c4d9a6 100644
--- a/arch/arm64/boot/dts/renesas/ulcb.dtsi
+++ b/arch/arm64/boot/dts/renesas/ulcb.dtsi
@@ -199,8 +199,22 @@
 	pinctrl-names = "default";
 
 	avb_pins: avb {
-		groups = "avb_mdc";
-		function = "avb";
+		mux {
+			groups = "avb_link", "avb_phy_int", "avb_mdc",
+				 "avb_mii";
+			function = "avb";
+		};
+
+		pins_mdc {
+			groups = "avb_mdc";
+			drive-strength = <24>;
+		};
+
+		pins_mii_tx {
+			pins = "PIN_AVB_TX_CTL", "PIN_AVB_TXC", "PIN_AVB_TD0",
+			       "PIN_AVB_TD1", "PIN_AVB_TD2", "PIN_AVB_TD3";
+			drive-strength = <12>;
+		};
 	};
 
 	i2c2_pins: i2c2 {
-- 
2.1.4


* [PATCH 0/3 v2] xen: Implement EFI reset_system callback
From: Julien Grall @ 2017-05-02 10:53 UTC
  To: linux-arm-kernel
In-Reply-To: <f7bfc07d-ae10-09fb-9fce-a2ef93460459@suse.com>

Hi Juergen,

On 02/05/17 10:30, Juergen Gross wrote:
> On 02/05/17 11:08, Julien Grall wrote:
>> Hi all,
>>
>> It looks like the series has fully been acked, can someone merge this
>> into xentip?
>
> As I already wrote: patch 1 doesn't apply any longer.
>
> As there were other conflicts between xentip and Linus' tree I'm doing
> a rebase of for-linus-4.12 right now, so I can do the rebase of this
> patch for you.

Sorry I missed your answer on patch #1.

I saw you just rebased it, thank you for that!

Cheers,


-- 
Julien Grall


* [PATCH v3 1/3] arm64: kvm: support kvmtool to detect RAS extension feature
From: gengdongjiu @ 2017-05-02 11:05 UTC
  To: linux-arm-kernel
In-Reply-To: <20170502075631.GE16940@cbox>

Hi Christoffer,
   thanks for your review and comments

On 2017/5/2 15:56, Christoffer Dall wrote:
> Hi Dongjiu,
> 
> Please send a cover letter for patch series with more than a single
> patch.
  OK, got it.

> 
> The subject and description of these patches are also misleading.
> Hopefully this is in no way tied to kvmtool, but to userspace
> generically, for example also to be used by QEMU?
 Yes, it is also used by QEMU; the description should refer to userspace generically.


> 
> On Sun, Apr 30, 2017 at 01:37:55PM +0800, Dongjiu Geng wrote:
>> Handle kvmtool's detection for RAS extension, because sometimes
>> the APP needs to know the CPU's capacity
> 
> the APP ?
> 
> the CPU's capacity?
  I will fix it.

> 
>>
>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
>> ---
>>  arch/arm64/kvm/reset.c   | 11 +++++++++++
>>  include/uapi/linux/kvm.h |  1 +
>>  2 files changed, 12 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>> index d9e9697..1004039 100644
>> --- a/arch/arm64/kvm/reset.c
>> +++ b/arch/arm64/kvm/reset.c
>> @@ -64,6 +64,14 @@ static bool cpu_has_32bit_el1(void)
>>  	return !!(pfr0 & 0x20);
>>  }
>>  
>> +static bool kvm_arm_support_ras_extension(void)
>> +{
>> +	u64 pfr0;
>> +
>> +	pfr0 = read_system_reg(SYS_ID_AA64PFR0_EL1);
>> +	return !!(pfr0 & 0x10000000);
>> +}
>> +
>>  /**
>>   * kvm_arch_dev_ioctl_check_extension
>>   *
>> @@ -87,6 +95,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
>>  	case KVM_CAP_ARM_PMU_V3:
>>  		r = kvm_arm_support_pmu_v3();
>>  		break;
>> +	case KVM_CAP_ARM_RAS_EXTENSION:
>> +		r = kvm_arm_support_ras_extension();
>> +		break;
> 
> You need to document this capability and API in
> Documentation/virtual/kvm/api.txt and explain how this works.
  Ok, thanks for your suggestion.
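
For the documentation, a minimal sketch of the check in plain C on a raw register value (helper names are made up for illustration, this is not the patch itself). It assumes that the RAS field is ID_AA64PFR0_EL1 bits [31:28] and that any non-zero value means the extension is implemented; by comparison, the 0x10000000 mask in the patch tests only bit 28 of that field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed field position: ID_AA64PFR0_EL1.RAS is bits [31:28]. */
#define PFR0_RAS_SHIFT	28
#define ID_FIELD_MASK	UINT64_C(0xf)

/* Extract a 4-bit ID register field starting at bit 'shift'. */
static inline unsigned int id_reg_field(uint64_t reg, unsigned int shift)
{
	assert(shift <= 60);
	return (unsigned int)((reg >> shift) & ID_FIELD_MASK);
}

/* Hypothetical helper: true if the value reports any RAS support. */
static inline bool pfr0_has_ras(uint64_t pfr0)
{
	return id_reg_field(pfr0, PFR0_RAS_SHIFT) != 0;
}
```

Comparing the whole field against zero keeps the check valid if later revisions report a field value other than 1.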

> 
> 
> 
>>  	case KVM_CAP_SET_GUEST_DEBUG:
>>  	case KVM_CAP_VCPU_ATTRIBUTES:
>>  		r = 1;
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index f51d508..27fe556 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -883,6 +883,7 @@ struct kvm_ppc_resize_hpt {
>>  #define KVM_CAP_PPC_MMU_RADIX 134
>>  #define KVM_CAP_PPC_MMU_HASH_V3 135
>>  #define KVM_CAP_IMMEDIATE_EXIT 136
>> +#define KVM_CAP_ARM_RAS_EXTENSION 137
>>  
>>  #ifdef KVM_CAP_IRQ_ROUTING
>>  
>> -- 
>> 2.10.1
>>
> 
> Thanks,
> -Christoffer
> 
> .
> 


* [PATCH 1/1] arm64: Always provide "model name" in /proc/cpuinfo
From: Catalin Marinas @ 2017-05-02 11:08 UTC
  To: linux-arm-kernel
In-Reply-To: <20170501223913.6894-1-xypron.glpk@gmx.de>

On Tue, May 02, 2017 at 12:39:13AM +0200, Heinrich Schuchardt wrote:
> There is no need to hide the model name in processes
> that are not PER_LINUX32.
> 
> So let us always provide a model name that is easily readable.
> 
> Fixes: e47b020a323d ("arm64: Provide "model name" in /proc/cpuinfo for PER_LINUX32 tasks")
> Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
> ---
>  arch/arm64/kernel/cpuinfo.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
> index b3d5b3e8fbcb..9ad9ddcd2f19 100644
> --- a/arch/arm64/kernel/cpuinfo.c
> +++ b/arch/arm64/kernel/cpuinfo.c
> @@ -118,9 +118,8 @@ static int c_show(struct seq_file *m, void *v)
>  		 * "processor".  Give glibc what it expects.
>  		 */
>  		seq_printf(m, "processor\t: %d\n", i);
> -		if (compat)
> -			seq_printf(m, "model name\t: ARMv8 Processor rev %d (%s)\n",
> -				   MIDR_REVISION(midr), COMPAT_ELF_PLATFORM);
> +		seq_printf(m, "model name\t: ARMv8 Processor rev %d (%s)\n",
> +			   MIDR_REVISION(midr), COMPAT_ELF_PLATFORM);
>  
>  		seq_printf(m, "BogoMIPS\t: %lu.%02lu\n",
>  			   loops_per_jiffy / (500000UL/HZ),

Such a patch seems to come up regularly:

https://patchwork.kernel.org/patch/9303311/

(and it usually gets rejected)

-- 
Catalin


* [PATCH 2/5] mtd: nand: gpmi: add i.MX 7 SoC support
From: Marek Vasut @ 2017-05-02 11:18 UTC
  To: linux-arm-kernel
In-Reply-To: <20170502111758.53e9a959@bbrezillon>

On 05/02/2017 11:17 AM, Boris Brezillon wrote:
> Hi Han,
> 
> On Fri, 21 Apr 2017 13:29:16 -0500
> Han Xu <xhnjupt@gmail.com> wrote:
> 
>>>>  
>>>>>>> But then, adding the type would only require 2-3 lines of change if I
>>>>>>> add it to the GPMI_IS_MX6 macro...  
>>>>>>
>>>>>> Then at least add a comment because using type = IMX6SX right under
>>>>>> gpmi_data_mx7d can trigger some head-scratching. And put my R-B on V2.  
>>>>>
>>>>> FWIW, I mentioned it in the commit message.
>>>>>
>>>>> I think rather then adding a comment it is cleaner to just add IS_IMX7D
>>>>> and add it to the GPMI_IS_MX6 macro. That does not need a comment since
>>>>> it implicitly says we have a i.MX 7 but treat it like i.MX 6 and it is a
>>>>> rather small change. Does that sound acceptable?  
>>>>
>>>> Sure, that's even better, thanks.
>>>>
>>>> btw isn't there some single-core mx7 (mx7s ?) , maybe we should just go
>>>> with mx7 (without the d suffix). I dunno if it has GPMI NAND though, so
>>>> maybe mx7d is the right thing to do here ...
>>>>  
>>>
>>> There is a Solo version yes, and it has GPMI NAND too. However, almost
>>> all i.MX 7 IPs have been named imx7d by NXP for some reason (including
>>> compatible strings, see grep -r -e imx7 Documentation/), so I thought I
>>> stay consistent here...  

I missed the DT bit, sorry. The DT bindings say:
  - compatible : should be "fsl,<chip>-gpmi-nand"
so if FSL invented their own buggy bindings, they need to get them
through Rob :) IMO for MX7, this should be "imx7-gpmi-nand", unless
there's some incentive to distinguish the solo/dual chips and/or a
future imx7 comes with a different GPMI NAND block version.

>> Hi Guys,
>>
>> Yes, there should be a i.MX7 Solo version with one core fused out. IMO, can
>> we use QUIRK to distinguish them rather than SoC name. I know I also sent
>> some patch set with SoC Name but I prefer to use QUIRK now.
> 
> Not sure what this means. Are you okay with Stefan's v2?

IMO the GPMI controller in solo and dual should be the same, so there's
no need to have quirks for it.
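
For what it's worth, the macro approach discussed above could look roughly like this. It is a standalone sketch: the enum, struct and helper below are illustrative stand-ins using the names from this thread, not a copy of the driver, and a single i.MX 7 entry is assumed to cover both solo and dual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative controller types; IS_IMX7D is the name used in the thread. */
enum gpmi_type {
	IS_MX23,
	IS_MX28,
	IS_MX6Q,
	IS_MX6SX,
	IS_IMX7D,
};

/* Stand-in for the per-SoC data the driver would carry. */
struct gpmi_devdata {
	enum gpmi_type type;
};

#define GPMI_IS_MX6Q(d)		((d)->type == IS_MX6Q)
#define GPMI_IS_MX6SX(d)	((d)->type == IS_MX6SX)
#define GPMI_IS_IMX7D(d)	((d)->type == IS_IMX7D)

/* i.MX 7 is folded in here so it takes the existing i.MX 6 code paths. */
#define GPMI_IS_MX6(d)		(GPMI_IS_MX6Q(d) || GPMI_IS_MX6SX(d) || \
				 GPMI_IS_IMX7D(d))

/* Hypothetical helper, only here to exercise the macro. */
static inline bool gpmi_uses_mx6_paths(const struct gpmi_devdata *d)
{
	assert(d != NULL);
	return GPMI_IS_MX6(d);
}
```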

-- 
Best regards,
Marek Vasut


* [PATCH v5 01/10] arm64: allwinner: a64: enable RSB on A64
From: Maxime Ripard @ 2017-05-02 11:22 UTC
  To: linux-arm-kernel
In-Reply-To: <4f910cd7ab972c1b6d93b0327833b596@aosc.io>

On Fri, Apr 28, 2017 at 02:14:58AM +0800, icenowy at aosc.io wrote:
> On 2017-04-27 21:28, Maxime Ripard wrote:
> > On Wed, Apr 26, 2017 at 11:20:14PM +0800, Icenowy Zheng wrote:
> > > Allwinner A64 have a RSB controller like the one on A23/A33 SoCs.
> > > 
> > > Add it and its pinmux.
> > > 
> > > Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
> > > Acked-by: Chen-Yu Tsai <wens@csie.org>
> > > ---
> > > Changes in v2:
> > > - Removed bonus properties in pio node.
> > > - Added Chen-Yu's ACK.
> > > 
> > >  arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 19
> > > +++++++++++++++++++
> > >  1 file changed, 19 insertions(+)
> > > 
> > > diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
> > > b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
> > > index c7f669f5884f..05ec9fc5e81f 100644
> > > --- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
> > > +++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
> > > @@ -422,6 +422,25 @@
> > >  			#gpio-cells = <3>;
> > >  			interrupt-controller;
> > >  			#interrupt-cells = <3>;
> > > +
> > > +			r_rsb_pins: rsb@0 {
> > > +				pins = "PL0", "PL1";
> > > +				function = "s_rsb";
> > > +			};
> > > +		};
> > > +
> > > +		r_rsb: rsb@1f03400 {
> > > +			compatible = "allwinner,sun8i-a23-rsb";
> > > +			reg = <0x01f03400 0x400>;
> > > +			interrupts = <GIC_SPI 39 IRQ_TYPE_LEVEL_HIGH>;
> > > +			clocks = <&r_ccu 6>;
> > 
> > Please use the defines here..
> 
> Linux-4.12 doesn't yet enter rc1, and the defines are still not in
> Linus's tree.
> 
> Please note that I have already mentioned that this patch is necessary
> to be merged into 4.12, otherwise poweroff won't work properly at 4.12 .

This is too late for 4.12. We don't merge any patch two weeks before
the merge window opens, which makes it around -rc6. This will be 4.13
material, so we'll definitely have the defines by then.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

* [PATCH v4 1/5] dt-bindings: gpu: add bindings for the ARM Mali Midgard GPU
From: Guillaume Tucker @ 2017-05-02 11:23 UTC
  To: linux-arm-kernel
In-Reply-To: <20170428192742.pu4v4layszr6z2ot@rob-hp-laptop>

Hi Rob,

On 28/04/17 20:27, Rob Herring wrote:
> On Tue, Apr 25, 2017 at 02:16:16PM +0100, Guillaume Tucker wrote:

>> diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt b/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt
>> new file mode 100644
>> index 000000000000..547ddeceb498
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt
>> @@ -0,0 +1,82 @@
>> +ARM Mali Midgard GPU
>> +====================
>> +
>> +Required properties:
>> +
>> +- compatible :
>> +  * Must be one of the following:
>> +    + "arm,mali-t60x"
>> +    + "arm,mali-t62x"
>
> Don't use wildcards.

Sure, old habits die hard...  I'll fix it in patch v5.

>> +    + "arm,mali-t720"
>> +    + "arm,mali-t760"
>> +    + "arm,mali-t820"
>> +    + "arm,mali-t830"
>> +    + "arm,mali-t860"
>> +    + "arm,mali-t880"
>> +  * And, optionally, one of the vendor specific compatible:
>
> IMO, these should not be optional.

Well, vendor compatible strings are clearly optional for the
Utgard GPU series for which the bindings docs were recently
merged.  It seems that whether these should be optional or not,
the documentation should be consistent between at least all
similar types of devices like Midgard and Utgard GPUs.  They have
different architectures but from a device tree point of view,
they both have the same kind of SoC-specific integration (clocks,
irqs, regulators...).

So was this overlooked in the Utgard case, and should it ideally
be fixed there as well to make them non-optional?  Or is it OK to
keep these optional on second thought?

>> +    + "amlogic,meson-gxm-mali"
>> +    + "rockchip,rk3288-mali"

Guillaume


* [PATCH V8 4/5] PCI/ASPM: save power on values during bridge init
From: Patel, Mayurkumar @ 2017-05-02 12:02 UTC
  To: linux-arm-kernel
In-Reply-To: <CAErSpo7TO1rfG6MDYuohyXL4WbaGCJdV8y3Ke-uL6VjGRb1hvw@mail.gmail.com>

Hi Bjorn

>
>On Fri, Apr 21, 2017 at 2:46 AM, Patel, Mayurkumar
><mayurkumar.patel@intel.com> wrote:
>> Hi Bjorn/Kaya,
>>
>>
>>>
>>>On 4/17/2017 12:38 PM, Bjorn Helgaas wrote:
>>>>> Like you said, what do we do by default is the question. Should we opt
>>>>> for safe like we are doing, or try to save some power.
>>>> I think safety is paramount.  Every user should be able to boot safely
>>>> without any kernel parameters.  We don't want users to have a problem
>>>> booting and then have to search for a workaround like booting with
>>>> "pcie_aspm=off".  Most users will never do that.
>>>>
>>>
>>>OK, no problem with leaving the behavior as it is.
>>>
>>>My initial approach was #2. We knew this way that user had full control
>>>over the ASPM policy by changing the BIOS option. Then, Mayurkumar
>>>complained that ASPM is not enabled following a hotplug insertion to an
>>>empty slot. That's when I switched to #3 as it sounded like a good thing
>>>to have for us.
>>>
>>>> Here's a long-term strawman proposal, see what you think:
>>>>
>>>>   - Deprecate CONFIG_PCIEASPM_DEFAULT, CONFIG_PCIEASPM_POWERSAVE, etc.
>>>>   - Default aspm_policy is POLICY_DEFAULT always.
>>>>   - POLICY_DEFAULT means Linux doesn't touch anything: if BIOS enabled
>>>> ASPM, we leave it that way; we leave ASPM disabled on hot-added
>>>> devices.
>>>
>> I am also ok with leaving the same behavior as now.
>> But still following is something open I feel besides, Which may be there in your comments redundantly.
>> The current problem is, pcie_aspm_exit_link_state() disables the ASPM configuration even
>> if POLICY_DEFAULT was set.
>
>We call pcie_aspm_exit_link_state() when removing an endpoint.  When
>we remove an endpoint, I think disabling ASPM is the right thing to
>do.  The spec (PCIe r3.1, sec 5.4.1.3) says "Software must not enable
>L0s in either direction on a given Link unless components on both
>sides of the Link each support L0s; otherwise, the result is
>undefined."
>

Yes, you are right; per the spec it also makes sense that ASPM needs to be disabled.
But if POLICY_DEFAULT is set, shouldn't the BIOS take care of disabling ASPM?



>> I am already seeing the following problem (or maybe influence) with it. The Endpoint I have
>> does not have a "Presence detect change" mechanism. Hot plug is working with Link status events.
>> When link is in L1 or L1SS and if EP is powered off, no Link status change event are triggered (It might be
>> the expected behavior in L1 or L1SS).  When next time EP is powered on there are link down and
>> link up events coming one after other. BIOS enables ASPM on Root port and Endpoint, but while
>> processing link status down, pcie_aspm_exit_link_state() clears the ASPM already which were enabled by BIOS.
>> If we want to follow above approach then shall we consider having something similar as following?
>
>The proposal was to leave ASPM disabled on hot-added devices.  If the
>endpoint was powered off and powered back on again, I think that
>device looks like a hot-added device, doesn't it?
>

Yes, it is a hot-added device. Also, I understand that for POLICY_DEFAULT the OS would/should not touch ASPM (enable/disable),
but the BIOS still could (enable/disable), right?

Currently, what happens on my system is the following (every second power cycle/hotplug of the Endpoint disables ASPM):


First power cycle (when ASPM L1 is already enabled):
Device gets powered off -> there are no Link status events, so no PCIe hotplug interrupt is raised and pcie_aspm_exit_link_state() is not triggered.
When the device gets powered on again -> Link down/Link up events arrive back to back.
The Link down event is served first. (The BIOS checks the Link status and has already enabled ASPM, as the device is
actually powered back on.) The OS calls pcie_aspm_exit_link_state() and ASPM gets disabled by the OS.

Second power cycle (when ASPM L1 is disabled after the above):
Device gets powered off -> there are Link status events, a PCIe hotplug interrupt is raised and pcie_aspm_exit_link_state() is triggered.
The OS disables ASPM. The BIOS checks the Link status and disables ASPM too.
When the device gets powered on -> the BIOS enables ASPM and, as this is a PCIe hotplug insertion, the OS
does not interfere and we have ASPM enabled.

The above sequence repeats every second power cycle of the hotplug device.

So one could still argue: if POLICY_DEFAULT is set, why does the OS disable ASPM when it is not meant to touch the configuration?
This is why I proposed the following kind of change, so that the OS would not touch ASPM if POLICY_DEFAULT is set.
Also, with the below change, everything relies on the BIOS for ASPM when POLICY_DEFAULT is set, and I see that the above problem
gets resolved. The existing ASPM behavior is not affected either, unless a specific BIOS does not disable ASPM on the
Root Port when the device gets removed.
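
To make the sequence above easier to follow, here is a toy model of it in plain C. It is only a sketch of the behaviour as I described it: the struct, the single flag and the function name are invented for illustration, and nothing here is taken from the kernel or from the proposed change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: one flag stands in for the ASPM state of the link. */
struct link_model {
	bool aspm_enabled;	/* ASPM L1 currently enabled on the link */
	bool os_touches_aspm;	/* true: current behaviour, false: proposal */
};

/* Model one power-off/power-on cycle of the Endpoint. */
static void power_cycle(struct link_model *l)
{
	assert(l != NULL);

	if (l->aspm_enabled) {
		/*
		 * Link sits in L1: power-off raises no Link status event.
		 * At power-on the BIOS enables ASPM, then the OS serves the
		 * late Link down event via pcie_aspm_exit_link_state().
		 */
		l->aspm_enabled = true;
		if (l->os_touches_aspm)
			l->aspm_enabled = false;
	} else {
		/* Power-off is seen this time: ASPM is disabled on removal. */
		l->aspm_enabled = false;
		/* Power-on: BIOS enables ASPM, OS leaves a hot-add alone. */
		l->aspm_enabled = true;
	}
}
```

With os_touches_aspm set, the flag flips on every cycle, which matches ASPM being lost on every second power cycle; with it cleared, the flag stays set.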



>Bjorn

* [PATCH v3 1/3] arm64: kvm: support kvmtool to detect RAS extension feature
From: gengdongjiu @ 2017-05-02 12:15 UTC
  To: linux-arm-kernel
In-Reply-To: <20170502075631.GE16940@cbox>

Hi Christoffer,
   thanks for your review and comments.

On 2017/5/2 15:56, Christoffer Dall wrote:
> Hi Dongjiu,
> 
> Please send a cover letter for patch series with more than a single
> patch.
 OK, got it.

> 
> The subject and description of these patches are also misleading.
> Hopefully this is in no way tied to kvmtool, but to userspace
> generically, for example also to be used by QEMU?
> 
> On Sun, Apr 30, 2017 at 01:37:55PM +0800, Dongjiu Geng wrote:
>> Handle kvmtool's detection for RAS extension, because sometimes
>> the APP needs to know the CPU's capacity
> 
> the APP ?
> 
> the CPU's capacity?
I will fix it.

> 
>>
>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
>> ---
>>  arch/arm64/kvm/reset.c   | 11 +++++++++++
>>  include/uapi/linux/kvm.h |  1 +
>>  2 files changed, 12 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>> index d9e9697..1004039 100644
>> --- a/arch/arm64/kvm/reset.c
>> +++ b/arch/arm64/kvm/reset.c
>> @@ -64,6 +64,14 @@ static bool cpu_has_32bit_el1(void)
>>  	return !!(pfr0 & 0x20);
>>  }
>>  
>> +static bool kvm_arm_support_ras_extension(void)
>> +{
>> +	u64 pfr0;
>> +
>> +	pfr0 = read_system_reg(SYS_ID_AA64PFR0_EL1);
>> +	return !!(pfr0 & 0x10000000);
>> +}
>> +
>>  /**
>>   * kvm_arch_dev_ioctl_check_extension
>>   *
>> @@ -87,6 +95,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
>>  	case KVM_CAP_ARM_PMU_V3:
>>  		r = kvm_arm_support_pmu_v3();
>>  		break;
>> +	case KVM_CAP_ARM_RAS_EXTENSION:
>> +		r = kvm_arm_support_ras_extension();
>> +		break;
> 
> You need to document this capability and API in
> Documentation/virtual/kvm/api.txt and explain how this works.
 Ok, thanks for your suggestion.

> 
> 
> 
>>  	case KVM_CAP_SET_GUEST_DEBUG:
>>  	case KVM_CAP_VCPU_ATTRIBUTES:
>>  		r = 1;
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index f51d508..27fe556 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -883,6 +883,7 @@ struct kvm_ppc_resize_hpt {
>>  #define KVM_CAP_PPC_MMU_RADIX 134
>>  #define KVM_CAP_PPC_MMU_HASH_V3 135
>>  #define KVM_CAP_IMMEDIATE_EXIT 136
>> +#define KVM_CAP_ARM_RAS_EXTENSION 137
>>  
>>  #ifdef KVM_CAP_IRQ_ROUTING
>>  
>> -- 
>> 2.10.1
>>
> 
> Thanks,
> -Christoffer
> 
> .
> 


* [PATCH v3 2/3] arm64: kvm: inject SError with virtual syndrome
From: gengdongjiu @ 2017-05-02 12:20 UTC
  To: linux-arm-kernel
In-Reply-To: <20170502080342.GF16940@cbox>

Hello Christoffer.


On 2017/5/2 16:03, Christoffer Dall wrote:
> On Sun, Apr 30, 2017 at 01:37:56PM +0800, Dongjiu Geng wrote:
>> when SError happen, kvm notifies kvmtool to generate GHES table
>> to record the error, then kvmtools inject the SError with specified
> 
> again, is this really specific to kvmtool?  Please try to explain this
> mechanism in generic terms.
  It is for both QEMU and other userspace applications. I will correct it.

> 
>> virtual syndrome. when switch to guest, a virtual SError will happen with
>> this specified syndrome.
>>
>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
>> ---
>>  arch/arm64/include/asm/esr.h         |  2 ++
>>  arch/arm64/include/asm/kvm_emulate.h | 10 ++++++++++
>>  arch/arm64/include/asm/kvm_host.h    |  1 +
>>  arch/arm64/include/asm/sysreg.h      |  3 +++
>>  arch/arm64/kvm/handle_exit.c         | 25 +++++++++++++++++++------
>>  arch/arm64/kvm/hyp/switch.c          | 15 ++++++++++++++-
>>  include/uapi/linux/kvm.h             |  5 +++++
>>  7 files changed, 54 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
>> index 22f9c90..d009c99 100644
>> --- a/arch/arm64/include/asm/esr.h
>> +++ b/arch/arm64/include/asm/esr.h
>> @@ -127,6 +127,8 @@
>>  #define ESR_ELx_WFx_ISS_WFE	(UL(1) << 0)
>>  #define ESR_ELx_xVC_IMM_MASK	((1UL << 16) - 1)
>>  
>> +#define VSESR_ELx_IDS_ISS_MASK    ((1UL << 25) - 1)
>> +
>>  /* ESR value templates for specific events */
>>  
>>  /* BRK instruction trap from AArch64 state */
>> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
>> index f5ea0ba..a3259a9 100644
>> --- a/arch/arm64/include/asm/kvm_emulate.h
>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>> @@ -148,6 +148,16 @@ static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
>>  	return vcpu->arch.fault.esr_el2;
>>  }
>>  
>> +static inline u32 kvm_vcpu_get_vsesr(const struct kvm_vcpu *vcpu)
>> +{
>> +		return vcpu->arch.fault.vsesr_el2;
>> +}
>> +
>> +static inline void kvm_vcpu_set_vsesr(struct kvm_vcpu *vcpu, unsigned long val)
>> +{
>> +		vcpu->arch.fault.vsesr_el2 = val;
>> +}
>> +
>>  static inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
>>  {
>>  	u32 esr = kvm_vcpu_get_hsr(vcpu);
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index e7705e7..84ed239 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -86,6 +86,7 @@ struct kvm_vcpu_fault_info {
>>  	u32 esr_el2;		/* Hyp Syndrom Register */
>>  	u64 far_el2;		/* Hyp Fault Address Register */
>>  	u64 hpfar_el2;		/* Hyp IPA Fault Address Register */
>> +	u32 vsesr_el2;		/* Virtual SError Exception Syndrome Register */
>>  };
>>  
>>  /*
>> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
>> index 32964c7..b6afb7a 100644
>> --- a/arch/arm64/include/asm/sysreg.h
>> +++ b/arch/arm64/include/asm/sysreg.h
>> @@ -125,6 +125,9 @@
>>  #define REG_PSTATE_PAN_IMM		sys_reg(0, 0, 4, 0, 4)
>>  #define REG_PSTATE_UAO_IMM		sys_reg(0, 0, 4, 0, 3)
>>  
>> +#define VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
>> +
>> +
>>  #define SET_PSTATE_PAN(x) __emit_inst(0xd5000000 | REG_PSTATE_PAN_IMM |	\
>>  				      (!!x)<<8 | 0x1f)
>>  #define SET_PSTATE_UAO(x) __emit_inst(0xd5000000 | REG_PSTATE_UAO_IMM |	\
>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>> index c89d83a..3d024a9 100644
>> --- a/arch/arm64/kvm/handle_exit.c
>> +++ b/arch/arm64/kvm/handle_exit.c
>> @@ -180,7 +180,11 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
>>  
>>  static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>  {
>> -	unsigned long fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
>> +	unsigned long hva, fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
>> +	struct kvm_memory_slot *memslot;
>> +	int hsr, ret = 1;
>> +	bool writable;
>> +	gfn_t gfn;
>>  
>>  	if (handle_guest_sei((unsigned long)fault_ipa,
>>  				kvm_vcpu_get_hsr(vcpu))) {
>> @@ -190,9 +194,20 @@ static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>  				(unsigned long)kvm_vcpu_get_hsr(vcpu));
>>  
>>  		kvm_inject_vabt(vcpu);
>> +	} else {
>> +		hsr = kvm_vcpu_get_hsr(vcpu);
>> +
>> +		gfn = fault_ipa >> PAGE_SHIFT;
>> +		memslot = gfn_to_memslot(vcpu->kvm, gfn);
>> +		hva = gfn_to_hva_memslot_prot(memslot, gfn, &writable);
>> +
>> +		run->exit_reason = KVM_EXIT_INTR;
>> +		run->intr.syndrome_info = hsr;
>> +		run->intr.address = hva;
>> +		ret = 0;
>>  	}
>>  
>> -	return 0;
>> +	return ret;
>>  }
>>  
>>  /*
>> @@ -218,8 +233,7 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>>  			*vcpu_pc(vcpu) -= adj;
>>  		}
>>  
>> -		kvm_handle_guest_sei(vcpu, run);
>> -		return 1;
>> +		return kvm_handle_guest_sei(vcpu, run);
>>  	}
>>  
>>  	exception_index = ARM_EXCEPTION_CODE(exception_index);
>> @@ -228,8 +242,7 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>>  	case ARM_EXCEPTION_IRQ:
>>  		return 1;
>>  	case ARM_EXCEPTION_EL1_SERROR:
>> -		kvm_handle_guest_sei(vcpu, run);
>> -		return 1;
>> +		return kvm_handle_guest_sei(vcpu, run);
>>  	case ARM_EXCEPTION_TRAP:
>>  		/*
>>  		 * See ARM ARM B1.14.1: "Hyp traps on instructions
>> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
>> index aede165..ded6211 100644
>> --- a/arch/arm64/kvm/hyp/switch.c
>> +++ b/arch/arm64/kvm/hyp/switch.c
>> @@ -86,6 +86,13 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
>>  		isb();
>>  	}
>>  	write_sysreg(val, hcr_el2);
>> +    /* If virtual System Error or Asynchronous Abort is pending. set
> 
> nit: I think you want a comma after pending, not a dot.
> 
>> +     * the virtual exception syndrome information
>> +     */
> 
> nit: commenting style
> 
>> +	if (cpus_have_cap(ARM64_HAS_RAS_EXTN) &&
>> +			(vcpu->arch.hcr_el2 & HCR_VSE))
>> +		write_sysreg_s(vcpu->arch.fault.vsesr_el2, VSESR_EL2);
>> +
>>  	/* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
>>  	write_sysreg(1 << 15, hstr_el2);
>>  	/*
>> @@ -139,9 +146,15 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu)
>>  	 * the crucial bit is "On taking a vSError interrupt,
>>  	 * HCR_EL2.VSE is cleared to 0."
>>  	 */
>> -	if (vcpu->arch.hcr_el2 & HCR_VSE)
>> +	if (vcpu->arch.hcr_el2 & HCR_VSE) {
>>  		vcpu->arch.hcr_el2 = read_sysreg(hcr_el2);
>>  
>> +		if (cpus_have_cap(ARM64_HAS_RAS_EXTN)) {
>> +			/* set vsesr_el2[24:0] with esr_el2[24:0] */
>> +			kvm_vcpu_set_vsesr(vcpu, read_sysreg_el2(esr)
>> +					& VSESR_ELx_IDS_ISS_MASK);
>> +		}
>> +	}
>>  	__deactivate_traps_arch()();
>>  	write_sysreg(0, hstr_el2);
>>  	write_sysreg(0, pmuserenr_el0);
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 27fe556..bb02909 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -360,6 +360,11 @@ struct kvm_run {
>>  		struct {
>>  			__u8 vector;
>>  		} eoi;
>> +		/* KVM_EXIT_INTR */
>> +		struct {
>> +			__u32 syndrome_info;
>> +			__u64 address;
>> +		} intr;
> 
> Definitely not.  KVM_EXIT_INTR is a generic exit code to tell userspace
> that we exited because we needed to deliver a signal or something else
> related to an asynchronous event.  This implies that the syndrome_info
> etc. always has valid values on all architectures when exiting with
> KVM_EXIT_INTR.
> 
> Either document the behavior as the syndrome_info has side-channel
> information on every exit, or on some KVM_EXIT_INTR exits, as we explain
> in the KVM_CAP_ARM_USER_IRQ ABI that was just added, or dedicate an
> access code.
OK.

> 
>>  		/* KVM_EXIT_HYPERV */
>>  		struct kvm_hyperv_exit hyperv;
>>  		/* Fix the size of the union. */
>> -- 
>> 2.10.1
>>
> 
> I'll look at the details of such patches once the ABI is clear and
> well-documented.
  OK.

> 
> Thanks,
> -Christoffer
> 
> .
> 

^ permalink raw reply

* [PATCH] Remove ARM errata Workarounds 458693 and 460075
From: Robin Murphy @ 2017-05-02 12:27 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170418155725.GG27592@e104818-lin.cambridge.arm.com>

On 18/04/17 16:57, Catalin Marinas wrote:
> On Sun, Apr 16, 2017 at 09:04:46AM +0100, Russell King - ARM Linux wrote:
>> On Sat, Apr 15, 2017 at 07:06:06PM -0500, Nisal Menuka wrote:
>>> According to ARM, these errata exist only in a version of Cortex-A8
>>> (r2p0) which was never built. Therefore, I believe there are no platforms
>>> where this workaround should be enabled.
>>> link: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15634.html
>>
>> These were submitted by ARM Ltd back in 2009 - if the silicon was never
>> built, there would've been no reason to submit them.  Maybe Catalin can
>> shed some light on this, being the commit author who introduced these?
> 
> We normally try not to submit errata workarounds for revisions that are
> not going to be built/deployed. It's possible that at the time there
> were plans for r2p0 to be licensed and built (not just FPGA) but I don't
> really remember the details. The A8 errata document indeed states that
> r1p0 and r2p0 are obsolete but this can mean many things (like not
> available to license).
> 
> I'll try to see if any of the A8 past product managers know anything
> about this. In the meantime, I would leave them in (no run-time
> overhead).

FWIW, I just fired up a RealView PB-A8 board to check, and that reports
r1p1. True, it's not strictly a real silicon implementation (I think
it's one of the structured ASIC test chips), but since it was, as far as
I'm aware, a commercially-available development system, it's not
impossible that someone may still own and use one of these beasts.
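
For anyone wanting to repeat the check: the rNpM figure comes from the
variant and revision fields of the Main ID Register. Below is a minimal
sketch of the decoding, assuming the standard MIDR layout (variant in bits
[23:20], part number in bits [15:4], revision in bits [3:0]). The helper
names are made up for illustration, and the value used in the usage note is
an example for a Cortex-A8 r1p1, not a dump from the board mentioned above.

```c
#include <stdint.h>
#include <stdio.h>

/* Field layout of the ARM Main ID Register (MIDR). */
#define MIDR_REVISION_MASK	0xfu	/* bits [3:0], the "pM" part */
#define MIDR_PARTNUM_SHIFT	4
#define MIDR_PARTNUM_MASK	0xfffu	/* bits [15:4] */
#define MIDR_VARIANT_SHIFT	20
#define MIDR_VARIANT_MASK	0xfu	/* bits [23:20], the "rN" part */

/* Hypothetical helpers, not kernel API. */
static unsigned int midr_variant(uint32_t midr)
{
	return (midr >> MIDR_VARIANT_SHIFT) & MIDR_VARIANT_MASK;
}

static unsigned int midr_revision(uint32_t midr)
{
	return midr & MIDR_REVISION_MASK;
}

static unsigned int midr_partnum(uint32_t midr)
{
	return (midr >> MIDR_PARTNUM_SHIFT) & MIDR_PARTNUM_MASK;
}

/* Format the revision as "rNpM" into buf; returns the snprintf() result. */
static int midr_format_rev(uint32_t midr, char *buf, size_t len)
{
	return snprintf(buf, len, "r%up%u",
			midr_variant(midr), midr_revision(midr));
}
```

With the example value 0x411fc081, midr_partnum() gives 0xc08 (Cortex-A8)
and midr_format_rev() produces "r1p1".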

Robin.

^ permalink raw reply

* [PATCH  v2] iio: stm32 trigger: Add support for TRGO2 triggers
From: Fabrice Gasnier @ 2017-05-02 12:33 UTC (permalink / raw)
  To: linux-arm-kernel

Add support for TRGO2 trigger that can be found on STM32F7.
Add additional master modes supported by TRGO2.
Register additional "tim[1/8]_trgo2" triggers for timer1 & timer8.
Detect TRGO2 timer capability (master mode selection 2).

Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
---
Changes in v2:
- Improve ABI documentation by adding ascii art
---
 .../ABI/testing/sysfs-bus-iio-timer-stm32          |  48 +++++++++
 drivers/iio/trigger/stm32-timer-trigger.c          | 113 ++++++++++++++++++---
 include/linux/iio/timer/stm32-timer-trigger.h      |   2 +
 include/linux/mfd/stm32-timers.h                   |   2 +
 4 files changed, 151 insertions(+), 14 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32 b/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
index 230020e..deb0159 100644
--- a/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
+++ b/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
@@ -16,6 +16,54 @@ Description:
 		- "OC2REF"    : OC2REF signal is used as trigger output.
 		- "OC3REF"    : OC3REF signal is used as trigger output.
 		- "OC4REF"    : OC4REF signal is used as trigger output.
+		Additional modes (on TRGO2 only):
+		- "OC5REF"    : OC5REF signal is used as trigger output.
+		- "OC6REF"    : OC6REF signal is used as trigger output.
+		- "compare_pulse_OC4REF":
+		  OC4REF rising or falling edges generate pulses.
+		- "compare_pulse_OC6REF":
+		  OC6REF rising or falling edges generate pulses.
+		- "compare_pulse_OC4REF_r_or_OC6REF_r":
+		  OC4REF or OC6REF rising edges generate pulses.
+		- "compare_pulse_OC4REF_r_or_OC6REF_f":
+		  OC4REF rising or OC6REF falling edges generate pulses.
+		- "compare_pulse_OC5REF_r_or_OC6REF_r":
+		  OC5REF or OC6REF rising edges generate pulses.
+		- "compare_pulse_OC5REF_r_or_OC6REF_f":
+		  OC5REF rising or OC6REF falling edges generate pulses.
+
+		+-----------+   +-------------+            +---------+
+		| Prescaler +-> | Counter     |        +-> | Master  | TRGO(2)
+		+-----------+   +--+--------+-+        |-> | Control +-->
+		                   |        |          ||  +---------+
+		                +--v--------+-+ OCxREF ||  +---------+
+		                | Chx compare +----------> | Output  | ChX
+		                +-----------+-+         |  | Control +-->
+		                      .     |           |  +---------+
+		                      .     |           |    .
+		                +-----------v-+ OC6REF  |    .
+		                | Ch6 compare +---------+>
+		                +-------------+
+
+		Example with: "compare_pulse_OC4REF_r_or_OC6REF_r":
+
+		                X
+		              X   X
+		            X .   . X
+		          X   .   .   X
+		        X     .   .     X
+		count X .     .   .     . X
+		        .     .   .     .
+		        .     .   .     .
+		        +---------------+
+		OC4REF  |     .   .     |
+		      +-+     .   .     +-+
+		        .     +---+     .
+		OC6REF  .     |   |     .
+		      +-------+   +-------+
+		        +-+   +-+
+		TRGO2   | |   | |
+		      +-+ +---+ +---------+
 
 What:		/sys/bus/iio/devices/triggerX/master_mode
 KernelVersion:	4.11
diff --git a/drivers/iio/trigger/stm32-timer-trigger.c b/drivers/iio/trigger/stm32-timer-trigger.c
index 0f1a2cf..a0031b7 100644
--- a/drivers/iio/trigger/stm32-timer-trigger.c
+++ b/drivers/iio/trigger/stm32-timer-trigger.c
@@ -14,19 +14,19 @@
 #include <linux/module.h>
 #include <linux/platform_device.h>
 
-#define MAX_TRIGGERS 6
+#define MAX_TRIGGERS 7
 #define MAX_VALIDS 5
 
 /* List the triggers created by each timer */
 static const void *triggers_table[][MAX_TRIGGERS] = {
-	{ TIM1_TRGO, TIM1_CH1, TIM1_CH2, TIM1_CH3, TIM1_CH4,},
+	{ TIM1_TRGO, TIM1_TRGO2, TIM1_CH1, TIM1_CH2, TIM1_CH3, TIM1_CH4,},
 	{ TIM2_TRGO, TIM2_CH1, TIM2_CH2, TIM2_CH3, TIM2_CH4,},
 	{ TIM3_TRGO, TIM3_CH1, TIM3_CH2, TIM3_CH3, TIM3_CH4,},
 	{ TIM4_TRGO, TIM4_CH1, TIM4_CH2, TIM4_CH3, TIM4_CH4,},
 	{ TIM5_TRGO, TIM5_CH1, TIM5_CH2, TIM5_CH3, TIM5_CH4,},
 	{ TIM6_TRGO,},
 	{ TIM7_TRGO,},
-	{ TIM8_TRGO, TIM8_CH1, TIM8_CH2, TIM8_CH3, TIM8_CH4,},
+	{ TIM8_TRGO, TIM8_TRGO2, TIM8_CH1, TIM8_CH2, TIM8_CH3, TIM8_CH4,},
 	{ TIM9_TRGO, TIM9_CH1, TIM9_CH2,},
 	{ }, /* timer 10 */
 	{ }, /* timer 11 */
@@ -56,9 +56,16 @@ struct stm32_timer_trigger {
 	u32 max_arr;
 	const void *triggers;
 	const void *valids;
+	bool has_trgo2;
 };
 
+static bool stm32_timer_is_trgo2_name(const char *name)
+{
+	return !!strstr(name, "trgo2");
+}
+
 static int stm32_timer_start(struct stm32_timer_trigger *priv,
+			     struct iio_trigger *trig,
 			     unsigned int frequency)
 {
 	unsigned long long prd, div;
@@ -102,7 +109,12 @@ static int stm32_timer_start(struct stm32_timer_trigger *priv,
 	regmap_update_bits(priv->regmap, TIM_CR1, TIM_CR1_ARPE, TIM_CR1_ARPE);
 
 	/* Force master mode to update mode */
-	regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS, 0x20);
+	if (stm32_timer_is_trgo2_name(trig->name))
+		regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS2,
+				   0x2 << TIM_CR2_MMS2_SHIFT);
+	else
+		regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS,
+				   0x2 << TIM_CR2_MMS_SHIFT);
 
 	/* Make sure that registers are updated */
 	regmap_update_bits(priv->regmap, TIM_EGR, TIM_EGR_UG, TIM_EGR_UG);
@@ -150,7 +162,7 @@ static ssize_t stm32_tt_store_frequency(struct device *dev,
 	if (freq == 0) {
 		stm32_timer_stop(priv);
 	} else {
-		ret = stm32_timer_start(priv, freq);
+		ret = stm32_timer_start(priv, trig, freq);
 		if (ret)
 			return ret;
 	}
@@ -183,6 +195,9 @@ static IIO_DEV_ATTR_SAMP_FREQ(0660,
 			      stm32_tt_read_frequency,
 			      stm32_tt_store_frequency);
 
+#define MASTER_MODE_MAX		7
+#define MASTER_MODE2_MAX	15
+
 static char *master_mode_table[] = {
 	"reset",
 	"enable",
@@ -191,7 +206,16 @@ static IIO_DEV_ATTR_SAMP_FREQ(0660,
 	"OC1REF",
 	"OC2REF",
 	"OC3REF",
-	"OC4REF"
+	"OC4REF",
+	/* Master mode selection 2 only */
+	"OC5REF",
+	"OC6REF",
+	"compare_pulse_OC4REF",
+	"compare_pulse_OC6REF",
+	"compare_pulse_OC4REF_r_or_OC6REF_r",
+	"compare_pulse_OC4REF_r_or_OC6REF_f",
+	"compare_pulse_OC5REF_r_or_OC6REF_r",
+	"compare_pulse_OC5REF_r_or_OC6REF_f",
 };
 
 static ssize_t stm32_tt_show_master_mode(struct device *dev,
@@ -199,10 +223,15 @@ static ssize_t stm32_tt_show_master_mode(struct device *dev,
 					 char *buf)
 {
 	struct stm32_timer_trigger *priv = dev_get_drvdata(dev);
+	struct iio_trigger *trig = to_iio_trigger(dev);
 	u32 cr2;
 
 	regmap_read(priv->regmap, TIM_CR2, &cr2);
-	cr2 = (cr2 & TIM_CR2_MMS) >> TIM_CR2_MMS_SHIFT;
+
+	if (stm32_timer_is_trgo2_name(trig->name))
+		cr2 = (cr2 & TIM_CR2_MMS2) >> TIM_CR2_MMS2_SHIFT;
+	else
+		cr2 = (cr2 & TIM_CR2_MMS) >> TIM_CR2_MMS_SHIFT;
 
 	return snprintf(buf, PAGE_SIZE, "%s\n", master_mode_table[cr2]);
 }
@@ -212,13 +241,25 @@ static ssize_t stm32_tt_store_master_mode(struct device *dev,
 					  const char *buf, size_t len)
 {
 	struct stm32_timer_trigger *priv = dev_get_drvdata(dev);
+	struct iio_trigger *trig = to_iio_trigger(dev);
+	u32 mask, shift, master_mode_max;
 	int i;
 
-	for (i = 0; i < ARRAY_SIZE(master_mode_table); i++) {
+	if (stm32_timer_is_trgo2_name(trig->name)) {
+		mask = TIM_CR2_MMS2;
+		shift = TIM_CR2_MMS2_SHIFT;
+		master_mode_max = MASTER_MODE2_MAX;
+	} else {
+		mask = TIM_CR2_MMS;
+		shift = TIM_CR2_MMS_SHIFT;
+		master_mode_max = MASTER_MODE_MAX;
+	}
+
+	for (i = 0; i <= master_mode_max; i++) {
 		if (!strncmp(master_mode_table[i], buf,
 			     strlen(master_mode_table[i]))) {
-			regmap_update_bits(priv->regmap, TIM_CR2,
-					   TIM_CR2_MMS, i << TIM_CR2_MMS_SHIFT);
+			regmap_update_bits(priv->regmap, TIM_CR2, mask,
+					   i << shift);
 			/* Make sure that registers are updated */
 			regmap_update_bits(priv->regmap, TIM_EGR,
 					   TIM_EGR_UG, TIM_EGR_UG);
@@ -229,8 +270,31 @@ static ssize_t stm32_tt_store_master_mode(struct device *dev,
 	return -EINVAL;
 }
 
-static IIO_CONST_ATTR(master_mode_available,
-	"reset enable update compare_pulse OC1REF OC2REF OC3REF OC4REF");
+static ssize_t stm32_tt_show_master_mode_avail(struct device *dev,
+					       struct device_attribute *attr,
+					       char *buf)
+{
+	struct iio_trigger *trig = to_iio_trigger(dev);
+	unsigned int i, master_mode_max;
+	size_t len = 0;
+
+	if (stm32_timer_is_trgo2_name(trig->name))
+		master_mode_max = MASTER_MODE2_MAX;
+	else
+		master_mode_max = MASTER_MODE_MAX;
+
+	for (i = 0; i <= master_mode_max; i++)
+		len += scnprintf(buf + len, PAGE_SIZE - len,
+			"%s ", master_mode_table[i]);
+
+	/* replace trailing space by newline */
+	buf[len - 1] = '\n';
+
+	return len;
+}
+
+static IIO_DEVICE_ATTR(master_mode_available, 0444,
+		       stm32_tt_show_master_mode_avail, NULL, 0);
 
 static IIO_DEVICE_ATTR(master_mode, 0660,
 		       stm32_tt_show_master_mode,
@@ -240,7 +304,7 @@ static IIO_DEVICE_ATTR(master_mode, 0660,
 static struct attribute *stm32_trigger_attrs[] = {
 	&iio_dev_attr_sampling_frequency.dev_attr.attr,
 	&iio_dev_attr_master_mode.dev_attr.attr,
-	&iio_const_attr_master_mode_available.dev_attr.attr,
+	&iio_dev_attr_master_mode_available.dev_attr.attr,
 	NULL,
 };
 
@@ -264,6 +328,12 @@ static int stm32_setup_iio_triggers(struct stm32_timer_trigger *priv)
 
 	while (cur && *cur) {
 		struct iio_trigger *trig;
+		bool cur_is_trgo2 = stm32_timer_is_trgo2_name(*cur);
+
+		if (cur_is_trgo2 && !priv->has_trgo2) {
+			cur++;
+			continue;
+		}
 
 		trig = devm_iio_trigger_alloc(priv->dev, "%s", *cur);
 		if  (!trig)
@@ -277,7 +347,7 @@ static int stm32_setup_iio_triggers(struct stm32_timer_trigger *priv)
 		 * should only be available on trgo trigger which
 		 * is always the first in the list.
 		 */
-		if (cur == priv->triggers)
+		if (cur == priv->triggers || cur_is_trgo2)
 			trig->dev.groups = stm32_trigger_attr_groups;
 
 		iio_trigger_set_drvdata(trig, priv);
@@ -584,6 +654,20 @@ bool is_stm32_timer_trigger(struct iio_trigger *trig)
 }
 EXPORT_SYMBOL(is_stm32_timer_trigger);
 
+static void stm32_timer_detect_trgo2(struct stm32_timer_trigger *priv)
+{
+	u32 val;
+
+	/*
+	 * Master mode selection 2 bits can only be written and read back when
+	 * timer supports it.
+	 */
+	regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS2, TIM_CR2_MMS2);
+	regmap_read(priv->regmap, TIM_CR2, &val);
+	regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS2, 0);
+	priv->has_trgo2 = !!val;
+}
+
 static int stm32_timer_trigger_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
@@ -614,6 +698,7 @@ static int stm32_timer_trigger_probe(struct platform_device *pdev)
 	priv->max_arr = ddata->max_arr;
 	priv->triggers = triggers_table[index];
 	priv->valids = valids_table[index];
+	stm32_timer_detect_trgo2(priv);
 
 	ret = stm32_setup_iio_triggers(priv);
 	if (ret)
diff --git a/include/linux/iio/timer/stm32-timer-trigger.h b/include/linux/iio/timer/stm32-timer-trigger.h
index 55535ae..fa7d786 100644
--- a/include/linux/iio/timer/stm32-timer-trigger.h
+++ b/include/linux/iio/timer/stm32-timer-trigger.h
@@ -10,6 +10,7 @@
 #define _STM32_TIMER_TRIGGER_H_
 
 #define TIM1_TRGO	"tim1_trgo"
+#define TIM1_TRGO2	"tim1_trgo2"
 #define TIM1_CH1	"tim1_ch1"
 #define TIM1_CH2	"tim1_ch2"
 #define TIM1_CH3	"tim1_ch3"
@@ -44,6 +45,7 @@
 #define TIM7_TRGO	"tim7_trgo"
 
 #define TIM8_TRGO	"tim8_trgo"
+#define TIM8_TRGO2	"tim8_trgo2"
 #define TIM8_CH1	"tim8_ch1"
 #define TIM8_CH2	"tim8_ch2"
 #define TIM8_CH3	"tim8_ch3"
diff --git a/include/linux/mfd/stm32-timers.h b/include/linux/mfd/stm32-timers.h
index 4a0abbc..ce7346e 100644
--- a/include/linux/mfd/stm32-timers.h
+++ b/include/linux/mfd/stm32-timers.h
@@ -34,6 +34,7 @@
 #define TIM_CR1_DIR	BIT(4)  /* Counter Direction	   */
 #define TIM_CR1_ARPE	BIT(7)	/* Auto-reload Preload Ena */
 #define TIM_CR2_MMS	(BIT(4) | BIT(5) | BIT(6)) /* Master mode selection */
+#define TIM_CR2_MMS2	GENMASK(23, 20) /* Master mode selection 2 */
 #define TIM_SMCR_SMS	(BIT(0) | BIT(1) | BIT(2)) /* Slave mode selection */
 #define TIM_SMCR_TS	(BIT(4) | BIT(5) | BIT(6)) /* Trigger selection */
 #define TIM_DIER_UIE	BIT(0)	/* Update interrupt	   */
@@ -60,6 +61,7 @@
 
 #define MAX_TIM_PSC		0xFFFF
 #define TIM_CR2_MMS_SHIFT	4
+#define TIM_CR2_MMS2_SHIFT	20
 #define TIM_SMCR_TS_SHIFT	4
 #define TIM_BDTR_BKF_MASK	0xF
 #define TIM_BDTR_BKF_SHIFT	16
-- 
1.9.1
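
The MMS2 handling in the patch above can be modelled outside the kernel. The
sketch below is only an illustration: the mask and shift values are the ones
from the patch (MMS in CR2[6:4], MMS2 in CR2[23:20]), but the helper names
are hypothetical and plain integers stand in for regmap accesses. Note that
the readback check here masks with TIM_CR2_MMS2, whereas the patch tests the
whole register value; that masking is an assumption of this sketch, not what
the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch: MMS is CR2[6:4], MMS2 is CR2[23:20]. */
#define TIM_CR2_MMS		(0x7u << 4)
#define TIM_CR2_MMS_SHIFT	4
#define TIM_CR2_MMS2		(0xfu << 20)
#define TIM_CR2_MMS2_SHIFT	20

/* Hypothetical helper: replace the MMS2 field of a CR2 value with 'mode'. */
static uint32_t cr2_set_mms2(uint32_t cr2, unsigned int mode)
{
	return (cr2 & ~TIM_CR2_MMS2) |
	       ((mode << TIM_CR2_MMS2_SHIFT) & TIM_CR2_MMS2);
}

/* Hypothetical helper: extract the MMS2 field (0..15) from a CR2 value. */
static unsigned int cr2_get_mms2(uint32_t cr2)
{
	return (cr2 & TIM_CR2_MMS2) >> TIM_CR2_MMS2_SHIFT;
}

/*
 * Hypothetical model of the probe-time detection: 'readback' is the CR2
 * value read after writing all-ones to MMS2. Only the MMS2 bits are
 * considered, so unrelated CR2 bits cannot cause a false positive.
 */
static bool cr2_readback_has_trgo2(uint32_t readback)
{
	return (readback & TIM_CR2_MMS2) != 0;
}
```

For example, cr2_set_mms2(0, 2) yields 0x00200000, the "update" master mode
placed in the MMS2 field, and a readback of 0x00000020 (only MMS bits set)
is reported as not supporting TRGO2.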

^ permalink raw reply related

* [PATCH 1/1] arm64: Always provide "model name" in /proc/cpuinfo
From: Mark Rutland @ 2017-05-02 12:37 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170502110827.GA29224@e104818-lin.cambridge.arm.com>

On Tue, May 02, 2017 at 12:08:27PM +0100, Catalin Marinas wrote:
> On Tue, May 02, 2017 at 12:39:13AM +0200, Heinrich Schuchardt wrote:
> > There is no need to hide the model name in processes
> > that are not PER_LINUX32.
> > 
> > So let us always provide a model name that is easily readable.
> > 
> > Fixes: e47b020a323d ("arm64: Provide "model name" in /proc/cpuinfo for PER_LINUX32 tasks")
> > Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
> > ---
> >  arch/arm64/kernel/cpuinfo.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
> > index b3d5b3e8fbcb..9ad9ddcd2f19 100644
> > --- a/arch/arm64/kernel/cpuinfo.c
> > +++ b/arch/arm64/kernel/cpuinfo.c
> > @@ -118,9 +118,8 @@ static int c_show(struct seq_file *m, void *v)
> >  		 * "processor".  Give glibc what it expects.
> >  		 */
> >  		seq_printf(m, "processor\t: %d\n", i);
> > -		if (compat)
> > -			seq_printf(m, "model name\t: ARMv8 Processor rev %d (%s)\n",
> > -				   MIDR_REVISION(midr), COMPAT_ELF_PLATFORM);
> > +		seq_printf(m, "model name\t: ARMv8 Processor rev %d (%s)\n",
> > +			   MIDR_REVISION(midr), COMPAT_ELF_PLATFORM);
> >  
> >  		seq_printf(m, "BogoMIPS\t: %lu.%02lu\n",
> >  			   loops_per_jiffy / (500000UL/HZ),
> 
> Such a patch seems to come up regularly:
> 
> https://patchwork.kernel.org/patch/9303311/
> 
> (and it usually gets rejected)

Indeed; my comments from that previous discussion apply here.

In addition, the commit message above refers to this as fixing another
commit, but does not explain why the current behaviour would be
considered a bug.

I do not think it makes sense to take this patch.

Thanks,
Mark.

^ permalink raw reply

* [PATCH] iio: stm32 trigger: Add support for TRGO2 triggers
From: Fabrice Gasnier @ 2017-05-02 12:38 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <72817113-a4c7-c1f7-0ae2-813df78f3acc@kernel.org>

On 04/30/2017 07:07 PM, Jonathan Cameron wrote:
> On 28/04/17 15:52, Fabrice Gasnier wrote:
>> On 04/27/2017 07:49 AM, Jonathan Cameron wrote:
>>> On 26/04/17 09:55, Benjamin Gaignard wrote:
>>>> 2017-04-26 10:17 GMT+02:00 Fabrice Gasnier <fabrice.gasnier@st.com>:
>>>>> Add support for TRGO2 trigger that can be found on STM32F7.
>>>>> Add additional master modes supported by TRGO2.
>>> These additional modes would benefit from more information in the
>>> ABI docs.  Otherwise patch seems fine, though this may win
>>> the award for hardest hardware to come up with a generic
>>> interface for... 
>>>>> Register additional "tim[1/8]_trgo2" triggers for timer1 & timer8.
>>>>> Detect TRGO2 timer capability (master mode selection 2).
>>>>>
>>>>> Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
>>>>> ---
>>>>>  .../ABI/testing/sysfs-bus-iio-timer-stm32          |  15 +++
>>>>>  drivers/iio/trigger/stm32-timer-trigger.c          | 113 ++++++++++++++++++---
>>>>>  include/linux/iio/timer/stm32-timer-trigger.h      |   2 +
>>>>>  include/linux/mfd/stm32-timers.h                   |   2 +
>>>>>  4 files changed, 118 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32 b/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
>>>>> index 230020e..47647b4 100644
>>>>> --- a/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
>>>>> +++ b/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
>>>>> @@ -16,6 +16,21 @@ Description:
>>>>>                 - "OC2REF"    : OC2REF signal is used as trigger output.
>>>>>                 - "OC3REF"    : OC3REF signal is used as trigger output.
>>>>>                 - "OC4REF"    : OC4REF signal is used as trigger output.
>>>>> +               Additional modes (on TRGO2 only):
>>>>> +               - "OC5REF"    : OC5REF signal is used as trigger output.
>>>>> +               - "OC6REF"    : OC6REF signal is used as trigger output.
>>>>> +               - "compare_pulse_OC4REF":
>>>>> +                 OC4REF rising or falling edges generate pulses.
>>> I'd like this to be fairly understandable without resorting to reading the
>>> datasheet.  As I understand it you get a fixed term pulse on both edges
>>> of the waveform?  Perhaps this calls for some ascii art :)
>>
>> Hi Jonathan,
>>
>> If you feel like it needs more documentation, I'd rather add a
>> reference or link to the datasheet... That will be more accurate and
>> up-to-date (e.g. the RM0385 pdf). Does this sound ok? Or...
> Datasheet is good, but give it 10 years and chances are it will disappear
> into a black hole, whereas the hardware might still be in use by someone.
> Some of the hardware I use is at least that old. Frankly this laptop is
> getting close ;)
>>
>> Just in case, I prepared some ascii art, hope it clarifies things.
>> I'm wondering if this is the best place to put it?
>> Shouldn't this be added in source code, instead of ABI Doc ?
> Could be either, but arguably the ABI docs should be all that a
> userspace developer should need to see.  This isn't an internal
> detail after all.
>> Maybe I can skip 1st part of it, heading boxes? (only example is enough?
>> or not...)
>>
>> +-----------+   +-------------+            +---------+
>> | Prescaler +-> | Counter     |        +-> | Master  | TRGO(2)
>> +-----------+   +--+--------+-+        |-> | Control +-->
>>                    |        |          ||  +---------+
>>                 +--v--------+-+ OCxREF ||  +---------+
>>                 | Chx compare +----------> | Output  | ChX
>>                 +-----------+-+         |  | Control +-->
>>                       .     |           |  +---------+
>>                       .     |           |    .
>>                 +-----------v-+ OC6REF  |    .
>>                 | Ch6 compare +---------+>
>>                 +-------------+
>>
>> Example with: "compare_pulse_OC4REF_r_or_OC6REF_r":
>>
>>                 X
>>               X   X
>>             X .   . X
>>           X   .   .   X
>>         X     .   .     X
>> count X .     .   .     . X
>>         .     .   .     .
>>         .     .   .     .
>>         +---------------+
>> OC4REF  |     .   .     |
>>       +-+     .   .     +-+
>>         .     +---+     .
>> OC6REF  .     |   |     .
>>       +-------+   +-------+
>>         +-+   +-+
>> TRGO2   | |   | |
>>       +-+ +---+ +---------+
> This is good stuff so I'd put it in the ABI docs.
> 
Hi Jonathan,

Thanks, I just sent a v2 that includes this.
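
As a cross-check of the waveform drawn above, the
"compare_pulse_OC4REF_r_or_OC6REF_r" mode can be modelled in a few lines of
C. This is only a rough sketch: the function name is made up, signals are
sampled as 0/1 arrays, both signals are assumed low before the first sample,
and a pulse is modelled as a single sample wide, which says nothing about
the real pulse width in hardware.

```c
#include <stddef.h>

/*
 * Illustrative model: out[i] is 1 at every sample where oc4ref or oc6ref
 * goes from 0 to 1 relative to the previous sample, and 0 otherwise.
 * Returns the number of pulses generated.
 */
static size_t trgo2_rising_pulses(const int *oc4ref, const int *oc6ref,
				  int *out, size_t n)
{
	size_t i, pulses = 0;

	for (i = 0; i < n; i++) {
		int prev4 = i ? oc4ref[i - 1] : 0;
		int prev6 = i ? oc6ref[i - 1] : 0;
		int rise = (!prev4 && oc4ref[i]) || (!prev6 && oc6ref[i]);

		out[i] = rise;
		pulses += rise;
	}

	return pulses;
}
```

Feeding it an OC4REF that rises early and falls late, with an OC6REF that is
high only in the middle, gives two pulses: one on each rising edge and none
on the falling edges, as in the diagram.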

Best Regards,
Fabrice

> Jonathan
>>
>>
>> side note: this isn't my house ;-)
> :)
>>
>> Please advise,
>> Thanks,
>> Fabrice
>>
>>
>>>>> +               - "compare_pulse_OC6REF":
>>>>> +                 OC6REF rising or falling edges generate pulses.
>>>>> +               - "compare_pulse_OC4REF_r_or_OC6REF_r":
>>>>> +                 OC4REF or OC6REF rising edges generate pulses.
>>>>> +               - "compare_pulse_OC4REF_r_or_OC6REF_f":
>>>>> +                 OC4REF rising or OC6REF falling edges generate pulses.
>>>>> +               - "compare_pulse_OC5REF_r_or_OC6REF_r":
>>>>> +                 OC5REF or OC6REF rising edges generate pulses.
>>>>> +               - "compare_pulse_OC5REF_r_or_OC6REF_f":
>>>>> +                 OC5REF rising or OC6REF falling edges generate pulses.
>>>>>
>>>>>  What:          /sys/bus/iio/devices/triggerX/master_mode
>>>>>  KernelVersion: 4.11
>>>>> diff --git a/drivers/iio/trigger/stm32-timer-trigger.c b/drivers/iio/trigger/stm32-timer-trigger.c
>>>>> index 0f1a2cf..a0031b7 100644
>>>>> --- a/drivers/iio/trigger/stm32-timer-trigger.c
>>>>> +++ b/drivers/iio/trigger/stm32-timer-trigger.c
>>>>> @@ -14,19 +14,19 @@
>>>>>  #include <linux/module.h>
>>>>>  #include <linux/platform_device.h>
>>>>>
>>>>> -#define MAX_TRIGGERS 6
>>>>> +#define MAX_TRIGGERS 7
>>>>>  #define MAX_VALIDS 5
>>>>>
>>>>>  /* List the triggers created by each timer */
>>>>>  static const void *triggers_table[][MAX_TRIGGERS] = {
>>>>> -       { TIM1_TRGO, TIM1_CH1, TIM1_CH2, TIM1_CH3, TIM1_CH4,},
>>>>> +       { TIM1_TRGO, TIM1_TRGO2, TIM1_CH1, TIM1_CH2, TIM1_CH3, TIM1_CH4,},
>>>>>         { TIM2_TRGO, TIM2_CH1, TIM2_CH2, TIM2_CH3, TIM2_CH4,},
>>>>>         { TIM3_TRGO, TIM3_CH1, TIM3_CH2, TIM3_CH3, TIM3_CH4,},
>>>>>         { TIM4_TRGO, TIM4_CH1, TIM4_CH2, TIM4_CH3, TIM4_CH4,},
>>>>>         { TIM5_TRGO, TIM5_CH1, TIM5_CH2, TIM5_CH3, TIM5_CH4,},
>>>>>         { TIM6_TRGO,},
>>>>>         { TIM7_TRGO,},
>>>>> -       { TIM8_TRGO, TIM8_CH1, TIM8_CH2, TIM8_CH3, TIM8_CH4,},
>>>>> +       { TIM8_TRGO, TIM8_TRGO2, TIM8_CH1, TIM8_CH2, TIM8_CH3, TIM8_CH4,},
>>>>>         { TIM9_TRGO, TIM9_CH1, TIM9_CH2,},
>>>>>         { }, /* timer 10 */
>>>>>         { }, /* timer 11 */
>>>>> @@ -56,9 +56,16 @@ struct stm32_timer_trigger {
>>>>>         u32 max_arr;
>>>>>         const void *triggers;
>>>>>         const void *valids;
>>>>> +       bool has_trgo2;
>>>>>  };
>>>>>
>>>>> +static bool stm32_timer_is_trgo2_name(const char *name)
>>>>> +{
>>>>> +       return !!strstr(name, "trgo2");
>>>>> +}
>>>>> +
>>>>>  static int stm32_timer_start(struct stm32_timer_trigger *priv,
>>>>> +                            struct iio_trigger *trig,
>>>>>                              unsigned int frequency)
>>>>>  {
>>>>>         unsigned long long prd, div;
>>>>> @@ -102,7 +109,12 @@ static int stm32_timer_start(struct stm32_timer_trigger *priv,
>>>>>         regmap_update_bits(priv->regmap, TIM_CR1, TIM_CR1_ARPE, TIM_CR1_ARPE);
>>>>>
>>>>>         /* Force master mode to update mode */
>>>>> -       regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS, 0x20);
>>>>> +       if (stm32_timer_is_trgo2_name(trig->name))
>>>>> +               regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS2,
>>>>> +                                  0x2 << TIM_CR2_MMS2_SHIFT);
>>>>> +       else
>>>>> +               regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS,
>>>>> +                                  0x2 << TIM_CR2_MMS_SHIFT);
>>>>>
>>>>>         /* Make sure that registers are updated */
>>>>>         regmap_update_bits(priv->regmap, TIM_EGR, TIM_EGR_UG, TIM_EGR_UG);
>>>>> @@ -150,7 +162,7 @@ static ssize_t stm32_tt_store_frequency(struct device *dev,
>>>>>         if (freq == 0) {
>>>>>                 stm32_timer_stop(priv);
>>>>>         } else {
>>>>> -               ret = stm32_timer_start(priv, freq);
>>>>> +               ret = stm32_timer_start(priv, trig, freq);
>>>>>                 if (ret)
>>>>>                         return ret;
>>>>>         }
>>>>> @@ -183,6 +195,9 @@ static IIO_DEV_ATTR_SAMP_FREQ(0660,
>>>>>                               stm32_tt_read_frequency,
>>>>>                               stm32_tt_store_frequency);
>>>>>
>>>>> +#define MASTER_MODE_MAX                7
>>>>> +#define MASTER_MODE2_MAX       15
>>>>> +
>>>>>  static char *master_mode_table[] = {
>>>>>         "reset",
>>>>>         "enable",
>>>>> @@ -191,7 +206,16 @@ static IIO_DEV_ATTR_SAMP_FREQ(0660,
>>>>>         "OC1REF",
>>>>>         "OC2REF",
>>>>>         "OC3REF",
>>>>> -       "OC4REF"
>>>>> +       "OC4REF",
>>>>> +       /* Master mode selection 2 only */
>>>>> +       "OC5REF",
>>>>> +       "OC6REF",
>>>>> +       "compare_pulse_OC4REF",
>>>>> +       "compare_pulse_OC6REF",
>>>>> +       "compare_pulse_OC4REF_r_or_OC6REF_r",
>>>>> +       "compare_pulse_OC4REF_r_or_OC6REF_f",
>>>>> +       "compare_pulse_OC5REF_r_or_OC6REF_r",
>>>>> +       "compare_pulse_OC5REF_r_or_OC6REF_f",
>>>>>  };
>>>>>
>>>>>  static ssize_t stm32_tt_show_master_mode(struct device *dev,
>>>>> @@ -199,10 +223,15 @@ static ssize_t stm32_tt_show_master_mode(struct device *dev,
>>>>>                                          char *buf)
>>>>>  {
>>>>>         struct stm32_timer_trigger *priv = dev_get_drvdata(dev);
>>>>> +       struct iio_trigger *trig = to_iio_trigger(dev);
>>>>>         u32 cr2;
>>>>>
>>>>>         regmap_read(priv->regmap, TIM_CR2, &cr2);
>>>>> -       cr2 = (cr2 & TIM_CR2_MMS) >> TIM_CR2_MMS_SHIFT;
>>>>> +
>>>>> +       if (stm32_timer_is_trgo2_name(trig->name))
>>>>> +               cr2 = (cr2 & TIM_CR2_MMS2) >> TIM_CR2_MMS2_SHIFT;
>>>>> +       else
>>>>> +               cr2 = (cr2 & TIM_CR2_MMS) >> TIM_CR2_MMS_SHIFT;
>>>>>
>>>>>         return snprintf(buf, PAGE_SIZE, "%s\n", master_mode_table[cr2]);
>>>>>  }
>>>>> @@ -212,13 +241,25 @@ static ssize_t stm32_tt_store_master_mode(struct device *dev,
>>>>>                                           const char *buf, size_t len)
>>>>>  {
>>>>>         struct stm32_timer_trigger *priv = dev_get_drvdata(dev);
>>>>> +       struct iio_trigger *trig = to_iio_trigger(dev);
>>>>> +       u32 mask, shift, master_mode_max;
>>>>>         int i;
>>>>>
>>>>> -       for (i = 0; i < ARRAY_SIZE(master_mode_table); i++) {
>>>>> +       if (stm32_timer_is_trgo2_name(trig->name)) {
>>>>> +               mask = TIM_CR2_MMS2;
>>>>> +               shift = TIM_CR2_MMS2_SHIFT;
>>>>> +               master_mode_max = MASTER_MODE2_MAX;
>>>>> +       } else {
>>>>> +               mask = TIM_CR2_MMS;
>>>>> +               shift = TIM_CR2_MMS_SHIFT;
>>>>> +               master_mode_max = MASTER_MODE_MAX;
>>>>> +       }
>>>>> +
>>>>> +       for (i = 0; i <= master_mode_max; i++) {
>>>>>                 if (!strncmp(master_mode_table[i], buf,
>>>>>                              strlen(master_mode_table[i]))) {
>>>>> -                       regmap_update_bits(priv->regmap, TIM_CR2,
>>>>> -                                          TIM_CR2_MMS, i << TIM_CR2_MMS_SHIFT);
>>>>> +                       regmap_update_bits(priv->regmap, TIM_CR2, mask,
>>>>> +                                          i << shift);
>>>>>                         /* Make sure that registers are updated */
>>>>>                         regmap_update_bits(priv->regmap, TIM_EGR,
>>>>>                                            TIM_EGR_UG, TIM_EGR_UG);
>>>>> @@ -229,8 +270,31 @@ static ssize_t stm32_tt_store_master_mode(struct device *dev,
>>>>>         return -EINVAL;
>>>>>  }
>>>>>
>>>>> -static IIO_CONST_ATTR(master_mode_available,
>>>>> -       "reset enable update compare_pulse OC1REF OC2REF OC3REF OC4REF");
>>>>> +static ssize_t stm32_tt_show_master_mode_avail(struct device *dev,
>>>>> +                                              struct device_attribute *attr,
>>>>> +                                              char *buf)
>>>>> +{
>>>>> +       struct iio_trigger *trig = to_iio_trigger(dev);
>>>>> +       unsigned int i, master_mode_max;
>>>>> +       size_t len = 0;
>>>>> +
>>>>> +       if (stm32_timer_is_trgo2_name(trig->name))
>>>>> +               master_mode_max = MASTER_MODE2_MAX;
>>>>> +       else
>>>>> +               master_mode_max = MASTER_MODE_MAX;
>>>>> +
>>>>> +       for (i = 0; i <= master_mode_max; i++)
>>>>> +               len += scnprintf(buf + len, PAGE_SIZE - len,
>>>>> +                       "%s ", master_mode_table[i]);
>>>>> +
>>>>> +       /* replace trailing space by newline */
>>>>> +       buf[len - 1] = '\n';
>>>>> +
>>>>> +       return len;
>>>>> +}
>>>>> +
>>>>> +static IIO_DEVICE_ATTR(master_mode_available, 0444,
>>>>> +                      stm32_tt_show_master_mode_avail, NULL, 0);
>>>>>
>>>>>  static IIO_DEVICE_ATTR(master_mode, 0660,
>>>>>                        stm32_tt_show_master_mode,
>>>>> @@ -240,7 +304,7 @@ static IIO_DEVICE_ATTR(master_mode, 0660,
>>>>>  static struct attribute *stm32_trigger_attrs[] = {
>>>>>         &iio_dev_attr_sampling_frequency.dev_attr.attr,
>>>>>         &iio_dev_attr_master_mode.dev_attr.attr,
>>>>> -       &iio_const_attr_master_mode_available.dev_attr.attr,
>>>>> +       &iio_dev_attr_master_mode_available.dev_attr.attr,
>>>>>         NULL,
>>>>>  };
>>>>>
>>>>> @@ -264,6 +328,12 @@ static int stm32_setup_iio_triggers(struct stm32_timer_trigger *priv)
>>>>>
>>>>>         while (cur && *cur) {
>>>>>                 struct iio_trigger *trig;
>>>>> +               bool cur_is_trgo2 = stm32_timer_is_trgo2_name(*cur);
>>>>> +
>>>>> +               if (cur_is_trgo2 && !priv->has_trgo2) {
>>>>> +                       cur++;
>>>>> +                       continue;
>>>>> +               }
>>>>>
>>>>>                 trig = devm_iio_trigger_alloc(priv->dev, "%s", *cur);
>>>>>                 if  (!trig)
>>>>> @@ -277,7 +347,7 @@ static int stm32_setup_iio_triggers(struct stm32_timer_trigger *priv)
>>>>>                  * should only be available on trgo trigger which
>>>>>                  * is always the first in the list.
>>>>>                  */
>>>>> -               if (cur == priv->triggers)
>>>>> +               if (cur == priv->triggers || cur_is_trgo2)
>>>>>                         trig->dev.groups = stm32_trigger_attr_groups;
>>>>>
>>>>>                 iio_trigger_set_drvdata(trig, priv);
>>>>> @@ -584,6 +654,20 @@ bool is_stm32_timer_trigger(struct iio_trigger *trig)
>>>>>  }
>>>>>  EXPORT_SYMBOL(is_stm32_timer_trigger);
>>>>>
>>>>> +static void stm32_timer_detect_trgo2(struct stm32_timer_trigger *priv)
>>>>> +{
>>>>> +       u32 val;
>>>>> +
>>>>> +       /*
>>>>> +        * Master mode selection 2 bits can only be written and read back when
>>>>> +        * timer supports it.
>>>>> +        */
>>>>> +       regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS2, TIM_CR2_MMS2);
>>>>> +       regmap_read(priv->regmap, TIM_CR2, &val);
>>>>> +       regmap_update_bits(priv->regmap, TIM_CR2, TIM_CR2_MMS2, 0);
>>>>> +       priv->has_trgo2 = !!val;
>>>>> +}
>>>>> +
>>>>>  static int stm32_timer_trigger_probe(struct platform_device *pdev)
>>>>>  {
>>>>>         struct device *dev = &pdev->dev;
>>>>> @@ -614,6 +698,7 @@ static int stm32_timer_trigger_probe(struct platform_device *pdev)
>>>>>         priv->max_arr = ddata->max_arr;
>>>>>         priv->triggers = triggers_table[index];
>>>>>         priv->valids = valids_table[index];
>>>>> +       stm32_timer_detect_trgo2(priv);
>>>>>
>>>>>         ret = stm32_setup_iio_triggers(priv);
>>>>>         if (ret)
>>>>> diff --git a/include/linux/iio/timer/stm32-timer-trigger.h b/include/linux/iio/timer/stm32-timer-trigger.h
>>>>> index 55535ae..fa7d786 100644
>>>>> --- a/include/linux/iio/timer/stm32-timer-trigger.h
>>>>> +++ b/include/linux/iio/timer/stm32-timer-trigger.h
>>>>> @@ -10,6 +10,7 @@
>>>>>  #define _STM32_TIMER_TRIGGER_H_
>>>>>
>>>>>  #define TIM1_TRGO      "tim1_trgo"
>>>>> +#define TIM1_TRGO2     "tim1_trgo2"
>>>>>  #define TIM1_CH1       "tim1_ch1"
>>>>>  #define TIM1_CH2       "tim1_ch2"
>>>>>  #define TIM1_CH3       "tim1_ch3"
>>>>> @@ -44,6 +45,7 @@
>>>>>  #define TIM7_TRGO      "tim7_trgo"
>>>>>
>>>>>  #define TIM8_TRGO      "tim8_trgo"
>>>>> +#define TIM8_TRGO2     "tim8_trgo2"
>>>>>  #define TIM8_CH1       "tim8_ch1"
>>>>>  #define TIM8_CH2       "tim8_ch2"
>>>>>  #define TIM8_CH3       "tim8_ch3"
>>>>> diff --git a/include/linux/mfd/stm32-timers.h b/include/linux/mfd/stm32-timers.h
>>>>> index 4a0abbc..ce7346e 100644
>>>>> --- a/include/linux/mfd/stm32-timers.h
>>>>> +++ b/include/linux/mfd/stm32-timers.h
>>>>> @@ -34,6 +34,7 @@
>>>>>  #define TIM_CR1_DIR    BIT(4)  /* Counter Direction       */
>>>>>  #define TIM_CR1_ARPE   BIT(7)  /* Auto-reload Preload Ena */
>>>>>  #define TIM_CR2_MMS    (BIT(4) | BIT(5) | BIT(6)) /* Master mode selection */
>>>>> +#define TIM_CR2_MMS2   GENMASK(23, 20) /* Master mode selection 2 */
>>>>>  #define TIM_SMCR_SMS   (BIT(0) | BIT(1) | BIT(2)) /* Slave mode selection */
>>>>>  #define TIM_SMCR_TS    (BIT(4) | BIT(5) | BIT(6)) /* Trigger selection */
>>>>>  #define TIM_DIER_UIE   BIT(0)  /* Update interrupt        */
>>>>> @@ -60,6 +61,7 @@
>>>>>
>>>>>  #define MAX_TIM_PSC            0xFFFF
>>>>>  #define TIM_CR2_MMS_SHIFT      4
>>>>> +#define TIM_CR2_MMS2_SHIFT     20
>>>>>  #define TIM_SMCR_TS_SHIFT      4
>>>>>  #define TIM_BDTR_BKF_MASK      0xF
>>>>>  #define TIM_BDTR_BKF_SHIFT     16
>>>>> --
>>>>> 1.9.1
>>>>>
>>>>
>>>> Acked-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-iio" in
>>>> the body of a message to majordomo at vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>
> 

^ permalink raw reply

* [PATCH 0/9] net: thunderx: Adds XDP support
From: sunil.kovvuri at gmail.com @ 2017-05-02 13:06 UTC (permalink / raw)
  To: linux-arm-kernel

From: Sunil Goutham <sgoutham@cavium.com>

This patch series adds XDP support to the ThunderX NIC driver,
which is used on CN88xx, CN81xx and CN83xx platforms.

Patches 1-4 are performance improvements and cleanups made with
XDP performance bottlenecks in view. The remaining patches add
the actual XDP support.

Sunil Goutham (9):
  net: thunderx: Support for page recycling
  net: thunderx: Optimize RBDR descriptor handling
  net: thunderx: Optimize CQE_TX handling
  net: thunderx: Cleanup receive buffer allocation
  net: thunderx: Add basic XDP support
  net: thunderx: Add support for XDP_DROP
  net: thunderx: Add support for XDP_TX
  net: thunderx: Support for XDP header adjustment
  net: thunderx: Optimize page recycling for XDP

 drivers/net/ethernet/cavium/thunder/nic.h          |  10 +-
 .../net/ethernet/cavium/thunder/nicvf_ethtool.c    |  29 +-
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   | 313 +++++++++++++++--
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 387 +++++++++++++++++----
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |  32 +-
 drivers/net/ethernet/cavium/thunder/q_struct.h     |  10 +-
 6 files changed, 657 insertions(+), 124 deletions(-)

-- 
2.7.4

^ permalink raw reply

* [PATCH 1/9] net: thunderx: Support for page recycling
From: sunil.kovvuri at gmail.com @ 2017-05-02 13:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493730418-24606-1-git-send-email-sunil.kovvuri@gmail.com>

From: Sunil Goutham <sgoutham@cavium.com>

Adds support for page recycling when allocating receive buffers,
to reduce the cost of refilling the RBDR ring. Also drops the use of
compound pages when the page size is 4K; only order-0 pages are used now.

Only the page is recycled; a DMA mapping still needs to be done for
every receive buffer allocated, due to the following constraints:
- Cannot have just one receive buffer per 64KB page.
- There is just one buffer ring shared across 8 Rx queues, so
  buffers of the same page can go to any Rx queue.
- HW gives the buffer address where the packet has been DMA'ed and not
  the index into the buffer ring.
This makes it impossible to reuse DMA mapping info, so unfortunately
every buffer has to go through the costly mapping route.
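
The recycling decision described above can be illustrated with a small
user-space model. This is only a sketch, not the driver code: struct
fake_page, struct page_cache, page_put() and cache_get_page() are
hypothetical stand-ins for struct page, the per-RBDR page cache, put_page()
and nicvf_alloc_page(), and the plain integer refcount replaces
page_ref_count(). The slot replacement policy for a busy page is a
simplification chosen for the model.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page: only a reference count. */
struct fake_page {
	int refcount;
};

/* Hypothetical stand-in for the per-RBDR page cache. */
struct page_cache {
	struct fake_page **slots;
	unsigned int idx;	/* next slot to inspect */
	unsigned int cnt;	/* number of slots, power of two */
	unsigned long allocs;	/* fresh allocations, like the page_alloc stat */
};

static int cache_init(struct page_cache *pc, unsigned int cnt)
{
	assert(cnt && (cnt & (cnt - 1)) == 0);	/* index wraps with a mask */
	pc->slots = calloc(cnt, sizeof(*pc->slots));
	if (!pc->slots)
		return -1;
	pc->idx = 0;
	pc->cnt = cnt;
	pc->allocs = 0;
	return 0;
}

/* Drop one reference and free the page when the last one goes away. */
static void page_put(struct fake_page *page)
{
	if (--page->refcount == 0)
		free(page);
}

/*
 * Return a page for a new receive buffer. The cached page is reused only
 * when the cache holds the sole reference, i.e. every earlier user has
 * dropped its reference. Otherwise the model drops the cache's reference
 * and installs a freshly allocated page in the slot.
 */
static struct fake_page *cache_get_page(struct page_cache *pc)
{
	struct fake_page *page = pc->slots[pc->idx];

	if (page && page->refcount != 1) {
		page_put(page);		/* busy elsewhere, cannot recycle */
		pc->slots[pc->idx] = NULL;
		page = NULL;
	}

	if (!page) {
		page = malloc(sizeof(*page));
		if (!page)
			return NULL;
		page->refcount = 1;	/* reference held by the cache */
		pc->slots[pc->idx] = page;
		pc->allocs++;
	}

	page->refcount++;		/* extra reference for the new user */
	pc->idx = (pc->idx + 1) & (pc->cnt - 1);
	return page;
}
```

In this model a page is handed out again without a new allocation only
after its previous user has called page_put(), which mirrors the
page_ref_count(page) != 1 check in the patch.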

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
---
 drivers/net/ethernet/cavium/thunder/nic.h          |   4 +-
 .../net/ethernet/cavium/thunder/nicvf_ethtool.c    |   3 +-
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 121 ++++++++++++++++++---
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |  11 ++
 4 files changed, 119 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index 6fb4421..dca6aed 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -252,12 +252,14 @@ struct nicvf_drv_stats {
 	u64 tx_csum_overflow;
 
 	/* driver debug stats */
-	u64 rcv_buffer_alloc_failures;
 	u64 tx_tso;
 	u64 tx_timeout;
 	u64 txq_stop;
 	u64 txq_wake;
 
+	u64 rcv_buffer_alloc_failures;
+	u64 page_alloc;
+
 	struct u64_stats_sync   syncp;
 };
 
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
index 02a986c..a89db5f 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
@@ -100,11 +100,12 @@ static const struct nicvf_stat nicvf_drv_stats[] = {
 	NICVF_DRV_STAT(tx_csum_overlap),
 	NICVF_DRV_STAT(tx_csum_overflow),
 
-	NICVF_DRV_STAT(rcv_buffer_alloc_failures),
 	NICVF_DRV_STAT(tx_tso),
 	NICVF_DRV_STAT(tx_timeout),
 	NICVF_DRV_STAT(txq_stop),
 	NICVF_DRV_STAT(txq_wake),
+	NICVF_DRV_STAT(rcv_buffer_alloc_failures),
+	NICVF_DRV_STAT(page_alloc),
 };
 
 static const struct nicvf_stat nicvf_queue_stats[] = {
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 7b0fd8d..12f9709 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -19,8 +19,6 @@
 #include "q_struct.h"
 #include "nicvf_queues.h"
 
-#define NICVF_PAGE_ORDER ((PAGE_SIZE <= 4096) ?  PAGE_ALLOC_COSTLY_ORDER : 0)
-
 static inline u64 nicvf_iova_to_phys(struct nicvf *nic, dma_addr_t dma_addr)
 {
 	/* Translation is installed only when IOMMU is present */
@@ -90,33 +88,88 @@ static void nicvf_free_q_desc_mem(struct nicvf *nic, struct q_desc_mem *dmem)
 	dmem->base = NULL;
 }
 
-/* Allocate buffer for packet reception
- * HW returns memory address where packet is DMA'ed but not a pointer
- * into RBDR ring, so save buffer address at the start of fragment and
- * align the start address to a cache aligned address
+/* Allocate a new page or recycle one if possible
+ *
+ * We cannot optimize dma mapping here, since
+ * 1. It's only one RBDR ring for 8 Rx queues.
+ * 2. CQE_RX gives address of the buffer where pkt has been DMA'ed
+ *    and not idx into RBDR ring, so can't refer to saved info.
+ * 3. There are multiple receive buffers per page
  */
-static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, gfp_t gfp,
-					 u32 buf_len, u64 **rbuf)
+static struct pgcache *nicvf_alloc_page(struct nicvf *nic,
+					struct rbdr *rbdr, gfp_t gfp)
 {
-	int order = NICVF_PAGE_ORDER;
+	struct page *page = NULL;
+	struct pgcache *pgcache, *next;
+
+	/* Check if page is already allocated */
+	pgcache = &rbdr->pgcache[rbdr->pgidx];
+	page = pgcache->page;
+	/* Check if page can be recycled */
+	if (page && (page_ref_count(page) != 1))
+		page = NULL;
+
+	if (!page) {
+		page = alloc_pages(gfp | __GFP_COMP | __GFP_NOWARN, 0);
+		if (!page)
+			return NULL;
+
+		this_cpu_inc(nic->pnicvf->drv_stats->page_alloc);
+
+		/* Check for space */
+		if (rbdr->pgalloc >= rbdr->pgcnt) {
+			/* Page can still be used */
+			nic->rb_page = page;
+			return NULL;
+		}
+
+		/* Save the page in page cache */
+		pgcache->page = page;
+		rbdr->pgalloc++;
+	}
+
+	/* Take extra page reference for recycling */
+	page_ref_add(page, 1);
+
+	rbdr->pgidx++;
+	rbdr->pgidx &= (rbdr->pgcnt - 1);
+
+	/* Prefetch refcount of next page in page cache */
+	next = &rbdr->pgcache[rbdr->pgidx];
+	page = next->page;
+	if (page)
+		prefetch(&page->_refcount);
+
+	return pgcache;
+}
+
+/* Allocate buffer for packet reception */
+static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, struct rbdr *rbdr,
+					 gfp_t gfp, u32 buf_len, u64 **rbuf)
+{
+	struct pgcache *pgcache = NULL;
 
 	/* Check if request can be accomodated in previous allocated page */
 	if (nic->rb_page &&
-	    ((nic->rb_page_offset + buf_len) < (PAGE_SIZE << order))) {
+	    ((nic->rb_page_offset + buf_len) <= PAGE_SIZE)) {
 		nic->rb_pageref++;
 		goto ret;
 	}
 
 	nicvf_get_page(nic);
+	nic->rb_page = NULL;
 
-	/* Allocate a new page */
-	nic->rb_page = alloc_pages(gfp | __GFP_COMP | __GFP_NOWARN,
-				   order);
-	if (!nic->rb_page) {
+	/* Get new page, either recycled or new one */
+	pgcache = nicvf_alloc_page(nic, rbdr, gfp);
+	if (!pgcache && !nic->rb_page) {
 		this_cpu_inc(nic->pnicvf->drv_stats->rcv_buffer_alloc_failures);
 		return -ENOMEM;
 	}
+
 	nic->rb_page_offset = 0;
+	/* Check if it's recycled */
+	if (pgcache)
+		nic->rb_page = pgcache->page;
 ret:
 	/* HW will ensure data coherency, CPU sync not required */
 	*rbuf = (u64 *)((u64)dma_map_page_attrs(&nic->pdev->dev, nic->rb_page,
@@ -125,7 +178,7 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, gfp_t gfp,
 						DMA_ATTR_SKIP_CPU_SYNC));
 	if (dma_mapping_error(&nic->pdev->dev, (dma_addr_t)*rbuf)) {
 		if (!nic->rb_page_offset)
-			__free_pages(nic->rb_page, order);
+			__free_pages(nic->rb_page, 0);
 		nic->rb_page = NULL;
 		return -ENOMEM;
 	}
@@ -177,10 +230,26 @@ static int  nicvf_init_rbdr(struct nicvf *nic, struct rbdr *rbdr,
 	rbdr->head = 0;
 	rbdr->tail = 0;
 
+	/* Initialize page recycling stuff.
+	 *
+	 * Can't use single buffer per page especially with 64K pages.
+	 * On embedded platforms i.e 81xx/83xx available memory itself
+	 * is low and minimum ring size of RBDR is 8K, that takes away
+	 * lots of memory.
+	 */
+	rbdr->pgcnt = ring_len / (PAGE_SIZE / buf_size);
+	rbdr->pgcnt = roundup_pow_of_two(rbdr->pgcnt);
+	rbdr->pgcache = kzalloc(sizeof(*rbdr->pgcache) *
+				rbdr->pgcnt, GFP_KERNEL);
+	if (!rbdr->pgcache)
+		return -ENOMEM;
+	rbdr->pgidx = 0;
+	rbdr->pgalloc = 0;
+
 	nic->rb_page = NULL;
 	for (idx = 0; idx < ring_len; idx++) {
-		err = nicvf_alloc_rcv_buffer(nic, GFP_KERNEL, RCV_FRAG_LEN,
-					     &rbuf);
+		err = nicvf_alloc_rcv_buffer(nic, rbdr, GFP_KERNEL,
+					     RCV_FRAG_LEN, &rbuf);
 		if (err) {
 			/* To free already allocated and mapped ones */
 			rbdr->tail = idx - 1;
@@ -201,6 +270,7 @@ static void nicvf_free_rbdr(struct nicvf *nic, struct rbdr *rbdr)
 {
 	int head, tail;
 	u64 buf_addr, phys_addr;
+	struct pgcache *pgcache;
 	struct rbdr_entry_t *desc;
 
 	if (!rbdr)
@@ -234,6 +304,18 @@ static void nicvf_free_rbdr(struct nicvf *nic, struct rbdr *rbdr)
 	if (phys_addr)
 		put_page(virt_to_page(phys_to_virt(phys_addr)));
 
+	/* Sync page cache info */
+	smp_rmb();
+
+	/* Release additional page references held for recycling */
+	head = 0;
+	while (head < rbdr->pgcnt) {
+		pgcache = &rbdr->pgcache[head];
+		if (pgcache->page && page_ref_count(pgcache->page) != 0)
+			put_page(pgcache->page);
+		head++;
+	}
+
 	/* Free RBDR ring */
 	nicvf_free_q_desc_mem(nic, &rbdr->dmem);
 }
@@ -269,13 +351,16 @@ static void nicvf_refill_rbdr(struct nicvf *nic, gfp_t gfp)
 	else
 		refill_rb_cnt = qs->rbdr_len - qcount - 1;
 
+	/* Sync page cache info */
+	smp_rmb();
+
 	/* Start filling descs from tail */
 	tail = nicvf_queue_reg_read(nic, NIC_QSET_RBDR_0_1_TAIL, rbdr_idx) >> 3;
 	while (refill_rb_cnt) {
 		tail++;
 		tail &= (rbdr->dmem.q_len - 1);
 
-		if (nicvf_alloc_rcv_buffer(nic, gfp, RCV_FRAG_LEN, &rbuf))
+		if (nicvf_alloc_rcv_buffer(nic, rbdr, gfp, RCV_FRAG_LEN, &rbuf))
 			break;
 
 		desc = GET_RBDR_DESC(rbdr, tail);
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 10cb4b8..da48366 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -213,6 +213,11 @@ struct q_desc_mem {
 	void		*unalign_base;
 };
 
+struct pgcache {
+	struct page	*page;
+	u64		dma_addr;
+};
+
 struct rbdr {
 	bool		enable;
 	u32		dma_size;
@@ -222,6 +227,12 @@ struct rbdr {
 	u32		head;
 	u32		tail;
 	struct q_desc_mem   dmem;
+
+	/* For page recycling */
+	int		pgidx;
+	int		pgcnt;
+	int		pgalloc;
+	struct pgcache	*pgcache;
 } ____cacheline_aligned_in_smp;
 
 struct rcv_queue {
-- 
2.7.4

^ permalink raw reply related

* [PATCH 2/9] net: thunderx: Optimize RBDR descriptor handling
From: sunil.kovvuri at gmail.com @ 2017-05-02 13:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493730418-24606-1-git-send-email-sunil.kovvuri@gmail.com>

From: Sunil Goutham <sgoutham@cavium.com>

A receive buffer's physical address or IOVA will never go beyond
49 bits, since that is the maximum HW-supported address width.
As per perf, updating bitfields, i.e. buf_addr:42 in the RBDR
descriptor entry, consumes a lot of CPU cycles, so it is changed
to a 64-bit field with the alignment requirements taken care of.
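
The equivalence this change relies on can be checked in isolation. The
sketch below is illustrative only: it assumes 7 alignment bits (matching
the removed cache_align:7 field) and the little-endian bitfield order, and
the names RCV_BUF_ALIGN, desc_from_bitfield() and desc_from_mask() are
hypothetical, not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 7 low bits are alignment bits (128 bytes). */
#define RCV_BUF_ALIGN		7
#define RCV_BUF_ALIGN_BYTES	(1ULL << RCV_BUF_ALIGN)

/*
 * Old layout, written out by hand: cache_align in bits 0-6, buf_addr in
 * bits 7-48, rsvd0 in bits 49-63. Storing addr >> 7 into the 42-bit field
 * yields this 64-bit descriptor word.
 */
static uint64_t desc_from_bitfield(uint64_t addr)
{
	uint64_t buf_addr;

	assert(addr < (1ULL << 49));	/* address fits the HW limit */
	buf_addr = (addr >> RCV_BUF_ALIGN) & ((1ULL << 42) - 1);

	return buf_addr << RCV_BUF_ALIGN;	/* other fields stay zero */
}

/* New layout: the descriptor word is simply the aligned address. */
static uint64_t desc_from_mask(uint64_t addr)
{
	return addr & ~(RCV_BUF_ALIGN_BYTES - 1);
}
```

For any address below 2^49 both helpers produce the same word, so the
hardware sees an identical descriptor while the CPU does a single mask
instead of a read-modify-write of a bitfield.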

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
---
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |  8 ++++----
 drivers/net/ethernet/cavium/thunder/q_struct.h     | 10 +---------
 2 files changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 12f9709..dfc85a1 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -257,7 +257,7 @@ static int  nicvf_init_rbdr(struct nicvf *nic, struct rbdr *rbdr,
 		}
 
 		desc = GET_RBDR_DESC(rbdr, idx);
-		desc->buf_addr = (u64)rbuf >> NICVF_RCV_BUF_ALIGN;
+		desc->buf_addr = (u64)rbuf & ~(NICVF_RCV_BUF_ALIGN_BYTES - 1);
 	}
 
 	nicvf_get_page(nic);
@@ -286,7 +286,7 @@ static void nicvf_free_rbdr(struct nicvf *nic, struct rbdr *rbdr)
 	/* Release page references */
 	while (head != tail) {
 		desc = GET_RBDR_DESC(rbdr, head);
-		buf_addr = ((u64)desc->buf_addr) << NICVF_RCV_BUF_ALIGN;
+		buf_addr = desc->buf_addr;
 		phys_addr = nicvf_iova_to_phys(nic, buf_addr);
 		dma_unmap_page_attrs(&nic->pdev->dev, buf_addr, RCV_FRAG_LEN,
 				     DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
@@ -297,7 +297,7 @@ static void nicvf_free_rbdr(struct nicvf *nic, struct rbdr *rbdr)
 	}
 	/* Release buffer of tail desc */
 	desc = GET_RBDR_DESC(rbdr, tail);
-	buf_addr = ((u64)desc->buf_addr) << NICVF_RCV_BUF_ALIGN;
+	buf_addr = desc->buf_addr;
 	phys_addr = nicvf_iova_to_phys(nic, buf_addr);
 	dma_unmap_page_attrs(&nic->pdev->dev, buf_addr, RCV_FRAG_LEN,
 			     DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
@@ -364,7 +364,7 @@ static void nicvf_refill_rbdr(struct nicvf *nic, gfp_t gfp)
 			break;
 
 		desc = GET_RBDR_DESC(rbdr, tail);
-		desc->buf_addr = (u64)rbuf >> NICVF_RCV_BUF_ALIGN;
+		desc->buf_addr = (u64)rbuf & ~(NICVF_RCV_BUF_ALIGN_BYTES - 1);
 		refill_rb_cnt--;
 		new_rb++;
 	}
diff --git a/drivers/net/ethernet/cavium/thunder/q_struct.h b/drivers/net/ethernet/cavium/thunder/q_struct.h
index f363472..e47205a 100644
--- a/drivers/net/ethernet/cavium/thunder/q_struct.h
+++ b/drivers/net/ethernet/cavium/thunder/q_struct.h
@@ -359,15 +359,7 @@ union cq_desc_t {
 };
 
 struct rbdr_entry_t {
-#if defined(__BIG_ENDIAN_BITFIELD)
-	u64   rsvd0:15;
-	u64   buf_addr:42;
-	u64   cache_align:7;
-#elif defined(__LITTLE_ENDIAN_BITFIELD)
-	u64   cache_align:7;
-	u64   buf_addr:42;
-	u64   rsvd0:15;
-#endif
+	u64   buf_addr;
 };
 
 /* TCP reassembly context */
-- 
2.7.4

^ permalink raw reply related

* [PATCH 3/9] net: thunderx: Optimize CQE_TX handling
From: sunil.kovvuri at gmail.com @ 2017-05-02 13:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493730418-24606-1-git-send-email-sunil.kovvuri@gmail.com>

From: Sunil Goutham <sgoutham@cavium.com>

Optimized CQE handling with the changes below:
- Freeing descriptors back to the SQ in bulk, i.e. once per NAPI
  instance instead of for every CQE_TX; this reduces the number
  of atomic updates to 'sq->free_cnt'.
- Checking for errors in CQE_TX and CQE_RX before calling the appropriate
  fn()s to update error stats, i.e. reduced branching.
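
The bulk-free idea in the first item can be sketched in user space with
C11 atomics. This is a simplified model under stated assumptions: struct
send_queue, struct tx_completion, reclaim_each() and reclaim_bulk() are
hypothetical names, and only the free-descriptor counter is modelled.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the send queue: only the free count. */
struct send_queue {
	atomic_int free_cnt;
};

/* Hypothetical completion record: subdescriptors following the header. */
struct tx_completion {
	int subdesc_cnt;
};

/* Per-completion variant: one atomic update for every completed packet. */
static int reclaim_each(struct send_queue *sq,
			const struct tx_completion *cqe, int n)
{
	int updates = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		atomic_fetch_add(&sq->free_cnt, cqe[i].subdesc_cnt + 1);
		updates++;
	}
	return updates;
}

/* Batched variant: accumulate locally, then publish once per poll. */
static int reclaim_bulk(struct send_queue *sq,
			const struct tx_completion *cqe, int n)
{
	int subdesc_cnt = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++)
		subdesc_cnt += cqe[i].subdesc_cnt + 1;	/* +1 for the header */

	if (!subdesc_cnt)
		return 0;

	atomic_fetch_add(&sq->free_cnt, subdesc_cnt);
	return 1;
}
```

Both variants leave the same value in free_cnt; the batched one issues a
single atomic operation per poll regardless of how many completions were
processed.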

Also removed debug messages in the packet handling path, which otherwise
cause issues if DEBUG is enabled.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   | 44 +++++++++++-----------
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |  5 ---
 2 files changed, 21 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 81a2fcb..0d79894 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -498,7 +498,7 @@ static int nicvf_init_resources(struct nicvf *nic)
 
 static void nicvf_snd_pkt_handler(struct net_device *netdev,
 				  struct cqe_send_t *cqe_tx,
-				  int cqe_type, int budget,
+				  int budget, int *subdesc_cnt,
 				  unsigned int *tx_pkts, unsigned int *tx_bytes)
 {
 	struct sk_buff *skb = NULL;
@@ -513,12 +513,10 @@ static void nicvf_snd_pkt_handler(struct net_device *netdev,
 	if (hdr->subdesc_type != SQ_DESC_TYPE_HEADER)
 		return;
 
-	netdev_dbg(nic->netdev,
-		   "%s Qset #%d SQ #%d SQ ptr #%d subdesc count %d\n",
-		   __func__, cqe_tx->sq_qs, cqe_tx->sq_idx,
-		   cqe_tx->sqe_ptr, hdr->subdesc_cnt);
+	/* Check for errors */
+	if (cqe_tx->send_status)
+		nicvf_check_cqe_tx_errs(nic->pnicvf, cqe_tx);
 
-	nicvf_check_cqe_tx_errs(nic, cqe_tx);
 	skb = (struct sk_buff *)sq->skbuff[cqe_tx->sqe_ptr];
 	if (skb) {
 		/* Check for dummy descriptor used for HW TSO offload on 88xx */
@@ -528,12 +526,12 @@ static void nicvf_snd_pkt_handler(struct net_device *netdev,
 			 (struct sq_hdr_subdesc *)GET_SQ_DESC(sq, hdr->rsvd2);
 			nicvf_unmap_sndq_buffers(nic, sq, hdr->rsvd2,
 						 tso_sqe->subdesc_cnt);
-			nicvf_put_sq_desc(sq, tso_sqe->subdesc_cnt + 1);
+			*subdesc_cnt += tso_sqe->subdesc_cnt + 1;
 		} else {
 			nicvf_unmap_sndq_buffers(nic, sq, cqe_tx->sqe_ptr,
 						 hdr->subdesc_cnt);
 		}
-		nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
+		*subdesc_cnt += hdr->subdesc_cnt + 1;
 		prefetch(skb);
 		(*tx_pkts)++;
 		*tx_bytes += skb->len;
@@ -544,7 +542,7 @@ static void nicvf_snd_pkt_handler(struct net_device *netdev,
 		 * a SKB attached, so just free SQEs here.
 		 */
 		if (!nic->hw_tso)
-			nicvf_put_sq_desc(sq, hdr->subdesc_cnt + 1);
+			*subdesc_cnt += hdr->subdesc_cnt + 1;
 	}
 }
 
@@ -595,9 +593,11 @@ static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 	}
 
 	/* Check for errors */
-	err = nicvf_check_cqe_rx_errs(nic, cqe_rx);
-	if (err && !cqe_rx->rb_cnt)
-		return;
+	if (cqe_rx->err_level || cqe_rx->err_opcode) {
+		err = nicvf_check_cqe_rx_errs(nic, cqe_rx);
+		if (err && !cqe_rx->rb_cnt)
+			return;
+	}
 
 	skb = nicvf_get_rcv_skb(snic, cqe_rx);
 	if (!skb) {
@@ -646,6 +646,7 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 {
 	int processed_cqe, work_done = 0, tx_done = 0;
 	int cqe_count, cqe_head;
+	int subdesc_cnt = 0;
 	struct nicvf *nic = netdev_priv(netdev);
 	struct queue_set *qs = nic->qs;
 	struct cmp_queue *cq = &qs->cq[cq_idx];
@@ -667,8 +668,6 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 	cqe_head = nicvf_queue_reg_read(nic, NIC_QSET_CQ_0_7_HEAD, cq_idx) >> 9;
 	cqe_head &= 0xFFFF;
 
-	netdev_dbg(nic->netdev, "%s CQ%d cqe_count %d cqe_head %d\n",
-		   __func__, cq_idx, cqe_count, cqe_head);
 	while (processed_cqe < cqe_count) {
 		/* Get the CQ descriptor */
 		cq_desc = (struct cqe_rx_t *)GET_CQ_DESC(cq, cqe_head);
@@ -682,17 +681,15 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 			break;
 		}
 
-		netdev_dbg(nic->netdev, "CQ%d cq_desc->cqe_type %d\n",
-			   cq_idx, cq_desc->cqe_type);
 		switch (cq_desc->cqe_type) {
 		case CQE_TYPE_RX:
 			nicvf_rcv_pkt_handler(netdev, napi, cq_desc);
 			work_done++;
 		break;
 		case CQE_TYPE_SEND:
-			nicvf_snd_pkt_handler(netdev,
-					      (void *)cq_desc, CQE_TYPE_SEND,
-					      budget, &tx_pkts, &tx_bytes);
+			nicvf_snd_pkt_handler(netdev, (void *)cq_desc,
+					      budget, &subdesc_cnt,
+					      &tx_pkts, &tx_bytes);
 			tx_done++;
 		break;
 		case CQE_TYPE_INVALID:
@@ -704,9 +701,6 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 		}
 		processed_cqe++;
 	}
-	netdev_dbg(nic->netdev,
-		   "%s CQ%d processed_cqe %d work_done %d budget %d\n",
-		   __func__, cq_idx, processed_cqe, work_done, budget);
 
 	/* Ring doorbell to inform H/W to reuse processed CQEs */
 	nicvf_queue_reg_write(nic, NIC_QSET_CQ_0_7_DOOR,
@@ -716,8 +710,12 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 		goto loop;
 
 done:
-	/* Wakeup TXQ if its stopped earlier due to SQ full */
 	sq = &nic->qs->sq[cq_idx];
+	/* Update SQ's descriptor free count */
+	if (subdesc_cnt)
+		nicvf_put_sq_desc(sq, subdesc_cnt);
+
+	/* Wakeup TXQ if its stopped earlier due to SQ full */
 	if (tx_done ||
 	    (atomic_read(&sq->free_cnt) >= MIN_SQ_DESC_PER_PKT_XMIT)) {
 		netdev = nic->pnicvf->netdev;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index dfc85a1..90c5bc7d 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -1640,9 +1640,6 @@ void nicvf_update_sq_stats(struct nicvf *nic, int sq_idx)
 /* Check for errors in the receive cmp.queue entry */
 int nicvf_check_cqe_rx_errs(struct nicvf *nic, struct cqe_rx_t *cqe_rx)
 {
-	if (!cqe_rx->err_level && !cqe_rx->err_opcode)
-		return 0;
-
 	if (netif_msg_rx_err(nic))
 		netdev_err(nic->netdev,
 			   "%s: RX error CQE err_level 0x%x err_opcode 0x%x\n",
@@ -1731,8 +1728,6 @@ int nicvf_check_cqe_rx_errs(struct nicvf *nic, struct cqe_rx_t *cqe_rx)
 int nicvf_check_cqe_tx_errs(struct nicvf *nic, struct cqe_send_t *cqe_tx)
 {
 	switch (cqe_tx->send_status) {
-	case CQ_TX_ERROP_GOOD:
-		return 0;
 	case CQ_TX_ERROP_DESC_FAULT:
 		this_cpu_inc(nic->drv_stats->tx_desc_fault);
 		break;
-- 
2.7.4

^ permalink raw reply related

* [PATCH 4/9] net: thunderx: Cleanup receive buffer allocation
From: sunil.kovvuri at gmail.com @ 2017-05-02 13:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493730418-24606-1-git-send-email-sunil.kovvuri@gmail.com>

From: Sunil Goutham <sgoutham@cavium.com>

Get rid of unnecessary double pointer references and type casting
in receive buffer allocation code.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
---
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 90c5bc7d..e4a02a9 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -145,7 +145,7 @@ static struct pgcache *nicvf_alloc_page(struct nicvf *nic,
 
 /* Allocate buffer for packet reception */
 static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, struct rbdr *rbdr,
-					 gfp_t gfp, u32 buf_len, u64 **rbuf)
+					 gfp_t gfp, u32 buf_len, u64 *rbuf)
 {
 	struct pgcache *pgcache = NULL;
 
@@ -172,10 +172,10 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, struct rbdr *rbdr,
 		nic->rb_page = pgcache->page;
 ret:
 	/* HW will ensure data coherency, CPU sync not required */
-	*rbuf = (u64 *)((u64)dma_map_page_attrs(&nic->pdev->dev, nic->rb_page,
-						nic->rb_page_offset, buf_len,
-						DMA_FROM_DEVICE,
-						DMA_ATTR_SKIP_CPU_SYNC));
+	*rbuf = (u64)dma_map_page_attrs(&nic->pdev->dev, nic->rb_page,
+					nic->rb_page_offset, buf_len,
+					DMA_FROM_DEVICE,
+					DMA_ATTR_SKIP_CPU_SYNC);
 	if (dma_mapping_error(&nic->pdev->dev, (dma_addr_t)*rbuf)) {
 		if (!nic->rb_page_offset)
 			__free_pages(nic->rb_page, 0);
@@ -212,7 +212,7 @@ static int  nicvf_init_rbdr(struct nicvf *nic, struct rbdr *rbdr,
 			    int ring_len, int buf_size)
 {
 	int idx;
-	u64 *rbuf;
+	u64 rbuf;
 	struct rbdr_entry_t *desc;
 	int err;
 
@@ -257,7 +257,7 @@ static int  nicvf_init_rbdr(struct nicvf *nic, struct rbdr *rbdr,
 		}
 
 		desc = GET_RBDR_DESC(rbdr, idx);
-		desc->buf_addr = (u64)rbuf & ~(NICVF_RCV_BUF_ALIGN_BYTES - 1);
+		desc->buf_addr = rbuf & ~(NICVF_RCV_BUF_ALIGN_BYTES - 1);
 	}
 
 	nicvf_get_page(nic);
@@ -330,7 +330,7 @@ static void nicvf_refill_rbdr(struct nicvf *nic, gfp_t gfp)
 	int refill_rb_cnt;
 	struct rbdr *rbdr;
 	struct rbdr_entry_t *desc;
-	u64 *rbuf;
+	u64 rbuf;
 	int new_rb = 0;
 
 refill:
@@ -364,7 +364,7 @@ static void nicvf_refill_rbdr(struct nicvf *nic, gfp_t gfp)
 			break;
 
 		desc = GET_RBDR_DESC(rbdr, tail);
-		desc->buf_addr = (u64)rbuf & ~(NICVF_RCV_BUF_ALIGN_BYTES - 1);
+		desc->buf_addr = rbuf & ~(NICVF_RCV_BUF_ALIGN_BYTES - 1);
 		refill_rb_cnt--;
 		new_rb++;
 	}
-- 
2.7.4

^ permalink raw reply related

* [PATCH 5/9] net: thunderx: Add basic XDP support
From: sunil.kovvuri at gmail.com @ 2017-05-02 13:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493730418-24606-1-git-send-email-sunil.kovvuri@gmail.com>

From: Sunil Goutham <sgoutham@cavium.com>

Adds basic XDP support, i.e. attaching a BPF program to an
interface. Also takes care of allocating separate Tx queues
for the XDP path and for network stack packet transmission.
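
The queue accounting this implies can be sketched as follows. The sketch
follows the nicvf_set_channels() hunk in this patch (one XDP Tx queue per
Rx queue when a program is attached), but struct queue_plan and
plan_queues() are hypothetical names and the secondary Qset handling is
left out.

```c
#include <assert.h>
#include <stdbool.h>

struct queue_plan {
	unsigned int rx_queues;
	unsigned int tx_queues;		/* used by the network stack */
	unsigned int xdp_tx_queues;	/* reserved for the XDP path */
	unsigned int cq_count;		/* completion queues needed */
};

/* Returns 0 on success, -1 when the request does not fit. */
static int plan_queues(struct queue_plan *p, unsigned int rx,
		       unsigned int tx, unsigned int max_queues,
		       bool xdp_attached)
{
	unsigned int txq_count;

	assert(p);
	if (!rx || !tx || rx > max_queues || tx > max_queues)
		return -1;

	/* In XDP mode every Rx queue needs its own extra Tx queue. */
	if (xdp_attached && rx + tx > max_queues)
		return -1;

	p->rx_queues = rx;
	p->tx_queues = tx;
	p->xdp_tx_queues = xdp_attached ? rx : 0;

	txq_count = p->tx_queues + p->xdp_tx_queues;
	p->cq_count = rx > txq_count ? rx : txq_count;
	return 0;
}
```

With max_queues of 8, a request for 4 Rx and 4 Tx queues needs 4
completion queues without a program and 8 with one attached, while 6 Rx
plus 4 Tx is rejected in XDP mode.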

This patch doesn't handle any of the XDP actions yet; all of them
are treated as XDP_PASS, i.e. packets are handed over to
the network stack.

Changes also involve allocating one receive buffer per page in XDP
mode and multiple in normal mode, i.e. when no BPF program is attached.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
---
 drivers/net/ethernet/cavium/thunder/nic.h          |   6 +-
 .../net/ethernet/cavium/thunder/nicvf_ethtool.c    |  26 +++-
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   | 162 ++++++++++++++++++++-
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |  15 +-
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |   9 ++
 5 files changed, 199 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h b/drivers/net/ethernet/cavium/thunder/nic.h
index dca6aed..4a02e61 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -268,9 +268,9 @@ struct nicvf {
 	struct net_device	*netdev;
 	struct pci_dev		*pdev;
 	void __iomem		*reg_base;
+	struct bpf_prog         *xdp_prog;
 #define	MAX_QUEUES_PER_QSET			8
 	struct queue_set	*qs;
-	struct nicvf_cq_poll	*napi[8];
 	void			*iommu_domain;
 	u8			vf_id;
 	u8			sqs_id;
@@ -296,6 +296,7 @@ struct nicvf {
 	/* Queue count */
 	u8			rx_queues;
 	u8			tx_queues;
+	u8			xdp_tx_queues;
 	u8			max_queues;
 
 	u8			node;
@@ -320,6 +321,9 @@ struct nicvf {
 	struct nicvf_drv_stats  __percpu *drv_stats;
 	struct bgx_stats	bgx_stats;
 
+	/* Napi */
+	struct nicvf_cq_poll	*napi[8];
+
 	/* MSI-X  */
 	u8			num_vec;
 	char			irq_name[NIC_VF_MSIX_VECTORS][IFNAMSIZ + 15];
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
index a89db5f..b9ece9c 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c
@@ -721,7 +721,7 @@ static int nicvf_set_channels(struct net_device *dev,
 	struct nicvf *nic = netdev_priv(dev);
 	int err = 0;
 	bool if_up = netif_running(dev);
-	int cqcount;
+	u8 cqcount, txq_count;
 
 	if (!channel->rx_count || !channel->tx_count)
 		return -EINVAL;
@@ -730,10 +730,26 @@ static int nicvf_set_channels(struct net_device *dev,
 	if (channel->tx_count > nic->max_queues)
 		return -EINVAL;
 
+	if (nic->xdp_prog &&
+	    ((channel->tx_count + channel->rx_count) > nic->max_queues)) {
+		netdev_err(nic->netdev,
+			   "XDP mode, RXQs + TXQs > Max %d\n",
+			   nic->max_queues);
+		return -EINVAL;
+	}
+
 	if (if_up)
 		nicvf_stop(dev);
 
-	cqcount = max(channel->rx_count, channel->tx_count);
+	nic->rx_queues = channel->rx_count;
+	nic->tx_queues = channel->tx_count;
+	if (!nic->xdp_prog)
+		nic->xdp_tx_queues = 0;
+	else
+		nic->xdp_tx_queues = channel->rx_count;
+
+	txq_count = nic->xdp_tx_queues + nic->tx_queues;
+	cqcount = max(nic->rx_queues, txq_count);
 
 	if (cqcount > MAX_CMP_QUEUES_PER_QS) {
 		nic->sqs_count = roundup(cqcount, MAX_CMP_QUEUES_PER_QS);
@@ -742,12 +758,10 @@ static int nicvf_set_channels(struct net_device *dev,
 		nic->sqs_count = 0;
 	}
 
-	nic->qs->rq_cnt = min_t(u32, channel->rx_count, MAX_RCV_QUEUES_PER_QS);
-	nic->qs->sq_cnt = min_t(u32, channel->tx_count, MAX_SND_QUEUES_PER_QS);
+	nic->qs->rq_cnt = min_t(u8, nic->rx_queues, MAX_RCV_QUEUES_PER_QS);
+	nic->qs->sq_cnt = min_t(u8, txq_count, MAX_SND_QUEUES_PER_QS);
 	nic->qs->cq_cnt = max(nic->qs->rq_cnt, nic->qs->sq_cnt);
 
-	nic->rx_queues = channel->rx_count;
-	nic->tx_queues = channel->tx_count;
 	err = nicvf_set_real_num_queues(dev, nic->tx_queues, nic->rx_queues);
 	if (err)
 		return err;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 0d79894..9c48873 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -17,6 +17,8 @@
 #include <linux/prefetch.h>
 #include <linux/irq.h>
 #include <linux/iommu.h>
+#include <linux/bpf.h>
+#include <linux/filter.h>
 
 #include "nic_reg.h"
 #include "nic.h"
@@ -397,8 +399,10 @@ static void nicvf_request_sqs(struct nicvf *nic)
 
 	if (nic->rx_queues > MAX_RCV_QUEUES_PER_QS)
 		rx_queues = nic->rx_queues - MAX_RCV_QUEUES_PER_QS;
-	if (nic->tx_queues > MAX_SND_QUEUES_PER_QS)
-		tx_queues = nic->tx_queues - MAX_SND_QUEUES_PER_QS;
+
+	tx_queues = nic->tx_queues + nic->xdp_tx_queues;
+	if (tx_queues > MAX_SND_QUEUES_PER_QS)
+		tx_queues = tx_queues - MAX_SND_QUEUES_PER_QS;
 
 	/* Set no of Rx/Tx queues in each of the SQsets */
 	for (sqs = 0; sqs < nic->sqs_count; sqs++) {
@@ -496,6 +500,43 @@ static int nicvf_init_resources(struct nicvf *nic)
 	return 0;
 }
 
+static inline bool nicvf_xdp_rx(struct nicvf *nic,
+				struct bpf_prog *prog,
+				struct cqe_rx_t *cqe_rx)
+{
+	struct xdp_buff xdp;
+	u32 action;
+	u16 len;
+	u64 dma_addr, cpu_addr;
+
+	/* Retrieve packet buffer's DMA address and length */
+	len = *((u16 *)((void *)cqe_rx + (3 * sizeof(u64))));
+	dma_addr = *((u64 *)((void *)cqe_rx + (7 * sizeof(u64))));
+
+	cpu_addr = nicvf_iova_to_phys(nic, dma_addr);
+	if (!cpu_addr)
+		return false;
+
+	xdp.data = phys_to_virt(cpu_addr);
+	xdp.data_end = xdp.data + len;
+
+	rcu_read_lock();
+	action = bpf_prog_run_xdp(prog, &xdp);
+	rcu_read_unlock();
+
+	switch (action) {
+	case XDP_PASS:
+	case XDP_TX:
+	case XDP_ABORTED:
+	case XDP_DROP:
+		/* Pass on all packets to network stack */
+		return false;
+	default:
+		bpf_warn_invalid_xdp_action(action);
+	}
+	return false;
+}
+
 static void nicvf_snd_pkt_handler(struct net_device *netdev,
 				  struct cqe_send_t *cqe_tx,
 				  int budget, int *subdesc_cnt,
@@ -599,6 +640,11 @@ static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 			return;
 	}
 
+	/* For XDP, ignore pkts spanning multiple pages */
+	if (nic->xdp_prog && (cqe_rx->rb_cnt == 1))
+		if (nicvf_xdp_rx(snic, nic->xdp_prog, cqe_rx))
+			return;
+
 	skb = nicvf_get_rcv_skb(snic, cqe_rx);
 	if (!skb) {
 		netdev_dbg(nic->netdev, "Packet not received\n");
@@ -1529,6 +1575,117 @@ static int nicvf_set_features(struct net_device *netdev,
 	return 0;
 }
 
+static void nicvf_set_xdp_queues(struct nicvf *nic, bool bpf_attached)
+{
+	u8 cq_count, txq_count;
+
+	/* Set XDP Tx queue count same as Rx queue count */
+	if (!bpf_attached)
+		nic->xdp_tx_queues = 0;
+	else
+		nic->xdp_tx_queues = nic->rx_queues;
+
+	/* If queue count > MAX_CMP_QUEUES_PER_QS, then additional qsets
+	 * needs to be allocated, check how many.
+	 */
+	txq_count = nic->xdp_tx_queues + nic->tx_queues;
+	cq_count = max(nic->rx_queues, txq_count);
+	if (cq_count > MAX_CMP_QUEUES_PER_QS) {
+		nic->sqs_count = roundup(cq_count, MAX_CMP_QUEUES_PER_QS);
+		nic->sqs_count = (nic->sqs_count / MAX_CMP_QUEUES_PER_QS) - 1;
+	} else {
+		nic->sqs_count = 0;
+	}
+
+	/* Set primary Qset's resources */
+	nic->qs->rq_cnt = min_t(u8, nic->rx_queues, MAX_RCV_QUEUES_PER_QS);
+	nic->qs->sq_cnt = min_t(u8, txq_count, MAX_SND_QUEUES_PER_QS);
+	nic->qs->cq_cnt = max_t(u8, nic->qs->rq_cnt, nic->qs->sq_cnt);
+
+	/* Update stack */
+	nicvf_set_real_num_queues(nic->netdev, nic->tx_queues, nic->rx_queues);
+}
+
+static int nicvf_xdp_setup(struct nicvf *nic, struct bpf_prog *prog)
+{
+	struct net_device *dev = nic->netdev;
+	bool if_up = netif_running(nic->netdev);
+	struct bpf_prog *old_prog;
+	bool bpf_attached = false;
+
+	/* For now just support only the usual MTU sized frames */
+	if (prog && (dev->mtu > 1500)) {
+		netdev_warn(dev, "Jumbo frames not yet supported with XDP, current MTU %d.\n",
+			    dev->mtu);
+		return -EOPNOTSUPP;
+	}
+
+	if (prog && prog->xdp_adjust_head)
+		return -EOPNOTSUPP;
+
+	/* ALL SQs attached to CQs i.e same as RQs, are treated as
+	 * XDP Tx queues and more Tx queues are allocated for
+	 * network stack to send pkts out.
+	 *
+	 * No of Tx queues are either same as Rx queues or whatever
+	 * is left in max no of queues possible.
+	 */
+	if ((nic->rx_queues + nic->tx_queues) > nic->max_queues) {
+		netdev_warn(dev,
+			    "Failed to attach BPF prog, RXQs + TXQs > Max %d\n",
+			    nic->max_queues);
+		return -ENOMEM;
+	}
+
+	if (if_up)
+		nicvf_stop(nic->netdev);
+
+	old_prog = xchg(&nic->xdp_prog, prog);
+	/* Detach old prog, if any */
+	if (old_prog)
+		bpf_prog_put(old_prog);
+
+	if (nic->xdp_prog) {
+		/* Attach BPF program */
+		nic->xdp_prog = bpf_prog_add(nic->xdp_prog, nic->rx_queues - 1);
+		if (!IS_ERR(nic->xdp_prog))
+			bpf_attached = true;
+	}
+
+	/* Calculate Tx queues needed for XDP and network stack */
+	nicvf_set_xdp_queues(nic, bpf_attached);
+
+	if (if_up) {
+		/* Reinitialize interface, clean slate */
+		nicvf_open(nic->netdev);
+		netif_trans_update(nic->netdev);
+	}
+
+	return 0;
+}
+
+static int nicvf_xdp(struct net_device *netdev, struct netdev_xdp *xdp)
+{
+	struct nicvf *nic = netdev_priv(netdev);
+
+	/* To avoid checks while retrieving buffer address from CQE_RX,
+	 * do not support XDP for T88 pass1.x silicons which are anyway
+	 * not in use widely.
+	 */
+	if (pass1_silicon(nic->pdev))
+		return -EOPNOTSUPP;
+
+	switch (xdp->command) {
+	case XDP_SETUP_PROG:
+		return nicvf_xdp_setup(nic, xdp->prog);
+	case XDP_QUERY_PROG:
+		xdp->prog_attached = !!nic->xdp_prog;
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
 static const struct net_device_ops nicvf_netdev_ops = {
 	.ndo_open		= nicvf_open,
 	.ndo_stop		= nicvf_stop,
@@ -1539,6 +1696,7 @@ static const struct net_device_ops nicvf_netdev_ops = {
 	.ndo_tx_timeout         = nicvf_tx_timeout,
 	.ndo_fix_features       = nicvf_fix_features,
 	.ndo_set_features       = nicvf_set_features,
+	.ndo_xdp		= nicvf_xdp,
 };
 
 static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index e4a02a9..8c3c571 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -19,14 +19,6 @@
 #include "q_struct.h"
 #include "nicvf_queues.h"
 
-static inline u64 nicvf_iova_to_phys(struct nicvf *nic, dma_addr_t dma_addr)
-{
-	/* Translation is installed only when IOMMU is present */
-	if (nic->iommu_domain)
-		return iommu_iova_to_phys(nic->iommu_domain, dma_addr);
-	return dma_addr;
-}
-
 static void nicvf_get_page(struct nicvf *nic)
 {
 	if (!nic->rb_pageref || !nic->rb_page)
@@ -149,8 +141,10 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, struct rbdr *rbdr,
 {
 	struct pgcache *pgcache = NULL;
 
-	/* Check if request can be accomodated in previous allocated page */
-	if (nic->rb_page &&
+	/* Check if request can be accomodated in previous allocated page.
+	 * But in XDP mode only one buffer per page is permitted.
+	 */
+	if (!nic->pnicvf->xdp_prog && nic->rb_page &&
 	    ((nic->rb_page_offset + buf_len) <= PAGE_SIZE)) {
 		nic->rb_pageref++;
 		goto ret;
@@ -961,6 +955,7 @@ int nicvf_set_qset_resources(struct nicvf *nic)
 
 	nic->rx_queues = qs->rq_cnt;
 	nic->tx_queues = qs->sq_cnt;
+	nic->xdp_tx_queues = 0;
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index da48366..07136a2 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -10,6 +10,7 @@
 #define NICVF_QUEUES_H
 
 #include <linux/netdevice.h>
+#include <linux/iommu.h>
 #include "q_struct.h"
 
 #define MAX_QUEUE_SET			128
@@ -312,6 +313,14 @@ struct queue_set {
 
 #define	CQ_ERR_MASK	(CQ_WR_FULL | CQ_WR_DISABLE | CQ_WR_FAULT)
 
+static inline u64 nicvf_iova_to_phys(struct nicvf *nic, dma_addr_t dma_addr)
+{
+	/* Translation is installed only when IOMMU is present */
+	if (nic->iommu_domain)
+		return iommu_iova_to_phys(nic->iommu_domain, dma_addr);
+	return dma_addr;
+}
+
 void nicvf_unmap_sndq_buffers(struct nicvf *nic, struct snd_queue *sq,
 			      int hdr_sqe, u8 subdesc_cnt);
 void nicvf_config_vlan_stripping(struct nicvf *nic,
-- 
2.7.4


* [PATCH 6/9] net: thunderx: Add support for XDP_DROP
From: sunil.kovvuri at gmail.com @ 2017-05-02 13:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493730418-24606-1-git-send-email-sunil.kovvuri@gmail.com>

From: Sunil Goutham <sgoutham@cavium.com>

Add support for XDP_DROP.
Since XDP mode uses just a single buffer per page, also
recycle the DMA mapping info along with the pages.
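
The recycling rule described above can be sketched as a small userspace
model. This is only an illustration, not the driver code: 'struct
model_page', 'model_map' and 'model_drop' are hypothetical names, the
refcount field stands in for page_ref_count(), and clearing 'dma_addr'
stands in for the real DMA unmap call. The assumption, taken from the
comments in the diff below, is that a recycled page holds one extra
reference, so a count of 1 means the mapping can be torn down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a page-cache entry (not the driver's struct). */
struct model_page {
	int refcount;      /* models page_ref_count() */
	uint64_t dma_addr; /* 0 means "no live DMA mapping" */
	int map_calls;     /* counts simulated DMA map operations */
};

/* Hand the page to hardware, reusing the cached mapping if one exists. */
static uint64_t model_map(struct model_page *pg, uint64_t fresh_addr)
{
	assert(pg->refcount > 0);
	if (pg->dma_addr)
		return pg->dma_addr;   /* recycled: skip the map call */
	pg->dma_addr = fresh_addr;     /* first use: create the mapping */
	pg->map_calls++;
	return pg->dma_addr;
}

/* Drop a packet. The mapping is released only when no recycler
 * reference remains. Returns true if the mapping was released.
 */
static bool model_drop(struct model_page *pg)
{
	bool unmapped = false;

	assert(pg->refcount > 0);
	if (pg->refcount == 1) {
		pg->dma_addr = 0;      /* stands in for the DMA unmap */
		unmapped = true;
	}
	pg->refcount--;                /* stands in for put_page() */
	return unmapped;
}
```

With a single buffer per page the cached address identifies the whole
page, which is why the mapping can travel with the page in this model.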

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   | 23 ++++++-
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 77 ++++++++++++++++------
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |  4 +-
 3 files changed, 79 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 9c48873..a58cc1e 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -18,6 +18,7 @@
 #include <linux/irq.h>
 #include <linux/iommu.h>
 #include <linux/bpf.h>
+#include <linux/bpf_trace.h>
 #include <linux/filter.h>
 
 #include "nic_reg.h"
@@ -505,6 +506,7 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic,
 				struct cqe_rx_t *cqe_rx)
 {
 	struct xdp_buff xdp;
+	struct page *page;
 	u32 action;
 	u16 len;
 	u64 dma_addr, cpu_addr;
@@ -527,12 +529,27 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic,
 	switch (action) {
 	case XDP_PASS:
 	case XDP_TX:
-	case XDP_ABORTED:
-	case XDP_DROP:
 		/* Pass on all packets to network stack */
 		return false;
 	default:
 		bpf_warn_invalid_xdp_action(action);
+	case XDP_ABORTED:
+		trace_xdp_exception(nic->netdev, prog, action);
+	case XDP_DROP:
+		page = virt_to_page(xdp.data);
+		/* Check if it's a recycled page, if not
+		 * unmap the DMA mapping.
+		 *
+		 * Recycled page holds an extra reference.
+		 */
+		if (page_ref_count(page) == 1) {
+			dma_addr &= PAGE_MASK;
+			dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
+					     RCV_FRAG_LEN, DMA_FROM_DEVICE,
+					     DMA_ATTR_SKIP_CPU_SYNC);
+		}
+		put_page(page);
+		return true;
 	}
 	return false;
 }
@@ -645,7 +662,7 @@ static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 		if (nicvf_xdp_rx(snic, nic->xdp_prog, cqe_rx))
 			return;
 
-	skb = nicvf_get_rcv_skb(snic, cqe_rx);
+	skb = nicvf_get_rcv_skb(snic, cqe_rx, nic->xdp_prog ? true : false);
 	if (!skb) {
 		netdev_dbg(nic->netdev, "Packet not received\n");
 		return;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 8c3c571..5009f49 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -117,6 +117,7 @@ static struct pgcache *nicvf_alloc_page(struct nicvf *nic,
 
 		/* Save the page in page cache */
 		pgcache->page = page;
+		pgcache->dma_addr = 0;
 		rbdr->pgalloc++;
 	}
 
@@ -144,7 +145,7 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, struct rbdr *rbdr,
 	/* Check if request can be accomodated in previous allocated page.
 	 * But in XDP mode only one buffer per page is permitted.
 	 */
-	if (!nic->pnicvf->xdp_prog && nic->rb_page &&
+	if (!rbdr->is_xdp && nic->rb_page &&
 	    ((nic->rb_page_offset + buf_len) <= PAGE_SIZE)) {
 		nic->rb_pageref++;
 		goto ret;
@@ -165,18 +166,24 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, struct rbdr *rbdr,
 	if (pgcache)
 		nic->rb_page = pgcache->page;
 ret:
-	/* HW will ensure data coherency, CPU sync not required */
-	*rbuf = (u64)dma_map_page_attrs(&nic->pdev->dev, nic->rb_page,
-					nic->rb_page_offset, buf_len,
-					DMA_FROM_DEVICE,
-					DMA_ATTR_SKIP_CPU_SYNC);
-	if (dma_mapping_error(&nic->pdev->dev, (dma_addr_t)*rbuf)) {
-		if (!nic->rb_page_offset)
-			__free_pages(nic->rb_page, 0);
-		nic->rb_page = NULL;
-		return -ENOMEM;
+	if (rbdr->is_xdp && pgcache && pgcache->dma_addr) {
+		*rbuf = pgcache->dma_addr;
+	} else {
+		/* HW will ensure data coherency, CPU sync not required */
+		*rbuf = (u64)dma_map_page_attrs(&nic->pdev->dev, nic->rb_page,
+						nic->rb_page_offset, buf_len,
+						DMA_FROM_DEVICE,
+						DMA_ATTR_SKIP_CPU_SYNC);
+		if (dma_mapping_error(&nic->pdev->dev, (dma_addr_t)*rbuf)) {
+			if (!nic->rb_page_offset)
+				__free_pages(nic->rb_page, 0);
+			nic->rb_page = NULL;
+			return -ENOMEM;
+		}
+		if (pgcache)
+			pgcache->dma_addr = *rbuf;
+		nic->rb_page_offset += buf_len;
 	}
-	nic->rb_page_offset += buf_len;
 
 	return 0;
 }
@@ -230,8 +237,16 @@ static int  nicvf_init_rbdr(struct nicvf *nic, struct rbdr *rbdr,
 	 * On embedded platforms i.e 81xx/83xx available memory itself
 	 * is low and minimum ring size of RBDR is 8K, that takes away
 	 * lots of memory.
+	 *
+	 * But for XDP it has to be a single buffer per page.
 	 */
-	rbdr->pgcnt = ring_len / (PAGE_SIZE / buf_size);
+	if (!nic->pnicvf->xdp_prog) {
+		rbdr->pgcnt = ring_len / (PAGE_SIZE / buf_size);
+		rbdr->is_xdp = false;
+	} else {
+		rbdr->pgcnt = ring_len;
+		rbdr->is_xdp = true;
+	}
 	rbdr->pgcnt = roundup_pow_of_two(rbdr->pgcnt);
 	rbdr->pgcache = kzalloc(sizeof(*rbdr->pgcache) *
 				rbdr->pgcnt, GFP_KERNEL);
@@ -1454,8 +1469,31 @@ static inline unsigned frag_num(unsigned i)
 #endif
 }
 
+static void nicvf_unmap_rcv_buffer(struct nicvf *nic, u64 dma_addr,
+				   u64 buf_addr, bool xdp)
+{
+	struct page *page = NULL;
+	int len = RCV_FRAG_LEN;
+
+	if (xdp) {
+		page = virt_to_page(phys_to_virt(buf_addr));
+		/* Check if it's a recycled page, if not
+		 * unmap the DMA mapping.
+		 *
+		 * Recycled page holds an extra reference.
+		 */
+		if (page_ref_count(page) != 1)
+			return;
+		/* Receive buffers in XDP mode are mapped from page start */
+		dma_addr &= PAGE_MASK;
+	}
+	dma_unmap_page_attrs(&nic->pdev->dev, dma_addr, len,
+			     DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
+}
+
 /* Returns SKB for a received packet */
-struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic, struct cqe_rx_t *cqe_rx)
+struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic,
+				  struct cqe_rx_t *cqe_rx, bool xdp)
 {
 	int frag;
 	int payload_len = 0;
@@ -1490,10 +1528,9 @@ struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic, struct cqe_rx_t *cqe_rx)
 
 		if (!frag) {
 			/* First fragment */
-			dma_unmap_page_attrs(&nic->pdev->dev,
-					     *rb_ptrs - cqe_rx->align_pad,
-					     RCV_FRAG_LEN, DMA_FROM_DEVICE,
-					     DMA_ATTR_SKIP_CPU_SYNC);
+			nicvf_unmap_rcv_buffer(nic,
+					       *rb_ptrs - cqe_rx->align_pad,
+					       phys_addr, xdp);
 			skb = nicvf_rb_ptr_to_skb(nic,
 						  phys_addr - cqe_rx->align_pad,
 						  payload_len);
@@ -1503,9 +1540,7 @@ struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic, struct cqe_rx_t *cqe_rx)
 			skb_put(skb, payload_len);
 		} else {
 			/* Add fragments */
-			dma_unmap_page_attrs(&nic->pdev->dev, *rb_ptrs,
-					     RCV_FRAG_LEN, DMA_FROM_DEVICE,
-					     DMA_ATTR_SKIP_CPU_SYNC);
+			nicvf_unmap_rcv_buffer(nic, *rb_ptrs, phys_addr, xdp);
 			page = virt_to_page(phys_to_virt(phys_addr));
 			offset = phys_to_virt(phys_addr) - page_address(page);
 			skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 07136a2..db04c0e 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -228,6 +228,7 @@ struct rbdr {
 	u32		head;
 	u32		tail;
 	struct q_desc_mem   dmem;
+	bool		is_xdp;
 
 	/* For page recycling */
 	int		pgidx;
@@ -339,7 +340,8 @@ void nicvf_sq_free_used_descs(struct net_device *netdev,
 int nicvf_sq_append_skb(struct nicvf *nic, struct snd_queue *sq,
 			struct sk_buff *skb, u8 sq_num);
 
-struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic, struct cqe_rx_t *cqe_rx);
+struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic,
+				  struct cqe_rx_t *cqe_rx, bool xdp);
 void nicvf_rbdr_task(unsigned long data);
 void nicvf_rbdr_work(struct work_struct *work);
 
-- 
2.7.4


* [PATCH 7/9] net: thunderx: Add support for XDP_TX
From: sunil.kovvuri at gmail.com @ 2017-05-02 13:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493730418-24606-1-git-send-email-sunil.kovvuri@gmail.com>

From: Sunil Goutham <sgoutham@cavium.com>

Add support for XDP_TX, i.e. transmit the packet out of the
XDP TX queue mapped to the Rx queue on which the packet was
received.

Since an SQ for XDP TX is used only on a single CPU, for both
SQ descriptor creation and freeing, an atomic free count is
unnecessary and would become a bottleneck. Hence add a separate
'xdp_free_cnt', used by SQs designated for XDP, to track the
descriptor free count.

Changes also include:
- A new entry 'xdp_page' is added to save the transmitted packet's
  page pointer for later cleanup.
- The XDP Tx SQ's doorbell is rung once per NAPI instance.
- Retrieving the designated SQ for packets being sent out by the stack
  via 'nicvf_xmit'.
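
The free-count and doorbell scheme described above can be modelled in
plain C as follows. This is a sketch under stated assumptions, not the
driver implementation: 'struct model_sq', the 'model_sq_*' helpers and
RING_LEN are hypothetical, the ring length is assumed to be a power of
two, and the doorbell register write is replaced by returning the
pending count. Because only one CPU touches the queue, the counter is a
plain integer rather than an atomic.

```c
#include <assert.h>
#include <stdint.h>

#define RING_LEN 8u /* hypothetical ring size; must be a power of two */

struct model_sq {
	uint32_t head;     /* next descriptor to be freed */
	uint32_t tail;     /* next descriptor to be filled */
	uint16_t free_cnt; /* plain counter: single-CPU use only */
	uint16_t pending;  /* descriptors queued since the last doorbell */
};

static void model_sq_init(struct model_sq *sq)
{
	sq->head = 0;
	sq->tail = 0;
	sq->free_cnt = RING_LEN - 1; /* one slot is kept unused */
	sq->pending = 0;
}

/* Reserve cnt descriptors. Returns the first index, or -1 if full. */
static int model_sq_get(struct model_sq *sq, uint16_t cnt)
{
	int entry;

	assert(cnt > 0);
	if (cnt > sq->free_cnt)
		return -1;
	entry = (int)sq->tail;
	sq->free_cnt -= cnt;
	sq->tail = (sq->tail + cnt) & (RING_LEN - 1);
	sq->pending += cnt;
	return entry;
}

/* Give cnt completed descriptors back to the ring. */
static void model_sq_put(struct model_sq *sq, uint16_t cnt)
{
	sq->free_cnt += cnt;
	sq->head = (sq->head + cnt) & (RING_LEN - 1);
}

/* Ring the doorbell once for everything queued so far.
 * Returns the count that would be written to the register.
 */
static uint16_t model_sq_doorbell(struct model_sq *sq)
{
	uint16_t n = sq->pending;

	sq->pending = 0;
	return n;
}
```

Batching the doorbell this way means several packets appended during
one poll cycle cost a single register write in the model.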

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   |  63 ++++++++---
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 117 ++++++++++++++++++---
 drivers/net/ethernet/cavium/thunder/nicvf_queues.h |   7 ++
 3 files changed, 160 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index a58cc1e..bb13dee 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -501,9 +501,8 @@ static int nicvf_init_resources(struct nicvf *nic)
 	return 0;
 }
 
-static inline bool nicvf_xdp_rx(struct nicvf *nic,
-				struct bpf_prog *prog,
-				struct cqe_rx_t *cqe_rx)
+static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
+				struct cqe_rx_t *cqe_rx, struct snd_queue *sq)
 {
 	struct xdp_buff xdp;
 	struct page *page;
@@ -528,9 +527,11 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic,
 
 	switch (action) {
 	case XDP_PASS:
-	case XDP_TX:
-		/* Pass on all packets to network stack */
+		/* Pass on packet to network stack */
 		return false;
+	case XDP_TX:
+		nicvf_xdp_sq_append_pkt(nic, sq, (u64)xdp.data, dma_addr, len);
+		return true;
 	default:
 		bpf_warn_invalid_xdp_action(action);
 	case XDP_ABORTED:
@@ -560,6 +561,7 @@ static void nicvf_snd_pkt_handler(struct net_device *netdev,
 				  unsigned int *tx_pkts, unsigned int *tx_bytes)
 {
 	struct sk_buff *skb = NULL;
+	struct page *page;
 	struct nicvf *nic = netdev_priv(netdev);
 	struct snd_queue *sq;
 	struct sq_hdr_subdesc *hdr;
@@ -575,6 +577,22 @@ static void nicvf_snd_pkt_handler(struct net_device *netdev,
 	if (cqe_tx->send_status)
 		nicvf_check_cqe_tx_errs(nic->pnicvf, cqe_tx);
 
+	/* Is this a XDP designated Tx queue */
+	if (sq->is_xdp) {
+		page = (struct page *)sq->xdp_page[cqe_tx->sqe_ptr];
+		/* Check if it's recycled page or else unmap DMA mapping */
+		if (page && (page_ref_count(page) == 1))
+			nicvf_unmap_sndq_buffers(nic, sq, cqe_tx->sqe_ptr,
+						 hdr->subdesc_cnt);
+
+		/* Release page reference for recycling */
+		if (page)
+			put_page(page);
+		sq->xdp_page[cqe_tx->sqe_ptr] = (u64)NULL;
+		*subdesc_cnt += hdr->subdesc_cnt + 1;
+		return;
+	}
+
 	skb = (struct sk_buff *)sq->skbuff[cqe_tx->sqe_ptr];
 	if (skb) {
 		/* Check for dummy descriptor used for HW TSO offload on 88xx */
@@ -634,7 +652,7 @@ static inline void nicvf_set_rxhash(struct net_device *netdev,
 
 static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 				  struct napi_struct *napi,
-				  struct cqe_rx_t *cqe_rx)
+				  struct cqe_rx_t *cqe_rx, struct snd_queue *sq)
 {
 	struct sk_buff *skb;
 	struct nicvf *nic = netdev_priv(netdev);
@@ -659,7 +677,7 @@ static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 
 	/* For XDP, ignore pkts spanning multiple pages */
 	if (nic->xdp_prog && (cqe_rx->rb_cnt == 1))
-		if (nicvf_xdp_rx(snic, nic->xdp_prog, cqe_rx))
+		if (nicvf_xdp_rx(snic, nic->xdp_prog, cqe_rx, sq))
 			return;
 
 	skb = nicvf_get_rcv_skb(snic, cqe_rx, nic->xdp_prog ? true : false);
@@ -715,8 +733,8 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 	struct cmp_queue *cq = &qs->cq[cq_idx];
 	struct cqe_rx_t *cq_desc;
 	struct netdev_queue *txq;
-	struct snd_queue *sq;
-	unsigned int tx_pkts = 0, tx_bytes = 0;
+	struct snd_queue *sq = &qs->sq[cq_idx];
+	unsigned int tx_pkts = 0, tx_bytes = 0, txq_idx;
 
 	spin_lock_bh(&cq->lock);
 loop:
@@ -746,7 +764,7 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 
 		switch (cq_desc->cqe_type) {
 		case CQE_TYPE_RX:
-			nicvf_rcv_pkt_handler(netdev, napi, cq_desc);
+			nicvf_rcv_pkt_handler(netdev, napi, cq_desc, sq);
 			work_done++;
 		break;
 		case CQE_TYPE_SEND:
@@ -773,17 +791,26 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 		goto loop;
 
 done:
-	sq = &nic->qs->sq[cq_idx];
 	/* Update SQ's descriptor free count */
 	if (subdesc_cnt)
 		nicvf_put_sq_desc(sq, subdesc_cnt);
 
+	txq_idx = nicvf_netdev_qidx(nic, cq_idx);
+	/* Handle XDP TX queues */
+	if (nic->pnicvf->xdp_prog) {
+		if (txq_idx < nic->pnicvf->xdp_tx_queues) {
+			nicvf_xdp_sq_doorbell(nic, sq, cq_idx);
+			goto out;
+		}
+		nic = nic->pnicvf;
+		txq_idx -= nic->pnicvf->xdp_tx_queues;
+	}
+
 	/* Wakeup TXQ if its stopped earlier due to SQ full */
 	if (tx_done ||
 	    (atomic_read(&sq->free_cnt) >= MIN_SQ_DESC_PER_PKT_XMIT)) {
 		netdev = nic->pnicvf->netdev;
-		txq = netdev_get_tx_queue(netdev,
-					  nicvf_netdev_qidx(nic, cq_idx));
+		txq = netdev_get_tx_queue(netdev, txq_idx);
 		if (tx_pkts)
 			netdev_tx_completed_queue(txq, tx_pkts, tx_bytes);
 
@@ -796,10 +823,11 @@ static int nicvf_cq_intr_handler(struct net_device *netdev, u8 cq_idx,
 			if (netif_msg_tx_err(nic))
 				netdev_warn(netdev,
 					    "%s: Transmit queue wakeup SQ%d\n",
-					    netdev->name, cq_idx);
+					    netdev->name, txq_idx);
 		}
 	}
 
+out:
 	spin_unlock_bh(&cq->lock);
 	return work_done;
 }
@@ -1115,6 +1143,13 @@ static netdev_tx_t nicvf_xmit(struct sk_buff *skb, struct net_device *netdev)
 		return NETDEV_TX_OK;
 	}
 
+	/* In XDP case, initial HW tx queues are used for XDP,
+	 * but stack's queue mapping starts at '0', so skip the
+	 * Tx queues attached to Rx queues for XDP.
+	 */
+	if (nic->xdp_prog)
+		qid += nic->xdp_tx_queues;
+
 	snic = nic;
 	/* Get secondary Qset's SQ structure */
 	if (qid >= MAX_SND_QUEUES_PER_QS) {
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 5009f49..ec234b6 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -19,6 +19,8 @@
 #include "q_struct.h"
 #include "nicvf_queues.h"
 
+static inline void nicvf_sq_add_gather_subdesc(struct snd_queue *sq, int qentry,
+					       int size, u64 data);
 static void nicvf_get_page(struct nicvf *nic)
 {
 	if (!nic->rb_pageref || !nic->rb_page)
@@ -456,7 +458,7 @@ static void nicvf_free_cmp_queue(struct nicvf *nic, struct cmp_queue *cq)
 
 /* Initialize transmit queue */
 static int nicvf_init_snd_queue(struct nicvf *nic,
-				struct snd_queue *sq, int q_len)
+				struct snd_queue *sq, int q_len, int qidx)
 {
 	int err;
 
@@ -469,17 +471,38 @@ static int nicvf_init_snd_queue(struct nicvf *nic,
 	sq->skbuff = kcalloc(q_len, sizeof(u64), GFP_KERNEL);
 	if (!sq->skbuff)
 		return -ENOMEM;
+
 	sq->head = 0;
 	sq->tail = 0;
-	atomic_set(&sq->free_cnt, q_len - 1);
 	sq->thresh = SND_QUEUE_THRESH;
 
-	/* Preallocate memory for TSO segment's header */
-	sq->tso_hdrs = dma_alloc_coherent(&nic->pdev->dev,
-					  q_len * TSO_HEADER_SIZE,
-					  &sq->tso_hdrs_phys, GFP_KERNEL);
-	if (!sq->tso_hdrs)
-		return -ENOMEM;
+	/* Check if this SQ is a XDP TX queue */
+	if (nic->sqs_mode)
+		qidx += ((nic->sqs_id + 1) * MAX_SND_QUEUES_PER_QS);
+	if (qidx < nic->pnicvf->xdp_tx_queues) {
+		/* Alloc memory to save page pointers for XDP_TX */
+		sq->xdp_page = kcalloc(q_len, sizeof(u64), GFP_KERNEL);
+		if (!sq->xdp_page)
+			return -ENOMEM;
+		sq->xdp_desc_cnt = 0;
+		sq->xdp_free_cnt = q_len - 1;
+		sq->is_xdp = true;
+	} else {
+		sq->xdp_page = NULL;
+		sq->xdp_desc_cnt = 0;
+		sq->xdp_free_cnt = 0;
+		sq->is_xdp = false;
+
+		atomic_set(&sq->free_cnt, q_len - 1);
+
+		/* Preallocate memory for TSO segment's header */
+		sq->tso_hdrs = dma_alloc_coherent(&nic->pdev->dev,
+						  q_len * TSO_HEADER_SIZE,
+						  &sq->tso_hdrs_phys,
+						  GFP_KERNEL);
+		if (!sq->tso_hdrs)
+			return -ENOMEM;
+	}
 
 	return 0;
 }
@@ -505,6 +528,7 @@ void nicvf_unmap_sndq_buffers(struct nicvf *nic, struct snd_queue *sq,
 static void nicvf_free_snd_queue(struct nicvf *nic, struct snd_queue *sq)
 {
 	struct sk_buff *skb;
+	struct page *page;
 	struct sq_hdr_subdesc *hdr;
 	struct sq_hdr_subdesc *tso_sqe;
 
@@ -522,8 +546,15 @@ static void nicvf_free_snd_queue(struct nicvf *nic, struct snd_queue *sq)
 	smp_rmb();
 	while (sq->head != sq->tail) {
 		skb = (struct sk_buff *)sq->skbuff[sq->head];
-		if (!skb)
+		if (!skb || !sq->xdp_page)
+			goto next;
+
+		page = (struct page *)sq->xdp_page[sq->head];
+		if (!page)
 			goto next;
+		else
+			put_page(page);
+
 		hdr = (struct sq_hdr_subdesc *)GET_SQ_DESC(sq, sq->head);
 		/* Check for dummy descriptor used for HW TSO offload on 88xx */
 		if (hdr->dont_send) {
@@ -536,12 +567,14 @@ static void nicvf_free_snd_queue(struct nicvf *nic, struct snd_queue *sq)
 			nicvf_unmap_sndq_buffers(nic, sq, sq->head,
 						 hdr->subdesc_cnt);
 		}
-		dev_kfree_skb_any(skb);
+		if (skb)
+			dev_kfree_skb_any(skb);
 next:
 		sq->head++;
 		sq->head &= (sq->dmem.q_len - 1);
 	}
 	kfree(sq->skbuff);
+	kfree(sq->xdp_page);
 	nicvf_free_q_desc_mem(nic, &sq->dmem);
 }
 
@@ -932,7 +965,7 @@ static int nicvf_alloc_resources(struct nicvf *nic)
 
 	/* Alloc send queue */
 	for (qidx = 0; qidx < qs->sq_cnt; qidx++) {
-		if (nicvf_init_snd_queue(nic, &qs->sq[qidx], qs->sq_len))
+		if (nicvf_init_snd_queue(nic, &qs->sq[qidx], qs->sq_len, qidx))
 			goto alloc_fail;
 	}
 
@@ -1035,7 +1068,10 @@ static inline int nicvf_get_sq_desc(struct snd_queue *sq, int desc_cnt)
 	int qentry;
 
 	qentry = sq->tail;
-	atomic_sub(desc_cnt, &sq->free_cnt);
+	if (!sq->is_xdp)
+		atomic_sub(desc_cnt, &sq->free_cnt);
+	else
+		sq->xdp_free_cnt -= desc_cnt;
 	sq->tail += desc_cnt;
 	sq->tail &= (sq->dmem.q_len - 1);
 
@@ -1053,7 +1089,10 @@ static inline void nicvf_rollback_sq_desc(struct snd_queue *sq,
 /* Free descriptor back to SQ for future use */
 void nicvf_put_sq_desc(struct snd_queue *sq, int desc_cnt)
 {
-	atomic_add(desc_cnt, &sq->free_cnt);
+	if (!sq->is_xdp)
+		atomic_add(desc_cnt, &sq->free_cnt);
+	else
+		sq->xdp_free_cnt += desc_cnt;
 	sq->head += desc_cnt;
 	sq->head &= (sq->dmem.q_len - 1);
 }
@@ -1111,6 +1150,58 @@ void nicvf_sq_free_used_descs(struct net_device *netdev, struct snd_queue *sq,
 	}
 }
 
+/* XDP Transmit APIs */
+void nicvf_xdp_sq_doorbell(struct nicvf *nic,
+			   struct snd_queue *sq, int sq_num)
+{
+	if (!sq->xdp_desc_cnt)
+		return;
+
+	/* make sure all memory stores are done before ringing doorbell */
+	wmb();
+
+	/* Inform HW to xmit all TSO segments */
+	nicvf_queue_reg_write(nic, NIC_QSET_SQ_0_7_DOOR,
+			      sq_num, sq->xdp_desc_cnt);
+	sq->xdp_desc_cnt = 0;
+}
+
+static inline void
+nicvf_xdp_sq_add_hdr_subdesc(struct snd_queue *sq, int qentry,
+			     int subdesc_cnt, u64 data, int len)
+{
+	struct sq_hdr_subdesc *hdr;
+
+	hdr = (struct sq_hdr_subdesc *)GET_SQ_DESC(sq, qentry);
+	memset(hdr, 0, SND_QUEUE_DESC_SIZE);
+	hdr->subdesc_type = SQ_DESC_TYPE_HEADER;
+	hdr->subdesc_cnt = subdesc_cnt;
+	hdr->tot_len = len;
+	hdr->post_cqe = 1;
+	sq->xdp_page[qentry] = (u64)virt_to_page((void *)data);
+}
+
+int nicvf_xdp_sq_append_pkt(struct nicvf *nic, struct snd_queue *sq,
+			    u64 bufaddr, u64 dma_addr, u16 len)
+{
+	int subdesc_cnt = MIN_SQ_DESC_PER_PKT_XMIT;
+	int qentry;
+
+	if (subdesc_cnt > sq->xdp_free_cnt)
+		return 0;
+
+	qentry = nicvf_get_sq_desc(sq, subdesc_cnt);
+
+	nicvf_xdp_sq_add_hdr_subdesc(sq, qentry, subdesc_cnt - 1, bufaddr, len);
+
+	qentry = nicvf_get_nxt_sqentry(sq, qentry);
+	nicvf_sq_add_gather_subdesc(sq, qentry, len, dma_addr);
+
+	sq->xdp_desc_cnt += subdesc_cnt;
+
+	return 1;
+}
+
 /* Calculate no of SQ subdescriptors needed to transmit all
  * segments of this TSO packet.
  * Taken from 'Tilera network driver' with a minor modification.
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index db04c0e..a07d5b4 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -271,6 +271,10 @@ struct snd_queue {
 	u32		tail;
 	u64		*skbuff;
 	void		*desc;
+	u64		*xdp_page;
+	u16		xdp_desc_cnt;
+	u16		xdp_free_cnt;
+	bool		is_xdp;
 
 #define	TSO_HEADER_SIZE	128
 	/* For TSO segment's header */
@@ -339,6 +343,9 @@ void nicvf_sq_free_used_descs(struct net_device *netdev,
 			      struct snd_queue *sq, int qidx);
 int nicvf_sq_append_skb(struct nicvf *nic, struct snd_queue *sq,
 			struct sk_buff *skb, u8 sq_num);
+int nicvf_xdp_sq_append_pkt(struct nicvf *nic, struct snd_queue *sq,
+			    u64 bufaddr, u64 dma_addr, u16 len);
+void nicvf_xdp_sq_doorbell(struct nicvf *nic, struct snd_queue *sq, int sq_num);
 
 struct sk_buff *nicvf_get_rcv_skb(struct nicvf *nic,
 				  struct cqe_rx_t *cqe_rx, bool xdp);
-- 
2.7.4


* [PATCH 8/9] net: thunderx: Support for XDP header adjustment
From: sunil.kovvuri at gmail.com @ 2017-05-02 13:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493730418-24606-1-git-send-email-sunil.kovvuri@gmail.com>

From: Sunil Goutham <sgoutham@cavium.com>

When in XDP mode, reserve XDP_PACKET_HEADROOM bytes at the start
of the receive buffer so that the XDP program can modify headers and
adjust the packet start. Additional code changes handle such packets.
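
The length and address fix-up needed after a program moves the packet
start can be sketched as below. This is an illustrative userspace model
only: 'struct model_xdp', 'struct model_fixup' and
'model_fixup_after_prog' are hypothetical names, not kernel types. One
deliberate difference from the diff that follows: the model keeps the
shift as a signed delta so that both growing the headers into the
headroom and trimming them are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the data pointers of an XDP buffer. */
struct model_xdp {
	uint8_t *data;
	uint8_t *data_end;
};

/* Packet length and DMA address after the program has run. */
struct model_fixup {
	uint16_t len;
	uint64_t dma_addr;
};

static struct model_fixup
model_fixup_after_prog(const struct model_xdp *xdp,
		       const uint8_t *orig_data, uint64_t dma_addr)
{
	struct model_fixup out;
	/* Positive when the start moved back into the headroom,
	 * negative when the program trimmed bytes from the front.
	 */
	ptrdiff_t delta = orig_data - xdp->data;

	assert(xdp->data_end >= xdp->data);
	out.len = (uint16_t)(xdp->data_end - xdp->data);
	/* Unsigned wrap-around makes this correct for either sign. */
	out.dma_addr = dma_addr - (uint64_t)delta;
	return out;
}
```

For example, pushing a 14-byte header in front of a 100-byte packet
yields a length of 114 and a DMA address 14 bytes lower in this model.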

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
---
 drivers/net/ethernet/cavium/thunder/nicvf_main.c   | 63 ++++++++++++++++------
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |  9 +++-
 2 files changed, 55 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index bb13dee..d6477af 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -502,13 +502,15 @@ static int nicvf_init_resources(struct nicvf *nic)
 }
 
 static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
-				struct cqe_rx_t *cqe_rx, struct snd_queue *sq)
+				struct cqe_rx_t *cqe_rx, struct snd_queue *sq,
+				struct sk_buff **skb)
 {
 	struct xdp_buff xdp;
 	struct page *page;
 	u32 action;
-	u16 len;
+	u16 len, offset = 0;
 	u64 dma_addr, cpu_addr;
+	void *orig_data;
 
 	/* Retrieve packet buffer's DMA address and length */
 	len = *((u16 *)((void *)cqe_rx + (3 * sizeof(u64))));
@@ -517,17 +519,47 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 	cpu_addr = nicvf_iova_to_phys(nic, dma_addr);
 	if (!cpu_addr)
 		return false;
+	cpu_addr = (u64)phys_to_virt(cpu_addr);
+	page = virt_to_page((void *)cpu_addr);
 
-	xdp.data = phys_to_virt(cpu_addr);
+	xdp.data_hard_start = page_address(page);
+	xdp.data = (void *)cpu_addr;
 	xdp.data_end = xdp.data + len;
+	orig_data = xdp.data;
 
 	rcu_read_lock();
 	action = bpf_prog_run_xdp(prog, &xdp);
 	rcu_read_unlock();
 
+	/* Check if XDP program has changed headers */
+	if (orig_data != xdp.data) {
+		len = xdp.data_end - xdp.data;
+		offset = orig_data - xdp.data;
+		dma_addr -= offset;
+	}
+
 	switch (action) {
 	case XDP_PASS:
-		/* Pass on packet to network stack */
+		/* Check if it's a recycled page, if not
+		 * unmap the DMA mapping.
+		 *
+		 * Recycled page holds an extra reference.
+		 */
+		if (page_ref_count(page) == 1) {
+			dma_addr &= PAGE_MASK;
+			dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
+					     RCV_FRAG_LEN + XDP_PACKET_HEADROOM,
+					     DMA_FROM_DEVICE,
+					     DMA_ATTR_SKIP_CPU_SYNC);
+		}
+
+		/* Build SKB and pass on packet to network stack */
+		*skb = build_skb(xdp.data,
+				 RCV_FRAG_LEN - cqe_rx->align_pad + offset);
+		if (!*skb)
+			put_page(page);
+		else
+			skb_put(*skb, len);
 		return false;
 	case XDP_TX:
 		nicvf_xdp_sq_append_pkt(nic, sq, (u64)xdp.data, dma_addr, len);
@@ -537,7 +569,6 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 	case XDP_ABORTED:
 		trace_xdp_exception(nic->netdev, prog, action);
 	case XDP_DROP:
-		page = virt_to_page(xdp.data);
 		/* Check if it's a recycled page, if not
 		 * unmap the DMA mapping.
 		 *
@@ -546,7 +577,8 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 		if (page_ref_count(page) == 1) {
 			dma_addr &= PAGE_MASK;
 			dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
-					     RCV_FRAG_LEN, DMA_FROM_DEVICE,
+					     RCV_FRAG_LEN + XDP_PACKET_HEADROOM,
+					     DMA_FROM_DEVICE,
 					     DMA_ATTR_SKIP_CPU_SYNC);
 		}
 		put_page(page);
@@ -654,7 +686,7 @@ static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 				  struct napi_struct *napi,
 				  struct cqe_rx_t *cqe_rx, struct snd_queue *sq)
 {
-	struct sk_buff *skb;
+	struct sk_buff *skb = NULL;
 	struct nicvf *nic = netdev_priv(netdev);
 	struct nicvf *snic = nic;
 	int err = 0;
@@ -676,15 +708,17 @@ static void nicvf_rcv_pkt_handler(struct net_device *netdev,
 	}
 
 	/* For XDP, ignore pkts spanning multiple pages */
-	if (nic->xdp_prog && (cqe_rx->rb_cnt == 1))
-		if (nicvf_xdp_rx(snic, nic->xdp_prog, cqe_rx, sq))
+	if (nic->xdp_prog && (cqe_rx->rb_cnt == 1)) {
+		/* Packet consumed by XDP */
+		if (nicvf_xdp_rx(snic, nic->xdp_prog, cqe_rx, sq, &skb))
 			return;
+	} else {
+		skb = nicvf_get_rcv_skb(snic, cqe_rx,
+					nic->xdp_prog ? true : false);
+	}
 
-	skb = nicvf_get_rcv_skb(snic, cqe_rx, nic->xdp_prog ? true : false);
-	if (!skb) {
-		netdev_dbg(nic->netdev, "Packet not received\n");
+	if (!skb)
 		return;
-	}
 
 	if (netif_msg_pktdata(nic)) {
 		netdev_info(nic->netdev, "%s: skb 0x%p, len=%d\n", netdev->name,
@@ -1672,9 +1706,6 @@ static int nicvf_xdp_setup(struct nicvf *nic, struct bpf_prog *prog)
 		return -EOPNOTSUPP;
 	}
 
-	if (prog && prog->xdp_adjust_head)
-		return -EOPNOTSUPP;
-
 	/* ALL SQs attached to CQs i.e same as RQs, are treated as
 	 * XDP Tx queues and more Tx queues are allocated for
 	 * network stack to send pkts out.
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index ec234b6..43428ce 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -164,6 +164,11 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, struct rbdr *rbdr,
 	}
 
 	nic->rb_page_offset = 0;
+
+	/* Reserve space for header modifications by BPF program */
+	if (rbdr->is_xdp)
+		buf_len += XDP_PACKET_HEADROOM;
+
 	/* Check if it's recycled */
 	if (pgcache)
 		nic->rb_page = pgcache->page;
@@ -183,7 +188,7 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, struct rbdr *rbdr,
 			return -ENOMEM;
 		}
 		if (pgcache)
-			pgcache->dma_addr = *rbuf;
+			pgcache->dma_addr = *rbuf + XDP_PACKET_HEADROOM;
 		nic->rb_page_offset += buf_len;
 	}
 
@@ -1575,6 +1580,8 @@ static void nicvf_unmap_rcv_buffer(struct nicvf *nic, u64 dma_addr,
 		 */
 		if (page_ref_count(page) != 1)
 			return;
+
+		len += XDP_PACKET_HEADROOM;
 		/* Receive buffers in XDP mode are mapped from page start */
 		dma_addr &= PAGE_MASK;
 	}
-- 
2.7.4

^ permalink raw reply related

