Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] net: mvneta: replace MVNETA_CPU_D_CACHE_LINE_SIZE with L1_CACHE_BYTES
From: Thomas Petazzoni @ 2016-03-31 20:37 UTC (permalink / raw)
  To: David Miller; +Cc: jszhang, netdev, linux-kernel, linux-arm-kernel
In-Reply-To: <20160331.151547.1889188465826831929.davem@davemloft.net>

Hello,

On Thu, 31 Mar 2016 15:15:47 -0400 (EDT), David Miller wrote:
> From: Jisheng Zhang <jszhang@marvell.com>
> Date: Wed, 30 Mar 2016 19:55:21 +0800
> 
> > The mvneta is also used in some Marvell berlin family SoCs which may
> > have 64bytes cacheline size. Replace the MVNETA_CPU_D_CACHE_LINE_SIZE
> > usage with L1_CACHE_BYTES.
> > 
> > And since dma_alloc_coherent() is always cacheline size aligned, so
> > remove the align checks.
> > 
> > Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
> 
> Applied.

A new version of the patch was sent, which more rightfully uses
cache_line_size(), see:

 "[PATCH v2] net: mvneta: replace MVNETA_CPU_D_CACHE_LINE_SIZE with cache_line_size"

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply

* Re: [PATCH] fec: Do not access unexisting register in Coldfire
From: David Miller @ 2016-03-31 20:46 UTC (permalink / raw)
  To: festevam; +Cc: fugang.duan, troy.kisky, gerg, netdev, fabio.estevam
In-Reply-To: <1459436717-12809-1-git-send-email-festevam@gmail.com>

From: Fabio Estevam <festevam@gmail.com>
Date: Thu, 31 Mar 2016 12:05:17 -0300

> From: Fabio Estevam <fabio.estevam@nxp.com>
> 
> Commit 55cd48c821de ("net: fec: stop the "rcv is not +last, " error
> messages") introduces a write to a register that does not exist in
> Coldfire.
> 
> Move the FEC_FTRL register access inside the FEC_QUIRK_HAS_RACC 'if' block,
> so that we guarantee it will not be used on Coldfire CPUs.
> 
> Reported-by: Greg Ungerer <gerg@uclinux.org>
> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] net: mvneta: replace MVNETA_CPU_D_CACHE_LINE_SIZE with L1_CACHE_BYTES
From: David Miller @ 2016-03-31 20:47 UTC (permalink / raw)
  To: thomas.petazzoni; +Cc: jszhang, netdev, linux-kernel, linux-arm-kernel
In-Reply-To: <20160331223735.32904e42@free-electrons.com>

From: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Date: Thu, 31 Mar 2016 22:37:35 +0200

> Hello,
> 
> On Thu, 31 Mar 2016 15:15:47 -0400 (EDT), David Miller wrote:
>> From: Jisheng Zhang <jszhang@marvell.com>
>> Date: Wed, 30 Mar 2016 19:55:21 +0800
>> 
>> > The mvneta is also used in some Marvell berlin family SoCs which may
>> > have 64bytes cacheline size. Replace the MVNETA_CPU_D_CACHE_LINE_SIZE
>> > usage with L1_CACHE_BYTES.
>> > 
>> > And since dma_alloc_coherent() is always cacheline size aligned, so
>> > remove the align checks.
>> > 
>> > Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
>> 
>> Applied.
> 
> A new version of the patch was sent, which more rightfully uses
> cache_line_size(), see:
> 
>  "[PATCH v2] net: mvneta: replace MVNETA_CPU_D_CACHE_LINE_SIZE with cache_line_size"

Sorry about that.

Send me a realtive fixup patch if you like.

Thanks.

^ permalink raw reply

* Re: [PATCH net] rtnl: fix msg size calculation in if_nlmsg_size()
From: David Miller @ 2016-03-31 20:50 UTC (permalink / raw)
  To: nicolas.dichtel; +Cc: netdev, dsahern
In-Reply-To: <1459440631-31729-1-git-send-email-nicolas.dichtel@6wind.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Thu, 31 Mar 2016 18:10:31 +0200

> Size of the attribute IFLA_PHYS_PORT_NAME was missing.
> 
> Fixes: db24a9044ee1 ("net: add support for phys_port_name")
> CC: David Ahern <dsahern@gmail.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

Applied, and queued up for -stable, thanks Nicolas.

Man... this thing is getting really huge and unwieldly.

^ permalink raw reply

* Re: [PATCH v2] can: rcar_canfd: Add Renesas R-Car CAN FD driver
From: Marc Kleine-Budde @ 2016-03-31 20:51 UTC (permalink / raw)
  To: Ramesh Shanmugasundaram, wg, robh+dt, pawel.moll, mark.rutland,
	ijc+devicetree, galak, corbet
  Cc: linux-renesas-soc, devicetree, linux-can, netdev, linux-doc,
	geert+renesas, chris.paterson2
In-Reply-To: <1457019515-21158-1-git-send-email-ramesh.shanmugasundaram@bp.renesas.com>


[-- Attachment #1.1: Type: text/plain, Size: 62051 bytes --]

On 03/03/2016 04:38 PM, Ramesh Shanmugasundaram wrote:
> This patch adds support for the CAN FD controller found in Renesas R-Car
> SoCs. The controller operates in CAN FD mode by default.
> 
> CAN FD mode supports both Classical CAN & CAN FD frame formats. The
> controller supports ISO 11898-1:2015 CAN FD format only.
> 
> This controller supports two channels and the driver can enable either
> or both of the channels.
> 
> Driver uses Rx FIFOs (one per channel) for reception & Common FIFOs (one
> per channel) for transmission. Rx filter rules are configured to the
> minimum (one per channel) and it accepts Standard, Extended, Data &
> Remote Frame combinations.
> 
> Note: There are few documentation errors in R-Car Gen3 Hardware User
> Manual v0.5E with respect to CAN FD controller. They are listed below:
> 
> 1. CAN FD interrupt numbers 29 & 30 are listed as per channel
> interrupts. However, they are common to both channels (i.e.) they are
> global and channel interrupts respectively.
> 
> 2. CANFD clock is derived from PLL1. This is not documented.
> 
> 3. CANFD clock is further divided by (1/2) within the CAN FD controller.
> This is not documented.
> 
> 4. The minimum value of NTSEG1 in RSCFDnCFDCmNCFG register is 2 Tq. It
> is mentioned 4 Tq in the manual.
> 
> 5. The maximum number of message RAM area the controller can use is 3584
> bytes. It is specified 10752 bytes in the manual.
> 
> Signed-off-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com>
> ---
> Changes since v1:
> 	* Removed testmodes & debugfs code (suggested by Oliver H)
> 	* Fixed tx path race issue by introducing lock (suggested by Marc K)
> 	* Removed __maybe_unused attribute of rcar_canfd_of_table
> 
> Thanks,
> Ramesh
> ---
>  .../devicetree/bindings/net/can/rcar_canfd.txt     |   86 ++
>  drivers/net/can/Kconfig                            |   10 +
>  drivers/net/can/Makefile                           |    1 +
>  drivers/net/can/rcar_canfd.c                       | 1624 ++++++++++++++++++++
>  4 files changed, 1721 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/can/rcar_canfd.txt
>  create mode 100644 drivers/net/can/rcar_canfd.c
> 
> diff --git a/Documentation/devicetree/bindings/net/can/rcar_canfd.txt b/Documentation/devicetree/bindings/net/can/rcar_canfd.txt
> new file mode 100644
> index 0000000..4299bd8
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/can/rcar_canfd.txt
> @@ -0,0 +1,86 @@
> +Renesas R-Car CAN FD controller Device Tree Bindings
> +----------------------------------------------------
> +
> +Required properties:
> +- compatible: Must contain one or more of the following:
> +  - "renesas,rcar-gen3-canfd" for R-Car Gen3 compatible controller.
> +  - "renesas,r8a7795-canfd" for R8A7795 (R-Car H3) compatible controller.
> +
> +  When compatible with the generic version, nodes must list the
> +  SoC-specific version corresponding to the platform first, followed by the
> +  family-specific and/or generic versions.
> +
> +- reg: physical base address and size of the R-Car CAN FD register map.
> +- interrupts: interrupt specifier for the Global & Channel interrupts
> +- clocks: phandles and clock specifiers for 3 clock inputs.
> +- clock-names: 3 clock input name strings: "fck", "canfd", "can_clk".
> +- pinctrl-0: pin control group to be used for this controller.
> +- pinctrl-names: must be "default".
> +
> +Required properties for "renesas,r8a7795-canfd" compatible:
> +In R8A7795 SoC, canfd clock is a div6 clock and can be used by both CAN
> +and CAN FD controller at the same time. It needs to be scaled to maximum
> +frequency if any of these controllers use it. This is done using the
> +below properties.
> +
> +- assigned-clocks: phandle of canfd clock.
> +- assigned-clock-rates: maximum frequency of this clock.
> +
> +Each channel is represented as a child node. They can be enabled/disabled
> +using "status" property.
> +
> +Example
> +-------
> +
> +SoC common .dtsi file:
> +
> +		canfd: canfd@e66c0000 {
> +			compatible = "renesas,r8a7795-canfd",
> +				     "renesas,rcar-gen3-canfd";
> +			reg = <0 0xe66c0000 0 0x8000>;
> +			interrupts = <GIC_SPI 29 IRQ_TYPE_LEVEL_HIGH>,
> +				   <GIC_SPI 30 IRQ_TYPE_LEVEL_HIGH>;
> +			clocks = <&cpg CPG_MOD 914>,
> +			       <&cpg CPG_CORE R8A7795_CLK_CANFD>,
> +			       <&can_clk>;
> +			clock-names = "fck", "canfd", "can_clk";
> +			assigned-clocks = <&cpg CPG_CORE R8A7795_CLK_CANFD>;
> +			assigned-clock-rates = <40000000>;
> +			power-domains = <&cpg>;
> +			status = "disabled";
> +
> +			channel0 {
> +				status = "disabled";
> +			};
> +
> +			channel1 {
> +				status = "disabled";
> +			};
> +		};
> +
> +Board specific .dts file:
> +
> +E.g. below enables Channel 1 alone in the board.
> +
> +&canfd {
> +        pinctrl-0 = <&canfd1_pins>;
> +        pinctrl-names = "default";
> +        status = "okay";
> +
> +	channel1 {
> +		status = "okay";
> +	};
> +};
> +
> +E.g. below enables Channel 0 alone in the board using External clock
> +as fCAN clock.
> +
> +&canfd {
> +        pinctrl-0 = <&canfd0_pins &can_clk_pins>;
> +        pinctrl-names = "default";
> +        status = "okay";
> +
> +	channel0 {
> +		status = "okay";
> +	};
> +};
> diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig
> index 0d40aef..0ecb4fe 100644
> --- a/drivers/net/can/Kconfig
> +++ b/drivers/net/can/Kconfig
> @@ -114,6 +114,16 @@ config CAN_RCAR
>  	  To compile this driver as a module, choose M here: the module will
>  	  be called rcar_can.
>  
> +config CANFD_RCAR
> +	tristate "Renesas R-Car CAN FD controller"
> +	depends on ARCH_RENESAS || ARM
> +	---help---
> +	  Say Y here if you want to use CAN FD controller found on
> +	  Renesas R-Car SoCs.
> +
> +	  To compile this driver as a module, choose M here: the module will
> +	  be called rcar_canfd.
> +
>  config CAN_SUN4I
>  	tristate "Allwinner A10 CAN controller"
>  	depends on MACH_SUN4I || MACH_SUN7I || COMPILE_TEST
> diff --git a/drivers/net/can/Makefile b/drivers/net/can/Makefile
> index e3db0c8..401b150 100644
> --- a/drivers/net/can/Makefile
> +++ b/drivers/net/can/Makefile
> @@ -25,6 +25,7 @@ obj-$(CONFIG_CAN_JANZ_ICAN3)	+= janz-ican3.o
>  obj-$(CONFIG_CAN_MSCAN)		+= mscan/
>  obj-$(CONFIG_CAN_M_CAN)		+= m_can/
>  obj-$(CONFIG_CAN_RCAR)		+= rcar_can.o
> +obj-$(CONFIG_CANFD_RCAR)	+= rcar_canfd.o
>  obj-$(CONFIG_CAN_SJA1000)	+= sja1000/
>  obj-$(CONFIG_CAN_SUN4I)		+= sun4i_can.o
>  obj-$(CONFIG_CAN_TI_HECC)	+= ti_hecc.o
> diff --git a/drivers/net/can/rcar_canfd.c b/drivers/net/can/rcar_canfd.c
> new file mode 100644
> index 0000000..56e089d
> --- /dev/null
> +++ b/drivers/net/can/rcar_canfd.c
> @@ -0,0 +1,1624 @@
> +/* Renesas R-Car CAN FD device driver
> + *
> + * Copyright (C) 2015 Renesas Electronics Corp.
> + *
> + * This program is free software; you can redistribute  it and/or modify it
> + * under  the terms of  the GNU General  Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/moduleparam.h>
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/interrupt.h>
> +#include <linux/errno.h>
> +#include <linux/netdevice.h>
> +#include <linux/platform_device.h>
> +#include <linux/can/led.h>
> +#include <linux/can/dev.h>
> +#include <linux/clk.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include <linux/bitmap.h>
> +#include <linux/bitops.h>
> +#include <linux/iopoll.h>
> +
> +#define RCANFD_DRV_NAME			"rcar_canfd"
> +
> +#define RCANFD_FIFO_DEPTH		8	/* Tx FIFO depth */
> +#define RCANFD_NAPI_WEIGHT		8	/* Rx poll quota */
> +
> +#define RCANFD_NUM_CHANNELS		2
> +#define RCANFD_CHANNELS_MASK		0x3	/* Two channels max */

(BIT(RCANFD_NUM_CHANNELS) - 1

> +
> +/* Rx FIFO is a global resource of the controller. There are 8 such FIFOs
> + * available. Each channel gets a dedicated Rx FIFO (i.e.) the channel
> + * number is added to RFFIFO index.
> + */
> +#define RCANFD_RFFIFO_IDX		0
> +
> +/* Tx/Rx or Common FIFO is a per channel resource. Each channel has 3 Common
> + * FIFOs dedicated to them. Use the first (index 0) FIFO out of the 3 for Tx.
> + */
> +#define RCANFD_CFFIFO_IDX		0
> +
> +/* Global register bits */
> +#define RCANFD_GINTF_CANFD		BIT(0)
> +
> +#define RCANFD_GCFG_TPRI		BIT(0)
> +#define RCANFD_GCFG_DCE			BIT(1)
> +#define RCANFD_GCFG_DCS			BIT(4)
> +#define RCANFD_GCFG_CMPOC		BIT(5)
> +#define RCANFD_GCFG_EEFE		BIT(6)
> +
> +#define RCANFD_GCTR_SLPR		BIT(2)
> +#define RCANFD_GCTR_MODEMASK		(0x3)
> +#define RCANFD_GCTR_GOPM		(0x0)
> +#define RCANFD_GCTR_GRESET		(0x1)
> +#define RCANFD_GCTR_GHLT		(0x2)
> +
> +#define RCANFD_GCTR_DEIE		BIT(8)
> +#define RCANFD_GCTR_MEIE		BIT(9)
> +#define RCANFD_GCTR_THLEIE		BIT(10)
> +#define RCANFD_GCTR_CFMPOFIE		BIT(11)
> +#define RCANFD_GCTR_TSRST		BIT(16)
> +
> +#define RCANFD_GSTS_RAMINIT		BIT(3)
> +#define RCANFD_GSTS_SLP			BIT(2)
> +#define RCANFD_GSTS_HLT			BIT(1)
> +#define RCANFD_GSTS_RESET		BIT(0)
> +
> +#define RCANFD_GSTS_GNOPM		(BIT(0) | BIT(1) | BIT(2) | BIT(3))
> +
> +/* Channel register bits */
> +#define RCANFD_CCTR_CSLPR		BIT(2)
> +#define RCANFD_CCTR_MODEMASK		(0x3)
> +#define RCANFD_CCTR_COPM		(0x0)
> +#define RCANFD_CCTR_CRESET		(0x1)
> +#define RCANFD_CCTR_CHLT		(0x2)
> +#define RCANFD_CCTR_CTMASK		(0x3 << 25)
> +#define RCANFD_CCTR_CTMS_ILB		(0x3 << 25)
> +#define RCANFD_CCTR_CTME		BIT(24)
> +#define RCANFD_CCTR_ERRD		BIT(23)
> +#define RCANFD_CCTR_BOMMASK		(0x3 << 21)
> +#define RCANFD_CCTR_BOM_ENTRY		(0x1 << 21)
> +#define RCANFD_CCTR_TDCVFIE		BIT(19)
> +#define RCANFD_CCTR_SOCOIE		BIT(18)
> +#define RCANFD_CCTR_EOCOIE		BIT(17)
> +#define RCANFD_CCTR_TAIE		BIT(16)
> +#define RCANFD_CCTR_ALIE		BIT(15)
> +#define RCANFD_CCTR_BLIE		BIT(14)
> +#define RCANFD_CCTR_OLIE		BIT(13)
> +#define RCANFD_CCTR_BORIE		BIT(12)
> +#define RCANFD_CCTR_BOEIE		BIT(11)
> +#define RCANFD_CCTR_EPIE		BIT(10)
> +#define RCANFD_CCTR_EWIE		BIT(9)
> +#define RCANFD_CCTR_BEIE		BIT(8)
> +
> +#define RCANFD_CSTS_COM			BIT(7)
> +#define RCANFD_CSTS_REC			BIT(6)
> +#define RCANFD_CSTS_TRM			BIT(5)
> +#define RCANFD_CSTS_BO			BIT(4)
> +#define RCANFD_CSTS_EP			BIT(3)
> +#define RCANFD_CSTS_SLP			BIT(2)
> +#define RCANFD_CSTS_HALT		BIT(1)
> +#define RCANFD_CSTS_RESET		BIT(0)
> +
> +#define RCANFD_CSTS_TECCNT(x)		(((x) >> 24) & 0xff)
> +#define RCANFD_CSTS_RECCNT(x)		(((x) >> 16) & 0xff)
> +
> +/* Bit Configuration register */
> +#define RCANFD_BRP(x)			(((x) & 0x3ff) << 0)
> +#define RCANFD_SJW(x)			(((x) & 0x3) << 24)
> +#define RCANFD_TSEG1(x)			(((x) & 0xf) << 16)
> +#define RCANFD_TSEG2(x)			(((x) & 0x7) << 20)
> +
> +#define RCANFD_NR_BRP(x)		(((x) & 0x3ff) << 0)
> +#define RCANFD_NR_SJW(x)		(((x) & 0x1f) << 11)
> +#define RCANFD_NR_TSEG1(x)		(((x) & 0x7f) << 16)
> +#define RCANFD_NR_TSEG2(x)		(((x) & 0x1f) << 24)
> +
> +#define RCANFD_DR_BRP(x)		(((x) & 0xff) << 0)
> +#define RCANFD_DR_SJW(x)		(((x) & 0x7) << 24)
> +#define RCANFD_DR_TSEG1(x)		(((x) & 0xf) << 16)
> +#define RCANFD_DR_TSEG2(x)		(((x) & 0x7) << 20)
> +
> +/* Global Error flag bits */
> +#define RCANFD_GERFL_EEF1		BIT(17)
> +#define RCANFD_GERFL_EEF0		BIT(16)
> +#define RCANFD_GERFL_CMPOF		BIT(3)
> +#define RCANFD_GERFL_THLES		BIT(2)
> +#define RCANFD_GERFL_MES		BIT(1)
> +#define RCANFD_GERFL_DEF		BIT(0)
> +
> +#define RCANFD_GERFL_ERR(x)		((x) & (RCANFD_GERFL_EEF1 |\
> +						RCANFD_GERFL_EEF0 |\
> +						RCANFD_GERFL_MES |\
> +						RCANFD_GERFL_CMPOF))
> +
> +/* Channel Error flag bits */
> +#define RCANFD_CERFL_ADEF		BIT(14)
> +#define RCANFD_CERFL_B0EF		BIT(13)
> +#define RCANFD_CERFL_B1EF		BIT(12)
> +#define RCANFD_CERFL_CEF		BIT(11)
> +#define RCANFD_CERFL_AEF		BIT(10)
> +#define RCANFD_CERFL_FEF		BIT(9)
> +#define RCANFD_CERFL_SEF		BIT(8)
> +#define RCANFD_CERFL_ALEF		BIT(7)
> +#define RCANFD_CERFL_BLEF		BIT(6)
> +#define RCANFD_CERFL_OLEF		BIT(5)
> +#define RCANFD_CERFL_BOREF		BIT(4)
> +#define RCANFD_CERFL_BOEEF		BIT(3)
> +#define RCANFD_CERFL_EPEF		BIT(2)
> +#define RCANFD_CERFL_EWEF		BIT(1)
> +#define RCANFD_CERFL_BEF		BIT(0)
> +
> +#define RCANFD_CERFL_ERR(x)		((x) & (0x7fff)) /* above bits 14:0 */
> +
> +/* CAN FD specific registers */
> +#define RCANFD_DCFG_TDCE		BIT(9)
> +#define RCANFD_DCFG_TDCOC		BIT(8)
> +#define RCANFD_DCFG_TDCO(x)		(((x) & 0x7f) >> 16)
> +
> +#define RCANFD_DCSTS_TDCR(x)		(((x) >> 0) & 0x7f)
> +#define RCANFD_DCSTS_SOCCNT(x)		(((x) >> 24) & 0xff)
> +#define RCANFD_DCSTS_EOCCNT(x)		(((x) >> 16) & 0xff)
> +
> +#define RCANFD_DCSTS_TDCVF		BIT(7)
> +#define RCANFD_DCSTS_EOCO		BIT(8)
> +#define RCANFD_DCSTS_SOCO		BIT(9)
> +
> +/* Rx FIFO bits */
> +#define RCANFD_RFFIFO_RFIF		BIT(3)
> +#define RCANFD_RFFIFO_RFMLT		BIT(2)
> +#define RCANFD_RFFIFO_RFFLL		BIT(1)
> +#define RCANFD_RFFIFO_RFEMP		BIT(0)
> +
> +#define RCANFD_RFFIFO_RFESI		BIT(0)
> +#define RCANFD_RFFIFO_RFBRS		BIT(1)
> +#define RCANFD_RFFIFO_RFFDF		BIT(2)
> +
> +#define RCANFD_RFFIFO_RFIDE		BIT(31)
> +#define RCANFD_RFFIFO_RFRTR		BIT(30)
> +
> +#define RCANFD_RFFIFO_RFDLC(x)		(((x) >> 28) & 0xf)
> +#define RCANFD_RFFIFO_RFPTR(x)		(((x) >> 16) & 0xfff)
> +#define RCANFD_RFFIFO_RFTS(x)		(((x) >> 0) & 0xff)
> +
> +#define RCANFD_RFFIFO_RFIM		BIT(12)
> +#define RCANFD_RFFIFO_RFDC(x)		(((x) & 0x7) << 8)
> +#define RCANFD_RFFIFO_RFPLS(x)		(((x) & 0x7) << 4)
> +#define RCANFD_RFFIFO_RFIE		BIT(1)
> +#define RCANFD_RFFIFO_RFE		BIT(0)
> +
> +/* Common FIFO bits */
> +#define RCANFD_CMFIFO_TML(x)		(((x) & 0xf) << 20)
> +#define RCANFD_CMFIFO_M(x)		(((x) & 0x3) << 16)
> +#define RCANFD_CMFIFO_CFIM		BIT(12)
> +#define RCANFD_CMFIFO_DC(x)		(((x) & 0x7) << 8)
> +#define RCANFD_CMFIFO_PLS(x)		(((x) & 0x7) << 4)
> +#define RCANFD_CMFIFO_CFTXIE		BIT(2)
> +#define RCANFD_CMFIFO_CFE		BIT(0)
> +
> +#define RCANFD_CMFIFO_CFTXIF		BIT(4)
> +#define RCANFD_CMFIFO_CFMLT		BIT(2)
> +#define RCANFD_CMFIFO_CFFLL		BIT(1)
> +#define RCANFD_CMFIFO_CFEMP		BIT(0)
> +#define RCANFD_CMFIFO_CFMC(x)		(((x) >> 8) & 0xff)
> +
> +#define RCANFD_CMFIFO_CFESI		BIT(0)
> +#define RCANFD_CMFIFO_CFBRS		BIT(1)
> +#define RCANFD_CMFIFO_CFFDF		BIT(2)
> +
> +#define RCANFD_CMFIFO_CFIDE		BIT(31)
> +#define RCANFD_CMFIFO_CFRTR		BIT(30)
> +#define RCANFD_CMFIFO_CFID(x)		((x) & 0x1fffffff)
> +
> +#define RCANFD_CMFIFO_CFDLC(x)		(((x) & 0xf) << 28)
> +#define RCANFD_CMFIFO_CFPTR(x)		(((x) & 0xfff) << 16)
> +#define RCANFD_CMFIFO_CFTS(x)		(((x) & 0xff) << 0)
> +
> +/* Global Test Config register */
> +#define RCANFD_GTSTCFG_C0CBCE		BIT(0)
> +#define RCANFD_GTSTCFG_C1CBCE		BIT(1)
> +
> +#define RCANFD_GTSTCTR_ICBCTME		BIT(0)
> +
> +/* AFL Rx rules registers */
> +#define RCANFD_AFLCFG_SETRNC0(x)	(((x) & 0xff) << 24)
> +#define RCANFD_AFLCFG_SETRNC1(x)	(((x) & 0xff) << 16)

What about something like:

#define RCANFD_AFLCFG_SETRNC(n, x)	(((x) & 0xff) << (24 - n * 8))

This will save some if()s in the code

> +#define RCANFD_AFLCFG_GETRNC0(x)	(((x) >> 24) & 0xff)
> +#define RCANFD_AFLCFG_GETRNC1(x)	(((x) >> 16) & 0xff)
> +
> +#define RCANFD_AFL_PAGENUM(entry)	((entry) / 16)
> +
> +#define RCANFD_AFLCTR_AFLDAE		BIT(8)
> +#define RCANFD_AFLCTR_AFLPN(x)		((x) & 0x1f)
> +#define RCANFD_CHANNEL_NUMRULES		1	/* only one rule per channel */
> +#define RCANFD_AFLID_GAFLLB		BIT(29)
> +#define RCANFD_AFLPTR1_RFFIFO(x)	(1 << (x))
> +
> +/* Common register map offsets */
> +
> +/* Nominal rate registers */
> +#define RCANFD_CCFG(m)			(0x0000 + (0x10 * (m)))
> +#define RCANFD_CCTR(m)			(0x0004 + (0x10 * (m)))
> +#define RCANFD_CSTS(m)			(0x0008 + (0x10 * (m)))
> +#define RCANFD_CERFL(m)			(0x000C + (0x10 * (m)))
> +
> +/* Global registers */
> +#define RCANFD_GCFG			(0x0084)
> +#define RCANFD_GCTR			(0x0088)
> +#define RCANFD_GSTS			(0x008c)
> +#define RCANFD_GERFL			(0x0090)
> +#define RCANFD_GTSC			(0x0094)
> +#define RCANFD_GAFLECTR			(0x0098)
> +#define RCANFD_GAFLCFG0			(0x009c)
> +#define RCANFD_GAFLCFG1			(0x00a0)
> +#define RCANFD_RMNB			(0x00a4)
> +
> +#define RCANFD_RMND(y)			(0x00a8 + (0x04 * (y)))
> +
> +/* Rx FIFO Control registers */
> +#define RCANFD_RFCC(x)			(0x00b8 + (0x04 * (x)))
> +#define RCANFD_RFSTS(x)			(0x00d8 + (0x04 * (x)))
> +#define RCANFD_RFPCTR(x)		(0x00f8 + (0x04 * (x)))
> +
> +/* Common FIFO Control registers */
> +#define RCANFD_CFCC(ch, idx)		(0x0118 + (0x0c * (ch)) + \
> +					 (0x04 * (idx)))
> +#define RCANFD_CFSTS(ch, idx)		(0x0178 + (0x0c * (ch)) + \
> +					 (0x04 * (idx)))
> +#define RCANFD_CFPCTR(ch, idx)		(0x01d8 + (0x0c * (ch)) + \
> +					 (0x04 * (idx)))
> +
> +#define RCANFD_FESTS			(0x0238)
> +#define RCANFD_FFSTS			(0x023c)
> +#define RCANFD_FMSTS			(0x0240)
> +#define RCANFD_RFISTS			(0x0244)
> +#define RCANFD_CFRISTS			(0x0248)
> +#define RCANFD_CFTISTS			(0x024c)
> +
> +#define RCANFD_TMC(p)			(0x0250 + (0x01 * (p)))
> +#define RCANFD_TMSTS(p)			(0x02d0 + (0x01 * (p)))
> +
> +#define RCANFD_TMTRSTS(y)		(0x0350 + (0x04 * (y)))
> +#define RCANFD_TMTARSTS(y)		(0x0360 + (0x04 * (y)))
> +#define RCANFD_TMTCSTS(y)		(0x0370 + (0x04 * (y)))
> +#define RCANFD_TMTASTS(y)		(0x0380 + (0x04 * (y)))
> +#define RCANFD_TMIEC(y)			(0x0390 + (0x04 * (y)))
> +
> +#define RCANFD_TXQCC(m)			(0x03a0 + (0x04 * (m)))
> +#define RCANFD_TXQSTS(m)		(0x03c0 + (0x04 * (m)))
> +#define RCANFD_TXQPCTR(m)		(0x03e0 + (0x04 * (m)))
> +
> +#define RCANFD_THLCC(m)			(0x0400 + (0x04 * (m)))
> +#define RCANFD_THLSTS(m)		(0x0420 + (0x04 * (m)))
> +#define RCANFD_THLPCTR(m)		(0x0440 + (0x04 * (m)))
> +
> +#define RCANFD_GTINTSTS0		(0x0460)
> +#define RCANFD_GTINTSTS1		(0x0464)
> +#define RCANFD_GTSTCFG			(0x0468)
> +#define RCANFD_GTSTCTR			(0x046c)
> +#define RCANFD_GLOCKK			(0x047c)
> +#define RCANFD_GRMCFG			(0x04fc)
> +
> +/* Receive rule registers */
> +#define RCANFD_GAFLID(offset, j)	((offset) + (0x10 * (j)))
> +#define RCANFD_GAFLM(offset, j)		((offset) + 0x04 + (0x10 * (j)))
> +#define RCANFD_GAFLP0(offset, j)	((offset) + 0x08 + (0x10 * (j)))
> +#define RCANFD_GAFLP1(offset, j)	((offset) + 0x0c + (0x10 * (j)))
> +
> +/* CAN FD mode specific regsiter map */
> +
> +/* Data bitrate registers */
> +#define RCANFD_F_CDFG(m)		(0x0500 + (0x20 * (m)))
> +#define RCANFD_F_CFDCFG(m)		(0x0504 + (0x20 * (m)))
> +#define RCANFD_F_CFDCTR(m)		(0x0508 + (0x20 * (m)))
> +#define RCANFD_F_CFDSTS(m)		(0x050c + (0x20 * (m)))
> +#define RCANFD_F_CFDCRC(m)		(0x0510 + (0x20 * (m)))
> +
> +#define RCANFD_F_GAFL_OFFSET		(0x1000)
> +
> +#define RCANFD_F_RMID(q)		(0x2000 + (0x10 * (q)))
> +#define RCANFD_F_RMPTR(q)		(0x2004 + (0x10 * (q)))
> +#define RCANFD_F_RMFDSTS(q)		(0x2008 + (0x10 * (q)))
> +#define RCANFD_F_RMDF(q, b)		(0x200c + (0x04 * (b)) + (0x20 * (q)))
> +
> +/* Rx FIFO data registers */
> +#define RCANFD_F_RFOFFSET		(0x3000)
> +#define RCANFD_F_RFID(x)		(RCANFD_F_RFOFFSET + (0x80 * (x)))
> +#define RCANFD_F_RFPTR(x)		(RCANFD_F_RFOFFSET + 0x04 + \
> +					 (0x80 * (x)))
> +#define RCANFD_F_RFFDSTS(x)		(RCANFD_F_RFOFFSET + 0x08 + \
> +					 (0x80 * (x)))
> +#define RCANFD_F_RFDF(x, df)		(RCANFD_F_RFOFFSET + 0x0c + \
> +					 (0x80 * (x)) + (0x04 * (df)))
> +
> +/* Common FIFO data registers */
> +#define RCANFD_F_CFOFFSET		(0x3400)
> +#define RCANFD_F_CFID(ch, idx)		(RCANFD_F_CFOFFSET + (0x180 * (ch)) + \
> +					 (0x80 * (idx)))
> +#define RCANFD_F_CFPTR(ch, idx)		(RCANFD_F_CFOFFSET + 0x04 + \
> +					 (0x180 * (ch)) + (0x80 * (idx)))
> +#define RCANFD_F_CFFDCSTS(ch, idx)	(RCANFD_F_CFOFFSET + 0x08 + \
> +					 (0x180 * (ch)) + (0x80 * (idx)))
> +#define RCANFD_F_CFDF(ch, idx, df)	(RCANFD_F_CFOFFSET + 0x0c + \
> +					 (0x180 * (ch)) + (0x80 * (idx)) + \
> +					 (0x04 * (df)))
> +
> +#define RCANFD_F_TMID(p)		(0x4000 + (0x20 * (p)))
> +#define RCANFD_F_TMPTR(p)		(0x4004 + (0x20 * (p)))
> +#define RCANFD_F_TMFDCTR(p)		(0x4008 + (0x20 * (p)))
> +#define RCANFD_F_TMDF(p, b)		(0x400c + (0x20 * (p)) + (0x04 * (b)))
> +
> +#define RCANFD_F_THLACC(m)		(0x6000 + (0x04 * (m)))
> +#define RCANFD_F_RPGACC(r)		(0x6400 + (0x04 * (r)))
> +
> +struct rcar_canfd_global;
> +
> +/* Channel priv data */
> +struct rcar_canfd_channel {
> +	struct can_priv can;			/* Must be the first member */
> +	struct net_device *ndev;
> +	struct rcar_canfd_global *gpriv;	/* Controller reference */
> +	void __iomem *base;			/* Register base address */
> +#ifdef CONFIG_DEBUG_FS
> +	struct dentry *dev_root;
> +#endif /* CONFIG_DEBUG_FS */
> +	struct napi_struct napi;
> +	u8  tx_len[RCANFD_FIFO_DEPTH];		/* For net stats */
> +	u32 tx_head;				/* Incremented on xmit */
> +	u32 tx_tail;				/* Incremented on xmit done */
> +	u32 channel;				/* Channel number */
> +	spinlock_t tx_lock;			/* To protect tx path */
> +};
> +
> +/* Global priv data */
> +struct rcar_canfd_global {
> +	struct rcar_canfd_channel *ch[RCANFD_NUM_CHANNELS];
> +	void __iomem *base;		/* Register base address */
> +	struct platform_device *pdev;	/* Respective platform device */
> +	struct clk *clkp;		/* Peripheral clock */
> +	struct clk *can_clk;		/* fCAN clock */
> +	int clock_select;		/* CANFD or Ext clock */
        ^^^
please give the corresponding enum a proper name and use it here.

> +	unsigned long channels_mask;	/* Enabled channels mask */
> +	u32 freq;			/* fCAN clock frequency in Hz */

This value is not used outside of the probe function. You can pass the
freq to the individual channels via an argument.

> +};
> +
> +/* CAN FD mode nominal rate constants */
> +static const struct can_bittiming_const rcar_canfd_nom_bittiming_const = {
> +	.name = RCANFD_DRV_NAME,
> +	.tseg1_min = 2,
> +	.tseg1_max = 128,
> +	.tseg2_min = 2,
> +	.tseg2_max = 32,
> +	.sjw_max = 32,
> +	.brp_min = 1,
> +	.brp_max = 1024,
> +	.brp_inc = 1,
> +};
> +
> +/* CAN FD mode data rate constants */
> +static const struct can_bittiming_const rcar_canfd_data_bittiming_const = {
> +	.name = RCANFD_DRV_NAME,
> +	.tseg1_min = 2,
> +	.tseg1_max = 16,
> +	.tseg2_min = 2,
> +	.tseg2_max = 8,
> +	.sjw_max = 8,
> +	.brp_min = 1,
> +	.brp_max = 256,
> +	.brp_inc = 1,
> +};
> +
> +/* fCAN clock select register settings */
> +enum {
> +	RCANFD_CANFDCLK = 0,	/* CANFD clock */
> +	RCANFD_EXTCLK,		/* Externally input clock */
> +};
> +
> +/* Helper functions */
> +static inline void rcar_canfd_update(u32 mask, u32 val, u32 __iomem *reg)
> +{
> +	u32 data = readl(reg);
> +
> +	data &= ~mask;
> +	data |= (val & mask);
> +	writel(data, reg);
> +}
> +
> +#define rcar_canfd_read(priv, offset)			\
> +	readl(priv->base + (offset))
> +#define rcar_canfd_write(priv, offset, val)		\
> +	writel(val, priv->base + (offset))
> +#define rcar_canfd_set_bit(priv, reg, val)		\
> +	rcar_canfd_update(val, val, priv->base + (reg))
> +#define rcar_canfd_clear_bit(priv, reg, val)		\
> +	rcar_canfd_update(val, 0, priv->base + (reg))
> +#define rcar_canfd_update_bit(priv, reg, mask, val)	\
> +	rcar_canfd_update(mask, val, priv->base + (reg))

please use static inline functions instead of defines.

> +
> +static void rcar_canfd_get_data(struct canfd_frame *cf,
> +				struct rcar_canfd_channel *priv, u32 off)

Please use "struct rcar_canfd_channel *priv" as first argument, struct
canfd_frame *cf as second. Remove off, as the offset is already defined
by the channel.

> +{
> +	u32 i, lwords;
> +
> +	lwords = cf->len / sizeof(u32);
> +	if (cf->len % sizeof(u32))
> +		lwords++;

Use DIV_ROUND_UP() instead of open coding it.

> +	for (i = 0; i < lwords; i++)
> +		*((u32 *)cf->data + i) =
> +			rcar_canfd_read(priv, off + (i * sizeof(u32)));
> +}
> +
> +static void rcar_canfd_put_data(struct canfd_frame *cf,
> +				struct rcar_canfd_channel *priv, u32 off)

same here

> +{
> +	u32 i, j, lwords, leftover;
> +	u32 data = 0;
> +
> +	lwords = cf->len / sizeof(u32);
> +	leftover = cf->len % sizeof(u32);
> +	for (i = 0; i < lwords; i++)
> +		rcar_canfd_write(priv, off + (i * sizeof(u32)),
> +				 *((u32 *)cf->data + i));

Here you don't convert the endianess...
> +
> +	if (leftover) {
> +		u8 *p = (u8 *)((u32 *)cf->data + i);
> +
> +		for (j = 0; j < leftover; j++)
> +			data |= p[j] << (j * 8);

...here you do an implicit endianess conversion. "data" is little
endian, while p[j] is big endian.

> +		rcar_canfd_write(priv, off + (i * sizeof(u32)), data);
> +	}

Have you tested to send CAN frames with len != n*4 against a different
controller?

> +}
> +
> +static void rcar_canfd_tx_failure_cleanup(struct net_device *ndev)
> +{
> +	u32 i;
> +
> +	for (i = 0; i < RCANFD_FIFO_DEPTH; i++)
> +		can_free_echo_skb(ndev, i);
> +}
> +
> +static int rcar_canfd_reset_controller(struct rcar_canfd_global *gpriv)
> +{
> +	u32 sts, ch;
> +	int err;
> +
> +	/* Check RAMINIT flag as CAN RAM initialization takes place
> +	 * after the MCU reset
> +	 */
> +	err = readl_poll_timeout((gpriv->base + RCANFD_GSTS), sts,
> +				 !(sts & RCANFD_GSTS_RAMINIT), 2, 500000);
> +	if (err) {
> +		dev_dbg(&gpriv->pdev->dev, "global raminit failed\n");
> +		return err;
> +	}
> +
> +	/* Transition to Global Reset mode */
> +	rcar_canfd_clear_bit(gpriv, RCANFD_GCTR, RCANFD_GCTR_SLPR);
> +	rcar_canfd_update_bit(gpriv, RCANFD_GCTR,
> +			      RCANFD_GCTR_MODEMASK, RCANFD_GCTR_GRESET);
> +
> +	/* Ensure Global reset mode */
> +	err = readl_poll_timeout((gpriv->base + RCANFD_GSTS), sts,
> +				 (sts & RCANFD_GSTS_RESET), 2, 500000);
> +	if (err) {
> +		dev_dbg(&gpriv->pdev->dev, "global reset failed\n");
> +		return err;
> +	}
> +
> +	/* Reset Global error flags */
> +	rcar_canfd_write(gpriv, RCANFD_GERFL, 0x0);
> +
> +	/* Set the controller into FD mode */
> +	rcar_canfd_set_bit(gpriv, RCANFD_GRMCFG, RCANFD_GINTF_CANFD);
> +
> +	/* Transition all Channels to reset mode */
> +	for_each_set_bit(ch, &gpriv->channels_mask, RCANFD_NUM_CHANNELS) {
> +		rcar_canfd_clear_bit(gpriv, RCANFD_CCTR(ch), RCANFD_CCTR_CSLPR);
> +
> +		rcar_canfd_update_bit(gpriv, RCANFD_CCTR(ch),
> +				      RCANFD_CCTR_MODEMASK,
> +				      RCANFD_CCTR_CRESET);
> +
> +		/* Ensure Channel reset mode */
> +		err = readl_poll_timeout((gpriv->base + RCANFD_CSTS(ch)), sts,
> +					 (sts & RCANFD_CSTS_RESET), 2, 500000);
> +		if (err) {
> +			dev_dbg(&gpriv->pdev->dev,
> +				"channel %u reset failed\n", ch);
> +			return err;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static void rcar_canfd_configure_controller(struct rcar_canfd_global *gpriv)
> +{
> +	u32 cfg, ch;
> +
> +	/* Global Configuration settings */
> +	cfg = RCANFD_GCFG_EEFE;		/* ECC error flag enabled */
> +
> +	/* Set External Clock if selected */
> +	if (gpriv->clock_select != RCANFD_CANFDCLK)
> +		cfg |= RCANFD_GCFG_DCS;
> +
> +	/* Truncate payload to configured message size RFPLS */
> +	cfg |= RCANFD_GCFG_CMPOC;
> +
> +	rcar_canfd_set_bit(gpriv, RCANFD_GCFG, cfg);
> +
> +	/* Channel configuration settings */
> +	for_each_set_bit(ch, &gpriv->channels_mask, RCANFD_NUM_CHANNELS) {
> +		rcar_canfd_set_bit(gpriv, RCANFD_CCTR(ch), RCANFD_CCTR_ERRD);
> +		rcar_canfd_update_bit(gpriv, RCANFD_CCTR(ch),
> +				      RCANFD_CCTR_BOMMASK,
> +				      RCANFD_CCTR_BOM_ENTRY);
> +	}
> +}
> +
> +static void rcar_canfd_configure_afl_rules(struct rcar_canfd_global *gpriv,
> +					   u32 ch)
> +{
> +	u32 cfg;
> +	int start, page, num_rules = RCANFD_CHANNEL_NUMRULES;
> +	u32 ridx = ch + RCANFD_RFFIFO_IDX;
> +
> +	if (ch == 0) {
> +		start = 0; /* Start from 0th rule */
> +	} else {
> +		/* Get number of existing rules and adjust */
> +		cfg = rcar_canfd_read(gpriv, RCANFD_GAFLCFG0);
> +		start = RCANFD_AFLCFG_GETRNC0(cfg);
> +	}
> +
> +	/* Enable write access to entry */
> +	page = RCANFD_AFL_PAGENUM(start);
> +	rcar_canfd_set_bit(gpriv, RCANFD_GAFLECTR,
> +			   (RCANFD_AFLCTR_AFLPN(page) | RCANFD_AFLCTR_AFLDAE));
> +
> +	/* Write number of rules for channel */
> +	if (ch == 0)
> +		rcar_canfd_set_bit(gpriv, RCANFD_GAFLCFG0,
> +				   RCANFD_AFLCFG_SETRNC0(num_rules));
> +	else
> +		rcar_canfd_set_bit(gpriv, RCANFD_GAFLCFG0,
> +				   RCANFD_AFLCFG_SETRNC1(num_rules));
> +
> +	/* Accept all IDs */
> +	rcar_canfd_write(gpriv, RCANFD_GAFLID(RCANFD_F_GAFL_OFFSET, start), 0);
> +	/* IDE or RTR is not considered for matching */
> +	rcar_canfd_write(gpriv, RCANFD_GAFLM(RCANFD_F_GAFL_OFFSET, start), 0);
> +	/* Any data length accepted */
> +	rcar_canfd_write(gpriv, RCANFD_GAFLP0(RCANFD_F_GAFL_OFFSET, start), 0);
> +	/* Place the msg in corresponding Rx FIFO entry */
> +	rcar_canfd_write(gpriv, RCANFD_GAFLP1(RCANFD_F_GAFL_OFFSET, start),
> +			 RCANFD_AFLPTR1_RFFIFO(ridx));
> +
> +	/* Disable write access to page */
> +	rcar_canfd_clear_bit(gpriv, RCANFD_GAFLECTR, RCANFD_AFLCTR_AFLDAE);
> +}
> +
> +static void rcar_canfd_configure_rx(struct rcar_canfd_global *gpriv, u32 ch)
> +{
> +	/* Rx FIFO is used for reception */
> +	u32 cfg;
> +	u16 rfdc, rfpls;
> +
> +	/* Select Rx FIFO based on channel */
> +	u32 ridx = ch + RCANFD_RFFIFO_IDX;
> +
> +	rfdc = 2;		/* b010 - 8 messages Rx FIFO depth */
> +	rfpls = 7;		/* b111 - Max 64 bytes payload */
> +
> +	cfg = (RCANFD_RFFIFO_RFIM | RCANFD_RFFIFO_RFDC(rfdc) |
> +		RCANFD_RFFIFO_RFPLS(rfpls) | RCANFD_RFFIFO_RFIE);
> +	rcar_canfd_write(gpriv, RCANFD_RFCC(ridx), cfg);
> +}
> +
> +static void rcar_canfd_configure_tx(struct rcar_canfd_global *gpriv, u32 ch)
> +{
> +	/* Tx/Rx(Common) FIFO configured in Tx mode is
> +	 * used for transmission
> +	 *
> +	 * Each channel has 3 Common FIFO dedicated to them.
> +	 * Use the 1st (index 0) out of 3
> +	 */
> +	u32 cfg;
> +	u16 cftml, cfm, cfdc, cfpls;
> +
> +	cftml = 0;		/* 0th buffer */
> +	cfm = 1;		/* b01 - Transmit mode */
> +	cfdc = 2;		/* b010 - 8 messages Tx FIFO depth */
> +	cfpls = 7;		/* b111 - Max 64 bytes payload */
> +
> +	cfg = (RCANFD_CMFIFO_TML(cftml) | RCANFD_CMFIFO_M(cfm) |
> +		RCANFD_CMFIFO_CFIM | RCANFD_CMFIFO_DC(cfdc) |
> +		RCANFD_CMFIFO_PLS(cfpls) | RCANFD_CMFIFO_CFTXIE);
> +	rcar_canfd_write(gpriv, RCANFD_CFCC(ch, RCANFD_CFFIFO_IDX), cfg);
> +
> +	/* Clear FD mode specific control/status register */
> +	rcar_canfd_write(gpriv, RCANFD_F_CFFDCSTS(ch, RCANFD_CFFIFO_IDX), 0);
> +}
> +
> +static void rcar_canfd_enable_global_interrupts(struct rcar_canfd_global *gpriv)
> +{
> +	u32 ctr;
> +
> +	/* Clear any stray error interrupt flags */
> +	rcar_canfd_write(gpriv, RCANFD_GERFL, 0);
> +
> +	/* Global interrupts setup */
> +	ctr = RCANFD_GCTR_MEIE;
> +	ctr |= RCANFD_GCTR_CFMPOFIE;
> +
> +	rcar_canfd_set_bit(gpriv, RCANFD_GCTR, ctr);
> +}
> +
> +static void rcar_canfd_disable_global_interrupts(struct rcar_canfd_global
> +						 *gpriv)
> +{
> +	/* Disable all interrupts */
> +	rcar_canfd_write(gpriv, RCANFD_GCTR, 0);
> +
> +	/* Clear any stray error interrupt flags */
> +	rcar_canfd_write(gpriv, RCANFD_GERFL, 0);
> +}
> +
> +static void rcar_canfd_enable_channel_interrupts(struct rcar_canfd_channel
> +						 *priv)
> +{
> +	u32 ctr, ch = priv->channel;
> +
> +	/* Clear any stray error flags */
> +	rcar_canfd_write(priv, RCANFD_CERFL(ch), 0);
> +
> +	/* Channel interrupts setup */
> +	ctr = (RCANFD_CCTR_TAIE |
> +	       RCANFD_CCTR_ALIE | RCANFD_CCTR_BLIE |
> +	       RCANFD_CCTR_OLIE | RCANFD_CCTR_BORIE |
> +	       RCANFD_CCTR_BOEIE | RCANFD_CCTR_EPIE |
> +	       RCANFD_CCTR_EWIE | RCANFD_CCTR_BEIE);
> +	rcar_canfd_set_bit(priv, RCANFD_CCTR(ch), ctr);
> +}
> +
> +static void rcar_canfd_disable_channel_interrupts(struct rcar_canfd_channel
> +						  *priv)
> +{
> +	u32 ctr, ch = priv->channel;
> +
> +	ctr = (RCANFD_CCTR_TAIE |
> +	       RCANFD_CCTR_ALIE | RCANFD_CCTR_BLIE |
> +	       RCANFD_CCTR_OLIE | RCANFD_CCTR_BORIE |
> +	       RCANFD_CCTR_BOEIE | RCANFD_CCTR_EPIE |
> +	       RCANFD_CCTR_EWIE | RCANFD_CCTR_BEIE);
> +	rcar_canfd_clear_bit(priv, RCANFD_CCTR(ch), ctr);
> +
> +	/* Clear any stray error flags */
> +	rcar_canfd_write(priv, RCANFD_CERFL(ch), 0);
> +}
> +
> +static void rcar_canfd_global_error(struct net_device *ndev)
> +{
> +	struct rcar_canfd_channel *priv = netdev_priv(ndev);
> +	struct net_device_stats *stats = &ndev->stats;
> +	u32 ch = priv->channel;
> +	u32 gerfl, sts;
> +	u32 ridx = ch + RCANFD_RFFIFO_IDX;
> +
> +	gerfl = rcar_canfd_read(priv, RCANFD_GERFL);
> +	if ((gerfl & RCANFD_GERFL_EEF0) && (ch == 0)) {
> +		netdev_dbg(ndev, "Ch0: ECC Error flag\n");
> +		stats->tx_dropped++;
> +	}
> +	if ((gerfl & RCANFD_GERFL_EEF1) && (ch == 1)) {
> +		netdev_dbg(ndev, "Ch1: ECC Error flag\n");
> +		stats->tx_dropped++;
> +	}
> +	if (gerfl & RCANFD_GERFL_MES) {
> +		sts = rcar_canfd_read(priv,
> +				      RCANFD_CFSTS(ch, RCANFD_CFFIFO_IDX));
> +		if (sts & RCANFD_CMFIFO_CFMLT) {
> +			netdev_dbg(ndev, "Tx Message Lost flag\n");
> +			stats->tx_dropped++;
> +			rcar_canfd_write(priv,
> +					 RCANFD_CFSTS(ch, RCANFD_CFFIFO_IDX),
> +					 sts & ~RCANFD_CMFIFO_CFMLT);
> +		}
> +
> +		sts = rcar_canfd_read(priv, RCANFD_RFSTS(ridx));
> +		if (sts & RCANFD_RFFIFO_RFMLT) {
> +			netdev_dbg(ndev, "Rx Message Lost flag\n");
> +			stats->rx_dropped++;
> +			rcar_canfd_write(priv, RCANFD_RFSTS(ridx),
> +					 sts & ~RCANFD_RFFIFO_RFMLT);
> +		}
> +	}
> +	if (gerfl & RCANFD_GERFL_CMPOF) {
> +		/* Message Lost flag will be set for respective channel
> +		 * when this condition happens with counters and flags
> +		 * already updated.
> +		 */
> +		netdev_dbg(ndev, "global payload overflow interrupt\n");
> +	}
> +
> +	/* Clear all global error interrupts. Only affected channels bits
> +	 * get cleared
> +	 */
> +	rcar_canfd_write(priv, RCANFD_GERFL, 0);
> +}
> +
> +static void rcar_canfd_error(struct net_device *ndev)
> +{
> +	struct rcar_canfd_channel *priv = netdev_priv(ndev);
> +	struct net_device_stats *stats = &ndev->stats;
> +	struct can_frame *cf;
> +	struct sk_buff *skb;
> +	u32 cerfl, csts;
> +	u32 txerr = 0, rxerr = 0;
> +	u32 ch = priv->channel;
> +
> +	/* Propagate the error condition to the CAN stack */
> +	skb = alloc_can_err_skb(ndev, &cf);
> +	if (!skb) {
> +		stats->rx_dropped++;
> +		return;
> +	}
> +
> +	/* Channel error interrupt */
> +	cerfl = rcar_canfd_read(priv, RCANFD_CERFL(ch));
> +	csts = rcar_canfd_read(priv, RCANFD_CSTS(ch));
> +	txerr = RCANFD_CSTS_TECCNT(csts);
> +	rxerr = RCANFD_CSTS_RECCNT(csts);
> +
> +	netdev_dbg(ndev, "ch erfl %x sts %x txerr %u rxerr %u\n",
> +		   cerfl, csts, txerr, rxerr);
> +
> +	if (cerfl & RCANFD_CERFL_BEF) {
> +		netdev_dbg(ndev, "Bus error\n");
> +		cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
> +		cf->data[2] = CAN_ERR_PROT_UNSPEC;
> +		priv->can.can_stats.bus_error++;
> +	}
> +	if (cerfl & RCANFD_CERFL_ADEF) {
> +		netdev_dbg(ndev, "ACK Delimiter Error\n");
> +		stats->tx_errors++;
> +		cf->data[3] |= CAN_ERR_PROT_LOC_ACK_DEL;
> +	}
> +	if (cerfl & RCANFD_CERFL_B0EF) {
> +		netdev_dbg(ndev, "Bit Error (dominant)\n");
> +		stats->tx_errors++;
> +		cf->data[2] |= CAN_ERR_PROT_BIT0;
> +	}
> +	if (cerfl & RCANFD_CERFL_B1EF) {
> +		netdev_dbg(ndev, "Bit Error (recessive)\n");
> +		stats->tx_errors++;
> +		cf->data[2] |= CAN_ERR_PROT_BIT1;
> +	}
> +	if (cerfl & RCANFD_CERFL_CEF) {
> +		netdev_dbg(ndev, "CRC Error\n");
> +		stats->rx_errors++;
> +		cf->data[3] |= CAN_ERR_PROT_LOC_CRC_SEQ;
> +	}
> +	if (cerfl & RCANFD_CERFL_AEF) {
> +		netdev_dbg(ndev, "ACK Error\n");
> +		stats->tx_errors++;
> +		cf->can_id |= CAN_ERR_ACK;
> +		cf->data[3] |= CAN_ERR_PROT_LOC_ACK;
> +	}
> +	if (cerfl & RCANFD_CERFL_FEF) {
> +		netdev_dbg(ndev, "Form Error\n");
> +		stats->rx_errors++;
> +		cf->data[2] |= CAN_ERR_PROT_FORM;
> +	}
> +	if (cerfl & RCANFD_CERFL_SEF) {
> +		netdev_dbg(ndev, "Stuff Error\n");
> +		stats->rx_errors++;
> +		cf->data[2] |= CAN_ERR_PROT_STUFF;
> +	}
> +	if (cerfl & RCANFD_CERFL_ALEF) {
> +		netdev_dbg(ndev, "Arbitration lost Error\n");
> +		priv->can.can_stats.arbitration_lost++;
> +		cf->can_id |= CAN_ERR_LOSTARB;
> +		cf->data[0] |= CAN_ERR_LOSTARB_UNSPEC;
> +	}
> +	if (cerfl & RCANFD_CERFL_BLEF) {
> +		netdev_dbg(ndev, "Bus Lock Error\n");
> +		stats->rx_errors++;
> +		cf->can_id |= CAN_ERR_BUSERROR;
> +	}
> +	if (cerfl & RCANFD_CERFL_EWEF) {
> +		netdev_dbg(ndev, "Error warning interrupt\n");
> +		priv->can.state = CAN_STATE_ERROR_WARNING;
> +		priv->can.can_stats.error_warning++;
> +		cf->can_id |= CAN_ERR_CRTL;
> +		cf->data[1] = txerr > rxerr ? CAN_ERR_CRTL_TX_WARNING :
> +			CAN_ERR_CRTL_RX_WARNING;
> +		cf->data[6] = txerr;
> +		cf->data[7] = rxerr;
> +	}
> +	if (cerfl & RCANFD_CERFL_EPEF) {
> +		netdev_dbg(ndev, "Error passive interrupt\n");
> +		priv->can.state = CAN_STATE_ERROR_PASSIVE;
> +		priv->can.can_stats.error_passive++;
> +		cf->can_id |= CAN_ERR_CRTL;
> +		cf->data[1] = txerr > rxerr ? CAN_ERR_CRTL_TX_PASSIVE :
> +			CAN_ERR_CRTL_RX_PASSIVE;
> +		cf->data[6] = txerr;
> +		cf->data[7] = rxerr;
> +	}
> +	if (cerfl & RCANFD_CERFL_BOEEF) {
> +		netdev_dbg(ndev, "Bus-off entry interrupt\n");
> +		rcar_canfd_tx_failure_cleanup(ndev);
> +		priv->can.state = CAN_STATE_BUS_OFF;
> +		priv->can.can_stats.bus_off++;
> +		can_bus_off(ndev);
> +		cf->can_id |= CAN_ERR_BUSOFF;
> +	}
> +	if (cerfl & RCANFD_CERFL_OLEF) {
> +		netdev_dbg(ndev,
> +			   "Overload Frame Transmission error interrupt\n");
> +		stats->tx_errors++;
> +		cf->can_id |= CAN_ERR_PROT;
> +		cf->data[2] |= CAN_ERR_PROT_OVERLOAD;
> +	}
> +
> +	/* Clear all channel error interrupts */
> +	rcar_canfd_write(priv, RCANFD_CERFL(ch), 0);
> +	stats->rx_packets++;
> +	stats->rx_bytes += cf->can_dlc;
> +	netif_rx(skb);
> +}
> +
> +static void rcar_canfd_tx_done(struct net_device *ndev)
> +{
> +	struct rcar_canfd_channel *priv = netdev_priv(ndev);
> +	struct net_device_stats *stats = &ndev->stats;
> +	u32 sts;
> +	unsigned long flags;
> +	u32 ch = priv->channel;
> +
> +	do {

You should iterare over all pending CAN frames:

> 	for (/* nix */; (priv->tx_head - priv->tx_tail) > 0; priv->tx_tail++) {


> +		u8 unsent, sent;
> +
> +		sent = priv->tx_tail % RCANFD_FIFO_DEPTH;

and check here, if that packet has really been tramsitted. Exit the loop
otherweise.

> +		stats->tx_packets++;
> +		stats->tx_bytes += priv->tx_len[sent];
> +		priv->tx_len[sent] = 0;
> +		can_get_echo_skb(ndev, sent);
> +
> +		spin_lock_irqsave(&priv->tx_lock, flags);

What does the tx_lock protect? The tx path is per channel, isn't it?

> +		priv->tx_tail++;
> +		sts = rcar_canfd_read(priv,
> +				      RCANFD_CFSTS(ch, RCANFD_CFFIFO_IDX));
> +		unsent = RCANFD_CMFIFO_CFMC(sts);
> +
> +		/* Wake producer only when there is room */
> +		if (unsent != RCANFD_FIFO_DEPTH)
> +			netif_wake_queue(ndev);

Move the netif_wake_queue() out of the loop.

> +
> +		if (priv->tx_head - priv->tx_tail <= unsent) {
> +			spin_unlock_irqrestore(&priv->tx_lock, flags);
> +			break;
> +		}
> +		spin_unlock_irqrestore(&priv->tx_lock, flags);
> +
> +	} while (1);
> +
> +	/* Clear interrupt */
> +	rcar_canfd_write(priv, RCANFD_CFSTS(ch, RCANFD_CFFIFO_IDX),
> +			 sts & ~RCANFD_CMFIFO_CFTXIF);
> +	can_led_event(ndev, CAN_LED_EVENT_TX);
> +}
> +
> +static irqreturn_t rcar_canfd_global_interrupt(int irq, void *dev_id)
> +{
> +	struct rcar_canfd_global *gpriv = dev_id;
> +	struct net_device *ndev;
> +	struct rcar_canfd_channel *priv;
> +	u32 sts, gerfl;
> +	u32 ch, ridx;
> +
> +	/* Global error interrupts still indicate a condition specific
> +	 * to a channel. RxFIFO interrupt is a global interrupt.
> +	 */
> +	for_each_set_bit(ch, &gpriv->channels_mask, RCANFD_NUM_CHANNELS) {
> +		priv = gpriv->ch[ch];
> +		ndev = priv->ndev;
> +		ridx = ch + RCANFD_RFFIFO_IDX;
> +
> +		/* Global error interrupts */
> +		gerfl = rcar_canfd_read(priv, RCANFD_GERFL);
> +		if (RCANFD_GERFL_ERR(gerfl))
> +			rcar_canfd_global_error(ndev);
> +
> +		/* Handle Rx interrupts */
> +		sts = rcar_canfd_read(priv, RCANFD_RFSTS(ridx));
> +		if (sts & RCANFD_RFFIFO_RFIF) {
> +			if (napi_schedule_prep(&priv->napi)) {
> +				/* Disable Rx FIFO interrupts */
> +				rcar_canfd_clear_bit(priv,
> +						     RCANFD_RFCC(ridx),
> +						     RCANFD_RFFIFO_RFIE);
> +				__napi_schedule(&priv->napi);
> +			}
> +		}
> +	}
> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t rcar_canfd_channel_interrupt(int irq, void *dev_id)
> +{
> +	struct rcar_canfd_global *gpriv = dev_id;
> +	struct net_device *ndev;
> +	struct rcar_canfd_channel *priv;
> +	u32 sts, cerfl, ch;
> +
> +	/* Common FIFO is a per channel resource */
> +	for_each_set_bit(ch, &gpriv->channels_mask, RCANFD_NUM_CHANNELS) {
> +		priv = gpriv->ch[ch];
> +		ndev = priv->ndev;
> +
> +		/* Channel error interrupts */
> +		cerfl = rcar_canfd_read(priv, RCANFD_CERFL(ch));
> +		if (RCANFD_CERFL_ERR(cerfl))
> +			rcar_canfd_error(ndev);
> +
> +		/* Handle Tx interrupts */
> +		sts = rcar_canfd_read(priv,
> +				      RCANFD_CFSTS(ch, RCANFD_CFFIFO_IDX));
> +		if (sts & RCANFD_CMFIFO_CFTXIF)
> +			rcar_canfd_tx_done(ndev);
> +	}
> +	return IRQ_HANDLED;
> +}
> +
> +static void rcar_canfd_set_bittiming(struct net_device *dev)
> +{
> +	struct rcar_canfd_channel *priv = netdev_priv(dev);
> +	const struct can_bittiming *bt = &priv->can.bittiming;
> +	const struct can_bittiming *dbt = &priv->can.data_bittiming;
> +	u16 brp, sjw, tseg1, tseg2;
> +	u32 cfg;
> +	u32 ch = priv->channel;
> +
> +	/* Nominal bit timing settings */
> +	brp = bt->brp - 1;
> +	sjw = bt->sjw - 1;
> +	tseg1 = bt->prop_seg + bt->phase_seg1 - 1;
> +	tseg2 = bt->phase_seg2 - 1;
> +
> +	cfg = (RCANFD_NR_TSEG1(tseg1) | RCANFD_NR_BRP(brp) |
> +	       RCANFD_NR_SJW(sjw) | RCANFD_NR_TSEG2(tseg2));
> +
> +	rcar_canfd_write(priv, RCANFD_CCFG(ch), cfg);
> +	netdev_dbg(priv->ndev, "nrate: brp %u, sjw %u, tseg1 %u, tseg2 %u\n",
> +		   brp, sjw, tseg1, tseg2);
> +
> +	/* Data bit timing settings */
> +	brp = dbt->brp - 1;
> +	sjw = dbt->sjw - 1;
> +	tseg1 = dbt->prop_seg + dbt->phase_seg1 - 1;
> +	tseg2 = dbt->phase_seg2 - 1;
> +
> +	cfg = (RCANFD_DR_TSEG1(tseg1) | RCANFD_DR_BRP(brp) |
> +	       RCANFD_DR_SJW(sjw) | RCANFD_DR_TSEG2(tseg2));
> +
> +	rcar_canfd_write(priv, RCANFD_F_CDFG(ch), cfg);
> +	netdev_dbg(priv->ndev, "drate: brp %u, sjw %u, tseg1 %u, tseg2 %u\n",
> +		   brp, sjw, tseg1, tseg2);
> +}
> +
> +static int rcar_canfd_start(struct net_device *ndev)
> +{
> +	struct rcar_canfd_channel *priv = netdev_priv(ndev);
> +	int err = -EOPNOTSUPP;
> +	u32 sts, ch = priv->channel;
> +	u32 ridx = ch + RCANFD_RFFIFO_IDX;
> +
> +	/* Ensure channel starts in FD mode */
> +	if (!(priv->can.ctrlmode & CAN_CTRLMODE_FD)) {
> +		netdev_err(ndev, "enable can fd mode for channel %d\n", ch);
> +		goto fail_mode;
> +	}
> +
> +	rcar_canfd_set_bittiming(ndev);
> +
> +	rcar_canfd_enable_channel_interrupts(priv);
> +
> +	/* Set channel to Operational mode */
> +	rcar_canfd_update_bit(priv, RCANFD_CCTR(ch),
> +			      RCANFD_CCTR_MODEMASK, RCANFD_CCTR_COPM);
> +
> +	/* Verify channel mode change */
> +	err = readl_poll_timeout((priv->base + RCANFD_CSTS(ch)), sts,
> +				 (sts & RCANFD_CSTS_COM), 2, 500000);
> +	if (err) {
> +		netdev_err(ndev, "channel %u communication state failed\n", ch);
> +		goto fail_mode_change;
> +	}
> +
> +	/* Enable Common & Rx FIFO */
> +	rcar_canfd_set_bit(priv, RCANFD_CFCC(ch, RCANFD_CFFIFO_IDX),
> +			   RCANFD_CMFIFO_CFE);
> +	rcar_canfd_set_bit(priv, RCANFD_RFCC(ridx), RCANFD_RFFIFO_RFE);
> +
> +	priv->can.state = CAN_STATE_ERROR_ACTIVE;
> +	return 0;
> +
> +fail_mode_change:
> +	rcar_canfd_disable_channel_interrupts(priv);
> +fail_mode:
> +	return err;
> +}
> +
> +static int rcar_canfd_open(struct net_device *ndev)
> +{
> +	struct rcar_canfd_channel *priv = netdev_priv(ndev);
> +	struct rcar_canfd_global *gpriv = priv->gpriv;
> +	int err;
> +
> +	/* Peripheral clock is already enabled in probe */
> +	err = clk_prepare_enable(gpriv->can_clk);
> +	if (err) {
> +		netdev_err(ndev, "failed to enable CAN clock, error %d\n", err);
> +		goto out_clock;
> +	}
> +
> +	err = open_candev(ndev);
> +	if (err) {
> +		netdev_err(ndev, "open_candev() failed, error %d\n", err);
> +		goto out_can_clock;
> +	}
> +
> +	napi_enable(&priv->napi);
> +	err = rcar_canfd_start(ndev);
> +	if (err)
> +		goto out_close;
> +	netif_start_queue(ndev);
> +	can_led_event(ndev, CAN_LED_EVENT_OPEN);
> +	return 0;
> +out_close:
> +	napi_disable(&priv->napi);
> +	close_candev(ndev);
> +out_can_clock:
> +	clk_disable_unprepare(gpriv->can_clk);
> +out_clock:
> +	return err;
> +}
> +
> +static void rcar_canfd_stop(struct net_device *ndev)
> +{
> +	struct rcar_canfd_channel *priv = netdev_priv(ndev);
> +	int err;
> +	u32 sts, ch = priv->channel;
> +	u32 ridx = ch + RCANFD_RFFIFO_IDX;
> +
> +	/* Transition to channel reset mode  */
> +	rcar_canfd_update_bit(priv, RCANFD_CCTR(ch),
> +			      RCANFD_CCTR_MODEMASK, RCANFD_CCTR_CRESET);
> +
> +	/* Check Channel reset mode */
> +	err = readl_poll_timeout((priv->base + RCANFD_CSTS(ch)), sts,
> +				 (sts & RCANFD_CSTS_RESET), 2, 500000);
> +	if (err)
> +		netdev_err(ndev, "channel %u reset failed\n", ch);
> +
> +	rcar_canfd_disable_channel_interrupts(priv);
> +
> +	/* Disable Common & Rx FIFO */
> +	rcar_canfd_clear_bit(priv, RCANFD_CFCC(ch, RCANFD_CFFIFO_IDX),
> +			     RCANFD_CMFIFO_CFE);
> +	rcar_canfd_clear_bit(priv, RCANFD_RFCC(ridx), RCANFD_RFFIFO_RFE);
> +
> +	/* Set the state as STOPPED */
> +	priv->can.state = CAN_STATE_STOPPED;
> +}
> +
> +static int rcar_canfd_close(struct net_device *ndev)
> +{
> +	struct rcar_canfd_channel *priv = netdev_priv(ndev);
> +	struct rcar_canfd_global *gpriv = priv->gpriv;
> +
> +	netif_stop_queue(ndev);
> +	rcar_canfd_stop(ndev);
> +	napi_disable(&priv->napi);
> +	clk_disable_unprepare(gpriv->can_clk);
> +	close_candev(ndev);
> +	can_led_event(ndev, CAN_LED_EVENT_STOP);
> +	return 0;
> +}
> +
> +static netdev_tx_t rcar_canfd_start_xmit(struct sk_buff *skb,
> +					 struct net_device *ndev)
> +{
> +	struct rcar_canfd_channel *priv = netdev_priv(ndev);
> +	struct canfd_frame *cf = (struct canfd_frame *)skb->data;
> +	u32 sts, id, ptr;
> +	unsigned long flags;
> +	u32 ch = priv->channel;
> +
> +	if (can_dropped_invalid_skb(ndev, skb))
> +		return NETDEV_TX_OK;
> +
> +	if (cf->can_id & CAN_EFF_FLAG) {
> +		id = cf->can_id & CAN_EFF_MASK;
> +		id |= RCANFD_CMFIFO_CFIDE;
> +	} else {
> +		id = cf->can_id & CAN_SFF_MASK;
> +	}
> +
> +	if (cf->can_id & CAN_RTR_FLAG)
> +		id |= RCANFD_CMFIFO_CFRTR;
> +
> +	rcar_canfd_write(priv, RCANFD_F_CFID(ch, RCANFD_CFFIFO_IDX),
> +			 id);
> +	ptr = RCANFD_CMFIFO_CFDLC(can_len2dlc(cf->len));

ptr usually means pointer, better call it dlc.

> +	rcar_canfd_write(priv, RCANFD_F_CFPTR(ch, RCANFD_CFFIFO_IDX),
> +			 ptr);
> +
> +	sts = rcar_canfd_read(priv, RCANFD_F_CFFDCSTS(ch, RCANFD_CFFIFO_IDX));
> +	if (can_is_canfd_skb(skb)) {
> +		/* CAN FD frame format */
> +		sts |= RCANFD_CMFIFO_CFFDF;
> +		if (cf->flags & CANFD_BRS)
> +			sts |= RCANFD_CMFIFO_CFBRS;
> +		else
> +			sts &= ~RCANFD_CMFIFO_CFBRS;
> +	} else {
> +		sts &= ~RCANFD_CMFIFO_CFFDF;
> +	}
> +	rcar_canfd_write(priv, RCANFD_F_CFFDCSTS(ch, RCANFD_CFFIFO_IDX), sts);
> +
> +	rcar_canfd_put_data(cf, priv,
> +			    RCANFD_F_CFDF(ch, RCANFD_CFFIFO_IDX, 0));
> +
> +	priv->tx_len[priv->tx_head % RCANFD_FIFO_DEPTH] = cf->len;
> +	can_put_echo_skb(skb, ndev, priv->tx_head % RCANFD_FIFO_DEPTH);
> +
> +	spin_lock_irqsave(&priv->tx_lock, flags);
> +	priv->tx_head++;
> +
> +	/* Start Tx: Write 0xff to CFPC to increment the CPU-side
> +	 * pointer for the Common FIFO
> +	 */
> +	rcar_canfd_write(priv, RCANFD_CFPCTR(ch, RCANFD_CFFIFO_IDX), 0xff);
> +
> +	/* Stop the queue if we've filled all FIFO entries */
> +	if (priv->tx_head - priv->tx_tail >= RCANFD_FIFO_DEPTH)
> +		netif_stop_queue(ndev);

Please move the check of stop_queue, before starting the send.

> +
> +	spin_unlock_irqrestore(&priv->tx_lock, flags);
> +	return NETDEV_TX_OK;
> +}
> +
> +static void rcar_canfd_rx_pkt(struct rcar_canfd_channel *priv)
> +{
> +	struct net_device_stats *stats = &priv->ndev->stats;
> +	struct canfd_frame *cf;
> +	struct sk_buff *skb;
> +	u32 sts = 0, id, ptr;
> +	u32 ch = priv->channel;
> +	u32 ridx = ch + RCANFD_RFFIFO_IDX;
> +
> +	id = rcar_canfd_read(priv, RCANFD_F_RFID(ridx));
> +	ptr = rcar_canfd_read(priv, RCANFD_F_RFPTR(ridx));
> +
> +	sts = rcar_canfd_read(priv, RCANFD_F_RFFDSTS(ridx));
> +	if (sts & RCANFD_RFFIFO_RFFDF)
> +		skb = alloc_canfd_skb(priv->ndev, &cf);
> +	else
> +		skb = alloc_can_skb(priv->ndev,
> +				    (struct can_frame **)&cf);
> +
> +	if (!skb) {
> +		stats->rx_dropped++;
> +		return;
> +	}
> +
> +	if (sts & RCANFD_RFFIFO_RFFDF)
> +		cf->len = can_dlc2len(RCANFD_RFFIFO_RFDLC(ptr));
> +	else
> +		cf->len = get_can_dlc(RCANFD_RFFIFO_RFDLC(ptr));
> +
> +	if (sts & RCANFD_RFFIFO_RFESI) {
> +		cf->flags |= CANFD_ESI;
> +		netdev_dbg(priv->ndev, "ESI Error\n");
> +	}
> +
> +	if (id & RCANFD_RFFIFO_RFIDE)
> +		cf->can_id = (id & CAN_EFF_MASK) | CAN_EFF_FLAG;
> +	else
> +		cf->can_id = id & CAN_SFF_MASK;
> +
> +	if (!(sts & RCANFD_RFFIFO_RFFDF) && (id & RCANFD_RFFIFO_RFRTR)) {
> +		cf->can_id |= CAN_RTR_FLAG;
> +	} else {
> +		if (sts & RCANFD_RFFIFO_RFBRS)
> +			cf->flags |= CANFD_BRS;
> +
> +		rcar_canfd_get_data(cf, priv, RCANFD_F_RFDF(ridx, 0));
> +	}
> +
> +	/* Write 0xff to RFPC to increment the CPU-side
> +	 * pointer of the Rx FIFO
> +	 */
> +	rcar_canfd_write(priv, RCANFD_RFPCTR(ridx), 0xff);
> +
> +	can_led_event(priv->ndev, CAN_LED_EVENT_RX);
> +
> +	stats->rx_bytes += cf->len;
> +	stats->rx_packets++;
> +	netif_receive_skb(skb);
> +}
> +
> +static int rcar_canfd_rx_poll(struct napi_struct *napi, int quota)
> +{
> +	struct rcar_canfd_channel *priv =
> +		container_of(napi, struct rcar_canfd_channel, napi);
> +	int num_pkts;
> +	u32 sts;
> +	u32 ch = priv->channel;
> +	u32 ridx = ch + RCANFD_RFFIFO_IDX;
> +
> +	for (num_pkts = 0; num_pkts < quota; num_pkts++) {
> +		sts = rcar_canfd_read(priv, RCANFD_RFSTS(ridx));
> +		/* Clear interrupt bit */
> +		if (sts & RCANFD_RFFIFO_RFIF)
> +			rcar_canfd_write(priv, RCANFD_RFSTS(ridx),
> +					 sts & ~RCANFD_RFFIFO_RFIF);
> +
> +		/* Check FIFO empty condition */
> +		if (sts & RCANFD_RFFIFO_RFEMP)
> +			break;
> +
> +		rcar_canfd_rx_pkt(priv);

This sequence looks strange. You first conditionally ack the interrupt
then you check for empty fifo, then read the CAN frame. I would assume
that you first check if there's a CAN frame, read it and then clear the
interrupt.

> +	}
> +
> +	/* All packets processed */
> +	if (num_pkts < quota) {
> +		napi_complete(napi);
> +		/* Enable Rx FIFO interrupts */
> +		rcar_canfd_set_bit(priv, RCANFD_RFCC(ridx), RCANFD_RFFIFO_RFIE);
> +	}
> +	return num_pkts;
> +}
> +
> +static int rcar_canfd_do_set_mode(struct net_device *ndev, enum can_mode mode)
> +{
> +	int err;
> +
> +	switch (mode) {
> +	case CAN_MODE_START:
> +		err = rcar_canfd_start(ndev);
> +		if (err)
> +			return err;
> +		netif_wake_queue(ndev);
> +		return 0;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
> +static int rcar_canfd_get_berr_counter(const struct net_device *dev,
> +				       struct can_berr_counter *bec)
> +{
> +	struct rcar_canfd_channel *priv = netdev_priv(dev);
> +	u32 val, ch = priv->channel;
> +
> +	/* Peripheral clock is already enabled in probe */
> +	val = rcar_canfd_read(priv, RCANFD_CSTS(ch));
> +	bec->txerr = RCANFD_CSTS_TECCNT(val);
> +	bec->rxerr = RCANFD_CSTS_RECCNT(val);
> +	return 0;
> +}
> +
> +static const struct net_device_ops rcar_canfd_netdev_ops = {
> +	.ndo_open = rcar_canfd_open,
> +	.ndo_stop = rcar_canfd_close,
> +	.ndo_start_xmit = rcar_canfd_start_xmit,
> +	.ndo_change_mtu = can_change_mtu,
> +};
> +
> +static int rcar_canfd_channel_probe(struct rcar_canfd_global *gpriv, u32 ch)
> +{
> +	struct platform_device *pdev = gpriv->pdev;
> +	struct rcar_canfd_channel *priv;
> +	struct net_device *ndev;
> +	int err = -ENODEV;
> +
> +	ndev = alloc_candev(sizeof(*priv), RCANFD_FIFO_DEPTH);
> +	if (!ndev) {
> +		dev_err(&pdev->dev, "alloc_candev() failed\n");
> +		err = -ENOMEM;
> +		goto fail;
> +	}
> +	priv = netdev_priv(ndev);
> +
> +	ndev->netdev_ops = &rcar_canfd_netdev_ops;
> +	ndev->flags |= IFF_ECHO;
> +	priv->ndev = ndev;
> +	priv->base = gpriv->base;
> +	priv->channel = ch;
> +	priv->can.clock.freq = gpriv->freq;
> +	dev_info(&pdev->dev, "can_clk rate is %u\n", priv->can.clock.freq);
> +
> +	priv->can.bittiming_const = &rcar_canfd_nom_bittiming_const;
> +	priv->can.data_bittiming_const =
> +		&rcar_canfd_data_bittiming_const;
> +
> +	/* Controller starts in CAN FD mode, which supports classical CAN and
> +	 * CAN FD frames. For classical CAN frames only configurations, data
> +	 * bitrate should be configured the same as nominal bitrate.
> +	 */
> +	priv->can.ctrlmode_supported = CAN_CTRLMODE_BERR_REPORTING |
> +		CAN_CTRLMODE_FD;
> +
> +	priv->can.do_set_mode = rcar_canfd_do_set_mode;
> +	priv->can.do_get_berr_counter = rcar_canfd_get_berr_counter;
> +	priv->gpriv = gpriv;
> +	SET_NETDEV_DEV(ndev, &pdev->dev);
> +
> +	netif_napi_add(ndev, &priv->napi, rcar_canfd_rx_poll,
> +		       RCANFD_NAPI_WEIGHT);
> +	err = register_candev(ndev);
> +	if (err) {
> +		dev_err(&pdev->dev,
> +			"register_candev() failed, error %d\n", err);
> +		goto fail_candev;
> +	}
> +	spin_lock_init(&priv->tx_lock);
> +	devm_can_led_init(ndev);
> +	gpriv->ch[priv->channel] = priv;
> +	dev_info(&pdev->dev, "device registered (channel %u)\n", priv->channel);
> +	return 0;
> +
> +fail_candev:
> +	netif_napi_del(&priv->napi);
> +	free_candev(ndev);
> +fail:
> +	return err;
> +}
> +
> +static void rcar_canfd_channel_remove(struct rcar_canfd_global *gpriv, u32 ch)
> +{
> +	struct rcar_canfd_channel *priv = gpriv->ch[ch];
> +
> +	if (priv) {
> +		unregister_candev(priv->ndev);
> +		netif_napi_del(&priv->napi);
> +		free_candev(priv->ndev);
> +	}
> +}
> +
> +static int rcar_canfd_probe(struct platform_device *pdev)
> +{
> +	struct resource *mem;
> +	void __iomem *addr;
> +	u32 sts, ch;
> +	struct rcar_canfd_global *gpriv;
> +	struct device_node *of_child;
> +	unsigned long channels_mask = 0;
> +	int err, ch_irq, g_irq;
> +
> +	of_child = of_get_child_by_name(pdev->dev.of_node, "channel0");
> +	if (of_child && of_device_is_available(of_child))
> +		channels_mask |= BIT(0);	/* Channel 0 */
> +
> +	of_child = of_get_child_by_name(pdev->dev.of_node, "channel1");
> +	if (of_child && of_device_is_available(of_child))
> +		channels_mask |= BIT(1);	/* Channel 1 */
> +
> +	ch_irq = platform_get_irq(pdev, 0);
> +	if (ch_irq < 0) {
> +		dev_err(&pdev->dev, "no Channel IRQ resource\n");
> +		err = ch_irq;
> +		goto fail_dev;
> +	}
> +
> +	g_irq = platform_get_irq(pdev, 1);
> +	if (g_irq < 0) {
> +		dev_err(&pdev->dev, "no Global IRQ resource\n");
> +		err = g_irq;
> +		goto fail_dev;
> +	}
> +
> +	/* Global controller context */
> +	gpriv = devm_kzalloc(&pdev->dev, sizeof(*gpriv), GFP_KERNEL);
> +	if (!gpriv) {
> +		err = -ENOMEM;
> +		goto fail_dev;
> +	}
> +	gpriv->pdev = pdev;
> +	gpriv->channels_mask = channels_mask;
> +
> +	/* Peripheral clock */
> +	gpriv->clkp = devm_clk_get(&pdev->dev, "fck");
> +	if (IS_ERR(gpriv->clkp)) {
> +		err = PTR_ERR(gpriv->clkp);
> +		dev_err(&pdev->dev, "cannot get peripheral clock, error %d\n",
> +			err);
> +		goto fail_dev;
> +	}
> +
> +	/* fCAN clock: Pick External clock. If not available fallback to
> +	 * CANFD clock
> +	 */
> +	gpriv->can_clk = devm_clk_get(&pdev->dev, "can_clk");
> +	if (IS_ERR(gpriv->can_clk)) {
> +		gpriv->can_clk = devm_clk_get(&pdev->dev, "canfd");
> +		if (IS_ERR(gpriv->can_clk)) {
> +			err = PTR_ERR(gpriv->can_clk);
> +			dev_err(&pdev->dev,
> +				"cannot get canfd clock, error %d\n", err);
> +			goto fail_dev;
> +		}
> +		gpriv->clock_select = RCANFD_CANFDCLK;
> +
> +	} else {
> +		gpriv->clock_select = RCANFD_EXTCLK;
> +	}
> +	gpriv->freq = clk_get_rate(gpriv->can_clk);
> +
> +	if (gpriv->clock_select == RCANFD_CANFDCLK)
> +		/* CANFD clock is further divided by (1/2) within the IP */
> +		gpriv->freq /= 2;
> +
> +	mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	addr = devm_ioremap_resource(&pdev->dev, mem);
> +	if (IS_ERR(addr)) {
> +		err = PTR_ERR(addr);
> +		goto fail_dev;
> +	}
> +	gpriv->base = addr;
> +
> +	/* Request IRQ that's common for both channels */
> +	err = devm_request_irq(&pdev->dev, ch_irq,
> +			       rcar_canfd_channel_interrupt, 0,
> +			       "canfd.chn", gpriv);
> +	if (err) {
> +		dev_err(&pdev->dev, "devm_request_irq(%d) failed, error %d\n",
> +			ch_irq, err);
> +		goto fail_dev;
> +	}
> +	err = devm_request_irq(&pdev->dev, g_irq,
> +			       rcar_canfd_global_interrupt, 0,
> +			       "canfd.gbl", gpriv);
> +	if (err) {
> +		dev_err(&pdev->dev, "devm_request_irq(%d) failed, error %d\n",
> +			g_irq, err);
> +		goto fail_dev;
> +	}
> +
> +	/* Enable peripheral clock for register access */
> +	err = clk_prepare_enable(gpriv->clkp);
> +	if (err) {
> +		dev_err(&pdev->dev,
> +			"failed to enable peripheral clock, error %d\n", err);
> +		goto fail_dev;
> +	}
> +
> +	err = rcar_canfd_reset_controller(gpriv);
> +	if (err) {
> +		dev_err(&pdev->dev, "reset controller failed\n");
> +		goto fail_clk;
> +	}
> +
> +	/* Controller in Global reset & Channel reset mode */
> +	rcar_canfd_configure_controller(gpriv);
> +
> +	/* Configure per channel attributes */
> +	for_each_set_bit(ch, &gpriv->channels_mask, RCANFD_NUM_CHANNELS) {
> +		/* Configure Channel's Rx fifo */
> +		rcar_canfd_configure_rx(gpriv, ch);
> +
> +		/* Configure Channel's Tx (Common) fifo */
> +		rcar_canfd_configure_tx(gpriv, ch);
> +
> +		/* Configure receive rules */
> +		rcar_canfd_configure_afl_rules(gpriv, ch);
> +	}
> +
> +	/* Configure common interrupts */
> +	rcar_canfd_enable_global_interrupts(gpriv);
> +
> +	/* Start Global operation mode */
> +	rcar_canfd_update_bit(gpriv, RCANFD_GCTR, RCANFD_GCTR_MODEMASK,
> +			      RCANFD_GCTR_GOPM);
> +
> +	/* Verify mode change */
> +	err = readl_poll_timeout((gpriv->base + RCANFD_GSTS), sts,
> +				 !(sts & RCANFD_GSTS_GNOPM), 2, 500000);
> +	if (err) {
> +		dev_err(&pdev->dev, "global operational mode failed\n");
> +		goto fail_mode;
> +	}
> +
> +	for_each_set_bit(ch, &gpriv->channels_mask, RCANFD_NUM_CHANNELS) {
> +		err = rcar_canfd_channel_probe(gpriv, ch);
> +		if (err)
> +			goto fail_channel;
> +	}

Should the CAN IP core be clocked the whole time? What about shuting
down the clock and enabling it on ifup?

> +
> +	platform_set_drvdata(pdev, gpriv);
> +	dev_info(&pdev->dev, "global operational state (clk %d)\n",
> +		 gpriv->clock_select);
> +	return 0;
> +
> +fail_channel:
> +	for_each_set_bit(ch, &gpriv->channels_mask, RCANFD_NUM_CHANNELS)
> +		rcar_canfd_channel_remove(gpriv, ch);
> +fail_mode:
> +	rcar_canfd_disable_global_interrupts(gpriv);
> +fail_clk:
> +	clk_disable_unprepare(gpriv->clkp);
> +fail_dev:
> +	return err;
> +}
> +
> +static int rcar_canfd_remove(struct platform_device *pdev)
> +{
> +	struct rcar_canfd_global *gpriv = platform_get_drvdata(pdev);
> +	struct rcar_canfd_channel *priv;
> +	u32 ch;
> +
> +	rcar_canfd_reset_controller(gpriv);
> +	rcar_canfd_disable_global_interrupts(gpriv);
> +
> +	for_each_set_bit(ch, &gpriv->channels_mask, RCANFD_NUM_CHANNELS) {
> +		priv = gpriv->ch[ch];
> +		if (priv) {

This should always be true.

> +			rcar_canfd_disable_channel_interrupts(priv);
> +			unregister_candev(priv->ndev);
> +			netif_napi_del(&priv->napi);
> +			free_candev(priv->ndev);

Please make use of rcar_canfd_channel_remove(), as you already have the
function.

> +		}
> +	}
> +
> +	/* Enter global sleep mode */
> +	rcar_canfd_set_bit(gpriv, RCANFD_GCTR, RCANFD_GCTR_SLPR);
> +	clk_disable_unprepare(gpriv->clkp);
> +	return 0;
> +}
> +
> +static int __maybe_unused rcar_canfd_suspend(struct device *dev)
> +{
> +	return 0;
> +}
> +
> +static int __maybe_unused rcar_canfd_resume(struct device *dev)
> +{
> +	return 0;
> +}
> +
> +static SIMPLE_DEV_PM_OPS(rcar_canfd_pm_ops, rcar_canfd_suspend,
> +			 rcar_canfd_resume);
> +
> +static const struct of_device_id rcar_canfd_of_table[] = {
> +	{ .compatible = "renesas,rcar-gen3-canfd" },
> +	{ }
> +};
> +
> +MODULE_DEVICE_TABLE(of, rcar_canfd_of_table);
> +
> +static struct platform_driver rcar_canfd_driver = {
> +	.driver = {
> +		.name = RCANFD_DRV_NAME,
> +		.of_match_table = of_match_ptr(rcar_canfd_of_table),
> +		.pm = &rcar_canfd_pm_ops,
> +	},
> +	.probe = rcar_canfd_probe,
> +	.remove = rcar_canfd_remove,
> +};
> +
> +module_platform_driver(rcar_canfd_driver);
> +
> +MODULE_AUTHOR("Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com>");
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("CAN FD driver for Renesas R-Car SoC");
> +MODULE_ALIAS("platform:" RCANFD_DRV_NAME);

regards,
Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply

* Re: [PATCH 1/1] net: usb: cdc_ncm: adding Telit LE910 V2 mobile broadband card
From: David Miller @ 2016-03-31 20:51 UTC (permalink / raw)
  To: dnlplm; +Cc: linux-usb, netdev, bjorn
In-Reply-To: <1459430207-11757-2-git-send-email-dnlplm@gmail.com>

From: Daniele Palmas <dnlplm@gmail.com>
Date: Thu, 31 Mar 2016 15:16:47 +0200

> Telit LE910 V2 is a mobile broadband card with no ARP capabilities:
> the patch makes this device to use wwan_noarp_info struct
> 
> Signed-off-by: Daniele Palmas <dnlplm@gmail.com>

Bjorn, can you take a quick look at this?

Thank you.

^ permalink raw reply

* [PATCH net-next v2 0/6] net: dsa: mv88e6131: HW bridging support for 6185
From: Vivien Didelot @ 2016-03-31 20:53 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn,
	Vivien Didelot

All packets passing through a switch of the 6185 family are currently all
directed to the CPU port. This means that port bridging is software driven.

To enable hardware bridging for this switch family, we need to implement the
port mapping operations, the FDB operations, and optionally the VLAN operations
(for 802.1Q and VLAN filtering aware systems).

However this family only has 256 FDBs indexed by 8-bit identifiers, opposed to
4096 FDBs with 12-bit identifiers for other families such as 6352. It also
doesn't have dedicated FID registers for ATU and VTU operations.

This patchset fixes these differences, and enable hardware bridging for 6185.

Changes v1 -> v2:
 - Describe the different numbers of databases and prefer a feature-based logic
   over the current ID/family-based logic.

Vivien Didelot (6):
  net: dsa: mv88e6xxx: protect SID register access
  net: dsa: mv88e6xxx: protect FID registers access
  net: dsa: mv88e6xxx: variable number of databases
  net: dsa: mv88e6xxx: support 256 databases
  net: dsa: mv88e6xxx: map destination addresses for 6185
  net: dsa: mv88e6131: enable hardware bridging

 drivers/net/dsa/mv88e6131.c |  11 ++++
 drivers/net/dsa/mv88e6xxx.c | 134 +++++++++++++++++++++++++++++++++++---------
 2 files changed, 118 insertions(+), 27 deletions(-)

-- 
2.7.4

^ permalink raw reply

* [PATCH net-next v2 1/6] net: dsa: mv88e6xxx: protect SID register access
From: Vivien Didelot @ 2016-03-31 20:53 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn,
	Vivien Didelot
In-Reply-To: <1459457626-30082-1-git-send-email-vivien.didelot@savoirfairelinux.com>

Introduce a mv88e6xxx_has_stu() helper to protect the access to the
GLOBAL_VTU_SID register, instead of checking switch families.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
---
 drivers/net/dsa/mv88e6xxx.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index fa086e0..19a8208 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -482,6 +482,16 @@ static bool mv88e6xxx_6352_family(struct dsa_switch *ds)
 	return false;
 }
 
+static bool mv88e6xxx_has_stu(struct dsa_switch *ds)
+{
+	/* Does the device have STU and dedicated SID registers for VTU ops? */
+	if (mv88e6xxx_6097_family(ds) || mv88e6xxx_6165_family(ds) ||
+	    mv88e6xxx_6351_family(ds) || mv88e6xxx_6352_family(ds))
+		return true;
+
+	return false;
+}
+
 /* We expect the switch to perform auto negotiation if there is a real
  * phy. However, in the case of a fixed link phy, we force the port
  * settings from the fixed link settings.
@@ -1329,7 +1339,9 @@ static int _mv88e6xxx_vtu_getnext(struct dsa_switch *ds,
 				return ret;
 
 			next.fid = ret & GLOBAL_VTU_FID_MASK;
+		}
 
+		if (mv88e6xxx_has_stu(ds)) {
 			ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
 						  GLOBAL_VTU_SID);
 			if (ret < 0)
@@ -1412,8 +1424,7 @@ static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
 	if (ret < 0)
 		return ret;
 
-	if (mv88e6xxx_6097_family(ds) || mv88e6xxx_6165_family(ds) ||
-	    mv88e6xxx_6351_family(ds) || mv88e6xxx_6352_family(ds)) {
+	if (mv88e6xxx_has_stu(ds)) {
 		reg = entry->sid & GLOBAL_VTU_SID_MASK;
 		ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_SID, reg);
 		if (ret < 0)
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next v2 2/6] net: dsa: mv88e6xxx: protect FID registers access
From: Vivien Didelot @ 2016-03-31 20:53 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn,
	Vivien Didelot
In-Reply-To: <1459457626-30082-1-git-send-email-vivien.didelot@savoirfairelinux.com>

Only switch families with 4096 address databases have dedicated FID
registers for ATU and VTU operations.

Factorize the access to the GLOBAL_ATU_FID register and introduce a
mv88e6xxx_has_fid_reg() helper function to protect the access to
GLOBAL_ATU_FID and GLOBAL_VTU_FID.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
---
 drivers/net/dsa/mv88e6xxx.c | 42 +++++++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 19a8208..cc066dc 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -482,6 +482,16 @@ static bool mv88e6xxx_6352_family(struct dsa_switch *ds)
 	return false;
 }
 
+static bool mv88e6xxx_has_fid_reg(struct dsa_switch *ds)
+{
+	/* Does the device have dedicated FID registers for ATU and VTU ops? */
+	if (mv88e6xxx_6097_family(ds) || mv88e6xxx_6165_family(ds) ||
+	    mv88e6xxx_6351_family(ds) || mv88e6xxx_6352_family(ds))
+		return true;
+
+	return false;
+}
+
 static bool mv88e6xxx_has_stu(struct dsa_switch *ds)
 {
 	/* Does the device have STU and dedicated SID registers for VTU ops? */
@@ -961,10 +971,16 @@ out:
 	return ret;
 }
 
-static int _mv88e6xxx_atu_cmd(struct dsa_switch *ds, u16 cmd)
+static int _mv88e6xxx_atu_cmd(struct dsa_switch *ds, u16 fid, u16 cmd)
 {
 	int ret;
 
+	if (mv88e6xxx_has_fid_reg(ds)) {
+		ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_FID, fid);
+		if (ret < 0)
+			return ret;
+	}
+
 	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_OP, cmd);
 	if (ret < 0)
 		return ret;
@@ -1011,11 +1027,6 @@ static int _mv88e6xxx_atu_flush_move(struct dsa_switch *ds,
 		return err;
 
 	if (entry->fid) {
-		err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_FID,
-					   entry->fid);
-		if (err)
-			return err;
-
 		op = static_too ? GLOBAL_ATU_OP_FLUSH_MOVE_ALL_DB :
 			GLOBAL_ATU_OP_FLUSH_MOVE_NON_STATIC_DB;
 	} else {
@@ -1023,7 +1034,7 @@ static int _mv88e6xxx_atu_flush_move(struct dsa_switch *ds,
 			GLOBAL_ATU_OP_FLUSH_MOVE_NON_STATIC;
 	}
 
-	return _mv88e6xxx_atu_cmd(ds, op);
+	return _mv88e6xxx_atu_cmd(ds, entry->fid, op);
 }
 
 static int _mv88e6xxx_atu_flush(struct dsa_switch *ds, u16 fid, bool static_too)
@@ -1331,8 +1342,7 @@ static int _mv88e6xxx_vtu_getnext(struct dsa_switch *ds,
 		if (ret < 0)
 			return ret;
 
-		if (mv88e6xxx_6097_family(ds) || mv88e6xxx_6165_family(ds) ||
-		    mv88e6xxx_6351_family(ds) || mv88e6xxx_6352_family(ds)) {
+		if (mv88e6xxx_has_fid_reg(ds)) {
 			ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
 						  GLOBAL_VTU_FID);
 			if (ret < 0)
@@ -1429,7 +1439,9 @@ static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
 		ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_SID, reg);
 		if (ret < 0)
 			return ret;
+	}
 
+	if (mv88e6xxx_has_fid_reg(ds)) {
 		reg = entry->fid & GLOBAL_VTU_FID_MASK;
 		ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_FID, reg);
 		if (ret < 0)
@@ -1976,11 +1988,7 @@ static int _mv88e6xxx_atu_load(struct dsa_switch *ds,
 	if (ret < 0)
 		return ret;
 
-	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_FID, entry->fid);
-	if (ret < 0)
-		return ret;
-
-	return _mv88e6xxx_atu_cmd(ds, GLOBAL_ATU_OP_LOAD_DB);
+	return _mv88e6xxx_atu_cmd(ds, entry->fid, GLOBAL_ATU_OP_LOAD_DB);
 }
 
 static int _mv88e6xxx_port_fdb_load(struct dsa_switch *ds, int port,
@@ -2063,11 +2071,7 @@ static int _mv88e6xxx_atu_getnext(struct dsa_switch *ds, u16 fid,
 	if (ret < 0)
 		return ret;
 
-	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_FID, fid);
-	if (ret < 0)
-		return ret;
-
-	ret = _mv88e6xxx_atu_cmd(ds, GLOBAL_ATU_OP_GET_NEXT_DB);
+	ret = _mv88e6xxx_atu_cmd(ds, fid, GLOBAL_ATU_OP_GET_NEXT_DB);
 	if (ret < 0)
 		return ret;
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next v2 3/6] net: dsa: mv88e6xxx: variable number of databases
From: Vivien Didelot @ 2016-03-31 20:53 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn,
	Vivien Didelot
In-Reply-To: <1459457626-30082-1-git-send-email-vivien.didelot@savoirfairelinux.com>

Marvell switch chips have different number of address databases.

The code currently only supports models with 4096 databases. Such switch
has dedicated FID registers for ATU and VTU operations. Models with
fewer databases have their FID split in several registers.

List them all but only support models with 4096 databases at the moment.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
---
 drivers/net/dsa/mv88e6xxx.c | 38 ++++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index cc066dc..888c66b 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -482,6 +482,30 @@ static bool mv88e6xxx_6352_family(struct dsa_switch *ds)
 	return false;
 }
 
+static unsigned int mv88e6xxx_num_databases(struct dsa_switch *ds)
+{
+	struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+
+	/* The following devices have 4-bit identifiers for 16 databases */
+	if (ps->id == PORT_SWITCH_ID_6061)
+		return 16;
+
+	/* The following devices have 6-bit identifiers for 64 databases */
+	if (ps->id == PORT_SWITCH_ID_6065)
+		return 64;
+
+	/* The following devices have 8-bit identifiers for 256 databases */
+	if (mv88e6xxx_6095_family(ds) || mv88e6xxx_6185_family(ds))
+		return 256;
+
+	/* The following devices have 12-bit identifiers for 4096 databases */
+	if (mv88e6xxx_6097_family(ds) || mv88e6xxx_6165_family(ds) ||
+	    mv88e6xxx_6351_family(ds) || mv88e6xxx_6352_family(ds))
+		return 4096;
+
+	return 0;
+}
+
 static bool mv88e6xxx_has_fid_reg(struct dsa_switch *ds)
 {
 	/* Does the device have dedicated FID registers for ATU and VTU ops? */
@@ -1534,9 +1558,15 @@ loadpurge:
 static int _mv88e6xxx_port_fid(struct dsa_switch *ds, int port, u16 *new,
 			       u16 *old)
 {
+	u16 upper_mask;
 	u16 fid;
 	int ret;
 
+	if (mv88e6xxx_num_databases(ds) == 4096)
+		upper_mask = 0xff;
+	else
+		return -EOPNOTSUPP;
+
 	/* Port's default FID bits 3:0 are located in reg 0x06, offset 12 */
 	ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_BASE_VLAN);
 	if (ret < 0)
@@ -1559,11 +1589,11 @@ static int _mv88e6xxx_port_fid(struct dsa_switch *ds, int port, u16 *new,
 	if (ret < 0)
 		return ret;
 
-	fid |= (ret & PORT_CONTROL_1_FID_11_4_MASK) << 4;
+	fid |= (ret & upper_mask) << 4;
 
 	if (new) {
-		ret &= ~PORT_CONTROL_1_FID_11_4_MASK;
-		ret |= (*new >> 4) & PORT_CONTROL_1_FID_11_4_MASK;
+		ret &= ~upper_mask;
+		ret |= (*new >> 4) & upper_mask;
 
 		ret = _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_CONTROL_1,
 					   ret);
@@ -1627,7 +1657,7 @@ static int _mv88e6xxx_fid_new(struct dsa_switch *ds, u16 *fid)
 	 * databases are not needed. Return the next positive available.
 	 */
 	*fid = find_next_zero_bit(fid_bitmap, MV88E6XXX_N_FID, 1);
-	if (unlikely(*fid == MV88E6XXX_N_FID))
+	if (unlikely(*fid >= mv88e6xxx_num_databases(ds)))
 		return -ENOSPC;
 
 	/* Clear the database */
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next v2 4/6] net: dsa: mv88e6xxx: support 256 databases
From: Vivien Didelot @ 2016-03-31 20:53 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn,
	Vivien Didelot
In-Reply-To: <1459457626-30082-1-git-send-email-vivien.didelot@savoirfairelinux.com>

The 6185 family of devices has only 256 address databases. Their 8-bit
FID for ATU and VTU operations are split into ATU Control and ATU/VTU
Operation registers.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
---
 drivers/net/dsa/mv88e6xxx.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 888c66b..52c3e17 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -1003,6 +1003,20 @@ static int _mv88e6xxx_atu_cmd(struct dsa_switch *ds, u16 fid, u16 cmd)
 		ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_FID, fid);
 		if (ret < 0)
 			return ret;
+	} else if (mv88e6xxx_num_databases(ds) == 256) {
+		/* ATU DBNum[7:4] are located in ATU Control 15:12 */
+		ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_ATU_CONTROL);
+		if (ret < 0)
+			return ret;
+
+		ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_CONTROL,
+					   (ret & 0xfff) |
+					   ((fid << 8) & 0xf000));
+		if (ret < 0)
+			return ret;
+
+		/* ATU DBNum[3:0] are located in ATU Operation 3:0 */
+		cmd |= fid & 0xf;
 	}
 
 	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_OP, cmd);
@@ -1373,6 +1387,17 @@ static int _mv88e6xxx_vtu_getnext(struct dsa_switch *ds,
 				return ret;
 
 			next.fid = ret & GLOBAL_VTU_FID_MASK;
+		} else if (mv88e6xxx_num_databases(ds) == 256) {
+			/* VTU DBNum[7:4] are located in VTU Operation 11:8, and
+			 * VTU DBNum[3:0] are located in VTU Operation 3:0
+			 */
+			ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
+						  GLOBAL_VTU_OP);
+			if (ret < 0)
+				return ret;
+
+			next.fid = (ret & 0xf00) >> 4;
+			next.fid |= ret & 0xf;
 		}
 
 		if (mv88e6xxx_has_stu(ds)) {
@@ -1443,6 +1468,7 @@ unlock:
 static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
 				    struct mv88e6xxx_vtu_stu_entry *entry)
 {
+	u16 op = GLOBAL_VTU_OP_VTU_LOAD_PURGE;
 	u16 reg = 0;
 	int ret;
 
@@ -1470,6 +1496,12 @@ static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
 		ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_FID, reg);
 		if (ret < 0)
 			return ret;
+	} else if (mv88e6xxx_num_databases(ds) == 256) {
+		/* VTU DBNum[7:4] are located in VTU Operation 11:8, and
+		 * VTU DBNum[3:0] are located in VTU Operation 3:0
+		 */
+		op |= (entry->fid & 0xf0) << 8;
+		op |= entry->fid & 0xf;
 	}
 
 	reg = GLOBAL_VTU_VID_VALID;
@@ -1479,7 +1511,7 @@ loadpurge:
 	if (ret < 0)
 		return ret;
 
-	return _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_VTU_LOAD_PURGE);
+	return _mv88e6xxx_vtu_cmd(ds, op);
 }
 
 static int _mv88e6xxx_stu_getnext(struct dsa_switch *ds, u8 sid,
@@ -1564,6 +1596,8 @@ static int _mv88e6xxx_port_fid(struct dsa_switch *ds, int port, u16 *new,
 
 	if (mv88e6xxx_num_databases(ds) == 4096)
 		upper_mask = 0xff;
+	else if (mv88e6xxx_num_databases(ds) == 256)
+		upper_mask = 0xf;
 	else
 		return -EOPNOTSUPP;
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next v2 5/6] net: dsa: mv88e6xxx: map destination addresses for 6185
From: Vivien Didelot @ 2016-03-31 20:53 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn,
	Vivien Didelot
In-Reply-To: <1459457626-30082-1-git-send-email-vivien.didelot@savoirfairelinux.com>

The 88E6185 switch also has a MapDA bit in its Port Control 2 register.
When this bit is cleared, all frames are sent out to the CPU port.

Set this bit to rely on address databases (ATU) hits and direct frames
out of the correct ports, and thus allow hardware bridging.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
---
 drivers/net/dsa/mv88e6xxx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 52c3e17..56f5657 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2455,7 +2455,8 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
 	reg = 0;
 	if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
 	    mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
-	    mv88e6xxx_6095_family(ds) || mv88e6xxx_6320_family(ds))
+	    mv88e6xxx_6095_family(ds) || mv88e6xxx_6320_family(ds) ||
+	    mv88e6xxx_6185_family(ds))
 		reg = PORT_CONTROL_2_MAP_DA;
 
 	if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next v2 6/6] net: dsa: mv88e6131: enable hardware bridging
From: Vivien Didelot @ 2016-03-31 20:53 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn,
	Vivien Didelot
In-Reply-To: <1459457626-30082-1-git-send-email-vivien.didelot@savoirfairelinux.com>

By adding support for bridge operations, FDB operations, and optionally
VLAN operations (for 802.1Q and VLAN filtering aware systems), the
switch bridges ports correctly, the CPU is able to populate the hardware
address databases, and thus hardware bridging becomes functional within
the 88E6185 family of switches.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
---
 drivers/net/dsa/mv88e6131.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index a92ca65..2407028 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -169,6 +169,17 @@ struct dsa_switch_driver mv88e6131_switch_driver = {
 	.get_ethtool_stats	= mv88e6xxx_get_ethtool_stats,
 	.get_sset_count		= mv88e6xxx_get_sset_count,
 	.adjust_link		= mv88e6xxx_adjust_link,
+	.port_bridge_join	= mv88e6xxx_port_bridge_join,
+	.port_bridge_leave	= mv88e6xxx_port_bridge_leave,
+	.port_vlan_filtering	= mv88e6xxx_port_vlan_filtering,
+	.port_vlan_prepare	= mv88e6xxx_port_vlan_prepare,
+	.port_vlan_add		= mv88e6xxx_port_vlan_add,
+	.port_vlan_del		= mv88e6xxx_port_vlan_del,
+	.port_vlan_dump		= mv88e6xxx_port_vlan_dump,
+	.port_fdb_prepare       = mv88e6xxx_port_fdb_prepare,
+	.port_fdb_add           = mv88e6xxx_port_fdb_add,
+	.port_fdb_del           = mv88e6xxx_port_fdb_del,
+	.port_fdb_dump          = mv88e6xxx_port_fdb_dump,
 };
 
 MODULE_ALIAS("platform:mv88e6085");
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH 1/1] net: usb: cdc_ncm: adding Telit LE910 V2 mobile broadband card
From: Bjørn Mork @ 2016-03-31 21:07 UTC (permalink / raw)
  To: David Miller
  Cc: dnlplm-Re5JQEeQqe8AvxtiuMwx3w, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20160331.165138.1846092223088368562.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> writes:
> From: Daniele Palmas <dnlplm-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Date: Thu, 31 Mar 2016 15:16:47 +0200
>
>> Telit LE910 V2 is a mobile broadband card with no ARP capabilities:
>> the patch makes this device to use wwan_noarp_info struct
>> 
>> Signed-off-by: Daniele Palmas <dnlplm-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>
> Bjorn, can you take a quick look at this?

Looks fine to me.

Reviewed-by: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>


Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: Best way to reduce system call overhead for tun device I/O?
From: Tom Herbert @ 2016-03-31 21:18 UTC (permalink / raw)
  To: Guus Sliepen; +Cc: Linux Kernel Network Developers
In-Reply-To: <20160329224043.GY3784@sliepen.org>

On Tue, Mar 29, 2016 at 6:40 PM, Guus Sliepen <guus@tinc-vpn.org> wrote:
> I'm trying to reduce system call overhead when reading/writing to/from a
> tun device in userspace. For sockets, one can use sendmmsg()/recvmmsg(),
> but a tun fd is not a socket fd, so this doesn't work. I'm see several
> options to allow userspace to read/write multiple packets with one
> syscall:
>
> - Implement a TX/RX ring buffer that is mmap()ed, like with AF_PACKET
>   sockets.
>
> - Implement a ioctl() to emulate sendmmsg()/recvmmsg().
>
> - Add a flag that can be set using TUNSETIFF that makes regular
>   read()/write() calls handle multiple packets in one go.
>
> - Expose a socket fd to userspace, so regular sendmmsg()/recvmmsg() can
>   be used. There is tun_get_socket() which is used internally in the
>   kernel, but this is not exposed to userspace, and doesn't look trivial
>   to do either.
>
> What would be the right way to do this?
>
Personally I think tun could benefit greatly if it were implemented as
a socket instead of character interface. One thing that could be much
better is sending/receiving of meta data attached to skbuf. For
instance GSO data could be in ancillary data in a socket instead of
inline with packet data as tun seems to be doing now.

Tom

> --
> Met vriendelijke groet / with kind regards,
>      Guus Sliepen <guus@tinc-vpn.org>

^ permalink raw reply

* Re: [PATCH 1/1] net: usb: cdc_ncm: adding Telit LE910 V2 mobile broadband card
From: David Miller @ 2016-03-31 21:20 UTC (permalink / raw)
  To: bjorn-yOkvZcmFvRU
  Cc: dnlplm-Re5JQEeQqe8AvxtiuMwx3w, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <87lh4ycq5p.fsf-lbf33ChDnrE/G1V5fR+Y7Q@public.gmane.org>

From: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>
Date: Thu, 31 Mar 2016 23:07:30 +0200

> David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> writes:
>> From: Daniele Palmas <dnlplm-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> Date: Thu, 31 Mar 2016 15:16:47 +0200
>>
>>> Telit LE910 V2 is a mobile broadband card with no ARP capabilities:
>>> the patch makes this device to use wwan_noarp_info struct
>>> 
>>> Signed-off-by: Daniele Palmas <dnlplm-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>
>> Bjorn, can you take a quick look at this?
> 
> Looks fine to me.
> 
> Reviewed-by: Bjørn Mork <bjorn-yOkvZcmFvRU@public.gmane.org>

Thanks, applied.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: Best way to reduce system call overhead for tun device I/O?
From: David Miller @ 2016-03-31 21:20 UTC (permalink / raw)
  To: tom; +Cc: guus, netdev
In-Reply-To: <CALx6S37xsTeDyMHOEiGzjpbR_5zFw6XSDEgPkZqpUzEa9WDStg@mail.gmail.com>

From: Tom Herbert <tom@herbertland.com>
Date: Thu, 31 Mar 2016 17:18:48 -0400

> On Tue, Mar 29, 2016 at 6:40 PM, Guus Sliepen <guus@tinc-vpn.org> wrote:
>> I'm trying to reduce system call overhead when reading/writing to/from a
>> tun device in userspace. For sockets, one can use sendmmsg()/recvmmsg(),
>> but a tun fd is not a socket fd, so this doesn't work. I'm see several
>> options to allow userspace to read/write multiple packets with one
>> syscall:
>>
>> - Implement a TX/RX ring buffer that is mmap()ed, like with AF_PACKET
>>   sockets.
>>
>> - Implement a ioctl() to emulate sendmmsg()/recvmmsg().
>>
>> - Add a flag that can be set using TUNSETIFF that makes regular
>>   read()/write() calls handle multiple packets in one go.
>>
>> - Expose a socket fd to userspace, so regular sendmmsg()/recvmmsg() can
>>   be used. There is tun_get_socket() which is used internally in the
>>   kernel, but this is not exposed to userspace, and doesn't look trivial
>>   to do either.
>>
>> What would be the right way to do this?
>>
> Personally I think tun could benefit greatly if it were implemented as
> a socket instead of character interface. One thing that could be much
> better is sending/receiving of meta data attached to skbuf. For
> instance GSO data could be in ancillary data in a socket instead of
> inline with packet data as tun seems to be doing now.

Agreed.

^ permalink raw reply

* Re: [PATCH] sctp: avoid refreshing heartbeat timer too often
From: 'Marcelo Ricardo Leitner' @ 2016-03-31 21:25 UTC (permalink / raw)
  To: David Laight
  Cc: netdev@vger.kernel.org, Neil Horman, Vlad Yasevich,
	linux-sctp@vger.kernel.org
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D54DC1DB0@AcuExch.aculab.com>

On Thu, Mar 31, 2016 at 11:16:52AM +0000, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 30 March 2016 13:13
> > Em 30-03-2016 06:37, David Laight escreveu:
> > > From: Marcelo Ricardo Leitner
> > >> Sent: 29 March 2016 14:42
> > >>
> > >> Currently on high rate SCTP streams the heartbeat timer refresh can
> > >> consume quite a lot of resources as timer updates are costly and it
> > >> contains a random factor, which a) is also costly and b) invalidates
> > >> mod_timer() optimization for not editing a timer to the same value.
> > >> It may even cause the timer to be slightly advanced, for no good reason.
> > >
> > > Interesting thoughts:
> > > 1) Is it necessary to use a different 'random factor' until the timer actually
> > >     expires?
> > 
> > I don't understand you fully here, but we have to have a random factor
> > on timer expire. As noted by Daniel Borkmann on his commit 8f61059a96c2
> > ("net: sctp: improve timer slack calculation for transport HBs"):
> 
> When a HEARTBEAT chunk is sent determine the new interval, use that
> interval until the timer actually expires when a new interval is
> calculated. So the random number is only generated once per heartbeat.
> 
> >      RFC4960, section 8.3 says:
> > 
> >        On an idle destination address that is allowed to heartbeat,
> >        it is recommended that a HEARTBEAT chunk is sent once per RTO
> >        of that destination address plus the protocol parameter
> >        'HB.interval', with jittering of +/- 50% of the RTO value,
> >        and exponential backoff of the RTO if the previous HEARTBEAT
> >        is unanswered.
> > 
> > Previous to his commit, it was using a random factor based on jiffies.
> > 
> > This patch then assumes that random_A+2 is just as random as random_B as
> > long as it is within the allowed range, avoiding the unnecessary updates.
> > 
> > > 2) It might be better to allow the heartbeat timer to expire, on expiry work
> > >     out the new interval based on when the last 'refresh' was done.
> > 
> > Cool, I thought about this too. It would introduce some extra complexity
> > that is not really worth I think, specially because now we may be doing
> > more timer updates even with this patch but it's not triggering any wake
> > ups and we would need at least 2 wake ups then: one for the first
> > timeout event, and then re-schedule the timer for the next updated one,
> > and maybe again, and again.. less timer updates but more wake ups, one
> > at every heartbeat interval even on a busy transport. Seems it's cheaper
> > to just update the timer then.
> 
> One wakeup per heartbeat interval on a busy connection is probably noise.
> Probably much less than the 1000s of timer updates that would otherwise happen.

I was thinking more on near-idle systems, as the overhead for this
refresh looked rather small now even for busy transports if compared to
other points, a worth trade-off for reducing wake ups, imho.

But then I checked tcp, and it does what you're suggesting.
I'll rework the patch. Thanks

> A further optimisation would be to restart the timer if more than (say) 80%
> of the way through the timeout period.
> 
> Similarly the HEARTBEAT could be sent if the 2nd wakeup would be almost immediate.

^ permalink raw reply

* [PATCH] net: thunderx: Fix broken of_node_put() code.
From: David Daney @ 2016-03-31 21:42 UTC (permalink / raw)
  To: David S. Miller, netdev
  Cc: linux-kernel, linux-arm-kernel, Robert Richter, Sunil Goutham,
	David Daney

From: David Daney <david.daney@cavium.com>

commit b7d3e3d3d21a ("net: thunderx: Don't leak phy device references
on -EPROBE_DEFER condition.") incorrectly moved the call to
of_node_put() outside of the loop.  Under normal loop exit, the node
has already had of_node_put() called, so the extra call results in:

[    8.228020] ERROR: Bad of_node_put() on /soc@0/pci@848000000000/mrml-bridge0@1,0/bgx0/xlaui00
[    8.239433] CPU: 16 PID: 608 Comm: systemd-udevd Not tainted 4.6.0-rc1-numa+ #157
[    8.247380] Hardware name: www.cavium.com EBB8800/EBB8800, BIOS 0.3 Mar  2 2016
[    8.273541] Call trace:
[    8.273550] [<fffffc0008097364>] dump_backtrace+0x0/0x210
[    8.273557] [<fffffc0008097598>] show_stack+0x24/0x2c
[    8.273560] [<fffffc0008399ed0>] dump_stack+0x8c/0xb4
[    8.273566] [<fffffc00085aa828>] of_node_release+0xa8/0xac
[    8.273570] [<fffffc000839cad8>] kobject_cleanup+0x8c/0x194
[    8.273573] [<fffffc000839c97c>] kobject_put+0x44/0x6c
[    8.273576] [<fffffc00085a9ab0>] of_node_put+0x24/0x30
[    8.273587] [<fffffc0000bd0f74>] bgx_probe+0x17c/0xcd8 [thunder_bgx]
[    8.273591] [<fffffc00083ed220>] pci_device_probe+0xa0/0x114
[    8.273596] [<fffffc0008473fbc>] driver_probe_device+0x178/0x418
[    8.273599] [<fffffc000847435c>] __driver_attach+0x100/0x118
[    8.273602] [<fffffc0008471b58>] bus_for_each_dev+0x6c/0xac
[    8.273605] [<fffffc0008473884>] driver_attach+0x30/0x38
[    8.273608] [<fffffc00084732f4>] bus_add_driver+0x1f8/0x29c
[    8.273611] [<fffffc0008475028>] driver_register+0x70/0x110
[    8.273617] [<fffffc00083ebf08>] __pci_register_driver+0x60/0x6c
[    8.273623] [<fffffc0000bf0040>] bgx_init_module+0x40/0x48 [thunder_bgx]
[    8.273626] [<fffffc0008090d04>] do_one_initcall+0xcc/0x1c0
[    8.273631] [<fffffc0008198abc>] do_init_module+0x68/0x1c8
[    8.273635] [<fffffc0008125668>] load_module+0xf44/0x11f4
[    8.273638] [<fffffc0008125b64>] SyS_finit_module+0xb8/0xe0
[    8.273641] [<fffffc0008093b30>] el0_svc_naked+0x24/0x28

Go back to the previous (correct) code that only did the extra
of_node_put() call on early exit from the loop.

Signed-off-by: David Daney <david.daney@cavium.com>
---
 drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 9679515..d20539a 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -1011,10 +1011,11 @@ static int bgx_init_of_phy(struct bgx *bgx)
 		}
 
 		lmac++;
-		if (lmac == MAX_LMAC_PER_BGX)
+		if (lmac == MAX_LMAC_PER_BGX) {
+			of_node_put(node);
 			break;
+		}
 	}
-	of_node_put(node);
 	return 0;
 
 defer:
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
From: Daniel Borkmann @ 2016-03-31 21:52 UTC (permalink / raw)
  To: Hannes Frederic Sowa, David Miller
  Cc: eric.dumazet, alexei.starovoitov, mkubecek, sasha.levin, jslaby,
	mst, netdev
In-Reply-To: <56FD7F0B.5090602@stressinduktion.org>

On 03/31/2016 09:48 PM, Hannes Frederic Sowa wrote:
[...]
> Tightest solution would probably be to combine both patches.
>
> bool called_by_tuntap;
>
> old_fp = rcu_dereference_protected(sk->sk_filter, called_by_tuntap ? lockdep_rtnl_is_held() : lockdep_sock_is_held());

If I understand you correctly with combining them, you mean you'd still
need the API change to pass the bool 'called_by_tuntap' down, right?

If so, your main difference is, after all, to replace the sock_owned_by_user()
with the lockdep_sock_is_held() construction instead, correct?

But then, isn't it already sufficient when you pass the bool itself down
that 'demuxes' in this case between the sock_owned_by_user() vs
lockdep_rtnl_is_held() check?

Thanks,
Daniel

^ permalink raw reply

* Re: [PATCH net-next 0/8] add TX timestamping via cmsg
From: Martin KaFai Lau @ 2016-03-31 21:57 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Soheil Hassas Yeganeh, Network Development
In-Reply-To: <CAF=yD-Lpx2nBq1bL64-pzOqHh5E-a4Bbr2goDE59ApWvzFri_Q@mail.gmail.com>

On Wed, Mar 30, 2016 at 11:50:31PM -0400, Willem de Bruijn wrote:
> >> Nice patches!
>
> This does not yet solve the append issue that your MSG_EOR patch
> addresses, of course.
Yes.  I have been thinking about both approaches.

>
> The straightforward jump to new_segment that I proposed as
> simplification is more properly a "start-of-record" than
> "end-of-record" signal. It is probably preferable to indeed be able to
> pass EOR as signal that the last skb must not be appended to in
> subsequent calls.
I suspect we could do better than only checking MSG_EOR and jump.
Before entering the loop, we may be able to check if the
last-not-yet-written out skb has tskey set and the current
tcp_sendmsg may need to overwrite it before jumping.

Also, the 2nd sendmsg may not be called with MSG_EOR set but
the per-write-knob is turned on.  It could overwrite the
1st sendmsg with both per-write-knob on and MSG_EOR set.

Note that there is another collapse-case during tcp retrans
where the MSG_EOR bit is already loss.

Hence, having EOR passed as signal (as you mentioned) and stored
is needed.

I think another bit in the TCP_SKB_CB may help here.
The semantic of this bit could be 'no skb merge under some rare conditions'.
For now, it is limited to tskey.

[Another side note is, the split/fragment case should be fine as it is.
 When splitting a skb into two smaller skbs, the tskey should fall
 into either of them and no information loss.]

> I think that the record bounds issue is best solved independently from
> the interface for intermittent timestamps because
I understand that users may want to selectively do timestamping on a
particular sendmsg (per-write-knob), while do not care if the tskey
will be overwritten (no-tskey-overwritten) by the future
sendmsg/retrans.  Separating them gives the end-user a choice.

On the other hand, if the caller has specifically asked to do tstamp in
a particular tcp_sendmsg, it is a strong intention that the caller wants to
specifically track this message alone and not expecting this tstmap to
include anything else sent after it.  Beside, TLS user needs to make more
effort to pass the per-write-knob to tcp_sendmsg.  Hence, when per-write-knob
is used, I think the no-tskey-overwritten should be at least allowed (if
not the default).

> (a) changing the tcp
> bytestream packetization for timestamping introduces subtle
> differences between tracked and untracked data that are not always
> acceptable and (b) EOR can also be useful outside timestamps. A
> zerocopy sendmsg patchset that I sent for RFC last year encountered a
> similar requirement, to give one example: each skb with user data must
> point to a completion notification structure (ubuf_info), and can only
> point to one at a time. Appends that cause a conflict in skb->uarg
> pointers had to be blocked, at the cost of possibly different
> packetization compared to regular sends.
With no-tskey-overwritten turned on in production, we don't found the
end-user's performance has been impacted in meaningful way. I believe
it is the proper tradeoff as an opt-in feature to get measurement
data for existing tcp-socket users without adding a lot of burden
to the TCP stack which is a byte-oriented socket to begin with.

^ permalink raw reply

* Re: [net PATCH v2] i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet
From: Jesse Brandeburg @ 2016-03-31 22:04 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: netdev, alexander.duyck, intel-wired-lan, jeffrey.t.kirsher,
	sowmini.varadhan
In-Reply-To: <20160330231116.12643.59554.stgit@localhost.localdomain>

On Wed, 30 Mar 2016 16:15:37 -0700
Alexander Duyck <aduyck@mirantis.com> wrote:

> This patch addresses a bug introduced based on my interpretation of the
> XL710 datasheet.  Specifically section 8.4.1 states that "A single transmit
> packet may span up to 8 buffers (up to 8 data descriptors per packet
> including both the header and payload buffers)."  It then later goes on to
> say that each segment for a TSO obeys the previous rule, however it then
> refers to TSO header and the segment payload buffers.
> 
> I believe the actual limit for fragments with TSO and a skbuff that has
> payload data in the header portion of the buffer is actually only 7
> fragments as the skb->data portion counts as 2 buffers, one for the TSO
> header, and one for a segment payload buffer.
> 
> Fixes: 2d37490b82af ("i40e/i40evf: Rewrite logic for 8 descriptor per packet check")
> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
> ---
> 
> v2: I realized that I overlooked the check in the inline function and as a
>     result we were still allowing for cases where 8 descriptors were being
>     used per packet and this would result in 9 DMA buffers.  I updated the
>     code so that we only allow 8 in the case of a single send, otherwise we
>     go into the function that walks the frags to verify each block.
> 
> I have tested this using rds-stress and it seems to run traffic without
> throwing any errors.

Looking like it is working for me too with at least the PF.

Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

Should also add:
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>

^ permalink raw reply

* Re: [net PATCH v2] i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet
From: Alexander Duyck @ 2016-03-31 22:09 UTC (permalink / raw)
  To: Jesse Brandeburg
  Cc: Alexander Duyck, Netdev, intel-wired-lan, Jeff Kirsher,
	Sowmini Varadhan
In-Reply-To: <20160331150438.00005617@unknown>

On Thu, Mar 31, 2016 at 3:04 PM, Jesse Brandeburg
<jesse.brandeburg@intel.com> wrote:
> On Wed, 30 Mar 2016 16:15:37 -0700
> Alexander Duyck <aduyck@mirantis.com> wrote:
>
>> This patch addresses a bug introduced based on my interpretation of the
>> XL710 datasheet.  Specifically section 8.4.1 states that "A single transmit
>> packet may span up to 8 buffers (up to 8 data descriptors per packet
>> including both the header and payload buffers)."  It then later goes on to
>> say that each segment for a TSO obeys the previous rule, however it then
>> refers to TSO header and the segment payload buffers.
>>
>> I believe the actual limit for fragments with TSO and a skbuff that has
>> payload data in the header portion of the buffer is actually only 7
>> fragments as the skb->data portion counts as 2 buffers, one for the TSO
>> header, and one for a segment payload buffer.
>>
>> Fixes: 2d37490b82af ("i40e/i40evf: Rewrite logic for 8 descriptor per packet check")
>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>> ---
>>
>> v2: I realized that I overlooked the check in the inline function and as a
>>     result we were still allowing for cases where 8 descriptors were being
>>     used per packet and this would result in 9 DMA buffers.  I updated the
>>     code so that we only allow 8 in the case of a single send, otherwise we
>>     go into the function that walks the frags to verify each block.
>>
>> I have tested this using rds-stress and it seems to run traffic without
>> throwing any errors.
>
> Looking like it is working for me too with at least the PF.

I was testing PF <-> VF in my environment so I think I ended up
covering both in my test at least.

- Alex

^ permalink raw reply

* Re: qdisc spin lock
From: Cong Wang @ 2016-03-31 22:16 UTC (permalink / raw)
  To: Michael Ma; +Cc: Linux Kernel Network Developers
In-Reply-To: <CAAmHdhw9bQkCm7uehRZ9mTetMzafdXxWhYj16f8W-YvSz8V4=g@mail.gmail.com>

On Wed, Mar 30, 2016 at 12:20 AM, Michael Ma <make0818@gmail.com> wrote:
> As far as I understand the design of TC is to simplify locking schema
> and minimize the work in __qdisc_run so that throughput won’t be
> affected, especially with large packets. However if the scenario is
> that multiple classes in the queueing discipline only have the shaping
> limit, there isn’t really a necessary correlation between different
> classes. The only synchronization point should be when the packet is
> dequeued from the qdisc queue and enqueued to the transmit queue of
> the device. My question is – is it worth investing on avoiding the
> locking contention by partitioning the queue/lock so that this
> scenario is addressed with relatively smaller latency?

If your HTB classes don't share bandwidth, why do you still make them
under the same hierarchy? IOW, you can just isolate them either with some
other qdisc or just separated interfaces.

^ permalink raw reply

* [PATCH v2 net-next] net: ipv4: Consider unreachable nexthops in multipath routes
From: David Ahern @ 2016-03-31 22:24 UTC (permalink / raw)
  To: netdev; +Cc: ja, David Ahern

Multipath route lookups should consider knowledge about next hops and not
select a hop that is known to be failed.

Example:

                     [h2]                   [h3]   15.0.0.5
                      |                      |
                     3|                     3|
                    [SP1]                  [SP2]--+
                     1  2                   1     2
                     |  |     /-------------+     |
                     |   \   /                    |
                     |     X                      |
                     |    / \                     |
                     |   /   \---------------\    |
                     1  2                     1   2
         12.0.0.2  [TOR1] 3-----------------3 [TOR2] 12.0.0.3
                     4                         4
                      \                       /
                        \                    /
                         \                  /
                          -------|   |-----/
                                 1   2
                                [TOR3]
                                  3|
                                   |
                                  [h1]  12.0.0.1

host h1 with IP 12.0.0.1 has 2 paths to host h3 at 15.0.0.5:

    root@h1:~# ip ro ls
    ...
    12.0.0.0/24 dev swp1  proto kernel  scope link  src 12.0.0.1
    15.0.0.0/16
            nexthop via 12.0.0.2  dev swp1 weight 1
            nexthop via 12.0.0.3  dev swp1 weight 1
    ...

If the link between tor3 and tor1 is down and the link between tor1
and tor2 then tor1 is effectively cut-off from h1. Yet the route lookups
in h1 are alternating between the 2 routes: ping 15.0.0.5 gets one and
ssh 15.0.0.5 gets the other. Connections that attempt to use the
12.0.0.2 nexthop fail since that neighbor is not reachable:

    root@h1:~# ip neigh show
    ...
    12.0.0.3 dev swp1 lladdr 00:02:00:00:00:1b REACHABLE
    12.0.0.2 dev swp1  FAILED
    ...

The failed path can be avoided by considering known neighbor information
when selecting next hops. If the neighbor lookup fails we have no
knowledge about the nexthop, so give it a shot. If there is an entry
then only select the nexthop if the state is sane. This is similar to
what fib_detect_death does.

To maintain backward compatibility use of the neighbor information is
based on a new sysctl, fib_multipath_use_neigh.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
v2
- use rcu locking to avoid refcnts per Eric's suggestion
- only consider neighbor info for nh_scope == RT_SCOPE_LINK per Julian's
  comment
- drop the 'state == NUD_REACHABLE' from the state check since it is
  part of NUD_VALID (comment from Julian)
- wrapped the use of the neigh in a sysctl

 Documentation/networking/ip-sysctl.txt | 10 ++++++++++
 include/net/netns/ipv4.h               |  3 +++
 net/ipv4/fib_semantics.c               | 32 ++++++++++++++++++++++++++++++--
 net/ipv4/sysctl_net_ipv4.c             | 11 +++++++++++
 4 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index b183e2b606c8..5b316d33a23f 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -63,6 +63,16 @@ fwmark_reflect - BOOLEAN
 	fwmark of the packet they are replying to.
 	Default: 0
 
+fib_multipath_use_neigh - BOOLEAN
+	Use status of existing neighbor entry when determining nexthop for
+	multipath routes. If disabled neighbor information is not used then
+	packets could be directed to a dead nexthop. Only valid for kernels
+	built with CONFIG_IP_ROUTE_MULTIPATH enabled.
+	Default: 0 (disabled)
+	Possible values:
+	0 - disabled
+	1 - enabled
+
 route/max_size - INTEGER
 	Maximum number of routes allowed in the kernel.  Increase
 	this when using large numbers of interfaces and/or routes.
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index a69cde3ce460..d061ffeb1e71 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -133,6 +133,9 @@ struct netns_ipv4 {
 	struct fib_rules_ops	*mr_rules_ops;
 #endif
 #endif
+#ifdef CONFIG_IP_ROUTE_MULTIPATH
+	int sysctl_fib_multipath_use_neigh;
+#endif
 	atomic_t	rt_genid;
 };
 #endif
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index d97268e8ff10..6d423faff0ce 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1559,17 +1559,45 @@ int fib_sync_up(struct net_device *dev, unsigned int nh_flags)
 }
 
 #ifdef CONFIG_IP_ROUTE_MULTIPATH
+static bool fib_good_nh(const struct fib_nh *nh, struct net_device *dev)
+{
+	struct neighbour *n = NULL;
+	int state = NUD_NONE;
+
+	if (nh->nh_scope == RT_SCOPE_LINK) {
+		rcu_read_lock_bh();
+
+		n = __neigh_lookup_noref(&arp_tbl, &nh->nh_gw, dev);
+		if (n)
+			state = n->nud_state;
+
+		rcu_read_unlock_bh();
+	}
+
+	/* outside of rcu locking using n only as a boolean
+	 * on whether a neighbor entry existed
+	 */
+	if (!n || (state & NUD_VALID))
+		return true;
+
+	return false;
+}
 
 void fib_select_multipath(struct fib_result *res, int hash)
 {
 	struct fib_info *fi = res->fi;
+	struct net_device *dev = fi->fib_dev;
+	struct net *net = fi->fib_net;
 
 	for_nexthops(fi) {
 		if (hash > atomic_read(&nh->nh_upper_bound))
 			continue;
 
-		res->nh_sel = nhsel;
-		return;
+		if (!net->ipv4.sysctl_fib_multipath_use_neigh ||
+		    fib_good_nh(nh, dev)) {
+			res->nh_sel = nhsel;
+			return;
+		}
 	} endfor_nexthops(fi);
 
 	/* Race condition: route has just become dead. */
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 1e1fe6086dd9..bb0419582b8d 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -960,6 +960,17 @@ static struct ctl_table ipv4_net_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+#ifdef CONFIG_IP_ROUTE_MULTIPATH
+	{
+		.procname	= "fib_multipath_use_neigh",
+		.data		= &init_net.ipv4.sysctl_fib_multipath_use_neigh,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
+#endif
 	{ }
 };
 
-- 
1.9.1

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox