* [PATCH 4/5] dt-bindings: ptp: add ptp-qoriq.txt
From: Yangbo Lu @ 2018-05-25 4:40 UTC (permalink / raw)
To: netdev, devicetree, linux-kernel, Richard Cochran, claudiu.manoil,
Rob Herring
Cc: Yangbo Lu
In-Reply-To: <20180525044038.37756-1-yangbo.lu@nxp.com>
This patch adds documentation for the ptp_qoriq dt-bindings.
The description was moved here from
Documentation/devicetree/bindings/net/fsl-tsec-phy.txt,
since the gianfar_ptp driver was moved to the ptp_qoriq driver.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
---
.../devicetree/bindings/net/fsl-tsec-phy.txt | 68 +-------------------
.../devicetree/bindings/ptp/ptp-qoriq.txt | 69 ++++++++++++++++++++
2 files changed, 70 insertions(+), 67 deletions(-)
create mode 100644 Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
diff --git a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
index 79bf352..047bdf7 100644
--- a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
+++ b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
@@ -86,70 +86,4 @@ Example:
* Gianfar PTP clock nodes
-General Properties:
-
- - compatible Should be "fsl,etsec-ptp"
- - reg Offset and length of the register set for the device
- - interrupts There should be at least two interrupts. Some devices
- have as many as four PTP related interrupts.
-
-Clock Properties:
-
- - fsl,cksel Timer reference clock source.
- - fsl,tclk-period Timer reference clock period in nanoseconds.
- - fsl,tmr-prsc Prescaler, divides the output clock.
- - fsl,tmr-add Frequency compensation value.
- - fsl,tmr-fiper1 Fixed interval period pulse generator.
- - fsl,tmr-fiper2 Fixed interval period pulse generator.
- - fsl,max-adj Maximum frequency adjustment in parts per billion.
-
- These properties set the operational parameters for the PTP
- clock. You must choose these carefully for the clock to work right.
- Here is how to figure good values:
-
- TimerOsc = selected reference clock MHz
- tclk_period = desired clock period nanoseconds
- NominalFreq = 1000 / tclk_period MHz
- FreqDivRatio = TimerOsc / NominalFreq (must be greater that 1.0)
- tmr_add = ceil(2^32 / FreqDivRatio)
- OutputClock = NominalFreq / tmr_prsc MHz
- PulseWidth = 1 / OutputClock microseconds
- FiperFreq1 = desired frequency in Hz
- FiperDiv1 = 1000000 * OutputClock / FiperFreq1
- tmr_fiper1 = tmr_prsc * tclk_period * FiperDiv1 - tclk_period
- max_adj = 1000000000 * (FreqDivRatio - 1.0) - 1
-
- The calculation for tmr_fiper2 is the same as for tmr_fiper1. The
- driver expects that tmr_fiper1 will be correctly set to produce a 1
- Pulse Per Second (PPS) signal, since this will be offered to the PPS
- subsystem to synchronize the Linux clock.
-
- Reference clock source is determined by the value, which is holded
- in CKSEL bits in TMR_CTRL register. "fsl,cksel" property keeps the
- value, which will be directly written in those bits, that is why,
- according to reference manual, the next clock sources can be used:
-
- <0> - external high precision timer reference clock (TSEC_TMR_CLK
- input is used for this purpose);
- <1> - eTSEC system clock;
- <2> - eTSEC1 transmit clock;
- <3> - RTC clock input.
-
- When this attribute is not used, eTSEC system clock will serve as
- IEEE 1588 timer reference clock.
-
-Example:
-
- ptp_clock@24e00 {
- compatible = "fsl,etsec-ptp";
- reg = <0x24E00 0xB0>;
- interrupts = <12 0x8 13 0x8>;
- interrupt-parent = < &ipic >;
- fsl,cksel = <1>;
- fsl,tclk-period = <10>;
- fsl,tmr-prsc = <100>;
- fsl,tmr-add = <0x999999A4>;
- fsl,tmr-fiper1 = <0x3B9AC9F6>;
- fsl,tmr-fiper2 = <0x00018696>;
- fsl,max-adj = <659999998>;
- };
+Refer to Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
diff --git a/Documentation/devicetree/bindings/ptp/ptp-qoriq.txt b/Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
new file mode 100644
index 0000000..0f569d8
--- /dev/null
+++ b/Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
@@ -0,0 +1,69 @@
+* Freescale QorIQ 1588 timer based PTP clock
+
+General Properties:
+
+ - compatible Should be "fsl,etsec-ptp"
+ - reg Offset and length of the register set for the device
+ - interrupts There should be at least two interrupts. Some devices
+ have as many as four PTP related interrupts.
+
+Clock Properties:
+
+ - fsl,cksel Timer reference clock source.
+ - fsl,tclk-period Timer reference clock period in nanoseconds.
+ - fsl,tmr-prsc Prescaler, divides the output clock.
+ - fsl,tmr-add Frequency compensation value.
+ - fsl,tmr-fiper1 Fixed interval period pulse generator.
+ - fsl,tmr-fiper2 Fixed interval period pulse generator.
+ - fsl,max-adj Maximum frequency adjustment in parts per billion.
+
+ These properties set the operational parameters for the PTP
+ clock. You must choose these carefully for the clock to work right.
+ Here is how to figure good values:
+
+ TimerOsc = selected reference clock MHz
+ tclk_period = desired clock period nanoseconds
+ NominalFreq = 1000 / tclk_period MHz
+ FreqDivRatio = TimerOsc / NominalFreq (must be greater than 1.0)
+ tmr_add = ceil(2^32 / FreqDivRatio)
+ OutputClock = NominalFreq / tmr_prsc MHz
+ PulseWidth = 1 / OutputClock microseconds
+ FiperFreq1 = desired frequency in Hz
+ FiperDiv1 = 1000000 * OutputClock / FiperFreq1
+ tmr_fiper1 = tmr_prsc * tclk_period * FiperDiv1 - tclk_period
+ max_adj = 1000000000 * (FreqDivRatio - 1.0) - 1
+
+ The calculation for tmr_fiper2 is the same as for tmr_fiper1. The
+ driver expects that tmr_fiper1 will be correctly set to produce a 1
+ Pulse Per Second (PPS) signal, since this will be offered to the PPS
+ subsystem to synchronize the Linux clock.
+
+ The reference clock source is determined by the value held in the
+ CKSEL bits of the TMR_CTRL register. The "fsl,cksel" property holds
+ the value that is written directly to those bits, so, according to
+ the reference manual, the following clock sources can be used:
+
+ <0> - external high precision timer reference clock (TSEC_TMR_CLK
+ input is used for this purpose);
+ <1> - eTSEC system clock;
+ <2> - eTSEC1 transmit clock;
+ <3> - RTC clock input.
+
+ When this attribute is not used, eTSEC system clock will serve as
+ IEEE 1588 timer reference clock.
+
+Example:
+
+ ptp_clock@24e00 {
+ compatible = "fsl,etsec-ptp";
+ reg = <0x24E00 0xB0>;
+ interrupts = <12 0x8 13 0x8>;
+ interrupt-parent = < &ipic >;
+ fsl,cksel = <1>;
+ fsl,tclk-period = <10>;
+ fsl,tmr-prsc = <100>;
+ fsl,tmr-add = <0x999999A4>;
+ fsl,tmr-fiper1 = <0x3B9AC9F6>;
+ fsl,tmr-fiper2 = <0x00018696>;
+ fsl,max-adj = <659999998>;
+ };
--
1.7.1
^ permalink raw reply related
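The tmr_add and tmr_fiper formulas in the binding text above can be checked with integer arithmetic. The following is a minimal sketch and not part of the patch: the function names are made up for illustration, frequencies are taken in Hz rather than MHz to avoid floating point, and the 166666666 Hz reference clock is an assumption, chosen because it reproduces the fsl,tmr-add value in the example node.

```c
#include <stdint.h>

/* Hypothetical helpers; the names are illustrative, not from the driver. */

/*
 * tmr_add = ceil(2^32 / FreqDivRatio), FreqDivRatio = TimerOsc / NominalFreq.
 * NominalFreq in Hz is 10^9 / tclk_period (tclk_period in nanoseconds).
 * The result fits in 32 bits only when FreqDivRatio is greater than 1.0,
 * which the binding requires.
 */
static uint32_t ptp_tmr_add(uint64_t timer_osc_hz, uint32_t tclk_period_ns)
{
	uint64_t nominal_hz = 1000000000ULL / tclk_period_ns;
	uint64_t num = (1ULL << 32) * nominal_hz;

	/* Integer ceiling division. */
	return (uint32_t)((num + timer_osc_hz - 1) / timer_osc_hz);
}

/*
 * tmr_fiper = tmr_prsc * tclk_period * FiperDiv - tclk_period, where
 * FiperDiv = OutputClock / FiperFreq and OutputClock = NominalFreq / tmr_prsc
 * (both in Hz here).
 */
static uint32_t ptp_tmr_fiper(uint32_t tmr_prsc, uint32_t tclk_period_ns,
			      uint32_t fiper_hz)
{
	uint64_t output_hz = 1000000000ULL / tclk_period_ns / tmr_prsc;
	uint64_t fiper_div = output_hz / fiper_hz;

	return (uint32_t)((uint64_t)tmr_prsc * tclk_period_ns * fiper_div -
			  tclk_period_ns);
}
```

With tclk_period = 10 and tmr_prsc = 100 as in the example node, ptp_tmr_add(166666666, 10) yields 0x999999A4, ptp_tmr_fiper(100, 10, 1) yields 0x3B9AC9F6 (the 1 PPS setting), and ptp_tmr_fiper(100, 10, 10000) yields 0x00018696.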
* [PATCH 5/5] MAINTAINERS: add myself as maintainer for QorIQ PTP clock driver
From: Yangbo Lu @ 2018-05-25 4:40 UTC (permalink / raw)
To: netdev, devicetree, linux-kernel, Richard Cochran, claudiu.manoil,
Rob Herring
Cc: Yangbo Lu
In-Reply-To: <20180525044038.37756-1-yangbo.lu@nxp.com>
Add myself as maintainer for the QorIQ PTP clock driver.
Since gianfar_ptp.c was renamed to ptp_qoriq.c, let's
also maintain it under the QorIQ PTP clock driver entry.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
---
MAINTAINERS | 17 +++++++++--------
1 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 4b65225..a71d4fa 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4411,12 +4411,6 @@ L: linux-kernel@vger.kernel.org
S: Maintained
F: drivers/staging/fsl-dpaa2/ethsw
-DPAA2 PTP CLOCK DRIVER
-M: Yangbo Lu <yangbo.lu@nxp.com>
-L: linux-kernel@vger.kernel.org
-S: Maintained
-F: drivers/staging/fsl-dpaa2/rtc
-
DPT_I2O SCSI RAID DRIVER
M: Adaptec OEM Raid Solutions <aacraid@microsemi.com>
L: linux-scsi@vger.kernel.org
@@ -5648,7 +5642,6 @@ M: Claudiu Manoil <claudiu.manoil@nxp.com>
L: netdev@vger.kernel.org
S: Maintained
F: drivers/net/ethernet/freescale/gianfar*
-X: drivers/net/ethernet/freescale/gianfar_ptp.c
F: Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
FREESCALE GPMI NAND DRIVER
@@ -5695,6 +5688,15 @@ S: Maintained
F: drivers/net/ethernet/freescale/fman
F: Documentation/devicetree/bindings/powerpc/fsl/fman.txt
+FREESCALE QORIQ PTP CLOCK DRIVER
+M: Yangbo Lu <yangbo.lu@nxp.com>
+L: linux-kernel@vger.kernel.org
+S: Maintained
+F: drivers/staging/fsl-dpaa2/rtc
+F: drivers/ptp/ptp_qoriq.c
+F: include/linux/fsl/ptp_qoriq.h
+F: Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
+
FREESCALE QUAD SPI DRIVER
M: Han Xu <han.xu@nxp.com>
L: linux-mtd@lists.infradead.org
@@ -11429,7 +11431,6 @@ S: Maintained
W: http://linuxptp.sourceforge.net/
F: Documentation/ABI/testing/sysfs-ptp
F: Documentation/ptp/*
-F: drivers/net/ethernet/freescale/gianfar_ptp.c
F: drivers/net/phy/dp83640*
F: drivers/ptp/*
F: include/linux/ptp_cl*
--
1.7.1
^ permalink raw reply related
* Re: STMMAC driver with TSO enabled issue
From: Bhadram Varka @ 2018-05-25 4:41 UTC (permalink / raw)
To: Jose Abreu, netdev@vger.kernel.org, Joao Pinto
In-Reply-To: <06ec3e2e-c41a-5f19-ffd8-51c5453d586b@synopsys.com>
[-- Attachment #1: Type: text/plain, Size: 761 bytes --]
Hi Jose,
On 5/24/2018 3:01 PM, Jose Abreu wrote:
> Hi Bhadram,
>
> On 24-05-2018 06:58, Bhadram Varka wrote:
>>
>> After some time if check Tx descriptor status - then I see only
>> below
>>
>> [..]
>> [85788.286730] 027 [0x827951b0]: 0xf854f000 0x0 0x16d8 0x90000000
>>
>> index 025 and 026 descriptors processed but not index 027.
>>
>> At this stage Tx DMA is always in below state -
>>
>> ■ 3'b011: Running (Reading Data from system memory
>> buffer and queuing it to the Tx buffer (Tx FIFO))
>
> Thats strange, I think the descriptors look okay though. I will
> need the registers values (before the lock) and, if possible, the
> git bisect output.
Attaching the register dump captured after the issue was observed. Please
take a look.
--
Thanks,
Bhadram.
[-- Attachment #2: regdump.txt --]
[-- Type: text/plain, Size: 2966 bytes --]
0x0 = 0x08062203
0x4 = 0x00000000
0x8 = 0x00000004
0xc = 0x00000000
0x10 = 0x00004002
0x14 = 0x00020001
0x18 = 0x00000000
0x1c = 0x00000000
0x20 = 0x00000000
0x24 = 0x00000000
0x28 = 0x00000000
0x2c = 0x00000000
0x50 = 0x00000000
0x54 = 0x00000000
0x58 = 0x00000000
0x60 = 0x00000000
0x64 = 0x00000000
0x70 = 0x00000000
0x74 = 0x00000000
0x78 = 0x00000000
0x7c = 0x00000000
0x90 = 0x00000000
0x94 = 0x00000000
0x98 = 0x00000000
0x9c = 0x00000000
0xa0 = 0x000000AA
0xa4 = 0x00000000
0xa8 = 0x03020100
0xac = 0x00000000
0xb0 = 0x00000000
0xb4 = 0x00000030
0xb8 = 0x00000000
0xc0 = 0x00000000
0xc4 = 0x00000000
0xd0 = 0x00000000
0xd4 = 0x03E80000
0xd8 = 0x00000000
0xdc = 0x00000063
0xe0 = 0x00000000
0xe4 = 0x00000000
0xe8 = 0x00000000
0xec = 0x00000000
0xf0 = 0x00000000
0xf4 = 0x00000000
0xf8 = 0x00000000
0x110 = 0x00001041
0x114 = 0x00000000
0x11c = 0x1BFD73F7
0x120 = 0x429E79C7
0x124 = 0x100C30C3
0x128 = 0x00000000
0x140 = 0x00000000
0x144 = 0x00000000
0x148 = 0x00000000
0x14c = 0x00000000
0x150 = 0x00000000
0x200 = 0x00100104
0x204 = 0x00000000
0x208 = 0x00000000
0x20c = 0x00000000
0x210 = 0x00000000
0x230 = 0x00000000
0x234 = 0x00000000
0x238 = 0x00000000
0x240 = 0x00000000
0x244 = 0x00000000
0x300 = 0x80005CE1
0x304 = 0xCAA296FE
0xc00 = 0x00000000
0xc08 = 0x00000000
0xc0c = 0x00800018
0xc10 = 0x00000000
0xc20 = 0x00000000
0xc30 = 0x02020100
0xc34 = 0x00000000
0xd00 = 0x000F000A
0xd04 = 0x00000000
0xd08 = 0x00000000
0xd0c = 0x00000000
0xd14 = 0x00000000
0xd18 = 0x00000010
0xd2c = 0x01000000
0xd30 = 0x00F0C1A0
0xd34 = 0x00000000
0xd38 = 0x00000000
0xd3c = 0x00000000
0xd40 = 0x000F000A
0xd80 = 0x000F000A
0xdc0 = 0x000F000A
0xd44 = 0x00000000
0xd84 = 0x00000000
0xdc4 = 0x00000000
0xd48 = 0x00000000
0xd88 = 0x00000000
0xdc8 = 0x00000000
0x1000 = 0x00000000
0x1004 = 0x0002100E
0x1008 = 0x00000000
0x100c = 0x33636300
0x1010 = 0x00000063
0x1014 = 0x00000000
0x1020 = 0x00000000
0x1024 = 0x00000000
0x1028 = 0x00000000
0x1100 = 0x00010000
0x1180 = 0x00010000
0x1200 = 0x00010000
0x1280 = 0x00010000
0x1104 = 0x00201001
0x1184 = 0x00201001
0x1204 = 0x00201001
0x1284 = 0x00201001
0x1108 = 0x00080001
0x1188 = 0x00080001
0x1208 = 0x00080001
0x1288 = 0x00080001
0x1110 = 0x00000000
0x1190 = 0x00000000
0x1210 = 0x00000000
0x1290 = 0x00000000
0x1114 = 0xFC044000
0x1194 = 0xFC045000
0x1214 = 0xFC046000
0x1294 = 0xFC047000
0x1118 = 0x00000000
0x1198 = 0x00000000
0x1218 = 0x00000000
0x1298 = 0x00000000
0x111C = 0xFC040000
0x119c = 0xFC041000
0x121c = 0xFC042000
0x129c = 0xFC043000
0x1120 = 0xFC044400
0x11A0 = 0xFC045400
0x1220 = 0xFC046400
0x12A0 = 0xFC047400
0x1128 = 0xFC040400
0x11A8 = 0xFC041400
0x1228 = 0xFC042400
0x12A8 = 0xFC043400
0x112c = 0x0000003F
0x11ac = 0x0000003F
0x122c = 0x0000003F
0x12ac = 0x0000003F
0x1130 = 0x0000003F
0x11b0 = 0x0000003F
0x1230 = 0x0000003F
0x12b0 = 0x0000003F
^ permalink raw reply
* RE: [PATCH 5/5] MAINTAINERS: add myself as maintainer for QorIQ PTP clock driver
From: Y.b. Lu @ 2018-05-25 4:47 UTC (permalink / raw)
To: Y.b. Lu, netdev@vger.kernel.org, devicetree@vger.kernel.org,
linux-kernel@vger.kernel.org, Richard Cochran, Claudiu Manoil,
Rob Herring
In-Reply-To: <20180525044038.37756-5-yangbo.lu@nxp.com>
This patch depends on a commit that is currently in the staging git tree:
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/commit/?h=staging-next&id=7fd899fff5907dbb02089494102ef628988f2330
> -----Original Message-----
> From: Yangbo Lu [mailto:yangbo.lu@nxp.com]
> Sent: Friday, May 25, 2018 12:41 PM
> To: netdev@vger.kernel.org; devicetree@vger.kernel.org;
> linux-kernel@vger.kernel.org; Richard Cochran <richardcochran@gmail.com>;
> Claudiu Manoil <claudiu.manoil@nxp.com>; Rob Herring <robh+dt@kernel.org>
> Cc: Y.b. Lu <yangbo.lu@nxp.com>
> Subject: [PATCH 5/5] MAINTAINERS: add myself as maintainer for QorIQ PTP
> clock driver
>
> Added myself as maintainer for QorIQ PTP clock driver.
> Since gianfar_ptp.c was renamed to ptp_qoriq.c, let's also maintain it under
> QorIQ PTP clock driver.
>
> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> ---
> MAINTAINERS | 17 +++++++++--------
> 1 files changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4b65225..a71d4fa 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4411,12 +4411,6 @@ L: linux-kernel@vger.kernel.org
> S: Maintained
> F: drivers/staging/fsl-dpaa2/ethsw
>
> -DPAA2 PTP CLOCK DRIVER
> -M: Yangbo Lu <yangbo.lu@nxp.com>
> -L: linux-kernel@vger.kernel.org
> -S: Maintained
> -F: drivers/staging/fsl-dpaa2/rtc
> -
> DPT_I2O SCSI RAID DRIVER
> M: Adaptec OEM Raid Solutions <aacraid@microsemi.com>
> L: linux-scsi@vger.kernel.org
> @@ -5648,7 +5642,6 @@ M: Claudiu Manoil <claudiu.manoil@nxp.com>
> L: netdev@vger.kernel.org
> S: Maintained
> F: drivers/net/ethernet/freescale/gianfar*
> -X: drivers/net/ethernet/freescale/gianfar_ptp.c
> F: Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
>
> FREESCALE GPMI NAND DRIVER
> @@ -5695,6 +5688,15 @@ S: Maintained
> F: drivers/net/ethernet/freescale/fman
> F: Documentation/devicetree/bindings/powerpc/fsl/fman.txt
>
> +FREESCALE QORIQ PTP CLOCK DRIVER
> +M: Yangbo Lu <yangbo.lu@nxp.com>
> +L: linux-kernel@vger.kernel.org
> +S: Maintained
> +F: drivers/staging/fsl-dpaa2/rtc
> +F: drivers/ptp/ptp_qoriq.c
> +F: include/linux/fsl/ptp_qoriq.h
> +F: Documentation/devicetree/bindings/ptp/ptp-qoriq.txt
> +
> FREESCALE QUAD SPI DRIVER
> M: Han Xu <han.xu@nxp.com>
> L: linux-mtd@lists.infradead.org
> @@ -11429,7 +11431,6 @@ S: Maintained
> W: http://linuxptp.sourceforge.net/
> F: Documentation/ABI/testing/sysfs-ptp
> F: Documentation/ptp/*
> -F: drivers/net/ethernet/freescale/gianfar_ptp.c
> F: drivers/net/phy/dp83640*
> F: drivers/ptp/*
> F: include/linux/ptp_cl*
> --
> 1.7.1
^ permalink raw reply
* Re: [PATCH v2 bpf-next 1/5] bpf: Hooks for sys_sendmsg
From: Andrey Ignatov @ 2018-05-25 4:56 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: netdev, davem, kafai, ast, kernel-team
In-Reply-To: <b1d94917-d362-f872-a3f7-09d3bc770543@iogearbox.net>
Daniel Borkmann <daniel@iogearbox.net> [Thu, 2018-05-24 18:00 -0700]:
> On 05/23/2018 01:40 AM, Andrey Ignatov wrote:
> [...]
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index ff4d4ba..a1f9ba2 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -900,6 +900,7 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> > {
> > struct inet_sock *inet = inet_sk(sk);
> > struct udp_sock *up = udp_sk(sk);
> > + DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name);
> > struct flowi4 fl4_stack;
> > struct flowi4 *fl4;
> > int ulen = len;
> > @@ -954,8 +955,7 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> > /*
> > * Get and verify the address.
> > */
> > - if (msg->msg_name) {
> > - DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name);
> > + if (usin) {
> > if (msg->msg_namelen < sizeof(*usin))
> > return -EINVAL;
> > if (usin->sin_family != AF_INET) {
> > @@ -1009,6 +1009,22 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> > rcu_read_unlock();
> > }
> >
> > + if (!connected) {
> > + err = BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK(sk,
> > + (struct sockaddr *)usin, &ipc.addr);
> > + if (err)
> > + goto out_free;
> > + if (usin) {
> > + if (usin->sin_port == 0) {
> > + /* BPF program set invalid port. Reject it. */
> > + err = -EINVAL;
> > + goto out_free;
> > + }
> > + daddr = usin->sin_addr.s_addr;
> > + dport = usin->sin_port;
> > + }
> > + }
> > +
> > saddr = ipc.addr;
> > ipc.addr = faddr = daddr;
> >
> > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > index 2839c1b..67c44b5 100644
> > --- a/net/ipv6/udp.c
> > +++ b/net/ipv6/udp.c
> > @@ -1315,6 +1315,29 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> > fl6.saddr = np->saddr;
> > fl6.fl6_sport = inet->inet_sport;
> >
> > + if (!connected) {
> > + err = BPF_CGROUP_RUN_PROG_UDP6_SENDMSG_LOCK(sk,
> > + (struct sockaddr *)sin6, &fl6.saddr);
> > + if (err)
> > + goto out_no_dst;
> > + if (sin6) {
> > + if (ipv6_addr_v4mapped(&sin6->sin6_addr)) {
> > + /* BPF program rewrote IPv6-only by IPv4-mapped
> > + * IPv6. It's currently unsupported.
> > + */
> > + err = -ENOTSUPP;
> > + goto out_no_dst;
> > + }
> > + if (sin6->sin6_port == 0) {
> > + /* BPF program set invalid port. Reject it. */
> > + err = -EINVAL;
> > + goto out_no_dst;
> > + }
> > + fl6.fl6_dport = sin6->sin6_port;
> > + fl6.daddr = sin6->sin6_addr;
> > + }
>
> Hmm, this extra work here and in v4 case should probably all be done under
> the static key? Otherwise we'll do the extra work for checking sin6 and
> setting up fl6 twice?
Hm, true. We can put this whole block under the static key (the main
one, since there are no others, but we can follow up separately):
if (cgroup_bpf_enabled && !connected) {
I'll send v3 with this change for both ipv6 and ipv4. Thanks.
As for the logic inside the `if`, I'll describe it just in case, since
some things may not be obvious.
There are two cases earlier in this function that can lead to
`connected = false`: the user specifies a destination address (the 1st
`if (sin6)`) and/or the user specifies ancillary data
(`if (msg->msg_controllen)`).
Ancillary data can contain an option to set the source IP. So to simplify: if
the user specifies source or destination, we're in unconnected mode.
Now imagine that we have a connected socket and the user calls sendmsg
without setting a destination (sin6 = NULL), but sets the source IP in
ancillary data at the same time. This causes `connected = false` and the
BPF prog will be run (it can e.g. override the source IP set by the user),
but we have no sin6; that's why `if (sin6)` appears here a second time.
On the other hand, if sin6 is passed by the user, it causes unconnected
mode as well, and the BPF prog has a chance to override the IP and port in
sin6. In this case we have to update fl6 after the BPF prog finishes. That's
why `fl6.daddr = sin6->sin6_addr;` appears a second time.
But I agree that work should be avoided when cgroup-bpf is disabled.
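To make the cases above concrete, here is a simplified model of the control flow. It is only a sketch under stated assumptions: the struct and function names are invented for illustration and do not exist in net/ipv6/udp.c, and it assumes a socket that has already been connect()ed, so a missing destination is not an error by itself.

```c
#include <stdbool.h>

/* Illustrative model only; none of these names exist in the kernel. */
struct sendmsg_model {
	bool has_dest;	/* user passed msg_name, i.e. sin6 != NULL */
	bool has_cmsg;	/* user passed ancillary data, msg_controllen != 0 */
};

/* A destination or ancillary data puts the call on the unconnected path. */
static bool model_connected(const struct sendmsg_model *m)
{
	return !m->has_dest && !m->has_cmsg;
}

/* With the static key check, the hook runs only if cgroup-bpf is enabled. */
static bool model_runs_hook(const struct sendmsg_model *m,
			    bool cgroup_bpf_enabled)
{
	return cgroup_bpf_enabled && !model_connected(m);
}

/* The flow destination is refreshed from sin6 only if sin6 was passed. */
static bool model_updates_dest(const struct sendmsg_model *m,
			       bool cgroup_bpf_enabled)
{
	return model_runs_hook(m, cgroup_bpf_enabled) && m->has_dest;
}
```

In this model a call with only ancillary data runs the hook but skips the destination update, which is the reason for the second `if (sin6)` check.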
> Also, when not enabled, couldn't we run into the case
> of ipv6_addr_v4mapped() as well? If I'm spotting this right, then we would
> bail out though we shouldn't normally?
The IPv4-mapped IPv6 case is handled earlier in this function: if the user
passed an IPv4-mapped IPv6 address, we don't get this far and call IPv4
udp_sendmsg() much earlier.
Same is true for port.
That's why this code wouldn't affect the logic for IPv4-mapped IPv6, but
again, you're right that we shouldn't do this extra work when cgroup-bpf
is disabled and I'll fix it.
>
> > + }
> > +
> > final_p = fl6_update_dst(&fl6, opt, &final);
> > if (final_p)
> > connected = false;
> > @@ -1394,6 +1417,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> >
> > out:
> > dst_release(dst);
> > +out_no_dst:
> > fl6_sock_release(flowlabel);
> > txopt_put(opt_to_free);
> > if (!err)
> >
>
--
Andrey Ignatov
^ permalink raw reply
* Re: [PATCH 4/4] cpsw: add switchdev support
From: Ilias Apalodimas @ 2018-05-25 4:56 UTC (permalink / raw)
To: Andrew Lunn
Cc: netdev, grygorii.strashko, ivan.khoronzhuk, nsekhar, jiri,
ivecera, francois.ozog, yogeshs, spatton
In-Reply-To: <20180524163904.GH5128@lunn.ch>
On Thu, May 24, 2018 at 06:39:04PM +0200, Andrew Lunn wrote:
> On Thu, May 24, 2018 at 04:32:34PM +0300, Ilias Apalodimas wrote:
> > On Thu, May 24, 2018 at 03:12:29PM +0200, Andrew Lunn wrote:
> > > Device tree is supposed to describe the hardware. Using that hardware
> > > in different ways is not something you should describe in DT.
> > >
> > The new switchdev mode is applied with a .config option in the kernel. What you
> > see is pre-existing code, so i am not sure if i should change it in this
> > patchset.
>
> If you break the code up into a library and two drivers, it becomes a
> moot point.
Agree
>
> But what i don't like here is that the device tree says to do dual
> mac. But you ignore that and do sometime else. I would prefer that if
> DT says dual mac, and switchdev is compiled in, the probe fails with
> EINVAL. Rather than ignore something, make it clear it is invalid.
The switch has 3 modes of operation as is.
1. switch mode: to enable it you don't need to add anything to
the DTS, and Linux registers a single netdev interface.
2. dual mac mode: this is when you need to add dual_emac; to the DTS.
3. switchdev mode, which is controlled by a .config option since, as you
pointed out, DTS was not made for controlling config options.
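The two policies under discussion can be sketched as follows. This is a hypothetical illustration, not code from the cpsw driver: the enum and function names are invented here, and the "current" variant only encodes the behaviour described in this thread (the .config option takes precedence and dual_emac is ignored).

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative names only; they are not taken from the cpsw driver. */
enum cpsw_mode_sketch { MODE_SWITCH, MODE_DUAL_MAC, MODE_SWITCHDEV };

/* Behaviour as described in the thread: switchdev config overrides the DT. */
static enum cpsw_mode_sketch pick_mode_current(bool dt_dual_emac,
					       bool cfg_switchdev)
{
	if (cfg_switchdev)
		return MODE_SWITCHDEV;
	return dt_dual_emac ? MODE_DUAL_MAC : MODE_SWITCH;
}

/* Suggested alternative: refuse the conflicting combination with -EINVAL. */
static int pick_mode_strict(bool dt_dual_emac, bool cfg_switchdev,
			    enum cpsw_mode_sketch *mode)
{
	if (dt_dual_emac && cfg_switchdev)
		return -EINVAL;
	*mode = pick_mode_current(dt_dual_emac, cfg_switchdev);
	return 0;
}
```

The strict variant makes the conflict visible at probe time, at the cost of the device not coming up at all for that combination.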
I agree that this is far from beautiful. If the driver remains as is though,
I'd prefer either keeping what's there or making "switchdev" a DTS option,
following the pre-existing erroneous usage, rather than making the device
unusable. If we end up returning an error and refusing to initialize, users
who remotely upgrade their equipment without taking a good look at the changelog
will lose access to their devices with no means of remotely fixing that.
Regards
Ilias
^ permalink raw reply
* [PATCH v3 bpf-next 0/5] bpf: Hooks for sys_sendmsg
From: Andrey Ignatov @ 2018-05-25 5:09 UTC (permalink / raw)
To: netdev; +Cc: Andrey Ignatov, davem, kafai, ast, daniel, kernel-team
v2 -> v3:
* place BPF logic under static key in udp_sendmsg, udpv6_sendmsg;
* rebase.
v1 -> v2:
* return ENOTSUPP if bpf_prog rewrote IPv6-only with IPv4-mapped IPv6;
* add test for IPv4-mapped IPv6 use-case;
* fix build for CONFIG_CGROUP_BPF=n;
* rebase.
This patch set adds BPF hooks for sys_sendmsg similar to existing hooks for
sys_bind and sys_connect.
The hooks allow overriding the source IP (including the case when it's set via
cmsg(3)) and destination IP:port for unconnected UDP (slow path). TCP and
connected UDP (fast path) are not affected. This makes UDP support
complete: connected UDP is handled by sys_connect hooks, unconnected by
sys_sendmsg ones.
Similar to sys_connect hooks, sys_sendmsg ones can be used to make system
calls such as sendmsg(2) and sendto(2) return EPERM.
Please see patch 0001 for more details.
Andrey Ignatov (5):
bpf: Hooks for sys_sendmsg
bpf: Sync bpf.h to tools/
libbpf: Support guessing sendmsg{4,6} progs
selftests/bpf: Prepare test_sock_addr for extension
selftests/bpf: Selftest for sys_sendmsg hooks
include/linux/bpf-cgroup.h | 23 +-
include/linux/filter.h | 1 +
include/uapi/linux/bpf.h | 8 +
kernel/bpf/cgroup.c | 11 +-
kernel/bpf/syscall.c | 8 +
net/core/filter.c | 39 +
net/ipv4/udp.c | 20 +-
net/ipv6/udp.c | 24 +
tools/include/uapi/linux/bpf.h | 8 +
tools/lib/bpf/libbpf.c | 2 +
tools/testing/selftests/bpf/Makefile | 2 +-
tools/testing/selftests/bpf/sendmsg4_prog.c | 49 ++
tools/testing/selftests/bpf/sendmsg6_prog.c | 60 ++
tools/testing/selftests/bpf/test_sock_addr.c | 1155 +++++++++++++++++++++-----
14 files changed, 1214 insertions(+), 196 deletions(-)
create mode 100644 tools/testing/selftests/bpf/sendmsg4_prog.c
create mode 100644 tools/testing/selftests/bpf/sendmsg6_prog.c
--
2.9.5
^ permalink raw reply
* [PATCH v3 bpf-next 1/5] bpf: Hooks for sys_sendmsg
From: Andrey Ignatov @ 2018-05-25 5:09 UTC (permalink / raw)
To: netdev; +Cc: Andrey Ignatov, davem, kafai, ast, daniel, kernel-team
In-Reply-To: <cover.1527224903.git.rdna@fb.com>
In addition to already existing BPF hooks for sys_bind and sys_connect,
the patch provides new hooks for sys_sendmsg.
It leverages existing BPF program type `BPF_PROG_TYPE_CGROUP_SOCK_ADDR`
that provides access to the socket itself (properties like family, type,
protocol) and user-passed `struct sockaddr *` so that BPF program can
override destination IP and port for system calls such as sendto(2) or
sendmsg(2) and/or assign source IP to the socket.
The hooks are implemented as two new attach types:
`BPF_CGROUP_UDP4_SENDMSG` and `BPF_CGROUP_UDP6_SENDMSG` for UDPv4 and
UDPv6 respectively.
UDPv4 and UDPv6 use separate attach types for the same reason as the sys_bind
and sys_connect hooks, i.e. to prevent reading from / writing to e.g.
user_ip6 fields when the user passes sockaddr_in, since it'd be out-of-bounds.
The difference from the already existing hooks is that the sys_sendmsg hooks
are implemented only for unconnected UDP.
For TCP it doesn't make sense to change user-provided `struct sockaddr *`
at sendto(2)/sendmsg(2) time since socket either was already connected
and has source/destination set or wasn't connected and call to
sendto(2)/sendmsg(2) would lead to ENOTCONN anyway.
Connected UDP is already handled by sys_connect hooks that can override
source/destination at connect time and use fast-path later, i.e. these
hooks don't affect UDP fast-path.
Rewriting source IP is implemented differently than that in sys_connect
hooks. When sys_sendmsg is used with unconnected UDP it doesn't work to
just bind socket to desired local IP address since source IP can be set
on per-packet basis by using ancillary data (cmsg(3)). So no matter if
socket is bound or not, source IP has to be rewritten on every call to
sys_sendmsg.
To do so, two new fields are added to UAPI `struct bpf_sock_addr`:
* `msg_src_ip4` to set source IPv4 for UDPv4;
* `msg_src_ip6` to set source IPv6 for UDPv6.
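For context, the per-packet source address mentioned above is what userspace sets through an IP_PKTINFO control message. The following is a minimal userspace sketch for the IPv4 case, not part of the patch; the helper name set_pktinfo_src is made up for illustration, and the caller is assumed to supply a suitably aligned buffer.

```c
#define _GNU_SOURCE		/* for struct in_pktinfo on glibc */
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: attach one IP_PKTINFO control message to msg so
 * that sendmsg(2) uses src as the source address for this packet only.
 * Returns 0 on success, -1 if buf is too small.
 */
static int set_pktinfo_src(struct msghdr *msg, void *buf, size_t buflen,
			   struct in_addr src)
{
	struct in_pktinfo pi;
	struct cmsghdr *cmsg;

	if (buflen < CMSG_SPACE(sizeof(pi)))
		return -1;

	memset(buf, 0, buflen);
	msg->msg_control = buf;
	msg->msg_controllen = CMSG_SPACE(sizeof(pi));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = IPPROTO_IP;
	cmsg->cmsg_type = IP_PKTINFO;
	cmsg->cmsg_len = CMSG_LEN(sizeof(pi));

	/* ipi_spec_dst carries the local (source) address when sending. */
	memset(&pi, 0, sizeof(pi));
	pi.ipi_spec_dst = src;
	memcpy(CMSG_DATA(cmsg), &pi, sizeof(pi));
	return 0;
}
```

After this call the caller fills in msg_name and msg_iov as usual and passes the msghdr to sendmsg(2). Because the source can change on every call like this, binding the socket once is not enough for the hook.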
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
---
include/linux/bpf-cgroup.h | 23 +++++++++++++++++------
include/linux/filter.h | 1 +
include/uapi/linux/bpf.h | 8 ++++++++
kernel/bpf/cgroup.c | 11 ++++++++++-
kernel/bpf/syscall.c | 8 ++++++++
net/core/filter.c | 39 +++++++++++++++++++++++++++++++++++++++
net/ipv4/udp.c | 20 ++++++++++++++++++--
net/ipv6/udp.c | 24 ++++++++++++++++++++++++
8 files changed, 125 insertions(+), 9 deletions(-)
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 30d15e6..29f8085 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -66,7 +66,8 @@ int __cgroup_bpf_run_filter_sk(struct sock *sk,
int __cgroup_bpf_run_filter_sock_addr(struct sock *sk,
struct sockaddr *uaddr,
- enum bpf_attach_type type);
+ enum bpf_attach_type type,
+ void *t_ctx);
int __cgroup_bpf_run_filter_sock_ops(struct sock *sk,
struct bpf_sock_ops_kern *sock_ops,
@@ -120,16 +121,18 @@ int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor,
({ \
int __ret = 0; \
if (cgroup_bpf_enabled) \
- __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, type); \
+ __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, type, \
+ NULL); \
__ret; \
})
-#define BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, type) \
+#define BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, type, t_ctx) \
({ \
int __ret = 0; \
if (cgroup_bpf_enabled) { \
lock_sock(sk); \
- __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, type); \
+ __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, type, \
+ t_ctx); \
release_sock(sk); \
} \
__ret; \
@@ -151,10 +154,16 @@ int __cgroup_bpf_check_dev_permission(short dev_type, u32 major, u32 minor,
BPF_CGROUP_RUN_SA_PROG(sk, uaddr, BPF_CGROUP_INET6_CONNECT)
#define BPF_CGROUP_RUN_PROG_INET4_CONNECT_LOCK(sk, uaddr) \
- BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_INET4_CONNECT)
+ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_INET4_CONNECT, NULL)
#define BPF_CGROUP_RUN_PROG_INET6_CONNECT_LOCK(sk, uaddr) \
- BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_INET6_CONNECT)
+ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_INET6_CONNECT, NULL)
+
+#define BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK(sk, uaddr, t_ctx) \
+ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_UDP4_SENDMSG, t_ctx)
+
+#define BPF_CGROUP_RUN_PROG_UDP6_SENDMSG_LOCK(sk, uaddr, t_ctx) \
+ BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_UDP6_SENDMSG, t_ctx)
#define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops) \
({ \
@@ -197,6 +206,8 @@ static inline int cgroup_bpf_inherit(struct cgroup *cgrp) { return 0; }
#define BPF_CGROUP_RUN_PROG_INET4_CONNECT_LOCK(sk, uaddr) ({ 0; })
#define BPF_CGROUP_RUN_PROG_INET6_CONNECT(sk, uaddr) ({ 0; })
#define BPF_CGROUP_RUN_PROG_INET6_CONNECT_LOCK(sk, uaddr) ({ 0; })
+#define BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK(sk, uaddr, t_ctx) ({ 0; })
+#define BPF_CGROUP_RUN_PROG_UDP6_SENDMSG_LOCK(sk, uaddr, t_ctx) ({ 0; })
#define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops) ({ 0; })
#define BPF_CGROUP_RUN_PROG_DEVICE_CGROUP(type,major,minor,access) ({ 0; })
diff --git a/include/linux/filter.h b/include/linux/filter.h
index d358d18..d90abda 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1010,6 +1010,7 @@ struct bpf_sock_addr_kern {
* only two (src and dst) are available at convert_ctx_access time
*/
u64 tmp_reg;
+ void *t_ctx; /* Attach type specific context. */
};
struct bpf_sock_ops_kern {
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 9b8c6e3..cc68787 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -160,6 +160,8 @@ enum bpf_attach_type {
BPF_CGROUP_INET6_CONNECT,
BPF_CGROUP_INET4_POST_BIND,
BPF_CGROUP_INET6_POST_BIND,
+ BPF_CGROUP_UDP4_SENDMSG,
+ BPF_CGROUP_UDP6_SENDMSG,
__MAX_BPF_ATTACH_TYPE
};
@@ -2363,6 +2365,12 @@ struct bpf_sock_addr {
__u32 family; /* Allows 4-byte read, but no write */
__u32 type; /* Allows 4-byte read, but no write */
__u32 protocol; /* Allows 4-byte read, but no write */
+ __u32 msg_src_ip4; /* Allows 1,2,4-byte read and 4-byte write.
+ * Stored in network byte order.
+ */
+ __u32 msg_src_ip6[4]; /* Allows 1,2,4-byte read and 4-byte write.
+ * Stored in network byte order.
+ */
};
/* User bpf_sock_ops struct to access socket values and specify request ops
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 43171a0..f7c00bd 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -500,6 +500,7 @@ EXPORT_SYMBOL(__cgroup_bpf_run_filter_sk);
* @sk: sock struct that will use sockaddr
* @uaddr: sockaddr struct provided by user
* @type: The type of program to be exectuted
+ * @t_ctx: Pointer to attach type specific context
*
* socket is expected to be of type INET or INET6.
*
@@ -508,12 +509,15 @@ EXPORT_SYMBOL(__cgroup_bpf_run_filter_sk);
*/
int __cgroup_bpf_run_filter_sock_addr(struct sock *sk,
struct sockaddr *uaddr,
- enum bpf_attach_type type)
+ enum bpf_attach_type type,
+ void *t_ctx)
{
struct bpf_sock_addr_kern ctx = {
.sk = sk,
.uaddr = uaddr,
+ .t_ctx = t_ctx,
};
+ struct sockaddr_storage unspec;
struct cgroup *cgrp;
int ret;
@@ -523,6 +527,11 @@ int __cgroup_bpf_run_filter_sock_addr(struct sock *sk,
if (sk->sk_family != AF_INET && sk->sk_family != AF_INET6)
return 0;
+ if (!ctx.uaddr) {
+ memset(&unspec, 0, sizeof(unspec));
+ ctx.uaddr = (struct sockaddr *)&unspec;
+ }
+
cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
ret = BPF_PROG_RUN_ARRAY(cgrp->bpf.effective[type], &ctx, BPF_PROG_RUN);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 388d4fe..e254526 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1249,6 +1249,8 @@ bpf_prog_load_check_attach_type(enum bpf_prog_type prog_type,
case BPF_CGROUP_INET6_BIND:
case BPF_CGROUP_INET4_CONNECT:
case BPF_CGROUP_INET6_CONNECT:
+ case BPF_CGROUP_UDP4_SENDMSG:
+ case BPF_CGROUP_UDP6_SENDMSG:
return 0;
default:
return -EINVAL;
@@ -1565,6 +1567,8 @@ static int bpf_prog_attach(const union bpf_attr *attr)
case BPF_CGROUP_INET6_BIND:
case BPF_CGROUP_INET4_CONNECT:
case BPF_CGROUP_INET6_CONNECT:
+ case BPF_CGROUP_UDP4_SENDMSG:
+ case BPF_CGROUP_UDP6_SENDMSG:
ptype = BPF_PROG_TYPE_CGROUP_SOCK_ADDR;
break;
case BPF_CGROUP_SOCK_OPS:
@@ -1635,6 +1639,8 @@ static int bpf_prog_detach(const union bpf_attr *attr)
case BPF_CGROUP_INET6_BIND:
case BPF_CGROUP_INET4_CONNECT:
case BPF_CGROUP_INET6_CONNECT:
+ case BPF_CGROUP_UDP4_SENDMSG:
+ case BPF_CGROUP_UDP6_SENDMSG:
ptype = BPF_PROG_TYPE_CGROUP_SOCK_ADDR;
break;
case BPF_CGROUP_SOCK_OPS:
@@ -1692,6 +1698,8 @@ static int bpf_prog_query(const union bpf_attr *attr,
case BPF_CGROUP_INET6_POST_BIND:
case BPF_CGROUP_INET4_CONNECT:
case BPF_CGROUP_INET6_CONNECT:
+ case BPF_CGROUP_UDP4_SENDMSG:
+ case BPF_CGROUP_UDP6_SENDMSG:
case BPF_CGROUP_SOCK_OPS:
case BPF_CGROUP_DEVICE:
break;
diff --git a/net/core/filter.c b/net/core/filter.c
index acf1f4f..24e6ce8 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5299,6 +5299,7 @@ static bool sock_addr_is_valid_access(int off, int size,
switch (prog->expected_attach_type) {
case BPF_CGROUP_INET4_BIND:
case BPF_CGROUP_INET4_CONNECT:
+ case BPF_CGROUP_UDP4_SENDMSG:
break;
default:
return false;
@@ -5308,6 +5309,24 @@ static bool sock_addr_is_valid_access(int off, int size,
switch (prog->expected_attach_type) {
case BPF_CGROUP_INET6_BIND:
case BPF_CGROUP_INET6_CONNECT:
+ case BPF_CGROUP_UDP6_SENDMSG:
+ break;
+ default:
+ return false;
+ }
+ break;
+ case bpf_ctx_range(struct bpf_sock_addr, msg_src_ip4):
+ switch (prog->expected_attach_type) {
+ case BPF_CGROUP_UDP4_SENDMSG:
+ break;
+ default:
+ return false;
+ }
+ break;
+ case bpf_ctx_range_till(struct bpf_sock_addr, msg_src_ip6[0],
+ msg_src_ip6[3]):
+ switch (prog->expected_attach_type) {
+ case BPF_CGROUP_UDP6_SENDMSG:
break;
default:
return false;
@@ -5318,6 +5337,9 @@ static bool sock_addr_is_valid_access(int off, int size,
switch (off) {
case bpf_ctx_range(struct bpf_sock_addr, user_ip4):
case bpf_ctx_range_till(struct bpf_sock_addr, user_ip6[0], user_ip6[3]):
+ case bpf_ctx_range(struct bpf_sock_addr, msg_src_ip4):
+ case bpf_ctx_range_till(struct bpf_sock_addr, msg_src_ip6[0],
+ msg_src_ip6[3]):
/* Only narrow read access allowed for now. */
if (type == BPF_READ) {
bpf_ctx_record_field_size(info, size_default);
@@ -6072,6 +6094,23 @@ static u32 sock_addr_convert_ctx_access(enum bpf_access_type type,
*insn++ = BPF_ALU32_IMM(BPF_RSH, si->dst_reg,
SK_FL_PROTO_SHIFT);
break;
+
+ case offsetof(struct bpf_sock_addr, msg_src_ip4):
+ /* Treat t_ctx as struct in_addr for msg_src_ip4. */
+ SOCK_ADDR_LOAD_OR_STORE_NESTED_FIELD_SIZE_OFF(
+ struct bpf_sock_addr_kern, struct in_addr, t_ctx,
+ s_addr, BPF_SIZE(si->code), 0, tmp_reg);
+ break;
+
+ case bpf_ctx_range_till(struct bpf_sock_addr, msg_src_ip6[0],
+ msg_src_ip6[3]):
+ off = si->off;
+ off -= offsetof(struct bpf_sock_addr, msg_src_ip6[0]);
+ /* Treat t_ctx as struct in6_addr for msg_src_ip6. */
+ SOCK_ADDR_LOAD_OR_STORE_NESTED_FIELD_SIZE_OFF(
+ struct bpf_sock_addr_kern, struct in6_addr, t_ctx,
+ s6_addr32[0], BPF_SIZE(si->code), off, tmp_reg);
+ break;
}
return insn - insn_buf;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index d71f1f3..3c27d00 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -901,6 +901,7 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
{
struct inet_sock *inet = inet_sk(sk);
struct udp_sock *up = udp_sk(sk);
+ DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name);
struct flowi4 fl4_stack;
struct flowi4 *fl4;
int ulen = len;
@@ -955,8 +956,7 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
/*
* Get and verify the address.
*/
- if (msg->msg_name) {
- DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name);
+ if (usin) {
if (msg->msg_namelen < sizeof(*usin))
return -EINVAL;
if (usin->sin_family != AF_INET) {
@@ -1010,6 +1010,22 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
rcu_read_unlock();
}
+ if (cgroup_bpf_enabled && !connected) {
+ err = BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK(sk,
+ (struct sockaddr *)usin, &ipc.addr);
+ if (err)
+ goto out_free;
+ if (usin) {
+ if (usin->sin_port == 0) {
+ /* BPF program set invalid port. Reject it. */
+ err = -EINVAL;
+ goto out_free;
+ }
+ daddr = usin->sin_addr.s_addr;
+ dport = usin->sin_port;
+ }
+ }
+
saddr = ipc.addr;
ipc.addr = faddr = daddr;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 426c9d2..9f729a7 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1316,6 +1316,29 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
fl6.saddr = np->saddr;
fl6.fl6_sport = inet->inet_sport;
+ if (cgroup_bpf_enabled && !connected) {
+ err = BPF_CGROUP_RUN_PROG_UDP6_SENDMSG_LOCK(sk,
+ (struct sockaddr *)sin6, &fl6.saddr);
+ if (err)
+ goto out_no_dst;
+ if (sin6) {
+ if (ipv6_addr_v4mapped(&sin6->sin6_addr)) {
+ /* BPF program rewrote IPv6-only by IPv4-mapped
+ * IPv6. It's currently unsupported.
+ */
+ err = -ENOTSUPP;
+ goto out_no_dst;
+ }
+ if (sin6->sin6_port == 0) {
+ /* BPF program set invalid port. Reject it. */
+ err = -EINVAL;
+ goto out_no_dst;
+ }
+ fl6.fl6_dport = sin6->sin6_port;
+ fl6.daddr = sin6->sin6_addr;
+ }
+ }
+
final_p = fl6_update_dst(&fl6, opt, &final);
if (final_p)
connected = false;
@@ -1395,6 +1418,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
out:
dst_release(dst);
+out_no_dst:
fl6_sock_release(flowlabel);
txopt_put(opt_to_free);
if (!err)
--
2.9.5
* [PATCH v3 bpf-next 4/5] selftests/bpf: Prepare test_sock_addr for extension
From: Andrey Ignatov @ 2018-05-25 5:09 UTC (permalink / raw)
To: netdev; +Cc: Andrey Ignatov, davem, kafai, ast, daniel, kernel-team
In-Reply-To: <cover.1527224903.git.rdna@fb.com>
test_sock_addr was hard to extend because it was tightly focused on
sys_bind and sys_connect.
Reorganize it so that new test-cases for
`BPF_PROG_TYPE_CGROUP_SOCK_ADDR` are easier to add:
- decouple test-cases so that only one BPF prog is tested at a time;
- check programmatically that local IP:port for sys_bind, source IP and
destination IP:port for sys_connect are rewritten properly by the tested
BPF programs.
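The programmatic check described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from the patch: `make_addr` and `addrs_differ` are hypothetical names, and the sketch assumes that comparing the address family, the IP and (optionally) the port is sufficient.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *ss with an IPv4 or IPv6 address and port.
 * Returns 0 on success, -1 on an unsupported family or unparsable IP.
 */
static int make_addr(int family, const char *ip, unsigned short port,
		     struct sockaddr_storage *ss)
{
	assert(ip && ss);
	memset(ss, 0, sizeof(*ss));

	if (family == AF_INET) {
		struct sockaddr_in *a4 = (struct sockaddr_in *)ss;

		a4->sin_family = AF_INET;
		a4->sin_port = htons(port);
		return inet_pton(AF_INET, ip, &a4->sin_addr) == 1 ? 0 : -1;
	}
	if (family == AF_INET6) {
		struct sockaddr_in6 *a6 = (struct sockaddr_in6 *)ss;

		a6->sin6_family = AF_INET6;
		a6->sin6_port = htons(port);
		return inet_pton(AF_INET6, ip, &a6->sin6_addr) == 1 ? 0 : -1;
	}
	return -1;
}

/* Hypothetical helper: returns 0 if both addresses match, 1 otherwise.
 * The port is compared only when cmp_port is non-zero, which is useful
 * when only the source IP is expected to be rewritten.
 */
static int addrs_differ(const struct sockaddr_storage *a,
			const struct sockaddr_storage *b, int cmp_port)
{
	if (a->ss_family != b->ss_family)
		return 1;

	if (a->ss_family == AF_INET) {
		const struct sockaddr_in *x = (const struct sockaddr_in *)a;
		const struct sockaddr_in *y = (const struct sockaddr_in *)b;

		if (cmp_port && x->sin_port != y->sin_port)
			return 1;
		return x->sin_addr.s_addr != y->sin_addr.s_addr;
	}
	if (a->ss_family == AF_INET6) {
		const struct sockaddr_in6 *x = (const struct sockaddr_in6 *)a;
		const struct sockaddr_in6 *y = (const struct sockaddr_in6 *)b;

		if (cmp_port && x->sin6_port != y->sin6_port)
			return 1;
		return memcmp(&x->sin6_addr, &y->sin6_addr,
			      sizeof(x->sin6_addr)) != 0;
	}
	return 1; /* unknown family: treat as a mismatch */
}
```

In a real test one side of the comparison would come from getsockname(2) or getpeername(2) on the socket under test, and the other side from the expected IP:port of the test case.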
The output of new version:
# test_sock_addr.sh 2>/dev/null
Wait for testing IPv4/IPv6 to become available ... OK
Test case: bind4: load prog with wrong expected attach type .. [PASS]
Test case: bind4: attach prog with wrong attach type .. [PASS]
Test case: bind4: rewrite IP & TCP port in .. [PASS]
Test case: bind4: rewrite IP & UDP port in .. [PASS]
Test case: bind6: load prog with wrong expected attach type .. [PASS]
Test case: bind6: attach prog with wrong attach type .. [PASS]
Test case: bind6: rewrite IP & TCP port in .. [PASS]
Test case: bind6: rewrite IP & UDP port in .. [PASS]
Test case: connect4: load prog with wrong expected attach type .. [PASS]
Test case: connect4: attach prog with wrong attach type .. [PASS]
Test case: connect4: rewrite IP & TCP port .. [PASS]
Test case: connect4: rewrite IP & UDP port .. [PASS]
Test case: connect6: load prog with wrong expected attach type .. [PASS]
Test case: connect6: attach prog with wrong attach type .. [PASS]
Test case: connect6: rewrite IP & TCP port .. [PASS]
Test case: connect6: rewrite IP & UDP port .. [PASS]
Summary: 16 PASSED, 0 FAILED
(stderr contains errors from libbpf when testing load/attach with
invalid arguments)
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
---
tools/testing/selftests/bpf/test_sock_addr.c | 655 +++++++++++++++++++--------
1 file changed, 460 insertions(+), 195 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_sock_addr.c b/tools/testing/selftests/bpf/test_sock_addr.c
index 2950f80..ed3e397 100644
--- a/tools/testing/selftests/bpf/test_sock_addr.c
+++ b/tools/testing/selftests/bpf/test_sock_addr.c
@@ -17,34 +17,292 @@
#include "cgroup_helpers.h"
#include "bpf_rlimit.h"
+#ifndef ARRAY_SIZE
+# define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#endif
+
#define CG_PATH "/foo"
#define CONNECT4_PROG_PATH "./connect4_prog.o"
#define CONNECT6_PROG_PATH "./connect6_prog.o"
#define SERV4_IP "192.168.1.254"
#define SERV4_REWRITE_IP "127.0.0.1"
+#define SRC4_REWRITE_IP "127.0.0.4"
#define SERV4_PORT 4040
#define SERV4_REWRITE_PORT 4444
#define SERV6_IP "face:b00c:1234:5678::abcd"
#define SERV6_REWRITE_IP "::1"
+#define SRC6_REWRITE_IP "::6"
#define SERV6_PORT 6060
#define SERV6_REWRITE_PORT 6666
#define INET_NTOP_BUF 40
-typedef int (*load_fn)(enum bpf_attach_type, const char *comment);
+struct sock_addr_test;
+
+typedef int (*load_fn)(const struct sock_addr_test *test);
typedef int (*info_fn)(int, struct sockaddr *, socklen_t *);
-struct program {
- enum bpf_attach_type type;
- load_fn loadfn;
- int fd;
- const char *name;
- enum bpf_attach_type invalid_type;
+char bpf_log_buf[BPF_LOG_BUF_SIZE];
+
+struct sock_addr_test {
+ const char *descr;
+ /* BPF prog properties */
+ load_fn loadfn;
+ enum bpf_attach_type expected_attach_type;
+ enum bpf_attach_type attach_type;
+ /* Socket properties */
+ int domain;
+ int type;
+ /* IP:port pairs for BPF prog to override */
+ const char *requested_ip;
+ unsigned short requested_port;
+ const char *expected_ip;
+ unsigned short expected_port;
+ const char *expected_src_ip;
+ /* Expected test result */
+ enum {
+ LOAD_REJECT,
+ ATTACH_REJECT,
+ SUCCESS,
+ } expected_result;
};
-char bpf_log_buf[BPF_LOG_BUF_SIZE];
+static int bind4_prog_load(const struct sock_addr_test *test);
+static int bind6_prog_load(const struct sock_addr_test *test);
+static int connect4_prog_load(const struct sock_addr_test *test);
+static int connect6_prog_load(const struct sock_addr_test *test);
+
+static struct sock_addr_test tests[] = {
+ /* bind */
+ {
+ "bind4: load prog with wrong expected attach type",
+ bind4_prog_load,
+ BPF_CGROUP_INET6_BIND,
+ BPF_CGROUP_INET4_BIND,
+ AF_INET,
+ SOCK_STREAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ LOAD_REJECT,
+ },
+ {
+ "bind4: attach prog with wrong attach type",
+ bind4_prog_load,
+ BPF_CGROUP_INET4_BIND,
+ BPF_CGROUP_INET6_BIND,
+ AF_INET,
+ SOCK_STREAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ ATTACH_REJECT,
+ },
+ {
+ "bind4: rewrite IP & TCP port in",
+ bind4_prog_load,
+ BPF_CGROUP_INET4_BIND,
+ BPF_CGROUP_INET4_BIND,
+ AF_INET,
+ SOCK_STREAM,
+ SERV4_IP,
+ SERV4_PORT,
+ SERV4_REWRITE_IP,
+ SERV4_REWRITE_PORT,
+ NULL,
+ SUCCESS,
+ },
+ {
+ "bind4: rewrite IP & UDP port in",
+ bind4_prog_load,
+ BPF_CGROUP_INET4_BIND,
+ BPF_CGROUP_INET4_BIND,
+ AF_INET,
+ SOCK_DGRAM,
+ SERV4_IP,
+ SERV4_PORT,
+ SERV4_REWRITE_IP,
+ SERV4_REWRITE_PORT,
+ NULL,
+ SUCCESS,
+ },
+ {
+ "bind6: load prog with wrong expected attach type",
+ bind6_prog_load,
+ BPF_CGROUP_INET4_BIND,
+ BPF_CGROUP_INET6_BIND,
+ AF_INET6,
+ SOCK_STREAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ LOAD_REJECT,
+ },
+ {
+ "bind6: attach prog with wrong attach type",
+ bind6_prog_load,
+ BPF_CGROUP_INET6_BIND,
+ BPF_CGROUP_INET4_BIND,
+ AF_INET,
+ SOCK_STREAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ ATTACH_REJECT,
+ },
+ {
+ "bind6: rewrite IP & TCP port in",
+ bind6_prog_load,
+ BPF_CGROUP_INET6_BIND,
+ BPF_CGROUP_INET6_BIND,
+ AF_INET6,
+ SOCK_STREAM,
+ SERV6_IP,
+ SERV6_PORT,
+ SERV6_REWRITE_IP,
+ SERV6_REWRITE_PORT,
+ NULL,
+ SUCCESS,
+ },
+ {
+ "bind6: rewrite IP & UDP port in",
+ bind6_prog_load,
+ BPF_CGROUP_INET6_BIND,
+ BPF_CGROUP_INET6_BIND,
+ AF_INET6,
+ SOCK_DGRAM,
+ SERV6_IP,
+ SERV6_PORT,
+ SERV6_REWRITE_IP,
+ SERV6_REWRITE_PORT,
+ NULL,
+ SUCCESS,
+ },
+
+ /* connect */
+ {
+ "connect4: load prog with wrong expected attach type",
+ connect4_prog_load,
+ BPF_CGROUP_INET6_CONNECT,
+ BPF_CGROUP_INET4_CONNECT,
+ AF_INET,
+ SOCK_STREAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ LOAD_REJECT,
+ },
+ {
+ "connect4: attach prog with wrong attach type",
+ connect4_prog_load,
+ BPF_CGROUP_INET4_CONNECT,
+ BPF_CGROUP_INET6_CONNECT,
+ AF_INET,
+ SOCK_STREAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ ATTACH_REJECT,
+ },
+ {
+ "connect4: rewrite IP & TCP port",
+ connect4_prog_load,
+ BPF_CGROUP_INET4_CONNECT,
+ BPF_CGROUP_INET4_CONNECT,
+ AF_INET,
+ SOCK_STREAM,
+ SERV4_IP,
+ SERV4_PORT,
+ SERV4_REWRITE_IP,
+ SERV4_REWRITE_PORT,
+ SRC4_REWRITE_IP,
+ SUCCESS,
+ },
+ {
+ "connect4: rewrite IP & UDP port",
+ connect4_prog_load,
+ BPF_CGROUP_INET4_CONNECT,
+ BPF_CGROUP_INET4_CONNECT,
+ AF_INET,
+ SOCK_DGRAM,
+ SERV4_IP,
+ SERV4_PORT,
+ SERV4_REWRITE_IP,
+ SERV4_REWRITE_PORT,
+ SRC4_REWRITE_IP,
+ SUCCESS,
+ },
+ {
+ "connect6: load prog with wrong expected attach type",
+ connect6_prog_load,
+ BPF_CGROUP_INET4_CONNECT,
+ BPF_CGROUP_INET6_CONNECT,
+ AF_INET6,
+ SOCK_STREAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ LOAD_REJECT,
+ },
+ {
+ "connect6: attach prog with wrong attach type",
+ connect6_prog_load,
+ BPF_CGROUP_INET6_CONNECT,
+ BPF_CGROUP_INET4_CONNECT,
+ AF_INET,
+ SOCK_STREAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ ATTACH_REJECT,
+ },
+ {
+ "connect6: rewrite IP & TCP port",
+ connect6_prog_load,
+ BPF_CGROUP_INET6_CONNECT,
+ BPF_CGROUP_INET6_CONNECT,
+ AF_INET6,
+ SOCK_STREAM,
+ SERV6_IP,
+ SERV6_PORT,
+ SERV6_REWRITE_IP,
+ SERV6_REWRITE_PORT,
+ SRC6_REWRITE_IP,
+ SUCCESS,
+ },
+ {
+ "connect6: rewrite IP & UDP port",
+ connect6_prog_load,
+ BPF_CGROUP_INET6_CONNECT,
+ BPF_CGROUP_INET6_CONNECT,
+ AF_INET6,
+ SOCK_DGRAM,
+ SERV6_IP,
+ SERV6_PORT,
+ SERV6_REWRITE_IP,
+ SERV6_REWRITE_PORT,
+ SRC6_REWRITE_IP,
+ SUCCESS,
+ },
+};
static int mk_sockaddr(int domain, const char *ip, unsigned short port,
struct sockaddr *addr, socklen_t addr_len)
@@ -84,25 +342,23 @@ static int mk_sockaddr(int domain, const char *ip, unsigned short port,
return 0;
}
-static int load_insns(enum bpf_attach_type attach_type,
- const struct bpf_insn *insns, size_t insns_cnt,
- const char *comment)
+static int load_insns(const struct sock_addr_test *test,
+ const struct bpf_insn *insns, size_t insns_cnt)
{
struct bpf_load_program_attr load_attr;
int ret;
memset(&load_attr, 0, sizeof(struct bpf_load_program_attr));
load_attr.prog_type = BPF_PROG_TYPE_CGROUP_SOCK_ADDR;
- load_attr.expected_attach_type = attach_type;
+ load_attr.expected_attach_type = test->expected_attach_type;
load_attr.insns = insns;
load_attr.insns_cnt = insns_cnt;
load_attr.license = "GPL";
ret = bpf_load_program_xattr(&load_attr, bpf_log_buf, BPF_LOG_BUF_SIZE);
- if (ret < 0 && comment) {
- log_err(">>> Loading %s program error.\n"
- ">>> Output from verifier:\n%s\n-------\n",
- comment, bpf_log_buf);
+ if (ret < 0 && test->expected_result != LOAD_REJECT) {
+ log_err(">>> Loading program error.\n"
+ ">>> Verifier output:\n%s\n-------\n", bpf_log_buf);
}
return ret;
@@ -119,8 +375,7 @@ static int load_insns(enum bpf_attach_type attach_type,
* to count jumps properly.
*/
-static int bind4_prog_load(enum bpf_attach_type attach_type,
- const char *comment)
+static int bind4_prog_load(const struct sock_addr_test *test)
{
union {
uint8_t u4_addr8[4];
@@ -186,12 +441,10 @@ static int bind4_prog_load(enum bpf_attach_type attach_type,
BPF_EXIT_INSN(),
};
- return load_insns(attach_type, insns,
- sizeof(insns) / sizeof(struct bpf_insn), comment);
+ return load_insns(test, insns, sizeof(insns) / sizeof(struct bpf_insn));
}
-static int bind6_prog_load(enum bpf_attach_type attach_type,
- const char *comment)
+static int bind6_prog_load(const struct sock_addr_test *test)
{
struct sockaddr_in6 addr6_rw;
struct in6_addr ip6;
@@ -254,13 +507,10 @@ static int bind6_prog_load(enum bpf_attach_type attach_type,
BPF_EXIT_INSN(),
};
- return load_insns(attach_type, insns,
- sizeof(insns) / sizeof(struct bpf_insn), comment);
+ return load_insns(test, insns, sizeof(insns) / sizeof(struct bpf_insn));
}
-static int connect_prog_load_path(const char *path,
- enum bpf_attach_type attach_type,
- const char *comment)
+static int load_path(const struct sock_addr_test *test, const char *path)
{
struct bpf_prog_load_attr attr;
struct bpf_object *obj;
@@ -269,75 +519,83 @@ static int connect_prog_load_path(const char *path,
memset(&attr, 0, sizeof(struct bpf_prog_load_attr));
attr.file = path;
attr.prog_type = BPF_PROG_TYPE_CGROUP_SOCK_ADDR;
- attr.expected_attach_type = attach_type;
+ attr.expected_attach_type = test->expected_attach_type;
if (bpf_prog_load_xattr(&attr, &obj, &prog_fd)) {
- if (comment)
- log_err(">>> Loading %s program at %s error.\n",
- comment, path);
+ if (test->expected_result != LOAD_REJECT)
+ log_err(">>> Loading program (%s) error.\n", path);
return -1;
}
return prog_fd;
}
-static int connect4_prog_load(enum bpf_attach_type attach_type,
- const char *comment)
+static int connect4_prog_load(const struct sock_addr_test *test)
{
- return connect_prog_load_path(CONNECT4_PROG_PATH, attach_type, comment);
+ return load_path(test, CONNECT4_PROG_PATH);
}
-static int connect6_prog_load(enum bpf_attach_type attach_type,
- const char *comment)
+static int connect6_prog_load(const struct sock_addr_test *test)
{
- return connect_prog_load_path(CONNECT6_PROG_PATH, attach_type, comment);
+ return load_path(test, CONNECT6_PROG_PATH);
}
-static void print_ip_port(int sockfd, info_fn fn, const char *fmt)
+static int cmp_addr(const struct sockaddr_storage *addr1,
+ const struct sockaddr_storage *addr2, int cmp_port)
{
- char addr_buf[INET_NTOP_BUF];
- struct sockaddr_storage addr;
- struct sockaddr_in6 *addr6;
- struct sockaddr_in *addr4;
- socklen_t addr_len;
- unsigned short port;
- void *nip;
-
- addr_len = sizeof(struct sockaddr_storage);
- memset(&addr, 0, addr_len);
-
- if (fn(sockfd, (struct sockaddr *)&addr, (socklen_t *)&addr_len) == 0) {
- if (addr.ss_family == AF_INET) {
- addr4 = (struct sockaddr_in *)&addr;
- nip = (void *)&addr4->sin_addr;
- port = ntohs(addr4->sin_port);
- } else if (addr.ss_family == AF_INET6) {
- addr6 = (struct sockaddr_in6 *)&addr;
- nip = (void *)&addr6->sin6_addr;
- port = ntohs(addr6->sin6_port);
- } else {
- return;
- }
- const char *addr_str =
- inet_ntop(addr.ss_family, nip, addr_buf, INET_NTOP_BUF);
- printf(fmt, addr_str ? addr_str : "??", port);
+ const struct sockaddr_in *four1, *four2;
+ const struct sockaddr_in6 *six1, *six2;
+
+ if (addr1->ss_family != addr2->ss_family)
+ return -1;
+
+ if (addr1->ss_family == AF_INET) {
+ four1 = (const struct sockaddr_in *)addr1;
+ four2 = (const struct sockaddr_in *)addr2;
+ return !((four1->sin_port == four2->sin_port || !cmp_port) &&
+ four1->sin_addr.s_addr == four2->sin_addr.s_addr);
+ } else if (addr1->ss_family == AF_INET6) {
+ six1 = (const struct sockaddr_in6 *)addr1;
+ six2 = (const struct sockaddr_in6 *)addr2;
+ return !((six1->sin6_port == six2->sin6_port || !cmp_port) &&
+ !memcmp(&six1->sin6_addr, &six2->sin6_addr,
+ sizeof(struct in6_addr)));
}
+
+ return -1;
+}
+
+static int cmp_sock_addr(info_fn fn, int sock1,
+ const struct sockaddr_storage *addr2, int cmp_port)
+{
+ struct sockaddr_storage addr1;
+ socklen_t len1 = sizeof(addr1);
+
+ memset(&addr1, 0, len1);
+ if (fn(sock1, (struct sockaddr *)&addr1, (socklen_t *)&len1) != 0)
+ return -1;
+
+ return cmp_addr(&addr1, addr2, cmp_port);
+}
+
+static int cmp_local_ip(int sock1, const struct sockaddr_storage *addr2)
+{
+ return cmp_sock_addr(getsockname, sock1, addr2, /*cmp_port*/ 0);
}
-static void print_local_ip_port(int sockfd, const char *fmt)
+static int cmp_local_addr(int sock1, const struct sockaddr_storage *addr2)
{
- print_ip_port(sockfd, getsockname, fmt);
+ return cmp_sock_addr(getsockname, sock1, addr2, /*cmp_port*/ 1);
}
-static void print_remote_ip_port(int sockfd, const char *fmt)
+static int cmp_peer_addr(int sock1, const struct sockaddr_storage *addr2)
{
- print_ip_port(sockfd, getpeername, fmt);
+ return cmp_sock_addr(getpeername, sock1, addr2, /*cmp_port*/ 1);
}
static int start_server(int type, const struct sockaddr_storage *addr,
socklen_t addr_len)
{
-
int fd;
fd = socket(addr->ss_family, type, 0);
@@ -358,8 +616,6 @@ static int start_server(int type, const struct sockaddr_storage *addr,
}
}
- print_local_ip_port(fd, "\t Actual: bind(%s, %d)\n");
-
goto out;
close_out:
close(fd);
@@ -372,19 +628,19 @@ static int connect_to_server(int type, const struct sockaddr_storage *addr,
socklen_t addr_len)
{
int domain;
- int fd;
+ int fd = -1;
domain = addr->ss_family;
if (domain != AF_INET && domain != AF_INET6) {
log_err("Unsupported address family");
- return -1;
+ goto err;
}
fd = socket(domain, type, 0);
if (fd == -1) {
- log_err("Failed to creating client socket");
- return -1;
+ log_err("Failed to create client socket");
+ goto err;
}
if (connect(fd, (const struct sockaddr *)addr, addr_len) == -1) {
@@ -392,162 +648,188 @@ static int connect_to_server(int type, const struct sockaddr_storage *addr,
goto err;
}
- print_remote_ip_port(fd, "\t Actual: connect(%s, %d)");
- print_local_ip_port(fd, " from (%s, %d)\n");
-
- return 0;
+ goto out;
err:
close(fd);
- return -1;
+ fd = -1;
+out:
+ return fd;
}
-static void print_test_case_num(int domain, int type)
+static int init_addrs(const struct sock_addr_test *test,
+ struct sockaddr_storage *requested_addr,
+ struct sockaddr_storage *expected_addr,
+ struct sockaddr_storage *expected_src_addr)
{
- static int test_num;
-
- printf("Test case #%d (%s/%s):\n", ++test_num,
- (domain == AF_INET ? "IPv4" :
- domain == AF_INET6 ? "IPv6" :
- "unknown_domain"),
- (type == SOCK_STREAM ? "TCP" :
- type == SOCK_DGRAM ? "UDP" :
- "unknown_type"));
+ socklen_t addr_len = sizeof(struct sockaddr_storage);
+
+ if (mk_sockaddr(test->domain, test->expected_ip, test->expected_port,
+ (struct sockaddr *)expected_addr, addr_len) == -1)
+ goto err;
+
+ if (mk_sockaddr(test->domain, test->requested_ip, test->requested_port,
+ (struct sockaddr *)requested_addr, addr_len) == -1)
+ goto err;
+
+ if (test->expected_src_ip &&
+ mk_sockaddr(test->domain, test->expected_src_ip, 0,
+ (struct sockaddr *)expected_src_addr, addr_len) == -1)
+ goto err;
+
+ return 0;
+err:
+ return -1;
}
-static int run_test_case(int domain, int type, const char *ip,
- unsigned short port)
+static int run_bind_test_case(const struct sock_addr_test *test)
{
- struct sockaddr_storage addr;
- socklen_t addr_len = sizeof(addr);
+ socklen_t addr_len = sizeof(struct sockaddr_storage);
+ struct sockaddr_storage requested_addr;
+ struct sockaddr_storage expected_addr;
+ int clientfd = -1;
int servfd = -1;
int err = 0;
- print_test_case_num(domain, type);
-
- if (mk_sockaddr(domain, ip, port, (struct sockaddr *)&addr,
- addr_len) == -1)
- return -1;
+ if (init_addrs(test, &requested_addr, &expected_addr, NULL))
+ goto err;
- printf("\tRequested: bind(%s, %d) ..\n", ip, port);
- servfd = start_server(type, &addr, addr_len);
+ servfd = start_server(test->type, &requested_addr, addr_len);
if (servfd == -1)
goto err;
- printf("\tRequested: connect(%s, %d) from (*, *) ..\n", ip, port);
- if (connect_to_server(type, &addr, addr_len))
+ if (cmp_local_addr(servfd, &expected_addr))
+ goto err;
+
+ /* Try to connect to server just in case */
+ clientfd = connect_to_server(test->type, &expected_addr, addr_len);
+ if (clientfd == -1)
goto err;
goto out;
err:
err = -1;
out:
+ close(clientfd);
close(servfd);
return err;
}
-static void close_progs_fds(struct program *progs, size_t prog_cnt)
+static int run_connect_test_case(const struct sock_addr_test *test)
{
- size_t i;
+ socklen_t addr_len = sizeof(struct sockaddr_storage);
+ struct sockaddr_storage expected_src_addr;
+ struct sockaddr_storage requested_addr;
+ struct sockaddr_storage expected_addr;
+ int clientfd = -1;
+ int servfd = -1;
+ int err = 0;
- for (i = 0; i < prog_cnt; ++i) {
- close(progs[i].fd);
- progs[i].fd = -1;
- }
-}
+ if (init_addrs(test, &requested_addr, &expected_addr,
+ &expected_src_addr))
+ goto err;
-static int load_and_attach_progs(int cgfd, struct program *progs,
- size_t prog_cnt)
-{
- size_t i;
-
- for (i = 0; i < prog_cnt; ++i) {
- printf("Load %s with invalid type (can pollute stderr) ",
- progs[i].name);
- fflush(stdout);
- progs[i].fd = progs[i].loadfn(progs[i].invalid_type, NULL);
- if (progs[i].fd != -1) {
- log_err("Load with invalid type accepted for %s",
- progs[i].name);
- goto err;
- }
- printf("... REJECTED\n");
+ /* Prepare server to connect to */
+ servfd = start_server(test->type, &expected_addr, addr_len);
+ if (servfd == -1)
+ goto err;
- printf("Load %s with valid type", progs[i].name);
- progs[i].fd = progs[i].loadfn(progs[i].type, progs[i].name);
- if (progs[i].fd == -1) {
- log_err("Failed to load program %s", progs[i].name);
- goto err;
- }
- printf(" ... OK\n");
-
- printf("Attach %s with invalid type", progs[i].name);
- if (bpf_prog_attach(progs[i].fd, cgfd, progs[i].invalid_type,
- BPF_F_ALLOW_OVERRIDE) != -1) {
- log_err("Attach with invalid type accepted for %s",
- progs[i].name);
- goto err;
- }
- printf(" ... REJECTED\n");
+ clientfd = connect_to_server(test->type, &requested_addr, addr_len);
+ if (clientfd == -1)
+ goto err;
- printf("Attach %s with valid type", progs[i].name);
- if (bpf_prog_attach(progs[i].fd, cgfd, progs[i].type,
- BPF_F_ALLOW_OVERRIDE) == -1) {
- log_err("Failed to attach program %s", progs[i].name);
- goto err;
- }
- printf(" ... OK\n");
- }
+ /* Make sure src and dst addrs were overridden properly */
+ if (cmp_peer_addr(clientfd, &expected_addr))
+ goto err;
- return 0;
+ if (cmp_local_ip(clientfd, &expected_src_addr))
+ goto err;
+
+ goto out;
err:
- close_progs_fds(progs, prog_cnt);
- return -1;
+ err = -1;
+out:
+ close(clientfd);
+ close(servfd);
+ return err;
}
-static int run_domain_test(int domain, int cgfd, struct program *progs,
- size_t prog_cnt, const char *ip, unsigned short port)
+static int run_test_case(int cgfd, const struct sock_addr_test *test)
{
+ int progfd = -1;
int err = 0;
- if (load_and_attach_progs(cgfd, progs, prog_cnt) == -1)
+ printf("Test case: %s .. ", test->descr);
+
+ progfd = test->loadfn(test);
+ if (test->expected_result == LOAD_REJECT && progfd < 0)
+ goto out;
+ else if (test->expected_result == LOAD_REJECT || progfd < 0)
+ goto err;
+
+ err = bpf_prog_attach(progfd, cgfd, test->attach_type,
+ BPF_F_ALLOW_OVERRIDE);
+ if (test->expected_result == ATTACH_REJECT && err) {
+ err = 0; /* error was expected, reset it */
+ goto out;
+ } else if (test->expected_result == ATTACH_REJECT || err) {
goto err;
+ }
- if (run_test_case(domain, SOCK_STREAM, ip, port) == -1)
+ switch (test->attach_type) {
+ case BPF_CGROUP_INET4_BIND:
+ case BPF_CGROUP_INET6_BIND:
+ err = run_bind_test_case(test);
+ break;
+ case BPF_CGROUP_INET4_CONNECT:
+ case BPF_CGROUP_INET6_CONNECT:
+ err = run_connect_test_case(test);
+ break;
+ default:
goto err;
+ }
- if (run_test_case(domain, SOCK_DGRAM, ip, port) == -1)
+ if (err || test->expected_result != SUCCESS)
goto err;
goto out;
err:
err = -1;
out:
- close_progs_fds(progs, prog_cnt);
+ /* Detaching w/o checking return code: best effort attempt. */
+ if (progfd != -1)
+ bpf_prog_detach(cgfd, test->attach_type);
+ close(progfd);
+ printf("[%s]\n", err ? "FAIL" : "PASS");
return err;
}
-static int run_test(void)
+static int run_tests(int cgfd)
+{
+ int passes = 0;
+ int fails = 0;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(tests); ++i) {
+ if (run_test_case(cgfd, &tests[i]))
+ ++fails;
+ else
+ ++passes;
+ }
+ printf("Summary: %d PASSED, %d FAILED\n", passes, fails);
+ return fails ? -1 : 0;
+}
+
+int main(int argc, char **argv)
{
- size_t inet6_prog_cnt;
- size_t inet_prog_cnt;
int cgfd = -1;
int err = 0;
- struct program inet6_progs[] = {
- {BPF_CGROUP_INET6_BIND, bind6_prog_load, -1, "bind6",
- BPF_CGROUP_INET4_BIND},
- {BPF_CGROUP_INET6_CONNECT, connect6_prog_load, -1, "connect6",
- BPF_CGROUP_INET4_CONNECT},
- };
- inet6_prog_cnt = sizeof(inet6_progs) / sizeof(struct program);
-
- struct program inet_progs[] = {
- {BPF_CGROUP_INET4_BIND, bind4_prog_load, -1, "bind4",
- BPF_CGROUP_INET6_BIND},
- {BPF_CGROUP_INET4_CONNECT, connect4_prog_load, -1, "connect4",
- BPF_CGROUP_INET6_CONNECT},
- };
- inet_prog_cnt = sizeof(inet_progs) / sizeof(struct program);
+ if (argc < 2) {
+ fprintf(stderr,
+ "%s has to be run via %s.sh. Skip direct run.\n",
+ argv[0], argv[0]);
+ exit(err);
+ }
if (setup_cgroup_environment())
goto err;
@@ -559,12 +841,7 @@ static int run_test(void)
if (join_cgroup(CG_PATH))
goto err;
- if (run_domain_test(AF_INET, cgfd, inet_progs, inet_prog_cnt, SERV4_IP,
- SERV4_PORT) == -1)
- goto err;
-
- if (run_domain_test(AF_INET6, cgfd, inet6_progs, inet6_prog_cnt,
- SERV6_IP, SERV6_PORT) == -1)
+ if (run_tests(cgfd))
goto err;
goto out;
@@ -573,17 +850,5 @@ static int run_test(void)
out:
close(cgfd);
cleanup_cgroup_environment();
- printf(err ? "### FAIL\n" : "### SUCCESS\n");
return err;
}
-
-int main(int argc, char **argv)
-{
- if (argc < 2) {
- fprintf(stderr,
- "%s has to be run via %s.sh. Skip direct run.\n",
- argv[0], argv[0]);
- exit(0);
- }
- return run_test();
-}
--
2.9.5
* [PATCH v3 bpf-next 5/5] selftests/bpf: Selftest for sys_sendmsg hooks
From: Andrey Ignatov @ 2018-05-25 5:09 UTC (permalink / raw)
To: netdev; +Cc: Andrey Ignatov, davem, kafai, ast, daniel, kernel-team
In-Reply-To: <cover.1527224903.git.rdna@fb.com>
Add selftest for BPF_CGROUP_UDP4_SENDMSG and BPF_CGROUP_UDP6_SENDMSG
attach types.
Try to sendmsg(2) to specific IP:port and test that:
* source IP is overridden as expected;
* remote IP:port pair is overridden as expected.
Both UDPv4 and UDPv6 are tested.
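For readers unfamiliar with how userspace can ask for a specific source IP on an unconnected UDP socket, here is a minimal sketch for IPv4. It is an illustration under stated assumptions, not the selftest code: `struct udp4_msg` and `prep_udp4_msg` are hypothetical names, and the sketch assumes the source address is requested through an IP_PKTINFO control message (`ipi_spec_dst`), as described in ip(7).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* may be needed for struct in_pktinfo on glibc */
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical container that keeps every piece referenced by the
 * msghdr alive together. It holds pointers into itself, so it must be
 * used in place and not copied after prep_udp4_msg().
 */
struct udp4_msg {
	struct msghdr hdr;
	struct iovec iov;
	struct sockaddr_in dst;
	union {
		char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
		struct cmsghdr align; /* forces correct alignment */
	} ctl;
};

/* Hypothetical helper: prepare a message for dst_ip:dst_port that asks
 * the kernel to use src_ip as the source address. Returns 0 on success,
 * -1 if either IP string cannot be parsed.
 */
static int prep_udp4_msg(struct udp4_msg *m, const char *src_ip,
			 const char *dst_ip, unsigned short dst_port,
			 void *data, size_t len)
{
	struct in_pktinfo pktinfo;
	struct cmsghdr *cmsg;

	assert(m && src_ip && dst_ip);
	memset(m, 0, sizeof(*m));

	m->dst.sin_family = AF_INET;
	m->dst.sin_port = htons(dst_port);
	if (inet_pton(AF_INET, dst_ip, &m->dst.sin_addr) != 1)
		return -1;

	memset(&pktinfo, 0, sizeof(pktinfo));
	if (inet_pton(AF_INET, src_ip, &pktinfo.ipi_spec_dst) != 1)
		return -1;

	m->iov.iov_base = data;
	m->iov.iov_len = len;

	m->hdr.msg_name = &m->dst;
	m->hdr.msg_namelen = sizeof(m->dst);
	m->hdr.msg_iov = &m->iov;
	m->hdr.msg_iovlen = 1;
	m->hdr.msg_control = m->ctl.buf;
	m->hdr.msg_controllen = sizeof(m->ctl.buf);

	/* Attach the requested source address as ancillary data. */
	cmsg = CMSG_FIRSTHDR(&m->hdr);
	cmsg->cmsg_level = IPPROTO_IP;
	cmsg->cmsg_type = IP_PKTINFO;
	cmsg->cmsg_len = CMSG_LEN(sizeof(pktinfo));
	memcpy(CMSG_DATA(cmsg), &pktinfo, sizeof(pktinfo));

	return 0;
}
```

A caller would then pass `&m.hdr` to sendmsg(2) on a SOCK_DGRAM socket. With a sendmsg hook attached to the cgroup, both the destination in `msg_name` and the requested source may be rewritten before the datagram is sent, which is what the test-cases below verify.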
Output:
# test_sock_addr.sh 2>/dev/null
Wait for testing IPv4/IPv6 to become available ... OK
... pre-existing test-cases skipped ...
Test case: sendmsg4: load prog with wrong expected attach type .. [PASS]
Test case: sendmsg4: attach prog with wrong attach type .. [PASS]
Test case: sendmsg4: rewrite IP & port (asm) .. [PASS]
Test case: sendmsg4: rewrite IP & port (C) .. [PASS]
Test case: sendmsg4: deny call .. [PASS]
Test case: sendmsg6: load prog with wrong expected attach type .. [PASS]
Test case: sendmsg6: attach prog with wrong attach type .. [PASS]
Test case: sendmsg6: rewrite IP & port (asm) .. [PASS]
Test case: sendmsg6: rewrite IP & port (C) .. [PASS]
Test case: sendmsg6: IPv4-mapped IPv6 .. [PASS]
Test case: sendmsg6: deny call .. [PASS]
Summary: 27 PASSED, 0 FAILED
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
---
tools/testing/selftests/bpf/Makefile | 2 +-
tools/testing/selftests/bpf/sendmsg4_prog.c | 49 +++
tools/testing/selftests/bpf/sendmsg6_prog.c | 60 ++++
tools/testing/selftests/bpf/test_sock_addr.c | 518 +++++++++++++++++++++++++++
4 files changed, 628 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/bpf/sendmsg4_prog.c
create mode 100644 tools/testing/selftests/bpf/sendmsg6_prog.c
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 8504444..a1b66da 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -34,7 +34,7 @@ TEST_GEN_FILES = test_pkt_access.o test_xdp.o test_l4lb.o test_tcp_estats.o test
sockmap_tcp_msg_prog.o connect4_prog.o connect6_prog.o test_adjust_tail.o \
test_btf_haskv.o test_btf_nokv.o test_sockmap_kern.o test_tunnel_kern.o \
test_get_stack_rawtp.o test_sockmap_kern.o test_sockhash_kern.o \
- test_lwt_seg6local.o
+ test_lwt_seg6local.o sendmsg4_prog.o sendmsg6_prog.o
# Order correspond to 'make run_tests' order
TEST_PROGS := test_kmod.sh \
diff --git a/tools/testing/selftests/bpf/sendmsg4_prog.c b/tools/testing/selftests/bpf/sendmsg4_prog.c
new file mode 100644
index 0000000..a91536b
--- /dev/null
+++ b/tools/testing/selftests/bpf/sendmsg4_prog.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Facebook
+
+#include <linux/stddef.h>
+#include <linux/bpf.h>
+#include <sys/socket.h>
+
+#include "bpf_helpers.h"
+#include "bpf_endian.h"
+
+#define SRC1_IP4 0xAC100001U /* 172.16.0.1 */
+#define SRC2_IP4 0x00000000U
+#define SRC_REWRITE_IP4 0x7f000004U
+#define DST_IP4 0xC0A801FEU /* 192.168.1.254 */
+#define DST_REWRITE_IP4 0x7f000001U
+#define DST_PORT 4040
+#define DST_REWRITE_PORT4 4444
+
+int _version SEC("version") = 1;
+
+SEC("cgroup/sendmsg4")
+int sendmsg_v4_prog(struct bpf_sock_addr *ctx)
+{
+ if (ctx->type != SOCK_DGRAM)
+ return 0;
+
+ /* Rewrite source. */
+ if (ctx->msg_src_ip4 == bpf_htonl(SRC1_IP4) ||
+ ctx->msg_src_ip4 == bpf_htonl(SRC2_IP4)) {
+ ctx->msg_src_ip4 = bpf_htonl(SRC_REWRITE_IP4);
+ } else {
+ /* Unexpected source. Reject sendmsg. */
+ return 0;
+ }
+
+ /* Rewrite destination. */
+ if ((ctx->user_ip4 >> 24) == (bpf_htonl(DST_IP4) >> 24) &&
+ ctx->user_port == bpf_htons(DST_PORT)) {
+ ctx->user_ip4 = bpf_htonl(DST_REWRITE_IP4);
+ ctx->user_port = bpf_htons(DST_REWRITE_PORT4);
+ } else {
+ /* Unexpected destination. Reject sendmsg. */
+ return 0;
+ }
+
+ return 1;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/sendmsg6_prog.c b/tools/testing/selftests/bpf/sendmsg6_prog.c
new file mode 100644
index 0000000..5aeaa28
--- /dev/null
+++ b/tools/testing/selftests/bpf/sendmsg6_prog.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2018 Facebook
+
+#include <linux/stddef.h>
+#include <linux/bpf.h>
+#include <sys/socket.h>
+
+#include "bpf_helpers.h"
+#include "bpf_endian.h"
+
+#define SRC_REWRITE_IP6_0 0
+#define SRC_REWRITE_IP6_1 0
+#define SRC_REWRITE_IP6_2 0
+#define SRC_REWRITE_IP6_3 6
+
+#define DST_REWRITE_IP6_0 0
+#define DST_REWRITE_IP6_1 0
+#define DST_REWRITE_IP6_2 0
+#define DST_REWRITE_IP6_3 1
+
+#define DST_REWRITE_PORT6 6666
+
+int _version SEC("version") = 1;
+
+SEC("cgroup/sendmsg6")
+int sendmsg_v6_prog(struct bpf_sock_addr *ctx)
+{
+ if (ctx->type != SOCK_DGRAM)
+ return 0;
+
+ /* Rewrite source. */
+ if (ctx->msg_src_ip6[3] == bpf_htonl(1) ||
+ ctx->msg_src_ip6[3] == bpf_htonl(0)) {
+ ctx->msg_src_ip6[0] = bpf_htonl(SRC_REWRITE_IP6_0);
+ ctx->msg_src_ip6[1] = bpf_htonl(SRC_REWRITE_IP6_1);
+ ctx->msg_src_ip6[2] = bpf_htonl(SRC_REWRITE_IP6_2);
+ ctx->msg_src_ip6[3] = bpf_htonl(SRC_REWRITE_IP6_3);
+ } else {
+ /* Unexpected source. Reject sendmsg. */
+ return 0;
+ }
+
+ /* Rewrite destination. */
+ if ((ctx->user_ip6[0] & 0xFFFF) == bpf_htons(0xFACE) &&
+ ctx->user_ip6[0] >> 16 == bpf_htons(0xB00C)) {
+ ctx->user_ip6[0] = bpf_htonl(DST_REWRITE_IP6_0);
+ ctx->user_ip6[1] = bpf_htonl(DST_REWRITE_IP6_1);
+ ctx->user_ip6[2] = bpf_htonl(DST_REWRITE_IP6_2);
+ ctx->user_ip6[3] = bpf_htonl(DST_REWRITE_IP6_3);
+
+ ctx->user_port = bpf_htons(DST_REWRITE_PORT6);
+ } else {
+ /* Unexpected destination. Reject sendmsg. */
+ return 0;
+ }
+
+ return 1;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_sock_addr.c b/tools/testing/selftests/bpf/test_sock_addr.c
index ed3e397..a5e76b9 100644
--- a/tools/testing/selftests/bpf/test_sock_addr.c
+++ b/tools/testing/selftests/bpf/test_sock_addr.c
@@ -1,12 +1,16 @@
// SPDX-License-Identifier: GPL-2.0
// Copyright (c) 2018 Facebook
+#define _GNU_SOURCE
+
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <arpa/inet.h>
+#include <netinet/in.h>
#include <sys/types.h>
+#include <sys/select.h>
#include <sys/socket.h>
#include <linux/filter.h>
@@ -17,6 +21,10 @@
#include "cgroup_helpers.h"
#include "bpf_rlimit.h"
+#ifndef ENOTSUPP
+# define ENOTSUPP 524
+#endif
+
#ifndef ARRAY_SIZE
# define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#endif
@@ -24,15 +32,20 @@
#define CG_PATH "/foo"
#define CONNECT4_PROG_PATH "./connect4_prog.o"
#define CONNECT6_PROG_PATH "./connect6_prog.o"
+#define SENDMSG4_PROG_PATH "./sendmsg4_prog.o"
+#define SENDMSG6_PROG_PATH "./sendmsg6_prog.o"
#define SERV4_IP "192.168.1.254"
#define SERV4_REWRITE_IP "127.0.0.1"
+#define SRC4_IP "172.16.0.1"
#define SRC4_REWRITE_IP "127.0.0.4"
#define SERV4_PORT 4040
#define SERV4_REWRITE_PORT 4444
#define SERV6_IP "face:b00c:1234:5678::abcd"
#define SERV6_REWRITE_IP "::1"
+#define SERV6_V4MAPPED_IP "::ffff:192.168.0.4"
+#define SRC6_IP "::1"
#define SRC6_REWRITE_IP "::6"
#define SERV6_PORT 6060
#define SERV6_REWRITE_PORT 6666
@@ -65,6 +78,8 @@ struct sock_addr_test {
enum {
LOAD_REJECT,
ATTACH_REJECT,
+ SYSCALL_EPERM,
+ SYSCALL_ENOTSUPP,
SUCCESS,
} expected_result;
};
@@ -73,6 +88,12 @@ static int bind4_prog_load(const struct sock_addr_test *test);
static int bind6_prog_load(const struct sock_addr_test *test);
static int connect4_prog_load(const struct sock_addr_test *test);
static int connect6_prog_load(const struct sock_addr_test *test);
+static int sendmsg_deny_prog_load(const struct sock_addr_test *test);
+static int sendmsg4_rw_asm_prog_load(const struct sock_addr_test *test);
+static int sendmsg4_rw_c_prog_load(const struct sock_addr_test *test);
+static int sendmsg6_rw_asm_prog_load(const struct sock_addr_test *test);
+static int sendmsg6_rw_c_prog_load(const struct sock_addr_test *test);
+static int sendmsg6_rw_v4mapped_prog_load(const struct sock_addr_test *test);
static struct sock_addr_test tests[] = {
/* bind */
@@ -302,6 +323,162 @@ static struct sock_addr_test tests[] = {
SRC6_REWRITE_IP,
SUCCESS,
},
+
+ /* sendmsg */
+ {
+ "sendmsg4: load prog with wrong expected attach type",
+ sendmsg4_rw_asm_prog_load,
+ BPF_CGROUP_UDP6_SENDMSG,
+ BPF_CGROUP_UDP4_SENDMSG,
+ AF_INET,
+ SOCK_DGRAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ LOAD_REJECT,
+ },
+ {
+ "sendmsg4: attach prog with wrong attach type",
+ sendmsg4_rw_asm_prog_load,
+ BPF_CGROUP_UDP4_SENDMSG,
+ BPF_CGROUP_UDP6_SENDMSG,
+ AF_INET,
+ SOCK_DGRAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ ATTACH_REJECT,
+ },
+ {
+ "sendmsg4: rewrite IP & port (asm)",
+ sendmsg4_rw_asm_prog_load,
+ BPF_CGROUP_UDP4_SENDMSG,
+ BPF_CGROUP_UDP4_SENDMSG,
+ AF_INET,
+ SOCK_DGRAM,
+ SERV4_IP,
+ SERV4_PORT,
+ SERV4_REWRITE_IP,
+ SERV4_REWRITE_PORT,
+ SRC4_REWRITE_IP,
+ SUCCESS,
+ },
+ {
+ "sendmsg4: rewrite IP & port (C)",
+ sendmsg4_rw_c_prog_load,
+ BPF_CGROUP_UDP4_SENDMSG,
+ BPF_CGROUP_UDP4_SENDMSG,
+ AF_INET,
+ SOCK_DGRAM,
+ SERV4_IP,
+ SERV4_PORT,
+ SERV4_REWRITE_IP,
+ SERV4_REWRITE_PORT,
+ SRC4_REWRITE_IP,
+ SUCCESS,
+ },
+ {
+ "sendmsg4: deny call",
+ sendmsg_deny_prog_load,
+ BPF_CGROUP_UDP4_SENDMSG,
+ BPF_CGROUP_UDP4_SENDMSG,
+ AF_INET,
+ SOCK_DGRAM,
+ SERV4_IP,
+ SERV4_PORT,
+ SERV4_REWRITE_IP,
+ SERV4_REWRITE_PORT,
+ SRC4_REWRITE_IP,
+ SYSCALL_EPERM,
+ },
+ {
+ "sendmsg6: load prog with wrong expected attach type",
+ sendmsg6_rw_asm_prog_load,
+ BPF_CGROUP_UDP4_SENDMSG,
+ BPF_CGROUP_UDP6_SENDMSG,
+ AF_INET6,
+ SOCK_DGRAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ LOAD_REJECT,
+ },
+ {
+ "sendmsg6: attach prog with wrong attach type",
+ sendmsg6_rw_asm_prog_load,
+ BPF_CGROUP_UDP6_SENDMSG,
+ BPF_CGROUP_UDP4_SENDMSG,
+ AF_INET6,
+ SOCK_DGRAM,
+ NULL,
+ 0,
+ NULL,
+ 0,
+ NULL,
+ ATTACH_REJECT,
+ },
+ {
+ "sendmsg6: rewrite IP & port (asm)",
+ sendmsg6_rw_asm_prog_load,
+ BPF_CGROUP_UDP6_SENDMSG,
+ BPF_CGROUP_UDP6_SENDMSG,
+ AF_INET6,
+ SOCK_DGRAM,
+ SERV6_IP,
+ SERV6_PORT,
+ SERV6_REWRITE_IP,
+ SERV6_REWRITE_PORT,
+ SRC6_REWRITE_IP,
+ SUCCESS,
+ },
+ {
+ "sendmsg6: rewrite IP & port (C)",
+ sendmsg6_rw_c_prog_load,
+ BPF_CGROUP_UDP6_SENDMSG,
+ BPF_CGROUP_UDP6_SENDMSG,
+ AF_INET6,
+ SOCK_DGRAM,
+ SERV6_IP,
+ SERV6_PORT,
+ SERV6_REWRITE_IP,
+ SERV6_REWRITE_PORT,
+ SRC6_REWRITE_IP,
+ SUCCESS,
+ },
+ {
+ "sendmsg6: IPv4-mapped IPv6",
+ sendmsg6_rw_v4mapped_prog_load,
+ BPF_CGROUP_UDP6_SENDMSG,
+ BPF_CGROUP_UDP6_SENDMSG,
+ AF_INET6,
+ SOCK_DGRAM,
+ SERV6_IP,
+ SERV6_PORT,
+ SERV6_REWRITE_IP,
+ SERV6_REWRITE_PORT,
+ SRC6_REWRITE_IP,
+ SYSCALL_ENOTSUPP,
+ },
+ {
+ "sendmsg6: deny call",
+ sendmsg_deny_prog_load,
+ BPF_CGROUP_UDP6_SENDMSG,
+ BPF_CGROUP_UDP6_SENDMSG,
+ AF_INET6,
+ SOCK_DGRAM,
+ SERV6_IP,
+ SERV6_PORT,
+ SERV6_REWRITE_IP,
+ SERV6_REWRITE_PORT,
+ SRC6_REWRITE_IP,
+ SYSCALL_EPERM,
+ },
};
static int mk_sockaddr(int domain, const char *ip, unsigned short port,
@@ -540,6 +717,141 @@ static int connect6_prog_load(const struct sock_addr_test *test)
return load_path(test, CONNECT6_PROG_PATH);
}
+static int sendmsg_deny_prog_load(const struct sock_addr_test *test)
+{
+ struct bpf_insn insns[] = {
+ /* return 0 */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ };
+ return load_insns(test, insns, sizeof(insns) / sizeof(struct bpf_insn));
+}
+
+static int sendmsg4_rw_asm_prog_load(const struct sock_addr_test *test)
+{
+ struct sockaddr_in dst4_rw_addr;
+ struct in_addr src4_rw_ip;
+
+ if (inet_pton(AF_INET, SRC4_REWRITE_IP, (void *)&src4_rw_ip) != 1) {
+ log_err("Invalid IPv4: %s", SRC4_REWRITE_IP);
+ return -1;
+ }
+
+ if (mk_sockaddr(AF_INET, SERV4_REWRITE_IP, SERV4_REWRITE_PORT,
+ (struct sockaddr *)&dst4_rw_addr,
+ sizeof(dst4_rw_addr)) == -1)
+ return -1;
+
+ struct bpf_insn insns[] = {
+ BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+
+ /* if (sk.family == AF_INET && */
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_6,
+ offsetof(struct bpf_sock_addr, family)),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_7, AF_INET, 8),
+
+ /* sk.type == SOCK_DGRAM) { */
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_6,
+ offsetof(struct bpf_sock_addr, type)),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_7, SOCK_DGRAM, 6),
+
+ /* msg_src_ip4 = src4_rw_ip */
+ BPF_MOV32_IMM(BPF_REG_7, src4_rw_ip.s_addr),
+ BPF_STX_MEM(BPF_W, BPF_REG_6, BPF_REG_7,
+ offsetof(struct bpf_sock_addr, msg_src_ip4)),
+
+ /* user_ip4 = dst4_rw_addr.sin_addr */
+ BPF_MOV32_IMM(BPF_REG_7, dst4_rw_addr.sin_addr.s_addr),
+ BPF_STX_MEM(BPF_W, BPF_REG_6, BPF_REG_7,
+ offsetof(struct bpf_sock_addr, user_ip4)),
+
+ /* user_port = dst4_rw_addr.sin_port */
+ BPF_MOV32_IMM(BPF_REG_7, dst4_rw_addr.sin_port),
+ BPF_STX_MEM(BPF_W, BPF_REG_6, BPF_REG_7,
+ offsetof(struct bpf_sock_addr, user_port)),
+ /* } */
+
+ /* return 1 */
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ };
+
+ return load_insns(test, insns, sizeof(insns) / sizeof(struct bpf_insn));
+}
+
+static int sendmsg4_rw_c_prog_load(const struct sock_addr_test *test)
+{
+ return load_path(test, SENDMSG4_PROG_PATH);
+}
+
+static int sendmsg6_rw_dst_asm_prog_load(const struct sock_addr_test *test,
+ const char *rw_dst_ip)
+{
+ struct sockaddr_in6 dst6_rw_addr;
+ struct in6_addr src6_rw_ip;
+
+ if (inet_pton(AF_INET6, SRC6_REWRITE_IP, (void *)&src6_rw_ip) != 1) {
+ log_err("Invalid IPv6: %s", SRC6_REWRITE_IP);
+ return -1;
+ }
+
+ if (mk_sockaddr(AF_INET6, rw_dst_ip, SERV6_REWRITE_PORT,
+ (struct sockaddr *)&dst6_rw_addr,
+ sizeof(dst6_rw_addr)) == -1)
+ return -1;
+
+ struct bpf_insn insns[] = {
+ BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+
+ /* if (sk.family == AF_INET6) { */
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_6,
+ offsetof(struct bpf_sock_addr, family)),
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_7, AF_INET6, 18),
+
+#define STORE_IPV6_WORD_N(DST, SRC, N) \
+ BPF_MOV32_IMM(BPF_REG_7, SRC[N]), \
+ BPF_STX_MEM(BPF_W, BPF_REG_6, BPF_REG_7, \
+ offsetof(struct bpf_sock_addr, DST[N]))
+
+#define STORE_IPV6(DST, SRC) \
+ STORE_IPV6_WORD_N(DST, SRC, 0), \
+ STORE_IPV6_WORD_N(DST, SRC, 1), \
+ STORE_IPV6_WORD_N(DST, SRC, 2), \
+ STORE_IPV6_WORD_N(DST, SRC, 3)
+
+ STORE_IPV6(msg_src_ip6, src6_rw_ip.s6_addr32),
+ STORE_IPV6(user_ip6, dst6_rw_addr.sin6_addr.s6_addr32),
+
+ /* user_port = dst6_rw_addr.sin6_port */
+ BPF_MOV32_IMM(BPF_REG_7, dst6_rw_addr.sin6_port),
+ BPF_STX_MEM(BPF_W, BPF_REG_6, BPF_REG_7,
+ offsetof(struct bpf_sock_addr, user_port)),
+
+ /* } */
+
+ /* return 1 */
+ BPF_MOV64_IMM(BPF_REG_0, 1),
+ BPF_EXIT_INSN(),
+ };
+
+ return load_insns(test, insns, sizeof(insns) / sizeof(struct bpf_insn));
+}
+
+static int sendmsg6_rw_asm_prog_load(const struct sock_addr_test *test)
+{
+ return sendmsg6_rw_dst_asm_prog_load(test, SERV6_REWRITE_IP);
+}
+
+static int sendmsg6_rw_v4mapped_prog_load(const struct sock_addr_test *test)
+{
+ return sendmsg6_rw_dst_asm_prog_load(test, SERV6_V4MAPPED_IP);
+}
+
+static int sendmsg6_rw_c_prog_load(const struct sock_addr_test *test)
+{
+ return load_path(test, SENDMSG6_PROG_PATH);
+}
+
static int cmp_addr(const struct sockaddr_storage *addr1,
const struct sockaddr_storage *addr2, int cmp_port)
{
@@ -656,6 +968,135 @@ static int connect_to_server(int type, const struct sockaddr_storage *addr,
return fd;
}
+int init_pktinfo(int domain, struct cmsghdr *cmsg)
+{
+ struct in6_pktinfo *pktinfo6;
+ struct in_pktinfo *pktinfo4;
+
+ if (domain == AF_INET) {
+ cmsg->cmsg_level = SOL_IP;
+ cmsg->cmsg_type = IP_PKTINFO;
+ cmsg->cmsg_len = CMSG_LEN(sizeof(struct in_pktinfo));
+ pktinfo4 = (struct in_pktinfo *)CMSG_DATA(cmsg);
+ memset(pktinfo4, 0, sizeof(struct in_pktinfo));
+ if (inet_pton(domain, SRC4_IP,
+ (void *)&pktinfo4->ipi_spec_dst) != 1)
+ return -1;
+ } else if (domain == AF_INET6) {
+ cmsg->cmsg_level = SOL_IPV6;
+ cmsg->cmsg_type = IPV6_PKTINFO;
+ cmsg->cmsg_len = CMSG_LEN(sizeof(struct in6_pktinfo));
+ pktinfo6 = (struct in6_pktinfo *)CMSG_DATA(cmsg);
+ memset(pktinfo6, 0, sizeof(struct in6_pktinfo));
+ if (inet_pton(domain, SRC6_IP,
+ (void *)&pktinfo6->ipi6_addr) != 1)
+ return -1;
+ } else {
+ return -1;
+ }
+
+ return 0;
+}
+
+static int sendmsg_to_server(const struct sockaddr_storage *addr,
+ socklen_t addr_len, int set_cmsg, int *syscall_err)
+{
+ union {
+ char buf[CMSG_SPACE(sizeof(struct in6_pktinfo))];
+ struct cmsghdr align;
+ } control6;
+ union {
+ char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
+ struct cmsghdr align;
+ } control4;
+ struct msghdr hdr;
+ struct iovec iov;
+ char data = 'a';
+ int domain;
+ int fd = -1;
+
+ domain = addr->ss_family;
+
+ if (domain != AF_INET && domain != AF_INET6) {
+ log_err("Unsupported address family");
+ goto err;
+ }
+
+ fd = socket(domain, SOCK_DGRAM, 0);
+ if (fd == -1) {
+ log_err("Failed to create client socket");
+ goto err;
+ }
+
+ memset(&iov, 0, sizeof(iov));
+ iov.iov_base = &data;
+ iov.iov_len = sizeof(data);
+
+ memset(&hdr, 0, sizeof(hdr));
+ hdr.msg_name = (void *)addr;
+ hdr.msg_namelen = addr_len;
+ hdr.msg_iov = &iov;
+ hdr.msg_iovlen = 1;
+
+ if (set_cmsg) {
+ if (domain == AF_INET) {
+ hdr.msg_control = &control4;
+ hdr.msg_controllen = sizeof(control4.buf);
+ } else if (domain == AF_INET6) {
+ hdr.msg_control = &control6;
+ hdr.msg_controllen = sizeof(control6.buf);
+ }
+ if (init_pktinfo(domain, CMSG_FIRSTHDR(&hdr))) {
+ log_err("Fail to init pktinfo");
+ goto err;
+ }
+ }
+
+ if (sendmsg(fd, &hdr, 0) != sizeof(data)) {
+ log_err("Fail to send message to server");
+ *syscall_err = errno;
+ goto err;
+ }
+
+ goto out;
+err:
+ close(fd);
+ fd = -1;
+out:
+ return fd;
+}
+
+static int recvmsg_from_client(int sockfd, struct sockaddr_storage *src_addr)
+{
+ struct timeval tv;
+ struct msghdr hdr;
+ struct iovec iov;
+ char data[64];
+ fd_set rfds;
+
+ FD_ZERO(&rfds);
+ FD_SET(sockfd, &rfds);
+
+ tv.tv_sec = 2;
+ tv.tv_usec = 0;
+
+ if (select(sockfd + 1, &rfds, NULL, NULL, &tv) <= 0 ||
+ !FD_ISSET(sockfd, &rfds))
+ return -1;
+
+ memset(&iov, 0, sizeof(iov));
+ iov.iov_base = data;
+ iov.iov_len = sizeof(data);
+
+ memset(&hdr, 0, sizeof(hdr));
+ hdr.msg_name = src_addr;
+ hdr.msg_namelen = sizeof(struct sockaddr_storage);
+ hdr.msg_iov = &iov;
+ hdr.msg_iovlen = 1;
+
+ return recvmsg(sockfd, &hdr, 0);
+}
+
static int init_addrs(const struct sock_addr_test *test,
struct sockaddr_storage *requested_addr,
struct sockaddr_storage *expected_addr,
@@ -753,6 +1194,69 @@ static int run_connect_test_case(const struct sock_addr_test *test)
return err;
}
+static int run_sendmsg_test_case(const struct sock_addr_test *test)
+{
+ socklen_t addr_len = sizeof(struct sockaddr_storage);
+ struct sockaddr_storage expected_src_addr;
+ struct sockaddr_storage requested_addr;
+ struct sockaddr_storage expected_addr;
+ struct sockaddr_storage real_src_addr;
+ int clientfd = -1;
+ int servfd = -1;
+ int set_cmsg;
+ int err = 0;
+
+ if (test->type != SOCK_DGRAM)
+ goto err;
+
+ if (init_addrs(test, &requested_addr, &expected_addr,
+ &expected_src_addr))
+ goto err;
+
+ /* Prepare server to sendmsg to */
+ servfd = start_server(test->type, &expected_addr, addr_len);
+ if (servfd == -1)
+ goto err;
+
+ for (set_cmsg = 0; set_cmsg <= 1; ++set_cmsg) {
+ if (clientfd >= 0)
+ close(clientfd);
+
+ clientfd = sendmsg_to_server(&requested_addr, addr_len,
+ set_cmsg, &err);
+ if (err)
+ goto out;
+ else if (clientfd == -1)
+ goto err;
+
+ /* Try to receive message on server instead of using
+ * getpeername(2) on client socket, to check that client's
+ * destination address was rewritten properly, since
+ * getpeername(2) doesn't work with unconnected datagram
+ * sockets.
+ *
+ * Get source address from recvmsg(2) as well to make sure
+ * source was rewritten properly: getsockname(2) can't be used
+ * since socket is unconnected and source defined for one
+ * specific packet may differ from the one used by default and
+ * returned by getsockname(2).
+ */
+ if (recvmsg_from_client(servfd, &real_src_addr) == -1)
+ goto err;
+
+ if (cmp_addr(&real_src_addr, &expected_src_addr, /*cmp_port*/0))
+ goto err;
+ }
+
+ goto out;
+err:
+ err = -1;
+out:
+ close(clientfd);
+ close(servfd);
+ return err;
+}
+
static int run_test_case(int cgfd, const struct sock_addr_test *test)
{
int progfd = -1;
@@ -784,10 +1288,24 @@ static int run_test_case(int cgfd, const struct sock_addr_test *test)
case BPF_CGROUP_INET6_CONNECT:
err = run_connect_test_case(test);
break;
+ case BPF_CGROUP_UDP4_SENDMSG:
+ case BPF_CGROUP_UDP6_SENDMSG:
+ err = run_sendmsg_test_case(test);
+ break;
default:
goto err;
}
+ if (test->expected_result == SYSCALL_EPERM && err == EPERM) {
+ err = 0; /* error was expected, reset it */
+ goto out;
+ }
+
+ if (test->expected_result == SYSCALL_ENOTSUPP && err == ENOTSUPP) {
+ err = 0; /* error was expected, reset it */
+ goto out;
+ }
+
if (err || test->expected_result != SUCCESS)
goto err;
--
2.9.5
^ permalink raw reply related
* [PATCH v3 bpf-next 2/5] bpf: Sync bpf.h to tools/
From: Andrey Ignatov @ 2018-05-25 5:09 UTC (permalink / raw)
To: netdev; +Cc: Andrey Ignatov, davem, kafai, ast, daniel, kernel-team
In-Reply-To: <cover.1527224903.git.rdna@fb.com>
Sync new `BPF_CGROUP_UDP4_SENDMSG` and `BPF_CGROUP_UDP6_SENDMSG`
attach types to tools/.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
---
tools/include/uapi/linux/bpf.h | 8 ++++++++
1 file changed, 8 insertions(+)
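Note on byte order: the new msg_src_ip4 and msg_src_ip6 fields are documented as stored in network byte order, so host-order constants have to be converted before they are compared or stored (the selftests in this series use bpf_htonl() for that). A minimal userspace sketch of the same idea follows; ip4_const_matches() is a hypothetical name used only for illustration.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if the host-order constant, once converted with htonl(),
 * equals the address parsed from the dotted-quad string; 0 if it does
 * not; -1 if the string is not a valid IPv4 address.
 */
static int ip4_const_matches(uint32_t host_order, const char *dotted)
{
	struct in_addr addr;

	if (inet_pton(AF_INET, dotted, &addr) != 1)
		return -1;

	/* inet_pton() already yields network byte order in s_addr. */
	return addr.s_addr == htonl(host_order);
}
```

For instance, 0x7f000004U corresponds to "127.0.0.4" and 0xAC100001U to "172.16.0.1", which are the pairs used by sendmsg4_prog.c and test_sock_addr.c.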
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 9b8c6e3..cc68787 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -160,6 +160,8 @@ enum bpf_attach_type {
BPF_CGROUP_INET6_CONNECT,
BPF_CGROUP_INET4_POST_BIND,
BPF_CGROUP_INET6_POST_BIND,
+ BPF_CGROUP_UDP4_SENDMSG,
+ BPF_CGROUP_UDP6_SENDMSG,
__MAX_BPF_ATTACH_TYPE
};
@@ -2363,6 +2365,12 @@ struct bpf_sock_addr {
__u32 family; /* Allows 4-byte read, but no write */
__u32 type; /* Allows 4-byte read, but no write */
__u32 protocol; /* Allows 4-byte read, but no write */
+ __u32 msg_src_ip4; /* Allows 1,2,4-byte read and 4-byte write.
+ * Stored in network byte order.
+ */
+ __u32 msg_src_ip6[4]; /* Allows 1,2,4-byte read and 4-byte write.
+ * Stored in network byte order.
+ */
};
/* User bpf_sock_ops struct to access socket values and specify request ops
--
2.9.5
^ permalink raw reply related
* [PATCH v3 bpf-next 3/5] libbpf: Support guessing sendmsg{4,6} progs
From: Andrey Ignatov @ 2018-05-25 5:09 UTC (permalink / raw)
To: netdev; +Cc: Andrey Ignatov, davem, kafai, ast, daniel, kernel-team
In-Reply-To: <cover.1527224903.git.rdna@fb.com>
libbpf can guess prog type and expected attach type based on section
name. Add hints for "cgroup/sendmsg4" and "cgroup/sendmsg6" section
names.
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
---
tools/lib/bpf/libbpf.c | 2 ++
1 file changed, 2 insertions(+)
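Note on the guessing itself: conceptually it is a table lookup from section name to attach type. The sketch below is a simplified, self-contained illustration of that idea and not the libbpf code; the sketch_* names and enum values are hypothetical, and exact string matching is an assumption made here for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel attach types; the numeric
 * values are illustrative only.
 */
enum sketch_attach_type {
	SKETCH_ATTACH_UNKNOWN = -1,
	SKETCH_UDP4_SENDMSG,
	SKETCH_UDP6_SENDMSG,
};

static const struct {
	const char *sec;
	enum sketch_attach_type type;
} sketch_sections[] = {
	{ "cgroup/sendmsg4", SKETCH_UDP4_SENDMSG },
	{ "cgroup/sendmsg6", SKETCH_UDP6_SENDMSG },
};

/* Map an ELF section name to an attach type, or report "unknown". */
static enum sketch_attach_type sketch_guess_attach_type(const char *sec)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_sections) / sizeof(sketch_sections[0]); i++)
		if (strcmp(sec, sketch_sections[i].sec) == 0)
			return sketch_sections[i].type;

	return SKETCH_ATTACH_UNKNOWN;
}
```

With such a table, a program placed in SEC("cgroup/sendmsg4") needs no explicit attach type from the loader.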
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index d20411e..b1a60ac 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -2043,6 +2043,8 @@ static const struct {
BPF_SA_PROG_SEC("cgroup/bind6", BPF_CGROUP_INET6_BIND),
BPF_SA_PROG_SEC("cgroup/connect4", BPF_CGROUP_INET4_CONNECT),
BPF_SA_PROG_SEC("cgroup/connect6", BPF_CGROUP_INET6_CONNECT),
+ BPF_SA_PROG_SEC("cgroup/sendmsg4", BPF_CGROUP_UDP4_SENDMSG),
+ BPF_SA_PROG_SEC("cgroup/sendmsg6", BPF_CGROUP_UDP6_SENDMSG),
BPF_S_PROG_SEC("cgroup/post_bind4", BPF_CGROUP_INET4_POST_BIND),
BPF_S_PROG_SEC("cgroup/post_bind6", BPF_CGROUP_INET6_POST_BIND),
};
--
2.9.5
^ permalink raw reply related
* Re: [PATCH net-next] net:sched: add action inheritdsfield to skbmod
From: Fu, Qiaobin @ 2018-05-25 5:45 UTC (permalink / raw)
To: Marcelo Ricardo Leitner
Cc: davem@davemloft.net, netdev@vger.kernel.org, jhs@mojatatu.com,
Michel Machado
In-Reply-To: <20180523210628.GK5488@localhost.localdomain>
Hi Marcelo,
Thanks for pointing out these style issues. Below is the updated version:
---
The new action inheritdsfield copies the field DS of
IPv4 and IPv6 packets into skb->priority. This enables
later classification of packets based on the DS field.
Original idea by Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Qiaobin Fu <qiaobinf@bu.edu>
Reviewed-by: Michel Machado <michel@digirati.com.br>
---
Note that the motivation for this patch is found in the following discussion:
https://www.spinics.net/lists/netdev/msg501061.html
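
For readers unfamiliar with the header layout, here is a minimal userspace sketch of what the ">> 2" in the patch does: the DS field byte carries the 6-bit DSCP in its upper bits and two ECN bits below it. The function names are hypothetical and operate on raw header bytes; the patch itself uses the kernel helpers ipv4_get_dsfield() and ipv6_get_dsfield().

```c
#include <stdint.h>

/* IPv4: the DS field is the second header byte (the former TOS byte). */
static uint8_t sketch_ipv4_dsfield(const uint8_t *hdr)
{
	return hdr[1];
}

/* IPv6: 4 bits of version, then the 8-bit traffic class, which
 * straddles the first two header bytes.
 */
static uint8_t sketch_ipv6_dsfield(const uint8_t *hdr)
{
	return (uint8_t)((hdr[0] << 4) | (hdr[1] >> 4));
}

/* Drop the two low-order ECN bits, leaving the 6-bit DSCP. */
static uint8_t sketch_dscp(uint8_t dsfield)
{
	return dsfield >> 2;
}
```

For example, a DS field of 0xb8 (Expedited Forwarding) gives a DSCP of 46, which is the value that would end up in skb->priority.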
---
diff --git a/include/uapi/linux/tc_act/tc_skbmod.h b/include/uapi/linux/tc_act/tc_skbmod.h
index 38c072f..0718b48 100644
--- a/include/uapi/linux/tc_act/tc_skbmod.h
+++ b/include/uapi/linux/tc_act/tc_skbmod.h
@@ -19,6 +19,7 @@
#define SKBMOD_F_SMAC 0x2
#define SKBMOD_F_ETYPE 0x4
#define SKBMOD_F_SWAPMAC 0x8
+#define SKBMOD_F_INHERITDSFIELD 0x10
struct tc_skbmod {
tc_gen;
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
index ad050d7..e2082f6 100644
--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -16,6 +16,9 @@
#include <linux/rtnetlink.h>
#include <net/netlink.h>
#include <net/pkt_sched.h>
+#include <net/ip.h>
+#include <net/ipv6.h>
+#include <net/dsfield.h>
#include <linux/tc_act/tc_skbmod.h>
#include <net/tc_act/tc_skbmod.h>
@@ -72,6 +75,26 @@ static int tcf_skbmod_run(struct sk_buff *skb, const struct tc_action *a,
ether_addr_copy(eth_hdr(skb)->h_source, (u8 *)tmpaddr);
}
+ if (flags & SKBMOD_F_INHERITDSFIELD) {
+ int wlen = skb_network_offset(skb);
+
+ switch (tc_skb_protocol(skb)) {
+ case htons(ETH_P_IP):
+ wlen += sizeof(struct iphdr);
+ if (!pskb_may_pull(skb, wlen))
+ return TC_ACT_SHOT;
+ skb->priority = ipv4_get_dsfield(ip_hdr(skb)) >> 2;
+ break;
+
+ case htons(ETH_P_IPV6):
+ wlen += sizeof(struct ipv6hdr);
+ if (!pskb_may_pull(skb, wlen))
+ return TC_ACT_SHOT;
+ skb->priority = ipv6_get_dsfield(ipv6_hdr(skb)) >> 2;
+ break;
+ }
+ }
+
return action;
}
@@ -127,6 +150,9 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
if (parm->flags & SKBMOD_F_SWAPMAC)
lflags = SKBMOD_F_SWAPMAC;
+ if (parm->flags & SKBMOD_F_INHERITDSFIELD)
+ lflags |= SKBMOD_F_INHERITDSFIELD;
+
exists = tcf_idr_check(tn, parm->index, a, bind);
if (exists && bind)
return 0;
> On May 23, 2018, at 2:06 PM, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote:
>
> Hi,
>
> Some style fixes:
>
> On Thu, May 17, 2018 at 07:33:08PM +0000, Fu, Qiaobin wrote:
>> net/sched: add action inheritdsfield to skbmod
>
> This extra line above should not be here.
>
>>
>> The new action inheritdsfield copies the field DS of
>> IPv4 and IPv6 packets into skb->prioriry. This enables
> typo -----^
>
>> later classification of packets based on the DS field.
>>
>> Original idea by Jamal Hadi Salim <jhs@mojatatu.com>
>>
>> Signed-off-by: Qiaobin Fu <qiaobinf@bu.edu>
>> Reviewed-by: Michel Machado <michel@digirati.com.br>
>> ---
>>
>> Note that the motivation for this patch is found in the following discussion:
>> https://www.spinics.net/lists/netdev/msg501061.html
>> ---
>>
>> diff --git a/include/uapi/linux/tc_act/tc_skbmod.h b/include/uapi/linux/tc_act/tc_skbmod.h
>> index 38c072f..0718b48 100644
>> --- a/include/uapi/linux/tc_act/tc_skbmod.h
>> +++ b/include/uapi/linux/tc_act/tc_skbmod.h
>> @@ -19,6 +19,7 @@
>> #define SKBMOD_F_SMAC 0x2
>> #define SKBMOD_F_ETYPE 0x4
>> #define SKBMOD_F_SWAPMAC 0x8
>> +#define SKBMOD_F_INHERITDSFIELD 0x10
>>
>> struct tc_skbmod {
>> tc_gen;
>> diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
>> index ad050d7..21d5bec 100644
>> --- a/net/sched/act_skbmod.c
>> +++ b/net/sched/act_skbmod.c
>> @@ -16,6 +16,9 @@
>> #include <linux/rtnetlink.h>
>> #include <net/netlink.h>
>> #include <net/pkt_sched.h>
>> +#include <net/ip.h>
>> +#include <net/ipv6.h>
>> +#include <net/dsfield.h>
>>
>> #include <linux/tc_act/tc_skbmod.h>
>> #include <net/tc_act/tc_skbmod.h>
>> @@ -72,6 +75,25 @@ static int tcf_skbmod_run(struct sk_buff *skb, const struct tc_action *a,
>> ether_addr_copy(eth_hdr(skb)->h_source, (u8 *)tmpaddr);
>> }
>>
>> + if (flags & SKBMOD_F_INHERITDSFIELD) {
>> + int wlen = skb_network_offset(skb);
>
> You need a blank line here, between var declaration and the rest.
>
>> + switch (tc_skb_protocol(skb)) {
>> + case htons(ETH_P_IP):
>> + wlen += sizeof(struct iphdr);
>> + if (!pskb_may_pull(skb, wlen))
>> + return TC_ACT_SHOT;
>> + skb->priority = ipv4_get_dsfield(ip_hdr(skb)) >> 2;
>> + break;
>> +
>> + case htons(ETH_P_IPV6):
>> + wlen += sizeof(struct ipv6hdr);
>> + if (!pskb_may_pull(skb, wlen))
>> + return TC_ACT_SHOT;
>> + skb->priority = ipv6_get_dsfield(ipv6_hdr(skb)) >> 2;
>> + break;
>> + }
>> + }
>> +
>> return action;
>> }
>>
>> @@ -127,6 +149,9 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
>> if (parm->flags & SKBMOD_F_SWAPMAC)
>> lflags = SKBMOD_F_SWAPMAC;
>>
>> + if (parm->flags & SKBMOD_F_INHERITDSFIELD)
>> + lflags |= SKBMOD_F_INHERITDSFIELD;
>> +
>> exists = tcf_idr_check(tn, parm->index, a, bind);
>> if (exists && bind)
>> return 0;
^ permalink raw reply related
* Re: [PATCH 1/6] ravb: remove custom .nway_reset from ethtool ops
From: Vladimir Zapolskiy @ 2018-05-25 6:05 UTC (permalink / raw)
To: Sergei Shtylyov, Andrew Lunn
Cc: Vladimir Zapolskiy, David S. Miller, netdev, linux-renesas-soc
In-Reply-To: <f6e0b8c5-7fb9-babe-0114-350e7d6b2186@cogentembedded.com>
Hello Sergei,
On 05/24/2018 08:01 PM, Sergei Shtylyov wrote:
> On 05/24/2018 07:44 PM, Andrew Lunn wrote:
>
>>>>>> The change fixes a sleep in atomic context issue, which can be
>>>>>> always triggered by running 'ethtool -r' command, because
>>>>>> phy_start_aneg() protects phydev fields by a mutex.
>>>
>>> You don't say that *not* grabbing the spinlock is safe...
I say both that it is the fix and that it is safe; I've already described
the function of the spinlock in my comments, and it is more or less
clear from the driver code.
>>
>> For it to be unsafe, i think that would mean phylib would need to call
>> back into the MAC driver? The only way that could happen is via the
>> adjust_link call. And that will deadlock, since it takes the same
>> lock.
>>
>> Or am i/we missing something?
>
> It doesn't take any locks currently, only patches #3/#6 makes it do so...
And that's the proper fix in my opinion; my tests don't reveal any issues.
^ permalink raw reply
* Re: [PATCH net-next] net: phy: convert further flags in struct phy_device to bit-field
From: Heiner Kallweit @ 2018-05-25 6:22 UTC (permalink / raw)
To: Florian Fainelli, Andrew Lunn, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <7a4cd053-74aa-17a9-1ce2-da4e3ae21809@gmail.com>
Am 25.05.2018 um 01:03 schrieb Florian Fainelli:
> On 05/24/2018 01:15 PM, Heiner Kallweit wrote:
>> This patch is a follow-up to 87e5808d52b6 ("net: phy: replace bool
>> members in struct phy_device with bit-fields") and converts further
>> flags to bit-fields.
>
> This looks fine, but then you would also have to clean-up all code that
> does phydev->asym_pause = 1 and phydev->pause = 1 to use true/false
> instead, I am not sure there is much value in doing that for these
> fields considering that they are exposed to drivers so there is a risk
> of possible breakage.
>
I grepped over drivers/net and all assignments to pause / asym_pause
use 0, 1, or the result of a logical operation only.
However you're right, there's one potential issue:
struct ethtool_pauseparam defines rx_pause and tx_pause as __u32,
and its kernel doc mentions only the autoneg flag as being sanitized
(even though rx_pause and tx_pause are called "flags" too).
So basically no driver implementing the ethtool set_pauseparam callback
checks the user space input for these two values.
However this could be easily fixed by implementing these checks in the
core. In ethtool_set_pauseparam() we could do for autoneg, rx_pause
and tx_pause: val = !!val to sanitize them.
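To illustrate the concern, here is a rough userspace sketch, not a proposed kernel patch. The sketch_* types are hypothetical stand-ins for the flag-like fields of struct ethtool_pauseparam and for a one-bit member of struct phy_device.

```c
#include <stdint.h>

/* Hypothetical stand-in for the flag-like fields of
 * struct ethtool_pauseparam.
 */
struct sketch_pauseparam {
	uint32_t autoneg;
	uint32_t rx_pause;
	uint32_t tx_pause;
};

/* Hypothetical stand-in for a one-bit flag as in struct phy_device. */
struct sketch_phy_flags {
	unsigned pause:1;
};

/* Collapse any non-zero user input to exactly 1. */
static void sketch_sanitize_pauseparam(struct sketch_pauseparam *p)
{
	p->autoneg = !!p->autoneg;
	p->rx_pause = !!p->rx_pause;
	p->tx_pause = !!p->tx_pause;
}

/* Store val into a one-bit field and return what was actually kept.
 * Only the lowest bit survives, so an unsanitized 2 reads back as 0.
 */
static unsigned sketch_store_pause(uint32_t val)
{
	struct sketch_phy_flags f;

	f.pause = val;
	return f.pause;
}
```

Without sanitizing, a user space value of 2 would silently be stored as 0 in the bit-field; after !!val it is stored as 1.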
How would you see it after adding this?
Heiner
> Thanks!
>
>>
>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>> ---
>> include/linux/phy.h | 17 ++++++++---------
>> 1 file changed, 8 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/phy.h b/include/linux/phy.h
>> index 6cd090984..cc66f2834 100644
>> --- a/include/linux/phy.h
>> +++ b/include/linux/phy.h
>> @@ -418,21 +418,20 @@ struct phy_device {
>> /* The most recently read link state */
>> unsigned link:1;
>>
>> + /* forced speed & duplex (no autoneg)
>> + * partner speed & duplex & pause (autoneg)
>> + */
>> + unsigned pause:1;
>> + unsigned asym_pause:1;
>> + int speed;
>> + int duplex;
>> +
>> enum phy_state state;
>>
>> u32 dev_flags;
>>
>> phy_interface_t interface;
>>
>> - /*
>> - * forced speed & duplex (no autoneg)
>> - * partner speed & duplex & pause (autoneg)
>> - */
>> - int speed;
>> - int duplex;
>> - int pause;
>> - int asym_pause;
>> -
>> /* Enabled Interrupts */
>> u32 interrupts;
>>
>>
>
>
^ permalink raw reply
* Re: [PATCH 0/6] ravb/sh_eth: fix sleep in atomic by reusing shared ethtool handlers
From: Vladimir Zapolskiy @ 2018-05-25 6:25 UTC (permalink / raw)
To: Sergei Shtylyov, David S. Miller; +Cc: netdev, linux-renesas-soc
In-Reply-To: <88a3af9d-7aa8-5c60-2625-6b529bd8d93c@cogentembedded.com>
Hello Sergei,
On 05/24/2018 08:24 PM, Sergei Shtylyov wrote:
> On 05/24/2018 07:40 PM, Sergei Shtylyov wrote:
>
>>> For ages trivial changes to RAVB and SuperH ethernet links by means of
>>> standard 'ethtool' trigger a 'sleeping function called from invalid
>>> context' bug, to visualize it on r8a7795 ULCB:
>>>
>>> % ethtool -r eth0
>>> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
>>> in_atomic(): 1, irqs_disabled(): 128, pid: 554, name: ethtool
>>> INFO: lockdep is turned off.
>>> irq event stamp: 0
>>> hardirqs last enabled at (0): [<0000000000000000>] (null)
>>> hardirqs last disabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918
>>> softirqs last enabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918
>>> softirqs last disabled at (0): [<0000000000000000>] (null)
>>> CPU: 5 PID: 554 Comm: ethtool Not tainted 4.17.0-rc4-arm64-renesas+ #33
>>> Hardware name: Renesas H3ULCB board based on r8a7795 ES2.0+ (DT)
>>> Call trace:
>>> dump_backtrace+0x0/0x198
>>> show_stack+0x24/0x30
>>> dump_stack+0xb8/0xf4
>>> ___might_sleep+0x1c8/0x1f8
>>> __might_sleep+0x58/0x90
>>> __mutex_lock+0x50/0x890
>>> mutex_lock_nested+0x3c/0x50
>>> phy_start_aneg_priv+0x38/0x180
>>> phy_start_aneg+0x24/0x30
>>> ravb_nway_reset+0x3c/0x68
>>> dev_ethtool+0x3dc/0x2338
>>> dev_ioctl+0x19c/0x490
>>> sock_do_ioctl+0xe0/0x238
>>> sock_ioctl+0x254/0x460
>>> do_vfs_ioctl+0xb0/0x918
>>> ksys_ioctl+0x50/0x80
>>> sys_ioctl+0x34/0x48
>>> __sys_trace_return+0x0/0x4
>>>
>>> The root cause is that an attempt to modify ECMR and GECMR registers
>>> only when RX/TX function is disabled was too overcomplicated in its
>>> original implementation, also processing of an optional Link Change
>>> interrupt added even more complexity, as a result the implementation
>>> was error prone.
>>>
>>> The new locking scheme is confirmed to be correct by dumping driver
>>> specific and generic PHY framework function calls with aid of ftrace
>>> while running more or less advanced tests.
>>>
>>> Please note that sh_eth patches from the series were built-tested only.
>>>
>>> On purpose I do not add Fixes tags, the reused PHY handlers were added
>>> way later than the fixed problems were firstly found in the drivers.
>>
>> I think you went one step too far with these fixes. On the first glance,
>> the real fixes are to remove grabbing/releasing the spinlock for the duration
>> of the phylib calls. Am I right? If so, making use of the new phylib APIs
>> would be a further enhancement, it's not needed for fixing the splats per se...
>
> Note that I hadn't looked at the patches #3/#6 at the time of writing this;
> those seem to be more complicated than the rest.
Right, the simplistic approach of just removing the held spinlock does
not fit well into the overall lame locking model found in the driver.
The thing is that I would prefer to exhibit 'remove custom callbacks'
side of the changes as it is done now, and fixing severe 'invalid context'
bugs is left as a valuable side effect. I may attempt to find enough
free time to follow your instructions, but frankly speaking I don't
see it beneficial to split a single good all-sufficient change into
three or more: removal of spinlocks, replacement of phy_start_aneg(),
then a non-functional clean-up. Bikeshedding isn't my preference,
but a report about technical flaws related to the published changes
is appreciated, otherwise let me ask you to accept the changes as is,
secondary optimizations can be done on top of them.
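
For readers following the thread, the failure mode in the quoted trace can be sketched in plain userspace C. This is only a model under stated assumptions: every name below (ctx_model, model_spin_lock, model_sleeping_call, nway_reset_locked, nway_reset_unlocked) is hypothetical and none of it is driver or phylib code. It shows why a call that may sleep trips the check when made with a spinlock held, and why dropping the lock around the call avoids the splat.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: hypothetical stand-ins, not kernel APIs. */
struct ctx_model {
	int atomic_depth;	/* > 0 while a (modelled) spinlock is held */
	bool splat;		/* set when a sleeping call ran in atomic context */
};

static void model_spin_lock(struct ctx_model *c)
{
	c->atomic_depth++;
}

static void model_spin_unlock(struct ctx_model *c)
{
	c->atomic_depth--;
}

/* Models a call that may sleep, e.g. one that takes a mutex internally. */
static void model_sleeping_call(struct ctx_model *c)
{
	if (c->atomic_depth > 0)
		c->splat = true;
}

/* Pattern behind the trace: the sleeping call is made with the lock held. */
static bool nway_reset_locked(void)
{
	struct ctx_model c = { 0 };

	model_spin_lock(&c);
	model_sleeping_call(&c);
	model_spin_unlock(&c);
	return c.splat;
}

/* Minimal fix discussed in the thread: no spinlock across the call. */
static bool nway_reset_unlocked(void)
{
	struct ctx_model c = { 0 };

	model_sleeping_call(&c);
	return c.splat;
}
```

In this model nway_reset_locked() reports a splat and nway_reset_unlocked() does not. The series itself goes further and replaces the custom callbacks with shared handlers, which the model does not attempt to show.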
^ permalink raw reply
* Re: [PATCH 0/4] RFC CPSW switchdev mode
From: Ilias Apalodimas @ 2018-05-25 6:29 UTC (permalink / raw)
To: Andrew Lunn
Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko,
ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton
In-Reply-To: <20180524163310.GG5128@lunn.ch>
On Thu, May 24, 2018 at 06:33:10PM +0200, Andrew Lunn wrote:
> On Thu, May 24, 2018 at 07:02:54PM +0300, Ilias Apalodimas wrote:
> > On Thu, May 24, 2018 at 05:25:59PM +0200, Andrew Lunn wrote:
> > > O.K, back to the basic idea. Switch ports are just normal Linux
> > > interfaces.
> > >
> > > How would you configure this with two e1000e put in a bridge? I want
> > > multicast to be bridged between the two e1000e, but the host stack
> > > should not see the packets.
> > I am not sure i am following. I might be missing something. In your case you
> > got two ethernet pci/pcie interfaces bridged through software. You can filter
> > those if needed. In the case we are trying to cover, you got a hardware that
> > offers that capability. Since not all switches are pcie based shouldn't we be
> > able to allow this ?
>
> switchdev is about offloading what Linux can do to hardware to
> accelerate it. The switch is a block of accelerator hardware, like a
> GPU is for accelerating graphics. Linux can render OpenGL, but it is
> better to hand it over to the GPU accelerator.
>
> Same applies here. The Linux bridge can bridge multicast. Using the
> switchdev API, you can push that down to the accelerator, and let it
> do it.
>
> So you need to think about, how do you make the Linux bridge not pass
> multicast traffic to the host stack. Then how do you extend the
> switchdev API so you can push this down to the accelerator.
>
> To really get switchdev, you often need to pivot your point of view a
> bit. People often think, switchdev is about writing drivers for
> switches. Its not, its about how you offload networking which Linux
> can do down to a switch. And if the switch cannot accelerate it, you
> leave Linux to do it.
>
> When you get in the details, i think you will find the switchdev API
> actually already has what you need for this use case. What you need to
> figure out is how you make the Linux bridge not pass multicast to the
> host. Well, actually, not pass multicast it has not asked for. Then
> accelerate it.
>
Understood. If we missed anything in the handling of multicast for
the CPU port, we'll go back and fix it (I am assuming snooping is the answer
here). Multicast is only one part of the equation, though. What about the
need for VLANs/FDBs on that port?
Ilias
^ permalink raw reply
* Re: [PATCH net-next 0/8] nfp: offload LAG for tc flower egress
From: Jiri Pirko @ 2018-05-25 6:48 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, oss-drivers, Jay Vosburgh, Veaceslav Falico,
Andy Gospodarek
In-Reply-To: <20180524022255.18548-1-jakub.kicinski@netronome.com>
Thu, May 24, 2018 at 04:22:47AM CEST, jakub.kicinski@netronome.com wrote:
>Hi!
>
>This series from John adds bond offload to the nfp driver. Patch 5
>exposes the hash type for NETDEV_LAG_TX_TYPE_HASH to make sure nfp
>hashing matches that of the software LAG. This may be unnecessarily
>conservative, let's see what LAG maintainers think :)
So you need to restrict the offload to only certain hash algorithms? In mlxsw,
we just ignore the LAG setting and do some HW default hashing. Would that not
be enough? Note that there's a good reason for it: as you see, in team, the
hashing is done in a BPF function and could be totally arbitrary.
Your patchset effectively disables team offload for nfp.
^ permalink raw reply
* Re: [PATCH net-next] net: stmmac: Add PPS and Flexible PPS support
From: kbuild test robot @ 2018-05-25 6:49 UTC (permalink / raw)
To: Jose Abreu
Cc: kbuild-all, netdev, Jose Abreu, David S. Miller, Joao Pinto,
Vitor Soares, Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <072478625b1cb3d4af9e3b42f83ece7303fd554e.1526993857.git.joabreu@synopsys.com>
[-- Attachment #1: Type: text/plain, Size: 681 bytes --]
Hi Jose,
I love your patch! Yet something to improve:
[auto build test ERROR on net-next/master]
url: https://github.com/0day-ci/linux/commits/Jose-Abreu/net-stmmac-Add-PPS-and-Flexible-PPS-support/20180525-074128
config: i386-allmodconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
>> ERROR: "__udivdi3" [drivers/net/ethernet/stmicro/stmmac/stmmac.ko] undefined!
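
Background on the error above: __udivdi3 is the compiler helper for unsigned 64-bit division, which the kernel does not provide on 32-bit x86, so this link failure usually points at a plain `/` with a 64-bit dividend. Which expression in the patch triggers it is not shown in this report. Kernel code normally uses div_u64() from <linux/math64.h> for a 64-bit dividend and 32-bit divisor. The sketch below is a hypothetical userspace model (div_u64_model and struct div_result are illustration names, not kernel code) of how such a division can be done with shifts and subtractions only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for the model below. */
struct div_result {
	uint64_t quot;
	uint32_t rem;
};

/*
 * Userspace model of a 64-by-32-bit division done bit by bit, so no
 * 64-bit divide operator (and thus no __udivdi3 call) is needed.
 * The divisor is assumed to be non-zero.
 */
static struct div_result div_u64_model(uint64_t dividend, uint32_t divisor)
{
	struct div_result r = { 0, 0 };
	uint64_t rem = 0;
	int bit;

	for (bit = 63; bit >= 0; bit--) {
		/* Bring down the next dividend bit, most significant first. */
		rem = (rem << 1) | ((dividend >> bit) & 1);
		if (rem >= divisor) {
			rem -= divisor;
			r.quot |= (uint64_t)1 << bit;
		}
	}
	r.rem = (uint32_t)rem;
	return r;
}
```

For example, div_u64_model(1000000000ULL, 3) yields a quotient of 333333333 and a remainder of 1.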
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 63135 bytes --]
^ permalink raw reply
* [PATCH V0:net-next 0/4] net: ethernet: stmmac: add support for stm32mp1
From: Christophe Roullier @ 2018-05-25 7:46 UTC (permalink / raw)
To: davem, joabreu, mcoquelin.stm32, alexandre.torgue,
peppe.cavallaro
Cc: linux-kernel, linux-arm-kernel, netdev, robh, christophe.roullier
Patches to have Ethernet support on stm32mp1
Christophe Roullier (4):
net: ethernet: stmmac: add adaptation for stm32mp157c.
dt-bindings: stm32-dwmac: add support of MPU families
net: stmmac: add dwmac-4.20a compatible
dt-bindings: stm32: add compatible for syscon
.../devicetree/bindings/arm/stm32/stm32-syscon.txt | 14 ++
.../devicetree/bindings/arm/{ => stm32}/stm32.txt | 0
.../devicetree/bindings/net/stm32-dwmac.txt | 18 +-
drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c | 267 +++++++++++++++++++--
.../net/ethernet/stmicro/stmmac/stmmac_platform.c | 3 +-
5 files changed, 284 insertions(+), 18 deletions(-)
create mode 100644 Documentation/devicetree/bindings/arm/stm32/stm32-syscon.txt
rename Documentation/devicetree/bindings/arm/{ => stm32}/stm32.txt (100%)
--
1.9.1
^ permalink raw reply
* [PATCH V0:net-next 1/4] net: ethernet: stmmac: add adaptation for stm32mp157c.
From: Christophe Roullier @ 2018-05-25 7:46 UTC (permalink / raw)
To: davem, joabreu, mcoquelin.stm32, alexandre.torgue,
peppe.cavallaro
Cc: linux-kernel, linux-arm-kernel, netdev, robh, christophe.roullier
In-Reply-To: <1527234401-15812-1-git-send-email-christophe.roullier@st.com>
Glue code to support the stm32mp157c device while staying
compatible with the stm32 MCU family.
Signed-off-by: Christophe Roullier <christophe.roullier@st.com>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
---
drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c | 267 ++++++++++++++++++++--
1 file changed, 252 insertions(+), 15 deletions(-)
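
Reading aid: the patch below selects a per-SoC table of callbacks from the matched compatible string and skips callbacks that are left unset. A minimal userspace sketch of that dispatch pattern follows. All names in it (ops_model, match_model, lookup_ops, init_model, the IF_* enum, MODEL_EINVAL) are hypothetical, and the set_mode bodies only mirror which interface modes each variant accepts in the patch; they do not touch any register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_EINVAL 22	/* stand-in for the kernel's EINVAL */

enum { IF_MII, IF_GMII, IF_RMII, IF_RGMII, IF_OTHER };

/* Per-variant callbacks; a NULL callback means "nothing to do". */
struct ops_model {
	uint32_t syscfg_eth_mask;
	int (*set_mode)(int interface);
};

/* MCU variant: only MII and RMII are handled. */
static int mcu_set_mode(int interface)
{
	return (interface == IF_MII || interface == IF_RMII) ? 0 : -MODEL_EINVAL;
}

/* MP1 variant: MII, GMII, RMII and RGMII are handled. */
static int mp1_set_mode(int interface)
{
	return interface == IF_OTHER ? -MODEL_EINVAL : 0;
}

static const struct ops_model mcu_ops = {
	.syscfg_eth_mask = (uint32_t)1 << 23,	/* BIT(23) */
	.set_mode = mcu_set_mode,
};

static const struct ops_model mp1_ops = {
	.syscfg_eth_mask = 0x00ff0000u,		/* GENMASK(23, 16) */
	.set_mode = mp1_set_mode,
};

/* Models the of_device_id table: compatible string -> ops. */
struct match_model {
	const char *compatible;
	const struct ops_model *data;
};

static const struct match_model match_table[] = {
	{ "st,stm32-dwmac", &mcu_ops },
	{ "st,stm32mp1-dwmac", &mp1_ops },
	{ NULL, NULL },
};

static const struct ops_model *lookup_ops(const char *compatible)
{
	const struct match_model *m;

	for (m = match_table; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	return NULL;
}

/* Models the probe/init path: fail without match data, skip unset hooks. */
static int init_model(const char *compatible, int interface)
{
	const struct ops_model *ops = lookup_ops(compatible);

	if (!ops)
		return -MODEL_EINVAL;
	if (ops->set_mode)
		return ops->set_mode(interface);
	return 0;
}
```

With this table, init_model("st,stm32-dwmac", IF_RGMII) fails while init_model("st,stm32mp1-dwmac", IF_RGMII) succeeds, which is the behavioural split the two set_mode callbacks in the patch introduce.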
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c
index 9e6db16..7e2e79d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c
@@ -16,49 +16,180 @@
#include <linux/of_net.h>
#include <linux/phy.h>
#include <linux/platform_device.h>
+#include <linux/pm_wakeirq.h>
#include <linux/regmap.h>
#include <linux/slab.h>
#include <linux/stmmac.h>
#include "stmmac_platform.h"
-#define MII_PHY_SEL_MASK BIT(23)
+#define SYSCFG_MCU_ETH_MASK BIT(23)
+#define SYSCFG_MP1_ETH_MASK GENMASK(23, 16)
+
+#define SYSCFG_PMCR_ETH_CLK_SEL BIT(16)
+#define SYSCFG_PMCR_ETH_REF_CLK_SEL BIT(17)
+#define SYSCFG_PMCR_ETH_SEL_MII BIT(20)
+#define SYSCFG_PMCR_ETH_SEL_RGMII BIT(21)
+#define SYSCFG_PMCR_ETH_SEL_RMII BIT(23)
+#define SYSCFG_PMCR_ETH_SEL_GMII 0
+#define SYSCFG_MCU_ETH_SEL_MII 0
+#define SYSCFG_MCU_ETH_SEL_RMII 1
struct stm32_dwmac {
struct clk *clk_tx;
struct clk *clk_rx;
+ struct clk *clk_eth_ck;
+ struct clk *clk_ethstp;
+ struct clk *syscfg_clk;
+ bool int_phyclk; /* Clock from RCC to drive PHY */
u32 mode_reg; /* MAC glue-logic mode register */
struct regmap *regmap;
u32 speed;
+ const struct stm32_ops *ops;
+ struct device *dev;
+};
+
+struct stm32_ops {
+ int (*set_mode)(struct plat_stmmacenet_data *plat_dat);
+ int (*clk_prepare)(struct stm32_dwmac *dwmac, bool prepare);
+ int (*suspend)(struct stm32_dwmac *dwmac);
+ void (*resume)(struct stm32_dwmac *dwmac);
+ int (*parse_data)(struct stm32_dwmac *dwmac,
+ struct device *dev);
+ u32 syscfg_eth_mask;
};
static int stm32_dwmac_init(struct plat_stmmacenet_data *plat_dat)
{
struct stm32_dwmac *dwmac = plat_dat->bsp_priv;
- u32 reg = dwmac->mode_reg;
- u32 val;
int ret;
- val = (plat_dat->interface == PHY_INTERFACE_MODE_MII) ? 0 : 1;
- ret = regmap_update_bits(dwmac->regmap, reg, MII_PHY_SEL_MASK, val);
- if (ret)
- return ret;
+ if (dwmac->ops->set_mode) {
+ ret = dwmac->ops->set_mode(plat_dat);
+ if (ret)
+ return ret;
+ }
ret = clk_prepare_enable(dwmac->clk_tx);
if (ret)
return ret;
- ret = clk_prepare_enable(dwmac->clk_rx);
- if (ret)
- clk_disable_unprepare(dwmac->clk_tx);
+ if (!dwmac->dev->power.is_suspended) {
+ ret = clk_prepare_enable(dwmac->clk_rx);
+ if (ret) {
+ clk_disable_unprepare(dwmac->clk_tx);
+ return ret;
+ }
+ }
+
+ if (dwmac->ops->clk_prepare) {
+ ret = dwmac->ops->clk_prepare(dwmac, true);
+ if (ret) {
+ clk_disable_unprepare(dwmac->clk_rx);
+ clk_disable_unprepare(dwmac->clk_tx);
+ }
+ }
return ret;
}
+static int stm32mp1_clk_prepare(struct stm32_dwmac *dwmac, bool prepare)
+{
+ int ret = 0;
+
+ if (prepare) {
+ ret = clk_prepare_enable(dwmac->syscfg_clk);
+ if (ret)
+ return ret;
+
+ if (dwmac->int_phyclk) {
+ ret = clk_prepare_enable(dwmac->clk_eth_ck);
+ if (ret) {
+ clk_disable_unprepare(dwmac->syscfg_clk);
+ return ret;
+ }
+ }
+ } else {
+ clk_disable_unprepare(dwmac->syscfg_clk);
+ if (dwmac->int_phyclk)
+ clk_disable_unprepare(dwmac->clk_eth_ck);
+ }
+ return ret;
+}
+
+static int stm32mp1_set_mode(struct plat_stmmacenet_data *plat_dat)
+{
+ struct stm32_dwmac *dwmac = plat_dat->bsp_priv;
+ u32 reg = dwmac->mode_reg;
+ int val;
+
+ switch (plat_dat->interface) {
+ case PHY_INTERFACE_MODE_MII:
+ val = SYSCFG_PMCR_ETH_SEL_MII;
+ pr_debug("SYSCFG init : PHY_INTERFACE_MODE_MII\n");
+ break;
+ case PHY_INTERFACE_MODE_GMII:
+ val = SYSCFG_PMCR_ETH_SEL_GMII;
+ if (dwmac->int_phyclk)
+ val |= SYSCFG_PMCR_ETH_CLK_SEL;
+ pr_debug("SYSCFG init : PHY_INTERFACE_MODE_GMII\n");
+ break;
+ case PHY_INTERFACE_MODE_RMII:
+ val = SYSCFG_PMCR_ETH_SEL_RMII;
+ if (dwmac->int_phyclk)
+ val |= SYSCFG_PMCR_ETH_REF_CLK_SEL;
+ pr_debug("SYSCFG init : PHY_INTERFACE_MODE_RMII\n");
+ break;
+ case PHY_INTERFACE_MODE_RGMII:
+ val = SYSCFG_PMCR_ETH_SEL_RGMII;
+ if (dwmac->int_phyclk)
+ val |= SYSCFG_PMCR_ETH_CLK_SEL;
+ pr_debug("SYSCFG init : PHY_INTERFACE_MODE_RGMII\n");
+ break;
+ default:
+ pr_debug("SYSCFG init : Do not manage %d interface\n",
+ plat_dat->interface);
+ /* Do not manage others interfaces */
+ return -EINVAL;
+ }
+
+ return regmap_update_bits(dwmac->regmap, reg,
+ dwmac->ops->syscfg_eth_mask, val);
+}
+
+static int stm32mcu_set_mode(struct plat_stmmacenet_data *plat_dat)
+{
+ struct stm32_dwmac *dwmac = plat_dat->bsp_priv;
+ u32 reg = dwmac->mode_reg;
+ int val;
+
+ switch (plat_dat->interface) {
+ case PHY_INTERFACE_MODE_MII:
+ val = SYSCFG_MCU_ETH_SEL_MII;
+ pr_debug("SYSCFG init : PHY_INTERFACE_MODE_MII\n");
+ break;
+ case PHY_INTERFACE_MODE_RMII:
+ val = SYSCFG_MCU_ETH_SEL_RMII;
+ pr_debug("SYSCFG init : PHY_INTERFACE_MODE_RMII\n");
+ break;
+ default:
+ pr_debug("SYSCFG init : Do not manage %d interface\n",
+ plat_dat->interface);
+ /* Do not manage others interfaces */
+ return -EINVAL;
+ }
+
+ return regmap_update_bits(dwmac->regmap, reg,
+ dwmac->ops->syscfg_eth_mask, val);
+}
+
static void stm32_dwmac_clk_disable(struct stm32_dwmac *dwmac)
{
clk_disable_unprepare(dwmac->clk_tx);
clk_disable_unprepare(dwmac->clk_rx);
+
+ if (dwmac->ops->clk_prepare)
+ dwmac->ops->clk_prepare(dwmac, false);
}
static int stm32_dwmac_parse_data(struct stm32_dwmac *dwmac,
@@ -70,15 +201,22 @@ static int stm32_dwmac_parse_data(struct stm32_dwmac *dwmac,
/* Get TX/RX clocks */
dwmac->clk_tx = devm_clk_get(dev, "mac-clk-tx");
if (IS_ERR(dwmac->clk_tx)) {
- dev_err(dev, "No tx clock provided...\n");
+ dev_err(dev, "No ETH Tx clock provided...\n");
return PTR_ERR(dwmac->clk_tx);
}
+
dwmac->clk_rx = devm_clk_get(dev, "mac-clk-rx");
if (IS_ERR(dwmac->clk_rx)) {
- dev_err(dev, "No rx clock provided...\n");
+ dev_err(dev, "No ETH Rx clock provided...\n");
return PTR_ERR(dwmac->clk_rx);
}
+ if (dwmac->ops->parse_data) {
+ err = dwmac->ops->parse_data(dwmac, dev);
+ if (err)
+ return err;
+ }
+
/* Get mode register */
dwmac->regmap = syscon_regmap_lookup_by_phandle(np, "st,syscon");
if (IS_ERR(dwmac->regmap))
@@ -91,11 +229,46 @@ static int stm32_dwmac_parse_data(struct stm32_dwmac *dwmac,
return err;
}
+static int stm32mp1_parse_data(struct stm32_dwmac *dwmac,
+ struct device *dev)
+{
+ struct device_node *np = dev->of_node;
+
+ dwmac->int_phyclk = of_property_read_bool(np, "st,int-phyclk");
+
+ /* Check if internal clk from RCC selected */
+ if (dwmac->int_phyclk) {
+ /* Get ETH_CLK clocks */
+ dwmac->clk_eth_ck = devm_clk_get(dev, "eth-ck");
+ if (IS_ERR(dwmac->clk_eth_ck)) {
+ dev_err(dev, "No ETH CK clock provided...\n");
+ return PTR_ERR(dwmac->clk_eth_ck);
+ }
+ }
+
+ /* Clock used for low power mode */
+ dwmac->clk_ethstp = devm_clk_get(dev, "ethstp");
+ if (IS_ERR(dwmac->clk_ethstp)) {
+ dev_err(dev, "No ETH peripheral clock provided for CStop mode ...\n");
+ return PTR_ERR(dwmac->clk_ethstp);
+ }
+
+ /* Clock for sysconfig */
+ dwmac->syscfg_clk = devm_clk_get(dev, "syscfg-clk");
+ if (IS_ERR(dwmac->syscfg_clk)) {
+ dev_err(dev, "No syscfg clock provided...\n");
+ return PTR_ERR(dwmac->syscfg_clk);
+ }
+
+ return 0;
+}
+
static int stm32_dwmac_probe(struct platform_device *pdev)
{
struct plat_stmmacenet_data *plat_dat;
struct stmmac_resources stmmac_res;
struct stm32_dwmac *dwmac;
+ const struct stm32_ops *data;
int ret;
ret = stmmac_get_platform_resources(pdev, &stmmac_res);
@@ -112,6 +285,16 @@ static int stm32_dwmac_probe(struct platform_device *pdev)
goto err_remove_config_dt;
}
+ data = of_device_get_match_data(&pdev->dev);
+ if (!data) {
+ dev_err(&pdev->dev, "no of match data provided\n");
+ ret = -EINVAL;
+ goto err_remove_config_dt;
+ }
+
+ dwmac->ops = data;
+ dwmac->dev = &pdev->dev;
+
ret = stm32_dwmac_parse_data(dwmac, &pdev->dev);
if (ret) {
dev_err(&pdev->dev, "Unable to parse OF data\n");
@@ -149,15 +332,48 @@ static int stm32_dwmac_remove(struct platform_device *pdev)
return ret;
}
+static int stm32mp1_suspend(struct stm32_dwmac *dwmac)
+{
+ int ret = 0;
+
+ ret = clk_prepare_enable(dwmac->clk_ethstp);
+ if (ret)
+ return ret;
+
+ clk_disable_unprepare(dwmac->clk_tx);
+ clk_disable_unprepare(dwmac->syscfg_clk);
+ if (dwmac->int_phyclk)
+ clk_disable_unprepare(dwmac->clk_eth_ck);
+
+ return ret;
+}
+
+static void stm32mp1_resume(struct stm32_dwmac *dwmac)
+{
+ clk_disable_unprepare(dwmac->clk_ethstp);
+}
+
+static int stm32mcu_suspend(struct stm32_dwmac *dwmac)
+{
+ clk_disable_unprepare(dwmac->clk_tx);
+ clk_disable_unprepare(dwmac->clk_rx);
+
+ return 0;
+}
+
#ifdef CONFIG_PM_SLEEP
static int stm32_dwmac_suspend(struct device *dev)
{
struct net_device *ndev = dev_get_drvdata(dev);
struct stmmac_priv *priv = netdev_priv(ndev);
+ struct stm32_dwmac *dwmac = priv->plat->bsp_priv;
+
int ret;
ret = stmmac_suspend(dev);
- stm32_dwmac_clk_disable(priv->plat->bsp_priv);
+
+ if (dwmac->ops->suspend)
+ ret = dwmac->ops->suspend(dwmac);
return ret;
}
@@ -166,8 +382,12 @@ static int stm32_dwmac_resume(struct device *dev)
{
struct net_device *ndev = dev_get_drvdata(dev);
struct stmmac_priv *priv = netdev_priv(ndev);
+ struct stm32_dwmac *dwmac = priv->plat->bsp_priv;
int ret;
+ if (dwmac->ops->resume)
+ dwmac->ops->resume(dwmac);
+
ret = stm32_dwmac_init(priv->plat);
if (ret)
return ret;
@@ -181,8 +401,24 @@ static int stm32_dwmac_resume(struct device *dev)
static SIMPLE_DEV_PM_OPS(stm32_dwmac_pm_ops,
stm32_dwmac_suspend, stm32_dwmac_resume);
+static struct stm32_ops stm32mcu_dwmac_data = {
+ .set_mode = stm32mcu_set_mode,
+ .suspend = stm32mcu_suspend,
+ .syscfg_eth_mask = SYSCFG_MCU_ETH_MASK
+};
+
+static struct stm32_ops stm32mp1_dwmac_data = {
+ .set_mode = stm32mp1_set_mode,
+ .clk_prepare = stm32mp1_clk_prepare,
+ .suspend = stm32mp1_suspend,
+ .resume = stm32mp1_resume,
+ .parse_data = stm32mp1_parse_data,
+ .syscfg_eth_mask = SYSCFG_MP1_ETH_MASK
+};
+
static const struct of_device_id stm32_dwmac_match[] = {
- { .compatible = "st,stm32-dwmac"},
+ { .compatible = "st,stm32-dwmac", .data = &stm32mcu_dwmac_data},
+ { .compatible = "st,stm32mp1-dwmac", .data = &stm32mp1_dwmac_data},
{ }
};
MODULE_DEVICE_TABLE(of, stm32_dwmac_match);
@@ -199,5 +435,6 @@ static SIMPLE_DEV_PM_OPS(stm32_dwmac_pm_ops,
module_platform_driver(stm32_dwmac_driver);
MODULE_AUTHOR("Alexandre Torgue <alexandre.torgue@gmail.com>");
-MODULE_DESCRIPTION("STMicroelectronics MCU DWMAC Specific Glue layer");
+MODULE_AUTHOR("Christophe Roullier <christophe.roullier@st.com>");
+MODULE_DESCRIPTION("STMicroelectronics STM32 DWMAC Specific Glue layer");
MODULE_LICENSE("GPL v2");
--
1.9.1
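
Review note on the mask/value pairs passed to regmap_update_bits() in the patch above. Assuming the helper computes (old & ~mask) | (val & mask), a value has to sit inside the mask to have any effect. The MP1 values are defined as bits inside GENMASK(23, 16), but the MCU path passes SYSCFG_MCU_ETH_SEL_RMII (1) with a mask of BIT(23), as did the code it replaces. The userspace model below uses hypothetical names (update_bits_model, MODEL_BIT, mcu_rmii_as_posted, mcu_rmii_shifted) and only checks the arithmetic; whether it matters on real hardware is not established in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_BIT(n) ((uint32_t)1 << (n))

/* Models the read-modify-write assumed for regmap_update_bits(). */
static uint32_t update_bits_model(uint32_t old, uint32_t mask, uint32_t val)
{
	return (old & ~mask) | (val & mask);
}

/* MCU RMII selection as posted: val = 1, mask = BIT(23). */
static uint32_t mcu_rmii_as_posted(uint32_t old)
{
	return update_bits_model(old, MODEL_BIT(23), 1);
}

/* Same selection with the value shifted into the masked bit. */
static uint32_t mcu_rmii_shifted(uint32_t old)
{
	return update_bits_model(old, MODEL_BIT(23), (uint32_t)1 << 23);
}
```

In the model, mcu_rmii_as_posted(0) leaves bit 23 clear, while mcu_rmii_shifted(0) sets it. An MP1-style call such as update_bits_model(old, 0x00ff0000, MODEL_BIT(21)) changes only bits 23..16.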
^ permalink raw reply related
* [PATCH V0:net-next 2/4] dt-bindings: stm32-dwmac: add support of MPU families
From: Christophe Roullier @ 2018-05-25 7:46 UTC (permalink / raw)
To: davem, joabreu, mcoquelin.stm32, alexandre.torgue,
peppe.cavallaro
Cc: linux-kernel, linux-arm-kernel, netdev, robh, christophe.roullier
In-Reply-To: <1527234401-15812-1-git-send-email-christophe.roullier@st.com>
Add description for Ethernet MPU families fields
Signed-off-by: Christophe Roullier <christophe.roullier@st.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
Documentation/devicetree/bindings/net/stm32-dwmac.txt | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/Documentation/devicetree/bindings/net/stm32-dwmac.txt b/Documentation/devicetree/bindings/net/stm32-dwmac.txt
index 489dbcb..1341012 100644
--- a/Documentation/devicetree/bindings/net/stm32-dwmac.txt
+++ b/Documentation/devicetree/bindings/net/stm32-dwmac.txt
@@ -6,14 +6,28 @@ Please see stmmac.txt for the other unchanged properties.
The device node has following properties.
Required properties:
-- compatible: Should be "st,stm32-dwmac" to select glue, and
+- compatible: For MCU family should be "st,stm32-dwmac" to select glue, and
"snps,dwmac-3.50a" to select IP version.
+ For MPU family should be "st,stm32mp1-dwmac" to select
+ glue, and "snps,dwmac-4.20a" to select IP version.
- clocks: Must contain a phandle for each entry in clock-names.
- clock-names: Should be "stmmaceth" for the host clock.
Should be "mac-clk-tx" for the MAC TX clock.
Should be "mac-clk-rx" for the MAC RX clock.
+ For MPU family need to add also "ethstp" for power mode clock and,
+ "syscfg-clk" for SYSCFG clock.
+- interrupt-names: Should contain a list of interrupt names corresponding to
+ the interrupts in the interrupts property, if available.
+ Should be "macirq" for the main MAC IRQ
+ Should be "eth_wake_irq" for the IT which wake up system
- st,syscon : Should be phandle/offset pair. The phandle to the syscon node which
- encompases the glue register, and the offset of the control register.
+ encompases the glue register, and the offset of the control register.
+
+Optional properties:
+- clock-names: For MPU family "mac-clk-ck" for PHY without quartz
+- st,int-phyclk (boolean) : valid only where PHY do not have quartz and need to be clock
+ by RCC
+
Example:
ethernet@40028000 {
--
1.9.1
^ permalink raw reply related
* [PATCH V0:net-next 3/4] net: stmmac: add dwmac-4.20a compatible
From: Christophe Roullier @ 2018-05-25 7:46 UTC (permalink / raw)
To: davem, joabreu, mcoquelin.stm32, alexandre.torgue,
peppe.cavallaro
Cc: netdev, christophe.roullier, linux-kernel, linux-arm-kernel, robh
In-Reply-To: <1527234401-15812-1-git-send-email-christophe.roullier@st.com>
Manage dwmac-4.20a version from synopsys
Signed-off-by: Christophe Roullier <christophe.roullier@st.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index ebd3e5f..6d141f3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -472,7 +472,8 @@ struct plat_stmmacenet_data *
}
if (of_device_is_compatible(np, "snps,dwmac-4.00") ||
- of_device_is_compatible(np, "snps,dwmac-4.10a")) {
+ of_device_is_compatible(np, "snps,dwmac-4.10a") ||
+ of_device_is_compatible(np, "snps,dwmac-4.20a")) {
plat->has_gmac4 = 1;
plat->has_gmac = 0;
plat->pmt = 1;
--
1.9.1
^ permalink raw reply related
* [PATCH V0:net-next 4/4] dt-bindings: stm32: add compatible for syscon
From: Christophe Roullier @ 2018-05-25 7:46 UTC (permalink / raw)
To: davem, joabreu, mcoquelin.stm32, alexandre.torgue,
peppe.cavallaro
Cc: netdev, christophe.roullier, linux-kernel, linux-arm-kernel, robh
In-Reply-To: <1527234401-15812-1-git-send-email-christophe.roullier@st.com>
This patch describes syscon DT bindings.
Signed-off-by: Christophe Roullier <christophe.roullier@st.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
.../devicetree/bindings/arm/stm32/stm32-syscon.txt | 14 ++++++++++++++
.../devicetree/bindings/arm/{ => stm32}/stm32.txt | 0
2 files changed, 14 insertions(+)
create mode 100644 Documentation/devicetree/bindings/arm/stm32/stm32-syscon.txt
rename Documentation/devicetree/bindings/arm/{ => stm32}/stm32.txt (100%)
diff --git a/Documentation/devicetree/bindings/arm/stm32/stm32-syscon.txt b/Documentation/devicetree/bindings/arm/stm32/stm32-syscon.txt
new file mode 100644
index 0000000..99980ae
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/stm32/stm32-syscon.txt
@@ -0,0 +1,14 @@
+STMicroelectronics STM32 Platforms System Controller
+
+Properties:
+ - compatible : should contain two values. First value must be :
+ - " st,stm32mp157-syscfg " - for stm32mp157 based SoCs,
+ second value must be always "syscon".
+ - reg : offset and length of the register set.
+
+ Example:
+ syscfg: syscon@50020000 {
+ compatible = "st,stm32mp157-syscfg", "syscon";
+ reg = <0x50020000 0x400>;
+ };
+
diff --git a/Documentation/devicetree/bindings/arm/stm32.txt b/Documentation/devicetree/bindings/arm/stm32/stm32.txt
similarity index 100%
rename from Documentation/devicetree/bindings/arm/stm32.txt
rename to Documentation/devicetree/bindings/arm/stm32/stm32.txt
--
1.9.1
^ permalink raw reply related
* Re: [PATCH v2 13/13] ARM: pxa: change SSP DMA channels allocation
From: Daniel Mack @ 2018-05-25 7:56 UTC (permalink / raw)
To: Robert Jarzmik, Haojian Zhuang, Ezequiel Garcia, Boris Brezillon,
David Woodhouse, Brian Norris, Marek Vasut, Richard Weinberger,
Liam Girdwood, Mark Brown, Arnd Bergmann
Cc: linux-arm-kernel, linux-kernel, linux-ide, dmaengine, linux-media,
linux-mmc, linux-mtd, netdev, alsa-devel
In-Reply-To: <20180524070703.11901-14-robert.jarzmik@free.fr>
On Thursday, May 24, 2018 09:07 AM, Robert Jarzmik wrote:
> Now the dma_slave_map is available for PXA architecture, switch the SSP
> device to it.
>
> This specifically means that :
> - for platform data based machines, the DMA requestor channels are
> extracted from the slave map, where pxa-ssp-dai.<N> is a 1-1 match to
> ssp.<N>, and the channels are either "rx" or "tx".
>
> - for device tree platforms, the dma node should be hooked into the
> pxa2xx-ac97 or pxa-ssp-dai node.
>
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Acked-by: Daniel Mack <daniel@zonque.org>
We should, however, merge what's left of this management glue code into
the users of it, so the dma related properties can be put in the right
devicetree node.
I'll prepare a patch for that for 4.18. This is a good preparation for
this round though.
Thanks,
Daniel
> ---
> Since v1: Removed channel names from platform_data
> ---
> arch/arm/plat-pxa/ssp.c | 47 ----------------------------------------------
> include/linux/pxa2xx_ssp.h | 2 --
> sound/soc/pxa/pxa-ssp.c | 5 ++---
> 3 files changed, 2 insertions(+), 52 deletions(-)
>
> diff --git a/arch/arm/plat-pxa/ssp.c b/arch/arm/plat-pxa/ssp.c
> index ba13f793fbce..ed36dcab80f1 100644
> --- a/arch/arm/plat-pxa/ssp.c
> +++ b/arch/arm/plat-pxa/ssp.c
> @@ -127,53 +127,6 @@ static int pxa_ssp_probe(struct platform_device *pdev)
> if (IS_ERR(ssp->clk))
> return PTR_ERR(ssp->clk);
>
> - if (dev->of_node) {
> - struct of_phandle_args dma_spec;
> - struct device_node *np = dev->of_node;
> - int ret;
> -
> - /*
> - * FIXME: we should allocate the DMA channel from this
> - * context and pass the channel down to the ssp users.
> - * For now, we lookup the rx and tx indices manually
> - */
> -
> - /* rx */
> - ret = of_parse_phandle_with_args(np, "dmas", "#dma-cells",
> - 0, &dma_spec);
> -
> - if (ret) {
> - dev_err(dev, "Can't parse dmas property\n");
> - return -ENODEV;
> - }
> - ssp->drcmr_rx = dma_spec.args[0];
> - of_node_put(dma_spec.np);
> -
> - /* tx */
> - ret = of_parse_phandle_with_args(np, "dmas", "#dma-cells",
> - 1, &dma_spec);
> - if (ret) {
> - dev_err(dev, "Can't parse dmas property\n");
> - return -ENODEV;
> - }
> - ssp->drcmr_tx = dma_spec.args[0];
> - of_node_put(dma_spec.np);
> - } else {
> - res = platform_get_resource(pdev, IORESOURCE_DMA, 0);
> - if (res == NULL) {
> - dev_err(dev, "no SSP RX DRCMR defined\n");
> - return -ENODEV;
> - }
> - ssp->drcmr_rx = res->start;
> -
> - res = platform_get_resource(pdev, IORESOURCE_DMA, 1);
> - if (res == NULL) {
> - dev_err(dev, "no SSP TX DRCMR defined\n");
> - return -ENODEV;
> - }
> - ssp->drcmr_tx = res->start;
> - }
> -
> res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> if (res == NULL) {
> dev_err(dev, "no memory resource defined\n");
> diff --git a/include/linux/pxa2xx_ssp.h b/include/linux/pxa2xx_ssp.h
> index 8461b18e4608..03a7ca46735b 100644
> --- a/include/linux/pxa2xx_ssp.h
> +++ b/include/linux/pxa2xx_ssp.h
> @@ -212,8 +212,6 @@ struct ssp_device {
> int type;
> int use_count;
> int irq;
> - int drcmr_rx;
> - int drcmr_tx;
>
> struct device_node *of_node;
> };
> diff --git a/sound/soc/pxa/pxa-ssp.c b/sound/soc/pxa/pxa-ssp.c
> index 0291c7cb64eb..e09368d89bbc 100644
> --- a/sound/soc/pxa/pxa-ssp.c
> +++ b/sound/soc/pxa/pxa-ssp.c
> @@ -104,9 +104,8 @@ static int pxa_ssp_startup(struct snd_pcm_substream *substream,
> dma = kzalloc(sizeof(struct snd_dmaengine_dai_dma_data), GFP_KERNEL);
> if (!dma)
> return -ENOMEM;
> -
> - dma->filter_data = substream->stream == SNDRV_PCM_STREAM_PLAYBACK ?
> - &ssp->drcmr_tx : &ssp->drcmr_rx;
> + dma->chan_name = substream->stream == SNDRV_PCM_STREAM_PLAYBACK ?
> + "tx" : "rx";
>
> snd_soc_dai_set_dma_data(cpu_dai, substream, dma);
>
>
^ permalink raw reply