Netdev List


* Re: [PATCH v2] net: Fix documentation for unregister_netdevice_notifier_net
From: Jiri Pirko @ 2022-12-16 11:19 UTC
  To: Miaoqian Lin
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Sebastian Andrzej Siewior, Menglong Dong, Kuniyuki Iwashima,
	Petr Machata, Jiri Pirko, netdev, linux-kernel
In-Reply-To: <20221216094838.683379-1-linmq006@gmail.com>

Fri, Dec 16, 2022 at 10:48:35AM CET, linmq006@gmail.com wrote:
>unregister_netdevice_notifier_net() is used to unregister a notifier
>registered by register_netdevice_notifier_net(). Also s/into/from/.
>
>Fixes: a30c7b429f2d ("net: introduce per-netns netdevice notifiers")

Hmm, I believe that comment fixes should not carry a Fixes tag, so as not
to confuse anyone (or any bot), since this is not fixing a bug. This is
net-next tree material; please mark it as such in the patch subject line:
[PATCH net-next v2] ...

Reviewed-by: Jiri Pirko <jiri@nvidia.com>


* Re: [PATCH v2 7/9] net: stmmac: Add glue layer for StarFive JH71x0 SoCs
From: Krzysztof Kozlowski @ 2022-12-16 11:19 UTC
  To: Yanhong Wang, linux-riscv, netdev, devicetree, linux-kernel
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Rob Herring, Krzysztof Kozlowski, Emil Renner Berthing,
	Richard Cochran, Andrew Lunn, Heiner Kallweit, Peter Geis
In-Reply-To: <20221216070632.11444-8-yanhong.wang@starfivetech.com>

On 16/12/2022 08:06, Yanhong Wang wrote:
> This adds StarFive dwmac driver support on the StarFive JH71x0 SoCs.
> 
> Signed-off-by: Yanhong Wang <yanhong.wang@starfivetech.com>
> Co-developed-by: Emil Renner Berthing <kernel@esmil.dk>
> Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>


> +
> +static const struct of_device_id starfive_eth_plat_match[] = {
> +	{
> +		.compatible = "starfive,jh7110-dwmac"
> +	},
> +	{
> +		.compatible = "starfive,jh7100-dwmac",

NAK.

This wasn't even checked with checkpatch.

Best regards,
Krzysztof



* Re: [PATCH] ath9k: use proper statements in conditionals
From: Arnd Bergmann @ 2022-12-16 11:16 UTC
  To: Toke Høiland-Jørgensen, Arnd Bergmann, Kalle Valo,
	Pavel Skripkin
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Tetsuo Handa, linux-wireless, Netdev, linux-kernel
In-Reply-To: <87k02sd1uz.fsf@toke.dk>

On Thu, Dec 15, 2022, at 18:16, Toke Høiland-Jørgensen wrote:
>> index 30f0765fb9fd..237f4ec2cffd 100644
>> --- a/drivers/net/wireless/ath/ath9k/htc.h
>> +++ b/drivers/net/wireless/ath/ath9k/htc.h
>> @@ -327,9 +327,9 @@ static inline struct ath9k_htc_tx_ctl *HTC_SKB_CB(struct sk_buff *skb)
>>  }
>>  
>>  #ifdef CONFIG_ATH9K_HTC_DEBUGFS
>> -#define __STAT_SAFE(hif_dev, expr)	((hif_dev)->htc_handle->drv_priv ? (expr) : 0)
>> -#define CAB_STAT_INC(priv)		((priv)->debug.tx_stats.cab_queued++)
>> -#define TX_QSTAT_INC(priv, q)		((priv)->debug.tx_stats.queue_stats[q]++)
>> +#define __STAT_SAFE(hif_dev, expr)	do { ((hif_dev)->htc_handle->drv_priv ? (expr) : 0); } while (0)
>> +#define CAB_STAT_INC(priv)		do { ((priv)->debug.tx_stats.cab_queued++); } while (0)
>> +#define TX_QSTAT_INC(priv, q)		do { ((priv)->debug.tx_stats.queue_stats[q]++); } while (0)
>
> Hmm, is it really necessary to wrap these in do/while constructs? AFAICT
> they're all simple statements already?

It's generally safer to do the same thing on both sides of the #ifdef.

The "do { } while (0)" is an empty statement that is needed to fix
the bug on the #else side. The expressions you have on the #ifdef
side can be used as values, and wrapping them in do{}while(0)
turns them into statements (without a value) as well, so fewer
things can go wrong when you only test one side.

I suppose the best solution would be to just use inline functions
for all of them and get rid of the macros.

     Arnd
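
A minimal user-space sketch of the point about statement-like macros.
The names below (struct stats, STAT_INC_EXPR, STAT_INC_STMT,
count_if_enabled) are hypothetical and are not taken from the ath9k
driver. The expression form has a value and can be used as one by
accident; the do { } while (0) form has no value but still behaves as
a single statement inside an unbraced if/else.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter block; not the ath9k structure. */
struct stats {
	unsigned int cab_queued;
};

/* Expression form: has a value, so "x = STAT_INC_EXPR(s)" compiles. */
#define STAT_INC_EXPR(s)	((s)->cab_queued++)

/* Statement form: no value, but still one statement for if/else. */
#define STAT_INC_STMT(s)	do { (s)->cab_queued++; } while (0)

/*
 * Increments the counter when enabled, resets it otherwise. The
 * unbraced if/else relies on the macro expanding to exactly one
 * statement that is completed by the caller's ';'.
 */
unsigned int count_if_enabled(struct stats *s, int enabled)
{
	assert(s != NULL);

	if (enabled)
		STAT_INC_STMT(s);
	else
		s->cab_queued = 0;

	return s->cab_queued;
}
```

Writing "x = STAT_INC_STMT(s);" would be rejected by the compiler,
which is the kind of misuse the statement form is meant to catch
regardless of which side of the #ifdef gets built.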


* Re: [PATCH v2 5/9] dt-bindings: net: motorcomm: add support for Motorcomm YT8531
From: Krzysztof Kozlowski @ 2022-12-16 11:15 UTC
  To: Yanhong Wang, linux-riscv, netdev, devicetree, linux-kernel
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Rob Herring, Krzysztof Kozlowski, Emil Renner Berthing,
	Richard Cochran, Andrew Lunn, Heiner Kallweit, Peter Geis
In-Reply-To: <20221216070632.11444-6-yanhong.wang@starfivetech.com>

On 16/12/2022 08:06, Yanhong Wang wrote:
> Add support for Motorcomm Technology YT8531 10/100/1000 Ethernet PHY.
> The document describes details of clock delay train configuration.
> 
> Signed-off-by: Yanhong Wang <yanhong.wang@starfivetech.com>

Missing vendor prefix documentation. I don't think you tested this at
all with checkpatch and dt_binding_check.

> ---
>  .../bindings/net/motorcomm,yt8531.yaml        | 111 ++++++++++++++++++
>  MAINTAINERS                                   |   1 +
>  2 files changed, 112 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/motorcomm,yt8531.yaml
> 
> diff --git a/Documentation/devicetree/bindings/net/motorcomm,yt8531.yaml b/Documentation/devicetree/bindings/net/motorcomm,yt8531.yaml
> new file mode 100644
> index 000000000000..c5b8a09a78bb
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/motorcomm,yt8531.yaml
> @@ -0,0 +1,111 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/net/motorcomm,yt8531.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Motorcomm YT8531 Gigabit Ethernet PHY
> +
> +maintainers:
> +  - Yanhong Wang <yanhong.wang@starfivetech.com>
> +

Why is there no reference to ethernet-phy.yaml?

> +select:
> +  properties:
> +    $nodename:
> +      pattern: "^ethernet-phy(@[a-f0-9]+)?$"

I don't think that's the correct approach. You now affect all PHYs.

> +
> +  required:
> +    - $nodename
> +
> +properties:
> +  $nodename:
> +    pattern: "^ethernet-phy(@[a-f0-9]+)?$"

Just reference ethernet-phy.yaml.

> +
> +  reg:
> +    minimum: 0
> +    maximum: 31
> +    description:
> +      The ID number for the PHY.

Drop duplicated properties.

> +
> +  rxc_dly_en:

No underscores in property names. Missing vendor prefix. Both apply to all
your other custom properties, unless they are not custom but generic.

Missing ref.

> +    description: |
> +      RGMII Receive PHY Clock Delay defined with fixed 2ns.This is used for

A space goes after every full stop.

> +      PHY that have configurable RX internal delays. If this property set
> +      to 1, then automatically add 2ns delay pad for Receive PHY clock.

Nope, this is wrong. You have now written a boolean property as an enum.

> +    enum: [0, 1]
> +    default: 0
> +
> +  rx_delay_sel:
> +    description: |
> +      This is supplement to rxc_dly_en property,and it can
> +      be specified in 150ps(pico seconds) steps. The effective
> +      delay is: 150ps * N.

Nope. Use proper units and drop all this register stuff.

> +    minimum: 0
> +    maximum: 15
> +    default: 0
> +
> +  tx_delay_sel_fe:
> +    description: |
> +      RGMII Transmit PHY Clock Delay defined in pico seconds.This is used for
> +      PHY's that have configurable TX internal delays when speed is 100Mbps
> +      or 10Mbps. It can be specified in 150ps steps, the effective delay
> +      is: 150ps * N.

The binding is in very poor shape. Please look carefully at
example-schema. All my previous comments apply everywhere.

> +    minimum: 0
> +    maximum: 15
> +    default: 15
> +
> +  tx_delay_sel:
> +    description: |
> +      RGMII Transmit PHY Clock Delay defined in pico seconds.This is used for
> +      PHY's that have configurable TX internal delays when speed is 1000Mbps.
> +      It can be specified in 150ps steps, the effective delay is: 150ps * N.
> +    minimum: 0
> +    maximum: 15
> +    default: 1
> +
> +  tx_inverted_10:
> +    description: |
> +      Use original or inverted RGMII Transmit PHY Clock to drive the RGMII
> +      Transmit PHY Clock delay train configuration when speed is 10Mbps.
> +      0: original   1: inverted
> +    enum: [0, 1]
> +    default: 0
> +
> +  tx_inverted_100:
> +    description: |
> +      Use original or inverted RGMII Transmit PHY Clock to drive the RGMII
> +      Transmit PHY Clock delay train configuration when speed is 100Mbps.
> +      0: original   1: inverted
> +    enum: [0, 1]
> +    default: 0
> +
> +  tx_inverted_1000:
> +    description: |
> +      Use original or inverted RGMII Transmit PHY Clock to drive the RGMII
> +      Transmit PHY Clock delay train configuration when speed is 1000Mbps.
> +      0: original   1: inverted
> +    enum: [0, 1]
> +    default: 0
> +
> +required:
> +  - reg
> +
> +additionalProperties: true

This must be false. After referencing ethernet-phy this should be
unevaluatedProperties: false.


Best regards,
Krzysztof



* Re: [PATCH v2 4/9] dt-bindings: net: Add bindings for StarFive dwmac
From: Krzysztof Kozlowski @ 2022-12-16 11:06 UTC
  To: Yanhong Wang, linux-riscv, netdev, devicetree, linux-kernel
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Rob Herring, Krzysztof Kozlowski, Emil Renner Berthing,
	Richard Cochran, Andrew Lunn, Heiner Kallweit, Peter Geis
In-Reply-To: <20221216070632.11444-5-yanhong.wang@starfivetech.com>

On 16/12/2022 08:06, Yanhong Wang wrote:
> Add documentation to describe StarFive dwmac driver(GMAC).
> 

Subject: drop second, redundant "bindings for".

Best regards,
Krzysztof



* Re: [PATCH v2 4/9] dt-bindings: net: Add bindings for StarFive dwmac
From: Krzysztof Kozlowski @ 2022-12-16 11:05 UTC
  To: Yanhong Wang, linux-riscv, netdev, devicetree, linux-kernel
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Rob Herring, Krzysztof Kozlowski, Emil Renner Berthing,
	Richard Cochran, Andrew Lunn, Heiner Kallweit, Peter Geis
In-Reply-To: <20221216070632.11444-5-yanhong.wang@starfivetech.com>

On 16/12/2022 08:06, Yanhong Wang wrote:
> Add documentation to describe StarFive dwmac driver(GMAC).
> 
> Signed-off-by: Yanhong Wang <yanhong.wang@starfivetech.com>
> ---
>  .../devicetree/bindings/net/snps,dwmac.yaml   |   1 +
>  .../bindings/net/starfive,jh71x0-dwmac.yaml   | 103 ++++++++++++++++++
>  MAINTAINERS                                   |   5 +
>  3 files changed, 109 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/starfive,jh71x0-dwmac.yaml
> 
> diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> index 7870228b4cd3..cdb045d1c618 100644
> --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> @@ -91,6 +91,7 @@ properties:
>          - snps,dwmac-5.20
>          - snps,dwxgmac
>          - snps,dwxgmac-2.10
> +        - starfive,jh7110-dwmac
>  
>    reg:
>      minItems: 1
> diff --git a/Documentation/devicetree/bindings/net/starfive,jh71x0-dwmac.yaml b/Documentation/devicetree/bindings/net/starfive,jh71x0-dwmac.yaml
> new file mode 100644
> index 000000000000..5cb1272fe959
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/starfive,jh71x0-dwmac.yaml
> @@ -0,0 +1,103 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +# Copyright (C) 2022 StarFive Technology Co., Ltd.
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/net/starfive,jh71x0-dwmac.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: StarFive JH71x0 DWMAC glue layer
> +
> +maintainers:
> +  - Yanhong Wang <yanhong.wang@starfivetech.com>
> +
> +select:
> +  properties:
> +    compatible:
> +      contains:
> +        enum:
> +          - starfive,jh7110-dwmac
> +  required:
> +    - compatible
> +
> +allOf:
> +  - $ref: snps,dwmac.yaml#
> +
> +properties:
> +  compatible:
> +    items:
> +      - enum:
> +          - starfive,jh7110-dwmac

Is it going to grow with new models? If yes, when? If not, filename does
not match compatible.

> +      - const: snps,dwmac-5.20
> +
> +  clocks:
> +    items:
> +      - description: GMAC main clock
> +      - description: GMAC AHB clock
> +      - description: PTP clock
> +      - description: TX clock
> +      - description: GTXC clock
> +      - description: GTX clock
> +
> +  clock-names:
> +    items:
> +      - const: stmmaceth
> +      - const: pclk
> +      - const: ptp_ref
> +      - const: tx
> +      - const: gtxc
> +      - const: gtx

missing resets and reset-names.

> +
> +required:
> +  - compatible
> +  - clocks
> +  - clock-names
> +  - resets
> +  - reset-names
> +
> +unevaluatedProperties: false
> +
Best regards,
Krzysztof



* Re: [PATCH v2 2/9] dt-bindings: net: snps,dwmac: Update the maxitems number of resets and reset-names
From: Krzysztof Kozlowski @ 2022-12-16 11:03 UTC
  To: Yanhong Wang, linux-riscv, netdev, devicetree, linux-kernel
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Rob Herring, Krzysztof Kozlowski, Emil Renner Berthing,
	Richard Cochran, Andrew Lunn, Heiner Kallweit, Peter Geis
In-Reply-To: <20221216070632.11444-3-yanhong.wang@starfivetech.com>

On 16/12/2022 08:06, Yanhong Wang wrote:
> Some boards (such as StarFive VisionFive v2) require more than one value
> for the resets property, so the original definition cannot meet the
> requirements. To accommodate this, raise the maxItems number from 1
> to 3.
> 
> Signed-off-by: Yanhong Wang <yanhong.wang@starfivetech.com>
> ---
>  .../devicetree/bindings/net/snps,dwmac.yaml       | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> index e26c3e76ebb7..7870228b4cd3 100644
> --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> @@ -133,12 +133,19 @@ properties:
>          - ptp_ref
>  
>    resets:
> -    maxItems: 1
> -    description:
> -      MAC Reset signal.
> +    minItems: 1
> +    maxItems: 3
> +    additionalItems: true
> +    items:
> +      - description: MAC Reset signal
>  
>    reset-names:
> -    const: stmmaceth
> +    minItems: 1
> +    maxItems: 3
> +    additionalItems: true
> +    contains:
> +      enum:
> +        - stmmaceth

No, this is highly unspecific and you now affect all the schemas using
snps,dwmac.yaml. Both lists must be specific - for your device and for
others.

Best regards,
Krzysztof



* Re: [PULL] Networking for next-6.1
From: Jiri Slaby @ 2022-12-16 10:49 UTC
  To: Jakub Kicinski; +Cc: davem, netdev, linux-kernel, pabeni, joannelkoong
In-Reply-To: <20221004052000.2645894-1-kuba@kernel.org>

Hi,

On 04. 10. 22, 7:20, Jakub Kicinski wrote:
> Joanne Koong (7):

>        net: Add a bhash2 table hashed by port and address

This makes the regression tests of python-ephemeral-port-reserve fail.

I'm not sure if the issue is in the commit or in the test.

This C reproducer used to fail with 6.0, now it succeeds:
#include <err.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <sys/socket.h>

#include <arpa/inet.h>
#include <netinet/ip.h>

int main()
{
         int x;
         int s1 = socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_IP);
         if (s1 < 0)
                 err(1, "sock1");
         x = 1;
         if (setsockopt(s1, SOL_SOCKET, SO_REUSEADDR, &x, sizeof(x)))
                 err(1, "setsockopt1");

         struct sockaddr_in in = {
                 .sin_family = AF_INET,
                 .sin_port = INADDR_ANY,
                 .sin_addr = { htonl(INADDR_LOOPBACK) },
         };
         if (bind(s1, (const struct sockaddr *)&in, sizeof(in)) < 0)
                 err(1, "bind1");

         if (listen(s1, 1) < 0)
                 err(1, "listen1");

         socklen_t inl = sizeof(in);
         if (getsockname(s1, (struct sockaddr *)&in, &inl) < 0)
                 err(1, "getsockname1");

         int s2 = socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_IP);
         if (s2 < 0)
                 err(1, "sock2");

         if (connect(s2, (struct sockaddr *)&in, inl) < 0)
                 err(1, "conn2");

         struct sockaddr_in acc;
         inl = sizeof(acc);
         int fdX = accept(s1, (struct sockaddr *)&acc, &inl);
         if (fdX < 0)
                 err(1, "accept");

         close(fdX);
         close(s2);
         close(s1);

         int s3 = socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_IP);
         if (s3 < 0)
                 err(1, "sock3");

         if (bind(s3, (struct sockaddr *)&in, sizeof(in)) < 0)
                 err(1, "bind3");

         close(s3);

         return 0;
}



thanks,
-- 
js
suse labs



* Re: [PATCH net v3] openvswitch: Fix flow lookup to use unmasked key
From: patchwork-bot+netdevbpf @ 2022-12-16 10:40 UTC
  To: Eelco Chaudron
  Cc: netdev, pshelar, davem, dev, i.maximets, aconole, edumazet, kuba,
	pabeni, stable
In-Reply-To: <167111551443.359845.7122827280135116424.stgit@ebuild>

Hello:

This patch was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Thu, 15 Dec 2022 15:46:33 +0100 you wrote:
> The commit mentioned below causes the ovs_flow_tbl_lookup() function
> to be called with the masked key. However, it's supposed to be called
> with the unmasked key. This is due to the fact that the datapath supports
> installing wider flows, and OVS relies on this behavior. For example
> if ipv4(src=1.1.1.1/192.0.0.0, dst=1.1.1.2/192.0.0.0) exists, a wider
> flow (smaller mask) of ipv4(src=192.1.1.1/128.0.0.0,dst=192.1.1.2/
> 128.0.0.0) is allowed to be added.
> 
> [...]

Here is the summary with links:
  - [net,v3] openvswitch: Fix flow lookup to use unmasked key
    https://git.kernel.org/netdev/net/c/68bb10101e6b

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
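
The mask arithmetic in the quoted example can be checked with a small
user-space sketch. The helper names ipv4() and masked() are
hypothetical and are not part of the openvswitch datapath; the sketch
only shows which bits of each address survive its mask and makes no
claim about how the datapath stores or compares flows.

```c
#include <stdint.h>

/* Pack four octets into a host-order 32-bit IPv4 value. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/* Apply a mask to a key field: only bits set in the mask survive. */
uint32_t masked(uint32_t key, uint32_t mask)
{
	return key & mask;
}
```

With these helpers, masked(ipv4(1, 1, 1, 1), ipv4(192, 0, 0, 0)) is
0.0.0.0, while masked(ipv4(192, 1, 1, 1), ipv4(128, 0, 0, 0)) is
128.0.0.0, so the masked form of a key can differ substantially from
the unmasked key it was derived from.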




* Re: [PATCH net 0/3] devlink: region snapshot locking fix and selftest adjustments
From: patchwork-bot+netdevbpf @ 2022-12-16 10:30 UTC
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jiri, jacob.e.keller
In-Reply-To: <20221215020102.1619685-1-kuba@kernel.org>

Hello:

This series was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Wed, 14 Dec 2022 18:00:59 -0800 you wrote:
> Minor fix for region snapshot locking and adjustments to selftests.
> 
> Jakub Kicinski (3):
>   devlink: hold region lock when flushing snapshots
>   selftests: devlink: fix the fd redirect in dummy_reporter_test
>   selftests: devlink: add a warning for interfaces coming up
> 
> [...]

Here is the summary with links:
  - [net,1/3] devlink: hold region lock when flushing snapshots
    https://git.kernel.org/netdev/net/c/b4cafb3d2c74
  - [net,2/3] selftests: devlink: fix the fd redirect in dummy_reporter_test
    https://git.kernel.org/netdev/net/c/2fc60e2ff972
  - [net,3/3] selftests: devlink: add a warning for interfaces coming up
    https://git.kernel.org/netdev/net/c/d1c4a3469e73

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: kernel v6.1: NULL pointer dereference in ieee80211_deliver_skb
From: Wolfgang Walter @ 2022-12-16 10:28 UTC
  To: Felix Fietkau; +Cc: linux-wireless, linux-kernel, netdev
In-Reply-To: <5ef22539-7a99-0c12-a5b0-a5ea643fe635@nbd.name>

Am 2022-12-15 20:27, schrieb Felix Fietkau:
> On 15.12.22 18:31, Wolfgang Walter wrote:
>> Hello,
>> 
>> with kernel v6.1 I always get the following oops when running on a 
>> small
>> router:
> Please try this fix that I just posted:
> https://patchwork.kernel.org/project/linux-wireless/patch/20221215190503.79904-1-nbd@nbd.name/
> 
> - Felix

Thanks a lot, that fixed the problem.

Regards
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts


* Re: [PATCH] ipvs: use div_s64 for signed division
From: Pablo Neira Ayuso @ 2022-12-16 10:10 UTC
  To: Julian Anastasov
  Cc: Arnd Bergmann, Simon Horman, Arnd Bergmann, Jakub Kicinski,
	Paolo Abeni, Jiri Wiesner, netdev, lvs-devel, netfilter-devel,
	coreteam, linux-kernel
In-Reply-To: <e1fea67-7425-f13d-e5bd-3d80d9a8afb8@ssi.bg>

Hi Julian,

On Thu, Dec 15, 2022 at 09:01:59PM +0200, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Thu, 15 Dec 2022, Arnd Bergmann wrote:
> 
> > From: Arnd Bergmann <arnd@arndb.de>
> > 
> > do_div() is only well-behaved for positive numbers, and now warns
> > when the first argument is an s64:
> > 
> > net/netfilter/ipvs/ip_vs_est.c: In function 'ip_vs_est_calc_limits':
> > include/asm-generic/div64.h:222:35: error: comparison of distinct pointer types lacks a cast [-Werror]
> >   222 |         (void)(((typeof((n)) *)0) == ((uint64_t *)0));  \
> >       |                                   ^~
> > net/netfilter/ipvs/ip_vs_est.c:694:17: note: in expansion of macro 'do_div'
> >   694 |                 do_div(val, loops);
> 
> 	net-next already contains a fix for this warning
> and changes val to u64.

Arnd's patch applies fine on top of net-next, maybe he is addressing
something else?

> > Convert to using the more appropriate div_s64(), which also
> > simplifies the code a bit.
> > 
> > Fixes: 705dd3444081 ("ipvs: use kthreads for stats estimation")
> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> > ---
> >  net/netfilter/ipvs/ip_vs_est.c | 6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/net/netfilter/ipvs/ip_vs_est.c b/net/netfilter/ipvs/ip_vs_est.c
> > index ce2a1549b304..dbc32f8cf1f9 100644
> > --- a/net/netfilter/ipvs/ip_vs_est.c
> > +++ b/net/netfilter/ipvs/ip_vs_est.c
> > @@ -691,15 +691,13 @@ static int ip_vs_est_calc_limits(struct netns_ipvs *ipvs, int *chain_max)
> >  		}
> >  		if (diff >= NSEC_PER_SEC)
> >  			continue;
> > -		val = diff;
> > -		do_div(val, loops);
> > +		val = div_s64(diff, loops);
> 
> 	On CONFIG_X86_32 both versions execute single divl
> for our case but div_s64 is not inlined. I'm not an expert in
> this area but if you think div_u64 is more appropriate then
> post another patch. Note that now val is u64 and
> min_est is still s32 (can be u32).
> 
> >  		if (!min_est || val < min_est) {
> >  			min_est = val;
> >  			/* goal: 95usec per chain */
> >  			val = 95 * NSEC_PER_USEC;
> >  			if (val >= min_est) {
> > -				do_div(val, min_est);
> > -				max = (int)val;
> > +				max = div_s64(val, min_est);
> >  			} else {
> >  				max = 1;
> >  			}
> > -- 
> > 2.35.1
> 
> Regards
> 
> --
> Julian Anastasov <ja@ssi.bg>
> 
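
A minimal user-space sketch of why the signedness of the dividend
matters here. model_div_s64() and model_div_u64() are hypothetical
stand-ins written for this illustration; they are not the kernel's
div_s64(), div_u64() or do_div(), and they only model the arithmetic,
not the code generated on 32-bit targets.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a signed 64-by-32 division helper. */
int64_t model_div_s64(int64_t dividend, int32_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}

/* Stand-in for an unsigned 64-by-32 division helper. */
uint64_t model_div_u64(uint64_t dividend, uint32_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}
```

For a non-negative dividend the two agree, which matches the case
discussed above where diff is known to be below NSEC_PER_SEC. For a
negative dividend they do not: model_div_s64(-10, 3) is -3, whereas
passing the same bit pattern to model_div_u64() divides a value close
to 2^64.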


* Re: Possible race with xsk_flush
From: Magnus Karlsson @ 2022-12-16 10:05 UTC
  To: Shawn Bohrer; +Cc: netdev, bpf, bjorn, magnus.karlsson, kernel-team
In-Reply-To: <Y5u4dA01y9RjjdAW@sbohrer-cf-dell>

On Fri, Dec 16, 2022 at 1:15 AM Shawn Bohrer <sbohrer@cloudflare.com> wrote:
>
> On Thu, Dec 15, 2022 at 11:22:05AM +0100, Magnus Karlsson wrote:
> > Thanks Shawn for your detailed bug report. The rings between user
> > space and kernel space are single-producer/single-consumer, so in your
> > case when both CPU0 and CPU2 are working on the fill ring and the Rx
> > ring at the same time, this will indeed produce races. The intended
> > design principle to protect against this is that only one NAPI context
> > can access any given ring at any point in time and that the NAPI logic
> > should prevent two instances of the same NAPI instance from running at
> > the same time. So if that is not true for some reason, we would get a
> > race like this. Another option is that one of the CPUs should really
> > process another fill ring instead of the same.
> >
> > Do you see the second socket being worked on when this happens?
> >
> > Could you please share how you set up the two AF_XDP sockets?
>
> Alex Forster sent more details on the configuration but just to
> reiterate there are actually 8 AF_XDP sockets in this test setup.
> There are two veth interfaces and each interface has four receive
> queues.  We create one socket per interface/queue pair.  Our XDP
> program redirects each packet to the correct AF_XDP socket based on
> the queue number.
>
> Yes there is often activity on other sockets near the time when the
> bug occurs.  This is why I'm printing xs/fq, the socket address and
> fill queue address, and printing the ingress/egress device name and
> queue number in my prints.  This allows me to match up the user space and
> kernel space prints.  Additionally we are using a shared UMEM so
> descriptors could move around between sockets though I've tried to
> minimize this and in every case I've seen so far the mystery
> descriptor was last used on the same socket and has also been in the
> fill queue just not next in line.
>
> > Are you using XDP_DRV mode in your tests?
> >
> > > A couple more notes:
> > > * The ftrace print order and timestamps seem to indicate that the CPU
> > >   2 napi_poll is running before the CPU 0 xsk_flush().  I don't know
> > >   if these timestamps can be trusted but it does imply that maybe this
> > >   can race as I described.  I've triggered this twice with xsk_flush
> > >   probes and both show the order above.
> > > * In the 3 times I've triggered this it has occurred right when the
> > >   softirq processing switches CPUs
> >
> > This is interesting. Could you check, in some way, if you only have
> > one core working on the fill ring before the softirq switching and
> > then after that you have two? And if you have two, is that period
> > transient?
>
> I think what you are asking is why does the softirq processing switch
> CPUs?  There is still a lot I don't fully understand here but I've
> tried to understand this, if only to try to make it happen more
> frequently and make this easier to reproduce.
>
> In this test setup there is no hardware IRQ.  iperf2 sends the packet
> and the CPU where iperf is running runs the veth softirq.  I'm not
> sure how it picks which veth receive queue receives the packets, but
> they end up distributed across the veth queues.  Additionally
> __veth_xdp_flush() calls __napi_schedule().  This is called from
> veth_xdp_xmit() which I think means that transmitting packets from
> AF_XDP also schedules the softirq on the current CPU for that veth
> queue.  What I definitely see is that if I pin both iperf and my
> application to a single CPU all softirqs of all queues run on that
> single CPU.  If I pin iperf2 to one core and my application to another
> core I get softirqs for all veth queues on both cores.

To summarize, we are expecting this ordering:

CPU 0 __xsk_rcv_zc()
CPU 0 __xsk_map_flush()
CPU 2 __xsk_rcv_zc()
CPU 2 __xsk_map_flush()

But we are seeing this order:

CPU 0 __xsk_rcv_zc()
CPU 2 __xsk_rcv_zc()
CPU 0 __xsk_map_flush()
CPU 2 __xsk_map_flush()
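
The difference between the two orderings can be illustrated with a
deliberately simplified, sequential user-space model. Everything in it
(struct model_ring, model_rcv(), model_flush(), RING_SIZE) is
hypothetical and is not the AF_XDP implementation; it assumes only
that the receive step writes a descriptor without publishing it and
that the flush step publishes everything written so far.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8	/* hypothetical, power of two */

/*
 * Simplified single-producer ring: entries are written at cached_prod
 * and become visible to the consumer only when producer is updated.
 * There is no consumer index and no full check in this model.
 */
struct model_ring {
	uint64_t desc[RING_SIZE];
	uint32_t cached_prod;	/* private to the producer */
	uint32_t producer;	/* consumer may read up to here */
};

/* Stand-in for the receive step: write one descriptor, do not publish. */
void model_rcv(struct model_ring *r, uint64_t addr)
{
	assert(r != NULL);
	r->desc[r->cached_prod++ & (RING_SIZE - 1)] = addr;
}

/*
 * Stand-in for the flush step: publish everything written so far and
 * return how many entries this call made visible.
 */
uint32_t model_flush(struct model_ring *r)
{
	uint32_t published;

	assert(r != NULL);
	published = r->cached_prod - r->producer;
	r->producer = r->cached_prod;
	return published;
}
```

In the expected order (rcv, flush, rcv, flush) each flush publishes
one entry. In the observed order (rcv, rcv, flush, flush) the first
flush publishes both entries, including the one written by the other
context. A sequential model cannot reproduce a race, but it shows the
window that would open if two contexts really did share one ring.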

Here is the veth NAPI poll loop:

static int veth_poll(struct napi_struct *napi, int budget)
{
    struct veth_rq *rq =
        container_of(napi, struct veth_rq, xdp_napi);
    struct veth_stats stats = {};
    struct veth_xdp_tx_bq bq;
    int done;

    bq.count = 0;

    xdp_set_return_frame_no_direct();
    done = veth_xdp_rcv(rq, budget, &bq, &stats);

    if (done < budget && napi_complete_done(napi, done)) {
        /* Write rx_notify_masked before reading ptr_ring */
        smp_store_mb(rq->rx_notify_masked, false);
        if (unlikely(!__ptr_ring_empty(&rq->xdp_ring))) {
            if (napi_schedule_prep(&rq->xdp_napi)) {
                WRITE_ONCE(rq->rx_notify_masked, true);
                __napi_schedule(&rq->xdp_napi);
            }
        }
    }

    if (stats.xdp_tx > 0)
        veth_xdp_flush(rq, &bq);
    if (stats.xdp_redirect > 0)
        xdp_do_flush();
    xdp_clear_return_frame_no_direct();

    return done;
}

Something I have never seen before is that there is
napi_complete_done() and a __napi_schedule() before xdp_do_flush().
Let us check if this has something to do with it. So two suggestions
to be executed separately:

* Put a probe at the __napi_schedule() above and check if it gets
triggered before this problem
* Move the "if (stats.xdp_redirect > 0) xdp_do_flush();" to just
before "if (done < budget && napi_complete_done(napi, done)) {"

This might provide us some hints on what is going on.

Thanks: Magnus

> In our test setup we aren't applying any cpu affinity.  iperf2 is
> multi-threaded and can run on all 4 cores, and our application is
> multithreaded and can run on all 4 cores.  The napi scheduling seems
> to be per veth queue and yes I see those softirqs move and switch
> between CPUs.  I don't however have anything that clearly shows it
> running concurrently on two CPUs (The stretches of __xsk_rcv_zc are
> all on one core before it switches).  The closest I have is the
> several microseconds where it appears xsk_flush() overlaps at the end
> of my traces.  I would think that if the napi locking didn't work at
> all you'd see clear overlap.
>
> From my experiments with CPU affinity I've updated my test setup to
> frequently change the CPU affinity of iperf and our application on one
> of my test boxes with hopes that it helps to reproduce but I have no
> results so far.
>
> > > * I've also triggered this before I added the xsk_flush() probe and
> > >   in that case saw the kernel side additionally fill in the next
> > >   expected descriptor, which in the example above would be 0xfe4100.
> > >   This seems to indicate that my tracking is all still sane.
> > > * This is fairly reproducible, but I've got 10 test boxes running and
> > >   I only get maybe one bug a day.
> > >
> > > Any thoughts on if the bug I described is actually possible,
> > > alternative theories, or other things to test/try would be welcome.
> >
> > I thought this would be impossible, but apparently not :-). We are
> > apparently doing something wrong in the AF_XDP code or have the wrong
> > assumptions in some situation, but I just do not know what at this
> > point in time. Maybe it is veth that breaks some of our assumptions,
> > who knows. But let us dig into it. I need your help here, because I
> > think it will be hard for me to reproduce the issue.
>
> Yeah if you have ideas on what to test I'll do my best to try them.
>
> I've additionally updated my application to put a bad "cookie"
> descriptor address back in the RX ring before updating the consumer
> pointer.  My hope is that if we then ever receive that cookie it
> proves the kernel raced and failed to update the correct address.
>
> Thanks,
> Shawn Bohrer


* Re: [PATCH ethtool] misc: header includes cleanup
From: patchwork-bot+netdevbpf @ 2022-12-16 10:00 UTC
  To: Michal Kubecek; +Cc: netdev
In-Reply-To: <20221208131348.7B7166045E@lion.mk-sys.cz>

Hello:

This patch was applied to ethtool/ethtool.git (master)
by Michal Kubecek <mkubecek@suse.cz>:

On Thu,  8 Dec 2022 14:13:48 +0100 (CET) you wrote:
> An attempt to build with -std=c99 or -std=c11 revealed a few problems with
> system header includes.
> 
> - strcasecmp() and strncasecmp() need <strings.h>
> - ioctl() needs <linux/ioctl.h>
> - struct ifreq needs <linux/if.h> (unless _USE_MISC is defined)
> - fileno() needs _POSIX_C_SOURCE
> - strdup() needs _POSIX_C_SOURCE >= _200809L
> - inet_aton() would require _DEFAULT_SOURCE
> 
> [...]

Here is the summary with links:
  - [ethtool] misc: header includes cleanup
    https://git.kernel.org/pub/scm/network/ethtool/ethtool.git/commit/?id=1fa60003a8b8

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
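
A minimal sketch of the kind of include hygiene the quoted list
describes, limited to the two items that are easy to show in
isolation: strcasecmp() coming from <strings.h> and strdup() needing
a POSIX feature-test macro under a strict -std= setting. The helper
names names_match() and copy_name() are hypothetical and are not
taken from ethtool.

```c
/* Request POSIX.1-2008 declarations before any system header. */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <stdlib.h>	/* free(), for callers of copy_name() */
#include <string.h>	/* strdup(), strcmp() */
#include <strings.h>	/* strcasecmp() */

/* Returns 1 when two names match ignoring case, 0 otherwise. */
int names_match(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	return strcasecmp(a, b) == 0;
}

/* Returns a heap copy of name (caller frees), or NULL on failure. */
char *copy_name(const char *name)
{
	assert(name != NULL);
	return strdup(name);
}
```

The feature-test macro has to appear before the first system header
to take effect, which is why it sits at the very top of the file.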




* [PATCH v2] net: Fix documentation for unregister_netdevice_notifier_net
From: Miaoqian Lin @ 2022-12-16  9:48 UTC
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Sebastian Andrzej Siewior, Menglong Dong, Kuniyuki Iwashima,
	Petr Machata, Jiri Pirko, netdev, linux-kernel
  Cc: linmq006

unregister_netdevice_notifier_net() is used to unregister a notifier
registered by register_netdevice_notifier_net(). Also s/into/from/.

Fixes: a30c7b429f2d ("net: introduce per-netns netdevice notifiers")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
---
changes in v2:
- s/into/from/ as pointed out by Petr Machata.
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index b76fb37b381e..cf78f35bc0b9 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1840,7 +1840,7 @@ EXPORT_SYMBOL(register_netdevice_notifier_net);
  * @nb: notifier
  *
  * Unregister a notifier previously registered by
- * register_netdevice_notifier(). The notifier is unlinked into the
+ * register_netdevice_notifier_net(). The notifier is unlinked from the
  * kernel structures and may then be reused. A negative errno code
  * is returned on a failure.
  *
-- 
2.25.1
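
To illustrate why "unlinked from" is the right wording in the comment above, here is a minimal userspace sketch of a notifier chain with paired register and unregister helpers. It is a toy model, not the kernel implementation: `struct toy_notifier`, `struct toy_chain` and both functions are hypothetical, and the error codes are a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_notifier {
	int (*call)(unsigned long event, void *data);
	struct toy_notifier *next;
};

struct toy_chain {
	struct toy_notifier *head;
};

/* Link nb at the tail of the chain; reject a double registration. */
static int toy_chain_register(struct toy_chain *c, struct toy_notifier *nb)
{
	struct toy_notifier **p;

	assert(c && nb);
	for (p = &c->head; *p; p = &(*p)->next)
		if (*p == nb)
			return -EEXIST;
	nb->next = NULL;
	*p = nb;
	return 0;
}

/*
 * Unlink nb from the chain it was registered on. After a successful
 * return the notifier is no longer referenced and may be reused.
 * A negative errno is returned if nb was not on this chain.
 */
static int toy_chain_unregister(struct toy_chain *c, struct toy_notifier *nb)
{
	struct toy_notifier **p;

	assert(c && nb);
	for (p = &c->head; *p; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			nb->next = NULL;
			return 0;
		}
	}
	return -ENOENT;
}
```

The point of the pairing is that a notifier must be unregistered from the same chain it was registered on, which is what the corrected comment says for the per-netns variant.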


^ permalink raw reply related

* Re: [PATCH net-next v1 4/4] net: phy: mxl-gpy: disable interrupts on GPY215 by default
From: Michael Walle @ 2022-12-16  9:46 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Xu Liang, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rob Herring,
	Krzysztof Kozlowski, netdev, linux-kernel, devicetree
In-Reply-To: <Y4uzYVSRiE9feD01@lunn.ch>

On 2022-12-03 21:36, Andrew Lunn wrote:
>> > > @@ -290,6 +291,10 @@ static int gpy_probe(struct phy_device *phydev)
>> > >  	phydev->priv = priv;
>> > >  	mutex_init(&priv->mbox_lock);
>> > >
>> > > +	if (gpy_has_broken_mdint(phydev) &&
>> > > +	    !device_property_present(dev, "maxlinear,use-broken-interrupts"))
>> > > +		phydev->irq = PHY_POLL;
>> > > +
>> >
>> > I'm not sure of ordering here. It could be phydev->irq is set after
>> > probe. The IRQ is requested as part of phy_connect_direct(), which is
>> > much later.
>> 
>> I did it that way because phy_probe() also sets phydev->irq = PHY_POLL
>> in some cases and the phy driver .probe() is called right after it.
> 
> Yes, it is a valid point to do this check, but on its own i don't
> think it is sufficient.

Care to elaborate a bit? E.g. how does this differ from the case
where the phy has an interrupt described but no .config_intr()
op?

>> > I think a better place for this test is in gpy_config_intr(), return
>> > -EOPNOTSUPP. phy_enable_interrupts() failing should then cause
>> > phy_request_interrupt() to use polling.
>> 
>> Which will then print a warning, which might be misleading.
>> Or we disable the warning if -EOPNOTSUPP is returned?
> 
> Disabling the warning is the right thing to do.

There is more to this. .config_intr() is also used in
phy_init_hw() and phy_drv_supports_irq(). The latter would
still return true in our case. I'm not sure that is correct.

After trying your suggestion, I'm still in favor of somehow
telling the phy core to force polling mode during probe() of the
driver, the same way it's done if there is no .config_intr().

It's not like we'd need to change the mode after probe during
runtime. Also, with your proposed change the attachment print
is wrong/misleading, as it still prints the original irq instead
of PHY_POLL.
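
To make the probe-time variant concrete, here is a small standalone model of the decision. It is a sketch under stated assumptions, not driver code: `struct toy_phy`, `TOY_PHY_POLL` and `toy_gpy_probe()` are hypothetical stand-ins for phydev, PHY_POLL and gpy_probe(), and the two booleans replace gpy_has_broken_mdint() and the device property lookup.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PHY_POLL (-1) /* stand-in for PHY_POLL */

struct toy_phy {
	int irq;                    /* interrupt number or TOY_PHY_POLL */
	bool has_broken_mdint;      /* stand-in for gpy_has_broken_mdint() */
	bool use_broken_interrupts; /* stand-in for the opt-in property */
};

/*
 * Decide the interrupt mode once, at probe time. Everything that looks
 * at phy->irq afterwards (requesting the IRQ, the attachment print)
 * then sees the polling value rather than the original interrupt.
 */
static void toy_gpy_probe(struct toy_phy *phy)
{
	assert(phy);
	if (phy->has_broken_mdint && !phy->use_broken_interrupts)
		phy->irq = TOY_PHY_POLL;
}

static bool toy_phy_uses_polling(const struct toy_phy *phy)
{
	assert(phy);
	return phy->irq == TOY_PHY_POLL;
}
```

The alternative of failing in .config_intr() would leave `irq` untouched until the interrupt is requested, which is why the attachment print differs between the two approaches.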

-michael

^ permalink raw reply

* Re: [RFC PATCH 6/9] virtio_net: construct multi-buffer xdp in mergeable
From: Heng Qi @ 2022-12-16  9:42 UTC (permalink / raw)
  To: Jason Wang
  Cc: netdev, bpf, Michael S. Tsirkin, Paolo Abeni, Jakub Kicinski,
	John Fastabend, David S. Miller, Daniel Borkmann,
	Alexei Starovoitov, Eric Dumazet
In-Reply-To: <CACGkMEu4a0B8_3sWisnQ4PjAURfqTa8mWC8HWWHaW3QFv4EBjA@mail.gmail.com>



On 2022/12/16 at 11:46 AM, Jason Wang wrote:
> On Wed, Dec 14, 2022 at 4:38 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>> On Tue, Dec 13, 2022 at 03:08:46PM +0800, Jason Wang wrote:
>>> On Thu, Dec 8, 2022 at 4:30 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>
>>>>
>>>> On 2022/12/6 at 2:33 PM, Jason Wang wrote:
>>>>> On Tue, Nov 22, 2022 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>>>>>> Build multi-buffer xdp using virtnet_build_xdp_buff() in mergeable.
>>>>>>
>>>>>> For the prefilled buffer before xdp is set, vq reset can be
>>>>>> used to clear it, but most devices do not support it at present.
>>>>>> In order not to bother users who are using xdp normally, we do
>>>>>> not use vq reset for the time being.
>>>>> I'd suggest tweaking this part to say we will probably use vq reset in the future.
>>>> OK, it works.
>>>>
>>>>>> At the same time, virtio
>>>>>> net currently uses comp pages, and bpf_xdp_frags_increase_tail()
>>>>>> needs to calculate the tailroom of the last frag, which will
>>>>>> involve the offset of the corresponding page and cause a negative
>>>>>> value, so we disable tail increase by not setting xdp_rxq->frag_size.
>>>>>>
>>>>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>>>>>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>>>>> ---
>>>>>>    drivers/net/virtio_net.c | 67 +++++++++++++++++++++++-----------------
>>>>>>    1 file changed, 38 insertions(+), 29 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>>> index 20784b1d8236..83e6933ae62b 100644
>>>>>> --- a/drivers/net/virtio_net.c
>>>>>> +++ b/drivers/net/virtio_net.c
>>>>>> @@ -994,6 +994,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>>>>>>                                            unsigned int *xdp_xmit,
>>>>>>                                            struct virtnet_rq_stats *stats)
>>>>>>    {
>>>>>> +       unsigned int tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>>>>>>           struct virtio_net_hdr_mrg_rxbuf *hdr = buf;
>>>>>>           u16 num_buf = virtio16_to_cpu(vi->vdev, hdr->num_buffers);
>>>>>>           struct page *page = virt_to_head_page(buf);
>>>>>> @@ -1024,53 +1025,50 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>>>>>>           rcu_read_lock();
>>>>>>           xdp_prog = rcu_dereference(rq->xdp_prog);
>>>>>>           if (xdp_prog) {
>>>>>> +               unsigned int xdp_frags_truesz = 0;
>>>>>> +               struct skb_shared_info *shinfo;
>>>>>>                   struct xdp_frame *xdpf;
>>>>>>                   struct page *xdp_page;
>>>>>>                   struct xdp_buff xdp;
>>>>>>                   void *data;
>>>>>>                   u32 act;
>>>>>> +               int i;
>>>>>>
>>>>>> -               /* Transient failure which in theory could occur if
>>>>>> -                * in-flight packets from before XDP was enabled reach
>>>>>> -                * the receive path after XDP is loaded.
>>>>>> -                */
>>>>>> -               if (unlikely(hdr->hdr.gso_type))
>>>>>> -                       goto err_xdp;
>>>>> Two questions:
>>>>>
>>>>> 1) should we keep this check for the XDP program that can't deal with XDP frags?
>>>> Yes, the problem is the same as for an xdp program without xdp.frags
>>>> when GRO_HW is enabled. I will correct it.
>>>>
>>>>> 2) how could we guarantee that the vnet header (gso_type/csum_start
>>>>> etc) is still valid after XDP (where XDP program can choose to
>>>>> override the header)?
>>>> We can save the vnet header before the driver receives the packet and
>>>> builds the xdp_buff, and then use
>>>> the pre-saved value in the subsequent process.
>>> The problem is that XDP may modify the packet (header) so some fields
>>> are not valid any more (e.g csum_start/offset ?).
>>>
>>> If I'm not mistaken, there's no way for the XDP program to access those
>>> fields, or is that supported right now?
>>>
>> When the guest_csum feature is negotiated, xdp cannot be set, because the
>> metadata of xdp_{buff, frame} may be adjusted by the bpf program, so
>> csum_{start, offset} itself is invalid. At the same time, multi-buffer
>> xdp programs should only receive packets over a larger MTU, so we don't
>> need gso related information anymore and need to disable GRO_HW.
> Ok, that's fine.
>
> (But it requires a large pMTU).

Yes, like a jumbo frame that is larger than 4K.

Thanks.

>
> Thanks
>
>> Thanks.
>>
>>>>>> -
>>>>>> -               /* Buffers with headroom use PAGE_SIZE as alloc size,
>>>>>> -                * see add_recvbuf_mergeable() + get_mergeable_buf_len()
>>>>>> +               /* Now XDP core assumes frag size is PAGE_SIZE, but buffers
>>>>>> +                * with headroom may add hole in truesize, which
>>>>>> +                * make their length exceed PAGE_SIZE. So we disabled the
>>>>>> +                * hole mechanism for xdp. See add_recvbuf_mergeable().
>>>>>>                    */
>>>>>>                   frame_sz = headroom ? PAGE_SIZE : truesize;
>>>>>>
>>>>>> -               /* This happens when rx buffer size is underestimated
>>>>>> -                * or headroom is not enough because of the buffer
>>>>>> -                * was refilled before XDP is set. This should only
>>>>>> -                * happen for the first several packets, so we don't
>>>>>> -                * care much about its performance.
>>>>>> +               /* This happens when headroom is not enough because
>>>>>> +                * of the buffer was prefilled before XDP is set.
>>>>>> +                * This should only happen for the first several packets.
>>>>>> +                * In fact, vq reset can be used here to help us clean up
>>>>>> +                * the prefilled buffers, but many existing devices do not
>>>>>> +                * support it, and we don't want to bother users who are
>>>>>> +                * using xdp normally.
>>>>>>                    */
>>>>>> -               if (unlikely(num_buf > 1 ||
>>>>>> -                            headroom < virtnet_get_headroom(vi))) {
>>>>>> -                       /* linearize data for XDP */
>>>>>> -                       xdp_page = xdp_linearize_page(rq, &num_buf,
>>>>>> -                                                     page, offset,
>>>>>> -                                                     VIRTIO_XDP_HEADROOM,
>>>>>> -                                                     &len);
>>>>>> -                       frame_sz = PAGE_SIZE;
>>>>>> +               if (unlikely(headroom < virtnet_get_headroom(vi))) {
>>>>>> +                       if ((VIRTIO_XDP_HEADROOM + len + tailroom) > PAGE_SIZE)
>>>>>> +                               goto err_xdp;
>>>>>>
>>>>>> +                       xdp_page = alloc_page(GFP_ATOMIC);
>>>>>>                           if (!xdp_page)
>>>>>>                                   goto err_xdp;
>>>>>> +
>>>>>> +                       memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
>>>>>> +                              page_address(page) + offset, len);
>>>>>> +                       frame_sz = PAGE_SIZE;
>>>>> How can we know a single page is sufficient here? (before XDP is set,
>>>>> we reserve neither headroom nor tailroom).
>>>> This is only for the first buffer, refer to add_recvbuf_mergeable() and
>>>> get_mergeable_buf_len(). A buffer is never larger than a page.
>>> Ok.
>>>
>>> Thanks
>>>
>>>>>>                           offset = VIRTIO_XDP_HEADROOM;
>>>>> I think we should still try to do linearization for the XDP program
>>>>> that doesn't support XDP frags.
>>>> Yes, you are right.
>>>>
>>>> Thanks.
>>>>
>>>>> Thanks
>>>>>
>>>>>>                   } else {
>>>>>>                           xdp_page = page;
>>>>>>                   }
>>>>>> -
>>>>>> -               /* Allow consuming headroom but reserve enough space to push
>>>>>> -                * the descriptor on if we get an XDP_TX return code.
>>>>>> -                */
>>>>>>                   data = page_address(xdp_page) + offset;
>>>>>> -               xdp_init_buff(&xdp, frame_sz - vi->hdr_len, &rq->xdp_rxq);
>>>>>> -               xdp_prepare_buff(&xdp, data - VIRTIO_XDP_HEADROOM + vi->hdr_len,
>>>>>> -                                VIRTIO_XDP_HEADROOM, len - vi->hdr_len, true);
>>>>>> +               err = virtnet_build_xdp_buff(dev, vi, rq, &xdp, data, len, frame_sz,
>>>>>> +                                            &num_buf, &xdp_frags_truesz, stats);
>>>>>> +               if (unlikely(err))
>>>>>> +                       goto err_xdp_frags;
>>>>>>
>>>>>>                   act = bpf_prog_run_xdp(xdp_prog, &xdp);
>>>>>>                   stats->xdp_packets++;
>>>>>> @@ -1164,6 +1162,17 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>>>>>>                                   __free_pages(xdp_page, 0);
>>>>>>                           goto err_xdp;
>>>>>>                   }
>>>>>> +err_xdp_frags:
>>>>>> +               shinfo = xdp_get_shared_info_from_buff(&xdp);
>>>>>> +
>>>>>> +               if (unlikely(xdp_page != page))
>>>>>> +                       __free_pages(xdp_page, 0);
>>>>>> +
>>>>>> +               for (i = 0; i < shinfo->nr_frags; i++) {
>>>>>> +                       xdp_page = skb_frag_page(&shinfo->frags[i]);
>>>>>> +                       put_page(xdp_page);
>>>>>> +               }
>>>>>> +               goto err_xdp;
>>>>>>           }
>>>>>>           rcu_read_unlock();
>>>>>>
>>>>>> --
>>>>>> 2.19.1.6.gb485710b
>>>>>>


^ permalink raw reply

* Re: [PATCHv2 net-next] selftests/net: mv bpf/nat6to4.c to net folder
From: Björn Töpel @ 2022-12-16  9:34 UTC (permalink / raw)
  To: Hangbin Liu, netdev
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Shuah Khan,
	David Ahern, Lina Wang, Coleman Dietsch, bpf, Maciej Żenczykowski,
	Björn Töpel, Hangbin Liu
In-Reply-To: <20221216084109.1565213-1-liuhangbin@gmail.com>

Hangbin Liu <liuhangbin@gmail.com> writes:

> There are some issues with building bpf/nat6to4.c.
>
> 1. It uses TEST_CUSTOM_PROGS, which adds nat6to4.o to the
>    kselftest-list file so that it is run by the common run_tests.
> 2. When building the test via `make -C tools/testing/selftests/
>    TARGETS="net"`, nat6to4.o is built in the selftests/net/bpf/
>    folder. But the test udpgro_frglist.sh refers to ../bpf/nat6to4.o.
>    The correct path should be ./bpf/nat6to4.o.
> 3. When building the test via `make -C tools/testing/selftests/ TARGETS="net"
>    install`, nat6to4.o is installed to the kselftest_install/net/
>    folder. Then udpgro_frglist.sh should refer to ./nat6to4.o.
>
> To fix the confusing test path, let's just move nat6to4.c to the net folder
> and build it as TEST_GEN_FILES.
>
> v2: Update the Makefile rules to rely on commit 837a3d66d698 ("selftests:
> net: Add cross-compilation support for BPF programs").
>
> Fixes: edae34a3ed92 ("selftests net: add UDP GRO fraglist + bpf self-tests")
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

FWIW, tested cross-compilation on riscv (and minor nit below):

Tested-by: Björn Töpel <bjorn@kernel.org>

> diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
> index 3007e98a6d64..ed9a315187c1 100644
> --- a/tools/testing/selftests/net/Makefile
> +++ b/tools/testing/selftests/net/Makefile
> @@ -75,14 +75,60 @@ TEST_GEN_PROGS += so_incoming_cpu
>  TEST_PROGS += sctp_vrf.sh
>  TEST_GEN_FILES += sctp_hello
>  TEST_GEN_FILES += csum
> +TEST_GEN_FILES += nat6to4.o
>  
>  TEST_FILES := settings
>  
>  include ../lib.mk
>  
> -include bpf/Makefile
> -
>  $(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma
>  $(OUTPUT)/tcp_mmap: LDLIBS += -lpthread
>  $(OUTPUT)/tcp_inq: LDLIBS += -lpthread
>  $(OUTPUT)/bind_bhash: LDLIBS += -lpthread
> +
> +# Rules to generate bpf obj nat6to4.o
> +CLANG ?= clang
> +SCRATCH_DIR := $(OUTPUT)/tools
> +BUILD_DIR := $(SCRATCH_DIR)/build
> +BPFDIR := $(abspath ../../../lib/bpf)
> +APIDIR := $(abspath ../../../include/uapi)
> +
> +CCINCLUDE += -I../bpf
> +CCINCLUDE += -I../../../../usr/include/
> +CCINCLUDE += -I$(SCRATCH_DIR)/include
> +
> +BPFOBJ := $(BUILD_DIR)/libbpf/libbpf.a
> +
> +MAKE_DIRS := $(BUILD_DIR)/libbpf $(OUTPUT)/bpf
                                    ^^^^^^^^^^^^^
                                    Can be removed once the BPF prog
                                    is moved out of /bpf

Björn

^ permalink raw reply

* Re: [RFC net-next 14/15] devlink: add by-instance dump infra
From: Jiri Pirko @ 2022-12-16  9:23 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller, leon
In-Reply-To: <20221215114706.42be5299@kernel.org>

Thu, Dec 15, 2022 at 08:47:06PM CET, kuba@kernel.org wrote:
>On Thu, 15 Dec 2022 10:11:03 +0100 Jiri Pirko wrote:
>> Instead of having this extra list of ops struct, wouldn't it make sense
>> to rather implement this dumpit_one infra directly as a part of generic
>> netlink code?
>
>I was wondering about that, but none of the ideas were sufficiently
>neat to implement :( There's a lot of improvements that can be done
>in the core, starting with making more of the info structures shared
>between do and dump in genl :( 
>
>> Something like:
>> 
>>  	{
>>  		.cmd = DEVLINK_CMD_RATE_GET,
>>  		.doit = devlink_nl_cmd_rate_get_doit,
>> 		.dumpit_one = devlink_nl_cmd_rate_get_dumpit_one,
>> 		.dumpit_one_walk = devlink_nl_dumpit_one_walk,
>>  		.internal_flags = DEVLINK_NL_FLAG_NEED_RATE,
>>  		/* can be retrieved by unprivileged users */
>>  	},
>
>Growing the struct ops (especially the one called _small_) may be 
>a hard sell for a single user. For split ops, it's a different story,
>because we can possibly have a flag that changes the interpretation
>of the union. Maybe.
>
>I'd love to have a way of breaking down the ops so that we can factor
>out the filling of the message (the code that is shared between doit
>and dump). Just for the walk I don't think it's worth it.

Okay, that is something I thought about as well. Let me take a stab at
it.


>
>I went in the same direction as ethtool because if over time we arrive
>at a similar structure we can use that as a cornerstone.
>
>All in all, I think this patch is a reasonable step forward. 

Yeah, it could always be changed...


>But definitely agree that the genl infra is still painfully basic.


^ permalink raw reply

* Re: [PATCH net] devlink: protect devlink dump by the instance lock
From: Jiri Pirko @ 2022-12-16  9:18 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, jacob.e.keller, jiri, moshe
In-Reply-To: <20221215204447.149b00e6@kernel.org>

Fri, Dec 16, 2022 at 05:44:47AM CET, kuba@kernel.org wrote:
>On Thu, 15 Dec 2022 20:41:22 -0800 Jakub Kicinski wrote:
>> Take the instance lock around devlink_nl_fill() when dumping,
>> doit takes it already.
>> 
>> We are only dumping basic info so in the worst case we were risking
>> data races around the reload statistics. Also note that the reloads
>> themselves had not been under the instance lock until recently, so
>> the selection of the Fixes tag is inherently questionable.
>> 
>> Fixes: a254c264267e ("devlink: Add reload stats")
>
>On second thought, the drivers can't call reload, so until we got rid
>of the big bad mutex there could have been no race. I'll swap the tag
>for:
>
>Fixes: d3efc2a6a6d8 ("net: devlink: remove devlink_mutex")
>
>when/if applying.

You are right.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>

Thanks!

^ permalink raw reply

* [PATCH v1] iavfs/iavf_main: actually log ->src mask when talking about it
From: Daniil Tatianin @ 2022-12-16  9:13 UTC (permalink / raw)
  To: Jesse Brandeburg
  Cc: Daniil Tatianin, Tony Nguyen, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Harshitha Ramamurthy, Jeff Kirsher, intel-wired-lan,
	netdev, linux-kernel

This fixes a copy-paste issue where dev_err would log the dst mask even
though it is clearly talking about src.

Found by Linux Verification Center (linuxtesting.org) with the SVACE
static analysis tool.

Fixes: 0075fa0fadd0a ("i40evf: Add support to apply cloud filters")
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index c4e451ef7942..adc02adef83a 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -3850,7 +3850,7 @@ static int iavf_parse_cls_flower(struct iavf_adapter *adapter,
 				field_flags |= IAVF_CLOUD_FIELD_IIP;
 			} else {
 				dev_err(&adapter->pdev->dev, "Bad ip src mask 0x%08x\n",
-					be32_to_cpu(match.mask->dst));
+					be32_to_cpu(match.mask->src));
 				return -EINVAL;
 			}
 		}
-- 
2.25.1


^ permalink raw reply related

* Re: [RFC net-next 01/15] devlink: move code to a dedicated directory
From: Jiri Pirko @ 2022-12-16  9:10 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni, jacob.e.keller, leon
In-Reply-To: <20221215110925.6a9d0f4a@kernel.org>

Thu, Dec 15, 2022 at 08:09:25PM CET, kuba@kernel.org wrote:
>On Thu, 15 Dec 2022 10:51:43 +0100 Jiri Pirko wrote:
>> > net/Makefile                            | 1 +
>> > net/core/Makefile                       | 1 -
>> > net/devlink/Makefile                    | 3 +++
>> > net/{core/devlink.c => devlink/basic.c} | 0  
>> 
>> What's "basic" about it? It sounds a bit misleading.
>
>Agreed, but try to suggest a better name ;)  the_rest_of_it.c ? :)
>
>> > 4 files changed, 4 insertions(+), 1 deletion(-)
>> > create mode 100644 net/devlink/Makefile
>> > rename net/{core/devlink.c => devlink/basic.c} (100%)
>> >
>> >diff --git a/net/Makefile b/net/Makefile
>> >index 6a62e5b27378..0914bea9c335 100644
>> >--- a/net/Makefile
>> >+++ b/net/Makefile
>> >@@ -23,6 +23,7 @@ obj-$(CONFIG_BPFILTER)		+= bpfilter/
>> > obj-$(CONFIG_PACKET)		+= packet/
>> > obj-$(CONFIG_NET_KEY)		+= key/
>> > obj-$(CONFIG_BRIDGE)		+= bridge/
>> >+obj-$(CONFIG_NET_DEVLINK)	+= devlink/  
>> 
>> Hmm, as devlink is not really designed to be only networking thing,
>> perhaps this is good opportunity to move out of net/ and change the
>> config name to "CONFIG_DEVLINK" ?
>
>Nothing against it, but don't think it belongs in this patch.
>So I call scope creep.

Yeah, but I mean, since you move it from /net/core to /net/, why not
just move it to / ?


^ permalink raw reply

* Re: [RFC net-next 01/15] devlink: move code to a dedicated directory
From: Jiri Pirko @ 2022-12-16  9:09 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jacob Keller, davem, netdev, edumazet, pabeni, leon
In-Reply-To: <20221215114829.5bc59d7a@kernel.org>

Thu, Dec 15, 2022 at 08:48:29PM CET, kuba@kernel.org wrote:
>On Thu, 15 Dec 2022 11:29:02 -0800 Jacob Keller wrote:
>> >> What's "basic" about it? It sounds a bit misleading.  
>> > 
>> > Agreed, but try to suggest a better name ;)  the_rest_of_it.c ? :)
>> 
>> I tried to think of something, but you already use core elsewhere in the
>> series. If our long term goal really is to split everything out then
>> maybe "leftover.c"? Or just "devlink/devlink.c"
>
>leftover.c is fine by me. Jiri?

If the goal is to remove the file entirely eventually, I'm okay with
that.


^ permalink raw reply

* Re: [PATCH net-next v1 3/4] dt-bindings: net: phy: add MaxLinear GPY2xx bindings
From: Michael Walle @ 2022-12-16  9:03 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Rob Herring, Xu Liang, Andrew Lunn, Heiner Kallweit, Russell King,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Krzysztof Kozlowski, netdev, linux-kernel, devicetree
In-Reply-To: <6c82b403962aaf1450eb5014c9908328@walle.cc>

On 2022-12-06 10:44, Michael Walle wrote:
> On 2022-12-06 09:38, Krzysztof Kozlowski wrote:
> 
>>>>> Just omit the interrupt property if you don't want interrupts and
>>>>> add it if you do.
>>>> 
>>>> How does that work together with "the device tree describes
>>>> the hardware and not the configuration"? The interrupt line
>>>> is there, it's just broken sometimes and thus it's disabled
>>>> by default for these PHY revisions/firmwares. With this
>>>> flag the user can say, "hey on this hardware it is not
>>>> relevant because we don't have shared interrupts or because
>>>> I know what I'm doing".
>> 
>> Yeah, that's a good question. In your case broken interrupts could be
>> understood the same as "not connected", so property not present. When
>> things are broken, you do not describe them fully in DTS for the
>> completeness of hardware description, right?
> 
> I'd agree here, but in this case it's different. First, it isn't
> obvious in the first place that things are broken and boards in
> the field wouldn't/couldn't get that update. I'd really expect
> an erratum from MaxLinear here. And secondly, (which I
> just noticed right now, sorry), is that the interrupt line
> is also used for wake-on-lan, which can also be used even for
> the "broken" PHYs.
> 
> To work around this, the basic idea was to just disable the
> normal interrupts and fall back to polling mode, as the PHY
> driver just uses them for link detection and doesn't offer any
> advanced features like PTP (for now). But still give the system
> integrator a knob to opt in to the old behavior on new device
> trees.
> 
>>> Specifically you can't do the following: Have the same device
>>> tree and still be able to use it with a future PHY firmware
>>> update/revision. Because according to your suggestion, this
>>> won't have the interrupt property set. With this flag you can
>>> have the following cases:
>>>   (1) the interrupt information is there and can be used in the
>>>       future by non-broken PHY revisions,
>>>   (2) broken PHYs will ignore the interrupt line
>>>   (3) except the system designer opts-in with this flag (because
>>>       maybe this is the only PHY on the interrupt line etc).
>> 
>> I am not sure if I understand the case. You want to have a DTS with
>> interrupts and "maxlinear,use-broken-interrupts", where the latter
>> will be ignored by some future firmware?
> 
> Yes, that's correct.
> 
>> Isn't then the property not really correct? Broken for one firmware
>> on the same device, working for other firmware on the same device?
> 
> Arguable, but you can interpret "use broken-interrupts" as no-op
> if there are no broken interrupts.
> 
>> I would assume that in such cases you (or bootloader or overlay)
>> should patch the DTS...
> 
> I think this would turn the opt-in into an opt-out and we'd rely
> on the bootloader to work around the erratum. Which isn't what we
> want here.

Just a recap of what happened on IRC:
  (1) Krzysztof signalled that such a property might be ok but the
      commit message should explain it better. For reference
      here is what I explained there:

       maybe that property has a wrong name, but ultimately, it's just
       a hint that the systems designer wants to use the interrupts
       even if they don't work as expected, because they work on that
       particular hardware.
       the interrupt line is there but it's broken, there are device
       trees out there with that property, so all we can do is to not
       use the interrupts for that PHY. but as a systems designer who
       is aware of the consequences and knowing that they don't apply
       to my board, how could i then tell the driver to use it anyway.

  (2) Krzysztof pointed out that there is still the issue raised by
      Rob, that the schemas don't have any compatible and cannot be
      validated. I think that applies to all the network PHY bindings
      in the tree right now. I don't know how to fix them.

  (3) The main problem with the broken interrupt handling of the PHY
      is that it will disturb other devices on that interrupt line.
      IOW if the interrupt line is shared the PHY should fall back
      to polling mode. I haven't found anything in the interrupt
      subsys to query if a line is shared and I guess it's also
      conceptually impossible to do such a thing, because there
      might be any driver probed at a later time which also uses
      that line.
      Rob had the idea to walk the device tree and determine if
      a particular interrupt is used by other devices, too. If
      feasible, this sounds like a good enough heuristic for our
      problem. Although there might be some edge cases, like
      DT overlays loaded at linux runtime (?!).

So this is what I'd do now: I'd skip a new device tree property
for now and determine if the interrupt line is shared (by solely
looking at the DT) and then disable the interrupt in the PHY
driver. This raises the question of what we do if there is no DT:
interrupts disabled or enabled?
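
A rough sketch of the heuristic, to make the idea concrete. It assumes the device tree walk has already been flattened into an array of (interrupt parent, line) pairs, one per device. That flattening step, `struct toy_irq_user` and `toy_irq_is_shared()` are all hypothetical and not existing kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One entry per device that describes an interrupt in the DT. */
struct toy_irq_user {
	unsigned int parent; /* hypothetical id of the interrupt controller */
	unsigned int line;   /* interrupt specifier on that controller */
};

/*
 * Return true if any device other than users[self] describes the same
 * interrupt line on the same controller. Devices that only show up
 * later (e.g. via overlays) are invisible to this check, which is the
 * edge case mentioned in (3).
 */
static bool toy_irq_is_shared(const struct toy_irq_user *users, size_t n,
			      size_t self)
{
	size_t i;

	assert(users && self < n);
	for (i = 0; i < n; i++) {
		if (i == self)
			continue;
		if (users[i].parent == users[self].parent &&
		    users[i].line == users[self].line)
			return true;
	}
	return false;
}
```

The PHY driver would then fall back to polling only when this returns true for its own entry.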

Andrew, what do you think?

-michael

^ permalink raw reply

* [PATCHv2 net-next] selftests/net: mv bpf/nat6to4.c to net folder
From: Hangbin Liu @ 2022-12-16  8:41 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Shuah Khan,
	David Ahern, Lina Wang, Coleman Dietsch, bpf, Maciej Żenczykowski,
	Björn Töpel, Hangbin Liu

There are some issues with building bpf/nat6to4.c.

1. It uses TEST_CUSTOM_PROGS, which adds nat6to4.o to the
   kselftest-list file so that it is run by the common run_tests.
2. When building the test via `make -C tools/testing/selftests/
   TARGETS="net"`, nat6to4.o is built in the selftests/net/bpf/
   folder. But the test udpgro_frglist.sh refers to ../bpf/nat6to4.o.
   The correct path should be ./bpf/nat6to4.o.
3. When building the test via `make -C tools/testing/selftests/ TARGETS="net"
   install`, nat6to4.o is installed to the kselftest_install/net/
   folder. Then udpgro_frglist.sh should refer to ./nat6to4.o.

To fix the confusing test path, let's just move nat6to4.c to the net folder
and build it as TEST_GEN_FILES.

v2: Update the Makefile rules to rely on commit 837a3d66d698 ("selftests:
net: Add cross-compilation support for BPF programs").

Fixes: edae34a3ed92 ("selftests net: add UDP GRO fraglist + bpf self-tests")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---

I don't know if there is a way to just install a single TEST_GEN_FILES
to a separate folder. If there is, then we don't need to move the files.

---
 tools/testing/selftests/net/Makefile          | 50 +++++++++++++++++-
 tools/testing/selftests/net/bpf/Makefile      | 51 -------------------
 .../testing/selftests/net/{bpf => }/nat6to4.c |  0
 tools/testing/selftests/net/udpgro_frglist.sh |  8 +--
 4 files changed, 52 insertions(+), 57 deletions(-)
 delete mode 100644 tools/testing/selftests/net/bpf/Makefile
 rename tools/testing/selftests/net/{bpf => }/nat6to4.c (100%)

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 3007e98a6d64..ed9a315187c1 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -75,14 +75,60 @@ TEST_GEN_PROGS += so_incoming_cpu
 TEST_PROGS += sctp_vrf.sh
 TEST_GEN_FILES += sctp_hello
 TEST_GEN_FILES += csum
+TEST_GEN_FILES += nat6to4.o
 
 TEST_FILES := settings
 
 include ../lib.mk
 
-include bpf/Makefile
-
 $(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma
 $(OUTPUT)/tcp_mmap: LDLIBS += -lpthread
 $(OUTPUT)/tcp_inq: LDLIBS += -lpthread
 $(OUTPUT)/bind_bhash: LDLIBS += -lpthread
+
+# Rules to generate bpf obj nat6to4.o
+CLANG ?= clang
+SCRATCH_DIR := $(OUTPUT)/tools
+BUILD_DIR := $(SCRATCH_DIR)/build
+BPFDIR := $(abspath ../../../lib/bpf)
+APIDIR := $(abspath ../../../include/uapi)
+
+CCINCLUDE += -I../bpf
+CCINCLUDE += -I../../../../usr/include/
+CCINCLUDE += -I$(SCRATCH_DIR)/include
+
+BPFOBJ := $(BUILD_DIR)/libbpf/libbpf.a
+
+MAKE_DIRS := $(BUILD_DIR)/libbpf $(OUTPUT)/bpf
+$(MAKE_DIRS):
+	mkdir -p $@
+
+# Get Clang's default includes on this system, as opposed to those seen by
+# '-target bpf'. This fixes "missing" files on some architectures/distros,
+# such as asm/byteorder.h, asm/socket.h, asm/sockios.h, sys/cdefs.h etc.
+#
+# Use '-idirafter': Don't interfere with include mechanics except where the
+# build would have failed anyways.
+define get_sys_includes
+$(shell $(1) $(2) -v -E - </dev/null 2>&1 \
+	| sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }') \
+$(shell $(1) $(2) -dM -E - </dev/null | grep '__riscv_xlen ' | awk '{printf("-D__riscv_xlen=%d -D__BITS_PER_LONG=%d", $$3, $$3)}')
+endef
+
+ifneq ($(CROSS_COMPILE),)
+CLANG_TARGET_ARCH = --target=$(notdir $(CROSS_COMPILE:%-=%))
+endif
+
+CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG),$(CLANG_TARGET_ARCH))
+
+$(OUTPUT)/nat6to4.o: nat6to4.c $(BPFOBJ) | $(MAKE_DIRS)
+	$(CLANG) -O2 -target bpf -c $< $(CCINCLUDE) $(CLANG_SYS_INCLUDES) -o $@
+
+$(BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile)		       \
+	   $(APIDIR)/linux/bpf.h					       \
+	   | $(BUILD_DIR)/libbpf
+	$(MAKE) $(submake_extras) -C $(BPFDIR) OUTPUT=$(BUILD_DIR)/libbpf/     \
+		    EXTRA_CFLAGS='-g -O0'				       \
+		    DESTDIR=$(SCRATCH_DIR) prefix= all install_headers
+
+EXTRA_CLEAN := $(SCRATCH_DIR)
diff --git a/tools/testing/selftests/net/bpf/Makefile b/tools/testing/selftests/net/bpf/Makefile
deleted file mode 100644
index 4abaf16d2077..000000000000
--- a/tools/testing/selftests/net/bpf/Makefile
+++ /dev/null
@@ -1,51 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-CLANG ?= clang
-SCRATCH_DIR := $(OUTPUT)/tools
-BUILD_DIR := $(SCRATCH_DIR)/build
-BPFDIR := $(abspath ../../../lib/bpf)
-APIDIR := $(abspath ../../../include/uapi)
-
-CCINCLUDE += -I../../bpf
-CCINCLUDE += -I../../../../../usr/include/
-CCINCLUDE += -I$(SCRATCH_DIR)/include
-
-BPFOBJ := $(BUILD_DIR)/libbpf/libbpf.a
-
-MAKE_DIRS := $(BUILD_DIR)/libbpf $(OUTPUT)/bpf
-$(MAKE_DIRS):
-	mkdir -p $@
-
-TEST_CUSTOM_PROGS = $(OUTPUT)/bpf/nat6to4.o
-all: $(TEST_CUSTOM_PROGS)
-
-# Get Clang's default includes on this system, as opposed to those seen by
-# '-target bpf'. This fixes "missing" files on some architectures/distros,
-# such as asm/byteorder.h, asm/socket.h, asm/sockios.h, sys/cdefs.h etc.
-#
-# Use '-idirafter': Don't interfere with include mechanics except where the
-# build would have failed anyways.
-define get_sys_includes
-$(shell $(1) $(2) -v -E - </dev/null 2>&1 \
-	| sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }') \
-$(shell $(1) $(2) -dM -E - </dev/null | grep '__riscv_xlen ' | awk '{printf("-D__riscv_xlen=%d -D__BITS_PER_LONG=%d", $$3, $$3)}')
-endef
-
-ifneq ($(CROSS_COMPILE),)
-CLANG_TARGET_ARCH = --target=$(notdir $(CROSS_COMPILE:%-=%))
-endif
-
-CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG),$(CLANG_TARGET_ARCH))
-
-$(TEST_CUSTOM_PROGS): $(OUTPUT)/%.o: %.c $(BPFOBJ) | $(MAKE_DIRS)
-	$(CLANG) -O2 -target bpf -c $< $(CCINCLUDE) $(CLANG_SYS_INCLUDES) -o $@
-
-$(BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile)		       \
-	   $(APIDIR)/linux/bpf.h					       \
-	   | $(BUILD_DIR)/libbpf
-	$(MAKE) $(submake_extras) -C $(BPFDIR) OUTPUT=$(BUILD_DIR)/libbpf/     \
-		    EXTRA_CFLAGS='-g -O0'				       \
-		    DESTDIR=$(SCRATCH_DIR) prefix= all install_headers
-
-EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR)
-
diff --git a/tools/testing/selftests/net/bpf/nat6to4.c b/tools/testing/selftests/net/nat6to4.c
similarity index 100%
rename from tools/testing/selftests/net/bpf/nat6to4.c
rename to tools/testing/selftests/net/nat6to4.c
diff --git a/tools/testing/selftests/net/udpgro_frglist.sh b/tools/testing/selftests/net/udpgro_frglist.sh
index c9c4b9d65839..0a6359bed0b9 100755
--- a/tools/testing/selftests/net/udpgro_frglist.sh
+++ b/tools/testing/selftests/net/udpgro_frglist.sh
@@ -40,8 +40,8 @@ run_one() {
 
 	ip -n "${PEER_NS}" link set veth1 xdp object ${BPF_FILE} section xdp
 	tc -n "${PEER_NS}" qdisc add dev veth1 clsact
-	tc -n "${PEER_NS}" filter add dev veth1 ingress prio 4 protocol ipv6 bpf object-file ../bpf/nat6to4.o section schedcls/ingress6/nat_6  direct-action
-	tc -n "${PEER_NS}" filter add dev veth1 egress prio 4 protocol ip bpf object-file ../bpf/nat6to4.o section schedcls/egress4/snat4 direct-action
+	tc -n "${PEER_NS}" filter add dev veth1 ingress prio 4 protocol ipv6 bpf object-file nat6to4.o section schedcls/ingress6/nat_6  direct-action
+	tc -n "${PEER_NS}" filter add dev veth1 egress prio 4 protocol ip bpf object-file nat6to4.o section schedcls/egress4/snat4 direct-action
         echo ${rx_args}
 	ip netns exec "${PEER_NS}" ./udpgso_bench_rx ${rx_args} -r &
 
@@ -88,8 +88,8 @@ if [ ! -f ${BPF_FILE} ]; then
 	exit -1
 fi
 
-if [ ! -f bpf/nat6to4.o ]; then
-	echo "Missing nat6to4 helper. Build bpfnat6to4.o selftest first"
+if [ ! -f nat6to4.o ]; then
+	echo "Missing nat6to4 helper. Build bpf nat6to4.o selftest first"
 	exit -1
 fi
 
-- 
2.38.1


^ permalink raw reply related
