* [PATCH] net: dsa: fix fixed-link-phy device leaks
From: Johan Hovold @ 2016-11-16 14:47 UTC (permalink / raw)
To: Andrew Lunn
Cc: Vivien Didelot, Florian Fainelli, David S. Miller, netdev,
linux-kernel, Johan Hovold
Make sure to drop the reference taken by of_phy_find_device() when
registering and deregistering the fixed-link PHY-device.
Note that we need to put both references held at deregistration.
Fixes: 39b0c705195e ("net: dsa: Allow configuration of CPU & DSA port
speeds/duplex")
Signed-off-by: Johan Hovold <johan@kernel.org>
---
Hi,
This one has been compile tested only, but fixes a couple of leaks
similar to one that was found in the cpsw driver for which I just posted
a patch.
It turns out all drivers but DSA fail to deregister the fixed-link PHYs
registered by of_phy_register_fixed_link(). Due to the way this
interface was designed, deregistering such a PHY is a bit cumbersome and
looks like it would benefit from a common helper.
However, perhaps the interface should instead be changed so that the PHY
device is returned so that drivers do not need to use
of_phy_find_device() when they need to access properties of the fixed
link (e.g. as in dsa_cpu_dsa_setup below).
Thoughts?
Thanks,
Johan
net/dsa/dsa.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index a6902c1e2f28..798a6a776a5f 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -233,6 +233,8 @@ int dsa_cpu_dsa_setup(struct dsa_switch *ds, struct device *dev,
genphy_read_status(phydev);
if (ds->ops->adjust_link)
ds->ops->adjust_link(ds, port, phydev);
+
+ phy_device_free(phydev); /* of_phy_find_device */
}
return 0;
@@ -509,8 +511,12 @@ void dsa_cpu_dsa_destroy(struct device_node *port_dn)
if (of_phy_is_fixed_link(port_dn)) {
phydev = of_phy_find_device(port_dn);
if (phydev) {
- phy_device_free(phydev);
fixed_phy_unregister(phydev);
+ /* Put references taken by of_phy_find_device() and
+ * of_phy_register_fixed_link().
+ */
+ phy_device_free(phydev);
+ phy_device_free(phydev);
}
}
}
--
2.7.3
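For illustration only, the reference pairing that the patch above restores can be modelled in plain userspace C. This is a toy sketch, assuming that registration holds one reference and that every lookup takes one more; struct toy_phy and the toy_*() helpers are made up for this example and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a refcounted PHY device (hypothetical, not struct phy_device). */
struct toy_phy {
	int refcount;   /* number of outstanding references */
	int registered; /* 1 while the fixed link is registered */
};

/* Models registration: the registering side keeps one reference. */
static void toy_register_fixed_link(struct toy_phy *phy)
{
	phy->refcount = 1;
	phy->registered = 1;
}

/* Models a lookup: every successful call takes an extra reference. */
static struct toy_phy *toy_find(struct toy_phy *phy)
{
	if (!phy->registered)
		return NULL;
	phy->refcount++;
	return phy;
}

/* Drops one reference. */
static void toy_put(struct toy_phy *phy)
{
	assert(phy->refcount > 0);
	phy->refcount--;
}

/* Setup path: look the device up, use it, then drop the lookup reference. */
static void toy_setup(struct toy_phy *phy)
{
	struct toy_phy *found = toy_find(phy);

	if (found)
		toy_put(found);
}

/* Destroy path: unregister, then drop both the lookup and the
 * registration reference. Returns the remaining reference count.
 */
static int toy_destroy(struct toy_phy *phy)
{
	struct toy_phy *found = toy_find(phy);

	if (!found)
		return phy->refcount;
	found->registered = 0;
	toy_put(found); /* reference taken by the lookup */
	toy_put(found); /* reference held since registration */
	return found->refcount;
}
```

In this model, skipping the put in toy_setup() or dropping only one reference in toy_destroy() leaves a non-zero count behind, which is the kind of leak described above.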
* Re: [patch net-next 6/8] ipv4: fib: Add an API to request a FIB dump
From: Hannes Frederic Sowa @ 2016-11-16 14:51 UTC (permalink / raw)
To: Jiri Pirko, netdev
Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
alexander.h.duyck, kuznet, jmorris, yoshfuji, kaber
In-Reply-To: <1479305343-13167-7-git-send-email-jiri@resnulli.us>
On 16.11.2016 15:09, Jiri Pirko wrote:
> From: Ido Schimmel <idosch@mellanox.com>
>
> Commit b90eb7549499 ("fib: introduce FIB notification infrastructure")
> introduced a new notification chain to notify listeners (f.e., switchdev
> drivers) about addition and deletion of routes.
>
> However, upon registration to the chain the FIB tables can already be
> populated, which means potential listeners will have an incomplete view
> of the tables.
>
> Solve that by adding an API to request a FIB dump. The dump itself is
> done using RCU in order not to starve consumers that need RTNL to make
> progress.
>
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Have you looked at potential inconsistencies resulting from RCU walking
the table and having concurrent inserts?
I don't see a way around doing a journal like in filesystems somehow,
but maybe the effects are not that severe and it is not a problem after all.
Bye,
Hannes
* Re: [PATCH net 1/3] net: phy: realtek: add eee advertisement disable options
From: Jerome Brunet @ 2016-11-16 14:51 UTC (permalink / raw)
To: Andrew Lunn
Cc: Florian Fainelli, netdev-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA, Carlo Caione, Kevin Hilman,
Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl,
Andre Roth, Neil Armstrong,
linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20161116132337.GD19962-g2DYL2Zd6BY@public.gmane.org>
On Wed, 2016-11-16 at 14:23 +0100, Andrew Lunn wrote:
> >
> > There are two kinds of PHYs supporting eee, the ones advertising eee by
> > default (like realtek) and the ones not advertising it (like
> > micrel).
This is just the default register value.
>
> I don't know too much about EEE. So maybe a dumb question. Does the
> MAC need to be involved? Or is it just the PHY?
>
> If the MAC needs to be involved, the PHY should not be advertising
> EEE
> unless the MAC asks for it by calling phy_init_eee(). If this is
> true,
> maybe we need to change the realtek driver, and others in that class.
As far as I understand, the advertised capabilities are exchanged during
the auto-negotiation.
At this stage, if the advertisement is disabled (regardless of the
actual support) on either side of the link, there will be no low power
idle state on either the Tx or the Rx path.
If the advertisement is enabled on both sides but we don't call
phy_init_eee, I suppose Tx won't enter LPI, but Rx could.
>
> Andrew
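The negotiation rule described above can be sketched as a small userspace model: a mode can only use LPI if both link partners advertise it. This is a simplified sketch, not phylib code; the TOY_* bit values and helper names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative EEE advertisement bits (values assumed for this sketch). */
#define TOY_EEE_ADV_100TX 0x0002u
#define TOY_EEE_ADV_1000T 0x0004u

/* EEE modes that both ends of the link advertise. */
static uint16_t toy_eee_common(uint16_t local_adv, uint16_t partner_adv)
{
	return local_adv & partner_adv;
}

/* LPI is only possible for a mode if both link partners advertise it. */
static bool toy_eee_lpi_possible(uint16_t local_adv, uint16_t partner_adv,
				 uint16_t mode)
{
	assert(mode != 0);
	return (toy_eee_common(local_adv, partner_adv) & mode) != 0;
}
```

Clearing a bit on the local side, which is what a "disable advertisement" option would amount to in this model, is enough to rule out LPI for that mode whatever the partner advertises.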
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH net 1/3] net: phy: realtek: add eee advertisement disable options
From: Andrew Lunn @ 2016-11-16 15:06 UTC (permalink / raw)
To: Jerome Brunet
Cc: Florian Fainelli, netdev-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA, Carlo Caione, Kevin Hilman,
Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl,
Andre Roth, Neil Armstrong,
linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1479307890.17538.40.camel-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
On Wed, Nov 16, 2016 at 03:51:30PM +0100, Jerome Brunet wrote:
> On Wed, 2016-11-16 at 14:23 +0100, Andrew Lunn wrote:
> > >
> > > There are two kinds of PHYs supporting eee, the ones advertising eee by
> > > default (like realtek) and the ones not advertising it (like
> > > micrel).
>
> This is just the default register value.
>
> >
> > I don't know too much about EEE. So maybe a dumb question. Does the
> > MAC need to be involved? Or is it just the PHY?
> >
> > If the MAC needs to be involved, the PHY should not be advertising
> > EEE
> > unless the MAC asks for it by calling phy_init_eee(). If this is
> > true,
> > maybe we need to change the realtek driver, and others in that class.
>
> As far as I understand, the advertised capabilities are exchanged during
> the auto-negotiation.
>
> At this stage, if the advertisement is disabled (regardless of the
> actual support) on either side of the link, there will be no low power
> idle state on either the Tx or the Rx path.
>
> If the advertisement is enabled on both sides but we don't call
> phy_init_eee, I suppose Tx won't enter LPI, but Rx could.
What I was trying to find out is, if the MAC needs to support EEE as
well as the PHY, what happens when the MAC does not support EEE, but
the PHYs do negotiate EEE? Does it break?
Andrew
* Re: [PATCH net 2/3] dt-bindings: net: add DT bindings for realtek phys
From: Rob Herring @ 2016-11-16 15:11 UTC (permalink / raw)
To: Jerome Brunet
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA,
Florian Fainelli, Carlo Caione, Kevin Hilman, Giuseppe Cavallaro,
Alexandre TORGUE, Martin Blumenstingl, Andre Roth, Neil Armstrong,
linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1479220154-25851-3-git-send-email-jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
On Tue, Nov 15, 2016 at 03:29:13PM +0100, Jerome Brunet wrote:
> Signed-off-by: Jerome Brunet <jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Neil Armstrong <narmstrong-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> ---
> .../devicetree/bindings/net/realtek-phy.txt | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/net/realtek-phy.txt
>
> diff --git a/Documentation/devicetree/bindings/net/realtek-phy.txt b/Documentation/devicetree/bindings/net/realtek-phy.txt
> new file mode 100644
> index 000000000000..dc2845a6b387
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/realtek-phy.txt
> @@ -0,0 +1,20 @@
> +Realtek Ethernet PHY
> +
> +Some boards require special tuning values of the phy.
> +
> +Optional properties:
> +
> +realtek,disable-eee-1000t:
> +realtek,disable-eee-100tx:
Make these generic/common.
> + If set, respectively disable 1000-BaseT and 100-BaseTx energy efficient
> + ethernet capability advertisement
> + default: Leave the phy default settings unchanged (capabilities advertised)
> +
> +Example:
> +
> +&mdio0 {
> + ethernetphy0: ethernet-phy@0 {
> + reg = <0>;
> + realtek,disable-eee-1000t;
> + };
> +};
> --
> 2.7.4
>
* Re: linux kernel 4.6 and 4.7 from backports have bug in congestion module refcount
From: Ben Hutchings @ 2016-11-16 15:12 UTC (permalink / raw)
To: netdev; +Cc: Panagiotis Malakoudis, Debian kernel maintainers
In-Reply-To: <CAMig2e6C0Ga2oHiVGRrmQ_8Uw2AabWj7EyYyDDbHhxnMJSOWag@mail.gmail.com>
[Moving to netdev and debian-kernel lists]
On Wed, 2016-11-16 at 16:12 +0200, Panagiotis Malakoudis wrote:
> I am using Debian Jessie with backports and with the last kernels,
> 4.6.0-0.bpo.1 and 4.7.0-bpo.1 I experience the following WARNING after a
> few hours of heavy load tcp traffic.
>
> Nov 15 01:40:00 archive kernel: WARNING: CPU: 8 PID: 0 at /build/linux-lVEVrl/linux-4.7.8/kernel/module.c:1107 module_put+0x8d/0xa0
> Nov 15 01:40:00 archive kernel: Modules linked in: seqiv xfrm6_mode_tunnel xfrm4_mode_tunnel twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common ctr ecb des_generic cbc algif_skcipher camellia_generic camellia_aesni_avx2 camellia_aesni_avx_x86_64 camellia_x86_64 xts xcbc sha512_ssse3 sha512_generic md4 algif_hash af_alg xfrm_user xfrm4_tunnel ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ipip tunnel4 ip_tunnel xt_TCPMSS xt_tcpmss nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_recent xt_tcpudp xt_policy nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
> Nov 15 01:40:00 archive kernel: iptable_filter ip_tables x_tables xfs libcrc32c crc32c_generic intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iTCO_wdt iTCO_vendor_support jitterentropy_rng hmac drbg ansi_cprng aesni_intel ast aes_x86_64 lrw ttm gf128mul glue_helper ablk_helper cryptd drm_kms_helper pcspkr drm lpc_ich mfd_core i2c_i801 sg mei_me joydev evdev ipmi_si mei ioatdma shpchp wmi button ipmi_msghandler tpm_tis acpi_pad tpm acpi_power_meter tcp_htcp autofs4 ext4 crc16 jbd2 mbcache dm_mod md_mod hid_generic usbhid hid sd_mod crc32c_intel xhci_pci ahci xhci_hcd ehci_pci libahci ehci_hcd libata ixgbe mpt3sas igb usbcore raid_class scsi_transport_sas i2c_algo_bit usb_common dca scsi_mod ptp pps_core mdio fjes
> Nov 15 01:40:00 archive kernel: CPU: 8 PID: 0 Comm: swapper/8 Tainted: G W 4.7.0-0.bpo.1-amd64 #1 Debian 4.7.8-1~bpo8+1
> Nov 15 01:40:00 archive kernel: Hardware name: Supermicro Super Server/X10SRH-CF, BIOS 1.0c 09/14/2015
> Nov 15 01:40:00 archive kernel: 0000000000000286 c4c1e09c90bbaf17 ffffffffa3d1c805 0000000000000000
> Nov 15 01:40:00 archive kernel: 0000000000000000 ffffffffa3a7c9c4 ffff881024255000 ffffffffc032a0c0
> Nov 15 01:40:00 archive kernel: ffff881024255130 0000000000000000 ffff881024255000 ffff881024255000
> Nov 15 01:40:00 archive kernel: Call Trace:
> Nov 15 01:40:00 archive kernel: <IRQ> [<ffffffffa3d1c805>] ? dump_stack+0x5c/0x77
> Nov 15 01:40:00 archive kernel: [<ffffffffa3a7c9c4>] ? __warn+0xc4/0xe0
> Nov 15 01:40:00 archive kernel: [<ffffffffa3afe6dd>] ? module_put+0x8d/0xa0
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f41ed0>] ? tcp_v4_destroy_sock+0x20/0x290
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f292e7>] ? inet_csk_destroy_sock+0x47/0x160
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f38c37>] ? tcp_rcv_state_process+0xcc7/0xcd0
> Nov 15 01:40:00 archive kernel: [<ffffffffa3fd14e0>] ? tcp_v4_inbound_md5_hash+0x67/0x177
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f41d20>] ? tcp_v4_do_rcv+0x70/0x200
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f435e2>] ? tcp_v4_rcv+0x8a2/0xa90
> Nov 15 01:40:00 archive kernel: [<ffffffffc06fac01>] ? ipv4_confirm+0x61/0xf0 [nf_conntrack_ipv4]
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f1d86b>] ? ip_local_deliver_finish+0x8b/0x1c0
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f1db3b>] ? ip_local_deliver+0x6b/0xf0
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f1d7e0>] ? ip_rcv_finish+0x3e0/0x3e0
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f1de41>] ? ip_rcv+0x281/0x3b0
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f1d400>] ? inet_del_offload+0x40/0x40
> Nov 15 01:40:00 archive kernel: [<ffffffffa3edf11e>] ? __netif_receive_skb_core+0x2be/0xa50
> Nov 15 01:40:00 archive kernel: [<ffffffffa3f58865>] ? inet_gro_receive+0x265/0x2a0
> Nov 15 01:40:00 archive kernel: [<ffffffffa3edf93f>] ? netif_receive_skb_internal+0x2f/0xa0
> Nov 15 01:40:00 archive kernel: [<ffffffffa3ee088b>] ? napi_gro_receive+0xbb/0x110
> Nov 15 01:40:00 archive kernel: [<ffffffffc0332522>] ? ixgbe_clean_rx_irq+0x542/0xaf0 [ixgbe]
> Nov 15 01:40:00 archive kernel: [<ffffffffc033381c>] ? ixgbe_poll+0x44c/0x7a0 [ixgbe]
> Nov 15 01:40:00 archive kernel: [<ffffffffa3ee00e8>] ? net_rx_action+0x238/0x370
> Nov 15 01:40:00 archive kernel: [<ffffffffa3fe20b6>] ? __do_softirq+0x106/0x294
> Nov 15 01:40:00 archive kernel: [<ffffffffa3a82306>] ? irq_exit+0x86/0x90
> Nov 15 01:40:00 archive kernel: [<ffffffffa3fe1dff>] ? do_IRQ+0x4f/0xd0
> Nov 15 01:40:00 archive kernel: [<ffffffffa3fdff42>] ? common_interrupt+0x82/0x82
> Nov 15 01:40:00 archive kernel: <EOI> [<ffffffffa3ea38e2>] ? cpuidle_enter_state+0x112/0x260
> Nov 15 01:40:00 archive kernel: [<ffffffffa3abe31e>] ? cpu_startup_entry+0x2be/0x360
> Nov 15 01:40:00 archive kernel: [<ffffffffa3a4e1c1>] ? start_secondary+0x151/0x190
> Nov 15 01:40:00 archive kernel: ---[ end trace cdbbaec80305a0cc ]---
>
> Tracing the code I found that the issue comes from the usage of a tcp
> congestion module.
> I use sysctl -w net.ipv4.tcp_congestion_control=htcp, which is compiled as
> module. Default congestion algo, cubic, is compiled inside kernel, not
> module.
>
> When the warning is triggered, /sys/module/tcp_htcp/refcnt is -1, which is
> something that should not happen. It is triggered from inside
> tcp_v4_destroy_sock, when calling tcp_cleanup_congestion_control, which
> then calls module_put.
>
> This doesn't happen in 4.5.0-0.bpo.2 (2016-05-13) so I guess there is a bug
> introduced from 4.5 to 4.6 which still lives in 4.7 and decrements
> reference count of htcp congestion module when it shouldn't. The issue is
> only triggered under heavy load (the above mentioned server is a 10g SFP+
> server with more than 4 Gbps traffic at certain times of day).
I didn't see any bug fixes relating to congestion control refcounts
after 4.7, so I'm guessing this bug is still present upstream.
Ben.
> I will test with a different congestion module (tcp_highspeed) to see if
> the issue is only htcp related or it is general issue of using congestion
> tcp modules.
>
> Panagiotis Malakoudis
--
Ben Hutchings
Time is nature's way of making sure that everything doesn't happen at
once.
* [PATCH net] ixgbevf: fix invalid use of napi_hash_del()
From: Eric Dumazet @ 2016-11-16 15:15 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Jeff Kirsher
From: Eric Dumazet <edumazet@google.com>
Calling napi_hash_del() before netif_napi_del() is dangerous
if a synchronize_rcu() is not enforced before NAPI struct freeing.
Let's leave this detail to the core networking stack to get it right.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 7eaac3234049..30a26e624e5a 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -2511,9 +2511,6 @@ static int ixgbevf_alloc_q_vectors(struct ixgbevf_adapter *adapter)
while (q_idx) {
q_idx--;
q_vector = adapter->q_vector[q_idx];
-#ifdef CONFIG_NET_RX_BUSY_POLL
- napi_hash_del(&q_vector->napi);
-#endif
netif_napi_del(&q_vector->napi);
kfree(q_vector);
adapter->q_vector[q_idx] = NULL;
* Re: [patch net-next 6/8] ipv4: fib: Add an API to request a FIB dump
From: Ido Schimmel @ 2016-11-16 15:18 UTC (permalink / raw)
To: Hannes Frederic Sowa
Cc: Jiri Pirko, netdev, davem, idosch, eladr, yotamg, nogahf, arkadis,
ogerlitz, roopa, dsa, nikolay, andy, vivien.didelot, andrew,
f.fainelli, alexander.h.duyck, kuznet, jmorris, yoshfuji, kaber
In-Reply-To: <e795e6e0-a680-a2c3-7d5e-afd966104a4a@stressinduktion.org>
Hi Hannes,
On Wed, Nov 16, 2016 at 03:51:01PM +0100, Hannes Frederic Sowa wrote:
> On 16.11.2016 15:09, Jiri Pirko wrote:
> > From: Ido Schimmel <idosch@mellanox.com>
> >
> > Commit b90eb7549499 ("fib: introduce FIB notification infrastructure")
> > introduced a new notification chain to notify listeners (f.e., switchdev
> > drivers) about addition and deletion of routes.
> >
> > However, upon registration to the chain the FIB tables can already be
> > populated, which means potential listeners will have an incomplete view
> > of the tables.
> >
> > Solve that by adding an API to request a FIB dump. The dump itself is
> > done using RCU in order not to starve consumers that need RTNL to make
> > progress.
> >
> > Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> > Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>
> Have you looked at potential inconsistencies resulting of RCU walking
> the table and having concurrent inserts?
Yes. I did try to think about situations in which this approach will
fail, but I could only find problems with concurrent removals, which I
addressed in 5/8. In case of concurrent insertions, even if you missed
the node, you would still get the ENTRY_ADD event to your listener.
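The argument about concurrent insertions can be illustrated with a toy userspace model. It is only a sketch, assuming the listener is registered before the dump starts and that adding the same entry twice is harmless; struct toy_fib, struct toy_listener and the toy_*() helpers are invented for this example and do not reflect the actual FIB code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ROUTES 8

/* Listener's view of the table: a set of route ids. */
struct toy_listener {
	bool seen[TOY_MAX_ROUTES];
};

/* Toy routing table with at most one registered listener. */
struct toy_fib {
	bool present[TOY_MAX_ROUTES];
	struct toy_listener *listener; /* NULL until registered */
};

/* Record a route in the listener; adding it twice changes nothing. */
static void toy_listener_add(struct toy_listener *l, size_t id)
{
	assert(id < TOY_MAX_ROUTES);
	l->seen[id] = true;
}

/* Insert a route and notify the registered listener (models ENTRY_ADD). */
static void toy_fib_insert(struct toy_fib *fib, size_t id)
{
	assert(id < TOY_MAX_ROUTES);
	fib->present[id] = true;
	if (fib->listener)
		toy_listener_add(fib->listener, id);
}

/* Replay part of the table to the listener: ids in [from, to). */
static void toy_fib_dump_range(const struct toy_fib *fib,
			       struct toy_listener *l, size_t from, size_t to)
{
	for (size_t id = from; id < to && id < TOY_MAX_ROUTES; id++)
		if (fib->present[id])
			toy_listener_add(l, id);
}

/* True if the listener's view matches the table exactly. */
static bool toy_view_complete(const struct toy_fib *fib,
			      const struct toy_listener *l)
{
	for (size_t id = 0; id < TOY_MAX_ROUTES; id++)
		if (fib->present[id] != l->seen[id])
			return false;
	return true;
}
```

If a route is inserted behind the walker between two toy_fib_dump_range() calls, the dump misses it but the notification still delivers it, so the listener's view ends up complete. Removals are a different case and are not modelled here.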
* Re: [PATCH net 2/3] dt-bindings: net: add DT bindings for realtek phys
From: Jerome Brunet @ 2016-11-16 15:20 UTC (permalink / raw)
To: Rob Herring
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA,
Florian Fainelli, Carlo Caione, Kevin Hilman, Giuseppe Cavallaro,
Alexandre TORGUE, Martin Blumenstingl, Andre Roth, Neil Armstrong,
linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20161116151104.5sfcqjyvrzre5pkn@rob-hp-laptop>
On Wed, 2016-11-16 at 09:11 -0600, Rob Herring wrote:
> On Tue, Nov 15, 2016 at 03:29:13PM +0100, Jerome Brunet wrote:
> >
> > Signed-off-by: Jerome Brunet <jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> > Signed-off-by: Neil Armstrong <narmstrong-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>
> > ---
> > > .../devicetree/bindings/net/realtek-phy.txt | 20 ++++++++++++++++++++
> > > 1 file changed, 20 insertions(+)
> > > create mode 100644 Documentation/devicetree/bindings/net/realtek-phy.txt
> > >
> > > diff --git a/Documentation/devicetree/bindings/net/realtek-phy.txt b/Documentation/devicetree/bindings/net/realtek-phy.txt
> > new file mode 100644
> > index 000000000000..dc2845a6b387
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/net/realtek-phy.txt
> > @@ -0,0 +1,20 @@
> > +Realtek Ethernet PHY
> > +
> > +Some boards require special tuning values of the phy.
> > +
> > +Optional properties:
> > +
> > +realtek,disable-eee-1000t:
> > +realtek,disable-eee-100tx:
>
> Make these generic/common.
Same feedback from the net folks. Will do.
Thx Rob
>
> >
> > > + If set, respectively disable 1000-BaseT and 100-BaseTx energy efficient
> > > + ethernet capability advertisement
> > > + default: Leave the phy default settings unchanged (capabilities advertised)
> > +
> > +Example:
> > +
> > +&mdio0 {
> > + ethernetphy0: ethernet-phy@0 {
> > + reg = <0>;
> > + realtek,disable-eee-1000t;
> > + };
> > +};
> > --
> > 2.7.4
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe
> > devicetree" in
> > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
* Cannot set IPv6 address
From: Saeed Mahameed @ 2016-11-16 15:22 UTC (permalink / raw)
To: david.lebrun; +Cc: Linux Netdev List, Doron Tsur, Majd Dibbiny
Hi David,
The following commit introduced a new issue when setting IPv6 address
via the following command:
/sbin/ip -6 addr add 2001:0db8:0:f112::1/64 dev enp2s2
RTNETLINK answers: Operation not supported
Offending commit:
commit 6c8702c60b88651072460f3f4026c7dfe2521d12
Author: David Lebrun <david.lebrun@uclouvain.be>
Date: Tue Nov 8 14:57:41 2016 +0100
ipv6: sr: add support for SRH encapsulation and injection with lwtunnels
This patch creates a new type of interfaceless lightweight tunnel (SEG6),
enabling the encapsulation and injection of SRH within locally emitted
packets and forwarded packets.
From a configuration viewpoint, a seg6 tunnel would be configured
as follows:
ip -6 ro ad fc00::1/128 encap seg6 mode encap segs fc42::1,fc42::2,fc42::3 dev eth0
Any packet whose destination address is fc00::1 would thus be encapsulated
within an outer IPv6 header containing the SRH with three segments, and would
actually be routed to the first segment of the list. If `mode inline' was
specified instead of `mode encap', then the SRH would be directly inserted
after the IPv6 header without outer encapsulation.
The inline mode is only available if CONFIG_IPV6_SEG6_INLINE is enabled. This
feature was made configurable because direct header insertion may break
several mechanisms such as PMTUD or IPSec AH.
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Can you check? Are we missing something here?
Thanks,
Saeed.
* Re: [PATCH net] ixgbevf: fix invalid use of napi_hash_del()
From: Eric Dumazet @ 2016-11-16 15:25 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Jeff Kirsher
In-Reply-To: <1479309315.8455.189.camel@edumazet-glaptop3.roam.corp.google.com>
On Wed, 2016-11-16 at 07:15 -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Calling napi_hash_del() before netif_napi_del() is dangerous
> if a synchronize_rcu() is not enforced before NAPI struct freeing.
>
> Let's leave this detail to the core networking stack to get it right.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 3 ---
> 1 file changed, 3 deletions(-)
Will send a v2
Same stuff is needed in ixgbevf_free_q_vectors()
* Re: [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set
From: Daniel Borkmann @ 2016-11-16 15:26 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun,
Rana Shahout, Saeed Mahameed, Linux Netdev List
In-Reply-To: <CALzJLG8jYhp0EZS=oOTv-Oy6fJFTax8FSA6hh4EXXS1E5g-+uQ@mail.gmail.com>
On 11/16/2016 03:30 PM, Saeed Mahameed wrote:
> On Wed, Nov 16, 2016 at 3:54 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 11/16/2016 01:25 PM, Saeed Mahameed wrote:
>>> On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net>
>>> wrote:
>>>>
>>>> There are multiple issues in mlx5e_xdp_set():
>>>>
>>>> 1) The batched bpf_prog_add() is currently not checked for errors! When
>>>> doing so, it should be done at an earlier point in time to make sure
>>>> that we cannot fail anymore at the time we want to set the program for
>>>> each channel. This only means that we have to undo the bpf_prog_add()
>>>> in case we return due to required reset.
>>>>
>>>> 2) When swapping the priv->xdp_prog, then no extra reference count must be
>>>> taken since we got that from call path via dev_change_xdp_fd() already.
>>>> Otherwise, we'd never be able to free the program. Also, bpf_prog_add()
>>>> without checking the return code could fail.
>>>>
>>>> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
>>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>>> ---
>>>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 25
>>>> ++++++++++++++++++-----
>>>> 1 file changed, 20 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>>> index ab0c336..cf26672 100644
>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>>> @@ -3142,6 +3142,17 @@ static int mlx5e_xdp_set(struct net_device
>>>> *netdev, struct bpf_prog *prog)
>>>> goto unlock;
>>>> }
>>>>
>>>> + if (prog) {
>>>> + /* num_channels is invariant here, so we can take the
>>>> + * batched reference right upfront.
>>>> + */
>>>> + prog = bpf_prog_add(prog, priv->params.num_channels);
>>>> + if (IS_ERR(prog)) {
>>>> + err = PTR_ERR(prog);
>>>> + goto unlock;
>>>> + }
>>>> + }
>>>> +
>>>
>>> With this approach you will end up taking a ref count twice per RQ! on
>>> the first time xdp_set is called i.e (old_prog = NULL, prog != NULL).
>>> One ref will be taken per RQ/Channel from the above code, and since
>>> reset will be TRUE mlx5e_open_locked will be called and another ref
>>> count will be taken on mlx5e_create_rq.
>>>
>>> The issue here is that we have two places for ref count accounting,
>>> xdp_set and mlx5e_create_rq. Having ref-count updates in
>>> mlx5e_create_rq is essential for num_channels configuration changes
>>> (mlx5e_set_ringparam).
>>>
>>> Our previous approach made sure that only one path will do the ref
>>> counting (mlx5e_open_locked vs. mlx5e_xdp_set batch ref update when
>>> reset == false).
>>
>> That is correct, for a short time bpf_prog_add() was charged also when
>> we reset. When we reset, we will then jump to unlock_put and safely undo
>> this since we took ref from mlx5e_create_rq() already in that case, and
>> return successfully. That was intentional, so that eventually we end up
>> just taking a /single/ ref per RQ when we exit mlx5e_xdp_set(), but more
>> below ...
>>
>> [...]
>>>
>>> 2. Keep the current approach and make sure to not update the ref count
>>> twice, you can batch update only if (!reset && was_open) otherwise you
>>> can rely on mlx5e_open_locked to take the num_channels ref count for
>>> you.
>>>
>>> Personally I prefer option 2, since we will keep the current logic
>>> which will allow configuration updates such as (mlx5e_set_ringparam)
>>> to not worry about ref counting since it will be done in the reset
>>> flow.
>>
>> ... agree on keeping current approach. I actually like the idea, so we'd
>> end up with this simpler version for the batched ref then.
>
> Great :).
>
> So let's do the batched update only if we are not going to reset (we
> already know that in advance), instead of the current patch where you
> batch update unconditionally and then
> unlock_put in case reset was performed (which is just redundant and confusing).
>
> Please note that if open_locked fails you need to goto unlock_put.
Sorry, I'm not quite sure I follow you here; are you /now/ commenting on
the original patch or on the already updated diff I did from my very last
email, that is, http://patchwork.ozlabs.org/patch/695564/ ?
>> Note, your "bpf_prog_add(prog, 1) // one ref for the device." is not needed
>> since we already got that one through dev_change_xdp_fd() as mentioned.
>
> If it is not needed then why we need bpf_prog_put on mlx5e_nic_cleanup
> in your next patch? this doesn't look symmetric (right) !
> you shouldn't release a reference you didn't take.
> Overall with this series the driver can take num_channels refs and
> will release num_channels refs on mlx5e_close. we shouldn't release
> one extra ref on NIC cleanup.
I already explained this in the commit description; when dev_change_xdp_fd()
is called and sees a valid fd, it does bpf_prog_get_type(), which calls the
__bpf_prog_get() taking a ref on the program (bpf_prog_inc()). That is then
passed down to ops->ndo_xdp(). Only in case of error from the ->ndo_xdp()
callback, we bpf_prog_put() this reference from dev_change_xdp_fd() side.
For drivers that implement against this ndo, it means that you need N-1 refs
in addition. Have a look at the other two in-tree users, which do it correctly,
that is mlx4_xdp_set() and nfp_net_xdp_setup().
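The "N-1 refs in addition" accounting can be sketched in userspace as follows. This is a toy model, assuming the core hands the driver exactly one reference and that each of N rings ends up owning one; struct toy_prog and the toy_*() helpers are invented for the example and are not the bpf or driver APIs.

```c
#include <assert.h>

/* Toy stand-in for a refcounted program (hypothetical). */
struct toy_prog {
	int refcnt;
};

/* Models the core taking one reference before calling into the driver. */
static struct toy_prog *toy_core_get(struct toy_prog *prog)
{
	prog->refcnt++;
	return prog;
}

/* Drops one reference. */
static void toy_prog_put(struct toy_prog *prog)
{
	assert(prog->refcnt > 0);
	prog->refcnt--;
}

/* Driver attach: every ring owns one reference. One was already
 * handed over by the core, so only num_rings - 1 more are taken.
 */
static void toy_driver_attach(struct toy_prog *prog, int num_rings)
{
	assert(num_rings >= 1);
	prog->refcnt += num_rings - 1;
}

/* Driver detach: every ring drops the reference it owns. */
static void toy_driver_detach(struct toy_prog *prog, int num_rings)
{
	for (int i = 0; i < num_rings; i++)
		toy_prog_put(prog);
}
```

With this accounting, attach followed by detach returns the count to whatever it was before the core took its reference. Taking num_rings references in attach instead would leave one behind, and the program could never be freed.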
It's documented here (include/linux/netdevice.h) with ndo_xdp referring to it:
/* These structures hold the attributes of xdp state that are being passed
* to the netdevice through the xdp op.
*/
enum xdp_netdev_command {
/* Set or clear a bpf program used in the earliest stages of packet
* rx. The prog will have been loaded as BPF_PROG_TYPE_XDP. The callee
* is responsible for calling bpf_prog_put on any old progs that are
* stored. In case of error, the callee need not release the new prog
* reference, but on success it takes ownership and must bpf_prog_put
* when it is no longer used.
*/
XDP_SETUP_PROG,
[...]
};
I think the reason why this is rather confusing is that initially, it was
just a single prog for all queues, but it was requested along the way to move
prog pointer down to queues instead and let them have their own ref, so that
some time in future individual progs from a subset of the queues can be
exchanged.
Given the way xdp in mlx5 is implemented, you currently have the priv->xdp_prog
as the control prog. That's okay right now, but it requires dropping the last ref
on priv->xdp_prog via bpf_prog_put() when the netdev is dismantled and priv->xdp_prog
is still present.
* [PATCH v2 net] ixgbevf: fix invalid uses of napi_hash_del()
From: Eric Dumazet @ 2016-11-16 15:26 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Jeff Kirsher
In-Reply-To: <1479309315.8455.189.camel@edumazet-glaptop3.roam.corp.google.com>
From: Eric Dumazet <edumazet@google.com>
Calling napi_hash_del() before netif_napi_del() is dangerous
if a synchronize_rcu() is not enforced before NAPI struct freeing.
Let's leave this detail to the core networking stack to get it right.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 7eaac3234049..bf4d7efc7dbd 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -2511,9 +2511,6 @@ static int ixgbevf_alloc_q_vectors(struct ixgbevf_adapter *adapter)
while (q_idx) {
q_idx--;
q_vector = adapter->q_vector[q_idx];
-#ifdef CONFIG_NET_RX_BUSY_POLL
- napi_hash_del(&q_vector->napi);
-#endif
netif_napi_del(&q_vector->napi);
kfree(q_vector);
adapter->q_vector[q_idx] = NULL;
@@ -2537,9 +2534,6 @@ static void ixgbevf_free_q_vectors(struct ixgbevf_adapter *adapter)
struct ixgbevf_q_vector *q_vector = adapter->q_vector[q_idx];
adapter->q_vector[q_idx] = NULL;
-#ifdef CONFIG_NET_RX_BUSY_POLL
- napi_hash_del(&q_vector->napi);
-#endif
netif_napi_del(&q_vector->napi);
kfree(q_vector);
}
^ permalink raw reply related
* [PATCH net] ip6_tunnel: disable caching when the traffic class is inherited
From: Paolo Abeni @ 2016-11-16 15:26 UTC (permalink / raw)
To: netdev; +Cc: David S. Miller, Liam McBirnie, Hannes Frederic Sowa
If an ip6 tunnel is configured to inherit the traffic class from
the inner header, the dst_cache must be disabled or it will foul
the policy routing.
The issue has apparently been there since at least Linux-2.6.12-rc2.
Reported-by: Liam McBirnie <liam.mcbirnie@boeing.com>
Cc: Liam McBirnie <liam.mcbirnie@boeing.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
net/ipv6/ip6_tunnel.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 8778456..0a4759b 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1034,6 +1034,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
int mtu;
unsigned int psh_hlen = sizeof(struct ipv6hdr) + t->encap_hlen;
unsigned int max_headroom = psh_hlen;
+ bool use_cache = false;
u8 hop_limit;
int err = -1;
@@ -1066,7 +1067,15 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
neigh_release(neigh);
- } else if (!fl6->flowi6_mark)
+ } else if (!(t->parms.flags &
+ (IP6_TNL_F_USE_ORIG_TCLASS | IP6_TNL_F_USE_ORIG_FWMARK))) {
+ /* enable the cache only if the routing decision does
+ * not depend on the current inner header value
+ */
+ use_cache = true;
+ }
+
+ if (use_cache)
dst = dst_cache_get(&t->dst_cache);
if (!ip6_tnl_xmit_ctl(t, &fl6->saddr, &fl6->daddr))
@@ -1150,7 +1159,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
if (t->encap.type != TUNNEL_ENCAP_NONE)
goto tx_err_dst_release;
} else {
- if (!fl6->flowi6_mark && ndst)
+ if (use_cache && ndst)
dst_cache_set_ip6(&t->dst_cache, ndst, &fl6->saddr);
}
skb_dst_set(skb, dst);
--
1.8.3.1
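A minimal userspace sketch of the caching decision the patch introduces, assuming the first branch of the if/else (only partly visible in the hunk) can be summarized as "the destination was looked up per packet". The macro names, their values, and the function tunnel_may_use_dst_cache() are illustrative and are not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the real definitions live in the kernel headers. */
#define TNL_F_USE_ORIG_TCLASS 0x2u
#define TNL_F_USE_ORIG_FWMARK 0x20u

/*
 * Returns true when a cached route may be reused for this tunnel.
 * daddr_per_packet models the first branch of the original if/else, in
 * which the destination is resolved for each packet and the cache is
 * never consulted.
 */
static bool tunnel_may_use_dst_cache(unsigned int flags, bool daddr_per_packet)
{
	if (daddr_per_packet)
		return false;

	/* Inherited tclass or fwmark makes routing depend on the inner header. */
	return !(flags & (TNL_F_USE_ORIG_TCLASS | TNL_F_USE_ORIG_FWMARK));
}
```

The same boolean then gates both the cache lookup and the later cache store, which is what replacing the two !fl6->flowi6_mark tests with use_cache achieves in the diff.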
^ permalink raw reply related
* Re: Cannot set IPv6 address
From: Eric Dumazet @ 2016-11-16 15:29 UTC (permalink / raw)
To: Saeed Mahameed; +Cc: david.lebrun, Linux Netdev List, Doron Tsur, Majd Dibbiny
In-Reply-To: <CALzJLG9B52CbMEiWgroaihq4Ar2Jc9H3BZdA_xjF37CZESbZ8Q@mail.gmail.com>
On Wed, 2016-11-16 at 17:22 +0200, Saeed Mahameed wrote:
> Hi David,
>
> The following commit introduced a new issue when setting IPv6 address
> via the following command:
>
> /sbin/ip -6 addr add 2001:0db8:0:f112::1/64 dev enp2s2
> RTNETLINK answers: Operation not supported
>
> Offending commit:
>
> commit 6c8702c60b88651072460f3f4026c7dfe2521d12
> Author: David Lebrun <david.lebrun@uclouvain.be>
> Date: Tue Nov 8 14:57:41 2016 +0100
>
> ipv6: sr: add support for SRH encapsulation and injection with lwtunnels
>
> This patch creates a new type of interfaceless lightweight tunnel (SEG6),
> enabling the encapsulation and injection of SRH within locally emitted
> packets and forwarded packets.
>
> From a configuration viewpoint, a seg6 tunnel would be configured
> as follows:
>
> ip -6 ro ad fc00::1/128 encap seg6 mode encap segs fc42::1,fc42::2,fc42::3 dev eth0
>
> Any packet whose destination address is fc00::1 would thus be encapsulated
> within an outer IPv6 header containing the SRH with three segments, and would
> actually be routed to the first segment of the list. If `mode inline' was
> specified instead of `mode encap', then the SRH would be directly inserted
> after the IPv6 header without outer encapsulation.
>
> The inline mode is only available if CONFIG_IPV6_SEG6_INLINE is enabled. This
> feature was made configurable because direct header insertion may break
> several mechanisms such as PMTUD or IPSec AH.
>
> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
> Signed-off-by: David S. Miller <davem@davemloft.net>
>
>
> Can you check ? Are we missing something here ?
Sure, a patch is under review. Please look at the netdev archives and/or
ozlabs:
https://patchwork.ozlabs.org/patch/695060/
^ permalink raw reply
* RE: [PATCH] lan78xx: relocate mdix setting to phy driver
From: Woojung.Huh @ 2016-11-16 15:31 UTC (permalink / raw)
To: f.fainelli, davem; +Cc: andrew, netdev, UNGLinuxDriver
In-Reply-To: <c894a7bd-4062-cb63-6b70-02bb86c874c1@gmail.com>
> static void lan88xx_set_mdix(struct phy_device *phydev)
> {
> int buf;
> int mask_val;
>
> switch (phydev->mdix) {
> case ETH_TP_MDI:
> mask_val = LAN88XX_EXT_MODE_CTRL_MDI_;
> break;
> case ETH_TP_MDI_X:
> mask_val = LAN88XX_EXT_MODE_CTRL_MDI_X_;
> break;
> case ETH_TP_MDI_AUTO:
> mask_val = LAN88XX_EXT_MODE_CTRL_AUTO_MDIX_;
> break;
> default:
> return;
> }
>
> phy_write(phydev, LAN88XX_EXT_PAGE_ACCESS,
> LAN88XX_EXT_PAGE_SPACE_1);
> buf = phy_read(phydev, LAN88XX_EXT_MODE_CTRL);
> buf &= ~LAN88XX_EXT_MODE_CTRL_MDIX_MASK_;
> buf |= mask_val;
> phy_write(phydev, LAN88XX_EXT_MODE_CTRL, buf);
> phy_write(phydev, LAN88XX_EXT_PAGE_ACCESS,
> LAN88XX_EXT_PAGE_SPACE_0);
> }
Florian,
Looks simpler to me too. Will submit new patch.
Thanks.
- Woojung
^ permalink raw reply
* Re: Cannot set IPv6 address
From: David Lebrun @ 2016-11-16 15:34 UTC (permalink / raw)
To: Saeed Mahameed; +Cc: Linux Netdev List, Doron Tsur, Majd Dibbiny
In-Reply-To: <CALzJLG9B52CbMEiWgroaihq4Ar2Jc9H3BZdA_xjF37CZESbZ8Q@mail.gmail.com>
[-- Attachment #1.1: Type: text/plain, Size: 955 bytes --]
On 11/16/2016 04:22 PM, Saeed Mahameed wrote:
> Hi David,
>
> The following commit introduced a new issue when setting IPv6 address
> via the following command:
>
> /sbin/ip -6 addr add 2001:0db8:0:f112::1/64 dev enp2s2
> RTNETLINK answers: Operation not supported
>
> Offending commit:
>
> commit 6c8702c60b88651072460f3f4026c7dfe2521d12
Saeed,
Do you have LWTUNNEL enabled ? This commit introduced a bug causing IPv6
initialization to fail if LWTUNNEL is disabled. The patch has been
submitted to the list and is pending approval from DaveM.
If you see something like
NET: Registered protocol family 10
IPv6: Attempt to unregister permanent protocol 6
IPv6: Attempt to unregister permanent protocol 136
IPv6: Attempt to unregister permanent protocol 17
NET: Unregistered protocol family 10
in your dmesg logs then it would confirm my theory.
Short fix: enable CONFIG_LWTUNNEL or apply the patch in the attachment.
David
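A minimal userspace sketch of the failure mode described above, assuming a stub that returns -EOPNOTSUPP when the optional feature is compiled out. HAVE_TUNNEL, core_init(), tunnel_init() and the two subsystem_init_*() functions are hypothetical names used only for illustration; they are not the seg6 code.

```c
#include <assert.h>
#include <errno.h>

/* 0 models a build without the optional tunnel support. */
#ifndef HAVE_TUNNEL
#define HAVE_TUNNEL 0
#endif

static int core_registered;

static int core_init(void)
{
	core_registered = 1;
	return 0;
}

static void core_exit(void)
{
	core_registered = 0;
}

#if HAVE_TUNNEL
static int tunnel_init(void)
{
	return 0;
}
#else
/* Stub used when the feature is compiled out. */
static int tunnel_init(void)
{
	return -EOPNOTSUPP;
}
#endif

/* Unguarded: the stub's error unwinds the whole subsystem. */
static int subsystem_init_unguarded(void)
{
	int err = core_init();

	if (err)
		return err;

	err = tunnel_init();
	if (err) {
		core_exit();
		return err;
	}
	return 0;
}

/* Guarded: the optional step is skipped entirely when compiled out. */
static int subsystem_init_guarded(void)
{
	int err = core_init();

	if (err)
		return err;
#if HAVE_TUNNEL
	err = tunnel_init();
	if (err) {
		core_exit();
		return err;
	}
#endif
	return 0;
}
```

With HAVE_TUNNEL left at 0, the unguarded variant returns -EOPNOTSUPP and tears the core back down, while the guarded variant succeeds. That is the effect the #ifdef CONFIG_IPV6_SEG6_LWTUNNEL blocks in the attached patch aim for.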
[-- Attachment #1.2: 0001-ipv6-sr-add-option-to-control-lwtunnel-support.patch --]
[-- Type: text/x-patch; name="0001-ipv6-sr-add-option-to-control-lwtunnel-support.patch", Size: 4643 bytes --]
From 51775d7223b6d5bd16cb5d09df9ba494fac8ffda Mon Sep 17 00:00:00 2001
From: David Lebrun <david.lebrun@uclouvain.be>
Date: Tue, 15 Nov 2016 14:57:52 +0100
Subject: [PATCH net-next 1/1] ipv6: sr: add option to control lwtunnel support
This patch adds a new option CONFIG_IPV6_SEG6_LWTUNNEL to enable/disable
support of encapsulation with the lightweight tunnels. When this option
is enabled, CONFIG_LWTUNNEL is automatically selected.
Fix commit 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
Without a proper option to control lwtunnel support for SR-IPv6, if
CONFIG_LWTUNNEL=n then the IPv6 initialization fails as a consequence
of seg6_iptunnel_init() failure with EOPNOTSUPP:
NET: Registered protocol family 10
IPv6: Attempt to unregister permanent protocol 6
IPv6: Attempt to unregister permanent protocol 136
IPv6: Attempt to unregister permanent protocol 17
NET: Unregistered protocol family 10
Tested (compiling, booting, and loading ipv6 module when relevant)
with possible combinations of CONFIG_IPV6={y,m,n},
CONFIG_IPV6_SEG6_LWTUNNEL={y,n} and CONFIG_LWTUNNEL={y,n}.
Reported-by: Lorenzo Colitti <lorenzo@google.com>
Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
---
net/ipv6/Kconfig | 13 ++++++++++++-
net/ipv6/Makefile | 5 +++--
net/ipv6/seg6.c | 8 ++++++++
3 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index 0f00811..ec1267e 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -289,9 +289,20 @@ config IPV6_PIMSM_V2
Support for IPv6 PIM multicast routing protocol PIM-SMv2.
If unsure, say N.
+config IPV6_SEG6_LWTUNNEL
+ bool "IPv6: Segment Routing Header encapsulation support"
+ depends on IPV6
+ select LWTUNNEL
+ ---help---
+ Support for encapsulation of packets within an outer IPv6
+ header and a Segment Routing Header using the lightweight
+ tunnels mechanism.
+
+ If unsure, say N.
+
config IPV6_SEG6_INLINE
bool "IPv6: direct Segment Routing Header insertion "
- depends on IPV6
+ depends on IPV6_SEG6_LWTUNNEL
---help---
Support for direct insertion of the Segment Routing Header,
also known as inline mode. Be aware that direct insertion of
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 129cad2..a9e9fec 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -9,7 +9,7 @@ ipv6-objs := af_inet6.o anycast.o ip6_output.o ip6_input.o addrconf.o \
route.o ip6_fib.o ipv6_sockglue.o ndisc.o udp.o udplite.o \
raw.o icmp.o mcast.o reassembly.o tcp_ipv6.o ping.o \
exthdrs.o datagram.o ip6_flowlabel.o inet6_connection_sock.o \
- udp_offload.o seg6.o seg6_iptunnel.o
+ udp_offload.o seg6.o
ipv6-offload := ip6_offload.o tcpv6_offload.o exthdrs_offload.o
@@ -23,6 +23,8 @@ ipv6-$(CONFIG_IPV6_MULTIPLE_TABLES) += fib6_rules.o
ipv6-$(CONFIG_PROC_FS) += proc.o
ipv6-$(CONFIG_SYN_COOKIES) += syncookies.o
ipv6-$(CONFIG_NETLABEL) += calipso.o
+ipv6-$(CONFIG_IPV6_SEG6_LWTUNNEL) += seg6_iptunnel.o
+ipv6-$(CONFIG_IPV6_SEG6_HMAC) += seg6_hmac.o
ipv6-objs += $(ipv6-y)
@@ -44,7 +46,6 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
obj-$(CONFIG_IPV6_FOU) += fou6.o
-obj-$(CONFIG_IPV6_SEG6_HMAC) += seg6_hmac.o
obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
diff --git a/net/ipv6/seg6.c b/net/ipv6/seg6.c
index 50f6e06..b172d85 100644
--- a/net/ipv6/seg6.c
+++ b/net/ipv6/seg6.c
@@ -451,9 +451,11 @@ int __init seg6_init(void)
if (err)
goto out_unregister_genl;
+#ifdef CONFIG_IPV6_SEG6_LWTUNNEL
err = seg6_iptunnel_init();
if (err)
goto out_unregister_pernet;
+#endif
#ifdef CONFIG_IPV6_SEG6_HMAC
err = seg6_hmac_init();
@@ -467,10 +469,14 @@ int __init seg6_init(void)
return err;
#ifdef CONFIG_IPV6_SEG6_HMAC
out_unregister_iptun:
+#ifdef CONFIG_IPV6_SEG6_LWTUNNEL
seg6_iptunnel_exit();
#endif
+#endif
+#ifdef CONFIG_IPV6_SEG6_LWTUNNEL
out_unregister_pernet:
unregister_pernet_subsys(&ip6_segments_ops);
+#endif
out_unregister_genl:
genl_unregister_family(&seg6_genl_family);
goto out;
@@ -481,7 +487,9 @@ void seg6_exit(void)
#ifdef CONFIG_IPV6_SEG6_HMAC
seg6_hmac_exit();
#endif
+#ifdef CONFIG_IPV6_SEG6_LWTUNNEL
seg6_iptunnel_exit();
+#endif
unregister_pernet_subsys(&ip6_segments_ops);
genl_unregister_family(&seg6_genl_family);
}
--
2.7.3
^ permalink raw reply related
* Re: [PATCH net 1/3] net: phy: realtek: add eee advertisement disable options
From: Jerome Brunet @ 2016-11-16 15:38 UTC (permalink / raw)
To: Andrew Lunn
Cc: Florian Fainelli, netdev, devicetree, Carlo Caione, Kevin Hilman,
Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl,
Andre Roth, Neil Armstrong, linux-amlogic, linux-arm-kernel,
linux-kernel
In-Reply-To: <20161116150628.GI23231@lunn.ch>
On Wed, 2016-11-16 at 16:06 +0100, Andrew Lunn wrote:
> On Wed, Nov 16, 2016 at 03:51:30PM +0100, Jerome Brunet wrote:
> >
> > On Wed, 2016-11-16 at 14:23 +0100, Andrew Lunn wrote:
> > >
> > > >
> > > >
> > > > There are two kinds of PHYs supporting eee, the ones advertising eee
> > > > by default (like realtek) and the ones not advertising it (like
> > > > micrel).
> >
> > This is just the default register value.
> >
> > >
> > >
> > > I don't know too much about EEE. So maybe a dumb question. Does
> > > the
> > > MAC need to be involved? Or is it just the PHY?
> > >
> > > If the MAC needs to be involved, the PHY should not be
> > > advertising
> > > EEE
> > > unless the MAC asks for it by calling phy_init_eee(). If this is
> > > true,
> > > maybe we need to change the realtek driver, and others in that
> > > class.
> >
> > As far as I understand, the advertised capabilities are exchanged
> > during the auto-negotiation.
> >
> > At this stage, if the advertisement is disabled (regardless of the
> > actual support) on either side of the link, there will be no low
> > power idle state on the Tx nor the Rx path.
> >
> > If the advertisement is enabled on both sides but we don't call
> > phy_init_eee, I suppose Tx won't enter LPI, but Rx could.
>
> What i was trying to find out is, if the MAC needs to support EEE as
> well as the PHY, what happens when the MAC does not support EEE, but
> the PHYs do negotiate EEE? Does it break?
Interesting question. In a regular case, I suppose it should be fine.
As you would have LPI only on the Rx path this should be transparent to
the MAC. That's my understanding. Maybe people knowing EEE better than
me could confirm (or not) ? Peppe? Alexandre?
I just checked with the OdroidC2, I disabled eee support by forcing
"dma_cap.eee = 0" in stmmac_get_hw_features. As expected, no tx_LPI
interrupts but plenty of rx_LPI interrupts.
What was not expected is the test failing like before.
So in our case, having LPI on the Rx path is fine for receiving data,
but not for sending.
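A minimal sketch of the working hypothesis discussed in this thread, written as a small truth function. It encodes only what is supposed above (no LPI at all unless both link partners advertise EEE; Tx LPI additionally needs the MAC to take part) and is not a statement of what the hardware guarantees. struct lpi_paths and lpi_expected() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Which directions may enter low power idle under the hypothesis. */
struct lpi_paths {
	bool tx;
	bool rx;
};

/*
 * local_adv / peer_adv: EEE advertised by each PHY during auto-negotiation.
 * mac_eee: the MAC takes part in EEE (modeled as "phy_init_eee was called").
 */
static struct lpi_paths lpi_expected(bool local_adv, bool peer_adv, bool mac_eee)
{
	struct lpi_paths p = { false, false };

	/* EEE is not negotiated unless both sides advertise it. */
	if (!local_adv || !peer_adv)
		return p;

	p.rx = true;     /* the link partner may still signal LPI toward us */
	p.tx = mac_eee;  /* our side only enters LPI if the MAC drives it */
	return p;
}
```

The OdroidC2 observation above corresponds to lpi_expected(true, true, false): rx_LPI interrupts but no tx_LPI ones. The model says nothing about why transmission still fails in that state.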
>
> Andrew
^ permalink raw reply
* Re: [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup
From: Daniel Borkmann @ 2016-11-16 15:45 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun,
Rana Shahout, Saeed Mahameed, Linux Netdev List
In-Reply-To: <CALzJLG9Rzds3H-wXnjCg=_2oGV-33TY714Hx0n+NyVsZzcBBfA@mail.gmail.com>
On 11/16/2016 01:51 PM, Saeed Mahameed wrote:
> On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>> mlx5e_xdp_set() is currently the only place where we drop reference on the
>> prog sitting in priv->xdp_prog when it's exchanged by a new one. We also
>> need to make sure that we eventually release that reference, for example,
>> in case the netdev is dismantled.
>>
>> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>> ---
>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>> index cf26672..60fe54c 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>> @@ -3715,6 +3715,9 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv *priv)
>>
>> if (MLX5_CAP_GEN(mdev, vport_group_manager))
>> mlx5_eswitch_unregister_vport_rep(esw, 0);
>> +
>> + if (priv->xdp_prog)
>> + bpf_prog_put(priv->xdp_prog);
>> }
>
> I thought that on unregister_netdev ndo_xdp_set will be called with
> NULL prog to cleanup. like any other resources (Vlans/mac_lists/
> etc..), why xdp should be different ?
> Anyway if this is the case, I am ok with this fix, you can even send
> it to net (it looks like a serious leak).
The only interaction with ndo_xdp() right now is dev_change_xdp_fd()
and the currently rather terse dump via rtnl_xdp_fill(). The latter
only tells whether something is actually attached and will have more
info in the near future, but doesn't alter anything.
dev_change_xdp_fd() is only triggered from user side via netlink when
IFLA_XDP container attr is around, so no automatic cleanup here. This
means as per documentation in enum xdp_netdev_command, that the driver
has full ownership, thus needs to bpf_prog_put().
^ permalink raw reply
* Re: [PATCH net-next v2] ipv6: sr: fix IPv6 initialization failure without lwtunnels
From: Roopa Prabhu @ 2016-11-16 15:49 UTC (permalink / raw)
To: David Miller; +Cc: david.lebrun, netdev, lorenzo
In-Reply-To: <20161115.101857.1945116546500210861.davem@davemloft.net>
On 11/15/16, 7:18 AM, David Miller wrote:
> From: David Lebrun <david.lebrun@uclouvain.be>
> Date: Tue, 15 Nov 2016 11:17:20 +0100
>
>> On 11/14/2016 03:22 PM, Roopa Prabhu wrote:
>>> I prefer option b). most LWTUNNEL encaps are done this way.
>>>
>>> seg6 and seg6_iptunnel is new segment routing code and can be under
>>> CONFIG_IPV6_SEG6 which depends on CONFIG_LWTUNNEL and CONFIG_IPV6.
>>> CONFIG_IPV6_SEG6_HMAC could then depend on CONFIG_IPV6_SEG6
>> Will do that, thanks
> This is good for the time being.
>
> Although I'd like to entertain the idea of making LWTUNNEL
> unconditionally built and considered a fundamental piece of
> networking infrastructure just like net/core/dst.c
ok, ack. I can submit a patch for that. But, I had the lwtunnel infra hooks in
CONFIG_LWTUNNEL to reduce the cost of hooks in the default fast path when it was not enabled.
Will need to re-evaluate the cost of the hooks in the default fast-path.
I am assuming you are ok with various encaps staying in their respective configs (mpls iptunnels, ila, and now
ipv6 segment routing).
thanks
^ permalink raw reply
* Re: [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup
From: Daniel Borkmann @ 2016-11-16 15:55 UTC (permalink / raw)
To: Saeed Mahameed
Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun,
Rana Shahout, Saeed Mahameed, Linux Netdev List
In-Reply-To: <582C7F11.7000006@iogearbox.net>
On 11/16/2016 04:45 PM, Daniel Borkmann wrote:
> On 11/16/2016 01:51 PM, Saeed Mahameed wrote:
>> On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>>> mlx5e_xdp_set() is currently the only place where we drop reference on the
>>> prog sitting in priv->xdp_prog when it's exchanged by a new one. We also
>>> need to make sure that we eventually release that reference, for example,
>>> in case the netdev is dismantled.
>>>
>>> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>> ---
>>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> index cf26672..60fe54c 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>>> @@ -3715,6 +3715,9 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv *priv)
>>>
>>> if (MLX5_CAP_GEN(mdev, vport_group_manager))
>>> mlx5_eswitch_unregister_vport_rep(esw, 0);
>>> +
>>> + if (priv->xdp_prog)
>>> + bpf_prog_put(priv->xdp_prog);
>>> }
>>
>> I thought that on unregister_netdev ndo_xdp_set will be called with
>> NULL prog to cleanup. like any other resources (Vlans/mac_lists/
>> etc..), why xdp should be different ?
>> Anyway if this is the case, I am ok with this fix, you can even send
>> it to net (it looks like a serious leak).
>
> The only interaction with ndo_xdp() right now is dev_change_xdp_fd()
> and the currently a bit terse dump via rtnl_xdp_fill(). The latter
> only tells whether something is actually attached and will have more
> info in near future, but doesn't alter anything.
>
> dev_change_xdp_fd() is only triggered from user side via netlink when
> IFLA_XDP container attr is around, so no automatic cleanup here. This
> means as per documentation in enum xdp_netdev_command, that the driver
> has full ownership, thus needs to bpf_prog_put().
Note that without patch 2/4, just sending this one to net doesn't really
solve anything, since there the mlx5e_xdp_set() still has the incorrect
bpf_prog_add(prog, 1) around. So it's the whole series if so. I had it
originally targeted at net, but Alexei suggested net-next; I don't really
mind either way, so I agreed to go for net-next.
^ permalink raw reply
* Re: [PATCH net] : add a missing rcu synchronization
From: Michael Chan @ 2016-11-16 15:59 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1479306712.8455.185.camel@edumazet-glaptop3.roam.corp.google.com>
On Wed, Nov 16, 2016 at 6:31 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Add a missing synchronize_net() call to avoid potential use after free,
> since we explicitly call napi_hash_del() to factorize the RCU grace
> period.
>
> Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Michael Chan <michael.chan@broadcom.com>
Thanks. Subject line is missing the driver name. Other than that,
Acked-by: Michael Chan <michael.chan@broadcom.com>
^ permalink raw reply
* Re: [PATCH] bpf: fix possible uninitialized access in inactive rotation
From: Daniel Borkmann @ 2016-11-16 16:00 UTC (permalink / raw)
To: Arnd Bergmann, Alexei Starovoitov
Cc: Martin KaFai Lau, David S. Miller, netdev, linux-kernel
In-Reply-To: <20161116143836.2448688-1-arnd@arndb.de>
On 11/16/2016 03:38 PM, Arnd Bergmann wrote:
> This newly added code causes a build warning:
>
> kernel/bpf/bpf_lru_list.c: In function '__bpf_lru_list_rotate_inactive':
> kernel/bpf/bpf_lru_list.c:201:28: error: 'next' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>
> The warning is plausible from looking at the code, though there might
> be non-obvious external constraints that ensure it always works.
>
> Moving the assignment of ->next_inactive_rotation inside of the
> loop makes it obvious to the reader and the compiler when we
> actually want to update ->next.
>
> Fixes: 3a08c2fd7634 ("bpf: LRU List")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Thanks a lot, Arnd, patch was already sent here though:
http://patchwork.ozlabs.org/patch/695202/
^ permalink raw reply
* Re: [PATCH] net: ethernet: faraday: To support device tree usage.
From: Arnd Bergmann @ 2016-11-16 16:12 UTC (permalink / raw)
To: Andrew Lunn; +Cc: Greentime Hu, netdev, devicetree-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20161116143715.GH19962-g2DYL2Zd6BY@public.gmane.org>
On Wednesday, November 16, 2016 3:37:15 PM CET Andrew Lunn wrote:
> On Wed, Nov 16, 2016 at 10:26:52PM +0800, Greentime Hu wrote:
> > On Wed, Nov 16, 2016 at 9:47 PM, Andrew Lunn <andrew-g2DYL2Zd6BY@public.gmane.org> wrote:
> > > On Wed, Nov 16, 2016 at 04:43:15PM +0800, Greentime Hu wrote:
> > >> To support device tree usage for ftmac100.
> > >>
> > >> Signed-off-by: Greentime Hu <green.hu-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > >> ---
> > >> drivers/net/ethernet/faraday/ftmac100.c | 7 +++++++
> > >> 1 file changed, 7 insertions(+)
> > >>
> > >> diff --git a/drivers/net/ethernet/faraday/ftmac100.c b/drivers/net/ethernet/faraday/ftmac100.c
> > >> index dce5f7b..81dd9e1 100644
> > >> --- a/drivers/net/ethernet/faraday/ftmac100.c
> > >> +++ b/drivers/net/ethernet/faraday/ftmac100.c
> > >> @@ -1172,11 +1172,17 @@ static int __exit ftmac100_remove(struct platform_device *pdev)
> > >> return 0;
> > >> }
> > >>
> > >> +static const struct of_device_id mac_of_ids[] = {
> > >> + { .compatible = "andestech,atmac100" },
> > >> + { }
> > >
> > > andestech is not in
> > > Documentation/devicetree/bindings/vendor-prefixes.txt Please provide a
> > > separate patch adding it.
> > OK. I will provide another patch to add andestech.
> >
> > > Humm, why andestech? Why not something based around faraday
> > > technology?
> > It is because we use the same ftmac100 IP provided from faraday
> > technology but I am now using it in andestech SoC.
>
> Please make sure you get an acked-by: from the device tree
> maintainers. They might want you to use faraday, since that is the
> original IP provider. For example, all Synopsys licensed IP uses
> "snps,XXX", not the SoC vendor with the license.
I think ideally we have both the ID from andes and from faraday here.
Note that we already have "moxa,moxart-mac" as a compatible string
for this hardware, though it uses a different driver.
We should probably have a single binding document describing
both compatible strings and any optional properties.
Arnd
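A minimal userspace sketch of how a match table carrying both IDs could behave, assuming a simplified first-match lookup (the kernel's real OF matching is more involved). "andestech,atmac100" is taken from the patch under review; "faraday,ftmac100" and "vendor,soc-mac" are hypothetical strings used only for illustration, as are struct match_id and match_node().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct of_device_id. */
struct match_id {
	const char *compatible;
};

static const struct match_id mac_ids[] = {
	{ "andestech,atmac100" },  /* from the patch under review */
	{ "faraday,ftmac100" },    /* hypothetical IP-vendor string */
	{ NULL }                   /* sentinel */
};

/*
 * node_compat is the node's NULL-terminated "compatible" list, most
 * specific entry first. Returns the first table entry that matches any
 * of the node's strings, or NULL if none does.
 */
static const struct match_id *match_node(const struct match_id *ids,
					 const char *const *node_compat)
{
	const struct match_id *id;

	for (; *node_compat; node_compat++)
		for (id = ids; id->compatible; id++)
			if (strcmp(*node_compat, id->compatible) == 0)
				return id;
	return NULL;
}
```

With both strings in the table, a node from another SoC that lists only the generic IP string as a fallback would still bind to the driver, while a node listing only "moxa,moxart-mac" would not.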
^ permalink raw reply
* Re: [patch net-next v2 5/8] Introduce sample tc action
From: Roopa Prabhu @ 2016-11-16 16:15 UTC (permalink / raw)
To: Jiri Pirko
Cc: netdev, davem, yotamg, idosch, eladr, nogahf, ogerlitz, jhs,
geert+renesas, stephen, xiyou.wangcong, linux, john.fastabend,
simon.horman
In-Reply-To: <1479135638-3580-6-git-send-email-jiri@resnulli.us>
On 11/14/16, 7:00 AM, Jiri Pirko wrote:
> From: Yotam Gigi <yotamg@mellanox.com>
>
> This action allows the user to sample traffic matched by a tc classifier.
> The sampling consists of choosing packets randomly, truncating them,
> adding some informative metadata regarding the interface and the original
> packet size, and marking them with a specific mark to allow further tc rules
> to match and process them. The marked sample packets are then injected into
> the device ingress qdisc using netif_receive_skb.
>
> The packets metadata is packed using the ife encapsulation protocol, and
> the outer packet's ethernet dest, source and eth_type, along with the
> rate, mark and the optional truncation size can be configured from
> userspace.
>
> Example:
> To sample ingress traffic from interface eth1, and redirect the sampled
> packets to interface dummy0, one may use the commands:
>
> tc qdisc add dev eth1 handle ffff: ingress
>
> tc filter add dev eth1 parent ffff: \
> matchall action sample rate 12 mark 17
Yotam, I am guessing in the future if one does not want to use mark,
the sample api is extensible to allow for other actions to be added.
This is from the general concern we had on using mark: some may not want to use mark.
As long as the api is extensible to allow an alternate way in the future,
we should be good. (We would prefer to not go down the path of having to introduce
a new 'action sample' if this limits us in some way).
>
> tc filter add parent ffff: dev eth1 protocol all \
> u32 match mark 17 0xff \
> action mirred egress redirect dev dummy0
>
thanks.
^ permalink raw reply