* Re: [RFC PATCH net-next v2] ipv6: implement consistent hashing for equal-cost multipath routing
From: David Lebrun @ 2016-11-30 7:56 UTC (permalink / raw)
To: Hannes Frederic Sowa, netdev
In-Reply-To: <1480477952.3702850.803295033.367FD66D@webmail.messagingengine.com>
On 11/30/2016 04:52 AM, Hannes Frederic Sowa wrote:
> In the worst case this causes 2GB (order 19) allocations (x == 31) to
> happen in GFP_ATOMIC (due to write lock) context and could cause update
> failures to the routing table due to fragmentation. Are you sure the
> upper limit of 31 is reasonable? I would very much prefer an upper limit
> of below or equal 25 for x to stay within the bounds of the slab
> allocators (which is still a lot and probably causes errors!).
> Unfortunately because of the nature of the sysctl you can't really
> create its own cache for it. :/
>
Agreed. I think that even something like 16 would be more than
sufficient: that would allow 65K slices, which is way more than enough
to get good balancing with a reasonable number of nexthops (I
wonder whether there are actual deployments with more than 32 nexthops
for a route).
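To put numbers on the limits discussed here, a minimal sketch follows. It assumes one byte per slice and 4 KiB pages, which is only inferred from the "2GB (order 19)" figure quoted for x == 31 and is not taken from the patch itself. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages (page shift of 12). */
#define SKETCH_PAGE_SHIFT 12

/* Bytes needed for a table of 2^x slices, assuming one byte per slice. */
static uint64_t slice_table_bytes(unsigned int x)
{
	assert(x < 64);
	return (uint64_t)1 << x;
}

/* Smallest page order whose block is large enough to hold `bytes`. */
static unsigned int page_order(uint64_t bytes)
{
	unsigned int order = 0;

	while (((uint64_t)1 << (SKETCH_PAGE_SHIFT + order)) < bytes)
		order++;
	return order;
}
```

Under these assumptions, x == 16 needs 64 KiB (order 4), x == 25 needs 32 MiB (order 13), and x == 31 needs 2 GiB (order 19).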
> Also by design, one day this should all be RCU and having that much data
> outstanding worries me a bit during routing table mutation.
>
> I am a fan of consistent hashing but I am not so sure if it belongs into
> a generic ECMP implementation or into its own ipvs or netfilter module
> where you specifically know how much memory to burn for it.
>
The complexity of the consistent hashing code might warrant something
like that, but I am not sure of the implications.
> Also please convert the sysctl to a netlink attribute if you pursue this
> because if I change the sysctl while my quagga is hammering the routing
> table I would like to know which nodes allocate what amount of memory.
Yes, that was the idea.
Thanks for the feedback
David
^ permalink raw reply
* Re: [PATCH v4 3/5] net: asix: Fix AX88772x resume failures
From: Jon Hunter @ 2016-11-30 8:07 UTC (permalink / raw)
To: allan, freddy, Dean_Jenkins, Mark_Craske, davem, robert.foss,
ivecera, john.stultz, vpalatin, stephen, grundler, changchias,
andrew, tremyfr, colin.king, linux-usb, netdev, linux-kernel,
vpalatin
In-Reply-To: <00a601d24ab6$4e9274d0$ebb75e70$@asix.com.tw>
Hi Allan,
On 30/11/16 03:03, ASIX_Allan [Office] wrote:
> The change fixes AX88772x resume failure by
> - Restore incorrect AX88772A PHY registers when resetting
> - Need to stop MAC operation when suspending
> - Need to restart MII when restoring PHY
>
> Signed-off-by: Allan Chou <allan@asix.com.tw>
> Signed-off-by: Robert Foss <robert.foss@collabora.com>
> Tested-by: Robert Foss <robert.foss@collabora.com>
> Tested-by: Jon Hunter <jonathanh@nvidia.com>
> Tested-by: Allan Chou <allan@asix.com.tw>
V3 of this patch is already in the current mainline branch. So you need
to send a patch on top of V3 (or v4.9-rc7) to get this fixed. Also you
should highlight the fact that this is a fix needed for v4.9.
Cheers
Jon
--
nvpublic
^ permalink raw reply
* Re: [PATCH iproute2 V2 1/2] tc/cls_flower: Classify packet in ip tunnels
From: Jiri Pirko @ 2016-11-30 8:13 UTC (permalink / raw)
To: Amir Vadai
Cc: Stephen Hemminger, netdev, David S. Miller, Jiri Benc, Or Gerlitz,
Hadar Har-Zion, Roi Dayan
In-Reply-To: <CAML6475YBZvreU4PEPWUWrEkW5EoGDCw1KS2+ndac7gMY2BQAQ@mail.gmail.com>
Wed, Nov 30, 2016 at 08:38:24AM CET, amirva@gmail.com wrote:
>On Wed, Nov 30, 2016 at 9:17 AM Amir Vadai <amir@vadai.me> wrote:
>
>> On Tue, Nov 29, 2016 at 07:26:58PM -0800, Stephen Hemminger wrote:
>> > The overall design is fine, just a couple nits with the code.
>> >
>> > > diff --git a/tc/f_flower.c b/tc/f_flower.c
>> > > index 2d31d1aa832d..1cf0750b5b83 100644
>> > > --- a/tc/f_flower.c
>> > > +++ b/tc/f_flower.c
>> >
>> > >
>> > > +static int flower_parse_key_id(char *str, int type, struct nlmsghdr
>> *n)
>> >
>> > str is not modified, therefore use: const char *str
>> ack
>>
>> >
>> > > +{
>> > > + int ret;
>> > > + __be32 key_id;
>> > > +
>> > > + ret = get_be32(&key_id, str, 10);
>> > > + if (ret)
>> > > + return -1;
>> >
>> > Traditionally netlink attributes are in host order, why was flower
>> > chosen to be different?
>> I don't know, maybe Jiri (cc'ed) can explain, but it is all over the
>> flower code.
>>
>Now the right Jiri (Pirko) is CC'ed
There is a bunch of helpers inside the cls_flower.c to put and set
values, they work with generic char arrays of len.
>
>
>> >
>> > > +
>> > > + addattr32(n, MAX_MSG, type, key_id);
>> > > +
>> > > + return 0;
>> >
>> >
>> > Why lose the return value here? Why not:
>> >
>> > ret = get_be32(&key_id, str, 10);
>> > if (!ret)
>> > addattr32(n, MAX_MSG, type, key_id);
>> The get_*() function can return only -1 or 0. But you are right, and it
>> is better the way you suggested. Changing accordingly in V3.
>>
>> >
>> > > +}
>> > > +
>> > > static int flower_parse_opt(struct filter_util *qu, char *handle,
>> > > int argc, char **argv, struct nlmsghdr *n)
>> > > {
>> > > @@ -339,6 +359,38 @@ static int flower_parse_opt(struct filter_util
>> *qu, char *handle,
>> > > fprintf(stderr, "Illegal \"src_port\"\n");
>> > > return -1;
>> > > }
>> > > + } else if (matches(*argv, "enc_dst_ip") == 0) {
>> > > + NEXT_ARG();
>> > > + ret = flower_parse_ip_addr(*argv, 0,
>> > > +
>> TCA_FLOWER_KEY_ENC_IPV4_DST,
>> > > +
>> TCA_FLOWER_KEY_ENC_IPV4_DST_MASK,
>> > > +
>> TCA_FLOWER_KEY_ENC_IPV6_DST,
>> > > +
>> TCA_FLOWER_KEY_ENC_IPV6_DST_MASK,
>> > > + n);
>> > > + if (ret < 0) {
>> > > + fprintf(stderr, "Illegal
>> \"enc_dst_ip\"\n");
>> > > + return -1;
>> > > + }
>> > > + } else if (matches(*argv, "enc_src_ip") == 0) {
>> > > + NEXT_ARG();
>> > > + ret = flower_parse_ip_addr(*argv, 0,
>> > > +
>> TCA_FLOWER_KEY_ENC_IPV4_SRC,
>> > > +
>> TCA_FLOWER_KEY_ENC_IPV4_SRC_MASK,
>> > > +
>> TCA_FLOWER_KEY_ENC_IPV6_SRC,
>> > > +
>> TCA_FLOWER_KEY_ENC_IPV6_SRC_MASK,
>> > > + n);
>> > > + if (ret < 0) {
>> > > + fprintf(stderr, "Illegal
>> \"enc_src_ip\"\n");
>> > > + return -1;
>> > > + }
>> > > + } else if (matches(*argv, "enc_key_id") == 0) {
>> > > + NEXT_ARG();
>> > > + ret = flower_parse_key_id(*argv,
>> > > +
>> TCA_FLOWER_KEY_ENC_KEY_ID, n);
>> > > + if (ret < 0) {
>> > > + fprintf(stderr, "Illegal
>> \"enc_key_id\"\n");
>> > > + return -1;
>> > > + }
>> > > } else if (matches(*argv, "action") == 0) {
>> > > NEXT_ARG();
>> > > ret = parse_action(&argc, &argv, TCA_FLOWER_ACT,
>> n);
>> > > @@ -509,6 +561,14 @@ static void flower_print_port(FILE *f, char
>> *name, __u8 ip_proto,
>> > > fprintf(f, "\n %s %d", name, ntohs(rta_getattr_u16(attr)));
>> > > }
>> > >
>> > > +static void flower_print_key_id(FILE *f, char *name,
>> > > + struct rtattr *attr)
>> >
>> > const char *name?
>> ack
>>
>> >
>> >
>> > > +{
>> > > + if (!attr)
>> > > + return;
>> > > + fprintf(f, "\n %s %d", name, ntohl(rta_getattr_u32(attr)));
>> > > +}
>> > > +
>> >
>> > Why short circuit, just change the order:
>> >
>> > if (attr)
>> > fprintf(f, "\n %s %u", name, ntohl(rta_getattr_u32(attr)));
>> >
>> > You might also want to introduce rta_getattr_be32()
>> ack
>>
>> >
>> > Please change, retest and resubmit both patches.
>> ack
>>
>> Thanks for reviewing,
>> Amir
>>
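On the byte-order point raised in the review above, here is a minimal, self-contained sketch of parsing a key id into network byte order and converting it back for printing. parse_be32() and key_id_to_host() are hypothetical stand-ins written for illustration; they are not the iproute2 helpers, only an approximation of the 0 / -1 return convention mentioned in the thread.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in: parse `str` in the given base and store the
 * result in network byte order. Returns 0 on success, -1 on error. */
static int parse_be32(uint32_t *val, const char *str, int base)
{
	char *end;
	unsigned long v;

	assert(val && str);
	if (str[0] == '-')
		return -1;	/* strtoul() would silently negate */

	errno = 0;
	v = strtoul(str, &end, base);
	if (errno || end == str || *end != '\0' || v > UINT32_MAX)
		return -1;

	*val = htonl((uint32_t)v);
	return 0;
}

/* Convert a stored big-endian key id back to host order for display. */
static uint32_t key_id_to_host(uint32_t be)
{
	return ntohl(be);
}
```

The caller can simply propagate the return value, in the spirit of the "if (!ret) addattr32(...)" suggestion in the review.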
^ permalink raw reply
* RE: [PATCH v4 3/5] net: asix: Fix AX88772x resume failures
From: ASIX_Allan [Office] @ 2016-11-30 8:28 UTC (permalink / raw)
To: 'Jon Hunter', freddy-knRN6Y/kmf1NUHwG+Fw1Kw,
Dean_Jenkins-nmGgyN9QBj3QT0dZR+AlfA,
Mark_Craske-nmGgyN9QBj3QT0dZR+AlfA, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
robert.foss-ZGY8ohtN/8qB+jHODAdFcQ,
ivecera-H+wXaHxf7aLQT0dZR+AlfA,
john.stultz-QSEj5FYQhm4dnm+yROfE0A,
vpalatin-F7+t8E8rja9g9hUCZPvPmw,
stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ,
grundler-F7+t8E8rja9g9hUCZPvPmw,
changchias-Re5JQEeQqe8AvxtiuMwx3w, andrew-g2DYL2Zd6BY,
tremyfr-Re5JQEeQqe8AvxtiuMwx3w, colin.king-Z7WLFzj8eWMS+FvcfC7Uqw,
linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
vpalatin-hpIqsD4AKlfQT0dZR+AlfA
In-Reply-To: <b24b2fa4-88d8-5772-b79c-aa2cadd7aa1e-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
Dear Jon,
Thanks a lot for the reminder. I will submit a new driver patch soon.
---
Best regards,
Allan Chou
-----Original Message-----
From: Jon Hunter [mailto:jonathanh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org]
Sent: Wednesday, November 30, 2016 4:08 PM
To: allan-knRN6Y/kmf1NUHwG+Fw1Kw@public.gmane.org; freddy-knRN6Y/kmf1NUHwG+Fw1Kw@public.gmane.org; Dean_Jenkins-nmGgyN9QBj3QT0dZR+AlfA@public.gmane.org;
Mark_Craske-nmGgyN9QBj3QT0dZR+AlfA@public.gmane.org; davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org; robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org;
ivecera-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; john.stultz-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org; vpalatin-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org;
stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ@public.gmane.org; grundler-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org; changchias-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org;
andrew-g2DYL2Zd6BY@public.gmane.org; tremyfr-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org; colin.king-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org;
linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org;
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; vpalatin-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org
Subject: Re: [PATCH v4 3/5] net: asix: Fix AX88772x resume failures
Hi Allan,
On 30/11/16 03:03, ASIX_Allan [Office] wrote:
> The change fixes AX88772x resume failure by
> - Restore incorrect AX88772A PHY registers when resetting
> - Need to stop MAC operation when suspending
> - Need to restart MII when restoring PHY
>
> Signed-off-by: Allan Chou <allan-knRN6Y/kmf1NUHwG+Fw1Kw@public.gmane.org>
> Signed-off-by: Robert Foss <robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org>
> Tested-by: Robert Foss <robert.foss-ZGY8ohtN/8qB+jHODAdFcQ@public.gmane.org>
> Tested-by: Jon Hunter <jonathanh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
> Tested-by: Allan Chou <allan-knRN6Y/kmf1NUHwG+Fw1Kw@public.gmane.org>
V3 of this patch is already in the current mainline branch. So you need to
send a patch on top of V3 (or v4.9-rc7) to get this fixed. Also you should
highlight the fact that this is a fix needed for v4.9.
Cheers
Jon
--
nvpublic
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH] net: asix: Fix AX88772_suspend() USB vendor commands failure issues
From: ASIX_Allan [Office] @ 2016-11-30 8:29 UTC (permalink / raw)
To: 'Jon Hunter', freddy-knRN6Y/kmf1NUHwG+Fw1Kw,
Dean_Jenkins-nmGgyN9QBj3QT0dZR+AlfA,
Mark_Craske-nmGgyN9QBj3QT0dZR+AlfA, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
robert.foss-ZGY8ohtN/8qB+jHODAdFcQ,
ivecera-H+wXaHxf7aLQT0dZR+AlfA,
john.stultz-QSEj5FYQhm4dnm+yROfE0A,
vpalatin-F7+t8E8rja9g9hUCZPvPmw,
stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ,
grundler-F7+t8E8rja9g9hUCZPvPmw,
changchias-Re5JQEeQqe8AvxtiuMwx3w, andrew-g2DYL2Zd6BY,
tremyfr-Re5JQEeQqe8AvxtiuMwx3w, colin.king-Z7WLFzj8eWMS+FvcfC7Uqw,
linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
vpalatin-hpIqsD4AKlfQT0dZR+AlfA
The change fixes AX88772_suspend() USB vendor commands failure issues.
Signed-off-by: Allan Chou <allan-knRN6Y/kmf1NUHwG+Fw1Kw@public.gmane.org>
Tested-by: Allan Chou <allan-knRN6Y/kmf1NUHwG+Fw1Kw@public.gmane.org>
Tested-by: Jon Hunter <jonathanh-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
---
--- a/drivers/net/usb/asix_devices.c 2016-11-28 05:08:04.000000000 +0800
+++ b/drivers/net/usb/asix_devices.c 2016-11-30 09:31:54.000000000 +0800
@@ -603,12 +603,12 @@ static void ax88772_suspend(struct usbne
u16 medium;
/* Stop MAC operation */
- medium = asix_read_medium_status(dev, 0);
+ medium = asix_read_medium_status(dev, 1);
medium &= ~AX_MEDIUM_RE;
- asix_write_medium_mode(dev, medium, 0);
+ asix_write_medium_mode(dev, medium, 1);
netdev_dbg(dev->net, "ax88772_suspend: medium=0x%04x\n",
- asix_read_medium_status(dev, 0));
+ asix_read_medium_status(dev, 1));
/* Preserve BMCR for restoring */
priv->presvd_phy_bmcr =
^ permalink raw reply
* [iproute PATCH] man: ip-route.8: Add notes about dropped IPv4 route cache
From: Phil Sutter @ 2016-11-30 8:29 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
man/man8/ip-route.8.in | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
index d4fae3cc783ba..c0acaa0020ef7 100644
--- a/man/man8/ip-route.8.in
+++ b/man/man8/ip-route.8.in
@@ -924,6 +924,12 @@ routes are left unchanged. Any routes specified in the data stream that
already exist in the table will be ignored.
.RE
+.SH NOTES
+Starting with Linux kernel version 3.6, there is no routing cache for IPv4
+anymore. Hence
+.B "ip route show cached"
+will never print any entries on systems with this or newer kernel versions.
+
.SH EXAMPLES
.PP
ip ro
--
2.10.0
^ permalink raw reply related
* [net-next] rtnetlink: return the correct error code
From: Zhang Shengju @ 2016-11-30 8:37 UTC (permalink / raw)
To: netdev
Before this patch, ndo_dflt_fdb_dump() always returns the code from the
uc fdb dump. The return code of the mc fdb dump is lost.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
---
net/core/rtnetlink.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 4e60525..061415f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3165,7 +3165,7 @@ int ndo_dflt_fdb_dump(struct sk_buff *skb,
err = nlmsg_populate_fdb(skb, cb, dev, idx, &dev->uc);
if (err)
goto out;
- nlmsg_populate_fdb(skb, cb, dev, idx, &dev->mc);
+ err = nlmsg_populate_fdb(skb, cb, dev, idx, &dev->mc);
out:
netif_addr_unlock_bh(dev);
return err;
--
1.8.3.1
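The pattern this patch fixes can be shown outside the kernel. The sketch below is a stand-alone illustration with hypothetical function names; it is not the rtnetlink code. It only shows how the result of a second step is lost unless it is assigned to err.

```c
#include <assert.h>

/* Hypothetical dump step: returns 0 on success, negative on failure. */
typedef int (*dump_step_fn)(void);

static int step_ok(void)   { return 0; }
static int step_fail(void) { return -1; }

/* Before: the result of the second step is dropped. */
static int dump_lossy(dump_step_fn uc, dump_step_fn mc)
{
	int err;

	assert(uc && mc);
	err = uc();
	if (err)
		goto out;
	mc();			/* failure here is never reported */
out:
	return err;
}

/* After: the result of the second step is propagated to the caller. */
static int dump_checked(dump_step_fn uc, dump_step_fn mc)
{
	int err;

	assert(uc && mc);
	err = uc();
	if (err)
		goto out;
	err = mc();
out:
	return err;
}
```

With step_ok for the first step and step_fail for the second, dump_lossy() returns 0 while dump_checked() returns -1.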
^ permalink raw reply related
* Re: [PATCH v3] ethernet :mellanox :mlx4: Replace pci_pool_alloc by pci_pool_zalloc
From: Tariq Toukan @ 2016-11-30 8:44 UTC (permalink / raw)
To: Souptick Joarder, sergei.shtylyov, yishaih
Cc: netdev, linux-rdma, sahu.rameshwar73
In-Reply-To: <20161129194611.GA4088@jordon-HP-15-Notebook-PC>
Hi Souptick,
Thanks for your patch.
On 29/11/2016 9:46 PM, Souptick Joarder wrote:
> In mlx4_alloc_cmd_mailbox(), pci_pool_alloc() followed by memset will be
> replaced by pci_pool_zalloc()
>
> Signed-off-by: Souptick joarder <jrdr.linux@gmail.com>
> ---
> v3:
> - Fixed alignment issues
As mentioned already, you mean 'Remove empty line'.
>
> v2:
> - Address comment from sergei
> Alignment was not proper
>
> drivers/net/ethernet/mellanox/mlx4/cmd.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
> index e36bebc..a49072b4 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
> @@ -2679,15 +2679,13 @@ struct mlx4_cmd_mailbox *mlx4_alloc_cmd_mailbox(struct mlx4_dev *dev)
> if (!mailbox)
> return ERR_PTR(-ENOMEM);
>
> - mailbox->buf = pci_pool_alloc(mlx4_priv(dev)->cmd.pool, GFP_KERNEL,
> - &mailbox->dma);
> + mailbox->buf = pci_pool_zalloc(mlx4_priv(dev)->cmd.pool, GFP_KERNEL,
> + &mailbox->dma);
> if (!mailbox->buf) {
> kfree(mailbox);
> return ERR_PTR(-ENOMEM);
> }
>
> - memset(mailbox->buf, 0, MLX4_MAILBOX_SIZE);
> -
> return mailbox;
> }
> EXPORT_SYMBOL_GPL(mlx4_alloc_cmd_mailbox);
> --
> 1.9.1
>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Thanks,
Tariq
^ permalink raw reply
* Re: [PATCH net-next v3 3/4] bpf: BPF for lightweight tunnel infrastructure
From: Thomas Graf @ 2016-11-30 8:57 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: davem, netdev, daniel, tom, roopa, hannes
In-Reply-To: <20161130070150.GA33397@ast-mbp.thefacebook.com>
On 11/29/16 at 11:01pm, Alexei Starovoitov wrote:
> On Wed, Nov 30, 2016 at 07:48:51AM +0100, Thomas Graf wrote:
> > Should we check in __bpf_redirect_common() whether mac_header <
> > network_header then or add it to lwt-bpf conditional on
> > dev_is_mac_header_xmit()?
>
> may be only extra 'if' in lwt-bpf is all we need?
Agreed, I will add a mac_header < network_header check to lwt-bpf if we
redirect to an l2 device.
> I'm still missing what will happen if we 'forget' to do
> bpf_skb_push() inside the lwt-bpf program, but still do redirect
> in lwt_xmit stage to l2 netdev...
The same as for an AF_PACKET socket not providing an actual L2 header.
I will add a test case to cover this scenario as well.
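As a rough illustration of the mac_header < network_header check mentioned above, here is a stand-alone sketch. The struct, the function name and the -EINVAL return value are assumptions made for this example; the actual check in lwt-bpf may look different.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of the header offsets kept in an skb. */
struct mock_pkt {
	uint16_t mac_header;		/* offset of the L2 header */
	uint16_t network_header;	/* offset of the L3 header */
};

/* Refuse a redirect to a device that expects an L2 header when the
 * packet has no room between the MAC and network header offsets. */
static int lwt_redirect_check(const struct mock_pkt *pkt, bool dev_wants_l2)
{
	assert(pkt);
	if (dev_wants_l2 && !(pkt->mac_header < pkt->network_header))
		return -EINVAL;
	return 0;
}
```

Here dev_wants_l2 stands for the result of a dev_is_mac_header_xmit() style test on the target device.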
^ permalink raw reply
* [PATCH net-next 0/3] net/sched: act_pedit: Support using offset relative to the conventional network headers
From: Amir Vadai @ 2016-11-30 9:09 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Jamal Hadi Salim, Or Gerlitz, Hadar Har-Zion, Amir Vadai
Hi,
Patch 1/3 ("net/skbuff: Introduce skb_mac_offset()") adds a utility function to
get mac header offset.
Patch 2/3 ("net/act_pedit: Support using offset relative to the conventional
network headers") extends pedit to enable the user to set offset relative to
MAC/IPv4/IPv6/TCP network headers.
This makes it possible to work with more complex header schemes (vs the simple
IPv4 case) where setting a fixed offset relative to the network header is not
enough. It also looks ahead to making hardware offloading of pedit easier.
The header type is embedded in the 8 MSB of the u32 key->shift which
were never used till now. Therefore backward compatibility is being
kept.
Patch 3/3 ("net/act_pedit: Introduce 'add' operation") adds a new operation to
increase the value of a header field. The operation is passed in another free
8 bits of key->shift.
Usage example:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
flower \
ip_proto tcp \
src_port 80 \
action \
pedit munge ip ttl add 0xff \
pedit munge tcp dport set 8080 \
pipe action mirred egress redirect dev veth0
This forwards TCP traffic with source port 80, modifies the destination port
to 8080, and decreases the TTL by 1.
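The key->shift layout described above (header type in the top 8 bits, command in the next 8 bits, the original shift in the low 8 bits) can be sketched as follows. The macros are copied from patches 2/3 and 3/3; pedit_shift_pack() is a hypothetical helper added only for illustration and is not part of the series.

```c
#include <assert.h>
#include <stdint.h>

#define PEDIT_TYPE_SHIFT 24
#define PEDIT_TYPE_MASK 0xff
#define PEDIT_CMD_SHIFT 16
#define PEDIT_CMD_MASK 0xff

#define PEDIT_TYPE_GET(_val) \
	(((_val) >> PEDIT_TYPE_SHIFT) & PEDIT_TYPE_MASK)
#define PEDIT_CMD_GET(_val) \
	(((_val) >> PEDIT_CMD_SHIFT) & PEDIT_CMD_MASK)
#define PEDIT_SHIFT_GET(_val) ((_val) & 0xff)

/* Hypothetical helper: build a key->shift value from its three fields. */
static uint32_t pedit_shift_pack(uint32_t htype, uint32_t cmd, uint32_t shift)
{
	assert(htype <= 0xff && cmd <= 0xff && shift <= 0xff);
	return (htype << PEDIT_TYPE_SHIFT) | (cmd << PEDIT_CMD_SHIFT) | shift;
}
```

A legacy value such as key->shift == 2 decodes to header type 0 (PEDIT_HDR_TYPE_RAW) and command 0 (PEDIT_CMD_SET), which is why existing users keep their behaviour.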
I've uploaded a draft for the userspace [2] to make it easier to review and
test the patchset.
The patchset will conflict if already accepted patch [1] from net is missing.
It was applied and tested with [1] on top of commit 93ba22225504 ("hv_netvsc:
remove excessive logging on MTU change").
[1] - 95c2027bfeda ("net/sched: pedit: make sure that offset is valid")
[2] - git: https://bitbucket.org/av42/iproute2.git
branch: pedit
Thanks,
Amir
Amir Vadai (3):
net/skbuff: Introduce skb_mac_offset()
net/act_pedit: Support using offset relative to the conventional
network headers
net/act_pedit: Introduce 'add' operation
include/linux/skbuff.h | 5 +++
include/uapi/linux/tc_act/tc_pedit.h | 27 ++++++++++++
net/sched/act_pedit.c | 81 ++++++++++++++++++++++++++++++------
3 files changed, 100 insertions(+), 13 deletions(-)
--
2.10.2
^ permalink raw reply
* [PATCH net-next 1/3] net/skbuff: Introduce skb_mac_offset()
From: Amir Vadai @ 2016-11-30 9:09 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Jamal Hadi Salim, Or Gerlitz, Hadar Har-Zion, Amir Vadai
In-Reply-To: <20161130090928.14816-1-amir@vadai.me>
Introduce skb_mac_offset() that could be used to get mac header offset.
Signed-off-by: Amir Vadai <amir@vadai.me>
---
include/linux/skbuff.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 9c535fbccf2c..395eb5111df0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2169,6 +2169,11 @@ static inline unsigned char *skb_mac_header(const struct sk_buff *skb)
return skb->head + skb->mac_header;
}
+static inline int skb_mac_offset(const struct sk_buff *skb)
+{
+ return skb_mac_header(skb) - skb->data;
+}
+
static inline int skb_mac_header_was_set(const struct sk_buff *skb)
{
return skb->mac_header != (typeof(skb->mac_header))~0U;
--
2.10.2
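For readers less familiar with the skb layout, the following user-space sketch mimics the pointer arithmetic of the new helper. struct mock_skb and the mock_* functions are hypothetical stand-ins and not the kernel definitions. The point is that the result is relative to skb->data, so it is negative once data has moved past the MAC header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct sk_buff. */
struct mock_skb {
	unsigned char *head;	/* start of the buffer */
	unsigned char *data;	/* current start of packet data */
	uint16_t mac_header;	/* MAC header offset from head */
};

static unsigned char *mock_mac_header(const struct mock_skb *skb)
{
	return skb->head + skb->mac_header;
}

/* Same arithmetic as skb_mac_offset() in the patch above. */
static int mock_mac_offset(const struct mock_skb *skb)
{
	assert(skb);
	return (int)(mock_mac_header(skb) - skb->data);
}
```

With the MAC header at offset 64 and data pointing 14 bytes later (just past an Ethernet header), the offset is -14.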
^ permalink raw reply related
* [PATCH net-next 2/3] net/act_pedit: Support using offset relative to the conventional network headers
From: Amir Vadai @ 2016-11-30 9:09 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Jamal Hadi Salim, Or Gerlitz, Hadar Har-Zion, Amir Vadai
In-Reply-To: <20161130090928.14816-1-amir@vadai.me>
Extend pedit so that the user can set an offset relative to the network
headers. This change makes it possible to work with more complex header
schemes (vs the simple IPv4 case) where setting a fixed offset relative
to the network header is not enough. It also looks ahead to making
hardware offloading of pedit easier.
The header type is embedded in the 8 MSB of the u32 key->shift which
were never used till now. Therefore backward compatibility is being
kept.
Usage example:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
flower \
ip_proto tcp \
src_port 80 \
action pedit munge tcp dport set 8080 pipe \
action mirred egress redirect dev veth0
This forwards TCP traffic with source port 80 and modifies the destination
port to 8080.
Change-Id: Ibd7bbbe0b8c2f6adae0591868bb6892c55e75732
Signed-off-by: Amir Vadai <amir@vadai.me>
---
include/uapi/linux/tc_act/tc_pedit.h | 17 ++++++++++
net/sched/act_pedit.c | 65 +++++++++++++++++++++++++++++-------
2 files changed, 70 insertions(+), 12 deletions(-)
diff --git a/include/uapi/linux/tc_act/tc_pedit.h b/include/uapi/linux/tc_act/tc_pedit.h
index 6389959a5157..604e6729ad38 100644
--- a/include/uapi/linux/tc_act/tc_pedit.h
+++ b/include/uapi/linux/tc_act/tc_pedit.h
@@ -32,4 +32,21 @@ struct tc_pedit_sel {
};
#define tc_pedit tc_pedit_sel
+#define PEDIT_TYPE_SHIFT 24
+#define PEDIT_TYPE_MASK 0xff
+
+#define PEDIT_TYPE_GET(_val) \
+ (((_val) >> PEDIT_TYPE_SHIFT) & PEDIT_TYPE_MASK)
+#define PEDIT_SHIFT_GET(_val) ((_val) & 0xff)
+
+enum pedit_header_type {
+ PEDIT_HDR_TYPE_RAW = 0,
+
+ PEDIT_HDR_TYPE_ETH = 1,
+ PEDIT_HDR_TYPE_IP4 = 2,
+ PEDIT_HDR_TYPE_IP6 = 3,
+ PEDIT_HDR_TYPE_TCP = 4,
+ PEDIT_HDR_TYPE_UDP = 5,
+};
+
#endif
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index b27c4daec88f..4b9c7184c752 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -119,18 +119,45 @@ static bool offset_valid(struct sk_buff *skb, int offset)
return true;
}
+static int pedit_skb_hdr_offset(struct sk_buff *skb,
+ enum pedit_header_type htype, int *hoffset)
+{
+ int ret = -1;
+
+ switch (htype) {
+ case PEDIT_HDR_TYPE_ETH:
+ if (skb_mac_header_was_set(skb)) {
+ *hoffset = skb_mac_offset(skb);
+ ret = 0;
+ }
+ break;
+ case PEDIT_HDR_TYPE_RAW:
+ case PEDIT_HDR_TYPE_IP4:
+ case PEDIT_HDR_TYPE_IP6:
+ *hoffset = skb_network_offset(skb);
+ ret = 0;
+ break;
+ case PEDIT_HDR_TYPE_TCP:
+ case PEDIT_HDR_TYPE_UDP:
+ if (skb_transport_header_was_set(skb)) {
+ *hoffset = skb_transport_offset(skb);
+ ret = 0;
+ }
+ break;
+ };
+
+ return ret;
+}
+
static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
struct tcf_result *res)
{
struct tcf_pedit *p = to_pedit(a);
int i;
- unsigned int off;
if (skb_unclone(skb, GFP_ATOMIC))
return p->tcf_action;
- off = skb_network_offset(skb);
-
spin_lock(&p->tcf_lock);
tcf_lastuse_update(&p->tcf_tm);
@@ -141,20 +168,32 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
for (i = p->tcfp_nkeys; i > 0; i--, tkey++) {
u32 *ptr, _data;
int offset = tkey->off;
+ int hoffset;
+ int rc;
+ enum pedit_header_type htype =
+ PEDIT_TYPE_GET(tkey->shift);
+
+ rc = pedit_skb_hdr_offset(skb, htype, &hoffset);
+ if (rc) {
+ pr_info("tc filter pedit bad header type specified (0x%x)\n",
+ htype);
+ goto bad;
+ }
if (tkey->offmask) {
char *d, _d;
- if (!offset_valid(skb, off + tkey->at)) {
+ if (!offset_valid(skb, hoffset + tkey->at)) {
pr_info("tc filter pedit 'at' offset %d out of bounds\n",
- off + tkey->at);
+ hoffset + tkey->at);
goto bad;
}
- d = skb_header_pointer(skb, off + tkey->at, 1,
- &_d);
+ d = skb_header_pointer(skb,
+ hoffset + tkey->at,
+ 1, &_d);
if (!d)
goto bad;
- offset += (*d & tkey->offmask) >> tkey->shift;
+ offset += (*d & tkey->offmask) >> PEDIT_SHIFT_GET(tkey->shift);
}
if (offset % 4) {
@@ -163,19 +202,21 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
goto bad;
}
- if (!offset_valid(skb, off + offset)) {
+ if (!offset_valid(skb, hoffset + offset)) {
pr_info("tc filter pedit offset %d out of bounds\n",
- offset);
+ hoffset + offset);
goto bad;
}
- ptr = skb_header_pointer(skb, off + offset, 4, &_data);
+ ptr = skb_header_pointer(skb,
+ hoffset + offset,
+ 4, &_data);
if (!ptr)
goto bad;
/* just do it, baby */
*ptr = ((*ptr & tkey->mask) ^ tkey->val);
if (ptr == &_data)
- skb_store_bits(skb, off + offset, ptr, 4);
+ skb_store_bits(skb, hoffset + offset, ptr, 4);
}
goto done;
--
2.10.2
^ permalink raw reply related
* [PATCH net-next 3/3] net/act_pedit: Introduce 'add' operation
From: Amir Vadai @ 2016-11-30 9:09 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Jamal Hadi Salim, Or Gerlitz, Hadar Har-Zion, Amir Vadai
In-Reply-To: <20161130090928.14816-1-amir@vadai.me>
This command could be useful to inc/dec fields.
For example, to forward any TCP packet and decrease its TTL:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
flower ip_proto tcp \
action pedit munge ip ttl add 0xff pipe \
action mirred egress redirect dev veth0
In the example above, adding 0xff to this u8 field is actually
decreasing it by one, since the operation is masked.
Signed-off-by: Amir Vadai <amir@vadai.me>
---
include/uapi/linux/tc_act/tc_pedit.h | 10 ++++++++++
net/sched/act_pedit.c | 16 +++++++++++++++-
2 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/tc_act/tc_pedit.h b/include/uapi/linux/tc_act/tc_pedit.h
index 604e6729ad38..80028cd0bb1b 100644
--- a/include/uapi/linux/tc_act/tc_pedit.h
+++ b/include/uapi/linux/tc_act/tc_pedit.h
@@ -35,8 +35,13 @@ struct tc_pedit_sel {
#define PEDIT_TYPE_SHIFT 24
#define PEDIT_TYPE_MASK 0xff
+#define PEDIT_CMD_SHIFT 16
+#define PEDIT_CMD_MASK 0xff
+
#define PEDIT_TYPE_GET(_val) \
(((_val) >> PEDIT_TYPE_SHIFT) & PEDIT_TYPE_MASK)
+#define PEDIT_CMD_GET(_val) \
+ (((_val) >> PEDIT_CMD_SHIFT) & PEDIT_CMD_MASK)
#define PEDIT_SHIFT_GET(_val) ((_val) & 0xff)
enum pedit_header_type {
@@ -49,4 +54,9 @@ enum pedit_header_type {
PEDIT_HDR_TYPE_UDP = 5,
};
+enum pedit_cmd {
+ PEDIT_CMD_SET = 0,
+ PEDIT_CMD_ADD = 1,
+};
+
#endif
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index 4b9c7184c752..aa137d51bf7f 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -169,6 +169,7 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
u32 *ptr, _data;
int offset = tkey->off;
int hoffset;
+ u32 val;
int rc;
enum pedit_header_type htype =
PEDIT_TYPE_GET(tkey->shift);
@@ -214,7 +215,20 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
if (!ptr)
goto bad;
/* just do it, baby */
- *ptr = ((*ptr & tkey->mask) ^ tkey->val);
+ switch (PEDIT_CMD_GET(tkey->shift)) {
+ case PEDIT_CMD_SET:
+ val = tkey->val;
+ break;
+ case PEDIT_CMD_ADD:
+ val = (*ptr + tkey->val) & ~tkey->mask;
+ break;
+ default:
+ pr_info("tc filter pedit bad command (%d)\n",
+ PEDIT_CMD_GET(tkey->shift));
+ goto bad;
+ }
+
+ *ptr = ((*ptr & tkey->mask) ^ val);
if (ptr == &_data)
skb_store_bits(skb, hoffset + offset, ptr, 4);
}
--
2.10.2
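The claim in the commit message that adding 0xff to a u8 field decreases it by one can be checked with a small stand-alone sketch. pedit_set() and pedit_add() below are hypothetical helpers that mirror the two expressions from the hunk above on a plain 32-bit word. The example assumes the field sits in the low byte of the word, so mask is 0xffffff00 (the bits to keep) and val is confined to the field.

```c
#include <assert.h>
#include <stdint.h>

/* SET: keep the bits selected by mask, write val into the field. */
static uint32_t pedit_set(uint32_t word, uint32_t mask, uint32_t val)
{
	assert((val & mask) == 0);
	return (word & mask) ^ val;
}

/* ADD: add val to the whole word, then keep only the field bits of the
 * sum, so any carry out of the field is discarded. */
static uint32_t pedit_add(uint32_t word, uint32_t mask, uint32_t val)
{
	uint32_t sum;

	assert((val & mask) == 0);
	sum = (word + val) & ~mask;
	return (word & mask) ^ sum;
}
```

For word 0x11224040 (field value 0x40), pedit_add() with val 0xff gives 0x1122403f: the field drops to 0x3f and the upper bytes are untouched. A field value of 0 wraps around to 0xff.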
^ permalink raw reply related
* Re: [PATCH v2 13/13] net: ethernet: ti: cpts: fix overflow check period
From: Richard Cochran @ 2016-11-30 9:12 UTC (permalink / raw)
To: Grygorii Strashko
Cc: David S. Miller, netdev, Mugunthan V N, Sekhar Nori, linux-kernel,
linux-omap, Rob Herring, devicetree, Murali Karicheri,
Wingman Kwok, John Stultz, Thomas Gleixner
In-Reply-To: <20161128230337.6731-14-grygorii.strashko@ti.com>
On Mon, Nov 28, 2016 at 05:03:37PM -0600, Grygorii Strashko wrote:
> The CPTS driver uses an 8 sec period for overflow checking with the
> assumption that the CPTS rftclk will not exceed 500MHz. But that's not
> true on some TI platforms (Keystone 2). As a result, it is possible that
> the CPTS counter will overflow more than once between two readings.
>
> Hence, fix it by selecting overflow check period dynamically as
> max_sec_before_overflow/2, where
> max_sec_before_overflow = max_counter_val / rftclk_freq.
>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
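The formula from the quoted commit message can be worked through with a small sketch. It assumes a 32-bit counter (max_counter_val of 0xffffffff) and whole-second granularity; both are assumptions made for this illustration, and the function name is hypothetical. The driver itself may compute the period at a finer resolution.

```c
#include <assert.h>
#include <stdint.h>

/* Overflow check period in seconds:
 *   max_sec_before_overflow = max_counter_val / rftclk_freq
 *   period = max_sec_before_overflow / 2
 */
static uint32_t cpts_ovf_check_period_sec(uint64_t max_counter_val,
					  uint32_t rftclk_hz)
{
	uint64_t max_sec_before_overflow;

	assert(rftclk_hz != 0);
	max_sec_before_overflow = max_counter_val / rftclk_hz;
	return (uint32_t)(max_sec_before_overflow / 2);
}
```

Under these assumptions a 500 MHz rftclk overflows after about 8.6 seconds, giving a 4 second period, while a 1 GHz clock overflows after about 4.3 seconds, giving a 2 second period. A fixed 8 second check would miss overflows in the second case.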
^ permalink raw reply
* Re: [Patch net-next] audit: remove useless synchronize_net()
From: Richard Guy Briggs @ 2016-11-30 9:16 UTC (permalink / raw)
To: Cong Wang; +Cc: netdev, linux-audit, pmoore
In-Reply-To: <1480439696-21818-1-git-send-email-xiyou.wangcong@gmail.com>
On 2016-11-29 09:14, Cong Wang wrote:
> netlink kernel socket is protected by refcount, not RCU.
> Its rcv path is not protected by RCU either. So the synchronize_net()
> is just pointless.
If I understand correctly, xfrm_user_net_exit() usage of
RCU_INIT_POINTER() and synchronize_net() is similarly pointless? Also
net/phonet/socket.c? I probably modelled things based on the former...
> Cc: Richard Guy Briggs <rgb@redhat.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
> kernel/audit.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 92c463d..67b9fbd8 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -1172,9 +1172,8 @@ static void __net_exit audit_net_exit(struct net *net)
> audit_sock = NULL;
> }
>
> - RCU_INIT_POINTER(aunet->nlsk, NULL);
> - synchronize_net();
> netlink_kernel_release(sock);
> + aunet->nlsk = NULL;
> }
>
> static struct pernet_operations audit_net_ops __net_initdata = {
> --
> 2.1.0
>
- RGB
^ permalink raw reply
* Re: [PATCH] cpsw: ethtool: add support for nway reset
From: Yegor Yefremov @ 2016-11-30 9:31 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-omap@vger.kernel.org, Grygorii Strashko,
N, Mugunthan V
In-Reply-To: <20161129.194158.1482957574907674651.davem@davemloft.net>
Hi David,
On Wed, Nov 30, 2016 at 1:41 AM, David Miller <davem@davemloft.net> wrote:
> From: yegorslists@googlemail.com
> Date: Mon, 28 Nov 2016 10:47:52 +0100
>
>> From: Yegor Yefremov <yegorslists@googlemail.com>
>>
>> This patch adds support for ethtool's '-r' command. Restarting
>> N-WAY negotiation can be useful to activate newly changed EEE
>> settings etc.
>>
>> Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
>
> This doesn't apply cleanly to net-next.
My previous patch [1] doesn't show up in net-next. This could explain
why the nway patch doesn't apply.
Should I resend them both as a series?
[1] http://marc.info/?l=linux-omap&m=148036822211869&w=2
Yegor
^ permalink raw reply
* Re: [PATCH net 04/16] net: ethernet: aurora: nb8800: fix fixed-link phydev leaks
From: Mason @ 2016-11-30 9:36 UTC (permalink / raw)
To: Johan Hovold, Mans Rullgard, Sebastian Frias
Cc: netdev, LKML, David S. Miller, Joe Perches, Brian Norris
In-Reply-To: <1480357509-28074-5-git-send-email-johan@kernel.org>
On 28/11/2016 19:24, Johan Hovold wrote:
> Make sure to deregister and free any fixed-link PHY registered using
> of_phy_register_fixed_link() on probe errors and on driver unbind.
>
> Fixes: c7dfe3abf40e ("net: ethernet: nb8800: support fixed-link DT node")
> Signed-off-by: Johan Hovold <johan@kernel.org>
> ---
> drivers/net/ethernet/aurora/nb8800.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
Did you use scripts/get_maintainer.pl ?
Neither the author of the driver (Mans) nor the author of
the code in question (Sebastian) were CCed on this patch.
It looks like the CC list was truncated, the last entry being
Vivien Didelot <
Regards.
> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
> index 00c38bf151e6..e078d8da978c 100644
> --- a/drivers/net/ethernet/aurora/nb8800.c
> +++ b/drivers/net/ethernet/aurora/nb8800.c
> @@ -1466,12 +1466,12 @@ static int nb8800_probe(struct platform_device *pdev)
>
> ret = nb8800_hw_init(dev);
> if (ret)
> - goto err_free_bus;
> + goto err_deregister_fixed_link;
>
> if (ops && ops->init) {
> ret = ops->init(dev);
> if (ret)
> - goto err_free_bus;
> + goto err_deregister_fixed_link;
> }
>
> dev->netdev_ops = &nb8800_netdev_ops;
> @@ -1504,6 +1504,9 @@ static int nb8800_probe(struct platform_device *pdev)
>
> err_free_dma:
> nb8800_dma_free(dev);
> +err_deregister_fixed_link:
> + if (of_phy_is_fixed_link(pdev->dev.of_node))
> + of_phy_deregister_fixed_link(pdev->dev.of_node);
> err_free_bus:
> of_node_put(priv->phy_node);
> mdiobus_unregister(bus);
> @@ -1521,6 +1524,8 @@ static int nb8800_remove(struct platform_device *pdev)
> struct nb8800_priv *priv = netdev_priv(ndev);
>
> unregister_netdev(ndev);
> + if (of_phy_is_fixed_link(pdev->dev.of_node))
> + of_phy_deregister_fixed_link(pdev->dev.of_node);
> of_node_put(priv->phy_node);
>
> mdiobus_unregister(priv->mii_bus);
^ permalink raw reply
* Re: [PATCH 1/6] net: ethernet: ti: netcp: add support of cpts
From: Richard Cochran @ 2016-11-30 9:44 UTC (permalink / raw)
To: Grygorii Strashko
Cc: David S. Miller, netdev, Mugunthan V N, Sekhar Nori, linux-kernel,
linux-omap, Rob Herring, devicetree, Murali Karicheri,
Wingman Kwok
In-Reply-To: <20161128230428.6872-2-grygorii.strashko@ti.com>
On Mon, Nov 28, 2016 at 05:04:23PM -0600, Grygorii Strashko wrote:
> @@ -678,6 +744,9 @@ struct gbe_priv {
> int num_et_stats;
> /* Lock for updating the hwstats */
> spinlock_t hw_stats_lock;
> +
> + int cpts_registered;
The usage of this counter is racy.
> + struct cpts *cpts;
> };
This ++ and -- business ...
> +static void gbe_register_cpts(struct gbe_priv *gbe_dev)
> +{
> + if (!gbe_dev->cpts)
> + return;
> +
> + if (gbe_dev->cpts_registered > 0)
> + goto done;
> +
> + if (cpts_register(gbe_dev->cpts)) {
> + dev_err(gbe_dev->dev, "error registering cpts device\n");
> + return;
> + }
> +
> +done:
> + ++gbe_dev->cpts_registered;
> +}
> +
> +static void gbe_unregister_cpts(struct gbe_priv *gbe_dev)
> +{
> + if (!gbe_dev->cpts || (gbe_dev->cpts_registered <= 0))
> + return;
> +
> + if (--gbe_dev->cpts_registered)
> + return;
> +
> + cpts_unregister(gbe_dev->cpts);
> +}
is invoked from your open() and close() methods, but those methods
are not serialized among multiple ports.
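One way to address this kind of race, sketched here in userspace C rather than kernel code, is to keep the use count and the register/unregister step under a single lock. Everything below is hypothetical: struct shared_dev, shared_get() and shared_put() are illustration names, a pthread mutex stands in for whatever lock the driver would use, and the "registered" flag stands in for the real register and unregister calls.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state shared by all ports. */
struct shared_dev {
	pthread_mutex_t lock;	/* serializes the count and the (un)register step */
	int users;		/* number of open ports using the shared resource */
	bool registered;	/* placeholder for "resource is registered" */
};

static struct shared_dev demo_dev = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.users = 0,
	.registered = false,
};

/* Called from each port's open(); registers only on the 0 -> 1 transition. */
static void shared_get(struct shared_dev *d)
{
	pthread_mutex_lock(&d->lock);
	if (d->users++ == 0)
		d->registered = true;	/* the real register call would go here */
	pthread_mutex_unlock(&d->lock);
}

/* Called from each port's close(); unregisters only on the 1 -> 0 transition. */
static void shared_put(struct shared_dev *d)
{
	pthread_mutex_lock(&d->lock);
	if (d->users > 0 && --d->users == 0)
		d->registered = false;	/* the real unregister call would go here */
	pthread_mutex_unlock(&d->lock);
}
```

Because the counter update and the state change happen inside the same critical section, two ports opening or closing at the same time cannot both observe the same transition.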
Thanks,
Richard
^ permalink raw reply
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue
From: Jerome Brunet @ 2016-11-30 9:47 UTC (permalink / raw)
To: Florian Fainelli, netdev, devicetree
Cc: Andrew Lunn, Alexandre TORGUE, Neil Armstrong,
Martin Blumenstingl, Kevin Hilman, linux-kernel, Yegor Yefremov,
Julia Lawall, Andre Roth, linux-amlogic, Carlo Caione,
Giuseppe Cavallaro, Andreas Färber, linux-arm-kernel
In-Reply-To: <049b1efc-3bad-92e0-45ef-0563dc5d81de@gmail.com>
On Mon, 2016-11-28 at 09:54 -0800, Florian Fainelli wrote:
> On 11/28/2016 07:50 AM, Jerome Brunet wrote:
> >
> > This patchset fixes an issue with the OdroidC2 board (DWMAC +
> > RTL8211F).
> > The platform seems to enter LPI on the Rx path too often while
> > performing relatively high TX transfer. This eventually breaks the
> > link (both Tx and Rx), and requires bringing the interface down and
> > up again to get the Rx path working again.
> >
> > The root cause of this issue is not fully understood yet, but
> > disabling EEE advertisement on the PHY prevents this feature from
> > being negotiated. With this change, the link is stable and reliable,
> > with the expected throughput performance.
> >
> > The patchset adds options in the generic phy driver to disable EEE
> > advertisement, through device tree. The way it is done is very
> > similar to the handling of the max-speed property.
> >
> > Patch 4 is provided here for testing purposes only. Please don't
> > merge patch 4, this change will go through the amlogic's tree.
>
> Sorry, but I really don't like the route this is going, and I should
> have made myself clearer before on that, I really think utilizing a
> PHY fixup is more appropriate here than an extremely generic DT
> property. The fixup code can be in the affected PHY driver, or it can
> be somewhere else, your call. There is no shortage of options on how
> to implement it, and this would be something easy to enable/disable
> for known good configurations (ala PCI/USB fixups).
>
> If we start supporting generic "enable", "disable" type of properties
> with values that map directly to register definitions of the HW, we
> leave too much room for these properties to be utilized to implement
> a specific policy, and this is not acceptable.
Florian,
I agree that DT should not be used to setup a policy, but to describe
what the HW is.
I tried to implement it the way you suggested, using phy fixup, too see
what it looks like.
There is 2 places in the code that seems (remotely) linked to the
issue:
- meson8b_dwmac driver : if the mac, regardless of the board/platform,
could not tolerate to have EEE activated, it would make sense to have
the fixup here. It can provide a C callback for such case.
- realtek phy driver: philosophy is kind of the same
To be clear, it is doable and it works that way, but I don't think
embedding this directly in the code is the right way to do it. It seems
we are hiding an information specific about the board inside a generic
driver.
We have several amlogic's design with the same MAC, sometimes with the
same PHY, which have no problem with EEE at all. The issue is really
about the board design.
What I propose is not an enable/disable configuration switch, but to
clearly state that a particular mode of operation is broken. Like the
"max-speed" property, it setup a restriction. IMO, this is a
description of what the HW is and is capable of, and as such it should
be part of the DT.
Yes the property directly map to a register, but it does let you
directly manipulate it (you can't pass the value you want to write in
the register). Having it this way just makes the code simple on both
ends (user and driver).
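To illustrate the "restriction, not register write" point, here is a minimal sketch in plain C of how a set of broken modes could be applied. It is not the code from the patchset: eee_adv_apply_broken() is a hypothetical helper, and the bit values are assumed to match the MDIO_EEE_* definitions in the kernel's <linux/mdio.h>.

```c
#include <stdint.h>

/* EEE advertisement bits; values assumed to match MDIO_EEE_* in <linux/mdio.h>. */
#define EEE_ADV_100TX	0x0002u
#define EEE_ADV_1000T	0x0004u

/*
 * Hypothetical helper: remove the modes a board flags as broken from the
 * set the PHY would otherwise advertise. The caller can only clear bits,
 * it can never set a mode the PHY did not already offer.
 */
static uint16_t eee_adv_apply_broken(uint16_t adv, uint16_t broken_modes)
{
	return adv & (uint16_t)~broken_modes;
}
```

With this shape, a board that flags 1000T as broken still advertises 100TX EEE if the PHY supports it, and a board that flags nothing gets the PHY's advertisement unchanged.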
Yes, people could start abusing this to set up policy. In the end, it is
our responsibility, as a community, to make sure APIs are used in a
proper way, and not to let them be abused that way.
I'm open to suggestions on how to improve the solution, maybe something
which could bring more confidence that the property won't be misused.
Jerome
^ permalink raw reply
* Re: [PATCH net 04/16] net: ethernet: aurora: nb8800: fix fixed-link phydev leaks
From: Johan Hovold @ 2016-11-30 9:51 UTC (permalink / raw)
To: Mason
Cc: Johan Hovold, Mans Rullgard, Sebastian Frias, netdev, LKML,
David S. Miller, Joe Perches, Brian Norris
In-Reply-To: <583E9DAD.8090205@free.fr>
On Wed, Nov 30, 2016 at 10:36:45AM +0100, Mason wrote:
> On 28/11/2016 19:24, Johan Hovold wrote:
>
> > Make sure to deregister and free any fixed-link PHY registered using
> > of_phy_register_fixed_link() on probe errors and on driver unbind.
> >
> > Fixes: c7dfe3abf40e ("net: ethernet: nb8800: support fixed-link DT node")
> > Signed-off-by: Johan Hovold <johan@kernel.org>
> > ---
> > drivers/net/ethernet/aurora/nb8800.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
>
> Did you use scripts/get_maintainer.pl ?
>
> Neither the author of the driver (Mans) nor the author of
> the code in question (Sebastian) were CCed on this patch.
I did, but I only included parties listed as maintainers, not commit
signers, to keep the already large CC list down somewhat.
Johan
^ permalink raw reply
* Re: [PATCH 2/6] net: ethernet: ti: cpts: add support for ext rftclk selection
From: Richard Cochran @ 2016-11-30 9:56 UTC (permalink / raw)
To: Grygorii Strashko
Cc: David S. Miller, netdev, Mugunthan V N, Sekhar Nori, linux-kernel,
linux-omap, Rob Herring, devicetree, Murali Karicheri, Wingman Kwok
In-Reply-To: <20161128230428.6872-3-grygorii.strashko-l0cyMroinI0@public.gmane.org>
On Mon, Nov 28, 2016 at 05:04:24PM -0600, Grygorii Strashko wrote:
> Some CPTS instances, which can be found on KeyStone 2 1/10G Ethernet
> Switch Subsystems, can control an external multiplexer that selects
> one of up to 32 clocks for time sync reference (RFTCLK). This feature
> can be configured through CPTS_RFTCLK_SEL register (offset: x08).
>
> Hence, introduce optional DT cpts_rftclk_sel property which, if present,
> will specify CPTS reference clock. The cpts_rftclk_sel should be
> omitted in DT if HW doesn't support this feature. The external fixed
> rate clocks can be defined in board files as "fixed-clock".
Can't you implement this using the clock tree, rather than an ad-hoc
DT property?
Thanks,
Richard
^ permalink raw reply
* [PATCH net] tipc: check minimum bearer MTU
From: Michal Kubecek @ 2016-11-30 9:57 UTC (permalink / raw)
To: Jon Maloy, Ying Xue
Cc: David S. Miller, tipc-discussion, netdev, linux-kernel,
Ben Hutchings, Qian Zhang
Qian Zhang (张谦) reported a potential socket buffer overflow in
tipc_msg_build() which is also known as CVE-2016-8632: due to
insufficient checks, a buffer overflow can occur if MTU is too short for
even tipc headers. As anyone can set device MTU in a user/net namespace,
this issue can be abused by a regular user.
As agreed in the discussion on Ben Hutchings' original patch, we should
check the MTU at the moment a bearer is attached rather than for each
processed packet. We also need to repeat the check when bearer MTU is
adjusted to new device MTU. UDP case also needs a check to avoid
overflow when calculating bearer MTU.
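The check described above can be sketched in standalone C as follows. This is an illustration, not the patch itself: the demo_* names are hypothetical, and the header sizes (60 and 40 bytes) are assumed values standing in for MAX_H_SIZE and INT_H_SIZE from net/tipc/msg.h. The second helper shows why the UDP case needs the check: subtracting the encapsulation overhead from a too-small unsigned MTU would wrap around.

```c
#include <stdbool.h>

/* Assumed stand-ins for MAX_H_SIZE and INT_H_SIZE from net/tipc/msg.h. */
#define DEMO_MAX_H_SIZE		60u
#define DEMO_INT_H_SIZE		40u
#define DEMO_MIN_BEARER_MTU	(DEMO_MAX_H_SIZE + DEMO_INT_H_SIZE)

/*
 * True when the device MTU cannot hold the headers plus 'reserve' bytes
 * of encapsulation overhead (0 for L2 bearers, IP + UDP headers for UDP).
 */
static bool demo_mtu_too_small(unsigned int dev_mtu, unsigned int reserve)
{
	return dev_mtu < DEMO_MIN_BEARER_MTU + reserve;
}

/*
 * Bearer MTU for an encapsulated bearer. Returns 0 instead of letting the
 * unsigned subtraction wrap when the device MTU is too small.
 */
static unsigned int demo_bearer_mtu(unsigned int dev_mtu, unsigned int reserve)
{
	if (demo_mtu_too_small(dev_mtu, reserve))
		return 0;
	return dev_mtu - reserve;
}
```

For example, with 28 bytes reserved for IPv4 and UDP headers, a device MTU of 1500 gives a bearer MTU of 1472, while a device MTU of 20 is rejected instead of producing a huge wrapped value.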
Fixes: b97bf3fd8f6a ("[TIPC] Initial merge")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reported-by: Qian Zhang (张谦) <zhangqian-c@360.cn>
---
net/tipc/bearer.c | 9 +++++++--
net/tipc/bearer.h | 13 +++++++++++++
net/tipc/udp_media.c | 5 +++++
3 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
index 975dbeb60ab0..dd4b19e8bb43 100644
--- a/net/tipc/bearer.c
+++ b/net/tipc/bearer.c
@@ -421,6 +421,10 @@ int tipc_enable_l2_media(struct net *net, struct tipc_bearer *b,
dev = dev_get_by_name(net, driver_name);
if (!dev)
return -ENODEV;
+ if (tipc_check_mtu(dev, 0)) {
+ dev_put(dev);
+ return -EINVAL;
+ }
/* Associate TIPC bearer with L2 bearer */
rcu_assign_pointer(b->media_ptr, dev);
@@ -610,8 +614,6 @@ static int tipc_l2_device_event(struct notifier_block *nb, unsigned long evt,
if (!b)
return NOTIFY_DONE;
- b->mtu = dev->mtu;
-
switch (evt) {
case NETDEV_CHANGE:
if (netif_carrier_ok(dev))
@@ -624,6 +626,9 @@ static int tipc_l2_device_event(struct notifier_block *nb, unsigned long evt,
tipc_reset_bearer(net, b);
break;
case NETDEV_CHANGEMTU:
+ if (tipc_check_mtu(dev, 0))
+ return -EINVAL;
+ b->mtu = dev->mtu;
tipc_reset_bearer(net, b);
break;
case NETDEV_CHANGEADDR:
diff --git a/net/tipc/bearer.h b/net/tipc/bearer.h
index 78892e2f53e3..1a0b7434ec24 100644
--- a/net/tipc/bearer.h
+++ b/net/tipc/bearer.h
@@ -39,6 +39,7 @@
#include "netlink.h"
#include "core.h"
+#include "msg.h"
#include <net/genetlink.h>
#define MAX_MEDIA 3
@@ -59,6 +60,9 @@
#define TIPC_MEDIA_TYPE_IB 2
#define TIPC_MEDIA_TYPE_UDP 3
+/* minimum bearer MTU */
+#define TIPC_MIN_BEARER_MTU (MAX_H_SIZE + INT_H_SIZE)
+
/**
* struct tipc_media_addr - destination address used by TIPC bearers
* @value: address info (format defined by media)
@@ -215,4 +219,13 @@ void tipc_bearer_xmit(struct net *net, u32 bearer_id,
void tipc_bearer_bc_xmit(struct net *net, u32 bearer_id,
struct sk_buff_head *xmitq);
+/* check if device MTU is sufficient for tipc headers */
+static inline bool tipc_check_mtu(struct net_device *dev, unsigned int reserve)
+{
+ if (dev->mtu >= TIPC_MIN_BEARER_MTU + reserve)
+ return false;
+ netdev_warn(dev, "MTU too low for tipc bearer\n");
+ return true;
+}
+
#endif /* _TIPC_BEARER_H */
diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
index 78cab9c5a445..376ed3e3ed46 100644
--- a/net/tipc/udp_media.c
+++ b/net/tipc/udp_media.c
@@ -697,6 +697,11 @@ static int tipc_udp_enable(struct net *net, struct tipc_bearer *b,
udp_conf.local_ip.s_addr = htonl(INADDR_ANY);
udp_conf.use_udp_checksums = false;
ub->ifindex = dev->ifindex;
+ if (tipc_check_mtu(dev, sizeof(struct iphdr) +
+ sizeof(struct udphdr))) {
+ err = -EINVAL;
+ goto err;
+ }
b->mtu = dev->mtu - sizeof(struct iphdr)
- sizeof(struct udphdr);
#if IS_ENABLED(CONFIG_IPV6)
--
2.10.2
^ permalink raw reply related
* Re: [PATCH net 1/2] esp4: Fix integrity verification when ESN are used
From: Herbert Xu @ 2016-11-30 9:58 UTC (permalink / raw)
To: Tobias Brunner; +Cc: David S. Miller, Steffen Klassert, netdev
In-Reply-To: <091e32fb-ad85-4dc8-2864-f2b97141f097@strongswan.org>
On Tue, Nov 29, 2016 at 05:05:20PM +0100, Tobias Brunner wrote:
> When handling inbound packets, the two halves of the sequence number
> stored on the skb are already in network order.
>
> Fixes: 7021b2e1cddd ("esp4: Switch to new AEAD interface")
> Signed-off-by: Tobias Brunner <tobias@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks for catching this!
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH net 2/2] esp6: Fix integrity verification when ESN are used
From: Herbert Xu @ 2016-11-30 9:58 UTC (permalink / raw)
To: Tobias Brunner; +Cc: David S. Miller, Steffen Klassert, netdev
In-Reply-To: <f3c5eaa5-d1a3-cc7a-1bd8-20fa2c961770@strongswan.org>
On Tue, Nov 29, 2016 at 05:05:25PM +0100, Tobias Brunner wrote:
> When handling inbound packets, the two halves of the sequence number
> stored on the skb are already in network order.
>
> Fixes: 000ae7b2690e ("esp6: Switch to new AEAD interface")
> Signed-off-by: Tobias Brunner <tobias@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks.
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH net-next 1/7] net/mlx5e: Implement Fragmented Work Queue (WQ)
From: Tariq Toukan @ 2016-11-30 10:02 UTC (permalink / raw)
To: Eric Dumazet, Saeed Mahameed
Cc: David S. Miller, netdev, Tariq Toukan, Or Gerlitz, Roi Dayan,
Sebastian Ott
In-Reply-To: <1480462282.18162.161.camel@edumazet-glaptop3.roam.corp.google.com>
On 30/11/2016 1:31 AM, Eric Dumazet wrote:
> On Wed, 2016-11-30 at 00:19 +0200, Saeed Mahameed wrote:
>> From: Tariq Toukan <tariqt@mellanox.com>
>>
>> Add new type of struct mlx5_frag_buf which is used to allocate fragmented
>> buffers rather than contiguous, and make the Completion Queues (CQs) use
>> it as they are big (default of 2MB per CQ in Striding RQ).
>>
>> This fixes the failures of type:
>> "mlx5e_open_locked: mlx5e_open_channels failed, -12"
>> due to dma_zalloc_coherent insufficient contiguous coherent memory to
>> satisfy the driver's request when the user tries to setup more or larger
>> rings.
>>
>> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
>> Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
>> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
>> ---
>> drivers/net/ethernet/mellanox/mlx5/core/alloc.c | 66 +++++++++++++++++++++++
>> drivers/net/ethernet/mellanox/mlx5/core/en.h | 2 +-
>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 ++--
>> drivers/net/ethernet/mellanox/mlx5/core/wq.c | 26 ++++++---
>> drivers/net/ethernet/mellanox/mlx5/core/wq.h | 18 +++++--
>> include/linux/mlx5/driver.h | 11 ++++
>> 6 files changed, 116 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
>> index 2c6e3c7..bc8357d 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
>> @@ -106,6 +106,63 @@ void mlx5_buf_free(struct mlx5_core_dev *dev, struct mlx5_buf *buf)
>> }
>> EXPORT_SYMBOL_GPL(mlx5_buf_free);
>>
>> +int mlx5_frag_buf_alloc_node(struct mlx5_core_dev *dev, int size,
>> + struct mlx5_frag_buf *buf, int node)
>> +{
>> + int i;
>> +
>> + buf->size = size;
>> + buf->npages = 1 << get_order(size);
>> + buf->page_shift = PAGE_SHIFT;
>> + buf->frags = kcalloc(buf->npages, sizeof(struct mlx5_buf_list),
>> + GFP_KERNEL);
>> + if (!buf->frags)
>> + goto err_out;
>> +
>> + for (i = 0; i < buf->npages; i++) {
>> + struct mlx5_buf_list *frag = &buf->frags[i];
>> + int frag_sz = min_t(int, size, PAGE_SIZE);
>> +
>> + frag->buf = mlx5_dma_zalloc_coherent_node(dev, frag_sz,
>> + &frag->map, node);
>> + if (!frag->buf)
>> + goto err_free_buf;
>> + if (frag->map & ((1 << buf->page_shift) - 1)) {
>> + dma_free_coherent(&dev->pdev->dev, frag_sz,
>> + buf->frags[i].buf, buf->frags[i].map);
> There is a bug if this happens with i = 0
>
>> + mlx5_core_warn(dev, "unexpected map alignment: 0x%p, page_shift=%d\n",
>> + (void *)frag->map, buf->page_shift);
>> + goto err_free_buf;
>> + }
>> + size -= frag_sz;
>> + }
>> +
>> + return 0;
>> +
>> +err_free_buf:
>> + while (--i)
> Because this loop will be done about 2^32 times.
Right. I'll fix this.
Thanks,
Tariq.
>
>> + dma_free_coherent(&dev->pdev->dev, PAGE_SIZE, buf->frags[i].buf,
>> + buf->frags[i].map);
>> + kfree(buf->frags);
>> +err_out:
>> + return -ENOMEM;
>> +}
>
>
>
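The unwind problem pointed out above can be shown with a small standalone sketch. With "while (--i)", a failure at i == 0 decrements i to -1 and keeps going through negative indices, and a failure at any i > 0 stops before index 0 and leaves the first fragment allocated. One common idiom that handles both cases is "while (i-- > 0)". The names below (NFRAGS, freed[], unwind()) are hypothetical and only model the bookkeeping, not the DMA calls.

```c
#define NFRAGS 4

/* Counts how many times each fragment index was released. */
static int freed[NFRAGS];

/*
 * Release fragments [0, failed_at) in reverse order. 'failed_at' is the
 * index whose allocation failed, so it is not released here. Returns the
 * number of fragments released.
 */
static int unwind(int failed_at)
{
	int i = failed_at;
	int count = 0;

	while (i-- > 0) {
		freed[i]++;	/* the real code would free frags[i] here */
		count++;
	}
	return count;
}
```

A failure at index 0 releases nothing, and a failure at index 3 releases indices 2, 1 and 0 exactly once each.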
^ permalink raw reply