* Re: [PATCH net-next] virtio-net: on tx, only call napi_disable if tx napi is on
From: Michael S. Tsirkin @ 2017-04-25 20:44 UTC (permalink / raw)
To: Willem de Bruijn; +Cc: Willem de Bruijn, netdev, davem, virtualization
In-Reply-To: <20170425195917.54209-1-willemdebruijn.kernel@gmail.com>
On Tue, Apr 25, 2017 at 03:59:17PM -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> As of tx napi, device down (`ip link set dev $dev down`) hangs unless
> tx napi is enabled. Else napi_enable is not called, so napi_disable
> will spin on test_and_set_bit NAPI_STATE_SCHED.
>
> Only call napi_disable if tx napi is enabled.
>
> Fixes: 5a719c2552ca ("virtio-net: transmit napi")
> Reported-by: Jason Wang <jasowang@redhat.com>
> Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> drivers/net/virtio_net.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 003143835766..82f1c3a73345 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -997,6 +997,12 @@ static void virtnet_napi_tx_enable(struct virtnet_info *vi,
> return virtnet_napi_enable(vq, napi);
> }
>
> +static void virtnet_napi_tx_disable(struct napi_struct *napi)
> +{
> + if (napi->weight)
> + napi_disable(napi);
> +}
> +
> static void refill_work(struct work_struct *work)
> {
> struct virtnet_info *vi =
> @@ -1445,7 +1451,7 @@ static int virtnet_close(struct net_device *dev)
>
> for (i = 0; i < vi->max_queue_pairs; i++) {
> napi_disable(&vi->rq[i].napi);
> - napi_disable(&vi->sq[i].napi);
> + virtnet_napi_tx_disable(&vi->sq[i].napi);
> }
>
> return 0;
> @@ -1803,7 +1809,7 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
> if (netif_running(vi->dev)) {
> for (i = 0; i < vi->max_queue_pairs; i++) {
> napi_disable(&vi->rq[i].napi);
> - napi_disable(&vi->sq[i].napi);
> + virtnet_napi_tx_disable(&vi->sq[i].napi);
> }
> }
> }
> --
> 2.13.0.rc0.306.g87b477812d-goog
^ permalink raw reply
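The hang described in the commit message can be illustrated with a small userspace model. This is only a sketch, not driver code: `struct model_napi` and the `model_*` helpers are hypothetical names, and a single boolean stands in for NAPI_STATE_SCHED. It assumes, as the commit message states, that the bit stays set until napi_enable runs, so a disable without a prior enable never completes.

```c
#include <stdbool.h>

/* Userspace model only: one flag stands in for NAPI_STATE_SCHED. */
struct model_napi {
	int weight;	/* 0 means tx napi is off */
	bool sched;	/* set at init, cleared only by enable */
};

void model_napi_init(struct model_napi *n, int weight)
{
	n->weight = weight;
	n->sched = true;	/* stays set until enable is called */
}

/* Mirrors virtnet_napi_tx_enable(): does nothing when tx napi is off. */
void model_napi_tx_enable(struct model_napi *n)
{
	if (!n->weight)
		return;
	n->sched = false;
}

/*
 * Models napi_disable(): returns true if it would complete, false if the
 * test_and_set_bit loop would spin forever because the bit is already set.
 */
bool model_napi_disable(struct model_napi *n)
{
	if (n->sched)
		return false;
	n->sched = true;
	return true;
}

/* Models the fix: skip the disable when tx napi was never enabled. */
bool model_napi_tx_disable(struct model_napi *n)
{
	if (!n->weight)
		return true;
	return model_napi_disable(n);
}
```

With weight 0 the unguarded model_napi_disable() reports a hang, while the guarded model_napi_tx_disable() returns immediately, which is the behaviour the patch is after.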
* Re: more test_progs...
From: Daniel Borkmann @ 2017-04-25 20:49 UTC (permalink / raw)
To: David Miller, ast; +Cc: netdev
In-Reply-To: <20170425.125217.1962662516948420246.davem@davemloft.net>
On 04/25/2017 06:52 PM, David Miller wrote:
[...]
> Load eth->h_proto
>
> 10: (15) if r3 == 0xdd86 goto pc+9
> R0=imm2,min_value=2,max_value=2 R1=pkt(id=0,off=0,r=14) R2=pkt_end R3=inv R4=pkt(id=0,off=14,r=14) R5=inv56 R10=fp
>
> Hmmm, endianness looks wrong here. "-target bpf" defaults to the
> endianness of whatever cpu that llvm was built for, right?
Hmm, would it show the right endianness when you compile
with "-target bpfeb"?
My understanding is that "-target bpf" defaults to host
cpu's endianness, and since you likely built clang/llvm
directly on sparc, it should also all run on target
endianness anyway (so no potential mixup when compiling
e.g. bpfeb on x86_64).
^ permalink raw reply
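For reference, the constant in the verifier dump can be reproduced in plain C. This is a sketch with hypothetical helper names (`swap16`, `load_native`). It only shows that 0xdd86 is what a little-endian load of the two h_proto bytes yields for IPv6 (0x86DD on the wire), whereas a big-endian host such as sparc would see 0x86dd.

```c
#include <stdint.h>

#define ETH_P_IPV6 0x86DD	/* EtherType for IPv6 */

/* Swap the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Value a CPU obtains when it loads the 2-byte wire field with a native
 * 16-bit load, for either host byte order.
 */
uint16_t load_native(const uint8_t wire[2], int host_is_big_endian)
{
	if (host_is_big_endian)
		return (uint16_t)((wire[0] << 8) | wire[1]);
	return (uint16_t)((wire[1] << 8) | wire[0]);
}
```

So a comparison against 0xdd86 is only correct for code generated for a little-endian target.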
* Re: [RFC 1/4] netlink: make extended ACK setting NULL-friendly
From: Jakub Kicinski @ 2017-04-25 20:53 UTC (permalink / raw)
To: Johannes Berg, daniel
Cc: netdev, davem, dsa, alexei.starovoitov, bblanco, john.fastabend,
kubakici, oss-drivers
In-Reply-To: <1493108014.2592.1.camel@sipsolutions.net>
On Tue, 25 Apr 2017 10:13:34 +0200, Johannes Berg wrote:
> On Tue, 2017-04-25 at 01:06 -0700, Jakub Kicinski wrote:
>
> > +#define NL_SET_ERR_MSG(extack, msg) do { \
> > + struct netlink_ext_ack *_extack = (extack); \
> > + static const char _msg[] = (msg); \
> > + \
> > + if (_extack) \
> > + _extack->_msg = _msg; \
> > + else \
> > + pr_info("%s\n", _msg); \
> > } while (0)
>
> That's a good point, I used it only for genetlink so far where it was
> guaranteed non-NULL.
>
> I'm not really sure about the printing though - I'd rather not people
> start relying on that and then we convert to have non-NULL and the
> message disappears as a result ...
Yes, agreed. I don't really know what to do about that one though :|
One could argue people may already be depending on the messages which
I'm converting in this series... On the other hand, that would
be considering logs as part of the ABI which we don't want to do.
I'm leaning towards dropping the else clause and never printing, that
will add an incentive for people to convert more paths to provide the
ext ack. Any thoughts on that?
^ permalink raw reply
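A userspace sketch of the variant being leaned towards here, with the else clause dropped so that a NULL extack is silently ignored. The names `ext_ack_sketch`, `SET_ERR_MSG_SKETCH` and `configure_sketch` are hypothetical stand-ins, not the kernel definitions.

```c
#include <stddef.h>

/* Stand-in for struct netlink_ext_ack; only the message pointer is modelled. */
struct ext_ack_sketch {
	const char *msg;
};

/* NULL-friendly setter: stores the message if there is somewhere to put it. */
#define SET_ERR_MSG_SKETCH(extack, text) do {		\
	struct ext_ack_sketch *_extack = (extack);	\
	static const char _msg[] = text;		\
							\
	if (_extack)					\
		_extack->msg = _msg;			\
} while (0)

/* Example caller that may or may not be handed an extack. */
int configure_sketch(struct ext_ack_sketch *extack, int value)
{
	if (value < 0) {
		SET_ERR_MSG_SKETCH(extack, "value must be non-negative");
		return -1;
	}
	return 0;
}
```

Callers that pass NULL still get the error code, just no message, which is the incentive to convert more paths mentioned above.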
* Re: [RFC 0/4] xdp: use netlink extended ACK reporting
From: Jakub Kicinski @ 2017-04-25 21:00 UTC (permalink / raw)
To: Daniel Borkmann
Cc: netdev, davem, johannes, dsa, alexei.starovoitov, bblanco,
john.fastabend, kubakici, oss-drivers
In-Reply-To: <58FF1157.8030309@iogearbox.net>
On Tue, 25 Apr 2017 11:05:27 +0200, Daniel Borkmann wrote:
> > adding #defines for the most common configuration conflicts?
> > Sharing the messages verbatim between drivers could make them easier
> > to google.
>
> Makes sense, once more drivers adapt to this reporting, these
> messages could be consolidated.
There seem to be concerns about standardizing/turning the strings into
ABI. I will leave it out for now, but we can revisit later :)
^ permalink raw reply
* Re: TCP fast open using experimental TCP option?
From: Yuchung Cheng @ 2017-04-25 21:00 UTC (permalink / raw)
To: Tom Herbert; +Cc: Linux Kernel Network Developers, Jerry Chu
In-Reply-To: <CALx6S36L4BbdcASbGNvn6mRMJDhJJ6n8y-uRTgVqteM=f6On6A@mail.gmail.com>
On Tue, Apr 25, 2017 at 12:08 PM, Tom Herbert <tom@herbertland.com> wrote:
> Looks like TCP fast open was using an experimental TCP option at some point. Is
> this still needed? Technically this violates usage requirements of
> experimental options. Can this be removed now since there is now an
> assigned option number for TFO?
Given that many clients (e.g. android) have not migrated to 4.0
kernels that support TFO opt 34, I would keep it for backward
compatibility for now.
>
> case TCPOPT_EXP:
> /* Fast Open option shares code 254 using a
> * 16 bits magic number.
> */
> if (opsize >= TCPOLEN_EXP_FASTOPEN_BASE &&
> get_unaligned_be16(ptr) ==
> TCPOPT_FASTOPEN_MAGIC)
> tcp_parse_fastopen_option(opsize -
> TCPOLEN_EXP_FASTOPEN_BASE,
> ptr + 2, th->syn, foc, true);
> break;
^ permalink raw reply
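To make the backward-compatibility point concrete, here is a standalone sketch of a parser that accepts both encodings of the Fast Open cookie: the assigned option 34 and the experimental option 254 carrying the 16-bit magic. The function name `tfo_cookie_sketch` is hypothetical, and the constant values are quoted from memory of include/net/tcp.h, so they should be checked against the tree.

```c
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_FASTOPEN			34	/* assigned option kind */
#define TCPOPT_EXP			254	/* shared experimental kind */
#define TCPOPT_FASTOPEN_MAGIC		0xF989	/* magic after kind/length */
#define TCPOLEN_FASTOPEN_BASE		2	/* kind + length */
#define TCPOLEN_EXP_FASTOPEN_BASE	4	/* kind + length + magic */

/*
 * opt points at a single TCP option (kind, length, payload).
 * Returns the cookie length and sets *cookie, or -1 if the option is
 * not a Fast Open option in either encoding.
 */
int tfo_cookie_sketch(const uint8_t *opt, size_t len, const uint8_t **cookie)
{
	uint8_t kind, opsize;

	if (len < 2)
		return -1;
	kind = opt[0];
	opsize = opt[1];
	if (opsize < 2 || opsize > len)
		return -1;

	if (kind == TCPOPT_FASTOPEN) {
		*cookie = opt + TCPOLEN_FASTOPEN_BASE;
		return opsize - TCPOLEN_FASTOPEN_BASE;
	}

	if (kind == TCPOPT_EXP && opsize >= TCPOLEN_EXP_FASTOPEN_BASE &&
	    (((unsigned int)opt[2] << 8) | opt[3]) == TCPOPT_FASTOPEN_MAGIC) {
		*cookie = opt + TCPOLEN_EXP_FASTOPEN_BASE;
		return opsize - TCPOLEN_EXP_FASTOPEN_BASE;
	}

	return -1;
}
```

Removing the experimental branch would make the second encoding fall through to -1, which is what older clients would run into.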
* Re: [RFC 0/4] xdp: use netlink extended ACK reporting
From: Jakub Kicinski @ 2017-04-25 21:05 UTC (permalink / raw)
To: David Ahern
Cc: netdev, davem, johannes, daniel, alexei.starovoitov, bblanco,
john.fastabend, oss-drivers
In-Reply-To: <2fbd12c0-dfed-c8df-8c74-1686565168ab@cumulusnetworks.com>
On Tue, 25 Apr 2017 08:53:35 -0600, David Ahern wrote:
> On 4/25/17 2:06 AM, Jakub Kicinski wrote:
>
> > Also - is anyone working on adding proper extack support to iproute2?
> > The code I have right now is a bit of a hack...
>
> This is what I have done:
> https://github.com/dsahern/iproute2/commits/ext-ack
>
> Basically, added the parsing code and then a new rtnl_talk_extack
> function that takes a callback to invoke with the extack data. The last
> patch (of 3) purposely breaks ip set link mtu -- sending the mtu as a
> u16 rather than a u32 just to work on the plumbing for parsing the
> returned message:
>
> $ ip li set dummy1 mtu 1490
> Error with rtnetlink attribute IFLA_MTU
>
> If an errmsg is returned it is printed as well.
Great, that's much better than what I have. It will make the XDP patch
for iproute2 pretty trivial :)
^ permalink raw reply
* Re: [PATCH v1] net: phy: fix auto-negotiation stall due to unavailable interrupt
From: Florian Fainelli @ 2017-04-25 21:07 UTC (permalink / raw)
To: David Miller, alexandre.belloni
Cc: al.kochet, netdev, linux-kernel, sergei.shtylyov, rogerq,
madalin.bucur
In-Reply-To: <20170425.162625.941066354018174414.davem@davemloft.net>
On 04/25/2017 01:26 PM, David Miller wrote:
> From: Alexandre Belloni <alexandre.belloni@free-electrons.com>
> Date: Tue, 25 Apr 2017 22:09:11 +0200
>
>> Hi,
>>
>> On 25/04/2017 at 18:25:30 +0300, Alexander Kochetkov wrote:
>>> Hello David!
>>>
>>>> On 25 Apr 2017, at 17:36, David Miller <davem@davemloft.net> wrote:
>>>>
>>>> So... what are we doing here?
>>>>
>>>> My understanding is that this should fix the same problem that commit
>>>> 99f81afc139c6edd14d77a91ee91685a414a1c66 ("phy: micrel: Disable auto
>>>> negotiation on startup") fixed and that this micrel commit should thus
>>>> be reverted to improve MAC startup times which regressed.
>>>>
>>>> Florian, any guidance?
>>>
>>> Yes, this should be done.
>>>
>>> I asked Alexandre to check if 99f81afc139c6edd14d77a91ee91685a414a1c66 ("phy: micrel: Disable auto
>>> negotiation on startup") can be reverted, and he answered that he may do that
>>> sometime this/next week.
>>>
>>
>> Yes, it can be safely reverted after Alexander's patch. I had to test on
>> v4.7 because we are not using interrupts on those boards since v4.8
>> (another issue to be fixed).
>>
>> As Florian pointed out, at the time I sent my patch, I didn't have time
>> to investigate whether this was affecting other phys, see
>> https://lkml.org/lkml/2016/2/26/766
>>
>> I can send the revert or you can do it.
>
> I can take care of it, thanks for testing.
Thanks! Can you add the following Fixes tag:
Fixes: 321beec5047a (net: phy: Use interrupts when available in NOLINK
state)
BTW, can you have the netdev patchwork instance automatically accept
Fixes: tags sent as replies to patches (just like Acked-by, Reviewed-by
and so on are already accepted)?
--
Florian
^ permalink raw reply
* Re: [PATCH net-next 01/10] tcp: add tp->tcp_mstamp field
From: Yuchung Cheng @ 2017-04-25 21:27 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, netdev, Soheil Hassas Yeganeh, Eric Dumazet
In-Reply-To: <20170425171541.3417-2-edumazet@google.com>
On Tue, Apr 25, 2017 at 10:15 AM, Eric Dumazet <edumazet@google.com> wrote:
> We want to use precise timestamps in TCP stack, but we do not
> want to call possibly expensive kernel time services too often.
>
> tp->tcp_mstamp is guaranteed to be updated once per incoming packet.
>
> We will use it in the following patches, removing specific
> skb_mstamp_get() calls, and removing ack_time from
> struct tcp_sacktag_state.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
> include/linux/tcp.h | 1 +
> net/ipv4/tcp_input.c | 3 +++
> 2 files changed, 4 insertions(+)
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index cbe5b602a2d349fdeb1e878305f37b4da1e6cc86..99a22f44c32e1587a6bf4835b65c7a4314807aa8 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -240,6 +240,7 @@ struct tcp_sock {
> u32 tlp_high_seq; /* snd_nxt at the time of TLP retransmit. */
>
> /* RTT measurement */
> + struct skb_mstamp tcp_mstamp; /* most recent packet received/sent */
Eric: would this new stamp cover outgoing packets as well in the
future? The patch series seems to cover only the incoming packets.
> u32 srtt_us; /* smoothed round trip time << 3 in usecs */
> u32 mdev_us; /* medium deviation */
> u32 mdev_max_us; /* maximal mdev for the last rtt period */
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 5af2f04f885914491a7116c20056b3d2188d2d7d..bd18c65df4a9d9c2b66d8005f2cc4ff468140a73 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5362,6 +5362,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
> {
> struct tcp_sock *tp = tcp_sk(sk);
>
> + skb_mstamp_get(&tp->tcp_mstamp);
> if (unlikely(!sk->sk_rx_dst))
> inet_csk(sk)->icsk_af_ops->sk_rx_dst_set(sk, skb);
> /*
> @@ -5922,6 +5923,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
>
> case TCP_SYN_SENT:
> tp->rx_opt.saw_tstamp = 0;
> + skb_mstamp_get(&tp->tcp_mstamp);
> queued = tcp_rcv_synsent_state_process(sk, skb, th);
> if (queued >= 0)
> return queued;
> @@ -5933,6 +5935,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
> return 0;
> }
>
> + skb_mstamp_get(&tp->tcp_mstamp);
> tp->rx_opt.saw_tstamp = 0;
> req = tp->fastopen_rsk;
> if (req) {
> --
> 2.13.0.rc0.306.g87b477812d-goog
>
^ permalink raw reply
* Re: [PATCH net-next 01/10] tcp: add tp->tcp_mstamp field
From: Eric Dumazet @ 2017-04-25 21:33 UTC (permalink / raw)
To: Yuchung Cheng
Cc: David S . Miller, netdev, Soheil Hassas Yeganeh, Eric Dumazet
In-Reply-To: <CAK6E8=fhzzLKiUnB73upZdUUUSZ4t6HABmM17YGyEj93jLBkMQ@mail.gmail.com>
On Tue, Apr 25, 2017 at 2:27 PM, Yuchung Cheng <ycheng@google.com> wrote:
>> + struct skb_mstamp tcp_mstamp; /* most recent packet received/sent */
> Eric: would this new stamp cover outgoing packets as well in the
> future? The patch series seems to cover only the incoming packets.
This is the plan, yes: tp->lsndtime will also be replaced for a better
TSO autodefer.
When it happens, tp->tcp_mstamp will be updated once per write episode.
And we'll remove a lot of skb_mstamp_get(&skb->skb_mstamp) calls all
over the place.
^ permalink raw reply
* [Patch net] ipv6: check skb->protocol before lookup for nexthop
From: Cong Wang @ 2017-04-25 21:37 UTC (permalink / raw)
To: netdev; +Cc: andreyknvl, Cong Wang, Steffen Klassert
Andrey reported an out-of-bounds access in ip6_tnl_xmit(). This
is because we use an IPv4 dst in ip6_tnl_xmit() and cast an IPv4
neigh key as an IPv6 address:
neigh = dst_neigh_lookup(skb_dst(skb),
&ipv6_hdr(skb)->daddr);
if (!neigh)
goto tx_err_link_failure;
addr6 = (struct in6_addr *)&neigh->primary_key; // <=== HERE
addr_type = ipv6_addr_type(addr6);
if (addr_type == IPV6_ADDR_ANY)
addr6 = &ipv6_hdr(skb)->daddr;
memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
Also, the network header of the skb at this point should still be IPv4
for 4in6 tunnels, so we should not just use it as an IPv6 header.
This patch fixes it by checking if skb->protocol is ETH_P_IPV6: if it
is, we are safe to do the nexthop lookup using skb_dst() and
ipv6_hdr(skb)->daddr; if not (i.e. IPv4), we have no clue about which
dest address we can pick here and have to rely on callers to fill it
from tunnel config, so just fall through to ip6_route_output() to make
the decision.
Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
net/ipv6/ip6_tunnel.c | 34 ++++++++++++++++++----------------
1 file changed, 18 insertions(+), 16 deletions(-)
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 75fac93..a9692ec 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1037,7 +1037,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
struct ip6_tnl *t = netdev_priv(dev);
struct net *net = t->net;
struct net_device_stats *stats = &t->dev->stats;
- struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+ struct ipv6hdr *ipv6h;
struct ipv6_tel_txoption opt;
struct dst_entry *dst = NULL, *ndst = NULL;
struct net_device *tdev;
@@ -1057,26 +1057,28 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
/* NBMA tunnel */
if (ipv6_addr_any(&t->parms.raddr)) {
- struct in6_addr *addr6;
- struct neighbour *neigh;
- int addr_type;
+ if (skb->protocol == htons(ETH_P_IPV6)) {
+ struct in6_addr *addr6;
+ struct neighbour *neigh;
+ int addr_type;
- if (!skb_dst(skb))
- goto tx_err_link_failure;
+ if (!skb_dst(skb))
+ goto tx_err_link_failure;
- neigh = dst_neigh_lookup(skb_dst(skb),
- &ipv6_hdr(skb)->daddr);
- if (!neigh)
- goto tx_err_link_failure;
+ neigh = dst_neigh_lookup(skb_dst(skb),
+ &ipv6_hdr(skb)->daddr);
+ if (!neigh)
+ goto tx_err_link_failure;
- addr6 = (struct in6_addr *)&neigh->primary_key;
- addr_type = ipv6_addr_type(addr6);
+ addr6 = (struct in6_addr *)&neigh->primary_key;
+ addr_type = ipv6_addr_type(addr6);
- if (addr_type == IPV6_ADDR_ANY)
- addr6 = &ipv6_hdr(skb)->daddr;
+ if (addr_type == IPV6_ADDR_ANY)
+ addr6 = &ipv6_hdr(skb)->daddr;
- memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
- neigh_release(neigh);
+ memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
+ neigh_release(neigh);
+ }
} else if (!(t->parms.flags &
(IP6_TNL_F_USE_ORIG_TCLASS | IP6_TNL_F_USE_ORIG_FWMARK))) {
/* enable the cache only only if the routing decision does
--
2.5.5
^ permalink raw reply related
* Re: [PATCH] IB/IPoIB: Check the headroom size
From: Or Gerlitz @ 2017-04-25 21:50 UTC (permalink / raw)
To: Doug Ledford
Cc: Erez Shitrit, Paolo Abeni, Honggang LI, Erez Shitrit,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Linux Netdev List, David Miller
In-Reply-To: <1493134815.3041.72.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
On Tue, Apr 25, 2017 at 6:40 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Tue, 2017-04-25 at 17:39 +0300, Or Gerlitz wrote:
>> If I got you right, Paolo's commit introduced a regression, so we (I
>> guess you and Paolo) need to either solve it or we (community)
>> should consider a revert, please suggest.
> [...] So, this issue should be
> reproducible either after Paolo's commit or with any kernel prior to
> your commit to use the skb->cb area to store the DGID, but it probably
> requires the specific series of actions in this bug to trigger it.
mmm
> A normal, clean shutdown of the interface doesn't demonstrate the issue.
so maybe, at least for the time being, we should pick Hong's patch
with a proper change log and without the giant stack dump till we have
something better. If you agree, can you do the rewrite of the change
log?
Or.
>> The bug is now in stable and distro kernels, so please act.
^ permalink raw reply
* [PATCH v2 1/2] net: dsa: b53: Add compatible strings for the Cygnus-family BCM11360.
From: Eric Anholt @ 2017-04-25 23:53 UTC (permalink / raw)
To: Florian Fainelli, Vivien Didelot, Andrew Lunn, netdev,
Rob Herring, Mark Rutland, devicetree
Cc: Scott Branden, Jon Mason, Ray Jui, linux-kernel, Eric Anholt,
bcm-kernel-feedback-list, linux-arm-kernel
Cygnus is a small family of SoCs, of which we currently have
devicetree for BCM11360 and BCM58300. The 11360's B53 is mostly the
same as 58xx, just requiring a tiny bit of setup that was previously
missing.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
v2: Reorder the entry in the docs (suggestion by Scott Branden), add
missing '"'
Documentation/devicetree/bindings/net/dsa/b53.txt | 3 +++
drivers/net/dsa/b53/b53_srab.c | 2 ++
2 files changed, 5 insertions(+)
diff --git a/Documentation/devicetree/bindings/net/dsa/b53.txt b/Documentation/devicetree/bindings/net/dsa/b53.txt
index d6c6e41648d4..eb679e92d525 100644
--- a/Documentation/devicetree/bindings/net/dsa/b53.txt
+++ b/Documentation/devicetree/bindings/net/dsa/b53.txt
@@ -13,6 +13,9 @@ Required properties:
"brcm,bcm5397"
"brcm,bcm5398"
+ For the BCM11360 SoC, must be:
+ "brcm,bcm11360-srab" and the mandatory "brcm,cygnus-srab" string
+
For the BCM5310x SoCs with an integrated switch, must be one of:
"brcm,bcm53010-srab"
"brcm,bcm53011-srab"
diff --git a/drivers/net/dsa/b53/b53_srab.c b/drivers/net/dsa/b53/b53_srab.c
index 8a62b6a69703..c37ffd1b6833 100644
--- a/drivers/net/dsa/b53/b53_srab.c
+++ b/drivers/net/dsa/b53/b53_srab.c
@@ -364,6 +364,7 @@ static const struct of_device_id b53_srab_of_match[] = {
{ .compatible = "brcm,bcm53018-srab" },
{ .compatible = "brcm,bcm53019-srab" },
{ .compatible = "brcm,bcm5301x-srab" },
+ { .compatible = "brcm,bcm11360-srab", .data = (void *)BCM58XX_DEVICE_ID },
{ .compatible = "brcm,bcm58522-srab", .data = (void *)BCM58XX_DEVICE_ID },
{ .compatible = "brcm,bcm58525-srab", .data = (void *)BCM58XX_DEVICE_ID },
{ .compatible = "brcm,bcm58535-srab", .data = (void *)BCM58XX_DEVICE_ID },
@@ -371,6 +372,7 @@ static const struct of_device_id b53_srab_of_match[] = {
{ .compatible = "brcm,bcm58623-srab", .data = (void *)BCM58XX_DEVICE_ID },
{ .compatible = "brcm,bcm58625-srab", .data = (void *)BCM58XX_DEVICE_ID },
{ .compatible = "brcm,bcm88312-srab", .data = (void *)BCM58XX_DEVICE_ID },
+ { .compatible = "brcm,cygnus-srab", .data = (void *)BCM58XX_DEVICE_ID },
{ .compatible = "brcm,nsp-srab", .data = (void *)BCM58XX_DEVICE_ID },
{ /* sentinel */ },
};
--
2.11.0
^ permalink raw reply related
* [PATCH v2 2/2] ARM: dts: Add the ethernet and ethernet PHY to the cygnus core DT.
From: Eric Anholt @ 2017-04-25 23:53 UTC (permalink / raw)
To: Florian Fainelli, Vivien Didelot, Andrew Lunn, netdev,
Rob Herring, Mark Rutland, devicetree
Cc: linux-arm-kernel, linux-kernel, bcm-kernel-feedback-list, Ray Jui,
Scott Branden, Jon Mason, Eric Anholt
In-Reply-To: <20170425235357.7690-1-eric@anholt.net>
Cygnus has a single AMAC controller connected to the B53 switch with 2
PHYs. On the BCM911360_EP platform, those two PHYs are connected to
the external ethernet jacks.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
---
v2: Call the node "switch", just call the ports "port" (suggestions by
Florian), drop max-speed on the phys (suggestion by Andrew Lunn),
call the other nodes "ethernet" and "ethernet-phy" (suggestions by
Sergei Shtylyov)
arch/arm/boot/dts/bcm-cygnus.dtsi | 58 ++++++++++++++++++++++++++++++++++
arch/arm/boot/dts/bcm911360_entphn.dts | 8 +++++
2 files changed, 66 insertions(+)
diff --git a/arch/arm/boot/dts/bcm-cygnus.dtsi b/arch/arm/boot/dts/bcm-cygnus.dtsi
index 009f1346b817..9fd89be0f5e0 100644
--- a/arch/arm/boot/dts/bcm-cygnus.dtsi
+++ b/arch/arm/boot/dts/bcm-cygnus.dtsi
@@ -142,6 +142,54 @@
interrupts = <0>;
};
+ mdio: mdio@18002000 {
+ compatible = "brcm,iproc-mdio";
+ reg = <0x18002000 0x8>;
+ #size-cells = <1>;
+ #address-cells = <0>;
+
+ gphy0: ethernet-phy@0 {
+ reg = <0>;
+ };
+
+ gphy1: ethernet-phy@1 {
+ reg = <1>;
+ };
+ };
+
+ switch: switch@18007000 {
+ compatible = "brcm,bcm11360-srab", "brcm,cygnus-srab";
+ reg = <0x18007000 0x1000>;
+ status = "disabled";
+
+ ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ port@0 {
+ reg = <0>;
+ phy-handle = <&gphy0>;
+ phy-mode = "rgmii";
+ };
+
+ port@1 {
+ reg = <1>;
+ phy-handle = <&gphy1>;
+ phy-mode = "rgmii";
+ };
+
+ port@8 {
+ reg = <8>;
+ label = "cpu";
+ ethernet = <&eth0>;
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ };
+ };
+ };
+ };
+
i2c0: i2c@18008000 {
compatible = "brcm,cygnus-iproc-i2c", "brcm,iproc-i2c";
reg = <0x18008000 0x100>;
@@ -295,6 +343,16 @@
status = "disabled";
};
+ eth0: ethernet@18042000 {
+ compatible = "brcm,amac";
+ reg = <0x18042000 0x1000>,
+ <0x18110000 0x1000>;
+ reg-names = "amac_base", "idm_base";
+ interrupts = <GIC_SPI 110 IRQ_TYPE_LEVEL_HIGH>;
+ max-speed = <1000>;
+ status = "disabled";
+ };
+
nand: nand@18046000 {
compatible = "brcm,nand-iproc", "brcm,brcmnand-v6.1";
reg = <0x18046000 0x600>, <0xf8105408 0x600>,
diff --git a/arch/arm/boot/dts/bcm911360_entphn.dts b/arch/arm/boot/dts/bcm911360_entphn.dts
index 8b3800f46288..e037dea63f4a 100644
--- a/arch/arm/boot/dts/bcm911360_entphn.dts
+++ b/arch/arm/boot/dts/bcm911360_entphn.dts
@@ -57,6 +57,14 @@
};
};
+&eth0 {
+ status = "okay";
+};
+
+&switch {
+ status = "okay";
+};
+
&uart3 {
status = "okay";
};
--
2.11.0
^ permalink raw reply related
* Re: [PATCH v2 2/2] ARM: dts: Add the ethernet and ethernet PHY to the cygnus core DT.
From: Florian Fainelli @ 2017-04-25 23:59 UTC (permalink / raw)
To: Eric Anholt, Vivien Didelot, Andrew Lunn,
netdev-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Mark Rutland,
devicetree-u79uwXL29TY76Z2rM5mHXA
Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
bcm-kernel-feedback-list-dY08KVG/lbpWk0Htik3J/w, Ray Jui,
Scott Branden, Jon Mason
In-Reply-To: <20170425235357.7690-2-eric-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
On 04/25/2017 04:53 PM, Eric Anholt wrote:
> Cygnus has a single AMAC controller connected to the B53 switch with 2
> PHYs. On the BCM911360_EP platform, those two PHYs are connected to
> the external ethernet jacks.
>
> Signed-off-by: Eric Anholt <eric-WhKQ6XTQaPysTnJN9+BGXg@public.gmane.org>
> Reviewed-by: Florian Fainelli <f.fainelli-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
>
> v2: Call the node "switch", just call the ports "port" (suggestions by
> Florian), drop max-speed on the phys (suggestion by Andrew Lunn),
> call the other nodes "ethernet" and "ethernet-phy" (suggestions by
> Sergei Shtylyov)
>
> arch/arm/boot/dts/bcm-cygnus.dtsi | 58 ++++++++++++++++++++++++++++++++++
> arch/arm/boot/dts/bcm911360_entphn.dts | 8 +++++
> 2 files changed, 66 insertions(+)
>
> diff --git a/arch/arm/boot/dts/bcm-cygnus.dtsi b/arch/arm/boot/dts/bcm-cygnus.dtsi
> index 009f1346b817..9fd89be0f5e0 100644
> --- a/arch/arm/boot/dts/bcm-cygnus.dtsi
> +++ b/arch/arm/boot/dts/bcm-cygnus.dtsi
> @@ -142,6 +142,54 @@
> interrupts = <0>;
> };
>
> + mdio: mdio@18002000 {
> + compatible = "brcm,iproc-mdio";
> + reg = <0x18002000 0x8>;
> + #size-cells = <1>;
> + #address-cells = <0>;
Sorry for not noticing earlier. Since you override this correctly in the
board-level DTS file, can you put a:
status = "disabled"
property in there by default?
Thanks!
> +
> + gphy0: ethernet-phy@0 {
> + reg = <0>;
> + };
> +
> + gphy1: ethernet-phy@1 {
> + reg = <1>;
> + };
> + };
> +
> + switch: switch@18007000 {
> + compatible = "brcm,bcm11360-srab", "brcm,cygnus-srab";
> + reg = <0x18007000 0x1000>;
> + status = "disabled";
> +
> + ports {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + port@0 {
> + reg = <0>;
> + phy-handle = <&gphy0>;
> + phy-mode = "rgmii";
> + };
> +
> + port@1 {
> + reg = <1>;
> + phy-handle = <&gphy1>;
> + phy-mode = "rgmii";
> + };
> +
> + port@8 {
> + reg = <8>;
> + label = "cpu";
> + ethernet = <&eth0>;
> + fixed-link {
> + speed = <1000>;
> + full-duplex;
> + };
> + };
> + };
> + };
> +
> i2c0: i2c@18008000 {
> compatible = "brcm,cygnus-iproc-i2c", "brcm,iproc-i2c";
> reg = <0x18008000 0x100>;
> @@ -295,6 +343,16 @@
> status = "disabled";
> };
>
> + eth0: ethernet@18042000 {
> + compatible = "brcm,amac";
> + reg = <0x18042000 0x1000>,
> + <0x18110000 0x1000>;
> + reg-names = "amac_base", "idm_base";
> + interrupts = <GIC_SPI 110 IRQ_TYPE_LEVEL_HIGH>;
> + max-speed = <1000>;
> + status = "disabled";
> + };
> +
> nand: nand@18046000 {
> compatible = "brcm,nand-iproc", "brcm,brcmnand-v6.1";
> reg = <0x18046000 0x600>, <0xf8105408 0x600>,
> diff --git a/arch/arm/boot/dts/bcm911360_entphn.dts b/arch/arm/boot/dts/bcm911360_entphn.dts
> index 8b3800f46288..e037dea63f4a 100644
> --- a/arch/arm/boot/dts/bcm911360_entphn.dts
> +++ b/arch/arm/boot/dts/bcm911360_entphn.dts
> @@ -57,6 +57,14 @@
> };
> };
>
> +&eth0 {
> + status = "okay";
> +};
> +
> +&switch {
> + status = "okay";
> +};
> +
> &uart3 {
> status = "okay";
> };
>
--
Florian
^ permalink raw reply
* Re: [PATCH RFC] ptr_ring: add ptr_ring_unconsume
From: Jason Wang @ 2017-04-26 0:10 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: linux-kernel, netdev
In-Reply-To: <20170425182222-mutt-send-email-mst@kernel.org>
On 2017-04-25 23:35, Michael S. Tsirkin wrote:
> On Tue, Apr 25, 2017 at 12:07:01PM +0800, Jason Wang wrote:
>>
>> On 2017-04-24 20:00, Michael S. Tsirkin wrote:
>>> On Mon, Apr 24, 2017 at 07:54:18PM +0800, Jason Wang wrote:
>>>> On 2017-04-24 07:28, Michael S. Tsirkin wrote:
>>>>> On Tue, Apr 18, 2017 at 11:07:42AM +0800, Jason Wang wrote:
>>>>>> On 2017-04-17 07:19, Michael S. Tsirkin wrote:
>>>>>>> Applications that consume a batch of entries in one go
>>>>>>> can benefit from ability to return some of them back
>>>>>>> into the ring.
>>>>>>>
>>>>>>> Add an API for that - assuming there's space. If there's no space
>>>>>>> naturally we can't do this and have to drop entries, but this implies
>>>>>>> ring is full so we'd likely drop some anyway.
>>>>>>>
>>>>>>> Signed-off-by: Michael S. Tsirkin<mst@redhat.com>
>>>>>>> ---
>>>>>>>
>>>>>>> Jason, in my mind the biggest issue with your batching patchset is the
>>>>>>> packet drops on disconnect. This API will help avoid that in the common
>>>>>>> case.
>>>>>> Ok, I will rebase the series on top of this. (Though I don't think we care
>>>>>> about the packet loss).
>>>>> E.g. I care - I often start sending packets to VM before it's
>>>>> fully booted. Several vhost resets might follow.
>>>> Ok.
>>>>
>>>>>>> I would still prefer that we understand what's going on,
>>>>>> I try to reply in another thread, does it make sense?
>>>>>>
>>>>>>> and I would
>>>>>>> like to know what's the smallest batch size that's still helpful,
>>>>>> Yes, I've replied in another thread, the result is:
>>>>>>
>>>>>>
>>>>>> no batching 1.88Mpps
>>>>>> RX_BATCH=1 1.93Mpps
>>>>>> RX_BATCH=4 2.11Mpps
>>>>>> RX_BATCH=16 2.14Mpps
>>>>>> RX_BATCH=64 2.25Mpps
>>>>>> RX_BATCH=256 2.18Mpps
>>>>> Essentially 4 is enough, other stuff looks more like noise
>>>>> to me. What about 2?
>>>> The numbers are pretty stable, so probably not noise. Retested on top of
>>>> batch zeroing:
>>>>
>>>> no 1.97Mpps
>>>> 1 2.09Mpps
>>>> 2 2.11Mpps
>>>> 4 2.16Mpps
>>>> 8 2.19Mpps
>>>> 16 2.21Mpps
>>>> 32 2.25Mpps
>>>> 64 2.30Mpps
>>>> 128 2.21Mpps
>>>> 256 2.21Mpps
>>>>
>>>> 64 performs best.
>>>>
>>>> Thanks
>>> OK but it might be e.g. a function of the ring size, host cache size or
>>> whatever. As we don't really understand the why, if we just optimize for
>>> your setup we risk regressions in others. 64 entries is a lot, it
>>> increases the queue size noticeably. Could this be part of the effect?
>>> Could you try changing the queue size to see what happens?
>> I increased tx_queue_len to 1100, but only saw less than 1% improvement in
>> the pps number (batch = 1) on my machine. If you care about the regression, we
>> can probably leave the choice to the user through e.g. a module parameter. But I'm
>> afraid we already have too many choices for them. Or I can test this
>> with different CPU types.
>>
>> Thanks
>>
> I agree here. Let's keep it a constant. Testing on more machines would
> be nice but not strictly required.
Ok, I will give a full benchmark (batch=1,4,64) on TCP stream to see how
it will perform. Let's decide then.
> I just dislike not understanding why
> it helps because it means we can easily break it by mistake. So my only
> request really is that you wrap access to this internal buffer in an
> API. Let's see - I think we need
>
> struct vhost_net_buf
> vhost_net_buf_get_ptr
> vhost_net_buf_get_size
> vhost_net_buf_is_empty
> vhost_net_buf_peek
> vhost_net_buf_consume
> vhost_net_buf_produce
Ok. Will do in next version.
Thanks.
^ permalink raw reply
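One possible shape for the wrapper requested above, written as a standalone userspace sketch. Everything here is an assumption for illustration: the `net_buf_sketch` names, the head/tail layout, the rule that the buffer is refilled only when empty, and the batch size of 64 taken from the benchmark numbers. Entries are assumed to be non-NULL pointers.

```c
#include <stddef.h>

#define NET_BUF_BATCH_SKETCH 64	/* assumed batch size */

struct net_buf_sketch {
	void *queue[NET_BUF_BATCH_SKETCH];
	int head;	/* index of the next entry to consume */
	int tail;	/* one past the last valid entry */
};

int net_buf_get_size(const struct net_buf_sketch *b)
{
	return b->tail - b->head;
}

int net_buf_is_empty(const struct net_buf_sketch *b)
{
	return b->tail == b->head;
}

/* Pointer to the first unconsumed entry, e.g. for handing entries back. */
void **net_buf_get_ptr(struct net_buf_sketch *b)
{
	return b->queue + b->head;
}

void *net_buf_peek(const struct net_buf_sketch *b)
{
	return net_buf_is_empty(b) ? NULL : b->queue[b->head];
}

void *net_buf_consume(struct net_buf_sketch *b)
{
	void *p = net_buf_peek(b);

	if (p)
		b->head++;
	return p;
}

/* Refill from src only when the buffer is empty; returns entries taken. */
int net_buf_produce(struct net_buf_sketch *b, void *const *src, int n)
{
	int i;

	if (!net_buf_is_empty(b))
		return 0;
	if (n < 0)
		n = 0;
	if (n > NET_BUF_BATCH_SKETCH)
		n = NET_BUF_BATCH_SKETCH;
	for (i = 0; i < n; i++)
		b->queue[i] = src[i];
	b->head = 0;
	b->tail = n;
	return n;
}
```

In such a layout, net_buf_get_ptr() together with net_buf_get_size() would describe the leftover entries that an unconsume operation could return to the ring on reset.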
* Re: TCP fast open using experimental TCP option?
From: Eric Dumazet @ 2017-04-26 0:13 UTC (permalink / raw)
To: Tom Herbert; +Cc: Linux Kernel Network Developers, Jerry Chu
In-Reply-To: <CALx6S36L4BbdcASbGNvn6mRMJDhJJ6n8y-uRTgVqteM=f6On6A@mail.gmail.com>
On Tue, 2017-04-25 at 12:08 -0700, Tom Herbert wrote:
> Looks like TCP fast open was using an experimental TCP option at some point. Is
> this still needed? Technically this violates usage requirements of
> experimental options. Can this be removed now since there is now an
> assigned option number for TFO?
>
> case TCPOPT_EXP:
> /* Fast Open option shares code 254 using a
> * 16 bits magic number.
> */
> if (opsize >= TCPOLEN_EXP_FASTOPEN_BASE &&
> get_unaligned_be16(ptr) ==
> TCPOPT_FASTOPEN_MAGIC)
> tcp_parse_fastopen_option(opsize -
> TCPOLEN_EXP_FASTOPEN_BASE,
> ptr + 2, th->syn, foc, true);
> break;
Hi Tom
Client side was updated in linux-4.1 only two years ago.
We lack counters telling how often the experimental option is used.
RFC6994 ( 5. Migration to Assigned Options ) guidelines are properly
met.
Not sure why we should hurry ?
^ permalink raw reply
* Re: [PATCH net-next] virtio-net: on tx, only call napi_disable if tx napi is on
From: Jason Wang @ 2017-04-26 0:17 UTC (permalink / raw)
To: Willem de Bruijn, netdev; +Cc: Willem de Bruijn, virtualization, davem, mst
In-Reply-To: <20170425195917.54209-1-willemdebruijn.kernel@gmail.com>
On 2017-04-26 03:59, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> As of tx napi, device down (`ip link set dev $dev down`) hangs unless
> tx napi is enabled. Else napi_enable is not called, so napi_disable
> will spin on test_and_set_bit NAPI_STATE_SCHED.
>
> Only call napi_disable if tx napi is enabled.
>
> Fixes: 5a719c2552ca ("virtio-net: transmit napi")
> Reported-by: Jason Wang <jasowang@redhat.com>
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
> drivers/net/virtio_net.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 003143835766..82f1c3a73345 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -997,6 +997,12 @@ static void virtnet_napi_tx_enable(struct virtnet_info *vi,
> return virtnet_napi_enable(vq, napi);
> }
>
> +static void virtnet_napi_tx_disable(struct napi_struct *napi)
> +{
> + if (napi->weight)
> + napi_disable(napi);
> +}
> +
> static void refill_work(struct work_struct *work)
> {
> struct virtnet_info *vi =
> @@ -1445,7 +1451,7 @@ static int virtnet_close(struct net_device *dev)
>
> for (i = 0; i < vi->max_queue_pairs; i++) {
> napi_disable(&vi->rq[i].napi);
> - napi_disable(&vi->sq[i].napi);
> + virtnet_napi_tx_disable(&vi->sq[i].napi);
> }
>
> return 0;
> @@ -1803,7 +1809,7 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
> if (netif_running(vi->dev)) {
> for (i = 0; i < vi->max_queue_pairs; i++) {
> napi_disable(&vi->rq[i].napi);
> - napi_disable(&vi->sq[i].napi);
> + virtnet_napi_tx_disable(&vi->sq[i].napi);
> }
> }
> }
Acked-by: Jason Wang <jasowang@redhat.com>
Thanks
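
To make the hang described in the commit message concrete, here is a small userspace model of the NAPI_STATE_SCHED handshake. It assumes the behaviour the commit message describes: a freshly added napi starts with SCHED set, napi_enable clears it, and napi_disable loops until it can take the bit. All model_* names are hypothetical, and model_napi_try_disable() reports whether the real loop would finish instead of actually spinning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct napi_struct: one state bit plus weight. */
struct model_napi {
	bool sched;	/* models NAPI_STATE_SCHED */
	int weight;	/* 0 means tx napi is off */
};

/* Registration leaves SCHED set until the napi is enabled. */
static void model_napi_add(struct model_napi *n, int weight)
{
	assert(n);
	n->sched = true;
	n->weight = weight;
}

static void model_napi_enable(struct model_napi *n)
{
	n->sched = false;
}

/*
 * One iteration of the test_and_set_bit loop in napi_disable: returns
 * true if the bit was clear (disable completes), false if the real code
 * would keep spinning.
 */
static bool model_napi_try_disable(struct model_napi *n)
{
	bool was_set = n->sched;

	n->sched = true;
	return !was_set;
}

/* Mirrors the tx enable path: skipped entirely when weight is 0. */
static void model_tx_enable(struct model_napi *n)
{
	if (!n->weight)
		return;
	model_napi_enable(n);
}

/* Mirrors the helper added by the patch: only disable what was enabled. */
static bool model_tx_disable(struct model_napi *n)
{
	if (!n->weight)
		return true;
	return model_napi_try_disable(n);
}
```

In this model, a weight-0 napi that goes through model_tx_enable() and then a plain model_napi_try_disable() never succeeds, which corresponds to the hang on device down; routing it through model_tx_disable() avoids the call altogether.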
^ permalink raw reply
* Re: xdp_redirect ifindex vs port. Was: best API for returning/setting egress port?
From: Alexei Starovoitov @ 2017-04-26 0:26 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: Daniel Borkmann, Andy Gospodarek, Daniel Borkmann,
Alexei Starovoitov, netdev@vger.kernel.org,
xdp-newbies@vger.kernel.org, John Fastabend
In-Reply-To: <20170425113453.5c72080f@redhat.com>
On Tue, Apr 25, 2017 at 11:34:53AM +0200, Jesper Dangaard Brouer wrote:
> > Note the very first bpf patchset years ago contained the port table
> > abstraction. ovs has concept of vports as well. These two very
> > different projects needed port table to provide a layer of
> > indirection between ifindex==netdev and virtual port number.
> > This is still the case and I'd like to see this port table to be
> > implemented for both cls_bpf and xdp. In that sense xdp is not
> > special.
>
> Glad to hear you want to see this implemented, I will start coding on
> this then. Good point with cls_bpf, I was planning to make this port
> table strongly connected to XDP, guess I should also think of cls_bpf.
perfect.
I think we should try to make all additions to the bpf networking world
usable for both tc and xdp, since both are actively used and
it wouldn't be great to have a cool feature for one, but not the other.
I think the port table is an excellent candidate that applies to both.
> I'm not worried about the DROP case, I agree that is fine (as you also
> say). The problem is unintentionally sending a packet to a wrong
> ifindex. This is clearly an eBPF program error, BUT with XDP this
> becomes a very hard to debug program error. With TC-redirect/cls_bpf
> we can tcpdump the packets, with XDP there is no visibility into this
> happening (the NSA is going to love this "feature"). Maybe we could add
> yet-another tracepoint to allow debugging this. My proposal is that we
> simply remove the possibility for such program errors, by, as you say,
> moving the validation from run-time into static insertion-time, via a
> port table.
I think lack of tcpdump-like debugging in xdp is a separate issue.
As I was saying in the other thread we have a trivial 'xdpdump' kern+user
app that emits a pcap file, but it's too specific to how we use
tail_calls+prog_array in our xdp setup. I'm working on program
chaining that will be generic and allow us to transparently add multiple
xdp or tc progs to the same attachment point and will allow us to
do 'xdpdump' at any point of this pipeline, so debugging of what
happened to the packet will be easier and done in the same way
for both tc and xdp.
btw in our experience working with both tc and xdp, tc+bpf was
actually harder to use and more bug prone.
^ permalink raw reply
* [PATCH net-next] tcp: memset ca_priv data to 0 properly
From: Wei Wang @ 2017-04-26 0:38 UTC (permalink / raw)
To: netdev, David Miller; +Cc: Eric Dumazet, Yuchung Cheng, Neal Cardwell, Wei Wang
From: Wei Wang <weiwan@google.com>
Always zero out ca_priv data in tcp_assign_congestion_control() so that
ca_priv data is cleared out during socket creation.
Also always zero out ca_priv data in tcp_reinit_congestion_control() so
that when cc algorithm is changed, ca_priv data is cleared out as well.
We should still zero out ca_priv data even in TCP_CLOSE state because
user could call connect() on AF_UNSPEC to disconnect the socket and
leave it in TCP_CLOSE state and later call setsockopt() to switch cc
algorithm on this socket.
Fixes: 2b0a8c9ee ("tcp: add CDG congestion control")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
---
net/ipv4/tcp_cong.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index 79c4817abc94..6e3c512054a6 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -168,12 +168,8 @@ void tcp_assign_congestion_control(struct sock *sk)
}
out:
rcu_read_unlock();
+ memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));
- /* Clear out private data before diag gets it and
- * the ca has not been initialized.
- */
- if (ca->get_info)
- memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));
if (ca->flags & TCP_CONG_NEEDS_ECN)
INET_ECN_xmit(sk);
else
@@ -200,11 +196,10 @@ static void tcp_reinit_congestion_control(struct sock *sk,
tcp_cleanup_congestion_control(sk);
icsk->icsk_ca_ops = ca;
icsk->icsk_ca_setsockopt = 1;
+ memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));
- if (sk->sk_state != TCP_CLOSE) {
- memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));
+ if (sk->sk_state != TCP_CLOSE)
tcp_init_congestion_control(sk);
- }
}
/* Manage refcounts on socket close. */
--
2.13.0.rc0.306.g87b477812d-goog
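
As a rough illustration of the TCP_CLOSE case described in the commit message, the userspace model below contrasts the old conditional memset with the unconditional one. struct model_sock, its field names, and CA_PRIV_WORDS are assumptions made for this sketch; only the shape of the two code paths follows the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define CA_PRIV_WORDS 8	/* assumed size, not the kernel's */

/* Hypothetical stand-in for the socket state touched by the patch. */
struct model_sock {
	bool closed;				/* models sk_state == TCP_CLOSE */
	unsigned long ca_priv[CA_PRIV_WORDS];	/* models icsk_ca_priv */
};

/* Old behaviour: private data is only cleared when the socket is not closed. */
static void model_reinit_old(struct model_sock *sk)
{
	assert(sk);
	if (!sk->closed)
		memset(sk->ca_priv, 0, sizeof(sk->ca_priv));
}

/* New behaviour: always clear, whatever the socket state. */
static void model_reinit_new(struct model_sock *sk)
{
	assert(sk);
	memset(sk->ca_priv, 0, sizeof(sk->ca_priv));
}
```

In the model, a closed socket keeps whatever the previous algorithm left in ca_priv under the old path, so the next algorithm would start from stale state; the new path always hands it zeroed memory.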
^ permalink raw reply related
* Re: [PATCH net] net: batman-adv: Fix possible memleaks when fail to register_netdevice
From: Gao Feng @ 2017-04-26 0:41 UTC (permalink / raw)
To: 'Sven Eckelmann',
b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r
Cc: mareklindner-rVWd3aGhH2z5bpWLKbzFeg,
netdev-u79uwXL29TY76Z2rM5mHXA, a, 'Gao Feng',
davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <1756616.320MP6AHYH@bentobox>
> From: Sven Eckelmann [mailto:sven-KaDOiPu9UxWEi8DpZVb4nw@public.gmane.org]
> Sent: Tuesday, April 25, 2017 9:53 PM
> On Tuesday, 25 April 2017 20:03:20 CEST gfree.wind-H32Fclmsjq1BDgjK7y7TUQ@public.gmane.org wrote:
> > From: Gao Feng <fgao-KlmEoCYek3zQT0dZR+AlfA@public.gmane.org>
> >
> > batadv_softif_init_late allocates some resources and is invoked from
> > register_netdevice, so we need to invoke batadv_softif_free instead of
> > free_netdev to clean up when register_netdevice fails.
>
> I could be wrong, but shouldn't the destructor be replaced with free_netdevice
> and the batadv_softif_free (without the free_netdev) used as ndo_uninit? The
> line you've changed should then be kept as free_netdevice.
>
> At least this seems to be important when using rtnl_newlink() instead of the
> legacy sysfs netdev stuff from batman-adv. rtnl_newlink() would also only call
> free_netdevice and thus also not run batadv_softif_free. This seems to be only
> fixable by calling ndo_uninit.
>
> Kind regards,
> Sven
Sorry, I don't follow.
The net_dev is created in the function batadv_softif_create.
Why can't we invoke batadv_softif_free to clean up when
register_netdevice fails?
Best Regards
Feng
^ permalink raw reply
* [PATCH] ipv6: check raw payload size correctly in ioctl
From: Jamie Bainbridge @ 2017-04-26 0:43 UTC (permalink / raw)
To: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, netdev
Cc: Jamie Bainbridge
In situations where an skb is paged, the transport header pointer and
tail pointer can be the same because the skb contents are in frags.
This results in ioctl(SIOCINQ/FIONREAD) incorrectly returning a
length of 0 when the length to receive is actually greater than zero.
skb->len is already correctly set in ip6_input_finish() with
pskb_pull(), so use skb->len as it always returns the correct result
for both linear and paged data.
Signed-off-by: Jamie Bainbridge <jbainbri@redhat.com>
---
net/ipv6/raw.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index f174e76e6505d4045e940c9fceef765d2aaa937d..0da6a12b5472e322d679572c7244e5c9bc467741 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1178,8 +1178,7 @@ static int rawv6_ioctl(struct sock *sk, int cmd, unsigned long arg)
spin_lock_bh(&sk->sk_receive_queue.lock);
skb = skb_peek(&sk->sk_receive_queue);
if (skb)
- amount = skb_tail_pointer(skb) -
- skb_transport_header(skb);
+ amount = skb->len;
spin_unlock_bh(&sk->sk_receive_queue.lock);
return put_user(amount, (int __user *)arg);
}
--
1.8.3.1
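
The difference between the two computations can be shown with a small userspace model. struct model_skb and its fields are hypothetical and only capture what the commit message relies on: the total payload length, and the offsets of the transport header and of the end of the linear area.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the sk_buff fields the patch is about. */
struct model_skb {
	unsigned int len;	/* payload bytes, linear plus paged */
	size_t transport_off;	/* offset of the transport header */
	size_t tail_off;	/* end of the linear data */
};

/* Old computation: only sees bytes in the linear area. */
static unsigned int model_amount_old(const struct model_skb *skb)
{
	assert(skb->tail_off >= skb->transport_off);
	return (unsigned int)(skb->tail_off - skb->transport_off);
}

/* New computation: the length field covers linear and paged data alike. */
static unsigned int model_amount_new(const struct model_skb *skb)
{
	return skb->len;
}
```

For a fully paged buffer the two offsets coincide, so the old computation yields 0 even though len is non-zero; for a purely linear buffer both computations agree.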
^ permalink raw reply related
* Re: TCP fast open using experimental TCP option?
From: Tom Herbert @ 2017-04-26 0:39 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Linux Kernel Network Developers, Jerry Chu
In-Reply-To: <1493165636.6453.56.camel@edumazet-glaptop3.roam.corp.google.com>
On Tue, Apr 25, 2017 at 5:13 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2017-04-25 at 12:08 -0700, Tom Herbert wrote:
>> Looks like TCP fast open was using experimental TCP option at some point. Is
>> this still needed? Technically this violates usage requirements of
>> experimental options. Can this be removed now since there is now an
>> assigned option number for TFO?
>>
>> case TCPOPT_EXP:
>> /* Fast Open option shares code 254 using a
>> * 16 bits magic number.
>> */
>> if (opsize >= TCPOLEN_EXP_FASTOPEN_BASE &&
>> get_unaligned_be16(ptr) ==
>> TCPOPT_FASTOPEN_MAGIC)
>> tcp_parse_fastopen_option(opsize -
>> TCPOLEN_EXP_FASTOPEN_BASE,
>> ptr + 2, th->syn, foc, true);
>> break;
>
> Hi Tom
>
> Client side was updated in linux-4.1 only two years ago.
>
> We lack counters telling how often the experimental option is used.
>
> RFC6994 ( 5. Migration to Assigned Options ) guidelines are properly
> met.
>
> Not sure why we should hurry ?
>
An IETFer was complaining that Linux indiscriminately violates TCP
RFCs and gave the use of experimental options as an example. Yuchung
pointed out that this use actually conforms to the spec so we're
good (thanks Yuchung!).
Tom
^ permalink raw reply
* Re: [PATCH v2 2/2] ARM: dts: Add the ethernet and ethernet PHY to the cygnus core DT.
From: Andrew Lunn @ 2017-04-26 0:49 UTC (permalink / raw)
To: Eric Anholt
Cc: Florian Fainelli, Vivien Didelot, netdev, Rob Herring,
Mark Rutland, devicetree, linux-arm-kernel, linux-kernel,
bcm-kernel-feedback-list, Ray Jui, Scott Branden, Jon Mason
In-Reply-To: <20170425235357.7690-2-eric@anholt.net>
> + eth0: ethernet@18042000 {
> + compatible = "brcm,amac";
> + reg = <0x18042000 0x1000>,
> + <0x18110000 0x1000>;
> + reg-names = "amac_base", "idm_base";
> + interrupts = <GIC_SPI 110 IRQ_TYPE_LEVEL_HIGH>;
> + max-speed = <1000>;
Hi Eric
Sorry, I missed this the first time. Does this Ethernet controller do more
than 1Gbps? Does this max-speed do anything useful?
Andrew
^ permalink raw reply
* linux-next: manual merge of the net-next tree with the kbuild tree
From: Stephen Rothwell @ 2017-04-26 1:09 UTC (permalink / raw)
To: David Miller, Networking, Masahiro Yamada
Cc: Linux-Next Mailing List, Linux Kernel Mailing List,
Nicolas Dichtel, Gerard Garcia
Hi all,
Today's linux-next merge of the net-next tree got a conflict in:
include/uapi/linux/Kbuild
between commit:
65017bab8a9e ("uapi: export all headers under uapi directories")
from the kbuild tree and commit:
0b2e66448ba2 ("VSOCK: Add vsockmon device")
from the net-next tree.
I fixed it up (I just used the kbuild tree version as new entries are not
needed any more in this file) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging. You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.
--
Cheers,
Stephen Rothwell
^ permalink raw reply