* Re: [PATCH net-next 2/2] net: initialize hashrnd in flow_dissector with net_get_random_once
From: David Miller @ 2013-10-25 23:04 UTC (permalink / raw)
To: hannes; +Cc: netdev, edumazet
In-Reply-To: <20131023111219.GA31531@order.stressinduktion.org>
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Wed, 23 Oct 2013 13:12:19 +0200
> We also can defer the initialization of hashrnd in flow_dissector
> to its first use. Since net_get_random_once is irqsave now we don't
> have to audit the call paths if one of this functions get called by an
> interrupt handler.
>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Applied.
^ permalink raw reply
* Re: [PATCH net-next] net: add missing dev_put() in __netdev_adjacent_dev_insert
From: David Miller @ 2013-10-25 23:04 UTC (permalink / raw)
To: nikolay; +Cc: netdev, vfalico
In-Reply-To: <1382534936-23080-1-git-send-email-nikolay@redhat.com>
From: Nikolay Aleksandrov <nikolay@redhat.com>
Date: Wed, 23 Oct 2013 15:28:56 +0200
> I think that a dev_put() is needed in the error path to preserve the
> proper dev refcount.
>
> CC: Veaceslav Falico <vfalico@redhat.com>
> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Applied.
^ permalink raw reply
* Re: [PATCH v2 net-next] fix rtnl notification in atomic context
From: David Miller @ 2013-10-25 23:05 UTC (permalink / raw)
To: ast; +Cc: nicolas.dichtel, amwang, vfalico, netdev
In-Reply-To: <1382569362-4820-1-git-send-email-ast@plumgrid.com>
From: Alexei Starovoitov <ast@plumgrid.com>
Date: Wed, 23 Oct 2013 16:02:42 -0700
> commit 991fb3f74c "dev: always advertise rx_flags changes via netlink"
> introduced rtnl notification from __dev_set_promiscuity(),
> which can be called in atomic context.
>
> Steps to reproduce:
> ip tuntap add dev tap1 mode tap
> ifconfig tap1 up
> tcpdump -nei tap1 &
> ip tuntap del dev tap1 mode tap
...
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Applied.
^ permalink raw reply
* Re: [patch net-next] ipv6: allow userspace to create address with IFLA_F_TEMPORARY flag
From: Vladislav Yasevich @ 2013-10-25 23:05 UTC (permalink / raw)
To: Jiri Pirko
Cc: netdev@vger.kernel.org, David Miller, kuznet, jmorris, yoshfuji,
kaber, thaller, Stephen Hemminger
In-Reply-To: <1382622355-6500-1-git-send-email-jiri@resnulli.us>
On Thu, Oct 24, 2013 at 9:45 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> This is needed in order to implement userspace address configuration,
> namely ip6-privacy (rfc4941) in NetworkManager.
>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
> net/ipv6/addrconf.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index cd3fb30..962c7c9 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -3715,7 +3715,8 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh)
> return -ENODEV;
>
> /* We ignore other flags so far. */
> - ifa_flags = ifm->ifa_flags & (IFA_F_NODAD | IFA_F_HOMEADDRESS);
> + ifa_flags = ifm->ifa_flags & (IFA_F_NODAD | IFA_F_HOMEADDRESS |
> + IFA_F_TEMPORARY);
>
> ifa = ipv6_get_ifaddr(net, pfx, dev, 1);
> if (ifa == NULL) {
> --
> 1.8.3.1
>
Jiri
So, is the idea behind this is that all of temp address management
would be done in user space? If so, then you
may have to verify that no-one sets the lifetime values on the prefix
in your other patch. I am still trying to figure out
why this would be needed.
Thanks
-vlad
^ permalink raw reply
* Re: [PATCH net] netconsole: fix NULL pointer dereference
From: Francois Romieu @ 2013-10-25 23:06 UTC (permalink / raw)
To: Nikolay Aleksandrov; +Cc: David Miller, David.Laight, vfalico, netdev
In-Reply-To: <526A7CE5.8040205@redhat.com>
Nikolay Aleksandrov <nikolay@redhat.com> :
[...]
> I thought this implies that the spinlock protects us against running with
> write_msg().
> It's fine by me either way (with or w/o the addition to the comment). It's
> up to you Dave, do you still want it explicitly there ?
There was some facetiousness in David's comment.
--
Ueimor
^ permalink raw reply
* Re: vxlan gso is broken by stackable gso_segment()
From: David Miller @ 2013-10-25 23:10 UTC (permalink / raw)
To: ast; +Cc: eric.dumazet, edumazet, stephen, netdev
In-Reply-To: <CAMEtUuxwi05oX0TqNdMbjy04MT=ZLemLXJQTTcAMrnWuhXLKdQ@mail.gmail.com>
From: Alexei Starovoitov <ast@plumgrid.com>
Date: Fri, 25 Oct 2013 15:41:47 -0700
> 'bool tunnel' actually still used to indicate encap_level > 0
Good catch, I misread the code.
^ permalink raw reply
* Re: [PATCH] ibm emac: Don't call napi_complete if napi_reschedule failed
From: David Miller @ 2013-10-25 23:13 UTC (permalink / raw)
To: alistair; +Cc: netdev, benh
In-Reply-To: <1382585780-13244-1-git-send-email-alistair@popple.id.au>
From: Alistair Popple <alistair@popple.id.au>
Date: Thu, 24 Oct 2013 14:36:20 +1100
> This patch fixes a bug which would trigger the BUG_ON() at
> net/core/dev.c:4156. It was found that this was due to napi_complete
> being called even when the corresponding call to napi_reschedule had
> failed.
>
> This patch ensures that we only contine processing rotting packets in
> the current mal_poll call if we are not already on the polling list.
> It also adds locking calls to protect the read-write-modify of the
> MAL_CFG DCR.
>
> Signed-off-by: Alistair Popple <alistair@popple.id.au>
The mistaken napi_complete() call and the locking change are two different
bugs, fix them in two different patches.
^ permalink raw reply
* Re: vxlan gso is broken by stackable gso_segment()
From: Eric Dumazet @ 2013-10-25 23:25 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <CAMEtUuxwi05oX0TqNdMbjy04MT=ZLemLXJQTTcAMrnWuhXLKdQ@mail.gmail.com>
On Fri, 2013-10-25 at 15:41 -0700, Alexei Starovoitov wrote:
> 'bool tunnel' actually still used to indicate encap_level > 0
>
Yes, I am studying if the setting of skb->encapsulation = 1 was really
needed in the :
if (tunnel) {
skb_reset_inner_headers(skb);
skb->encapsulation = 1;
}
And was planning to rename 'bool tunnel' by 'bool stacked' or
something...
> Eric's fix brings back performance for vxlan and gre keeps working. Thx!
Please note the original performance is not that good, you mentioned 230
Mbps on lxc, while I get more than 5Gb/s on a 10G link.
This should be investigated ...
>
> net/core/skbuff.c:3474 skb_try_coalesce() warning, I mentioned before,
> is unrelated.
> I still see it with this patch. Running either gre or vxlan tunnels.
I think this might be related to commit 6ff50cd55545 ("tcp: gso: do not
generate out of order packets")
I'll investigate this as well, thanks.
^ permalink raw reply
* Re: [PATCH] net: wan: sbni: remove assembly crc32 code
From: David Miller @ 2013-10-25 23:27 UTC (permalink / raw)
To: sebastian; +Cc: andi, akpm, linux-kernel, ak, netdev
In-Reply-To: <20131022183625.GA4382@breakpoint.cc>
From: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Date: Tue, 22 Oct 2013 20:36:25 +0200
> There is also a C function doing the same thing. Unless the asm code is
> 110% faster we could stick to the C function.
>
> Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Applied.
^ permalink raw reply
* Re: [PATCHv2 net] netpoll: fix rx_hook() interface by passing the skb
From: David Miller @ 2013-10-25 23:27 UTC (permalink / raw)
To: antonio; +Cc: netdev, David.Laight
In-Reply-To: <1382564190-334-1-git-send-email-antonio@meshcoding.com>
From: Antonio Quartulli <antonio@meshcoding.com>
Date: Wed, 23 Oct 2013 23:36:30 +0200
> Right now skb->data is passed to rx_hook() even if the skb
> has not been linearised and without giving rx_hook() a way
> to linearise it.
>
> Change the rx_hook() interface and make it accept the skb
> and the offset to the UDP payload as arguments. rx_hook() is
> also renamed to rx_skb_hook() to ensure that out of the tree
> users notice the API change.
>
> In this way any rx_skb_hook() implementation can perform all
> the needed operations to properly (and safely) access the
> skb data.
>
> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Applied.
^ permalink raw reply
* Re: [PATCH 1/1] ax88179_178a: Remove AX_MEDIUM_ALWAYS_ONE bit in AX_MEDIUM_STATUS_MODE register to avoid TX throttling
From: David Miller @ 2013-10-25 23:27 UTC (permalink / raw)
To: freddy; +Cc: linux-usb, linux-kernel, netdev, allan, louis
In-Reply-To: <1382597905-2393-1-git-send-email-freddy@asix.com.tw>
From: freddy@asix.com.tw
Date: Thu, 24 Oct 2013 14:58:25 +0800
> From: Freddy Xin <freddy@asix.com.tw>
>
> Remove AX_MEDIUM_ALWAYS_ONE in AX_MEDIUM_STATUS_MODE register.
> Setting this bit may cause TX throttling in Half-Duplex mode.
>
> Signed-off-by: Freddy Xin <freddy@asix.com.tw>
Applied.
^ permalink raw reply
* Re: [PATCH net v2 1/2] ipv6: reset dst.expires value when clearing expire flag
From: David Miller @ 2013-10-25 23:27 UTC (permalink / raw)
To: hannes; +Cc: netdev, sgunderson, valentyn, yoshfuji, edumazet
In-Reply-To: <20131024081427.GC15744@order.stressinduktion.org>
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Thu, 24 Oct 2013 10:14:27 +0200
> On receiving a packet too big icmp error we update the expire value by
> calling rt6_update_expires. This function uses dst_set_expires which is
> implemented that it can only reduce the expiration value of the dst entry.
>
> If we insert new routing non-expiry information into the ipv6 fib where
> we already have a matching rt6_info we only clear the RTF_EXPIRES flag
> in rt6i_flags and leave the dst.expires value as is.
>
> When new mtu information arrives for that cached dst_entry we again
> call dst_set_expires. This time it won't update the dst.expire value
> because we left the dst.expire value intact from the last update. So
> dst_set_expires won't touch dst.expires.
>
> Fix this by resetting dst.expires when clearing the RTF_EXPIRE flag.
> dst_set_expires checks for a zero expiration and updates the
> dst.expires.
>
> In the past this (not updating dst.expires) was necessary because
> dst.expire was placed in a union with the dst_entry *from reference
> and rt6_clean_expires did assign NULL to it. This split happend in
> ecd9883724b78cc72ed92c98bcb1a46c764fff21 ("ipv6: fix race condition
> regarding dst->expires and dst->from").
>
> Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
> Reported-by: Valentijn Sessink <valentyn@blub.net>
> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
> Acked-by: Eric Dumazet <edumazet@google.com>
> Tested-by: Valentijn Sessink <valentyn@blub.net>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Applied.
^ permalink raw reply
* Re: [PATCH net 2/2] ipv6: ip6_dst_check needs to check for expired dst_entries
From: David Miller @ 2013-10-25 23:27 UTC (permalink / raw)
To: hannes; +Cc: netdev, sgunderson, valentyn, yoshfuji
In-Reply-To: <20131024054824.GA15744@order.stressinduktion.org>
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Thu, 24 Oct 2013 07:48:24 +0200
> On receiving a packet too big icmp error we check if our current cached
> dst_entry in the socket is still valid. This validation check did not
> care about the expiration of the (cached) route.
>
> The error path I traced down:
> The socket receives a packet too big mtu notification. It still has a
> valid dst_entry and thus issues the ip6_rt_pmtu_update on this dst_entry,
> setting RTF_EXPIRE and updates the dst.expiration value (which could
> fail because of not up-to-date expiration values, see previous patch).
>
> In some seldom cases we race with a) the ip6_fib gc or b) another routing
> lookup which would result in a recreation of the cached rt6_info from its
> parent non-cached rt6_info. While copying the rt6_info we reinitialize the
> metrics store by copying it over from the parent thus invalidating the
> just installed pmtu update (both dsts use the same key to the inetpeer
> storage). The dst_entry with the just invalidated metrics data would
> just get its RTF_EXPIRES flag cleared and would continue to stay valid
> for the socket.
>
> We should have not issued the pmtu update on the already expired dst_entry
> in the first placed. By checking the expiration on the dst entry and
> doing a relookup in case it is out of date we close the race because
> we would install a new rt6_info into the fib before we issue the pmtu
> update, thus closing this race.
>
> Not reliably updating the dst.expire value was fixed by the patch "ipv6:
> reset dst.expires value when clearing expire flag".
>
> Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
> Reported-by: Valentijn Sessink <valentyn@blub.net>
> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
> I would propose this patch for -stable.
Applied and queued up for -stable.
^ permalink raw reply
* Re: [PATCH 1/4] sctp: merge two if statements to one
From: Vlad Yasevich @ 2013-10-25 23:40 UTC (permalink / raw)
To: Wang Weidong, davem, nhorman; +Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <1382665805-13952-2-git-send-email-wangweidong1@huawei.com>
On 10/24/2013 09:50 PM, Wang Weidong wrote:
> Two if statements do the same work, maybe we can merge them to
> one. There is just code simplification, no functional changes.
>
> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
> ---
> net/sctp/auth.c | 12 ++++--------
> 1 file changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/net/sctp/auth.c b/net/sctp/auth.c
> index 8c4fa5d..19fb0ae 100644
> --- a/net/sctp/auth.c
> +++ b/net/sctp/auth.c
> @@ -539,18 +539,14 @@ struct sctp_hmac *sctp_auth_asoc_get_hmac(const struct sctp_association *asoc)
> for (i = 0; i < n_elt; i++) {
> id = ntohs(hmacs->hmac_ids[i]);
>
> - /* Check the id is in the supported range */
> - if (id > SCTP_AUTH_HMAC_ID_MAX) {
> - id = 0;
> - continue;
> - }
> -
> - /* See is we support the id. Supported IDs have name and
> + /* Check the id is in the supported range. And
> + * see is we support the id. Supported IDs have name and
> * length fields set, so that we can allocated and use
> * them. We can safely just check for name, for without the
> * name, we can't allocate the TFM.
> */
> - if (!sctp_hmac_list[id].hmac_name) {
> + if (id > SCTP_AUTH_HMAC_ID_MAX ||
> + !sctp_hmac_list[id].hmac_name) {
Can you please make the 2 parts of the 'if' statement above line up
with each other instead of the code below. I makes it easy to see what
the whole 'if conditional' is.
Thanks
-vlad
> id = 0;
> continue;
> }
>
^ permalink raw reply
* Re: [PATCH 2/4] sctp: remove the repeat initialize with 0
From: Vlad Yasevich @ 2013-10-25 23:42 UTC (permalink / raw)
To: Wang Weidong, davem, nhorman; +Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <1382665805-13952-3-git-send-email-wangweidong1@huawei.com>
On 10/24/2013 09:50 PM, Wang Weidong wrote:
> kmem_cache_zalloc had set the allocated memory to zero. I think no need
> to initialize with 0. And move the comments to the function begin.
>
> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Yes, thank you.
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
-vlad
> ---
> net/sctp/sm_make_chunk.c | 29 ++++++++---------------------
> 1 file changed, 8 insertions(+), 21 deletions(-)
>
> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
> index d244a23..fe69032 100644
> --- a/net/sctp/sm_make_chunk.c
> +++ b/net/sctp/sm_make_chunk.c
> @@ -1297,6 +1297,13 @@ struct sctp_chunk *sctp_make_auth(const struct sctp_association *asoc)
>
> /* Turn an skb into a chunk.
> * FIXME: Eventually move the structure directly inside the skb->cb[].
> + *
> + * sctpimpguide-05.txt Section 2.8.2
> + * M1) Each time a new DATA chunk is transmitted
> + * set the 'TSN.Missing.Report' count for that TSN to 0. The
> + * 'TSN.Missing.Report' count will be used to determine missing chunks
> + * and when to fast retransmit.
> + *
> */
> struct sctp_chunk *sctp_chunkify(struct sk_buff *skb,
> const struct sctp_association *asoc,
> @@ -1314,29 +1321,9 @@ struct sctp_chunk *sctp_chunkify(struct sk_buff *skb,
> INIT_LIST_HEAD(&retval->list);
> retval->skb = skb;
> retval->asoc = (struct sctp_association *)asoc;
> - retval->has_tsn = 0;
> - retval->has_ssn = 0;
> - retval->rtt_in_progress = 0;
> - retval->sent_at = 0;
> retval->singleton = 1;
> - retval->end_of_packet = 0;
> - retval->ecn_ce_done = 0;
> - retval->pdiscard = 0;
> -
> - /* sctpimpguide-05.txt Section 2.8.2
> - * M1) Each time a new DATA chunk is transmitted
> - * set the 'TSN.Missing.Report' count for that TSN to 0. The
> - * 'TSN.Missing.Report' count will be used to determine missing chunks
> - * and when to fast retransmit.
> - */
> - retval->tsn_missing_report = 0;
> - retval->tsn_gap_acked = 0;
> - retval->fast_retransmit = SCTP_CAN_FRTX;
>
> - /* If this is a fragmented message, track all fragments
> - * of the message (for SEND_FAILED).
> - */
> - retval->msg = NULL;
> + retval->fast_retransmit = SCTP_CAN_FRTX;
>
> /* Polish the bead hole. */
> INIT_LIST_HEAD(&retval->transmitted_list);
>
^ permalink raw reply
* Re: [PATCH 3/4] sctp: fix some comments in associola.c
From: Vlad Yasevich @ 2013-10-25 23:43 UTC (permalink / raw)
To: Wang Weidong, davem, nhorman; +Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <1382665805-13952-4-git-send-email-wangweidong1@huawei.com>
On 10/24/2013 09:50 PM, Wang Weidong wrote:
> fix some spellings
>
> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
-vlad
> ---
> net/sctp/associola.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index cef5099..c9b91cb 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -602,7 +602,7 @@ void sctp_assoc_rm_peer(struct sctp_association *asoc,
>
> /* Start a T3 timer here in case it wasn't running so
> * that these migrated packets have a chance to get
> - * retrnasmitted.
> + * retransmitted.
> */
> if (!timer_pending(&active->T3_rtx_timer))
> if (!mod_timer(&active->T3_rtx_timer,
> @@ -665,7 +665,7 @@ struct sctp_transport *sctp_assoc_add_peer(struct sctp_association *asoc,
> /* Set the path max_retrans. */
> peer->pathmaxrxt = asoc->pathmaxrxt;
>
> - /* And the partial failure retrnas threshold */
> + /* And the partial failure retrans threshold */
> peer->pf_retrans = asoc->pf_retrans;
>
> /* Initialize the peer's SACK delay timeout based on the
>
^ permalink raw reply
* Re: [PATCH 4/4] sctp: fix comment in chunk.c
From: Vlad Yasevich @ 2013-10-25 23:45 UTC (permalink / raw)
To: Wang Weidong, davem, nhorman; +Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <1382665805-13952-5-git-send-email-wangweidong1@huawei.com>
On 10/24/2013 09:50 PM, Wang Weidong wrote:
> fix a spelling
>
> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
You could have combined these spelling fixes into 1 patch.
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
-vlad
> ---
> net/sctp/chunk.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
> index 7bd5ed4..f2044fc 100644
> --- a/net/sctp/chunk.c
> +++ b/net/sctp/chunk.c
> @@ -201,7 +201,7 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
>
> max = asoc->frag_point;
> /* If the the peer requested that we authenticate DATA chunks
> - * we need to accound for bundling of the AUTH chunks along with
> + * we need to account for bundling of the AUTH chunks along with
> * DATA.
> */
> if (sctp_auth_send_cid(SCTP_CID_DATA, asoc)) {
>
^ permalink raw reply
* [PATCH net-next] tcp: gso: fix truesize tracking
From: Eric Dumazet @ 2013-10-26 0:26 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <1382743502.4032.6.camel@edumazet-glaptop.roam.corp.google.com>
From: Eric Dumazet <edumazet@google.com>
commit 6ff50cd55545 ("tcp: gso: do not generate out of order packets")
had an heuristic that can trigger a warning in skb_try_coalesce(),
because skb->truesize of the gso segments were exactly set to mss.
This breaks the requirement that
skb->truesize >= skb->len + truesizeof(struct sk_buff);
It can trivially be reproduced by :
ifconfig lo mtu 1500
ethtool -K lo tso off
netperf
As the skbs are looped into the TCP networking stack, skb_try_coalesce()
warns us of these skb under-estimating their truesize.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexei Starovoitov <ast@plumgrid.com>
---
Based on net-next but applies as well on net tree with some fuzz.
David, we might backport this one to 3.12 and stable later, I let you
decide.
net/ipv4/tcp_offload.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index a7a5583e..a2b68a1 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -18,6 +18,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
netdev_features_t features)
{
struct sk_buff *segs = ERR_PTR(-EINVAL);
+ unsigned int sum_truesize = 0;
struct tcphdr *th;
unsigned int thlen;
unsigned int seq;
@@ -104,13 +105,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
if (copy_destructor) {
skb->destructor = gso_skb->destructor;
skb->sk = gso_skb->sk;
- /* {tcp|sock}_wfree() use exact truesize accounting :
- * sum(skb->truesize) MUST be exactly be gso_skb->truesize
- * So we account mss bytes of 'true size' for each segment.
- * The last segment will contain the remaining.
- */
- skb->truesize = mss;
- gso_skb->truesize -= mss;
+ sum_truesize += skb->truesize;
}
skb = skb->next;
th = tcp_hdr(skb);
@@ -127,7 +122,9 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
if (copy_destructor) {
swap(gso_skb->sk, skb->sk);
swap(gso_skb->destructor, skb->destructor);
- swap(gso_skb->truesize, skb->truesize);
+ sum_truesize += skb->truesize;
+ atomic_add(sum_truesize - gso_skb->truesize,
+ &skb->sk->sk_wmem_alloc);
}
delta = htonl(oldlen + (skb_tail_pointer(skb) -
^ permalink raw reply related
* Re: vxlan gso is broken by stackable gso_segment()
From: Eric Dumazet @ 2013-10-26 0:52 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <1382743502.4032.6.camel@edumazet-glaptop.roam.corp.google.com>
On Fri, 2013-10-25 at 16:25 -0700, Eric Dumazet wrote:
> Please note the original performance is not that good, you mentioned 230
> Mbps on lxc, while I get more than 5Gb/s on a 10G link.
>
> This should be investigated ...
This is probably trivial to increase performance :
veth currently do not support any kind of tunneling TSO :
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
I'll submit a patch for net-next
^ permalink raw reply
* [PATCH net-next] veth: extend features to support tunneling
From: Eric Dumazet @ 2013-10-26 1:25 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <1382748742.4032.24.camel@edumazet-glaptop.roam.corp.google.com>
From: Eric Dumazet <edumazet@google.com>
While investigating on a recent vxlan regression, I found veth
was using a zero features set for vxlan tunnels.
We have to segment GSO frames, copy the payload, and do the checksum.
This patch brings a ~200% performance increase
We probably have to add hw_enc_features support
on other virtual devices.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
---
drivers/net/veth.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index b2d0347..b24db7a 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -261,6 +261,8 @@ static const struct net_device_ops veth_netdev_ops = {
#define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
NETIF_F_HW_CSUM | NETIF_F_RXCSUM | NETIF_F_HIGHDMA | \
+ NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL | \
+ NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT | NETIF_F_UFO | \
NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX | \
NETIF_F_HW_VLAN_STAG_TX | NETIF_F_HW_VLAN_STAG_RX )
@@ -279,6 +281,7 @@ static void veth_setup(struct net_device *dev)
dev->destructor = veth_dev_free;
dev->hw_features = VETH_FEATURES;
+ dev->hw_enc_features = VETH_FEATURES;
}
/*
^ permalink raw reply related
* Re: [PATCH net-next] tcp: gso: fix truesize tracking
From: Alexei Starovoitov @ 2013-10-26 1:46 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <1382747177.4032.21.camel@edumazet-glaptop.roam.corp.google.com>
On Fri, Oct 25, 2013 at 5:26 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> commit 6ff50cd55545 ("tcp: gso: do not generate out of order packets")
> had an heuristic that can trigger a warning in skb_try_coalesce(),
> because skb->truesize of the gso segments were exactly set to mss.
>
> This breaks the requirement that
>
> skb->truesize >= skb->len + truesizeof(struct sk_buff);
>
> It can trivially be reproduced by :
>
> ifconfig lo mtu 1500
> ethtool -K lo tso off
> netperf
>
> As the skbs are looped into the TCP networking stack, skb_try_coalesce()
> warns us of these skb under-estimating their truesize.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Alexei Starovoitov <ast@plumgrid.com>
> ---
it fixed it in my setup. Thanks!
^ permalink raw reply
* Re: [PATCH net-next] veth: extend features to support tunneling
From: Alexei Starovoitov @ 2013-10-26 2:22 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, Eric Dumazet, stephen, netdev
In-Reply-To: <1382750703.4032.32.camel@edumazet-glaptop.roam.corp.google.com>
On Fri, Oct 25, 2013 at 6:25 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> While investigating on a recent vxlan regression, I found veth
> was using a zero features set for vxlan tunnels.
oneliner can be better :)
> We have to segment GSO frames, copy the payload, and do the checksum.
>
> This patch brings a ~200% performance increase
>
> We probably have to add hw_enc_features support
> on other virtual devices.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> ---
iperf over veth with gre/vxlan tunneling is now ~4Gbps. 20x gain as advertised.
Thanks!
^ permalink raw reply
* Re: [PATCH 1/4] sctp: merge two if statements to one
From: wangweidong @ 2013-10-26 2:55 UTC (permalink / raw)
To: Sergei Shtylyov, davem, nhorman, vyasevich
Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <526A6C45.3080200@cogentembedded.com>
On 2013/10/25 21:04, Sergei Shtylyov wrote:
> Hello.
>
> On 25-10-2013 5:50, Wang Weidong wrote:
>
>> Two if statements do the same work, maybe we can merge them to
>> one. There is just code simplification, no functional changes.
>
>> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
>> ---
>> net/sctp/auth.c | 12 ++++--------
>> 1 file changed, 4 insertions(+), 8 deletions(-)
>
> I understand what I noticed below is not your typos but maybe it's time to fix them?
Yeah, I will fix them.
Thanks.
>
>> diff --git a/net/sctp/auth.c b/net/sctp/auth.c
>> index 8c4fa5d..19fb0ae 100644
>> --- a/net/sctp/auth.c
>> +++ b/net/sctp/auth.c
>> @@ -539,18 +539,14 @@ struct sctp_hmac *sctp_auth_asoc_get_hmac(const struct sctp_association *asoc)
>> for (i = 0; i < n_elt; i++) {
>> id = ntohs(hmacs->hmac_ids[i]);
>>
>> - /* Check the id is in the supported range */
>> - if (id > SCTP_AUTH_HMAC_ID_MAX) {
>> - id = 0;
>> - continue;
>> - }
>> -
>> - /* See is we support the id. Supported IDs have name and
>> + /* Check the id is in the supported range. And
>> + * see is we support the id. Supported IDs have name and
>
> s/is/if/.
>
>> * length fields set, so that we can allocated and use
>
> s/allocated/allocate/.
>
> WBR, Sergei
>
>
>
^ permalink raw reply
* Re: [PATCH 1/4] sctp: merge two if statements to one
From: wangweidong @ 2013-10-26 2:59 UTC (permalink / raw)
To: Vlad Yasevich, davem, nhorman; +Cc: dingtianhong, linux-sctp, netdev
In-Reply-To: <526B016D.6030301@gmail.com>
On 2013/10/26 7:40, Vlad Yasevich wrote:
> On 10/24/2013 09:50 PM, Wang Weidong wrote:
>> Two if statements do the same work, maybe we can merge them to
>> one. There is just code simplification, no functional changes.
>>
>> Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
>> ---
>> net/sctp/auth.c | 12 ++++--------
>> 1 file changed, 4 insertions(+), 8 deletions(-)
>>
>> diff --git a/net/sctp/auth.c b/net/sctp/auth.c
>> index 8c4fa5d..19fb0ae 100644
>> --- a/net/sctp/auth.c
>> +++ b/net/sctp/auth.c
>> @@ -539,18 +539,14 @@ struct sctp_hmac *sctp_auth_asoc_get_hmac(const struct sctp_association *asoc)
>> for (i = 0; i < n_elt; i++) {
>> id = ntohs(hmacs->hmac_ids[i]);
>>
>> - /* Check the id is in the supported range */
>> - if (id > SCTP_AUTH_HMAC_ID_MAX) {
>> - id = 0;
>> - continue;
>> - }
>> -
>> - /* See is we support the id. Supported IDs have name and
>> + /* Check the id is in the supported range. And
>> + * see is we support the id. Supported IDs have name and
>> * length fields set, so that we can allocated and use
>> * them. We can safely just check for name, for without the
>> * name, we can't allocate the TFM.
>> */
>> - if (!sctp_hmac_list[id].hmac_name) {
>> + if (id > SCTP_AUTH_HMAC_ID_MAX ||
>> + !sctp_hmac_list[id].hmac_name) {
>
> Can you please make the 2 parts of the 'if' statement above line up
> with each other instead of the code below. I makes it easy to see what
> the whole 'if conditional' is.
>
> Thanks
> -vlad
>
Ok, I will resend a new version.
Thanks.
>> id = 0;
>> continue;
>> }
>>
>
>
>
^ permalink raw reply
* Re: [PATCH net 0/2] qlcnic: Bug fixes
From: David Miller @ 2013-10-26 4:05 UTC (permalink / raw)
To: shahed.shaikh; +Cc: netdev, Dept_NX_Linux_NIC_Driver
In-Reply-To: <1382711917-28501-1-git-send-email-shahed.shaikh@qlogic.com>
From: Shahed Shaikh <shahed.shaikh@qlogic.com>
Date: Fri, 25 Oct 2013 10:38:35 -0400
> From: Shahed Shaikh <shahed.shaikh@qlogic.com>
>
> This patch series contains following fixes-
> * Performace drop because driver was forcing adapter not to check
> destination IP for LRO.
> * driver was not issuing qlcnic_fw_cmd_set_drv_version() to 83xx adapter
> becasue of improper handling of QLCNIC_FW_CAPABILITY_MORE_CAPS bit.
>
> Please apply to net.
Applied, what exactly does that destination IP check do?
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox