netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Alexander Aring <aring@mojatatu.com>, Tariq Toukan <tariqt@mellanox.com>
Cc: David Miller <davem@davemloft.net>,
	edumazet@google.com, netdev@vger.kernel.org, fw@strlen.de,
	herbert@gondor.apana.org.au, tgraf@suug.ch, brouer@redhat.com,
	alex.aring@gmail.com, stefan@osg.samsung.com,
	ktkhai@virtuozzo.com, eric.dumazet@gmail.com,
	Moshe Shemesh <moshe@mellanox.com>,
	Eran Ben Elisha <eranbe@mellanox.com>
Subject: Re: [PATCH v4 net-next 00/19] inet: frags: bring rhashtables to IP defrag
Date: Mon, 28 May 2018 09:09:17 -0700	[thread overview]
Message-ID: <13bf3889-4426-b17a-d8d7-e843038a2a82@gmail.com> (raw)
In-Reply-To: <20180528145224.3ih6urfixwv4fwkf@x220t>



On 05/28/2018 07:52 AM, Alexander Aring wrote:

> as somebody who had similar issues with this patch series I can tell you
> about what happened for the 6LoWPAN fragmentation.
> 
> The issue sounds similar, but there is too much missing information here
> to say something about if you have exactly the issue which we had.
> 
> Our problem:
> 
> The patch series uses memcmp() to compare hash keys, we had some padding
> bytes in our hash key and it occurs that we had sometimes random bytes
> in this structure when it's put on stack. We solved it by a struct
> foo_key bar = {}, which in case of gcc it _seems_ it makes a whole
> memset(bar, 0, ..) on the structure.
> 
> I asked on the netdev mailinglist how to deal with this problem in
> general, because = {} works in case of gcc, others compilers may have a
> different handling or even gcc will changes this behaviour in future.
> I got no reply so I did what it works for me. :-)
> 
> At least maybe a memcmp() on structures should never be used, it should
> be compared by field. I would recommend this way when the compiler is
> always clever enough to optimize it in some cases, but I am not so a
> compiler expert to say anything about that.
> 
> I checked the hash key structures for x86_64 and pahole, so far I didn't
> find any padding bytes there, but it might be different on
> architectures or ?compiler?.
> 
> Additional useful information to check if you running into the same problem
> would be:
> 
>  - Which architecture do you use?
> 
>  - Do you have similar problems with a veth setup?
> 
> You could also try this:
> 
> diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
> index b939b94e7e91..40ece9ab8b12 100644
> --- a/net/ipv6/reassembly.c
> +++ b/net/ipv6/reassembly.c
> @@ -142,19 +142,19 @@ static void ip6_frag_expire(struct timer_list *t)
>  static struct frag_queue *
>  fq_find(struct net *net, __be32 id, const struct ipv6hdr *hdr, int iif)
>  {
> -       struct frag_v6_compare_key key = {
> -               .id = id,
> -               .saddr = hdr->saddr,
> -               .daddr = hdr->daddr,
> -               .user = IP6_DEFRAG_LOCAL_DELIVER,
> -               .iif = iif,
> -       };
> +       struct frag_v6_compare_key key = {};
>         struct inet_frag_queue *q;
>  
>         if (!(ipv6_addr_type(&hdr->daddr) & (IPV6_ADDR_MULTICAST |
>                                             IPV6_ADDR_LINKLOCAL)))
>                 key.iif = 0;
>  
> +       key.id = id;
> +       key.saddr = hdr->saddr;
> +       key.daddr = hdr->daddr;
> +       key.user = IP6_DEFRAG_LOCAL_DELIVER;
> +       key.iif = iif;
> +
>         q = inet_frag_find(&net->ipv6.frags, &key);
>         if (!q)
>                 return NULL;
> 
> - Alex
> 

Hi Alex.

This patch makes no sense, since struct frag_v6_compare_key has no hole.

Only 6LoWPAN had a problem really, because of its way of having unions (and holes).

Also note that your patch would break the case when we force key.iif to be zero.


Tariq, here are my test results : No drops for me.

# ./netperf -H 2607:f8b0:8099:e18:: -t UDP_STREAM
MIGRATED UDP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2607:f8b0:8099:e18:: () port 0 AF_INET6
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   65507   10.00      202117      0    10592.00
212992           10.00           0              0.00

Somehow, you might send packets too fast and receiver has a problem with that ?
For particular needs, you might need to adjust :

/proc/sys/net/ipv6/ip6frag_time  (to 2 seconds instead of the default of 60)
/proc/sys/net/ipv6/ip6frag_low_thresh
/proc/sys/net/ipv6/ip6frag_high_thresh

Once your receiver has filled its capacity with frags, the default of 60 seconds to garbage collect
might be the reason you notice a problem.

Check :
grep FRAG6 /proc/net/sockstat6

On Google servers we multiply by 25 the limits for ipv6 frags memory usage :

/proc/sys/net/ipv6/ip6frag_high_thresh:104857600  (instead of 4MB)
/proc/sys/net/ipv6/ip6frag_low_thresh:78643200  (instead of 3 MB)

When using 64KB datagrams, note that the truesize of the datagram would be about 44 * 2 = 88 KB,
so after ~40 lost packets in the network, you no longer can accept ipv6 fragments, until garbage
collector evicted old datagrams.

  reply	other threads:[~2018-05-28 16:09 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-31 19:58 [PATCH v4 net-next 00/19] inet: frags: bring rhashtables to IP defrag Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 01/19] ipv6: frag: remove unused field Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 02/19] inet: frags: change inet_frags_init_net() return value Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 03/19] inet: frags: add a pointer to struct netns_frags Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 04/19] inet: frags: refactor ipv6_frag_init() Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 05/19] inet: frags: refactor lowpan_net_frag_init() Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 06/19] inet: frags: refactor ipfrag_init() Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 07/19] rhashtable: add schedule points Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 08/19] inet: frags: use rhashtables for reassembly units Eric Dumazet
2018-04-16 12:54   ` Stefan Schmidt
2018-03-31 19:58 ` [PATCH v4 net-next 09/19] inet: frags: remove some helpers Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 10/19] inet: frags: get rif of inet_frag_evicting() Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 11/19] inet: frags: remove inet_frag_maybe_warn_overflow() Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 12/19] inet: frags: break the 2GB limit for frags storage Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 13/19] inet: frags: do not clone skb in ip_expire() Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 14/19] ipv6: frags: rewrite ip6_expire_frag_queue() Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 15/19] rhashtable: reorganize struct rhashtable layout Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 16/19] inet: frags: reorganize struct netns_frags Eric Dumazet
2018-03-31 19:58 ` [PATCH v4 net-next 17/19] inet: frags: get rid of ipfrag_skb_cb/FRAG_CB Eric Dumazet
2018-04-01  3:02   ` kbuild test robot
2018-03-31 19:58 ` [PATCH v4 net-next 18/19] ipv6: frags: get rid of ip6frag_skb_cb/FRAG6_CB Eric Dumazet
2018-03-31 19:59 ` [PATCH v4 net-next 19/19] inet: frags: get rid of nf_ct_frag6_skb_cb/NFCT_FRAG6_CB Eric Dumazet
2018-04-01  3:25 ` [PATCH v4 net-next 00/19] inet: frags: bring rhashtables to IP defrag David Miller
2018-05-28  9:12   ` Tariq Toukan
2018-05-28 14:52     ` Alexander Aring
2018-05-28 16:09       ` Eric Dumazet [this message]
2018-05-30  7:20         ` Tariq Toukan
2018-05-30  7:36           ` Eric Dumazet
2018-05-30 14:42             ` Tariq Toukan
2018-05-31 12:18           ` Moshe Shemesh
2018-05-31 14:05             ` Eric Dumazet
2018-05-30  9:20         ` Jesper Dangaard Brouer
2018-05-30 10:36           ` Eric Dumazet
2018-05-30 10:56             ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=13bf3889-4426-b17a-d8d7-e843038a2a82@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=alex.aring@gmail.com \
    --cc=aring@mojatatu.com \
    --cc=brouer@redhat.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eranbe@mellanox.com \
    --cc=fw@strlen.de \
    --cc=herbert@gondor.apana.org.au \
    --cc=ktkhai@virtuozzo.com \
    --cc=moshe@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=stefan@osg.samsung.com \
    --cc=tariqt@mellanox.com \
    --cc=tgraf@suug.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).