Subject: Re: NAT performance regression caused by vlan GRO support
From: Rafał Miłecki
To: Toshiaki Makita, Felix Fietkau
Cc: Toshiaki Makita, netdev@vger.kernel.org, "David S. Miller",
 Stefano Brivio, Sabrina Dubroca, David Ahern, Jo-Philipp Wich,
 Koen Vandeputte
Date: Fri, 5 Apr 2019 10:24:29 +0200
In-Reply-To: <31acd23f-6973-1912-7fcc-575a5d4e00e7@gmail.com>
References: <73223229-6bc0-2647-6952-975961811866@gmail.com>
 <75961408-fd62-0f12-bd4b-79008b27576c@gmail.com>
 <53588a9f-8cc8-0ee5-0947-8ab2b2e56f15@gmail.com>
 <45b6fe37-ba1a-91c2-1d4a-2d045793babd@nbd.name>
 <67d634cd-cf16-df21-7b8a-5d865d95e4e6@lab.ntt.co.jp>
 <31acd23f-6973-1912-7fcc-575a5d4e00e7@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

On 05.04.2019 10:12, Rafał Miłecki wrote:
> On 05.04.2019 09:58, Toshiaki Makita wrote:
>> On 2019/04/05 16:14, Felix Fietkau wrote:
>>> On 2019-04-05 09:11, Rafał Miłecki wrote:
>>>> I guess its GRO + csum_partial() to be blamed for this performance drop.
>>>>
>>>> Maybe csum_partial() is very fast on your powerful machine and few extra calls
>>>> don't make a difference? I can imagine it affecting much slower home router with
>>>> ARM cores.
>>> Most high performance Ethernet devices implement hardware checksum
>>> offload, which completely gets rid of this overhead.
>>> Unfortunately, the BCM53xx/47xx Ethernet MAC doesn't have this, which is
>>> why you're getting such crappy performance.
>>
>> Hmm... now I disabled rx checksum and tried the test again, and indeed I
>> see csum_partial from GRO path. But I also see csum_partial even without
>> GRO from nf_conntrack_in -> tcp_packet -> __skb_checksum_complete.
>> Probably Rafał disabled nf_conntrack_checksum sysctl knob?
>>
>> But anyway even with disabling rx csum offload my machine has better
>> performance with GRO.
>> I'm sure in some cases GRO should be disabled, but
>> I guess it's difficult to determine whether we should disable GRO or not
>> automatically when csum offload is not available.
>
> Few testing results:
>
> 1) ethtool -K eth0 gro off; echo 0 > /proc/sys/net/netfilter/nf_conntrack_checksum
> [  6]  0.0-60.0 sec  6.57 GBytes   940 Mbits/sec
>
> 2) ethtool -K eth0 gro off; echo 1 > /proc/sys/net/netfilter/nf_conntrack_checksum
> [  6]  0.0-60.0 sec  4.65 GBytes   666 Mbits/sec

For this case (GRO off and nf_conntrack_checksum enabled) I can confirm I see
csum_partial() in the perf output. It's taking 13,14% instead of 25,46% (as when
using GRO) though.

Samples: 38K of event 'cycles', Event count (approx.): 12209908413
  Overhead  Command      Shared Object      Symbol
+   13,14%  ksoftirqd/1  [kernel.kallsyms]  [k] csum_partial
+   10,16%  swapper      [kernel.kallsyms]  [k] v7_dma_inv_range
+    6,36%  swapper      [kernel.kallsyms]  [k] l2c210_inv_range
+    4,89%  swapper      [kernel.kallsyms]  [k] __irqentry_text_end
+    4,12%  ksoftirqd/1  [kernel.kallsyms]  [k] v7_dma_clean_range
+    3,78%  swapper      [kernel.kallsyms]  [k] bcma_host_soc_read32
+    2,76%  swapper      [kernel.kallsyms]  [k] arch_cpu_idle
+    2,45%  ksoftirqd/1  [kernel.kallsyms]  [k] __netif_receive_skb_core
+    2,37%  ksoftirqd/1  [kernel.kallsyms]  [k] l2c210_clean_range
+    1,76%  ksoftirqd/1  [kernel.kallsyms]  [k] bgmac_start_xmit
+    1,66%  swapper      [kernel.kallsyms]  [k] bgmac_poll
+    1,55%  ksoftirqd/1  [kernel.kallsyms]  [k] __dev_queue_xmit
+    1,11%  ksoftirqd/1  [kernel.kallsyms]  [k] skb_vlan_untag

> 3) ethtool -K eth0 gro on; echo 0 > /proc/sys/net/netfilter/nf_conntrack_checksum
> [  6]  0.0-60.0 sec  4.02 GBytes   575 Mbits/sec
>
> 4) ethtool -K eth0 gro on; echo 1 > /proc/sys/net/netfilter/nf_conntrack_checksum
> [  6]  0.0-60.0 sec  4.04 GBytes   579 Mbits/sec
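
To put the four results side by side, here is a minimal Python sketch that computes each configuration's throughput drop relative to the fastest run. The throughput numbers are the ones reported in the iperf output above; the names RESULTS and relative_drop are made up for illustration and are not part of any tool used in this thread.

```python
# Throughput figures copied from the four iperf runs quoted above (Mbits/sec).
# The dictionary keys and the helper name are illustrative, not from the thread.
RESULTS = {
    ("gro off", "nf_conntrack_checksum=0"): 940,
    ("gro off", "nf_conntrack_checksum=1"): 666,
    ("gro on", "nf_conntrack_checksum=0"): 575,
    ("gro on", "nf_conntrack_checksum=1"): 579,
}


def relative_drop(results):
    """Return each configuration's throughput drop (percent) versus the fastest one."""
    best = max(results.values())
    return {
        cfg: round(100.0 * (best - mbits) / best, 1)
        for cfg, mbits in results.items()
    }


if __name__ == "__main__":
    for (gro, csum), drop in relative_drop(RESULTS).items():
        mbits = RESULTS[(gro, csum)]
        print(f"{gro:8s} {csum:26s} {mbits:4d} Mbits/sec  drop {drop:.1f}%")
```

With these inputs, the conntrack checksum alone (GRO off) costs roughly 29% of throughput, while enabling GRO costs roughly 38% whether or not the conntrack checksum is enabled. This is only arithmetic on the reported numbers, not a new measurement.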