Message-ID: <04661add-f2dc-401d-84b1-de9f8af89e55@redhat.com>
Date: Fri, 6 Dec 2024 09:23:49 +0100
From: Maxime Coquelin
Subject: Re: GCP cloud : Virtio-PMD performance Issue
To: Mukul Sinha, dev@dpdk.org
Cc: chenbox@nvidia.com, jeroendb@google.com, rushilg@google.com, joshwash@google.com, Srinivasa Srikanth Podila, Tathagat Priyadarshi, Samar Yadav, Varun LA
List-Id: DPDK patches and discussions

Hi Mukul,

On 12/5/24 23:54, Mukul Sinha wrote:
> Thanks @maxime.coquelin@redhat.com
> Have included dev@dpdk.org
>
>
> On Fri, Dec 6, 2024 at 2:11 AM Maxime Coquelin wrote:
>
> Hi Mukul,
>
> DPDK upstream mailing lists should be added to this e-mail.
> I am not allowed to provide off-list support; all discussions should
> happen upstream.
>
> If this is reproduced with downstream DPDK provided with RHEL and you
> have a RHEL subscription, please use the Red Hat issue tracker.
>
> Thanks for your understanding,
> Maxime
>
> On 12/5/24 21:36, Mukul Sinha wrote:
> > + Varun
> >
> > On Fri, Dec 6, 2024 at 2:04 AM Mukul Sinha wrote:
> >
> >     Hi GCP & Virtio-PMD dev teams,
> >     We are from the VMware NSX Advanced Load Balancer team. In GCP-cloud
> >     (*custom-8-8192 VM instance type, 8 cores, 8 GB*) we are triaging a
> >     TCP-profile application throughput issue: with a single dispatcher
> >     core and a single Rx/Tx queue (queue depth: 2048), the throughput we
> >     get using the dpdk-22.11 virtio-PMD code is significantly degraded
> >     compared to the dpdk-20.05 PMD.
> >     We see the Tx packet drop counter on the virtio-NIC incrementing
> >     heavily, pointing to the GCP hypervisor side being unable to drain
> >     the packets fast enough (no drops are seen on the Rx side).
> >     The behavior is as follows:
> >     _Using dpdk-22.11_
> >     At just 75% CPU usage we start seeing a huge number of Tx packet
> >     drops reported (no Rx drops), causing TCP retransmissions that
> >     eventually bring down the effective throughput numbers.
> >     _Using dpdk-20.05_
> >     Even at ~95% CPU usage, without any packet drops (neither Rx nor Tx),
> >     we are able to get much better throughput.
> >
> >     To improve performance numbers with dpdk-22.11 we have tried
> >     increasing the queue depth to 4096, but that didn't help.
> >     If with dpdk-22.11 we move from single core, Rx/Tx queue=1 to single
> >     core, Rx/Tx queue=2, we get slightly better numbers (but they still
> >     don't match the numbers obtained using dpdk-20.05 with single core,
> >     Rx/Tx queue=1). This again corroborates that the GCP hypervisor is
> >     the bottleneck here.
> >
> >     To root-cause this issue we were able to replicate this behavior
> >     using native DPDK testpmd as shown below (commands used):
> >     Hugepage size: 2 MB
> >       ./app/dpdk-testpmd -l 0-1 -n 1 -- -i --nb-cores=1 --txd=2048 --rxd=2048 --rxq=1 --txq=1 --portmask=0x3
> >     set fwd mac
> >     set fwd flowgen
> >     set txpkts 1518
> >     start
> >     stop
> >
> >     Testpmd traffic run (packet-size=1518) for the exact same
> >     time interval of 15 seconds:
> >
> >     _22.11_
> >       ---------------------- Forward statistics for port 0 ----------------------
> >       RX-packets: 2              RX-dropped: 0             RX-total: 2
> >       TX-packets: 19497570       *TX-dropped: 364674686*   TX-total: 384172256
> >       ----------------------------------------------------------------------------
> >     _20.05_
> >       ---------------------- Forward statistics for port 0 ----------------------
> >       RX-packets: 3              RX-dropped: 0             RX-total: 3
> >       TX-packets: 19480319       TX-dropped: 0             TX-total: 19480319
> >       ----------------------------------------------------------------------------
> >
> >     As you can see:
> >     dpdk-22.11
> >     Packets generated: 384 million, packets serviced: ~19.5 million,
> >     Tx-dropped: 364 million
> >     dpdk-20.05
> >     Packets generated: ~19.5 million, packets serviced: ~19.5 million,
> >     Tx-dropped: 0
> >
> >     Actual serviced traffic remains almost the same between the two
> >     versions (implying the underlying GCP hypervisor is only capable of
> >     handling that much), but in dpdk-22.11 the PMD is pushing almost 20x
> >     the traffic compared to dpdk-20.05.
> >     The same pattern can be seen even if we run traffic for a longer
> >     duration.
> >     ===============================================================================================
> >
> >     Following are our queries:
> >     @ Virtio-dev team
> >     1. Why, in dpdk-22.11 using the virtio PMD, is the testpmd application
> >     able to pump 20 times the Tx traffic towards the hypervisor compared
> >     to dpdk-20.05?
> >     What has changed, either in the virtio-PMD or in the communication
> >     between the virtio-PMD and the underlying hypervisor, to cause this
> >     behavior?
> >     As you can see, the traffic actually serviced by the hypervisor
> >     remains almost on par with dpdk-20.05, but it's the humongous packet
> >     drop count that can be detrimental overall for any DPDK application
> >     running a TCP traffic profile.
> >     Is there a way to slow down the number of packets sent towards the
> >     hypervisor (through either a code change in the virtio-PMD or a
> >     config setting) and bring it on par with dpdk-20.05 performance?
> >     2. In the published Virtio performance report for Release 22.11 we
> >     see no qualification of throughput numbers done on GCP-cloud. Are
> >     there any internal performance benchmark numbers you have for
> >     GCP-cloud, and if yes, can you please share them with us so that we
> >     can check whether there are any configs/knobs/settings you used to
> >     get optimum performance?

I don't know what your issue is, but this is not something we noticed
using QEMU/KVM as hypervisor with Vhost-user backend.
I would suggest you run a git bisect to pinpoint the specific commit
introducing this regression.

Also, you could run perf top in the guest on both 20.05 and 22.11; maybe
we could spot something in it.

Regards,
Maxime

> >
> >     @ GCP-cloud dev team
> >     As we can see, any traffic beyond what the GCP hypervisor can
> >     successfully service is getting dropped, hence we need help from
> >     your side to reproduce this issue in your in-house setup, preferably
> >     using the same VM instance type as highlighted before.
> >     We need further investigation from the GCP host side to check on
> >     parameters such as running out of Tx buffers, queue-full conditions
> >     for the virtio-NIC, or the number of NIC Rx/Tx kernel threads, to
> >     determine what is causing the hypervisor to not keep up with the
> >     traffic load pumped in dpdk-22.11.
> >     Based on your debugging we would additionally need inputs as to what
> >     can be tweaked, or which knobs/settings can be configured at the
> >     GCP-VM level, to get better performance numbers.
> >
> >     Please feel free to reach out to us for any further queries.
> >
> >     _Additional outputs for debugging:_
> >     lspci | grep Eth
> >     00:06.0 Ethernet controller: Red Hat, Inc. Virtio network device
> >     root@dcg15-se-ecmyw:/home/admin/dpdk/build# ethtool -i eth0
> >     driver: virtio_net
> >     version: 1.0.0
> >     firmware-version:
> >     expansion-rom-version:
> >     bus-info: 0000:00:06.0
> >     supports-statistics: yes
> >     supports-test: no
> >     supports-eeprom-access: no
> >     supports-register-dump: no
> >     supports-priv-flags: no
> >
> >     testpmd> show port info all
> >     ********************* Infos for port 0  *********************
> >     MAC address: 42:01:0A:98:A0:0F
> >     Device name: 0000:00:06.0
> >     Driver name: net_virtio
> >     Firmware-version: not available
> >     Connect to socket: 0
> >     memory allocation on the socket: 0
> >     Link status: up
> >     Link speed: Unknown
> >     Link duplex: full-duplex
> >     Autoneg status: On
> >     MTU: 1500
> >     Promiscuous mode: disabled
> >     Allmulticast mode: disabled
> >     Maximum number of MAC addresses: 64
> >     Maximum number of MAC addresses of hash filtering: 0
> >     VLAN offload:
> >        strip off, filter off, extend off, qinq strip off
> >     No RSS offload flow type is supported.
> >     Minimum size of RX buffer: 64
> >     Maximum configurable length of RX packet: 9728
> >     Maximum configurable size of LRO aggregated packet: 0
> >     Current number of RX queues: 1
> >     Max possible RX queues: 2
> >     Max possible number of RXDs per queue: 32768
> >     Min possible number of RXDs per queue: 32
> >     RXDs number alignment: 1
> >     Current number of TX queues: 1
> >     Max possible TX queues: 2
> >     Max possible number of TXDs per queue: 32768
> >     Min possible number of TXDs per queue: 32
> >     TXDs number alignment: 1
> >     Max segment number per packet: 65535
> >     Max segment number per MTU/TSO: 65535
> >     Device capabilities: 0x0( )
> >     Device error handling mode: none
> >
> >
> >
> > This electronic communication and the information and any files
> > transmitted with it, or attached to it, are confidential and are
> > intended solely for the use of the individual or entity to whom it is
> > addressed and may contain information that is confidential, legally
> > privileged, protected by privacy laws, or otherwise restricted from
> > disclosure to anyone else. If you are not the intended recipient or the
> > person responsible for delivering the e-mail to the intended recipient,
> > you are hereby notified that any use, copying, distributing,
> > dissemination, forwarding, printing, or copying of this e-mail is
> > strictly prohibited. If you received this e-mail in error, please return
> > the e-mail to the sender, delete it from your computer, and destroy any
> > printed copy of it.
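
As a cross-check of the testpmd counters quoted in this thread, the Tx drop fraction and the offered-load ratio between the two releases can be recomputed with a short script. This is only a sketch: the helper names (`parse_forward_stats`, `tx_drop_fraction`, `offered_load_ratio`) are made up for illustration and are not part of DPDK, and the sample text is copied from the forward statistics quoted above.

```python
import re

# Counters copied from the dpdk-22.11 testpmd run quoted in this thread.
SAMPLE_22_11 = """
RX-packets: 2              RX-dropped: 0             RX-total: 2
TX-packets: 19497570       TX-dropped: 364674686     TX-total: 384172256
"""

# TX-total reported for the dpdk-20.05 run over the same 15 second interval.
TX_TOTAL_20_05 = 19480319

# Matches the "NAME: value" pairs in a testpmd forward statistics block.
COUNTER_RE = re.compile(r"([RT]X-(?:packets|dropped|total)):\s*(\d+)")


def parse_forward_stats(text):
    """Return a dict mapping counter name to its integer value."""
    return {name: int(value) for name, value in COUNTER_RE.findall(text)}


def tx_drop_fraction(stats):
    """Fraction of generated Tx packets that were dropped."""
    total = stats["TX-total"]
    return stats["TX-dropped"] / total if total else 0.0


def offered_load_ratio(stats, baseline_tx_total):
    """How many times more packets were generated than in the baseline run."""
    return stats["TX-total"] / baseline_tx_total


stats = parse_forward_stats(SAMPLE_22_11)
print(f"TX drop fraction: {tx_drop_fraction(stats):.1%}")
print(f"Offered load vs 20.05: {offered_load_ratio(stats, TX_TOTAL_20_05):.1f}x")
```

With the quoted numbers, the 22.11 counters are self-consistent (TX-packets plus TX-dropped equals TX-total), roughly 95% of the generated packets were dropped, and the offered load is a little under 20 times the 20.05 total, in line with the "almost 20x" figure in the report.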