From mboxrd@z Thu Jan 1 00:00:00 1970
From: NeilBrown
Subject: Re: [PATCH 0/3]: net: dsa: mt7530: support MT7530 in the MT7621 SoC
Date: Mon, 17 Dec 2018 18:11:02 +1100
Message-ID: <871s6gbpbt.fsf@notabene.neil.brown.name>
References: <87r2f2pxpa.fsf@miraculix.mork.no> <87pnu8vepj.fsf@notabene.neil.brown.name> <87a7l5azux.fsf@notabene.neil.brown.name> <20181216.141458.260639209068679776.davem@davemloft.net> <875zvtawlh.fsf@notabene.neil.brown.name>
Mime-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature"
Cc: bjorn@mork.no, gerg@kernel.org, sean.wang@mediatek.com, andrew@lunn.ch, vivien.didelot@savoirfairelinux.com, netdev@vger.kernel.org, blogic@openwrt.org, opensource@vdorst.com
To: Florian Fainelli , David Miller
Return-path:
Received: from mx2.suse.de ([195.135.220.15]:46992 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726395AbeLQHLQ (ORCPT ); Mon, 17 Dec 2018 02:11:16 -0500
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

--=-=-=
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

On Sun, Dec 16 2018, Florian Fainelli wrote:

> On December 16, 2018 3:19:22 PM PST, NeilBrown wrote:
>>On Sun, Dec 16 2018, David Miller wrote:
>>
>>> From: NeilBrown
>>> Date: Mon, 17 Dec 2018 09:08:54 +1100
>>>
>>>> In my 4.4 kernel, the build_skb() call in (the equivalent of)
>>>> mtk_poll_rx() takes about 1.2usec and the call to napi_gro_receive()
>>>> takes about 3usec.
>>>>
>>>> In my 4.20 kernel, these calls take about 30 and 24 usec respectively.
>>>> This easily explains the slowdown.
>>>
>>> That's a huge difference.
>>>
>>> Nothing jumps out as a possible cause except perhaps retpoline or
>>> something like that.
>>
>>I'll keep that in mind - thanks.
>>
>>My guess was CPU-cache invalidation.
>>I just checked and the other CPU core (there are two - each
>>hyper-threaded - "other" meaning not the one that handles ethernet
>>interrupts) gets several thousand "IPI resched" interrupts while
>>running a 10 second (226MByte) iperf3 receive test.
>>About 17KB transferred per IPI.
>>I cannot see where build_skb() would do cache invalidation though.
>
> It doesn't; the driver is responsible for that. How is coherency maintained between cores?

I suspect so - yes.  Coherency only needs explicit management when DMA is
used.

This wasn't the problem.  Further investigation showed that the problem
was that I had CONFIG_SLUB_DEBUG set.  That was probably useful in some
earlier debugging exercise, but it clearly isn't useful when
performance-testing the network.
I removed that and I have much nicer numbers - not quite the consistent
900+ that I saw with 4.4, but a lot closer.

Thanks for the encouragement, and sorry for the noise.

NeilBrown

>
> The IPI could be due to receive packet steering, is the MAC multi queue aware on the RX path?
> --
> Florian

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEG8Yp69OQ2HB7X0l6Oeye3VZigbkFAlwXTAcACgkQOeye3VZi
gbn8vw/8CLVLbfgRB7mDsiImCVoeX4qIqolThPIvw3d7RoGBm7w2uZMHW4IVMzK3
TwGgrVQdA/zkwFoal0FZ5Cye4xpm6CkmgkdGrR55eaEhcYRVmbso6EoHVxmAe7G0
X3SFYqh01H3p/9dY1w92mv7dITlxm+cnVIEOdYsFmhZ3mS6ZpAZTm/VznZrQeUzg
/tzzQdwP3Q035zdN+ijxE2L4FCVb9TuSvbLhen/vK/kN1nvhFVFRdmN6khG0KGU/
Ldsc8IOatmXWYr/k4gM8Lpe9XUYNHeIMU82xXDibFqncJJXmc+omGfFYRXE3BV0U
NUTBQ8/dRNJWLSq4QNwYru/ALGYIFk2wNv2mGc+7hbXqkeg8OCW+4k98cdiBBaxc
kUa1r84ZExghY8DpJtXzZLangI+nTcbskZZAG7a55wws4o0kPIbjM4FfWMIkEzS0
D0Bd2dRhBABjLzCoxf9Xsw2NEdJdXzBcQxeqwI9HLKURusyZS8pwJceUnEWAVNix
b11sovgi+v/SVtEtuZyjnHcvAY5qM6OGWfgAdY7WWuSuvMDdzFYlUAHa4uHyZBrm
26rT0o88Gp7CLIJQ6IWOiftj0nYWiITz6kK2FBtIrtBn+VOI1MNoKKDJOsJNGrNf
eSU0Cnu/c64u1jnCOH1ApwRuPGsw1uL7yOk5xcKtkTXpEOs32yg=
=GagW
-----END PGP SIGNATURE-----
--=-=-=--