From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Miller
Subject: Re: Memory corruption with r8169 across several device revisions and kernels
Date: Tue, 23 Jan 2018 10:28:23 -0500 (EST)
Message-ID: <20180123.102823.1267642153979326760.davem@davemloft.net>
References: <8ac81034-008b-7ad0-619c-b80bb0843c14@googlemail.com> <20180122000922.GA3020@electric-eye.fr.zoreil.com> <3ceebc43-6baf-2b4c-af10-70522e97385e@googlemail.com>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: romieu@fr.zoreil.com, netdev@vger.kernel.org
To: o.freyermuth@googlemail.com
Return-path:
Received: from shards.monkeyblade.net ([184.105.139.130]:54062 "EHLO shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751325AbeAWP20 (ORCPT ); Tue, 23 Jan 2018 10:28:26 -0500
In-Reply-To: <3ceebc43-6baf-2b4c-af10-70522e97385e@googlemail.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

From: Oliver Freyermuth
Date: Mon, 22 Jan 2018 23:55:58 +0100

> Checking through the driver sources, I find rtnl_link_stats64 can
> not be the culprit, since it has rx_packets and only after
> tx_packets. However, struct rtl8169_counters looks like:
>
> struct rtl8169_counters {
>         __le64  tx_packets;
>         __le64  rx_packets;
>         __le64  tx_errors;
>         __le32  rx_errors;
>         __le16  rx_missed;
>         __le16  align_errors;
>         __le32  tx_one_collision;
>         __le32  tx_multi_collision;
>         __le64  rx_unicast;
>         __le64  rx_broadcast;
>         __le32  rx_multicast;
>         __le16  tx_aborted;
>         __le16  tx_underun;
> };
>
> This looks like it could very well match the structure found in
> memory, so something would be broken related to rtl8169_do_counters,
> in the DMA transfer.
>
> Does this help - can I provide more info? I get the feeling this
> affects many tens of thousands of systems and just has been hidden
> due to network stats being read rarely...

Looking at how these DMA counters are handled, there appears to be a
requirement that the memory buffer is 64-byte aligned.

This is because the low bits in the counter address register are used
for various commands, for example:

	/* ResetCounterCommand */
	CounterReset	= 0x1,

	/* DumpCounterCommand */
	CounterDump	= 0x8,

Looking at the FreeBSD driver, the requirement seems to be 64-bytes of
alignment. (see RL_DUMP_ALIGN define)

However, nothing is being done in r8169.c to enforce this alignment at
counter allocation time:

	tp->counters = dmam_alloc_coherent (&pdev->dev, sizeof(*tp->counters),
					    &tp->counters_phys_addr,

There is no alignment guaranteed by this allocation interface. On a
lot of platforms you get PAGE_SIZE aligned buffers, but this is not a
universal thing at all.

Therefore the driver needs to allocate "size + (64 - 1)" bytes and do
the 64-byte alignment of the CPU pointer and the DMA address by hand.
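
As a rough illustration of the by-hand alignment described above, here
is a minimal userspace sketch in C. It is not the r8169 code: struct
counter_buf and the helpers padded_size(), align_offset(),
counter_buf_align() and counter_cmd() are hypothetical names, the
64-byte figure is assumed from the FreeBSD observation, and the DMA
address is just a number supplied by the caller rather than something
returned by dmam_alloc_coherent(). It also assumes the CPU pointer and
the DMA address name the same buffer, so the same offset is applied to
both.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed alignment, taken from the FreeBSD RL_DUMP_ALIGN observation. */
#define COUNTER_ALIGN 64u

/* Command bits carried in the low bits of the counter address register. */
#define COUNTER_RESET 0x1u
#define COUNTER_DUMP  0x8u

/* Hypothetical bookkeeping for one over-allocated counter buffer. */
struct counter_buf {
	void     *raw_cpu;  /* pointer as returned by the allocator */
	uint64_t  raw_dma;  /* bus address as returned by the allocator */
	void     *cpu;      /* aligned CPU pointer the driver would use */
	uint64_t  dma;      /* aligned bus address given to the device */
};

/* Bytes to request so an aligned window of 'size' bytes always fits. */
static size_t padded_size(size_t size)
{
	return size + (COUNTER_ALIGN - 1);
}

/* Bytes to skip from 'dma' to reach the next 64-byte boundary (0..63). */
static size_t align_offset(uint64_t dma)
{
	return (size_t)(-dma & (COUNTER_ALIGN - 1));
}

/* Apply one offset to both views so they keep naming the same bytes. */
static void counter_buf_align(struct counter_buf *b)
{
	size_t off = align_offset(b->raw_dma);

	b->cpu = (uint8_t *)b->raw_cpu + off;
	b->dma = b->raw_dma + off;
}

/* Register value: aligned address with a command in the low bits. */
static uint64_t counter_cmd(const struct counter_buf *b, uint32_t cmd)
{
	/* An unaligned address would collide with the command bits. */
	assert((b->dma & (COUNTER_ALIGN - 1)) == 0);
	return b->dma | cmd;
}
```

For example, with a raw DMA address of 0x1000f8 the offset is 8, the
aligned address is 0x100100, and OR-ing in COUNTER_DUMP gives 0x100108.
The low six bits of the aligned address are zero, so the command bits
cannot be mistaken for address bits. The raw pointer and raw address
are kept because those, not the aligned ones, are what would have to
be handed back when freeing.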