From: Bhaskar Dutta
Subject: Re: TCP-MD5 checksum failure on x86_64 SMP
Date: Wed, 5 May 2010 23:33:59 +0530
To: Stephen Hemminger
Cc: Ben Hutchings, netdev@vger.kernel.org
In-Reply-To: <20100504101301.5f4dd9c2@nehalam>
References: <1272972722.2097.1.camel@achroite.uk.solarflarecom.com> <20100504091215.5a4a51f4@nehalam> <20100504101301.5f4dd9c2@nehalam>

On Tue, May 4, 2010 at 10:43 PM, Stephen Hemminger wrote:
>
> On Tue, 4 May 2010 22:38:49 +0530
> Bhaskar Dutta wrote:
>
> > On Tue, May 4, 2010 at 9:42 PM, Stephen Hemminger wrote:
> > > On Tue, 4 May 2010 19:58:32 +0530
> > > Bhaskar Dutta wrote:
> > >
> > >> On Tue, May 4, 2010 at 5:02 PM, Ben Hutchings wrote:
> > >> > On Tue, 2010-05-04 at 09:00 +0530, Bhaskar Dutta wrote:
> > >> >> Hi,
> > >> >>
> > >> >> I am observing intermittent TCP-MD5 checksum failures
> > >> >> (CONFIG_TCP_MD5SIG) on kernel 2.6.31 while talking to a BGP router.
> > >> >>
> > >> >> The problem is only seen in multi-core 64 bit machines.
> > >> >> Is there any known bug in the per_cpu_ptr implementation (I am aware
> > >> >> that the percpu allocator has been re-implemented in 2.6.33) that
> > >> >> might cause a corruption in 64 bit SMP machines?
> > >> >>
> > >> >> Any pointers would be appreciated.
> > >> >
> > >> > There was another recent report of incorrect MD5 signatures in
> > >> > , but without any
> > >> > response.
> > >> >
> > >> > Ben.
> > >> >
> > >>
> > >> I found another thread posted back in Jan 2007 with a similar bug
> > >> (x86_64 on 2.6.20) but no replies to that as well.
> > >> http://lkml.org/lkml/2007/1/20/56
> > >
> > > 2.6.20 had lots of other MD5 bugs. Your problem might be related to
> > > GRO.  MD5 may not handle multi-fragment packets.
> > > --
> >
> > I am getting the issue on 2.6.31 and 2.6.28 (gro infrastructure was
> > added in 2.6.29).
> > Also, both segmentation offloading as well as receive offloading
> > (gso/gro) are turned off.
> >
> > Moreover outgoing TCP packets are the ones with the corrupt checksums.
> > Both tcpdump on my local machine and the BGP router on the other side
> > complain of the bad checksums with the same packet.
> >
> > I am trying to figure out if there is something in the per-cpu
> > implementation that might be causing a corruption (SMP and x86_64) but
> > I am not really getting anywhere.
>
> I seriously doubt the per-cpu stuff is the issue.
>
> > I am trying to reproduce the bad checksums with the latest kernel
> > sources since it has a new implementation of the percpu allocator.
>
> First turn off all offload settings on the device (TSO,GSO,SG,CSUM)
> then check that size of the bad packets. Are they fragmented or
> just simple linear packets?
>
> --

Hi,

TSO, GSO and SG are already turned off.
rx/tx checksumming is on, but that shouldn't matter, right?

# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: off
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off

The bad packets are very small (<300 bytes); most carry no data at all.

After adding some logs to kernel 2.6.31-12, it seems that the data used
by tcp_v4_md5_hash_skb (the function that calculates the MD5 hash) is
(might be?) getting corrupted.
The saddr, daddr and len fields of the tcp4_pseudohdr structure
(bp = &hp->md5_blk.ip4) change to different values towards the end of
tcp_v4_md5_hash_skb whenever there is a checksum error.

The tcp4_pseudohdr (bp) lives inside the tcp_md5sig_pool (hp), which is
returned by tcp_get_md5sig_pool (which calls per_cpu_ptr).

Using a local copy of the tcp4_pseudohdr in tcp_v4_md5_hash_skb (with
all fields copied from the original tcp4_pseudohdr inside the
tcp_md5sig_pool) and calculating the MD5 checksum from that local copy
seems to solve the issue: I don't see bad packets for hours in load
tests, whereas without the change I see them immediately.

I am still unable to figure out how this is happening. Please let me
know if you have any pointers.

Thanks a lot!
Bhaskar
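
For anyone wanting to check a captured segment by hand, here is a minimal user-space sketch of the digest that the pseudo-header feeds into. It assumes IPv4 and the RFC 2385 input order (pseudo-header, fixed TCP header with the checksum zeroed, payload, key). It is not the kernel code: tcp_md5_digest and example_header are hypothetical names, and the addresses, ports and key are made-up illustration values.

```python
import hashlib
import socket
import struct

IPPROTO_TCP = 6


def tcp_md5_digest(saddr, daddr, tcp_header, payload, key):
    """Return the 16-byte RFC 2385 digest for one IPv4 TCP segment.

    tcp_header is the full TCP header including options. Only the fixed
    20 bytes are hashed, with the checksum field treated as zero.
    """
    if len(tcp_header) < 20:
        raise ValueError("TCP header must be at least 20 bytes")
    # Segment length covers the whole TCP header (with options) plus data.
    seg_len = len(tcp_header) + len(payload)
    # Pseudo-header layout: saddr, daddr, zero pad, protocol, length.
    pseudo = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(saddr),
        socket.inet_aton(daddr),
        0,
        IPPROTO_TCP,
        seg_len,
    )
    # Fixed header with the checksum (bytes 16-17) replaced by zero.
    fixed = tcp_header[:16] + b"\x00\x00" + tcp_header[18:20]
    md5 = hashlib.md5()
    for part in (pseudo, fixed, payload, key):
        md5.update(part)
    return md5.digest()


def example_header():
    """Build an illustrative 40-byte header for a bare ACK (made-up values)."""
    fixed = struct.pack(
        "!HHIIBBHHH",
        179,      # source port
        40000,    # destination port
        1,        # sequence number
        1,        # acknowledgment number
        10 << 4,  # data offset: 10 words = 40 bytes
        0x10,     # flags: ACK
        65535,    # window
        0xBEEF,   # checksum (ignored by the digest)
        0,        # urgent pointer
    )
    # 20 bytes of option space (placeholder); options are not hashed.
    return fixed + bytes(20)


if __name__ == "__main__":
    digest = tcp_md5_digest("192.0.2.1", "192.0.2.2",
                            example_header(), b"", b"secret")
    print(digest.hex())
```

Because saddr, daddr and len all go into the first block that is hashed, any change to the pseudo-header between filling it in and hashing it yields a different digest, which would match the bad signatures described above. Recomputing the digest from a tcpdump capture with the configured key shows what the signature should have been.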