From mboxrd@z Thu Jan  1 00:00:00 1970
From: Stephen Hemminger
Subject: Re: Strange latency spikes/TX network stalls on Sun Fire X4150(x86) and e1000e
Date: Wed, 6 Jun 2012 11:46:46 -0700
Message-ID: <20120606114646.143c342f@nehalam.linuxnetplumber.net>
References: <20120606092635.00003b61@unknown>
	<20120607021937.a5638bfd.shimoda.hiroaki@gmail.com>
	<20120606.112332.885204082939531665.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: therbert@google.com, shimoda.hiroaki@gmail.com,
	jesse.brandeburg@intel.com, eric.dumazet@gmail.com, denys@visp.net.lb,
	netdev@vger.kernel.org, e1000-devel@lists.sourceforge.net,
	jeffrey.t.kirsher@intel.com
To: David Miller
Return-path:
Received: from mail.vyatta.com ([76.74.103.46]:43195 "EHLO mail.vyatta.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752410Ab2FFSqt convert rfc822-to-8bit (ORCPT );
	Wed, 6 Jun 2012 14:46:49 -0400
In-Reply-To: <20120606.112332.885204082939531665.davem@davemloft.net>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Wed, 06 Jun 2012 11:23:32 -0700 (PDT)
David Miller wrote:

> From: Tom Herbert
> Date: Wed, 6 Jun 2012 11:21:40 -0700
>
> > I'm not exactly sure what the exact effect of WTHRESH is here.  Does
> > the device coalesce 5 completions regardless of size?  Would the
> > problem be avoided if bql limit_min were MTU, or could same issue be
> > hit with larger that 64 byte packets?
>
> The problem is that no TX completions are signalled happen until at
> least WTHRESH are pending.
>
> BQL is the least of the problems generated by this kind of behavior.
>
> All drivers must TX complete in a small, finite, amount of time so
> it is absolutely illegal to have the behavior that WRTHRESH > 1
> gives.

The TX completion is also controlled by the programming of the corresponding
interrupt moderation register (EITR).
It makes sense to hold off a little bit to try and reduce the TX completion
interrupt load.

Intel manual:

Descriptors are written back in one of three cases:
• TXDCTL[n].WTHRESH = 0b and a descriptor which has RS set is ready to be
  written back
• The corresponding EITR counter has reached zero
• TXDCTL[n].WTHRESH > 0b and TXDCTL[n].WTHRESH descriptors have accumulated

For the first condition, write-backs are immediate. This is the default
operation and is backward compatible with previous device implementations.

The other two conditions are only valid if descriptor bursting is enabled
(Section 8.12.13). In the second condition, the EITR counter is used to force
timely write-back of descriptors. The first packet after timer initialization
starts the timer. Timer expiration flushes any accumulated descriptors and
sets an interrupt event (TXDW).

For the final condition, if TXDCTL[n].WTHRESH descriptors are ready for
write-back, the write-back is performed.
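
To make the three cases easier to reason about, here is a rough sketch of the write-back decision as a small C model. It is not driver or hardware code: struct txq_model, enum wb_reason and writeback_reason() are hypothetical names invented for illustration, and the model assumes that descriptor bursting is in effect whenever WTHRESH is non-zero, which the manual excerpt does not state outright.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one TX queue; field names are illustrative only. */
struct txq_model {
    unsigned wthresh;      /* value programmed into TXDCTL[n].WTHRESH */
    unsigned pending;      /* completed descriptors not yet written back */
    bool     rs_ready;     /* a descriptor with RS set is ready for write-back */
    bool     eitr_expired; /* the corresponding EITR counter has reached zero */
};

enum wb_reason { WB_NONE, WB_IMMEDIATE, WB_TIMER, WB_THRESHOLD };

/* Return which of the three documented cases (if any) triggers a write-back. */
enum wb_reason writeback_reason(const struct txq_model *q)
{
    assert(q != NULL);

    /* Case 1: WTHRESH == 0, write-back is immediate once an RS descriptor is ready. */
    if (q->wthresh == 0)
        return q->rs_ready ? WB_IMMEDIATE : WB_NONE;

    /* Assumption: WTHRESH > 0 implies descriptor bursting is enabled. */

    /* Case 3: WTHRESH descriptors have accumulated. */
    if (q->pending >= q->wthresh)
        return WB_THRESHOLD;

    /* Case 2: EITR expiry flushes whatever has accumulated so far. */
    if (q->eitr_expired && q->pending > 0)
        return WB_TIMER;

    /* Otherwise the completion waits, which is the stall discussed above. */
    return WB_NONE;
}
```

With wthresh = 5 and a single completed descriptor, this model returns WB_NONE until eitr_expired becomes true, so under these assumptions the EITR setting bounds how long a lone small packet's completion can be held back.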