From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59364) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bwnmn-00050P-Bg for qemu-devel@nongnu.org; Wed, 19 Oct 2016 06:07:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bwnmi-0005F8-Bf for qemu-devel@nongnu.org; Wed, 19 Oct 2016 06:07:49 -0400 Received: from mx1.redhat.com ([209.132.183.28]:33386) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1bwnmi-0005EL-3r for qemu-devel@nongnu.org; Wed, 19 Oct 2016 06:07:44 -0400 Date: Wed, 19 Oct 2016 12:07:40 +0200 From: Kevin Wolf Message-ID: <20161019100740.GE5336@noname.redhat.com> References: <1476657307-26165-1-git-send-email-mail@kevin-wolf.de> <20161018142705.GK4706@noname.str.redhat.com> <22FF2CCB-272D-4202-A7F0-D29E848C366A@daynix.com> <20161019072510.GB5336@noname.redhat.com> <4635F072-CAD3-4FD1-978A-8E0136878CBD@daynix.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <4635F072-CAD3-4FD1-978A-8E0136878CBD@daynix.com> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH] e1000e: Don't zero out buffer address in rx descriptor List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Dmitry Fleytman Cc: Kevin Wolf , Qemu Developers , jasowang@redhat.com Am 19.10.2016 um 09:57 hat Dmitry Fleytman geschrieben: >=20 > On 19 Oct 2016, at 10:25 AM, Kevin Wolf wrote: >=20 > Am 19.10.2016 um 08:48 hat Dmitry Fleytman geschrieben: >=20 > Another related thing that I noticed while debugging this an= d > turning on > tracing is that the interrupt throttling timers kept firing = even if > there was no activity at all. Something might be wrong, ther= e, too. >=20 > Next thing I wondered why throttling was enabled at all beca= use the > spec > says the default is 0 (turned off). So one thing that I'm pr= etty > sure is > just a misunderstanding is the following defintion: >=20 > #define E1000E_MIN_XITR (500) /* No more then 7813 inter= rupts > per > second according to spec > 10.2.4.2 */ >=20 >=20 > As I understand it, the spec is just giving an example there= and > lower > values are valid as well. At the very least, 0 should be acc= epted as > a > special case because it means "disabled" and it's specified = to be > the > default. >=20 >=20 > Right, this according to the spec this value should be 0 by def= ault and > throttling should be disabled. >=20 > Current device implementation does not allow specification of > throttling interval less than 500 and treats interval 0 as thro= ttling > enabled with interval 500. >=20 > This is done by intention because according to the spec (10.2.4= .2) > device cannot produce more than 7813 interrupts per second even= when > throttling is disabled. Therefore, even in case of interrupt st= orm > (continuous interrupt re-injection by device), number of interr= upts > produced by device is limited and CPU (driver) has enough time = to do > its job and handle problematic interrupt state. >=20 >=20 > I think you're misinterpreting the spec here. This is the paragraph > we're talking about, right? >=20 > For example, if the interval is programmed to 500 (decimal), the > 82574 guarantees the CPU is not interrupted by it for 128 =C2=B5= s from > the last interrupt. The maximum observable interrupt rate from t= he > 82574 should never exceed 7813 interrupts/sec. >=20 > It says "for example", so this is just demonstrating how you can > calculate the effects of a specific throttling setting. It says tha= t > _if_ you set ITR to 500, you get an interrupt at most every > 500 * 256 ns =3D 128 =C2=B5s. And 1 / 128 =C2=B5s =3D 7821.5 Hz, so= this is the > effective maximum frequency that _this specific_ ITR setting allows= . >=20 > I also don't think it would make any sense for hardware to be unabl= e to > trigger interrupts more often than that. Triggering an interrupt is= not > a complex operation that involves a lot of calculation or anything. >=20 >=20 > Hi Kevin, >=20 > Yes, I assume that sentence >=20 > =E2=80=9CThe maximum observable interrupt rate from the > 82574 should never exceed 7813 interrupts/sec." >=20 > is not a related to a specific case, but describes a generic limitation= , > however it might be I=E2=80=99m misreading the spec indeed. For me everything hints at this being only an example: Not only do the numbers match the example made in the previous sentence (which is explicitly called an example) and look weird as a real limit, but it's also in the same paragraph as the explicit example and the spec is generally good at starting a new paragraph when talking about a new aspect. I don't care enough to actually make you change anything, but I wanted you to be aware that the interpretation of the spec as coded into our emulation isn't clear at all (in fact, I would think it's clear that it's _not_ meant this way) and that real hardware probably doesn't do the same thing as we do. What we're doing may still have merit, as a workaround for a guest driver bug. > Opposed to this, virtual device is able to raise interrupts wit= h rate > limited by CPU speed only therefore driver has no chance to fix > interrupt storm condition. >=20 > Windows e1000e drivers rely on upper limit for number of interr= upts > per second in some cases and absence of this limit leads to inf= inite > interrupt storms. >=20 > To summarise, while usage of throttling mechanisms is a little = bit > different from what specification says, effective emulated devi= ce > behavior is totally compliant to the real device. >=20 >=20 > So Windows doesn't configure ITR (i.e. it is 0) even though it can'= t > handle unlimited interrupts? That would be a driver bug then, and > perhaps an important enough one to keep a workaround in our code. B= ut > then let's be explicit that this is a workaround for a Windows bug = and > not mandated by the spec. >=20 > I'm not sure in what setup you produced this error, but possibly a > reason why this doesn't happen with real hardware isn't the NIC its= elf > but the backend: Communication with the host can obviously be faste= r > than talking to a physical network (so if you were doing the latter= , the > rate in the VM wouldn't be limited by the CPU, but by the physical > network). >=20 >=20 > This issue is reproduced on device disable and not related=20 > to intensive device/backend communication. One RX packet with > right timing is enough to trigged the problem. >=20 > The same issue was fixed in e1000 device some time ago as well. Commit 9596ef7c was good in flagging it as a guest driver bug. Only a later series brought in the questionable spec interpretation. Kevin