From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:56767)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jasowang@redhat.com>) id 1a3bfi-00088x-6C
	for qemu-devel@nongnu.org; Mon, 30 Nov 2015 22:32:07 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jasowang@redhat.com>) id 1a3bfe-0008Ev-VX
	for qemu-devel@nongnu.org; Mon, 30 Nov 2015 22:32:06 -0500
Received: from mx1.redhat.com ([209.132.183.28]:44957)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jasowang@redhat.com>) id 1a3bfe-0008Eq-NE
	for qemu-devel@nongnu.org; Mon, 30 Nov 2015 22:32:02 -0500
References: <1448606921-17846-1-git-send-email-den@openvz.org>
	<5657FD3A.407@openvz.org> <5658418B.6000700@openvz.org>
	<565BE57B.6060503@redhat.com> <565BEB1C.5050806@parallels.com>
From: Jason Wang <jasowang@redhat.com>
Message-ID: <565D14A7.2000501@redhat.com>
Date: Tue, 1 Dec 2015 11:31:51 +0800
MIME-Version: 1.0
In-Reply-To: <565BEB1C.5050806@parallels.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH for 2.5 1/1] e1000: fix hang of win2k12
 shutdown with flood ping
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Denis V. Lunev" <den-lists@parallels.com>, "Denis V. Lunev" <den@openvz.org>
Cc: Vincenzo Maffione <v.maffione@gmail.com>, qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>


On 11/30/2015 02:22 PM, Denis V. Lunev wrote:
> On 11/30/2015 08:58 AM, Jason Wang wrote:
>>
>> On 11/27/2015 07:42 PM, Denis V. Lunev wrote:
>>> On 11/27/2015 09:50 AM, Denis V. Lunev wrote:
>>>> On 11/27/2015 09:48 AM, Denis V. Lunev wrote:
>>>>> The e1000 driver in Win2k12 is badly broken. It hangs 100% of the time on
>>>>> shutdown of a UP VM under flood ping. The guest checks the card state and
>>>>> re-injects the interrupt to itself in a loop. This is fatal for a UP machine.
>>>>>
>>>>> There is no good way to fix this misbehavior other than to kludge it. The
>>>>> emulation has an interrupt throttling register (ITR), which limits the
>>>>> interrupt rate and allows the guest to get through this phase.
>>>>> This kludge causes no problem for Linux guests: the driver adjusts the
>>>>> ITR value itself.
>>>>>
>>>>> On the other hand according to the initial research in
>>>>>       commit e9845f0985f088dd01790f4821026df0afba5795
>>>>>       Author: Vincenzo Maffione <v.maffione@gmail.com>
>>>>>       Date:   Fri Aug 2 18:30:52 2013 +0200
>>>>>
>>>>>       e1000: add interrupt mitigation support
>>>>>
>>>>>       ...
>>>>>
>>>>>       Interrupt mitigation boosts performance when the guest suffers from
>>>>>       a high interrupt rate (i.e. receiving short UDP packets at high packet
>>>>>       rate). For some numerical results see the following link
>>>>> http://info.iet.unipi.it/~luigi/papers/20130520-rizzo-vm.pdf
>>>>>
>>>>> this should also boost performance a bit.
>>>>>
>>>>> See https://bugzilla.redhat.com/show_bug.cgi?id=874406 for additional
>>>>> details.
>>>>>
>>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>>> CC: Vincenzo Maffione <v.maffione@gmail.com>
>>>>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>>>>> ---
>>>>>    hw/net/e1000.c | 3 +++
>>>>>    1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>> index c877e06..0af528f 100644
>>>>> --- a/hw/net/e1000.c
>>>>> +++ b/hw/net/e1000.c
>>>>> @@ -447,6 +447,9 @@ static void e1000_reset(void *opaque)
>>>>>            e1000_link_down(d);
>>>>>        }
>>>>> +    /* Throttle interrupts to allow poor Win 2012 to shutdown */
>>>>> +    d->mac_reg[ITR] = 250;
>>>>> +
>>>>>        /* Some guests expect pre-initialized RAH/RAL (AddrValid flag + MACaddr) */
>>>>>        d->mac_reg[RA] = 0;
>>>>>        d->mac_reg[RA + 1] = E1000_RAH_AV;
>>>> Intel manual says about ITR that " A initial suggested range is
>>>> 651-5580 (28Bh - 15CCh)."
>>>> Should we use something other than 250? :)
>>>>
>>>> http://www.intel.com/content/www/us/en/embedded/products/networking/pci-pci-x-family-gbe-controllers-software-dev-manual.html
>>>>
>>>>
>>>>
>>>> Den
>>> Jason, can you look at this?
>>>
>>> I have rechecked the MAINTAINERS file and found that
>>> I missed you here. Sorry :(
>>>
>>> Den
>>>
>> No problem.
>>
>> But I have a question. What if ITR is disabled?
>>
>
> From the guest's side, I do not think this is really the case.
> For ITR to be disabled, the guest would have to set it to a real
> value and then clear it. That does not happen here: my patch
> applies on reset only, i.e. the guest does not care about
> this at all and the value stays "as is". I think a real card
> behaves in a similar way: it cannot generate interrupts
> as fast as a hypervisor can, i.e. either there is a natural
> limitation that avoids this problem, or there
> is a default value.
>
> From the QEMU side, the question still stands. Fortunately
> the handle (mitigation flag) is on by default. I think that
> it exists to preserve compatibility with QEMU 1.6.
> In real life nobody will turn it off unless they
> know what they are doing ;)
>
> Den

OK, applied to my -net with minor tweaks and a TODO added in the comment.

We've hit several similar issues in the past, so we need to consider a
complete solution; otherwise we may run into something like this
again.
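
For reference, a rough sketch of what ITR = 250 means for the interrupt
rate. It assumes the unit given in the Intel manual (ITR holds the minimum
inter-interrupt interval in 256 ns increments). The helper names are made
up for illustration and are not part of hw/net/e1000.c.

```c
#include <stdint.h>

/* Assumption: ITR counts the minimum gap between interrupts in 256 ns units. */
#define ITR_UNIT_NS 256u

/* Hypothetical helper: minimum inter-interrupt interval in nanoseconds. */
static uint64_t itr_interval_ns(uint32_t itr)
{
    return (uint64_t)itr * ITR_UNIT_NS;
}

/*
 * Hypothetical helper: upper bound on interrupts per second for a given
 * ITR value. Returns 0 when itr == 0, used here to mean "throttling
 * disabled, no bound".
 */
static uint64_t itr_max_irq_per_sec(uint32_t itr)
{
    return itr ? 1000000000ull / itr_interval_ns(itr) : 0;
}
```

With this, 250 gives an interval of 64 us, i.e. at most 15625
interrupts/sec, while the manual's suggested 651-5580 range corresponds to
roughly 6000 down to 700 interrupts/sec.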

Thanks