From: Laszlo Ersek <lersek@redhat.com>
To: qemu devel list <qemu-devel@nongnu.org>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Subject: Re: [Qemu-devel] virtio-net regression [was: syslinux vs. OVMF]
Date: Fri, 10 Apr 2015 16:36:15 +0200
Message-ID: <5527DFDF.6090007@redhat.com>
In-Reply-To: <5527AE47.8080909@redhat.com>

On 04/10/15 13:04, Laszlo Ersek wrote:
> On 04/10/15 12:06, Laszlo Ersek wrote:
>> On 04/10/15 10:14, Gerd Hoffmann wrote:
>>>   Hi,
>>>
>>>> In summary, please ask Gerd to rebuild the ipxe binaries that are
>>>> bundled with upstream qemu such that they include those two iPXE patches
>>>> of ours (see the last reference).
>>>
>>> https://www.kraxel.org/cgit/qemu/log/?h=rebase/roms-next
>>>
>>> Can you give this a try?
>>
>> Thank you for this update, I tested it.
>>
>> (1) I reproduced the issue, so that I could be sure that the fix wasn't
>> meaningless. Indeed the bug reproduces with the iPXE binaries bundled
>> with upstream qemu.
>>
>> I then checked out, built and installed your branch, and tried again,
>> with virtio-net and then e1000.
>>
>> (2) Virtio-net results:
>> - OVMF        loads shim.efi    via network
>> - shim.efi    loads grubx64.efi via network
>> - grubx64.efi loads grub.cfg    via network
>> - grubx64.efi loads vmlinuz     via network
>>
>> However, while grubx64.efi loads initrd.img via the network, qemu
>> crashes the guest, with the following message:
>>
>> qemu-system-x86_64: Guest moved used index from 46499 to 65534
>>
>> This is a virtio protocol bug in the guest (efi-virtio.rom), *or* in
>> QEMU. I don't know.
>>
>> * e1000 results:
>> - OVMF        loads shim.efi    via network
>> - shim.efi    loads grubx64.efi via network
>> - grubx64.efi loads grub.cfg    via network
>> - grubx64.efi loads vmlinuz     via network
>> - grubx64.efi loads initrd.img  via network
>> - guest kernel boots
>>
>> So, I think the update is fine in general; but maybe there's a new
>> virtio-related bug in either "efi-virtio.rom" or in QEMU.
>>
>> (When I originally wrote the (earlier versions of the) patches, I tested
>> them with virtio-net using RHEL-7 qemu, so I guess this could be an
>> upstream QEMU regression. The machine type I used for testing was
>> pc-i440fx-2.3.)
>>
>> (3) ... Confirmed, this is a qemu regression. Namely, I checked your new
>> efi-virtio.rom with RHEL-7 qemu, and it works fine. CC'ing qemu-devel.
> 
> Small update, before I start bisecting it: the bug does not reproduce
> with "-netdev bridge".
> 
> It seems to be specific to "-netdev tap". Further, "vhost=on" seems to
> play no role, "-netdev tap" reproduces the error both with and without
> vhost=on.

This is creepy.

It was not easy to bisect, because machine type "pc-i440fx-2.3" is obviously not available in older releases, e.g. v2.2.0.

Ultimately I realized that machine type pc-i440fx-2.0 does not reproduce the error, even with current master.

So I picked machine type pc-i440fx-2.1, and bisected the interval between the introduction of "pc-i440fx-2.1" (commit 3458b2b0) and current master (commit 6a460ed1). Log attached.

The result makes me question my sanity, or at least whether I issued the correct "git bisect bad" and "git bisect good" commands. This is the culprit:

commit 18045fb9f457a0f0cba2bd113c748a2dcb4ed39e
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Mon Jul 28 17:34:16 2014 +0200

    pc: future-proof migration-compatibility of ACPI tables
    
    This patch avoids that similar changes break QEMU again in the future.
    QEMU will now hard-code 64k as the maximum ACPI table size, which
    (despite being an order of magnitude smaller than 640k) should be enough
    for everyone.
    
    Reviewed-by: Laszlo Ersek <lersek@redhat.com>
    Tested-by: Igor Mammedov <imammedo@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

How?!

Anyway, then I patched qemu, on top of current master, still sticking with machine type "pc-i440fx-2.1", as follows:

-----------
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 1fe7bfb..6cb00a2 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -344,6 +344,7 @@ static void pc_compat_2_1(MachineState *machine)
     x86_cpu_compat_set_features("core2duo", FEAT_1_ECX, CPUID_EXT_VMX, 0);
     x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
     pcms->enforce_aligned_dimm = false;
+    legacy_acpi_table_size = 6652;
 }
 
 static void pc_compat_2_0(MachineState *machine)
-----------

Incredibly, this made the crash go away.

Without this patch (i.e. when it crashes), the fw_cfg file called "etc/acpi/tables" has size 0x20000. With the patch (which happens to suppress the crash for some reason), the same fw_cfg file has size 0x2000 (1/16th). This is consistent with the two branches in acpi_build(). (Note that the warning in the second branch there is never printed.)
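
The two sizes can be cross-checked with a little arithmetic. The following is only a sketch, not the acpi_build() code: it assumes that the legacy branch pads the blob up to a multiple of 4 KB and that the other branch pads to a fixed 0x20000 bytes. The names PAGE_ALIGN, FIXED_TABLE_SIZE, round_up() and blob_size() are made up for illustration.

```python
# Sketch only: these names are hypothetical and do not come from QEMU.
PAGE_ALIGN = 0x1000          # assumed 4 KB padding granularity
FIXED_TABLE_SIZE = 0x20000   # size observed without the patch
LEGACY_TABLE_SIZE = 6652     # value set by the one-line patch above


def round_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def blob_size(legacy_table_size):
    """Pick the padded blob size, depending on whether a legacy size is set."""
    if legacy_table_size:
        # Assumed legacy branch: pad to the next 4 KB boundary.
        return round_up(legacy_table_size, PAGE_ALIGN)
    # Assumed non-legacy branch: always pad to the fixed maximum.
    return FIXED_TABLE_SIZE


patched = blob_size(LEGACY_TABLE_SIZE)
unpatched = blob_size(0)
print(hex(patched), hex(unpatched), unpatched // patched)
# prints: 0x2000 0x20000 16
```

Under these assumptions 6652 rounds up to 0x2000, and 0x20000 / 0x2000 = 16, which matches the "1/16th" observation.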

It seems very unlikely that qemu is doing anything wrong. The difference in the fw_cfg file size causes a differently sized memory allocation in OVMF, which displaces further allocations by 1 page (4KB). For example, "1af41000.efi" (the iPXE virtio-net driver) is also loaded 4KB higher than before. But that doesn't directly explain why grub places garbage in the virtio-net ring while it downloads "initrd.img".
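
For reference, the "Guest moved used index" message quoted above is the kind of error a device-side sanity check on the ring index would produce. The sketch below is an assumption about the shape of such a check, not a copy of QEMU's code: it supposes that the index is a free-running 16-bit counter and that a jump larger than the ring size is treated as fatal. The function name num_heads() and the ring size of 256 are hypothetical.

```python
# Sketch only: num_heads() and RING_SIZE are illustrative assumptions.
RING_SIZE = 256  # hypothetical queue size, not taken from the report


def num_heads(last_seen_idx, guest_idx, ring_size=RING_SIZE):
    """Return how many new entries the guest published since last_seen_idx.

    Both indices are treated as free-running 16-bit counters, so the
    difference is taken modulo 2**16. A difference larger than the ring
    size cannot come from a well-behaved guest and is reported as an error.
    """
    delta = (guest_idx - last_seen_idx) & 0xFFFF
    if delta > ring_size:
        raise ValueError(
            f"Guest moved used index from {last_seen_idx} to {guest_idx}"
        )
    return delta


# A normal wrap-around: 65534 -> 2 is an advance of 4 entries.
print(num_heads(65534, 2))

# The values from the error message: an advance of 19035 entries.
try:
    num_heads(46499, 65534)
except ValueError as err:
    print(err)
```

With the values from the error message the apparent advance is 65534 - 46499 = 19035 entries, far more than any plausible ring size, which is what one would expect if the index field were overwritten with unrelated data.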

Anyway, I think we can rule out a qemu regression at this point. It's a bug in some other component, which the different memory map (due to the larger, 0x20000 allocation) exposes.

Thanks,
Laszlo

[-- Attachment #2: bisect.log --]
[-- Type: text/x-log, Size: 2471 bytes --]

git bisect start
# bad: [6a460ed18a3fda0eb2d9c96b8b01817b4dcbded4] configure: disable Archipelago by default and warn about libxseg GPLv3 license
git bisect bad 6a460ed18a3fda0eb2d9c96b8b01817b4dcbded4
# good: [3458b2b075f92f163ccb9a1f24733eb5705947f0] pc: add 2.1 machine type
git bisect good 3458b2b075f92f163ccb9a1f24733eb5705947f0
# bad: [ed173cb704f01a62143a3ef0dcf8b493bc795c23] .travis.yml: remove "make check" from main matrix
git bisect bad ed173cb704f01a62143a3ef0dcf8b493bc795c23
# good: [089a39486f2c47994c6c0d34ac7abf34baf40d9d] Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
git bisect good 089a39486f2c47994c6c0d34ac7abf34baf40d9d
# bad: [39ba3bf69c4ef4d8a8b683ee7282efd25b3f01ff] qcow2: fix new_blocks double-free in alloc_refcount_block()
git bisect bad 39ba3bf69c4ef4d8a8b683ee7282efd25b3f01ff
# good: [4bce526ec4b88362a684fd858e0e14c83ddf0db4] target-ppc: KVMPPC_H_CAS fix cpu-version endianess
git bisect good 4bce526ec4b88362a684fd858e0e14c83ddf0db4
# bad: [a9047ec3f6ab56295cba5b07e0d46cded9e2a7ff] hw/arm/boot: Set PC correctly when loading AArch64 ELF files
git bisect bad a9047ec3f6ab56295cba5b07e0d46cded9e2a7ff
# good: [82172b751929314a81337aa91deea82e8297af1f] tests/Makefile: Only run vhost-user-test on Linux
git bisect good 82172b751929314a81337aa91deea82e8297af1f
# good: [3a18d449836d21dee60439b154056cca9a3b6aee] Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging
git bisect good 3a18d449836d21dee60439b154056cca9a3b6aee
# bad: [18045fb9f457a0f0cba2bd113c748a2dcb4ed39e] pc: future-proof migration-compatibility of ACPI tables
git bisect bad 18045fb9f457a0f0cba2bd113c748a2dcb4ed39e
# good: [3b257486639cf6c25e1f3a744d1f19e6b4efdc7a] Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
git bisect good 3b257486639cf6c25e1f3a744d1f19e6b4efdc7a
# good: [c60a57ff497667780132a3fcdc1500c83af5d5c0] Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
git bisect good c60a57ff497667780132a3fcdc1500c83af5d5c0
# good: [cb348985abd3673b40c8af069c3e3b84f547b6f7] bios-tables-test: fix ASL normalization false positive
git bisect good cb348985abd3673b40c8af069c3e3b84f547b6f7
# good: [093a35e5fc0c60508e8c754ae81572090365723d] acpi-build: minor code cleanup
git bisect good 093a35e5fc0c60508e8c754ae81572090365723d
# first bad commit: [18045fb9f457a0f0cba2bd113c748a2dcb4ed39e] pc: future-proof migration-compatibility of ACPI tables
