Re: [qemu-upstream-unstable test] 21375: regressions - FAIL

From: Anthony PERARD <anthony.perard@citrix.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@citrix.com>,
	xen-devel@lists.xensource.com,
	"xen.org" <ian.jackson@eu.citrix.com>,
	Jan Beulich <jbeulich@suse.com>
Subject: Re: [qemu-upstream-unstable test] 21375: regressions - FAIL
Date: Tue, 19 Nov 2013 12:33:46 +0000
Message-ID: <20131119123345.GF2663@perard.uk.xensource.com>
In-Reply-To: <1384859242.30014.30.camel@kazak.uk.xensource.com>

On Tue, Nov 19, 2013 at 11:07:22AM +0000, Ian Campbell wrote:
> On Mon, 2013-11-18 at 17:18 +0000, Anthony PERARD wrote:
> > On Wed, Nov 06, 2013 at 05:22:29PM +0000, Anthony PERARD wrote:
> > > On Fri, Nov 01, 2013 at 03:46:36PM +0000, Anthony PERARD wrote:
> > > > On Fri, Nov 01, 2013 at 12:06:51PM +0000, Ian Campbell wrote:
> > > > > On Fri, 2013-11-01 at 11:58 +0000, Anthony PERARD wrote:
> > > > > > On Fri, Nov 01, 2013 at 10:43:16AM +0000, Ian Campbell wrote:
> > > > > > > On Fri, 2013-11-01 at 10:38 +0000, xen.org wrote:
> > > > > > > > flight 21375 qemu-upstream-unstable real [real]
> > > > > > > > http://www.chiark.greenend.org.uk/~xensrcts/logs/21375/
> > > > > > > > 
> > > > > > > > Regressions :-(
> > > > > > > > 
> > > > > > > > Tests which did not succeed and are blocking,
> > > > > > > > including tests which could not be run:
> > > > > > > >  test-amd64-i386-qemuu-rhel6hvm-intel  7 redhat-install    fail REGR. vs. 20054
> > > > > > > 
> > > > > > > > Anthony, have you made any progress on this? It's been failing for ages
> > > > > > > > now...
> > > > > > 
> > > > > > > Yes, it looks like the bug is triggered during a VESA resolution change. I
> > > > > > > have tried using the vgabios blob that we use for qemu-traditional and
> > > > > > > it works fine. But with the vgabios blob provided by qemu, it does not
> > > > > > > work... I'm still not sure what the bug is, but I'm getting closer to
> > > > > > > it.
> > > > > 
> > > > > Yay!
> > > > > 
> > > > > > > Also, this happens only on an Intel machine; on an AMD machine,
> > > > > > > everything works like a charm.
> > > > > > > 
> > > > > > > More detail, if anyone wants to know:
> > > > > > > It looks like syslinux is doing an int 10h call to set the video mode,
> > > > > > > and the call never returns:
> > > > > > > Int 0x10, with AX=0x4F02
> > > > > 
> > > > > This looks like it might be handled by SeaBIOS vgasrc/vbe.c:vbe_104f00 ?
> > > > > There seem to be a few changes in upstream seabios since the version
> > > > > referenced in xen.git:Config.mk. Many of them are cleanups/code motion
> > > > > but a few look worth investigating. 
> > > > 
> > > > > I've been able to get things working by applying a patch to vgabios
> > > > > that is in the xen tree: a0e7ccf6864c196906d58b54cd0996b4dbc1b022
> > > > > This patch allows the framebuffer to be cleared much faster.
> > > > > 
> > > > > But it does not really help me understand why the guest freezes. A
> > > > > couple more printfs might.
> > > 
> > > I finally managed to have a better understanding of the issue.
> > > 
> > > > So, the vgabios blob provided by QEMU has a routine to clear the video
> > > > RAM that takes a few seconds to run. That gives QEMU enough time to try
> > > > to refresh its display, and this means there will be a call to
> > > > xc_hvm_track_dirty_vram(). If the function is called while the vgabios
> > > > routine is running, then the guest is lost.
> > > 
> > > > The issue appears only on an Intel machine, with an HVM guest using EPT.
> > > > Having the guest use shadow works fine. So I'm going to investigate
> > > > the track_dirty code in Xen.
> > > 
> > > > The vgabios routine is called by syslinux with an Int 0x10. I tried to
> > > > get some debug prints after the call, either from the guest serial or
> > > > by using the Xen debug ioport, but nothing ever appears, and gdbsx only gave
> > > > me some weird IP which does not appear to point to any useful code
> > > > (it's all zeros).
> > 
> > > Another update,
> > > 
> > > we had the idea of trying this on an earlier version of Xen, and it turns
> > > out that Xen 4.3 works fine. One bisect later, a commit turns up.
> > 
> > commit 86781624f8df1d50eb4185cfc2ddce926798f7aa
> > x86_emulate: PUSH <mem> must read source operand just once
> > ... for the case of accessing MMIO.
> > 
> > > So after this commit, syslinux stops working correctly with the latest
> > > version of QEMU. This happens if QEMU is calling track_dirty_vram.
> > > 
> > > I have also used xentrace/xenalyze to try to grab more information about
> > > the issue. It did not really help, but it tells me that the guest is
> > > stuck on a specific instruction (it results in a vmexit EPT_VIOLATION over
> > > and over in xentrace). And this is where the guest is stuck:
> > 
> >    0xa126:  mov    %eax,%cr0
> >    0xa129:  ljmp   $0xf2e,$0xa12e
> >    0xa130:  mov    $0x26,%dl
> >    0xa132:  or     %bh,(%eax)
> >    0xa134:  movzww %sp,%sp
> >    0xa138:  mov    %edx,%ds
> >    0xa13a:  mov    %edx,%es
> >    0xa13c:  mov    %edx,%fs
> >    0xa13e:  mov    %edx,%gs
> >    0xa140:  jmp    *%ebx
> >    0xa142:  pushf  
> > => 0xa143:  lcall  *%cs:(%si)
> >    0xa147:  mov    $0x0,%ch
> 
> OOI what is the encoding of the bad instruction?

That's what gdb gives me:
   0x0000a143:  2e 67 ff 1c   lcall  *%cs:(%si)
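
For anyone who wants to check those four bytes by hand, here is a minimal
Python sketch that splits them into prefixes, opcode and ModRM fields. It is
only an illustration, not code from Xen or gdb: decode_ff() and its tables are
hypothetical names, it handles opcode 0xFF only, and it assumes (as the gdb
output above appears to) a 32-bit default address size, so that the 0x67
prefix selects 16-bit addressing.

```python
# Operations of opcode 0xFF (group 5), selected by the ModRM "reg" field.
FF_GROUP = {0: "inc", 1: "dec", 2: "call (near)", 3: "call (far)",
            4: "jmp (near)", 5: "jmp (far)", 6: "push"}

# Segment-override prefix bytes.
SEGMENT_PREFIXES = {0x26: "es", 0x2E: "cs", 0x36: "ss",
                    0x3E: "ds", 0x64: "fs", 0x65: "gs"}

# 16-bit addressing forms indexed by the ModRM "rm" field (mod == 0).
# rm == 6 with mod == 0 is a bare disp16, not (%bp), so it is left out below.
RM16 = ["(%bx,%si)", "(%bx,%di)", "(%bp,%si)", "(%bp,%di)",
        "(%si)", "(%di)", None, "(%bx)"]


def decode_ff(raw, default_addr_bits=32):
    """Decode prefixes + opcode 0xFF + ModRM (hypothetical helper).

    default_addr_bits is an assumption about the CPU mode; the 0x67
    prefix toggles between 16-bit and 32-bit addressing.
    """
    segment = None
    addr_bits = default_addr_bits
    i = 0
    while raw[i] in SEGMENT_PREFIXES or raw[i] == 0x67:
        if raw[i] == 0x67:
            addr_bits = 16 if default_addr_bits == 32 else 32
        else:
            segment = SEGMENT_PREFIXES[raw[i]]
        i += 1
    if raw[i] != 0xFF:
        raise ValueError("this sketch only handles opcode 0xFF")
    modrm = raw[i + 1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    operand = None
    if addr_bits == 16 and mod == 0:
        operand = RM16[rm]  # only the simple no-displacement forms
    return {"segment": segment, "addr_bits": addr_bits,
            "op": FF_GROUP.get(reg, "(undefined)"),
            "mod": mod, "reg": reg, "rm": rm, "operand": operand}


if __name__ == "__main__":
    print(decode_ff(bytes.fromhex("2e67ff1c")))
```

Under that assumption, 2e is the %cs override, 67 the address-size prefix,
and ModRM 0x1c splits into mod=0, reg=3, rm=4, i.e. a far call through
%cs:(%si), which matches the gdb line. With a 16-bit default address size the
same 0x67 prefix would select 32-bit addressing instead, where rm=4 means a
SIB byte follows; the sketch does not decode that case.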

> > > Before trying an earlier version of Xen, I tried to understand what went
> > > wrong on the Xen side. It turns out that, in the track_dirty_vram
> > > hypercall, a call to hap_enable_log_dirty() is all that is needed to break
> > > the guest.
> > > 
> > > Jan, any idea what the issue is?
> > 
> > Regards,
> > 
> 
> 

-- 
Anthony PERARD

Thread overview: 15+ messages
2013-11-01 10:38 [qemu-upstream-unstable test] 21375: regressions - FAIL xen.org
2013-11-01 10:43 ` Ian Campbell
2013-11-01 11:58   ` Anthony PERARD
2013-11-01 12:06     ` Ian Campbell
2013-11-01 15:46       ` Anthony PERARD
2013-11-06 17:22         ` Anthony PERARD
2013-11-18 17:18           ` Anthony PERARD
2013-11-19 11:07             ` Ian Campbell
2013-11-19 12:33               ` Anthony PERARD [this message]
2013-11-19 13:05             ` Jan Beulich
2013-11-19 14:28               ` Anthony PERARD
2013-11-20 14:42                 ` Ian Jackson
2013-11-01 10:47 ` Sander Eikelenboom
2013-11-01 10:52   ` Ian Campbell
2013-11-01 11:57     ` Sander Eikelenboom
