Re: System freeze with IGD passthrough

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: "G.R." <firemeteor@users.sourceforge.net>
Cc: xen-devel <xen-devel@lists.xen.org>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: Re: System freeze with IGD passthrough
Date: Fri, 21 Dec 2012 14:38:57 -0500	[thread overview]
Message-ID: <20121221193857.GC30562@phenom.dumpdata.com> (raw)
In-Reply-To: <CAKhsbWYchBmZega7Xy6NiNW4B+2vy1cK+r=d_bD1KLbW2_xWSw@mail.gmail.com>

On Thu, Dec 20, 2012 at 12:04:01AM +0800, G.R. wrote:
> On Wed, Dec 19, 2012 at 2:20 PM, G.R. <firemeteor@users.sourceforge.net> wrote:
> > Adding Jean, the author to the opregion patch.
> >
> > Jean, I believe the warning is due to the offset within the page.
> > To accommodate the offset, you would need to reserve another page for it.
> > Will the extra page cause any unexpected problem?
> >
> > The original thread is about an instability issue that directly freeze the host.
> > I believe this warning above should not has such effect.
> > What do you think? And any suggestion?
> >
> 
> Jean appears to be no longer reach able.
> The warning I found turns out to be not relevant.
> According to the OpRegion spec, the tail part is reserved and should
> never be touched by the guest.
> But anyway, I had a local fix to get rid of the warning, but reserving
> one more page and map it when the host opregion is not page aligned.
> I'll send it to a separate thread.
> 
> Back to the topic. I updated to xen 4.2.1 and tried three times tonight.
> Two of them lead to total freeze with no error log available, after
> game playing for a couple of minutes.
> And the last try ended up with GPU hang after 10+ minutes of game playing.
> This is a guest only hang. But I still have no way to check GPU error
> state even it has been collected:
> 
> [ 1553.588076] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1553.592112] [drm] capturing error event; look for more information
> in /debug/dri/0/i915_error_state
> [ 1582.004075] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1597.220075] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung
> [ 1613.220074] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer
> elapsed... GPU hung

Those also appear with baremetal (Linus actually mentioned this).

> 
> I'm wondering if the two syndromes are due to the same underlying cause.
> But I guess a GPU hang caused by guest driver issue should not freeze
> the host. Is it true?

It shouldn't. Is the machine usuable with this guest being frozen?
> 
> I'm going to try more with different config -- different kernel
> version, with / without PVOPS, native run vs VM etc.
> But this is kind of blindly since I have no clue at all. If you have
> anything to suspect, it will be highly appreciated.
> 
> Thanks,
> Timothy
> 
> > Thanks,
> > Timothy
> >
> > On Wed, Dec 19, 2012 at 1:28 AM, G.R. <firemeteor@users.sourceforge.net> wrote:
> >> Hi Stefano,
> >>
> >> I recently tried to play some 3D games on my linux guest.
> >> The game starts without problem but it freezes the entire system after
> >> a some time (a minute or so?).
> >> Here I mean both the host and domU are not responsive anymore.
> >> The ssh freezes and i had to shutdown the machine using power button directly.
> >>
> >> I did not find anything obvious from the host log. But from the guest,
> >> I can find this:
> >>
> >> Dec 18 20:28:38 debvm kernel: [    0.899860] resource map sanity check
> >> conflict: 0xfeff5018 0xfeff7017 0xfeff7000 0xffffffff reserved
> >> Dec 18 20:28:38 debvm kernel: [    0.899862] ------------[ cut here
> >> ]------------
> >> Dec 18 20:28:38 debvm kernel: [    0.899869] WARNING: at
> >> arch/x86/mm/ioremap.c:171 __ioremap_caller+0x2c4/0x33c()
> >> Dec 18 20:28:38 debvm kernel: [    0.899870] Hardware name: HVM domU
> >> Dec 18 20:28:38 debvm kernel: [    0.899872] Info: mapping multiple
> >> BARs. Your kernel is fine.
> >> Dec 18 20:28:38 debvm kernel: [    0.899873] Modules linked in:
> >> Dec 18 20:28:38 debvm kernel: [    0.899878] Pid: 1, comm: swapper/0
> >> Not tainted 3.6.9 #4
> >> Dec 18 20:28:38 debvm kernel: [    0.899892] Call Trace:
> >> Dec 18 20:28:38 debvm kernel: [    0.899896]  [<ffffffff8103d194>] ?
> >> warn_slowpath_common+0x76/0x8a
> >> Dec 18 20:28:38 debvm kernel: [    0.899898]  [<ffffffff8103d240>] ?
> >> warn_slowpath_fmt+0x45/0x4a
> >> Dec 18 20:28:38 debvm kernel: [    0.899900]  [<ffffffff81032a6c>] ?
> >> __ioremap_caller+0x2c4/0x33c
> >> Dec 18 20:28:38 debvm kernel: [    0.899902]  [<ffffffff812c3be3>] ?
> >> intel_opregion_setup+0x9c/0x201
> >> Dec 18 20:28:38 debvm kernel: [    0.899904]  [<ffffffff812bcb75>] ?
> >> intel_setup_gmbus+0x175/0x19d
> >> Dec 18 20:28:38 debvm kernel: [    0.899907]  [<ffffffff8128a37a>] ?
> >> i915_driver_load+0x548/0x90d
> >> Dec 18 20:28:38 debvm kernel: [    0.899910]  [<ffffffff812ff804>] ?
> >> setup_hpet_msi_remapped+0x20/0x20
> >> Dec 18 20:28:38 debvm kernel: [    0.899912]  [<ffffffff81272706>] ?
> >> drm_get_pci_dev+0x152/0x259
> >> Dec 18 20:28:38 debvm kernel: [    0.899915]  [<ffffffff813d4883>] ?
> >> _raw_spin_lock_irqsave+0x21/0x45
> >> Dec 18 20:28:38 debvm kernel: [    0.899918]  [<ffffffff811d9ecc>] ?
> >> local_pci_probe+0x5a/0xa0
> >> Dec 18 20:28:38 debvm kernel: [    0.899920]  [<ffffffff811d9fcf>] ?
> >> pci_device_probe+0xbd/0xe7
> >> Dec 18 20:28:38 debvm kernel: [    0.899922]  [<ffffffff812cd887>] ?
> >> driver_probe_device+0x1b0/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [    0.899923]  [<ffffffff812cd887>] ?
> >> driver_probe_device+0x1b0/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [    0.899925]  [<ffffffff812cd769>] ?
> >> driver_probe_device+0x92/0x1b0
> >> Dec 18 20:28:38 debvm kernel: [    0.899926]  [<ffffffff812cd8da>] ?
> >> __driver_attach+0x53/0x73
> >> Dec 18 20:28:38 debvm kernel: [    0.899928]  [<ffffffff812cc06f>] ?
> >> bus_for_each_dev+0x46/0x77
> >> Dec 18 20:28:38 debvm kernel: [    0.899930]  [<ffffffff812ccf8f>] ?
> >> bus_add_driver+0xd5/0x1f4
> >> Dec 18 20:28:38 debvm kernel: [    0.899931]  [<ffffffff812cde14>] ?
> >> driver_register+0x89/0x101
> >> Dec 18 20:28:38 debvm kernel: [    0.899933]  [<ffffffff811d9336>] ?
> >> __pci_register_driver+0x49/0xa3
> >> Dec 18 20:28:38 debvm kernel: [    0.899935]  [<ffffffff816d55c7>] ?
> >> ttm_init+0x63/0x63
> >> Dec 18 20:28:38 debvm kernel: [    0.899937]  [<ffffffff81002085>] ?
> >> do_one_initcall+0x75/0x12c
> >> Dec 18 20:28:38 debvm kernel: [    0.899940]  [<ffffffff816a6cc2>] ?
> >> kernel_init+0x13c/0x1c0
> >> Dec 18 20:28:38 debvm kernel: [    0.899941]  [<ffffffff816a6565>] ?
> >> do_early_param+0x83/0x83
> >> Dec 18 20:28:38 debvm kernel: [    0.899943]  [<ffffffff813d9f44>] ?
> >> kernel_thread_helper+0x4/0x10
> >> Dec 18 20:28:38 debvm kernel: [    0.899945]  [<ffffffff816a6b86>] ?
> >> start_kernel+0x3e1/0x3e1
> >> Dec 18 20:28:38 debvm kernel: [    0.899947]  [<ffffffff813d9f40>] ?
> >> gs_change+0x13/0x13
> >> Dec 18 20:28:38 debvm kernel: [    0.899950] ---[ end trace
> >> db461543ce599b44 ]---
> >>
> >> I'm not sure if this has anything to do with the freeze. This seems to
> >> show up on every boot after I upgraded to xen version 4.2.1-rc2. Both
> >> debian kernel 3.2.32 / 3.6.9 suffers from the same log. But whole
> >> system freeze happens only during gaming, which is much less frequent.
> >> So I'm not sure if the two are related. But anyway, could you comment
> >> about what does this log mean?
> >>
> >> I can find the one of the mentioned address in the qemu_dm log:
> >> pt_pci_write_config: [00:02:0] address=00fc val=0xfeff5000 len=4
> >> igd_write_opregion: Map OpRegion: cd996018 -> feff5018
> >> igd_write_opregion: [00:02:0] addr=fc len=2 val=feff5000
> >>
> >> PS: I also run xbmc on domU and it playbacks video under HW
> >> acceleration (VAAPI) without any problem. XBMC by itself is also an
> >> graphics intensive program. But this runs on an pure HVM guest, while
> >> the failing case is on PVHVM.
> >>
> >> PS2: I also suffered another instability yesterday. It happens when I
> >> was compiling kernel in side the domU. The host reboots suddenly.
> >> Since I'm not using graphics at that time (Xorg session is idle, I
> >> connected through SSH), this may be a different issue.
> >>
> >> Thanks,
> >> Timothy
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>

next prev parent reply	other threads:[~2012-12-21 19:38 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-18 17:28 System freeze with IGD passthrough G.R.
2012-12-19  6:20 ` G.R.
2012-12-19 16:04   ` G.R.
2012-12-20 18:20     ` G.R.
2012-12-21 19:39       ` Konrad Rzeszutek Wilk
2012-12-23  5:37         ` G.R.
2012-12-21 19:38     ` Konrad Rzeszutek Wilk [this message]
2012-12-19 18:18 ` Jean Guyader
2012-12-20 13:52   ` G.R.

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121221193857.GC30562@phenom.dumpdata.com \
    --to=konrad.wilk@oracle.com \
    --cc=firemeteor@users.sourceforge.net \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.