From: Keir Fraser <keir.fraser@eu.citrix.com>
To: "Kay, Allen M" <allen.m.kay@intel.com>,
	"Li, Xin" <xin.li@intel.com>,
	"Li, Haicheng" <haicheng.li@intel.com>,
	"'xen-devel@lists.xensource.com'" <xen-devel@lists.xensource.com>
Subject: Re: Critical bug: VT-d fault causes disk corruption or Dom0 kernel panic.
Date: Fri, 23 Jan 2009 18:41:58 +0000
Message-ID: <C59FBFF6.21CC8%keir.fraser@eu.citrix.com>
In-Reply-To: <57C9024A16AD2D4C97DC78E552063EA35EED26A5@orsmsx505.amr.corp.intel.com>

Ah, I know what it is! We actually free up bits of the Xen image at the end
of Xen bootstrap, and these can now be allocated to a domain (e.g., dom0)
and DMAed to. But these will be contained within the bounds of __pa(&_start)
and __pa(&_end) and hence will not have been mapped in dom0's VT-d tables.
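
The failure mode described above can be illustrated with a minimal C sketch.
It is not Xen's actual code: the image bounds, the helper names
(in_image_range, build_dom0_map, dma_faults) and the flat boolean table are
all assumptions made for illustration. Only xen_in_range(), __pa(&_start)
and __pa(&_end) come from the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Hypothetical physical bounds of the Xen image, standing in for
 * __pa(&_start) and __pa(&_end). The values are made up. */
static const uint64_t image_start = 0x100000;
static const uint64_t image_end   = 0x400000;

/* Range check in the spirit of xen_in_range(): true if the physical
 * address lies inside the image bounds. */
static bool in_image_range(uint64_t paddr)
{
    return paddr >= image_start && paddr < image_end;
}

/* Build a toy DMA permission table for dom0: every page is mapped
 * except those inside the image bounds. */
static void build_dom0_map(bool *mapped, size_t nr_pages)
{
    for (size_t pfn = 0; pfn < nr_pages; pfn++)
        mapped[pfn] = !in_image_range((uint64_t)pfn << PAGE_SHIFT);
}

/* A DMA to a page that was never mapped faults. This includes a page
 * inside the image bounds that was freed after boot and later handed
 * to dom0, because the table was built before the page was freed. */
static bool dma_faults(const bool *mapped, size_t pfn)
{
    return !mapped[pfn];
}
```

In this toy model, a page freed from the image and reallocated to dom0 still
sits between image_start and image_end, so dma_faults() reports a fault for
it even though dom0 legitimately owns it.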

Sadly the fact is that Xen relies on the validity of memory from the domain
heap as well as the Xen heap, so avoiding mapping Xen-critical memory in
dom0's VT-d tables is inadequate anyway, even on x86_32 and ia64.

Also it's going to be hard to do better while keeping efficiency, since if
you only map dom0's pages in its VT-d tables then PV backend drivers (which
rely on DMAing to/from other domains' pages via grant references) will not
work. You'd have to dynamically map/unmap as grants get mapped/unmapped, and
you may not want the performance hit of that.
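
The dynamic map/unmap alternative mentioned above could look roughly like the
following sketch. It is only an illustration under stated assumptions: the
toy_iommu structure, grant_map() and grant_unmap() are hypothetical names,
not Xen interfaces, and a real implementation would also need locking and
IOTLB flushes. The table_updates counter stands in for the per-grant cost
that makes this approach a performance concern.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 1024

/* Toy model of a per-domain IOMMU table with reference counting. */
struct toy_iommu {
    unsigned int  refcnt[NR_PAGES];  /* active grant mappings per page */
    bool          mapped[NR_PAGES];  /* page currently DMA-able? */
    unsigned long table_updates;     /* each would imply an IOTLB flush */
};

/* Called when a grant for pfn is mapped: install the IOMMU entry on
 * the first reference only. Returns 0 on success, -1 on a bad pfn. */
static int grant_map(struct toy_iommu *io, size_t pfn)
{
    if (pfn >= NR_PAGES)
        return -1;
    if (io->refcnt[pfn]++ == 0) {
        io->mapped[pfn] = true;
        io->table_updates++;
    }
    return 0;
}

/* Called when a grant for pfn is unmapped: remove the IOMMU entry when
 * the last reference goes away. Returns -1 if pfn is out of range or
 * has no active mapping. */
static int grant_unmap(struct toy_iommu *io, size_t pfn)
{
    if (pfn >= NR_PAGES || io->refcnt[pfn] == 0)
        return -1;
    if (--io->refcnt[pfn] == 0) {
        io->mapped[pfn] = false;
        io->table_updates++;
    }
    return 0;
}
```

Every first map and last unmap touches the table, so a backend that maps and
unmaps a grant per I/O request would pay for two table updates per request
in this model.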

I'd personally vote for getting rid of xen_in_range(). Alternatively we
could have it merely check for is_kernel_text(), but really, since it is
not in any way full protection from dom0, I wonder if it is worth the
bother at all.

What do you think?

 -- Keir

On 23/01/2009 17:30, "Kay, Allen M" <allen.m.kay@intel.com> wrote:

> I have not figured out why this is the problem yet, but I know that
> commenting it out makes the problem go away.  Leaving tboot_in_range() in
> does not cause this problem.
> 
> Allen
> 
> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Friday, January 23, 2009 12:34 AM
> To: Kay, Allen M; Li, Xin; Li, Haicheng; 'xen-devel@lists.xensource.com'
> Subject: Re: [Xen-devel] Critical bug: VT-d fault causes disk corruption or
> Dom0 kernel panic.
> 
> Are you sure that is the problem? The xen_in_range() change should make the
> dom0 VT-d table more permissive, and hence if anything less likely to
> experience VT-d faults. Also it wouldn't seem to explain problems for HVM
> guest passthrough.
> 
>  -- Keir
> 
> On 23/01/2009 01:01, "Kay, Allen M" <allen.m.kay@intel.com> wrote:
> 
>> Looks like the problem is caused by the xen_in_range() call in
>> vtd/iommu.c/intel_iommu_domain_init().  The definition of xen_in_range()
>> was changed as part of the heap patch.
>> 
>> I'm looking into changing intel_iommu_domain_init() to just map pages in
>> dom0->page_list.  However, this looks to be more complicated, as
>> d->page_list is not initialized at this stage of the boot yet.
>> 
>> Allen
>> 
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xensource.com
>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Keir Fraser
>> Sent: Thursday, January 22, 2009 1:23 AM
>> To: Li, Xin; Li, Haicheng; 'xen-devel@lists.xensource.com'
>> Subject: Re: [Xen-devel] Critical bug: VT-d fault causes disk corruption or
>> Dom0 kernel panic.
>> 
>> Mmm well not really. :-)
>> 
>> Is there any assumption in the VT-d setup about preventing access to the Xen
>> heap, and could that be broken?
>> 
>> Perhaps the VT-d pagetables are broken causing bad DMAs leading to data
>> corruption and bad command packets?
>> 
>>  -- Keir
>> 
>> On 22/01/2009 08:58, "Li, Xin" <xin.li@intel.com> wrote:
>> 
>>> We are looking into the issue too. If you have any idea on how it's caused,
>>> please tell us :-)
>>> Thanks!
>>> -Xin
>>> 
>>>> -----Original Message-----
>>>> From: xen-devel-bounces@lists.xensource.com
>>>> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Keir Fraser
>>>> Sent: Thursday, January 22, 2009 3:40 PM
>>>> To: Li, Haicheng; 'xen-devel@lists.xensource.com'
>>>> Subject: Re: [Xen-devel] Critical bug: VT-d fault causes disk corruption or
>>>> Dom0 kernel panic.
>>>> 
>>>> Thanks,
>>>> 
>>>> I haven't seen any problems outside of VT-d since c/s 19057, btw.
>>>> 
>>>> -- Keir
>>>> 
>>>> On 22/01/2009 03:42, "Li, Haicheng" <haicheng.li@intel.com> wrote:
>>>> 
>>>>> All,
>>>>> 
>>>>> We met several system failures on different hardware platforms, all of
>>>>> which are caused by VT-d faults:
>>>>> err 1: disk is corrupted by a VT-d fault on SATA.
>>>>> err 2: Dom0 kernel panics at boot, caused by a VT-d fault on UHCI.
>>>>> err 3: Dom0 complains of disk errors while creating HVM guests.
>>>>> 
>>>>> The culprit would be changeset 19054 "x86_64: Remove statically-partitioned
>>>>> Xen heap.".
>>>>> 
>>>>> Detailed error logs can be found via BZ#,
>>>>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1409.
>>>>> 
>>>>> 
>>>>> -haicheng
>>>>> _______________________________________________
>>>>> Xen-devel mailing list
>>>>> Xen-devel@lists.xensource.com
>>>>> http://lists.xensource.com/xen-devel
