* Need help to debug win7 BSOD on IGD passthrough
@ 2012-12-11 9:26 G.R.
2013-01-04 13:49 ` G.R.
0 siblings, 1 reply; 22+ messages in thread
From: G.R. @ 2012-12-11 9:26 UTC (permalink / raw)
To: xen-devel
[-- Attachment #1.1: Type: text/plain, Size: 1085 bytes --]
Hi guys,
I met win7 vga passthrough problem and would like to analysis the memory
dump to identify the issues. Could anybody lend me a hand when I'm in
trouble?
Some details below and would post follow up later.
Any suggestion on the triage / experiment direction would be appreciated.
I have a working build for passing through intel HD4000 to a debian domU.
But win7 does not quite work. Issues I've *randomly *met so far:
1) black screen after windows loading files
2) failed to access install media (I access install media / virtual disk
through NFS)
3) BSOD memory_mangement during boot
4) BSOD system_service_exception during boot
As I continuously try restarting the domU, syndrome keeps change randomly.
At first glance, these appear to be irrelevant. But they only happens when
I try IGD passthrough.
And these appear both during installation and, if I finish install by
temporally disable vga passthrough, during reboot after video driver
install.
I'm now going to download windows SDK to analysis the memory dump.
Will likely need your expertise later.
Thanks,
Timothy
[-- Attachment #1.2: Type: text/html, Size: 1178 bytes --]
[-- Attachment #2: Type: text/plain, Size: 126 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2012-12-11 9:26 Need help to debug win7 BSOD on IGD passthrough G.R.
@ 2013-01-04 13:49 ` G.R.
2013-01-04 15:53 ` Ross Philipson
0 siblings, 1 reply; 22+ messages in thread
From: G.R. @ 2013-01-04 13:49 UTC (permalink / raw)
To: xen-devel, Ross Philipson
Hi Ross,
Let's reuse this thread to discuss the win7 64bit BSOD issue in gfx
passthrough env.
I started this thread but later switched my focus to linux domU. It's
time to pick it up again.
Quoting your reply from the opregion patch thread:
>> Hmm, I found a similar issue with BSODs like this and I tacked it down to the Windows igfx drivers expecting to find vendor capabilities in the GMCH device (the root device at 00:00.0). It was actually looking there for capabilities specific to the gfx card. We fixed this by patching qemu to pass in the vendor capabilities on that device. I am not sure if that made it upstream - I will have to check.
>
> Great news to hear. It'll be great if you can locate that patch for me.
> I'm not sure which upstream you are referring to?
> I'm using the qemu-xen 4.2.1 version. I remember the upstream qemu
> does not support gfx-passthrough.
So here is the lspci -vvv -s 00:00.0 output from the dom0:
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core
processor DRAM Controller (rev 09)
Subsystem: ASRock Incorporation Device 0150
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort+ >SERR- <PERR- INTx-
Latency: 0
Capabilities: [e0] Vendor Specific Information: Len=0c <?>
And Linux domU:
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core
processor DRAM Controller (rev 09)
Subsystem: ASRock Incorporation Device 0150
Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
One notable difference is the 'Vendor Specific Information' line in
the host is missing in the domU.
Is it the fix you mention in your comment?
But the fix you mention seems to be igfx driver specific.
But I also observe BSOD without it -- I guess the system run with
basic VGA driver in that case.
Is it the expected behavior?
One more question: does the windows igfx passthrough rely on 1:1
mapping of MMIO bars?
According to tutorials, nvidia cards require such trick.
But it seems that igfx linux driver does not require this -- the lspci
output shows different memory region on dom0 vs domU.
Is there any info you are interested?
Thanks,
Timothy
On Tue, Dec 11, 2012 at 5:26 PM, G.R. <firemeteor@users.sourceforge.net> wrote:
> Hi guys,
> I met win7 vga passthrough problem and would like to analysis the memory
> dump to identify the issues. Could anybody lend me a hand when I'm in
> trouble?
>
> Some details below and would post follow up later.
> Any suggestion on the triage / experiment direction would be appreciated.
>
> I have a working build for passing through intel HD4000 to a debian domU.
> But win7 does not quite work. Issues I've randomly met so far:
> 1) black screen after windows loading files
> 2) failed to access install media (I access install media / virtual disk
> through NFS)
> 3) BSOD memory_mangement during boot
> 4) BSOD system_service_exception during boot
> As I continuously try restarting the domU, syndrome keeps change randomly.
>
> At first glance, these appear to be irrelevant. But they only happens when I
> try IGD passthrough.
> And these appear both during installation and, if I finish install by
> temporally disable vga passthrough, during reboot after video driver
> install.
>
> I'm now going to download windows SDK to analysis the memory dump.
> Will likely need your expertise later.
>
> Thanks,
> Timothy
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-04 13:49 ` G.R.
@ 2013-01-04 15:53 ` Ross Philipson
2013-01-04 16:30 ` G.R.
0 siblings, 1 reply; 22+ messages in thread
From: Ross Philipson @ 2013-01-04 15:53 UTC (permalink / raw)
To: G.R., xen-devel
> -----Original Message-----
> From: firemeteor.guo@gmail.com [mailto:firemeteor.guo@gmail.com] On
> Behalf Of G.R.
> Sent: Friday, January 04, 2013 8:50 AM
> To: xen-devel; Ross Philipson
> Subject: Re: Need help to debug win7 BSOD on IGD passthrough
>
> Hi Ross,
> Let's reuse this thread to discuss the win7 64bit BSOD issue in gfx
> passthrough env.
> I started this thread but later switched my focus to linux domU. It's
> time to pick it up again.
>
> Quoting your reply from the opregion patch thread:
> >> Hmm, I found a similar issue with BSODs like this and I tacked it
> down to the Windows igfx drivers expecting to find vendor capabilities
> in the GMCH device (the root device at 00:00.0). It was actually looking
> there for capabilities specific to the gfx card. We fixed this by
> patching qemu to pass in the vendor capabilities on that device. I am
> not sure if that made it upstream - I will have to check.
> >
> > Great news to hear. It'll be great if you can locate that patch for
> me.
> > I'm not sure which upstream you are referring to?
> > I'm using the qemu-xen 4.2.1 version. I remember the upstream qemu
> > does not support gfx-passthrough.
>
> So here is the lspci -vvv -s 00:00.0 output from the dom0:
> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core
> processor DRAM Controller (rev 09)
> Subsystem: ASRock Incorporation Device 0150
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx-
> Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort+ >SERR- <PERR- INTx-
> Latency: 0
> Capabilities: [e0] Vendor Specific Information: Len=0c <?>
>
> And Linux domU:
> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core
> processor DRAM Controller (rev 09)
> Subsystem: ASRock Incorporation Device 0150
> Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx-
> Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 0
>
> One notable difference is the 'Vendor Specific Information' line in
> the host is missing in the domU.
> Is it the fix you mention in your comment?
Yes that is exactly what I am talking about. The igfx Windows driver
reads those vendor caps and does a bunch of internal driver setup
using them. We found crashes directly related to their absence.
I am not sure how to account for the crashes w/o the igfx
drivers - perhaps that is a different crash or perhaps the
crash is unrelated to those capabilities.
The original patch I made (>2yrs ago) is completely out of date with
newer versions of qemu. But it basically followed the caps chain out
of register 0x34 and if it the value requested was vendor types
caps (type 0x09) on the host bridge it read them directly.
>
> But the fix you mention seems to be igfx driver specific.
> But I also observe BSOD without it -- I guess the system run with
> basic VGA driver in that case.
> Is it the expected behavior?
>
> One more question: does the windows igfx passthrough rely on 1:1
> mapping of MMIO bars?
> According to tutorials, nvidia cards require such trick.
> But it seems that igfx linux driver does not require this -- the lspci
> output shows different memory region on dom0 vs domU.
It sounds right that it doesn't need to be but I am not
sure; maybe someone else can confirm.
>
> Is there any info you are interested?
>
> Thanks,
> Timothy
>
> On Tue, Dec 11, 2012 at 5:26 PM, G.R. <firemeteor@users.sourceforge.net>
> wrote:
> > Hi guys,
> > I met win7 vga passthrough problem and would like to analysis the
> memory
> > dump to identify the issues. Could anybody lend me a hand when I'm in
> > trouble?
> >
> > Some details below and would post follow up later.
> > Any suggestion on the triage / experiment direction would be
> appreciated.
> >
> > I have a working build for passing through intel HD4000 to a debian
> domU.
> > But win7 does not quite work. Issues I've randomly met so far:
> > 1) black screen after windows loading files
> > 2) failed to access install media (I access install media / virtual
> disk
> > through NFS)
> > 3) BSOD memory_mangement during boot
> > 4) BSOD system_service_exception during boot
> > As I continuously try restarting the domU, syndrome keeps change
> randomly.
> >
> > At first glance, these appear to be irrelevant. But they only happens
> when I
> > try IGD passthrough.
> > And these appear both during installation and, if I finish install by
> > temporally disable vga passthrough, during reboot after video driver
> > install.
> >
> > I'm now going to download windows SDK to analysis the memory dump.
> > Will likely need your expertise later.
> >
> > Thanks,
> > Timothy
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-04 15:53 ` Ross Philipson
@ 2013-01-04 16:30 ` G.R.
2013-01-04 18:14 ` Jean Guyader
0 siblings, 1 reply; 22+ messages in thread
From: G.R. @ 2013-01-04 16:30 UTC (permalink / raw)
To: Ross Philipson, Jean Guyader, Stefano Stabellini; +Cc: xen-devel
>>
>> So here is the lspci -vvv -s 00:00.0 output from the dom0:
>> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core
>> processor DRAM Controller (rev 09)
>> Subsystem: ASRock Incorporation Device 0150
>> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> Stepping- SERR- FastB2B- DisINTx-
>> Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort+ >SERR- <PERR- INTx-
>> Latency: 0
>> Capabilities: [e0] Vendor Specific Information: Len=0c <?>
>>
>> And Linux domU:
>> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core
>> processor DRAM Controller (rev 09)
>> Subsystem: ASRock Incorporation Device 0150
>> Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> Stepping- SERR- FastB2B- DisINTx-
>> Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>> Latency: 0
>>
>> One notable difference is the 'Vendor Specific Information' line in
>> the host is missing in the domU.
>> Is it the fix you mention in your comment?
>
> Yes that is exactly what I am talking about. The igfx Windows driver
> reads those vendor caps and does a bunch of internal driver setup
> using them. We found crashes directly related to their absence.
> I am not sure how to account for the crashes w/o the igfx
> drivers - perhaps that is a different crash or perhaps the
> crash is unrelated to those capabilities.
>
> The original patch I made (>2yrs ago) is completely out of date with
> newer versions of qemu. But it basically followed the caps chain out
> of register 0x34 and if it the value requested was vendor types
> caps (type 0x09) on the host bridge it read them directly.
>
I did a quick grep in the list, it seems that Jean was trying to
submit the patch.
http://lists.xen.org/archives/html/xen-devel/2012-01/msg01129.html
http://lists.xen.org/archives/html/xen-devel/2012-01/msg01128.html
I guess this is lost accidentally? I don't see obvious negative
feedback on the thread.
Adding Jean, and Stefano who reviewed the patch.
Will try this out tomorrow. Hope the patch can work without big modification.
Jean, do you remember if there are other related patches to make gfx
passthrough working with windows?
I would like to check if there are more missed.
>>
>> But the fix you mention seems to be igfx driver specific.
>> But I also observe BSOD without it -- I guess the system run with
>> basic VGA driver in that case.
>> Is it the expected behavior?
>>
>> One more question: does the windows igfx passthrough rely on 1:1
>> mapping of MMIO bars?
>> According to tutorials, nvidia cards require such trick.
>> But it seems that igfx linux driver does not require this -- the lspci
>> output shows different memory region on dom0 vs domU.
>
> It sounds right that it doesn't need to be but I am not
> sure; maybe someone else can confirm.
>
Jean, do you happen to know if the igfx card rely on 1:1 MMIO bar
mapping to work on windows?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-04 16:30 ` G.R.
@ 2013-01-04 18:14 ` Jean Guyader
2013-01-05 5:08 ` G.R.
0 siblings, 1 reply; 22+ messages in thread
From: Jean Guyader @ 2013-01-04 18:14 UTC (permalink / raw)
To: G.R.; +Cc: xen-devel, Ross Philipson, Stefano Stabellini
On Fri, Jan 4, 2013 at 8:30 AM, G.R. <firemeteor@users.sourceforge.net> wrote:
>>>
>>> So here is the lspci -vvv -s 00:00.0 output from the dom0:
>>> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core
>>> processor DRAM Controller (rev 09)
>>> Subsystem: ASRock Incorporation Device 0150
>>> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>>> Stepping- SERR- FastB2B- DisINTx-
>>> Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
>>> <TAbort- <MAbort+ >SERR- <PERR- INTx-
>>> Latency: 0
>>> Capabilities: [e0] Vendor Specific Information: Len=0c <?>
>>>
>>> And Linux domU:
>>> 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core
>>> processor DRAM Controller (rev 09)
>>> Subsystem: ASRock Incorporation Device 0150
>>> Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>>> Stepping- SERR- FastB2B- DisINTx-
>>> Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>> Latency: 0
>>>
>>> One notable difference is the 'Vendor Specific Information' line in
>>> the host is missing in the domU.
>>> Is it the fix you mention in your comment?
>>
>> Yes that is exactly what I am talking about. The igfx Windows driver
>> reads those vendor caps and does a bunch of internal driver setup
>> using them. We found crashes directly related to their absence.
>> I am not sure how to account for the crashes w/o the igfx
>> drivers - perhaps that is a different crash or perhaps the
>> crash is unrelated to those capabilities.
>>
>> The original patch I made (>2yrs ago) is completely out of date with
>> newer versions of qemu. But it basically followed the caps chain out
>> of register 0x34 and if it the value requested was vendor types
>> caps (type 0x09) on the host bridge it read them directly.
>>
> I did a quick grep in the list, it seems that Jean was trying to
> submit the patch.
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01129.html
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01128.html
>
> I guess this is lost accidentally? I don't see obvious negative
> feedback on the thread.
> Adding Jean, and Stefano who reviewed the patch.
>
> Will try this out tomorrow. Hope the patch can work without big modification.
>
> Jean, do you remember if there are other related patches to make gfx
> passthrough working with windows?
> I would like to check if there are more missed.
>
Ho right I remember those patch, I thought they went in but I guess I was
wrong. Have you tried intel vga passthrough with those patches, does that
make any difference?
Also you should try to remove the OpRegion patch all together and see
if the system is more stable after that.
How much RAM do you give to your guest? We were seeing random BSOD
with the Windows driver when the guest had more than 4G of RAM.
To play it safe you should use 3G for your guest.
>>>
>>> But the fix you mention seems to be igfx driver specific.
>>> But I also observe BSOD without it -- I guess the system run with
>>> basic VGA driver in that case.
>>> Is it the expected behavior?
>>>
>>> One more question: does the windows igfx passthrough rely on 1:1
>>> mapping of MMIO bars?
>>> According to tutorials, nvidia cards require such trick.
>>> But it seems that igfx linux driver does not require this -- the lspci
>>> output shows different memory region on dom0 vs domU.
>>
>> It sounds right that it doesn't need to be but I am not
>> sure; maybe someone else can confirm.
>>
> Jean, do you happen to know if the igfx card rely on 1:1 MMIO bar
> mapping to work on windows?
The windows igfx driver doesn't require the bar to be 1:1, it requires the
device to be at the same BDF though.
Jean
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-04 18:14 ` Jean Guyader
@ 2013-01-05 5:08 ` G.R.
2013-01-06 14:16 ` G.R.
0 siblings, 1 reply; 22+ messages in thread
From: G.R. @ 2013-01-05 5:08 UTC (permalink / raw)
To: Jean Guyader; +Cc: xen-devel, Ross Philipson, Stefano Stabellini
>>>
>> I did a quick grep in the list, it seems that Jean was trying to
>> submit the patch.
>> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01129.html
>> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01128.html
>>
>> I guess this is lost accidentally? I don't see obvious negative
>> feedback on the thread.
>> Adding Jean, and Stefano who reviewed the patch.
>>
>> Will try this out tomorrow. Hope the patch can work without big modification.
>>
>> Jean, do you remember if there are other related patches to make gfx
>> passthrough working with windows?
>> I would like to check if there are more missed.
>>
>
> Ho right I remember those patch, I thought they went in but I guess I was
> wrong. Have you tried intel vga passthrough with those patches, does that
> make any difference?
Haven't got chance yet. Will try it out this weekend to see how it behaves.
> Also you should try to remove the OpRegion patch all together and see
> if the system is more stable after that.
I started my xen journey with version 4.1.3, which does not have the
OpRegion patch.
But I don't think the windows domU is better with that version.
Currently the linux domU works just fine with the OpRegion patch.
So I don't believe this is causing issues. But anyway I'll give it a
shot if I can't solve it in other ways.
> How much RAM do you give to your guest? We were seeing random BSOD
> with the Windows driver when the guest had more than 4G of RAM.
> To play it safe you should use 3G for your guest.
Thanks for this hint. I don't remember the number and need to check it
out tonight.
Could you clarify if it will cause problem for exact 4GB of RAM?
It's not likely that I give it 4GB+, but 4GB is a possible number.
>
>>>>
>>>> But the fix you mention seems to be igfx driver specific.
>>>> But I also observe BSOD without it -- I guess the system run with
>>>> basic VGA driver in that case.
>>>> Is it the expected behavior?
>>>>
>>>> One more question: does the windows igfx passthrough rely on 1:1
>>>> mapping of MMIO bars?
>>>> According to tutorials, nvidia cards require such trick.
>>>> But it seems that igfx linux driver does not require this -- the lspci
>>>> output shows different memory region on dom0 vs domU.
>>>
>>> It sounds right that it doesn't need to be but I am not
>>> sure; maybe someone else can confirm.
>>>
>> Jean, do you happen to know if the igfx card rely on 1:1 MMIO bar
>> mapping to work on windows?
>
> The windows igfx driver doesn't require the bar to be 1:1, it requires the
> device to be at the same BDF though.
So it means that the gfx-passthrough config must be enabled. Thanks
for this info.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-05 5:08 ` G.R.
@ 2013-01-06 14:16 ` G.R.
2013-01-06 15:32 ` Pasi Kärkkäinen
` (3 more replies)
0 siblings, 4 replies; 22+ messages in thread
From: G.R. @ 2013-01-06 14:16 UTC (permalink / raw)
To: Jean Guyader, Ross Philipson, Stefano Stabellini; +Cc: xen-devel
>> Also you should try to remove the OpRegion patch all together and see
>> if the system is more stable after that.
> I started my xen journey with version 4.1.3, which does not have the
> OpRegion patch.
> But I don't think the windows domU is better with that version.
>
> Currently the linux domU works just fine with the OpRegion patch.
> So I don't believe this is causing issues. But anyway I'll give it a
> shot if I can't solve it in other ways.
>
>> How much RAM do you give to your guest? We were seeing random BSOD
>> with the Windows driver when the guest had more than 4G of RAM.
>> To play it safe you should use 3G for your guest.
>
> Thanks for this hint. I don't remember the number and need to check it
> out tonight.
> Could you clarify if it will cause problem for exact 4GB of RAM?
> It's not likely that I give it 4GB+, but 4GB is a possible number.
I'm so glad to report that the win7 64 bit domU works with the following steps:
1. Vendor caps fix.
2. Memory limited to 3GB.
Actually I'm not sure if #1 is required for the gfx driver since I did
not try #2 alone.
What I can confirm is that #2 is absolutely required and #1 + #2
produces a working system.
As I can observe, #2 fixes random BSOD without igfx drivers.
My original number is 4048MB.
So it appears that there are something wrong in the guest address
range of [0xc0000000 ~ 0xfd000000).
I would like to do some investigation since 3GB is a big limitation
for a 64 bit domU.
I remember traditionally most of the 3G ~ 4G range are reserved for
PCI config region.
My first guess is that some exposed PCI device appears to be buggy to windows.
Any suggestions how to investigate? I'm not very familiar with this.
For the vendor caps patch, I had to do some tweak to make it compile
sine that was a bit ancient.
I also find some small issues in that patch. Actually there is a CAPS
bit in the status register (0x6).
Without asserting this bit, lspci will not try to check the vendor
caps. I'm not sure if this is required
by the igfx windows driver, but this at least better adhere to the spec.
Apart from the Vender cap fix, I also find a gfx reset patch lost in the list.
http://lists.xen.org/archives/html/xen-devel/2012-01/msg02755.html
http://lists.xen.org/archives/html/xen-devel/2012-01/msg02754.html
Without this fix, the screen would flash during domU boot (start from
the second boot).
This can be worked around by bind the device to i915 driver and then
to pciback each time launching a new guest.
But this workaround does not work for rebooting guest.
So I've accumulated many local fixes to make intel gpu passthrough work.
If you believe they are worthwhile, I'll clean up the fix and submit the patch.
List of local fixes:
1. Stefano's intel pch init fix, patch submitted and acked by Stefano.
Not sure if accepted && submitted.
2. Patch for un-aligned OpRegion. Got feedback on security concern.
The impact is not yet clear.
3. Vendor cap fix. Jean && Ross's patch about one year ago, lost in
the devel list.
4. igfx reset fix. Jean's patch about one year ago, lost in the devel list.
I think devel list is not the best way to track issue && fixes. I
wounder if there are public bug tracking systems.
Thanks,
Timothy
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-06 14:16 ` G.R.
@ 2013-01-06 15:32 ` Pasi Kärkkäinen
2013-01-07 10:38 ` Stefano Stabellini
` (2 subsequent siblings)
3 siblings, 0 replies; 22+ messages in thread
From: Pasi Kärkkäinen @ 2013-01-06 15:32 UTC (permalink / raw)
To: G.R.; +Cc: xen-devel, Ross Philipson, Jean Guyader, Stefano Stabellini
On Sun, Jan 06, 2013 at 10:16:33PM +0800, G.R. wrote:
> >> Also you should try to remove the OpRegion patch all together and see
> >> if the system is more stable after that.
> > I started my xen journey with version 4.1.3, which does not have the
> > OpRegion patch.
> > But I don't think the windows domU is better with that version.
> >
> > Currently the linux domU works just fine with the OpRegion patch.
> > So I don't believe this is causing issues. But anyway I'll give it a
> > shot if I can't solve it in other ways.
> >
> >> How much RAM do you give to your guest? We were seeing random BSOD
> >> with the Windows driver when the guest had more than 4G of RAM.
> >> To play it safe you should use 3G for your guest.
> >
> > Thanks for this hint. I don't remember the number and need to check it
> > out tonight.
> > Could you clarify if it will cause problem for exact 4GB of RAM?
> > It's not likely that I give it 4GB+, but 4GB is a possible number.
>
> I'm so glad to report that the win7 64 bit domU works with the following steps:
> 1. Vendor caps fix.
> 2. Memory limited to 3GB.
>
> Actually I'm not sure if #1 is required for the gfx driver since I did
> not try #2 alone.
> What I can confirm is that #2 is absolutely required and #1 + #2
> produces a working system.
>
That's good to hear! Thanks for all the debugging!
> As I can observe, #2 fixes random BSOD without igfx drivers.
> My original number is 4048MB.
> So it appears that there are something wrong in the guest address
> range of [0xc0000000 ~ 0xfd000000).
>
> I would like to do some investigation since 3GB is a big limitation
> for a 64 bit domU.
> I remember traditionally most of the 3G ~ 4G range are reserved for
> PCI config region.
> My first guess is that some exposed PCI device appears to be buggy to windows.
> Any suggestions how to investigate? I'm not very familiar with this.
>
>
> For the vendor caps patch, I had to do some tweak to make it compile
> sine that was a bit ancient.
> I also find some small issues in that patch. Actually there is a CAPS
> bit in the status register (0x6).
> Without asserting this bit, lspci will not try to check the vendor
> caps. I'm not sure if this is required
> by the igfx windows driver, but this at least better adhere to the spec.
>
> Apart from the Vender cap fix, I also find a gfx reset patch lost in the list.
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg02755.html
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg02754.html
> Without this fix, the screen would flash during domU boot (start from
> the second boot).
> This can be worked around by bind the device to i915 driver and then
> to pciback each time launching a new guest.
> But this workaround does not work for rebooting guest.
>
>
> So I've accumulated many local fixes to make intel gpu passthrough work.
> If you believe they are worthwhile, I'll clean up the fix and submit the patch.
>
Yep! Please send all of the fixes as separate patches, against xen-unstable,
with Signed-Off-By lines etc.
> List of local fixes:
> 1. Stefano's intel pch init fix, patch submitted and acked by Stefano.
> Not sure if accepted && submitted.
> 2. Patch for un-aligned OpRegion. Got feedback on security concern.
> The impact is not yet clear.
> 3. Vendor cap fix. Jean && Ross's patch about one year ago, lost in
> the devel list.
> 4. igfx reset fix. Jean's patch about one year ago, lost in the devel list.
>
> I think devel list is not the best way to track issue && fixes. I
> wounder if there are public bug tracking systems.
>
There's the xen.org bugzilla, but it's not very much used..
It's better to just submit all the patches and get them merged to xen-unstable!
Thanks!
-- Pasi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-06 14:16 ` G.R.
2013-01-06 15:32 ` Pasi Kärkkäinen
@ 2013-01-07 10:38 ` Stefano Stabellini
2013-01-09 15:07 ` G.R.
2013-01-07 15:51 ` Ross Philipson
2013-01-09 16:37 ` Stefano Stabellini
3 siblings, 1 reply; 22+ messages in thread
From: Stefano Stabellini @ 2013-01-07 10:38 UTC (permalink / raw)
To: G.R.
Cc: xen-devel, Ian Jackson, Ross Philipson, Jean Guyader,
Stefano Stabellini
On Sun, 6 Jan 2013, G.R. wrote:
> >> Also you should try to remove the OpRegion patch all together and see
> >> if the system is more stable after that.
> > I started my xen journey with version 4.1.3, which does not have the
> > OpRegion patch.
> > But I don't think the windows domU is better with that version.
> >
> > Currently the linux domU works just fine with the OpRegion patch.
> > So I don't believe this is causing issues. But anyway I'll give it a
> > shot if I can't solve it in other ways.
> >
> >> How much RAM do you give to your guest? We were seeing random BSOD
> >> with the Windows driver when the guest had more than 4G of RAM.
> >> To play it safe you should use 3G for your guest.
> >
> > Thanks for this hint. I don't remember the number and need to check it
> > out tonight.
> > Could you clarify if it will cause problem for exact 4GB of RAM?
> > It's not likely that I give it 4GB+, but 4GB is a possible number.
>
> I'm so glad to report that the win7 64 bit domU works with the following steps:
> 1. Vendor caps fix.
> 2. Memory limited to 3GB.
>
> Actually I'm not sure if #1 is required for the gfx driver since I did
> not try #2 alone.
> What I can confirm is that #2 is absolutely required and #1 + #2
> produces a working system.
>
> As I can observe, #2 fixes random BSOD without igfx drivers.
> My original number is 4048MB.
> So it appears that there are something wrong in the guest address
> range of [0xc0000000 ~ 0xfd000000).
>
> I would like to do some investigation since 3GB is a big limitation
> for a 64 bit domU.
> I remember traditionally most of the 3G ~ 4G range are reserved for
> PCI config region.
> My first guess is that some exposed PCI device appears to be buggy to windows.
> Any suggestions how to investigate? I'm not very familiar with this.
>
>
> For the vendor caps patch, I had to do some tweak to make it compile
> sine that was a bit ancient.
> I also find some small issues in that patch. Actually there is a CAPS
> bit in the status register (0x6).
> Without asserting this bit, lspci will not try to check the vendor
> caps. I'm not sure if this is required
> by the igfx windows driver, but this at least better adhere to the spec.
>
> Apart from the Vender cap fix, I also find a gfx reset patch lost in the list.
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg02755.html
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg02754.html
> Without this fix, the screen would flash during domU boot (start from
> the second boot).
> This can be worked around by bind the device to i915 driver and then
> to pciback each time launching a new guest.
> But this workaround does not work for rebooting guest.
>
>
> So I've accumulated many local fixes to make intel gpu passthrough work.
> If you believe they are worthwhile, I'll clean up the fix and submit the patch.
>
> List of local fixes:
> 1. Stefano's intel pch init fix, patch submitted and acked by Stefano.
> Not sure if accepted && submitted.
> 2. Patch for un-aligned OpRegion. Got feedback on security concern.
> The impact is not yet clear.
> 3. Vendor cap fix. Jean && Ross's patch about one year ago, lost in
> the devel list.
> 4. igfx reset fix. Jean's patch about one year ago, lost in the devel list.
>
> I think devel list is not the best way to track issue && fixes. I
> wounder if there are public bug tracking systems.
Ian Jackson is responsible for applying patches to the
qemu-xen-traditional tree.
However if you care to forward port the patches to upstream QEMU, that
now has PCI passthrough support but it is still completely missing gfx
passthrough, I would be happy to commit them and send them upstream.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-06 14:16 ` G.R.
2013-01-06 15:32 ` Pasi Kärkkäinen
2013-01-07 10:38 ` Stefano Stabellini
@ 2013-01-07 15:51 ` Ross Philipson
2013-01-09 16:37 ` Stefano Stabellini
3 siblings, 0 replies; 22+ messages in thread
From: Ross Philipson @ 2013-01-07 15:51 UTC (permalink / raw)
To: G.R., Jean Guyader, Stefano Stabellini; +Cc: xen-devel
> -----Original Message-----
> From: firemeteor.guo@gmail.com [mailto:firemeteor.guo@gmail.com] On
> Behalf Of G.R.
> Sent: Sunday, January 06, 2013 9:17 AM
> To: Jean Guyader; Ross Philipson; Stefano Stabellini
> Cc: xen-devel
> Subject: Re: Need help to debug win7 BSOD on IGD passthrough
>
> >> Also you should try to remove the OpRegion patch all together and see
> >> if the system is more stable after that.
> > I started my xen journey with version 4.1.3, which does not have the
> > OpRegion patch.
> > But I don't think the windows domU is better with that version.
> >
> > Currently the linux domU works just fine with the OpRegion patch.
> > So I don't believe this is causing issues. But anyway I'll give it a
> > shot if I can't solve it in other ways.
> >
> >> How much RAM do you give to your guest? We were seeing random BSOD
> >> with the Windows driver when the guest had more than 4G of RAM.
> >> To play it safe you should use 3G for your guest.
> >
> > Thanks for this hint. I don't remember the number and need to check it
> > out tonight.
> > Could you clarify if it will cause problem for exact 4GB of RAM?
> > It's not likely that I give it 4GB+, but 4GB is a possible number.
>
> I'm so glad to report that the win7 64 bit domU works with the following
> steps:
> 1. Vendor caps fix.
> 2. Memory limited to 3GB.
>
> Actually I'm not sure if #1 is required for the gfx driver since I did
> not try #2 alone.
> What I can confirm is that #2 is absolutely required and #1 + #2
> produces a working system.
That is great news. I am certain the vendor caps fix was needed for
addressing BSODs in (certain versions perhaps of) the Windows igfx
drivers.
>
> As I can observe, #2 fixes random BSOD without igfx drivers.
> My original number is 4048MB.
> So it appears that there are something wrong in the guest address
> range of [0xc0000000 ~ 0xfd000000).
>
> I would like to do some investigation since 3GB is a big limitation
> for a 64 bit domU.
> I remember traditionally most of the 3G ~ 4G range are reserved for
> PCI config region.
> My first guess is that some exposed PCI device appears to be buggy to
> windows.
> Any suggestions how to investigate? I'm not very familiar with this.
>
>
> For the vendor caps patch, I had to do some tweak to make it compile
> sine that was a bit ancient.
> I also find some small issues in that patch. Actually there is a CAPS
> bit in the status register (0x6).
> Without asserting this bit, lspci will not try to check the vendor
> caps. I'm not sure if this is required
> by the igfx windows driver, but this at least better adhere to the spec.
The 2.1 PCI spec indicates this bit is used for this purpose (UDF Supported)
but 2.2 and 3.0 spec say this bit is reserved. Vendor caps have always seemed
to work w/o this bit set. Not sure what the best way to go with this is.
>
> Apart from the Vender cap fix, I also find a gfx reset patch lost in the
> list.
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg02755.html
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg02754.html
> Without this fix, the screen would flash during domU boot (start from
> the second boot).
> This can be worked around by bind the device to i915 driver and then
> to pciback each time launching a new guest.
> But this workaround does not work for rebooting guest.
>
>
> So I've accumulated many local fixes to make intel gpu passthrough work.
> If you believe they are worthwhile, I'll clean up the fix and submit the
> patch.
>
> List of local fixes:
> 1. Stefano's intel pch init fix, patch submitted and acked by Stefano.
> Not sure if accepted && submitted.
> 2. Patch for un-aligned OpRegion. Got feedback on security concern.
> The impact is not yet clear.
> 3. Vendor cap fix. Jean && Ross's patch about one year ago, lost in
> the devel list.
> 4. igfx reset fix. Jean's patch about one year ago, lost in the devel
> list.
>
> I think devel list is not the best way to track issue && fixes. I
> wounder if there are public bug tracking systems.
>
> Thanks,
> Timothy
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-07 10:38 ` Stefano Stabellini
@ 2013-01-09 15:07 ` G.R.
2013-01-09 16:12 ` Stefano Stabellini
2013-01-09 16:21 ` Ian Jackson
0 siblings, 2 replies; 22+ messages in thread
From: G.R. @ 2013-01-09 15:07 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Ian Jackson, Ross Philipson, Jean Guyader, xen-devel
>> So I've accumulated many local fixes to make intel gpu passthrough work.
>> If you believe they are worthwhile, I'll clean up the fix and submit the patch.
>>
>> List of local fixes:
>> 1. Stefano's intel pch init fix, patch submitted and acked by Stefano.
>> Not sure if accepted && submitted.
>> 2. Patch for un-aligned OpRegion. Got feedback on security concern.
>> The impact is not yet clear.
>> 3. Vendor cap fix. Jean && Ross's patch about one year ago, lost in
>> the devel list.
>> 4. igfx reset fix. Jean's patch about one year ago, lost in the devel list.
>>
>> I think devel list is not the best way to track issue && fixes. I
>> wounder if there are public bug tracking systems.
>
> Ian Jackson is responsible for applying patches to the
> qemu-xen-traditional tree.
>
> However if you care to forward port the patches to upstream QEMU, that
> now has PCI passthrough support but it is still completely missing gfx
> passthrough, I would be happy to commit them and send them upstream.
Let me start with the qemu-xen-traditional tree, which I'm currently using.
I'll consider porting them to upstream QEMU once they are accepted.
Just some questions first:
1. Do I need to switch to the unstable tree? I'm currently on 4.2.1.
2. Any guide to generate patch series in git?
The wiki has a tutorial for Mercurial, but not git.
Generating well-formed patch series from a single file manually is not
a pleasant task.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-09 15:07 ` G.R.
@ 2013-01-09 16:12 ` Stefano Stabellini
2013-01-09 16:21 ` Ian Jackson
1 sibling, 0 replies; 22+ messages in thread
From: Stefano Stabellini @ 2013-01-09 16:12 UTC (permalink / raw)
To: G.R.
Cc: xen-devel, Ian Jackson, Ross Philipson, Jean Guyader,
Stefano Stabellini
On Wed, 9 Jan 2013, G.R. wrote:
> >> So I've accumulated many local fixes to make intel gpu passthrough work.
> >> If you believe they are worthwhile, I'll clean up the fix and submit the patch.
> >>
> >> List of local fixes:
> >> 1. Stefano's intel pch init fix, patch submitted and acked by Stefano.
> >> Not sure if accepted && submitted.
> >> 2. Patch for un-aligned OpRegion. Got feedback on security concern.
> >> The impact is not yet clear.
> >> 3. Vendor cap fix. Jean && Ross's patch about one year ago, lost in
> >> the devel list.
> >> 4. igfx reset fix. Jean's patch about one year ago, lost in the devel list.
> >>
> >> I think devel list is not the best way to track issue && fixes. I
> >> wounder if there are public bug tracking systems.
> >
> > Ian Jackson is responsible for applying patches to the
> > qemu-xen-traditional tree.
> >
> > However if you care to forward port the patches to upstream QEMU, that
> > now has PCI passthrough support but it is still completely missing gfx
> > passthrough, I would be happy to commit them and send them upstream.
>
> Let me start with the qemu-xen-traditional tree, which I'm currently using.
> I'll consider porting them to upstream QEMU once they are accepted.
>
> Just some questions first:
> 1. Do I need to switch to the unstable tree? I'm currently on 4.2.1.
Yes
> 2. Any guide to generate patch series in git?
> The wiki has a tutorial for Mercurial, but not git.
> Generating well-formed patch series from a single file manually is not
> a pleasant task.
I personally use guilt, that works the same way as mercurial with patch
queues. However many people in the git world just use git, different
branches and git rebase -i.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-09 15:07 ` G.R.
2013-01-09 16:12 ` Stefano Stabellini
@ 2013-01-09 16:21 ` Ian Jackson
1 sibling, 0 replies; 22+ messages in thread
From: Ian Jackson @ 2013-01-09 16:21 UTC (permalink / raw)
To: G.R.; +Cc: xen-devel, Ross Philipson, Jean Guyader, Stefano Stabellini
G.R. writes ("Re: Need help to debug win7 BSOD on IGD passthrough"):
> 2. Any guide to generate patch series in git?
> The wiki has a tutorial for Mercurial, but not git.
> Generating well-formed patch series from a single file manually is not
> a pleasant task.
git-format-patch
git-send-email
I use:
rm -r ../d; mkdir ../d; git-format-patch -o ../d basis -n
^^^^^ tag or branch for where to start from
git-send-email --bcc MYSELF --no-chain-reply-to ../d --to xen-devel@lists.xensource.com --compose
and answer the questions; don't forget to start cover letter subject line with [PATCH 00/16] or whatever
Ian.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-06 14:16 ` G.R.
` (2 preceding siblings ...)
2013-01-07 15:51 ` Ross Philipson
@ 2013-01-09 16:37 ` Stefano Stabellini
2013-01-10 10:18 ` G.R.
3 siblings, 1 reply; 22+ messages in thread
From: Stefano Stabellini @ 2013-01-09 16:37 UTC (permalink / raw)
To: G.R.
Cc: xen-devel, Ian Jackson, Ross Philipson, Jean Guyader,
Stefano Stabellini
On Sun, 6 Jan 2013, G.R. wrote:
> So I've accumulated many local fixes to make intel gpu passthrough work.
> If you believe they are worthwhile, I'll clean up the fix and submit the patch.
>
> List of local fixes:
> 1. Stefano's intel pch init fix, patch submitted and acked by Stefano.
> Not sure if accepted && submitted.
I asked Ian to give it a look
> 2. Patch for un-aligned OpRegion. Got feedback on security concern.
> The impact is not yet clear.
Could you please give me a reference to the patch?
> 3. Vendor cap fix.
I went back to the original thread and it looks like I had some simple
comments but Jean never addressed them.
If you cared to address the comment and resent the patches to the list,
then they would have a chance of being applied.
> Jean && Ross's patch about one year ago, lost in
> the devel list.
Are you still talking about:
http://lists.xen.org/archives/html/xen-devel/2012-01/msg02755.html
http://lists.xen.org/archives/html/xen-devel/2012-01/msg02754.html
?
Or are there other patches from Ross and Jean that got lost?
If so, could you please give me a reference to the patch?
> 4. igfx reset fix. Jean's patch about one year ago, lost in the devel list.
Could you please give me a reference to the patch?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-09 16:37 ` Stefano Stabellini
@ 2013-01-10 10:18 ` G.R.
2013-01-10 10:31 ` Stefano Stabellini
0 siblings, 1 reply; 22+ messages in thread
From: G.R. @ 2013-01-10 10:18 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Ian Jackson, Ross Philipson, Jean Guyader, xen-devel
On Thu, Jan 10, 2013 at 12:37 AM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> On Sun, 6 Jan 2013, G.R. wrote:
>> So I've accumulated many local fixes to make intel gpu passthrough work.
>> If you believe they are worthwhile, I'll clean up the fix and submit the patch.
>>
>> List of local fixes:
>> 1. Stefano's intel pch init fix, patch submitted and acked by Stefano.
>> Not sure if accepted && submitted.
>
> I asked Ian to give it a look
>
Thanks for your boost!
>
>> 2. Patch for un-aligned OpRegion. Got feedback on security concern.
>> The impact is not yet clear.
>
> Could you please give me a reference to the patch?
>
http://lists.xen.org/archives/html/xen-devel/2012-12/msg01584.html
This is the thread of that patch. I think this need to be enhanced in
ways that magic number avoided.
But more importantly, security concern need to be resolved.
This patch may expose some more address space to accommodate unaligned
OpRegion, we need to figure out what the adjacent address is.
Ross is helping on this.
>
>> 3. Vendor cap fix.
>
> I went back to the original thread and it looks like I had some simple
> comments but Jean never addressed them.
> If you cared to address the comment and resent the patches to the list,
> then they would have a chance of being applied.
>
Could you remind me the comments you just mentioned?
The mail archive is kind of messy and I can't easily go through them.
>
>> Jean && Ross's patch about one year ago, lost in
>> the devel list.
>
> Are you still talking about:
>
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg02755.html
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg02754.html
>
This one is about the chip reset, not the vendor cap.
The vendor cap fix is here:
http://lists.xen.org/archives/html/xen-devel/2012-01/msg01129.html
http://lists.xen.org/archives/html/xen-devel/2012-01/msg01128.html
So are you actually talking about the reset fix when you referred to
your comment above?
> ?
> Or are there other patches from Ross and Jean that got lost?
> If so, could you please give me a reference to the patch?
>
>
>> 4. igfx reset fix. Jean's patch about one year ago, lost in the devel list.
>
> Could you please give me a reference to the patch?
See above.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-10 10:18 ` G.R.
@ 2013-01-10 10:31 ` Stefano Stabellini
2013-01-10 15:51 ` G.R.
0 siblings, 1 reply; 22+ messages in thread
From: Stefano Stabellini @ 2013-01-10 10:31 UTC (permalink / raw)
To: G.R.
Cc: xen-devel, Ian Jackson, Ross Philipson, Jean Guyader,
Stefano Stabellini
On Thu, 10 Jan 2013, G.R. wrote:
> >> 3. Vendor cap fix.
> >
> > I went back to the original thread and it looks like I had some simple
> > comments but Jean never addressed them.
> > If you cared to address the comment and resent the patches to the list,
> > then they would have a chance of being applied.
> >
> Could you remind me the comments you just mentioned?
> The mail archive is kind of messy and I can't easily go through them.
Sure, here they are:
http://marc.info/?l=xen-devel&m=132810779729833
http://marc.info/?l=xen-devel&m=132810811429932
> >> Jean && Ross's patch about one year ago, lost in
> >> the devel list.
> >
> > Are you still talking about:
> >
> > http://lists.xen.org/archives/html/xen-devel/2012-01/msg02755.html
> > http://lists.xen.org/archives/html/xen-devel/2012-01/msg02754.html
> >
>
> This one is about the chip reset, not the vendor cap.
> The vendor cap fix is here:
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01129.html
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01128.html
> So are you actually talking about the reset fix when you referred to
> your comment above?
Yes, sorry for the confusion.
Regarding these two, the first one has already my ack, so it should be
applied as far as I am concerned.
The second one introduces a new function, igd_pci_read_vendor_cap, that
returns 0 or -1 but return type is actually uint32_t. That needs to be
fixed. Aside from that I think it is OK.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-10 10:31 ` Stefano Stabellini
@ 2013-01-10 15:51 ` G.R.
2013-01-11 12:56 ` Stefano Stabellini
0 siblings, 1 reply; 22+ messages in thread
From: G.R. @ 2013-01-10 15:51 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Ian Jackson, Ross Philipson, Jean Guyader, xen-devel
>> Could you remind me the comments you just mentioned?
>> The mail archive is kind of messy and I can't easily go through them.
>
> Sure, here they are:
> http://marc.info/?l=xen-devel&m=132810779729833
> http://marc.info/?l=xen-devel&m=132810811429932
>
Thanks, I'll try to address them when I got time.
Actually I haven't try this out yet. But this is really a good-to-have.
But one question first -- what do you mean by "inline patch?"
Sorry if it is a silly question.
>
>> >> Jean && Ross's patch about one year ago, lost in
>> >> the devel list.
>> >
>> > Are you still talking about:
>> >
>> > http://lists.xen.org/archives/html/xen-devel/2012-01/msg02755.html
>> > http://lists.xen.org/archives/html/xen-devel/2012-01/msg02754.html
>> >
>>
>> This one is about the chip reset, not the vendor cap.
>> The vendor cap fix is here:
>> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01129.html
>> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01128.html
>> So are you actually talking about the reset fix when you referred to
>> your comment above?
>
> Yes, sorry for the confusion.
>
> Regarding these two, the first one has already my ack, so it should be
> applied as far as I am concerned.
>
Actually I don't quite understand the patch description...
> The second one introduces a new function, igd_pci_read_vendor_cap, that
> returns 0 or -1 but return type is actually uint32_t. That needs to be
> fixed. Aside from that I think it is OK.
Sorry, but it seems that I failed to reference the latest patch.
Jean has one follow up that fixes that issue you mentioned -- he just
return '1' instead.
http://lists.xen.org/archives/html/xen-devel/2012-01/msg01290.html
But unfortunately the patch needs to be modified against the current tree.
So anyway I need to re-post.
Also, I find that the patch does not work for the linux lspci.
It seems to lack of the Caps bit in the status (0x6) register.
But according to Ross, that bit only exists in spec 2.1 and becomes
reserved in later specs.
What do you think? Do we need to provide that bit also?
Thanks,
Timothy
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-10 15:51 ` G.R.
@ 2013-01-11 12:56 ` Stefano Stabellini
2013-01-15 17:12 ` G.R.
0 siblings, 1 reply; 22+ messages in thread
From: Stefano Stabellini @ 2013-01-11 12:56 UTC (permalink / raw)
To: G.R.
Cc: xen-devel, Ian Jackson, Ross Philipson, Jean Guyader,
Stefano Stabellini
On Thu, 10 Jan 2013, G.R. wrote:
> >> Could you remind me the comments you just mentioned?
> >> The mail archive is kind of messy and I can't easily go through them.
> >
> > Sure, here they are:
> > http://marc.info/?l=xen-devel&m=132810779729833
> > http://marc.info/?l=xen-devel&m=132810811429932
> >
> Thanks, I'll try to address them when I got time.
> Actually I haven't try this out yet. But this is really a good-to-have.
>
> But one question first -- what do you mean by "inline patch?"
> Sorry if it is a silly question.
http://www.kernel.org/doc/Documentation/SubmittingPatches
See point 7):
"No MIME, no links, no compression, no attachments. Just plain text."
> >> >> Jean && Ross's patch about one year ago, lost in
> >> >> the devel list.
> >> >
> >> > Are you still talking about:
> >> >
> >> > http://lists.xen.org/archives/html/xen-devel/2012-01/msg02755.html
> >> > http://lists.xen.org/archives/html/xen-devel/2012-01/msg02754.html
> >> >
> >>
> >> This one is about the chip reset, not the vendor cap.
> >> The vendor cap fix is here:
> >> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01129.html
> >> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01128.html
> >> So are you actually talking about the reset fix when you referred to
> >> your comment above?
> >
> > Yes, sorry for the confusion.
> >
> > Regarding these two, the first one has already my ack, so it should be
> > applied as far as I am concerned.
> >
> Actually I don't quite understand the patch description...
The patch exposes the vendor specific PCI cap to the guest, instead of
emulating it.
> > The second one introduces a new function, igd_pci_read_vendor_cap, that
> > returns 0 or -1 but return type is actually uint32_t. That needs to be
> > fixed. Aside from that I think it is OK.
>
> Sorry, but it seems that I failed to reference the latest patch.
> Jean has one follow up that fixes that issue you mentioned -- he just
> return '1' instead.
> http://lists.xen.org/archives/html/xen-devel/2012-01/msg01290.html
>
> But unfortunately the patch needs to be modified against the current tree.
> So anyway I need to re-post.
OK
> Also, I find that the patch does not work for the linux lspci.
> It seems to lack of the Caps bit in the status (0x6) register.
> But according to Ross, that bit only exists in spec 2.1 and becomes
> reserved in later specs.
> What do you think? Do we need to provide that bit also?
Being Linux open, you can just clone the code of lspci and see what
exactly it is looking for.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-11 12:56 ` Stefano Stabellini
@ 2013-01-15 17:12 ` G.R.
2013-01-15 18:40 ` Stefano Stabellini
0 siblings, 1 reply; 22+ messages in thread
From: G.R. @ 2013-01-15 17:12 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Ian Jackson, Ross Philipson, Jean Guyader, xen-devel
>> Also, I find that the patch does not work for the linux lspci.
>> It seems to lack of the Caps bit in the status (0x6) register.
>> But according to Ross, that bit only exists in spec 2.1 and becomes
>> reserved in later specs.
>> What do you think? Do we need to provide that bit also?
>
> Being Linux open, you can just clone the code of lspci and see what
> exactly it is looking for.
Well, I understand that Linux is open. But that's not my question.
Ross's comment gave me the impression that lspci did not adhere to the standard.
So my question was actually to follow standard or lspci?
After double checking the spec, it seems that Ross was talking about a
different bit.
The bit that once defined but now reserved is the UDF bit (bit 0x6) in
the Status (ofs 0x6) reg.
While I was talking about the Cap-list (bit 0x4) in the Status (ofs 0x6) reg.
So it seems that the patch should be fixed to expose this bit and I
need your suggestion here.
Currently I have a quick and dirty hack locally to directly expose
both command (word 0x4) && status (word 0x6) reg.
I'm not sure if it's acceptable (probably not).
I tend to provide read-only access since I have no idea what will
happen if the guest ever try to operate on it (the host bridge
00:00.0)
I'm not sure if I should expose the Status reg only and exclude
Command reg (in case the Status reg is fully read-only, which I'm not
sure about since I don't have the spec), or just fake a result for
that register. If we choose to fake one, what should the result look
like?
Thanks,
Timothy
PS: it looks that the switch-on-address style of code is not very robust.
This would fail if the guest access using unexpected alignment && length.
I guess may be we should switch to a mask based implementation.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-15 17:12 ` G.R.
@ 2013-01-15 18:40 ` Stefano Stabellini
2013-01-20 16:26 ` G.R.
0 siblings, 1 reply; 22+ messages in thread
From: Stefano Stabellini @ 2013-01-15 18:40 UTC (permalink / raw)
To: G.R.
Cc: xen-devel, Ian Jackson, Ross Philipson, Jean Guyader,
Stefano Stabellini
On Tue, 15 Jan 2013, G.R. wrote:
> >> Also, I find that the patch does not work for the linux lspci.
> >> It seems to lack of the Caps bit in the status (0x6) register.
> >> But according to Ross, that bit only exists in spec 2.1 and becomes
> >> reserved in later specs.
> >> What do you think? Do we need to provide that bit also?
> >
> > Being Linux open, you can just clone the code of lspci and see what
> > exactly it is looking for.
>
> Well, I understand that Linux is open. But that's not my question.
> Ross's comment gave me the impression that lspci did not adhere to the standard.
> So my question was actually to follow standard or lspci?
>
> After double checking the spec, it seems that Ross was talking about a
> different bit.
> The bit that once defined but now reserved is the UDF bit (bit 0x6) in
> the Status (ofs 0x6) reg.
> While I was talking about the Cap-list (bit 0x4) in the Status (ofs 0x6) reg.
>
> So it seems that the patch should be fixed to expose this bit and I
> need your suggestion here.
> Currently I have a quick and dirty hack locally to directly expose
> both command (word 0x4) && status (word 0x6) reg.
> I'm not sure if it's acceptable (probably not).
> I tend to provide read-only access since I have no idea what will
> happen if the guest ever try to operate on it (the host bridge
> 00:00.0)
> I'm not sure if I should expose the Status reg only and exclude
> Command reg (in case the Status reg is fully read-only, which I'm not
> sure about since I don't have the spec), or just fake a result for
> that register. If we choose to fake one, what should the result look
> like?
I would expose only the Status register, read-only.
> PS: it looks that the switch-on-address style of code is not very robust.
> This would fail if the guest access using unexpected alignment && length.
> I guess may be we should switch to a mask based implementation.
Given the way QEMU emulates the PCI config space, I don't think is
possible to receive a reads or writes with a length different from 1, 2 or
4 bytes.
However I am always open to code improvements.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-15 18:40 ` Stefano Stabellini
@ 2013-01-20 16:26 ` G.R.
2013-01-21 10:35 ` Stefano Stabellini
0 siblings, 1 reply; 22+ messages in thread
From: G.R. @ 2013-01-20 16:26 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Ian Jackson, Ross Philipson, Jean Guyader, xen-devel
On Wed, Jan 16, 2013 at 2:40 AM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
>> PS: it looks that the switch-on-address style of code is not very robust.
>> This would fail if the guest access using unexpected alignment && length.
>> I guess may be we should switch to a mask based implementation.
>
> Given the way QEMU emulates the PCI config space, I don't think is
> possible to receive a reads or writes with a length different from 1, 2 or
> 4 bytes.
> However I am always open to code improvements.
I'm not sure if I understand the QEMU part. But at least I've seen
this in the log:
igd_pci_read: [00:00:0] addr=0 len=4 val=1508086
igd_pci_read: [00:00:0] addr=4 len=4 val=20900006
igd_pci_read: [00:00:0] addr=8 len=4 val=6000009
igd_pci_read: [00:00:0] addr=c len=4 val=0
igd_pci_read: [00:00:0] addr=6 len=2 val=2090
igd_pci_read: [00:00:0] addr=34 len=1 val=e0
igd_pci_read: [00:00:0] addr=e0 len=2 val=9
igd_pci_read: [00:00:0] addr=4 len=2 val=6
igd_pci_read: [00:00:0] addr=4 len=2 val=6
igd_pci_read: [00:00:0] addr=c len=1 val=0
igd_pci_read: [00:00:0] addr=d len=1 val=0
There are both 2 && 4 bytes read to offset 0x4, and 2 bytes read to offset 0x6.
If we only check for 0x6 in igd_pci_read(), the 4-byte read to offset
0x4 would show inconsistent result.
But I'm not sure if this matters to SW.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Need help to debug win7 BSOD on IGD passthrough
2013-01-20 16:26 ` G.R.
@ 2013-01-21 10:35 ` Stefano Stabellini
0 siblings, 0 replies; 22+ messages in thread
From: Stefano Stabellini @ 2013-01-21 10:35 UTC (permalink / raw)
To: G.R.
Cc: xen-devel, Ian Jackson, Ross Philipson, Jean Guyader,
Stefano Stabellini
On Sun, 20 Jan 2013, G.R. wrote:
> On Wed, Jan 16, 2013 at 2:40 AM, Stefano Stabellini
> <stefano.stabellini@eu.citrix.com> wrote:
> >> PS: it looks that the switch-on-address style of code is not very robust.
> >> This would fail if the guest access using unexpected alignment && length.
> >> I guess may be we should switch to a mask based implementation.
> >
> > Given the way QEMU emulates the PCI config space, I don't think is
> > possible to receive a reads or writes with a length different from 1, 2 or
> > 4 bytes.
> > However I am always open to code improvements.
>
> I'm not sure if I understand the QEMU part. But at least I've seen
> this in the log:
>
> igd_pci_read: [00:00:0] addr=0 len=4 val=1508086
> igd_pci_read: [00:00:0] addr=4 len=4 val=20900006
> igd_pci_read: [00:00:0] addr=8 len=4 val=6000009
> igd_pci_read: [00:00:0] addr=c len=4 val=0
>
> igd_pci_read: [00:00:0] addr=6 len=2 val=2090
> igd_pci_read: [00:00:0] addr=34 len=1 val=e0
> igd_pci_read: [00:00:0] addr=e0 len=2 val=9
> igd_pci_read: [00:00:0] addr=4 len=2 val=6
> igd_pci_read: [00:00:0] addr=4 len=2 val=6
> igd_pci_read: [00:00:0] addr=c len=1 val=0
> igd_pci_read: [00:00:0] addr=d len=1 val=0
>
> There are both 2 && 4 bytes read to offset 0x4, and 2 bytes read to offset 0x6.
> If we only check for 0x6 in igd_pci_read(), the 4-byte read to offset
> 0x4 would show inconsistent result.
> But I'm not sure if this matters to SW.
Yes, you are right. Well spotted.
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2013-01-21 10:35 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-11 9:26 Need help to debug win7 BSOD on IGD passthrough G.R.
2013-01-04 13:49 ` G.R.
2013-01-04 15:53 ` Ross Philipson
2013-01-04 16:30 ` G.R.
2013-01-04 18:14 ` Jean Guyader
2013-01-05 5:08 ` G.R.
2013-01-06 14:16 ` G.R.
2013-01-06 15:32 ` Pasi Kärkkäinen
2013-01-07 10:38 ` Stefano Stabellini
2013-01-09 15:07 ` G.R.
2013-01-09 16:12 ` Stefano Stabellini
2013-01-09 16:21 ` Ian Jackson
2013-01-07 15:51 ` Ross Philipson
2013-01-09 16:37 ` Stefano Stabellini
2013-01-10 10:18 ` G.R.
2013-01-10 10:31 ` Stefano Stabellini
2013-01-10 15:51 ` G.R.
2013-01-11 12:56 ` Stefano Stabellini
2013-01-15 17:12 ` G.R.
2013-01-15 18:40 ` Stefano Stabellini
2013-01-20 16:26 ` G.R.
2013-01-21 10:35 ` Stefano Stabellini
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).