* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
@ 2015-09-18 9:37 Janusz
2015-09-18 10:07 ` Laszlo Ersek
2015-09-22 8:59 ` Paolo Bonzini
0 siblings, 2 replies; 37+ messages in thread
From: Janusz @ 2015-09-18 9:37 UTC (permalink / raw)
To: kvm; +Cc: edk2-devel, guangrong.xiao
Hello,
I am writing about this patch that was posted by Xiao:
http://www.spinics.net/lists/kvm/msg119044.html and this:
http://www.spinics.net/lists/kvm/msg119045.html
I've tested both kernel 4.2 and 4.3, and the problem still exists when I use
OVMF: 100% CPU usage and the VM resetting, while it works properly on kernel 4.1.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-09-18 9:37 [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled Janusz
@ 2015-09-18 10:07 ` Laszlo Ersek
2015-09-18 17:48 ` Janusz
2015-09-22 8:59 ` Paolo Bonzini
1 sibling, 1 reply; 37+ messages in thread
From: Laszlo Ersek @ 2015-09-18 10:07 UTC (permalink / raw)
To: Janusz, kvm; +Cc: edk2-devel, guangrong.xiao
On 09/18/15 11:37, Janusz wrote:
> Hello,
>
> I am writing about this patch that was posted by Xiao:
> http://www.spinics.net/lists/kvm/msg119044.html and this:
> http://www.spinics.net/lists/kvm/msg119045.html
> I've tested both kernel 4.2 and 4.3 and problem still exists when I use
> OVMF - 100% cpu usage, VM resetting, while it works properly on kernel 4.1
My last (still current) request remains "please quirk it". See the end
of <http://thread.gmane.org/gmane.linux.kernel/1952205/focus=1996025>,
and other messages in that subthread.
I haven't been following kernel development, so maybe the quirk has not
happened. No clue.
... "VM resetting" looks like something different, though; I've been under the
impression that the pedantic (=unquirked) MTRR configuration didn't
impact anything other than speed. Janusz, maybe you could contribute
a host kernel bisection for the VM reset symptom.
Thanks
Laszlo
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
>
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-09-18 10:07 ` Laszlo Ersek
@ 2015-09-18 17:48 ` Janusz
2015-09-21 2:51 ` Xiao Guangrong
0 siblings, 1 reply; 37+ messages in thread
From: Janusz @ 2015-09-18 17:48 UTC (permalink / raw)
To: Laszlo Ersek, kvm; +Cc: edk2-devel, guangrong.xiao
On 18.09.2015 at 12:07, Laszlo Ersek wrote:
> On 09/18/15 11:37, Janusz wrote:
>> Hello,
>>
>> I am writing about this patch that was posted by Xiao:
>> http://www.spinics.net/lists/kvm/msg119044.html and this:
>> http://www.spinics.net/lists/kvm/msg119045.html
>> I've tested both kernel 4.2 and 4.3 and problem still exists when I use
>> OVMF - 100% cpu usage, VM resetting, while it works properly on kernel 4.1
> My last (still current) request remains "please quirk it". See the end
> of <http://thread.gmane.org/gmane.linux.kernel/1952205/focus=1996025>,
> and other messages in that subthread.
When I saw Xiao's message saying that he had posted a patch for it, and saw
this code in the kernel sources (it landed in 4.2-rc3), I thought that the
status was no longer "please quirk it".
>
> I haven't been following kernel development, so maybe the quirk has not
> happened. No clue.
>
> ... "VM resetting" looks something different though; I've been under the
> impression that the pedantic (=unquirked) MTRR configuration didn't
> impact other things than speed. Janusz, maybe you could contribute with
> a host kernel bisection for the VM reset symptom.
To be more exact: most of the time the VM does not start at all, or it
starts after a long time and then resets itself at a random point (but
before the system boots), or it shows very high CPU usage. Sometimes it
boots without problems (the least likely case). When I start the VM with
-vga std instead of my GPU passthrough, most of the time I get this:
http://pastebin.com/raw.php?i=CKrNsueS
Result of bisect:
git bisect start
# bad: [d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754] Linux 4.2-rc1
git bisect bad d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754
# good: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1
git bisect good b953c0d234bc72e8489d3bf51a276c5c4ec85345
# good: [4570a37169d4b44d316f40b2ccc681dc93fedc7b] Merge tag 'sound-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good 4570a37169d4b44d316f40b2ccc681dc93fedc7b
# good: [8d7804a2f03dbd34940fcb426450c730adf29dae] Merge tag 'driver-core-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
git bisect good 8d7804a2f03dbd34940fcb426450c730adf29dae
# bad: [78c10e556ed904d5bfbd71e9cadd8ce8f25d6982] Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
git bisect bad 78c10e556ed904d5bfbd71e9cadd8ce8f25d6982
# good: [623f0e137c0fedb81bbf3d88be4ed300eee163da] Staging: lustre: fix space before and after comma in dt_object.c
git bisect good 623f0e137c0fedb81bbf3d88be4ed300eee163da
# bad: [8c7febe83915332276cab49e89f6580bb963fb9a] Merge tag 'tty-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
git bisect bad 8c7febe83915332276cab49e89f6580bb963fb9a
# good: [bcf5b92d9bbf0b7683199615f0f184e89fa486bc] staging: rtl8192e: Remove rt_hi_throughput::ChnkOp
git bisect good bcf5b92d9bbf0b7683199615f0f184e89fa486bc
# good: [71eec4836b834b992e0cefeefc8b85efe4cb185b] drivers: PL011: allow avoiding UART enabling/disabling
git bisect good 71eec4836b834b992e0cefeefc8b85efe4cb185b
# good: [49ef9c850154756cf2fbc50fd3804c44675d4633] staging: comedi: me4000: rename 'thisboard' variables
git bisect good 49ef9c850154756cf2fbc50fd3804c44675d4633
# good: [2a4462418af771ef9f1f1d1532bcbb8799df842d] tty/serial: kill off set_irq_flags usage
git bisect good 2a4462418af771ef9f1f1d1532bcbb8799df842d
# good: [6d43b0f482561ab421a91ebf59a51192d66cf8a7] Staging: sm750fb: ddk750_swi2c.c: Insert spaces around operators
git bisect good 6d43b0f482561ab421a91ebf59a51192d66cf8a7
# good: [84e1eb83d0b9e0969a59b6848d718eaf71e98fcb] MAINTAINERS: tty: add serial docs directory
git bisect good 84e1eb83d0b9e0969a59b6848d718eaf71e98fcb
# good: [53a20e9e378ecd52f0afa4b60f8f8c81b6f97c27] staging: wilc1000: disable driver due to build warnings
git bisect good 53a20e9e378ecd52f0afa4b60f8f8c81b6f97c27
# good: [71206b9f8120eb513c621d4f31906577bb658df3] Doc: serial-rs485.txt: update RS485 driver interface
git bisect good 71206b9f8120eb513c621d4f31906577bb658df3
# bad: [23908db413eccd77084b09c9b0a4451dfb0524c0] Merge tag 'staging-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect bad 23908db413eccd77084b09c9b0a4451dfb0524c0
# first bad commit: [23908db413eccd77084b09c9b0a4451dfb0524c0] Merge tag 'staging-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
I also checked out commit 10dc331ff5e7e4668c0f0c95b1a873aba9b70826 to
make sure that this patch hadn't fixed it, with something later
breaking it again, but on this commit the result is the same.
> Thanks
> Laszlo
>
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-09-18 17:48 ` Janusz
@ 2015-09-21 2:51 ` Xiao Guangrong
2015-09-21 3:30 ` Wanpeng Li
2015-09-21 8:23 ` Janusz
0 siblings, 2 replies; 37+ messages in thread
From: Xiao Guangrong @ 2015-09-21 2:51 UTC (permalink / raw)
To: Janusz, Laszlo Ersek, kvm; +Cc: edk2-devel
Thanks for your report and analysis, Janusz!
On 09/19/2015 01:48 AM, Janusz wrote:
> On 18.09.2015 at 12:07, Laszlo Ersek wrote:
>> On 09/18/15 11:37, Janusz wrote:
>>> Hello,
>>>
>>> I am writing about this patch that was posted by Xiao:
>>> http://www.spinics.net/lists/kvm/msg119044.html and this:
>>> http://www.spinics.net/lists/kvm/msg119045.html
>>> I've tested both kernel 4.2 and 4.3 and problem still exists when I use
>>> OVMF - 100% cpu usage, VM resetting, while it works properly on kernel 4.1
>> My last (still current) request remains "please quirk it". See the end
>> of <http://thread.gmane.org/gmane.linux.kernel/1952205/focus=1996025>,
>> and other messages in that subthread.
> when I saw message from Xiao that he posted patch for it and have seen
> this code in kernel sources (landed in 4.2-rc3) I though that the status
> is not "please quirk it" anymore
>>
>> I haven't been following kernel development, so maybe the quirk has not
>> happened. No clue.
>>
>> ... "VM resetting" looks something different though; I've been under the
>> impression that the pedantic (=unquirked) MTRR configuration didn't
>> impact other things than speed. Janusz, maybe you could contribute with
>> a host kernel bisection for the VM reset symptom.
> To be more exact - VM is mostly not starting or its starting after long
> time and then resetting it self at random time (but before system boots)
> or gets very high cpu usage or sometimes boots without problem (the
> least possible case scenario). When I start VM with -vga std, not with
> my gpu passthrough - in most of the time I get this:
> http://pastebin.com/raw.php?i=CKrNsueS
It seems the behaviour is different from before (previously, it could boot,
but slowly), right?
The URL cannot be accessed; I do not know whether it's your web server issue or
our networking issue.
>
> Result of bisect:
>
> git bisect start
> # bad: [d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754] Linux 4.2-rc1
> git bisect bad d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754
> # good: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1
> git bisect good b953c0d234bc72e8489d3bf51a276c5c4ec85345
> # good: [4570a37169d4b44d316f40b2ccc681dc93fedc7b] Merge tag
> 'sound-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
> git bisect good 4570a37169d4b44d316f40b2ccc681dc93fedc7b
> # good: [8d7804a2f03dbd34940fcb426450c730adf29dae] Merge tag
> 'driver-core-4.2-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
> git bisect good 8d7804a2f03dbd34940fcb426450c730adf29dae
> # bad: [78c10e556ed904d5bfbd71e9cadd8ce8f25d6982] Merge branch
> 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
> git bisect bad 78c10e556ed904d5bfbd71e9cadd8ce8f25d6982
> # good: [623f0e137c0fedb81bbf3d88be4ed300eee163da] Staging: lustre: fix
> space before and after comma in dt_object.c
> git bisect good 623f0e137c0fedb81bbf3d88be4ed300eee163da
> # bad: [8c7febe83915332276cab49e89f6580bb963fb9a] Merge tag
> 'tty-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
> git bisect bad 8c7febe83915332276cab49e89f6580bb963fb9a
> # good: [bcf5b92d9bbf0b7683199615f0f184e89fa486bc] staging: rtl8192e:
> Remove rt_hi_throughput::ChnkOp
> git bisect good bcf5b92d9bbf0b7683199615f0f184e89fa486bc
> # good: [71eec4836b834b992e0cefeefc8b85efe4cb185b] drivers: PL011: allow
> avoiding UART enabling/disabling
> git bisect good 71eec4836b834b992e0cefeefc8b85efe4cb185b
> # good: [49ef9c850154756cf2fbc50fd3804c44675d4633] staging: comedi:
> me4000: rename 'thisboard' variables
> git bisect good 49ef9c850154756cf2fbc50fd3804c44675d4633
> # good: [2a4462418af771ef9f1f1d1532bcbb8799df842d] tty/serial: kill off
> set_irq_flags usage
> git bisect good 2a4462418af771ef9f1f1d1532bcbb8799df842d
> # good: [6d43b0f482561ab421a91ebf59a51192d66cf8a7] Staging: sm750fb:
> ddk750_swi2c.c: Insert spaces around operators
> git bisect good 6d43b0f482561ab421a91ebf59a51192d66cf8a7
> # good: [84e1eb83d0b9e0969a59b6848d718eaf71e98fcb] MAINTAINERS: tty: add
> serial docs directory
> git bisect good 84e1eb83d0b9e0969a59b6848d718eaf71e98fcb
> # good: [53a20e9e378ecd52f0afa4b60f8f8c81b6f97c27] staging: wilc1000:
> disable driver due to build warnings
> git bisect good 53a20e9e378ecd52f0afa4b60f8f8c81b6f97c27
> # good: [71206b9f8120eb513c621d4f31906577bb658df3] Doc:
> serial-rs485.txt: update RS485 driver interface
> git bisect good 71206b9f8120eb513c621d4f31906577bb658df3
> # bad: [23908db413eccd77084b09c9b0a4451dfb0524c0] Merge tag
> 'staging-4.2-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
> git bisect bad 23908db413eccd77084b09c9b0a4451dfb0524c0
> # first bad commit: [23908db413eccd77084b09c9b0a4451dfb0524c0] Merge tag
> 'staging-4.2-rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
>
The bad commit is:
commit 23908db413eccd77084b09c9b0a4451dfb0524c0
Merge: 8d7804a 53a20e9
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Fri Jun 26 15:46:08 2015 -0700
It is hard to tell which change causes the real issue, but I will try to reproduce
it. Could you please share your command line?
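
One thing can be read directly from the data quoted above: the "Merge:" line names 8d7804a and 53a20e9 as the parents of the first bad commit, and the bisect log marks both of those commits good. The following is only a minimal Python sketch of that cross-check; the function names and the shortened sample log are illustrative assumptions, not something posted in this thread.

```python
import re

# Matches the comment lines found in a git bisect log, for example:
#   # good: [b953c0d2...] Linux 4.1
MARK = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)$")


def summarize_bisect_log(text):
    """Return (good, bad, first_bad) parsed from the text of a bisect log."""
    good, bad, first_bad = [], [], None
    for line in text.splitlines():
        m = MARK.match(line.strip())
        if not m:
            continue  # skip the "git bisect ..." command lines
        kind, sha, subject = m.groups()
        if kind == "good":
            good.append((sha, subject))
        elif kind == "bad":
            bad.append((sha, subject))
        else:
            first_bad = (sha, subject)
    return good, bad, first_bad


def parents_marked_good(good, parent_prefixes):
    """For each abbreviated parent hash, report whether it was marked good."""
    return {p: any(sha.startswith(p) for sha, _ in good) for p in parent_prefixes}


# Shortened excerpt of the log from this thread (hashes copied verbatim).
SAMPLE = """\
git bisect start
# bad: [d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754] Linux 4.2-rc1
# good: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1
# good: [8d7804a2f03dbd34940fcb426450c730adf29dae] Merge tag 'driver-core-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
# good: [53a20e9e378ecd52f0afa4b60f8f8c81b6f97c27] staging: wilc1000: disable driver due to build warnings
# first bad commit: [23908db413eccd77084b09c9b0a4451dfb0524c0] Merge tag 'staging-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
"""

good, bad, first_bad = summarize_bisect_log(SAMPLE)
# Parent hashes taken from the "Merge: 8d7804a 53a20e9" line.
print(first_bad[0][:12], parents_marked_good(good, ["8d7804a", "53a20e9"]))
```

If both parents really are good, the regression would come from the combination of the two branches rather than from either branch alone. Since the failure is reported as intermittent ("sometimes boots without problem"), a false "good" during bisection is another possibility.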
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-09-21 2:51 ` Xiao Guangrong
@ 2015-09-21 3:30 ` Wanpeng Li
2015-09-21 3:40 ` Xiao Guangrong
2015-09-21 8:23 ` Janusz
1 sibling, 1 reply; 37+ messages in thread
From: Wanpeng Li @ 2015-09-21 3:30 UTC (permalink / raw)
To: Xiao Guangrong, Janusz, Laszlo Ersek, kvm; +Cc: edk2-devel
On 9/21/15 10:51 AM, Xiao Guangrong wrote:
>
> Thanks for your report and analysis, Janusz!
>
> On 09/19/2015 01:48 AM, Janusz wrote:
>> On 18.09.2015 at 12:07, Laszlo Ersek wrote:
>>> On 09/18/15 11:37, Janusz wrote:
>>>> Hello,
>>>>
>>>> I am writing about this patch that was posted by Xiao:
>>>> http://www.spinics.net/lists/kvm/msg119044.html and this:
>>>> http://www.spinics.net/lists/kvm/msg119045.html
>>>> I've tested both kernel 4.2 and 4.3 and problem still exists when I
>>>> use
>>>> OVMF - 100% cpu usage, VM resetting, while it works properly on
>>>> kernel 4.1
>>> My last (still current) request remains "please quirk it". See the end
>>> of <http://thread.gmane.org/gmane.linux.kernel/1952205/focus=1996025>,
>>> and other messages in that subthread.
>> when I saw message from Xiao that he posted patch for it and have seen
>> this code in kernel sources (landed in 4.2-rc3) I though that the status
>> is not "please quirk it" anymore
>>>
>>> I haven't been following kernel development, so maybe the quirk has not
>>> happened. No clue.
>>>
>>> ... "VM resetting" looks something different though; I've been under
>>> the
>>> impression that the pedantic (=unquirked) MTRR configuration didn't
>>> impact other things than speed. Janusz, maybe you could contribute with
>>> a host kernel bisection for the VM reset symptom.
>> To be more exact - VM is mostly not starting or its starting after long
>> time and then resetting it self at random time (but before system boots)
>> or gets very high cpu usage or sometimes boots without problem (the
>> least possible case scenario). When I start VM with -vga std, not with
>> my gpu passthrough - in most of the time I get this:
>> http://pastebin.com/raw.php?i=CKrNsueS
>
> It seems the behaviour is different with previous (previously, it can
> boot
> but slowly), right?
>
> The URL cannot be accessed; I do not know whether it's your web server issue or
> our networking issue.
>
It can be accessed; the dump is below:
KVM internal error. Suberror: 1
emulation failure
KVM internal error. Suberror: 1
emulation failure
KVM internal error. Suberror: 1
emulation failure
KVM internal error. Suberror: 1
emulation failure
KVM internal error. Suberror: 1
emulation failure
KVM internal error. Suberror: 1
emulation failure
KVM internal error. Suberror: 1
emulation failure
EAX=bfefa000 EBX=00000002 ECX=00000000 EDX=00000600
ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
FS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= bfee87d8 0000003f
IDT= 00000000 0000ffff
CR0=00000033 CR2=00000000 CR3=bfefa000 CR4=00000640
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
EAX=bfefa000 EBX=00000002 ECX=00000000 EDX=00000600
ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
FS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= bfee87d8 0000003f
IDT= 00000000 0000ffff
CR0=00000033 CR2=00000000 CR3=bfefa000 CR4=00000640
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000600
ESI=00000000 EDI=00002000 EBP=00000000 ESP=00000000
EIP=00001000 EFL=00010046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =9f00 0009f000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT= 00000000 0000ffff
IDT= 00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
EAX=00000023 EBX=00000002 ECX=00000000 EDX=00000600
ESI=00000000 EDI=00001fc2 EBP=00000000 ESP=00000000
EIP=00001000 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
CS =9f00 0009f000 0000ffff 00009b00 DPL=0 CS16 [-RA]
SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
DS =9f00 0009f000 0000ffff 00009300 DPL=0 DS16 [-WA]
FS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
GS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= bfee87d8 0000003f
IDT= 00000000 0000ffff
CR0=00000033 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
EAX=00000023 EBX=00000002 ECX=00000000 EDX=00000600
ESI=00000000 EDI=00001fc2 EBP=00000000 ESP=00000000
EIP=00001000 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
CS =9f00 0009f000 0000ffff 00009b00 DPL=0 CS16 [-RA]
SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
DS =9f00 0009f000 0000ffff 00009300 DPL=0 DS16 [-WA]
FS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
GS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= bfee87d8 0000003f
IDT= 00000000 0000ffff
CR0=00000033 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
EAX=00000023 EBX=00000002 ECX=00000000 EDX=00000600
ESI=00000000 EDI=00001fc2 EBP=00000000 ESP=00000000
EIP=00001000 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
CS =9f00 0009f000 0000ffff 00009b00 DPL=0 CS16 [-RA]
SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
DS =9f00 0009f000 0000ffff 00009300 DPL=0 DS16 [-WA]
FS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
GS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= bfee87d8 0000003f
IDT= 00000000 0000ffff
CR0=00000033 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
EAX=bfefa000 EBX=00000002 ECX=00000000 EDX=00000600
ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
FS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= bfee87d8 0000003f
IDT= 00000000 0000ffff
CR0=00000033 CR2=00000000 CR3=bfefa000 CR4=00000640
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
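
The seven register dumps above reduce to three distinct (CS base, EIP, CR0) combinations. As a rough illustration only, the Python sketch below decodes CR0 and adds the CS base to EIP for each of them. The helper names are hypothetical, the input values are copied from the dump, and the CR0 bit positions are the architectural x86 ones.

```python
# CR0 bit positions as defined by the x86 architecture (bit -> flag name).
CR0_BITS = {0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
            16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG"}


def decode_cr0(value):
    """List the names of the CR0 flags that are set, lowest bit first."""
    return [name for bit, name in sorted(CR0_BITS.items()) if value & (1 << bit)]


def linear_ip(cs_base, eip):
    """Linear address of the next instruction: segment base plus offset."""
    return (cs_base + eip) & 0xFFFFFFFF


# (CS base, EIP, CR0) triples copied from the dump above.
STATES = [
    (0x00000000, 0x000a0000, 0x00000033),  # flat 32-bit code segment
    (0x0009f000, 0x00001000, 0x60000010),  # CS=9f00, CR0 at its reset value
    (0x0009f000, 0x00001000, 0x00000033),  # CS=9f00, 16-bit code segment
]

for cs_base, eip, cr0 in STATES:
    print(hex(linear_ip(cs_base, eip)), decode_cr0(cr0))
```

All three combinations resolve to linear address 0xa0000, and PG is clear in every CR0 value, so the linear address is also the physical one. 0xA0000 is conventionally the start of the legacy VGA window. This is an observation about the dump, not a diagnosis of the failure.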
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-09-21 3:30 ` Wanpeng Li
@ 2015-09-21 3:40 ` Xiao Guangrong
2015-10-01 14:12 ` Janusz
0 siblings, 1 reply; 37+ messages in thread
From: Xiao Guangrong @ 2015-09-21 3:40 UTC (permalink / raw)
To: Wanpeng Li, Janusz, Laszlo Ersek, kvm; +Cc: edk2-devel
On 09/21/2015 11:30 AM, Wanpeng Li wrote:
> On 9/21/15 10:51 AM, Xiao Guangrong wrote:
>>
>> Thanks for your report and analysis, Janusz!
>>
>> On 09/19/2015 01:48 AM, Janusz wrote:
>>> On 18.09.2015 at 12:07, Laszlo Ersek wrote:
>>>> On 09/18/15 11:37, Janusz wrote:
>>>>> Hello,
>>>>>
>>>>> I am writing about this patch that was posted by Xiao:
>>>>> http://www.spinics.net/lists/kvm/msg119044.html and this:
>>>>> http://www.spinics.net/lists/kvm/msg119045.html
>>>>> I've tested both kernel 4.2 and 4.3 and problem still exists when I use
>>>>> OVMF - 100% cpu usage, VM resetting, while it works properly on kernel 4.1
>>>> My last (still current) request remains "please quirk it". See the end
>>>> of <http://thread.gmane.org/gmane.linux.kernel/1952205/focus=1996025>,
>>>> and other messages in that subthread.
>>> when I saw message from Xiao that he posted patch for it and have seen
>>> this code in kernel sources (landed in 4.2-rc3) I though that the status
>>> is not "please quirk it" anymore
>>>>
>>>> I haven't been following kernel development, so maybe the quirk has not
>>>> happened. No clue.
>>>>
>>>> ... "VM resetting" looks something different though; I've been under the
>>>> impression that the pedantic (=unquirked) MTRR configuration didn't
>>>> impact other things than speed. Janusz, maybe you could contribute with
>>>> a host kernel bisection for the VM reset symptom.
>>> To be more exact - VM is mostly not starting or its starting after long
>>> time and then resetting it self at random time (but before system boots)
>>> or gets very high cpu usage or sometimes boots without problem (the
>>> least possible case scenario). When I start VM with -vga std, not with
>>> my gpu passthrough - in most of the time I get this:
>>> http://pastebin.com/raw.php?i=CKrNsueS
>>
>> It seems the behaviour is different with previous (previously, it can boot
>> but slowly), right?
>>
>> The URL cannot be accessed; I do not know whether it's your web server issue or
>> our networking issue.
>>
>
> It can be accessed and dump as below:
>
> KVM internal error. Suberror: 1
> emulation failure
> KVM internal error. Suberror: 1
> emulation failure
> KVM internal error. Suberror: 1
> emulation failure
> KVM internal error. Suberror: 1
> emulation failure
> KVM internal error. Suberror: 1
> emulation failure
> KVM internal error. Suberror: 1
> emulation failure
> KVM internal error. Suberror: 1
> emulation failure
> EAX=bfefa000 EBX=00000002 ECX=00000000 EDX=00000600
> ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
> EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
> SS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> DS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> FS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> GS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
> GDT= bfee87d8 0000003f
> IDT= 00000000 0000ffff
> CR0=00000033 CR2=00000000 CR3=bfefa000 CR4=00000640
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> EAX=bfefa000 EBX=00000002 ECX=00000000 EDX=00000600
> ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
> EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
> SS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> DS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> FS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> GS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
> GDT= bfee87d8 0000003f
> IDT= 00000000 0000ffff
> CR0=00000033 CR2=00000000 CR3=bfefa000 CR4=00000640
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000600
> ESI=00000000 EDI=00002000 EBP=00000000 ESP=00000000
> EIP=00001000 EFL=00010046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0000 00000000 0000ffff 00009300
> CS =9f00 0009f000 0000ffff 00009b00
> SS =0000 00000000 0000ffff 00009300
> DS =0000 00000000 0000ffff 00009300
> FS =0000 00000000 0000ffff 00009300
> GS =0000 00000000 0000ffff 00009300
> LDT=0000 00000000 0000ffff 00008200
> TR =0000 00000000 0000ffff 00008b00
> GDT= 00000000 0000ffff
> IDT= 00000000 0000ffff
> CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> EAX=00000023 EBX=00000002 ECX=00000000 EDX=00000600
> ESI=00000000 EDI=00001fc2 EBP=00000000 ESP=00000000
> EIP=00001000 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> CS =9f00 0009f000 0000ffff 00009b00 DPL=0 CS16 [-RA]
> SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> DS =9f00 0009f000 0000ffff 00009300 DPL=0 DS16 [-WA]
> FS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> GS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
> GDT= bfee87d8 0000003f
> IDT= 00000000 0000ffff
> CR0=00000033 CR2=00000000 CR3=00000000 CR4=00000000
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> EAX=00000023 EBX=00000002 ECX=00000000 EDX=00000600
> ESI=00000000 EDI=00001fc2 EBP=00000000 ESP=00000000
> EIP=00001000 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> CS =9f00 0009f000 0000ffff 00009b00 DPL=0 CS16 [-RA]
> SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> DS =9f00 0009f000 0000ffff 00009300 DPL=0 DS16 [-WA]
> FS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> GS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
> GDT= bfee87d8 0000003f
> IDT= 00000000 0000ffff
> CR0=00000033 CR2=00000000 CR3=00000000 CR4=00000000
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> EAX=00000023 EBX=00000002 ECX=00000000 EDX=00000600
> ESI=00000000 EDI=00001fc2 EBP=00000000 ESP=00000000
> EIP=00001000 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> CS =9f00 0009f000 0000ffff 00009b00 DPL=0 CS16 [-RA]
> SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> DS =9f00 0009f000 0000ffff 00009300 DPL=0 DS16 [-WA]
> FS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> GS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
> GDT= bfee87d8 0000003f
> IDT= 00000000 0000ffff
> CR0=00000033 CR2=00000000 CR3=00000000 CR4=00000000
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> EAX=bfefa000 EBX=00000002 ECX=00000000 EDX=00000600
> ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
> EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
> SS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> DS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> FS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> GS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
> GDT= bfee87d8 0000003f
> IDT= 00000000 0000ffff
> CR0=00000033 CR2=00000000 CR3=bfefa000 CR4=00000640
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=af af af af af af af af af af af af af af af af af af af af <00> 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
Got it, thank you!
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-09-21 2:51 ` Xiao Guangrong
2015-09-21 3:30 ` Wanpeng Li
@ 2015-09-21 8:23 ` Janusz
1 sibling, 0 replies; 37+ messages in thread
From: Janusz @ 2015-09-21 8:23 UTC (permalink / raw)
To: Xiao Guangrong, Laszlo Ersek, kvm; +Cc: edk2-devel
On 21.09.2015 at 04:51, Xiao Guangrong wrote:
>
> Thanks for your report and analysis, Janusz!
>
> On 09/19/2015 01:48 AM, Janusz wrote:
>> On 18.09.2015 at 12:07, Laszlo Ersek wrote:
>>> On 09/18/15 11:37, Janusz wrote:
>>>> Hello,
>>>>
>>>> I am writting about this patch that was posted by Xiao:
>>>> http://www.spinics.net/lists/kvm/msg119044.html and this:
>>>> http://www.spinics.net/lists/kvm/msg119045.html
>>>> I've tested both kernel 4.2 and 4.3 and problem still exists when I
>>>> use
>>>> OVMF - 100% cpu usage, VM resetting, while it works properly on
>>>> kernel 4.1
>>> My last (still current) request remains "please quirk it". See the end
>>> of <http://thread.gmane.org/gmane.linux.kernel/1952205/focus=1996025>,
>>> and other messages in that subthread.
>> when I saw message from Xiao that he posted patch for it and have seen
>> this code in kernel sources (landed in 4.2-rc3) I though that the status
>> is not "please quirk it" anymore
>>>
>>> I haven't been following kernel development, so maybe the quirk has not
>>> happened. No clue.
>>>
>>> ... "VM resetting" looks something different though; I've been under
>>> the
>>> impression that the pedantic (=unquirked) MTRR configuration didn't
>>> impact other things than speed. Janusz, maybe you could contribute with
>>> a host kernel bisection for the VM reset symptom.
>> To be more exact - VM is mostly not starting or its starting after long
>> time and then resetting it self at random time (but before system boots)
>> or gets very high cpu usage or sometimes boots without problem (the
>> least possible case scenario). When I start VM with -vga std, not with
>> my gpu passthrough - in most of the time I get this:
>> http://pastebin.com/raw.php?i=CKrNsueS
>
> It seems the behaviour is different with previous (previously, it can
> boot
> but slowly), right?
>
> The URL cat not be accessed, i do not know it's your web server issue or
> our networking issue.
>
>>
>> Result of bisect:
>>
>> git bisect start
>> # bad: [d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754] Linux 4.2-rc1
>> git bisect bad d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754
>> # good: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1
>> git bisect good b953c0d234bc72e8489d3bf51a276c5c4ec85345
>> # good: [4570a37169d4b44d316f40b2ccc681dc93fedc7b] Merge tag
>> 'sound-4.2-rc1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
>> git bisect good 4570a37169d4b44d316f40b2ccc681dc93fedc7b
>> # good: [8d7804a2f03dbd34940fcb426450c730adf29dae] Merge tag
>> 'driver-core-4.2-rc1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
>> git bisect good 8d7804a2f03dbd34940fcb426450c730adf29dae
>> # bad: [78c10e556ed904d5bfbd71e9cadd8ce8f25d6982] Merge branch
>> 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
>> git bisect bad 78c10e556ed904d5bfbd71e9cadd8ce8f25d6982
>> # good: [623f0e137c0fedb81bbf3d88be4ed300eee163da] Staging: lustre: fix
>> space before and after comma in dt_object.c
>> git bisect good 623f0e137c0fedb81bbf3d88be4ed300eee163da
>> # bad: [8c7febe83915332276cab49e89f6580bb963fb9a] Merge tag
>> 'tty-4.2-rc1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
>> git bisect bad 8c7febe83915332276cab49e89f6580bb963fb9a
>> # good: [bcf5b92d9bbf0b7683199615f0f184e89fa486bc] staging: rtl8192e:
>> Remove rt_hi_throughput::ChnkOp
>> git bisect good bcf5b92d9bbf0b7683199615f0f184e89fa486bc
>> # good: [71eec4836b834b992e0cefeefc8b85efe4cb185b] drivers: PL011: allow
>> avoiding UART enabling/disabling
>> git bisect good 71eec4836b834b992e0cefeefc8b85efe4cb185b
>> # good: [49ef9c850154756cf2fbc50fd3804c44675d4633] staging: comedi:
>> me4000: rename 'thisboard' variables
>> git bisect good 49ef9c850154756cf2fbc50fd3804c44675d4633
>> # good: [2a4462418af771ef9f1f1d1532bcbb8799df842d] tty/serial: kill off
>> set_irq_flags usage
>> git bisect good 2a4462418af771ef9f1f1d1532bcbb8799df842d
>> # good: [6d43b0f482561ab421a91ebf59a51192d66cf8a7] Staging: sm750fb:
>> ddk750_swi2c.c: Insert spaces around operators
>> git bisect good 6d43b0f482561ab421a91ebf59a51192d66cf8a7
>> # good: [84e1eb83d0b9e0969a59b6848d718eaf71e98fcb] MAINTAINERS: tty: add
>> serial docs directory
>> git bisect good 84e1eb83d0b9e0969a59b6848d718eaf71e98fcb
>> # good: [53a20e9e378ecd52f0afa4b60f8f8c81b6f97c27] staging: wilc1000:
>> disable driver due to build warnings
>> git bisect good 53a20e9e378ecd52f0afa4b60f8f8c81b6f97c27
>> # good: [71206b9f8120eb513c621d4f31906577bb658df3] Doc:
>> serial-rs485.txt: update RS485 driver interface
>> git bisect good 71206b9f8120eb513c621d4f31906577bb658df3
>> # bad: [23908db413eccd77084b09c9b0a4451dfb0524c0] Merge tag
>> 'staging-4.2-rc1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
>> git bisect bad 23908db413eccd77084b09c9b0a4451dfb0524c0
>> # first bad commit: [23908db413eccd77084b09c9b0a4451dfb0524c0] Merge tag
>> 'staging-4.2-rc1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
>>
>
> The bad commit is:
> commit 23908db413eccd77084b09c9b0a4451dfb0524c0
> Merge: 8d7804a 53a20e9
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Fri Jun 26 15:46:08 2015 -0700
>
> Hard to find which one causes the real issue... but i will try to
> reproduce
> it, could you please share your command line?
OPTS="-vga none"
OPTS="$OPTS -drive if=pflash,format=raw,readonly,file=/home/janusz/uefi/OVMF_CODE.fd"
OPTS="$OPTS -drive if=pflash,format=raw,file=/home/janusz/uefi/OVMF_VARS.fd"
OPTS="$OPTS -enable-kvm -m 10000 -cpu host -smp 8,cores=4,threads=2,sockets=1"
OPTS="$OPTS -device vfio-pci,host=01:00.0,romfile=/home/janusz/uefi/uefi-vga.bin"
OPTS="$OPTS -device vfio-pci,host=01:00.1"
OPTS="$OPTS -net nic,macaddr=50:E5:49:57:74:E3 -net bridge,vlan=0"
OPTS="$OPTS -soundhw hda"
OPTS="$OPTS -boot d"
OPTS="$OPTS -usb -usbdevice host:09da:000a -usbdevice host:1a2c:0c21"
OPTS="$OPTS -hda /mnt/storage2/win10.img"
QEMU_AUDIO_DRV=pa qemu-system-x86_64 $OPTS
My CPU is an i7 6700K.
Kernel command-line parameters:
intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=0
On 4.1 I had to use:
i915.preliminary_hw_support=1 intel_iommu=on
vfio_iommu_type1.allow_unsafe_interrupts=0 intel_iommu=igfx_off
because otherwise I got a kernel panic. On the kernel versions where the
VM doesn't work, my IOMMU groups were empty with igfx_off, and the kernel
no longer panicked without it.
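
For anyone trying to reproduce this, a minimal sketch for checking whether the IOMMU groups are populated. It assumes the usual sysfs layout, /sys/kernel/iommu_groups/<N>/devices/<PCI address>; the function name and its optional root argument are made up for this example, and the argument exists only so the function can be pointed at a test directory.

```shell
#!/bin/sh
# Print one line per device: "group <N>: <PCI address>".
# Prints nothing when no groups exist (the "empty groups" case).
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # An unmatched glob stays literal; skip it.
        [ -e "$dev" ] || continue
        group="${dev%/devices/*}"
        printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
    done
}

list_iommu_groups
```

If this prints nothing on a host booted with intel_iommu=on, the passthrough devices (01:00.0 and 01:00.1 in the command line above) cannot be bound to vfio-pci.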
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-09-18 9:37 [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled Janusz
2015-09-18 10:07 ` Laszlo Ersek
@ 2015-09-22 8:59 ` Paolo Bonzini
2015-09-22 10:29 ` Janusz
1 sibling, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2015-09-22 8:59 UTC (permalink / raw)
To: Janusz, kvm; +Cc: edk2-devel, guangrong.xiao
On 18/09/2015 11:37, Janusz wrote:
> Hello,
>
> I am writting about this patch that was posted by Xiao:
> http://www.spinics.net/lists/kvm/msg119044.html and this:
> http://www.spinics.net/lists/kvm/msg119045.html
> I've tested both kernel 4.2 and 4.3 and problem still exists when I use
> OVMF - 100% cpu usage, VM resetting, while it works properly on kernel 4.1
Is this an Intel or AMD CPU?
Paolo
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-09-22 8:59 ` Paolo Bonzini
@ 2015-09-22 10:29 ` Janusz
0 siblings, 0 replies; 37+ messages in thread
From: Janusz @ 2015-09-22 10:29 UTC (permalink / raw)
To: Paolo Bonzini, kvm; +Cc: edk2-devel, guangrong.xiao
On 22.09.2015 at 10:59, Paolo Bonzini wrote:
>
> On 18/09/2015 11:37, Janusz wrote:
>> Hello,
>>
>> I am writting about this patch that was posted by Xiao:
>> http://www.spinics.net/lists/kvm/msg119044.html and this:
>> http://www.spinics.net/lists/kvm/msg119045.html
>> I've tested both kernel 4.2 and 4.3 and problem still exists when I use
>> OVMF - 100% cpu usage, VM resetting, while it works properly on kernel 4.1
> Is this an Intel or AMD CPU?
>
> Paolo
Intel i7 6700k
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-09-21 3:40 ` Xiao Guangrong
@ 2015-10-01 14:12 ` Janusz
2015-10-01 14:18 ` Paolo Bonzini
0 siblings, 1 reply; 37+ messages in thread
From: Janusz @ 2015-10-01 14:12 UTC (permalink / raw)
To: Xiao Guangrong, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
I can now also add that the problem only occurs when I allow the VM to use
more than one core. For example, with
-smp 8,cores=4,threads=2,sockets=1, or other combinations like -smp
4,threads=1, it does not work; without the -smp option the VM always runs
without problems.
Any idea what it could be, or what would help to find out what
is causing this?
On 21.09.2015 at 05:40, Xiao Guangrong wrote:
>
>
> On 09/21/2015 11:30 AM, Wanpeng Li wrote:
>> On 9/21/15 10:51 AM, Xiao Guangrong wrote:
>>>
>>> Thanks for your report and analysis, Janusz!
>>>
>>> On 09/19/2015 01:48 AM, Janusz wrote:
>>>> On 18.09.2015 at 12:07, Laszlo Ersek wrote:
>>>>> On 09/18/15 11:37, Janusz wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I am writting about this patch that was posted by Xiao:
>>>>>> http://www.spinics.net/lists/kvm/msg119044.html and this:
>>>>>> http://www.spinics.net/lists/kvm/msg119045.html
>>>>>> I've tested both kernel 4.2 and 4.3 and problem still exists when
>>>>>> I use
>>>>>> OVMF - 100% cpu usage, VM resetting, while it works properly on
>>>>>> kernel 4.1
>>>>> My last (still current) request remains "please quirk it". See the
>>>>> end
>>>>> of <http://thread.gmane.org/gmane.linux.kernel/1952205/focus=1996025>,
>>>>> and other messages in that subthread.
>>>> when I saw message from Xiao that he posted patch for it and have seen
>>>> this code in kernel sources (landed in 4.2-rc3) I though that the
>>>> status
>>>> is not "please quirk it" anymore
>>>>>
>>>>> I haven't been following kernel development, so maybe the quirk
>>>>> has not
>>>>> happened. No clue.
>>>>>
>>>>> ... "VM resetting" looks something different though; I've been
>>>>> under the
>>>>> impression that the pedantic (=unquirked) MTRR configuration didn't
>>>>> impact other things than speed. Janusz, maybe you could contribute
>>>>> with
>>>>> a host kernel bisection for the VM reset symptom.
>>>> To be more exact - VM is mostly not starting or its starting after
>>>> long
>>>> time and then resetting it self at random time (but before system
>>>> boots)
>>>> or gets very high cpu usage or sometimes boots without problem (the
>>>> least possible case scenario). When I start VM with -vga std, not with
>>>> my gpu passthrough - in most of the time I get this:
>>>> http://pastebin.com/raw.php?i=CKrNsueS
>>>
>>> It seems the behaviour is different with previous (previously, it
>>> can boot
>>> but slowly), right?
>>>
>>> The URL cat not be accessed, i do not know it's your web server
>>> issue or
>>> our networking issue.
>>>
>>
>> It can be accessed and dump as below:
>>
>> KVM internal error. Suberror: 1
>> emulation failure
>> KVM internal error. Suberror: 1
>> emulation failure
>> KVM internal error. Suberror: 1
>> emulation failure
>> KVM internal error. Suberror: 1
>> emulation failure
>> KVM internal error. Suberror: 1
>> emulation failure
>> KVM internal error. Suberror: 1
>> emulation failure
>> KVM internal error. Suberror: 1
>> emulation failure
>> EAX=bfefa000 EBX=00000002 ECX=00000000 EDX=00000600
>> ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
>> EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
>> SS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> DS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> FS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> GS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
>> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
>> GDT= bfee87d8 0000003f
>> IDT= 00000000 0000ffff
>> CR0=00000033 CR2=00000000 CR3=bfefa000 CR4=00000640
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=af af af af af af af af af af af af af af af af af af af af <00>
>> 00 00 00 00 00 00 00 00 00 00
>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> EAX=bfefa000 EBX=00000002 ECX=00000000 EDX=00000600
>> ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
>> EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
>> SS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> DS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> FS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> GS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
>> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
>> GDT= bfee87d8 0000003f
>> IDT= 00000000 0000ffff
>> CR0=00000033 CR2=00000000 CR3=bfefa000 CR4=00000640
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=af af af af af af af af af af af af af af af af af af af af <00>
>> 00 00 00 00 00 00 00 00 00 00
>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000600
>> ESI=00000000 EDI=00002000 EBP=00000000 ESP=00000000
>> EIP=00001000 EFL=00010046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0000 00000000 0000ffff 00009300
>> CS =9f00 0009f000 0000ffff 00009b00
>> SS =0000 00000000 0000ffff 00009300
>> DS =0000 00000000 0000ffff 00009300
>> FS =0000 00000000 0000ffff 00009300
>> GS =0000 00000000 0000ffff 00009300
>> LDT=0000 00000000 0000ffff 00008200
>> TR =0000 00000000 0000ffff 00008b00
>> GDT= 00000000 0000ffff
>> IDT= 00000000 0000ffff
>> CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=af af af af af af af af af af af af af af af af af af af af <00>
>> 00 00 00 00 00 00 00 00 00 00
>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> EAX=00000023 EBX=00000002 ECX=00000000 EDX=00000600
>> ESI=00000000 EDI=00001fc2 EBP=00000000 ESP=00000000
>> EIP=00001000 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> CS =9f00 0009f000 0000ffff 00009b00 DPL=0 CS16 [-RA]
>> SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> DS =9f00 0009f000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> FS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> GS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
>> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
>> GDT= bfee87d8 0000003f
>> IDT= 00000000 0000ffff
>> CR0=00000033 CR2=00000000 CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=af af af af af af af af af af af af af af af af af af af af <00>
>> 00 00 00 00 00 00 00 00 00 00
>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> EAX=00000023 EBX=00000002 ECX=00000000 EDX=00000600
>> ESI=00000000 EDI=00001fc2 EBP=00000000 ESP=00000000
>> EIP=00001000 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> CS =9f00 0009f000 0000ffff 00009b00 DPL=0 CS16 [-RA]
>> SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> DS =9f00 0009f000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> FS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> GS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
>> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
>> GDT= bfee87d8 0000003f
>> IDT= 00000000 0000ffff
>> CR0=00000033 CR2=00000000 CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=af af af af af af af af af af af af af af af af af af af af <00>
>> 00 00 00 00 00 00 00 00 00 00
>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> EAX=00000023 EBX=00000002 ECX=00000000 EDX=00000600
>> ESI=00000000 EDI=00001fc2 EBP=00000000 ESP=00000000
>> EIP=00001000 EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> CS =9f00 0009f000 0000ffff 00009b00 DPL=0 CS16 [-RA]
>> SS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> DS =9f00 0009f000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> FS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> GS =0000 00000000 0000ffff 00009300 DPL=0 DS16 [-WA]
>> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
>> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
>> GDT= bfee87d8 0000003f
>> IDT= 00000000 0000ffff
>> CR0=00000033 CR2=00000000 CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=af af af af af af af af af af af af af af af af af af af af <00>
>> 00 00 00 00 00 00 00 00 00 00
>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> EAX=bfefa000 EBX=00000002 ECX=00000000 EDX=00000600
>> ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
>> EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
>> SS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> DS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> FS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> GS =0008 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
>> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
>> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
>> GDT= bfee87d8 0000003f
>> IDT= 00000000 0000ffff
>> CR0=00000033 CR2=00000000 CR3=bfefa000 CR4=00000640
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=af af af af af af af af af af af af af af af af af af af af <00>
>> 00 00 00 00 00 00 00 00 00 00
>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>
>
>
> Got it, thank you!
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-01 14:12 ` Janusz
@ 2015-10-01 14:18 ` Paolo Bonzini
2015-10-02 14:38 ` Janusz
0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2015-10-01 14:18 UTC (permalink / raw)
To: Janusz, Xiao Guangrong, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 01/10/2015 16:12, Janusz wrote:
> Now, I can also add, that the problem is only when I allow VM to use
> more than one core, so with option for example:
> -smp 8,cores=4,threads=2,sockets=1 and other combinations like -smp
> 4,threads=1 its not working, and without it I am always running VM
> without problems
>
> Any ideas what can it be? or any idea what would help to find out what
> is causing this?
I am going to send a revert of the patch tomorrow.
Paolo
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-01 14:18 ` Paolo Bonzini
@ 2015-10-02 14:38 ` Janusz
2015-10-10 20:07 ` Xiao Guangrong
2015-10-14 3:58 ` Xiao Guangrong
0 siblings, 2 replies; 37+ messages in thread
From: Janusz @ 2015-10-02 14:38 UTC (permalink / raw)
To: Paolo Bonzini, Xiao Guangrong, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 01.10.2015 at 16:18, Paolo Bonzini wrote:
>
> On 01/10/2015 16:12, Janusz wrote:
>> Now, I can also add, that the problem is only when I allow VM to use
>> more than one core, so with option for example:
>> -smp 8,cores=4,threads=2,sockets=1 and other combinations like -smp
>> 4,threads=1 its not working, and without it I am always running VM
>> without problems
>>
>> Any ideas what can it be? or any idea what would help to find out what
>> is causing this?
> I am going to send a revert of the patch tomorrow.
>
> Paolo
Thanks, but reverting the patch doesn't help, so something else is wrong here.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-02 14:38 ` Janusz
@ 2015-10-10 20:07 ` Xiao Guangrong
2015-10-12 18:20 ` Xiao Guangrong
2015-10-14 3:58 ` Xiao Guangrong
1 sibling, 1 reply; 37+ messages in thread
From: Xiao Guangrong @ 2015-10-10 20:07 UTC (permalink / raw)
To: Janusz, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 10/02/2015 10:38 PM, Janusz wrote:
> On 01.10.2015 at 16:18, Paolo Bonzini wrote:
>>
>> On 01/10/2015 16:12, Janusz wrote:
>>> Now, I can also add, that the problem is only when I allow VM to use
>>> more than one core, so with option for example:
>>> -smp 8,cores=4,threads=2,sockets=1 and other combinations like -smp
>>> 4,threads=1 its not working, and without it I am always running VM
>>> without problems
>>>
>>> Any ideas what can it be? or any idea what would help to find out what
>>> is causing this?
>> I am going to send a revert of the patch tomorrow.
>>
>> Paolo
> Thanks, but revert patch doesn't help, so something else is wrong here
>
It seems I can reproduce it now ... and finally I get a little free time now :(
I will dig into it and fix it ASAP.
Thank you, Janusz and Paolo!
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-10 20:07 ` Xiao Guangrong
@ 2015-10-12 18:20 ` Xiao Guangrong
2015-10-12 18:29 ` Xiao Guangrong
0 siblings, 1 reply; 37+ messages in thread
From: Xiao Guangrong @ 2015-10-12 18:20 UTC (permalink / raw)
To: Janusz, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 10/11/2015 04:07 AM, Xiao Guangrong wrote:
>
>
> On 10/02/2015 10:38 PM, Janusz wrote:
>> On 01.10.2015 at 16:18, Paolo Bonzini wrote:
>>>
>>> On 01/10/2015 16:12, Janusz wrote:
>>>> Now, I can also add, that the problem is only when I allow VM to use
>>>> more than one core, so with option for example:
>>>> -smp 8,cores=4,threads=2,sockets=1 and other combinations like -smp
>>>> 4,threads=1 its not working, and without it I am always running VM
>>>> without problems
>>>>
>>>> Any ideas what can it be? or any idea what would help to find out what
>>>> is causing this?
>>> I am going to send a revert of the patch tomorrow.
>>>
>>> Paolo
>> Thanks, but revert patch doesn't help, so something else is wrong here
>>
>
> It seems i can reproduce it now ... and finally i get little free time now :(
> I will dig into it and fix it asap.
>
> Thank you, Janusz and Paolo!
I think I have figured out the root cause; I got these traces:
<...>-47935 [052] d... 20017.763244: kvm_exit: reason EPT_VIOLATION rip 0xa0000 info 184 0
<...>-47935 [052] .... 20017.763244: kvm_page_fault: address a0000 error_code 184
<...>-47935 [052] .... 20017.763269: mark_mmio_spte: sptep:ffff880841c3d500 gfn a0 access 6 gen fff94
<...>-47935 [052] .... 20017.763272: kvm_mmu_pagetable_walk: addr a0000 pferr 10 F
<...>-47935 [052] .... 20017.763272: kvm_mmu_paging_element: pte bfeff023 level 4
<...>-47935 [052] .... 20017.763273: kvm_mmu_paging_element: pte bff00023 level 3
<...>-47935 [052] .... 20017.763273: kvm_mmu_paging_element: pte e3 level 2
<...>-47935 [052] .... 20017.763274: kvm_emulate_insn: 0:a0000: (prot32)
<...>-47935 [052] .... 20017.763274: kvm_emulate_insn: 0:a0000: (prot32) failed
<...>-
This tells me that the guest is executing at address 0xa0000, but that is an MMIO
address, so KVM cannot emulate the instruction and reports an internal error.
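
As a cross-check of the "executing at 0xa0000" reading, here is a small sketch that decodes the access-type bits of the value shown as "info 184" / "error_code 184". It assumes that field is the VMX EPT-violation exit qualification in hex, with bit 0 = data read, bit 1 = data write and bit 2 = instruction fetch as described in the Intel SDM; the function name is made up for this example.

```shell
#!/bin/sh
# Decode bits 0-2 of an EPT-violation exit qualification.
# Pass the value with a 0x prefix, e.g. 0x184 for "info 184".
decode_ept_access() {
    qual=$(( $1 ))
    out=""
    [ $(( qual & 1 )) -ne 0 ] && out="$out read"
    [ $(( qual & 2 )) -ne 0 ] && out="$out write"
    [ $(( qual & 4 )) -ne 0 ] && out="$out fetch"
    # Drop the leading space left by the first appended word.
    echo "${out# }"
}

decode_ept_access 0x184   # prints: fetch
```

Under that assumption, 0x184 has only the fetch bit set among the access-type bits, which matches an instruction fetch from 0xa0000 rather than a data access.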
Actually, 0xa0000 belongs to SMRAM (0x30000 is the SMRAM base and 0x80000 is the EIP
offset, 0x30000 + 0x80000 = 0xa0000); however, from QEMU's dump:
EAX=bfefe000 EBX=00000002 ECX=00000000 EDX=00000600
ESI=00000000 EDI=00003eb8 EBP=00000000 ESP=00000000
EIP=000a0000 EFL=00010086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
we see that the VCPU is not in SMM.
I dropped some patches (the MTRR patches); the bug is then triggered less frequently,
but it cannot be avoided completely :(
I think we need to check OVMF's code to see if there is a rare case where the SMM
handler is called but KVM has not received an SMI at that time...
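
For others who want to collect the same kind of trace, a minimal sketch follows. It assumes tracefs is reachable at /sys/kernel/debug/tracing and that the events shown above come from the kvm and kvmmmu event groups; the function name and its optional root argument are made up for this example.

```shell
#!/bin/sh
# Enable every event in the kvm and kvmmmu trace groups.
# The tracefs root can be overridden, e.g. for a dry run on a scratch tree.
enable_kvm_events() {
    tracing="${1:-/sys/kernel/debug/tracing}"
    for group in kvm kvmmmu; do
        echo 1 > "$tracing/events/$group/enable" || return 1
    done
}

# Usage (as root), then reproduce the failure and read the log:
#   enable_kvm_events
#   cat /sys/kernel/debug/tracing/trace_pipe
```

Enabling whole groups is noisy; once the failing path is known, individual events such as kvm_exit or kvm_emulate_insn can be enabled on their own through the same per-event enable files.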
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-12 18:20 ` Xiao Guangrong
@ 2015-10-12 18:29 ` Xiao Guangrong
0 siblings, 0 replies; 37+ messages in thread
From: Xiao Guangrong @ 2015-10-12 18:29 UTC (permalink / raw)
To: Janusz, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 10/13/2015 02:20 AM, Xiao Guangrong wrote:
>
>
> On 10/11/2015 04:07 AM, Xiao Guangrong wrote:
>>
>>
>> On 10/02/2015 10:38 PM, Janusz wrote:
>>> On 01.10.2015 at 16:18, Paolo Bonzini wrote:
>>>>
>>>> On 01/10/2015 16:12, Janusz wrote:
>>>>> Now, I can also add, that the problem is only when I allow VM to use
>>>>> more than one core, so with option for example:
>>>>> -smp 8,cores=4,threads=2,sockets=1 and other combinations like -smp
>>>>> 4,threads=1 its not working, and without it I am always running VM
>>>>> without problems
>>>>>
>>>>> Any ideas what can it be? or any idea what would help to find out what
>>>>> is causing this?
>>>> I am going to send a revert of the patch tomorrow.
>>>>
>>>> Paolo
>>> Thanks, but revert patch doesn't help, so something else is wrong here
>>>
>>
>> It seems i can reproduce it now ... and finally i get little free time now :(
>> I will dig into it and fix it asap.
>>
>> Thank you, Janusz and Paolo!
>
> I think i have figured out the root case, i got these traces:
> <...>-47935 [052] d... 20017.763244: kvm_exit: reason EPT_VIOLATION rip 0xa0000 info 184 0
> <...>-47935 [052] .... 20017.763244: kvm_page_fault: address a0000 error_code 184
> <...>-47935 [052] .... 20017.763269: mark_mmio_spte: sptep:ffff880841c3d500 gfn a0
> access 6 gen fff94
> <...>-47935 [052] .... 20017.763272: kvm_mmu_pagetable_walk: addr a0000 pferr 10 F
> <...>-47935 [052] .... 20017.763272: kvm_mmu_paging_element: pte bfeff023 level 4
> <...>-47935 [052] .... 20017.763273: kvm_mmu_paging_element: pte bff00023 level 3
> <...>-47935 [052] .... 20017.763273: kvm_mmu_paging_element: pte e3 level 2
> <...>-47935 [052] .... 20017.763274: kvm_emulate_insn: 0:a0000: (prot32)
> <...>-47935 [052] .... 20017.763274: kvm_emulate_insn: 0:a0000: (prot32) failed
> <...>-
> It told me that guest is executing on address 0xa0000 but it is a MMIO address, so KVM
> can not emulate it and complained with internal error.
>
> Actually, 0xa0000 is belong to SMRAM (0x30000 is SMRAM base and 0x80000 is EIP offset,
> 0x30000 + 0x80000 = 0xa0000), however, from QEMU's dump:
Wrong here...
Please ignore this mail... I definitely need some rest.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-02 14:38 ` Janusz
2015-10-10 20:07 ` Xiao Guangrong
@ 2015-10-14 3:58 ` Xiao Guangrong
2015-10-14 7:37 ` Janusz
1 sibling, 1 reply; 37+ messages in thread
From: Xiao Guangrong @ 2015-10-14 3:58 UTC (permalink / raw)
To: Janusz, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
Janusz,
Could you please try this:
$ git diff
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 185fc16..bdd564f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4957,12 +4957,14 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu)
 
 	++vcpu->stat.insn_emulation_fail;
 	trace_kvm_emulate_insn_failed(vcpu);
+#if 0
 	if (!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) {
 		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
 		vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
 		vcpu->run->internal.ndata = 0;
 		r = EMULATE_FAIL;
 	}
+#endif
 	kvm_queue_exception(vcpu, UD_VECTOR);
 
 	return r;
to see whether the issue is still there?
On 10/02/2015 10:38 PM, Janusz wrote:
> On 01.10.2015 at 16:18, Paolo Bonzini wrote:
>>
>> On 01/10/2015 16:12, Janusz wrote:
>>> Now, I can also add, that the problem is only when I allow VM to use
>>> more than one core, so with option for example:
>>> -smp 8,cores=4,threads=2,sockets=1 and other combinations like -smp
>>> 4,threads=1 its not working, and without it I am always running VM
>>> without problems
>>>
>>> Any ideas what can it be? or any idea what would help to find out what
>>> is causing this?
>> I am going to send a revert of the patch tomorrow.
>>
>> Paolo
> Thanks, but revert patch doesn't help, so something else is wrong here
>
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-14 3:58 ` Xiao Guangrong
@ 2015-10-14 7:37 ` Janusz
2015-10-14 8:24 ` Xiao Guangrong
0 siblings, 1 reply; 37+ messages in thread
From: Janusz @ 2015-10-14 7:37 UTC (permalink / raw)
To: Xiao Guangrong, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
I was able to run my virtual machine with this, but CPU usage was very
high whenever something happened in it, such as booting the system. Once, my
virtual machine hung and I couldn't even get my mouse / keyboard back from QEMU.
When I used VGA passthrough, I didn't get any video output, and CPU usage
was also high. I tried it on 4.3.
On 14.10.2015 at 05:58, Xiao Guangrong wrote:
>
> Janusz,
>
> Could you please try this:
>
> $ git diff
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 185fc16..bdd564f 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4957,12 +4957,14 @@ static int handle_emulation_failure(struct
> kvm_vcpu *vcpu)
>
> ++vcpu->stat.insn_emulation_fail;
> trace_kvm_emulate_insn_failed(vcpu);
> +#if 0
> if (!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) {
> vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
> vcpu->run->internal.suberror =
> KVM_INTERNAL_ERROR_EMULATION;
> vcpu->run->internal.ndata = 0;
> r = EMULATE_FAIL;
> }
> +#endif
> kvm_queue_exception(vcpu, UD_VECTOR);
>
> return r;
>
> To see if the issue still there?
>
>
> On 10/02/2015 10:38 PM, Janusz wrote:
>> On 01.10.2015 at 16:18, Paolo Bonzini wrote:
>>>
>>> On 01/10/2015 16:12, Janusz wrote:
>>>> Now, I can also add, that the problem is only when I allow VM to use
>>>> more than one core, so with option for example:
>>>> -smp 8,cores=4,threads=2,sockets=1 and other combinations like -smp
>>>> 4,threads=1 its not working, and without it I am always running VM
>>>> without problems
>>>>
>>>> Any ideas what can it be? or any idea what would help to find out what
>>>> is causing this?
>>> I am going to send a revert of the patch tomorrow.
>>>
>>> Paolo
>> Thanks, but revert patch doesn't help, so something else is wrong here
>>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-14 7:37 ` Janusz
@ 2015-10-14 8:24 ` Xiao Guangrong
2015-10-14 8:32 ` Xiao Guangrong
0 siblings, 1 reply; 37+ messages in thread
From: Xiao Guangrong @ 2015-10-14 8:24 UTC (permalink / raw)
To: Janusz, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 10/14/2015 03:37 PM, Janusz wrote:
> I was able to run my virtual machine with this, but had very high cpu
> usage when something happen in it like booting system. once, my virtual
> machine hang and I couln't even get my mouse / keyboard back from qemu.
> When I did vga passthrough, I didn't get any video output, and cpu usage
> was also high. Tried it on 4.3
Which tree are you using? Is it the kvm tree?
Could you please work on the queue branch of the current kvm tree, based on
top commit 73917739334c6509: KVM: x86: fix SMI to halted VCPU.
Hmm... interesting, this diff works on my box...
>
> On 14.10.2015 at 05:58, Xiao Guangrong wrote:
>>
>> Janusz,
>>
>> Could you please try this:
>>
>> $ git diff
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 185fc16..bdd564f 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -4957,12 +4957,14 @@ static int handle_emulation_failure(struct
>> kvm_vcpu *vcpu)
>>
>> ++vcpu->stat.insn_emulation_fail;
>> trace_kvm_emulate_insn_failed(vcpu);
>> +#if 0
>> if (!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) {
>> vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>> vcpu->run->internal.suberror =
>> KVM_INTERNAL_ERROR_EMULATION;
>> vcpu->run->internal.ndata = 0;
>> r = EMULATE_FAIL;
>> }
>> +#endif
>> kvm_queue_exception(vcpu, UD_VECTOR);
>>
>> return r;
>>
>> To see if the issue still there?
>>
>>
>> On 10/02/2015 10:38 PM, Janusz wrote:
>>> On 01.10.2015 at 16:18, Paolo Bonzini wrote:
>>>>
>>>> On 01/10/2015 16:12, Janusz wrote:
>>>>> Now, I can also add, that the problem is only when I allow VM to use
>>>>> more than one core, so with option for example:
>>>>> -smp 8,cores=4,threads=2,sockets=1 and other combinations like -smp
>>>>> 4,threads=1 its not working, and without it I am always running VM
>>>>> without problems
>>>>>
>>>>> Any ideas what can it be? or any idea what would help to find out what
>>>>> is causing this?
>>>> I am going to send a revert of the patch tomorrow.
>>>>
>>>> Paolo
>>> Thanks, but revert patch doesn't help, so something else is wrong here
>>>
>
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-14 8:24 ` Xiao Guangrong
@ 2015-10-14 8:32 ` Xiao Guangrong
2015-10-14 9:13 ` Janusz
` (2 more replies)
0 siblings, 3 replies; 37+ messages in thread
From: Xiao Guangrong @ 2015-10-14 8:32 UTC (permalink / raw)
To: Janusz, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
[-- Attachment #1: Type: text/plain, Size: 868 bytes --]
On 10/14/2015 04:24 PM, Xiao Guangrong wrote:
>
>
> On 10/14/2015 03:37 PM, Janusz wrote:
>> I was able to run my virtual machine with this, but had very high cpu
>> usage when something happen in it like booting system. once, my virtual
>> machine hang and I couln't even get my mouse / keyboard back from qemu.
>> When I did vga passthrough, I didn't get any video output, and cpu usage
>> was also high. Tried it on 4.3
>
> Which tree are you using? Is it kvm tree?
> Could you please work on queue brancn on current kvm tree based on
> top commit 73917739334c6509: KVM: x86: fix SMI to halted VCPU.
>
> Hmm... interesting, this diff works on my box...
Forgot to say that I built my test env following the instructions on the kvm wiki:
http://www.linux-kvm.org/page/OVMF
My test script is attached, and I will try to build an env as close to yours
as possible...
[-- Attachment #2: ovmf.sh --]
[-- Type: application/x-shellscript, Size: 2806 bytes --]
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-14 8:32 ` Xiao Guangrong
@ 2015-10-14 9:13 ` Janusz
2015-10-14 9:16 ` Janusz
2015-10-14 9:47 ` Laszlo Ersek
2015-10-14 18:08 ` Janusz
2 siblings, 1 reply; 37+ messages in thread
From: Janusz @ 2015-10-14 9:13 UTC (permalink / raw)
To: Xiao Guangrong, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 14.10.2015 at 10:32, Xiao Guangrong wrote:
>
>
> On 10/14/2015 04:24 PM, Xiao Guangrong wrote:
>>
>>
>> On 10/14/2015 03:37 PM, Janusz wrote:
>>> I was able to run my virtual machine with this, but had very high cpu
>>> usage when something happen in it like booting system. once, my virtual
>>> machine hang and I couln't even get my mouse / keyboard back from qemu.
>>> When I did vga passthrough, I didn't get any video output, and cpu
>>> usage
>>> was also high. Tried it on 4.3
>>
>> Which tree are you using? Is it kvm tree?
>> Could you please work on queue brancn on current kvm tree based on
>> top commit 73917739334c6509: KVM: x86: fix SMI to halted VCPU.
>>
>> Hmm... interesting, this diff works on my box...
>
> Forgot to say that i built my test env following the instructions on
> kvm-wiki:
> http://www.linux-kvm.org/page/OVMF
>
> My test script is attached, and i will try to build the env like yours
> as much
> as possible...
I attach my script. I see that you are using pc-i440fx-2.1; I use the
default, which I think is pc-i440fx-2.4. I tried 2.3 some time ago and got
the same problem. I will try 2.1 after work.
I am using master from the main kernel tree, and will also try the tree you
mentioned after work.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-14 9:13 ` Janusz
@ 2015-10-14 9:16 ` Janusz
0 siblings, 0 replies; 37+ messages in thread
From: Janusz @ 2015-10-14 9:16 UTC (permalink / raw)
To: Xiao Guangrong, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 14.10.2015 at 11:13, Janusz wrote:
> On 14.10.2015 at 10:32, Xiao Guangrong wrote:
>>
>> On 10/14/2015 04:24 PM, Xiao Guangrong wrote:
>>>
>>> On 10/14/2015 03:37 PM, Janusz wrote:
>>>> I was able to run my virtual machine with this, but had very high cpu
>>>> usage when something happen in it like booting system. once, my virtual
>>>> machine hang and I couln't even get my mouse / keyboard back from qemu.
>>>> When I did vga passthrough, I didn't get any video output, and cpu
>>>> usage
>>>> was also high. Tried it on 4.3
>>> Which tree are you using? Is it kvm tree?
>>> Could you please work on queue brancn on current kvm tree based on
>>> top commit 73917739334c6509: KVM: x86: fix SMI to halted VCPU.
>>>
>>> Hmm... interesting, this diff works on my box...
>> Forgot to say that i built my test env following the instructions on
>> kvm-wiki:
>> http://www.linux-kvm.org/page/OVMF
>>
>> My test script is attached, and i will try to build the env like yours
>> as much
>> as possible...
> I attach my script. I see that you are using pc-i440fx-2.1 - I use
> default, I think its pc-i440fx-2.4, tried 2.3 some time ago and I get
> the same problem. I will try with 2.1 after work
> I am using master from main kernel tree, will also try this tree you
> mentioned after work
I am sending this one more time, as my message was rejected by the Intel
servers because of the attached script... Script:
https://bpaste.net/show/8467c3af8b18
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-14 8:32 ` Xiao Guangrong
2015-10-14 9:13 ` Janusz
@ 2015-10-14 9:47 ` Laszlo Ersek
2015-10-15 3:59 ` Xiao Guangrong
2015-10-14 18:08 ` Janusz
2 siblings, 1 reply; 37+ messages in thread
From: Laszlo Ersek @ 2015-10-14 9:47 UTC (permalink / raw)
To: Xiao Guangrong; +Cc: Janusz, Paolo Bonzini, Wanpeng Li, kvm, edk2-devel
On 10/14/15 10:32, Xiao Guangrong wrote:
>
>
> On 10/14/2015 04:24 PM, Xiao Guangrong wrote:
>>
>>
>> On 10/14/2015 03:37 PM, Janusz wrote:
>>> I was able to run my virtual machine with this, but had very high cpu
>>> usage when something happen in it like booting system. once, my virtual
>>> machine hang and I couln't even get my mouse / keyboard back from qemu.
>>> When I did vga passthrough, I didn't get any video output, and cpu usage
>>> was also high. Tried it on 4.3
>>
>> Which tree are you using? Is it kvm tree?
>> Could you please work on queue brancn on current kvm tree based on
>> top commit 73917739334c6509: KVM: x86: fix SMI to halted VCPU.
>>
>> Hmm... interesting, this diff works on my box...
>
> Forgot to say that i built my test env following the instructions on
> kvm-wiki:
> http://www.linux-kvm.org/page/OVMF
Wow! Someone actually cares about the whitepaper. Thank you. :)
Laszlo
>
> My test script is attached, and i will try to build the env like yours
> as much
> as possible...
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-14 8:32 ` Xiao Guangrong
2015-10-14 9:13 ` Janusz
2015-10-14 9:47 ` Laszlo Ersek
@ 2015-10-14 18:08 ` Janusz
2015-10-15 4:19 ` Xiao Guangrong
2 siblings, 1 reply; 37+ messages in thread
From: Janusz @ 2015-10-14 18:08 UTC (permalink / raw)
To: Xiao Guangrong, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 14.10.2015 at 10:32, Xiao Guangrong wrote:
>
>
> On 10/14/2015 04:24 PM, Xiao Guangrong wrote:
>>
>>
>> On 10/14/2015 03:37 PM, Janusz wrote:
>>> I was able to run my virtual machine with this, but had very high cpu
>>> usage when something happen in it like booting system. once, my virtual
>>> machine hang and I couln't even get my mouse / keyboard back from qemu.
>>> When I did vga passthrough, I didn't get any video output, and cpu
>>> usage
>>> was also high. Tried it on 4.3
>>
>> Which tree are you using? Is it kvm tree?
>> Could you please work on queue brancn on current kvm tree based on
>> top commit 73917739334c6509: KVM: x86: fix SMI to halted VCPU.
>>
>> Hmm... interesting, this diff works on my box...
>
> Forgot to say that i built my test env following the instructions on
> kvm-wiki:
> http://www.linux-kvm.org/page/OVMF
>
> My test script is attached, and i will try to build the env like yours
> as much
> as possible...
I cloned git://git.kernel.org/pub/scm/virt/kvm/kvm.git at commit
73917739334c6509, but this kernel breaks my system...
Slim is not able to start i3, xdm does not kill X when I stop xdm, and qemu
is not able to start unless I use the -nographic option.
Log from qemu on that kernel version:
xcb_connection_has_error() returned true
No protocol specified
Could not initialize SDL(No available video device) - exiting
On the main kernel branch I don't have those problems.
I tried running with -nographic, and tried pc-i440fx-2.1, but got the same
problem as before: high CPU usage and no graphics on my GPU.
I don't know if it will help, but this is my log from the options -global
isa-debugcon.iobase=0x402 -debugcon file:fedora.ovmf.log:
https://bpaste.net/show/36c54dba68c2
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-14 9:47 ` Laszlo Ersek
@ 2015-10-15 3:59 ` Xiao Guangrong
0 siblings, 0 replies; 37+ messages in thread
From: Xiao Guangrong @ 2015-10-15 3:59 UTC (permalink / raw)
To: Laszlo Ersek; +Cc: Janusz, Paolo Bonzini, Wanpeng Li, kvm, edk2-devel
On 10/14/2015 05:47 PM, Laszlo Ersek wrote:
> On 10/14/15 10:32, Xiao Guangrong wrote:
>>
>>
>> On 10/14/2015 04:24 PM, Xiao Guangrong wrote:
>>>
>>>
>>> On 10/14/2015 03:37 PM, Janusz wrote:
>>>> I was able to run my virtual machine with this, but had very high cpu
>>>> usage when something happen in it like booting system. once, my virtual
>>>> machine hang and I couln't even get my mouse / keyboard back from qemu.
>>>> When I did vga passthrough, I didn't get any video output, and cpu usage
>>>> was also high. Tried it on 4.3
>>>
>>> Which tree are you using? Is it kvm tree?
>>> Could you please work on queue brancn on current kvm tree based on
>>> top commit 73917739334c6509: KVM: x86: fix SMI to halted VCPU.
>>>
>>> Hmm... interesting, this diff works on my box...
>>
>> Forgot to say that i built my test env following the instructions on
>> kvm-wiki:
>> http://www.linux-kvm.org/page/OVMF
>
> Wow! Someone actually cares about the whitepaper. Thank you. :)
:)
The document is really useful to me, thanks for your contribution.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-14 18:08 ` Janusz
@ 2015-10-15 4:19 ` Xiao Guangrong
2015-10-15 6:19 ` Janusz
0 siblings, 1 reply; 37+ messages in thread
From: Xiao Guangrong @ 2015-10-15 4:19 UTC (permalink / raw)
To: Janusz, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 10/15/2015 02:08 AM, Janusz wrote:
> On 14.10.2015 at 10:32, Xiao Guangrong wrote:
>>
>>
>> On 10/14/2015 04:24 PM, Xiao Guangrong wrote:
>>>
>>>
>>> On 10/14/2015 03:37 PM, Janusz wrote:
>>>> I was able to run my virtual machine with this, but had very high cpu
>>>> usage when something happen in it like booting system. once, my virtual
>>>> machine hang and I couln't even get my mouse / keyboard back from qemu.
>>>> When I did vga passthrough, I didn't get any video output, and cpu
>>>> usage
>>>> was also high. Tried it on 4.3
>>>
>>> Which tree are you using? Is it kvm tree?
>>> Could you please work on queue brancn on current kvm tree based on
>>> top commit 73917739334c6509: KVM: x86: fix SMI to halted VCPU.
>>>
>>> Hmm... interesting, this diff works on my box...
>>
>> Forgot to say that i built my test env following the instructions on
>> kvm-wiki:
>> http://www.linux-kvm.org/page/OVMF
>>
>> My test script is attached, and i will try to build the env like yours
>> as much
>> as possible...
> I cloned git://git.kernel.org/pub/scm/virt/kvm/kvm.git 73917739334c6509
> commit, but this is breaking my system...
> Slim is not able to start i3, xdm is not killing X when I stop xdm, qemu
> is not able to start when I don't use option -nographic
> log from qemu on that kernel version:
> xcb_connection_has_error() returned true
> No protocol specified
> Could not initialize SDL(No available video device) - exiting
>
> On main kernel branch I don't have those problems.
>
> I tried to run with -nographic, and tried pc-i440fx-2.1 but the same
> problem as before, high cpu usage and no graphic on my GPU.
> I don't know if that will help by this is my log from option -global
> isa-debugcon.iobase=0x402 -debugcon file:fedora.ovmf.log:
> https://bpaste.net/show/36c54dba68c2
Well, the bug may not be in KVM. When this bug happened, I saw that OVMF
detected only 1 CPU; here is the log from OVMF's debug output:
Flushing GCD
Flushing GCD
Flushing GCD
Flushing GCD
Flushing GCD
Flushing GCD
Flushing GCD
Flushing GCD
Flushing GCD
Flushing GCDs
Detect CPU count: 1
So the startup code has been freed while the APs are still running;
I think that is why we saw the vCPUs executing at unexpected addresses.
After digging into OVMF's code, I noticed that the BSP waits for the APs
for a fixed time period. However, recent KVM changes require zapping all
mappings whenever CR0.CD is changed, which means the APs need more time to
start up.
After the following change to OVMF, the bug is completely gone on my side:
--- a/UefiCpuPkg/CpuDxe/ApStartup.c
+++ b/UefiCpuPkg/CpuDxe/ApStartup.c
@@ -454,7 +454,9 @@ StartApsStackless (
//
// Wait 100 milliseconds for APs to arrive at the ApEntryPoint routine
//
- MicroSecondDelay (100 * 1000);
+ MicroSecondDelay (10 * 100 * 1000);
return EFI_SUCCESS;
}
Janusz, could you please check this instead? You can switch to your
previous kernel to do this test.
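To illustrate the race described above, here is a minimal sketch in plain C, not edk2 code. It simulates a BSP that counts APs after a fixed delay and contrasts it with a BSP that polls until every AP has checked in. All names (struct sim, aps_arrived, count_aps_fixed, count_aps_polled) are hypothetical, and the polling variant assumes the expected AP count is known in advance, which this thread does not establish.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated platform state; every name here is hypothetical. */
struct sim {
    uint32_t now_us;       /* simulated clock, in microseconds */
    uint32_t ap_ready_us;  /* time at which all APs have checked in */
    uint32_t expected_aps; /* number of APs that will eventually arrive */
};

/* Number of APs that have checked in at the current simulated time. */
static uint32_t aps_arrived(const struct sim *s)
{
    return s->now_us >= s->ap_ready_us ? s->expected_aps : 0;
}

/* Stand-in for a busy-wait delay: it only advances the simulated clock. */
static void delay_us(struct sim *s, uint32_t us)
{
    s->now_us += us;
}

/* Fixed wait: sleep once, then count whoever has arrived by then. */
static uint32_t count_aps_fixed(struct sim *s, uint32_t wait_us)
{
    delay_us(s, wait_us);
    return aps_arrived(s);
}

/* Polling wait: stop as soon as all APs arrive, or when the timeout expires. */
static uint32_t count_aps_polled(struct sim *s, uint32_t timeout_us,
                                 uint32_t step_us)
{
    uint32_t waited = 0;

    assert(step_us > 0); /* a zero step would never make progress */
    while (aps_arrived(s) < s->expected_aps && waited < timeout_us) {
        delay_us(s, step_us);
        waited += step_us;
    }
    return aps_arrived(s);
}
```

With APs that need 400 ms to arrive, a fixed 100 ms wait counts none of them, a fixed 1 s wait counts all of them, and the polling variant counts all of them while returning as soon as they are ready.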
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-15 4:19 ` Xiao Guangrong
@ 2015-10-15 6:19 ` Janusz
2015-10-15 6:41 ` Xiao Guangrong
0 siblings, 1 reply; 37+ messages in thread
From: Janusz @ 2015-10-15 6:19 UTC (permalink / raw)
To: Xiao Guangrong, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 15.10.2015 at 06:19, Xiao Guangrong wrote:
>
>
>
> Well, the bug may be not in KVM. When this bug happened, i saw OVMF
> only checked 1 CPU out, there is the log from OVMF's debug input:
>
> Flushing GCD
> Flushing GCD
> Flushing GCD
> Flushing GCD
> Flushing GCD
> Flushing GCD
> Flushing GCD
> Flushing GCD
> Flushing GCD
> Flushing GCDs
> Detect CPU count: 1
>
> So that the startup code has been freed however the APs are still
> running,
> i think that why we saw the vCPUs executed on unexpected address.
>
> After digging into OVMF's code, i noticed that BSP CPU waits for APs
> for a fixed timer period, however, KVM recent changes require zap all
> mappings if CR0.CD is changed, that means the APs need more time to
> startup.
>
> After following changes to OVMF, the bug is completely gone on my side:
>
> --- a/UefiCpuPkg/CpuDxe/ApStartup.c
> +++ b/UefiCpuPkg/CpuDxe/ApStartup.c
> @@ -454,7 +454,9 @@ StartApsStackless (
> //
> // Wait 100 milliseconds for APs to arrive at the ApEntryPoint routine
> //
> - MicroSecondDelay (100 * 1000);
> + MicroSecondDelay (10 * 100 * 1000);
>
> return EFI_SUCCESS;
> }
>
> Janusz, could you please check this instead? You can switch to your
> previous kernel to do this test.
>
>
Ok, now the first time I started the VM I was able to boot the system
successfully. When I turned it off and started it again, the VM restarted
a couple of times during system boot. Sometimes I also get very high CPU
usage for no reason. Also, I get fewer fps in GTA 5 than on kernel 4.1:
something like 30-55, whereas on 4.1 I get 60 fps all the time. This is
my new log: https://bpaste.net/show/61a122ad7fe5
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-15 6:19 ` Janusz
@ 2015-10-15 6:41 ` Xiao Guangrong
2015-10-15 6:58 ` Janusz
0 siblings, 1 reply; 37+ messages in thread
From: Xiao Guangrong @ 2015-10-15 6:41 UTC (permalink / raw)
To: Janusz, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 10/15/2015 02:19 PM, Janusz wrote:
> On 15.10.2015 at 06:19, Xiao Guangrong wrote:
>>
>>
>>
>> Well, the bug may be not in KVM. When this bug happened, i saw OVMF
>> only checked 1 CPU out, there is the log from OVMF's debug input:
>>
>> Flushing GCD
>> Flushing GCD
>> Flushing GCD
>> Flushing GCD
>> Flushing GCD
>> Flushing GCD
>> Flushing GCD
>> Flushing GCD
>> Flushing GCD
>> Flushing GCDs
>> Detect CPU count: 1
>>
>> So that the startup code has been freed however the APs are still
>> running,
>> i think that why we saw the vCPUs executed on unexpected address.
>>
>> After digging into OVMF's code, i noticed that BSP CPU waits for APs
>> for a fixed timer period, however, KVM recent changes require zap all
>> mappings if CR0.CD is changed, that means the APs need more time to
>> startup.
>>
>> After following changes to OVMF, the bug is completely gone on my side:
>>
>> --- a/UefiCpuPkg/CpuDxe/ApStartup.c
>> +++ b/UefiCpuPkg/CpuDxe/ApStartup.c
>> @@ -454,7 +454,9 @@ StartApsStackless (
>> //
>> // Wait 100 milliseconds for APs to arrive at the ApEntryPoint routine
>> //
>> - MicroSecondDelay (100 * 1000);
>> + MicroSecondDelay (10 * 100 * 1000);
>>
>> return EFI_SUCCESS;
>> }
>>
>> Janusz, could you please check this instead? You can switch to your
>> previous kernel to do this test.
>>
>>
> Ok, now first time when I started VM I was able to start system
> successfully. When I turned it off and started it again, it restarted my
> vm at system boot couple of times. Sometimes I also get very high cpu
> usage for no reason. Also, I get less fps in GTA 5 than in kernel 4.1, I
> get something like 30-55, but on 4.1 I get all the time 60 fps. This is
> my new log: https://bpaste.net/show/61a122ad7fe5
>
Just to confirm: the QEMU internal error does not appear any more, right?
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-15 6:41 ` Xiao Guangrong
@ 2015-10-15 6:58 ` Janusz
2015-10-15 7:10 ` Xiao Guangrong
0 siblings, 1 reply; 37+ messages in thread
From: Janusz @ 2015-10-15 6:58 UTC (permalink / raw)
To: Xiao Guangrong, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm; +Cc: edk2-devel
On 15.10.2015 at 08:41, Xiao Guangrong wrote:
>
>
> On 10/15/2015 02:19 PM, Janusz wrote:
>> On 15.10.2015 at 06:19, Xiao Guangrong wrote:
>>>
>>>
>>>
>>> Well, the bug may be not in KVM. When this bug happened, i saw OVMF
>>> only checked 1 CPU out, there is the log from OVMF's debug input:
>>>
>>> Flushing GCD
>>> Flushing GCD
>>> Flushing GCD
>>> Flushing GCD
>>> Flushing GCD
>>> Flushing GCD
>>> Flushing GCD
>>> Flushing GCD
>>> Flushing GCD
>>> Flushing GCDs
>>> Detect CPU count: 1
>>>
>>> So that the startup code has been freed however the APs are still
>>> running,
>>> i think that why we saw the vCPUs executed on unexpected address.
>>>
>>> After digging into OVMF's code, i noticed that BSP CPU waits for APs
>>> for a fixed timer period, however, KVM recent changes require zap all
>>> mappings if CR0.CD is changed, that means the APs need more time to
>>> startup.
>>>
>>> After following changes to OVMF, the bug is completely gone on my side:
>>>
>>> --- a/UefiCpuPkg/CpuDxe/ApStartup.c
>>> +++ b/UefiCpuPkg/CpuDxe/ApStartup.c
>>> @@ -454,7 +454,9 @@ StartApsStackless (
>>> //
>>> // Wait 100 milliseconds for APs to arrive at the ApEntryPoint
>>> routine
>>> //
>>> - MicroSecondDelay (100 * 1000);
>>> + MicroSecondDelay (10 * 100 * 1000);
>>>
>>> return EFI_SUCCESS;
>>> }
>>>
>>> Janusz, could you please check this instead? You can switch to your
>>> previous kernel to do this test.
>>>
>>>
>> Ok, now first time when I started VM I was able to start system
>> successfully. When I turned it off and started it again, it restarted my
>> vm at system boot couple of times. Sometimes I also get very high cpu
>> usage for no reason. Also, I get less fps in GTA 5 than in kernel 4.1, I
>> get something like 30-55, but on 4.1 I get all the time 60 fps. This is
>> my new log: https://bpaste.net/show/61a122ad7fe5
>>
>
> Just confirm: the Qemu internal error did not appear any more, right?
Yes, when I reverted your first patch, switched from -vga none to -vga
std, and didn't pass through my GPU (the case where I got this internal
error), the VM started without problems. I didn't even get any VM
restarts like I do with passthrough.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-15 6:58 ` Janusz
@ 2015-10-15 7:10 ` Xiao Guangrong
2015-10-15 7:21 ` Janusz
2015-10-15 16:18 ` Laszlo Ersek
0 siblings, 2 replies; 37+ messages in thread
From: Xiao Guangrong @ 2015-10-15 7:10 UTC (permalink / raw)
To: Janusz, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm
Cc: edk2-devel, Alex Williamson
On 10/15/2015 02:58 PM, Janusz wrote:
> On 15.10.2015 at 08:41, Xiao Guangrong wrote:
>>
>>
>> On 10/15/2015 02:19 PM, Janusz wrote:
>>> On 15.10.2015 at 06:19, Xiao Guangrong wrote:
>>>>
>>>>
>>>>
>>>> Well, the bug may be not in KVM. When this bug happened, i saw OVMF
>>>> only checked 1 CPU out, there is the log from OVMF's debug input:
>>>>
>>>> Flushing GCD
>>>> Flushing GCD
>>>> Flushing GCD
>>>> Flushing GCD
>>>> Flushing GCD
>>>> Flushing GCD
>>>> Flushing GCD
>>>> Flushing GCD
>>>> Flushing GCD
>>>> Flushing GCDs
>>>> Detect CPU count: 1
>>>>
>>>> So that the startup code has been freed however the APs are still
>>>> running,
>>>> i think that why we saw the vCPUs executed on unexpected address.
>>>>
>>>> After digging into OVMF's code, i noticed that BSP CPU waits for APs
>>>> for a fixed timer period, however, KVM recent changes require zap all
>>>> mappings if CR0.CD is changed, that means the APs need more time to
>>>> startup.
>>>>
>>>> After following changes to OVMF, the bug is completely gone on my side:
>>>>
>>>> --- a/UefiCpuPkg/CpuDxe/ApStartup.c
>>>> +++ b/UefiCpuPkg/CpuDxe/ApStartup.c
>>>> @@ -454,7 +454,9 @@ StartApsStackless (
>>>> //
>>>> // Wait 100 milliseconds for APs to arrive at the ApEntryPoint
>>>> routine
>>>> //
>>>> - MicroSecondDelay (100 * 1000);
>>>> + MicroSecondDelay (10 * 100 * 1000);
>>>>
>>>> return EFI_SUCCESS;
>>>> }
>>>>
>>>> Janusz, could you please check this instead? You can switch to your
>>>> previous kernel to do this test.
>>>>
>>>>
>>> Ok, now first time when I started VM I was able to start system
>>> successfully. When I turned it off and started it again, it restarted my
>>> vm at system boot couple of times. Sometimes I also get very high cpu
>>> usage for no reason. Also, I get less fps in GTA 5 than in kernel 4.1, I
>>> get something like 30-55, but on 4.1 I get all the time 60 fps. This is
>>> my new log: https://bpaste.net/show/61a122ad7fe5
>>>
>>
>> Just confirm: the Qemu internal error did not appear any more, right?
> Yes, when I reverted your first patch, switched to -vga std from -vga
> none and didn't passthrough my GPU (case when I got this internal
> error), vm started without problem. I even didn't get any VM restarts
> like with passthrough
>
Wow, it seems we have fixed the QEMU internal error now. :)
Recently, Paolo reverted some MTRR patches; was your test based on a
kernel with these reverts?
The GPU passthrough issue may be related to vfio (not sure). Alex, do
you have any idea?
Laszlo, could you please check whether the root cause is reasonable and
fix it in OVMF if it's right?
BTW, OVMF handles #UD with no trace - nothing is killed, and there is no
call trace in the debug output...
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-15 7:10 ` Xiao Guangrong
@ 2015-10-15 7:21 ` Janusz
2015-10-15 16:18 ` Laszlo Ersek
1 sibling, 0 replies; 37+ messages in thread
From: Janusz @ 2015-10-15 7:21 UTC (permalink / raw)
To: Xiao Guangrong, Paolo Bonzini, Wanpeng Li, Laszlo Ersek, kvm
Cc: edk2-devel, Alex Williamson
On 15.10.2015 at 09:10, Xiao Guangrong wrote:
>
>
> On 10/15/2015 02:58 PM, Janusz wrote:
>> On 15.10.2015 at 08:41, Xiao Guangrong wrote:
>>>
>>>
>>> On 10/15/2015 02:19 PM, Janusz wrote:
>>>> On 15.10.2015 at 06:19, Xiao Guangrong wrote:
>>>>>
>>>>>
>>>>>
>>>>> Well, the bug may be not in KVM. When this bug happened, i saw OVMF
>>>>> only checked 1 CPU out, there is the log from OVMF's debug input:
>>>>>
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCDs
>>>>> Detect CPU count: 1
>>>>>
>>>>> So that the startup code has been freed however the APs are still
>>>>> running,
>>>>> i think that why we saw the vCPUs executed on unexpected address.
>>>>>
>>>>> After digging into OVMF's code, i noticed that BSP CPU waits for APs
>>>>> for a fixed timer period, however, KVM recent changes require zap all
>>>>> mappings if CR0.CD is changed, that means the APs need more time to
>>>>> startup.
>>>>>
>>>>> After following changes to OVMF, the bug is completely gone on my
>>>>> side:
>>>>>
>>>>> --- a/UefiCpuPkg/CpuDxe/ApStartup.c
>>>>> +++ b/UefiCpuPkg/CpuDxe/ApStartup.c
>>>>> @@ -454,7 +454,9 @@ StartApsStackless (
>>>>> //
>>>>> // Wait 100 milliseconds for APs to arrive at the ApEntryPoint
>>>>> routine
>>>>> //
>>>>> - MicroSecondDelay (100 * 1000);
>>>>> + MicroSecondDelay (10 * 100 * 1000);
>>>>>
>>>>> return EFI_SUCCESS;
>>>>> }
>>>>>
>>>>> Janusz, could you please check this instead? You can switch to your
>>>>> previous kernel to do this test.
>>>>>
>>>>>
>>>> Ok, now first time when I started VM I was able to start system
>>>> successfully. When I turned it off and started it again, it
>>>> restarted my
>>>> vm at system boot couple of times. Sometimes I also get very high cpu
>>>> usage for no reason. Also, I get less fps in GTA 5 than in kernel
>>>> 4.1, I
>>>> get something like 30-55, but on 4.1 I get all the time 60 fps.
>>>> This is
>>>> my new log: https://bpaste.net/show/61a122ad7fe5
>>>>
>>>
>>> Just confirm: the Qemu internal error did not appear any more, right?
>> Yes, when I reverted your first patch, switched to -vga std from -vga
>> none and didn't passthrough my GPU (case when I got this internal
>> error), vm started without problem. I even didn't get any VM restarts
>> like with passthrough
>>
>
> Wow, it seems we have fixed the QEMU internal error now. :)
>
> Recurrently, Paolo has reverted some MTRR patches, was your test
> based on these reverted patches?
>
> The GPU passthrough issue may be related to vfio (not sure), Alex, do
> you have any idea?
>
> Laszlo, could you please check the root case is reasonable and fix it in
> OVMF if it's right?
>
> BTW, OVMF handles #UD with no trace - nothing is killed, and no call
> trace
> in the debug input...
>
Yes, the reverted MTRR code is already in the kernel I use - 4.3-r5+
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-15 7:10 ` Xiao Guangrong
2015-10-15 7:21 ` Janusz
@ 2015-10-15 16:18 ` Laszlo Ersek
2015-10-15 16:53 ` Kinney, Michael D
[not found] ` <5620696F.7050406@linux.intel.com>
1 sibling, 2 replies; 37+ messages in thread
From: Laszlo Ersek @ 2015-10-15 16:18 UTC (permalink / raw)
To: Xiao Guangrong
Cc: Janusz, Paolo Bonzini, Wanpeng Li, kvm, edk2-devel,
Alex Williamson, Chen Fan, Jordan Justen (Intel address)
CC'ing Jordan and Chen Fan.
On 10/15/15 09:10, Xiao Guangrong wrote:
>
>
> On 10/15/2015 02:58 PM, Janusz wrote:
>> On 15.10.2015 at 08:41, Xiao Guangrong wrote:
>>>
>>>
>>> On 10/15/2015 02:19 PM, Janusz wrote:
>>>> On 15.10.2015 at 06:19, Xiao Guangrong wrote:
>>>>>
>>>>>
>>>>>
>>>>> Well, the bug may be not in KVM. When this bug happened, i saw OVMF
>>>>> only checked 1 CPU out, there is the log from OVMF's debug input:
>>>>>
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCD
>>>>> Flushing GCDs
>>>>> Detect CPU count: 1
>>>>>
>>>>> So that the startup code has been freed however the APs are still
>>>>> running,
>>>>> i think that why we saw the vCPUs executed on unexpected address.
>>>>>
>>>>> After digging into OVMF's code, i noticed that BSP CPU waits for APs
>>>>> for a fixed timer period, however, KVM recent changes require zap all
>>>>> mappings if CR0.CD is changed, that means the APs need more time to
>>>>> startup.
>>>>>
>>>>> After following changes to OVMF, the bug is completely gone on my
>>>>> side:
>>>>>
>>>>> --- a/UefiCpuPkg/CpuDxe/ApStartup.c
>>>>> +++ b/UefiCpuPkg/CpuDxe/ApStartup.c
>>>>> @@ -454,7 +454,9 @@ StartApsStackless (
>>>>> //
>>>>> // Wait 100 milliseconds for APs to arrive at the ApEntryPoint
>>>>> routine
>>>>> //
>>>>> - MicroSecondDelay (100 * 1000);
>>>>> + MicroSecondDelay (10 * 100 * 1000);
>>>>>
>>>>> return EFI_SUCCESS;
>>>>> }
>>>>>
>>>>> Janusz, could you please check this instead? You can switch to your
>>>>> previous kernel to do this test.
>>>>>
>>>>>
>>>> Ok, now first time when I started VM I was able to start system
>>>> successfully. When I turned it off and started it again, it
>>>> restarted my
>>>> vm at system boot couple of times. Sometimes I also get very high cpu
>>>> usage for no reason. Also, I get less fps in GTA 5 than in kernel
>>>> 4.1, I
>>>> get something like 30-55, but on 4.1 I get all the time 60 fps. This is
>>>> my new log: https://bpaste.net/show/61a122ad7fe5
>>>>
>>>
>>> Just confirm: the Qemu internal error did not appear any more, right?
>> Yes, when I reverted your first patch, switched to -vga std from -vga
>> none and didn't passthrough my GPU (case when I got this internal
>> error), vm started without problem. I even didn't get any VM restarts
>> like with passthrough
>>
>
> Wow, it seems we have fixed the QEMU internal error now. :)
>
> Recently, Paolo reverted some MTRR patches; was your test
> run with those patches reverted?
>
> The GPU passthrough issue may be related to vfio (not sure), Alex, do
> you have any idea?
>
> Laszlo, could you please check whether the root cause is plausible, and fix it in
> OVMF if it is?
The code that you have found is in edk2's EFI_MP_SERVICES_PROTOCOL
implementation -- more closely, its initial CPU counter code --, from
edk2 git commit 533263ee5a7f. It is not specific to OVMF -- it is
generic edk2 code for Intel processors. (I'm CC'ing Jordan and Chen Fan
because they authored the patch in question.)
If VCPUs need more time to rendezvous than written in the code, on
recent KVM, then I think we should introduce a new FixedPCD in
UefiCpuPkg (practically: a compile time constant) for the timeout. Which
is not hard to do.
However, we'll need two things:
- an idea about the concrete rendezvous timeout to set, from OvmfPkg
- a *detailed* explanation / elaboration on your words:
"KVM recent changes require zap all mappings if CR0.CD is changed,
that means the APs need more time to startup"
Preferably with references to Linux kernel commits and the Intel SDM,
so that n00bs like me can get a fleeting idea. Do you mean that with
caching disabled, the APs execute their rendezvous code (from memory)
more slowly?
> BTW, OVMF handles #UD with no trace - nothing is killed, and no call trace
> in the debug input...
There *is* a trace (of any unexpected exception -- at least for the
BSP), but unfortunately its location is not intuitive.
The exception handler that is built into OVMF
("UefiCpuPkg/Library/CpuExceptionHandlerLib") is again generic edk2
code, and it prints the trace directly to the serial port, regardless of
the fact that OVMF's DebugLib instance logs explicit DEBUGs to the QEMU
debug port. (The latter can be directed to the serial port as well, if
you build OVMF with -D DEBUG_ON_SERIAL_PORT, but this is not relevant here.)
If you reproduce the issue while looking at the (virtual) serial port of
the guest, I trust you will get a register dump.
Thanks!
Laszlo
* RE: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-15 16:18 ` Laszlo Ersek
@ 2015-10-15 16:53 ` Kinney, Michael D
2015-10-15 18:46 ` Laszlo Ersek
[not found] ` <5620696F.7050406@linux.intel.com>
1 sibling, 1 reply; 37+ messages in thread
From: Kinney, Michael D @ 2015-10-15 16:53 UTC (permalink / raw)
To: Laszlo Ersek, Xiao Guangrong, Kinney, Michael D
Cc: kvm@vger.kernel.org, Justen, Jordan L, edk2-devel@ml01.01.org,
Alex Williamson, Chen Fan, Paolo Bonzini, Wanpeng Li
Laszlo,
There is already a PCD for this timeout that is used by CpuMpPei.
gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds
I noticed that CpuDxe is using a hard coded AP timeout. I think we should just use this same PCD for both the PEI and DXE CPU module and then set it for OVMF to the compatible value.
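For illustration only, here is a minimal C sketch of that idea, written so that it compiles outside the edk2 tree. In the firmware the value would presumably be read with PcdGet32 (PcdCpuApInitTimeOutInMicroSeconds) and handed to MicroSecondDelay (); the names `ApInitTimeoutUs`, `FakeMicroSecondDelay`, `WaitForApsHardCoded` and `WaitForApsConfigurable` below are hypothetical stand-ins, not edk2 code.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the PCD value. 50000 is the DEC default mentioned in this
   thread; a platform such as OVMF would override it (assumption). */
uint32_t ApInitTimeoutUs = 50000;

/* Stand-in for the timer library delay: it only accumulates the requested
   time instead of actually stalling the CPU. */
uint64_t TotalDelayedUs = 0;

static void FakeMicroSecondDelay(uint32_t MicroSeconds)
{
  TotalDelayedUs += MicroSeconds;
}

/* Current behaviour as described in the thread: a fixed 100 ms wait. */
void WaitForApsHardCoded(void)
{
  FakeMicroSecondDelay(100 * 1000);
}

/* Proposed behaviour: the wait comes from a platform-tunable setting. */
void WaitForApsConfigurable(void)
{
  /* A zero timeout would skip the rendezvous wait entirely. */
  assert(ApInitTimeoutUs > 0);
  FakeMicroSecondDelay(ApInitTimeoutUs);
}
```

With this shape, trying the one-second value from the experimental patch (10 * 100 * 1000) only means changing the setting, not the driver source.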
Mike
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-15 16:53 ` Kinney, Michael D
@ 2015-10-15 18:46 ` Laszlo Ersek
2015-10-20 17:27 ` Janusz
0 siblings, 1 reply; 37+ messages in thread
From: Laszlo Ersek @ 2015-10-15 18:46 UTC (permalink / raw)
To: Kinney, Michael D, Xiao Guangrong
Cc: kvm@vger.kernel.org, Justen, Jordan L, edk2-devel@ml01.01.org,
Alex Williamson, Chen Fan, Paolo Bonzini, Wanpeng Li
On 10/15/15 18:53, Kinney, Michael D wrote:
> Laszlo,
>
> There is already a PCD for this timeout that is used by CpuMpPei.
>
> gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds
>
> I noticed that CpuDxe is using a hard coded AP timeout. I think we should just use this same PCD for both the PEI and DXE CPU module and then set it for OVMF to the compatible value.
Perfect, thank you!
(I notice the default in the DEC file is 50000, which is half of what
the DXE driver hardcodes.)
Now we only need a recommended (or experimental) value for it, and an
explanation why 100*1000 is no longer sufficient on KVM :)
Thanks!
Laszlo
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
[not found] ` <5620696F.7050406@linux.intel.com>
@ 2015-10-16 18:22 ` Laszlo Ersek
0 siblings, 0 replies; 37+ messages in thread
From: Laszlo Ersek @ 2015-10-16 18:22 UTC (permalink / raw)
To: Xiao Guangrong
Cc: Janusz, Paolo Bonzini, Wanpeng Li, kvm, edk2-devel,
Alex Williamson, Chen Fan, Jordan Justen (Intel address)
On 10/16/15 05:05, Xiao Guangrong wrote:
>
>
> On 10/16/2015 12:18 AM, Laszlo Ersek wrote:
>> CC'ing Jordan and Chen Fan.
>>
>> On 10/15/15 09:10, Xiao Guangrong wrote:
>>>
>>>
>>> On 10/15/2015 02:58 PM, Janusz wrote:
>>>> On 15.10.2015 at 08:41, Xiao Guangrong wrote:
>>>>>
>>>>>
>>>>> On 10/15/2015 02:19 PM, Janusz wrote:
>>>>>> On 15.10.2015 at 06:19, Xiao Guangrong wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Well, the bug may be not in KVM. When this bug happened, i saw OVMF
>>>>>>> only checked 1 CPU out, there is the log from OVMF's debug input:
>>>>>>>
>>>>>>> Flushing GCD
>>>>>>> Flushing GCD
>>>>>>> Flushing GCD
>>>>>>> Flushing GCD
>>>>>>> Flushing GCD
>>>>>>> Flushing GCD
>>>>>>> Flushing GCD
>>>>>>> Flushing GCD
>>>>>>> Flushing GCD
>>>>>>> Flushing GCDs
>>>>>>> Detect CPU count: 1
>>>>>>>
>>>>>>> So the startup code has been freed while the APs are still running;
>>>>>>> I think that is why we saw the vCPUs executing at unexpected addresses.
>>>>>>>
>>>>>>> After digging into OVMF's code, i noticed that BSP CPU waits for APs
>>>>>>> for a fixed timer period, however, KVM recent changes require zap
>>>>>>> all
>>>>>>> mappings if CR0.CD is changed, that means the APs need more time to
>>>>>>> startup.
>>>>>>>
>>>>>>> After following changes to OVMF, the bug is completely gone on my
>>>>>>> side:
>>>>>>>
>>>>>>> --- a/UefiCpuPkg/CpuDxe/ApStartup.c
>>>>>>> +++ b/UefiCpuPkg/CpuDxe/ApStartup.c
>>>>>>> @@ -454,7 +454,9 @@ StartApsStackless (
>>>>>>> //
>>>>>>> // Wait 100 milliseconds for APs to arrive at the ApEntryPoint routine
>>>>>>> //
>>>>>>> - MicroSecondDelay (100 * 1000);
>>>>>>> + MicroSecondDelay (10 * 100 * 1000);
>>>>>>>
>>>>>>> return EFI_SUCCESS;
>>>>>>> }
>>>>>>>
>>>>>>> Janusz, could you please check this instead? You can switch to your
>>>>>>> previous kernel to do this test.
>>>>>>>
>>>>>>>
>>>>>> Ok, now first time when I started VM I was able to start system
>>>>>> successfully. When I turned it off and started it again, it
>>>>>> restarted my
>>>>>> vm at system boot couple of times. Sometimes I also get very high cpu
>>>>>> usage for no reason. Also, I get less fps in GTA 5 than in kernel
>>>>>> 4.1, I
>>>>>> get something like 30-55, but on 4.1 I get all the time 60 fps.
>>>>>> This is
>>>>>> my new log: https://bpaste.net/show/61a122ad7fe5
>>>>>>
>>>>>
>>>>> Just confirm: the Qemu internal error did not appear any more, right?
>>>> Yes, when I reverted your first patch, switched to -vga std from -vga
>>>> none and didn't passthrough my GPU (case when I got this internal
>>>> error), vm started without problem. I even didn't get any VM restarts
>>>> like with passthrough
>>>>
>>>
>>> Wow, it seems we have fixed the QEMU internal error now. :)
>>>
>>> Recently, Paolo reverted some MTRR patches; was your test
>>> run with those patches reverted?
>>>
>>> The GPU passthrough issue may be related to vfio (not sure), Alex, do
>>> you have any idea?
>>>
>>> Laszlo, could you please check whether the root cause is plausible, and fix it in
>>> OVMF if it is?
>>
>> The code that you have found is in edk2's EFI_MP_SERVICES_PROTOCOL
>> implementation -- more closely, its initial CPU counter code --, from
>> edk2 git commit 533263ee5a7f. It is not specific to OVMF -- it is
>> generic edk2 code for Intel processors. (I'm CC'ing Jordan and Chen Fan
>> because they authored the patch in question.)
>
> Okay, good to know it, i do not have much knowledge on edk2 and OVMF... :(
>
>>
>> If VCPUs need more time to rendezvous than written in the code, on
>> recent KVM, then I think we should introduce a new FixedPCD in
>> UefiCpuPkg (practically: a compile time constant) for the timeout. Which
>> is not hard to do.
>>
>> However, we'll need two things:
>> - an idea about the concrete rendezvous timeout to set, from OvmfPkg
>>
>> - a *detailed* explanation / elaboration on your words:
>>
>> "KVM recent changes require zap all mappings if CR0.CD is changed,
>> that means the APs need more time to startup"
>>
>> Preferably with references to Linux kernel commits and the Intel SDM,
>> so that n00bs like me can get a fleeting idea. Do you mean that with
>> caching disabled, the APs execute their rendezvous code (from memory)
>> more slowly?
>
> Kernel commit b18d5431acc makes the vCPUs need more time to start up,
> because:
> - it zaps all the mappings for the guest memory in the EPT or shadow page
> tables, so VM-exits are required to rebuild the mappings for all memory
> accesses.
>
> - if a device is passed through to the guest and the IOMMU lacks the
> snoop control feature, guest memory becomes UC once CR0.CD is set to 1.
>
> Another, generic factor: the more vCPUs the guest has, the more time is
> needed. That is why the bug is rarely triggered on guests with few vCPUs. I
> guess we need a self-adapting way to handle the case...
Thanks, this should be enough for composing a commit message.
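As a thought experiment on the "self-adapting" idea quoted above, the following C sketch polls in small steps and stops as soon as the expected number of APs has checked in, instead of always sleeping for one fixed period. It is a toy model, not edk2 code: `SIM_GUEST`, `ArrivedAps` and `WaitForApsAdaptive` are hypothetical names, the guest is simulated, and the sketch assumes the firmware can learn the expected AP count from some platform source, which this thread does not establish.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated guest: APs check in one after another, each needing UsPerAp
   microseconds (a stand-in for the slower startup described above). */
typedef struct {
  uint32_t ElapsedUs;  /* simulated time since the startup IPI was sent */
  uint32_t UsPerAp;    /* simulated startup cost of one AP */
  uint32_t ApCount;    /* APs that actually exist in the guest */
} SIM_GUEST;

/* Number of APs that have checked in after ElapsedUs. */
static uint32_t ArrivedAps(const SIM_GUEST *Guest)
{
  uint32_t Arrived;

  if (Guest->UsPerAp == 0) {
    return Guest->ApCount;
  }
  Arrived = Guest->ElapsedUs / Guest->UsPerAp;
  return Arrived < Guest->ApCount ? Arrived : Guest->ApCount;
}

/* Poll in StepUs increments; stop when ExpectedAps have arrived or when
   BudgetUs is used up. Returns the number of APs seen. */
uint32_t WaitForApsAdaptive(SIM_GUEST *Guest, uint32_t ExpectedAps,
                            uint32_t BudgetUs, uint32_t StepUs)
{
  assert(Guest != NULL);
  if (StepUs == 0) {
    StepUs = 1;  /* avoid an endless loop on a zero step */
  }
  while (ArrivedAps(Guest) < ExpectedAps && Guest->ElapsedUs < BudgetUs) {
    uint32_t Remaining = BudgetUs - Guest->ElapsedUs;
    /* Advancing simulated time stands in for a real delay call. */
    Guest->ElapsedUs += Remaining < StepUs ? Remaining : StepUs;
  }
  return ArrivedAps(Guest);
}
```

In this model a small guest returns early, while a large or slow guest is limited only by the upper bound, so the bound can be generous without penalizing the common case.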
>
>>
>>> BTW, OVMF handles #UD with no trace - nothing is killed, and no call
>>> trace
>>> in the debug input...
>>
>> There *is* a trace (of any unexpected exception -- at least for the
>> BSP), but unfortunately its location is not intuitive.
>>
>> The exception handler that is built into OVMF
>> ("UefiCpuPkg/Library/CpuExceptionHandlerLib") is again generic edk2
>> code, and it prints the trace directly to the serial port, regardless of
>> the fact that OVMF's DebugLib instance logs explicit DEBUGs to the QEMU
>> debug port. (The latter can be directed to the serial port as well, if
>> you build OVMF with -D DEBUG_ON_SERIAL_PORT, but this is not relevant
>> here.)
>>
>> If you reproduce the issue while looking at the (virtual) serial port of
>> the guest, I trust you will get a register dump.
>
> Er... there seems to be no dump in the serial output; I attached it to this
> mail. The system continues to run with 1 CPU enabled...
Actually, the guest is in a reboot loop, it just may not be obvious from
the log. Whenever you see
SecCoreStartupWithStack(0xFFFCC000, 0x818000)
that means the guest has rebooted.
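A small C helper can make such a loop obvious when reading a captured log. This is only a sketch: `CountOccurrences` and `CountGuestReboots` are hypothetical names, and the helper assumes that the first "SecCoreStartupWithStack(" line is the initial boot, so every further occurrence counts as one reboot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of Marker in a NUL-terminated log. */
size_t CountOccurrences(const char *Log, const char *Marker)
{
  size_t Count = 0;
  size_t MarkerLen;
  const char *Hit;

  assert(Log != NULL && Marker != NULL);
  MarkerLen = strlen(Marker);
  if (MarkerLen == 0) {
    return 0;
  }
  for (Hit = strstr(Log, Marker); Hit != NULL;
       Hit = strstr(Hit + MarkerLen, Marker)) {
    Count++;
  }
  return Count;
}

/* Every SEC entry after the first one is treated as a guest reboot. */
size_t CountGuestReboots(const char *Log)
{
  size_t Boots = CountOccurrences(Log, "SecCoreStartupWithStack(");

  return Boots > 0 ? Boots - 1 : 0;
}
```

A nonzero result on a log captured from a single VM start would point at a reboot loop rather than a guest that merely continues with one CPU.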
The fault handler that I described becomes active when a fault gets
injected visibly to the guest -- or happens within the guest entirely
-- for example, a null pointer dereference, and the fault handler can
actually handle it.
I guess a triple fault occurs or some such.
Thanks
Laszlo
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-15 18:46 ` Laszlo Ersek
@ 2015-10-20 17:27 ` Janusz
2015-10-20 17:44 ` Laszlo Ersek
0 siblings, 1 reply; 37+ messages in thread
From: Janusz @ 2015-10-20 17:27 UTC (permalink / raw)
To: Laszlo Ersek, Kinney, Michael D, Xiao Guangrong
Cc: kvm@vger.kernel.org, Justen, Jordan L, edk2-devel@ml01.01.org,
Alex Williamson, Chen Fan, Paolo Bonzini, Wanpeng Li
On 15.10.2015 at 20:46, Laszlo Ersek wrote:
> On 10/15/15 18:53, Kinney, Michael D wrote:
>> Laszlo,
>>
>> There is already a PCD for this timeout that is used by CpuMpPei.
>>
>> gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds
>>
>> I noticed that CpuDxe is using a hard coded AP timeout. I think we should just use this same PCD for both the PEI and DXE CPU module and then set it for OVMF to the compatible value.
> Perfect, thank you!
>
> (I notice the default in the DEC file is 50000, which is half of what
> the DXE driver hardcodes.)
>
> Now we only need a recommended (or experimental) value for it, and an
> explanation why 100*1000 is no longer sufficient on KVM :)
>
> Thanks!
> Laszlo
>
>
>
Laszlo,
I saw that there is already some change in ovmf for MicroSecondDelay
https://github.com/tianocore/edk2/commit/1e410eadd80c328e66868263b3006a274ce81ae0
Is that a fix for it? Because I tried it and it still doesn't work for
me: https://bpaste.net/show/2514b51bf41f
I still get the internal error.
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-20 17:27 ` Janusz
@ 2015-10-20 17:44 ` Laszlo Ersek
2015-10-20 18:52 ` Janusz Mocek
0 siblings, 1 reply; 37+ messages in thread
From: Laszlo Ersek @ 2015-10-20 17:44 UTC (permalink / raw)
To: Janusz, Kinney, Michael D, Xiao Guangrong
Cc: kvm@vger.kernel.org, Justen, Jordan L, edk2-devel@ml01.01.org,
Alex Williamson, Chen Fan, Paolo Bonzini, Wanpeng Li
Hi,
On 10/20/15 19:27, Janusz wrote:
> On 15.10.2015 at 20:46, Laszlo Ersek wrote:
>> On 10/15/15 18:53, Kinney, Michael D wrote:
>>> Laszlo,
>>>
>>> There is already a PCD for this timeout that is used by CpuMpPei.
>>>
>>> gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds
>>>
>>> I noticed that CpuDxe is using a hard coded AP timeout. I think we should just use this same PCD for both the PEI and DXE CPU module and then set it for OVMF to the compatible value.
>> Perfect, thank you!
>>
>> (I notice the default in the DEC file is 50000, which is half of what
>> the DXE driver hardcodes.)
>>
>> Now we only need a recommended (or experimental) value for it, and an
>> explanation why 100*1000 is no longer sufficient on KVM :)
>>
>> Thanks!
>> Laszlo
>>
>>
>>
> Laszlo,
>
> I saw that there is already some change in ovmf for MicroSecondDelay
> https://github.com/tianocore/edk2/commit/1e410eadd80c328e66868263b3006a274ce81ae0
> Is that a fix for it? Because I tried it and it still doesn't work for
> me: https://bpaste.net/show/2514b51bf41f
> I still get internal error
I think you guys are now "mature enough OVMF users" to start employing
the correct terminology.
"edk2" (also spelled as "EDK II") is: "a modern, feature-rich,
cross-platform firmware development environment for the UEFI and PI
specifications".
The source tree contains a whole bunch of modules (drivers,
applications, libraries), organized into packages.
"OVMF" usually denotes a firmware binary built from one of the
OvmfPkg/OvmfPkg*.dsc "platform description files". Think of them as "top
level makefiles". The difference between them is the target architecture
(there's Ia32, X64, and Ia32X64 -- the last one means that the SEC and
PEI phases are 32-bit, whereas the DXE and later phases are 64-bit.) In
practice you'll only care about full X64.
Now, each of OvmfPkg/OvmfPkg*.dsc builds the following three kinds of
modules into the final binary:
- platform-independent modules from various top-level packages
- platform- (ie. Ia32/X64-) dependent modules from various top-level
packages
- modules from under OvmfPkg that are specific to QEMU/KVM (and Xen, if
you happen to use OVMF with Xen)
Now, when you reference a commit like 1e410ead above, you can look at
the diffstat, and decide if it is OvmfPkg-specific (third category
above) or not. Here you see UefiCpuPkg, which happens to be the second
category.
The important point is: please do *not* call any and all edk2 patches
"OVMF changes", indiscriminately. That's super confusing for people who
understand the above distinctions. Which now you do too. :)
Let me add that in edk2, patches that straddle top level packages are
generally forbidden -- you can't have a patch that modifies OvmfPkg and
UefiCpuPkg at the same time, modulo *very* rare exceptions. If a feature
or bugfix needs to touch several top-level packages, the series must be
built up carefully in stages.
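The two rules above (look at the diffstat to see which package a commit belongs to, and do not straddle top-level packages in one patch) can be expressed as a tiny check. The C sketch below is purely illustrative: `TopLevelLength`, `TouchesSinglePackage` and `IsOvmfPkgSpecific` are hypothetical helpers, not part of any edk2 tooling, and they assume paths are given relative to the repository root with '/' separators.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the first path component (the top-level package directory),
   or 0 if the path contains no '/' at all. */
static size_t TopLevelLength(const char *Path)
{
  const char *Slash = strchr(Path, '/');

  return Slash != NULL ? (size_t)(Slash - Path) : 0;
}

/* Returns 1 if every path in the diffstat is under the same top-level
   package, 0 if the patch straddles packages. */
int TouchesSinglePackage(const char *const *Paths, size_t Count)
{
  size_t FirstLen;
  size_t Index;

  assert(Paths != NULL || Count == 0);
  if (Count == 0) {
    return 1;
  }
  FirstLen = TopLevelLength(Paths[0]);
  for (Index = 1; Index < Count; Index++) {
    if (TopLevelLength(Paths[Index]) != FirstLen ||
        strncmp(Paths[Index], Paths[0], FirstLen) != 0) {
      return 0;
    }
  }
  return 1;
}

/* Returns 1 if the path is in the QEMU/KVM/Xen-specific OvmfPkg. */
int IsOvmfPkgSpecific(const char *Path)
{
  return strncmp(Path, "OvmfPkg/", 8) == 0;
}
```

For example, a diffstat listing only files under UefiCpuPkg/ passes the single-package check and is not OvmfPkg-specific.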
Knowing all of the above, you can tell that the patch you referenced had
only *enabled* OvmfPkg to customize UefiCpuPkg, via
"PcdCpuApInitTimeOutInMicroSeconds". But for that customization to occur
actually, a small patch for OvmfPkg will be necessary too, in order to
set "PcdCpuApInitTimeOutInMicroSeconds" differently from the default.
I plan to send that patch soon. If you'd like to be CC'd, that's great
(reporting back with a Tested-by is even better!), but I'll need your
real name for that. (Or any name that looks like a real name.)
Thanks!
Laszlo
* Re: [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled
2015-10-20 17:44 ` Laszlo Ersek
@ 2015-10-20 18:52 ` Janusz Mocek
0 siblings, 0 replies; 37+ messages in thread
From: Janusz Mocek @ 2015-10-20 18:52 UTC (permalink / raw)
To: Laszlo Ersek, Kinney, Michael D, Xiao Guangrong
Cc: kvm@vger.kernel.org, Justen, Jordan L, edk2-devel@ml01.01.org,
Alex Williamson, Chen Fan, Paolo Bonzini, Wanpeng Li
On 20.10.2015 at 19:44, Laszlo Ersek wrote:
> Hi,
>
> On 10/20/15 19:27, Janusz wrote:
>> On 15.10.2015 at 20:46, Laszlo Ersek wrote:
>>> On 10/15/15 18:53, Kinney, Michael D wrote:
>>>> Laszlo,
>>>>
>>>> There is already a PCD for this timeout that is used by CpuMpPei.
>>>>
>>>> gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds
>>>>
>>>> I noticed that CpuDxe is using a hard coded AP timeout. I think we should just use this same PCD for both the PEI and DXE CPU module and then set it for OVMF to the compatible value.
>>> Perfect, thank you!
>>>
>>> (I notice the default in the DEC file is 50000, which is half of what
>>> the DXE driver hardcodes.)
>>>
>>> Now we only need a recommended (or experimental) value for it, and an
>>> explanation why 100*1000 is no longer sufficient on KVM :)
>>>
>>> Thanks!
>>> Laszlo
>>>
>>>
>>>
>> Laszlo,
>>
>> I saw that there is already some change in ovmf for MicroSecondDelay
>> https://github.com/tianocore/edk2/commit/1e410eadd80c328e66868263b3006a274ce81ae0
>> Is that a fix for it? Because I tried it and it still doesn't work for
>> me: https://bpaste.net/show/2514b51bf41f
>> I still get internal error
> I think you guys are now "mature enough OVMF users" to start employing
> the correct terminology.
Sorry about that :)
> "edk2" (also spelled as "EDK II") is: "a modern, feature-rich,
> cross-platform firmware development environment for the UEFI and PI
> specifications".
>
> The source tree contains a whole bunch of modules (drivers,
> applications, libraries), organized into packages.
>
> "OVMF" usually denotes a firmware binary built from one of the
> OvmfPkg/OvmfPkg*.dsc "platform description files". Think of them as "top
> level makefiles". The difference between them is the target architecture
> (there's Ia32, X64, and Ia32X64 -- the last one means that the SEC and
> PEI phases are 32-bit, whereas the DXE and later phases are 64-bit.) In
> practice you'll only care about full X64.
>
> Now, each of OvmfPkg/OvmfPkg*.dsc builds the following three kinds of
> modules into the final binary:
> - platform-independent modules from various top-level packages
> - platform- (ie. Ia32/X64-) dependent modules from various top-level
> packages
> - modules from under OvmfPkg that are specific to QEMU/KVM (and Xen, if
> you happen to use OVMF with Xen)
>
> Now, when you reference a commit like 1e410ead above, you can look at
> the diffstat, and decide if it is OvmfPkg-specific (third category
> above) or not. Here you see UefiCpuPkg, which happens to be the second
> category.
>
> The important point is: please do *not* call any and all edk2 patches
> "OVMF changes", indiscriminately. That's super confusing for people who
> understand the above distinctions. Which now you do too. :)
>
> Let me add that in edk2, patches that straddle top level packages are
> generally forbidden -- you can't have a patch that modifies OvmfPkg and
> UefiCpuPkg at the same time, modulo *very* rare exceptions. If a feature
> or bugfix needs to touch several top-level packages, the series must be
> built up carefully in stages.
>
> Knowing all of the above, you can tell that the patch you referenced had
> only *enabled* OvmfPkg to customize UefiCpuPkg, via
> "PcdCpuApInitTimeOutInMicroSeconds". But for that customization to occur
> actually, a small patch for OvmfPkg will be necessary too, in order to
> set "PcdCpuApInitTimeOutInMicroSeconds" differently from the default.
>
> I plan to send that patch soon. If you'd like to be CC'd, that's great
> (reporting back with a Tested-by is even better!), but I'll need your
> real name for that. (Or any name that looks like a real name.)
It would be great if you could add me to the CC list, thanks.
>
> Thanks!
> Laszlo
Thread overview: 37+ messages
2015-09-18 9:37 [edk2] KVM: MTRR: fix memory type handling if MTRR is completely disabled Janusz
2015-09-18 10:07 ` Laszlo Ersek
2015-09-18 17:48 ` Janusz
2015-09-21 2:51 ` Xiao Guangrong
2015-09-21 3:30 ` Wanpeng Li
2015-09-21 3:40 ` Xiao Guangrong
2015-10-01 14:12 ` Janusz
2015-10-01 14:18 ` Paolo Bonzini
2015-10-02 14:38 ` Janusz
2015-10-10 20:07 ` Xiao Guangrong
2015-10-12 18:20 ` Xiao Guangrong
2015-10-12 18:29 ` Xiao Guangrong
2015-10-14 3:58 ` Xiao Guangrong
2015-10-14 7:37 ` Janusz
2015-10-14 8:24 ` Xiao Guangrong
2015-10-14 8:32 ` Xiao Guangrong
2015-10-14 9:13 ` Janusz
2015-10-14 9:16 ` Janusz
2015-10-14 9:47 ` Laszlo Ersek
2015-10-15 3:59 ` Xiao Guangrong
2015-10-14 18:08 ` Janusz
2015-10-15 4:19 ` Xiao Guangrong
2015-10-15 6:19 ` Janusz
2015-10-15 6:41 ` Xiao Guangrong
2015-10-15 6:58 ` Janusz
2015-10-15 7:10 ` Xiao Guangrong
2015-10-15 7:21 ` Janusz
2015-10-15 16:18 ` Laszlo Ersek
2015-10-15 16:53 ` Kinney, Michael D
2015-10-15 18:46 ` Laszlo Ersek
2015-10-20 17:27 ` Janusz
2015-10-20 17:44 ` Laszlo Ersek
2015-10-20 18:52 ` Janusz Mocek
[not found] ` <5620696F.7050406@linux.intel.com>
2015-10-16 18:22 ` Laszlo Ersek
2015-09-21 8:23 ` Janusz
2015-09-22 8:59 ` Paolo Bonzini
2015-09-22 10:29 ` Janusz