From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1860!
Date: Fri, 8 Apr 2011 19:24:35 +0800
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="_460269ca-875f-4047-a8e0-1f9fee0def2b_"
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen devel
Cc: jeremy@goop.org, dave@ivt.com.au, giamteckchoon@gmail.com, ian.campbell@citrix.com, konrad.wilk@oracle.com
List-Id: xen-devel@lists.xenproject.org

--_460269ca-875f-4047-a8e0-1f9fee0def2b_
Content-Type: multipart/alternative; boundary="_e80b7637-5c0a-4b02-9f72-aa3c3c0fd1c3_"

--_e80b7637-5c0a-4b02-9f72-aa3c3c0fd1c3_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi:
     Unfortunately, I hit exactly the same bug today, with pvops kernel 2.6.32.36 and Xen 4.0.1.
     The kernel panic and serial logs are attached.

     Our test case is quite simple: on a single physical host, we start 12 HVMs (Windows 2003),
and each HVM reboots every 10 minutes.

     The bug is easy to hit on our 48G machine (within hours), but we haven't hit it on our 24G
machines (we have three 24G machines, and all work fine). Is it possibly related to memory capacity?

Taking a look at the serial output, the Dom0 code is attempting to pin what it thinks
is a "PGT_l3_page_table"; however, the hypervisor returns -EINVAL because it actually is a "PGT_writable_page".

(XEN) mm.c:2364:d0 Bad type (saw 7400000000000001 != exp 4000 0000 0000 0000) for mfn 898a41 (pfn 9ca41)
(XEN) mm.c:2733:d0 Error while pinning mfn 898a41

Before that, there are quite a lot of abnormal grant table log entries like:

(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965888
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983

It looks like something is wrong with the grant table.

Many thanks.

> From: Jeremy Fitzhardinge
> Subject: Re: [Xen-devel] [SPAM] Re: kernel BUG at
> arch/x86/xen/mmu.c:1860! - ideas.
> To: Ian Campbell
> Cc: Dave Hunter , Teck Choon Giam
> , "xen-devel@lists.xensource.com"
> xen-devel@lists.xensource.com

> On 04/06/2011 12:53 AM, Ian Campbell wrote:
> > Please don't top post.
> >
> > On Wed, 2011-04-06 at 00:20 +0100, Dave Hunter wrote:
> >> Is it likely that Debian would release an updated kernel in squeeze with
> >> this configuration? (sorry, this might not be the place to ask).
> > I doubt they will, enabling DEBUG_PAGEALLOC seems very much like a
> > workaround not a solution to me.
>
> Yes, it will impose a pretty large performance overhead.
>
> J
>
>

--_e80b7637-5c0a-4b02-9f72-aa3c3c0fd1c3_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi:
     Unfortunately, I hit exactly the same bug today, with pvops kernel 2.6.32.36 and Xen 4.0.1.
     The kernel panic and serial logs are attached.
 
     Our test case is quite simple: on a single physical host, we start 12 HVMs (Windows 2003),
and each HVM reboots every 10 minutes.
 
     The bug is easy to hit on our 48G machine (within hours), but we haven't hit it on our 24G
machines (we have three 24G machines, and all work fine). Is it possibly related to memory capacity?
 
Taking a look at the serial output, the Dom0 code is attempting to pin what it thinks
is a "PGT_l3_page_table"; however, the hypervisor returns -EINVAL because it actually is a "PGT_writable_page".
 
(XEN) mm.c:2364:d0 Bad type (saw 7400000000000001 != exp 4000 0000 0000 0000) for mfn 898a41 (pfn 9ca41)
(XEN) mm.c:2733:d0 Error while pinning mfn 898a41
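
As a rough aid for reading the "Bad type" line above, the sketch below compares the top four bits of the two words and lists the bit positions where they differ. That the top four bits carry the page type is an assumption about the hypervisor's type word layout and is not confirmed anywhere in this thread; the helper names `top_nibble` and `differing_bits` are made up for illustration.

```python
# Values copied from the "(XEN) mm.c:2364:d0 Bad type" line in the serial log.
SAW = 0x7400000000000001
EXP = 0x4000000000000000


def top_nibble(word: int) -> int:
    """Return the top four bits of a 64-bit word.

    ASSUMPTION: these bits hold the page type; this layout is not
    confirmed in the thread.
    """
    return (word >> 60) & 0xF


def differing_bits(a: int, b: int) -> list[int]:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = (a ^ b) & 0xFFFFFFFFFFFFFFFF
    return [bit for bit in range(64) if (diff >> bit) & 1]


print(f"saw top nibble: {top_nibble(SAW):#x}")
print(f"exp top nibble: {top_nibble(EXP):#x}")
print(f"differing bit positions: {differing_bits(SAW, EXP)}")
```

The point of the comparison is only that the two words disagree in their high bits, which is consistent with the hypervisor seeing a different page type than the one requested.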
 
Before that, there are quite a lot of abnormal grant table log entries like:
 
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965888
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983

 
It looks like something is wrong with the grant table.
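
The reference numbers in the "Bad grant reference" lines sit just below 2^32. The sketch below reinterprets them as signed 32-bit integers, which turns them into small negative numbers. Whether the values really come from a negative or uninitialized reference is only a guess, and the helper name `as_signed32` is made up for illustration.

```python
# Grant references copied from the "Bad grant reference" lines in the log.
# Reading them as wrapped negative numbers is a hypothesis, not a diagnosis.
BAD_REFS = [4294965983, 4294965888]


def as_signed32(value: int) -> int:
    """Return the two's-complement signed reading of a 32-bit unsigned value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


for ref in BAD_REFS:
    print(f"{ref} = {ref:#010x} = {as_signed32(ref)} as signed 32-bit")
```

If the references were valid, they would be expected to be small non-negative indices, so values this close to 2^32 suggest the reference itself is corrupted or was never set.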
 
Many thanks.
 
> From: Jeremy Fitzhardinge <jeremy@goop.org>
> Subject: Re: [Xen-devel] [SPAM] Re: kernel BUG at
> arch/x86/xen/mmu.c:1860! - ideas.
> To: Ian Campbell <Ian.Campbell@citrix.com>
> Cc: Dave Hunter <dave@ivt.com.au>, Teck Choon Giam
> <giamteckchoon@gmail.com>, "xen-devel@lists.xensource.com"
> xen-devel@lists.xensource.com
 
> On 04/06/2011 12:53 AM, Ian Campbell wrote:
> > Please don't top post.
> >
> > On Wed, 2011-04-06 at 00:20 +0100, Dave Hunter wrote:
> >> Is it likely that Debian would release an updated kernel in squeeze with
> >> this configuration? (sorry, this might not be the place to ask).
> > I doubt they will, enabling DEBUG_PAGEALLOC seems very much like a
> > workaround not a solution to me.
>
> Yes, it will impose a pretty large performance overhead.
>
> J
>



--_e80b7637-5c0a-4b02-9f72-aa3c3c0fd1c3_-- --_460269ca-875f-4047-a8e0-1f9fee0def2b_ Content-Type: text/plain Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="kernel.txt" QXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IC0tLS0tLS0tLS0tLVsgY3V0IGhlcmUg XS0tLS0tLS0tLS0tLQ0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IGtlcm5lbCBC VUcgYXQgYXJjaC94ODYveGVuL21tdS5jOjE4NzIhDQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3 IGtlcm5lbDogaW52YWxpZCBvcGNvZGU6IDAwMDAgWyMxXSBTTVANCkFwciAgOCAxMjoxOTo0NyBy MTRhMTEwMTcga2VybmVsOiBsYXN0IHN5c2ZzIGZpbGU6IC9zeXMvaHlwZXJ2aXNvci9wcm9wZXJ0 aWVzL2NhcGFiaWxpdGllcw0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IENQVSAw DQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogTW9kdWxlcyBsaW5rZWQgaW46IDgw MjFxIGdhcnAgYmxrdGFwIHhlbl9uZXRiYWNrIHhlbl9ibGtiYWNrIGJsa2JhY2tfcGFnZW1hcCBu YmQgYnJpZGdlIHN0cCBsbGMgYXV0b2ZzNCBpcG1pX2RldmludGYgaXBtaV9zaSBpcG1pX21zZ2hh bmRsZXIgbG9ja2Qgc3VucnBjIGJvbmRpbmcgaXB2NiB4ZW5mcyBkbV9tdWx0aXBhdGggdmlkZW8g b3V0cHV0IHNicyBzYnNoYyBwYXJwb3J0X3BjIGxwIHBhcnBvcnQgc2VzIGVuY2xvc3VyZSBzbmRf c2VxX2R1bW15IHNuZF9zZXFfb3NzIGJueDIgc25kX3NlcV9taWRpX2V2ZW50IHNlcmlvX3JhdyBz bmRfc2VxIHNuZF9zZXFfZGV2aWNlIHNuZF9wY21fb3NzIHNuZF9taXhlcl9vc3Mgc25kX3BjbSBz bmRfdGltZXIgaTJjX2k4MDEgaVRDT193ZHQgaTJjX2NvcmUgc25kIHNvdW5kY29yZSBzbmRfcGFn ZV9hbGxvYyBpVENPX3ZlbmRvcl9zdXBwb3J0IHBhdGFfYWNwaSBhdGFfZ2VuZXJpYyBwY3Nwa3Ig YXRhX3BpaXggc2hwY2hwIG1wdHNhcyBtcHRzY3NpaCBtcHRiYXNlIFtsYXN0IHVubG9hZGVkOiBm cmVxX3RhYmxlXQ0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IFBpZDogMTU3Njks IGNvbW06IHNoIE5vdCB0YWludGVkIDIuNi4zMi4zNnhlbiAjMSBUZWNhbCBSSDIyODUNCkFwciAg OCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiBSSVA6IGUwMzA6WzxmZmZmZmZmZjgxMDBjZWJj Pl0gIFs8ZmZmZmZmZmY4MTAwY2ViYz5dIHBpbl9wYWdldGFibGVfcGZuKzB4MzYvMHgzYw0KQXBy ICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IFJTUDogZTAyYjpmZmZmODgwMDFlYjdiYWE4 ICBFRkxBR1M6IDAwMDEwMjgyDQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogUkFY OiAwMDAwMDAwMGZmZmZmZmVhIFJCWDogMDAwMDAwMDAwMDA3YjMwNyBSQ1g6IDAwMDAwMDAwMDAw 
MDAwMDENCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiBSRFg6IDAwMDAwMDAwMDAw MDAwMDAgUlNJOiAwMDAwMDAwMDAwMDAwMDAxIFJESTogZmZmZjg4MDAxZWI3YmFhOA0KQXByICA4 IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IFJCUDogZmZmZjg4MDAxZWI3YmFjOCBSMDg6IDAw MDAwMDAwMDAwMDA0MjAgUjA5OiBmZmZmODgwMDAwMDAwMDAwDQpBcHIgIDggMTI6MTk6NDcgcjE0 YTExMDE3IGtlcm5lbDogUjEwOiAwMDAwMDAwMDAwMDA3ZmYwIFIxMTogZmZmZjg4MDA4ZmM5NzI0 OCBSMTI6IGZmZmY4ODAwMjg0MGIwMDANCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVs OiBSMTM6IDAwMDAwMDAwMDAwN2I0ODQgUjE0OiAwMDAwMDAwMDAwMDAwMDAzIFIxNTogZmZmZjg4 MDA5YjA5MDAwMA0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IEZTOiAgMDAwMDdm ZThiYmM2NTZlMCgwMDAwKSBHUzpmZmZmODgwMDI4MDNiMDAwKDAwMDApIGtubEdTOjAwMDAwMDAw MDAwMDAwMDANCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiBDUzogIGUwMzMgRFM6 IDAwMDAgRVM6IDAwMDAgQ1IwOiAwMDAwMDAwMDgwMDUwMDNiDQpBcHIgIDggMTI6MTk6NDcgcjE0 YTExMDE3IGtlcm5lbDogQ1IyOiAwMDAwMDAwMDAwNmJiMzM4IENSMzogMDAwMDAwMDA3YjMwNzAw MCBDUjQ6IDAwMDAwMDAwMDAwMDI2NjANCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVs OiBEUjA6IDAwMDAwMDAwMDAwMDAwMDAgRFIxOiAwMDAwMDAwMDAwMDAwMDAwIERSMjogMDAwMDAw MDAwMDAwMDAwMA0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IERSMzogMDAwMDAw MDAwMDAwMDAwMCBEUjY6IDAwMDAwMDAwZmZmZjBmZjAgRFI3OiAwMDAwMDAwMDAwMDAwNDAwDQpB cHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogUHJvY2VzcyBzaCAocGlkOiAxNTc2OSwg dGhyZWFkaW5mbyBmZmZmODgwMDFlYjdhMDAwLCB0YXNrIGZmZmY4ODAwOWIwOTAwMDApDQpBcHIg IDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogU3RhY2s6DQpBcHIgIDggMTI6MTk6NDcgcjE0 YTExMDE3IGtlcm5lbDogIDAwMDAwMDAwMDAwMDAwMDAgMDAwMDAwMDAwMDRiNzQ4NCAwMDAwMDAw MTFlYjdiYWM4IDAwMDAwMDAwMDAwN2IzMDcNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2Vy bmVsOiA8MD4gZmZmZjg4MDAxZWI3YmFmOCBmZmZmZmZmZjgxMDBlOGVmIGZmZmY4ODAxMmU0ZmIx MDAgZmZmZjg4MDAwZmI1ZTAxOA0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IDww PiAwMDAwMDAwMDAwMDdiNDg0IDAwMDAwMDAwMDA2YmIzMzggZmZmZjg4MDAxZWI3YmIwOCBmZmZm ZmZmZjgxMDBlOTM1DQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogQ2FsbCBUcmFj 
ZToNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMDBlOGVm Pl0geGVuX2FsbG9jX3B0cGFnZSsweDhkLzB4OTYNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcg a2VybmVsOiAgWzxmZmZmZmZmZjgxMDBlOTM1Pl0geGVuX2FsbG9jX3B0ZSsweDEzLzB4MTUNCkFw ciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMGViNzAyPl0gX19w dGVfYWxsb2MrMHg3Zi8weGRjDQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogIFs8 ZmZmZmZmZmY4MTBlOTBiZD5dID8gcG1kX29mZnNldCsweDEzLzB4M2MNCkFwciAgOCAxMjoxOTo0 NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMGViODE4Pl0gaGFuZGxlX21tX2ZhdWx0 KzB4YjkvMHg3NzENCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZm ZjgxMGYwOGZkPl0gPyB2bWFfbGluaysweDdjLzB4YTQNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEw MTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMGYxM2IwPl0gPyBtbWFwX3JlZ2lvbisweDMyMi8weDQy Yg0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6ICBbPGZmZmZmZmZmODEwMGYxNjk+ XSA/IHhlbl9mb3JjZV9ldnRjaG5fY2FsbGJhY2srMHhkLzB4Zg0KQXByICA4IDEyOjE5OjQ3IHIx NGExMTAxNyBrZXJuZWw6ICBbPGZmZmZmZmZmODE0NDk3MDE+XSBkb19wYWdlX2ZhdWx0KzB4MjFj LzB4Mjg4DQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogIFs8ZmZmZmZmZmY4MTQ0 NzY5NT5dIHBhZ2VfZmF1bHQrMHgyNS8weDMwDQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtl cm5lbDogIFs8ZmZmZmZmZmY4MTIyMmEzOT5dID8gX19jbGVhcl91c2VyKzB4MzMvMHg1NQ0KQXBy ICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6ICBbPGZmZmZmZmZmODEyMjJhMWQ+XSA/IF9f Y2xlYXJfdXNlcisweDE3LzB4NTUNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAg WzxmZmZmZmZmZjgxMjIyYThiPl0gY2xlYXJfdXNlcisweDMwLzB4MzgNCkFwciAgOCAxMjoxOTo0 NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMTUxMzlhPl0gbG9hZF9lbGZfYmluYXJ5 KzB4NWQ1LzB4MTdlZg0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6ICBbPGZmZmZm ZmZmODExZjQ2NDg+XSA/IHByb2Nlc3NfbWVhc3VyZW1lbnQrMHhjMC8weGQ3DQpBcHIgIDggMTI6 MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogIFs8ZmZmZmZmZmY4MTE1MGRjNT5dID8gbG9hZF9lbGZf YmluYXJ5KzB4MC8weDE3ZWYNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAgWzxm ZmZmZmZmZjgxMTEzMDk0Pl0gc2VhcmNoX2JpbmFyeV9oYW5kbGVyKzB4YzgvMHgyNTUNCkFwciAg 
OCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMTE0MzYyPl0gZG9fZXhl Y3ZlKzB4MWMzLzB4MjllDQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogIFs8ZmZm ZmZmZmY4MTAxMTU1ZD5dIHN5c19leGVjdmUrMHg0My8weDVkDQpBcHIgIDggMTI6MTk6NDcgcjE0 YTExMDE3IGtlcm5lbDogIFs8ZmZmZmZmZmY4MTAxMzFjYT5dIHN0dWJfZXhlY3ZlKzB4NmEvMHhj MA== --_460269ca-875f-4047-a8e0-1f9fee0def2b_ Content-Type: text/plain Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="serial.txt" KFhFTikgZ3JhbnRfdGFibGUuYzoxNzE3OmQwIEJhZCBncmFudCByZWZlcmVuY2UgNDI5NDk2NTk4 Mw0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAo ZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDAp IG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQw IEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRf dGFibGUuYzoxNzE3OmQwIEJhZCBncmFudCByZWZlcmVuY2UgNDI5NDk2NTg4OA0KKFhFTikgZ3Jh bnRfdGFibGUuYzoxNzE3OmQwIEJhZCBncmFudCByZWZlcmVuY2UgNDI5NDk2NTk4Mw0KKFhFTikg Z3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQg ZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAo MCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MTcxNzpkMCBCYWQgZ3Jh bnQgcmVmZXJlbmNlIDQyOTQ5NjU5ODMNCihYRU4pIGdyYW50X3RhYmxlLmM6MTcxNzpkMCBCYWQg Z3JhbnQgcmVmZXJlbmNlIDQyOTQ5NjU5ODMNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJh ZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFi bGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQoo WEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBl Y3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3Ig ZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFk IGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJs ZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihY 
RU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVj dGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiAxIG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBn cmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBk b20gMCkNCihYRU4pIHByaW50azogMSBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRf dGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDAp DQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChl eHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkg b3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiA1IG1lc3NhZ2VzIHN1 cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRv bSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMTQgbWVzc2FnZXMgc3VwcHJl c3NlZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgw KS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiA3IG1lc3NhZ2VzIHN1cHByZXNzZWQu DQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChl eHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMSBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhF TikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0 ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5j ZSA0Mjk0OTAxNzYwDQooWEVOKSBwcmludGs6IDEzIG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVO KSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzYwDQoo WEVOKSBwcmludGs6IDU5IG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5j OjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzYwDQooWEVOKSBwcmludGs6IDgx IG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdy YW50IHJlZmVyZW5jZSA0Mjk0OTAxNzYwDQooWEVOKSBwcmludGs6IDc1IG1lc3NhZ2VzIHN1cHBy ZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0 Mjk0OTAxNzYwDQooWEVOKSBwcmludGs6IDc5IG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBn cmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzYwDQooWEVO 
KSBwcmludGs6IDgxIG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjE3 MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzYwDQooWEVOKSBwcmludGs6IDMzIG1l c3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50 IHJlZmVyZW5jZSA0Mjk0OTAxNzYwDQooWEVOKSBwcmludGs6IDkgbWVzc2FnZXMgc3VwcHJlc3Nl ZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MTcxNzpkMCBCYWQgZ3JhbnQgcmVmZXJlbmNlIDQyOTQ5 MDE3NjUNCihYRU4pIHByaW50azogNyBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRf dGFibGUuYzoxNzE3OmQwIEJhZCBncmFudCByZWZlcmVuY2UgNDI5NDkwMTc2MA0KKFhFTikgcHJp bnRrOiAxIG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAg QmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzYwDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6 ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzY1DQooWEVOKSBncmFudF90YWJsZS5jOjE3 MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzYwDQooWEVOKSBncmFudF90YWJsZS5j OjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzY1DQooWEVOKSBncmFudF90YWJs ZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzYwDQooWEVOKSBncmFudF90 YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzY1DQooWEVOKSBncmFu dF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzYwDQooWEVOKSBn cmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0OTAxNzY1DQooWEVO KSBwcmludGs6IDEwIG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBtbS5jOjIzNjQ6ZDAgQmFk IHR5cGUgKHNhdyA3NDAwMDAwMDAwMDAwMDAxICE9IGV4cCA0MDAwIDAwMDAgMDAwMCAwMDAwKSBm b3IgbWZuIDg5OGE0MSAocGZuIDljYTQxKQ0KKFhFTikgbW0uYzoyNzMzOmQwIEVycm9yIHdoaWxl IHBpbm5pbmcgbWZuIDg5OGE0MQ0KICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgODAw MDAwMDAwMDAwMDAwMCANCihYRU4pIG1tLmM6MjM2NDpkMCBCYWQgdHlwZSAoc2F3IDc0MDAwMDAw MDAwMDAwMDEgIT0gZXhwIDQwMDAwMDAwMDAwMDAwMDApIGZvciBtZm4gODcxNDQzIChwZm4gNzU0 NDMpDQooWEVOKSBtbS5jOjI3MzM6ZDAgRXJyb3Igd2hpbGUgcGlubmluZyBtZm4gODcxNDQzDQoo WEVOKSBtbS5jOjIzNjQ6ZDAgQmFkIHR5cGUgKHNhdyA3NDAwMDAwMDAwMDAwMDAxICE9IGV4cCA0 MDAwMDAwMDAwMDAwMDAwKSBmb3IgbWZuIDg5OGE0MSAocGZuIDljYTQxKQ0KKFhFTikgbW0uYzoy 
NTAwOmQwIEVycm9yIHdoaWxlIGluc3RhbGxpbmcgbmV3IGJhc2VwdHIgODk4YTQxDQooWEVOKSBt bS5jOjIzNjQ6ZDAgQmFkIHR5cGUgKHNhdyA3NDAwMDAwMDAwMDAwMDAxICE9IGV4cCA0MDAwMDAw MDAwMDAwMDAwKSBmb3IgbWZuIDg3MTQ0MyAocGZuIDc1NDQzKQ0KKFhFTikgbW0uYzoyODI1OmQw IEVycm9yIHdoaWxlIGluc3RhbGxpbmcgbmV3IG1mbiA4NzE0NDMNCihYRU4pIG1tLmM6MjM2NDpk MCBCYWQgdHlwZSAoc2F3IDQ0MDAwMDAwMDAwMDAwMDEgIT0gZXhwIDcwMDAwMDAwMDAwMDAwMDAp IGZvciBtZm4gODk5NTUxIChwZm4gOWQ1NTEpDQooWEVOKSBtbS5jOjg2MDpkMCBFcnJvciBnZXR0 aW5nIG1mbiA4OTk1NTEgKHBmbiA5ZDU1MSkgZnJvbSBMMSBlbnRyeSA4MDAwMDAwODk5NTUxMDYz IGZvciBsMWVfb3duZXI9MCwgcGdfb3duZXI9MA== --_460269ca-875f-4047-a8e0-1f9fee0def2b_ Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --_460269ca-875f-4047-a8e0-1f9fee0def2b_-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: kernel BUG at arch/x86/xen/mmu.c:1860! Date: Fri, 8 Apr 2011 19:46:43 +0800 Message-ID: References: , Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0882369681==" Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: xen devel Cc: jeremy@goop.org, dave@ivt.com.au, giamteckchoon@gmail.com, ian.campbell@citrix.com, konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org --===============0882369681== Content-Type: multipart/alternative; boundary="_d5ee14b4-d939-4bd9-baee-5eb98e2ea4e4_" --_d5ee14b4-d939-4bd9-baee-5eb98e2ea4e4_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable HI: =20 As I go through the code with log, I noticed that the log:=20 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). 
(expected dom 0)

is from xen/common/grant_table.c:266, which is in the function _set_status_v1(),
so it looks like kernel 2.6.32 uses grant table version 1.

In 2.6.31's drivers/xen/grant-table.c, however, I noticed the function gnttab_request_version(),
which suggests that 2.6.31 requires grant table version 2. But this function cannot be found
in 2.6.32.

Is this correct?

Thanks.


>--------------------------------------------------------------------------------
>From: tinnycloud@hotmail.com
>To: xen-devel@lists.xensource.com
>CC: dave@ivt.com.au; ian.campbell@citrix.com; giamteckchoon@gmail.com; konrad.wilk@oracle.com; jeremy@goop.org
>Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1860!
>Date: Fri, 8 Apr 2011 19:24:35 +0800
>
>Hi:
>     Unfortunately I met the exactly same bug today. With pvops kernel 2.6.32.36, and xen 4.0.1.
>     Kernel Panic and serial log attached.
>
>     Our test cases is quite simple, on a single physical host, we start 12 HVMS(windows 2003),
>each of the HVM reboot every 10minutes.
>
>     The bug is easy to hit on our 48G machine(in hours). But We haven't hit the bug in our 24G
>machine(we have three 24G machine, all works fine.) -----Is is possible related to Memory capacity?
>
>Taking a look at the serial output, the Dom0 code is attempting to pin what it thins
>is a "PGT_l3_page_table", however the hypervisor returns -EINVAL because it actually is a "PGT_writable_page".
>
>(XEN) mm.c:2364:d0 Bad type (saw 7400000000000001 != exp 4000 0000 0000 0000) for mfn 898a41 (pfn 9ca41)
>(XEN) mm.c:2733:d0 Error while pinning mfn 898a41
>
>And before that quite a lot abnormal grant table log like :
>
>(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
>(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
>(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
>(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
>(XEN) grant_table.c:1717:d0 Bad grant reference 4294965888
>(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
>
>It looks like something wrong with grant table.
>
>Many thanks.
>
>> From: Jeremy Fitzhardinge
>> Subject: Re: [Xen-devel] [SPAM] Re: kernel BUG at
>> arch/x86/xen/mmu.c:1860! - ideas.
>> To: Ian Campbell
>> Cc: Dave Hunter , Teck Choon Giam
>> , "xen-devel@lists.xensource.com"
>> xen-devel@lists.xensource.com
>
>> On 04/06/2011 12:53 AM, Ian Campbell wrote:
>> > Please don't top post.
>> >
>> > On Wed, 2011-04-06 at 00:20 +0100, Dave Hunter wrote:
>> >> Is it likely that Debian would release an updated kernel in squeeze with
>> >> this configuration? (sorry, this might not be the place to ask).
>> > I doubt they will, enabling DEBUG_PAGEALLOC seems very much like a
>> > workaround not a solution to me.
>>
>> Yes, it will impose a pretty large performance overhead.
>>
>> J
>>
>>
>

--_d5ee14b4-d939-4bd9-baee-5eb98e2ea4e4_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

HI:
 
     As I went through the code alongside the log, I noticed that the log line:
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
 
is from xen/common/grant_table.c:266, which is in the function _set_status_v1(),
so it looks like kernel 2.6.32 uses grant table version 1.
 
In 2.6.31's drivers/xen/grant-table.c, however, I noticed the function gnttab_request_version(),
which suggests that 2.6.31 requires grant table version 2. But this function cannot be found
in 2.6.32.
 
Is this correct?
 
Thanks.
 
 
>--------------------------------------------------------------------------------
>From: tinnycloud@hotmail.com
>To: xen-devel@lists.xensource.com
>CC: dave@ivt.com.au; ian.campbell@citrix.com; giamteckchoon@gmail.com; konrad.wilk@oracle.com; jeremy@goop.org
>Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1860!
>Date: Fri, 8 Apr 2011 19:24:35 +0800
>
>Hi:
>     Unfortunately I met the exactly same bug today. With pvops kernel 2.6.32.36, and xen 4.0.1.
>     Kernel Panic and serial log attached.
>
>     Our test cases is quite simple, on a single physical host, we start 12 HVMS(windows 2003),
>each of the HVM reboot every 10minutes.
>
>     The bug is easy to hit on our 48G machine(in hours). But We haven't hit the bug in our 24G
>machine(we have three 24G machine, all works fine.) -----Is is possible related to Memory capacity?
>
>Taking a look at the serial output, the Dom0 code is attempting to pin what it thins
>is a "PGT_l3_page_table", however the hypervisor returns -EINVAL because it actually is a "PGT_writable_page".
>
>(XEN) mm.c:2364:d0 Bad type (saw 7400000000000001 != exp 4000 0000 0000 0000) for mfn 898a41 (pfn 9ca41)
>(XEN) mm.c:2733:d0 Error while pinning mfn 898a41
>
>And before that quite a lot abnormal grant table log like :
>
>(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
>(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
>(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
>(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
>(XEN) grant_table.c:1717:d0 Bad grant reference 4294965888
>(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
>
>It looks like something wrong with grant table.
>
>Many thanks.
>
>> From: Jeremy Fitzhardinge <jeremy@goop.org>
>> Subject: Re: [Xen-devel] [SPAM] Re: kernel BUG at
>> arch/x86/xen/mmu.c:1860! - ideas.
>> To: Ian Campbell <Ian.Campbell@citrix.com>
>> Cc: Dave Hunter <dave@ivt.com.au>, Teck Choon Giam
>> <giamteckchoon@gmail.com>, "xen-devel@lists.xensource.com"
>> xen-devel@lists.xensource.com
>
>> On 04/06/2011 12:53 AM, Ian Campbell wrote:
>> > Please don't top post.
>> >
>> > On Wed, 2011-04-06 at 00:20 +0100, Dave Hunter wrote:
>> >> Is it likely that Debian would release an updated kernel in squeeze with
>> >> this configuration? (sorry, this might not be the place to ask).
>> > I doubt they will, enabling DEBUG_PAGEALLOC seems very much like a
>> > workaround not a solution to me.
>>
>> Yes, it will impose a pretty large performance overhead.
>>
>> J
>>
>>
>

--_d5ee14b4-d939-4bd9-baee-5eb98e2ea4e4_--

--===============0882369681==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0882369681==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Sun, 10 Apr 2011 11:57:14 +0800
Message-ID:
References: ,
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0747822049=="
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen devel
Cc: jeremy@goop.org, dave@ivt.com.au, giamteckchoon@gmail.com, ian.campbell@citrix.com, konrad.wilk@oracle.com
List-Id: xen-devel@lists.xenproject.org

--===============0747822049==
Content-Type: multipart/alternative; boundary="_5f54129e-ac36-4a84-98fe-21db6fb6839b_"

--_5f54129e-ac36-4a84-98fe-21db6fb6839b_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi Konrad & Jeremy:

     I'd like to open this bug in a new thread, since the old thread is too long to read.

     We recently wanted to upgrade our kernel to 2.6.32, but unfortunately we ran into a kernel crash bug.
Our test case is simple: start 24 win2003 HVMs on our physical machine, with each HVM rebooting
every 15 minutes. The kernel crashes within half an hour (that is, it crashes when the VMs start for the second time).

We took our tests much further
and tried different kernel versions.
2.6.32.10
2.6.32.10
2.6.32.10


--_5f54129e-ac36-4a84-98fe-21db6fb6839b_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi Konrad & Jeremy:
 
     I'd like to open this bug in a new thread, since the old thread is too long to read.
    
     We recently wanted to upgrade our kernel to 2.6.32, but unfortunately we ran into a kernel crash bug.
Our test case is simple: start 24 win2003 HVMs on our physical machine, with each HVM rebooting
every 15 minutes. The kernel crashes within half an hour (that is, it crashes when the VMs start for the second time).
 
We took our tests much further
and tried different kernel versions.
2.6.32.10
2.6.32.10
2.6.32.10
 
     
--_5f54129e-ac36-4a84-98fe-21db6fb6839b_--

--===============0747822049==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0747822049==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Sun, 10 Apr 2011 12:29:06 +0800
Message-ID:
References: ,
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="_2bf2ab8c-b338-47ff-b127-f026606dcf02_"
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen devel
Cc: jeremy@goop.org, dave@ivt.com.au, giamteckchoon@gmail.com, ian.campbell@citrix.com, konrad.wilk@oracle.com
List-Id: xen-devel@lists.xenproject.org

--_2bf2ab8c-b338-47ff-b127-f026606dcf02_
Content-Type: multipart/alternative; boundary="_5f80312c-512c-4032-b060-8da09a462daf_"

--_5f80312c-512c-4032-b060-8da09a462daf_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

(Please ignore my last mail; it was sent by mistake.)

Hi Konrad & Jeremy:

     I'd like to open this bug in a new thread, since the old thread is too long to read easily.

     We recently wanted to upgrade our kernel to 2.6.32, but unfortunately we ran into a kernel crash bug.
Our test case is simple: start 24 win2003 HVMs on our physical machine, with each HVM rebooting
every 15 minutes. The kernel crashes within half an hour (that is, it crashes when the VMs start for the second time).

We took our tests much further
and tried different kernel versions.
2.6.32.10  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=d945b014ac5df9592c478bf9486d97e8914aab59
2.6.32.11  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=27f948a3bf365a5bc3d56119637a177d41147815
2.6.32.12  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=ba739f9abd3f659b907a824af1161926b420a2ce
2.6.32.13  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=f6fe6583b77a49b569eef1b66c3d761eec2e561b
2.6.32.15  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=27ed1b0e0dae5f1d5da5c76451bc84cb529128bd
2.6.32.21  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=69e50db231723596ed8ef9275d0068d6697f466a

There are basically three different results we met.

i1) Grant table issue
The host still functions, but xm dmesg shows abnormal log entries. Please refer to the attached grant table log.

i2) Kernel crash at a different place
The host dies during the test; after reboot, we see nothing abnormal in /var/log/messages.

i3) kernel BUG at arch/x86/xen/mmu.c:1872
The host dies during the test; after reboot, we see the crash log in messages. Refer to the attached log of 2.6.32.36.

The test results fall into two classes:

1) 2.6.32.10
30 machines were involved in the test: three hit issue (i1), two hit issue (i2), and *none* hit issue (i3).
The other machines have run the tests successfully so far, for more than 8 hours.

2) 2.6.32.11 or later versions
Each version was tested on 10 machines, and all machines crashed in less than half an hour.

Conclusion:
1) The grant table issue exists in all kernel versions.
2) The kernel crash at a different place may exist in all kernel versions, but it does not happen so frequently (2 out of 30).
3) The major difference we observe is issue i3); from the tests, it looks like it was introduced between versions
2.6.32.10 and 2.6.32.11.

Hope this helps to locate the bug.
Many thanks.


--_5f80312c-512c-4032-b060-8da09a462daf_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

(Please ignore my last mail; it was sent by mistake.)
 
Hi Konrad & Jeremy:
 
     I'd like to open this bug in a new thread, since the old thread is too long to read easily.
    
     We recently wanted to upgrade our kernel to 2.6.32, but unfortunately we ran into a kernel crash bug.
Our test case is simple: start 24 win2003 HVMs on our physical machine, with each HVM rebooting
every 15 minutes. The kernel crashes within half an hour (that is, it crashes when the VMs start for the second time).
 
We took our tests much further
and tried different kernel versions.
2.6.32.10  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=d945b014ac5df9592c478bf9486d97e8914aab59
2.6.32.11  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=27f948a3bf365a5bc3d56119637a177d41147815
2.6.32.12  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=ba739f9abd3f659b907a824af1161926b420a2ce
2.6.32.13  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=f6fe6583b77a49b569eef1b66c3d761eec2e561b
2.6.32.15  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=27ed1b0e0dae5f1d5da5c76451bc84cb529128bd
2.6.32.21  http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=69e50db231723596ed8ef9275d0068d6697f466a
 
We saw basically three different results.

i1) Grant table issue
The host still functions, but xm dmesg shows abnormal log entries. Please refer to the attached grant table log.

i2) Kernel crash at a different place
The host dies during the test; after reboot, we see nothing abnormal in /var/log/messages.

i3) kernel BUG at arch/x86/xen/mmu.c:1872
The host dies during the test; after reboot, we see the crash log in messages. Refer to the attached log for 2.6.32.36.
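
As a rough illustration only, the two result classes that leave a trace in a log could be told apart by scanning the collected xm dmesg and messages output. The sketch below is Python; the helper name classify_log and the regular expressions are assumptions made for this sketch, derived only from the log excerpts quoted in this thread. i2) leaves nothing in /var/log/messages, so it is not covered here.

```python
import re

# Hypothetical patterns, modelled on the log lines quoted in this thread.
# The labels i1 and i3 follow the numbering of the result classes above.
ISSUE_PATTERNS = {
    "i1": re.compile(r"grant_table\.c:\d+:d\d+ Bad (grant reference|flags)"),
    "i3": re.compile(r"kernel BUG at arch/x86/xen/mmu\.c:\d+"),
}


def classify_log(text):
    """Return the sorted labels of all issue patterns found in the log text."""
    return sorted(
        label for label, pattern in ISSUE_PATTERNS.items() if pattern.search(text)
    )
```

For example, feeding it the saved output of xm dmesg from a host that shows the grant table messages would yield the label i1, while a messages file containing the mmu.c BUG line would yield i3.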
The test results fall into two groups:

1) 2.6.32.10
30 machines were involved in the test; three hit issue (i1) and two hit issue (i2), but *none* hit issue (i3).
The other machines have run the tests successfully so far, for more than 8 hours.

2) 2.6.32.11 or later
Each version was tested on 10 machines, and all machines crashed in less than half an hour.
 
Conclusion:
1) The grant table issue exists in all kernel versions.
2) The kernel crash at a different place may exist in all kernel versions, but it does not happen so frequently (2 out of 30).
3) The major difference is issue i3); from the test, it looks like it was introduced between versions
2.6.32.10 and 2.6.32.11.
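
To make the contrast concrete, the crash counts reported in this mail can be turned into rates. This is only a sketch in Python; the dictionary layout and the function name crash_rate are made up for illustration, and the numbers are the ones stated in this mail (2 of 30 machines crashed on 2.6.32.10, and all 10 machines crashed on each later version).

```python
# (crashed machines, machines tested), as reported in this mail.
# The keys and layout are illustrative assumptions, not part of any tool.
CRASHED = {
    "2.6.32.10": (2, 30),            # issue i2 only, no i3 observed
    "2.6.32.11 or later": (10, 10),  # per tested version, all machines crashed
}


def crash_rate(version):
    """Fraction of tested machines that crashed for the given version group."""
    crashed, total = CRASHED[version]
    return crashed / total
```

With these figures the rate is 2/30 for 2.6.32.10 and 10/10 for every later version, which is why the range between those two commits looks like the place to search.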
 
Hope this helps to locate the bug.
Many thanks.
 
     
--_5f80312c-512c-4032-b060-8da09a462daf_-- --_2bf2ab8c-b338-47ff-b127-f026606dcf02_ Content-Type: text/plain Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="kernel_crash_at_different_place.txt" DQo9PT09PT09PT09PT09PT09PT09PT09PT09Y3Jhc2ggbG9nIGZvciBtYWNoaW5lIG9uZSBpbiAy LjYuMzIuMTAgPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 DQoNCklOSVQ6IElkICJzMCIgcmVzcGF3bmluZyB0b28gZmFzdDogZGlzYWJsZWQgZm9yIDUgbWlu dXRlcw0KYmxrdGFwX3N5c2ZzX2Rlc3Ryb3kNCmJsa3RhcF9zeXNmc19jcmVhdGU6IGFkZGluZyBh dHRyaWJ1dGVzIGZvciBkZXYgZmZmZjg4MDBiN2E2MjYwMA0KYmxrdGFwX3N5c2ZzX2Rlc3Ryb3kN CmJsa3RhcF9zeXNmc19jcmVhdGU6IGFkZGluZyBhdHRyaWJ1dGVzIGZvciBkZXYgZmZmZjg4MDBh YmIzZDIwMA0KYmxrdGFwX3N5c2ZzX2Rlc3Ryb3kNCmJsa3RhcF9zeXNmc19jcmVhdGU6IGFkZGlu ZyBhdHRyaWJ1dGVzIGZvciBkZXYgZmZmZjg4MDBhYjNhMzAwMA0KYmxrdGFwX3N5c2ZzX2Rlc3Ry b3kNCmJsa3RhcF9zeXNmc19jcmVhdGU6IGFkZGluZyBhdHRyaWJ1dGVzIGZvciBkZXYgZmZmZjg4 MDBhN2VmOGUwMA0KYmxrdGFwX3N5c2ZzX2Rlc3Ryb3kNCmJsa3RhcF9zeXNmc19jcmVhdGU6IGFk ZGluZyBhdHRyaWJ1dGVzIGZvciBkZXYgZmZmZjg4MDBiZDIyNGUwMA0KYmxrdGFwX3N5c2ZzX2Rl c3Ryb3kNCmJsa3RhcF9zeXNmc19jcmVhdGU6IGFkZGluZyBhdHRyaWJ1dGVzIGZvciBkZXYgZmZm Zjg4MDBiZjA5ZjQwMA0KYmxrdGFwX3N5c2ZzX2Rlc3Ryb3kNCmJsa3RhcF9zeXNmc19kZXN0cm95 DQpibGt0YXBfc3lzZnNfY3JlYXRlOiBhZGRpbmcgYXR0cmlidXRlcyBmb3IgZGV2IGZmZmY4ODAw YTk2YjNjMDANCmJsa3RhcF9zeXNmc19jcmVhdGU6IGFkZGluZyBhdHRyaWJ1dGVzIGZvciBkZXYg ZmZmZjg4MDBiZjA5ZWMwMA0KSU5JVDogSWQgInMwIiByZXNwYXduaW5nIHRvbyBmYXN0OiBkaXNh YmxlZCBmb3IgNSBtaW51dGVzDQpibGt0YXBfc3lzZnNfZGVzdHJveQ0KYmxrdGFwX3N5c2ZzX2Ny ZWF0ZTogYWRkaW5nIGF0dHJpYnV0ZXMgZm9yIGRldiBmZmZmODgwMGI4ZGFhZTAwDQpibGt0YXBf c3lzZnNfZGVzdHJveQ0KYmxrdGFwX3N5c2ZzX2NyZWF0ZTogYWRkaW5nIGF0dHJpYnV0ZXMgZm9y IGRldiBmZmZmODgwMGIwZWE1NDAwDQpibGt0YXBfc3lzZnNfZGVzdHJveQ0KYmxrdGFwX3N5c2Zz X2NyZWF0ZTogYWRkaW5nIGF0dHJpYnV0ZXMgZm9yIGRldiBmZmZmODgwMGI4ZGFiMjAwDQpibGt0 YXBfc3lzZnNfZGVzdHJveQ0KYmxrdGFwX3N5c2ZzX2NyZWF0ZTogYWRkaW5nIGF0dHJpYnV0ZXMg Zm9yIGRldiBmZmZmODgwMGI4ZGFhNjAwDQpJTklUOiBJZCAiczAiIHJlc3Bhd25pbmcgdG9vIGZh 
c3Q6IGRpc2FibGVkIGZvciA1IG1pbnV0ZXMNCmJsa3RhcF9zeXNmc19kZXN0cm95DQpibGt0YXBf c3lzZnNfY3JlYXRlOiBhZGRpbmcgYXR0cmlidXRlcyBmb3IgZGV2IGZmZmY4ODAwYWI5MzMyMDAN CmJsa3RhcF9zeXNmc19kZXN0cm95DQpibGt0YXBfc3lzZnNfY3JlYXRlOiBhZGRpbmcgYXR0cmli dXRlcyBmb3IgZGV2IGZmZmY4ODAwYjY1MTQwMDANCkJVRzogc2NoZWR1bGluZyB3aGlsZSBhdG9t aWM6IHN3YXBwZXIvMC8weDEwMDAwMTAwDQpNb2R1bGVzIGxpbmtlZCBpbjogODAyMXEgZ2FycCB4 ZW5fbmV0YmFjayB4ZW5fYmxrYmFjayBibGt0YXAgYmxrYmFja19wYWdlbWFwIG5iZCBicmlkZ2Ug c3RwIGxsYyBhdXRvZnM0IGlwbWlfZGV2aW50ZiBpcG1pX3NpIGlwbWlfbXNnaGFuZGxlciBsb2Nr ZCBzdW5ycGMgYm9uZGluZyBpcHY2IHhlbmZzIGRtX211bHRpcGF0aCB2aWRlbyBvdXRwdXQgc2Jz IHNic2hjIHBhcnBvcnRfcGMgbHAgcGFycG9ydCBzZXMgZW5jbG9zdXJlIHNlcmlvX3JhdyBibngy IHNuZF9zZXFfZHVtbXkgc25kX3NlcV9vc3Mgc25kX3NlcV9taWRpX2V2ZW50IHNuZF9zZXEgc25k X3NlcV9kZXZpY2Ugc25kX3BjbV9vc3Mgc25kX21peGVyX29zcyBzbmRfcGNtIHNuZF90aW1lciBz bmQgcGF0YV9hY3BpIHNvdW5kY29yZSBhdGFfZ2VuZXJpYyBzbmRfcGFnZV9hbGxvYyBwY3Nwa3Ig aVRDT193ZHQgaVRDT192ZW5kb3Jfc3VwcG9ydCBpMmNfaTgwMSBpMmNfY29yZSBhdGFfcGlpeCBz aHBjaHAgbXB0c2FzIG1wdHNjc2loIG1wdGJhc2UgW2xhc3QgdW5sb2FkZWQ6IGZyZXFfdGFibGVd DQpDUFUgMDoNCk1vZHVsZXMgbGlua2VkIGluOiA4MDIxcSBnYXJwIHhlbl9uZXRiYWNrIHhlbl9i bGtiYWNrIGJsa3RhcCBibGtiYWNrX3BhZ2VtYXAgbmJkIGJyaWRnZSBzdHAgbGxjIGF1dG9mczQg aXBtaV9kZXZpbnRmIGlwbWlfc2kgaXBtaV9tc2doYW5kbGVyIGxvY2tkIHN1bnJwYyBib25kaW5n IGlwdjYgeGVuZnMgZG1fbXVsdGlwYXRoIHZpZGVvIG91dHB1dCBzYnMgc2JzaGMgcGFycG9ydF9w YyBscCBwYXJwb3J0IHNlcyBlbmNsb3N1cmUgc2VyaW9fcmF3IGJueDIgc25kX3NlcV9kdW1teSBz bmRfc2VxX29zcyBzbmRfc2VxX21pZGlfZXZlbnQgc25kX3NlcSBzbmRfc2VxX2RldmljZSBzbmRf cGNtX29zcyBzbmRfbWl4ZXJfb3NzIHNuZF9wY20gc25kX3RpbWVyIHNuZCBwYXRhX2FjcGkgc291 bmRjb3JlIGF0YV9nZW5lcmljIHNuZF9wYWdlX2FsbG9jIHBjc3BrciBpVENPX3dkdCBpVENPX3Zl bmRvcl9zdXBwb3J0IGkyY19pODAxIGkyY19jb3JlIGF0YV9waWl4IHNocGNocCBtcHRzYXMgbXB0 c2NzaWggbXB0YmFzZSBbbGFzdCB1bmxvYWRlZDogZnJlcV90YWJsZV0NClBpZDogMCwgY29tbTog c3dhcHBlciBOb3QgdGFpbnRlZCAyLjYuMzIuMTB4ZW4gIzEgVGVjYWwgUkgyMjg1ICAgICAgICAg 
IA0KUklQOiBlMDMwOls8ZmZmZmZmZmY4MTAwOTNhYT5dICBbPGZmZmZmZmZmODEwMDkzYWE+XSBo eXBlcmNhbGxfcGFnZSsweDNhYS8weDEwMDANClJTUDogZTAyYjpmZmZmZmZmZjgxNjYzZWQ4ICBF RkxBR1M6IDAwMDAwMjQ2DQpSQVg6IDAwMDAwMDAwMDAwMDAwMDAgUkJYOiBmZmZmZmZmZjgxNjYy MDAwIFJDWDogZmZmZmZmZmY4MTAwOTNhYQ0KUkRYOiBmZmZmZmZmZjgxMDBmMjNmIFJTSTogMDAw MDAwMDAwMDAwMDAwMCBSREk6IDAwMDAwMDAwMDAwMDAwMDENClJCUDogZmZmZmZmZmY4MTY2M2Vm MCBSMDg6IDAwMDAwMDAwMDAwMDAwMDAgUjA5OiBmZmZmODgwMDI4MDkyZTA4DQpSMTA6IGZmZmY4 ODAxNTk1NTgwMDAgUjExOiAwMDAwMDAwMDAwMDAwMjQ2IFIxMjogMDAwMDAwMDAwMDAwMDAwMA0K UjEzOiA2ZGI2ZGI2ZGI2ZGI2ZGI3IFIxNDogZmZmZmZmZmY4MTdkNzNhMCBSMTU6IDAwMDAwMDAw MDAwMDAwMDANCkZTOiAgMDAwMDdmNTAyMTVkYjZlMCgwMDAwKSBHUzpmZmZmODgwMDI4MDJjMDAw KDAwMDApIGtubEdTOjAwMDAwMDAwMDAwMDAwMDANCkNTOiAgZTAzMyBEUzogMDAwMCBFUzogMDAw MCBDUjA6IDAwMDAwMDAwODAwNTAwM2INCkNSMjogMDAwMDdmNDQ5NjcyZDAwMCBDUjM6IDAwMDAw MDAwYTdiMGEwMDAgQ1I0OiAwMDAwMDAwMDAwMDAyNjYwDQpEUjA6IDAwMDAwMDAwMDAwMDAwMDAg RFIxOiAwMDAwMDAwMDAwMDAwMDAwIERSMjogMDAwMDAwMDAwMDAwMDAwMA0KRFIzOiAwMDAwMDAw MDAwMDAwMDAwIERSNjogMDAwMDAwMDBmZmZmMGZmMCBEUjc6IDAwMDAwMDAwMDAwMDA0MDANCkNh bGwgVHJhY2U6DQogWzxmZmZmZmZmZjgxMDBlYmJmPl0gPyB4ZW5fc2FmZV9oYWx0KzB4MTAvMHgx YQ0KIFs8ZmZmZmZmZmY4MTAwYzEwMj5dIHhlbl9pZGxlKzB4M2MvMHg0Ng0KIFs8ZmZmZmZmZmY4 MTAxMGNiZD5dIGNwdV9pZGxlKzB4NWQvMHg4Yw0KIFs8ZmZmZmZmZmY4MTQyNjc0Mj5dIHJlc3Rf aW5pdCsweDY2LzB4NjgNCiBbPGZmZmZmZmZmODE3OWZkOGQ+XSBzdGFydF9rZXJuZWwrMHgzZWYv MHgzZmINCiBbPGZmZmZmZmZmODE3OWYyYzM+XSB4ODZfNjRfc3RhcnRfcmVzZXJ2YXRpb25zKzB4 YWUvMHhiMg0KIFs8ZmZmZmZmZmY4MTdhMmNiMz5dIHhlbl9zdGFydF9rZXJuZWwrMHg0YzAvMHg0 YzcNCmRpdmlkZSBlcnJvcjogMDAwMCBbIzFdIFNNUCANCmxhc3Qgc3lzZnMgZmlsZTogL3N5cy9j bGFzcy9uZXQvZDM0MmRkL2FkZHJlc3MNCkNQVSAwIA0KTW9kdWxlcyBsaW5rZWQgaW46IDgwMjFx IGdhcnAgeGVuX25ldGJhY2sgeGVuX2Jsa2JhY2sgYmxrdGFwIGJsa2JhY2tfcGFnZW1hcCBuYmQg YnJpZGdlIHN0cCBsbGMgYXV0b2ZzNCBpcG1pX2RldmludGYgaXBtaV9zaSBpcG1pX21zZ2hhbmRs ZXIgbG9ja2Qgc3VucnBjIGJvbmRpbmcgaXB2NiB4ZW5mcyBkbV9tdWx0aXBhdGggdmlkZW8gb3V0 
cHV0IHNicyBzYnNoYyBwYXJwb3J0X3BjIGxwIHBhcnBvcnQgc2VzIGVuY2xvc3VyZSBzZXJpb19y YXcgYm54MiBzbmRfc2VxX2R1bW15IHNuZF9zZXFfb3NzIHNuZF9zZXFfbWlkaV9ldmVudCBzbmRf c2VxIHNuZF9zZXFfZGV2aWNlIHNuZF9wY21fb3NzIHNuZF9taXhlcl9vc3Mgc25kX3BjbSBzbmRf dGltZXIgc25kIHBhdGFfYWNwaSBzb3VuZGNvcmUgYXRhX2dlbmVyaWMgc25kX3BhZ2VfYWxsb2Mg cGNzcGtyIGlUQ09fd2R0IGlUQ09fdmVuZG9yX3N1cHBvcnQgaTJjX2k4MDEgaTJjX2NvcmUgYXRh X3BpaXggc2hwY2hwIG1wdHNhcyBtcHRzY3NpaCBtcHRiYXNlIFtsYXN0IHVubG9hZGVkOiBmcmVx X3RhYmxlXQ0KUGlkOiAwLCBjb21tOiBzd2FwcGVyIE5vdCB0YWludGVkIDIuNi4zMi4xMHhlbiAj MSBUZWNhbCBSSDIyODUgICAgICAgICAgDQpSSVA6IGUwMzA6WzxmZmZmZmZmZjgxMDRlZTU3Pl0g IFs8ZmZmZmZmZmY4MTA0ZWU1Nz5dIGZpbmRfYnVzaWVzdF9ncm91cCsweDM3ZC8weDcyMQ0KUlNQ OiBlMDJiOmZmZmY4ODAwMjgwMmZjOTAgIEVGTEFHUzogMDAwMTAyNDYNClJBWDogMDAwMDAwMDAw MDAwM2MwMCBSQlg6IDAwMDAwMDAwMDAwMDAwMDAgUkNYOiBmZmZmODgwMDI4MDQxNTAxDQpSRFg6 IDAwMDAwMDAwMDAwMDAwMDAgUlNJOiAwMDAwMDAwMDAwMDAwMDQwIFJESTogMDAwMDAwMDAwMDAw MDA0MA0KUkJQOiBmZmZmODgwMDI4MDJmZGYwIFIwODogMDAwMDAwMDAwMDAwMDAwMCBSMDk6IGZm ZmY4ODAwMjgwM2JlMDgNClIxMDogMDAwMDAwMDAwMDAwMDAwMCBSMTE6IDAwMDAwMDAwMDAwMDAw MDEgUjEyOiAwMDAwMDAwMDAwMDAwMDQwDQpSMTM6IGZmZmY4ODAwMjgwM2JkZjAgUjE0OiBmZmZm ODgwMDI4MDNiY2UwIFIxNTogMDAwMDAwMDAwMDAwMDAwMA0KRlM6ICAwMDAwN2Y1MDIxNWRiNmUw KDAwMDApIEdTOmZmZmY4ODAwMjgwMmMwMDAoMDAwMCkga25sR1M6MDAwMDAwMDAwMDAwMDAwMA0K Q1M6ICBlMDMzIERTOiAwMDAwIEVTOiAwMDAwIENSMDogMDAwMDAwMDA4MDA1MDAzYg0KQ1IyOiAw MDAwN2Y0NDk2NzJkMDAwIENSMzogMDAwMDAwMDBhN2IwYTAwMCBDUjQ6IDAwMDAwMDAwMDAwMDI2 NjANCkRSMDogMDAwMDAwMDAwMDAwMDAwMCBEUjE6IDAwMDAwMDAwMDAwMDAwMDAgRFIyOiAwMDAw MDAwMDAwMDAwMDAwDQpEUjM6IDAwMDAwMDAwMDAwMDAwMDAgRFI2OiAwMDAwMDAwMGZmZmYwZmYw IERSNzogMDAwMDAwMDAwMDAwMDQwMA0KUHJvY2VzcyBzd2FwcGVyIChwaWQ6IDAsIHRocmVhZGlu Zm8gZmZmZmZmZmY4MTY2MjAwMCwgdGFzayBmZmZmZmZmZjgxNmU4OTgwKQ0KU3RhY2s6DQogMDAw MDFhMGU5NWY5ZWM1NiBmZmZmODgwMDI4MDNiYTQ4IGZmZmY4ODAwMjgwMmZlNGMgMDAwMDAwMDA4 MTAwZWI3OQ0KPDA+IGZmZmY4ODAwMjgwMmZlNDAgMDAwMDAwMDA4MTAwZjI1MiAwMDAwMDAwMDAw 
MDAzYzAwIDAwMDAwMDAwMDAwMDAwMDENCjwwPiBmZmZmODgwMDI4MDNiZTAwIDAwMDAwMDAxMDAw MDAwMTAgZmZmZjg4MDAyODAzYmRmMCBmZmZmZmZmZjgxMDJlZGY5DQpDYWxsIFRyYWNlOg0KIDxJ UlE+IA0KIFs8ZmZmZmZmZmY4MTAyZWRmOT5dID8gcHZjbG9ja19jbG9ja3NvdXJjZV9yZWFkKzB4 NDcvMHg4MA0KIFs8ZmZmZmZmZmY4MTAwZjIzZj5dID8geGVuX3Jlc3RvcmVfZmxfZGlyZWN0X2Vu ZCsweDAvMHgxDQogWzxmZmZmZmZmZjgxMDRmYzQzPl0gcmViYWxhbmNlX2RvbWFpbnMrMHgxN2Iv MHg0NWINCiBbPGZmZmZmZmZmODEwNDgwZjI+XSA/IHdha2VfdXBfcHJvY2VzcysweDE1LzB4MTcN CiBbPGZmZmZmZmZmODEwNGZmNjM+XSBydW5fcmViYWxhbmNlX2RvbWFpbnMrMHg0MC8weGM1DQog WzxmZmZmZmZmZjgxMDU5YjliPl0gX19kb19zb2Z0aXJxKzB4ZDIvMHgxOTQNCiBbPGZmZmZmZmZm ODEwMTJlYWM+XSBjYWxsX3NvZnRpcnErMHgxYy8weDMwDQogWzxmZmZmZmZmZjgxMDE0NjI3Pl0g ZG9fc29mdGlycSsweDQ2LzB4ODcNCiBbPGZmZmZmZmZmODEwNTljOTg+XSBpcnFfZXhpdCsweDNi LzB4N2ENCiBbPGZmZmZmZmZmODEyODY4YWI+XSB4ZW5fZXZ0Y2huX2RvX3VwY2FsbCsweDE1Ni8w eDE3Mg0KIFs8ZmZmZmZmZmY4MTAxMmVmZT5dIHhlbl9kb19oeXBlcnZpc29yX2NhbGxiYWNrKzB4 MWUvMHgzMA0KIDxFT0k+IA0KIFs8ZmZmZmZmZmY4MTAwOTNhYT5dID8gaHlwZXJjYWxsX3BhZ2Ur MHgzYWEvMHgxMDAwDQogWzxmZmZmZmZmZjgxMDBmMjNmPl0gPyB4ZW5fcmVzdG9yZV9mbF9kaXJl Y3RfZW5kKzB4MC8weDENCiBbPGZmZmZmZmZmODEwMDkzYWE+XSA/IGh5cGVyY2FsbF9wYWdlKzB4 M2FhLzB4MTAwMA0KIFs8ZmZmZmZmZmY4MTAwZWJiZj5dID8geGVuX3NhZmVfaGFsdCsweDEwLzB4 MWENCiBbPGZmZmZmZmZmODEwMGMxMDI+XSA/IHhlbl9pZGxlKzB4M2MvMHg0Ng0KIFs8ZmZmZmZm ZmY4MTAxMGNiZD5dID8gY3B1X2lkbGUrMHg1ZC8weDhjDQogWzxmZmZmZmZmZjgxNDI2NzQyPl0g PyByZXN0X2luaXQrMHg2Ni8weDY4DQogWzxmZmZmZmZmZjgxNzlmZDhkPl0gPyBzdGFydF9rZXJu ZWwrMHgzZWYvMHgzZmINCiBbPGZmZmZmZmZmODE3OWYyYzM+XSA/IHg4Nl82NF9zdGFydF9yZXNl cnZhdGlvbnMrMHhhZS8weGIyDQogWzxmZmZmZmZmZjgxN2EyY2IzPl0gPyB4ZW5fc3RhcnRfa2Vy bmVsKzB4NGMwLzB4NGM3DQpDb2RlOiA4MyA3ZCAxMCAwMCA3NCAwYyA0OCA4YiA1ZCAxMCBjNyAw MyAwMCAwMCAwMCAwMCBlYiA3MCA0MSA4YiA1NSAwOCA0OCA4YiA0NSBhOCA0OCA4OSBkMyA0OCBj MSBhNSBkMCBmZSBmZiBmZiAwYSA0OCBjMSBlMCAwYSAzMSBkMiA8NDg+IGY3IGYzIDQ4IDg5IDQ1 IGEwIDQ4IDhiIDg1IDA4IGZmIGZmIGZmIDQ4IDI5IDg1IDAwIGZmIGZmIGZmIA0KUklQICBbPGZm 
ZmZmZmZmODEwNGVlNTc+XSBmaW5kX2J1c2llc3RfZ3JvdXArMHgzN2QvMHg3MjENCiBSU1AgPGZm ZmY4ODAwMjgwMmZjOTA+DQotLS1bIGVuZCB0cmFjZSA3YzNlM2I2NGNhMzQxZjBhIF0tLS0NCmRp dmlkZSBlcnJvcjogMDAwMCBbIzJdIFNNUCANCmxhc3Qgc3lzZnMgZmlsZTogL3N5cy9jbGFzcy9u ZXQvZDM0MmRkL2FkZHJlc3MNCkNQVSAyIA0KTW9kdWxlcyBsaW5rZWQgaW46IDgwMjFxIGdhcnAg eGVuX25ldGJhY2sgeGVuX2Jsa2JhY2sgYmxrdGFwIGJsa2JhY2tfcGFnZW1hcCBuYmQgYnJpZGdl IHN0cCBsbGMgYXV0b2ZzNCBpcG1pX2RldmludGYgaXBtaV9zaSBpcG1pX21zZ2hhbmRsZXIgbG9j a2Qgc3VucnBjIGJvbmRpbmcgaXB2NiB4ZW5mcyBkbV9tdWx0aXBhdGggdmlkZW8gb3V0cHV0IHNi cyBzYnNoYyBwYXJwb3J0X3BjIGxwIHBhcnBvcnQgc2VzIGVuY2xvc3VyZSBzZXJpb19yYXcgYm54 MiBzbmRfc2VxX2R1bW15IHNuZF9zZXFfb3NzIHNuZF9zZXFfbWlkaV9ldmVudCBzbmRfc2VxIHNu ZF9zZXFfZGV2aWNlIHNuZF9wY21fb3NzIHNuZF9taXhlcl9vc3Mgc25kX3BjbSBzbmRfdGltZXIg c25kIHBhdGFfYWNwaSBzb3VuZGNvcmUgYXRhX2dlbmVyaWMgc25kX3BhZ2VfYWxsb2MgcGNzcGty IGlUQ09fd2R0IGlUQ09fdmVuZG9yX3N1cHBvcnQgaTJjX2k4MDEgaTJjX2NvcmUgYXRhX3BpaXgg c2hwY2hwIG1wdHNhcyBtcHRzY3NpaCBtcHRiYXNlIFtsYXN0IHVubG9hZGVkOiBmcmVxX3RhYmxl XQ0KUGlkOiAwLCBjb21tOiBzd2FwcGVyIFRhaW50ZWQ6IEcgICAgICBEICAgIDIuNi4zMi4xMHhl biAjMSBUZWNhbCBSSDIyODUgICAgICAgICAgDQpSSVA6IGUwMzA6WzxmZmZmZmZmZjgxMDRlZTU3 Pl0gIFs8ZmZmZmZmZmY4MTA0ZWU1Nz5dIGZpbmRfYnVzaWVzdF9ncm91cCsweDM3ZC8weDcyMQ0K UlNQOiBlMDJiOmZmZmY4ODAwMjgwNjljOTAgIEVGTEFHUzogMDAwMTAyNDYNClJBWDogMDAwMDAw MDAwMDAwMDAwMCBSQlg6IDAwMDAwMDAwMDAwMDAwMDAgUkNYOiBmZmZmODgwMDI4MDQxNTAwDQpS RFg6IDAwMDAwMDAwMDAwMDAwMDAgUlNJOiAwMDAwMDAwMDAwMDAwMDQwIFJESTogMDAwMDAwMDAw MDAwMDA0MA0KUkJQOiBmZmZmODgwMDI4MDY5ZGYwIFIwODogMDAwMDAwMDAwMDAwMDAwMCBSMDk6 IGZmZmY4ODAwMjgwM2JlMDgNClIxMDogZmZmZjg4MDAyODA2OWQyOCBSMTE6IGZmZmY4ODAxNWY4 ZjdlNDggUjEyOiAwMDAwMDAwMDAwMDAwMDQwDQpSMTM6IGZmZmY4ODAwMjgwM2JkZjAgUjE0OiBm ZmZmODgwMDI4MDU4Y2UwIFIxNTogMDAwMDAwMDAwMDAwMDAwMQ0KRlM6ICAwMDAwN2ZhYWQzYmE4 NzMwKDAwMDApIEdTOmZmZmY4ODAwMjgwNjYwMDAoMDAwMCkga25sR1M6MDAwMDAwMDAwMDAwMDAw MA0KQ1M6ICBlMDMzIERTOiAwMDJiIEVTOiAwMDJiIENSMDogMDAwMDAwMDA4MDA1MDAzYg0KQ1Iy 
OiAwMDAwN2ZkMWU3ZWQ5MDAwIENSMzogMDAwMDAwMDE1NTEwYTAwMCBDUjQ6IDAwMDAwMDAwMDAw MDI2NjANCkRSMDogMDAwMDAwMDAwMDAwMDAwMCBEUjE6IDAwMDAwMDAwMDAwMDAwMDAgRFIyOiAw MDAwMDAwMDAwMDAwMDAwDQpEUjM6IDAwMDAwMDAwMDAwMDAwMDAgRFI2OiAwMDAwMDAwMGZmZmYw ZmYwIERSNzogMDAwMDAwMDAwMDAwMDQwMA0KUHJvY2VzcyBzd2FwcGVyIChwaWQ6IDAsIHRocmVh ZGluZm8gZmZmZjg4MDE1ZjhmNjAwMCwgdGFzayBmZmZmODgwMTVmOGU0NDEwKQ0KU3RhY2s6DQog MDAwMDFhMGU5NWZkYjk3YiBmZmZmODgwMDI4MDc1YTQ4IGZmZmY4ODAwMjgwNjllNGMgMDAwMDAw MDA4MTAwZWI3OQ0KPDA+IGZmZmY4ODAwMjgwNjllNDAgMDAwMDAwMDA4MTAwZjI1MiAwMDAwMDAw MDAwMDAzYzAwIGZmZmZmZmZmMDAwMDAwMDANCjwwPiBmZmZmODgwMDI4MDNiZTAwIDAwMDAwMDAx MDAwMDAwMTAgZmZmZjg4MDAyODA1OGRmMCAwMDAwMDAwMDAwMDAwMDAyDQpDYWxsIFRyYWNlOg0K IDxJUlE+IA0KIFs8ZmZmZmZmZmY4MTA3ZTM2ZD5dID8gdGlja19kZXZfcHJvZ3JhbV9ldmVudCsw eDJmLzB4YTENCiBbPGZmZmZmZmZmODEwNGZjNDM+XSByZWJhbGFuY2VfZG9tYWlucysweDE3Yi8w eDQ1Yg0KIFs8ZmZmZmZmZmY4MTA0ZmY5OD5dIHJ1bl9yZWJhbGFuY2VfZG9tYWlucysweDc1LzB4 YzUNCiBbPGZmZmZmZmZmODEwNTliOWI+XSBfX2RvX3NvZnRpcnErMHhkMi8weDE5NA0KIFs8ZmZm ZmZmZmY4MTAxMmVhYz5dIGNhbGxfc29mdGlycSsweDFjLzB4MzANCiBbPGZmZmZmZmZmODEwMTQ2 Mjc+XSBkb19zb2Z0aXJxKzB4NDYvMHg4Nw0KIFs8ZmZmZmZmZmY4MTA1OWM5OD5dIGlycV9leGl0 KzB4M2IvMHg3YQ0KIFs8ZmZmZmZmZmY4MTI4NjhhYj5dIHhlbl9ldnRjaG5fZG9fdXBjYWxsKzB4 MTU2LzB4MTcyDQogWzxmZmZmZmZmZjgxMDEyZWZlPl0geGVuX2RvX2h5cGVydmlzb3JfY2FsbGJh Y2srMHgxZS8weDMwDQogPEVPST4gDQogWzxmZmZmZmZmZjgxMDA5M2FhPl0gPyBoeXBlcmNhbGxf cGFnZSsweDNhYS8weDEwMDANCiBbPGZmZmZmZmZmODEwMDkzYWE+XSA/IGh5cGVyY2FsbF9wYWdl KzB4M2FhLzB4MTAwMA0KIFs8ZmZmZmZmZmY4MTAwZWJiZj5dID8geGVuX3NhZmVfaGFsdCsweDEw LzB4MWENCiBbPGZmZmZmZmZmODEwMGMxMDI+XSA/IHhlbl9pZGxlKzB4M2MvMHg0Ng0KIFs8ZmZm ZmZmZmY4MTAxMGNiZD5dID8gY3B1X2lkbGUrMHg1ZC8weDhjDQogWzxmZmZmZmZmZjgxNDMyZDY4 Pl0gPyBjcHVfYnJpbmd1cF9hbmRfaWRsZSsweDEzLzB4MTUNCkNvZGU6IDgzIDdkIDEwIDAwIDc0 IDBjIDQ4IDhiIDVkIDEwIGM3IDAzIDAwIDAwIDAwIDAwIGViIDcwIDQxIDhiIDU1IDA4IDQ4IDhi IDQ1IGE4IDQ4IDg5IGQzIDQ4IGMxIGE1IGQwIGZlIGZmIGZmIDBhIDQ4IGMxIGUwIDBhIDMxIGQy 
IDw0OD4gZjcgZjMgNDggODkgNDUgYTAgNDggOGIgODUgMDggZmYgZmYgZmYgNDggMjkgODUgMDAg ZmYgZmYgZmYgDQpSSVAgIFs8ZmZmZmZmZmY4MTA0ZWU1Nz5dIGZpbmRfYnVzaWVzdF9ncm91cCsw eDM3ZC8weDcyMQ0KIFJTUCA8ZmZmZjg4MDAyODA2OWM5MD4NCi0tLVsgZW5kIHRyYWNlIDdjM2Uz YjY0Y2EzNDFmMGIgXS0tLQ0KS2VybmVsIHBhbmljIC0gbm90IHN5bmNpbmc6IEZhdGFsIGV4Y2Vw dGlvbiBpbiBpbnRlcnJ1cHQNClBpZDogMCwgY29tbTogc3dhcHBlciBUYWludGVkOiBHICAgICAg RCAgICAyLjYuMzIuMTB4ZW4gIzENCkNhbGwgVHJhY2U6DQogPElSUT4gIFs8ZmZmZmZmZmY4MTA0 MDJhNT5dID8gZnRyYWNlX3Byb2ZpbGVfZW5hYmxlX3NjaGVkX3Byb2Nlc3NfZXhpdCsweDEwLzB4 MTcNCiBbPGZmZmZmZmZmODEwNTJlZWE+XSBwYW5pYysweGUwLzB4MTk4DQogWzxmZmZmZmZmZjgx MDBlYjc5Pl0gPyB4ZW5fZm9yY2VfZXZ0Y2huX2NhbGxiYWNrKzB4ZC8weGYNCiBbPGZmZmZmZmZm ODEwMGYyNTI+XSA/IGNoZWNrX2V2ZW50cysweDEyLzB4MjANCiBbPGZmZmZmZmZmODEwMGYyM2Y+ XSA/IHhlbl9yZXN0b3JlX2ZsX2RpcmVjdF9lbmQrMHgwLzB4MQ0KIFs8ZmZmZmZmZmY4MTA0MDJh NT5dID8gZnRyYWNlX3Byb2ZpbGVfZW5hYmxlX3NjaGVkX3Byb2Nlc3NfZXhpdCsweDEwLzB4MTcN CiBbPGZmZmZmZmZmODEwNTJiNDM+XSA/IHByaW50X29vcHNfZW5kX21hcmtlcisweDIzLzB4MjUN CiBbPGZmZmZmZmZmODEwNDAyYTU+XSA/IGZ0cmFjZV9wcm9maWxlX2VuYWJsZV9zY2hlZF9wcm9j ZXNzX2V4aXQrMHgxMC8weDE3DQogWzxmZmZmZmZmZjgxNDNkMmI1Pl0gb29wc19lbmQrMHhiNi8w eGM2DQogWzxmZmZmZmZmZjgxMDE1NmMxPl0gZGllKzB4NWEvMHg2Mw0KIFs8ZmZmZmZmZmY4MTQz Y2I4Yz5dIGRvX3RyYXArMHgxMTUvMHgxMjQNCiBbPGZmZmZmZmZmODEwMTM2MTA+XSBkb19kaXZp ZGVfZXJyb3IrMHg5Ni8weDlmDQogWzxmZmZmZmZmZjgxMDRlZTU3Pl0gPyBmaW5kX2J1c2llc3Rf Z3JvdXArMHgzN2QvMHg3MjENCiBbPGZmZmZmZmZmODEwMGYyM2Y+XSA/IHhlbl9yZXN0b3JlX2Zs X2RpcmVjdF9lbmQrMHgwLzB4MQ0KIFs8ZmZmZmZmZmY4MTQzYzNkYT5dID8gX3NwaW5fdW5sb2Nr X2lycXJlc3RvcmUrMHgxNS8weDE3DQogWzxmZmZmZmZmZjgxMzkzN2EwPl0gPyBza2JfcmVsZWFz ZV9kYXRhKzB4YWIvMHhiMA0KIFs8ZmZmZmZmZmY4MTAwZWI3OT5dID8geGVuX2ZvcmNlX2V2dGNo bl9jYWxsYmFjaysweGQvMHhmDQogWzxmZmZmZmZmZjgxMDBmMjUyPl0gPyBjaGVja19ldmVudHMr MHgxMi8weDIwDQogWzxmZmZmZmZmZjgxMDEyYWRiPl0gZGl2aWRlX2Vycm9yKzB4MWIvMHgyMA0K IFs8ZmZmZmZmZmY4MTA0ZWU1Nz5dID8gZmluZF9idXNpZXN0X2dyb3VwKzB4MzdkLzB4NzIxDQog 
WzxmZmZmZmZmZjgxMDdlMzZkPl0gPyB0aWNrX2Rldl9wcm9ncmFtX2V2ZW50KzB4MmYvMHhhMQ0K IFs8ZmZmZmZmZmY4MTA0ZmM0Mz5dIHJlYmFsYW5jZV9kb21haW5zKzB4MTdiLzB4NDViDQogWzxm ZmZmZmZmZjgxMDRmZjk4Pl0gcnVuX3JlYmFsYW5jZV9kb21haW5zKzB4NzUvMHhjNQ0KIFs8ZmZm ZmZmZmY4MTA1OWI5Yj5dIF9fZG9fc29mdGlycSsweGQyLzB4MTk0DQogWzxmZmZmZmZmZjgxMDEy ZWFjPl0gY2FsbF9zb2Z0aXJxKzB4MWMvMHgzMA0KIFs8ZmZmZmZmZmY4MTAxNDYyNz5dIGRvX3Nv ZnRpcnErMHg0Ni8weDg3DQogWzxmZmZmZmZmZjgxMDU5Yzk4Pl0gaXJxX2V4aXQrMHgzYi8weDdh DQogWzxmZmZmZmZmZjgxMjg2OGFiPl0geGVuX2V2dGNobl9kb191cGNhbGwrMHgxNTYvMHgxNzIN CiBbPGZmZmZmZmZmODEwMTJlZmU+XSB4ZW5fZG9faHlwZXJ2aXNvcl9jYWxsYmFjaysweDFlLzB4 MzANCiA8RU9JPiAgWzxmZmZmZmZmZjgxMDA5M2FhPl0gPyBoeXBlcmNhbGxfcGFnZSsweDNhYS8w eDEwMDANCiBbPGZmZmZmZmZmODEwMDkzYWE+XSA/IGh5cGVyY2FsbF9wYWdlKzB4M2FhLzB4MTAw MA0KIFs8ZmZmZmZmZmY4MTAwZWJiZj5dID8geGVuX3NhZmVfaGFsdCsweDEwLzB4MWENCiBbPGZm ZmZmZmZmODEwMGMxMDI+XSA/IHhlbl9pZGxlKzB4M2MvMHg0Ng0KIFs8ZmZmZmZmZmY4MTAxMGNi ZD5dID8gY3B1X2lkbGUrMHg1ZC8weDhjDQogWzxmZmZmZmZmZjgxNDMyZDY4Pl0gPyBjcHVfYnJp bmd1cF9hbmRfaWRsZSsweDEzLzB4MTUNCktlcm5lbCBwYW5pYyAtIG5vdCBzeW5jaW5nOiBGYXRh bCBleGNlcHRpb24gaW4gaW50ZXJydXB0DQpQaWQ6IDAsIGNvbW06IHN3YXBwZXIgVGFpbnRlZDog RyAgICAgIEQgICAgMi42LjMyLjEweGVuICMxDQpDYWxsIFRyYWNlOg0KIDxJUlE+ICBbPGZmZmZm ZmZmODEwNTJlZWE+XSBwYW5pYysweGUwLzB4MTk4DQogWzxmZmZmZmZmZjgxNDMwMGIyPl0gPyBt ZWdhcmFpZF9wcm9iZV9vbmUrMHhmMjUvMHgxMTZkDQogWzxmZmZmZmZmZjgxMDBlYjc5Pl0gPyB4 ZW5fZm9yY2VfZXZ0Y2huX2NhbGxiYWNrKzB4ZC8weGYNCiBbPGZmZmZmZmZmODEwMGYyNTI+XSA/ IGNoZWNrX2V2ZW50cysweDEyLzB4MjANCiBbPGZmZmZmZmZmODEwMGYyM2Y+XSA/IHhlbl9yZXN0 b3JlX2ZsX2RpcmVjdF9lbmQrMHgwLzB4MQ0KIFs8ZmZmZmZmZmY4MTA1MmI0Mz5dID8gcHJpbnRf b29wc19lbmRfbWFya2VyKzB4MjMvMHgyNQ0KIFs8ZmZmZmZmZmY4MTQzZDJiNT5dIG9vcHNfZW5k KzB4YjYvMHhjNg0KIFs8ZmZmZmZmZmY4MTAxNTZjMT5dIGRpZSsweDVhLzB4NjMNCiBbPGZmZmZm ZmZmODE0M2NiOGM+XSBkb190cmFwKzB4MTE1LzB4MTI0DQogWzxmZmZmZmZmZjgxMDEzNjEwPl0g ZG9fZGl2aWRlX2Vycm9yKzB4OTYvMHg5Zg0KIFs8ZmZmZmZmZmY4MTA0ZWU1Nz5dID8gZmluZF9i 
dXNpZXN0X2dyb3VwKzB4MzdkLzB4NzIxDQogWzxmZmZmZmZmZjgxMzkzNGI5Pl0gPyBfX2tmcmVl X3NrYisweDc5LzB4N2QNCiBbPGZmZmZmZmZmODEwMGYyM2Y+XSA/IHhlbl9yZXN0b3JlX2ZsX2Rp cmVjdF9lbmQrMHgwLzB4MQ0KIFs8ZmZmZmZmZmY4MTBmZDg4Yj5dID8ga21lbV9jYWNoZV9mcmVl KzB4ODgvMHhiYg0KIFs8ZmZmZmZmZmY4MTM5MzRiOT5dID8gX19rZnJlZV9za2IrMHg3OS8weDdk DQogWzxmZmZmZmZmZjgxMDEyYWRiPl0gZGl2aWRlX2Vycm9yKzB4MWIvMHgyMA0KIFs8ZmZmZmZm ZmY4MTA0ZWU1Nz5dID8gZmluZF9idXNpZXN0X2dyb3VwKzB4MzdkLzB4NzIxDQogWzxmZmZmZmZm ZjgxMDJlZGY5Pl0gPyBwdmNsb2NrX2Nsb2Nrc291cmNlX3JlYWQrMHg0Ny8weDgwDQogWzxmZmZm ZmZmZjgxMDBmMjNmPl0gPyB4ZW5fcmVzdG9yZV9mbF9kaXJlY3RfZW5kKzB4MC8weDENCiBbPGZm ZmZmZmZmODEwNGZjNDM+XSByZWJhbGFuY2VfZG9tYWlucysweDE3Yi8weDQ1Yg0KIFs8ZmZmZmZm ZmY4MTA0ODBmMj5dID8gd2FrZV91cF9wcm9jZXNzKzB4MTUvMHgxNw0KIFs8ZmZmZmZmZmY4MTA0 ZmY2Mz5dIHJ1bl9yZWJhbGFuY2VfZG9tYWlucysweDQwLzB4YzUNCiBbPGZmZmZmZmZmODEwNTli OWI+XSBfX2RvX3NvZnRpcnErMHhkMi8weDE5NA0KIFs8ZmZmZmZmZmY4MTAxMmVhYz5dIGNhbGxf c29mdGlycSsweDFjLzB4MzANCiBbPGZmZmZmZmZmODEwMTQ2Mjc+XSBkb19zb2Z0aXJxKzB4NDYv MHg4Nw0KIFs8ZmZmZmZmZmY4MTA1OWM5OD5dIGlycV9leGl0KzB4M2IvMHg3YQ0KIFs8ZmZmZmZm ZmY4MTI4NjhhYj5dIHhlbl9ldnRjaG5fZG9fdXBjYWxsKzB4MTU2LzB4MTcyDQogWzxmZmZmZmZm ZjgxMDEyZWZlPl0geGVuX2RvX2h5cGVydmlzb3JfY2FsbGJhY2srMHgxZS8weDMwDQogPEVPST4g IFs8ZmZmZmZmZmY4MTAwOTNhYT5dID8gaHlwZXJjYWxsX3BhZ2UrMHgzYWEvMHgxMDAwDQogWzxm ZmZmZmZmZjgxMDBmMjNmPl0gPyB4ZW5fcmVzdG9yZV9mbF9kaXJlY3RfZW5kKzB4MC8weDENCiBb PGZmZmZmZmZmODEwMDkzYWE+XSA/IGh5cGVyY2FsbF9wYWdlKzB4M2FhLzB4MTAwMA0KIFs8ZmZm ZmZmZmY4MTAwZWJiZj5dID8geGVuX3NhZmVfaGFsdCsweDEwLzB4MWENCiBbPGZmZmZmZmZmODEw MGMxMDI+XSA/IHhlbl9pZGxlKzB4M2MvMHg0Ng0KIFs8ZmZmZmZmZmY4MTAxMGNiZD5dID8gY3B1 X2lkbGUrMHg1ZC8weDhjDQogWzxmZmZmZmZmZjgxNDI2NzQyPl0gPyByZXN0X2luaXQrMHg2Ni8w eDY4DQogWzxmZmZmZmZmZjgxNzlmZDhkPl0gPyBzdGFydF9rZXJuZWwrMHgzZWYvMHgzZmINCiBb PGZmZmZmZmZmODE3OWYyYzM+XSA/IHg4Nl82NF9zdGFydF9yZXNlcnZhdGlvbnMrMHhhZS8weGIy DQogWzxmZmZmZmZmZjgxN2EyY2IzPl0gPyB4ZW5fc3RhcnRfa2VybmVsKzB4NGMwLzB4NGM3DQoN 
Cg0KDQoNCg0KDQoNCg0KDQoNCj09PT09PT09PT09PT09PT09PT09PT09PT1jcmFzaCBsb2cgZm9y IG1hY2hpbmUgdHdvIGluIDIuNi4zMi4xMCA9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT0NCg0KDQoNCg0KDQoNCmJsa3RhcF9zeXNmc19jcmVhdGU6IGFkZGlu ZyBhdHRyaWJ1dGVzIGZvciBkZXYgZmZmZjg4MDEwYjY2OGMwMA0KX19yYXRlbGltaXQ6IDQgY2Fs bGJhY2tzIHN1cHByZXNzZWQNCmJsa3RhcF9zeXNmc19jcmVhdGU6IGFkZGluZyBhdHRyaWJ1dGVz IGZvciBkZXYgZmZmZjg4MDBiZDM4NDIwMA0KX19yYXRlbGltaXQ6IDYgY2FsbGJhY2tzIHN1cHBy ZXNzZWQNCklOSVQ6IElkICJzMCIgcmVzcGF3bmluZyB0b28gZmFzdDogZGlzYWJsZWQgZm9yIDUg bWludXRlcw0KYmxrdGFwX3N5c2ZzX2Rlc3Ryb3kNCmJsa3RhcF9zeXNmc19jcmVhdGU6IGFkZGlu ZyBhdHRyaWJ1dGVzIGZvciBkZXYgZmZmZjg4MDBiZjA3ZDgwMA0KSU5JVDogSWQgInMwIiByZXNw YXduaW5nIHRvbyBmYXN0OiBkaXNhYmxlZCBmb3IgNSBtaW51dGVzDQpkaXZpZGUgZXJyb3I6IDAw MDAgWyMxXSBTTVAgDQpsYXN0IHN5c2ZzIGZpbGU6IC9zeXMvaHlwZXJ2aXNvci90eXBlDQpDUFUg MiANCk1vZHVsZXMgbGlua2VkIGluOiA4MDIxcSBnYXJwIHhlbl9uZXRiYWNrIHhlbl9ibGtiYWNr IGJsa3RhcCBibGtiYWNrX3BhZ2VtYXAgbmJkIGJyaWRnZSBzdHAgbGxjIGF1dG9mczQgaXBtaV9k ZXZpbnRmIGlwbWlfc2kgaXBtaV9tc2doYW5kbGVyIGxvY2tkIHN1bnJwYyBib25kaW5nIGlwdjYg eGVuZnMgZG1fbXVsdGlwYXRoIHZpZGVvIG91dHB1dCBzYnMgc2JzaGMgcGFycG9ydF9wYyBscCBw YXJwb3J0IHNlcyBlbmNsb3N1cmUgc25kX3NlcV9kdW1teSBzbmRfc2VxX29zcyBzbmRfc2VxX21p ZGlfZXZlbnQgc25kX3NlcSBzbmRfc2VxX2RldmljZSBzbmRfcGNtX29zcyBzbmRfbWl4ZXJfb3Nz IGJueDIgc25kX3BjbSBzZXJpb19yYXcgc25kX3RpbWVyIHNuZCBzb3VuZGNvcmUgc25kX3BhZ2Vf YWxsb2MgaTJjX2k4MDEgaTJjX2NvcmUgcGNzcGtyIHBhdGFfYWNwaSBhdGFfZ2VuZXJpYyBpVENP X3dkdCBpVENPX3ZlbmRvcl9zdXBwb3J0IGF0YV9waWl4IHNocGNocCBtcHRzYXMgbXB0c2NzaWgg bXB0YmFzZSBbbGFzdCB1bmxvYWRlZDogZnJlcV90YWJsZV0NClBpZDogMTk2MzIsIGNvbW06IHhl bnN0b3JlLWxpc3QgTm90IHRhaW50ZWQgMi42LjMyLjEweGVuICMxIFRlY2FsIFJIMjI4NSAgICAg ICAgICANClJJUDogZTAzMDpbPGZmZmZmZmZmODEwNGVlNTc+XSAgWzxmZmZmZmZmZjgxMDRlZTU3 Pl0gZmluZF9idXNpZXN0X2dyb3VwKzB4MzdkLzB4NzIxDQpSU1A6IGUwMmI6ZmZmZjg4MDBiNTJh MWMwOCAgRUZMQUdTOiAwMDAxMDA0Ng0KUkFYOiAwMDAwMDAwMDAwMDAwMDAwIFJCWDogMDAwMDAw 
MDAwMDAwMDAwMCBSQ1g6IGZmZmY4ODAwMjgwN2I1MDENClJEWDogMDAwMDAwMDAwMDAwMDAwMCBS U0k6IDAwMDAwMDAwMDAwMDAwNDAgUkRJOiAwMDAwMDAwMDAwMDAwMDQwDQpSQlA6IGZmZmY4ODAw YjUyYTFkNjggUjA4OiAwMDAwMDAwMDAwMDAwMDAwIFIwOTogZmZmZjg4MDAyODA3NWUwOA0KUjEw OiBmZmZmODgwMGJmMGNmNmMwIFIxMTogZmZmZjg4MDBiNTJhMWRkOCBSMTI6IDAwMDAwMDAwMDAw MDAwNDANClIxMzogZmZmZjg4MDAyODA3NWRmMCBSMTQ6IGZmZmY4ODAwMjgwNzVjZTAgUjE1OiAw MDAwMDAwMDAwMDAwMDAyDQpGUzogIDAwMDA3ZmNiMjI4MTQ2ZTAoMDAwMCkgR1M6ZmZmZjg4MDAy ODA2NjAwMCgwMDAwKSBrbmxHUzowMDAwMDAwMDAwMDAwMDAwDQpDUzogIGUwMzMgRFM6IDAwMDAg RVM6IDAwMDAgQ1IwOiAwMDAwMDAwMDgwMDUwMDNiDQpDUjI6IDAwMDAwMDM1ODFkMTgwZTAgQ1Iz OiAwMDAwMDAwMGI4M2I1MDAwIENSNDogMDAwMDAwMDAwMDAwMjY2MA0KRFIwOiAwMDAwMDAwMDAw MDAwMDAwIERSMTogMDAwMDAwMDAwMDAwMDAwMCBEUjI6IDAwMDAwMDAwMDAwMDAwMDANCkRSMzog MDAwMDAwMDAwMDAwMDAwMCBEUjY6IDAwMDAwMDAwZmZmZjBmZjAgRFI3OiAwMDAwMDAwMDAwMDAw NDAwDQpQcm9jZXNzIHhlbnN0b3JlLWxpc3QgKHBpZDogMTk2MzIsIHRocmVhZGluZm8gZmZmZjg4 MDBiNTJhMDAwMCwgdGFzayBmZmZmODgwMTBjMDgwMDAwKQ0KU3RhY2s6DQogZmZmZjg4MDBiNTJh MWMyMCBmZmZmODgwMDI4MDc1YTQ4IGZmZmY4ODAwYjUyYTFkYmMgMDAwMDAwMDJiNTJhMWMzMA0K PDA+IGZmZmY4ODAwYjUyYTFkYjAgMDAwMDAwMDAwMDAwMDAwNCAwMDAwMDAwMDAwMDAwMDAwIDAw MDAwMDAyMDAwMDAwMDANCjwwPiBmZmZmODgwMDI4MDc1ZTAwIDAwMDAwMDAwMDAwMDAwMDEgZmZm Zjg4MDAyODA3NWRmMCBmZmZmZmZmZjI4MDdiNWMwDQpDYWxsIFRyYWNlOg0KIFs8ZmZmZmZmZmY4 MTQzYTgwNT5dIHNjaGVkdWxlKzB4MjdhLzB4NzM2DQogWzxmZmZmZmZmZjgxNDNjM2RhPl0gPyBf c3Bpbl91bmxvY2tfaXJxcmVzdG9yZSsweDE1LzB4MTcNCiBbPGZmZmZmZmZmODEyODg1Mjg+XSBy ZWFkX3JlcGx5KzB4ODYvMHgxMDQNCiBbPGZmZmZmZmZmODEwNzFlODI+XSA/IGF1dG9yZW1vdmVf d2FrZV9mdW5jdGlvbisweDAvMHgzZA0KIFs8ZmZmZmZmZmY4MTQzYWUwZT5dID8gX2NvbmRfcmVz Y2hlZCsweGUvMHgyMg0KIFs8ZmZmZmZmZmY4MTI4ODYzOT5dIHhlbmJ1c19kZXZfcmVxdWVzdF9h bmRfcmVwbHkrMHg1OC8weDg5DQogWzxmZmZmZmZmZmEwMTE1NTRlPl0geGVuYnVzX2ZpbGVfd3Jp dGUrMHgxNmEvMHg0NjkgW3hlbmZzXQ0KIFs8ZmZmZmZmZmY4MTEwOTY3MT5dIHZmc193cml0ZSsw eGIwLzB4MTBhDQogWzxmZmZmZmZmZjgxMTBhM2FiPl0gc3lzX3dyaXRlKzB4NGMvMHg3Mg0KIFs8 
ZmZmZmZmZmY4MTAxMWQ3Mj5dIHN5c3RlbV9jYWxsX2Zhc3RwYXRoKzB4MTYvMHgxYg0KQ29kZTog ODMgN2QgMTAgMDAgNzQgMGMgNDggOGIgNWQgMTAgYzcgMDMgMDAgMDAgMDAgMDAgZWIgNzAgNDEg OGIgNTUgMDggNDggOGIgNDUgYTggNDggODkgZDMgNDggYzEgYTUgZDAgZmUgZmYgZmYgMGEgNDgg YzEgZTAgMGEgMzEgZDIgPDQ4PiBmNyBmMyA0OCA4OSA0NSBhMCA0OCA4YiA4NSAwOCBmZiBmZiBm ZiA0OCAyOSA4NSAwMCBmZiBmZiBmZiANClJJUCAgWzxmZmZmZmZmZjgxMDRlZTU3Pl0gZmluZF9i dXNpZXN0X2dyb3VwKzB4MzdkLzB4NzIxDQogUlNQIDxmZmZmODgwMGI1MmExYzA4Pg0KLS0tWyBl bmQgdHJhY2UgMTM1MDlkODhmNWI4OTE4YyBdLS0tDQpkaXZpZGUgZXJyb3I6IDAwMDAgWyMyXSBT TVAgDQpsYXN0IHN5c2ZzIGZpbGU6IC9zeXMvaHlwZXJ2aXNvci90eXBlDQpDUFUgMSANCk1vZHVs ZXMgbGlua2VkIGluOiA4MDIxcSBnYXJwIHhlbl9uZXRiYWNrIHhlbl9ibGtiYWNrIGJsa3RhcCBi bGtiYWNrX3BhZ2VtYXAgbmJkIGJyaWRnZSBzdHAgbGxjIGF1dG9mczQgaXBtaV9kZXZpbnRmIGlw bWlfc2kgaXBtaV9tc2doYW5kbGVyIGxvY2tkIHN1bnJwYyBib25kaW5nIGlwdjYgeGVuZnMgZG1f bXVsdGlwYXRoIHZpZGVvIG91dHB1dCBzYnMgc2JzaGMgcGFycG9ydF9wYyBscCBwYXJwb3J0IHNl cyBlbmNsb3N1cmUgc25kX3NlcV9kdW1teSBzbmRfc2VxX29zcyBzbmRfc2VxX21pZGlfZXZlbnQg c25kX3NlcSBzbmRfc2VxX2RldmljZSBzbmRfcGNtX29zcyBzbmRfbWl4ZXJfb3NzIGJueDIgc25k X3BjbSBzZXJpb19yYXcgc25kX3RpbWVyIHNuZCBzb3VuZGNvcmUgc25kX3BhZ2VfYWxsb2MgaTJj X2k4MDEgaTJjX2NvcmUgcGNzcGtyIHBhdGFfYWNwaSBhdGFfZ2VuZXJpYyBpVENPX3dkdCBpVENP X3ZlbmRvcl9zdXBwb3J0IGF0YV9waWl4IHNocGNocCBtcHRzYXMgbXB0c2NzaWggbXB0YmFzZSBb bGFzdCB1bmxvYWRlZDogZnJlcV90YWJsZV0NClBpZDogMzQyOSwgY29tbToga2lwbWkwIFRhaW50 ZWQ6IEcgICAgICBEICAgIDIuNi4zMi4xMHhlbiAjMSBUZWNhbCBSSDIyODUgICAgICAgICAgDQpS SVA6IGUwMzA6WzxmZmZmZmZmZjgxMDRlZTU3Pl0gIFs8ZmZmZmZmZmY4MTA0ZWU1Nz5dIGZpbmRf YnVzaWVzdF9ncm91cCsweDM3ZC8weDcyMQ0KUlNQOiBlMDJiOmZmZmY4ODAxNTVlMmRjNzAgIEVG TEFHUzogMDAwMTAwNDYNClJBWDogMDAwMDAwMDAwMDAwMDAwMCBSQlg6IDAwMDAwMDAwMDAwMDAw MDAgUkNYOiBmZmZmODgwMDI4MDdiNTAwDQpSRFg6IDAwMDAwMDAwMDAwMDAwMDAgUlNJOiAwMDAw MDAwMDAwMDAwMDQwIFJESTogMDAwMDAwMDAwMDAwMDA0MA0KUkJQOiBmZmZmODgwMTU1ZTJkZGQw IFIwODogMDAwMDAwMDAwMDAwMDAwMCBSMDk6IGZmZmY4ODAwMjgwNzVlMDgNClIxMDogZmZmZjg4 
MDE1ODMzOTQwMCBSMTE6IGZmZmY4ODAwMjgwNDAwMDAgUjEyOiAwMDAwMDAwMDAwMDAwMDQwDQpS MTM6IGZmZmY4ODAwMjgwNzVkZjAgUjE0OiBmZmZmODgwMDI4MDU4Y2UwIFIxNTogMDAwMDAwMDAw MDAwMDAwMQ0KRlM6ICAwMDAwN2ZkZTU5NDg5NmUwKDAwMDApIEdTOmZmZmY4ODAwMjgwNDkwMDAo MDAwMCkga25sR1M6MDAwMDAwMDAwMDAwMDAwMA0KQ1M6ICBlMDMzIERTOiAwMDAwIEVTOiAwMDAw IENSMDogMDAwMDAwMDA4MDA1MDAzYg0KQ1IyOiAwMDAwMDAwMDAwNDY5MDAwIENSMzogMDAwMDAw MDAwMTAwMTAwMCBDUjQ6IDAwMDAwMDAwMDAwMDI2NjANCkRSMDogMDAwMDAwMDAwMDAwMDAwMCBE UjE6IDAwMDAwMDAwMDAwMDAwMDAgRFIyOiAwMDAwMDAwMDAwMDAwMDAwDQpEUjM6IDAwMDAwMDAw MDAwMDAwMDAgRFI2OiAwMDAwMDAwMGZmZmYwZmYwIERSNzogMDAwMDAwMDAwMDAwMDQwMA0KUHJv Y2VzcyBraXBtaTAgKHBpZDogMzQyOSwgdGhyZWFkaW5mbyBmZmZmODgwMTU1ZTJjMDAwLCB0YXNr IGZmZmY4ODAxNTVjZjJkNjApDQpTdGFjazoNCiBmZmZmODgwMTU1ZTJkYzg4IGZmZmY4ODAwMjgw NThhNDggZmZmZjg4MDE1NWUyZGUyNCAwMDAwMDAwMjU1ZTJkYzk4DQo8MD4gZmZmZjg4MDE1NWUy ZGUxOCAwMDAwMDAwMDAwMDAwMDA0IDAwMDAwMDAwMDAwMDAwMDAgZmZmZmZmZmYwMDAwMDAwMA0K PDA+IGZmZmY4ODAwMjgwNzVlMDAgMDAwMDAwMDAyODA0MDAwMCBmZmZmODgwMDI4MDU4ZGYwIDAw MDAwMDAwMjgwM2JlMDgNCkNhbGwgVHJhY2U6DQogWzxmZmZmZmZmZjgxNDNhODA1Pl0gc2NoZWR1 bGUrMHgyN2EvMHg3MzYNCiBbPGZmZmZmZmZmODE0M2MzZGE+XSA/IF9zcGluX3VubG9ja19pcnFy ZXN0b3JlKzB4MTUvMHgxNw0KIFs8ZmZmZmZmZmY4MTQzYjBjMD5dIHNjaGVkdWxlX3RpbWVvdXQr MHg5ZC8weGM0DQogWzxmZmZmZmZmZjgxMDYwOGM0Pl0gPyBwcm9jZXNzX3RpbWVvdXQrMHgwLzB4 MTANCiBbPGZmZmZmZmZmODE0M2IxMjU+XSBzY2hlZHVsZV90aW1lb3V0X2ludGVycnVwdGlibGUr MHgxZS8weDIwDQogWzxmZmZmZmZmZmEwMWM5ZDA5Pl0gaXBtaV90aHJlYWQrMHg2YS8weDdlIFtp cG1pX3NpXQ0KIFs8ZmZmZmZmZmZhMDFjOWM5Zj5dID8gaXBtaV90aHJlYWQrMHgwLzB4N2UgW2lw bWlfc2ldDQogWzxmZmZmZmZmZjgxMDcxYWEzPl0ga3RocmVhZCsweDZlLzB4NzYNCiBbPGZmZmZm ZmZmODEwMTJkYWE+XSBjaGlsZF9yaXArMHhhLzB4MjANCiBbPGZmZmZmZmZmODEwMTFmOTE+XSA/ IGludF9yZXRfZnJvbV9zeXNfY2FsbCsweDcvMHgxYg0KIFs8ZmZmZmZmZmY4MTAxMjcxZD5dID8g cmV0aW50X3Jlc3RvcmVfYXJncysweDUvMHg2DQogWzxmZmZmZmZmZjgxMDEyZGEwPl0gPyBjaGls ZF9yaXArMHgwLzB4MjANCkNvZGU6IDgzIDdkIDEwIDAwIDc0IDBjIDQ4IDhiIDVkIDEwIGM3IDAz 
IDAwIDAwIDAwIDAwIGViIDcwIDQxIDhiIDU1IDA4IDQ4IDhiIDQ1IGE4IDQ4IDg5IGQzIDQ4IGMx IGE1IGQwIGZlIGZmIGZmIDBhIDQ4IGMxIGUwIDBhIDMxIGQyIDw0OD4gZjcgZjMgNDggODkgNDUg YTAgNDggOGIgODUgMDggZmYgZmYgZmYgNDggMjkgODUgMDAgZmYgZmYgZmYgDQpSSVAgIFs8ZmZm ZmZmZmY4MTA0ZWU1Nz5dIGZpbmRfYnVzaWVzdF9ncm91cCsweDM3ZC8weDcyMQ0KIFJTUCA8ZmZm Zjg4MDE1NWUyZGM3MD4NCi0tLVsgZW5kIHRyYWNlIDEzNTA5ZDg4ZjViODkxOGQgXS0tLQ== --_2bf2ab8c-b338-47ff-b127-f026606dcf02_ Content-Type: text/plain Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="kernel_bug_at_mmu.c.txt" QXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IC0tLS0tLS0tLS0tLVsgY3V0IGhlcmUg XS0tLS0tLS0tLS0tLQ0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IGtlcm5lbCBC VUcgYXQgYXJjaC94ODYveGVuL21tdS5jOjE4NzIhDQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3 IGtlcm5lbDogaW52YWxpZCBvcGNvZGU6IDAwMDAgWyMxXSBTTVANCkFwciAgOCAxMjoxOTo0NyBy MTRhMTEwMTcga2VybmVsOiBsYXN0IHN5c2ZzIGZpbGU6IC9zeXMvaHlwZXJ2aXNvci9wcm9wZXJ0 aWVzL2NhcGFiaWxpdGllcw0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IENQVSAw DQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogTW9kdWxlcyBsaW5rZWQgaW46IDgw MjFxIGdhcnAgYmxrdGFwIHhlbl9uZXRiYWNrIHhlbl9ibGtiYWNrIGJsa2JhY2tfcGFnZW1hcCBu YmQgYnJpZGdlIHN0cCBsbGMgYXV0b2ZzNCBpcG1pX2RldmludGYgaXBtaV9zaSBpcG1pX21zZ2hh bmRsZXIgbG9ja2Qgc3VucnBjIGJvbmRpbmcgaXB2NiB4ZW5mcyBkbV9tdWx0aXBhdGggdmlkZW8g b3V0cHV0IHNicyBzYnNoYyBwYXJwb3J0X3BjIGxwIHBhcnBvcnQgc2VzIGVuY2xvc3VyZSBzbmRf c2VxX2R1bW15IHNuZF9zZXFfb3NzIGJueDIgc25kX3NlcV9taWRpX2V2ZW50IHNlcmlvX3JhdyBz bmRfc2VxIHNuZF9zZXFfZGV2aWNlIHNuZF9wY21fb3NzIHNuZF9taXhlcl9vc3Mgc25kX3BjbSBz bmRfdGltZXIgaTJjX2k4MDEgaVRDT193ZHQgaTJjX2NvcmUgc25kIHNvdW5kY29yZSBzbmRfcGFn ZV9hbGxvYyBpVENPX3ZlbmRvcl9zdXBwb3J0IHBhdGFfYWNwaSBhdGFfZ2VuZXJpYyBwY3Nwa3Ig YXRhX3BpaXggc2hwY2hwIG1wdHNhcyBtcHRzY3NpaCBtcHRiYXNlIFtsYXN0IHVubG9hZGVkOiBm cmVxX3RhYmxlXQ0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IFBpZDogMTU3Njks IGNvbW06IHNoIE5vdCB0YWludGVkIDIuNi4zMi4zNnhlbiAjMSBUZWNhbCBSSDIyODUNCkFwciAg 
OCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiBSSVA6IGUwMzA6WzxmZmZmZmZmZjgxMDBjZWJj Pl0gIFs8ZmZmZmZmZmY4MTAwY2ViYz5dIHBpbl9wYWdldGFibGVfcGZuKzB4MzYvMHgzYw0KQXBy ICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IFJTUDogZTAyYjpmZmZmODgwMDFlYjdiYWE4 ICBFRkxBR1M6IDAwMDEwMjgyDQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogUkFY OiAwMDAwMDAwMGZmZmZmZmVhIFJCWDogMDAwMDAwMDAwMDA3YjMwNyBSQ1g6IDAwMDAwMDAwMDAw MDAwMDENCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiBSRFg6IDAwMDAwMDAwMDAw MDAwMDAgUlNJOiAwMDAwMDAwMDAwMDAwMDAxIFJESTogZmZmZjg4MDAxZWI3YmFhOA0KQXByICA4 IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IFJCUDogZmZmZjg4MDAxZWI3YmFjOCBSMDg6IDAw MDAwMDAwMDAwMDA0MjAgUjA5OiBmZmZmODgwMDAwMDAwMDAwDQpBcHIgIDggMTI6MTk6NDcgcjE0 YTExMDE3IGtlcm5lbDogUjEwOiAwMDAwMDAwMDAwMDA3ZmYwIFIxMTogZmZmZjg4MDA4ZmM5NzI0 OCBSMTI6IGZmZmY4ODAwMjg0MGIwMDANCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVs OiBSMTM6IDAwMDAwMDAwMDAwN2I0ODQgUjE0OiAwMDAwMDAwMDAwMDAwMDAzIFIxNTogZmZmZjg4 MDA5YjA5MDAwMA0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IEZTOiAgMDAwMDdm ZThiYmM2NTZlMCgwMDAwKSBHUzpmZmZmODgwMDI4MDNiMDAwKDAwMDApIGtubEdTOjAwMDAwMDAw MDAwMDAwMDANCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiBDUzogIGUwMzMgRFM6 IDAwMDAgRVM6IDAwMDAgQ1IwOiAwMDAwMDAwMDgwMDUwMDNiDQpBcHIgIDggMTI6MTk6NDcgcjE0 YTExMDE3IGtlcm5lbDogQ1IyOiAwMDAwMDAwMDAwNmJiMzM4IENSMzogMDAwMDAwMDA3YjMwNzAw MCBDUjQ6IDAwMDAwMDAwMDAwMDI2NjANCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVs OiBEUjA6IDAwMDAwMDAwMDAwMDAwMDAgRFIxOiAwMDAwMDAwMDAwMDAwMDAwIERSMjogMDAwMDAw MDAwMDAwMDAwMA0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IERSMzogMDAwMDAw MDAwMDAwMDAwMCBEUjY6IDAwMDAwMDAwZmZmZjBmZjAgRFI3OiAwMDAwMDAwMDAwMDAwNDAwDQpB cHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogUHJvY2VzcyBzaCAocGlkOiAxNTc2OSwg dGhyZWFkaW5mbyBmZmZmODgwMDFlYjdhMDAwLCB0YXNrIGZmZmY4ODAwOWIwOTAwMDApDQpBcHIg IDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogU3RhY2s6DQpBcHIgIDggMTI6MTk6NDcgcjE0 YTExMDE3IGtlcm5lbDogIDAwMDAwMDAwMDAwMDAwMDAgMDAwMDAwMDAwMDRiNzQ4NCAwMDAwMDAw 
MTFlYjdiYWM4IDAwMDAwMDAwMDAwN2IzMDcNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2Vy bmVsOiA8MD4gZmZmZjg4MDAxZWI3YmFmOCBmZmZmZmZmZjgxMDBlOGVmIGZmZmY4ODAxMmU0ZmIx MDAgZmZmZjg4MDAwZmI1ZTAxOA0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6IDww PiAwMDAwMDAwMDAwMDdiNDg0IDAwMDAwMDAwMDA2YmIzMzggZmZmZjg4MDAxZWI3YmIwOCBmZmZm ZmZmZjgxMDBlOTM1DQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogQ2FsbCBUcmFj ZToNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMDBlOGVm Pl0geGVuX2FsbG9jX3B0cGFnZSsweDhkLzB4OTYNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcg a2VybmVsOiAgWzxmZmZmZmZmZjgxMDBlOTM1Pl0geGVuX2FsbG9jX3B0ZSsweDEzLzB4MTUNCkFw ciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMGViNzAyPl0gX19w dGVfYWxsb2MrMHg3Zi8weGRjDQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogIFs8 ZmZmZmZmZmY4MTBlOTBiZD5dID8gcG1kX29mZnNldCsweDEzLzB4M2MNCkFwciAgOCAxMjoxOTo0 NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMGViODE4Pl0gaGFuZGxlX21tX2ZhdWx0 KzB4YjkvMHg3NzENCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZm ZjgxMGYwOGZkPl0gPyB2bWFfbGluaysweDdjLzB4YTQNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEw MTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMGYxM2IwPl0gPyBtbWFwX3JlZ2lvbisweDMyMi8weDQy Yg0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6ICBbPGZmZmZmZmZmODEwMGYxNjk+ XSA/IHhlbl9mb3JjZV9ldnRjaG5fY2FsbGJhY2srMHhkLzB4Zg0KQXByICA4IDEyOjE5OjQ3IHIx NGExMTAxNyBrZXJuZWw6ICBbPGZmZmZmZmZmODE0NDk3MDE+XSBkb19wYWdlX2ZhdWx0KzB4MjFj LzB4Mjg4DQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogIFs8ZmZmZmZmZmY4MTQ0 NzY5NT5dIHBhZ2VfZmF1bHQrMHgyNS8weDMwDQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtl cm5lbDogIFs8ZmZmZmZmZmY4MTIyMmEzOT5dID8gX19jbGVhcl91c2VyKzB4MzMvMHg1NQ0KQXBy ICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6ICBbPGZmZmZmZmZmODEyMjJhMWQ+XSA/IF9f Y2xlYXJfdXNlcisweDE3LzB4NTUNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAg WzxmZmZmZmZmZjgxMjIyYThiPl0gY2xlYXJfdXNlcisweDMwLzB4MzgNCkFwciAgOCAxMjoxOTo0 NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMTUxMzlhPl0gbG9hZF9lbGZfYmluYXJ5 
KzB4NWQ1LzB4MTdlZg0KQXByICA4IDEyOjE5OjQ3IHIxNGExMTAxNyBrZXJuZWw6ICBbPGZmZmZm ZmZmODExZjQ2NDg+XSA/IHByb2Nlc3NfbWVhc3VyZW1lbnQrMHhjMC8weGQ3DQpBcHIgIDggMTI6 MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogIFs8ZmZmZmZmZmY4MTE1MGRjNT5dID8gbG9hZF9lbGZf YmluYXJ5KzB4MC8weDE3ZWYNCkFwciAgOCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAgWzxm ZmZmZmZmZjgxMTEzMDk0Pl0gc2VhcmNoX2JpbmFyeV9oYW5kbGVyKzB4YzgvMHgyNTUNCkFwciAg OCAxMjoxOTo0NyByMTRhMTEwMTcga2VybmVsOiAgWzxmZmZmZmZmZjgxMTE0MzYyPl0gZG9fZXhl Y3ZlKzB4MWMzLzB4MjllDQpBcHIgIDggMTI6MTk6NDcgcjE0YTExMDE3IGtlcm5lbDogIFs8ZmZm ZmZmZmY4MTAxMTU1ZD5dIHN5c19leGVjdmUrMHg0My8weDVkDQpBcHIgIDggMTI6MTk6NDcgcjE0 YTExMDE3IGtlcm5lbDogIFs8ZmZmZmZmZmY4MTAxMzFjYT5dIHN0dWJfZXhlY3ZlKzB4NmEvMHhj MA== --_2bf2ab8c-b338-47ff-b127-f026606dcf02_ Content-Type: text/plain Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="granttabl.txt" KFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhw ZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9y IGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJh ZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFi bGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQoo WEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBl Y3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3Ig ZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFk IGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJs ZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihY RU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVj dGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBk b20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQg ZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMSBt 
ZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdz ICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDUgbWVzc2Fn ZXMgc3VwcHJlc3NlZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkg b3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiA1IG1lc3NhZ2VzIHN1 cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRv bSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMyBtZXNzYWdlcyBzdXBwcmVz c2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDAp LiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDEgbWVzc2FnZXMgc3VwcHJlc3NlZC4N CihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4 cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiAxIG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVO KSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3Rl ZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9t ICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZs YWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDEgbWVz c2FnZXMgc3VwcHJlc3NlZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAo MCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiAxIG1lc3NhZ2Vz IHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9y IGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMyBtZXNzYWdlcyBzdXBw cmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20g KDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDEgbWVzc2FnZXMgc3VwcHJlc3Nl ZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4g KGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiA1IG1lc3NhZ2VzIHN1cHByZXNzZWQuDQoo WEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBl Y3RlZCBkb20gMCkNCihYRU4pIHByaW50azogNSBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikg Z3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQg 
ZG9tIDApDQooWEVOKSBwcmludGs6IDMgbWVzc2FnZXMgc3VwcHJlc3NlZC4NCihYRU4pIGdyYW50 X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAw KQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAo ZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDAp IG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMSBtZXNzYWdlcyBz dXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBk b20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDEgbWVzc2FnZXMgc3VwcHJl c3NlZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgw KS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdz ICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDYgbWVzc2Fn ZXMgc3VwcHJlc3NlZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkg b3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiAxIG1lc3NhZ2VzIHN1 cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRv bSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMyBtZXNzYWdlcyBzdXBwcmVz c2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDAp LiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDkgbWVzc2FnZXMgc3VwcHJlc3NlZC4N CihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4 cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiA5IG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVO KSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3Rl ZCBkb20gMCkNCihYRU4pIHByaW50azogOSBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3Jh bnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9t IDApDQooWEVOKSBwcmludGs6IDcgbWVzc2FnZXMgc3VwcHJlc3NlZC4NCihYRU4pIGdyYW50X3Rh YmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0K KFhFTikgcHJpbnRrOiA1IG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5j OjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4p 
IHByaW50azogMyBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoxNzE3 OmQwIEJhZCBncmFudCByZWZlcmVuY2UgNDI5NDk2NTk4Mw0KKFhFTikgcHJpbnRrOiAxIG1lc3Nh Z2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDAp IG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMSBtZXNzYWdlcyBz dXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBk b20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDMgbWVzc2FnZXMgc3VwcHJl c3NlZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgw KS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdz ICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2 NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdy YW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRv bSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDAp LiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3Mg KDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2 OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3Jh bnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9t IDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCku IChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMyBtZXNzYWdlcyBzdXBwcmVzc2VkLg0K KFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhw ZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDEgbWVzc2FnZXMgc3VwcHJlc3NlZC4NCihYRU4p IGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVk IGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20g KDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxh Z3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMyBtZXNz YWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgw 
KSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2Njpk MCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50 azogMiBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFk IGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJs ZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihY RU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVj dGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoxNzE3OmQwIEJhZCBncmFudCByZWZlcmVu Y2UgNDI5NDk2NTk4Mw0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBv ciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBC YWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3Rh YmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0K KFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhw ZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9y IGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJh ZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFi bGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQoo WEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBl Y3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3Ig ZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFk IGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJs ZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihY RU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVj dGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBk b20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQg ZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxl 
LmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhF TikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0 ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRv bSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBm bGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiA0IG1l c3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3Mg KDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2 OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJp bnRrOiAyIG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBC YWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azog MyBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZs YWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDEgbWVz c2FnZXMgc3VwcHJlc3NlZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAo MCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiAxIG1lc3NhZ2Vz IHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9y IGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJh ZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFi bGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQoo WEVOKSBwcmludGs6IDEgbWVzc2FnZXMgc3VwcHJlc3NlZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6 MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikg Z3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQg ZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAo MCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFn cyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoy NjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBn 
cmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBk b20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgw KS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdz ICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2 NjpkMCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdy YW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRv bSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDAp LiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3Mg KDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2 OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJp bnRrOiAxIG1lc3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBC YWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azog MSBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZs YWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDUgbWVz c2FnZXMgc3VwcHJlc3NlZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAo MCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgcHJpbnRrOiAzIG1lc3NhZ2Vz IHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxhZ3MgKDApIG9y IGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIHByaW50azogMSBtZXNzYWdlcyBzdXBw cmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20g KDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBwcmludGs6IDMgbWVzc2FnZXMgc3VwcHJlc3Nl ZC4NCihYRU4pIGdyYW50X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4g KGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgw KSBvciBkb20gKDApLiAoZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjI2Njpk MCBCYWQgZmxhZ3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50 X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAw 
KQ0KKFhFTikgZ3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAo ZXhwZWN0ZWQgZG9tIDApDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJl ZmVyZW5jZSA0Mjk0OTY1OTgzDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50 IHJlZmVyZW5jZSA0Mjk0OTY1OTgzDQooWEVOKSBncmFudF90YWJsZS5jOjI2NjpkMCBCYWQgZmxh Z3MgKDApIG9yIGRvbSAoMCkuIChleHBlY3RlZCBkb20gMCkNCihYRU4pIGdyYW50X3RhYmxlLmM6 MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAwKQ0KKFhFTikg Z3JhbnRfdGFibGUuYzoyNjY6ZDAgQmFkIGZsYWdzICgwKSBvciBkb20gKDApLiAoZXhwZWN0ZWQg ZG9tIDApDQooWEVOKSBwcmludGs6IDcgbWVzc2FnZXMgc3VwcHJlc3NlZC4NCihYRU4pIGdyYW50 X3RhYmxlLmM6MjY2OmQwIEJhZCBmbGFncyAoMCkgb3IgZG9tICgwKS4gKGV4cGVjdGVkIGRvbSAw KQ0KKFhFTikgcHJpbnRrOiAxNSBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFi bGUuYzoxNzE3OmQwIEJhZCBncmFudCByZWZlcmVuY2UgNDI5NDkwMTc2MA0KKFhFTikgcHJpbnRr OiAxMyBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoxNzE3OmQwIEJh ZCBncmFudCByZWZlcmVuY2UgNDI5NDkwMTc2NQ0KKFhFTikgcHJpbnRrOiA5IG1lc3NhZ2VzIHN1 cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5j ZSA0Mjk0OTAxNzY1DQooWEVOKSBwcmludGs6IDkgbWVzc2FnZXMgc3VwcHJlc3NlZC4NCihYRU4p IGdyYW50X3RhYmxlLmM6MTcxNzpkMCBCYWQgZ3JhbnQgcmVmZXJlbmNlIDQyOTQ5MDE3NjUNCihY RU4pIHByaW50azogOSBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzox NzE3OmQwIEJhZCBncmFudCByZWZlcmVuY2UgNDI5NDkwMTc2NQ0KKFhFTikgcHJpbnRrOiAxMyBt ZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzoxNzE3OmQwIEJhZCBncmFu dCByZWZlcmVuY2UgNDI5NDkwMTc2MA0KKFhFTikgcHJpbnRrOiA5IG1lc3NhZ2VzIHN1cHByZXNz ZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0Mjk0 OTAxNzYwDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5jZSA0 Mjk0OTAxNzYwDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50IHJlZmVyZW5j ZSA0Mjk0OTAxNzY1DQooWEVOKSBwcmludGs6IDUgbWVzc2FnZXMgc3VwcHJlc3NlZC4NCihYRU4p IGdyYW50X3RhYmxlLmM6MTcxNzpkMCBCYWQgZ3JhbnQgcmVmZXJlbmNlIDQyOTQ5MDE3NjUNCihY 
RU4pIHByaW50azogNiBtZXNzYWdlcyBzdXBwcmVzc2VkLg0KKFhFTikgZ3JhbnRfdGFibGUuYzox NzE3OmQwIEJhZCBncmFudCByZWZlcmVuY2UgNDI5NDkwMTc2MA0KKFhFTikgcHJpbnRrOiA4IG1l c3NhZ2VzIHN1cHByZXNzZWQuDQooWEVOKSBncmFudF90YWJsZS5jOjE3MTc6ZDAgQmFkIGdyYW50 IHJlZmVyZW5jZSA0Mjk0OTAxNzY1

--_2bf2ab8c-b338-47ff-b127-f026606dcf02_
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--_2bf2ab8c-b338-47ff-b127-f026606dcf02_--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: RE: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Sun, 10 Apr 2011 21:57:10 +0800
Message-ID:
References: , ,
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0049315248=="
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen devel
Cc: jeremy@goop.org, keir@xen.org, ian.campbell@citrix.com, konrad.wilk@oracle.com, giamteckchoon@gmail.com, dave@ivt.com.au
List-Id: xen-devel@lists.xenproject.org

--===============0049315248==
Content-Type: multipart/alternative; boundary="_f86da273-4979-42c1-8df3-f2363d70d864_"

--_f86da273-4979-42c1-8df3-f2363d70d864_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi Konrad & Jeremy:

I think we finally located the missing patch for this commit.
We tested commit
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=c97f681f138039425c87f35ea46a92385d81e70e
which works.

We tested commit
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=221c64dbf860d37f841f40893bddf8d804aa55bd
on which the server crashed.

Later I found the comment for this commit:
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=64141da587241301ce8638cc945f8b67853156ec

So it looks like this fix has not been applied to 2.6.32.36. Could you take a look at this?

Many thanks.

=====================================================
>Hi Konrad & Jeremy:
>
>    I'd like to open this BUG in a new thread, since the old thread is too long to read easily.
>
>    We recently wanted to upgrade our kernel to 2.6.32, but unfortunately we ran into a kernel crash bug.
>Our test case is simple: start 24 win2003 HVMs on our physical machine, with each HVM rebooting
>every 15 minutes. The kernel crashes within half an hour (that is, on the VMs' second start).
>
>Our tests went much further.
>We tested different kernel versions:
>2.6.32.10 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=d945b014ac5df9592c478bf9486d97e8914aab59
>2.6.32.11 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=27f948a3bf365a5bc3d56119637a177d41147815
>2.6.32.12 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=ba739f9abd3f659b907a824af1161926b420a2ce
>2.6.32.13 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=f6fe6583b77a49b569eef1b66c3d761eec2e561b
>2.6.32.15 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=27ed1b0e0dae5f1d5da5c76451bc84cb529128bd
>2.6.32.21 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=69e50db231723596ed8ef9275d0068d6697f466a
>
>There were basically three different results.
>
>i1) grant table issue
>The host still functions, but xm dmesg shows abnormal log messages.
>Please refer to the attached grant table log.
>
>i2) kernel crash at a different place
>The host dies during the test; after reboot, we see nothing abnormal in /var/log/messages.
>
>i3) kernel BUG at arch/x86/xen/mmu.c:1872
>The host dies during the test; after reboot, we see the crash log in messages. Refer to the attached log of 2.6.32.36.
>
>The test results fall into two groups:
>
>1) 2.6.32.10
>30 machines were involved in the test: three hit issue (i1), two hit issue (i2), and *none* hit issue (i3).
>The other machines have run the tests successfully so far, for more than 8 hours.
>
>2) 2.6.32.11 or later versions
>Each version was tested on 10 machines, and all machines crashed in less than half an hour.
>
>Conclusion:
>1) The grant table issue exists in all kernel versions.
>2) The kernel crash at a different place may exist in all kernel versions, but does not happen so frequently (2 out of 30).
>3) The major difference is issue i3): from the tests, it looks like it was introduced between versions
>2.6.32.10 and 2.6.32.11.
>
>Hope this helps to locate the bug.
>Many thanks.
>
>

--_f86da273-4979-42c1-8df3-f2363d70d864_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi Konrad & Jeremy:

--_f86da273-4979-42c1-8df3-f2363d70d864_--

--===============0049315248==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0049315248==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Teck Choon Giam
Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Mon, 11 Apr 2011 04:14:45 +0800
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary=000e0cd4cc3035c3b104a0961c51
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: MaoXiaoyun
Cc: jeremy@goop.org, xen devel , keir@xen.org, ian.campbell@citrix.com, konrad.wilk@oracle.com, dave@ivt.com.au
List-Id: xen-devel@lists.xenproject.org

--000e0cd4cc3035c3b104a0961c51
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

2011/4/10 MaoXiaoyun :
> Hi Konrad & Jeremy:
>
> I think we finally located the missing patch for this commit.
> We tested commit
> http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=c97f681f138039425c87f35ea46a92385d81e70e
> which works.
>
> We tested commit
> http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=221c64dbf860d37f841f40893bddf8d804aa55bd
> on which the server crashed.
>
> Later I found the comment for this commit:
>
> http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=64141da587241301ce8638cc945f8b67853156ec
>
> So it looks like this fix has not been applied to 2.6.32.36. Could you take a look at this?
>
> Many thanks.
>
> =====================================================
>>Hi Konrad & Jeremy:
>>
>>    I'd like to open this BUG in a new thread, since the old thread is too long to read easily.
>>
>>    We recently wanted to upgrade our kernel to 2.6.32, but unfortunately we ran into a kernel crash bug.
>>Our test case is simple: start 24 win2003 HVMs on our physical machine, with each HVM rebooting
>>every 15 minutes. The kernel crashes within half an hour (that is, on the VMs' second start).
>>
>>Our tests went much further.
>>We tested different kernel versions:
>>2.6.32.10 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=d945b014ac5df9592c478bf9486d97e8914aab59
>>2.6.32.11 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=27f948a3bf365a5bc3d56119637a177d41147815
>>2.6.32.12 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=ba739f9abd3f659b907a824af1161926b420a2ce
>>2.6.32.13 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=f6fe6583b77a49b569eef1b66c3d761eec2e561b
>>2.6.32.15 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=27ed1b0e0dae5f1d5da5c76451bc84cb529128bd
>>2.6.32.21 http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=69e50db231723596ed8ef9275d0068d6697f466a
>>
>>There were basically three different results.
>>
>>i1) grant table issue
>>The host still functions, but xm dmesg shows abnormal log messages.
>>Please refer to the attached grant table log.
>>
>>i2) kernel crash at a different place
>>The host dies during the test; after reboot, we see nothing abnormal in /var/log/messages.
>>
>>i3) kernel BUG at arch/x86/xen/mmu.c:1872
>>The host dies during the test; after reboot, we see the crash log in messages. Refer to the attached log of 2.6.32.36.
>>
>>The test results fall into two groups:
>>
>>1) 2.6.32.10
>>30 machines were involved in the test: three hit issue (i1), two hit issue (i2), and *none* hit issue (i3).
>>The other machines have run the tests successfully so far, for more than 8 hours.
>>
>>2) 2.6.32.11 or later versions
>>Each version was tested on 10 machines, and all machines crashed in less than half an hour.
>>
>>Conclusion:
>>1) The grant table issue exists in all kernel versions.
>>2) The kernel crash at a different place may exist in all kernel versions, but does not happen so frequently (2 out of 30).
>>3) The major difference is issue i3): from the tests, it looks like it was introduced between versions
>>2.6.32.10 and 2.6.32.11.
>>
>>Hope this helps to locate the bug.
>>Many thanks.
>>
>>
>

Hi,

Sorry, this mmu-related BUG has been troubling me for a very long time... I really want to "kill" this BUG, but my knowledge of kernel hacking and/or Xen is very limited. While waiting for Jeremy or Konrad or others ...

Many thanks for spending time tracking down this mmu-related BUG. I have backported the commit from
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=64141da587241301ce8638cc945f8b67853156ec
to the 2.6.32.36 PVOPS kernel; the patch is attached. I don't know whether I backported it correctly, or whether it affects anything else. I am currently testing the 2.6.32.36 PVOPS kernel with this patch applied and with CONFIG_DEBUG_PAGEALLOC unset.
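
Because the reproduction depends on CONFIG_DEBUG_PAGEALLOC really being unset in the kernel under test, it may be worth confirming that before a long run. What follows is only a minimal sketch, assuming the kernel configuration is available as a standard .config-style file; the helper name kconfig_state and the sample file are hypothetical and are not part of testcrash.sh or of the attached patch.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not taken from this thread):
# report the state of a boolean/tristate option in a kernel
# .config-style file. Prints "set", "unset", or "missing".
kconfig_state() {
    option=$1
    config_file=$2
    if grep -Eq "^${option}=(y|m)\$" "$config_file"; then
        # Enabled options appear as CONFIG_FOO=y (or =m for modules).
        echo "set"
    elif grep -q "^# ${option} is not set\$" "$config_file"; then
        # Disabled options appear as a "# CONFIG_FOO is not set" comment.
        echo "unset"
    else
        # The option is not mentioned in the file at all.
        echo "missing"
    fi
}

# Demonstration on a small sample file written to a temporary location.
sample=$(mktemp)
printf '%s\n' 'CONFIG_XEN=y' '# CONFIG_DEBUG_PAGEALLOC is not set' > "$sample"
kconfig_state CONFIG_DEBUG_PAGEALLOC "$sample"
kconfig_state CONFIG_XEN "$sample"
rm -f "$sample"
```

On a real host the second argument would be the configuration of the kernel being tested; where that file lives (for example next to the kernel image under /boot) is an assumption about the local setup, not something stated in this thread.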
Currently running testcrash.sh loop 1000 as I am unable to reproduce this mmu BUG 1872 in testcrash.sh loop 100. Please note that when CONFIG_DEBUG_PAGEALLOC is unset, I can reproduce this mmu BUG 1872 easily within <50 testcrash.sh loop cycle with PVOPS version 2.6.32.24 to 2.6.32.36 kernel. Now test with this backport patch to see whether I can reproduce this mmu BUG... ... Kindest regards, Giam Teck Choon --000e0cd4cc3035c3b104a0961c51 Content-Type: text/x-patch; charset=US-ASCII; name="vmalloc__eagerly_clear_ptes_on_vunmap.patch" Content-Disposition: attachment; filename="vmalloc__eagerly_clear_ptes_on_vunmap.patch" Content-Transfer-Encoding: base64 X-Attachment-Id: f_gmceplmn0 QmFjayBwb3J0IGZyb20gY29tbWl0IGh0dHA6Ly9naXQua2VybmVsLm9yZy8/cD1saW51eC9rZXJu ZWwvZ2l0L2plcmVteS94ZW4uZ2l0O2E9Y29tbWl0O2g9NjQxNDFkYTU4NzI0MTMwMWNlODYzOGNj OTQ1ZjhiNjc4NTMxNTZlYwoKZGlmZiAtdXJOIGEvYXJjaC94ODYveGVuL21tdS5jIGIvYXJjaC94 ODYveGVuL21tdS5jCi0tLSBhL2FyY2gveDg2L3hlbi9tbXUuYwkyMDExLTAzLTMwIDA2OjE3OjQ2 LjAwMDAwMDAwMCArMDgwMAorKysgYi9hcmNoL3g4Ni94ZW4vbW11LmMJMjAxMS0wNC0xMSAwMjox Nzo1NC4wMDAwMDAwMDAgKzA4MDAKQEAgLTI0MzAsOCArMjQzMCw2IEBACiAJeDg2X2luaXQucGFn aW5nLnBhZ2V0YWJsZV9zZXR1cF9zdGFydCA9IHhlbl9wYWdldGFibGVfc2V0dXBfc3RhcnQ7CiAJ eDg2X2luaXQucGFnaW5nLnBhZ2V0YWJsZV9zZXR1cF9kb25lID0geGVuX3BhZ2V0YWJsZV9zZXR1 cF9kb25lOwogCXB2X21tdV9vcHMgPSB4ZW5fbW11X29wczsKLQotCXZtYXBfbGF6eV91bm1hcCA9 IGZhbHNlOwogfQogCiAvKiBQcm90ZWN0ZWQgYnkgeGVuX3Jlc2VydmF0aW9uX2xvY2suICovCmRp ZmYgLXVyTiBhL2luY2x1ZGUvbGludXgvdm1hbGxvYy5oIGIvaW5jbHVkZS9saW51eC92bWFsbG9j LmgKLS0tIGEvaW5jbHVkZS9saW51eC92bWFsbG9jLmgJMjAxMS0wMy0zMCAwNjoxNzo0Ni4wMDAw MDAwMDAgKzA4MDAKKysrIGIvaW5jbHVkZS9saW51eC92bWFsbG9jLmgJMjAxMS0wNC0xMSAwMjox ODo0My4wMDAwMDAwMDAgKzA4MDAKQEAgLTcsOCArNyw2IEBACiAKIHN0cnVjdCB2bV9hcmVhX3N0 cnVjdDsJCS8qIHZtYSBkZWZpbmluZyB1c2VyIG1hcHBpbmcgaW4gbW1fdHlwZXMuaCAqLwogCi1l eHRlcm4gYm9vbCB2bWFwX2xhenlfdW5tYXA7Ci0KIC8qIGJpdHMgaW4gZmxhZ3Mgb2Ygdm1hbGxv YydzIHZtX3N0cnVjdCBiZWxvdyAqLwogI2RlZmluZSBWTV9JT1JFTUFQCTB4MDAwMDAwMDEJLyog 
aW9yZW1hcCgpIGFuZCBmcmllbmRzICovCiAjZGVmaW5lIFZNX0FMTE9DCTB4MDAwMDAwMDIJLyog dm1hbGxvYygpICovCmRpZmYgLXVyTiBhL21tL3ZtYWxsb2MuYyBiL21tL3ZtYWxsb2MuYwotLS0g YS9tbS92bWFsbG9jLmMJMjAxMS0wMy0zMCAwNjoxNzo0Ni4wMDAwMDAwMDAgKzA4MDAKKysrIGIv bW0vdm1hbGxvYy5jCTIwMTEtMDQtMTEgMDI6MjU6MzguMDAwMDAwMDAwICswODAwCkBAIC0zMSw4 ICszMSw2IEBACiAjaW5jbHVkZSA8YXNtL3RsYmZsdXNoLmg+CiAjaW5jbHVkZSA8YXNtL3NobXBh cmFtLmg+CiAKLWJvb2wgdm1hcF9sYXp5X3VubWFwIF9fcmVhZF9tb3N0bHkgPSB0cnVlOwotCiAv KioqIFBhZ2UgdGFibGUgbWFuaXB1bGF0aW9uIGZ1bmN0aW9ucyAqKiovCiAKIHN0YXRpYyB2b2lk IHZ1bm1hcF9wdGVfcmFuZ2UocG1kX3QgKnBtZCwgdW5zaWduZWQgbG9uZyBhZGRyLCB1bnNpZ25l ZCBsb25nIGVuZCkKQEAgLTUwMyw5ICs1MDEsNiBAQAogewogCXVuc2lnbmVkIGludCBsb2c7CiAK LQlpZiAoIXZtYXBfbGF6eV91bm1hcCkKLQkJcmV0dXJuIDA7Ci0KIAlsb2cgPSBmbHMobnVtX29u bGluZV9jcHVzKCkpOwogCiAJcmV0dXJuIGxvZyAqICgzMlVMICogMTAyNCAqIDEwMjQgLyBQQUdF X1NJWkUpOwpAQCAtNTY2LDcgKzU2MSw2IEBACiAJCQlpZiAodmEtPnZhX2VuZCA+ICplbmQpCiAJ CQkJKmVuZCA9IHZhLT52YV9lbmQ7CiAJCQluciArPSAodmEtPnZhX2VuZCAtIHZhLT52YV9zdGFy dCkgPj4gUEFHRV9TSElGVDsKLQkJCXVubWFwX3ZtYXBfYXJlYSh2YSk7CiAJCQlsaXN0X2FkZF90 YWlsKCZ2YS0+cHVyZ2VfbGlzdCwgJnZhbGlzdCk7CiAJCQl2YS0+ZmxhZ3MgfD0gVk1fTEFaWV9G UkVFSU5HOwogCQkJdmEtPmZsYWdzICY9IH5WTV9MQVpZX0ZSRUU7CkBAIC02MTIsMTAgKzYwNiwx MSBAQAogfQogCiAvKgotICogRnJlZSBhbmQgdW5tYXAgYSB2bWFwIGFyZWEsIGNhbGxlciBlbnN1 cmluZyBmbHVzaF9jYWNoZV92dW5tYXAgaGFkIGJlZW4KLSAqIGNhbGxlZCBmb3IgdGhlIGNvcnJl Y3QgcmFuZ2UgcHJldmlvdXNseS4KKyAqIEZyZWUgYSB2bWFwIGFyZWEsIGNhbGxlciBlbnN1cmlu ZyB0aGF0IHRoZSBhcmVhIGhhcyBiZWVuIHVubWFwcGVkCisgKiBhbmQgZmx1c2hfY2FjaGVfdnVu bWFwIGhhZCBiZWVuIGNhbGxlZCBmb3IgdGhlIGNvcnJlY3QgcmFuZ2UKKyAqIHByZXZpb3VzbHku CiAgKi8KLXN0YXRpYyB2b2lkIGZyZWVfdW5tYXBfdm1hcF9hcmVhX25vZmx1c2goc3RydWN0IHZt YXBfYXJlYSAqdmEpCitzdGF0aWMgdm9pZCBmcmVlX3ZtYXBfYXJlYV9ub2ZsdXNoKHN0cnVjdCB2 bWFwX2FyZWEgKnZhKQogewogCXZhLT5mbGFncyB8PSBWTV9MQVpZX0ZSRUU7CiAJYXRvbWljX2Fk ZCgodmEtPnZhX2VuZCAtIHZhLT52YV9zdGFydCkgPj4gUEFHRV9TSElGVCwgJnZtYXBfbGF6eV9u 
cik7CkBAIC02MjQsNiArNjE5LDE2IEBACiB9CiAKIC8qCisgKiBGcmVlIGFuZCB1bm1hcCBhIHZt YXAgYXJlYSwgY2FsbGVyIGVuc3VyaW5nIGZsdXNoX2NhY2hlX3Z1bm1hcCBoYWQgYmVlbgorICog Y2FsbGVkIGZvciB0aGUgY29ycmVjdCByYW5nZSBwcmV2aW91c2x5LgorICovCitzdGF0aWMgdm9p ZCBmcmVlX3VubWFwX3ZtYXBfYXJlYV9ub2ZsdXNoKHN0cnVjdCB2bWFwX2FyZWEgKnZhKQorewor CXVubWFwX3ZtYXBfYXJlYSh2YSk7CisJZnJlZV92bWFwX2FyZWFfbm9mbHVzaCh2YSk7Cit9CisK Ky8qCiAgKiBGcmVlIGFuZCB1bm1hcCBhIHZtYXAgYXJlYQogICovCiBzdGF0aWMgdm9pZCBmcmVl X3VubWFwX3ZtYXBfYXJlYShzdHJ1Y3Qgdm1hcF9hcmVhICp2YSkKQEAgLTc5OSw3ICs4MDQsNyBA QAogCXNwaW5fdW5sb2NrKCZ2bWFwX2Jsb2NrX3RyZWVfbG9jayk7CiAJQlVHX09OKHRtcCAhPSB2 Yik7CiAKLQlmcmVlX3VubWFwX3ZtYXBfYXJlYV9ub2ZsdXNoKHZiLT52YSk7CisJZnJlZV92bWFw X2FyZWFfbm9mbHVzaCh2Yi0+dmEpOwogCWNhbGxfcmN1KCZ2Yi0+cmN1X2hlYWQsIHJjdV9mcmVl X3ZiKTsKIH0KIApAQCAtOTM2LDYgKzk0MSw4IEBACiAJcmN1X3JlYWRfdW5sb2NrKCk7CiAJQlVH X09OKCF2Yik7CiAKKwl2dW5tYXBfcGFnZV9yYW5nZSgodW5zaWduZWQgbG9uZylhZGRyLCAodW5z aWduZWQgbG9uZylhZGRyICsgc2l6ZSk7CisKIAlzcGluX2xvY2soJnZiLT5sb2NrKTsKIAlCVUdf T04oYml0bWFwX2FsbG9jYXRlX3JlZ2lvbih2Yi0+ZGlydHlfbWFwLCBvZmZzZXQgPj4gUEFHRV9T SElGVCwgb3JkZXIpKTsKIApAQCAtOTg4LDcgKzk5NSw2IEBACiAKIAkJCQlzID0gdmItPnZhLT52 YV9zdGFydCArIChpIDw8IFBBR0VfU0hJRlQpOwogCQkJCWUgPSB2Yi0+dmEtPnZhX3N0YXJ0ICsg KGogPDwgUEFHRV9TSElGVCk7Ci0JCQkJdnVubWFwX3BhZ2VfcmFuZ2UocywgZSk7CiAJCQkJZmx1 c2ggPSAxOwogCiAJCQkJaWYgKHMgPCBzdGFydCkK --000e0cd4cc3035c3b104a0961c51 Content-Type: application/x-sh; name="testcrash.sh" Content-Disposition: attachment; filename="testcrash.sh" Content-Transfer-Encoding: base64 X-Attachment-Id: f_gmceq8gr1 IyEvYmluL3NoCiMKIyBUaGlzIHNjcmlwdCBpcyB0byBjcmVhdGUgbHZtIHNuYXBzaG90LCBtb3Vu dCBpdCwgdW1vdW50IGl0IGFuZCByZW1vdmUgaW4gYQojIHNwZWNpZmllZCBudW1iZXIgb2YgbG9v cHMgdG8gdGVzdCB3aGV0aGVyIGl0IHdpbGwgY3Jhc2ggdGhlIGhvc3Qgc2VydmVyLgojIEFsbCBM Vk0gc25hcHNob3RzIGFzc3VtZWQgY2FuIGJlIG1vdW50ZWQgbGlrZSBpZiB5b3UgYXJlIHJ1bm5p bmcgYSBQViBkb21VLgojCiMgQ3JlYXRlZCBieSBHaWFtIFRlY2sgQ2hvb24KIwoKIyBUaGUgTFYg 
bmFtZSBhbmQgZm9yIHRoaXMgY2FzZSB3ZSBhcmUgdXNpbmcgdGhlIGZpcnN0IGluIHZnZGlzcGxh eSBvdXRwdXQuCiMgQ2hhbmdlIHRoZSB2YXJpYWJsZSBpZiB5b3Ugd2FudCBvdGhlciBWRyBOYW1l CkxWR3JvdXBOYW1lPWB2Z2Rpc3BsYXkgfCBncmVwICdWRyBOYW1lJyB8IGF3ayAne3ByaW50ICQz fScgfCB0YWlsIC1uIDFgCgppZiBbICEgLW4gIiRMVkdyb3VwTmFtZSIgXSAmJiBbICEgLWQgIi9k ZXYvJHtMVkdyb3VwTmFtZX0iIF0gOyB0aGVuCiAgICBlY2hvICJVbmFibGUgdG8gZGV0ZWN0IFZH IE5hbWUhIgogICAgZXhpdCAxCmZpCgojIHJldHVybiAxIGlmIGlzIG1vdW50ZWQgb3RoZXJ3aXNl IHJldHVybiAwCmNoZWNrX21vdW50KCkgewogICAgbG9jYWwgY2hlY2tkaXI9JHsxfQogICAgaWYg WyAtbiAiJGNoZWNrZGlyIiBdIDsgdGhlbgogICAgbG9jYWwgY2hlY2s9YGdyZXAgIiRjaGVja2Rp ciIgL3Byb2MvbW91bnRzYAogICAgaWYgWyAtbiAiJGNoZWNrIiBdIDsgdGhlbgoJcmV0dXJuIDEK ICAgIGZpCiAgICBmaQogICAgcmV0dXJuIDAKfQoKIyBXZSB3aWxsIGNyZWF0ZSA1IHRlc3RjcmFz aCBMViBpbiAkTFZHcm91cE5hbWUgZWFjaCB3aXRoIDVHQiBzaXplCiMgYW5kIGZvcm1hdCBpdCBh cyBleHQzCmRvX2x2bV9jcmVhdGVfdGVzdGNyYXNoKCkgewogICAgbG9jYWwgbHZuYW1lPSR7MTot dGVzdGNyYXNofQogICAgbG9jYWwgbHZzaXplPSR7MjotNUd9CiAgICBsb2NhbCBsaW1pdD0kezM6 LTV9CiAgICBsb2NhbCBjb3VudD0xCiAgICB3aGlsZSBbICIkY291bnQiIC1sZSAiJGxpbWl0IiBd CiAgICBkbwogICAgaWYgWyAhIC1oICIvZGV2LyR7TFZHcm91cE5hbWV9LyR7bHZuYW1lfSR7Y291 bnR9IiBdIDsgdGhlbgoJZWNobyAibHZjcmVhdGUgLXYgLW4gJHtsdm5hbWV9JHtjb3VudH0gLUwg JHtsdnNpemV9ICR7TFZHcm91cE5hbWV9IC4uLiAuLi4gIgoJbHZjcmVhdGUgLXYgLW4gJHtsdm5h bWV9JHtjb3VudH0gLUwgJHtsdnNpemV9ICR7TFZHcm91cE5hbWV9CgllY2hvICJsdmNyZWF0ZSAt diAtbiAke2x2bmFtZX0ke2NvdW50fSAtTCAke2x2c2l6ZX0gJHtMVkdyb3VwTmFtZX0gY29tcGxl dGVkISIKCWlmIFsgLWggIi9kZXYvJHtMVkdyb3VwTmFtZX0vJHtsdm5hbWV9JHtjb3VudH0iIF0g OyB0aGVuCgllY2hvICJta2UyZnMgLUYgLWogL2Rldi8ke0xWR3JvdXBOYW1lfS8ke2x2bmFtZX0k e2NvdW50fSAuLi4gLi4uICIKCW1rZTJmcyAtRiAtaiAvZGV2LyR7TFZHcm91cE5hbWV9LyR7bHZu YW1lfSR7Y291bnR9CgllY2hvICJta2UyZnMgLUYgLWogL2Rldi8ke0xWR3JvdXBOYW1lfS8ke2x2 bmFtZX0ke2NvdW50fSBjb21wbGV0ZWQhIgoJZWxzZQoJZWNobyAiL2Rldi8ke0xWR3JvdXBOYW1l fS8ke2x2bmFtZX0ke2NvdW50fSBub3QgZm91bmQhIgoJZmkKICAgIGZpCiAgICBjb3VudD1gZXhw 
ciAkY291bnQgKyAxYAogICAgZG9uZQp9Cgpkb19sdm1fY3JlYXRlX3JlbW92ZSgpIHsKICAgICMg bnVtYmVyIG9mIGxvb3BzIGRlZmF1bHQgaXMgMQogICAgbG9jYWwgbG9vcGNvdW50bGltaXQ9JHsx Oi0xfQogICAgIyBzbmFwc2hvdCBzaXplIGRlZmF1bHQgaXMgMUcKICAgIGxvY2FsIHNuYXBzaG90 c2l6ZT0kezI6LTFHfQogICAgIyBpbXBsZW1lbnQgYSBzbGVlcCBiZXR3ZWVuIGNyZWF0ZSwgbW91 bnQsIHVtb3VudCBhbmQgcmVtb3ZlIChkZWZhdWx0IGlzIDAgd2hpY2ggaXMgbm8gcGF1c2UpCiAg ICBsb2NhbCBwYXVzZWludGVydmFsPSR7MzotMH0KICAgICMgZXhlY3V0ZSBjb21tYW5kcyBhZnRl ciBlYWNoIHBhdXNlL3NsZWVwIHN1Y2ggYXMgc3luYyBvciBhbnl0aGluZyB0aGF0IHlvdSB3YW50 IHRvIHRlc3QKICAgIGxvY2FsIGNvbW1hbmRzPSR7NH0KICAgICMgV2UgZmlsdGVyIG91dCBzbmFw c2hvdCBhbmQgc3dhcAogICAgbG9jYWwgY291bnQ9MAogICAgaWYgWyAtZCAiL2Rldi8ke0xWR3Jv dXBOYW1lfSIgXSA7IHRoZW4KICAgIHdoaWxlIFsgIiRjb3VudCIgLWx0ICIkbG9vcGNvdW50bGlt aXQiIF0KICAgIGRvCgljb3VudD1gZXhwciAkY291bnQgKyAxYAoJZWNobyAiJHtjb3VudH0gLi4u IC4uLiAiCgllY2hvICIke2NvdW50fSBhdCBgZGF0ZWAiID4+IC90bXAvdGVzdGNyYXNoLmxvZwoJ Zm9yIGkgaW4gYGxzIC9kZXYvJHtMVkdyb3VwTmFtZX0gfCBncmVwIC1FdiAnc25hcHNob3QkJyB8 IGdyZXAgLUV2ICdzd2FwJCdgOyBkbwoJaWYgWyAtaCAiL2Rldi8ke0xWR3JvdXBOYW1lfS8ke2l9 IiBdIDsgdGhlbgoJICAgIGVjaG8gLW4gImx2Y3JlYXRlIC1zIC12IC1uICR7aX0tc25hcHNob3Qg LUwgJHtzbmFwc2hvdHNpemV9IC9kZXYvJHtMVkdyb3VwTmFtZX0vJHtpfSAuLi4gLi4uICIKCSAg ICBsdmNyZWF0ZSAtcyAtdiAtbiAke2l9LXNuYXBzaG90IC1MICR7c25hcHNob3RzaXplfSAvZGV2 LyR7TFZHcm91cE5hbWV9LyR7aX0KCSAgICBlY2hvICJkb25lLiIKCSAgICBzbGVlcCAke3BhdXNl aW50ZXJ2YWx9CgkgICAgaWYgWyAtbiAiJGNvbW1hbmRzIiBdIDsgdGhlbgoJICAgIGVjaG8gLW4g IiR7Y29tbWFuZHN9IC4uLiAuLi4gIgoJICAgICRjb21tYW5kcwoJICAgIGVjaG8gImRvbmUuIgoJ ICAgIGZpCgkgICAgbWtkaXIgLXAgL21udC90ZXN0bHZtLyR7aX0KCSAgICBpZiBbIC1oICIvZGV2 LyR7TFZHcm91cE5hbWV9LyR7aX0tc25hcHNob3QiIF0gOyB0aGVuCgkgICAgY2hlY2tfbW91bnQg L21udC90ZXN0bHZtLyR7aX0KCSAgICBsb2NhbCBpc21vdW50PSQ/CgkgICAgaWYgWyAiJGlzbW91 bnQiIC1lcSAwIF0gOyB0aGVuCgkJZWNobyAtbiAibW91bnQgL2Rldi8ke0xWR3JvdXBOYW1lfS8k e2l9LXNuYXBzaG90IC9tbnQvdGVzdGx2bS8ke2l9IC4uLiAuLi4gIgoJCW1vdW50IC9kZXYvJHtM 
Vkdyb3VwTmFtZX0vJHtpfS1zbmFwc2hvdCAvbW50L3Rlc3Rsdm0vJHtpfQoJCWVjaG8gImRvbmUu IgoJCXNsZWVwICR7cGF1c2VpbnRlcnZhbH0KCQlpZiBbIC1uICIkY29tbWFuZHMiIF0gOyB0aGVu CgkJZWNobyAtbiAiJHtjb21tYW5kc30gLi4uIC4uLiAiCgkJJGNvbW1hbmRzCgkJZWNobyAiZG9u ZS4iCgkJZmkKCSAgICBmaQoJICAgIGNoZWNrX21vdW50IC9tbnQvdGVzdGx2bS8ke2l9CgkgICAg bG9jYWwgaXNtb3VudDI9JD8KCSAgICBpZiBbICIkaXNtb3VudDIiIC1lcSAxIF0gOyB0aGVuCgkJ ZWNobyAtbiAidW1vdW50IC9tbnQvdGVzdGx2bS8ke2l9IC4uLiAuLi4gIgoJCXVtb3VudCAvbW50 L3Rlc3Rsdm0vJHtpfQoJCWVjaG8gImRvbmUuIgoJCXNsZWVwICR7cGF1c2VpbnRlcnZhbH0KCQlp ZiBbIC1uICIkY29tbWFuZHMiIF0gOyB0aGVuCgkJZWNobyAtbiAiJHtjb21tYW5kc30gLi4uIC4u LiAiCgkJJGNvbW1hbmRzCgkJZWNobyAiZG9uZS4iCgkJZmkKCSAgICBmaQoJICAgIGZpCgkgICAg cm0gLXJmIC9tbnQvdGVzdGx2bS8ke2l9CgkgICAgZWNobyAtbiAibHZyZW1vdmUgLWYgL2Rldi8k e0xWR3JvdXBOYW1lfS8ke2l9LXNuYXBzaG90IC4uLiAuLi4gIgoJICAgIGx2cmVtb3ZlIC1mIC9k ZXYvJHtMVkdyb3VwTmFtZX0vJHtpfS1zbmFwc2hvdAoJICAgIGVjaG8gImRvbmUuIgoJICAgIHNs ZWVwICR7cGF1c2VpbnRlcnZhbH0KCSAgICBpZiBbIC1uICIkY29tbWFuZHMiIF0gOyB0aGVuCgkg ICAgZWNobyAtbiAiJHtjb21tYW5kc30gLi4uIC4uLiAiCgkgICAgJGNvbW1hbmRzCgkgICAgZWNo byAiZG9uZS4iCgkgICAgZmkKCWZpCglkb25lCglybSAtZnIgL21udC90ZXN0bHZtCiAgICBkb25l CiAgICBlbHNlCiAgICBlY2hvICIvZGV2LyR7TFZHcm91cE5hbWV9IGRpcmVjdG9yeSBub3QgZm91 bmQhIgogICAgZXhpdCAxCiAgICBmaQp9CgpjYXNlICQxIGluCiAgICBzZXR1cCkgICAgc2hpZnQK ICAgIGRvX2x2bV9jcmVhdGVfdGVzdGNyYXNoICIkQCIKICAgIDs7CiAgICBsb29wKSAgICBzaGlm dAogICAgZG9fbHZtX2NyZWF0ZV9yZW1vdmUgIiRAIgogICAgOzsKICAgICopICAgIGNhdCA8PEhF TFAKVXNhZ2U6ICQwIGxvb3AgbG9vcGNvdW50bGltaXQgc25hcHNob3RzaXplIHBhdXNlaW50ZXJ2 YWwgY29tbWFuZHMKV2hlcmU6CiAgICBsb29wY291bnRsaW1pdCBpcyBkZWZhdWx0IHRvIDEKICAg IHNuYXBzaG90c2l6ZSBpcyBkZWZhdWx0IHRvIDFHCiAgICBwYXVzZWludGVydmFsIGlzIGRlZmF1 bHQgdG8gMAogICAgY29tbWFuZHMgaXMgZGVmYXVsdCB0byBub25lCgpFeGFtcGxlIHRvIHJ1biB3 aXRoIDEwMCBsb29wcyB3aXRob3V0IHBhdXNlL3NsZWVwOgogICAgJDAgbG9vcCAxMDAKCkV4YW1w bGUgdG8gcnVuIHdpdGggMTAwIGxvb3BzIHdpdGggcGF1c2Uvc2xlZXAgb2YgNSBzZWNvbmRzOgog 
ICAgJDAgbG9vcCAxMDAgMUcgNQoKRXhhbXBsZSB0byBydW4gd2l0aCAxMDAgbG9vcHMgd2l0aCBz bmFwc2hvdCBzaXplIG9mIDJHIGluc3RlYWQgb2YgMUc6CiAgICAkMCBsb29wIDEwMCAyRwoKRXhh bXBsZSB0byBydW4gd2l0aCA1MCBsb29wcywgMUcgc25hcHNob3Qgc2l6ZSwgNSBzZWNvbmRzIHBh dXNlIGFuZCB3aXRoIHN5bmM6CmNvbW1hbmQgd2l0aCBlYWNoIHBhdXNlL3NsZWVwCiAgICAkMCBs b29wIDUwIDFHIDUgc3luYwoKRXhhbXBsZSB0byBydW4geW91ciBvd24gY29tbWFuZHM6CiAgICAk MCBsb29wIDEwMCAxRyA1ICJlY2hvIGhpICYmIHN5bmMiCgpJZiB0aGlzIGlzIHRoZSBmaXJzdCB0 aW1lIHlvdSBhcmUgcnVubmluZyBhbmQgZG8gbm90IGhhdmUgYW55IExWIGluIHlvdXIgVkcsIHJ1 bjoKICAgICQwIHNldHVwClRoaXMgd2lsbCBjcmVhdGUgNSB0ZXN0Y3Jhc2ggTFYgaW4geW91ciBW RyB3aXRoIDVHQiBzaXplIGVhY2ggKGRlZmF1bHQpCgpVc2FnZSBmb3Igc2V0dXA6ClVzYWdlOiAk MCBzZXR1cCB5b3VydGVzdGNyYXNobHZuYW1lIGVhY2h0ZXN0Y3Jhc2hzaXplIG51bWJlcm9mdGVz dGNyYXNoCldoZXJlOgogICAgeW91cnRlc3RjcmFzaGx2bmFtZSBpcyBkZWZhdWx0IHRvIHRlc3Rj cmFzaAogICAgZWFjaHRlc3RjcmFzaHNpemUgaXMgZGVmYXVsdCB0byA1RwogICAgbnVtYmVyb2Z0 ZXN0Y3Jhc2ggaXMgZGVmYXVsdCB0byA1CgpJZiB5b3UgbmVlZCBtb3JlIHRlc3RjcmFzaCBMViwg ZG8gc29tZXRoaW5nIGxpa2U6CiAgICAkMCBzZXR1cCB0ZXN0Y3Jhc2ggNUcgMTAKVGhpcyB3aWxs IGNyZWF0ZSAxMCB0ZXN0Y3Jhc2ggTFYgd2l0aCA1R0Igc2l6ZSBlYWNoLgoKSWYgeW91IG5lZWQg dG8gc2V0dXAgZGlmZmVyZW50IG5hbWUgc3VjaCBhcyB0ZXN0aW5nIG90aGVyIHRoYW4gdGVzdGNy YXNoLCBkbyBzbzoKICAgICQwIHNldHVwIHRlc3RpbmcKVGhpcyB3aWxsIGNyZWF0ZSA1IHRlc3Rp bmcgTFYgd2l0aCA1R0Igc2l6ZSBlYWNoLgoKSEVMUAogICAgOzsKZXNhYwo= --000e0cd4cc3035c3b104a0961c51 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --000e0cd4cc3035c3b104a0961c51-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Teck Choon Giam Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872 Date: Mon, 11 Apr 2011 20:16:53 +0800 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Return-path: 
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: MaoXiaoyun
Cc: jeremy@goop.org, xen devel , keir@xen.org, ian.campbell@citrix.com, konrad.wilk@oracle.com, dave@ivt.com.au
List-Id: xen-devel@lists.xenproject.org

>
> Hi,
>
> Sorry, since this mmu related BUG has been troubled me for very
> long... I really want to "kill" this BUG but my knowledge in kernel
> hacking and/or xen is very limited.
>
> While waiting for Jeremy or Konrad or others ...
>
> Many thanks for spending time to track down this mmu related BUG.  I
> have backported the commit from
> http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=64141da587241301ce8638cc945f8b67853156ec
> to 2.6.32.36 PVOPS kernel and patch attached.  I won't know whether
> did I backport it correctly nor does it affects anything.  I am
> currently testing the 2.6.32.36 PVOPS kernel with this patch applied
> and also unset CONFIG_DEBUG_PAGEALLOC.  Currently running testcrash.sh
> loop 1000 as I am unable to reproduce this mmu BUG 1872 in
> testcrash.sh loop 100.  Please note that when CONFIG_DEBUG_PAGEALLOC
> is unset, I can reproduce this mmu BUG 1872 easily within <50
> testcrash.sh loop cycle with PVOPS version 2.6.32.24 to 2.6.32.36
> kernel.  Now test with this backport patch to see whether I can
> reproduce this mmu BUG... ...
>
> Kindest regards,
> Giam Teck Choon
>

I have tested my backport patch and it is working fine: I am unable to
reproduce the mmu.c 1872 or 1860 bug with CONFIG_DEBUG_PAGEALLOC not
set.  I tested with testcrash.sh loop 100 and 1000, and am now running
testcrash.sh loop 10000.

Xiaoyun, is it possible for you to test my patch and see whether you
can reproduce the mmu.c 1872/1860 bug?

Can any of you review my patch?

I will post a formatted patch according to
Documentation/SubmittingPatches in my next reply, and hopefully it can
be reviewed.

Thanks.
Kindest regards,
Giam Teck Choon

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Teck Choon Giam
Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Mon, 11 Apr 2011 20:22:30 +0800
Message-ID:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: MaoXiaoyun
Cc: jeremy@goop.org, xen devel , keir@xen.org, ian.campbell@citrix.com, konrad.wilk@oracle.com, dave@ivt.com.au
List-Id: xen-devel@lists.xenproject.org

From: Giam Teck Choon

vmalloc: eagerly clear ptes on vunmap

Backport from commit 64141da587241301ce8638cc945f8b67853156ec to 2.6.32.36

URL: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=64141da587241301ce8638cc945f8b67853156ec

Without this patch, kernel BUG at arch/x86/xen/mmu.c:1860 or kernel BUG at
arch/x86/xen/mmu.c:1872 is easily triggered when CONFIG_DEBUG_PAGEALLOC is
unset especially doing LVM snapshots.

Signed-off-by: Giam Teck Choon
---
 arch/x86/xen/mmu.c      |    2 --
 include/linux/vmalloc.h |    2 --
 mm/vmalloc.c            |   28 +++++++++++++++++-----------
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index fa36ab8..204e3ba 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -2430,8 +2430,6 @@ void __init xen_init_mmu_ops(void)
 	x86_init.paging.pagetable_setup_start = xen_pagetable_setup_start;
 	x86_init.paging.pagetable_setup_done = xen_pagetable_setup_done;
 	pv_mmu_ops = xen_mmu_ops;
-
-	vmap_lazy_unmap = false;
 }
 
 /* Protected by xen_reservation_lock. */
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 1a2ba21..3c123c3 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -7,8 +7,6 @@
 
 struct vm_area_struct;		/* vma defining user mapping in mm_types.h */
 
-extern bool vmap_lazy_unmap;
-
 /* bits in flags of vmalloc's vm_struct below */
 #define VM_IOREMAP	0x00000001	/* ioremap() and friends */
 #define VM_ALLOC	0x00000002	/* vmalloc() */
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 4f701c2..80cbd7b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -31,8 +31,6 @@
 #include <asm/tlbflush.h>
 #include <asm/shmparam.h>
 
-bool vmap_lazy_unmap __read_mostly = true;
-
 /*** Page table manipulation functions ***/
 
 static void vunmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end)
@@ -503,9 +501,6 @@ static unsigned long lazy_max_pages(void)
 {
 	unsigned int log;
 
-	if (!vmap_lazy_unmap)
-		return 0;
-
 	log = fls(num_online_cpus());
 
 	return log * (32UL * 1024 * 1024 / PAGE_SIZE);
@@ -566,7 +561,6 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
 			if (va->va_end > *end)
 				*end = va->va_end;
 			nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
-			unmap_vmap_area(va);
 			list_add_tail(&va->purge_list, &valist);
 			va->flags |= VM_LAZY_FREEING;
 			va->flags &= ~VM_LAZY_FREE;
@@ -612,10 +606,11 @@ static void purge_vmap_area_lazy(void)
 }
 
 /*
- * Free and unmap a vmap area, caller ensuring flush_cache_vunmap had been
- * called for the correct range previously.
+ * Free a vmap area, caller ensuring that the area has been unmapped
+ * and flush_cache_vunmap had been called for the correct range
+ * previously.
  */
-static void free_unmap_vmap_area_noflush(struct vmap_area *va)
+static void free_vmap_area_noflush(struct vmap_area *va)
 {
 	va->flags |= VM_LAZY_FREE;
 	atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
@@ -624,6 +619,16 @@ static void free_unmap_vmap_area_noflush(struct vmap_area *va)
 }
 
 /*
+ * Free and unmap a vmap area, caller ensuring flush_cache_vunmap had been
+ * called for the correct range previously.
+ */
+static void free_unmap_vmap_area_noflush(struct vmap_area *va)
+{
+	unmap_vmap_area(va);
+	free_vmap_area_noflush(va);
+}
+
+/*
  * Free and unmap a vmap area
  */
 static void free_unmap_vmap_area(struct vmap_area *va)
@@ -799,7 +804,7 @@ static void free_vmap_block(struct vmap_block *vb)
 	spin_unlock(&vmap_block_tree_lock);
 	BUG_ON(tmp != vb);
 
-	free_unmap_vmap_area_noflush(vb->va);
+	free_vmap_area_noflush(vb->va);
 	call_rcu(&vb->rcu_head, rcu_free_vb);
 }
 
@@ -936,6 +941,8 @@ static void vb_free(const void *addr, unsigned long size)
 	rcu_read_unlock();
 	BUG_ON(!vb);
 
+	vunmap_page_range((unsigned long)addr, (unsigned long)addr + size);
+
 	spin_lock(&vb->lock);
 	BUG_ON(bitmap_allocate_region(vb->dirty_map, offset >> PAGE_SHIFT, order));
 
@@ -988,7 +995,6 @@ void vm_unmap_aliases(void)
 
 				s = vb->va->va_start + (i << PAGE_SHIFT);
 				e = vb->va->va_start + (j << PAGE_SHIFT);
-				vunmap_page_range(s, e);
 				flush = 1;
 
 				if (s < start)

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: RE: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Mon, 11 Apr 2011 20:31:36 +0800
Message-ID:
References: , , , , ,
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="_54687869-ea4f-46fd-9a8a-3db53e822f7a_"
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: giamteckchoon@gmail.com
Cc: jeremy@goop.org, xen devel , keir@xen.org, ian.campbell@citrix.com, konrad.wilk@oracle.com, dave@ivt.com.au
List-Id: xen-devel@lists.xenproject.org
--_54687869-ea4f-46fd-9a8a-3db53e822f7a_
Content-Type: multipart/alternative; boundary="_3608759d-e999-418a-bec5-2df48ed33540_"

--_3608759d-e999-418a-bec5-2df48ed33540_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi:

I believe this fixes the problem to a large extent.
With this patch applied, my own test case succeeds over a 30-round run,
where every round takes 8 hours. Without this patch, the tests fail in
every round within 15 minutes.

So this really does fix most of it.

But during the run I hit another crash. From the log, it looks related to
this BUG, since the crash log shows it is TLB related and this BUG is also
TLB related.

I also have poor knowledge of the kernel.
I hope someone from Xen Devel can offer some help.

Many thanks.

> Date: Mon, 11 Apr 2011 20:16:53 +0800
> Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
> From: giamteckchoon@gmail.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; dave@ivt.com.au; ian.campbell@citrix.com; konrad.wilk@oracle.com; jeremy@goop.org; keir@xen.org
>
> >
> > Hi,
> >
> > Sorry, since this mmu related BUG has been troubled me for very
> > long... I really want to "kill" this BUG but my knowledge in kernel
> > hacking and/or xen is very limited.
> >
> > While waiting for Jeremy or Konrad or others ...
> >
> > Many thanks for spending time to track down this mmu related BUG.  I
> > have backported the commit from
> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=64141da587241301ce8638cc945f8b67853156ec
> > to 2.6.32.36 PVOPS kernel and patch attached.  I won't know whether
> > did I backport it correctly nor does it affects anything.  I am
> > currently testing the 2.6.32.36 PVOPS kernel with this patch applied
> > and also unset CONFIG_DEBUG_PAGEALLOC.  Currently running testcrash.sh
> > loop 1000 as I am unable to reproduce this mmu BUG 1872 in
> > testcrash.sh loop 100.  Please note that when CONFIG_DEBUG_PAGEALLOC
> > is unset, I can reproduce this mmu BUG 1872 easily within <50
> > testcrash.sh loop cycle with PVOPS version 2.6.32.24 to 2.6.32.36
> > kernel.  Now test with this backport patch to see whether I can
> > reproduce this mmu BUG... ...
> >
> > Kindest regards,
> > Giam Teck Choon
> >
>
> I have tested my backport patch and it is working fine: I am unable to
> reproduce the mmu.c 1872 or 1860 bug with CONFIG_DEBUG_PAGEALLOC not
> set.  I tested with testcrash.sh loop 100 and 1000, and am now running
> testcrash.sh loop 10000.
>
> Xiaoyun, is it possible for you to test my patch and see whether you
> can reproduce the mmu.c 1872/1860 bug?
>
> Can any of you review my patch?
>
> I will post a formatted patch according to
> Documentation/SubmittingPatches in my next reply, and hopefully it can
> be reviewed.
>
> Thanks.
>
> Kindest regards,
> Giam Teck Choon
= --_3608759d-e999-418a-bec5-2df48ed33540_-- --_54687869-ea4f-46fd-9a8a-3db53e822f7a_ Content-Type: text/plain Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="196.22.txt" X3JhdGVsaW1pdDogNjIgY2FsbGJhY2tzIHN1cHByZXNzZWQNCmJsa3RhcF9zeXNmc19jcmVhdGU6 IGFkZGluZyBhdHRyaWJ1dGVzIGZvciBkZXYgZmZmZjg4MDA5YmRiNjAwMA0KYmxrdGFwX3N5c2Zz X2NyZWF0ZTogYWRkaW5nIGF0dHJpYnV0ZXMgZm9yIGRldiBmZmZmODgwMDliZGIyMjAwDQpJTklU OiBJZCAiczAiIHJlc3Bhd25pbmcgdG9vIGZhc3Q6IGRpc2FibGVkIGZvciA1IG1pbnV0ZXMNCl9f cmF0ZWxpbWl0OiAxNCBjYWxsYmFja3Mgc3VwcHJlc3NlZA0KYmxrdGFwX3N5c2ZzX2Rlc3Ryb3kN CmJsa3RhcF9zeXNmc19kZXN0cm95DQotLS0tLS0tLS0tLS1bIGN1dCBoZXJlIF0tLS0tLS0tLS0t LS0NCmtlcm5lbCBCVUcgYXQgYXJjaC94ODYvbW0vdGxiLmM6NjEhDQppbnZhbGlkIG9wY29kZTog MDAwMCBbIzFdIFNNUCANCmxhc3Qgc3lzZnMgZmlsZTogL3N5cy9kZXZpY2VzL3N5c3RlbS94ZW5f bWVtb3J5L3hlbl9tZW1vcnkwL2luZm8vY3VycmVudF9rYg0KQ1BVIDEgDQpNb2R1bGVzIGxpbmtl ZCBpbjogODAyMXEgZ2FycCB4ZW5fbmV0YmFjayB4ZW5fYmxrYmFjayBibGt0YXAgYmxrYmFja19w YWdlbWFwIG5iZCBicmlkZ2Ugc3RwIGxsYyBhdXRvZnM0IGlwbWlfZGV2aW50ZiBpcG1pX3NpIGlw bWlfbXNnaGFuZGxlciBsb2NrZCBzdW5ycGMgYm9uZGluZyBpcHY2IHhlbmZzIGRtX211bHRpcGF0 aCB2aWRlbyBvdXRwdXQgc2JzIHNic2hjIHBhcnBvcnRfcGMgbHAgcGFycG9ydCBzZXMgZW5jbG9z dXJlIHNuZF9zZXFfZHVtbXkgc25kX3NlcV9vc3Mgc25kX3NlcV9taWRpX2V2ZW50IHNuZF9zZXEg c25kX3NlcV9kZXZpY2Ugc2VyaW9fcmF3IGJueDIgc25kX3BjbV9vc3Mgc25kX21peGVyX29zcyBz bmRfcGNtIHNuZF90aW1lciBpVENPX3dkdCBzbmQgc291bmRjb3JlIHNuZF9wYWdlX2FsbG9jIGky Y19pODAxIGlUQ09fdmVuZG9yX3N1cHBvcnQgaTJjX2NvcmUgcGNzcGtyIHBhdGFfYWNwaSBhdGFf Z2VuZXJpYyBhdGFfcGlpeCBzaHBjaHAgbXB0c2FzIG1wdHNjc2loIG1wdGJhc2UgW2xhc3QgdW5s b2FkZWQ6IGZyZXFfdGFibGVdDQpQaWQ6IDI1NTgxLCBjb21tOiBraGVscGVyIE5vdCB0YWludGVk IDIuNi4zMi4zNmZpeHhlbiAjMSBUZWNhbCBSSDIyODUgICAgICAgICAgDQpSSVA6IGUwMzA6Wzxm ZmZmZmZmZjgxMDNhM2NiPl0gIFs8ZmZmZmZmZmY4MTAzYTNjYj5dIGxlYXZlX21tKzB4MTUvMHg0 Ng0KUlNQOiBlMDJiOmZmZmY4ODAwMjgwNWJlNDggIEVGTEFHUzogMDAwMTAwNDYNClJBWDogMDAw MDAwMDAwMDAwMDAwMCBSQlg6IDAwMDAwMDAwMDAwMDAwMDEgUkNYOiBmZmZmODgwMTVmOGUyZGEw 
DQpSRFg6IGZmZmY4ODAwMjgwNWJlNzggUlNJOiAwMDAwMDAwMDAwMDAwMDAwIFJESTogMDAwMDAw MDAwMDAwMDAwMQ0KUkJQOiBmZmZmODgwMDI4MDViZTQ4IFIwODogZmZmZjg4MDA5ZDY2MjAwMCBS MDk6IGRlYWQwMDAwMDAyMDAyMDANClIxMDogZGVhZDAwMDAwMDEwMDEwMCBSMTE6IGZmZmZmZmZm ODE0NDcyYjIgUjEyOiBmZmZmODgwMDliZmMxODgwDQpSMTM6IGZmZmY4ODAwMjgwNjMwMjAgUjE0 OiAwMDAwMDAwMDAwMDAwNGY2IFIxNTogMDAwMDAwMDAwMDAwMDAwMA0KRlM6ICAwMDAwN2Y2MjM2 MmQ2NmUwKDAwMDApIEdTOmZmZmY4ODAwMjgwNTgwMDAoMDAwMCkga25sR1M6MDAwMDAwMDAwMDAw MDAwMA0KQ1M6ICBlMDMzIERTOiAwMDAwIEVTOiAwMDAwIENSMDogMDAwMDAwMDA4MDA1MDAzYg0K Q1IyOiAwMDAwMDAzYWFiYzExOTA5IENSMzogMDAwMDAwMDA5YjhjYTAwMCBDUjQ6IDAwMDAwMDAw MDAwMDI2NjANCkRSMDogMDAwMDAwMDAwMDAwMDAwMCBEUjE6IDAwMDAwMDAwMDAwMDAwMDAgRFIy OiAwMDAwMDAwMDAwMDAwMDAwDQpEUjM6IDAwMDAwMDAwMDAwMDAwMDAgRFI2OiAwMDAwMDAwMGZm ZmYwZmYwIERSNzogMDAwMDAwMDAwMDAwMDQwMA0KUHJvY2VzcyBraGVscGVyIChwaWQ6IDI1NTgx LCB0aHJlYWRpbmZvIGZmZmY4ODAwNzY5MWUwMDAsIHRhc2sgZmZmZjg4MDA5YjkyZGI0MCkNClN0 YWNrOg0KIGZmZmY4ODAwMjgwNWJlNjggZmZmZmZmZmY4MTAwZTRhZSAwMDAwMDAwMDAwMDAwMDAx IGZmZmY4ODAwOWQ3MzNiODgNCjwwPiBmZmZmODgwMDI4MDViZTk4IGZmZmZmZmZmODEwODcyMjQg ZmZmZjg4MDAyODA1YmU3OCBmZmZmODgwMDI4MDViZTc4DQo8MD4gZmZmZjg4MDE1ZjgwODM2MCAw MDAwMDAwMDAwMDAwNGY2IGZmZmY4ODAwMjgwNWJlYTggZmZmZmZmZmY4MTAxMDEwOA0KQ2FsbCBU cmFjZToNCiA8SVJRPiANCiBbPGZmZmZmZmZmODEwMGU0YWU+XSBkcm9wX290aGVyX21tX3JlZisw eDJhLzB4NTMNCiBbPGZmZmZmZmZmODEwODcyMjQ+XSBnZW5lcmljX3NtcF9jYWxsX2Z1bmN0aW9u X3NpbmdsZV9pbnRlcnJ1cHQrMHhkOC8weGZjDQogWzxmZmZmZmZmZjgxMDEwMTA4Pl0geGVuX2Nh bGxfZnVuY3Rpb25fc2luZ2xlX2ludGVycnVwdCsweDEzLzB4MjgNCiBbPGZmZmZmZmZmODEwYTkz NmE+XSBoYW5kbGVfSVJRX2V2ZW50KzB4NjYvMHgxMjANCiBbPGZmZmZmZmZmODEwYWFjNWI+XSBo YW5kbGVfcGVyY3B1X2lycSsweDQxLzB4NmUNCiBbPGZmZmZmZmZmODEyOGMxYzA+XSBfX3hlbl9l dnRjaG5fZG9fdXBjYWxsKzB4MWFiLzB4MjdkDQogWzxmZmZmZmZmZjgxMjhkZDExPl0geGVuX2V2 dGNobl9kb191cGNhbGwrMHgzMy8weDQ2DQogWzxmZmZmZmZmZjgxMDEzZWZlPl0geGVuX2RvX2h5 cGVydmlzb3JfY2FsbGJhY2srMHgxZS8weDMwDQogPEVPST4gDQogWzxmZmZmZmZmZjgxNDQ3MmIy 
Pl0gPyBfc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSsweDE1LzB4MTcNCiBbPGZmZmZmZmZmODEwMGY4 Y2Y+XSA/IHhlbl9yZXN0b3JlX2ZsX2RpcmVjdF9lbmQrMHgwLzB4MQ0KIFs8ZmZmZmZmZmY4MTEx M2Y3MT5dID8gZmx1c2hfb2xkX2V4ZWMrMHgzYWMvMHg1MDANCiBbPGZmZmZmZmZmODExNTBkYzU+ XSA/IGxvYWRfZWxmX2JpbmFyeSsweDAvMHgxN2VmDQogWzxmZmZmZmZmZjgxMTUwZGM1Pl0gPyBs b2FkX2VsZl9iaW5hcnkrMHgwLzB4MTdlZg0KIFs8ZmZmZmZmZmY4MTE1MTE1ZD5dID8gbG9hZF9l bGZfYmluYXJ5KzB4Mzk4LzB4MTdlZg0KIFs8ZmZmZmZmZmY4MTA0MmZjZj5dID8gbmVlZF9yZXNj aGVkKzB4MjMvMHgyZA0KIFs8ZmZmZmZmZmY4MTFmNDY0OD5dID8gcHJvY2Vzc19tZWFzdXJlbWVu dCsweGMwLzB4ZDcNCiBbPGZmZmZmZmZmODExNTBkYzU+XSA/IGxvYWRfZWxmX2JpbmFyeSsweDAv MHgxN2VmDQogWzxmZmZmZmZmZjgxMTEzMDk0Pl0gPyBzZWFyY2hfYmluYXJ5X2hhbmRsZXIrMHhj OC8weDI1NQ0KIFs8ZmZmZmZmZmY4MTExNDM2Mj5dID8gZG9fZXhlY3ZlKzB4MWMzLzB4MjllDQog WzxmZmZmZmZmZjgxMDExNTVkPl0gPyBzeXNfZXhlY3ZlKzB4NDMvMHg1ZA0KIFs8ZmZmZmZmZmY4 MTA2ZmM0NT5dID8gX19jYWxsX3VzZXJtb2RlaGVscGVyKzB4MC8weDZmDQogWzxmZmZmZmZmZjgx MDEzZTI4Pl0gPyBrZXJuZWxfZXhlY3ZlKzB4NjgvMHhkMA0KIFs8ZmZmZmZmZmY4MTA2ZmM0NT5d ID8gX19jYWxsX3VzZXJtb2RlaGVscGVyKzB4MC8weDZmDQogWzxmZmZmZmZmZjgxMDBmOGNmPl0g PyB4ZW5fcmVzdG9yZV9mbF9kaXJlY3RfZW5kKzB4MC8weDENCiBbPGZmZmZmZmZmODEwNmZiNjQ+ XSA/IF9fX19jYWxsX3VzZXJtb2RlaGVscGVyKzB4MTEzLzB4MTFlDQogWzxmZmZmZmZmZjgxMDEz ZGFhPl0gPyBjaGlsZF9yaXArMHhhLzB4MjANCiBbPGZmZmZmZmZmODEwNmZjNDU+XSA/IF9fY2Fs bF91c2VybW9kZWhlbHBlcisweDAvMHg2Zg0KIFs8ZmZmZmZmZmY4MTAxMmY5MT5dID8gaW50X3Jl dF9mcm9tX3N5c19jYWxsKzB4Ny8weDFiDQogWzxmZmZmZmZmZjgxMDEzNzFkPl0gPyByZXRpbnRf cmVzdG9yZV9hcmdzKzB4NS8weDYNCiBbPGZmZmZmZmZmODEwMTNkYTA+XSA/IGNoaWxkX3JpcCsw eDAvMHgyMA0KQ29kZTogNDEgNWUgNDEgNWYgYzkgYzMgNTUgNDggODkgZTUgMGYgMWYgNDQgMDAg MDAgZTggMTcgZmYgZmYgZmYgYzkgYzMgNTUgNDggODkgZTUgMGYgMWYgNDQgMDAgMDAgNjUgOGIg MDQgMjUgYzggNTUgMDEgMDAgZmYgYzggNzUgMDQgPDBmPiAwYiBlYiBmZSA2NSA0OCA4YiAzNCAy NSBjMCA1NSAwMSAwMCA0OCA4MSBjNiBiOCAwMiAwMCAwMCBlOCANClJJUCAgWzxmZmZmZmZmZjgx MDNhM2NiPl0gbGVhdmVfbW0rMHgxNS8weDQ2DQogUlNQIDxmZmZmODgwMDI4MDViZTQ4Pg0KLS0t 
WyBlbmQgdHJhY2UgY2U5Y2VlNjgzMmE5YzUwMyBdLS0tDQpLZXJuZWwgcGFuaWMgLSBub3Qgc3lu Y2luZzogRmF0YWwgZXhjZXB0aW9uIGluIGludGVycnVwdA0KUGlkOiAyNTU4MSwgY29tbToga2hl bHBlciBUYWludGVkOiBHICAgICAgRCAgICAyLjYuMzIuMzZmaXh4ZW4gIzENCkNhbGwgVHJhY2U6 DQogPElSUT4gIFs8ZmZmZmZmZmY4MTA1NjgyZT5dIHBhbmljKzB4ZTAvMHgxOWENCiBbPGZmZmZm ZmZmODE0NDAwOGE+XSA/IGluaXRfYW1kKzB4Mjk2LzB4MzdhDQogWzxmZmZmZmZmZjgxMDBmMTdk Pl0gPyB4ZW5fZm9yY2VfZXZ0Y2huX2NhbGxiYWNrKzB4ZC8weGYNCiBbPGZmZmZmZmZmODEwMGY4 ZTI+XSA/IGNoZWNrX2V2ZW50cysweDEyLzB4MjANCiBbPGZmZmZmZmZmODEwMGY4Y2Y+XSA/IHhl bl9yZXN0b3JlX2ZsX2RpcmVjdF9lbmQrMHgwLzB4MQ0KIFs8ZmZmZmZmZmY4MTA1NjQ4Nz5dID8g cHJpbnRfb29wc19lbmRfbWFya2VyKzB4MjMvMHgyNQ0KIFs8ZmZmZmZmZmY4MTQ0ODE4NT5dIG9v cHNfZW5kKzB4YjYvMHhjNg0KIFs8ZmZmZmZmZmY4MTAxNjZlNT5dIGRpZSsweDVhLzB4NjMNCiBb PGZmZmZmZmZmODE0NDdhNWM+XSBkb190cmFwKzB4MTE1LzB4MTI0DQogWzxmZmZmZmZmZjgxMDE0 OGU2Pl0gZG9faW52YWxpZF9vcCsweDljLzB4YTUNCiBbPGZmZmZmZmZmODEwM2EzY2I+XSA/IGxl YXZlX21tKzB4MTUvMHg0Ng0KIFs8ZmZmZmZmZmY4MTAwZjZmYT5dID8geGVuX2Nsb2Nrc291cmNl X3JlYWQrMHgyMS8weDIzDQogWzxmZmZmZmZmZjgxMDBmMjZjPl0gPyBIWVBFUlZJU09SX3ZjcHVf b3ArMHhmLzB4MTENCiBbPGZmZmZmZmZmODEwMGY3Njc+XSA/IHhlbl92Y3B1b3Bfc2V0X25leHRf ZXZlbnQrMHg1Mi8weDY3DQogWzxmZmZmZmZmZjgxMDgwYmZhPl0gPyBjbG9ja2V2ZW50c19wcm9n cmFtX2V2ZW50KzB4NzgvMHg4MQ0KIFs8ZmZmZmZmZmY4MTAxM2IzYj5dIGludmFsaWRfb3ArMHgx Yi8weDIwDQogWzxmZmZmZmZmZjgxNDQ3MmIyPl0gPyBfc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSsw eDE1LzB4MTcNCiBbPGZmZmZmZmZmODEwM2EzY2I+XSA/IGxlYXZlX21tKzB4MTUvMHg0Ng0KIFs8 ZmZmZmZmZmY4MTAwZTRhZT5dIGRyb3Bfb3RoZXJfbW1fcmVmKzB4MmEvMHg1Mw0KIFs8ZmZmZmZm ZmY4MTA4NzIyND5dIGdlbmVyaWNfc21wX2NhbGxfZnVuY3Rpb25fc2luZ2xlX2ludGVycnVwdCsw eGQ4LzB4ZmMNCiBbPGZmZmZmZmZmODEwMTAxMDg+XSB4ZW5fY2FsbF9mdW5jdGlvbl9zaW5nbGVf aW50ZXJydXB0KzB4MTMvMHgyOA0KIFs8ZmZmZmZmZmY4MTBhOTM2YT5dIGhhbmRsZV9JUlFfZXZl bnQrMHg2Ni8weDEyMA0KIFs8ZmZmZmZmZmY4MTBhYWM1Yj5dIGhhbmRsZV9wZXJjcHVfaXJxKzB4 NDEvMHg2ZQ0KIFs8ZmZmZmZmZmY4MTI4YzFjMD5dIF9feGVuX2V2dGNobl9kb191cGNhbGwrMHgx 
YWIvMHgyN2QNCiBbPGZmZmZmZmZmODEyOGRkMTE+XSB4ZW5fZXZ0Y2huX2RvX3VwY2FsbCsweDMz LzB4NDYNCiBbPGZmZmZmZmZmODEwMTNlZmU+XSB4ZW5fZG9faHlwZXJ2aXNvcl9jYWxsYmFjaysw eDFlLzB4MzANCiA8RU9JPiAgWzxmZmZmZmZmZjgxNDQ3MmIyPl0gPyBfc3Bpbl91bmxvY2tfaXJx cmVzdG9yZSsweDE1LzB4MTcNCiBbPGZmZmZmZmZmODEwMGY4Y2Y+XSA/IHhlbl9yZXN0b3JlX2Zs X2RpcmVjdF9lbmQrMHgwLzB4MQ0KIFs8ZmZmZmZmZmY4MTExM2Y3MT5dID8gZmx1c2hfb2xkX2V4 ZWMrMHgzYWMvMHg1MDANCiBbPGZmZmZmZmZmODExNTBkYzU+XSA/IGxvYWRfZWxmX2JpbmFyeSsw eDAvMHgxN2VmDQogWzxmZmZmZmZmZjgxMTUwZGM1Pl0gPyBsb2FkX2VsZl9iaW5hcnkrMHgwLzB4 MTdlZg0KIFs8ZmZmZmZmZmY4MTE1MTE1ZD5dID8gbG9hZF9lbGZfYmluYXJ5KzB4Mzk4LzB4MTdl Zg0KIFs8ZmZmZmZmZmY4MTA0MmZjZj5dID8gbmVlZF9yZXNjaGVkKzB4MjMvMHgyZA0KIFs8ZmZm ZmZmZmY4MTFmNDY0OD5dID8gcHJvY2Vzc19tZWFzdXJlbWVudCsweGMwLzB4ZDcNCiBbPGZmZmZm ZmZmODExNTBkYzU+XSA/IGxvYWRfZWxmX2JpbmFyeSsweDAvMHgxN2VmDQogWzxmZmZmZmZmZjgx MTEzMDk0Pl0gPyBzZWFyY2hfYmluYXJ5X2hhbmRsZXIrMHhjOC8weDI1NQ0KIFs8ZmZmZmZmZmY4 MTExNDM2Mj5dID8gZG9fZXhlY3ZlKzB4MWMzLzB4MjllDQogWzxmZmZmZmZmZjgxMDExNTVkPl0g PyBzeXNfZXhlY3ZlKzB4NDMvMHg1ZA0KIFs8ZmZmZmZmZmY4MTA2ZmM0NT5dID8gX19jYWxsX3Vz ZXJtb2RlaGVscGVyKzB4MC8weDZmDQogWzxmZmZmZmZmZjgxMDEzZTI4Pl0gPyBrZXJuZWxfZXhl Y3ZlKzB4NjgvMHhkMA0KIFs8ZmZmZmZmZmY4MTA2ZmM0NT5dID8gX19jYWxsX3VzZXJtb2RlaGVs cGVyKzB4MC8weDZmDQogWzxmZmZmZmZmZjgxMDBmOGNmPl0gPyB4ZW5fcmVzdG9yZV9mbF9kaXJl Y3RfZW5kKzB4MC8weDENCiBbPGZmZmZmZmZmODEwNmZiNjQ+XSA/IF9fX19jYWxsX3VzZXJtb2Rl aGVscGVyKzB4MTEzLzB4MTFlDQogWzxmZmZmZmZmZjgxMDEzZGFhPl0gPyBjaGlsZF9yaXArMHhh LzB4MjANCiBbPGZmZmZmZmZmODEwNmZjNDU+XSA/IF9fY2FsbF91c2VybW9kZWhlbHBlcisweDAv MHg2Zg0KIFs8ZmZmZmZmZmY4MTAxMmY5MT5dID8gaW50X3JldF9mcm9tX3N5c19jYWxsKzB4Ny8w eDFiDQogWzxmZmZmZmZmZjgxMDEzNzFkPl0gPyByZXRpbnRfcmVzdG9yZV9hcmdzKzB4NS8weDYN CiBbPGZmZmZmZmZmODEwMTNkYTA+XSA/IGNoaWxkX3JpcCsweDAvMHgyMA0KKFhFTikgRG9tYWlu IDAgY3Jhc2hlZDogJ25vcmVib290JyBzZXQgLSBub3QgcmVib290aW5nLg== --_54687869-ea4f-46fd-9a8a-3db53e822f7a_ Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline 
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--_54687869-ea4f-46fd-9a8a-3db53e822f7a_--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Teck Choon Giam
Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Mon, 11 Apr 2011 23:25:19 +0800
Message-ID:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: MaoXiaoyun
Cc: jeremy@goop.org, xen devel , keir@xen.org, ian.campbell@citrix.com, konrad.wilk@oracle.com, dave@ivt.com.au
List-Id: xen-devel@lists.xenproject.org

2011/4/11 MaoXiaoyun :
> Hi:
>
>      I believe this fixes the bug to a large extent.
>      I have my own test cases, and with this patch they succeed
> across 30 rounds.
>      Every round takes 8 hours. Without this patch, the tests fail
> every round within 15 minutes.
>
>      So this really does fix most of the problem.
>
>      But during the run I met another crash. From the log it looks
> related to this BUG, since the crash log shows it is TLB related and
> this BUG is also TLB related.

Are you able to run another test with cpuidle=0 cpufreq=none in the kernel
boot options? I am just curious whether you can reproduce the TLB bug when
you boot with cpuidle=0 cpufreq=none...

>
>      Well, I also have poor knowledge of the kernel.
>      I hope someone from Xen Devel can offer some help.
>
>      Many thanks.
>
>> Date: Mon, 11 Apr 2011 20:16:53 +0800
>> Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
>> From: giamteckchoon@gmail.com
>> To: tinnycloud@hotmail.com
>> CC: xen-devel@lists.xensource.com; dave@ivt.com.au;
>> ian.campbell@citrix.com; konrad.wilk@oracle.com; jeremy@goop.org;
>> keir@xen.org
>>
>> >
>> > Hi,
>> >
>> > Sorry, this mmu-related BUG has been troubling me for a very
>> > long time... I really want to "kill" this BUG, but my knowledge of
>> > kernel hacking and/or Xen is very limited.
>> >
>> > While waiting for Jeremy or Konrad or others ...
>> >
>> > Many thanks for spending time to track down this mmu-related BUG.  I
>> > have backported the commit from
>> >
>> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=64141da587241301ce8638cc945f8b67853156ec
>> > to the 2.6.32.36 PVOPS kernel; the patch is attached.  I do not know
>> > whether I backported it correctly or whether it affects anything else.
>> > I am currently testing the 2.6.32.36 PVOPS kernel with this patch applied
>> > and with CONFIG_DEBUG_PAGEALLOC unset.  I am currently running
>> > testcrash.sh loop 1000, as I was unable to reproduce this mmu BUG 1872 in
>> > testcrash.sh loop 100.  Please note that when CONFIG_DEBUG_PAGEALLOC
>> > is unset, I can reproduce this mmu BUG 1872 easily within <50
>> > testcrash.sh loop cycles with PVOPS kernel versions 2.6.32.24 to
>> > 2.6.32.36.  Now I am testing with this backport patch to see whether I
>> > can reproduce this mmu BUG...
>> >
>> > Kindest regards,
>> > Giam Teck Choon
>> >
>>
>> I have tested with my backport patch and it is working fine: I am
>> unable to reproduce the mmu.c 1872 or 1860 bug with
>> CONFIG_DEBUG_PAGEALLOC not set. I tested with testcrash.sh loop 100
>> and 1000, and am now running testcrash.sh loop 10000.
>>
>> Xiaoyun, is it possible for you to test my patch and see whether
>> you can reproduce the mmu.c 1872/1860 bug?
>>
>> Can any of you review my patch?
>>
>> I will post a formatted patch according to
>> Documentation/SubmittingPatches in my next reply, and hopefully it can be
>> reviewed.
>>
>> Thanks.
>>
>> Kindest regards,
>> Giam Teck Choon
>

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Mon, 11 Apr 2011 11:08:10 -0700
Message-ID: <4DA3438A.6070503@goop.org>
References: , , , , ,
Mime-Version: 1.0
Content-Type: text/plain; charset=GB2312
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: MaoXiaoyun
Cc: xen devel , keir@xen.org, ian.campbell@citrix.com, konrad.wilk@oracle.com, giamteckchoon@gmail.com, dave@ivt.com.au
List-Id: xen-devel@lists.xenproject.org

On 04/11/2011 05:31 AM, MaoXiaoyun wrote:
> Hi:
>
> I believe this fixes the bug to a large extent.
> I have my own test cases, and with this patch they succeed
> across 30 rounds.
> Every round takes 8 hours. Without this patch, the tests fail
> every round within 15 minutes.
>
> So this really does fix most of the problem.
>
> But during the run I met another crash. From the log it looks
> related to this BUG, since the crash log shows it is TLB related
> and this BUG is also TLB related.
>
> Well, I also have poor knowledge of the kernel.
> I hope someone from Xen Devel can offer some help.

Thanks for confirming; it makes sense and explains the symptoms, so I'm
glad it also works ;)

    J

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: RE: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Tue, 12 Apr 2011 11:30:32 +0800
Message-ID:
References: , , , , , , ,
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0441027735=="
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: giamteckchoon@gmail.com
Cc: jeremy@goop.org, xen devel , keir@xen.org, ian.campbell@citrix.com, konrad.wilk@oracle.com, dave@ivt.com.au
List-Id: xen-devel@lists.xenproject.org

--===============0441027735==
Content-Type: multipart/alternative; boundary="_a9fa7ebe-dc6a-4555-a398-7de33bf973b5_"

--_a9fa7ebe-dc6a-4555-a398-7de33bf973b5_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi:

     I have just kicked off the cpuidle=0 cpufreq=none tests.

     What is your Xen version? Do you use the backend driver from 2.6.32.36?

     Besides the "TLB BUG", I've met at least two other issues:
     1) Xen 4.0.1 + 2.6.32.36 kernel + backend driver from 2.6.31 ==> causes "Bad grant reference" logs in the serial output
     2) Xen 4.0.1 + 2.6.32.36 kernel with its own backend driver ==> causes disk errors like the ones below.

sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
sd 0:0:0:0: rejecting I/O to offline device
end_request: I/O error, dev tdb, sector 28699593
end_request: I/O error, dev tdb, sector 28699673
end_request: I/O error, dev tdb, sector 28699753
end_request: I/O error, dev tdb, sector 28699833
end_request: I/O error, dev tdb, sector 28699913
end_request: I/O error, dev tdb, sector 28699993
end_request: I/O error, dev tdb, sector 28700073

Thanks.


> Date: Mon, 11 Apr 2011 23:25:19 +0800
> Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
> From: giamteckchoon@gmail.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; dave@ivt.com.au; ian.campbell@citrix.com; konrad.wilk@oracle.com; jeremy@goop.org; keir@xen.org
>
> 2011/4/11 MaoXiaoyun :
> > Hi:
> >
> >      I believe this fixes the bug to a large extent.
> >      I have my own test cases, and with this patch they succeed
> > across 30 rounds.
> >      Every round takes 8 hours. Without this patch, the tests fail
> > every round within 15 minutes.
> >
> >      So this really does fix most of the problem.
> >
> >      But during the run I met another crash. From the log it looks
> > related to this BUG, since the crash log shows it is TLB related and
> > this BUG is also TLB related.
>
> Are you able to run another test with cpuidle=0 cpufreq=none in the kernel
> boot options? I am just curious whether you can reproduce the TLB bug when
> you boot with cpuidle=0 cpufreq=none...
>
> >
> >      Well, I also have poor knowledge of the kernel.
> >      I hope someone from Xen Devel can offer some help.
> >
> >      Many thanks.
> >
> >> Date: Mon, 11 Apr 2011 20:16:53 +0800
> >> Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
> >> From: giamteckchoon@gmail.com
> >> To: tinnycloud@hotmail.com
> >> CC: xen-devel@lists.xensource.com; dave@ivt.com.au;
> >> ian.campbell@citrix.com; konrad.wilk@oracle.com; jeremy@goop.org;
> >> keir@xen.org
> >>
> >> >
> >> > Hi,
> >> >
> >> > Sorry, this mmu-related BUG has been troubling me for a very
> >> > long time... I really want to "kill" this BUG, but my knowledge of
> >> > kernel hacking and/or Xen is very limited.
> >> >
> >> > While waiting for Jeremy or Konrad or others ...
> >> >
> >> > Many thanks for spending time to track down this mmu-related BUG.  I
> >> > have backported the commit from
> >> >
> >> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=64141da587241301ce8638cc945f8b67853156ec
> >> > to the 2.6.32.36 PVOPS kernel; the patch is attached.  I do not know
> >> > whether I backported it correctly or whether it affects anything else.
> >> > I am currently testing the 2.6.32.36 PVOPS kernel with this patch applied
> >> > and with CONFIG_DEBUG_PAGEALLOC unset.  I am currently running
> >> > testcrash.sh loop 1000, as I was unable to reproduce this mmu BUG 1872 in
> >> > testcrash.sh loop 100.  Please note that when CONFIG_DEBUG_PAGEALLOC
> >> > is unset, I can reproduce this mmu BUG 1872 easily within <50
> >> > testcrash.sh loop cycles with PVOPS kernel versions 2.6.32.24 to
> >> > 2.6.32.36.  Now I am testing with this backport patch to see whether I
> >> > can reproduce this mmu BUG...
> >> >
> >> > Kindest regards,
> >> > Giam Teck Choon
> >> >
> >>
> >> I have tested with my backport patch and it is working fine: I am
> >> unable to reproduce the mmu.c 1872 or 1860 bug with
> >> CONFIG_DEBUG_PAGEALLOC not set. I tested with testcrash.sh loop 100
> >> and 1000, and am now running testcrash.sh loop 10000.
> >>
> >> Xiaoyun, is it possible for you to test my patch and see whether
> >> you can reproduce the mmu.c 1872/1860 bug?
> >>
> >> Can any of you review my patch?
> >>
> >> I will post a formatted patch according to
> >> Documentation/SubmittingPatches in my next reply, and hopefully it can be
> >> reviewed.
> >>
> >> Thanks.
> >>
> >> Kindest regards,
> >> Giam Teck Choon
> >

--_a9fa7ebe-dc6a-4555-a398-7de33bf973b5_--

--===============0441027735==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0441027735==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: RE: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Tue, 12 Apr 2011 11:35:40 +0800
Message-ID:
References: , , , , , , <4DA3438A.6070503@goop.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="_90dd77a0-f2a4-471f-9c3c-f26f5eefc718_"
Return-path:
In-Reply-To: <4DA3438A.6070503@goop.org>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: jeremy@goop.org
Cc: xen devel , giamteckchoon@gmail.com, ian.campbell@citrix.com, dave@ivt.com.au, konrad.wilk@oracle.com
List-Id: xen-devel@lists.xenproject.org

--_90dd77a0-f2a4-471f-9c3c-f26f5eefc718_
Content-Type: multipart/alternative; boundary="_23c52b92-1f3f-4839-8ab7-e84beb85df13_"

--_23c52b92-1f3f-4839-8ab7-e84beb85df13_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Thanks for your reply and confirmation.

Well, what is your opinion of the TLB bug?
Is it related to this patch, or is it a new bug?

Attached is the new log. In tests on 28 machines, one crashed.

> Date: Mon, 11 Apr 2011 11:08:10 -0700
> From: jeremy@goop.org
> To: tinnycloud@hotmail.com
> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; dave@ivt.com.au; ian.campbell@citrix.com; konrad.wilk@oracle.com; keir@xen.org
> Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
>
> On 04/11/2011 05:31 AM, MaoXiaoyun wrote:
> > Hi:
> >
> > I believe this fixes the bug to a large extent.
> > I have my own test cases, and with this patch they succeed
> > across 30 rounds.
> > Every round takes 8 hours. Without this patch, the tests fail
> > every round within 15 minutes.
> >
> > So this really does fix most of the problem.
> >
> > But during the run I met another crash. From the log it looks
> > related to this BUG, since the crash log shows it is TLB related
> > and this BUG is also TLB related.
> >
> > Well, I also have poor knowledge of the kernel.
> > I hope someone from Xen Devel can offer some help.
>
> Thanks for confirming; it makes sense and explains the symptoms, so I'm
> glad it also works ;)
>
>
> J

--_23c52b92-1f3f-4839-8ab7-e84beb85df13_-- --_90dd77a0-f2a4-471f-9c3c-f26f5eefc718_ Content-Type: text/plain Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="195.31.txt" X3JhdGVsaW1pdDogNjIgY2FsbGJhY2tzIHN1cHByZXNzZWQNCmJsa3RhcF9zeXNmc19jcmVhdGU6 IGFkZGluZyBhdHRyaWJ1dGVzIGZvciBkZXYgZmZmZjg4MDA5YmRiNjAwMA0KYmxrdGFwX3N5c2Zz X2NyZWF0ZTogYWRkaW5nIGF0dHJpYnV0ZXMgZm9yIGRldiBmZmZmODgwMDliZGIyMjAwDQpJTklU OiBJZCAiczAiIHJlc3Bhd25pbmcgdG9vIGZhc3Q6IGRpc2FibGVkIGZvciA1IG1pbnV0ZXMNCl9f cmF0ZWxpbWl0OiAxNCBjYWxsYmFja3Mgc3VwcHJlc3NlZA0KYmxrdGFwX3N5c2ZzX2Rlc3Ryb3kN CmJsa3RhcF9zeXNmc19kZXN0cm95DQotLS0tLS0tLS0tLS1bIGN1dCBoZXJlIF0tLS0tLS0tLS0t LS0NCmtlcm5lbCBCVUcgYXQgYXJjaC94ODYvbW0vdGxiLmM6NjEhDQppbnZhbGlkIG9wY29kZTog MDAwMCBbIzFdIFNNUCANCmxhc3Qgc3lzZnMgZmlsZTogL3N5cy9kZXZpY2VzL3N5c3RlbS94ZW5f bWVtb3J5L3hlbl9tZW1vcnkwL2luZm8vY3VycmVudF9rYg0KQ1BVIDEgDQpNb2R1bGVzIGxpbmtl ZCBpbjogODAyMXEgZ2FycCB4ZW5fbmV0YmFjayB4ZW5fYmxrYmFjayBibGt0YXAgYmxrYmFja19w YWdlbWFwIG5iZCBicmlkZ2Ugc3RwIGxsYyBhdXRvZnM0IGlwbWlfZGV2aW50ZiBpcG1pX3NpIGlw bWlfbXNnaGFuZGxlciBsb2NrZCBzdW5ycGMgYm9uZGluZyBpcHY2IHhlbmZzIGRtX211bHRpcGF0 aCB2aWRlbyBvdXRwdXQgc2JzIHNic2hjIHBhcnBvcnRfcGMgbHAgcGFycG9ydCBzZXMgZW5jbG9z dXJlIHNuZF9zZXFfZHVtbXkgc25kX3NlcV9vc3Mgc25kX3NlcV9taWRpX2V2ZW50IHNuZF9zZXEg c25kX3NlcV9kZXZpY2Ugc2VyaW9fcmF3IGJueDIgc25kX3BjbV9vc3Mgc25kX21peGVyX29zcyBz bmRfcGNtIHNuZF90aW1lciBpVENPX3dkdCBzbmQgc291bmRjb3JlIHNuZF9wYWdlX2FsbG9jIGky Y19pODAxIGlUQ09fdmVuZG9yX3N1cHBvcnQgaTJjX2NvcmUgcGNzcGtyIHBhdGFfYWNwaSBhdGFf Z2VuZXJpYyBhdGFfcGlpeCBzaHBjaHAgbXB0c2FzIG1wdHNjc2loIG1wdGJhc2UgW2xhc3QgdW5s b2FkZWQ6IGZyZXFfdGFibGVdDQpQaWQ6IDI1NTgxLCBjb21tOiBraGVscGVyIE5vdCB0YWludGVk IDIuNi4zMi4zNmZpeHhlbiAjMSBUZWNhbCBSSDIyODUgICAgICAgICAgDQpSSVA6IGUwMzA6Wzxm ZmZmZmZmZjgxMDNhM2NiPl0gIFs8ZmZmZmZmZmY4MTAzYTNjYj5dIGxlYXZlX21tKzB4MTUvMHg0 Ng0KUlNQOiBlMDJiOmZmZmY4ODAwMjgwNWJlNDggIEVGTEFHUzogMDAwMTAwNDYNClJBWDogMDAw MDAwMDAwMDAwMDAwMCBSQlg6IDAwMDAwMDAwMDAwMDAwMDEgUkNYOiBmZmZmODgwMTVmOGUyZGEw 
DQpSRFg6IGZmZmY4ODAwMjgwNWJlNzggUlNJOiAwMDAwMDAwMDAwMDAwMDAwIFJESTogMDAwMDAw MDAwMDAwMDAwMQ0KUkJQOiBmZmZmODgwMDI4MDViZTQ4IFIwODogZmZmZjg4MDA5ZDY2MjAwMCBS MDk6IGRlYWQwMDAwMDAyMDAyMDANClIxMDogZGVhZDAwMDAwMDEwMDEwMCBSMTE6IGZmZmZmZmZm ODE0NDcyYjIgUjEyOiBmZmZmODgwMDliZmMxODgwDQpSMTM6IGZmZmY4ODAwMjgwNjMwMjAgUjE0 OiAwMDAwMDAwMDAwMDAwNGY2IFIxNTogMDAwMDAwMDAwMDAwMDAwMA0KRlM6ICAwMDAwN2Y2MjM2 MmQ2NmUwKDAwMDApIEdTOmZmZmY4ODAwMjgwNTgwMDAoMDAwMCkga25sR1M6MDAwMDAwMDAwMDAw MDAwMA0KQ1M6ICBlMDMzIERTOiAwMDAwIEVTOiAwMDAwIENSMDogMDAwMDAwMDA4MDA1MDAzYg0K Q1IyOiAwMDAwMDAzYWFiYzExOTA5IENSMzogMDAwMDAwMDA5YjhjYTAwMCBDUjQ6IDAwMDAwMDAw MDAwMDI2NjANCkRSMDogMDAwMDAwMDAwMDAwMDAwMCBEUjE6IDAwMDAwMDAwMDAwMDAwMDAgRFIy OiAwMDAwMDAwMDAwMDAwMDAwDQpEUjM6IDAwMDAwMDAwMDAwMDAwMDAgRFI2OiAwMDAwMDAwMGZm ZmYwZmYwIERSNzogMDAwMDAwMDAwMDAwMDQwMA0KUHJvY2VzcyBraGVscGVyIChwaWQ6IDI1NTgx LCB0aHJlYWRpbmZvIGZmZmY4ODAwNzY5MWUwMDAsIHRhc2sgZmZmZjg4MDA5YjkyZGI0MCkNClN0 YWNrOg0KIGZmZmY4ODAwMjgwNWJlNjggZmZmZmZmZmY4MTAwZTRhZSAwMDAwMDAwMDAwMDAwMDAx IGZmZmY4ODAwOWQ3MzNiODgNCjwwPiBmZmZmODgwMDI4MDViZTk4IGZmZmZmZmZmODEwODcyMjQg ZmZmZjg4MDAyODA1YmU3OCBmZmZmODgwMDI4MDViZTc4DQo8MD4gZmZmZjg4MDE1ZjgwODM2MCAw MDAwMDAwMDAwMDAwNGY2IGZmZmY4ODAwMjgwNWJlYTggZmZmZmZmZmY4MTAxMDEwOA0KQ2FsbCBU cmFjZToNCiA8SVJRPiANCiBbPGZmZmZmZmZmODEwMGU0YWU+XSBkcm9wX290aGVyX21tX3JlZisw eDJhLzB4NTMNCiBbPGZmZmZmZmZmODEwODcyMjQ+XSBnZW5lcmljX3NtcF9jYWxsX2Z1bmN0aW9u X3NpbmdsZV9pbnRlcnJ1cHQrMHhkOC8weGZjDQogWzxmZmZmZmZmZjgxMDEwMTA4Pl0geGVuX2Nh bGxfZnVuY3Rpb25fc2luZ2xlX2ludGVycnVwdCsweDEzLzB4MjgNCiBbPGZmZmZmZmZmODEwYTkz NmE+XSBoYW5kbGVfSVJRX2V2ZW50KzB4NjYvMHgxMjANCiBbPGZmZmZmZmZmODEwYWFjNWI+XSBo YW5kbGVfcGVyY3B1X2lycSsweDQxLzB4NmUNCiBbPGZmZmZmZmZmODEyOGMxYzA+XSBfX3hlbl9l dnRjaG5fZG9fdXBjYWxsKzB4MWFiLzB4MjdkDQogWzxmZmZmZmZmZjgxMjhkZDExPl0geGVuX2V2 dGNobl9kb191cGNhbGwrMHgzMy8weDQ2DQogWzxmZmZmZmZmZjgxMDEzZWZlPl0geGVuX2RvX2h5 cGVydmlzb3JfY2FsbGJhY2srMHgxZS8weDMwDQogPEVPST4gDQogWzxmZmZmZmZmZjgxNDQ3MmIy 
Pl0gPyBfc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSsweDE1LzB4MTcNCiBbPGZmZmZmZmZmODEwMGY4 Y2Y+XSA/IHhlbl9yZXN0b3JlX2ZsX2RpcmVjdF9lbmQrMHgwLzB4MQ0KIFs8ZmZmZmZmZmY4MTEx M2Y3MT5dID8gZmx1c2hfb2xkX2V4ZWMrMHgzYWMvMHg1MDANCiBbPGZmZmZmZmZmODExNTBkYzU+ XSA/IGxvYWRfZWxmX2JpbmFyeSsweDAvMHgxN2VmDQogWzxmZmZmZmZmZjgxMTUwZGM1Pl0gPyBs b2FkX2VsZl9iaW5hcnkrMHgwLzB4MTdlZg0KIFs8ZmZmZmZmZmY4MTE1MTE1ZD5dID8gbG9hZF9l bGZfYmluYXJ5KzB4Mzk4LzB4MTdlZg0KIFs8ZmZmZmZmZmY4MTA0MmZjZj5dID8gbmVlZF9yZXNj aGVkKzB4MjMvMHgyZA0KIFs8ZmZmZmZmZmY4MTFmNDY0OD5dID8gcHJvY2Vzc19tZWFzdXJlbWVu dCsweGMwLzB4ZDcNCiBbPGZmZmZmZmZmODExNTBkYzU+XSA/IGxvYWRfZWxmX2JpbmFyeSsweDAv MHgxN2VmDQogWzxmZmZmZmZmZjgxMTEzMDk0Pl0gPyBzZWFyY2hfYmluYXJ5X2hhbmRsZXIrMHhj OC8weDI1NQ0KIFs8ZmZmZmZmZmY4MTExNDM2Mj5dID8gZG9fZXhlY3ZlKzB4MWMzLzB4MjllDQog WzxmZmZmZmZmZjgxMDExNTVkPl0gPyBzeXNfZXhlY3ZlKzB4NDMvMHg1ZA0KIFs8ZmZmZmZmZmY4 MTA2ZmM0NT5dID8gX19jYWxsX3VzZXJtb2RlaGVscGVyKzB4MC8weDZmDQogWzxmZmZmZmZmZjgx MDEzZTI4Pl0gPyBrZXJuZWxfZXhlY3ZlKzB4NjgvMHhkMA0KIFs8ZmZmZmZmZmY4MTA2ZmM0NT5d ID8gX19jYWxsX3VzZXJtb2RlaGVscGVyKzB4MC8weDZmDQogWzxmZmZmZmZmZjgxMDBmOGNmPl0g PyB4ZW5fcmVzdG9yZV9mbF9kaXJlY3RfZW5kKzB4MC8weDENCiBbPGZmZmZmZmZmODEwNmZiNjQ+ XSA/IF9fX19jYWxsX3VzZXJtb2RlaGVscGVyKzB4MTEzLzB4MTFlDQogWzxmZmZmZmZmZjgxMDEz ZGFhPl0gPyBjaGlsZF9yaXArMHhhLzB4MjANCiBbPGZmZmZmZmZmODEwNmZjNDU+XSA/IF9fY2Fs bF91c2VybW9kZWhlbHBlcisweDAvMHg2Zg0KIFs8ZmZmZmZmZmY4MTAxMmY5MT5dID8gaW50X3Jl dF9mcm9tX3N5c19jYWxsKzB4Ny8weDFiDQogWzxmZmZmZmZmZjgxMDEzNzFkPl0gPyByZXRpbnRf cmVzdG9yZV9hcmdzKzB4NS8weDYNCiBbPGZmZmZmZmZmODEwMTNkYTA+XSA/IGNoaWxkX3JpcCsw eDAvMHgyMA0KQ29kZTogNDEgNWUgNDEgNWYgYzkgYzMgNTUgNDggODkgZTUgMGYgMWYgNDQgMDAg MDAgZTggMTcgZmYgZmYgZmYgYzkgYzMgNTUgNDggODkgZTUgMGYgMWYgNDQgMDAgMDAgNjUgOGIg MDQgMjUgYzggNTUgMDEgMDAgZmYgYzggNzUgMDQgPDBmPiAwYiBlYiBmZSA2NSA0OCA4YiAzNCAy NSBjMCA1NSAwMSAwMCA0OCA4MSBjNiBiOCAwMiAwMCAwMCBlOCANClJJUCAgWzxmZmZmZmZmZjgx MDNhM2NiPl0gbGVhdmVfbW0rMHgxNS8weDQ2DQogUlNQIDxmZmZmODgwMDI4MDViZTQ4Pg0KLS0t 
WyBlbmQgdHJhY2UgY2U5Y2VlNjgzMmE5YzUwMyBdLS0tDQpLZXJuZWwgcGFuaWMgLSBub3Qgc3lu Y2luZzogRmF0YWwgZXhjZXB0aW9uIGluIGludGVycnVwdA0KUGlkOiAyNTU4MSwgY29tbToga2hl bHBlciBUYWludGVkOiBHICAgICAgRCAgICAyLjYuMzIuMzZmaXh4ZW4gIzENCkNhbGwgVHJhY2U6 DQogPElSUT4gIFs8ZmZmZmZmZmY4MTA1NjgyZT5dIHBhbmljKzB4ZTAvMHgxOWENCiBbPGZmZmZm ZmZmODE0NDAwOGE+XSA/IGluaXRfYW1kKzB4Mjk2LzB4MzdhDQogWzxmZmZmZmZmZjgxMDBmMTdk Pl0gPyB4ZW5fZm9yY2VfZXZ0Y2huX2NhbGxiYWNrKzB4ZC8weGYNCiBbPGZmZmZmZmZmODEwMGY4 ZTI+XSA/IGNoZWNrX2V2ZW50cysweDEyLzB4MjANCiBbPGZmZmZmZmZmODEwMGY4Y2Y+XSA/IHhl bl9yZXN0b3JlX2ZsX2RpcmVjdF9lbmQrMHgwLzB4MQ0KIFs8ZmZmZmZmZmY4MTA1NjQ4Nz5dID8g cHJpbnRfb29wc19lbmRfbWFya2VyKzB4MjMvMHgyNQ0KIFs8ZmZmZmZmZmY4MTQ0ODE4NT5dIG9v cHNfZW5kKzB4YjYvMHhjNg0KIFs8ZmZmZmZmZmY4MTAxNjZlNT5dIGRpZSsweDVhLzB4NjMNCiBb PGZmZmZmZmZmODE0NDdhNWM+XSBkb190cmFwKzB4MTE1LzB4MTI0DQogWzxmZmZmZmZmZjgxMDE0 OGU2Pl0gZG9faW52YWxpZF9vcCsweDljLzB4YTUNCiBbPGZmZmZmZmZmODEwM2EzY2I+XSA/IGxl YXZlX21tKzB4MTUvMHg0Ng0KIFs8ZmZmZmZmZmY4MTAwZjZmYT5dID8geGVuX2Nsb2Nrc291cmNl X3JlYWQrMHgyMS8weDIzDQogWzxmZmZmZmZmZjgxMDBmMjZjPl0gPyBIWVBFUlZJU09SX3ZjcHVf b3ArMHhmLzB4MTENCiBbPGZmZmZmZmZmODEwMGY3Njc+XSA/IHhlbl92Y3B1b3Bfc2V0X25leHRf ZXZlbnQrMHg1Mi8weDY3DQogWzxmZmZmZmZmZjgxMDgwYmZhPl0gPyBjbG9ja2V2ZW50c19wcm9n cmFtX2V2ZW50KzB4NzgvMHg4MQ0KIFs8ZmZmZmZmZmY4MTAxM2IzYj5dIGludmFsaWRfb3ArMHgx Yi8weDIwDQogWzxmZmZmZmZmZjgxNDQ3MmIyPl0gPyBfc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSsw eDE1LzB4MTcNCiBbPGZmZmZmZmZmODEwM2EzY2I+XSA/IGxlYXZlX21tKzB4MTUvMHg0Ng0KIFs8 ZmZmZmZmZmY4MTAwZTRhZT5dIGRyb3Bfb3RoZXJfbW1fcmVmKzB4MmEvMHg1Mw0KIFs8ZmZmZmZm ZmY4MTA4NzIyND5dIGdlbmVyaWNfc21wX2NhbGxfZnVuY3Rpb25fc2luZ2xlX2ludGVycnVwdCsw eGQ4LzB4ZmMNCiBbPGZmZmZmZmZmODEwMTAxMDg+XSB4ZW5fY2FsbF9mdW5jdGlvbl9zaW5nbGVf aW50ZXJydXB0KzB4MTMvMHgyOA0KIFs8ZmZmZmZmZmY4MTBhOTM2YT5dIGhhbmRsZV9JUlFfZXZl bnQrMHg2Ni8weDEyMA0KIFs8ZmZmZmZmZmY4MTBhYWM1Yj5dIGhhbmRsZV9wZXJjcHVfaXJxKzB4 NDEvMHg2ZQ0KIFs8ZmZmZmZmZmY4MTI4YzFjMD5dIF9feGVuX2V2dGNobl9kb191cGNhbGwrMHgx 
YWIvMHgyN2QNCiBbPGZmZmZmZmZmODEyOGRkMTE+XSB4ZW5fZXZ0Y2huX2RvX3VwY2FsbCsweDMz LzB4NDYNCiBbPGZmZmZmZmZmODEwMTNlZmU+XSB4ZW5fZG9faHlwZXJ2aXNvcl9jYWxsYmFjaysw eDFlLzB4MzANCiA8RU9JPiAgWzxmZmZmZmZmZjgxNDQ3MmIyPl0gPyBfc3Bpbl91bmxvY2tfaXJx cmVzdG9yZSsweDE1LzB4MTcNCiBbPGZmZmZmZmZmODEwMGY4Y2Y+XSA/IHhlbl9yZXN0b3JlX2Zs X2RpcmVjdF9lbmQrMHgwLzB4MQ0KIFs8ZmZmZmZmZmY4MTExM2Y3MT5dID8gZmx1c2hfb2xkX2V4 ZWMrMHgzYWMvMHg1MDANCiBbPGZmZmZmZmZmODExNTBkYzU+XSA/IGxvYWRfZWxmX2JpbmFyeSsw eDAvMHgxN2VmDQogWzxmZmZmZmZmZjgxMTUwZGM1Pl0gPyBsb2FkX2VsZl9iaW5hcnkrMHgwLzB4 MTdlZg0KIFs8ZmZmZmZmZmY4MTE1MTE1ZD5dID8gbG9hZF9lbGZfYmluYXJ5KzB4Mzk4LzB4MTdl Zg0KIFs8ZmZmZmZmZmY4MTA0MmZjZj5dID8gbmVlZF9yZXNjaGVkKzB4MjMvMHgyZA0KIFs8ZmZm ZmZmZmY4MTFmNDY0OD5dID8gcHJvY2Vzc19tZWFzdXJlbWVudCsweGMwLzB4ZDcNCiBbPGZmZmZm ZmZmODExNTBkYzU+XSA/IGxvYWRfZWxmX2JpbmFyeSsweDAvMHgxN2VmDQogWzxmZmZmZmZmZjgx MTEzMDk0Pl0gPyBzZWFyY2hfYmluYXJ5X2hhbmRsZXIrMHhjOC8weDI1NQ0KIFs8ZmZmZmZmZmY4 MTExNDM2Mj5dID8gZG9fZXhlY3ZlKzB4MWMzLzB4MjllDQogWzxmZmZmZmZmZjgxMDExNTVkPl0g PyBzeXNfZXhlY3ZlKzB4NDMvMHg1ZA0KIFs8ZmZmZmZmZmY4MTA2ZmM0NT5dID8gX19jYWxsX3Vz ZXJtb2RlaGVscGVyKzB4MC8weDZmDQogWzxmZmZmZmZmZjgxMDEzZTI4Pl0gPyBrZXJuZWxfZXhl Y3ZlKzB4NjgvMHhkMA0KIFs8ZmZmZmZmZmY4MTA2ZmM0NT5dID8gX19jYWxsX3VzZXJtb2RlaGVs cGVyKzB4MC8weDZmDQogWzxmZmZmZmZmZjgxMDBmOGNmPl0gPyB4ZW5fcmVzdG9yZV9mbF9kaXJl Y3RfZW5kKzB4MC8weDENCiBbPGZmZmZmZmZmODEwNmZiNjQ+XSA/IF9fX19jYWxsX3VzZXJtb2Rl aGVscGVyKzB4MTEzLzB4MTFlDQogWzxmZmZmZmZmZjgxMDEzZGFhPl0gPyBjaGlsZF9yaXArMHhh LzB4MjANCiBbPGZmZmZmZmZmODEwNmZjNDU+XSA/IF9fY2FsbF91c2VybW9kZWhlbHBlcisweDAv MHg2Zg0KIFs8ZmZmZmZmZmY4MTAxMmY5MT5dID8gaW50X3JldF9mcm9tX3N5c19jYWxsKzB4Ny8w eDFiDQogWzxmZmZmZmZmZjgxMDEzNzFkPl0gPyByZXRpbnRfcmVzdG9yZV9hcmdzKzB4NS8weDYN CiBbPGZmZmZmZmZmODEwMTNkYTA+XSA/IGNoaWxkX3JpcCsweDAvMHgyMA0KKFhFTikgRG9tYWlu IDAgY3Jhc2hlZDogJ25vcmVib290JyBzZXQgLSBub3QgcmVib290aW5nLg== --_90dd77a0-f2a4-471f-9c3c-f26f5eefc718_ Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline 
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--_90dd77a0-f2a4-471f-9c3c-f26f5eefc718_--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: Grant Table Error on 2.6.32.36 + Xen 4.0.1
Date: Tue, 12 Apr 2011 14:48:36 +0800
Message-ID:
References: , , , , , , , <4DA3438A.6070503@goop.org>,
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1643582353=="
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen devel
Cc: tim.deegan@citrix.com, george.dunlap@eu.citrix.com, giamteckchoon@gmail.com, ian.campbell@citrix.com, keir.fraser@eu.citrix.com
List-Id: xen-devel@lists.xenproject.org

--===============1643582353==
Content-Type: multipart/alternative; boundary="_f0e64f58-f51c-4221-aeda-fa26d58f6a48_"

--_f0e64f58-f51c-4221-aeda-fa26d58f6a48_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi:

    We are just about to try the new kernel, but we are running into grant table errors.

    2.6.32.36 kernel: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
    Xen 4.0.1: http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183

    Our test is simple: 24 HVMs (Win2003) on a single host, each rebooting in a loop every 15 minutes.
    Please refer to the error log from the serial output below.

    I've traced the log messages a bit; they come from xen/common/grant_table.c.

    1) The log "grant_table.c:1717:d0 Bad grant reference 4294965983" is from

    1715     if ( unlikely(gref >= nr_grant_entries(rd->grant_table)) ){
    1716         PIN_FAIL(unlock_out, GNTST_bad_gntref,
    1717                  "Bad grant reference %ld\n", gref);
    1718         BUG();
    1719     }

    2) The log "grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)" is from

    grant_table.c:1967 => __acquire_grant_for_copy => _set_status

    (not from __gnttab_map_grant_ref; I added some logging to confirm this)

The log shows that all of these come from gnttab_copy, and I later found that only netback issues the grant copy hypercall.

I also tried the netback code from 2.6.31 (which works well with kernel 2.6.31), but still hit these errors.
So it looks like the problem is kernel related.

What causes this, and is it harmful to the HVM guests?

Many thanks.

=-=====================================
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 168kB init memory.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 3 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 3 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 3 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 1 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 1 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 17 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 13 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 11 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 11 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 10 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 2 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). 
(expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 6 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 7 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 2 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 2 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 10 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). 
(expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 15 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 7 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 7 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 7 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. 
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 8 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 15 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 7 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 29 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 25 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 25 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 19 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 27 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 27 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 7 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 5 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 10 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 7 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). 
(expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 2 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 15 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 7 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 2 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 8 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). 
(expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 8 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 9 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 2 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 7 messages suppressed. 
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 5 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 2 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 2 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:1717:d0 Bad grant reference 4294965983 (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 3 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) printk: 1 messages suppressed. (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) (XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). 
(expected dom 0) (XEN) grant_table.c:578:d0 Iomem mapping not permitted ffffffffffffffff (= domain 137) (XEN) grant_table.c:578:d0 Iomem mapping not permitted ffffffffffffffff (= domain 137) (XEN) grant_table.c:578:d0 Iomem mapping not permitted ffffffffffffffff (= domain 137) (XEN) grant_table.c:578:d0 Iomem mapping not permitted ffffffffffffffff (= domain 137) (XEN) grant_table.c:578:d0 Iomem mapping not permitted ffffffffffffffff (= domain 137) =20 --_f0e64f58-f51c-4221-aeda-fa26d58f6a48_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable Hi:
 
      We are just about to= try the new Kernel, but confront Error on grant table.
       
     2.6.32.36 Kernel: http://git.kernel.org/?p=3Dlinux/kernel/git/jerem= y/xen.git;a=3Dcommit;h=3Dbb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
     Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.= 0-testing.hg/rev/b536ebfba183
       
    Our test is simple, 24 HVMS(Win2003 ) &= nbsp;on a single host, each HVM loopes in restart every 15minutes.     Please refer to error log from serial output
            &= nbsp; 
    I've traced the log a bit, and the log is from xe= n/common/grant_table.c
 
1) log " grant_table.c:1717:d0 Bad grant reference 4294965983 " if f= rom

1715     if ( unlikely(gref >=3D nr_grant_entries(= rd->grant_table)) ){
1716       =   PIN_FAIL(unlock_out, GNTST_bad_gntref,
1717   &n= bsp;           &nb= sp;  "Bad grant reference %ld\n", gref);
1718   &n= bsp;     BUG();
1719     }
 
2) log "grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0) "= is from
 
  grant_table.c:1967 =3D>  __acquire_grant_for_copy  =3D= > _set_status
 
 ( not from __gnttab_map_grant_ref, since I add some log to identify= this )

The log shows that all are from gnttab_copy, which I later= found only netback
has grant copy hypercall.
 
I also tried netback code from 2.6.31(which works well with kernel 2.6.31= ), but
still met these errors. So it looks like it is kernel related.
 
What happened for this, will this harmful for the usage of HVM?
 
Many thanks.
 
 =3D-=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
(XEN) Xen trace buffe= rs: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest L= oglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) Xen is reli= nquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a= ' three times to switch input to Xen)
(XEN) Freed 168kB init memory.(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) gra= nt_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:= 1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad= grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant refe= rence 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 42949= 65983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(X= EN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_t= able.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717= :d0 Bad grant reference 4294965983
(XEN)=20 printk: 3 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant = reference 4294965983
(XEN) printk: 3 messages suppressed.
(XEN) gra= nt_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 3 mess= ages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 42949= 65983
(XEN) printk: 1 messages suppressed.
(XEN) grant_table.c:1717= :d0 Bad grant reference 4294965983
(XEN) printk: 1 messages suppressed= .
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) = grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 1 m= essages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 42= 94965983
(XEN) printk: 1 messages suppressed.
(XEN) grant_table.c:1= 717:d0 Bad grant reference 4294965983
(XEN) printk: 1 messages suppres= sed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XE= N) printk: 1 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags= (0) or dom (0). (expected dom 0)
(XEN) printk: 3 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags = (0) or dom (0). (expected dom 0)
(XEN) printk: 17 messages suppressed.=
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)=
(XEN) printk: 13 messages suppressed.
(XEN) grant_table.c:266:d0 B= ad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 11 messages su= ppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983(XEN) printk: 11 messages suppressed.
(XEN) grant_table.c:266:d0 Bad= flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad= flags (0) or dom (0). (expected dom 0)
(XEN) printk: 10 messages supp= ressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected= dom 0)
(XEN) printk: 3 messages suppressed.
(XEN) grant_table.c:26= 6:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:26= 6:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 2 messag= es suppressed.
(XEN) grant_table.c:266: d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266= :d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:171= 7:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad gr= ant reference 4294965983
(XEN) grant_table.c:266:d0 Bad flags (0) or d= om (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or d= om (0). (expected dom 0)
(XEN) grant_table.c:1717:d0 Bad grant referen= ce 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 42949659= 83
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom = 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom = 0)
(XEN) printk: 6 messages suppressed.
(XEN) grant_table.c:266:d0 = Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 7 messages su= ppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expect= ed dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expect= ed dom 0)
(XEN) printk: 2 messages supp ressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expecte= d dom 0)
(XEN) printk: 3 messages suppressed.
(XEN) grant_table.c:2= 66:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:2= 66:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 2 messa= ges suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (= expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (= expected dom 0)
(XEN) grant_table.c:1717:d0 Bad grant reference 429496= 5983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XE= N) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XE= N) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XE= N) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_ta= ble.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:266:d= 0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:1717:= d0 Bad grant reference 4294965983
(XEN) printk: 10 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags= (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags= (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:1717:d0 Bad gran= t reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference= 4294965983
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expe= cted dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expe= cted dom 0)
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983=
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) g= rant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) g= rant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) g= rant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.= c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:266:d0 Ba= d flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Ba= d flags (0) or dom (0). (expected dom 0)(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) pr= intk: 1 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) = or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) = or dom (0). (expected dom 0)
(XEN) grant_table.c:1717:d0 Bad grant ref= erence 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294= 965983
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected = dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected = dom 0)
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(= XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_= table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk= : 3 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant referen= ce 4294965983
(XEN) printk: 15 messages suppressed.
(XEN) grant_tab= le.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 7= messages suppressed.
(XEN) grant_table .c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 7 = messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (= 0). (expected dom 0)
(XEN) printk: 3 messages suppressed.
(XEN) gra= nt_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) pri= ntk: 7 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) o= r dom (0). (expected dom 0)
(XEN) printk: 3 messages suppressed.
(X= EN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(X= EN) printk: 3 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flag= s (0) or dom (0). (expected dom 0)
(XEN) printk: 3 messages suppressed= .
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0= )
(XEN) printk: 3 messages suppressed.
(XEN) grant_table.c:1717:d0 = Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant r= eference 4294965983
(XEN) printk: 8 messages suppressed.
(XEN) gran= t_table.c:1717:d0 Bad grant reference 4294 965983
(XEN) printk: 15 messages suppressed.
(XEN) grant_table.c:1= 717:d0 Bad grant reference 4294965983
(XEN) printk: 7 messages suppres= sed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XE= N) printk: 29 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flag= s (0) or dom (0). (expected dom 0)
(XEN) printk: 25 messages suppresse= d.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom = 0)
(XEN) printk: 25 messages suppressed.
(XEN) grant_table.c:266:d0= Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 19 messages = suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expe= cted dom 0)
(XEN) printk: 27 messages suppressed.
(XEN) grant_table= .c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 27 = messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (= 0). (expected dom 0)
(XEN) printk: 7 messages suppressed.
(XEN) gra= nt_table.c:266:d0 Bad flags (0) or dom (0) . (expected dom 0)
(XEN) printk: 3 messages suppressed.
(XEN) gran= t_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1= 717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:266:d0 Bad f= lags (0) or dom (0). (expected dom 0)
(XEN) printk: 5 messages suppres= sed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected do= m 0)
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XE= N) printk: 10 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flag= s (0) or dom (0). (expected dom 0)
(XEN) printk: 7 messages suppressed= .
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0= )
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0= )
(XEN) printk: 2 messages suppressed.
(XEN) grant_table.c:1717:d0 = Bad grant reference 4294965983
(XEN) printk: 3 messages suppressed.(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) prin= tk: 15 messages suppressed.
(XEN) grant _table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 7 messa= ges suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 429496= 5983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XE= N) printk: 2 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad gran= t reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference= 4294965983
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expe= cted dom 0)
(XEN) printk: 1 messages suppressed.
(XEN) grant_table.= c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.= c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 8 me= ssages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0)= . (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0)= . (expected dom 0)
(XEN) grant_table.c:1717:d0 Bad grant reference 429= 4965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
= (XEN) grant_table.c:1717:d0 Bad grant refe rence 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294= 965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(= XEN) printk: 1 messages suppressed.
(XEN) grant_table.c:266:d0 Bad fla= gs (0) or dom (0). (expected dom 0)
(XEN) printk: 3 messages suppresse= d.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom = 0)
(XEN) printk: 3 messages suppressed.
(XEN) grant_table.c:266:d0 = Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 = Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:1717:d0= Bad grant reference 4294965983
(XEN) printk: 1 messages suppressed.(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) pri= ntk: 1 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) o= r dom (0). (expected dom 0)
(XEN) print k: 3 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or= dom (0). (expected dom 0)
(XEN) printk: 3 messages suppressed.
(XE= N) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XE= N) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: = 8 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom= (0). (expected dom 0)
(XEN) printk: 9 messages suppressed.
(XEN) g= rant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) p= rintk: 1 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant re= ference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 429= 4965983
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected= dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected= dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected= dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected= dom 0)
(XEN) grant_table.c:266:d0 Bad=20 flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad= flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad= flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad= flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:1717:d0 Ba= d grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 2 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 7 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 5 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 2 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 2 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 1 messages suppressed.
(XEN) grant_table.c:1717:d0 Bad grant reference 4294965983
(XEN) printk: 3 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 3 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 1 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 1 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) printk: 1 messages suppressed.
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)
(XEN) grant_table.c:578:d0 Iomem mapping not permitted ffffffffffffffff (domain 137)
(XEN) grant_table.c:578:d0 Iomem mapping not permitted ffffffffffffffff (domain 137)
(XEN) grant_table.c:578:d0 Iomem mapping not permitted ffffffffffffffff (domain 137)
(XEN) grant_table.c:578:d0 Iomem mapping not permitted ffffffffffffffff (domain 137)
(XEN) grant_table.c:578:d0 Iomem mapping not permitted ffffffffffffffff (domain 137)
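
A note on the recurring value 4294965983 in the log above: it sits just below 2^32, which is what a small negative number looks like once it is stored in an unsigned 32-bit field. The sketch below only illustrates that arithmetic. It assumes grant references are 32-bit unsigned integers; the helper names `gref_as_signed` and `gref_in_range` and the `nr_entries` parameter are hypothetical and are not taken from Xen or netback.

```c
#include <stdint.h>

/* Assumption: a grant reference is a 32-bit unsigned integer. */
typedef uint32_t grant_ref_t;

/* Hypothetical helper: the value a signed 32-bit reading of the same
 * bits would give. Computed in 64-bit arithmetic so it does not rely
 * on implementation-defined narrowing conversions. */
static int64_t gref_as_signed(grant_ref_t gref)
{
    if (gref > (grant_ref_t)INT32_MAX)
        return (int64_t)gref - ((int64_t)1 << 32);
    return (int64_t)gref;
}

/* Hypothetical helper: a bounds check done in unsigned arithmetic, so
 * a wrapped-around reference is rejected as too large instead of
 * being treated as a negative index. */
static int gref_in_range(grant_ref_t gref, uint32_t nr_entries)
{
    return gref < nr_entries;
}
```

For example, `gref_as_signed(4294965983u)` gives -1313, because 4294965983 = 2^32 - 1313. A value like this is consistent with a garbage or uninitialised reference rather than a slightly out-of-range one, though the log alone does not say where it came from.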

--_f0e64f58-f51c-4221-aeda-fa26d58f6a48_--

--===============1643582353==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============1643582353==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: Grant Table Error on 2.6.32.36 + Xen 4.0.1
Date: Tue, 12 Apr 2011 04:46:29 -0400
Message-ID: <20110412084629.GA6255@dumpdata.com>
References: <4DA3438A.6070503@goop.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: MaoXiaoyun
Cc: xen devel , tim.deegan@citrix.com, george.dunlap@eu.citrix.com, giamteckchoon@gmail.com, keir.fraser@eu.citrix.com, ian.campbell@citrix.com
List-Id: xen-devel@lists.xenproject.org

On Tue, Apr 12, 2011 at 02:48:36PM +0800, MaoXiaoyun wrote:
>
> Hi:
>
> We are just about to try the new kernel, but ran into errors on the grant table.

Please open a new thread on this one. This is getting confusing.

>
> 2.6.32.36 Kernel: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
> Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183
>
> Our test is simple: 24 HVMs (Win2003) on a single host, each HVM restarting in a loop every 15 minutes.
> Please refer to the error log from the serial output.
>
> I've traced the log a bit, and the messages come from xen/common/grant_table.c
>
> 1) log "grant_table.c:1717:d0 Bad grant reference 4294965983" is from
>
> 1715 if ( unlikely(gref >= nr_grant_entries(rd->grant_table)) ){
> 1716     PIN_FAIL(unlock_out, GNTST_bad_gntref,
> 1717              "Bad grant reference %ld\n", gref);
> 1718     BUG();
> 1719 }
>
> 2) log "grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)" is from
>
> grant_table.c:1967 => __acquire_grant_for_copy => _set_status
>
> (not from __gnttab_map_grant_ref; I added some logging to confirm this)
>
> The log shows that all of them come from gnttab_copy, and I later found that only netback
> issues the grant copy hypercall.
>
> I also tried the netback code from 2.6.31 (which works well with kernel 2.6.31), but
> still hit these errors. So it looks like it is kernel related.
>
> What causes this, and is it harmful to the HVM guests?

What is the storage for your HVM guests? iSCSI?

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: RE: Grant Table Error on 2.6.32.36 + Xen 4.0.1
Date: Tue, 12 Apr 2011 17:02:36 +0800
Message-ID:
References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412084629.GA6255@dumpdata.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1498394838=="
Return-path:
In-Reply-To: <20110412084629.GA6255@dumpdata.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: konrad.wilk@oracle.com
Cc: xen devel , tim.deegan@citrix.com, george.dunlap@eu.citrix.com, giamteckchoon@gmail.com, keir.fraser@eu.citrix.com, ian.campbell@citrix.com
List-Id: xen-devel@lists.xenproject.org

--===============1498394838==
Content-Type: multipart/alternative; boundary="_59b3be6e-c3d3-4a10-a59f-00ba7b7739e8_"

--_59b3be6e-c3d3-4a10-a59f-00ba7b7739e8_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding:
 quoted-printable

Thanks Konrad.

I will start a new thread for the TLB bug.
For the grant table error, I added some debug logging to netback.c at line 388.

358 static u16 netbk_gop_frag(struct xen_netif *netif, struct netbk_rx_meta *meta,
359                           int i, struct netrx_pending_operations *npo,
360                           struct page *page, unsigned long size,
361                           unsigned long offset)
362 {
363         struct gnttab_copy *copy_gop;
364         struct xen_netif_rx_request *req;
365         unsigned long old_mfn;
366         int idx = netif_page_index(page);
367
368         old_mfn = virt_to_mfn(page_address(page));
369
370         req = RING_GET_REQUEST(&netif->rx, netif->rx.req_cons + i);
371
372         copy_gop = npo->copy + npo->copy_prod++;
373         copy_gop->flags = GNTCOPY_dest_gref;
374         if (idx > -1) {
375                 struct pending_tx_info *src_pend = &pending_tx_info[idx];
376                 copy_gop->source.domid = src_pend->netif->domid;
377                 copy_gop->source.u.ref = src_pend->req.gref;
378                 copy_gop->flags |= GNTCOPY_source_gref;
379         } else {
380                 copy_gop->source.domid = DOMID_SELF;
381                 copy_gop->source.u.gmfn = old_mfn;
382         }
383         copy_gop->source.offset = offset;
384         copy_gop->dest.domid = netif->domid;
385         copy_gop->dest.offset = 0;
386         copy_gop->dest.u.ref = req->gref;
387         copy_gop->len = size;
388         if(req->gref > 16384)
389                 IPRINTK("dom %d, req gref %d size = %lu\n", netif->domid, req->gref, size);
390
391         return req->id;
392 }

The output below indicates that something might be wrong with the grant table.

Apr 12 16:38:31 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:38:31 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:38:31 xmao kernel: xen_net: dom 14, req gref -1313 size = 270
Apr 12 16:38:31 xmao kernel: xen_net: dom 14, req gref -1313 size = 72
Apr 12 16:38:33 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:38:33 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:38:34 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:38:34 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:38:34 xmao kernel: xen_net: dom 14, req gref -1313 size = 270
Apr 12 16:38:35 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:38:35 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:38:40 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:38:40 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:38:42 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:38:42 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:38:44 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:38:44 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:38:57 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:38:57 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:38:59 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:38:59 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:38:59 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:38:59 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:22 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:39:22 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:26 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:39:26 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:29 xmao kernel: xen_net: dom 23, req gref -1313 size = 42
Apr 12 16:39:29 xmao kernel: xen_net: dom 14, req gref -1313 size = 42
Apr 12 16:39:29 xmao kernel: xen_net: dom 23, req gref -1313 size = 42
Apr 12 16:39:29 xmao kernel: xen_net: dom 14, req gref 5242956 size = 42
Apr 12 16:39:30 xmao kernel: xen_net: dom 23, req gref -1313 size = 42
Apr 12 16:39:30 xmao kernel: xen_net: dom 14, req gref 1817341261 size = 42
Apr 12 16:39:31 xmao kernel: xen_net: dom 23, req gref -1313 size = 42
Apr 12 16:39:31 xmao kernel: xen_net: dom 23, req gref -1313 size = 38
Apr 12 16:39:31 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:31 xmao kernel: xen_net: dom 14, req gref -1313 size = 38
Apr 12 16:39:31 xmao kernel: xen_net: dom 14, req gref -1313 size = 72
Apr 12 16:39:32 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:39:32 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:32 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:39:32 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:32 xmao kernel: xen_net: dom 23, req gref -1313 size = 42
Apr 12 16:39:32 xmao kernel: xen_net: dom 14, req gref -1408 size = 42
Apr 12 16:39:32 xmao kernel: xen_net: dom 23, req gref -1313 size = 38
Apr 12 16:39:32 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:32 xmao kernel: xen_net: dom 14, req gref -1408 size = 38
Apr 12 16:39:32 xmao kernel: xen_net: dom 14, req gref -1408 size = 72
Apr 12 16:39:33 xmao kernel: xen_net: dom 23, req gref -1313 size = 42
Apr 12 16:39:33 xmao kernel: xen_net: dom 14, req gref -1408 size = 42
Apr 12 16:39:33 xmao kernel: xen_net: dom 23, req gref -1313 size = 42
Apr 12 16:39:33 xmao kernel: xen_net: dom 14, req gref -1408 size = 42
Apr 12 16:39:33 xmao kernel: xen_net: dom 23, req gref -1313 size = 38
Apr 12 16:39:33 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:33 xmao kernel: xen_net: dom 14, req gref 1850305869 size = 38
Apr 12 16:39:33 xmao kernel: xen_net: dom 23, req gref -1313 size = 42
Apr 12 16:39:34 xmao kernel: xen_net: dom 23, req gref -1313 size = 38
Apr 12 16:39:34 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:34 xmao kernel: xen_net: dom 14, req gref -1313 size = 38
Apr 12 16:39:34 xmao kernel: xen_net: dom 14, req gref -1313 size = 72
Apr 12 16:39:34 xmao kernel: xen_net: dom 23, req gref -1313 size = 42
Apr 12 16:39:34 xmao kernel: xen_net: dom 14, req gref -1313 size = 42
Apr 12 16:39:35 xmao kernel: xen_net: dom 23, req gref -1313 size = 270
Apr 12 16:39:35 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:35 xmao kernel: xen_net: dom 23, req gref -1313 size = 38
Apr 12 16:39:35 xmao kernel: xen_net: dom 23, req gref -1313 size = 72
Apr 12 16:39:35 xmao kernel: xen_net: dom 23, req gref -1313 size = 38
Apr 12 16:39:35 xmao kernel: xen_net: dom 23, req gref -1313 size = 72

> Date: Tue, 12 Apr 2011 04:46:29 -0400
> From: konrad.wilk@oracle.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; tim.deegan@citrix.com; george.dunlap@eu.citrix.com; giamteckchoon@gmail.com; ian.campbell@citrix.com; keir.fraser@eu.citrix.com
> Subject: Re: [Xen-devel] Grant Table Error on 2.6.32.36 + Xen 4.0.1
>
> On Tue, Apr 12, 2011 at 02:48:36PM +0800, MaoXiaoyun wrote:
> >
> > Hi:
> >
> > We are just about to try the new kernel, but ran into errors on the grant table.
>
> Please open a new thread on this one. This is getting confusing.
> >
> > 2.6.32.36 Kernel: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
> > Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183
> >
> > Our test is simple: 24 HVMs (Win2003) on a single host, each HVM restarting in a loop every 15 minutes.
> > Please refer to the error log from the serial output.
> >
> > I've traced the log a bit, and the messages come from xen/common/grant_table.c
> >
> > 1) log "grant_table.c:1717:d0 Bad grant reference 4294965983" is from
> >
> > 1715 if ( unlikely(gref >= nr_grant_entries(rd->grant_table)) ){
> > 1716     PIN_FAIL(unlock_out, GNTST_bad_gntref,
> > 1717              "Bad grant reference %ld\n", gref);
> > 1718     BUG();
> > 1719 }
> >
> > 2) log "grant_table.c:266:d0 Bad flags (0) or dom (0). (expected dom 0)" is from
> >
> > grant_table.c:1967 => __acquire_grant_for_copy => _set_status
> >
> > (not from __gnttab_map_grant_ref; I added some logging to confirm this)
> >
> > The log shows that all of them come from gnttab_copy, and I later found that only netback
> > issues the grant copy hypercall.
> >
> > I also tried the netback code from 2.6.31 (which works well with kernel 2.6.31), but
> > still hit these errors. So it looks like it is kernel related.
> >
> > What causes this, and is it harmful to the HVM guests?
>
> What is the storage for your HVM guests? iSCSI?

--_59b3be6e-c3d3-4a10-a59f-00ba7b7739e8_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable
 
--_59b3be6e-c3d3-4a10-a59f-00ba7b7739e8_--

--===============1498394838==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: Kernel BUG at arch/x86/mm/tlb.c:61
Date: Tue, 12 Apr 2011 17:11:51 +0800
Message-ID:
References: , , , , , , , <4DA3438A.6070503@goop.org>,
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0140234595=="
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen devel
Cc: jeremy@goop.org, giamteckchoon@gmail.com, konrad.wilk@oracle.com
List-Id: xen-devel@lists.xenproject.org

--===============0140234595==
Content-Type: multipart/alternative; boundary="_766b3516-2726-4c1a-91ea-25975993a35c_"

--_766b3516-2726-4c1a-91ea-25975993a35c_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi:

We are using pvops kernel 2.6.32.36 + Xen 4.0.1, and have hit a kernel panic.

2.6.32.36 Kernel: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183

Our test is simple: 24 HVMs (Win2003) on a single host, each HVM restarting in a loop every 15 minutes.
About 17 machines are involved in the test; after a 10-hour run, one of them crashed at arch/x86/mm/tlb.c:61.

Currently I am trying "cpuidle=0 cpufreq=none" tests based on Teck's suggestion.

Any comments are welcome, thanks.

===============crash log==========================
INIT: Id "s0" respawning too fast: disabled for 5 minutes
__ratelimit: 14 callbacks suppressed
blktap_sysfs_destroy
blktap_sysfs_destroy
------------[ cut here ]------------
kernel BUG at arch/x86/mm/tlb.c:61!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
CPU 1
Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcspkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
RIP: e030:[] [] leave_mm+0x15/0x46
RSP: e02b:ffff88002805be48 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
FS: 00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
Stack:
 ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
<0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
<0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
Call Trace:
 [] drop_other_mm_ref+0x2a/0x53
 [] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [] xen_call_function_single_interrupt+0x13/0x28
 [] handle_IRQ_event+0x66/0x120
 [] handle_percpu_irq+0x41/0x6e
 [] __xen_evtchn_do_upcall+0x1ab/0x27d
 [] xen_evtchn_do_upcall+0x33/0x46
 [] xen_do_hypervisor_callback+0x1e/0x30
 [] ? _spin_unlock_irqrestore+0x15/0x17
 [] ? xen_restore_fl_direct_end+0x0/0x1
 [] ? flush_old_exec+0x3ac/0x500
 [] ? load_elf_binary+0x0/0x17ef
 [] ? load_elf_binary+0x0/0x17ef
 [] ? load_elf_binary+0x398/0x17ef
 [] ? need_resched+0x23/0x2d
 [] ? process_measurement+0xc0/0xd7
 [] ? load_elf_binary+0x0/0x17ef
 [] ? search_binary_handler+0xc8/0x255
 [] ? do_execve+0x1c3/0x29e
 [] ? sys_execve+0x43/0x5d
 [] ? __call_usermodehelper+0x0/0x6f
 [] ? kernel_execve+0x68/0xd0
 [] ? __call_usermodehelper+0x0/0x6f
 [] ? xen_restore_fl_direct_end+0x0/0x1
 [] ? ____call_usermodehelper+0x113/0x11e
 [] ? child_rip+0xa/0x20
 [] ? __call_usermodehelper+0x0/0x6f
 [] ? int_ret_from_sys_call+0x7/0x1b
 [] ? retint_restore_args+0x5/0x6
 [] ? child_rip+0x0/0x20
Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
RIP [] leave_mm+0x15/0x46
 RSP
---[ end trace ce9cee6832a9c503 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 25581, comm: khelper Tainted: G      D    2.6.32.36fixxen #1
Call Trace:
 [] panic+0xe0/0x19a
 [] ? init_amd+0x296/0x37a
 [] ? xen_force_evtchn_callback+0xd/0xf
 [] ? check_events+0x12/0x20
 [] ? xen_restore_fl_direct_end+0x0/0x1
 [] ? print_oops_end_marker+0x23/0x25
 [] oops_end+0xb6/0xc6
 [] die+0x5a/0x63
 [] do_trap+0x115/0x124
 [] do_invalid_op+0x9c/0xa5
 [] ? leave_mm+0x15/0x46
 [] ? xen_clocksource_read+0x21/0x23
 [] ? HYPERVISOR_vcpu_op+0xf/0x11
 [] ? xen_vcpuop_set_next_event+0x52/0x67
 [] ? clockevents_program_event+0x78/0x81
 [] invalid_op+0x1b/0x20
 [] ? _spin_unlock_irqrestore+0x15/0x17
 [] ? leave_mm+0x15/0x46
 [] drop_other_mm_ref+0x2a/0x53
 [] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [] xen_call_function_single_interrupt+0x13/0x28
 [] handle_IRQ_event+0x66/0x120
 [] handle_percpu_irq+0x41/0x6e
 [] __xen_evtchn_do_upcall+0x1ab/0x27d
 [] xen_evtchn_do_upcall+0x33/0x46
 [] xen_do_hypervisor_callback+0x1e/0x30
 [] ? _spin_unlock_irqrestore+0x15/0x17
 [] ? xen_restore_fl_direct_end+0x0/0x1
 [] ? flush_old_exec+0x3ac/0x500
 [] ? load_elf_binary+0x0/0x17ef
 [] ? load_elf_binary+0x0/0x17ef
 [] ? load_elf_binary+0x398/0x17ef
 [] ? need_resched+0x23/0x2d
 [] ? process_measurement+0xc0/0xd7
 [] ? load_elf_binary+0x0/0x17ef
 [] ? search_binary_handler+0xc8/0x255
 [] ? do_execve+0x1c3/0x29e
 [] ? sys_execve+0x43/0x5d
 [] ? __call_usermodehelper+0x0/0x6f
 [] ? kernel_execve+0x68/0xd0
 [] ? __call_usermodehelper+0x0/0x6f
 [] ? xen_restore_fl_direct_end+0x0/0x1
 [] ? ____call_usermodehelper+0x113/0x11e
 [] ? child_rip+0xa/0x20
 [] ? __call_usermodehelper+0x0/0x6f
 [] ? int_ret_from_sys_call+0x7/0x1b
 [] ? retint_restore_args+0x5/0x6
 [] ? child_rip+0x0/0x20

--_766b3516-2726-4c1a-91ea-25975993a35c_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi:

We are using pvops kernel 2.6.32.36 + Xen 4.0.1, and have hit a kernel panic.

2.6.32.36 Kernel: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183

Our test is simple: 24 HVMs (Win2003) on a single host, each HVM restarting in a loop every 15 minutes.
About 17 machines are involved in the test; after a 10-hour run, one of them crashed at arch/x86/mm/tlb.c:61.

Currently I am trying "cpuidle=0 cpufreq=none" tests based on Teck's suggestion.

Any comments are welcome, thanks.

===============crash log==========================
INIT: Id "s0" respawning too fast: disabled for 5 minutes
__ratelimit: 14 callbacks suppressed
blktap_sysfs_destroy
blktap_sysfs_destroy
------------[ cut here ]------------
kernel BUG at arch/x86/mm/tlb.c:61!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
CPU 1
Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcspkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
RSP: e02b:ffff88002805be48  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
FS:  00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
Stack:
 ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
<0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
<0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
Call Trace:
 <IRQ>
 [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
 [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
 [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
 [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
 <EOI>
 [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
 [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
 [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
 [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
 [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
 [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
 [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
 [<ffffffff81013daa>] ? child_rip+0xa/0x20
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
 [<ffffffff81013da0>] ? child_rip+0x0/0x20
Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
 RSP <ffff88002805be48>
---[ end trace ce9cee6832a9c503 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 25581, comm: khelper Tainted: G      D    2.6.32.36fixxen #1
Call Trace:
 <IRQ>  [<ffffffff8105682e>] panic+0xe0/0x19a
 [<ffffffff8144008a>] ? init_amd+0x296/0x37a
 [<ffffffff8100f17d>] = ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff8100f8e2>] ?= check_events+0x12/0x20
 [<ffffffff8100f8cf>] ? xen_restore= _fl_direct_end+0x0/0x1
 [<ffffffff81056487>] ? print_oops_e= nd_marker+0x23/0x25
 [<ffffffff81448185>] oops_end+0xb6/0xc= 6
 [<ffffffff810166e5>] die+0x5a/0x63
 [<fffffff= f81447a5c>] do_trap+0x115/0x124
 [<ffffffff810148e6>] do= _invalid_op+0x9c/0xa5
 [<ffffffff8103a3cb>] ? leave_mm+0x15= /0x46
 [<ffffffff8100f6fa>] ? xen_clocksource_read+0x21/0x2= 3
 [<ffffffff8100f26c>] ? HYPERVISOR_vcpu_op+0xf/0x11
&n= bsp;[<ffffffff8100f767>] ? xen_vcpuop_set_next_event+0x52/0x67
&= nbsp;[<ffffffff81080bfa>] ? clockeve nts_program_event+0x78/0x81
 [<ffffffff81013b3b>] invalid_= op+0x1b/0x20
 [<ffffffff814472b2>] ? _spin_unlock_irqrestor= e+0x15/0x17
 [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
&= nbsp;[<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
 [<= ;ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0x= fc
 [<ffffffff81010108>] xen_call_function_single_interrupt= +0x13/0x28
 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x12= 0
 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
&nbs= p;[<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
 = [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
 [<f= fffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
 <EO= I>  [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x1= 7
 [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1=
 [<ffffffff81113f71>] ? flu sh_old_exec+0x3ac/0x500
 [<ffffffff81150dc5>] ? load_elf_b= inary+0x0/0x17ef
 [<ffffffff81150dc5>] ? load_elf_binary+0x= 0/0x17ef
 [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17= ef
 [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
 = [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
 [<= ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff= 81113094>] ? search_binary_handler+0xc8/0x255
 [<ffffffff81= 114362>] ? do_execve+0x1c3/0x29e
 [<ffffffff8101155d>] ?= sys_execve+0x43/0x5d
 [<ffffffff8106fc45>] ? __call_usermo= dehelper+0x0/0x6f
 [<ffffffff81013e28>] ? kernel_execve+0x6= 8/0xd0
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x= 6f
 [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x= 1
 [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x1= 1e
 [<ffffffff81013daa>] ? c hild_rip+0xa/0x20
 [<ffffffff8106fc45>] ? __call_usermodeh= elper+0x0/0x6f
 [<ffffffff81012f91>] ? int_ret_from_sys_cal= l+0x7/0x1b
 [<ffffffff8101371d>] ? retint_restore_args+0x5/= 0x6
 [<ffffffff81013da0>] ? child_rip+0x0/0x20

&nb= sp;
 
--_766b3516-2726-4c1a-91ea-25975993a35c_--

--===============0140234595==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0140234595==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
Date: Tue, 12 Apr 2011 06:00:00 -0400
Message-ID: <20110412100000.GA15647@dumpdata.com>
References: <4DA3438A.6070503@goop.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: MaoXiaoyun
Cc: jeremy@goop.org, xen devel , giamteckchoon@gmail.com
List-Id: xen-devel@lists.xenproject.org

On Tue, Apr 12, 2011 at 05:11:51PM +0800, MaoXiaoyun wrote:
>
> Hi :
>
> We are using pvops kernel 2.6.32.36 + xen 4.0.1, but have hit a kernel panic bug.
>
> 2.6.32.36 Kernel: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
> Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183
>
> Our test is simple: 24 HVMs (Win2003) on a single host, each HVM restarting in a loop every 15 minutes.

What is the storage that you are using for your guests? AoE? Local disks?

> About 17 machines are involved in the test; after a 10-hour run, one hit a crash at arch/x86/mm/tlb.c:61
>
> Currently I am trying "cpuidle=0 cpufreq=none" tests based on Teck's suggestion.
>
> Any comments, thanks.
>
> ===============crash log==========================
> INIT: Id "s0" respawning too fast: disabled for 5 minutes
> __ratelimit: 14 callbacks suppressed
> blktap_sysfs_destroy
> blktap_sysfs_destroy
> ------------[ cut here ]------------
> kernel BUG at arch/x86/mm/tlb.c:61!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
> CPU 1
> Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcspkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
> Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
> RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> RSP: e02b:ffff88002805be48  EFLAGS: 00010046
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
> RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
> RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
> R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
> R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
> FS:  00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
> Stack:
>  ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
> <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
> <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
> Call Trace:
>  <IRQ>
>  [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
>  [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
>  [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
>  [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
>  [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
>  [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
>  [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
>  [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
>  <EOI>
>  [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
>  [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
>  [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
>  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
>  [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
>  [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
>  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
>  [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
>  [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
>  [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
>  [<ffffffff81013daa>] ? child_rip+0xa/0x20
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
>  [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
>  [<ffffffff81013da0>] ? child_rip+0x0/0x20
> Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
> RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
>  RSP <ffff88002805be48>
> ---[ end trace ce9cee6832a9c503 ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Pid: 25581, comm: khelper Tainted: G      D    2.6.32.36fixxen #1
> Call Trace:
>  <IRQ>  [<ffffffff8105682e>] panic+0xe0/0x19a
>  [<ffffffff8144008a>] ? init_amd+0x296/0x37a
>  [<ffffffff8100f17d>] ? xen_force_evtchn_callback+0xd/0xf
>  [<ffffffff8100f8e2>] ? check_events+0x12/0x20
>  [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
>  [<ffffffff81056487>] ? print_oops_end_marker+0x23/0x25
>  [<ffffffff81448185>] oops_end+0xb6/0xc6
>  [<ffffffff810166e5>] die+0x5a/0x63
>  [<ffffffff81447a5c>] do_trap+0x115/0x124
>  [<ffffffff810148e6>] do_invalid_op+0x9c/0xa5
>  [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
>  [<ffffffff8100f6fa>] ? xen_clocksource_read+0x21/0x23
>  [<ffffffff8100f26c>] ? HYPERVISOR_vcpu_op+0xf/0x11
>  [<ffffffff8100f767>] ? xen_vcpuop_set_next_event+0x52/0x67
>  [<ffffffff81080bfa>] ? clockevents_program_event+0x78/0x81
>  [<ffffffff81013b3b>] invalid_op+0x1b/0x20
>  [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
>  [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
>  [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
>  [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
>  [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
>  [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
>  [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
>  [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
>  [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
>  [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
>  <EOI>  [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
>  [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
>  [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
>  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
>  [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
>  [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
>  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
>  [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
>  [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
>  [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
>  [<ffffffff81013daa>] ? child_rip+0xa/0x20
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
>  [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
>  [<ffffffff81013da0>] ? child_rip+0x0/0x20
>
>

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61
Date: Tue, 12 Apr 2011 18:10:36 +0800
Message-ID:
References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0051551837=="
Return-path:
In-Reply-To: <20110412100000.GA15647@dumpdata.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: konrad.wilk@oracle.com
Cc: jeremy@goop.org, xen devel , giamteckchoon@gmail.com
List-Id: xen-devel@lists.xenproject.org

--===============0051551837==
Content-Type: multipart/alternative; boundary="_f47a06fd-cd58-4a3e-9d6c-3717085a9f4f_"

--_f47a06fd-cd58-4a3e-9d6c-3717085a9f4f_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

VHD file on a local disk.

disk = [ 'tap:vhd:/mnt/xmao/test/img/win2003.cp1.vhd,hda,w']

thanks.

> Date: Tue, 12 Apr 2011 06:00:00 -0400
> From: konrad.wilk@oracle.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; giamteckchoon@gmail.com; jeremy@goop.org
> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
>
> On Tue, Apr 12, 2011 at 05:11:51PM +0800, MaoXiaoyun wrote:
> >
> > Hi :
> >
> > We are using pvops kernel 2.6.32.36 + xen 4.0.1, but have hit a kernel panic bug.
> >
> > 2.6.32.36 Kernel: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
> > Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183
> >
> > Our test is simple: 24 HVMs (Win2003) on a single host, each HVM restarting in a loop every 15 minutes.
>
> What is the storage that you are using for your guests? AoE? Local disks?
>
> > About 17 machines are involved in the test; after a 10-hour run, one hit a crash at arch/x86/mm/tlb.c:61
> >
> > Currently I am trying "cpuidle=0 cpufreq=none" tests based on Teck's suggestion.
> >
> > Any comments, thanks.
> >
> > ===============crash log==========================
> > INIT: Id "s0" respawning too fast: disabled for 5 minutes
> > __ratelimit: 14 callbacks suppressed
> > blktap_sysfs_destroy
> > blktap_sysfs_destroy
> > ------------[ cut here ]------------
> > kernel BUG at arch/x86/mm/tlb.c:61!
> > invalid opcode: 0000 [#1] SMP
> > last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
> > CPU 1
> > Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcspkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
> > Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
> > RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> > RSP: e02b:ffff88002805be48  EFLAGS: 00010046
> > RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
> > RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
> > RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
> > R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
> > R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
> > FS:  00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
> > CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
> > Stack:
> >  ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
> > <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
> > <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
> > Call Trace:
> >  <IRQ>
> >  [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
> >  [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
> >  [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
> >  [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
> >  [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
> >  [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
> >  [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
> >  [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
> >  <EOI>
> >  [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
> >  [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
> >  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
> >  [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
> >  [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
> >  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
> >  [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
> >  [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
> >  [<ffffffff81013daa>] ? child_rip+0xa/0x20
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
> >  [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
> >  [<ffffffff81013da0>] ? child_rip+0x0/0x20
> > Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
> > RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> >  RSP <ffff88002805be48>
> > ---[ end trace ce9cee6832a9c503 ]---
> > Kernel panic - not syncing: Fatal exception in interrupt
> > Pid: 25581, comm: khelper Tainted: G      D    2.6.32.36fixxen #1
> > Call Trace:
> >  <IRQ>  [<ffffffff8105682e>] panic+0xe0/0x19a
> >  [<ffffffff8144008a>] ? init_amd+0x296/0x37a
> >  [<ffffffff8100f17d>] ? xen_force_evtchn_callback+0xd/0xf
> >  [<ffffffff8100f8e2>] ? check_events+0x12/0x20
> >  [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff81056487>] ? print_oops_end_marker+0x23/0x25
> >  [<ffffffff81448185>] oops_end+0xb6/0xc6
> >  [<ffffffff810166e5>] die+0x5a/0x63
> >  [<ffffffff81447a5c>] do_trap+0x115/0x124
> >  [<ffffffff810148e6>] do_invalid_op+0x9c/0xa5
> >  [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
> >  [<ffffffff8100f6fa>] ? xen_clocksource_read+0x21/0x23
> >  [<ffffffff8100f26c>] ? HYPERVISOR_vcpu_op+0xf/0x11
> >  [<ffffffff8100f767>] ? xen_vcpuop_set_next_event+0x52/0x67
> >  [<ffffffff81080bfa>] ? clockevents_program_event+0x78/0x81
> >  [<ffffffff81013b3b>] invalid_op+0x1b/0x20
> >  [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
> >  [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
> >  [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
> >  [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
> >  [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
> >  [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
> >  [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
> >  [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
> >  [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
> >  [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
> >  <EOI>  [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
> >  [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
> >  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
> >  [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
> >  [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
> >  [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
> >  [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
> >  [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
> >  [<ffffffff81013daa>] ? child_rip+0xa/0x20
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
> >  [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
> >  [<ffffffff81013da0>] ? child_rip+0x0/0x20
> >
> >

--_f47a06fd-cd58-4a3e-9d6c-3717085a9f4f_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

VHD file on a local disk.
 
--_f47a06fd-cd58-4a3e-9d6c-3717085a9f4f_--

--===============0051551837==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0051551837==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Teck Choon Giam
Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872
Date: Wed, 13 Apr 2011 00:08:04 +0800
Message-ID:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: MaoXiaoyun
Cc: jeremy@goop.org, xen devel , keir@xen.org, ian.campbell@citrix.com, konrad.wilk@oracle.com, dave@ivt.com.au
List-Id: xen-devel@lists.xenproject.org

If possible, please try not to top-post, as it makes reading more
confusing, for me at least. Thanks ;)

2011/4/12 MaoXiaoyun :
> Hi:
>
>        I have just kicked off the "cpuidle=0 cpufreq=none" tests.

Let's see whether you are able to reproduce the tlb BUG with the above.

>
>        What is your Xen version?  Do you use the backend driver of
> 2.6.32.36?

Are you asking me? xen-4.0.2-rc3-pre latest changeset and also
xen-4.1.1-rc1-pre. What do you mean by backend driver? My testing is
mostly on PV domU and HVM on Windows with LVM as storage. I do not use
VHD or any PV drivers for Windows.

>
>        Besides the "TLB BUG", I've met at least two other issues:
>        1) Xen 4.0.1 + 2.6.32.36 kernel + backend driver from 2.6.31  ==> will
> cause "Bad grant reference" log in serial output
>        2) Xen 4.0.1 + 2.6.32.36 kernel with its own backend driver   ==> will
> cause disk errors like the ones below.
>
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> sd 0:0:0:0: rejecting I/O to offline device
> end_request: I/O error, dev tdb, sector 28699593
> end_request: I/O error, dev tdb, sector 28699673
> end_request: I/O error, dev tdb, sector 28699753
> end_request: I/O error, dev tdb, sector 28699833
> end_request: I/O error, dev tdb, sector 28699913
> end_request: I/O error, dev tdb, sector 28699993
> end_request: I/O error, dev tdb, sector 28700073

Is this related to VHD? What is the specific backend driver? Did
these start to surface after you applied my backport patch, or were
they already there regardless of the patch?

Thanks.
Kindest regards, Giam Teck Choon From mboxrd@z Thu Jan 1 00:00:00 1970 From: Teck Choon Giam Subject: Re: kernel BUG at arch/x86/xen/mmu.c:1872 Date: Wed, 13 Apr 2011 00:32:05 +0800 Message-ID: References: <4DA3438A.6070503@goop.org> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Return-path: In-Reply-To: <4DA3438A.6070503@goop.org> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Jeremy Fitzhardinge Cc: MaoXiaoyun , xen devel , keir@xen.org, ian.campbell@citrix.com, konrad.wilk@oracle.com, dave@ivt.com.au List-Id: xen-devel@lists.xenproject.org 2011/4/12 Jeremy Fitzhardinge : > On 04/11/2011 05:31 AM, MaoXiaoyun wrote: >> Hi: >> >> I believe this is the fix at much extent. >> Since I have my own test cases which with this patch, my test case >> will success in 30 rounds run. >> Every round takes 8hours. While without this patch, tests fail evey >> round in 15minutes. >> >> So this really means fix most of the things. >> >> But during running, I met another crash, from the log it it looks like >> has relation with >> this BUG, since the crash log shows it is tlb related and this BUG >> also tlb related. >> >> Well, I'm also have poor knowledge of kernel. >> Hope someone from Xen Devel offer some help. 
>
> Thanks for confirming; it makes sense and explains the symptoms, so I'm
> glad it also works ;)
>
>     J
>

Thanks Jeremy, I can see the needed backport patch is in your
xen/next-2.6.32 tree now ;)

Kindest regards,
Giam Teck Choon

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61
Date: Thu, 14 Apr 2011 14:16:24 +0800
References: <4DA3438A.6070503@goop.org>, <20110412100000.GA15647@dumpdata.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0760198219=="
In-Reply-To: <20110412100000.GA15647@dumpdata.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen devel
Cc: jeremy@goop.org, giamteckchoon@gmail.com, konrad.wilk@oracle.com
List-Id: xen-devel@lists.xenproject.org

--===============0760198219==
Content-Type: multipart/alternative; boundary="_702121fa-b4c8-4aa3-b55a-69720ad5fc06_"

--_702121fa-b4c8-4aa3-b55a-69720ad5fc06_
Content-Type: text/plain; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi:

      I ran the test with "cpuidle=0 cpufreq=none"; two machines crashed.

blktap_sysfs_destroy
blktap_sysfs_destroy
blktap_sysfs_create: adding attributes for dev ffff8800ad581000
blktap_sysfs_create: adding attributes for dev ffff8800a48e3e00
------------[ cut here ]------------
kernel BUG at arch/x86/mm/tlb.c:61!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/block/tapdeve/dev
CPU 0
Modules linked in: 8021q garp blktap xen_netback xen_blkback blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy bnx2 serio_raw snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 snd_timer i2c_core snd iTCO_wdt pata_acpi soundcore iTCO_vendor_support ata_generic snd_page_alloc pcspkr ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
Pid: 8022, comm: khelper Not tainted 2.6.32.36xen #1 Tecal RH2285
RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
RSP: e02b:ffff88002803ee48  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81675980
RDX: ffff88002803ee78 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88002803ee48 R08: ffff8800a4929000 R09: dead000000200200
R10: dead000000100100 R11: ffffffff81447292 R12: ffff88012ba07b80
R13: ffff880028046020 R14: 00000000000004fb R15: 0000000000000000
FS:  00007f410af416e0(0000) GS:ffff88002803b000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000469000 CR3: 00000000ad639000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process khelper (pid: 8022, threadinfo ffff8800a4846000, task ffff8800a9ed0000)
Stack:
 ffff88002803ee68 ffffffff8100e4a4 0000000000000001 ffff880097de3b88
<0> ffff88002803ee98 ffffffff81087224 ffff88002803ee78 ffff88002803ee78
<0> ffff88015f808180 00000000000004fb ffff88002803eea8 ffffffff810100e8
Call Trace:
 <IRQ>
 [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
 [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
 [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
 [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
 <EOI>
 [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
 [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff81113f75>] ? flush_old_exec+0x3ac/0x500
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
 [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
 [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
 [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
 [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
 [<ffffffff81013daa>] ? child_rip+0xa/0x20
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
 [<ffffffff81013da0>] ? child_rip+0x0/0x20
Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
 RSP <ffff88002803ee48>
---[ end trace 1522f17fdfc9162d ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 8022, comm: khelper Tainted: G      D    2.6.32.36xen #1
Call Trace:
 <IRQ>  [<ffffffff8105682e>] panic+0xe0/0x19a
 [<ffffffff8144006a>] ? init_amd+0x296/0x37a
 [<ffffffff8100f169>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff8100f8c2>] ? check_events+0x12/0x20
 [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff81056487>] ? print_oops_end_marker+0x23/0x25
 [<ffffffff81448165>] oops_end+0xb6/0xc6
 [<ffffffff810166e5>] die+0x5a/0x63
 [<ffffffff81447a3c>] do_trap+0x115/0x124
 [<ffffffff810148e6>] do_invalid_op+0x9c/0xa5
 [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
 [<ffffffff8100f6e6>] ? xen_clocksource_read+0x21/0x23
 [<ffffffff8100f258>] ? HYPERVISOR_vcpu_op+0xf/0x11
 [<ffffffff8100f753>] ? xen_vcpuop_set_next_event+0x52/0x67
 [<ffffffff81013b3b>] invalid_op+0x1b/0x20
 [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
 [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
 [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
 [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
 [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
 [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
 <EOI>  [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
 [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff81113f75>] ? flush_old_exec+0x3ac/0x500
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
 [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
 [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
 [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
 [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
 [<ffffffff81013daa>] ? child_rip+0xa/0x20
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
 [<ffffffff81013da0>] ? child_rip+0x0/0x20
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.

> Date: Tue, 12 Apr 2011 06:00:00 -0400
> From: konrad.wilk@oracle.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; giamteckchoon@gmail.com; jeremy@goop.org
> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
>
> On Tue, Apr 12, 2011 at 05:11:51PM +0800, MaoXiaoyun wrote:
> >
> > Hi :
> >
> > We are using pvops kernel 2.6.32.36 + xen 4.0.1, but have hit a kernel panic bug.
> >
> > 2.6.32.36 Kernel: http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
> > Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183
> >
> > Our test is simple: 24 HVMs (Win2003) on a single host, each HVM restarting in a loop every 15 minutes.
>
> What is the storage that you are using for your guests? AoE? Local disks?
>
> > About 17 machines are involved in the test; after a 10-hour run, one hit a crash at arch/x86/mm/tlb.c:61
> >
> > Currently I am trying "cpuidle=0 cpufreq=none" tests based on Teck's suggestion.
> >
> > Any comments are welcome, thanks.
> >
> > ===============crash log==========================
> > INIT: Id "s0" respawning too fast: disabled for 5 minutes
> > __ratelimit: 14 callbacks suppressed
> > blktap_sysfs_destroy
> > blktap_sysfs_destroy
> > ------------[ cut here ]------------
> > kernel BUG at arch/x86/mm/tlb.c:61!
> > invalid opcode: 0000 [#1] SMP
> > last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
> > CPU 1
> > Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc i2c_i801 iTCO_vendor_support i2c_core pcspkr pata_acpi ata_generic ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
> > Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
> > RIP: e030:[<ffffffff8103a3cb>] [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> > RSP: e02b:ffff88002805be48 EFLAGS: 00010046
> > RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
> > RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
> > RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
> > R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
> > R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
> > FS: 00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
> > CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
> > Stack:
> > ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
> > <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
> > <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
> > Call Trace:
> > <IRQ>
> > [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
> > [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
> > [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
> > [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
> > [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
> > [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
> > [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
> > [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
> > <EOI>
> > [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
> > [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
> > [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> > [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
> > [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
> > [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> > [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
> > [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
> > [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> > [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> > [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
> > [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
> > [<ffffffff81013daa>] ? child_rip+0xa/0x20
> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> > [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
> > [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
> > [<ffffffff81013da0>] ? child_rip+0x0/0x20
> > Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
> > RIP [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> > RSP <ffff88002805be48>
> > ---[ end trace ce9cee6832a9c503 ]---
> > Kernel panic - not syncing: Fatal exception in interrupt
> > Pid: 25581, comm: khelper Tainted: G D 2.6.32.36fixxen #1
> > Call Trace:
> > <IRQ> [<ffffffff8105682e>] panic+0xe0/0x19a
> > [<ffffffff8144008a>] ? init_amd+0x296/0x37a
> > [<ffffffff8100f17d>] ? xen_force_evtchn_callback+0xd/0xf
> > [<ffffffff8100f8e2>] ? check_events+0x12/0x20
> > [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
> > [<ffffffff81056487>] ? print_oops_end_marker+0x23/0x25
> > [<ffffffff81448185>] oops_end+0xb6/0xc6
> > [<ffffffff810166e5>] die+0x5a/0x63
> > [<ffffffff81447a5c>] do_trap+0x115/0x124
> > [<ffffffff810148e6>] do_invalid_op+0x9c/0xa5
> > [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
> > [<ffffffff8100f6fa>] ? xen_clocksource_read+0x21/0x23
> > [<ffffffff8100f26c>] ? HYPERVISOR_vcpu_op+0xf/0x11
> > [<ffffffff8100f767>] ? xen_vcpuop_set_next_event+0x52/0x67
> > [<ffffffff81080bfa>] ? clockevents_program_event+0x78/0x81
> > [<ffffffff81013b3b>] invalid_op+0x1b/0x20
> > [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
> > [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
> > [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
> > [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
> > [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
> > [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
> > [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
> > [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
> > [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
> > [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
> > <EOI> [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
> > [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
> > [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> > [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
> > [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
> > [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
> > [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
> > [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
> > [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
> > [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> > [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> > [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
> > [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
> > [<ffffffff81013daa>] ? child_rip+0xa/0x20
> > [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> > [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
> > [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
> > [<ffffffff81013da0>] ? child_rip+0x0/0x20
> >
> >

--_702121fa-b4c8-4aa3-b55a-69720ad5fc06_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

Hi:
 
--_702121fa-b4c8-4aa3-b55a-69720ad5fc06_--

--===============0760198219==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0760198219==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Teck Choon Giam
Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
Date: Thu, 14 Apr 2011 15:26:14 +0800
References: <4DA3438A.6070503@goop.org> <20110412100000.GA15647@dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: MaoXiaoyun
Cc: jeremy@goop.org, xen devel, konrad.wilk@oracle.com
List-Id: xen-devel@lists.xenproject.org

2011/4/14 MaoXiaoyun :
> Hi:
>
>       I ran the test with "cpuidle=0 cpufreq=none"; two machines crashed.
>
> blktap_sysfs_destroy
> blktap_sysfs_destroy
> blktap_sysfs_create: adding attributes for dev ffff8800ad581000
> blktap_sysfs_create: adding attributes for dev ffff8800a48e3e00
> ------------[ cut here ]------------
> kernel BUG at arch/x86/mm/tlb.c:61!
> invalid opcode: 0000 [#1] SMP
> last sysfs file: /sys/block/tapdeve/dev
> CPU 0
> Modules linked in: 8021q garp blktap xen_netback xen_blkback blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy bnx2 serio_raw snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 snd_timer i2c_core snd iTCO_wdt pata_acpi soundcore iTCO_vendor_support ata_generic snd_page_alloc pcspkr ata_piix shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
> Pid: 8022, comm: khelper Not tainted 2.6.32.36xen #1 Tecal RH2285
> RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> RSP: e02b:ffff88002803ee48  EFLAGS: 00010046
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81675980
> RDX: ffff88002803ee78 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffff88002803ee48 R08: ffff8800a4929000 R09: dead000000200200
> R10: dead000000100100 R11: ffffffff81447292 R12: ffff88012ba07b80
> R13: ffff880028046020 R14: 00000000000004fb R15: 0000000000000000
> FS:  00007f410af416e0(0000) GS:ffff88002803b000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000469000 CR3: 00000000ad639000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process khelper (pid: 8022, threadinfo ffff8800a4846000, task ffff8800a9ed0000)
> Stack:
>  ffff88002803ee68 ffffffff8100e4a4 0000000000000001 ffff880097de3b88
> <0> ffff88002803ee98 ffffffff81087224 ffff88002803ee78 ffff88002803ee78
> <0> ffff88015f808180 00000000000004fb ffff88002803eea8 ffffffff810100e8
> Call Trace:
>  <IRQ>
>  [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
>  [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
>  [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
>  [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
>  [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
>  [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
>  [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
>  [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
>  <EOI>
>  [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
>  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
>  [<ffffffff81113f75>] ? flush_old_exec+0x3ac/0x500
>  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
>  [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
>  [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
>  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
>  [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
>  [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
>  [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
>  [<ffffffff81013daa>] ? child_rip+0xa/0x20
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
>  [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
>  [<ffffffff81013da0>] ? child_rip+0x0/0x20
> Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
> RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
>  RSP <ffff88002803ee48>
> ---[ end trace 1522f17fdfc9162d ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Pid: 8022, comm: khelper Tainted: G      D    2.6.32.36xen #1
> Call Trace:
>  <IRQ>  [<ffffffff8105682e>] panic+0xe0/0x19a
>  [<ffffffff8144006a>] ? init_amd+0x296/0x37a

Hmmm... are both machines using AMD CPUs?  Did you hit the same bug on
Intel CPUs?

>  [<ffffffff8100f169>] ? xen_force_evtchn_callback+0xd/0xf
>  [<ffffffff8100f8c2>] ? check_events+0x12/0x20
>  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
>  [<ffffffff81056487>] ? print_oops_end_marker+0x23/0x25
>  [<ffffffff81448165>] oops_end+0xb6/0xc6
>  [<ffffffff810166e5>] die+0x5a/0x63
>  [<ffffffff81447a3c>] do_trap+0x115/0x124
>  [<ffffffff810148e6>] do_invalid_op+0x9c/0xa5
>  [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
>  [<ffffffff8100f6e6>] ? xen_clocksource_read+0x21/0x23
>  [<ffffffff8100f258>] ? HYPERVISOR_vcpu_op+0xf/0x11
>  [<ffffffff8100f753>] ? xen_vcpuop_set_next_event+0x52/0x67
>  [<ffffffff81013b3b>] invalid_op+0x1b/0x20
>  [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
>  [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
>  [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
>  [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
>  [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
>  [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
>  [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
>  [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
>  [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
>  [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
>  <EOI>  [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
>  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
>  [<ffffffff81113f75>] ? flush_old_exec+0x3ac/0x500
>  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
>  [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
>  [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
>  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
>  [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
>  [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
>  [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
>  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
>  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
>  [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
>  [<ffffffff81013daa>] ? child_rip+0xa/0x20
>
=A0[]=A0?=A0__call_usermodehelper+0x0/0x6f > =A0[]=A0?=A0int_ret_from_sys_call+0x7/0x1b > =A0[]=A0?=A0retint_restore_args+0x5/0x6 > =A0[]=A0?=A0child_rip+0x0/0x20 > (XEN)=A0Domain=A00=A0crashed:=A0'noreboot'=A0set=A0-=A0not=A0rebooting. > >> Date: Tue, 12 Apr 2011 06:00:00 -0400 >> From: konrad.wilk@oracle.com >> To: tinnycloud@hotmail.com >> CC: xen-devel@lists.xensource.com; giamteckchoon@gmail.com; >> jeremy@goop.org >> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61 >> >> On Tue, Apr 12, 2011 at 05:11:51PM +0800, MaoXiaoyun wrote: >> > >> > Hi : >> > >> > We are using pvops kernel 2.6.32.36 + xen 4.0.1, but confront a kernel >> > panic bug. >> > >> > 2.6.32.36 Kernel: >> > http://git.kernel.org/?p=3Dlinux/kernel/git/jeremy/xen.git;a=3Dcommit;= h=3Dbb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4 >> > Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba18= 3 >> > >> > Our test is simple, 24 HVMS(Win2003 ) on a single host, each HVM loope= s >> > in restart every 15minutes. >> >> What is the storage that you are using for your guests? AoE? Local disks= ? >> >> > About 17 machines are invovled in the test, after 10 hours run, one >> > confrontted a crash at arch/x86/mm/tlb.c:61 >> > >> > Currently I am trying "cpuidle=3D0 cpufreq=3Dnone" tests based on Teck= 's >> > suggestion. >> > >> > Any comments, thanks. >> > >> > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3Dcrash log=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> > INIT: Id "s0" respawning too fast: disabled for 5 minutes >> > __ratelimit: 14 callbacks suppressed >> > blktap_sysfs_destroy >> > blktap_sysfs_destroy >> > ------------[ cut here ]------------ >> > kernel BUG at arch/x86/mm/tlb.c:61! 
>> > invalid opcode: 0000 [#1] SMP >> > last sysfs file: >> > /sys/devices/system/xen_memory/xen_memory0/info/current_kb >> > CPU 1 >> > Modules linked in: 8021q garp xen_netback xen_blkback blktap >> > blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si >> > ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video out= put >> > sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy snd_seq_os= s >> > snd_seq_midi_event snd_seq snd_seq_device serio_raw bnx2 snd_pcm_oss >> > snd_mixer_oss snd_pcm snd_timer iTCO_wdt snd soundcore snd_page_alloc >> > i2c_i801 iTCO_vendor_support i2c_core pcspkr pata_acpi ata_generic ata= _piix >> > shpchp mptsas mptscsih mptbase [last unloaded: freq_table] >> > Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285 >> > RIP: e030:[] [] leave_mm+0x15/0x46 >> > RSP: e02b:ffff88002805be48 EFLAGS: 00010046 >> > RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0 >> > RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001 >> > RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200 >> > R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880 >> > R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000 >> > FS: 00007f62362d66e0(0000) GS:ffff880028058000(0000) >> > knlGS:0000000000000000 >> > CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b >> > CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660 >> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >> > Process khelper (pid: 25581, threadinfo ffff88007691e000, task >> > ffff88009b92db40) >> > Stack: >> > ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88 >> > <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be7= 8 >> > <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff8101010= 8 >> > Call Trace: >> > >> > [] drop_other_mm_ref+0x2a/0x53 >> > [] >> > 
generic_smp_call_function_single_interrupt+0xd8/0xfc >> > [] xen_call_function_single_interrupt+0x13/0x28 >> > [] handle_IRQ_event+0x66/0x120 >> > [] handle_percpu_irq+0x41/0x6e >> > [] __xen_evtchn_do_upcall+0x1ab/0x27d >> > [] xen_evtchn_do_upcall+0x33/0x46 >> > [] xen_do_hypervisor_callback+0x1e/0x30 >> > >> > [] ? _spin_unlock_irqrestore+0x15/0x17 >> > [] ? xen_restore_fl_direct_end+0x0/0x1 >> > [] ? flush_old_exec+0x3ac/0x500 >> > [] ? load_elf_binary+0x0/0x17ef >> > [] ? load_elf_binary+0x0/0x17ef >> > [] ? load_elf_binary+0x398/0x17ef >> > [] ? need_resched+0x23/0x2d >> > [] ? process_measurement+0xc0/0xd7 >> > [] ? load_elf_binary+0x0/0x17ef >> > [] ? search_binary_handler+0xc8/0x255 >> > [] ? do_execve+0x1c3/0x29e >> > [] ? sys_execve+0x43/0x5d >> > [] ? __call_usermodehelper+0x0/0x6f >> > [] ? kernel_execve+0x68/0xd0 >> > [] ? __call_usermodehelper+0x0/0x6f >> > [] ? xen_restore_fl_direct_end+0x0/0x1 >> > [] ? ____call_usermodehelper+0x113/0x11e >> > [] ? child_rip+0xa/0x20 >> > [] ? __call_usermodehelper+0x0/0x6f >> > [] ? int_ret_from_sys_call+0x7/0x1b >> > [] ? retint_restore_args+0x5/0x6 >> > [] ? child_rip+0x0/0x20 >> > Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c= 3 >> > 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b= eb fe >> > 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8 >> > RIP [] leave_mm+0x15/0x46 >> > RSP >> > ---[ end trace ce9cee6832a9c503 ]--- >> > Kernel panic - not syncing: Fatal exception in interrupt >> > Pid: 25581, comm: khelper Tainted: G D 2.6.32.36fixxen #1 >> > Call Trace: >> > [] panic+0xe0/0x19a >> > [] ? init_amd+0x296/0x37a >> > [] ? xen_force_evtchn_callback+0xd/0xf >> > [] ? check_events+0x12/0x20 >> > [] ? xen_restore_fl_direct_end+0x0/0x1 >> > [] ? print_oops_end_marker+0x23/0x25 >> > [] oops_end+0xb6/0xc6 >> > [] die+0x5a/0x63 >> > [] do_trap+0x115/0x124 >> > [] do_invalid_op+0x9c/0xa5 >> > [] ? leave_mm+0x15/0x46 >> > [] ? xen_clocksource_read+0x21/0x23 >> > [] ? 
HYPERVISOR_vcpu_op+0xf/0x11 >> > [] ? xen_vcpuop_set_next_event+0x52/0x67 >> > [] ? clockevents_program_event+0x78/0x81 >> > [] invalid_op+0x1b/0x20 >> > [] ? _spin_unlock_irqrestore+0x15/0x17 >> > [] ? leave_mm+0x15/0x46 >> > [] drop_other_mm_ref+0x2a/0x53 >> > [] >> > generic_smp_call_function_single_interrupt+0xd8/0xfc >> > [] xen_call_function_single_interrupt+0x13/0x28 >> > [] handle_IRQ_event+0x66/0x120 >> > [] handle_percpu_irq+0x41/0x6e >> > [] __xen_evtchn_do_upcall+0x1ab/0x27d >> > [] xen_evtchn_do_upcall+0x33/0x46 >> > [] xen_do_hypervisor_callback+0x1e/0x30 >> > [] ? _spin_unlock_irqrestore+0x15/0x17 >> > [] ? xen_restore_fl_direct_end+0x0/0x1 >> > [] ? flush_old_exec+0x3ac/0x500 >> > [] ? load_elf_binary+0x0/0x17ef >> > [] ? load_elf_binary+0x0/0x17ef >> > [] ? load_elf_binary+0x398/0x17ef >> > [] ? need_resched+0x23/0x2d >> > [] ? process_measurement+0xc0/0xd7 >> > [] ? load_elf_binary+0x0/0x17ef >> > [] ? search_binary_handler+0xc8/0x255 >> > [] ? do_execve+0x1c3/0x29e >> > [] ? sys_execve+0x43/0x5d >> > [] ? __call_usermodehelper+0x0/0x6f >> > [] ? kernel_execve+0x68/0xd0 >> > [] ? __call_usermodehelper+0x0/0x6f >> > [] ? xen_restore_fl_direct_end+0x0/0x1 >> > [] ? ____call_usermodehelper+0x113/0x11e >> > [] ? child_rip+0xa/0x20 >> > [] ? __call_usermodehelper+0x0/0x6f >> > [] ? int_ret_from_sys_call+0x7/0x1b >> > [] ? retint_restore_args+0x5/0x6 >> > [] ? 
child_rip+0x0/0x20 >> > >> > > From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Thu, 14 Apr 2011 15:56:49 +0800 Message-ID: References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0024717207==" Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: giamteckchoon@gmail.com Cc: jeremy@goop.org, xen devel , konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org --===============0024717207== Content-Type: multipart/alternative; boundary="_f47789e7-8408-4b92-825b-0558efcdbf75_" --_f47789e7-8408-4b92-825b-0558efcdbf75_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable =20 > Date: Thu, 14 Apr 2011 15:26:14 +0800 > Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61 > From: giamteckchoon@gmail.com > To: tinnycloud@hotmail.com > CC: xen-devel@lists.xensource.com; jeremy@goop.org; konrad.wilk@oracle.= com >=20 > 2011/4/14 MaoXiaoyun : > > Hi: > > > > I've done test with "cpuidle=3D0 cpufreq=3Dnone", two machine c= rashed. > > > > blktap_sysfs_destroy > > blktap_sysfs_destroy > > blktap_sysfs_create: adding attributes for dev ffff8800ad581000 > > blktap_sysfs_create: adding attributes for dev ffff8800a48e3e00 > > ------------[ cut here ]------------ > > kernel BUG at arch/x86/mm/tlb.c:61! 
> > invalid opcode: 0000 [#1] SMP > > last sysfs file: /sys/block/tapdeve/dev > > CPU 0 > > Modules linked in: 8021q garp blktap xen_netback xen_blkback blkback_= pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_ms > > ghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sb= s sbshc parport_pc lp parport ses enclosure snd_seq_dummy bnx2 > > serio_raw snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_p= cm_oss snd_mixer_oss snd_pcm i2c_i801 snd_timer i2c_core snd iT > > CO_wdt pata_acpi soundcore iTCO_vendor_ > > support ata_generic snd_page_alloc pcspkr ata_piix shpchp mptsas mpts= csih mptbase [last unloa > > ded: freq_table] > > Pid: 8022, comm: khelper Not tainted 2.6.32.36xen #1 Tecal RH2285 > > RIP: e030:[] [] leave_mm+0x15/0x= 46 > > RSP: e02b:ffff88002803ee48 EFLAGS: 00010046 > > RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81675980 > > RDX: ffff88002803ee78 RSI: 0000000000000000 RDI: 0000000000000000 > > RBP: ffff88002803ee48 R08: ffff8800a4929000 R09: dead000000200200 > > R10: dead000000100100 R11: ffffffff81447292 R12: ffff88012ba07b80 > > R13: ffff880028046020 R14: 00000000000004fb R15: 0000000000000000 > > FS: 00007f410af416e0(0000) GS:ffff88002803b000(0000) knlGS:000000000= 0000000 > > CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b > > CR2: 0000000000469000 CR3: 00000000ad639000 CR4: 0000000000002660 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > Process khelper (pid: 8022, threadinfo ffff8800a4846000, task ffff880= 0a9ed0000) > > Stack: > > ffff88002803ee68 ffffffff8100e4a4 0000000000000001 ffff880097de3b88 > > <0> ffff88002803ee98 ffffffff81087224 ffff88002803ee78 ffff88002803ee= 78 > > <0> ffff88015f808180 00000000000004fb ffff88002803eea8 ffffffff810100= e8 > > Call Trace: > > > > [] drop_other_mm_ref+0x2a/0x53 > > [] generic_smp_call_function_single_interrupt+0xd8= /0xfc > > [] 
xen_call_function_single_interrupt+0x13/0x28 > > [] handle_IRQ_event+0x66/0x120 > > [] handle_percpu_irq+0x41/0x6e > > [] __xen_evtchn_do_upcall+0x1ab/0x27d > > [] xen_evtchn_do_upcall+0x33/0x46 > > [] xen_do_hypervisor_callback+0x1e/0x30 > > > > [] ? _spin_unlock_irqrestore+0x15/0x17 > > [] ? xen_restore_fl_direct_end+0x0/0x1 > > [] ? flush_old_exec+0x3ac/0x500 > > [] ? load_elf_binary+0x0/0x17ef > > [] ? load_elf_binary+0x0/0x17ef > > [] ? load_elf_binary+0x398/0x17ef > > [] ? need_resched+0x23/0x2d > > > > [] ? process_measurement+0xc0/0xd7 > > [] ? load_elf_binary+0x0/0x17ef > > [] ? search_binary_handler+0xc8/0x255 > > [] ? do_execve+0x1c3/0x29e > > [] ? sys_execve+0x43/0x5d > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? kernel_execve+0x68/0xd0 > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? xen_restore_fl_direct_end+0x0/0x1 > > [] ? ____call_usermodehelper+0x113/0x11e > > [] ? child_rip+0xa/0x20 > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? int_ret_from_sys_call+0x7/0x1b > > [] ? retint_restore_args+0x5/0x6 > > [] ? c > > hild_rip+0x0/0x20 > > Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 = c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b= eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8 > > RIP [] leave_mm+0x15/0x46 > > RSP > > ---[ end trace 1522f17fdfc9162d ]--- > > Kernel panic - not syncing: Fatal exception in interrupt > > Pid: 8022, comm: khelper Tainted: G D 2.6.32.36xen #1 > > Call Trace: > > [] panic+0xe0/0x19a > > [] ? init_amd+0x296/0x37a >=20 > Hmmm... both machines are using AMD CPU? Did you hit the same bug on In= tel CPU? >=20 >=20 =20 It is Intel CPU, not AMD.=20 =20 model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz =20 > > [] ? xen_force_evtchn_callback+0xd/0xf > > [] ? check_events+0x12/0x20 > > [] ? xen_restore_fl_direct_end+0x0/0x1 > > [] ? 
print_oops_end_marker+0x23/0x25 > > [] oops_end+0xb6/0xc6 > > [] die+0x5a/0x63 > > [] do_trap+0x115/0x124 > > [] do_invalid_op+0x9c/0xa5 > > [] ? leave_mm+0x15/0x46 > > [] ? xen_clocksource_read+0x21/0x23 > > [] ? HYPERVISOR_vcpu_op+0xf/0x11 > > [] ? xen_vcpuop_set_next_event+0x52/0x67 > > [] invalid_op+0x1b/0x20 > > [] ? _spin_unlock_irqrestore+0x15/0x17 > > [] ? leave_mm+0x15/0x46 > > [] drop_other_mm_ref+0x2a/0x53 > > [] generic_smp_call_function_single_interrupt+0xd8= /0xfc > > [] xen_call_function_single_interrupt+0x13/0x28 > > [] handle_IRQ_event+0x66/0x120 > > [] handle_percpu_irq+0x41/0x6e > > [] __xen_evtchn_do_upcall+0x1ab/0x27d > > [] xen_evtchn_do_upcall+0x33/0x46 > > [] xen_do_hypervisor_callback+0x1e/0x30 > > [] ? _spin_unlock_irqrestore+0x15/0x17 > > [] ? xen_restore_fl_direct_end+0x0/0x1 > > [] ? flush_old_exec+0x3ac/0x500 > > [] ? load_elf_binary+0x0/0x17ef > > [] ? load_elf_binary+0x0/0x17ef > > [] ? load_elf_binary+0x398/0x17ef > > [] ? need_resched+0x23/0x > > 2d > > [] ? process_measurement+0xc0/0xd7 > > [] ? load_elf_binary+0x0/0x17ef > > [] ? search_binary_handler+0xc8/0x255 > > [] ? do_execve+0x1c3/0x29e > > [] ? sys_execve+0x43/0x5d > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? kernel_execve+0x68/0xd0 > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? xen_restore_fl_direct_end+0x0/0x1 > > [] ? ____call_usermodehelper+0x113/0x11e > > [] ? child_rip+0xa/0x20 > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? int_ret_from_sys_call+0x7/0x1b > > [] ? retint_restore_args+0x5/0x6 > > [] ? child_rip+0x0/0x20 > > (XEN) Domain 0 crashed: 'noreboot' set - not rebooting. 
> > > >> Date: Tue, 12 Apr 2011 06:00:00 -0400 > >> From: konrad.wilk@oracle.com > >> To: tinnycloud@hotmail.com > >> CC: xen-devel@lists.xensource.com; giamteckchoon@gmail.com; > >> jeremy@goop.org > >> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61 > >> > >> On Tue, Apr 12, 2011 at 05:11:51PM +0800, MaoXiaoyun wrote: > >> > > >> > Hi : > >> > > >> > We are using pvops kernel 2.6.32.36 + xen 4.0.1, but confront a ke= rnel > >> > panic bug. > >> > > >> > 2.6.32.36 Kernel: > >> > http://git.kernel.org/?p=3Dlinux/kernel/git/jeremy/xen.git;a=3Dcom= mit;h=3Dbb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4 > >> > Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebf= ba183 > >> > > >> > Our test is simple, 24 HVMS(Win2003 ) on a single host, each HVM l= oopes > >> > in restart every 15minutes. > >> > >> What is the storage that you are using for your guests? AoE? Local d= isks? > >> > >> > About 17 machines are invovled in the test, after 10 hours run, on= e > >> > confrontted a crash at arch/x86/mm/tlb.c:61 > >> > > >> > Currently I am trying "cpuidle=3D0 cpufreq=3Dnone" tests based on = Teck's > >> > suggestion. > >> > > >> > Any comments, thanks. > >> > =20 --_f47789e7-8408-4b92-825b-0558efcdbf75_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable
 
--_f47789e7-8408-4b92-825b-0558efcdbf75_-- --===============0024717207== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============0024717207==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Thu, 14 Apr 2011 19:16:37 +0800 Message-ID: References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0686933409==" Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: giamteckchoon@gmail.com Cc: jeremy@goop.org, xen devel , konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org --===============0686933409== Content-Type: multipart/alternative; boundary="_1211f961-1a35-410a-997a-c00be090a19b_" --_1211f961-1a35-410a-997a-c00be090a19b_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable

Hi:

I have been going through the code.

From tlb.c:60, it looks like cpu_tlbstate.state is TLBSTATE_OK, which indicates user space, but the check in the caller at mmu.c:1512 (active_mm == mm) indicates kernel space. That is the conflict.
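To make the two checks at tlb.c:60 and mmu.c:1512 easier to reason about, here is a minimal user-space sketch of them. It is only a model, not kernel code: the struct and the model_* functions are hypothetical names introduced for illustration, and it assumes TLBSTATE_OK is 1 and TLBSTATE_LAZY is 2, as in this kernel's asm/tlbflush.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, matching arch/x86/include/asm/tlbflush.h in 2.6.32. */
#define TLBSTATE_OK   1
#define TLBSTATE_LAZY 2

/* Hypothetical stand-in for the per-CPU cpu_tlbstate. */
struct model_tlbstate {
    const void *active_mm; /* mm this CPU currently has loaded */
    int state;             /* TLBSTATE_OK or TLBSTATE_LAZY */
};

/* Returns true where the real leave_mm() would reach BUG() (tlb.c:60-61). */
static bool model_leave_mm_would_bug(const struct model_tlbstate *ts)
{
    assert(ts->state == TLBSTATE_OK || ts->state == TLBSTATE_LAZY);
    return ts->state == TLBSTATE_OK;
}

/*
 * Mirrors the decision in drop_other_mm_ref() (mmu.c:1512-1513):
 * leave_mm() is called only when this CPU's active_mm is the mm being
 * torn down. Returns true if that call would hit BUG().
 */
static bool model_drop_other_mm_ref_would_bug(const struct model_tlbstate *ts,
                                              const void *mm)
{
    if (ts->active_mm == mm)
        return model_leave_mm_would_bug(ts);
    return false;
}
```

In this model, only the combination of active_mm == mm and state == TLBSTATE_OK reaches BUG(), which is the path the oops shows on the CPU handling the IPI (drop_other_mm_ref calling leave_mm).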
Well, the panicking CPU was handling an IPI at the time. Could something be wrong with the CPU mask?

Thanks.

====== arch/x86/mm/tlb.c ======
  58 void leave_mm(int cpu)
  59 {
  60         if (percpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
  61                 BUG();
  62         cpumask_clear_cpu(cpu,
  63                           mm_cpumask(percpu_read(cpu_tlbstate.active_mm)));
  64         load_cr3(swapper_pg_dir);
  65 }
  66 EXPORT_SYMBOL_GPL(leave_mm);
  67

====== arch/x86/xen/mmu.c ======
1502 #ifdef CONFIG_SMP
1503 /* Another cpu may still have their %cr3 pointing at the pagetable, so
1504    we need to repoint it somewhere else before we can unpin it. */
1505 static void drop_other_mm_ref(void *info)
1506 {
1507         struct mm_struct *mm = info;
1508         struct mm_struct *active_mm;
1509
1510         active_mm = percpu_read(cpu_tlbstate.active_mm);
1511
1512         if (active_mm == mm)
1513                 leave_mm(smp_processor_id());
1514
1515         /* If this cpu still has a stale cr3 reference, then make sure
1516            it has been flushed. */
1517         if (percpu_read(xen_current_cr3) == __pa(mm->pgd))
1518                 load_cr3(swapper_pg_dir);
1519 }

> Date: Thu, 14 Apr 2011 15:26:14 +0800
> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
> From: giamteckchoon@gmail.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; jeremy@goop.org; konrad.wilk@oracle.com
>
> 2011/4/14 MaoXiaoyun :
> > Hi:
> >
> > I've done test with "cpuidle=0 cpufreq=none", two machine crashed.
> >
> > blktap_sysfs_destroy
> > blktap_sysfs_destroy
> > blktap_sysfs_create: adding attributes for dev ffff8800ad581000
> > blktap_sysfs_create: adding attributes for dev ffff8800a48e3e00
> > ------------[ cut here ]------------
> > kernel BUG at arch/x86/mm/tlb.c:61!
> > invalid opcode: 0000 [#1] SMP > > last sysfs file: /sys/block/tapdeve/dev > > CPU 0 > > Modules linked in: 8021q garp blktap xen_netback xen_blkback blkback_= pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_ms > > ghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sb= s sbshc parport_pc lp parport ses enclosure snd_seq_dummy bnx2 > > serio_raw snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_p= cm_oss snd_mixer_oss snd_pcm i2c_i801 snd_timer i2c_core snd iT > > CO_wdt pata_acpi soundcore iTCO_vendor_ > > support ata_generic snd_page_alloc pcspkr ata_piix shpchp mptsas mpts= csih mptbase [last unloa > > ded: freq_table] > > Pid: 8022, comm: khelper Not tainted 2.6.32.36xen #1 Tecal RH2285 > > RIP: e030:[] [] leave_mm+0x15/0x= 46 > > RSP: e02b:ffff88002803ee48 EFLAGS: 00010046 > > RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81675980 > > RDX: ffff88002803ee78 RSI: 0000000000000000 RDI: 0000000000000000 > > RBP: ffff88002803ee48 R08: ffff8800a4929000 R09: dead000000200200 > > R10: dead000000100100 R11: ffffffff81447292 R12: ffff88012ba07b80 > > R13: ffff880028046020 R14: 00000000000004fb R15: 0000000000000000 > > FS: 00007f410af416e0(0000) GS:ffff88002803b000(0000) knlGS:000000000= 0000000 > > CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b > > CR2: 0000000000469000 CR3: 00000000ad639000 CR4: 0000000000002660 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > Process khelper (pid: 8022, threadinfo ffff8800a4846000, task ffff880= 0a9ed0000) > > Stack: > > ffff88002803ee68 ffffffff8100e4a4 0000000000000001 ffff880097de3b88 > > <0> ffff88002803ee98 ffffffff81087224 ffff88002803ee78 ffff88002803ee= 78 > > <0> ffff88015f808180 00000000000004fb ffff88002803eea8 ffffffff810100= e8 > > Call Trace: > > > > [] drop_other_mm_ref+0x2a/0x53 > > [] generic_smp_call_function_single_interrupt+0xd8= /0xfc > > [] 
xen_call_function_single_interrupt+0x13/0x28 > > [] handle_IRQ_event+0x66/0x120 > > [] handle_percpu_irq+0x41/0x6e > > [] __xen_evtchn_do_upcall+0x1ab/0x27d > > [] xen_evtchn_do_upcall+0x33/0x46 > > [] xen_do_hypervisor_callback+0x1e/0x30 > > > > [] ? _spin_unlock_irqrestore+0x15/0x17 > > [] ? xen_restore_fl_direct_end+0x0/0x1 > > [] ? flush_old_exec+0x3ac/0x500 > > [] ? load_elf_binary+0x0/0x17ef > > [] ? load_elf_binary+0x0/0x17ef > > [] ? load_elf_binary+0x398/0x17ef > > [] ? need_resched+0x23/0x2d > > > > [] ? process_measurement+0xc0/0xd7 > > [] ? load_elf_binary+0x0/0x17ef > > [] ? search_binary_handler+0xc8/0x255 > > [] ? do_execve+0x1c3/0x29e > > [] ? sys_execve+0x43/0x5d > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? kernel_execve+0x68/0xd0 > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? xen_restore_fl_direct_end+0x0/0x1 > > [] ? ____call_usermodehelper+0x113/0x11e > > [] ? child_rip+0xa/0x20 > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? int_ret_from_sys_call+0x7/0x1b > > [] ? retint_restore_args+0x5/0x6 > > [] ? c > > hild_rip+0x0/0x20 > > Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 = c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b= eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8 > > RIP [] leave_mm+0x15/0x46 > > RSP > > ---[ end trace 1522f17fdfc9162d ]--- > > Kernel panic - not syncing: Fatal exception in interrupt > > Pid: 8022, comm: khelper Tainted: G D 2.6.32.36xen #1 > > Call Trace: > > [] panic+0xe0/0x19a > > [] ? init_amd+0x296/0x37a > > [] ? xen_force_evtchn_callback+0xd/0xf > > [] ? check_events+0x12/0x20 > > [] ? xen_restore_fl_direct_end+0x0/0x1 > > [] ? print_oops_end_marker+0x23/0x25 > > [] oops_end+0xb6/0xc6 > > [] die+0x5a/0x63 > > [] do_trap+0x115/0x124 > > [] do_invalid_op+0x9c/0xa5 > > [] ? leave_mm+0x15/0x46 > > [] ? xen_clocksource_read+0x21/0x23 > > [] ? HYPERVISOR_vcpu_op+0xf/0x11 > > [] ? xen_vcpuop_set_next_event+0x52/0x67 > > [] invalid_op+0x1b/0x20 > > [] ? 
_spin_unlock_irqrestore+0x15/0x17 > > [] ? leave_mm+0x15/0x46 > > [] drop_other_mm_ref+0x2a/0x53 > > [] generic_smp_call_function_single_interrupt+0xd8= /0xfc > > [] xen_call_function_single_interrupt+0x13/0x28 > > [] handle_IRQ_event+0x66/0x120 > > [] handle_percpu_irq+0x41/0x6e > > [] __xen_evtchn_do_upcall+0x1ab/0x27d > > [] xen_evtchn_do_upcall+0x33/0x46 > > [] xen_do_hypervisor_callback+0x1e/0x30 > > [] ? _spin_unlock_irqrestore+0x15/0x17 > > [] ? xen_restore_fl_direct_end+0x0/0x1 > > [] ? flush_old_exec+0x3ac/0x500 > > [] ? load_elf_binary+0x0/0x17ef > > [] ? load_elf_binary+0x0/0x17ef > > [] ? load_elf_binary+0x398/0x17ef > > [] ? need_resched+0x23/0x > > 2d > > [] ? process_measurement+0xc0/0xd7 > > [] ? load_elf_binary+0x0/0x17ef > > [] ? search_binary_handler+0xc8/0x255 > > [] ? do_execve+0x1c3/0x29e > > [] ? sys_execve+0x43/0x5d > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? kernel_execve+0x68/0xd0 > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? xen_restore_fl_direct_end+0x0/0x1 > > [] ? ____call_usermodehelper+0x113/0x11e > > [] ? child_rip+0xa/0x20 > > [] ? __call_usermodehelper+0x0/0x6f > > [] ? int_ret_from_sys_call+0x7/0x1b > > [] ? retint_restore_args+0x5/0x6 > > [] ? child_rip+0x0/0x20 > > (XEN) Domain 0 crashed: 'noreboot' set - not rebooting. > > > >> Date: Tue, 12 Apr 2011 06:00:00 -0400 > >> From: konrad.wilk@oracle.com > >> To: tinnycloud@hotmail.com > >> CC: xen-devel@lists.xensource.com; giamteckchoon@gmail.com; > >> jeremy@goop.org > >> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61 > >> > >> On Tue, Apr 12, 2011 at 05:11:51PM +0800, MaoXiaoyun wrote: > >> > > >> > Hi : > >> > > >> > We are using pvops kernel 2.6.32.36 + xen 4.0.1, but confront a ke= rnel > >> > panic bug. 
> >> > > >> > 2.6.32.36 Kernel: > >> > http://git.kernel.org/?p=3Dlinux/kernel/git/jeremy/xen.git;a=3Dcom= mit;h=3Dbb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4 > >> > Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebf= ba183 > >> > > >> > Our test is simple, 24 HVMS(Win2003 ) on a single host, each HVM l= oopes > >> > in restart every 15minutes. > >> > >> What is the storage that you are using for your guests? AoE? Local d= isks? > >> > >> > About 17 machines are invovled in the test, after 10 hours run, on= e > >> > confrontted a crash at arch/x86/mm/tlb.c:61 > >> > > >> > Currently I am trying "cpuidle=3D0 cpufreq=3Dnone" tests based on = Teck's > >> > suggestion. > >> > > >> > Any comments, thanks. > >> > =20 --_1211f961-1a35-410a-997a-c00be090a19b_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable Hi:
 
I went through the code.
From tlb.c:60, it looks like cpu_tlbstate.state is TLBSTATE_OK,
which indicates user space, but the caller at mmu.c:1512,
(active_mm == mm), indicates kernel space. That is the conflict.

Well, the panicking CPU was processing an IPI. Could something be wrong
with the CPU mask?

Thanks.
 
====== arch/x86/mm/tlb.c ===
 58 void leave_mm(int cpu)
 59 {
 60         if (percpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
 61                 BUG();
 62         cpumask_clear_cpu(cpu,
 63                           mm_cpumask(percpu_read(cpu_tlbstate.active_mm)));
 64         load_cr3(swapper_pg_dir);
 65 }
 66 EXPORT_SYMBOL_GPL(leave_mm);
 67
 
///arch/x86/xen/mmu.c

1502 #ifdef CONFIG_SMP
1503 /* Another cpu may still have their %cr3 pointing at the pagetable, so
1504    we need to repoint it somewhere else before we can unpin it. */
1505 static void drop_other_mm_ref(void *info)
1506 {
1507         struct mm_struct *mm = info;
1508         struct mm_struct *active_mm;
1509
1510         active_mm = percpu_read(cpu_tlbstate.active_mm);
1511
1512         if (active_mm == mm)
1513                 leave_mm(smp_processor_id());
1514
1515         /* If this cpu still has a stale cr3 reference, then make sure
1516            it has been flushed. */
1517         if (percpu_read(xen_current_cr3) == __pa(mm->pgd))
1518                 load_cr3(swapper_pg_dir);
1519 }
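
To make the conflict easier to see, here is a minimal user-space sketch of the control flow in the two listings above. It is only a model, not kernel code: `struct cpu_model`, `model_drop_other_mm_ref()`, `enum outcome` and `SWAPPER_MM` are hypothetical names, and each mm is reduced to an integer id. Only the order of the checks is taken from the listings.

```c
#include <assert.h>

/* Same names as the kernel's TLB states; values as I recall them from asm/tlbflush.h. */
enum tlb_state { TLBSTATE_OK = 1, TLBSTATE_LAZY = 2 };

/* Hypothetical result codes for the model. */
enum outcome { OUT_NOTHING, OUT_LEFT_MM, OUT_RELOADED_CR3, OUT_BUG };

#define SWAPPER_MM 0    /* hypothetical id standing in for swapper_pg_dir */

/* Hypothetical per-CPU model; every mm is just an integer id. */
struct cpu_model {
    enum tlb_state state;   /* models cpu_tlbstate.state */
    int active_mm;          /* models cpu_tlbstate.active_mm */
    int cr3_mm;             /* mm whose page tables cr3 points at */
};

/* Follows the order of checks in drop_other_mm_ref() and leave_mm(). */
static enum outcome model_drop_other_mm_ref(struct cpu_model *cpu, int mm)
{
    enum outcome out = OUT_NOTHING;

    if (cpu->active_mm == mm) {
        if (cpu->state == TLBSTATE_OK)
            return OUT_BUG;             /* the BUG() at tlb.c:61 */
        cpu->cr3_mm = SWAPPER_MM;       /* leave_mm(): load_cr3(swapper_pg_dir) */
        out = OUT_LEFT_MM;
    }
    if (cpu->cr3_mm == mm) {            /* stale cr3 reference */
        cpu->cr3_mm = SWAPPER_MM;
        out = OUT_RELOADED_CR3;
    }
    return out;
}
```

In this model the only input that reaches the BUG is active_mm == mm together with TLBSTATE_OK, which is the combination seen in the panic.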


 
> Date: Thu, 14 Apr 2011 15:26:14 +0800
> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
> From: giamteckchoon@gmail.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com; jeremy@goop.org; konrad.wilk@oracle.com
>
> 2011/4/14 MaoXiaoyun <tinnycloud@hotmail.com>:
> > Hi:
> >
> > I've done test with "cpuidle=0 cpufreq=none", two machine crashed.
> >
> > blktap_sysfs_destroy
> > blktap_sysfs_destroy
> > blktap_sysfs_create: adding attributes for dev ffff8800ad581000
> > blktap_sysfs_create: adding attributes for dev ffff8800a48e3e00
> > ------------[ cut here ]------------
> > kernel BUG at arch/x86/mm/tlb.c:61!
> > invalid opcode: 0000 [#1] SMP
> > last sysfs file: /sys/block/tapdeve/dev
> > CPU 0
> > Modules linked in: 8021q garp blktap xen_netback xen_blkback
> > blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si
> > ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output
> > sbs sbshc parport_pc lp parport ses enclosure snd_seq_dummy bnx2 serio_raw
> > snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss
> > snd_mixer_oss snd_pcm i2c_i801 snd_timer i2c_core snd iTCO_wdt pata_acpi
> > soundcore iTCO_vendor_support ata_generic snd_page_alloc pcspkr ata_piix
> > shpchp mptsas mptscsih mptbase [last unloaded: freq_table]
> > Pid: 8022, comm: khelper Not tainted 2.6.32.36xen #1 Tecal RH2285
> > RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> > RSP: e02b:ffff88002803ee48  EFLAGS: 00010046
> > RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81675980
> > RDX: ffff88002803ee78 RSI: 0000000000000000 RDI: 0000000000000000
> > RBP: ffff88002803ee48 R08: ffff8800a4929000 R09: dead000000200200
> > R10: dead000000100100 R11: ffffffff81447292 R12: ffff88012ba07b80
> > R13: ffff880028046020 R14: 00000000000004fb R15: 0000000000000000
> > FS:  00007f410af416e0(0000) GS:ffff88002803b000(0000) knlGS:0000000000000000
> > CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: 0000000000469000 CR3: 00000000ad639000 CR4: 0000000000002660
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process khelper (pid: 8022, threadinfo ffff8800a4846000, task ffff8800a9ed0000)
> > Stack:
> >  ffff88002803ee68 ffffffff8100e4a4 0000000000000001 ffff880097de3b88
> > <0> ffff88002803ee98 ffffffff81087224 ffff88002803ee78 ffff88002803ee78
> > <0> ffff88015f808180 00000000000004fb ffff88002803eea8 ffffffff810100e8
> > Call Trace:
> >  <IRQ>
> >  [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
> >  [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
> >  [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
> >  [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
> >  [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
> >  [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
> >  [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
> >  [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
> >  <EOI>
> >  [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
> >  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff81113f75>] ? flush_old_exec+0x3ac/0x500
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
> >  [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
> >  [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
> >  [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
> >  [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
> >  [<ffffffff81013daa>] ? child_rip+0xa/0x20
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
> >  [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
> >  [<ffffffff81013da0>] ? child_rip+0x0/0x20
> > Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
> > RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
> >  RSP <ffff88002803ee48>
> > ---[ end trace 1522f17fdfc9162d ]---
> > Kernel panic - not syncing: Fatal exception in interrupt
> > Pid: 8022, comm: khelper Tainted: G      D    2.6.32.36xen #1
> > Call Trace:
> >  <IRQ>  [<ffffffff8105682e>] panic+0xe0/0x19a
> >  [<ffffffff8144006a>] ? init_amd+0x296/0x37a
> >  [<ffffffff8100f169>] ? xen_force_evtchn_callback+0xd/0xf
> >  [<ffffffff8100f8c2>] ? check_events+0x12/0x20
> >  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff81056487>] ? print_oops_end_marker+0x23/0x25
> >  [<ffffffff81448165>] oops_end+0xb6/0xc6
> >  [<ffffffff810166e5>] die+0x5a/0x63
> >  [<ffffffff81447a3c>] do_trap+0x115/0x124
> >  [<ffffffff810148e6>] do_invalid_op+0x9c/0xa5
> >  [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
> >  [<ffffffff8100f6e6>] ? xen_clocksource_read+0x21/0x23
> >  [<ffffffff8100f258>] ? HYPERVISOR_vcpu_op+0xf/0x11
> >  [<ffffffff8100f753>] ? xen_vcpuop_set_next_event+0x52/0x67
> >  [<ffffffff81013b3b>] invalid_op+0x1b/0x20
> >  [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
> >  [<ffffffff8103a3cb>] ? leave_mm+0x15/0x46
> >  [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
> >  [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
> >  [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
> >  [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
> >  [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
> >  [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
> >  [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
> >  [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
> >  <EOI>  [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
> >  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff81113f75>] ? flush_old_exec+0x3ac/0x500
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
> >  [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
> >  [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
> >  [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
> >  [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
> >  [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
> >  [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff8100f8af>] ? xen_restore_fl_direct_end+0x0/0x1
> >  [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
> >  [<ffffffff81013daa>] ? child_rip+0xa/0x20
> >  [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
> >  [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
> >  [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
> >  [<ffffffff81013da0>] ? child_rip+0x0/0x20
> > (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
> >
> >> Date: Tue, 12 Apr 2011 06:00:00 -0400
> >> From: konrad.wilk@oracle.com
> >> To: tinnycloud@hotmail.com
> >> CC: xen-devel@lists.xensource.com; giamteckchoon@gmail.com;
> >> jeremy@goop.org
> >> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
> >>
> >> On Tue, Apr 12, 2011 at 05:11:51PM +0800, MaoXiaoyun wrote:
> >> >
> >> > Hi :
> >> >
> >> > We are using pvops kernel 2.6.32.36 + xen 4.0.1, but have hit a kernel
> >> > panic bug.
> >> >
> >> > 2.6.32.36 Kernel:
> >> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=bb1a15e55ec665a64c8a9c6bd699b1f16ac01ff4
> >> > Xen 4.0.1 http://xenbits.xen.org/hg/xen-4.0-testing.hg/rev/b536ebfba183
> >> >
> >> > Our test is simple: 24 HVMs (Win2003) on a single host, each HVM
> >> > restarting in a loop every 15 minutes.
> >>
> >> What is the storage that you are using for your guests? AoE? Local disks?
> >>
> >> > About 17 machines are involved in the test; after a 10-hour run, one
> >> > hit a crash at arch/x86/mm/tlb.c:61
> >> >
> >> > Currently I am trying "cpuidle=0 cpufreq=none" tests based on Teck's
> >> > suggestion.
> >> >
> >> > Any comments, thanks.
> >> >



--_1211f961-1a35-410a-997a-c00be090a19b_-- --===============0686933409== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============0686933409==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Fri, 15 Apr 2011 20:23:55 +0800 Message-ID: References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1406585959==" Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: giamteckchoon@gmail.com Cc: jeremy@goop.org, xen devel , konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org --===============1406585959== Content-Type: multipart/alternative; boundary="_f022eacc-e797-467d-81e3-316272c942eb_" --_f022eacc-e797-467d-81e3-316272c942eb_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable
Hi:

Could the crash be related to this patch?
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3

The TLB state now changes to TLBSTATE_OK (mmu_context.h:40) before cpumask_clear_cpu() (line 49).
Is it possible that, right after line 40 of mmu_context.h executes, the CPU receives an IPI from another CPU to
flush the mm, and the interrupt handler then finds the TLB state is TLBSTATE_OK? That would be the conflict.

Thanks.

arch/x86/include/asm/mmu_context.h =20 33 static inline void switch_mm(struct mm_struct *prev, struct mm_struct = *next, 34 <+++<+++<+++ struct task_struct *tsk) 35 { 36 <+++unsigned cpu =3D smp_processor_id(); 37=20 38 <+++if (likely(prev !=3D next)) { 39 #ifdef CONFIG_SMP 40 <+++<+++percpu_write(cpu_tlbstate.state, TLBSTATE_OK); 41 <+++<+++percpu_write(cpu_tlbstate.active_mm, next); 42 #endif 43 <+++<+++cpumask_set_cpu(cpu, mm_cpumask(next)); 44=20 45 <+++<+++/* Re-load page tables */ 46 <+++<+++load_cr3(next->pgd); 47=20 48 <+++<+++/* stop flush ipis for the previous mm */ 49 <+++<+++cpumask_clear_cpu(cpu, mm_cpumask(prev)); =20 =20 --_f022eacc-e797-467d-81e3-316272c942eb_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable Hi=A3=BA

Could the crash be related to this patch?
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3

The TLB state now changes to TLBSTATE_OK (mmu_context.h:40) before cpumask_clear_cpu() (line 49).
Is it possible that, right after line 40 of mmu_context.h executes, the CPU receives an IPI from another CPU to
flush the mm, and the interrupt handler then finds the TLB state is TLBSTATE_OK? That would be the conflict.

Thanks.

arch/x86/include/asm/mmu_context.h

 33 static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 34                              struct task_struct *tsk)
 35 {
 36         unsigned cpu = smp_processor_id();
 37
 38         if (likely(prev != next)) {
 39 #ifdef CONFIG_SMP
 40                 percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
 41                 percpu_write(cpu_tlbstate.active_mm, next);
 42 #endif
 43                 cpumask_set_cpu(cpu, mm_cpumask(next));
 44
 45                 /* Re-load page tables */
 46                 load_cr3(next->pgd);
 47
 48                 /* stop flush ipis for the previous mm */
 49                 cpumask_clear_cpu(cpu, mm_cpumask(prev));
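
The suspected window can be sketched as a small user-space model. This is an illustration under assumptions, not kernel code: `struct cpu_model`, `ipi_would_bug()` and `switch_mm_ipi_bugs()` are hypothetical names, each mm is an integer id, and the CPU is assumed to enter switch_mm() still holding prev in TLBSTATE_LAZY (the thread does not state this). The model replays the two percpu writes from lines 40 and 41 and asks whether an IPI running the drop_other_mm_ref() check at a given point would reach the BUG in leave_mm().

```c
#include <assert.h>
#include <stdbool.h>

/* Same names as the kernel's TLB states; values as I recall them from asm/tlbflush.h. */
enum tlb_state { TLBSTATE_OK = 1, TLBSTATE_LAZY = 2 };

/* Hypothetical per-CPU model; every mm is just an integer id. */
struct cpu_model {
    enum tlb_state state;   /* models cpu_tlbstate.state */
    int active_mm;          /* models cpu_tlbstate.active_mm */
};

/* True when drop_other_mm_ref(mm) would call leave_mm() and hit its BUG(). */
static bool ipi_would_bug(const struct cpu_model *cpu, int mm)
{
    return cpu->active_mm == mm && cpu->state == TLBSTATE_OK;
}

/*
 * Replays lines 40 and 41 of switch_mm(prev, next) and delivers one
 * hypothetical IPI for ipi_mm after the source line ipi_after_line.
 * Assumption: the CPU starts out lazily holding prev.
 */
static bool switch_mm_ipi_bugs(int prev, int next, int ipi_after_line, int ipi_mm)
{
    struct cpu_model cpu = { TLBSTATE_LAZY, prev };

    if (ipi_after_line < 40)
        return ipi_would_bug(&cpu, ipi_mm);
    cpu.state = TLBSTATE_OK;                /* line 40 */
    if (ipi_after_line == 40)
        return ipi_would_bug(&cpu, ipi_mm);
    cpu.active_mm = next;                   /* line 41 */
    return ipi_would_bug(&cpu, ipi_mm);     /* line 41 or later */
}
```

Under these assumptions, an IPI for prev trips the check only when it lands between line 40 and line 41. Whether interrupts can actually be delivered at that point depends on the caller of switch_mm(), which this sketch does not model.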



= --_f022eacc-e797-467d-81e3-316272c942eb_-- --===============1406585959== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============1406585959==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeremy Fitzhardinge Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Fri, 15 Apr 2011 14:22:29 -0700 Message-ID: <4DA8B715.9080508@goop.org> References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , Mime-Version: 1.0 Content-Type: text/plain; charset=GB2312 Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: MaoXiaoyun Cc: xen devel , giamteckchoon@gmail.com, konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org On 04/15/2011 05:23 AM, MaoXiaoyun wrote: > Hi=A3=BA > > Could the crash related to this patch ? > http://git.kernel.org/?p=3Dlinux/kernel/git/jeremy/xen.git;a=3Dcommitdi= ff;h=3D45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3 > > Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is before > cpumask_clear_cpu(line 49). > Could it possible that right after execute line 40 of mmu_context.h, > CPU revice IPI from other CPU to > flush the mm, and when in interrupt, find the TLB state happened to be > TLBSTATE_OK. Which conflicts. Does reverting it help? J > > Thanks. 
> > arch/x86/include/asm/mmu_context.h > > 33 static inline void switch_mm(struct mm_struct *prev, struct > mm_struct *next, > 34 <+++<+++<+++ struct task_struct *tsk) > 35 { > 36 <+++unsigned cpu =3D smp_processor_id(); > 37 > 38 <+++if (likely(prev !=3D next)) { > 39 #ifdef CONFIG_SMP > 40 <+++<+++percpu_write(cpu_tlbstate.state, TLBSTATE_OK); > 41 <+++<+++percpu_write(cpu_tlbstate.active_mm, next); > 42 #endif > 43 <+++<+++cpumask_set_cpu(cpu, mm_cpumask(next)); > 44 > 45 <+++<+++/* Re-load page tables */ > 46 <+++<+++load_cr3(next->pgd); > 47 > 48 <+++<+++/* stop flush ipis for the previous mm */ > 49 <+++<+++cpumask_clear_cpu(cpu, mm_cpumask(prev)); > > From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Mon, 18 Apr 2011 23:20:31 +0800 Message-ID: References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , <4DA8B715.9080508@goop.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1111364251==" Return-path: In-Reply-To: <4DA8B715.9080508@goop.org> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: xen devel Cc: jeremy@goop.org List-Id: xen-devel@lists.xenproject.org --===============1111364251== Content-Type: multipart/alternative; boundary="_a68ccabf-f159-4535-b68f-106da27ef22a_" --_a68ccabf-f159-4535-b68f-106da27ef22a_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable =20 > Date: Fri, 15 Apr 2011 14:22:29 -0700 > From: jeremy@goop.org > To: tinnycloud@hotmail.com > CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; konrad.wilk= @oracle.com > Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61 >=20 > On 04/15/2011 05:23 AM, MaoXiaoyun wrote: > > Hi=A3=BA > > > > Could the crash related to this patch ? 
> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3
> >
> > Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is before
> > cpumask_clear_cpu(line 49).
> > Could it possible that right after execute line 40 of mmu_context.h,
> > CPU receive IPI from other CPU to
> > flush the mm, and when in interrupt, find the TLB state happened to be
> > TLBSTATE_OK. Which conflicts.
>
> Does reverting it help?
>
> J

Very likely.

Previously, in the 17-machine test, one to three machines would fail within 10 hours, very easily.

After reverting, we have 29 machines in the test: 28 ran successfully for 2 days, and 1 failed after 28 hours.
Unfortunately I can't tell whether the failed one is related to this bug, since I got no log in messages, and
the machine was rebooted by someone before I could see anything on the serial port.

In my opinion, though, the failure points to another bug, which I happened to hit before.

Earlier, one of my development machines (2.6.32.36 kernel + xen 4.0.1) completely stopped responding,
including the serial console. There was no abnormal message on the serial port; it looks like Xen ran into a deadlock.
It happens rarely; I have only seen it once so far.

Now I am trying to figure out what might cause the deadlock; we never saw this before.
I don't have a clear idea of how to dig it out, but I think this bug is in Xen,
since if only dom0 hung, Xen should still work and the serial console would still respond.
If so, the bug may have been introduced between 4.0.0 and 4.0.1.

What do you think? Thanks.

> >
> > Thanks.
> > > > arch/x86/include/asm/mmu_context.h > > > > 33 static inline void switch_mm(struct mm_struct *prev, struct > > mm_struct *next, > > 34 <+++<+++<+++ struct task_struct *tsk) > > 35 { > > 36 <+++unsigned cpu =3D smp_processor_id(); > > 37 > > 38 <+++if (likely(prev !=3D next)) { > > 39 #ifdef CONFIG_SMP > > 40 <+++<+++percpu_write(cpu_tlbstate.state, TLBSTATE_OK); > > 41 <+++<+++percpu_write(cpu_tlbstate.active_mm, next); > > 42 #endif > > 43 <+++<+++cpumask_set_cpu(cpu, mm_cpumask(next)); > > 44 > > 45 <+++<+++/* Re-load page tables */ > > 46 <+++<+++load_cr3(next->pgd); > > 47 > > 48 <+++<+++/* stop flush ipis for the previous mm */ > > 49 <+++<+++cpumask_clear_cpu(cpu, mm_cpumask(prev)); > > > > >=20 =20 --_a68ccabf-f159-4535-b68f-106da27ef22a_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable
 
> Date: Fri, 15 Apr 2011 14:22:29 -0700
> From: jeremy@goop.org
> To: tinnycloud@hotmail.com
> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; konrad.wilk@oracle.com
> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
>
> On 04/15/2011 05:23 AM, MaoXiaoyun wrote:
> > Hi:
> >
> > Could the crash related to this patch ?
> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3
> >
> > Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is before
> > cpumask_clear_cpu(line 49).
> > Could it possible that right after execute line 40 of mmu_context.h,
> > CPU revice IPI from other CPU to
> > flush the mm, and when in interrupt, find the TLB state happened to be
> > TLBSTATE_OK. Which conflicts.
>
> Does reverting it help?
>
> J
 
Very likely.

Previously, in the 17-machine test, one to three machines would fail within 10 hours, very easily.

After reverting, we have 29 machines in the test: 28 ran successfully for 2 days, and 1 failed after 28 hours.
Unfortunately I can't tell whether the failed one is related to this bug, since I got no log in messages, and
the machine was rebooted by someone before I could see anything on the serial port.

In my opinion, though, the failure points to another bug, which I happened to hit before.

Earlier, one of my development machines (2.6.32.36 kernel + xen 4.0.1) completely stopped responding,
including the serial console. There was no abnormal message on the serial port; it looks like Xen ran into a deadlock.
It happens rarely; I have only seen it once so far.

Now I am trying to figure out what might cause the deadlock; we never saw this before.
I don't have a clear idea of how to dig it out, but I think this bug is in Xen,
since if only dom0 hung, Xen should still work and the serial console would still respond.
If so, the bug may have been introduced between 4.0.0 and 4.0.1.

What do you think? Thanks.

> >
> > Thanks.
> >
> > arch/x86/include/asm/mmu_context.h
> >
> > 33 static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> > 34                              struct task_struct *tsk)
> > 35 {
> > 36         unsigned cpu = smp_processor_id();
> > 37
> > 38         if (likely(prev != next)) {
> > 39 #ifdef CONFIG_SMP
> > 40                 percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
> > 41                 percpu_write(cpu_tlbstate.active_mm, next);
> > 42 #endif
> > 43                 cpumask_set_cpu(cpu, mm_cpumask(next));
> > 44
> > 45                 /* Re-load page tables */
> > 46                 load_cr3(next->pgd);
> > 47
> > 48                 /* stop flush ipis for the previous mm */
> > 49                 cpumask_clear_cpu(cpu, mm_cpumask(prev));
> >
> >
>

--_a68ccabf-f159-4535-b68f-106da27ef22a_-- --===============1111364251== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============1111364251==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Mon, 25 Apr 2011 11:15:15 +0800 Message-ID: References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , <4DA8B715.9080508@goop.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1730289001==" Return-path: In-Reply-To: <4DA8B715.9080508@goop.org> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: jeremy@goop.org Cc: xen devel , giamteckchoon@gmail.com, konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org --===============1730289001== Content-Type: multipart/alternative; boundary="_695d5714-ca49-4951-b159-c7aee7c65a17_" --_695d5714-ca49-4951-b159-c7aee7c65a17_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable =20 > Date: Fri, 15 Apr 2011 14:22:29 -0700 > From: jeremy@goop.org > To: tinnycloud@hotmail.com > CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; konrad.wilk= @oracle.com > Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61 >=20 > On 04/15/2011 05:23 AM, MaoXiaoyun wrote: > > Hi=A3=BA > > > > Could the crash related to this patch ? > > http://git.kernel.org/?p=3Dlinux/kernel/git/jeremy/xen.git;a=3Dcommit= diff;h=3D45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3 > > > > Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is before > > cpumask_clear_cpu(line 49). 
> > Could it possible that right after execute line 40 of mmu_context.h,
> > CPU receive IPI from other CPU to
> > flush the mm, and when in interrupt, find the TLB state happened to be
> > TLBSTATE_OK. Which conflicts.
>
> Does reverting it help?
>
> J

Hi Jeremy:

The latest test result shows that reverting did not help.
The kernel panics at exactly the same place in tlb.c.

I have a question about the TLB state. From the stack:
xen_do_hypervisor_callback -> xen_evtchn_do_upcall -> ... -> drop_other_mm_ref

What should cpu_tlbstate.state be at that point? Are TLBSTATE_OK and TLBSTATE_LAZY both possible?
That is, after a hypercall from user space the state would be TLBSTATE_OK, and
from kernel space it would be TLBSTATE_LAZY?

Thanks.

 [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
 [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
 [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
 [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30

>
> >
> > Thanks.
> >
> > arch/x86/include/asm/mmu_context.h
> >
> > 33 static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> > 34                              struct task_struct *tsk)
> > 35 {
> > 36         unsigned cpu = smp_processor_id();
> > 37
> > 38         if (likely(prev != next)) {
> > 39 #ifdef CONFIG_SMP
> > 40                 percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
> > 41                 percpu_write(cpu_tlbstate.active_mm, next);
> > 42 #endif
> > 43                 cpumask_set_cpu(cpu, mm_cpumask(next));
> > 44
> > 45                 /* Re-load page tables */
> > 46                 load_cr3(next->pgd);
> > 47
> > 48                 /* stop flush ipis for the previous mm */
> > 49                 cpumask_clear_cpu(cpu, mm_cpumask(prev));
> >
> >
>

--_695d5714-ca49-4951-b159-c7aee7c65a17_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable
> Date: Fri, 15 Apr 2011 14:22:29 -0700
> From: jeremy@goop.org
> To: tinnycloud@hotmail.com
> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; konrad.wilk@oracle.com
> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
>
> On 04/15/2011 05:23 AM, MaoXiaoyun wrote:
> > Hi：
> >
> > Could the crash related to this patch ?
> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3
> >
> > Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is before
> > cpumask_clear_cpu(line 49).
> > Could it possible that right after execute line 40 of mmu_context.h,
> > CPU revice IPI from other CPU to
> > flush the mm, and when in interrupt, find the TLB state happened to be
> > TLBSTATE_OK. Which conflicts.
>
> Does reverting it help?
>
> J
 
Hi Jeremy:

    The latest test result shows that reverting the patch didn't help.
    The kernel panics at exactly the same place in tlb.c.

    I have a question about the TLB state. From the stack,
    xen_do_hypervisor_callback -> xen_evtchn_do_upcall -> ... -> drop_other_mm_ref

    What should cpu_tlbstate.state be here? Are both TLBSTATE_OK and TLBSTATE_LAZY possible?
    That is, after a hypercall from userspace the state will be TLBSTATE_OK, and
    if it comes from kernel space the state will be TLBSTATE_LAZY?

    Thanks.
   

 [<ffffffff8100e4a4>] drop_other_mm_ref+0x2a/0x53
 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [<ffffffff810100e8>] xen_call_function_single_interrupt+0x13/0x28
 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
 [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
 [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
 [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30


>
> >
> > Thanks.
> >
> > arch/x86/include/asm/mmu_context.h
> >
> > 33 static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> > 34                  struct task_struct *tsk)
> > 35 {
> > 36     unsigned cpu = smp_processor_id();
> > 37
> > 38     if (likely(prev != next)) {
> > 39 #ifdef CONFIG_SMP
> > 40         percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
> > 41         percpu_write(cpu_tlbstate.active_mm, next);
> > 42 #endif
> > 43         cpumask_set_cpu(cpu, mm_cpumask(next));
> > 44
> > 45         /* Re-load page tables */
> > 46         load_cr3(next->pgd);
> > 47
> > 48         /* stop flush ipis for the previous mm */
> > 49         cpumask_clear_cpu(cpu, mm_cpumask(prev));
> >
> >
>

--_695d5714-ca49-4951-b159-c7aee7c65a17_-- --===============1730289001== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============1730289001==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Mon, 25 Apr 2011 12:42:48 +0800 Message-ID: References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , <4DA8B715.9080508@goop.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1278404576==" Return-path: In-Reply-To: <4DA8B715.9080508@goop.org> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: jeremy@goop.org Cc: xen devel , giamteckchoon@gmail.com, konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org --===============1278404576== Content-Type: multipart/alternative; boundary="_8a771ba3-574c-42cf-ab64-c54cc7d9a52f_" --_8a771ba3-574c-42cf-ab64-c54cc7d9a52f_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable --_8a771ba3-574c-42cf-ab64-c54cc7d9a52f_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable
I go through the switch_mm more, and come up one more question:
 
Why don't we need to clear prev's cpumask between line 59 and line 60?
Say:
1)  Context switches from process A to a kernel thread, so the kernel thread has active_mm -> A's mm.
2)  Context switches from the kernel thread back to A; in sched.c, oldmm = A's mm and mm = A's mm.
3)  It will reach arch/x86/include/asm/mmu_context.h:60, since prev == next.
     If another CPU flushes A's mm but this CPU has not cleared its bit in the CPU mask, it might
     enter the IPI interrupt routine and also find that cpu_tlbstate.state is TLBSTATE_OK.

Could this be possible?
 
kernel/sched.c
 
context_switch(struct rq *rq, struct task_struct *prev,
           struct task_struct *next)
{
    struct mm_struct *mm, *oldmm;

    prepare_task_switch(rq, prev, next);
    trace_sched_switch(rq, prev, next);
    mm = next->mm;
    oldmm = prev->active_mm;
    /*
     * For paravirt, this is coupled with an exit in switch_to to
     * combine the page table reload and the switch backend into
     * one hypercall.
     */
    arch_start_context_switch(prev);

    if (unlikely(!mm)) {
        next->active_mm = oldmm;
        atomic_inc(&oldmm->mm_count);
        enter_lazy_tlb(oldmm, next);
    } else
        switch_mm(oldmm, mm, next);

    if (unlikely(!prev->mm)) {
        prev->active_mm = NULL;
        rq->prev_mm = oldmm;
    }
 
 
 33 static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 34                  struct task_struct *tsk)
 35 {
 36     unsigned cpu = smp_processor_id();
 37
 38     if (likely(prev != next)) {
 39         /* stop flush ipis for the previous mm */
 40         cpumask_clear_cpu(cpu, mm_cpumask(prev));
 41
 42
 43 #ifdef CONFIG_SMP
 44         percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
 45         percpu_write(cpu_tlbstate.active_mm, next);
 46 #endif
 47         cpumask_set_cpu(cpu, mm_cpumask(next));
 48
 49         /* Re-load page tables */
 50         load_cr3(next->pgd);
 51
 52         /*
 53          * load the LDT, if the LDT is different:
 54          */
 55         if (unlikely(prev->context.ldt != next->context.ldt))
 56             load_LDT_nolock(&next->context);
 57     }
 58 #ifdef CONFIG_SMP
 59     else {
 60         percpu_write(cpu_tlbstate.state, TLBSTATE_OK);
 61         BUG_ON(percpu_read(cpu_tlbstate.active_mm) != next);
 62
 63         if (!cpumask_test_and_set_cpu(cpu, mm_cpumask(next))) {
 64             /* We were in lazy tlb mode and leave_mm disabled
 65              * tlb flush IPI delivery. We must reload CR3
 66              * to make sure to use no freed page tables.
 67              */
 68             load_cr3(next->pgd);
 69             load_LDT_nolock(&next->context);
 70         }
 71     }
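To make the question about lines 59 and 60 easier to reason about, the two branches of the switch_mm listing above can be modelled in user space. This is only a sketch under stated assumptions: `mm_model`, `cpu_model` and the `model_*()` functions are hypothetical names rather than kernel code, the cpumask is reduced to one bit per CPU, "reloading CR3" is just a counter, and the TLBSTATE_* values are assumed. It illustrates that in the prev == next branch the CPU's bit is re-set with a test-and-set, and CR3 is reloaded only if leave_mm had cleared the bit while the CPU was lazy.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; stand-ins for the kernel's TLBSTATE_* constants. */
enum { TLBSTATE_OK = 1, TLBSTATE_LAZY = 2 };

struct mm_model {
    unsigned long cpumask;       /* bit n set: CPU n gets flush IPIs for this mm */
};

/* Hypothetical per-CPU state, loosely mirroring cpu_tlbstate. */
struct cpu_model {
    int id;
    int state;
    struct mm_model *active_mm;
    int cr3_reloads;             /* counts modelled load_cr3() calls */
};

/* Models enter_lazy_tlb(): a kernel thread borrows the previous mm. */
void model_enter_lazy_tlb(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    if (cpu->state == TLBSTATE_OK)
        cpu->state = TLBSTATE_LAZY;
}

/* Models leave_mm() on a lazy CPU: stop flush IPIs for the borrowed mm. */
void model_leave_mm(struct cpu_model *cpu)
{
    assert(cpu->state != TLBSTATE_OK);   /* the kernel would BUG() here */
    cpu->active_mm->cpumask &= ~(1UL << cpu->id);
}

/* Models both branches of the switch_mm() listing. */
void model_switch_mm(struct cpu_model *cpu, struct mm_model *prev,
                     struct mm_model *next)
{
    unsigned long bit = 1UL << cpu->id;

    if (prev != next) {
        prev->cpumask &= ~bit;           /* line 40: stop flush IPIs for prev */
        cpu->state = TLBSTATE_OK;        /* line 44 */
        cpu->active_mm = next;           /* line 45 */
        next->cpumask |= bit;            /* line 47 */
        cpu->cr3_reloads++;              /* line 50 */
    } else {
        cpu->state = TLBSTATE_OK;        /* line 60 */
        if (!(next->cpumask & bit)) {    /* line 63: test-and-set */
            next->cpumask |= bit;
            cpu->cr3_reloads++;          /* line 68: reload after leave_mm */
        }
    }
}
```

In this model, going A -> kernel thread -> A without an intervening flush leaves the bit set and skips the reload; if model_leave_mm() ran while the CPU was lazy, the same path sets the bit again and reloads once.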
--_8a771ba3-574c-42cf-ab64-c54cc7d9a52f_-- --===============1278404576== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============1278404576==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Mon, 25 Apr 2011 20:54:54 +0800 Message-ID: References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , , <4DA8B715.9080508@goop.org>, Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1205683555==" Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: jeremy@goop.org Cc: xen devel , giamteckchoon@gmail.com, konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org --===============1205683555== Content-Type: multipart/alternative; boundary="_d0481d91-bb04-4359-9d6b-4ca70b7f1b2c_" --_d0481d91-bb04-4359-9d6b-4ca70b7f1b2c_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable --_d0481d91-bb04-4359-9d6b-4ca70b7f1b2c_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable
Add some debug info in drop_other_mm_ref(line 1516), get on machine crash.
The log is attached; it is a pity that I lost the printk info.

Does current->mm indicate userspace?
Thanks.
 
============================
1502 #ifdef CONFIG_SMP
1503 /* Another cpu may still have their %cr3 pointing at the pagetable, so
1504    we need to repoint it somewhere else before we can unpin it. */
1505 static void drop_other_mm_ref(void *info)
1506 {
1507     struct mm_struct *mm = info;
1508     struct mm_struct *active_mm;
1509
1510     active_mm = percpu_read(cpu_tlbstate.active_mm);
1511
1512     if (active_mm == mm){
1513         if(current->mm){
1514             printk("in userspace active_mm %p mm %p curr_mm %p tlbstate%d\n",
1515                    active_mm, mm, current->mm, percpu_read(cpu_tlbstate.state));
1516             BUG();
1517         }
1518         leave_mm(smp_processor_id());
1519     }
1520


============================
 
Starting udev: ------------[ cut here ]------------
kernel BUG at arch/x86/xen/mmu.c:1516!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/class/raw/rawctl/dev
CPU 2
Modules linked in: snd_seq_dummy bnx2 snd_seq_oss(+) snd_seq_midi_event snd_seq
snd_seq_device serio_raw snd_pcm_oss snd_mixer_oss snd_pcm snd_timer i2c_i801 i2c_core iTCO_wdt snd pata_acpi iTCO_vendor_support ata_generic soundcore
snd_page_alloc pcspkr ata_piix shpchp mptsas mptscsih mptbase
Pid: 1126, comm: khelper Not tainted 2.6.32.36xen #1 Tecal RH2285
RIP: e030:[<ffffffff8100e4c0>]  [<ffffffff8100e4c0>] drop_other_mm_ref+0x46/0x80
RSP: e02b:ffff880028078e58  EFLAGS: 00010092
RAX: 0000000000000015 RBX: 0000000000000001 RCX: 00000000ffff0075
RDX: 0000000000009f9f RSI: ffffffff8144006a RDI: 0000000000000004
RBP: ffff880028078e68 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000028078cf8 R11: 0000000000000246 R12: ffff88012c032680
R13: ffff880028080020 R14: 00000000000004f1 R15: 0000000000000000
FS:  00007f01adcf8710(0000) GS:ffff880028075000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f01adf20648 CR3: 000000012a546000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process khelper (pid: 1126, threadinfo ffff88012d80e000, task ffff88012b880000)
Stack:
 0000000000000001 ffff88012bb9bb88 ffff880028078e98 ffffffff81087224
<0> ffff880028078e78 ffff880028078e78 ffff88015f808540 00000000000004f1
<0> ffff880028078ea8 ffffffff81010118 ffff880028078ee8 ffffffff810a936a
Call Trace:
 <IRQ>
 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [<ffffffff81010118>] xen_call_function_single_interrupt+0x13/0x28
 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
 [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
 [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
 [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
 <EOI>
 [<ffffffff8100f8df>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8100922a>] ? hypercall_page+0x22a/0x1000
 [<ffffffff8100922a>] ? hypercall_page+0x22a/0x1000
 [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
 [<ffffffff8100f195>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff8100f8f2>] ? check_events+0x12/0x20
 [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
 [<ffffffff8100f8df>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8100f8df>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8100d47f>] ? xen_mc_issue+0x2e/0x33
 [<ffffffff8100e42f>] ? __xen_pgd_pin+0xc1/0xc9
 [<ffffffff8100e449>] ? xen_pgd_pin+0x12/0x14
 [<ffffffff8100e470>] ? xen_activate_mm+0x25/0x2f
 [<ffffffff81113f59>] ? flush_old_exec+0x390/0x500
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef
 [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
 [<ffffffff811f463c>] ? process_measurement+0xc0/0xd7
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81113098>] ? search_binary_handler+0xc8/0x255
 [<ffffffff81114366>] ? do_execve+0x1c3/0x29e
 [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff8100f8df>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
 [<ffffffff81013daa>] ? child_rip+0xa/0x20
 [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
 [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
 [<ffffffff81013da0>] ? child_rip+0x0/0x20
Code: 75 3a 65 48 8b 04 25 c0 cb 00 00 48 83 b8 78 02 00 00 00 74 1a 65 8b 34 25 c8 55 01 00 48 c7 c7 06 98 5b 81 31 c0 e8 d9 90 04 00 <0f> 0b eb fe 65 8b 3c
25 78 e3 00 00 e8 e5 be 02 00 65 48 8b 1c
RIP  [<ffffffff8100e4c0>] drop_other_mm_ref+0x46/0x80
 RSP <ffff880028078e58>
[<ffffffff8144006a>] ? init_amd+0x296/0x37a
 [<ffffffff8100f195>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff8100f8f2>] ? check_events+0x12/0x20
 [<ffffffff81056487>] ? print_oops_end_marker+0x23/0x25
 [<ffffffff81448165>] oops_end+0xb6/0xc6
 [<ffffffff810166e5>] die+0x5a/0x63
 [<ffffffff81447a3c>] do_trap+0x115/0x124
 [<ffffffff810148e6>] do_invalid_op+0x9c/0xa5
 [<ffffffff8100e4c0>] ? drop_other_mm_ref+0x46/0x80
 [<ffffffff81057640>] ? printk+0xa7/0xa9
 [<ffffffff81013b3b>] invalid_op+0x1b/0x20
 [<ffffffff8144006a>] ? init_amd+0x296/0x37a
 [<ffffffff8100e4c0>] ? drop_other_mm_ref+0x46/0x80
 [<ffffffff8100e4c0>] ? drop_other_mm_ref+0x46/0x80
 [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
 [<ffffffff81010118>] xen_call_function_single_interrupt+0x13/0x28
 [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
 [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
 [<ffffffff8128c1a8>] __xen_evtchn_do_upcall+0x1ab/0x27d
 [<ffffffff8128dcf9>] xen_evtchn_do_upcall+0x33/0x46
 [<ffffffff81013efe>] xen_do_hypervisor_callback+0x1e/0x30
 <EOI>
 [<ffffffff8100f8df>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8100922a>] ? hypercall_page+0x22a/0x1000
 [<ffffffff8100922a>] ? hypercall_page+0x22a/0x1000
 [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
 [<ffffffff8100f195>] ? xen_force_evtchn_callback+0xd/0xf
 [<ffffffff8100f8f2>] ? check_events+0x12/0x20
 [<ffffffff81447292>] ? _spin_unlock_irqrestore+0x15/0x17
 [<ffffffff8100f8df>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8100f8df>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8100d47f>] ? xen_mc_issue+0x2e/0x33
 [<ffffffff8100e42f>] ? __xen_pgd_pin+0xc1/0xc9
 [<ffffffff8100e449>] ? xen_pgd_pin+0x12/0x14
 [<ffffffff8100e470>] ? xen_activate_mm+0x25/0x2f
 [<ffffffff81113f59>] ? flush_old_exec+0x390/0x500
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81150dc9>] ? load_elf_binary+0x0/0x17ef
 [<ffffffff81151161>] ? load_elf_binary+0x398/0x17ef

--_d0481d91-bb04-4359-9d6b-4ca70b7f1b2c_-- --===============1205683555== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============1205683555==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Mon, 25 Apr 2011 21:11:16 +0800 Message-ID: References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , , <4DA8B715.9080508@goop.org>, , Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1809387188==" Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: jeremy@goop.org Cc: xen devel , giamteckchoon@gmail.com, konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org --===============1809387188== Content-Type: multipart/alternative; boundary="_20436f08-cf01-454f-afd0-595747c3a300_" --_20436f08-cf01-454f-afd0-595747c3a300_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable --_20436f08-cf01-454f-afd0-595747c3a300_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable
>From: tinnycloud@hotmail.com
>To: jeremy@goop.org
>CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; konrad.wilk@oracle.com
>Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61
>Date: Mon, 25 Apr 2011 20:54:54 +0800

>Add some debug info in drop_other_mm_ref(line 1516), get on machine crash.
>log attached, pity I lost prink info.

printk info: in userspace active_mm ffff8800a3669f80 mm ffff8800a3669f80 curr_mm ffff88008d73c000 tlbstate 2

>Does current->mm indicates userspace?
>Thanks.
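The last field of the printk line is the raw value of cpu_tlbstate.state. A small helper such as the one below can turn it into a name. This is only a sketch: it assumes the values TLBSTATE_OK = 1 and TLBSTATE_LAZY = 2 for this kernel series, and `tlbstate_name()` is a hypothetical helper, not a kernel function.

```c
/* Assumed values of the kernel's TLBSTATE_* constants. */
#define TLBSTATE_OK   1
#define TLBSTATE_LAZY 2

/* Hypothetical helper: map a raw cpu_tlbstate.state value to its name. */
const char *tlbstate_name(int state)
{
    switch (state) {
    case TLBSTATE_OK:
        return "TLBSTATE_OK";
    case TLBSTATE_LAZY:
        return "TLBSTATE_LAZY";
    default:
        return "unknown";
    }
}
```

Under that assumption, the value 2 in the output above would read as TLBSTATE_LAZY.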
 


--_20436f08-cf01-454f-afd0-595747c3a300_-- --===============1809387188== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============1809387188==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Mon, 25 Apr 2011 23:05:24 +0800 Message-ID: References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , , <4DA8B715.9080508@goop.org>, , , Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1408319221==" Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: jeremy@goop.org Cc: xen devel , giamteckchoon@gmail.com, konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org --===============1408319221== Content-Type: multipart/alternative; boundary="_9041a24e-2ac7-4854-b563-81f36e1ded8d_" --_9041a24e-2ac7-4854-b563-81f36e1ded8d_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable --_9041a24e-2ac7-4854-b563-81f36e1ded8d_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable
Please ignore my last two mails, I just learnt that Current is meanless in irq context.
 
I have just come up with one overall hypothesis.

In my opinion:

1) A CPU running in switch_mm can receive an IPI and enter the interrupt handler.
2) Before reverting that patch, no matter whether the if statement is true or not, cpu_tlbstate.state
could be changed to TLBSTATE_OK right before entering the irq routine.
3) Since cpu_tlbstate is a per-CPU variable, testing cpu_tlbstate.state in drop_other_mm_ref
before calling leave_mm() is feasible and necessary.
4) If I am right, the strange thing is that the code in 2.6.32.36 is the same as in 2.6.31.x, where we never met the TLB bug.

Any comments?

Many thanks.
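The guard described in point 3 can be modelled in user space. The sketch below is an illustration under stated assumptions, not kernel code: `mm_model`, `cpu_model`, `model_leave_mm()` and the two `model_drop_other_mm_ref_*()` functions are hypothetical stand-ins, the TLBSTATE_* values are assumed, and `model_leave_mm()` returns -1 at the point where the real leave_mm() is assumed to hit its BUG() check.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; stand-ins for the kernel's TLBSTATE_* constants. */
enum { TLBSTATE_OK = 1, TLBSTATE_LAZY = 2 };

struct mm_model { int id; };

/* Hypothetical per-CPU state, loosely mirroring cpu_tlbstate. */
struct cpu_model {
    int state;                   /* TLBSTATE_OK or TLBSTATE_LAZY */
    struct mm_model *active_mm;  /* mm this CPU currently references */
    int left_mm;                 /* set once model_leave_mm() succeeds */
};

/* Models leave_mm(): returns -1 where the kernel is assumed to BUG(). */
int model_leave_mm(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    if (cpu->state == TLBSTATE_OK)
        return -1;
    cpu->left_mm = 1;
    return 0;
}

/* Unguarded handler: only compares active_mm with the mm being dropped. */
int model_drop_other_mm_ref_unguarded(struct cpu_model *cpu,
                                      struct mm_model *mm)
{
    if (cpu->active_mm == mm)
        return model_leave_mm(cpu);
    return 0;
}

/* Guarded handler: also tests the per-CPU TLB state before leaving. */
int model_drop_other_mm_ref_guarded(struct cpu_model *cpu,
                                    struct mm_model *mm)
{
    if (cpu->active_mm == mm && cpu->state != TLBSTATE_OK)
        return model_leave_mm(cpu);
    return 0;
}
```

For a CPU caught in the window (state already TLBSTATE_OK while active_mm still equals mm), the unguarded handler reports the BUG condition and the guarded one returns without touching the CPU; a CPU that is still lazy leaves the mm in both variants.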
 
--_9041a24e-2ac7-4854-b563-81f36e1ded8d_-- --===============1408319221== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============1408319221==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Tian, Kevin" Subject: RE: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Tue, 26 Apr 2011 13:52:11 +0800 Message-ID: <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com> References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , <4DA8B715.9080508@goop.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0871541839==" Return-path: In-Reply-To: Content-Language: en-US List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: MaoXiaoyun , "jeremy@goop.org" Cc: xen devel , "giamteckchoon@gmail.com" , "konrad.wilk@oracle.com" List-Id: xen-devel@lists.xenproject.org --===============0871541839== Content-Language: en-US Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64
>From: MaoXiaoyun
>Sent: Monday, April 25, 2011 11:15 AM
>> Date: Fri, 15 Apr 2011 14:22:29 -0700
>> From: jeremy@goop.org
>> To: tinnycloud@hotmail.com
>> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; konrad.wilk@oracle.com
>> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
>>
>> On 04/15/2011 05:23 AM, MaoXiaoyun wrote:
>> > Hi：
>> >
>> > Could the crash related to this patch ?
>> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3
>> >
>> > Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is before
>> > cpumask_clear_cpu(line 49).
>> > Could it possible that right after execute line 40 of mmu_context.h,
>> > CPU revice IPI from other CPU to
>> > flush the mm, and when in interrupt, find the TLB state happened to be
>> > TLBSTATE_OK. Which conflicts.
>>
>> Does reverting it help?
>>
>> J
>
>Hi Jeremy:
>
>    The lastest test result shows the reverting didn't help.
>    Kernel panic exactly at the same place in tlb.c.
>
>    I have question about TLB state, from the stack,
>    xen_do_hypervisor_callback-> xen_evtchn_do_upcall->... ->drop_other_mm_ref
>
>    What  cpu_tlbstate.state should be,  could  TLBSTATE_OK or TLBSTATE_LAZY all be possible?
>    That is after a hypercall from userspace, state will be TLBSTATE_OK, and
>      if from kernel space, state will be TLBSTATE_LAZE ?
>
>       thanks.

it looks a bug in drop_other_mm_ref implementation, that current TLB state should be checked
before invoking leave_mm(). There's a window between below lines of code:

<xen_drop_mm_ref>
       /* Get the "official" set of cpus referring to our pagetable. */
        if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) {
                for_each_online_cpu(cpu) {
                        if (!cpumask_test_cpu(cpu, mm_cpumask(mm))
                            && per_cpu(xen_current_cr3, cpu) != __pa(mm->pgd))
                                continue;
                        smp_call_function_single(cpu, drop_other_mm_ref, mm, 1);
                }
                return;
        }

there's chance that when smp_call_function_single is invoked, actual TLB state has been
updated in the other cpu. The upstream kernel patch you referred to earlier just makes
this bug exposed more easily. But even without this patch, you may still suffer such issue
which is why reverting the patch doesn't help.

Could you try adding a check in drop_other_mm_ref?

        if (active_mm == mm && percpu_read(cpu_tlbstate.state) != TLBSTATE_OK)
                leave_mm(smp_processor_id());

once the interrupted context has TLBSTATE_OK, it implicates that later it will handle
the TLB flush and thus no need for leave_mm from interrupt handler, and that's the
assumption of doing leave_mm.

Thanks
Kevin
--===============0871541839== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============0871541839==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Tian, Kevin" Subject: RE: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Tue, 26 Apr 2011 13:55:50 +0800 Message-ID: <625BA99ED14B2D499DC4E29D8138F1505C7F2C518E@shsmsx502.ccr.corp.intel.com> References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , ,
<4DA8B715.9080508@goop.org>, , , Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1962481668==" Return-path: In-Reply-To: Content-Language: en-US List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: MaoXiaoyun , "jeremy@goop.org" Cc: xen devel , "giamteckchoon@gmail.com" , "konrad.wilk@oracle.com" List-Id: xen-devel@lists.xenproject.org --===============1962481668== Content-Language: en-US Content-Type: multipart/alternative; boundary="_000_625BA99ED14B2D499DC4E29D8138F1505C7F2C518Eshsmsx502ccrc_" --_000_625BA99ED14B2D499DC4E29D8138F1505C7F2C518Eshsmsx502ccrc_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable the race window is always there, but whether it will be triggered is not de= termined. It's possible that you never met this bug on 2.6.31.x now, but it= doesn't mean you won't meet it in long run in the future. :) Thanks Kevin From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists= .xensource.com] On Behalf Of MaoXiaoyun Sent: Monday, April 25, 2011 11:05 PM To: jeremy@goop.org Cc: xen devel; giamteckchoon@gmail.com; konrad.wilk@oracle.com Subject: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61 Please ignore my last two mails, I just learnt that Current is meanless in = irq context. Just come up one whole assumption: In my opinion: 1) CPU running in switch_mm has the possiblity of receiving IPI message and= enter interrupt 2) Before revert that patch, not matter the if statement is true or not, th= e cpu_tlbstate.state could be changed to TLBSTATE_OK, right before enter irq routhine 3) Since the cpu_tlbstate is per CPU variable, before calling leave_mm(), t= est cpu_tlbstate.state in drop_other_mm_ref is feasible and nessary 4) If I am right, strange thing is the code of 2.6.32.36 is same as 2.6.31.= x, which we never met tlb bug before. any comments? Many thanks. 
--_000_625BA99ED14B2D499DC4E29D8138F1505C7F2C518Eshsmsx502ccrc_ Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

The race window is always there, but whether it will be triggered is not deterministic. It's possible that you have never hit this bug on 2.6.31.x so far, but that doesn't mean you won't hit it in the long run. :)

 

Thanks

Kevin

 

From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of MaoXiaoyun
Sent: Monday, April 25, 2011 11:05 PM
To: jeremy@goop.org
Cc: xen devel; giamteckchoon@gmail.com; konrad.wilk@oracle.com
Subject: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61

 

Please ignore my last two mails; I just learnt that current is meaningless in IRQ context.

I have come up with one overall hypothesis. In my opinion:

1) A CPU running in switch_mm() can receive an IPI and enter the interrupt handler.
2) Before reverting that patch, no matter whether the if statement is true or not, cpu_tlbstate.state could be changed to TLBSTATE_OK right before entering the IRQ routine.
3) Since cpu_tlbstate is a per-CPU variable, testing cpu_tlbstate.state in drop_other_mm_ref before calling leave_mm() is feasible and necessary.
4) If I am right, the strange thing is that the code in 2.6.32.36 is the same as in 2.6.31.x, on which we never hit the TLB bug.
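
To illustrate point 3, here is a minimal userspace C sketch (not kernel code) of testing the TLB state before calling leave_mm(). It assumes, based on the BUG at arch/x86/mm/tlb.c:61 discussed in this thread, that leave_mm() hits BUG() when the per-CPU state is TLBSTATE_OK. struct cpu_model and the model_* / drop_ref_* names are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: the names mirror the kernel's, nothing here is kernel code. */
enum tlb_state { TLBSTATE_OK = 1, TLBSTATE_LAZY = 2 };

struct cpu_model {
    enum tlb_state state; /* stands in for cpu_tlbstate.state */
    int active_mm;        /* stands in for cpu_tlbstate.active_mm */
    bool bug_hit;         /* set where leave_mm() is assumed to BUG() */
    bool left_mm;         /* set when leave_mm() ran to completion */
};

/* Assumption: leave_mm() must never run while the CPU is in TLBSTATE_OK. */
void model_leave_mm(struct cpu_model *cpu)
{
    if (cpu->state == TLBSTATE_OK) {
        cpu->bug_hit = true; /* the real kernel would panic here */
        return;
    }
    cpu->left_mm = true;
}

/* IPI handler without the state test: only active_mm is compared. */
void drop_ref_unguarded(struct cpu_model *cpu, int mm)
{
    if (cpu->active_mm == mm)
        model_leave_mm(cpu);
}

/* IPI handler with the state test from point 3. */
void drop_ref_guarded(struct cpu_model *cpu, int mm)
{
    if (cpu->active_mm == mm && cpu->state != TLBSTATE_OK)
        model_leave_mm(cpu);
}
```

In this model, a CPU that was interrupted just after switching to TLBSTATE_OK trips the assumed BUG() condition in the unguarded handler, while the guarded handler leaves it alone and still drops the reference on a lazy CPU.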
 
Any comments?

Many thanks.
 

= --_000_625BA99ED14B2D499DC4E29D8138F1505C7F2C518Eshsmsx502ccrc_-- --===============1962481668== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============1962481668==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: MaoXiaoyun Subject: RE: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Tue, 26 Apr 2011 15:04:31 +0800 Message-ID: References: , , , , , , , , , , , , , <4DA3438A.6070503@goop.org>, , , , , , <20110412100000.GA15647@dumpdata.com>, , , , , , , , , , <4DA8B715.9080508@goop.org>, , <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============2001338816==" Return-path: In-Reply-To: <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: kevin.tian@intel.com, jeremy@goop.org Cc: xen devel , giamteckchoon@gmail.com, konrad.wilk@oracle.com List-Id: xen-devel@lists.xenproject.org --===============2001338816== Content-Type: multipart/alternative; boundary="_b56b87da-8107-429c-92ce-5ec99c537358_" --_b56b87da-8107-429c-92ce-5ec99c537358_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable Many thanks, Kevin. =20 I agree on the race window. One thing more, In my understaning, the CPU who send out IPI message, wi= ll unpin the pagetable after=20 receive all ACKS from other cpu, if the CPU who received IPI message, = enter drop_other_mm_ref, and=20 has TLBSTATE_OK, does nothing, will it possible it possible confronts wit= h stale pagetable (that is unpinned by sender CPU)? =20 So do we need flush tlb when its state is TBLSTATE_OK? 
=20 if (active_mm =3D=3D mm){ if (percpu_read(cpu_tlbstate.state) =3D=3D TLBSTATE_OK) load_cr3(mm->pgd) else leave_mm(smp_processor_id()); } =20 > From: kevin.tian@intel.com > To: tinnycloud@hotmail.com; jeremy@goop.org > CC: xen-devel@lists.xensource.com; giamteckchoon@gmail.com; konrad.wilk= @oracle.com > Date: Tue, 26 Apr 2011 13:52:11 +0800 > Subject: RE: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61 >=20 > >From: MaoXiaoyun > >Sent: Monday, April 25, 2011 11:15 AM > >> Date: Fri, 15 Apr 2011 14:22:29 -0700 > >> From: jeremy@goop.org > >> To: tinnycloud@hotmail.com > >> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; konrad.w= ilk@oracle.com > >> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61 > >>=20 > >> On 04/15/2011 05:23 AM, MaoXiaoyun wrote: > >> > Hi=A3=BA > >> > > >> > Could the crash related to this patch ? > >> > http://git.kernel.org/?p=3Dlinux/kernel/git/jeremy/xen.git;a=3Dcom= mitdiff;h=3D45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3 > >> > > >> > Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is bef= ore > >> > cpumask_clear_cpu(line 49). > >> > Could it possible that right after execute line 40 of mmu_context.= h, > >> > CPU revice IPI from other CPU to > >> > flush the mm, and when in interrupt, find the TLB state happened t= o be > >> > TLBSTATE_OK. Which conflicts. > >>=20 > >> Does reverting it help? > >>=20 > >> J > >=20 > >Hi Jeremy: > >=20 > > The lastest test result shows the reverting didn't help. > > Kernel panic exactly at the same place in tlb.c. > >=20 > > I have question about TLB state, from the stack,=20 > > xen_do_hypervisor_callback-> xen_evtchn_do_upcall->... ->drop_othe= r_mm_ref > >=20 > > What cpu_tlbstate.state should be, could TLBSTATE_OK or TLBSTAT= E_LAZY all be possible?=20 > > That is after a hypercall from userspace, state will be TLBSTATE_O= K, and > > if from kernel space, state will be TLBSTATE_LAZE ?=20 > >=20 > > thanks. 
>=20 > it looks a bug in drop_other_mm_ref implementation, that current TLB st= ate should be checked > before invoking leave_mm(). There's a window between below lines of cod= e: >=20 > > /* Get the "official" set of cpus referring to our pagetable. */ > if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) { > for_each_online_cpu(cpu) { > if (!cpumask_test_cpu(cpu, mm_cpumask(mm)) > && per_cpu(xen_current_cr3, cpu) !=3D __pa(mm->pgd)) > continue; > smp_call_function_single(cpu, drop_other_mm_ref, mm, 1); > } > return; > } >=20 > there's chance that when smp_call_function_single is invoked, actual TL= B state has been > updated in the other cpu. The upstream kernel patch you referred to ear= lier just makes > this bug exposed more easily. But even without this patch, you may stil= l suffer such issue > which is why reverting the patch doesn't help. >=20 > Could you try adding a check in drop_other_mm_ref? >=20 > if (active_mm =3D=3D mm && percpu_read(cpu_tlbstate.state) !=3D TLBSTAT= E_OK) > leave_mm(smp_processor_id()); >=20 > once the interrupted context has TLBSTATE_OK, it implicates that later = it will handle=20 > the TLB flush and thus no need for leave_mm from interrupt handler, and= that's the > assumption of doing leave_mm. >=20 > Thanks > Kevin =20 --_b56b87da-8107-429c-92ce-5ec99c537358_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable Many thanks, Kevin.
 
I agree on the race window.
One more thing: in my understanding, the CPU that sends out the IPI will unpin the pagetable after
receiving ACKs from all the other CPUs. If a CPU that receives the IPI enters drop_other_mm_ref
with TLBSTATE_OK and does nothing, is it possible that it ends up using a stale pagetable
(one that has been unpinned by the sender CPU)?

So do we need to flush the TLB when its state is TLBSTATE_OK?
 

if (active_mm == mm) {
        if (percpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
                load_cr3(mm->pgd);
        else
                leave_mm(smp_processor_id());
}

 

> From: kevin.tian@intel.com
> To: tinnycloud@hotmail.com; jeremy@goop.org
> CC: xen-devel@lists.xensource.com; giamteckchoon@gmail.com; konrad.wilk@oracle.com
> Date: Tue, 26 Apr 2011 13:52:11 +0800
> Subject: RE: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61
>
> >From: MaoXiaoyun
> >Sent: Monday, April 25, 2011 11:15 AM
> >> Date: Fri, 15 Apr 2011 14:22:29 -0700
> >> From: jeremy@goop.org
> >> To: tinnycloud@hotmail.com
> >> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; konrad.wilk@oracle.com
> >> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
> >>
> >> On 04/15/2011 05:23 AM, MaoXiaoyun wrote:
> >> > Hi:
> >> >
> >> > Could the crash be related to this patch?
> >> > http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3
> >> >
> >> > The TLB state change to TLBSTATE_OK (mmu_context.h:40) now comes before
> >> > cpumask_clear_cpu (line 49).
> >> > Is it possible that right after executing line 40 of mmu_context.h,
> >> > the CPU receives an IPI from another CPU to
> >> > flush the mm and, once in the interrupt, finds that the TLB state happens to be
> >> > TLBSTATE_OK, which conflicts?
> >>
> >> Does reverting it help?
> >>
> >> J
> >
> >Hi Jeremy:
> >
> >    The latest test result shows that reverting didn't help.
> >    The kernel panics at exactly the same place in tlb.c.
> >
> >    I have a question about the TLB state. From the stack:
> >    xen_do_hypervisor_callback -> xen_evtchn_do_upcall -> ... -> drop_other_mm_ref
> >
> >    What should cpu_tlbstate.state be? Are TLBSTATE_OK and TLBSTATE_LAZY both possible?
> >    That is, after a hypercall from userspace the state will be TLBSTATE_OK, and
> >    if from kernel space the state will be TLBSTATE_LAZY?
> >
> >    thanks.
>
> It looks like a bug in the drop_other_mm_ref implementation: the current TLB state should be checked
> before invoking leave_mm(). There's a window between the lines of code below:
>
> <xen_drop_mm_ref>
>         /* Get the "official" set of cpus referring to our pagetable. */
>         if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) {
>                 for_each_online_cpu(cpu) {
>                         if (!cpumask_test_cpu(cpu, mm_cpumask(mm))
>                             && per_cpu(xen_current_cr3, cpu) != __pa(mm->pgd))
>                                 continue;
>                         smp_call_function_single(cpu, drop_other_mm_ref, mm, 1);
>                 }
>                 return;
>         }
>
> There's a chance that when smp_call_function_single is invoked, the actual TLB state has already been
> updated on the other cpu. The upstream kernel patch you referred to earlier just makes
> this bug easier to expose. But even without this patch, you may still hit this issue,
> which is why reverting the patch doesn't help.
>
> Could you try adding a check in drop_other_mm_ref?
>
>         if (active_mm == mm && percpu_read(cpu_tlbstate.state) != TLBSTATE_OK)
>                 leave_mm(smp_processor_id());
>
> Once the interrupted context has TLBSTATE_OK, it implies that it will handle
> the TLB flush later, so there is no need for leave_mm from the interrupt handler; that's the
> assumption behind doing leave_mm.
>
> Thanks
> Kevin
--_b56b87da-8107-429c-92ce-5ec99c537358_-- --===============2001338816== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============2001338816==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Tian, Kevin" Subject: RE: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Tue, 26 Apr 2011 16:31:51 +0800 Message-ID: <625BA99ED14B2D499DC4E29D8138F1505C7F2C52EB@shsmsx502.ccr.corp.intel.com> References: , , , , , , , , , , , , , <4DA3438A.6070503@goop.org>, , , , , , <20110412100000.GA15647@dumpdata.com>, , , , , , , , , , <4DA8B715.9080508@goop.org>, , <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============1064128113==" Return-path: In-Reply-To: Content-Language: en-US List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: MaoXiaoyun , "jeremy@goop.org" Cc: xen devel , "giamteckchoon@gmail.com" , "konrad.wilk@oracle.com" List-Id: xen-devel@lists.xenproject.org --===============1064128113== Content-Language: en-US Content-Type: multipart/alternative; boundary="_000_625BA99ED14B2D499DC4E29D8138F1505C7F2C52EBshsmsx502ccrc_" --_000_625BA99ED14B2D499DC4E29D8138F1505C7F2C52EBshsmsx502ccrc_ Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: base64 SSB0aGluayB0aGF0IHNob3VsZCBiZSBmaW5lLiBub3RlIGEgbGF0ZXIgY2hlY2s6DQoNCiAgICAg ICAgLyogSWYgdGhpcyBjcHUgc3RpbGwgaGFzIGEgc3RhbGUgY3IzIHJlZmVyZW5jZSwgdGhlbiBt YWtlIHN1cmUNCiAgICAgICAgICAgaXQgaGFzIGJlZW4gZmx1c2hlZC4gKi8NCiAgICAgICAgaWYg KHBlcmNwdV9yZWFkKHhlbl9jdXJyZW50X2NyMykgPT0gX19wYShtbS0+cGdkKSkNCiAgICAgICAg ICAgICAgICBsb2FkX2NyMyhzd2FwcGVyX3BnX2Rpcik7DQoNCnRoaXMgc2hvdWxkIGVuc3VyZSB0 
aGUgc3RhbGUgVExCIGJlaW5nIGZsdXNoZWQgaWYgdGhpcyBjcHUgaXMgc3RpbGwgaW4gbGF6eSBt b2RlLg0KDQpUaGFua3MNCktldmluDQoNCkZyb206IE1hb1hpYW95dW4gW21haWx0bzp0aW5ueWNs b3VkQGhvdG1haWwuY29tXQ0KU2VudDogVHVlc2RheSwgQXByaWwgMjYsIDIwMTEgMzowNSBQTQ0K VG86IFRpYW4sIEtldmluOyBqZXJlbXlAZ29vcC5vcmcNCkNjOiB4ZW4gZGV2ZWw7IGdpYW10ZWNr Y2hvb25AZ21haWwuY29tOyBrb25yYWQud2lsa0BvcmFjbGUuY29tDQpTdWJqZWN0OiBSRTogW1hl bi1kZXZlbF0gUkU6IEtlcm5lbCBCVUcgYXQgYXJjaC94ODYvbW0vdGxiLmM6NjENCg0KTWFueSB0 aGFua3MsIEtldmluLg0KDQpJIGFncmVlIG9uIHRoZSByYWNlIHdpbmRvdy4NCk9uZSB0aGluZyBt b3JlLCAgSW4gbXkgdW5kZXJzdGFuaW5nLCB0aGUgQ1BVIHdobyBzZW5kIG91dCBJUEkgbWVzc2Fn ZSwgd2lsbCB1bnBpbiB0aGUgcGFnZXRhYmxlIGFmdGVyDQpyZWNlaXZlIGFsbCBBQ0tTICBmcm9t IG90aGVyIGNwdSwgIGlmIHRoZSBDUFUgd2hvIHJlY2VpdmVkICBJUEkgbWVzc2FnZSwgZW50ZXIg ZHJvcF9vdGhlcl9tbV9yZWYsIGFuZA0KaGFzIFRMQlNUQVRFX09LLCBkb2VzIG5vdGhpbmcsIHdp bGwgaXQgcG9zc2libGUgaXQgcG9zc2libGUgY29uZnJvbnRzIHdpdGggc3RhbGUgcGFnZXRhYmxl DQoodGhhdCBpcyB1bnBpbm5lZCBieSBzZW5kZXIgQ1BVKT8NCg0KU28gZG8gd2UgbmVlZCBmbHVz aCB0bGIgd2hlbiBpdHMgc3RhdGUgaXMgVEJMU1RBVEVfT0s/DQoNCmlmIChhY3RpdmVfbW0gPT0g bW0pew0KICAgICBpZiAocGVyY3B1X3JlYWQoY3B1X3RsYnN0YXRlLnN0YXRlKSA9PSBUTEJTVEFU RV9PSykNCiAgICAgICAgbG9hZF9jcjMobW0tPnBnZCkNCiAgICAgZWxzZQ0KICAgICAgICAgICAg ICAgIGxlYXZlX21tKHNtcF9wcm9jZXNzb3JfaWQoKSk7DQogfQ0KDQo+IEZyb206IGtldmluLnRp YW5AaW50ZWwuY29tDQo+IFRvOiB0aW5ueWNsb3VkQGhvdG1haWwuY29tOyBqZXJlbXlAZ29vcC5v cmcNCj4gQ0M6IHhlbi1kZXZlbEBsaXN0cy54ZW5zb3VyY2UuY29tOyBnaWFtdGVja2Nob29uQGdt YWlsLmNvbTsga29ucmFkLndpbGtAb3JhY2xlLmNvbQ0KPiBEYXRlOiBUdWUsIDI2IEFwciAyMDEx IDEzOjUyOjExICswODAwDQo+IFN1YmplY3Q6IFJFOiBbWGVuLWRldmVsXSBSRTogS2VybmVsIEJV RyBhdCBhcmNoL3g4Ni9tbS90bGIuYzo2MQ0KPg0KPiA+RnJvbTogTWFvWGlhb3l1bg0KPiA+U2Vu dDogTW9uZGF5LCBBcHJpbCAyNSwgMjAxMSAxMToxNSBBTQ0KPiA+PiBEYXRlOiBGcmksIDE1IEFw ciAyMDExIDE0OjIyOjI5IC0wNzAwDQo+ID4+IEZyb206IGplcmVteUBnb29wLm9yZw0KPiA+PiBU bzogdGlubnljbG91ZEBob3RtYWlsLmNvbQ0KPiA+PiBDQzogZ2lhbXRlY2tjaG9vbkBnbWFpbC5j 
b207IHhlbi1kZXZlbEBsaXN0cy54ZW5zb3VyY2UuY29tOyBrb25yYWQud2lsa0BvcmFjbGUuY29t DQo+ID4+IFN1YmplY3Q6IFJlOiBLZXJuZWwgQlVHIGF0IGFyY2gveDg2L21tL3RsYi5jOjYxDQo+ ID4+DQo+ID4+IE9uIDA0LzE1LzIwMTEgMDU6MjMgQU0sIE1hb1hpYW95dW4gd3JvdGU6DQo+ID4+ ID4gSGmjug0KPiA+PiA+DQo+ID4+ID4gQ291bGQgdGhlIGNyYXNoIHJlbGF0ZWQgdG8gdGhpcyBw YXRjaCA/DQo+ID4+ID4gaHR0cDovL2dpdC5rZXJuZWwub3JnLz9wPWxpbnV4L2tlcm5lbC9naXQv amVyZW15L3hlbi5naXQ7YT1jb21taXRkaWZmO2g9NDViZmQ3YmZjNmNmMzJmOGU2MGJiOTFiMzIz NDlmMGI1MDkwZWVhMw0KPiA+PiA+DQo+ID4+ID4gU2luY2Ugbm93IFRMQiBzdGF0ZSBjaGFuZ2Ug dG8gVExCU1RBVEVfT0sobW11X2NvbnRleHQuaDo0MCkgaXMgYmVmb3JlDQo+ID4+ID4gY3B1bWFz a19jbGVhcl9jcHUobGluZSA0OSkuDQo+ID4+ID4gQ291bGQgaXQgcG9zc2libGUgdGhhdCByaWdo dCBhZnRlciBleGVjdXRlIGxpbmUgNDAgb2YgbW11X2NvbnRleHQuaCwNCj4gPj4gPiBDUFUgcmV2 aWNlIElQSSBmcm9tIG90aGVyIENQVSB0bw0KPiA+PiA+IGZsdXNoIHRoZSBtbSwgYW5kIHdoZW4g aW4gaW50ZXJydXB0LCBmaW5kIHRoZSBUTEIgc3RhdGUgaGFwcGVuZWQgdG8gYmUNCj4gPj4gPiBU TEJTVEFURV9PSy4gV2hpY2ggY29uZmxpY3RzLg0KPiA+Pg0KPiA+PiBEb2VzIHJldmVydGluZyBp dCBoZWxwPw0KPiA+Pg0KPiA+PiBKDQo+ID4NCj4gPkhpIEplcmVteToNCj4gPg0KPiA+ICAgIFRo ZSBsYXN0ZXN0IHRlc3QgcmVzdWx0IHNob3dzIHRoZSByZXZlcnRpbmcgZGlkbid0IGhlbHAuDQo+ ID4gICAgS2VybmVsIHBhbmljIGV4YWN0bHkgYXQgdGhlIHNhbWUgcGxhY2UgaW4gdGxiLmMuDQo+ ID4NCj4gPiAgICBJIGhhdmUgcXVlc3Rpb24gYWJvdXQgVExCIHN0YXRlLCBmcm9tIHRoZSBzdGFj aywNCj4gPiAgICB4ZW5fZG9faHlwZXJ2aXNvcl9jYWxsYmFjay0+IHhlbl9ldnRjaG5fZG9fdXBj YWxsLT4uLi4gLT5kcm9wX290aGVyX21tX3JlZg0KPiA+DQo+ID4gICAgV2hhdCAgY3B1X3RsYnN0 YXRlLnN0YXRlIHNob3VsZCBiZSwgIGNvdWxkICBUTEJTVEFURV9PSyBvciBUTEJTVEFURV9MQVpZ IGFsbCBiZSBwb3NzaWJsZT8NCj4gPiAgICBUaGF0IGlzIGFmdGVyIGEgaHlwZXJjYWxsIGZyb20g dXNlcnNwYWNlLCBzdGF0ZSB3aWxsIGJlIFRMQlNUQVRFX09LLCBhbmQNCj4gPiAgICAgIGlmIGZy b20ga2VybmVsIHNwYWNlLCBzdGF0ZSB3aWxsIGJlIFRMQlNUQVRFX0xBWkUgPw0KPiA+DQo+ID4g ICAgICAgdGhhbmtzLg0KPg0KPiBpdCBsb29rcyBhIGJ1ZyBpbiBkcm9wX290aGVyX21tX3JlZiBp bXBsZW1lbnRhdGlvbiwgdGhhdCBjdXJyZW50IFRMQiBzdGF0ZSBzaG91bGQgYmUgY2hlY2tlZA0K 
PiBiZWZvcmUgaW52b2tpbmcgbGVhdmVfbW0oKS4gVGhlcmUncyBhIHdpbmRvdyBiZXR3ZWVuIGJl bG93IGxpbmVzIG9mIGNvZGU6DQo+DQo+IDx4ZW5fZHJvcF9tbV9yZWY+DQo+IC8qIEdldCB0aGUg Im9mZmljaWFsIiBzZXQgb2YgY3B1cyByZWZlcnJpbmcgdG8gb3VyIHBhZ2V0YWJsZS4gKi8NCj4g aWYgKCFhbGxvY19jcHVtYXNrX3ZhcigmbWFzaywgR0ZQX0FUT01JQykpIHsNCj4gZm9yX2VhY2hf b25saW5lX2NwdShjcHUpIHsNCj4gaWYgKCFjcHVtYXNrX3Rlc3RfY3B1KGNwdSwgbW1fY3B1bWFz ayhtbSkpDQo+ICYmIHBlcl9jcHUoeGVuX2N1cnJlbnRfY3IzLCBjcHUpICE9IF9fcGEobW0tPnBn ZCkpDQo+IGNvbnRpbnVlOw0KPiBzbXBfY2FsbF9mdW5jdGlvbl9zaW5nbGUoY3B1LCBkcm9wX290 aGVyX21tX3JlZiwgbW0sIDEpOw0KPiB9DQo+IHJldHVybjsNCj4gfQ0KPg0KPiB0aGVyZSdzIGNo YW5jZSB0aGF0IHdoZW4gc21wX2NhbGxfZnVuY3Rpb25fc2luZ2xlIGlzIGludm9rZWQsIGFjdHVh bCBUTEIgc3RhdGUgaGFzIGJlZW4NCj4gdXBkYXRlZCBpbiB0aGUgb3RoZXIgY3B1LiBUaGUgdXBz dHJlYW0ga2VybmVsIHBhdGNoIHlvdSByZWZlcnJlZCB0byBlYXJsaWVyIGp1c3QgbWFrZXMNCj4g dGhpcyBidWcgZXhwb3NlZCBtb3JlIGVhc2lseS4gQnV0IGV2ZW4gd2l0aG91dCB0aGlzIHBhdGNo LCB5b3UgbWF5IHN0aWxsIHN1ZmZlciBzdWNoIGlzc3VlDQo+IHdoaWNoIGlzIHdoeSByZXZlcnRp bmcgdGhlIHBhdGNoIGRvZXNuJ3QgaGVscC4NCj4NCj4gQ291bGQgeW91IHRyeSBhZGRpbmcgYSBj aGVjayBpbiBkcm9wX290aGVyX21tX3JlZj8NCj4NCj4gaWYgKGFjdGl2ZV9tbSA9PSBtbSAmJiBw ZXJjcHVfcmVhZChjcHVfdGxic3RhdGUuc3RhdGUpICE9IFRMQlNUQVRFX09LKQ0KPiBsZWF2ZV9t bShzbXBfcHJvY2Vzc29yX2lkKCkpOw0KPg0KPiBvbmNlIHRoZSBpbnRlcnJ1cHRlZCBjb250ZXh0 IGhhcyBUTEJTVEFURV9PSywgaXQgaW1wbGljYXRlcyB0aGF0IGxhdGVyIGl0IHdpbGwgaGFuZGxl DQo+IHRoZSBUTEIgZmx1c2ggYW5kIHRodXMgbm8gbmVlZCBmb3IgbGVhdmVfbW0gZnJvbSBpbnRl cnJ1cHQgaGFuZGxlciwgYW5kIHRoYXQncyB0aGUNCj4gYXNzdW1wdGlvbiBvZiBkb2luZyBsZWF2 ZV9tbS4NCj4NCj4gVGhhbmtzDQo+IEtldmluDQo= --_000_625BA99ED14B2D499DC4E29D8138F1505C7F2C52EBshsmsx502ccrc_ Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: quoted-printable

I think that should be fine. Note a later check:

        /* If this cpu still has a stale cr3 reference, then make sure
           it has been flushed. */
        if (percpu_read(xen_current_cr3) == __pa(mm->pgd))
                load_cr3(swapper_pg_dir);

This should ensure that the stale TLB is flushed if this cpu is still in lazy mode.
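
As a rough illustration of how the proposed state test and this later cr3 check fit together, here is a userspace C sketch (not the real drop_other_mm_ref). It assumes that leave_mm() switches the CPU to swapper_pg_dir and must not run in TLBSTATE_OK; struct cpu_view, SWAPPER_PGD, the sketch_* names and the pgd values are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: the names mirror the kernel's, nothing here is kernel code. */
enum tlb_state { TLBSTATE_OK = 1, TLBSTATE_LAZY = 2 };

#define SWAPPER_PGD ((uintptr_t)0x1000) /* hypothetical address of swapper_pg_dir */

struct cpu_view {
    enum tlb_state state;  /* stands in for cpu_tlbstate.state */
    uintptr_t active_pgd;  /* pgd of cpu_tlbstate.active_mm */
    uintptr_t current_cr3; /* stands in for per-CPU xen_current_cr3 */
};

void sketch_load_cr3(struct cpu_view *cpu, uintptr_t pgd)
{
    cpu->current_cr3 = pgd;
}

/* Assumption: leave_mm() is only legal outside TLBSTATE_OK and loads swapper_pg_dir. */
void sketch_leave_mm(struct cpu_view *cpu)
{
    assert(cpu->state != TLBSTATE_OK);
    sketch_load_cr3(cpu, SWAPPER_PGD);
}

/* Model of the IPI handler: guarded leave_mm(), then the stale-cr3 check. */
void sketch_drop_other_mm_ref(struct cpu_view *cpu, uintptr_t mm_pgd)
{
    /* Only drop the lazy reference when the CPU is not in TLBSTATE_OK. */
    if (cpu->active_pgd == mm_pgd && cpu->state != TLBSTATE_OK)
        sketch_leave_mm(cpu);

    /* If cr3 still points at the dying pagetable, switch away from it. */
    if (cpu->current_cr3 == mm_pgd)
        sketch_load_cr3(cpu, SWAPPER_PGD);
}
```

In this model a lazy CPU ends up on swapper_pg_dir either through leave_mm() or through the cr3 check, and a CPU that has nothing to do with the mm is left untouched.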

 

Th= anks

Kevin

 

From: MaoXiaoyun [mailto:tinnycloud@hotmail.com]
Sent: Tuesday, April 26, 2011 3:05 PM
To: Tian, Kevin; jeremy@goop.org
Cc: xen devel; giamteckchoon@gmail.com; konrad.wilk@oracle.com
Subject: RE: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61

 = ;

Many thanks, Kevin.

I agree on the race window.
One more thing: in my understanding, the CPU that sends out the IPI will unpin the pagetable after
receiving ACKs from all the other CPUs. If a CPU that receives the IPI enters drop_other_mm_ref
with TLBSTATE_OK and does nothing, is it possible that it ends up using a stale pagetable
(one that has been unpinned by the sender CPU)?

So do we need to flush the TLB when its state is TLBSTATE_OK?

if (active_mm == mm) {
        if (percpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
                load_cr3(mm->pgd);
        else
                leave_mm(smp_processor_id());
}

 


= = --_000_625BA99ED14B2D499DC4E29D8138F1505C7F2C52EBshsmsx502ccrc_-- --===============1064128113== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --===============1064128113==-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeremy Fitzhardinge Subject: Re: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Thu, 28 Apr 2011 16:29:09 -0700 Message-ID: <4DB9F845.6020204@goop.org> References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , <4DA8B715.9080508@goop.org> <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: "Tian, Kevin" Cc: MaoXiaoyun , xen devel , "giamteckchoon@gmail.com" , "konrad.wilk@oracle.com" List-Id: xen-devel@lists.xenproject.org On 04/25/2011 10:52 PM, Tian, Kevin wrote: >> From: MaoXiaoyun >> Sent: Monday, April 25, 2011 11:15 AM >>> Date: Fri, 15 Apr 2011 14:22:29 -0700 >>> From: jeremy@goop.org >>> To: tinnycloud@hotmail.com >>> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; konrad.wi= lk@oracle.com >>> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61 >>> >>> On 04/15/2011 05:23 AM, MaoXiaoyun wrote: >>>> Hi=EF=BC=9A >>>> >>>> Could the crash related to this patch ? >>>> http://git.kernel.org/?p=3Dlinux/kernel/git/jeremy/xen.git;a=3Dcommi= tdiff;h=3D45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3 >>>> >>>> Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is befor= e >>>> cpumask_clear_cpu(line 49). 
>>>> Could it possible that right after execute line 40 of mmu_context.h, >>>> CPU revice IPI from other CPU to >>>> flush the mm, and when in interrupt, find the TLB state happened to = be >>>> TLBSTATE_OK. Which conflicts. >>> Does reverting it help? >>> >>> J >> =20 >> Hi Jeremy: >> =20 >> The lastest test result shows the reverting didn't help. >> Kernel panic exactly at the same place in tlb.c. >> =20 >> I have question about TLB state, from the stack,=20 >> xen_do_hypervisor_callback-> xen_evtchn_do_upcall->... ->drop_othe= r_mm_ref >> =20 >> What cpu_tlbstate.state should be, could TLBSTATE_OK or TLBSTAT= E_LAZY all be possible?=20 >> That is after a hypercall from userspace, state will be TLBSTATE_O= K, and >> if from kernel space, state will be TLBSTATE_LAZE ?=20 >> =20 >> thanks. > it looks a bug in drop_other_mm_ref implementation, that current TLB st= ate should be checked > before invoking leave_mm(). There's a window between below lines of cod= e: > > > /* Get the "official" set of cpus referring to our pagetable. */ > if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) { > for_each_online_cpu(cpu) { > if (!cpumask_test_cpu(cpu, mm_cpumask(mm)) > && per_cpu(xen_current_cr3, cpu) !=3D __pa(= mm->pgd)) > continue; > smp_call_function_single(cpu, drop_other_mm_ref= , mm, 1); > } > return; > } > > there's chance that when smp_call_function_single is invoked, actual TL= B state has been > updated in the other cpu. The upstream kernel patch you referred to ear= lier just makes > this bug exposed more easily. But even without this patch, you may stil= l suffer such issue > which is why reverting the patch doesn't help. > > Could you try adding a check in drop_other_mm_ref? 
>
>         if (active_mm == mm && percpu_read(cpu_tlbstate.state) != TLBSTATE_OK)
>                 leave_mm(smp_processor_id());
>
> once the interrupted context has TLBSTATE_OK, it implicates that later it will handle
> the TLB flush and thus no need for leave_mm from interrupt handler, and that's the
> assumption of doing leave_mm.

That seems reasonable. MaoXiaoyun, does it fix the bug for you?

Kevin, could you submit this as a proper patch?

Thanks,
J

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Tian, Kevin"
Subject: RE: RE: Kernel BUG at arch/x86/mm/tlb.c:61
Date: Fri, 29 Apr 2011 08:19:44 +0800
Message-ID: <625BA99ED14B2D499DC4E29D8138F1505C843BB27A@shsmsx502.ccr.corp.intel.com>
References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , <4DA8B715.9080508@goop.org> <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com> <4DB9F845.6020204@goop.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0083845746=="
Return-path:
In-Reply-To: <4DB9F845.6020204@goop.org>
Content-Language: en-US
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jeremy Fitzhardinge
Cc: MaoXiaoyun , xen devel , "giamteckchoon@gmail.com" , "konrad.wilk@oracle.com"
List-Id: xen-devel@lists.xenproject.org

--===============0083845746==
Content-Language: en-US
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

> From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
> Sent: Friday, April 29, 2011 7:29 AM
>
> On 04/25/2011 10:52 PM, Tian, Kevin wrote:
> >> From: MaoXiaoyun
> >> Sent: Monday, April 25, 2011 11:15 AM
> >>> Date: Fri, 15 Apr 2011 14:22:29 -0700
> >>> From: jeremy@goop.org
> >>> To: tinnycloud@hotmail.com
> >>> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com;
> >>> konrad.wilk@oracle.com
> >>> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
> >>>
> >>> On 04/15/2011 05:23 AM, MaoXiaoyun wrote:
> >>>> Hi：
> >>>>
> >>>> Could the crash related to this patch ?
> >>>> http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3
> >>>>
> >>>> Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is
> >>>> before cpumask_clear_cpu(line 49).
> >>>> Could it possible that right after execute line 40 of
> >>>> mmu_context.h, CPU revice IPI from other CPU to flush the mm, and
> >>>> when in interrupt, find the TLB state happened to be TLBSTATE_OK.
> >>>> Which conflicts.
> >>> Does reverting it help?
> >>>
> >>> J
> >>
> >> Hi Jeremy:
> >>
> >>     The lastest test result shows the reverting didn't help.
> >>     Kernel panic exactly at the same place in tlb.c.
> >>
> >>     I have question about TLB state, from the stack,
> >>     xen_do_hypervisor_callback-> xen_evtchn_do_upcall->...
> >> ->drop_other_mm_ref
> >>
> >>     What cpu_tlbstate.state should be, could TLBSTATE_OK or TLBSTATE_LAZY all be possible?
> >>     That is after a hypercall from userspace, state will be TLBSTATE_OK, and
> >>     if from kernel space, state will be TLBSTATE_LAZE ?
> >>
> >>     thanks.
> > it looks a bug in drop_other_mm_ref implementation, that current TLB
> > state should be checked before invoking leave_mm(). There's a window
> > between below lines of code:
> >
> > <xen_drop_mm_ref>
> >         /* Get the "official" set of cpus referring to our pagetable. */
> >         if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) {
> >                 for_each_online_cpu(cpu) {
> >                         if (!cpumask_test_cpu(cpu, mm_cpumask(mm))
> >                             && per_cpu(xen_current_cr3, cpu) != __pa(mm->pgd))
> >                                 continue;
> >                         smp_call_function_single(cpu, drop_other_mm_ref, mm, 1);
> >                 }
> >                 return;
> >         }
> >
> > there's chance that when smp_call_function_single is invoked, actual
> > TLB state has been updated in the other cpu. The upstream kernel patch
> > you referred to earlier just makes this bug exposed more easily. But
> > even without this patch, you may still suffer such issue which is why reverting
> > the patch doesn't help.
> >
> > Could you try adding a check in drop_other_mm_ref?
> >
> >         if (active_mm == mm && percpu_read(cpu_tlbstate.state) != TLBSTATE_OK)
> >                 leave_mm(smp_processor_id());
> >
> > once the interrupted context has TLBSTATE_OK, it implicates that later
> > it will handle the TLB flush and thus no need for leave_mm from
> > interrupt handler, and that's the assumption of doing leave_mm.
>
> That seems reasonable.  MaoXiaoyun, does it fix the bug for you?
>
> Kevin, could you submit this as a proper patch?
>

I'm waiting for Xiaoyun's test result before submitting a proper patch, since this
part of logic is tricky and his test can make sure we don't overlook some corner
cases. :-)

Thanks
Kevin

--===============0083845746==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0083845746==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: MaoXiaoyun
Subject: RE: RE: Kernel BUG at arch/x86/mm/tlb.c:61
Date: Fri, 29 Apr 2011 09:50:57 +0800
Message-ID:
References: , , , , , , , , , , , , , <4DA3438A.6070503@goop.org>, , , , , , <20110412100000.GA15647@dumpdata.com>, , , , , , , , , , <4DA8B715.9080508@goop.org>, , <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com>, <4DB9F845.6020204@goop.org>, <625BA99ED14B2D499DC4E29D8138F1505C843BB27A@shsmsx502.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0998087465=="
Return-path:
In-Reply-To: <625BA99ED14B2D499DC4E29D8138F1505C843BB27A@shsmsx502.ccr.corp.intel.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: kevin.tian@intel.com, jeremy@goop.org
Cc: xen devel , giamteckchoon@gmail.com, konrad.wilk@oracle.com
List-Id: xen-devel@lists.xenproject.org

--===============0998087465==
Content-Type: multipart/alternative; boundary="_ff952d7a-1353-4fad-add3-268a2e2826b3_"

--_ff952d7a-1353-4fad-add3-268a2e2826b3_
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

> From: kevin.tian@intel.com
> To: jeremy@goop.org
> CC: tinnycloud@hotmail.com; xen-devel@lists.xensource.com; giamteckchoon@gmail.com; konrad.wilk@oracle.com
> Date: Fri, 29 Apr 2011 08:19:44 +0800
> Subject: RE: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61
>
> > From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
> > Sent: Friday, April 29, 2011 7:29 AM
> >
> > On 04/25/2011 10:52 PM, Tian, Kevin wrote:
> > >> From: MaoXiaoyun
> > >> Sent: Monday, April 25, 2011 11:15 AM
> > >>> Date: Fri, 15 Apr 2011 14:22:29 -0700
> > >>> From: jeremy@goop.org
> > >>> To: tinnycloud@hotmail.com
> > >>> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com;
> > >>> konrad.wilk@oracle.com
> > >>> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
> > >>>
> > >>> On 04/15/2011 05:23 AM, MaoXiaoyun wrote:
> > >>>> Hi：
> > >>>>
> > >>>> Could the crash related to this patch ?
> > >>>> http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3
> > >>>>
> > >>>> Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is
> > >>>> before cpumask_clear_cpu(line 49).
> > >>>> Could it possible that right after execute line 40 of
> > >>>> mmu_context.h, CPU revice IPI from other CPU to flush the mm, and
> > >>>> when in interrupt, find the TLB state happened to be TLBSTATE_OK.
> > >>>> Which conflicts.
> > >>> Does reverting it help?
> > >>>
> > >>> J
> > >>
> > >> Hi Jeremy:
> > >>
> > >> The lastest test result shows the reverting didn't help.
> > >> Kernel panic exactly at the same place in tlb.c.
> > >>
> > >> I have question about TLB state, from the stack,
> > >> xen_do_hypervisor_callback-> xen_evtchn_do_upcall->...
> > >> ->drop_other_mm_ref
> > >>
> > >> What cpu_tlbstate.state should be, could TLBSTATE_OK or TLBSTATE_LAZY all be possible?
> > >> That is after a hypercall from userspace, state will be TLBSTATE_OK, and
> > >> if from kernel space, state will be TLBSTATE_LAZE ?
> > >>
> > >> thanks.
> > > it looks a bug in drop_other_mm_ref implementation, that current TLB
> > > state should be checked before invoking leave_mm(). There's a window
> > > between below lines of code:
> > >
> > > <xen_drop_mm_ref>
> > >         /* Get the "official" set of cpus referring to our pagetable. */
> > >         if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) {
> > >                 for_each_online_cpu(cpu) {
> > >                         if (!cpumask_test_cpu(cpu, mm_cpumask(mm))
> > >                             && per_cpu(xen_current_cr3, cpu) != __pa(mm->pgd))
> > >                                 continue;
> > >                         smp_call_function_single(cpu, drop_other_mm_ref, mm, 1);
> > >                 }
> > >                 return;
> > >         }
> > >
> > > there's chance that when smp_call_function_single is invoked, actual
> > > TLB state has been updated in the other cpu. The upstream kernel patch
> > > you referred to earlier just makes this bug exposed more easily. But
> > > even without this patch, you may still suffer such issue which is why reverting
> > > the patch doesn't help.
> > >
> > > Could you try adding a check in drop_other_mm_ref?
> > >
> > >         if (active_mm == mm && percpu_read(cpu_tlbstate.state) != TLBSTATE_OK)
> > >                 leave_mm(smp_processor_id());
> > >
> > > once the interrupted context has TLBSTATE_OK, it implicates that later
> > > it will handle the TLB flush and thus no need for leave_mm from
> > > interrupt handler, and that's the assumption of doing leave_mm.
> >
> > That seems reasonable. MaoXiaoyun, does it fix the bug for you?
> >
> > Kevin, could you submit this as a proper patch?
> >
>
> I'm waiting for Xiaoyun's test result before submitting a proper patch, since this
> part of logic is tricky and his test can make sure we don't overlook some corner
> cases. :-)
>

I think it works. The test has been running over 70 hours successfully.
My plan is run one week.

Thanks.

> Thanks
> Kevin

--_ff952d7a-1353-4fad-add3-268a2e2826b3_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

--_ff952d7a-1353-4fad-add3-268a2e2826b3_--

--===============0998087465==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0998087465==--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Tian, Kevin"
Subject: RE: RE: Kernel BUG at arch/x86/mm/tlb.c:61
Date: Fri, 29 Apr 2011 09:57:11 +0800
Message-ID: <625BA99ED14B2D499DC4E29D8138F1505C843BB382@shsmsx502.ccr.corp.intel.com>
References: , , , , , , , , , , , , , <4DA3438A.6070503@goop.org>, , , , , , <20110412100000.GA15647@dumpdata.com>, , , , , , , , , , <4DA8B715.9080508@goop.org>, , <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com>, <4DB9F845.6020204@goop.org>, <625BA99ED14B2D499DC4E29D8138F1505C843BB27A@shsmsx502.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0574505600=="
Return-path:
In-Reply-To:
Content-Language: en-US
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: MaoXiaoyun , "jeremy@goop.org"
Cc: xen devel , "giamteckchoon@gmail.com" , "konrad.wilk@oracle.com"
List-Id: xen-devel@lists.xenproject.org

--===============0574505600==
Content-Language: en-US
Content-Type: multipart/alternative; boundary="_000_625BA99ED14B2D499DC4E29D8138F1505C843BB382shsmsx502ccrc_"

--_000_625BA99ED14B2D499DC4E29D8138F1505C843BB382shsmsx502ccrc_
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

OK, thanks for the update. I’ll send out the patch then

Thanks
Kevin

From: MaoXiaoyun [mailto:tinnycloud@hotmail.com]
Sent: Friday, April 29, 2011 9:51 AM
To: Tian, Kevin; jeremy@goop.org
Cc: xen devel; giamteckchoon@gmail.com; konrad.wilk@oracle.com
Subject: RE: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61


> From: kevin.tian@intel.com
> To: jeremy@goop.org
> CC: tinnycloud@hotmail.com; xen-devel@lists.xensource.com; giamteckchoon@gmail.com; konrad.wilk@oracle.com
> Date: Fri, 29 Apr 2011 08:19:44 +0800
> Subject: RE: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61
>
> > From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
> > Sent: Friday, April 29, 2011 7:29 AM
> >
> > On 04/25/2011 10:52 PM, Tian, Kevin wrote:
> > >> From: MaoXiaoyun
> > >> Sent: Monday, April 25, 2011 11:15 AM
> > >>> Date: Fri, 15 Apr 2011 14:22:29 -0700
> > >>> From: jeremy@goop.org
> > >>> To: tinnycloud@hotmail.com
> > >>> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com;
> > >>> konrad.wilk@oracle.com
> > >>> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61
> > >>>
> > >>> On 04/15/2011 05:23 AM, MaoXiaoyun wrote:
> > >>>> Hi：
> > >>>>
> > >>>> Could the crash related to this patch ?
> > >>>> http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commitdiff;h=45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3
> > >>>>
> > >>>> Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is
> > >>>> before cpumask_clear_cpu(line 49).
> > >>>> Could it possible that right after execute line 40 of
> > >>>> mmu_context.h, CPU revice IPI from other CPU to flush the mm, and
> > >>>> when in interrupt, find the TLB state happened to be TLBSTATE_OK.
> > >>>> Which conflicts.
> > >>> Does reverting it help?
> > >>>
> > >>> J
> > >>
> > >> Hi Jeremy:
> > >>
> > >> The lastest test result shows the reverting didn't help.
> > >> Kernel panic exactly at the same place in tlb.c.
> > >>
> > >> I have question about TLB state, from the stack,
> > >> xen_do_hypervisor_callback-> xen_evtchn_do_upcall->...
> > >> ->drop_other_mm_ref
> > >>
> > >> What cpu_tlbstate.state should be, could TLBSTATE_OK or TLBSTATE_LAZY all be possible?
> > >> That is after a hypercall from userspace, state will be TLBSTATE_OK, and
> > >> if from kernel space, state will be TLBSTATE_LAZE ?
> > >>
> > >> thanks.
> > > it looks a bug in drop_other_mm_ref implementation, that current TLB
> > > state should be checked before invoking leave_mm(). There's a window
> > > between below lines of code:
> > >
> > > <xen_drop_mm_ref>
> > >         /* Get the "official" set of cpus referring to our pagetable. */
> > >         if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) {
> > >                 for_each_online_cpu(cpu) {
> > >                         if (!cpumask_test_cpu(cpu, mm_cpumask(mm))
> > >                             && per_cpu(xen_current_cr3, cpu) != __pa(mm->pgd))
> > >                                 continue;
> > >                         smp_call_function_single(cpu, drop_other_mm_ref, mm, 1);
> > >                 }
> > >                 return;
> > >         }
> > >
> > > there's chance that when smp_call_function_single is invoked, actual
> > > TLB state has been updated in the other cpu. The upstream kernel patch
> > > you referred to earlier just makes this bug exposed more easily. But
> > > even without this patch, you may still suffer such issue which is why reverting
> > > the patch doesn't help.
> > >
> > > Could you try adding a check in drop_other_mm_ref?
> > >
> > >         if (active_mm == mm && percpu_read(cpu_tlbstate.state) != TLBSTATE_OK)
> > >                 leave_mm(smp_processor_id());
> > >
> > > once the interrupted context has TLBSTATE_OK, it implicates that later
> > > it will handle the TLB flush and thus no need for leave_mm from
> > > interrupt handler, and that's the assumption of doing leave_mm.
> >
> > That seems reasonable. MaoXiaoyun, does it fix the bug for you?
> >
> > Kevin, could you submit this as a proper patch?
> >
>
> I'm waiting for Xiaoyun's test result before submitting a proper patch, since this
> part of logic is tricky and his test can make sure we don't overlook some corner
> cases. :-)
>

I think it works. The test has been running over 70 hours successfully.
My plan is run one week.

Thanks.

> Thanks
> Kevin

--_000_625BA99ED14B2D499DC4E29D8138F1505C843BB382shsmsx502ccrc_
Content-Type: text/html; charset="gb2312"
Content-Transfer-Encoding: quoted-printable

OK, thanks for the update. I’ll send out the patch then

Thanks
Kevin

From: MaoXiaoyun [mailto:tinnycloud@hotmail.com]
Sent: Friday, April 29, 2011 9:51 AM
To: Tian, Kevin; jeremy@goop.org
Cc: xen devel; giamteckchoon@gmail.com; konrad.wilk@oracle.com
Subject: RE: [Xen-devel] RE: Kernel BUG at arch/x86/mm/tlb.c:61

--_000_625BA99ED14B2D499DC4E29D8138F1505C843BB382shsmsx502ccrc_--

--===============0574505600==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--===============0574505600==--