From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: [Question] Is it safe to call "xmalloc()" with irq disabled?
Date: Tue, 01 Mar 2011 08:50:19 +0000
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Haitao Shan
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

We need to move dynamic allocation into CPU_UP_PREPARE context. Sadly that
will need surgery on the machine-check cruft. I'll take a look later today
and see if I can do a suitable hatchet job for 4.1.

 -- Keir

On 01/03/2011 08:22, "Haitao Shan" wrote:

> Hi, Keir,
>
> Below is the log when I met the issue.
>>> ================
>>> (XEN)    ffff83023e257e40 ffff82c48019cc80 0000000100000000 0000000000000039
>>> (XEN)    ffff82c4802c6a00 0000000000000039 ffff82c4802c6a00 0000000000000039
>>> (XEN)    0000000000000000 0000000000000000 ffff83023e257e70 ffff82c48019f49a
>>> (XEN)    0000000000000039 ffff82c4802c6a00 ffff83023e257e90 ffff82c48019cb8d
>>> (XEN)    0000000000000039 ffff82c4802c6a00 ffff83023e257eb0 ffff82c48017815b
>>> (XEN) Xen call trace:
>>> (XEN)   [] flush_area_mask+0x1b/0x127
>>> (XEN)   [] alloc_heap_pages+0x5d6/0x61b
>>> (XEN)   [] alloc_domheap_pages+0xc7/0x13d
>>> (XEN)   [] alloc_xenheap_pages+0x50/0xd8
>>> (XEN)   [] xmalloc_pool_get+0x2b/0x2d
>>> (XEN)   [] xmem_pool_alloc+0x26c/0x4c2
>>> (XEN)   [] _xmalloc+0x106/0x1b6
>>> (XEN)   [] mcabanks_alloc+0x18/0xa4
>>> (XEN)   [] intel_mcheck_init+0x21/0x64e
>>> (XEN)   [] mcheck_init+0xdd/0x1b2
>>> (XEN)   [] identify_cpu+0x27d/0x282
>>> (XEN)   [] smp_store_cpu_info+0x3b/0xca
>>> (XEN)   [] smp_callin+0x8e/0x157
>>> (XEN)   [] start_secondary+0xab/0x126
>>> (XEN)
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 57:
>>> (XEN) Assertion 'local_irq_is_enabled()' failed at smp.c:234
>>> (XEN) ****************************************
>>> (XEN)
>>> (XEN) Reboot in five seconds...
>>> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
>>> ===============
>
> 2011/3/1 Keir Fraser
>> Haitao,
>>
>> Both _xmalloc and xfree can only safely be called with irqs enabled. I know
>> there is a somewhat suspicious area during CPU bringup where we temporarily
>> disable spinlock debugging. It would be nice to not need this. And for this
>> particular bug you are dealing with, perhaps we can fix it now -- what is
>> the backtrace for the failing allocation?
>>
>>  -- Keir
>>
>> On 01/03/2011 07:42, "Haitao Shan" wrote:
>>
>>> Hi, Keir,
>>>
>>> In my recent effort to debug cpu offline/online, I hit a Xen panic
>>> several times.
>>>
>>> The panic is caused by the following code path:
>>>
>>> xmalloc ---> alloc_heap_pages ---> flush_area_mask {
>>> ASSERT(local_irq_enabled)........}
>>>
>>> This brings me to the question: is it safe to call xmalloc with local irq
>>> disabled? As you can see, not every alloc_heap_pages call results in TLB
>>> flushing. But once it does, the assertion fails.
>>>
>>> In my case, xmalloc is called while starting secondary processors. Some
>>> initialization code runs with local irq disabled, for example, the MCA
>>> initialization. Normally this piece of code runs when no heap pages have
>>> a former owner (no domain is initialized at boot time, I guess), so
>>> calling xmalloc won't be a problem. But later, when this same piece of
>>> code runs as a result of a cpu online operation, it can trigger the
>>> assertion failure.
>>>
>>> What's your view on this, Keir?
>>> Is it the design that xmalloc must be called
>>> with local irq enabled? I have done a hack to remove the assertion.
>>> Everything works fine for me. But maybe I just happened not to run into
>>> any problem with the hack.
>>>
>>> Shan Haitao
>>>
>>
>>
>
>
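
For readers following the thread, the idea of moving the allocation into
CPU_UP_PREPARE context can be illustrated with a small standalone sketch.
This is not Xen code and not the eventual patch. The names irqs_enabled,
checked_alloc, cpu_up_prepare, cpu_starting, mca_banks and NR_CPUS are all
hypothetical stand-ins chosen for this example. The sketch only assumes what
the thread states: the allocator must not run with irqs disabled, so the
buffer is allocated earlier, in a context where irqs are enabled, and the
bring-up path on the new CPU merely uses it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 64              /* hypothetical CPU limit for this sketch */

/* Hypothetical stand-in for the local interrupt-enable flag. */
static bool irqs_enabled = true;

/* Hypothetical per-CPU slots, filled in during the prepare phase. */
static unsigned long *mca_banks[NR_CPUS];

/*
 * Stand-in for an allocator that, like _xmalloc in the thread,
 * is only safe to call with interrupts enabled.
 */
static void *checked_alloc(size_t size)
{
    assert(irqs_enabled);       /* mirrors the failed assertion in the log */
    return calloc(1, size);
}

/*
 * Prepare phase: assumed to run on an already-online CPU with
 * interrupts enabled, before the new CPU starts executing.
 */
static int cpu_up_prepare(unsigned int cpu)
{
    if (cpu >= NR_CPUS)
        return -1;
    if (mca_banks[cpu] == NULL)
        mca_banks[cpu] = checked_alloc(4 * sizeof(unsigned long));
    return mca_banks[cpu] ? 0 : -1;
}

/*
 * Bring-up phase: models the new CPU running with interrupts disabled.
 * It performs no allocation and only uses what was prepared earlier.
 */
static int cpu_starting(unsigned int cpu)
{
    bool saved = irqs_enabled;
    int rc;

    irqs_enabled = false;
    rc = (cpu < NR_CPUS && mca_banks[cpu] != NULL) ? 0 : -1;
    if (rc == 0)
        mca_banks[cpu][0] = cpu;    /* touch the preallocated buffer */
    irqs_enabled = saved;
    return rc;
}
```

Calling cpu_up_prepare(57) followed by cpu_starting(57) succeeds without
ever reaching the allocator while the flag is cleared. Calling
cpu_starting(57) alone returns -1, because nothing was preallocated.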