From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Subject: Re: [PATCH] Fix boot crash on xsm/flask enabled builds
 when no policy module is present
Date: Mon, 26 Aug 2013 15:34:49 +0200
Message-ID: <521B5979.3050509@citrix.com>
References: <1377511404-3365-1-git-send-email-tomasz.wroblewski@citrix.com>
	<521B543902000078000EE55D@nat28.tlf.novell.com>
	<521B48FF.1040904@citrix.com>
	<521B6D7302000078000EE64E@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta14.messagelabs.com ([193.109.254.103])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <tomasz.wroblewski@citrix.com>) id 1VDwxa-0005Wc-Ox
	for xen-devel@lists.xenproject.org; Mon, 26 Aug 2013 13:35:58 +0000
In-Reply-To: <521B6D7302000078000EE64E@nat28.tlf.novell.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich <JBeulich@suse.com>
Cc: xen-devel@lists.xenproject.org, dgdegra@tycho.nsa.gov
List-Id: xen-devel@lists.xenproject.org

On 08/26/2013 03:00 PM, Jan Beulich wrote:
>>>> On 26.08.13 at 14:24, Tomasz Wroblewski<tomasz.wroblewski@citrix.com>  wrote:
>> On 08/26/2013 01:12 PM, Jan Beulich wrote:
>>>>>> On 26.08.13 at 12:03, Tomasz Wroblewski<tomasz.wroblewski@citrix.com>   wrote:
>>>> Xen crashes on boot of xsm/flask enabled builds, if policy module is not
>>>> specified.
>>>> This seems to have worked on 4.1 at least.
>>> Looking at the code (4.1.5) I can't see what would prevent the
>>> same NULL pointer deref. Care to explain?
>> The crash doesn't happen at the NULL pointer dereference site though,
> Then does it deref the NULL pointer, or does it not? If it does and
> merely doesn't crash because something happens to be mapped
> there, that's still a bug.
>
>> but a bit later, when xen tries to flush tlbs for first time I believe,
>> which happens during page allocation for the initial domain structure. I
>> traced it to the following ASSERT in smp.c (so yes I should add this
>> particular crash likely is limited to debug builds then)
>>
>> void flush_area_mask(const cpumask_t *mask, const void *va,
>>                      unsigned int flags)
>> {
>>       ASSERT(local_irq_is_enabled());
>>       ...
>>
>> The actual crash message is unhelpful since it's basically only
>>
>> ...
>> (XEN) Using scheduler: SMP Credit Scheduler (credit)
>> (XEN) Unknown interrupt (cr2=0000000000000000)
>>
>>
>> Either removing the assert (which is obviously bad), or checking for the
> The assertion is in no way bad. It's the too early use of the
> function that is the problem here.

Aye, I meant that removing the assertion is bad, not the assertion itself.
Looks like this needs a bit more investigation to see which exact bits inside
security_load_policy cause this, so I'll do that.
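
To make the "too early use" point concrete, here is a minimal user-space
sketch of the ordering problem. It is not Xen code: every model_* name is
hypothetical, and the single flag only stands in for the real interrupt
state. It assumes nothing more than what was said above, namely that the
flush path asserts interrupts are enabled and that something reaches it
before they are.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the CPU interrupt flag; boot starts with IRQs off. */
static bool model_irqs_enabled = false;

/* Hypothetical stand-in for the point in boot where IRQs get enabled. */
static void model_local_irq_enable(void)
{
    model_irqs_enabled = true;
}

/* Probe: would a flush be legal right now? */
static bool model_can_flush(void)
{
    return model_irqs_enabled;
}

/*
 * Hypothetical stand-in for a function guarded the way flush_area_mask()
 * is: it asserts that IRQs are enabled before doing any work.
 */
static void model_flush(void)
{
    assert(model_irqs_enabled);
    /* the actual flush work would go here */
}

/*
 * Hypothetical early-init step that ends up flushing, e.g. as a side
 * effect of an allocation. Returns false instead of tripping the assert
 * so the caller can see that it ran too early.
 */
static bool model_early_init_step(void)
{
    if (!model_can_flush())
        return false;
    model_flush();
    return true;
}
```

Calling model_early_init_step() before model_local_irq_enable() reports
failure, and calling it afterwards succeeds, so in this model the bug is
in the ordering of the callers rather than in the assertion.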

>> null pointer deref as in the submitted patch seems to be fixing it. I'm
>> suspecting it was always broken somehow but just was hidden or had
>> different side effects on 4.1 than it does now. I do lack for a good
>> explanation why fiddling with null addresses breaks up this assert, though.
> Also, you didn't show the call trace that made things get here (yes,
> you may need to construct this manually). I'm in no way convinced
> that there's a NULL pointer involved here at all - the fact that CR2
> is zero doesn't mean a page fault occurred in the first place.
Yeah, I will investigate more.
> Jan
>