From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tomasz Wroblewski Subject: Re: [PATCH] Fix boot crash on xsm/flask enabled builds when no policy module is present Date: Mon, 26 Aug 2013 14:24:31 +0200 Message-ID: <521B48FF.1040904@citrix.com> References: <1377511404-3365-1-git-send-email-tomasz.wroblewski@citrix.com> <521B543902000078000EE55D@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta4.messagelabs.com ([85.158.143.247]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1VDvrh-0003YL-J6 for xen-devel@lists.xenproject.org; Mon, 26 Aug 2013 12:25:49 +0000 In-Reply-To: <521B543902000078000EE55D@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: xen-devel@lists.xenproject.org, dgdegra@tycho.nsa.gov List-Id: xen-devel@lists.xenproject.org On 08/26/2013 01:12 PM, Jan Beulich wrote: >>>> On 26.08.13 at 12:03, Tomasz Wroblewski wrote: >> Xen crashes on boot of xsm/flask enabled builds, if policy module is not >> specified. >> This seems to have worked on 4.1 at least. > Looking at the code (4.1.5) I can't see what would prevent the > same NULL pointer deref. Care to explain? The crash doesn't happen at the NULL pointer dereference site though, but a bit later, when xen tries to flush tlbs for first time I believe, which happens during page allocation for the initial domain structure. I traced it to the following ASSERT in smp.c (so yes I should add this particular crash likely is limited to debug builds then) void flush_area_mask(const cpumask_t *mask, const void *va, unsigned int flags) { ASSERT(local_irq_is_enabled()); ... The actual crash message is unhelpful since it's basically only ... (XEN) Using scheduler: SMP Credit Scheduler (credit) (XEN) Unknown interrupt (cr2=0000000000000000) Either removing the assert (which is obviously bad), or checking for the null pointer deref as in the submitted patch seems to be fixing it. I'm suspecting it was always broken somehow but just was hidden or had different side effects on 4.1 than it does now. I do lack for a good explanation why fiddling with null addresses breaks up this assert, though. >> Can be fixed by testing whether >> policy_buffer >> is NULL before attempting to load from it - it's a global which is set to >> non-NULL when >> policy module is detected. >> >> Signed-off-by: Tomasz Wroblewski >> --- >> xen/xsm/flask/hooks.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c >> index fa0589a..cfa2929 100644 >> --- a/xen/xsm/flask/hooks.c >> +++ b/xen/xsm/flask/hooks.c >> @@ -1585,7 +1585,8 @@ static __init int flask_init(void) >> if ( register_xsm(&flask_ops) ) >> panic("Flask: Unable to register with XSM.\n"); >> >> - ret = security_load_policy(policy_buffer, policy_size); >> + if ( policy_buffer ) >> + ret = security_load_policy(policy_buffer, policy_size); > Question is whether policy_buffer == NULL really isn't supposed > to result in a -E... return value (as in fact flask initialization failed). > > Jan >