From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5046549C.7030008@xenomai.org>
Date: Tue, 04 Sep 2012 21:21:00 +0200
From: Gilles Chanteperdrix
MIME-Version: 1.0
References: <50460BCE.8010505@xenomai.org> <50464969.2000902@xenomai.org>
In-Reply-To: <50464969.2000902@xenomai.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai] kernel NULL pointer dereference
List-Id: Discussions about the Xenomai project
To: Henri Roosen
Cc: Xenomai

On 09/04/2012 08:33 PM, Gilles Chanteperdrix wrote:
> On 09/04/2012 04:28 PM, Henri Roosen wrote:
>
>> On Tue, Sep 4, 2012 at 4:10 PM, Gilles Chanteperdrix
>> wrote:
>>> On 09/04/2012 03:42 PM, Henri Roosen wrote:
>>>> Hi,
>>>>
>>>> I'm using the bleeding edge of Xenomai (0590cb45adce468f619) and Ipipe
>>>> (d21e8cdbdcf21ade) on a x86 multicore system and kernel 3.4.6.
>>>> I reserved one cpu (kernel param isolcpus=1).
>>>>
>>>> Our application triggers the following NULL pointer dereference when I
>>>> set the affinity of some tasks to cpu 0 and other tasks to cpu 1.
>>>> The application does not trigger this when all tasks have the same
>>>> affinity (set via /proc/xenomai/affinity).
>>>>
>>>> I was able to reproduce this also under QEMU and will do some
>>>> debugging, but maybe someone knows what is wrong already by seeing the
>>>> stacktrace below:
>>>
>>> Could you try to reduce the bug to a simple testcase which we would try
>>> and run to reproduce?
>>>
>>>> [ 108.013023] BUG: unable to handle kernel NULL pointer dereference
>>> at 00000294
>>>> [ 108.013550] IP: [] __lock_task_sighand+0x53/0xc3
>>>
>>> Or send us a disassembly of the function __lock_task_sighand?
>
>
> Looks like someone is calling send_sig_info with an invalid pointer.
> There is something seriously wrong.
>
> On the other hand, now that I think about it, you need at least the
> following patch:
>
> diff --git a/ksrc/nucleus/intr.c b/ksrc/nucleus/intr.c
> index c75fcac..0f37bb2 100644
> --- a/ksrc/nucleus/intr.c
> +++ b/ksrc/nucleus/intr.c
> @@ -93,8 +93,18 @@ void xnintr_host_tick(struct xnsched *sched) /* Interrupts off. */
>  
>  void xnintr_clock_handler(void)
>  {
> -	struct xnsched *sched = xnpod_current_sched();
>  	xnstat_exectime_t *prev;
> +	struct xnsched *sched;
> +	unsigned cpu;
> +
> +	cpu = xnarch_current_cpu();
> +
> +	if (!cpumask_test_cpu(cpu, &xnarch_supported_cpus)) {
> +		xnarch_relay_tick();
> +		return;
> +	}
> +
> +	sched = xnpod_sched_slot(cpu);
>  
>  	prev = xnstat_exectime_switch(sched,
>  			&nkclock.stat[xnsched_cpu(sched)].account);

It should work (I did not test it), with the following patch on the I-pipe:

diff --git a/kernel/ipipe/timer.c b/kernel/ipipe/timer.c
index d51fa62..301cdc0 100644
--- a/kernel/ipipe/timer.c
+++ b/kernel/ipipe/timer.c
@@ -176,11 +176,17 @@ int ipipe_select_timers(const struct cpumask *mask)
 	hrclock_freq = __ipipe_hrclock_freq;
 
 	spin_lock_irqsave(&lock, flags);
-	for_each_cpu(cpu, mask) {
+	for_each_cpu(cpu, cpu_online_mask) {
 		list_for_each_entry(t, &timers, link) {
 			if (!cpumask_test_cpu(cpu, t->cpumask))
 				continue;
 
+			if (!cpumask_test_cpu(cpu, mask)
+			    && t->irq == per_cpu(ipipe_percpu.hrtimer_irq, 0)) {
+				per_cpu(ipipe_percpu.hrtimer_irq, cpu) = t->irq;
+				goto found;
+			}
+
 			evtdev = t->host_timer;
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 			if (!evtdev
@@ -188,10 +194,16 @@ int ipipe_select_timers(const struct cpumask *mask)
 #endif /* CONFIG_GENERIC_CLOCKEVENTS */
 				goto found;
 		}
+		if (!cpumask_test_cpu(cpu, mask))
+			continue;
+
 		printk("I-pipe: could not find timer for cpu #%d\n", cpu);
 		goto err_remove_all;
 
 	found:
+		if (!cpumask_test_cpu(cpu, mask))
+			continue;
+
 		if (__ipipe_hrtimer_freq == 0)
 			__ipipe_hrtimer_freq = t->freq;
 		per_cpu(ipipe_percpu.hrtimer_irq, cpu) = t->irq;

-- 
Gilles.