* Re: Spinlock bug??
2007-01-24 16:55 Spinlock bug?? JWM
@ 2007-01-25 9:12 ` Simon Derr
2007-01-25 17:39 ` Christoph Lameter
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Simon Derr @ 2007-01-25 9:12 UTC (permalink / raw)
To: linux-ia64
> Hi all;
> I'm working on a Bull 8-way ia64 system running a RedHat variant of
> 2.6.17.
> I keep getting a spin lock bug and dump, attached.
Hello,
If you are running a Bull Linux kernel, I suggest that you contact
Bull directly for this kind of issue. We have added a few custom patches
that can hardly be addressed by other people on this list.
That being said, I'm going to see what's going on in this particular case.
You're probably right.
Simon.
* Re: Spinlock bug??
2007-01-24 16:55 Spinlock bug?? JWM
2007-01-25 9:12 ` Simon Derr
@ 2007-01-25 17:39 ` Christoph Lameter
2007-01-25 21:49 ` JWM
2007-01-25 21:51 ` JWM
3 siblings, 0 replies; 5+ messages in thread
From: Christoph Lameter @ 2007-01-25 17:39 UTC (permalink / raw)
To: linux-ia64
On Wed, 24 Jan 2007, JWM wrote:
> Hi all;
> I'm working on a Bull 8-way ia64 system running a RedHat variant of
> 2.6.17.
> I keep getting a spin lock bug and dump, attached.
> It appears that cpuset_set_cpus_affinity is taking a task_lock on the
> task structure and only releasing it after the CPU has changed. That
> naturally causes the spin_bug function to get upset.
> The lock doesn't appear to be required, since set_cpus_allowed makes sure
> that things are serialized pretty well.
> Am I missing something here, or is this lock not required?
Try a newer kernel. That piece was reworked and cpuset_set_cpus_affinity
no longer exists in recent kernels. 2.6.20-rc6 has:
long sched_setaffinity(pid_t pid, cpumask_t new_mask)
{
        cpumask_t cpus_allowed;
        struct task_struct *p;
        int retval;

        lock_cpu_hotplug();
        read_lock(&tasklist_lock);

        p = find_process_by_pid(pid);
        if (!p) {
                read_unlock(&tasklist_lock);
                unlock_cpu_hotplug();
                return -ESRCH;
        }

        /*
         * It is not safe to call set_cpus_allowed with the
         * tasklist_lock held. We will bump the task_struct's
         * usage count and then drop tasklist_lock.
         */
        get_task_struct(p);
        read_unlock(&tasklist_lock);

        retval = -EPERM;
        if ((current->euid != p->euid) && (current->euid != p->uid) &&
                        !capable(CAP_SYS_NICE))
                goto out_unlock;

        retval = security_task_setscheduler(p, 0, NULL);
        if (retval)
                goto out_unlock;

        cpus_allowed = cpuset_cpus_allowed(p);
        cpus_and(new_mask, new_mask, cpus_allowed);
        retval = set_cpus_allowed(p, new_mask);

out_unlock:
        put_task_struct(p);
        unlock_cpu_hotplug();
        return retval;
}
* Re: Spinlock bug??
2007-01-24 16:55 Spinlock bug?? JWM
2007-01-25 9:12 ` Simon Derr
2007-01-25 17:39 ` Christoph Lameter
@ 2007-01-25 21:49 ` JWM
2007-01-25 21:51 ` JWM
3 siblings, 0 replies; 5+ messages in thread
From: JWM @ 2007-01-25 21:49 UTC (permalink / raw)
To: linux-ia64
Simon;
I've looked at the code, and the function that calls
cpuset_set_cpus_affinity does a get_task. That bumps the usage count and
should protect from an exiting process doing evil, shouldn't it?
So sched_setaffinity locks, bumps the task usage count, then unlocks
and calls cpuset_set_cpus_affinity. Then, in the Bull code,
cpuset_set_cpus_affinity takes a spin lock on the structure and
(potentially) moves it to another CPU.
It doesn't look like any change is required other than removing the lock
in cpuset_set_cpus_affinity.
Is there a possible race here I'm missing?
....JW
----- Original Message -----
From: "Simon Derr" <Simon.Derr@bull.net>
To: "JWM" <jwm@systemfabricworks.com>
Cc: <linux-ia64@vger.kernel.org>; "Philippe Garrigues"
<philippe.garrigues@bull.net>
Sent: Thursday, January 25, 2007 3:12 AM
Subject: Re: Spinlock bug??
>> Hi all;
>> I'm working on a Bull 8-way ia64 system running a RedHat variant of
>> 2.6.17.
>> I keep getting a spin lock bug and dump, attached.
>
> Hello,
>
> If you are running a Bull Linux kernel, I suggest that you contact
> Bull directly for this kind of issue. We have added a few custom patches
> that can hardly be addressed by other people on this list.
>
> That being said, I'm going to see what's going on in this particular case.
> You're probably right.
>
> Simon.
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
* Re: Spinlock bug??
2007-01-24 16:55 Spinlock bug?? JWM
` (2 preceding siblings ...)
2007-01-25 21:49 ` JWM
@ 2007-01-25 21:51 ` JWM
3 siblings, 0 replies; 5+ messages in thread
From: JWM @ 2007-01-25 21:51 UTC (permalink / raw)
To: linux-ia64
Thanks.
When I saw the code you pasted in, it became clear. The potential switch
in CPUs is protected by bumping the usage count on the task structure. I
haven't looked at the exit code, but I would think that should protect
this from exiting processes, wouldn't you?
....JW
----- Original Message -----
From: "Christoph Lameter" <clameter@sgi.com>
To: "JWM" <jwm@systemfabricworks.com>
Cc: <linux-ia64@vger.kernel.org>; <pj@sgi.com>
Sent: Thursday, January 25, 2007 11:39 AM
Subject: Re: Spinlock bug??
> On Wed, 24 Jan 2007, JWM wrote:
>
>> Hi all;
>> I'm working on a Bull 8-way ia64 system running a RedHat variant of
>> 2.6.17.
>> I keep getting a spin lock bug and dump, attached.
>> It appears that cpuset_set_cpus_affinity is taking a task_lock
>> on the
>> task structure and only releasing it after the CPU has changed. That
>> naturally causes the spin_bug function to get upset.
>> The lock doesn't appear to be required, since set_cpus_allowed makes
>> sure
>> that things are serialized pretty well.
>> Am I missing something here, or is this lock not required?
>
> Try a newer kernel. That piece was reworked and cpuset_set_cpus_affinity
> no longer exists in recent kernels. 2.6.20-rc6 has:
>
> long sched_setaffinity(pid_t pid, cpumask_t new_mask)
> {
>         cpumask_t cpus_allowed;
>         struct task_struct *p;
>         int retval;
>
>         lock_cpu_hotplug();
>         read_lock(&tasklist_lock);
>
>         p = find_process_by_pid(pid);
>         if (!p) {
>                 read_unlock(&tasklist_lock);
>                 unlock_cpu_hotplug();
>                 return -ESRCH;
>         }
>
>         /*
>          * It is not safe to call set_cpus_allowed with the
>          * tasklist_lock held. We will bump the task_struct's
>          * usage count and then drop tasklist_lock.
>          */
>         get_task_struct(p);
>         read_unlock(&tasklist_lock);
>
>         retval = -EPERM;
>         if ((current->euid != p->euid) && (current->euid != p->uid) &&
>                         !capable(CAP_SYS_NICE))
>                 goto out_unlock;
>
>         retval = security_task_setscheduler(p, 0, NULL);
>         if (retval)
>                 goto out_unlock;
>
>         cpus_allowed = cpuset_cpus_allowed(p);
>         cpus_and(new_mask, new_mask, cpus_allowed);
>         retval = set_cpus_allowed(p, new_mask);
>
> out_unlock:
>         put_task_struct(p);
>         unlock_cpu_hotplug();
>         return retval;
> }
>