linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* Re: I want to break softirq.c to fix broken hardware
  2001-08-09 21:56 I want to break softirq.c to fix broken hardware Justin (Gus) Hurwitz
@ 2001-08-09 18:08 ` Adrian Cox
  2001-08-09 22:32   ` Justin (Gus) Hurwitz
  0 siblings, 1 reply; 3+ messages in thread
From: Adrian Cox @ 2001-08-09 18:08 UTC (permalink / raw)
  To: Justin (Gus) Hurwitz; +Cc: linuxppc-embedded


Justin (Gus) Hurwitz wrote:

 > #else
 >         __asm__ __volatile__("\n\
 > 1:      lwzx    %0,0,%3\n\
 >         add     %0,%2,%0\n\
 >         stwx    %0,0,%3"
 >         : "=&r" (t), "=m" (v->counter)
 >         : "r" (a), "r" (v), "m" (v->counter)
 >         : "cc");
 > #endif /* CONFIG_NO_ATOMICS */

Last time this came up I pointed out that lwarx/stwcx are for locking
against interrupts as well as against other CPUs. You don't seem to have
anything in your atomic routine to prevent interrupts occurring between
the load and the store.

--
Adrian Cox   http://www.humboldt.co.uk/


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 3+ messages in thread

* I want to break softirq.c to fix broken hardware
@ 2001-08-09 21:56 Justin (Gus) Hurwitz
  2001-08-09 18:08 ` Adrian Cox
  0 siblings, 1 reply; 3+ messages in thread
From: Justin (Gus) Hurwitz @ 2001-08-09 21:56 UTC (permalink / raw)
  To: linuxppc-embedded


Our custom 603e based board is at long last getting to the point that we
can call it working. We can now boot it, and mount a root filesystem
either via ramdisk or nfs. The board is even mostly stable.

But, I say mostly for a reason.

For those who do not remember my rants about this board, it does not
support bus snooping for the CPU. The 603e uses bus snooping to impliment
the lwarx and stwcx. instructions, which are used by the CPU to do atomic
memory functions. I have worked around this problem thusfar by simply
replacing the atomic functions with non-atomic counterparts. In other
words, the beginning of include/asm/atomic.h now looks like this:

static __inline__ int atomic_add_return(int a, atomic_t *v)
{
        int t;

#ifndef CONFIG_NO_ATOMICS
        __asm__ __volatile__("\n\
1:      lwarx   %0,0,%3\n\
        add     %0,%2,%0\n\
        stwcx.  %0,0,%3\n\
        bne-    1b"
        : "=&r" (t), "=m" (v->counter)
        : "r" (a), "r" (v), "m" (v->counter)
        : "cc");
#else
        __asm__ __volatile__("\n\
1:      lwzx    %0,0,%3\n\
        add     %0,%2,%0\n\
        stwx    %0,0,%3"
        : "=&r" (t), "=m" (v->counter)
        : "r" (a), "r" (v), "m" (v->counter)
        : "cc");
#endif /* CONFIG_NO_ATOMICS */

        return t;
}


While this has worked reasonably well for development, it does crash
regularly. More specifically, it crashes with: kernel BUG at
softirq.c:241! That is in the following section:

static void tasklet_hi_action(struct softirq_action *a)
{
        int cpu = smp_processor_id();
        struct tasklet_struct *list;

        local_irq_disable();
        list = tasklet_hi_vec[cpu].list;
        tasklet_hi_vec[cpu].list = NULL;
        local_irq_enable();

        while (list) {
                struct tasklet_struct *t = list;

                list = list->next;

237             if (!tasklet_trylock(t))
238                     BUG();
239             if (!atomic_read(&t->count)) {
240                     if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
241                             BUG();
242                     t->func(t->data);
243                     tasklet_unlock(t);
244                     continue;
                }
                tasklet_unlock(t);

                local_irq_disable();
                t->next = tasklet_hi_vec[cpu].list;
                tasklet_hi_vec[cpu].list = t;
                cpu_raise_softirq(cpu, HI_SOFTIRQ);
                local_irq_enable();
        }
}

This is obviously atomic-intensive code. My question is, therefore, what
is the best way to bypass the need for an atomic function here? Would it
work for me to replace

if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
with
while(!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state)){}

?

Any thoughts are appreciated.

Thanks,

-Gus


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: I want to break softirq.c to fix broken hardware
  2001-08-09 18:08 ` Adrian Cox
@ 2001-08-09 22:32   ` Justin (Gus) Hurwitz
  0 siblings, 0 replies; 3+ messages in thread
From: Justin (Gus) Hurwitz @ 2001-08-09 22:32 UTC (permalink / raw)
  To: Adrian Cox; +Cc: linuxppc-embedded


On Thu, 9 Aug 2001, Adrian Cox wrote:

>
> Justin (Gus) Hurwitz wrote:
>
>  > #else
>  >         __asm__ __volatile__("\n\
>  > 1:      lwzx    %0,0,%3\n\
>  >         add     %0,%2,%0\n\
>  >         stwx    %0,0,%3"
>  >         : "=&r" (t), "=m" (v->counter)
>  >         : "r" (a), "r" (v), "m" (v->counter)
>  >         : "cc");
>  > #endif /* CONFIG_NO_ATOMICS */
>
> Last time this came up I pointed out that lwarx/stwcx are for locking
> against interrupts as well as against other CPUs. You don't seem to have
> anything in your atomic routine to prevent interrupts occurring between
> the load and the store.
>
> --
> Adrian Cox   http://www.humboldt.co.uk/

Indeed- you are right (and that is, I'm 99% sure, the cause of the crash
when it happens).

I presume that it would be terrible ineffecient to turn off interrupts
throughout all of the tasklet_hi_action() function (though they are off
for most of it). Since this is the only function that I have ever had a
crash in, I would rather not add local_irq_disable()/local_irq_enable() to
each of the atomic functions, as that seems to be ineffecient overkill
(though it might be the proper thing to do?).

Right now I am compiling a kernel with local_irq_disable()/local_irq_enable()
around the
if (!atomic_read(&t->count)) {
                        if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
                                BUG();
                        t->func(t->data);
                        tasklet_unlock(t);
                        continue;
                }
statement. Hopefully that will work.

When I get this working, would it be worthwhile to send a patch to the
list? I don't know if anyone else needs this kind of fix. If others would
like it, I'll try to do this a more generically "right" way.

Thanks again,
--Gus


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2001-08-09 22:32 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-08-09 21:56 I want to break softirq.c to fix broken hardware Justin (Gus) Hurwitz
2001-08-09 18:08 ` Adrian Cox
2001-08-09 22:32   ` Justin (Gus) Hurwitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).