All of lore.kernel.org
 help / color / mirror / Atom feed
* "impossible" spinlock "wrong CPU" problem with custom device driver
@ 2009-07-08 22:48 Timm Korte
  0 siblings, 0 replies; only message in thread
From: Timm Korte @ 2009-07-08 22:48 UTC (permalink / raw)
  To: lkml

I'm trying to understand a spinlog bug in a kernel module (device driver).
I have a spinlock that is uses in the actual hardware interrupt handler
as well as in a seperate kernel thread doing the real work via a work
queue. The first one uses the spinlock with spin_lock() and
spin_unlock(), while the thread uses spin_lock_irqsave() and
spin_unlock_irqrestore().
On rare occasions (can't reproduce on purpose), i get a spinlog debug
message about wrong cpu on _raw_spin_unlock when called from the kernel
thread.

This is the source (for the kernel_thread) that runs into the problem:

static int my_irqthread_function(void *ptr) {
  struct my_dev *mydev = ptr;

  daemonize(MY_NAME "%02x", mydev->mynum);
  allow_signal(SIGTERM);
  while (!wait_event_interruptible(mydev->irqthread_wait,
atomic_read(&mydev->irqthread_pending_count))) {
    do {
      uint8_t my_irq_pending = 0;
      unsigned long iflags;

      spin_lock_irqsave(&mydev->irq_pending_lock, iflags);
      my_irq_pending = mydev->irq_pending;
      mydev->irq_pending = 0;
      spin_unlock_irqrestore(&mydev->irq_pending_lock, iflags);

      // handle irqs
      if (my_irq_pending & INT_IPAC1) {
         my_handle_interrupt(&mydev->mydev[IPAC1]);
      }
...
      // continue if the pending count still is != 0 after decrementing
    } while (!atomic_dec_and_test(&mydev->irqthread_pending_count));
  }

  mydev->irqthread = 0;
  complete_and_exit(&mydev->irqthread_exit, 0);
}

The error (SPIN_BUG with kernel panic on my SMP box) happens on the
"spin_unlock_irqrestore(&mydev->irq_pending_lock, iflags);" - but i
really can't figure out, how the thread could be moved to another cpu,
while holding the lock and only doing two assignment operations.

The only thing i could think of, is that it might have something to do
with the enabled sigterm signal - even though the module wasn't being
unloaded at the time the bug occured.

System is FC4 based with a 2.6.17 kernel (can't change).

So I'm sort of out of ideas and hope someone here has an idea, what
might have gone wrong here.

Timm

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2009-07-08 22:55 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-08 22:48 "impossible" spinlock "wrong CPU" problem with custom device driver Timm Korte

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.