From: Timm Korte <korte-kernel@easycrypt.de>
To: lkml <linux-kernel@vger.kernel.org>
Subject: "impossible" spinlock "wrong CPU" problem with custom device driver
Date: Thu, 09 Jul 2009 00:48:14 +0200
Message-ID: <4A55222E.5030405@easycrypt.de>

I'm trying to understand a spinlock bug in a kernel module (device driver).
I have a spinlock that is used in the actual hardware interrupt handler
as well as in a separate kernel thread that does the real work via a work
queue. The handler uses the spinlock with spin_lock() and
spin_unlock(), while the thread uses spin_lock_irqsave() and
spin_unlock_irqrestore().
On rare occasions (I can't reproduce it on purpose), I get a spinlock debug
message about "wrong CPU" in _raw_spin_unlock when it is called from the
kernel thread.
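For reference, the unlock-side debug check can be modeled in userspace roughly as below. This is a simplified sketch written from memory of lib/spinlock_debug.c, not the kernel source: the names dbg_lock, dbg_lock_acquire() and dbg_lock_release() are made up for illustration, the magic value is arbitrary, and the task and CPU are passed in as plain arguments instead of coming from current and raw_smp_processor_id().

```c
#include <assert.h>
#include <stddef.h>

#define DBG_MAGIC 0xdead4eadu   /* arbitrary marker for this model */

enum dbg_result {
    DBG_OK,
    DBG_BAD_MAGIC,
    DBG_ALREADY_UNLOCKED,
    DBG_WRONG_OWNER,
    DBG_WRONG_CPU
};

/* Hypothetical stand-in for the debug fields of a spinlock. */
struct dbg_lock {
    unsigned int magic;
    int locked;
    const void *owner;   /* task that took the lock, NULL when free */
    int owner_cpu;       /* CPU the lock was taken on, -1 when free */
};

static void dbg_lock_init(struct dbg_lock *l)
{
    l->magic = DBG_MAGIC;
    l->locked = 0;
    l->owner = NULL;
    l->owner_cpu = -1;
}

/* Record who took the lock and on which CPU. */
static void dbg_lock_acquire(struct dbg_lock *l, const void *task, int cpu)
{
    assert(!l->locked);  /* the model does not spin; the caller must not nest */
    l->locked = 1;
    l->owner = task;
    l->owner_cpu = cpu;
}

/* Run the consistency checks in order and report the first one that fails.
 * On success the lock is released and the owner fields are reset. */
static enum dbg_result dbg_lock_release(struct dbg_lock *l,
                                        const void *task, int cpu)
{
    if (l->magic != DBG_MAGIC)
        return DBG_BAD_MAGIC;
    if (!l->locked)
        return DBG_ALREADY_UNLOCKED;
    if (l->owner != task)
        return DBG_WRONG_OWNER;
    if (l->owner_cpu != cpu)
        return DBG_WRONG_CPU;
    l->locked = 0;
    l->owner = NULL;
    l->owner_cpu = -1;
    return DBG_OK;
}
```

If this model matches the real check order, a "wrong CPU" report means the magic, the locked state and the owner task all still looked correct at unlock time, and only the recorded CPU differed from the CPU executing the unlock.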
This is the source (for the kernel_thread) that runs into the problem:
static int my_irqthread_function(void *ptr) {
	struct my_dev *mydev = ptr;

	daemonize(MY_NAME "%02x", mydev->mynum);
	allow_signal(SIGTERM);
	while (!wait_event_interruptible(mydev->irqthread_wait,
			atomic_read(&mydev->irqthread_pending_count))) {
		do {
			uint8_t my_irq_pending = 0;
			unsigned long iflags;

			spin_lock_irqsave(&mydev->irq_pending_lock, iflags);
			my_irq_pending = mydev->irq_pending;
			mydev->irq_pending = 0;
			spin_unlock_irqrestore(&mydev->irq_pending_lock, iflags);
			// handle irqs
			if (my_irq_pending & INT_IPAC1) {
				my_handle_interrupt(&mydev->mydev[IPAC1]);
			}
			...
			// continue if the pending count still is != 0 after decrementing
		} while (!atomic_dec_and_test(&mydev->irqthread_pending_count));
	}
	mydev->irqthread = 0;
	complete_and_exit(&mydev->irqthread_exit, 0);
}
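To make the do/while handoff easier to follow, here is a single-threaded userspace model of it. It is only a sketch: the handler side is not shown in this mail, so model_irq() assumes the handler ORs its source bit into irq_pending and increments the pending count once per interrupt. The bit values, the model_* names and the INT_IPAC2 source are hypothetical, and a plain int stands in for the atomic_t counter.

```c
#include <assert.h>
#include <stdint.h>

#define INT_IPAC1 0x01u   /* hypothetical bit values for this model */
#define INT_IPAC2 0x02u

struct model_dev {
    uint8_t irq_pending;    /* bitmask; guarded by irq_pending_lock in the driver */
    int pending_count;      /* stands in for irqthread_pending_count */
    unsigned handled[2];    /* how often each source was handled */
    unsigned empty_passes;  /* loop passes that found no pending bits */
};

/* ASSUMED handler behaviour: flag the source and bump the counter. */
static void model_irq(struct model_dev *d, uint8_t source)
{
    d->irq_pending |= source;
    d->pending_count++;
}

/* One wakeup of the thread, using the same loop shape as the driver. */
static void model_thread_wakeup(struct model_dev *d)
{
    if (d->pending_count == 0)
        return;  /* wait_event_interruptible() would keep sleeping */
    do {
        /* snapshot and clear, as done under the spinlock in the driver */
        uint8_t pending = d->irq_pending;
        d->irq_pending = 0;

        if (pending & INT_IPAC1)
            d->handled[0]++;
        if (pending & INT_IPAC2)
            d->handled[1]++;
        if (!pending)
            d->empty_passes++;
        /* mirrors !atomic_dec_and_test(): loop while the count is still != 0 */
    } while (--d->pending_count != 0);
}
```

Under these assumptions, several interrupts that arrive before the thread runs are coalesced in the bitmask and handled in the first pass, while the remaining passes only drain the counter.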
The error (SPIN_BUG with a kernel panic on my SMP box) happens at the
"spin_unlock_irqrestore(&mydev->irq_pending_lock, iflags);" call, but I
really can't figure out how the thread could be moved to another CPU
while holding the lock and only doing two assignments.
The only thing I can think of is that it might have something to do
with the enabled SIGTERM signal, even though the module wasn't being
unloaded at the time the bug occurred.
The system is FC4-based with a 2.6.17 kernel (which I can't change).
So I'm more or less out of ideas and hope someone here has an idea what
might have gone wrong.
Timm