public inbox for linux-kernel@vger.kernel.org
* scheduling with io_lock held in 2.4.6
@ 2001-08-16 23:45 Peter T. Breuer
  2001-08-17  0:08 ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Peter T. Breuer @ 2001-08-16 23:45 UTC (permalink / raw)
  To: linux kernel

I've been plagued for a month by smp lockups in my block driver
that I eventually deduced were due to somebody else scheduling while
holding the io_request_lock spinlock.

There's nothing so impressive as a direct test, so I put a test for
the spinlock being held at the front of schedule(), and sure enough, it
fires every 20s or so when I'm doing nothing in particular on the
machine:

  Aug 17 01:36:42 xilofon kernel: Scheduling with io lock held in process 0

doing a dd to /dev/null from the ide disk seems to trigger it, but from
different contexts ...

  Aug 17 01:40:58 xilofon kernel: Scheduling with io lock held in process 0
  Aug 17 01:41:00 xilofon last message repeated 150 times
  Aug 17 01:41:00 xilofon kernel: Scheduling with io lock held in process 1139
  Aug 17 01:41:00 xilofon kernel: Scheduling with io lock held in process 0
  Aug 17 01:41:01 xilofon last message repeated 87 times
  Aug 17 01:41:01 xilofon kernel: Scheduling with io lock held in process 1141
  Aug 17 01:41:01 xilofon kernel: Scheduling with io lock held in process 0


Surprise, 1139 and 1141 are klogd and syslogd respectively.

Any suggestions as to how to track this further?

Peter

[parent not found: <3B7EC41C.9811D384@zip.com.au>]
* Re: scheduling with io_lock held in 2.4.6
@ 2001-08-19 18:58 Peter T. Breuer
  2001-08-19 22:47 ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Peter T. Breuer @ 2001-08-19 18:58 UTC (permalink / raw)
  To: ptb; +Cc: akpm, linux kernel

More info ..

"A month of sundays ago ptb wrote:"
> What puzzles me is the sequence shown in the second oops of
> blkdev_release_request leading to a schedule. Possibly via an
> interrupt? You yourself said that the function cannot sleep. The oops
> was triggered by my detection code placed in schedule(), which fires
> on just such an event.
> 
> >>EIP; c011426a <schedule+142/8f4>   <=====
> Trace; c011b064 <exit_notify+178/2a8>
> Trace; c011b548 <do_exit+3b4/3d4>
> Trace; c0107a38 <do_divide_error+0/9c>
> Trace; c0113683 <do_page_fault+3f3/510>
> Trace; c0113290 <do_page_fault+0/510>
> Trace; c01d91ad <memcpy_toiovec+35/64>
> Trace; c01d7c4c <skb_release_data+68/74>
> Trace; c0113cc3 <reschedule_idle+63/214>
> Trace; c01ef15d <tcp_recvmsg+839/a58>                 <-- urr?
> Trace; c0107304 <error_code+34/3c>                    <-- huh?
> Trace; c01a3c17 <blkdev_release_request+df/1b4>
> Trace; c01a4807 <end_that_request_last+a3/130>
> Trace; c0134176 <__free_pages+1e/24>
> Trace; c01341a3 <free_pages+27/2c>
> Trace; c014c51d <do_select+1e5/1fc>
> Trace; c0122071 <sys_rt_sigaction+89/d8>
> Trace; c0143842 <blkdev_ioctl+2e/34>
> Trace; c014b9a9 <sys_ioctl+1f9/280>
> Trace; c010721b <system_call+33/38>

I am now fairly certain that the schedule occurred while
blkdev_release_request was at a completely innocuous line:

                if (waitqueue_active(&blk_buffers_wait)
                    && atomic_read(&queued_sectors) < low_queued_sectors)
                        wake_up(&blk_buffers_wait);
                /*
                 * Add to pending free list and batch wakeups
                 */
                list_add(&req->table, &q->pending_freelist[rw]);   <- HERE
                if (++q->pending_free[rw] >= batch_requests) {
                        int wake_up = q->pending_free[rw];

 
so I suppose that this is an interrupt. But IRQs are supposed to be
masked when blkdev_release_request is called, and I call it only from
end_that_request_last, with spin_lock_irqsave(&io_request_lock) held.

I'll check if IRQs are really masked (how?). But how on earth
could somebody else release the IRQs on the same CPU while the io
spinlock is held? We'd have to schedule once to allow somebody else to
release IRQs first .. (infinite mental regress follows).

Or is this an NMI?  do_divide_error looks distinctly hardware level!

At any rate, this kernel (2.4.6) is fairly unstable on this smp box
(dell poweredge) even before I started debugging. I had several
lockups just now when running e2fsck at reboot. Even when I rebooted
with noapic (I seem to recall). Booting back to 2.4.3 cured it.

I daresay it's time to see if 2.4.8 or 9 solves anything.

Peter

* Re: scheduling with io_lock held in 2.4.6
@ 2001-08-19 20:42 Manfred Spraul
  0 siblings, 0 replies; 10+ messages in thread
From: Manfred Spraul @ 2001-08-19 20:42 UTC (permalink / raw)
  To: "Peter T. Breuer"; +Cc: linux-kernel

> I'll check if IRQs are really masked (how?).

__save_flags() and check IIRC bit 9 - but I doubt that this is your
problem.

> But how on earth
> could somebody else release the IRQs on the same CPU while the io
> spinlock is held? We'd have to schedule once to allow somebody else to
> release IRQs first .. (infinite mental regress follows).

Backtraces contain stale functions from previous calls.

The first entry is always correct, then you must manually check that the
calls are possible, i.e.
> Trace; c011b548 <do_exit+3b4/3d4>
> Trace; c0107a38 <do_divide_error+0/9c>
> Trace; c0113683 <do_page_fault+3f3/510>
> Trace; c0113290 <do_page_fault+0/510>

either do_page_fault or do_divide_error is stale.
+0 means that someone pushed the function address on the stack - if a
function calls another function, then the offset is at least +6 on i386.

Check if do_page_fault+3f3 actually calls do_exit, and if do_exit+3b4
calls exit_notify, etc. Try to clear all stale entries from the
callchain. Note that the pushed address points just after the call
instruction, and call instructions are usually 5 or 6 bytes long on
i386. Disassemble from +3d0 and check the call.

And are you sure that schedule() was actually called from within an
interrupt? I don't see do_IRQ()/handle_IRQ_event in the trace.

I bet it was a schedule() inside a spin_lock_bh() region.

> Even when I rebooted
> with noapic (I seem to recall). Booting back to 2.4.3 cured it.

Which network card do you use?
Networking is the main user of spin_lock_bh(), and since 2.4.4 zero-copy
networking is merged.


Good luck,
    Manfred




Thread overview: 10+ messages
-- links below jump to the message on this page --
2001-08-16 23:45 scheduling with io_lock held in 2.4.6 Peter T. Breuer
2001-08-17  0:08 ` Andrew Morton
2001-08-18 19:27   ` Peter T. Breuer
     [not found] <3B7EC41C.9811D384@zip.com.au>
2001-08-18 21:57 ` Peter T. Breuer
2001-08-18 22:13   ` Andrew Morton
2001-08-19  1:21     ` Peter T. Breuer
2001-08-19 16:38       ` Peter T. Breuer
  -- strict thread matches above, loose matches on Subject: below --
2001-08-19 18:58 Peter T. Breuer
2001-08-19 22:47 ` Andrew Morton
2001-08-19 20:42 Manfred Spraul
