From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id ; Sun, 19 Aug 2001 16:42:51 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id ; Sun, 19 Aug 2001 16:42:41 -0400 Received: from colorfullife.com ([216.156.138.34]:5 "EHLO colorfullife.com") by vger.kernel.org with ESMTP id ; Sun, 19 Aug 2001 16:42:37 -0400 Message-ID: <000701c128ef$876dcf60$010411ac@local> From: "Manfred Spraul" To: "\"Peter T. Breuer\"" Cc: Subject: Re: scheduling with io_lock held in 2.4.6 Date: Sun, 19 Aug 2001 22:42:48 +0200 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4522.1200 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200 Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org > I'll check if IRQs are really masked (how?). __save_flags() and check IIRC bit 9 - but I doubt that this is your problem. > But how on earth > could somebody else release the IRQs on the same CPU while the io > spinlock is held? We'd have to schedule once to allow somebody else to > release IRQs first .. (infinite mental regress follows). Backtraces contain stale functions from previous calls. The first entry is always correct, then you must manually check that the calls are possible, i.e. > Trace; c011b548 > Trace; c0107a38 > Trace; c0113683 > Trace; c0113290 either do_page_fault or do_divide_error are stale. +0 means that someone pushed the function address on the stack - if a function calls another function, then the offset is at least +6 on i386. check if do_page_fault+3f3 actually calls do_exit, and if do_exit+3b4 calls exit_notify etc. Try clear all stale entries from the callchain. Note that the address pushed is behind the call instruction, and call instructions are usually 5 or 6 bytes long. Disassemble from +3d0 and check the call. And are you sure that schedule() was actually called from within an interrupt? I don't see the do_IRQ()//handle_IRQ_event. I bet it was a schedule() within spin_lock_bh(). > Even when I rebooted > with noapic (I seem to recall). Booting back to 2.4.3 cured it. Which network card do you use? Networking is the main user of spin_lock_bh(), and since 2.4.4 zero-copy networking is merged. Good luck, Manfred