From: Michael Buesch <mbuesch@freenet.de>
To: Ingo Molnar <mingo@elte.hu>
Cc: linux-kernel@vger.kernel.org, ck@vds.kolivas.org, kernel@kolivas.org
Subject: Re: [PATCH, RFC] protect call to set_tsk_need_resched() by the rq-lock
Date: Wed, 8 Dec 2004 11:29:48 +0100 [thread overview]
Message-ID: <200412081129.54249.mbuesch@freenet.de> (raw)
In-Reply-To: <200412081049.37498.mbuesch@freenet.de>
[-- Attachment #1: Type: text/plain, Size: 4343 bytes --]
Ok, forget it all. It crashed again. :)
Must be nvidia...
Quoting Michael Buesch <mbuesch@freenet.de>:
> Quoting Ingo Molnar <mingo@elte.hu>:
> >
> > * Michael Buesch <mbuesch@freenet.de> wrote:
> >
> > > > > The two attached patches (one against vanilla kernel and one
> > > > > against ck patchset) moves the rq-lock a few lines up in
> > > > > scheduler_tick() to also protect set_tsk_need_resched().
> > > > >
> > > > > Is that neccessary?
> > > >
> > > > scheduler_tick() is a special case, 'current' is pinned and cannot
> > > > go away, nor can it get off the runqueue.
> > >
> > > Can you explain in short, why this is the case, please? I don't really
> > > get behind it. How are the two things enforced?
> >
> > 'current' is the currently executing task and as such it wont get moved
> > off the runqueue. The only way to leave the runqueue is to execute
> > schedule() [or to be preempted].
>
> Ok, I understand that.
>
> Someone else said to me:
> [quote]
> "another runqueue might want to touch your runqueue
> while you're in scheduler_tick" ...
> "that is far more likely to hit with many many cpus and I'd
> be surprised if you're the first person to get a race there"
> [/quote]
>
> What about this? Is this possible? Or did she/he/it miss a point?
>
>
> It's this scenario here:
> I frequently get oopses in cpu_idle(). In the two hours before
> I made the patch, the machine hung twice. Since I'm running
> a patched scheduler, It did not hang again. I gave em about
> 15 hours testing.
>
> But maybe that's all pure luck.
>
> > Ingo
>
>
> Unable to handle kernel paging request at virtual address 00099108
> printing eip:
> b01010c0
> *pde = 00000000
> Oops: 0000 [#1]
> SMP
> Modules linked in: nfs lockd sunrpc nvidia ipv6 ohci_hcd tuner tvaudio msp3400 bttv video_buf firmware_class btcx_risc ehci_hcd uhci_hcd usbcore intel_agp agpgart evdev
> CPU: 0
> EIP: 0060:[<b01010c0>] Tainted: P VLI
> EFLAGS: 00010286 (2.6.10-rc2-ck2-nozeroram-findvmastat)
> EIP is at cpu_idle+0x31/0x3f
> eax: 00000001 ebx: 00099100 ecx: 00000000 edx: 0000001d
> esi: 00000000 edi: b03dff9c ebp: b03dffe4 esp: b03dffe0
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 0, threadinfo=b03de000 task=b034db40)
> Stack: 00020800 b03dfff8 b03e0898 000000bd b03e0340 b040cb80 0044f007 b0100211
> Call Trace:
> [<b0103c00>] show_stack+0x7a/0x90
> [<b0103d81>] show_registers+0x152/0x1ca
> [<b0103f86>] die+0xf4/0x178
> [<b0114556>] do_page_fault+0x42a/0x645
> [<b01038a7>] error_code+0x2b/0x30
> [<b03e0898>] start_kernel+0x13a/0x151
> [<b0100211>] 0xb0100211
> Code: e0 ff ff 21 e3 eb 24 8b 0d 84 c6 40 b0 b8 26 10 10 b0 8b 15 c0 eb 34 b0 85 c9 0f 44 c8 8b 43 10 c1 e0 07 89 90 84 52 41 b0 ff d1 <8b> 43 08 a8 08 74 d5 e8 d8 7f 1f 00 eb f2 55 89 e5 56 53 fb ba
> <0>Kernel panic - not syncing: Attempted to kill the idle task!
>
>
> b010108f <cpu_idle>:
> b010108f: 55 push %ebp
> b0101090: 89 e5 mov %esp,%ebp
> b0101092: 53 push %ebx
> b0101093: bb 00 e0 ff ff mov $0xffffe000,%ebx
> b0101098: 21 e3 and %esp,%ebx
> b010109a: eb 24 jmp b01010c0 <cpu_idle+0x31>
> b010109c: 8b 0d 84 c6 40 b0 mov 0xb040c684,%ecx
> b01010a2: b8 26 10 10 b0 mov $0xb0101026,%eax
> b01010a7: 8b 15 c0 eb 34 b0 mov 0xb034ebc0,%edx
> b01010ad: 85 c9 test %ecx,%ecx
> b01010af: 0f 44 c8 cmove %eax,%ecx
> b01010b2: 8b 43 10 mov 0x10(%ebx),%eax
> b01010b5: c1 e0 07 shl $0x7,%eax
> b01010b8: 89 90 84 52 41 b0 mov %edx,0xb0415284(%eax)
> b01010be: ff d1 call *%ecx
> b01010c0: 8b 43 08 mov 0x8(%ebx),%eax
> ^^^^^^^^^^^^^^^^^^ OOPS. This is the check to need_resched().
> b01010c3: a8 08 test $0x8,%al
> b01010c5: 74 d5 je b010109c <cpu_idle+0xd>
> b01010c7: e8 d8 7f 1f 00 call b02f90a4 <schedule>
> b01010cc: eb f2 jmp b01010c0 <cpu_idle+0x31>
>
>
> Ah, and yes, the kernel is tainted. So Nvidia already received a bugreport.
>
--
Regards Michael Buesch [ http://www.tuxsoft.de.vu ]
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
next prev parent reply other threads:[~2004-12-08 10:30 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-12-06 22:39 [PATCH, RFC] protect call to set_tsk_need_resched() by the rq-lock Michael Buesch
2004-12-07 13:10 ` Ingo Molnar
2004-12-07 23:30 ` Michael Buesch
2004-12-08 8:26 ` Ingo Molnar
2004-12-08 9:49 ` Michael Buesch
2004-12-08 10:29 ` Michael Buesch [this message]
2004-12-08 11:15 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200412081129.54249.mbuesch@freenet.de \
--to=mbuesch@freenet.de \
--cc=ck@vds.kolivas.org \
--cc=kernel@kolivas.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox