From: Andrew Morton <akpm@linux-foundation.org>
To: Kyle Hubert <khubert@gmail.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: race condition between udevd and modprobe (mtrr_add)
Date: Thu, 6 May 2010 15:59:51 -0700
Message-ID: <20100506155951.af7b3ded.akpm@linux-foundation.org>
In-Reply-To: <p2n9a158e2e1005032230u2cb21e6do40ae0b58e746b75f@mail.gmail.com>

On Mon, 3 May 2010 22:30:01 -0700
Kyle Hubert <khubert@gmail.com> wrote:

> Hi, while booting an initrd image built off of BusyBox on a thousand
> nodes, we hit a race on a couple of nodes. They hang during the boot
> process with the stack traces listed below. The really simple init
> script in the initrd does a 'udevd --daemon' and then modprobe of a
> device. The device needs to assign an mtrr to the pci resource, and
> instead the whole node hangs. Putting a 'sleep 1' in between these two
> calls prevents any hangs.
>
> mtrr_add_page and the buddy allocator code don't appear to share any
> semaphores, and there isn't an obvious way in which this can hang.
> Possibly the smp_call_function IPI isn't being handled by the other
> cores... That's the best guess. Can anyone help sort this mess out?
>
> Also, is there a better way to test that udevd is fully up? A 'sleep
> 1' is not the preferred solution here.
>
> Thanks for your time,
>

What kernel version are you using here? It looks old - pre 2.6.31.

>
> >> ps
> ADDR UID PID PPID STATE FLAGS CPU NAME
> ===============================================================================
> ...
> 0xffff88061d26c720 0 1036 1 0 0x400140 - udevd
> 0xffff88021e05c480 0 1037 1 0 0x400100 - modprobe
> 0xffff88081d072440 0 1116 1036 0 0x400040 - udevd
> ===============================================================================
> 135 active task structs found
> >> bt 0xffff88021e05c480
> ================================================================
> STACK TRACE FOR TASK: 0xffff88021e05c480(modprobe)
>
> 0 <schedule?> [0x0]
> 1 mtrr_add_page+494 [0xffffffff80219d9e]
> 2 <unknown?>+<ERROR> [0xffffffffa0009a08]
> ================================================================
> >> bt 0xffff88061d25f420
> ================================================================
> STACK TRACE FOR TASK: 0xffff88061d25f420(udevd)
>
> 0 <schedule?> [0x0]
> 1 __alloc_pages_internal+241 [0xffffffff80292731]
> 2 rmqueue_bulk+89 [0xffffffff80291b19]
> 3 get_page_from_freelist+1430 [0xffffffff802922e6]
> 4 __alloc_pages_internal+241 [0xffffffff80292731]
> 5 alloc_pages_current+168 [0xffffffff802b0898]
> 6 pte_alloc_one+49 [0xffffffff80229271]
> 7 __pte_alloc+67 [0xffffffff8029e7d3]
> 8 copy_page_range+1269 [0xffffffff802a11c5]
> 9 alloc_pid+744 [0xffffffff80250a18]
> 10 copy_process+3057 [0xffffffff8023bcf1]
> 11 do_fork+118 [0xffffffff8023c4d6]
> 12 sys_clone+35 [0xffffffff80209c23]
> 13 ptregscall_common+103 [0xffffffff8020bda7]

These traces look odd - the kernel shouldn't be calling schedule() from
below rmqueue_bulk()!

If possible, please try a more recent kernel. If the problem occurs
there and if we manage to fix it, the fix can be backported into
whatever-kernel-version-you're-using.

Can you get a better trace? The sysrq-T output would be good. That's
known to work sufficiently well. Please avoid wordwrapping it when sending.
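
For reference, here is a minimal sketch of one way to capture that sysrq-T
output from a shell on the affected node. It assumes root access, a kernel
built with CONFIG_MAGIC_SYSRQ, and a kernel log buffer large enough to hold
every task's trace. The function name and the SYSRQ_CTL / SYSRQ_TRIGGER
override variables are hypothetical; they exist only so the function can be
dry-run against ordinary files.

```shell
#!/bin/sh
# Sketch: request a SysRq-T (all-task state) dump from the kernel.
# SYSRQ_CTL and SYSRQ_TRIGGER are hypothetical overrides for dry runs;
# by default they point at the real procfs files, which need root.

dump_all_tasks() {
    ctl="${SYSRQ_CTL:-/proc/sys/kernel/sysrq}"
    trigger="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

    # Enable all SysRq functions, then ask for the task-state dump.
    echo 1 > "$ctl" || return 1
    echo t > "$trigger" || return 1
}

# Usage on the affected node (as root):
#   dump_all_tasks && dmesg > /tmp/sysrq-t.txt
```

The dump lands in the kernel log, so saving dmesg to a file and attaching
that file is one way to keep the traces from being wordwrapped. If the node
is too hung to run a shell at all, the same dump can usually be requested
from the console with the SysRq key combination followed by 't'.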
Thread overview: 3+ messages
2010-05-04 5:30 race condition between udevd and modprobe (mtrr_add) Kyle Hubert
2010-05-05 5:18 ` Kay Sievers
2010-05-06 22:59 ` Andrew Morton [this message]