public inbox for linux-kernel@vger.kernel.org
From: "Darrick J. Wong" <djwong@us.ibm.com>
To: Paul Jackson <pj@sgi.com>
Cc: Alexis Bruemmer <alexisb@us.ibm.com>, linux-kernel@vger.kernel.org
Subject: Re: cpusets: BUG: cpuset_excl_nodes_overlap() may sleep under tasklist_lock
Date: Mon, 09 Jan 2006 14:42:55 -0800
Message-ID: <43C2E6EF.40903@us.ibm.com>

Hi Mr. Jackson,

I've been stress testing 2.6.15 on an IntelliStation Z20 in our lab, and
have seen the same complaints that were described earlier in this thread
(sleeping in the cpuset code under OOM conditions):

Debug: sleeping function called from invalid context at include/asm/semaphore.h:105
in_atomic():1, irqs_disabled():0

Call Trace:
  <ffffffff80130640>{__might_sleep+179}
  <ffffffff8015bb68>{cpuset_excl_nodes_overlap+31}
  <ffffffff80164d9f>{out_of_memory+123}
  <ffffffff801676c4>{__alloc_pages+564}
  <ffffffff8017ec41>{alloc_pages_current+160}
  <ffffffff8016873f>{__do_page_cache_readahead+197}
  <ffffffff8010e5c3>{__switch_to+50}
  <ffffffff8031c547>{_spin_unlock_irqrestore+52}
  <ffffffff80166604>{free_pages_bulk+674}
  <ffffffff80168a00>{do_page_cache_readahead+82}
  <ffffffff8016355a>{filemap_nopage+332}
  <ffffffff8017378f>{__handle_mm_fault+1162}
  <ffffffff8031c547>{_spin_unlock_irqrestore+52}
  <ffffffff8031df02>{do_page_fault+1096}
  <ffffffff801991ee>{sys_select+839}
  <ffffffff8011078d>{error_exit+0}
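
For anyone following along, the pattern behind this warning can be sketched as a rough userspace model. Everything below is hypothetical and for illustration only: the model_* names, the atomic_depth counter, and the message text are my own stand-ins, not the kernel's definitions. The model takes a lock that makes the context atomic (the tasklist_lock analogue), then calls a routine that would need to sleep on a semaphore, and a debug check records the violation.

```c
#include <stdio.h>

/* Hypothetical userspace model; none of these are real kernel symbols. */
static int atomic_depth;      /* stands in for "in_atomic()" state */
static int sleep_violations;  /* count of sleeping calls made while atomic */

/* Taking the model lock makes the context atomic until it is released. */
static void model_lock(void)   { atomic_depth++; }
static void model_unlock(void) { atomic_depth--; }

/* Debug check: returns 1 and logs if sleeping here would be invalid. */
static int model_might_sleep(const char *where)
{
    if (atomic_depth > 0) {
        sleep_violations++;
        fprintf(stderr, "model: sleeping call in atomic context at %s "
                "(atomic depth %d)\n", where, atomic_depth);
        return 1;
    }
    return 0;
}

/* Stands in for taking a semaphore, which may sleep. */
static int model_down(const char *where)
{
    return model_might_sleep(where);
}

/* Stands in for a routine that takes a semaphore internally. */
static int model_excl_nodes_overlap(void)
{
    return model_down("model_excl_nodes_overlap");
}

/* Stands in for the OOM scan: calls the routine with the lock held. */
static int model_oom_scan(void)
{
    int violated;

    model_lock();                       /* tasklist_lock analogue */
    violated = model_excl_nodes_overlap();
    model_unlock();
    return violated;
}
```

Calling model_oom_scan() reports a violation because the semaphore is taken with the lock held, while calling model_excl_nodes_overlap() on its own does not.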

Since we seem to be able to reproduce this with some regularity, let me
know if you'd like us to test out any cpuset patch(es).

--Darrick

> Kirill Korotaev wrote:
>> FYI, there is an obvious bug in cpusets in 2.6.15-rcX:
>> cpuset_excl_nodes_overlap() may sleep (as it takes semaphore), but is 
>> called from atomic context - select_bad_process() under tasklist_lock.
>> BUG. Found by Denis Lunev.
> 
> Sorry for not responding sooner - I was off the air for a week.
> 
> Thanks for finding and reporting this.
> 
> Apparently, from KUROSAWA Takahiro's report, this bug was also in
> 2.6.14.  My initial reading of the code in 2.6.14 and 2.6.15-* agrees,
> and finds that this bug was present since the cpuset_excl_nodes_overlap
> call was added, Sept 8, 2005 (in Linus's tree.)
> 
> 
>> the same actually applies to cpuset_zone_allowed() which is called e.g. 
>> from __alloc_pages()->get_page_from_freelist() and doesn't check for 
>> GPF_NOATOMIC anyhow...
> 
> I don't think so.  Please read the comments in kernel/cpuset.c above
> the routine cpuset_zone_allowed().  Either that routine is called with
> the __GFP_HARDWALL flag set, so returns before it gets to the semaphore
> call, or it is not called at all, due to the check for ATOMIC (!wait)
> in mm/page_alloc.c.
> 
> I don't see any bugs like this, in the cpuset_zone_allowed code path.
> 
> 
> ==> My initial analysis - I have one bug, in the oom_kill path,
>     where the code takes callback_sem while holding tasklist_lock,
>     that has been in the main line kernel since 2.6.14.
> 
> My first guess is that it will take me about a week, with testing and
> other priorities (including a few more days vacation), to respond with a
> patch.  Speak up if that doesn't meet your needs.
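
The reasoning quoted above about cpuset_zone_allowed() can be sketched as a small decision model. This is only a reading of the description in the quoted text, not the code in kernel/cpuset.c or mm/page_alloc.c: the MODEL_* flag values, the enum, and both functions are hypothetical names made up for this sketch. It assumes that a hardwall request returns before the semaphore, and that a request which cannot wait never reaches the check at all.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not the kernel's GFP values. */
#define MODEL_GFP_WAIT     0x1u  /* caller is allowed to sleep */
#define MODEL_GFP_HARDWALL 0x2u  /* stay strictly inside the cpuset */

enum model_path {
    MODEL_NOT_CALLED,    /* atomic caller: the check is skipped entirely */
    MODEL_EARLY_RETURN,  /* hardwall: returns before the semaphore */
    MODEL_MAY_TAKE_SEM   /* only reached when the caller can sleep */
};

/* Classify which path a request takes, per the description above. */
static enum model_path model_zone_check(unsigned int gfp_mask)
{
    bool can_wait = (gfp_mask & MODEL_GFP_WAIT) != 0;

    if (gfp_mask & MODEL_GFP_HARDWALL)
        return MODEL_EARLY_RETURN;
    if (!can_wait)
        return MODEL_NOT_CALLED;
    return MODEL_MAY_TAKE_SEM;
}

/* True if any request that cannot wait would reach the semaphore. */
static bool model_atomic_can_reach_sem(void)
{
    unsigned int mask;

    for (mask = 0; mask < 4; mask++) {
        if (!(mask & MODEL_GFP_WAIT) &&
            model_zone_check(mask) == MODEL_MAY_TAKE_SEM)
            return true;
    }
    return false;
}
```

Under those assumptions model_atomic_can_reach_sem() is false for every flag combination, which matches the claim that only the oom_kill path is affected.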

