From: Paul Jackson <pj@sgi.com>
To: Andrew Morton <akpm@linux-foundation.org>, menage@google.com
Cc: torvalds@linux-foundation.org, lizf@cn.fujitsu.com,
Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
Hiroyuki KAMEZAWA <kamezawa.hiroyu@jp.fujitsu.com>,
Dimitri Sivanich <sivanich@sgi.com>,
linux-kernel@vger.kernel.org
Subject: Re: [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API
Date: Tue, 6 May 2008 19:20:18 -0500 [thread overview]
Message-ID: <20080506192018.26c1e3b7.pj@sgi.com> (raw)
In-Reply-To: <20080429053804.6d083a7f.akpm@linux-foundation.org>
Dimitri Sivanich, a colleague of mine, just reported to me an easily
reproduced BUG in Linus's current git tree, anytime one reads or writes
the new per-cpuset file "sched_relax_domain_level". The guilty task
gets a SEGV and the kernel prints (if the command was called 'cat'
and its pid was 16766 ;):
kernel BUG at kernel/cpuset.c:1448!
cat[16766]: bugcheck! 0 [3]
The BUG comes from cpuset code that wasn't expecting that read or write
request at that point in the code.
The basic problem is that Seto-san's "sched_relax_domain_level" and
Paul M's conversion to the new style *_u64 cpuset file handlers were
occurring at the same time, with the result that the handlers for
the per-cpuset file "sched_relax_domain_level" were only partially
converted to the new style *_u64 cpuset file handlers.
The following provides more details, and presents a couple of questions
for Andrew or Paul Menage, at the end.
===
On April 29, Paul Menage observed that the cpuset patch for
'sched_relax_domain' got mangled -- it ended up using the
old style common file read/write routines, but having the
cases to handle it added to Paul M's new style *_u64 handlers.
Paul M proposed the following untested patch:
> --- cpuset-fix-2.6.25-mm1.orig/kernel/cpuset.c
> +++ cpuset-fix-2.6.25-mm1/kernel/cpuset.c
> @@ -1295,6 +1295,9 @@ static int cpuset_write_u64(
> retval = update_flag(CS_SPREAD_SLAB, cs, val);
> cs->mems_generation = cpuset_mems_generation++;
> break;
> + case FILE_SCHED_RELAX_DOMAIN_LEVEL:
> + retval = update_relax_domain_level(cs, val);
> + break;
> default:
> retval = -EINVAL;
> break;
> @@ -1396,6 +1399,8 @@ static u64 cpuset_read_u64(
> return is_spread_page(cs);
> case FILE_SPREAD_SLAB:
> return is_spread_slab(cs);
> + case FILE_SCHED_RELAX_DOMAIN_LEVEL:
> + return cs->relax_domain_level;
> default:
> BUG();
> }
Andrew replied:
> OK, can we please proceeed with the thing as-is, send us any needed
> fixup later in the week?
I definitely agree with the above observations of Paul M. I suspect
that the patch might be missing the lines needed to -remove- the
FILE_SCHED_RELAX_DOMAIN_LEVEL cases from the old style
cpuset_common_file_read and cpuset_common_file_write switches.
The kernel now at the top of Linus's git tree hits a BUG()
immediately, anytime you try to read or write these new
per-cpuset files "sched_relax_domain_level".
I tried looking in 2.6.25-rc1-mm1-mmotm (as of an hour ago),
and it -looks- like the fix is in the linux-next.patch there.
However:
1) I can't get 2.6.25-rc1-mm1-mmotm to apply even close to
either of 2.6.25 or 2.6.25-rc1. Blows up on the first
patch.
==> akpm - what does todays 2.6.25-rc1-mm1-mmotm
apply to?
2) I didn't see any replies from Paul M in response to
Andrews above request to "send us any needed fixup later
in the week".
==> Paul M or akpm - Is this fixup in the pipeline?
I guess it did from my reading of the linux-next.patch
in 2.6.25-rc1-mm1-mmotm, but I'm not confident I'm
reading that patch right.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
next parent reply other threads:[~2008-05-07 0:20 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <200804290800.m3T80QFs009131@imap1.linux-foundation.org>
[not found] ` <6599ad830804290329heea3c5fu7395b1ef71a87881@mail.gmail.com>
[not found] ` <20080429053804.6d083a7f.akpm@linux-foundation.org>
2008-05-07 0:20 ` Paul Jackson [this message]
2008-05-07 0:26 ` [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API Paul Menage
2008-05-07 0:39 ` Paul Jackson
2008-05-07 1:03 ` Andrew Morton
2008-05-07 1:21 ` Paul Menage
2008-05-07 1:00 ` Andrew Morton
2008-05-07 1:44 ` Paul Jackson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080506192018.26c1e3b7.pj@sgi.com \
--to=pj@sgi.com \
--cc=akpm@linux-foundation.org \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lizf@cn.fujitsu.com \
--cc=menage@google.com \
--cc=seto.hidetoshi@jp.fujitsu.com \
--cc=sivanich@sgi.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.