public inbox for linux-kernel@vger.kernel.org
* Re: [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API
       [not found]   ` <20080429053804.6d083a7f.akpm@linux-foundation.org>
@ 2008-05-07  0:20     ` Paul Jackson
  2008-05-07  0:26       ` Paul Menage
  2008-05-07  1:00       ` Andrew Morton
  0 siblings, 2 replies; 7+ messages in thread
From: Paul Jackson @ 2008-05-07  0:20 UTC
  To: Andrew Morton, menage
  Cc: torvalds, lizf, Hidetoshi Seto, Hiroyuki KAMEZAWA,
	Dimitri Sivanich, linux-kernel

Dimitri Sivanich, a colleague of mine, just reported to me an easily
reproduced BUG in Linus's current git tree, anytime one reads or writes
the new per-cpuset file "sched_relax_domain_level".  The guilty task
gets a SEGV and the kernel prints (if the command was called 'cat'
and its pid was 16766 ;):

    kernel BUG at kernel/cpuset.c:1448!
    cat[16766]: bugcheck! 0 [3]

The BUG comes from cpuset code that wasn't expecting that read or write
request at that point in the code.

The basic problem is that Seto-san's "sched_relax_domain_level" and
Paul M's conversion to the new style *_u64 cpuset file handlers were
occurring at the same time, with the result that the handlers for
the per-cpuset file "sched_relax_domain_level" were only partially
converted to the new style *_u64 cpuset file handlers.
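
A rough userspace model of that mismatch, as a sketch only and not the
actual kernel/cpuset.c code.  It assumes the BUG() being hit is the
default branch of a switch like the one in cpuset_read_u64.  The
struct cpuset_model, the MODEL_BUG sentinel and the model_* function
names are hypothetical; only the FILE_* names come from the patch
quoted later in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; not the real kernel/cpuset.c code. */
enum cpuset_file {
	FILE_SPREAD_PAGE,
	FILE_SPREAD_SLAB,
	FILE_SCHED_RELAX_DOMAIN_LEVEL,
};

struct cpuset_model {
	int spread_page;
	int spread_slab;
	int relax_domain_level;
};

/* Sentinel standing in for the kernel's BUG(); an assumption of this model. */
#define MODEL_BUG INT64_MIN

/*
 * New-style read handler in the assumed half-converted state: the file is
 * routed here, but the switch has no case for it, so the default is taken.
 */
int64_t model_read_u64_as_merged(const struct cpuset_model *cs,
				 enum cpuset_file type)
{
	switch (type) {
	case FILE_SPREAD_PAGE:
		return cs->spread_page;
	case FILE_SPREAD_SLAB:
		return cs->spread_slab;
	default:
		return MODEL_BUG;	/* the real code would call BUG() here */
	}
}

/* Same handler with the missing case added, mirroring the proposed fix. */
int64_t model_read_u64_with_fix(const struct cpuset_model *cs,
				enum cpuset_file type)
{
	switch (type) {
	case FILE_SCHED_RELAX_DOMAIN_LEVEL:
		return cs->relax_domain_level;
	default:
		return model_read_u64_as_merged(cs, type);
	}
}

/* Small self-check showing the difference between the two states. */
void model_demo(void)
{
	const struct cpuset_model cs = { 1, 0, 2 };

	assert(model_read_u64_as_merged(&cs, FILE_SCHED_RELAX_DOMAIN_LEVEL)
	       == MODEL_BUG);
	assert(model_read_u64_with_fix(&cs, FILE_SCHED_RELAX_DOMAIN_LEVEL) == 2);
}
```

In this model, the files that were fully converted keep working in both
versions; only the new file falls through to the default branch until
its case is added.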

The following provides more details, and presents a couple of questions
for Andrew or Paul Menage, at the end.

===

On April 29, Paul Menage observed that the cpuset patch for
'sched_relax_domain_level' got mangled -- it ended up using the
old style common file read/write routines, instead of having the
cases to handle it added to Paul M's new style *_u64 handlers.

Paul M proposed the following untested patch:
> --- cpuset-fix-2.6.25-mm1.orig/kernel/cpuset.c
> +++ cpuset-fix-2.6.25-mm1/kernel/cpuset.c
> @@ -1295,6 +1295,9 @@ static int cpuset_write_u64(
>                 retval = update_flag(CS_SPREAD_SLAB, cs, val);
>                 cs->mems_generation = cpuset_mems_generation++;
>                 break;
> +       case FILE_SCHED_RELAX_DOMAIN_LEVEL:
> +               retval = update_relax_domain_level(cs, val);
> +               break;
>         default:
>                 retval = -EINVAL;
>                 break;
> @@ -1396,6 +1399,8 @@ static u64 cpuset_read_u64(
>                 return is_spread_page(cs);
>         case FILE_SPREAD_SLAB:
>                 return is_spread_slab(cs);
> +       case FILE_SCHED_RELAX_DOMAIN_LEVEL:
> +               return cs->relax_domain_level;
>         default:
>                 BUG();
>         }

Andrew replied:
> OK, can we please proceed with the thing as-is, send us any needed
> fixup later in the week?

I definitely agree with the above observations of Paul M.  I suspect
that the patch might be missing the lines needed to -remove- the
FILE_SCHED_RELAX_DOMAIN_LEVEL cases from the old style
cpuset_common_file_read and cpuset_common_file_write switches.

The kernel now at the top of Linus's git tree hits a BUG()
immediately, anytime you try to read or write these new
per-cpuset files "sched_relax_domain_level".

I tried looking in 2.6.25-rc1-mm1-mmotm (as of an hour ago),
and it -looks- like the fix is in the linux-next.patch there.

However:

 1) I can't get 2.6.25-rc1-mm1-mmotm to apply even close to
    either of 2.6.25 or 2.6.25-rc1.  Blows up on the first
    patch.

	==> akpm - what does today's 2.6.25-rc1-mm1-mmotm
	    apply to?

 2) I didn't see any replies from Paul M in response to
    Andrew's above request to "send us any needed fixup later
    in the week".

	==> Paul M or akpm - Is this fixup in the pipeline?

    I guess it is, from my reading of the linux-next.patch
    in 2.6.25-rc1-mm1-mmotm, but I'm not confident I'm
    reading that patch right.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214


* Re: [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API
  2008-05-07  0:20     ` [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API Paul Jackson
@ 2008-05-07  0:26       ` Paul Menage
  2008-05-07  0:39         ` Paul Jackson
  2008-05-07  1:03         ` Andrew Morton
  2008-05-07  1:00       ` Andrew Morton
  1 sibling, 2 replies; 7+ messages in thread
From: Paul Menage @ 2008-05-07  0:26 UTC
  To: Paul Jackson
  Cc: Andrew Morton, torvalds, lizf, Hidetoshi Seto, Hiroyuki KAMEZAWA,
	Dimitri Sivanich, linux-kernel

On Tue, May 6, 2008 at 5:20 PM, Paul Jackson <pj@sgi.com> wrote:
>
>  I definitely agree with the above observations of Paul M.  I suspect
>  that the patch might be missing the lines needed to -remove- the
>  FILE_SCHED_RELAX_DOMAIN_LEVEL cases from the old style
>  cpuset_common_file_read and cpuset_common_file_write switches.

Yes, it is - but I didn't have a tree with the relevant bits in it to
remove, as far as I could see.

>
>   2) I didn't see any replies from Paul M in response to
>     Andrew's above request to "send us any needed fixup later
>     in the week".
>
>         ==> Paul M or akpm - Is this fixup in the pipeline?

Not yet - I was waiting for 2.6.26-rc1-mm1 to come out. But I can send
one against 2.6.26-rc1 directly if that helps.

Paul


* Re: [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API
  2008-05-07  0:26       ` Paul Menage
@ 2008-05-07  0:39         ` Paul Jackson
  2008-05-07  1:03         ` Andrew Morton
  1 sibling, 0 replies; 7+ messages in thread
From: Paul Jackson @ 2008-05-07  0:39 UTC
  To: Paul Menage
  Cc: akpm, torvalds, lizf, seto.hidetoshi, kamezawa.hiroyu, sivanich,
	linux-kernel

Paul M wrote:
> But I can send one against 2.6.26-rc1 directly if that helps.

At this point, whatever resolves this with the least amount of
additional effort and confusion on the parts of Linus, Andrew,
Seto-san and yourself seems best.  A few days one way or the
other don't matter, so far as I know.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214


* Re: [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API
  2008-05-07  0:20     ` [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API Paul Jackson
  2008-05-07  0:26       ` Paul Menage
@ 2008-05-07  1:00       ` Andrew Morton
  2008-05-07  1:44         ` Paul Jackson
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2008-05-07  1:00 UTC
  To: Paul Jackson
  Cc: menage, torvalds, lizf, Hidetoshi Seto, Hiroyuki KAMEZAWA,
	Dimitri Sivanich, linux-kernel

On Tue, 6 May 2008 19:20:18 -0500 Paul Jackson <pj@sgi.com> wrote:

>  1) I can't get 2.6.25-rc1-mm1-mmotm to apply even close to
>     either of 2.6.25 or 2.6.25-rc1.  Blows up on the first
>     patch.

origin.patch?  That shouldn't happen.

> 	==> akpm - what does today's 2.6.25-rc1-mm1-mmotm
> 	    apply to?

2.6.25-rc1.

<resyncs>

It _should_ be OK.  Please check?



* Re: [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API
  2008-05-07  0:26       ` Paul Menage
  2008-05-07  0:39         ` Paul Jackson
@ 2008-05-07  1:03         ` Andrew Morton
  2008-05-07  1:21           ` Paul Menage
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2008-05-07  1:03 UTC
  To: Paul Menage
  Cc: Paul Jackson, torvalds, lizf, Hidetoshi Seto, Hiroyuki KAMEZAWA,
	Dimitri Sivanich, linux-kernel

On Tue, 6 May 2008 17:26:22 -0700 "Paul Menage" <menage@google.com> wrote:

> On Tue, May 6, 2008 at 5:20 PM, Paul Jackson <pj@sgi.com> wrote:
> >
> >  I definitely agree with the above observations of Paul M.  I suspect
> >  that the patch might be missing the lines needed to -remove- the
> >  FILE_SCHED_RELAX_DOMAIN_LEVEL cases from the old style
> >  cpuset_common_file_read and cpuset_common_file_write switches.
> 
> Yes, it is - but I didn't have a tree with the relevant bits in it to
> remove, as far as I could see.

This whole fiasco was caused by unexpected changes magically appearing in
mainline late in the merge window.  All very predictable.

> >
> >   2) I didn't see any replies from Paul M in response to
> >     Andrew's above request to "send us any needed fixup later
> >     in the week".
> >
> >         ==> Paul M or akpm - Is this fixup in the pipeline?
> 
> Not yet - I was waiting for 2.6.26-rc1-mm1 to come out. But I can send
> one against 2.6.26-rc1 directly if that helps.

I'm still crunching on backlog.  A fix against mainline would be great,
thanks.



* Re: [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API
  2008-05-07  1:03         ` Andrew Morton
@ 2008-05-07  1:21           ` Paul Menage
  0 siblings, 0 replies; 7+ messages in thread
From: Paul Menage @ 2008-05-07  1:21 UTC
  To: Andrew Morton
  Cc: Paul Jackson, torvalds, lizf, Hidetoshi Seto, Hiroyuki KAMEZAWA,
	Dimitri Sivanich, linux-kernel

On Tue, May 6, 2008 at 6:03 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>  > Not yet - I was waiting for 2.6.26-rc1-mm1 to come out. But I can send
>  > one against 2.6.26-rc1 directly if that helps.
>
>  I'm still crunching on backlog.  A fix against mainline would be great,
>  thanks.
>

Sent.

Paul


* Re: [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API
  2008-05-07  1:00       ` Andrew Morton
@ 2008-05-07  1:44         ` Paul Jackson
  0 siblings, 0 replies; 7+ messages in thread
From: Paul Jackson @ 2008-05-07  1:44 UTC
  To: Andrew Morton
  Cc: menage, torvalds, lizf, seto.hidetoshi, kamezawa.hiroyu, sivanich,
	linux-kernel

> > 	==> akpm - what does today's 2.6.25-rc1-mm1-mmotm
> > 	    apply to?
> 
> 2.6.25-rc1.
> 
> <resyncs>
> 
> It _should_ be OK.  Please check?

mmotm works fine now ... probably would have worked fine
before, except for the brain damaged keyboard operator
who is sitting behind my computer monitor - curse him ;).

Thanks.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214

