netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* select(0, ..) is valid ?
@ 2007-05-15 17:29 Badari Pulavarty
  2007-05-15 17:34 ` Mark Glines
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Badari Pulavarty @ 2007-05-15 17:29 UTC (permalink / raw)
  To: netdev, lkml; +Cc: Christoph Lameter, Andrew Morton

Hi,

Is select(0, ..) is a valid operation ?

I see that there is no check to prevent this or return
success early, without doing any work. Do we need one ?

slub code is complaining that we are doing kmalloc(0).

Thanks,
Badari

------------[ cut here ]------------
Badness at include/linux/slub_def.h:88
Call Trace:
[c0000001e4eb7640] [c00000000000e650] .show_stack+0x68/0x1b0
(unreliable)
[c0000001e4eb76e0] [c00000000029b854] .report_bug+0x94/0xe8
[c0000001e4eb7770] [c0000000000219f0] .program_check_exception
+0x12c/0x568
[c0000001e4eb77f0] [c000000000004a84] program_check_common+0x104/0x180
--- Exception: 700 at .get_slab+0x4c/0x234
    LR = .__kmalloc+0x24/0xc4
[c0000001e4eb7ae0] [c0000001e4eb7b80] 0xc0000001e4eb7b80 (unreliable)
[c0000001e4eb7b80] [c0000000000a7ff0] .__kmalloc+0x24/0xc4
[c0000001e4eb7c10] [c0000000000ea720] .compat_core_sys_select+0x90/0x240
[c0000001e4eb7d00] [c0000000000ec3a4] .compat_sys_select+0xb0/0x190
[c0000001e4eb7dc0] [c000000000014944] .ppc32_select+0x14/0x28
[c0000001e4eb7e30] [c00000000000872c] syscall_exit+0x0/0x40



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 17:29 select(0, ..) is valid ? Badari Pulavarty
@ 2007-05-15 17:34 ` Mark Glines
  2007-05-15 17:36 ` Jiri Slaby
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Mark Glines @ 2007-05-15 17:34 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: netdev, lkml, Christoph Lameter, Andrew Morton

On Tue, 15 May 2007 10:29:18 -0700
Badari Pulavarty <pbadari@us.ibm.com> wrote:

> Hi,
> 
> Is select(0, ..) is a valid operation ?

select(0, ..) is rather commonly used as a portable sleep() with
microsecond granularity.  Disabling it will break lots of things.

Mark

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 17:29 select(0, ..) is valid ? Badari Pulavarty
  2007-05-15 17:34 ` Mark Glines
@ 2007-05-15 17:36 ` Jiri Slaby
  2007-05-15 17:42 ` H. Peter Anvin
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Jiri Slaby @ 2007-05-15 17:36 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: netdev, lkml, Christoph Lameter, Andrew Morton

Badari Pulavarty napsal(a):
> Hi,
> 
> Is select(0, ..) is a valid operation ?

Yes, it was (is) sometimes used for measuring (sleeping for) short time slices.

regards,
-- 
http://www.fi.muni.cz/~xslaby/            Jiri Slaby
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
 B674 9967 0407 CE62 ACC8  22A0 32CC 55C3 39D4 7A7E

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 17:29 select(0, ..) is valid ? Badari Pulavarty
  2007-05-15 17:34 ` Mark Glines
  2007-05-15 17:36 ` Jiri Slaby
@ 2007-05-15 17:42 ` H. Peter Anvin
  2007-05-15 17:43 ` Alan Cox
  2007-05-15 17:44 ` Andrew Morton
  4 siblings, 0 replies; 18+ messages in thread
From: H. Peter Anvin @ 2007-05-15 17:42 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: netdev, lkml, Christoph Lameter, Andrew Morton

Badari Pulavarty wrote:
> Hi,
> 
> Is select(0, ..) is a valid operation ?
> 
> I see that there is no check to prevent this or return
> success early, without doing any work. Do we need one ?
> 
> slub code is complaining that we are doing kmalloc(0).
> 

select(0, ...) is valid, and is functionally equivalent to
select(..., NULL, NULL, NULL, ...); except that any nonzero fdsets get
zeroed on return.  As such, the only thing that can interrupt it is the
timeout, or a signal.

	-hpa

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 17:29 select(0, ..) is valid ? Badari Pulavarty
                   ` (2 preceding siblings ...)
  2007-05-15 17:42 ` H. Peter Anvin
@ 2007-05-15 17:43 ` Alan Cox
  2007-05-15 17:44 ` Andrew Morton
  4 siblings, 0 replies; 18+ messages in thread
From: Alan Cox @ 2007-05-15 17:43 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: netdev, lkml, Christoph Lameter, Andrew Morton

On Tue, 15 May 2007 10:29:18 -0700
Badari Pulavarty <pbadari@us.ibm.com> wrote:

> Hi,
> 
> Is select(0, ..) is a valid operation ?

Yes. It's a fairly classic old BSD way to do timeouts

Alan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 17:29 select(0, ..) is valid ? Badari Pulavarty
                   ` (3 preceding siblings ...)
  2007-05-15 17:43 ` Alan Cox
@ 2007-05-15 17:44 ` Andrew Morton
  2007-05-15 17:57   ` Badari Pulavarty
  2007-05-15 18:10   ` Christoph Lameter
  4 siblings, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2007-05-15 17:44 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: netdev, lkml, Christoph Lameter

On Tue, 15 May 2007 10:29:18 -0700
Badari Pulavarty <pbadari@us.ibm.com> wrote:

> Hi,
> 
> Is select(0, ..) is a valid operation ?

Probably - it becomes an elaborate way of doing a sleep.  Whatever - we
used to permit it without error, so we should continue to do so.

> I see that there is no check to prevent this or return
> success early, without doing any work. Do we need one ?
> 
> slub code is complaining that we are doing kmalloc(0).
> 
> ------------[ cut here ]------------
> Badness at include/linux/slub_def.h:88
> Call Trace:
> [c0000001e4eb7640] [c00000000000e650] .show_stack+0x68/0x1b0
> (unreliable)
> [c0000001e4eb76e0] [c00000000029b854] .report_bug+0x94/0xe8
> [c0000001e4eb7770] [c0000000000219f0] .program_check_exception
> +0x12c/0x568
> [c0000001e4eb77f0] [c000000000004a84] program_check_common+0x104/0x180
> --- Exception: 700 at .get_slab+0x4c/0x234
>     LR = .__kmalloc+0x24/0xc4
> [c0000001e4eb7ae0] [c0000001e4eb7b80] 0xc0000001e4eb7b80 (unreliable)
> [c0000001e4eb7b80] [c0000000000a7ff0] .__kmalloc+0x24/0xc4
> [c0000001e4eb7c10] [c0000000000ea720] .compat_core_sys_select+0x90/0x240
> [c0000001e4eb7d00] [c0000000000ec3a4] .compat_sys_select+0xb0/0x190
> [c0000001e4eb7dc0] [c000000000014944] .ppc32_select+0x14/0x28
> [c0000001e4eb7e30] [c00000000000872c] syscall_exit+0x0/0x40
>

I _think_ we can just do

--- a/fs/compat.c~a
+++ a/fs/compat.c
@@ -1566,9 +1566,13 @@ int compat_core_sys_select(int n, compat
 	 */
 	ret = -ENOMEM;
 	size = FDS_BYTES(n);
-	bits = kmalloc(6 * size, GFP_KERNEL);
-	if (!bits)
-		goto out_nofds;
+	if (likely(size)) {
+		bits = kmalloc(6 * size, GFP_KERNEL);
+		if (!bits)
+			goto out_nofds;
+	} else {
+		bits = NULL;
+	}
 	fds.in      = (unsigned long *)  bits;
 	fds.out     = (unsigned long *) (bits +   size);
 	fds.ex      = (unsigned long *) (bits + 2*size);
_

I mean, if that oopses then I'd be very interested in finding out why.

But I'm starting to suspect that it would be better to permit kmalloc(0) in
slub.  It depends on how many more of these things need fixing.

otoh, a kmalloc(0) could be a sign of some buggy/inefficient/weird code, so
there's some value in forcing us to go look at all the callsites.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 17:44 ` Andrew Morton
@ 2007-05-15 17:57   ` Badari Pulavarty
  2007-05-15 18:10   ` Christoph Lameter
  1 sibling, 0 replies; 18+ messages in thread
From: Badari Pulavarty @ 2007-05-15 17:57 UTC (permalink / raw)
  To: Andrew Morton; +Cc: netdev, lkml, Christoph Lameter

On Tue, 2007-05-15 at 10:44 -0700, Andrew Morton wrote:
> On Tue, 15 May 2007 10:29:18 -0700
> Badari Pulavarty <pbadari@us.ibm.com> wrote:
> 
> > Hi,
> > 
> > Is select(0, ..) is a valid operation ?
> 
> Probably - it becomes an elaborate way of doing a sleep.  Whatever - we
> used to permit it without error, so we should continue to do so.

Okay.

> 
> > I see that there is no check to prevent this or return
> > success early, without doing any work. Do we need one ?
> > 
> > slub code is complaining that we are doing kmalloc(0).
> > 
> > ------------[ cut here ]------------
> > Badness at include/linux/slub_def.h:88
> > Call Trace:
> > [c0000001e4eb7640] [c00000000000e650] .show_stack+0x68/0x1b0
> > (unreliable)
> > [c0000001e4eb76e0] [c00000000029b854] .report_bug+0x94/0xe8
> > [c0000001e4eb7770] [c0000000000219f0] .program_check_exception
> > +0x12c/0x568
> > [c0000001e4eb77f0] [c000000000004a84] program_check_common+0x104/0x180
> > --- Exception: 700 at .get_slab+0x4c/0x234
> >     LR = .__kmalloc+0x24/0xc4
> > [c0000001e4eb7ae0] [c0000001e4eb7b80] 0xc0000001e4eb7b80 (unreliable)
> > [c0000001e4eb7b80] [c0000000000a7ff0] .__kmalloc+0x24/0xc4
> > [c0000001e4eb7c10] [c0000000000ea720] .compat_core_sys_select+0x90/0x240
> > [c0000001e4eb7d00] [c0000000000ec3a4] .compat_sys_select+0xb0/0x190
> > [c0000001e4eb7dc0] [c000000000014944] .ppc32_select+0x14/0x28
> > [c0000001e4eb7e30] [c00000000000872c] syscall_exit+0x0/0x40
> >
> 
> I _think_ we can just do
> 
> --- a/fs/compat.c~a
> +++ a/fs/compat.c
> @@ -1566,9 +1566,13 @@ int compat_core_sys_select(int n, compat
>  	 */
>  	ret = -ENOMEM;
>  	size = FDS_BYTES(n);
> -	bits = kmalloc(6 * size, GFP_KERNEL);
> -	if (!bits)
> -		goto out_nofds;
> +	if (likely(size)) {
> +		bits = kmalloc(6 * size, GFP_KERNEL);
> +		if (!bits)
> +			goto out_nofds;
> +	} else {
> +		bits = NULL;
> +	}
>  	fds.in      = (unsigned long *)  bits;
>  	fds.out     = (unsigned long *) (bits +   size);
>  	fds.ex      = (unsigned long *) (bits + 2*size);
> _


Yes. This is what I did earlier, but then I was wondering if I
could skip the whole operation and bail out early (if n == 0). 
I guess not.

> I mean, if that oopses then I'd be very interested in finding out why.
> 
> But I'm starting to suspect that it would be better to permit kmalloc(0) in
> slub.  It depends on how many more of these things need fixing.
> 
> otoh, a kmalloc(0) could be a sign of some buggy/inefficient/weird code, so
> there's some value in forcing us to go look at all the callsites.

So far, I haven't found any other. Lets leave the check.

Thanks,
Badari

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 17:44 ` Andrew Morton
  2007-05-15 17:57   ` Badari Pulavarty
@ 2007-05-15 18:10   ` Christoph Lameter
  2007-05-15 18:30     ` Andrew Morton
  2007-05-15 18:44     ` Hugh Dickins
  1 sibling, 2 replies; 18+ messages in thread
From: Christoph Lameter @ 2007-05-15 18:10 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Badari Pulavarty, netdev, lkml

On Tue, 15 May 2007, Andrew Morton wrote:

> I _think_ we can just do
> 
> --- a/fs/compat.c~a
> +++ a/fs/compat.c
> @@ -1566,9 +1566,13 @@ int compat_core_sys_select(int n, compat
>  	 */
>  	ret = -ENOMEM;
>  	size = FDS_BYTES(n);
> -	bits = kmalloc(6 * size, GFP_KERNEL);
> -	if (!bits)
> -		goto out_nofds;
> +	if (likely(size)) {
> +		bits = kmalloc(6 * size, GFP_KERNEL);
> +		if (!bits)
> +			goto out_nofds;
> +	} else {
> +		bits = NULL;
> +	}
>  	fds.in      = (unsigned long *)  bits;
>  	fds.out     = (unsigned long *) (bits +   size);
>  	fds.ex      = (unsigned long *) (bits + 2*size);
> _
> 
> I mean, if that oopses then I'd be very interested in finding out why.
> 
> But I'm starting to suspect that it would be better to permit kmalloc(0) in
> slub.  It depends on how many more of these things need fixing.
> 
> otoh, a kmalloc(0) could be a sign of some buggy/inefficient/weird code, so
> there's some value in forcing us to go look at all the callsites.
 
Hmmm... We could have kmalloc(0) return a pointer to the zero page? That 
would catch any writers?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 18:10   ` Christoph Lameter
@ 2007-05-15 18:30     ` Andrew Morton
  2007-05-15 18:36       ` Christoph Lameter
  2007-05-15 18:44     ` Hugh Dickins
  1 sibling, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2007-05-15 18:30 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Badari Pulavarty, netdev, lkml

On Tue, 15 May 2007 11:10:22 -0700 (PDT)
Christoph Lameter <clameter@sgi.com> wrote:

> On Tue, 15 May 2007, Andrew Morton wrote:
> 
> > I _think_ we can just do
> > 
> > --- a/fs/compat.c~a
> > +++ a/fs/compat.c
> > @@ -1566,9 +1566,13 @@ int compat_core_sys_select(int n, compat
> >  	 */
> >  	ret = -ENOMEM;
> >  	size = FDS_BYTES(n);
> > -	bits = kmalloc(6 * size, GFP_KERNEL);
> > -	if (!bits)
> > -		goto out_nofds;
> > +	if (likely(size)) {
> > +		bits = kmalloc(6 * size, GFP_KERNEL);
> > +		if (!bits)
> > +			goto out_nofds;
> > +	} else {
> > +		bits = NULL;
> > +	}
> >  	fds.in      = (unsigned long *)  bits;
> >  	fds.out     = (unsigned long *) (bits +   size);
> >  	fds.ex      = (unsigned long *) (bits + 2*size);
> > _
> > 
> > I mean, if that oopses then I'd be very interested in finding out why.
> > 
> > But I'm starting to suspect that it would be better to permit kmalloc(0) in
> > slub.  It depends on how many more of these things need fixing.
> > 
> > otoh, a kmalloc(0) could be a sign of some buggy/inefficient/weird code, so
> > there's some value in forcing us to go look at all the callsites.
>  
> Hmmm... We could have kmalloc(0) return a pointer to the zero page? That 
> would catch any writers?

Returning NULL would have the same effect..

But the problem is that we won't get 100% coverage of all codepaths
for ages, so any oopses we added won't get found.

otoh, any code which does dereference that pointer is buggy anwyay.

The problem here is that code which does

	kmalloc(some-expression-which-returns-0)

will go and assume that the kmalloc(0) got an ENOMEM and it'll take the
error path.

Oh well, let's persist with things as they now are.

Perhaps putting a size=0 detector into slab also would speed this
process up.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 18:30     ` Andrew Morton
@ 2007-05-15 18:36       ` Christoph Lameter
  0 siblings, 0 replies; 18+ messages in thread
From: Christoph Lameter @ 2007-05-15 18:36 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Badari Pulavarty, netdev, lkml

On Tue, 15 May 2007, Andrew Morton wrote:

> Perhaps putting a size=0 detector into slab also would speed this
> process up.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

Index: linux-2.6/mm/slab.c
===================================================================
--- linux-2.6.orig/mm/slab.c	2007-05-15 11:32:25.000000000 -0700
+++ linux-2.6/mm/slab.c	2007-05-15 11:35:55.000000000 -0700
@@ -792,6 +792,7 @@ static inline struct kmem_cache *__find_
 	 */
 	BUG_ON(malloc_sizes[INDEX_AC].cs_cachep == NULL);
 #endif
+	WARN_ON_ONCE(size == 0);
 	while (size > csizep->cs_size)
 		csizep++;
 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 18:10   ` Christoph Lameter
  2007-05-15 18:30     ` Andrew Morton
@ 2007-05-15 18:44     ` Hugh Dickins
  2007-05-16 15:37       ` Anton Blanchard
  1 sibling, 1 reply; 18+ messages in thread
From: Hugh Dickins @ 2007-05-15 18:44 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Andrew Morton, Badari Pulavarty, netdev, lkml

On Tue, 15 May 2007, Christoph Lameter wrote:
> On Tue, 15 May 2007, Andrew Morton wrote:
> 
> > I _think_ we can just do
> > 
> > --- a/fs/compat.c~a
> > +++ a/fs/compat.c
> > @@ -1566,9 +1566,13 @@ int compat_core_sys_select(int n, compat
> >  	 */
> >  	ret = -ENOMEM;
> >  	size = FDS_BYTES(n);
> > -	bits = kmalloc(6 * size, GFP_KERNEL);
> > -	if (!bits)
> > -		goto out_nofds;
> > +	if (likely(size)) {
> > +		bits = kmalloc(6 * size, GFP_KERNEL);
> > +		if (!bits)
> > +			goto out_nofds;
> > +	} else {
> > +		bits = NULL;
> > +	}

It's interesting that compat_core_sys_select() shows this kmalloc(0)
failure but core_sys_select() does not.  That's because core_sys_select()
avoids kmalloc by using a buffer on the stack for small allocations (and
0 sure is small).  Shouldn't compat_core_sys_select() do just the same?
Or is SLUB going to be so efficient that doing so is a waste of time?

> >  	fds.in      = (unsigned long *)  bits;
> >  	fds.out     = (unsigned long *) (bits +   size);
> >  	fds.ex      = (unsigned long *) (bits + 2*size);
> > _
> > 
> > I mean, if that oopses then I'd be very interested in finding out why.
> > 
> > But I'm starting to suspect that it would be better to permit kmalloc(0) in
> > slub.  It depends on how many more of these things need fixing.
> > 
> > otoh, a kmalloc(0) could be a sign of some buggy/inefficient/weird code, so
> > there's some value in forcing us to go look at all the callsites.
>  
> Hmmm... We could have kmalloc(0) return a pointer to the zero page? That 
> would catch any writers?

I don't think using the zero page that way would be at all safe:
there's probably configurations/architectures in which it is write
protected, but I don't believe that's a given at all.

But the principle is good: ERR_PTR(-MAX_ERRNO) should work,
that area up the top should always give a fault.
Hmm, but perhaps there are architectures on which it does not?

Hugh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-15 18:44     ` Hugh Dickins
@ 2007-05-16 15:37       ` Anton Blanchard
  2007-05-17  0:59         ` Badari Pulavarty
  2007-05-18 13:21         ` Andi Kleen
  0 siblings, 2 replies; 18+ messages in thread
From: Anton Blanchard @ 2007-05-16 15:37 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Christoph Lameter, Andrew Morton, Badari Pulavarty, netdev, lkml,
	sfr, ak


Hi Hugh,

> It's interesting that compat_core_sys_select() shows this kmalloc(0)
> failure but core_sys_select() does not.  That's because core_sys_select()
> avoids kmalloc by using a buffer on the stack for small allocations (and
> 0 sure is small).  Shouldn't compat_core_sys_select() do just the same?
> Or is SLUB going to be so efficient that doing so is a waste of time?

Nice catch, the original optimisation from Andi is:

http://git.kernel.org/git-new/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=70674f95c0a2ea694d5c39f4e514f538a09be36f

And I think it makes sense for the compat code to do it too.

Anton

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-16 15:37       ` Anton Blanchard
@ 2007-05-17  0:59         ` Badari Pulavarty
  2007-05-22 14:16           ` Steve Fox
  2007-05-18 13:21         ` Andi Kleen
  1 sibling, 1 reply; 18+ messages in thread
From: Badari Pulavarty @ 2007-05-17  0:59 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: Hugh Dickins, Christoph Lameter, Andrew Morton, netdev, lkml, sfr,
	ak

On Wed, 2007-05-16 at 10:37 -0500, Anton Blanchard wrote:
> Hi Hugh,
> 
> > It's interesting that compat_core_sys_select() shows this kmalloc(0)
> > failure but core_sys_select() does not.  That's because core_sys_select()
> > avoids kmalloc by using a buffer on the stack for small allocations (and
> > 0 sure is small).  Shouldn't compat_core_sys_select() do just the same?
> > Or is SLUB going to be so efficient that doing so is a waste of time?
> 
> Nice catch, the original optimisation from Andi is:
> 
> http://git.kernel.org/git-new/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=70674f95c0a2ea694d5c39f4e514f538a09be36f
> 
> And I think it makes sense for the compat code to do it too.
> 
> Anton

Here it is ..

Should I do one for poll() also ?

Thanks,
Badari

Optimize select by a using stack space for small fd sets.
core_sys_select() already has this optimization. This is
for compat version. 

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
---
 fs/compat.c |   17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

Index: linux-2.6.22-rc1/fs/compat.c
===================================================================
--- linux-2.6.22-rc1.orig/fs/compat.c	2007-05-12 18:45:56.000000000 -0700
+++ linux-2.6.22-rc1/fs/compat.c	2007-05-16 17:50:39.000000000 -0700
@@ -1544,9 +1544,10 @@ int compat_core_sys_select(int n, compat
 	compat_ulong_t __user *outp, compat_ulong_t __user *exp, s64 *timeout)
 {
 	fd_set_bits fds;
-	char *bits;
+	void *bits;
 	int size, max_fds, ret = -EINVAL;
 	struct fdtable *fdt;
+	long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
 
 	if (n < 0)
 		goto out_nofds;
@@ -1564,11 +1565,14 @@ int compat_core_sys_select(int n, compat
 	 * since we used fdset we need to allocate memory in units of
 	 * long-words.
 	 */
-	ret = -ENOMEM;
 	size = FDS_BYTES(n);
-	bits = kmalloc(6 * size, GFP_KERNEL);
-	if (!bits)
-		goto out_nofds;
+	bits = stack_fds;
+	if (size > sizeof(stack_fds) / 6) {
+		bits = kmalloc(6 * size, GFP_KERNEL);
+		ret = -ENOMEM;
+		if (!bits)
+			goto out_nofds;
+	}
 	fds.in      = (unsigned long *)  bits;
 	fds.out     = (unsigned long *) (bits +   size);
 	fds.ex      = (unsigned long *) (bits + 2*size);
@@ -1600,7 +1604,8 @@ int compat_core_sys_select(int n, compat
 	    compat_set_fd_set(n, exp, fds.res_ex))
 		ret = -EFAULT;
 out:
-	kfree(bits);
+	if (bits != stack_fds)
+		kfree(bits);
 out_nofds:
 	return ret;
 }




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-16 15:37       ` Anton Blanchard
  2007-05-17  0:59         ` Badari Pulavarty
@ 2007-05-18 13:21         ` Andi Kleen
  2007-05-22  1:21           ` Nish Aravamudan
  1 sibling, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2007-05-18 13:21 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: Hugh Dickins, Christoph Lameter, Andrew Morton, Badari Pulavarty,
	netdev, lkml, sfr

On Wednesday 16 May 2007 17:37, Anton Blanchard wrote:
> Hi Hugh,
>
> > It's interesting that compat_core_sys_select() shows this kmalloc(0)
> > failure but core_sys_select() does not.  That's because core_sys_select()
> > avoids kmalloc by using a buffer on the stack for small allocations (and
> > 0 sure is small).  Shouldn't compat_core_sys_select() do just the same?
> > Or is SLUB going to be so efficient that doing so is a waste of time?
>
> Nice catch, the original optimisation from Andi is:
>
> http://git.kernel.org/git-new/?p=linux/kernel/git/torvalds/linux-2.6.git;a=
>commit;h=70674f95c0a2ea694d5c39f4e514f538a09be36f
>
> And I think it makes sense for the compat code to do it too.

Yes agreed. I just forgot the copy'n'pasted code when doing the original
change.

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-18 13:21         ` Andi Kleen
@ 2007-05-22  1:21           ` Nish Aravamudan
  0 siblings, 0 replies; 18+ messages in thread
From: Nish Aravamudan @ 2007-05-22  1:21 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Anton Blanchard, Hugh Dickins, Christoph Lameter, Andrew Morton,
	Badari Pulavarty, netdev, lkml, sfr

On 5/18/07, Andi Kleen <ak@suse.de> wrote:
> On Wednesday 16 May 2007 17:37, Anton Blanchard wrote:
> > Hi Hugh,
> >
> > > It's interesting that compat_core_sys_select() shows this kmalloc(0)
> > > failure but core_sys_select() does not.  That's because core_sys_select()
> > > avoids kmalloc by using a buffer on the stack for small allocations (and
> > > 0 sure is small).  Shouldn't compat_core_sys_select() do just the same?
> > > Or is SLUB going to be so efficient that doing so is a waste of time?
> >
> > Nice catch, the original optimisation from Andi is:
> >
> > http://git.kernel.org/git-new/?p=linux/kernel/git/torvalds/linux-2.6.git;a=
> >commit;h=70674f95c0a2ea694d5c39f4e514f538a09be36f
> >
> > And I think it makes sense for the compat code to do it too.
>
> Yes agreed. I just forgot the copy'n'pasted code when doing the original
> change.

Is this headed upstream? It's causing some noise on test.kernel.org
now that SLAB is also warning about kmalloc(0).

Thanks,
Nish

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-17  0:59         ` Badari Pulavarty
@ 2007-05-22 14:16           ` Steve Fox
  2007-05-22 14:34             ` Nishanth Aravamudan
  0 siblings, 1 reply; 18+ messages in thread
From: Steve Fox @ 2007-05-22 14:16 UTC (permalink / raw)
  To: Badari Pulavarty
  Cc: Anton Blanchard, Hugh Dickins, Christoph Lameter, Andrew Morton,
	netdev, lkml, sfr, ak, Andy Whitcroft, Nishanth Aravamudan

On Wed, 2007-05-16 at 17:59 -0700, Badari Pulavarty wrote:

> Here it is ..
> 
> Should I do one for poll() also ?
> 
> Thanks,
> Badari
> 
> Optimize select by a using stack space for small fd sets.
> core_sys_select() already has this optimization. This is
> for compat version. 
> 
> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
> ---
>  fs/compat.c |   17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> Index: linux-2.6.22-rc1/fs/compat.c
> ===================================================================
> --- linux-2.6.22-rc1.orig/fs/compat.c	2007-05-12 18:45:56.000000000 -0700
> +++ linux-2.6.22-rc1/fs/compat.c	2007-05-16 17:50:39.000000000 -0700
> @@ -1544,9 +1544,10 @@ int compat_core_sys_select(int n, compat
>  	compat_ulong_t __user *outp, compat_ulong_t __user *exp, s64 *timeout)
>  {
>  	fd_set_bits fds;
> -	char *bits;
> +	void *bits;
>  	int size, max_fds, ret = -EINVAL;
>  	struct fdtable *fdt;
> +	long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
> 
>  	if (n < 0)
>  		goto out_nofds;
> @@ -1564,11 +1565,14 @@ int compat_core_sys_select(int n, compat
>  	 * since we used fdset we need to allocate memory in units of
>  	 * long-words.
>  	 */
> -	ret = -ENOMEM;
>  	size = FDS_BYTES(n);
> -	bits = kmalloc(6 * size, GFP_KERNEL);
> -	if (!bits)
> -		goto out_nofds;
> +	bits = stack_fds;
> +	if (size > sizeof(stack_fds) / 6) {
> +		bits = kmalloc(6 * size, GFP_KERNEL);
> +		ret = -ENOMEM;
> +		if (!bits)
> +			goto out_nofds;
> +	}
>  	fds.in      = (unsigned long *)  bits;
>  	fds.out     = (unsigned long *) (bits +   size);
>  	fds.ex      = (unsigned long *) (bits + 2*size);
> @@ -1600,7 +1604,8 @@ int compat_core_sys_select(int n, compat
>  	    compat_set_fd_set(n, exp, fds.res_ex))
>  		ret = -EFAULT;
>  out:
> -	kfree(bits);
> +	if (bits != stack_fds)
> +		kfree(bits);
>  out_nofds:
>  	return ret;
>  }

Andy put this through a couple machines on test.kernel.org and elm3b6
was fixed, however elm3b239 still had a boot error.

BUG: at mm/slab.c:777 __find_general_cachep()

Call Trace:
 [<ffffffff802729c6>] __kmalloc+0xa6/0xe0
 [<ffffffff8021d21b>] cache_k8_northbridges+0x9b/0x120
 [<ffffffff80688af3>] gart_iommu_init+0x33/0x5b0
 [<ffffffff802211a3>] __wake_up+0x43/0x70
 [<ffffffff80453b90>] genl_rcv+0x0/0x70
 [<ffffffff80452175>] netlink_kernel_create+0x155/0x170
 [<ffffffff80684029>] pci_iommu_init+0x9/0x20
 [<ffffffff8067e6f4>] kernel_init+0x154/0x330
 [<ffffffff8020a8d8>] child_rip+0xa/0x12
 [<ffffffff80348e10>] acpi_ds_init_one_object+0x0/0x7c
 [<ffffffff8067e5a0>] kernel_init+0x0/0x330
 [<ffffffff8020a8ce>] child_rip+0x0/0x12

See the "2.6.22-rc2-git1 +1" row at http://test.kernel.org/ for full
logs.

-- 

Steve Fox
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-22 14:16           ` Steve Fox
@ 2007-05-22 14:34             ` Nishanth Aravamudan
  2007-05-22 17:49               ` Steve Fox
  0 siblings, 1 reply; 18+ messages in thread
From: Nishanth Aravamudan @ 2007-05-22 14:34 UTC (permalink / raw)
  To: Steve Fox
  Cc: Badari Pulavarty, Anton Blanchard, Hugh Dickins,
	Christoph Lameter, Andrew Morton, netdev, lkml, sfr, ak,
	Andy Whitcroft

On 22.05.2007 [09:16:37 -0500], Steve Fox wrote:
> On Wed, 2007-05-16 at 17:59 -0700, Badari Pulavarty wrote:
> 
> > Here it is ..
> > 
> > Should I do one for poll() also ?
> > 
> > Thanks,
> > Badari
> > 
> > Optimize select by a using stack space for small fd sets.
> > core_sys_select() already has this optimization. This is
> > for compat version. 
> > 
> > Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
> > ---
> >  fs/compat.c |   17 +++++++++++------
> >  1 file changed, 11 insertions(+), 6 deletions(-)
> > 
> > Index: linux-2.6.22-rc1/fs/compat.c
> > ===================================================================
> > --- linux-2.6.22-rc1.orig/fs/compat.c	2007-05-12 18:45:56.000000000 -0700
> > +++ linux-2.6.22-rc1/fs/compat.c	2007-05-16 17:50:39.000000000 -0700
> > @@ -1544,9 +1544,10 @@ int compat_core_sys_select(int n, compat
> >  	compat_ulong_t __user *outp, compat_ulong_t __user *exp, s64 *timeout)
> >  {
> >  	fd_set_bits fds;
> > -	char *bits;
> > +	void *bits;
> >  	int size, max_fds, ret = -EINVAL;
> >  	struct fdtable *fdt;
> > +	long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
> > 
> >  	if (n < 0)
> >  		goto out_nofds;
> > @@ -1564,11 +1565,14 @@ int compat_core_sys_select(int n, compat
> >  	 * since we used fdset we need to allocate memory in units of
> >  	 * long-words.
> >  	 */
> > -	ret = -ENOMEM;
> >  	size = FDS_BYTES(n);
> > -	bits = kmalloc(6 * size, GFP_KERNEL);
> > -	if (!bits)
> > -		goto out_nofds;
> > +	bits = stack_fds;
> > +	if (size > sizeof(stack_fds) / 6) {
> > +		bits = kmalloc(6 * size, GFP_KERNEL);
> > +		ret = -ENOMEM;
> > +		if (!bits)
> > +			goto out_nofds;
> > +	}
> >  	fds.in      = (unsigned long *)  bits;
> >  	fds.out     = (unsigned long *) (bits +   size);
> >  	fds.ex      = (unsigned long *) (bits + 2*size);
> > @@ -1600,7 +1604,8 @@ int compat_core_sys_select(int n, compat
> >  	    compat_set_fd_set(n, exp, fds.res_ex))
> >  		ret = -EFAULT;
> >  out:
> > -	kfree(bits);
> > +	if (bits != stack_fds)
> > +		kfree(bits);
> >  out_nofds:
> >  	return ret;
> >  }
> 
> Andy put this through a couple machines on test.kernel.org and elm3b6
> was fixed, however elm3b239 still had a boot error.
> 
> BUG: at mm/slab.c:777 __find_general_cachep()
> 
> Call Trace:
>  [<ffffffff802729c6>] __kmalloc+0xa6/0xe0
>  [<ffffffff8021d21b>] cache_k8_northbridges+0x9b/0x120

I believe this is fixed by:

http://lkml.org/lkml/2007/5/18/19

Care to stack it on top and retest?

Thanks,
Nish

-- 
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: select(0, ..) is valid ?
  2007-05-22 14:34             ` Nishanth Aravamudan
@ 2007-05-22 17:49               ` Steve Fox
  0 siblings, 0 replies; 18+ messages in thread
From: Steve Fox @ 2007-05-22 17:49 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Badari Pulavarty, Anton Blanchard, Hugh Dickins,
	Christoph Lameter, Andrew Morton, netdev, lkml, sfr, ak,
	Andy Whitcroft

On Tue, 2007-05-22 at 07:34 -0700, Nishanth Aravamudan wrote:
> On 22.05.2007 [09:16:37 -0500], Steve Fox wrote:
> > 
> > Andy put this through a couple machines on test.kernel.org and elm3b6
> > was fixed, however elm3b239 still had a boot error.
> > 
> > BUG: at mm/slab.c:777 __find_general_cachep()
> > 
> > Call Trace:
> >  [<ffffffff802729c6>] __kmalloc+0xa6/0xe0
> >  [<ffffffff8021d21b>] cache_k8_northbridges+0x9b/0x120
> 
> I believe this is fixed by:
> 
> http://lkml.org/lkml/2007/5/18/19
> 
> Care to stack it on top and retest?

Looks good. See the "2.6.22-rc2-git1 +1 +1" row on tko. Thanks.

-- 

Steve Fox
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2007-05-22 17:49 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-15 17:29 select(0, ..) is valid ? Badari Pulavarty
2007-05-15 17:34 ` Mark Glines
2007-05-15 17:36 ` Jiri Slaby
2007-05-15 17:42 ` H. Peter Anvin
2007-05-15 17:43 ` Alan Cox
2007-05-15 17:44 ` Andrew Morton
2007-05-15 17:57   ` Badari Pulavarty
2007-05-15 18:10   ` Christoph Lameter
2007-05-15 18:30     ` Andrew Morton
2007-05-15 18:36       ` Christoph Lameter
2007-05-15 18:44     ` Hugh Dickins
2007-05-16 15:37       ` Anton Blanchard
2007-05-17  0:59         ` Badari Pulavarty
2007-05-22 14:16           ` Steve Fox
2007-05-22 14:34             ` Nishanth Aravamudan
2007-05-22 17:49               ` Steve Fox
2007-05-18 13:21         ` Andi Kleen
2007-05-22  1:21           ` Nish Aravamudan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).