public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: RFC: poll change
  2001-08-14 23:53       ` RFC: poll change Tim Hockin
@ 2001-08-14 23:53         ` David S. Miller
  2001-08-15  0:08           ` Andrea Arcangeli
  2001-08-15  0:09           ` Tim Hockin
  2001-08-15  0:11         ` Andrea Arcangeli
  1 sibling, 2 replies; 15+ messages in thread
From: David S. Miller @ 2001-08-14 23:53 UTC (permalink / raw)
  To: thockin; +Cc: linux-kernel

   From: Tim Hockin <thockin@sun.com>
   Date: Tue, 14 Aug 2001 16:53:43 -0700

   The standard says negative fd's are ignored.  We get that right.  What we
   are left with is an overly paranoid check against max_fds.  This check
   should go away.  You should be able to pass in up to your rlimit fds, and
   let negative ones (holes or tails) be ignored.
   
I am saying there is no problem.

In both cases, for a properly written application we ignore the
invalid fds.  The behavior is identical both before and after
your change, so there is no reason to make it.

Later,
David S. Miller
davem@redhat.com


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
       [not found]     ` <20010814.163804.66057702.davem@redhat.com>
@ 2001-08-14 23:53       ` Tim Hockin
  2001-08-14 23:53         ` David S. Miller
  2001-08-15  0:11         ` Andrea Arcangeli
  0 siblings, 2 replies; 15+ messages in thread
From: Tim Hockin @ 2001-08-14 23:53 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 670 bytes --]

"David S. Miller" wrote:
 
>    The standard also says that any pollfd with (fd < 0) is ignored.  Holes are
>    explicitly ALLOWED.
> 
> Dude, it ignores negative fds, check fs/select.c:do_pollfd()

Right - we're running in circles.

The standard says negative fd's are ignored.  We get that right.  What we
are left with is an overly paranoid check against max_fds.  This check
should go away.  You should be able to pass in up to your rlimit fds, and
let negative ones (holes or tails) be ignored.

I'm attaching a patch :).

Am I still not making the problem clear?

-- 
Tim Hockin
Systems Software Engineer
Sun Microsystems, Cobalt Server Appliances
thockin@sun.com

[-- Attachment #2: select.diff --]
[-- Type: text/plain, Size: 583 bytes --]

Index: fs/select.c
===================================================================
RCS file: /home/cvs/linux-2.4/fs/select.c,v
retrieving revision 1.5
diff -u -r1.5 select.c
--- fs/select.c	2001/07/09 23:10:25	1.5
+++ fs/select.c	2001/08/14 23:47:46
@@ -416,11 +416,8 @@
 	int nchunks, nleft;
 
 	/* Do a sanity check on nfds ... */
-	if (nfds > NR_OPEN)
+	if (nfds > current->rlim[RLIMIT_NOFILE].rlim_cur)
 		return -EINVAL;
-
-	if (nfds > current->files->max_fds)
-		nfds = current->files->max_fds;
 
 	if (timeout) {
 		/* Careful about overflow in the intermediate values */

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-15  0:09           ` Tim Hockin
@ 2001-08-15  0:06             ` David S. Miller
  0 siblings, 0 replies; 15+ messages in thread
From: David S. Miller @ 2001-08-15  0:06 UTC (permalink / raw)
  To: thockin; +Cc: linux-kernel

   From: Tim Hockin <thockin@sun.com>
   Date: Tue, 14 Aug 2001 17:09:00 -0700

   But for an application that (imho) is poorly written but IS COMPLIANT, it
   fails.
   
   This program is compliant, if your ulimit -n is maxxed out at 1048576.

This program is stupid.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-14 23:53         ` David S. Miller
@ 2001-08-15  0:08           ` Andrea Arcangeli
  2001-08-15  0:09           ` Tim Hockin
  1 sibling, 0 replies; 15+ messages in thread
From: Andrea Arcangeli @ 2001-08-15  0:08 UTC (permalink / raw)
  To: David S. Miller; +Cc: thockin, linux-kernel

On Tue, Aug 14, 2001 at 04:53:20PM -0700, David S. Miller wrote:
> invalid fds.  The behavior is identical both before and after
> your change, so there is no reason to make it.

actually I can see a case where it makes difference: if you sends 1024
entries to poll with the last entry of the array polling for one of the
allocated file descriptors (all previous 1023 entries are negative) and
the last fd allocated is ID 255.

Andrea

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-14 23:53         ` David S. Miller
  2001-08-15  0:08           ` Andrea Arcangeli
@ 2001-08-15  0:09           ` Tim Hockin
  2001-08-15  0:06             ` David S. Miller
  1 sibling, 1 reply; 15+ messages in thread
From: Tim Hockin @ 2001-08-15  0:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux Kernel Mailing List

"David S. Miller" wrote:
> 
>    From: Tim Hockin <thockin@sun.com>
>    Date: Tue, 14 Aug 2001 16:53:43 -0700
> 
>    The standard says negative fd's are ignored.  We get that right.  What we
>    are left with is an overly paranoid check against max_fds.  This check
>    should go away.  You should be able to pass in up to your rlimit fds, and
>    let negative ones (holes or tails) be ignored.
>
> I am saying there is no problem.
> 
> In both cases, for a properly written application we ignore the
> invalid fds.  The behavior is identical both before and after
> your change, so there is no reason to make it.

But for an application that (imho) is poorly written but IS COMPLIANT, it
fails.

This program is compliant, if your ulimit -n is maxxed out at 1048576.

I'm not saying it is good design, but it is compliant.  The patch hurts
nothing and makes poll() that little bit more friendly.



#include <stdio.h>
#include <sys/poll.h>
#include <errno.h>
#include <string.h>

int
main(void)
{
        static struct pollfd ar[1024*1024];
        int size = sizeof(ar)/sizeof(*ar);
        int i;

        for (i = 0; i < size; i++)
                ar[i].fd = -1;
        ar[1000000].fd = 0;
        ar[1000000].events = POLLIN;

        i = poll(ar, size, -1);
        printf("return = %d (%s)\n", i, i?strerror(errno):"success");

        return 0;
}


-- 
Tim Hockin
Systems Software Engineer
Sun Microsystems, Cobalt Server Appliances
thockin@sun.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-14 23:53       ` RFC: poll change Tim Hockin
  2001-08-14 23:53         ` David S. Miller
@ 2001-08-15  0:11         ` Andrea Arcangeli
  2001-08-15  0:16           ` David S. Miller
  1 sibling, 1 reply; 15+ messages in thread
From: Andrea Arcangeli @ 2001-08-15  0:11 UTC (permalink / raw)
  To: Tim Hockin; +Cc: David S. Miller, Linux Kernel Mailing List

On Tue, Aug 14, 2001 at 04:53:43PM -0700, Tim Hockin wrote:
> -	if (nfds > NR_OPEN)
> +	if (nfds > current->rlim[RLIMIT_NOFILE].rlim_cur)

Here SuS speaks about OPEN_MAX, not sure if OPEN_MAX corresponds to the
current ulimit or to the absolute maximum (to me it sounds more like our
NR_OPEN).

Andrea

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-15  0:11         ` Andrea Arcangeli
@ 2001-08-15  0:16           ` David S. Miller
  2001-08-15  0:30             ` Andrea Arcangeli
                               ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: David S. Miller @ 2001-08-15  0:16 UTC (permalink / raw)
  To: andrea; +Cc: thockin, linux-kernel

   From: Andrea Arcangeli <andrea@suse.de>
   Date: Wed, 15 Aug 2001 02:11:10 +0200

   On Tue, Aug 14, 2001 at 04:53:43PM -0700, Tim Hockin wrote:
   > -	if (nfds > NR_OPEN)
   > +	if (nfds > current->rlim[RLIMIT_NOFILE].rlim_cur)
   
   Here SuS speaks about OPEN_MAX, not sure if OPEN_MAX corresponds to the
   current ulimit or to the absolute maximum (to me it sounds more like our
   NR_OPEN).

Right, and our equivalent is "NR_OPEN".

Ok, I think I see what you and Tim are trying to say.
I'm thinking in a select() minded way which is why I didn't
understand :-)

Yeah, the check can be removed, but anyone who cares about
performance won't pass in huge arrays of -1 entries if only
the low few pollfd entries are actually useful.

Later,
David S. Miller
davem@redhat.com



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-15  0:16           ` David S. Miller
@ 2001-08-15  0:30             ` Andrea Arcangeli
  2001-08-15  0:34             ` [PATCH] agreed upon " Tim Hockin
  2001-08-15 14:13             ` RFC: " Andrea Arcangeli
  2 siblings, 0 replies; 15+ messages in thread
From: Andrea Arcangeli @ 2001-08-15  0:30 UTC (permalink / raw)
  To: David S. Miller; +Cc: thockin, linux-kernel

On Tue, Aug 14, 2001 at 05:16:09PM -0700, David S. Miller wrote:
> Yeah, the check can be removed, but anyone who cares about
> performance won't pass in huge arrays of -1 entries if only
> the low few pollfd entries are actually useful.

yes, on the same lines I'm not sure who needs the -1 entries in first
place since one of the major points of poll is to avoid useless browse
of non interesting entries. I'm scared people uses poll just to
workaround the 1024 bit limit on the fdset structures in glibc and to
generate arrays larger than 1024. I hope it's just a resource management
issue, where only a few entries are sometime set to -1 and so they can
avoid to duplicate the polltable for very minor difference in their
utilization or for slow path cases.

Andrea

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] agreed upon poll change
  2001-08-15  0:34             ` [PATCH] agreed upon " Tim Hockin
@ 2001-08-15  0:32               ` David S. Miller
  0 siblings, 0 replies; 15+ messages in thread
From: David S. Miller @ 2001-08-15  0:32 UTC (permalink / raw)
  To: thockin; +Cc: andrea, linux-kernel

   From: Tim Hockin <thockin@sun.com>
   Date: Tue, 14 Aug 2001 17:34:22 -0700

   of course...patch attached, for anyone who cares - do we need to send it to
   anyone else, or are you going to channel it in, Dave?

I'll pass it on.

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] agreed upon poll change
  2001-08-15  0:16           ` David S. Miller
  2001-08-15  0:30             ` Andrea Arcangeli
@ 2001-08-15  0:34             ` Tim Hockin
  2001-08-15  0:32               ` David S. Miller
  2001-08-15 14:13             ` RFC: " Andrea Arcangeli
  2 siblings, 1 reply; 15+ messages in thread
From: Tim Hockin @ 2001-08-15  0:34 UTC (permalink / raw)
  To: David S. Miller; +Cc: andrea, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 621 bytes --]

"David S. Miller" wrote:

> Right, and our equivalent is "NR_OPEN".
> 
> Ok, I think I see what you and Tim are trying to say.
> I'm thinking in a select() minded way which is why I didn't
> understand :-)

Whew! :)

> Yeah, the check can be removed, but anyone who cares about
> performance won't pass in huge arrays of -1 entries if only
> the low few pollfd entries are actually useful.

of course...patch attached, for anyone who cares - do we need to send it to
anyone else, or are you going to channel it in, Dave?

-- 
Tim Hockin
Systems Software Engineer
Sun Microsystems, Cobalt Server Appliances
thockin@sun.com

[-- Attachment #2: poll-paranoid.diff --]
[-- Type: text/plain, Size: 528 bytes --]

Index: fs/select.c
===================================================================
RCS file: /home/cvs/linux-2.4/fs/select.c,v
retrieving revision 1.5
diff -u -r1.5 select.c
--- fs/select.c	2001/07/09 23:10:25	1.5
+++ fs/select.c	2001/08/15 00:28:13
@@ -419,9 +419,6 @@
 	if (nfds > NR_OPEN)
 		return -EINVAL;
 
-	if (nfds > current->files->max_fds)
-		nfds = current->files->max_fds;
-
 	if (timeout) {
 		/* Careful about overflow in the intermediate values */
 		if ((unsigned long) timeout < MAX_SCHEDULE_TIMEOUT / HZ)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-15  0:16           ` David S. Miller
  2001-08-15  0:30             ` Andrea Arcangeli
  2001-08-15  0:34             ` [PATCH] agreed upon " Tim Hockin
@ 2001-08-15 14:13             ` Andrea Arcangeli
  2001-08-15 14:24               ` David S. Miller
  2 siblings, 1 reply; 15+ messages in thread
From: Andrea Arcangeli @ 2001-08-15 14:13 UTC (permalink / raw)
  To: David S. Miller; +Cc: thockin, linux-kernel

On Tue, Aug 14, 2001 at 05:16:09PM -0700, David S. Miller wrote:
>    From: Andrea Arcangeli <andrea@suse.de>
>    Date: Wed, 15 Aug 2001 02:11:10 +0200
> 
>    On Tue, Aug 14, 2001 at 04:53:43PM -0700, Tim Hockin wrote:
>    > -	if (nfds > NR_OPEN)
>    > +	if (nfds > current->rlim[RLIMIT_NOFILE].rlim_cur)
>    
>    Here SuS speaks about OPEN_MAX, not sure if OPEN_MAX corresponds to the
>    current ulimit or to the absolute maximum (to me it sounds more like our
>    NR_OPEN).
> 
> Right, and our equivalent is "NR_OPEN".

I was backporting the new version to 2.2 and I noticed that by using
NR_OPEN an luser will actually be able to lock into RAM something of the
order of the dozen mbytes per process. So I'm wondering that it would be
saner to use the rlimit instead, after all I don't see much of a value
to use NR_OPEN instead of the rlimit (even if strictly speaking SuS asks
us to use NR_OPEN). Any weird program (if any) that would depend on
NR_OPEN instead of the rlimit can be easily fixed with a one liner at
most. So I guess I'd be more happy with the rlimit instead of NR_OPEN.
Comments?

Andrea

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-15 14:13             ` RFC: " Andrea Arcangeli
@ 2001-08-15 14:24               ` David S. Miller
  2001-08-15 14:32                 ` Andrea Arcangeli
  0 siblings, 1 reply; 15+ messages in thread
From: David S. Miller @ 2001-08-15 14:24 UTC (permalink / raw)
  To: andrea; +Cc: thockin, linux-kernel

   From: Andrea Arcangeli <andrea@suse.de>
   Date: Wed, 15 Aug 2001 16:13:18 +0200

   I was backporting the new version to 2.2 and I noticed that by using
   NR_OPEN an luser will actually be able to lock into RAM something of the
   order of the dozen mbytes per process.

He can do this with page tables too :-)

Later,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-15 14:24               ` David S. Miller
@ 2001-08-15 14:32                 ` Andrea Arcangeli
  2001-08-15 16:30                   ` Hugh Dickins
  0 siblings, 1 reply; 15+ messages in thread
From: Andrea Arcangeli @ 2001-08-15 14:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: thockin, linux-kernel

On Wed, Aug 15, 2001 at 07:24:19AM -0700, David S. Miller wrote:
>    From: Andrea Arcangeli <andrea@suse.de>
>    Date: Wed, 15 Aug 2001 16:13:18 +0200
> 
>    I was backporting the new version to 2.2 and I noticed that by using
>    NR_OPEN an luser will actually be able to lock into RAM something of the
>    order of the dozen mbytes per process.
> 
> He can do this with page tables too :-)

Since 2.2 we have the free_pgtables to release the pagetables under
unused pgd slots, that was used to work pretty well last time I checked.

Andrea

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-15 14:32                 ` Andrea Arcangeli
@ 2001-08-15 16:30                   ` Hugh Dickins
  2001-08-15 16:40                     ` Andrea Arcangeli
  0 siblings, 1 reply; 15+ messages in thread
From: Hugh Dickins @ 2001-08-15 16:30 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: David S. Miller, thockin, linux-kernel

On Wed, 15 Aug 2001, Andrea Arcangeli wrote:
> 
> Since 2.2 we have the free_pgtables to release the pagetables under
> unused pgd slots, that was used to work pretty well last time I checked.

Funny you mention that: I noticed a while back that actually
it doesn't work well with i386 PAE - presumably looks for an empty 1GB.

Hugh


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: RFC: poll change
  2001-08-15 16:30                   ` Hugh Dickins
@ 2001-08-15 16:40                     ` Andrea Arcangeli
  0 siblings, 0 replies; 15+ messages in thread
From: Andrea Arcangeli @ 2001-08-15 16:40 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: David S. Miller, thockin, linux-kernel

On Wed, Aug 15, 2001 at 05:30:04PM +0100, Hugh Dickins wrote:
> On Wed, 15 Aug 2001, Andrea Arcangeli wrote:
> > 
> > Since 2.2 we have the free_pgtables to release the pagetables under
> > unused pgd slots, that was used to work pretty well last time I checked.
> 
> Funny you mention that: I noticed a while back that actually
> it doesn't work well with i386 PAE - presumably looks for an empty 1GB.

correct, we'd need the equivalent for the pmd too, this applies to the
other 3-level (or more) pagetables archs too.

Andrea

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2001-08-15 16:41 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <3B79B381.58266C13@sun.com>
     [not found] ` <20010814.162710.131914269.davem@redhat.com>
     [not found]   ` <3B79B5F3.C816CBED@sun.com>
     [not found]     ` <20010814.163804.66057702.davem@redhat.com>
2001-08-14 23:53       ` RFC: poll change Tim Hockin
2001-08-14 23:53         ` David S. Miller
2001-08-15  0:08           ` Andrea Arcangeli
2001-08-15  0:09           ` Tim Hockin
2001-08-15  0:06             ` David S. Miller
2001-08-15  0:11         ` Andrea Arcangeli
2001-08-15  0:16           ` David S. Miller
2001-08-15  0:30             ` Andrea Arcangeli
2001-08-15  0:34             ` [PATCH] agreed upon " Tim Hockin
2001-08-15  0:32               ` David S. Miller
2001-08-15 14:13             ` RFC: " Andrea Arcangeli
2001-08-15 14:24               ` David S. Miller
2001-08-15 14:32                 ` Andrea Arcangeli
2001-08-15 16:30                   ` Hugh Dickins
2001-08-15 16:40                     ` Andrea Arcangeli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox