All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] fdtable: Eradicate fdarray overflow.
@ 2006-10-12  2:58 Vadim Lobanov
  2006-10-12  5:19 ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Vadim Lobanov @ 2006-10-12  2:58 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel

Andrew,

If you want it, here is the "actual patch format" fix for the random kernel
bug issue that has been discovered. This patch is functionally identical to
the one you grabbed, but contains comments and sign-offs.

Fix the computation of the length of an allocated fdarray, when we decide to
grow the fdtable. The rationale behind this fix is as follows:
=> The 'nr' variable is the requested fd, so will be one less than the minimum
   allowable fdarray size.
=> Due to the above fact, when we divide 'nr' by a fourth-of-a-page block, we
   will always be exactly one block short of the size we need.
=> Incrementing before the division is wrong, because the division will discard
   a non-zero modulo, possibly leaving us one fourth-of-a-page block short.

Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net>

diff -Npru old/fs/file.c new/fs/file.c
--- old/fs/file.c	2006-10-10 18:58:21.000000000 -0700
+++ new/fs/file.c	2006-10-11 19:37:23.000000000 -0700
@@ -164,9 +164,8 @@ static struct fdtable * alloc_fdtable(un
 	 * the fdarray into page-sized chunks: starting at a quarter of a page,
 	 * and growing in powers of two from there on.
 	 */
-	nr++;
 	nr /= (PAGE_SIZE / 4 / sizeof(struct file *));
-	nr = roundup_pow_of_two(nr);
+	nr = roundup_pow_of_two(nr + 1);
 	nr *= (PAGE_SIZE / 4 / sizeof(struct file *));
 	if (nr > NR_OPEN)
 		nr = NR_OPEN;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] fdtable: Eradicate fdarray overflow.
  2006-10-12  2:58 [PATCH] fdtable: Eradicate fdarray overflow Vadim Lobanov
@ 2006-10-12  5:19 ` Eric Dumazet
  2006-10-12  6:07   ` Vadim Lobanov
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2006-10-12  5:19 UTC (permalink / raw)
  To: Vadim Lobanov; +Cc: akpm, linux-kernel

Vadim Lobanov a écrit :
> 
> diff -Npru old/fs/file.c new/fs/file.c
> --- old/fs/file.c	2006-10-10 18:58:21.000000000 -0700
> +++ new/fs/file.c	2006-10-11 19:37:23.000000000 -0700
> @@ -164,9 +164,8 @@ static struct fdtable * alloc_fdtable(un
>  	 * the fdarray into page-sized chunks: starting at a quarter of a page,
>  	 * and growing in powers of two from there on.
>  	 */
> -	nr++;
>  	nr /= (PAGE_SIZE / 4 / sizeof(struct file *));
> -	nr = roundup_pow_of_two(nr);
> +	nr = roundup_pow_of_two(nr + 1);
>  	nr *= (PAGE_SIZE / 4 / sizeof(struct file *));
>  	if (nr > NR_OPEN)
>  		nr = NR_OPEN;

Hi Vadim

I find your PAGE_SIZE/4 minimum allocation quite unjustified.

For architectures with 64K PAGE_SIZE, we endup allocating 16K, for poor tasks 
that happen to touch a not so high (>= 64) file descriptor...

I would vote for a fixed size, like 1024

Eric

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] fdtable: Eradicate fdarray overflow.
  2006-10-12  5:19 ` Eric Dumazet
@ 2006-10-12  6:07   ` Vadim Lobanov
  2006-10-12  6:32     ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: Vadim Lobanov @ 2006-10-12  6:07 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: akpm, linux-kernel

On Wednesday 11 October 2006 22:19, Eric Dumazet wrote:
> Hi Vadim
>
> I find your PAGE_SIZE/4 minimum allocation quite unjustified.
>
> For architectures with 64K PAGE_SIZE, we endup allocating 16K, for poor
> tasks that happen to touch a not so high (>= 64) file descriptor...
>
> I would vote for a fixed size, like 1024

In my opinion, always picking 1024 would be highly suboptimal for some 
architectures (x86-64 in particular -- that's a whole page, just for the 
fdarray!). If anything, I'd prefer something similar to this pseudo-code:

#define FDTABLE_MIN min_t(uint, PAGE_SIZE / 4 / sizeof(struct file *), 1024)
...
nr /= FDTABLE_MIN;
nr = roundup_pow_of_two(nr + 1);
nr *= FDTABLE_MIN;

gcc should be smart enough to optimize that expression into a single constant. 
At least it did (version 4.1.0) in my quick test here.

> Eric

Let me know what you think. Please don't just go radio-silent on me. ;)

-- Vadim Lobanov

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] fdtable: Eradicate fdarray overflow.
  2006-10-12  6:07   ` Vadim Lobanov
@ 2006-10-12  6:32     ` Eric Dumazet
  2006-10-12  7:16       ` Vadim Lobanov
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2006-10-12  6:32 UTC (permalink / raw)
  To: Vadim Lobanov; +Cc: akpm, linux-kernel

Vadim Lobanov a écrit :
> On Wednesday 11 October 2006 22:19, Eric Dumazet wrote:
>> Hi Vadim
>>
>> I find your PAGE_SIZE/4 minimum allocation quite unjustified.
>>
>> For architectures with 64K PAGE_SIZE, we endup allocating 16K, for poor
>> tasks that happen to touch a not so high (>= 64) file descriptor...
>>
>> I would vote for a fixed size, like 1024
> 
> In my opinion, always picking 1024 would be highly suboptimal for some 
> architectures (x86-64 in particular -- that's a whole page, just for the 
> fdarray!). If anything, I'd prefer something similar to this pseudo-code:

I was speaking of 1024 bytes.

I was the guy who made fdset going from PAGE_SIZE to 64 bytes (L1_CACHE_BYTES 
if you dare), I wont be the guy responsible for a reverse path on fdtable :)

That is replace your (PAGE_SIZE/4)  by 1024, wich was you probably meant
No archi has a smaller page, so no need to play with min_t() macro...

> 
> #define FDTABLE_MIN min_t(uint, PAGE_SIZE / 4 / sizeof(struct file *), 1024)
> ...
> nr /= FDTABLE_MIN;
> nr = roundup_pow_of_two(nr + 1);
> nr *= FDTABLE_MIN;
> 
> gcc should be smart enough to optimize that expression into a single constant. 
> At least it did (version 4.1.0) in my quick test here.
> 
>> Eric
> 
> Let me know what you think. Please don't just go radio-silent on me. ;)
> 

radio-silent ? well, it seems I already sent you many mails about your patches :)

Eric


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] fdtable: Eradicate fdarray overflow.
  2006-10-12  6:32     ` Eric Dumazet
@ 2006-10-12  7:16       ` Vadim Lobanov
  0 siblings, 0 replies; 5+ messages in thread
From: Vadim Lobanov @ 2006-10-12  7:16 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: akpm, linux-kernel

On Wednesday 11 October 2006 23:32, Eric Dumazet wrote:
> Vadim Lobanov a écrit :
> > On Wednesday 11 October 2006 22:19, Eric Dumazet wrote:
> >> Hi Vadim
> >>
> >> I find your PAGE_SIZE/4 minimum allocation quite unjustified.
> >>
> >> For architectures with 64K PAGE_SIZE, we endup allocating 16K, for poor
> >> tasks that happen to touch a not so high (>= 64) file descriptor...
> >>
> >> I would vote for a fixed size, like 1024
> >
> > In my opinion, always picking 1024 would be highly suboptimal for some
> > architectures (x86-64 in particular -- that's a whole page, just for the
> > fdarray!). If anything, I'd prefer something similar to this pseudo-code:
>
> I was speaking of 1024 bytes.

Oh, well, that does make a difference! :) I misread your email: I thought you 
were suggesting 1024 fdarray entries, not 1024 bytes. I'd agree with you 
completely then.

> That is replace your (PAGE_SIZE/4)  by 1024, wich was you probably meant
> No archi has a smaller page, so no need to play with min_t() macro...

Good point. Code can then simply become:
nr /= (1024 / sizeof(struct file *));
nr = roundup_pow_of_two(nr + 1);
nr *= (1024 / sizeof(struct file *));

> > Let me know what you think. Please don't just go radio-silent on me. ;)
>
> radio-silent ? well, it seems I already sent you many mails about your
> patches :)

Your feedback is very much appreciated, believe me! :) Both you (for the 
ideas) and akpm (for his assistance and seemingly-infinite patience, even 
when I accidentally broke his tree (I owe him a few beers/other-goodies for 
that)) have helped out tremendously with this.

So far, I have two comments from you that I need to take care of:
1) allocate at least L1_CACHE_BYTES for fdsets
2) change PAGE_SIZE/4 to 1024
Both are performance tweaks, not crash fixes, and are easy to do. Could you 
please look through the rest of the code in fs/file.c and point out any other 
issues or code tweaks you can see? My plan is to prepare patches for these 
things and buffer them up, and when we're done tweaking, I'll forward the 
whole batch on to akpm.

Then, once everything has settled down for a long while, I'll send out the 
final fs/file.c cleanups patch -- mostly it introduces extensive comments 
into that file about why the code does what it does.

> Eric

-- Vadim Lobanov

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-10-12  7:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-12  2:58 [PATCH] fdtable: Eradicate fdarray overflow Vadim Lobanov
2006-10-12  5:19 ` Eric Dumazet
2006-10-12  6:07   ` Vadim Lobanov
2006-10-12  6:32     ` Eric Dumazet
2006-10-12  7:16       ` Vadim Lobanov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.