[PATCH] fdset's leakage

* [PATCH] fdset's leakage
From: Kirill Korotaev @ 2006-07-10 13:40 UTC
  To: Andrew Morton, Linux Kernel Mailing List, devel, Alexey Kuznetsov

[-- Attachment #1: Type: text/plain, Size: 453 bytes --]

Andrew,

Another patch from Alexey Kuznetsov fixing a memory leak in alloc_fdtable().

[PATCH] fdset's leakage

Once found, the bug is obvious: nfds, computed when allocating the fdsets,
is overwritten by the size calculation for the fd array, and when we are
unlucky (the fd array allocation fails) we try to free the fdsets with
the wrong size.

Found thanks to OpenVZ resource management (User Beancounters).

Signed-Off-By: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-Off-By: Kirill Korotaev <dev@openvz.org>


[-- Attachment #2: diff-fdset-leakage --]
[-- Type: text/plain, Size: 523 bytes --]

diff -urp linux-2.6-orig/fs/file.c linux-2.6/fs/file.c
--- linux-2.6-orig/fs/file.c	2006-07-10 12:10:51.000000000 +0400
+++ linux-2.6/fs/file.c	2006-07-10 14:47:01.000000000 +0400
@@ -277,11 +277,13 @@ static struct fdtable *alloc_fdtable(int
 	} while (nfds <= nr);
 	new_fds = alloc_fd_array(nfds);
 	if (!new_fds)
-		goto out;
+		goto out2;
 	fdt->fd = new_fds;
 	fdt->max_fds = nfds;
 	fdt->free_files = NULL;
 	return fdt;
+out2:
+	nfds = fdt->max_fdset;
 out:
   	if (new_openset)
   		free_fdset(new_openset, nfds);
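
The failure path can be modelled in userspace. What follows is a minimal sketch, not the kernel code: alloc_set(), free_set(), charged_bytes and failure_path_balance() are hypothetical names, the sizes 1024 and 4096 are made up, and the byte counter only stands in for size-based accounting of the kind the message attributes to User Beancounters. It shows how reusing one variable for two sizes makes the cleanup path release the fdset with a size it was never allocated with.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical byte counter standing in for size-based accounting. */
long charged_bytes;

/* Allocate a bitmap of nbits bits and charge its size. */
void *alloc_set(int nbits)
{
	charged_bytes += nbits / 8;
	return malloc(nbits / 8);
}

/* Free a bitmap; the caller must pass the size used at allocation. */
void free_set(void *set, int nbits)
{
	charged_bytes -= nbits / 8;
	free(set);
}

/*
 * Model the error path and return the accounting balance left behind.
 * A non-zero result means the set was released with the wrong size.
 */
long failure_path_balance(int patched)
{
	int nfds, max_fdset;
	void *new_openset;

	charged_bytes = 0;

	nfds = 1024;			/* assumed size chosen for the fdsets */
	new_openset = alloc_set(nfds);
	max_fdset = nfds;

	nfds = 4096;			/* same variable reused for the fd array */
	/* assume the fd array allocation fails at this point */
	if (patched)
		nfds = max_fdset;	/* what the out2 label restores */

	free_set(new_openset, nfds);
	return charged_bytes;
}
```

With these assumed sizes, failure_path_balance(0) leaves a non-zero balance, while failure_path_balance(1) returns 0.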

* Re: [PATCH] fdset's leakage
From: Andrew Morton @ 2006-07-11  8:01 UTC
  To: Kirill Korotaev; +Cc: linux-kernel, devel, kuznet

On Mon, 10 Jul 2006 17:40:51 +0400
Kirill Korotaev <dev@openvz.org> wrote:

> Andrew,
> 
> Another patch from Alexey Kuznetsov fixing a memory leak in alloc_fdtable().
> 
> [PATCH] fdset's leakage
> 
> Once found, the bug is obvious: nfds, computed when allocating the fdsets,
> is overwritten by the size calculation for the fd array, and when we are
> unlucky (the fd array allocation fails) we try to free the fdsets with
> the wrong size.
> 
> Found due to OpenVZ resource management (User Beancounters).
> 
> Signed-Off-By: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Signed-Off-By: Kirill Korotaev <dev@openvz.org>
> 
> 
> diff -urp linux-2.6-orig/fs/file.c linux-2.6/fs/file.c
> --- linux-2.6-orig/fs/file.c	2006-07-10 12:10:51.000000000 +0400
> +++ linux-2.6/fs/file.c	2006-07-10 14:47:01.000000000 +0400
> @@ -277,11 +277,13 @@ static struct fdtable *alloc_fdtable(int
>  	} while (nfds <= nr);
>  	new_fds = alloc_fd_array(nfds);
>  	if (!new_fds)
> -		goto out;
> +		goto out2;
>  	fdt->fd = new_fds;
>  	fdt->max_fds = nfds;
>  	fdt->free_files = NULL;
>  	return fdt;
> +out2:
> +	nfds = fdt->max_fdset;
>  out:
>    	if (new_openset)
>    		free_fdset(new_openset, nfds);

OK, that was a simple fix.  And if we need this fix backported to 2.6.17.x
then it'd be best to go with the simple fix.

And I think we do need to backport this to 2.6.17.x because NR_OPEN can be
really big, and vmalloc() is not immortal.

But the code in there is really sick.   In all cases we do:

	free_fdset(foo->open_fds, foo->max_fdset);
	free_fdset(foo->close_on_exec, foo->max_fdset);

How much neater and more reliable would it be to do:

	free_fdsets(foo);

?
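
A minimal userspace sketch of what such a helper might look like. The struct is a hypothetical, cut-down stand-in for struct fdtable, and the _sketch names are invented for illustration; only the field names max_fdset, open_fds and close_on_exec come from the calls quoted above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for the kernel's struct fdtable. */
struct fdtable_sketch {
	int max_fdset;			/* size, in bits, of both sets */
	unsigned long *open_fds;
	unsigned long *close_on_exec;
};

/* Stand-in for free_fdset(); this model ignores the size argument. */
void free_fdset_sketch(unsigned long *set, int nbits)
{
	(void)nbits;
	free(set);
}

/* One helper frees both sets, always with the table's own size. */
void free_fdsets_sketch(struct fdtable_sketch *fdt)
{
	free_fdset_sketch(fdt->open_fds, fdt->max_fdset);
	free_fdset_sketch(fdt->close_on_exec, fdt->max_fdset);
	fdt->open_fds = NULL;
	fdt->close_on_exec = NULL;
}
```

Because the size is read from the table itself, a caller has no separate size variable to get wrong.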

Also,

	nfds = NR_OPEN_DEFAULT;
	/*
	 * Expand to the max in easy steps, and keep expanding it until
	 * we have enough for the requested fd array size.
	 */
	do {
#if NR_OPEN_DEFAULT < 256
		if (nfds < 256)
			nfds = 256;
		else
#endif
		if (nfds < (PAGE_SIZE / sizeof(struct file *)))
			nfds = PAGE_SIZE / sizeof(struct file *);
		else {
			nfds = nfds * 2;
			if (nfds > NR_OPEN)
				nfds = NR_OPEN;
  		}
	} while (nfds <= nr);


That's going to take a long time to compute if nr > NR_OPEN.  I just fixed
a similar infinite loop in this function.  Methinks this

	nfds = max(NR_OPEN_DEFAULT, 256);
	nfds = max(nfds, PAGE_SIZE/sizeof(struct file *));
	nfds = max(nfds, round_up_pow_of_two(nr + 1));
	nfds = min(nfds, NR_OPEN);

is clearer and less buggy.  I _think_ it's also equivalent (as long as
NR_OPEN>256).  But please check my logic.
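
One way to check the logic is to run both versions side by side in userspace. This is a sketch under stated assumptions: the _SK constants are illustrative values (NR_OPEN_DEFAULT of 64, NR_OPEN of 1024*1024, a 4096-byte page holding 8-byte pointers), not values taken from any particular kernel configuration, and roundup_pow2_sk() is a plain loop standing in for the kernel helper.

```c
#include <assert.h>

/* Assumed, illustrative constants; real values depend on the kernel config. */
#define NR_OPEN_DEFAULT_SK	64
#define NR_OPEN_SK		(1024 * 1024)
#define PTRS_PER_PAGE_SK	(4096 / 8)

/* Smallest power of two >= x, for x >= 1 (simple stand-in). */
unsigned long roundup_pow2_sk(unsigned long x)
{
	unsigned long r = 1;

	while (r < x)
		r <<= 1;
	return r;
}

/* The original loop. Only call with nr < NR_OPEN_SK, or it never ends. */
int nfds_loop(int nr)
{
	int nfds = NR_OPEN_DEFAULT_SK;

	do {
		if (nfds < 256)
			nfds = 256;
		else if (nfds < PTRS_PER_PAGE_SK)
			nfds = PTRS_PER_PAGE_SK;
		else {
			nfds = nfds * 2;
			if (nfds > NR_OPEN_SK)
				nfds = NR_OPEN_SK;
		}
	} while (nfds <= nr);
	return nfds;
}

/* The proposed max()/min() form; terminates for any nr >= 0. */
int nfds_closed(int nr)
{
	unsigned long nfds = NR_OPEN_DEFAULT_SK > 256 ? NR_OPEN_DEFAULT_SK : 256;
	unsigned long want = roundup_pow2_sk((unsigned long)nr + 1);

	if (nfds < PTRS_PER_PAGE_SK)
		nfds = PTRS_PER_PAGE_SK;
	if (nfds < want)
		nfds = want;
	if (nfds > NR_OPEN_SK)
		nfds = NR_OPEN_SK;
	return (int)nfds;
}
```

With these assumed constants the two functions agree for every nr from 256 up to NR_OPEN_SK - 1. They differ for smaller nr (the loop stops at 256, the closed form returns 512) because the closed form applies the page-sized floor unconditionally. For nr >= NR_OPEN_SK the closed form returns NR_OPEN_SK, where the loop would spin.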



* Re: [PATCH] fdset's leakage
From: Rene Scharfe @ 2006-07-11  9:02 UTC
  To: Andrew Morton; +Cc: linux-kernel, devel, kuznet, Linus Torvalds

[strange loop snipped]

> That's going to take a long time to compute if nr > NR_OPEN.  I just fixed
> a similar infinite loop in this function.

That other fix looks buggy btw.  Here it is:

-	nfds = 8 * L1_CACHE_BYTES;
-  	/* Expand to the max in easy steps */
-  	while (nfds <= nr) {
-		nfds = nfds * 2;
-		if (nfds > NR_OPEN)
-			nfds = NR_OPEN;
-	}
+	nfds = max_t(int, 8 * L1_CACHE_BYTES, roundup_pow_of_two(nfds));
+	if (nfds > NR_OPEN)
+		nfds = NR_OPEN;

Surely you meant to say "roundup_pow_of_two(nr + 1)"?

René

* Re: [PATCH] fdset's leakage
From: Kirill Korotaev @ 2006-07-11  9:05 UTC
  To: Andrew Morton; +Cc: linux-kernel, devel, kuznet

Andrew,

> OK, that was a simple fix.  And if we need this fix backported to 2.6.17.x
> then it'd be best to go with the simple fix.
> 
> And I think we do need to backport this to 2.6.17.x because NR_OPEN can be
> really big, and vmalloc() is not immortal.
> 
> But the code in there is really sick.   In all cases we do:
> 
> 	free_fdset(foo->open_fds, foo->max_fdset);
> 	free_fdset(foo->close_on_exec, foo->max_fdset);
> 
> How much neater and more reliable would it be to do:
> 
> 	free_fdsets(foo);
> 
> ?
agree. should I prepare a patch?

> Also,
> 
> 	nfds = NR_OPEN_DEFAULT;
> 	/*
> 	 * Expand to the max in easy steps, and keep expanding it until
> 	 * we have enough for the requested fd array size.
> 	 */
> 	do {
> #if NR_OPEN_DEFAULT < 256
> 		if (nfds < 256)
> 			nfds = 256;
> 		else
> #endif
> 		if (nfds < (PAGE_SIZE / sizeof(struct file *)))
> 			nfds = PAGE_SIZE / sizeof(struct file *);
> 		else {
> 			nfds = nfds * 2;
> 			if (nfds > NR_OPEN)
> 				nfds = NR_OPEN;
>   		}
> 	} while (nfds <= nr);
> 
> 
> That's going to take a long time to compute if nr > NR_OPEN.  I just fixed
> a similar infinite loop in this function.  Methinks this
> 
> 	nfds = max(NR_OPEN_DEFAULT, 256);
> 	nfds = max(nfds, PAGE_SIZE/sizeof(struct file *));
> 	nfds = max(nfds, round_up_pow_of_two(nr + 1));
> 	nfds = min(nfds, NR_OPEN);
> 
> is clearer and less buggy.  I _think_ it's also equivalent (as long as
> NR_OPEN>256).  But please check my logic.
Yeah, I also noticed these nasty loops but was too lazy to bother :)
Too much crap for my nerves :)

Your logic looks fine to me. Do we already have a round_up_pow_of_two() function, or
should we create something like:
unsigned long round_up_pow_of_two(unsigned long x)
{
  unsigned long res = 1 << BITS_PER_LONG;
  while (res > x)
    res >>= 1;
  return res << 1;
}

or maybe using:
n = find_first_bit(x);
return res = 1 << n;
(though it depends on endianness IMHO)
?

Thanks,
Kirill


* Re: [PATCH] fdset's leakage
From: Andrew Morton @ 2006-07-11  9:28 UTC
  To: Kirill Korotaev; +Cc: linux-kernel, devel, kuznet

On Tue, 11 Jul 2006 13:05:03 +0400
Kirill Korotaev <dev@openvz.org> wrote:

> Andrew,
> 
> > But the code in there is really sick.   In all cases we do:
> > 
> > 	free_fdset(foo->open_fds, foo->max_fdset);
> > 	free_fdset(foo->close_on_exec, foo->max_fdset);
> > 
> > How much neater and more reliable would it be to do:
> > 
> > 	free_fdsets(foo);
> > 
> > ?
> agree. should I prepare a patch?

Is OK, I'll take care of it later.  We want to let your current patch bake
as-is in mainline for a while so that we can backport it into 2.6.17.x with
more confidence.  That's a bit excessive in this case, but the principle is
good.

> > Also,
> > 
> > 	nfds = NR_OPEN_DEFAULT;
> > 	/*
> > 	 * Expand to the max in easy steps, and keep expanding it until
> > 	 * we have enough for the requested fd array size.
> > 	 */
> > 	do {
> > #if NR_OPEN_DEFAULT < 256
> > 		if (nfds < 256)
> > 			nfds = 256;
> > 		else
> > #endif
> > 		if (nfds < (PAGE_SIZE / sizeof(struct file *)))
> > 			nfds = PAGE_SIZE / sizeof(struct file *);
> > 		else {
> > 			nfds = nfds * 2;
> > 			if (nfds > NR_OPEN)
> > 				nfds = NR_OPEN;
> >   		}
> > 	} while (nfds <= nr);
> > 
> > 
> > That's going to take a long time to compute if nr > NR_OPEN.  I just fixed
> > a similar infinite loop in this function.  Methinks this
> > 
> > 	nfds = max(NR_OPEN_DEFAULT, 256);
> > 	nfds = max(nfds, PAGE_SIZE/sizeof(struct file *));
> > 	nfds = max(nfds, round_up_pow_of_two(nr + 1));
> > 	nfds = min(nfds, NR_OPEN);
> > 
> > is clearer and less buggy.  I _think_ it's also equivalent (as long as
> > NR_OPEN>256).  But please check my logic.
> Yeah, I also noticed these nasty loops but was too lazy to bother :)
> Too much crap for my nerves :)
> 
> Your logic looks fine to me.

I usually get that stuff wrong.

> Do we already have a round_up_pow_of_two() function

yep, in kernel.h.


* Re: [PATCH] fdset's leakage
From: Vadim Lobanov @ 2006-07-11 16:13 UTC
  To: Kirill Korotaev; +Cc: Andrew Morton, linux-kernel, devel, kuznet

On Tue, 11 Jul 2006, Kirill Korotaev wrote:

> Your logic looks fine to me. Do we already have a round_up_pow_of_two() function, or
> should we create something like:
> unsigned long round_up_pow_of_two(unsigned long x)
> {
>   unsigned long res = 1 << BITS_PER_LONG;

You'll get a zero here. Should be 1 << (BITS_PER_LONG - 1).

>   while (res > x)
>     res >>= 1;
>   return res << 1;
> }
>
> or maybe using:
> n = find_first_bit(x);
> return res = 1 << n;
> (though it depends on endianness IMHO)
> ?
>
> Thanks,
> Kirill

-- Vadim Lobanov

* Re: [PATCH] fdset's leakage
From: Eric Dumazet @ 2006-07-11 17:26 UTC
  To: Vadim Lobanov; +Cc: Kirill Korotaev, Andrew Morton, linux-kernel, devel, kuznet

On Tuesday 11 July 2006 18:13, Vadim Lobanov wrote:
> > unsigned long round_up_pow_of_two(unsigned long x)
> > {
> >   unsigned long res = 1 << BITS_PER_LONG;
>
> You'll get a zero here. Should be 1 << (BITS_PER_LONG - 1).
>

Nope. It won't work on 64-bit platforms :)

You want  1UL << (BITS_PER_LONG - 1).

But the roundup_pow_of_two() function is already defined in 
include/linux/kernel.h and uses fls_long()

Eric
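
The two points above can be sketched in userspace. The _sk names are hypothetical, fls_long_sk() is a simple loop standing in for the kernel's fls_long(), and the `1UL << fls_long(x - 1)` shape is an assumption about how such a helper can be built, not a copy of the kernel.h definition.

```c
#include <assert.h>
#include <limits.h>

/* Width of unsigned long on the build machine. */
#define BITS_PER_LONG_SK ((int)(sizeof(unsigned long) * CHAR_BIT))

/* 1-based index of the highest set bit; 0 when x == 0. */
int fls_long_sk(unsigned long x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/*
 * Smallest power of two >= x.
 * Valid only for 1 <= x <= top_bit_sk(); outside that range the shift
 * count would reach the width of the type.
 */
unsigned long roundup_pow_of_two_sk(unsigned long x)
{
	return 1UL << fls_long_sk(x - 1);
}

/* The highest bit, built from an unsigned long constant (1UL, not 1). */
unsigned long top_bit_sk(void)
{
	return 1UL << (BITS_PER_LONG_SK - 1);
}
```

The UL suffix matters because a plain 1 has type int, so the shift would be carried out in int width regardless of the type of the variable receiving the result.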

* Re: [PATCH] fdset's leakage
From: Kirill Korotaev @ 2006-07-12 10:49 UTC
  To: Vadim Lobanov; +Cc: Kirill Korotaev, Andrew Morton, linux-kernel, devel, kuznet

>>Your logic looks fine to me. Do we already have a round_up_pow_of_two() function, or
>>should we create something like:
>>unsigned long round_up_pow_of_two(unsigned long x)
>>{
>>  unsigned long res = 1 << BITS_PER_LONG;
> 
> 
> You'll get a zero here. Should be 1 << (BITS_PER_LONG - 1).
Good that so many people are watching even when you haven't written it yet :)))
Thanks!

Kirill
