* [patch 1/4 -mm] flex_array: convert element_nr formals to unsigned
@ 2009-08-21 23:21 David Rientjes
2009-08-21 23:21 ` [patch 2/4 -mm] flex_array: add flex_array_clear function David Rientjes
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: David Rientjes @ 2009-08-21 23:21 UTC (permalink / raw)
To: Andrew Morton; +Cc: Dave Hansen, linux-kernel
It's problematic to allow signed element_nr's or total's to be passed as
part of the flex array API.
flex_array_alloc() allows total_nr_elements to be set to a negative
quantity, which is obviously erroneous.
flex_array_get() and flex_array_put() allows negative array indices in
dereferencing an array part, which could address memory mapped before
struct flex_array.
The fix is to convert all existing element_nr formals to be qualified as
unsigned. Existing checks to compare it to total_nr_elements or the max
array size based on element_size need not be changed.
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
include/linux/flex_array.h | 10 ++++++----
lib/flex_array.c | 24 +++++++++++++-----------
2 files changed, 19 insertions(+), 15 deletions(-)
diff --git a/include/linux/flex_array.h b/include/linux/flex_array.h
--- a/include/linux/flex_array.h
+++ b/include/linux/flex_array.h
@@ -36,12 +36,14 @@ struct flex_array {
.total_nr_elements = (total), \
} } }
-struct flex_array *flex_array_alloc(int element_size, int total, gfp_t flags);
-int flex_array_prealloc(struct flex_array *fa, int start, int end, gfp_t flags);
+struct flex_array *flex_array_alloc(int element_size, unsigned int total,
+ gfp_t flags);
+int flex_array_prealloc(struct flex_array *fa, unsigned int start,
+ unsigned int end, gfp_t flags);
void flex_array_free(struct flex_array *fa);
void flex_array_free_parts(struct flex_array *fa);
-int flex_array_put(struct flex_array *fa, int element_nr, void *src,
+int flex_array_put(struct flex_array *fa, unsigned int element_nr, void *src,
gfp_t flags);
-void *flex_array_get(struct flex_array *fa, int element_nr);
+void *flex_array_get(struct flex_array *fa, unsigned int element_nr);
#endif /* _FLEX_ARRAY_H */
diff --git a/lib/flex_array.c b/lib/flex_array.c
--- a/lib/flex_array.c
+++ b/lib/flex_array.c
@@ -99,7 +99,8 @@ static inline int elements_fit_in_base(struct flex_array *fa)
* capacity in the base structure. Also note that no effort is made
* to efficiently pack objects across page boundaries.
*/
-struct flex_array *flex_array_alloc(int element_size, int total, gfp_t flags)
+struct flex_array *flex_array_alloc(int element_size, unsigned int total,
+ gfp_t flags)
{
struct flex_array *ret;
int max_size = nr_base_part_ptrs() * __elements_per_part(element_size);
@@ -115,7 +116,8 @@ struct flex_array *flex_array_alloc(int element_size, int total, gfp_t flags)
return ret;
}
-static int fa_element_to_part_nr(struct flex_array *fa, int element_nr)
+static int fa_element_to_part_nr(struct flex_array *fa,
+ unsigned int element_nr)
{
return element_nr / __elements_per_part(fa->element_size);
}
@@ -143,14 +145,12 @@ void flex_array_free(struct flex_array *fa)
kfree(fa);
}
-static int fa_index_inside_part(struct flex_array *fa, int element_nr)
+static unsigned int index_inside_part(struct flex_array *fa,
+ unsigned int element_nr)
{
- return element_nr % __elements_per_part(fa->element_size);
-}
+ unsigned int part_offset;
-static int index_inside_part(struct flex_array *fa, int element_nr)
-{
- int part_offset = fa_index_inside_part(fa, element_nr);
+ part_offset = element_nr % __elements_per_part(fa->element_size);
return part_offset * fa->element_size;
}
@@ -185,7 +185,8 @@ __fa_get_part(struct flex_array *fa, int part_nr, gfp_t flags)
*
* Locking must be provided by the caller.
*/
-int flex_array_put(struct flex_array *fa, int element_nr, void *src, gfp_t flags)
+int flex_array_put(struct flex_array *fa, unsigned int element_nr, void *src,
+ gfp_t flags)
{
int part_nr = fa_element_to_part_nr(fa, element_nr);
struct flex_array_part *part;
@@ -217,7 +218,8 @@ int flex_array_put(struct flex_array *fa, int element_nr, void *src, gfp_t flags
*
* Locking must be provided by the caller.
*/
-int flex_array_prealloc(struct flex_array *fa, int start, int end, gfp_t flags)
+int flex_array_prealloc(struct flex_array *fa, unsigned int start,
+ unsigned int end, gfp_t flags)
{
int start_part;
int end_part;
@@ -248,7 +250,7 @@ int flex_array_prealloc(struct flex_array *fa, int start, int end, gfp_t flags)
*
* Locking must be provided by the caller.
*/
-void *flex_array_get(struct flex_array *fa, int element_nr)
+void *flex_array_get(struct flex_array *fa, unsigned int element_nr)
{
int part_nr = fa_element_to_part_nr(fa, element_nr);
struct flex_array_part *part;
^ permalink raw reply [flat|nested] 30+ messages in thread* [patch 2/4 -mm] flex_array: add flex_array_clear function
2009-08-21 23:21 [patch 1/4 -mm] flex_array: convert element_nr formals to unsigned David Rientjes
@ 2009-08-21 23:21 ` David Rientjes
2009-08-24 15:41 ` Dave Hansen
2009-08-21 23:21 ` [patch 3/4 -mm] flex_array: poison free elements David Rientjes
2009-08-21 23:21 ` [patch 4/4 -mm] flex_array: add flex_array_shrink function David Rientjes
2 siblings, 1 reply; 30+ messages in thread
From: David Rientjes @ 2009-08-21 23:21 UTC (permalink / raw)
To: Andrew Morton; +Cc: Dave Hansen, linux-kernel
This patch adds a new function to the flex_array API:
int flex_array_clear(struct flex_array *fa,
unsigned int element_nr)
This function will zero the element at element_nr in the flex_array.
Although this is equivalent to using flex_array_put() and passing a
pointer to zero'd memory, flex_array_clear() does not require such
a pointer to memory that would most likely need to be allocated on
the caller's stack which could be significantly large depending on
element_size.
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
include/linux/flex_array.h | 1 +
lib/flex_array.c | 26 ++++++++++++++++++++++++++
2 files changed, 27 insertions(+), 0 deletions(-)
diff --git a/include/linux/flex_array.h b/include/linux/flex_array.h
--- a/include/linux/flex_array.h
+++ b/include/linux/flex_array.h
@@ -44,6 +44,7 @@ void flex_array_free(struct flex_array *fa);
void flex_array_free_parts(struct flex_array *fa);
int flex_array_put(struct flex_array *fa, unsigned int element_nr, void *src,
gfp_t flags);
+int flex_array_clear(struct flex_array *fa, unsigned int element_nr);
void *flex_array_get(struct flex_array *fa, unsigned int element_nr);
#endif /* _FLEX_ARRAY_H */
diff --git a/lib/flex_array.c b/lib/flex_array.c
--- a/lib/flex_array.c
+++ b/lib/flex_array.c
@@ -207,6 +207,32 @@ int flex_array_put(struct flex_array *fa, unsigned int element_nr, void *src,
}
/**
+ * flex_array_clear - clear element in array at @element_nr
+ * @element_nr: index of the position to clear.
+ *
+ * Locking must be provided by the caller.
+ */
+int flex_array_clear(struct flex_array *fa, unsigned int element_nr)
+{
+ int part_nr = fa_element_to_part_nr(fa, element_nr);
+ struct flex_array_part *part;
+ void *dst;
+
+ if (element_nr >= fa->total_nr_elements)
+ return -ENOSPC;
+ if (elements_fit_in_base(fa))
+ part = (struct flex_array_part *)&fa->parts[0];
+ else {
+ part = fa->parts[part_nr];
+ if (!part)
+ return -EINVAL;
+ }
+ dst = &part->elements[index_inside_part(fa, element_nr)];
+ memset(dst, 0, fa->element_size);
+ return 0;
+}
+
+/**
* flex_array_prealloc - guarantee that array space exists
* @start: index of first array element for which space is allocated
* @end: index of last (inclusive) element for which space is allocated
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 2/4 -mm] flex_array: add flex_array_clear function
2009-08-21 23:21 ` [patch 2/4 -mm] flex_array: add flex_array_clear function David Rientjes
@ 2009-08-24 15:41 ` Dave Hansen
2009-08-24 20:29 ` David Rientjes
0 siblings, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2009-08-24 15:41 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, linux-kernel
On Fri, 2009-08-21 at 16:21 -0700, David Rientjes wrote:
> /**
> + * flex_array_clear - clear element in array at @element_nr
> + * @element_nr: index of the position to clear.
> + *
> + * Locking must be provided by the caller.
> + */
> +int flex_array_clear(struct flex_array *fa, unsigned int element_nr)
> +{
> + int part_nr = fa_element_to_part_nr(fa, element_nr);
> + struct flex_array_part *part;
> + void *dst;
> +
> + if (element_nr >= fa->total_nr_elements)
> + return -ENOSPC;
> + if (elements_fit_in_base(fa))
> + part = (struct flex_array_part *)&fa->parts[0];
> + else {
> + part = fa->parts[part_nr];
> + if (!part)
> + return -EINVAL;
> + }
> + dst = &part->elements[index_inside_part(fa, element_nr)];
> + memset(dst, 0, fa->element_size);
> + return 0;
> +}
My only worry about this is that it's largely a copy-and-paste of
flex_array_put(). If we had a function that just returned a pointer to
'dst', we could use that in both cases.
Couldn't we implement the above with just:
int flex_array_clear(struct flex_array *fa, unsigned int element_nr)
{
return flex_array_put(fa, element_nr, &empty_zero_page);
}
-- Dave
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 2/4 -mm] flex_array: add flex_array_clear function
2009-08-24 15:41 ` Dave Hansen
@ 2009-08-24 20:29 ` David Rientjes
2009-08-24 20:38 ` Dave Hansen
0 siblings, 1 reply; 30+ messages in thread
From: David Rientjes @ 2009-08-24 20:29 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, linux-kernel
On Mon, 24 Aug 2009, Dave Hansen wrote:
> My only worry about this is that it's largely a copy-and-paste of
> flex_array_put(). If we had a function that just returned a pointer to
> 'dst', we could use that in both cases.
>
> Couldn't we implement the above with just:
>
> int flex_array_clear(struct flex_array *fa, unsigned int element_nr)
> {
> return flex_array_put(fa, element_nr, &empty_zero_page);
> }
>
Sure, if you never increase FLEX_ARRAY_PART_SIZE. Otherwise, doing
static char zero_part[FLEX_ARRAY_PART_SIZE] = {
[0 ... FLEX_ARRAY_PART_SIZE - 1] = 0
};
and using flex_array_put(fa, element_nr, &zero_part) would work although
you're trading off cleaner, yet not more efficient, code at the cost of
FLEX_ARRAY_PART_SIZE wasted memory and memcpy() being slower than
memset().
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 2/4 -mm] flex_array: add flex_array_clear function
2009-08-24 20:29 ` David Rientjes
@ 2009-08-24 20:38 ` Dave Hansen
2009-08-24 20:50 ` David Rientjes
0 siblings, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2009-08-24 20:38 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, linux-kernel
On Mon, 2009-08-24 at 13:29 -0700, David Rientjes wrote:
> On Mon, 24 Aug 2009, Dave Hansen wrote:
>
> > My only worry about this is that it's largely a copy-and-paste of
> > flex_array_put(). If we had a function that just returned a pointer to
> > 'dst', we could use that in both cases.
> >
> > Couldn't we implement the above with just:
> >
> > int flex_array_clear(struct flex_array *fa, unsigned int element_nr)
> > {
> > return flex_array_put(fa, element_nr, &empty_zero_page);
> > }
> >
>
> Sure, if you never increase FLEX_ARRAY_PART_SIZE. Otherwise, doing
>
> static char zero_part[FLEX_ARRAY_PART_SIZE] = {
> [0 ... FLEX_ARRAY_PART_SIZE - 1] = 0
> };
>
> and using flex_array_put(fa, element_nr, &zero_part) would work although
> you're trading off cleaner, yet not more efficient, code at the cost of
> FLEX_ARRAY_PART_SIZE wasted memory and memcpy() being slower than
> memset().
Yeah, that's true. How about using the get() function?
int flex_array_clear(struct flex_array *fa, unsigned int element_nr)
{
void *element = flex_array_get(fa, element_nr);
memset(element, FLEX_ARRAY_FREE, fa->element_size);
}
It'll keep us from having to keep around a zero'd element.
But, I guess we could also do:
struct flex_array_part *zero_part = empty_zero_page;
And use a BUILD_BUG_ON(FLEX_ARRAY_PART_SIZE > PAGE_SIZE). But the whole
point of this was to have elements that are smaller than PAGE_SIZE.
Having that as a constraint doesn't seem too bad. :)
-- Dave
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 2/4 -mm] flex_array: add flex_array_clear function
2009-08-24 20:38 ` Dave Hansen
@ 2009-08-24 20:50 ` David Rientjes
2009-08-24 21:28 ` Dave Hansen
0 siblings, 1 reply; 30+ messages in thread
From: David Rientjes @ 2009-08-24 20:50 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, linux-kernel
On Mon, 24 Aug 2009, Dave Hansen wrote:
> > Sure, if you never increase FLEX_ARRAY_PART_SIZE. Otherwise, doing
> >
> > static char zero_part[FLEX_ARRAY_PART_SIZE] = {
> > [0 ... FLEX_ARRAY_PART_SIZE - 1] = 0
> > };
> >
> > and using flex_array_put(fa, element_nr, &zero_part) would work although
> > you're trading off cleaner, yet not more efficient, code at the cost of
> > FLEX_ARRAY_PART_SIZE wasted memory and memcpy() being slower than
> > memset().
>
> Yeah, that's true. How about using the get() function?
>
> int flex_array_clear(struct flex_array *fa, unsigned int element_nr)
> {
> void *element = flex_array_get(fa, element_nr);
> memset(element, FLEX_ARRAY_FREE, fa->element_size);
> }
>
The idea was to eventually be able to distinguish between
use-uninitialized and use-after-free and flex_array_clear() was a
convenient way of providing an interface to identify the later. So when
an array is fully initialized (or fully cleared after a previous use where
all elements we're used), you couldn't do flex_array_clear() on an element
before flex_array_put() if its part isn't allocated yet with this
implementation.
> It'll keep us from having to keep around a zero'd element.
>
> But, I guess we could also do:
>
> struct flex_array_part *zero_part = empty_zero_page;
>
> And use a BUILD_BUG_ON(FLEX_ARRAY_PART_SIZE > PAGE_SIZE). But the whole
> point of this was to have elements that are smaller than PAGE_SIZE.
> Having that as a constraint doesn't seem too bad. :)
>
Hmm, I think being able to increase FLEX_ARRAY_PART_SIZE is eventually
going to become an integral part of the entire library so that it supports
larger number of entries (and order-1 allocations aren't as difficult with
anti-fragmentation), especially for those with larger element sizes.
Otherwise, there's no need for FLEX_ARRAY_PART_SIZE in the first place.
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 2/4 -mm] flex_array: add flex_array_clear function
2009-08-24 20:50 ` David Rientjes
@ 2009-08-24 21:28 ` Dave Hansen
2009-08-24 22:32 ` David Rientjes
0 siblings, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2009-08-24 21:28 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, linux-kernel
On Mon, 2009-08-24 at 13:50 -0700, David Rientjes wrote:
> On Mon, 24 Aug 2009, Dave Hansen wrote:
> > int flex_array_clear(struct flex_array *fa, unsigned int element_nr)
> > {
> > void *element = flex_array_get(fa, element_nr);
> > memset(element, FLEX_ARRAY_FREE, fa->element_size);
> > }
> >
>
> The idea was to eventually be able to distinguish between
> use-uninitialized and use-after-free and flex_array_clear() was a
> convenient way of providing an interface to identify the later. So when
> an array is fully initialized (or fully cleared after a previous use where
> all elements we're used), you couldn't do flex_array_clear() on an element
> before flex_array_put() if its part isn't allocated yet with this
> implementation.
OK, just to make sure I'm understanding what you are saying. If we
haven't allocated the 'part' of a given element, then this code is
bogus. flex_array_get() will return NULL, and we have nothing to
memset(). We effectively need flex_array_get()'s behavior, but we also
need to ensure that there is space for the element allocated if it
wasn't before flex_array_clear() is called. Right?
I'm not literally saying that we have to use flex_array_get() forever.
But, it does seem that flex_array_clear() could certainly share some
code with the existing functions. So, instead of just copying those
functions, let's make sure that we refactor them in a way so that we can
reuse the code.
-- Dave
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 2/4 -mm] flex_array: add flex_array_clear function
2009-08-24 21:28 ` Dave Hansen
@ 2009-08-24 22:32 ` David Rientjes
0 siblings, 0 replies; 30+ messages in thread
From: David Rientjes @ 2009-08-24 22:32 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, linux-kernel
On Mon, 24 Aug 2009, Dave Hansen wrote:
> OK, just to make sure I'm understanding what you are saying. If we
> haven't allocated the 'part' of a given element, then this code is
> bogus. flex_array_get() will return NULL, and we have nothing to
> memset(). We effectively need flex_array_get()'s behavior, but we also
> need to ensure that there is space for the element allocated if it
> wasn't before flex_array_clear() is called. Right?
>
Yes, that would be the only way to eventually be able to distinguish
between use-uninitialized and use-after-free; I don't know whether that
information is helpful to the person developing code that is using flex
array or not, but it costs nothing to do other than adding another poison
value. Since flex_array_get() will never allocate parts, we can't use it
as-is since it returns NULL for both getting beyond the array upper bound
and for an unallocated part. If it returned an ERR_PTR() that would
distinguish between the two, we could do a variation of what you suggested
and then check the return value, yet we still have the same restriction of
never increasing FLEX_ARRAY_PART_SIZE beyond PAGE_SIZE if we're to avoid
wasted memory space (static char zero_part[FLEX_ARRAY_PART_SIZE]) to
simply refactor the code with no other improvement.
^ permalink raw reply [flat|nested] 30+ messages in thread
* [patch 3/4 -mm] flex_array: poison free elements
2009-08-21 23:21 [patch 1/4 -mm] flex_array: convert element_nr formals to unsigned David Rientjes
2009-08-21 23:21 ` [patch 2/4 -mm] flex_array: add flex_array_clear function David Rientjes
@ 2009-08-21 23:21 ` David Rientjes
2009-08-24 15:56 ` Dave Hansen
2009-08-21 23:21 ` [patch 4/4 -mm] flex_array: add flex_array_shrink function David Rientjes
2 siblings, 1 reply; 30+ messages in thread
From: David Rientjes @ 2009-08-21 23:21 UTC (permalink / raw)
To: Andrew Morton; +Cc: Dave Hansen, linux-kernel
Newly initialized flex_array's and/or flex_array_part's are now poisoned
with a new poison value, FLEX_ARRAY_FREE. It's value is similar to
POISON_FREE used in the various slab allocators, but is different to
distinguish between flex array's poisoned kmem and slab allocator
poisoned kmem.
This will allow us to identify flex_array_part's that only contain free
elements (and free them with an addition to the flex_array API). This
could also be extended in the future to identify `get' uses on elements
that have not been `put'.
If __GFP_ZERO is passed for a part's gfp mask, the poisoning is avoided.
These elements are considered to be in-use since they have been
initialized.
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
include/linux/poison.h | 3 +++
lib/flex_array.c | 15 +++++++--------
2 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/include/linux/poison.h b/include/linux/poison.h
--- a/include/linux/poison.h
+++ b/include/linux/poison.h
@@ -65,6 +65,9 @@
#define MUTEX_DEBUG_INIT 0x11
#define MUTEX_DEBUG_FREE 0x22
+/********** lib/flex_array.c **********/
+#define FLEX_ARRAY_FREE 0x6c /* for use-after-free poisoning */
+
/********** security/ **********/
#define KEY_DESTROY 0xbd
diff --git a/lib/flex_array.c b/lib/flex_array.c
--- a/lib/flex_array.c
+++ b/lib/flex_array.c
@@ -113,6 +113,8 @@ struct flex_array *flex_array_alloc(int element_size, unsigned int total,
return NULL;
ret->element_size = element_size;
ret->total_nr_elements = total;
+ if (elements_fit_in_base(ret) && !(flags & __GFP_ZERO))
+ memset(ret->parts[0], FLEX_ARRAY_FREE, bytes_left_in_base());
return ret;
}
@@ -159,15 +161,12 @@ __fa_get_part(struct flex_array *fa, int part_nr, gfp_t flags)
{
struct flex_array_part *part = fa->parts[part_nr];
if (!part) {
- /*
- * This leaves the part pages uninitialized
- * and with potentially random data, just
- * as if the user had kmalloc()'d the whole.
- * __GFP_ZERO can be used to zero it.
- */
- part = kmalloc(FLEX_ARRAY_PART_SIZE, flags);
+ part = kmalloc(sizeof(struct flex_array_part), flags);
if (!part)
return NULL;
+ if (!(flags & __GFP_ZERO))
+ memset(part, FLEX_ARRAY_FREE,
+ sizeof(struct flex_array_part));
fa->parts[part_nr] = part;
}
return part;
@@ -228,7 +227,7 @@ int flex_array_clear(struct flex_array *fa, unsigned int element_nr)
return -EINVAL;
}
dst = &part->elements[index_inside_part(fa, element_nr)];
- memset(dst, 0, fa->element_size);
+ memset(dst, FLEX_ARRAY_FREE, fa->element_size);
return 0;
}
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-08-21 23:21 ` [patch 3/4 -mm] flex_array: poison free elements David Rientjes
@ 2009-08-24 15:56 ` Dave Hansen
2009-08-24 20:41 ` David Rientjes
0 siblings, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2009-08-24 15:56 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, linux-kernel
On Fri, 2009-08-21 at 16:21 -0700, David Rientjes wrote:
>
> diff --git a/include/linux/poison.h b/include/linux/poison.h
> --- a/include/linux/poison.h
> +++ b/include/linux/poison.h
> @@ -65,6 +65,9 @@
> #define MUTEX_DEBUG_INIT 0x11
> #define MUTEX_DEBUG_FREE 0x22
>
> +/********** lib/flex_array.c **********/
> +#define FLEX_ARRAY_FREE 0x6c /* for use-after-free poisoning */
This seems like a good idea, but perhaps we should pick a non-ASCII
character as the poison value. If someone ever tried to store strings
as one-byte elements, they'd be in for a rude awakening the first time
they store an 'l'.
Or, maybe we should just disable poisoning if the elements are 4 bytes
or less. Or, perhaps the minimum element size should just be 4 bytes
and we have a 4-byte poison value.
The other alternative is to just use '\0' as the cleared value. We
won't be able to track whether accesses in the middle of the array are
valid, but we can at least always shrink the array with no fear of
misdetection of the poison value. That also rids us of some of the
logic around GFP_ZERO and the poisoning that is a bit confusing.
Do you think it is confusing that passing GFP_ZERO will keep you from
doing a shrink unless you do a clear for all of the allocated data?
Should we expose some of the functions so that users can tell if what
was allocated precisely?
-- Dave
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-08-24 15:56 ` Dave Hansen
@ 2009-08-24 20:41 ` David Rientjes
2009-08-24 21:16 ` Dave Hansen
2009-08-24 21:42 ` Dave Hansen
0 siblings, 2 replies; 30+ messages in thread
From: David Rientjes @ 2009-08-24 20:41 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, linux-kernel
On Mon, 24 Aug 2009, Dave Hansen wrote:
> > diff --git a/include/linux/poison.h b/include/linux/poison.h
> > --- a/include/linux/poison.h
> > +++ b/include/linux/poison.h
> > @@ -65,6 +65,9 @@
> > #define MUTEX_DEBUG_INIT 0x11
> > #define MUTEX_DEBUG_FREE 0x22
> >
> > +/********** lib/flex_array.c **********/
> > +#define FLEX_ARRAY_FREE 0x6c /* for use-after-free poisoning */
>
> This seems like a good idea, but perhaps we should pick a non-ASCII
> character as the poison value. If someone ever tried to store strings
> as one-byte elements, they'd be in for a rude awakening the first time
> they store an 'l'.
>
I wasn't aware that storing an array of ASCII characters was a use case
for flex array, I'm having a hard type imagining such a user. We're
always going to have the possibility of conflict with the poison value
just from allowing eight byte element sizes, yet that possibility is still
going to exist if we disabled it on smaller elements and re-defined
FLEX_ARRAY_FREE as 0x6c6c, for example; the only thing that we've done is
eliminated the possibility of flex_array_shrink() for arrays consisting of
smaller elements. FLEX_ARRAY_FREE (or an additional poison value to
distinguish between use-uninitialized vs. use-after-free) must be used in
flex_array_clear() otherwise the cgroup patchset, the only proposed user
of this library code, could never shrink this array when pid's are free
like the kmalloc vs. vmalloc patchset could do.
On the other hand, I'd have no problem trying to eliminate
fa->total_nr_elements (since we already have fa->element_size) since we
can calculate it in real-time; the only problem is being able to
distinguish when the elements are being stored in struct flex_array vs.
being stored in struct flex_array_part. We could then use that
unsigned int in struct flex_array to store the number of inuse elements
which is an alternative implementation to flex_array_shrink(), yet I'd
still propose to keep the poisoning to reveal use-uninitialized.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-08-24 20:41 ` David Rientjes
@ 2009-08-24 21:16 ` Dave Hansen
2009-08-24 22:40 ` David Rientjes
2009-08-24 21:42 ` Dave Hansen
1 sibling, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2009-08-24 21:16 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, linux-kernel, Ben Blum
On Mon, 2009-08-24 at 13:41 -0700, David Rientjes wrote:
> LEX_ARRAY_FREE (or an additional poison value to
> distinguish between use-uninitialized vs. use-after-free) must be used in
> flex_array_clear() otherwise the cgroup patchset, the only proposed user
> of this library code, could never shrink this array when pid's are free
> like the kmalloc vs. vmalloc patchset could do.
Are you saying that you expected it to never reallocate the array, but
have a permanent flex_array and that it just calls flex_array_clear() on
the elements that it doesn't want any more, and the array ends up
sparsely populated? I can see why we'd need a poison value in that
case.
Or, are we just talking about a situation where we need to truncate the
pidlist?
-- Dave
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-08-24 21:16 ` Dave Hansen
@ 2009-08-24 22:40 ` David Rientjes
0 siblings, 0 replies; 30+ messages in thread
From: David Rientjes @ 2009-08-24 22:40 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, linux-kernel
On Mon, 24 Aug 2009, Dave Hansen wrote:
> On Mon, 2009-08-24 at 13:41 -0700, David Rientjes wrote:
> > LEX_ARRAY_FREE (or an additional poison value to
> > distinguish between use-uninitialized vs. use-after-free) must be used in
> > flex_array_clear() otherwise the cgroup patchset, the only proposed user
> > of this library code, could never shrink this array when pid's are free
> > like the kmalloc vs. vmalloc patchset could do.
>
> Are you saying that you expected it to never reallocate the array, but
> have a permanent flex_array and that it just calls flex_array_clear() on
> the elements that it doesn't want any more, and the array ends up
> sparsely populated? I can see why we'd need a poison value in that
> case.
>
Yeah, similar to using flex_array_free_parts() with a statically allocated
struct flex_array storing the array's metadata (which is also broken,
by the way, since it does no sanity check on either parameter in
FLEX_ARRAY_INIT()).
> Or, are we just talking about a situation where we need to truncate the
> pidlist?
>
The patchset Andrew referred to where he proposed using flex array instead
would re-kmalloc (or re-vmalloc, I guess) a smaller amount of memory and
then transfer elements into it to free the larger set of pages. If they
were to convert to using flex array, this would be achieved with
flex_array_shrink() so they lose no functionality with the switch.
I don't think they can use flex array for another reason, however, because
they want to sort the pidlist as its returned to userspace. That would
require dynamically allocating almost the same amount of memory as the
flex array would, sorting it, and then storing it in the array. This
doesn't seem to be the best example use of flex array since the mapping of
array indices to pid isn't really relevant and there's a longer iteration
through the array if it is sparsely populated, which is why I suggested an
alternative implementation of flex list.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-08-24 20:41 ` David Rientjes
2009-08-24 21:16 ` Dave Hansen
@ 2009-08-24 21:42 ` Dave Hansen
2009-08-24 22:44 ` David Rientjes
1 sibling, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2009-08-24 21:42 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, linux-kernel
On Mon, 2009-08-24 at 13:41 -0700, David Rientjes wrote:
> On the other hand, I'd have no problem trying to eliminate
> fa->total_nr_elements (since we already have fa->element_size) since we
> can calculate it in real-time; the only problem is being able to
> distinguish when the elements are being stored in struct flex_array vs.
> being stored in struct flex_array_part.
Yeah, that's the only reason it's there: to determine if we're using the
compact form or not. We could probably find another place to store
that, or just store a flag telling whether we're using that form or not.
We couldn't do the bounds-checking, but I'm OK with that.
But, honestly, it doesn't *really* bother me if we just use two 32-bit
int slots in the 'struct flex_array'. The 64-bit code has an extra 4
bytes now anyway.
> We could then use that
> unsigned int in struct flex_array to store the number of inuse elements
> which is an alternative implementation to flex_array_shrink(), yet I'd
> still propose to keep the poisoning to reveal use-uninitialized.
I kinda like the idea of truncating more, just because its simpler and
doesn't have any assumptions about the contents. But, you are right
that poisoning has some distinct benefits, namely allowing sparsely used
arrays to shrink.
-- Dave
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-08-24 21:42 ` Dave Hansen
@ 2009-08-24 22:44 ` David Rientjes
2009-09-08 22:26 ` David Rientjes
0 siblings, 1 reply; 30+ messages in thread
From: David Rientjes @ 2009-08-24 22:44 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, linux-kernel
On Mon, 24 Aug 2009, Dave Hansen wrote:
> On Mon, 2009-08-24 at 13:41 -0700, David Rientjes wrote:
> > On the other hand, I'd have no problem trying to eliminate
> > fa->total_nr_elements (since we already have fa->element_size) since we
> > can calculate it in real-time; the only problem is being able to
> > distinguish when the elements are being stored in struct flex_array vs.
> > being stored in struct flex_array_part.
>
> Yeah, that's the only reason it's there: to determine if we're using the
> compact form or not. We could probably find another place to store
> that, or just store a flag telling whether we're using that form or not.
> We couldn't do the bounds-checking, but I'm OK with that.
>
I'm starting to think that maybe an `unsigned int flags' is eventually
going to need to be added anyway: one reason is to serve this purpose of
identifying arrays that are stored in the base structure. Another
possible reason is to identify flex arrays that want the poison behavior
to identify use-uninitialized and use-after-free and would be passed such
as flex_array_alloc(size, total, FLEX_ARRAY_POISON, gfp) to isolate arrays
that may, as you said, store elements of types other than void * or those
significantly small enough that they would collide with the poison values
at a higher probability.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-08-24 22:44 ` David Rientjes
@ 2009-09-08 22:26 ` David Rientjes
2009-09-09 2:05 ` Li Zefan
2009-09-09 15:28 ` Dave Hansen
0 siblings, 2 replies; 30+ messages in thread
From: David Rientjes @ 2009-09-08 22:26 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, linux-kernel
On Mon, 24 Aug 2009, David Rientjes wrote:
> I'm starting to think that maybe an `unsigned int flags' is eventually
> going to need to be added anyway: one reason is to serve this purpose of
> identifying arrays that are stored in the base structure. Another
> possible reason is to identify flex arrays that want the poison behavior
> to identify use-uninitialized and use-after-free and would be passed such
> as flex_array_alloc(size, total, FLEX_ARRAY_POISON, gfp) to isolate arrays
> that may, as you said, store elements of types other than void * or those
> significantly small enough that they would collide with the poison values
> at a higher probability.
>
I was looking at converting a vmalloc'd array somewhere (anywhere, really)
to use flex array instead.
The cgroup pidarray can't be converted because it sorts the array, for
which flex array lacks support (and isn't really in its scope since the
mapping from index to element is then pointless). An alternative
implementation like flex list, which I've already eluded to, seems more
appropriate.
I also looked at oprofile's event_buffer, which vmalloc's elements of size
unsigned long up to a certain threshold. Flex array could accomodate
261,632 such elements on x86_64 and the buffer threshold defaults to
131,072, so it was looking like a good candidate.
The problem is that it copies the entire array to userspace up to a
certain element when reading the oprofile `buffer' file. That's not
possible with flex array without vmalloc'ing memory of size buffer_pos *
sizeof(unsigned long) and memcpy'ing the elements into it. So using flex
array in this case is pointless (and considerably slower on every read).
I'm struggling to find other examples. Dave, do you know of any
subsystems in the kernel that can readily be converted to using flex
array?
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-09-08 22:26 ` David Rientjes
@ 2009-09-09 2:05 ` Li Zefan
2009-09-09 3:18 ` David Rientjes
2009-09-09 15:28 ` Dave Hansen
1 sibling, 1 reply; 30+ messages in thread
From: Li Zefan @ 2009-09-09 2:05 UTC (permalink / raw)
To: David Rientjes; +Cc: Dave Hansen, Andrew Morton, linux-kernel
> I'm struggling to find other examples. Dave, do you know of any
> subsystems in the kernel that can readily be converted to using flex
> array?
Actually I'm planing to try to convert to use flex array in
kernel/trace/ftrace.c, and it needs some change in flex array,
and I'll have to check if it will have a performance effect
or not.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-09-09 2:05 ` Li Zefan
@ 2009-09-09 3:18 ` David Rientjes
2009-09-09 3:31 ` Li Zefan
0 siblings, 1 reply; 30+ messages in thread
From: David Rientjes @ 2009-09-09 3:18 UTC (permalink / raw)
To: Li Zefan; +Cc: Dave Hansen, Andrew Morton, linux-kernel
On Wed, 9 Sep 2009, Li Zefan wrote:
> > I'm struggling to find other examples. Dave, do you know of any
> > subsystems in the kernel that can readily be converted to using flex
> > array?
>
> Actually I'm planing to try to convert to use flex array in
> kernel/trace/ftrace.c, and it needs some change in flex array,
> and I'll have to check if it will have a performance effect
> or not.
>
That's cool, but it looks like none of those allocations currently would
ever exceed PAGE_SIZE. The return stack for each task would be a flex
array of 50 elements, each element being 40 bytes for a maximum array
size of 2KB. The tasklist would allocate a flex array of pointers to
struct ftrace_ret_stack with a maximum of 32 elements. On x86_64, that
has a maximum size of 256 bytes.
So while you would be converting existing kernel code to use the new
interface, which is great, it doesn't have any advantage over the existing
implementation. I was looking for a current use-case that would otherwise
use vmalloc because the entire array could not fit into a single page.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-09-09 3:18 ` David Rientjes
@ 2009-09-09 3:31 ` Li Zefan
2009-09-09 3:41 ` David Rientjes
0 siblings, 1 reply; 30+ messages in thread
From: Li Zefan @ 2009-09-09 3:31 UTC (permalink / raw)
To: David Rientjes; +Cc: Dave Hansen, Andrew Morton, linux-kernel
David Rientjes wrote:
> On Wed, 9 Sep 2009, Li Zefan wrote:
>
>>> I'm struggling to find other examples. Dave, do you know of any
>>> subsystems in the kernel that can readily be converted to using flex
>>> array?
>> Actually I'm planing to try to convert to use flex array in
>> kernel/trace/ftrace.c, and it needs some change in flex array,
>> and I'll have to check if it will have a performance effect
>> or not.
>>
>
> That's cool, but it looks like none of those allocations currently would
> ever exceed PAGE_SIZE. The return stack for each task would be a flex
> array of 50 elements, each element being 40 bytes for a maximum array
> size of 2KB. The tasklist would allocate a flex array of pointers to
> struct ftrace_ret_stack with a maximum of 32 elements. On x86_64, that
> has a maximum size of 256 bytes.
>
> So while you would be converting existing kernel code to use the new
> interface, which is great, it doesn't have any advantage over the existing
> implementation. I was looking for a current use-case that would otherwise
> use vmalloc because the entire array could not fit into a single page.
>
I was not talking about ftrace_ret_stack, I was talking about
struct ftrace_page and struct ftrace_profile_page. ;)
Each page holds an array of records, and there is a list linking
those pages. The total nr of elements can be the nr of functions
in kernel.
I think flex array can be used here to remove duplicate
implementation.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-09-09 3:31 ` Li Zefan
@ 2009-09-09 3:41 ` David Rientjes
2009-09-09 3:45 ` Li Zefan
0 siblings, 1 reply; 30+ messages in thread
From: David Rientjes @ 2009-09-09 3:41 UTC (permalink / raw)
To: Li Zefan; +Cc: Dave Hansen, Andrew Morton, linux-kernel
On Wed, 9 Sep 2009, Li Zefan wrote:
> I was not talking about ftrace_ret_stack, I was talking about
> struct ftrace_page and struct ftrace_profile_page. ;)
>
Gack, ftrace_page would probably be better off simply using the flex list
implementation I suggested which constructs a list from page->lru since it
doesn't appear to require indexing into the array, it simply requires an
iteration in all cases (or am I wrong?). Flex lists could easily be made
to support seqfiles.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-09-09 3:41 ` David Rientjes
@ 2009-09-09 3:45 ` Li Zefan
2009-09-09 4:15 ` David Rientjes
0 siblings, 1 reply; 30+ messages in thread
From: Li Zefan @ 2009-09-09 3:45 UTC (permalink / raw)
To: David Rientjes; +Cc: Dave Hansen, Andrew Morton, linux-kernel
11:41, David Rientjes wrote:
> On Wed, 9 Sep 2009, Li Zefan wrote:
>
>> I was not talking about ftrace_ret_stack, I was talking about
>> struct ftrace_page and struct ftrace_profile_page. ;)
>>
>
> Gack, ftrace_page would probably be better off simply using the flex list
> implementation I suggested which constructs a list from page->lru since it
> doesn't appear to require indexing into the array, it simply requires an
> iteration in all cases (or am I wrong?). Flex lists could easily be made
> to support seqfiles.
>
Right, iteration is needed but indexing is not. I thought
about flex list too, flex array has some limitations which
make it hard to find a real user.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-09-09 3:45 ` Li Zefan
@ 2009-09-09 4:15 ` David Rientjes
0 siblings, 0 replies; 30+ messages in thread
From: David Rientjes @ 2009-09-09 4:15 UTC (permalink / raw)
To: Li Zefan; +Cc: Dave Hansen, Andrew Morton, linux-kernel
On Wed, 9 Sep 2009, Li Zefan wrote:
> > Gack, ftrace_page would probably be better off simply using the flex list
> > implementation I suggested which constructs a list from page->lru since it
> > doesn't appear to require indexing into the array, it simply requires an
> > iteration in all cases (or am I wrong?). Flex lists could easily be made
> > to support seqfiles.
> >
>
> Right, iteration is needed but indexing is not. I thought
> about flex list too, flex array has some limitations which
> make it hard to find a real user.
>
I agree, the cgroup pidarray could be converted into a flex list
implementation as well as ftrace_page, it appears. All the metadata can
be stored by overloading struct page and linking them in a dynamically
expanding list on page->lru.
I'm thinking Andrew will want to see some user of this library in 2.6.32
that cannot be served by a flex list type of implementation if we're to
keep it merged in the kernel.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-09-08 22:26 ` David Rientjes
2009-09-09 2:05 ` Li Zefan
@ 2009-09-09 15:28 ` Dave Hansen
2009-09-09 19:18 ` David Rientjes
1 sibling, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2009-09-09 15:28 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, linux-kernel
On Tue, 2009-09-08 at 15:26 -0700, David Rientjes wrote:
> I'm struggling to find other examples. Dave, do you know of any
> subsystems in the kernel that can readily be converted to using flex
> array?
I don't know of any offhand. Someone mentioned oprofile buffers at some
point, but I looked through it and it didn't pop out at me.
-- Dave
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-09-09 15:28 ` Dave Hansen
@ 2009-09-09 19:18 ` David Rientjes
2009-09-09 19:22 ` Dave Hansen
0 siblings, 1 reply; 30+ messages in thread
From: David Rientjes @ 2009-09-09 19:18 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, linux-kernel
On Wed, 9 Sep 2009, Dave Hansen wrote:
> I don't know of any offhand. Someone mentioned oprofile buffers at some
> point, but I looked through it and it didn't pop out at me.
>
Their length and element size fit the prerequisites of using flex array,
but they cannot be copied to userspace as a string the way we can with
vmalloc'd memory.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-09-09 19:18 ` David Rientjes
@ 2009-09-09 19:22 ` Dave Hansen
2009-09-09 19:34 ` David Rientjes
0 siblings, 1 reply; 30+ messages in thread
From: Dave Hansen @ 2009-09-09 19:22 UTC (permalink / raw)
To: David Rientjes; +Cc: Andrew Morton, linux-kernel
On Wed, 2009-09-09 at 12:18 -0700, David Rientjes wrote:
> On Wed, 9 Sep 2009, Dave Hansen wrote:
> > I don't know of any offhand. Someone mentioned oprofile buffers at some
> > point, but I looked through it and it didn't pop out at me.
>
> Their length and element size fit the prerequisites of using flex array,
> but they cannot be copied to userspace as a string the way we can with
> vmalloc'd memory.
It shouldn't be too hard to do that, though. Although a bit silly to
have make a copy_to_user() call for each ->parts[] page, it would be
pretty easy to implement.
-- Dave
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 3/4 -mm] flex_array: poison free elements
2009-09-09 19:22 ` Dave Hansen
@ 2009-09-09 19:34 ` David Rientjes
0 siblings, 0 replies; 30+ messages in thread
From: David Rientjes @ 2009-09-09 19:34 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, linux-kernel
On Wed, 9 Sep 2009, Dave Hansen wrote:
> It shouldn't be too hard to do that, though. Although a bit silly to
> have make a copy_to_user() call for each ->parts[] page, it would be
> pretty easy to implement.
>
Right, but the event buffer becomes slower as a whole because of the
multiple copy_to_user() on read, indirection into each flex_array_part,
and add_event_entry() would now require a memcpy.
^ permalink raw reply [flat|nested] 30+ messages in thread
* [patch 4/4 -mm] flex_array: add flex_array_shrink function
2009-08-21 23:21 [patch 1/4 -mm] flex_array: convert element_nr formals to unsigned David Rientjes
2009-08-21 23:21 ` [patch 2/4 -mm] flex_array: add flex_array_clear function David Rientjes
2009-08-21 23:21 ` [patch 3/4 -mm] flex_array: poison free elements David Rientjes
@ 2009-08-21 23:21 ` David Rientjes
2009-08-21 23:49 ` Andrew Morton
2 siblings, 1 reply; 30+ messages in thread
From: David Rientjes @ 2009-08-21 23:21 UTC (permalink / raw)
To: Andrew Morton; +Cc: Dave Hansen, linux-kernel
This patch adds a new function to the flex_array API:
int flex_array_shrink(struct flex_array *fa)
This function will free all unused second-level pages. Since elements
are now poisoned if they are not allocated with __GFP_ZERO, it's
possible to identify parts that consist solely of unused elements.
flex_array_shrink() returns the number of pages freed.
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: David Rientjes <rientjes@google.com>
---
include/linux/flex_array.h | 1 +
lib/flex_array.c | 40 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 41 insertions(+), 0 deletions(-)
diff --git a/include/linux/flex_array.h b/include/linux/flex_array.h
--- a/include/linux/flex_array.h
+++ b/include/linux/flex_array.h
@@ -46,5 +46,6 @@ int flex_array_put(struct flex_array *fa, unsigned int element_nr, void *src,
gfp_t flags);
int flex_array_clear(struct flex_array *fa, unsigned int element_nr);
void *flex_array_get(struct flex_array *fa, unsigned int element_nr);
+int flex_array_shrink(struct flex_array *fa);
#endif /* _FLEX_ARRAY_H */
diff --git a/lib/flex_array.c b/lib/flex_array.c
--- a/lib/flex_array.c
+++ b/lib/flex_array.c
@@ -291,3 +291,43 @@ void *flex_array_get(struct flex_array *fa, unsigned int element_nr)
}
return &part->elements[index_inside_part(fa, element_nr)];
}
+
+static int part_is_free(struct flex_array_part *part)
+{
+ int i;
+
+ for (i = 0; i < sizeof(struct flex_array_part); i++)
+ if (part->elements[i] != FLEX_ARRAY_FREE)
+ return 0;
+ return 1;
+}
+
+/**
+ * flex_array_shrink - free unused second-level pages
+ *
+ * Frees all second-level pages that consist solely of unused
+ * elements. Returns the number of pages freed.
+ *
+ * Locking must be provided by the caller.
+ */
+int flex_array_shrink(struct flex_array *fa)
+{
+ struct flex_array_part *part;
+ int max_part = nr_base_part_ptrs();
+ int part_nr;
+ int ret = 0;
+
+ if (elements_fit_in_base(fa))
+ return ret;
+ for (part_nr = 0; part_nr < max_part; part_nr++) {
+ part = fa->parts[part_nr];
+ if (!part)
+ continue;
+ if (part_is_free(part)) {
+ fa->parts[part_nr] = NULL;
+ kfree(part);
+ ret++;
+ }
+ }
+ return ret;
+}
^ permalink raw reply [flat|nested] 30+ messages in thread* Re: [patch 4/4 -mm] flex_array: add flex_array_shrink function
2009-08-21 23:21 ` [patch 4/4 -mm] flex_array: add flex_array_shrink function David Rientjes
@ 2009-08-21 23:49 ` Andrew Morton
2009-08-22 0:02 ` Randy Dunlap
2009-08-22 21:28 ` David Rientjes
0 siblings, 2 replies; 30+ messages in thread
From: Andrew Morton @ 2009-08-21 23:49 UTC (permalink / raw)
To: David Rientjes; +Cc: dave, linux-kernel, Randy Dunlap
On Fri, 21 Aug 2009 16:21:49 -0700 (PDT)
David Rientjes <rientjes@google.com> wrote:
> +/**
> + * flex_array_shrink - free unused second-level pages
> + *
> + * Frees all second-level pages that consist solely of unused
> + * elements. Returns the number of pages freed.
> + *
> + * Locking must be provided by the caller.
> + */
> +int flex_array_shrink(struct flex_array *fa)
It's logical but afaik unconventional that none of the flex_array
kerneldoc actually documents the `struct flex_array *fa' argument.
We're all waiting with bated breath to see who will be the first to use
flex_array in live code.
Sometime we'll need to move flex_array.o from lib-y to obj-y and fill
it up with EXPORT_SYMBOL()s. 2.6.32 I expect.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 4/4 -mm] flex_array: add flex_array_shrink function
2009-08-21 23:49 ` Andrew Morton
@ 2009-08-22 0:02 ` Randy Dunlap
2009-08-22 21:28 ` David Rientjes
1 sibling, 0 replies; 30+ messages in thread
From: Randy Dunlap @ 2009-08-22 0:02 UTC (permalink / raw)
To: Andrew Morton; +Cc: David Rientjes, dave, linux-kernel
On Fri, 21 Aug 2009 16:49:35 -0700 Andrew Morton wrote:
> On Fri, 21 Aug 2009 16:21:49 -0700 (PDT)
> David Rientjes <rientjes@google.com> wrote:
>
> > +/**
> > + * flex_array_shrink - free unused second-level pages
> > + *
> > + * Frees all second-level pages that consist solely of unused
> > + * elements. Returns the number of pages freed.
> > + *
> > + * Locking must be provided by the caller.
> > + */
> > +int flex_array_shrink(struct flex_array *fa)
>
> It's logical but afaik unconventional that none of the flex_array
> kerneldoc actually documents the `struct flex_array *fa' argument.
IF (big if) someone had used "make htmldocs" and checked it for errors/warnings,
(see 11. in Documentation/SubmitChecklist), this would have been flagged with:
Warning(some_filename.c:1012): No description found for parameter 'fa'
> We're all waiting with bated breath to see who will be the first to use
> flex_array in live code.
>
> Sometime we'll need to move flex_array.o from lib-y to obj-y and fill
> it up with EXPORT_SYMBOL()s. 2.6.32 I expect.
---
~Randy
LPC 2009, Sept. 23-25, Portland, Oregon
http://linuxplumbersconf.org/2009/
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [patch 4/4 -mm] flex_array: add flex_array_shrink function
2009-08-21 23:49 ` Andrew Morton
2009-08-22 0:02 ` Randy Dunlap
@ 2009-08-22 21:28 ` David Rientjes
1 sibling, 0 replies; 30+ messages in thread
From: David Rientjes @ 2009-08-22 21:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: Dave Hansen, linux-kernel, Randy Dunlap
On Fri, 21 Aug 2009, Andrew Morton wrote:
> > +/**
> > + * flex_array_shrink - free unused second-level pages
> > + *
> > + * Frees all second-level pages that consist solely of unused
> > + * elements. Returns the number of pages freed.
> > + *
> > + * Locking must be provided by the caller.
> > + */
> > +int flex_array_shrink(struct flex_array *fa)
>
> It's logical but afaik unconventional that none of the flex_array
> kerneldoc actually documents the `struct flex_array *fa' argument.
>
In the interest of completeness (vs. more doc clutter), we'd need to add
seven or eight instances of "@fa: the flex array to manipulate," although
that should be fairly obvious.
More of a priority might be to document flex_array_free() which lacks any
commentary or add kerneldoc lines for the gfp_t formal of the three
functions that use them.
> We're all waiting with bated breath to see who will be the first to use
> flex_array in live code.
>
Now that flex_array_shrink() is in -mm, the cgroup code to manipulate
arrays of pids could probably now use them without losing any
functionality.
Beyond that, I don't know of any other users with enough incentive to
immediately convert. Its scope is unfortunately limited because it's an
array and not a list, the latter of which you can efficiently iterate,
sort, eliminate upper bounds, etc.
An alternative implementation, like a flex list, would probably find more
users and could also be used by the cgroup code to manage pids. Flex
lists would be unbounded and can dynamically expand with linkage through
page->lru when additional elements are pushed.
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2009-09-09 19:34 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-21 23:21 [patch 1/4 -mm] flex_array: convert element_nr formals to unsigned David Rientjes
2009-08-21 23:21 ` [patch 2/4 -mm] flex_array: add flex_array_clear function David Rientjes
2009-08-24 15:41 ` Dave Hansen
2009-08-24 20:29 ` David Rientjes
2009-08-24 20:38 ` Dave Hansen
2009-08-24 20:50 ` David Rientjes
2009-08-24 21:28 ` Dave Hansen
2009-08-24 22:32 ` David Rientjes
2009-08-21 23:21 ` [patch 3/4 -mm] flex_array: poison free elements David Rientjes
2009-08-24 15:56 ` Dave Hansen
2009-08-24 20:41 ` David Rientjes
2009-08-24 21:16 ` Dave Hansen
2009-08-24 22:40 ` David Rientjes
2009-08-24 21:42 ` Dave Hansen
2009-08-24 22:44 ` David Rientjes
2009-09-08 22:26 ` David Rientjes
2009-09-09 2:05 ` Li Zefan
2009-09-09 3:18 ` David Rientjes
2009-09-09 3:31 ` Li Zefan
2009-09-09 3:41 ` David Rientjes
2009-09-09 3:45 ` Li Zefan
2009-09-09 4:15 ` David Rientjes
2009-09-09 15:28 ` Dave Hansen
2009-09-09 19:18 ` David Rientjes
2009-09-09 19:22 ` Dave Hansen
2009-09-09 19:34 ` David Rientjes
2009-08-21 23:21 ` [patch 4/4 -mm] flex_array: add flex_array_shrink function David Rientjes
2009-08-21 23:49 ` Andrew Morton
2009-08-22 0:02 ` Randy Dunlap
2009-08-22 21:28 ` David Rientjes
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox