* [PATCH 01/02] cpuset bitmap and mask remap operators
@ 2005-10-24 7:27 Paul Jackson
2005-10-24 7:27 ` [PATCH 02/02] cpuset automatic numa mempolicy rebinding Paul Jackson
2005-10-24 7:48 ` [PATCH 01/02] cpuset bitmap and mask remap operators Andrew Morton
0 siblings, 2 replies; 10+ messages in thread
From: Paul Jackson @ 2005-10-24 7:27 UTC (permalink / raw)
To: Andrew Morton
Cc: Simon.Derr, linux-kernel, Andi Kleen, Linus Torvalds,
Paul Jackson, Christoph Lameter
In the forthcoming task migration support, a key calculation will be
mapping cpu and node numbers from the old set to the new set while
preserving their cpuset-relative offsets.
For example, if a task and its pages on nodes 8-11 are being migrated
to nodes 24-27, then pages on node 9 (the 2nd node in the old set)
should be moved to node 25 (the 2nd node in the new set).
As with other bitmap operations, the proper way to code this is
to provide the underlying calculation in lib/bitmap.c, and then to
provide the usual cpumask and nodemask wrappers.
This patch provides that. These operations are termed 'remap'
operations. Both remapping a single bit and remapping a set of bits
are supported.
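For illustration, here is a minimal usage sketch of the node 8-11 to
24-27 example above (not part of the patch; the variable names are
hypothetical), using the nodemask wrappers added below:

	nodemask_t old_mems, new_mems, task_mems, remapped;
	int newnode;

	/* old placement: nodes 8-11; new placement: nodes 24-27 */
	nodes_clear(old_mems);
	nodes_clear(new_mems);
	node_set(8, old_mems); node_set(9, old_mems);
	node_set(10, old_mems); node_set(11, old_mems);
	node_set(24, new_mems); node_set(25, new_mems);
	node_set(26, new_mems); node_set(27, new_mems);

	/* single bit: the 2nd node of the old set maps to the 2nd of the new */
	newnode = node_remap(9, old_mems, new_mems);	/* newnode == 25 */

	/* whole mask: each set bit is remapped the same way */
	nodes_clear(task_mems);
	node_set(9, task_mems); node_set(11, task_mems);
	nodes_remap(remapped, task_mems, old_mems, new_mems);	/* {25, 27} */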
Signed-off-by: Paul Jackson <pj@sgi.com>
---
include/linux/bitmap.h | 6 +
include/linux/cpumask.h | 20 +++++
include/linux/nodemask.h | 20 +++++
lib/bitmap.c | 166 +++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 212 insertions(+)
--- 2.6.14-rc4-mm1-cpuset-patches.orig/include/linux/bitmap.h 2005-10-23 18:43:58.257209273 -0700
+++ 2.6.14-rc4-mm1-cpuset-patches/include/linux/bitmap.h 2005-10-23 18:44:01.304116059 -0700
@@ -40,6 +40,8 @@
* bitmap_weight(src, nbits) Hamming Weight: number set bits
* bitmap_shift_right(dst, src, n, nbits) *dst = *src >> n
* bitmap_shift_left(dst, src, n, nbits) *dst = *src << n
+ * bitmap_remap(dst, src, old, new, nbits) *dst = map(old, new)(src)
+ * bitmap_bitremap(oldbit, old, new, nbits) newbit = map(old, new)(oldbit)
* bitmap_scnprintf(buf, len, src, nbits) Print bitmap src to buf
* bitmap_parse(ubuf, ulen, dst, nbits) Parse bitmap dst from user buf
* bitmap_scnlistprintf(buf, len, src, nbits) Print bitmap src as list to buf
@@ -104,6 +106,10 @@ extern int bitmap_scnlistprintf(char *bu
const unsigned long *src, int nbits);
extern int bitmap_parselist(const char *buf, unsigned long *maskp,
int nmaskbits);
+extern void bitmap_remap(unsigned long *dst, const unsigned long *src,
+ const unsigned long *old, const unsigned long *new, int bits);
+extern int bitmap_bitremap(int oldbit,
+ const unsigned long *old, const unsigned long *new, int bits);
extern int bitmap_find_free_region(unsigned long *bitmap, int bits, int order);
extern void bitmap_release_region(unsigned long *bitmap, int pos, int order);
extern int bitmap_allocate_region(unsigned long *bitmap, int pos, int order);
--- 2.6.14-rc4-mm1-cpuset-patches.orig/include/linux/cpumask.h 2005-10-23 18:43:58.258185846 -0700
+++ 2.6.14-rc4-mm1-cpuset-patches/include/linux/cpumask.h 2005-10-23 18:44:01.305092631 -0700
@@ -12,6 +12,8 @@
* see bitmap_scnprintf() and bitmap_parse() in lib/bitmap.c.
* For details of cpulist_scnprintf() and cpulist_parse(), see
* bitmap_scnlistprintf() and bitmap_parselist(), also in bitmap.c.
+ * For details of cpu_remap(), see bitmap_bitremap in lib/bitmap.c.
+ * For details of cpus_remap(), see bitmap_remap in lib/bitmap.c.
*
* The available cpumask operations are:
*
@@ -50,6 +52,8 @@
* int cpumask_parse(ubuf, ulen, mask) Parse ascii string as cpumask
* int cpulist_scnprintf(buf, len, mask) Format cpumask as list for printing
* int cpulist_parse(buf, map) Parse ascii string as cpulist
+ * int cpu_remap(oldbit, old, new) newbit = map(old, new)(oldbit)
+ * void cpus_remap(dst, src, old, new) *dst = map(old, new)(src)
*
* for_each_cpu_mask(cpu, mask) for-loop cpu over mask
*
@@ -294,6 +298,22 @@ static inline int __cpulist_parse(const
return bitmap_parselist(buf, dstp->bits, nbits);
}
+#define cpu_remap(oldbit, old, new) \
+ __cpu_remap((oldbit), &(old), &(new), NR_CPUS)
+static inline int __cpu_remap(int oldbit,
+ const cpumask_t *oldp, const cpumask_t *newp, int nbits)
+{
+ return bitmap_bitremap(oldbit, oldp->bits, newp->bits, nbits);
+}
+
+#define cpus_remap(dst, src, old, new) \
+ __cpus_remap(&(dst), &(src), &(old), &(new), NR_CPUS)
+static inline void __cpus_remap(cpumask_t *dstp, const cpumask_t *srcp,
+ const cpumask_t *oldp, const cpumask_t *newp, int nbits)
+{
+ bitmap_remap(dstp->bits, srcp->bits, oldp->bits, newp->bits, nbits);
+}
+
#if NR_CPUS > 1
#define for_each_cpu_mask(cpu, mask) \
for ((cpu) = first_cpu(mask); \
--- 2.6.14-rc4-mm1-cpuset-patches.orig/include/linux/nodemask.h 2005-10-23 18:43:58.282600163 -0700
+++ 2.6.14-rc4-mm1-cpuset-patches/include/linux/nodemask.h 2005-10-23 18:44:01.306069204 -0700
@@ -12,6 +12,8 @@
* see bitmap_scnprintf() and bitmap_parse() in lib/bitmap.c.
* For details of nodelist_scnprintf() and nodelist_parse(), see
* bitmap_scnlistprintf() and bitmap_parselist(), also in bitmap.c.
+ * For details of node_remap(), see bitmap_bitremap in lib/bitmap.c.
+ * For details of nodes_remap(), see bitmap_remap in lib/bitmap.c.
*
* The available nodemask operations are:
*
@@ -52,6 +54,8 @@
* int nodemask_parse(ubuf, ulen, mask) Parse ascii string as nodemask
* int nodelist_scnprintf(buf, len, mask) Format nodemask as list for printing
* int nodelist_parse(buf, map) Parse ascii string as nodelist
+ * int node_remap(oldbit, old, new) newbit = map(old, new)(oldbit)
+ * void nodes_remap(dst, src, old, new) *dst = map(old, new)(src)
*
* for_each_node_mask(node, mask) for-loop node over mask
*
@@ -307,6 +311,22 @@ static inline int __nodelist_parse(const
return bitmap_parselist(buf, dstp->bits, nbits);
}
+#define node_remap(oldbit, old, new) \
+ __node_remap((oldbit), &(old), &(new), MAX_NUMNODES)
+static inline int __node_remap(int oldbit,
+ const nodemask_t *oldp, const nodemask_t *newp, int nbits)
+{
+ return bitmap_bitremap(oldbit, oldp->bits, newp->bits, nbits);
+}
+
+#define nodes_remap(dst, src, old, new) \
+ __nodes_remap(&(dst), &(src), &(old), &(new), MAX_NUMNODES)
+static inline void __nodes_remap(nodemask_t *dstp, const nodemask_t *srcp,
+ const nodemask_t *oldp, const nodemask_t *newp, int nbits)
+{
+ bitmap_remap(dstp->bits, srcp->bits, oldp->bits, newp->bits, nbits);
+}
+
#if MAX_NUMNODES > 1
#define for_each_node_mask(node, mask) \
for ((node) = first_node(mask); \
--- 2.6.14-rc4-mm1-cpuset-patches.orig/lib/bitmap.c 2005-10-23 18:43:58.289436172 -0700
+++ 2.6.14-rc4-mm1-cpuset-patches/lib/bitmap.c 2005-10-23 18:49:18.271292080 -0700
@@ -511,6 +511,172 @@ int bitmap_parselist(const char *bp, uns
}
EXPORT_SYMBOL(bitmap_parselist);
+/*
+ * bitmap_pos_to_ord(buf, pos, bits)
+ * @buf: pointer to a bitmap
+ * @pos: a bit position in @buf (0 <= @pos < @bits)
+ * @bits: number of valid bit positions in @buf
+ *
+ * Map the bit at position @pos in @buf (of length @bits) to the
+ * ordinal of which set bit it is. If it is not set or if @pos
+ * is not a valid bit position, map to zero (0).
+ *
+ * If, for example, just bits 4 through 7 are set in @buf, then @pos
+ * values 4 through 7 will get mapped to 0 through 3, respectively,
+ * and other @pos values will get mapped to 0. When @pos value 7
+ * gets mapped to (returns) @ord value 3 in this example, that means
+ * that bit 7 is the 3rd (starting with 0th) set bit in @buf.
+ *
+ * The bit positions 0 through @bits - 1 are valid positions in @buf.
+ */
+static int bitmap_pos_to_ord(const unsigned long *buf, int pos, int bits)
+{
+ int ord = 0;
+
+ if (pos >= 0 && pos < bits) {
+ int i;
+
+ for (i = find_first_bit(buf, bits);
+ i < pos;
+ i = find_next_bit(buf, bits, i + 1))
+ ord++;
+ if (i > pos)
+ ord = 0;
+ }
+ return ord;
+}
+
+/**
+ * bitmap_ord_to_pos(buf, ord, bits)
+ * @buf: pointer to bitmap
+ * @ord: ordinal bit position (n-th set bit, n >= 0)
+ * @bits: number of valid bit positions in @buf
+ *
+ * Map the ordinal offset of bit @ord in @buf to its position in @buf.
+ * If @ord is not the ordinal offset of a set bit in @buf, map to zero (0).
+ *
+ * If, for example, just bits 4 through 7 are set in @buf, then @ord
+ * values 0 through 3 will get mapped to 4 through 7, respectively,
+ * and all other @ord values will get mapped to 0. When @ord value 3
+ * gets mapped to (returns) @pos value 7 in this example, that means
+ * that the 3rd set bit (starting with 0th) is at position 7 in @buf.
+ *
+ * The bit positions 0 through @bits - 1 are valid positions in @buf.
+ */
+static int bitmap_ord_to_pos(const unsigned long *buf, int ord, int bits)
+{
+ int pos = 0;
+
+ if (ord >= 0 && ord < bits) {
+ int i;
+
+ for (i = find_first_bit(buf, bits);
+ i < bits && ord > 0;
+ i = find_next_bit(buf, bits, i + 1))
+ ord--;
+ if (i < bits && ord == 0)
+ pos = i;
+ }
+
+ return pos;
+}
+
+/**
+ * bitmap_remap - Apply map defined by a pair of bitmaps to another bitmap
+ * @src: subset to be remapped
+ * @dst: remapped result
+ * @old: defines domain of map
+ * @new: defines range of map
+ * @bits: number of bits in each of these bitmaps
+ *
+ * Let @old and @new define a mapping of bit positions, such that
+ * whatever position is held by the n-th set bit in @old is mapped
+ * to the n-th set bit in @new. In the more general case, allowing
+ * for the possibility that the weight 'w' of @new is less than the
+ * weight of @old, map the position of the n-th set bit in @old to
+ * the position of the m-th set bit in @new, where m == n % w.
+ *
+ * If either of the @old or @new bitmaps is empty, or if @src and @dst
+ * point to the same location, then this routine does nothing.
+ *
+ * The positions of unset bits in @old are mapped to the position of
+ * the first set bit in @new.
+ *
+ * Apply the above specified mapping to @src, placing the result in
+ * @dst, clearing any bits previously set in @dst.
+ *
+ * The resulting value of @dst will have either the same weight as
+ * @src, or less weight in the general case that the mapping wasn't
+ * injective due to the weight of @new being less than that of @old.
+ * The resulting value of @dst will never have greater weight than
+ * that of @src, except perhaps in the case that one of the above
+ * conditions was not met and this routine just returned.
+ *
+ * For example, let's say that @old has bits 4 through 7 set, and
+ * @new has bits 12 through 15 set. This defines the mapping of bit
+ * position 4 to 12, 5 to 13, 6 to 14 and 7 to 15, and of all other
+ * bit positions to 12 (the first set bit in @new). So if, say, @src
+ * comes into this routine with bits 1, 5 and 7 set, then @dst should
+ * leave with bits 12, 13 and 15 set.
+ */
+void bitmap_remap(unsigned long *dst, const unsigned long *src,
+ const unsigned long *old, const unsigned long *new,
+ int bits)
+{
+ int s;
+
+ if (bitmap_weight(old, bits) == 0)
+ return;
+ if (bitmap_weight(new, bits) == 0)
+ return;
+ if (dst == src) /* following doesn't handle inplace remaps */
+ return;
+
+ bitmap_zero(dst, bits);
+ for (s = find_first_bit(src, bits);
+ s < bits;
+ s = find_next_bit(src, bits, s + 1)) {
+ int x = bitmap_pos_to_ord(old, s, bits);
+ int y = bitmap_ord_to_pos(new, x, bits);
+ set_bit(y, dst);
+ }
+}
+EXPORT_SYMBOL(bitmap_remap);
+
+/**
+ * bitmap_bitremap - Apply map defined by a pair of bitmaps to a single bit
+ * @oldbit: bit position to be mapped
+ * @old: defines domain of map
+ * @new: defines range of map
+ * @bits: number of bits in each of these bitmaps
+ *
+ * Let @old and @new define a mapping of bit positions, such that
+ * whatever position is held by the n-th set bit in @old is mapped
+ * to the n-th set bit in @new. In the more general case, allowing
+ * for the possibility that the weight 'w' of @new is less than the
+ * weight of @old, map the position of the n-th set bit in @old to
+ * the position of the m-th set bit in @new, where m == n % w.
+ *
+ * The positions of unset bits in @old are mapped to the position of
+ * the first set bit in @new.
+ *
+ * Apply the above specified mapping to bit position @oldbit, returning
+ * the new bit position.
+ *
+ * For example, let's say that @old has bits 4 through 7 set, and
+ * @new has bits 12 through 15 set. This defines the mapping of bit
+ * position 4 to 12, 5 to 13, 6 to 14 and 7 to 15, and of all other
+ * bit positions to 12 (the first set bit in @new). So if, say, @oldbit
+ * is 5, then this routine returns 13.
+ */
+int bitmap_bitremap(int oldbit, const unsigned long *old,
+ const unsigned long *new, int bits)
+{
+ int x = bitmap_pos_to_ord(old, oldbit, bits);
+ return bitmap_ord_to_pos(new, x, bits);
+}
+EXPORT_SYMBOL(bitmap_bitremap);
+
/**
* bitmap_find_free_region - find a contiguous aligned mem region
* @bitmap: an array of unsigned longs corresponding to the bitmap
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
* [PATCH 02/02] cpuset automatic numa mempolicy rebinding
2005-10-24 7:27 [PATCH 01/02] cpuset bitmap and mask remap operators Paul Jackson
@ 2005-10-24 7:27 ` Paul Jackson
2005-10-24 8:36 ` Dave Hansen
2005-10-24 7:48 ` [PATCH 01/02] cpuset bitmap and mask remap operators Andrew Morton
1 sibling, 1 reply; 10+ messages in thread
From: Paul Jackson @ 2005-10-24 7:27 UTC (permalink / raw)
To: Andrew Morton
Cc: Simon.Derr, linux-kernel, Christoph Lameter, Linus Torvalds,
Paul Jackson, Andi Kleen
This patch automatically updates a task's NUMA mempolicy when its cpuset
memory placement changes. It does so within the context of the task,
without any need to support low level external mempolicy manipulation.
If a system is not using cpusets, or if running on a system with
just the root (all-encompassing) cpuset, then this remap is a no-op.
Only when a task is moved between cpusets, or a cpuset's memory
placement is changed, does the following apply. Otherwise, the main
routine below, rebind_policy(), is not even called.
When mixing cpusets, scheduler affinity, and NUMA mempolicies, the
essential role of cpusets is to place jobs (several related tasks) on a
set of CPUs and Memory Nodes, the essential role of sched_setaffinity
is to manage a job's processor placement within its allowed cpuset,
and the essential role of NUMA mempolicy (mbind, set_mempolicy)
is to manage a job's memory placement within its allowed cpuset.
However, CPU affinity and NUMA memory placement are managed within
the kernel using absolute system-wide numbering, not cpuset-relative
numbering.
This is fine until a job is migrated to a different cpuset, or,
equivalently, a job's cpuset is moved to different CPUs and Memory Nodes.
Then the CPU affinity and NUMA memory placement of the tasks in the
job need to be updated, to preserve their cpuset-relative position.
This can be done for CPU affinity using sched_setaffinity() from user
code, as one task can modify another's CPU affinity. This cannot be
done from an external task for NUMA memory placement, as that can
only be modified in the context of the task using it.
However, it is easy enough to remap a task's NUMA mempolicy automatically
when a task is migrated, using the existing cpuset mechanism to trigger
a refresh of a task's memory placement after its cpuset has changed.
All that is needed is the old and new nodemasks, and notice to the
task that it needs to rebind its mempolicy. The task's mems_allowed
has the old mask, the task's cpuset has the new mask, and the existing
cpuset_update_current_mems_allowed() mechanism provides the notice.
The bitmap/cpumask/nodemask remap operators provide the cpuset
relative calculations.
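In code terms the flow is roughly the following; this is a condensed
sketch of the kernel/cpuset.c and mm/mempolicy.c hunks below, not a
separate implementation:

	/* refresh_mems(), running in the affected task's own context */
	nodemask_t oldmem = current->mems_allowed;

	/* ... under callback_sem and task_lock, current->mems_allowed and
	   current->cpuset_mems_generation are refreshed from the cpuset ... */

	if (!nodes_equal(oldmem, current->mems_allowed))
		numa_policy_rebind(&oldmem, &current->mems_allowed);

	/* numa_policy_rebind() -> rebind_policy() then remaps
	   current->mempolicy from oldmem to the new mems_allowed,
	   using the nodes_remap()/node_remap() operators from patch 01. */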
This patch leaves open a couple of issues:
1) Updating vma and shmfs/tmpfs/hugetlbfs memory policies:
These mempolicies may reference nodes outside of those allowed
to the current task by its cpuset. Tasks are migrated as part of
jobs, which reside on what might be several cpusets in a subtree.
When such a job is migrated, all NUMA memory policy references
to nodes within that cpuset subtree should be translated, and
references to any nodes outside that subtree should be left
untouched. A future patch will provide the cpuset mechanism
needed to mark such subtrees. With that patch, we will be able
to correctly migrate these other memory policies across a job
migration.
2) Updating cpuset, affinity and memory policies in user space:
This is harder. Any placement state stored in user space
using system-wide numbering will be invalidated across a
migration. More work will be required to provide user code
with a migration-safe means to manage its cpuset relative
placement, while preserving the current APIs that pass
system-wide numbers, not cpuset-relative numbers, across the
kernel-user boundary.
Signed-off-by: Paul Jackson <pj@sgi.com>
---
include/linux/mempolicy.h | 6 ++++
kernel/cpuset.c | 4 ++
mm/mempolicy.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 74 insertions(+)
--- 2.6.14-rc4-mm1-cpuset-patches.orig/include/linux/mempolicy.h 2005-10-23 21:33:27.371933009 -0700
+++ 2.6.14-rc4-mm1-cpuset-patches/include/linux/mempolicy.h 2005-10-23 21:34:42.485999678 -0700
@@ -155,6 +155,7 @@ struct mempolicy *get_vma_policy(struct
extern void numa_default_policy(void);
extern void numa_policy_init(void);
+extern void numa_policy_rebind(const nodemask_t *old, const nodemask_t *new);
extern struct mempolicy default_policy;
#else
@@ -227,6 +228,11 @@ static inline void numa_default_policy(v
{
}
+static inline void numa_policy_rebind(const nodemask_t *old,
+ const nodemask_t *new)
+{
+}
+
#endif /* CONFIG_NUMA */
#endif /* __KERNEL__ */
--- 2.6.14-rc4-mm1-cpuset-patches.orig/kernel/cpuset.c 2005-10-23 21:33:27.494004597 -0700
+++ 2.6.14-rc4-mm1-cpuset-patches/kernel/cpuset.c 2005-10-23 21:34:42.490882542 -0700
@@ -32,6 +32,7 @@
#include <linux/kernel.h>
#include <linux/kmod.h>
#include <linux/list.h>
+#include <linux/mempolicy.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/mount.h>
@@ -600,6 +601,7 @@ static void refresh_mems(void)
if (current->cpuset_mems_generation != my_cpusets_mem_gen) {
struct cpuset *cs;
+ nodemask_t oldmem = current->mems_allowed;
down(&callback_sem);
task_lock(current);
@@ -608,6 +610,8 @@ static void refresh_mems(void)
current->cpuset_mems_generation = cs->mems_generation;
task_unlock(current);
up(&callback_sem);
+ if (!nodes_equal(oldmem, current->mems_allowed))
+ numa_policy_rebind(&oldmem, &current->mems_allowed);
}
}
--- 2.6.14-rc4-mm1-cpuset-patches.orig/mm/mempolicy.c 2005-10-23 21:33:27.494981170 -0700
+++ 2.6.14-rc4-mm1-cpuset-patches/mm/mempolicy.c 2005-10-23 23:22:15.814826285 -0700
@@ -457,6 +457,7 @@ long do_get_mempolicy(int *policy, nodem
struct vm_area_struct *vma = NULL;
struct mempolicy *pol = current->mempolicy;
+ cpuset_update_current_mems_allowed();
if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
return -EINVAL;
if (flags & MPOL_F_ADDR) {
@@ -1205,3 +1206,66 @@ void numa_default_policy(void)
{
do_set_mempolicy(MPOL_DEFAULT, NULL);
}
+
+/* Migrate a policy to a different set of nodes */
+static void rebind_policy(struct mempolicy *pol, const nodemask_t *old,
+ const nodemask_t *new)
+{
+ nodemask_t tmp;
+
+ if (!pol)
+ return;
+
+ switch (pol->policy) {
+ case MPOL_DEFAULT:
+ break;
+ case MPOL_INTERLEAVE:
+ nodes_remap(tmp, pol->v.nodes, *old, *new);
+ pol->v.nodes = tmp;
+ current->il_next = node_remap(current->il_next, *old, *new);
+ break;
+ case MPOL_PREFERRED:
+ pol->v.preferred_node = node_remap(pol->v.preferred_node,
+ *old, *new);
+ break;
+ case MPOL_BIND: {
+ nodemask_t nodes;
+ struct zone **z;
+ struct zonelist *zonelist;
+
+ nodes_clear(nodes);
+ for (z = pol->v.zonelist->zones; *z; z++)
+ node_set((*z)->zone_pgdat->node_id, nodes);
+ nodes_remap(tmp, nodes, *old, *new);
+ nodes = tmp;
+
+ zonelist = bind_zonelist(&nodes);
+
+ /* If no mem, then zonelist is NULL and we keep old zonelist.
+ * If that old zonelist has no remaining mems_allowed nodes,
+ * then zonelist_policy() will "FALL THROUGH" to MPOL_DEFAULT.
+ */
+
+ if (zonelist) {
+ /* Good - got mem - substitute new zonelist */
+ kfree(pol->v.zonelist);
+ pol->v.zonelist = zonelist;
+ }
+ break;
+ }
+ default:
+ BUG();
+ break;
+ }
+}
+
+/*
+ * Someone moved this task to different nodes. Fixup mempolicies.
+ *
+ * TODO - fixup current->mm->vma and shmfs/tmpfs/hugetlbfs policies as well,
+ * once we have a cpuset mechanism to mark which cpuset subtree is migrating.
+ */
+void numa_policy_rebind(const nodemask_t *old, const nodemask_t *new)
+{
+ rebind_policy(current->mempolicy, old, new);
+}
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
* Re: [PATCH 01/02] cpuset bitmap and mask remap operators
2005-10-24 7:27 [PATCH 01/02] cpuset bitmap and mask remap operators Paul Jackson
2005-10-24 7:27 ` [PATCH 02/02] cpuset automatic numa mempolicy rebinding Paul Jackson
@ 2005-10-24 7:48 ` Andrew Morton
2005-10-24 8:16 ` Paul Jackson
1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2005-10-24 7:48 UTC (permalink / raw)
To: Paul Jackson; +Cc: Simon.Derr, linux-kernel, ak, torvalds, pj, clameter
Paul Jackson <pj@sgi.com> wrote:
>
> +#define node_remap(oldbit, old, new) \
> + __node_remap((oldbit), &(old), &(new), MAX_NUMNODES)
> +static inline int __node_remap(int oldbit,
> + const nodemask_t *oldp, const nodemask_t *newp, int nbits)
> +{
> + return bitmap_bitremap(oldbit, oldp->bits, newp->bits, nbits);
> +}
What's the reason for the wrapper macro?
+EXPORT_SYMBOL(bitmap_bitremap);
Is that deliberately not EXPORT_SYMBOL_GPL?
* Re: [PATCH 01/02] cpuset bitmap and mask remap operators
2005-10-24 7:48 ` [PATCH 01/02] cpuset bitmap and mask remap operators Andrew Morton
@ 2005-10-24 8:16 ` Paul Jackson
2005-10-24 8:37 ` Andrew Morton
0 siblings, 1 reply; 10+ messages in thread
From: Paul Jackson @ 2005-10-24 8:16 UTC (permalink / raw)
To: Andrew Morton; +Cc: Simon.Derr, linux-kernel, ak, torvalds, clameter
Andrew wrote:
> > +#define node_remap(oldbit, old, new) \
> > + __node_remap((oldbit), &(old), &(new), MAX_NUMNODES)
> > +static inline int __node_remap(int oldbit, ...
>
> What's the reason for the wrapper macro?
Most all the nodemask/cpumask operators are like that. It allows
writing *mask code as if masks were pass by value (which is how the
vast majority of kernel hackers, working on systems with one-word
masks, think of them), while actually passing by reference, to
avoid unnecessary stack copies of multiword masks.
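For example, a hypothetical caller (illustrating only the calling
convention, not any particular kernel path) reads as if the masks were
passed by value, while only their addresses cross into the inline helper:

	nodemask_t oldmask, newmask;
	int newbit;

	nodes_clear(oldmask);
	nodes_clear(newmask);
	node_set(2, oldmask);
	node_set(6, newmask);

	/* No '&' at the call site, and no multiword mask is copied onto
	   the stack: the macro passes &oldmask and &newmask down to
	   __node_remap() and bitmap_bitremap(). */
	newbit = node_remap(2, oldmask, newmask);	/* newbit == 6 */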
> +EXPORT_SYMBOL(bitmap_bitremap);
>
> Is that deliberately not EXPORT_SYMBOL_GPL?
It's not deliberate that I am aware of.
But it does seem to be the common practice ....
All the bitmap routines are that way - no GPL. In fact it seems that
almost all the EXPORT_SYMBOLS in the lib/*.c routines are that way - 12
with GPL and 174 without GPL, or some such. The only lib/*.c GPL
exports are in lib/klist.c and lib/kobject_uevent.c.
Is this bad?
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: [PATCH 02/02] cpuset automatic numa mempolicy rebinding
2005-10-24 7:27 ` [PATCH 02/02] cpuset automatic numa mempolicy rebinding Paul Jackson
@ 2005-10-24 8:36 ` Dave Hansen
2005-10-24 14:47 ` Paul Jackson
0 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2005-10-24 8:36 UTC (permalink / raw)
To: Paul Jackson
Cc: Andrew Morton, Simon.Derr, Linux Kernel Mailing List,
Christoph Lameter, Linus Torvalds, Andi Kleen
On Mon, 2005-10-24 at 00:27 -0700, Paul Jackson wrote:
> + break;
> + }
> + default:
> + BUG();
> + break;
> + }
> +}
Just think how that looks to a reviewer without the full context :)
Perhaps the MPOL_BIND case needs a little helper function.
-- Dave
* Re: [PATCH 01/02] cpuset bitmap and mask remap operators
2005-10-24 8:16 ` Paul Jackson
@ 2005-10-24 8:37 ` Andrew Morton
2005-10-24 14:37 ` Paul Jackson
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2005-10-24 8:37 UTC (permalink / raw)
To: Paul Jackson; +Cc: Simon.Derr, linux-kernel, ak, torvalds, clameter
Paul Jackson <pj@sgi.com> wrote:
>
> Andrew wrote:
> > > +#define node_remap(oldbit, old, new) \
> > > + __node_remap((oldbit), &(old), &(new), MAX_NUMNODES)
> > > +static inline int __node_remap(int oldbit, ...
> >
> > What's the reason for the wrapper macro?
>
> Most all the nodemask/cpumask operators are like that. It allows
> writing *mask code as if masks were pass by value (which is how the
> vast majority of kernel hackers, working on systems with one-word
> masks, think of them), while actually passing by reference, to
> avoid unnecessary stack copies of multiword masks.
>
hm. That hides what's really going on from the programmer.
Oh well - you're the only guy who dinks with that stuff anyway ;)
> > +EXPORT_SYMBOL(bitmap_bitremap);
> >
> > Is that deliberately not EXPORT_SYMBOL_GPL?
>
> It's not deliberate that I am aware of.
>
> But it does seem to be the common practice ....
>
> All the bitmap routines are that way - no GPL. In fact it seems that
> almost all the EXPORT_SYMBOLS in the lib/*.c routines are that way - 12
> with GPL and 174 without GPL, or some such. The only lib/*.c GPL
> exports are in lib/klist.c and lib/kobject_uevent.c.
>
> Is this bad?
>
Ah, the bitmap library - sorry I thought it was in the cpuset code.
Making the bitmap library non-GPL makes sense I guess.
* Re: [PATCH 01/02] cpuset bitmap and mask remap operators
2005-10-24 8:37 ` Andrew Morton
@ 2005-10-24 14:37 ` Paul Jackson
0 siblings, 0 replies; 10+ messages in thread
From: Paul Jackson @ 2005-10-24 14:37 UTC (permalink / raw)
To: Andrew Morton; +Cc: Simon.Derr, linux-kernel, ak, torvalds, clameter
Andrew wrote:
> hm. That hides what's really going on from the programmer.
>
> Oh well - you're the only guy who dinks with that stuff anyway ;)
Halloween comes a week early to the Morton household. The patch from
hell, circa April 2004, lives on to haunt Andrew <grin>.
By the time we (wli, rusty, colpatch, ...) incorporated the following
various constraints on these cpumask/nodemask macros, we were fortunate
to only violate the "obvious to the programmer" constraint on the
implementation internals:
- near perfect code gen on small systems (1 word masks)
- near perfect code gen on large systems (multiword masks)
- type checking on arguments
- keep the existing macro-style calling conventions:
cpus_and(result, input1, input2)
- a single bitmap internal implementation - the rest just wrappers
- dramatic shrinkage of kernel source devoted to this stuff
- reduction in kernel text size across all architectures.
Probably it is best that we not ask too many questions at this time ;).
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: [PATCH 02/02] cpuset automatic numa mempolicy rebinding
2005-10-24 8:36 ` Dave Hansen
@ 2005-10-24 14:47 ` Paul Jackson
2005-10-24 14:52 ` Dave Hansen
0 siblings, 1 reply; 10+ messages in thread
From: Paul Jackson @ 2005-10-24 14:47 UTC (permalink / raw)
To: Dave Hansen; +Cc: akpm, Simon.Derr, linux-kernel, clameter, torvalds, ak
Dave wrote:
> Just think how that looks to a reviewer without the full context :)
I was just copying Andi's coding style for these cases, with
this BUG() if no policy matched. When in Rome, do as the
Romans.
> Perhaps the MBOL_BIND case needs a little helper function.
Eh ... that entire routine still fits on a page of my screen.
I'm not sure that adding a helper function would clarify the
code any.
Thanks for looking at it. If you're pretty sure I should change
one of the above, squeak a little louder.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
* Re: [PATCH 02/02] cpuset automatic numa mempolicy rebinding
2005-10-24 14:47 ` Paul Jackson
@ 2005-10-24 14:52 ` Dave Hansen
2005-10-24 15:02 ` Paul Jackson
0 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2005-10-24 14:52 UTC (permalink / raw)
To: Paul Jackson
Cc: Andrew Morton, Simon.Derr, Linux Kernel Mailing List, clameter,
Linus Torvalds, ak
On Mon, 2005-10-24 at 07:47 -0700, Paul Jackson wrote:
> Dave wrote:
> > Just think how that looks to a reviewer without the full context :)
>
> I was just copying Andi's coding style for these cases, with
> this BUG() if no policy matched. When in Rome, do as the
> Romans.
>
> > Perhaps the MPOL_BIND case needs a little helper function.
>
> Eh ... that entire routine still fits on a page of my screen.
> I'm not sure that adding a helper function would clarify the
> code any.
>
> Thanks for looking at it. If you're pretty sure I should change
> one of the above, squeek a little louder.
I really just meant it was hard to read having two braces at the same
level. This was due to the extra block inside of the switch() for the
'case MPOL_BIND:'. Not a huge deal, but it did cause me a bit of a
brain fault for a moment.
-- Dave
* Re: [PATCH 02/02] cpuset automatic numa mempolicy rebinding
2005-10-24 14:52 ` Dave Hansen
@ 2005-10-24 15:02 ` Paul Jackson
0 siblings, 0 replies; 10+ messages in thread
From: Paul Jackson @ 2005-10-24 15:02 UTC (permalink / raw)
To: Dave Hansen; +Cc: akpm, Simon.Derr, linux-kernel, clameter, torvalds, ak
Dave wrote:
> I really just meant it was hard to read having two braces at the same
> level. This was due to the extra block inside of the switch() ...
Ah - yes. That.
Such is life.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401