* [PATCH 0/2] jump label: 2.6.38 updates
@ 2011-01-05 15:43 Jason Baron
2011-01-05 15:43 ` [PATCH 1/2] jump label: make enable/disable o(1) Jason Baron
` (2 more replies)
0 siblings, 3 replies; 113+ messages in thread
From: Jason Baron @ 2011-01-05 15:43 UTC (permalink / raw)
To: peterz, mathieu.desnoyers, hpa, rostedt, mingo
Cc: tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi,
davem, sam, ddaney, michael, linux-kernel
Hi,
The first patch reuses the storage space of the jump label key itself
as a pointer into the update table. In this way, we can find all
the addresses that need to be updated without hashing.
The second patch introduces:
static __always_inline bool static_branch(struct jump_label_key *key);
instead of the old JUMP_LABEL(key, label) macro.
In this way, jump labels become really easy to use:
Define:
struct jump_label_key jump_key;
Can be used as:
if (static_branch(&jump_key))
do unlikely code
enable/disable via:
jump_label_enable(&jump_key);
jump_label_disable(&jump_key);
that's it!
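
Putting the three steps above together, a minimal self-contained
sketch (the key and function names here are made up for illustration):

#include <linux/jump_label.h>

/* illustrative key; starts disabled, so the unlikely path is skipped */
static struct jump_label_key my_key = JUMP_LABEL_INIT;

static void do_unlikely_work(void)
{
	/* rarely-enabled slow path */
}

void hot_path(void)
{
	/* compiles down to a single nop until the key is enabled */
	if (static_branch(&my_key))
		do_unlikely_work();
}

void set_unlikely_path(int on)
{
	if (on)
		jump_label_enable(&my_key);	/* patch nop -> jmp */
	else
		jump_label_disable(&my_key);	/* patch jmp -> nop */
}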
For perf, which also uses jump labels, I've left the reference counting
out of the jump label layer, thus removing the 'jump_label_inc()' and
'jump_label_dec()' interface. Hopefully, this is a more palatable solution.
Thanks to H. Peter Anvin for suggesting the simpler 'static_branch()'
function.
thanks,
-Jason
Jason Baron (2):
jump label: make enable/disable o(1)
jump label: introduce static_branch()
arch/sparc/include/asm/jump_label.h | 25 ++++---
arch/x86/include/asm/jump_label.h | 22 ++++---
arch/x86/kernel/jump_label.c | 2 +-
include/linux/dynamic_debug.h | 24 ++-----
include/linux/jump_label.h | 66 ++++++++++--------
include/linux/jump_label_ref.h | 36 +++-------
include/linux/perf_event.h | 28 ++++----
include/linux/tracepoint.h | 8 +--
kernel/jump_label.c | 129 +++++++++++++++++++++++++++--------
kernel/perf_event.c | 24 ++++--
kernel/tracepoint.c | 22 ++----
11 files changed, 226 insertions(+), 160 deletions(-)
^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH 1/2] jump label: make enable/disable o(1)
  2011-01-05 15:43 [PATCH 0/2] jump label: 2.6.38 updates Jason Baron
@ 2011-01-05 15:43 ` Jason Baron
  2011-01-05 17:31   ` Steven Rostedt
  2011-01-05 15:43 ` [PATCH 2/2] jump label: introduce static_branch() Jason Baron
  2011-02-11 19:25 ` [PATCH 0/2] jump label: 2.6.38 updates Peter Zijlstra
  2 siblings, 1 reply; 113+ messages in thread
From: Jason Baron @ 2011-01-05 15:43 UTC (permalink / raw)
  To: peterz, mathieu.desnoyers, hpa, rostedt, mingo
  Cc: tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi,
      davem, sam, ddaney, michael, linux-kernel

Previously, I allowed any variable type to be used as the 'key' for
the jump label. However, by enforcing a type, we can make use of the
contents of the 'key'. This patch thus introduces:

struct jump_label_key {
	void *ptr;
};

The 'ptr' is used as a pointer into the jump label table of the
corresponding addresses that need to be updated. Thus, when jump
labels are enabled/disabled we have a constant-time algorithm. There
is no longer any hashing. When jump labels are disabled we simply
have:

struct jump_label_key {
	int state;
};

I tested enable/disable times on x86 on a quad core via:

time echo 1 > /sys/kernel/debug/tracing/events/enable

With this patch, runs average .03s. Prior to the jump label
infrastructure this command averaged around .01s. We can speed this
path up further via batching the enable/disables.

thanks,

-Jason

Signed-off-by: Jason Baron <jbaron@redhat.com>
---
 include/linux/dynamic_debug.h  |    6 +-
 include/linux/jump_label.h     |   46 +++++++++-----
 include/linux/jump_label_ref.h |   34 +++--------
 include/linux/perf_event.h     |    8 ++-
 include/linux/tracepoint.h     |    6 +-
 kernel/jump_label.c            |  127 +++++++++++++++++++++++++++---------
 kernel/perf_event.c            |   24 +++++---
 kernel/tracepoint.c            |   22 +++----
 8 files changed, 172 insertions(+), 101 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index a90b389..ddf7bae 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -33,7 +33,7 @@ struct _ddebug {
 #define _DPRINTK_FLAGS_PRINT (1<<0) /* printk() a message using the format */
 #define _DPRINTK_FLAGS_DEFAULT 0
 	unsigned int flags:8;
-	char enabled;
+	struct jump_label_key enabled;
 } __attribute__((aligned(8)));
 
 
@@ -50,7 +50,7 @@ extern int ddebug_remove_module(const char *mod_name);
 		__used \
 		__attribute__((section("__verbose"), aligned(8))) = \
 		{ KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \
-			_DPRINTK_FLAGS_DEFAULT }; \
+			_DPRINTK_FLAGS_DEFAULT, JUMP_LABEL_INIT }; \
 	JUMP_LABEL(&descriptor.enabled, do_printk); \
 	goto out; \
 do_printk: \
@@ -66,7 +66,7 @@ out: ; \
 		__used \
 		__attribute__((section("__verbose"), aligned(8))) = \
 		{ KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \
-			_DPRINTK_FLAGS_DEFAULT }; \
+			_DPRINTK_FLAGS_DEFAULT, JUMP_LABEL_INIT }; \
 	JUMP_LABEL(&descriptor.enabled, do_printk); \
 	goto out; \
 do_printk: \
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 7880f18..152f7de 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -2,6 +2,11 @@
 #define _LINUX_JUMP_LABEL_H
 
 #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
+
+struct jump_label_key {
+	void *ptr;
+};
+
 # include <asm/jump_label.h>
 # define HAVE_JUMP_LABEL
 #endif
@@ -13,6 +18,8 @@ enum jump_label_type {
 
 struct module;
 
+#define JUMP_LABEL_INIT { 0 }
+
 #ifdef HAVE_JUMP_LABEL
 
 extern struct jump_entry __start___jump_table[];
@@ -23,33 +30,38 @@
extern void jump_label_unlock(void); extern void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type); extern void arch_jump_label_text_poke_early(jump_label_t addr); -extern void jump_label_update(unsigned long key, enum jump_label_type type); extern void jump_label_apply_nops(struct module *mod); extern int jump_label_text_reserved(void *start, void *end); - -#define jump_label_enable(key) \ - jump_label_update((unsigned long)key, JUMP_LABEL_ENABLE); - -#define jump_label_disable(key) \ - jump_label_update((unsigned long)key, JUMP_LABEL_DISABLE); +extern int jump_label_enabled(struct jump_label_key *key); +extern void jump_label_enable(struct jump_label_key *key); +extern void jump_label_disable(struct jump_label_key *key); #else +struct jump_label_key { + int state; +}; + #define JUMP_LABEL(key, label) \ do { \ - if (unlikely(*key)) \ + if (unlikely(((struct jump_label_key *)key)->state)) \ goto label; \ } while (0) -#define jump_label_enable(cond_var) \ -do { \ - *(cond_var) = 1; \ -} while (0) +static inline int jump_label_enabled(struct jump_label_key *key) +{ + return key->state; +} -#define jump_label_disable(cond_var) \ -do { \ - *(cond_var) = 0; \ -} while (0) +static inline void jump_label_enable(struct jump_label_key *key) +{ + key->state = 1; +} + +static inline void jump_label_disable(struct jump_label_key *key) +{ + key->state = 0; +} static inline int jump_label_apply_nops(struct module *mod) { @@ -69,7 +81,7 @@ static inline void jump_label_unlock(void) {} #define COND_STMT(key, stmt) \ do { \ __label__ jl_enabled; \ - JUMP_LABEL(key, jl_enabled); \ + JUMP_LABEL_ELSE_ATOMIC_READ(key, jl_enabled); \ if (0) { \ jl_enabled: \ stmt; \ diff --git a/include/linux/jump_label_ref.h b/include/linux/jump_label_ref.h index e5d012a..8a76e89 100644 --- a/include/linux/jump_label_ref.h +++ b/include/linux/jump_label_ref.h @@ -4,38 +4,20 @@ #include <linux/jump_label.h> #include <asm/atomic.h> -#ifdef HAVE_JUMP_LABEL - -static inline void jump_label_inc(atomic_t *key) -{ - if (atomic_add_return(1, key) == 1) - jump_label_enable(key); +struct jump_label_key_counter { + atomic_t ref; + struct jump_label_key key; } -static inline void jump_label_dec(atomic_t *key) -{ - if (atomic_dec_and_test(key)) - jump_label_disable(key); -} - -#else /* !HAVE_JUMP_LABEL */ +#ifdef HAVE_JUMP_LABEL -static inline void jump_label_inc(atomic_t *key) -{ - atomic_inc(key); -} +#define JUMP_LABEL_ELSE_ATOMIC_READ(key, label, counter) JUMP_LABEL(key, label) -static inline void jump_label_dec(atomic_t *key) -{ - atomic_dec(key); -} +#else /* !HAVE_JUMP_LABEL */ -#undef JUMP_LABEL -#define JUMP_LABEL(key, label) \ +#define JUMP_LABEL_ELSE_ATOMIC_READ(key, label, counter) \ do { \ - if (unlikely(__builtin_choose_expr( \ - __builtin_types_compatible_p(typeof(key), atomic_t *), \ - atomic_read((atomic_t *)(key)), *(key)))) \ + if (unlikely(atomic_read((atomic_t *)counter))) \ goto label; \ } while (0) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index dda5b0a..94834ce 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1000,7 +1000,7 @@ static inline int is_software_event(struct perf_event *event) return event->pmu->task_ctx_nr == perf_sw_context; } -extern atomic_t perf_swevent_enabled[PERF_COUNT_SW_MAX]; +extern struct jump_label_key_counter perf_swevent_enabled[PERF_COUNT_SW_MAX]; extern void __perf_sw_event(u32, u64, int, struct pt_regs *, u64); @@ -1029,7 +1029,9 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 
addr) { struct pt_regs hot_regs; - JUMP_LABEL(&perf_swevent_enabled[event_id], have_event); + JUMP_LABEL_ELSE_ATOMIC_READ(&perf_swevent_enabled[event_id].key, + have_event, + &perf_swevent_enabled[event_id].ref); return; have_event: @@ -1040,7 +1042,7 @@ have_event: __perf_sw_event(event_id, nr, nmi, regs, addr); } -extern atomic_t perf_task_events; +extern struct jump_label_key_counter perf_task_events; static inline void perf_event_task_sched_in(struct task_struct *task) { diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index d3e4f87..2ff00e5 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -29,7 +29,7 @@ struct tracepoint_func { struct tracepoint { const char *name; /* Tracepoint name */ - int state; /* State. */ + struct jump_label_key key; void (*regfunc)(void); void (*unregfunc)(void); struct tracepoint_func *funcs; @@ -149,7 +149,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin, extern struct tracepoint __tracepoint_##name; \ static inline void trace_##name(proto) \ { \ - JUMP_LABEL(&__tracepoint_##name.state, do_trace); \ + JUMP_LABEL(&__tracepoint_##name.key, do_trace); \ return; \ do_trace: \ __DO_TRACE(&__tracepoint_##name, \ @@ -179,7 +179,7 @@ do_trace: \ __attribute__((section("__tracepoints_strings"))) = #name; \ struct tracepoint __tracepoint_##name \ __attribute__((section("__tracepoints"), aligned(32))) = \ - { __tpstrtab_##name, 0, reg, unreg, NULL } + { __tpstrtab_##name, JUMP_LABEL_INIT, reg, unreg, NULL } #define DEFINE_TRACE(name) \ DEFINE_TRACE_FN(name, NULL, NULL); diff --git a/kernel/jump_label.c b/kernel/jump_label.c index 3b79bd9..b6d461c 100644 --- a/kernel/jump_label.c +++ b/kernel/jump_label.c @@ -26,10 +26,11 @@ static DEFINE_MUTEX(jump_label_mutex); struct jump_label_entry { struct hlist_node hlist; struct jump_entry *table; - int nr_entries; /* hang modules off here */ struct hlist_head modules; unsigned long key; + u32 nr_entries; + int refcount; }; struct jump_label_module_entry { @@ -105,11 +106,14 @@ add_jump_label_entry(jump_label_t key, int nr_entries, struct jump_entry *table) hash = jhash((void *)&key, sizeof(jump_label_t), 0); head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)]; - e->key = key; + e->key = (unsigned long)key; e->table = table; e->nr_entries = nr_entries; + e->refcount = 0; INIT_HLIST_HEAD(&(e->modules)); hlist_add_head(&e->hlist, head); + ((struct jump_label_key *)(unsigned long)key)->ptr = e; + return e; } @@ -154,37 +158,91 @@ build_jump_label_hashtable(struct jump_entry *start, struct jump_entry *stop) * */ -void jump_label_update(unsigned long key, enum jump_label_type type) +static void jump_label_update(struct jump_label_entry *entry, enum jump_label_type type) { struct jump_entry *iter; - struct jump_label_entry *entry; struct hlist_node *module_node; struct jump_label_module_entry *e_module; int count; - jump_label_lock(); - entry = get_jump_label_entry((jump_label_t)key); - if (entry) { - count = entry->nr_entries; - iter = entry->table; + count = entry->nr_entries; + iter = entry->table; + while (count--) { + if (kernel_text_address(iter->code)) + arch_jump_label_transform(iter, type); + iter++; + } + /* enable/disable jump labels in modules */ + hlist_for_each_entry(e_module, module_node, &(entry->modules), + hlist) { + count = e_module->nr_entries; + iter = e_module->table; while (count--) { - if (kernel_text_address(iter->code)) + if (iter->key && kernel_text_address(iter->code)) arch_jump_label_transform(iter, type); iter++; } - /* 
eanble/disable jump labels in modules */ - hlist_for_each_entry(e_module, module_node, &(entry->modules), - hlist) { - count = e_module->nr_entries; - iter = e_module->table; - while (count--) { - if (iter->key && - kernel_text_address(iter->code)) - arch_jump_label_transform(iter, type); - iter++; - } - } } +} + +static struct jump_label_entry *get_jump_label_entry_key(struct jump_label_key *key) +{ + struct jump_label_entry *entry; + + entry = (struct jump_label_entry *)key->ptr; + if (!entry) { + entry = add_jump_label_entry((jump_label_t)(unsigned long)key, 0, NULL); + if (IS_ERR(entry)) + return NULL; + } + return entry; +} + +int jump_label_enabled(struct jump_label_key *key) +{ + struct jump_label_entry *entry; + int enabled = 0; + + jump_label_lock(); + entry = get_jump_label_entry_key(key); + if (!entry) + goto out; + enabled = !!entry->refcount; +out: + jump_label_unlock(); + return enabled; +} + + +void jump_label_enable(struct jump_label_key *key) +{ + struct jump_label_entry *entry; + + jump_label_lock(); + entry = get_jump_label_entry_key(key); + if (!entry) + goto out; + if (!entry->refcount) { + jump_label_update(entry, JUMP_LABEL_ENABLE); + entry->refcount = 1; + } +out: + jump_label_unlock(); +} + +void jump_label_disable(struct jump_label_key *key) +{ + struct jump_label_entry *entry; + + jump_label_lock(); + entry = get_jump_label_entry_key(key); + if (!entry) + goto out; + if (entry->refcount) { + jump_label_update(entry, JUMP_LABEL_DISABLE); + entry->refcount = 0; + } +out: jump_label_unlock(); } @@ -305,6 +363,7 @@ add_jump_label_module_entry(struct jump_label_entry *entry, int count, struct module *mod) { struct jump_label_module_entry *e; + struct jump_entry *iter; e = kmalloc(sizeof(struct jump_label_module_entry), GFP_KERNEL); if (!e) @@ -313,6 +372,13 @@ add_jump_label_module_entry(struct jump_label_entry *entry, e->nr_entries = count; e->table = iter_begin; hlist_add_head(&e->hlist, &entry->modules); + if (entry->refcount) { + iter = iter_begin; + while (count--) { + arch_jump_label_transform(iter, JUMP_LABEL_ENABLE); + iter++; + } + } return e; } @@ -360,10 +426,6 @@ static void remove_jump_label_module(struct module *mod) struct jump_label_module_entry *e_module; int i; - /* if the module doesn't have jump label entries, just return */ - if (!mod->num_jump_entries) - return; - for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) { head = &jump_label_table[i]; hlist_for_each_entry_safe(e, node, node_next, head, hlist) { @@ -375,10 +437,21 @@ static void remove_jump_label_module(struct module *mod) kfree(e_module); } } + } + } + /* now check if any keys can be removed */ + for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) { + head = &jump_label_table[i]; + hlist_for_each_entry_safe(e, node, node_next, head, hlist) { + if (!within_module_core(e->key, mod)) + continue; if (hlist_empty(&e->modules) && (e->nr_entries == 0)) { hlist_del(&e->hlist); kfree(e); + continue; } + WARN(1, KERN_ERR "jump label: " + "tyring to remove used key: %lu !\n", e->key); } } } @@ -470,7 +543,7 @@ void jump_label_apply_nops(struct module *mod) struct notifier_block jump_label_module_nb = { .notifier_call = jump_label_module_notify, - .priority = 0, + .priority = 1, /* higher than tracepoints */ }; static __init int init_jump_label_module(void) diff --git a/kernel/perf_event.c b/kernel/perf_event.c index 11847bf..f96d615 100644 --- a/kernel/perf_event.c +++ b/kernel/perf_event.c @@ -38,7 +38,7 @@ #include <asm/irq_regs.h> -atomic_t perf_task_events __read_mostly; +struct jump_label_key_counter 
perf_task_events __read_mostly; static atomic_t nr_mmap_events __read_mostly; static atomic_t nr_comm_events __read_mostly; static atomic_t nr_task_events __read_mostly; @@ -2292,8 +2292,10 @@ static void free_event(struct perf_event *event) irq_work_sync(&event->pending); if (!event->parent) { - if (event->attach_state & PERF_ATTACH_TASK) - jump_label_dec(&perf_task_events); + if (event->attach_state & PERF_ATTACH_TASK) { + if (atomic_dec_and_test(&perf_task_events.ref)) + jump_label_disable(&perf_task_events.key); + } if (event->attr.mmap || event->attr.mmap_data) atomic_dec(&nr_mmap_events); if (event->attr.comm) @@ -4821,7 +4823,7 @@ fail: return err; } -atomic_t perf_swevent_enabled[PERF_COUNT_SW_MAX]; +struct jump_label_key_counter perf_swevent_enabled[PERF_COUNT_SW_MAX]; static void sw_perf_event_destroy(struct perf_event *event) { @@ -4829,7 +4831,8 @@ static void sw_perf_event_destroy(struct perf_event *event) WARN_ON(event->parent); - jump_label_dec(&perf_swevent_enabled[event_id]); + if (atomic_dec_and_test(&perf_swevent_enabled[event_id].ref)) + jump_label_disable(&perf_swevent_enabled[event_id].key); swevent_hlist_put(event); } @@ -4854,12 +4857,15 @@ static int perf_swevent_init(struct perf_event *event) if (!event->parent) { int err; + atomic_t *ref; err = swevent_hlist_get(event); if (err) return err; - jump_label_inc(&perf_swevent_enabled[event_id]); + ref = &perf_swevent_enabled[event_id].ref; + if (atomic_add_return(1, ref) == 1) + jump_label_enable(&perf_swevent_enabled[event_id].key); event->destroy = sw_perf_event_destroy; } @@ -5614,8 +5620,10 @@ done: event->pmu = pmu; if (!event->parent) { - if (event->attach_state & PERF_ATTACH_TASK) - jump_label_inc(&perf_task_events); + if (event->attach_state & PERF_ATTACH_TASK) { + if (atomic_add_return(1, &perf_task_events.ref) == 1) + jump_label_enable(&perf_task_events.key); + } if (event->attr.mmap || event->attr.mmap_data) atomic_inc(&nr_mmap_events); if (event->attr.comm) diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index e95ee7f..d54b434 100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -251,9 +251,9 @@ static void set_tracepoint(struct tracepoint_entry **entry, { WARN_ON(strcmp((*entry)->name, elem->name) != 0); - if (elem->regfunc && !elem->state && active) + if (elem->regfunc && !jump_label_enabled(&elem->key) && active) elem->regfunc(); - else if (elem->unregfunc && elem->state && !active) + else if (elem->unregfunc && jump_label_enabled(&elem->key) && !active) elem->unregfunc(); /* @@ -264,13 +264,10 @@ static void set_tracepoint(struct tracepoint_entry **entry, * is used. */ rcu_assign_pointer(elem->funcs, (*entry)->funcs); - if (!elem->state && active) { - jump_label_enable(&elem->state); - elem->state = active; - } else if (elem->state && !active) { - jump_label_disable(&elem->state); - elem->state = active; - } + if (active) + jump_label_enable(&elem->key); + else if (!active) + jump_label_disable(&elem->key); } /* @@ -281,13 +278,10 @@ static void set_tracepoint(struct tracepoint_entry **entry, */ static void disable_tracepoint(struct tracepoint *elem) { - if (elem->unregfunc && elem->state) + if (elem->unregfunc && jump_label_enabled(&elem->key)) elem->unregfunc(); - if (elem->state) { - jump_label_disable(&elem->state); - elem->state = 0; - } + jump_label_disable(&elem->key); rcu_assign_pointer(elem->funcs, NULL); } -- 1.7.1 ^ permalink raw reply related [flat|nested] 113+ messages in thread
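
The core idea of patch 1 reduces to the following sketch. Types are
trimmed to their essentials and the enable function is simplified; the
real update path also walks module tables, checks kernel_text_address(),
and takes jump_label_lock():

/* reduced view of the bookkeeping, for illustration only */
struct jump_label_entry {
	struct jump_entry *table;	/* all code sites using this key */
	u32 nr_entries;
	int refcount;			/* 0 = disabled, non-zero = enabled */
};

struct jump_label_key {
	void *ptr;			/* -> this key's jump_label_entry */
};

/* no hash lookup: the key dereferences straight to its entry, and
 * only that key's own code sites get patched */
static void sketch_jump_label_enable(struct jump_label_key *key)
{
	struct jump_label_entry *entry = key->ptr;
	struct jump_entry *iter = entry->table;
	u32 count = entry->nr_entries;

	while (count--)
		arch_jump_label_transform(iter++, JUMP_LABEL_ENABLE);
}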
* Re: [PATCH 1/2] jump label: make enable/disable o(1)
  2011-01-05 15:43 ` [PATCH 1/2] jump label: make enable/disable o(1) Jason Baron
@ 2011-01-05 17:31   ` Steven Rostedt
  2011-01-05 21:19     ` Jason Baron
  0 siblings, 1 reply; 113+ messages in thread
From: Steven Rostedt @ 2011-01-05 17:31 UTC (permalink / raw)
  To: Jason Baron
  Cc: peterz, mathieu.desnoyers, hpa, mingo, tglx, andi, roland, rth,
      masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael,
      linux-kernel

On Wed, 2011-01-05 at 10:43 -0500, Jason Baron wrote:
> 
> +struct jump_label_key {
> +	int state;
> +};
> +
>  #define JUMP_LABEL(key, label) \
>  do { \
> -	if (unlikely(*key)) \
> +	if (unlikely(((struct jump_label_key *)key)->state)) \
>  		goto label; \
>  } while (0)

Anything that uses JUMP_LABEL() should pass in a pointer to a struct
jump_label_key. Hence, remove the typecast. That can only lead to hard
to find bugs.

-- Steve

^ permalink raw reply	[flat|nested] 113+ messages in thread
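
The failure mode Steven is flagging looks roughly like this
(hypothetical caller; 'flag' is a plain int, not a jump_label_key, yet
the cast inside the macro makes it compile without a diagnostic):

static int flag;

void broken_caller(void)
{
	JUMP_LABEL(&flag, out);	/* wrong type, silently accepted */
out:
	;
}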
* Re: [PATCH 1/2] jump label: make enable/disable o(1)
  2011-01-05 17:31   ` Steven Rostedt
@ 2011-01-05 21:19     ` Jason Baron
  0 siblings, 0 replies; 113+ messages in thread
From: Jason Baron @ 2011-01-05 21:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: peterz, mathieu.desnoyers, hpa, mingo, tglx, andi, roland, rth,
      masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael,
      linux-kernel

On Wed, Jan 05, 2011 at 12:31:05PM -0500, Steven Rostedt wrote:
> On Wed, 2011-01-05 at 10:43 -0500, Jason Baron wrote:
> > 
> > +struct jump_label_key {
> > +	int state;
> > +};
> > +
> >  #define JUMP_LABEL(key, label) \
> >  do { \
> > -	if (unlikely(*key)) \
> > +	if (unlikely(((struct jump_label_key *)key)->state)) \
> >  		goto label; \
> >  } while (0)
> 
> Anything that uses JUMP_LABEL() should pass in a pointer to a struct
> jump_label_key. Hence, remove the typecast. That can only lead to hard
> to find bugs.
> 
> -- Steve
> 

right. The second patch in the series converts the JUMP_LABEL() macro
-> static __always_inline bool static_branch(struct jump_label_key *key).
So, that addresses this concern.

thanks,

-Jason

^ permalink raw reply	[flat|nested] 113+ messages in thread
* [PATCH 2/2] jump label: introduce static_branch()
  2011-01-05 15:43 [PATCH 0/2] jump label: 2.6.38 updates Jason Baron
  2011-01-05 15:43 ` [PATCH 1/2] jump label: make enable/disable o(1) Jason Baron
@ 2011-01-05 15:43 ` Jason Baron
  2011-01-05 17:15   ` Frederic Weisbecker
                     ` (3 more replies)
  2011-02-11 19:25 ` [PATCH 0/2] jump label: 2.6.38 updates Peter Zijlstra
  2 siblings, 4 replies; 113+ messages in thread
From: Jason Baron @ 2011-01-05 15:43 UTC (permalink / raw)
  To: peterz, mathieu.desnoyers, hpa, rostedt, mingo
  Cc: tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi,
      davem, sam, ddaney, michael, linux-kernel

Introduce:

static __always_inline bool static_branch(struct jump_label_key *key)

to replace the old JUMP_LABEL(key, label) macro.

The new static_branch() simplifies the usage of jump labels. Since
static_branch() returns a boolean, it can be used as part of an if()
construct. It also allows us to drop the 'label' argument from the
prototype. It's probably best understood with an example; here is the
part of the patch that converts the tracepoints to use static_branch():

--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -146,9 +146,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
 	extern struct tracepoint __tracepoint_##name;		\
 	static inline void trace_##name(proto)			\
 	{							\
-		JUMP_LABEL(&__tracepoint_##name.key, do_trace);	\
-		return;						\
-do_trace:							\
+		if (static_branch(&__tracepoint_##name.key))	\
 			__DO_TRACE(&__tracepoint_##name,	\
 				TP_PROTO(data_proto),		\
 				TP_ARGS(data_args));		\

I analyzed the code produced by static_branch(), and it seems to be at
least as good as the code generated by the JUMP_LABEL() macro. As a
reminder, we get a single nop in the fastpath for -O2, but we will
oftentimes get a 'double jmp' in the -Os case. That is, 'jmp 0',
followed by a jmp around the disabled code. We believe that future gcc
tweaks to allow block re-ordering under -Os will solve the -Os case in
the future.

I also saw a 1-2% tbench throughput improvement when compiling with
jump labels.

This patch also addresses a build issue that Tetsuo Handa reported,
where gcc v3.3 currently chokes on compiling 'dynamic debug':

include/net/inet_connection_sock.h: In function `inet_csk_reset_xmit_timer':
include/net/inet_connection_sock.h:236: error: duplicate label declaration `do_printk'
include/net/inet_connection_sock.h:219: error: this is a previous declaration
include/net/inet_connection_sock.h:236: error: duplicate label declaration `out'
include/net/inet_connection_sock.h:219: error: this is a previous declaration
include/net/inet_connection_sock.h:236: error: duplicate label `do_printk'
include/net/inet_connection_sock.h:236: error: duplicate label `out'

Thanks to H. Peter Anvin for suggesting this improved syntax.

Suggested-by: H.
Peter Anvin <hpa@linux.intel.com> Signed-off-by: Jason Baron <jbaron@redhat.com> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> --- arch/sparc/include/asm/jump_label.h | 25 ++++++++++++++----------- arch/x86/include/asm/jump_label.h | 22 +++++++++++++--------- arch/x86/kernel/jump_label.c | 2 +- include/linux/dynamic_debug.h | 18 ++++-------------- include/linux/jump_label.h | 26 +++++++++++--------------- include/linux/jump_label_ref.h | 18 +++++++++++------- include/linux/perf_event.h | 26 +++++++++++++------------- include/linux/tracepoint.h | 4 +--- kernel/jump_label.c | 2 +- 9 files changed, 69 insertions(+), 74 deletions(-) diff --git a/arch/sparc/include/asm/jump_label.h b/arch/sparc/include/asm/jump_label.h index 427d468..882651c 100644 --- a/arch/sparc/include/asm/jump_label.h +++ b/arch/sparc/include/asm/jump_label.h @@ -7,17 +7,20 @@ #define JUMP_LABEL_NOP_SIZE 4 -#define JUMP_LABEL(key, label) \ - do { \ - asm goto("1:\n\t" \ - "nop\n\t" \ - "nop\n\t" \ - ".pushsection __jump_table, \"a\"\n\t"\ - ".align 4\n\t" \ - ".word 1b, %l[" #label "], %c0\n\t" \ - ".popsection \n\t" \ - : : "i" (key) : : label);\ - } while (0) +static __always_inline bool __static_branch(struct jump_label_key *key) +{ + asm goto("1:\n\t" + "nop\n\t" + "nop\n\t" + ".pushsection __jump_table, \"a\"\n\t" + ".align 4\n\t" + ".word 1b, %l[l_yes], %c0\n\t" + ".popsection \n\t" + : : "i" (key) : : l_yes); + return false; +l_yes: + return true; +} #endif /* __KERNEL__ */ diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h index f52d42e..3d44a7c 100644 --- a/arch/x86/include/asm/jump_label.h +++ b/arch/x86/include/asm/jump_label.h @@ -5,20 +5,24 @@ #include <linux/types.h> #include <asm/nops.h> +#include <asm/asm.h> #define JUMP_LABEL_NOP_SIZE 5 # define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t" -# define JUMP_LABEL(key, label) \ - do { \ - asm goto("1:" \ - JUMP_LABEL_INITIAL_NOP \ - ".pushsection __jump_table, \"a\" \n\t"\ - _ASM_PTR "1b, %l[" #label "], %c0 \n\t" \ - ".popsection \n\t" \ - : : "i" (key) : : label); \ - } while (0) +static __always_inline bool __static_branch(struct jump_label_key *key) +{ + asm goto("1:" + JUMP_LABEL_INITIAL_NOP + ".pushsection __jump_table, \"a\" \n\t" + _ASM_PTR "1b, %l[l_yes], %c0 \n\t" + ".popsection \n\t" + : : "i" (key) : : l_yes ); + return false; +l_yes: + return true; +} #endif /* __KERNEL__ */ diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c index 961b6b3..dfa4c3c 100644 --- a/arch/x86/kernel/jump_label.c +++ b/arch/x86/kernel/jump_label.c @@ -4,13 +4,13 @@ * Copyright (C) 2009 Jason Baron <jbaron@redhat.com> * */ -#include <linux/jump_label.h> #include <linux/memory.h> #include <linux/uaccess.h> #include <linux/module.h> #include <linux/list.h> #include <linux/jhash.h> #include <linux/cpu.h> +#include <linux/jump_label.h> #include <asm/kprobes.h> #include <asm/alternative.h> diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h index ddf7bae..2ade291 100644 --- a/include/linux/dynamic_debug.h +++ b/include/linux/dynamic_debug.h @@ -44,34 +44,24 @@ int ddebug_add_module(struct _ddebug *tab, unsigned int n, extern int ddebug_remove_module(const char *mod_name); #define dynamic_pr_debug(fmt, ...) 
do { \ - __label__ do_printk; \ - __label__ out; \ static struct _ddebug descriptor \ __used \ __attribute__((section("__verbose"), aligned(8))) = \ { KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \ _DPRINTK_FLAGS_DEFAULT, JUMP_LABEL_INIT }; \ - JUMP_LABEL(&descriptor.enabled, do_printk); \ - goto out; \ -do_printk: \ - printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \ -out: ; \ + if (static_branch(&descriptor.enabled)) \ + printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \ } while (0) #define dynamic_dev_dbg(dev, fmt, ...) do { \ - __label__ do_printk; \ - __label__ out; \ static struct _ddebug descriptor \ __used \ __attribute__((section("__verbose"), aligned(8))) = \ { KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \ _DPRINTK_FLAGS_DEFAULT, JUMP_LABEL_INIT }; \ - JUMP_LABEL(&descriptor.enabled, do_printk); \ - goto out; \ -do_printk: \ - dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); \ -out: ; \ + if (static_branch(&descriptor.enabled)) \ + dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); \ } while (0) #else diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h index 152f7de..0ad9c2e 100644 --- a/include/linux/jump_label.h +++ b/include/linux/jump_label.h @@ -22,6 +22,11 @@ struct module; #ifdef HAVE_JUMP_LABEL +static __always_inline bool static_branch(struct jump_label_key *key) +{ + return __static_branch(key); +} + extern struct jump_entry __start___jump_table[]; extern struct jump_entry __stop___jump_table[]; @@ -42,11 +47,12 @@ struct jump_label_key { int state; }; -#define JUMP_LABEL(key, label) \ -do { \ - if (unlikely(((struct jump_label_key *)key)->state)) \ - goto label; \ -} while (0) +static __always_inline bool static_branch(struct jump_label_key *key) +{ + if (unlikely(key->state)) + return true; + return false; +} static inline int jump_label_enabled(struct jump_label_key *key) { @@ -78,14 +84,4 @@ static inline void jump_label_unlock(void) {} #endif -#define COND_STMT(key, stmt) \ -do { \ - __label__ jl_enabled; \ - JUMP_LABEL_ELSE_ATOMIC_READ(key, jl_enabled); \ - if (0) { \ -jl_enabled: \ - stmt; \ - } \ -} while (0) - #endif diff --git a/include/linux/jump_label_ref.h b/include/linux/jump_label_ref.h index 8a76e89..5178696 100644 --- a/include/linux/jump_label_ref.h +++ b/include/linux/jump_label_ref.h @@ -7,19 +7,23 @@ struct jump_label_key_counter { atomic_t ref; struct jump_label_key key; -} +}; #ifdef HAVE_JUMP_LABEL -#define JUMP_LABEL_ELSE_ATOMIC_READ(key, label, counter) JUMP_LABEL(key, label) +static __always_inline bool static_branch_else_atomic_read(struct jump_label_key *key, atomic_t *count) +{ + return __static_branch(key); +} #else /* !HAVE_JUMP_LABEL */ -#define JUMP_LABEL_ELSE_ATOMIC_READ(key, label, counter) \ -do { \ - if (unlikely(atomic_read((atomic_t *)counter))) \ - goto label; \ -} while (0) +static __always_inline bool static_branch_else_atomic_read(struct jump_label_key *key, atomic_t *count) +{ + if (unlikely(atomic_read(count))) + return true; + return false; +} #endif /* HAVE_JUMP_LABEL */ diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 94834ce..26fe115 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1029,32 +1029,32 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr) { struct pt_regs hot_regs; - JUMP_LABEL_ELSE_ATOMIC_READ(&perf_swevent_enabled[event_id].key, - have_event, - &perf_swevent_enabled[event_id].ref); - return; - -have_event: - if (!regs) { - perf_fetch_caller_regs(&hot_regs); - regs = &hot_regs; + if 
(static_branch_else_atomic_read(&perf_swevent_enabled[event_id].key, + &perf_swevent_enabled[event_id].ref)) { + if (!regs) { + perf_fetch_caller_regs(&hot_regs); + regs = &hot_regs; + } + __perf_sw_event(event_id, nr, nmi, regs, addr); } - __perf_sw_event(event_id, nr, nmi, regs, addr); } extern struct jump_label_key_counter perf_task_events; static inline void perf_event_task_sched_in(struct task_struct *task) { - COND_STMT(&perf_task_events, __perf_event_task_sched_in(task)); + if (static_branch_else_atomic_read(&perf_task_events.key, + &perf_task_events.ref)) + __perf_event_task_sched_in(task); } static inline void perf_event_task_sched_out(struct task_struct *task, struct task_struct *next) { perf_sw_event(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, 1, NULL, 0); - - COND_STMT(&perf_task_events, __perf_event_task_sched_out(task, next)); + if (static_branch_else_atomic_read(&perf_task_events.key, + &perf_task_events.ref)) + __perf_event_task_sched_out(task, next); } extern void perf_event_mmap(struct vm_area_struct *vma); diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 2ff00e5..b95e99a 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -149,9 +149,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin, extern struct tracepoint __tracepoint_##name; \ static inline void trace_##name(proto) \ { \ - JUMP_LABEL(&__tracepoint_##name.key, do_trace); \ - return; \ -do_trace: \ + if (static_branch(&__tracepoint_##name.key)) \ __DO_TRACE(&__tracepoint_##name, \ TP_PROTO(data_proto), \ TP_ARGS(data_args), \ diff --git a/kernel/jump_label.c b/kernel/jump_label.c index b6d461c..b72d3cd 100644 --- a/kernel/jump_label.c +++ b/kernel/jump_label.c @@ -4,7 +4,6 @@ * Copyright (C) 2009 Jason Baron <jbaron@redhat.com> * */ -#include <linux/jump_label.h> #include <linux/memory.h> #include <linux/uaccess.h> #include <linux/module.h> @@ -13,6 +12,7 @@ #include <linux/slab.h> #include <linux/sort.h> #include <linux/err.h> +#include <linux/jump_label.h> #ifdef HAVE_JUMP_LABEL -- 1.7.1 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 15:43 ` [PATCH 2/2] jump label: introduce static_branch() Jason Baron @ 2011-01-05 17:15 ` Frederic Weisbecker 2011-01-05 17:46 ` Steven Rostedt 2011-01-05 21:14 ` Jason Baron 2011-01-05 17:32 ` David Daney ` (2 subsequent siblings) 3 siblings, 2 replies; 113+ messages in thread From: Frederic Weisbecker @ 2011-01-05 17:15 UTC (permalink / raw) To: Jason Baron Cc: peterz, mathieu.desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, avi, davem, sam, ddaney, michael, linux-kernel On Wed, Jan 05, 2011 at 10:43:12AM -0500, Jason Baron wrote: > diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h > index 152f7de..0ad9c2e 100644 > --- a/include/linux/jump_label.h > +++ b/include/linux/jump_label.h > @@ -22,6 +22,11 @@ struct module; > > #ifdef HAVE_JUMP_LABEL > > +static __always_inline bool static_branch(struct jump_label_key *key) > +{ > + return __static_branch(key); Not very important, but __static_branch() would be more self-explained if it was called arch_static_branch(). > +} > + > extern struct jump_entry __start___jump_table[]; > extern struct jump_entry __stop___jump_table[]; > > @@ -42,11 +47,12 @@ struct jump_label_key { > int state; > }; > > -#define JUMP_LABEL(key, label) \ > -do { \ > - if (unlikely(((struct jump_label_key *)key)->state)) \ > - goto label; \ > -} while (0) > +static __always_inline bool static_branch(struct jump_label_key *key) > +{ > + if (unlikely(key->state)) > + return true; > + return false; > +} > > static inline int jump_label_enabled(struct jump_label_key *key) > { > @@ -78,14 +84,4 @@ static inline void jump_label_unlock(void) {} > > #endif > > -#define COND_STMT(key, stmt) \ > -do { \ > - __label__ jl_enabled; \ > - JUMP_LABEL_ELSE_ATOMIC_READ(key, jl_enabled); \ > - if (0) { \ > -jl_enabled: \ > - stmt; \ > - } \ > -} while (0) > - > #endif > diff --git a/include/linux/jump_label_ref.h b/include/linux/jump_label_ref.h > index 8a76e89..5178696 100644 > --- a/include/linux/jump_label_ref.h > +++ b/include/linux/jump_label_ref.h > @@ -7,19 +7,23 @@ > struct jump_label_key_counter { > atomic_t ref; > struct jump_label_key key; > -} > +}; > > #ifdef HAVE_JUMP_LABEL > > -#define JUMP_LABEL_ELSE_ATOMIC_READ(key, label, counter) JUMP_LABEL(key, label) > +static __always_inline bool static_branch_else_atomic_read(struct jump_label_key *key, atomic_t *count) > +{ > + return __static_branch(key); > +} How about having only static_branch() but the key would be handled only by ways of get()/put(). Simple boolean key enablement would work in this scheme as well as branches based on refcount. So that the users could avoid maintaining both key and count, this would be transparently handled by the jump label API. Or am I missing something? Other than that, looks like a very nice patch! ^ permalink raw reply [flat|nested] 113+ messages in thread
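
Frederic's get()/put() suggestion sketches out to an interface shape
like the following (hypothetical; this is not what the series
implements):

/* refcounting hidden inside the jump label layer */
void jump_label_get(struct jump_label_key *key);	/* ++ref; 0 -> 1 enables */
void jump_label_put(struct jump_label_key *key);	/* --ref; 1 -> 0 disables */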
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 17:15 ` Frederic Weisbecker @ 2011-01-05 17:46 ` Steven Rostedt 2011-01-05 18:52 ` H. Peter Anvin 2011-01-05 21:14 ` Jason Baron 1 sibling, 1 reply; 113+ messages in thread From: Steven Rostedt @ 2011-01-05 17:46 UTC (permalink / raw) To: Frederic Weisbecker Cc: Jason Baron, peterz, mathieu.desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, avi, davem, sam, ddaney, michael, linux-kernel On Wed, 2011-01-05 at 18:15 +0100, Frederic Weisbecker wrote: > On Wed, Jan 05, 2011 at 10:43:12AM -0500, Jason Baron wrote: > > diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h > > index 152f7de..0ad9c2e 100644 > > --- a/include/linux/jump_label.h > > +++ b/include/linux/jump_label.h > > @@ -22,6 +22,11 @@ struct module; > > > > #ifdef HAVE_JUMP_LABEL > > > > +static __always_inline bool static_branch(struct jump_label_key *key) > > +{ > > + return __static_branch(key); > > Not very important, but __static_branch() would be more self-explained > if it was called arch_static_branch(). I disagree, I think it is very important ;-) Yes, the kernel has been moving to adding "arch_" to functions that are implemented dependently by different archs. Please change this to "arch_static_branch()". Thanks, -- Steve ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 17:46 ` Steven Rostedt @ 2011-01-05 18:52 ` H. Peter Anvin 2011-01-05 21:19 ` Jason Baron 0 siblings, 1 reply; 113+ messages in thread From: H. Peter Anvin @ 2011-01-05 18:52 UTC (permalink / raw) To: Steven Rostedt Cc: Frederic Weisbecker, Jason Baron, peterz, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, avi, davem, sam, ddaney, michael, linux-kernel On 01/05/2011 09:46 AM, Steven Rostedt wrote: > On Wed, 2011-01-05 at 18:15 +0100, Frederic Weisbecker wrote: >> On Wed, Jan 05, 2011 at 10:43:12AM -0500, Jason Baron wrote: >>> diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h >>> index 152f7de..0ad9c2e 100644 >>> --- a/include/linux/jump_label.h >>> +++ b/include/linux/jump_label.h >>> @@ -22,6 +22,11 @@ struct module; >>> >>> #ifdef HAVE_JUMP_LABEL >>> >>> +static __always_inline bool static_branch(struct jump_label_key *key) >>> +{ >>> + return __static_branch(key); >> >> Not very important, but __static_branch() would be more self-explained >> if it was called arch_static_branch(). > > I disagree, I think it is very important ;-) > > Yes, the kernel has been moving to adding "arch_" to functions that are > implemented dependently by different archs. Please change this to > "arch_static_branch()". > Indeed. This hugely simplifies knowing where to look and whose responsibility it is. -hpa ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 18:52 ` H. Peter Anvin @ 2011-01-05 21:19 ` Jason Baron 0 siblings, 0 replies; 113+ messages in thread From: Jason Baron @ 2011-01-05 21:19 UTC (permalink / raw) To: H. Peter Anvin Cc: Steven Rostedt, Frederic Weisbecker, peterz, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, avi, davem, sam, ddaney, michael, linux-kernel On Wed, Jan 05, 2011 at 10:52:05AM -0800, H. Peter Anvin wrote: > On 01/05/2011 09:46 AM, Steven Rostedt wrote: > > On Wed, 2011-01-05 at 18:15 +0100, Frederic Weisbecker wrote: > >> On Wed, Jan 05, 2011 at 10:43:12AM -0500, Jason Baron wrote: > >>> diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h > >>> index 152f7de..0ad9c2e 100644 > >>> --- a/include/linux/jump_label.h > >>> +++ b/include/linux/jump_label.h > >>> @@ -22,6 +22,11 @@ struct module; > >>> > >>> #ifdef HAVE_JUMP_LABEL > >>> > >>> +static __always_inline bool static_branch(struct jump_label_key *key) > >>> +{ > >>> + return __static_branch(key); > >> > >> Not very important, but __static_branch() would be more self-explained > >> if it was called arch_static_branch(). > > > > I disagree, I think it is very important ;-) > > > > Yes, the kernel has been moving to adding "arch_" to functions that are > > implemented dependently by different archs. Please change this to > > "arch_static_branch()". > > > > Indeed. This hugely simplifies knowing where to look and whose > responsibility it is. > > -hpa agreed. updated. thanks, -Jason ^ permalink raw reply [flat|nested] 113+ messages in thread
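
With that rename applied, the layering settled on here comes out
roughly as (sketch of the generic header):

static __always_inline bool static_branch(struct jump_label_key *key)
{
	return arch_static_branch(key);	/* supplied by <asm/jump_label.h> */
}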
* Re: [PATCH 2/2] jump label: introduce static_branch()
  2011-01-05 17:15   ` Frederic Weisbecker
  2011-01-05 17:46     ` Steven Rostedt
@ 2011-01-05 21:14     ` Jason Baron
  1 sibling, 0 replies; 113+ messages in thread
From: Jason Baron @ 2011-01-05 21:14 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: peterz, mathieu.desnoyers, hpa, rostedt, mingo, tglx, andi,
      roland, rth, masami.hiramatsu.pt, avi, davem, sam, ddaney,
      michael, linux-kernel

On Wed, Jan 05, 2011 at 06:15:18PM +0100, Frederic Weisbecker wrote:
> On Wed, Jan 05, 2011 at 10:43:12AM -0500, Jason Baron wrote:
> > diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
> > index 152f7de..0ad9c2e 100644
> > --- a/include/linux/jump_label.h
> > +++ b/include/linux/jump_label.h
> > @@ -22,6 +22,11 @@ struct module;
> > 
> >  #ifdef HAVE_JUMP_LABEL
> > 
> > +static __always_inline bool static_branch(struct jump_label_key *key)
> > +{
> > +	return __static_branch(key);
> 
> Not very important, but __static_branch() would be more self-explained
> if it was called arch_static_branch().
> 
> > +}
> > +
> >  extern struct jump_entry __start___jump_table[];
> >  extern struct jump_entry __stop___jump_table[];
> > 
> > @@ -42,11 +47,12 @@ struct jump_label_key {
> >  	int state;
> >  };
> > 
> > -#define JUMP_LABEL(key, label) \
> > -do { \
> > -	if (unlikely(((struct jump_label_key *)key)->state)) \
> > -		goto label; \
> > -} while (0)
> > +static __always_inline bool static_branch(struct jump_label_key *key)
> > +{
> > +	if (unlikely(key->state))
> > +		return true;
> > +	return false;
> > +}
> > 
> >  static inline int jump_label_enabled(struct jump_label_key *key)
> >  {
> > @@ -78,14 +84,4 @@ static inline void jump_label_unlock(void) {}
> > 
> >  #endif
> > 
> > -#define COND_STMT(key, stmt) \
> > -do { \
> > -	__label__ jl_enabled; \
> > -	JUMP_LABEL_ELSE_ATOMIC_READ(key, jl_enabled); \
> > -	if (0) { \
> > -jl_enabled: \
> > -		stmt; \
> > -	} \
> > -} while (0)
> > -
> >  #endif
> > diff --git a/include/linux/jump_label_ref.h b/include/linux/jump_label_ref.h
> > index 8a76e89..5178696 100644
> > --- a/include/linux/jump_label_ref.h
> > +++ b/include/linux/jump_label_ref.h
> > @@ -7,19 +7,23 @@
> >  struct jump_label_key_counter {
> >  	atomic_t ref;
> >  	struct jump_label_key key;
> > -}
> > +};
> > 
> >  #ifdef HAVE_JUMP_LABEL
> > 
> > -#define JUMP_LABEL_ELSE_ATOMIC_READ(key, label, counter) JUMP_LABEL(key, label)
> > +static __always_inline bool static_branch_else_atomic_read(struct jump_label_key *key, atomic_t *count)
> > +{
> > +	return __static_branch(key);
> > +}
> 
> How about having only static_branch() but the key would be handled only
> by ways of get()/put().
> 
> Simple boolean key enablement would work in this scheme as well as branches
> based on refcount. So that the users could avoid maintaining both key and count,
> this would be transparently handled by the jump label API.
> 
> Or am I missing something?
> 

right. this is a good point. I had a 'jump_label_inc()',
'jump_label_dec()' essentially providing this. However, when jump
labels are disabled we didn't want to incur an atomic_read()
everywhere. Furthermore, the use of the atomic_t type within
jump_label.h causes #include dependency problems, since atomic.h ends
up including jump_label.h... Thus, what I've proposed here is to have
the very simple jump_label_enable()/disable(), and leave reference
counting to the caller.

thanks,

-Jason

^ permalink raw reply	[flat|nested] 113+ messages in thread
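
The caller-side pattern Jason describes, as it appears in the perf
conversion in patch 1, amounts to something like this (the wrapper
names are hypothetical):

static struct jump_label_key_counter my_events;	/* hypothetical counter */

/* first reference flips the branch on; later ones only bump the count */
static void my_events_get(void)
{
	if (atomic_add_return(1, &my_events.ref) == 1)
		jump_label_enable(&my_events.key);
}

/* last reference flips it back off */
static void my_events_put(void)
{
	if (atomic_dec_and_test(&my_events.ref))
		jump_label_disable(&my_events.key);
}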
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 15:43 ` [PATCH 2/2] jump label: introduce static_branch() Jason Baron 2011-01-05 17:15 ` Frederic Weisbecker @ 2011-01-05 17:32 ` David Daney 2011-01-05 17:43 ` Steven Rostedt 2011-01-05 21:16 ` Jason Baron 2011-01-05 17:41 ` Steven Rostedt 2011-01-09 18:48 ` Mathieu Desnoyers 3 siblings, 2 replies; 113+ messages in thread From: David Daney @ 2011-01-05 17:32 UTC (permalink / raw) To: Jason Baron, Ralf Baechle Cc: peterz, mathieu.desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, michael, linux-kernel On 01/05/2011 07:43 AM, Jason Baron wrote: > Introduce: > > static __always_inline bool static_branch(struct jump_label_key *key) > > to replace the old JUMP_LABEL(key, label) macro. > > The new static_branch(), simplifies the usage of jump labels. Since, > static_branch() returns a boolean, it can be used as part of an if() > construct. It also, allows us to drop the 'label' argument from the > prototype. Its probably best understood with an example, here is the part > of the patch that converts the tracepoints to use unlikely_switch(): > > --- a/include/linux/tracepoint.h > +++ b/include/linux/tracepoint.h > @@ -146,9 +146,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin, > extern struct tracepoint __tracepoint_##name; \ > static inline void trace_##name(proto) \ > { \ > - JUMP_LABEL(&__tracepoint_##name.key, do_trace); \ > - return; \ > -do_trace: \ > + if (static_branch(&__tracepoint_##name.key)) \ > __DO_TRACE(&__tracepoint_##name, \ > TP_PROTO(data_proto), \ > TP_ARGS(data_args)); \ > > > I analyzed the code produced by static_branch(), and it seems to be > at least as good as the code generated by the JUMP_LABEL(). As a reminder, > we get a single nop in the fastpath for -02. But will often times get > a 'double jmp' in the -Os case. That is, 'jmp 0', followed by a jmp around > the disabled code. We believe that future gcc tweaks to allow block > re-ordering in the -Os, will solve the -Os case in the future. > > I also saw a 1-2% tbench throughput improvement when compiling with > jump labels. > > This patch also addresses a build issue that Tetsuo Handa reported where > gcc v3.3 currently chokes on compiling 'dynamic debug': > > include/net/inet_connection_sock.h: In function `inet_csk_reset_xmit_timer': > include/net/inet_connection_sock.h:236: error: duplicate label declaration `do_printk' > include/net/inet_connection_sock.h:219: error: this is a previous declaration > include/net/inet_connection_sock.h:236: error: duplicate label declaration `out' > include/net/inet_connection_sock.h:219: error: this is a previous declaration > include/net/inet_connection_sock.h:236: error: duplicate label `do_printk' > include/net/inet_connection_sock.h:236: error: duplicate label `out' > > > Thanks to H. Peter Anvin for suggesting this improved syntax. > > Suggested-by: H. 
Peter Anvin<hpa@linux.intel.com> > Signed-off-by: Jason Baron<jbaron@redhat.com> > Tested-by: Tetsuo Handa<penguin-kernel@i-love.sakura.ne.jp> > --- > arch/sparc/include/asm/jump_label.h | 25 ++++++++++++++----------- > arch/x86/include/asm/jump_label.h | 22 +++++++++++++--------- > arch/x86/kernel/jump_label.c | 2 +- > include/linux/dynamic_debug.h | 18 ++++-------------- > include/linux/jump_label.h | 26 +++++++++++--------------- > include/linux/jump_label_ref.h | 18 +++++++++++------- > include/linux/perf_event.h | 26 +++++++++++++------------- > include/linux/tracepoint.h | 4 +--- > kernel/jump_label.c | 2 +- > 9 files changed, 69 insertions(+), 74 deletions(-) > [...] This patch will conflict with the MIPS jump label support that Ralf has queued up for 2.6.38. David Daney ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch()
  2011-01-05 17:32   ` David Daney
@ 2011-01-05 17:43     ` Steven Rostedt
  2011-01-05 18:44       ` David Miller
  2011-01-05 18:56       ` H. Peter Anvin
  1 sibling, 2 replies; 113+ messages in thread
From: Steven Rostedt @ 2011-01-05 17:43 UTC (permalink / raw)
  To: David Daney
  Cc: Jason Baron, Ralf Baechle, peterz, mathieu.desnoyers, hpa,
      mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec,
      avi, davem, sam, michael, linux-kernel

On Wed, 2011-01-05 at 09:32 -0800, David Daney wrote:

> This patch will conflict with the MIPS jump label support that Ralf has
> queued up for 2.6.38.

Can you disable that support for now? As Linus said at Kernel Summit,
other archs jumped too quickly onto the jump label band wagon. This
change really needs to get in, and IMO, it is more critical to clean up
the jump label code than to have other archs implementing it.

-- Steve

^ permalink raw reply	[flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch()
  2011-01-05 17:43     ` Steven Rostedt
@ 2011-01-05 18:44       ` David Miller
  2011-01-05 20:04         ` Steven Rostedt
  1 sibling, 1 reply; 113+ messages in thread
From: David Miller @ 2011-01-05 18:44 UTC (permalink / raw)
  To: rostedt
  Cc: ddaney, jbaron, ralf, peterz, mathieu.desnoyers, hpa, mingo,
      tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi,
      sam, michael, linux-kernel

From: Steven Rostedt <rostedt@goodmis.org>
Date: Wed, 05 Jan 2011 12:43:59 -0500

> On Wed, 2011-01-05 at 09:32 -0800, David Daney wrote:
> 
>> This patch will conflict with the MIPS jump label support that Ralf has
>> queued up for 2.6.38.
> 
> Can you disable that support for now? As Linus said at Kernel Summit,
> other archs jumped too quickly onto the jump label band wagon.

I totally disagree with this assessment. Implementing jump label for
sparc64 as early as possible found so much broken stuff that otherwise
would have merged in before any other architecture tried supporting it.

^ permalink raw reply	[flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch()
  2011-01-05 18:44       ` David Miller
@ 2011-01-05 20:04         ` Steven Rostedt
  0 siblings, 0 replies; 113+ messages in thread
From: Steven Rostedt @ 2011-01-05 20:04 UTC (permalink / raw)
  To: David Miller
  Cc: ddaney, jbaron, ralf, peterz, mathieu.desnoyers, hpa, mingo,
      tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi,
      sam, michael, linux-kernel

On Wed, 2011-01-05 at 10:44 -0800, David Miller wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
> Date: Wed, 05 Jan 2011 12:43:59 -0500
> 
> > On Wed, 2011-01-05 at 09:32 -0800, David Daney wrote:
> > 
> >> This patch will conflict with the MIPS jump label support that Ralf has
> >> queued up for 2.6.38.
> > 
> > Can you disable that support for now? As Linus said at Kernel Summit,
> > other archs jumped too quickly onto the jump label band wagon.
> 
> I totally disagree with this assessment. Implementing jump label for
> sparc64 as early as possible found so much broken stuff that otherwise
> would have merged in before any other architecture tried supporting
> it.

The issue here is that jump labels went in too fast. And I agree that
having it ported to all archs is/was important. But the infrastructure
needs to be cleaned up. Probably best to get the kinks out in
linux-next as opposed to mainline.

-- Steve

^ permalink raw reply	[flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch()
  2011-01-05 17:43     ` Steven Rostedt
  2011-01-05 18:44       ` David Miller
@ 2011-01-05 18:56       ` H. Peter Anvin
  2011-01-05 19:14         ` Ingo Molnar
  1 sibling, 1 reply; 113+ messages in thread
From: H. Peter Anvin @ 2011-01-05 18:56 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: David Daney, Jason Baron, Ralf Baechle, peterz,
      mathieu.desnoyers, mingo, tglx, andi, roland, rth,
      masami.hiramatsu.pt, fweisbec, avi, davem, sam, michael,
      linux-kernel

On 01/05/2011 09:43 AM, Steven Rostedt wrote:
> On Wed, 2011-01-05 at 09:32 -0800, David Daney wrote:
> 
>> This patch will conflict with the MIPS jump label support that Ralf has
>> queued up for 2.6.38.
> 
> Can you disable that support for now? As Linus said at Kernel Summit,
> other archs jumped too quickly onto the jump label band wagon. This
> change really needs to get in, and IMO, it is more critical to clean up
> the jump label code than to have other archs implementing it.
> 

Ralf is really good... perhaps we can get the conflicts resolved?

	-hpa

^ permalink raw reply	[flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 18:56 ` H. Peter Anvin @ 2011-01-05 19:14 ` Ingo Molnar 2011-01-05 19:32 ` David Daney 0 siblings, 1 reply; 113+ messages in thread From: Ingo Molnar @ 2011-01-05 19:14 UTC (permalink / raw) To: H. Peter Anvin Cc: Steven Rostedt, David Daney, Jason Baron, Ralf Baechle, peterz, mathieu.desnoyers, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, michael, linux-kernel * H. Peter Anvin <hpa@zytor.com> wrote: > On 01/05/2011 09:43 AM, Steven Rostedt wrote: > > On Wed, 2011-01-05 at 09:32 -0800, David Daney wrote: > > > >> This patch will conflict with the MIPS jump label support that Ralf has > >> queued up for 2.6.38. > > > > Can you disable that support for now? As Linus said at Kernel Summit, > > other archs jumped too quickly onto the jump label band wagon. This > > change really needs to get in, and IMO, it is more critical to clean up > > the jump label code than to have other archs implementing it. > > > > Ralf is really good... perhaps we can get the conflicts resolved? Yep, the best Git-ish way to handle that is to resolve the conflicts whenever they happen - i.e. whoever merges his tree upstream later. No need for anyone to 'wait' or undo anything. Thanks, Ingo ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 19:14 ` Ingo Molnar @ 2011-01-05 19:32 ` David Daney 2011-01-05 19:50 ` Ingo Molnar 0 siblings, 1 reply; 113+ messages in thread From: David Daney @ 2011-01-05 19:32 UTC (permalink / raw) To: Ingo Molnar Cc: H. Peter Anvin, Steven Rostedt, Jason Baron, Ralf Baechle, peterz, mathieu.desnoyers, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, michael, linux-kernel On 01/05/2011 11:14 AM, Ingo Molnar wrote: > > * H. Peter Anvin<hpa@zytor.com> wrote: > >> On 01/05/2011 09:43 AM, Steven Rostedt wrote: >>> On Wed, 2011-01-05 at 09:32 -0800, David Daney wrote: >>> >>>> This patch will conflict with the MIPS jump label support that Ralf has >>>> queued up for 2.6.38. >>> >>> Can you disable that support for now? As Linus said at Kernel Summit, >>> other archs jumped too quickly onto the jump label band wagon. This >>> change really needs to get in, and IMO, it is more critical to clean up >>> the jump label code than to have other archs implementing it. >>> >> >> Ralf is really good... perhaps we can get the conflicts resolved? > > Yep, the best Git-ish way to handle that is to resolve the conflicts whenever they > happen - i.e. whoever merges his tree upstream later. No need for anyone to 'wait' > or undo anything. > There will be no git conflicts, as the affected files are disjoint. It will be manifested as a build failure for MIPS, which is why I raised the issue. No matter I guess. We will undoubtedly have many -rc releases in which we can merge any required adjustments. David Daney ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 19:32 ` David Daney @ 2011-01-05 19:50 ` Ingo Molnar 2011-01-05 20:07 ` David Daney 0 siblings, 1 reply; 113+ messages in thread From: Ingo Molnar @ 2011-01-05 19:50 UTC (permalink / raw) To: David Daney Cc: H. Peter Anvin, Steven Rostedt, Jason Baron, Ralf Baechle, peterz, mathieu.desnoyers, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, michael, linux-kernel * David Daney <ddaney@caviumnetworks.com> wrote: > On 01/05/2011 11:14 AM, Ingo Molnar wrote: > > > >* H. Peter Anvin<hpa@zytor.com> wrote: > > > >>On 01/05/2011 09:43 AM, Steven Rostedt wrote: > >>>On Wed, 2011-01-05 at 09:32 -0800, David Daney wrote: > >>> > >>>>This patch will conflict with the MIPS jump label support that Ralf has > >>>>queued up for 2.6.38. > >>> > >>>Can you disable that support for now? As Linus said at Kernel Summit, > >>>other archs jumped too quickly onto the jump label band wagon. This > >>>change really needs to get in, and IMO, it is more critical to clean up > >>>the jump label code than to have other archs implementing it. > >>> > >> > >>Ralf is really good... perhaps we can get the conflicts resolved? > > > >Yep, the best Git-ish way to handle that is to resolve the conflicts whenever they > >happen - i.e. whoever merges his tree upstream later. No need for anyone to 'wait' > >or undo anything. > > > > There will be no git conflicts, as the affected files are disjoint. I regularly resolve semantic conflicts in merge commits - or in the first followup commit. Thanks, Ingo ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 19:50 ` Ingo Molnar @ 2011-01-05 20:07 ` David Daney 2011-01-05 20:08 ` H. Peter Anvin 2011-01-05 20:18 ` Ingo Molnar 0 siblings, 2 replies; 113+ messages in thread From: David Daney @ 2011-01-05 20:07 UTC (permalink / raw) To: Ingo Molnar Cc: H. Peter Anvin, Steven Rostedt, Jason Baron, Ralf Baechle, peterz, mathieu.desnoyers, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, michael, linux-kernel On 01/05/2011 11:50 AM, Ingo Molnar wrote: > > * David Daney<ddaney@caviumnetworks.com> wrote: > >> On 01/05/2011 11:14 AM, Ingo Molnar wrote: >>> >>> * H. Peter Anvin<hpa@zytor.com> wrote: >>> >>>> On 01/05/2011 09:43 AM, Steven Rostedt wrote: >>>>> On Wed, 2011-01-05 at 09:32 -0800, David Daney wrote: >>>>> >>>>>> This patch will conflict with the MIPS jump label support that Ralf has >>>>>> queued up for 2.6.38. >>>>> >>>>> Can you disable that support for now? As Linus said at Kernel Summit, >>>>> other archs jumped too quickly onto the jump label band wagon. This >>>>> change really needs to get in, and IMO, it is more critical to clean up >>>>> the jump label code than to have other archs implementing it. >>>>> >>>> >>>> Ralf is really good... perhaps we can get the conflicts resolved? >>> >>> Yep, the best Git-ish way to handle that is to resolve the conflicts whenever they >>> happen - i.e. whoever merges his tree upstream later. No need for anyone to 'wait' >>> or undo anything. >>> >> >> There will be no git conflicts, as the affected files are disjoint. > > I regularly resolve semantic conflicts in merge commits - or in the first followup > commit. > But I am guessing that neither you, nor Linus, regularly build MIPS kernels with GCC-4.5.x *and* jump label support enabled. So how would such semantic conflict ever be detected? I would expect the conflict to first occur when Linus pulls Ralf's tree. I don't expect anybody to magically fix such things, so whatever happens, I will test it and submit patches if required. Thanks, David Daney ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 20:07 ` David Daney @ 2011-01-05 20:08 ` H. Peter Anvin 2011-01-05 20:18 ` Ingo Molnar 1 sibling, 0 replies; 113+ messages in thread From: H. Peter Anvin @ 2011-01-05 20:08 UTC (permalink / raw) To: David Daney Cc: Ingo Molnar, Steven Rostedt, Jason Baron, Ralf Baechle, peterz, mathieu.desnoyers, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, michael, linux-kernel On 01/05/2011 12:07 PM, David Daney wrote: > > But I am guessing that neither you, nor Linus, regularly build MIPS > kernels with GCC-4.5.x *and* jump label support enabled. So how would > such semantic conflict ever be detected? I would expect the conflict to > first occur when Linus pulls Ralf's tree. > > I don't expect anybody to magically fix such things, so whatever > happens, I will test it and submit patches if required. > If Ralf knows to expect them, then Ralf can take corrective actions as he thinks is appropriate. -hpa ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 20:07 ` David Daney 2011-01-05 20:08 ` H. Peter Anvin @ 2011-01-05 20:18 ` Ingo Molnar 1 sibling, 0 replies; 113+ messages in thread From: Ingo Molnar @ 2011-01-05 20:18 UTC (permalink / raw) To: David Daney Cc: H. Peter Anvin, Steven Rostedt, Jason Baron, Ralf Baechle, peterz, mathieu.desnoyers, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, michael, linux-kernel * David Daney <ddaney@caviumnetworks.com> wrote: > On 01/05/2011 11:50 AM, Ingo Molnar wrote: > > > >* David Daney<ddaney@caviumnetworks.com> wrote: > > > >>On 01/05/2011 11:14 AM, Ingo Molnar wrote: > >>> > >>>* H. Peter Anvin<hpa@zytor.com> wrote: > >>> > >>>>On 01/05/2011 09:43 AM, Steven Rostedt wrote: > >>>>>On Wed, 2011-01-05 at 09:32 -0800, David Daney wrote: > >>>>> > >>>>>>This patch will conflict with the MIPS jump label support that Ralf has > >>>>>>queued up for 2.6.38. > >>>>> > >>>>>Can you disable that support for now? As Linus said at Kernel Summit, > >>>>>other archs jumped too quickly onto the jump label band wagon. This > >>>>>change really needs to get in, and IMO, it is more critical to clean up > >>>>>the jump label code than to have other archs implementing it. > >>>>> > >>>> > >>>>Ralf is really good... perhaps we can get the conflicts resolved? > >>> > >>>Yep, the best Git-ish way to handle that is to resolve the conflicts whenever they > >>>happen - i.e. whoever merges his tree upstream later. No need for anyone to 'wait' > >>>or undo anything. > >>> > >> > >>There will be no git conflicts, as the affected files are disjoint. > > > >I regularly resolve semantic conflicts in merge commits - or in the first followup > >commit. > > > > But I am guessing that neither you, nor Linus, regularly build MIPS > kernels with GCC-4.5.x *and* jump label support enabled. [...] I build MIPS defconfig kernels at least once per day - so at least serious, wide-ranging issues should not slip through. Rarer combos possibly - but that's true of pretty much anything. > [...] So how would such semantic conflict ever be detected? I would expect the > conflict to first occur when Linus pulls Ralf's tree. If that slips through then a fix is queued up? Thanks, Ingo ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 17:32 ` David Daney 2011-01-05 17:43 ` Steven Rostedt @ 2011-01-05 21:16 ` Jason Baron 1 sibling, 0 replies; 113+ messages in thread From: Jason Baron @ 2011-01-05 21:16 UTC (permalink / raw) To: David Daney Cc: Ralf Baechle, peterz, mathieu.desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, michael, linux-kernel On Wed, Jan 05, 2011 at 09:32:11AM -0800, David Daney wrote: > On 01/05/2011 07:43 AM, Jason Baron wrote: >> Introduce: >> >> static __always_inline bool static_branch(struct jump_label_key *key) >> >> to replace the old JUMP_LABEL(key, label) macro. >> >> The new static_branch(), simplifies the usage of jump labels. Since, >> static_branch() returns a boolean, it can be used as part of an if() >> construct. It also, allows us to drop the 'label' argument from the >> prototype. Its probably best understood with an example, here is the part >> of the patch that converts the tracepoints to use unlikely_switch(): >> >> --- a/include/linux/tracepoint.h >> +++ b/include/linux/tracepoint.h >> @@ -146,9 +146,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin, >> extern struct tracepoint __tracepoint_##name; \ >> static inline void trace_##name(proto) \ >> { \ >> - JUMP_LABEL(&__tracepoint_##name.key, do_trace); \ >> - return; \ >> -do_trace: \ >> + if (static_branch(&__tracepoint_##name.key)) \ >> __DO_TRACE(&__tracepoint_##name, \ >> TP_PROTO(data_proto), \ >> TP_ARGS(data_args)); \ >> >> >> I analyzed the code produced by static_branch(), and it seems to be >> at least as good as the code generated by the JUMP_LABEL(). As a reminder, >> we get a single nop in the fastpath for -02. But will often times get >> a 'double jmp' in the -Os case. That is, 'jmp 0', followed by a jmp around >> the disabled code. We believe that future gcc tweaks to allow block >> re-ordering in the -Os, will solve the -Os case in the future. >> >> I also saw a 1-2% tbench throughput improvement when compiling with >> jump labels. >> >> This patch also addresses a build issue that Tetsuo Handa reported where >> gcc v3.3 currently chokes on compiling 'dynamic debug': >> >> include/net/inet_connection_sock.h: In function `inet_csk_reset_xmit_timer': >> include/net/inet_connection_sock.h:236: error: duplicate label declaration `do_printk' >> include/net/inet_connection_sock.h:219: error: this is a previous declaration >> include/net/inet_connection_sock.h:236: error: duplicate label declaration `out' >> include/net/inet_connection_sock.h:219: error: this is a previous declaration >> include/net/inet_connection_sock.h:236: error: duplicate label `do_printk' >> include/net/inet_connection_sock.h:236: error: duplicate label `out' >> >> >> Thanks to H. Peter Anvin for suggesting this improved syntax. >> >> Suggested-by: H. 
Peter Anvin<hpa@linux.intel.com> >> Signed-off-by: Jason Baron<jbaron@redhat.com> >> Tested-by: Tetsuo Handa<penguin-kernel@i-love.sakura.ne.jp> >> --- >> arch/sparc/include/asm/jump_label.h | 25 ++++++++++++++----------- >> arch/x86/include/asm/jump_label.h | 22 +++++++++++++--------- >> arch/x86/kernel/jump_label.c | 2 +- >> include/linux/dynamic_debug.h | 18 ++++-------------- >> include/linux/jump_label.h | 26 +++++++++++--------------- >> include/linux/jump_label_ref.h | 18 +++++++++++------- >> include/linux/perf_event.h | 26 +++++++++++++------------- >> include/linux/tracepoint.h | 4 +--- >> kernel/jump_label.c | 2 +- >> 9 files changed, 69 insertions(+), 74 deletions(-) >> > [...] > > This patch will conflict with the MIPS jump label support that Ralf has > queued up for 2.6.38. > > David Daney indeed. If you look at the x86 or sparc bits the fixup should be quite small. The bulk of the changes are in the common code. thanks, -Jason ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 15:43 ` [PATCH 2/2] jump label: introduce static_branch() Jason Baron 2011-01-05 17:15 ` Frederic Weisbecker 2011-01-05 17:32 ` David Daney @ 2011-01-05 17:41 ` Steven Rostedt 2011-01-09 18:48 ` Mathieu Desnoyers 3 siblings, 0 replies; 113+ messages in thread From: Steven Rostedt @ 2011-01-05 17:41 UTC (permalink / raw) To: Jason Baron Cc: peterz, mathieu.desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Wed, 2011-01-05 at 10:43 -0500, Jason Baron wrote: > Introduce: > > static __always_inline bool static_branch(struct jump_label_key *key) > > to replace the old JUMP_LABEL(key, label) macro. > > The new static_branch(), simplifies the usage of jump labels. Since, > static_branch() returns a boolean, it can be used as part of an if() > construct. It also, allows us to drop the 'label' argument from the > prototype. Its probably best understood with an example, here is the part > of the patch that converts the tracepoints to use unlikely_switch(): > > --- a/include/linux/tracepoint.h > +++ b/include/linux/tracepoint.h > @@ -146,9 +146,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin, > extern struct tracepoint __tracepoint_##name; \ > static inline void trace_##name(proto) \ > { \ > - JUMP_LABEL(&__tracepoint_##name.key, do_trace); \ > - return; \ > -do_trace: \ > + if (static_branch(&__tracepoint_##name.key)) \ > __DO_TRACE(&__tracepoint_##name, \ > TP_PROTO(data_proto), \ > TP_ARGS(data_args)); \ BTW, do not put real diffs in the change log. That is, remove the header from it. This can confuse tools that pull in patches from mailing lists. As this change will be done in the code itself. Thanks, -- Steve > > > I analyzed the code produced by static_branch(), and it seems to be > at least as good as the code generated by the JUMP_LABEL(). As a reminder, > we get a single nop in the fastpath for -02. But will often times get > a 'double jmp' in the -Os case. That is, 'jmp 0', followed by a jmp around > the disabled code. We believe that future gcc tweaks to allow block > re-ordering in the -Os, will solve the -Os case in the future. > > I also saw a 1-2% tbench throughput improvement when compiling with > jump labels. > > This patch also addresses a build issue that Tetsuo Handa reported where > gcc v3.3 currently chokes on compiling 'dynamic debug': ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 2/2] jump label: introduce static_branch() 2011-01-05 15:43 ` [PATCH 2/2] jump label: introduce static_branch() Jason Baron ` (2 preceding siblings ...) 2011-01-05 17:41 ` Steven Rostedt @ 2011-01-09 18:48 ` Mathieu Desnoyers 3 siblings, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-01-09 18:48 UTC (permalink / raw) To: Jason Baron Cc: peterz, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel * Jason Baron (jbaron@redhat.com) wrote: > Introduce: > > static __always_inline bool static_branch(struct jump_label_key *key) > > to replace the old JUMP_LABEL(key, label) macro. > > The new static_branch(), simplifies the usage of jump labels. Since, > static_branch() returns a boolean, it can be used as part of an if() > construct. It also, allows us to drop the 'label' argument from the > prototype. Its probably best understood with an example, here is the part > of the patch that converts the tracepoints to use unlikely_switch(): small nit: s/unlikely_switch/arch_static_branch/g Thanks, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-01-05 15:43 [PATCH 0/2] jump label: 2.6.38 updates Jason Baron 2011-01-05 15:43 ` [PATCH 1/2] jump label: make enable/disable o(1) Jason Baron 2011-01-05 15:43 ` [PATCH 2/2] jump label: introduce static_branch() Jason Baron @ 2011-02-11 19:25 ` Peter Zijlstra 2011-02-11 21:13 ` Mathieu Desnoyers [not found] ` <BLU0-SMTP101B686C32E10BA346B15F896EF0@phx.gbl> 2 siblings, 2 replies; 113+ messages in thread From: Peter Zijlstra @ 2011-02-11 19:25 UTC (permalink / raw) To: Jason Baron Cc: mathieu.desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Wed, 2011-01-05 at 10:43 -0500, Jason Baron wrote: > Hi, > > The first patch uses the storage space of the jump label key address > as a pointer into the update table. In this way, we can find all > the addresses that need to be updated without hashing. > > The second patch introduces: > > static __always_inline bool static_branch(struct jump_label_key *key); > > instead of the old JUMP_LABEL(key, label) macro. > > In this way, jump labels become really easy to use: > > Define: > > struct jump_label_key jump_key; > > Can be used as: > > if (static_branch(&jump_key)) > do unlikely code > > enable/disale via: > > jump_label_enable(&jump_key); > jump_label_disable(&jump_key); > > that's it! > > For perf, which also uses jump labels, I've left the reference counting > out of the jump label layer, thus removing the 'jump_label_inc()' and > 'jump_label_dec()' interface. Hopefully, this is a more palatable solution. Right, lets go with this. Maybe we'll manage to come up with something saner than _else_atomic_read(), but for now its an improvement over what we have. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-11 19:25 ` [PATCH 0/2] jump label: 2.6.38 updates Peter Zijlstra @ 2011-02-11 21:13 ` Mathieu Desnoyers [not found] ` <BLU0-SMTP101B686C32E10BA346B15F896EF0@phx.gbl> 1 sibling, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-11 21:13 UTC (permalink / raw) To: Peter Zijlstra Cc: Jason Baron, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel * Peter Zijlstra (peterz@infradead.org) wrote: > On Wed, 2011-01-05 at 10:43 -0500, Jason Baron wrote: > > Hi, > > > > The first patch uses the storage space of the jump label key address > > as a pointer into the update table. In this way, we can find all > > the addresses that need to be updated without hashing. > > > > The second patch introduces: > > > > static __always_inline bool static_branch(struct jump_label_key *key); > > > > instead of the old JUMP_LABEL(key, label) macro. > > > > In this way, jump labels become really easy to use: > > > > Define: > > > > struct jump_label_key jump_key; > > > > Can be used as: > > > > if (static_branch(&jump_key)) > > do unlikely code > > > > enable/disale via: > > > > jump_label_enable(&jump_key); > > jump_label_disable(&jump_key); > > > > that's it! > > > > For perf, which also uses jump labels, I've left the reference counting > > out of the jump label layer, thus removing the 'jump_label_inc()' and > > 'jump_label_dec()' interface. Hopefully, this is a more palatable solution. > > Right, lets go with this. Maybe we'll manage to come up with something > saner than _else_atomic_read(), but for now its an improvement over what > we have. I agree that keeping jump_label.h with the minimal clean API is a good goal, and this patchset is almost there (maybe except for the _else_atomic_read() part). Hrm, given that the atomic inc/dec return and test for 1/0 is moved into the Perf code, I wonder if it would make sense to move the "_else_atomic_read()" oddness into the perf code too ? Perf could declare, in its own header, a wrapper over __static_branch, e.g. put in perf_event.h: #ifdef HAVE_JUMP_LABEL static __always_inline bool perf_sw_event_static_branch_refcount(struct jump_label_key *key, atomic_t *ref) { return __static_branch(key); } #else static __always_inline bool perf_sw_event_static_branch_refcount(struct jump_label_key *key, atomic_t *ref) { if (unlikely(atomic_read(ref))) return true; return false; } #endif Otherwise, jump_label_ref.h looks like a half-baked interface that only provides the "test" API but not the ref/unref. If we have only a single user interested in refcounting, it might make more sense to put the code in perf_event.h. If we have many users using an atomic refcount like this, then we should extend jump_label_ref.h to also provide the ref/unref API. I don't care much about where it ends up, as long as it's a consistent choice. Thoughts ? Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
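A minimal usage sketch of the wrapper proposed in the mail above, for context. The 'jump_label_key_counter' layout (an atomic 'ref' paired with a 'key') is taken from later mails in this thread; the call site is illustrative, not the merged perf code:

static __always_inline void
perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
{
	/* patched jump under HAVE_JUMP_LABEL, atomic_read() fallback otherwise */
	if (perf_sw_event_static_branch_refcount(
			&perf_swevent_enabled[event_id].key,
			&perf_swevent_enabled[event_id].ref))
		__perf_sw_event(event_id, nr, nmi, regs, addr);
}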
* Re: [PATCH 0/2] jump label: 2.6.38 updates [not found] ` <BLU0-SMTP101B686C32E10BA346B15F896EF0@phx.gbl> @ 2011-02-11 21:38 ` Peter Zijlstra 2011-02-11 22:15 ` Jason Baron ` (3 more replies) 0 siblings, 4 replies; 113+ messages in thread From: Peter Zijlstra @ 2011-02-11 21:38 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Jason Baron, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Fri, 2011-02-11 at 16:13 -0500, Mathieu Desnoyers wrote: > > Thoughts ? #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) + +struct jump_label_key { + void *ptr; +}; struct jump_label_entry { struct hlist_node hlist; struct jump_entry *table; - int nr_entries; /* hang modules off here */ struct hlist_head modules; unsigned long key; + u32 nr_entries; + int refcount; }; #else +struct jump_label_key { + int state; +}; #endif So why can't we make that jump_label_entry::refcount and jump_label_key::state an atomic_t and be done with it? Then the enabled case uses if (atomic_inc_return(&key->ptr->refcount) == 1), and the disabled atomic_inc(&key->state). ^ permalink raw reply [flat|nested] 113+ messages in thread
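A short sketch of the counting Peter describes, under the assumption that 'refcount' and 'state' become atomic_t as he suggests. 'ptr', 'refcount' and 'state' are the fields from the quoted structures; the helper names are illustrative, and a real enable path would still have to serialize the patching under jump_label_mutex (as the rewrite later in this thread does):

static inline void jump_label_inc(struct jump_label_key *key)
{
#ifdef HAVE_JUMP_LABEL
	struct jump_label_entry *entry = key->ptr;

	/* 0 -> 1 transition: patch the branch sites in */
	if (atomic_inc_return(&entry->refcount) == 1)
		jump_label_update((unsigned long)key, JUMP_LABEL_ENABLE);
#else
	/* fallback: 'state' doubles as the reference count */
	atomic_inc(&key->state);
#endif
}

static inline void jump_label_dec(struct jump_label_key *key)
{
#ifdef HAVE_JUMP_LABEL
	struct jump_label_entry *entry = key->ptr;

	/* 1 -> 0 transition: patch the branch sites back out */
	if (atomic_dec_and_test(&entry->refcount))
		jump_label_update((unsigned long)key, JUMP_LABEL_DISABLE);
#else
	atomic_dec(&key->state);
#endif
}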
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-11 21:38 ` Peter Zijlstra @ 2011-02-11 22:15 ` Jason Baron 2011-02-11 22:19 ` H. Peter Anvin 2011-02-11 22:30 ` Mathieu Desnoyers 1 sibling, 2 replies; 113+ messages in thread From: Jason Baron @ 2011-02-11 22:15 UTC (permalink / raw) To: Peter Zijlstra Cc: Mathieu Desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Fri, Feb 11, 2011 at 10:38:17PM +0100, Peter Zijlstra wrote: > On Fri, 2011-02-11 at 16:13 -0500, Mathieu Desnoyers wrote: > > > > Thoughts ? > > #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) > + > +struct jump_label_key { > + void *ptr; > +}; > > struct jump_label_entry { > struct hlist_node hlist; > struct jump_entry *table; > - int nr_entries; > /* hang modules off here */ > struct hlist_head modules; > unsigned long key; > + u32 nr_entries; > + int refcount; > }; > > #else > > +struct jump_label_key { > + int state; > +}; > > #endif > > > > So why can't we make that jump_label_entry::refcount and > jump_label_key::state an atomic_t and be done with it? > > Then the enabled case uses if (atomic_inc_return(&key->ptr->refcount) == > 1), and the disabled atomic_inc(&key->state). > a bit of history... For the disabled jump label case, we didn't want to incur an atomic_read() to check if the branch was enabled. So, I separated the API, to have one for the non-atomic case, and one for the atomic case. Nobody liked that. So now, I'm proposing to leave the core API based around a non-atomic variable, and have any callers that want to use this atomic interface, to call into the non-atomic interface. If another user besides perf wants to use the same type of atomic interface, we can re-visit the decision? thanks, -Jason ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-11 22:15 ` Jason Baron @ 2011-02-11 22:19 ` H. Peter Anvin 2011-02-11 22:30 ` Mathieu Desnoyers 1 sibling, 0 replies; 113+ messages in thread From: H. Peter Anvin @ 2011-02-11 22:19 UTC (permalink / raw) To: Jason Baron Cc: Peter Zijlstra, Mathieu Desnoyers, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On 02/11/2011 02:15 PM, Jason Baron wrote: > > a bit of history... > > For the disabled jump label case, we didn't want to incur an atomic_read() to > check if the branch was enabled. > > So, I separated the API, to have one for the non-atomic case, and one > for the atomic case. Nobody liked that. > > So now, I'm proposing to leave the core API based around a non-atomic > variable, and have any callers that want to use this atomic interface, > to call into the non-atomic interface. If another user besides perf > wants to use the same type of atomic interface, we can re-visit the > decsion? > What is the problem with taking the atomic_read()? -hpa ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-11 22:15 ` Jason Baron 2011-02-11 22:19 ` H. Peter Anvin @ 2011-02-11 22:30 ` Mathieu Desnoyers 1 sibling, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-11 22:30 UTC (permalink / raw) To: Jason Baron Cc: Peter Zijlstra, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel * Jason Baron (jbaron@redhat.com) wrote: > On Fri, Feb 11, 2011 at 10:38:17PM +0100, Peter Zijlstra wrote: > > On Fri, 2011-02-11 at 16:13 -0500, Mathieu Desnoyers wrote: > > > > > > Thoughts ? > > > > #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) > > + > > +struct jump_label_key { > > + void *ptr; > > +}; > > > > struct jump_label_entry { > > struct hlist_node hlist; > > struct jump_entry *table; > > - int nr_entries; > > /* hang modules off here */ > > struct hlist_head modules; > > unsigned long key; > > + u32 nr_entries; > > + int refcount; > > }; > > > > #else > > > > +struct jump_label_key { > > + int state; > > +}; > > > > #endif > > > > > > > > So why can't we make that jump_label_entry::refcount and > > jump_label_key::state an atomic_t and be done with it? > > > > Then the enabled case uses if (atomic_inc_return(&key->ptr->refcount) == > > 1), and the disabled atomic_inc(&key->state). > > > > a bit of history... > > For the disabled jump label case, we didn't want to incur an atomic_read() to > check if the branch was enabled. > > So, I separated the API, to have one for the non-atomic case, and one > for the atomic case. Nobody liked that. > > So now, I'm proposing to leave the core API based around a non-atomic > variable, and have any callers that want to use this atomic interface, > to call into the non-atomic interface. If another user besides perf > wants to use the same type of atomic interface, we can re-visit the > decision? See my other email to PeterZ. I think it might be better to keep the interface really clean and take the compiler optimization hit on the volatile if we figure out that it is negligible. I'd love to see benchmarks on the impact of this change to justify that we can actually dismiss the performance impact. We have enough tracepoints in the kernel that if we figure out that it does not make a noticeable performance difference in !JUMP_LABEL configs with tracepoints enabled, we can as well take the volatile. But please document these benchmarks in the patch changelog. Also looking at the disassembly of core instrumented kernel functions to see if the added volatile hurts the basic block ordering, and documenting that, would be helpful. I'd recommend a jump_label_ref()/jump_label_unref() interface (similar to kref) instead of enable/disable though (to make it clear that we have reference counter handling in there). Long story short: I'm not against adding the volatile read here. I'm against adding it without measuring and documenting the impact of this change. Thanks, Mathieu > > thanks, > > -Jason -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
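A hedged sketch of the kref-style interface Mathieu recommends above: the names jump_label_ref()/jump_label_unref() come from his mail, while the bodies and the 'jump_label_key_counter' type (an atomic 'ref' next to the 'key') are assumptions borrowed from the rewrite later in this thread.

static inline void jump_label_ref(struct jump_label_key_counter *key)
{
	/* first reference: enable the branch */
	if (atomic_inc_return(&key->ref) == 1)
		jump_label_enable(&key->key);
}

static inline void jump_label_unref(struct jump_label_key_counter *key)
{
	/* last reference dropped: disable the branch */
	if (atomic_dec_and_test(&key->ref))
		jump_label_disable(&key->key);
}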
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-11 21:38 ` Peter Zijlstra 2011-02-11 22:15 ` Jason Baron @ 2011-02-11 22:20 ` Mathieu Desnoyers [not found] ` <BLU0-SMTP8562BA758CF8AAE5323AE296EF0@phx.gbl> 2011-02-12 18:47 ` Peter Zijlstra 3 siblings, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-11 22:20 UTC (permalink / raw) To: Peter Zijlstra Cc: Jason Baron, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel * Peter Zijlstra (peterz@infradead.org) wrote: > On Fri, 2011-02-11 at 16:13 -0500, Mathieu Desnoyers wrote: > > > > Thoughts ? > > #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) > + > +struct jump_label_key { > + void *ptr; > +}; > > struct jump_label_entry { > struct hlist_node hlist; > struct jump_entry *table; > - int nr_entries; > /* hang modules off here */ > struct hlist_head modules; > unsigned long key; > + u32 nr_entries; > + int refcount; > }; > > #else > > +struct jump_label_key { > + int state; > +}; > > #endif > > So why can't we make that jump_label_entry::refcount and > jump_label_key::state an atomic_t and be done with it? > > Then the enabled case uses if (atomic_inc_return(&key->ptr->refcount) == > 1), and the disabled atomic_inc(&key->state). > OK, by "enabled" you mean #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL), and "disabled", the #else. I guess the only downside is the extra volatile for the atomic_read for the fallback case, which is not really much of a problem realistically speaking: anyway, the volatile is a good thing to have in the fallback case to force the compiler to re-read the variable. Let's go with your idea. Thanks, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates [not found] ` <BLU0-SMTP8562BA758CF8AAE5323AE296EF0@phx.gbl> @ 2011-02-11 22:27 ` Jason Baron 2011-02-11 22:32 ` Mathieu Desnoyers 0 siblings, 1 reply; 113+ messages in thread From: Jason Baron @ 2011-02-11 22:27 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Peter Zijlstra, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Fri, Feb 11, 2011 at 05:20:25PM -0500, Mathieu Desnoyers wrote: > * Peter Zijlstra (peterz@infradead.org) wrote: > > On Fri, 2011-02-11 at 16:13 -0500, Mathieu Desnoyers wrote: > > > > > > Thoughts ? > > > > #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) > > + > > +struct jump_label_key { > > + void *ptr; > > +}; > > > > struct jump_label_entry { > > struct hlist_node hlist; > > struct jump_entry *table; > > - int nr_entries; > > /* hang modules off here */ > > struct hlist_head modules; > > unsigned long key; > > + u32 nr_entries; > > + int refcount; > > }; > > > > #else > > > > +struct jump_label_key { > > + int state; > > +}; > > > > #endif > > > > So why can't we make that jump_label_entry::refcount and > > jump_label_key::state an atomic_t and be done with it? > > > > Then the enabled case uses if (atomic_inc_return(&key->ptr->refcount) == > > 1), and the disabled atomic_inc(&key->state). > > > > OK, by "enabled" you mean #if defined(CC_HAVE_ASM_GOTO) && > defined(CONFIG_JUMP_LABEL), and "disabled", the #else. > > I guess the only downside is the extra volatile for the atomic_read for > the fallback case, which is not really much of a problem realistically > speaking: anyway, the volatile is a good thing to have in the fallback > case to force the compiler to re-read the variable. Let's go with your > idea. > > Thanks, > > Mathieu > ok, I'll try and re-spin the interface based around atomic_t, if we are all agreed... there was also a circular dependency issue with atomic.h including kernel.h which included jump_label.h, and that was why we had a separate jump_label_ref.h header file, but hopefully I can resolve that in a clean way. thanks, -Jason ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-11 22:27 ` Jason Baron @ 2011-02-11 22:32 ` Mathieu Desnoyers 0 siblings, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-11 22:32 UTC (permalink / raw) To: Jason Baron Cc: Peter Zijlstra, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel * Jason Baron (jbaron@redhat.com) wrote: > On Fri, Feb 11, 2011 at 05:20:25PM -0500, Mathieu Desnoyers wrote: > > * Peter Zijlstra (peterz@infradead.org) wrote: > > > On Fri, 2011-02-11 at 16:13 -0500, Mathieu Desnoyers wrote: > > > > > > > > Thoughts ? > > > > > > #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) > > > + > > > +struct jump_label_key { > > > + void *ptr; > > > +}; > > > > > > struct jump_label_entry { > > > struct hlist_node hlist; > > > struct jump_entry *table; > > > - int nr_entries; > > > /* hang modules off here */ > > > struct hlist_head modules; > > > unsigned long key; > > > + u32 nr_entries; > > > + int refcount; > > > }; > > > > > > #else > > > > > > +struct jump_label_key { > > > + int state; > > > +}; > > > > > > #endif > > > > > > So why can't we make that jump_label_entry::refcount and > > > jump_label_key::state an atomic_t and be done with it? > > > > > > Then the enabled case uses if (atomic_inc_return(&key->ptr->refcount) == > > > 1), and the disabled atomic_inc(&key->state). > > > > > > > OK, by "enabled" you mean #if defined(CC_HAVE_ASM_GOTO) && > > defined(CONFIG_JUMP_LABEL), and "disabled", the #else. > > > > I guess the only downside is the extra volatile for the atomic_read for > > the fallback case, which is not really much of a problem realistically > > speaking: anyway, the volatile is a good thing to have in the fallback > > case to force the compiler to re-read the variable. Let's go with your > > idea. > > > > Thanks, > > > > Mathieu > > > > ok, I'll try and re-spin the interface based around atomic_t, if we are all > agreed... there was also a circular dependency issue with atomic.h including > kernel.h which included jump_label.h, and that was why we had a separate > jump_label_ref.h header file, but hopefully I can resolve that in a clean > way. See spinlocks ? jump_label_types.h (structure definitions, includes types.h, included from kernel.h) jump_label.h (prototypes, inline functions, includes atomic.h) Thanks, Mathieu > > thanks, > > -Jason -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
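A rough sketch of the spinlock-style split suggested above, hedged: the file names are Mathieu's suggestion and the contents are assumptions, just to show how the atomic.h -> kernel.h -> jump_label.h cycle gets broken. Note that the atomic_t type itself lives in linux/types.h; only the atomic_*() operations need asm/atomic.h.

/* jump_label_types.h: type definitions only, includes types.h,
 * safe to pull in from kernel.h */
struct jump_label_key {
	atomic_t enabled;
};

/* jump_label.h: prototypes and inline functions; may include
 * atomic.h without looping back through kernel.h */
#include <linux/jump_label_types.h>
#include <asm/atomic.h>

static inline bool jump_label_enabled(struct jump_label_key *key)
{
	return atomic_read(&key->enabled) != 0;
}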
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-11 21:38 ` Peter Zijlstra ` (2 preceding siblings ...) [not found] ` <BLU0-SMTP8562BA758CF8AAE5323AE296EF0@phx.gbl> @ 2011-02-12 18:47 ` Peter Zijlstra 2011-02-14 12:27 ` Ingo Molnar ` (2 more replies) 3 siblings, 3 replies; 113+ messages in thread From: Peter Zijlstra @ 2011-02-12 18:47 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Jason Baron, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Fri, 2011-02-11 at 22:38 +0100, Peter Zijlstra wrote: > > So why can't we make that jump_label_entry::refcount and > jump_label_key::state an atomic_t and be done with it? So I had a bit of a poke at this because I didn't quite understand why all that stuff was as it was. I applied both Jason's patches and then basically rewrote kernel/jump_label.c just for kicks ;-) I haven't tried compiling this, let alone running it, but provided I didn't actually forget anything the storage per key is now 16 bytes when modules are disabled and 24 * (1 + mods) bytes for when they are enabled. The old code had 64 + 40 * mods bytes. I still need to clean up the static_branch_else bits and look at !x86 aside from the already mentioned bits.. but what do people think? --- arch/sparc/include/asm/jump_label.h | 25 +- arch/x86/include/asm/jump_label.h | 22 +- arch/x86/kernel/jump_label.c | 2 +- arch/x86/kernel/module.c | 3 - include/linux/dynamic_debug.h | 10 +- include/linux/jump_label.h | 71 +++--- include/linux/jump_label_ref.h | 36 +-- include/linux/module.h | 1 + include/linux/perf_event.h | 28 +- include/linux/tracepoint.h | 8 +- kernel/jump_label.c | 516 +++++++++++++---------------------- kernel/module.c | 7 + kernel/perf_event.c | 30 ++- kernel/timer.c | 8 +- kernel/tracepoint.c | 22 +- 15 files changed, 333 insertions(+), 456 deletions(-) diff --git a/arch/sparc/include/asm/jump_label.h b/arch/sparc/include/asm/jump_label.h index 427d468..e4ca085 100644 --- a/arch/sparc/include/asm/jump_label.h +++ b/arch/sparc/include/asm/jump_label.h @@ -7,17 +7,20 @@ #define JUMP_LABEL_NOP_SIZE 4 -#define JUMP_LABEL(key, label) \ - do { \ - asm goto("1:\n\t" \ - "nop\n\t" \ - "nop\n\t" \ - ".pushsection __jump_table, \"a\"\n\t"\ - ".align 4\n\t" \ - ".word 1b, %l[" #label "], %c0\n\t" \ - ".popsection \n\t" \ - : : "i" (key) : : label);\ - } while (0) +static __always_inline bool __static_branch(struct jump_label_key *key) +{ + asm goto("1:\n\t" + "nop\n\t" + "nop\n\t" + ".pushsection __jump_table, \"a\"\n\t" + ".align 4\n\t" + ".word 1b, %l[l_yes], %c0\n\t" + ".popsection \n\t" + : : "i" (key) : : l_yes); + return false; +l_yes: + return true; +} #endif /* __KERNEL__ */ diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h index 574dbc2..3d44a7c 100644 --- a/arch/x86/include/asm/jump_label.h +++ b/arch/x86/include/asm/jump_label.h @@ -5,20 +5,24 @@ #include <linux/types.h> #include <asm/nops.h> +#include <asm/asm.h> #define JUMP_LABEL_NOP_SIZE 5 # define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t" -# define JUMP_LABEL(key, label) \ - do { \ - asm goto("1:" \ - JUMP_LABEL_INITIAL_NOP \ - ".pushsection __jump_table, \"aw\" \n\t"\ - _ASM_PTR "1b, %l[" #label "], %c0 \n\t" \ - ".popsection \n\t" \ - : : "i" (key) : : label); \ - } while (0) +static __always_inline bool __static_branch(struct jump_label_key *key) +{ + asm goto("1:" + JUMP_LABEL_INITIAL_NOP + ".pushsection __jump_table, \"a\" \n\t" + _ASM_PTR "1b, %l[l_yes], %c0 \n\t" + ".popsection \n\t" + : : 
"i" (key) : : l_yes ); + return false; +l_yes: + return true; +} #endif /* __KERNEL__ */ diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c index 961b6b3..dfa4c3c 100644 --- a/arch/x86/kernel/jump_label.c +++ b/arch/x86/kernel/jump_label.c @@ -4,13 +4,13 @@ * Copyright (C) 2009 Jason Baron <jbaron@redhat.com> * */ -#include <linux/jump_label.h> #include <linux/memory.h> #include <linux/uaccess.h> #include <linux/module.h> #include <linux/list.h> #include <linux/jhash.h> #include <linux/cpu.h> +#include <linux/jump_label.h> #include <asm/kprobes.h> #include <asm/alternative.h> diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c index ab23f1a..0e6b823 100644 --- a/arch/x86/kernel/module.c +++ b/arch/x86/kernel/module.c @@ -230,9 +230,6 @@ int module_finalize(const Elf_Ehdr *hdr, apply_paravirt(pseg, pseg + para->sh_size); } - /* make jump label nops */ - jump_label_apply_nops(me); - return 0; } diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h index 1c70028..2ade291 100644 --- a/include/linux/dynamic_debug.h +++ b/include/linux/dynamic_debug.h @@ -33,7 +33,7 @@ struct _ddebug { #define _DPRINTK_FLAGS_PRINT (1<<0) /* printk() a message using the format */ #define _DPRINTK_FLAGS_DEFAULT 0 unsigned int flags:8; - char enabled; + struct jump_label_key enabled; } __attribute__((aligned(8))); @@ -48,8 +48,8 @@ extern int ddebug_remove_module(const char *mod_name); __used \ __attribute__((section("__verbose"), aligned(8))) = \ { KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \ - _DPRINTK_FLAGS_DEFAULT }; \ - if (unlikely(descriptor.enabled)) \ + _DPRINTK_FLAGS_DEFAULT, JUMP_LABEL_INIT }; \ + if (static_branch(&descriptor.enabled)) \ printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \ } while (0) @@ -59,8 +59,8 @@ extern int ddebug_remove_module(const char *mod_name); __used \ __attribute__((section("__verbose"), aligned(8))) = \ { KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \ - _DPRINTK_FLAGS_DEFAULT }; \ - if (unlikely(descriptor.enabled)) \ + _DPRINTK_FLAGS_DEFAULT, JUMP_LABEL_INIT }; \ + if (static_branch(&descriptor.enabled)) \ dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); \ } while (0) diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h index 7880f18..a1cec0a 100644 --- a/include/linux/jump_label.h +++ b/include/linux/jump_label.h @@ -2,19 +2,35 @@ #define _LINUX_JUMP_LABEL_H #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) + +struct jump_label_key { + atomic_t enabled; + struct jump_entry *entries; +#ifdef CONFIG_MODULES + struct jump_module *next; +#endif +}; + # include <asm/jump_label.h> # define HAVE_JUMP_LABEL #endif enum jump_label_type { + JUMP_LABEL_DISABLE = 0, JUMP_LABEL_ENABLE, - JUMP_LABEL_DISABLE }; struct module; +#define JUMP_LABEL_INIT { 0 } + #ifdef HAVE_JUMP_LABEL +static __always_inline bool static_branch(struct jump_label_key *key) +{ + return __static_branch(key); +} + extern struct jump_entry __start___jump_table[]; extern struct jump_entry __stop___jump_table[]; @@ -23,37 +39,31 @@ extern void jump_label_unlock(void); extern void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type); extern void arch_jump_label_text_poke_early(jump_label_t addr); -extern void jump_label_update(unsigned long key, enum jump_label_type type); -extern void jump_label_apply_nops(struct module *mod); extern int jump_label_text_reserved(void *start, void *end); - -#define jump_label_enable(key) \ - jump_label_update((unsigned long)key, JUMP_LABEL_ENABLE); - -#define 
jump_label_disable(key) \ - jump_label_update((unsigned long)key, JUMP_LABEL_DISABLE); +extern void jump_label_enable(struct jump_label_key *key); +extern void jump_label_disable(struct jump_label_key *key); #else -#define JUMP_LABEL(key, label) \ -do { \ - if (unlikely(*key)) \ - goto label; \ -} while (0) +struct jump_label_key { + atomic_t enabled; +}; -#define jump_label_enable(cond_var) \ -do { \ - *(cond_var) = 1; \ -} while (0) +static __always_inline bool static_branch(struct jump_label_key *key) +{ + if (unlikely(atomic_read(&key->enabled))) + return true; + return false; +} -#define jump_label_disable(cond_var) \ -do { \ - *(cond_var) = 0; \ -} while (0) +static inline void jump_label_enable(struct jump_label_key *key) +{ + atomic_inc(&key->enabled); +} -static inline int jump_label_apply_nops(struct module *mod) +static inline void jump_label_disable(struct jump_label_key *key) { - return 0; + atomic_dec(&key->enabled); } static inline int jump_label_text_reserved(void *start, void *end) @@ -66,14 +76,9 @@ static inline void jump_label_unlock() {} #endif -#define COND_STMT(key, stmt) \ -do { \ - __label__ jl_enabled; \ - JUMP_LABEL(key, jl_enabled); \ - if (0) { \ -jl_enabled: \ - stmt; \ - } \ -} while (0) +static inline bool jump_label_enabled(struct jump_label_key *key) +{ + return !!atomic_read(&key->enabled); +} #endif diff --git a/include/linux/jump_label_ref.h b/include/linux/jump_label_ref.h index e5d012a..5178696 100644 --- a/include/linux/jump_label_ref.h +++ b/include/linux/jump_label_ref.h @@ -4,41 +4,27 @@ #include <linux/jump_label.h> #include <asm/atomic.h> -#ifdef HAVE_JUMP_LABEL +struct jump_label_key_counter { + atomic_t ref; + struct jump_label_key key; +}; -static inline void jump_label_inc(atomic_t *key) -{ - if (atomic_add_return(1, key) == 1) - jump_label_enable(key); -} +#ifdef HAVE_JUMP_LABEL -static inline void jump_label_dec(atomic_t *key) +static __always_inline bool static_branch_else_atomic_read(struct jump_label_key *key, atomic_t *count) { - if (atomic_dec_and_test(key)) - jump_label_disable(key); + return __static_branch(key); } #else /* !HAVE_JUMP_LABEL */ -static inline void jump_label_inc(atomic_t *key) +static __always_inline bool static_branch_else_atomic_read(struct jump_label_key *key, atomic_t *count) { - atomic_inc(key); + if (unlikely(atomic_read(count))) + return true; + return false; } -static inline void jump_label_dec(atomic_t *key) -{ - atomic_dec(key); -} - -#undef JUMP_LABEL -#define JUMP_LABEL(key, label) \ -do { \ - if (unlikely(__builtin_choose_expr( \ - __builtin_types_compatible_p(typeof(key), atomic_t *), \ - atomic_read((atomic_t *)(key)), *(key)))) \ - goto label; \ -} while (0) - #endif /* HAVE_JUMP_LABEL */ #endif /* _LINUX_JUMP_LABEL_REF_H */ diff --git a/include/linux/module.h b/include/linux/module.h index 9bdf27c..eeb3e99 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -266,6 +266,7 @@ enum module_state MODULE_STATE_LIVE, MODULE_STATE_COMING, MODULE_STATE_GOING, + MODULE_STATE_POST_RELOCATE, }; struct module diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index dda5b0a..26fe115 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1000,7 +1000,7 @@ static inline int is_software_event(struct perf_event *event) return event->pmu->task_ctx_nr == perf_sw_context; } -extern atomic_t perf_swevent_enabled[PERF_COUNT_SW_MAX]; +extern struct jump_label_key_counter perf_swevent_enabled[PERF_COUNT_SW_MAX]; extern void __perf_sw_event(u32, u64, int, struct pt_regs *,
u64); @@ -1029,30 +1029,32 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr) { struct pt_regs hot_regs; - JUMP_LABEL(&perf_swevent_enabled[event_id], have_event); - return; - -have_event: - if (!regs) { - perf_fetch_caller_regs(&hot_regs); - regs = &hot_regs; + if (static_branch_else_atomic_read(&perf_swevent_enabled[event_id].key, + &perf_swevent_enabled[event_id].ref)) { + if (!regs) { + perf_fetch_caller_regs(&hot_regs); + regs = &hot_regs; + } + __perf_sw_event(event_id, nr, nmi, regs, addr); } - __perf_sw_event(event_id, nr, nmi, regs, addr); } -extern atomic_t perf_task_events; +extern struct jump_label_key_counter perf_task_events; static inline void perf_event_task_sched_in(struct task_struct *task) { - COND_STMT(&perf_task_events, __perf_event_task_sched_in(task)); + if (static_branch_else_atomic_read(&perf_task_events.key, + &perf_task_events.ref)) + __perf_event_task_sched_in(task); } static inline void perf_event_task_sched_out(struct task_struct *task, struct task_struct *next) { perf_sw_event(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, 1, NULL, 0); - - COND_STMT(&perf_task_events, __perf_event_task_sched_out(task, next)); + if (static_branch_else_atomic_read(&perf_task_events.key, + &perf_task_events.ref)) + __perf_event_task_sched_out(task, next); } extern void perf_event_mmap(struct vm_area_struct *vma); diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 97c84a5..6c8c747 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -29,7 +29,7 @@ struct tracepoint_func { struct tracepoint { const char *name; /* Tracepoint name */ - int state; /* State. */ + struct jump_label_key key; void (*regfunc)(void); void (*unregfunc)(void); struct tracepoint_func __rcu *funcs; @@ -146,9 +146,7 @@ void tracepoint_update_probe_range(struct tracepoint * const *begin, extern struct tracepoint __tracepoint_##name; \ static inline void trace_##name(proto) \ { \ - JUMP_LABEL(&__tracepoint_##name.state, do_trace); \ - return; \ -do_trace: \ + if (static_branch(&__tracepoint_##name.key)) \ __DO_TRACE(&__tracepoint_##name, \ TP_PROTO(data_proto), \ TP_ARGS(data_args), \ @@ -181,7 +179,7 @@ do_trace: \ __attribute__((section("__tracepoints_strings"))) = #name; \ struct tracepoint __tracepoint_##name \ __attribute__((section("__tracepoints"))) = \ - { __tpstrtab_##name, 0, reg, unreg, NULL }; \ + { __tpstrtab_##name, JUMP_LABEL_INIT, reg, unreg, NULL };\ static struct tracepoint * const __tracepoint_ptr_##name __used \ __attribute__((section("__tracepoints_ptrs"))) = \ &__tracepoint_##name; diff --git a/kernel/jump_label.c b/kernel/jump_label.c index 3b79bd9..29b34be 100644 --- a/kernel/jump_label.c +++ b/kernel/jump_label.c @@ -2,9 +2,9 @@ * jump label support * * Copyright (C) 2009 Jason Baron <jbaron@redhat.com> + * Copyright (C) 2011 Peter Zijlstra <pzijlstr@redhat.com> * */ -#include <linux/jump_label.h> #include <linux/memory.h> #include <linux/uaccess.h> #include <linux/module.h> @@ -13,32 +13,13 @@ #include <linux/slab.h> #include <linux/sort.h> #include <linux/err.h> +#include <linux/jump_label.h> #ifdef HAVE_JUMP_LABEL -#define JUMP_LABEL_HASH_BITS 6 -#define JUMP_LABEL_TABLE_SIZE (1 << JUMP_LABEL_HASH_BITS) -static struct hlist_head jump_label_table[JUMP_LABEL_TABLE_SIZE]; - /* mutex to protect coming/going of the the jump_label table */ static DEFINE_MUTEX(jump_label_mutex); -struct jump_label_entry { - struct hlist_node hlist; - struct jump_entry *table; - int nr_entries; - /* hang modules off here */ - struct hlist_head 
modules; - unsigned long key; -}; - -struct jump_label_module_entry { - struct hlist_node hlist; - struct jump_entry *table; - int nr_entries; - struct module *mod; -}; - void jump_label_lock(void) { mutex_lock(&jump_label_mutex); @@ -64,7 +45,7 @@ static int jump_label_cmp(const void *a, const void *b) } static void -sort_jump_label_entries(struct jump_entry *start, struct jump_entry *stop) +jump_label_sort_entries(struct jump_entry *start, struct jump_entry *stop) { unsigned long size; @@ -73,118 +54,25 @@ sort_jump_label_entries(struct jump_entry *start, struct jump_entry *stop) sort(start, size, sizeof(struct jump_entry), jump_label_cmp, NULL); } -static struct jump_label_entry *get_jump_label_entry(jump_label_t key) -{ - struct hlist_head *head; - struct hlist_node *node; - struct jump_label_entry *e; - u32 hash = jhash((void *)&key, sizeof(jump_label_t), 0); - - head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)]; - hlist_for_each_entry(e, node, head, hlist) { - if (key == e->key) - return e; - } - return NULL; -} +static void jump_label_update(struct jump_label_key *key, int enable); -static struct jump_label_entry * -add_jump_label_entry(jump_label_t key, int nr_entries, struct jump_entry *table) +void jump_label_enable(struct jump_label_key *key) { - struct hlist_head *head; - struct jump_label_entry *e; - u32 hash; - - e = get_jump_label_entry(key); - if (e) - return ERR_PTR(-EEXIST); - - e = kmalloc(sizeof(struct jump_label_entry), GFP_KERNEL); - if (!e) - return ERR_PTR(-ENOMEM); - - hash = jhash((void *)&key, sizeof(jump_label_t), 0); - head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)]; - e->key = key; - e->table = table; - e->nr_entries = nr_entries; - INIT_HLIST_HEAD(&(e->modules)); - hlist_add_head(&e->hlist, head); - return e; -} + if (atomic_inc_not_zero(&key->enabled)) + return; -static int -build_jump_label_hashtable(struct jump_entry *start, struct jump_entry *stop) -{ - struct jump_entry *iter, *iter_begin; - struct jump_label_entry *entry; - int count; - - sort_jump_label_entries(start, stop); - iter = start; - while (iter < stop) { - entry = get_jump_label_entry(iter->key); - if (!entry) { - iter_begin = iter; - count = 0; - while ((iter < stop) && - (iter->key == iter_begin->key)) { - iter++; - count++; - } - entry = add_jump_label_entry(iter_begin->key, - count, iter_begin); - if (IS_ERR(entry)) - return PTR_ERR(entry); - } else { - WARN_ONCE(1, KERN_ERR "build_jump_hashtable: unexpected entry!\n"); - return -1; - } - } - return 0; + jump_label_lock(); + if (atomic_add_return(1, &key->enabled) == 1) + jump_label_update(key, JUMP_LABEL_ENABLE); + jump_label_unlock(); } -/*** - * jump_label_update - update jump label text - * @key - key value associated with a a jump label - * @type - enum set to JUMP_LABEL_ENABLE or JUMP_LABEL_DISABLE - * - * Will enable/disable the jump for jump label @key, depending on the - * value of @type.
- * - */ - -void jump_label_update(unsigned long key, enum jump_label_type type) +void jump_label_disable(struct jump_label_key *key) { - struct jump_entry *iter; - struct jump_label_entry *entry; - struct hlist_node *module_node; - struct jump_label_module_entry *e_module; - int count; + if (!atomic_dec_and_mutex_lock(&key->enabled, &jump_label_mutex)) + return; - jump_label_lock(); - entry = get_jump_label_entry((jump_label_t)key); - if (entry) { - count = entry->nr_entries; - iter = entry->table; - while (count--) { - if (kernel_text_address(iter->code)) - arch_jump_label_transform(iter, type); - iter++; - } - /* eanble/disable jump labels in modules */ - hlist_for_each_entry(e_module, module_node, &(entry->modules), - hlist) { - count = e_module->nr_entries; - iter = e_module->table; - while (count--) { - if (iter->key && - kernel_text_address(iter->code)) - arch_jump_label_transform(iter, type); - iter++; - } - } - } + jump_label_update(key, JUMP_LABEL_DISABLE); jump_label_unlock(); } @@ -197,77 +85,30 @@ static int addr_conflict(struct jump_entry *entry, void *start, void *end) return 0; } -#ifdef CONFIG_MODULES - -static int module_conflict(void *start, void *end) -{ - struct hlist_head *head; - struct hlist_node *node, *node_next, *module_node, *module_node_next; - struct jump_label_entry *e; - struct jump_label_module_entry *e_module; - struct jump_entry *iter; - int i, count; - int conflict = 0; - - for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) { - head = &jump_label_table[i]; - hlist_for_each_entry_safe(e, node, node_next, head, hlist) { - hlist_for_each_entry_safe(e_module, module_node, - module_node_next, - &(e->modules), hlist) { - count = e_module->nr_entries; - iter = e_module->table; - while (count--) { - if (addr_conflict(iter, start, end)) { - conflict = 1; - goto out; - } - iter++; - } - } - } - } -out: - return conflict; -} - -#endif - -/*** - * jump_label_text_reserved - check if addr range is reserved - * @start: start text addr - * @end: end text addr - * - * checks if the text addr located between @start and @end - * overlaps with any of the jump label patch addresses. Code - * that wants to modify kernel text should first verify that - * it does not overlap with any of the jump label addresses. - * Caller must hold jump_label_mutex. 
- * - * returns 1 if there is an overlap, 0 otherwise */ -int jump_label_text_reserved(void *start, void *end) +static int __jump_label_text_reserved(struct jump_entry *iter_start, + struct jump_entry *iter_stop, void *start, void *end) { struct jump_entry *iter; - struct jump_entry *iter_start = __start___jump_table; - struct jump_entry *iter_stop = __start___jump_table; - int conflict = 0; iter = iter_start; while (iter < iter_stop) { - if (addr_conflict(iter, start, end)) { - conflict = 1; - goto out; - } + if (addr_conflict(iter, start, end)) + return 1; iter++; } - /* now check modules */ -#ifdef CONFIG_MODULES - conflict = module_conflict(start, end); -#endif -out: - return conflict; + return 0; +} + +static void __jump_label_update(struct jump_label_key *key, + struct jump_entry *entry, int enable) +{ + for (; entry->key == (jump_label_t)key; entry++) { + if (WARN_ON_ONCE(!kernel_text_address(entry->code))) + continue; + + arch_jump_label_transform(entry, enable); + } } /* @@ -277,141 +118,155 @@ void __weak arch_jump_label_text_poke_early(jump_label_t addr) { } -static __init int init_jump_label(void) +static __init int jump_label_init(void) { - int ret; struct jump_entry *iter_start = __start___jump_table; struct jump_entry *iter_stop = __stop___jump_table; + struct jump_label_key *key = NULL; struct jump_entry *iter; jump_label_lock(); - ret = build_jump_label_hashtable(__start___jump_table, - __stop___jump_table); - iter = iter_start; - while (iter < iter_stop) { + jump_label_sort_entries(iter_start, iter_stop); + + for (iter = iter_start; iter < iter_stop; iter++) { arch_jump_label_text_poke_early(iter->code); - iter++; + if (iter->key == (jump_label_t)key) + continue; + + key = (struct jump_label_key *)iter->key; + atomic_set(&key->enabled, 0); + key->entries = iter; +#ifdef CONFIG_MODULES + key->next = NULL; +#endif } jump_label_unlock(); - return ret; + + return 0; } -early_initcall(init_jump_label); +early_initcall(jump_label_init); #ifdef CONFIG_MODULES -static struct jump_label_module_entry * -add_jump_label_module_entry(struct jump_label_entry *entry, - struct jump_entry *iter_begin, - int count, struct module *mod) -{ - struct jump_label_module_entry *e; - - e = kmalloc(sizeof(struct jump_label_module_entry), GFP_KERNEL); - if (!e) - return ERR_PTR(-ENOMEM); - e->mod = mod; - e->nr_entries = count; - e->table = iter_begin; - hlist_add_head(&e->hlist, &entry->modules); - return e; -} +struct jump_label_mod { + struct jump_label_mod *next; + struct jump_entry *entries; + struct module *mod; +}; -static int add_jump_label_module(struct module *mod) +static int __jump_label_mod_text_reserved(void *start, void *end) { - struct jump_entry *iter, *iter_begin; - struct jump_label_entry *entry; - struct jump_label_module_entry *module_entry; - int count; + struct module *mod; - /* if the module doesn't have jump label entries, just return */ - if (!mod->num_jump_entries) + mod = __module_text_address(start); + if (!mod) return 0; - sort_jump_label_entries(mod->jump_entries, + WARN_ON_ONCE(__module_text_address(end) != mod); + + return __jump_label_text_reserved(mod->jump_entries, mod->jump_entries + mod->num_jump_entries, start, end); - iter = mod->jump_entries; - while (iter < mod->jump_entries + mod->num_jump_entries) { - entry = get_jump_label_entry(iter->key); - iter_begin = iter; - count = 0; - while ((iter < mod->jump_entries + mod->num_jump_entries) && - (iter->key == iter_begin->key)) { - iter++; - count++; - } - if (!entry) { - entry = add_jump_label_entry(iter_begin->key, 0,
NULL); - if (IS_ERR(entry)) - return PTR_ERR(entry); - } - module_entry = add_jump_label_module_entry(entry, iter_begin, - count, mod); - if (IS_ERR(module_entry)) - return PTR_ERR(module_entry); +} + +static void __jump_label_mod_update(struct jump_label_key *key, int enable) +{ + struct jump_label_mod *mod = key->next; + + while (mod) { + __jump_label_update(key, mod->entries, enable); + mod = mod->next; } - return 0; } -static void remove_jump_label_module(struct module *mod) +/*** + * apply_jump_label_nops - patch module jump labels with arch_get_jump_label_nop() + * @mod: module to patch + * + * Allow for run-time selection of the optimal nops. Before the module + * loads patch these with arch_get_jump_label_nop(), which is specified by + * the arch specific jump label code. + */ +static void jump_label_apply_nops(struct module *mod) { - struct hlist_head *head; - struct hlist_node *node, *node_next, *module_node, *module_node_next; - struct jump_label_entry *e; - struct jump_label_module_entry *e_module; - int i; + struct jump_entry *iter_start = mod->jump_entries; + struct jump_entry *iter_stop = mod->jump_entries + mod->num_jump_entries; + struct jump_entry *iter; /* if the module doesn't have jump label entries, just return */ - if (!mod->num_jump_entries) + if (iter_start == iter_stop) return; - for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) { - head = &jump_label_table[i]; - hlist_for_each_entry_safe(e, node, node_next, head, hlist) { - hlist_for_each_entry_safe(e_module, module_node, - module_node_next, - &(e->modules), hlist) { - if (e_module->mod == mod) { - hlist_del(&e_module->hlist); - kfree(e_module); - } - } - if (hlist_empty(&e->modules) && (e->nr_entries == 0)) { - hlist_del(&e->hlist); - kfree(e); - } + jump_label_sort_entries(iter_start, iter_stop); + + for (iter = iter_start; iter < iter_stop; iter++) + arch_jump_label_text_poke_early(iter->code); +} + +static int jump_label_add_module(struct module *mod) +{ + struct jump_entry *iter_start = mod->jump_entries; + struct jump_entry *iter_stop = mod->jump_entries + mod->num_jump_entries; + struct jump_entry *iter; + struct jump_label_key *key = NULL; + struct jump_label_mod *jlm; + + for (iter = iter_start; iter < iter_stop; iter++) { + if (iter->key == (jump_label_t)key) + continue; + + key = (struct jump_label_key *)iter->key; + + if (__module_address(iter->key) == mod) { + atomic_set(&key->enabled, 0); + key->entries = iter; + key->next = NULL; + continue; } + + jlm = kzalloc(sizeof(struct jump_label_mod), GFP_KERNEL); + if (!jlm) + return -ENOMEM; + + jlm->mod = mod; + jlm->entries = iter; + jlm->next = key->next; + key->next = jlm; + + if (jump_label_enabled(key)) + __jump_label_update(key, iter, JUMP_LABEL_ENABLE); } + + return 0; } -static void remove_jump_label_module_init(struct module *mod) +static void jump_label_del_module(struct module *mod) { - struct hlist_head *head; - struct hlist_node *node, *node_next, *module_node, *module_node_next; - struct jump_label_entry *e; - struct jump_label_module_entry *e_module; + struct jump_entry *iter_start = mod->jump_entries; + struct jump_entry *iter_stop = mod->jump_entries + mod->num_jump_entries; struct jump_entry *iter; - int i, count; + struct jump_label_key *key = NULL; + struct jump_label_mod *jlm, **prev; - /* if the module doesn't have jump label entries, just return */ - if (!mod->num_jump_entries) - return; + for (iter = iter_start; iter < iter_stop; iter++) { + if (iter->key == (jump_label_t)key) + continue; + + key = (struct jump_label_key *)iter->key; + + if
(__module_address(iter->key) == mod) + continue; + + prev = &key->next; + jlm = key->next; + + while (jlm && jlm->mod != mod) { + prev = &jlm->next; + jlm = jlm->next; + } - for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) { - head = &jump_label_table[i]; - hlist_for_each_entry_safe(e, node, node_next, head, hlist) { - hlist_for_each_entry_safe(e_module, module_node, - module_node_next, - &(e->modules), hlist) { - if (e_module->mod != mod) - continue; - count = e_module->nr_entries; - iter = e_module->table; - while (count--) { - if (within_module_init(iter->code, mod)) - iter->key = 0; - iter++; - } - } + if (jlm) { + *prev = jlm->next; + kfree(jlm); } } } @@ -424,61 +279,76 @@ jump_label_module_notify(struct notifier_block *self, unsigned long val, int ret = 0; switch (val) { - case MODULE_STATE_COMING: + case MODULE_STATE_POST_RELOCATE: jump_label_lock(); - ret = add_jump_label_module(mod); - if (ret) - remove_jump_label_module(mod); + jump_label_apply_nops(mod); jump_label_unlock(); break; - case MODULE_STATE_GOING: + case MODULE_STATE_COMING: jump_label_lock(); - remove_jump_label_module(mod); + ret = jump_label_add_module(mod); + if (ret) + jump_label_del_module(mod); jump_label_unlock(); break; - case MODULE_STATE_LIVE: + case MODULE_STATE_GOING: jump_label_lock(); - remove_jump_label_module_init(mod); + jump_label_del_module(mod); jump_label_unlock(); break; } return ret; } +struct notifier_block jump_label_module_nb = { + .notifier_call = jump_label_module_notify, + .priority = 1, /* higher than tracepoints */ +}; + +static __init int jump_label_init_module(void) +{ + return register_module_notifier(&jump_label_module_nb); +} +early_initcall(jump_label_init_module); + +#endif /* CONFIG_MODULES */ + /*** - * apply_jump_label_nops - patch module jump labels with arch_get_jump_label_nop() - * @mod: module to patch + * jump_label_text_reserved - check if addr range is reserved + * @start: start text addr + * @end: end text addr * - * Allow for run-time selection of the optimal nops. Before the module - * loads patch these with arch_get_jump_label_nop(), which is specified by - * the arch specific jump label code. + * checks if the text addr located between @start and @end + * overlaps with any of the jump label patch addresses. Code + * that wants to modify kernel text should first verify that + * it does not overlap with any of the jump label addresses. + * Caller must hold jump_label_mutex. 
+ * + * returns 1 if there is an overlap, 0 otherwise */ -void jump_label_apply_nops(struct module *mod) +int jump_label_text_reserved(void *start, void *end) { - struct jump_entry *iter; + int ret = __jump_label_text_reserved(__start___jump_table, + __stop___jump_table, start, end); - /* if the module doesn't have jump label entries, just return */ - if (!mod->num_jump_entries) - return; + if (ret) + return ret; - iter = mod->jump_entries; - while (iter < mod->jump_entries + mod->num_jump_entries) { - arch_jump_label_text_poke_early(iter->code); - iter++; - } +#ifdef CONFIG_MODULES + ret = __jump_label_mod_text_reserved(start, end); +#endif + return ret; } -struct notifier_block jump_label_module_nb = { - .notifier_call = jump_label_module_notify, - .priority = 0, -}; - -static __init int init_jump_label_module(void) +static void jump_label_update(struct jump_label_key *key, int enable) { - return register_module_notifier(&jump_label_module_nb); -} -early_initcall(init_jump_label_module); + struct jump_entry *entry = key->entries; -#endif /* CONFIG_MODULES */ + __jump_label_update(key, entry, enable); + +#ifdef CONFIG_MODULES + __jump_label_mod_update(key, enable); +#endif +} #endif diff --git a/kernel/module.c b/kernel/module.c index efa290e..890cadf 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2789,6 +2789,13 @@ static struct module *load_module(void __user *umod, goto unlock; } + err = blocking_notifier_call_chain(&module_notify_list, + MODULE_STATE_POST_RELOCATE, mod); + if (err != NOTIFY_DONE) { + err = notifier_to_errno(err); + goto unlock; + } + /* This has to be done once we're sure module name is unique. */ if (!mod->taints) dynamic_debug_setup(info.debug, info.num_debug); diff --git a/kernel/perf_event.c b/kernel/perf_event.c index a353a4d..7bacdd3 100644 --- a/kernel/perf_event.c +++ b/kernel/perf_event.c @@ -117,7 +117,7 @@ enum event_type_t { EVENT_ALL = EVENT_FLEXIBLE | EVENT_PINNED, }; -atomic_t perf_task_events __read_mostly; +struct jump_label_key_counter perf_task_events __read_mostly; static atomic_t nr_mmap_events __read_mostly; static atomic_t nr_comm_events __read_mostly; static atomic_t nr_task_events __read_mostly; @@ -2383,8 +2383,10 @@ static void free_event(struct perf_event *event) irq_work_sync(&event->pending); if (!event->parent) { - if (event->attach_state & PERF_ATTACH_TASK) - jump_label_dec(&perf_task_events); + if (event->attach_state & PERF_ATTACH_TASK) { + if (atomic_dec_and_test(&perf_task_events.ref)) + jump_label_disable(&perf_task_events.key); + } if (event->attr.mmap || event->attr.mmap_data) atomic_dec(&nr_mmap_events); if (event->attr.comm) @@ -4912,7 +4914,7 @@ fail: return err; } -atomic_t perf_swevent_enabled[PERF_COUNT_SW_MAX]; +struct jump_label_key_counter perf_swevent_enabled[PERF_COUNT_SW_MAX]; static void sw_perf_event_destroy(struct perf_event *event) { @@ -4920,7 +4922,8 @@ static void sw_perf_event_destroy(struct perf_event *event) WARN_ON(event->parent); - jump_label_dec(&perf_swevent_enabled[event_id]); + if (atomic_dec_and_test(&perf_swevent_enabled[event_id].ref)) + jump_label_disable(&perf_swevent_enabled[event_id].key); swevent_hlist_put(event); } @@ -4945,12 +4948,15 @@ static int perf_swevent_init(struct perf_event *event) if (!event->parent) { int err; + atomic_t *ref; err = swevent_hlist_get(event); if (err) return err; - jump_label_inc(&perf_swevent_enabled[event_id]); + ref = &perf_swevent_enabled[event_id].ref; + if (atomic_add_return(1, ref) == 1) + jump_label_enable(&perf_swevent_enabled[event_id].key); 
event->destroy = sw_perf_event_destroy; } @@ -5123,6 +5129,10 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer) u64 period; event = container_of(hrtimer, struct perf_event, hw.hrtimer); + + if (event->state < PERF_EVENT_STATE_ACTIVE) + return HRTIMER_NORESTART; + event->pmu->read(event); perf_sample_data_init(&data, 0); @@ -5174,7 +5184,7 @@ static void perf_swevent_cancel_hrtimer(struct perf_event *event) ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer); local64_set(&hwc->period_left, ktime_to_ns(remaining)); - hrtimer_cancel(&hwc->hrtimer); + hrtimer_try_to_cancel(&hwc->hrtimer); } } @@ -5713,8 +5723,10 @@ done: event->pmu = pmu; if (!event->parent) { - if (event->attach_state & PERF_ATTACH_TASK) - jump_label_inc(&perf_task_events); + if (event->attach_state & PERF_ATTACH_TASK) { + if (atomic_add_return(1, &perf_task_events.ref) == 1) + jump_label_enable(&perf_task_events.key); + } if (event->attr.mmap || event->attr.mmap_data) atomic_inc(&nr_mmap_events); if (event->attr.comm) diff --git a/kernel/timer.c b/kernel/timer.c index 343ff27..c848cd8 100644 --- a/kernel/timer.c +++ b/kernel/timer.c @@ -959,7 +959,7 @@ EXPORT_SYMBOL(try_to_del_timer_sync); * * Synchronization rules: Callers must prevent restarting of the timer, * otherwise this function is meaningless. It must not be called from - * hardirq contexts. The caller must not hold locks which would prevent + * interrupt contexts. The caller must not hold locks which would prevent * completion of the timer's handler. The timer's handler must not call * add_timer_on(). Upon exit the timer is not queued and the handler is * not running on any CPU. @@ -971,12 +971,10 @@ int del_timer_sync(struct timer_list *timer) #ifdef CONFIG_LOCKDEP unsigned long flags; - raw_local_irq_save(flags); - local_bh_disable(); + local_irq_save(flags); lock_map_acquire(&timer->lockdep_map); lock_map_release(&timer->lockdep_map); - _local_bh_enable(); - raw_local_irq_restore(flags); + local_irq_restore(flags); #endif /* * don't use it in hardirq context, because it diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index 68187af..13066e8 100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -251,9 +251,9 @@ static void set_tracepoint(struct tracepoint_entry **entry, { WARN_ON(strcmp((*entry)->name, elem->name) != 0); - if (elem->regfunc && !elem->state && active) + if (elem->regfunc && !jump_label_enabled(&elem->key) && active) elem->regfunc(); - else if (elem->unregfunc && elem->state && !active) + else if (elem->unregfunc && jump_label_enabled(&elem->key) && !active) elem->unregfunc(); /* @@ -264,13 +264,10 @@ static void set_tracepoint(struct tracepoint_entry **entry, * is used. */ rcu_assign_pointer(elem->funcs, (*entry)->funcs); - if (!elem->state && active) { - jump_label_enable(&elem->state); - elem->state = active; - } else if (elem->state && !active) { - jump_label_disable(&elem->state); - elem->state = active; - } + if (active) + jump_label_enable(&elem->key); + else if (!active) + jump_label_disable(&elem->key); } /* @@ -281,13 +278,10 @@ static void set_tracepoint(struct tracepoint_entry **entry, */ static void disable_tracepoint(struct tracepoint *elem) { - if (elem->unregfunc && elem->state) + if (elem->unregfunc && jump_label_enabled(&elem->key)) elem->unregfunc(); - if (elem->state) { - jump_label_disable(&elem->state); - elem->state = 0; - } + jump_label_disable(&elem->key); rcu_assign_pointer(elem->funcs, NULL); } ^ permalink raw reply related [flat|nested] 113+ messages in thread
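The perf hunks above use a struct jump_label_key_counter whose definition
falls outside the quoted context. Judging from how the .ref and .key
members are used, the implied layout is something like this sketch (an
assumption, not a hunk from the patch; the real definition would live in
the include/linux/jump_label_ref.h part of the series):

	/* sketch only: pairs an atomic reference count with a jump
	 * label key, so callers can enable on 0 -> 1 and disable on
	 * 1 -> 0 transitions, as the perf hunks above do */
	struct jump_label_key_counter {
		atomic_t ref;			/* enable requests */
		struct jump_label_key key;	/* patched branch state */
	};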
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-12 18:47 ` Peter Zijlstra @ 2011-02-14 12:27 ` Ingo Molnar 2011-02-14 15:51 ` Jason Baron 2011-02-14 16:11 ` Mathieu Desnoyers 2 siblings, 0 replies; 113+ messages in thread From: Ingo Molnar @ 2011-02-14 12:27 UTC (permalink / raw) To: Peter Zijlstra Cc: Mathieu Desnoyers, Jason Baron, hpa, rostedt, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel * Peter Zijlstra <peterz@infradead.org> wrote: > On Fri, 2011-02-11 at 22:38 +0100, Peter Zijlstra wrote: > > > > So why can't we make that jump_label_entry::refcount and > > jump_label_key::state an atomic_t and be done with it? > > So I had a bit of a poke at this because I didn't quite understand why > all that stuff was as it was. I applied both Jason's patches and then > basically rewrote kernel/jump_label.c just for kicks ;-) > > I haven't tried compiling this, let alone running it, but provided I > didn't actually forget anything the storage per key is now 16 bytes when > modules are disabled and 24 * (1 + mods) bytes for when they are > enabled. The old code had 64 + 40 * mods bytes. > > I still need to clean up the static_branch_else bits and look at !x86 > aside from the already mentioned bits.. but what do people think? [...] > 15 files changed, 333 insertions(+), 456 deletions(-) The diffstat win alone makes me want this :-) Thanks, Ingo ^ permalink raw reply [flat|nested] 113+ messages in thread
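As a worked check on the byte counts quoted above, a 64-bit layout
consistent with them would be the following (a sketch of what the rewrite
implies, not the posted code):

	/* sketch, assuming x86_64: atomic_t is 4 bytes (padded to 8
	 * before the pointer), pointers are 8 bytes */
	struct jump_label_key {
		atomic_t enabled;		/* enable count */
		struct jump_entry *entries;	/* core kernel entries */
	#ifdef CONFIG_MODULES
		struct jump_label_mod *next;	/* per-module lists */
	#endif
	};	/* 16 bytes with !CONFIG_MODULES, 24 with modules */

	struct jump_label_mod {		/* one per module using the key */
		struct jump_label_mod *next;
		struct jump_entry *entries;
		struct module *mod;
	};	/* 24 bytes, hence the 24 * (1 + mods) total */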
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-12 18:47 ` Peter Zijlstra
  2011-02-14 12:27   ` Ingo Molnar
@ 2011-02-14 15:51   ` Jason Baron
  2011-02-14 15:57     ` Peter Zijlstra
  2011-02-14 16:11   ` Mathieu Desnoyers
  2 siblings, 1 reply; 113+ messages in thread
From: Jason Baron @ 2011-02-14 15:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mathieu Desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael,
	linux-kernel

On Sat, Feb 12, 2011 at 07:47:45PM +0100, Peter Zijlstra wrote:
> On Fri, 2011-02-11 at 22:38 +0100, Peter Zijlstra wrote:
> >
> > So why can't we make that jump_label_entry::refcount and
> > jump_label_key::state an atomic_t and be done with it?
>
> So I had a bit of a poke at this because I didn't quite understand why
> all that stuff was as it was. I applied both Jason's patches and then
> basically rewrote kernel/jump_label.c just for kicks ;-)
>
> I haven't tried compiling this, let alone running it, but provided I
> didn't actually forget anything the storage per key is now 16 bytes when
> modules are disabled and 24 * (1 + mods) bytes for when they are
> enabled. The old code had 64 + 40 * mods bytes.
>
> I still need to clean up the static_branch_else bits and look at !x86
> aside from the already mentioned bits.. but what do people think?
>
> ---

Generally, I really like this! It's the direction I think the jump label
code should be going. The complete removal of the hash table makes the
design a lot better and simpler. We just need to get some of the details
cleaned up, and of course we need this to compile :) But I don't see any
fundamental problems with this approach.

Things that still need to be sorted out:

1) Since jump_label.h is included in kernel.h (indirectly via
dynamic_debug.h), the atomic_t definitions could be problematic, since
atomic.h includes kernel.h indirectly...so we might need some header
magic.

2) I had some code to disallow writing to the module __init section, by
setting the 'key' value to 0 after the module->init was run, but before
the memory was freed. And then I check for a non-zero key value when the
jump label is updated. In this way we can't corrupt some random piece of
memory. I had this done via the 'MODULE_STATE_LIVE' notifier.

3) For 'jump_label_enable()'/'jump_label_disable()' in the tracepoint
code, I'm not sure that there is an enable for each disable. So I'm not
sure if a refcount would work there. But we can fix this by first
checking 'jump_label_enabled()' before calling 'jump_label_enable()' or
'jump_label_ref()'. This is safe because the tracepoint code is
protected using the tracepoint_mutex.

thanks,

-Jason

^ permalink raw reply	[flat|nested] 113+ messages in thread
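For point 3, the guard Jason proposes could be as simple as the
following sketch (the helper name is hypothetical; it is only safe
because tracepoint_mutex already serializes these updates):

	/* sketch: idempotent enable for callers that cannot guarantee
	 * balanced enable/disable pairs; relies on the caller holding
	 * tracepoint_mutex so the check-then-act cannot race */
	static void tracepoint_key_enable(struct tracepoint *elem)
	{
		if (!jump_label_enabled(&elem->key))
			jump_label_enable(&elem->key);
	}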
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 15:51 ` Jason Baron @ 2011-02-14 15:57 ` Peter Zijlstra 2011-02-14 16:04 ` Jason Baron 0 siblings, 1 reply; 113+ messages in thread From: Peter Zijlstra @ 2011-02-14 15:57 UTC (permalink / raw) To: Jason Baron Cc: Mathieu Desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Mon, 2011-02-14 at 10:51 -0500, Jason Baron wrote: > On Sat, Feb 12, 2011 at 07:47:45PM +0100, Peter Zijlstra wrote: > > On Fri, 2011-02-11 at 22:38 +0100, Peter Zijlstra wrote: > > > > > > So why can't we make that jump_label_entry::refcount and > > > jump_label_key::state an atomic_t and be done with it? > > > > So I had a bit of a poke at this because I didn't quite understand why > > all that stuff was as it was. I applied both Jason's patches and then > > basically rewrote kernel/jump_label.c just for kicks ;-) > > > > I haven't tried compiling this, let alone running it, but provided I > > didn't actually forget anything the storage per key is now 16 bytes when > > modules are disabled and 24 * (1 + mods) bytes for when they are > > enabled. The old code had 64 + 40 * mods bytes. > > > > I still need to clean up the static_branch_else bits and look at !x86 > > aside from the already mentioned bits.. but what do people think? > > > > --- > > Generally, I really like this! Its the direction I think the jump label > code should be going. The complete removal of the hash table, makes the > design a lot better and simpler. We just need to get some of the details > cleaned up, and of course we need this to compile :) But I don't see any > fundamental problems with this approach. > > Things that still need to be sorted out: > > 1) Since jump_label.h, are included in kernel.h, (indirectly via the > dynamic_debug.h) the atomic_t definitions could be problematic, since > atomic.h includes kernel.h indirectly...so we might need some header > magic. Yes, I remember running into that when I did the jump_label_ref stuff, some head-scratching is in order there. > 2) I had some code to disallow writing to module __init section, by > setting the 'key' value to 0, after the module->init was run, but > before, the memory was freed. And then I check for a non-zero key value > when the jump label is updated. In this way we can't corrupt some random > piece of memory. I had this done via the 'MODULE_STATE_LIVE' notifier. AH! I wondered what that was about.. that wouldn't work now since we actually rely on iter->key to remain what it was. > 3) For 'jump_label_enable()' 'jump_label_disable()' in the tracepoint > code, I'm not sure that there is an enable for each disable. So i'm not > sure if a refcount would work there. But we can fix this by first > checking 'jump_label_enabled()' before calling 'jump_label_eanble()' or > jump_label_ref(). This is safe b/c the the tracepoint code is protected > using the tracepoint_mutex. Right,.. I hadn't considered people using it like that, but like you said, that should be easily fixed. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-14 15:57 ` Peter Zijlstra
@ 2011-02-14 16:04   ` Jason Baron
  2011-02-14 16:14     ` Mathieu Desnoyers
  [not found]     ` <BLU0-SMTP4069A1A89F06CDFF9B28F896D00@phx.gbl>
  0 siblings, 2 replies; 113+ messages in thread
From: Jason Baron @ 2011-02-14 16:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mathieu Desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael,
	linux-kernel

On Mon, Feb 14, 2011 at 04:57:04PM +0100, Peter Zijlstra wrote:
> On Mon, 2011-02-14 at 10:51 -0500, Jason Baron wrote:
> > On Sat, Feb 12, 2011 at 07:47:45PM +0100, Peter Zijlstra wrote:
> > > On Fri, 2011-02-11 at 22:38 +0100, Peter Zijlstra wrote:
> > > >
> > > > So why can't we make that jump_label_entry::refcount and
> > > > jump_label_key::state an atomic_t and be done with it?
> > >
> > > So I had a bit of a poke at this because I didn't quite understand why
> > > all that stuff was as it was. I applied both Jason's patches and then
> > > basically rewrote kernel/jump_label.c just for kicks ;-)
> > >
> > > I haven't tried compiling this, let alone running it, but provided I
> > > didn't actually forget anything the storage per key is now 16 bytes when
> > > modules are disabled and 24 * (1 + mods) bytes for when they are
> > > enabled. The old code had 64 + 40 * mods bytes.
> > >
> > > I still need to clean up the static_branch_else bits and look at !x86
> > > aside from the already mentioned bits.. but what do people think?
> > >
> > > ---
> >
> > Generally, I really like this! It's the direction I think the jump label
> > code should be going. The complete removal of the hash table makes the
> > design a lot better and simpler. We just need to get some of the details
> > cleaned up, and of course we need this to compile :) But I don't see any
> > fundamental problems with this approach.
> >
> > Things that still need to be sorted out:
> >
> > 1) Since jump_label.h is included in kernel.h (indirectly via
> > dynamic_debug.h), the atomic_t definitions could be problematic, since
> > atomic.h includes kernel.h indirectly...so we might need some header
> > magic.
>
> Yes, I remember running into that when I did the jump_label_ref stuff,
> some head-scratching is in order there.

Yes, I suspect this might be the hardest bit of this...

> > 2) I had some code to disallow writing to the module __init section, by
> > setting the 'key' value to 0 after the module->init was run, but before
> > the memory was freed. And then I check for a non-zero key value when the
> > jump label is updated. In this way we can't corrupt some random piece of
> > memory. I had this done via the 'MODULE_STATE_LIVE' notifier.
>
> AH! I wondered what that was about.. that wouldn't work now since we
> actually rely on iter->key to remain what it was.

we could just set iter->code or iter->target to 0 to indicate that the
entry is not valid, and leave iter->key as it is.

^ permalink raw reply	[flat|nested] 113+ messages in thread
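A sketch of what that suggestion could look like (the helper name is
hypothetical; only iter->code is cleared, so iter->key stays
dereferenceable for the new scheme):

	/* hypothetical sketch: zero iter->code for entries living in
	 * the module's __init text after init has run, so subsequent
	 * updates can skip them instead of patching freed memory */
	static void jump_label_invalidate_module_init(struct module *mod)
	{
		struct jump_entry *iter_start = mod->jump_entries;
		struct jump_entry *iter_stop = iter_start + mod->num_jump_entries;
		struct jump_entry *iter;

		for (iter = iter_start; iter < iter_stop; iter++) {
			if (within_module_init(iter->code, mod))
				iter->code = 0;
		}
	}

The update path would then test for a zero code address before calling
the arch patching routine.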
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 16:04 ` Jason Baron @ 2011-02-14 16:14 ` Mathieu Desnoyers [not found] ` <BLU0-SMTP4069A1A89F06CDFF9B28F896D00@phx.gbl> 1 sibling, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-14 16:14 UTC (permalink / raw) To: Jason Baron Cc: Peter Zijlstra, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel * Jason Baron (jbaron@redhat.com) wrote: > On Mon, Feb 14, 2011 at 04:57:04PM +0100, Peter Zijlstra wrote: > > On Mon, 2011-02-14 at 10:51 -0500, Jason Baron wrote: > > > On Sat, Feb 12, 2011 at 07:47:45PM +0100, Peter Zijlstra wrote: > > > > On Fri, 2011-02-11 at 22:38 +0100, Peter Zijlstra wrote: > > > > > > > > > > So why can't we make that jump_label_entry::refcount and > > > > > jump_label_key::state an atomic_t and be done with it? > > > > > > > > So I had a bit of a poke at this because I didn't quite understand why > > > > all that stuff was as it was. I applied both Jason's patches and then > > > > basically rewrote kernel/jump_label.c just for kicks ;-) > > > > > > > > I haven't tried compiling this, let alone running it, but provided I > > > > didn't actually forget anything the storage per key is now 16 bytes when > > > > modules are disabled and 24 * (1 + mods) bytes for when they are > > > > enabled. The old code had 64 + 40 * mods bytes. > > > > > > > > I still need to clean up the static_branch_else bits and look at !x86 > > > > aside from the already mentioned bits.. but what do people think? > > > > > > > > --- > > > > > > Generally, I really like this! Its the direction I think the jump label > > > code should be going. The complete removal of the hash table, makes the > > > design a lot better and simpler. We just need to get some of the details > > > cleaned up, and of course we need this to compile :) But I don't see any > > > fundamental problems with this approach. > > > > > > Things that still need to be sorted out: > > > > > > 1) Since jump_label.h, are included in kernel.h, (indirectly via the > > > dynamic_debug.h) the atomic_t definitions could be problematic, since > > > atomic.h includes kernel.h indirectly...so we might need some header > > > magic. > > > > Yes, I remember running into that when I did the jump_label_ref stuff, > > some head-scratching is in order there. > > > > yes. i suspect this might be the hardest bit of this... I remember that atomic_t is defined in types.h now rather than atomic.h. Any reason why you should keep including atomic.h from jump_label.h ? Thanks, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
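In other words, jump_label.h might get away with something like this
sketch, assuming atomic_t really is visible from <linux/types.h>:

	/* sketch: pull in only the atomic_t type, not the atomic_*()
	 * operations, so jump_label.h no longer drags kernel.h in
	 * through atomic.h */
	#include <linux/types.h>	/* for atomic_t */

	struct jump_label_key {
		atomic_t enabled;
		struct jump_entry *entries;
	};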
* Re: [PATCH 0/2] jump label: 2.6.38 updates [not found] ` <BLU0-SMTP4069A1A89F06CDFF9B28F896D00@phx.gbl> @ 2011-02-14 16:25 ` Peter Zijlstra 2011-02-14 16:29 ` Jason Baron 0 siblings, 1 reply; 113+ messages in thread From: Peter Zijlstra @ 2011-02-14 16:25 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Jason Baron, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Mon, 2011-02-14 at 11:14 -0500, Mathieu Desnoyers wrote: > > I remember that atomic_t is defined in types.h now rather than atomic.h. > Any reason why you should keep including atomic.h from jump_label.h ? Ooh, shiny.. we could probably move the few atomic_{read,inc,dec} users in jump_label.h into out of line functions and have this sorted. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 16:25 ` Peter Zijlstra @ 2011-02-14 16:29 ` Jason Baron 2011-02-14 16:37 ` Peter Zijlstra 0 siblings, 1 reply; 113+ messages in thread From: Jason Baron @ 2011-02-14 16:29 UTC (permalink / raw) To: Peter Zijlstra Cc: Mathieu Desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Mon, Feb 14, 2011 at 05:25:54PM +0100, Peter Zijlstra wrote: > > > > I remember that atomic_t is defined in types.h now rather than atomic.h. > > Any reason why you should keep including atomic.h from jump_label.h ? > > Ooh, shiny.. we could probably move the few atomic_{read,inc,dec} users > in jump_label.h into out of line functions and have this sorted. > inc and dec sure, but atomic_read() for the disabled case needs to be inline.... ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-14 16:29 ` Jason Baron
@ 2011-02-14 16:37   ` Peter Zijlstra
  2011-02-14 16:43     ` Mathieu Desnoyers
		      ` (2 more replies)
  0 siblings, 3 replies; 113+ messages in thread
From: Peter Zijlstra @ 2011-02-14 16:37 UTC (permalink / raw)
  To: Jason Baron
  Cc: Mathieu Desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael,
	linux-kernel

On Mon, 2011-02-14 at 11:29 -0500, Jason Baron wrote:
> On Mon, Feb 14, 2011 at 05:25:54PM +0100, Peter Zijlstra wrote:
> > >
> > > I remember that atomic_t is defined in types.h now rather than atomic.h.
> > > Any reason why you should keep including atomic.h from jump_label.h ?
> >
> > Ooh, shiny.. we could probably move the few atomic_{read,inc,dec} users
> > in jump_label.h into out of line functions and have this sorted.
> >
>
> inc and dec sure, but atomic_read() for the disabled case needs to be
> inline....

D'oh, yes of course, I was thinking about jump_label_enabled(), but
there's still the static_branch() implementation to consider.

We could of course cheat and implement our own version of atomic_read()
in order to avoid the whole header mess, but that's not pretty at all.

^ permalink raw reply	[flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 16:37 ` Peter Zijlstra @ 2011-02-14 16:43 ` Mathieu Desnoyers 2011-02-14 16:46 ` Steven Rostedt [not found] ` <BLU0-SMTP64371A838030ED92A7CCB696D00@phx.gbl> 2 siblings, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-14 16:43 UTC (permalink / raw) To: Peter Zijlstra Cc: Jason Baron, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel * Peter Zijlstra (peterz@infradead.org) wrote: > On Mon, 2011-02-14 at 11:29 -0500, Jason Baron wrote: > > On Mon, Feb 14, 2011 at 05:25:54PM +0100, Peter Zijlstra wrote: > > > > > > > > I remember that atomic_t is defined in types.h now rather than atomic.h. > > > > Any reason why you should keep including atomic.h from jump_label.h ? > > > > > > Ooh, shiny.. we could probably move the few atomic_{read,inc,dec} users > > > in jump_label.h into out of line functions and have this sorted. > > > > > > > inc and dec sure, but atomic_read() for the disabled case needs to be > > inline.... > > D'0h yes of course, I was thinking about jump_label_enabled(), but > there's still the static_branch() implementation to consider. > > We could of course cheat implement our own version of atomic_read() in > order to avoid the whole header mess, but that's not pretty at all > OK, so the other way around then : why does kernel.h need to include dynamic_debug.h (which includes jump_label.h) ? Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 16:37 ` Peter Zijlstra 2011-02-14 16:43 ` Mathieu Desnoyers @ 2011-02-14 16:46 ` Steven Rostedt 2011-02-14 16:53 ` Peter Zijlstra 2011-02-14 17:18 ` Steven Rostedt [not found] ` <BLU0-SMTP64371A838030ED92A7CCB696D00@phx.gbl> 2 siblings, 2 replies; 113+ messages in thread From: Steven Rostedt @ 2011-02-14 16:46 UTC (permalink / raw) To: Peter Zijlstra Cc: Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Mon, 2011-02-14 at 17:37 +0100, Peter Zijlstra wrote: > We could of course cheat implement our own version of atomic_read() in > order to avoid the whole header mess, but that's not pretty at all Oh God please no! ;) atomic_read() is implemented per arch. -- Steve ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-14 16:46 ` Steven Rostedt
@ 2011-02-14 16:53   ` Peter Zijlstra
  2011-02-14 17:18   ` Steven Rostedt
  1 sibling, 0 replies; 113+ messages in thread
From: Peter Zijlstra @ 2011-02-14 16:53 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland,
	rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney,
	michael, linux-kernel

On Mon, 2011-02-14 at 11:46 -0500, Steven Rostedt wrote:
> On Mon, 2011-02-14 at 17:37 +0100, Peter Zijlstra wrote:
>
> > We could of course cheat and implement our own version of atomic_read()
> > in order to avoid the whole header mess, but that's not pretty at all
>
> Oh God please no! ;)
>
> atomic_read() is implemented per arch.

Ah, but it needn't be:

static inline int atomic_read(atomic_t *a)
{
	return ACCESS_ONCE(a->counter);
}

is basically it.

^ permalink raw reply	[flat|nested] 113+ messages in thread
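For reference, ACCESS_ONCE() itself was defined in
include/linux/compiler.h as nothing more than a volatile cast:

	/* force a single, untorn access by going through a
	 * volatile-qualified lvalue */
	#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))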
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-14 16:46 ` Steven Rostedt
  2011-02-14 16:53   ` Peter Zijlstra
@ 2011-02-14 17:18   ` Steven Rostedt
  2011-02-14 17:23     ` Mike Frysinger
  2011-02-14 17:27     ` Peter Zijlstra
  1 sibling, 2 replies; 113+ messages in thread
From: Steven Rostedt @ 2011-02-14 17:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland,
	rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney,
	michael, linux-kernel

On Mon, 2011-02-14 at 11:46 -0500, Steven Rostedt wrote:
> On Mon, 2011-02-14 at 17:37 +0100, Peter Zijlstra wrote:
>
> > We could of course cheat and implement our own version of atomic_read()
> > in order to avoid the whole header mess, but that's not pretty at all
>
> Oh God please no! ;)
>
> atomic_read() is implemented per arch.

Hmm, maybe this isn't so bad:

alpha:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

arm:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

avr32:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

blackfin:
#define atomic_read(v)		__raw_uncached_fetch_asm(&(v)->counter)

cris:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

frv:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

h8300:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

ia64:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

m32r:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

m68k:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

microblaze: uses generic which is:

mips:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

mn10300:
#define atomic_read(v)		((v)->counter)

parisc:
static __inline__ int atomic_read(const atomic_t *v)
{
	return (*(volatile int *)&(v)->counter);
}

powerpc:
static __inline__ int atomic_read(const atomic_t *v)
{
	int t;

	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));

	return t;
}

which is still pretty much a volatile read

s390:
static inline int atomic_read(const atomic_t *v)
{
	barrier();
	return v->counter;
}

score: uses generic

sh:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

sparc 32:

sparc 64:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

tile:
static inline int atomic_read(const atomic_t *v)
{
	return v->counter;
}

Hmm, nothing volatile at all?

x86:
static inline int atomic_read(const atomic_t *v)
{
	return (*(volatile int *)&(v)->counter);
}

xtensa:
#define atomic_read(v)		(*(volatile int *)&(v)->counter)

So all but a few have basically (as you said on IRC)

#define atomic_read(v)	ACCESS_ONCE(v)

Those few are blackfin, s390, powerpc and tile.

s390 probably doesn't need that much of a big hammer with atomic_read()
(unless it uses it in its own arch that expects it to be such).

powerpc could probably be converted to just the volatile code as
everything else. Not sure why it did it that way. To be different?

tile just looks wrong, but won't be hurt by adding volatile to that.

blackfin seems to be doing quite a lot. Not sure if it is required, but
that may need a bit of investigating to understand why it does the
raw_uncached thing.

Maybe we could move the atomic_read() out of atomic and make it a
standard inline for all (in kernel.h)?

-- Steve

^ permalink raw reply	[flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 17:18 ` Steven Rostedt @ 2011-02-14 17:23 ` Mike Frysinger 2011-02-14 17:27 ` Peter Zijlstra 1 sibling, 0 replies; 113+ messages in thread From: Mike Frysinger @ 2011-02-14 17:23 UTC (permalink / raw) To: Steven Rostedt Cc: Peter Zijlstra, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Mon, Feb 14, 2011 at 12:18, Steven Rostedt wrote: > blackfin: > #define atomic_read(v) __raw_uncached_fetch_asm(&(v)->counter) > > blackfin, seems to be doing quite a lot. Not sure if it is required, but > that may need a bit of investigating to understand why it does the > raw_uncached thing. this is only for SMP ports, and it's due to our lack of cache-coherency. for non-SMP, we use asm-generic. -mike ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-14 17:18 ` Steven Rostedt
  2011-02-14 17:23   ` Mike Frysinger
@ 2011-02-14 17:27   ` Peter Zijlstra
  2011-02-14 17:29     ` Mike Frysinger
		      ` (2 more replies)
  1 sibling, 3 replies; 113+ messages in thread
From: Peter Zijlstra @ 2011-02-14 17:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland,
	rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney,
	michael, linux-kernel, Mike Frysinger, Chris Metcalf, dhowells,
	Martin Schwidefsky, heiko.carstens, benh

On Mon, 2011-02-14 at 12:18 -0500, Steven Rostedt wrote:

> mn10300:
> #define atomic_read(v)	((v)->counter)

> tile:
> static inline int atomic_read(const atomic_t *v)
> {
>	return v->counter;
> }

Yeah, I already sent email to the respective maintainers telling them
they might want to fix this ;-)

> So all but a few have basically (as you said on IRC)
>
> #define atomic_read(v)	ACCESS_ONCE(v)

ACCESS_ONCE(v->counter), but yeah :-)

> Those few are blackfin, s390, powerpc and tile.
>
> s390 probably doesn't need that much of a big hammer with atomic_read()
> (unless it uses it in its own arch that expects it to be such).

Right, it could just do the volatile thing..

> powerpc could probably be converted to just the volatile code as
> everything else. Not sure why it did it that way. To be different?

Maybe that code was written before we all got inventive with the
volatile cast stuff..

> blackfin seems to be doing quite a lot. Not sure if it is required, but
> that may need a bit of investigating to understand why it does the
> raw_uncached thing.

From what I can tell it's flushing its write cache, invalidating its
d-cache and then issuing the read, something which is _way_ overboard.

> Maybe we could move the atomic_read() out of atomic and make it a
> standard inline for all (in kernel.h)?

Certainly looks like that might work..

^ permalink raw reply	[flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 17:27 ` Peter Zijlstra @ 2011-02-14 17:29 ` Mike Frysinger 2011-02-14 17:38 ` Peter Zijlstra 2011-02-14 17:38 ` Will Newton 2011-02-15 15:20 ` Heiko Carstens 2 siblings, 1 reply; 113+ messages in thread From: Mike Frysinger @ 2011-02-14 17:29 UTC (permalink / raw) To: Peter Zijlstra Cc: Steven Rostedt, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel, Chris Metcalf, dhowells, Martin Schwidefsky, heiko.carstens, benh On Mon, Feb 14, 2011 at 12:27, Peter Zijlstra wrote: > On Mon, 2011-02-14 at 12:18 -0500, Steven Rostedt wrote: >> blackfin, seems to be doing quite a lot. Not sure if it is required, but >> that may need a bit of investigating to understand why it does the >> raw_uncached thing. > > From what I can tell its flushing its write cache, invalidating its > d-cache and then issue the read, something which is _way_ overboard. not when the cores in a SMP system lack cache coherency please to review: http://docs.blackfin.uclinux.org/doku.php?id=linux-kernel:smp-like -mike ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 17:29 ` Mike Frysinger @ 2011-02-14 17:38 ` Peter Zijlstra 2011-02-14 17:45 ` Mike Frysinger 0 siblings, 1 reply; 113+ messages in thread From: Peter Zijlstra @ 2011-02-14 17:38 UTC (permalink / raw) To: Mike Frysinger Cc: Steven Rostedt, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel, Chris Metcalf, dhowells, Martin Schwidefsky, heiko.carstens, benh, Paul E. McKenney On Mon, 2011-02-14 at 12:29 -0500, Mike Frysinger wrote: > On Mon, Feb 14, 2011 at 12:27, Peter Zijlstra wrote: > > On Mon, 2011-02-14 at 12:18 -0500, Steven Rostedt wrote: > >> blackfin, seems to be doing quite a lot. Not sure if it is required, but > >> that may need a bit of investigating to understand why it does the > >> raw_uncached thing. > > > > From what I can tell its flushing its write cache, invalidating its > > d-cache and then issue the read, something which is _way_ overboard. > > not when the cores in a SMP system lack cache coherency But atomic_read() is completely unordered, so even on a non-coherent system a regular read should suffice, any old value is correct. The only problem would be when you could get cache aliasing and read something totally unrelated. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-14 17:38 ` Peter Zijlstra
@ 2011-02-14 17:45   ` Mike Frysinger
  0 siblings, 0 replies; 113+ messages in thread
From: Mike Frysinger @ 2011-02-14 17:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx,
	andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem,
	sam, ddaney, michael, linux-kernel, Chris Metcalf, dhowells,
	Martin Schwidefsky, heiko.carstens, benh, Paul E. McKenney

On Mon, Feb 14, 2011 at 12:38, Peter Zijlstra wrote:
> On Mon, 2011-02-14 at 12:29 -0500, Mike Frysinger wrote:
>> On Mon, Feb 14, 2011 at 12:27, Peter Zijlstra wrote:
>> > On Mon, 2011-02-14 at 12:18 -0500, Steven Rostedt wrote:
>> >> blackfin seems to be doing quite a lot. Not sure if it is required, but
>> >> that may need a bit of investigating to understand why it does the
>> >> raw_uncached thing.
>> >
>> > From what I can tell it's flushing its write cache, invalidating its
>> > d-cache and then issuing the read, something which is _way_ overboard.
>>
>> not when the cores in a SMP system lack cache coherency
>
> But atomic_read() is completely unordered, so even on a non-coherent
> system a regular read should suffice, any old value is correct.

the words you use seem to form a line of reasoning that makes sense to
me. we'll have to play first though to double check.

> The only problem would be when you could get cache aliasing and read
> something totally unrelated.

being a nommu arch, there shouldn't be any cache aliasing issues. we're
just trying to make sure that what another core has pushed out isn't
stale in another core's cache when the other core does the read.
-mike

^ permalink raw reply	[flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 17:27 ` Peter Zijlstra 2011-02-14 17:29 ` Mike Frysinger @ 2011-02-14 17:38 ` Will Newton 2011-02-14 17:43 ` Peter Zijlstra 2011-02-15 15:20 ` Heiko Carstens 2 siblings, 1 reply; 113+ messages in thread From: Will Newton @ 2011-02-14 17:38 UTC (permalink / raw) To: Peter Zijlstra Cc: Steven Rostedt, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel, Mike Frysinger, Chris Metcalf, dhowells, Martin Schwidefsky, heiko.carstens, benh On Mon, Feb 14, 2011 at 5:27 PM, Peter Zijlstra <peterz@infradead.org> wrote: >> So all but a few have basically (as you said on IRC) >> #define atomic_read(v) ACCESS_ONCE(v) > > ACCESS_ONCE(v->counter), but yeah :-) I maintain an out-of-tree architecture where that isn't the case unfortunately [1]. Not expecting any special favours for being out-of-tree of course, but just thought I would add that data point. [1] Our atomic operations go around the cache rather than through it, so the value of an atomic cannot be read with a normal load instruction. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 17:38 ` Will Newton @ 2011-02-14 17:43 ` Peter Zijlstra 2011-02-14 17:50 ` Will Newton 0 siblings, 1 reply; 113+ messages in thread From: Peter Zijlstra @ 2011-02-14 17:43 UTC (permalink / raw) To: Will Newton Cc: Steven Rostedt, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel, Mike Frysinger, Chris Metcalf, dhowells, Martin Schwidefsky, heiko.carstens, benh On Mon, 2011-02-14 at 17:38 +0000, Will Newton wrote: > On Mon, Feb 14, 2011 at 5:27 PM, Peter Zijlstra <peterz@infradead.org> wrote: > > >> So all but a few have basically (as you said on IRC) > >> #define atomic_read(v) ACCESS_ONCE(v) > > > > ACCESS_ONCE(v->counter), but yeah :-) > > I maintain an out-of-tree architecture where that isn't the case > unfortunately [1]. Not expecting any special favours for being > out-of-tree of course, but just thought I would add that data point. > > [1] Our atomic operations go around the cache rather than through it, > so the value of an atomic cannot be read with a normal load > instruction. Cannot how? It would observe a stale value? That is acceptable for atomic_read(). ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 17:43 ` Peter Zijlstra @ 2011-02-14 17:50 ` Will Newton 2011-02-14 18:04 ` Peter Zijlstra 2011-02-14 18:24 ` Peter Zijlstra 0 siblings, 2 replies; 113+ messages in thread From: Will Newton @ 2011-02-14 17:50 UTC (permalink / raw) To: Peter Zijlstra Cc: Steven Rostedt, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel, Mike Frysinger, Chris Metcalf, dhowells, Martin Schwidefsky, heiko.carstens, benh On Mon, Feb 14, 2011 at 5:43 PM, Peter Zijlstra <peterz@infradead.org> wrote: > On Mon, 2011-02-14 at 17:38 +0000, Will Newton wrote: >> On Mon, Feb 14, 2011 at 5:27 PM, Peter Zijlstra <peterz@infradead.org> wrote: >> >> >> So all but a few have basically (as you said on IRC) >> >> #define atomic_read(v) ACCESS_ONCE(v) >> > >> > ACCESS_ONCE(v->counter), but yeah :-) >> >> I maintain an out-of-tree architecture where that isn't the case >> unfortunately [1]. Not expecting any special favours for being >> out-of-tree of course, but just thought I would add that data point. >> >> [1] Our atomic operations go around the cache rather than through it, >> so the value of an atomic cannot be read with a normal load >> instruction. > > Cannot how? It would observe a stale value? That is acceptable for > atomic_read(). It would observe a stale value, but that value would only be updated when the cache line was reloaded from main memory which would have to be triggered by either eviction or cache flushing. So it could get pretty stale. Whilst that's probably within the spec. of atomic_read I suspect it would lead to problems in practice. I could be wrong though. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 17:50 ` Will Newton @ 2011-02-14 18:04 ` Peter Zijlstra 2011-02-14 18:24 ` Peter Zijlstra 1 sibling, 0 replies; 113+ messages in thread From: Peter Zijlstra @ 2011-02-14 18:04 UTC (permalink / raw) To: Will Newton Cc: Steven Rostedt, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel, Mike Frysinger, Chris Metcalf, dhowells, Martin Schwidefsky, heiko.carstens, benh On Mon, 2011-02-14 at 17:50 +0000, Will Newton wrote: > On Mon, Feb 14, 2011 at 5:43 PM, Peter Zijlstra <peterz@infradead.org> wrote: > > On Mon, 2011-02-14 at 17:38 +0000, Will Newton wrote: > >> On Mon, Feb 14, 2011 at 5:27 PM, Peter Zijlstra <peterz@infradead.org> wrote: > >> > >> >> So all but a few have basically (as you said on IRC) > >> >> #define atomic_read(v) ACCESS_ONCE(v) > >> > > >> > ACCESS_ONCE(v->counter), but yeah :-) > >> > >> I maintain an out-of-tree architecture where that isn't the case > >> unfortunately [1]. Not expecting any special favours for being > >> out-of-tree of course, but just thought I would add that data point. > >> > >> [1] Our atomic operations go around the cache rather than through it, > >> so the value of an atomic cannot be read with a normal load > >> instruction. > > > > Cannot how? It would observe a stale value? That is acceptable for > > atomic_read(). > > It would observe a stale value, but that value would only be updated > when the cache line was reloaded from main memory which would have to > be triggered by either eviction or cache flushing. So it could get > pretty stale. Whilst that's probably within the spec. of atomic_read I > suspect it would lead to problems in practice. I could be wrong > though. Arguable, finding such cases would be a Good (TM) thing.. but yeah, I can imagine you're not too keen on being the one finding them. Luckily it looks like you're in the same boat as blackfin-smp is. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 17:50 ` Will Newton 2011-02-14 18:04 ` Peter Zijlstra @ 2011-02-14 18:24 ` Peter Zijlstra 2011-02-14 18:53 ` Mathieu Desnoyers 2011-02-14 21:29 ` Steven Rostedt 1 sibling, 2 replies; 113+ messages in thread From: Peter Zijlstra @ 2011-02-14 18:24 UTC (permalink / raw) To: Will Newton Cc: Steven Rostedt, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel, Mike Frysinger, Chris Metcalf, dhowells, Martin Schwidefsky, heiko.carstens, benh On Mon, 2011-02-14 at 17:50 +0000, Will Newton wrote: > > It would observe a stale value, but that value would only be updated > when the cache line was reloaded from main memory which would have to > be triggered by either eviction or cache flushing. So it could get > pretty stale. Whilst that's probably within the spec. of atomic_read I > suspect it would lead to problems in practice. I could be wrong > though. Right, so the typical scenario that could cause pain is something like: while (atomic_read(&foo) != n) cpu_relax(); and the problem is that cpu_relax() doesn't know which particular cacheline to flush in order to make things go faster, hm? ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 18:24 ` Peter Zijlstra @ 2011-02-14 18:53 ` Mathieu Desnoyers 2011-02-14 21:29 ` Steven Rostedt 1 sibling, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-14 18:53 UTC (permalink / raw) To: Peter Zijlstra Cc: Will Newton, Steven Rostedt, Jason Baron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel, Mike Frysinger, Chris Metcalf, dhowells, Martin Schwidefsky, heiko.carstens, benh * Peter Zijlstra (peterz@infradead.org) wrote: > On Mon, 2011-02-14 at 17:50 +0000, Will Newton wrote: > > > > It would observe a stale value, but that value would only be updated > > when the cache line was reloaded from main memory which would have to > > be triggered by either eviction or cache flushing. So it could get > > pretty stale. Whilst that's probably within the spec. of atomic_read I > > suspect it would lead to problems in practice. I could be wrong > > though. > > Right, so the typical scenario that could cause pain is something like: > > while (atomic_read(&foo) != n) > cpu_relax(); > > and the problem is that cpu_relax() doesn't know which particular > cacheline to flush in order to make things go faster, hm? As an information point, this is why I mapped "uatomic_read()" to "CMM_LOAD_SHARED" in my userspace RCU library rather than just doing a volatile access. On cache-coherent architectures, the arch-specific code turns CMM_LOAD_SHARED into a simple volatile access, but for non-cache-coherent architectures, it can call the required architecture-level primitives to fetch the stale data. FWIW, I also have "CMM_STORE_SHARED" which does pretty much the same thing. I use these for rcu_assign_pointer() and rcu_dereference() (thus replacing "ACCESS_ONCE()"). The more detailed comment and macros are found at http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/system.h I hope this helps, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
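Paraphrasing the urcu/system.h macros Mathieu links to (a from-memory
sketch; the upstream definitions may differ in detail):

	/* sketch of the userspace-rcu approach: _CMM_LOAD_SHARED() is
	 * a plain volatile access, and cmm_smp_rmc() degrades to a
	 * compiler barrier on cache-coherent machines but performs the
	 * required cache-flushing read on non-coherent ones */
	#define _CMM_LOAD_SHARED(p)	CMM_ACCESS_ONCE(p)

	#define CMM_LOAD_SHARED(p)		\
		__extension__			\
		({				\
			cmm_smp_rmc();		\
			_CMM_LOAD_SHARED(p);	\
		})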
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 18:24 ` Peter Zijlstra 2011-02-14 18:53 ` Mathieu Desnoyers @ 2011-02-14 21:29 ` Steven Rostedt 2011-02-14 21:39 ` Steven Rostedt 1 sibling, 1 reply; 113+ messages in thread From: Steven Rostedt @ 2011-02-14 21:29 UTC (permalink / raw) To: Peter Zijlstra Cc: Will Newton, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel, Mike Frysinger, Chris Metcalf, dhowells, Martin Schwidefsky, heiko.carstens, benh On Mon, 2011-02-14 at 19:24 +0100, Peter Zijlstra wrote: > On Mon, 2011-02-14 at 17:50 +0000, Will Newton wrote: > > > > It would observe a stale value, but that value would only be updated > > when the cache line was reloaded from main memory which would have to > > be triggered by either eviction or cache flushing. So it could get > > pretty stale. Whilst that's probably within the spec. of atomic_read I > > suspect it would lead to problems in practice. I could be wrong > > though. > > Right, so the typical scenario that could cause pain is something like: > > while (atomic_read(&foo) != n) > cpu_relax(); > > and the problem is that cpu_relax() doesn't know which particular > cacheline to flush in order to make things go faster, hm? But what about any global variable? Can't we also just have: while (global != n) cpu_relax(); ? -- Steve ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-14 21:29 ` Steven Rostedt
@ 2011-02-14 21:39   ` Steven Rostedt
  2011-02-14 21:46     ` David Miller
  2011-02-14 22:15     ` Matt Fleming
  0 siblings, 2 replies; 113+ messages in thread
From: Steven Rostedt @ 2011-02-14 21:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Newton, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx,
	andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem,
	sam, ddaney, michael, linux-kernel, Mike Frysinger,
	Chris Metcalf, dhowells, Martin Schwidefsky, heiko.carstens, benh

On Mon, 2011-02-14 at 16:29 -0500, Steven Rostedt wrote:
> > while (atomic_read(&foo) != n)
> > 	cpu_relax();
> >
> > and the problem is that cpu_relax() doesn't know which particular
> > cacheline to flush in order to make things go faster, hm?
>
> But what about any global variable? Can't we also just have:
>
> while (global != n)
> 	cpu_relax();
>
> ?

Matt Fleming answered this for me on IRC, and I'll share the answer
here (for those that are dying to know ;)

Seems that the atomic_inc() uses ll/sc operations that do not affect
the cache. Thus the problem is only with atomic_read(), as

	while (atomic_read(&foo) != n)
		cpu_relax();

will just check the cached version of foo. But because ll/sc skips the
cache, the foo will never update. That is, atomic_inc() and friends do
not touch the cache, so the CPU spinning in this loop is only checking
the cache, and will spin forever.

Thus it is not about global, as global is updated by normal means and
will update the caches. atomic_t is updated via the ll/sc that ignores
the cache and causes all this to break down. IOW... broken hardware ;)

Matt, feel free to correct this if it is wrong.

-- Steve

^ permalink raw reply	[flat|nested] 113+ messages in thread
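Spelled out, the failure mode looks like this (an illustrative sketch
only; it assumes hardware where ll/sc bypasses a non-coherent data
cache):

	/* illustrative sketch of the broken interaction */
	static atomic_t foo = ATOMIC_INIT(0);

	static void writer(void)		/* runs on CPU 0 */
	{
		atomic_inc(&foo);	/* ll/sc: updates memory, but not
					 * the other CPU's cached copy */
	}

	static void spinner(void)		/* runs on CPU 1 */
	{
		/* a plain cached load may never refetch the line, so
		 * on such hardware this loop can spin forever */
		while (atomic_read(&foo) != 1)
			cpu_relax();
	}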
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 21:39 ` Steven Rostedt @ 2011-02-14 21:46 ` David Miller 2011-02-14 22:20 ` Steven Rostedt 2011-02-14 22:37 ` Matt Fleming 2011-02-14 22:15 ` Matt Fleming 1 sibling, 2 replies; 113+ messages in thread From: David Miller @ 2011-02-14 21:46 UTC (permalink / raw) To: rostedt Cc: peterz, will.newton, jbaron, mathieu.desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh From: Steven Rostedt <rostedt@goodmis.org> Date: Mon, 14 Feb 2011 16:39:36 -0500 > Thus it is not about global, as global is updated by normal means and > will update the caches. atomic_t is updated via the ll/sc that ignores > the cache and causes all this to break down. IOW... broken hardware ;) I don't see how cache coherency can possibly work if the hardware behaves this way. In cache aliasing situations, yes I can understand a L1 cache visibility issue being present, but with kernel only stuff that should never happen otherwise we have a bug in the arch cache flushing support. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-14 21:46 ` David Miller
@ 2011-02-14 22:20   ` Steven Rostedt
  2011-02-14 22:21     ` Steven Rostedt
		      ` (2 more replies)
  0 siblings, 3 replies; 113+ messages in thread
From: Steven Rostedt @ 2011-02-14 22:20 UTC (permalink / raw)
  To: David Miller
  Cc: peterz, will.newton, jbaron, mathieu.desnoyers, hpa, mingo, tglx,
	andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam,
	ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells,
	schwidefsky, heiko.carstens, benh

On Mon, 2011-02-14 at 13:46 -0800, David Miller wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
> Date: Mon, 14 Feb 2011 16:39:36 -0500
>
> > Thus it is not about global, as global is updated by normal means and
> > will update the caches. atomic_t is updated via the ll/sc that ignores
> > the cache and causes all this to break down. IOW... broken hardware ;)
>
> I don't see how cache coherency can possibly work if the hardware
> behaves this way.
>
> In cache aliasing situations, yes I can understand a L1 cache visibility
> issue being present, but with kernel only stuff that should never happen
> otherwise we have a bug in the arch cache flushing support.

I guess the issue is, if you use ll/sc on memory, you must always use
ll/sc on that memory, otherwise any normal read won't read the proper
cache.

The atomic_read() in this arch uses ll to read the memory directly and
skip the cache. If we make atomic_read() like the other archs:

#define atomic_read(v)	(*(volatile int *)&(v)->counter)

This pulls the counter into cache, and it will not be updated by an
atomic_inc() from another CPU.

Ideally, we would like a single atomic_read() but due to these wacky
archs, it may not be possible.

-- Steve

^ permalink raw reply	[flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 22:20 ` Steven Rostedt @ 2011-02-14 22:21 ` Steven Rostedt 2011-02-14 22:21 ` H. Peter Anvin 2011-02-14 22:33 ` David Miller 2 siblings, 0 replies; 113+ messages in thread From: Steven Rostedt @ 2011-02-14 22:21 UTC (permalink / raw) To: David Miller Cc: peterz, will.newton, jbaron, mathieu.desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Mon, 2011-02-14 at 17:20 -0500, Steven Rostedt wrote: > I guess the issue is, if you use ll/sc on memory, you must always use > ll/sc on that memory, otherwise any normal read won't read the proper > cache. s/cache/memory/ -- Steve ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 22:20 ` Steven Rostedt 2011-02-14 22:21 ` Steven Rostedt @ 2011-02-14 22:21 ` H. Peter Anvin 2011-02-14 22:29 ` Mathieu Desnoyers [not found] ` <BLU0-SMTP98BFCC52FD41661DD9CC1E96D00@phx.gbl> 2011-02-14 22:33 ` David Miller 2 siblings, 2 replies; 113+ messages in thread From: H. Peter Anvin @ 2011-02-14 22:21 UTC (permalink / raw) To: Steven Rostedt Cc: David Miller, peterz, will.newton, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On 02/14/2011 02:20 PM, Steven Rostedt wrote: > > Ideally, we would like a single atomic_read() but due to these wacky > archs, it may not be possible. > #ifdef ARCH_ATOMIC_READ_SUCKS_EGGS? -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 22:21 ` H. Peter Anvin @ 2011-02-14 22:29 ` Mathieu Desnoyers [not found] ` <BLU0-SMTP98BFCC52FD41661DD9CC1E96D00@phx.gbl> 1 sibling, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-14 22:29 UTC (permalink / raw) To: H. Peter Anvin Cc: Steven Rostedt, David Miller, peterz, will.newton, jbaron, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh * H. Peter Anvin (hpa@zytor.com) wrote: > On 02/14/2011 02:20 PM, Steven Rostedt wrote: > > > > Ideally, we would like a single atomic_read() but due to these wacky > > archs, it may not be possible. > > > > #ifdef ARCH_ATOMIC_READ_SUCKS_EGGS? > > -hpa lol :) Hrm, I wonder if it might cause problems with combinations of "cmpxchg" and "read" performed on a variable (without using atomic.h). Mathieu > > > -- > H. Peter Anvin, Intel Open Source Technology Center > I work for Intel. I don't speak on their behalf. > -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates [not found] ` <BLU0-SMTP98BFCC52FD41661DD9CC1E96D00@phx.gbl> @ 2011-02-14 22:33 ` David Miller 0 siblings, 0 replies; 113+ messages in thread From: David Miller @ 2011-02-14 22:33 UTC (permalink / raw) To: mathieu.desnoyers Cc: hpa, rostedt, peterz, will.newton, jbaron, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Date: Mon, 14 Feb 2011 17:29:58 -0500 > * H. Peter Anvin (hpa@zytor.com) wrote: >> On 02/14/2011 02:20 PM, Steven Rostedt wrote: >> > >> > Ideally, we would like a single atomic_read() but due to these wacky >> > archs, it may not be possible. >> > >> >> #ifdef ARCH_ATOMIC_READ_SUCKS_EGGS? >> >> -hpa > > lol :) > > Hrm, I wonder if it might cause problems with combinations of "cmpxchg" > and "read" performed on a variable (without using atomic.h). We do that everywhere, it has to work. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 22:20 ` Steven Rostedt 2011-02-14 22:21 ` Steven Rostedt 2011-02-14 22:21 ` H. Peter Anvin @ 2011-02-14 22:33 ` David Miller 2 siblings, 0 replies; 113+ messages in thread From: David Miller @ 2011-02-14 22:33 UTC (permalink / raw) To: rostedt Cc: peterz, will.newton, jbaron, mathieu.desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh From: Steven Rostedt <rostedt@goodmis.org> Date: Mon, 14 Feb 2011 17:20:30 -0500 > On Mon, 2011-02-14 at 13:46 -0800, David Miller wrote: >> From: Steven Rostedt <rostedt@goodmis.org> >> Date: Mon, 14 Feb 2011 16:39:36 -0500 >> >> > Thus it is not about global, as global is updated by normal means and >> > will update the caches. atomic_t is updated via the ll/sc that ignores >> > the cache and causes all this to break down. IOW... broken hardware ;) >> >> I don't see how cache coherency can possibly work if the hardware >> behaves this way. >> >> In cache aliasing situations, yes I can understand a L1 cache visibility >> issue being present, but with kernel only stuff that should never happen >> otherwise we have a bug in the arch cache flushing support. > > I guess the issue is, if you use ll/sc on memory, you must always use > ll/sc on that memory, otherwise any normal read won't read the proper > cache. That also makes no sense at all. Any update to the L2 cache must be snooped by the L1 cache and cause an update, otherwise nothing can work correctly. So every object we use cmpxchg() on in the kernel cannot work on this architecture? Is that what you're saying? If so, a lot of things we do will not work. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 21:46 ` David Miller 2011-02-14 22:20 ` Steven Rostedt @ 2011-02-14 22:37 ` Matt Fleming 2011-02-14 23:03 ` Mathieu Desnoyers ` (3 more replies) 1 sibling, 4 replies; 113+ messages in thread From: Matt Fleming @ 2011-02-14 22:37 UTC (permalink / raw) To: David Miller Cc: rostedt, peterz, will.newton, jbaron, mathieu.desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Mon, 14 Feb 2011 13:46:00 -0800 (PST) David Miller <davem@davemloft.net> wrote: > From: Steven Rostedt <rostedt@goodmis.org> > Date: Mon, 14 Feb 2011 16:39:36 -0500 > > > Thus it is not about global, as global is updated by normal means > > and will update the caches. atomic_t is updated via the ll/sc that > > ignores the cache and causes all this to break down. IOW... broken > > hardware ;) > > I don't see how cache coherency can possibly work if the hardware > behaves this way. Cache coherency is still maintained provided writes/reads both go through the cache ;-) The problem is that for read-modify-write operations the arbitration logic that decides who "wins" and is allowed to actually perform the write, assuming two or more CPUs are competing for a single memory address, is not implemented in the cache controller, I think. I'm not a hardware engineer and I never understood how the arbitration logic worked but I'm guessing that's the reason that the ll/sc instructions bypass the cache. Which is why the atomic_t functions worked out really well for that arch, such that any accesses to an atomic_t * had to go through the wrapper functions. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 22:37 ` Matt Fleming @ 2011-02-14 23:03 ` Mathieu Desnoyers [not found] ` <BLU0-SMTP166A8555C791786059B0FF96D00@phx.gbl> ` (2 subsequent siblings) 3 siblings, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-14 23:03 UTC (permalink / raw) To: Matt Fleming Cc: David Miller, rostedt, peterz, will.newton, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh, Paul E. McKenney * Matt Fleming (matt@console-pimps.org) wrote: > On Mon, 14 Feb 2011 13:46:00 -0800 (PST) > David Miller <davem@davemloft.net> wrote: > > > From: Steven Rostedt <rostedt@goodmis.org> > > Date: Mon, 14 Feb 2011 16:39:36 -0500 > > > > > Thus it is not about global, as global is updated by normal means > > > and will update the caches. atomic_t is updated via the ll/sc that > > > ignores the cache and causes all this to break down. IOW... broken > > > hardware ;) > > > > I don't see how cache coherency can possibly work if the hardware > > behaves this way. > > Cache coherency is still maintained provided writes/reads both go > through the cache ;-) > > The problem is that for read-modify-write operations the arbitration > logic that decides who "wins" and is allowed to actually perform the > write, assuming two or more CPUs are competing for a single memory > address, is not implemented in the cache controller, I think. I'm not a > hardware engineer and I never understood how the arbitration logic > worked but I'm guessing that's the reason that the ll/sc instructions > bypass the cache. > > Which is why the atomic_t functions worked out really well for that > arch, such that any accesses to an atomic_t * had to go through the > wrapper functions. If this is true, then we have bugs in lots of xchg/cmpxchg users (which do not reside in atomic.h), e.g.: fs/fs_struct.c: int current_umask(void) { return current->fs->umask; } EXPORT_SYMBOL(current_umask); kernel/sys.c: SYSCALL_DEFINE1(umask, int, mask) { mask = xchg(&current->fs->umask, mask & S_IRWXUGO); return mask; } The solution to this would be to force all xchg/cmpxchg users to swap to atomic.h variables, which would force the ll semantic on read. But I'd really like to see where this is documented first -- or which PowerPC engineer we should talk to. Thanks, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
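The conversion Mathieu alludes to would look roughly as follows for the umask example. This is purely illustrative: fs->umask is a plain int in this kernel, and no such patch has been posted.

	/* today: plain int, bare xchg() and bare read */
	mask = xchg(&current->fs->umask, mask & S_IRWXUGO);
	...
	return current->fs->umask;

	/* converted: atomic_t, so reads also go through the arch accessor */
	mask = atomic_xchg(&current->fs->umask, mask & S_IRWXUGO);
	...
	return atomic_read(&current->fs->umask);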
* Re: [PATCH 0/2] jump label: 2.6.38 updates [not found] ` <BLU0-SMTP166A8555C791786059B0FF96D00@phx.gbl> @ 2011-02-14 23:09 ` Paul E. McKenney 2011-02-14 23:29 ` Mathieu Desnoyers ` (3 more replies) 0 siblings, 4 replies; 113+ messages in thread From: Paul E. McKenney @ 2011-02-14 23:09 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Matt Fleming, David Miller, rostedt, peterz, will.newton, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Mon, Feb 14, 2011 at 06:03:01PM -0500, Mathieu Desnoyers wrote: > * Matt Fleming (matt@console-pimps.org) wrote: > > On Mon, 14 Feb 2011 13:46:00 -0800 (PST) > > David Miller <davem@davemloft.net> wrote: > > > > > From: Steven Rostedt <rostedt@goodmis.org> > > > Date: Mon, 14 Feb 2011 16:39:36 -0500 > > > > > > > Thus it is not about global, as global is updated by normal means > > > > and will update the caches. atomic_t is updated via the ll/sc that > > > > ignores the cache and causes all this to break down. IOW... broken > > > > hardware ;) > > > > > > I don't see how cache coherency can possibly work if the hardware > > > behaves this way. > > > > Cache coherency is still maintained provided writes/reads both go > > through the cache ;-) > > > > The problem is that for read-modify-write operations the arbitration > > logic that decides who "wins" and is allowed to actually perform the > > write, assuming two or more CPUs are competing for a single memory > > address, is not implemented in the cache controller, I think. I'm not a > > hardware engineer and I never understood how the arbitration logic > > worked but I'm guessing that's the reason that the ll/sc instructions > > bypass the cache. > > > > Which is why the atomic_t functions worked out really well for that > > arch, such that any accesses to an atomic_t * had to go through the > > wrapper functions. ??? What CPU family are we talking about here? For cache coherent CPUs, cache coherence really is supposed to work, even for mixed atomic and non-atomic instructions to the same variable. Thanx, Paul > If this is true, then we have bugs in lots of xchg/cmpxchg users (which > do not reside in atomic.h), e.g.: > > fs/fs_struct.c: > int current_umask(void) > { > return current->fs->umask; > } > EXPORT_SYMBOL(current_umask); > > kernel/sys.c: > SYSCALL_DEFINE1(umask, int, mask) > { > mask = xchg(&current->fs->umask, mask & S_IRWXUGO); > return mask; > } > > The solution to this would be to force all xchg/cmpxchg users to swap to > atomic.h variables, which would force the ll semantic on read. But I'd > really like to see where this is documented first -- or which PowerPC > engineer we should talk to. > > Thanks, > > Mathieu > > -- > Mathieu Desnoyers > Operating System Efficiency R&D Consultant > EfficiOS Inc. > http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 23:09 ` Paul E. McKenney @ 2011-02-14 23:29 ` Mathieu Desnoyers [not found] ` <BLU0-SMTP4599FAAD7330498472B87396D00@phx.gbl> ` (2 subsequent siblings) 3 siblings, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-14 23:29 UTC (permalink / raw) To: Paul E. McKenney Cc: Matt Fleming, David Miller, rostedt, peterz, will.newton, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh, Segher Boessenkool, Paul Mackerras [ added Segher Boessenkool and Paul Mackerras to CC list ] * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote: > On Mon, Feb 14, 2011 at 06:03:01PM -0500, Mathieu Desnoyers wrote: > > * Matt Fleming (matt@console-pimps.org) wrote: > > > On Mon, 14 Feb 2011 13:46:00 -0800 (PST) > > > David Miller <davem@davemloft.net> wrote: > > > > > > > From: Steven Rostedt <rostedt@goodmis.org> > > > > Date: Mon, 14 Feb 2011 16:39:36 -0500 > > > > > > > > > Thus it is not about global, as global is updated by normal means > > > > > and will update the caches. atomic_t is updated via the ll/sc that > > > > > ignores the cache and causes all this to break down. IOW... broken > > > > > hardware ;) > > > > > > > > I don't see how cache coherency can possibly work if the hardware > > > > behaves this way. > > > > > > Cache coherency is still maintained provided writes/reads both go > > > through the cache ;-) > > > > > > The problem is that for read-modify-write operations the arbitration > > > logic that decides who "wins" and is allowed to actually perform the > > > write, assuming two or more CPUs are competing for a single memory > > > address, is not implemented in the cache controller, I think. I'm not a > > > hardware engineer and I never understood how the arbitration logic > > > worked but I'm guessing that's the reason that the ll/sc instructions > > > bypass the cache. > > > > > > Which is why the atomic_t functions worked out really well for that > > > arch, such that any accesses to an atomic_t * had to go through the > > > wrapper functions. > > ??? > > What CPU family are we talking about here? For cache coherent CPUs, > cache coherence really is supposed to work, even for mixed atomic and > non-atomic instructions to the same variable. > I'm really curious to know which CPU families too. I've used git blame to see where these lwz/stw instructions were added to powerpc, and it points to: commit 9f0cbea0d8cc47801b853d3c61d0e17475b0cc89 Author: Segher Boessenkool <segher@kernel.crashing.org> Date: Sat Aug 11 10:15:30 2007 +1000 [POWERPC] Implement atomic{, 64}_{read, write}() without volatile Instead, use asm() like all other atomic operations already do. Also use inline functions instead of macros; this actually improves code generation (some code becomes a little smaller, probably because of improved alias information -- just a few hundred bytes total on a default kernel build, nothing shocking). Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org> So let's ping the relevant people to see if there was any reason for making these atomic read/set operations different from other architectures in the first place. Thanks, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates [not found] ` <BLU0-SMTP4599FAAD7330498472B87396D00@phx.gbl> @ 2011-02-15 0:19 ` Segher Boessenkool 2011-02-15 0:48 ` Mathieu Desnoyers 2011-02-15 1:29 ` Steven Rostedt 0 siblings, 2 replies; 113+ messages in thread From: Segher Boessenkool @ 2011-02-15 0:19 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Paul E. McKenney, Matt Fleming, David Miller, rostedt, peterz, will.newton, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh, Segher Boessenkool, Paul Mackerras >> What CPU family are we talking about here? For cache coherent CPUs, >> cache coherence really is supposed to work, even for mixed atomic and >> non-atomic instructions to the same variable. > > I'm really curious to know which CPU families too. I've used git blame > to see where these lwz/stw instructions were added to powerpc, and it > points to: > > commit 9f0cbea0d8cc47801b853d3c61d0e17475b0cc89 > So let's ping the relevant people to see if there was any reason for > making these atomic read/set operations different from other > architectures in the first place. lwz is a simple 32-bit load. On PowerPC, such a load is guaranteed to be atomic (except some unaligned cases). stw is similar, for stores. These are the normal insns, not ll/sc or anything. At the time, volatile tricks were used to make the accesses atomic; this patch changed that. Result is (or should be!) better code generation. Is there a problem with it? Segher ^ permalink raw reply [flat|nested] 113+ messages in thread
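For reference, the change Segher describes amounts to roughly the following. The asm constraints are simplified relative to the actual commit, so read this as a sketch of the idea rather than the patch itself:

	/* Before: the load compiles from a volatile-qualified counter. */
	#define atomic_read(v)		((v)->counter)

	/* After: a plain lwz/stw wrapped in asm(); the ISA makes them atomic. */
	static __inline__ int atomic_read(const atomic_t *v)
	{
		int t;

		__asm__ __volatile__("lwz %0,0(%1)" : "=r" (t) : "b" (&v->counter));
		return t;
	}

	static __inline__ void atomic_set(atomic_t *v, int i)
	{
		__asm__ __volatile__("stw %1,0(%2)" : "=m" (v->counter)
				     : "r" (i), "b" (&v->counter));
	}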
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 0:19 ` Segher Boessenkool @ 2011-02-15 0:48 ` Mathieu Desnoyers 2011-02-15 1:29 ` Steven Rostedt 1 sibling, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-15 0:48 UTC (permalink / raw) To: Segher Boessenkool Cc: Paul E. McKenney, Matt Fleming, David Miller, rostedt, peterz, will.newton, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh, Paul Mackerras * Segher Boessenkool (segher@kernel.crashing.org) wrote: > >> What CPU family are we talking about here? For cache coherent CPUs, > >> cache coherence really is supposed to work, even for mixed atomic and > >> non-atomic instructions to the same variable. > > > > I'm really curious to know which CPU families too. I've used git blame > > to see where these lwz/stw instructions were added to powerpc, and it > > points to: > > > > commit 9f0cbea0d8cc47801b853d3c61d0e17475b0cc89 > > > So let's ping the relevant people to see if there was any reason for > > making these atomic read/set operations different from other > > architectures in the first place. > > lwz is a simple 32-bit load. On PowerPC, such a load is guaranteed > to be atomic (except some unaligned cases). stw is similar, for stores. > These are the normal insns, not ll/sc or anything. > > At the time, volatile tricks were used to make the accesses atomic; this > patch changed that. Result is (or should be!) better code generation. > > Is there a problem with it? It seems fine then. It seems to be my confusion to think that Matt referred to PowerPC in his statement. It's probably an unrelated architecture. Thanks, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 0:19 ` Segher Boessenkool 2011-02-15 0:48 ` Mathieu Desnoyers @ 2011-02-15 1:29 ` Steven Rostedt 1 sibling, 0 replies; 113+ messages in thread From: Steven Rostedt @ 2011-02-15 1:29 UTC (permalink / raw) To: Segher Boessenkool Cc: Mathieu Desnoyers, Paul E. McKenney, Matt Fleming, David Miller, peterz, will.newton, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh, Paul Mackerras On Tue, 2011-02-15 at 01:19 +0100, Segher Boessenkool wrote: > >> What CPU family are we talking about here? For cache coherent CPUs, > >> cache coherence really is supposed to work, even for mixed atomic and > >> non-atomic instructions to the same variable. > > > > I'm really curious to know which CPU families too. I've used git blame > > to see where these lwz/stw instructions were added to powerpc, and it > > points to: > > > > commit 9f0cbea0d8cc47801b853d3c61d0e17475b0cc89 > > > So let's ping the relevant people to see if there was any reason for > > making these atomic read/set operations different from other > > architectures in the first place. > > lwz is a simple 32-bit load. On PowerPC, such a load is guaranteed > to be atomic (except some unaligned cases). stw is similar, for stores. > These are the normal insns, not ll/sc or anything. > > At the time, volatile tricks were used to make the accesses atomic; this > patch changed that. Result is (or should be!) better code generation. > > Is there a problem with it? I guess Mathieu was just getting confused. But we are looking at seeing if we can make atomic_read() a generic function instead of defining it for all archs. Just something that we could do to fix the include header hell when a static inline contains atomic_read() and happens to be included by kernel.h. Then we have atomic.h needing to include kernel.h which needs to include atomic.h first and so on. Although, it may be just best if we can do some #ifdef magic to prevent all this mess anyway. -- Steve ^ permalink raw reply [flat|nested] 113+ messages in thread
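A minimal sketch of what such a generic definition could look like; the header placement and the #ifndef override scheme here are illustrative assumptions, not an existing interface:

	/* hypothetical asm-generic fallback */
	#ifndef atomic_read
	/*
	 * An aligned word load is atomic on every supported architecture,
	 * so a plain volatile read suffices, and it pulls in no other
	 * headers, which is the point for breaking the include cycle.
	 */
	#define atomic_read(v)	(*(volatile int *)&(v)->counter)
	#endif

An architecture that genuinely cannot use a plain load (the ll/sc-only case discussed above) would define its own atomic_read before this point, and the #ifndef keeps the two definitions from colliding.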
* Re: [PATCH 0/2] jump label: 2.6.38 updates [not found] ` <BLU0-SMTP984E876DBDFBC13F4C86F896D00@phx.gbl> @ 2011-02-15 0:42 ` Paul E. McKenney 2011-02-15 0:51 ` Mathieu Desnoyers 0 siblings, 1 reply; 113+ messages in thread From: Paul E. McKenney @ 2011-02-15 0:42 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Matt Fleming, David Miller, rostedt, peterz, will.newton, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh, Segher Boessenkool, Paul Mackerras On Mon, Feb 14, 2011 at 06:29:47PM -0500, Mathieu Desnoyers wrote: > [ added Segher Boessenkool and Paul Mackerras to CC list ] > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote: > > On Mon, Feb 14, 2011 at 06:03:01PM -0500, Mathieu Desnoyers wrote: > > > * Matt Fleming (matt@console-pimps.org) wrote: > > > > On Mon, 14 Feb 2011 13:46:00 -0800 (PST) > > > > David Miller <davem@davemloft.net> wrote: > > > > > > > > > From: Steven Rostedt <rostedt@goodmis.org> > > > > > Date: Mon, 14 Feb 2011 16:39:36 -0500 > > > > > > > > > > > Thus it is not about global, as global is updated by normal means > > > > > > and will update the caches. atomic_t is updated via the ll/sc that > > > > > > ignores the cache and causes all this to break down. IOW... broken > > > > > > hardware ;) > > > > > > > > > > I don't see how cache coherency can possibly work if the hardware > > > > > behaves this way. > > > > > > > > Cache coherency is still maintained provided writes/reads both go > > > > through the cache ;-) > > > > > > > > The problem is that for read-modify-write operations the arbitration > > > > logic that decides who "wins" and is allowed to actually perform the > > > > write, assuming two or more CPUs are competing for a single memory > > > > address, is not implemented in the cache controller, I think. I'm not a > > > > hardware engineer and I never understood how the arbitration logic > > > > worked but I'm guessing that's the reason that the ll/sc instructions > > > > bypass the cache. > > > > > > > > Which is why the atomic_t functions worked out really well for that > > > > arch, such that any accesses to an atomic_t * had to go through the > > > > wrapper functions. > > > > ??? > > > > What CPU family are we talking about here? For cache coherent CPUs, > > cache coherence really is supposed to work, even for mixed atomic and > > non-atomic instructions to the same variable. > > > > I'm really curious to know which CPU families too. I've used git blame > to see where these lwz/stw instructions were added to powerpc, and it > points to: But lwz and stw instructions are normal non-atomic PowerPC loads and stores. No LL/SC -- those would instead be lwarx and stwcx. Thanx, Paul > commit 9f0cbea0d8cc47801b853d3c61d0e17475b0cc89 > Author: Segher Boessenkool <segher@kernel.crashing.org> > Date: Sat Aug 11 10:15:30 2007 +1000 > > [POWERPC] Implement atomic{, 64}_{read, write}() without volatile > > Instead, use asm() like all other atomic operations already do. > > Also use inline functions instead of macros; this actually > improves code generation (some code becomes a little smaller, > probably because of improved alias information -- just a few > hundred bytes total on a default kernel build, nothing shocking). 
> > Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> > Signed-off-by: Paul Mackerras <paulus@samba.org> > > So let's ping the relevant people to see if there was any reason for > making these atomic read/set operations different from other > architectures in the first place. > > Thanks, > > Mathieu > > -- > Mathieu Desnoyers > Operating System Efficiency R&D Consultant > EfficiOS Inc. > http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 0:42 ` Paul E. McKenney @ 2011-02-15 0:51 ` Mathieu Desnoyers 0 siblings, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-15 0:51 UTC (permalink / raw) To: Paul E. McKenney Cc: Matt Fleming, David Miller, rostedt, peterz, will.newton, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh, Segher Boessenkool, Paul Mackerras * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote: > On Mon, Feb 14, 2011 at 06:29:47PM -0500, Mathieu Desnoyers wrote: > > [ added Segher Boessenkool and Paul Mackerras to CC list ] > > > > * Paul E. McKenney (paulmck@linux.vnet.ibm.com) wrote: > > > On Mon, Feb 14, 2011 at 06:03:01PM -0500, Mathieu Desnoyers wrote: > > > > * Matt Fleming (matt@console-pimps.org) wrote: > > > > > On Mon, 14 Feb 2011 13:46:00 -0800 (PST) > > > > > David Miller <davem@davemloft.net> wrote: > > > > > > > > > > > From: Steven Rostedt <rostedt@goodmis.org> > > > > > > Date: Mon, 14 Feb 2011 16:39:36 -0500 > > > > > > > > > > > > > Thus it is not about global, as global is updated by normal means > > > > > > > and will update the caches. atomic_t is updated via the ll/sc that > > > > > > > ignores the cache and causes all this to break down. IOW... broken > > > > > > > hardware ;) > > > > > > > > > > > > I don't see how cache coherency can possibly work if the hardware > > > > > > behaves this way. > > > > > > > > > > Cache coherency is still maintained provided writes/reads both go > > > > > through the cache ;-) > > > > > > > > > > The problem is that for read-modify-write operations the arbitration > > > > > logic that decides who "wins" and is allowed to actually perform the > > > > > write, assuming two or more CPUs are competing for a single memory > > > > > address, is not implemented in the cache controller, I think. I'm not a > > > > > hardware engineer and I never understood how the arbitration logic > > > > > worked but I'm guessing that's the reason that the ll/sc instructions > > > > > bypass the cache. > > > > > > > > > > Which is why the atomic_t functions worked out really well for that > > > > > arch, such that any accesses to an atomic_t * had to go through the > > > > > wrapper functions. > > > > > > ??? > > > > > > What CPU family are we talking about here? For cache coherent CPUs, > > > cache coherence really is supposed to work, even for mixed atomic and > > > non-atomic instructions to the same variable. > > > > > > > I'm really curious to know which CPU families too. I've used git blame > > to see where these lwz/stw instructions were added to powerpc, and it > > points to: > > But lwz and stw instructions are normal non-atomic PowerPC loads and > stores. No LL/SC -- those would instead be lwarx and stwcx. Ah, right. Color me confused ;) I think Matt was talking about a secret "out of tree" architecture. It sure feels like a James Bond movie. :) Thanks, Mathieu > > Thanx, Paul > > > commit 9f0cbea0d8cc47801b853d3c61d0e17475b0cc89 > > Author: Segher Boessenkool <segher@kernel.crashing.org> > > Date: Sat Aug 11 10:15:30 2007 +1000 > > > > [POWERPC] Implement atomic{, 64}_{read, write}() without volatile > > > > Instead, use asm() like all other atomic operations already do. 
> > > > Also use inline functions instead of macros; this actually > > improves code generation (some code becomes a little smaller, > > probably because of improved alias information -- just a few > > hundred bytes total on a default kernel build, nothing shocking). > > > > Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> > > Signed-off-by: Paul Mackerras <paulus@samba.org> > > > > So let's ping the relevant people to see if there was any reason for > > making these atomic read/set operations different from other > > architectures in the first place. > > > > Thanks, > > > > Mathieu > > > > -- > > Mathieu Desnoyers > > Operating System Efficiency R&D Consultant > > EfficiOS Inc. > > http://www.efficios.com -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 23:09 ` Paul E. McKenney ` (2 preceding siblings ...) [not found] ` <BLU0-SMTP984E876DBDFBC13F4C86F896D00@phx.gbl> @ 2011-02-15 11:53 ` Will Newton 2011-02-18 19:03 ` Paul E. McKenney 3 siblings, 1 reply; 113+ messages in thread From: Will Newton @ 2011-02-15 11:53 UTC (permalink / raw) To: paulmck Cc: Mathieu Desnoyers, Matt Fleming, David Miller, rostedt, peterz, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Mon, Feb 14, 2011 at 11:09 PM, Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote: Hi Paul, > What CPU family are we talking about here? For cache coherent CPUs, > cache coherence really is supposed to work, even for mixed atomic and > non-atomic instructions to the same variable. Is there a specific situation you can think of where this would be a problem? I have to admit to a certain amount of unease with the design our hardware guys came up with, but I don't have a specific case where it won't work, just cases where it is less than optimal. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 11:53 ` Will Newton @ 2011-02-18 19:03 ` Paul E. McKenney 0 siblings, 0 replies; 113+ messages in thread From: Paul E. McKenney @ 2011-02-18 19:03 UTC (permalink / raw) To: Will Newton Cc: Mathieu Desnoyers, Matt Fleming, David Miller, rostedt, peterz, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Tue, Feb 15, 2011 at 11:53:37AM +0000, Will Newton wrote: > On Mon, Feb 14, 2011 at 11:09 PM, Paul E. McKenney > <paulmck@linux.vnet.ibm.com> wrote: > > Hi Paul, > > > What CPU family are we talking about here? For cache coherent CPUs, > > cache coherence really is supposed to work, even for mixed atomic and > > non-atomic instructions to the same variable. > > Is there a specific situation you can think of where this would be a > problem? I have to admit to a certain amount of unease with the design > our hardware guys came up with, but I don't have a specific case where > it won't work, just cases where it is less than optimal. OK, you did ask... One case is when a given block of memory was subject to atomic instructions, then was freed and reallocated as a structure used by normal instructions. It would be quite bad if the last pre-free atomic operation failed to play nice with the first post-allocate non-atomic instruction. The reverse situation is of course important as well, where a block subject to non-atomic instructions is freed and reallocated as a structure subject to atomic instructions. I would guess you would handle these cases by making the memory allocator deal with any hardware caching issues, but however it is handled, it does need to be handled. Another case is a leaky-bucket token protocol, where there is a rate limit of some sort. There is an integer that is positive when progress is permitted, and negative otherwise. This integer is periodically reset to its upper limit, and this reset operation can use a non-atomic store. When attempting to carry out a rate-limited operation, you use either atomic_add_return() if underflow cannot happen, but you must use cmpxchg() if underflow is a possibility. Now you -could- use atomic xchg() to reset the integer, but you don't have to. You -could- also use atomic_cmpxchg() to check and atomic_set() to reset the limit, but again, you don't have to. And there might well be places in the Linux kernel that mix atomic and non-atomic operations in this case. Yet another case is a variation on the lockless queue that can have concurrent enqueues but where only one task may dequeue at a time, for example, dequeuing might be guarded by a lock. Suppose that dequeues removed all the elements on the queue at one shot. Such a queue might have a head and tail pointer, where the tail pointer references the ->next pointer of the last element, or references the head pointer if the queue is empty. Each element also has a flag that indicates whether it is a normal element or a dummy element. Enqueues are handled in the normal way for this sort of queue: 1. Initialize the element to be added, including NULLing out its ->next pointer. 2. Atomically exchange the queue's tail pointer with a pointer to the element's ->next pointer, placing the old tail pointer into a local variable (call it "oldtail"). 3. Nonatomically set the pointer referenced by oldtail to point to the newly added element. Then a bulk dequeue could be written as follows: 1. 
Pick up the head pointer, placing it in a local variable (call it "oldhead"). If NULL, return an empty list, otherwise continue through the following steps. 2. Store NULL into the head pointer. This can be done nonatomically, because no one else will be concurrently storing into this pointer -- there is at least one element on the list, and so the enqueuers will be instead storing to the ->next pointer of the last element. 3. Atomically exchange the queue's tail pointer with a pointer to the queue's head pointer, placing the old value of the tail pointer into a local variable (again, call it "oldtail"). 4. Return a list with oldhead as the head pointer and oldtail as the tail pointer. The caller cannot rely on NULL pointers to find the end of the list, as an enqueuer might be delayed between steps 2 and 3. Instead, the caller must check to see if the address of the NULL pointer is equal to oldtail, in which case, the caller has in fact reached the end of the list. Otherwise, the caller must wait for the pointer to become non-NULL. Yes, you can replace the non-atomic loads and stores in the enqueuer's step #3 and in the bulk dequeue's steps #1 and #2 with atomic exchange instructions -- in fact you can replace either or both. And you could also require that the caller use atomic instructions when looking at each element's ->next pointer. There are other algorithms, but this should be a decent start. And yes, you -can- make these algorithms use only atomic instructions, but you don't -have- to. So it is quite likely that similar algorithms exist somewhere in the 10+ million lines of code making up the Linux kernel. Thanx, Paul ^ permalink raw reply [flat|nested] 113+ messages in thread
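As a concrete rendering of the third algorithm above, here is a compact sketch. The structure and function names are invented for illustration, xchg() stands for the kernel's atomic exchange, and error handling is omitted:

	struct node {
		struct node *next;
		/* payload ... */
	};

	struct queue {
		struct node *head;	/* first element, NULL when empty   */
		struct node **tail;	/* &last->next, or &head when empty */
	};

	void enqueue(struct queue *q, struct node *n)
	{
		struct node **oldtail;

		n->next = NULL;				/* step 1              */
		oldtail = xchg(&q->tail, &n->next);	/* step 2: atomic swap */
		*oldtail = n;				/* step 3: plain store */
	}

	/* Single dequeuer only, e.g. called under a lock. */
	struct node *bulk_dequeue(struct queue *q, struct node ***oldtail)
	{
		struct node *oldhead = q->head;		/* step 1              */

		if (!oldhead)
			return NULL;
		q->head = NULL;				/* step 2: plain store */
		*oldtail = xchg(&q->tail, &q->head);	/* step 3: atomic swap */
		return oldhead;
	}

As described above, the caller then walks the returned list until it reaches the ->next field whose address equals *oldtail, spinning on any ->next that is still NULL because its enqueuer is between steps 2 and 3.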
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 22:37 ` Matt Fleming 2011-02-14 23:03 ` Mathieu Desnoyers [not found] ` <BLU0-SMTP166A8555C791786059B0FF96D00@phx.gbl> @ 2011-02-14 23:19 ` H. Peter Anvin 2011-02-15 11:01 ` Will Newton [not found] ` <BLU0-SMTP637B2E9372CFBF3A0B5B0996D00@phx.gbl> 3 siblings, 1 reply; 113+ messages in thread From: H. Peter Anvin @ 2011-02-14 23:19 UTC (permalink / raw) To: Matt Fleming Cc: David Miller, rostedt, peterz, will.newton, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On 02/14/2011 02:37 PM, Matt Fleming wrote: >> >> I don't see how cache coherency can possibly work if the hardware >> behaves this way. > > Cache coherency is still maintained provided writes/reads both go > through the cache ;-) > > The problem is that for read-modify-write operations the arbitration > logic that decides who "wins" and is allowed to actually perform the > write, assuming two or more CPUs are competing for a single memory > address, is not implemented in the cache controller, I think. I'm not a > hardware engineer and I never understood how the arbitration logic > worked but I'm guessing that's the reason that the ll/sc instructions > bypass the cache. > > Which is why the atomic_t functions worked out really well for that > arch, such that any accesses to an atomic_t * had to go through the > wrapper functions. I'm sorry... this doesn't compute. Either reads can work normally (go through the cache) in which case atomic_read() can simply be a read or they don't, so I don't understand this at all. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 23:19 ` H. Peter Anvin @ 2011-02-15 11:01 ` Will Newton 2011-02-15 13:31 ` H. Peter Anvin 2011-02-15 21:11 ` Will Simoneau 0 siblings, 2 replies; 113+ messages in thread From: Will Newton @ 2011-02-15 11:01 UTC (permalink / raw) To: H. Peter Anvin Cc: Matt Fleming, David Miller, rostedt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Mon, Feb 14, 2011 at 11:19 PM, H. Peter Anvin <hpa@zytor.com> wrote: > On 02/14/2011 02:37 PM, Matt Fleming wrote: >>> >>> I don't see how cache coherency can possibly work if the hardware >>> behaves this way. >> >> Cache coherency is still maintained provided writes/reads both go >> through the cache ;-) >> >> The problem is that for read-modify-write operations the arbitration >> logic that decides who "wins" and is allowed to actually perform the >> write, assuming two or more CPUs are competing for a single memory >> address, is not implemented in the cache controller, I think. I'm not a >> hardware engineer and I never understood how the arbitration logic >> worked but I'm guessing that's the reason that the ll/sc instructions >> bypass the cache. >> >> Which is why the atomic_t functions worked out really well for that >> arch, such that any accesses to an atomic_t * had to go through the >> wrapper functions. > > I'm sorry... this doesn't compute. Either reads can work normally (go > through the cache) in which case atomic_read() can simply be a read or > they don't, so I don't understand this at all. The CPU in question has two sets of instructions: load/store - these go via the cache (write through) ll/sc - these operate literally as if there is no cache (they do not hit on read or write) This may or may not be a sensible way to architect a CPU, but I think it is possible to make it work. Making it work efficiently is more of a challenge. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 11:01 ` Will Newton @ 2011-02-15 13:31 ` H. Peter Anvin 2011-02-15 13:49 ` Steven Rostedt 2011-02-15 14:04 ` Will Newton 2011-02-15 21:11 ` Will Simoneau 1 sibling, 2 replies; 113+ messages in thread From: H. Peter Anvin @ 2011-02-15 13:31 UTC (permalink / raw) To: Will Newton Cc: Matt Fleming, David Miller, rostedt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On 02/15/2011 03:01 AM, Will Newton wrote: > > The CPU in question has two sets of instructions: > > load/store - these go via the cache (write through) > ll/sc - these operate literally as if there is no cache (they do not > hit on read or write) > > This may or may not be a sensible way to architect a CPU, but I think > it is possible to make it work. Making it work efficiently is more of > a challenge. > a) What "CPU in question" is this? b) Why should we let this particular insane CPU slow ALL OTHER CPUs down? -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 13:31 ` H. Peter Anvin @ 2011-02-15 13:49 ` Steven Rostedt 2011-02-15 14:04 ` Will Newton 1 sibling, 0 replies; 113+ messages in thread From: Steven Rostedt @ 2011-02-15 13:49 UTC (permalink / raw) To: H. Peter Anvin Cc: Will Newton, Matt Fleming, David Miller, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Tue, 2011-02-15 at 05:31 -0800, H. Peter Anvin wrote: > On 02/15/2011 03:01 AM, Will Newton wrote: > b) Why should we let this particular insane CPU slow ALL OTHER CPUs down? Yesterday I got around to reading Linus's interview here: http://www.itwire.com/opinion-and-analysis/open-sauce/44975-linus-torvalds-looking-back-looking-forward?start=4 This seems appropriate: "When it comes to "feature I had to include for reasons beyond my control", it tends to be about crazy hardware doing stupid things that we just have to work around. Most of the time that's limited to some specific driver or other, and it's not something that has any relevance in the "big picture", or that really affects core kernel design very much. But sometimes it does, and then I really detest it." -- Steve ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 13:31 ` H. Peter Anvin 2011-02-15 13:49 ` Steven Rostedt @ 2011-02-15 14:04 ` Will Newton 1 sibling, 0 replies; 113+ messages in thread From: Will Newton @ 2011-02-15 14:04 UTC (permalink / raw) To: H. Peter Anvin Cc: Matt Fleming, David Miller, rostedt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Tue, Feb 15, 2011 at 1:31 PM, H. Peter Anvin <hpa@zytor.com> wrote: > On 02/15/2011 03:01 AM, Will Newton wrote: >> >> The CPU in question has two sets of instructions: >> >> load/store - these go via the cache (write through) >> ll/sc - these operate literally as if there is no cache (they do not >> hit on read or write) >> >> This may or may not be a sensible way to architect a CPU, but I think >> it is possible to make it work. Making it work efficiently is more of >> a challenge. >> > > a) What "CPU in question" is this? http://imgtec.com/meta/meta-technology.asp > b) Why should we let this particular insane CPU slow ALL OTHER CPUs down? I didn't propose we do that. I brought it up just to make people aware that there are these odd architectures out there, and indeed it turns out Blackfin has some superficially similar issues. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 11:01 ` Will Newton 2011-02-15 13:31 ` H. Peter Anvin @ 2011-02-15 21:11 ` Will Simoneau 2011-02-15 21:27 ` David Miller 1 sibling, 1 reply; 113+ messages in thread From: Will Simoneau @ 2011-02-15 21:11 UTC (permalink / raw) To: Will Newton Cc: H. Peter Anvin, Matt Fleming, David Miller, rostedt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh [-- Attachment #1: Type: text/plain, Size: 4741 bytes --] On 11:01 Tue 15 Feb , Will Newton wrote: > On Mon, Feb 14, 2011 at 11:19 PM, H. Peter Anvin <hpa@zytor.com> wrote: > > On 02/14/2011 02:37 PM, Matt Fleming wrote: > >>> > >>> I don't see how cache coherency can possibly work if the hardware > >>> behaves this way. > >> > >> Cache coherency is still maintained provided writes/reads both go > >> through the cache ;-) > >> > >> The problem is that for read-modify-write operations the arbitration > >> logic that decides who "wins" and is allowed to actually perform the > >> write, assuming two or more CPUs are competing for a single memory > >> address, is not implemented in the cache controller, I think. I'm not a > >> hardware engineer and I never understood how the arbitration logic > >> worked but I'm guessing that's the reason that the ll/sc instructions > >> bypass the cache. > >> > >> Which is why the atomic_t functions worked out really well for that > >> arch, such that any accesses to an atomic_t * had to go through the > >> wrapper functions. > > > > I'm sorry... this doesn't compute. Either reads can work normally (go > > through the cache) in which case atomic_read() can simply be a read or > > they don't, so I don't understand this at all. > > The CPU in question has two sets of instructions: > > load/store - these go via the cache (write through) > ll/sc - these operate literally as if there is no cache (they do not > hit on read or write) > > This may or may not be a sensible way to architect a CPU, but I think > it is possible to make it work. Making it work efficiently is more of > a challenge. Speaking as a (non-commercial) processor designer here, but feel free to point out anything I'm wrong on. I have direct experience implementing these operations in hardware so I'd hope what I say here is right. This information is definitely relevant to the MIPS R4000 as well as my own hardware. A quick look at the PPC documentation seems to indicate it's the same there too, and it should agree with the Wikipedia article on the subject: http://en.wikipedia.org/wiki/Load-link/store-conditional The entire point of implementing load-linked (ll) / store-conditional (sc) instructions is to have lockless atomic primitives *using the cache*. Proper implementations do not bypass the cache; in fact, the cache coherence protocol must get involved for them to be correct. If properly implemented, these operations cause no external bus traffic if the critical section is uncontended and hits the cache (good for scalability). These are the semantics: ll: Essentially the same as a normal word load. Implementations will need to do a little internal book-keeping (i.e. save physical address of last ll instruction and/or modify coherence state for the cacheline). sc: Store a word if and only if the address has not been written by any other processor since the last ll. If the store fails, write 0 to a register, otherwise write 1.
The address may be tracked on cacheline granularity; this operation may spuriously fail, depending on implementation details (called "weak" ll/sc). Arguably the "obvious" way to implement this is to have sc fail if the local CPU snoops a read-for-ownership for the address in question coming from a remote CPU. This works because the remote CPU will need to gain the cacheline for exclusive access before its competing sc can execute. Code is supposed to put ll/sc in a loop and simply retry the operation until the sc succeeds. Note how the cache and cache coherence protocol are fundamental parts of this operation; if these instructions simply bypassed the cache, they *could not* work correctly - how do you detect when the underlying memory has been modified? You can't simply detect whether the value has changed - it may have been changed to another value and then back ("ABA" problem). You have to snoop bus transactions, and that is what the cache and its coherence algorithm already do. ll/sc can be implemented entirely using the side-effects of the cache coherence algorithm; my own working hardware implementation does this. So, atomically reading the variable can be accomplished with a normal load instruction. I can't speak for unaligned loads on implementations that do them in hardware, but at least an aligned load of word size should be atomic on any sane architecture. Only an atomic read-modify-write of the variable needs to use ll/sc at all, and only for the reason of preventing another concurrent modification between the load and store. A plain aligned word store should be atomic too, but it's not too useful because a another concurrent store would not be ordered relative to the local store. [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 113+ messages in thread
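The retry loop described above, rendered as C pseudocode; load_linked() and store_conditional() are stand-ins for the architecture's ll and sc instructions, not real kernel helpers:

	/*
	 * Atomically add i to v. The sc fails, and the loop retries, if any
	 * other CPU wrote the cacheline between the ll and the sc; that is
	 * exactly the event the coherence protocol lets the core detect.
	 */
	static inline int atomic_add_return(int i, atomic_t *v)
	{
		int old, new;

		do {
			old = load_linked(&v->counter);		/* ll */
			new = old + i;
		} while (!store_conditional(&v->counter, new));	/* sc */

		return new;
	}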
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 21:11 ` Will Simoneau @ 2011-02-15 21:27 ` David Miller 2011-02-15 21:56 ` Will Simoneau 2011-02-15 22:20 ` Benjamin Herrenschmidt 0 siblings, 2 replies; 113+ messages in thread From: David Miller @ 2011-02-15 21:27 UTC (permalink / raw) To: simoneau Cc: will.newton, hpa, matt, rostedt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh From: Will Simoneau <simoneau@ele.uri.edu> Date: Tue, 15 Feb 2011 16:11:23 -0500 > Note how the cache and cache coherence protocol are fundamental parts of this > operation; if these instructions simply bypassed the cache, they *could not* > work correctly - how do you detect when the underlying memory has been > modified? The issue here isn't L2 cache bypassing, it's local L1 cache bypassing. The chips in question apparently do not consult the local L1 cache on "ll" instructions. Therefore you must only ever access such atomic data using "ll" instructions. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 21:27 ` David Miller @ 2011-02-15 21:56 ` Will Simoneau 2011-02-16 10:15 ` Will Newton 2011-02-15 22:20 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 113+ messages in thread From: Will Simoneau @ 2011-02-15 21:56 UTC (permalink / raw) To: David Miller Cc: will.newton, hpa, matt, rostedt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh [-- Attachment #1: Type: text/plain, Size: 1234 bytes --] On 13:27 Tue 15 Feb , David Miller wrote: > From: Will Simoneau <simoneau@ele.uri.edu> > Date: Tue, 15 Feb 2011 16:11:23 -0500 > > > Note how the cache and cache coherence protocol are fundamental parts of this > > operation; if these instructions simply bypassed the cache, they *could not* > > work correctly - how do you detect when the underlying memory has been > > modified? > > The issue here isn't L2 cache bypassing, it's local L1 cache bypassing. > > The chips in question aparently do not consult the local L1 cache on > "ll" instructions. > > Therefore you must only ever access such atomic data using "ll" > instructions. (I should not have said "underlying memory", since it is correct that only the L1 caches are the problem here) That's some really crippled hardware... it does seem like *any* loads from *any* address updated by an sc would have to be done with ll as well, else they may load stale values. One could work this into atomic_read(), but surely there are other places that are problems. It would be OK if the caches on the hardware in question were to back-invalidate matching cachelines when the sc is snooped from the bus, but it sounds like this doesn't happen? [-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 21:56 ` Will Simoneau @ 2011-02-16 10:15 ` Will Newton 2011-02-16 12:18 ` Steven Rostedt 0 siblings, 1 reply; 113+ messages in thread From: Will Newton @ 2011-02-16 10:15 UTC (permalink / raw) To: Will Simoneau Cc: David Miller, hpa, matt, rostedt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Tue, Feb 15, 2011 at 9:56 PM, Will Simoneau <simoneau@ele.uri.edu> wrote: > On 13:27 Tue 15 Feb , David Miller wrote: >> From: Will Simoneau <simoneau@ele.uri.edu> >> Date: Tue, 15 Feb 2011 16:11:23 -0500 >> >> > Note how the cache and cache coherence protocol are fundamental parts of this >> > operation; if these instructions simply bypassed the cache, they *could not* >> > work correctly - how do you detect when the underlying memory has been >> > modified? >> >> The issue here isn't L2 cache bypassing, it's local L1 cache bypassing. >> >> The chips in question aparently do not consult the local L1 cache on >> "ll" instructions. >> >> Therefore you must only ever access such atomic data using "ll" >> instructions. > > (I should not have said "underlying memory", since it is correct that > only the L1 caches are the problem here) > > That's some really crippled hardware... it does seem like *any* loads > from *any* address updated by an sc would have to be done with ll as > well, else they may load stale values. One could work this into > atomic_read(), but surely there are other places that are problems. I think it's actually ok, atomics have arch implemented accessors, as do spinlocks and atomic bitops. Those are the only place we do sc so we can make sure we always ll or invalidate manually. > It would be OK if the caches on the hardware in question were to > back-invalidate matching cachelines when the sc is snooped from the bus, > but it sounds like this doesn't happen? Yes it's possible to manually invalidate the line but it is not automatic. Manual invalidation is actually quite reasonable in many cases because you never see a bad value, just a potentially stale one, so many of the races are harmless in practice. I think you're correct in your comments re multi-processor cache coherence and the performance problems associated with not doing ll/sc in the cache. I believe some of the reasoning behind the current implementation is to allow different processors in the same SoC to participate in the atomic store protocol without having a fully coherent cache (and implementing a full cache coherence protocol). It's my understanding that the ll/sc is implemented somewhere beyond the cache in the bus fabric. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-16 10:15 ` Will Newton @ 2011-02-16 12:18 ` Steven Rostedt 2011-02-16 12:41 ` Will Newton 0 siblings, 1 reply; 113+ messages in thread From: Steven Rostedt @ 2011-02-16 12:18 UTC (permalink / raw) To: Will Newton Cc: Will Simoneau, David Miller, hpa, matt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Wed, 2011-02-16 at 10:15 +0000, Will Newton wrote: > > That's some really crippled hardware... it does seem like *any* loads > > from *any* address updated by an sc would have to be done with ll as > > well, else they may load stale values. One could work this into > > atomic_read(), but surely there are other places that are problems. > > I think it's actually ok, atomics have arch implemented accessors, as > do spinlocks and atomic bitops. Those are the only place we do sc so > we can make sure we always ll or invalidate manually. I'm curious, how is cmpxchg() implemented on this architecture? As there are several places in the kernel that uses this on regular variables without any "accessor" functions. -- Steve ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-16 12:18 ` Steven Rostedt @ 2011-02-16 12:41 ` Will Newton 2011-02-16 13:24 ` Mathieu Desnoyers ` (3 more replies) 0 siblings, 4 replies; 113+ messages in thread From: Will Newton @ 2011-02-16 12:41 UTC (permalink / raw) To: Steven Rostedt Cc: Will Simoneau, David Miller, hpa, matt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Wed, Feb 16, 2011 at 12:18 PM, Steven Rostedt <rostedt@goodmis.org> wrote: > On Wed, 2011-02-16 at 10:15 +0000, Will Newton wrote: > >> > That's some really crippled hardware... it does seem like *any* loads >> > from *any* address updated by an sc would have to be done with ll as >> > well, else they may load stale values. One could work this into >> > atomic_read(), but surely there are other places that are problems. >> >> I think it's actually ok, atomics have arch implemented accessors, as >> do spinlocks and atomic bitops. Those are the only place we do sc so >> we can make sure we always ll or invalidate manually. > > I'm curious, how is cmpxchg() implemented on this architecture? As there > are several places in the kernel that uses this on regular variables > without any "accessor" functions. We can invalidate the cache manually. The current cpu will see the new value (post-cache invalidate) and the other cpus will see either the old value or the new value depending on whether they read before or after the invalidate, which is racy but I don't think it is problematic. Unless I'm missing something... ^ permalink raw reply [flat|nested] 113+ messages in thread
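A sketch of what such a cmpxchg() might look like on this hardware. cache_invalidate_line() is a stand-in for whatever manual invalidation primitive the port actually provides, and the gap between the sc and the invalidate is precisely the window questioned in the reply that follows:

	static inline u32 cmpxchg_u32(volatile u32 *p, u32 old, u32 new)
	{
		u32 prev;

		do {
			prev = load_linked(p);		/* ll: bypasses the L1 */
			if (prev != old)
				break;
		} while (!store_conditional(p, new));	/* sc: bypasses the L1 */

		/*
		 * Drop the now-stale cached copies so that later plain loads
		 * observe the value just written by the sc.
		 */
		cache_invalidate_line(p);

		return prev;
	}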
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-16 12:41                 ` Will Newton
@ 2011-02-16 13:24                   ` Mathieu Desnoyers
  2011-02-16 22:51                   ` Will Simoneau
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 113+ messages in thread
From: Mathieu Desnoyers @ 2011-02-16 13:24 UTC (permalink / raw)
  To: Will Newton
  Cc: Steven Rostedt, Will Simoneau, David Miller, hpa, matt, peterz,
	jbaron, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt,
	fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier,
	cmetcalf, dhowells, schwidefsky, heiko.carstens, benh

* Will Newton (will.newton@gmail.com) wrote:
> On Wed, Feb 16, 2011 at 12:18 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > On Wed, 2011-02-16 at 10:15 +0000, Will Newton wrote:
> >
> >> > That's some really crippled hardware... it does seem like *any* loads
> >> > from *any* address updated by an sc would have to be done with ll as
> >> > well, else they may load stale values. One could work this into
> >> > atomic_read(), but surely there are other places that are problems.
> >>
> >> I think it's actually ok, atomics have arch implemented accessors, as
> >> do spinlocks and atomic bitops. Those are the only place we do sc so
> >> we can make sure we always ll or invalidate manually.
> >
> > I'm curious, how is cmpxchg() implemented on this architecture? As there
> > are several places in the kernel that uses this on regular variables
> > without any "accessor" functions.
>
> We can invalidate the cache manually. The current cpu will see the new
> value (post-cache invalidate) and the other cpus will see either the
> old value or the new value depending on whether they read before or
> after the invalidate, which is racy but I don't think it is
> problematic. Unless I'm missing something...

Assuming the invalidate is specific to a cache-line, I'm concerned about
the failure of a scenario like the following:

initially:
  foo = 0
  bar = 0

CPU A                          CPU B

xchg(&foo, 1);
  ll foo
  sc foo

  -> interrupt

     if (foo == 1)
           xchg(&bar, 1);
             ll bar
             sc bar
             invalidate bar

                               lbar = bar;
                               smp_mb()
                               lfoo = foo;
                               BUG_ON(lbar == 1 && lfoo == 0);
  invalidate foo

It should be valid to expect that every time "bar" as read by CPU B
is 1, then "foo" is also 1. However, in this case, the lack of
invalidate on foo is keeping the cacheline from reaching CPU B. There
seems to be a problem with interrupts/NMIs coming right between sc and
invalidate, as Ingo pointed out.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply [flat|nested] 113+ messages in thread
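Mathieu's interleaving, restated as plain C for readability
(illustrative only; cpu_a_irq_handler() is just a name for the
interrupt that lands between sc and invalidate):

	int foo = 0, bar = 0;

	void cpu_a(void)
	{
		xchg(&foo, 1);	/* ll/sc complete, invalidate not yet issued */
		/* -> interrupt arrives here, runs cpu_a_irq_handler()      */
		/* ... only now does the delayed invalidate of foo happen   */
	}

	void cpu_a_irq_handler(void)	/* between sc foo and invalidate foo */
	{
		if (foo == 1)		/* locally true: this CPU did the sc */
			xchg(&bar, 1);	/* ll/sc plus prompt invalidate of bar */
	}

	void cpu_b(void)
	{
		int lbar, lfoo;

		lbar = bar;	/* can see 1: bar's line was invalidated   */
		smp_mb();	/* orders B's accesses, but cannot force   */
		lfoo = foo;	/* the stale foo line out of B's cache: 0  */
		BUG_ON(lbar == 1 && lfoo == 0);	/* can fire on this hardware */
	}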
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-16 12:41                 ` Will Newton
  2011-02-16 13:24                   ` Mathieu Desnoyers
@ 2011-02-16 22:51                   ` Will Simoneau
  2011-02-17  0:53                     ` Please watch your cc lists Andi Kleen
  2011-02-17 10:55                     ` [PATCH 0/2] jump label: 2.6.38 updates Will Newton
      [not found] ` <BLU0-SMTP80F56386E7E060A3B2020B96D20@phx.gbl>
      [not found] ` <BLU0-SMTP71BCB155CBAE79997EE08D96D20@phx.gbl>
  3 siblings, 2 replies; 113+ messages in thread
From: Will Simoneau @ 2011-02-16 22:51 UTC (permalink / raw)
  To: Will Newton
  Cc: Steven Rostedt, David Miller, hpa, matt, peterz, jbaron,
	mathieu.desnoyers, mingo, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael,
	linux-kernel, vapier, cmetcalf, dhowells, schwidefsky,
	heiko.carstens, benh

On 12:41 Wed 16 Feb , Will Newton wrote:
> On Wed, Feb 16, 2011 at 12:18 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > I'm curious, how is cmpxchg() implemented on this architecture? As there
> > are several places in the kernel that uses this on regular variables
> > without any "accessor" functions.
>
> We can invalidate the cache manually. The current cpu will see the new
> value (post-cache invalidate) and the other cpus will see either the
> old value or the new value depending on whether they read before or
> after the invalidate, which is racy but I don't think it is
> problematic. Unless I'm missing something...

If I understand this correctly, the manual invalidates must propagate to
all CPUs that potentially read the value, even if there is no
contention. Doesn't this involve IPIs? How does it not suck?

^ permalink raw reply [flat|nested] 113+ messages in thread
* Please watch your cc lists
  2011-02-16 22:51                   ` Will Simoneau
@ 2011-02-17  0:53                     ` Andi Kleen
  2011-02-17  0:56                       ` David Miller
  1 sibling, 1 reply; 113+ messages in thread
From: Andi Kleen @ 2011-02-17  0:53 UTC (permalink / raw)
  To: Will Simoneau
  Cc: Will Newton, Steven Rostedt, David Miller, hpa, matt, peterz,
	jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael,
	linux-kernel, vapier, cmetcalf, dhowells, schwidefsky,
	heiko.carstens, benh

Folks, if you want to invent new masochistic programming models
like this, please do it in a thread of its own with a reduced cc list.

Thank you,

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: Please watch your cc lists
  2011-02-17  0:53                     ` Please watch your cc lists Andi Kleen
@ 2011-02-17  0:56                       ` David Miller
  2011-02-17  1:04                         ` Michael Witten
  0 siblings, 1 reply; 113+ messages in thread
From: David Miller @ 2011-02-17  0:56 UTC (permalink / raw)
  To: andi
  Cc: simoneau, will.newton, rostedt, hpa, matt, peterz, jbaron,
	mathieu.desnoyers, mingo, tglx, roland, rth, masami.hiramatsu.pt,
	fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier,
	cmetcalf, dhowells, schwidefsky, heiko.carstens, benh

From: Andi Kleen <andi@firstfloor.org>
Date: Thu, 17 Feb 2011 01:53:00 +0100

>
> Folks, if you want to invent new masochistic programming models
> like this please do it on an own thread with a reduced cc list.

Well, Andi, since you removed the subject nobody has any idea
what thread you are even referring to.

This makes you as much of a bozo as the people you are chiding
right now.

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: Please watch your cc lists 2011-02-17 0:56 ` David Miller @ 2011-02-17 1:04 ` Michael Witten 0 siblings, 0 replies; 113+ messages in thread From: Michael Witten @ 2011-02-17 1:04 UTC (permalink / raw) To: David Miller Cc: andi, simoneau, will.newton, rostedt, hpa, matt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Wed, Feb 16, 2011 at 18:56, David Miller <davem@davemloft.net> wrote: > Well, Andi, since you removed the subject nobody has any idea > what thread you are even referring to. However, all of the necessary information is in the email headers: In-Reply-To: <20110216225151.GA10435@ele.uri.edu> References: <4D59B891.8010300@zytor.com> <AANLkTimp9xKbZdDZpONrxDkfMSAiQre0v=SOsJUnnoWA@mail.gmail.com> <20110215211123.GA3094@ele.uri.edu> <20110215.132702.39199169.davem@davemloft.net> <20110215215604.GA3177@ele.uri.edu> <AANLkTikXy+AJ3tdEkEN--xJPefbXJ4-OVS3cg6R7yXzc@mail.gmail.com> <1297858734.23343.138.camel@gandalf.stny.rr.com> <AANLkTinzr6rb=WwFs7QApsvdy5f7PHZ1qS9ZVrncEzZD@mail.gmail.com> <20110216225151.GA10435@ele.uri.edu> It's a damn shame that our email tools ignore such useful information. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-16 22:51 ` Will Simoneau 2011-02-17 0:53 ` Please watch your cc lists Andi Kleen @ 2011-02-17 10:55 ` Will Newton 1 sibling, 0 replies; 113+ messages in thread From: Will Newton @ 2011-02-17 10:55 UTC (permalink / raw) To: Will Simoneau Cc: Steven Rostedt, David Miller, hpa, matt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Wed, Feb 16, 2011 at 10:51 PM, Will Simoneau <simoneau@ele.uri.edu> wrote: > On 12:41 Wed 16 Feb , Will Newton wrote: >> On Wed, Feb 16, 2011 at 12:18 PM, Steven Rostedt <rostedt@goodmis.org> wrote: >> > I'm curious, how is cmpxchg() implemented on this architecture? As there >> > are several places in the kernel that uses this on regular variables >> > without any "accessor" functions. >> >> We can invalidate the cache manually. The current cpu will see the new >> value (post-cache invalidate) and the other cpus will see either the >> old value or the new value depending on whether they read before or >> after the invalidate, which is racy but I don't think it is >> problematic. Unless I'm missing something... > > If I understand this correctly, the manual invalidates must propagate to > all CPUs that potentially read the value, even if there is no > contention. Doesn't this involve IPIs? How does it not suck? The cache is shared between cores (in that regard it's more like a hyper-threaded core than a true multi-core) so is completely coherent, so this is the one bit that doesn't really suck! Having spoken to our hardware guys I'm confident that we'll only ever build a handful of chip designs with the current way of doing ll/sc and hopefully future cores will do this the "right" way. ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <BLU0-SMTP80F56386E7E060A3B2020B96D20@phx.gbl>]
* Re: [PATCH 0/2] jump label: 2.6.38 updates
      [not found] ` <BLU0-SMTP80F56386E7E060A3B2020B96D20@phx.gbl>
@ 2011-02-17  1:55   ` Masami Hiramatsu
  2011-02-17  3:19     ` H. Peter Anvin
  0 siblings, 1 reply; 113+ messages in thread
From: Masami Hiramatsu @ 2011-02-17  1:55 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Will Newton, Steven Rostedt, Will Simoneau, David Miller, hpa,
	matt, peterz, jbaron, mingo, tglx, andi, roland, rth, fweisbec,
	avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf,
	dhowells, schwidefsky, heiko.carstens, benh, 2nddept-manager

(2011/02/16 22:24), Mathieu Desnoyers wrote:
> * Will Newton (will.newton@gmail.com) wrote:
>> On Wed, Feb 16, 2011 at 12:18 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>>> On Wed, 2011-02-16 at 10:15 +0000, Will Newton wrote:
>>>
>>>>> That's some really crippled hardware... it does seem like *any* loads
>>>>> from *any* address updated by an sc would have to be done with ll as
>>>>> well, else they may load stale values. One could work this into
>>>>> atomic_read(), but surely there are other places that are problems.
>>>>
>>>> I think it's actually ok, atomics have arch implemented accessors, as
>>>> do spinlocks and atomic bitops. Those are the only place we do sc so
>>>> we can make sure we always ll or invalidate manually.
>>>
>>> I'm curious, how is cmpxchg() implemented on this architecture? As there
>>> are several places in the kernel that uses this on regular variables
>>> without any "accessor" functions.
>>
>> We can invalidate the cache manually. The current cpu will see the new
>> value (post-cache invalidate) and the other cpus will see either the
>> old value or the new value depending on whether they read before or
>> after the invalidate, which is racy but I don't think it is
>> problematic. Unless I'm missing something...
>
> Assuming the invalidate is specific to a cache-line, I'm concerned about
> the failure of a scenario like the following:
>
> initially:
>   foo = 0
>   bar = 0
>
> CPU A                          CPU B
>
> xchg(&foo, 1);
>   ll foo
>   sc foo
>
>   -> interrupt
>
>      if (foo == 1)
>            xchg(&bar, 1);
>              ll bar
>              sc bar
>              invalidate bar
>
>                               lbar = bar;
>                               smp_mb()
>                               lfoo = foo;
>                               BUG_ON(lbar == 1 && lfoo == 0);
>   invalidate foo
>
> It should be valid to expect that every time "bar" read by CPU B is 1,
> then "foo" is always worth 1. However, in this case, the lack of
> invalidate on foo is keeping the cacheline from reaching CPU B. There
> seems to be a problem with interrupts/NMIs coming right between sc and
> invalidate, as Ingo pointed out.

Hmm, I think that is mis-coded ll/sc.
If I understand correctly, cache invalidation should usually be done
right before storing the value, as the MSI protocol does.
(Or sc should atomically invalidate the cache line.)

Thank you,

-- 
Masami HIRAMATSU
2nd Dept. Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-17 1:55 ` Masami Hiramatsu @ 2011-02-17 3:19 ` H. Peter Anvin 2011-02-17 16:03 ` Mathieu Desnoyers 0 siblings, 1 reply; 113+ messages in thread From: H. Peter Anvin @ 2011-02-17 3:19 UTC (permalink / raw) To: Masami Hiramatsu Cc: Mathieu Desnoyers, Will Newton, Steven Rostedt, Will Simoneau, David Miller, matt, peterz, jbaron, mingo, tglx, andi, roland, rth, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh, 2nddept-manager On 02/16/2011 05:55 PM, Masami Hiramatsu wrote: > > Hmm, I think that is miss-coding ll/sc. > If I understand correctly, usually cache invalidation should be done > right before storing value, as MSI protocol does. > (or, sc should atomically invalidate the cache line) > I suspect in this case one should flush the cache line before ll (a cache flush will typically invalidate the ll/sc link.) -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-17  3:19     ` H. Peter Anvin
@ 2011-02-17 16:03       ` Mathieu Desnoyers
  0 siblings, 0 replies; 113+ messages in thread
From: Mathieu Desnoyers @ 2011-02-17 16:03 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Masami Hiramatsu, Will Newton, Steven Rostedt, Will Simoneau,
	David Miller, matt, peterz, jbaron, mingo, tglx, roland, rth,
	fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier,
	cmetcalf, dhowells, schwidefsky, heiko.carstens, benh,
	2nddept-manager

* H. Peter Anvin (hpa@zytor.com) wrote:
> On 02/16/2011 05:55 PM, Masami Hiramatsu wrote:
> >
> > Hmm, I think that is miss-coding ll/sc.
> > If I understand correctly, usually cache invalidation should be done
> > right before storing value, as MSI protocol does.
> > (or, sc should atomically invalidate the cache line)
> >
>
> I suspect in this case one should flush the cache line before ll (a
> cache flush will typically invalidate the ll/sc link.)

hrm, but if you have:

  invalidate
  -> interrupt
       read (fetch the invalidated cacheline)
  ll
  sc

you basically end up in a situation similar to not having any
invalidate, no ?

AFAIU, disabling interrupts around the whole ll-sc-invalidate (or
invalidate-ll-sc) seems required for this specific architecture, so
the invalidation is made "atomic" with the ll-sc pair from the point
of view of one hardware thread.

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply [flat|nested] 113+ messages in thread
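What Mathieu suggests might look like the following sketch, again with
hypothetical __ll_sc_xchg() and __invalidate_dcache_line() helpers
(names invented here); note that local_irq_save() still leaves the NMI
hole mentioned above:

	/* Illustrative: make the invalidate "atomic" with the ll/sc pair
	 * from the local hardware thread's point of view by keeping
	 * interrupts off across the whole sequence.  NMIs can still land
	 * in the middle, as noted in the thread. */
	static inline unsigned long xchg_irqsafe(volatile unsigned long *p,
						 unsigned long val)
	{
		unsigned long flags, old;

		local_irq_save(flags);
		old = __ll_sc_xchg(p, val);		/* hypothetical ll/sc loop */
		__invalidate_dcache_line((unsigned long)p);	/* hypothetical */
		local_irq_restore(flags);

		return old;
	}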
[parent not found: <BLU0-SMTP71BCB155CBAE79997EE08D96D20@phx.gbl>]
* Re: [PATCH 0/2] jump label: 2.6.38 updates [not found] ` <BLU0-SMTP71BCB155CBAE79997EE08D96D20@phx.gbl> @ 2011-02-17 3:36 ` Steven Rostedt 2011-02-17 16:13 ` Mathieu Desnoyers [not found] ` <BLU0-SMTP51D40A5B1DACA8883D6AB596D50@phx.gbl> 0 siblings, 2 replies; 113+ messages in thread From: Steven Rostedt @ 2011-02-17 3:36 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Will Newton, Will Simoneau, David Miller, hpa, matt, peterz, jbaron, mingo, tglx, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh [ Removed Andi as I believe this is the mysterious thread he was talking about. Anyone else want to be removed? ] On Wed, 2011-02-16 at 08:24 -0500, Mathieu Desnoyers wrote: > * Will Newton (will.newton@gmail.com) wrote: > initially: > foo = 0 > bar = 0 > > CPU A CPU B > > xchg(&foo, 1); > ll foo > sc foo > > -> interrupt > > if (foo == 1) > xchg(&bar, 1); > ll bar > sc bar > invalidate bar > > lbar = bar; > smp_mb() Question: Does a mb() flush all cache or does it just make sure that read/write operations finish before starting new ones? > lfoo = foo; IOW, will that smp_mb() really make lfoo read the new foo in memory? If foo happens to still be in cache and no coherency has been performed to flush it, would it just simply read foo straight from the cache? -- Steve > BUG_ON(lbar == 1 && lfoo == 0); > invalidate foo > > It should be valid to expect that every time "bar" read by CPU B is 1, > then "foo" is always worth 1. However, in this case, the lack of > invalidate on foo is keeping the cacheline from reaching CPU B. There > seems to be a problem with interrupts/NMIs coming right between sc and > invalidate, as Ingo pointed out. > > Thanks, > > Mathieu > ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-17  3:36       ` Steven Rostedt
@ 2011-02-17 16:13         ` Mathieu Desnoyers
      [not found]           ` <BLU0-SMTP51D40A5B1DACA8883D6AB596D50@phx.gbl>
  1 sibling, 0 replies; 113+ messages in thread
From: Mathieu Desnoyers @ 2011-02-17 16:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Will Newton, Will Simoneau, David Miller, hpa, matt, peterz,
	jbaron, mingo, tglx, roland, rth, masami.hiramatsu.pt, fweisbec,
	avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf,
	dhowells, schwidefsky, heiko.carstens, benh

* Steven Rostedt (rostedt@goodmis.org) wrote:
> [ Removed Andi as I believe this is the mysterious thread he was talking
> about. Anyone else want to be removed? ]
>
> On Wed, 2011-02-16 at 08:24 -0500, Mathieu Desnoyers wrote:
> > * Will Newton (will.newton@gmail.com) wrote:
> >
> > initially:
> >   foo = 0
> >   bar = 0
> >
> > CPU A                          CPU B
> >
> > xchg(&foo, 1);
> >   ll foo
> >   sc foo
> >
> >   -> interrupt
> >
> >      if (foo == 1)
> >            xchg(&bar, 1);
> >              ll bar
> >              sc bar
> >              invalidate bar
> >
> >                               lbar = bar;
> >                               smp_mb()
>
> Question: Does a mb() flush all cache or does it just make sure that
> read/write operations finish before starting new ones?

AFAIK, the Linux kernel memory model semantic only cares about coherent
caches (I'd be interested to learn if I am wrong here). Therefore,
smp_mb() affects ordering of data memory read/writes only, not cache
invalidation -- _however_, it applies only in a memory model where the
underlying accesses are performed on coherent caches.

> > lfoo = foo;
>
> IOW, will that smp_mb() really make lfoo read the new foo in memory? If
> foo happens to still be in cache and no coherency has been performed to
> flush it, would it just simply read foo straight from the cache?

If we were to deploy the Linux kernel on an architecture without
coherent caches, I think smp_mb() should imply a cacheline invalidation,
otherwise we completely mess up the order of data writes vs their
observability from each individual core's POV.

This is what I do in liburcu, actually. I introduced a smp_mc() ("mc"
for "memory commit") macro to specify that cache invalidation is
required on non-cache-coherent archs. smp_mb() implies a smp_mc().
(smp_mc() is therefore weaker than smp_mb(), because the mb implies
ordering of memory operations performed by a given core, while smp_mc
only ensures that the core caches are synchronized with memory.)

Thanks,

Mathieu

>
> -- Steve
>
> > BUG_ON(lbar == 1 && lfoo == 0);
> > invalidate foo
> >
> > It should be valid to expect that every time "bar" read by CPU B is 1,
> > then "foo" is always worth 1. However, in this case, the lack of
> > invalidate on foo is keeping the cacheline from reaching CPU B. There
> > seems to be a problem with interrupts/NMIs coming right between sc and
> > invalidate, as Ingo pointed out.
> >

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply [flat|nested] 113+ messages in thread
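A sketch of the smp_mb()/smp_mc() distinction, loosely after liburcu's
naming; the config symbol and the arch_cache_commit() and
hw_memory_barrier() helpers are invented for illustration:

	/* On a cache-coherent arch, a "memory commit" needs no machine
	 * instruction: the coherence protocol publishes writes, so only
	 * the compiler must be kept from caching values in registers.
	 * A non-coherent arch would need a real writeback/invalidate. */
	#ifdef CONFIG_COHERENT_CACHES			/* invented symbol */
	#define smp_mc()	barrier()		/* compiler barrier only */
	#else
	#define smp_mc()	arch_cache_commit()	/* hypothetical flush+invalidate */
	#endif

	/* The mb is the stronger primitive: it orders this core's
	 * accesses *and* implies the commit. */
	#define my_smp_mb()	do { hw_memory_barrier(); smp_mc(); } while (0)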
[parent not found: <BLU0-SMTP51D40A5B1DACA8883D6AB596D50@phx.gbl>]
* Re: [PATCH 0/2] jump label: 2.6.38 updates [not found] ` <BLU0-SMTP51D40A5B1DACA8883D6AB596D50@phx.gbl> @ 2011-02-17 20:09 ` Steven Rostedt 0 siblings, 0 replies; 113+ messages in thread From: Steven Rostedt @ 2011-02-17 20:09 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Will Newton, Will Simoneau, David Miller, hpa, matt, peterz, jbaron, mingo, tglx, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh On Thu, 2011-02-17 at 11:13 -0500, Mathieu Desnoyers wrote: > > > > > lfoo = foo; > > > > IOW, will that smp_mb() really make lfoo read the new foo in memory? If > > foo happens to still be in cache and no coherency has been performed to > > flush it, would it just simply read foo straight from the cache? > > If we were to deploy the Linux kernel on an architecture without > coherent caches, I think smp_mb() should imply a cacheline invalidation, > otherwise we completely mess up the order of data writes vs their > observability from each invididual core POV. Um but this thread is not about non-coherent caches. It's about a HW that happens to do something stupid with ll/sc. That is, everything deals with the cache except ll/sc which skips it. Although, this was more or less answered in another email. That is, the cache on this HW is not really coherent but all the CPUs just seem to share the same cache. Thus a invalidate of the cache line affects all CPUs which makes my question moot. -- Steve ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-15 21:27 ` David Miller 2011-02-15 21:56 ` Will Simoneau @ 2011-02-15 22:20 ` Benjamin Herrenschmidt 2011-02-16 8:35 ` Ingo Molnar 1 sibling, 1 reply; 113+ messages in thread From: Benjamin Herrenschmidt @ 2011-02-15 22:20 UTC (permalink / raw) To: David Miller Cc: simoneau, will.newton, hpa, matt, rostedt, peterz, jbaron, mathieu.desnoyers, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens On Tue, 2011-02-15 at 13:27 -0800, David Miller wrote: > From: Will Simoneau <simoneau@ele.uri.edu> > Date: Tue, 15 Feb 2011 16:11:23 -0500 > > > Note how the cache and cache coherence protocol are fundamental parts of this > > operation; if these instructions simply bypassed the cache, they *could not* > > work correctly - how do you detect when the underlying memory has been > > modified? > > The issue here isn't L2 cache bypassing, it's local L1 cache bypassing. > > The chips in question aparently do not consult the local L1 cache on > "ll" instructions. > > Therefore you must only ever access such atomic data using "ll" > instructions. Note that it's actually a reasonable design choice to not consult the L1 in these case .... as long as you invalidate it on the way. That's how current powerpcs do it afaik, they send a kill to any matching L1 line along as reading from the L2. (Of course, L1 has to be write-through for that to work). Cheers, Ben. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-15 22:20         ` Benjamin Herrenschmidt
@ 2011-02-16  8:35           ` Ingo Molnar
  2011-02-17  1:04             ` H. Peter Anvin
  0 siblings, 1 reply; 113+ messages in thread
From: Ingo Molnar @ 2011-02-16  8:35 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: David Miller, simoneau, will.newton, hpa, matt, rostedt, peterz,
	jbaron, mathieu.desnoyers, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael,
	linux-kernel, vapier, cmetcalf, dhowells, schwidefsky,
	heiko.carstens

* Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Tue, 2011-02-15 at 13:27 -0800, David Miller wrote:
> > From: Will Simoneau <simoneau@ele.uri.edu>
> > Date: Tue, 15 Feb 2011 16:11:23 -0500
> >
> > > Note how the cache and cache coherence protocol are fundamental parts of this
> > > operation; if these instructions simply bypassed the cache, they *could not*
> > > work correctly - how do you detect when the underlying memory has been
> > > modified?
> >
> > The issue here isn't L2 cache bypassing, it's local L1 cache bypassing.
> >
> > The chips in question aparently do not consult the local L1 cache on
> > "ll" instructions.
> >
> > Therefore you must only ever access such atomic data using "ll"
> > instructions.
>
> Note that it's actually a reasonable design choice to not consult the L1
> in these case .... as long as you invalidate it on the way. That's how
> current powerpcs do it afaik, they send a kill to any matching L1 line
> along as reading from the L2. (Of course, L1 has to be write-through for
> that to work).

Just curious: how does this work if there's an interrupt (or NMI) right
after the invalidate instruction but before the 'll' instruction? The
IRQ/NMI may refill the L1. Or are the two instructions coupled by hw
(they form a single instruction in essence) and irqs/NMIs are inhibited
in between?

Thanks,

	Ingo

^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-16 8:35 ` Ingo Molnar @ 2011-02-17 1:04 ` H. Peter Anvin 2011-02-17 12:51 ` Ingo Molnar 0 siblings, 1 reply; 113+ messages in thread From: H. Peter Anvin @ 2011-02-17 1:04 UTC (permalink / raw) To: Ingo Molnar Cc: Benjamin Herrenschmidt, David Miller, simoneau, will.newton, matt, rostedt, peterz, jbaron, mathieu.desnoyers, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens On 02/16/2011 12:35 AM, Ingo Molnar wrote: > > Just curious: how does this work if there's an interrupt (or NMI) right after the > invalidate instruction but before the 'll' instruction? The IRQ/NMI may refill the > L1. Or are the two instructions coupled by hw (they form a single instruction in > essence) and irqs/NMIs are inhibited inbetween? > http://en.wikipedia.org/wiki/Load-link/store-conditional -hpa ^ permalink raw reply [flat|nested] 113+ messages in thread
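For reference, the textbook ll/sc retry loop the link describes,
sketched here as MIPS-flavoured inline asm (illustrative, not any
particular arch's real implementation):

	/* Atomically increment v->counter.  sc succeeds only if no other
	 * agent wrote the linked location since the ll; on failure it
	 * stores nothing and leaves 0 in the register, so we retry. */
	static inline void atomic_inc_llsc(atomic_t *v)
	{
		int tmp;

		__asm__ __volatile__(
		"1:	ll	%0, %1		# load-linked		\n"
		"	addiu	%0, %0, 1				\n"
		"	sc	%0, %1		# store-conditional	\n"
		"	beqz	%0, 1b		# 0 => lost the link	\n"
		: "=&r" (tmp), "=m" (v->counter)
		: "m" (v->counter)
		: "memory");
	}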
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-17 1:04 ` H. Peter Anvin @ 2011-02-17 12:51 ` Ingo Molnar 0 siblings, 0 replies; 113+ messages in thread From: Ingo Molnar @ 2011-02-17 12:51 UTC (permalink / raw) To: H. Peter Anvin Cc: Benjamin Herrenschmidt, David Miller, simoneau, will.newton, matt, rostedt, peterz, jbaron, mathieu.desnoyers, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens * H. Peter Anvin <hpa@zytor.com> wrote: > On 02/16/2011 12:35 AM, Ingo Molnar wrote: > > > > Just curious: how does this work if there's an interrupt (or NMI) right after the > > invalidate instruction but before the 'll' instruction? The IRQ/NMI may refill the > > L1. Or are the two instructions coupled by hw (they form a single instruction in > > essence) and irqs/NMIs are inhibited inbetween? > > > > http://en.wikipedia.org/wiki/Load-link/store-conditional Oh, ll/sc, that indeed clicks - i even wrote such assembly code many years ago ;-) Thanks, Ingo ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <BLU0-SMTP637B2E9372CFBF3A0B5B0996D00@phx.gbl>]
* Re: [PATCH 0/2] jump label: 2.6.38 updates [not found] ` <BLU0-SMTP637B2E9372CFBF3A0B5B0996D00@phx.gbl> @ 2011-02-14 23:25 ` David Miller 2011-02-14 23:34 ` Mathieu Desnoyers [not found] ` <20110214233405.GC17432@Krystal> 0 siblings, 2 replies; 113+ messages in thread From: David Miller @ 2011-02-14 23:25 UTC (permalink / raw) To: mathieu.desnoyers Cc: matt, rostedt, peterz, will.newton, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh, paulmck From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Date: Mon, 14 Feb 2011 18:03:01 -0500 > If this is true, then we have bugs in lots of xchg/cmpxchg users (which > do not reside in atomic.h), e.g.: > > fs/fs_struct.c: > int current_umask(void) > { > return current->fs->umask; > } > EXPORT_SYMBOL(current_umask); > > kernel/sys.c: > SYSCALL_DEFINE1(umask, int, mask) > { > mask = xchg(¤t->fs->umask, mask & S_IRWXUGO); > return mask; > } > > The solution to this would be to force all xchg/cmpxchg users to swap to > atomic.h variables, which would force the ll semantic on read. But I'd > really like to see where this is documented first -- or which PowerPC > engineer we should talk to. We can't wholesale to atomic_t because we do this on variables of all sizes, not just 32-bit ones. We do them on pointers in the networking for example. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 23:25 ` David Miller @ 2011-02-14 23:34 ` Mathieu Desnoyers [not found] ` <20110214233405.GC17432@Krystal> 1 sibling, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-14 23:34 UTC (permalink / raw) To: David Miller Cc: matt, rostedt, peterz, will.newton, jbaron, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam, ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells, schwidefsky, heiko.carstens, benh, paulmck * David Miller (davem@davemloft.net) wrote: > From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> > Date: Mon, 14 Feb 2011 18:03:01 -0500 > > > If this is true, then we have bugs in lots of xchg/cmpxchg users (which > > do not reside in atomic.h), e.g.: > > > > fs/fs_struct.c: > > int current_umask(void) > > { > > return current->fs->umask; > > } > > EXPORT_SYMBOL(current_umask); > > > > kernel/sys.c: > > SYSCALL_DEFINE1(umask, int, mask) > > { > > mask = xchg(¤t->fs->umask, mask & S_IRWXUGO); > > return mask; > > } > > > > The solution to this would be to force all xchg/cmpxchg users to swap to > > atomic.h variables, which would force the ll semantic on read. But I'd > > really like to see where this is documented first -- or which PowerPC > > engineer we should talk to. > > We can't wholesale to atomic_t because we do this on variables of > all sizes, not just 32-bit ones. > > We do them on pointers in the networking for example. We have atomic_long_t for this, but yeah, it would kind of suck to have to create union { atomic_long_t atomic; void *ptr; } all around the place. Let's see if we can get to know which PowerPC processor family all this fuss is about, and where this rumour originates from. Thanks, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
[parent not found: <20110214233405.GC17432@Krystal>]
* Re: [PATCH 0/2] jump label: 2.6.38 updates
      [not found] ` <20110214233405.GC17432@Krystal>
@ 2011-02-14 23:52   ` Mathieu Desnoyers
  0 siblings, 0 replies; 113+ messages in thread
From: Mathieu Desnoyers @ 2011-02-14 23:52 UTC (permalink / raw)
  To: David Miller
  Cc: matt, rostedt, peterz, will.newton, jbaron, hpa, mingo, tglx,
	andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, sam,
	ddaney, michael, linux-kernel, vapier, cmetcalf, dhowells,
	schwidefsky, heiko.carstens, benh, paulmck

* Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca) wrote:
> * David Miller (davem@davemloft.net) wrote:
[...]
> > We can't wholesale to atomic_t because we do this on variables of
> > all sizes, not just 32-bit ones.
> >
> > We do them on pointers in the networking for example.
>
> We have atomic_long_t for this, but yeah, it would kind of suck to have
> to create
>
> union {
> 	atomic_long_t atomic;
> 	void *ptr;
> }

Actually, using a union for this is probably one of the worst ideas
I've had recently. Just casting the pointer to unsigned long and
vice-versa, using atomic_long_*() ops, would do the trick. But let's
wait and see if it's really needed.

Thanks,

Mathieu

>
> all around the place. Let's see if we can get to know which PowerPC
> processor family all this fuss is about, and where this rumour
> originates from.
>
> Thanks,
>
> Mathieu
>
> -- 
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply [flat|nested] 113+ messages in thread
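The cast-based idiom Mathieu prefers over the union, sketched; the
surrounding struct and the helper names are invented for illustration:

	#include <asm/atomic.h>

	struct slot {			/* hypothetical holder of a "pointer" */
		atomic_long_t ptr;
	};

	/* Swap in a new pointer and return the old one; the pointer is
	 * stored as a long so the atomic_long_*() accessors do whatever
	 * the arch needs (ll/sc, invalidates, ...) for us. */
	static inline void *slot_xchg(struct slot *s, void *new)
	{
		return (void *)atomic_long_xchg(&s->ptr, (long)new);
	}

	static inline void *slot_read(struct slot *s)
	{
		return (void *)atomic_long_read(&s->ptr);
	}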
* Re: [PATCH 0/2] jump label: 2.6.38 updates
  2011-02-14 21:39       ` Steven Rostedt
  2011-02-14 21:46         ` David Miller
@ 2011-02-14 22:15         ` Matt Fleming
  1 sibling, 0 replies; 113+ messages in thread
From: Matt Fleming @ 2011-02-14 22:15 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Will Newton, Jason Baron, Mathieu Desnoyers,
	hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt,
	fweisbec, avi, davem, sam, ddaney, michael, linux-kernel,
	Mike Frysinger, Chris Metcalf, dhowells, Martin Schwidefsky,
	heiko.carstens, benh

On Mon, 14 Feb 2011 16:39:36 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 2011-02-14 at 16:29 -0500, Steven Rostedt wrote:
>
> > > while (atomic_read(&foo) != n)
> > > 	cpu_relax();
> > >
> > > and the problem is that cpu_relax() doesn't know which particular
> > > cacheline to flush in order to make things go faster, hm?
> >
> > But what about any global variable? Can't we also just have:
> >
> > while (global != n)
> > 	cpu_relax();
> >
> > ?
>
> Matt Fleming answered this for me on IRC, and I'll share the answer
> here (for those that are dying to know ;)
>
> Seems that the atomic_inc() uses ll/sc operations that do not affect
> the cache. Thus the problem is only with atomic_read() as
>
> while(atomic_read(&foo) != n)
> 	cpu_relax();
>
> Will just check the cache version of foo. But because ll/sc skips the
> cache, the foo will never update. That is, atomic_inc() and friends do
> not touch the cache, and the CPU spinning in this loop will is only
> checking the cache, and will spin forever.

Right. When I wrote the atomic_read() implementation that Will is
talking about, I used the ll-equivalent instruction to bypass the
cache, i.e. I wrote it in assembly because the compiler didn't emit
that instruction. And that is what it boils down to really: the ll/sc
instructions are different from any other instructions in the ISA, as
they bypass the cache and are not emitted by the compiler.

So, in order to maintain coherence with other cpus doing atomic
updates on memory addresses, or rather to avoid reading stale values,
it's necessary to use the ll instruction - and this isn't possible
from C.

> Thus it is not about global, as global is updated by normal means and
> will update the caches. atomic_t is updated via the ll/sc that ignores
> the cache and causes all this to break down. IOW... broken hardware ;)

Well, to be precise it's about read-modify-write operations - the
architecture does maintain cache coherence in that writes from one CPU
are immediately visible to other CPUs. FYI spinlocks are also
implemented with ll/sc instructions.

^ permalink raw reply [flat|nested] 113+ messages in thread
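A sketch of the kind of atomic_read() Matt describes, again in
MIPS-flavoured inline asm purely for illustration (the real port is not
shown in the thread):

	/* Force the read through a load-linked so it bypasses the possibly
	 * stale L1, instead of letting the compiler emit a plain load.  The
	 * dangling link left by an ll without a matching sc is harmless. */
	static inline int atomic_read_ll(const atomic_t *v)
	{
		int val;

		__asm__ __volatile__(
		"	ll	%0, %1		# bypasses the L1	\n"
		: "=&r" (val)
		: "m" (v->counter)
		: "memory");

		return val;
	}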
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 17:27 ` Peter Zijlstra 2011-02-14 17:29 ` Mike Frysinger 2011-02-14 17:38 ` Will Newton @ 2011-02-15 15:20 ` Heiko Carstens 2 siblings, 0 replies; 113+ messages in thread From: Heiko Carstens @ 2011-02-15 15:20 UTC (permalink / raw) To: Peter Zijlstra Cc: Steven Rostedt, Jason Baron, Mathieu Desnoyers, hpa, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel, Mike Frysinger, Chris Metcalf, dhowells, Martin Schwidefsky, benh On Mon, Feb 14, 2011 at 06:27:27PM +0100, Peter Zijlstra wrote: > On Mon, 2011-02-14 at 12:18 -0500, Steven Rostedt wrote: > > > mn10300: > > #define atomic_read(v) ((v)->counter) > > > tile: > > static inline int atomic_read(const atomic_t *v) > > { > > return v->counter; > > } > > Yeah, I already send email to the respective maintainers telling them > they might want to fix this ;-) > > > > So all but a few have basically (as you said on IRC) > > #define atomic_read(v) ACCESS_ONCE(v) > > ACCESS_ONCE(v->counter), but yeah :-) > > > Those few are blackfin, s390, powerpc and tile. > > > > s390 probably doesn't need that much of a big hammer with atomic_read() > > (unless it uses it in its own arch that expects it to be such). > > Right, it could just do the volatile thing.. The reason that the code on s390 looks like it is was that the volatile cast was known to generate really bad code. However I just tried a few variants (inline asm / ACCESS_ONCE / leave as is) with gcc 4.5.2 and the resulting code was always identical. So I'm going to change it to the ACCESS_ONCE variant so it's the same like everywhere else. ^ permalink raw reply [flat|nested] 113+ messages in thread
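The variant Heiko settles on presumably reads like the generic one,
sketched here (this is not the actual s390 commit):

	/* ACCESS_ONCE() forces a single volatile load, so the compiler can
	 * neither cache the value in a register nor re-read it; per the
	 * mail above, gcc 4.5.2 generates the same code for this as for
	 * the old open-coded variants. */
	static inline int atomic_read(const atomic_t *v)
	{
		return ACCESS_ONCE(v->counter);
	}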
[parent not found: <BLU0-SMTP64371A838030ED92A7CCB696D00@phx.gbl>]
* Re: [PATCH 0/2] jump label: 2.6.38 updates
      [not found] ` <BLU0-SMTP64371A838030ED92A7CCB696D00@phx.gbl>
@ 2011-02-14 18:54   ` Jason Baron
  2011-02-14 19:20     ` Peter Zijlstra
  0 siblings, 1 reply; 113+ messages in thread
From: Jason Baron @ 2011-02-14 18:54 UTC (permalink / raw)
  To: Mathieu Desnoyers, peterz
  Cc: hpa, rostedt, mingo, tglx, andi, roland, rth,
	masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael,
	linux-kernel

On Mon, Feb 14, 2011 at 11:43:43AM -0500, Mathieu Desnoyers wrote:
> * Peter Zijlstra (peterz@infradead.org) wrote:
> > On Mon, 2011-02-14 at 11:29 -0500, Jason Baron wrote:
> > > On Mon, Feb 14, 2011 at 05:25:54PM +0100, Peter Zijlstra wrote:
> > > >
> > > > > I remember that atomic_t is defined in types.h now rather than atomic.h.
> > > > > Any reason why you should keep including atomic.h from jump_label.h ?
> > > >
> > > > Ooh, shiny.. we could probably move the few atomic_{read,inc,dec} users
> > > > in jump_label.h into out of line functions and have this sorted.
> > > >
> > >
> > > inc and dec sure, but atomic_read() for the disabled case needs to be
> > > inline....
> >
> > D'0h yes of course, I was thinking about jump_label_enabled(), but
> > there's still the static_branch() implementation to consider.
> >
> > We could of course cheat implement our own version of atomic_read() in
> > order to avoid the whole header mess, but that's not pretty at all
> >
>
> OK, so the other way around then : why does kernel.h need to include
> dynamic_debug.h (which includes jump_label.h) ?
>

well, it's used to dynamically enable/disable pr_debug() statements,
which have actually now moved to linux/printk.h, which is included by
kernel.h.

I don't need an atomic_read() in the disabled case for dynamic debug,
and I would be ok with an #ifdef CONFIG_JUMP_LABEL in dynamic_debug.h.
It's not the prettiest solution. But I can certainly live with it for
now, so that we can sort out the atomic_read() issue independently.

Peter, Mathieu, are you guys ok with this?

-Jason

^ permalink raw reply [flat|nested] 113+ messages in thread
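One plausible shape for the stopgap Jason proposes, sketched here (the
thread does not show the actual hunk; only the flags/enabled tail of
struct _ddebug is spelled out, per the patch in this series):

	/* dynamic_debug.h, sketched: only embed the jump-label key when
	 * jump labels are configured in; otherwise fall back to the old
	 * plain flag, so this header needn't drag in atomic.h just for
	 * the disabled-case atomic_read(). */
	struct _ddebug {
		/* ... unchanged fields (modname, function, filename, ...) ... */
		unsigned int flags:8;
	#ifdef CONFIG_JUMP_LABEL
		struct jump_label_key enabled;
	#else
		char enabled;
	#endif
	} __attribute__((aligned(8)));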
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 18:54 ` Jason Baron @ 2011-02-14 19:20 ` Peter Zijlstra 2011-02-14 19:48 ` Mathieu Desnoyers 0 siblings, 1 reply; 113+ messages in thread From: Peter Zijlstra @ 2011-02-14 19:20 UTC (permalink / raw) To: Jason Baron Cc: Mathieu Desnoyers, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel On Mon, 2011-02-14 at 13:54 -0500, Jason Baron wrote: > I don't need an atomic_read() in the disabled case for dynamic debug, > and I would be ok, #ifdef CONFIG_JUMP_LABEL, in dynamic_debug.h. Its not > the prettiest solution. But I can certainly live with it for now, so > that we can sort out the atomic_read() issue independently. > > Peter, Mathieu, are you guys ok with this? Yeah, lets see where that gets us. ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-14 19:20 ` Peter Zijlstra @ 2011-02-14 19:48 ` Mathieu Desnoyers 0 siblings, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-14 19:48 UTC (permalink / raw) To: Peter Zijlstra Cc: Jason Baron, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel * Peter Zijlstra (peterz@infradead.org) wrote: > On Mon, 2011-02-14 at 13:54 -0500, Jason Baron wrote: > > > I don't need an atomic_read() in the disabled case for dynamic debug, > > and I would be ok, #ifdef CONFIG_JUMP_LABEL, in dynamic_debug.h. Its not > > the prettiest solution. But I can certainly live with it for now, so > > that we can sort out the atomic_read() issue independently. > > > > Peter, Mathieu, are you guys ok with this? > > Yeah, lets see where that gets us. > Works for me as long as you put a nice witty comment around this disgusting hack (something related to "include Hell", where bad programmers go after they die). ;-) Thanks, Mathieu -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 0/2] jump label: 2.6.38 updates 2011-02-12 18:47 ` Peter Zijlstra 2011-02-14 12:27 ` Ingo Molnar 2011-02-14 15:51 ` Jason Baron @ 2011-02-14 16:11 ` Mathieu Desnoyers 2 siblings, 0 replies; 113+ messages in thread From: Mathieu Desnoyers @ 2011-02-14 16:11 UTC (permalink / raw) To: Peter Zijlstra Cc: Jason Baron, hpa, rostedt, mingo, tglx, andi, roland, rth, masami.hiramatsu.pt, fweisbec, avi, davem, sam, ddaney, michael, linux-kernel * Peter Zijlstra (peterz@infradead.org) wrote: > On Fri, 2011-02-11 at 22:38 +0100, Peter Zijlstra wrote: > > > > So why can't we make that jump_label_entry::refcount and > > jump_label_key::state an atomic_t and be done with it? > > So I had a bit of a poke at this because I didn't quite understand why > all that stuff was as it was. I applied both Jason's patches and then > basically rewrote kernel/jump_label.c just for kicks ;-) > > I haven't tried compiling this, let alone running it, but provided I > didn't actually forget anything the storage per key is now 16 bytes when > modules are disabled and 24 * (1 + mods) bytes for when they are > enabled. The old code had 64 + 40 * mods bytes. > > I still need to clean up the static_branch_else bits and look at !x86 > aside from the already mentioned bits.. but what do people think? Hi Peter, It looks like a huge step in the right direction. I'm sure that once Jason and you finish ironing out the details, this will be a huge improvement in terms of shrinking code and API complexity. Thanks, Mathieu > > --- > arch/sparc/include/asm/jump_label.h | 25 +- > arch/x86/include/asm/jump_label.h | 22 +- > arch/x86/kernel/jump_label.c | 2 +- > arch/x86/kernel/module.c | 3 - > include/linux/dynamic_debug.h | 10 +- > include/linux/jump_label.h | 71 +++--- > include/linux/jump_label_ref.h | 36 +-- > include/linux/module.h | 1 + > include/linux/perf_event.h | 28 +- > include/linux/tracepoint.h | 8 +- > kernel/jump_label.c | 516 +++++++++++++---------------------- > kernel/module.c | 7 + > kernel/perf_event.c | 30 ++- > kernel/timer.c | 8 +- > kernel/tracepoint.c | 22 +- > 15 files changed, 333 insertions(+), 456 deletions(-) > > diff --git a/arch/sparc/include/asm/jump_label.h b/arch/sparc/include/asm/jump_label.h > index 427d468..e4ca085 100644 > --- a/arch/sparc/include/asm/jump_label.h > +++ b/arch/sparc/include/asm/jump_label.h > @@ -7,17 +7,20 @@ > > #define JUMP_LABEL_NOP_SIZE 4 > > -#define JUMP_LABEL(key, label) \ > - do { \ > - asm goto("1:\n\t" \ > - "nop\n\t" \ > - "nop\n\t" \ > - ".pushsection __jump_table, \"a\"\n\t"\ > - ".align 4\n\t" \ > - ".word 1b, %l[" #label "], %c0\n\t" \ > - ".popsection \n\t" \ > - : : "i" (key) : : label);\ > - } while (0) > +static __always_inline bool __static_branch(struct jump_label_key *key) > +{ > + asm goto("1:\n\t" > + "nop\n\t" > + "nop\n\t" > + ".pushsection __jump_table, \"a\"\n\t" > + ".align 4\n\t" > + ".word 1b, %l[l_yes], %c0\n\t" > + ".popsection \n\t" > + : : "i" (key) : : l_yes); > + return false; > +l_yes: > + return true; > +} > > #endif /* __KERNEL__ */ > > diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h > index 574dbc2..3d44a7c 100644 > --- a/arch/x86/include/asm/jump_label.h > +++ b/arch/x86/include/asm/jump_label.h > @@ -5,20 +5,24 @@ > > #include <linux/types.h> > #include <asm/nops.h> > +#include <asm/asm.h> > > #define JUMP_LABEL_NOP_SIZE 5 > > # define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t" > > -# define JUMP_LABEL(key, label) \ > - do { \ > - asm goto("1:" \ > - JUMP_LABEL_INITIAL_NOP \ > - 
".pushsection __jump_table, \"aw\" \n\t"\ > - _ASM_PTR "1b, %l[" #label "], %c0 \n\t" \ > - ".popsection \n\t" \ > - : : "i" (key) : : label); \ > - } while (0) > +static __always_inline bool __static_branch(struct jump_label_key *key) > +{ > + asm goto("1:" > + JUMP_LABEL_INITIAL_NOP > + ".pushsection __jump_table, \"a\" \n\t" > + _ASM_PTR "1b, %l[l_yes], %c0 \n\t" > + ".popsection \n\t" > + : : "i" (key) : : l_yes ); > + return false; > +l_yes: > + return true; > +} > > #endif /* __KERNEL__ */ > > diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c > index 961b6b3..dfa4c3c 100644 > --- a/arch/x86/kernel/jump_label.c > +++ b/arch/x86/kernel/jump_label.c > @@ -4,13 +4,13 @@ > * Copyright (C) 2009 Jason Baron <jbaron@redhat.com> > * > */ > -#include <linux/jump_label.h> > #include <linux/memory.h> > #include <linux/uaccess.h> > #include <linux/module.h> > #include <linux/list.h> > #include <linux/jhash.h> > #include <linux/cpu.h> > +#include <linux/jump_label.h> > #include <asm/kprobes.h> > #include <asm/alternative.h> > > diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c > index ab23f1a..0e6b823 100644 > --- a/arch/x86/kernel/module.c > +++ b/arch/x86/kernel/module.c > @@ -230,9 +230,6 @@ int module_finalize(const Elf_Ehdr *hdr, > apply_paravirt(pseg, pseg + para->sh_size); > } > > - /* make jump label nops */ > - jump_label_apply_nops(me); > - > return 0; > } > > diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h > index 1c70028..2ade291 100644 > --- a/include/linux/dynamic_debug.h > +++ b/include/linux/dynamic_debug.h > @@ -33,7 +33,7 @@ struct _ddebug { > #define _DPRINTK_FLAGS_PRINT (1<<0) /* printk() a message using the format */ > #define _DPRINTK_FLAGS_DEFAULT 0 > unsigned int flags:8; > - char enabled; > + struct jump_label_key enabled; > } __attribute__((aligned(8))); > > > @@ -48,8 +48,8 @@ extern int ddebug_remove_module(const char *mod_name); > __used \ > __attribute__((section("__verbose"), aligned(8))) = \ > { KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \ > - _DPRINTK_FLAGS_DEFAULT }; \ > - if (unlikely(descriptor.enabled)) \ > + _DPRINTK_FLAGS_DEFAULT, JUMP_LABEL_INIT }; \ > + if (static_branch(&descriptor.enabled)) \ > printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \ > } while (0) > > @@ -59,8 +59,8 @@ extern int ddebug_remove_module(const char *mod_name); > __used \ > __attribute__((section("__verbose"), aligned(8))) = \ > { KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \ > - _DPRINTK_FLAGS_DEFAULT }; \ > - if (unlikely(descriptor.enabled)) \ > + _DPRINTK_FLAGS_DEFAULT, JUMP_LABEL_INIT }; \ > + if (static_branch(&descriptor.enabled)) \ > dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); \ > } while (0) > > diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h > index 7880f18..a1cec0a 100644 > --- a/include/linux/jump_label.h > +++ b/include/linux/jump_label.h > @@ -2,19 +2,35 @@ > #define _LINUX_JUMP_LABEL_H > > #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) > + > +struct jump_label_key { > + atomic_t enabled; > + struct jump_entry *entries; > +#ifdef CONFIG_MODULES > + struct jump_module *next; > +#endif > +}; > + > # include <asm/jump_label.h> > # define HAVE_JUMP_LABEL > #endif > > enum jump_label_type { > + JUMP_LABEL_DISABLE = 0, > JUMP_LABEL_ENABLE, > - JUMP_LABEL_DISABLE > }; > > struct module; > > +#define JUMP_LABEL_INIT { 0 } > + > #ifdef HAVE_JUMP_LABEL > > +static __always_inline bool static_branch(struct jump_label_key *key) > +{ > + return __static_branch(key); 
> +} > + > extern struct jump_entry __start___jump_table[]; > extern struct jump_entry __stop___jump_table[]; > > @@ -23,37 +39,31 @@ extern void jump_label_unlock(void); > extern void arch_jump_label_transform(struct jump_entry *entry, > enum jump_label_type type); > extern void arch_jump_label_text_poke_early(jump_label_t addr); > -extern void jump_label_update(unsigned long key, enum jump_label_type type); > -extern void jump_label_apply_nops(struct module *mod); > extern int jump_label_text_reserved(void *start, void *end); > - > -#define jump_label_enable(key) \ > - jump_label_update((unsigned long)key, JUMP_LABEL_ENABLE); > - > -#define jump_label_disable(key) \ > - jump_label_update((unsigned long)key, JUMP_LABEL_DISABLE); > +extern void jump_label_enable(struct jump_label_key *key); > +extern void jump_label_disable(struct jump_label_key *key); > > #else > > -#define JUMP_LABEL(key, label) \ > -do { \ > - if (unlikely(*key)) \ > - goto label; \ > -} while (0) > +struct jump_label_key { > + atomic_t enabled; > +}; > > -#define jump_label_enable(cond_var) \ > -do { \ > - *(cond_var) = 1; \ > -} while (0) > +static __always_inline bool static_branch(struct jump_label_key *key) > +{ > + if (unlikely(atomic_read(&key->state))) > + return true; > + return false; > +} > > -#define jump_label_disable(cond_var) \ > -do { \ > - *(cond_var) = 0; \ > -} while (0) > +static inline void jump_label_enable(struct jump_label_key *key) > +{ > + atomic_inc(&key->state); > +} > > -static inline int jump_label_apply_nops(struct module *mod) > +static inline void jump_label_disable(struct jump_label_key *key) > { > - return 0; > + atomic_dec(&key->state); > } > > static inline int jump_label_text_reserved(void *start, void *end) > @@ -66,14 +76,9 @@ static inline void jump_label_unlock(void) {} > > #endif > > -#define COND_STMT(key, stmt) \ > -do { \ > - __label__ jl_enabled; \ > - JUMP_LABEL(key, jl_enabled); \ > - if (0) { \ > -jl_enabled: \ > - stmt; \ > - } \ > -} while (0) > +static inline bool jump_label_enabled(struct jump_label_key *key) > +{ > + return !!atomic_read(&key->state); > +} > > #endif > diff --git a/include/linux/jump_label_ref.h b/include/linux/jump_label_ref.h > index e5d012a..5178696 100644 > --- a/include/linux/jump_label_ref.h > +++ b/include/linux/jump_label_ref.h > @@ -4,41 +4,27 @@ > #include <linux/jump_label.h> > #include <asm/atomic.h> > > -#ifdef HAVE_JUMP_LABEL > +struct jump_label_key_counter { > + atomic_t ref; > + struct jump_label_key key; > +}; > > -static inline void jump_label_inc(atomic_t *key) > -{ > - if (atomic_add_return(1, key) == 1) > - jump_label_enable(key); > -} > +#ifdef HAVE_JUMP_LABEL > > -static inline void jump_label_dec(atomic_t *key) > +static __always_inline bool static_branch_else_atomic_read(struct jump_label_key *key, atomic_t *count) > { > - if (atomic_dec_and_test(key)) > - jump_label_disable(key); > + return __static_branch(key); > } > > #else /* !HAVE_JUMP_LABEL */ > > -static inline void jump_label_inc(atomic_t *key) > +static __always_inline bool static_branch_else_atomic_read(struct jump_label_key *key, atomic_t *count) > { > - atomic_inc(key); > + if (unlikely(atomic_read(count))) > + return true; > + return false; > } > > -static inline void jump_label_dec(atomic_t *key) > -{ > - atomic_dec(key); > -} > - > -#undef JUMP_LABEL > -#define JUMP_LABEL(key, label) \ > -do { \ > - if (unlikely(__builtin_choose_expr( \ > - __builtin_types_compatible_p(typeof(key), atomic_t *), \ > - atomic_read((atomic_t *)(key)), *(key)))) \ > - goto label; 
\ > -} while (0) > - > #endif /* HAVE_JUMP_LABEL */ > > #endif /* _LINUX_JUMP_LABEL_REF_H */ > diff --git a/include/linux/module.h b/include/linux/module.h > index 9bdf27c..eeb3e99 100644 > --- a/include/linux/module.h > +++ b/include/linux/module.h > @@ -266,6 +266,7 @@ enum module_state > MODULE_STATE_LIVE, > MODULE_STATE_COMING, > MODULE_STATE_GOING, > + MODULE_STATE_POST_RELOCATE, > }; > > struct module > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h > index dda5b0a..26fe115 100644 > --- a/include/linux/perf_event.h > +++ b/include/linux/perf_event.h > @@ -1000,7 +1000,7 @@ static inline int is_software_event(struct perf_event *event) > return event->pmu->task_ctx_nr == perf_sw_context; > } > > -extern atomic_t perf_swevent_enabled[PERF_COUNT_SW_MAX]; > +extern struct jump_label_key_counter perf_swevent_enabled[PERF_COUNT_SW_MAX]; > > extern void __perf_sw_event(u32, u64, int, struct pt_regs *, u64); > > @@ -1029,30 +1029,32 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr) > { > struct pt_regs hot_regs; > > - JUMP_LABEL(&perf_swevent_enabled[event_id], have_event); > - return; > - > -have_event: > - if (!regs) { > - perf_fetch_caller_regs(&hot_regs); > - regs = &hot_regs; > + if (static_branch_else_atomic_read(&perf_swevent_enabled[event_id].key, > + &perf_swevent_enabled[event_id].ref)) { > + if (!regs) { > + perf_fetch_caller_regs(&hot_regs); > + regs = &hot_regs; > + } > + __perf_sw_event(event_id, nr, nmi, regs, addr); > } > - __perf_sw_event(event_id, nr, nmi, regs, addr); > } > > -extern atomic_t perf_task_events; > +extern struct jump_label_key_counter perf_task_events; > > static inline void perf_event_task_sched_in(struct task_struct *task) > { > - COND_STMT(&perf_task_events, __perf_event_task_sched_in(task)); > + if (static_branch_else_atomic_read(&perf_task_events.key, > + &perf_task_events.ref)) > + __perf_event_task_sched_in(task); > } > > static inline > void perf_event_task_sched_out(struct task_struct *task, struct task_struct *next) > { > perf_sw_event(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, 1, NULL, 0); > - > - COND_STMT(&perf_task_events, __perf_event_task_sched_out(task, next)); > + if (static_branch_else_atomic_read(&perf_task_events.key, > + &perf_task_events.ref)) > + __perf_event_task_sched_out(task, next); > } > > extern void perf_event_mmap(struct vm_area_struct *vma); > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h > index 97c84a5..6c8c747 100644 > --- a/include/linux/tracepoint.h > +++ b/include/linux/tracepoint.h > @@ -29,7 +29,7 @@ struct tracepoint_func { > > struct tracepoint { > const char *name; /* Tracepoint name */ > - int state; /* State. 
 */
> +	struct jump_label_key key;
> 	void (*regfunc)(void);
> 	void (*unregfunc)(void);
> 	struct tracepoint_func __rcu *funcs;
> @@ -146,9 +146,7 @@ void tracepoint_update_probe_range(struct tracepoint * const *begin,
> 	extern struct tracepoint __tracepoint_##name;			\
> 	static inline void trace_##name(proto)				\
> 	{								\
> -		JUMP_LABEL(&__tracepoint_##name.state, do_trace);	\
> -		return;							\
> -do_trace:								\
> +		if (static_branch(&__tracepoint_##name.key))		\
> 			__DO_TRACE(&__tracepoint_##name,		\
> 				TP_PROTO(data_proto),			\
> 				TP_ARGS(data_args),			\
> @@ -181,7 +179,7 @@ do_trace:						\
> 	__attribute__((section("__tracepoints_strings"))) = #name;	\
> 	struct tracepoint __tracepoint_##name				\
> 	__attribute__((section("__tracepoints"))) =			\
> -		{ __tpstrtab_##name, 0, reg, unreg, NULL };		\
> +		{ __tpstrtab_##name, JUMP_LABEL_INIT, reg, unreg, NULL };\
> 	static struct tracepoint * const __tracepoint_ptr_##name __used	\
> 	__attribute__((section("__tracepoints_ptrs"))) =		\
> 		&__tracepoint_##name;
> diff --git a/kernel/jump_label.c b/kernel/jump_label.c
> index 3b79bd9..29b34be 100644
> --- a/kernel/jump_label.c
> +++ b/kernel/jump_label.c
> @@ -2,9 +2,9 @@
>  * jump label support
>  *
>  * Copyright (C) 2009 Jason Baron <jbaron@redhat.com>
> + * Copyright (C) 2011 Peter Zijlstra <pzijlstr@redhat.com>
>  *
>  */
> -#include <linux/jump_label.h>
> #include <linux/memory.h>
> #include <linux/uaccess.h>
> #include <linux/module.h>
> @@ -13,32 +13,13 @@
> #include <linux/slab.h>
> #include <linux/sort.h>
> #include <linux/err.h>
> +#include <linux/jump_label.h>
>
> #ifdef HAVE_JUMP_LABEL
>
> -#define JUMP_LABEL_HASH_BITS 6
> -#define JUMP_LABEL_TABLE_SIZE (1 << JUMP_LABEL_HASH_BITS)
> -static struct hlist_head jump_label_table[JUMP_LABEL_TABLE_SIZE];
> -
> /* mutex to protect coming/going of the jump_label table */
> static DEFINE_MUTEX(jump_label_mutex);
>
> -struct jump_label_entry {
> -	struct hlist_node hlist;
> -	struct jump_entry *table;
> -	int nr_entries;
> -	/* hang modules off here */
> -	struct hlist_head modules;
> -	unsigned long key;
> -};
> -
> -struct jump_label_module_entry {
> -	struct hlist_node hlist;
> -	struct jump_entry *table;
> -	int nr_entries;
> -	struct module *mod;
> -};
> -
> void jump_label_lock(void)
> {
> 	mutex_lock(&jump_label_mutex);
> @@ -64,7 +45,7 @@ static int jump_label_cmp(const void *a, const void *b)
> }
>
> static void
> -sort_jump_label_entries(struct jump_entry *start, struct jump_entry *stop)
> +jump_label_sort_entries(struct jump_entry *start, struct jump_entry *stop)
> {
> 	unsigned long size;
>
> @@ -73,118 +54,25 @@ sort_jump_label_entries(struct jump_entry *start, struct jump_entry *stop)
> 	sort(start, size, sizeof(struct jump_entry), jump_label_cmp, NULL);
> }
>
> -static struct jump_label_entry *get_jump_label_entry(jump_label_t key)
> -{
> -	struct hlist_head *head;
> -	struct hlist_node *node;
> -	struct jump_label_entry *e;
> -	u32 hash = jhash((void *)&key, sizeof(jump_label_t), 0);
> -
> -	head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)];
> -	hlist_for_each_entry(e, node, head, hlist) {
> -		if (key == e->key)
> -			return e;
> -	}
> -	return NULL;
> -}
> +static void jump_label_update(struct jump_label_key *key, int enable);
>
> -static struct jump_label_entry *
> -add_jump_label_entry(jump_label_t key, int nr_entries, struct jump_entry *table)
> +void jump_label_enable(struct jump_label_key *key)
> {
> -	struct hlist_head *head;
> -	struct jump_label_entry *e;
> -	u32 hash;
> -
> -	e = get_jump_label_entry(key);
> -	if (e)
> -		return ERR_PTR(-EEXIST);
> -
> -	e = kmalloc(sizeof(struct jump_label_entry), GFP_KERNEL);
> -	if (!e)
> -		return ERR_PTR(-ENOMEM);
> -
> -	hash = jhash((void *)&key, sizeof(jump_label_t), 0);
> -	head = &jump_label_table[hash & (JUMP_LABEL_TABLE_SIZE - 1)];
> -	e->key = key;
> -	e->table = table;
> -	e->nr_entries = nr_entries;
> -	INIT_HLIST_HEAD(&(e->modules));
> -	hlist_add_head(&e->hlist, head);
> -	return e;
> -}
> +	if (atomic_inc_not_zero(&key->enabled))
> +		return;
>
> -static int
> -build_jump_label_hashtable(struct jump_entry *start, struct jump_entry *stop)
> -{
> -	struct jump_entry *iter, *iter_begin;
> -	struct jump_label_entry *entry;
> -	int count;
> -
> -	sort_jump_label_entries(start, stop);
> -	iter = start;
> -	while (iter < stop) {
> -		entry = get_jump_label_entry(iter->key);
> -		if (!entry) {
> -			iter_begin = iter;
> -			count = 0;
> -			while ((iter < stop) &&
> -				(iter->key == iter_begin->key)) {
> -				iter++;
> -				count++;
> -			}
> -			entry = add_jump_label_entry(iter_begin->key,
> -							count, iter_begin);
> -			if (IS_ERR(entry))
> -				return PTR_ERR(entry);
> -		} else {
> -			WARN_ONCE(1, KERN_ERR "build_jump_hashtable: unexpected entry!\n");
> -			return -1;
> -		}
> -	}
> -	return 0;
> +	jump_label_lock();
> +	if (atomic_add_return(1, &key->enabled) == 1)
> +		jump_label_update(key, JUMP_LABEL_ENABLE);
> +	jump_label_unlock();
> }
>
> -/***
> - * jump_label_update - update jump label text
> - * @key - key value associated with a a jump label
> - * @type - enum set to JUMP_LABEL_ENABLE or JUMP_LABEL_DISABLE
> - *
> - * Will enable/disable the jump for jump label @key, depending on the
> - * value of @type.
> - *
> - */
> -
> -void jump_label_update(unsigned long key, enum jump_label_type type)
> +void jump_label_disable(struct jump_label_key *key)
> {
> -	struct jump_entry *iter;
> -	struct jump_label_entry *entry;
> -	struct hlist_node *module_node;
> -	struct jump_label_module_entry *e_module;
> -	int count;
> +	if (!atomic_dec_and_mutex_lock(&key->enabled, &jump_label_mutex))
> +		return;
>
> -	jump_label_lock();
> -	entry = get_jump_label_entry((jump_label_t)key);
> -	if (entry) {
> -		count = entry->nr_entries;
> -		iter = entry->table;
> -		while (count--) {
> -			if (kernel_text_address(iter->code))
> -				arch_jump_label_transform(iter, type);
> -			iter++;
> -		}
> -		/* eanble/disable jump labels in modules */
> -		hlist_for_each_entry(e_module, module_node, &(entry->modules),
> -							hlist) {
> -			count = e_module->nr_entries;
> -			iter = e_module->table;
> -			while (count--) {
> -				if (iter->key &&
> -						kernel_text_address(iter->code))
> -					arch_jump_label_transform(iter, type);
> -				iter++;
> -			}
> -		}
> -	}
> +	jump_label_update(key, JUMP_LABEL_DISABLE);
> 	jump_label_unlock();
> }
>
> @@ -197,77 +85,30 @@ static int addr_conflict(struct jump_entry *entry, void *start, void *end)
> 	return 0;
> }
>
> -#ifdef CONFIG_MODULES
> -
> -static int module_conflict(void *start, void *end)
> -{
> -	struct hlist_head *head;
> -	struct hlist_node *node, *node_next, *module_node, *module_node_next;
> -	struct jump_label_entry *e;
> -	struct jump_label_module_entry *e_module;
> -	struct jump_entry *iter;
> -	int i, count;
> -	int conflict = 0;
> -
> -	for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
> -		head = &jump_label_table[i];
> -		hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
> -			hlist_for_each_entry_safe(e_module, module_node,
> -						  module_node_next,
> -						  &(e->modules), hlist) {
> -				count = e_module->nr_entries;
> -				iter = e_module->table;
> -				while (count--) {
> -					if (addr_conflict(iter, start, end)) {
> -						conflict = 1;
> -						goto out;
> -					}
> -					iter++;
> -				}
> -			}
> -		}
> -	}
> -out:
> -	return conflict;
> -}
> -
> -#endif
> -
> -/***
> - * jump_label_text_reserved - check if addr range is reserved
> - * @start: start text addr
> - * @end: end text addr
> - *
> - * checks if the text addr located between @start and @end
> - * overlaps with any of the jump label patch addresses. Code
> - * that wants to modify kernel text should first verify that
> - * it does not overlap with any of the jump label addresses.
> - * Caller must hold jump_label_mutex.
> - *
> - * returns 1 if there is an overlap, 0 otherwise
> - */
> -int jump_label_text_reserved(void *start, void *end)
> +static int __jump_label_text_reserved(struct jump_entry *iter_start,
> +		struct jump_entry *iter_stop, void *start, void *end)
> {
> 	struct jump_entry *iter;
> -	struct jump_entry *iter_start = __start___jump_table;
> -	struct jump_entry *iter_stop = __start___jump_table;
> -	int conflict = 0;
>
> 	iter = iter_start;
> 	while (iter < iter_stop) {
> -		if (addr_conflict(iter, start, end)) {
> -			conflict = 1;
> -			goto out;
> -		}
> +		if (addr_conflict(iter, start, end))
> +			return 1;
> 		iter++;
> 	}
>
> -	/* now check modules */
> -#ifdef CONFIG_MODULES
> -	conflict = module_conflict(start, end);
> -#endif
> -out:
> -	return conflict;
> +	return 0;
> +}
> +
> +static void __jump_label_update(struct jump_label_key *key,
> +		struct jump_entry *entry, int enable)
> +{
> +	for (; entry->key == (jump_label_t)key; entry++) {
> +		if (WARN_ON_ONCE(!kernel_text_address(entry->code)))
> +			continue;
> +
> +		arch_jump_label_transform(entry, enable);
> +	}
> }
>
> /*
> @@ -277,141 +118,155 @@ void __weak arch_jump_label_text_poke_early(jump_label_t addr)
> {
> }
>
> -static __init int init_jump_label(void)
> +static __init int jump_label_init(void)
> {
> -	int ret;
> 	struct jump_entry *iter_start = __start___jump_table;
> 	struct jump_entry *iter_stop = __stop___jump_table;
> +	struct jump_label_key *key = NULL;
> 	struct jump_entry *iter;
>
> 	jump_label_lock();
> -	ret = build_jump_label_hashtable(__start___jump_table,
> -					 __stop___jump_table);
> -	iter = iter_start;
> -	while (iter < iter_stop) {
> +	jump_label_sort_entries(iter_start, iter_stop);
> +
> +	for (iter = iter_start; iter < iter_stop; iter++) {
> 		arch_jump_label_text_poke_early(iter->code);
> -		iter++;
> +		if (iter->key == (jump_label_t)key)
> +			continue;
> +
> +		key = (struct jump_label_key *)iter->key;
> +		atomic_set(&key->enabled, 0);
> +		key->entries = iter;
> +#ifdef CONFIG_MODULES
> +		key->next = NULL;
> +#endif
> 	}
> 	jump_label_unlock();
> -	return ret;
> +
> +	return 0;
> }
> -early_initcall(init_jump_label);
> +early_initcall(jump_label_init);
>
> #ifdef CONFIG_MODULES
>
> -static struct jump_label_module_entry *
> -add_jump_label_module_entry(struct jump_label_entry *entry,
> -			    struct jump_entry *iter_begin,
> -			    int count, struct module *mod)
> -{
> -	struct jump_label_module_entry *e;
> -
> -	e = kmalloc(sizeof(struct jump_label_module_entry), GFP_KERNEL);
> -	if (!e)
> -		return ERR_PTR(-ENOMEM);
> -	e->mod = mod;
> -	e->nr_entries = count;
> -	e->table = iter_begin;
> -	hlist_add_head(&e->hlist, &entry->modules);
> -	return e;
> -}
> +struct jump_label_mod {
> +	struct jump_label_mod *next;
> +	struct jump_entry *entries;
> +	struct module *mod;
> +};
>
> -static int add_jump_label_module(struct module *mod)
> +static int __jump_label_mod_text_reserved(void *start, void *end)
> {
> -	struct jump_entry *iter, *iter_begin;
> -	struct jump_label_entry *entry;
> -	struct jump_label_module_entry *module_entry;
> -	int count;
> +	struct module *mod;
>
> -	/* if the module doesn't have jump label entries, just return */
> -	if (!mod->num_jump_entries)
> +	mod = __module_text_address(start);
> +	if (!mod)
> 		return 0;
>
> -	sort_jump_label_entries(mod->jump_entries,
> -			mod->jump_entries + mod->num_jump_entries);
> +	WARN_ON_ONCE(__module_text_address(end) != mod);
> +
> +	return __jump_label_text_reserved(mod->jump_entries,
> +			mod->jump_entries + mod->num_jump_entries,
> +			start, end);
> -	iter = mod->jump_entries;
> -	while (iter < mod->jump_entries + mod->num_jump_entries) {
> -		entry = get_jump_label_entry(iter->key);
> -		iter_begin = iter;
> -		count = 0;
> -		while ((iter < mod->jump_entries + mod->num_jump_entries) &&
> -			(iter->key == iter_begin->key)) {
> -			iter++;
> -			count++;
> -		}
> -		if (!entry) {
> -			entry = add_jump_label_entry(iter_begin->key, 0, NULL);
> -			if (IS_ERR(entry))
> -				return PTR_ERR(entry);
> -		}
> -		module_entry = add_jump_label_module_entry(entry, iter_begin,
> -							   count, mod);
> -		if (IS_ERR(module_entry))
> -			return PTR_ERR(module_entry);
> +}
> +
> +static void __jump_label_mod_update(struct jump_label_key *key, int enable)
> +{
> +	struct jump_label_mod *mod = key->next;
> +
> +	while (mod) {
> +		__jump_label_update(key, mod->entries, enable);
> +		mod = mod->next;
> 	}
> -	return 0;
> }
>
> -static void remove_jump_label_module(struct module *mod)
> +/***
> + * jump_label_apply_nops - patch module jump labels with arch_get_jump_label_nop()
> + * @mod: module to patch
> + *
> + * Allow for run-time selection of the optimal nops. Before the module
> + * loads patch these with arch_get_jump_label_nop(), which is specified by
> + * the arch specific jump label code.
> + */
> +static void jump_label_apply_nops(struct module *mod)
> {
> -	struct hlist_head *head;
> -	struct hlist_node *node, *node_next, *module_node, *module_node_next;
> -	struct jump_label_entry *e;
> -	struct jump_label_module_entry *e_module;
> -	int i;
> +	struct jump_entry *iter_start = mod->jump_entries;
> +	struct jump_entry *iter_stop = mod->jump_entries + mod->num_jump_entries;
> +	struct jump_entry *iter;
>
> 	/* if the module doesn't have jump label entries, just return */
> -	if (!mod->num_jump_entries)
> +	if (iter_start == iter_stop)
> 		return;
>
> -	for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
> -		head = &jump_label_table[i];
> -		hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
> -			hlist_for_each_entry_safe(e_module, module_node,
> -						  module_node_next,
> -						  &(e->modules), hlist) {
> -				if (e_module->mod == mod) {
> -					hlist_del(&e_module->hlist);
> -					kfree(e_module);
> -				}
> -			}
> -			if (hlist_empty(&e->modules) && (e->nr_entries == 0)) {
> -				hlist_del(&e->hlist);
> -				kfree(e);
> -			}
> +	jump_label_sort_entries(iter_start, iter_stop);
> +
> +	for (iter = iter_start; iter < iter_stop; iter++)
> +		arch_jump_label_text_poke_early(iter->code);
> +}
> +
> +static int jump_label_add_module(struct module *mod)
> +{
> +	struct jump_entry *iter_start = mod->jump_entries;
> +	struct jump_entry *iter_stop = mod->jump_entries + mod->num_jump_entries;
> +	struct jump_entry *iter;
> +	struct jump_label_key *key = NULL;
> +	struct jump_label_mod *jlm;
> +
> +	for (iter = iter_start; iter < iter_stop; iter++) {
> +		if (iter->key == (jump_label_t)key)
> +			continue;
> +
> +		key = (struct jump_label_key *)iter->key;
> +
> +		if (__module_address(iter->key) == mod) {
> +			atomic_set(&key->enabled, 0);
> +			key->entries = iter;
> +			key->next = NULL;
> +			continue;
> 		}
> +
> +		jlm = kzalloc(sizeof(struct jump_label_mod), GFP_KERNEL);
> +		if (!jlm)
> +			return -ENOMEM;
> +
> +		jlm->mod = mod;
> +		jlm->entries = iter;
> +		jlm->next = key->next;
> +		key->next = jlm;
> +
> +		if (jump_label_enabled(key))
> +			__jump_label_update(key, iter, JUMP_LABEL_ENABLE);
> 	}
> +
> +	return 0;
> }
>
> -static void remove_jump_label_module_init(struct module *mod)
> +static void jump_label_del_module(struct module *mod)
> {
> -	struct hlist_head *head;
> -	struct hlist_node *node, *node_next, *module_node, *module_node_next;
> -	struct jump_label_entry *e;
> -	struct jump_label_module_entry *e_module;
> +	struct jump_entry *iter_start = mod->jump_entries;
> +	struct jump_entry *iter_stop = mod->jump_entries + mod->num_jump_entries;
> 	struct jump_entry *iter;
> -	int i, count;
> +	struct jump_label_key *key = NULL;
> +	struct jump_label_mod *jlm, **prev;
>
> -	/* if the module doesn't have jump label entries, just return */
> -	if (!mod->num_jump_entries)
> -		return;
> +	for (iter = iter_start; iter < iter_stop; iter++) {
> +		if (iter->key == (jump_label_t)key)
> +			continue;
> +
> +		key = (struct jump_label_key *)iter->key;
> +
> +		if (__module_address(iter->key) == mod)
> +			continue;
> +
> +		prev = &key->next;
> +		jlm = key->next;
> +
> +		while (jlm && jlm->mod != mod) {
> +			prev = &jlm->next;
> +			jlm = jlm->next;
> +		}
>
> -	for (i = 0; i < JUMP_LABEL_TABLE_SIZE; i++) {
> -		head = &jump_label_table[i];
> -		hlist_for_each_entry_safe(e, node, node_next, head, hlist) {
> -			hlist_for_each_entry_safe(e_module, module_node,
> -						  module_node_next,
> -						  &(e->modules), hlist) {
> -				if (e_module->mod != mod)
> -					continue;
> -				count = e_module->nr_entries;
> -				iter = e_module->table;
> -				while (count--) {
> -					if (within_module_init(iter->code, mod))
> -						iter->key = 0;
> -					iter++;
> -				}
> -			}
> +		if (jlm) {
> +			*prev = jlm->next;
> +			kfree(jlm);
> 		}
> 	}
> }
> @@ -424,61 +279,76 @@ jump_label_module_notify(struct notifier_block *self, unsigned long val,
> 	int ret = 0;
>
> 	switch (val) {
> -	case MODULE_STATE_COMING:
> +	case MODULE_STATE_POST_RELOCATE:
> 		jump_label_lock();
> -		ret = add_jump_label_module(mod);
> -		if (ret)
> -			remove_jump_label_module(mod);
> +		jump_label_apply_nops(mod);
> 		jump_label_unlock();
> 		break;
> -	case MODULE_STATE_GOING:
> +	case MODULE_STATE_COMING:
> 		jump_label_lock();
> -		remove_jump_label_module(mod);
> +		ret = jump_label_add_module(mod);
> +		if (ret)
> +			jump_label_del_module(mod);
> 		jump_label_unlock();
> 		break;
> -	case MODULE_STATE_LIVE:
> +	case MODULE_STATE_GOING:
> 		jump_label_lock();
> -		remove_jump_label_module_init(mod);
> +		jump_label_del_module(mod);
> 		jump_label_unlock();
> 		break;
> 	}
> 	return ret;
> }
>
> +struct notifier_block jump_label_module_nb = {
> +	.notifier_call = jump_label_module_notify,
> +	.priority = 1, /* higher than tracepoints */
> +};
> +
> +static __init int jump_label_init_module(void)
> +{
> +	return register_module_notifier(&jump_label_module_nb);
> +}
> +early_initcall(jump_label_init_module);
> +
> +#endif /* CONFIG_MODULES */
> +
> /***
> - * apply_jump_label_nops - patch module jump labels with arch_get_jump_label_nop()
> - * @mod: module to patch
> + * jump_label_text_reserved - check if addr range is reserved
> + * @start: start text addr
> + * @end: end text addr
>  *
> - * Allow for run-time selection of the optimal nops. Before the module
> - * loads patch these with arch_get_jump_label_nop(), which is specified by
> - * the arch specific jump label code.
> + * checks if the text addr located between @start and @end
> + * overlaps with any of the jump label patch addresses. Code
> + * that wants to modify kernel text should first verify that
> + * it does not overlap with any of the jump label addresses.
> + * Caller must hold jump_label_mutex.
> + *
> + * returns 1 if there is an overlap, 0 otherwise
>  */
> -void jump_label_apply_nops(struct module *mod)
> +int jump_label_text_reserved(void *start, void *end)
> {
> -	struct jump_entry *iter;
> +	int ret = __jump_label_text_reserved(__start___jump_table,
> +			__stop___jump_table, start, end);
>
> -	/* if the module doesn't have jump label entries, just return */
> -	if (!mod->num_jump_entries)
> -		return;
> +	if (ret)
> +		return ret;
>
> -	iter = mod->jump_entries;
> -	while (iter < mod->jump_entries + mod->num_jump_entries) {
> -		arch_jump_label_text_poke_early(iter->code);
> -		iter++;
> -	}
> +#ifdef CONFIG_MODULES
> +	ret = __jump_label_mod_text_reserved(start, end);
> +#endif
> +	return ret;
> }
>
> -struct notifier_block jump_label_module_nb = {
> -	.notifier_call = jump_label_module_notify,
> -	.priority = 0,
> -};
> -
> -static __init int init_jump_label_module(void)
> +static void jump_label_update(struct jump_label_key *key, int enable)
> {
> -	return register_module_notifier(&jump_label_module_nb);
> -}
> -early_initcall(init_jump_label_module);
> +	struct jump_entry *entry = key->entries;
>
> -#endif /* CONFIG_MODULES */
> +	__jump_label_update(key, entry, enable);
> +
> +#ifdef CONFIG_MODULES
> +	__jump_label_mod_update(key, enable);
> +#endif
> +}
>
> #endif
> diff --git a/kernel/module.c b/kernel/module.c
> index efa290e..890cadf 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -2789,6 +2789,13 @@ static struct module *load_module(void __user *umod,
> 		goto unlock;
> 	}
>
> +	err = blocking_notifier_call_chain(&module_notify_list,
> +			MODULE_STATE_POST_RELOCATE, mod);
> +	if (err != NOTIFY_DONE) {
> +		err = notifier_to_errno(err);
> +		goto unlock;
> +	}
> +
> 	/* This has to be done once we're sure module name is unique. */
> 	if (!mod->taints)
> 		dynamic_debug_setup(info.debug, info.num_debug);
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index a353a4d..7bacdd3 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -117,7 +117,7 @@ enum event_type_t {
> 	EVENT_ALL = EVENT_FLEXIBLE | EVENT_PINNED,
> };
>
> -atomic_t perf_task_events __read_mostly;
> +struct jump_label_key_counter perf_task_events __read_mostly;
> static atomic_t nr_mmap_events __read_mostly;
> static atomic_t nr_comm_events __read_mostly;
> static atomic_t nr_task_events __read_mostly;
> @@ -2383,8 +2383,10 @@ static void free_event(struct perf_event *event)
> 	irq_work_sync(&event->pending);
>
> 	if (!event->parent) {
> -		if (event->attach_state & PERF_ATTACH_TASK)
> -			jump_label_dec(&perf_task_events);
> +		if (event->attach_state & PERF_ATTACH_TASK) {
> +			if (atomic_dec_and_test(&perf_task_events.ref))
> +				jump_label_disable(&perf_task_events.key);
> +		}
> 		if (event->attr.mmap || event->attr.mmap_data)
> 			atomic_dec(&nr_mmap_events);
> 		if (event->attr.comm)
> @@ -4912,7 +4914,7 @@ fail:
> 	return err;
> }
>
> -atomic_t perf_swevent_enabled[PERF_COUNT_SW_MAX];
> +struct jump_label_key_counter perf_swevent_enabled[PERF_COUNT_SW_MAX];
>
> static void sw_perf_event_destroy(struct perf_event *event)
> {
> @@ -4920,7 +4922,8 @@ static void sw_perf_event_destroy(struct perf_event *event)
>
> 	WARN_ON(event->parent);
>
> -	jump_label_dec(&perf_swevent_enabled[event_id]);
> +	if (atomic_dec_and_test(&perf_swevent_enabled[event_id].ref))
> +		jump_label_disable(&perf_swevent_enabled[event_id].key);
> 	swevent_hlist_put(event);
> }
>
> @@ -4945,12 +4948,15 @@ static int perf_swevent_init(struct perf_event *event)
>
> 	if (!event->parent) {
> 		int err;
> +		atomic_t *ref;
>
> 		err = swevent_hlist_get(event);
> 		if (err)
> 			return err;
>
> -		jump_label_inc(&perf_swevent_enabled[event_id]);
> +		ref = &perf_swevent_enabled[event_id].ref;
> +		if (atomic_add_return(1, ref) == 1)
> +			jump_label_enable(&perf_swevent_enabled[event_id].key);
> 		event->destroy = sw_perf_event_destroy;
> 	}
>
> @@ -5123,6 +5129,10 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer)
> 	u64 period;
>
> 	event = container_of(hrtimer, struct perf_event, hw.hrtimer);
> +
> +	if (event->state < PERF_EVENT_STATE_ACTIVE)
> +		return HRTIMER_NORESTART;
> +
> 	event->pmu->read(event);
>
> 	perf_sample_data_init(&data, 0);
> @@ -5174,7 +5184,7 @@ static void perf_swevent_cancel_hrtimer(struct perf_event *event)
> 		ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer);
> 		local64_set(&hwc->period_left, ktime_to_ns(remaining));
>
> -		hrtimer_cancel(&hwc->hrtimer);
> +		hrtimer_try_to_cancel(&hwc->hrtimer);
> 	}
> }
>
> @@ -5713,8 +5723,10 @@ done:
> 	event->pmu = pmu;
>
> 	if (!event->parent) {
> -		if (event->attach_state & PERF_ATTACH_TASK)
> -			jump_label_inc(&perf_task_events);
> +		if (event->attach_state & PERF_ATTACH_TASK) {
> +			if (atomic_add_return(1, &perf_task_events.ref) == 1)
> +				jump_label_enable(&perf_task_events.key);
> +		}
> 		if (event->attr.mmap || event->attr.mmap_data)
> 			atomic_inc(&nr_mmap_events);
> 		if (event->attr.comm)
> diff --git a/kernel/timer.c b/kernel/timer.c
> index 343ff27..c848cd8 100644
> --- a/kernel/timer.c
> +++ b/kernel/timer.c
> @@ -959,7 +959,7 @@ EXPORT_SYMBOL(try_to_del_timer_sync);
>  *
>  * Synchronization rules: Callers must prevent restarting of the timer,
>  * otherwise this function is meaningless. It must not be called from
> - * hardirq contexts. The caller must not hold locks which would prevent
> + * interrupt contexts. The caller must not hold locks which would prevent
>  * completion of the timer's handler. The timer's handler must not call
>  * add_timer_on(). Upon exit the timer is not queued and the handler is
>  * not running on any CPU.
> @@ -971,12 +971,10 @@ int del_timer_sync(struct timer_list *timer)
> #ifdef CONFIG_LOCKDEP
> 	unsigned long flags;
>
> -	raw_local_irq_save(flags);
> -	local_bh_disable();
> +	local_irq_save(flags);
> 	lock_map_acquire(&timer->lockdep_map);
> 	lock_map_release(&timer->lockdep_map);
> -	_local_bh_enable();
> -	raw_local_irq_restore(flags);
> +	local_irq_restore(flags);
> #endif
> 	/*
> 	 * don't use it in hardirq context, because it
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index 68187af..13066e8 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -251,9 +251,9 @@ static void set_tracepoint(struct tracepoint_entry **entry,
> {
> 	WARN_ON(strcmp((*entry)->name, elem->name) != 0);
>
> -	if (elem->regfunc && !elem->state && active)
> +	if (elem->regfunc && !jump_label_enabled(&elem->key) && active)
> 		elem->regfunc();
> -	else if (elem->unregfunc && elem->state && !active)
> +	else if (elem->unregfunc && jump_label_enabled(&elem->key) && !active)
> 		elem->unregfunc();
>
> 	/*
> @@ -264,13 +264,10 @@ static void set_tracepoint(struct tracepoint_entry **entry,
> 	 * is used.
> 	 */
> 	rcu_assign_pointer(elem->funcs, (*entry)->funcs);
> -	if (!elem->state && active) {
> -		jump_label_enable(&elem->state);
> -		elem->state = active;
> -	} else if (elem->state && !active) {
> -		jump_label_disable(&elem->state);
> -		elem->state = active;
> -	}
> +	if (active)
> +		jump_label_enable(&elem->key);
> +	else
> +		jump_label_disable(&elem->key);
> }
>
> /*
> @@ -281,13 +278,10 @@ static void set_tracepoint(struct tracepoint_entry **entry,
>  */
> static void disable_tracepoint(struct tracepoint *elem)
> {
> -	if (elem->unregfunc && elem->state)
> +	if (elem->unregfunc && jump_label_enabled(&elem->key))
> 		elem->unregfunc();
>
> -	if (elem->state) {
> -		jump_label_disable(&elem->state);
> -		elem->state = 0;
> -	}
> +	jump_label_disable(&elem->key);
> 	rcu_assign_pointer(elem->funcs, NULL);
> }
>

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 113+ messages in thread
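The central trick in the patch quoted above is worth restating outside the diff: the build collects every branch site into one table, the table is sorted by key address at boot, and each key then records a pointer to its first entry, so enabling or disabling a key is a linear walk over that key's own consecutive run of sites rather than a hash lookup. What follows is a minimal stand-alone sketch of that scheme, not the kernel code itself: every name in it (sketch_entry, sketch_key, patch_site, and so on) is hypothetical, and patch_site() merely prints where arch_jump_label_transform() would patch text.

/*
 * Sketch of the O(1)-lookup update scheme: sort once, point each key
 * at its first entry, walk the run of entries sharing that key.
 * User-space illustration only; all names are hypothetical.
 */
#include <stdio.h>
#include <stdlib.h>

struct sketch_entry {
	unsigned long code;	/* address of the jump/nop site */
	unsigned long key;	/* address of the controlling key */
};

struct sketch_key {
	int enabled;
	struct sketch_entry *entries;	/* first table entry for this key */
};

static int cmp_entry(const void *a, const void *b)
{
	const struct sketch_entry *ea = a, *eb = b;

	if (ea->key < eb->key)
		return -1;
	return ea->key > eb->key;
}

/* stands in for arch_jump_label_transform() */
static void patch_site(struct sketch_entry *e, int enable)
{
	printf("patch %#lx -> %s\n", e->code, enable ? "jump" : "nop");
}

/* O(#sites for this key): pointer chase instead of a hash lookup */
static void sketch_update(struct sketch_key *key, int enable)
{
	struct sketch_entry *e;

	for (e = key->entries; e->key == (unsigned long)key; e++)
		patch_site(e, enable);
}

int main(void)
{
	struct sketch_key k1 = { 0, NULL }, k2 = { 0, NULL };
	struct sketch_entry table[] = {
		{ 0x1000, (unsigned long)&k1 },
		{ 0x2000, (unsigned long)&k2 },
		{ 0x1100, (unsigned long)&k1 },
		{ 0, 0 },	/* sentinel terminates the run walk */
	};
	struct sketch_entry *it;
	struct sketch_key *cur = NULL;

	/* boot-time pass: sort, then record each key's first entry */
	qsort(table, 3, sizeof(table[0]), cmp_entry);
	for (it = table; it < table + 3; it++) {
		if (it->key == (unsigned long)cur)
			continue;
		cur = (struct sketch_key *)it->key;
		cur->entries = it;
	}

	sketch_update(&k1, 1);	/* patches both k1 sites, no hashing */
	return 0;
}

In the kernel the walk is bounded the same way: the loop stops at the first entry whose key differs, which is either the next key's run or the end of the section.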
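Likewise, the reference counting that the series evicts from the jump label core into perf reduces to a counter sitting next to the key, where only the 0->1 and 1->0 transitions reach the expensive patching path. Below is a minimal sketch of that pattern under the same caveats: the names (demo_key_counter and friends) are hypothetical, C11 atomics stand in for the kernel's atomic_t API, and the real code additionally serializes the transitions against jump_label_mutex (see atomic_dec_and_mutex_lock() in the patch), which this sketch omits.

/*
 * Sketch of the jump_label_key_counter pattern: a refcount next to
 * the key, with enable/disable only on the first/last reference.
 * User-space illustration only; hypothetical names throughout.
 */
#include <stdatomic.h>
#include <stdio.h>

struct demo_key { int dummy; };

static void demo_enable(struct demo_key *k)
{
	(void)k;		/* would patch this key's sites to jumps */
	puts("patch sites in");
}

static void demo_disable(struct demo_key *k)
{
	(void)k;		/* would patch this key's sites back to nops */
	puts("patch sites out");
}

struct demo_key_counter {
	atomic_int ref;
	struct demo_key key;
};

static void demo_get(struct demo_key_counter *c)
{
	/* first reference flips the branch on */
	if (atomic_fetch_add(&c->ref, 1) == 0)
		demo_enable(&c->key);
}

static void demo_put(struct demo_key_counter *c)
{
	/* last reference flips it back off */
	if (atomic_fetch_sub(&c->ref, 1) == 1)
		demo_disable(&c->key);
}

int main(void)
{
	struct demo_key_counter c = { 0 };

	demo_get(&c);	/* prints "patch sites in" */
	demo_get(&c);	/* no-op: already enabled */
	demo_put(&c);	/* no-op: still one user */
	demo_put(&c);	/* prints "patch sites out" */
	return 0;
}

This mirrors the perf hunks above (atomic_add_return(1, ref) == 1 on the way up, atomic_dec_and_test() on the way down), which is exactly why jump_label_inc()/jump_label_dec() could be dropped from the jump label layer.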
end of thread, other threads:[~2011-02-18 19:04 UTC | newest]
Thread overview: 113+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-05 15:43 [PATCH 0/2] jump label: 2.6.38 updates Jason Baron
2011-01-05 15:43 ` [PATCH 1/2] jump label: make enable/disable o(1) Jason Baron
2011-01-05 17:31 ` Steven Rostedt
2011-01-05 21:19 ` Jason Baron
2011-01-05 15:43 ` [PATCH 2/2] jump label: introduce static_branch() Jason Baron
2011-01-05 17:15 ` Frederic Weisbecker
2011-01-05 17:46 ` Steven Rostedt
2011-01-05 18:52 ` H. Peter Anvin
2011-01-05 21:19 ` Jason Baron
2011-01-05 21:14 ` Jason Baron
2011-01-05 17:32 ` David Daney
2011-01-05 17:43 ` Steven Rostedt
2011-01-05 18:44 ` David Miller
2011-01-05 20:04 ` Steven Rostedt
2011-01-05 18:56 ` H. Peter Anvin
2011-01-05 19:14 ` Ingo Molnar
2011-01-05 19:32 ` David Daney
2011-01-05 19:50 ` Ingo Molnar
2011-01-05 20:07 ` David Daney
2011-01-05 20:08 ` H. Peter Anvin
2011-01-05 20:18 ` Ingo Molnar
2011-01-05 21:16 ` Jason Baron
2011-01-05 17:41 ` Steven Rostedt
2011-01-09 18:48 ` Mathieu Desnoyers
2011-02-11 19:25 ` [PATCH 0/2] jump label: 2.6.38 updates Peter Zijlstra
2011-02-11 21:13 ` Mathieu Desnoyers
[not found] ` <BLU0-SMTP101B686C32E10BA346B15F896EF0@phx.gbl>
2011-02-11 21:38 ` Peter Zijlstra
2011-02-11 22:15 ` Jason Baron
2011-02-11 22:19 ` H. Peter Anvin
2011-02-11 22:30 ` Mathieu Desnoyers
2011-02-11 22:20 ` Mathieu Desnoyers
[not found] ` <BLU0-SMTP8562BA758CF8AAE5323AE296EF0@phx.gbl>
2011-02-11 22:27 ` Jason Baron
2011-02-11 22:32 ` Mathieu Desnoyers
2011-02-12 18:47 ` Peter Zijlstra
2011-02-14 12:27 ` Ingo Molnar
2011-02-14 15:51 ` Jason Baron
2011-02-14 15:57 ` Peter Zijlstra
2011-02-14 16:04 ` Jason Baron
2011-02-14 16:14 ` Mathieu Desnoyers
[not found] ` <BLU0-SMTP4069A1A89F06CDFF9B28F896D00@phx.gbl>
2011-02-14 16:25 ` Peter Zijlstra
2011-02-14 16:29 ` Jason Baron
2011-02-14 16:37 ` Peter Zijlstra
2011-02-14 16:43 ` Mathieu Desnoyers
2011-02-14 16:46 ` Steven Rostedt
2011-02-14 16:53 ` Peter Zijlstra
2011-02-14 17:18 ` Steven Rostedt
2011-02-14 17:23 ` Mike Frysinger
2011-02-14 17:27 ` Peter Zijlstra
2011-02-14 17:29 ` Mike Frysinger
2011-02-14 17:38 ` Peter Zijlstra
2011-02-14 17:45 ` Mike Frysinger
2011-02-14 17:38 ` Will Newton
2011-02-14 17:43 ` Peter Zijlstra
2011-02-14 17:50 ` Will Newton
2011-02-14 18:04 ` Peter Zijlstra
2011-02-14 18:24 ` Peter Zijlstra
2011-02-14 18:53 ` Mathieu Desnoyers
2011-02-14 21:29 ` Steven Rostedt
2011-02-14 21:39 ` Steven Rostedt
2011-02-14 21:46 ` David Miller
2011-02-14 22:20 ` Steven Rostedt
2011-02-14 22:21 ` Steven Rostedt
2011-02-14 22:21 ` H. Peter Anvin
2011-02-14 22:29 ` Mathieu Desnoyers
[not found] ` <BLU0-SMTP98BFCC52FD41661DD9CC1E96D00@phx.gbl>
2011-02-14 22:33 ` David Miller
2011-02-14 22:33 ` David Miller
2011-02-14 22:37 ` Matt Fleming
2011-02-14 23:03 ` Mathieu Desnoyers
[not found] ` <BLU0-SMTP166A8555C791786059B0FF96D00@phx.gbl>
2011-02-14 23:09 ` Paul E. McKenney
2011-02-14 23:29 ` Mathieu Desnoyers
[not found] ` <BLU0-SMTP4599FAAD7330498472B87396D00@phx.gbl>
2011-02-15 0:19 ` Segher Boessenkool
2011-02-15 0:48 ` Mathieu Desnoyers
2011-02-15 1:29 ` Steven Rostedt
[not found] ` <BLU0-SMTP984E876DBDFBC13F4C86F896D00@phx.gbl>
2011-02-15 0:42 ` Paul E. McKenney
2011-02-15 0:51 ` Mathieu Desnoyers
2011-02-15 11:53 ` Will Newton
2011-02-18 19:03 ` Paul E. McKenney
2011-02-14 23:19 ` H. Peter Anvin
2011-02-15 11:01 ` Will Newton
2011-02-15 13:31 ` H. Peter Anvin
2011-02-15 13:49 ` Steven Rostedt
2011-02-15 14:04 ` Will Newton
2011-02-15 21:11 ` Will Simoneau
2011-02-15 21:27 ` David Miller
2011-02-15 21:56 ` Will Simoneau
2011-02-16 10:15 ` Will Newton
2011-02-16 12:18 ` Steven Rostedt
2011-02-16 12:41 ` Will Newton
2011-02-16 13:24 ` Mathieu Desnoyers
2011-02-16 22:51 ` Will Simoneau
2011-02-17 0:53 ` Please watch your cc lists Andi Kleen
2011-02-17 0:56 ` David Miller
2011-02-17 1:04 ` Michael Witten
2011-02-17 10:55 ` [PATCH 0/2] jump label: 2.6.38 updates Will Newton
[not found] ` <BLU0-SMTP80F56386E7E060A3B2020B96D20@phx.gbl>
2011-02-17 1:55 ` Masami Hiramatsu
2011-02-17 3:19 ` H. Peter Anvin
2011-02-17 16:03 ` Mathieu Desnoyers
[not found] ` <BLU0-SMTP71BCB155CBAE79997EE08D96D20@phx.gbl>
2011-02-17 3:36 ` Steven Rostedt
2011-02-17 16:13 ` Mathieu Desnoyers
[not found] ` <BLU0-SMTP51D40A5B1DACA8883D6AB596D50@phx.gbl>
2011-02-17 20:09 ` Steven Rostedt
2011-02-15 22:20 ` Benjamin Herrenschmidt
2011-02-16 8:35 ` Ingo Molnar
2011-02-17 1:04 ` H. Peter Anvin
2011-02-17 12:51 ` Ingo Molnar
[not found] ` <BLU0-SMTP637B2E9372CFBF3A0B5B0996D00@phx.gbl>
2011-02-14 23:25 ` David Miller
2011-02-14 23:34 ` Mathieu Desnoyers
[not found] ` <20110214233405.GC17432@Krystal>
2011-02-14 23:52 ` Mathieu Desnoyers
2011-02-14 22:15 ` Matt Fleming
2011-02-15 15:20 ` Heiko Carstens
[not found] ` <BLU0-SMTP64371A838030ED92A7CCB696D00@phx.gbl>
2011-02-14 18:54 ` Jason Baron
2011-02-14 19:20 ` Peter Zijlstra
2011-02-14 19:48 ` Mathieu Desnoyers
2011-02-14 16:11 ` Mathieu Desnoyers