All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
To: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: linux-kernel@vger.kernel.org, akpm@osdl.org,
	anil.s.keshavamurthy@intel.com, davem@davemloft.net,
	prasanna@in.ibm.com
Subject: Re: [PATCH 8/9] Kprobes: Use RCU for (un)register synchronization - base changes
Date: Tue, 18 Oct 2005 10:43:23 -0400	[thread overview]
Message-ID: <20051018144323.GA4479@in.ibm.com> (raw)
In-Reply-To: <20051018054436.GA1364@us.ibm.com>

On Mon, Oct 17, 2005 at 10:44:36PM -0700, Paul E. McKenney wrote:
> On Mon, Oct 10, 2005 at 10:47:20AM -0400, Ananth N Mavinakayanahalli wrote:
> > From: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
> > 
> > Changes to the base kprobes infrastructure to use RCU for synchronization
> > during kprobe registration and unregistration. These changes coupled with
> > the arch kprobe changes (next in series):
> > 
> > a. serialize registration and unregistration of kprobes.
> > b. enable lockless execution of handlers. Handlers can now run in parallel.
> 
> A couple of questions interspersed...  Probably my limited knowledge of
> kprobes, but I have to ask...  Search for empty lines to find them.

Paul,

Thanks for the review... replies inlined...

Ananth

> 
> 						Thanx, Paul
> 
> > Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
> > Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
> > ---
> > 
> >  include/linux/kprobes.h |    9 +---
> >  kernel/kprobes.c        |  101 +++++++++++++++++++-----------------------------
> >  2 files changed, 45 insertions(+), 65 deletions(-)
> > 
> > Index: linux-2.6.14-rc3/include/linux/kprobes.h
> > ===================================================================
> > --- linux-2.6.14-rc3.orig/include/linux/kprobes.h	2005-10-07 21:40:43.000000000 -0400
> > +++ linux-2.6.14-rc3/include/linux/kprobes.h	2005-10-07 21:40:49.000000000 -0400
> > @@ -34,6 +34,8 @@
> >  #include <linux/notifier.h>
> >  #include <linux/smp.h>
> >  #include <linux/percpu.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/rcupdate.h>
> >  
> >  #include <asm/kprobes.h>
> >  
> > @@ -146,10 +148,7 @@ struct kretprobe_instance {
> >  };
> >  
> >  #ifdef CONFIG_KPROBES
> > -/* Locks kprobe: irq must be disabled */
> > -void lock_kprobes(void);
> > -void unlock_kprobes(void);
> > -
> > +extern spinlock_t kretprobe_lock;
> >  extern int arch_prepare_kprobe(struct kprobe *p);
> >  extern void arch_copy_kprobe(struct kprobe *p);
> >  extern void arch_arm_kprobe(struct kprobe *p);
> > @@ -160,7 +159,7 @@ extern void show_registers(struct pt_reg
> >  extern kprobe_opcode_t *get_insn_slot(void);
> >  extern void free_insn_slot(kprobe_opcode_t *slot);
> >  
> > -/* Get the kprobe at this addr (if any).  Must have called lock_kprobes */
> > +/* Get the kprobe at this addr (if any) - called under a rcu_read_lock() */
> 
> Since the update side uses synchronize_kernel() (in patch 9/9, right?),

Yes, synchronize_sched() to be precise.

> shouldn't the above "rcu_read_lock()" instead by preempt_disable()?

My reasoning was that since rcu_read_(un)lock is #defined to
preempt_(en)disable. But, I agree, since we use synchronize_sched() this
can just be preempt_disable()...
Suggestions?

> Also, some of the code in the earlier patches seems to do preempt_disable.
> For example, kprobe_handler() in arch/i386/kernel/kprobes.c.

Well, the convolution is due to the way kprobes work: 
- Breakpoint hit; execute pre_handler
- single-step on a copy of the original instruction; execute the
post_handler

We don't want to be preempted from the time of the breakpoint exception
until after the post_handler is run. I've taken care to ensure the
preempt_ calls from the various codepaths are balanced.

> In realtime kernels, synchronize_sched() does -not- guarantee that all
> rcu_read_lock() critical sections have finished, only that any
> preempt_disable() code sequences (explicit or implicit) have completed.

Are we assured that the preempt_ depth is also taken care of? If that is
the case, I think we are safe, since the matching preempt_enable() calls
are _after_ the post_handler is executed.

> >  struct kprobe *get_kprobe(void *addr);
> >  struct hlist_head * kretprobe_inst_table_head(struct task_struct *tsk);
> >  
> > Index: linux-2.6.14-rc3/kernel/kprobes.c
> > ===================================================================
> > --- linux-2.6.14-rc3.orig/kernel/kprobes.c	2005-10-07 21:40:43.000000000 -0400
> > +++ linux-2.6.14-rc3/kernel/kprobes.c	2005-10-07 21:41:11.000000000 -0400
> > @@ -32,7 +32,6 @@
> >   *		<prasanna@in.ibm.com> added function-return probes.
> >   */
> >  #include <linux/kprobes.h>
> > -#include <linux/spinlock.h>
> >  #include <linux/hash.h>
> >  #include <linux/init.h>
> >  #include <linux/module.h>
> > @@ -48,8 +47,8 @@
> >  static struct hlist_head kprobe_table[KPROBE_TABLE_SIZE];
> >  static struct hlist_head kretprobe_inst_table[KPROBE_TABLE_SIZE];
> >  
> > -unsigned int kprobe_cpu = NR_CPUS;
> > -static DEFINE_SPINLOCK(kprobe_lock);
> > +static DEFINE_SPINLOCK(kprobe_lock);	/* Protects kprobe_table */
> > +DEFINE_SPINLOCK(kretprobe_lock);	/* Protects kretprobe_inst_table */
> >  static DEFINE_PER_CPU(struct kprobe *, kprobe_instance) = NULL;
> >  
> >  /*
> > @@ -152,41 +151,6 @@ void __kprobes free_insn_slot(kprobe_opc
> >  	}
> >  }
> >  
> > -/* Locks kprobe: irqs must be disabled */
> > -void __kprobes lock_kprobes(void)
> > -{
> > -	unsigned long flags = 0;
> > -
> > -	/* Avoiding local interrupts to happen right after we take the kprobe_lock
> > -	 * and before we get a chance to update kprobe_cpu, this to prevent
> > -	 * deadlock when we have a kprobe on ISR routine and a kprobe on task
> > -	 * routine
> > -	 */
> > -	local_irq_save(flags);
> > -
> > -	spin_lock(&kprobe_lock);
> > -	kprobe_cpu = smp_processor_id();
> > -
> > - 	local_irq_restore(flags);
> > -}
> > -
> > -void __kprobes unlock_kprobes(void)
> > -{
> > -	unsigned long flags = 0;
> > -
> > -	/* Avoiding local interrupts to happen right after we update
> > -	 * kprobe_cpu and before we get a a chance to release kprobe_lock,
> > -	 * this to prevent deadlock when we have a kprobe on ISR routine and
> > -	 * a kprobe on task routine
> > -	 */
> > -	local_irq_save(flags);
> > -
> > -	kprobe_cpu = NR_CPUS;
> > -	spin_unlock(&kprobe_lock);
> > -
> > - 	local_irq_restore(flags);
> > -}
> > -
> >  /* We have preemption disabled.. so it is safe to use __ versions */
> >  static inline void set_kprobe_instance(struct kprobe *kp)
> >  {
> > @@ -198,14 +162,19 @@ static inline void reset_kprobe_instance
> >  	__get_cpu_var(kprobe_instance) = NULL;
> >  }
> >  
> > -/* You have to be holding the kprobe_lock */
> > +/*
> > + * This routine is called either:
> > + * 	- under the kprobe_lock spinlock - during kprobe_[un]register()
> > + * 				OR
> > + * 	- under an rcu_read_lock() - from arch/xxx/kernel/kprobes.c
> 
> Same here -- shouldn't the rcu_read_lock() be preempt_disable()?

This is the same routine referred to in the comment above.

> > + */
> >  struct kprobe __kprobes *get_kprobe(void *addr)
> >  {
> >  	struct hlist_head *head;
> >  	struct hlist_node *node;
> >  
> >  	head = &kprobe_table[hash_ptr(addr, KPROBE_HASH_BITS)];
> > -	hlist_for_each(node, head) {
> > +	hlist_for_each_rcu(node, head) {
> >  		struct kprobe *p = hlist_entry(node, struct kprobe, hlist);
> >  		if (p->addr == addr)
> >  			return p;
> > @@ -221,7 +190,7 @@ static int __kprobes aggr_pre_handler(st
> >  {
> >  	struct kprobe *kp;
> >  
> > -	list_for_each_entry(kp, &p->list, list) {
> > +	list_for_each_entry_rcu(kp, &p->list, list) {
> >  		if (kp->pre_handler) {
> >  			set_kprobe_instance(kp);
> >  			if (kp->pre_handler(kp, regs))
> > @@ -237,7 +206,7 @@ static void __kprobes aggr_post_handler(
> >  {
> >  	struct kprobe *kp;
> >  
> > -	list_for_each_entry(kp, &p->list, list) {
> > +	list_for_each_entry_rcu(kp, &p->list, list) {
> >  		if (kp->post_handler) {
> >  			set_kprobe_instance(kp);
> >  			kp->post_handler(kp, regs, flags);
> > @@ -276,6 +245,7 @@ static int __kprobes aggr_break_handler(
> >  	return ret;
> >  }
> >  
> > +/* Called with kretprobe_lock held */
> >  struct kretprobe_instance __kprobes *get_free_rp_inst(struct kretprobe *rp)
> >  {
> >  	struct hlist_node *node;
> > @@ -285,6 +255,7 @@ struct kretprobe_instance __kprobes *get
> >  	return NULL;
> >  }
> >  
> > +/* Called with kretprobe_lock held */
> >  static struct kretprobe_instance __kprobes *get_used_rp_inst(struct kretprobe
> >  							      *rp)
> >  {
> > @@ -295,6 +266,7 @@ static struct kretprobe_instance __kprob
> >  	return NULL;
> >  }
> >  
> > +/* Called with kretprobe_lock held */
> >  void __kprobes add_rp_inst(struct kretprobe_instance *ri)
> >  {
> >  	/*
> > @@ -313,6 +285,7 @@ void __kprobes add_rp_inst(struct kretpr
> >  	hlist_add_head(&ri->uflist, &ri->rp->used_instances);
> >  }
> >  
> > +/* Called with kretprobe_lock held */
> >  void __kprobes recycle_rp_inst(struct kretprobe_instance *ri)
> >  {
> >  	/* remove rp inst off the rprobe_inst_table */
> > @@ -346,13 +319,13 @@ void __kprobes kprobe_flush_task(struct 
> >  	struct hlist_node *node, *tmp;
> >  	unsigned long flags = 0;
> >  
> > -	spin_lock_irqsave(&kprobe_lock, flags);
> > +	spin_lock_irqsave(&kretprobe_lock, flags);
> >          head = kretprobe_inst_table_head(current);
> >          hlist_for_each_entry_safe(ri, node, tmp, head, hlist) {
> >                  if (ri->task == tk)
> >                          recycle_rp_inst(ri);
> >          }
> > -	spin_unlock_irqrestore(&kprobe_lock, flags);
> > +	spin_unlock_irqrestore(&kretprobe_lock, flags);
> >  }
> >  
> >  /*
> > @@ -363,9 +336,12 @@ static int __kprobes pre_handler_kretpro
> >  					   struct pt_regs *regs)
> >  {
> >  	struct kretprobe *rp = container_of(p, struct kretprobe, kp);
> > +	unsigned long flags = 0;
> >  
> >  	/*TODO: consider to only swap the RA after the last pre_handler fired */
> > +	spin_lock_irqsave(&kretprobe_lock, flags);
> >  	arch_prepare_kretprobe(rp, regs);
> > +	spin_unlock_irqrestore(&kretprobe_lock, flags);
> >  	return 0;
> >  }
> >  
> > @@ -396,13 +372,13 @@ static int __kprobes add_new_kprobe(stru
> >          struct kprobe *kp;
> >  
> >  	if (p->break_handler) {
> > -		list_for_each_entry(kp, &old_p->list, list) {
> > +		list_for_each_entry_rcu(kp, &old_p->list, list) {
> >  			if (kp->break_handler)
> >  				return -EEXIST;
> >  		}
> > -		list_add_tail(&p->list, &old_p->list);
> > +		list_add_tail_rcu(&p->list, &old_p->list);
> >  	} else
> > -		list_add(&p->list, &old_p->list);
> > +		list_add_rcu(&p->list, &old_p->list);
> >  	return 0;
> >  }
> >  
> > @@ -420,18 +396,18 @@ static inline void add_aggr_kprobe(struc
> >  	ap->break_handler = aggr_break_handler;
> >  
> >  	INIT_LIST_HEAD(&ap->list);
> > -	list_add(&p->list, &ap->list);
> > +	list_add_rcu(&p->list, &ap->list);
> >  
> >  	INIT_HLIST_NODE(&ap->hlist);
> > -	hlist_del(&p->hlist);
> > -	hlist_add_head(&ap->hlist,
> > +	hlist_del_rcu(&p->hlist);
> > +	hlist_add_head_rcu(&ap->hlist,
> >  		&kprobe_table[hash_ptr(ap->addr, KPROBE_HASH_BITS)]);
> >  }
> >  
> >  /*
> >   * This is the second or subsequent kprobe at the address - handle
> >   * the intricacies
> > - * TODO: Move kcalloc outside the spinlock
> > + * TODO: Move kcalloc outside the spin_lock
> >   */
> >  static int __kprobes register_aggr_kprobe(struct kprobe *old_p,
> >  					  struct kprobe *p)
> > @@ -457,7 +433,7 @@ static int __kprobes register_aggr_kprob
> >  static inline void cleanup_kprobe(struct kprobe *p, unsigned long flags)
> >  {
> >  	arch_disarm_kprobe(p);
> > -	hlist_del(&p->hlist);
> > +	hlist_del_rcu(&p->hlist);
> >  	spin_unlock_irqrestore(&kprobe_lock, flags);
> >  	arch_remove_kprobe(p);
> >  }
> > @@ -465,11 +441,10 @@ static inline void cleanup_kprobe(struct
> >  static inline void cleanup_aggr_kprobe(struct kprobe *old_p,
> >  		struct kprobe *p, unsigned long flags)
> >  {
> > -	list_del(&p->list);
> > -	if (list_empty(&old_p->list)) {
> > +	list_del_rcu(&p->list);
> > +	if (list_empty(&old_p->list))
> >  		cleanup_kprobe(old_p, flags);
> > -		kfree(old_p);
> > -	} else
> > +	else
> >  		spin_unlock_irqrestore(&kprobe_lock, flags);
> >  }
> >  
> > @@ -492,9 +467,9 @@ int __kprobes register_kprobe(struct kpr
> >  	if ((ret = arch_prepare_kprobe(p)) != 0)
> >  		goto rm_kprobe;
> >  
> > +	p->nmissed = 0;
> >  	spin_lock_irqsave(&kprobe_lock, flags);
> >  	old_p = get_kprobe(p->addr);
> > -	p->nmissed = 0;
> >  	if (old_p) {
> >  		ret = register_aggr_kprobe(old_p, p);
> >  		goto out;
> > @@ -502,7 +477,7 @@ int __kprobes register_kprobe(struct kpr
> >  
> >  	arch_copy_kprobe(p);
> >  	INIT_HLIST_NODE(&p->hlist);
> > -	hlist_add_head(&p->hlist,
> > +	hlist_add_head_rcu(&p->hlist,
> >  		       &kprobe_table[hash_ptr(p->addr, KPROBE_HASH_BITS)]);
> >  
> >    	arch_arm_kprobe(p);
> > @@ -523,10 +498,16 @@ void __kprobes unregister_kprobe(struct 
> >  	spin_lock_irqsave(&kprobe_lock, flags);
> >  	old_p = get_kprobe(p->addr);
> >  	if (old_p) {
> > +		/* cleanup_*_kprobe() does the spin_unlock_irqrestore */
> >  		if (old_p->pre_handler == aggr_pre_handler)
> >  			cleanup_aggr_kprobe(old_p, p, flags);
> >  		else
> >  			cleanup_kprobe(p, flags);
> > +
> > +		synchronize_sched();
> > +		if (old_p->pre_handler == aggr_pre_handler &&
> > +				list_empty(&old_p->list))
> > +			kfree(old_p);
> >  	} else
> >  		spin_unlock_irqrestore(&kprobe_lock, flags);
> >  }
> > @@ -603,13 +584,13 @@ void __kprobes unregister_kretprobe(stru
> >  
> >  	unregister_kprobe(&rp->kp);
> >  	/* No race here */
> > -	spin_lock_irqsave(&kprobe_lock, flags);
> > +	spin_lock_irqsave(&kretprobe_lock, flags);
> >  	free_rp_inst(rp);
> >  	while ((ri = get_used_rp_inst(rp)) != NULL) {
> >  		ri->rp = NULL;
> >  		hlist_del(&ri->uflist);
> >  	}
> > -	spin_unlock_irqrestore(&kprobe_lock, flags);
> > +	spin_unlock_irqrestore(&kretprobe_lock, flags);
> >  }
> >  
> >  static int __init init_kprobes(void)
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> > 
> > 

  reply	other threads:[~2005-10-18 14:44 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-10-10 14:37 [PATCH 0/9] Kprobes: scalability enhancements Ananth N Mavinakayanahalli
2005-10-10 14:39 ` [PATCH 1/9] Kprobes: rearrange preempt_disable/enable() calls Ananth N Mavinakayanahalli
2005-10-10 14:41   ` [PATCH 2/9] Kprobes: Track kprobe on a per_cpu basis - base changes Ananth N Mavinakayanahalli
2005-10-10 14:42     ` [PATCH 3/9] Kprobes: Track kprobe on a per_cpu basis - i386 changes Ananth N Mavinakayanahalli
2005-10-10 14:42       ` [PATCH 4/9] Kprobes: Track kprobe on a per_cpu basis - ia64 changes Ananth N Mavinakayanahalli
2005-10-10 14:43         ` [PATCH 5/9] Kprobes: Track kprobe on a per_cpu basis - ppc64 changes Ananth N Mavinakayanahalli
2005-10-10 14:44           ` [PATCH 6/9] Kprobes: Track kprobe on a per_cpu basis - sparc64 changes Ananth N Mavinakayanahalli
2005-10-10 14:45             ` [PATCH 7/9] Kprobes: Track kprobe on a per_cpu basis - x86_64 changes Ananth N Mavinakayanahalli
2005-10-10 14:47               ` [PATCH 8/9] Kprobes: Use RCU for (un)register synchronization - base changes Ananth N Mavinakayanahalli
2005-10-10 14:48                 ` [PATCH 9/9] Kprobes: Use RCU for (un)register synchronization - arch changes Ananth N Mavinakayanahalli
2005-10-18  5:49                   ` Paul E. McKenney
2005-10-18 14:45                     ` Ananth N Mavinakayanahalli
2005-10-18 16:35                       ` Paul E. McKenney
2005-10-18  5:44                 ` [PATCH 8/9] Kprobes: Use RCU for (un)register synchronization - base changes Paul E. McKenney
2005-10-18 14:43                   ` Ananth N Mavinakayanahalli [this message]
2005-10-18 16:31                     ` Paul E. McKenney
2005-10-18 17:09                       ` Ananth N Mavinakayanahalli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20051018144323.GA4479@in.ibm.com \
    --to=ananth@in.ibm.com \
    --cc=akpm@osdl.org \
    --cc=anil.s.keshavamurthy@intel.com \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@us.ibm.com \
    --cc=prasanna@in.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.