From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: mingo@redhat.com, a.p.zijlstra@chello.nl,
	torvalds@linux-foundation.org, peterz@infradead.org,
	anton@redhat.com, rostedt@goodmis.org, tglx@linutronix.de,
	oleg@redhat.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, hpa@zytor.com, jkenisto@us.ibm.com,
	andi@firstfloor.org, hch@infradead.org, ananth@in.ibm.com,
	vda.linux@googlemail.com, masami.hiramatsu.pt@hitachi.com,
	acme@infradead.org, sfr@canb.auug.org.au, roland@hack.frob.com,
	mingo@elte.hu, linux-tip-commits@vger.kernel.org
Subject: Re: [tip:perf/uprobes] uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints
Date: Tue, 22 May 2012 13:35:13 +0530
Message-ID: <20120522080513.GC10829@linux.vnet.ibm.com>
In-Reply-To: <20120521143701.74ab2d0b.akpm@linux-foundation.org>

> 
> static void unmap_single_vma(struct mmu_gather *tlb,
> 		struct vm_area_struct *vma, unsigned long start_addr,
> 		unsigned long end_addr,
> 		struct zap_details *details)
> {
> 	unsigned long start = max(vma->vm_start, start_addr);
> 	unsigned long end;
> 
> 	if (start >= vma->vm_end)
> 		return;
> 	end = min(vma->vm_end, end_addr);
> 	if (end <= vma->vm_start)
> 		return;
> 
> <<<<<<< HEAD
> =======
> 	if (vma->vm_file)
> 		uprobe_munmap(vma, start, end);
> 
> 	if (vma->vm_flags & VM_ACCOUNT)
> 		*nr_accounted += (end - start) >> PAGE_SHIFT;
> 
> >>>>>>> linux-next/akpm-base
> 	if (unlikely(is_pfn_mapping(vma)))
> 		untrack_pfn_vma(vma, 0, 0);
> 
> 
> It made me look at uprobes.  Noticed a few things...
> 

I explained why I had to add a callback in unmap_single_vma in my
reply to Linus.

> > ...
> >
> > +static struct rb_root uprobes_tree = RB_ROOT;
> > +static DEFINE_SPINLOCK(uprobes_treelock);	/* serialize rbtree access */
> > +
> > +#define UPROBES_HASH_SZ	13
> > +/* serialize (un)register */
> > +static struct mutex uprobes_mutex[UPROBES_HASH_SZ];
> > +#define uprobes_hash(v)	(&uprobes_mutex[((unsigned long)(v)) %\
> > +						UPROBES_HASH_SZ])
> > +
> > +/* serialize uprobe->pending_list */
> > +static struct mutex uprobes_mmap_mutex[UPROBES_HASH_SZ];
> > +#define uprobes_mmap_hash(v)	(&uprobes_mmap_mutex[((unsigned long)(v)) %\
> > +						UPROBES_HASH_SZ])
> 
> Presumably these locks were hashed for scalability reasons?

Yes.

uprobes_mmap_mutex is taken on every mmap/munmap operation.
Since we do a per-file operation for every mm operation (walk through the
rmap and insert/remove breakpoints), we looked at using i_mutex. However,
Christoph wasn't happy about overloading i_mutex. He suggested two
options:
1. adding another mutex to the inode structure
2. adding global hash locks (which he recommended)

Adding a mutex to the inode structure is overkill.
But having just one mutex to guard every uprobe_mmap would cause
contention between unrelated mmaps.  So we settled on hashed mutexes.

> 
> If so, this won't be terribly effective when we have multiple mutexes
> occupying a single cacheline - the array entries should be padded out.
> Of course, that's all a complete waste of space on uniprocessor
> machines, but nobody seems to think of that any more ;(
> 

Okay, I agree that having each mutex in a different cacheline helps.
If everyone agrees, I will send an add-on patch that moves each mutex
to its own cacheline.
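
To illustrate the padding being discussed, here is a minimal userspace
sketch, not the kernel patch itself. It assumes a 64-byte cacheline and
uses pthread mutexes in place of kernel mutexes; CACHELINE_BYTES,
struct padded_mutex, mmap_locks and mmap_lock_for() are hypothetical
names introduced only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define UPROBES_HASH_SZ 13
#define CACHELINE_BYTES 64	/* assumption: 64-byte cachelines */

/* Aligning the member forces each array entry onto its own cacheline. */
struct padded_mutex {
	alignas(CACHELINE_BYTES) pthread_mutex_t lock;
};

/* The compiler rounds the struct size up to a multiple of its alignment. */
static_assert(sizeof(struct padded_mutex) % CACHELINE_BYTES == 0,
	      "each entry must cover whole cachelines");

static struct padded_mutex mmap_locks[UPROBES_HASH_SZ];

/* Initialize every hashed lock once, before first use. */
static void mmap_locks_init(void)
{
	for (size_t i = 0; i < UPROBES_HASH_SZ; i++)
		pthread_mutex_init(&mmap_locks[i].lock, NULL);
}

/* Pick a lock from the key's address; the same key always gets the same lock. */
static pthread_mutex_t *mmap_lock_for(const void *key)
{
	return &mmap_locks[(uintptr_t)key % UPROBES_HASH_SZ].lock;
}
```

The cost is the one Andrew points out: the padding bytes buy nothing on
a uniprocessor build.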

> There was no need to code the accessor functions as macros.  It is, as
> always, better to use a nice C function which takes an argument which
> is as strictly typed as possible.  ie, it *could* take a void*, but it
> would be better if it required an inode*.
> 

I will add this change as part of the add-on patch.
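
A rough userspace sketch of what a typed accessor could look like in
place of the macro. struct inode_stub is a hypothetical stand-in for the
kernel's struct inode, and pthread mutexes stand in for kernel mutexes;
the real add-on patch may look different.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define UPROBES_HASH_SZ 13

/* Hypothetical stand-in for the kernel's struct inode. */
struct inode_stub {
	unsigned long i_ino;
};

static pthread_mutex_t uprobes_mmap_mutex[UPROBES_HASH_SZ];

/* Initialize the hashed mutexes once, before any accessor result is locked. */
static void uprobes_mmap_mutex_init(void)
{
	for (size_t i = 0; i < UPROBES_HASH_SZ; i++)
		pthread_mutex_init(&uprobes_mmap_mutex[i], NULL);
}

/*
 * Typed accessor: unlike the macro, the compiler diagnoses a call that
 * passes anything other than a struct inode_stub pointer.
 */
static inline pthread_mutex_t *uprobes_mmap_hash(struct inode_stub *inode)
{
	assert(inode != NULL);
	return &uprobes_mmap_mutex[(uintptr_t)inode % UPROBES_HASH_SZ];
}
```

With the macro version, uprobes_mmap_hash(vma) would compile silently
and hash the wrong object; with the function it draws a diagnostic.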

> >
> > ...
> >
> > +static int read_opcode(struct mm_struct *mm, unsigned long vaddr,
> > +						uprobe_opcode_t *opcode)
> > +{

[.....]

> > +	vaddr_new = kmap_atomic(page);
> > +	vaddr &= ~PAGE_MASK;
> > +	memcpy(opcode, vaddr_new + vaddr, uprobe_opcode_sz);
> > +	kunmap_atomic(vaddr_new);
> 
> This is modifying user memory?  flush_dcache_page() needed?  Or perhaps
> we will need different primitives to diddle the instruction memory on
> architectures which care.
> 

Here we are only reading from user memory.
The part where we insert/remove the breakpoint (write_opcode) does the flush.
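
For readers following the quoted read_opcode() lines, the address
arithmetic can be sketched in userspace as below. This assumes 4 KiB
pages; PAGE_SIZE_SKETCH, PAGE_MASK_SKETCH and read_from_page() are
hypothetical names, and a plain buffer stands in for the page that
kmap_atomic() maps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE_SKETCH 4096UL			/* assumption: 4 KiB pages */
#define PAGE_MASK_SKETCH (~(PAGE_SIZE_SKETCH - 1))	/* keeps the page-number bits */

/*
 * Copy len bytes located at virtual address vaddr out of the buffer that
 * holds the containing page. Only reads; the page is never modified.
 */
static void read_from_page(const uint8_t *page, unsigned long vaddr,
			   void *out, size_t len)
{
	/* ~PAGE_MASK keeps only the low bits, i.e. the offset inside the page. */
	unsigned long offset = vaddr & ~PAGE_MASK_SKETCH;

	/* The copy must not run past the end of the page. */
	assert(offset + len <= PAGE_SIZE_SKETCH);
	memcpy(out, page + offset, len);
}
```

Since the source page is only read, nothing here calls for a cache
flush; that concern applies to the write path.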

> > +int mmap_uprobe(struct vm_area_struct *vma)
> > +{
> > +	struct list_head tmp_list;
> > +	struct uprobe *uprobe, *u;
> > +	struct inode *inode;
> > +	int ret = 0;
> > +
> > +	if (!atomic_read(&uprobe_events) || !valid_vma(vma, true))
> > +		return ret;	/* Bail-out */
> > +
> > +	inode = vma->vm_file->f_mapping->host;
> > +	if (!inode)
> > +		return ret;
> > +
> > +	INIT_LIST_HEAD(&tmp_list);
> > +	mutex_lock(uprobes_mmap_hash(inode));
> > +	build_probe_list(inode, &tmp_list);
> > +	list_for_each_entry_safe(uprobe, u, &tmp_list, pending_list) {
> > +		loff_t vaddr;
> > +
> > +		list_del(&uprobe->pending_list);
> > +		if (!ret) {
> > +			vaddr = vma_address(vma, uprobe->offset);
> > +			if (vaddr < vma->vm_start || vaddr >= vma->vm_end) {
> > +				put_uprobe(uprobe);
> > +				continue;
> > +			}
> > +			ret = install_breakpoint(vma->vm_mm, uprobe, vma,
> > +								vaddr);
> > +			if (ret == -EEXIST)
> > +				ret = 0;
> 
> This now has the comment "Ignore double add:".  That is a poor
> comment, because it doesn't tell us *why* a double-add is ignored.
> 

We don't actually ignore the double add.

install_breakpoint() has comments on when we return EEXIST.

uprobe_mmap() has a comment on why EEXIST should be considered successful,
added as part of commit 682968e0 (uprobes/core: Optimize probe hits with the
help of a counter):

/*
 * Unable to insert a breakpoint, but
 * breakpoint lies underneath. Increment the
 * probe count
 */

i.e. install_breakpoint() needs to insert a breakpoint, but if a
breakpoint is already there, it doesn't need to do anything.

I will go ahead and remove the "Ignore double-add" comment.
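
The EEXIST handling can be modelled in userspace roughly as follows.
This is only a sketch of the idea, not the uprobes code: a byte buffer
stands in for the probed instruction, 0xcc is the x86 int3 opcode, and
install_breakpoint_sketch() and mmap_probe_sketch() are hypothetical
names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define BREAKPOINT_INSN 0xcc	/* x86 int3 */

/* Write a breakpoint at *slot; report -EEXIST if one is already there. */
static int install_breakpoint_sketch(uint8_t *slot)
{
	assert(slot != NULL);
	if (*slot == BREAKPOINT_INSN)
		return -EEXIST;
	*slot = BREAKPOINT_INSN;
	return 0;
}

/*
 * Caller side: a breakpoint that is already underneath is not an error.
 * The probe is counted in both cases, as the quoted comment describes.
 */
static int mmap_probe_sketch(uint8_t *slot, int *probe_count)
{
	int ret = install_breakpoint_sketch(slot);

	if (ret == -EEXIST)
		ret = 0;	/* nothing left to do, treat as success */
	if (!ret)
		(*probe_count)++;
	return ret;
}
```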
	
-- 
thanks and regards
Srikar
