From: Andrew Morton <akpm@linux-foundation.org>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@elte.hu>,
Linus Torvalds <torvalds@linux-foundation.org>,
Masami Hiramatsu <mhiramat@redhat.com>,
Mel Gorman <mel@csn.ul.ie>,
Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
Jim Keniston <jkenisto@linux.vnet.ibm.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
"Frank Ch. Eigler" <fche@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Randy Dunlap <randy.dunlap@oracle.com>
Subject: Re: [PATCH v1 4/10] User Space Breakpoint Assistance Layer
Date: Mon, 22 Mar 2010 21:40:09 -0400
Message-ID: <20100322214009.bb90d6e2.akpm@linux-foundation.org>
In-Reply-To: <20100320142541.11427.98291.sendpatchset@localhost6.localdomain6>
On Sat, 20 Mar 2010 19:55:41 +0530 Srikar Dronamraju <srikar@linux.vnet.ibm.com> wrote:
> User Space Breakpoint Assistance Layer (USER_BKPT)
>
A quick scan, just to show I was paying attention ;)
>
> ...
>
> +/*
> + * Read @nbytes at @vaddr from @tsk into @kbuf. Return number of bytes read.
> + * Not exported, but available for use by arch-specific user_bkpt code.
> + */
You may as well finish off the kerneldoc on some of these functions.
> +int user_bkpt_read_vm(struct task_struct *tsk, unsigned long vaddr,
> + void *kbuf, int nbytes)
> +{
> + if (tsk == current) {
> + int nleft = copy_from_user(kbuf, (void __user *) vaddr,
> + nbytes);
> + return nbytes - nleft;
> + } else
> + return access_process_vm(tsk, vaddr, kbuf, nbytes, 0);
> +}
copy_from_user() takes and returns an unsigned long arg but this
function is converting these to and from ints. That's OK if we're 100%
sure that we'll never get or return an arg >2G. Otherwise things could
get ghastly. Please have a think. (Dittoes for some other functions
around here).
> +
> + * Write @nbytes from @kbuf at @vaddr in @tsk. Return number of bytes written.
> + * Can be used to write to stack or data VM areas, but not instructions.
> + * Not exported, but available for use by arch-specific user_bkpt code.
> + */
> +int user_bkpt_write_data(struct task_struct *tsk, unsigned long vaddr,
> + const void *kbuf, int nbytes)
> +{
> + int nleft;
> +
> + if (tsk == current) {
> + nleft = copy_to_user((void __user *) vaddr, kbuf, nbytes);
> + return nbytes - nleft;
> + } else
> + return access_process_vm(tsk, vaddr, (void *) kbuf,
> + nbytes, 1);
> +}
This looks like it has the wrong interface. It should take a `void
__user *vaddr'. If any casting is to be done, it should be done at the
highest level so that sparse can check that the thing is used correctly
in as many places as possible.
> +/*
> + * Given an address, get its pte. Very similar to get_locked_pte except
> + * there is no spinlock involved.
> + */
> +static inline pte_t *get_pte(struct mm_struct *mm, unsigned long vaddr)
The inline is either unneeded or undesirable, I suspect.
> +{
> + pgd_t *pgd;
> + pud_t *pud;
> + pmd_t *pmd;
> + pte_t *pte;
> +
> + pgd = pgd_offset(mm, vaddr);
> + pud = pud_alloc(mm, pgd, vaddr);
> + if (!pud)
> + return NULL;
> + pmd = pmd_alloc(mm, pud, vaddr);
> + if (!pmd)
> + return NULL;
> + pte = pte_alloc_map(mm, pmd, vaddr);
> + return pte;
> +}
> +
> +static int write_opcode(struct task_struct *tsk, unsigned long vaddr,
> + user_bkpt_opcode_t opcode)
> +{
> + struct mm_struct *mm;
> + struct vm_area_struct *vma;
> + struct page *old_page, *new_page;
> + void *maddr;
> + pte_t *old_pte;
> + int offset;
> + int ret = -EINVAL;
> +
> + if (!tsk)
> + return ret;
> +
> + mm = get_task_mm(tsk);
> + if (!mm)
> + return ret;
> +
> + down_read(&mm->mmap_sem);
> +
> + /* Read the page with vaddr into memory */
> + ret = get_user_pages(tsk, mm, vaddr, 1, 0, 1, &old_page, &vma);
> + if (ret <= 0) {
> + up_read(&mm->mmap_sem);
> + goto mmput_out;
> + }
> +
> + /* Allocate a page and copy contents over from the old page */
> + new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vaddr);
It might be smarter to allocate this page outside the mmap_sem region.
More scalable, less opportunity for weird deadlocks.
> + if (!new_page) {
> + ret = -ENOMEM;
> + goto unlock_out;
> + }
> +
> + /*
> + * check if the page we are interested is read-only mapped
> + * Since we are interested in text pages, Our pages of interest
> + * should be mapped read-only.
> + */
> + if ((vma->vm_flags && (VM_READ|VM_WRITE)) != VM_READ) {
> + ret = -EINVAL;
> + goto unlock_out;
> + }
> +
> + copy_user_highpage(new_page, old_page, vaddr, vma);
> + maddr = kmap(new_page);
> + offset = vaddr & (PAGE_SIZE - 1);
> + memcpy(maddr + offset, &opcode, user_bkpt_opcode_sz);
> + kunmap(new_page);
kmap_atomic() is preferred - it's faster. kmap() is still deadlockable
I guess if a process ever kmaps two pages at the same time. This code
doesn't do that, but kmap is still a bit sucky.
> + old_pte = get_pte(mm, vaddr);
> + if (!old_pte) {
> + ret = -EINVAL;
> + goto unlock_out;
> + }
> +
> + /*
> + * Now replace the page in vma with the new page.
> + * This is a text page; so, setup mapping and index.
> + */
> + new_page->mapping = old_page->mapping;
> + new_page->index = old_page->index;
> + ret = replace_page(vma, old_page, new_page, *old_pte);
> +
> +unlock_out:
> + if (ret != 0)
> + page_cache_release(new_page);
> + put_page(old_page); /* we did a get_page in the beginning */
> + up_read(&mm->mmap_sem);
> +mmput_out:
> + mmput(mm);
> + return ret;
> +}
> +
>
> ...
>
> +/**
> + * user_bkpt_validate_insn_addr - Validate if the instruction is an
> + * executable vma.
It used to be the case that the above linebreak was "wrong". (Nobody
ever tests their kerneldoc output!) I have a vague feeling that this
might have been fixed lately. Randy?
> + * Returns 0 if the vaddr is a valid instruction address.
> + * @tsk: the probed task
> + * @vaddr: virtual address of the instruction to be verified.
> + *
> + * Possible errors:
> + * -%EINVAL: Instruction passed is not a valid instruction address.
> + */
> +int user_bkpt_validate_insn_addr(struct task_struct *tsk, unsigned long vaddr)
> +{
> + int result;
> +
> + result = check_vma(tsk, vaddr);
> + if (result != 0)
> + return result;
> + if (arch->validate_address)
> + result = arch->validate_address(tsk, vaddr);
> + return result;
> +}
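For reference, kernel-doc wants the one-line summary kept on the same line as the "name -" stanza, with the @argument descriptions immediately after it and the body text below a blank line. A cleaned-up version of the header above might read (this is a suggested layout, not the patch author's text):

```c
/**
 * user_bkpt_validate_insn_addr - validate that @vaddr is an executable address
 * @tsk: the probed task
 * @vaddr: virtual address of the instruction to be verified
 *
 * Returns 0 if @vaddr is a valid instruction address.
 *
 * Possible errors:
 * -%EINVAL: @vaddr is not a valid instruction address.
 */
```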
> +
>
> ...
>
> +int user_bkpt_insert_bkpt(struct task_struct *tsk, struct user_bkpt *user_bkpt)
> +{
> + int result, len;
> +
> + BUG_ON(!tsk || !user_bkpt);
If this BUG_ON triggers, we won't know which of these pointers was NULL,
which makes us sad.
> + if (!validate_strategy(user_bkpt->strategy, USER_BKPT_HNT_MASK))
> + return -EINVAL;
> +
>
> ...
>
> +int user_bkpt_pre_sstep(struct task_struct *tsk, struct user_bkpt *user_bkpt,
> + struct user_bkpt_task_arch_info *tskinfo, struct pt_regs *regs)
> +{
> + int result;
> +
> + BUG_ON(!tsk || !user_bkpt || !regs);
ditto.
Really, there's never much point in
BUG_ON(!some_pointer);
Just go ahead and dereference the pointer. If it's NULL then we'll get
an oops which gives all the information which the BUG_ON would have
given.
> + if (uses_xol_strategy(user_bkpt->strategy)) {
> + BUG_ON(!user_bkpt->xol_vaddr);
> + return arch->pre_xol(tsk, user_bkpt, tskinfo, regs);
> + }
> +
> + /*
> + * Single-step this instruction inline. Replace the breakpoint
> + * with the original opcode.
> + */
> + result = arch->set_orig_insn(tsk, user_bkpt, false);
> + if (result == 0)
> + arch->set_ip(regs, user_bkpt->vaddr);
> + return result;
> +}
> +
> +/**
> + * user_bkpt_post_sstep - prepare to resume execution after single-step
> + * @tsk: the probed task
> + * @user_bkpt: the probepoint information, as with @user_bkpt_pre_sstep()
> + * @tskinfo: the @user_bkpt_task_arch_info object, if any, passed to
> + * @user_bkpt_pre_sstep()
> + * @regs: reflects the saved state of @tsk after the single-step
> + * operation. @user_bkpt_post_sstep() adjusts @tsk's state as needed,
> + * including pointing the instruction pointer at the instruction
> + * following the probed instruction.
> + * Possible errors:
> + * -%EFAULT: Failed to read or write @tsk's address space as needed.
> + */
> +int user_bkpt_post_sstep(struct task_struct *tsk, struct user_bkpt *user_bkpt,
> + struct user_bkpt_task_arch_info *tskinfo, struct pt_regs *regs)
> +{
> + BUG_ON(!tsk || !user_bkpt || !regs);
More dittoes. It's just bloat.
> + if (uses_xol_strategy(user_bkpt->strategy))
> + return arch->post_xol(tsk, user_bkpt, tskinfo, regs);
> +
> + /*
> + * Single-stepped this instruction inline. Put the breakpoint
> + * instruction back.
> + */
> + return arch->set_bkpt(tsk, user_bkpt);
> +}
> +
>
> ...
>