All of lore.kernel.org
 help / color / mirror / Atom feed
From: Indu Bhagat <indu.bhagat@oracle.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-toolchains@vger.kernel.org, daandemeyer@meta.com,
	andrii@kernel.org, kris.van.hees@oracle.com,
	elena.zannoni@oracle.com, nick.alcock@oracle.com
Subject: Re: [POC 4/5] sframe: add an SFrame format stack tracer
Date: Mon, 1 May 2023 23:16:38 -0700	[thread overview]
Message-ID: <84bf4aee-e5a5-8e49-3826-ecb79eefc85b@oracle.com> (raw)
In-Reply-To: <20230501190018.24ae7704@gandalf.local.home>

On 5/1/23 16:00, Steven Rostedt wrote:
> On Mon,  1 May 2023 13:04:09 -0700
> Indu Bhagat <indu.bhagat@oracle.com> wrote:
> 
>> This patch adds an SFrame format based stack tracer.
>>
>> The files iterate_phdr.c, iterate_phdr.h implement a dl_iterate_phdr()
>> like functionality.
>>
>> The SFrame format based stack tracer is implemented in the
>> sframe_unwind.c with architecture specific bits in the
>> arch/arm64/include/asm/sframe_regs.h and
>> arch/x86/include/asm/sframe_regs.h.  Please note that the SFrame format
>> is supported for x86_64 (AMD64 ABI) and aarch64 (AAPCS64 ABI) only at
>> this time.
>>
>> The files sframe_state.[ch] implement the SFrame state management APIs.
>>
>> Some aspects of the implementation are "POC like". These will need to
>> addressed for the implementation to become more palatable:
>> - dealing with only Elf64_Phdr (no Elf32_Phdr) at this time, and other
>>    TODOs in the iterate_phdr.c,
>> - detecting whether a program did a dlopen/dlclose,
>> - code stubs around user space memory access (.sframe section, ELF hdr
>>    etc.) by the kernel need careful review.
>>
>> There are more aspects than above; The intention of this patch set is to
>> help drive the discussion on how to best incorporate an SFrame-based user
>> space unwinder in the kernel.
>>
>> Signed-off-by: Indu Bhagat <indu.bhagat@oracle.com>
>> ---
>>   arch/arm64/include/asm/sframe_regs.h |  37 +++
>>   arch/x86/include/asm/sframe_regs.h   |  34 +++
>>   include/sframe/sframe_regs.h         |  11 +
>>   include/sframe/sframe_unwind.h       |  62 ++++
>>   lib/sframe/Makefile                  |   8 +-
>>   lib/sframe/iterate_phdr.c            | 113 +++++++
>>   lib/sframe/iterate_phdr.h            |  34 +++
>>   lib/sframe/sframe_state.c            | 424 +++++++++++++++++++++++++++
>>   lib/sframe/sframe_state.h            |  80 +++++
>>   lib/sframe/sframe_unwind.c           | 208 +++++++++++++
>>   10 files changed, 1010 insertions(+), 1 deletion(-)
>>   create mode 100644 arch/arm64/include/asm/sframe_regs.h
>>   create mode 100644 arch/x86/include/asm/sframe_regs.h
>>   create mode 100644 include/sframe/sframe_regs.h
>>   create mode 100644 include/sframe/sframe_unwind.h
>>   create mode 100644 lib/sframe/iterate_phdr.c
>>   create mode 100644 lib/sframe/iterate_phdr.h
>>   create mode 100644 lib/sframe/sframe_state.c
>>   create mode 100644 lib/sframe/sframe_state.h
>>   create mode 100644 lib/sframe/sframe_unwind.c
>>
>> diff --git a/arch/arm64/include/asm/sframe_regs.h b/arch/arm64/include/asm/sframe_regs.h
>> new file mode 100644
>> index 000000000000..ae9ab9d5d3c1
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/sframe_regs.h
>> @@ -0,0 +1,37 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * Copyright (C) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#ifdef ASM_ARM64_SFRAME_REGS_H
>> +#define ASM_ARM64_SFRAME_REGS_H
>> +
>> +#define STACK_ACCESS_LEN 8
>> +
>> +static inline uint64_t
>> +get_ptregs_ip(struct pt_regs *regs)
>> +{
>> +	return regs->pc;
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_sp(struct pt_regs *regs)
>> +{
>> +	return regs->sp;
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_fp(struct pt_regs *regs)
>> +{
>> +#define UNWIND_AARCH64_X29              29      /* 64-bit frame pointer.  */
>> +	return (uint64_t)regs->regs[UNWIND_AARCH64_X29];
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_ra(struct pt_regs *regs)
>> +{
>> +#define UNWIND_AARCH64_X30              30      /* 64-bit link pointer.  */
>> +	return (uint64_t)regs->regs[UNWIND_AARCH64_X30];
>> +}
>> +
>> +#endif /* ASM_ARM64_SFRAME_REGS_H */
>> diff --git a/arch/x86/include/asm/sframe_regs.h b/arch/x86/include/asm/sframe_regs.h
>> new file mode 100644
>> index 000000000000..99f84955854a
>> --- /dev/null
>> +++ b/arch/x86/include/asm/sframe_regs.h
>> @@ -0,0 +1,34 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * Copyright (C) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#ifndef ASM_X86_SFRAME_REGS_H
>> +#define ASM_X86_SFRAME_REGS_H
>> +
>> +#define STACK_ACCESS_LEN 8
>> +
>> +static inline uint64_t
>> +get_ptregs_ip(struct pt_regs *regs)
>> +{
>> +	return (uint64_t)regs->ip;
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_sp(struct pt_regs *regs)
>> +{
>> +	return (uint64_t)regs->sp;
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_fp(struct pt_regs *regs)
>> +{
>> +	return (uint64_t)regs->bp;
>> +}
>> +
>> +static inline uint64_t
>> +get_ptregs_ra(struct pt_regs *regs)
>> +{
>> +	return 0; /* SFRAME_CFA_FIXED_RA_INVALID */
>> +}
>> +#endif /* ASM_X86_SFRAME_REGS_H */
>> diff --git a/include/sframe/sframe_regs.h b/include/sframe/sframe_regs.h
>> new file mode 100644
>> index 000000000000..32b67f7a7c78
>> --- /dev/null
>> +++ b/include/sframe/sframe_regs.h
>> @@ -0,0 +1,11 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * Copyright (C) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#ifndef _SFRAME_REGS_H
>> +#define _SFRAME_REGS_H
>> +
>> +#include <asm/sframe_regs.h>
>> +
>> +#endif /* _SFRAME_REGS_H */
>> diff --git a/include/sframe/sframe_unwind.h b/include/sframe/sframe_unwind.h
>> new file mode 100644
>> index 000000000000..3e2c12816b60
>> --- /dev/null
>> +++ b/include/sframe/sframe_unwind.h
> 
> Also, these should probably go into include/linux, Unless there's going to
> be a lot more header files.
> 

I'd expect at most the current headers:
include/sframe/sframe_unwind.h
include/sframe/sframe_regs.h

And perhaps one more, for a callchain format and callbacks suggested below ?

>> @@ -0,0 +1,62 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * Copyright (C) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#ifndef _SFRAME_UNWIND_H
>> +#define _SFRAME_UNWIND_H
>> +
>> +#include <linux/sched.h>
>> +#include <linux/perf_event.h>
>> +
>> +#define PT_GNU_SFRAME  0x6474e554
>> +
>> +/*
>> + * State used for SFrame-based stack tracing for a user space task.
>> + */
>> +struct user_unwind_state {
>> +	uint64_t pc, sp, fp, ra;
> 
> I know this is POC, but please make each structure field a separate item.
> Also, should be tab delimited.
> 

OK.

>> +	enum stack_type stype;
>> +	struct task_struct *task;
>> +	bool error;
>> +};
> 
> Also swap the task and the stype, as the pointer to the task will create a
> hole in the structure.
> 
> struct user_unwind_state {
> 	uint64_t		pc;
> 	uint64_t		sp;
> 	uint64_t		fp;
> 	uint64_t		ra;
> 	struct task_stuct	*task;
> 	enum stack_type		stype;
> 	bool			error;
> };
> 

OK.

>> +
>> +/*
>> + * APIs for an SFrame based stack tracer.
>> + */
>> +
>> +void sframe_unwind_start(struct user_unwind_state *state,
>> +			 struct task_struct *task, struct pt_regs *regs);
>> +bool sframe_unwind_next_frame(struct user_unwind_state *state);
>> +uint64_t sframe_unwind_get_return_address(struct user_unwind_state *state);
>> +
>> +static inline bool sframe_unwind_done(struct user_unwind_state *state)
>> +{
>> +	return state->stype == STACK_TYPE_UNKNOWN;
>> +}
>> +
>> +static inline bool sframe_unwind_error(struct user_unwind_state *state)
>> +{
>> +	return state->error;
>> +}
>> +
>> +/*
>> + * APIs to manage the SFrame state per task for stack tracing.
>> + */
>> +
>> +extern struct sframe_state *unwind_sframe_state_alloc(struct task_struct *task);
>> +extern int unwind_sframe_state_update(struct task_struct *task);
>> +extern void unwind_sframe_state_cleanup(struct task_struct *task);
>> +
>> +extern bool unwind_sframe_state_valid_p(struct sframe_state *sfstate);
>> +extern bool unwind_sframe_state_ready_p(struct sframe_state *sftate);
>> +
>> +/*
>> + * Get the callchain using SFrame unwind info for the given task.
>> + */
>> +extern int
>> +sframe_callchain_user(struct task_struct *task,
>> +		      struct perf_callchain_entry_ctx *entry,
>> +		      struct pt_regs *regs);
> 
> 
> I plan on using this without any perf involvement, I'd like to keep perf
> separate from the sframe logic. As I mentioned in a previous email, I
> expect sframe to have callbacks. So the callchain format should be defined
> by sframe, and not reuse perf.
> 

I will think about this. Do you have some model of the expected 
callbacks for me to explore ?

>> +
>> +#endif /* _SFRAME_UNWIND_H */
>> diff --git a/lib/sframe/Makefile b/lib/sframe/Makefile
>> index 4e4291d9294f..5ee9e3e7ec93 100644
>> --- a/lib/sframe/Makefile
>> +++ b/lib/sframe/Makefile
>> @@ -1,5 +1,11 @@
>>   # SPDX-License-Identifier: GPL-2.0
>>   ##################################
>> -obj-$(CONFIG_USER_UNWINDER_SFRAME) += sframe_read.o \
>> +obj-$(CONFIG_USER_UNWINDER_SFRAME) += iterate_phdr.o \
>> +				      sframe_read.o \
>> +				      sframe_state.o \
>> +				      sframe_unwind.o
> 
> Ah, the backslash is fixed here.
> 

Yes, it was a rebase thing. It got missed when moving code between patches.

>>   
>> +CFLAGS_iterate_phdr.o += -I $(srctree)/lib/sframe/ -Wno-error=declaration-after-statement
>>   CFLAGS_sframe_read.o += -I $(srctree)/lib/sframe/
>> +CFLAGS_sframe_state.o += -I $(srctree)/lib/sframe/
>> +CFLAGS_sframe_unwind.o += -I $(srctree)/lib/sframe/
>> diff --git a/lib/sframe/iterate_phdr.c b/lib/sframe/iterate_phdr.c
>> new file mode 100644
>> index 000000000000..c10d590ecc67
>> --- /dev/null
>> +++ b/lib/sframe/iterate_phdr.c
>> @@ -0,0 +1,113 @@
>> +// SPDX-License-Identifier: GPL-2.0-or-later
>> +/*
>> + * Copyright (C) 2023, Oracle and/or its affiliates.
>> + */
>> +
>> +#include <linux/elf.h>
>> +#include <linux/mm.h>
>> +#include <linux/vmalloc.h>
>> +#include <linux/mm_types.h>
>> +
>> +#include "iterate_phdr.h"
>> +
>> +/*
>> + * Iterate over the task's memory mappings and find the ELF headers.
>> + *
>> + * This is expected to be called from perf_callchain_user(), so user process
>> + * context is expected.
> 
> My thought is that this will be called in the ptrace path (not the perf
> path), so yes, it will be in user process context.
> 
>> + */
>> +
>> +int iterate_phdr(int (*callback)(struct phdr_info *info,
>> +				 struct task_struct *task,
>> +				 void *data),
>> +		 struct task_struct *task, void *data)
>> +{
>> +	struct mm_struct *mm;
>> +	struct vm_area_struct *vma_mt;
>> +	struct page *page;
>> +
>> +	Elf64_Ehdr *ehdr;
>> +	struct phdr_info phinfo;
>> +
>> +	int ret = 0, res = 0;
>> +	int err = 0;
>> +	bool first = true;
>> +
>> +	memset(&phinfo, 0, sizeof(struct phdr_info));
>> +
>> +	mm = task->mm;
>> +
>> +	MA_STATE(mas, &mm->mm_mt, 0, 0);
>> +
> 
> So this is the code I want to discuss at LSFMM :-) As there will be more
> experts about this than what I know.
> 
> Let me go and start making the infrastructure to encompass this.
> 
> -- Steve
> 
> 
>> +	mas_for_each(&mas, vma_mt, ULONG_MAX) {
>> +		/* ELF header has a fixed place in the file, starting at offset
>> +		 * zero.
>> +		 */
>> +		if (vma_mt->vm_pgoff)
>> +			continue;
>> +
>> +		/* For the callback to infer if its the prog or DSO we are
>> +		 * dealing with.
>> +		 */
>> +		phinfo.pi_prog = first;
>> +		first = false;
>> +		/* FIXME TODO
>> +		 *  - This code assumes 64-bit ELF by using Elf64_Ehdr.
>> +		 *  - Detect the case when ELF program headers to be of
>> +		 * size > 1 page.
>> +		 */
>> +
>> +		/* FIXME TODO KERNEL
>> +		 *  - get_user_pages_WHAT, which API.
>> +		 *  What flags ? Is this correct ?
>> +		 */
>> +		ret = get_user_pages_remote(mm, vma_mt->vm_start, 1, FOLL_GET,
>> +					    &page, &vma_mt, NULL);
>> +		if (ret <= 0)
>> +			continue;
>> +
>> +		/* The first page must have the ELF header. */
>> +		ehdr = vmap(&page, 1, VM_MAP, PAGE_KERNEL);
>> +		if (!ehdr)
>> +			goto put_page;
>> +
>> +		/* Check for magic bytes to make sure this is ehdr. */
>> +		err = 0;
>> +		err |= ((ehdr->e_ident[EI_MAG0] != ELFMAG0)
>> +			|| (ehdr->e_ident[EI_MAG1] != ELFMAG1)
>> +			|| (ehdr->e_ident[EI_MAG2] != ELFMAG2)
>> +			|| (ehdr->e_ident[EI_MAG3] != ELFMAG3));
>> +		if (err)
>> +			goto unmap;
>> +
>> +		/*
>> +		 * FIXME TODO handle the case when number of program headers is
>> +		 * greater than or equal to PN_XNUM later.
>> +		 */
>> +		if (ehdr->e_phnum == PN_XNUM)
>> +			goto unmap;
>> +		/*
>> +		 * FIXME TODO handle the case when Elf phdrs span more than one
>> +		 * page later ?
>> +		 */
>> +		if ((sizeof(Elf64_Ehdr) + ehdr->e_phentsize * ehdr->e_phnum)
>> +		    > PAGE_SIZE)
>> +			goto unmap;
>> +
>> +		/* Save the location of program headers and the phnum. */
>> +		phinfo.pi_addr = vma_mt->vm_start;
>> +		phinfo.pi_phdr = (void *)ehdr + ehdr->e_phoff;
>> +		phinfo.pi_phnum = ehdr->e_phnum;
>> +
>> +		res = callback(&phinfo, task, data);
>> +unmap:
>> +		vunmap(ehdr);
>> +put_page:
>> +		put_page(page);
>> +
>> +		if (res < 0)
>> +			break;
>> +	}
>> +
>> +	return res;
>> +}
>>


  reply	other threads:[~2023-05-02  6:17 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-01 20:04 [POC 0/5] SFrame based stack tracer for user space in the kernel Indu Bhagat
2023-05-01 20:04 ` [POC 1/5] Kconfig: x86: Add new config options for userspace unwinder Indu Bhagat
2023-05-01 20:04 ` [POC 2/5] task_struct : add additional member for sframe state Indu Bhagat
2023-05-01 20:04 ` [POC 3/5] sframe: add new SFrame library Indu Bhagat
2023-05-01 22:40   ` Steven Rostedt
2023-05-02  5:07     ` Indu Bhagat
2023-05-02  8:46     ` Peter Zijlstra
2023-05-02  9:09   ` Peter Zijlstra
2023-05-02  9:20   ` Peter Zijlstra
2023-05-02  9:28   ` Peter Zijlstra
2023-05-02  9:30   ` Peter Zijlstra
2023-05-03  6:03     ` Indu Bhagat
2023-05-02 10:31   ` Peter Zijlstra
2023-05-02 10:41   ` Peter Zijlstra
2023-05-02 15:22     ` Steven Rostedt
2023-05-01 20:04 ` [POC 4/5] sframe: add an SFrame format stack tracer Indu Bhagat
2023-05-01 23:00   ` Steven Rostedt
2023-05-02  6:16     ` Indu Bhagat [this message]
2023-05-02  8:53   ` Peter Zijlstra
2023-05-02  9:04   ` Peter Zijlstra
2023-05-01 20:04 ` [POC 5/5] x86_64: invoke SFrame based stack tracer for user space Indu Bhagat
2023-05-01 23:11   ` Steven Rostedt
2023-05-02 10:53   ` Peter Zijlstra
2023-05-02 15:27     ` Steven Rostedt
2023-05-16 17:25       ` Andrii Nakryiko
2023-05-16 17:38         ` Steven Rostedt
2023-05-16 17:51           ` Andrii Nakryiko
2024-03-13 14:37       ` Tatsuyuki Ishi
2024-03-13 14:52         ` Steven Rostedt
2024-03-13 14:58           ` Tatsuyuki Ishi
2024-03-13 15:04             ` Steven Rostedt
2023-05-01 22:15 ` [POC 0/5] SFrame based stack tracer for user space in the kernel Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=84bf4aee-e5a5-8e49-3826-ecb79eefc85b@oracle.com \
    --to=indu.bhagat@oracle.com \
    --cc=andrii@kernel.org \
    --cc=daandemeyer@meta.com \
    --cc=elena.zannoni@oracle.com \
    --cc=kris.van.hees@oracle.com \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=nick.alcock@oracle.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.