* [PATCH bpf-next 0/3] Open-coded task_vma iter
From: Dave Marchevsky @ 2023-08-10 18:35 UTC
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Kernel Team, Dave Marchevsky

At Meta we have a profiling daemon which periodically collects
information on many hosts. This collection usually involves grabbing
stacks (user and kernel) using perf_event BPF progs and symbolicating
them later. For user stacks we try to use BPF_F_USER_BUILD_ID and rely on
remote symbolication, but BPF_F_USER_BUILD_ID doesn't always succeed. In
those cases we must fall back to digging around in /proc/PID/maps to map
each virtual address to a (binary, offset) pair. The /proc/PID/maps digging
does not occur synchronously with stack collection, so the process might
already be gone, in which case it no longer has a /proc/PID/maps and we
fail to symbolicate.
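
To make the fallback concrete, here is a minimal userspace sketch of the
address-to-(binary, offset) translation for one /proc/PID/maps line. The
struct and function names (map_entry, parse_maps_line, addr_to_file_offset)
are hypothetical and not part of this series, and the parser is simplified:
it assumes the backing path contains no spaces.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One parsed /proc/PID/maps entry (hypothetical helper type). */
struct map_entry {
	uint64_t start;    /* first address of the mapping */
	uint64_t end;      /* one past the last address */
	uint64_t file_off; /* offset of the mapping within the backing file */
	char path[256];    /* backing file, empty for anonymous mappings */
};

/* Parse one line of the form
 *   start-end perms offset dev inode [path]
 * Returns 0 on success, -1 if the mandatory fields are missing.
 */
static int parse_maps_line(const char *line, struct map_entry *e)
{
	char perms[5];
	int n;

	assert(line && e);
	e->path[0] = '\0';
	/* dev and inode are skipped with %*s; the path is optional. */
	n = sscanf(line, "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64 " %*s %*s %255s",
		   &e->start, &e->end, perms, &e->file_off, e->path);
	return n >= 4 ? 0 : -1;
}

/* Translate addr to a file offset if it falls inside the entry.
 * Returns 0 and fills *off on a hit, -1 otherwise.
 */
static int addr_to_file_offset(const struct map_entry *e, uint64_t addr,
			       uint64_t *off)
{
	if (addr < e->start || addr >= e->end)
		return -1;
	*off = addr - e->start + e->file_off;
	return 0;
}
```

A symbolizer would run each sampled address through every parsed entry
until one matches, then hand (path, offset) to the remote service. If the
process has already exited there are no lines to parse, which is the
failure described above.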

This 'exited process problem' doesn't occur very often, as
most of the prod services we care to profile are long-lived daemons, but
there are enough use cases to warrant a workaround: a BPF program which
can be optionally loaded at data collection time and essentially walks
/proc/PID/maps. Currently this is done by walking the vma list:

  struct vm_area_struct *mmap = BPF_CORE_READ(mm, mmap);
  mmap_next = BPF_CORE_READ(mmap, vm_next); /* in a loop */

Since commit 763ecb035029 ("mm: remove the vma linked list") there's no
longer a vma linked list to walk. Walking the vma maple tree is not as
simple as hopping struct vm_area_struct->vm_next. Luckily,
commit f39af05949a4 ("mm: add VMA iterator"), another commit in that series,
added struct vma_iterator and the for_each_vma macro for easy vma iteration.
If similar functionality were exposed to BPF programs, it would be perfect
for our use case.

This series adds such functionality, specifically a BPF equivalent of
for_each_vma using the open-coded iterator style.
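
As a rough illustration of the open-coded style, the sketch below models
the new/next/destroy calling convention in plain userspace C. It is not
kernel or BPF code: the toy_* types and functions are hypothetical
stand-ins, and a sorted array replaces the maple tree. The only behaviour
borrowed from the series is what is visible in the kfunc bodies quoted
further down in this thread: a failed _new() leaves mm NULL, _next() then
returns NULL, and _destroy() is safe to call either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct vm_area_struct. */
struct toy_vma {
	uint64_t vm_start;
	uint64_t vm_end;
};

/* Toy stand-in for an address space: VMAs sorted by vm_start. */
struct toy_mm {
	const struct toy_vma *vmas;
	size_t nr;
};

/* Iterator state lives in the caller's frame, by value. */
struct toy_iter_vma {
	const struct toy_mm *mm; /* NULL means toy_iter_vma_new() failed */
	size_t idx;
};

static int toy_iter_vma_new(struct toy_iter_vma *it, const struct toy_mm *mm,
			    uint64_t addr)
{
	assert(it != NULL);
	it->mm = NULL;
	it->idx = 0;
	if (!mm)
		return -1;
	it->mm = mm;
	/* Position on the first VMA that ends above addr. */
	while (it->idx < mm->nr && mm->vmas[it->idx].vm_end <= addr)
		it->idx++;
	return 0;
}

static const struct toy_vma *toy_iter_vma_next(struct toy_iter_vma *it)
{
	if (!it->mm || it->idx >= it->mm->nr)
		return NULL;
	return &it->mm->vmas[it->idx++];
}

static void toy_iter_vma_destroy(struct toy_iter_vma *it)
{
	it->mm = NULL; /* the real kfunc releases the mmap lock here */
}

/* Count VMAs at or above addr using the new/next/destroy sequence. */
static size_t toy_count_vmas(const struct toy_mm *mm, uint64_t addr)
{
	struct toy_iter_vma it;
	size_t n = 0;

	toy_iter_vma_new(&it, mm, addr); /* failure is handled by _next() */
	while (toy_iter_vma_next(&it))
		n++;
	toy_iter_vma_destroy(&it);
	return n;
}
```

The caller never checks the return value of _new() before looping, which
keeps the loop shape identical on the success and failure paths.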

Notes:
  * This approach was chosen after discussion on a previous series [0] which
    attempted to solve the same problem by adding a BPF_F_VMA_NEXT flag to
    bpf_find_vma.
  * Unlike the task_vma bpf_iter, the open-coded iterator kfuncs here do not
    drop the vma read lock between iterations. See Alexei's response in [0].
  * The [vsyscall] page isn't really part of task->mm's vmas, but
    /proc/PID/maps returns information about it anyway. The vma iter added
    here does not do the same. See comment on selftest in patch 3.
  * The struct vma_iterator wrapped by struct bpf_iter_task_vma itself wraps
    struct ma_state. Because we need the entire struct, not a ptr, changes to
    either struct vma_iterator or struct ma_state will necessitate changing the
    opaque struct bpf_iter_task_vma to account for the new size. This feels a
    bit brittle. We could instead use bpf_mem_alloc to allocate a struct
    vma_iterator in bpf_iter_task_vma_new and have struct bpf_iter_task_vma
    point to that, but that's not quite equivalent as BPF progs will usually
    use the stack for this struct via bpf_for_each. I went with the simpler
    route for now.
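
The sizing concern in the last note can be sketched in userspace as
follows. All names here (toy_state, toy_iter_kern, toy_iter) are
hypothetical, the field layout is invented for illustration, and the
aligned attribute is a GCC/Clang extension. The point is only that an
opaque public struct with a hard-coded size must be kept in lockstep
with whatever it hides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical internal state; stands in for the wrapped iterator. */
struct toy_state {
	uint64_t tree;
	uint64_t index;
	uint64_t last;
	uint8_t depth;
};

/* Internal view: the state is embedded by value, not by pointer. */
struct toy_iter_kern {
	struct toy_state st;
} __attribute__((aligned(8)));

/* Public opaque view with a hand-written size: 4 * 8 = 32 bytes. */
struct toy_iter {
	uint64_t opaque[4];
} __attribute__((aligned(8)));

/* Compile-time checks that both views agree in size and alignment. */
static_assert(sizeof(struct toy_iter_kern) == sizeof(struct toy_iter),
	      "opaque struct size is out of sync with the internal state");
static_assert(_Alignof(struct toy_iter_kern) == _Alignof(struct toy_iter),
	      "opaque struct alignment is out of sync with the internal state");
```

Adding another uint64_t to toy_state grows toy_iter_kern to 40 bytes and
breaks the build until opaque[] is bumped to 5, which is the kind of
maintenance the note calls brittle.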

Patch summary:
  * Patch 1 is a tiny fix I ran into while implementing the vma iter in this
    series. It can be applied independently.
  * Patch 2 is the meat of the implementation.
  * Patch 3 adds tests for the new functionality.
    * Existing iter tests exercise failure cases (e.g. prog that doesn't call
      _destroy()). I didn't replicate them in this series, but am happy to add
      them in v2 if folks feel that it would be worthwhile.

  [0]: https://lore.kernel.org/bpf/20230801145414.418145-1-davemarchevsky@fb.com/


Dave Marchevsky (3):
  bpf: Explicitly emit BTF for struct bpf_iter_num, not btf_iter_num
  bpf: Introduce task_vma open-coded iterator kfuncs
  selftests/bpf: Add tests for open-coded task_vma iter

 include/uapi/linux/bpf.h                      |  5 ++
 kernel/bpf/bpf_iter.c                         |  2 +-
 kernel/bpf/helpers.c                          |  3 +
 kernel/bpf/task_iter.c                        | 54 ++++++++++++++
 tools/include/uapi/linux/bpf.h                |  5 ++
 tools/lib/bpf/bpf_helpers.h                   |  8 +++
 .../selftests/bpf/prog_tests/bpf_iter.c       | 26 +++----
 .../testing/selftests/bpf/prog_tests/iters.c  | 71 +++++++++++++++++++
 ...f_iter_task_vma.c => bpf_iter_task_vmas.c} |  0
 .../selftests/bpf/progs/iters_task_vma.c      | 56 +++++++++++++++
 10 files changed, 216 insertions(+), 14 deletions(-)
 rename tools/testing/selftests/bpf/progs/{bpf_iter_task_vma.c => bpf_iter_task_vmas.c} (100%)
 create mode 100644 tools/testing/selftests/bpf/progs/iters_task_vma.c

-- 
2.34.1


* Re: [PATCH bpf-next 2/3] bpf: Introduce task_vma open-coded iterator kfuncs
From: kernel test robot @ 2023-08-11  6:15 UTC
  To: oe-kbuild; +Cc: lkp

:::::: 
:::::: Manual check reason: "git am base is a link in commit message"
:::::: 

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20230810183513.684836-3-davemarchevsky@fb.com>
References: <20230810183513.684836-3-davemarchevsky@fb.com>
TO: Dave Marchevsky <davemarchevsky@fb.com>
TO: bpf@vger.kernel.org
CC: Alexei Starovoitov <ast@kernel.org>
CC: Daniel Borkmann <daniel@iogearbox.net>
CC: Andrii Nakryiko <andrii@kernel.org>
CC: Martin KaFai Lau <martin.lau@kernel.org>
CC: Kernel Team <kernel-team@fb.com>
CC: Dave Marchevsky <davemarchevsky@fb.com>
CC: Nathan Slingerland <slinger@meta.com>

Hi Dave,

kernel test robot noticed the following build warnings:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Dave-Marchevsky/bpf-Explicitly-emit-BTF-for-struct-bpf_iter_num-not-btf_iter_num/20230811-023615
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230810183513.684836-3-davemarchevsky%40fb.com
patch subject: [PATCH bpf-next 2/3] bpf: Introduce task_vma open-coded iterator kfuncs
:::::: branch date: 12 hours ago
:::::: commit date: 12 hours ago
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20230811/202308111423.MRMNWfoF-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce: (https://download.01.org/0day-ci/archive/20230811/202308111423.MRMNWfoF-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/r/202308111423.MRMNWfoF-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> kernel/bpf/task_iter.c:837:17: warning: no previous prototype for 'bpf_iter_task_vma_new' [-Wmissing-prototypes]
     837 | __bpf_kfunc int bpf_iter_task_vma_new(struct bpf_iter_task_vma *it,
         |                 ^~~~~~~~~~~~~~~~~~~~~
>> kernel/bpf/task_iter.c:869:36: warning: no previous prototype for 'bpf_iter_task_vma_next' [-Wmissing-prototypes]
     869 | __bpf_kfunc struct vm_area_struct *bpf_iter_task_vma_next(struct bpf_iter_task_vma *it)
         |                                    ^~~~~~~~~~~~~~~~~~~~~~
>> kernel/bpf/task_iter.c:878:18: warning: no previous prototype for 'bpf_iter_task_vma_destroy' [-Wmissing-prototypes]
     878 | __bpf_kfunc void bpf_iter_task_vma_destroy(struct bpf_iter_task_vma *it)
         |                  ^~~~~~~~~~~~~~~~~~~~~~~~~
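
For readers unfamiliar with this warning class: -Wmissing-prototypes fires
when a non-static function is defined without an earlier prototype in
scope. The userspace sketch below shows two generic ways to satisfy it,
using hypothetical function names. It does not claim to be the fix that
was applied to this patch. The pragma form is GCC/Clang-specific.

```c
#include <assert.h>
#include <stddef.h>

/* Option 1: declare the function before defining it, normally in a
 * header shared with its callers. */
int toy_iter_new(int *state, int start);

int toy_iter_new(int *state, int start)
{
	if (!state)
		return -1;
	*state = start;
	return 0;
}

/* Option 2: suppress the warning around definitions that are
 * intentionally global but have no C callers needing a header. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmissing-prototypes"
int toy_iter_next(int *state)
{
	return state ? (*state)++ : -1;
}
#pragma GCC diagnostic pop
```

Both definitions compile cleanly with -Wmissing-prototypes enabled. Which
option suits kfuncs is a question for the series itself.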


vim +/bpf_iter_task_vma_new +837 kernel/bpf/task_iter.c

35129a6b021441 Dave Marchevsky 2023-08-10  836  
35129a6b021441 Dave Marchevsky 2023-08-10 @837  __bpf_kfunc int bpf_iter_task_vma_new(struct bpf_iter_task_vma *it,
35129a6b021441 Dave Marchevsky 2023-08-10  838  				      struct task_struct *task, u64 addr)
35129a6b021441 Dave Marchevsky 2023-08-10  839  {
35129a6b021441 Dave Marchevsky 2023-08-10  840  	struct bpf_iter_task_vma_kern *i = (void *)it;
35129a6b021441 Dave Marchevsky 2023-08-10  841  	bool irq_work_busy = false;
35129a6b021441 Dave Marchevsky 2023-08-10  842  
35129a6b021441 Dave Marchevsky 2023-08-10  843  	BUILD_BUG_ON(sizeof(struct bpf_iter_task_vma_kern) != sizeof(struct bpf_iter_task_vma));
35129a6b021441 Dave Marchevsky 2023-08-10  844  	BUILD_BUG_ON(__alignof__(struct bpf_iter_task_vma_kern) != __alignof__(struct bpf_iter_task_vma));
35129a6b021441 Dave Marchevsky 2023-08-10  845  
35129a6b021441 Dave Marchevsky 2023-08-10  846  	BTF_TYPE_EMIT(struct bpf_iter_task_vma);
35129a6b021441 Dave Marchevsky 2023-08-10  847  
35129a6b021441 Dave Marchevsky 2023-08-10  848  	/* NULL i->mm signals failed bpf_iter_task_vma initialization.
35129a6b021441 Dave Marchevsky 2023-08-10  849  	 * i->work == NULL is valid.
35129a6b021441 Dave Marchevsky 2023-08-10  850  	 */
35129a6b021441 Dave Marchevsky 2023-08-10  851  	i->mm = NULL;
35129a6b021441 Dave Marchevsky 2023-08-10  852  	if (!task)
35129a6b021441 Dave Marchevsky 2023-08-10  853  		return -ENOENT;
35129a6b021441 Dave Marchevsky 2023-08-10  854  
35129a6b021441 Dave Marchevsky 2023-08-10  855  	i->mm = task->mm;
35129a6b021441 Dave Marchevsky 2023-08-10  856  	if (!i->mm)
35129a6b021441 Dave Marchevsky 2023-08-10  857  		return -ENOENT;
35129a6b021441 Dave Marchevsky 2023-08-10  858  
35129a6b021441 Dave Marchevsky 2023-08-10  859  	irq_work_busy = bpf_mmap_unlock_get_irq_work(&i->work);
35129a6b021441 Dave Marchevsky 2023-08-10  860  	if (irq_work_busy || !mmap_read_trylock(i->mm)) {
35129a6b021441 Dave Marchevsky 2023-08-10  861  		i->mm = NULL;
35129a6b021441 Dave Marchevsky 2023-08-10  862  		return -EBUSY;
35129a6b021441 Dave Marchevsky 2023-08-10  863  	}
35129a6b021441 Dave Marchevsky 2023-08-10  864  
35129a6b021441 Dave Marchevsky 2023-08-10  865  	vma_iter_init(&i->vmi, i->mm, addr);
35129a6b021441 Dave Marchevsky 2023-08-10  866  	return 0;
35129a6b021441 Dave Marchevsky 2023-08-10  867  }
35129a6b021441 Dave Marchevsky 2023-08-10  868  
35129a6b021441 Dave Marchevsky 2023-08-10 @869  __bpf_kfunc struct vm_area_struct *bpf_iter_task_vma_next(struct bpf_iter_task_vma *it)
35129a6b021441 Dave Marchevsky 2023-08-10  870  {
35129a6b021441 Dave Marchevsky 2023-08-10  871  	struct bpf_iter_task_vma_kern *i = (void *)it;
35129a6b021441 Dave Marchevsky 2023-08-10  872  
35129a6b021441 Dave Marchevsky 2023-08-10  873  	if (!i->mm) /* bpf_iter_task_vma_new failed */
35129a6b021441 Dave Marchevsky 2023-08-10  874  		return NULL;
35129a6b021441 Dave Marchevsky 2023-08-10  875  	return vma_next(&i->vmi);
35129a6b021441 Dave Marchevsky 2023-08-10  876  }
35129a6b021441 Dave Marchevsky 2023-08-10  877  
35129a6b021441 Dave Marchevsky 2023-08-10 @878  __bpf_kfunc void bpf_iter_task_vma_destroy(struct bpf_iter_task_vma *it)
35129a6b021441 Dave Marchevsky 2023-08-10  879  {
35129a6b021441 Dave Marchevsky 2023-08-10  880  	struct bpf_iter_task_vma_kern *i = (void *)it;
35129a6b021441 Dave Marchevsky 2023-08-10  881  
35129a6b021441 Dave Marchevsky 2023-08-10  882  	if (i->mm)
35129a6b021441 Dave Marchevsky 2023-08-10  883  		bpf_mmap_unlock_mm(i->work, i->mm);
35129a6b021441 Dave Marchevsky 2023-08-10  884  }
35129a6b021441 Dave Marchevsky 2023-08-10  885  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH bpf-next 2/3] bpf: Introduce task_vma open-coded iterator kfuncs
From: kernel test robot @ 2023-08-11 23:08 UTC
  To: oe-kbuild; +Cc: lkp

:::::: 
:::::: Manual check reason: "git am base is a link in commit message"
:::::: 

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20230810183513.684836-3-davemarchevsky@fb.com>
References: <20230810183513.684836-3-davemarchevsky@fb.com>
TO: Dave Marchevsky <davemarchevsky@fb.com>
TO: bpf@vger.kernel.org
CC: Alexei Starovoitov <ast@kernel.org>
CC: Daniel Borkmann <daniel@iogearbox.net>
CC: Andrii Nakryiko <andrii@kernel.org>
CC: Martin KaFai Lau <martin.lau@kernel.org>
CC: Kernel Team <kernel-team@fb.com>
CC: Dave Marchevsky <davemarchevsky@fb.com>
CC: Nathan Slingerland <slinger@meta.com>

Hi Dave,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Dave-Marchevsky/bpf-Explicitly-emit-BTF-for-struct-bpf_iter_num-not-btf_iter_num/20230811-023615
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230810183513.684836-3-davemarchevsky%40fb.com
patch subject: [PATCH bpf-next 2/3] bpf: Introduce task_vma open-coded iterator kfuncs
:::::: branch date: 28 hours ago
:::::: commit date: 28 hours ago
config: mips-allyesconfig (https://download.01.org/0day-ci/archive/20230812/202308120622.rMu98Wlt-lkp@intel.com/config)
compiler: mips-linux-gcc (GCC) 12.3.0
reproduce: (https://download.01.org/0day-ci/archive/20230812/202308120622.rMu98Wlt-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/r/202308120622.rMu98Wlt-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/bpf/task_iter.c:837:17: warning: no previous prototype for 'bpf_iter_task_vma_new' [-Wmissing-prototypes]
     837 | __bpf_kfunc int bpf_iter_task_vma_new(struct bpf_iter_task_vma *it,
         |                 ^~~~~~~~~~~~~~~~~~~~~
   kernel/bpf/task_iter.c:869:36: warning: no previous prototype for 'bpf_iter_task_vma_next' [-Wmissing-prototypes]
     869 | __bpf_kfunc struct vm_area_struct *bpf_iter_task_vma_next(struct bpf_iter_task_vma *it)
         |                                    ^~~~~~~~~~~~~~~~~~~~~~
   kernel/bpf/task_iter.c:878:18: warning: no previous prototype for 'bpf_iter_task_vma_destroy' [-Wmissing-prototypes]
     878 | __bpf_kfunc void bpf_iter_task_vma_destroy(struct bpf_iter_task_vma *it)
         |                  ^~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from <command-line>:
   kernel/bpf/task_iter.c: In function 'bpf_iter_task_vma_new':
>> include/linux/compiler_types.h:397:45: error: call to '__compiletime_assert_554' declared with attribute error: BUILD_BUG_ON failed: sizeof(struct bpf_iter_task_vma_kern) != sizeof(struct bpf_iter_task_vma)
     397 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
         |                                             ^
   include/linux/compiler_types.h:378:25: note: in definition of macro '__compiletime_assert'
     378 |                         prefix ## suffix();                             \
         |                         ^~~~~~
   include/linux/compiler_types.h:397:9: note: in expansion of macro '_compiletime_assert'
     397 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
         |         ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
      39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
         |                                     ^~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:50:9: note: in expansion of macro 'BUILD_BUG_ON_MSG'
      50 |         BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
         |         ^~~~~~~~~~~~~~~~
   kernel/bpf/task_iter.c:843:9: note: in expansion of macro 'BUILD_BUG_ON'
     843 |         BUILD_BUG_ON(sizeof(struct bpf_iter_task_vma_kern) != sizeof(struct bpf_iter_task_vma));
         |         ^~~~~~~~~~~~


vim +/__compiletime_assert_554 +397 include/linux/compiler_types.h

eb5c2d4b45e3d2 Will Deacon 2020-07-21  383  
eb5c2d4b45e3d2 Will Deacon 2020-07-21  384  #define _compiletime_assert(condition, msg, prefix, suffix) \
eb5c2d4b45e3d2 Will Deacon 2020-07-21  385  	__compiletime_assert(condition, msg, prefix, suffix)
eb5c2d4b45e3d2 Will Deacon 2020-07-21  386  
eb5c2d4b45e3d2 Will Deacon 2020-07-21  387  /**
eb5c2d4b45e3d2 Will Deacon 2020-07-21  388   * compiletime_assert - break build and emit msg if condition is false
eb5c2d4b45e3d2 Will Deacon 2020-07-21  389   * @condition: a compile-time constant condition to check
eb5c2d4b45e3d2 Will Deacon 2020-07-21  390   * @msg:       a message to emit if condition is false
eb5c2d4b45e3d2 Will Deacon 2020-07-21  391   *
eb5c2d4b45e3d2 Will Deacon 2020-07-21  392   * In tradition of POSIX assert, this macro will break the build if the
eb5c2d4b45e3d2 Will Deacon 2020-07-21  393   * supplied condition is *false*, emitting the supplied error message if the
eb5c2d4b45e3d2 Will Deacon 2020-07-21  394   * compiler has support to do so.
eb5c2d4b45e3d2 Will Deacon 2020-07-21  395   */
eb5c2d4b45e3d2 Will Deacon 2020-07-21  396  #define compiletime_assert(condition, msg) \
eb5c2d4b45e3d2 Will Deacon 2020-07-21 @397  	_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
eb5c2d4b45e3d2 Will Deacon 2020-07-21  398  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

