All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexei Starovoitov <ast@fb.com>
To: "David S . Miller" <davem@davemloft.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>,
	Daniel Wagner <daniel.wagner@bmw-carit.de>,
	Tom Zanussi <tom.zanussi@linux.intel.com>,
	Wang Nan <wangnan0@huawei.com>, He Kuang <hekuang@huawei.com>,
	Martin KaFai Lau <kafai@fb.com>,
	Brendan Gregg <brendan.d.gregg@gmail.com>,
	<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<kernel-team@fb.com>
Subject: [PATCH net-next 0/9] bpf: hash map pre-alloc
Date: Sun, 6 Mar 2016 17:58:28 -0800	[thread overview]
Message-ID: <1457315917-1970307-1-git-send-email-ast@fb.com> (raw)

Hi,

this path set switches bpf hash map to use pre-allocation by default
and introduces BPF_F_NO_PREALLOC flag to keep old behavior for cases
where full map pre-allocation is too memory expensive.

Some time back Daniel Wagner reported crashes when bpf hash map is
used to compute time intervals between preempt_disable->preempt_enable
and recently Tom Zanussi reported a dead lock in iovisor/bcc/funccount
tool if it's used to count the number of invocations of kernel
'*spin*' functions. Both problems are due to the recursive use of
slub and can only be solved by pre-allocating all map elements.

A lot of different solutions were considered. Many implemented,
but at the end pre-allocation seems to be the only feasible answer.
As far as pre-allocation goes it also was implemented 4 different ways:
- simple free-list with single lock
- percpu_ida with optimizations
- blk-mq-tag variant customized for bpf use case
- percpu_freelist
For bpf style of alloc/free patterns percpu_freelist is the best
and implemented in this patch set.
Detailed performance numbers in patch 3.
Patch 2 introduces percpu_freelist
Patch 1 fixes simple deadlocks due to missing recursion checks
Patches 4-7: prepare test infra
Patch 8: stress test for hash map infra. It attaches to spin_lock
functions and bpf_map_update/delete are called from different contexts
(except nmi, which is unsupported by bpf still)
Patch 9: map performance test

Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Reported-by: Tom Zanussi <tom.zanussi@linux.intel.com>

Alexei Starovoitov (9):
  bpf: prevent kprobe+bpf deadlocks
  bpf: introduce percpu_freelist
  bpf: pre-allocate hash map elements
  samples/bpf: make map creation more verbose
  samples/bpf: move ksym_search() into library
  samples/bpf: add map_flags to bpf loader
  samples/bpf: test both pre-alloc and normal maps
  samples/bpf: add bpf map stress test
  samples/bpf: add map performance test

 include/linux/bpf.h              |   4 +
 include/uapi/linux/bpf.h         |   3 +
 kernel/bpf/Makefile              |   2 +-
 kernel/bpf/hashtab.c             | 264 ++++++++++++++++++++++++++++-----------
 kernel/bpf/percpu_freelist.c     |  81 ++++++++++++
 kernel/bpf/percpu_freelist.h     |  31 +++++
 kernel/bpf/syscall.c             |  15 ++-
 kernel/trace/bpf_trace.c         |   2 -
 samples/bpf/Makefile             |   8 ++
 samples/bpf/bpf_helpers.h        |   1 +
 samples/bpf/bpf_load.c           |  70 ++++++++++-
 samples/bpf/bpf_load.h           |   6 +
 samples/bpf/fds_example.c        |   2 +-
 samples/bpf/libbpf.c             |   5 +-
 samples/bpf/libbpf.h             |   2 +-
 samples/bpf/map_perf_test_kern.c | 100 +++++++++++++++
 samples/bpf/map_perf_test_user.c | 155 +++++++++++++++++++++++
 samples/bpf/offwaketime_user.c   |  67 +---------
 samples/bpf/sock_example.c       |   2 +-
 samples/bpf/spintest_kern.c      |  59 +++++++++
 samples/bpf/spintest_user.c      |  50 ++++++++
 samples/bpf/test_maps.c          |  29 +++--
 samples/bpf/test_verifier.c      |   4 +-
 23 files changed, 802 insertions(+), 160 deletions(-)
 create mode 100644 kernel/bpf/percpu_freelist.c
 create mode 100644 kernel/bpf/percpu_freelist.h
 create mode 100644 samples/bpf/map_perf_test_kern.c
 create mode 100644 samples/bpf/map_perf_test_user.c
 create mode 100644 samples/bpf/spintest_kern.c
 create mode 100644 samples/bpf/spintest_user.c

-- 
2.6.5

             reply	other threads:[~2016-03-07  1:58 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-07  1:58 Alexei Starovoitov [this message]
2016-03-07  1:58 ` [PATCH net-next 1/9] bpf: prevent kprobe+bpf deadlocks Alexei Starovoitov
2016-03-07 10:07   ` Daniel Borkmann
2016-03-07  1:58 ` [PATCH net-next 2/9] bpf: introduce percpu_freelist Alexei Starovoitov
2016-03-07 10:33   ` Daniel Borkmann
2016-03-07 18:26     ` Alexei Starovoitov
2016-03-07  1:58 ` [PATCH net-next 3/9] bpf: pre-allocate hash map elements Alexei Starovoitov
2016-03-07 11:08   ` Daniel Borkmann
2016-03-07 18:29     ` Alexei Starovoitov
2016-03-07  1:58 ` [PATCH net-next 4/9] samples/bpf: make map creation more verbose Alexei Starovoitov
2016-03-07  1:58 ` [PATCH net-next 5/9] samples/bpf: move ksym_search() into library Alexei Starovoitov
2016-03-07  1:58 ` [PATCH net-next 6/9] samples/bpf: add map_flags to bpf loader Alexei Starovoitov
2016-03-07  1:58 ` [PATCH net-next 7/9] samples/bpf: test both pre-alloc and normal maps Alexei Starovoitov
2016-03-07  1:58 ` [PATCH net-next 8/9] samples/bpf: add bpf map stress test Alexei Starovoitov
2016-03-07  1:58 ` [PATCH net-next 9/9] samples/bpf: add map performance test Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1457315917-1970307-1-git-send-email-ast@fb.com \
    --to=ast@fb.com \
    --cc=brendan.d.gregg@gmail.com \
    --cc=daniel.wagner@bmw-carit.de \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=hekuang@huawei.com \
    --cc=kafai@fb.com \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=tom.zanussi@linux.intel.com \
    --cc=wangnan0@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.