All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexei Starovoitov <ast@plumgrid.com>
To: Kaixu Xia <xiakaixu@huawei.com>,
	davem@davemloft.net, acme@kernel.org, mingo@redhat.com,
	a.p.zijlstra@chello.nl, masami.hiramatsu.pt@hitachi.com,
	jolsa@kernel.org
Cc: wangnan0@huawei.com, linux-kernel@vger.kernel.org,
	pi3orama@163.com, hekuang@huawei.com
Subject: Re: [PATCH v2 3/5] bpf: Save the pointer to struct perf_event to map
Date: Wed, 22 Jul 2015 18:08:24 -0700	[thread overview]
Message-ID: <55B03E88.1000303@plumgrid.com> (raw)
In-Reply-To: <1437552572-84748-4-git-send-email-xiakaixu@huawei.com>

On 7/22/15 1:09 AM, Kaixu Xia wrote:
> The user space event FDs from perf_event_open() syscall are
> converted to the pointer to struct perf_event and stored in map.
...
> +static int replace_map_with_perf_event(void *value)
> +{
> +	struct perf_event *event;
> +	u32 fd;
> +
> +	fd = *(u32 *)value;
> +
> +	event = perf_event_get(fd);
> +	if (IS_ERR(event))
> +		return PTR_ERR(event);
> +
> +	/* limit the event type to PERF_TYPE_RAW
> +	 * and PERF_TYPE_HARDWARE
> +	 */
> +	if (event->attr.type != PERF_TYPE_RAW &&
> +	    event->attr.type != PERF_TYPE_HARDWARE)
> +		return -EINVAL;
> +
> +	memcpy(value, &event, sizeof(struct perf_event *));
> +
> +	return 0;
> +}
> +
> +static bool check_map_perf_event_stored(struct bpf_map *map, void *key)
> +{
> +	void *event;
> +	bool is_stored = false;
> +
> +	rcu_read_lock();
> +	event = array_map_lookup_elem(map, key);
> +	if (event && (*(unsigned long *)event))
> +		is_stored = true;
> +	rcu_read_unlock();
> +
> +	return is_stored;
> +}
> +
> +/* only called from syscall */
> +static int perf_event_array_map_update_elem(struct bpf_map *map, void *key,
> +					    void *value, u64 map_flags)
> +{
> +	/* check if the value is already stored */
> +	if (check_map_perf_event_stored(map, key))
> +		return -EINVAL;
> +
> +	if (replace_map_with_perf_event(value))
> +		return -EBADF;

the above three functions are broken due to races.
update_elem can be called by different user space processes.
Please see how prog_array solves it via xchg()

> +	if (map->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY) {
> +		if (!map->ops->map_traverse_elem)
> +			return -EPERM;
> +
> +		rcu_read_lock();
> +		if (map->ops->map_traverse_elem(bpf_map_perf_event_put, map) < 0) {
> +			rcu_read_unlock();
> +			return -EINVAL;
> +		}
> +		rcu_read_unlock();

as I mentioned as part of patch 2 the above seems unnecessary.

> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index d3dae34..14a9924 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -8574,6 +8574,32 @@ void perf_event_delayed_put(struct task_struct *task)
>   		WARN_ON_ONCE(task->perf_event_ctxp[ctxn]);
>   }
>
> +struct perf_event *perf_event_get(unsigned int fd)
> +{
> +	struct perf_event *event;
> +	struct fd f;
> +
> +	f = fdget(fd);
> +
> +	if (!f.file)
> +		return ERR_PTR(-EBADF);
> +
> +	if (f.file->f_op != &perf_fops) {
> +		fdput(f);
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	event = f.file->private_data;
> +
> +	if (!atomic_long_inc_not_zero(&event->refcount)) {

I don't understand necessity of '_not_zero' suffix. why?
How it can possibly race to zero here if we have an fd?

> +		fdput(f);
> +		return ERR_PTR(-ENOENT);
> +	}
> +
> +	fdput(f);
> +	return event;
> +}


  reply	other threads:[~2015-07-23  1:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-22  8:09 [PATCH v2 0/5] bpf: Introduce the new ability of eBPF programs to access hardware PMU counter Kaixu Xia
2015-07-22  8:09 ` [PATCH v2 1/5] bpf: Add new bpf map type to store the pointer to struct perf_event Kaixu Xia
2015-07-23  0:48   ` Alexei Starovoitov
2015-07-23  1:08     ` Wangnan (F)
2015-07-23  1:39       ` Alexei Starovoitov
2015-07-22  8:09 ` [PATCH v2 2/5] bpf: Add function map->ops->map_traverse_elem() to traverse map elems Kaixu Xia
2015-07-23  1:00   ` Alexei Starovoitov
2015-07-22  8:09 ` [PATCH v2 3/5] bpf: Save the pointer to struct perf_event to map Kaixu Xia
2015-07-23  1:08   ` Alexei Starovoitov [this message]
2015-07-22  8:09 ` [PATCH v2 4/5] bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter Kaixu Xia
2015-07-23  1:14   ` Alexei Starovoitov
2015-07-23  2:12     ` xiakaixu
2015-07-23  2:22       ` Alexei Starovoitov
2015-07-23  2:39         ` xiakaixu
2015-07-22  8:09 ` [PATCH v2 5/5] samples/bpf: example of get selected PMU counter value Kaixu Xia
2015-07-23  1:16   ` Alexei Starovoitov
2015-07-23 23:33 ` [PATCH v2 0/5] bpf: Introduce the new ability of eBPF programs to access hardware PMU counter Daniel Borkmann
2015-07-25  2:14   ` xiakaixu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55B03E88.1000303@plumgrid.com \
    --to=ast@plumgrid.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@kernel.org \
    --cc=davem@davemloft.net \
    --cc=hekuang@huawei.com \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mingo@redhat.com \
    --cc=pi3orama@163.com \
    --cc=wangnan0@huawei.com \
    --cc=xiakaixu@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.