From: Brenden Blanco <bblanco@plumgrid.com>
To: Tariq Toukan <tariqt@mellanox.com>
Cc: davem@davemloft.net, netdev@vger.kernel.org,
Daniel Borkmann <daniel@iogearbox.net>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Tariq Toukan <ttoukan.linux@gmail.com>,
Or Gerlitz <gerlitz.or@gmail.com>,
Tom Herbert <tom@herbertland.com>
Subject: Re: [PATCH] net/mlx4_en: protect ring->xdp_prog with rcu_read_lock
Date: Mon, 29 Aug 2016 08:55:59 -0700 [thread overview]
Message-ID: <20160829155558.GA13971@gmail.com> (raw)
In-Reply-To: <c03157a0-b7d0-bde7-0fe9-4f3b19a68d08@mellanox.com>
On Mon, Aug 29, 2016 at 05:59:26PM +0300, Tariq Toukan wrote:
> Hi Brenden,
>
> The solution direction should be XDP specific that does not hurt the
> regular flow.
An rcu_read_lock is _already_ taken for _every_ packet. This is 1/64th of
that.
>
> On 26/08/2016 11:38 PM, Brenden Blanco wrote:
> >Depending on the preempt mode, the bpf_prog stored in xdp_prog may be
> >freed despite the use of call_rcu inside bpf_prog_put. The situation is
> >possible when running in PREEMPT_RCU=y mode, for instance, since the rcu
> >callback for destroying the bpf prog can run even during the bh handling
> >in the mlx4 rx path.
> >
> >Several options were considered before this patch was settled on:
> >
> >Add a napi_synchronize loop in mlx4_xdp_set, which would occur after all
> >of the rings are updated with the new program.
> >This approach has the disadvantage that as the number of rings
> >increases, the speed of udpate will slow down significantly due to
> >napi_synchronize's msleep(1).
> I prefer this option as it doesn't hurt the data path. A delay in a
> control command can be tolerated.
> >Add a new rcu_head in bpf_prog_aux, to be used by a new bpf_prog_put_bh.
> >The action of the bpf_prog_put_bh would be to then call bpf_prog_put
> >later. Those drivers that consume a bpf prog in a bh context (like mlx4)
> >would then use the bpf_prog_put_bh instead when the ring is up. This has
> >the problem of complexity, in maintaining proper refcnts and rcu lists,
> >and would likely be harder to review. In addition, this approach to
> >freeing must be exclusive with other frees of the bpf prog, for instance
> >a _bh prog must not be referenced from a prog array that is consumed by
> >a non-_bh prog.
> >
> >The placement of rcu_read_lock in this patch is functionally the same as
> >putting an rcu_read_lock in napi_poll. Actually doing so could be a
> >potentially controversial change, but would bring the implementation in
> >line with sk_busy_loop (though of course the nature of those two paths
> >is substantially different), and would also avoid future copy/paste
> >problems with future supporters of XDP. Still, this patch does not take
> >that opinionated option.
> So you decided to add a lock for all non-XDP flows, which are 99% of
> the cases.
> We should avoid this.
The whole point of rcu_read_lock architecture is to be taken in the fast
path. There won't be a performance impact from this patch.
> >
> >Testing was done with kernels in either PREEMPT_RCU=y or
> >CONFIG_PREEMPT_VOLUNTARY=y+PREEMPT_RCU=n modes, with neither exhibiting
> >any drawback. With PREEMPT_RCU=n, the extra call to rcu_read_lock did
> >not show up in the perf report whatsoever, and with PREEMPT_RCU=y the
> >overhead of rcu_read_lock (according to perf) was the same before/after.
> >In the rx path, rcu_read_lock is eventually called for every packet
> >from netif_receive_skb_internal, so the napi poll call's rcu_read_lock
> >is easily amortized.
> For now, I don't agree with this fix.
> Let me think about the options you suggested.
> I also need to do my perf tests.
>
> Regards,
> Tariq
>
next prev parent reply other threads:[~2016-08-29 15:56 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-26 20:38 [PATCH] net/mlx4_en: protect ring->xdp_prog with rcu_read_lock Brenden Blanco
2016-08-26 21:01 ` Brenden Blanco
2016-08-29 14:59 ` Tariq Toukan
2016-08-29 15:55 ` Brenden Blanco [this message]
2016-08-29 17:46 ` Tom Herbert
2016-08-30 9:35 ` Saeed Mahameed
2016-08-31 1:50 ` Brenden Blanco
2016-09-01 22:59 ` Saeed Mahameed
2016-09-01 23:30 ` Tom Herbert
2016-09-02 17:50 ` Brenden Blanco
2016-09-02 18:01 ` Brenden Blanco
2016-09-02 18:13 ` Brenden Blanco
2016-09-02 19:14 ` Tom Herbert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160829155558.GA13971@gmail.com \
--to=bblanco@plumgrid.com \
--cc=alexei.starovoitov@gmail.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=gerlitz.or@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=tariqt@mellanox.com \
--cc=tom@herbertland.com \
--cc=ttoukan.linux@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.