* [PATCH net-next v2 0/4] Couple of BPF refcount fixes for mlx5 @ 2016-11-16 0:04 Daniel Borkmann 2016-11-16 0:04 ` [PATCH net-next v2 1/4] bpf, mlx5: fix mlx5e_create_rq taking reference on prog Daniel Borkmann ` (3 more replies) 0 siblings, 4 replies; 15+ messages in thread From: Daniel Borkmann @ 2016-11-16 0:04 UTC (permalink / raw) To: davem Cc: alexei.starovoitov, bblanco, zhiyisun, ranas, saeedm, netdev, Daniel Borkmann Various mlx5 bugs on eBPF refcount handling found during review. Last patch in series adds a __must_check to BPF helpers to make sure we won't run into it again w/o compiler complaining first. v1 -> v2: - After discussion with Alexei, we agreed upon rebasing the patches against net-next. - Since net-next, I've also added the __must_check to enforce future users to check for errors. - Fixed up commit message #2. - Simplify assignment from patch #1 based on Saeed's feedback on previous set. Thanks a lot! Daniel Borkmann (4): bpf, mlx5: fix mlx5e_create_rq taking reference on prog bpf, mlx5: fix various refcount issues in mlx5e_xdp_set bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup bpf: add __must_check attributes to refcount manipulating helpers drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 41 ++++++++++++++++++----- include/linux/bpf.h | 12 ++++--- kernel/bpf/syscall.c | 1 + 3 files changed, 40 insertions(+), 14 deletions(-) -- 1.9.3 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net-next v2 1/4] bpf, mlx5: fix mlx5e_create_rq taking reference on prog 2016-11-16 0:04 [PATCH net-next v2 0/4] Couple of BPF refcount fixes for mlx5 Daniel Borkmann @ 2016-11-16 0:04 ` Daniel Borkmann 2016-11-16 0:04 ` [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set Daniel Borkmann ` (2 subsequent siblings) 3 siblings, 0 replies; 15+ messages in thread From: Daniel Borkmann @ 2016-11-16 0:04 UTC (permalink / raw) To: davem Cc: alexei.starovoitov, bblanco, zhiyisun, ranas, saeedm, netdev, Daniel Borkmann In mlx5e_create_rq(), when creating a new queue, we call bpf_prog_add() but without checking the return value. bpf_prog_add() can fail since 92117d8443bc ("bpf: fix refcnt overflow"), so we really must check it. Take the reference right when we assign it to the rq from priv->xdp_prog, and just drop the reference on error path. Destruction in mlx5e_destroy_rq() looks good, though. Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 13 +++++++++---- kernel/bpf/syscall.c | 1 + 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 16a9a10..ab0c336 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -513,7 +513,13 @@ static int mlx5e_create_rq(struct mlx5e_channel *c, rq->channel = c; rq->ix = c->ix; rq->priv = c->priv; - rq->xdp_prog = priv->xdp_prog; + + rq->xdp_prog = priv->xdp_prog ? bpf_prog_inc(priv->xdp_prog) : NULL; + if (IS_ERR(rq->xdp_prog)) { + err = PTR_ERR(rq->xdp_prog); + rq->xdp_prog = NULL; + goto err_rq_wq_destroy; + } rq->buff.map_dir = DMA_FROM_DEVICE; if (rq->xdp_prog) @@ -590,12 +596,11 @@ static int mlx5e_create_rq(struct mlx5e_channel *c, rq->page_cache.head = 0; rq->page_cache.tail = 0; - if (rq->xdp_prog) - bpf_prog_add(rq->xdp_prog, 1); - return 0; err_rq_wq_destroy: + if (rq->xdp_prog) + bpf_prog_put(rq->xdp_prog); mlx5_wq_destroy(&rq->wq_ctrl); return err; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index ce1b7de..eb15498 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -696,6 +696,7 @@ struct bpf_prog *bpf_prog_inc(struct bpf_prog *prog) { return bpf_prog_add(prog, 1); } +EXPORT_SYMBOL_GPL(bpf_prog_inc); static struct bpf_prog *__bpf_prog_get(u32 ufd, enum bpf_prog_type *type) { -- 1.9.3 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set 2016-11-16 0:04 [PATCH net-next v2 0/4] Couple of BPF refcount fixes for mlx5 Daniel Borkmann 2016-11-16 0:04 ` [PATCH net-next v2 1/4] bpf, mlx5: fix mlx5e_create_rq taking reference on prog Daniel Borkmann @ 2016-11-16 0:04 ` Daniel Borkmann 2016-11-16 12:25 ` Saeed Mahameed 2016-11-16 0:04 ` [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup Daniel Borkmann 2016-11-16 0:04 ` [PATCH net-next v2 4/4] bpf: add __must_check attributes to refcount manipulating helpers Daniel Borkmann 3 siblings, 1 reply; 15+ messages in thread From: Daniel Borkmann @ 2016-11-16 0:04 UTC (permalink / raw) To: davem Cc: alexei.starovoitov, bblanco, zhiyisun, ranas, saeedm, netdev, Daniel Borkmann There are multiple issues in mlx5e_xdp_set(): 1) The batched bpf_prog_add() is currently not checked for errors! When doing so, it should be done at an earlier point in time to makes sure that we cannot fail anymore at the time we want to set the program for each channel. This only means that we have to undo the bpf_prog_add() in case we return due to required reset. 2) When swapping the priv->xdp_prog, then no extra reference count must be taken since we got that from call path via dev_change_xdp_fd() already. Otherwise, we'd never be able to free the program. Also, bpf_prog_add() without checking the return code could fail. Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 25 ++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index ab0c336..cf26672 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -3142,6 +3142,17 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) goto unlock; } + if (prog) { + /* num_channels is invariant here, so we can take the + * batched reference right upfront. + */ + prog = bpf_prog_add(prog, priv->params.num_channels); + if (IS_ERR(prog)) { + err = PTR_ERR(prog); + goto unlock; + } + } + was_opened = test_bit(MLX5E_STATE_OPENED, &priv->state); /* no need for full reset when exchanging programs */ reset = (!priv->xdp_prog || !prog); @@ -3149,10 +3160,10 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) if (was_opened && reset) mlx5e_close_locked(netdev); - /* exchange programs */ + /* exchange programs, extra prog reference we got from caller + * as long as we don't fail from this point onwards. + */ old_prog = xchg(&priv->xdp_prog, prog); - if (prog) - bpf_prog_add(prog, 1); if (old_prog) bpf_prog_put(old_prog); @@ -3163,12 +3174,11 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) mlx5e_open_locked(netdev); if (!test_bit(MLX5E_STATE_OPENED, &priv->state) || reset) - goto unlock; + goto unlock_put; /* exchanging programs w/o reset, we update ref counts on behalf * of the channels RQs here. */ - bpf_prog_add(prog, priv->params.num_channels); for (i = 0; i < priv->params.num_channels; i++) { struct mlx5e_channel *c = priv->channel[i]; @@ -3190,6 +3200,11 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) unlock: mutex_unlock(&priv->state_lock); return err; +unlock_put: + /* reference on priv->xdp_prog is still held at this point */ + if (prog) + bpf_prog_sub(prog, priv->params.num_channels); + goto unlock; } static bool mlx5e_xdp_attached(struct net_device *dev) -- 1.9.3 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set 2016-11-16 0:04 ` [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set Daniel Borkmann @ 2016-11-16 12:25 ` Saeed Mahameed 2016-11-16 13:54 ` Daniel Borkmann 0 siblings, 1 reply; 15+ messages in thread From: Saeed Mahameed @ 2016-11-16 12:25 UTC (permalink / raw) To: Daniel Borkmann Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun, Rana Shahout, Saeed Mahameed, Linux Netdev List On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> wrote: > There are multiple issues in mlx5e_xdp_set(): > > 1) The batched bpf_prog_add() is currently not checked for errors! When > doing so, it should be done at an earlier point in time to makes sure > that we cannot fail anymore at the time we want to set the program for > each channel. This only means that we have to undo the bpf_prog_add() > in case we return due to required reset. > > 2) When swapping the priv->xdp_prog, then no extra reference count must be > taken since we got that from call path via dev_change_xdp_fd() already. > Otherwise, we'd never be able to free the program. Also, bpf_prog_add() > without checking the return code could fail. > > Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> > --- > drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 25 ++++++++++++++++++----- > 1 file changed, 20 insertions(+), 5 deletions(-) > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > index ab0c336..cf26672 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > @@ -3142,6 +3142,17 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) > goto unlock; > } > > + if (prog) { > + /* num_channels is invariant here, so we can take the > + * batched reference right upfront. > + */ > + prog = bpf_prog_add(prog, priv->params.num_channels); > + if (IS_ERR(prog)) { > + err = PTR_ERR(prog); > + goto unlock; > + } > + } > + With this approach you will end up taking a ref count twice per RQ! on the first time xdp_set is called i.e (old_prog = NULL, prog != NULL). One ref will be taken per RQ/Channel from the above code, and since reset will be TRUE mlx5e_open_locked will be called and another ref count will be taken on mlx5e_create_rq. The issue here is that we have two places for ref count accounting, xdp_set and mlx5e_create_rq. Having ref-count updates in mlx5e_create_rq is essential for num_channels configuration changes (mlx5e_set_ringparam). Our previous approach made sure that only one path will do the ref counting (mlx5e_open_locked vs. mlx5e_xdp_set batch ref update when reset == false). We have two options: 1. remove ref count updates from mlx5e_create_rq and only do batch updates in mlx5e_set_ringparam and mlx5e_xdp_set, the only two places num_channels/priv->xdp_prog could change. 2. Keep the current approach and make sure to not update the ref count twice, you can batch update only if (!reset && was_open) otherwise you can rely on mlx5e_open_locked to take the num_channels ref count for you. Personally I prefer option 2, since we will keep the current logic which will allow configuration updates such as (mlx5e_set_ringparam) to not worry about ref counting since it will be done in the reset flow. Basically we want on xdp_set -bpf_prog_add(prog, 1) // one ref for the device. - if not going to reset bpf_prog_add(prog, num_channels) //batch update - old_prog = xchg(&priv->xdp_prog, prog); - if going to reset then reset . // reset will handle ref-counting per channel. - bpf_prog_put(old_prog) // put one ref for the device. - if (!reset) bpf_prog_put(old_prog, num_channels) // if we didn't reset, we need to release ref count on old_prog ourselves Saeed. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set 2016-11-16 12:25 ` Saeed Mahameed @ 2016-11-16 13:54 ` Daniel Borkmann 2016-11-16 14:30 ` Saeed Mahameed 0 siblings, 1 reply; 15+ messages in thread From: Daniel Borkmann @ 2016-11-16 13:54 UTC (permalink / raw) To: Saeed Mahameed Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun, Rana Shahout, Saeed Mahameed, Linux Netdev List On 11/16/2016 01:25 PM, Saeed Mahameed wrote: > On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> wrote: >> There are multiple issues in mlx5e_xdp_set(): >> >> 1) The batched bpf_prog_add() is currently not checked for errors! When >> doing so, it should be done at an earlier point in time to makes sure >> that we cannot fail anymore at the time we want to set the program for >> each channel. This only means that we have to undo the bpf_prog_add() >> in case we return due to required reset. >> >> 2) When swapping the priv->xdp_prog, then no extra reference count must be >> taken since we got that from call path via dev_change_xdp_fd() already. >> Otherwise, we'd never be able to free the program. Also, bpf_prog_add() >> without checking the return code could fail. >> >> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") >> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> >> --- >> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 25 ++++++++++++++++++----- >> 1 file changed, 20 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >> index ab0c336..cf26672 100644 >> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >> @@ -3142,6 +3142,17 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) >> goto unlock; >> } >> >> + if (prog) { >> + /* num_channels is invariant here, so we can take the >> + * batched reference right upfront. >> + */ >> + prog = bpf_prog_add(prog, priv->params.num_channels); >> + if (IS_ERR(prog)) { >> + err = PTR_ERR(prog); >> + goto unlock; >> + } >> + } >> + > > With this approach you will end up taking a ref count twice per RQ! on > the first time xdp_set is called i.e (old_prog = NULL, prog != NULL). > One ref will be taken per RQ/Channel from the above code, and since > reset will be TRUE mlx5e_open_locked will be called and another ref > count will be taken on mlx5e_create_rq. > > The issue here is that we have two places for ref count accounting, > xdp_set and mlx5e_create_rq. Having ref-count updates in > mlx5e_create_rq is essential for num_channels configuration changes > (mlx5e_set_ringparam). > > Our previous approach made sure that only one path will do the ref > counting (mlx5e_open_locked vs. mlx5e_xdp_set batch ref update when > reset == false). That is correct, for a short time bpf_prog_add() was charged also when we reset. When we reset, we will then jump to unlock_put and safely undo this since we took ref from mlx5e_create_rq() already in that case, and return successfully. That was intentional, so that eventually we end up just taking a /single/ ref per RQ when we exit mlx5e_xdp_set(), but more below ... [...] > 2. Keep the current approach and make sure to not update the ref count > twice, you can batch update only if (!reset && was_open) otherwise you > can rely on mlx5e_open_locked to take the num_channels ref count for > you. > > Personally I prefer option 2, since we will keep the current logic > which will allow configuration updates such as (mlx5e_set_ringparam) > to not worry about ref counting since it will be done in the reset > flow. ... agree on keeping current approach. I actually like the idea, so we'd end up with this simpler version for the batched ref then. Note, your "bpf_prog_add(prog, 1) // one ref for the device." is not needed since this we already got that one through dev_change_xdp_fd() as mentioned. diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index ab0c336..09ac27c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -3148,11 +3148,21 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) if (was_opened && reset) mlx5e_close_locked(netdev); + if (was_opened && !reset) { + /* num_channels is invariant here, so we can take the + * batched reference right upfront. + */ + prog = bpf_prog_add(prog, priv->params.num_channels); + if (IS_ERR(prog)) { + err = PTR_ERR(prog); + goto unlock; + } + } - /* exchange programs */ + /* exchange programs, extra prog reference we got from caller + * as long as we don't fail from this point onwards. + */ old_prog = xchg(&priv->xdp_prog, prog); - if (prog) - bpf_prog_add(prog, 1); if (old_prog) bpf_prog_put(old_prog); @@ -3168,7 +3178,6 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) /* exchanging programs w/o reset, we update ref counts on behalf * of the channels RQs here. */ - bpf_prog_add(prog, priv->params.num_channels); for (i = 0; i < priv->params.num_channels; i++) { struct mlx5e_channel *c = priv->channel[i]; -- 1.9.3 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set 2016-11-16 13:54 ` Daniel Borkmann @ 2016-11-16 14:30 ` Saeed Mahameed 2016-11-16 15:26 ` Daniel Borkmann 0 siblings, 1 reply; 15+ messages in thread From: Saeed Mahameed @ 2016-11-16 14:30 UTC (permalink / raw) To: Daniel Borkmann Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun, Rana Shahout, Saeed Mahameed, Linux Netdev List On Wed, Nov 16, 2016 at 3:54 PM, Daniel Borkmann <daniel@iogearbox.net> wrote: > On 11/16/2016 01:25 PM, Saeed Mahameed wrote: >> >> On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> >> wrote: >>> >>> There are multiple issues in mlx5e_xdp_set(): >>> >>> 1) The batched bpf_prog_add() is currently not checked for errors! When >>> doing so, it should be done at an earlier point in time to makes sure >>> that we cannot fail anymore at the time we want to set the program >>> for >>> each channel. This only means that we have to undo the bpf_prog_add() >>> in case we return due to required reset. >>> >>> 2) When swapping the priv->xdp_prog, then no extra reference count must >>> be >>> taken since we got that from call path via dev_change_xdp_fd() >>> already. >>> Otherwise, we'd never be able to free the program. Also, >>> bpf_prog_add() >>> without checking the return code could fail. >>> >>> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") >>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> >>> --- >>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 25 >>> ++++++++++++++++++----- >>> 1 file changed, 20 insertions(+), 5 deletions(-) >>> >>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> index ab0c336..cf26672 100644 >>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> @@ -3142,6 +3142,17 @@ static int mlx5e_xdp_set(struct net_device >>> *netdev, struct bpf_prog *prog) >>> goto unlock; >>> } >>> >>> + if (prog) { >>> + /* num_channels is invariant here, so we can take the >>> + * batched reference right upfront. >>> + */ >>> + prog = bpf_prog_add(prog, priv->params.num_channels); >>> + if (IS_ERR(prog)) { >>> + err = PTR_ERR(prog); >>> + goto unlock; >>> + } >>> + } >>> + >> >> >> With this approach you will end up taking a ref count twice per RQ! on >> the first time xdp_set is called i.e (old_prog = NULL, prog != NULL). >> One ref will be taken per RQ/Channel from the above code, and since >> reset will be TRUE mlx5e_open_locked will be called and another ref >> count will be taken on mlx5e_create_rq. >> >> The issue here is that we have two places for ref count accounting, >> xdp_set and mlx5e_create_rq. Having ref-count updates in >> mlx5e_create_rq is essential for num_channels configuration changes >> (mlx5e_set_ringparam). >> >> Our previous approach made sure that only one path will do the ref >> counting (mlx5e_open_locked vs. mlx5e_xdp_set batch ref update when >> reset == false). > > > That is correct, for a short time bpf_prog_add() was charged also when > we reset. When we reset, we will then jump to unlock_put and safely undo > this since we took ref from mlx5e_create_rq() already in that case, and > return successfully. That was intentional, so that eventually we end up > just taking a /single/ ref per RQ when we exit mlx5e_xdp_set(), but more > below ... > > [...] >> >> 2. Keep the current approach and make sure to not update the ref count >> twice, you can batch update only if (!reset && was_open) otherwise you >> can rely on mlx5e_open_locked to take the num_channels ref count for >> you. >> >> Personally I prefer option 2, since we will keep the current logic >> which will allow configuration updates such as (mlx5e_set_ringparam) >> to not worry about ref counting since it will be done in the reset >> flow. > > > ... agree on keeping current approach. I actually like the idea, so we'd > end up with this simpler version for the batched ref then. > Great :). So let's do the batched update only if we are not going to reset (we already know that in advance), instead of the current patch where you batch update unconditionally and then unlock_put in case reset was performed (which is just redundant and confusing). Please note that if open_locked fails you need to goto unlock_put. > Note, your "bpf_prog_add(prog, 1) // one ref for the device." is not needed > since this we already got that one through dev_change_xdp_fd() as mentioned. > If it is not needed then why we need bpf_prog_put on mlx5e_nic_cleanup in your next patch? this doesn't look symmetric (right) ! you shouldn't release a reference you didn't take. Overall with this series the driver can take num_channels refs and will release num_channels refs on mlx5e_close. we shouldn't release one extra ref on NIC cleanup. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set 2016-11-16 14:30 ` Saeed Mahameed @ 2016-11-16 15:26 ` Daniel Borkmann 2016-11-17 9:48 ` Saeed Mahameed 0 siblings, 1 reply; 15+ messages in thread From: Daniel Borkmann @ 2016-11-16 15:26 UTC (permalink / raw) To: Saeed Mahameed Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun, Rana Shahout, Saeed Mahameed, Linux Netdev List On 11/16/2016 03:30 PM, Saeed Mahameed wrote: > On Wed, Nov 16, 2016 at 3:54 PM, Daniel Borkmann <daniel@iogearbox.net> wrote: >> On 11/16/2016 01:25 PM, Saeed Mahameed wrote: >>> On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> >>> wrote: >>>> >>>> There are multiple issues in mlx5e_xdp_set(): >>>> >>>> 1) The batched bpf_prog_add() is currently not checked for errors! When >>>> doing so, it should be done at an earlier point in time to makes sure >>>> that we cannot fail anymore at the time we want to set the program >>>> for >>>> each channel. This only means that we have to undo the bpf_prog_add() >>>> in case we return due to required reset. >>>> >>>> 2) When swapping the priv->xdp_prog, then no extra reference count must >>>> be >>>> taken since we got that from call path via dev_change_xdp_fd() >>>> already. >>>> Otherwise, we'd never be able to free the program. Also, >>>> bpf_prog_add() >>>> without checking the return code could fail. >>>> >>>> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") >>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> >>>> --- >>>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 25 >>>> ++++++++++++++++++----- >>>> 1 file changed, 20 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> index ab0c336..cf26672 100644 >>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> @@ -3142,6 +3142,17 @@ static int mlx5e_xdp_set(struct net_device >>>> *netdev, struct bpf_prog *prog) >>>> goto unlock; >>>> } >>>> >>>> + if (prog) { >>>> + /* num_channels is invariant here, so we can take the >>>> + * batched reference right upfront. >>>> + */ >>>> + prog = bpf_prog_add(prog, priv->params.num_channels); >>>> + if (IS_ERR(prog)) { >>>> + err = PTR_ERR(prog); >>>> + goto unlock; >>>> + } >>>> + } >>>> + >>> >>> With this approach you will end up taking a ref count twice per RQ! on >>> the first time xdp_set is called i.e (old_prog = NULL, prog != NULL). >>> One ref will be taken per RQ/Channel from the above code, and since >>> reset will be TRUE mlx5e_open_locked will be called and another ref >>> count will be taken on mlx5e_create_rq. >>> >>> The issue here is that we have two places for ref count accounting, >>> xdp_set and mlx5e_create_rq. Having ref-count updates in >>> mlx5e_create_rq is essential for num_channels configuration changes >>> (mlx5e_set_ringparam). >>> >>> Our previous approach made sure that only one path will do the ref >>> counting (mlx5e_open_locked vs. mlx5e_xdp_set batch ref update when >>> reset == false). >> >> That is correct, for a short time bpf_prog_add() was charged also when >> we reset. When we reset, we will then jump to unlock_put and safely undo >> this since we took ref from mlx5e_create_rq() already in that case, and >> return successfully. That was intentional, so that eventually we end up >> just taking a /single/ ref per RQ when we exit mlx5e_xdp_set(), but more >> below ... >> >> [...] >>> >>> 2. Keep the current approach and make sure to not update the ref count >>> twice, you can batch update only if (!reset && was_open) otherwise you >>> can rely on mlx5e_open_locked to take the num_channels ref count for >>> you. >>> >>> Personally I prefer option 2, since we will keep the current logic >>> which will allow configuration updates such as (mlx5e_set_ringparam) >>> to not worry about ref counting since it will be done in the reset >>> flow. >> >> ... agree on keeping current approach. I actually like the idea, so we'd >> end up with this simpler version for the batched ref then. > > Great :). > > So let's do the batched update only if we are not going to reset (we > already know that in advance), instead of the current patch where you > batch update unconditionally and then > unlock_put in case reset was performed (which is just redundant and confusing). > > Please note that if open_locked fails you need to goto unlock_put. Sorry, I'm not quite sure I follow you here; are you /now/ commenting on the original patch or on the already updated diff I did from my very last email, that is, http://patchwork.ozlabs.org/patch/695564/ ? >> Note, your "bpf_prog_add(prog, 1) // one ref for the device." is not needed >> since this we already got that one through dev_change_xdp_fd() as mentioned. > > If it is not needed then why we need bpf_prog_put on mlx5e_nic_cleanup > in your next patch? this doesn't look symmetric (right) ! > you shouldn't release a reference you didn't take. > Overall with this series the driver can take num_channels refs and > will release num_channels refs on mlx5e_close. we shouldn't release > one extra ref on NIC cleanup. I already explained this in the commit description; when dev_change_xdp_fd() is called and sees a valid fd, it does bpf_prog_get_type(), which calls the __bpf_prog_get() taking a ref on the program (bpf_prog_inc()). That is then passed down to ops->ndo_xdp(). Only in case of error from the ->ndo_xdp() callback, we bpf_prog_put() this reference from dev_change_xdp_fd() side. For drivers that implement against this ndo, it means that you need N-1 refs in addition. Have a look at the other two in-tree users, which do it correctly, that is mlx4_xdp_set() and nfp_net_xdp_setup(). It's documented here (include/linux/netdevice.h) with ndo_xdp referring to it: /* These structures hold the attributes of xdp state that are being passed * to the netdevice through the xdp op. */ enum xdp_netdev_command { /* Set or clear a bpf program used in the earliest stages of packet * rx. The prog will have been loaded as BPF_PROG_TYPE_XDP. The callee * is responsible for calling bpf_prog_put on any old progs that are * stored. In case of error, the callee need not release the new prog * reference, but on success it takes ownership and must bpf_prog_put * when it is no longer used. */ XDP_SETUP_PROG, [...] }; I think reason why this is rather confusing would be that initially, it was just a single prog for all queues, but it was requested along the way to move prog pointer down to queues instead and let them have their own ref, so that some time in future individual progs from a subset of the queues can be exchanged. Since the way xdp in mlx5 is implemented, you currently have the priv->xdp_prog as the control prog. That's okay right now, but requires to drop the last ref on priv->xdp_prog via bpf_prog_put() when netdev is dismantled and priv->xdp_prog still present. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set 2016-11-16 15:26 ` Daniel Borkmann @ 2016-11-17 9:48 ` Saeed Mahameed 2016-11-17 20:03 ` Daniel Borkmann 0 siblings, 1 reply; 15+ messages in thread From: Saeed Mahameed @ 2016-11-17 9:48 UTC (permalink / raw) To: Daniel Borkmann Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, Zhiyi Sun, Rana Shahout, Saeed Mahameed, Linux Netdev List On Wed, Nov 16, 2016 at 5:26 PM, Daniel Borkmann <daniel@iogearbox.net> wrote: > On 11/16/2016 03:30 PM, Saeed Mahameed wrote: >> >> On Wed, Nov 16, 2016 at 3:54 PM, Daniel Borkmann <daniel@iogearbox.net> >> wrote: >>> >>> On 11/16/2016 01:25 PM, Saeed Mahameed wrote: >>>> >>>> On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> >>>> wrote: >>>>> >>>>> >>>>> There are multiple issues in mlx5e_xdp_set(): >>>>> >>>>> 1) The batched bpf_prog_add() is currently not checked for errors! When >>>>> doing so, it should be done at an earlier point in time to makes >>>>> sure >>>>> that we cannot fail anymore at the time we want to set the program >>>>> for >>>>> each channel. This only means that we have to undo the >>>>> bpf_prog_add() >>>>> in case we return due to required reset. >>>>> >>>>> 2) When swapping the priv->xdp_prog, then no extra reference count must >>>>> be >>>>> taken since we got that from call path via dev_change_xdp_fd() >>>>> already. >>>>> Otherwise, we'd never be able to free the program. Also, >>>>> bpf_prog_add() >>>>> without checking the return code could fail. >>>>> >>>>> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs >>>>> support") >>>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> >>>>> --- >>>>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 25 >>>>> ++++++++++++++++++----- >>>>> 1 file changed, 20 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>>> index ab0c336..cf26672 100644 >>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>>> @@ -3142,6 +3142,17 @@ static int mlx5e_xdp_set(struct net_device >>>>> *netdev, struct bpf_prog *prog) >>>>> goto unlock; >>>>> } >>>>> >>>>> + if (prog) { >>>>> + /* num_channels is invariant here, so we can take the >>>>> + * batched reference right upfront. >>>>> + */ >>>>> + prog = bpf_prog_add(prog, priv->params.num_channels); >>>>> + if (IS_ERR(prog)) { >>>>> + err = PTR_ERR(prog); >>>>> + goto unlock; >>>>> + } >>>>> + } >>>>> + >>>> >>>> >>>> With this approach you will end up taking a ref count twice per RQ! on >>>> the first time xdp_set is called i.e (old_prog = NULL, prog != NULL). >>>> One ref will be taken per RQ/Channel from the above code, and since >>>> reset will be TRUE mlx5e_open_locked will be called and another ref >>>> count will be taken on mlx5e_create_rq. >>>> >>>> The issue here is that we have two places for ref count accounting, >>>> xdp_set and mlx5e_create_rq. Having ref-count updates in >>>> mlx5e_create_rq is essential for num_channels configuration changes >>>> (mlx5e_set_ringparam). >>>> >>>> Our previous approach made sure that only one path will do the ref >>>> counting (mlx5e_open_locked vs. mlx5e_xdp_set batch ref update when >>>> reset == false). >>> >>> >>> That is correct, for a short time bpf_prog_add() was charged also when >>> we reset. When we reset, we will then jump to unlock_put and safely undo >>> this since we took ref from mlx5e_create_rq() already in that case, and >>> return successfully. That was intentional, so that eventually we end up >>> just taking a /single/ ref per RQ when we exit mlx5e_xdp_set(), but more >>> below ... >>> >>> [...] >>>> >>>> >>>> 2. Keep the current approach and make sure to not update the ref count >>>> twice, you can batch update only if (!reset && was_open) otherwise you >>>> can rely on mlx5e_open_locked to take the num_channels ref count for >>>> you. >>>> >>>> Personally I prefer option 2, since we will keep the current logic >>>> which will allow configuration updates such as (mlx5e_set_ringparam) >>>> to not worry about ref counting since it will be done in the reset >>>> flow. >>> >>> >>> ... agree on keeping current approach. I actually like the idea, so we'd >>> end up with this simpler version for the batched ref then. >> >> >> Great :). >> >> So let's do the batched update only if we are not going to reset (we >> already know that in advance), instead of the current patch where you >> batch update unconditionally and then >> unlock_put in case reset was performed (which is just redundant and >> confusing). >> >> Please note that if open_locked fails you need to goto unlock_put. > > > Sorry, I'm not quite sure I follow you here; are you /now/ commenting on > the original patch or on the already updated diff I did from my very last > email, that is, http://patchwork.ozlabs.org/patch/695564/ ? > I am commenting on this patch "V2 2/4". this patch will take batched ref count unconditionally and release the extra refs "unlock_put" if reset was performed. I am asking to take the batched ref only if we are not going to reset in advance to save the extra "unlock_put" >>> Note, your "bpf_prog_add(prog, 1) // one ref for the device." is not >>> needed >>> since this we already got that one through dev_change_xdp_fd() as >>> mentioned. >> >> >> If it is not needed then why we need bpf_prog_put on mlx5e_nic_cleanup >> in your next patch? this doesn't look symmetric (right) ! >> you shouldn't release a reference you didn't take. >> Overall with this series the driver can take num_channels refs and >> will release num_channels refs on mlx5e_close. we shouldn't release >> one extra ref on NIC cleanup. > > > I already explained this in the commit description; when dev_change_xdp_fd() > is called and sees a valid fd, it does bpf_prog_get_type(), which calls the > __bpf_prog_get() taking a ref on the program (bpf_prog_inc()). That is then > passed down to ops->ndo_xdp(). Only in case of error from the ->ndo_xdp() > callback, we bpf_prog_put() this reference from dev_change_xdp_fd() side. > > For drivers that implement against this ndo, it means that you need N-1 refs > in addition. Have a look at the other two in-tree users, which do it > correctly, > that is mlx4_xdp_set() and nfp_net_xdp_setup(). > > It's documented here (include/linux/netdevice.h) with ndo_xdp referring to > it: > > /* These structures hold the attributes of xdp state that are being passed > * to the netdevice through the xdp op. > */ > enum xdp_netdev_command { > /* Set or clear a bpf program used in the earliest stages of packet > * rx. The prog will have been loaded as BPF_PROG_TYPE_XDP. The > callee > * is responsible for calling bpf_prog_put on any old progs that are > * stored. In case of error, the callee need not release the new > prog > * reference, but on success it takes ownership and must > bpf_prog_put > * when it is no longer used. > */ > XDP_SETUP_PROG, > [...] > }; > > I think reason why this is rather confusing would be that initially, it was > just a single prog for all queues, but it was requested along the way to > move > prog pointer down to queues instead and let them have their own ref, so that > some time in future individual progs from a subset of the queues can be > exchanged. > > Since the way xdp in mlx5 is implemented, you currently have the > priv->xdp_prog > as the control prog. That's okay right now, but requires to drop the last > ref > on priv->xdp_prog via bpf_prog_put() when netdev is dismantled and > priv->xdp_prog > still present. Got it, but i would rather make the stack cleanup those refs once it receives an unregester_netdev event instead of counting on netdev to do it, as i said before, like any other resource added to the netdev by the stack (vlan/mac/etc..). but this is a general discussion of xdp API, it won't block this series :). Saeed. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set 2016-11-17 9:48 ` Saeed Mahameed @ 2016-11-17 20:03 ` Daniel Borkmann 0 siblings, 0 replies; 15+ messages in thread From: Daniel Borkmann @ 2016-11-17 20:03 UTC (permalink / raw) To: Saeed Mahameed Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, Zhiyi Sun, Rana Shahout, Saeed Mahameed, Linux Netdev List On 11/17/2016 10:48 AM, Saeed Mahameed wrote: > On Wed, Nov 16, 2016 at 5:26 PM, Daniel Borkmann <daniel@iogearbox.net> wrote: >> On 11/16/2016 03:30 PM, Saeed Mahameed wrote: >>> On Wed, Nov 16, 2016 at 3:54 PM, Daniel Borkmann <daniel@iogearbox.net> >>> wrote: >>>> On 11/16/2016 01:25 PM, Saeed Mahameed wrote: >>>>> On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> >>>>> wrote: >>>>>> >>>>>> There are multiple issues in mlx5e_xdp_set(): >>>>>> >>>>>> 1) The batched bpf_prog_add() is currently not checked for errors! When >>>>>> doing so, it should be done at an earlier point in time to makes >>>>>> sure >>>>>> that we cannot fail anymore at the time we want to set the program >>>>>> for >>>>>> each channel. This only means that we have to undo the >>>>>> bpf_prog_add() >>>>>> in case we return due to required reset. >>>>>> >>>>>> 2) When swapping the priv->xdp_prog, then no extra reference count must >>>>>> be >>>>>> taken since we got that from call path via dev_change_xdp_fd() >>>>>> already. >>>>>> Otherwise, we'd never be able to free the program. Also, >>>>>> bpf_prog_add() >>>>>> without checking the return code could fail. >>>>>> >>>>>> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs >>>>>> support") >>>>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> >>>>>> --- >>>>>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 25 >>>>>> ++++++++++++++++++----- >>>>>> 1 file changed, 20 insertions(+), 5 deletions(-) >>>>>> >>>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>>>> index ab0c336..cf26672 100644 >>>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>>>> @@ -3142,6 +3142,17 @@ static int mlx5e_xdp_set(struct net_device >>>>>> *netdev, struct bpf_prog *prog) >>>>>> goto unlock; >>>>>> } >>>>>> >>>>>> + if (prog) { >>>>>> + /* num_channels is invariant here, so we can take the >>>>>> + * batched reference right upfront. >>>>>> + */ >>>>>> + prog = bpf_prog_add(prog, priv->params.num_channels); >>>>>> + if (IS_ERR(prog)) { >>>>>> + err = PTR_ERR(prog); >>>>>> + goto unlock; >>>>>> + } >>>>>> + } >>>>>> + >>>>> >>>>> With this approach you will end up taking a ref count twice per RQ! on >>>>> the first time xdp_set is called i.e (old_prog = NULL, prog != NULL). >>>>> One ref will be taken per RQ/Channel from the above code, and since >>>>> reset will be TRUE mlx5e_open_locked will be called and another ref >>>>> count will be taken on mlx5e_create_rq. >>>>> >>>>> The issue here is that we have two places for ref count accounting, >>>>> xdp_set and mlx5e_create_rq. Having ref-count updates in >>>>> mlx5e_create_rq is essential for num_channels configuration changes >>>>> (mlx5e_set_ringparam). >>>>> >>>>> Our previous approach made sure that only one path will do the ref >>>>> counting (mlx5e_open_locked vs. mlx5e_xdp_set batch ref update when >>>>> reset == false). >>>> >>>> That is correct, for a short time bpf_prog_add() was charged also when >>>> we reset. When we reset, we will then jump to unlock_put and safely undo >>>> this since we took ref from mlx5e_create_rq() already in that case, and >>>> return successfully. That was intentional, so that eventually we end up >>>> just taking a /single/ ref per RQ when we exit mlx5e_xdp_set(), but more >>>> below ... >>>> >>>> [...] >>>>> >>>>> 2. Keep the current approach and make sure to not update the ref count >>>>> twice, you can batch update only if (!reset && was_open) otherwise you >>>>> can rely on mlx5e_open_locked to take the num_channels ref count for >>>>> you. >>>>> >>>>> Personally I prefer option 2, since we will keep the current logic >>>>> which will allow configuration updates such as (mlx5e_set_ringparam) >>>>> to not worry about ref counting since it will be done in the reset >>>>> flow. >>>> >>>> ... agree on keeping current approach. I actually like the idea, so we'd >>>> end up with this simpler version for the batched ref then. >>> >>> Great :). >>> >>> So let's do the batched update only if we are not going to reset (we >>> already know that in advance), instead of the current patch where you >>> batch update unconditionally and then >>> unlock_put in case reset was performed (which is just redundant and >>> confusing). >>> >>> Please note that if open_locked fails you need to goto unlock_put. >> >> Sorry, I'm not quite sure I follow you here; are you /now/ commenting on >> the original patch or on the already updated diff I did from my very last >> email, that is, http://patchwork.ozlabs.org/patch/695564/ ? > > I am commenting on this patch "V2 2/4". > this patch will take batched ref count unconditionally and release the > extra refs "unlock_put" if reset was performed. > I am asking to take the batched ref only if we are not going to reset > in advance to save the extra "unlock_put" Okay, sure, it's clear and we discussed about that already earlier, and my response back then was the diff in the /end/ of this email here: https://www.spinics.net/lists/netdev/msg404709.html Hence my confusion why we were back on the original patch after that. ;) Anyway, I'll get the updated set with the suggestion out tomorrow, thanks for the review! >>>> Note, your "bpf_prog_add(prog, 1) // one ref for the device." is not >>>> needed >>>> since this we already got that one through dev_change_xdp_fd() as >>>> mentioned. >>> >>> If it is not needed then why we need bpf_prog_put on mlx5e_nic_cleanup >>> in your next patch? this doesn't look symmetric (right) ! >>> you shouldn't release a reference you didn't take. >>> Overall with this series the driver can take num_channels refs and >>> will release num_channels refs on mlx5e_close. we shouldn't release >>> one extra ref on NIC cleanup. >> >> I already explained this in the commit description; when dev_change_xdp_fd() >> is called and sees a valid fd, it does bpf_prog_get_type(), which calls the >> __bpf_prog_get() taking a ref on the program (bpf_prog_inc()). That is then >> passed down to ops->ndo_xdp(). Only in case of error from the ->ndo_xdp() >> callback, we bpf_prog_put() this reference from dev_change_xdp_fd() side. >> >> For drivers that implement against this ndo, it means that you need N-1 refs >> in addition. Have a look at the other two in-tree users, which do it >> correctly, >> that is mlx4_xdp_set() and nfp_net_xdp_setup(). >> >> It's documented here (include/linux/netdevice.h) with ndo_xdp referring to >> it: >> >> /* These structures hold the attributes of xdp state that are being passed >> * to the netdevice through the xdp op. >> */ >> enum xdp_netdev_command { >> /* Set or clear a bpf program used in the earliest stages of packet >> * rx. The prog will have been loaded as BPF_PROG_TYPE_XDP. The >> callee >> * is responsible for calling bpf_prog_put on any old progs that are >> * stored. In case of error, the callee need not release the new >> prog >> * reference, but on success it takes ownership and must >> bpf_prog_put >> * when it is no longer used. >> */ >> XDP_SETUP_PROG, >> [...] >> }; >> >> I think reason why this is rather confusing would be that initially, it was >> just a single prog for all queues, but it was requested along the way to >> move >> prog pointer down to queues instead and let them have their own ref, so that >> some time in future individual progs from a subset of the queues can be >> exchanged. >> >> Since the way xdp in mlx5 is implemented, you currently have the >> priv->xdp_prog >> as the control prog. That's okay right now, but requires to drop the last >> ref >> on priv->xdp_prog via bpf_prog_put() when netdev is dismantled and >> priv->xdp_prog >> still present. > > Got it, but i would rather make the stack cleanup those refs once it > receives an unregester_netdev event instead of counting on netdev to > do it, as i said before, like any other resource added to the netdev > by the stack (vlan/mac/etc..). > > but this is a general discussion of xdp API, it won't block this series :). > > Saeed. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup 2016-11-16 0:04 [PATCH net-next v2 0/4] Couple of BPF refcount fixes for mlx5 Daniel Borkmann 2016-11-16 0:04 ` [PATCH net-next v2 1/4] bpf, mlx5: fix mlx5e_create_rq taking reference on prog Daniel Borkmann 2016-11-16 0:04 ` [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set Daniel Borkmann @ 2016-11-16 0:04 ` Daniel Borkmann 2016-11-16 12:51 ` Saeed Mahameed 2016-11-16 0:04 ` [PATCH net-next v2 4/4] bpf: add __must_check attributes to refcount manipulating helpers Daniel Borkmann 3 siblings, 1 reply; 15+ messages in thread From: Daniel Borkmann @ 2016-11-16 0:04 UTC (permalink / raw) To: davem Cc: alexei.starovoitov, bblanco, zhiyisun, ranas, saeedm, netdev, Daniel Borkmann mlx5e_xdp_set() is currently the only place where we drop reference on the prog sitting in priv->xdp_prog when it's exchanged by a new one. We also need to make sure that we eventually release that reference, for example, in case the netdev is dismantled. Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index cf26672..60fe54c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -3715,6 +3715,9 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv *priv) if (MLX5_CAP_GEN(mdev, vport_group_manager)) mlx5_eswitch_unregister_vport_rep(esw, 0); + + if (priv->xdp_prog) + bpf_prog_put(priv->xdp_prog); } static int mlx5e_init_nic_rx(struct mlx5e_priv *priv) -- 1.9.3 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup 2016-11-16 0:04 ` [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup Daniel Borkmann @ 2016-11-16 12:51 ` Saeed Mahameed 2016-11-16 15:45 ` Daniel Borkmann 0 siblings, 1 reply; 15+ messages in thread From: Saeed Mahameed @ 2016-11-16 12:51 UTC (permalink / raw) To: Daniel Borkmann Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun, Rana Shahout, Saeed Mahameed, Linux Netdev List On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> wrote: > mlx5e_xdp_set() is currently the only place where we drop reference on the > prog sitting in priv->xdp_prog when it's exchanged by a new one. We also > need to make sure that we eventually release that reference, for example, > in case the netdev is dismantled. > > Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> > --- > drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > index cf26672..60fe54c 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c > @@ -3715,6 +3715,9 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv *priv) > > if (MLX5_CAP_GEN(mdev, vport_group_manager)) > mlx5_eswitch_unregister_vport_rep(esw, 0); > + > + if (priv->xdp_prog) > + bpf_prog_put(priv->xdp_prog); > } > I thought that on unregister_netdev ndo_xdp_set will be called with NULL prog to cleanup. like any other resources (Vlans/mac_lists/ etc..), why xdp should be different ? Anyway if this is the case, I am ok with this fix, you can even send it to net (it looks like a serious leak). Saeed. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup 2016-11-16 12:51 ` Saeed Mahameed @ 2016-11-16 15:45 ` Daniel Borkmann 2016-11-16 15:55 ` Daniel Borkmann 0 siblings, 1 reply; 15+ messages in thread From: Daniel Borkmann @ 2016-11-16 15:45 UTC (permalink / raw) To: Saeed Mahameed Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun, Rana Shahout, Saeed Mahameed, Linux Netdev List On 11/16/2016 01:51 PM, Saeed Mahameed wrote: > On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> wrote: >> mlx5e_xdp_set() is currently the only place where we drop reference on the >> prog sitting in priv->xdp_prog when it's exchanged by a new one. We also >> need to make sure that we eventually release that reference, for example, >> in case the netdev is dismantled. >> >> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") >> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> >> --- >> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >> index cf26672..60fe54c 100644 >> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >> @@ -3715,6 +3715,9 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv *priv) >> >> if (MLX5_CAP_GEN(mdev, vport_group_manager)) >> mlx5_eswitch_unregister_vport_rep(esw, 0); >> + >> + if (priv->xdp_prog) >> + bpf_prog_put(priv->xdp_prog); >> } > > I thought that on unregister_netdev ndo_xdp_set will be called with > NULL prog to cleanup. like any other resources (Vlans/mac_lists/ > etc..), why xdp should be different ? > Anyway if this is the case, I am ok with this fix, you can even send > it to net (it looks like a serious leak). The only interaction with ndo_xdp() right now is dev_change_xdp_fd() and the currently a bit terse dump via rtnl_xdp_fill(). The latter only tells whether something is actually attached and will have more info in near future, but doesn't alter anything. dev_change_xdp_fd() is only triggered from user side via netlink when IFLA_XDP container attr is around, so no automatic cleanup here. This means as per documentation in enum xdp_netdev_command, that the driver has full ownership, thus needs to bpf_prog_put(). ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup 2016-11-16 15:45 ` Daniel Borkmann @ 2016-11-16 15:55 ` Daniel Borkmann 2016-11-17 9:34 ` Saeed Mahameed 0 siblings, 1 reply; 15+ messages in thread From: Daniel Borkmann @ 2016-11-16 15:55 UTC (permalink / raw) To: Saeed Mahameed Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun, Rana Shahout, Saeed Mahameed, Linux Netdev List On 11/16/2016 04:45 PM, Daniel Borkmann wrote: > On 11/16/2016 01:51 PM, Saeed Mahameed wrote: >> On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> wrote: >>> mlx5e_xdp_set() is currently the only place where we drop reference on the >>> prog sitting in priv->xdp_prog when it's exchanged by a new one. We also >>> need to make sure that we eventually release that reference, for example, >>> in case the netdev is dismantled. >>> >>> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") >>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> >>> --- >>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> index cf26672..60fe54c 100644 >>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>> @@ -3715,6 +3715,9 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv *priv) >>> >>> if (MLX5_CAP_GEN(mdev, vport_group_manager)) >>> mlx5_eswitch_unregister_vport_rep(esw, 0); >>> + >>> + if (priv->xdp_prog) >>> + bpf_prog_put(priv->xdp_prog); >>> } >> >> I thought that on unregister_netdev ndo_xdp_set will be called with >> NULL prog to cleanup. like any other resources (Vlans/mac_lists/ >> etc..), why xdp should be different ? >> Anyway if this is the case, I am ok with this fix, you can even send >> it to net (it looks like a serious leak). > > The only interaction with ndo_xdp() right now is dev_change_xdp_fd() > and the currently a bit terse dump via rtnl_xdp_fill(). The latter > only tells whether something is actually attached and will have more > info in near future, but doesn't alter anything. > > dev_change_xdp_fd() is only triggered from user side via netlink when > IFLA_XDP container attr is around, so no automatic cleanup here. This > means as per documentation in enum xdp_netdev_command, that the driver > has full ownership, thus needs to bpf_prog_put(). Note that without patch 2/4, just sending this one to net doesn't really solve anything, since there the mlx5e_xdp_set() still has the incorrect bpf_prog_add(prog, 1) around. So it's the whole series if so. I had it originally targeted at net, but Alexei suggested net-next; I don't really mind either way, so I agreed to go for net-next. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup 2016-11-16 15:55 ` Daniel Borkmann @ 2016-11-17 9:34 ` Saeed Mahameed 0 siblings, 0 replies; 15+ messages in thread From: Saeed Mahameed @ 2016-11-17 9:34 UTC (permalink / raw) To: Daniel Borkmann Cc: David S. Miller, Alexei Starovoitov, Brenden Blanco, zhiyisun, Rana Shahout, Saeed Mahameed, Linux Netdev List On Wed, Nov 16, 2016 at 5:55 PM, Daniel Borkmann <daniel@iogearbox.net> wrote: > On 11/16/2016 04:45 PM, Daniel Borkmann wrote: >> >> On 11/16/2016 01:51 PM, Saeed Mahameed wrote: >>> >>> On Wed, Nov 16, 2016 at 2:04 AM, Daniel Borkmann <daniel@iogearbox.net> >>> wrote: >>>> >>>> mlx5e_xdp_set() is currently the only place where we drop reference on >>>> the >>>> prog sitting in priv->xdp_prog when it's exchanged by a new one. We also >>>> need to make sure that we eventually release that reference, for >>>> example, >>>> in case the netdev is dismantled. >>>> >>>> Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support") >>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> >>>> --- >>>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +++ >>>> 1 file changed, 3 insertions(+) >>>> >>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> index cf26672..60fe54c 100644 >>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> @@ -3715,6 +3715,9 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv >>>> *priv) >>>> >>>> if (MLX5_CAP_GEN(mdev, vport_group_manager)) >>>> mlx5_eswitch_unregister_vport_rep(esw, 0); >>>> + >>>> + if (priv->xdp_prog) >>>> + bpf_prog_put(priv->xdp_prog); >>>> } >>> >>> >>> I thought that on unregister_netdev ndo_xdp_set will be called with >>> NULL prog to cleanup. like any other resources (Vlans/mac_lists/ >>> etc..), why xdp should be different ? >>> Anyway if this is the case, I am ok with this fix, you can even send >>> it to net (it looks like a serious leak). >> >> >> The only interaction with ndo_xdp() right now is dev_change_xdp_fd() >> and the currently a bit terse dump via rtnl_xdp_fill(). The latter >> only tells whether something is actually attached and will have more >> info in near future, but doesn't alter anything. >> >> dev_change_xdp_fd() is only triggered from user side via netlink when >> IFLA_XDP container attr is around, so no automatic cleanup here. This >> means as per documentation in enum xdp_netdev_command, that the driver >> has full ownership, thus needs to bpf_prog_put(). > > > Note that without patch 2/4, just sending this one to net doesn't really > solve anything, since there the mlx5e_xdp_set() still has the incorrect > bpf_prog_add(prog, 1) around. So it's the whole series if so. I had it > originally targeted at net, but Alexei suggested net-next; I don't really > mind either way, so I agreed to go for net-next. Ok, Thank you Daniel ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net-next v2 4/4] bpf: add __must_check attributes to refcount manipulating helpers 2016-11-16 0:04 [PATCH net-next v2 0/4] Couple of BPF refcount fixes for mlx5 Daniel Borkmann ` (2 preceding siblings ...) 2016-11-16 0:04 ` [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup Daniel Borkmann @ 2016-11-16 0:04 ` Daniel Borkmann 3 siblings, 0 replies; 15+ messages in thread From: Daniel Borkmann @ 2016-11-16 0:04 UTC (permalink / raw) To: davem Cc: alexei.starovoitov, bblanco, zhiyisun, ranas, saeedm, netdev, Daniel Borkmann Helpers like bpf_prog_add(), bpf_prog_inc(), bpf_map_inc() can fail with an error, so make sure the caller properly checks their return value and not just ignores it (which could worst-case lead to use after free). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> --- include/linux/bpf.h | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 01c1487..69d0a7f 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -233,14 +233,14 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size, struct bpf_prog *bpf_prog_get(u32 ufd); struct bpf_prog *bpf_prog_get_type(u32 ufd, enum bpf_prog_type type); -struct bpf_prog *bpf_prog_add(struct bpf_prog *prog, int i); +struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog, int i); void bpf_prog_sub(struct bpf_prog *prog, int i); -struct bpf_prog *bpf_prog_inc(struct bpf_prog *prog); +struct bpf_prog * __must_check bpf_prog_inc(struct bpf_prog *prog); void bpf_prog_put(struct bpf_prog *prog); struct bpf_map *bpf_map_get_with_uref(u32 ufd); struct bpf_map *__bpf_map_get(struct fd f); -struct bpf_map *bpf_map_inc(struct bpf_map *map, bool uref); +struct bpf_map * __must_check bpf_map_inc(struct bpf_map *map, bool uref); void bpf_map_put_with_uref(struct bpf_map *map); void bpf_map_put(struct bpf_map *map); int bpf_map_precharge_memlock(u32 pages); @@ -299,7 +299,8 @@ static inline struct bpf_prog *bpf_prog_get_type(u32 ufd, { return ERR_PTR(-EOPNOTSUPP); } -static inline struct bpf_prog *bpf_prog_add(struct bpf_prog *prog, int i) +static inline struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog, + int i) { return ERR_PTR(-EOPNOTSUPP); } @@ -311,7 +312,8 @@ static inline void bpf_prog_sub(struct bpf_prog *prog, int i) static inline void bpf_prog_put(struct bpf_prog *prog) { } -static inline struct bpf_prog *bpf_prog_inc(struct bpf_prog *prog) + +static inline struct bpf_prog * __must_check bpf_prog_inc(struct bpf_prog *prog) { return ERR_PTR(-EOPNOTSUPP); } -- 1.9.3 ^ permalink raw reply related [flat|nested] 15+ messages in thread
end of thread, other threads:[~2016-11-17 20:03 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-11-16 0:04 [PATCH net-next v2 0/4] Couple of BPF refcount fixes for mlx5 Daniel Borkmann 2016-11-16 0:04 ` [PATCH net-next v2 1/4] bpf, mlx5: fix mlx5e_create_rq taking reference on prog Daniel Borkmann 2016-11-16 0:04 ` [PATCH net-next v2 2/4] bpf, mlx5: fix various refcount issues in mlx5e_xdp_set Daniel Borkmann 2016-11-16 12:25 ` Saeed Mahameed 2016-11-16 13:54 ` Daniel Borkmann 2016-11-16 14:30 ` Saeed Mahameed 2016-11-16 15:26 ` Daniel Borkmann 2016-11-17 9:48 ` Saeed Mahameed 2016-11-17 20:03 ` Daniel Borkmann 2016-11-16 0:04 ` [PATCH net-next v2 3/4] bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup Daniel Borkmann 2016-11-16 12:51 ` Saeed Mahameed 2016-11-16 15:45 ` Daniel Borkmann 2016-11-16 15:55 ` Daniel Borkmann 2016-11-17 9:34 ` Saeed Mahameed 2016-11-16 0:04 ` [PATCH net-next v2 4/4] bpf: add __must_check attributes to refcount manipulating helpers Daniel Borkmann
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).