All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: "Håkon Bugge" <haakon.bugge@oracle.com>
Cc: Yishai Hadas <yishaih@mellanox.com>,
	OFED mailing list <linux-rdma@vger.kernel.org>
Subject: Re: [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del
Date: Mon, 20 Jan 2020 18:27:51 +0200	[thread overview]
Message-ID: <20200120162751.GJ51881@unreal> (raw)
In-Reply-To: <71091F31-3206-4654-854B-B751F9A96BBC@oracle.com>

On Mon, Jan 20, 2020 at 04:49:56PM +0100, Håkon Bugge wrote:
>
>
> > On 20 Jan 2020, at 16:06, Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Fri, Jan 17, 2020 at 02:56:22PM +0100, Håkon Bugge wrote:
> >> Using CX-3 virtual functions, either from a bare-metal machine or
> >> pass-through from a VM, MAD packets are proxied through the PF driver.
> >>
> >> Since the VF drivers have separate name spaces for MAD Transaction Ids
> >> (TIDs), the PF driver has to re-map the TIDs and keep the book keeping
> >> in a cache.
> >>
> >> Following the RDMA Connection Manager (CM) protocol, it is clear when
> >> an entry has to evicted from the cache. When a DREP is sent from
> >> mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar
> >> when a REJ is received by the mlx4_ib_demux_cm_handler(),
> >> id_map_find_del() is called.
> >>
> >> This function wipes out the TID in use from the IDR or XArray and
> >> removes the id_map_entry from the table.
> >>
> >> In short, it does everything except the topping of the cake, which is
> >> to remove the entry from the list and free it. In other words, for the
> >> DREP and REJ cases enumerated above, both will leak one id_map_entry.
> >>
> >> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
> >> ---
> >> drivers/infiniband/hw/mlx4/cm.c | 7 ++++++-
> >> 1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
> >> index ecd6cadd529a..1df6d3ccfc62 100644
> >> --- a/drivers/infiniband/hw/mlx4/cm.c
> >> +++ b/drivers/infiniband/hw/mlx4/cm.c
> >> @@ -197,8 +197,13 @@ static void id_map_find_del(struct ib_device *ibdev, int pv_cm_id)
> >> 	if (!ent)
> >> 		goto out;
> >> 	found_ent = id_map_find_by_sl_id(ibdev, ent->slave_id, ent->sl_cm_id);
> >> -	if (found_ent && found_ent == ent)
> >> +	if (found_ent && found_ent == ent) {
> >> 		rb_erase(&found_ent->node, sl_id_map);
> >> +		if (!ent->scheduled_delete) {
> >
> > Why do we need to check scheduled_delete?
>
> 1. Node receives a DREQ and mlx4_ib_demux_cm_handler() is called, which again calls schedule_delayed(), which sets scheduled_delete.
>
> 2. DREQ is proxied over to the VM, which replies with a DREP.
>
> 3. The DREP is proxied over to the PF driver, mlx4_ib_multiplex_cm_handler() is called, id_map_find_del() is called. If it is freed now (without checking scheduled_delete), it will be a double free when the delayed work kicks in.

It will be the case if we don't cancel delayed work inside
id_map_find_del(), but it raises other question. Why do we need two
identical delete functions? Can we convert id_map_find_del() callers
to use id_map_ent_timeout() instead?

Thanks

>
> But this documents that the commit message is not accurate, it is only the REJ case that has a leak.
>
> > Isn't this to mark call to timeout cleanup (id_map_ent_timeout), which
> > can't race with id_map_find_del()? They both hold the same spinlock.
>
> No race, but it can be set as per the above.
>
> If you agree, I will send a v2 with corrected commit message.
>
>
> Thxs, Håkon
>
>
> >
> > Thanks
> >
> >> +			list_del(&ent->list);
> >> +			kfree(ent);
> >> +		}
> >> +	}
> >> out:
> >> 	spin_unlock(&sriov->id_map_lock);
> >> }
> >> --
> >> 2.20.1
> >>
>

  reply	other threads:[~2020-01-20 16:27 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-17 13:56 [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del Håkon Bugge
2020-01-20 14:36 ` Håkon Bugge
2020-01-20 15:06 ` Leon Romanovsky
2020-01-20 15:49   ` Håkon Bugge
2020-01-20 16:27     ` Leon Romanovsky [this message]
2020-01-21 14:26       ` Håkon Bugge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200120162751.GJ51881@unreal \
    --to=leon@kernel.org \
    --cc=haakon.bugge@oracle.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=yishaih@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.