linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Jason Gunthorpe
	<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>,
	Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: [PATCH rdma-cm] IB/core: Fix use after free of ifa
Date: Thu, 15 Oct 2015 13:37:22 -0400	[thread overview]
Message-ID: <561FE452.3050304@redhat.com> (raw)
In-Reply-To: <1444910463-5688-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 5729 bytes --]

On 10/15/2015 08:01 AM, Matan Barak wrote:
> When using ifup/ifdown while executing enum_netdev_ipv4_ips,
> ifa could become invalid and cause use after free error.
> Fixing it by protecting with RCU lock.
> 
> Fixes: 03db3a2d81e6 ('IB/core: Add RoCE GID table management')
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
> 
> Hi Doug,
> 
> This patch fixes a bug in RoCE GID table implementation. Under stress conditions
> where ifup/ifdown are used, the ifa pointer could become invalid. Using a
> RCU lock in order to avoid freeing the ifa node (as done in other inet functions
> (for example, inet_addr_onlink).
> 
> Our QA team verified that this patch fixes this issue.

This doesn't look like a good fix to me.  In particular, I think you
merely shifted the bug around, you didn't actually resolve it.

In the original code, you called update_gid_ip while holding a reference
to in_dev.  The reference to in_dev was not enough to keep the ifa list
from changing while you were doing your work.  It's not surprising that
you hit a race with the ifa list because update_gid_ip being called
synchronously can both A) sleep because of the mutexes it takes and B)
be slow because of how many locks it takes (and it can really take a lot
due to find_gid) and C) be slow again because of updating the gid table
calling into the low level driver and actually writing a mailbox command
to the card.  So, for all those reasons, not only did you hit this race,
but you were *likely* to hit this race.

Now, you've put an rcu_read_lock on ndev instead.  And you're no longer
seeing the race.  However, does taking the rcu_read_lock on ndev
actually protect the ifa list on ndev, or is the real fix the fact that
you moved update_gid_ip out of the main loop?  Before, you blocked while
processing the ifa list, making hitting your race likely.  Now you
process the ifa list very fast and build your own sin_list that is no
longer impacted by changes to the ifa list, but I don't know that the
rcu_read_lock you have taken actually makes you for sure safe here
versus the possibility that you have just made the race much harder to
hit and hidden it.

And even if the rcu_read_lock is for sure safe in terms of accessing the
ifa list, these changes may have just introduced a totally new bug that
your QE tests haven't exposed but might exist none the less.  In
particular, we have now queued up adding a bunch of gids to the ndev.
But we drop our reference to the rcu lock, then we invoke a (possibly
large number) of sleeping iterations.  What's to prevent a situation
where we get enum_netdev_ipv4_ips() called on say a vlan child interface
of a primary RoCE netdev, create our address list, release our lock,
then the user destroys our vlan device, and we race with del_netdev_ips
on the vlan device, such that del_netdev_ips completes and removes all
the gids for that netdev, but we still have backlogged gid add events in
enum_netdev_ipv4_ips and so we add back in what will become permanently
stale gids?  I don't think we hold rtnl_lock while running in
enum_netdev_ipv4_ips and that's probably the only lock that would
exclude the user from deleting the vlan device, so as far as I can tell
we can easily call del_netdev_ips while the tail end of
enum_netdev_ipv4_ips is sleeping.  Am I wrong here?  A test would be to
take whatever QE test you have that hit this bug in the first place, and
on a different terminal add a while loop of adding/removing the same
vlan interface that you are updating gids on and see if the gid table
starts filling up with stale, unremovable entries.

> Thanks,
> Matan
> 
>  drivers/infiniband/core/roce_gid_mgmt.c | 35 +++++++++++++++++++++++++--------
>  1 file changed, 27 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/infiniband/core/roce_gid_mgmt.c b/drivers/infiniband/core/roce_gid_mgmt.c
> index 6b24cba..178f984 100644
> --- a/drivers/infiniband/core/roce_gid_mgmt.c
> +++ b/drivers/infiniband/core/roce_gid_mgmt.c
> @@ -250,25 +250,44 @@ static void enum_netdev_ipv4_ips(struct ib_device *ib_dev,
>  				 u8 port, struct net_device *ndev)
>  {
>  	struct in_device *in_dev;
> +	struct sin_list {
> +		struct list_head	list;
> +		struct sockaddr_in	ip;
> +	};
> +	struct sin_list *sin_iter;
> +	struct sin_list *sin_temp;
>  
> +	LIST_HEAD(sin_list);
>  	if (ndev->reg_state >= NETREG_UNREGISTERING)
>  		return;
>  
> -	in_dev = in_dev_get(ndev);
> -	if (!in_dev)
> +	rcu_read_lock();
> +	in_dev = __in_dev_get_rcu(ndev);
> +	if (!in_dev) {
> +		rcu_read_unlock();
>  		return;
> +	}
>  
>  	for_ifa(in_dev) {
> -		struct sockaddr_in ip;
> +		struct sin_list *entry = kzalloc(sizeof(*entry), GFP_ATOMIC);
>  
> -		ip.sin_family = AF_INET;
> -		ip.sin_addr.s_addr = ifa->ifa_address;
> -		update_gid_ip(GID_ADD, ib_dev, port, ndev,
> -			      (struct sockaddr *)&ip);
> +		if (!entry) {
> +			pr_warn("roce_gid_mgmt: couldn't allocate entry for IPv4 update\n");
> +			continue;
> +		}
> +		entry->ip.sin_family = AF_INET;
> +		entry->ip.sin_addr.s_addr = ifa->ifa_address;
> +		list_add_tail(&entry->list, &sin_list);
>  	}
>  	endfor_ifa(in_dev);
> +	rcu_read_unlock();
>  
> -	in_dev_put(in_dev);
> +	list_for_each_entry_safe(sin_iter, sin_temp, &sin_list, list) {
> +		update_gid_ip(GID_ADD, ib_dev, port, ndev,
> +			      (struct sockaddr *)&sin_iter->ip);
> +		list_del(&sin_iter->list);
> +		kfree(sin_iter);
> +	}
>  }
>  
>  static void enum_netdev_ipv6_ips(struct ib_device *ib_dev,
> 


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

  parent reply	other threads:[~2015-10-15 17:37 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-15 12:01 [PATCH rdma-cm] IB/core: Fix memory corruption in ib_cache_gid_set_default_gid Matan Barak
     [not found] ` <1444910463-5688-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-15 12:01   ` [PATCH rdma-cm] IB/core: Fix use after free of ifa Matan Barak
     [not found]     ` <1444910463-5688-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-15 17:37       ` Doug Ledford [this message]
     [not found]         ` <561FE452.3050304-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-15 17:53           ` Jason Gunthorpe
     [not found]             ` <20151015175310.GA17519-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-10-18  7:51               ` Matan Barak
     [not found]                 ` <CAAKD3BCoNmHjUvAR_SuKT_AL-823_y34QyRRV3aZ=T8cw9F9gw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-19 18:26                   ` Jason Gunthorpe
2015-10-18  7:49           ` Matan Barak
     [not found]             ` <CAAKD3BBEfKTHPKyoTzMW3YESKJmGkcUkui=hjhsbyFRY+xDDEw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-19 12:23               ` Doug Ledford
     [not found]                 ` <5624E0AE.8050702-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-19 14:20                   ` Matan Barak
     [not found]                     ` <5624FC13.1090200-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-19 15:27                       ` Doug Ledford
     [not found]                         ` <56250BD6.2050503-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-10-20 14:50                           ` Matan Barak
     [not found]                             ` <562654B6.8090501-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-10-20 16:52                               ` Doug Ledford
2015-10-20 20:17       ` Doug Ledford
     [not found]         ` <5626A15E.7080800-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-11-16 13:17           ` Matan Barak
2015-10-15 16:27   ` [PATCH rdma-cm] IB/core: Fix memory corruption in ib_cache_gid_set_default_gid Doug Ledford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=561FE452.3050304@redhat.com \
    --to=dledford-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
    --cc=eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).