From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: Matan Barak <matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
Cc: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
"Hefty, Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Moni Shoua <monis-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
Somnath Kotur <Somnath.Kotur-iNbyuHi0droAvxtiuMwx3w@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH for-next V5 00/12] Move RoCE GID management to IB/Core
Date: Wed, 10 Jun 2015 16:01:54 -0600
Message-ID: <20150610220154.GA4391@obsidianresearch.com>
In-Reply-To: <CAAKD3BB90iZ98B2ADG+=ZYuEVtLq26a99BEjQCR8U1vzvcG+Gw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Wed, Jun 10, 2015 at 11:19:03PM +0300, Matan Barak wrote:
> > Sure gid_type is gone, but I didn't say rocev2 specific, I said
> > latent elements. ie I'm assuming reasons for the scary locking are
> > because the ripped out rocev2 code needed it? And some of the
> > complexity that looks pointless now was supporting ripped out rocev2
> > elements? That is not necessarily bad, but the code had better be good
> > quality and working..
>
> Why do you think the locks have anything to do with roce v2?
What else could they be for? The current mlx4 driver doesn't use
aggressive performance locking.
After writing this email, I am of the opinion that the locking should
be simplified to rwsem and mutex, and every use of rcu, READ_ONCE and
seqlock should be ditched.
> > But then I look at the patches, and the very first locking I test out
> > looks wrong. I see call_rcu/synchronize_rcu being used without a
> > single call to rcu_read_lock. So this fails #2 of the RCU review
> > checklist (Seriously? Why am I catching this?)
> >
> > I stopped reading at that point.
> >
>
> Well, that's easy to explain - write_gid could be called with one of
> roce_gid_table's find API.
That doesn't explain anything.
You can't use call_rcu without also using rcu_dereference and
rcu_read_lock. It doesn't make any sense otherwise.
Your explanation seems confused too, did you research this? Did you
read the RCU checklist? Is this a knee-jerk reply? Please be thoughtful.
> find is called and returns a ndev
> write_gid is called and calls dev_put(ndev)
> ndev is freed
> find uses the ndev
Are you trying to say that this rcu is protecting this:
+static int find_gid(struct ib_roce_gid_table *table, const union ib_gid *gid,
+                    const struct ib_gid_attr *val, unsigned long mask)
+{
[..]
+               if (mask & GID_ATTR_FIND_MASK_NETDEV &&
+                   attr->ndev != val->ndev)
+                       continue;
That is an unlocked access to an RCU protected value, without
rcu_dereference. Fails two points on the RCU checklist.
Where does it return ndev?
Honestly, since RCU is done wrong, and I'm very suspicious seqlock is
done wrong too, I would *strongly* encourage v6 to have simple
read/write sem and mutex locking and nothing fancy for performance. I
don't want to go round and round on subtle performance locking for a
*cleanup patch*.
There is also this RCU confusion:
+       rcu_read_lock();
+       if (ib_dev->get_netdev)
+               idev = ib_dev->get_netdev(ib_dev, port);
When holding the rcu_read_lock it should be obvious what the RCU
protected data is. There is no way holding it around a driver call
back makes any sense.
The driver should return a held netdev or null.
.. and maybe more, I stopped looking
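
One way to read "return a held netdev or null" is sketched below in
userspace C. This is an assumption-laden illustration, not driver code:
struct netdev, struct driver_port, netdev_hold(), netdev_put() and
driver_set_netdev() are hypothetical stand-ins (the kernel equivalents of
the first two helpers would be dev_hold() and dev_put()), and a pthread
mutex plays the role of whatever lock the driver uses internally. The
callback takes the reference itself, under the driver's own lock, so the
caller needs no lock around the call.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct netdev { atomic_int refcnt; };  /* stand-in for struct net_device */

struct driver_port {
    pthread_mutex_t lock;   /* driver-private: protects the ndev pointer */
    struct netdev *ndev;    /* NULL when no netdev is bound to the port */
};

static void driver_port_init(struct driver_port *p)
{
    pthread_mutex_init(&p->lock, NULL);
    p->ndev = NULL;
}

/* Hypothetical analogues of dev_hold() and dev_put(). */
static void netdev_hold(struct netdev *nd)
{
    atomic_fetch_add(&nd->refcnt, 1);
}

static void netdev_put(struct netdev *nd)
{
    int old = atomic_fetch_sub(&nd->refcnt, 1);

    assert(old > 0);        /* catch unbalanced puts in this sketch */
}

/* Driver callback: returns a held netdev, or NULL. The reference is
 * taken while the driver lock pins the pointer, so the result stays
 * valid after the lock is dropped. The caller must netdev_put() it. */
static struct netdev *get_netdev(struct driver_port *p)
{
    struct netdev *nd;

    pthread_mutex_lock(&p->lock);
    nd = p->ndev;
    if (nd)
        netdev_hold(nd);
    pthread_mutex_unlock(&p->lock);
    return nd;
}

/* Driver side: bind or unbind a netdev; the port owns one reference. */
static void driver_set_netdev(struct driver_port *p, struct netdev *nd)
{
    struct netdev *old;

    if (nd)
        netdev_hold(nd);
    pthread_mutex_lock(&p->lock);
    old = p->ndev;
    p->ndev = nd;
    pthread_mutex_unlock(&p->lock);
    if (old)
        netdev_put(old);
}
```

With this shape the caller's contract is short: call get_netdev(), check
for NULL, use the result, then call netdev_put(). A concurrent
driver_set_netdev(p, NULL) only drops the port's own reference and leaves
the caller's held pointer valid.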
> By calling the find API in RCU, your ndev is protected.
When implementing locking, identify the data being locked, and
confirm that every possible access to that data follows the required
locking rules. In this case the data being locked is the
table->data_vec[ix].attr.ndev pointer.
It was the very first thing I checked, in the very first patch.
> > I think you've got the right basic idea for a cleanup series here. It
> > is time to buckle down and execute it well. Do an internal mellanox
> > kernel team review of this series. Audit and fix all the locking,
> > evaluate the code growth and design. Audit to confirm there is no
> > functional change that is not documented in a commit message. Tell me
> > v6 is the best effort *team Mellanox* can put forward.
>
> Jason, I really appreciate your review. If you have any comments, I
> would like to either fix or write you back. This series wasn't sent
> without being looked at by the internal team here.
Well, I am looking at this thinking I don't want to invest time in
searching for things I think your team can find on its own.
Take a breather, produce v6 very carefully.
Jason