From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matan Barak
Subject: Re: [PATCH rdma-cm] IB/core: Fix use after free of ifa
Date: Tue, 20 Oct 2015 17:50:30 +0300
Message-ID: <562654B6.8090501@mellanox.com>
References: <1444910463-5688-1-git-send-email-matanb@mellanox.com>
 <1444910463-5688-2-git-send-email-matanb@mellanox.com>
 <561FE452.3050304@redhat.com> <5624E0AE.8050702@redhat.com>
 <5624FC13.1090200@mellanox.com> <56250BD6.2050503@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <56250BD6.2050503-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Doug Ledford
Cc: linux-rdma, Or Gerlitz, Jason Gunthorpe, Eran Ben Elisha
List-Id: linux-rdma@vger.kernel.org

On 10/19/2015 6:27 PM, Doug Ledford wrote:
> On 10/19/2015 10:20 AM, Matan Barak wrote:
>> On 10/19/2015 3:23 PM, Doug Ledford wrote:
>
>>> This is a major cause of the slowness. Unless you have a specific need
>>> of them, per-entry rwlocks are *NOT* a good idea. I was going to bring
>>> this up separately, so I'll just mention it here. Per entry locks help
>>> reduce contention when you have lots of multiple accessors. Using
>>> rwlocks helps reduce contention when you have a read-mostly entry that
>>> is only occasionally changed. But every lock and every unlock (just
>>> like every atomic access) still requires a locked memory cycle. That
>>> means every lock acquire and every lock release requires a
>>> synchronization event between all CPUs. Using per-entry locks means
>>> that every entry triggers two of those synchronization events. On
>>> modern Intel CPUs, they cost about 32 cycles per event. If you are
>>> going to do something, and it can't be done without a lock, then grab
>>> a single lock and do it as fast as you can. Only in rare cases would
>>> you want per-entry locks.
>>>
>>
>> I agree that every rwlock costs us locked access.
>> However, let's look at
>> the common scenario here.
>
> No, let's not. Especially if your "common scenario" causes you to
> ignore pathological bad behavior. Optimizing for the common case is not
> an excuse to enable pathological behavior.
>
>> I think that in production stable systems, the
>> IPs our rdma-cm stack uses (and thus GIDs) should be pretty stable.
>
> Probably true.
>
>> Thus, most accesses should be read calls.
>
> Probably true.
>
>> That's why IMHO read-write
>> access makes sense here.
>
> Probably true.
>
>> Regarding single-lock vs per entry lock, it really depends on how often
>> an entry could be updated while another entry is being used by an
>> application.
>
> No, it depends on more than that. In addition to the frequency of
> updates versus lookups, there is also the issue of lookup cost and
> update cost. Using per entry locks makes both the lookup cost and the
> update cost of anything other than the first gid in the table grow
> exponentially. A good way to demonstrate this would be to create 100
> vlans on a RoCE device, then run the cmtime test between two hosts using
> the 100th vlan IP at each end. Using per-entry locks, the performance
> of this test relative to using the first IP on the device will be
> pathologically bad.
>

That's correct, we "waste" #entries * 2 locked operations when scanning
the table.