From mboxrd@z Thu Jan  1 00:00:00 1970
From: Steven Rostedt
Subject: Re: [RFC+PATCH] Infiniband hfi1 + PREEMPT_RT_FULL issues
Date: Tue, 26 Sep 2017 10:56:50 -0400
Message-ID: <20170926105650.43314134@vmware.local.home>
References: <20170925144949.GP29668@kernel.org>
 <20170925211447.GL29872@jcartwri.amer.corp.natinst.com>
 <32E1700B9017364D9B60AED9960492BC3441B3BD@fmsmsx120.amr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: Julia Cartwright, Arnaldo Carvalho de Melo,
 "Dalessandro, Dennis", Thomas Gleixner, Clark Williams,
 "Luick, Dean", Doug Ledford, "Wan, Kaike", Leon Romanovsky,
 Peter Zijlstra, Sebastian Andrzej Siewior, "Sanchez, Sebastian",
 "linux-rt-users-u79uwXL29TY76Z2rM5mHXA@public.gmane.org",
 "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
To: "Marciniszyn, Mike"
In-Reply-To: <32E1700B9017364D9B60AED9960492BC3441B3BD-RjuIdWtd+YbTXloPLtfHfbfspsVTdybXVpNB7YpNyf8@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-rt-users.vger.kernel.org

On Tue, 26 Sep 2017 14:10:52 +0000
"Marciniszyn, Mike" wrote:

> > My first question would be: is disabling preemption here really
> > necessary?
>
> The motivation for the preempt manipulation has nothing to do with
> the counter.
>
> The buffer allocation returns a set of ringed write-combining 64B
> MMIO buffers. The allocate lock is then dropped right after
> allocation to allow parallel allocates.
>
> The hardware keeps track of the buffer fill across multiple CPUs, so
> that after the oldest buffer is copied the ring is advanced.
>
> The idea was to minimize the time from the drop of the allocate lock
> to the end of the copy, to ensure the highest rate of copy to the
> ring from multiple cores.
>
> > AFAICT, preemption is only disabled to protect the access of the
> > 'buffers_allocated' percpu counters, nothing else.
> >
> > However, because these counters are only observed in aggregate,
> > across CPUs (see get_buffers_allocated()), the sum will be the
> > same regardless of whether a thread is migrated between a
> > this_cpu_inc(buffers_allocated) and a
> > this_cpu_dec(buffers_allocated).
>
> The counter exists purely as a hot-path optimization to avoid an
> atomic.
>
> The only time an accurate counter is required is during reset
> sequences, to wait for allocate/copy pairs to finish, and we don't
> care if the closing decrement is on the same core.
>
> The percpu refcount might be a better choice, but I don't think it
> existed at the time the code was written.
>
> > If that's the case, perhaps the fix is as easy as removing the
> > preempt_disable() and preempt_enable()?
>
> That would certainly be the easiest solution, but would compromise
> performance if a migration occurs after the allocate lock is dropped.

local_lock() disables migration. It's currently only in the -rt patch.
It's made to allow preemption to occur as much as possible, as -rt
cares about reaction time more than performance (although we would
love to keep performance as well).

> We have been looking at the patch that was submitted for using the
> raw spin lock, and we share Arnaldo's concern, particularly about
> mixing lock types.
>
> Other locks can be acquired in the timeout case in
> sc_wait_for_packet_egress(), inside queue_work() and
> dd_dev_err()/printk().

A raw_spin_lock sacrifices reaction time, which goes against the goal
of -rt. When PREEMPT_RT is disabled, the local_lock becomes
preempt_disable(), so it has no effect on the vanilla kernel.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma"
in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html