From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.kernel.org ([198.145.29.99]:58612 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727693AbgKBRh2 (ORCPT ); Mon, 2 Nov 2020 12:37:28 -0500 Date: Mon, 2 Nov 2020 12:37:21 -0500 From: Steven Rostedt Subject: Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion Message-ID: <20201102123721.4fcce2cb@gandalf.local.home> In-Reply-To: <20201102164147.GJ20201@alley> References: <20201030213142.096102821@goodmis.org> <20201030214014.801706340@goodmis.org> <20201102164147.GJ20201@alley> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit List-ID: To: Petr Mladek Cc: linux-kernel@vger.kernel.org, Masami Hiramatsu , Andrew Morton , Peter Zijlstra , Ingo Molnar , Josh Poimboeuf , Jiri Kosina , Miroslav Benes , Jonathan Corbet , Guo Ren , "James E.J. Bottomley" , Helge Deller , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Heiko Carstens , Vasily Gorbik , Christian Borntraeger , Thomas Gleixner , Borislav Petkov , x86@kernel.org, "H. Peter Anvin" , Kees Cook , Anton Vorontsov , Colin Cross , Tony Luck , Joe Lawrence , Kamalesh Babulal , Mauro Carvalho Chehab , Sebastian Andrzej Siewior , linux-doc@vger.kernel.org, linux-csky@vger.kernel.org, linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, live-patching@vger.kernel.org On Mon, 2 Nov 2020 17:41:47 +0100 Petr Mladek wrote: > > + i = atomic_read(&nr_records); > > + smp_mb__after_atomic(); > > + if (i < 0) > > + cmpxchg(&recursed_functions[index].ip, ip, 0); > > + else if (i <= index) > > + atomic_cmpxchg(&nr_records, i, index + 1); > > This looks weird. It would shift nr_records past the record added > in this call. It might skip many slots that were zeroed when clearing. > Also we do not know if our entry was not zeroed as well. nr_records always holds the next position to write to. index = nr_records; recursed_functions[index].ip = ip; nr_records++; Before clearing, we have: nr_records = -1; smp_mb(); memset(recursed_functions, 0); smp_wmb(); nr_records = 0; When we enter this function: i = nr_records; smp_mb(); if (i < 0) return; Thus, we just stopped all new updates while clearing the records. But what about if something is currently updating? i = nr_records; smp_mb(); if (i < 0) cmpxchg(recursed_functions, ip, 0); The above shows that if the current updating process notices that the clearing happens, it will clear the function it added. else if (i <= index) cmpxchg(nr_records, i, index + 1); This makes sure that nr_records only grows if it is greater or equal to zero. The only race that I see that can happen, is the one in the comment I showed. And that is after enabling the recursed functions again after clearing, one CPU could add a function while another CPU that just added that same function could be just exiting this routine, notice that a clearing of the array happened, and remove its function (which was the same as the one just happened). So we get a "zero" in the array. If this happens, it is likely that that function will recurse again and will be added later. -- Steve