From mboxrd@z Thu Jan 1 00:00:00 1970 From: Heiko Carstens Date: Fri, 29 Apr 2022 20:46:36 +0200 Subject: [PATCH 13/30] s390/consoles: Improve panic notifiers reliability In-Reply-To: <20220427224924.592546-14-gpiccoli@igalia.com> References: <20220427224924.592546-1-gpiccoli@igalia.com> <20220427224924.592546-14-gpiccoli@igalia.com> Message-ID: List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: kexec@lists.infradead.org On Wed, Apr 27, 2022 at 07:49:07PM -0300, Guilherme G. Piccoli wrote: > Currently many console drivers for s390 rely on panic/reboot notifiers > to invoke callbacks on these events. The panic() function disables local > IRQs, secondary CPUs and preemption, so callbacks invoked on panic are > effectively running in atomic context. > > Happens that most of these console callbacks from s390 doesn't take the > proper care with regards to atomic context, like taking spinlocks that > might be taken in other function/CPU and hence will cause a lockup > situation. > > The goal for this patch is to improve the notifiers reliability, acting > on 4 console drivers, as detailed below: > > (1) con3215: changed a regular spinlock to the trylock alternative. > > (2) con3270: also changed a regular spinlock to its trylock counterpart, > but here we also have another problem: raw3270_activate_view() takes a > different spinlock. So, we worked a helper to validate if this other lock > is safe to acquire, and if so, raw3270_activate_view() should be safe. > > Notice though that there is a functional change here: it's now possible > to continue the notifier code [reaching con3270_wait_write() and > con3270_rebuild_update()] without executing raw3270_activate_view(). > > (3) sclp: a global lock is used heavily in the functions called from > the notifier, so we added a check here - if the lock is taken already, > we just bail-out, preventing the lockup. > > (4) sclp_vt220: same as (3), a lock validation was added to prevent the > potential lockup problem. > > Besides (1)-(4), we also removed useless void functions, adding the > code called from the notifier inside its own body, and changed the > priority of such notifiers to execute late, since they are "heavyweight" > for the panic environment, so we aim to reduce risks here. > Changed return values to NOTIFY_DONE as well, the standard one. > > Cc: Alexander Gordeev > Cc: Christian Borntraeger > Cc: Heiko Carstens > Cc: Sven Schnelle > Cc: Vasily Gorbik > Signed-off-by: Guilherme G. Piccoli > --- > > As a design choice, the option used here to verify a given spinlock is taken > was the function "spin_is_locked()" - but we noticed that it is not often used. > An alternative would to take the lock with a spin_trylock() and if it succeeds, > just release the spinlock and continue the code. But that seemed weird... > > Also, we'd like to ask a good validation of case (2) potential functionality > change from the s390 console experts - far from expert here, and in our naive > code observation, that seems fine, but that analysis might be missing some > corner case. > > Thanks in advance! > > drivers/s390/char/con3215.c | 36 +++++++++++++++-------------- > drivers/s390/char/con3270.c | 34 +++++++++++++++------------ > drivers/s390/char/raw3270.c | 18 +++++++++++++++ > drivers/s390/char/raw3270.h | 1 + > drivers/s390/char/sclp_con.c | 28 +++++++++++++---------- > drivers/s390/char/sclp_vt220.c | 42 +++++++++++++++++++--------------- > 6 files changed, 96 insertions(+), 63 deletions(-) Code looks good, and everything still seems to work. I applied this internally for the time being, and if it passes testing, I'll schedule it for the next merge window. Thanks!