All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Helge Deller <deller@gmx.de>
Cc: Bart Van Assche <bvanassche@acm.org>,
	Linux SCSI List <linux-scsi@vger.kernel.org>,
	linux-aio@kvack.org, linux-parisc <linux-parisc@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: spinlock recursion in aio_complete()
Date: Tue, 23 May 2023 11:51:45 +0100	[thread overview]
Message-ID: <ZGyawdtBhNnvvTv3@shell.armlinux.org.uk> (raw)
In-Reply-To: <4d786f73-8c6f-4fd1-cdd6-42f2d59d6120@gmx.de>

On Tue, May 23, 2023 at 12:24:04PM +0200, Helge Deller wrote:
> On 5/22/23 23:22, Helge Deller wrote:
> > > > It hangs in fs/aio.c:1128, function aio_complete(), in this call:
> > > >      spin_lock_irqsave(&ctx->completion_lock, flags);
> > > 
> > > All code that I found and that obtains ctx->completion_lock disables IRQs.
> > > It is not clear to me how this spinlock can be locked recursively? Is it
> > > sure that the "spinlock recursion" report is correct?
> > 
> > Yes, it seems correct.
> > [...]
> 
> Bart, thanks to your suggestions I was able to narrow down the problem!
> 
> I got LOCKDEP working on parisc, which then reports:
> 	raw_local_irq_restore() called with IRQs enabled
> for the spin_unlock_irqrestore() in function aio_complete(), which shouldn't happen.
> 
> Finally, I found that parisc's flush_dcache_page() re-enables the IRQs
> which leads to the spinlock hang in aio_complete().
> 
> So, this is NOT a bug in aio or scsci, but we need fix in the the arch code.

You can find some of the background to this at:

https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=16ceff2d5dc9f0347ab5a08abff3f4647c2fee04

which introduced flush_dcache_mmap_lock(). It looks like Hugh had
questions over whether this should be _irqsave() rather than _irq()
but I guess at the time all callers had interrupts enabled, and
it's only recently that someone came up with the idea of calling
flush_dcache_page() with interrupts disabled.

Adding another arg to flush_dcache_mmap_lock() to save the flags
may be doable, but requires a patch that touches not only architectures
that have a private implementation, but also various code in mm/.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

WARNING: multiple messages have this Message-ID (diff)
From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Helge Deller <deller@gmx.de>
Cc: Bart Van Assche <bvanassche@acm.org>,
	Linux SCSI List <linux-scsi@vger.kernel.org>,
	linux-aio@kvack.org, linux-parisc <linux-parisc@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: spinlock recursion in aio_complete()
Date: Tue, 23 May 2023 11:51:45 +0100	[thread overview]
Message-ID: <ZGyawdtBhNnvvTv3@shell.armlinux.org.uk> (raw)
In-Reply-To: <4d786f73-8c6f-4fd1-cdd6-42f2d59d6120@gmx.de>

On Tue, May 23, 2023 at 12:24:04PM +0200, Helge Deller wrote:
> On 5/22/23 23:22, Helge Deller wrote:
> > > > It hangs in fs/aio.c:1128, function aio_complete(), in this call:
> > > >      spin_lock_irqsave(&ctx->completion_lock, flags);
> > > 
> > > All code that I found and that obtains ctx->completion_lock disables IRQs.
> > > It is not clear to me how this spinlock can be locked recursively? Is it
> > > sure that the "spinlock recursion" report is correct?
> > 
> > Yes, it seems correct.
> > [...]
> 
> Bart, thanks to your suggestions I was able to narrow down the problem!
> 
> I got LOCKDEP working on parisc, which then reports:
> 	raw_local_irq_restore() called with IRQs enabled
> for the spin_unlock_irqrestore() in function aio_complete(), which shouldn't happen.
> 
> Finally, I found that parisc's flush_dcache_page() re-enables the IRQs
> which leads to the spinlock hang in aio_complete().
> 
> So, this is NOT a bug in aio or scsci, but we need fix in the the arch code.

You can find some of the background to this at:

https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=16ceff2d5dc9f0347ab5a08abff3f4647c2fee04

which introduced flush_dcache_mmap_lock(). It looks like Hugh had
questions over whether this should be _irqsave() rather than _irq()
but I guess at the time all callers had interrupts enabled, and
it's only recently that someone came up with the idea of calling
flush_dcache_page() with interrupts disabled.

Adding another arg to flush_dcache_mmap_lock() to save the flags
may be doable, but requires a patch that touches not only architectures
that have a private implementation, but also various code in mm/.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2023-05-23 10:52 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-21  5:43 spinlock recursion in aio_complete() Helge Deller
2023-05-22 19:28 ` Bart Van Assche
2023-05-22 20:51   ` Helge Deller
2023-05-22 20:58     ` Bart Van Assche
2023-05-22 21:22       ` Helge Deller
2023-05-23 10:24         ` Helge Deller
2023-05-23 10:24           ` Helge Deller
2023-05-23 10:51           ` Russell King (Oracle) [this message]
2023-05-23 10:51             ` Russell King (Oracle)
2023-05-23 20:01             ` Helge Deller
2023-05-23 20:01               ` Helge Deller
2023-05-23 20:06               ` Bart Van Assche
2023-05-23 20:06                 ` Bart Van Assche
2023-05-23 20:12                 ` Helge Deller
2023-05-23 20:12                   ` Helge Deller
2023-05-23 20:24               ` Helge Deller
2023-05-23 20:24                 ` Helge Deller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZGyawdtBhNnvvTv3@shell.armlinux.org.uk \
    --to=linux@armlinux.org.uk \
    --cc=bvanassche@acm.org \
    --cc=deller@gmx.de \
    --cc=linux-aio@kvack.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.