From: Catalin Marinas <catalin.marinas@arm.com>
To: "Petr Tesařík" <petr@tesarici.cz>
Cc: Petr Tesarik <petrtesarik@huaweicloud.com>,
Jonathan Corbet <corbet@lwn.net>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
David Airlie <airlied@gmail.com>, Daniel Vetter <daniel@ffwll.ch>,
Christoph Hellwig <hch@lst.de>,
Marek Szyprowski <m.szyprowski@samsung.com>,
Robin Murphy <robin.murphy@arm.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
Borislav Petkov <bp@suse.de>,
Randy Dunlap <rdunlap@infradead.org>,
Damien Le Moal <damien.lemoal@opensource.wdc.com>,
Kim Phillips <kim.phillips@amd.com>,
"Steven Rostedt (Google)" <rostedt@goodmis.org>,
Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
Hans de Goede <hdegoede@redhat.com>,
Jason Gunthorpe <jgg@ziepe.ca>, Kees Cook <keescook@chromium.org>,
Thomas Gleixner <tglx@linutronix.de>,
"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
open list <linux-kernel@vger.kernel.org>,
"open list:DRM DRIVERS" <dri-devel@lists.freedesktop.org>,
"open list:DMA MAPPING HELPERS" <iommu@lists.linux.dev>,
Roberto Sassu <roberto.sassu@huawei.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: Re: [PATCH v2 RESEND 7/7] swiotlb: per-device flag if there are dynamically allocated buffers
Date: Tue, 16 May 2023 12:22:02 +0100 [thread overview]
Message-ID: <ZGNnWmw4eDsh9hBN@arm.com> (raw)
In-Reply-To: <20230516095512.3c99c35e@meshulam.tesarici.cz>
On Tue, May 16, 2023 at 09:55:12AM +0200, Petr Tesařík wrote:
> On Mon, 15 May 2023 17:28:38 +0100
> Catalin Marinas <catalin.marinas@arm.com> wrote:
> > There is another scenario to take into account on the list_del() side.
> > Let's assume that there are other elements on the list, so
> > list_empty() == false:
> >
> > P0:
> > list_del(paddr);
> > /* the memory gets freed, added to some slab or page free list */
> > WRITE_ONCE(slab_free_list, __va(paddr));
> >
> > P1:
> > paddr = __pa(READ_ONCE(slab_free_list));/* re-allocating paddr freed on P0 */
> > if (!list_empty()) { /* assuming other elements on the list */
> > /* searching the list */
> > list_for_each() {
> > if (pos->paddr) == __pa(vaddr))
> > /* match */
> > }
> > }
> >
> > On P0, you want the list update to be visible before the memory is freed
> > (and potentially reallocated on P1). An smp_wmb() on P0 would do. For
> > P1, we don't care about list_empty() as there can be other elements
> > already. But we do want any list elements reading during the search to
> > be ordered after the slab_free_list reading. The smp_rmb() you'd add for
> > the case above would suffice.
>
> Yes, but to protect against concurrent insertions/deletions, a spinlock
> is held while searching the list. The spin lock provides the necessary
> memory barriers implicitly.
Well, mostly. The spinlock acquire/release semantics ensure that
accesses within the locked region are not observed outside the
lock/unlock. But it doesn't guarantee anything about accesses outside
such region in relation to the accesses within the region. For example:
P0:
spin_lock_irqsave(&swiotlb_dyn_lock);
list_del(paddr);
spin_unlock_irqrestore(&swiotlb_dyn_lock);
/* the blah write below can be observed before list_del() above */
WRITE_ONCE(blah, paddr);
/* that's somewhat tricker but slab_free_list update can also be
* seen before list_del() above on certain architectures */
spin_lock_irqsave(&slab_lock);
WRITE_ONCE(slab_free_list, __va(paddr));
spin_unlock_irqrestore(&slab_lock);
On most architectures, the writing of the pointer to a slab structure
(assuming some spinlocks) would be ordered against the list_del() from
the swiotlb code. Apart from powerpc where the spin_unlock() is not
necessarily ordered against the subsequent spin_lock(). The architecture
selects ARCH_WEAK_RELEASE_ACQUIRE which in turns makes
smp_mb__after_unlock_lock() an smp_mb() (rather than no-op on all the
other architectures).
On arm64 we have smp_mb__after_spinlock() which ensures that memory
accesses prior to spin_lock() are not observed after accesses within the
locked region. I don't think this matters for your case but I thought
I'd mention it.
--
Catalin
WARNING: multiple messages have this Message-ID (diff)
From: Catalin Marinas <catalin.marinas@arm.com>
To: "Petr Tesařík" <petr@tesarici.cz>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
"open list:DRM DRIVERS" <dri-devel@lists.freedesktop.org>,
Kim Phillips <kim.phillips@amd.com>,
Christoph Hellwig <hch@lst.de>,
Marek Szyprowski <m.szyprowski@samsung.com>,
Petr Tesarik <petrtesarik@huaweicloud.com>,
Jonathan Corbet <corbet@lwn.net>,
Damien Le Moal <damien.lemoal@opensource.wdc.com>,
"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
Jason Gunthorpe <jgg@ziepe.ca>,
"open list:DMA MAPPING HELPERS" <iommu@lists.linux.dev>,
Borislav Petkov <bp@suse.de>, Kees Cook <keescook@chromium.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
Hans de Goede <hdegoede@redhat.com>,
"Steven Rostedt \(Google\)" <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
Thomas Zimmermann <tzimmermann@suse.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Randy Dunlap <rdunlap@infradead.org>,
Roberto Sassu <roberto.sassu@huawei.com>,
open list <linux-kernel@vger.kernel.org>,
Robin Murphy <robin.murphy@arm.com>
Subject: Re: [PATCH v2 RESEND 7/7] swiotlb: per-device flag if there are dynamically allocated buffers
Date: Tue, 16 May 2023 12:22:02 +0100 [thread overview]
Message-ID: <ZGNnWmw4eDsh9hBN@arm.com> (raw)
In-Reply-To: <20230516095512.3c99c35e@meshulam.tesarici.cz>
On Tue, May 16, 2023 at 09:55:12AM +0200, Petr Tesařík wrote:
> On Mon, 15 May 2023 17:28:38 +0100
> Catalin Marinas <catalin.marinas@arm.com> wrote:
> > There is another scenario to take into account on the list_del() side.
> > Let's assume that there are other elements on the list, so
> > list_empty() == false:
> >
> > P0:
> > list_del(paddr);
> > /* the memory gets freed, added to some slab or page free list */
> > WRITE_ONCE(slab_free_list, __va(paddr));
> >
> > P1:
> > paddr = __pa(READ_ONCE(slab_free_list));/* re-allocating paddr freed on P0 */
> > if (!list_empty()) { /* assuming other elements on the list */
> > /* searching the list */
> > list_for_each() {
> > if (pos->paddr) == __pa(vaddr))
> > /* match */
> > }
> > }
> >
> > On P0, you want the list update to be visible before the memory is freed
> > (and potentially reallocated on P1). An smp_wmb() on P0 would do. For
> > P1, we don't care about list_empty() as there can be other elements
> > already. But we do want any list elements reading during the search to
> > be ordered after the slab_free_list reading. The smp_rmb() you'd add for
> > the case above would suffice.
>
> Yes, but to protect against concurrent insertions/deletions, a spinlock
> is held while searching the list. The spin lock provides the necessary
> memory barriers implicitly.
Well, mostly. The spinlock acquire/release semantics ensure that
accesses within the locked region are not observed outside the
lock/unlock. But it doesn't guarantee anything about accesses outside
such region in relation to the accesses within the region. For example:
P0:
spin_lock_irqsave(&swiotlb_dyn_lock);
list_del(paddr);
spin_unlock_irqrestore(&swiotlb_dyn_lock);
/* the blah write below can be observed before list_del() above */
WRITE_ONCE(blah, paddr);
/* that's somewhat tricker but slab_free_list update can also be
* seen before list_del() above on certain architectures */
spin_lock_irqsave(&slab_lock);
WRITE_ONCE(slab_free_list, __va(paddr));
spin_unlock_irqrestore(&slab_lock);
On most architectures, the writing of the pointer to a slab structure
(assuming some spinlocks) would be ordered against the list_del() from
the swiotlb code. Apart from powerpc where the spin_unlock() is not
necessarily ordered against the subsequent spin_lock(). The architecture
selects ARCH_WEAK_RELEASE_ACQUIRE which in turns makes
smp_mb__after_unlock_lock() an smp_mb() (rather than no-op on all the
other architectures).
On arm64 we have smp_mb__after_spinlock() which ensures that memory
accesses prior to spin_lock() are not observed after accesses within the
locked region. I don't think this matters for your case but I thought
I'd mention it.
--
Catalin
next prev parent reply other threads:[~2023-05-16 11:22 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-09 9:18 [PATCH v2 RESEND 0/7] Allow dynamic allocation of software IO TLB bounce buffers Petr Tesarik
2023-05-09 9:18 ` Petr Tesarik
2023-05-09 9:18 ` [PATCH v2 RESEND 1/7] swiotlb: Use a helper to initialize swiotlb fields in struct device Petr Tesarik
2023-05-09 9:18 ` Petr Tesarik
2023-05-09 9:18 ` [PATCH v2 RESEND 2/7] swiotlb: Move code around in preparation for dynamic bounce buffers Petr Tesarik
2023-05-09 9:18 ` Petr Tesarik
2023-05-09 9:18 ` [PATCH v2 RESEND 3/7] dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute Petr Tesarik
2023-05-09 9:18 ` Petr Tesarik
2023-05-09 9:18 ` [PATCH v2 RESEND 4/7] swiotlb: Dynamically allocated bounce buffers Petr Tesarik
2023-05-09 9:18 ` Petr Tesarik
2023-05-15 19:43 ` Michael Kelley (LINUX)
2023-05-15 19:43 ` Michael Kelley (LINUX)
2023-05-16 6:13 ` Christoph Hellwig
2023-05-16 6:39 ` Petr Tesařík
2023-05-16 6:39 ` Petr Tesařík
2023-05-16 17:59 ` Catalin Marinas
2023-05-16 17:59 ` Catalin Marinas
2023-05-17 6:35 ` Petr Tesařík
2023-05-17 6:35 ` Petr Tesařík
2023-05-17 6:56 ` Christoph Hellwig
2023-05-17 7:32 ` Petr Tesařík
2023-05-17 7:32 ` Petr Tesařík
2023-05-17 9:41 ` Catalin Marinas
2023-05-17 9:41 ` Catalin Marinas
2023-05-17 9:58 ` Petr Tesařík
2023-05-17 9:58 ` Petr Tesařík
2023-05-17 11:08 ` Catalin Marinas
2023-05-17 11:08 ` Catalin Marinas
2023-05-17 11:27 ` Petr Tesařík
2023-05-17 11:27 ` Petr Tesařík
2023-05-23 9:54 ` Catalin Marinas
2023-05-23 9:54 ` Catalin Marinas
2023-05-23 11:53 ` Petr Tesařík
2023-05-23 11:53 ` Petr Tesařík
2023-05-16 6:16 ` Petr Tesařík
2023-05-16 6:16 ` Petr Tesařík
2023-05-09 9:18 ` [PATCH v2 RESEND 5/7] swiotlb: Add a boot option to enable dynamic " Petr Tesarik
2023-05-09 9:18 ` Petr Tesarik
2023-05-09 9:18 ` [PATCH v2 RESEND 6/7] drm: Use DMA_ATTR_MAY_SLEEP from process context Petr Tesarik
2023-05-09 9:18 ` Petr Tesarik
2023-05-09 9:18 ` [PATCH v2 RESEND 7/7] swiotlb: per-device flag if there are dynamically allocated buffers Petr Tesarik
2023-05-09 9:18 ` Petr Tesarik
2023-05-14 18:54 ` Catalin Marinas
2023-05-14 18:54 ` Catalin Marinas
2023-05-15 8:48 ` Petr Tesařík
2023-05-15 8:48 ` Petr Tesařík
2023-05-15 10:00 ` Petr Tesařík
2023-05-15 10:00 ` Petr Tesařík
2023-05-15 16:28 ` Catalin Marinas
2023-05-15 16:28 ` Catalin Marinas
2023-05-16 7:55 ` Petr Tesařík
2023-05-16 7:55 ` Petr Tesařík
2023-05-16 11:22 ` Catalin Marinas [this message]
2023-05-16 11:22 ` Catalin Marinas
[not found] ` <20230515104737.2c4c05db@meshulam.tesarici.cz>
[not found] ` <ZGH9v2KWJWZnKvxP@arm.com>
2023-05-15 10:47 ` Petr Tesařík
2023-05-15 10:47 ` Petr Tesařík
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZGNnWmw4eDsh9hBN@arm.com \
--to=catalin.marinas@arm.com \
--cc=airlied@gmail.com \
--cc=andriy.shevchenko@linux.intel.com \
--cc=bp@suse.de \
--cc=corbet@lwn.net \
--cc=damien.lemoal@opensource.wdc.com \
--cc=daniel@ffwll.ch \
--cc=dri-devel@lists.freedesktop.org \
--cc=gregkh@linuxfoundation.org \
--cc=hch@lst.de \
--cc=hdegoede@redhat.com \
--cc=iommu@lists.linux.dev \
--cc=jgg@ziepe.ca \
--cc=keescook@chromium.org \
--cc=kim.phillips@amd.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=m.szyprowski@samsung.com \
--cc=maarten.lankhorst@linux.intel.com \
--cc=mripard@kernel.org \
--cc=paulmck@kernel.org \
--cc=petr@tesarici.cz \
--cc=petrtesarik@huaweicloud.com \
--cc=rafael@kernel.org \
--cc=rdunlap@infradead.org \
--cc=roberto.sassu@huawei.com \
--cc=robin.murphy@arm.com \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=tzimmermann@suse.de \
--cc=wangkefeng.wang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.