From: Matthew Wilcox <willy@linux.intel.com>
To: Stephen Bates <stephen.bates@pmcs.com>
Cc: haggaie@mellanox.com, javier@cnexlabs.com,
linux-rdma@vger.kernel.org, linux-nvdimm@lists.01.org,
sagig@mellanox.com, linux-mm@kvack.org, artemyko@mellanox.com,
hch@infradead.org, leonro@mellanox.com,
jgunthorpe@obsidianresearch.com
Subject: Re: [PATCH RFC 1/1] Add support for ZONE_DEVICE IO memory with struct pages.
Date: Mon, 14 Mar 2016 17:23:44 -0400
Message-ID: <20160314212344.GC23727@linux.intel.com>
In-Reply-To: <1457979277-26791-1-git-send-email-stephen.bates@pmcs.com>

On Mon, Mar 14, 2016 at 12:14:37PM -0600, Stephen Bates wrote:
> 3. Coherency Issues. When IOMEM is written from both the CPU and a PCIe
> peer there is potential for coherency issues and for writes to occur out
> of order. This is something that users of this feature need to be
> cognizant of and may necessitate the use of CONFIG_EXPERT. Though really,
> this isn't much different than the existing situation with RDMA: if
> userspace sets up an MR for remote use, they need to be careful about
> using that memory region themselves.

There's more to the coherency problem than this. As I understand it, on
x86, memory in a PCI BAR does not participate in the coherency protocol.
So you can get a situation where CPU A stores 4 bytes to offset 8 in a
cacheline, then CPU B stores 4 bytes to offset 16 in the same cacheline,
and CPU A's write mysteriously goes missing.

I may have misunderstood the exact details when this was explained to me a
few years ago, but the details were horrible enough to run away screaming.
Pretending PCI BARs are real memory? Just Say No.
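
The lost-store scenario described above can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions, not a claim about how x86 actually treats BAR memory: it assumes a 64-byte cacheline, that each CPU keeps a private copy of the line with nothing keeping the copies in sync, and that each CPU eventually writes the whole line back. The names `cpu_cache`, `cache_fill`, `cache_store32`, `cache_writeback`, `bar_load32`, and `simulate_lost_store` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64  /* assumed cacheline size in bytes */

/* Hypothetical model: one private, non-coherent copy of a line per CPU. */
struct cpu_cache {
    uint8_t line[LINE_SIZE];
};

/* Fill a CPU's private copy from the backing "BAR" memory. */
static void cache_fill(struct cpu_cache *c, const uint8_t *bar)
{
    memcpy(c->line, bar, LINE_SIZE);
}

/* Store a 4-byte value into the private copy only; the BAR is untouched. */
static void cache_store32(struct cpu_cache *c, size_t off, uint32_t val)
{
    assert(off + sizeof(val) <= LINE_SIZE);
    memcpy(&c->line[off], &val, sizeof(val));
}

/* Write the whole line back, overwriting every byte of the line in the BAR. */
static void cache_writeback(const struct cpu_cache *c, uint8_t *bar)
{
    memcpy(bar, c->line, LINE_SIZE);
}

/* Read a 4-byte value directly from the backing memory. */
static uint32_t bar_load32(const uint8_t *bar, size_t off)
{
    uint32_t val;

    assert(off + sizeof(val) <= LINE_SIZE);
    memcpy(&val, &bar[off], sizeof(val));
    return val;
}

/* Returns 1 if CPU A's store was lost while CPU B's survived, 0 otherwise. */
int simulate_lost_store(void)
{
    uint8_t bar[LINE_SIZE] = {0};
    struct cpu_cache a, b;

    /* Both CPUs pick up the same initial contents of the line. */
    cache_fill(&a, bar);
    cache_fill(&b, bar);

    /* CPU A stores 4 bytes at offset 8, CPU B stores 4 bytes at offset 16. */
    cache_store32(&a, 8, 0xAAAAAAAAu);
    cache_store32(&b, 16, 0xBBBBBBBBu);

    /* A writes back first; B's later writeback carries stale bytes at
     * offset 8 and silently overwrites A's store. */
    cache_writeback(&a, bar);
    cache_writeback(&b, bar);

    return bar_load32(bar, 8) != 0xAAAAAAAAu &&
           bar_load32(bar, 16) == 0xBBBBBBBBu;
}
```

In this model `simulate_lost_store()` returns 1: after both writebacks, offset 8 reads back as zero while offset 16 holds CPU B's value. With a coherency protocol in play, CPU B would have had to obtain the up-to-date line before modifying it, so both stores would survive.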