From: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
To: Matthew Wilcox <willy@linux.intel.com>
Cc: Stephen Bates <stephen.bates@pmcs.com>,
	linux-mm@kvack.org, linux-rdma@vger.kernel.org,
	linux-nvdimm@ml01.01.org, haggaie@mellanox.com,
	javier@cnexlabs.com, sagig@mellanox.com, leonro@mellanox.com,
	artemyko@mellanox.com, hch@infradead.org
Subject: Re: [PATCH RFC 1/1] Add support for ZONE_DEVICE IO memory with struct pages.
Date: Mon, 14 Mar 2016 15:57:08 -0600
Message-ID: <20160314215708.GA7282@obsidianresearch.com>
In-Reply-To: <20160314212344.GC23727@linux.intel.com>

On Mon, Mar 14, 2016 at 05:23:44PM -0400, Matthew Wilcox wrote:
> On Mon, Mar 14, 2016 at 12:14:37PM -0600, Stephen Bates wrote:
> > 3. Coherency Issues. When IOMEM is written from both the CPU and a PCIe
> > peer there is potential for coherency issues and for writes to occur out
> > of order. This is something that users of this feature need to be
> > cognizant of and may necessitate the use of CONFIG_EXPERT. Though really,
> > this isn't much different than the existing situation with RDMA: if
> > userspace sets up an MR for remote use, they need to be careful about
> > using that memory region themselves.
> 
> There's more to the coherency problem than this.  As I understand it, on
> x86, memory in a PCI BAR does not participate in the coherency protocol.
> So you can get a situation where CPU A stores 4 bytes to offset 8 in a
> cacheline, then CPU B stores 4 bytes to offset 16 in the same cacheline,
> and CPU A's write mysteriously goes missing.

No, this cannot happen with write combining. You need full caching turned
on to get that kind of problem.

Write combining can only combine writes; it cannot make up writes that
never existed.
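
To make the distinction concrete, here is a toy model in Python. It is
only a sketch under simplifying assumptions, not a description of any
real CPU: the class names, the 64-byte line size and the run() helper
are all made up for illustration. The non-coherent write-back cache
fills and evicts whole lines, so a later eviction can overwrite a
neighbour's store with stale bytes. The write-combining buffer only
ever emits the bytes that were actually stored.

```python
LINE_SIZE = 64  # assumed cacheline size for this toy model


class WriteBackCache:
    """Hypothetical non-coherent write-back cache: whole-line fill and evict."""

    def __init__(self, device):
        self.device = device
        self.line = None

    def store(self, offset, data):
        if self.line is None:
            # Fill: take a private copy of the whole line.
            self.line = bytearray(self.device)
        self.line[offset:offset + len(data)] = data

    def flush(self):
        if self.line is not None:
            # Evict: write the whole line back, stale bytes included.
            self.device[:] = self.line
            self.line = None


class WriteCombiningBuffer:
    """Hypothetical write-combining buffer: tracks only the bytes stored."""

    def __init__(self, device):
        self.device = device
        self.pending = {}  # byte offset -> value

    def store(self, offset, data):
        for i, value in enumerate(data):
            self.pending[offset + i] = value

    def flush(self):
        # Only bytes that were really written reach the device.
        for offset, value in self.pending.items():
            self.device[offset] = value
        self.pending.clear()


def run(buffer_type):
    """CPU A stores at offset 8, CPU B at offset 16, then both flush."""
    device = bytearray(LINE_SIZE)
    cpu_a = buffer_type(device)
    cpu_b = buffer_type(device)
    cpu_a.store(8, b"AAAA")
    cpu_b.store(16, b"BBBB")
    cpu_a.flush()
    cpu_b.flush()
    return bytes(device[8:12]), bytes(device[16:20])
```

In this model run(WriteBackCache) loses CPU A's store, because CPU B's
line was filled before A flushed and B's eviction writes zeros back over
offset 8. run(WriteCombiningBuffer) keeps both stores.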

That said, a question I don't know the answer to is how write
locking/memory barriers interact with the CPU's write combining
buffers, and whether all the fencing semantics are guaranteed. There is
some interaction there (some drivers use write combining a lot), but
that sure is a rarely used corner area.

The other issue is that the fencing mechanism RDMA uses to create
ordering with system memory is not good enough to fence peer-peer
transactions in the general case. It is only possibly good enough if
all the transactions run through the root complex.

> I may have misunderstood the exact details when this was explained to me a
> few years ago, but the details were horrible enough to run away screaming.
> Pretending PCI BARs are real memory?  Just Say No.

Someone should probably explain in more detail what this is even good
for. DAX on PCI-E BAR memory seems goofy in the general case. I was
under the impression the main use case involved the CPU never touching
these memories and just using them to route through to another IO
device (e.g. network). So all these discussions about CPU coherency seem
a bit strange.

Jason
