From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: Anuj Kalia <anujkaliaiitd-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Gabriele Svelto
<gabriele.svelto-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: RDMA and memory ordering
Date: Tue, 12 Nov 2013 11:31:42 -0700
Message-ID: <20131112183142.GB6639@obsidianresearch.com>
In-Reply-To: <CADPSxAh6i74j8JVmWncttN6W4ithTGmsMgqWWpoBy2Z8RRb=_g@mail.gmail.com>
On Tue, Nov 12, 2013 at 06:31:04AM -0400, Anuj Kalia wrote:
> That makes sense. This way, we have no consistency between the CPU's
> view and the HCA's view - it all depends when the cache gets flushed
> to RAM.
What you are talking about is firmly in undefined territory. You might
be able to get something to work today, but tomorrow's CPUs and HCAs
might mess it up.
You will never reliably get the guarantee you want with the scheme
you have. Even with two CPUs it is not going to happen.
> I have a remote client which reads the struct A[i] from the server
> (via RDMA) in a loop. Sometimes in the value that the client reads,
> A[i].counter is larger than A[i].value. i.e., I see the newer value of
> A[i].counter but A[i].value corresponds to a previous iteration of the
> server's loop.
This is a fundamental misunderstanding of what FENCE does: it just
makes the writes happen in order, it doesn't alter the reader side:
CPU1                   CPU2
                       read a.value
a.value = counter
FENCE
a.counter = counter
                       read a.counter
value < counter

CPU1                   CPU2
a.value = counter
                       read a.value
FENCE
a.counter = counter
                       read a.counter
value < counter

CPU1                   CPU2
a.value = counter
FENCE
                       read a.value
            < SCHEDULE >
a.counter = counter
                       read a.counter
value < counter

etc.
This stuff is hard. If you want a crazy scheme to be reliable you need
to have a really detailed understanding of what is actually being
guaranteed.
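To make the first interleaving above concrete, here is a minimal
single-threaded replay of it in C. It is only a sketch: the struct name
`rec`, the helper `writer_step` and the starting values are made up for
illustration, and nothing here runs concurrently or touches an HCA. It
just executes the steps in the order the diagram lists them and shows
that the reader ends up with value < counter even though the writer's
two stores are strictly ordered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record standing in for A[i] from the thread. */
struct rec {
    uint64_t value;
    uint64_t counter;
};

/* One half of the writer's update. The FENCE between step 0 and
 * step 1 only orders these two stores against each other. */
static void writer_step(struct rec *a, uint64_t counter, int step)
{
    if (step == 0)
        a->value = counter;      /* a.value = counter   */
    else
        a->counter = counter;    /* a.counter = counter */
}

/* Replays the first interleaving in program order:
 *   CPU2 reads a.value, CPU1 does its whole fenced update,
 *   CPU2 reads a.counter.
 * Returns true when the reader's snapshot is torn. */
static bool replay_torn_read(void)
{
    struct rec a = { .value = 1, .counter = 1 };

    uint64_t seen_value = a.value;      /* CPU2: read a.value   */
    writer_step(&a, 2, 0);              /* CPU1: first store    */
    /* CPU1: FENCE (nothing to do in a single-threaded replay)  */
    writer_step(&a, 2, 1);              /* CPU1: second store   */
    uint64_t seen_counter = a.counter;  /* CPU2: read a.counter */

    return seen_value < seen_counter;
}
```

Calling `replay_torn_read()` returns true: the writer did everything in
order, and the reader still saw the old value next to the new counter.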
> However, if the HCA performs reads from L3 cache, then everything
> should be consistent, right? While ordering the writes, I think we
> can
No. The cache makes no difference. Fundamentally you aren't atomically
writing cache lines. You are writing single values.
99% of the time it might look like atomic cache line writes, but there
is a 1% where that assumption will break.
Probably the best you can do is a collision detect scheme:
  uint64_t counter
  void data[];

writer:
  counter++
  FENCE
  data = [.....];
  FENCE
  counter++

reader:
  last_counter = read counter
  if last_counter % 2 == 1: retry
  read data
  read counter
  if counter != last_counter: retry
But even something as simple as that probably has scary races - I only
thought about it for a few moments. :)
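For reference, the collision detect idea above can be sketched between
two CPU threads with C11 atomics, roughly in the shape of a sequence
lock. This is an assumption-laden sketch, not something from the thread:
the names `seq_rec`, `seq_write`, `seq_try_read` and `DATA_WORDS` are
invented, it assumes a single writer, and it says nothing about what an
HCA doing an RDMA read will observe, which is the part that remains
undefined.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DATA_WORDS 4   /* hypothetical payload size, in 64-bit words */

struct seq_rec {
    _Atomic uint64_t counter;           /* odd while a write is in flight */
    _Atomic uint64_t data[DATA_WORDS];  /* relaxed atomics avoid a C data race */
};

/* Writer side. Assumes exactly one writer thread. */
static void seq_write(struct seq_rec *r, const uint64_t *src)
{
    uint64_t c = atomic_load_explicit(&r->counter, memory_order_relaxed);

    /* counter++ (now odd), then FENCE so the data stores stay after it */
    atomic_store_explicit(&r->counter, c + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    for (int i = 0; i < DATA_WORDS; i++)
        atomic_store_explicit(&r->data[i], src[i], memory_order_relaxed);

    /* FENCE + counter++ (even again): the release store orders the
     * data stores before the final counter update */
    atomic_store_explicit(&r->counter, c + 2, memory_order_release);
}

/* Reader side, one attempt. Returns false on a collision, in which
 * case dst may hold garbage and the caller should retry. */
static bool seq_try_read(struct seq_rec *r, uint64_t *dst)
{
    uint64_t before = atomic_load_explicit(&r->counter, memory_order_acquire);
    if (before % 2 == 1)
        return false;                   /* write in progress */

    for (int i = 0; i < DATA_WORDS; i++)
        dst[i] = atomic_load_explicit(&r->data[i], memory_order_relaxed);

    /* keep the data loads ahead of the second counter read */
    atomic_thread_fence(memory_order_acquire);
    uint64_t after = atomic_load_explicit(&r->counter, memory_order_relaxed);

    return before == after;             /* unchanged: snapshot is consistent */
}
```

A reader would loop on `seq_try_read()` until it returns true. Note the
reader needs its own ordering too (the acquire load and fence); the
writer's FENCEs alone are not enough, which is the same point as the
interleavings earlier in this mail.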
Jason