From: Roland Dreier <rdreier@cisco.com>
To: David Miller <davem@davemloft.net>
Cc: tom@opengridcomputing.com, jeff@garzik.org,
swise@opengridcomputing.com, mshefty@ichips.intel.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
general@lists.openfabrics.org
Subject: Re: [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
Date: Tue, 28 Aug 2007 12:38:07 -0700
Message-ID: <adair6zjvq8.fsf@cisco.com>
In-Reply-To: <20070820.235804.85409183.davem@davemloft.net> (David Miller's message of "Mon, 20 Aug 2007 23:58:04 -0700 (PDT)")
Sorry for the long latency; I was at the beach all last week.
> > And direct data placement really does give you a factor of two at
> > least, because otherwise you're stuck receiving the data in one
> > buffer, looking at some of the data at least, and then figuring out
> > where to copy it. And memory bandwidth is if anything becoming more
> > valuable; maybe LRO + header splitting + page remapping tricks can get
> > you somewhere but as NCPUS grows then it seems the TLB shootdown cost
> > of page flipping is only going to get worse.
> As Herbert has said already, people can code for this just like
> they have to code for RDMA.
No argument, you need to change the interface to take advantage of RDMA.
> There is no fundamental difference from converting an application
> to sendfile or similar.
Yes, on the transmit side, there's not much difference from sendfile
or splice, although RDMA may give a slightly nicer interface that also
gives basically the equivalent of AIO.
> The only thing this needs is a
> "recvmsg_I_dont_care_where_the_data_is()" call. There are no alignment
> issues unless you are trying to push this data directly into the
> page cache.
I don't understand how this gives you the same thing as direct data
placement (DDP). In many situations the sender knows where the data
has to go. If there is some way to pass that information to the
receiver and use it in the receive path to put the data in the right
place, the receiver can save a copy. This is
fundamentally the same "offload" that an FC HBA does -- the SCSI
midlayer queues up commands like "read block A and put the data at
address X" and "read block B and put the data at address Y" and the
HBA matches tags on incoming data to put the blocks at the right
addresses, even if block B is received before block A.
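The tag matching described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not how any particular HBA or RDMA NIC does it: the names tag_table, post_tagged_buffer() and place_tagged_data() are hypothetical, the table size is arbitrary, and memcpy() stands in for what real hardware would do with DMA.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_TAGS 16	/* arbitrary size for this sketch */

/* Hypothetical table entry: one pre-registered destination per tag. */
struct tagged_buf {
	void   *addr;	/* where the data for this tag must land */
	size_t  len;	/* size of the registered buffer */
	int     valid;	/* nonzero once the receiver has posted the buffer */
};

static struct tagged_buf tag_table[MAX_TAGS];

/* Receiver side: "read block <tag> and put the data at address addr". */
int post_tagged_buffer(uint32_t tag, void *addr, size_t len)
{
	if (tag >= MAX_TAGS || addr == NULL)
		return -1;
	tag_table[tag].addr = addr;
	tag_table[tag].len = len;
	tag_table[tag].valid = 1;
	return 0;
}

/*
 * Placement step: incoming data carries (tag, offset), so it can be
 * written straight to its final location regardless of arrival order.
 * memcpy() is a stand-in for the DMA a real device would perform.
 */
int place_tagged_data(uint32_t tag, size_t offset,
		      const void *data, size_t len)
{
	const struct tagged_buf *tb;

	if (tag >= MAX_TAGS)
		return -1;
	tb = &tag_table[tag];
	/* Reject unknown tags and writes that would overrun the buffer. */
	if (!tb->valid || offset > tb->len || len > tb->len - offset)
		return -1;
	memcpy((char *)tb->addr + offset, data, len);
	return 0;
}
```

Because the destination is looked up by tag rather than by arrival order, block B can be placed before block A with no intermediate buffer and no later copy.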
RFC 4297 has some discussion of the various approaches, and while you
might not agree with their conclusions, it is interesting reading.
> Couple this with a card that makes sure that on a per-page basis, only
> data for a particular flow (or group of flows) will accumulate.
It seems that the NIC would also have to look into a TCP stream (and
handle out of order segments etc) to find message boundaries for this
to be equivalent to what an RDMA NIC does.
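To make the message-boundary point concrete, here is a rough sketch of the per-flow state such a NIC would need. It assumes a purely hypothetical framing in which every message is preceded by a 2-byte big-endian length; struct framer and framer_feed() are invented for illustration, and the sketch assumes the bytes are fed in stream order, so out-of-order segments would need reassembly (or some resynchronization scheme) on top of this.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-flow parser state the NIC would have to keep. */
struct framer {
	uint8_t  hdr[2];	/* partially received length header */
	size_t   hdr_have;	/* header bytes collected so far */
	size_t   body_left;	/* payload bytes missing from current message */
	unsigned messages;	/* complete messages seen so far */
};

/*
 * Feed the next in-order chunk of stream bytes. TCP segment edges are
 * arbitrary, so a header or a payload may be split across calls.
 */
void framer_feed(struct framer *f, const uint8_t *p, size_t n)
{
	while (n > 0) {
		if (f->body_left == 0) {
			/* Still collecting the 2-byte length header. */
			f->hdr[f->hdr_have++] = *p++;
			n--;
			if (f->hdr_have < 2)
				continue;
			f->hdr_have = 0;
			f->body_left = ((size_t)f->hdr[0] << 8) | f->hdr[1];
			if (f->body_left == 0)
				f->messages++;	/* zero-length message */
		} else {
			/* Consume as much payload as this chunk holds. */
			size_t take = n < f->body_left ? n : f->body_left;

			p += take;
			n -= take;
			f->body_left -= take;
			if (f->body_left == 0)
				f->messages++;
		}
	}
}
```

Even in this toy form the parser only works if it sees every byte of the flow in order, which is the extra work beyond simple per-page flow steering.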
- R.