From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: Or Gerlitz <gerlitz.or-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: ib_ipoib: CSUM support in connected mode
Date: Mon, 15 Sep 2014 11:45:13 -0600
Message-ID: <20140915174513.GA3074@obsidianresearch.com>
In-Reply-To: <CAJ3xEMir_FgqS7j+fuhugocawdZXHG9hAK-jpArZ_5vkzVjZeg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Mon, Sep 15, 2014 at 08:20:25PM +0300, Or Gerlitz wrote:
> On Mon, Sep 15, 2014 at 7:58 PM, Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > To do this, you need to transfer the offload state across the wire, so
> > on receive you inject the packet with the proper tag that the csum is
> > not computed but ready for offload. A node receiving a packet like
> > this would have to compute the csum before sending it onwards, so no,
> > if done properly it will not break gateways.
> >
> > All the core infrastructure is there, all the virtualization drivers
> > work like this - the guest side does not compute the csum, and the
> > hypervisor side receives the packet with that flag, and the csum
> > ultimately is offloaded to the physical NIC. Look at the xen net
> > driver for an example.
>
> But it is done on the xmitting hypervisor, isn't it? If this is the case,
> I don't see the similarity to the IPoIB CM case.

I'm not sure what you mean?

You raised the concern about gateways, which is identical to the
hypervisor case:

  G-LINUX --(NO CSUM)--> ring buffer --> H-LINUX --(NO CSUM)--> NIC->WIRE
  A-LINUX --(NO CSUM)--> RC QP --> B-LINUX --(NO CSUM)--> NIC->WIRE

The key is that csum state is placed in the ring buffer/RC QP with
every packet. Basically, you serialize the entire offload state the
IPoIB send path receives from the kernel net stack, dump that onto the
wire, and restore that exact same semantic state on the receive side.
The NIC sees the same packet, with the same offload metadata, as
though it were directly connected to the sending Linux kernel.
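
As a rough illustration of the serialize/restore idea above, here is a
minimal Python sketch. The 5-byte header layout and every name in it
(HDR, F_NEEDS_CSUM, send, receive, finalize) are hypothetical; this is
not an existing IPoIB CM wire format, only a toy loosely modelled on
the kind of per-packet metadata virtualization drivers carry. The
sender attaches "csum not yet computed" state, the receiver restores
it, and only the node that pushes the packet onto a real wire fills in
the checksum.

```python
import struct

# Hypothetical per-packet metadata header (an assumption for illustration):
#   flags       (1 byte):  bit 0 set => checksum still has to be computed
#   csum_start  (2 bytes): offset in the payload where checksumming begins
#   csum_offset (2 bytes): offset from csum_start where the result is stored
HDR = struct.Struct("!BHH")
F_NEEDS_CSUM = 0x01


def internet_checksum(data: bytes) -> int:
    """Ones'-complement sum over 16-bit big-endian words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def send(payload: bytes, needs_csum: bool, csum_start: int, csum_offset: int) -> bytes:
    """Serialize the offload state in front of the (un-checksummed) payload."""
    flags = F_NEEDS_CSUM if needs_csum else 0
    return HDR.pack(flags, csum_start, csum_offset) + payload


def receive(wire: bytes):
    """Restore the same offload state on the receiving side."""
    flags, csum_start, csum_offset = HDR.unpack_from(wire)
    return wire[HDR.size:], bool(flags & F_NEEDS_CSUM), csum_start, csum_offset


def finalize(payload: bytes, needs_csum: bool, csum_start: int, csum_offset: int) -> bytes:
    """What a gateway (or its NIC) would do before the packet hits a real wire."""
    if not needs_csum:
        return payload
    buf = bytearray(payload)
    csum = internet_checksum(bytes(buf[csum_start:]))
    struct.pack_into("!H", buf, csum_start + csum_offset, csum)
    return bytes(buf)
```

A receiver that is the final destination would simply accept the
payload with the flag still set and never call finalize(); a gateway
would call it (or hand the same metadata to its NIC) before forwarding.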

The *typical* IPoIB CM case is similar to a guest talking to another
guest:

  G1 --(NO CSUM)--> ring buffer --> H-LINUX --(NO CSUM)--> ring buffer --(NO CSUM)--> G2

Here the packet is never csum'd - the 2nd guest simply accepts the
packet with an uncsum'd tag. If you flatten the above it looks
identical to the typical IPoIB case.

Hypervisors are now also doing the same trick with GSO: they send
large packets without a high MTU, because they can take the GSO
master packet state from the sending guest and shuttle the whole thing
without segmentation to the receiving guest (or NIC). IPoIB should do
the same.
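
The GSO variant of the trick can be sketched the same way. This is a
toy model under stated assumptions: the Packet class and both functions
are hypothetical names, and real segmentation also replicates and fixes
up protocol headers, which is omitted here. The point is only that the
oversized packet travels intact together with its gso_size, and is cut
up solely when it has to leave on a link that needs real segments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    payload: bytes
    gso_size: int = 0  # 0 => ordinary packet; otherwise max payload bytes per segment


def forward_to_peer(pkt: Packet) -> Packet:
    """Guest-to-guest (or node-to-node over RC QP) hop: pass the packet through
    unsegmented, with its GSO state preserved."""
    return pkt


def segment_for_wire(pkt: Packet) -> List[bytes]:
    """Last hop onto a real wire: split into gso_size chunks (headers omitted)."""
    if pkt.gso_size == 0:
        return [pkt.payload]
    return [pkt.payload[i:i + pkt.gso_size]
            for i in range(0, len(pkt.payload), pkt.gso_size)]
```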
Jason