From: Max Krasnyansky <maxk@qualcomm.com>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: netdev@vger.kernel.org, Herbert Xu <herbert@gondor.apana.org.au>,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH 2/3] partial checksum and GSO support for tun/tap.
Date: Mon, 03 Mar 2008 21:08:00 -0800
Message-ID: <47CCD930.3040200@qualcomm.com>
In-Reply-To: <200803041202.17202.rusty@rustcorp.com.au>

Rusty Russell wrote:
> On Friday 08 February 2008 16:39:03 Max Krasnyansky wrote:
>> Rusty Russell wrote:
>>> (Changes since last time: we now have explicit IFF_RECV_CSUM and
>>> IFF_RECV_GSO bits, and some renaming of virtio_net hdr)
>>>
>>> We use the virtio_net_hdr: it is an ABI already and designed to
>>> encapsulate such metadata as GSO and partial checksums.
>>>
>>> IFF_VIRTIO_HDR means you will write and read a 'struct virtio_net_hdr'
>>> at the start of each packet.  You can always write packets with
>>> partial checksum and gso to the tap device using this header.
>>>
>>> IFF_RECV_CSUM means you can handle reading packets with partial
>>> checksums.  If IFF_RECV_GSO is also set, it means you can handle
>>> reading (all types of) GSO packets.
>>>
>>> Note that there is no easy way to detect if these flags are supported:
>>> see next patch.
>> Again sorry for delay in replying. Here are my thoughts on this.
>>
>> I like the approach in general. Certainly the part that creates skbs out of
>> the user-space pages looks good. And it fits nicely into the existing TUN
>> driver model. However I actually wanted to change the model :). In
>> particular I'm talking about "syscall per packet".
>> After messing around with things like libe1000.sf.net I'd like to make the
>> TUN/TAP driver look more like modern NICs to the user-space. In other
>> words I'm thinking about introducing RX and TX rings that the user-space
>> can then mmap() and write/read packet descriptors to/from. That will reduce
>> the number of system calls that the user-space app needs to do. That by
>> itself saves a lot of overhead; combined with GSO it'd be lightning
>> fast.
> 
> The problem with this approach is that for what I'm doing, the packets aren't 
> nicely arranged somewhere; they're in random process memory.
That's fine. RX/TX descriptors would not contain the data itself. They'd
contain pointers to the actual packets (i.e. just like a NIC takes a physical
memory address and DMAs data in/out).
This allows for sending/receiving packets without syscalls and fits nicely with
async schemes like GSO.
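
A minimal user-space sketch of the kind of descriptor ring described above.
Everything here is an assumption made for illustration: the names pkt_desc,
pkt_ring, ring_init, ring_post and ring_take, the field layout, and the ring
size are hypothetical and are not part of the TUN/TAP driver or of any patch
in this thread. The point is only that a descriptor carries a pointer and a
length rather than the packet data, and that posting or taking one is a
memory write, not a syscall.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical size; must be a power of two */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be a power of two");

/* Hypothetical descriptor: it points at the packet, it does not contain it. */
struct pkt_desc {
	uint64_t addr;  /* user-space address of the packet data */
	uint32_t len;   /* packet length in bytes */
	uint32_t flags; /* reserved for per-packet hints */
};

/* Single-producer/single-consumer ring, as it might sit in a shared mapping. */
struct pkt_ring {
	_Atomic uint32_t head; /* next slot the producer will fill */
	_Atomic uint32_t tail; /* next slot the consumer will drain */
	struct pkt_desc desc[RING_SIZE];
};

static void ring_init(struct pkt_ring *r)
{
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
}

/* Producer: queue one packet by reference. Returns false if the ring is full. */
static bool ring_post(struct pkt_ring *r, const void *pkt, uint32_t len)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == RING_SIZE)
		return false;

	struct pkt_desc *d = &r->desc[head & (RING_SIZE - 1)];
	d->addr = (uint64_t)(uintptr_t)pkt;
	d->len = len;
	d->flags = 0;

	/* Publish the slot only after its fields have been written. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}

/* Consumer: take the oldest descriptor. Returns false if the ring is empty. */
static bool ring_take(struct pkt_ring *r, struct pkt_desc *out)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (head == tail)
		return false;

	*out = r->desc[tail & (RING_SIZE - 1)];
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}
```

In this sketch both sides live in one process. In the scheme discussed here
the ring would be mmap()ed and the consumer of a TX ring would be the kernel,
which would also need some way to be told that new descriptors are pending.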

Btw, the code that I sent you does indeed expect packets to be in a mmap()ed
buffer, but I agree that it only works for certain cases. In general it's not
flexible. I was thinking of introducing some flags in the descriptor that tell
the kernel how to handle the packet, i.e. whether it needs to be just copied
into a fresh SKB or remapped with get_user_pages().
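
A rough user-space illustration of such per-descriptor flags. The flag names
DESC_F_COPY and DESC_F_MAP, the structs pkt_desc and fake_skb, and the helpers
desc_to_skb and fake_skb_free are all hypothetical stand-ins; none of them
exist in the kernel. The "map" branch only keeps the caller's pointer, at the
point where a kernel implementation would presumably pin the pages with
get_user_pages().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag values, invented for this sketch. */
#define DESC_F_COPY 0x1u /* copy the data into a fresh buffer */
#define DESC_F_MAP  0x2u /* reference the caller's memory in place */
static_assert((DESC_F_COPY & DESC_F_MAP) == 0, "flags must not overlap");

struct pkt_desc {
	const void *addr; /* where the packet lives in the caller's memory */
	uint32_t len;
	uint32_t flags;
};

/* User-space stand-in for an skb: it owns its data only when copied. */
struct fake_skb {
	const void *data;
	uint32_t len;
	int owns_data;
};

/* Returns 0 on success, -1 on invalid flags or allocation failure. */
static int desc_to_skb(const struct pkt_desc *d, struct fake_skb *skb)
{
	switch (d->flags & (DESC_F_COPY | DESC_F_MAP)) {
	case DESC_F_COPY: {
		void *buf = malloc(d->len ? d->len : 1);

		if (!buf)
			return -1;
		memcpy(buf, d->addr, d->len);
		skb->data = buf;
		skb->owns_data = 1;
		break;
	}
	case DESC_F_MAP:
		/* A kernel would pin the user pages here instead of copying. */
		skb->data = d->addr;
		skb->owns_data = 0;
		break;
	default:
		return -1; /* neither flag set, or both */
	}
	skb->len = d->len;
	return 0;
}

static void fake_skb_free(struct fake_skb *skb)
{
	if (skb->owns_data)
		free((void *)skb->data);
	skb->data = NULL;
	skb->len = 0;
	skb->owns_data = 0;
}
```

The trade-off is the usual one: copying is simple and leaves the caller free
to reuse the buffer immediately, while mapping avoids the copy but requires
the buffer to stay untouched until the packet has been consumed.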

> I thought about further abusing writev and readv to do multiple packets at 
> once.  
I actually was going to abuse them from day one. At that time Alex Kuznetsov
told me that I was crazy and I gave up on it :)
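
For reference, a sketch of the one-packet-per-call model that the writev/readv
idea would extend: a small metadata header and the payload are gathered into a
single writev(), which still costs one syscall per packet. The struct
pkt_meta_hdr and the helper send_one_packet are hypothetical; the header is
only loosely modelled on the kind of GSO/checksum fields discussed in this
thread and is not the real virtio_net_hdr layout. Any writable descriptor will
do for trying it out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h> /* writev(), struct iovec */
#include <unistd.h>  /* ssize_t */

/* Hypothetical per-packet metadata header, invented for this sketch. */
struct pkt_meta_hdr {
	uint8_t flags;        /* e.g. "checksum still needs to be filled in" */
	uint8_t gso_type;     /* segmentation type, 0 for none */
	uint16_t gso_size;    /* segment size when gso_type is non-zero */
	uint16_t csum_start;  /* offset where checksumming starts */
	uint16_t csum_offset; /* offset from csum_start to store the checksum */
};
static_assert(sizeof(struct pkt_meta_hdr) == 8, "unexpected header padding");

/*
 * Send header and payload as one message with a single writev() call.
 * Returns the number of bytes written, or -1 with errno set.
 */
static ssize_t send_one_packet(int fd, const struct pkt_meta_hdr *hdr,
			       const void *payload, size_t len)
{
	struct iovec iov[2] = {
		{ .iov_base = (void *)hdr, .iov_len = sizeof(*hdr) },
		{ .iov_base = (void *)payload, .iov_len = len },
	};

	return writev(fd, iov, 2);
}
```

Carrying several packets in one call would need extra framing so the receiver
can tell where each packet ends, which is presumably part of why it counts as
abuse of the interface.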

>> Also btw why call it VIRTIO? For example I'm actually interested in
>> speeding up tunneling and general network apps. We have wireless basestation
>> apps here that need to handle packets in user-space. Those kinds of things
>> have nothing to do with virtualization.
> 
> The structure is for virtio, I'm just borrowing it for tap because it's 
> already there.  We could rename it and move it out to its own header, but if 
> so we should do that before 2.6.25 is released.
If we do the whole enchilada with the RX/TX rings then we probably do not even
need it. I'm thinking that the RX/TX descriptor would include everything you
need for GSO and the rest.
That is, we would not need it for the TUN/TAP driver. Is it used anywhere else?

Max
