From: Eric Dumazet <dada1@cosmosbay.com>
To: Elad Lahav <elahav@uwaterloo.ca>
Cc: linux-kernel@vger.kernel.org, Linux Netdev List <netdev@vger.kernel.org>
Subject: Re: [PATCH] Implementation of the sendgroup() system call
Date: Mon, 04 May 2009 11:03:39 +0200
Message-ID: <49FEAF6B.5090308@cosmosbay.com>
In-Reply-To: <49FE9C8C.6090705@cosmosbay.com>

Eric Dumazet wrote:
> Elad Lahav wrote:
>> The attached patch contains an implementation of sendgroup(), a system
>> call that allows a UDP packet to be transmitted efficiently to multiple
>> recipients. Use cases for this system call include live-streaming and
>> multi-player online games.
>> The basic idea is that the caller maintains a group - a list of IP
>> addresses and UDP ports - and calls sendgroup() with the group list and
>> a common payload. Optionally, the call allows for per-recipient data to
>> be prepended or appended to the shared block. The data is copied once in
>> the kernel into an allocated page, and the per-recipient socket buffers
>> point to that page. Savings come from avoiding both the multiple calls
>> and the multiple copies of the data required with regular socket
>> operations. We have measured an improvement of 42% in CPU utilisation
>> when using this system call with the Helix multimedia server (reference:
>> http://simula.no/~griff/nossdav2008/27-32.pdf).
>>
>> The patch includes two implementations: one as described above and one
>> that uses the udp_sendmsg() function in a tight loop inside the kernel
>> (and thus saves on mode switches, but not on data copies). The latter is
>> provided for reference and benchmarking only.
>>
>> Feedback is welcome.
>>
> 
> Hi Elad
> 
> The patch is not inlined, which is really asking for trouble; I doubt many people
> will actually read it...
> 
> My comments are:
> 
> 1) Lack of latency checks. Sending UDP to 1000 destinations is expensive.
>   A syscall is not preemptable unless special conditions are met.
> 
> 2) Lack of a 32/64-bit aware API. A 64-bit kernel should be able to
> run a 32-bit application using a sendgroup() syscall.
> 
> 3) Are the footer/header different for each call? Maybe you need
>   something better to avoid extra copies of them at each sendgroup() system call.
> 
> 4) One expensive thing on UDP sends is the route cache lookup. You could avoid
> this cost by using a 'connected' group setup (see point 3),
>  
> i.e. using a different syscall to set up the group (and compute/look up all needed routes).
>   (This syscall would be able to add/delete members (with their footer/header) to/from the socket group.)
>   
> Then sendgroup() would be really light, since it would only be given a group identifier
> (which can be a file descriptor, one per group) and the UDP message to send.
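
The split suggested in point 4 (set up the group once, then send cheaply)
can be approximated in userspace with one connected UDP socket per member.
This is only a rough sketch under stated assumptions: the UdpGroup class and
its method names are hypothetical, and nothing here is taken from the patch.
It still pays one syscall and one data copy per recipient, so it does not
reproduce the savings claimed for sendgroup(); it only illustrates fixing the
destination (and the per-member header/footer) at setup time.

```python
import socket


class UdpGroup:
    """Hypothetical userspace stand-in for a 'connected' send group."""

    def __init__(self):
        # (host, port) -> (connected socket, header bytes, footer bytes)
        self._members = {}

    def add(self, addr, header=b"", footer=b""):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # connect() fixes the destination once, so the route does not have
        # to be resolved from a fresh address on every send.
        sock.connect(addr)
        self._members[addr] = (sock, header, footer)

    def remove(self, addr):
        sock, _, _ = self._members.pop(addr)
        sock.close()

    def send(self, payload):
        """Send header + shared payload + footer to every member."""
        sent = 0
        for sock, header, footer in self._members.values():
            # Scatter-gather send: the three buffers leave as one datagram
            # without being concatenated in userspace first.
            sock.sendmsg([header, payload, footer])
            sent += 1
        return sent

    def close(self):
        for addr in list(self._members):
            self.remove(addr)
```

A caller would add() each recipient once, then call send(payload) per frame;
membership changes go through add()/remove() rather than through the send path.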

Ah, some other points: you forgot to include netdev (CCed on my message),
as some network guys don't read lkml every day :)

In your experiments, did you change the NIC tx queue length (the default being 1000)?
Using sendgroup() or sendmsg(), you'll hit the NIC queue limit pretty fast anyway...

Also, since 2.6.25 added memory accounting on UDP sockets, you'll probably need to
increase SO_SNDBUF to avoid blocking in the sendmsg()/sendgroup() call.
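
For reference, raising SO_SNDBUF from userspace might look like the sketch
below. The helper name enlarge_sndbuf and the 4 MiB figure are illustrative
assumptions, not values from the thread. On Linux the request is capped by
net.core.wmem_max and the kernel reports back its own bookkeeping value, so
the code reads the option back instead of trusting the requested size.

```python
import socket


def enlarge_sndbuf(sock, requested):
    """Request a larger send buffer; return the size the kernel reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
    # The kernel may clamp or adjust the request, so read the value back.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
before = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
after = enlarge_sndbuf(sock, 4 * 1024 * 1024)  # 4 MiB: arbitrary example size
print(f"SO_SNDBUF: {before} -> {after} bytes")
sock.close()
```

If the reported size stays well below the request, the system-wide limit is
probably the constraint and would need to be raised by an administrator.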
