From: "Chris Friesen" <cfriesen@nortel.com>
To: Brandon Black <blblack@gmail.com>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: behavior of recvmmsg() on blocking sockets
Date: Wed, 24 Mar 2010 13:36:31 -0600	[thread overview]
Message-ID: <4BAA69BF.3080600@nortel.com> (raw)
In-Reply-To: <84621a61003241128x3afbcea1w387aeaa68c887320@mail.gmail.com>

On 03/24/2010 12:28 PM, Brandon Black wrote:
> On Wed, Mar 24, 2010 at 12:41 PM, Chris Friesen <cfriesen@nortel.com> wrote:
>> On 03/24/2010 10:15 AM, Brandon Black wrote:
>>> It uses a thread-per-socket model
>>
>> This doesn't scale well to large numbers of sockets... you get a lot
>> of unnecessary context switching.
> 
> It scales great actually, within my measurement error of linear in
> testing so far.  These are UDP server sockets, and the traffic pattern
> is one request packet maps to one response packet, with no longer-term
> per-client state (this is a DNS server, to be specific).  The "do some
> work" code doesn't have any inter-thread contention (no locks, no
> writes to the same memory, etc), so the "threads" here may as well be
> processes if that makes the discussion less confusing.  I haven't yet
> found a model that scales as well for me.

Note that I said "large numbers of sockets".  Like tens of thousands.
In addition to context switch overhead this can also lead to issues with
memory consumption due to stack frames.

> I'm also just not personally sure whether there are network
> interfaces/drivers out there that could queue packets to the kernel
> (to a single socket) faster than recvmsg() could dequeue them to
> userspace

A 10Gig NIC could do this easily depending on your CPU.

> I still think having a "block until at least one packet arrives" mode
> for recvmmsg() makes sense though.

Agreed, as long as developers are aware that it won't be the most
efficient mode of operation.

Consider the case where you want to do some other useful work in
addition to running your network server.  Every cpu cycle spent on the
network server is robbed from the other work.  In this scenario you want
to handle packets as efficiently as possible, so the timeout-based
behaviour is better since it is more likely to give you multiple packets
per syscall.

Chris
