From: Jay Vosburgh <fubar@us.ibm.com>
To: Simon Horman <horms@verge.net.au>
Cc: netdev@vger.kernel.org
Subject: Re: noqueue on bonding devices
Date: Wed, 28 Jul 2010 10:37:56 -0700
Message-ID: <16360.1280338676@death>
In-Reply-To: <20100728083217.GB20227@verge.net.au>

Simon Horman <horms@verge.net.au> wrote:

>Hi Jay, Hi All,
>
>I would just like to wonder out loud if it is intentional that bonding
>devices default to noqueue, whereas for instance ethernet devices
>default to a pfifo_fast with qlen 1000.

	Yes, it is.

>The reason that I ask, is that when setting up some bandwidth
>control using tc I encountered some strange behaviour which
>I eventually tracked down to the queue-length of the qdiscs being 1p -
>inherited from noqueue, as opposed to 1000p which would occur
>on an ethernet device.
>
>It's trivial to work around, by either altering the txqueuelen on
>the bonding device before adding the qdisc or by manually setting
>the qlen of the qdisc. But it did take us a while to determine the
>cause of the problem we were seeing. And as it seems inconsistent
>I'm interested to know why this is the case.

	Software-only virtual devices (loopback, bonding, bridge, vlan,
etc) typically have no transmit queue because, well, the device does no
queueing.  Meaning that there is no flow control infrastructure in the
software device; bonding, et al, won't ever flow control (call
netif_stop_queue to temporarily suspend transmit) or accumulate packets
on a transmit queue.

	Hardware ethernet devices set a queue length because it is
meaningful for them to do so.  When their hardware transmit ring fills
up, they will assert flow control, and stop accepting new packets for
transmit.  Packets then accumulate in the software transmit queue, and
when the device unblocks, those packets are ready to go.  When under
continuous load, hardware network devices typically free up ring entries
in blocks (not one at a time), so the software transmit queue helps to
smooth out the chunkiness of the hardware driver's processing, minimize
dropped packets, etc.
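
	As a rough illustration of the paragraph above, here is a toy
Python model (not kernel code) of a hardware ring that frees entries in
blocks, fronted by a software transmit queue.  The class name, the ring
size, the burst size and the batch size are all made up for the
example; the "stopped" flag only loosely stands in for what
netif_stop_queue does.

```python
from collections import deque


class ToyNic:
    """Toy model of a NIC: fixed-size hardware ring plus a software queue."""

    def __init__(self, ring_size, txqueuelen):
        self.ring_size = ring_size
        self.txqueuelen = txqueuelen
        self.ring = deque()    # packets handed to the (pretend) hardware
        self.queue = deque()   # software transmit queue
        self.stopped = False   # loosely analogous to netif_stop_queue
        self.sent = 0
        self.dropped = 0

    def _push_to_ring(self, pkt):
        self.ring.append(pkt)
        # Assert flow control as soon as the ring is full.
        self.stopped = len(self.ring) >= self.ring_size

    def xmit(self, pkt):
        if not self.stopped:
            self._push_to_ring(pkt)
        elif len(self.queue) < self.txqueuelen:
            self.queue.append(pkt)   # accumulate while the device is blocked
        else:
            self.dropped += 1        # no room to queue: packet is lost

    def tx_complete(self, batch):
        """Hardware frees up to `batch` ring entries in one block."""
        for _ in range(min(batch, len(self.ring))):
            self.ring.popleft()
            self.sent += 1
        self.stopped = len(self.ring) >= self.ring_size
        # Device unblocked: queued packets are ready to go.
        while self.queue and not self.stopped:
            self._push_to_ring(self.queue.popleft())


def run(txqueuelen, bursts=3, burst_size=8, ring_size=4, batch=4):
    nic = ToyNic(ring_size, txqueuelen)
    for _ in range(bursts):
        for i in range(burst_size):
            nic.xmit(i)
        nic.tx_complete(batch)
    return nic


if __name__ == "__main__":
    for qlen in (0, 1000):
        nic = run(qlen)
        print(f"txqueuelen={qlen}: sent={nic.sent} dropped={nic.dropped}")
```

	With no software queue, every packet that arrives while the
ring is full is lost; with a queue, the same bursts are absorbed and
drained as the ring frees up.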

	It's certainly possible to add a queue and qdisc to a bonding
device, and is reasonable to do if you want to do packet scheduling with
tc and friends.  In this case, the queue is really just for the tc
actions to connect to; the queue won't accumulate packets on account of
the driver (but could if the scheduler, e.g., rate limits).
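
	For reference, the two workarounds Simon mentions could look
roughly like the commands built below.  This is only a sketch: the
device name bond0, the length 1000 and the choice of a plain pfifo as
the root qdisc are assumptions for illustration, and the script only
prints the commands rather than running them, since they need root.

```python
import shlex


def workaround_commands(dev="bond0", qlen=1000):
    """Return argv lists for the two alternative workarounds.

    dev and qlen are illustrative defaults, not values from the thread.
    """
    return [
        # Option 1: raise txqueuelen on the bond before adding the qdisc,
        # so that the qdisc inherits a sensible default length.
        ["ip", "link", "set", "dev", dev, "txqueuelen", str(qlen)],
        # Option 2: give the qdisc an explicit limit when adding it.
        ["tc", "qdisc", "add", "dev", dev, "root", "pfifo", "limit", str(qlen)],
    ]


if __name__ == "__main__":
    for argv in workaround_commands():
        print(shlex.join(argv))
```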

>On an unrelated note, MAINTAINERS lists bonding-devel@lists.sourceforge.net
>but the (recent) archives seem to be entirely spam.  Is the MAINTAINERS
>file correct?

	Yah, I should probably change that; the spam is pretty heavy,
and there isn't much I can do to limit it.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com