Re: Optimizing instruction-cache, more packets at each stage

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Felix Fietkau <nbd@openwrt.org>
Cc: David Laight <David.Laight@ACULAB.COM>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Daniel Borkmann <borkmann@iogearbox.net>,
	Marek Majkowski <marek@cloudflare.com>,
	Hannes Frederic Sowa <hannes@stressinduktion.org>,
	Florian Westphal <fw@strlen.de>, Paolo Abeni <pabeni@redhat.com>,
	John Fastabend <john.r.fastabend@intel.com>,
	brouer@redhat.com
Subject: Re: Optimizing instruction-cache, more packets at each stage
Date: Mon, 18 Jan 2016 12:54:20 +0100	[thread overview]
Message-ID: <20160118125420.0375ffda@redhat.com> (raw)
In-Reply-To: <56990473.9090300@openwrt.org>

On Fri, 15 Jan 2016 15:38:43 +0100 Felix Fietkau <nbd@openwrt.org> wrote:
> On 2016-01-15 15:00, Jesper Dangaard Brouer wrote:
[...]
> > 
> > The icache is still quite small 32Kb on modern server processors.  I
> > don't know if smaller embedded processors also have icache and how
> > large they are.  I speculate this approach would also be a benefit for
> > them (if they have icache).
>
> All of the router devices that I work with have icache. Typical sizes
> are 32 or 64 KiB. FWIW, I'm really looking forward to having such
> optimizations in the network stack ;)

That is very interesting. These kind of icache optimization will then
likely benefit lower-end devices more than high end Intel CPUs :-)

AFAIK the Intel CPUs are masking this icache problem, by having a icache
prefetcher and optimizing how fast the CPU can load/refill from higher
level caches.  Intel CPUs have a lot of HW-logic around this, which the
I assume the smaller CPUs don't.  E.g. quote from Intel Optimization
Reference Manual:

 "The instruction fetch unit (IFU) can fetch up to 16 bytes of aligned
  instruction bytes each cycle from the instruction cache to the
  instruction length decoder (ILD). The instruction queue (IQ) buffers
  the ILD-processed instructions and can deliver up to four instructions
  in one cycle to the instruction decoder."

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

next prev parent reply	other threads:[~2016-01-18 11:54 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-15 13:22 Optimizing instruction-cache, more packets at each stage Jesper Dangaard Brouer
2016-01-15 13:32 ` Hannes Frederic Sowa
2016-01-15 14:17   ` Jesper Dangaard Brouer
2016-01-15 13:36 ` David Laight
2016-01-15 14:00   ` Jesper Dangaard Brouer
2016-01-15 14:38     ` Felix Fietkau
2016-01-18 11:54       ` Jesper Dangaard Brouer [this message]
2016-01-18 17:01         ` Eric Dumazet
2016-01-25  0:08         ` Florian Fainelli
2016-01-15 20:47 ` David Miller
2016-01-18 10:27   ` Jesper Dangaard Brouer
2016-01-18 16:24     ` David Miller
2016-01-20 22:20       ` Or Gerlitz
2016-01-20 23:02         ` Eric Dumazet
2016-01-20 23:27           ` Tom Herbert
2016-01-21 11:27             ` Jesper Dangaard Brouer
2016-01-21 12:49               ` Or Gerlitz
2016-01-21 13:57                 ` Jesper Dangaard Brouer
2016-01-21 18:56                 ` David Miller
2016-01-21 22:45                   ` Or Gerlitz
2016-01-21 22:59                     ` David Miller
2016-01-21 16:38               ` Eric Dumazet
2016-01-21 18:54               ` David Miller
2016-01-24 14:28                 ` Jesper Dangaard Brouer
2016-01-24 14:44                   ` Michael S. Tsirkin
2016-01-24 17:28                     ` John Fastabend
2016-01-25 13:15                       ` Bypass at packet-page level (Was: Optimizing instruction-cache, more packets at each stage) Jesper Dangaard Brouer
2016-01-25 17:09                         ` Tom Herbert
2016-01-25 17:50                           ` John Fastabend
2016-01-25 21:32                             ` Tom Herbert
2016-01-25 21:58                               ` John Fastabend
2016-01-25 22:10                             ` Jesper Dangaard Brouer
2016-01-27 20:47                               ` Jesper Dangaard Brouer
2016-01-27 21:56                                 ` Alexei Starovoitov
2016-01-28  9:52                                   ` Jesper Dangaard Brouer
2016-01-28 12:54                                     ` Eric Dumazet
2016-01-28 13:25                                     ` Eric Dumazet
2016-01-28 16:43                                     ` Tom Herbert
2016-01-28  2:50                                 ` Tom Herbert
2016-01-28  9:25                                   ` Jesper Dangaard Brouer
2016-01-28 12:45                                     ` Eric Dumazet
2016-01-28 16:37                                       ` Tom Herbert
2016-01-28 16:43                                         ` Eric Dumazet
2016-01-28 17:04                                         ` Jesper Dangaard Brouer
2016-01-24 20:09                   ` Optimizing instruction-cache, more packets at each stage Tom Herbert
2016-01-24 21:41                     ` John Fastabend
2016-01-24 23:50                       ` Tom Herbert
2016-01-21 12:23             ` Jesper Dangaard Brouer
2016-01-21 16:38               ` Tom Herbert
2016-01-21 17:48                 ` Eric Dumazet
2016-01-22 12:33                   ` Jesper Dangaard Brouer
2016-01-22 14:33                     ` Eric Dumazet
2016-01-22 17:07                     ` Tom Herbert
2016-01-22 17:17                       ` Jesper Dangaard Brouer
2016-02-02 16:13             ` Or Gerlitz
2016-02-02 16:37               ` Eric Dumazet
2016-01-18 16:53     ` Eric Dumazet
2016-01-18 17:36     ` Tom Herbert
2016-01-18 17:49       ` Jesper Dangaard Brouer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160118125420.0375ffda@redhat.com \
    --to=brouer@redhat.com \
    --cc=David.Laight@ACULAB.COM \
    --cc=alexander.duyck@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=borkmann@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=fw@strlen.de \
    --cc=hannes@stressinduktion.org \
    --cc=john.r.fastabend@intel.com \
    --cc=marek@cloudflare.com \
    --cc=nbd@openwrt.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).