From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Christoph Lameter <cl@linux.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-api@vger.kernel.org,
Eric Dumazet <eric.dumazet@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
Hannes Frederic Sowa <hannes@stressinduktion.org>,
Alexander Duyck <alexander.duyck@gmail.com>,
Alexei Starovoitov <ast@plumgrid.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Steven Rostedt <rostedt@goodmis.org>,
brouer@redhat.com
Subject: Re: [RFC PATCH 0/3] Faster than SLAB caching of SKBs with qmempool (backed by alf_queue)
Date: Thu, 11 Dec 2014 11:18:59 +0100
Message-ID: <20141211111859.21e23e90@redhat.com>
In-Reply-To: <alpine.DEB.2.11.1412101339480.22982@gentwo.org>
On Wed, 10 Dec 2014 13:51:32 -0600 (CST)
Christoph Lameter <cl@linux.com> wrote:
> On Wed, 10 Dec 2014, Jesper Dangaard Brouer wrote:
>
> > One of the building blocks for achieving this speedup is a cmpxchg
> > based Lock-Free queue that supports bulking, named alf_queue for
> > Array-based Lock-Free queue. By bulking elements (pointers) from the
> > queue, the cost of the cmpxchg (approx 8 ns) is amortized over several
> > elements.
>
> This is a bit of an issue since the design of the SLUB allocator is such
> that you should pick up an object, apply some processing and then take the
> next one. The fetching of an object warms up the first cacheline and this
> is tied into the way free objects are linked in SLUB.
>
> So a bulk fetch from SLUB will not be that effective, and will cause
> many cachelines to be touched if we are dealing with just a few
> objects. If we are looking at whole slab pages with all objects, then
> SLUB can be effective, since we do not have to build up the linked
> pointer structure in each page. SLAB has a different architecture,
> and a bulk fetch there is possible without touching objects even for
> small sets, since the freelist management is separate from the objects.
>
> If you do this bulking, then you will later access cache-cold objects?
> Doesn't that negate the benefit that you gain? Or are these objects
> written to by hardware, and therefore by necessity cache cold?
Cache warmup is a concern, but perhaps it's the caller's responsibility
to prefetch for their use-case. For qmempool I do have patches that
prefetch elems when going from the sharedq to the localq (per CPU), but
I didn't see much gain, and I could prove my point (of being faster
than slab) without it. And I would use/need the slab bulk interface to
add elems to the sharedq, which I consider semi-cache-cold.
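The prefetch step was basically the following (a rough sketch from
memory, not the exact patch; the dequeue function name is illustrative):

	/* Refill the per-CPU localq with a bulk of elements from the
	 * sharedq, then warm up the first cacheline of each element
	 * with prefetchw() from <linux/prefetch.h>. */
	num = sharedq_dequeue_bulk(pool->sharedq, elems, bulk);
	for (i = 0; i < num; i++)
		prefetchw(elems[i]);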
> We could provide a faster bulk alloc/free function.
>
> int kmem_cache_alloc_array(struct kmem_cache *s, gfp_t flags,
> size_t objects, void **array)
I like it :-)
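To make the usage concrete, here is how I would imagine calling it from
a driver (sketch only; my assumptions: the return value is the number of
objects actually placed in the array, possibly less than requested, and
GFP_ATOMIC is usable from softirq context):

	void *objs[16];
	size_t i, n;

	n = kmem_cache_alloc_array(skbuff_head_cache, GFP_ATOMIC,
				   ARRAY_SIZE(objs), objs);
	for (i = 0; i < n; i++)
		init_one_skb(objs[i]);	/* hypothetical per-object init */

	/* ... and bulk return, via the free-side variant you propose
	 * below. */
	kmem_cache_free_array(skbuff_head_cache, n, objs);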
> and this could be optimized by each slab allocator to provide fast
> population of objects in that array. We then assume that the number of
> objects is in the hundreds or so, right?
I'm already seeing a benefit with 16-packet alloc/free "bulking".
On RX we have a "budget" of 64 packets/descriptors (taken from the NIC
RX ring) that need SKBs.
On TX, packets are put into the TX ring, and later, at TX completion,
the TX ring is cleaned up, as many as 256 packets at a time (as e.g. in
the ixgbe driver).
Scientific articles on userspace networking (like netmap) report that
they need at least 8-packet bulking to see wirespeed 10G at 64 bytes.
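(Quick arithmetic on the amortization: with the approx 8 ns cmpxchg
cost from the cover letter, 16-packet bulking gives 8/16 = 0.5 ns per
element, and a full 64-packet RX budget gives 8/64 = 0.125 ns per
element.)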
> The corresponding free function
>
> void kmem_cache_free_array(struct kmem_cache *s,
> size_t objects, void **array)
>
>
> I think the queue management of the array can be improved by using a
> similar technique as used in the SLUB allocator, based on
> cmpxchg_local. cmpxchg_local is much faster than a full cmpxchg, and
> we are operating on per-cpu structures anyway. So the overhead could
> still be reduced.
I think you missed that the per-CPU localq is already not using cmpxchg
(it is an SPSC queue). The sharedq (an MPMC queue) does need, and does
use, the locked cmpxchg.
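To illustrate the distinction, a minimal sketch (not the actual
alf_queue code; memory barriers and the MPMC publish/completion step
after claiming slots are omitted):

	struct q {
		u32	  head, tail;
		u32	  size, mask;	/* size is a power of 2 */
		void	**ring;
	};

	/* localq: SPSC -- one producer and one consumer (the same
	 * CPU), so plain head/tail updates suffice; no cmpxchg. */
	static inline bool spsc_enqueue(struct q *q, void *elem)
	{
		if (q->head - q->tail == q->size)
			return false;		/* queue full */
		q->ring[q->head & q->mask] = elem;
		q->head++;			/* plain store */
		return true;
	}

	/* sharedq: MPMC -- multiple CPUs compete, so claiming room
	 * for a bulk of elements needs the locked cmpxchg; that one
	 * cmpxchg is then amortized over 'bulk' elements. */
	static inline bool mpmc_claim(struct q *q, u32 bulk, u32 *slot)
	{
		u32 head = READ_ONCE(q->head);

		if (q->size - (head - READ_ONCE(q->tail)) < bulk)
			return false;		/* not enough room */
		*slot = head;
		return cmpxchg(&q->head, head, head + bulk) == head;
	}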
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer