netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: daniel@iogearbox.net, eric.dumazet@gmail.com, jhs@mojatatu.com,
	aduyck@mirantis.com, davem@davemloft.net,
	john.r.fastabend@intel.com, netdev@vger.kernel.org,
	brouer@redhat.com
Subject: Re: [RFC PATCH 01/12] lib: array based lock free queue
Date: Wed, 13 Jan 2016 20:28:07 +0100	[thread overview]
Message-ID: <20160113202807.2fe90c1a@redhat.com> (raw)
In-Reply-To: <20151230175103.26257.75334.stgit@john-Precision-Tower-5810>


On Wed, 30 Dec 2015 09:51:03 -0800 John Fastabend <john.fastabend@gmail.com> wrote:

> Initial implementation of an array based lock free queue. This works
> is originally done by Jesper Dangaard Brouer and I've grabbed it made
> only minor tweaks at the moment and plan to use it with the 'tc'
> subsystem although it is general enough to be used elsewhere.

First thanks for using my work. P.s. Hannes also deserves some credit
developing this queue.  And Alex Duyck helped review it (and found some
bugs ;-))

A question about the qdisc usage context: It is guaranteed that qdisc
enqueue and dequeue always runs either 1) in softirq context or 2) with
softirq/BH disabled?

This is important because (as noted in a comment), alf_queue is:

 Not preemption safe. Multiple CPUs can enqueue elements, but the
 same CPU is not allowed to be preempted and access the same
 queue. Due to how the tail is updated, this can result in a soft
 lock-up


> Certainly this implementation can be furthered optimized and improved
> but it is a good base implementation.

Agreed. E.g. the helpers you choose for enqueue/dequeue (store/load)
might not be the best, they look optimal (loop unroll etc) but the
extra icache usage cancel out the performance gain. At least in my
qmempool testing.
 But lets not get into these details now, as you say it is a good base
implementation. 


> diff --git a/include/linux/alf_queue.h b/include/linux/alf_queue.h
> new file mode 100644
> index 0000000..dac304e
> --- /dev/null
> +++ b/include/linux/alf_queue.h
> @@ -0,0 +1,368 @@
[...]
> +
> +/* Main Multi-Producer ENQUEUE
> + *
[...]
> + *
> + * Not preemption safe. Multiple CPUs can enqueue elements, but the
> + * same CPU is not allowed to be preempted and access the same
> + * queue. Due to how the tail is updated, this can result in a soft
> + * lock-up. (Same goes for alf_mc_dequeue).
> + */
> +static inline int
> +alf_mp_enqueue(const u32 n;
> +	       struct alf_queue *q, void *ptr[n], const u32 n)
> +{
> +	u32 p_head, p_next, c_tail, space;
> +
> +	/* Reserve part of the array for enqueue STORE/WRITE */
> +	do {
> +		p_head = ACCESS_ONCE(q->producer.head);
> +		c_tail = ACCESS_ONCE(q->consumer.tail);/* as smp_load_aquire */
> +
> +		space = q->size + c_tail - p_head;
> +		if (n > space)
> +			return 0;
> +
> +		p_next = p_head + n;
> +	}
> +	while (unlikely(cmpxchg(&q->producer.head, p_head, p_next) != p_head));
> +
> +	/* The memory barrier of smp_load_acquire(&q->consumer.tail)
> +	 * is satisfied by cmpxchg implicit full memory barrier
> +	 */
> +
> +	/* STORE the elems into the queue array */
> +	__helper_alf_enqueue_store(p_head, q, ptr, n);
> +	smp_wmb(); /* Write-Memory-Barrier matching dequeue LOADs */
> +
> +	/* Wait for other concurrent preceding enqueues not yet done,
> +	 * this part make us none-wait-free and could be problematic
> +	 * in case of congestion with many CPUs
> +	 */
> +	while (unlikely(ACCESS_ONCE(q->producer.tail) != p_head))
> +		cpu_relax();
> +	/* Mark this enq done and avail for consumption */
> +	ACCESS_ONCE(q->producer.tail) = p_next;
> +
> +	return n;
> +}

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

  reply	other threads:[~2016-01-13 19:28 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-30 17:50 [RFC PATCH 00/12] drop the qdisc lock for pfifo_fast/mq John Fastabend
2015-12-30 17:51 ` [RFC PATCH 01/12] lib: array based lock free queue John Fastabend
2016-01-13 19:28   ` Jesper Dangaard Brouer [this message]
2015-12-30 17:51 ` [RFC PATCH 02/12] net: sched: free per cpu bstats John Fastabend
2016-01-04 15:21   ` Daniel Borkmann
2016-01-04 17:32     ` Eric Dumazet
2016-01-04 18:08       ` John Fastabend
2015-12-30 17:51 ` [RFC PATCH 03/12] net: sched: allow qdiscs to handle locking John Fastabend
2015-12-30 17:52 ` [RFC PATCH 04/12] net: sched: provide per cpu qstat helpers John Fastabend
2015-12-30 17:52 ` [RFC PATCH 05/12] net: sched: per cpu gso handlers John Fastabend
2015-12-30 20:26   ` Jesper Dangaard Brouer
2015-12-30 20:42     ` John Fastabend
2015-12-30 17:53 ` [RFC PATCH 06/12] net: sched: support qdisc_reset on NOLOCK qdisc John Fastabend
2016-01-01  2:30   ` Alexei Starovoitov
2016-01-03 19:37     ` John Fastabend
2016-01-13 16:20   ` David Miller
2016-01-13 18:03     ` John Fastabend
2016-01-15 19:44       ` David Miller
2015-12-30 17:53 ` [RFC PATCH 07/12] net: sched: qdisc_qlen for per cpu logic John Fastabend
2015-12-30 17:53 ` [RFC PATCH 08/12] net: sched: a dflt qdisc may be used with per cpu stats John Fastabend
2015-12-30 17:54 ` [RFC PATCH 09/12] net: sched: pfifo_fast use alf_queue John Fastabend
2016-01-13 16:24   ` David Miller
2016-01-13 18:18     ` John Fastabend
2015-12-30 17:54 ` [RFC PATCH 10/12] net: sched: helper to sum qlen John Fastabend
2015-12-30 17:55 ` [RFC PATCH 11/12] net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mq John Fastabend
2015-12-30 17:55 ` [RFC PATCH 12/12] net: sched: pfifo_fast new option to deque multiple pkts John Fastabend
2015-12-30 18:13   ` John Fastabend
2016-01-06 13:14 ` [RFC PATCH 00/12] drop the qdisc lock for pfifo_fast/mq Jamal Hadi Salim
2016-01-07 23:30   ` John Fastabend

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160113202807.2fe90c1a@redhat.com \
    --to=brouer@redhat.com \
    --cc=aduyck@mirantis.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=jhs@mojatatu.com \
    --cc=john.fastabend@gmail.com \
    --cc=john.r.fastabend@intel.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).