From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: netdev@vger.kernel.org, jakub.kicinski@netronome.com,
	pavel.odintsov@gmail.com, Jason Wang <jasowang@redhat.com>,
	mchan@broadcom.com, John Fastabend <john.fastabend@gmail.com>,
	peter.waskiewicz.jr@intel.com, ast@fiberby.dk,
	Daniel Borkmann <borkmann@iogearbox.net>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Andy Gospodarek <andy@greyhouse.net>,
	brouer@redhat.com
Subject: Re: [net-next V8 PATCH 3/5] bpf: cpumap xdp_buff to skb conversion and allocation
Date: Thu, 19 Oct 2017 12:10:51 +0200
Message-ID: <20171019121051.2117c062@redhat.com>
In-Reply-To: <20171018165207-mutt-send-email-mst@kernel.org>

On Wed, 18 Oct 2017 17:12:09 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Oct 16, 2017 at 12:19:39PM +0200, Jesper Dangaard Brouer wrote:
> > @@ -191,15 +280,45 @@ static int cpu_map_kthread_run(void *data)
> >  	 * kthread_stop signal until queue is empty.
> >  	 */
> >  	while (!kthread_should_stop() || !__ptr_ring_empty(rcpu->queue)) {
> > +		unsigned int processed = 0, drops = 0;
> >  		struct xdp_pkt *xdp_pkt;
> >  
> > -		schedule();
> > -		/* Do work */
> > -		while ((xdp_pkt = ptr_ring_consume(rcpu->queue))) {
> > -			/* For now just "refcnt-free" */
> > -			page_frag_free(xdp_pkt);
> > +		/* Release CPU reschedule checks */
> > +		if (__ptr_ring_empty(rcpu->queue)) {  
> 
> 
> I suspect this is racy: if ring becomes non empty here and
> you wake the task, next line will put it to sleep.
> I think you want to reverse the order:
> 
> 			__set_current_state(TASK_INTERRUPTIBLE);
> 
> 	then check __ptr_ring_empty.

I'll look into this.

The window is small, since __cpu_map_flush() calls
wake_up_process(rcpu->kthread) after the last packet is enqueued.  But I
guess a small race is still possible.  Worst case, a packet could be
stuck in the queue until a new packet arrives.  Thanks for spotting this.
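
To make the ordering issue concrete, here is a toy single-threaded model
in userspace C.  It is only a sketch and not kernel code: struct model,
producer_enqueue_and_wake() and schedule_would_block() are hypothetical
stand-ins for the ring, wake_up_process() and schedule(), under the
assumption that a wakeup simply sets the task state back to TASK_RUNNING
and that schedule() only blocks a task whose state is not TASK_RUNNING.
The producer is injected at the racy point in both orderings.

```c
#include <stdbool.h>

/* Toy model only: these names mirror the kernel primitives discussed
 * above but are hypothetical stand-ins, not the kernel API. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct model {
	enum task_state state;	/* state of the consumer kthread */
	int queued;		/* packets currently in the ring */
};

/* Producer side: enqueue one packet, then wake the consumer.
 * Waking is modelled as setting the state back to TASK_RUNNING. */
static void producer_enqueue_and_wake(struct model *m)
{
	m->queued++;
	m->state = TASK_RUNNING;
}

/* Models schedule(): the task only blocks if it is not TASK_RUNNING. */
static bool schedule_would_block(const struct model *m)
{
	return m->state != TASK_RUNNING;
}

/* Order in the quoted patch: check the ring, then set the state.
 * The producer runs in the window between the two steps, so its
 * wakeup is overwritten.  Returns true if the consumer would sleep. */
static bool check_then_set_state(struct model *m)
{
	if (m->queued == 0) {
		producer_enqueue_and_wake(m);	/* race window */
		m->state = TASK_INTERRUPTIBLE;
		return schedule_would_block(m);
	}
	return false;
}

/* Suggested order: set the state first, then check the ring.
 * A wakeup arriving after the check resets the state, so the
 * modelled schedule() does not block. */
static bool set_state_then_check(struct model *m)
{
	m->state = TASK_INTERRUPTIBLE;
	if (m->queued == 0) {
		producer_enqueue_and_wake(m);	/* same window */
		return schedule_would_block(m);
	}
	m->state = TASK_RUNNING;
	return false;
}
```

In this model, check_then_set_state() ends up sleeping with one packet
queued, while set_state_then_check() does not sleep, which matches the
reordering suggested in the review.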


> I note using the __ version means you can not resize the ring.
> Hope you do not need to.

Resize is not supported.  If the user changes the queue size, a new
ptr_ring and kthread are created, and the logic ensures that the old
ptr_ring and kthread flush their packets appropriately (this is tested
with --stress-mode).

 
> > +			__set_current_state(TASK_INTERRUPTIBLE);
> > +			schedule();
> > +		} else {
> > +			cond_resched();
> > +		}
> > +		__set_current_state(TASK_RUNNING);
> > +
> > +		/* Process packets in rcpu->queue */
> > +		local_bh_disable();
> > +		/*
> > +		 * The bpf_cpu_map_entry is single consumer, with this
> > +		 * kthread CPU pinned. Lockless access to ptr_ring
> > +		 * consume side valid as no-resize allowed of queue.
> > +		 */
> > +		while ((xdp_pkt = __ptr_ring_consume(rcpu->queue))) {
> > +			struct sk_buff *skb;
> > +			int ret;
> > +
> > +			skb = cpu_map_build_skb(rcpu, xdp_pkt);
> > +			if (!skb) {
> > +				page_frag_free(xdp_pkt);
> > +				continue;
> > +			}
> > +
> > +			/* Inject into network stack */
> > +			ret = netif_receive_skb_core(skb);
> > +			if (ret == NET_RX_DROP)
> > +				drops++;
> > +
> > +			/* Limit BH-disable period */
> > +			if (++processed == 8)
> > +				break;
> >  		}
> > -		__set_current_state(TASK_INTERRUPTIBLE);
> > +		local_bh_enable(); /* resched point, may call do_softirq() */
> >  	}
> >  	__set_current_state(TASK_RUNNING);
> >


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
