All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: xiyou.wangcong@gmail.com, davem@davemloft.net, jhs@mojatatu.com,
	netdev@vger.kernel.org, paulmck@linux.vnet.ibm.com,
	brouer@redhat.com
Subject: Re: [net-next PATCH v3 09/15] net: sched: make cls_u32 lockless
Date: Tue, 09 Sep 2014 09:35:23 -0700	[thread overview]
Message-ID: <540F2C4B.6060404@gmail.com> (raw)
In-Reply-To: <1410268807.11872.177.camel@edumazet-glaptop2.roam.corp.google.com>

On 09/09/2014 06:20 AM, Eric Dumazet wrote:
> On Mon, 2014-09-08 at 22:58 -0700, John Fastabend wrote:
>> Make cls_u32 classifier safe to run without holding lock. This patch
>> converts statistics that are kept in read section u32_classify into
>> per cpu counters.
>>
>> This patch was tested with a tight u32 filter add/delete loop while
>> generating traffic with pktgen. By running pktgen on vlan devices
>> created on top of a physical device we can hit the qdisc layer
>> correctly. For ingress qdisc's a loopback cable was used.
>>
>> for i in {1..100}; do
>>          q=`echo $i%8|bc`;
>>          echo -n "u32 tos: iteration $i on queue $q";
>>          tc filter add dev p3p2 parent $p prio $i u32 match ip tos 0x10 0xff \
>>                    action skbedit queue_mapping $q;
>>          sleep 1;
>>          tc filter del dev p3p2 prio $i;
>>
>>          echo -n "u32 tos hash table: iteration $i on queue $q";
>>          tc filter add dev p3p2 parent $p protocol ip prio $i handle 628: u32 divisor 1
>>          tc filter add dev p3p2 parent $p protocol ip prio $i u32 \
>>                  match ip protocol 17 0xff link 628: offset at 0 mask 0xf00 shift 6 plus 0
>>          tc filter add dev p3p2 parent $p protocol ip prio $i u32 \
>>                  ht 628:0 match ip tos 0x10 0xff action skbedit queue_mapping $q
>>          sleep 2;
>>          tc filter del dev p3p2 prio $i
>>          sleep 1;
>> done
>>
>
>
> Note it might be easier to split this patch in 2 parts.
>
> (The percpu stuff could be done in a first step, then rcu conversion)

Sure, I'll split it up.

>
>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>> ---

[...]

>>
>>   static inline unsigned int u32_hash_fold(__be32 key,
>> @@ -96,7 +103,7 @@ static int u32_classify(struct sk_buff *skb, const struct tcf_proto *tp, struct
>>   		unsigned int	  off;
>>   	} stack[TC_U32_MAXDEPTH];
>>
>> -	struct tc_u_hnode *ht = tp->root;
>> +	struct tc_u_hnode *ht = rcu_dereference_bh(tp->root);
>>   	unsigned int off = skb_network_offset(skb);
>>   	struct tc_u_knode *n;
>>   	int sdepth = 0;
>> @@ -108,23 +115,23 @@ static int u32_classify(struct sk_buff *skb, const struct tcf_proto *tp, struct
>>   	int i, r;
>>
>>   next_ht:
>> -	n = ht->ht[sel];
>> +	n = rcu_dereference_bh(ht->ht[sel]);
>>
>>   next_knode:
>>   	if (n) {
>>   		struct tc_u32_key *key = n->sel.keys;
>>
>>   #ifdef CONFIG_CLS_U32_PERF
>> -		n->pf->rcnt += 1;
>> +		this_cpu_inc(n->pf->rcnt);
>
> As we run in BH, we are not preemptable, we can use instead :
> __this_cpu_inc()  (on all occurrences)
>
> Using a macro would also reduce the #ifdef mess of this file
>

I'll go ahead and do the macro conversion.

>>   		j = 0;
>>   #endif
>>
>>   #ifdef CONFIG_CLS_U32_MARK
>> -		if ((skb->mark & n->mark.mask) != n->mark.val) {
>> -			n = n->next;
>> +		if ((skb->mark & n->mask) != n->val) {
>> +			n = rcu_dereference_bh(n->next);
>>   			goto next_knode;
>>   		} else {
>> -			n->mark.success++;
>> +			this_cpu_inc(*n->pcpu_success);
>>   		}
>>   #endif
>>

[...]


>> @@ -326,11 +342,11 @@ static int u32_init(struct tcf_proto *tp)
>>   	}
>>
>>   	tp_c->refcnt++;
>> -	root_ht->next = tp_c->hlist;
>> -	tp_c->hlist = root_ht;
>> +	rcu_assign_pointer(root_ht->next, tp_c->hlist);
>
> 	RCU_INIT_POINTER() or root_ht->next = tp_c->hlist;
>

I'll fix all these as well.

>> +	rcu_assign_pointer(tp_c->hlist, root_ht);
>>   	root_ht->tp_c = tp_c;
>>
>> -	tp->root = root_ht;
>> +	rcu_assign_pointer(tp->root, root_ht);
>>   	tp->data = tp_c;
>>   	return 0;
>>   }
>> @@ -342,25 +358,33 @@ static int u32_destroy_key(struct tcf_proto *tp, struct tc_u_knode *n)
>>   	if (n->ht_down)
>>   		n->ht_down->refcnt--;
>>   #ifdef CONFIG_CLS_U32_PERF
>> -	kfree(n->pf);
>> +	free_percpu(n->pf);
>>   #endif
>>   	kfree(n);
>>   	return 0;
>>   }
>>
>> +void __u32_delete_key(struct rcu_head *rcu)
>
> Can we consistently use _rcu, as in u32_delete_key_rcu() ?

Yes, I'll rename these calls.

Thanks,
John

-- 
John Fastabend         Intel Corporation

  reply	other threads:[~2014-09-09 16:35 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-09  5:53 [net-next PATCH v3 00/15] net/sched use rcu filters John Fastabend
2014-09-09  5:54 ` [net-next PATCH v3 01/15] net: qdisc: use rcu prefix and silence sparse warnings John Fastabend
2014-09-09 12:05   ` Eric Dumazet
2014-09-09  5:54 ` [net-next PATCH v3 02/15] net: rcu-ify tcf_proto John Fastabend
2014-09-09 12:20   ` Eric Dumazet
2014-09-09 15:15     ` John Fastabend
2014-09-09 15:20       ` Eric Dumazet
2014-09-09  5:55 ` [net-next PATCH v3 03/15] net: sched: cls_basic use RCU John Fastabend
2014-09-09 12:29   ` Eric Dumazet
2014-09-09  5:55 ` [net-next PATCH v3 04/15] net: sched: cls_cgroup " John Fastabend
2014-09-09 12:34   ` Eric Dumazet
2014-09-09  5:56 ` [net-next PATCH v3 05/15] net: sched: cls_flow " John Fastabend
2014-09-09 12:41   ` Eric Dumazet
2014-09-09 16:09     ` John Fastabend
2014-09-09 16:20       ` Eric Dumazet
2014-09-09  5:56 ` [net-next PATCH v3 06/15] net: sched: fw " John Fastabend
2014-09-09 12:48   ` Eric Dumazet
2014-09-09  5:57 ` [net-next PATCH v3 07/15] net: sched: RCU cls_route John Fastabend
2014-09-09 12:59   ` Eric Dumazet
2014-09-09 16:24     ` John Fastabend
2014-09-09  5:57 ` [net-next PATCH v3 08/15] net: sched: RCU cls_tcindex John Fastabend
2014-09-09  5:58 ` [net-next PATCH v3 09/15] net: sched: make cls_u32 lockless John Fastabend
2014-09-09 13:20   ` Eric Dumazet
2014-09-09 16:35     ` John Fastabend [this message]
2014-09-09 18:02   ` Cong Wang
2014-09-10  4:53     ` John Fastabend
2014-09-09  5:58 ` [net-next PATCH v3 10/15] net: sched: rcu'ify cls_rsvp John Fastabend
2014-09-09 13:27   ` Eric Dumazet
2014-09-09 16:46     ` John Fastabend
2014-09-09  5:59 ` [net-next PATCH v3 11/15] net: sched: rcu'ify cls_bpf John Fastabend
2014-09-09  5:59 ` [net-next PATCH v3 12/15] net: sched: make tc_action safe to walk under RCU John Fastabend
2014-09-09  6:00 ` [net-next PATCH v3 13/15] net: sched: make bstats per cpu and estimator RCU safe John Fastabend
2014-09-09  6:00 ` [net-next PATCH v3 14/15] net: sched: make qstats per cpu John Fastabend
2014-09-09  6:01 ` [net-next PATCH v3 15/15] net: sched: drop ingress qdisc lock John Fastabend

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=540F2C4B.6060404@gmail.com \
    --to=john.fastabend@gmail.com \
    --cc=brouer@redhat.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=jhs@mojatatu.com \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.