Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH bpf-next 1/2] bpf: switch most helper return values from 32-bit int to 64-bit long
From: Daniel Borkmann @ 2020-06-19 13:08 UTC (permalink / raw)
  To: John Fastabend, Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, Kernel Team
In-Reply-To: <5eec09418954e_27ce2adb0816a5b8f7@john-XPS-13-9370.notmuch>

On 6/19/20 2:39 AM, John Fastabend wrote:
> John Fastabend wrote:
>> Andrii Nakryiko wrote:
>>> On Thu, Jun 18, 2020 at 11:58 AM John Fastabend
>>> <john.fastabend@gmail.com> wrote:
> 
> [...]
> 
>>> That would be great. Self-tests do work, but having more testing with
>>> real-world application would certainly help as well.
>>
>> Thanks for all the follow up.
>>
>> I ran the change through some CI on my side and it passed so I can
>> complain about a few shifts here and there or just update my code or
>> just not change the return types on my side but I'm convinced its OK
>> in most cases and helps in some so...
>>
>> Acked-by: John Fastabend <john.fastabend@gmail.com>
> 
> I'll follow this up with a few more selftests to capture a couple of our
> patterns. These changes are subtle and I worry a bit that additional
> <<,s>> pattern could have the potential to break something.
> 
> Another one we didn't discuss that I found in our code base is feeding
> the output of a probe_* helper back into the size field (after some
> alu ops) of subsequent probe_* call. Unfortunately, the tests I ran
> today didn't cover that case.
> 
> I'll put it on the list tomorrow and encode these in selftests. I'll
> let the mainainers decide if they want to wait for those or not.

Given potential fragility on verifier side, my preference would be that we
have the known variations all covered in selftests before moving forward in
order to make sure they don't break in any way. Back in [0] I've seen mostly
similar cases in the way John mentioned in other projects, iirc, sysdig was
another one. If both of you could hack up the remaining cases we need to
cover and then submit a combined series, that would be great. I don't think
we need to rush this optimization w/o necessary selftests.

Thanks everyone,
Daniel

   [0] https://lore.kernel.org/bpf/20200421125822.14073-1-daniel@iogearbox.net/

^ permalink raw reply

* Re: [PATCH 03/11] bpf: Add btf_ids object
From: Jiri Olsa @ 2020-06-19 13:13 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Networking, bpf,
	Song Liu, Yonghong Song, Martin KaFai Lau, David Miller,
	John Fastabend, Wenbo Zhang, KP Singh, Andrii Nakryiko,
	Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <CAEf4BzZ=BN7zDU_8xMEEoF7khjC4bwGitU+iYf+6uFXPZ_=u-g@mail.gmail.com>

On Thu, Jun 18, 2020 at 05:56:38PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 16, 2020 at 3:05 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Adding support to generate .BTF_ids section that would
> > hold various BTF IDs list for verifier.
> >
> > Adding macros help to define lists of BTF IDs placed in
> > .BTF_ids section. They are initially filled with zeros
> > (during compilation) and resolved later during the
> > linking phase by btfid tool.
> >
> > Following defines list of one BTF ID that is accessible
> > within kernel code as bpf_skb_output_btf_ids array.
> >
> >   extern int bpf_skb_output_btf_ids[];
> >
> >   BTF_ID_LIST(bpf_skb_output_btf_ids)
> >   BTF_ID(struct, sk_buff)
> >
> > Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  include/asm-generic/vmlinux.lds.h |  4 ++
> >  kernel/bpf/Makefile               |  2 +-
> >  kernel/bpf/btf_ids.c              |  3 ++
> >  kernel/bpf/btf_ids.h              | 70 +++++++++++++++++++++++++++++++
> >  4 files changed, 78 insertions(+), 1 deletion(-)
> >  create mode 100644 kernel/bpf/btf_ids.c
> >  create mode 100644 kernel/bpf/btf_ids.h
> >
> 
> [...]
> 
> > +/*
> > + * Following macros help to define lists of BTF IDs placed
> > + * in .BTF_ids section. They are initially filled with zeros
> > + * (during compilation) and resolved later during the
> > + * linking phase by btfid tool.
> > + *
> > + * Any change in list layout must be reflected in btfid
> > + * tool logic.
> > + */
> > +
> > +#define SECTION ".BTF_ids"
> 
> nit: SECTION is super generic and non-greppable. BTF_IDS_SECTION?

ok

> 
> > +
> > +#define ____BTF_ID(symbol)                             \
> > +asm(                                                   \
> > +".pushsection " SECTION ",\"a\";               \n"     \
> 
> section should be also read-only? Either immediately here, of btfid
> tool should mark it? Unless I missed that it's already doing it :)

hm, it's there next to the .BTF section within RO_DATA macro,
so I thought that was enough.. I'll double check

> 
> > +".local " #symbol " ;                          \n"     \
> > +".type  " #symbol ", @object;                  \n"     \
> > +".size  " #symbol ", 4;                        \n"     \
> > +#symbol ":                                     \n"     \
> > +".zero 4                                       \n"     \
> > +".popsection;                                  \n");
> > +
> > +#define __BTF_ID(...) \
> > +       ____BTF_ID(__VA_ARGS__)
> 
> why varargs, if it's always a single argument? Or it's one of those
> macro black magic things were it works only in this particular case,
> but not others?

yea, I kind of struggled in here, because any other would not
expand the name concat together with the unique ID bit,
__VA_ARGS__ did it nicely ;-) I'll revisit this

thanks,
jirka


^ permalink raw reply

* Re: [PATCH net-next 2/3] cls_flower: Pass the unmasked key to hw
From: Ido Schimmel @ 2020-06-19 13:14 UTC (permalink / raw)
  To: dsatish
  Cc: davem, jhs, xiyou.wangcong, jiri, kuba, netdev, simon.horman,
	kesavac, prathibha.nagooru, intiyaz.basha, jai.rana
In-Reply-To: <20200619094156.31184-3-satish.d@oneconvergence.com>

On Fri, Jun 19, 2020 at 03:11:55PM +0530, dsatish wrote:
> Pass the unmasked key along with the masked key to the hardware.
> This enables hardware to manage its own tables better based on the
> hardware features/capabilities.

How? Which hardware? You did not patch a single driver...

Are you familiar with chain templates? I think it might help:
https://github.com/Mellanox/mlxsw/wiki/ACLs#chain-templates

> 
> Signed-off-by: Chandra Kesava <kesavac@gmail.com>
> Signed-off-by: Prathibha Nagooru <prathibha.nagooru@oneconvergence.com>
> Signed-off-by: Satish Dhote <satish.d@oneconvergence.com>
> ---
>  include/net/flow_offload.h |  45 ++++++++++
>  net/core/flow_offload.c    | 171 +++++++++++++++++++++++++++++++++++++
>  net/sched/cls_flower.c     |  43 ++++++++++
>  3 files changed, 259 insertions(+)
> 
> diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
> index f2c8311a0433..26c6bd6bdb98 100644
> --- a/include/net/flow_offload.h
> +++ b/include/net/flow_offload.h
> @@ -11,6 +11,8 @@ struct flow_match {
>  	struct flow_dissector	*dissector;
>  	void			*mask;
>  	void			*key;
> +	void			*unmasked_key;
> +	struct flow_dissector	*unmasked_key_dissector;
>  };
>  
>  struct flow_match_meta {
> @@ -118,6 +120,49 @@ void flow_rule_match_enc_opts(const struct flow_rule *rule,
>  void flow_rule_match_ct(const struct flow_rule *rule,
>  			struct flow_match_ct *out);
>  
> +void flow_rule_match_unmasked_key_meta(const struct flow_rule *rule,
> +				       struct flow_match_meta *out);
> +void flow_rule_match_unmasked_key_basic(const struct flow_rule *rule,
> +					struct flow_match_basic *out);
> +void flow_rule_match_unmasked_key_control(const struct flow_rule *rule,
> +					  struct flow_match_control *out);
> +void flow_rule_match_unmasked_key_eth_addrs(const struct flow_rule *rule,
> +					    struct flow_match_eth_addrs *out);
> +void flow_rule_match_unmasked_key_vlan(const struct flow_rule *rule,
> +				       struct flow_match_vlan *out);
> +void flow_rule_match_unmasked_key_cvlan(const struct flow_rule *rule,
> +					struct flow_match_vlan *out);
> +void flow_rule_match_unmasked_key_ipv4_addrs(const struct flow_rule *rule,
> +					     struct flow_match_ipv4_addrs *out);
> +void flow_rule_match_unmasked_key_ipv6_addrs(const struct flow_rule *rule,
> +					     struct flow_match_ipv6_addrs *out);
> +void flow_rule_match_unmasked_key_ip(const struct flow_rule *rule,
> +				     struct flow_match_ip *out);
> +void flow_rule_match_unmasked_key_ports(const struct flow_rule *rule,
> +					struct flow_match_ports *out);
> +void flow_rule_match_unmasked_key_tcp(const struct flow_rule *rule,
> +				      struct flow_match_tcp *out);
> +void flow_rule_match_unmasked_key_icmp(const struct flow_rule *rule,
> +				       struct flow_match_icmp *out);
> +void flow_rule_match_unmasked_key_mpls(const struct flow_rule *rule,
> +				       struct flow_match_mpls *out);
> +void flow_rule_match_unmasked_key_enc_control(const struct flow_rule *rule,
> +					      struct flow_match_control *out);
> +void flow_rule_match_unmasked_key_enc_ipv4_addrs(const struct flow_rule *rule,
> +					    struct flow_match_ipv4_addrs *out);
> +void flow_rule_match_unmasked_key_enc_ipv6_addrs(const struct flow_rule *rule,
> +					    struct flow_match_ipv6_addrs *out);
> +void flow_rule_match_unmasked_key_enc_ip(const struct flow_rule *rule,
> +					 struct flow_match_ip *out);
> +void flow_rule_match_unmasked_key_enc_ports(const struct flow_rule *rule,
> +					    struct flow_match_ports *out);
> +void flow_rule_match_unmasked_key_enc_keyid(const struct flow_rule *rule,
> +					    struct flow_match_enc_keyid *out);
> +void flow_rule_match_unmasked_key_enc_opts(const struct flow_rule *rule,
> +					   struct flow_match_enc_opts *out);
> +void flow_rule_match_unmasked_key_ct(const struct flow_rule *rule,
> +				     struct flow_match_ct *out);
> +
>  enum flow_action_id {
>  	FLOW_ACTION_ACCEPT		= 0,
>  	FLOW_ACTION_DROP,
> diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
> index 0cfc35e6be28..a98c31e864b1 100644
> --- a/net/core/flow_offload.c
> +++ b/net/core/flow_offload.c
> @@ -173,6 +173,177 @@ void flow_rule_match_enc_opts(const struct flow_rule *rule,
>  }
>  EXPORT_SYMBOL(flow_rule_match_enc_opts);
>  
> +#define FLOW_UNMASKED_KEY_DISSECTOR_MATCH(__rule, __type, __out)	\
> +	const struct flow_match *__m = &(__rule)->match;		\
> +	struct flow_dissector *__d = (__m)->unmasked_key_dissector;	\
> +									\
> +	(__out)->key = skb_flow_dissector_target(__d, __type,		\
> +						 (__m)->unmasked_key);	\
> +	(__out)->mask = skb_flow_dissector_target(__d, __type,		\
> +						  (__m)->mask)		\
> +
> +void flow_rule_match_unmasked_key_meta(const struct flow_rule *rule,
> +				       struct flow_match_meta *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_META, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_meta);
> +
> +void
> +flow_rule_match_unmasked_key_basic(const struct flow_rule *rule,
> +				   struct flow_match_basic *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_BASIC, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_basic);
> +
> +void flow_rule_match_unmasked_key_control(const struct flow_rule *rule,
> +					  struct flow_match_control *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_CONTROL,
> +					  out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_control);
> +
> +void flow_rule_match_unmasked_key_eth_addrs(const struct flow_rule *rule,
> +					    struct flow_match_eth_addrs *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_ETH_ADDRS,
> +					  out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_eth_addrs);
> +
> +void flow_rule_match_unmasked_key_vlan(const struct flow_rule *rule,
> +				       struct flow_match_vlan *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_VLAN, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_vlan);
> +
> +void flow_rule_match_unmasked_key_cvlan(const struct flow_rule *rule,
> +					struct flow_match_vlan *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_CVLAN, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_cvlan);
> +
> +void flow_rule_match_unmasked_key_ipv4_addrs(const struct flow_rule *rule,
> +					     struct flow_match_ipv4_addrs *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_IPV4_ADDRS,
> +					  out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_ipv4_addrs);
> +
> +void flow_rule_match_unmasked_key_ipv6_addrs(const struct flow_rule *rule,
> +					     struct flow_match_ipv6_addrs *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_IPV6_ADDRS,
> +					  out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_ipv6_addrs);
> +
> +void flow_rule_match_unmasked_key_ip(const struct flow_rule *rule,
> +				     struct flow_match_ip *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_IP, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_ip);
> +
> +void flow_rule_match_unmasked_key_ports(const struct flow_rule *rule,
> +					struct flow_match_ports *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_PORTS, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_ports);
> +
> +void flow_rule_match_unmasked_key_tcp(const struct flow_rule *rule,
> +				      struct flow_match_tcp *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_TCP, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_tcp);
> +
> +void flow_rule_match_unmasked_key_icmp(const struct flow_rule *rule,
> +				       struct flow_match_icmp *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_ICMP, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_icmp);
> +
> +void flow_rule_match_unmasked_key_mpls(const struct flow_rule *rule,
> +				       struct flow_match_mpls *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_MPLS, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_mpls);
> +
> +void flow_rule_match_unmasked_key_enc_control(const struct flow_rule *rule,
> +					      struct flow_match_control *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_ENC_CONTROL,
> +					  out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_enc_control);
> +
> +void
> +flow_rule_match_unmasked_key_enc_ipv4_addrs(const struct flow_rule *rule,
> +					    struct flow_match_ipv4_addrs *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule,
> +					  FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS,
> +					  out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_enc_ipv4_addrs);
> +
> +void
> +flow_rule_match_unmasked_key_enc_ipv6_addrs(const struct flow_rule *rule,
> +					    struct flow_match_ipv6_addrs *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule,
> +					  FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS,
> +					  out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_enc_ipv6_addrs);
> +
> +void flow_rule_match_unmasked_key_enc_ip(const struct flow_rule *rule,
> +					 struct flow_match_ip *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_ENC_IP, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_enc_ip);
> +
> +void flow_rule_match_unmasked_key_enc_ports(const struct flow_rule *rule,
> +					    struct flow_match_ports *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_ENC_PORTS,
> +					  out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_enc_ports);
> +
> +void flow_rule_match_unmasked_key_enc_keyid(const struct flow_rule *rule,
> +					    struct flow_match_enc_keyid *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_ENC_KEYID,
> +					  out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_enc_keyid);
> +
> +void flow_rule_match_unmasked_key_enc_opts(const struct flow_rule *rule,
> +					   struct flow_match_enc_opts *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_ENC_OPTS,
> +					  out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_enc_opts);
> +
> +void flow_rule_match_unmasked_key_ct(const struct flow_rule *rule,
> +				     struct flow_match_ct *out)
> +{
> +	FLOW_UNMASKED_KEY_DISSECTOR_MATCH(rule, FLOW_DISSECTOR_KEY_CT, out);
> +}
> +EXPORT_SYMBOL(flow_rule_match_unmasked_key_ct);
> +
>  struct flow_action_cookie *flow_action_cookie_create(void *data,
>  						     unsigned int len,
>  						     gfp_t gfp)
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index 64b70d396397..f1a5352cbb04 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -121,6 +121,7 @@ struct cls_fl_filter {
>  	 */
>  	refcount_t refcnt;
>  	bool deleted;
> +	struct flow_dissector unmasked_key_dissector;
>  };
>  
>  static const struct rhashtable_params mask_ht_params = {
> @@ -449,6 +450,13 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
>  	cls_flower.rule->match.key = &f->mkey;
>  	cls_flower.classid = f->res.classid;
>  
> +	/* Pass unmasked key and corresponding dissector also to the driver,
> +	 * hardware may optimize its flow table based on its capabilities.
> +	 */
> +	cls_flower.rule->match.unmasked_key = &f->key;
> +	cls_flower.rule->match.unmasked_key_dissector =
> +						&f->unmasked_key_dissector;
> +
>  	err = tc_setup_flow_action(&cls_flower.rule->action, &f->exts);
>  	if (err) {
>  		kfree(cls_flower.rule);
> @@ -1753,6 +1761,39 @@ static void fl_init_dissector(struct flow_dissector *dissector,
>  	skb_flow_dissector_init(dissector, keys, cnt);
>  }
>  
> +/* Initialize dissector for unmasked key. */
> +static void fl_init_unmasked_key_dissector(struct flow_dissector *dissector)
> +{
> +	struct flow_dissector_key keys[FLOW_DISSECTOR_KEY_MAX];
> +	size_t cnt = 0;
> +
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_META, meta);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_CONTROL, control);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_BASIC, basic);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ETH_ADDRS, eth);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_IPV4_ADDRS, ipv4);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_IPV6_ADDRS, ipv6);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_PORTS, tp);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_PORTS_RANGE, tp_range);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_IP, ip);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_TCP, tcp);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ICMP, icmp);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ARP, arp);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_MPLS, mpls);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_VLAN, vlan);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_CVLAN, cvlan);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ENC_KEYID, enc_key_id);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS, enc_ipv4);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS, enc_ipv6);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ENC_CONTROL, enc_control);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ENC_PORTS, enc_tp);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ENC_IP, enc_ip);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_ENC_OPTS, enc_opts);
> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_CT, ct);
> +
> +	skb_flow_dissector_init(dissector, keys, cnt);
> +}
> +
>  static struct fl_flow_mask *fl_create_new_mask(struct cls_fl_head *head,
>  					       struct fl_flow_mask *mask)
>  {
> @@ -1980,6 +2021,8 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
>  	if (err)
>  		goto errout;
>  
> +	fl_init_unmasked_key_dissector(&fnew->unmasked_key_dissector);
> +
>  	err = fl_ht_insert_unique(fnew, fold, &in_ht);
>  	if (err)
>  		goto errout_mask;
> -- 
> 2.17.1
> 

^ permalink raw reply

* Re: [PATCH 03/11] bpf: Add btf_ids object
From: Jiri Olsa @ 2020-06-19 13:16 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Networking, bpf,
	Song Liu, Yonghong Song, Martin KaFai Lau, David Miller,
	John Fastabend, Wenbo Zhang, KP Singh, Andrii Nakryiko,
	Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <CAEf4BzZivgf2r+tAnY4+cMTKN3dCb_M-PyVWL_ZSa7SY=x2efA@mail.gmail.com>

On Thu, Jun 18, 2020 at 06:06:49PM -0700, Andrii Nakryiko wrote:

SNIP

> > > +/*
> > > + * The BTF_ID_LIST macro defines pure (unsorted) list
> > > + * of BTF IDs, with following layout:
> > > + *
> > > + * BTF_ID_LIST(list1)
> > > + * BTF_ID(type1, name1)
> > > + * BTF_ID(type2, name2)
> > > + *
> > > + * list1:
> > > + * __BTF_ID__type1__name1__1:
> > > + * .zero 4
> > > + * __BTF_ID__type2__name2__2:
> > > + * .zero 4
> > > + *
> > > + */
> > > +#define BTF_ID_LIST(name)                              \
> >
> > nit: btw, you call it a list here, but btfids tool talks about
> > "sorts". Maybe stick to consistent naming. Either "list" or "set"
> > seems to be appropriate. Set implies a sorted aspect a bit more, IMO.

so how about we keep BTF_ID_LIST as it is and rename
BTF_WHITELIST_* to BTF_SET_*

> >
> > > +asm(                                                   \
> > > +".pushsection " SECTION ",\"a\";               \n"     \
> > > +".global " #name ";                            \n"     \
> >
> > I was expecting to see reserved 4 bytes for list size? I also couldn't
> > find where btfids tool prepends it. From what I could understand, it
> > just assumed the first 4 bytes are the length prefix? Sorry if I'm
> > slow...
> 
> Never mind, this is different from whitelisting you do in patch #8.
> But now I'm curious how this list symbol gets its size correctly
> calculated?..

so the BTF_ID_LIST list does not care about the size,
each symbol in the 'list' gets resolved based on its
__BTF_ID__XX__symbol__XX symbol

the count is kept in BTF_WHITELIST_* list because we
need it to sort it and search in it

thanks,
jirka


^ permalink raw reply

* Re: [PATCH 05/11] bpf: Remove btf_id helpers resolving
From: Jiri Olsa @ 2020-06-19 13:18 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Networking, bpf,
	Song Liu, Yonghong Song, Martin KaFai Lau, David Miller,
	John Fastabend, Wenbo Zhang, KP Singh, Andrii Nakryiko,
	Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <CAEf4BzYw0VciF-7CS164Nk8LLnZ4odtdYQyX1MS4eWDN5WbcSg@mail.gmail.com>

On Thu, Jun 18, 2020 at 06:10:29PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 16, 2020 at 3:06 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Now when we moved the helpers btf_id into .BTF_ids section,
> > we can remove the code that resolve those IDs in runtime.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> 
> Nice! :)
> 
> BTW, have you looked at bpf_ctx_convert stuff? Would we be able to
> replace it with your btfids thing as well?

good, another usage ;-) I'll check

> 
> 
> >  kernel/bpf/btf.c | 88 +++---------------------------------------------
> >  1 file changed, 4 insertions(+), 84 deletions(-)
> >
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 58c9af1d4808..aea7b2cc8d26 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -4049,96 +4049,16 @@ int btf_struct_access(struct bpf_verifier_log *log,
> >         return -EINVAL;
> >  }
> >
> 
> [...]
> 
> >  int btf_resolve_helper_id(struct bpf_verifier_log *log,
> >                           const struct bpf_func_proto *fn, int arg)
> >  {
> > -       int *btf_id = &fn->btf_id[arg];
> > -       int ret;
> > -
> >         if (fn->arg_type[arg] != ARG_PTR_TO_BTF_ID)
> >                 return -EINVAL;
> >
> > -       ret = READ_ONCE(*btf_id);
> > -       if (ret)
> > -               return ret;
> > -       /* ok to race the search. The result is the same */
> > -       ret = __btf_resolve_helper_id(log, fn->func, arg);
> > -       if (!ret) {
> > -               /* Function argument cannot be type 'void' */
> > -               bpf_log(log, "BTF resolution bug\n");
> > -               return -EFAULT;
> > -       }
> > -       WRITE_ONCE(*btf_id, ret);
> > -       return ret;
> > +       if (WARN_ON_ONCE(!fn->btf_id))
> > +               return -EINVAL;
> > +
> > +       return fn->btf_id[arg];
> 
> It probably would be a good idea to add some sanity checking here,
> making sure that btf_id is >0 (void is never a right type) and <=
> nr_types in vmlinux_btf?

yep, will add it ;-)

jirka


^ permalink raw reply

* Re: [PATCH net 1/2] of: of_mdio: Correct loop scanning logic
From: Andrew Lunn @ 2020-06-19 13:21 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, Heiner Kallweit, Russell King, David S. Miller,
	Jakub Kicinski, Rob Herring, Frank Rowand, Dajun Jin,
	Alexandre Belloni, open list,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE
In-Reply-To: <20200619044759.11387-2-f.fainelli@gmail.com>

On Thu, Jun 18, 2020 at 09:47:58PM -0700, Florian Fainelli wrote:
> Commit 209c65b61d94 ("drivers/of/of_mdio.c:fix of_mdiobus_register()")
> introduced a break of the loop on the premise that a successful
> registration should exit the loop. The premise is correct but not to
> code, because rc && rc != -ENODEV is just a special error condition,
> that means we would exit the loop even with rc == -ENODEV which is
> absolutely not correct since this is the error code to indicate to the
> MDIO bus layer that scanning should continue.
> 
> Fix this by explicitly checking for rc = 0 as the only valid condition
> to break out of the loop.
> 
> Fixes: 209c65b61d94 ("drivers/of/of_mdio.c:fix of_mdiobus_register()")
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>  drivers/of/of_mdio.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
> index a04afe79529c..7496dc64d6b5 100644
> --- a/drivers/of/of_mdio.c
> +++ b/drivers/of/of_mdio.c
> @@ -315,9 +315,10 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
>  
>  			if (of_mdiobus_child_is_phy(child)) {
>  				rc = of_mdiobus_register_phy(mdio, child, addr);
> -				if (rc && rc != -ENODEV)
> +				if (!rc)
> +					break;

Maybe add in a comment here about what ENODEV means in this context?
That might avoid it getting broken again in the future.

> +				if (rc != -ENODEV)
>  					goto unregister;
> -				break;
>  			}
>  		}
>  	}
> -- 

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH 02/11] bpf: Compile btfid tool at kernel compilation start
From: Jiri Olsa @ 2020-06-19 13:23 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, netdev, bpf,
	Song Liu, Yonghong Song, Martin KaFai Lau, David Miller,
	Wenbo Zhang, KP Singh, Andrii Nakryiko, Brendan Gregg,
	Florent Revest, Al Viro
In-Reply-To: <5eebd1486e46b_6d292ad5e7a285b817@john-XPS-13-9370.notmuch>

On Thu, Jun 18, 2020 at 01:40:40PM -0700, John Fastabend wrote:
> Jiri Olsa wrote:
> > The btfid tool will be used during the vmlinux linking,
> > so it's necessary it's ready for it.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  Makefile           | 22 ++++++++++++++++++----
> >  tools/Makefile     |  3 +++
> >  tools/bpf/Makefile |  5 ++++-
> >  3 files changed, 25 insertions(+), 5 deletions(-)
> 
> This breaks the build for me. I fixed it with this but then I get warnings,
> 
> diff --git a/tools/bpf/btfid/btfid.c b/tools/bpf/btfid/btfid.c
> index 7cdf39bfb150..3697e8ae9efa 100644
> --- a/tools/bpf/btfid/btfid.c
> +++ b/tools/bpf/btfid/btfid.c
> @@ -48,7 +48,7 @@
>  #include <errno.h>
>  #include <linux/rbtree.h>
>  #include <linux/zalloc.h>
> -#include <btf.h>
> +#include <linux/btf.h>
>  #include <libbpf.h>
>  #include <parse-options.h>
> 
> Here is the error. Is it something about my setup? bpftool/btf.c uses
> <btf.h>. Because this in top-level Makefile we probably don't want to
> push extra setup onto folks.

ouch, I wonder it's because I have libbpf installed and the
setup got mixed up.. I'll erase and try to reproduce

thanks,
jirka

> 
> In file included from btfid.c:51:
> /home/john/git/bpf-next/tools/lib/bpf/btf.h: In function ‘btf_is_var’:
> /home/john/git/bpf-next/tools/lib/bpf/btf.h:254:24: error: ‘BTF_KIND_VAR’ undeclared (first use in this function); did you mean ‘BTF_KIND_PTR’?
>   return btf_kind(t) == BTF_KIND_VAR;
>                         ^~~~~~~~~~~~
>                         BTF_KIND_PTR
> /home/john/git/bpf-next/tools/lib/bpf/btf.h:254:24: note: each undeclared identifier is reported only once for each function it appears in
> /home/john/git/bpf-next/tools/lib/bpf/btf.h: In function ‘btf_is_datasec’:
> /home/john/git/bpf-next/tools/lib/bpf/btf.h:259:24: error: ‘BTF_KIND_DATASEC’ undeclared (first use in this function); did you mean ‘BTF_KIND_PTR’?
>   return btf_kind(t) == BTF_KIND_DATASEC;
>                         ^~~~~~~~~~~~~~~~
>                         BTF_KIND_PTR
> mv: cannot stat '/home/john/git/bpf-next/tools/bpf/btfid/.btfid.o.tmp': No such file or directory
> make[3]: *** [/home/john/git/bpf-next/tools/build/Makefile.build:97: /home/john/git/bpf-next/tools/bpf/btfid/btfid.o] Error 1
> make[2]: *** [Makefile:59: /home/john/git/bpf-next/tools/bpf/btfid/btfid-in.o] Error 2
> make[1]: *** [Makefile:71: bpf/btfid] Error 2
> make: *** [Makefile:1894: tools/bpf/btfid] Error 2
> 


^ permalink raw reply

* Re: [PATCH 06/11] bpf: Do not pass enum bpf_access_type to btf_struct_access
From: Jiri Olsa @ 2020-06-19 13:23 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Networking, bpf,
	Song Liu, Yonghong Song, Martin KaFai Lau, David Miller,
	John Fastabend, Wenbo Zhang, KP Singh, Andrii Nakryiko,
	Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <CAEf4BzY7207CWet_csENUznXESvy9SrQnfzu0PCXmAdHUO0rJw@mail.gmail.com>

On Thu, Jun 18, 2020 at 08:58:06PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 16, 2020 at 3:06 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > There's no need for it.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> 
> It matches bpf_verifier_ops->btf_struct_access, though, which, I
> think, actually allows write access for some special cases. So I think
> we should keep it.

ok, will keep it

jirka

> 
> >  include/linux/bpf.h   | 1 -
> >  kernel/bpf/btf.c      | 3 +--
> >  kernel/bpf/verifier.c | 2 +-
> >  net/ipv4/bpf_tcp_ca.c | 2 +-
> >  4 files changed, 3 insertions(+), 5 deletions(-)
> >
> 
> [...]
> 


^ permalink raw reply

* Re: [PATCH net 2/2] net: phy: Check harder for errors in get_phy_id()
From: Andrew Lunn @ 2020-06-19 13:26 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, Heiner Kallweit, Russell King, David S. Miller,
	Jakub Kicinski, Rob Herring, Frank Rowand, Dajun Jin,
	Alexandre Belloni, open list,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE
In-Reply-To: <20200619044759.11387-3-f.fainelli@gmail.com>

On Thu, Jun 18, 2020 at 09:47:59PM -0700, Florian Fainelli wrote:
> Commit 02a6efcab675 ("net: phy: allow scanning busses with missing
> phys") added a special condition to return -ENODEV in case -ENODEV or
> -EIO was returned from the first read of the MII_PHYSID1 register.
> 
> In case the MDIO bus data line pull-up is not strong enough, the MDIO
> bus controller will not flag this as a read error. This can happen when
> a pluggable daughter card is not connected and weak internal pull-ups
> are used (since that is the only option, otherwise the pins are
> floating).
> 
> The second read of MII_PHYSID2 will be correctly flagged an error
> though, but now we will return -EIO which will be treated as a hard
> error, thus preventing MDIO bus scanning loops to continue succesfully.
> 
> Apply the same logic to both register reads, thus allowing the scanning
> logic to proceed.

Hi Florian

Maybe extend the kerneldoc for this function to document the return
values and there special meanings?

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

BTW: Did you look at get_phy_c45_ids()? Is it using the correct return
value? Given the current work being done to extend scanning to C45,
maybe it needs reviewing for issues like this.

    Andrew

^ permalink raw reply

* Re: [PATCH 08/11] bpf: Add BTF whitelist support
From: Jiri Olsa @ 2020-06-19 13:29 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Networking, bpf,
	Song Liu, Yonghong Song, Martin KaFai Lau, David Miller,
	John Fastabend, Wenbo Zhang, KP Singh, Andrii Nakryiko,
	Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <CAEf4Bzaj-t0UYLiJh9czenqVtsi5UuviX_AqgpEq=gJx6WCHrw@mail.gmail.com>

On Thu, Jun 18, 2020 at 09:29:58PM -0700, Andrii Nakryiko wrote:

SNIP

> > @@ -4669,3 +4670,15 @@ u32 btf_id(const struct btf *btf)
> >  {
> >         return btf->id;
> >  }
> > +
> > +static int btf_id_cmp_func(const void *a, const void *b)
> > +{
> > +       const int *pa = a, *pb = b;
> > +
> > +       return *pa - *pb;
> > +}
> > +
> > +bool btf_whitelist_search(int id, int list[], int cnt)
> 
> whitelist is a bit too specific, this functionality can be used for
> blacklisting as well, no?
> 
> How about instead of "open coding" separately int list[] + int cnt, we
> define a struct:
> 
> struct btf_id_set {
>     u32 cnt;
>     u32 ids[];
> };
> 
> and pass that around?
> 
> This function then can be generic
> 
> bool btf_id_set_contains(struct btf_id_set *set, u32 id);
> 
> Then it's usable for both whitelist and blacklist? _contains also
> clearly implies what's the return result, while _search isn't so clear
> in that regard.

yep, looks better this way, will change

> 
> 
> > +{
> > +       return bsearch(&id, list, cnt, sizeof(int), btf_id_cmp_func) != NULL;
> > +}
> > diff --git a/kernel/bpf/btf_ids.h b/kernel/bpf/btf_ids.h
> > index 68aa5c38a37f..a90c09faa515 100644
> > --- a/kernel/bpf/btf_ids.h
> > +++ b/kernel/bpf/btf_ids.h
> > @@ -67,4 +67,42 @@ asm(                                                 \
> >  #name ":;                                      \n"     \
> >  ".popsection;                                  \n");
> >
> > +
> > +/*
> > + * The BTF_WHITELIST_ENTRY/END macros pair defines sorted
> > + * list of BTF IDs plus its members count, with following
> > + * layout:
> > + *
> > + * BTF_WHITELIST_ENTRY(list2)
> > + * BTF_ID(type1, name1)
> > + * BTF_ID(type2, name2)
> > + * BTF_WHITELIST_END(list)
> 
> It kind of sucks you need two separate ENTRY/END macro (btw, START/END
> or BEGIN/END would be a bit more "paired"), and your example clearly

ok, START/END it is

> shows why: it is not self-consistent (list2 on start, list on end ;).

ugh ;-)

> But doing variadic macro like this would be a nightmare as well,
> unfortunately. :(
> 
> > + *
> > + * __BTF_ID__sort__list:
> > + * list2_cnt:
> > + * .zero 4
> > + * list2:
> > + * __BTF_ID__type1__name1__3:
> > + * .zero 4
> > + * __BTF_ID__type2__name2__4:
> > + * .zero 4
> > + *
> > + */
> > +#define BTF_WHITELIST_ENTRY(name)                      \
> > +asm(                                                   \
> > +".pushsection " SECTION ",\"a\";               \n"     \
> > +".global __BTF_ID__sort__" #name ";            \n"     \
> > +"__BTF_ID__sort__" #name ":;                   \n"     \
> 
> I mentioned in the previous patch already, I think "sort" is a bad
> name, consider "set" (or "list", but you used list name already for a
> slightly different macro).

yes, I replied to this in another email

> 
> > +".global " #name "_cnt;                        \n"     \
> > +#name "_cnt:;                                  \n"     \
> 
> This label/symbol isn't necessary, why polluting the symbol table?

XXX_cnt variable is used in search function, but isn't needed
if we use that 'struct btf_id_set' you proposed

> 
> > +".zero 4                                       \n"     \
> > +".popsection;                                  \n");   \
> > +BTF_ID_LIST(name)
> > +
> > +#define BTF_WHITELIST_END(name)                                \
> > +asm(                                                   \
> > +".pushsection " SECTION ",\"a\";              \n"      \
> > +".size __BTF_ID__sort__" #name ", .-" #name " \n"      \
> > +".popsection;                                 \n");
> > +
> >  #endif
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index bee3da2cd945..5a9a6fd72907 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -4633,6 +4633,11 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
> >                 return -EINVAL;
> >         }
> >
> > +       if (fn->allowed && !fn->allowed(env->prog)) {
> > +               verbose(env, "helper call is not allowed in probe\n");
> 
> nit: probe -> program, or just drop "in probe" part altogether

ok

thnaks,
jirka


^ permalink raw reply

* Re: [PATCH net 2/2] net: phy: Check harder for errors in get_phy_id()
From: Russell King - ARM Linux admin @ 2020-06-19 13:30 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, netdev, Heiner Kallweit, David S. Miller,
	Jakub Kicinski, Rob Herring, Frank Rowand, Dajun Jin,
	Alexandre Belloni, open list,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE
In-Reply-To: <20200619132659.GB304147@lunn.ch>

On Fri, Jun 19, 2020 at 03:26:59PM +0200, Andrew Lunn wrote:
> On Thu, Jun 18, 2020 at 09:47:59PM -0700, Florian Fainelli wrote:
> > Commit 02a6efcab675 ("net: phy: allow scanning busses with missing
> > phys") added a special condition to return -ENODEV in case -ENODEV or
> > -EIO was returned from the first read of the MII_PHYSID1 register.
> > 
> > In case the MDIO bus data line pull-up is not strong enough, the MDIO
> > bus controller will not flag this as a read error. This can happen when
> > a pluggable daughter card is not connected and weak internal pull-ups
> > are used (since that is the only option, otherwise the pins are
> > floating).
> > 
> > The second read of MII_PHYSID2 will be correctly flagged an error
> > though, but now we will return -EIO which will be treated as a hard
> > error, thus preventing MDIO bus scanning loops to continue succesfully.
> > 
> > Apply the same logic to both register reads, thus allowing the scanning
> > logic to proceed.
> 
> Hi Florian
> 
> Maybe extend the kerneldoc for this function to document the return
> values and there special meanings?

You mean like the patch I sent yesterday?

> BTW: Did you look at get_phy_c45_ids()? Is it using the correct return
> value? Given the current work being done to extend scanning to C45,
> maybe it needs reviewing for issues like this.

And the updates I sent for this yesterday? ;)

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply

* Re: [PATCH 09/11] bpf: Add d_path helper
From: Jiri Olsa @ 2020-06-19 13:31 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Networking, bpf,
	Song Liu, Yonghong Song, Martin KaFai Lau, David Miller,
	John Fastabend, Wenbo Zhang, KP Singh, Andrii Nakryiko,
	Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <CAEf4BzY=d5y_-fXvomG7SjkbK7DZn5=-f+sdCYRdZh9qeynQrQ@mail.gmail.com>

On Thu, Jun 18, 2020 at 09:35:10PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 16, 2020 at 3:07 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Adding d_path helper function that returns full path
> > for give 'struct path' object, which needs to be the
> > kernel BTF 'path' object.
> >
> > The helper calls directly d_path function.
> >
> > Updating also bpf.h tools uapi header and adding
> > 'path' to bpf_helpers_doc.py script.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  include/linux/bpf.h            |  4 ++++
> >  include/uapi/linux/bpf.h       | 14 ++++++++++++-
> >  kernel/bpf/btf_ids.c           | 11 ++++++++++
> >  kernel/trace/bpf_trace.c       | 38 ++++++++++++++++++++++++++++++++++
> >  scripts/bpf_helpers_doc.py     |  2 ++
> >  tools/include/uapi/linux/bpf.h | 14 ++++++++++++-
> >  6 files changed, 81 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index a94e85c2ec50..d35265b6c574 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1752,5 +1752,9 @@ extern int bpf_skb_output_btf_ids[];
> >  extern int bpf_seq_printf_btf_ids[];
> >  extern int bpf_seq_write_btf_ids[];
> >  extern int bpf_xdp_output_btf_ids[];
> > +extern int bpf_d_path_btf_ids[];
> > +
> > +extern int btf_whitelist_d_path[];
> > +extern int btf_whitelist_d_path_cnt;
> 
> So with suggestion from previous patch, this would be declared as:
> 
> extern const struct btf_id_set btf_whitelist_d_path;

yes

SNIP

> >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >   * function eBPF program intends to call
> > diff --git a/kernel/bpf/btf_ids.c b/kernel/bpf/btf_ids.c
> > index d8d0df162f04..853c8fd59b06 100644
> > --- a/kernel/bpf/btf_ids.c
> > +++ b/kernel/bpf/btf_ids.c
> > @@ -13,3 +13,14 @@ BTF_ID(struct, seq_file)
> >
> >  BTF_ID_LIST(bpf_xdp_output_btf_ids)
> >  BTF_ID(struct, xdp_buff)
> > +
> > +BTF_ID_LIST(bpf_d_path_btf_ids)
> > +BTF_ID(struct, path)
> > +
> > +BTF_WHITELIST_ENTRY(btf_whitelist_d_path)
> > +BTF_ID(func, vfs_truncate)
> > +BTF_ID(func, vfs_fallocate)
> > +BTF_ID(func, dentry_open)
> > +BTF_ID(func, vfs_getattr)
> > +BTF_ID(func, filp_close)
> > +BTF_WHITELIST_END(btf_whitelist_d_path)
> 
> Oh, so that's why you added btf_ids.c. Do you think centralizing all
> those BTF ID lists in one file is going to be more convenient? I lean
> towards keeping them closer to where they are used, as it was with all
> those helper BTF IDS. But I wonder what others think...

either way works for me, but then BTF_ID_* macros needs to go
to include/linux/btf_ids.h header right?

jirka

> 
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index c1866d76041f..0ff5d8434d40 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1016,6 +1016,42 @@ static const struct bpf_func_proto bpf_send_signal_thread_proto = {
> >         .arg1_type      = ARG_ANYTHING,
> >  };
> >
> 
> [...]
> 


^ permalink raw reply

* Re: [PATCH 10/11] selftests/bpf: Add verifier test for d_path helper
From: Jiri Olsa @ 2020-06-19 13:32 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Networking, bpf,
	Song Liu, Yonghong Song, Martin KaFai Lau, David Miller,
	John Fastabend, Wenbo Zhang, KP Singh, Andrii Nakryiko,
	Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <CAEf4BzYr6hwS5-XKAJt-QEyPiofNvj2M1WA_B-F29QCFoZU2jw@mail.gmail.com>

On Thu, Jun 18, 2020 at 09:38:56PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 16, 2020 at 3:06 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Adding verifier test for attaching tracing program and
> > calling d_path helper from within and testing that it's
> > allowed for dentry_open function and denied for 'd_path'
> > function with appropriate error.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  tools/testing/selftests/bpf/test_verifier.c   | 13 ++++++-
> >  tools/testing/selftests/bpf/verifier/d_path.c | 38 +++++++++++++++++++
> >  2 files changed, 50 insertions(+), 1 deletion(-)
> >  create mode 100644 tools/testing/selftests/bpf/verifier/d_path.c
> >
> > diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> > index 78a6bae56ea6..3cce3dc766a2 100644
> > --- a/tools/testing/selftests/bpf/test_verifier.c
> > +++ b/tools/testing/selftests/bpf/test_verifier.c
> > @@ -114,6 +114,7 @@ struct bpf_test {
> >                 bpf_testdata_struct_t retvals[MAX_TEST_RUNS];
> >         };
> >         enum bpf_attach_type expected_attach_type;
> > +       const char *kfunc;
> >  };
> >
> >  /* Note we want this to be 64 bit aligned so that the end of our array is
> > @@ -984,8 +985,18 @@ static void do_test_single(struct bpf_test *test, bool unpriv,
> >                 attr.log_level = 4;
> >         attr.prog_flags = pflags;
> >
> > +       if (prog_type == BPF_PROG_TYPE_TRACING && test->kfunc) {
> > +               attr.attach_btf_id = libbpf_find_vmlinux_btf_id(test->kfunc,
> > +                                               attr.expected_attach_type);
> 
> if (!attr.attach_btf_id)
>   emit more meaningful error, than later during load?

ok

> 
> > +       }
> > +
> >         fd_prog = bpf_load_program_xattr(&attr, bpf_vlog, sizeof(bpf_vlog));
> > -       if (fd_prog < 0 && !bpf_probe_prog_type(prog_type, 0)) {
> > +
> > +       /* BPF_PROG_TYPE_TRACING requires more setup and
> > +        * bpf_probe_prog_type won't give correct answer
> > +        */
> > +       if (fd_prog < 0 && (prog_type != BPF_PROG_TYPE_TRACING) &&
> 
> nit: () are redundant

ok

> 
> > +           !bpf_probe_prog_type(prog_type, 0)) {
> >                 printf("SKIP (unsupported program type %d)\n", prog_type);
> >                 skips++;
> >                 goto close_fds;
> > diff --git a/tools/testing/selftests/bpf/verifier/d_path.c b/tools/testing/selftests/bpf/verifier/d_path.c
> > new file mode 100644
> > index 000000000000..e08181abc056
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/verifier/d_path.c
> > @@ -0,0 +1,38 @@
> > +{
> > +       "d_path accept",
> > +       .insns = {
> > +       BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_1, 0),
> > +       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > +       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > +       BPF_MOV64_IMM(BPF_REG_6, 0),
> > +       BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_6, 0),
> > +       BPF_LD_IMM64(BPF_REG_3, 8),
> > +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_d_path),
> > +       BPF_MOV64_IMM(BPF_REG_0, 0),
> > +       BPF_EXIT_INSN(),
> > +       },
> > +       .errstr = "R0 max value is outside of the array range",
> > +       .result = ACCEPT,
> 
> accept with error string expected?

oops, probably lefover, will check

thanks,
jirka

> 
> 
> > +       .prog_type = BPF_PROG_TYPE_TRACING,
> > +       .expected_attach_type = BPF_TRACE_FENTRY,
> > +       .kfunc = "dentry_open",
> > +},
> > +{
> > +       "d_path reject",
> > +       .insns = {
> > +       BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_1, 0),
> > +       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > +       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > +       BPF_MOV64_IMM(BPF_REG_6, 0),
> > +       BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_6, 0),
> > +       BPF_LD_IMM64(BPF_REG_3, 8),
> > +       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_d_path),
> > +       BPF_MOV64_IMM(BPF_REG_0, 0),
> > +       BPF_EXIT_INSN(),
> > +       },
> > +       .errstr = "helper call is not allowed in probe",
> > +       .result = REJECT,
> > +       .prog_type = BPF_PROG_TYPE_TRACING,
> > +       .expected_attach_type = BPF_TRACE_FENTRY,
> > +       .kfunc = "d_path",
> > +},
> > --
> > 2.25.4
> >
> 


^ permalink raw reply

* Re: [PATCH 11/11] selftests/bpf: Add test for d_path helper
From: Jiri Olsa @ 2020-06-19 13:34 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Wenbo Zhang,
	Networking, bpf, Song Liu, Yonghong Song, Martin KaFai Lau,
	David Miller, John Fastabend, KP Singh, Andrii Nakryiko,
	Brendan Gregg, Florent Revest, Al Viro
In-Reply-To: <CAEf4BzZmNFUBdSzCLiiQ-anQRmnzd-E1qa0wVdXHu0pYV_-=Nw@mail.gmail.com>

On Thu, Jun 18, 2020 at 09:44:23PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 16, 2020 at 3:07 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Adding test for d_path helper which is pretty much
> > copied from Wenbo Zhang's test for bpf_get_fd_path,
> > which never made it in.
> >
> > I've failed so far to compile the test with <linux/fs.h>
> > kernel header, so for now adding 'struct file' with f_path
> > member that has same offset as kernel's file object.
> >
> > Original-patch-by: Wenbo Zhang <ethercflow@gmail.com>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  .../testing/selftests/bpf/prog_tests/d_path.c | 153 ++++++++++++++++++
> >  .../testing/selftests/bpf/progs/test_d_path.c |  55 +++++++
> >  2 files changed, 208 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/d_path.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_d_path.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/d_path.c b/tools/testing/selftests/bpf/prog_tests/d_path.c
> > new file mode 100644
> > index 000000000000..e2b7dfeb506f
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/d_path.c
> > @@ -0,0 +1,153 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#define _GNU_SOURCE
> > +#include <test_progs.h>
> > +#include <sys/stat.h>
> > +#include <linux/sched.h>
> > +#include <sys/syscall.h>
> > +
> > +#define MAX_PATH_LEN           128
> > +#define MAX_FILES              7
> > +#define MAX_EVENT_NUM          16
> > +
> > +struct d_path_test_data {
> > +       pid_t pid;
> > +       __u32 cnt_stat;
> > +       __u32 cnt_close;
> > +       char paths_stat[MAX_EVENT_NUM][MAX_PATH_LEN];
> > +       char paths_close[MAX_EVENT_NUM][MAX_PATH_LEN];
> > +};
> 
> with skeleton there is no point in defining this container struct, and
> especially duplicating it between BPF code and user-space code. Just
> declare all those fields as global variables and access them from
> skeleton directly.

ok, will check

> 
> [...]
> 
> > diff --git a/tools/testing/selftests/bpf/progs/test_d_path.c b/tools/testing/selftests/bpf/progs/test_d_path.c
> > new file mode 100644
> > index 000000000000..1b478c00ee7a
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/test_d_path.c
> > @@ -0,0 +1,55 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include "vmlinux.h"
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> > +
> > +#define MAX_PATH_LEN           128
> > +#define MAX_EVENT_NUM          16
> > +
> > +static struct d_path_test_data {
> > +       pid_t pid;
> > +       __u32 cnt_stat;
> > +       __u32 cnt_close;
> > +       char paths_stat[MAX_EVENT_NUM][MAX_PATH_LEN];
> > +       char paths_close[MAX_EVENT_NUM][MAX_PATH_LEN];
> > +} data;
> > +
> > +struct path;
> > +struct kstat;
> 
> both structs are in vmlinux.h, you shouldn't need this.

leftover from earlier version, will remove

thanks,
jirka


^ permalink raw reply

* Re: [PATCH net-next 3/3] cls_flower: Allow flow offloading though masked key exist.
From: Ido Schimmel @ 2020-06-19 13:35 UTC (permalink / raw)
  To: dsatish
  Cc: davem, jhs, xiyou.wangcong, jiri, kuba, netdev, simon.horman,
	kesavac, prathibha.nagooru, intiyaz.basha, jai.rana
In-Reply-To: <20200619094156.31184-4-satish.d@oneconvergence.com>

On Fri, Jun 19, 2020 at 03:11:56PM +0530, dsatish wrote:
> A packet reaches OVS user space, only if, either there is no rule in
> datapath/hardware or there is race condition that the flow is added
> to hardware but before it is processed another packet arrives.
> 
> It is possible hardware as part of its limitations/optimizations
> remove certain flows.

Which driver is doing this? I don't believe it's valid to remove filters
from hardware and not tell anyone about it.

> To handle such cases where the hardware lost
> the flows, tc can offload to hardware if it receives a flow for which
> there exists an entry in its flow table. To handle such cases TC when
> it returns EEXIST error, also programs the flow in hardware, if
> hardware offload is enabled.
> 
> Signed-off-by: Chandra Kesava <kesavac@gmail.com>
> Signed-off-by: Prathibha Nagooru <prathibha.nagooru@oneconvergence.com>
> Signed-off-by: Satish Dhote <satish.d@oneconvergence.com>
> ---
>  net/sched/cls_flower.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index f1a5352cbb04..d718233cd5b9 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -431,7 +431,8 @@ static void fl_hw_destroy_filter(struct tcf_proto *tp, struct cls_fl_filter *f,
>  
>  static int fl_hw_replace_filter(struct tcf_proto *tp,
>  				struct cls_fl_filter *f, bool rtnl_held,
> -				struct netlink_ext_ack *extack)
> +				struct netlink_ext_ack *extack,
> +				unsigned long cookie)
>  {
>  	struct tcf_block *block = tp->chain->block;
>  	struct flow_cls_offload cls_flower = {};
> @@ -444,7 +445,7 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
>  
>  	tc_cls_common_offload_init(&cls_flower.common, tp, f->flags, extack);
>  	cls_flower.command = FLOW_CLS_REPLACE;
> -	cls_flower.cookie = (unsigned long) f;
> +	cls_flower.cookie = cookie;
>  	cls_flower.rule->match.dissector = &f->mask->dissector;
>  	cls_flower.rule->match.mask = &f->mask->key;
>  	cls_flower.rule->match.key = &f->mkey;
> @@ -2024,11 +2025,25 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
>  	fl_init_unmasked_key_dissector(&fnew->unmasked_key_dissector);
>  
>  	err = fl_ht_insert_unique(fnew, fold, &in_ht);
> -	if (err)
> +	if (err) {
> +		/* It is possible Hardware lost the flow even though TC has it,
> +		 * and flow miss in hardware causes controller to offload flow again.
> +		 */
> +		if (err == -EEXIST && !tc_skip_hw(fnew->flags)) {
> +			struct cls_fl_filter *f =
> +				__fl_lookup(fnew->mask, &fnew->mkey);
> +
> +			if (f)
> +				fl_hw_replace_filter(tp, fnew, rtnl_held,
> +						     extack,
> +						     (unsigned long)(f));
> +		}
>  		goto errout_mask;
> +	}
>  
>  	if (!tc_skip_hw(fnew->flags)) {
> -		err = fl_hw_replace_filter(tp, fnew, rtnl_held, extack);
> +		err = fl_hw_replace_filter(tp, fnew, rtnl_held, extack,
> +					   (unsigned long)fnew);
>  		if (err)
>  			goto errout_ht;
>  	}
> -- 
> 2.17.1
> 

^ permalink raw reply

* Re: [RFC PATCH 3/9] net: dsa: hellcreek: Add PTP clock support
From: Andrew Lunn @ 2020-06-19 13:40 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: Vivien Didelot, Florian Fainelli, David S. Miller, Jakub Kicinski,
	netdev, Rob Herring, devicetree, Sebastian Andrzej Siewior,
	Richard Cochran, Kamil Alkhouri, ilias.apalodimas
In-Reply-To: <878sgjqx4r.fsf@kurt>

On Fri, Jun 19, 2020 at 10:26:44AM +0200, Kurt Kanzenbach wrote:
> Hi Andrew,
> 
> On Thu Jun 18 2020, Andrew Lunn wrote:
> >> +static u64 __hellcreek_ptp_clock_read(struct hellcreek *hellcreek)
> >> +{
> >> +	u16 nsl, nsh, secl, secm, sech;
> >> +
> >> +	/* Take a snapshot */
> >> +	hellcreek_ptp_write(hellcreek, PR_COMMAND_C_SS, PR_COMMAND_C);
> >> +
> >> +	/* The time of the day is saved as 96 bits. However, due to hardware
> >> +	 * limitations the seconds are not or only partly kept in the PTP
> >> +	 * core. That's why only the nanoseconds are used and the seconds are
> >> +	 * tracked in software. Anyway due to internal locking all five
> >> +	 * registers should be read.
> >> +	 */
> >> +	sech = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> >> +	secm = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> >> +	secl = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> >> +	nsh  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> >> +	nsl  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
> >> +
> >> +	return (u64)nsl | ((u64)nsh << 16);
> >
> > Hi Kurt
> >
> > What are the hardware limitations? There seems to be 48 bits for
> > seconds? That allows for 8925104 years?
> 
> In theory, yes. Due to hardware hardware considerations only a few or
> none of these bits are used for the seconds. The rest is zero. Meaning
> that the wraparound is not 8925104 years, but at e.g. 8 seconds when
> using 3 bits for the seconds.

Please add this to the comment.

> >> +static u64 __hellcreek_ptp_gettime(struct hellcreek *hellcreek)
> >> +{
> >> +	u64 ns;
> >> +
> >> +	ns = __hellcreek_ptp_clock_read(hellcreek);
> >> +	if (ns < hellcreek->last_ts)
> >> +		hellcreek->seconds++;
> >> +	hellcreek->last_ts = ns;
> >> +	ns += hellcreek->seconds * NSEC_PER_SEC;
> >
> > So the assumption is, this gets called at least once per second. And
> > if that does not happen, there is no recovery. The second is lost.
> 
> Yes, exactly. If a single overflow is missed, then the time is wrong.
> 
> >
> > I'm just wondering if there is something more robust using what the
> > hardware does provide, even if the hardware is not perfect.
> 
> I don't think there's a more robust way to do this. The overflow period
> is a second which should be enough time to catch the overflow even if
> the system is loaded. We did long running tests for days and the
> mechanism worked fine. We could also consider to move the delayed work
> to a dedicated thread which could be run with real time (SCHED_FIFO)
> priority. But, I didn't see the need for it.

If the hardware does give you 3 working bits for the seconds, you
could make use of that for a consistency check. If nothing else, you
could do a

dev_err(dev, 'PTP time is FUBAR');

	     Andrew



^ permalink raw reply

* Re: [RFC PATCH 6/9] net: dsa: hellcreek: Add debugging mechanisms
From: Andrew Lunn @ 2020-06-19 13:42 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: Vivien Didelot, Florian Fainelli, David S. Miller, Jakub Kicinski,
	netdev, Rob Herring, devicetree, Sebastian Andrzej Siewior,
	Richard Cochran, Kamil Alkhouri, ilias.apalodimas
In-Reply-To: <875zbnqwo2.fsf@kurt>

> > Are trace registers counters?
> 
> No. The trace registers provide bits for error conditions and if packets
> have been dropped e.g. because of full queues or FCS errors, and so on.

Is there some documentation somewhere? A better understanding of what
they can do might help figure out the correct API.

     Andrew

^ permalink raw reply

* Re: [RFC PATCH 7/9] net: dsa: hellcreek: Add PTP status LEDs
From: Andrew Lunn @ 2020-06-19 13:52 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: Vivien Didelot, Florian Fainelli, David S. Miller, Jakub Kicinski,
	netdev, Rob Herring, devicetree, Sebastian Andrzej Siewior,
	Richard Cochran, Kamil Alkhouri, ilias.apalodimas
In-Reply-To: <87366rqw9b.fsf@kurt>

On Fri, Jun 19, 2020 at 10:45:36AM +0200, Kurt Kanzenbach wrote:
> Hi Andrew,
> 
> On Thu Jun 18 2020, Andrew Lunn wrote:
> > On Thu, Jun 18, 2020 at 08:40:27AM +0200, Kurt Kanzenbach wrote:
> >> The switch has two controllable I/Os which are usually connected to LEDs. This
> >> is useful to immediately visually see the PTP status.
> >> 
> >> These provide two signals:
> >> 
> >>  * is_gm
> >> 
> >>    This LED can be activated if the current device is the grand master in that
> >>    PTP domain.
> >> 
> >>  * sync_good
> >> 
> >>    This LED can be activated if the current device is in sync with the network
> >>    time.
> >> 
> >> Expose these via the LED framework to be controlled via user space
> >> e.g. linuxptp.
> >
> > Hi Kurt
> >
> > Is the hardware driving these signals at all? Or are these just
> > suggested names in the documentation? It would not be an issue to have
> > user space to configure them to use the heartbeat trigger, etc?
> 
> These are more like GPIOs. If a 1 is set into the register then the
> hardware drives the signal to high. The names are from the
> documentation:
> 
>  * sync_good: This signal indicates that the switch is in sync
>  * is_gm:     This signal indicates that the switch is the grand master
> 
> However, these signals have to be set by user space. Most likely these
> signals are connected to LEDs.

Thanks

Since these are general purpose LEDs, you might want to look at

Documentation/devicetree/bindings/leds/common.yaml

and implement some of the common properties. The label should ideally
correspond to the text on the case, not what the datasheet says. So
getting it from DT is a good idea. Do Hirschmanns own cases use this
text on there front plate?

	    Andrew

^ permalink raw reply

* Re: RATE not being printed on tc -s class show dev XXXX
From: Roberto J. Blandino Cisneros @ 2020-06-19 13:56 UTC (permalink / raw)
  To: Linux Kernel Network Developers
In-Reply-To: <CAM_iQpWa6KmiWv72YmB3ufR8Rw9RD9=PwLMamjOS6fSCM+zXbA@mail.gmail.com>

My TC rules are as follow:

TC="/usr/sbin/tc"
DEV=$1
ip=$2
MASTER1="50000kbit"
RATE1="1000kbit"
burst="5k"
$TC qdisc add dev $DEV root handle 1: htb default 10
$TC class add dev $DEV parent 1: classid 1:1 htb rate ${MASTER1} ceil 
${MASTER1} burst $burst
$TC class add dev $DEV parent 1:1 classid 1:10 htb rate ${RATE1} ceil 
${RATE1} prio 3 burst $burst
$TC qdisc add dev $DEV parent 1:10 handle 10: sfq perturb 10
$TC filter add dev $DEV parent 1: protocol all prio 1 u32 match ip dst 
$ip flowid 1:10
$TC filter add dev $DEV parent 1: protocol all prio 1 u32 match ip src 
$ip flowid 1:10


El 18/6/20 a las 19:30, Cong Wang escribió:
>
> You either need to enable /sys/module/sch_htb/parameters/htb_rate_est
> or specify a rate estimator when you create your HTB class.
Does it need to be set on the "root handle"?
>
> Thanks.
>


^ permalink raw reply

* Re: [RFC PATCH 9/9] dt-bindings: net: dsa: Add documentation for Hellcreek switches
From: Andrew Lunn @ 2020-06-19 13:56 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: Vivien Didelot, Florian Fainelli, David S. Miller, Jakub Kicinski,
	netdev, Rob Herring, devicetree, Sebastian Andrzej Siewior,
	Richard Cochran, Kamil Alkhouri, ilias.apalodimas
In-Reply-To: <87zh8zphlc.fsf@kurt>

> > The switch is 100/100Mbps right? The MAC is only Fast ethernet. Do you
> > need some properties in the port@0 node to tell the switch to only use
> > 100Mbps? I would expect it to default to 1G. Not looked at the code
> > yet...
> 
> No, that is not needed. That is a hardware configuration and AFAIK
> cannot be changed at run time.

I was wondering about that in general. I did not spot any code in the
driver dealing with results from the PHY auto-neg. So you are saying
the CPU is fixed speed, by strapping? But what about the other ports?
Does the MAC need to know the PHY has negotiated 10Half, not 1G? Would
that not make a difference to your TSN?

     Andrew

^ permalink raw reply

* Re: [PATCH v13 3/9] smccc: Export smccc conduit get helper.
From: Christoph Hellwig @ 2020-06-19 13:57 UTC (permalink / raw)
  To: Jianyong Wu
  Cc: netdev, yangbo.lu, john.stultz, tglx, pbonzini,
	sean.j.christopherson, maz, richardcochran, Mark.Rutland, will,
	suzuki.poulose, steven.price, justin.he, Wei.Chen, kvm,
	Steve.Capper, linux-kernel, Kaly.Xin, nd, kvmarm,
	linux-arm-kernel
In-Reply-To: <20200619130120.40556-4-jianyong.wu@arm.com>

On Fri, Jun 19, 2020 at 09:01:14PM +0800, Jianyong Wu wrote:
> Export arm_smccc_1_1_get_conduit then modules can use smccc helper which
> adopts it.
> 
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
> ---
>  drivers/firmware/smccc/smccc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/firmware/smccc/smccc.c b/drivers/firmware/smccc/smccc.c
> index 4e80921ee212..b855fe7b5c90 100644
> --- a/drivers/firmware/smccc/smccc.c
> +++ b/drivers/firmware/smccc/smccc.c
> @@ -24,6 +24,7 @@ enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void)
>  
>  	return smccc_conduit;
>  }
> +EXPORT_SYMBOL(arm_smccc_1_1_get_conduit);

EXPORT_SYMBOL_GPL, please.


^ permalink raw reply

* Re: [PATCH net-next 2/3] cls_flower: Pass the unmasked key to hw
From: Jiri Pirko @ 2020-06-19 14:00 UTC (permalink / raw)
  To: dsatish
  Cc: davem, jhs, xiyou.wangcong, kuba, netdev, simon.horman, kesavac,
	prathibha.nagooru, intiyaz.basha, jai.rana
In-Reply-To: <20200619094156.31184-3-satish.d@oneconvergence.com>

Fri, Jun 19, 2020 at 11:41:55AM CEST, satish.d@oneconvergence.com wrote:
>Pass the unmasked key along with the masked key to the hardware.
>This enables hardware to manage its own tables better based on the
>hardware features/capabilities.

I don't undertand the motivation. Could you provide some examples?
Also, as Ido pointed out already, you need to submit the driver changes
in this patchset too.

^ permalink raw reply

* Re: [PATCH net-next 3/3] cls_flower: Allow flow offloading though masked key exist.
From: Jiri Pirko @ 2020-06-19 14:01 UTC (permalink / raw)
  To: dsatish
  Cc: davem, jhs, xiyou.wangcong, kuba, netdev, simon.horman, kesavac,
	prathibha.nagooru, intiyaz.basha, jai.rana
In-Reply-To: <20200619094156.31184-4-satish.d@oneconvergence.com>

Fri, Jun 19, 2020 at 11:41:56AM CEST, satish.d@oneconvergence.com wrote:
>A packet reaches OVS user space, only if, either there is no rule in
>datapath/hardware or there is race condition that the flow is added
>to hardware but before it is processed another packet arrives.
>
>It is possible hardware as part of its limitations/optimizations
>remove certain flows. To handle such cases where the hardware lost
>the flows, tc can offload to hardware if it receives a flow for which
>there exists an entry in its flow table. To handle such cases TC when
>it returns EEXIST error, also programs the flow in hardware, if
>hardware offload is enabled.
>
>Signed-off-by: Chandra Kesava <kesavac@gmail.com>
>Signed-off-by: Prathibha Nagooru <prathibha.nagooru@oneconvergence.com>
>Signed-off-by: Satish Dhote <satish.d@oneconvergence.com>
>---
> net/sched/cls_flower.c | 23 +++++++++++++++++++----
> 1 file changed, 19 insertions(+), 4 deletions(-)
>
>diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>index f1a5352cbb04..d718233cd5b9 100644
>--- a/net/sched/cls_flower.c
>+++ b/net/sched/cls_flower.c
>@@ -431,7 +431,8 @@ static void fl_hw_destroy_filter(struct tcf_proto *tp, struct cls_fl_filter *f,
> 
> static int fl_hw_replace_filter(struct tcf_proto *tp,
> 				struct cls_fl_filter *f, bool rtnl_held,
>-				struct netlink_ext_ack *extack)
>+				struct netlink_ext_ack *extack,
>+				unsigned long cookie)
> {
> 	struct tcf_block *block = tp->chain->block;
> 	struct flow_cls_offload cls_flower = {};
>@@ -444,7 +445,7 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
> 
> 	tc_cls_common_offload_init(&cls_flower.common, tp, f->flags, extack);
> 	cls_flower.command = FLOW_CLS_REPLACE;
>-	cls_flower.cookie = (unsigned long) f;
>+	cls_flower.cookie = cookie;
> 	cls_flower.rule->match.dissector = &f->mask->dissector;
> 	cls_flower.rule->match.mask = &f->mask->key;
> 	cls_flower.rule->match.key = &f->mkey;
>@@ -2024,11 +2025,25 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
> 	fl_init_unmasked_key_dissector(&fnew->unmasked_key_dissector);
> 
> 	err = fl_ht_insert_unique(fnew, fold, &in_ht);
>-	if (err)
>+	if (err) {
>+		/* It is possible Hardware lost the flow even though TC has it,
>+		 * and flow miss in hardware causes controller to offload flow again.

What? Seems that you have a buggy HW what should be fixed...

What is "controller"?


>+		 */
>+		if (err == -EEXIST && !tc_skip_hw(fnew->flags)) {
>+			struct cls_fl_filter *f =
>+				__fl_lookup(fnew->mask, &fnew->mkey);
>+
>+			if (f)
>+				fl_hw_replace_filter(tp, fnew, rtnl_held,
>+						     extack,
>+						     (unsigned long)(f));
>+		}
> 		goto errout_mask;
>+	}
> 
> 	if (!tc_skip_hw(fnew->flags)) {
>-		err = fl_hw_replace_filter(tp, fnew, rtnl_held, extack);
>+		err = fl_hw_replace_filter(tp, fnew, rtnl_held, extack,
>+					   (unsigned long)fnew);
> 		if (err)
> 			goto errout_ht;
> 	}
>-- 
>2.17.1
>

^ permalink raw reply

* Re: RATE not being printed on tc -s class show dev XXXX
From: Roberto J. Blandino Cisneros @ 2020-06-19 14:07 UTC (permalink / raw)
  To: Linux Kernel Network Developers
In-Reply-To: <d746708a-f000-a6f7-d0d5-ab5905d080da@ibw.com.ni>

El 19/6/20 a las 07:56, Roberto J. Blandino Cisneros escribió:
>
> El 18/6/20 a las 19:30, Cong Wang escribió:
>>
>> You either need to enable /sys/module/sch_htb/parameters/htb_rate_est
>> or specify a rate estimator when you create your HTB class.
> Does it need to be set on the "root handle"?

echo "1" > /sys/module/sch_htb/parameters/htb_rate_est

Did the trick. Thanks.

I look in google and nothing came up as "No rating being printed on tc 
-s class show"

Now with the parameter you gave me i look it in google and i was not 
going to find it in the way i was looking for this inconvenient. If i 
were able to guess that it is a "HTB scheduler" it would another pattern 
of search and i would be able to find it easy.

Thanks. It is solve!

>>
>> Thanks.
>>
>
>

^ permalink raw reply

* [PATCH net-next 0/5] cxgb4: add support for ethtool n-tuple filters
From: Vishal Kulkarni @ 2020-06-19 14:21 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, rahul.lakkireddy, Vishal Kulkarni

Patch 1: Adds data structure to maintain list of filters and handles init/dinit
	 of the same.

Patch 2: Handles addition of filters via ETHTOOL_SRXCLSRLINS.

Patch 3: Handles deletion of filtes via ETHTOOL_SRXCLSRLDEL.

Patch 4: Handles viewing of added filters.

Patch 5: Adds FLOW_ACTION_QUEUE support.

Vishal Kulkarni (5):
  cxgb4: add skeleton for ethtool n-tuple filters
  cxgb4: add ethtool n-tuple filter insertion
  cxgb4: add ethtool n-tuple filter deletion
  cxgb4: add support to fetch ethtool n-tuple filters
  cxgb4: add action to steer flows to specific Rxq

 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |  15 +
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    | 335 ++++++++++++++++++
 .../net/ethernet/chelsio/cxgb4/cxgb4_filter.c |   5 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_filter.h |   3 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  22 +-
 .../ethernet/chelsio/cxgb4/cxgb4_tc_flower.c  | 131 ++++---
 .../ethernet/chelsio/cxgb4/cxgb4_tc_flower.h  |   9 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_uld.h    |   2 +
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h  |   4 +
 9 files changed, 463 insertions(+), 63 deletions(-)

-- 
2.18.2


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox