Netdev List


* Re: [PATCH net-next] net: phy: realtek: load driver for all PHYs with a Realtek OUI
From: David Miller @ 2018-11-08  6:19 UTC (permalink / raw)
  To: hkallweit1; +Cc: f.fainelli, andrew, netdev
In-Reply-To: <a1a08754-1f94-3689-f26b-076283d9cc03@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Wed, 7 Nov 2018 08:52:46 +0100

> Instead of listing every single PHYID, load the driver for every PHYID
> with a Realtek OUI, independent of model number and revision.
> 
> This patch also improves two further aspects:
> - constify realtek_tbl[]
> - the mask should have been 0xffffffff instead of 0x001fffff so far,
>   by masking out some bits a PHY from another vendor could have been
>   matched
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Applied.
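
The mask issue described in the quoted commit message can be illustrated with a small sketch. It assumes the usual masked comparison, (phy_id & mask) == (drv_id & mask). The helper name and the foreign ID 0xffdcc915 are made up for illustration, and the vendor mask 0xfffffc00 (upper 22 bits) is an assumption about how an OUI-only match would be expressed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: masked comparison of a PHY ID read from the
 * hardware against a driver table entry. Only the bits set in `mask`
 * take part in the comparison.
 */
static bool phy_id_matches(uint32_t phy_id, uint32_t drv_id, uint32_t mask)
{
	return (phy_id & mask) == (drv_id & mask);
}
```

With mask 0x001fffff the upper 11 bits are ignored, so the made-up ID 0xffdcc915 would match a table entry of 0x001cc915; with 0xffffffff it would not. An OUI-only match such as 0x001cc800 under mask 0xfffffc00 still accepts 0x001cc915, since only model and revision bits are masked out.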


* Re: [PATCH net-next] net: phy: make phy_trigger_machine static
From: David Miller @ 2018-11-08  6:19 UTC (permalink / raw)
  To: hkallweit1; +Cc: f.fainelli, andrew, netdev
In-Reply-To: <291aa78f-678b-14b4-7c98-d73799a5e455@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Wed, 7 Nov 2018 08:15:58 +0100

> phy_trigger_machine() is used in phy.c only, so we can make it static.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Applied.


* Re: [PATCH] [stable, netdev 4.4+] lan78xx: make sure RX_ADDRL & RX_ADDRH regs are always up to date
From: Sasha Levin @ 2018-11-08 15:49 UTC (permalink / raw)
  To: Paolo Pisati
  Cc: Woojung Huh, Microchip Linux Driver Support, netdev, stable,
	linux-usb, linux-kernel
In-Reply-To: <20181108110127.GA8415@harukaze>

On Thu, Nov 08, 2018 at 12:01:27PM +0100, Paolo Pisati wrote:
>On Wed, Nov 07, 2018 at 07:17:51PM -0500, Sasha Levin wrote:
>> So why not just take 760db29bdc completely? It looks safer than taking a
>> partial backport, and will make applying future patches easier.
>>
>> I tried to do it and it doesn't look like there are any dependencies
>> that would cause an issue.
>
>Somehow i was convinced it didn't build on 4.4.x... can you pick it up?
>
>commit 760db29bdc97b73ff60b091315ad787b1deb5cf5
>Author: Phil Elwell <phil@raspberrypi.org>
>Date:   Thu Apr 19 17:59:38 2018 +0100
>
>    lan78xx: Read MAC address from DT if present
>
>    There is a standard mechanism for locating and using a MAC address from
>    the Device Tree. Use this facility in the lan78xx driver to support
>    applications without programmed EEPROM or OTP. At the same time,
>    regularise the handling of the different address sources.
>
>    Signed-off-by: Phil Elwell <phil@raspberrypi.org>
>    Signed-off-by: David S. Miller <davem@davemloft.net>

Can you confirm it actually works on 4.4?

--
Thanks,
Sasha


* Re: (2) (2) [Kernel][NET] Bug report on packet defragmenting
From: Eric Dumazet @ 2018-11-08  6:13 UTC (permalink / raw)
  To: soukjin.bae, netdev@vger.kernel.org
In-Reply-To: <8b2209af-1221-f4f5-54e5-d9f5a503373e@gmail.com>



On 11/07/2018 08:26 PM, Eric Dumazet wrote:
> 
> 
> On 11/07/2018 08:10 PM, 배석진 wrote:
>>> --------- Original Message ---------
>>> Sender : Eric Dumazet <eric.dumazet@gmail.com>
>>> Date   : 2018-11-08 12:57 (GMT+9)
>>> Title  : Re: (2) [Kernel][NET] Bug report on packet defragmenting
>>>  
>>> On 11/07/2018 07:24 PM, Eric Dumazet wrote:
>>>
>>>>  Sure, it is better if RPS is smarter, but if there is a bug in IPv6 defrag unit
>>>>  we must investigate and root-cause it.
>>>  
>>> BTW, IPv4 defrag seems to have the same issue.
>>  
>>
>> yes, it could be.
>> the key point isn't limited to ipv6.
>>
>> maybe because of faster air-network and modem,
>> it seems to occur more often and we noticed it.
>>
>> anyway,
>> we'll apply our patch to resolve this problem.
> 
> Yeah, and I will fix the defrag units.
> 
> We can not rely on other layers doing proper no-reorder logic for us.
> 
> Problem here is that multiple cpus attempt concurrent rhashtable_insert_fast()
> and do not properly recover in case -EEXIST is returned.
> 
> This is silly, of course :/

Patch would be https://patchwork.ozlabs.org/patch/994658/


* Re: [net-next 06/12] i40e/ixgbe/igb: fail on new WoL flag setting WAKE_MAGICSECURE
From: Kevin Easton @ 2018-11-08  6:05 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: davem, Todd Fujinaka, netdev, nhorman, sassmann
In-Reply-To: <20181107224830.9737-7-jeffrey.t.kirsher@intel.com>

On Wed, Nov 07, 2018 at 02:48:24PM -0800, Jeff Kirsher wrote:
> From: Todd Fujinaka <todd.fujinaka@intel.com>
> 
> There's a new flag for setting WoL filters that is only
> enabled on one manufacturer's NICs, and it's not ours. Fail
> with EOPNOTSUPP.
> 
> Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_ethtool.c   | 3 ++-
>  drivers/net/ethernet/intel/igb/igb_ethtool.c     | 2 +-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 3 ++-
>  3 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
> index 9f8464f80783..9c1211ad2c6b 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
> @@ -2377,7 +2377,8 @@ static int i40e_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
>  		return -EOPNOTSUPP;
>  
>  	/* only magic packet is supported */
> -	if (wol->wolopts && (wol->wolopts != WAKE_MAGIC))
> +	if (wol->wolopts && (wol->wolopts != WAKE_MAGIC)
> +			  | (wol->wolopts != WAKE_FILTER))
>  		return -EOPNOTSUPP;

This doesn't look right.  WAKE_MAGIC and WAKE_FILTER are distinct, so

(wol->wolopts != WAKE_MAGIC) | (wol->wolopts != WAKE_FILTER)

will always be 1.

It looks like the existing test in this driver was fine - it *only*
accepted wol->wolopts of either 0 or WAKE_MAGIC, it was already
rejecting everything else including WAKE_FILTER.

Suggest you drop that hunk.
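
A minimal sketch of the point above, using stand-in flag macros (the EX_ names and the helper functions are hypothetical, not driver code). Because | binds tighter than && in C, the patched condition reduces to "wolopts is non-zero", which would reject WAKE_MAGIC as well:

```c
#include <stdbool.h>

/* Stand-in flag values for illustration; the real definitions live in
 * the ethtool UAPI header. */
#define EX_WAKE_MAGIC  (1u << 5)
#define EX_WAKE_FILTER (1u << 7)

/* The test as it was before the patch: accept only 0 or WAKE_MAGIC. */
static bool rejected_by_original(unsigned int wolopts)
{
	return wolopts && (wolopts != EX_WAKE_MAGIC);
}

/*
 * The test from the quoted hunk, with the implicit precedence made
 * explicit. The two flags are distinct, so at least one of the
 * inequalities always holds and the bitwise OR is always 1.
 */
static bool rejected_by_patch(unsigned int wolopts)
{
	return wolopts && ((wolopts != EX_WAKE_MAGIC) |
			   (wolopts != EX_WAKE_FILTER));
}
```

Under these assumptions rejected_by_patch() returns true for every non-zero value, including EX_WAKE_MAGIC, while rejected_by_original() already returns true for EX_WAKE_FILTER.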

    - Kevin

>
>  
>  	/* is this a new value? */
> diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
> index 5acf3b743876..c57671068245 100644
> --- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
> +++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
> @@ -2113,7 +2113,7 @@ static int igb_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
>  {
>  	struct igb_adapter *adapter = netdev_priv(netdev);
>  
> -	if (wol->wolopts & (WAKE_ARP | WAKE_MAGICSECURE))
> +	if (wol->wolopts & (WAKE_ARP | WAKE_MAGICSECURE | WAKE_FILTER))
>  		return -EOPNOTSUPP;
>  
>  	if (!(adapter->flags & IGB_FLAG_WOL_SUPPORTED))
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> index 732b1e6ecc43..acba067cc15a 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
> @@ -2206,7 +2206,8 @@ static int ixgbe_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
>  {
>  	struct ixgbe_adapter *adapter = netdev_priv(netdev);
>  
> -	if (wol->wolopts & (WAKE_PHY | WAKE_ARP | WAKE_MAGICSECURE))
> +	if (wol->wolopts & (WAKE_PHY | WAKE_ARP | WAKE_MAGICSECURE |
> +			    WAKE_FILTER))
>  		return -EOPNOTSUPP;
>  
>  	if (ixgbe_wol_exclusion(adapter, wol))
> -- 
> 2.19.1
> 
> 
> 


* [PATCH net] inet: frags: better deal with smp races
From: Eric Dumazet @ 2018-11-08  6:10 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Eric Dumazet, 배석진

Multiple cpus might attempt to insert a new fragment in rhashtable,
if for example RPS is buggy, as reported by 배석진 in
https://patchwork.ozlabs.org/patch/994601/

We use rhashtable_lookup_get_insert_key() instead of
rhashtable_insert_fast() to let cpus losing the race
free their own inet_frag_queue and use the one that
was inserted by another cpu.
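
The "loser frees its own object" pattern can be sketched in plain userspace C. This is a single-threaded toy with a chained hash table; all names (ex_queue, ex_lookup_get_insert, ex_find_or_create) are hypothetical, and it has none of the RCU or atomicity guarantees of the real rhashtable. It only shows the control flow: the insert helper returns the entry that is already present, or NULL if the new one went in.

```c
#include <stddef.h>
#include <stdlib.h>

#define EX_NBUCKETS 8

struct ex_queue {
	unsigned int key;
	struct ex_queue *next;
};

static struct ex_queue *ex_buckets[EX_NBUCKETS];

/* Return the entry already stored under q->key, or NULL if q was
 * inserted. Lookup and insert happen in one step. */
static struct ex_queue *ex_lookup_get_insert(struct ex_queue *q)
{
	struct ex_queue **head = &ex_buckets[q->key % EX_NBUCKETS];
	struct ex_queue *cur;

	for (cur = *head; cur; cur = cur->next)
		if (cur->key == q->key)
			return cur;

	q->next = *head;
	*head = q;
	return NULL;
}

/* Allocate a candidate, try to insert it, and fall back to the entry
 * that won the race if one is already present. */
static struct ex_queue *ex_find_or_create(unsigned int key)
{
	struct ex_queue *q = malloc(sizeof(*q));
	struct ex_queue *prev;

	if (!q)
		return NULL;
	q->key = key;
	q->next = NULL;

	prev = ex_lookup_get_insert(q);
	if (prev) {
		free(q);	/* lost the race: drop our own copy */
		return prev;
	}
	return q;
}
```

Calling ex_find_or_create() twice with the same key yields the same object both times, which is the behaviour a plain insert that fails with -EEXIST does not give the caller.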

Fixes: 648700f76b03 ("inet: frags: use rhashtables for reassembly units")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: 배석진 <soukjin.bae@samsung.com>
---
 net/ipv4/inet_fragment.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index bcb11f3a27c0c34115af05034a5a20f57842eb0a..ced9abd4bec6cd494e352c1d6a97da8f67cf6073 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -178,21 +178,22 @@ static struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,
 }
 
 static struct inet_frag_queue *inet_frag_create(struct netns_frags *nf,
-						void *arg)
+						void *arg,
+						struct inet_frag_queue **prev)
 {
 	struct inet_frags *f = nf->f;
 	struct inet_frag_queue *q;
-	int err;
 
 	q = inet_frag_alloc(nf, f, arg);
-	if (!q)
+	if (!q) {
+		*prev = ERR_PTR(-ENOMEM);
 		return NULL;
-
+	}
 	mod_timer(&q->timer, jiffies + nf->timeout);
 
-	err = rhashtable_insert_fast(&nf->rhashtable, &q->node,
-				     f->rhash_params);
-	if (err < 0) {
+	*prev = rhashtable_lookup_get_insert_key(&nf->rhashtable, &q->key,
+						 &q->node, f->rhash_params);
+	if (*prev) {
 		q->flags |= INET_FRAG_COMPLETE;
 		inet_frag_kill(q);
 		inet_frag_destroy(q);
@@ -204,22 +205,22 @@ static struct inet_frag_queue *inet_frag_create(struct netns_frags *nf,
 /* TODO : call from rcu_read_lock() and no longer use refcount_inc_not_zero() */
 struct inet_frag_queue *inet_frag_find(struct netns_frags *nf, void *key)
 {
-	struct inet_frag_queue *fq;
+	struct inet_frag_queue *fq, *prev;
 
 	if (!nf->high_thresh || frag_mem_limit(nf) > nf->high_thresh)
 		return NULL;
 
 	rcu_read_lock();
 
-	fq = rhashtable_lookup(&nf->rhashtable, key, nf->f->rhash_params);
-	if (fq) {
+	prev = rhashtable_lookup(&nf->rhashtable, key, nf->f->rhash_params);
+	if (!prev)
+		fq = inet_frag_create(nf, key, &prev);
+	if (prev && !IS_ERR(prev)) {
+		fq = prev;
 		if (!refcount_inc_not_zero(&fq->refcnt))
 			fq = NULL;
-		rcu_read_unlock();
-		return fq;
 	}
 	rcu_read_unlock();
-
-	return inet_frag_create(nf, key);
+	return fq;
 }
 EXPORT_SYMBOL(inet_frag_find);
-- 
2.19.1.930.g4563a0d9d0-goog


* Re: [PATCH net-next] dpaa2-eth: Introduce TX congestion management
From: David Miller @ 2018-11-08  6:07 UTC (permalink / raw)
  To: ruxandra.radulescu; +Cc: netdev, ioana.ciornei
In-Reply-To: <1541586669-24334-1-git-send-email-ruxandra.radulescu@nxp.com>

From: Ioana Ciocoi Radulescu <ruxandra.radulescu@nxp.com>
Date: Wed, 7 Nov 2018 10:31:16 +0000

> We chose this mechanism over BQL (to which it is conceptually
> very similar) because a) we can take advantage of the hardware
> offloading and b) BQL doesn't match well with our driver fastpath
> (we process ingress (Rx or Tx conf) frames in batches of up to 16,
> which in certain scenarios confuses the BQL adaptive algorithm,
> resulting in too low values of the limit and low performance).

First, this kind of explanation belongs in the commit message.

Second, you'll have to describe better what is wrong with BQL, which is
the ultimate standard mechanism for every single driver in the
kernel to deal with this issue.

Are you saying that if 15 TX frames are pending, no TX interrupt
will arrive at all?

There absolutely must be some timeout or similar interrupt that gets
sent in that kind of situation.  You cannot leave stale TX packets
on your ring unprocessed just because a non-multiple of 16 packets
were queued up and then TX activity stopped.


* Re: [PATCH v2 net-next] sock: Reset dst when changing sk_mark via setsockopt
From: Eric Dumazet @ 2018-11-08  5:59 UTC (permalink / raw)
  To: David Barmann, netdev
In-Reply-To: <20181108045552.GA24562@konacove.com>



On 11/07/2018 08:55 PM, David Barmann wrote:
> When setting the SO_MARK socket option, the dst needs to be reset so
> that a new route lookup is performed.
> 
> This fixes the case where an application wants to change routing by
> setting a new sk_mark.  If this is done after some packets have already
> been sent, the dst is cached and has no effect.
> 
> Signed-off-by: David Barmann <david.barmann@stackpath.com>
> ---
>  net/core/sock.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 7b304e454a38..c74b10be86cb 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -952,10 +952,12 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
>  			clear_bit(SOCK_PASSSEC, &sock->flags);
>  		break;
>  	case SO_MARK:
> -		if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
> +		if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
>  			ret = -EPERM;
> -		else
> +		} else {
>  			sk->sk_mark = val;
> +			sk_dst_reset(sk);


There is no need to force a sk_dst_reset(sk) if sk_mark was not changed.

I already gave you this feedback, please do not ignore it.
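
A rough sketch of the requested shape, using a stand-in structure rather than struct sock (ex_sock, its fields, and ex_set_mark are hypothetical): the cached route is invalidated only when the mark actually changes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the socket state involved; not the kernel's struct sock. */
struct ex_sock {
	uint32_t mark;
	bool dst_cached;	/* true while a route lookup result is cached */
};

/* Set the mark and drop the cached route only if the value changed.
 * Returns true when a reset was performed. */
static bool ex_set_mark(struct ex_sock *sk, uint32_t val)
{
	if (sk->mark == val)
		return false;

	sk->mark = val;
	sk->dst_cached = false;	/* forces a new route lookup later */
	return true;
}
```

Setting the same mark twice then leaves the cached route untouched on the second call.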

Thanks.


* Re: [PATCH 0/4] FDDI: defza: Fix a bunch of small issues
From: David Miller @ 2018-11-08  5:53 UTC (permalink / raw)
  To: macro; +Cc: netdev
In-Reply-To: <alpine.LFD.2.21.1811070323450.20378@eddie.linux-mips.org>

From: "Maciej W. Rozycki" <macro@linux-mips.org>
Date: Wed, 7 Nov 2018 12:06:46 +0000 (GMT)

>  Here is a bunch of small fixes addressing issues that I missed in my 
> final round of testing.  None of these affect run-time behaviour.  One was 
> actually found by the kbuild bot, which turned out to be more pedantic 
> than my compiler.  See individual change descriptions for details.
> 
>  Please apply.

Series applied.


* Re: [PATCH net-next] net: phy: bcm7xxx: Add entry for BCM7255
From: David Miller @ 2018-11-08  5:50 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, andrew, justinpopo6
In-Reply-To: <20181107003744.19976-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Tue,  6 Nov 2018 16:37:44 -0800

> From: Justin Chen <justinpopo6@gmail.com>
> 
> Add support for BCM7255 EPHY.
> 
> Signed-off-by: Justin Chen <justinpopo6@gmail.com>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Applied, thanks Florian.


* Re: [PATCH][net-next] net/ipv6: compute anycast address hash only if dev is null
From: David Miller @ 2018-11-08  5:49 UTC (permalink / raw)
  To: lirongqing; +Cc: netdev
In-Reply-To: <1541655340-7035-1-git-send-email-lirongqing@baidu.com>

From: Li RongQing <lirongqing@baidu.com>
Date: Thu,  8 Nov 2018 13:35:40 +0800

> avoid computing the hash value if dev is not null, since
> the hash value is not used
> 
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
>  net/ipv6/anycast.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
> index 94999058e110..a20e344486cb 100644
> --- a/net/ipv6/anycast.c
> +++ b/net/ipv6/anycast.c
> @@ -433,15 +433,16 @@ static bool ipv6_chk_acast_dev(struct net_device *dev, const struct in6_addr *ad
>  bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
>  			 const struct in6_addr *addr)
>  {
> -	unsigned int hash = inet6_acaddr_hash(net, addr);
>  	struct net_device *nh_dev;
>  	struct ifacaddr6 *aca;
>  	bool found = false;
> +	unsigned int hash;
>  
>  	rcu_read_lock();
>  	if (dev)
>  		found = ipv6_chk_acast_dev(dev, addr);
> -	else
> +	else {
> +		hash = inet6_acaddr_hash(net, addr);
>  		hlist_for_each_entry_rcu(aca, &inet6_acaddr_lst[hash],

Please move the hash local variable declaration into this basic block
too, if you're going to do this.
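
What this request amounts to can be sketched with a simplified stand-in (ex_table, ex_chk_addr and the modulo hash are hypothetical, not the anycast code): the variable is declared at the top of the only block that uses it.

```c
#include <stdbool.h>
#include <stddef.h>

#define EX_BUCKETS 4

/* Toy one-entry-per-bucket table, indexed by addr % EX_BUCKETS. */
static const unsigned int ex_table[EX_BUCKETS] = { 8, 5, 0, 7 };

/* If a device-specific address is given, compare against it directly;
 * otherwise fall back to the hashed global table. */
static bool ex_chk_addr(const unsigned int *dev_addr, unsigned int addr)
{
	bool found = false;

	if (dev_addr) {
		found = (*dev_addr == addr);
	} else {
		/* hash is needed only on this path, so it lives here */
		unsigned int hash = addr % EX_BUCKETS;

		found = (ex_table[hash] == addr);
	}
	return found;
}
```

Keeping the declaration inside the else block makes it obvious that the device path never touches the hash.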

Thanks.


* [PATCH v3 bpf-next 4/4] bpftool: support loading flow dissector
From: Stanislav Fomichev @ 2018-11-08  5:39 UTC (permalink / raw)
  To: netdev, linux-kselftest, ast, daniel, shuah, jakub.kicinski,
	quentin.monnet
  Cc: guro, jiong.wang, sdf, bhole_prashant_q7, john.fastabend, jbenc,
	treeze.taeung, yhs, osk, sandipan
In-Reply-To: <20181108053957.205681-1-sdf@google.com>

This commit adds support for loading/attaching/detaching flow
dissector program. The structure of the flow dissector program is
assumed to be the same as in the selftests:

* flow_dissector section with the main entry point
* a bunch of tail call progs
* a jmp_table map that is populated with the tail call progs

When `bpftool load` is called with a flow_dissector prog (i.e. when the
first section is flow_dissector or the 'type flow_dissector' argument is
passed), we load and pin all the programs/maps. The user is responsible
for constructing the jump table for the tail calls.

The last argument of `bpftool attach` is made optional for this use
case.

Example:
bpftool prog load tools/testing/selftests/bpf/bpf_flow.o \
	/sys/fs/bpf/flow type flow_dissector

bpftool map update pinned /sys/fs/bpf/flow/jmp_table \
        key 0 0 0 0 \
        value pinned /sys/fs/bpf/flow/IP

bpftool map update pinned /sys/fs/bpf/flow/jmp_table \
        key 1 0 0 0 \
        value pinned /sys/fs/bpf/flow/IPV6

bpftool map update pinned /sys/fs/bpf/flow/jmp_table \
        key 2 0 0 0 \
        value pinned /sys/fs/bpf/flow/IPV6OP

bpftool map update pinned /sys/fs/bpf/flow/jmp_table \
        key 3 0 0 0 \
        value pinned /sys/fs/bpf/flow/IPV6FR

bpftool map update pinned /sys/fs/bpf/flow/jmp_table \
        key 4 0 0 0 \
        value pinned /sys/fs/bpf/flow/MPLS

bpftool map update pinned /sys/fs/bpf/flow/jmp_table \
        key 5 0 0 0 \
        value pinned /sys/fs/bpf/flow/VLAN

bpftool prog attach pinned /sys/fs/bpf/flow/flow_dissector flow_dissector

Tested by using the above lines to load the prog in
the test_flow_dissector.sh selftest.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 .../bpftool/Documentation/bpftool-prog.rst    |  36 ++++--
 tools/bpf/bpftool/bash-completion/bpftool     |   6 +-
 tools/bpf/bpftool/common.c                    |  30 ++---
 tools/bpf/bpftool/main.h                      |   1 +
 tools/bpf/bpftool/prog.c                      | 112 +++++++++++++-----
 5 files changed, 126 insertions(+), 59 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-prog.rst b/tools/bpf/bpftool/Documentation/bpftool-prog.rst
index ac4e904b10fb..0374634c3087 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-prog.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-prog.rst
@@ -15,7 +15,8 @@ SYNOPSIS
 	*OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-f** | **--bpffs** } }
 
 	*COMMANDS* :=
-	{ **show** | **list** | **dump xlated** | **dump jited** | **pin** | **load** | **help** }
+	{ **show** | **list** | **dump xlated** | **dump jited** | **pin** | **load**
+	| **loadall** | **help** }
 
 MAP COMMANDS
 =============
@@ -24,9 +25,9 @@ MAP COMMANDS
 |	**bpftool** **prog dump xlated** *PROG* [{**file** *FILE* | **opcodes** | **visual**}]
 |	**bpftool** **prog dump jited**  *PROG* [{**file** *FILE* | **opcodes**}]
 |	**bpftool** **prog pin** *PROG* *FILE*
-|	**bpftool** **prog load** *OBJ* *FILE* [**type** *TYPE*] [**map** {**idx** *IDX* | **name** *NAME*} *MAP*] [**dev** *NAME*]
-|       **bpftool** **prog attach** *PROG* *ATTACH_TYPE* *MAP*
-|       **bpftool** **prog detach** *PROG* *ATTACH_TYPE* *MAP*
+|	**bpftool** **prog { load | loadall }** *OBJ* *FILE* [**type** *TYPE*] [**map** {**idx** *IDX* | **name** *NAME*} *MAP*] [**dev** *NAME*]
+|       **bpftool** **prog attach** *PROG* *ATTACH_TYPE* [*MAP*]
+|       **bpftool** **prog detach** *PROG* *ATTACH_TYPE* [*MAP*]
 |	**bpftool** **prog help**
 |
 |	*MAP* := { **id** *MAP_ID* | **pinned** *FILE* }
@@ -39,7 +40,9 @@ MAP COMMANDS
 |		**cgroup/bind4** | **cgroup/bind6** | **cgroup/post_bind4** | **cgroup/post_bind6** |
 |		**cgroup/connect4** | **cgroup/connect6** | **cgroup/sendmsg4** | **cgroup/sendmsg6**
 |	}
-|       *ATTACH_TYPE* := { **msg_verdict** | **skb_verdict** | **skb_parse** }
+|       *ATTACH_TYPE* := {
+|		**msg_verdict** | **skb_verdict** | **skb_parse** | **flow_dissector**
+|	}
 
 
 DESCRIPTION
@@ -79,8 +82,11 @@ DESCRIPTION
 		  contain a dot character ('.'), which is reserved for future
 		  extensions of *bpffs*.
 
-	**bpftool prog load** *OBJ* *FILE* [**type** *TYPE*] [**map** {**idx** *IDX* | **name** *NAME*} *MAP*] [**dev** *NAME*]
+	**bpftool prog { load | loadall }** *OBJ* *FILE* [**type** *TYPE*] [**map** {**idx** *IDX* | **name** *NAME*} *MAP*] [**dev** *NAME*]
 		  Load bpf program from binary *OBJ* and pin as *FILE*.
+		  **bpftool prog load** will pin only the first bpf program
+		  from the *OBJ*, **bpftool prog loadall** will pin all maps
+		  and programs from the *OBJ*.
 		  **type** is optional, if not specified program type will be
 		  inferred from section names.
 		  By default bpftool will create new maps as declared in the ELF
@@ -97,13 +103,17 @@ DESCRIPTION
 		  contain a dot character ('.'), which is reserved for future
 		  extensions of *bpffs*.
 
-        **bpftool prog attach** *PROG* *ATTACH_TYPE* *MAP*
-                  Attach bpf program *PROG* (with type specified by *ATTACH_TYPE*)
-                  to the map *MAP*.
-
-        **bpftool prog detach** *PROG* *ATTACH_TYPE* *MAP*
-                  Detach bpf program *PROG* (with type specified by *ATTACH_TYPE*)
-                  from the map *MAP*.
+        **bpftool prog attach** *PROG* *ATTACH_TYPE* [*MAP*]
+                  Attach bpf program *PROG* (with type specified by
+                  *ATTACH_TYPE*). Most *ATTACH_TYPEs* require a *MAP*
+                  parameter, with the exception of *flow_dissector* which is
+                  attached to current networking name space.
+
+        **bpftool prog detach** *PROG* *ATTACH_TYPE* [*MAP*]
+                  Detach bpf program *PROG* (with type specified by
+                  *ATTACH_TYPE*). Most *ATTACH_TYPEs* require a *MAP*
+                  parameter, with the exception of *flow_dissector* which is
+                  detached from the current networking name space.
 
 	**bpftool prog help**
 		  Print short help message.
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index 3f78e6404589..ad0fc919f7ec 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -243,7 +243,7 @@ _bpftool()
     # Completion depends on object and command in use
     case $object in
         prog)
-            if [[ $command != "load" ]]; then
+            if [[ $command != "load" && $command != "loadall" ]]; then
                 case $prev in
                     id)
                         _bpftool_get_prog_ids
@@ -299,7 +299,7 @@ _bpftool()
                     fi
 
                     if [[ ${#words[@]} == 6 ]]; then
-                        COMPREPLY=( $( compgen -W "msg_verdict skb_verdict skb_parse" -- "$cur" ) )
+                        COMPREPLY=( $( compgen -W "msg_verdict skb_verdict skb_parse flow_dissector" -- "$cur" ) )
                         return 0
                     fi
 
@@ -309,7 +309,7 @@ _bpftool()
                     fi
                     return 0
                     ;;
-                load)
+                load|loadall)
                     local obj
 
                     if [[ ${#words[@]} -lt 6 ]]; then
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index 25af85304ebe..f671a921dec5 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -169,34 +169,24 @@ int open_obj_pinned_any(char *path, enum bpf_obj_type exp_type)
 	return fd;
 }
 
-int do_pin_fd(int fd, const char *name)
+int mount_bpffs_for_pin(const char *name)
 {
 	char err_str[ERR_MAX_LEN];
 	char *file;
 	char *dir;
 	int err = 0;
 
-	err = bpf_obj_pin(fd, name);
-	if (!err)
-		goto out;
-
 	file = malloc(strlen(name) + 1);
 	strcpy(file, name);
 	dir = dirname(file);
 
-	if (errno != EPERM || is_bpffs(dir)) {
-		p_err("can't pin the object (%s): %s", name, strerror(errno));
+	if (is_bpffs(dir)) {
+		/* nothing to do if already mounted */
 		goto out_free;
 	}
 
-	/* Attempt to mount bpffs, then retry pinning. */
 	err = mnt_bpffs(dir, err_str, ERR_MAX_LEN);
-	if (!err) {
-		err = bpf_obj_pin(fd, name);
-		if (err)
-			p_err("can't pin the object (%s): %s", name,
-			      strerror(errno));
-	} else {
+	if (err) {
 		err_str[ERR_MAX_LEN - 1] = '\0';
 		p_err("can't mount BPF file system to pin the object (%s): %s",
 		      name, err_str);
@@ -204,10 +194,20 @@ int do_pin_fd(int fd, const char *name)
 
 out_free:
 	free(file);
-out:
 	return err;
 }
 
+int do_pin_fd(int fd, const char *name)
+{
+	int err;
+
+	err = mount_bpffs_for_pin(name);
+	if (err)
+		return err;
+
+	return bpf_obj_pin(fd, name);
+}
+
 int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(__u32))
 {
 	unsigned int id;
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 28322ace2856..1383824c9baf 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -129,6 +129,7 @@ const char *get_fd_type_name(enum bpf_obj_type type);
 char *get_fdinfo(int fd, const char *key);
 int open_obj_pinned(char *path);
 int open_obj_pinned_any(char *path, enum bpf_obj_type exp_type);
+int mount_bpffs_for_pin(const char *name);
 int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(__u32));
 int do_pin_fd(int fd, const char *name);
 
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 5302ee282409..a4346dd673b1 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -81,6 +81,7 @@ static const char * const attach_type_strings[] = {
 	[BPF_SK_SKB_STREAM_PARSER] = "stream_parser",
 	[BPF_SK_SKB_STREAM_VERDICT] = "stream_verdict",
 	[BPF_SK_MSG_VERDICT] = "msg_verdict",
+	[BPF_FLOW_DISSECTOR] = "flow_dissector",
 	[__MAX_BPF_ATTACH_TYPE] = NULL,
 };
 
@@ -724,10 +725,11 @@ int map_replace_compar(const void *p1, const void *p2)
 static int do_attach(int argc, char **argv)
 {
 	enum bpf_attach_type attach_type;
-	int err, mapfd, progfd;
+	int err, progfd;
+	int mapfd = 0;
 
-	if (!REQ_ARGS(5)) {
-		p_err("too few parameters for map attach");
+	if (!REQ_ARGS(3)) {
+		p_err("too few parameters for attach");
 		return -EINVAL;
 	}
 
@@ -740,11 +742,17 @@ static int do_attach(int argc, char **argv)
 		p_err("invalid attach type");
 		return -EINVAL;
 	}
-	NEXT_ARG();
+	if (attach_type != BPF_FLOW_DISSECTOR) {
+		NEXT_ARG();
+		if (!REQ_ARGS(2)) {
+			p_err("too few parameters for map attach");
+			return -EINVAL;
+		}
 
-	mapfd = map_parse_fd(&argc, &argv);
-	if (mapfd < 0)
-		return mapfd;
+		mapfd = map_parse_fd(&argc, &argv);
+		if (mapfd < 0)
+			return mapfd;
+	}
 
 	err = bpf_prog_attach(progfd, mapfd, attach_type, 0);
 	if (err) {
@@ -760,10 +768,11 @@ static int do_attach(int argc, char **argv)
 static int do_detach(int argc, char **argv)
 {
 	enum bpf_attach_type attach_type;
-	int err, mapfd, progfd;
+	int err, progfd;
+	int mapfd = 0;
 
-	if (!REQ_ARGS(5)) {
-		p_err("too few parameters for map detach");
+	if (!REQ_ARGS(3)) {
+		p_err("too few parameters for detach");
 		return -EINVAL;
 	}
 
@@ -776,11 +785,17 @@ static int do_detach(int argc, char **argv)
 		p_err("invalid attach type");
 		return -EINVAL;
 	}
-	NEXT_ARG();
+	if (attach_type != BPF_FLOW_DISSECTOR) {
+		NEXT_ARG();
+		if (!REQ_ARGS(2)) {
+			p_err("too few parameters for map detach");
+			return -EINVAL;
+		}
 
-	mapfd = map_parse_fd(&argc, &argv);
-	if (mapfd < 0)
-		return mapfd;
+		mapfd = map_parse_fd(&argc, &argv);
+		if (mapfd < 0)
+			return mapfd;
+	}
 
 	err = bpf_prog_detach2(progfd, mapfd, attach_type);
 	if (err) {
@@ -792,15 +807,16 @@ static int do_detach(int argc, char **argv)
 		jsonw_null(json_wtr);
 	return 0;
 }
-static int do_load(int argc, char **argv)
+
+static int load_with_options(int argc, char **argv, bool first_prog_only)
 {
 	enum bpf_attach_type expected_attach_type;
 	struct bpf_object_open_attr attr = {
 		.prog_type	= BPF_PROG_TYPE_UNSPEC,
 	};
 	struct map_replace *map_replace = NULL;
+	struct bpf_program *prog = NULL, *pos;
 	unsigned int old_map_fds = 0;
-	struct bpf_program *prog;
 	struct bpf_object *obj;
 	struct bpf_map *map;
 	const char *pinfile;
@@ -918,14 +934,20 @@ static int do_load(int argc, char **argv)
 		goto err_free_reuse_maps;
 	}
 
-	prog = bpf_program__next(NULL, obj);
-	if (!prog) {
-		p_err("object file doesn't contain any bpf program");
-		goto err_close_obj;
+	if (first_prog_only) {
+		prog = bpf_program__next(NULL, obj);
+		if (!prog) {
+			p_err("object file doesn't contain any bpf program");
+			goto err_close_obj;
+		}
 	}
 
-	bpf_program__set_ifindex(prog, ifindex);
 	if (attr.prog_type == BPF_PROG_TYPE_UNSPEC) {
+		if (!prog) {
+			p_err("can not guess program type when loading all programs\n");
+			goto err_close_obj;
+		}
+
 		const char *sec_name = bpf_program__title(prog, false);
 
 		err = libbpf_prog_type_by_name(sec_name, &attr.prog_type,
@@ -936,8 +958,13 @@ static int do_load(int argc, char **argv)
 			goto err_close_obj;
 		}
 	}
-	bpf_program__set_type(prog, attr.prog_type);
-	bpf_program__set_expected_attach_type(prog, expected_attach_type);
+
+	bpf_object__for_each_program(pos, obj) {
+		bpf_program__set_ifindex(pos, ifindex);
+		bpf_program__set_type(pos, attr.prog_type);
+		bpf_program__set_expected_attach_type(pos,
+						      expected_attach_type);
+	}
 
 	qsort(map_replace, old_map_fds, sizeof(*map_replace),
 	      map_replace_compar);
@@ -1001,9 +1028,25 @@ static int do_load(int argc, char **argv)
 		goto err_close_obj;
 	}
 
-	if (do_pin_fd(bpf_program__fd(prog), pinfile))
+	err = mount_bpffs_for_pin(pinfile);
+	if (err)
 		goto err_close_obj;
 
+	if (prog) {
+		err = bpf_obj_pin(bpf_program__fd(prog), pinfile);
+		if (err) {
+			p_err("failed to pin program %s",
+			      bpf_program__title(prog, false));
+			goto err_close_obj;
+		}
+	} else {
+		err = bpf_object__pin(obj, pinfile);
+		if (err) {
+			p_err("failed to pin all programs");
+			goto err_close_obj;
+		}
+	}
+
 	if (json_output)
 		jsonw_null(json_wtr);
 
@@ -1023,6 +1066,16 @@ static int do_load(int argc, char **argv)
 	return -1;
 }
 
+static int do_load(int argc, char **argv)
+{
+	return load_with_options(argc, argv, true);
+}
+
+static int do_loadall(int argc, char **argv)
+{
+	return load_with_options(argc, argv, false);
+}
+
 static int do_help(int argc, char **argv)
 {
 	if (json_output) {
@@ -1035,10 +1088,11 @@ static int do_help(int argc, char **argv)
 		"       %s %s dump xlated PROG [{ file FILE | opcodes | visual }]\n"
 		"       %s %s dump jited  PROG [{ file FILE | opcodes }]\n"
 		"       %s %s pin   PROG FILE\n"
-		"       %s %s load  OBJ  FILE [type TYPE] [dev NAME] \\\n"
+		"       %s %s { load | loadall } OBJ  FILE \\\n"
+		"                         [type TYPE] [dev NAME] \\\n"
 		"                         [map { idx IDX | name NAME } MAP]\n"
-		"       %s %s attach PROG ATTACH_TYPE MAP\n"
-		"       %s %s detach PROG ATTACH_TYPE MAP\n"
+		"       %s %s attach PROG ATTACH_TYPE [MAP]\n"
+		"       %s %s detach PROG ATTACH_TYPE [MAP]\n"
 		"       %s %s help\n"
 		"\n"
 		"       " HELP_SPEC_MAP "\n"
@@ -1050,7 +1104,8 @@ static int do_help(int argc, char **argv)
 		"                 cgroup/bind4 | cgroup/bind6 | cgroup/post_bind4 |\n"
 		"                 cgroup/post_bind6 | cgroup/connect4 | cgroup/connect6 |\n"
 		"                 cgroup/sendmsg4 | cgroup/sendmsg6 }\n"
-		"       ATTACH_TYPE := { msg_verdict | skb_verdict | skb_parse }\n"
+		"       ATTACH_TYPE := { msg_verdict | skb_verdict | skb_parse |\n"
+		"                        flow_dissector }\n"
 		"       " HELP_SPEC_OPTIONS "\n"
 		"",
 		bin_name, argv[-2], bin_name, argv[-2], bin_name, argv[-2],
@@ -1067,6 +1122,7 @@ static const struct cmd cmds[] = {
 	{ "dump",	do_dump },
 	{ "pin",	do_pin },
 	{ "load",	do_load },
+	{ "loadall",	do_loadall },
 	{ "attach",	do_attach },
 	{ "detach",	do_detach },
 	{ 0 }
-- 
2.19.1.930.g4563a0d9d0-goog


* [PATCH v3 bpf-next 3/4] libbpf: bpf_program__pin: add special case for instances.nr == 1
From: Stanislav Fomichev @ 2018-11-08  5:39 UTC (permalink / raw)
  To: netdev, linux-kselftest, ast, daniel, shuah, jakub.kicinski,
	quentin.monnet
  Cc: guro, jiong.wang, sdf, bhole_prashant_q7, john.fastabend, jbenc,
	treeze.taeung, yhs, osk, sandipan
In-Reply-To: <20181108053957.205681-1-sdf@google.com>

When bpf_program has only one instance, don't create a subdirectory with
per-instance pin files (<prog>/0). Instead, just create a single pin file
for that single instance. This simplifies object pinning by not creating
unnecessary subdirectories.

This can potentially break existing users that depend on the case
where '/0' is always created. However, I couldn't find any serious
usage of bpf_program__pin inside the kernel tree and I suppose there
should be none outside.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/lib/bpf/libbpf.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index db84c85554e7..8407a880acbe 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1761,6 +1761,11 @@ int bpf_program__pin(struct bpf_program *prog, const char *path)
 		return -EINVAL;
 	}
 
+	if (prog->instances.nr == 1) {
+		/* don't create subdirs when pinning single instance */
+		return bpf_program__pin_instance(prog, path, 0);
+	}
+
 	err = make_dir(path);
 	if (err)
 		return err;
@@ -1823,6 +1828,11 @@ int bpf_program__unpin(struct bpf_program *prog, const char *path)
 		return -EINVAL;
 	}
 
+	if (prog->instances.nr == 1) {
+		/* don't look for subdirs when unpinning single instance */
+		return bpf_program__unpin_instance(prog, path, 0);
+	}
+
 	for (i = 0; i < prog->instances.nr; i++) {
 		char buf[PATH_MAX];
 		int len;
-- 
2.19.1.930.g4563a0d9d0-goog
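
The path choice described in the commit message above can be sketched outside libbpf as follows. This is a minimal, self-contained illustration and not the patch's code: instance_pin_path() is a hypothetical helper that only computes the pin location (it never touches bpffs), and it assumes the per-instance naming <path>/<index> used by the multi-instance loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libbpf): compute where instance
 * `instance` of a program with `nr` instances would be pinned under `path`.
 * Returns 0 on success or a negative errno value.
 */
int instance_pin_path(char *buf, size_t size, const char *path,
		      int nr, int instance)
{
	int len;

	assert(buf != NULL && path != NULL);

	if (instance < 0 || instance >= nr)
		return -EINVAL;

	if (nr == 1)
		/* single instance: the pin file is `path` itself */
		len = snprintf(buf, size, "%s", path);
	else
		/* several instances: one file per instance under `path` */
		len = snprintf(buf, size, "%s/%d", path, instance);

	if (len < 0)
		return -EINVAL;
	if ((size_t)len >= size)
		return -ENAMETOOLONG;
	return 0;
}
```

With nr == 1 the result is the path itself (for example /sys/fs/bpf/prog); with more instances it is /sys/fs/bpf/prog/0, /sys/fs/bpf/prog/1, and so on.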

* [PATCH v3 bpf-next 2/4] libbpf: cleanup after partial failure in bpf_object__pin
From: Stanislav Fomichev @ 2018-11-08  5:39 UTC (permalink / raw)
  To: netdev, linux-kselftest, ast, daniel, shuah, jakub.kicinski,
	quentin.monnet
  Cc: guro, jiong.wang, sdf, bhole_prashant_q7, john.fastabend, jbenc,
	treeze.taeung, yhs, osk, sandipan
In-Reply-To: <20181108053957.205681-1-sdf@google.com>

bpftool will use bpf_object__pin in the next commit to pin all programs
and maps from the file; in case of a partial failure, we need to get
back to the clean state (undo previous program/map pins).

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/lib/bpf/libbpf.c | 248 ++++++++++++++++++++++++++++++++++++-----
 tools/lib/bpf/libbpf.h |  11 ++
 2 files changed, 230 insertions(+), 29 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index d6e62e90e8d4..db84c85554e7 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1699,6 +1699,34 @@ int bpf_program__pin_instance(struct bpf_program *prog, const char *path,
 	return 0;
 }
 
+int bpf_program__unpin_instance(struct bpf_program *prog, const char *path,
+				int instance)
+{
+	int err;
+
+	err = check_path(path);
+	if (err)
+		return err;
+
+	if (prog == NULL) {
+		pr_warning("invalid program pointer\n");
+		return -EINVAL;
+	}
+
+	if (instance < 0 || instance >= prog->instances.nr) {
+		pr_warning("invalid prog instance %d of prog %s (max %d)\n",
+			   instance, prog->section_name, prog->instances.nr);
+		return -EINVAL;
+	}
+
+	err = unlink(path);
+	if (err != 0)
+		return -errno;
+	pr_debug("unpinned program '%s'\n", path);
+
+	return 0;
+}
+
 static int make_dir(const char *path)
 {
 	char *cp, errmsg[STRERR_BUFSIZE];
@@ -1737,6 +1765,64 @@ int bpf_program__pin(struct bpf_program *prog, const char *path)
 	if (err)
 		return err;
 
+	for (i = 0; i < prog->instances.nr; i++) {
+		char buf[PATH_MAX];
+		int len;
+
+		len = snprintf(buf, PATH_MAX, "%s/%d", path, i);
+		if (len < 0) {
+			err = -EINVAL;
+			goto err_unpin;
+		} else if (len >= PATH_MAX) {
+			err = -ENAMETOOLONG;
+			goto err_unpin;
+		}
+
+		err = bpf_program__pin_instance(prog, buf, i);
+		if (err)
+			goto err_unpin;
+	}
+
+	return 0;
+
+err_unpin:
+	for (i = i - 1; i >= 0; i--) {
+		char buf[PATH_MAX];
+		int len;
+
+		len = snprintf(buf, PATH_MAX, "%s/%d", path, i);
+		if (len < 0)
+			continue;
+		else if (len >= PATH_MAX)
+			continue;
+
+		bpf_program__unpin_instance(prog, buf, i);
+	}
+
+	rmdir(path);
+
+	return err;
+}
+
+int bpf_program__unpin(struct bpf_program *prog, const char *path)
+{
+	int i, err;
+
+	err = check_path(path);
+	if (err)
+		return err;
+
+	if (prog == NULL) {
+		pr_warning("invalid program pointer\n");
+		return -EINVAL;
+	}
+
+	if (prog->instances.nr <= 0) {
+		pr_warning("no instances of prog %s to unpin\n",
+			   prog->section_name);
+		return -EINVAL;
+	}
+
 	for (i = 0; i < prog->instances.nr; i++) {
 		char buf[PATH_MAX];
 		int len;
@@ -1747,11 +1833,15 @@ int bpf_program__pin(struct bpf_program *prog, const char *path)
 		else if (len >= PATH_MAX)
 			return -ENAMETOOLONG;
 
-		err = bpf_program__pin_instance(prog, buf, i);
+		err = bpf_program__unpin_instance(prog, buf, i);
 		if (err)
 			return err;
 	}
 
+	err = rmdir(path);
+	if (err)
+		return -errno;
+
 	return 0;
 }
 
@@ -1776,6 +1866,28 @@ int bpf_map__pin(struct bpf_map *map, const char *path)
 	}
 
 	pr_debug("pinned map '%s'\n", path);
+
+	return 0;
+}
+
+int bpf_map__unpin(struct bpf_map *map, const char *path)
+{
+	int err;
+
+	err = check_path(path);
+	if (err)
+		return err;
+
+	if (map == NULL) {
+		pr_warning("invalid map pointer\n");
+		return -EINVAL;
+	}
+
+	err = unlink(path);
+	if (err != 0)
+		return -errno;
+	pr_debug("unpinned map '%s'\n", path);
+
 	return 0;
 }
 
@@ -1803,14 +1915,17 @@ int bpf_object__pin(struct bpf_object *obj, const char *path)
 
 		len = snprintf(buf, PATH_MAX, "%s/%s", path,
 			       bpf_map__name(map));
-		if (len < 0)
-			return -EINVAL;
-		else if (len >= PATH_MAX)
-			return -ENAMETOOLONG;
+		if (len < 0) {
+			err = -EINVAL;
+			goto err_unpin_maps;
+		} else if (len >= PATH_MAX) {
+			err = -ENAMETOOLONG;
+			goto err_unpin_maps;
+		}
 
 		err = bpf_map__pin(map, buf);
 		if (err)
-			return err;
+			goto err_unpin_maps;
 	}
 
 	bpf_object__for_each_program(prog, obj) {
@@ -1819,17 +1934,56 @@ int bpf_object__pin(struct bpf_object *obj, const char *path)
 
 		len = snprintf(buf, PATH_MAX, "%s/%s", path,
 			       prog->section_name);
-		if (len < 0)
-			return -EINVAL;
-		else if (len >= PATH_MAX)
-			return -ENAMETOOLONG;
+		if (len < 0) {
+			err = -EINVAL;
+			goto err_unpin_programs;
+		} else if (len >= PATH_MAX) {
+			err = -ENAMETOOLONG;
+			goto err_unpin_programs;
+		}
 
 		err = bpf_program__pin(prog, buf);
 		if (err)
-			return err;
+			goto err_unpin_programs;
 	}
 
 	return 0;
+
+err_unpin_programs:
+	for (prog = bpf_program__prev(prog, obj);
+	     prog != NULL;
+	     prog = bpf_program__prev(prog, obj)) {
+		char buf[PATH_MAX];
+		int len;
+
+		len = snprintf(buf, PATH_MAX, "%s/%s", path,
+			       prog->section_name);
+		if (len < 0)
+			continue;
+		else if (len >= PATH_MAX)
+			continue;
+
+		bpf_program__unpin(prog, buf);
+	}
+
+err_unpin_maps:
+	for (map = bpf_map__prev(map, obj);
+	     map != NULL;
+	     map = bpf_map__prev(map, obj)) {
+		char buf[PATH_MAX];
+		int len;
+
+		len = snprintf(buf, PATH_MAX, "%s/%s", path,
+			       bpf_map__name(map));
+		if (len < 0)
+			continue;
+		else if (len >= PATH_MAX)
+			continue;
+
+		bpf_map__unpin(map, buf);
+	}
+
+	return err;
 }
 
 void bpf_object__close(struct bpf_object *obj)
@@ -1918,23 +2072,20 @@ void *bpf_object__priv(struct bpf_object *obj)
 }
 
 static struct bpf_program *
-__bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
+__bpf_program__iter(struct bpf_program *p, struct bpf_object *obj, int i)
 {
-	size_t idx;
+	ssize_t idx;
 
 	if (!obj->programs)
 		return NULL;
-	/* First handler */
-	if (prev == NULL)
-		return &obj->programs[0];
 
-	if (prev->obj != obj) {
+	if (p->obj != obj) {
 		pr_warning("error: program handler doesn't match object\n");
 		return NULL;
 	}
 
-	idx = (prev - obj->programs) + 1;
-	if (idx >= obj->nr_programs)
+	idx = (p - obj->programs) + i;
+	if (idx >= obj->nr_programs || idx < 0)
 		return NULL;
 	return &obj->programs[idx];
 }
@@ -1944,8 +2095,29 @@ bpf_program__next(struct bpf_program *prev, struct bpf_object *obj)
 {
 	struct bpf_program *prog = prev;
 
+	if (prev == NULL)
+		return obj->programs;
+
+	do {
+		prog = __bpf_program__iter(prog, obj, 1);
+	} while (prog && bpf_program__is_function_storage(prog, obj));
+
+	return prog;
+}
+
+struct bpf_program *
+bpf_program__prev(struct bpf_program *next, struct bpf_object *obj)
+{
+	struct bpf_program *prog = next;
+
+	if (next == NULL) {
+		if (!obj->nr_programs)
+			return NULL;
+		return obj->programs + obj->nr_programs - 1;
+	}
+
 	do {
-		prog = __bpf_program__next(prog, obj);
+		prog = __bpf_program__iter(prog, obj, -1);
 	} while (prog && bpf_program__is_function_storage(prog, obj));
 
 	return prog;
@@ -2272,10 +2444,10 @@ void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex)
 	map->map_ifindex = ifindex;
 }
 
-struct bpf_map *
-bpf_map__next(struct bpf_map *prev, struct bpf_object *obj)
+static struct bpf_map *
+__bpf_map__iter(struct bpf_map *m, struct bpf_object *obj, int i)
 {
-	size_t idx;
+	ssize_t idx;
 	struct bpf_map *s, *e;
 
 	if (!obj || !obj->maps)
@@ -2284,21 +2456,39 @@ bpf_map__next(struct bpf_map *prev, struct bpf_object *obj)
 	s = obj->maps;
 	e = obj->maps + obj->nr_maps;
 
-	if (prev == NULL)
-		return s;
-
-	if ((prev < s) || (prev >= e)) {
+	if ((m < s) || (m >= e)) {
 		pr_warning("error in %s: map handler doesn't belong to object\n",
 			   __func__);
 		return NULL;
 	}
 
-	idx = (prev - obj->maps) + 1;
-	if (idx >= obj->nr_maps)
+	idx = (m - obj->maps) + i;
+	if (idx >= obj->nr_maps || idx < 0)
 		return NULL;
 	return &obj->maps[idx];
 }
 
+struct bpf_map *
+bpf_map__next(struct bpf_map *prev, struct bpf_object *obj)
+{
+	if (prev == NULL)
+		return obj->maps;
+
+	return __bpf_map__iter(prev, obj, 1);
+}
+
+struct bpf_map *
+bpf_map__prev(struct bpf_map *next, struct bpf_object *obj)
+{
+	if (next == NULL) {
+		if (!obj->nr_maps)
+			return NULL;
+		return obj->maps + obj->nr_maps - 1;
+	}
+
+	return __bpf_map__iter(next, obj, -1);
+}
+
 struct bpf_map *
 bpf_object__find_map_by_name(struct bpf_object *obj, const char *name)
 {
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 1f3468dad8b2..785b27f761de 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -112,6 +112,9 @@ LIBBPF_API struct bpf_program *bpf_program__next(struct bpf_program *prog,
 	     (pos) != NULL;				\
 	     (pos) = bpf_program__next((pos), (obj)))
 
+LIBBPF_API struct bpf_program *bpf_program__prev(struct bpf_program *prog,
+						 struct bpf_object *obj);
+
 typedef void (*bpf_program_clear_priv_t)(struct bpf_program *,
 					 void *);
 
@@ -131,7 +134,11 @@ LIBBPF_API int bpf_program__fd(struct bpf_program *prog);
 LIBBPF_API int bpf_program__pin_instance(struct bpf_program *prog,
 					 const char *path,
 					 int instance);
+LIBBPF_API int bpf_program__unpin_instance(struct bpf_program *prog,
+					   const char *path,
+					   int instance);
 LIBBPF_API int bpf_program__pin(struct bpf_program *prog, const char *path);
+LIBBPF_API int bpf_program__unpin(struct bpf_program *prog, const char *path);
 LIBBPF_API void bpf_program__unload(struct bpf_program *prog);
 
 struct bpf_insn;
@@ -260,6 +267,9 @@ bpf_map__next(struct bpf_map *map, struct bpf_object *obj);
 	     (pos) != NULL;				\
 	     (pos) = bpf_map__next((pos), (obj)))
 
+LIBBPF_API struct bpf_map *
+bpf_map__prev(struct bpf_map *map, struct bpf_object *obj);
+
 LIBBPF_API int bpf_map__fd(struct bpf_map *map);
 LIBBPF_API const struct bpf_map_def *bpf_map__def(struct bpf_map *map);
 LIBBPF_API const char *bpf_map__name(struct bpf_map *map);
@@ -274,6 +284,7 @@ LIBBPF_API int bpf_map__reuse_fd(struct bpf_map *map, int fd);
 LIBBPF_API bool bpf_map__is_offload_neutral(struct bpf_map *map);
 LIBBPF_API void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
 LIBBPF_API int bpf_map__pin(struct bpf_map *map, const char *path);
+LIBBPF_API int bpf_map__unpin(struct bpf_map *map, const char *path);
 
 LIBBPF_API long libbpf_get_error(const void *ptr);
 
-- 
2.19.1.930.g4563a0d9d0-goog
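
The next/prev iteration added by this patch can be illustrated with a small userspace sketch. It assumes nothing from libbpf: struct item, struct table and the table_*() functions are hypothetical stand-ins for the program/map arrays and their iterators. A NULL argument means "start from the first entry" for next and "start from the last entry" for prev, and one shared helper steps by +1 or -1.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

struct item {
	int id;
};

struct table {
	struct item *items;
	size_t nr;
};

/* Step `dir` (+1 or -1) entries away from `cur`; NULL once out of range. */
struct item *table_iter(struct table *t, struct item *cur, int dir)
{
	ssize_t idx;

	assert(t != NULL && cur != NULL);

	/*
	 * Signed index: stepping back from the first entry yields -1,
	 * which the explicit `idx < 0` check rejects.
	 */
	idx = (cur - t->items) + dir;
	if (idx < 0 || (size_t)idx >= t->nr)
		return NULL;
	return &t->items[idx];
}

/* NULL means "start from the first entry". */
struct item *table_next(struct table *t, struct item *prev)
{
	if (prev == NULL)
		return t->nr ? t->items : NULL;
	return table_iter(t, prev, 1);
}

/* NULL means "start from the last entry". */
struct item *table_prev(struct table *t, struct item *next)
{
	if (next == NULL)
		return t->nr ? &t->items[t->nr - 1] : NULL;
	return table_iter(t, next, -1);
}
```

A reverse walk is then `for (it = table_prev(&t, NULL); it; it = table_prev(&t, it))`, which is the shape the error paths in bpf_object__pin rely on when undoing earlier pins.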

* [PATCH v3 bpf-next 1/4] selftests/bpf: rename flow dissector section to flow_dissector
From: Stanislav Fomichev @ 2018-11-08  5:39 UTC (permalink / raw)
  To: netdev, linux-kselftest, ast, daniel, shuah, jakub.kicinski,
	quentin.monnet
  Cc: guro, jiong.wang, sdf, bhole_prashant_q7, john.fastabend, jbenc,
	treeze.taeung, yhs, osk, sandipan
In-Reply-To: <20181108053957.205681-1-sdf@google.com>

Makes it compatible with the logic that derives program type
from section name in libbpf_prog_type_by_name.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/bpf_flow.c             | 2 +-
 tools/testing/selftests/bpf/test_flow_dissector.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_flow.c b/tools/testing/selftests/bpf/bpf_flow.c
index 107350a7821d..b9798f558ca7 100644
--- a/tools/testing/selftests/bpf/bpf_flow.c
+++ b/tools/testing/selftests/bpf/bpf_flow.c
@@ -116,7 +116,7 @@ static __always_inline int parse_eth_proto(struct __sk_buff *skb, __be16 proto)
 	return BPF_DROP;
 }
 
-SEC("dissect")
+SEC("flow_dissector")
 int _dissect(struct __sk_buff *skb)
 {
 	if (!skb->vlan_present)
diff --git a/tools/testing/selftests/bpf/test_flow_dissector.sh b/tools/testing/selftests/bpf/test_flow_dissector.sh
index c0fb073b5eab..d23d4da66b83 100755
--- a/tools/testing/selftests/bpf/test_flow_dissector.sh
+++ b/tools/testing/selftests/bpf/test_flow_dissector.sh
@@ -59,7 +59,7 @@ else
 fi
 
 # Attach BPF program
-./flow_dissector_load -p bpf_flow.o -s dissect
+./flow_dissector_load -p bpf_flow.o -s flow_dissector
 
 # Setup
 tc qdisc add dev lo ingress
-- 
2.19.1.930.g4563a0d9d0-goog

* [PATCH v3 bpf-next 0/4] bpftool: support loading flow dissector
From: Stanislav Fomichev @ 2018-11-08  5:39 UTC (permalink / raw)
  To: netdev, linux-kselftest, ast, daniel, shuah, jakub.kicinski,
	quentin.monnet
  Cc: guro, jiong.wang, sdf, bhole_prashant_q7, john.fastabend, jbenc,
	treeze.taeung, yhs, osk, sandipan

v3 changes:
* (maybe) better cleanup for partial failure in bpf_object__pin
* added special case in bpf_program__pin for programs with single
  instances

v2 changes:
* addressed comments/style issues from Jakub Kicinski & Quentin Monnet
* removed logic that populates jump table
* added cleanup for partial failure in bpf_object__pin

This patch series adds support for loading and attaching flow dissector
programs from the bpftool:

* first patch fixes flow dissector section name in the selftests (so
  libbpf auto-detection works)
* second patch adds proper cleanup to bpf_object__pin which is now being
  used to attach all flow dissector progs/maps
* third patch adds special case in bpf_program__pin for programs with
  single instances (we don't create <prog>/0 pin anymore, just <prog>)
* fourth patch adds actual support to the bpftool

See fourth patch for the description/details.

Stanislav Fomichev (4):
  selftests/bpf: rename flow dissector section to flow_dissector
  libbpf: cleanup after partial failure in bpf_object__pin
  libbpf: bpf_program__pin: add special case for instances.nr == 1
  bpftool: support loading flow dissector

 .../bpftool/Documentation/bpftool-prog.rst    |  36 ++-
 tools/bpf/bpftool/bash-completion/bpftool     |   6 +-
 tools/bpf/bpftool/common.c                    |  30 +-
 tools/bpf/bpftool/main.h                      |   1 +
 tools/bpf/bpftool/prog.c                      | 112 ++++++--
 tools/lib/bpf/libbpf.c                        | 258 ++++++++++++++++--
 tools/lib/bpf/libbpf.h                        |  11 +
 tools/testing/selftests/bpf/bpf_flow.c        |   2 +-
 .../selftests/bpf/test_flow_dissector.sh      |   2 +-
 9 files changed, 368 insertions(+), 90 deletions(-)

-- 
2.19.1.930.g4563a0d9d0-goog
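
The "cleanup after partial failure" idea from the second patch can be sketched as a goto-based unwind. This is a toy model under stated assumptions, not the libbpf code: pinned[], fail_at, pin_one(), unpin_one() and pin_all() are all hypothetical, and pinning is simulated with an array instead of bpffs.

```c
#include <assert.h>
#include <errno.h>

#define MAX_ITEMS 8

/* Toy state standing in for bpffs: pinned[i] is 1 while item i is pinned. */
int pinned[MAX_ITEMS];
/* Index at which the simulated pin operation fails; -1 for "never". */
int fail_at = -1;

int pin_one(int i)
{
	if (i == fail_at)
		return -EIO;
	pinned[i] = 1;
	return 0;
}

void unpin_one(int i)
{
	pinned[i] = 0;
}

/* Pin items 0..nr-1; on failure, unpin what was pinned, newest first. */
int pin_all(int nr)
{
	int i, err;

	assert(nr >= 0 && nr <= MAX_ITEMS);

	for (i = 0; i < nr; i++) {
		err = pin_one(i);
		if (err)
			goto err_unpin;
	}
	return 0;

err_unpin:
	/* item i itself failed, so start undoing at i - 1 */
	for (i = i - 1; i >= 0; i--)
		unpin_one(i);
	return err;
}
```

The caller sees either "everything pinned" or "nothing pinned", which is the property the series wants from bpf_object__pin before bpftool starts depending on it.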

* [PATCH][net-next] net/ipv6: compute anycast address hash only if dev is null
From: Li RongQing @ 2018-11-08  5:35 UTC (permalink / raw)
  To: netdev

Avoid computing the hash value if dev is not NULL, since the
hash value is not used in that case.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
 net/ipv6/anycast.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 94999058e110..a20e344486cb 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -433,15 +433,16 @@ static bool ipv6_chk_acast_dev(struct net_device *dev, const struct in6_addr *ad
 bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
 			 const struct in6_addr *addr)
 {
-	unsigned int hash = inet6_acaddr_hash(net, addr);
 	struct net_device *nh_dev;
 	struct ifacaddr6 *aca;
 	bool found = false;
+	unsigned int hash;
 
 	rcu_read_lock();
 	if (dev)
 		found = ipv6_chk_acast_dev(dev, addr);
-	else
+	else {
+		hash = inet6_acaddr_hash(net, addr);
 		hlist_for_each_entry_rcu(aca, &inet6_acaddr_lst[hash],
 					 aca_addr_lst) {
 			nh_dev = fib6_info_nh_dev(aca->aca_rt);
@@ -452,6 +453,7 @@ bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
 				break;
 			}
 		}
+	}
 	rcu_read_unlock();
 	return found;
 }
-- 
2.16.2

* Re: [RFC perf,bpf 1/5] perf, bpf: Introduce PERF_RECORD_BPF_EVENT
From: Peter Zijlstra @ 2018-11-08 15:00 UTC (permalink / raw)
  To: Song Liu
  Cc: Netdev, lkml, Kernel Team, ast@kernel.org, daniel@iogearbox.net,
	acme@kernel.org
In-Reply-To: <31067290-4B66-4AA1-8027-607397BC0264@fb.com>

On Wed, Nov 07, 2018 at 06:25:04PM +0000, Song Liu wrote:
> 
> 
> > On Nov 7, 2018, at 12:40 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > On Tue, Nov 06, 2018 at 12:52:42PM -0800, Song Liu wrote:
> >> For better performance analysis of BPF programs, this patch introduces
> >> PERF_RECORD_BPF_EVENT, a new perf_event_type that exposes BPF program
> >> load/unload information to user space.
> >> 
> >>        /*
> >>         * Record different types of bpf events:
> >>         *   enum perf_bpf_event_type {
> >>         *      PERF_BPF_EVENT_UNKNOWN          = 0,
> >>         *      PERF_BPF_EVENT_PROG_LOAD        = 1,
> >>         *      PERF_BPF_EVENT_PROG_UNLOAD      = 2,
> >>         *   };
> >>         *
> >>         * struct {
> >>         *      struct perf_event_header header;
> >>         *      u16 type;
> >>         *      u16 flags;
> >>         *      u32 id;  // prog_id or map_id
> >>         * };
> >>         */
> >>        PERF_RECORD_BPF_EVENT                   = 17,
> >> 
> >> PERF_RECORD_BPF_EVENT contains minimal information about the BPF program.
> >> Perf utility (or other user space tools) should listen to this event and
> >> fetch more details about the event via BPF syscalls
> >> (BPF_PROG_GET_FD_BY_ID, BPF_OBJ_GET_INFO_BY_FD, etc.).
> > 
> > Why !? You're failing to explain why it cannot provide the full
> > information there.
> 
> Aha, I missed this part. I will add the following to next version. Please
> let me know if anything is not clear.

> 
> This design was chosen for the following reasons. First, BPF
> programs could be loaded-and-jited and/or unloaded before/during/after
> a perf-record run. Once a BPF program is unloaded, it is impossible to
> recover details of the program. It is impossible to provide the
> information through a simple key (like the build ID). Second, BPF prog
> annotation is under rapid development. More information will be
> added to bpf_prog_info in the next few releases. Including all the
> information of a BPF program in the perf ring buffer requires frequent
> changes to the perf ABI, and thus makes it very difficult to manage
> compatibility of the perf utility.

So I don't agree with that reasoning. If you want symbol information
you'll just have to commit to some form of ABI. That bpf_prog_info is an
ABI too.

And relying on userspace to synchronously consume perf output to
directly call into the kernel again to get more info (through another
ABI) is a pretty terrible design.

So please try harder. NAK on this.

* Re: [PATCH v3 1/2] kretprobe: produce sane stack traces
From: Josh Poimboeuf @ 2018-11-08 14:44 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Steven Rostedt, Naveen N. Rao, Anil S Keshavamurthy,
	David S. Miller, Masami Hiramatsu, Jonathan Corbet,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Shuah Khan,
	Alexei Starovoitov, Daniel Borkmann, Brendan Gregg,
	Christian Brauner, Aleksa Sarai, netdev, linux-doc
In-Reply-To: <20181108080448.rggfn4zawi3por23@yavin>

On Thu, Nov 08, 2018 at 07:04:48PM +1100, Aleksa Sarai wrote:
> On 2018-11-08, Aleksa Sarai <cyphar@cyphar.com> wrote:
> > I will attach what I have at the moment to hopefully explain what the
> > issue I've found is (re-using the kretprobe architecture but with the
> > shadow-stack idea).
> 
> Here is the patch I have at the moment (it works, except for the
> question I have about how to handle the top-level pt_regs -- I've marked
> that code with XXX).
> 
> -- 
> Aleksa Sarai
> Senior Software Engineer (Containers)
> SUSE Linux GmbH
> <https://www.cyphar.com/>
> 
> --8<---------------------------------------------------------------------
> 
> Since the return address is modified by kretprobe, the various unwinders
> can produce invalid and confusing stack traces. ftrace mostly solved
> this problem by teaching each unwinder how to find the original return
> address for stack trace purposes. This same technique can be applied to
> kretprobes by simply adding a pointer to where the return address was
> replaced in the stack, and then looking up the relevant
> kretprobe_instance when a stack trace is requested.
> 
> [WIP: This is currently broken because the *first entry* will not be
>       overwritten since it looks like the stack pointer is different
>       when we are provided pt_regs. All other addresses are correctly
>       handled.]

When you see this problem, what does regs->ip point to?  If it's
pointing to generated code, then we don't _currently_ have a way of
dealing with that.  If it's pointing to a real function, we can fix that
with unwind hints.

-- 
Josh

* [PATCH v2 net-next] sock: Reset dst when changing sk_mark via setsockopt
From: David Barmann @ 2018-11-08  4:55 UTC (permalink / raw)
  To: netdev

When setting the SO_MARK socket option, the dst needs to be reset so
that a new route lookup is performed.

This fixes the case where an application wants to change routing by
setting a new sk_mark.  If this is done after some packets have already
been sent, the cached dst keeps being used and the new mark has no effect.

Signed-off-by: David Barmann <david.barmann@stackpath.com>
---
 net/core/sock.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 7b304e454a38..c74b10be86cb 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -952,10 +952,12 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 			clear_bit(SOCK_PASSSEC, &sock->flags);
 		break;
 	case SO_MARK:
-		if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
+		if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
 			ret = -EPERM;
-		else
+		} else {
 			sk->sk_mark = val;
+			sk_dst_reset(sk);
+		}
 		break;
 
 	case SO_RXQ_OVFL:
-- 
2.14.5

* Re: [PATCH net-next v2 3/5] virtio_ring: add packed ring support
From: Michael S. Tsirkin @ 2018-11-08 14:14 UTC (permalink / raw)
  To: Jason Wang
  Cc: Tiwei Bie, virtualization, linux-kernel, netdev, virtio-dev, wexu,
	jfreimann
In-Reply-To: <2d46a41e-bc00-276a-e19a-105c9dffc75a@redhat.com>

On Thu, Nov 08, 2018 at 04:18:25PM +0800, Jason Wang wrote:
> 
> On 2018/11/8 9:38 AM, Tiwei Bie wrote:
> > > > +
> > > > +	if (vq->vq.num_free < descs_used) {
> > > > +		pr_debug("Can't add buf len %i - avail = %i\n",
> > > > +			 descs_used, vq->vq.num_free);
> > > > +		/* FIXME: for historical reasons, we force a notify here if
> > > > +		 * there are outgoing parts to the buffer.  Presumably the
> > > > +		 * host should service the ring ASAP. */
> > > I don't think we have a reason to do this for packed ring.
> > > No historical baggage there, right?
> > Based on the original commit log, it seems that the notify here
> > is just an "optimization". But I don't quite understand what
> > "the heuristics which KVM uses" refers to. If it's safe to drop
> > this in packed ring, I'd like to do it.
> 
> 
> According to the commit log, it seems to be a workaround for the lguest
> networking backend. I agree we should drop it; we should not carry such a
> burden.
> 
> But note that, with this removed, the comparison between packed and split
> rings is kind of unfair.

I don't think this ever triggers to be frank. When would it?

> Considering the recent removal of lguest support,
> maybe we can drop this for split ring as well?
> 
> Thanks

If it's helpful, then for sure we can drop it for virtio 1.
Can you see any perf differences at all? With which device?

> 
> > 
> > commit 44653eae1407f79dff6f52fcf594ae84cb165ec4
> > Author: Rusty Russell <rusty@rustcorp.com.au>
> > Date:   Fri Jul 25 12:06:04 2008 -0500
> > 
> >      virtio: don't always force a notification when ring is full
> >      We force notification when the ring is full, even if the host has
> >      indicated it doesn't want to know.  This seemed like a good idea at
> >      the time: if we fill the transmit ring, we should tell the host
> >      immediately.
> >      Unfortunately this logic also applies to the receiving ring, which is
> >      refilled constantly.  We should introduce real notification thresholds
> >      to replace this logic.  Meanwhile, removing the logic altogether breaks
> >      the heuristics which KVM uses, so we use a hack: only notify if there are
> >      outgoing parts of the new buffer.
> >      Here are the number of exits with lguest's crappy network implementation:
> >      Before:
> >              network xmit 7859051 recv 236420
> >      After:
> >              network xmit 7858610 recv 118136
> >      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > 
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 72bf8bc09014..21d9a62767af 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -87,8 +87,11 @@ static int vring_add_buf(struct virtqueue *_vq,
> >   	if (vq->num_free < out + in) {
> >   		pr_debug("Can't add buf len %i - avail = %i\n",
> >   			 out + in, vq->num_free);
> > -		/* We notify *even if* VRING_USED_F_NO_NOTIFY is set here. */
> > -		vq->notify(&vq->vq);
> > +		/* FIXME: for historical reasons, we force a notify here if
> > +		 * there are outgoing parts to the buffer.  Presumably the
> > +		 * host should service the ring ASAP. */
> > +		if (out)
> > +			vq->notify(&vq->vq);
> >   		END_USE(vq);
> >   		return -ENOSPC;
> >   	}
> > 
> > 

* RE: [PATCH v2 2/3] dt-bindings: can: rcar_can: Add r8a774a1 support
From: Fabrizio Castro @ 2018-11-08 14:05 UTC (permalink / raw)
  To: Simon Horman, Marc Kleine-Budde, Rob Herring
  Cc: Wolfgang Grandegger, Mark Rutland, David S. Miller,
	Sergei Shtylyov, linux-can@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	devicetree@vger.kernel.org, Geert Uytterhoeven, Chris Paterson,
	Biju Das, linux-renesas-soc@vger.kernel.org
In-Reply-To: <20181108124632.fd52533ws7l7j2r2@verge.net.au>

Thank you Simon for getting back to me.

Marc, does this patch look ok to you?

Thanks,
Fab

> Subject: Re: [PATCH v2 2/3] dt-bindings: can: rcar_can: Add r8a774a1 support
>
> On Thu, Nov 08, 2018 at 11:25:23AM +0000, Fabrizio Castro wrote:
> > Dear All,
> >
> > Who is the best person to take this patch?
>
> I believe this one is for Marc.
>
> > Thanks,
> > Fab
> >
> > > From: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
> > > Sent: 10 September 2018 11:43
> > > Subject: [PATCH v2 2/3] dt-bindings: can: rcar_can: Add r8a774a1 support
> > >
> > > Document RZ/G2M (r8a774a1) SoC specific bindings.
> > >
> > > Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
> > > Signed-off-by: Chris Paterson <Chris.Paterson2@renesas.com>
> > > Reviewed-by: Biju Das <biju.das@bp.renesas.com>
> > > ---
> > > v1->v2:
> > > * dropped "renesas,rzg-gen2-can" and fixed "clocks" property description
> > >   as per Geert's comments.
> > >
> > > This patch applies on top of next-20180910.
> > >
> > >  Documentation/devicetree/bindings/net/can/rcar_can.txt | 18 +++++++++++++-----
> > >  1 file changed, 13 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/Documentation/devicetree/bindings/net/can/rcar_can.txt b/Documentation/devicetree/bindings/net/can/rcar_can.txt
> > > index 94a7f33..f3b160c 100644
> > > --- a/Documentation/devicetree/bindings/net/can/rcar_can.txt
> > > +++ b/Documentation/devicetree/bindings/net/can/rcar_can.txt
> > > @@ -4,6 +4,7 @@ Renesas R-Car CAN controller Device Tree Bindings
> > >  Required properties:
> > >  - compatible: "renesas,can-r8a7743" if CAN controller is a part of R8A7743 SoC.
> > >        "renesas,can-r8a7745" if CAN controller is a part of R8A7745 SoC.
> > > +      "renesas,can-r8a774a1" if CAN controller is a part of R8A774A1 SoC.
> > >        "renesas,can-r8a7778" if CAN controller is a part of R8A7778 SoC.
> > >        "renesas,can-r8a7779" if CAN controller is a part of R8A7779 SoC.
> > >        "renesas,can-r8a7790" if CAN controller is a part of R8A7790 SoC.
> > > @@ -16,15 +17,21 @@ Required properties:
> > >        "renesas,rcar-gen1-can" for a generic R-Car Gen1 compatible device.
> > >        "renesas,rcar-gen2-can" for a generic R-Car Gen2 or RZ/G1
> > >        compatible device.
> > > -      "renesas,rcar-gen3-can" for a generic R-Car Gen3 compatible device.
> > > +      "renesas,rcar-gen3-can" for a generic R-Car Gen3 or RZ/G2
> > > +      compatible device.
> > >        When compatible with the generic version, nodes must list the
> > >        SoC-specific version corresponding to the platform first
> > >        followed by the generic version.
> > >
> > >  - reg: physical base address and size of the R-Car CAN register map.
> > >  - interrupts: interrupt specifier for the sole interrupt.
> > > -- clocks: phandles and clock specifiers for 3 CAN clock inputs.
> > > -- clock-names: 3 clock input name strings: "clkp1", "clkp2", "can_clk".
> > > +- clocks: phandles and clock specifiers for 2 CAN clock inputs for RZ/G2
> > > +  devices.
> > > +  phandles and clock specifiers for 3 CAN clock inputs for every other
> > > +  SoC.
> > > +- clock-names: 2 clock input name strings for RZ/G2: "clkp1", "can_clk".
> > > +       3 clock input name strings for every other SoC: "clkp1", "clkp2",
> > > +       "can_clk".
> > >  - pinctrl-0: pin control group to be used for this controller.
> > >  - pinctrl-names: must be "default".
> > >
> > > @@ -41,8 +48,9 @@ using the below properties:
> > >  Optional properties:
> > >  - renesas,can-clock-select: R-Car CAN Clock Source Select. Valid values are:
> > >      <0x0> (default) : Peripheral clock (clkp1)
> > > -    <0x1> : Peripheral clock (clkp2)
> > > -    <0x3> : Externally input clock
> > > +    <0x1> : Peripheral clock (clkp2) (not supported by
> > > +    RZ/G2 devices)
> > > +    <0x3> : External input clock
> > >
> > >  Example
> > >  -------
> > > --
> > > 2.7.4
> >
> >
> >
> >
> > Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne End, Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under Registered No. 04586709.
> >



* Re: (2) (2) [Kernel][NET] Bug report on packet defragmenting
From: Eric Dumazet @ 2018-11-08  4:26 UTC (permalink / raw)
  To: soukjin.bae, netdev@vger.kernel.org
In-Reply-To: <20181108041001epcms1p6c83831e3ef0d66b9591c2aca25d5841b@epcms1p6>



On 11/07/2018 08:10 PM, 배석진 wrote:
>> --------- Original Message ---------
>> Sender : Eric Dumazet <eric.dumazet@gmail.com>
>> Date   : 2018-11-08 12:57 (GMT+9)
>> Title  : Re: (2) [Kernel][NET] Bug report on packet defragmenting
>>  
>> On 11/07/2018 07:24 PM, Eric Dumazet wrote:
>>
>>>  Sure, it is better if RPS is smarter, but if there is a bug in IPv6 defrag unit
>>>  we must investigate and root-cause it.
>>  
>> BTW, IPv4 defrag seems to have the same issue.
>  
> 
> yes, it could be.
> the key point isn't limited to ipv6.
> 
> maybe because of faster air networks and modems,
> it seems to occur more often, and that is how we noticed it.
> 
> anyway,
> we'll apply our patch to resolve this problem.

Yeah, and I will fix the defrag units.

We cannot rely on other layers doing proper no-reorder logic for us.

The problem here is that multiple CPUs attempt concurrent rhashtable_insert_fast()
calls and do not properly recover when -EEXIST is returned.

This is silly, of course :/
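
For illustration only, the recovery described above can be sketched in plain
userspace C. This is not the kernel rhashtable API and not the eventual fix:
table_insert(), table_lookup(), queue_get_or_insert() and struct frag_queue
are hypothetical stand-ins, and the toy table is not thread-safe. It only
shows the shape of the pattern: when the insert reports -EEXIST, the loser
of the race looks up and reuses the entry that the winner inserted instead
of giving up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NBUCKETS 16

/* Hypothetical stand-in for a per-datagram reassembly queue. */
struct frag_queue {
	unsigned int key;
	int nfrags;
	struct frag_queue *next;
};

static struct frag_queue *buckets[NBUCKETS];

/* Return the queue stored under key, or NULL if there is none. */
static struct frag_queue *table_lookup(unsigned int key)
{
	struct frag_queue *q;

	for (q = buckets[key % NBUCKETS]; q; q = q->next)
		if (q->key == key)
			return q;
	return NULL;
}

/* Return 0 on success, -EEXIST if the key is already present. */
static int table_insert(struct frag_queue *q)
{
	if (table_lookup(q->key))
		return -EEXIST;
	q->next = buckets[q->key % NBUCKETS];
	buckets[q->key % NBUCKETS] = q;
	return 0;
}

/*
 * Try to insert the candidate. If another context inserted the same key
 * first, recover by returning the existing entry rather than failing.
 */
static struct frag_queue *queue_get_or_insert(struct frag_queue *candidate)
{
	struct frag_queue *q;

	if (table_insert(candidate) == 0)
		return candidate;

	/* -EEXIST: someone else won the race, so use their queue. */
	q = table_lookup(candidate->key);
	assert(q != NULL);
	return q;
}
```

If two callers each prepare a candidate for the same key, the second call
returns the first caller's queue, so both fragments end up in one
reassembly queue and the second candidate can simply be freed.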

^ permalink raw reply

* RE:(2) (2) [Kernel][NET] Bug report on packet defragmenting
From: 배석진 @ 2018-11-08  4:10 UTC (permalink / raw)
  To: Eric Dumazet, netdev@vger.kernel.org
In-Reply-To: <1771721f-40fd-0042-b603-5ed763c54378@gmail.com>

> --------- Original Message ---------
> Sender : Eric Dumazet <eric.dumazet@gmail.com>
> Date   : 2018-11-08 12:57 (GMT+9)
> Title  : Re: (2) [Kernel][NET] Bug report on packet defragmenting
>  
> On 11/07/2018 07:24 PM, Eric Dumazet wrote:
> 
> > Sure, it is better if RPS is smarter, but if there is a bug in IPv6 defrag unit
> > we must investigate and root-cause it.
>  
> BTW, IPv4 defrag seems to have the same issue.
 

yes, it could be.
the key point isn't limited to ipv6.

maybe because of faster air networks and modems,
it seems to occur more often, and that is how we noticed it.

anyway,
we'll apply our patch to resolve this problem.

Best regards, :)



^ permalink raw reply

* [PATCH net-next] net: qca_spi: Add available buffer space verification
From: Stefan Wahren @ 2018-11-08 13:38 UTC (permalink / raw)
  To: David S. Miller; +Cc: Michael Heimpold, netdev, linux-kernel, Stefan Wahren

Interference on the SPI line could distort the reported available
buffer space. So we should at least check that the response doesn't
exceed the maximum available buffer space. In the error case,
increment a new error counter and retry later. This avoids buffer
errors in the QCA7000, which would otherwise result in an
unnecessary chip reset including packet loss.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
---
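
Note (illustration only, not part of the patch): the check described in the
commit message can be modeled in standalone C roughly as follows. The names
HW_BUF_LEN, struct spi_stats and check_available() are hypothetical, and the
limit value is an assumption rather than the driver's QCASPI_HW_BUF_LEN. A
reading above the hardware maximum cannot be genuine, so it is counted and
the caller is told to retry later.

```c
#include <stdint.h>

/* Assumed buffer limit for this sketch only; not taken from the driver. */
#define HW_BUF_LEN 0x0C5Bu

/* Hypothetical counter block, mirroring the idea of buf_avail_err. */
struct spi_stats {
	uint64_t buf_avail_err;
};

/*
 * Validate a buffer-space value read over SPI.
 * Returns 0 if the value is plausible, -1 if it exceeds the hardware
 * maximum, in which case the error counter is incremented.
 */
static int check_available(uint32_t available, struct spi_stats *stats)
{
	if (available > HW_BUF_LEN) {
		/* Implausible reading, most likely line noise: retry later. */
		stats->buf_avail_err++;
		return -1;
	}
	return 0;
}
```

A value equal to the limit is still accepted; only readings strictly above
it are treated as corrupted.
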
 drivers/net/ethernet/qualcomm/qca_debug.c |  1 +
 drivers/net/ethernet/qualcomm/qca_spi.c   | 16 +++++++++++++++-
 drivers/net/ethernet/qualcomm/qca_spi.h   |  1 +
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qualcomm/qca_debug.c b/drivers/net/ethernet/qualcomm/qca_debug.c
index a9f1bc0..1450f38 100644
--- a/drivers/net/ethernet/qualcomm/qca_debug.c
+++ b/drivers/net/ethernet/qualcomm/qca_debug.c
@@ -61,6 +61,7 @@ static const char qcaspi_gstrings_stats[][ETH_GSTRING_LEN] = {
 	"Transmit ring full",
 	"SPI errors",
 	"Write verify errors",
+	"Buffer available errors",
 };
 
 #ifdef CONFIG_DEBUG_FS
diff --git a/drivers/net/ethernet/qualcomm/qca_spi.c b/drivers/net/ethernet/qualcomm/qca_spi.c
index d531050..97f9295 100644
--- a/drivers/net/ethernet/qualcomm/qca_spi.c
+++ b/drivers/net/ethernet/qualcomm/qca_spi.c
@@ -289,6 +289,14 @@ qcaspi_transmit(struct qcaspi *qca)
 
 	qcaspi_read_register(qca, SPI_REG_WRBUF_SPC_AVA, &available);
 
+	if (available > QCASPI_HW_BUF_LEN) {
+		/* This could only happen by interferences on the SPI line.
+		 * So retry later ...
+		 */
+		qca->stats.buf_avail_err++;
+		return -1;
+	}
+
 	while (qca->txr.skb[qca->txr.head]) {
 		pkt_len = qca->txr.skb[qca->txr.head]->len + QCASPI_HW_PKT_LEN;
 
@@ -355,7 +363,13 @@ qcaspi_receive(struct qcaspi *qca)
 	netdev_dbg(net_dev, "qcaspi_receive: SPI_REG_RDBUF_BYTE_AVA: Value: %08x\n",
 		   available);
 
-	if (available == 0) {
+	if (available > QCASPI_HW_BUF_LEN) {
+		/* This could only happen by interferences on the SPI line.
+		 * So retry later ...
+		 */
+		qca->stats.buf_avail_err++;
+		return -1;
+	} else if (available == 0) {
 		netdev_dbg(net_dev, "qcaspi_receive called without any data being available!\n");
 		return -1;
 	}
diff --git a/drivers/net/ethernet/qualcomm/qca_spi.h b/drivers/net/ethernet/qualcomm/qca_spi.h
index 2d2c497..eb9af45 100644
--- a/drivers/net/ethernet/qualcomm/qca_spi.h
+++ b/drivers/net/ethernet/qualcomm/qca_spi.h
@@ -74,6 +74,7 @@ struct qcaspi_stats {
 	u64 ring_full;
 	u64 spi_err;
 	u64 write_verify_failed;
+	u64 buf_avail_err;
 };
 
 struct qcaspi {
-- 
2.7.4

^ permalink raw reply related
