* Re: [PATCH net 0/5] rxrpc: Fixes
From: David Miller @ 2018-05-11 19:56 UTC (permalink / raw)
To: dhowells; +Cc: netdev, linux-afs, linux-kernel
In-Reply-To: <152599231687.26376.15020977491573449830.stgit@warthog.procyon.org.uk>
From: David Howells <dhowells@redhat.com>
Date: Thu, 10 May 2018 23:45:17 +0100
> Here are three fixes for AF_RXRPC and two tracepoints that were useful for
> finding them:
...
> The patches are tagged here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
> rxrpc-fixes-20180510
Pulled, thanks David.
^ permalink raw reply
* [PATCH net-next 0/4] bonding: performance and reliability
From: Debabrata Banerjee @ 2018-05-11 19:25 UTC (permalink / raw)
To: David S . Miller, netdev
Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, dbanerje
Series of fixes to how rlb updates are handled, code cleanup, allowing
higher performance tx hashing in balance-alb mode, and reliability of
link up/down monitoring.
Debabrata Banerjee (4):
bonding: don't queue up extraneous rlb updates
bonding: use common mac addr checks
bonding: allow use of tx hashing in balance-alb
bonding: allow carrier and link status to determine link state
Documentation/networking/bonding.txt | 4 +--
drivers/net/bonding/bond_alb.c | 50 +++++++++++++++++-----------
drivers/net/bonding/bond_main.c | 37 ++++++++++++--------
drivers/net/bonding/bond_options.c | 9 ++---
include/net/bonding.h | 10 +++++-
5 files changed, 70 insertions(+), 40 deletions(-)
--
2.17.0
^ permalink raw reply
* Re: [PATCH v2 net 1/1] net sched actions: fix invalid pointer dereferencing if skbedit flags missing
From: David Miller @ 2018-05-11 19:53 UTC (permalink / raw)
To: mrv; +Cc: netdev, kernel, jhs, xiyou.wangcong, jiri, alexander.duyck
In-Reply-To: <1526050509-30487-1-git-send-email-mrv@mojatatu.com>
From: Roman Mashak <mrv@mojatatu.com>
Date: Fri, 11 May 2018 10:55:09 -0400
> When application fails to pass flags in netlink TLV for a new skbedit action,
> the kernel results in the following oops:
...
> The caller calls action's ->init() and passes pointer to "struct tc_action *a",
> which later may be initialized to point at the existing action, otherwise
> "struct tc_action *a" is still invalid, and therefore dereferencing it is an
> error as happens in tcf_idr_release, where refcnt is decremented.
>
> So in case of missing flags tcf_idr_release must be called only for
> existing actions.
>
> v2:
> - prepare patch for net tree
>
> Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Applied and queued up for -stable.
^ permalink raw reply
* Re: [PATCH] isdn: eicon: fix a missing-check bug
From: David Miller @ 2018-05-11 19:50 UTC (permalink / raw)
To: wang6495; +Cc: kjlu, mac, isdn, netdev, linux-kernel
In-Reply-To: <1525548766-13017-1-git-send-email-wang6495@umn.edu>
From: Wenwen Wang <wang6495@umn.edu>
Date: Sat, 5 May 2018 14:32:46 -0500
> To avoid such issues, this patch adds a check after the second copy in the
> function diva_xdi_write(). If the adapter number is not equal to the one
> obtained in the first copy, (-4) will be returned to divas_write(), which
> will then return an error code -EINVAL.
A better fix is to copy the msg header once into an on-stack buffer that
diva_write() supplies to diva_xdi_open_adapter(), and which is then passed
on to diva_xdi_write() with an adjusted src pointer and length.
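The single-copy approach suggested above can be sketched in user-space C. This is only an illustration of the pattern, not the driver code: struct msg_hdr, copy_in(), write_payload() and write_msg() are hypothetical names, and copy_in() merely stands in for copy_from_user().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message header; the real driver's layout may differ. */
struct msg_hdr {
	uint32_t adapter_nr;
};

/* Stand-in for copy_from_user(): returns 0 on success. */
static int copy_in(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

/* Consumes the payload that follows the header. It is handed the
 * already-validated adapter number and never re-reads the header. */
static long write_payload(uint32_t adapter_nr, const uint8_t *src, size_t len)
{
	(void)src;
	return adapter_nr == 0 ? -1 : (long)len;
}

/* Fetch the header exactly once, then pass on an adjusted pointer
 * and length so later stages cannot observe a changed header. */
static long write_msg(const uint8_t *ubuf, size_t len)
{
	struct msg_hdr hdr;	/* on-stack copy, fetched once */

	if (len < sizeof(hdr))
		return -1;
	if (copy_in(&hdr, ubuf, sizeof(hdr)))
		return -1;

	return write_payload(hdr.adapter_nr, ubuf + sizeof(hdr),
			     len - sizeof(hdr));
}
```

Because the header is read from the caller's buffer only once, there is no second copy whose contents could disagree with the first, so no consistency check is needed afterwards.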
^ permalink raw reply
* [PATCH net-next 4/4] bonding: allow carrier and link status to determine link state
From: Debabrata Banerjee @ 2018-05-11 19:25 UTC (permalink / raw)
To: David S . Miller, netdev
Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, dbanerje
In-Reply-To: <20180511192548.8119-1-dbanerje@akamai.com>
In a mixed environment it may be difficult to tell whether your hardware
supports carrier; if it does not, it may always report true. With a new
use_carrier option of 2, we can check both carrier and link status
sequentially, instead of one or the other.
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
---
Documentation/networking/bonding.txt | 4 ++--
drivers/net/bonding/bond_main.c | 12 ++++++++----
drivers/net/bonding/bond_options.c | 7 ++++---
3 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 9ba04c0bab8d..f063730e7e73 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -828,8 +828,8 @@ use_carrier
MII / ETHTOOL ioctl method to determine the link state.
A value of 1 enables the use of netif_carrier_ok(), a value of
- 0 will use the deprecated MII / ETHTOOL ioctls. The default
- value is 1.
+ 0 will use the deprecated MII / ETHTOOL ioctls. A value of 2
+ will check both. The default value is 1.
xmit_hash_policy
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f7f8a49cb32b..7e9652c4b35c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -132,7 +132,7 @@ MODULE_PARM_DESC(downdelay, "Delay before considering link down, "
"in milliseconds");
module_param(use_carrier, int, 0);
MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; "
- "0 for off, 1 for on (default)");
+ "0 for off, 1 for on (default), 2 for carrier then legacy checks");
module_param(mode, charp, 0);
MODULE_PARM_DESC(mode, "Mode of operation; 0 for balance-rr, "
"1 for active-backup, 2 for balance-xor, "
@@ -434,12 +434,16 @@ static int bond_check_dev_link(struct bonding *bond,
int (*ioctl)(struct net_device *, struct ifreq *, int);
struct ifreq ifr;
struct mii_ioctl_data *mii;
+ bool carrier = true;
if (!reporting && !netif_running(slave_dev))
return 0;
if (bond->params.use_carrier)
- return netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
+ carrier = netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
+
+ if (!carrier)
+ return carrier;
/* Try to get link status using Ethtool first. */
if (slave_dev->ethtool_ops->get_link)
@@ -4399,8 +4403,8 @@ static int bond_check_params(struct bond_params *params)
downdelay = 0;
}
- if ((use_carrier != 0) && (use_carrier != 1)) {
- pr_warn("Warning: use_carrier module parameter (%d), not of valid value (0/1), so it was set to 1\n",
+ if (use_carrier < 0 || use_carrier > 2) {
+ pr_warn("Warning: use_carrier module parameter (%d), not of valid value (0-2), so it was set to 1\n",
use_carrier);
use_carrier = 1;
}
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 8a945c9341d6..dba6cef05134 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -164,9 +164,10 @@ static const struct bond_opt_value bond_primary_reselect_tbl[] = {
};
static const struct bond_opt_value bond_use_carrier_tbl[] = {
- { "off", 0, 0},
- { "on", 1, BOND_VALFLAG_DEFAULT},
- { NULL, -1, 0}
+ { "off", 0, 0},
+ { "on", 1, BOND_VALFLAG_DEFAULT},
+ { "both", 2, 0},
+ { NULL, -1, 0}
};
static const struct bond_opt_value bond_all_slaves_active_tbl[] = {
--
2.17.0
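One reading of the three use_carrier values, as described in the documentation hunk above, can be sketched in user-space C. struct link_probe and check_link() are hypothetical; the two booleans stand in for netif_carrier_ok() and the ethtool/MII query. Note that in the bond_check_dev_link() hunk as posted, any non-zero use_carrier appears to continue on to the legacy checks when carrier is up, whereas this sketch keeps modes 1 and 2 distinct.

```c
#include <stdbool.h>

#define LINK_UP 0x0004	/* same value as BMSR_LSTATUS in <linux/mii.h> */

/* Hypothetical probe results standing in for netif_carrier_ok()
 * and the ethtool/MII link query. */
struct link_probe {
	bool carrier_ok;
	bool legacy_ok;
};

/* Returns LINK_UP or 0, following the documented meaning of each mode. */
static int check_link(int use_carrier, const struct link_probe *p)
{
	switch (use_carrier) {
	case 0:		/* legacy MII/ethtool checks only */
		return p->legacy_ok ? LINK_UP : 0;
	case 1:		/* carrier only */
		return p->carrier_ok ? LINK_UP : 0;
	case 2:		/* carrier first, then legacy checks */
		if (!p->carrier_ok)
			return 0;
		return p->legacy_ok ? LINK_UP : 0;
	default:	/* out of range: treat as link down in this sketch */
		return 0;
	}
}
```

With a device that always reports carrier but fails the legacy check, mode 1 reports the link as up while modes 0 and 2 report it as down.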
^ permalink raw reply related
* [net 2/4] ixgbe: return error on unsupported SFP module when resetting
From: Jeff Kirsher @ 2018-05-11 19:47 UTC (permalink / raw)
To: davem; +Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20180511194722.28325-1-jeffrey.t.kirsher@intel.com>
From: Emil Tantilov <emil.s.tantilov@intel.com>
Add check for unsupported module and return the error code.
This fixes a Coverity hit due to unused return status from setup_sfp.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
index 3123267dfba9..9592f3e3e42e 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
@@ -3427,6 +3427,9 @@ static s32 ixgbe_reset_hw_X550em(struct ixgbe_hw *hw)
hw->phy.sfp_setup_needed = false;
}
+ if (status == IXGBE_ERR_SFP_NOT_SUPPORTED)
+ return status;
+
/* Reset PHY */
if (!hw->phy.reset_disable && hw->phy.ops.reset)
hw->phy.ops.reset(hw);
--
2.17.0
^ permalink raw reply related
* [net 3/4] ixgbevf: fix ixgbevf_xmit_frame()'s return type
From: Jeff Kirsher @ 2018-05-11 19:47 UTC (permalink / raw)
To: davem; +Cc: Luc Van Oostenryck, netdev, nhorman, sassmann, jogreene,
Jeff Kirsher
In-Reply-To: <20180511194722.28325-1-jeffrey.t.kirsher@intel.com>
From: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
The method ndo_start_xmit() is defined as returning a 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.
Fix this by returning 'netdev_tx_t' in this driver too.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index e3d04f226d57..850f8af95e49 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -4137,7 +4137,7 @@ static int ixgbevf_xmit_frame_ring(struct sk_buff *skb,
return NETDEV_TX_OK;
}
-static int ixgbevf_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
+static netdev_tx_t ixgbevf_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
{
struct ixgbevf_adapter *adapter = netdev_priv(netdev);
struct ixgbevf_ring *tx_ring;
--
2.17.0
^ permalink raw reply related
* [net 4/4] ixgbe: fix memory leak on ipsec allocation
From: Jeff Kirsher @ 2018-05-11 19:47 UTC (permalink / raw)
To: davem; +Cc: Colin Ian King, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20180511194722.28325-1-jeffrey.t.kirsher@intel.com>
From: Colin Ian King <colin.king@canonical.com>
The error cleanup path kfree's adapter->ipsec when it should instead
kfree ipsec. Fix this. Also, the err1 error exit path does not need to
kfree ipsec, because that path is taken only when the allocation of
ipsec itself failed.
Detected by CoverityScan, CID#146424 ("Resource Leak")
Fixes: 63a67fe229ea ("ixgbe: add ipsec offload add and remove SA")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 68af127987bc..cead23e3db0c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -943,8 +943,8 @@ void ixgbe_init_ipsec_offload(struct ixgbe_adapter *adapter)
kfree(ipsec->ip_tbl);
kfree(ipsec->rx_tbl);
kfree(ipsec->tx_tbl);
+ kfree(ipsec);
err1:
- kfree(adapter->ipsec);
netdev_err(adapter->netdev, "Unable to allocate memory for SA tables");
}
--
2.17.0
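The cleanup ordering fixed above follows the usual goto-unwind pattern, sketched here in user-space C with calloc()/free() standing in for the kernel allocators. struct tables, struct adapter, init_tables() and the fail_at parameter (used only to force a failure for demonstration) are all hypothetical.

```c
#include <stdlib.h>

struct tables {
	int *rx_tbl;
	int *tx_tbl;
};

struct adapter {
	struct tables *ipsec;
};

/* fail_at simulates an allocation failure at step 1, 2 or 3 (0 = none). */
static int init_tables(struct adapter *ad, int fail_at)
{
	struct tables *t;

	t = (fail_at == 1) ? NULL : calloc(1, sizeof(*t));
	if (!t)
		goto err1;	/* nothing allocated yet, nothing to free */

	t->rx_tbl = (fail_at == 2) ? NULL : calloc(16, sizeof(int));
	if (!t->rx_tbl)
		goto err2;

	t->tx_tbl = (fail_at == 3) ? NULL : calloc(16, sizeof(int));
	if (!t->tx_tbl)
		goto err2;

	ad->ipsec = t;		/* published only on full success */
	return 0;

err2:
	free(t->rx_tbl);	/* free(NULL) is a no-op */
	free(t->tx_tbl);
	free(t);		/* free the local pointer, not ad->ipsec */
err1:
	return -1;
}
```

Since the local pointer is assigned to the adapter only after every allocation succeeds, the error path has to free the local pointer; freeing ad->ipsec at that point would release nothing and leak the new structure.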
^ permalink raw reply related
* [net 0/4][pull request] Intel Wired LAN Driver Updates 2018-05-11
From: Jeff Kirsher @ 2018-05-11 19:47 UTC (permalink / raw)
To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene
This series contains fixes to the ice, ixgbe and ixgbevf drivers.
Jeff Shaw provides a fix to ensure rq_last_status gets set, whether or
not the hardware responds with an error in the ice driver.
Emil adds a check for unsupported module during the reset routine for
ixgbe.
Luc Van Oostenryck fixes ixgbevf_xmit_frame(), which was declared as
returning int instead of the correct return type (netdev_tx_t).
Colin Ian King fixes a potential resource leak in ixgbe, where we were
not freeing ipsec in our cleanup path.
The following are changes since commit 5ae4bbf76928b401fe467e837073d939300adbf0:
Merge tag 'mlx5-fixes-2018-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
and are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue 10GbE
Colin Ian King (1):
ixgbe: fix memory leak on ipsec allocation
Emil Tantilov (1):
ixgbe: return error on unsupported SFP module when resetting
Jeff Shaw (1):
ice: Set rq_last_status when cleaning rq
Luc Van Oostenryck (1):
ixgbevf: fix ixgbevf_xmit_frame()'s return type
drivers/net/ethernet/intel/ice/ice_controlq.c | 2 +-
drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 2 +-
drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 3 +++
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 2 +-
4 files changed, 6 insertions(+), 3 deletions(-)
--
2.17.0
^ permalink raw reply
* [net 1/4] ice: Set rq_last_status when cleaning rq
From: Jeff Kirsher @ 2018-05-11 19:47 UTC (permalink / raw)
To: davem
Cc: Jeff Shaw, netdev, nhorman, sassmann, jogreene,
Anirudh Venkataramanan, Jeff Kirsher
In-Reply-To: <20180511194722.28325-1-jeffrey.t.kirsher@intel.com>
From: Jeff Shaw <jeffrey.b.shaw@intel.com>
Prior to this commit, rq_last_status was only set when hardware
responded with an error. This leaves rq_last_status stale when
hardware later responds without error. This commit resolves the
issue by unconditionally setting rq_last_status to the value
returned in the descriptor.
Fixes: 940b61af02f4 ("ice: Initialize PF and setup miscellaneous interrupt")
Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/ice/ice_controlq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.c b/drivers/net/ethernet/intel/ice/ice_controlq.c
index 5909a4407e38..7c511f144ed6 100644
--- a/drivers/net/ethernet/intel/ice/ice_controlq.c
+++ b/drivers/net/ethernet/intel/ice/ice_controlq.c
@@ -1014,10 +1014,10 @@ ice_clean_rq_elem(struct ice_hw *hw, struct ice_ctl_q_info *cq,
desc = ICE_CTL_Q_DESC(cq->rq, ntc);
desc_idx = ntc;
+ cq->rq_last_status = (enum ice_aq_err)le16_to_cpu(desc->retval);
flags = le16_to_cpu(desc->flags);
if (flags & ICE_AQ_FLAG_ERR) {
ret_code = ICE_ERR_AQ_ERROR;
- cq->rq_last_status = (enum ice_aq_err)le16_to_cpu(desc->retval);
ice_debug(hw, ICE_DBG_AQ_MSG,
"Control Receive Queue Event received with error 0x%x\n",
cq->rq_last_status);
--
2.17.0
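The effect of moving the rq_last_status assignment out of the error branch can be illustrated with a small user-space sketch. The struct layouts, FLAG_ERR value and clean_elem() are hypothetical stand-ins, not the ice driver's definitions.

```c
#include <stdint.h>

#define FLAG_ERR 0x4	/* hypothetical error bit, for illustration only */

struct desc {
	uint16_t flags;
	uint16_t retval;
};

struct ctl_q {
	uint16_t rq_last_status;
};

/* Record the descriptor's return value on every event, then decide
 * whether the event itself signals an error. */
static int clean_elem(struct ctl_q *cq, const struct desc *d)
{
	cq->rq_last_status = d->retval;	/* unconditional update */

	if (d->flags & FLAG_ERR)
		return -1;
	return 0;
}
```

If the assignment lived inside the error branch, a successful event following a failed one would leave the earlier error code in rq_last_status.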
^ permalink raw reply related
* [PATCH net-next 1/4] bonding: don't queue up extraneous rlb updates
From: Debabrata Banerjee @ 2018-05-11 19:25 UTC (permalink / raw)
To: David S . Miller, netdev
Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, dbanerje
In-Reply-To: <20180511192548.8119-1-dbanerje@akamai.com>
ARP updates for incomplete entries (all-zero mac_dst) can't be sent
anyway, so don't mark those entries as needing an update.
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
---
drivers/net/bonding/bond_alb.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 5eb0df2e5464..c2f6c58e4e6a 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -421,7 +421,8 @@ static void rlb_clear_slave(struct bonding *bond, struct slave *slave)
if (assigned_slave) {
rx_hash_table[index].slave = assigned_slave;
if (!ether_addr_equal_64bits(rx_hash_table[index].mac_dst,
- mac_bcast)) {
+ mac_bcast) &&
+ !is_zero_ether_addr(rx_hash_table[index].mac_dst)) {
bond_info->rx_hashtbl[index].ntt = 1;
bond_info->rx_ntt = 1;
/* A slave has been removed from the
@@ -524,7 +525,8 @@ static void rlb_req_update_slave_clients(struct bonding *bond, struct slave *sla
client_info = &(bond_info->rx_hashtbl[hash_index]);
if ((client_info->slave == slave) &&
- !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast)) {
+ !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
+ !is_zero_ether_addr(client_info->mac_dst)) {
client_info->ntt = 1;
ntt = 1;
}
@@ -565,7 +567,8 @@ static void rlb_req_update_subnet_clients(struct bonding *bond, __be32 src_ip)
if ((client_info->ip_src == src_ip) &&
!ether_addr_equal_64bits(client_info->slave->dev->dev_addr,
bond->dev->dev_addr) &&
- !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast)) {
+ !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
+ !is_zero_ether_addr(client_info->mac_dst)) {
client_info->ntt = 1;
bond_info->rx_ntt = 1;
}
@@ -641,7 +644,8 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
ether_addr_copy(client_info->mac_src, arp->mac_src);
client_info->slave = assigned_slave;
- if (!ether_addr_equal_64bits(client_info->mac_dst, mac_bcast)) {
+ if (!ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
+ !is_zero_ether_addr(client_info->mac_dst)) {
client_info->ntt = 1;
bond->alb_info.rx_ntt = 1;
} else {
@@ -733,8 +737,10 @@ static void rlb_rebalance(struct bonding *bond)
assigned_slave = __rlb_next_rx_slave(bond);
if (assigned_slave && (client_info->slave != assigned_slave)) {
client_info->slave = assigned_slave;
- client_info->ntt = 1;
- ntt = 1;
+ if (!is_zero_ether_addr(client_info->mac_dst)) {
+ client_info->ntt = 1;
+ ntt = 1;
+ }
}
}
--
2.17.0
^ permalink raw reply related
* [PATCH net-next 3/4] bonding: allow use of tx hashing in balance-alb
From: Debabrata Banerjee @ 2018-05-11 19:25 UTC (permalink / raw)
To: David S . Miller, netdev
Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, dbanerje
In-Reply-To: <20180511192548.8119-1-dbanerje@akamai.com>
The rx load balancing provided by balance-alb is not mutually
exclusive with using hashing for tx selection, and should provide a decent
speed increase because this eliminates spinlocks and cache contention.
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
---
drivers/net/bonding/bond_alb.c | 20 ++++++++++++++++++--
drivers/net/bonding/bond_main.c | 25 +++++++++++++++----------
drivers/net/bonding/bond_options.c | 2 +-
include/net/bonding.h | 10 +++++++++-
4 files changed, 43 insertions(+), 14 deletions(-)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 180e50f7806f..6228635880d5 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1478,8 +1478,24 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
}
if (do_tx_balance) {
- hash_index = _simple_hash(hash_start, hash_size);
- tx_slave = tlb_choose_channel(bond, hash_index, skb->len);
+ if (bond->params.tlb_dynamic_lb) {
+ hash_index = _simple_hash(hash_start, hash_size);
+ tx_slave = tlb_choose_channel(bond, hash_index, skb->len);
+ } else {
+ /*
+ * do_tx_balance means we are free to select the tx_slave
+ * So we do exactly what tlb would do for hash selection
+ */
+
+ struct bond_up_slave *slaves;
+ unsigned int count;
+
+ slaves = rcu_dereference(bond->slave_arr);
+ count = slaves ? READ_ONCE(slaves->count) : 0;
+ if (likely(count))
+ tx_slave = slaves->arr[bond_xmit_hash(bond, skb) %
+ count];
+ }
}
return bond_do_alb_xmit(skb, bond, tx_slave);
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 1f1e97b26f95..f7f8a49cb32b 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -159,7 +159,7 @@ module_param(min_links, int, 0);
MODULE_PARM_DESC(min_links, "Minimum number of available links before turning on carrier");
module_param(xmit_hash_policy, charp, 0);
-MODULE_PARM_DESC(xmit_hash_policy, "balance-xor and 802.3ad hashing method; "
+MODULE_PARM_DESC(xmit_hash_policy, "balance-alb, balance-tlb, balance-xor, 802.3ad hashing method; "
"0 for layer 2 (default), 1 for layer 3+4, "
"2 for layer 2+3, 3 for encap layer 2+3, "
"4 for encap layer 3+4");
@@ -1735,7 +1735,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
unblock_netpoll_tx();
}
- if (bond_mode_uses_xmit_hash(bond))
+ if (bond_mode_can_use_xmit_hash(bond))
bond_update_slave_arr(bond, NULL);
bond->nest_level = dev_get_nest_level(bond_dev);
@@ -1870,7 +1870,7 @@ static int __bond_release_one(struct net_device *bond_dev,
if (BOND_MODE(bond) == BOND_MODE_8023AD)
bond_3ad_unbind_slave(slave);
- if (bond_mode_uses_xmit_hash(bond))
+ if (bond_mode_can_use_xmit_hash(bond))
bond_update_slave_arr(bond, slave);
netdev_info(bond_dev, "Releasing %s interface %s\n",
@@ -3102,7 +3102,7 @@ static int bond_slave_netdev_event(unsigned long event,
* events. If these (miimon/arpmon) parameters are configured
* then array gets refreshed twice and that should be fine!
*/
- if (bond_mode_uses_xmit_hash(bond))
+ if (bond_mode_can_use_xmit_hash(bond))
bond_update_slave_arr(bond, NULL);
break;
case NETDEV_CHANGEMTU:
@@ -3322,7 +3322,7 @@ static int bond_open(struct net_device *bond_dev)
*/
if (bond_alb_initialize(bond, (BOND_MODE(bond) == BOND_MODE_ALB)))
return -ENOMEM;
- if (bond->params.tlb_dynamic_lb)
+ if (bond->params.tlb_dynamic_lb || BOND_MODE(bond) == BOND_MODE_ALB)
queue_delayed_work(bond->wq, &bond->alb_work, 0);
}
@@ -3341,7 +3341,7 @@ static int bond_open(struct net_device *bond_dev)
bond_3ad_initiate_agg_selection(bond, 1);
}
- if (bond_mode_uses_xmit_hash(bond))
+ if (bond_mode_can_use_xmit_hash(bond))
bond_update_slave_arr(bond, NULL);
return 0;
@@ -3892,7 +3892,7 @@ static void bond_slave_arr_handler(struct work_struct *work)
* to determine the slave interface -
* (a) BOND_MODE_8023AD
* (b) BOND_MODE_XOR
- * (c) BOND_MODE_TLB && tlb_dynamic_lb == 0
+ * (c) (BOND_MODE_TLB || BOND_MODE_ALB) && tlb_dynamic_lb == 0
*
* The caller is expected to hold RTNL only and NO other lock!
*/
@@ -3945,6 +3945,11 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
continue;
if (skipslave == slave)
continue;
+
+ netdev_dbg(bond->dev,
+ "Adding slave dev %s to tx hash array[%d]\n",
+ slave->dev->name, new_arr->count);
+
new_arr->arr[new_arr->count++] = slave;
}
@@ -4320,9 +4325,9 @@ static int bond_check_params(struct bond_params *params)
}
if (xmit_hash_policy) {
- if ((bond_mode != BOND_MODE_XOR) &&
- (bond_mode != BOND_MODE_8023AD) &&
- (bond_mode != BOND_MODE_TLB)) {
+ if (bond_mode == BOND_MODE_ROUNDROBIN ||
+ bond_mode == BOND_MODE_ACTIVEBACKUP ||
+ bond_mode == BOND_MODE_BROADCAST) {
pr_info("xmit_hash_policy param is irrelevant in mode %s\n",
bond_mode_name(bond_mode));
} else {
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 58c705f24f96..8a945c9341d6 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -395,7 +395,7 @@ static const struct bond_option bond_opts[BOND_OPT_LAST] = {
.id = BOND_OPT_TLB_DYNAMIC_LB,
.name = "tlb_dynamic_lb",
.desc = "Enable dynamic flow shuffling",
- .unsuppmodes = BOND_MODE_ALL_EX(BIT(BOND_MODE_TLB)),
+ .unsuppmodes = BOND_MODE_ALL_EX(BIT(BOND_MODE_TLB) | BIT(BOND_MODE_ALB)),
.values = bond_tlb_dynamic_lb_tbl,
.flags = BOND_OPTFLAG_IFDOWN,
.set = bond_option_tlb_dynamic_lb_set,
diff --git a/include/net/bonding.h b/include/net/bonding.h
index b52235158836..9a41a50b0bd2 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -285,10 +285,18 @@ static inline bool bond_needs_speed_duplex(const struct bonding *bond)
static inline bool bond_is_nondyn_tlb(const struct bonding *bond)
{
- return (BOND_MODE(bond) == BOND_MODE_TLB) &&
+ return (BOND_MODE(bond) == BOND_MODE_TLB || BOND_MODE(bond) == BOND_MODE_ALB) &&
(bond->params.tlb_dynamic_lb == 0);
}
+static inline bool bond_mode_can_use_xmit_hash(const struct bonding *bond)
+{
+ return (BOND_MODE(bond) == BOND_MODE_8023AD ||
+ BOND_MODE(bond) == BOND_MODE_XOR ||
+ BOND_MODE(bond) == BOND_MODE_TLB ||
+ BOND_MODE(bond) == BOND_MODE_ALB);
+}
+
static inline bool bond_mode_uses_xmit_hash(const struct bonding *bond)
{
return (BOND_MODE(bond) == BOND_MODE_8023AD ||
--
2.17.0
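The non-dynamic branch added to bond_alb_xmit() above picks a slave by reducing a flow hash modulo the number of usable slaves. A minimal user-space sketch of that selection follows; struct up_slaves and pick_slave() are hypothetical, and plain strings stand in for slave pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the array of usable slaves. */
struct up_slaves {
	unsigned int count;
	const char *arr[8];
};

/* Map a flow hash onto one of the usable slaves. Returns NULL when
 * the array is missing or empty, so the caller can fall back. */
static const char *pick_slave(const struct up_slaves *s, uint32_t hash)
{
	unsigned int count = s ? s->count : 0;

	if (!count)
		return NULL;
	return s->arr[hash % count];
}
```

The same hash always maps to the same slave for a given array, so this path needs no per-packet load accounting, which is presumably where the saving described in the commit message comes from.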
^ permalink raw reply related
* [PATCH net-next 2/4] bonding: use common mac addr checks
From: Debabrata Banerjee @ 2018-05-11 19:25 UTC (permalink / raw)
To: David S . Miller, netdev
Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, dbanerje
In-Reply-To: <20180511192548.8119-1-dbanerje@akamai.com>
Replace homegrown MAC address checks with the faster helpers from
etherdevice.h
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
---
drivers/net/bonding/bond_alb.c | 28 +++++++++-------------------
1 file changed, 9 insertions(+), 19 deletions(-)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index c2f6c58e4e6a..180e50f7806f 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -40,11 +40,6 @@
#include <net/bonding.h>
#include <net/bond_alb.h>
-
-
-static const u8 mac_bcast[ETH_ALEN + 2] __long_aligned = {
- 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
-};
static const u8 mac_v6_allmcast[ETH_ALEN + 2] __long_aligned = {
0x33, 0x33, 0x00, 0x00, 0x00, 0x01
};
@@ -420,9 +415,7 @@ static void rlb_clear_slave(struct bonding *bond, struct slave *slave)
if (assigned_slave) {
rx_hash_table[index].slave = assigned_slave;
- if (!ether_addr_equal_64bits(rx_hash_table[index].mac_dst,
- mac_bcast) &&
- !is_zero_ether_addr(rx_hash_table[index].mac_dst)) {
+ if (is_valid_ether_addr(rx_hash_table[index].mac_dst)) {
bond_info->rx_hashtbl[index].ntt = 1;
bond_info->rx_ntt = 1;
/* A slave has been removed from the
@@ -525,8 +518,7 @@ static void rlb_req_update_slave_clients(struct bonding *bond, struct slave *sla
client_info = &(bond_info->rx_hashtbl[hash_index]);
if ((client_info->slave == slave) &&
- !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
- !is_zero_ether_addr(client_info->mac_dst)) {
+ is_valid_ether_addr(client_info->mac_dst)) {
client_info->ntt = 1;
ntt = 1;
}
@@ -567,8 +559,7 @@ static void rlb_req_update_subnet_clients(struct bonding *bond, __be32 src_ip)
if ((client_info->ip_src == src_ip) &&
!ether_addr_equal_64bits(client_info->slave->dev->dev_addr,
bond->dev->dev_addr) &&
- !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
- !is_zero_ether_addr(client_info->mac_dst)) {
+ is_valid_ether_addr(client_info->mac_dst)) {
client_info->ntt = 1;
bond_info->rx_ntt = 1;
}
@@ -596,7 +587,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
if ((client_info->ip_src == arp->ip_src) &&
(client_info->ip_dst == arp->ip_dst)) {
/* the entry is already assigned to this client */
- if (!ether_addr_equal_64bits(arp->mac_dst, mac_bcast)) {
+ if (!is_broadcast_ether_addr(arp->mac_dst)) {
/* update mac address from arp */
ether_addr_copy(client_info->mac_dst, arp->mac_dst);
}
@@ -644,8 +635,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
ether_addr_copy(client_info->mac_src, arp->mac_src);
client_info->slave = assigned_slave;
- if (!ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
- !is_zero_ether_addr(client_info->mac_dst)) {
+ if (is_valid_ether_addr(client_info->mac_dst)) {
client_info->ntt = 1;
bond->alb_info.rx_ntt = 1;
} else {
@@ -1418,9 +1408,9 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
case ETH_P_IP: {
const struct iphdr *iph = ip_hdr(skb);
- if (ether_addr_equal_64bits(eth_data->h_dest, mac_bcast) ||
- (iph->daddr == ip_bcast) ||
- (iph->protocol == IPPROTO_IGMP)) {
+ if (is_broadcast_ether_addr(eth_data->h_dest) ||
+ iph->daddr == ip_bcast ||
+ iph->protocol == IPPROTO_IGMP) {
do_tx_balance = false;
break;
}
@@ -1432,7 +1422,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
/* IPv6 doesn't really use broadcast mac address, but leave
* that here just in case.
*/
- if (ether_addr_equal_64bits(eth_data->h_dest, mac_bcast)) {
+ if (is_broadcast_ether_addr(eth_data->h_dest)) {
do_tx_balance = false;
break;
}
--
2.17.0
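For reference, the address classes used by the etherdevice.h helpers in the patch above can be re-created in user-space C as follows. The mac_* functions are illustrative re-implementations with hypothetical names, not the kernel's optimized versions. The kernel's is_valid_ether_addr() rejects both all-zero and multicast addresses (broadcast is a special case of multicast), so it is slightly stricter than the "not broadcast and not zero" pair it replaces.

```c
#include <stdbool.h>
#include <stdint.h>

#define ETH_ALEN 6

/* True if every byte of the address is zero. */
static bool mac_is_zero(const uint8_t *a)
{
	for (int i = 0; i < ETH_ALEN; i++)
		if (a[i])
			return false;
	return true;
}

/* True if every byte of the address is 0xff. */
static bool mac_is_broadcast(const uint8_t *a)
{
	for (int i = 0; i < ETH_ALEN; i++)
		if (a[i] != 0xff)
			return false;
	return true;
}

/* Multicast addresses have the low bit of the first byte set. */
static bool mac_is_multicast(const uint8_t *a)
{
	return a[0] & 0x01;
}

/* Mirrors is_valid_ether_addr(): neither multicast nor all-zero. */
static bool mac_is_valid(const uint8_t *a)
{
	return !mac_is_multicast(a) && !mac_is_zero(a);
}
```

A non-broadcast multicast destination such as 33:33:00:00:00:01 passes the old pair of checks but fails mac_is_valid().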
^ permalink raw reply related
* Re: [PATCH v6 1/6] net: phy: at803x: Export at803x_debug_reg_mask()
From: Andrew Lunn @ 2018-05-11 19:24 UTC (permalink / raw)
To: Paul Burton; +Cc: Darren Hart, netdev, linux-mips, David S . Miller
In-Reply-To: <20180511182502.y74wm6dmtf3dbcln@pburton-laptop>
> I could reorder the probe function a little to initialize the PHY before
> performing the MAC reset, drop this patch and the AR803X hibernation
> stuff from patch 2 if you like. But again, I can't actually test the
> result on the affected hardware.
Hi Paul
I don't like a MAC driver poking around in PHY registers.
So if you can rearrange the code, that would be great.
Thanks
Andrew
^ permalink raw reply
* [PATCH V2] mlx4_core: allocate ICM memory in page size chunks
From: Qing Huang @ 2018-05-11 19:23 UTC (permalink / raw)
To: tariqt, davem, haakon.bugge, yanjun.zhu
Cc: netdev, linux-rdma, linux-kernel, Qing Huang
When a system is under memory pressure (high usage with fragmentation),
the original 256KB ICM chunk allocations are likely to push kernel
memory management into the slow path, doing memory compaction/migration
in order to complete the high-order memory allocations.
When that happens, user processes calling uverb APIs can easily get
stuck for more than 120s, even though there are plenty of free pages
available in smaller chunks in the system.
Syslog:
...
Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
oracle_205573_e:205573 blocked for more than 120 seconds.
...
With a 4KB ICM chunk size on the x86_64 arch, the above issue is fixed.
However, in order to support a smaller ICM chunk size, we need to fix
another issue with large kcalloc allocations.
E.g.
Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk
size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt
entry). So we need a 16MB allocation for a table->icm pointer array to
hold 2M pointers which can easily cause kcalloc to fail.
The solution is to replace kcalloc with vzalloc. A driver metadata
structure that is never used for DMA does not need physically
contiguous memory pages.
Signed-off-by: Qing Huang <qing.huang@oracle.com>
Acked-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
---
v1 -> v2: adjusted chunk size to reflect different architectures.
drivers/net/ethernet/mellanox/mlx4/icm.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
index a822f7a..ccb62b8 100644
--- a/drivers/net/ethernet/mellanox/mlx4/icm.c
+++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
@@ -43,12 +43,12 @@
#include "fw.h"
/*
- * We allocate in as big chunks as we can, up to a maximum of 256 KB
- * per chunk.
+ * We allocate in page size (default 4KB on many archs) chunks to avoid high
+ * order memory allocations in fragmented/high usage memory situation.
*/
enum {
- MLX4_ICM_ALLOC_SIZE = 1 << 18,
- MLX4_TABLE_CHUNK_SIZE = 1 << 18
+ MLX4_ICM_ALLOC_SIZE = 1 << PAGE_SHIFT,
+ MLX4_TABLE_CHUNK_SIZE = 1 << PAGE_SHIFT
};
static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
@@ -400,7 +400,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
obj_per_chunk = MLX4_TABLE_CHUNK_SIZE / obj_size;
num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;
- table->icm = kcalloc(num_icm, sizeof(*table->icm), GFP_KERNEL);
+ table->icm = vzalloc(num_icm * sizeof(*table->icm));
if (!table->icm)
return -ENOMEM;
table->virt = virt;
@@ -446,7 +446,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
mlx4_free_icm(dev, table->icm[i], use_coherent);
}
- kfree(table->icm);
+ vfree(table->icm);
return -ENOMEM;
}
@@ -462,5 +462,5 @@ void mlx4_cleanup_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table)
mlx4_free_icm(dev, table->icm[i], table->coherent);
}
- kfree(table->icm);
+ vfree(table->icm);
}
--
2.9.3
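The sizing example in the commit message above (log_num_mtt=30, 8-byte mtt entries) can be checked with a short calculation that mirrors the obj_per_chunk and num_icm arithmetic in mlx4_init_icm_table(). icm_ptr_array_bytes() is a hypothetical helper, and an 8-byte pointer size is assumed.

```c
#include <stdint.h>

/* Size in bytes of the table->icm pointer array needed to track
 * 2^log_num_obj objects of obj_size bytes in chunks of chunk_size. */
static uint64_t icm_ptr_array_bytes(unsigned int log_num_obj,
				    uint64_t chunk_size,
				    uint64_t obj_size,
				    uint64_t ptr_size)
{
	uint64_t nobj = 1ULL << log_num_obj;
	uint64_t obj_per_chunk = chunk_size / obj_size;
	uint64_t num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;

	return num_icm * ptr_size;
}
```

With 4KB chunks, 2^30 entries at 512 entries per chunk give 2^21 chunks, hence 2^21 * 8 bytes = 16MB of pointers. With the original 256KB chunks the same table needs only 2^15 pointers, or 256KB, which is why the kcalloc size only becomes a problem after the chunk size is reduced.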
^ permalink raw reply related
* Re: [PATCH] mlx4_core: allocate 4KB ICM chunks
From: Qing Huang @ 2018-05-11 19:16 UTC (permalink / raw)
To: Håkon Bugge; +Cc: tariqt, davem, netdev, OFED mailing list, linux-kernel
In-Reply-To: <5ABF1B88-882E-4575-8E8C-41F0452FECC1@oracle.com>
On 5/11/2018 3:27 AM, Håkon Bugge wrote:
>> On 11 May 2018, at 01:31, Qing Huang <qing.huang@oracle.com> wrote:
>>
>> When a system is under memory pressure (high usage with fragments),
>> the original 256KB ICM chunk allocations will likely trigger kernel
>> memory management to enter slow path doing memory compact/migration
>> ops in order to complete high order memory allocations.
>>
>> When that happens, user processes calling uverb APIs may get stuck
>> for more than 120s easily even though there are a lot of free pages
>> in smaller chunks available in the system.
>>
>> Syslog:
>> ...
>> Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
>> oracle_205573_e:205573 blocked for more than 120 seconds.
>> ...
>>
>> With 4KB ICM chunk size, the above issue is fixed.
>>
>> However in order to support 4KB ICM chunk size, we need to fix another
>> issue in large size kcalloc allocations.
>>
>> E.g.
>> Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk
>> size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt
>> entry). So we need a 16MB allocation for a table->icm pointer array to
>> hold 2M pointers which can easily cause kcalloc to fail.
>>
>> The solution is to use vzalloc to replace kcalloc. There is no need
>> for contiguous memory pages for a driver meta data structure (no need
>> of DMA ops).
>>
>> Signed-off-by: Qing Huang <qing.huang@oracle.com>
>> Acked-by: Daniel Jurgens <danielj@mellanox.com>
>> ---
>> drivers/net/ethernet/mellanox/mlx4/icm.c | 14 +++++++-------
>> 1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
>> index a822f7a..2b17a4b 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/icm.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
>> @@ -43,12 +43,12 @@
>> #include "fw.h"
>>
>> /*
>> - * We allocate in as big chunks as we can, up to a maximum of 256 KB
>> - * per chunk.
>> + * We allocate in 4KB page size chunks to avoid high order memory
>> + * allocations in fragmented/high usage memory situation.
>> */
>> enum {
>> - MLX4_ICM_ALLOC_SIZE = 1 << 18,
>> - MLX4_TABLE_CHUNK_SIZE = 1 << 18
>> + MLX4_ICM_ALLOC_SIZE = 1 << 12,
>> + MLX4_TABLE_CHUNK_SIZE = 1 << 12
> Shouldn’t these be the arch’s page size order? E.g., if running on SPARC, the hw page size is 8KiB.
Good point on supporting a wider range of architectures. I got tunnel
vision when fixing this on our x64 lab machines.
Will send a v2 patch.
Thanks,
Qing
> Thxs, Håkon
>
>> };
>>
>> static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
>> @@ -400,7 +400,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
>> obj_per_chunk = MLX4_TABLE_CHUNK_SIZE / obj_size;
>> num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;
>>
>> - table->icm = kcalloc(num_icm, sizeof(*table->icm), GFP_KERNEL);
>> + table->icm = vzalloc(num_icm * sizeof(*table->icm));
>> if (!table->icm)
>> return -ENOMEM;
>> table->virt = virt;
>> @@ -446,7 +446,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
>> mlx4_free_icm(dev, table->icm[i], use_coherent);
>> }
>>
>> - kfree(table->icm);
>> + vfree(table->icm);
>>
>> return -ENOMEM;
>> }
>> @@ -462,5 +462,5 @@ void mlx4_cleanup_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table)
>> mlx4_free_icm(dev, table->icm[i], table->coherent);
>> }
>>
>> - kfree(table->icm);
>> + vfree(table->icm);
>> }
>> --
>> 2.9.3
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH v3] net: phy: DP83TC811: Introduce support for the DP83TC811 phy
From: David Miller @ 2018-05-11 19:15 UTC (permalink / raw)
To: andrew; +Cc: dmurphy, f.fainelli, netdev, linux-kernel
In-Reply-To: <20180511191011.GC12738@lunn.ch>
From: Andrew Lunn <andrew@lunn.ch>
Date: Fri, 11 May 2018 21:10:11 +0200
> Humm, i thought i had given one. But i cannot find it in the mail
> archive. Going senile :-(
You aren't going senile, there are just a huge number of patches
being submitted since net-next opened up.
^ permalink raw reply
* Re: [PATCH v3] net: phy: DP83TC811: Introduce support for the DP83TC811 phy
From: Florian Fainelli @ 2018-05-11 19:12 UTC (permalink / raw)
To: Dan Murphy, andrew; +Cc: netdev, linux-kernel
In-Reply-To: <20180511180819.5036-1-dmurphy@ti.com>
On 05/11/2018 11:08 AM, Dan Murphy wrote:
> Add support for the DP83811 phy.
>
> The DP83811 supports both rgmii and sgmii interfaces.
> There are 2 part numbers for this: the DP83TC811R does not
> reliably support the SGMII interface, but the DP83TC811S does.
>
> There is not a way to differentiate these parts from the
> hardware or register set. So this is controlled via the DT
> to indicate which phy mode is required. Or the part can be
> strapped to a certain interface.
>
> Data sheet can be found here:
> http://www.ti.com/product/DP83TC811S-Q1/description
> http://www.ti.com/product/DP83TC811R-Q1/description
>
> Signed-off-by: Dan Murphy <dmurphy@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
--
Florian
^ permalink raw reply
* Re: [PATCH v3] net: phy: DP83TC811: Introduce support for the DP83TC811 phy
From: Andrew Lunn @ 2018-05-11 19:10 UTC (permalink / raw)
To: Dan Murphy; +Cc: f.fainelli, netdev, linux-kernel
In-Reply-To: <56ca63ce-d609-5bb7-f38c-17d875496bd5@ti.com>
On Fri, May 11, 2018 at 01:51:28PM -0500, Dan Murphy wrote:
> Andrew
>
> On 05/11/2018 01:30 PM, Andrew Lunn wrote:
> > On Fri, May 11, 2018 at 01:08:19PM -0500, Dan Murphy wrote:
> >> Add support for the DP83811 phy.
> >>
> >> The DP83811 supports both rgmii and sgmii interfaces.
> >> There are 2 part numbers for this: the DP83TC811R does not
> >> reliably support the SGMII interface, but the DP83TC811S does.
> >>
> >> There is not a way to differentiate these parts from the
> >> hardware or register set. So this is controlled via the DT
> >> to indicate which phy mode is required. Or the part can be
> >> strapped to a certain interface.
> >>
> >> Data sheet can be found here:
> >> http://www.ti.com/product/DP83TC811S-Q1/description
> >> http://www.ti.com/product/DP83TC811R-Q1/description
> >>
> >> Signed-off-by: Dan Murphy <dmurphy@ti.com>
> >
> > Hi Dan
> >
> > It is normal to add any Reviewed-by, or Tested-by: tags you received,
> > so long as you don't make major changes.
> >
>
> Thanks for the reminder.
>
> I usually add them if I get them explicitly stated in the review.
Humm, i thought i had given one. But i cannot find it in the mail
archive. Going senile :-(
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH net 1/1] net sched actions: fix refcnt leak in skbmod
From: Cong Wang @ 2018-05-11 19:09 UTC (permalink / raw)
To: Roman Mashak
Cc: David Miller, Linux Kernel Network Developers, kernel,
Jamal Hadi Salim, Jiri Pirko
In-Reply-To: <1526063733-7813-1-git-send-email-mrv@mojatatu.com>
On Fri, May 11, 2018 at 11:35 AM, Roman Mashak <mrv@mojatatu.com> wrote:
> When application fails to pass flags in netlink TLV when replacing
> existing skbmod action, the kernel will leak refcnt:
>
> $ tc actions get action skbmod index 1
> total acts 0
>
> action order 0: skbmod pipe set smac 00:11:22:33:44:55
> index 1 ref 1 bind 0
>
> For example, at this point a buggy application replaces the action with
> index 1 with new smac 00:aa:22:33:44:55, it fails because of zero flags,
> however refcnt gets bumped:
>
> $ tc actions get actions skbmod index 1
> total acts 0
>
> action order 0: skbmod pipe set smac 00:11:22:33:44:55
> index 1 ref 2 bind 0
> $
>
> The patch fixes this by calling tcf_idr_release() on existing actions.
>
> Fixes: 86da71b57383d ("net_sched: Introduce skbmod action")
> Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
^ permalink raw reply
* Re: INFO: rcu detected stall in kfree_skbmem
From: Eric Dumazet @ 2018-05-11 19:08 UTC (permalink / raw)
To: Marcelo Ricardo Leitner, Dmitry Vyukov
Cc: syzbot, Vladislav Yasevich, Neil Horman, linux-sctp, Andrei Vagin,
David Miller, Kirill Tkhai, LKML, netdev, syzkaller-bugs
In-Reply-To: <20180511184141.GW5105@localhost.localdomain>
On 05/11/2018 11:41 AM, Marcelo Ricardo Leitner wrote:
> But calling ip6_xmit with rcu_read_lock is expected. tcp stack also
> does it.
> Thus I think this is more of an issue with IPv6 stack. If a host has
> an extensive ip6tables ruleset, it probably generates this more
> easily.
>
>>> sctp_v6_xmit+0x4a5/0x6b0 net/sctp/ipv6.c:225
>>> sctp_packet_transmit+0x26f6/0x3ba0 net/sctp/output.c:650
>>> sctp_outq_flush+0x1373/0x4370 net/sctp/outqueue.c:1197
>>> sctp_outq_uncork+0x6a/0x80 net/sctp/outqueue.c:776
>>> sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1820 [inline]
>>> sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
>>> sctp_do_sm+0x596/0x7160 net/sctp/sm_sideeffect.c:1191
>>> sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406
>>> call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
>>> expire_timers kernel/time/timer.c:1363 [inline]
>
> Having this call from a timer means it wasn't processing sctp stack
> for too long.
>
I feel the problem is that this part is looping, in some infinite loop.
I have seen these stack traces in other reports.
Maybe some kind of list corruption.
^ permalink raw reply
* Re: [PATCH v3] net: phy: DP83TC811: Introduce support for the DP83TC811 phy
From: Dan Murphy @ 2018-05-11 18:51 UTC (permalink / raw)
To: Andrew Lunn; +Cc: f.fainelli, netdev, linux-kernel
In-Reply-To: <20180511183006.GB12738@lunn.ch>
Andrew
On 05/11/2018 01:30 PM, Andrew Lunn wrote:
> On Fri, May 11, 2018 at 01:08:19PM -0500, Dan Murphy wrote:
>> Add support for the DP83811 phy.
>>
>> The DP83811 supports both rgmii and sgmii interfaces.
>> There are 2 part numbers for this: the DP83TC811R does not
>> reliably support the SGMII interface, but the DP83TC811S does.
>>
>> There is not a way to differentiate these parts from the
>> hardware or register set. So this is controlled via the DT
>> to indicate which phy mode is required. Or the part can be
>> strapped to a certain interface.
>>
>> Data sheet can be found here:
>> http://www.ti.com/product/DP83TC811S-Q1/description
>> http://www.ti.com/product/DP83TC811R-Q1/description
>>
>> Signed-off-by: Dan Murphy <dmurphy@ti.com>
>
> Hi Dan
>
> It is normal to add any Reviewed-by, or Tested-by: tags you received,
> so long as you don't make major changes.
>
Thanks for the reminder.
I usually add them if I get them explicitly stated in the review.
I have not seen any Reviewed-by or Tested-by tags in any of the replies for the
patch. But I may have missed it.
Dan
> Andrew
>
--
------------------
Dan Murphy
^ permalink raw reply
* Re: INFO: rcu detected stall in kfree_skbmem
From: Marcelo Ricardo Leitner @ 2018-05-11 18:41 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: syzbot, Vladislav Yasevich, Neil Horman, linux-sctp, Andrei Vagin,
David Miller, Kirill Tkhai, LKML, netdev, syzkaller-bugs
In-Reply-To: <CACT4Y+Z_+=VLbELVW69B7WEVqvri1xBMM5RMnMQynec_XgaE=w@mail.gmail.com>
On Fri, May 11, 2018 at 12:00:38PM +0200, Dmitry Vyukov wrote:
> On Mon, Apr 30, 2018 at 8:09 PM, syzbot
> <syzbot+fc78715ba3b3257caf6a@syzkaller.appspotmail.com> wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: 5d1365940a68 Merge
> > git://git.kernel.org/pub/scm/linux/kerne...
> > git tree: net-next
> > console output: https://syzkaller.appspot.com/x/log.txt?id=5667997129637888
> > kernel config:
> > https://syzkaller.appspot.com/x/.config?id=-5947642240294114534
> > dashboard link: https://syzkaller.appspot.com/bug?extid=fc78715ba3b3257caf6a
> > compiler: gcc (GCC) 8.0.1 20180413 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
>
> This looks sctp-related, +sctp maintainers.
>
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+fc78715ba3b3257caf6a@syzkaller.appspotmail.com
> >
> > INFO: rcu_sched self-detected stall on CPU
> > 1-...!: (1 GPs behind) idle=a3e/1/4611686018427387908
> > softirq=71980/71983 fqs=33
> > (t=125000 jiffies g=39438 c=39437 q=958)
> > rcu_sched kthread starved for 124829 jiffies! g39438 c39437 f0x0
> > RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
> > RCU grace-period kthread stack dump:
> > rcu_sched R running task 23768 9 2 0x80000000
> > Call Trace:
> > context_switch kernel/sched/core.c:2848 [inline]
> > __schedule+0x801/0x1e30 kernel/sched/core.c:3490
> > schedule+0xef/0x430 kernel/sched/core.c:3549
> > schedule_timeout+0x138/0x240 kernel/time/timer.c:1801
> > rcu_gp_kthread+0x6b5/0x1940 kernel/rcu/tree.c:2231
> > kthread+0x345/0x410 kernel/kthread.c:238
> > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411
> > NMI backtrace for cpu 1
> > CPU: 1 PID: 20560 Comm: syz-executor4 Not tainted 4.16.0+ #1
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Call Trace:
> > <IRQ>
> > __dump_stack lib/dump_stack.c:77 [inline]
> > dump_stack+0x1b9/0x294 lib/dump_stack.c:113
> > nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
> > nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
> > arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
> > trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
> > rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376
> > print_cpu_stall kernel/rcu/tree.c:1525 [inline]
> > check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593
> > __rcu_pending kernel/rcu/tree.c:3356 [inline]
> > rcu_pending kernel/rcu/tree.c:3401 [inline]
> > rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
> > update_process_times+0x2d/0x70 kernel/time/timer.c:1636
> > tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:173
> > tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1283
> > __run_hrtimer kernel/time/hrtimer.c:1386 [inline]
> > __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1448
> > hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1506
> > local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
> > smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
> > apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
> > RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783
> > [inline]
> > RIP: 0010:kmem_cache_free+0xb3/0x2d0 mm/slab.c:3757
> > RSP: 0018:ffff8801db105228 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
> > RAX: 0000000000000007 RBX: ffff8800b055c940 RCX: 1ffff1003b2345a5
> > RDX: 0000000000000000 RSI: ffff8801d91a2d80 RDI: 0000000000000282
> > RBP: ffff8801db105248 R08: ffff8801d91a2cb8 R09: 0000000000000002
> > R10: ffff8801d91a2480 R11: 0000000000000000 R12: ffff8801d9848e40
> > R13: 0000000000000282 R14: ffffffff85b7f27c R15: 0000000000000000
> > kfree_skbmem+0x13c/0x210 net/core/skbuff.c:582
> > __kfree_skb net/core/skbuff.c:642 [inline]
> > kfree_skb+0x19d/0x560 net/core/skbuff.c:659
> > enqueue_to_backlog+0x2fc/0xc90 net/core/dev.c:3968
> > netif_rx_internal+0x14d/0xae0 net/core/dev.c:4181
> > netif_rx+0xba/0x400 net/core/dev.c:4206
> > loopback_xmit+0x283/0x741 drivers/net/loopback.c:91
> > __netdev_start_xmit include/linux/netdevice.h:4087 [inline]
> > netdev_start_xmit include/linux/netdevice.h:4096 [inline]
> > xmit_one net/core/dev.c:3053 [inline]
> > dev_hard_start_xmit+0x264/0xc10 net/core/dev.c:3069
> > __dev_queue_xmit+0x2724/0x34c0 net/core/dev.c:3584
> > dev_queue_xmit+0x17/0x20 net/core/dev.c:3617
> > neigh_hh_output include/net/neighbour.h:472 [inline]
> > neigh_output include/net/neighbour.h:480 [inline]
> > ip6_finish_output2+0x134e/0x2810 net/ipv6/ip6_output.c:120
> > ip6_finish_output+0x5fe/0xbc0 net/ipv6/ip6_output.c:154
> > NF_HOOK_COND include/linux/netfilter.h:277 [inline]
> > ip6_output+0x227/0x9b0 net/ipv6/ip6_output.c:171
> > dst_output include/net/dst.h:444 [inline]
> > NF_HOOK include/linux/netfilter.h:288 [inline]
> > ip6_xmit+0xf51/0x23f0 net/ipv6/ip6_output.c:277
sctp_v6_xmit calls ip6_xmit with rcu_read_lock() as it has to pass
np->opt to ip6_xmit. Sounds like this packet then went through a long
journey and hit the bell.
But calling ip6_xmit with rcu_read_lock is expected. tcp stack also
does it.
Thus I think this is more of an issue with IPv6 stack. If a host has
an extensive ip6tables ruleset, it probably generates this more
easily.
> > sctp_v6_xmit+0x4a5/0x6b0 net/sctp/ipv6.c:225
> > sctp_packet_transmit+0x26f6/0x3ba0 net/sctp/output.c:650
> > sctp_outq_flush+0x1373/0x4370 net/sctp/outqueue.c:1197
> > sctp_outq_uncork+0x6a/0x80 net/sctp/outqueue.c:776
> > sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1820 [inline]
> > sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
> > sctp_do_sm+0x596/0x7160 net/sctp/sm_sideeffect.c:1191
> > sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406
> > call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
> > expire_timers kernel/time/timer.c:1363 [inline]
Having this call from a timer means it wasn't processing sctp stack
for too long.
> > __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
> > run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
> > __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
> > invoke_softirq kernel/softirq.c:365 [inline]
> > irq_exit+0x1d1/0x200 kernel/softirq.c:405
> > exiting_irq arch/x86/include/asm/apic.h:525 [inline]
> > smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
> > apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
> > </IRQ>
> > RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783
> > [inline]
> > RIP: 0010:lock_release+0x4d4/0xa10 kernel/locking/lockdep.c:3942
> > RSP: 0018:ffff8801971ce7b0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
> > RAX: dffffc0000000000 RBX: 1ffff10032e39cfb RCX: 1ffff1003b234595
> > RDX: 1ffffffff11630ed RSI: 0000000000000002 RDI: 0000000000000282
> > RBP: ffff8801971ce8e0 R08: 1ffff10032e39cff R09: ffffed003b6246c2
> > R10: 0000000000000003 R11: 0000000000000001 R12: ffff8801d91a2480
> > R13: ffffffff88b8df60 R14: ffff8801d91a2480 R15: ffff8801971ce7f8
> > rcu_lock_release include/linux/rcupdate.h:251 [inline]
> > rcu_read_unlock include/linux/rcupdate.h:688 [inline]
> > __unlock_page_memcg+0x72/0x100 mm/memcontrol.c:1654
> > unlock_page_memcg+0x2c/0x40 mm/memcontrol.c:1663
> > page_remove_file_rmap mm/rmap.c:1248 [inline]
> > page_remove_rmap+0x6f2/0x1250 mm/rmap.c:1299
> > zap_pte_range mm/memory.c:1337 [inline]
> > zap_pmd_range mm/memory.c:1441 [inline]
> > zap_pud_range mm/memory.c:1470 [inline]
> > zap_p4d_range mm/memory.c:1491 [inline]
> > unmap_page_range+0xeb4/0x2200 mm/memory.c:1512
> > unmap_single_vma+0x1a0/0x310 mm/memory.c:1557
> > unmap_vmas+0x120/0x1f0 mm/memory.c:1587
> > exit_mmap+0x265/0x570 mm/mmap.c:3038
> > __mmput kernel/fork.c:962 [inline]
> > mmput+0x251/0x610 kernel/fork.c:983
> > exit_mm kernel/exit.c:544 [inline]
> > do_exit+0xe98/0x2730 kernel/exit.c:852
> > do_group_exit+0x16f/0x430 kernel/exit.c:968
> > get_signal+0x886/0x1960 kernel/signal.c:2469
> > do_signal+0x98/0x2040 arch/x86/kernel/signal.c:810
> > exit_to_usermode_loop+0x28a/0x310 arch/x86/entry/common.c:162
> > prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
> > syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
> > do_syscall_64+0x792/0x9d0 arch/x86/entry/common.c:292
> > entry_SYSCALL_64_after_hwframe+0x42/0xb7
> > RIP: 0033:0x455319
> > RSP: 002b:00007fa346e81ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> > RAX: fffffffffffffe00 RBX: 000000000072bf80 RCX: 0000000000455319
> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000072bf80
> > RBP: 000000000072bf80 R08: 0000000000000000 R09: 000000000072bf58
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > R13: 0000000000a3e81f R14: 00007fa346e829c0 R15: 0000000000000001
> >
> >
> > ---
> > This bug is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@googlegroups.com.
> >
> > syzbot will keep track of this bug report.
> > If you forgot to add the Reported-by tag, once the fix for this bug is
> > merged
> > into any tree, please reply to this email with:
> > #syz fix: exact-commit-title
> > To mark this as a duplicate of another syzbot report, please reply with:
> > #syz dup: exact-subject-of-another-report
> > If it's a one-off invalid bug report, please reply with:
> > #syz invalid
> > Note: if the crash happens again, it will cause creation of a new bug
> > report.
> > Note: all commands must start from beginning of the line in the email body.
> >
^ permalink raw reply
* Re: [PATCH v6 1/6] net: phy: at803x: Export at803x_debug_reg_mask()
From: Paul Burton @ 2018-05-11 18:38 UTC (permalink / raw)
To: Andrew Lunn, Darren Hart; +Cc: netdev, linux-mips, David S . Miller
In-Reply-To: <20180511182502.y74wm6dmtf3dbcln@pburton-laptop>
On Fri, May 11, 2018 at 11:25:02AM -0700, Paul Burton wrote:
> Hi Andrew,
>
> On Fri, May 11, 2018 at 02:26:19AM +0200, Andrew Lunn wrote:
> > On Thu, May 10, 2018 at 04:16:52PM -0700, Paul Burton wrote:
> > > From: Andrew Lunn <andrew@lunn.ch>
> > >
> > > On some boards, this PHY has a problem when it hibernates. Export this
> > > function so a board can register a PHY fixup to disable hibernation.
> >
> > What do you know about the problem?
> >
> > https://patchwork.ozlabs.org/patch/686371/
> >
> > I don't remember how it was solved, but you should probably do the
> > same.
> >
> > Andrew
>
> I'm afraid I don't know much about the problem - this one is your patch
> entirely unchanged, and I don't have access to the hardware in question
> (my board uses a Realtek RTL8211E PHY).
>
> I presume you did this because the pch_gbe driver as-is in mainline
> disables hibernation for the AR803X PHY found on the MinnowBoard, so
> this would be preserving the existing behaviour of the driver?
>
> That behaviour was introduced by commit f1a26fdf5944f ("pch_gbe: Add
> MinnowBoard support"), so perhaps Darren as its author might know more?
>
> My presumption would be that this is done to ensure that the PHY is
> always providing the RX clock, which the EG20T manual says is required
> for the MAC reset register RX_RST & ALL_RST bits to clear. We wait for
> those using the call to pch_gbe_wait_clr_bit() in
> pch_gbe_mac_reset_hw(), which happens before we initialize the PHY.
>
> I could reorder the probe function a little to initialize the PHY before
> performing the MAC reset, drop this patch and the AR803X hibernation
> stuff from patch 2 if you like. But again, I can't actually test the
> result on the affected hardware.
>
> Thanks,
> Paul
I got an undeliverable response using Darren's email address from the
commit referenced above, so updating to the latest address I see for him
in git history.
Thanks,
Paul
^ permalink raw reply
* [PATCH net 1/1] net sched actions: fix refcnt leak in skbmod
From: Roman Mashak @ 2018-05-11 18:35 UTC (permalink / raw)
To: davem; +Cc: netdev, kernel, jhs, xiyou.wangcong, jiri, Roman Mashak
When an application fails to pass flags in the netlink TLV when replacing
an existing skbmod action, the kernel will leak refcnt:
$ tc actions get action skbmod index 1
total acts 0
action order 0: skbmod pipe set smac 00:11:22:33:44:55
index 1 ref 1 bind 0
For example, at this point a buggy application replaces the action with
index 1 with new smac 00:aa:22:33:44:55. It fails because of zero flags,
but the refcnt still gets bumped:
$ tc actions get actions skbmod index 1
total acts 0
action order 0: skbmod pipe set smac 00:11:22:33:44:55
index 1 ref 2 bind 0
$
The patch fixes this by calling tcf_idr_release() on existing actions.
Fixes: 86da71b57383d ("net_sched: Introduce skbmod action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
---
net/sched/act_skbmod.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
index bbcbdce732cc..ad050d7d4b46 100644
--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -131,8 +131,11 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
if (exists && bind)
return 0;
- if (!lflags)
+ if (!lflags) {
+ if (exists)
+ tcf_idr_release(*a, bind);
return -EINVAL;
+ }
if (!exists) {
ret = tcf_idr_create(tn, parm->index, est, a,
--
2.7.4
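
The error-path pattern in the hunk above (drop the reference taken by the
lookup before returning -EINVAL) can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: struct toy_action,
toy_lookup(), toy_release() and toy_replace() are hypothetical stand-ins, not
the kernel's tcf_idr_* API, and the success path is simplified to "update,
then drop the lookup reference".

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an action carrying a reference count (hypothetical). */
struct toy_action {
	int refcnt;
};

/* Hypothetical lookup: takes a reference when the action exists. */
static bool toy_lookup(struct toy_action *act)
{
	if (!act)
		return false;
	act->refcnt++;
	return true;
}

/* Hypothetical release: drops the reference taken by toy_lookup(). */
static void toy_release(struct toy_action *act)
{
	act->refcnt--;
}

/*
 * Replace path modelled on the patched check: when flags are missing,
 * the lookup reference is dropped before returning -EINVAL, so a failed
 * replace leaves the count unchanged.
 */
static int toy_replace(struct toy_action *act, unsigned int lflags)
{
	bool exists = toy_lookup(act);

	if (!lflags) {
		if (exists)
			toy_release(act);
		return -EINVAL;
	}

	/* ... the update itself would go here (omitted in this sketch) ... */
	if (exists)
		toy_release(act);
	return 0;
}
```

Without the toy_release() call in the !lflags branch, each failed replace
would leave refcnt one higher, which corresponds to the "ref 1" to "ref 2"
change shown in the tc output above.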
^ permalink raw reply related