* Re: 3.12.33 - BUG xfrm_selector_match+0x25/0x2f6
From: Julian Anastasov @ 2014-12-07 18:27 UTC (permalink / raw)
To: Smart Weblications GmbH - Florian Wiessner
Cc: Steffen Klassert, netdev, LKML, stable, Simon Horman, lvs-devel
In-Reply-To: <5481B944.2000002@smart-weblications.de>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 421 bytes --]
Hello,
On Fri, 5 Dec 2014, Smart Weblications GmbH - Florian Wiessner wrote:
> thank you for the fast responses! I would like to test any patch for 3.12.
I'm attaching a patch that avoids rerouting in
IPVS for LOCAL_IN. Please test it in your setup. My tests
were with NAT on today's net tree. I checked that it
compiles for 3.12.33. You can use the default snat_reroute=1.
Regards
--
Julian Anastasov <ja@ssi.bg>
[-- Attachment #2: patch --]
[-- Type: TEXT/plain, Size: 4336 bytes --]
From 4fc493f8f1ed967b1e3dd6d330a25bad762516d7 Mon Sep 17 00:00:00 2001
From: Julian Anastasov <ja@ssi.bg>
Date: Sun, 7 Dec 2014 18:13:24 +0200
Subject: [PATCH net] ipvs: rerouting to local clients is not needed anymore
commit f5a41847acc5 ("ipvs: move ip_route_me_harder for ICMP")
from 2.6.37 introduced ip_route_me_harder() call for responses to
local clients, so that we can provide valid rt_src after SNAT.
It was used by TCP to provide valid daddr for ip_send_reply().
After commit 0a5ebb8000c5 ("ipv4: Pass explicit daddr arg to
ip_send_reply().") from 3.0 this rerouting is not needed anymore
and should be avoided, especially in LOCAL_IN.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
net/netfilter/ipvs/ip_vs_core.c | 33 ++++++++++++++++++++++-----------
1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 990decb..b87ca32 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -659,16 +659,24 @@ static inline int ip_vs_gather_frags(struct sk_buff *skb, u_int32_t user)
return err;
}
-static int ip_vs_route_me_harder(int af, struct sk_buff *skb)
+static int ip_vs_route_me_harder(int af, struct sk_buff *skb,
+ unsigned int hooknum)
{
+ if (!sysctl_snat_reroute(skb))
+ return 0;
+ /* Reroute replies only to remote clients (FORWARD and LOCAL_OUT) */
+ if (NF_INET_LOCAL_IN == hooknum)
+ return 0;
#ifdef CONFIG_IP_VS_IPV6
if (af == AF_INET6) {
- if (sysctl_snat_reroute(skb) && ip6_route_me_harder(skb) != 0)
+ struct dst_entry *dst = skb_dst(skb);
+
+ if (dst->dev && !(dst->dev->flags & IFF_LOOPBACK) &&
+ ip6_route_me_harder(skb) != 0)
return 1;
} else
#endif
- if ((sysctl_snat_reroute(skb) ||
- skb_rtable(skb)->rt_flags & RTCF_LOCAL) &&
+ if (!(skb_rtable(skb)->rt_flags & RTCF_LOCAL) &&
ip_route_me_harder(skb, RTN_LOCAL) != 0)
return 1;
@@ -791,7 +799,8 @@ static int handle_response_icmp(int af, struct sk_buff *skb,
union nf_inet_addr *snet,
__u8 protocol, struct ip_vs_conn *cp,
struct ip_vs_protocol *pp,
- unsigned int offset, unsigned int ihl)
+ unsigned int offset, unsigned int ihl,
+ unsigned int hooknum)
{
unsigned int verdict = NF_DROP;
@@ -821,7 +830,7 @@ static int handle_response_icmp(int af, struct sk_buff *skb,
#endif
ip_vs_nat_icmp(skb, pp, cp, 1);
- if (ip_vs_route_me_harder(af, skb))
+ if (ip_vs_route_me_harder(af, skb, hooknum))
goto out;
/* do the statistics and put it back */
@@ -916,7 +925,7 @@ static int ip_vs_out_icmp(struct sk_buff *skb, int *related,
snet.ip = iph->saddr;
return handle_response_icmp(AF_INET, skb, &snet, cih->protocol, cp,
- pp, ciph.len, ihl);
+ pp, ciph.len, ihl, hooknum);
}
#ifdef CONFIG_IP_VS_IPV6
@@ -981,7 +990,8 @@ static int ip_vs_out_icmp_v6(struct sk_buff *skb, int *related,
snet.in6 = ciph.saddr.in6;
writable = ciph.len;
return handle_response_icmp(AF_INET6, skb, &snet, ciph.protocol, cp,
- pp, writable, sizeof(struct ipv6hdr));
+ pp, writable, sizeof(struct ipv6hdr),
+ hooknum);
}
#endif
@@ -1040,7 +1050,8 @@ static inline bool is_new_conn(const struct sk_buff *skb,
*/
static unsigned int
handle_response(int af, struct sk_buff *skb, struct ip_vs_proto_data *pd,
- struct ip_vs_conn *cp, struct ip_vs_iphdr *iph)
+ struct ip_vs_conn *cp, struct ip_vs_iphdr *iph,
+ unsigned int hooknum)
{
struct ip_vs_protocol *pp = pd->pp;
@@ -1078,7 +1089,7 @@ handle_response(int af, struct sk_buff *skb, struct ip_vs_proto_data *pd,
* if it came from this machine itself. So re-compute
* the routing information.
*/
- if (ip_vs_route_me_harder(af, skb))
+ if (ip_vs_route_me_harder(af, skb, hooknum))
goto drop;
IP_VS_DBG_PKT(10, af, pp, skb, 0, "After SNAT");
@@ -1181,7 +1192,7 @@ ip_vs_out(unsigned int hooknum, struct sk_buff *skb, int af)
cp = pp->conn_out_get(af, skb, &iph, 0);
if (likely(cp))
- return handle_response(af, skb, pd, cp, &iph);
+ return handle_response(af, skb, pd, cp, &iph, hooknum);
if (sysctl_nat_icmp_send(net) &&
(pp->protocol == IPPROTO_TCP ||
pp->protocol == IPPROTO_UDP ||
--
1.9.3
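For readers following the patch, the new decision in the IPv4 branch of ip_vs_route_me_harder() can be modelled in plain userspace C roughly as below. This is only a sketch under stated assumptions: the enum, the boolean parameters and the name needs_reroute() are made up for illustration, while the real code reads the snat_reroute sysctl and skb_rtable(skb)->rt_flags.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the netfilter hook numbers; not the kernel enum. */
enum hook { HOOK_LOCAL_IN, HOOK_FORWARD, HOOK_LOCAL_OUT };

/*
 * Model of the IPv4 branch after the patch: reroute only when
 * snat_reroute is enabled, the packet is not seen on LOCAL_IN, and
 * the current route is not already a local one (RTCF_LOCAL).
 */
static bool needs_reroute(bool snat_reroute, enum hook hooknum,
                          bool route_is_local)
{
    if (!snat_reroute)
        return false;
    if (hooknum == HOOK_LOCAL_IN)
        return false;
    return !route_is_local;
}
```

In this model a reply on LOCAL_IN is never rerouted, whatever the sysctl says, which is the behaviour the commit message asks for.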
* Re: [PATCH] netfilter: Fix build for NETFILTER_XT_TARGET_REDIRECT
From: Pablo Neira Ayuso @ 2014-12-07 18:23 UTC (permalink / raw)
To: Guenter Roeck; +Cc: netfilter-devel, coreteam, netdev, Arturo Borrero Gonzalez
In-Reply-To: <1417858919-10576-1-git-send-email-linux@roeck-us.net>
On Sat, Dec 06, 2014 at 01:41:59AM -0800, Guenter Roeck wrote:
> Fix:
>
> ERROR: "nf_nat_redirect_ipv6" [net/netfilter/xt_REDIRECT.ko] undefined!
>
> Seen if NETFILTER_XT_TARGET_REDIRECT is configured but NF_NAT_IPV6
> is not, since code compiled with NF_NAT_REDIRECT_IPV6 is used
> unconditionally by code enabled with NETFILTER_XT_TARGET_REDIRECT.
> This means that NETFILTER_XT_TARGET_REDIRECT depends on NF_NAT_IPV6
> and must always select NF_NAT_REDIRECT_IPV6.
Thanks for your patch. However, we decided to resolve this by
combining the IPv4 and IPv6 nf_nat_redirect code into one module.
See b59eaf9 ("netfilter: combine IPv4 and IPv6 nf_nat_redirect code in one
module").
Let us know if you still hit problems after that patch. Thanks.
* Re: [PATCH 2/3] bridge: offload bridge port attributes to switch asic if feature flag set
From: Roopa Prabhu @ 2014-12-07 17:33 UTC (permalink / raw)
To: Arad, Ronen
Cc: Scott Feldman, Netdev, Jirí Pírko, Jamal Hadi Salim,
Benjamin LaHaise, Thomas Graf, john fastabend,
stephen@networkplumber.org, John Linville, nhorman@tuxdriver.com,
Nicolas Dichtel, vyasevic@redhat.com, Florian Fainelli,
buytenh@wantstofly.org, Aviad Raveh, David S. Miller,
shm@cumulusnetworks.com, Andy Gospodarek
In-Reply-To: <E4CD12F19ABA0C4D8729E087A761DC3505D84532@ORSMSX101.amr.corp.intel.com>
On 12/6/14, 12:05 AM, Arad, Ronen wrote:
>
>> -----Original Message-----
>> From: Scott Feldman [mailto:sfeldma@gmail.com]
>> Sent: Friday, December 05, 2014 10:29 PM
>> To: Arad, Ronen
>> Cc: Roopa Prabhu; Netdev; Jirí Pírko; Jamal Hadi Salim; Benjamin LaHaise;
>> Thomas Graf; john fastabend; stephen@networkplumber.org; John Linville;
>> nhorman@tuxdriver.com; Nicolas Dichtel; vyasevic@redhat.com; Florian
>> Fainelli; buytenh@wantstofly.org; Aviad Raveh; David S. Miller;
>> shm@cumulusnetworks.com; Andy Gospodarek
>> Subject: Re: [PATCH 2/3] bridge: offload bridge port attributes to switch asic
>> if feature flag set
>>
>> On Fri, Dec 5, 2014 at 5:04 PM, Arad, Ronen <ronen.arad@intel.com> wrote:
>>> I have another case of propagation which is not covered by the proposed
>> patch.
>>> A recent patch introduced default_pvid attribute for a bridge (so far
>> supported only via sysfs and not via netlink).
>>> When a port joins a bridge, it inherits a PVID from the default_pvid of the
>> bridge.
>>> The bridge driver propagates that to the newly created net_bridge_port.
>> This is done in br_vlan.c:
>>> int nbp_vlan_init(struct net_bridge_port *p) {
>>> int rc = 0;
>>>
>>> if (p->br->default_pvid) {
>>> rc = nbp_vlan_add(p, p->br->default_pvid,
>>> BRIDGE_VLAN_INFO_PVID |
>>> BRIDGE_VLAN_INFO_UNTAGGED);
>>> }
>>>
>>> return rc;
>>> }
>>>
>>> When L2 switching is offloaded to the HW, this PVID setting needs to be
>> propagated.
>>
>> Agreed, it would be nice to have it propagated down, but there is a non-ideal
>> work-around. If you set default_pvid=0 to turn off PVID, then the switch port
>> driver can pick some internal VLAN ID just for HW purposes in matching
>> untagged pkts. It's non-ideal because the switch port driver needs to reserve
>> a block of VLAN IDs for internal usage or use some other matching
>> mechanism to keep untagged pkts within this bridge.
> This work-around lets the administrator avoid using VID=1 as the default VLAN for untagged frames. However, it does not let the administrator pick a VID of her choice.
>
>> Better to have default_pvid value propagated down. But, default_pvid is a
>> per-bridge property, not a per-bridge-port property.
>> RTM_SETLINK/RTM_GETLINK for PF_BRIDGE does have AFSPEC for per-bridge
>> and PROTINFO for per-bridge-port, so it seems PVID needs to be part of
>> AFSPEC.
> I believe AFSPEC is not limited to per-bridge properties. It is per-bridge when the netlink msg's ifindex is that of a bridge and SELF flag is set.
> AFSPEC is for a port when the netlink msg's ifindex is that of an enslaved port device and MASTER flag is set (or neither MASTER nor SELF flag is set)
> PVID is one of the flags associated with a VID in bridge_vlan_info.
correct.
> default_pvid is not currently supported by netlink. A new IFLA_BRIDGE_DEFAULT_PVID could be introduced to carry this property when a nlmsg is directed at a bridge.
>
>
correct again. And yes, a netlink attribute to set default pvid is due.
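The AFSPEC scoping rule discussed above (per-bridge with SELF on a bridge ifindex, per-port with MASTER or no flag on an enslaved port ifindex) can be sketched as a small classifier. This is a userspace model only: the flag macros merely mirror BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF, the enum and classify_afspec() are hypothetical names, and the sketch covers just the two cases described in the thread.

```c
#include <stdbool.h>

/* Illustrative flag values mirroring BRIDGE_FLAGS_MASTER / BRIDGE_FLAGS_SELF. */
#define FLAG_MASTER 1u
#define FLAG_SELF   2u

enum afspec_scope { SCOPE_BRIDGE, SCOPE_PORT, SCOPE_UNKNOWN };

/* Decide whom the AFSPEC attributes of a PF_BRIDGE message apply to. */
static enum afspec_scope classify_afspec(bool ifindex_is_bridge,
                                         bool ifindex_is_port,
                                         unsigned int flags)
{
    bool master = flags & FLAG_MASTER;
    bool self = flags & FLAG_SELF;

    /* ifindex names a bridge and SELF is set: per-bridge properties. */
    if (ifindex_is_bridge && self)
        return SCOPE_BRIDGE;
    /* ifindex names an enslaved port, MASTER set or no flag at all. */
    if (ifindex_is_port && (master || (!master && !self)))
        return SCOPE_PORT;
    return SCOPE_UNKNOWN;
}
```

Under this reading, a per-bridge default_pvid attribute would travel in a message that classifies as SCOPE_BRIDGE.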
* [PATCH net-next] enic: add support for set/get rss hash key
From: Govindarajulu Varadarajan @ 2014-12-07 17:11 UTC (permalink / raw)
To: davem, netdev; +Cc: ssujith, benve, Govindarajulu Varadarajan
This patch adds support for setting/getting rss hash key using ethtool.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
---
drivers/net/ethernet/cisco/enic/enic.h | 2 ++
drivers/net/ethernet/cisco/enic/enic_ethtool.c | 30 ++++++++++++++++++++++++++
drivers/net/ethernet/cisco/enic/enic_main.c | 13 +++++++----
3 files changed, 41 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/cisco/enic/enic.h b/drivers/net/ethernet/cisco/enic/enic.h
index 5ba5ad0..25c4d88 100644
--- a/drivers/net/ethernet/cisco/enic/enic.h
+++ b/drivers/net/ethernet/cisco/enic/enic.h
@@ -187,6 +187,7 @@ struct enic {
unsigned int cq_count;
struct enic_rfs_flw_tbl rfs_h;
u32 rx_copybreak;
+ u8 rss_key[ENIC_RSS_LEN];
};
static inline struct device *enic_get_dev(struct enic *enic)
@@ -246,5 +247,6 @@ int enic_sriov_enabled(struct enic *enic);
int enic_is_valid_vf(struct enic *enic, int vf);
int enic_is_dynamic(struct enic *enic);
void enic_set_ethtool_ops(struct net_device *netdev);
+int __enic_set_rsskey(struct enic *enic);
#endif /* _ENIC_H_ */
diff --git a/drivers/net/ethernet/cisco/enic/enic_ethtool.c b/drivers/net/ethernet/cisco/enic/enic_ethtool.c
index 85173d6..fe5ae8f 100644
--- a/drivers/net/ethernet/cisco/enic/enic_ethtool.c
+++ b/drivers/net/ethernet/cisco/enic/enic_ethtool.c
@@ -23,6 +23,7 @@
#include "enic.h"
#include "enic_dev.h"
#include "enic_clsf.h"
+#include "vnic_rss.h"
struct enic_stat {
char name[ETH_GSTRING_LEN];
@@ -416,6 +417,32 @@ static int enic_set_tunable(struct net_device *dev,
return ret;
}
+static u32 enic_get_rxfh_key_size(struct net_device *netdev)
+{
+ return ENIC_RSS_LEN;
+}
+
+static int enic_get_rxfh(struct net_device *netdev, u32 *indir, u8 *hkey)
+{
+ struct enic *enic = netdev_priv(netdev);
+
+ if (hkey)
+ memcpy(hkey, enic->rss_key, ENIC_RSS_LEN);
+
+ return 0;
+}
+
+static int enic_set_rxfh(struct net_device *netdev, const u32 *indir,
+ const u8 *hkey)
+{
+ struct enic *enic = netdev_priv(netdev);
+
+ if (hkey)
+ memcpy(enic->rss_key, hkey, ENIC_RSS_LEN);
+
+ return __enic_set_rsskey(enic);
+}
+
static const struct ethtool_ops enic_ethtool_ops = {
.get_settings = enic_get_settings,
.get_drvinfo = enic_get_drvinfo,
@@ -430,6 +457,9 @@ static const struct ethtool_ops enic_ethtool_ops = {
.get_rxnfc = enic_get_rxnfc,
.get_tunable = enic_get_tunable,
.set_tunable = enic_set_tunable,
+ .get_rxfh_key_size = enic_get_rxfh_key_size,
+ .get_rxfh = enic_get_rxfh,
+ .set_rxfh = enic_set_rxfh,
};
void enic_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 86ee350..868d0f6 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -1888,11 +1888,10 @@ static int enic_dev_hang_reset(struct enic *enic)
return err;
}
-static int enic_set_rsskey(struct enic *enic)
+int __enic_set_rsskey(struct enic *enic)
{
union vnic_rss_key *rss_key_buf_va;
dma_addr_t rss_key_buf_pa;
- u8 rss_key[ENIC_RSS_LEN];
int i, kidx, bidx, err;
rss_key_buf_va = pci_zalloc_consistent(enic->pdev,
@@ -1901,11 +1900,10 @@ static int enic_set_rsskey(struct enic *enic)
if (!rss_key_buf_va)
return -ENOMEM;
- netdev_rss_key_fill(rss_key, ENIC_RSS_LEN);
for (i = 0; i < ENIC_RSS_LEN; i++) {
kidx = i / ENIC_RSS_BYTES_PER_KEY;
bidx = i % ENIC_RSS_BYTES_PER_KEY;
- rss_key_buf_va->key[kidx].b[bidx] = rss_key[i];
+ rss_key_buf_va->key[kidx].b[bidx] = enic->rss_key[i];
}
spin_lock_bh(&enic->devcmd_lock);
err = enic_set_rss_key(enic,
@@ -1919,6 +1917,13 @@ static int enic_set_rsskey(struct enic *enic)
return err;
}
+static int enic_set_rsskey(struct enic *enic)
+{
+ netdev_rss_key_fill(enic->rss_key, ENIC_RSS_LEN);
+
+ return __enic_set_rsskey(enic);
+}
+
static int enic_set_rsscpu(struct enic *enic, u8 rss_hash_bits)
{
dma_addr_t rss_cpu_buf_pa;
--
2.1.0
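The kidx/bidx loop in __enic_set_rsskey() scatters a flat key into fixed-size sub-keys. A standalone sketch of that packing follows. The sizes (40-byte key, 10 bytes per sub-key), the padding field and the names rss_key_buf and pack_rss_key() are assumptions for illustration; the driver takes the real layout from vnic_rss.h.

```c
#include <stdint.h>
#include <string.h>

/* Assumed sizes for illustration only. */
#define RSS_LEN            40
#define RSS_BYTES_PER_KEY  10
#define RSS_KEYS           (RSS_LEN / RSS_BYTES_PER_KEY)

/* Hypothetical device-side layout: sub-keys separated by padding. */
struct rss_key_buf {
    struct {
        uint8_t b[RSS_BYTES_PER_KEY];
        uint8_t pad[6];
    } key[RSS_KEYS];
};

/* Copy a flat RSS_LEN-byte key into the sub-key layout. */
static void pack_rss_key(struct rss_key_buf *buf, const uint8_t *flat)
{
    memset(buf, 0, sizeof(*buf));
    for (int i = 0; i < RSS_LEN; i++) {
        int kidx = i / RSS_BYTES_PER_KEY;   /* which sub-key */
        int bidx = i % RSS_BYTES_PER_KEY;   /* byte within that sub-key */

        buf->key[kidx].b[bidx] = flat[i];
    }
}
```

Keeping the flat key in struct enic, as the patch does, means the same packing routine can serve both the boot-time random key and a key supplied through ethtool.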
* [PATCH net-next v4 2/2] rocker: remove swdev mode
From: roopa @ 2014-12-07 17:09 UTC (permalink / raw)
To: jiri, sfeldma, jhs, bcrl, tgraf, john.fastabend, stephen,
linville, vyasevic
Cc: netdev, davem, shm, gospo, Roopa Prabhu
From: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
drivers/net/ethernet/rocker/rocker.c | 18 +-----------------
include/linux/rtnetlink.h | 2 +-
net/core/rtnetlink.c | 12 +++++++++---
3 files changed, 11 insertions(+), 21 deletions(-)
diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
index fded127..64bf78c 100644
--- a/drivers/net/ethernet/rocker/rocker.c
+++ b/drivers/net/ethernet/rocker/rocker.c
@@ -3700,27 +3700,11 @@ static int rocker_port_bridge_setlink(struct net_device *dev,
{
struct rocker_port *rocker_port = netdev_priv(dev);
struct nlattr *protinfo;
- struct nlattr *afspec;
struct nlattr *attr;
- u16 mode;
int err;
protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg),
IFLA_PROTINFO);
- afspec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
-
- if (afspec) {
- attr = nla_find_nested(afspec, IFLA_BRIDGE_MODE);
- if (attr) {
- if (nla_len(attr) < sizeof(mode))
- return -EINVAL;
-
- mode = nla_get_u16(attr);
- if (mode != BRIDGE_MODE_SWDEV)
- return -EINVAL;
- }
- }
-
if (protinfo) {
attr = nla_find_nested(protinfo, IFLA_BRPORT_LEARNING);
if (attr) {
@@ -3755,7 +3739,7 @@ static int rocker_port_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
u32 filter_mask)
{
struct rocker_port *rocker_port = netdev_priv(dev);
- u16 mode = BRIDGE_MODE_SWDEV;
+ s16 mode = -1;
u32 mask = BR_LEARNING | BR_LEARNING_SYNC;
return ndo_dflt_bridge_getlink(skb, pid, seq, dev, mode,
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 3b04190..dcfa06b 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -103,6 +103,6 @@ extern int ndo_dflt_fdb_del(struct ndmsg *ndm,
u16 vid);
extern int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
- struct net_device *dev, u16 mode,
+ struct net_device *dev, s16 mode,
u32 flags, u32 mask);
#endif /* __LINUX_RTNETLINK_H */
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 61cb7e7..b4e04b9 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2696,7 +2696,7 @@ static int brport_nla_put_flag(struct sk_buff *skb, u32 flags, u32 mask,
}
int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
- struct net_device *dev, u16 mode,
+ struct net_device *dev, s16 mode,
u32 flags, u32 mask)
{
struct nlmsghdr *nlh;
@@ -2734,11 +2734,17 @@ int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
if (!br_afspec)
goto nla_put_failure;
- if (nla_put_u16(skb, IFLA_BRIDGE_FLAGS, BRIDGE_FLAGS_SELF) ||
- nla_put_u16(skb, IFLA_BRIDGE_MODE, mode)) {
+ if (nla_put_u16(skb, IFLA_BRIDGE_FLAGS, BRIDGE_FLAGS_SELF)) {
nla_nest_cancel(skb, br_afspec);
goto nla_put_failure;
}
+
+ if (mode >= 0) {
+ if (nla_put_u16(skb, IFLA_BRIDGE_MODE, mode)) {
+ nla_nest_cancel(skb, br_afspec);
+ goto nla_put_failure;
+ }
+ }
nla_nest_end(skb, br_afspec);
protinfo = nla_nest_start(skb, IFLA_PROTINFO | NLA_F_NESTED);
--
1.7.10.4
* [PATCH net-next v4 1/2] bridge: remove mode 'swdev'
From: roopa @ 2014-12-07 17:09 UTC (permalink / raw)
To: jiri, sfeldma, jhs, bcrl, tgraf, john.fastabend, stephen,
linville, vyasevic
Cc: netdev, davem, shm, gospo, Roopa Prabhu
From: Roopa Prabhu <roopa@cumulusnetworks.com>
swdev mode was introduced to indicate switchdev offloads
for bridging from user space. But the user can already
use BRIDGE_FLAGS_SELF to call directly into the
hw switch port driver, so swdev mode is no longer required.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
include/uapi/linux/if_bridge.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 296a556..da17e45 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -105,7 +105,6 @@ struct __fdb_entry {
#define BRIDGE_MODE_VEB 0 /* Default loopback mode */
#define BRIDGE_MODE_VEPA 1 /* 802.1Qbg defined VEPA mode */
-#define BRIDGE_MODE_SWDEV 2 /* Full switch device offload */
/* Bridge management nested attributes
* [IFLA_AF_SPEC] = {
--
1.7.10.4
* [PATCH net-next v4 0/2] remove bridge BRIDGE_MODE_SWDEV
From: roopa @ 2014-12-07 17:09 UTC (permalink / raw)
To: jiri, sfeldma, jhs, bcrl, tgraf, john.fastabend, stephen,
linville, vyasevic
Cc: netdev, davem, shm, gospo, Roopa Prabhu
From: Roopa Prabhu <roopa@cumulusnetworks.com>
Roopa Prabhu (2):
bridge: remove mode 'swdev'
rocker: remove swdev mode
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
drivers/net/ethernet/rocker/rocker.c | 18 +-----------------
include/linux/rtnetlink.h | 2 +-
include/uapi/linux/if_bridge.h | 1 -
net/core/rtnetlink.c | 12 +++++++++---
4 files changed, 11 insertions(+), 22 deletions(-)
--
1.7.10.4
* RE: [PATCH RFC] pci: Control whether VFs are probed on pci_enable_sriov
From: Yuval Mintz @ 2014-12-07 17:05 UTC (permalink / raw)
To: Eli Cohen, bhelgaas@google.com, David Miller
Cc: linux-pci, netdev, ogerlitz@mellanox.com, yevgenyp@mellanox.com,
Eli Cohen, Donald Dutile
In-Reply-To: <1417957693-24979-1-git-send-email-eli@mellanox.com>
[-- Attachment #1: Type: text/plain, Size: 684 bytes --]
>This can save host side resource usage by VF instances which would be
>eventually probed to VMs.
>Use a parameter to pci_enable_sriov to control that policy, and modify
>all current callers such that they retain the same functionality.
What's the end-game here? How eventually would this be controlled?
>Use a one shot flag on struct pci_device which is cleared after the
>first probe is ignored so subsequent attempts go through.
Does a one-shot flag suffice? E.g., consider assigning a VF to a VM and
then shutting down the VM. Assuming this feature is disabled,
the VF didn't appear on the hypervisor prior to the assignment but
will appear after its shutdown.
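The concern about the one-shot flag can be illustrated with a tiny state model. It is a sketch under assumptions: struct vf_dev, its fields and vf_probe() are hypothetical and do not correspond to the actual fields proposed for the PCI core.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-VF state; not the kernel structure. */
struct vf_dev {
    bool skip_first_probe;  /* one-shot: cleared once a probe was skipped */
    bool bound;             /* true once a host driver is bound */
};

/* Returns true when the host driver ends up bound to the VF. */
static bool vf_probe(struct vf_dev *vf)
{
    if (vf->skip_first_probe) {
        vf->skip_first_probe = false;   /* the flag is consumed here */
        return false;
    }
    vf->bound = true;
    return true;
}
```

In this model the probe triggered by pci_enable_sriov() is skipped, but any later probe, for instance when the VF returns to the host after a VM shuts down, goes through and binds the host driver.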
[-- Attachment #2: winmail.dat --]
[-- Type: application/ms-tnef, Size: 3596 bytes --]
* Re: [PATCH net-next v3 2/2] rocker: remove swdev mode
From: Roopa Prabhu @ 2014-12-07 16:55 UTC (permalink / raw)
To: Thomas Graf
Cc: jiri, sfeldma, jhs, bcrl, john.fastabend, stephen, linville,
vyasevic, netdev, davem, shm, gospo
In-Reply-To: <20141207081928.GA2215@casper.infradead.org>
On 12/7/14, 12:19 AM, Thomas Graf wrote:
> On 12/06/14 at 10:54pm, roopa@cumulusnetworks.com wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>> ---
>> drivers/net/ethernet/rocker/rocker.c | 18 +-----------------
>> include/linux/rtnetlink.h | 2 +-
>> net/core/rtnetlink.c | 12 +++++++++---
>> 3 files changed, 11 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/rocker/rocker.c b/drivers/net/ethernet/rocker/rocker.c
>> index fded127..9f1d256 100644
>> --- a/drivers/net/ethernet/rocker/rocker.c
>> +++ b/drivers/net/ethernet/rocker/rocker.c
>> @@ -3755,7 +3739,7 @@ static int rocker_port_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
>> u32 filter_mask)
>> {
>> struct rocker_port *rocker_port = netdev_priv(dev);
>> - u16 mode = BRIDGE_MODE_SWDEV;
>> + u16 mode = -1;
> ^^^
>
> I assume you meant s16
>
Yes :( ... I thought I had covered all places but missed this
one. Resubmitting.
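Why the type matters here: ndo_dflt_bridge_getlink() is meant to skip IFLA_BRIDGE_MODE when mode is negative, and a u16 can never be negative. The sketch below models the check in userspace C; emits_mode_u16() and emits_mode_s16() are hypothetical helpers standing in for the `mode >= 0` test with each parameter type.

```c
#include <stdbool.h>
#include <stdint.h>

/* With a u16, -1 has already wrapped to 65535 before the check sees it. */
static bool emits_mode_u16(uint16_t mode)
{
    int promoted = mode;    /* integer promotion, as in `mode >= 0` */

    return promoted >= 0;   /* always true for an unsigned 16-bit value */
}

/* With an s16, -1 stays negative and the attribute is skipped. */
static bool emits_mode_s16(int16_t mode)
{
    return mode >= 0;
}
```

So a caller that stores -1 in a u16 would still have the mode attribute emitted, with a bogus value of 65535.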
* [PATCH net-next V1] net/mlx4_en: ethtool force speed when asking for autoneg=off
From: Amir Vadai @ 2014-12-07 16:27 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Or Gerlitz, Amir Vadai, Yevgeny Petrilin, Saeed Mahameed
From: Saeed Mahameed <saeedm@mellanox.com>
Use cmd->autoneg == AUTONEG_DISABLE as a user hint to force a specific speed.
We don't want to rely on ethtool to calculate advertised link modes when
forcing a specific speed. Instead, a user can request a specific speed and
specify "autoneg off" in the ethtool command as a hint to force this speed.
Move en_warn("port reset..") inside the "port reset" block.
Fixes: d48b3ab ("net/mlx4_en: Use PTYS register to set ethtool settings (Speed)")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
index c45e06a..06752e4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
@@ -771,13 +771,13 @@ static int mlx4_en_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
}
proto_admin = cpu_to_be32(ptys_adv);
- if (speed >= 0 && speed != priv->port_state.link_speed)
+ if (speed >= 0 && (speed != priv->port_state.link_speed ||
+ cmd->autoneg == AUTONEG_DISABLE))
/* If speed was set then speed decides :-) */
proto_admin = speed_set_ptys_admin(priv, speed,
ptys_reg.eth_proto_cap);
proto_admin &= ptys_reg.eth_proto_cap;
-
if (proto_admin == ptys_reg.eth_proto_admin)
return 0; /* Nothing to change */
@@ -798,9 +798,9 @@ static int mlx4_en_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
return ret;
}
- en_warn(priv, "Port link mode changed, restarting port...\n");
mutex_lock(&priv->mdev->state_lock);
if (priv->port_up) {
+ en_warn(priv, "Port link mode changed, restarting port...\n");
mlx4_en_stop_port(dev, 1);
if (mlx4_en_start_port(dev))
en_err(priv, "Failed restarting port %d\n", priv->port);
--
1.9.3
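The changed condition in mlx4_en_set_settings() can be read as a small predicate. The sketch below is a userspace model; speed_decides() and its parameters are hypothetical names, with autoneg_off standing for cmd->autoneg == AUTONEG_DISABLE.

```c
#include <stdbool.h>

/*
 * Model of the patched condition: a requested speed overrides the
 * advertised modes when it differs from the current link speed, or
 * when the user also asked for autoneg off.
 */
static bool speed_decides(int speed, int current_link_speed, bool autoneg_off)
{
    return speed >= 0 &&
           (speed != current_link_speed || autoneg_off);
}
```

Before the patch, a request for the speed the link already runs at fell through to the ethtool-computed advertising mask even with autoneg off; in this model that case now returns true.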
* Re: Where exactly will arch_fast_hash be used
From: Hannes Frederic Sowa @ 2014-12-07 14:06 UTC (permalink / raw)
To: George Spelvin
Cc: herbert, davem, dborkman, linux-kernel, netdev, tgraf, tytso
In-Reply-To: <20141207132305.24691.qmail@ns.horizon.com>
On So, 2014-12-07 at 08:23 -0500, George Spelvin wrote:
> So there are plenty of hash tables in Linux that you don't dare use this
> with. In fact, so many that, as you rightly point out, it's not clear
> if it's worth providing this special optimization for the few remaining.
In case of openvswitch it shows a performance improvement. The seed
parameter could be used as an initial biasing of the crc32 function, but
in case of openvswitch it is only set to 0.
Bye,
Hannes
* Re: Where exactly will arch_fast_hash be used
From: Hannes Frederic Sowa @ 2014-12-07 13:52 UTC (permalink / raw)
To: George Spelvin; +Cc: dborkman, herbert, linux-kernel, netdev, tgraf
In-Reply-To: <1417959696.17658.37.camel@localhost>
On So, 2014-12-07 at 14:41 +0100, Hannes Frederic Sowa wrote:
> On So, 2014-12-07 at 08:30 -0500, George Spelvin wrote:
> > Thanks for the encouragement!
> >
> > > Please consider xfs, too.
> > > AFAIK xfs doesn't seed their hashing so far and the hashing function is
> > > pretty weak. One example:
> > > http://marc.info/?l=linux-xfs&m=139590613002926&w=2
> >
> > Is that something that *can* be changed without breaking the
> > disk format? SipHash is explicitly *not* designed to be secure as
> > an unkeyed hash in the way that SHA-type algorithms are.
>
> I did some research and it looked like it would need a change to the
> disk format but it should be doable by incrementing the super block
> version, so at least newly created filesystem would benefit from it.
>
> > What it's designed to do is provide second preimage resistance
> > of its output, or a function (like modular reduction) of its output,
> > against an attacker who doesn't know the secret seed.
> >
> > > Ack. If we want to use it in the networking stack we should be able to
> > > use it without a dependency to the crypto framework.
> >
> > Already understood. My big question is whether a single function call
> > is okay or we need something inlinable.
>
> Like md5_transform, I think a non-inline function would be just fine.
> Otherwise kernel code size would increase. Most hash users in the
> network stack mostly deal with less bytes of input than one round needs.
Of course, if it looks feasible (from a performance PoV, but I doubt
that) to migrate the current jhash users to siphash, it might be worth
dealing with larger input sizes and maybe also doing it inline. But that
very much depends on the code size it would add. Currently we use jhash
as the non-linear "secure" hashing functions at most places.
Also rhashtable takes a pointer to the hashing function, thus causing gcc
to generate a function in each compilation unit if it would be static
inline.
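The point about function pointers can be seen in a minimal userspace sketch. All names here (toy_hash, table_params, table_hash) are hypothetical and the hash itself is a toy, not jhash or siphash. Once the address of a static inline function is stored in a structure, the compiler has to emit an out-of-line copy in that compilation unit, and calls through the pointer are generally not inlined.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy hash with the shape (data, length, seed) -> u32; illustration only. */
static inline uint32_t toy_hash(const void *data, size_t len, uint32_t seed)
{
    const uint8_t *p = data;
    uint32_t h = seed;

    while (len--)
        h = h * 31u + *p++;
    return h;
}

/* Table parameters carrying a pointer to the hash function. */
struct table_params {
    uint32_t (*hashfn)(const void *data, size_t len, uint32_t seed);
    uint32_t seed;
};

static uint32_t table_hash(const struct table_params *tp,
                           const void *key, size_t len)
{
    /* Indirect call: the inline hint on toy_hash no longer helps here. */
    return tp->hashfn(key, len, tp->seed);
}
```

A single non-inline function shared by all users avoids one such copy per compilation unit.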
* Re: Where exactly will arch_fast_hash be used
From: Hannes Frederic Sowa @ 2014-12-07 13:41 UTC (permalink / raw)
To: George Spelvin; +Cc: dborkman, herbert, linux-kernel, netdev, tgraf
In-Reply-To: <20141207133056.25209.qmail@ns.horizon.com>
On So, 2014-12-07 at 08:30 -0500, George Spelvin wrote:
> Thanks for the encouragement!
>
> > Please consider xfs, too.
> > AFAIK xfs doesn't seed their hashing so far and the hashing function is
> > pretty weak. One example:
> > http://marc.info/?l=linux-xfs&m=139590613002926&w=2
>
> Is that something that *can* be changed without breaking the
> disk format? SipHash is explicitly *not* designed to be secure as
> an unkeyed hash in the way that SHA-type algorithms are.
I did some research and it looked like it would need a change to the
disk format but it should be doable by incrementing the super block
version, so at least newly created filesystem would benefit from it.
> What it's designed to do is provide second preimage resistance
> of its output, or a function (like modular reduction) of its output,
> against an attacker who doesn't know the secret seed.
>
> > Ack. If we want to use it in the networking stack we should be able to
> > use it without a dependency to the crypto framework.
>
> Already understood. My big question is whether a single function call
> is okay or we need something inlinable.
Like md5_transform, I think a non-inline function would be just fine.
Otherwise kernel code size would increase. Most hash users in the
network stack mostly deal with less bytes of input than one round needs.
Bye,
Hannes
* Re: Where exactly will arch_fast_hash be used
From: George Spelvin @ 2014-12-07 13:30 UTC (permalink / raw)
To: hannes, linux; +Cc: dborkman, herbert, linux-kernel, netdev, tgraf
In-Reply-To: <1417958080.17658.32.camel@localhost>
Thanks for the encouragement!
> Please consider xfs, too.
> AFAIK xfs doesn't seed their hashing so far and the hashing function is
> pretty weak. One example:
> http://marc.info/?l=linux-xfs&m=139590613002926&w=2
Is that something that *can* be changed without breaking the
disk format? SipHash is explicitly *not* designed to be secure as
an unkeyed hash in the way that SHA-type algorithms are.
What it's designed to do is provide second preimage resistance
of its output, or a function (like modular reduction) of its output,
against an attacker who doesn't know the secret seed.
> Ack. If we want to use it in the networking stack we should be able to
> use it without a dependency to the crypto framework.
Already understood. My big question is whether a single function call
is okay or we need something inlinable.
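For reference, a keyed hash behind a single non-inline function call could look roughly like the sketch below. It is a plain userspace rendering of SipHash-2-4 as described in the SipHash paper, not the interface being proposed for the kernel; the name siphash24() and its signature are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define ROTL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

#define SIPROUND(v0, v1, v2, v3) do {                                  \
    v0 += v1; v1 = ROTL64(v1, 13); v1 ^= v0; v0 = ROTL64(v0, 32);     \
    v2 += v3; v3 = ROTL64(v3, 16); v3 ^= v2;                          \
    v0 += v3; v3 = ROTL64(v3, 21); v3 ^= v0;                          \
    v2 += v1; v1 = ROTL64(v1, 17); v1 ^= v2; v2 = ROTL64(v2, 32);     \
} while (0)

/* Load 8 bytes as a little-endian 64-bit word. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* SipHash-2-4 of in[0..len) under the 128-bit secret key (k0, k1). */
uint64_t siphash24(const uint8_t *in, size_t len, uint64_t k0, uint64_t k1)
{
    uint64_t v0 = k0 ^ 0x736f6d6570736575ULL;
    uint64_t v1 = k1 ^ 0x646f72616e646f6dULL;
    uint64_t v2 = k0 ^ 0x6c7967656e657261ULL;
    uint64_t v3 = k1 ^ 0x7465646279746573ULL;
    uint64_t b = (uint64_t)len << 56;   /* length goes in the top byte */
    size_t full = len & ~(size_t)7;
    size_t i;

    /* Two compression rounds per full 8-byte word. */
    for (i = 0; i < full; i += 8) {
        uint64_t m = load_le64(in + i);

        v3 ^= m;
        SIPROUND(v0, v1, v2, v3);
        SIPROUND(v0, v1, v2, v3);
        v0 ^= m;
    }

    /* Remaining 0..7 bytes are packed below the length byte. */
    for (; i < len; i++)
        b |= (uint64_t)in[i] << (8 * (i & 7));

    v3 ^= b;
    SIPROUND(v0, v1, v2, v3);
    SIPROUND(v0, v1, v2, v3);
    v0 ^= b;

    /* Four finalization rounds. */
    v2 ^= 0xff;
    SIPROUND(v0, v1, v2, v3);
    SIPROUND(v0, v1, v2, v3);
    SIPROUND(v0, v1, v2, v3);
    SIPROUND(v0, v1, v2, v3);

    return v0 ^ v1 ^ v2 ^ v3;
}
```

A hash table would then reduce the 64-bit result to a bucket index; the security argument rests on (k0, k1) staying secret.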
* Re: Where exactly will arch_fast_hash be used
From: George Spelvin @ 2014-12-07 13:23 UTC (permalink / raw)
To: herbert, linux
Cc: davem, dborkman, hannes, linux-kernel, netdev, tgraf, tytso
In-Reply-To: <20141207125157.GA9745@gondor.apana.org.au>
>> How does this implicate the low bits specifically?
> If you can easily deduce the pre-images that make the last bit
> of the hash even or odd, then you've just cut your search space
> for collisions by half. The real killer is that you can do this
> without knowing what the secret is.
Um, yes, if you're in a situation where a hash collision DoS is possible,
a CRC is a disastrous choice. You can trivially find collisions for *all*
bits of a CRC. Low or high, they're all equally easy.
When you said
>>>>> Even if security wasn't an issue, straight CRC32 has really poor
>>>>> lower-order bit distribution, which makes it a terrible choice for
>>>>> a hash table that simply uses the lower-order bits.
This is talking about:
- Non-malicious inputs, where security isn't an issue, and
- Low-order bits specifically, implying that the high-order bits are different.
*That's* the claim I'm curious about. I know perfectly well that if
security *is* an issue, a fixed-polynomial CRC is a disaster.
But for non-malicious inputs, like normal software identifiers, a CRC
actually works very well.
If you want to do secure hashing with a CRC, you need to have a secret
*polynomial*. That *is* provably secure (it's a universal family of hash
functions), but isn't provided by x86 unless you use SSE and PCLMUL.
That's why it's a non-cryptographic hash, suitable for non-malicious
inputs only. That's the same security claim as many other common hash
functions.
> Our entire scheme is dependent on using the secret to defeat
> would-be attackers. If CRC does not make effective use of the
> secret, then we're essentially naked against attackers.
Okay, I'm confused. *What* scheme? The arch_fast_hash interface doesn't
have any provision for a secret. Because there's no point to having one;
you can't change the polynomial, and anything additive just moves
collisions around without reducing them.
So there are plenty of hash tables in Linux that you don't dare use this
with. In fact, so many that, as you rightly point out, it's not clear
if it's worth providing this special optimization for the few remaining.
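The claim that a seed only moves collisions around can be checked with a few lines of C. The sketch uses a bitwise CRC32C (the polynomial implemented by the x86 crc32 instruction) with the seed as initial value; crc32c_seeded() and crc_diff() are hypothetical helper names. Because the CRC update is linear over GF(2), the XOR of the hashes of two equal-length inputs does not depend on the seed, so a pair that collides under one seed collides under every seed.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (reflected polynomial 0x82F63B78), seed as initial value. */
static uint32_t crc32c_seeded(uint32_t seed, const uint8_t *data, size_t len)
{
    uint32_t crc = seed;

    while (len--) {
        crc ^= *data++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* XOR of the two hashes; for equal-length inputs it is seed-independent. */
static uint32_t crc_diff(uint32_t seed, const uint8_t *a, const uint8_t *b,
                         size_t len)
{
    return crc32c_seeded(seed, a, len) ^ crc32c_seeded(seed, b, len);
}
```

Comparing crc_diff() for the same pair under several seeds gives identical values, which is why an attacker does not need to know the seed to construct colliding inputs.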
* Re: Where exactly will arch_fast_hash be used
From: Hannes Frederic Sowa @ 2014-12-07 13:14 UTC (permalink / raw)
To: George Spelvin; +Cc: herbert, dborkman, linux-kernel, netdev, tgraf
In-Reply-To: <20141207052041.20498.qmail@ns.horizon.com>
Hi,
On So, 2014-12-07 at 00:20 -0500, George Spelvin wrote:
> If you want DoS-resistant hash tables, I'm working on adding SipHash
> to the kernel.
>
> This is a keyed pseudo-random function designed specifically for that
> application. I am starting with ext4 directory hashes, and then intended
> to expand to secure sequence numbers (since it's far faster than MD5).
Please consider xfs, too.
AFAIK xfs doesn't seed their hashing so far and the hashing function is
pretty weak. One example:
http://marc.info/?l=linux-xfs&m=139590613002926&w=2
> (I'm trying to figure out a good interface, since the crypto API
> is a bit heavy for something so heavily optimized.)
Ack. If we want to use it in the networking stack we should be able to
use it without a dependency on the crypto framework.
Bye,
Hannes
^ permalink raw reply
* [PATCH RFC] pci: Control whether VFs are probed on pci_enable_sriov
From: Eli Cohen @ 2014-12-07 13:08 UTC (permalink / raw)
To: bhelgaas, davem
Cc: linux-pci, netdev, ogerlitz, yevgenyp, Eli Cohen, Donald Dutile
Sometimes it is not desirable to probe the virtual functions right away,
but rather to leave the decision to the host's administrator.
This saves host-side resources that would otherwise be consumed by VF
instances which will eventually be probed inside VMs.
Use a parameter to pci_enable_sriov to control that policy, and modify
all current callers such that they retain the same functionality.
Use a one-shot flag in struct pci_dev: the first probe of a VF is
ignored and the flag is then set so subsequent attempts go through.
Cc: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
---
This approach is used by the mlx5 driver SRIOV implementation, so
sending this to get feedback from the PCI and networking folks.
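The one-shot behaviour described above can be modelled in userspace as follows. This is only a sketch of the control flow the patch adds to pci_device_probe(): struct model_dev and the model_*() helpers are hypothetical stand-ins, the driver probe always succeeds here, and the pci_dev_get()/pci_dev_put() reference counting is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct pci_dev fields the patch consults. */
struct model_dev {
	int is_virtfn;   /* 1 for a virtual function */
	int probe_vf;    /* 0: skip the next probe attempt, then arm */
	int probe_calls; /* how often the driver probe actually ran */
};

/* Stand-in for __pci_device_probe(); always succeeds in this model. */
static int model_driver_probe(struct model_dev *dev)
{
	dev->probe_calls++;
	return 0;
}

/* Mirrors the branch structure the patch adds to pci_device_probe(). */
static int model_device_probe(struct model_dev *dev)
{
	int error = 0;

	assert(dev != NULL);
	if (!dev->is_virtfn || dev->probe_vf)
		error = model_driver_probe(dev);
	/* one shot: a VF whose probe was skipped is armed for next time */
	if (dev->is_virtfn && !dev->probe_vf)
		dev->probe_vf = 1;
	return error;
}

/* Two probe attempts on a fresh device; returns how many reached the driver. */
static int probes_after_two_attempts(int is_virtfn, int probe_vf)
{
	struct model_dev dev = { is_virtfn, probe_vf, 0 };

	model_device_probe(&dev);
	model_device_probe(&dev);
	return dev.probe_calls;
}
```

In this model a VF created with probe_vf == 0 reaches the driver only on the second attempt, while physical functions and VFs created with probe_vf != 0 are probed every time, which matches the existing callers that pass 1.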
drivers/misc/genwqe/card_base.c | 2 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c | 2 +-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 2 +-
drivers/net/ethernet/cisco/enic/enic_main.c | 2 +-
drivers/net/ethernet/emulex/benet/be_main.c | 2 +-
drivers/net/ethernet/intel/fm10k/fm10k_iov.c | 2 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 2 +-
drivers/net/ethernet/intel/igb/igb_main.c | 2 +-
drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 4 ++--
drivers/net/ethernet/mellanox/mlx4/main.c | 2 +-
drivers/net/ethernet/neterion/vxge/vxge-main.c | 2 +-
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c | 2 +-
drivers/net/ethernet/sfc/siena_sriov.c | 2 +-
drivers/pci/iov.c | 12 +++++++-----
drivers/pci/pci-driver.c | 11 ++++++++---
drivers/scsi/lpfc/lpfc_init.c | 2 +-
include/linux/pci.h | 5 +++--
17 files changed, 33 insertions(+), 25 deletions(-)
diff --git a/drivers/misc/genwqe/card_base.c b/drivers/misc/genwqe/card_base.c
index 4cf8f82cfca2..69253ca17506 100644
--- a/drivers/misc/genwqe/card_base.c
+++ b/drivers/misc/genwqe/card_base.c
@@ -1325,7 +1325,7 @@ static int genwqe_sriov_configure(struct pci_dev *dev, int numvfs)
if (numvfs > 0) {
genwqe_setup_vf_jtimer(cd);
- rc = pci_enable_sriov(dev, numvfs);
+ rc = pci_enable_sriov(dev, numvfs, 1);
if (rc < 0)
return rc;
return numvfs;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index c88b20af87df..773b20224a47 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -2570,7 +2570,7 @@ int bnx2x_enable_sriov(struct bnx2x *bp)
if (rc)
return rc;
- rc = pci_enable_sriov(bp->pdev, req_vfs);
+ rc = pci_enable_sriov(bp->pdev, req_vfs, 1);
if (rc) {
BNX2X_ERR("pci_enable_sriov failed with %d\n", rc);
return rc;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 3aea82bb9039..6e8afbfd3eba 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -6597,7 +6597,7 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
sriov:
#ifdef CONFIG_PCI_IOV
if (func < ARRAY_SIZE(num_vf) && num_vf[func] > 0)
- if (pci_enable_sriov(pdev, num_vf[func]) == 0)
+ if (pci_enable_sriov(pdev, num_vf[func], 1) == 0)
dev_info(&pdev->dev,
"instantiated %u virtual functions\n",
num_vf[func]);
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 86ee350e57f0..8a8b1d86f18a 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -2421,7 +2421,7 @@ static int enic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
pci_read_config_word(pdev, pos + PCI_SRIOV_TOTAL_VF,
&enic->num_vfs);
if (enic->num_vfs) {
- err = pci_enable_sriov(pdev, enic->num_vfs);
+ err = pci_enable_sriov(pdev, enic->num_vfs, 1);
if (err) {
dev_err(dev, "SRIOV enable failed, aborting."
" pci_enable_sriov() returned %d\n",
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index dc77ec2bdafd..a96491777ac4 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -3274,7 +3274,7 @@ static int be_vf_setup(struct be_adapter *adapter)
}
if (!old_vfs) {
- status = pci_enable_sriov(adapter->pdev, adapter->num_vfs);
+ status = pci_enable_sriov(adapter->pdev, adapter->num_vfs, 1);
if (status) {
dev_err(dev, "SRIOV enable failed\n");
adapter->num_vfs = 0;
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
index 060190864238..04a3dc5acc28 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_iov.c
@@ -408,7 +408,7 @@ int fm10k_iov_configure(struct pci_dev *pdev, int num_vfs)
*/
fm10k_disable_aer_comp_abort(pdev);
- err = pci_enable_sriov(pdev, num_vfs);
+ err = pci_enable_sriov(pdev, num_vfs, 1);
if (err) {
dev_err(&pdev->dev,
"Enable PCI SR-IOV failed: %d\n", err);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 668d860275d6..fe56e09725f2 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -852,7 +852,7 @@ int i40e_alloc_vfs(struct i40e_pf *pf, u16 num_alloc_vfs)
/* Check to see if we're just allocating resources for extant VFs */
if (pci_num_vf(pf->pdev) != num_alloc_vfs) {
- ret = pci_enable_sriov(pf->pdev, num_alloc_vfs);
+ ret = pci_enable_sriov(pf->pdev, num_alloc_vfs, 1);
if (ret) {
dev_err(&pf->pdev->dev,
"Failed to enable SR-IOV, error %d.\n", ret);
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 3c0221620c9d..da01326ef550 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2742,7 +2742,7 @@ static int igb_enable_sriov(struct pci_dev *pdev, int num_vfs)
/* only call pci_enable_sriov() if no VFs are allocated already */
if (!old_vfs) {
- err = pci_enable_sriov(pdev, adapter->vfs_allocated_count);
+ err = pci_enable_sriov(pdev, adapter->vfs_allocated_count, 1);
if (err)
goto err_out;
}
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 04eee7c7b653..74b33483a0d1 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -149,7 +149,7 @@ void ixgbe_enable_sriov(struct ixgbe_adapter *adapter)
*/
adapter->num_vfs = min_t(unsigned int, adapter->num_vfs, IXGBE_MAX_VFS_DRV_LIMIT);
- err = pci_enable_sriov(adapter->pdev, adapter->num_vfs);
+ err = pci_enable_sriov(adapter->pdev, adapter->num_vfs, 1);
if (err) {
e_err(probe, "Failed to enable PCI sriov: %d\n", err);
adapter->num_vfs = 0;
@@ -270,7 +270,7 @@ static int ixgbe_pci_sriov_enable(struct pci_dev *dev, int num_vfs)
for (i = 0; i < adapter->num_vfs; i++)
ixgbe_vf_configuration(dev, (i | 0x10000000));
- err = pci_enable_sriov(dev, num_vfs);
+ err = pci_enable_sriov(dev, num_vfs, 1);
if (err) {
e_dev_warn("Failed to enable PCI sriov: %d\n", err);
return err;
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 3044f9e623cb..ae38b556ec13 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2350,7 +2350,7 @@ static u64 mlx4_enable_sriov(struct mlx4_dev *dev, struct pci_dev *pdev,
existing_vfs, total_vfs);
} else {
mlx4_warn(dev, "Enabling SR-IOV with %d VFs\n", total_vfs);
- err = pci_enable_sriov(pdev, total_vfs);
+ err = pci_enable_sriov(pdev, total_vfs, 1);
}
if (err) {
mlx4_err(dev, "Failed to enable SR-IOV, continuing without SR-IOV (err = %d)\n",
diff --git a/drivers/net/ethernet/neterion/vxge/vxge-main.c b/drivers/net/ethernet/neterion/vxge/vxge-main.c
index cc0485e3c621..c341e73fc68c 100644
--- a/drivers/net/ethernet/neterion/vxge/vxge-main.c
+++ b/drivers/net/ethernet/neterion/vxge/vxge-main.c
@@ -4495,7 +4495,7 @@ vxge_probe(struct pci_dev *pdev, const struct pci_device_id *pre)
/* Enable SRIOV mode, if firmware has SRIOV support and if it is a PF */
if (is_sriov(function_mode) && !is_sriov_initialized(pdev) &&
(ll_config->intr_type != INTA)) {
- ret = pci_enable_sriov(pdev, num_vfs);
+ ret = pci_enable_sriov(pdev, num_vfs, 1);
if (ret)
vxge_debug_ll_config(VXGE_ERR,
"Failed in enabling SRIOV mode: %d\n", ret);
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
index a29538b86edf..b483705a1ef1 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
@@ -570,7 +570,7 @@ static int qlcnic_sriov_pf_enable(struct qlcnic_adapter *adapter, int num_vfs)
if (!qlcnic_sriov_enable_check(adapter))
return 0;
- err = pci_enable_sriov(adapter->pdev, num_vfs);
+ err = pci_enable_sriov(adapter->pdev, num_vfs, 1);
if (err)
qlcnic_sriov_pf_cleanup(adapter);
diff --git a/drivers/net/ethernet/sfc/siena_sriov.c b/drivers/net/ethernet/sfc/siena_sriov.c
index a8bbbad68a88..6804ed04cfcd 100644
--- a/drivers/net/ethernet/sfc/siena_sriov.c
+++ b/drivers/net/ethernet/sfc/siena_sriov.c
@@ -1332,7 +1332,7 @@ int efx_siena_sriov_init(struct efx_nic *efx)
/* At this point we must be ready to accept VFDI requests */
- rc = pci_enable_sriov(efx->pci_dev, efx->vf_count);
+ rc = pci_enable_sriov(efx->pci_dev, efx->vf_count, 1);
if (rc)
goto fail_pci;
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 4d109c07294a..f6aba5beea78 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -57,7 +57,7 @@ static void virtfn_remove_bus(struct pci_bus *physbus, struct pci_bus *virtbus)
pci_remove_bus(virtbus);
}
-static int virtfn_add(struct pci_dev *dev, int id, int reset)
+static int virtfn_add(struct pci_dev *dev, int id, int reset, int probe)
{
int i;
int rc = -ENOMEM;
@@ -85,6 +85,7 @@ static int virtfn_add(struct pci_dev *dev, int id, int reset)
virtfn->physfn = pci_dev_get(dev);
virtfn->is_virtfn = 1;
virtfn->multifunction = 0;
+ virtfn->probe_vf = probe;
for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
res = dev->resource + PCI_IOV_RESOURCES + i;
@@ -170,7 +171,7 @@ static void virtfn_remove(struct pci_dev *dev, int id, int reset)
pci_dev_put(dev);
}
-static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
+static int sriov_enable(struct pci_dev *dev, int nr_virtfn, int probe_vfs)
{
int rc;
int i, j;
@@ -255,7 +256,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
initial = nr_virtfn;
for (i = 0; i < initial; i++) {
- rc = virtfn_add(dev, i, 0);
+ rc = virtfn_add(dev, i, 0, probe_vfs);
if (rc)
goto failed;
}
@@ -558,17 +559,18 @@ int pci_iov_bus_range(struct pci_bus *bus)
* pci_enable_sriov - enable the SR-IOV capability
* @dev: the PCI device
* @nr_virtfn: number of virtual functions to enable
+ * @probe_vfs: if zero, don't probe new VFs; otherwise probe if a suitable driver is available
*
* Returns 0 on success, or negative on failure.
*/
-int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
+int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn, int probe_vfs)
{
might_sleep();
if (!dev->is_physfn)
return -ENOSYS;
- return sriov_enable(dev, nr_virtfn);
+ return sriov_enable(dev, nr_virtfn, probe_vfs);
}
EXPORT_SYMBOL_GPL(pci_enable_sriov);
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 2b3c89425bb5..d5b93339b8a4 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -397,9 +397,14 @@ static int pci_device_probe(struct device *dev)
drv = to_pci_driver(dev->driver);
pci_dev = to_pci_dev(dev);
pci_dev_get(pci_dev);
- error = __pci_device_probe(drv, pci_dev);
- if (error)
- pci_dev_put(pci_dev);
+ if (!pci_dev->is_virtfn || pci_dev->probe_vf) {
+ error = __pci_device_probe(drv, pci_dev);
+ if (error)
+ pci_dev_put(pci_dev);
+ }
+ /* one shot blocking of probe */
+ if (pci_dev->is_virtfn && !pci_dev->probe_vf)
+ pci_dev->probe_vf = 1;
return error;
}
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 0b2c53af85c7..2f81f471b8f3 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -4797,7 +4797,7 @@ lpfc_sli_probe_sriov_nr_virtfn(struct lpfc_hba *phba, int nr_vfn)
return -EINVAL;
}
- rc = pci_enable_sriov(pdev, nr_vfn);
+ rc = pci_enable_sriov(pdev, nr_vfn, 1);
if (rc) {
lpfc_printf_log(phba, KERN_WARNING, LOG_INIT,
"2806 Failed to enable sriov on this device "
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 4c8ac5fcc224..beb2640ba18d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -373,6 +373,7 @@ struct pci_dev {
phys_addr_t rom; /* Physical address of ROM if it's not from the BAR */
size_t romlen; /* Length of ROM if it's not from the BAR */
char *driver_override; /* Driver name to force a match */
+ int probe_vf; /* probe this device */
};
static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
@@ -1655,14 +1656,14 @@ int pci_ext_cfg_avail(void);
void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar);
#ifdef CONFIG_PCI_IOV
-int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
+int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn, int probe_vfs);
void pci_disable_sriov(struct pci_dev *dev);
int pci_num_vf(struct pci_dev *dev);
int pci_vfs_assigned(struct pci_dev *dev);
int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
int pci_sriov_get_totalvfs(struct pci_dev *dev);
#else
-static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
+static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn, int probe_vfs)
{ return -ENODEV; }
static inline void pci_disable_sriov(struct pci_dev *dev) { }
static inline int pci_num_vf(struct pci_dev *dev) { return 0; }
--
2.1.3
^ permalink raw reply related
* Re: Where exactly will arch_fast_hash be used
From: Herbert Xu @ 2014-12-07 12:51 UTC (permalink / raw)
To: George Spelvin
Cc: dborkman, hannes, linux-kernel, netdev, tgraf, David S. Miller,
Theodore Ts'o
In-Reply-To: <20141207100252.6707.qmail@ns.horizon.com>
On Sun, Dec 07, 2014 at 05:02:52AM -0500, George Spelvin wrote:
>
> How does this implicate the low bits specifically?
If you can easily deduce the pre-images that make the last bit
of the hash even or odd, then you've just cut your search space
for collisions by half. The real killer is that you can do this
without knowing what the secret is.
Our entire scheme is dependent on using the secret to defeat
would-be attackers. If CRC does not make effective use of the
secret, then we're essentially naked against attackers.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH v2 1/6] net-PPP: Replacement of a printk() call by pr_warn() in mppe_rekey()
From: Joe Perches @ 2014-12-07 12:42 UTC (permalink / raw)
To: Julia Lawall
Cc: SF Markus Elfring, Sergei Shtylyov, Paul Mackerras, linux-ppp,
netdev, Eric Dumazet, LKML, kernel-janitors
In-Reply-To: <alpine.DEB.2.02.1412071335140.2030@localhost6.localdomain6>
On Sun, 2014-12-07 at 13:36 +0100, Julia Lawall wrote:
> the semantic patch is only using __func__ and only in cases where
> the string wanted is similar to the name of the current function, so I
> think it should be OK?
Yes, it'd be a good thing.
^ permalink raw reply
* Re: [PATCH v2 1/6] net-PPP: Replacement of a printk() call by pr_warn() in mppe_rekey()
From: Julia Lawall @ 2014-12-07 12:36 UTC (permalink / raw)
To: Joe Perches
Cc: SF Markus Elfring, Sergei Shtylyov, Paul Mackerras, linux-ppp,
netdev, Eric Dumazet, LKML, kernel-janitors
In-Reply-To: <1417955413.31745.25.camel@perches.com>
On Sun, 7 Dec 2014, Joe Perches wrote:
> On Sun, 2014-12-07 at 11:44 +0100, Julia Lawall wrote:
> > > A negative to that approach is inlined functions would
> > > take the function name of the parent not keep the
> > > inlined function name.
> >
> > I tried the following:
> >
> > #include <stdio.h>
> >
> > inline int foo() {
> > printf("%s %x\n",__func__,0x12345);
> > }
> >
> > int main () {
> > foo();
> > }
> >
> > The assembly code generated for main is:
> >
> > 0000000000400470 <main>:
> > 400470: b9 45 23 01 00 mov $0x12345,%ecx
> > 400475: ba 4b 06 40 00 mov $0x40064b,%edx
> > 40047a: be 44 06 40 00 mov $0x400644,%esi
> > 40047f: bf 01 00 00 00 mov $0x1,%edi
> > 400484: 31 c0 xor %eax,%eax
> > 400486: e9 d5 ff ff ff jmpq 400460 <__printf_chk@plt>
> >
> > That is, the call to foo seems to be inlined.
> >
> > But the output is:
> >
> > foo 12345
> >
> > So it seems that __func__ is determined before inlining.
>
> True, and that's what I intended to describe.
>
> If you did that with a kernel module and replaced
> "%s, __func__" with "%pf, __builtin_return_address(0)"
> when built with kallsyms you should get:
>
> "modname 12345" when most would expect "foo 12345"
>
> when built without kallsyms, that output should be
> "<address> 12345"
>
> but the object code should be smaller.
OK. But the semantic patch is only using __func__ and only in cases where
the string wanted is similar to the name of the current function, so I
think it should be OK?
julia
^ permalink raw reply
* Re: [PATCH v2 1/6] net-PPP: Replacement of a printk() call by pr_warn() in mppe_rekey()
From: Joe Perches @ 2014-12-07 12:30 UTC (permalink / raw)
To: Julia Lawall
Cc: SF Markus Elfring, Sergei Shtylyov, Paul Mackerras, linux-ppp,
netdev, Eric Dumazet, LKML, kernel-janitors
In-Reply-To: <alpine.DEB.2.02.1412071140290.2044@localhost6.localdomain6>
On Sun, 2014-12-07 at 11:44 +0100, Julia Lawall wrote:
> > A negative to that approach is inlined functions would
> > take the function name of the parent not keep the
> > inlined function name.
>
> I tried the following:
>
> #include <stdio.h>
>
> inline int foo() {
> printf("%s %x\n",__func__,0x12345);
> }
>
> int main () {
> foo();
> }
>
> The assembly code generated for main is:
>
> 0000000000400470 <main>:
> 400470: b9 45 23 01 00 mov $0x12345,%ecx
> 400475: ba 4b 06 40 00 mov $0x40064b,%edx
> 40047a: be 44 06 40 00 mov $0x400644,%esi
> 40047f: bf 01 00 00 00 mov $0x1,%edi
> 400484: 31 c0 xor %eax,%eax
> 400486: e9 d5 ff ff ff jmpq 400460 <__printf_chk@plt>
>
> That is, the call to foo seems to be inlined.
>
> But the output is:
>
> foo 12345
>
> So it seems that __func__ is determined before inlining.
True, and that's what I intended to describe.
If you did that with a kernel module and replaced
"%s, __func__" with "%pf, __builtin_return_address(0)"
when built with kallsyms you should get:
"modname 12345" when most would expect "foo 12345"
when built without kallsyms, that output should be
"<address> 12345"
but the object code should be smaller.
^ permalink raw reply
* Re: [PATCH net-next] net/mlx4_en: ethtool force speed when asking for autoneg=off
From: Sergei Shtylyov @ 2014-12-07 11:55 UTC (permalink / raw)
To: Amir Vadai, David S. Miller
Cc: netdev, Or Gerlitz, Yevgeny Petrilin, Saeed Mahameed
In-Reply-To: <1417939634-26085-1-git-send-email-amirv@mellanox.com>
Hello.
On 12/7/2014 11:07 AM, Amir Vadai wrote:
> From: Saeed Mahameed <saeedm@mellanox.com>
> Use cmd->autoneg == AUTONEG_DISABLE as a user hint to force specific speed.
> We don't want to rely on ethtool to calculate advertised link modes when
> forcing specific speed, a user can request a specific speed and specify
> "autoneg off" in ethtool command to give a hint for forcing this speed.
> Move en_warn("port reset..") inside the "port reset" block.
> Fixes: d48b3ab ("net/mlx4_en: Use PTYS register to set ethtool settings (Speed)")
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> Signed-off-by: Amir Vadai <amirv@mellanox.com>
> ---
> drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
> index c45e06a..3045582 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
> @@ -771,13 +771,13 @@ static int mlx4_en_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
> }
>
> proto_admin = cpu_to_be32(ptys_adv);
> - if (speed >= 0 && speed != priv->port_state.link_speed)
> + if (speed >= 0 && ((speed != priv->port_state.link_speed) ||
> + (cmd->autoneg == AUTONEG_DISABLE)))
You're using () rather inconsistently. In fact, () around == and != are
not needed.
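As a small illustration of the precedence point, the sketch below compares the condition as written in the patch with a version that keeps only the grouping parentheses needed for || inside &&. The function names and the SKETCH_* constants are hypothetical; the constants are local stand-ins for the ethtool AUTONEG_* values so that the example builds in userspace.

```c
#include <assert.h>

/* Local stand-ins for AUTONEG_DISABLE / AUTONEG_ENABLE. */
enum { SKETCH_AUTONEG_DISABLE = 0, SKETCH_AUTONEG_ENABLE = 1 };

/* Condition as written in the patch: each comparison is parenthesised. */
static int needs_reset_patch(int speed, int link_speed, int autoneg)
{
	return speed >= 0 && ((speed != link_speed) ||
			      (autoneg == SKETCH_AUTONEG_DISABLE));
}

/* Same condition; == and != bind tighter than ||, so no inner parens. */
static int needs_reset_minimal(int speed, int link_speed, int autoneg)
{
	return speed >= 0 && (speed != link_speed ||
			      autoneg == SKETCH_AUTONEG_DISABLE);
}

/* Compares both forms over a small input range; returns 1 if they agree. */
static int forms_agree(void)
{
	int speed, link, autoneg;

	for (speed = -1; speed <= 2; speed++)
		for (link = 0; link <= 2; link++)
			for (autoneg = 0; autoneg <= 1; autoneg++) {
				int a = needs_reset_patch(speed, link, autoneg);
				int b = needs_reset_minimal(speed, link, autoneg);

				assert(a == 0 || a == 1);
				if (a != b)
					return 0;
			}
	return 1;
}
```

The outer parentheses around the || expression are still required, because && binds tighter than ||.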
[...]
WBR, Sergei
^ permalink raw reply
* Re: [PATCH v2 1/6] net-PPP: Replacement of a printk() call by pr_warn() in mppe_rekey()
From: Julia Lawall @ 2014-12-07 10:44 UTC (permalink / raw)
To: Joe Perches
Cc: SF Markus Elfring, Sergei Shtylyov, Paul Mackerras, linux-ppp,
netdev, Eric Dumazet, LKML, kernel-janitors
In-Reply-To: <1417765287.2721.39.camel@perches.com>
> A negative to that approach is inlined functions would
> take the function name of the parent not keep the
> inlined function name.
I tried the following:
#include <stdio.h>
inline int foo() {
	printf("%s %x\n",__func__,0x12345);
}
int main () {
	foo();
}
The assembly code generated for main is:
0000000000400470 <main>:
400470: b9 45 23 01 00 mov $0x12345,%ecx
400475: ba 4b 06 40 00 mov $0x40064b,%edx
40047a: be 44 06 40 00 mov $0x400644,%esi
40047f: bf 01 00 00 00 mov $0x1,%edi
400484: 31 c0 xor %eax,%eax
400486: e9 d5 ff ff ff jmpq 400460 <__printf_chk@plt>
That is, the call to foo seems to be inlined.
But the output is:
foo 12345
So it seems that __func__ is determined before inlining.
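The same observation can be written as a self-checking sketch that does not depend on reading the disassembly. The function names are hypothetical. It relies only on the C99 rule that __func__ names the lexically enclosing function, so the result should be the same whether or not the compiler inlines the call.

```c
#include <assert.h>
#include <string.h>

/* __func__ acts like a static const char array local to this function. */
static inline const char *callee_name(void)
{
	return __func__;
}

/* Even if callee_name() is inlined here, it still reports its own name. */
static const char *caller_name(void)
{
	return callee_name();
}

/* Returns 1 if the name seen through the caller is the callee's name. */
static int func_survives_inlining(void)
{
	const char *name = caller_name();

	assert(name != NULL);
	return strcmp(name, "callee_name") == 0;
}
```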
julia
^ permalink raw reply
* Re: Where exactly will arch_fast_hash be used
From: George Spelvin @ 2014-12-07 10:02 UTC (permalink / raw)
To: herbert, linux; +Cc: dborkman, hannes, linux-kernel, netdev, tgraf
In-Reply-To: <20141207092828.GA8623@gondor.apana.org.au>
> For a start why don't you print out the hashes of 1-255 and then
> find out how easy it is to deduce the last bit of the hash result.
They're available in lib/crc32table.h, as crc32ctable_le[0].
As a CRC is a linear function, every bit is the XOR of some
selected bits of the input, i.e. the parity of the input and
some bit-specific mask sequence.
Furthermore, CRCs are cyclic, so the mask sequences for adjacent bits are
shifts of each other.
The lsbit of the CRC32c of x is the parity of x & 0x1f.
This is because the LFSR sequence generated by the polynomial
starts 0001111110010001110010101111011000111000011011110010110000100101...
The first bit corresponds to the msbit of the last byte.
How does this implicate the low bits specifically?
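The parity claim can be checked without the kernel tree. The sketch below recomputes each single-byte table entry bit by bit (zero initial value, no final XOR), which is assumed here to match how crc32ctable_le[0] is generated, and compares its lsbit with the parity of x & 0x1f. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reflected CRC32c (Castagnoli) polynomial. */
#define CRC32C_POLY_LE 0x82F63B78u

/* CRC of the single byte x: the bitwise equivalent of one table entry. */
static uint32_t crc32c_table_entry(unsigned int x)
{
	uint32_t crc = x;
	int i;

	assert(x < 256);
	for (i = 0; i < 8; i++)
		crc = (crc >> 1) ^ ((crc & 1) ? CRC32C_POLY_LE : 0);
	return crc;
}

/* Parity (XOR of all bits) of the low 8 bits of v. */
static unsigned int parity8(unsigned int v)
{
	v &= 0xff;
	v ^= v >> 4;
	v ^= v >> 2;
	v ^= v >> 1;
	return v & 1;
}

/* Counts byte values whose entry lsbit differs from parity(x & 0x1f). */
static unsigned int count_lsbit_mismatches(void)
{
	unsigned int x, bad = 0;

	for (x = 0; x < 256; x++)
		if ((crc32c_table_entry(x) & 1) != parity8(x & 0x1f))
			bad++;
	return bad;
}
```

A mismatch count of zero confirms the 0x1f mask for single bytes; by linearity, a mask of the same kind exists for every other output bit as well.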
^ permalink raw reply
* Re: [PATCH v8 3/3] net: hisilicon: new hip04 ethernet driver
From: Alexander Graf @ 2014-12-07 9:49 UTC (permalink / raw)
To: Ding Tianhong, Zhangfei Gao
Cc: davem, linux, arnd, f.fainelli, sergei.shtylyov, mark.rutland,
David.Laight, eric.dumazet, xuwei5, linux-arm-kernel, netdev,
devicetree
In-Reply-To: <5483C977.2060308@huawei.com>
On 07.12.14 04:28, Ding Tianhong wrote:
> On 2014/12/7 8:42, Alexander Graf wrote:
>> On 19.04.14 03:13, Zhangfei Gao wrote:
>>> Support Hisilicon hip04 ethernet driver, including 100M / 1000M controller.
>>> The controller has no tx done interrupt, reclaim xmitted buffer in the poll.
>>>
>>> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
>>
>> Is this driver still supposed to go upstream? I presume this was the
>> last submission and it's been quite some time ago :)
>>
>
> Yes, it has been a long time, but the hip04 does not support a TX irq and
> we couldn't come up with a better way to work around this defect. Do you have any suggestions?
Well, if hardware doesn't have a TX irq I don't see there's anything we
can do to fix that ;).
Dave, what's your take here? Should we keep a driver from going upstream
just because the hardware is partly broken? I'd really prefer to have an
upstream driver on that SoC rather than some random (eventually even
more broken) downstream code.
Alex
^ permalink raw reply