* [PATCH] ipv6: Fix default route failover when CONFIG_IPV6_ROUTER_PREF=n
From: Paul Marks @ 2012-12-02 10:00 UTC
To: davem, yoshfuji; +Cc: netdev, Paul Marks
I believe this commit from 2008 was incorrect:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=398bcbebb6f721ac308df1e3d658c0029bb74503
When CONFIG_IPV6_ROUTER_PREF is disabled, the kernel should follow
RFC4861 section 6.3.6: if no route is NUD_VALID, then traffic should be
sprayed across all routers (indirectly triggering NUD) until one of them
becomes NUD_VALID.
However, the following experiment demonstrates that this does not work:
1) Connect to an IPv6 network.
2) Change the router's MAC (and link-local) address.
The kernel will lock onto the first router and never try the new one, even
if the first becomes unreachable. This patch fixes the problem by
allowing rt6_check_neigh() to return 0; if all routers return 0, then
rt6_select() will fall back to round-robin behavior.
This patch should have no effect when CONFIG_IPV6_ROUTER_PREF=y.
Note that rt6_check_neigh() is only used in a boolean context, so the
presence of both "m = 1" and "m = 2" was irrelevant and confusing.
Signed-off-by: Paul Marks <pmarks@google.com>
---
net/ipv6/route.c | 13 +++++--------
1 files changed, 5 insertions(+), 8 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index b1e6cf0..60b2995 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -490,7 +490,7 @@ static inline int rt6_check_dev(struct rt6_info *rt, int oif)
static inline int rt6_check_neigh(struct rt6_info *rt)
{
struct neighbour *neigh;
- int m;
+ int m = 0;
neigh = rt->n;
if (rt->rt6i_flags & RTF_NONEXTHOP ||
@@ -499,16 +499,13 @@ static inline int rt6_check_neigh(struct rt6_info *rt)
else if (neigh) {
read_lock_bh(&neigh->lock);
if (neigh->nud_state & NUD_VALID)
- m = 2;
+ m = 1;
#ifdef CONFIG_IPV6_ROUTER_PREF
- else if (neigh->nud_state & NUD_FAILED)
- m = 0;
-#endif
- else
+ else if (!(neigh->nud_state & NUD_FAILED))
m = 1;
+#endif
read_unlock_bh(&neigh->lock);
- } else
- m = 0;
+ }
return m;
}
--
1.7.7.3
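The selection behaviour the commit message describes can be sketched in user space. This is a minimal model, not the kernel code: `struct router`, `check_neigh()` and `select_router()` are hypothetical names, and the NUD bitmask is reduced to three illustrative states. It assumes the CONFIG_IPV6_ROUTER_PREF=n semantics described above, where only a NUD_VALID neighbour scores 1 and a selector falls back to round-robin when every router scores 0.

```c
#include <stddef.h>

/* Simplified neighbour states; the kernel's NUD_* flags are a bitmask. */
enum nud { NUD_MODEL_NONE, NUD_MODEL_VALID, NUD_MODEL_FAILED };

struct router {
	enum nud state;
};

/* Model of rt6_check_neigh() with CONFIG_IPV6_ROUTER_PREF=n:
 * 1 only if the neighbour is known to be reachable, otherwise 0. */
static int check_neigh(const struct router *r)
{
	return r->state == NUD_MODEL_VALID;
}

/* Pick the first reachable router. If none is reachable, hand out the
 * routers in turn (round-robin) so that traffic probes each of them.
 * n must be greater than zero. */
static size_t select_router(const struct router *rt, size_t n, size_t *rr_next)
{
	size_t i, pick;

	for (i = 0; i < n; i++)
		if (check_neigh(&rt[i]))
			return i;

	pick = *rr_next % n;
	*rr_next = (pick + 1) % n;
	return pick;
}
```

In this model, a router that never becomes reachable no longer pins the selection: successive calls rotate through the list until one entry reports a valid neighbour.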
* [patch] p54: potential signedness issue in p54_parse_rssical()
From: Dan Carpenter @ 2012-12-02 10:36 UTC
To: Christian Lamparter
Cc: John W. Linville, linux-wireless, netdev, linux-kernel,
kernel-janitors
"entries" is unsigned here, so it is never less than zero. In theory,
len could be less than offset so I have added a check for that.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
diff --git a/drivers/net/wireless/p54/eeprom.c b/drivers/net/wireless/p54/eeprom.c
index 1ef1bfe..d43e374 100644
--- a/drivers/net/wireless/p54/eeprom.c
+++ b/drivers/net/wireless/p54/eeprom.c
@@ -541,8 +541,9 @@ static int p54_parse_rssical(struct ieee80211_hw *dev,
entries = (len - offset) /
sizeof(struct pda_rssi_cal_ext_entry);
- if ((len - offset) % sizeof(struct pda_rssi_cal_ext_entry) ||
- entries <= 0) {
+ if (len < offset ||
+ (len - offset) % sizeof(struct pda_rssi_cal_ext_entry) ||
+ entries == 0) {
wiphy_err(dev->wiphy, "invalid rssi database.\n");
goto err_data;
}
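The signedness problem can be reproduced outside the driver. The sketch below is only an illustration: `count_entries()` is a hypothetical helper and `ENTRY_SIZE` stands in for sizeof(struct pda_rssi_cal_ext_entry), whose real value is not shown here. It applies the same three checks as the patch, with the len < offset test placed before any subtraction so that the unsigned difference cannot wrap.

```c
#include <stddef.h>

/* Hypothetical stand-in for sizeof(struct pda_rssi_cal_ext_entry). */
#define ENTRY_SIZE 8u

/* Return the number of whole entries in [offset, len), or 0 when the
 * region is invalid: offset beyond len, trailing bytes, or no entries. */
static size_t count_entries(size_t len, size_t offset)
{
	if (len < offset)		/* len - offset would wrap around */
		return 0;
	if ((len - offset) % ENTRY_SIZE)
		return 0;
	return (len - offset) / ENTRY_SIZE;
}
```

Because the count is unsigned, a test such as `entries <= 0` can only ever catch the zero case, which is why the explicit len < offset comparison is needed.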
* [PATCH net-next] ipv6: Make 'addrconf_rs_timer' send Router Solicitations (and re-arm itself) if Router Advertisements are accepted
From: Shmulik Ladkani @ 2012-12-02 11:44 UTC
To: David S. Miller
Cc: netdev, Hideaki YOSHIFUJI, Thomas Graf, Tore Anderson, Ami Koren,
Shmulik Ladkani
As of 026359b [ipv6: Send ICMPv6 RSes only when RAs are accepted],
Router Solicitations are sent whenever kernel accepts Router
Advertisements on the interface.
However, this logic isn't reflected in 'addrconf_rs_timer'.
The timer fails to issue subsequent RS messages (and fails to re-arm
itself) if forwarding is enabled and the special hybrid mode is
enabled (accept_ra=2).
Fix the condition determining whether next RS should be sent, by using
'ipv6_accept_ra()'.
Reported-by: Ami Koren <amikoren@yahoo.com>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
---
net/ipv6/addrconf.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index ca1ed8a..3b990a2 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2843,7 +2843,7 @@ static void addrconf_rs_timer(unsigned long data)
if (idev->dead || !(idev->if_flags & IF_READY))
goto out;
- if (idev->cnf.forwarding)
+ if (!ipv6_accept_ra(idev))
goto out;
/* Announcement received after solicitation was sent */
--
1.7.9
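The difference between the old and new timer conditions can be tabulated with a small user-space model. The names `struct ra_conf`, `accepts_ra()`, `old_timer_continues()` and `new_timer_continues()` are hypothetical; `accepts_ra()` is written from the description above (RAs are accepted when forwarding is off and accept_ra is non-zero, or when accept_ra=2 regardless of forwarding) and may differ in detail from the kernel's ipv6_accept_ra().

```c
struct ra_conf {
	int forwarding;	/* non-zero when the interface forwards packets */
	int accept_ra;	/* 0 = never, 1 = only when not forwarding, 2 = always */
};

/* Model of the "are Router Advertisements accepted?" decision. */
static int accepts_ra(const struct ra_conf *c)
{
	return c->forwarding ? c->accept_ra == 2 : c->accept_ra != 0;
}

/* Old timer condition: give up whenever forwarding is enabled. */
static int old_timer_continues(const struct ra_conf *c)
{
	return !c->forwarding;
}

/* New timer condition: keep soliciting exactly when RAs are accepted. */
static int new_timer_continues(const struct ra_conf *c)
{
	return accepts_ra(c);
}
```

The two conditions disagree for forwarding=1 with accept_ra=2, which is the hybrid case the patch addresses, and also for forwarding=0 with accept_ra=0, where the new condition stops soliciting.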
* Re: [PATCH net-next v1 3/3] net/mlx4_en: Set number of rx/tx channels using ethtool
From: Or Gerlitz @ 2012-12-02 12:23 UTC
To: David Miller; +Cc: bhutchings, amirv, ogerlitz, oren, netdev
In-Reply-To: <20121201.112555.1561917161372059709.davem@davemloft.net>
On Sat, Dec 1, 2012 at 6:25 PM, David Miller <davem@davemloft.net> wrote:
> He's saying that you need to carefully validate the arguments, not
> that appropriate permissions haven't been checked.
OK, understood.
Or.
* Re: [PATCH RFC] [INET]: Get critical word in first 64bit of cache line
From: Ling Ma @ 2012-12-02 13:25 UTC
To: Eric Dumazet; +Cc: linux-kernel, netdev
In-Reply-To: <1354024683.7553.1768.camel@edumazet-glaptop>
[-- Attachment #1: Type: text/plain, Size: 410 bytes --]
Hi Eric,
Attached is a benchmark, test-cwf.c (build with: cc -o test-cwf test-cwf.c).
The result shows that when the last-level cache (LLC) misses and the CPU
fetches data from memory, placing the critical word in the first 64 bits of
the cache line performs better (158290336 cycles) than placing it elsewhere
in the line (offset 0x10, 164100732 cycles); performance improves by 3.6%
in this case.
The cpu-info is attached as well.
Thanks
Ling
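The quoted improvement can be checked from the two cycle counts. The helper below is a hypothetical sketch (`percent_of()` and the two constants are illustrative names); the message does not say which count was used as the baseline, so both are shown. Relative to the slower run the difference is roughly 3.5%, and relative to the faster run it is roughly 3.7%.

```c
/* Difference `delta` expressed as a percentage of `base`. */
static double percent_of(double delta, double base)
{
	return 100.0 * delta / base;
}

/* Cycle counts quoted in the message above. */
static const double first_word_cycles = 158290336.0;	/* offset 0x0 */
static const double offset_0x10_cycles = 164100732.0;	/* offset 0x10 */
```

Calling `percent_of(offset_0x10_cycles - first_word_cycles, offset_0x10_cycles)` gives the saving relative to the slower layout; passing `first_word_cycles` as the base gives the slowdown relative to the faster one.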
[-- Attachment #2: test-cwf.c --]
[-- Type: text/x-csrc, Size: 1986 bytes --]
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include<unistd.h>
#define MAX_BUF_NUM (1 << 20)
#define MAX_BUF_SIZE (1 << 8)
#define ACCESS_OFFSET (0x10)
# define HP_TIMING_NOW(Var) \
({ unsigned long long _hi, _lo; \
asm volatile ("rdtsc" : "=a" (_lo), "=d" (_hi)); \
(Var) = _hi << 32 | _lo; })
#define repeat_times (64)
static void init_buf(char **buf)
{
int i = 0;
char *start;
char *end;
int pagesize = getpagesize();
*buf = malloc(MAX_BUF_SIZE * MAX_BUF_NUM + pagesize);
if(*buf == NULL) {
printf("\nfailed to malloc space!\n");
exit(1);
} else {
*buf = *buf + pagesize;
*buf = (char *)(((unsigned long)*buf) & (-pagesize));
}
start = *buf;
end = *buf + (MAX_BUF_SIZE * MAX_BUF_NUM) - MAX_BUF_SIZE;
while(1) {
*((unsigned char **)start) = end;
*((unsigned char **)(start + ACCESS_OFFSET)) = (end + ACCESS_OFFSET);
start = start + MAX_BUF_SIZE;
if(start == end)
break;
*((unsigned char **)end) = start;
*((unsigned char **)(end + ACCESS_OFFSET)) = start + ACCESS_OFFSET;
end = end - MAX_BUF_SIZE;
}
}
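unsigned long lookingup_memmory(char *access, int num)
{
	unsigned long sum = 0;
	long count = num;

	/* Pointer chase: load the next pointer, accumulate it, follow it.
	 * Operands are declared so the compiler knows what is read and written. */
	__asm__ volatile(
		"sub $1, %[cnt]\n"
		"1:\n\t"
		"mov (%[ptr]), %%r8\n\t"
		"add %%r8, %[sum]\n\t"
		"mov %%r8, %[ptr]\n\t"
		"sub $1, %[cnt]\n\t"
		"jae 1b"
		: [sum] "+r" (sum), [ptr] "+r" (access), [cnt] "+r" (count)
		:
		: "r8", "cc", "memory");
	return sum;
}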
static unsigned long test_lookup_time(char *buf)
{
unsigned long i, start, end, best_time = ~0;
for(i = 0; i < repeat_times; i++) {
HP_TIMING_NOW(start);
lookingup_memmory(buf, MAX_BUF_NUM);
HP_TIMING_NOW(end);
if(best_time > (end - start))
best_time = (end - start);
}
return best_time;
}
int main(void)
{
char *buf1 = NULL;
char *buf2 = NULL;
unsigned long aligned_time, unaligned_time;
init_buf(&buf1);
init_buf(&buf2);
aligned_time = test_lookup_time(buf1);
unaligned_time = test_lookup_time(buf2 + ACCESS_OFFSET);
printf("looking-up aligned time %lu, looking-up unaligned time %lu\n", aligned_time, unaligned_time);
}
[-- Attachment #3: cpu-info --]
[-- Type: application/octet-stream, Size: 7050 bytes --]
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
microcode : 0x10
cpu MHz : 2400.005
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.01
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
microcode : 0x10
cpu MHz : 2400.005
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.01
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
microcode : 0x10
cpu MHz : 2400.005
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 9
cpu cores : 4
apicid : 18
initial apicid : 18
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.01
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
microcode : 0x10
cpu MHz : 2400.005
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 10
cpu cores : 4
apicid : 20
initial apicid : 20
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.01
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
processor : 4
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
microcode : 0x10
cpu MHz : 2400.005
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.01
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
processor : 5
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
microcode : 0x10
cpu MHz : 2400.005
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.01
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
processor : 6
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
microcode : 0x10
cpu MHz : 2400.005
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 9
cpu cores : 4
apicid : 19
initial apicid : 19
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.01
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
stepping : 2
microcode : 0x10
cpu MHz : 2400.005
cache size : 12288 KB
physical id : 0
siblings : 8
core id : 10
cpu cores : 4
apicid : 21
initial apicid : 21
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 4800.01
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
* [PATCH net-next v2 2/3] net/mlx4_en: Fix TX moderation info loss after set_ringparam is called
From: Amir Vadai @ 2012-12-02 13:49 UTC
To: David S. Miller; +Cc: Ben Hutchings, Amir Vadai, Or Gerlitz, Oren Duer, netdev
In-Reply-To: <1354456163-10497-1-git-send-email-amirv@mellanox.com>
We need to re-set the tx moderation information after calling set_ringparam;
otherwise the default tx moderation will be used.
Also avoid the related code duplication by putting it in a utility function.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 59 ++++++++++++-----------
1 files changed, 30 insertions(+), 29 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
index 9d0b88e..dc8ccb4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
@@ -43,6 +43,34 @@
#define EN_ETHTOOL_SHORT_MASK cpu_to_be16(0xffff)
#define EN_ETHTOOL_WORD_MASK cpu_to_be32(0xffffffff)
+static int mlx4_en_moderation_update(struct mlx4_en_priv *priv)
+{
+ int i;
+ int err = 0;
+
+ for (i = 0; i < priv->tx_ring_num; i++) {
+ priv->tx_cq[i].moder_cnt = priv->tx_frames;
+ priv->tx_cq[i].moder_time = priv->tx_usecs;
+ err = mlx4_en_set_cq_moder(priv, &priv->tx_cq[i]);
+ if (err)
+ return err;
+ }
+
+ if (priv->adaptive_rx_coal)
+ return 0;
+
+ for (i = 0; i < priv->rx_ring_num; i++) {
+ priv->rx_cq[i].moder_cnt = priv->rx_frames;
+ priv->rx_cq[i].moder_time = priv->rx_usecs;
+ priv->last_moder_time[i] = MLX4_EN_AUTO_CONF;
+ err = mlx4_en_set_cq_moder(priv, &priv->rx_cq[i]);
+ if (err)
+ return err;
+ }
+
+ return err;
+}
+
static void
mlx4_en_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *drvinfo)
{
@@ -381,7 +409,6 @@ static int mlx4_en_set_coalesce(struct net_device *dev,
struct ethtool_coalesce *coal)
{
struct mlx4_en_priv *priv = netdev_priv(dev);
- int err, i;
priv->rx_frames = (coal->rx_max_coalesced_frames ==
MLX4_EN_AUTO_CONF) ?
@@ -397,14 +424,6 @@ static int mlx4_en_set_coalesce(struct net_device *dev,
coal->tx_max_coalesced_frames != priv->tx_frames) {
priv->tx_usecs = coal->tx_coalesce_usecs;
priv->tx_frames = coal->tx_max_coalesced_frames;
- for (i = 0; i < priv->tx_ring_num; i++) {
- priv->tx_cq[i].moder_cnt = priv->tx_frames;
- priv->tx_cq[i].moder_time = priv->tx_usecs;
- if (mlx4_en_set_cq_moder(priv, &priv->tx_cq[i])) {
- en_warn(priv, "Failed changing moderation "
- "for TX cq %d\n", i);
- }
- }
}
/* Set adaptive coalescing params */
@@ -414,18 +433,8 @@ static int mlx4_en_set_coalesce(struct net_device *dev,
priv->rx_usecs_high = coal->rx_coalesce_usecs_high;
priv->sample_interval = coal->rate_sample_interval;
priv->adaptive_rx_coal = coal->use_adaptive_rx_coalesce;
- if (priv->adaptive_rx_coal)
- return 0;
- for (i = 0; i < priv->rx_ring_num; i++) {
- priv->rx_cq[i].moder_cnt = priv->rx_frames;
- priv->rx_cq[i].moder_time = priv->rx_usecs;
- priv->last_moder_time[i] = MLX4_EN_AUTO_CONF;
- err = mlx4_en_set_cq_moder(priv, &priv->rx_cq[i]);
- if (err)
- return err;
- }
- return 0;
+ return mlx4_en_moderation_update(priv);
}
static int mlx4_en_set_pauseparam(struct net_device *dev,
@@ -466,7 +475,6 @@ static int mlx4_en_set_ringparam(struct net_device *dev,
u32 rx_size, tx_size;
int port_up = 0;
int err = 0;
- int i;
if (param->rx_jumbo_pending || param->rx_mini_pending)
return -EINVAL;
@@ -505,14 +513,7 @@ static int mlx4_en_set_ringparam(struct net_device *dev,
en_err(priv, "Failed starting port\n");
}
- for (i = 0; i < priv->rx_ring_num; i++) {
- priv->rx_cq[i].moder_cnt = priv->rx_frames;
- priv->rx_cq[i].moder_time = priv->rx_usecs;
- priv->last_moder_time[i] = MLX4_EN_AUTO_CONF;
- err = mlx4_en_set_cq_moder(priv, &priv->rx_cq[i]);
- if (err)
- goto out;
- }
+ err = mlx4_en_moderation_update(priv);
out:
mutex_unlock(&mdev->state_lock);
--
1.7.8.2
* [PATCH net-next v2 3/3] net/mlx4_en: Set number of rx/tx channels using ethtool
From: Amir Vadai @ 2012-12-02 13:49 UTC
To: David S. Miller; +Cc: Ben Hutchings, Amir Vadai, Or Gerlitz, Oren Duer, netdev
In-Reply-To: <1354456163-10497-1-git-send-email-amirv@mellanox.com>
Add support for changing the number of rx/tx channels using
ethtool ('ethtool -[lL]'). The number of tx channels specified in ethtool
is the number of rings per user priority, not the total number of tx rings.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 69 +++++++++++++++++++++++
drivers/net/ethernet/mellanox/mlx4/en_main.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 26 +++++----
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 8 ++-
5 files changed, 93 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
index dc8ccb4..4aaa7c3 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
@@ -999,6 +999,73 @@ static int mlx4_en_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
return err;
}
+static void mlx4_en_get_channels(struct net_device *dev,
+ struct ethtool_channels *channel)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ memset(channel, 0, sizeof(*channel));
+
+ channel->max_rx = MAX_RX_RINGS;
+ channel->max_tx = MLX4_EN_MAX_TX_RING_P_UP;
+
+ channel->rx_count = priv->rx_ring_num;
+ channel->tx_count = priv->tx_ring_num / MLX4_EN_NUM_UP;
+}
+
+static int mlx4_en_set_channels(struct net_device *dev,
+ struct ethtool_channels *channel)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ struct mlx4_en_dev *mdev = priv->mdev;
+ int port_up = 0;
+ int err = 0;
+
+ if (channel->other_count || channel->combined_count ||
+ channel->tx_count > MLX4_EN_MAX_TX_RING_P_UP ||
+ channel->rx_count > MAX_RX_RINGS ||
+ !channel->tx_count || !channel->rx_count)
+ return -EINVAL;
+
+ mutex_lock(&mdev->state_lock);
+ if (priv->port_up) {
+ port_up = 1;
+ mlx4_en_stop_port(dev);
+ }
+
+ mlx4_en_free_resources(priv);
+
+ priv->num_tx_rings_p_up = channel->tx_count;
+ priv->tx_ring_num = channel->tx_count * MLX4_EN_NUM_UP;
+ priv->rx_ring_num = channel->rx_count;
+
+ err = mlx4_en_alloc_resources(priv);
+ if (err) {
+ en_err(priv, "Failed reallocating port resources\n");
+ goto out;
+ }
+
+ netif_set_real_num_tx_queues(dev, priv->tx_ring_num);
+ netif_set_real_num_rx_queues(dev, priv->rx_ring_num);
+
+ mlx4_en_setup_tc(dev, MLX4_EN_NUM_UP);
+
+ en_warn(priv, "Using %d TX rings\n", priv->tx_ring_num);
+ en_warn(priv, "Using %d RX rings\n", priv->rx_ring_num);
+
+ if (port_up) {
+ err = mlx4_en_start_port(dev);
+ if (err)
+ en_err(priv, "Failed starting port\n");
+ }
+
+ err = mlx4_en_moderation_update(priv);
+
+out:
+ mutex_unlock(&mdev->state_lock);
+ return err;
+}
+
const struct ethtool_ops mlx4_en_ethtool_ops = {
.get_drvinfo = mlx4_en_get_drvinfo,
.get_settings = mlx4_en_get_settings,
@@ -1023,6 +1090,8 @@ const struct ethtool_ops mlx4_en_ethtool_ops = {
.get_rxfh_indir_size = mlx4_en_get_rxfh_indir_size,
.get_rxfh_indir = mlx4_en_get_rxfh_indir,
.set_rxfh_indir = mlx4_en_set_rxfh_indir,
+ .get_channels = mlx4_en_get_channels,
+ .set_channels = mlx4_en_set_channels,
};
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c
index a52922e..3a2b8c6 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_main.c
@@ -250,7 +250,7 @@ static void *mlx4_en_add(struct mlx4_dev *dev)
rounddown_pow_of_two(max_t(int, MIN_RX_RINGS,
min_t(int,
dev->caps.num_comp_vectors,
- MAX_RX_RINGS)));
+ DEF_RX_RINGS)));
} else {
mdev->profile.prof[i].rx_ring_num = rounddown_pow_of_two(
min_t(int, dev->caps.comp_pool/
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 2b23ca2..7d1287f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -47,11 +47,11 @@
#include "mlx4_en.h"
#include "en_port.h"
-static int mlx4_en_setup_tc(struct net_device *dev, u8 up)
+int mlx4_en_setup_tc(struct net_device *dev, u8 up)
{
struct mlx4_en_priv *priv = netdev_priv(dev);
int i;
- unsigned int q, offset = 0;
+ unsigned int offset = 0;
if (up && up != MLX4_EN_NUM_UP)
return -EINVAL;
@@ -59,10 +59,9 @@ static int mlx4_en_setup_tc(struct net_device *dev, u8 up)
netdev_set_num_tc(dev, up);
/* Partition Tx queues evenly amongst UP's */
- q = priv->tx_ring_num / up;
for (i = 0; i < up; i++) {
- netdev_set_tc_queue(dev, i, q, offset);
- offset += q;
+ netdev_set_tc_queue(dev, i, priv->num_tx_rings_p_up, offset);
+ offset += priv->num_tx_rings_p_up;
}
return 0;
@@ -1114,7 +1113,7 @@ int mlx4_en_start_port(struct net_device *dev)
/* Configure ring */
tx_ring = &priv->tx_ring[i];
err = mlx4_en_activate_tx_ring(priv, tx_ring, cq->mcq.cqn,
- i / priv->mdev->profile.num_tx_rings_p_up);
+ i / priv->num_tx_rings_p_up);
if (err) {
en_err(priv, "Failed allocating Tx ring\n");
mlx4_en_deactivate_cq(priv, cq);
@@ -1564,10 +1563,13 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
int err;
dev = alloc_etherdev_mqs(sizeof(struct mlx4_en_priv),
- prof->tx_ring_num, prof->rx_ring_num);
+ MAX_TX_RINGS, MAX_RX_RINGS);
if (dev == NULL)
return -ENOMEM;
+ netif_set_real_num_tx_queues(dev, prof->tx_ring_num);
+ netif_set_real_num_rx_queues(dev, prof->rx_ring_num);
+
SET_NETDEV_DEV(dev, &mdev->dev->pdev->dev);
dev->dev_id = port - 1;
@@ -1586,15 +1588,17 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
priv->flags = prof->flags;
priv->ctrl_flags = cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE |
MLX4_WQE_CTRL_SOLICITED);
+ priv->num_tx_rings_p_up = mdev->profile.num_tx_rings_p_up;
priv->tx_ring_num = prof->tx_ring_num;
- priv->tx_ring = kzalloc(sizeof(struct mlx4_en_tx_ring) *
- priv->tx_ring_num, GFP_KERNEL);
+
+ priv->tx_ring = kzalloc(sizeof(struct mlx4_en_tx_ring) * MAX_TX_RINGS,
+ GFP_KERNEL);
if (!priv->tx_ring) {
err = -ENOMEM;
goto out;
}
- priv->tx_cq = kzalloc(sizeof(struct mlx4_en_cq) * priv->tx_ring_num,
- GFP_KERNEL);
+ priv->tx_cq = kzalloc(sizeof(struct mlx4_en_cq) * MAX_TX_RINGS,
+ GFP_KERNEL);
if (!priv->tx_cq) {
err = -ENOMEM;
goto out;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index b35094c..1f571d0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -523,7 +523,7 @@ static void build_inline_wqe(struct mlx4_en_tx_desc *tx_desc, struct sk_buff *sk
u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb)
{
struct mlx4_en_priv *priv = netdev_priv(dev);
- u16 rings_p_up = priv->mdev->profile.num_tx_rings_p_up;
+ u16 rings_p_up = priv->num_tx_rings_p_up;
u8 up = 0;
if (dev->num_tc)
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index d3eba8b..334ec48 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -67,7 +67,8 @@
#define MLX4_EN_PAGE_SHIFT 12
#define MLX4_EN_PAGE_SIZE (1 << MLX4_EN_PAGE_SHIFT)
-#define MAX_RX_RINGS 16
+#define DEF_RX_RINGS 16
+#define MAX_RX_RINGS 128
#define MIN_RX_RINGS 4
#define TXBB_SIZE 64
#define HEADROOM (2048 / TXBB_SIZE + 1)
@@ -118,6 +119,8 @@ enum {
#define MLX4_EN_NUM_UP 8
#define MLX4_EN_DEF_TX_RING_SIZE 512
#define MLX4_EN_DEF_RX_RING_SIZE 1024
+#define MAX_TX_RINGS (MLX4_EN_MAX_TX_RING_P_UP * \
+ MLX4_EN_NUM_UP)
/* Target number of packets to coalesce with interrupt moderation */
#define MLX4_EN_RX_COAL_TARGET 44
@@ -476,6 +479,7 @@ struct mlx4_en_priv {
u32 flags;
#define MLX4_EN_FLAG_PROMISC 0x1
#define MLX4_EN_FLAG_MC_PROMISC 0x2
+ u8 num_tx_rings_p_up;
u32 tx_ring_num;
u32 rx_ring_num;
u32 rx_skb_size;
@@ -596,6 +600,8 @@ int mlx4_en_QUERY_PORT(struct mlx4_en_dev *mdev, u8 port);
extern const struct dcbnl_rtnl_ops mlx4_en_dcbnl_ops;
#endif
+int mlx4_en_setup_tc(struct net_device *dev, u8 up);
+
#ifdef CONFIG_RFS_ACCEL
void mlx4_en_cleanup_filters(struct mlx4_en_priv *priv,
struct mlx4_en_rx_ring *rx_ring);
--
1.7.8.2
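The relationship between the ethtool channel counts and the ring layout can be sketched as follows. This is a user-space illustration with hypothetical helper names (`total_tx_rings()`, `tc_queue_offset()`); the value 8 for the number of user priorities and 128 for the RX limit are taken from the patch, while the per-priority TX limit of 32 is an assumption, since MLX4_EN_MAX_TX_RING_P_UP is not defined in the hunks shown.

```c
#define NUM_UP			8	/* MLX4_EN_NUM_UP in the patch */
#define MAX_RX_RINGS_MODEL	128	/* MAX_RX_RINGS in the patch */
#define MAX_TX_RINGS_PER_UP	32	/* assumed value, not shown in the patch */

/* Return the total number of TX rings for a request, or -1 if the
 * request would be rejected (mirrors the -EINVAL checks in the patch). */
static int total_tx_rings(unsigned int tx_count, unsigned int rx_count)
{
	if (!tx_count || !rx_count ||
	    tx_count > MAX_TX_RINGS_PER_UP || rx_count > MAX_RX_RINGS_MODEL)
		return -1;
	return (int)(tx_count * NUM_UP);
}

/* First TX queue belonging to user priority `up` (0 <= up < NUM_UP),
 * when queues are partitioned evenly as in mlx4_en_setup_tc(). */
static unsigned int tc_queue_offset(unsigned int tx_count, unsigned int up)
{
	return up * tx_count;
}
```

Under these assumptions, a request such as `ethtool -L eth0 rx 8 tx 4` would yield 4 * 8 = 32 TX rings in total, with user priority 3 starting at queue 12.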
* [PATCH net-next v2 1/3] MAINTAINERS: Add Mellanox ethernet driver - mlx4_en
From: Amir Vadai @ 2012-12-02 13:49 UTC
To: David S. Miller
Cc: Ben Hutchings, Amir Vadai, Or Gerlitz, Oren Duer, netdev,
Yevgeny Petrilin
In-Reply-To: <1354456163-10497-1-git-send-email-amirv@mellanox.com>
Set mlx4_en maintainer to Amir Vadai instead of Yevgeny Petrilin.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Cc: Yevgeny Petrilin <yevgenyp@mellanox.com>
---
MAINTAINERS | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 5d72dd5..3c6e8cd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4822,6 +4822,14 @@ F: Documentation/scsi/megaraid.txt
F: drivers/scsi/megaraid.*
F: drivers/scsi/megaraid/
+MELLANOX ETHERNET DRIVER (mlx4_en)
+M: Amir Vadai <amirv@mellanox.com>
+L: netdev@vger.kernel.org
+S: Supported
+W: http://www.mellanox.com
+Q: http://patchwork.ozlabs.org/project/netdev/list/
+F: drivers/net/ethernet/mellanox/mlx4/en_*
+
MEMORY MANAGEMENT
L: linux-mm@kvack.org
W: http://www.linux-mm.org
--
1.7.8.2
* Re: [patch] p54: potential signedness issue in p54_parse_rssical()
From: Christian Lamparter @ 2012-12-02 13:54 UTC
To: Dan Carpenter
Cc: John W. Linville, linux-wireless, netdev, linux-kernel,
kernel-janitors
In-Reply-To: <20121202103609.GA16078@elgon.mountain>
On Sunday 02 December 2012 11:36:09 Dan Carpenter wrote:
> "entries" is unsigned here, so it is never less than zero. In theory,
> len could be less than offset so I have added a check for that.
>
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Christian Lamparter <chunkeey@googlemail.com>
> diff --git a/drivers/net/wireless/p54/eeprom.c b/drivers/net/wireless/p54/eeprom.c
> index 1ef1bfe..d43e374 100644
> --- a/drivers/net/wireless/p54/eeprom.c
> +++ b/drivers/net/wireless/p54/eeprom.c
> @@ -541,8 +541,9 @@ static int p54_parse_rssical(struct ieee80211_hw *dev,
> entries = (len - offset) /
> sizeof(struct pda_rssi_cal_ext_entry);
>
> - if ((len - offset) % sizeof(struct pda_rssi_cal_ext_entry) ||
> - entries <= 0) {
> + if (len < offset ||
> + (len - offset) % sizeof(struct pda_rssi_cal_ext_entry) ||
> + entries == 0) {
> wiphy_err(dev->wiphy, "invalid rssi database.\n");
> goto err_data;
> }
>
* [PATCH net-next 01/13] bnx2x: revised and corrected SPIO access
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
Changed naming convention of SPIO macros, and prevented access to invalid SPIOs.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 43 ++++++++++------------
drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h | 10 +++++
2 files changed, 30 insertions(+), 23 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index b4659c4..5a22e19 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -2032,40 +2032,39 @@ int bnx2x_set_gpio_int(struct bnx2x *bp, int gpio_num, u32 mode, u8 port)
return 0;
}
-static int bnx2x_set_spio(struct bnx2x *bp, int spio_num, u32 mode)
+static int bnx2x_set_spio(struct bnx2x *bp, int spio, u32 mode)
{
- u32 spio_mask = (1 << spio_num);
u32 spio_reg;
- if ((spio_num < MISC_REGISTERS_SPIO_4) ||
- (spio_num > MISC_REGISTERS_SPIO_7)) {
- BNX2X_ERR("Invalid SPIO %d\n", spio_num);
+ /* Only 2 SPIOs are configurable */
+ if ((spio != MISC_SPIO_SPIO4) && (spio != MISC_SPIO_SPIO5)) {
+ BNX2X_ERR("Invalid SPIO 0x%x\n", spio);
return -EINVAL;
}
bnx2x_acquire_hw_lock(bp, HW_LOCK_RESOURCE_SPIO);
/* read SPIO and mask except the float bits */
- spio_reg = (REG_RD(bp, MISC_REG_SPIO) & MISC_REGISTERS_SPIO_FLOAT);
+ spio_reg = (REG_RD(bp, MISC_REG_SPIO) & MISC_SPIO_FLOAT);
switch (mode) {
- case MISC_REGISTERS_SPIO_OUTPUT_LOW:
- DP(NETIF_MSG_HW, "Set SPIO %d -> output low\n", spio_num);
+ case MISC_SPIO_OUTPUT_LOW:
+ DP(NETIF_MSG_HW, "Set SPIO 0x%x -> output low\n", spio);
/* clear FLOAT and set CLR */
- spio_reg &= ~(spio_mask << MISC_REGISTERS_SPIO_FLOAT_POS);
- spio_reg |= (spio_mask << MISC_REGISTERS_SPIO_CLR_POS);
+ spio_reg &= ~(spio << MISC_SPIO_FLOAT_POS);
+ spio_reg |= (spio << MISC_SPIO_CLR_POS);
break;
- case MISC_REGISTERS_SPIO_OUTPUT_HIGH:
- DP(NETIF_MSG_HW, "Set SPIO %d -> output high\n", spio_num);
+ case MISC_SPIO_OUTPUT_HIGH:
+ DP(NETIF_MSG_HW, "Set SPIO 0x%x -> output high\n", spio);
/* clear FLOAT and set SET */
- spio_reg &= ~(spio_mask << MISC_REGISTERS_SPIO_FLOAT_POS);
- spio_reg |= (spio_mask << MISC_REGISTERS_SPIO_SET_POS);
+ spio_reg &= ~(spio << MISC_SPIO_FLOAT_POS);
+ spio_reg |= (spio << MISC_SPIO_SET_POS);
break;
- case MISC_REGISTERS_SPIO_INPUT_HI_Z:
- DP(NETIF_MSG_HW, "Set SPIO %d -> input\n", spio_num);
+ case MISC_SPIO_INPUT_HI_Z:
+ DP(NETIF_MSG_HW, "Set SPIO 0x%x -> input\n", spio);
/* set FLOAT */
- spio_reg |= (spio_mask << MISC_REGISTERS_SPIO_FLOAT_POS);
+ spio_reg |= (spio << MISC_SPIO_FLOAT_POS);
break;
default:
@@ -6196,18 +6195,16 @@ static void bnx2x_setup_fan_failure_detection(struct bnx2x *bp)
return;
/* Fan failure is indicated by SPIO 5 */
- bnx2x_set_spio(bp, MISC_REGISTERS_SPIO_5,
- MISC_REGISTERS_SPIO_INPUT_HI_Z);
+ bnx2x_set_spio(bp, MISC_SPIO_SPIO5, MISC_SPIO_INPUT_HI_Z);
/* set to active low mode */
val = REG_RD(bp, MISC_REG_SPIO_INT);
- val |= ((1 << MISC_REGISTERS_SPIO_5) <<
- MISC_REGISTERS_SPIO_INT_OLD_SET_POS);
+ val |= (MISC_SPIO_SPIO5 << MISC_SPIO_INT_OLD_SET_POS);
REG_WR(bp, MISC_REG_SPIO_INT, val);
/* enable interrupt to signal the IGU */
val = REG_RD(bp, MISC_REG_SPIO_EVENT_EN);
- val |= (1 << MISC_REGISTERS_SPIO_5);
+ val |= MISC_SPIO_SPIO5;
REG_WR(bp, MISC_REG_SPIO_EVENT_EN, val);
}
@@ -6969,7 +6966,7 @@ static int bnx2x_init_hw_port(struct bnx2x *bp)
/* If SPIO5 is set to generate interrupts, enable it for this port */
val = REG_RD(bp, MISC_REG_SPIO_EVENT_EN);
- if (val & (1 << MISC_REGISTERS_SPIO_5)) {
+ if (val & MISC_SPIO_SPIO5) {
u32 reg_addr = (port ? MISC_REG_AEU_ENABLE1_FUNC_1_OUT_0 :
MISC_REG_AEU_ENABLE1_FUNC_0_OUT_0);
val = REG_RD(bp, reg_addr);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
index f8d432a..87cf37c 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
@@ -5942,6 +5942,16 @@
#define MISC_REGISTERS_SPIO_OUTPUT_HIGH 1
#define MISC_REGISTERS_SPIO_OUTPUT_LOW 0
#define MISC_REGISTERS_SPIO_SET_POS 8
+#define MISC_SPIO_CLR_POS 16
+#define MISC_SPIO_FLOAT (0xffL<<24)
+#define MISC_SPIO_FLOAT_POS 24
+#define MISC_SPIO_INPUT_HI_Z 2
+#define MISC_SPIO_INT_OLD_SET_POS 16
+#define MISC_SPIO_OUTPUT_HIGH 1
+#define MISC_SPIO_OUTPUT_LOW 0
+#define MISC_SPIO_SET_POS 8
+#define MISC_SPIO_SPIO4 0x10
+#define MISC_SPIO_SPIO5 0x20
#define HW_LOCK_MAX_RESOURCE_VALUE 31
#define HW_LOCK_RESOURCE_DCBX_ADMIN_MIB 13
#define HW_LOCK_RESOURCE_DRV_FLAGS 10
--
1.7.1
* [PATCH net-next 00/13] bnx2x: net-next patch series
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Yuval Mintz
Hi Dave,
This patch series contains several small changes to the bnx2x driver,
including dcb changes, graceful error handling and benign error masking,
and setting the driver's configuration according to management/nvram
held values.
Please consider applying these patches to 'net-next'.
Thanks,
Yuval Mintz
* [PATCH net-next 03/13] bnx2x: Management can control PFC/ETS
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Barak Witkowski, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
From: Barak Witkowski <barak@broadcom.com>
If configured for PFC/ETS by management, configure the chip regardless of
the presence of a remote peer which supports DCBX.
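The resulting condition can be sketched as a small standalone predicate.
This is only an illustration of the check the patch adds to
bnx2x_pfc_set_pfc(); struct dcb_state and should_configure_pfc() are
hypothetical names, not driver code, and the fields merely stand in for
the driver state named in the comments.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state consulted by the patch. */
struct dcb_state {
	bool pfc_enabled;      /* bp->dcbx_port_params.pfc.enabled */
	bool remote_mib_error; /* bp->dcbx_error & DCBX_REMOTE_MIB_ERROR */
	bool mfw_configured;   /* DRV_FLAGS_DCB_MFW_CONFIGURED bit in drv_flags */
};

/* PFC is programmed when it is enabled and either no remote MIB error
 * was recorded or the MFW-configured flag is set.
 */
static bool should_configure_pfc(const struct dcb_state *s)
{
	return s->pfc_enabled &&
	       (!s->remote_mib_error || s->mfw_configured);
}
```

The ETS and firmware-structure paths in the diff apply the same idea in
negated form: they bail out only when the remote MIB error is set and
the MFW-configured flag is clear.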
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 2 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c | 21 +++++++++++++++++----
drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h | 7 ++++++-
3 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 8779ac1..e95174d 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -2426,7 +2426,7 @@ int bnx2x_nic_load(struct bnx2x *bp, int load_mode)
}
if (bp->port.pmf)
- bnx2x_update_drv_flags(bp, 1 << DRV_FLAGS_DCB_CONFIGURED, 0);
+ bnx2x_update_drv_flags(bp, 1 << DRV_FLAGS_PORT_MASK, 0);
else
bnx2x__link_status_update(bp);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c
index cba4a16..c0d9b69 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c
@@ -413,8 +413,11 @@ static int bnx2x_dcbx_read_mib(struct bnx2x *bp,
static void bnx2x_pfc_set_pfc(struct bnx2x *bp)
{
+ int mfw_configured = SHMEM2_HAS(bp, drv_flags) &&
+ GET_FLAGS(SHMEM2_RD(bp, drv_flags),
+ 1 << DRV_FLAGS_DCB_MFW_CONFIGURED);
if (bp->dcbx_port_params.pfc.enabled &&
- !(bp->dcbx_error & DCBX_REMOTE_MIB_ERROR))
+ (!(bp->dcbx_error & DCBX_REMOTE_MIB_ERROR) || mfw_configured))
/*
* 1. Fills up common PFC structures if required
* 2. Configure NIG, MAC and BRB via the elink
@@ -552,10 +555,13 @@ static void bnx2x_dcbx_update_ets_config(struct bnx2x *bp)
static void bnx2x_dcbx_update_ets_params(struct bnx2x *bp)
{
+ int mfw_configured = SHMEM2_HAS(bp, drv_flags) &&
+ GET_FLAGS(SHMEM2_RD(bp, drv_flags),
+ 1 << DRV_FLAGS_DCB_MFW_CONFIGURED);
bnx2x_ets_disabled(&bp->link_params, &bp->link_vars);
if (!bp->dcbx_port_params.ets.enabled ||
- (bp->dcbx_error & DCBX_REMOTE_MIB_ERROR))
+ ((bp->dcbx_error & DCBX_REMOTE_MIB_ERROR) && !mfw_configured))
return;
if (CHIP_IS_E3B0(bp))
@@ -1802,11 +1808,14 @@ static void bnx2x_dcbx_fw_struct(struct bnx2x *bp,
u8 cos = 0, pri = 0;
struct priority_cos *tt2cos;
u32 *ttp = bp->dcbx_port_params.app.traffic_type_priority;
+ int mfw_configured = SHMEM2_HAS(bp, drv_flags) &&
+ GET_FLAGS(SHMEM2_RD(bp, drv_flags),
+ 1 << DRV_FLAGS_DCB_MFW_CONFIGURED);
memset(pfc_fw_cfg, 0, sizeof(*pfc_fw_cfg));
/* to disable DCB - the structure must be zeroed */
- if (bp->dcbx_error & DCBX_REMOTE_MIB_ERROR)
+ if ((bp->dcbx_error & DCBX_REMOTE_MIB_ERROR) && !mfw_configured)
return;
/*shortcut*/
@@ -2073,8 +2082,12 @@ static u8 bnx2x_dcbnl_set_all(struct net_device *netdev)
"Handling parity error recovery. Try again later\n");
return 1;
}
- if (netif_running(bp->dev))
+ if (netif_running(bp->dev)) {
+ bnx2x_update_drv_flags(bp,
+ 1 << DRV_FLAGS_DCB_MFW_CONFIGURED,
+ 1);
bnx2x_dcbx_init(bp, true);
+ }
DP(BNX2X_MSG_DCB, "set_dcbx_params done (%d)\n", rc);
if (rc)
return 1;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
index 1504e0a..9a51d49 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
@@ -2088,8 +2088,13 @@ struct shmem2_region {
/* generic flags controlled by the driver */
u32 drv_flags;
- #define DRV_FLAGS_DCB_CONFIGURED 0x1
+ #define DRV_FLAGS_DCB_CONFIGURED 0x0
+ #define DRV_FLAGS_DCB_CONFIGURATION_ABORTED 0x1
+ #define DRV_FLAGS_DCB_MFW_CONFIGURED 0x2
+ #define DRV_FLAGS_PORT_MASK ((1 << DRV_FLAGS_DCB_CONFIGURED) | \
+ (1 << DRV_FLAGS_DCB_CONFIGURATION_ABORTED) | \
+ (1 << DRV_FLAGS_DCB_MFW_CONFIGURED))
/* pointer to extended dev_info shared data copied from nvm image */
u32 extended_dev_info_shared_addr;
u32 ncsi_oem_data_addr;
--
1.7.1
* [PATCH net-next 02/13] bnx2x: parity recovery flow enhancement
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Barak Witkowski, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
From: Barak Witkowski <barak@broadcom.com>
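Parity recovery is enhanced to handle a few more corner cases: the
validity map offset is taken for the current port rather than always for
port 0, per-port MAC blocks (EMAC, BMAC, UMAC) are no longer reset since
on 4-port devices the MCP may still own them, and on E3 chips the
process-kill loop also waits for PGLUE_B_REG_TAGS_63_32 to read
0xffffffff.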
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 33 +++++++++++++--------
1 files changed, 20 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 5a22e19..62fcf0f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -8720,7 +8720,8 @@ static void bnx2x_reset_mcp_prep(struct bnx2x *bp, u32 *magic_val)
/* Get shmem offset */
shmem = REG_RD(bp, MISC_REG_SHARED_MEM_ADDR);
- validity_offset = offsetof(struct shmem_region, validity_map[0]);
+ validity_offset =
+ offsetof(struct shmem_region, validity_map[BP_PORT(bp)]);
/* Clear validity map flags */
if (shmem > 0)
@@ -8813,7 +8814,11 @@ static void bnx2x_process_kill_chip_reset(struct bnx2x *bp, bool global)
MISC_REGISTERS_RESET_REG_2_RST_MCP_N_RESET_CMN_CPU |
MISC_REGISTERS_RESET_REG_2_RST_MCP_N_RESET_CMN_CORE;
- /* Don't reset the following blocks */
+ /* Don't reset the following blocks.
+ * Important: per port blocks (such as EMAC, BMAC, UMAC) can't be
+ * reset, as in 4 port device they might still be owned
+ * by the MCP (there is only one leader per path).
+ */
not_reset_mask1 =
MISC_REGISTERS_RESET_REG_1_RST_HC |
MISC_REGISTERS_RESET_REG_1_RST_PXPV |
@@ -8829,19 +8834,19 @@ static void bnx2x_process_kill_chip_reset(struct bnx2x *bp, bool global)
MISC_REGISTERS_RESET_REG_2_RST_MCP_N_RESET_REG_HARD_CORE |
MISC_REGISTERS_RESET_REG_2_RST_MCP_N_HARD_CORE_RST_B |
MISC_REGISTERS_RESET_REG_2_RST_ATC |
- MISC_REGISTERS_RESET_REG_2_PGLC;
+ MISC_REGISTERS_RESET_REG_2_PGLC |
+ MISC_REGISTERS_RESET_REG_2_RST_BMAC0 |
+ MISC_REGISTERS_RESET_REG_2_RST_BMAC1 |
+ MISC_REGISTERS_RESET_REG_2_RST_EMAC0 |
+ MISC_REGISTERS_RESET_REG_2_RST_EMAC1 |
+ MISC_REGISTERS_RESET_REG_2_UMAC0 |
+ MISC_REGISTERS_RESET_REG_2_UMAC1;
/*
* Keep the following blocks in reset:
* - all xxMACs are handled by the bnx2x_link code.
*/
stay_reset2 =
- MISC_REGISTERS_RESET_REG_2_RST_BMAC0 |
- MISC_REGISTERS_RESET_REG_2_RST_BMAC1 |
- MISC_REGISTERS_RESET_REG_2_RST_EMAC0 |
- MISC_REGISTERS_RESET_REG_2_RST_EMAC1 |
- MISC_REGISTERS_RESET_REG_2_UMAC0 |
- MISC_REGISTERS_RESET_REG_2_UMAC1 |
MISC_REGISTERS_RESET_REG_2_XMAC |
MISC_REGISTERS_RESET_REG_2_XMAC_SOFT;
@@ -8931,6 +8936,7 @@ static int bnx2x_process_kill(struct bnx2x *bp, bool global)
int cnt = 1000;
u32 val = 0;
u32 sr_cnt, blk_cnt, port_is_idle_0, port_is_idle_1, pgl_exp_rom2;
+ u32 tags_63_32 = 0;
/* Empty the Tetris buffer, wait for 1s */
@@ -8940,10 +8946,14 @@ static int bnx2x_process_kill(struct bnx2x *bp, bool global)
port_is_idle_0 = REG_RD(bp, PXP2_REG_RD_PORT_IS_IDLE_0);
port_is_idle_1 = REG_RD(bp, PXP2_REG_RD_PORT_IS_IDLE_1);
pgl_exp_rom2 = REG_RD(bp, PXP2_REG_PGL_EXP_ROM2);
+ if (CHIP_IS_E3(bp))
+ tags_63_32 = REG_RD(bp, PGLUE_B_REG_TAGS_63_32);
+
if ((sr_cnt == 0x7e) && (blk_cnt == 0xa0) &&
((port_is_idle_0 & 0x1) == 0x1) &&
((port_is_idle_1 & 0x1) == 0x1) &&
- (pgl_exp_rom2 == 0xffffffff))
+ (pgl_exp_rom2 == 0xffffffff) &&
+ (!CHIP_IS_E3(bp) || (tags_63_32 == 0xffffffff)))
break;
usleep_range(1000, 1000);
} while (cnt-- > 0);
@@ -9000,9 +9010,6 @@ static int bnx2x_process_kill(struct bnx2x *bp, bool global)
/* TBD: Add resetting the NO_MCP mode DB here */
- /* PXP */
- bnx2x_pxp_prep(bp);
-
/* Open the gates #2, #3 and #4 */
bnx2x_set_234_gates(bp, false);
--
1.7.1
* [PATCH net-next 05/13] bnx2x: Correct advertised speed/duplex
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
If the link is down due to management (and not because the actual PHY link
was lost), the driver should still behave as if the link is down: querying
speed/duplex state via ethtool should return 'UNKNOWN' (the same behaviour
as when the link is actually down).
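A minimal sketch of the reporting rule, written as standalone C rather
than driver code. struct link_view, struct link_report, report_link()
and the SKETCH_* constants are hypothetical; the multi-function speed
override done via bnx2x_get_mf_speed() in the real function is left out.

```c
#include <stdbool.h>

/* Illustrative placeholders, not the ethtool constants themselves. */
#define SKETCH_SPEED_UNKNOWN  (-1)
#define SKETCH_DUPLEX_UNKNOWN 0xff

struct link_report {
	int speed;
	int duplex;
};

struct link_view {
	bool state_open;    /* bp->state == BNX2X_STATE_OPEN */
	bool link_up;       /* bp->link_vars.link_up */
	bool func_disabled; /* bp->flags & MF_FUNC_DIS */
	int line_speed;     /* bp->link_vars.line_speed */
	int duplex;         /* bp->link_vars.duplex */
};

/* Real values are reported only when the device is open, the link is up
 * and the function is not disabled; every other case reports UNKNOWN.
 */
static struct link_report report_link(const struct link_view *v)
{
	struct link_report r = { SKETCH_SPEED_UNKNOWN, SKETCH_DUPLEX_UNKNOWN };

	if (v->state_open && v->link_up && !v->func_disabled) {
		r.speed = v->line_speed;
		r.duplex = v->duplex;
	}
	return r;
}
```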
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
.../net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c | 12 ++++--------
1 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index c7270c0..277f17e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -231,18 +231,14 @@ static int bnx2x_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
cmd->advertising &= ~(ADVERTISED_10000baseT_Full);
}
- if ((bp->state == BNX2X_STATE_OPEN) && (bp->link_vars.link_up)) {
- if (!(bp->flags & MF_FUNC_DIS)) {
- ethtool_cmd_speed_set(cmd, bp->link_vars.line_speed);
+ if ((bp->state == BNX2X_STATE_OPEN) && bp->link_vars.link_up &&
+ !(bp->flags & MF_FUNC_DIS)) {
cmd->duplex = bp->link_vars.duplex;
- } else {
- ethtool_cmd_speed_set(
- cmd, bp->link_params.req_line_speed[cfg_idx]);
- cmd->duplex = bp->link_params.req_duplex[cfg_idx];
- }
if (IS_MF(bp) && !BP_NOMCP(bp))
ethtool_cmd_speed_set(cmd, bnx2x_get_mf_speed(bp));
+ else
+ ethtool_cmd_speed_set(cmd, bp->link_vars.line_speed);
} else {
cmd->duplex = DUPLEX_UNKNOWN;
ethtool_cmd_speed_set(cmd, SPEED_UNKNOWN);
--
1.7.1
* [PATCH net-next 04/13] bnx2x: Filter packets on FCoE rings
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Dmitry Kravkov, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
From: Dmitry Kravkov <dmitry@broadcom.com>
Whenever bnx2x fails to transmit a packet due to a full Tx ring, if the
ring size is zero (indicating an FCoE ring), the driver filters the packet
out and gracefully continues.
The driver also gathers statistics on such filtered packets.
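The decision taken on a full ring can be sketched as follows. This is a
simplified standalone model, not the driver's xmit path: struct tx_ring,
enum tx_verdict and tx_try() are hypothetical names, and the skb
handling and queue stopping of the real code are reduced to counters.

```c
#include <stdbool.h>

enum tx_verdict {
	TX_SENT,     /* descriptors were available */
	TX_FILTERED, /* dropped on a zero-sized (FCoE) ring */
	TX_BUSY      /* regular ring is full, caller should retry */
};

/* Hypothetical model of the per-queue state the patch looks at. */
struct tx_ring {
	unsigned int ring_size;  /* 0 marks an FCoE ring per the description */
	unsigned int avail;      /* free descriptors */
	unsigned long filtered;  /* models driver_filtered_tx_pkt */
	unsigned long xoff;      /* models driver_xoff */
};

static enum tx_verdict tx_try(struct tx_ring *r, unsigned int needed)
{
	if (r->avail >= needed) {
		r->avail -= needed;
		return TX_SENT;
	}
	/* Full ring: a zero-sized ring filters the packet and counts it. */
	if (r->ring_size == 0) {
		r->filtered++;
		return TX_FILTERED;
	}
	/* Otherwise account the stop and report busy. */
	r->xoff++;
	return TX_BUSY;
}
```

In the patch the filtered case frees the skb and returns NETDEV_TX_OK,
so the stack does not retry the packet.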
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 11 ++++++++---
.../net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c | 6 +++++-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c | 1 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h | 3 +++
4 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index e95174d..5e07aa5 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -3127,11 +3127,16 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
BDS_PER_TX_PKT +
NEXT_CNT_PER_TX_PKT(MAX_BDS_PER_TX_PKT))) {
/* Handle special storage cases separately */
- if (txdata->tx_ring_size != 0) {
- BNX2X_ERR("BUG! Tx ring full when queue awake!\n");
+ if (txdata->tx_ring_size == 0) {
+ struct bnx2x_eth_q_stats *q_stats =
+ bnx2x_fp_qstats(bp, txdata->parent_fp);
+ q_stats->driver_filtered_tx_pkt++;
+ dev_kfree_skb(skb);
+ return NETDEV_TX_OK;
+ }
bnx2x_fp_qstats(bp, txdata->parent_fp)->driver_xoff++;
netif_tx_stop_queue(txq);
- }
+ BNX2X_ERR("BUG! Tx ring full when queue awake!\n");
return NETDEV_TX_BUSY;
}
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index e05f981..c7270c0 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -62,7 +62,9 @@ static const struct {
8, "[%s]: tpa_aggregations" },
{ Q_STATS_OFFSET32(total_tpa_aggregated_frames_hi),
8, "[%s]: tpa_aggregated_frames"},
- { Q_STATS_OFFSET32(total_tpa_bytes_hi), 8, "[%s]: tpa_bytes"}
+ { Q_STATS_OFFSET32(total_tpa_bytes_hi), 8, "[%s]: tpa_bytes"},
+ { Q_STATS_OFFSET32(driver_filtered_tx_pkt),
+ 4, "[%s]: driver_filtered_tx_pkt" }
};
#define BNX2X_NUM_Q_STATS ARRAY_SIZE(bnx2x_q_stats_arr)
@@ -177,6 +179,8 @@ static const struct {
4, STATS_FLAGS_FUNC, "recoverable_errors" },
{ STATS_OFFSET32(unrecoverable_error),
4, STATS_FLAGS_FUNC, "unrecoverable_errors" },
+ { STATS_OFFSET32(driver_filtered_tx_pkt),
+ 4, STATS_FLAGS_FUNC, "driver_filtered_tx_pkt" },
{ STATS_OFFSET32(eee_tx_lpi),
4, STATS_FLAGS_PORT, "Tx LPI entry count"}
};
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c
index 348ed02..89ec066 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c
@@ -1149,6 +1149,7 @@ static void bnx2x_drv_stats_update(struct bnx2x *bp)
UPDATE_ESTAT_QSTAT(rx_err_discard_pkt);
UPDATE_ESTAT_QSTAT(rx_skb_alloc_failed);
UPDATE_ESTAT_QSTAT(hw_csum_err);
+ UPDATE_ESTAT_QSTAT(driver_filtered_tx_pkt);
}
}
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h
index 24b8e50..b4d7b26 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h
@@ -203,6 +203,7 @@ struct bnx2x_eth_stats {
/* Recovery */
u32 recoverable_error;
u32 unrecoverable_error;
+ u32 driver_filtered_tx_pkt;
/* src: Clear-on-Read register; Will not survive PMF Migration */
u32 eee_tx_lpi;
};
@@ -264,6 +265,7 @@ struct bnx2x_eth_q_stats {
u32 total_tpa_aggregated_frames_lo;
u32 total_tpa_bytes_hi;
u32 total_tpa_bytes_lo;
+ u32 driver_filtered_tx_pkt;
};
struct bnx2x_eth_stats_old {
@@ -315,6 +317,7 @@ struct bnx2x_eth_q_stats_old {
u32 rx_err_discard_pkt_old;
u32 rx_skb_alloc_failed_old;
u32 hw_csum_err_old;
+ u32 driver_filtered_tx_pkt_old;
};
struct bnx2x_net_stats_old {
--
1.7.1
* [PATCH net-next 06/13] bnx2x: nvram enables dropless flow control
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
It is now possible to enable dropless flow control via nvram.
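As a rough sketch of how the two sources combine (standalone C, with
hypothetical names dropless_enabled() and SKETCH_PAUSE_ON_HOST_RING):
the module parameter and the nvram bit are ORed together, and E1 chips
never enable the feature. Only the per-port configuration word is
modelled; the multi-function case, which reads a different
configuration word in the patch, is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as PORT_HW_CFG_PAUSE_ON_HOST_RING_ENABLED in the patch. */
#define SKETCH_PAUSE_ON_HOST_RING 0x00000001u

/* Dropless flow control is on if either the module parameter or the
 * nvram-held generic_features bit requests it, except on E1 chips.
 */
static bool dropless_enabled(bool is_e1, bool module_param,
			     uint32_t generic_features)
{
	if (is_e1)
		return false;
	return module_param ||
	       (generic_features & SKETCH_PAUSE_ON_HOST_RING) != 0;
}
```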
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x.h | 2 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h | 13 +++++++++++--
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 20 +++++++++++++++++++-
3 files changed, 31 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 641d884..03647bf 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -1488,7 +1488,7 @@ struct bnx2x {
int qm_cid_count;
- int dropless_fc;
+ bool dropless_fc;
void *t2;
dma_addr_t t2_mapping;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
index 9a51d49..3369a50 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
@@ -500,7 +500,15 @@ struct port_hw_cfg { /* port 0: 0x12c port 1: 0x2bc */
u32 e3_cmn_pin_cfg1; /* 0x170 */
#define PORT_HW_CFG_E3_OVER_CURRENT_MASK 0x000000FF
#define PORT_HW_CFG_E3_OVER_CURRENT_SHIFT 0
- u32 reserved0[7]; /* 0x174 */
+
+ /* pause on host ring */
+ u32 generic_features; /* 0x174 */
+ #define PORT_HW_CFG_PAUSE_ON_HOST_RING_MASK 0x00000001
+ #define PORT_HW_CFG_PAUSE_ON_HOST_RING_SHIFT 0
+ #define PORT_HW_CFG_PAUSE_ON_HOST_RING_DISABLED 0x00000000
+ #define PORT_HW_CFG_PAUSE_ON_HOST_RING_ENABLED 0x00000001
+
+ u32 reserved0[6]; /* 0x178 */
u32 aeu_int_mask; /* 0x190 */
@@ -1518,12 +1526,13 @@ enum mf_cfg_afex_vlan_mode {
/* This structure is not applicable and should not be accessed on 57711 */
struct func_ext_cfg {
u32 func_cfg;
- #define MACP_FUNC_CFG_FLAGS_MASK 0x000000FF
+ #define MACP_FUNC_CFG_FLAGS_MASK 0x0000007F
#define MACP_FUNC_CFG_FLAGS_SHIFT 0
#define MACP_FUNC_CFG_FLAGS_ENABLED 0x00000001
#define MACP_FUNC_CFG_FLAGS_ETHERNET 0x00000002
#define MACP_FUNC_CFG_FLAGS_ISCSI_OFFLOAD 0x00000004
#define MACP_FUNC_CFG_FLAGS_FCOE_OFFLOAD 0x00000008
+ #define MACP_FUNC_CFG_PAUSE_ON_HOST_RING 0x00000080
u32 iscsi_mac_addr_upper;
u32 iscsi_mac_addr_lower;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 62fcf0f..89b3d10 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -10641,8 +10641,26 @@ static void __devinit bnx2x_get_mac_hwinfo(struct bnx2x *bp)
"bad Ethernet MAC address configuration: %pM\n"
"change it manually before bringing up the appropriate network interface\n",
bp->dev->dev_addr);
+}
+static bool __devinit bnx2x_get_dropless_info(struct bnx2x *bp)
+{
+ int tmp;
+ u32 cfg;
+ if (IS_MF(bp) && !CHIP_IS_E1x(bp)) {
+ /* Take function: tmp = func */
+ tmp = BP_ABS_FUNC(bp);
+ cfg = MF_CFG_RD(bp, func_ext_config[tmp].func_cfg);
+ cfg = !!(cfg & MACP_FUNC_CFG_PAUSE_ON_HOST_RING);
+ } else {
+ /* Take port: tmp = port */
+ tmp = BP_PORT(bp);
+ cfg = SHMEM_RD(bp,
+ dev_info.port_hw_config[tmp].generic_features);
+ cfg = !!(cfg & PORT_HW_CFG_PAUSE_ON_HOST_RING_ENABLED);
+ }
+ return cfg;
}
static int __devinit bnx2x_get_hwinfo(struct bnx2x *bp)
@@ -11063,7 +11081,7 @@ static int __devinit bnx2x_init_bp(struct bnx2x *bp)
if (CHIP_IS_E1(bp))
bp->dropless_fc = 0;
else
- bp->dropless_fc = dropless_fc;
+ bp->dropless_fc = dropless_fc | bnx2x_get_dropless_info(bp);
bp->mrrs = mrrs;
--
1.7.1
* [PATCH net-next 08/13] bnx2x: IGU parse error cause probe failure
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Barak Witkowski, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
From: Barak Witkowski <barak@broadcom.com>
If an IGU parse error is encountered during the probing process, the error
now propagates and the probe fails gracefully (until now, such errors were
ignored, later causing mischief).
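The pattern is the usual one of returning a status from the parser and
releasing the lock before acting on it. A minimal standalone sketch,
with hypothetical names (struct probe_ctx, parse_igu_cam(),
get_hwinfo()) and a plain flag standing in for the hardware lock:

```c
#include <errno.h>

/* Hypothetical probe state: status-block count found in the CAM and a
 * flag modelling HW_LOCK_RESOURCE_RESET being held.
 */
struct probe_ctx {
	int igu_sb_cnt;
	int lock_held;
};

/* Report a configuration error instead of only logging it. */
static int parse_igu_cam(const struct probe_ctx *c)
{
	return c->igu_sb_cnt == 0 ? -EINVAL : 0;
}

static int get_hwinfo(struct probe_ctx *c)
{
	int rc;

	c->lock_held = 1;
	rc = parse_igu_cam(c);
	c->lock_held = 0; /* release on both the success and error paths */
	if (rc)
		return rc;

	/* ...the remaining hardware info would be read here... */
	return 0;
}
```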
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 17 ++++++++++++-----
1 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 571508d..5ff0bcb 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -9974,7 +9974,7 @@ static void __devinit bnx2x_get_common_hwinfo(struct bnx2x *bp)
#define IGU_FID(val) GET_FIELD((val), IGU_REG_MAPPING_MEMORY_FID)
#define IGU_VEC(val) GET_FIELD((val), IGU_REG_MAPPING_MEMORY_VECTOR)
-static void __devinit bnx2x_get_igu_cam_info(struct bnx2x *bp)
+static int __devinit bnx2x_get_igu_cam_info(struct bnx2x *bp)
{
int pfid = BP_FUNC(bp);
int igu_sb_id;
@@ -9991,7 +9991,7 @@ static void __devinit bnx2x_get_igu_cam_info(struct bnx2x *bp)
bp->igu_dsb_id = E1HVN_MAX * FP_SB_MAX_E1x +
(CHIP_MODE_IS_4_PORT(bp) ? pfid : vn);
- return;
+ return 0;
}
/* IGU in normal mode - read CAM */
@@ -10025,8 +10025,12 @@ static void __devinit bnx2x_get_igu_cam_info(struct bnx2x *bp)
bp->igu_sb_cnt = min_t(int, bp->igu_sb_cnt, igu_sb_cnt);
#endif
- if (igu_sb_cnt == 0)
+ if (igu_sb_cnt == 0) {
BNX2X_ERR("CAM configuration error\n");
+ return -EINVAL;
+ }
+
+ return 0;
}
static void __devinit bnx2x_link_settings_supported(struct bnx2x *bp,
@@ -10706,6 +10710,8 @@ static int __devinit bnx2x_get_hwinfo(struct bnx2x *bp)
if (REG_RD(bp, IGU_REG_RESET_MEMORIES)) {
dev_err(&bp->pdev->dev,
"FORCING Normal Mode failed!!!\n");
+ bnx2x_release_hw_lock(bp,
+ HW_LOCK_RESOURCE_RESET);
return -EPERM;
}
}
@@ -10716,9 +10722,10 @@ static int __devinit bnx2x_get_hwinfo(struct bnx2x *bp)
} else
BNX2X_DEV_INFO("IGU Normal Mode\n");
- bnx2x_get_igu_cam_info(bp);
-
+ rc = bnx2x_get_igu_cam_info(bp);
bnx2x_release_hw_lock(bp, HW_LOCK_RESOURCE_RESET);
+ if (rc)
+ return rc;
}
/*
--
1.7.1
* [PATCH net-next 07/13] bnx2x: Ext. config accessed only on non-E1x.
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 89b3d10..571508d 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -10484,8 +10484,9 @@ static void __devinit bnx2x_get_fcoe_info(struct bnx2x *bp)
if (BNX2X_MF_EXT_PROTOCOL_FCOE(bp) && !CHIP_IS_E1x(bp))
bnx2x_get_ext_wwn_info(bp, func);
- } else if (IS_MF_FCOE_SD(bp))
+ } else if (IS_MF_FCOE_SD(bp) && !CHIP_IS_E1x(bp)) {
bnx2x_get_ext_wwn_info(bp, func);
+ }
BNX2X_DEV_INFO("max_fcoe_conn 0x%x\n", bp->cnic_eth_dev.max_fcoe_conn);
--
1.7.1
* [PATCH net-next 10/13] bnx2x: Handle a rarely missed interrupt
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Yaniv Rosner, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
From: Yaniv Rosner <yaniv.rosner@broadcom.com>
In rare cases a missed interrupt may leave the link down, because of a
race condition between acknowledging the IGU via the BAR and restoring the
NIG interrupt mask via the GRC.
To solve it, we wait for the IGU ack command to finish prior to restoring the
NIG interrupt mask.
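The wait can be pictured as a bounded polling loop. The sketch below is
standalone C with hypothetical names (read_ack_fn, wait_for_ack(),
struct fake_reg, fake_read()); a fake register replaces the REG_RD() of
IGU_REG_ATTENTION_ACK_BITS. Unlike the patch, which checks the whole
register value after the loop, this sketch reports success by testing
the polled bit, which is a choice made here for clarity.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ACK_POLLS 100 /* mirrors MAX_IGU_ATTN_ACK_TO */

/* Hypothetical register accessor: returns the current ack bits. */
typedef uint32_t (*read_ack_fn)(void *ctx);

/* Poll until the requested bit is seen or the retry budget runs out. */
static bool wait_for_ack(read_ack_fn read_ack, void *ctx, uint32_t bit)
{
	uint32_t cnt = 0, acked;

	do {
		acked = read_ack(ctx);
	} while ((acked & bit) == 0 && ++cnt < MAX_ACK_POLLS);

	return (acked & bit) != 0;
}

/* Fake register for experimentation: the bit appears after a number
 * of reads.
 */
struct fake_reg {
	uint32_t reads;
	uint32_t ready_after;
	uint32_t bit;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *r = ctx;

	return ++r->reads >= r->ready_after ? r->bit : 0;
}
```

Only after the wait returns would the caller restore the interrupt mask,
which is the ordering the patch enforces.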
Signed-off-by: Yaniv Rosner <yaniv.rosner@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x.h | 1 +
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 15 +++++++++++++++
2 files changed, 16 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 03647bf..02ea644 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -915,6 +915,7 @@ struct bnx2x_common {
#define BNX2X_IGU_STAS_MSG_VF_CNT 64
#define BNX2X_IGU_STAS_MSG_PF_CNT 4
+#define MAX_IGU_ATTN_ACK_TO 100
/* end of common */
/* port */
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 983a0c8..d76ca90 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -3588,6 +3588,21 @@ static void bnx2x_attn_int_asserted(struct bnx2x *bp, u32 asserted)
/* now set back the mask */
if (asserted & ATTN_NIG_FOR_FUNC) {
+ /* Verify that IGU ack through BAR was written before restoring
+ * NIG mask. This loop should exit after 2-3 iterations max.
+ */
+ if (bp->common.int_block != INT_BLOCK_HC) {
+ u32 cnt = 0, igu_acked;
+ do {
+ igu_acked = REG_RD(bp,
+ IGU_REG_ATTENTION_ACK_BITS);
+ } while (((igu_acked & ATTN_NIG_FOR_FUNC) == 0) &&
+ (++cnt < MAX_IGU_ATTN_ACK_TO));
+ if (!igu_acked)
+ DP(NETIF_MSG_HW,
+ "Failed to verify IGU ack on time\n");
+ barrier();
+ }
REG_WR(bp, nig_int_mask_addr, nig_mask);
bnx2x_release_phy_lock(bp);
}
--
1.7.1
* [PATCH net-next 09/13] bnx2x: mask CPL_OF interrupt
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
The unmasked CPL_OF interrupt caused "FATAL HW block attention set2 0x20"
messages to appear erroneously, even though the associated interrupt is
fully recoverable.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 21 ++++++++++-----------
1 files changed, 10 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 5ff0bcb..983a0c8 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -6059,6 +6059,8 @@ static int bnx2x_int_mem_test(struct bnx2x *bp)
static void bnx2x_enable_blocks_attention(struct bnx2x *bp)
{
+ u32 val;
+
REG_WR(bp, PXP_REG_PXP_INT_MASK_0, 0);
if (!CHIP_IS_E1x(bp))
REG_WR(bp, PXP_REG_PXP_INT_MASK_1, 0x40);
@@ -6092,17 +6094,14 @@ static void bnx2x_enable_blocks_attention(struct bnx2x *bp)
/* REG_WR(bp, CSEM_REG_CSEM_INT_MASK_0, 0); */
/* REG_WR(bp, CSEM_REG_CSEM_INT_MASK_1, 0); */
- if (CHIP_REV_IS_FPGA(bp))
- REG_WR(bp, PXP2_REG_PXP2_INT_MASK_0, 0x580000);
- else if (!CHIP_IS_E1x(bp))
- REG_WR(bp, PXP2_REG_PXP2_INT_MASK_0,
- (PXP2_PXP2_INT_MASK_0_REG_PGL_CPL_OF
- | PXP2_PXP2_INT_MASK_0_REG_PGL_CPL_AFT
- | PXP2_PXP2_INT_MASK_0_REG_PGL_PCIE_ATTN
- | PXP2_PXP2_INT_MASK_0_REG_PGL_READ_BLOCKED
- | PXP2_PXP2_INT_MASK_0_REG_PGL_WRITE_BLOCKED));
- else
- REG_WR(bp, PXP2_REG_PXP2_INT_MASK_0, 0x480000);
+ val = PXP2_PXP2_INT_MASK_0_REG_PGL_CPL_AFT |
+ PXP2_PXP2_INT_MASK_0_REG_PGL_CPL_OF |
+ PXP2_PXP2_INT_MASK_0_REG_PGL_PCIE_ATTN;
+ if (!CHIP_IS_E1x(bp))
+ val |= PXP2_PXP2_INT_MASK_0_REG_PGL_READ_BLOCKED |
+ PXP2_PXP2_INT_MASK_0_REG_PGL_WRITE_BLOCKED;
+ REG_WR(bp, PXP2_REG_PXP2_INT_MASK_0, val);
+
REG_WR(bp, TSDM_REG_TSDM_INT_MASK_0, 0);
REG_WR(bp, TSDM_REG_TSDM_INT_MASK_1, 0);
REG_WR(bp, TCM_REG_TCM_INT_MASK, 0);
--
1.7.1
* [PATCH net-next 11/13] bnx2x: prevent DCB if disabled in nvram
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Barak Witkowski, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
From: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c | 5 +++++
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 11 +++++++++--
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c
index c0d9b69..c8c0340 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c
@@ -1904,6 +1904,11 @@ static u8 bnx2x_dcbnl_set_state(struct net_device *netdev, u8 state)
struct bnx2x *bp = netdev_priv(netdev);
DP(BNX2X_MSG_DCB, "state = %s\n", state ? "on" : "off");
+ if (state && ((bp->dcbx_enabled == BNX2X_DCBX_ENABLED_OFF) ||
+ (bp->dcbx_enabled == BNX2X_DCBX_ENABLED_INVALID))) {
+ DP(BNX2X_MSG_DCB, "Can not set dcbx to enabled while it is disabled in nvm\n");
+ return 1;
+ }
bnx2x_dcbx_set_state(bp, (state ? true : false), bp->dcbx_enabled);
return 0;
}
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index d76ca90..ab65f34 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -11120,8 +11120,15 @@ static int __devinit bnx2x_init_bp(struct bnx2x *bp)
bp->timer.data = (unsigned long) bp;
bp->timer.function = bnx2x_timer;
- bnx2x_dcbx_set_state(bp, true, BNX2X_DCBX_ENABLED_ON_NEG_ON);
- bnx2x_dcbx_init_params(bp);
+ if (SHMEM2_HAS(bp, dcbx_lldp_params_offset) &&
+ SHMEM2_HAS(bp, dcbx_lldp_dcbx_stat_offset) &&
+ SHMEM2_RD(bp, dcbx_lldp_params_offset) &&
+ SHMEM2_RD(bp, dcbx_lldp_dcbx_stat_offset)) {
+ bnx2x_dcbx_set_state(bp, true, BNX2X_DCBX_ENABLED_ON_NEG_ON);
+ bnx2x_dcbx_init_params(bp);
+ } else {
+ bnx2x_dcbx_set_state(bp, false, BNX2X_DCBX_ENABLED_OFF);
+ }
if (CHIP_IS_E1x(bp))
bp->cnic_base_cl_id = FP_SB_MAX_E1x;
--
1.7.1
* [PATCH net-next 12/13] bnx2x: fix 'Ethtool -A' when autoneg
From: Yuval Mintz @ 2012-12-02 14:05 UTC
To: davem, netdev; +Cc: eilong, ariele, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
When configuring pause parameters using 'ethtool -A', the requested values
now take effect when used together with autoneg (up to this point, when
configured for autoneg, the driver ignored the requested pause
configuration).
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h | 2 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 49 ++++++++++++----------
2 files changed, 28 insertions(+), 23 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
index ad28074..32c3ab7 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
@@ -144,7 +144,7 @@ u32 bnx2x_fw_command(struct bnx2x *bp, u32 command, u32 param);
* @bp: driver handle
* @load_mode: current mode
*/
-u8 bnx2x_initial_phy_init(struct bnx2x *bp, int load_mode);
+int bnx2x_initial_phy_init(struct bnx2x *bp, int load_mode);
/**
* bnx2x_link_set - configure hw according to link parameters structure.
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index ab65f34..c8ec3fc 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -2103,22 +2103,25 @@ void bnx2x_calc_fc_adv(struct bnx2x *bp)
}
}
-u8 bnx2x_initial_phy_init(struct bnx2x *bp, int load_mode)
+static void bnx2x_set_requested_fc(struct bnx2x *bp)
{
- if (!BP_NOMCP(bp)) {
- u8 rc;
- int cfx_idx = bnx2x_get_link_cfg_idx(bp);
- u16 req_line_speed = bp->link_params.req_line_speed[cfx_idx];
- /*
- * Initialize link parameters structure variables
- * It is recommended to turn off RX FC for jumbo frames
- * for better performance
- */
- if (CHIP_IS_E1x(bp) && (bp->dev->mtu > 5000))
- bp->link_params.req_fc_auto_adv = BNX2X_FLOW_CTRL_TX;
- else
- bp->link_params.req_fc_auto_adv = BNX2X_FLOW_CTRL_BOTH;
+ /* Initialize link parameters structure variables
+ * It is recommended to turn off RX FC for jumbo frames
+ * for better performance
+ */
+ if (CHIP_IS_E1x(bp) && (bp->dev->mtu > 5000))
+ bp->link_params.req_fc_auto_adv = BNX2X_FLOW_CTRL_TX;
+ else
+ bp->link_params.req_fc_auto_adv = BNX2X_FLOW_CTRL_BOTH;
+}
+int bnx2x_initial_phy_init(struct bnx2x *bp, int load_mode)
+{
+ int rc, cfx_idx = bnx2x_get_link_cfg_idx(bp);
+ u16 req_line_speed = bp->link_params.req_line_speed[cfx_idx];
+
+ if (!BP_NOMCP(bp)) {
+ bnx2x_set_requested_fc(bp);
bnx2x_acquire_phy_lock(bp);
if (load_mode == LOAD_DIAG) {
@@ -2147,11 +2150,11 @@ u8 bnx2x_initial_phy_init(struct bnx2x *bp, int load_mode)
bnx2x_calc_fc_adv(bp);
- if (CHIP_REV_IS_SLOW(bp) && bp->link_vars.link_up) {
+ if (bp->link_vars.link_up) {
bnx2x_stats_handle(bp, STATS_EVENT_LINK_UP);
bnx2x_link_report(bp);
- } else
- queue_delayed_work(bnx2x_wq, &bp->period_task, 0);
+ }
+ queue_delayed_work(bnx2x_wq, &bp->period_task, 0);
bp->link_params.req_line_speed[cfx_idx] = req_line_speed;
return rc;
}
@@ -10315,11 +10318,13 @@ static void __devinit bnx2x_link_settings_requested(struct bnx2x *bp)
bp->link_params.req_flow_ctrl[idx] = (link_config &
PORT_FEATURE_FLOW_CONTROL_MASK);
- if ((bp->link_params.req_flow_ctrl[idx] ==
- BNX2X_FLOW_CTRL_AUTO) &&
- !(bp->port.supported[idx] & SUPPORTED_Autoneg)) {
- bp->link_params.req_flow_ctrl[idx] =
- BNX2X_FLOW_CTRL_NONE;
+ if (bp->link_params.req_flow_ctrl[idx] ==
+ BNX2X_FLOW_CTRL_AUTO) {
+ if (!(bp->port.supported[idx] & SUPPORTED_Autoneg))
+ bp->link_params.req_flow_ctrl[idx] =
+ BNX2X_FLOW_CTRL_NONE;
+ else
+ bnx2x_set_requested_fc(bp);
}
BNX2X_DEV_INFO("req_line_speed %d req_duplex %d req_flow_ctrl 0x%x advertising 0x%x\n",
--
1.7.1
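The flow-control selection touched by this patch can be summarised in a small standalone sketch. It is not driver code: `enum fc_mode`, `fc_auto_adv()` and `fc_requested()` are hypothetical stand-ins for the BNX2X_FLOW_CTRL_* values and for the logic in bnx2x_set_requested_fc() and bnx2x_link_settings_requested(), written only to show the two decisions side by side.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the BNX2X_FLOW_CTRL_* values. */
enum fc_mode { FC_AUTO, FC_TX, FC_RX, FC_BOTH, FC_NONE };

/* Pause modes advertised for autoneg: TX only on E1x with a jumbo MTU
 * (> 5000), both directions otherwise. */
static enum fc_mode fc_auto_adv(bool chip_is_e1x, int mtu)
{
	return (chip_is_e1x && mtu > 5000) ? FC_TX : FC_BOTH;
}

/* Requested flow control: AUTO is downgraded to NONE when the port
 * does not support autoneg; any other request is kept as-is. */
static enum fc_mode fc_requested(enum fc_mode req, bool supports_autoneg)
{
	if (req == FC_AUTO && !supports_autoneg)
		return FC_NONE;
	return req;
}
```

With the patch applied, the AUTO case on an autoneg-capable port additionally refreshes the advertised modes, which the sketch models as a call to `fc_auto_adv()`.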
* [PATCH net-next 13/13] bnx2x: Correct PFC disablement
From: Yuval Mintz @ 2012-12-02 14:05 UTC (permalink / raw)
To: davem, netdev; +Cc: eilong, ariele, Barak Witkowski, Yuval Mintz
In-Reply-To: <1354457157-4730-1-git-send-email-yuvalmin@broadcom.com>
From: Barak Witkowski <barak@broadcom.com>
Until now, the bnx2x driver could only enable PFC via dcbnl; it can now also
disable it correctly.
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c
index c8c0340..10bc093 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_dcb.c
@@ -2052,10 +2052,13 @@ static void bnx2x_dcbnl_set_pfc_cfg(struct net_device *netdev, int prio,
if (!bnx2x_dcbnl_set_valid(bp) || prio >= MAX_PFC_PRIORITIES)
return;
- bp->dcbx_config_params.admin_pfc_bitmap |= ((setting ? 1 : 0) << prio);
- if (setting)
+ if (setting) {
+ bp->dcbx_config_params.admin_pfc_bitmap |= (1 << prio);
bp->dcbx_config_params.admin_pfc_tx_enable = 1;
+ } else {
+ bp->dcbx_config_params.admin_pfc_bitmap &= ~(1 << prio);
+ }
}
static void bnx2x_dcbnl_get_pfc_cfg(struct net_device *netdev, int prio,
--
1.7.1
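Why the old code could never disable PFC can be shown with a standalone sketch. It is not driver code: `pfc_set_old()` and `pfc_set_new()` are hypothetical helpers operating on a plain `uint32_t` in place of `admin_pfc_bitmap`, and the value of MAX_PFC_PRIORITIES is assumed for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PFC_PRIORITIES 8 /* assumed value, for illustration only */

/* Old behaviour: OR-ing in (0 << prio) is a no-op, so a bit that was
 * set once is never cleared again. */
static uint32_t pfc_set_old(uint32_t bitmap, int prio, int setting)
{
	assert(prio >= 0 && prio < MAX_PFC_PRIORITIES);
	return bitmap | ((uint32_t)(setting ? 1 : 0) << prio);
}

/* New behaviour: set the bit when enabling, mask it out when disabling. */
static uint32_t pfc_set_new(uint32_t bitmap, int prio, int setting)
{
	assert(prio >= 0 && prio < MAX_PFC_PRIORITIES);
	if (setting)
		return bitmap | (UINT32_C(1) << prio);
	return bitmap & ~(UINT32_C(1) << prio);
}
```

Starting from a bitmap of 0x08, a disable request for priority 3 leaves the old variant at 0x08, while the new variant returns 0x00.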
* [PATCH net-next v2 0/3] mlx4_en: set number of rx/tx channels using ethtool
From: Amir Vadai @ 2012-12-02 13:49 UTC (permalink / raw)
To: David S. Miller; +Cc: Ben Hutchings, Amir Vadai, Or Gerlitz, Oren Duer, netdev
1. Add a record in the MAINTAINERS file for the mlx4_en driver
2. Fix set_ringparam so that it does not lose TX moderation info, and remove code duplication
3. Add support for changing the number of rx/tx channels using ethtool
---
Changes from V1:
- Limit the number of channels by priv->max_* rather than by values supplied by the user
- Fix indentation
Changes from V0:
- Added file pattern to MAINTAINERS file
Amir Vadai (3):
MAINTAINERS: Add Mellanox ethernet driver - mlx4_en
net/mlx4_en: Fix TX moderation info loss after set_ringparam is
called
net/mlx4_en: Set number of rx/tx channels using ethtool
MAINTAINERS | 8 ++
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 128 +++++++++++++++++-----
drivers/net/ethernet/mellanox/mlx4/en_main.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 26 +++--
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 8 ++-
6 files changed, 131 insertions(+), 43 deletions(-)
--
1.7.8.2
* Re: [net-next rfc v7 2/3] virtio_net: multiqueue support
From: Michael S. Tsirkin @ 2012-12-02 16:06 UTC (permalink / raw)
To: Jason Wang
Cc: rusty, krkumar2, virtualization, netdev, linux-kernel, kvm,
bhutchings, jwhan, shiyer
In-Reply-To: <1354011360-39479-3-git-send-email-jasowang@redhat.com>
On Tue, Nov 27, 2012 at 06:15:59PM +0800, Jason Wang wrote:
> This adds multiqueue support to the virtio_net driver. In multiqueue mode, the
> driver expects the number of queue pairs to equal the number of vcpus. To
> eliminate contention between vcpus and virtqueues, per-cpu virtqueue pairs
> are implemented by:
>
> - selecting the txq based on the smp processor id.
> - setting the smp affinity hint to the vcpu that owns the queue pair.
>
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/net/virtio_net.c | 454 ++++++++++++++++++++++++++++++---------
> include/uapi/linux/virtio_net.h | 16 ++
> 2 files changed, 371 insertions(+), 99 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 7975133..bcaa6e5 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -84,17 +84,25 @@ struct virtnet_info {
> struct virtio_device *vdev;
> struct virtqueue *cvq;
> struct net_device *dev;
> - struct napi_struct napi;
> - struct send_queue sq;
> - struct receive_queue rq;
> + struct send_queue *sq;
> + struct receive_queue *rq;
> unsigned int status;
>
> + /* Max # of queue pairs supported by the device */
> + u16 max_queue_pairs;
> +
> + /* # of queue pairs currently used by the driver */
> + u16 curr_queue_pairs;
> +
> /* I like... big packets and I cannot lie! */
> bool big_packets;
>
> /* Host will merge rx buffers for big packets (shake it! shake it!) */
> bool mergeable_rx_bufs;
>
> + /* Has control virtqueue */
> + bool has_cvq;
> +
> /* enable config space updates */
> bool config_enable;
>
> @@ -126,6 +134,34 @@ struct padded_vnet_hdr {
> char padding[6];
> };
>
> +static const struct ethtool_ops virtnet_ethtool_ops;
> +
> +/*
> + * Converting between virtqueue no. and kernel tx/rx queue no.
> + * 0:rx0 1:tx0 2:cvq 3:rx1 4:tx1 ... 2N+1:rxN 2N+2:txN
> + */
Weird, this isn't what spec v5 says: it says
0:rx0 1:tx0 2:rx1 3:tx1 ... cvq
We can change the spec to match, but keeping all rx/tx
together seems a bit prettier?
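To make the difference concrete, here is a rough standalone sketch of the two index layouts. The `patch_*` helpers copy the arithmetic from the hunk below; the `spec_*` helpers are hypothetical and assume the spec layout is read as "all rx/tx pairs first, control vq last".

```c
#include <assert.h>

/* Layout used by the patch: 0:rx0 1:tx0 2:cvq 3:rx1 4:tx1 ... */
static int patch_rxq2vq(int rxq)
{
	return rxq ? 2 * rxq + 1 : 0;
}

static int patch_txq2vq(int txq)
{
	return txq ? 2 * txq + 2 : 1;
}

/* Layout as read from the spec: 0:rx0 1:tx0 2:rx1 3:tx1 ... cvq last.
 * No special case is needed for pair 0. */
static int spec_rxq2vq(int rxq)
{
	return 2 * rxq;
}

static int spec_txq2vq(int txq)
{
	return 2 * txq + 1;
}

/* With this layout the control vq index depends on the number of pairs. */
static int spec_cvq2vq(int max_queue_pairs)
{
	return 2 * max_queue_pairs;
}
```

The trade-off is that the patch layout keeps the control vq at a fixed index 2, while the pairs-first layout keeps the rx/tx arithmetic uniform at the cost of a cvq index that moves with the pair count.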
> +static int vq2txq(struct virtqueue *vq)
> +{
> + int index = virtqueue_get_queue_index(vq);
> + return index == 1 ? 0 : (index - 2) / 2;
> +}
> +
> +static int txq2vq(int txq)
> +{
> + return txq ? 2 * txq + 2 : 1;
> +}
> +
> +static int vq2rxq(struct virtqueue *vq)
> +{
> + int index = virtqueue_get_queue_index(vq);
> + return index ? (index - 1) / 2 : 0;
> +}
> +
> +static int rxq2vq(int rxq)
> +{
> + return rxq ? 2 * rxq + 1 : 0;
> +}
> +
> static inline struct skb_vnet_hdr *skb_vnet_hdr(struct sk_buff *skb)
> {
> return (struct skb_vnet_hdr *)skb->cb;
> @@ -166,7 +202,7 @@ static void skb_xmit_done(struct virtqueue *vq)
> virtqueue_disable_cb(vq);
>
> /* We were probably waiting for more output buffers. */
> - netif_wake_queue(vi->dev);
> + netif_wake_subqueue(vi->dev, vq2txq(vq));
> }
>
> static void set_skb_frag(struct sk_buff *skb, struct page *page,
> @@ -503,7 +539,7 @@ static bool try_fill_recv(struct receive_queue *rq, gfp_t gfp)
> static void skb_recv_done(struct virtqueue *rvq)
> {
> struct virtnet_info *vi = rvq->vdev->priv;
> - struct receive_queue *rq = &vi->rq;
> + struct receive_queue *rq = &vi->rq[vq2rxq(rvq)];
>
> /* Schedule NAPI, Suppress further interrupts if successful. */
> if (napi_schedule_prep(&rq->napi)) {
> @@ -650,7 +686,8 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
> static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> {
> struct virtnet_info *vi = netdev_priv(dev);
> - struct send_queue *sq = &vi->sq;
> + int qnum = skb_get_queue_mapping(skb);
> + struct send_queue *sq = &vi->sq[qnum];
> int capacity;
>
> /* Free up any pending old buffers before queueing new ones. */
> @@ -664,13 +701,14 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> if (likely(capacity == -ENOMEM)) {
> if (net_ratelimit())
> dev_warn(&dev->dev,
> - "TX queue failure: out of memory\n");
> + "TXQ (%d) failure: out of memory\n",
> + qnum);
> } else {
> dev->stats.tx_fifo_errors++;
> if (net_ratelimit())
> dev_warn(&dev->dev,
> - "Unexpected TX queue failure: %d\n",
> - capacity);
> + "Unexpected TXQ (%d) failure: %d\n",
> + qnum, capacity);
> }
> dev->stats.tx_dropped++;
> kfree_skb(skb);
> @@ -685,12 +723,12 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> /* Apparently nice girls don't return TX_BUSY; stop the queue
> * before it gets out of hand. Naturally, this wastes entries. */
> if (capacity < 2+MAX_SKB_FRAGS) {
> - netif_stop_queue(dev);
> + netif_stop_subqueue(dev, qnum);
> if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> /* More just got used, free them then recheck. */
> capacity += free_old_xmit_skbs(sq);
> if (capacity >= 2+MAX_SKB_FRAGS) {
> - netif_start_queue(dev);
> + netif_start_subqueue(dev, qnum);
> virtqueue_disable_cb(sq->vq);
> }
> }
> @@ -758,23 +796,13 @@ static struct rtnl_link_stats64 *virtnet_stats(struct net_device *dev,
> static void virtnet_netpoll(struct net_device *dev)
> {
> struct virtnet_info *vi = netdev_priv(dev);
> + int i;
>
> - napi_schedule(&vi->rq.napi);
> + for (i = 0; i < vi->curr_queue_pairs; i++)
> + napi_schedule(&vi->rq[i].napi);
> }
> #endif
>
> -static int virtnet_open(struct net_device *dev)
> -{
> - struct virtnet_info *vi = netdev_priv(dev);
> -
> - /* Make sure we have some buffers: if oom use wq. */
> - if (!try_fill_recv(&vi->rq, GFP_KERNEL))
> - schedule_delayed_work(&vi->rq.refill, 0);
> -
> - virtnet_napi_enable(&vi->rq);
> - return 0;
> -}
> -
> /*
> * Send command via the control virtqueue and check status. Commands
> * supported by the hypervisor, as indicated by feature bits, should
> @@ -830,13 +858,53 @@ static void virtnet_ack_link_announce(struct virtnet_info *vi)
> rtnl_unlock();
> }
>
> +static int virtnet_set_queues(struct virtnet_info *vi)
> +{
> + struct scatterlist sg;
> + struct virtio_net_ctrl_rfs s;
> + struct net_device *dev = vi->dev;
> +
> + s.virtqueue_pairs = vi->curr_queue_pairs;
> + sg_init_one(&sg, &s, sizeof(s));
> +
> + if (!vi->has_cvq)
> + return -EINVAL;
> +
> + if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RFS,
> + VIRTIO_NET_CTRL_RFS_VQ_PAIRS_SET, &sg, 1, 0)){
> + dev_warn(&dev->dev, "Fail to set the number of queue pairs to"
> + " %d\n", vi->curr_queue_pairs);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +static int virtnet_open(struct net_device *dev)
> +{
> + struct virtnet_info *vi = netdev_priv(dev);
> + int i;
> +
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + /* Make sure we have some buffers: if oom use wq. */
> + if (!try_fill_recv(&vi->rq[i], GFP_KERNEL))
> + schedule_delayed_work(&vi->rq[i].refill, 0);
> + virtnet_napi_enable(&vi->rq[i]);
> + }
> +
> + return 0;
> +}
> +
> static int virtnet_close(struct net_device *dev)
> {
> struct virtnet_info *vi = netdev_priv(dev);
> + int i;
>
> /* Make sure refill_work doesn't re-enable napi! */
> - cancel_delayed_work_sync(&vi->rq.refill);
> - napi_disable(&vi->rq.napi);
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + cancel_delayed_work_sync(&vi->rq[i].refill);
> + napi_disable(&vi->rq[i].napi);
> + }
>
> return 0;
> }
> @@ -948,8 +1016,8 @@ static void virtnet_get_ringparam(struct net_device *dev,
> {
> struct virtnet_info *vi = netdev_priv(dev);
>
> - ring->rx_max_pending = virtqueue_get_vring_size(vi->rq.vq);
> - ring->tx_max_pending = virtqueue_get_vring_size(vi->sq.vq);
> + ring->rx_max_pending = virtqueue_get_vring_size(vi->rq[0].vq);
> + ring->tx_max_pending = virtqueue_get_vring_size(vi->sq[0].vq);
> ring->rx_pending = ring->rx_max_pending;
> ring->tx_pending = ring->tx_max_pending;
> }
> @@ -967,12 +1035,6 @@ static void virtnet_get_drvinfo(struct net_device *dev,
>
> }
>
> -static const struct ethtool_ops virtnet_ethtool_ops = {
> - .get_drvinfo = virtnet_get_drvinfo,
> - .get_link = ethtool_op_get_link,
> - .get_ringparam = virtnet_get_ringparam,
> -};
> -
> #define MIN_MTU 68
> #define MAX_MTU 65535
>
> @@ -984,6 +1046,20 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
> return 0;
> }
>
> +/* To avoid contending a lock hold by a vcpu who would exit to host, select the
> + * txq based on the processor id.
> + */
> +static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
> +{
> + int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
> + smp_processor_id();
> +
> + while (unlikely(txq >= dev->real_num_tx_queues))
> + txq -= dev->real_num_tx_queues;
> +
> + return txq;
> +}
> +
> static const struct net_device_ops virtnet_netdev = {
> .ndo_open = virtnet_open,
> .ndo_stop = virtnet_close,
> @@ -995,6 +1071,7 @@ static const struct net_device_ops virtnet_netdev = {
> .ndo_get_stats64 = virtnet_stats,
> .ndo_vlan_rx_add_vid = virtnet_vlan_rx_add_vid,
> .ndo_vlan_rx_kill_vid = virtnet_vlan_rx_kill_vid,
> + .ndo_select_queue = virtnet_select_queue,
> #ifdef CONFIG_NET_POLL_CONTROLLER
> .ndo_poll_controller = virtnet_netpoll,
> #endif
> @@ -1030,10 +1107,10 @@ static void virtnet_config_changed_work(struct work_struct *work)
>
> if (vi->status & VIRTIO_NET_S_LINK_UP) {
> netif_carrier_on(vi->dev);
> - netif_wake_queue(vi->dev);
> + netif_tx_wake_all_queues(vi->dev);
> } else {
> netif_carrier_off(vi->dev);
> - netif_stop_queue(vi->dev);
> + netif_tx_stop_all_queues(vi->dev);
> }
> done:
> mutex_unlock(&vi->config_lock);
> @@ -1046,41 +1123,212 @@ static void virtnet_config_changed(struct virtio_device *vdev)
> schedule_work(&vi->config_work);
> }
>
> -static int init_vqs(struct virtnet_info *vi)
> +static void free_receive_bufs(struct virtnet_info *vi)
> +{
> + int i;
> +
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + while (vi->rq[i].pages)
> + __free_pages(get_a_page(&vi->rq[i], GFP_KERNEL), 0);
> + }
> +}
> +
> +/* Free memory allocated for send and receive queues */
> +static void virtnet_free_queues(struct virtnet_info *vi)
> {
> - struct virtqueue *vqs[3];
> - vq_callback_t *callbacks[] = { skb_recv_done, skb_xmit_done, NULL};
> - const char *names[] = { "input", "output", "control" };
> - int nvqs, err;
> + kfree(vi->rq);
> + vi->rq = NULL;
> + kfree(vi->sq);
> + vi->sq = NULL;
> +}
>
> - /* We expect two virtqueues, receive then send,
> - * and optionally control. */
> - nvqs = virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) ? 3 : 2;
> +static void free_unused_bufs(struct virtnet_info *vi)
> +{
> + void *buf;
> + int i;
>
> - err = vi->vdev->config->find_vqs(vi->vdev, nvqs, vqs, callbacks, names);
> - if (err)
> - return err;
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + struct virtqueue *vq = vi->sq[i].vq;
> + while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
> + dev_kfree_skb(buf);
> + }
>
> - vi->rq.vq = vqs[0];
> - vi->sq.vq = vqs[1];
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + struct virtqueue *vq = vi->rq[i].vq;
>
> - if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)) {
> - vi->cvq = vqs[2];
> + while ((buf = virtqueue_detach_unused_buf(vq)) != NULL) {
> + if (vi->mergeable_rx_bufs || vi->big_packets)
> + give_pages(&vi->rq[i], buf);
> + else
> + dev_kfree_skb(buf);
> + --vi->rq[i].num;
> + }
> + BUG_ON(vi->rq[i].num != 0);
> + }
> +}
>
> +static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
> +{
> + int i;
> +
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + int cpu = set ? i : -1;
> + virtqueue_set_affinity(vi->rq[i].vq, cpu);
> + virtqueue_set_affinity(vi->sq[i].vq, cpu);
> + }
> +}
> +
> +static void virtnet_del_vqs(struct virtnet_info *vi)
> +{
> + struct virtio_device *vdev = vi->vdev;
> +
> + virtnet_set_affinity(vi, false);
> +
> + vdev->config->del_vqs(vdev);
> +
> + virtnet_free_queues(vi);
> +}
> +
> +static int virtnet_find_vqs(struct virtnet_info *vi)
> +{
> + vq_callback_t **callbacks;
> + struct virtqueue **vqs;
> + int ret = -ENOMEM;
> + int i, total_vqs;
> + char **names;
> +
> + /*
> + * We expect 1 RX virtqueue followed by 1 TX virtqueue, followd by
> + * possible control virtqueue, followed by RX/TX N-1 queue pairs used
> + * in multiqueue mode.
> + */
> + total_vqs = vi->max_queue_pairs * 2 +
> + virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ);
> +
> + /* Allocate space for find_vqs parameters */
> + vqs = kzalloc(total_vqs * sizeof(*vqs), GFP_KERNEL);
> + callbacks = kzalloc(total_vqs * sizeof(*callbacks), GFP_KERNEL);
> + if (!vqs || !callbacks)
> + goto err_mem;
> + names = kzalloc(total_vqs * sizeof(*names), GFP_KERNEL);
> + if (!names)
> + goto err_mem;
> +
> + /* Parameters for control virtqueue, if any */
> + if (vi->has_cvq) {
> + callbacks[2] = NULL;
> + names[2] = kasprintf(GFP_KERNEL, "control");
> + }
> +
> + /* Allocate/initialize parameters for send/receive virtqueues */
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + callbacks[rxq2vq(i)] = skb_recv_done;
> + callbacks[txq2vq(i)] = skb_xmit_done;
> + names[rxq2vq(i)] = kasprintf(GFP_KERNEL, "input.%d", i);
> + names[txq2vq(i)] = kasprintf(GFP_KERNEL, "output.%d", i);
> + }
> +
> + ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,
> + (const char **)names);
> + if (ret)
> + goto err_names;
> +
> + if (vi->has_cvq) {
> + vi->cvq = vqs[2];
> if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VLAN))
> vi->dev->features |= NETIF_F_HW_VLAN_FILTER;
> }
> +
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + vi->rq[i].vq = vqs[rxq2vq(i)];
> + vi->sq[i].vq = vqs[txq2vq(i)];
> + }
> +
> + kfree(callbacks);
> + kfree(vqs);
> +
> + return 0;
> +
> +err_names:
> + for (i = 0; i < total_vqs * 2; i ++)
> + kfree(names[i]);
> + kfree(names);
> +
> +err_mem:
> + kfree(callbacks);
> + kfree(vqs);
> +
> + return ret;
> +}
> +
> +static int virtnet_alloc_queues(struct virtnet_info *vi)
> +{
> + int i;
> +
> + vi->sq = kzalloc(sizeof(vi->sq[0]) * vi->max_queue_pairs, GFP_KERNEL);
> + vi->rq = kzalloc(sizeof(vi->rq[0]) * vi->max_queue_pairs, GFP_KERNEL);
> + if (!vi->rq || !vi->sq)
> + goto err;
> +
> + /* setup initial receive and send queue parameters */
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + vi->rq[i].pages = NULL;
> + INIT_DELAYED_WORK(&vi->rq[i].refill, refill_work);
> + netif_napi_add(vi->dev, &vi->rq[i].napi, virtnet_poll,
> + napi_weight);
> +
> + sg_init_table(vi->rq[i].sg, ARRAY_SIZE(vi->rq[i].sg));
> + sg_init_table(vi->sq[i].sg, ARRAY_SIZE(vi->sq[i].sg));
> + }
> +
> +
> return 0;
> +
> +err:
> + virtnet_free_queues(vi);
> + return -ENOMEM;
> +}
> +
> +static int init_vqs(struct virtnet_info *vi)
> +{
> + int ret;
> +
> + /* Allocate send & receive queues */
> + ret = virtnet_alloc_queues(vi);
> + if (ret)
> + goto err;
> +
> + ret = virtnet_find_vqs(vi);
> + if (ret)
> + goto err_free;
> +
> + virtnet_set_affinity(vi, true);
> + return 0;
> +
> +err_free:
> + virtnet_free_queues(vi);
> +err:
> + return ret;
> }
>
> static int virtnet_probe(struct virtio_device *vdev)
> {
> - int err;
> + int i, err;
> struct net_device *dev;
> struct virtnet_info *vi;
> + u16 curr_queue_pairs;
Probably a good idea to rename this max_queue_pairs.
> +
> + /* Find if host supports multiqueue virtio_net device */
> + err = virtio_config_val(vdev, VIRTIO_NET_F_RFS,
> + offsetof(struct virtio_net_config,
> + max_virtqueue_pairs), &curr_queue_pairs);
> +
> + /* We need at least 2 queue's */
> + if (err)
> + curr_queue_pairs = 1;
Let's also validate against VIRTIO_NET_CTRL_RFS_VQ_PAIRS_MIN
and VIRTIO_NET_CTRL_RFS_VQ_PAIRS_MAX.
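Something along these lines, as a standalone sketch: `sanitize_queue_pairs()` is a hypothetical helper, the two limits are copied from the uapi hunk at the end of this patch, and falling back to a single pair (rather than failing the probe) is only one possible policy.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the uapi hunk at the end of this patch. */
#define VIRTIO_NET_CTRL_RFS_VQ_PAIRS_MIN 1
#define VIRTIO_NET_CTRL_RFS_VQ_PAIRS_MAX 0x8000

/* Hypothetical helper: use a single queue pair when the config read
 * failed or the device reported a value outside the allowed range. */
static uint16_t sanitize_queue_pairs(int err, uint16_t reported)
{
	if (err ||
	    reported < VIRTIO_NET_CTRL_RFS_VQ_PAIRS_MIN ||
	    reported > VIRTIO_NET_CTRL_RFS_VQ_PAIRS_MAX)
		return 1;
	return reported;
}
```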
>
> /* Allocate ourselves a network device with room for our info */
> - dev = alloc_etherdev(sizeof(struct virtnet_info));
> + dev = alloc_etherdev_mq(sizeof(struct virtnet_info), curr_queue_pairs);
> if (!dev)
> return -ENOMEM;
>
> @@ -1126,22 +1374,17 @@ static int virtnet_probe(struct virtio_device *vdev)
>
> /* Set up our device-specific information */
> vi = netdev_priv(dev);
> - netif_napi_add(dev, &vi->rq.napi, virtnet_poll, napi_weight);
> vi->dev = dev;
> vi->vdev = vdev;
> vdev->priv = vi;
> - vi->rq.pages = NULL;
> vi->stats = alloc_percpu(struct virtnet_stats);
> err = -ENOMEM;
> if (vi->stats == NULL)
> goto free;
>
> - INIT_DELAYED_WORK(&vi->rq.refill, refill_work);
> mutex_init(&vi->config_lock);
> vi->config_enable = true;
> INIT_WORK(&vi->config_work, virtnet_config_changed_work);
> - sg_init_table(vi->rq.sg, ARRAY_SIZE(vi->rq.sg));
> - sg_init_table(vi->sq.sg, ARRAY_SIZE(vi->sq.sg));
>
> /* If we can receive ANY GSO packets, we must allocate large ones. */
> if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4) ||
> @@ -1152,10 +1395,21 @@ static int virtnet_probe(struct virtio_device *vdev)
> if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
> vi->mergeable_rx_bufs = true;
>
> + if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
> + vi->has_cvq = true;
> +
> + /* Use single tx/rx queue pair as default */
> + vi->curr_queue_pairs = 1;
> + vi->max_queue_pairs = curr_queue_pairs;
> +
> + /* Allocate/initialize the rx/tx queues, and invoke find_vqs */
> err = init_vqs(vi);
> if (err)
> goto free_stats;
>
> + netif_set_real_num_tx_queues(dev, 1);
> + netif_set_real_num_rx_queues(dev, 1);
> +
> err = register_netdev(dev);
> if (err) {
> pr_debug("virtio_net: registering device failed\n");
> @@ -1163,12 +1417,15 @@ static int virtnet_probe(struct virtio_device *vdev)
> }
>
> /* Last of all, set up some receive buffers. */
> - try_fill_recv(&vi->rq, GFP_KERNEL);
> -
> - /* If we didn't even get one input buffer, we're useless. */
> - if (vi->rq.num == 0) {
> - err = -ENOMEM;
> - goto unregister;
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + try_fill_recv(&vi->rq[i], GFP_KERNEL);
> +
> + /* If we didn't even get one input buffer, we're useless. */
> + if (vi->rq[i].num == 0) {
> + free_unused_bufs(vi);
> + err = -ENOMEM;
> + goto free_recv_bufs;
> + }
> }
>
> /* Assume link up if device can't report link status,
> @@ -1181,13 +1438,20 @@ static int virtnet_probe(struct virtio_device *vdev)
> netif_carrier_on(dev);
> }
>
> - pr_debug("virtnet: registered device %s\n", dev->name);
> + pr_debug("virtnet: registered device %s with %d RX and TX vq's\n",
> + dev->name, curr_queue_pairs);
> +
> return 0;
>
> -unregister:
> +free_recv_bufs:
> + free_receive_bufs(vi);
> unregister_netdev(dev);
> +
> free_vqs:
> - vdev->config->del_vqs(vdev);
> + for (i = 0; i <curr_queue_pairs; i++)
> + cancel_delayed_work_sync(&vi->rq[i].refill);
> + virtnet_del_vqs(vi);
> +
> free_stats:
> free_percpu(vi->stats);
> free:
> @@ -1195,28 +1459,6 @@ free:
> return err;
> }
>
> -static void free_unused_bufs(struct virtnet_info *vi)
> -{
> - void *buf;
> - while (1) {
> - buf = virtqueue_detach_unused_buf(vi->sq.vq);
> - if (!buf)
> - break;
> - dev_kfree_skb(buf);
> - }
> - while (1) {
> - buf = virtqueue_detach_unused_buf(vi->rq.vq);
> - if (!buf)
> - break;
> - if (vi->mergeable_rx_bufs || vi->big_packets)
> - give_pages(&vi->rq, buf);
> - else
> - dev_kfree_skb(buf);
> - --vi->rq.num;
> - }
> - BUG_ON(vi->rq.num != 0);
> -}
> -
> static void remove_vq_common(struct virtnet_info *vi)
> {
> vi->vdev->config->reset(vi->vdev);
> @@ -1224,10 +1466,9 @@ static void remove_vq_common(struct virtnet_info *vi)
> /* Free unused buffers in both send and recv, if any. */
> free_unused_bufs(vi);
>
> - vi->vdev->config->del_vqs(vi->vdev);
> + free_receive_bufs(vi);
>
> - while (vi->rq.pages)
> - __free_pages(get_a_page(&vi->rq, GFP_KERNEL), 0);
> + virtnet_del_vqs(vi);
> }
>
> static void __devexit virtnet_remove(struct virtio_device *vdev)
> @@ -1253,6 +1494,7 @@ static void __devexit virtnet_remove(struct virtio_device *vdev)
> static int virtnet_freeze(struct virtio_device *vdev)
> {
> struct virtnet_info *vi = vdev->priv;
> + int i;
>
> /* Prevent config work handler from accessing the device */
> mutex_lock(&vi->config_lock);
> @@ -1260,10 +1502,14 @@ static int virtnet_freeze(struct virtio_device *vdev)
> mutex_unlock(&vi->config_lock);
>
> netif_device_detach(vi->dev);
> - cancel_delayed_work_sync(&vi->rq.refill);
> + for (i = 0; i < vi->max_queue_pairs; i++)
> + cancel_delayed_work_sync(&vi->rq[i].refill);
>
> if (netif_running(vi->dev))
> - napi_disable(&vi->rq.napi);
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + napi_disable(&vi->rq[i].napi);
> + netif_napi_del(&vi->rq[i].napi);
> + }
>
> remove_vq_common(vi);
>
> @@ -1275,24 +1521,28 @@ static int virtnet_freeze(struct virtio_device *vdev)
> static int virtnet_restore(struct virtio_device *vdev)
> {
> struct virtnet_info *vi = vdev->priv;
> - int err;
> + int err, i;
>
> err = init_vqs(vi);
> if (err)
> return err;
>
> if (netif_running(vi->dev))
> - virtnet_napi_enable(&vi->rq);
> + for (i = 0; i < vi->max_queue_pairs; i++)
> + virtnet_napi_enable(&vi->rq[i]);
>
> netif_device_attach(vi->dev);
>
> - if (!try_fill_recv(&vi->rq, GFP_KERNEL))
> - schedule_delayed_work(&vi->rq.refill, 0);
> + for (i = 0; i < vi->max_queue_pairs; i++)
> + if (!try_fill_recv(&vi->rq[i], GFP_KERNEL))
> + schedule_delayed_work(&vi->rq[i].refill, 0);
>
> mutex_lock(&vi->config_lock);
> vi->config_enable = true;
> mutex_unlock(&vi->config_lock);
>
> + BUG_ON(virtnet_set_queues(vi));
> +
Won't this always fail when control vq is off?
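A standalone model of the concern, not driver code: `struct vi_model` and both functions are hypothetical, the control command itself is assumed to succeed, and the guard shown is only one possible way to avoid the BUG_ON.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two virtnet_info fields that matter here. */
struct vi_model {
	bool has_cvq;
	unsigned short curr_queue_pairs;
};

/* Mirrors the early return in virtnet_set_queues() above; the control
 * command itself is assumed to succeed. */
static int model_set_queues(const struct vi_model *vi)
{
	if (!vi->has_cvq)
		return -EINVAL;
	return 0;
}

/* One possible guard for the restore path: without a control vq there
 * is nothing to tell the device, so skip the command. */
static int model_restore_set_queues(const struct vi_model *vi)
{
	if (!vi->has_cvq)
		return 0;
	return model_set_queues(vi);
}
```

In the model, a device without a control vq makes `model_set_queues()` return -EINVAL, which is the value BUG_ON() would trip on in the restore path.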
> return 0;
> }
> #endif
> @@ -1310,7 +1560,7 @@ static unsigned int features[] = {
> VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_UFO,
> VIRTIO_NET_F_MRG_RXBUF, VIRTIO_NET_F_STATUS, VIRTIO_NET_F_CTRL_VQ,
> VIRTIO_NET_F_CTRL_RX, VIRTIO_NET_F_CTRL_VLAN,
> - VIRTIO_NET_F_GUEST_ANNOUNCE,
> + VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_RFS,
> };
>
> static struct virtio_driver virtio_net_driver = {
> @@ -1328,6 +1578,12 @@ static struct virtio_driver virtio_net_driver = {
> #endif
> };
>
> +static const struct ethtool_ops virtnet_ethtool_ops = {
> + .get_drvinfo = virtnet_get_drvinfo,
> + .get_link = ethtool_op_get_link,
> + .get_ringparam = virtnet_get_ringparam,
> +};
> +
> static int __init init(void)
> {
> return register_virtio_driver(&virtio_net_driver);
> diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
> index 2470f54..6056cec 100644
> --- a/include/uapi/linux/virtio_net.h
> +++ b/include/uapi/linux/virtio_net.h
> @@ -51,6 +51,7 @@
> #define VIRTIO_NET_F_CTRL_RX_EXTRA 20 /* Extra RX mode control support */
> #define VIRTIO_NET_F_GUEST_ANNOUNCE 21 /* Guest can announce device on the
> * network */
> +#define VIRTIO_NET_F_RFS 22 /* Device supports multiple TXQ/RXQ */
>
> #define VIRTIO_NET_S_LINK_UP 1 /* Link is up */
> #define VIRTIO_NET_S_ANNOUNCE 2 /* Announcement is needed */
> @@ -60,6 +61,8 @@ struct virtio_net_config {
> __u8 mac[6];
> /* See VIRTIO_NET_F_STATUS and VIRTIO_NET_S_* above */
> __u16 status;
> + /* Total number of RX/TX queues */
> + __u16 max_virtqueue_pairs;
> } __attribute__((packed));
>
> /* This is the first element of the scatter-gather list. If you don't
> @@ -166,4 +169,17 @@ struct virtio_net_ctrl_mac {
> #define VIRTIO_NET_CTRL_ANNOUNCE 3
> #define VIRTIO_NET_CTRL_ANNOUNCE_ACK 0
>
> +/*
> + * Control multiqueue
> + *
> + */
> +struct virtio_net_ctrl_rfs {
> + u16 virtqueue_pairs;
> +};
> +
> +#define VIRTIO_NET_CTRL_RFS 4
> + #define VIRTIO_NET_CTRL_RFS_VQ_PAIRS_SET 0
> + #define VIRTIO_NET_CTRL_RFS_VQ_PAIRS_MIN 1
> + #define VIRTIO_NET_CTRL_RFS_VQ_PAIRS_MAX 0x8000
> +
> #endif /* _LINUX_VIRTIO_NET_H */
> --
> 1.7.1