* Re: VLAN packets silently dropped in promiscuous mode
From: Jesse Gross @ 2010-09-30 21:21 UTC (permalink / raw)
To: Roger Luethi; +Cc: netdev, Patrick McHardy
In-Reply-To: <20100930080703.GA10827@core.hellgate.ch>
On Thu, Sep 30, 2010 at 1:07 AM, Roger Luethi <rl@hellgate.ch> wrote:
> On Wed, 29 Sep 2010 10:44:26 -0700, Jesse Gross wrote:
>> On Wed, Sep 29, 2010 at 4:37 AM, Roger Luethi <rl@hellgate.ch> wrote:
>> > I noticed packets for unknown VLANs getting silently dropped even in
>> > promiscuous mode (this is true only for the hardware accelerated path).
>> > netif_nit_deliver was introduced specifically to prevent that, but the
>> > function gets called only _after_ packets from unknown VLANs have been
>> > dropped.
>>
>> Some drivers are fixing this on a case by case basis by disabling
>> hardware accelerated VLAN stripping when in promiscuous mode, i.e.:
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5f6c01819979afbfec7e0b15fe52371b8eed87e8
>>
>> However, at this point it is more or less random which drivers do
>> this. It would obviously be much better if it were consistent.
>
> My understanding is this. Hardware VLAN tagging and stripping can always be
> enabled. The kernel passes 802.1Q information along with the stripped
> header to libpcap which reassembles the original header where necessary.
> Works for me.
Sorry, I misread your original post as saying that the VLAN header
gets dropped, rather than the entire packet. I agree that this is how
it should work but not necessarily how it does work (again, depending
on the driver). Here's the problem that I was talking about:
Most drivers have a snippet of code that looks something like this
(taken from ixgbe):
if (adapter->vlgrp && is_vlan && (tag & VLAN_VID_MASK))
vlan_gro_receive(napi, adapter->vlgrp, tag, skb);
else
napi_gro_receive(napi, skb);
At this point the VLAN has already been stripped in hardware. If
there is no VLAN group configured on the device then we hit the second
case. The VLAN header was removed from the SKB and the tag variable
is unused. It is no longer possible for libpcap to reconstruct the
header because the information was thrown away (even the fact that
there was a VLAN tag at all).
There are a couple ways to fix this:
* Turn off VLAN stripping when in promiscuous mode (as done by the ixgbe driver)
* Reconstruct the VLAN header when there is no VLAN group (as done by
the tg3 driver)
A bunch of drivers do neither (bnx2x, for example) and exhibit this
problem. It's getting better but it seems like a common issue.
^ permalink raw reply
* [RFC 2/2] nl80211: use the new genetlink pre/post_doit hooks
From: Johannes Berg @ 2010-09-30 21:10 UTC (permalink / raw)
To: netdev, linux-wireless; +Cc: tgraf
In-Reply-To: <20100930211013.472957770@sipsolutions.net>
[-- Attachment #1: nl80211-use-pre-post.patch --]
[-- Type: text/plain, Size: 16023 bytes --]
From: Johannes Berg <johannes.berg@intel.com>
This makes nl80211 use the new genetlink
pre_doit/post_doit hooks for locking and
checking the interface/wiphy index.
This is only partial and serves to demonstrate
the code improvement.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
net/wireless/nl80211.c | 329 +++++++++++++++++++------------------------------
1 file changed, 130 insertions(+), 199 deletions(-)
--- wireless-testing.orig/net/wireless/nl80211.c 2010-09-30 23:05:34.000000000 +0200
+++ wireless-testing/net/wireless/nl80211.c 2010-09-30 23:08:40.000000000 +0200
@@ -23,6 +23,11 @@
#include "nl80211.h"
#include "reg.h"
+static int nl80211_pre_doit(struct genl_ops *ops, struct sk_buff *skb,
+ struct genl_info *info);
+static void nl80211_post_doit(struct genl_ops *ops, struct sk_buff *skb,
+ struct genl_info *info);
+
/* the netlink family */
static struct genl_family nl80211_fam = {
.id = GENL_ID_GENERATE, /* don't bother with a hardcoded ID */
@@ -31,6 +36,8 @@ static struct genl_family nl80211_fam =
.version = 1, /* no particular meaning now */
.maxattr = NL80211_ATTR_MAX,
.netnsok = true,
+ .pre_doit = nl80211_pre_doit,
+ .post_doit = nl80211_post_doit,
};
/* internal helper: get rdev and dev */
@@ -703,28 +710,18 @@ static int nl80211_dump_wiphy(struct sk_
static int nl80211_get_wiphy(struct sk_buff *skb, struct genl_info *info)
{
struct sk_buff *msg;
- struct cfg80211_registered_device *dev;
-
- dev = cfg80211_get_dev_from_info(info);
- if (IS_ERR(dev))
- return PTR_ERR(dev);
+ struct cfg80211_registered_device *dev = info->user_ptr[0];
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg)
- goto out_err;
-
- if (nl80211_send_wiphy(msg, info->snd_pid, info->snd_seq, 0, dev) < 0)
- goto out_free;
+ return -ENOMEM;
- cfg80211_unlock_rdev(dev);
+ if (nl80211_send_wiphy(msg, info->snd_pid, info->snd_seq, 0, dev) < 0) {
+ nlmsg_free(msg);
+ return -ENOBUFS;
+ }
return genlmsg_reply(msg, info);
-
- out_free:
- nlmsg_free(msg);
- out_err:
- cfg80211_unlock_rdev(dev);
- return -ENOBUFS;
}
static const struct nla_policy txq_params_policy[NL80211_TXQ_ATTR_MAX + 1] = {
@@ -1137,33 +1134,20 @@ static int nl80211_dump_interface(struct
static int nl80211_get_interface(struct sk_buff *skb, struct genl_info *info)
{
struct sk_buff *msg;
- struct cfg80211_registered_device *dev;
- struct net_device *netdev;
- int err;
-
- err = get_rdev_dev_by_info_ifindex(info, &dev, &netdev);
- if (err)
- return err;
+ struct cfg80211_registered_device *dev = info->user_ptr[0];
+ struct net_device *netdev = info->user_ptr[1];
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg)
- goto out_err;
+ return -ENOMEM;
if (nl80211_send_iface(msg, info->snd_pid, info->snd_seq, 0,
- dev, netdev) < 0)
- goto out_free;
-
- dev_put(netdev);
- cfg80211_unlock_rdev(dev);
+ dev, netdev) < 0) {
+ nlmsg_free(msg);
+ return -ENOBUFS;
+ }
return genlmsg_reply(msg, info);
-
- out_free:
- nlmsg_free(msg);
- out_err:
- dev_put(netdev);
- cfg80211_unlock_rdev(dev);
- return -ENOBUFS;
}
static const struct nla_policy mntr_flags_policy[NL80211_MNTR_FLAG_MAX + 1] = {
@@ -1223,39 +1207,29 @@ static int nl80211_valid_4addr(struct cf
static int nl80211_set_interface(struct sk_buff *skb, struct genl_info *info)
{
- struct cfg80211_registered_device *rdev;
+ struct cfg80211_registered_device *rdev = info->user_ptr[0];
struct vif_params params;
int err;
enum nl80211_iftype otype, ntype;
- struct net_device *dev;
+ struct net_device *dev = info->user_ptr[1];
u32 _flags, *flags = NULL;
bool change = false;
memset(¶ms, 0, sizeof(params));
- rtnl_lock();
-
- err = get_rdev_dev_by_info_ifindex(info, &rdev, &dev);
- if (err)
- goto unlock_rtnl;
-
otype = ntype = dev->ieee80211_ptr->iftype;
if (info->attrs[NL80211_ATTR_IFTYPE]) {
ntype = nla_get_u32(info->attrs[NL80211_ATTR_IFTYPE]);
if (otype != ntype)
change = true;
- if (ntype > NL80211_IFTYPE_MAX) {
- err = -EINVAL;
- goto unlock;
- }
+ if (ntype > NL80211_IFTYPE_MAX)
+ return -EINVAL;
}
if (info->attrs[NL80211_ATTR_MESH_ID]) {
- if (ntype != NL80211_IFTYPE_MESH_POINT) {
- err = -EINVAL;
- goto unlock;
- }
+ if (ntype != NL80211_IFTYPE_MESH_POINT)
+ return -EINVAL;
params.mesh_id = nla_data(info->attrs[NL80211_ATTR_MESH_ID]);
params.mesh_id_len = nla_len(info->attrs[NL80211_ATTR_MESH_ID]);
change = true;
@@ -1266,20 +1240,18 @@ static int nl80211_set_interface(struct
change = true;
err = nl80211_valid_4addr(rdev, dev, params.use_4addr, ntype);
if (err)
- goto unlock;
+ return err;
} else {
params.use_4addr = -1;
}
if (info->attrs[NL80211_ATTR_MNTR_FLAGS]) {
- if (ntype != NL80211_IFTYPE_MONITOR) {
- err = -EINVAL;
- goto unlock;
- }
+ if (ntype != NL80211_IFTYPE_MONITOR)
+ return -EINVAL;
err = parse_monitor_flags(info->attrs[NL80211_ATTR_MNTR_FLAGS],
&_flags);
if (err)
- goto unlock;
+ return err;
flags = &_flags;
change = true;
@@ -1293,17 +1265,12 @@ static int nl80211_set_interface(struct
if (!err && params.use_4addr != -1)
dev->ieee80211_ptr->use_4addr = params.use_4addr;
- unlock:
- dev_put(dev);
- cfg80211_unlock_rdev(rdev);
- unlock_rtnl:
- rtnl_unlock();
return err;
}
static int nl80211_new_interface(struct sk_buff *skb, struct genl_info *info)
{
- struct cfg80211_registered_device *rdev;
+ struct cfg80211_registered_device *rdev = info->user_ptr[0];
struct vif_params params;
int err;
enum nl80211_iftype type = NL80211_IFTYPE_UNSPECIFIED;
@@ -1320,19 +1287,9 @@ static int nl80211_new_interface(struct
return -EINVAL;
}
- rtnl_lock();
-
- rdev = cfg80211_get_dev_from_info(info);
- if (IS_ERR(rdev)) {
- err = PTR_ERR(rdev);
- goto unlock_rtnl;
- }
-
if (!rdev->ops->add_virtual_intf ||
- !(rdev->wiphy.interface_modes & (1 << type))) {
- err = -EOPNOTSUPP;
- goto unlock;
- }
+ !(rdev->wiphy.interface_modes & (1 << type)))
+ return -EOPNOTSUPP;
if (type == NL80211_IFTYPE_MESH_POINT &&
info->attrs[NL80211_ATTR_MESH_ID]) {
@@ -1344,7 +1301,7 @@ static int nl80211_new_interface(struct
params.use_4addr = !!nla_get_u8(info->attrs[NL80211_ATTR_4ADDR]);
err = nl80211_valid_4addr(rdev, NULL, params.use_4addr, type);
if (err)
- goto unlock;
+ return err;
}
err = parse_monitor_flags(type == NL80211_IFTYPE_MONITOR ?
@@ -1354,38 +1311,18 @@ static int nl80211_new_interface(struct
nla_data(info->attrs[NL80211_ATTR_IFNAME]),
type, err ? NULL : &flags, ¶ms);
- unlock:
- cfg80211_unlock_rdev(rdev);
- unlock_rtnl:
- rtnl_unlock();
return err;
}
static int nl80211_del_interface(struct sk_buff *skb, struct genl_info *info)
{
- struct cfg80211_registered_device *rdev;
- int err;
- struct net_device *dev;
-
- rtnl_lock();
+ struct cfg80211_registered_device *rdev = info->user_ptr[0];
+ struct net_device *dev = info->user_ptr[1];
- err = get_rdev_dev_by_info_ifindex(info, &rdev, &dev);
- if (err)
- goto unlock_rtnl;
-
- if (!rdev->ops->del_virtual_intf) {
- err = -EOPNOTSUPP;
- goto out;
- }
-
- err = rdev->ops->del_virtual_intf(&rdev->wiphy, dev);
+ if (!rdev->ops->del_virtual_intf)
+ return -EOPNOTSUPP;
- out:
- cfg80211_unlock_rdev(rdev);
- dev_put(dev);
- unlock_rtnl:
- rtnl_unlock();
- return err;
+ return rdev->ops->del_virtual_intf(&rdev->wiphy, dev);
}
struct get_key_cookie {
@@ -1438,9 +1375,9 @@ static void get_key_callback(void *c, st
static int nl80211_get_key(struct sk_buff *skb, struct genl_info *info)
{
- struct cfg80211_registered_device *rdev;
+ struct cfg80211_registered_device *rdev = info->user_ptr[0];
int err;
- struct net_device *dev;
+ struct net_device *dev = info->user_ptr[1];
u8 key_idx = 0;
u8 *mac_addr = NULL;
struct get_key_cookie cookie = {
@@ -1458,30 +1395,17 @@ static int nl80211_get_key(struct sk_buf
if (info->attrs[NL80211_ATTR_MAC])
mac_addr = nla_data(info->attrs[NL80211_ATTR_MAC]);
- rtnl_lock();
-
- err = get_rdev_dev_by_info_ifindex(info, &rdev, &dev);
- if (err)
- goto unlock_rtnl;
-
- if (!rdev->ops->get_key) {
- err = -EOPNOTSUPP;
- goto out;
- }
+ if (!rdev->ops->get_key)
+ return -EOPNOTSUPP;
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
- if (!msg) {
- err = -ENOMEM;
- goto out;
- }
+ if (!msg)
+ return -ENOMEM;
hdr = nl80211hdr_put(msg, info->snd_pid, info->snd_seq, 0,
NL80211_CMD_NEW_KEY);
-
- if (IS_ERR(hdr)) {
- err = PTR_ERR(hdr);
- goto free_msg;
- }
+ if (IS_ERR(hdr))
+ return PTR_ERR(hdr);
cookie.msg = msg;
cookie.idx = key_idx;
@@ -1501,28 +1425,21 @@ static int nl80211_get_key(struct sk_buf
goto nla_put_failure;
genlmsg_end(msg, hdr);
- err = genlmsg_reply(msg, info);
- goto out;
+ return genlmsg_reply(msg, info);
nla_put_failure:
err = -ENOBUFS;
free_msg:
nlmsg_free(msg);
- out:
- cfg80211_unlock_rdev(rdev);
- dev_put(dev);
- unlock_rtnl:
- rtnl_unlock();
-
return err;
}
static int nl80211_set_key(struct sk_buff *skb, struct genl_info *info)
{
- struct cfg80211_registered_device *rdev;
+ struct cfg80211_registered_device *rdev = info->user_ptr[0];
struct key_parse key;
int err;
- struct net_device *dev;
+ struct net_device *dev = info->user_ptr[1];
int (*func)(struct wiphy *wiphy, struct net_device *netdev,
u8 key_index);
@@ -1537,21 +1454,13 @@ static int nl80211_set_key(struct sk_buf
if (!key.def && !key.defmgmt)
return -EINVAL;
- rtnl_lock();
-
- err = get_rdev_dev_by_info_ifindex(info, &rdev, &dev);
- if (err)
- goto unlock_rtnl;
-
if (key.def)
func = rdev->ops->set_default_key;
else
func = rdev->ops->set_default_mgmt_key;
- if (!func) {
- err = -EOPNOTSUPP;
- goto out;
- }
+ if (!func)
+ return -EOPNOTSUPP;
wdev_lock(dev->ieee80211_ptr);
err = nl80211_key_allowed(dev->ieee80211_ptr);
@@ -1568,21 +1477,14 @@ static int nl80211_set_key(struct sk_buf
#endif
wdev_unlock(dev->ieee80211_ptr);
- out:
- cfg80211_unlock_rdev(rdev);
- dev_put(dev);
-
- unlock_rtnl:
- rtnl_unlock();
-
return err;
}
static int nl80211_new_key(struct sk_buff *skb, struct genl_info *info)
{
- struct cfg80211_registered_device *rdev;
+ struct cfg80211_registered_device *rdev = info->user_ptr[0];
int err;
- struct net_device *dev;
+ struct net_device *dev = info->user_ptr[1];
struct key_parse key;
u8 *mac_addr = NULL;
@@ -1596,21 +1498,11 @@ static int nl80211_new_key(struct sk_buf
if (info->attrs[NL80211_ATTR_MAC])
mac_addr = nla_data(info->attrs[NL80211_ATTR_MAC]);
- rtnl_lock();
-
- err = get_rdev_dev_by_info_ifindex(info, &rdev, &dev);
- if (err)
- goto unlock_rtnl;
-
- if (!rdev->ops->add_key) {
- err = -EOPNOTSUPP;
- goto out;
- }
+ if (!rdev->ops->add_key)
+ return -EOPNOTSUPP;
- if (cfg80211_validate_key_settings(rdev, &key.p, key.idx, mac_addr)) {
- err = -EINVAL;
- goto out;
- }
+ if (cfg80211_validate_key_settings(rdev, &key.p, key.idx, mac_addr))
+ return -EINVAL;
wdev_lock(dev->ieee80211_ptr);
err = nl80211_key_allowed(dev->ieee80211_ptr);
@@ -1619,12 +1511,6 @@ static int nl80211_new_key(struct sk_buf
mac_addr, &key.p);
wdev_unlock(dev->ieee80211_ptr);
- out:
- cfg80211_unlock_rdev(rdev);
- dev_put(dev);
- unlock_rtnl:
- rtnl_unlock();
-
return err;
}
@@ -5085,43 +4971,24 @@ nl80211_attr_cqm_policy[NL80211_ATTR_CQM
static int nl80211_set_cqm_rssi(struct genl_info *info,
s32 threshold, u32 hysteresis)
{
- struct cfg80211_registered_device *rdev;
+ struct cfg80211_registered_device *rdev = info->user_ptr[0];
struct wireless_dev *wdev;
- struct net_device *dev;
- int err;
+ struct net_device *dev = info->user_ptr[1];
if (threshold > 0)
return -EINVAL;
- rtnl_lock();
-
- err = get_rdev_dev_by_info_ifindex(info, &rdev, &dev);
- if (err)
- goto unlock_rtnl;
-
wdev = dev->ieee80211_ptr;
- if (!rdev->ops->set_cqm_rssi_config) {
- err = -EOPNOTSUPP;
- goto unlock_rdev;
- }
+ if (!rdev->ops->set_cqm_rssi_config)
+ return -EOPNOTSUPP;
if (wdev->iftype != NL80211_IFTYPE_STATION &&
- wdev->iftype != NL80211_IFTYPE_P2P_CLIENT) {
- err = -EOPNOTSUPP;
- goto unlock_rdev;
- }
-
- err = rdev->ops->set_cqm_rssi_config(wdev->wiphy, dev,
- threshold, hysteresis);
-
- unlock_rdev:
- cfg80211_unlock_rdev(rdev);
- dev_put(dev);
- unlock_rtnl:
- rtnl_unlock();
+ wdev->iftype != NL80211_IFTYPE_P2P_CLIENT)
+ return -EOPNOTSUPP;
- return err;
+ return rdev->ops->set_cqm_rssi_config(wdev->wiphy, dev,
+ threshold, hysteresis);
}
static int nl80211_set_cqm(struct sk_buff *skb, struct genl_info *info)
@@ -5155,6 +5022,54 @@ out:
return err;
}
+#define NL80211_FLAG_NEED_WIPHY 0x01
+#define NL80211_FLAG_NEED_NETDEV 0x02
+#define NL80211_FLAG_NEED_RTNL 0x04
+
+static int nl80211_pre_doit(struct genl_ops *ops, struct sk_buff *skb,
+ struct genl_info *info)
+{
+ struct cfg80211_registered_device *rdev;
+ struct net_device *dev;
+ int err;
+ bool rtnl = ops->internal_flags & NL80211_FLAG_NEED_RTNL;
+
+ if (rtnl)
+ rtnl_lock();
+
+ if (ops->internal_flags & NL80211_FLAG_NEED_WIPHY) {
+ rdev = cfg80211_get_dev_from_info(info);
+ if (IS_ERR(rdev)) {
+ if (rtnl)
+ rtnl_unlock();
+ return PTR_ERR(rdev);
+ }
+ info->user_ptr[0] = rdev;
+ } else if (ops->internal_flags & NL80211_FLAG_NEED_NETDEV) {
+ err = get_rdev_dev_by_info_ifindex(info, &rdev, &dev);
+ if (err) {
+ if (rtnl)
+ rtnl_unlock();
+ return err;
+ }
+ info->user_ptr[0] = rdev;
+ info->user_ptr[1] = dev;
+ }
+
+ return 0;
+}
+
+static void nl80211_post_doit(struct genl_ops *ops, struct sk_buff *skb,
+ struct genl_info *info)
+{
+ if (info->user_ptr[0])
+ cfg80211_unlock_rdev(info->user_ptr[0]);
+ if (info->user_ptr[1])
+ dev_put(info->user_ptr[1]);
+ if (ops->internal_flags & NL80211_FLAG_NEED_RTNL)
+ rtnl_unlock();
+}
+
static struct genl_ops nl80211_ops[] = {
{
.cmd = NL80211_CMD_GET_WIPHY,
@@ -5162,6 +5077,7 @@ static struct genl_ops nl80211_ops[] = {
.dumpit = nl80211_dump_wiphy,
.policy = nl80211_policy,
/* can be retrieved by unprivileged users */
+ .internal_flags = NL80211_FLAG_NEED_WIPHY,
},
{
.cmd = NL80211_CMD_SET_WIPHY,
@@ -5175,42 +5091,55 @@ static struct genl_ops nl80211_ops[] = {
.dumpit = nl80211_dump_interface,
.policy = nl80211_policy,
/* can be retrieved by unprivileged users */
+ .internal_flags = NL80211_FLAG_NEED_NETDEV,
},
{
.cmd = NL80211_CMD_SET_INTERFACE,
.doit = nl80211_set_interface,
.policy = nl80211_policy,
.flags = GENL_ADMIN_PERM,
+ .internal_flags = NL80211_FLAG_NEED_NETDEV |
+ NL80211_FLAG_NEED_RTNL,
},
{
.cmd = NL80211_CMD_NEW_INTERFACE,
.doit = nl80211_new_interface,
.policy = nl80211_policy,
.flags = GENL_ADMIN_PERM,
+ .internal_flags = NL80211_FLAG_NEED_WIPHY |
+ NL80211_FLAG_NEED_RTNL,
},
{
.cmd = NL80211_CMD_DEL_INTERFACE,
.doit = nl80211_del_interface,
.policy = nl80211_policy,
.flags = GENL_ADMIN_PERM,
+ .internal_flags = NL80211_FLAG_NEED_NETDEV |
+ NL80211_FLAG_NEED_RTNL,
},
{
.cmd = NL80211_CMD_GET_KEY,
.doit = nl80211_get_key,
.policy = nl80211_policy,
.flags = GENL_ADMIN_PERM,
+ .internal_flags = NL80211_FLAG_NEED_NETDEV |
+ NL80211_FLAG_NEED_RTNL,
},
{
.cmd = NL80211_CMD_SET_KEY,
.doit = nl80211_set_key,
.policy = nl80211_policy,
.flags = GENL_ADMIN_PERM,
+ .internal_flags = NL80211_FLAG_NEED_NETDEV |
+ NL80211_FLAG_NEED_RTNL,
},
{
.cmd = NL80211_CMD_NEW_KEY,
.doit = nl80211_new_key,
.policy = nl80211_policy,
.flags = GENL_ADMIN_PERM,
+ .internal_flags = NL80211_FLAG_NEED_NETDEV |
+ NL80211_FLAG_NEED_RTNL,
},
{
.cmd = NL80211_CMD_DEL_KEY,
@@ -5464,6 +5393,8 @@ static struct genl_ops nl80211_ops[] = {
.doit = nl80211_set_cqm,
.policy = nl80211_policy,
.flags = GENL_ADMIN_PERM,
+ .internal_flags = NL80211_FLAG_NEED_NETDEV |
+ NL80211_FLAG_NEED_RTNL,
},
{
.cmd = NL80211_CMD_SET_CHANNEL,
^ permalink raw reply
* [RFC 1/2] genetlink: introduce pre_doit/post_doit hooks
From: Johannes Berg @ 2010-09-30 21:10 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA,
linux-wireless-u79uwXL29TY76Z2rM5mHXA
Cc: tgraf-G/eBtMaohhA
In-Reply-To: <20100930211013.472957770@sipsolutions.net>
[-- Attachment #1: genl-more-boilerplate.patch --]
[-- Type: text/plain, Size: 4233 bytes --]
From: Johannes Berg <johannes.berg-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Each family may have some amount of boilerplate
locking code that applies to most, or even all,
commands.
This allows a family to handle such things in
a more generic way, by allowing it to
a) include private flags in each operation
b) specify a pre_doit hook that is called,
before an operation's doit() callback and
may return an error directly,
c) specify a post_doit hook that can undo
locking or similar things done by pre_doit,
and finally
d) include two private pointers in each info
struct passed between all these operations
including doit(). (It's two because I'll
need two in nl80211 -- can be extended.)
Signed-off-by: Johannes Berg <johannes.berg-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
include/net/genetlink.h | 18 ++++++++++++++++++
net/netlink/genetlink.c | 14 +++++++++++++-
2 files changed, 31 insertions(+), 1 deletion(-)
--- wireless-testing.orig/include/net/genetlink.h 2010-09-30 22:19:03.000000000 +0200
+++ wireless-testing/include/net/genetlink.h 2010-09-30 22:34:10.000000000 +0200
@@ -20,6 +20,9 @@ struct genl_multicast_group {
u32 id;
};
+struct genl_ops;
+struct genl_info;
+
/**
* struct genl_family - generic netlink family
* @id: protocol family idenfitier
@@ -29,6 +32,10 @@ struct genl_multicast_group {
* @maxattr: maximum number of attributes supported
* @netnsok: set to true if the family can handle network
* namespaces and should be presented in all of them
+ * @pre_doit: called before an operation's doit callback, it may
+ * do additional, common, filtering and return an error
+ * @post_doit: called after an operation's doit callback, it may
+ * undo operations done by pre_doit, for example release locks
* @attrbuf: buffer to store parsed attributes
* @ops_list: list of all assigned operations
* @family_list: family list
@@ -41,6 +48,12 @@ struct genl_family {
unsigned int version;
unsigned int maxattr;
bool netnsok;
+ int (*pre_doit)(struct genl_ops *ops,
+ struct sk_buff *skb,
+ struct genl_info *info);
+ void (*post_doit)(struct genl_ops *ops,
+ struct sk_buff *skb,
+ struct genl_info *info);
struct nlattr ** attrbuf; /* private */
struct list_head ops_list; /* private */
struct list_head family_list; /* private */
@@ -55,6 +68,8 @@ struct genl_family {
* @genlhdr: generic netlink message header
* @userhdr: user specific header
* @attrs: netlink attributes
+ * @_net: network namespace
+ * @user_ptr: user pointers
*/
struct genl_info {
u32 snd_seq;
@@ -66,6 +81,7 @@ struct genl_info {
#ifdef CONFIG_NET_NS
struct net * _net;
#endif
+ void * user_ptr[2];
};
static inline struct net *genl_info_net(struct genl_info *info)
@@ -81,6 +97,7 @@ static inline void genl_info_net_set(str
/**
* struct genl_ops - generic netlink operations
* @cmd: command identifier
+ * @internal_flags: flags used by the family
* @flags: flags
* @policy: attribute validation policy
* @doit: standard command callback
@@ -90,6 +107,7 @@ static inline void genl_info_net_set(str
*/
struct genl_ops {
u8 cmd;
+ u8 internal_flags;
unsigned int flags;
const struct nla_policy *policy;
int (*doit)(struct sk_buff *skb,
--- wireless-testing.orig/net/netlink/genetlink.c 2010-09-30 22:19:56.000000000 +0200
+++ wireless-testing/net/netlink/genetlink.c 2010-09-30 22:22:28.000000000 +0200
@@ -547,8 +547,20 @@ static int genl_rcv_msg(struct sk_buff *
info.userhdr = nlmsg_data(nlh) + GENL_HDRLEN;
info.attrs = family->attrbuf;
genl_info_net_set(&info, net);
+ memset(&info.user_ptr, 0, sizeof(info.user_ptr));
- return ops->doit(skb, &info);
+ if (family->pre_doit) {
+ err = family->pre_doit(ops, skb, &info);
+ if (err)
+ return err;
+ }
+
+ err = ops->doit(skb, &info);
+
+ if (family->post_doit)
+ family->post_doit(ops, skb, &info);
+
+ return err;
}
static void genl_rcv(struct sk_buff *skb)
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [RFC 0/2] generic netlink doit generalisation
From: Johannes Berg @ 2010-09-30 21:10 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA,
linux-wireless-u79uwXL29TY76Z2rM5mHXA
Cc: tgraf-G/eBtMaohhA
nl80211 does a lot of boilerplate work
for each generic netlink doit() operation,
this is an attempt to reduce it.
If you think this looks good, I'd like to
merge it through John's wireless tree.
johannes
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH] bonding: add retransmit membership reports tunable
From: Flavio Leitner @ 2010-09-30 20:46 UTC (permalink / raw)
To: netdev; +Cc: Andy Gospodarek, bonding-devel, Jay Vosburgh, Flavio Leitner
In-Reply-To: <20100929131713.GB2857@redhat.com>
Allow admins to adjust the number of multicast membership
reports sent on a link failure.
Signed-off-by: Flavio Leitner <fleitner@redhat.com>
---
drivers/net/bonding/bond_main.c | 14 ++++++++++++
drivers/net/bonding/bond_sysfs.c | 44 ++++++++++++++++++++++++++++++++++++++
drivers/net/bonding/bonding.h | 2 +
include/linux/if_bonding.h | 3 ++
4 files changed, 63 insertions(+), 0 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 1016c09..60bfd33 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -109,6 +109,7 @@ static char *arp_validate;
static char *fail_over_mac;
static int all_slaves_active = 0;
static struct bond_params bonding_defaults;
+static int resend_igmp = BOND_DEFAULT_RESEND_IGMP;
module_param(max_bonds, int, 0);
MODULE_PARM_DESC(max_bonds, "Max number of bonded devices");
@@ -163,6 +164,8 @@ module_param(all_slaves_active, int, 0);
MODULE_PARM_DESC(all_slaves_active, "Keep all frames received on an interface"
"by setting active flag for all slaves. "
"0 for never (default), 1 for always.");
+module_param(resend_igmp, int, 0);
+MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link failure");
/*----------------------------- Global variables ----------------------------*/
@@ -909,6 +912,9 @@ static void bond_resend_igmp_join_requests(struct bonding *bond)
}
}
+ if (--bond->igmp_retrans > 0)
+ queue_delayed_work(bond->wq, &bond->mcast_work, HZ/5);
+
out:
read_unlock(&bond->lock);
}
@@ -981,6 +987,7 @@ static void bond_mc_swap(struct bonding *bond, struct slave *new_active,
dev_mc_add(new_active->dev, ha->addr);
/* rejoin multicast groups */
+ bond->igmp_retrans = bond->params.resend_igmp;
queue_delayed_work(bond->wq, &bond->mcast_work, 0);
}
}
@@ -4936,6 +4943,12 @@ static int bond_check_params(struct bond_params *params)
all_slaves_active = 0;
}
+ if (resend_igmp < 0 || resend_igmp > 255) {
+ pr_warning("Warning: resend_igmp (%d) should be between "
+ "0 and 255, resetting to %d\n",
+ resend_igmp, BOND_DEFAULT_RESEND_IGMP);
+ resend_igmp = BOND_DEFAULT_RESEND_IGMP;
+ }
/* reset values for TLB/ALB */
if ((bond_mode == BOND_MODE_TLB) ||
(bond_mode == BOND_MODE_ALB)) {
@@ -5108,6 +5121,7 @@ static int bond_check_params(struct bond_params *params)
params->fail_over_mac = fail_over_mac_value;
params->tx_queues = tx_queues;
params->all_slaves_active = all_slaves_active;
+ params->resend_igmp = resend_igmp;
if (primary) {
strncpy(params->primary, primary, IFNAMSIZ);
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index c311aed..a676a28 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1592,6 +1592,49 @@ out:
static DEVICE_ATTR(all_slaves_active, S_IRUGO | S_IWUSR,
bonding_show_slaves_active, bonding_store_slaves_active);
+/*
+ * Show and set the number of IGMP membership reports to send on link failure
+ */
+static ssize_t bonding_show_resend_igmp(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct bonding *bond = to_bond(d);
+
+ return sprintf(buf, "%d\n", bond->params.resend_igmp);
+}
+
+static ssize_t bonding_store_resend_igmp(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int new_value, ret = count;
+ struct bonding *bond = to_bond(d);
+
+ if (sscanf(buf, "%d", &new_value) != 1) {
+ pr_err("%s: no resend_igmp value specified.\n",
+ bond->dev->name);
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (new_value < 0) {
+ pr_err("%s: Invalid resend_igmp value %d not in range 1-%d; rejected.\n",
+ bond->dev->name, new_value, INT_MAX);
+ ret = -EINVAL;
+ goto out;
+ }
+
+ pr_info("%s: Setting resend_igmp to %d.\n",
+ bond->dev->name, new_value);
+ bond->params.resend_igmp = new_value;
+out:
+ return ret;
+}
+
+static DEVICE_ATTR(resend_igmp, S_IRUGO | S_IWUSR,
+ bonding_show_resend_igmp, bonding_store_resend_igmp);
+
static struct attribute *per_bond_attrs[] = {
&dev_attr_slaves.attr,
&dev_attr_mode.attr,
@@ -1619,6 +1662,7 @@ static struct attribute *per_bond_attrs[] = {
&dev_attr_ad_partner_mac.attr,
&dev_attr_queue_id.attr,
&dev_attr_all_slaves_active.attr,
+ &dev_attr_resend_igmp.attr,
NULL,
};
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 308ed10..c49bdb0 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -136,6 +136,7 @@ struct bond_params {
__be32 arp_targets[BOND_MAX_ARP_TARGETS];
int tx_queues;
int all_slaves_active;
+ int resend_igmp;
};
struct bond_parm_tbl {
@@ -198,6 +199,7 @@ struct bonding {
s32 slave_cnt; /* never change this value outside the attach/detach wrappers */
rwlock_t lock;
rwlock_t curr_slave_lock;
+ s8 igmp_retrans;
s8 kill_timers;
s8 send_grat_arp;
s8 send_unsol_na;
diff --git a/include/linux/if_bonding.h b/include/linux/if_bonding.h
index 2c799437..d2f78b7 100644
--- a/include/linux/if_bonding.h
+++ b/include/linux/if_bonding.h
@@ -84,6 +84,9 @@
#define BOND_DEFAULT_MAX_BONDS 1 /* Default maximum number of devices to support */
#define BOND_DEFAULT_TX_QUEUES 16 /* Default number of tx queues per device */
+
+#define BOND_DEFAULT_RESEND_IGMP 1 /* Default number of IGMP membership reports
+ to resend on link failure. */
/* hashing types */
#define BOND_XMIT_POLICY_LAYER2 0 /* layer 2 (MAC only), default */
#define BOND_XMIT_POLICY_LAYER34 1 /* layer 3+4 (IP ^ (TCP || UDP)) */
--
1.7.3
^ permalink raw reply related
* Re: [PATCH V3] fs: allow for more than 2^31 files
From: Eric Dumazet @ 2010-09-30 20:45 UTC (permalink / raw)
To: Robin Holt
Cc: David Miller, dipankar, viro, bcrl, den, mingo, mszeredi, cmm,
npiggin, xemul, linux-kernel, netdev
In-Reply-To: <20100930202612.GM14068@sgi.com>
Le jeudi 30 septembre 2010 à 15:26 -0500, Robin Holt a écrit :
> On Tue, Sep 28, 2010 at 05:46:51AM +0200, Eric Dumazet wrote:
> > Le lundi 27 septembre 2010 à 15:36 -0700, David Miller a écrit :
> ...
>
> > Fix is to let /proc/sys/fs/file-nr & /proc/sys/fs/file-max use long
> > integers, and change af_unix to use an atomic_long_t instead of
> > atomic_t.
> >
> > get_max_files() is changed to return an unsigned long.
>
> I _THINK_ we actually want get_max_files to return a long and have
> the files_stat_struct definitions be longs. If we do not have it that
> way, we could theoretically open enough files on a single cpu to make
> get_nr_files return a negative without overflowing max_files. That,
> of course, would require an insane amount of memory, but I think it is
> technically more correct.
>
Number of opened file is technically a positive (or null) value, I have
no idea why you want it being signed.
>
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -1352,16 +1352,16 @@ static struct ctl_table fs_table[] = {
> > {
> > .procname = "file-nr",
> > .data = &files_stat,
> > - .maxlen = 3*sizeof(int),
> > + .maxlen = sizeof(files_stat),
> > .mode = 0444,
> > - .proc_handler = proc_nr_files,
> > + .proc_handler = proc_doulongvec_minmax,
>
> With this change, don't we lose the current nr_files value? I think
> you need proc_nr_files to stay as it was. If you disagree, we should
> probably eliminate the definitions for proc_nr_files as I don't believe
> they are used anywhere else.
>
I have no idea why you think I changed something. I only made the value
use 64bit on 64bit arches, so that we are not anymore limited to 2^31
files.
^ permalink raw reply
* [PATCH v2] bonding: rejoin multicast groups on VLANs
From: Flavio Leitner @ 2010-09-30 20:45 UTC (permalink / raw)
To: netdev; +Cc: Andy Gospodarek, bonding-devel, Jay Vosburgh, Flavio Leitner
In-Reply-To: <20100929131713.GB2857@redhat.com>
It fixes bonding to rejoin multicast groups added
to VLAN devices on top of bonding when a failover
happens.
Signed-off-by: Flavio Leitner <fleitner@redhat.com>
---
drivers/net/bonding/bond_main.c | 61 +++++++++++++++++++++++++++++++++-----
drivers/net/bonding/bonding.h | 1 +
2 files changed, 54 insertions(+), 8 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 3b16f62..1016c09 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -865,18 +865,14 @@ static void bond_mc_del(struct bonding *bond, void *addr)
}
-/*
- * Retrieve the list of registered multicast addresses for the bonding
- * device and retransmit an IGMP JOIN request to the current active
- * slave.
- */
-static void bond_resend_igmp_join_requests(struct bonding *bond)
+static void __bond_resend_igmp_join_requests(struct net_device *dev)
{
struct in_device *in_dev;
struct ip_mc_list *im;
rcu_read_lock();
- in_dev = __in_dev_get_rcu(bond->dev);
+
+ in_dev = __in_dev_get_rcu(dev);
if (in_dev) {
for (im = in_dev->mc_list; im; im = im->next)
ip_mc_rejoin_group(im);
@@ -885,6 +881,45 @@ static void bond_resend_igmp_join_requests(struct bonding *bond)
rcu_read_unlock();
}
+
+/*
+ * Retrieve the list of registered multicast addresses for the bonding
+ * device and retransmit an IGMP JOIN request to the current active
+ * slave.
+ */
+static void bond_resend_igmp_join_requests(struct bonding *bond)
+{
+ struct net_device *vlan_dev;
+ struct vlan_entry *vlan;
+
+ read_lock(&bond->lock);
+ if (bond->kill_timers)
+ goto out;
+
+ /* rejoin all groups on bond device */
+ __bond_resend_igmp_join_requests(bond->dev);
+
+ /* rejoin all groups on vlan devices */
+ if (bond->vlgrp) {
+ list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
+ vlan_dev = vlan_group_get_device(bond->vlgrp,
+ vlan->vlan_id);
+ if (vlan_dev)
+ __bond_resend_igmp_join_requests(vlan_dev);
+ }
+ }
+
+out:
+ read_unlock(&bond->lock);
+}
+
+void bond_resend_igmp_join_requests_delayed(struct work_struct *work)
+{
+ struct bonding *bond = container_of(work, struct bonding,
+ mcast_work.work);
+ bond_resend_igmp_join_requests(bond);
+}
+
/*
* flush all members of flush->mc_list from device dev->mc_list
*/
@@ -944,7 +979,9 @@ static void bond_mc_swap(struct bonding *bond, struct slave *new_active,
netdev_for_each_mc_addr(ha, bond->dev)
dev_mc_add(new_active->dev, ha->addr);
- bond_resend_igmp_join_requests(bond);
+
+ /* rejoin multicast groups */
+ queue_delayed_work(bond->wq, &bond->mcast_work, 0);
}
}
@@ -3744,6 +3781,9 @@ static int bond_open(struct net_device *bond_dev)
bond->kill_timers = 0;
+ /* multicast retrans */
+ INIT_DELAYED_WORK(&bond->mcast_work, bond_resend_igmp_join_requests_delayed);
+
if (bond_is_lb(bond)) {
/* bond_alb_initialize must be called before the timer
* is started.
@@ -3828,6 +3868,8 @@ static int bond_close(struct net_device *bond_dev)
break;
}
+ if (delayed_work_pending(&bond->mcast_work))
+ cancel_delayed_work(&bond->mcast_work);
if (bond_is_lb(bond)) {
/* Must be called only after all
@@ -4699,6 +4741,9 @@ static void bond_work_cancel_all(struct bonding *bond)
if (bond->params.mode == BOND_MODE_8023AD &&
delayed_work_pending(&bond->ad_work))
cancel_delayed_work(&bond->ad_work);
+
+ if (delayed_work_pending(&bond->mcast_work))
+ cancel_delayed_work(&bond->mcast_work);
}
/*
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index c6fdd85..308ed10 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -223,6 +223,7 @@ struct bonding {
struct delayed_work arp_work;
struct delayed_work alb_work;
struct delayed_work ad_work;
+ struct delayed_work mcast_work;
#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
struct in6_addr master_ipv6;
#endif
--
1.7.3
^ permalink raw reply related
* [PATCH net-next 6/8] tg3: Prepare for larger rx ring sizes
From: Matt Carlson @ 2010-09-30 20:34 UTC (permalink / raw)
To: davem; +Cc: netdev, andy, mcarlson
This patch adds two new variables to track the size of the standard and
jumbo rx producer ring sizes. The code is then pivoted to these
variables from preprocessor constants.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 119 ++++++++++++++++++++++++++++------------------------
drivers/net/tg3.h | 2 +
2 files changed, 66 insertions(+), 55 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 50b7e35..16848a9 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -101,9 +101,9 @@
* You can't change the ring sizes, but you can change where you place
* them in the NIC onboard memory.
*/
-#define TG3_RX_RING_SIZE 512
+#define TG3_RX_STD_RING_SIZE(tp) 512
#define TG3_DEF_RX_RING_PENDING 200
-#define TG3_RX_JUMBO_RING_SIZE 256
+#define TG3_RX_JMB_RING_SIZE(tp) 256
#define TG3_DEF_RX_JUMBO_RING_PENDING 100
#define TG3_RSS_INDIR_TBL_SIZE 128
@@ -120,12 +120,12 @@
#define TG3_TX_RING_SIZE 512
#define TG3_DEF_TX_RING_PENDING (TG3_TX_RING_SIZE - 1)
-#define TG3_RX_RING_BYTES (sizeof(struct tg3_rx_buffer_desc) * \
- TG3_RX_RING_SIZE)
-#define TG3_RX_JUMBO_RING_BYTES (sizeof(struct tg3_ext_rx_buffer_desc) * \
- TG3_RX_JUMBO_RING_SIZE)
-#define TG3_RX_RCB_RING_BYTES(tp) (sizeof(struct tg3_rx_buffer_desc) * \
- TG3_RX_RCB_RING_SIZE(tp))
+#define TG3_RX_STD_RING_BYTES(tp) \
+ (sizeof(struct tg3_rx_buffer_desc) * TG3_RX_STD_RING_SIZE(tp))
+#define TG3_RX_JMB_RING_BYTES(tp) \
+ (sizeof(struct tg3_ext_rx_buffer_desc) * TG3_RX_JMB_RING_SIZE(tp))
+#define TG3_RX_RCB_RING_BYTES(tp) \
+ (sizeof(struct tg3_rx_buffer_desc) * TG3_RX_RCB_RING_SIZE(tp))
#define TG3_TX_RING_BYTES (sizeof(struct tg3_tx_buffer_desc) * \
TG3_TX_RING_SIZE)
#define NEXT_TX(N) (((N) + 1) & (TG3_TX_RING_SIZE - 1))
@@ -143,11 +143,11 @@
#define TG3_RX_STD_MAP_SZ TG3_RX_DMA_TO_MAP_SZ(TG3_RX_STD_DMA_SZ)
#define TG3_RX_JMB_MAP_SZ TG3_RX_DMA_TO_MAP_SZ(TG3_RX_JMB_DMA_SZ)
-#define TG3_RX_STD_BUFF_RING_SIZE \
- (sizeof(struct ring_info) * TG3_RX_RING_SIZE)
+#define TG3_RX_STD_BUFF_RING_SIZE(tp) \
+ (sizeof(struct ring_info) * TG3_RX_STD_RING_SIZE(tp))
-#define TG3_RX_JMB_BUFF_RING_SIZE \
- (sizeof(struct ring_info) * TG3_RX_JUMBO_RING_SIZE)
+#define TG3_RX_JMB_BUFF_RING_SIZE(tp) \
+ (sizeof(struct ring_info) * TG3_RX_JMB_RING_SIZE(tp))
/* Due to a hardware bug, the 5701 can only DMA to memory addresses
* that are at least dword aligned when used in PCIX mode. The driver
@@ -4445,14 +4445,14 @@ static int tg3_alloc_rx_skb(struct tg3 *tp, struct tg3_rx_prodring_set *tpr,
src_map = NULL;
switch (opaque_key) {
case RXD_OPAQUE_RING_STD:
- dest_idx = dest_idx_unmasked % TG3_RX_RING_SIZE;
+ dest_idx = dest_idx_unmasked & tp->rx_std_ring_mask;
desc = &tpr->rx_std[dest_idx];
map = &tpr->rx_std_buffers[dest_idx];
skb_size = tp->rx_pkt_map_sz;
break;
case RXD_OPAQUE_RING_JUMBO:
- dest_idx = dest_idx_unmasked % TG3_RX_JUMBO_RING_SIZE;
+ dest_idx = dest_idx_unmasked & tp->rx_jmb_ring_mask;
desc = &tpr->rx_jmb[dest_idx].std;
map = &tpr->rx_jmb_buffers[dest_idx];
skb_size = TG3_RX_JMB_MAP_SZ;
@@ -4507,7 +4507,7 @@ static void tg3_recycle_rx(struct tg3_napi *tnapi,
switch (opaque_key) {
case RXD_OPAQUE_RING_STD:
- dest_idx = dest_idx_unmasked % TG3_RX_RING_SIZE;
+ dest_idx = dest_idx_unmasked & tp->rx_std_ring_mask;
dest_desc = &dpr->rx_std[dest_idx];
dest_map = &dpr->rx_std_buffers[dest_idx];
src_desc = &spr->rx_std[src_idx];
@@ -4515,7 +4515,7 @@ static void tg3_recycle_rx(struct tg3_napi *tnapi,
break;
case RXD_OPAQUE_RING_JUMBO:
- dest_idx = dest_idx_unmasked % TG3_RX_JUMBO_RING_SIZE;
+ dest_idx = dest_idx_unmasked & tp->rx_jmb_ring_mask;
dest_desc = &dpr->rx_jmb[dest_idx].std;
dest_map = &dpr->rx_jmb_buffers[dest_idx];
src_desc = &spr->rx_jmb[src_idx].std;
@@ -4715,7 +4715,8 @@ next_pkt:
(*post_ptr)++;
if (unlikely(rx_std_posted >= tp->rx_std_max_post)) {
- tpr->rx_std_prod_idx = std_prod_idx % TG3_RX_RING_SIZE;
+ tpr->rx_std_prod_idx = std_prod_idx &
+ tp->rx_std_ring_mask;
tw32_rx_mbox(TG3_RX_STD_PROD_IDX_REG,
tpr->rx_std_prod_idx);
work_mask &= ~RXD_OPAQUE_RING_STD;
@@ -4739,13 +4740,14 @@ next_pkt_nopost:
/* Refill RX ring(s). */
if (!(tp->tg3_flags3 & TG3_FLG3_ENABLE_RSS)) {
if (work_mask & RXD_OPAQUE_RING_STD) {
- tpr->rx_std_prod_idx = std_prod_idx % TG3_RX_RING_SIZE;
+ tpr->rx_std_prod_idx = std_prod_idx &
+ tp->rx_std_ring_mask;
tw32_rx_mbox(TG3_RX_STD_PROD_IDX_REG,
tpr->rx_std_prod_idx);
}
if (work_mask & RXD_OPAQUE_RING_JUMBO) {
- tpr->rx_jmb_prod_idx = jmb_prod_idx %
- TG3_RX_JUMBO_RING_SIZE;
+ tpr->rx_jmb_prod_idx = jmb_prod_idx &
+ tp->rx_jmb_ring_mask;
tw32_rx_mbox(TG3_RX_JMB_PROD_IDX_REG,
tpr->rx_jmb_prod_idx);
}
@@ -4756,8 +4758,8 @@ next_pkt_nopost:
*/
smp_wmb();
- tpr->rx_std_prod_idx = std_prod_idx % TG3_RX_RING_SIZE;
- tpr->rx_jmb_prod_idx = jmb_prod_idx % TG3_RX_JUMBO_RING_SIZE;
+ tpr->rx_std_prod_idx = std_prod_idx & tp->rx_std_ring_mask;
+ tpr->rx_jmb_prod_idx = jmb_prod_idx & tp->rx_jmb_ring_mask;
if (tnapi != &tp->napi[1])
napi_schedule(&tp->napi[1].napi);
@@ -4813,9 +4815,11 @@ static int tg3_rx_prodring_xfer(struct tg3 *tp,
if (spr->rx_std_cons_idx < src_prod_idx)
cpycnt = src_prod_idx - spr->rx_std_cons_idx;
else
- cpycnt = TG3_RX_RING_SIZE - spr->rx_std_cons_idx;
+ cpycnt = tp->rx_std_ring_mask + 1 -
+ spr->rx_std_cons_idx;
- cpycnt = min(cpycnt, TG3_RX_RING_SIZE - dpr->rx_std_prod_idx);
+ cpycnt = min(cpycnt,
+ tp->rx_std_ring_mask + 1 - dpr->rx_std_prod_idx);
si = spr->rx_std_cons_idx;
di = dpr->rx_std_prod_idx;
@@ -4849,10 +4853,10 @@ static int tg3_rx_prodring_xfer(struct tg3 *tp,
dbd->addr_lo = sbd->addr_lo;
}
- spr->rx_std_cons_idx = (spr->rx_std_cons_idx + cpycnt) %
- TG3_RX_RING_SIZE;
- dpr->rx_std_prod_idx = (dpr->rx_std_prod_idx + cpycnt) %
- TG3_RX_RING_SIZE;
+ spr->rx_std_cons_idx = (spr->rx_std_cons_idx + cpycnt) &
+ tp->rx_std_ring_mask;
+ dpr->rx_std_prod_idx = (dpr->rx_std_prod_idx + cpycnt) &
+ tp->rx_std_ring_mask;
}
while (1) {
@@ -4869,10 +4873,11 @@ static int tg3_rx_prodring_xfer(struct tg3 *tp,
if (spr->rx_jmb_cons_idx < src_prod_idx)
cpycnt = src_prod_idx - spr->rx_jmb_cons_idx;
else
- cpycnt = TG3_RX_JUMBO_RING_SIZE - spr->rx_jmb_cons_idx;
+ cpycnt = tp->rx_jmb_ring_mask + 1 -
+ spr->rx_jmb_cons_idx;
cpycnt = min(cpycnt,
- TG3_RX_JUMBO_RING_SIZE - dpr->rx_jmb_prod_idx);
+ tp->rx_jmb_ring_mask + 1 - dpr->rx_jmb_prod_idx);
si = spr->rx_jmb_cons_idx;
di = dpr->rx_jmb_prod_idx;
@@ -4906,10 +4911,10 @@ static int tg3_rx_prodring_xfer(struct tg3 *tp,
dbd->addr_lo = sbd->addr_lo;
}
- spr->rx_jmb_cons_idx = (spr->rx_jmb_cons_idx + cpycnt) %
- TG3_RX_JUMBO_RING_SIZE;
- dpr->rx_jmb_prod_idx = (dpr->rx_jmb_prod_idx + cpycnt) %
- TG3_RX_JUMBO_RING_SIZE;
+ spr->rx_jmb_cons_idx = (spr->rx_jmb_cons_idx + cpycnt) &
+ tp->rx_jmb_ring_mask;
+ dpr->rx_jmb_prod_idx = (dpr->rx_jmb_prod_idx + cpycnt) &
+ tp->rx_jmb_ring_mask;
}
return err;
@@ -6059,14 +6064,14 @@ static void tg3_rx_prodring_free(struct tg3 *tp,
if (tpr != &tp->napi[0].prodring) {
for (i = tpr->rx_std_cons_idx; i != tpr->rx_std_prod_idx;
- i = (i + 1) % TG3_RX_RING_SIZE)
+ i = (i + 1) & tp->rx_std_ring_mask)
tg3_rx_skb_free(tp, &tpr->rx_std_buffers[i],
tp->rx_pkt_map_sz);
if (tp->tg3_flags & TG3_FLAG_JUMBO_CAPABLE) {
for (i = tpr->rx_jmb_cons_idx;
i != tpr->rx_jmb_prod_idx;
- i = (i + 1) % TG3_RX_JUMBO_RING_SIZE) {
+ i = (i + 1) & tp->rx_jmb_ring_mask) {
tg3_rx_skb_free(tp, &tpr->rx_jmb_buffers[i],
TG3_RX_JMB_MAP_SZ);
}
@@ -6075,12 +6080,12 @@ static void tg3_rx_prodring_free(struct tg3 *tp,
return;
}
- for (i = 0; i < TG3_RX_RING_SIZE; i++)
+ for (i = 0; i <= tp->rx_std_ring_mask; i++)
tg3_rx_skb_free(tp, &tpr->rx_std_buffers[i],
tp->rx_pkt_map_sz);
if (tp->tg3_flags & TG3_FLAG_JUMBO_CAPABLE) {
- for (i = 0; i < TG3_RX_JUMBO_RING_SIZE; i++)
+ for (i = 0; i <= tp->rx_jmb_ring_mask; i++)
tg3_rx_skb_free(tp, &tpr->rx_jmb_buffers[i],
TG3_RX_JMB_MAP_SZ);
}
@@ -6104,15 +6109,16 @@ static int tg3_rx_prodring_alloc(struct tg3 *tp,
tpr->rx_jmb_prod_idx = 0;
if (tpr != &tp->napi[0].prodring) {
- memset(&tpr->rx_std_buffers[0], 0, TG3_RX_STD_BUFF_RING_SIZE);
+ memset(&tpr->rx_std_buffers[0], 0,
+ TG3_RX_STD_BUFF_RING_SIZE(tp));
if (tp->tg3_flags & TG3_FLAG_JUMBO_CAPABLE)
memset(&tpr->rx_jmb_buffers[0], 0,
- TG3_RX_JMB_BUFF_RING_SIZE);
+ TG3_RX_JMB_BUFF_RING_SIZE(tp));
goto done;
}
/* Zero out all descriptors. */
- memset(tpr->rx_std, 0, TG3_RX_RING_BYTES);
+ memset(tpr->rx_std, 0, TG3_RX_STD_RING_BYTES(tp));
rx_pkt_dma_sz = TG3_RX_STD_DMA_SZ;
if ((tp->tg3_flags2 & TG3_FLG2_5780_CLASS) &&
@@ -6124,7 +6130,7 @@ static int tg3_rx_prodring_alloc(struct tg3 *tp,
* stuff once. This works because the card does not
* write into the rx buffer posting rings.
*/
- for (i = 0; i < TG3_RX_RING_SIZE; i++) {
+ for (i = 0; i <= tp->rx_std_ring_mask; i++) {
struct tg3_rx_buffer_desc *rxd;
rxd = &tpr->rx_std[i];
@@ -6151,12 +6157,12 @@ static int tg3_rx_prodring_alloc(struct tg3 *tp,
if (!(tp->tg3_flags & TG3_FLAG_JUMBO_CAPABLE))
goto done;
- memset(tpr->rx_jmb, 0, TG3_RX_JUMBO_RING_BYTES);
+ memset(tpr->rx_jmb, 0, TG3_RX_JMB_RING_BYTES(tp));
if (!(tp->tg3_flags & TG3_FLAG_JUMBO_RING_ENABLE))
goto done;
- for (i = 0; i < TG3_RX_JUMBO_RING_SIZE; i++) {
+ for (i = 0; i <= tp->rx_jmb_ring_mask; i++) {
struct tg3_rx_buffer_desc *rxd;
rxd = &tpr->rx_jmb[i].std;
@@ -6196,12 +6202,12 @@ static void tg3_rx_prodring_fini(struct tg3 *tp,
kfree(tpr->rx_jmb_buffers);
tpr->rx_jmb_buffers = NULL;
if (tpr->rx_std) {
- pci_free_consistent(tp->pdev, TG3_RX_RING_BYTES,
+ pci_free_consistent(tp->pdev, TG3_RX_STD_RING_BYTES(tp),
tpr->rx_std, tpr->rx_std_mapping);
tpr->rx_std = NULL;
}
if (tpr->rx_jmb) {
- pci_free_consistent(tp->pdev, TG3_RX_JUMBO_RING_BYTES,
+ pci_free_consistent(tp->pdev, TG3_RX_JMB_RING_BYTES(tp),
tpr->rx_jmb, tpr->rx_jmb_mapping);
tpr->rx_jmb = NULL;
}
@@ -6210,23 +6216,24 @@ static void tg3_rx_prodring_fini(struct tg3 *tp,
static int tg3_rx_prodring_init(struct tg3 *tp,
struct tg3_rx_prodring_set *tpr)
{
- tpr->rx_std_buffers = kzalloc(TG3_RX_STD_BUFF_RING_SIZE, GFP_KERNEL);
+ tpr->rx_std_buffers = kzalloc(TG3_RX_STD_BUFF_RING_SIZE(tp),
+ GFP_KERNEL);
if (!tpr->rx_std_buffers)
return -ENOMEM;
- tpr->rx_std = pci_alloc_consistent(tp->pdev, TG3_RX_RING_BYTES,
+ tpr->rx_std = pci_alloc_consistent(tp->pdev, TG3_RX_STD_RING_BYTES(tp),
&tpr->rx_std_mapping);
if (!tpr->rx_std)
goto err_out;
if (tp->tg3_flags & TG3_FLAG_JUMBO_CAPABLE) {
- tpr->rx_jmb_buffers = kzalloc(TG3_RX_JMB_BUFF_RING_SIZE,
+ tpr->rx_jmb_buffers = kzalloc(TG3_RX_JMB_BUFF_RING_SIZE(tp),
GFP_KERNEL);
if (!tpr->rx_jmb_buffers)
goto err_out;
tpr->rx_jmb = pci_alloc_consistent(tp->pdev,
- TG3_RX_JUMBO_RING_BYTES,
+ TG3_RX_JMB_RING_BYTES(tp),
&tpr->rx_jmb_mapping);
if (!tpr->rx_jmb)
goto err_out;
@@ -9849,10 +9856,10 @@ static void tg3_get_ringparam(struct net_device *dev, struct ethtool_ringparam *
{
struct tg3 *tp = netdev_priv(dev);
- ering->rx_max_pending = TG3_RX_RING_SIZE - 1;
+ ering->rx_max_pending = tp->rx_std_ring_mask;
ering->rx_mini_max_pending = 0;
if (tp->tg3_flags & TG3_FLAG_JUMBO_RING_ENABLE)
- ering->rx_jumbo_max_pending = TG3_RX_JUMBO_RING_SIZE - 1;
+ ering->rx_jumbo_max_pending = tp->rx_jmb_ring_mask;
else
ering->rx_jumbo_max_pending = 0;
@@ -9873,8 +9880,8 @@ static int tg3_set_ringparam(struct net_device *dev, struct ethtool_ringparam *e
struct tg3 *tp = netdev_priv(dev);
int i, irq_sync = 0, err = 0;
- if ((ering->rx_pending > TG3_RX_RING_SIZE - 1) ||
- (ering->rx_jumbo_pending > TG3_RX_JUMBO_RING_SIZE - 1) ||
+ if ((ering->rx_pending > tp->rx_std_ring_mask) ||
+ (ering->rx_jumbo_pending > tp->rx_jmb_ring_mask) ||
(ering->tx_pending > TG3_TX_RING_SIZE - 1) ||
(ering->tx_pending <= MAX_SKB_FRAGS) ||
((tp->tg3_flags2 & TG3_FLG2_TSO_BUG) &&
@@ -13592,7 +13599,9 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
#endif
}
- tp->rx_std_max_post = TG3_RX_RING_SIZE;
+ tp->rx_std_ring_mask = TG3_RX_STD_RING_SIZE(tp) - 1;
+ tp->rx_jmb_ring_mask = TG3_RX_JMB_RING_SIZE(tp) - 1;
+ tp->rx_std_max_post = tp->rx_std_ring_mask + 1;
/* Increment the rx prod index on the rx std ring by at most
* 8 for these chips to workaround hw errata.
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 241e314..9763298 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2762,6 +2762,8 @@ struct tg3 {
void (*write32_rx_mbox) (struct tg3 *, u32,
u32);
u32 rx_copy_thresh;
+ u32 rx_std_ring_mask;
+ u32 rx_jmb_ring_mask;
u32 rx_pending;
u32 rx_jumbo_pending;
u32 rx_std_max_post;
--
1.7.2.2
^ permalink raw reply related
* [PATCH net-next 7/8] tg3: Add extend rx ring sizes for 5717 and 5719
From: Matt Carlson @ 2010-09-30 20:34 UTC (permalink / raw)
To: davem; +Cc: netdev, andy, mcarlson
This patch increases the rx ring sizes for those asic revs that support
them.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 55 ++++++++++++++++++++++++++++++++++++++--------------
drivers/net/tg3.h | 3 ++
2 files changed, 43 insertions(+), 15 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 16848a9..98f7158 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -101,9 +101,15 @@
* You can't change the ring sizes, but you can change where you place
* them in the NIC onboard memory.
*/
-#define TG3_RX_STD_RING_SIZE(tp) 512
+#define TG3_RX_STD_RING_SIZE(tp) \
+ ((GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5717 || \
+ GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719) ? \
+ RX_STD_MAX_SIZE_5717 : 512)
#define TG3_DEF_RX_RING_PENDING 200
-#define TG3_RX_JMB_RING_SIZE(tp) 256
+#define TG3_RX_JMB_RING_SIZE(tp) \
+ ((GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5717 || \
+ GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719) ? \
+ 1024 : 256)
#define TG3_DEF_RX_JUMBO_RING_PENDING 100
#define TG3_RSS_INDIR_TBL_SIZE 128
@@ -113,9 +119,6 @@
* hw multiply/modulo instructions. Another solution would be to
* replace things like '% foo' with '& (foo - 1)'.
*/
-#define TG3_RX_RCB_RING_SIZE(tp) \
- (((tp->tg3_flags & TG3_FLAG_JUMBO_CAPABLE) && \
- !(tp->tg3_flags2 & TG3_FLG2_5780_CLASS)) ? 1024 : 512)
#define TG3_TX_RING_SIZE 512
#define TG3_DEF_TX_RING_PENDING (TG3_TX_RING_SIZE - 1)
@@ -125,7 +128,7 @@
#define TG3_RX_JMB_RING_BYTES(tp) \
(sizeof(struct tg3_ext_rx_buffer_desc) * TG3_RX_JMB_RING_SIZE(tp))
#define TG3_RX_RCB_RING_BYTES(tp) \
- (sizeof(struct tg3_rx_buffer_desc) * TG3_RX_RCB_RING_SIZE(tp))
+ (sizeof(struct tg3_rx_buffer_desc) * (tp->rx_ret_ring_mask + 1))
#define TG3_TX_RING_BYTES (sizeof(struct tg3_tx_buffer_desc) * \
TG3_TX_RING_SIZE)
#define NEXT_TX(N) (((N) + 1) & (TG3_TX_RING_SIZE - 1))
@@ -4724,7 +4727,7 @@ next_pkt:
}
next_pkt_nopost:
sw_idx++;
- sw_idx &= (TG3_RX_RCB_RING_SIZE(tp) - 1);
+ sw_idx &= tp->rx_ret_ring_mask;
/* Refresh hw_idx to see if there is new work */
if (sw_idx == hw_idx) {
@@ -7612,8 +7615,8 @@ static void tg3_rings_reset(struct tg3 *tp)
if (tnapi->rx_rcb) {
tg3_set_bdinfo(tp, rxrcb, tnapi->rx_rcb_mapping,
- (TG3_RX_RCB_RING_SIZE(tp) <<
- BDINFO_FLAGS_MAXLEN_SHIFT), 0);
+ (tp->rx_ret_ring_mask + 1) <<
+ BDINFO_FLAGS_MAXLEN_SHIFT, 0);
rxrcb += TG3_BDINFO_SIZE;
}
@@ -7636,7 +7639,7 @@ static void tg3_rings_reset(struct tg3 *tp)
}
tg3_set_bdinfo(tp, rxrcb, tnapi->rx_rcb_mapping,
- (TG3_RX_RCB_RING_SIZE(tp) <<
+ ((tp->rx_ret_ring_mask + 1) <<
BDINFO_FLAGS_MAXLEN_SHIFT), 0);
stblk += 8;
@@ -7949,10 +7952,14 @@ static int tg3_reset_hw(struct tg3 *tp, int reset_phy)
BDINFO_FLAGS_DISABLED);
}
- if (tp->tg3_flags3 & TG3_FLG3_5717_PLUS)
- val = (RX_STD_MAX_SIZE_5705 << BDINFO_FLAGS_MAXLEN_SHIFT) |
- (TG3_RX_STD_DMA_SZ << 2);
- else
+ if (tp->tg3_flags3 & TG3_FLG3_5717_PLUS) {
+ if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57765)
+ val = RX_STD_MAX_SIZE_5705;
+ else
+ val = RX_STD_MAX_SIZE_5717;
+ val <<= BDINFO_FLAGS_MAXLEN_SHIFT;
+ val |= (TG3_RX_STD_DMA_SZ << 2);
+ } else
val = TG3_RX_STD_DMA_SZ << BDINFO_FLAGS_MAXLEN_SHIFT;
} else
val = RX_STD_MAX_SIZE_5705 << BDINFO_FLAGS_MAXLEN_SHIFT;
@@ -8235,7 +8242,11 @@ static int tg3_reset_hw(struct tg3 *tp, int reset_phy)
tw32(SNDBDC_MODE, SNDBDC_MODE_ENABLE | SNDBDC_MODE_ATTN_ENABLE);
tw32(RCVBDI_MODE, RCVBDI_MODE_ENABLE | RCVBDI_MODE_RCB_ATTN_ENAB);
- tw32(RCVDBDI_MODE, RCVDBDI_MODE_ENABLE | RCVDBDI_MODE_INV_RING_SZ);
+ val = RCVDBDI_MODE_ENABLE | RCVDBDI_MODE_INV_RING_SZ;
+ if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5717 ||
+ GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719)
+ val |= RCVDBDI_MODE_LRG_RING_SZ;
+ tw32(RCVDBDI_MODE, val);
tw32(SNDDATAI_MODE, SNDDATAI_MODE_ENABLE);
if (tp->tg3_flags2 & TG3_FLG2_HW_TSO)
tw32(SNDDATAI_MODE, SNDDATAI_MODE_ENABLE | 0x8);
@@ -12846,6 +12857,18 @@ static void inline vlan_features_add(struct net_device *dev, unsigned long flags
#endif
}
+static inline u32 tg3_rx_ret_ring_size(struct tg3 *tp)
+{
+ if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5717 ||
+ GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719)
+ return 4096;
+ else if ((tp->tg3_flags & TG3_FLAG_JUMBO_CAPABLE) &&
+ !(tp->tg3_flags2 & TG3_FLG2_5780_CLASS))
+ return 1024;
+ else
+ return 512;
+}
+
static int __devinit tg3_get_invariants(struct tg3 *tp)
{
static struct pci_device_id write_reorder_chipsets[] = {
@@ -13601,6 +13624,8 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
tp->rx_std_ring_mask = TG3_RX_STD_RING_SIZE(tp) - 1;
tp->rx_jmb_ring_mask = TG3_RX_JMB_RING_SIZE(tp) - 1;
+ tp->rx_ret_ring_mask = tg3_rx_ret_ring_size(tp) - 1;
+
tp->rx_std_max_post = tp->rx_std_ring_mask + 1;
/* Increment the rx prod index on the rx std ring by at most
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 9763298..f6b709a 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -26,6 +26,7 @@
#define TG3_RX_INTERNAL_RING_SZ_5906 32
#define RX_STD_MAX_SIZE_5705 512
+#define RX_STD_MAX_SIZE_5717 2048
#define RX_JUMBO_MAX_SIZE 0xdeadbeef /* XXX */
/* First 256 bytes are a mirror of PCI config space. */
@@ -972,6 +973,7 @@
#define RCVDBDI_MODE_JUMBOBD_NEEDED 0x00000004
#define RCVDBDI_MODE_FRM_TOO_BIG 0x00000008
#define RCVDBDI_MODE_INV_RING_SZ 0x00000010
+#define RCVDBDI_MODE_LRG_RING_SZ 0x00010000
#define RCVDBDI_STATUS 0x00002404
#define RCVDBDI_STATUS_JUMBOBD_NEEDED 0x00000004
#define RCVDBDI_STATUS_FRM_TOO_BIG 0x00000008
@@ -2764,6 +2766,7 @@ struct tg3 {
u32 rx_copy_thresh;
u32 rx_std_ring_mask;
u32 rx_jmb_ring_mask;
+ u32 rx_ret_ring_mask;
u32 rx_pending;
u32 rx_jumbo_pending;
u32 rx_std_max_post;
--
1.7.2.2
^ permalink raw reply related
* [PATCH net-next 2/8] tg3: 5719: Prevent tx data corruption
From: Matt Carlson @ 2010-09-30 20:34 UTC (permalink / raw)
To: davem; +Cc: netdev, andy, mcarlson
This patch enables a bit that prevents read DMA overflows and adjusts
the txmbuf margin from the hardware default. The combination of these
modifications prevents a tx data corruption issue we were seeing on the
5719.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 12 +++++++++++-
drivers/net/tg3.h | 8 +++++++-
2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index d64fec1..4f35a5c 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -7857,7 +7857,10 @@ static int tg3_reset_hw(struct tg3 *tp, int reset_phy)
tw32(BUFMGR_DMA_HIGH_WATER,
tp->bufmgr_config.dma_high_water);
- tw32(BUFMGR_MODE, BUFMGR_MODE_ENABLE | BUFMGR_MODE_ATTN_ENABLE);
+ val = BUFMGR_MODE_ENABLE | BUFMGR_MODE_ATTN_ENABLE;
+ if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719)
+ val |= BUFMGR_MODE_NO_TX_UNDERRUN;
+ tw32(BUFMGR_MODE, val);
for (i = 0; i < 2000; i++) {
if (tr32(BUFMGR_MODE) & BUFMGR_MODE_ENABLE)
break;
@@ -8037,6 +8040,13 @@ static int tg3_reset_hw(struct tg3 *tp, int reset_phy)
val | TG3_RDMA_RSRVCTRL_FIFO_OFLW_FIX);
}
+ if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719) {
+ val = tr32(TG3_LSO_RD_DMA_CRPTEN_CTRL);
+ tw32(TG3_LSO_RD_DMA_CRPTEN_CTRL, val |
+ TG3_LSO_RD_DMA_CRPTEN_CTRL_BLEN_BD_4K |
+ TG3_LSO_RD_DMA_CRPTEN_CTRL_BLEN_LSO_4K);
+ }
+
/* Receive/send statistics. */
if (tp->tg3_flags2 & TG3_FLG2_5750_PLUS) {
val = tr32(RCVLPC_STATS_ENABLE);
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 44733e4..ec62f05 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -1225,6 +1225,7 @@
#define BUFMGR_MODE_ATTN_ENABLE 0x00000004
#define BUFMGR_MODE_BM_TEST 0x00000008
#define BUFMGR_MODE_MBLOW_ATTN_ENAB 0x00000010
+#define BUFMGR_MODE_NO_TX_UNDERRUN 0x80000000
#define BUFMGR_STATUS 0x00004404
#define BUFMGR_STATUS_ERROR 0x00000004
#define BUFMGR_STATUS_MBLOW 0x00000010
@@ -1306,7 +1307,12 @@
#define TG3_RDMA_RSRVCTRL_REG 0x00004900
#define TG3_RDMA_RSRVCTRL_FIFO_OFLW_FIX 0x00000004
-/* 0x4904 --> 0x4c00 unused */
+/* 0x4904 --> 0x4910 unused */
+
+#define TG3_LSO_RD_DMA_CRPTEN_CTRL 0x00004910
+#define TG3_LSO_RD_DMA_CRPTEN_CTRL_BLEN_BD_4K 0x00030000
+#define TG3_LSO_RD_DMA_CRPTEN_CTRL_BLEN_LSO_4K 0x000c0000
+/* 0x4914 --> 0x4c00 unused */
/* Write DMA control registers */
#define WDMAC_MODE 0x00004c00
--
1.7.2.2
^ permalink raw reply related
* [PATCH net-next 0/8] tg3: Bugfixes and updates
From: Matt Carlson @ 2010-09-30 20:34 UTC (permalink / raw)
To: davem; +Cc: netdev, andy, mcarlson
This patchset implements some bugfixes, removes the 5724 device
ID and introduces extended rx buffer rings.
^ permalink raw reply
* [PATCH net-next 1/8] tg3: Fix potential netpoll crash
From: Matt Carlson @ 2010-09-30 20:34 UTC (permalink / raw)
To: davem; +Cc: netdev, andy, mcarlson
Up until now the tg3 driver would call netif_napi_add() for the maximum
number of NAPI instances the driver could use. The problem is that
netpoll could call tg3_poll() on instances that are not active. The net
effect is that the driver will crash attempting to dereference
uninitialized pointers.
The fix is to only allocate as many NAPI instances as the driver would
use in tg3_open() and deleted them in tg3_close().
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 111 +++++++++++++++++++++++++++++++----------------------
1 files changed, 65 insertions(+), 46 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index fdb438d..d64fec1 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -752,42 +752,6 @@ static void tg3_int_reenable(struct tg3_napi *tnapi)
HOSTCC_MODE_ENABLE | tnapi->coal_now);
}
-static void tg3_napi_disable(struct tg3 *tp)
-{
- int i;
-
- for (i = tp->irq_cnt - 1; i >= 0; i--)
- napi_disable(&tp->napi[i].napi);
-}
-
-static void tg3_napi_enable(struct tg3 *tp)
-{
- int i;
-
- for (i = 0; i < tp->irq_cnt; i++)
- napi_enable(&tp->napi[i].napi);
-}
-
-static inline void tg3_netif_stop(struct tg3 *tp)
-{
- tp->dev->trans_start = jiffies; /* prevent tx timeout */
- tg3_napi_disable(tp);
- netif_tx_disable(tp->dev);
-}
-
-static inline void tg3_netif_start(struct tg3 *tp)
-{
- /* NOTE: unconditional netif_tx_wake_all_queues is only
- * appropriate so long as all callers are assured to
- * have free tx slots (such as after tg3_init_hw)
- */
- netif_tx_wake_all_queues(tp->dev);
-
- tg3_napi_enable(tp);
- tp->napi[0].hw_status->status |= SD_STATUS_UPDATED;
- tg3_enable_ints(tp);
-}
-
static void tg3_switch_clocks(struct tg3 *tp)
{
u32 clock_ctrl;
@@ -4338,6 +4302,11 @@ static int tg3_setup_phy(struct tg3 *tp, int force_reset)
return err;
}
+static inline int tg3_irq_sync(struct tg3 *tp)
+{
+ return tp->irq_sync;
+}
+
/* This is called whenever we suspect that the system chipset is re-
* ordering the sequence of MMIO to the tx send mailbox. The symptom
* is bogus tx completions. We try to recover by setting the
@@ -5083,6 +5052,59 @@ tx_recovery:
return work_done;
}
+static void tg3_napi_disable(struct tg3 *tp)
+{
+ int i;
+
+ for (i = tp->irq_cnt - 1; i >= 0; i--)
+ napi_disable(&tp->napi[i].napi);
+}
+
+static void tg3_napi_enable(struct tg3 *tp)
+{
+ int i;
+
+ for (i = 0; i < tp->irq_cnt; i++)
+ napi_enable(&tp->napi[i].napi);
+}
+
+static void tg3_napi_init(struct tg3 *tp)
+{
+ int i;
+
+ netif_napi_add(tp->dev, &tp->napi[0].napi, tg3_poll, 64);
+ for (i = 1; i < tp->irq_cnt; i++)
+ netif_napi_add(tp->dev, &tp->napi[i].napi, tg3_poll_msix, 64);
+}
+
+static void tg3_napi_fini(struct tg3 *tp)
+{
+ int i;
+
+ for (i = 0; i < tp->irq_cnt; i++)
+ netif_napi_del(&tp->napi[i].napi);
+}
+
+static inline void tg3_netif_stop(struct tg3 *tp)
+{
+ tp->dev->trans_start = jiffies; /* prevent tx timeout */
+ tg3_napi_disable(tp);
+ netif_tx_disable(tp->dev);
+}
+
+static inline void tg3_netif_start(struct tg3 *tp)
+{
+ /* NOTE: unconditional netif_tx_wake_all_queues is only
+ * appropriate so long as all callers are assured to
+ * have free tx slots (such as after tg3_init_hw)
+ */
+ netif_tx_wake_all_queues(tp->dev);
+
+ tg3_napi_enable(tp);
+ tp->napi[0].hw_status->status |= SD_STATUS_UPDATED;
+ tg3_enable_ints(tp);
+}
+
static void tg3_irq_quiesce(struct tg3 *tp)
{
int i;
@@ -5096,11 +5118,6 @@ static void tg3_irq_quiesce(struct tg3 *tp)
synchronize_irq(tp->napi[i].irq_vec);
}
-static inline int tg3_irq_sync(struct tg3 *tp)
-{
- return tp->irq_sync;
-}
-
/* Fully shutdown all tg3 driver activity elsewhere in the system.
* If irq_sync is non-zero, then the IRQ handler must be synchronized
* with as well. Most of the time, this is not necessary except when
@@ -8915,6 +8932,8 @@ static int tg3_open(struct net_device *dev)
if (err)
goto err_out1;
+ tg3_napi_init(tp);
+
tg3_napi_enable(tp);
for (i = 0; i < tp->irq_cnt; i++) {
@@ -9002,6 +9021,7 @@ err_out3:
err_out2:
tg3_napi_disable(tp);
+ tg3_napi_fini(tp);
tg3_free_consistent(tp);
err_out1:
@@ -9049,6 +9069,8 @@ static int tg3_close(struct net_device *dev)
memcpy(&tp->estats_prev, tg3_get_estats(tp),
sizeof(tp->estats_prev));
+ tg3_napi_fini(tp);
+
tg3_free_consistent(tp);
tg3_set_power_state(tp, PCI_D3hot);
@@ -14599,13 +14621,10 @@ static int __devinit tg3_init_one(struct pci_dev *pdev,
tnapi->consmbox = rcvmbx;
tnapi->prodmbox = sndmbx;
- if (i) {
+ if (i)
tnapi->coal_now = HOSTCC_MODE_COAL_VEC1_NOW << (i - 1);
- netif_napi_add(dev, &tnapi->napi, tg3_poll_msix, 64);
- } else {
+ else
tnapi->coal_now = HOSTCC_MODE_NOW;
- netif_napi_add(dev, &tnapi->napi, tg3_poll, 64);
- }
if (!(tp->tg3_flags & TG3_FLAG_SUPPORT_MSIX))
break;
--
1.7.2.2
^ permalink raw reply related
* [PATCH net-next 3/8] tg3: Remove 5724 device ID
From: Matt Carlson @ 2010-09-30 20:34 UTC (permalink / raw)
To: davem; +Cc: netdev, andy, mcarlson
This product was never released to the public.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 2 --
drivers/net/tg3.h | 1 -
2 files changed, 0 insertions(+), 3 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 4f35a5c..93228aa 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -264,7 +264,6 @@ static DEFINE_PCI_DEVICE_TABLE(tg3_pci_tbl) = {
{PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, TG3PCI_DEVICE_TIGON3_57788)},
{PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, TG3PCI_DEVICE_TIGON3_5717)},
{PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, TG3PCI_DEVICE_TIGON3_5718)},
- {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, TG3PCI_DEVICE_TIGON3_5724)},
{PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, TG3PCI_DEVICE_TIGON3_57781)},
{PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, TG3PCI_DEVICE_TIGON3_57785)},
{PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, TG3PCI_DEVICE_TIGON3_57761)},
@@ -12878,7 +12877,6 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_5717 ||
tp->pdev->device == TG3PCI_DEVICE_TIGON3_5718 ||
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_5724 ||
tp->pdev->device == TG3PCI_DEVICE_TIGON3_5719)
pci_read_config_dword(tp->pdev,
TG3PCI_GEN2_PRODID_ASICREV,
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index ec62f05..241e314 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -46,7 +46,6 @@
#define TG3PCI_DEVICE_TIGON3_5785_F 0x16a0 /* 10/100 only */
#define TG3PCI_DEVICE_TIGON3_5717 0x1655
#define TG3PCI_DEVICE_TIGON3_5718 0x1656
-#define TG3PCI_DEVICE_TIGON3_5724 0x165c
#define TG3PCI_DEVICE_TIGON3_57781 0x16b1
#define TG3PCI_DEVICE_TIGON3_57785 0x16b5
#define TG3PCI_DEVICE_TIGON3_57761 0x16b0
--
1.7.2.2
^ permalink raw reply related
* [PATCH net-next 5/8] tg3: Futureproof the loopback test
From: Matt Carlson @ 2010-09-30 20:34 UTC (permalink / raw)
To: davem; +Cc: netdev, andy, mcarlson
There are other multiqueue modes 5717 and 5719 devices can assume. This
patch makes sure that the loopback test is safe, should those other
modes be enabled in the future.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index d76e718..50b7e35 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -10642,7 +10642,8 @@ static int tg3_run_loopback(struct tg3 *tp, int loopback_mode)
tnapi = &tp->napi[0];
rnapi = &tp->napi[0];
if (tp->irq_cnt > 1) {
- rnapi = &tp->napi[1];
+ if (tp->tg3_flags3 & TG3_FLG3_ENABLE_RSS)
+ rnapi = &tp->napi[1];
if (tp->tg3_flags3 & TG3_FLG3_ENABLE_TSS)
tnapi = &tp->napi[1];
}
--
1.7.2.2
^ permalink raw reply related
* [PATCH net-next 8/8] tg3: Update version to 3.114
From: Matt Carlson @ 2010-09-30 20:34 UTC (permalink / raw)
To: davem; +Cc: netdev, andy, mcarlson
This patch updates the tg3 version to 3.114.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 98f7158..9b134fd 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -69,10 +69,10 @@
#define DRV_MODULE_NAME "tg3"
#define TG3_MAJ_NUM 3
-#define TG3_MIN_NUM 113
+#define TG3_MIN_NUM 114
#define DRV_MODULE_VERSION \
__stringify(TG3_MAJ_NUM) "." __stringify(TG3_MIN_NUM)
-#define DRV_MODULE_RELDATE "August 2, 2010"
+#define DRV_MODULE_RELDATE "September 30, 2010"
#define TG3_DEF_MAC_MODE 0
#define TG3_DEF_RX_MODE 0
--
1.7.2.2
^ permalink raw reply related
* [PATCH net-next 4/8] tg3: Cleanup missing VPD partno section
From: Matt Carlson @ 2010-09-30 20:34 UTC (permalink / raw)
To: davem; +Cc: netdev, andy, mcarlson
This patch cleans up the default VPD partno section. New entries for
5717 asic rev devices were also added.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
---
drivers/net/tg3.c | 71 ++++++++++++++++++++++++++++------------------------
1 files changed, 38 insertions(+), 33 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 93228aa..d76e718 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -12527,44 +12527,49 @@ partno:
out_not_found:
kfree(vpd_data);
- if (!tp->board_part_number[0])
+ if (tp->board_part_number[0])
return;
out_no_vpd:
- if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5906)
+ if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5717) {
+ if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_5717)
+ strcpy(tp->board_part_number, "BCM5717");
+ else if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_5718)
+ strcpy(tp->board_part_number, "BCM5718");
+ else
+ goto nomatch;
+ } else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57780) {
+ if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_57780)
+ strcpy(tp->board_part_number, "BCM57780");
+ else if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_57760)
+ strcpy(tp->board_part_number, "BCM57760");
+ else if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_57790)
+ strcpy(tp->board_part_number, "BCM57790");
+ else if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_57788)
+ strcpy(tp->board_part_number, "BCM57788");
+ else
+ goto nomatch;
+ } else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57765) {
+ if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_57761)
+ strcpy(tp->board_part_number, "BCM57761");
+ else if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_57765)
+ strcpy(tp->board_part_number, "BCM57765");
+ else if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_57781)
+ strcpy(tp->board_part_number, "BCM57781");
+ else if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_57785)
+ strcpy(tp->board_part_number, "BCM57785");
+ else if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_57791)
+ strcpy(tp->board_part_number, "BCM57791");
+ else if (tp->pdev->device == TG3PCI_DEVICE_TIGON3_57795)
+ strcpy(tp->board_part_number, "BCM57795");
+ else
+ goto nomatch;
+ } else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5906) {
strcpy(tp->board_part_number, "BCM95906");
- else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57780 &&
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_57780)
- strcpy(tp->board_part_number, "BCM57780");
- else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57780 &&
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_57760)
- strcpy(tp->board_part_number, "BCM57760");
- else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57780 &&
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_57790)
- strcpy(tp->board_part_number, "BCM57790");
- else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57780 &&
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_57788)
- strcpy(tp->board_part_number, "BCM57788");
- else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57765 &&
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_57761)
- strcpy(tp->board_part_number, "BCM57761");
- else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57765 &&
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_57765)
- strcpy(tp->board_part_number, "BCM57765");
- else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57765 &&
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_57781)
- strcpy(tp->board_part_number, "BCM57781");
- else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57765 &&
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_57785)
- strcpy(tp->board_part_number, "BCM57785");
- else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57765 &&
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_57791)
- strcpy(tp->board_part_number, "BCM57791");
- else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57765 &&
- tp->pdev->device == TG3PCI_DEVICE_TIGON3_57795)
- strcpy(tp->board_part_number, "BCM57795");
- else
+ } else {
+nomatch:
strcpy(tp->board_part_number, "none");
+ }
}
static int __devinit tg3_fw_img_is_valid(struct tg3 *tp, u32 offset)
--
1.7.2.2
^ permalink raw reply related
* Re: [PATCH V3] fs: allow for more than 2^31 files
From: Robin Holt @ 2010-09-30 20:26 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, dipankar, holt, viro, bcrl, den, mingo, mszeredi,
cmm, npiggin, xemul, linux-kernel, netdev
In-Reply-To: <1285645611.10438.27.camel@edumazet-laptop>
On Tue, Sep 28, 2010 at 05:46:51AM +0200, Eric Dumazet wrote:
> Le lundi 27 septembre 2010 à 15:36 -0700, David Miller a écrit :
...
> Fix is to let /proc/sys/fs/file-nr & /proc/sys/fs/file-max use long
> integers, and change af_unix to use an atomic_long_t instead of
> atomic_t.
>
> get_max_files() is changed to return an unsigned long.
I _THINK_ we actually want get_max_files to return a long and have
the files_stat_struct definitions be longs. If we do not have it that
way, we could theoretically open enough files on a single cpu to make
get_nr_files return a negative without overflowing max_files. That,
of course, would require an insane amount of memory, but I think it is
technically more correct.
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -1352,16 +1352,16 @@ static struct ctl_table fs_table[] = {
> {
> .procname = "file-nr",
> .data = &files_stat,
> - .maxlen = 3*sizeof(int),
> + .maxlen = sizeof(files_stat),
> .mode = 0444,
> - .proc_handler = proc_nr_files,
> + .proc_handler = proc_doulongvec_minmax,
With this change, don't we lose the current nr_files value? I think
you need proc_nr_files to stay as it was. If you disagree, we should
probably eliminate the definitions for proc_nr_files as I don't believe
they are used anywhere else.
Thanks,
Robin
^ permalink raw reply
* wireless-testing (2.6.36-rc6): inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
From: Ben Greear @ 2010-09-30 20:16 UTC (permalink / raw)
To: linux-wireless@vger.kernel.org, NetDev
We saw this on a system that has two ath9k APs, some extra routing tables
and rules to use them, and a user-space 'bridge' that uses packet-sockets.
Aside from a few patches to help virtualize wireless devices (and none directly to ath9k),
this is today's wireless-testing tree.
=================================
[ INFO: inconsistent lock state ]
2.6.36-rc6-wl+ #20
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kworker/u:0/5 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&(&list->lock)->rlock){+.?...}, at: [<c0741066>] packet_rcv+0x1f3/0x27a
{IN-SOFTIRQ-W} state was registered at:
[<c0457036>] __lock_acquire+0x27f/0xb8c
[<c045799d>] lock_acquire+0x5a/0x78
[<c07638dc>] _raw_spin_lock+0x1b/0x2a
[<c0741066>] packet_rcv+0x1f3/0x27a
[<c06d04a6>] __netif_receive_skb+0x340/0x389
[<c06d058d>] process_backlog+0x9e/0x16e
[<c06d1104>] net_rx_action+0x99/0x17a
[<c04393f2>] __do_softirq+0x86/0x111
[<c04394b3>] do_softirq+0x36/0x5a
[<c04395ec>] irq_exit+0x35/0x69
[<c0403fb9>] do_IRQ+0x86/0x9a
[<c04034ee>] common_interrupt+0x2e/0x40
[<c06adb37>] cpuidle_idle_call+0x7f/0xb4
[<c040227f>] cpu_idle+0x4e/0x6b
[<c074f9c5>] rest_init+0x8d/0x92
[<c097b8e0>] start_kernel+0x316/0x31b
[<c097b0d0>] i386_start_kernel+0xd0/0xd7
irq event stamp: 62565
hardirqs last enabled at (62565): [<c04adf88>] kmem_cache_alloc+0xa0/0xc5
hardirqs last disabled at (62564): [<c04adf3a>] kmem_cache_alloc+0x52/0xc5
softirqs last enabled at (62562): [<c073ffe3>] run_filter+0x9b/0xa5
softirqs last disabled at (62560): [<c073ff59>] run_filter+0x11/0xa5
other info that might help us debug this:
4 locks held by kworker/u:0/5:
#0: ((wiphy_name(local->hw.wiphy))){+.+...}, at: [<c0443d2d>] process_one_work+0x173/0x2c3
#1: ((&sc->hw_check_work)){+.+...}, at: [<c0443d2d>] process_one_work+0x173/0x2c3
#2: (rcu_read_lock){.+.+..}, at: [<f8ff86f8>] ieee80211_tx_status+0x5c1/0x6b9 [mac80211]
#3: (rcu_read_lock){.+.+..}, at: [<c06cedec>] rcu_read_lock+0x0/0x21
stack backtrace:
Pid: 5, comm: kworker/u:0 Not tainted 2.6.36-rc6-wl+ #20
Call Trace:
[<c0761c6a>] ? printk+0xf/0x15
[<c0455f15>] valid_state+0x131/0x144
[<c0456017>] mark_lock+0xef/0x1de
[<c0456738>] ? check_usage_backwards+0x0/0x68
[<c04570a4>] __lock_acquire+0x2ed/0xb8c
[<c0455f03>] ? valid_state+0x11f/0x144
[<c06dd853>] ? sk_run_filter+0x1d0/0x3c0
[<c0455f46>] ? mark_lock+0x1e/0x1de
[<c045799d>] lock_acquire+0x5a/0x78
[<c0741066>] ? packet_rcv+0x1f3/0x27a
[<c07638dc>] _raw_spin_lock+0x1b/0x2a
[<c0741066>] ? packet_rcv+0x1f3/0x27a
[<c0741066>] packet_rcv+0x1f3/0x27a
[<c06d04a6>] __netif_receive_skb+0x340/0x389
[<c06d0f3b>] netif_receive_skb+0x72/0x78
[<f8ff87a0>] ieee80211_tx_status+0x669/0x6b9 [mac80211]
[<c0450050>] ? ntp_start_leap_timer+0x4b/0x67
[<f9091861>] ath_tx_complete_buf+0x1ba/0x219 [ath9k]
[<c04563a9>] ? trace_hardirqs_on_caller+0x104/0x125
[<f9092d9d>] ath_draintxq+0x179/0x2c8 [ath9k]
[<f9094346>] ath_drain_all_txq+0x10f/0x11d [ath9k]
[<f908e9d1>] ath_reset+0x40/0x15e [ath9k]
[<f908f31e>] ath_hw_check+0x3c/0x47 [ath9k]
[<c0443d77>] process_one_work+0x1bd/0x2c3
[<c0443d2d>] ? process_one_work+0x173/0x2c3
[<f908f2e2>] ? ath_hw_check+0x0/0x47 [ath9k]
[<c0445422>] worker_thread+0xf7/0x1f7
[<c044532b>] ? worker_thread+0x0/0x1f7
[<c0447dff>] kthread+0x62/0x67
[<c0447d9d>] ? kthread+0x0/0x67
[<c0403506>] kernel_thread_helper+0x6/0x1a
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply
* sending VLAN packets via packet_mmap
From: Phil Sutter @ 2010-09-30 19:24 UTC (permalink / raw)
To: netdev; +Cc: Johann Baudy, Eric Dumazet
Hi,
support for VLAN tags in af_packet.c seems to be incomplete. While it's
possible to receive a full packet using SOCK_RAW, sending one will fail
due to size constraints. tpacket_snd() does not account for the
additional four bytes.
There are a few possible solutions to this problem. When searching for
the most appropriate one, I've been looking at tpacket_rcv() which
simply writes the whole frame out, setting tpacket2_hdr.tp_vlan_tci on
the go. So from a user's point of view, information is redundantly
available.
The actual problem in tpacket_snd() is this:
| reserve = dev->hard_header_len;
| [...]
| if (size_max > dev->mtu + reserve)
| size_max = dev->mtu + reserve;
I guess the check is there to avoid skb overflows on malicious data
input. Is this correct? Are there other reasons for it's existence?
As af_packet.c has no knowledge about VLANs (other than a call to
vlan_tx_tag_get()), I guess avoiding expensive parsing of the inserted
data for the VLAN tag should be appropriate. Nevertheless the check from
above needs to account for the additional VLAN_HLEN when the tag exists.
So a rather trivial solution would be to drop the check completely
(given no other constraints, of course), thereby giving the user a
little more ability to break things. Alternatively, one could require
that tpacket2_hdr.tp_vlan_tci be set (at least non-zero) to identify
packets containing a VLAN tag and allow the additional size (probably
mostly consistent to the logic inside tpacket_rcv()).
A third solution could be like the second one, but not accepting
prebuilt packets including VLAN header at all and using
tpacket2_hdr.tp_vlan_tci together with vlan_put_tag() to instead insert
it from inside the kernel.
Hopefully I didn't overlook something crucial. Feel free to flame me if
that's the case! :)
Greetings, Phil
^ permalink raw reply
* Re: [PATCH 4/5] AF_UNIX: find peers on multicast Unix stream sockets
From: Alban Crequy @ 2010-09-30 19:24 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S. Miller, Stephen Hemminger, Cyrill Gorcunov,
Alexey Dobriyan, Lennart Poettering, Kay Sievers, Ian Molton,
netdev, linux-kernel
In-Reply-To: <1285351237.2478.7.camel@edumazet-laptop>
Le Fri, 24 Sep 2010 20:00:37 +0200,
Eric Dumazet <eric.dumazet@gmail.com> a écrit :
> Le vendredi 24 septembre 2010 à 18:25 +0100, Alban Crequy a écrit :
>
> > @@ -1612,7 +1671,12 @@ static int unix_stream_sendmsg(struct kiocb
> > *kiocb, struct socket *sock, } else {
> > sunaddr = NULL;
> > err = -ENOTCONN;
> > - other = NULL; /* FIXME: get the list of other
> > connection */
> > + max_others = atomic_read(&unix_nr_multicast_socks);
> > + others = kzalloc((max_others + 1) * sizeof(void
> > *), GFP_KERNEL);
> > + unix_find_other(sock_net(sk), u->addr->name,
> > + u->addr->len, 0, u->addr->hash, 1, others,
> > max_others, &err);
> > + other = others[0];
> > + kfree(others);
> > if (!other)
> > goto out_err;
> > }
>
> Seriously, this block sizing against unix_nr_multicast_socks is not
> scalable. What happens if we have 1000 sockets ?
> kzalloc() to clear 8000 bytes ?
> Its also unsafe.
>
> (say you kzalloc() a buffer for 2 sockets, and another cpu inserts a
> new socket. unix_find_socket_byname() can overflow the buffer)
>
>
> You should use a list, and allocates elements in
> unix_find_socket_byname()
>
> struct item {
> struct item *next;
> struct sock *s;
> };
Thanks for your review.
I cannot allocate elements directly in unix_find_socket_byname()
iteration after iteration because the spinlock "unix_table_lock" is
held. If I release the spinlock to allocate, the number of sockets in
the table may change.
I changed the code to count the sockets with the lock held and then
allocate. In the unfortunate case where the allocation is not big
enough (if another process inserts a new socket), it just tries again.
The code is available here. Please pull from:
git://git.collabora.co.uk/git/user/alban/linux-2.6.35.y/.git unix-multicast2
It is still a work in progress. Missing pieces:
- The flow control does not work correctly: poll/select does not match
the reality
- Atomic delivery: if a process is killed or interrupted in the middle
of a delivery, only a subset of the recipients will get the message
- Some locking to provide the same delivery order to all the recipients
when several senders run concurrently.
Feedback welcome,
Alban Crequy
^ permalink raw reply
* [PATCH net-next] cxgb4: remove a bogus PCI function number check
From: Dimitris Michailidis @ 2010-09-30 19:17 UTC (permalink / raw)
To: netdev; +Cc: Dimitris Michailidis
Remove a bogus PCI function number check from the driver's .remove
method that causes pci_release_regions not to be called for function 0
if additional functions are attached and one of them is used as primary.
Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
---
drivers/net/cxgb4/cxgb4_main.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/cxgb4/cxgb4_main.c b/drivers/net/cxgb4/cxgb4_main.c
index 4fb08e3..22169a7 100644
--- a/drivers/net/cxgb4/cxgb4_main.c
+++ b/drivers/net/cxgb4/cxgb4_main.c
@@ -3863,7 +3863,7 @@ static void __devexit remove_one(struct pci_dev *pdev)
pci_disable_device(pdev);
pci_release_regions(pdev);
pci_set_drvdata(pdev, NULL);
- } else if (PCI_FUNC(pdev->devfn) > 0)
+ } else
pci_release_regions(pdev);
}
--
1.5.4
^ permalink raw reply related
* Re: pull request: wireless-2.6 2010-09-29
From: David Miller @ 2010-09-30 19:03 UTC (permalink / raw)
To: linville; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20100929203352.GC2516@tuxdriver.com>
From: "John W. Linville" <linville@tuxdriver.com>
Date: Wed, 29 Sep 2010 16:33:53 -0400
> Here are two more fixes intended for 2.6.36. One fixes a user after
> free error, the other fixes a reported regression (bug 17722). Both are
> reasonably small and well documented in the commit logs.
Pulled, thanks John.
^ permalink raw reply
* Re: Packet time delays on multi-core systems
From: Eric Dumazet @ 2010-09-30 18:52 UTC (permalink / raw)
To: Alexey Vlasov; +Cc: Linux Kernel Mailing List, netdev
In-Reply-To: <20100930181556.GC4094@beaver.vrungel.ru>
Le jeudi 30 septembre 2010 à 22:15 +0400, Alexey Vlasov a écrit :
> On Thu, Sep 30, 2010 at 08:03:02PM +0200, Eric Dumazet wrote:
> > >
> > > The last test were made already concerning such rx queue binding:
> > > # cat /proc/irq/60/smp_affinity
> > > 001000
> > > # cat /proc/irq/61/smp_affinity
> > > 010000
> > > # cat /proc/irq/62/smp_affinity
> > > 080000
> > > # cat /proc/irq/63/smp_affinity
> > > 800000
> > >
> >
> > Why 60, 61, 62, 63 ? This should be 753, 754, 755, 756
>
> I've got several similar servers, the interrups 60-63 are on
> that one that I can test now, so this isn't a mistake.
>
If you have a burst of 'LOG' matches, it can really slow down the whole
thing.
You should add a limiter (eg: no more than 5 messages per second)
http://netfilter.org/documentation/HOWTO/packet-filtering-HOWTO-7.html
This module is most useful after a limit match, so you don't
flood your logs.
^ permalink raw reply
* Re: RFC: MTU for serving NFS on Infiniband
From: Marc Aurele La France @ 2010-09-30 18:50 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Ben Hutchings, linux-kernel, netdev, David S. Miller,
Alexey Kuznetsov, Pekka Savola (ipv6), James Morris,
Hideaki YOSHIFUJI, Patrick McHardy
In-Reply-To: <alpine.WNT.2.00.1008251408520.632@cluij.ucs.ualberta.ca>
On Thu, 26 Aug 2010, Marc Aurele La France wrote:
> I do want to thank you, however, for reminding me of TCP. It's something
> 20/20 hindsight says I should have checked out before starting this thread.
> Logistically, it'll be a few days before I can do so though. If that allows
> me to increase the MTU all the way up to 65520, then this UDP thing will
> likely remain unresolved.
Just to close off on this. It's been a few weeks now, but moving to NFS
over TCP allows me to increase the MTU all the way up to 65520 without
issues.
Thanks for the help.
Marc.
+----------------------------------+----------------------------------+
| Marc Aurele La France | work: 1-780-492-9310 |
| Academic Information and | fax: 1-780-492-1729 |
| Communications Technologies | email: tsi@ualberta.ca |
| 352 General Services Building +----------------------------------+
| University of Alberta | |
| Edmonton, Alberta | Standard disclaimers apply |
| T6G 2H1 | |
| CANADA | |
+----------------------------------+----------------------------------+
^ permalink raw reply
* Re: [MeeGo-Dev][PATCH v3] Topcliff: Update PCH_CAN driver to 2.6.35
From: David Miller @ 2010-09-30 18:50 UTC (permalink / raw)
To: wg-5Yr1BZd7O62+XT7JhA+gdA
Cc: andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w,
socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
sameo-VuQAYsv1563Yd54FQh9/CA,
margie.foster-ral2JQCrhuEAvxtiuMwx3w,
netdev-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
masa-korg-ECg8zkTtlr0C6LszWs/t0g,
kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w,
joel.clark-ral2JQCrhuEAvxtiuMwx3w,
morinaga526-ECg8zkTtlr0C6LszWs/t0g,
meego-dev-WXzIur8shnEAvxtiuMwx3w,
yong.y.wang-ral2JQCrhuEAvxtiuMwx3w, chripell-VaTbYqLCNhc,
qi.wang-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <4CA4541F.5040804-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>
From: Wolfgang Grandegger <wg-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>
Date: Thu, 30 Sep 2010 11:10:55 +0200
> On 09/24/2010 12:24 PM, Masayuki Ohtak wrote:
>> +};
>> +
>> +static struct can_bittiming_const pch_can_bittiming_const = {
>> + .name = KBUILD_MODNAME,
>
> Not sure what KBUILD_MODNAME is. Should be "pch_can", the name of the
> driver.
That's what KBUILD_MODNAME will be defined to when this gets
compiled :-)
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox