* Re: [PATCH] tunnels: Fix tunnels change rcu protection
From: David Miller @ 2010-10-27 21:21 UTC
To: eric.dumazet; +Cc: xemul, netdev
In-Reply-To: <1288206372.2658.13.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 27 Oct 2010 21:06:12 +0200
> On Wednesday, 27 October 2010 at 21:02 +0200, Eric Dumazet wrote:
>
>
>> Hmm, maybe we should allocate a "struct ip_tunnel_parm" instead of using
>> an embedded one (in struct ip_tunnel), and stick an rcu_head in it to
>> delay its freeing...
>>
>
> I forgot to Ack your patch, of course.
>
> We can implement something better when net-next-2.6 re-opens.
>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied.
* Re: [PATCH] bonding: Fix lockdep warning after bond_vlan_rx_register()
From: David Miller @ 2010-10-27 21:21 UTC
To: jarkao2; +Cc: eric.dumazet, netdev, jesse, fubar
In-Reply-To: <20101027170822.GA1902@del.dom.local>
From: Jarek Poplawski <jarkao2@gmail.com>
Date: Wed, 27 Oct 2010 19:08:22 +0200
> [ Full info at netdev: Wed, 27 Oct 2010 12:24:30 +0200
> Subject: [BUG net-2.6 vlan/bonding] lockdep splats ]
>
> Use BH variant of write_lock(&bond->lock) (as elsewhere in bond_main)
> to prevent this dependency.
>
> Fixes commit f35188faa0fbabefac476536994f4b6f3677380f [v2.6.36]
>
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Jay Vosburgh <fubar@us.ibm.com>
Applied.
* Re: [PATCH v2] ehea: Fixing statistics
From: David Miller @ 2010-10-27 21:21 UTC
To: eric.dumazet; +Cc: leitao, netdev
In-Reply-To: <1288205610.2658.2.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 27 Oct 2010 20:53:30 +0200
> On Wednesday, 27 October 2010 at 14:45 -0400, Breno Leitao wrote:
>> (Applied over Eric's "ehea: fix use after free" patch)
>>
>> Currently ehea stats are broken. The byte counters are read from
>> the hardware, while the packet counters are read from the device
>> driver. Also, the device driver counters are reset during the
>> down process, while the hardware counters aren't, causing some
>> weird numbers.
>>
>> This patch just consolidates the packets and bytes on the device
>> driver.
>>
>> Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
>
> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied.
* Re: [RFC][net-next-2.6 PATCH v2] 8021q: set hard_header_len when VLAN offload features are toggled
From: John Fastabend @ 2010-10-27 21:40 UTC
To: Jesse Gross; +Cc: netdev@vger.kernel.org, bhutchings@solarflare.com
In-Reply-To: <AANLkTi=9tL5yrVGWsOagcgyKte5z8R9ADdz5n-Uf2Lsw@mail.gmail.com>
On 10/26/2010 7:05 PM, Jesse Gross wrote:
> On Tue, Oct 26, 2010 at 2:59 PM, John Fastabend
> <john.r.fastabend@intel.com> wrote:
>> Toggling the vlan tx|rx hw offloads needs to set the hard_header_len
>> as well otherwise we end up using LL_RESERVED_SPACE incorrectly.
>> This results in pskb_expand_head() being used unnecessarily.
>>
>> This adds a check in vlan_transfer_features to catch the ETH_FLAG_TXVLAN
>> flag and set the header length. This requires drivers to add the
>> ETH_FLAG_TXVLAN to vlan_features.
>>
>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>
> I think this addresses all of the original problems. However, I don't
> think that we want to have drivers claim to support vlan offloading as
> a feature for vlan packets. That implies some type of QinQ
> functionality to me. In addition, if the vlan device claims to
> support offloading and a second vlan device is stacked on top of it,
> then the two will clobber skb->vlan_tci. It's probably simpler to
> just keep track of whether vlan offloading is currently enabled so we
> can find out whether it changed.
>
Agreed. Rather than trying to be clever, this is probably the easiest approach.
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -334,6 +334,12 @@ static void vlan_transfer_features(struct net_device *dev,
vlandev->features &= ~dev->vlan_features;
vlandev->features |= dev->features & dev->vlan_features;
vlandev->gso_max_size = dev->gso_max_size;
+
+ if (dev->features & NETIF_F_HW_VLAN_TX)
+ vlandev->hard_header_len = dev->hard_header_len;
+ else
+ vlandev->hard_header_len = dev->hard_header_len + VLAN_HLEN;
+
#if defined(CONFIG_FCOE) || defined(CONFIG_FCOE_MODULE)
vlandev->fcoe_ddp_xid = dev->fcoe_ddp_xid;
#endif
>> ---
>>
>> net/8021q/vlan.c | 10 ++++++++++
>> 1 files changed, 10 insertions(+), 0 deletions(-)
>>
>> diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
>> index 05b867e..825011b 100644
>> --- a/net/8021q/vlan.c
>> +++ b/net/8021q/vlan.c
>> @@ -334,6 +334,16 @@ static void vlan_transfer_features(struct net_device *dev,
>> vlandev->features &= ~dev->vlan_features;
>> vlandev->features |= dev->features & dev->vlan_features;
>> vlandev->gso_max_size = dev->gso_max_size;
>> +
>> + /* is ETH_FLAGS_TXVLAN being toggled */
>> + if ((vlandev->features & ETH_FLAG_TXVLAN) ^
>> + (old_features & ETH_FLAG_TXVLAN)) {
>> + if (vlandev->features & ETH_FLAG_TXVLAN)
>> + vlandev->hard_header_len -= VLAN_HLEN;
>> + else
>> + vlandev->hard_header_len += VLAN_HLEN;
>> + }
>
> The correct flag for dev->features is NETIF_F_HW_VLAN_TX.
> ETH_FLAGS_TXVLAN is an ethtool construct (that happens to have the
> same value).
>
> Thanks.
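The difference between the two hunks in this message can be illustrated outside the kernel. The quoted v2 hunk adjusts hard_header_len with += and -= when the flag toggles, so it depends on knowing the previous state; the hunk in the reply recomputes the value from the real device every time. What follows is a minimal userspace sketch of the recompute approach, not kernel code. The struct, the function name and both constants are hypothetical stand-ins chosen for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the sketch only; they are not taken from kernel headers. */
#define SKETCH_VLAN_HLEN    4u          /* length of an 802.1Q tag in bytes */
#define SKETCH_F_HW_VLAN_TX (1u << 0)   /* hypothetical "TX VLAN offload" bit */

struct sketch_dev {
    unsigned int features;
    unsigned int hard_header_len;
};

/*
 * Recompute the VLAN device's header length from the real device.
 * Because nothing is added to or subtracted from the old value, calling
 * this any number of times gives the same result for the same input.
 */
static void sketch_transfer_header_len(const struct sketch_dev *real,
                                       struct sketch_dev *vlan)
{
    assert(real != NULL && vlan != NULL);

    if (real->features & SKETCH_F_HW_VLAN_TX)
        vlan->hard_header_len = real->hard_header_len;
    else
        vlan->hard_header_len = real->hard_header_len + SKETCH_VLAN_HLEN;
}
```

With a real device whose header length is 14, the sketch yields 14 for the VLAN device while offload is on and 18 while it is off, regardless of how often the function runs or in which order the flag was changed.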
* [RFC PATCH 1/1] vhost: TX used buffer guest signal accumulation
From: Shirley Ma @ 2010-10-27 21:58 UTC
To: mst@redhat.com; +Cc: David Miller, netdev, kvm, linux-kernel
This patch changes vhost TX used-buffer guest signaling from one signal
per buffer to one signal per 3/4 of the vring size. The change improves
both the bandwidth and the CPU utilization of vhost TX for message sizes
from 256 to 8K, without introducing any regression.
Signed-off-by: Shirley Ma <xma@us.ibm.com>
---
drivers/vhost/net.c | 20 +++++++++++++++++++-
drivers/vhost/vhost.c | 31 +++++++++++++++++++++++++++++++
drivers/vhost/vhost.h | 3 +++
3 files changed, 53 insertions(+), 1 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 4b4da5b..45e07cd 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -128,6 +128,7 @@ static void handle_tx(struct vhost_net *net)
int err, wmem;
size_t hdr_size;
struct socket *sock;
+ int max_pend = vq->num - (vq->num >> 2);
sock = rcu_dereference_check(vq->private_data,
lockdep_is_held(&vq->mutex));
@@ -198,7 +199,24 @@ static void handle_tx(struct vhost_net *net)
if (err != len)
pr_debug("Truncated TX packet: "
" len %d != %zd\n", err, len);
- vhost_add_used_and_signal(&net->dev, vq, head, 0);
+ /*
+ * If no pending-buffer array is allocated, signal used buffers
+ * one by one; otherwise, signal used buffers on reaching
+ * 3/4 of the ring size to reduce CPU utilization.
+ */
+ if (unlikely(!vq->pend))
+ vhost_add_used_and_signal(&net->dev, vq, head, 0);
+ else {
+ vq->pend[vq->num_pend].id = head;
+ vq->pend[vq->num_pend].len = 0;
+ ++vq->num_pend;
+ if (vq->num_pend == max_pend) {
+ vhost_add_used_and_signal_n(&net->dev, vq,
+ vq->pend,
+ vq->num_pend);
+ vq->num_pend = 0;
+ }
+ }
total_len += len;
if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
vhost_poll_queue(&vq->poll);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94701ff..9486a25 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -170,6 +170,16 @@ static void vhost_vq_reset(struct vhost_dev *dev,
vq->call_ctx = NULL;
vq->call = NULL;
vq->log_ctx = NULL;
+ /* signal pending used buffers */
+ if (vq->pend) {
+ if (vq->num_pend != 0) {
+ vhost_add_used_and_signal_n(dev, vq, vq->pend,
+ vq->num_pend);
+ vq->num_pend = 0;
+ }
+ kfree(vq->pend);
+ }
+ vq->pend = NULL;
}
static int vhost_worker(void *data)
@@ -273,7 +283,13 @@ long vhost_dev_init(struct vhost_dev *dev,
dev->vqs[i].heads = NULL;
dev->vqs[i].dev = dev;
mutex_init(&dev->vqs[i].mutex);
+ dev->vqs[i].num_pend = 0;
+ dev->vqs[i].pend = NULL;
vhost_vq_reset(dev, dev->vqs + i);
+ /* signal 3/4 of ring size used buffers */
+ dev->vqs[i].pend = kmalloc((dev->vqs[i].num -
+ (dev->vqs[i].num >> 2)) *
+ sizeof *vq->pend, GFP_KERNEL);
if (dev->vqs[i].handle_kick)
vhost_poll_init(&dev->vqs[i].poll,
dev->vqs[i].handle_kick, POLLIN, dev);
@@ -599,6 +615,21 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, void __user *argp)
r = -EINVAL;
break;
}
+ if (vq->num != s.num) {
+ /* signal used buffers first */
+ if (vq->pend) {
+ if (vq->num_pend != 0) {
+ vhost_add_used_and_signal_n(vq->dev, vq,
+ vq->pend,
+ vq->num_pend);
+ vq->num_pend = 0;
+ }
+ kfree(vq->pend);
+ }
+ /* realloc pending used buffers size */
+ vq->pend = kmalloc((s.num - (s.num >> 2)) *
+ sizeof *vq->pend, GFP_KERNEL);
+ }
vq->num = s.num;
break;
case VHOST_SET_VRING_BASE:
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 073d06a..78949c0 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -108,6 +108,9 @@ struct vhost_virtqueue {
/* Log write descriptors */
void __user *log_base;
struct vhost_log *log;
+ /* delay multiple used buffers to signal once */
+ int num_pend;
+ struct vring_used_elem *pend;
};
struct vhost_dev {
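The accumulation scheme described in this patch can be sketched in userspace as follows. This is an illustration under stated assumptions, not vhost code: every sketch_ name is hypothetical, a counter stands in for the guest notification, and malloc() stands in for kmalloc(). Completed heads are buffered until 3/4 of the ring size is reached and then reported with a single signal; when no buffer array is available, each completion is signaled individually.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_used_elem {
    unsigned int id;
    unsigned int len;
};

struct sketch_vq {
    unsigned int num;               /* ring size */
    unsigned int num_pend;          /* completions currently buffered */
    struct sketch_used_elem *pend;  /* NULL when batching is unavailable */
    unsigned int signals;           /* stand-in for guest notifications */
    unsigned int completed;         /* buffers reported as used so far */
};

/* 3/4 of the ring size, computed the same way as in the patch. */
static unsigned int sketch_max_pend(unsigned int num)
{
    return num - (num >> 2);
}

/* Returns 0 on success, -1 if the pending array could not be allocated. */
static int sketch_vq_init(struct sketch_vq *vq, unsigned int num)
{
    vq->num = num;
    vq->num_pend = 0;
    vq->signals = 0;
    vq->completed = 0;
    vq->pend = NULL;
    if (sketch_max_pend(num) > 0)
        vq->pend = malloc(sketch_max_pend(num) * sizeof(*vq->pend));
    return vq->pend ? 0 : -1;
}

/* Report everything buffered so far with one signal. */
static void sketch_flush(struct sketch_vq *vq)
{
    if (vq->num_pend == 0)
        return;
    vq->completed += vq->num_pend;
    vq->signals++;
    vq->num_pend = 0;
}

static void sketch_complete_tx(struct sketch_vq *vq, unsigned int head)
{
    if (!vq->pend) {
        /* No array: fall back to one signal per buffer. */
        vq->completed++;
        vq->signals++;
        return;
    }
    assert(vq->num_pend < sketch_max_pend(vq->num));
    vq->pend[vq->num_pend].id = head;
    vq->pend[vq->num_pend].len = 0;
    if (++vq->num_pend == sketch_max_pend(vq->num))
        sketch_flush(vq);
}

/* Flush what is left before freeing, as the reset path in the patch does. */
static void sketch_vq_destroy(struct sketch_vq *vq)
{
    sketch_flush(vq);
    free(vq->pend);
    vq->pend = NULL;
}
```

For a ring of 256 entries the threshold is 192, so 192 completions cost one signal instead of 192. The sketch also shows why the reset and resize paths need to flush: completions below the threshold would otherwise never be reported.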
* IPV6 raw socket denies bind(2)
From: Jan Engelhardt @ 2010-10-27 22:01 UTC
To: netdev; +Cc: David S. Miller, Eric Dumazet
[Doublepost: linux-netdev does not exist. Sigh. "netdev" is totally
nonstandard to all the other linux-* list names. :-?]
Hi,
I was trying out raw sockets and stumbled into a case whereby I cannot
call bind(2) on an AF_INET6, SOCK_RAW socket. Apparently, I am triggering
this particular code path that means absolutely nothing to me:
# ./rawtest
kernel pseudo callgraph:
static int rawv6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len)
addr_len >= SIN6_LEN_RFC2133
addr_type != IPV6_ADDR_MAPPED
sk->sk_state == TCP_CLOSED
addr_type != IPV6_ADDR_ANY
!(addr_type & IPV6_ADDR_LINKLOCAL)
!(addr_type & IPV6_ADDR_MULTICAST)
dev == NULL
ipv6_chk_addr does not have any addresses to loop through (wtf? checked
with printk.)
=> going -EADDRNOTAVAIL
At this point I have no idea why ipv6_chk_addr does not run through
its loop. No devices in the hash bucket or something. Happens in
2.6.36. I hope somebody can shed some light here.
---Userspace testcase---
#include <sys/socket.h>
#include <stdio.h>
#include <string.h>
#include <netinet/udp.h>
#include <netinet/in.h>
#include <netinet/ip6.h>
#include <arpa/inet.h>
#include <stdlib.h>
int main(void)
{
struct sockaddr_in6 src = {};
int sk;
sk = socket(AF_INET6, SOCK_RAW, IPPROTO_UDP);
memset(&src, 0, sizeof(src));
inet_pton(AF_INET6, "::1", &src);
src.sin6_family = AF_INET6;
if (bind(sk, (void *)&src, sizeof(src)) < 0) {
perror("bind");
abort();
}
return 0;
}
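An observation that is not part of the original message: the testcase passes &src, the start of the whole struct sockaddr_in6, as the destination of inet_pton(). inet_pton(AF_INET6, ...) writes a 16-byte struct in6_addr, so the address bytes land on the family, port and flow-info fields and only partly in sin6_addr. With the usual layout, where sin6_addr starts at offset 8, the address handed to bind(2) is then not ::1, which would be consistent with -EADDRNOTAVAIL. The sketch below compares the two ways of filling the structure without opening a socket; both function names are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Fill sa for the textual IPv6 address; returns 0 on success, -1 otherwise. */
static int sketch_fill_sockaddr6(struct sockaddr_in6 *sa, const char *text)
{
    assert(sa != NULL && text != NULL);

    memset(sa, 0, sizeof(*sa));
    sa->sin6_family = AF_INET6;
    /* The destination must be the in6_addr member, not the whole struct. */
    return inet_pton(AF_INET6, text, &sa->sin6_addr) == 1 ? 0 : -1;
}

/* Same steps as the testcase above: inet_pton() targets the struct itself. */
static int sketch_fill_as_posted(struct sockaddr_in6 *sa, const char *text)
{
    assert(sa != NULL && text != NULL);

    memset(sa, 0, sizeof(*sa));
    if (inet_pton(AF_INET6, text, sa) != 1)   /* writes 16 bytes at offset 0 */
        return -1;
    sa->sin6_family = AF_INET6;
    return 0;
}
```

Checking the results with IN6_IS_ADDR_LOOPBACK(&sa.sin6_addr) shows the difference: the first variant yields the loopback address for "::1", the second does not on a layout where sin6_addr sits at offset 8.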
* [GIT] Networking
From: David Miller @ 2010-10-27 22:05 UTC
To: torvalds; +Cc: akpm, netdev, linux-kernel
Hey, I'll resolve the verify_iovec() issue this evening so that we can
wrap that sucker up. But for now here's some fallout fixing changes
as well as some other misc stuff:
1) dev_can_checksum() doesn't handle nested VLAN properly, also
generic checksum capability does not imply FCOE checksumming.
Both from Ben Hutchings.
2) typhoon driver fails to wait for RX mode commands to finish,
also use new VLAN accel interfaces. From David Dillow.
3) Tunnel transmit recursion limit too low, increase to 10.
4) Fix tms380tr build failure on x86-64 due to too large udelay().
5) ipv6 TPROXY needs to check CAP_NET_ADMIN just like ipv4, from
Balazs Scheidler.
6) Add caif-u5500 driver, from Amarnath Revanna.
7) Add tscan1 CAN driver, from Andre B. Oliveira.
8) Missed MTU updates in ip6_tunnel, from Anders Franzen.
9) Lots of missing __rcu annotations, from Eric Dumazet.
10) Fix bonding lockdep spew, from Jarek Poplawski.
11) Fix slhc double-export of symbol, from Denis Kirjanov.
12) cxgb4 too-early-queue access crash fix from Dimitris Michailidis,
also use new VLAN accel interfaces.
13) mlx4_en out-of-bounds array access fix from Eli Cohen.
14) IPv6 temporary address handling fixes from Glenn Wurster.
15) Fix RX crashes in gianfar, from Jarek Poplawski.
16) Fix ipv6 defrag dependencies with ip6tables and tproxy, from KOVACS Krisztian.
17) Missing CONFIG_SYSCTL checks in ipv6 reasm netfilter code.
18) Tunnel RCU protection fix from Pavel Emelyanov.
19) be2net bug fixes from Somnath Kotur (calling netif_carrier_off() too
early, UDP packet handling, and worker thread destruction and
scheduling bugs).
20) qlcnic can use invalid VLAN ids, from Sony Chacko.
21) Accidental exporting of static functions in l2tp, from Stephen Rothwell.
22) Toss lazy workqueue from connector, from Tejun Heo.
23) Final function staticization round from Stephen Hemminger.
Please pull, thanks a lot!
The following changes since commit 12ba8d1e9262ce81a695795410bd9ee5c9407ba1:
fix braino in fs: do not assign default i_ino in new_inode (2010-10-26 20:25:45 -0700)
are available in the git repository at:
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master
Amarnath Revanna (3):
caif-u5500: Adding shared memory include
caif-u5500: CAIF shared memory mailbox interface
caif-u5500: Build config for CAIF shared mem driver
Anders Franzen (1):
ip6_tunnel dont update the mtu on the route.
Andre B. Oliveira (1):
can: tscan1: add driver for TS-CAN1 boards
Balazs Scheidler (1):
tproxy: Add missing CAP_NET_ADMIN check to ipv6 side
Ben Greear (2):
ath9k: Properly initialize ath_common->cc_lock.
ath5k: Properly initialize ath_common->cc_lock.
Ben Hutchings (2):
net: Fix some corner cases in dev_can_checksum()
net: NETIF_F_HW_CSUM does not imply FCoE CRC offload
Breno Leitao (1):
ehea: Fixing statistics
Christian Lamparter (4):
carl9170: fix async command buffer leak
mac80211: don't sanitize invalid rates
carl9170: fix memory leak issue in async cmd macro wrappers
carl9170: fix scheduling while atomic
David Dillow (2):
typhoon: wait for RX mode commands to finish
typhoon: update to new VLAN acceleration model
David S. Miller (4):
net: Increase xmit RECURSION_LIMIT to 10.
tms380tr: Use mdelay() in tms380tr_wait().
netfilter: Add missing CONFIG_SYSCTL checks in ipv6's nf_conntrack_reasm.c
Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6
Denis Kirjanov (1):
slhc: Don't export symbols twice
Dimitris Michailidis (2):
cxgb4: fix crash due to manipulating queues before registration
cxgb4: update to utilize the newer VLAN infrastructure
Divy Le Ray (1):
cxgb3: fix device opening error path
Don Fry (1):
iwlwifi: quiet a noisy printk
Eli Cohen (1):
mlx4_en: Fix out of bounds array access
Eric Dumazet (17):
netlink: fix netlink_change_ngroups()
vlan: rcu annotations
ipv6: ip6_ptr rcu annotations
net/802: add __rcu annotations
tunnels: add _rcu annotations
rps: add __rcu annotations
net_ns: add __rcu annotations
net: add __rcu annotation to sk_filter
ipv4: add __rcu annotations to ip_ra_chain
fib: fix fib_nl_newrule()
fib_hash: fix rcu sparse and logical errors
ipv4: add __rcu annotations to routes.c
net: add __rcu annotations to protocol
tunnels: add __rcu annotations
fib_rules: __rcu annotates ctarget
inetpeer: __rcu annotations
ehea: fix use after free
Felix Fietkau (3):
ath9k: fix crash in ath_update_survey_stats
ath9k: fix handling of rate control probe frames
ath9k: resume aggregation immediately after a hardware reset
Glenn Wurster (2):
IPv6: Create temporary address if none exists.
IPv6: Temp addresses are immediately deleted.
Grazvydas Ignotas (1):
wl1251: fix module names
Guo-Fu Tseng (1):
jme: Support WoL after shutdown
Harvey Harrison (3):
vmxnet3: remove set_flag_le{16,64} helpers
vmxnet3: annotate hwaddr members as __iomem pointers
vmxnet3: fix typo setting confPA
Jarek Poplawski (2):
gianfar: Fix crashes on RX path (Was Re: [Bugme-new] [Bug 19692] New: linux-2.6.36-rc5 crash with gianfar ethernet at full line rate traffic)
bonding: Fix lockdep warning after bond_vlan_rx_register()
Joe Perches (1):
drivers/atm/eni.c: Remove multiple uses of KERN_<level>
Joshua Hoke (1):
macb: Don't re-enable interrupts while in polling mode
Julia Lawall (3):
drivers/net/sb1000.c: delete double assignment
drivers/net/typhoon.c: delete double assignment
drivers/isdn: delete double assignment
KOVACS Krisztian (1):
netfilter: fix module dependency issues with IPv6 defragmentation, ip6tables and xt_TPROXY
Luis R. Rodriguez (2):
cfg80211: fix regression on processing country IEs
ath9k_hw: Fix TX carrier leakage for IEEE compliance on AR9003 2.2
Marc Kleine-Budde (12):
can: at91_can: use correct bit to enable CAN_CTRLMODE_3_SAMPLES
can: at91_can: fix reception of extended frames
can: at91_can: fix use after free of priv
can: at91_can: fix compiler warning in at91_irq_err_state
can: at91_can: fix section mismatch warning
can: at91_can: implement and use at91_get_berr_counter
can: at91_can: set bittiming in chip_start
can: at91_can: convert readl, writel their __raw pendants
can: at91_can: convert dev_<level> printing to netdev_<level>
can: at91_can: add KBUILD_MODNAME to bittiming constant
can: flexcan: fix use after free of priv
can: mcp251x: fix reception of standard RTR frames
Masayuki Ohtake (1):
can: Topcliff: Add PCH_CAN driver.
Nicolas Kaiser (1):
drivers/net: sgiseeq: fix return on error
Paul Gortmaker (1):
pktgen: clean up handling of local/transient counter vars
Pavel Emelyanov (1):
tunnels: Fix tunnels change rcu protection
Rafael J. Wysocki (1):
tg3: Do not call device_set_wakeup_enable() under spin_lock_bh
Rafał Miłecki (1):
b43: N-PHY: fix infinite-loop-typo
Rajkumar Manoharan (1):
mac80211: Fix ibss station got expired immediately
Randy Dunlap (1):
pch_can: depends on PCI
Ron Mercer (1):
qlge: bugfix: Restoring the vlan setting.
Senthil Balasubramanian (1):
ath9k_hw: Fix divide by zero cases in paprd.
Somnath Kotur (3):
be2net: Call netif_carier_off() after register_netdev()
be2net: Fix CSO for UDP packets
be2net: Schedule/Destroy worker thread in probe()/remove() rather than open()/close()
Sony Chacko (2):
qlcnic: reduce rx ring size
qlcnic: define valid vlan id range
Stephen Rothwell (1):
l2tp: static functions should not be exported
Tejun Heo (2):
connector: remove lazy workqueue creation
mac80211: cancel restart_work explicitly instead of depending on flush_scheduled_work()
Ursula Braun (1):
ipv6: fix refcnt problem related to POSTDAD state
amit salecha (1):
qlcnic: fix mac learning
sjur.brandeland@stericsson.com (1):
caif-u5500: CAIF shared memory transport protocol
stephen hemminger (12):
mlx4: make functions local and remove dead code.
l2tp: make local function static
benet: remove dead code
benet: make be_poll_rx local
atl1c: make functions static
atlx: make local functions/data static
phylib: make local function static
vxge: make functions local and remove dead code
qlge: make local functions static
qlge: disable unsed dump code
bnx2x: make local function static and remove dead code
e1000: make e1000_reinit_safe local
Documentation/networking/phy.txt | 18 -
drivers/atm/eni.c | 7 +-
drivers/connector/cn_queue.c | 75 +-
drivers/connector/connector.c | 9 +-
drivers/isdn/hardware/mISDN/mISDNinfineon.c | 2 +-
drivers/isdn/hisax/l3_1tr6.c | 6 +-
drivers/net/atl1c/atl1c.h | 2 -
drivers/net/atl1c/atl1c_main.c | 6 +-
drivers/net/atlx/atl1.c | 12 +-
drivers/net/atlx/atl1.h | 9 +-
drivers/net/atlx/atlx.c | 4 +
drivers/net/benet/be_cmds.c | 36 -
drivers/net/benet/be_cmds.h | 2 -
drivers/net/benet/be_main.c | 49 +-
drivers/net/bnx2x/bnx2x.h | 5 -
drivers/net/bnx2x/bnx2x_cmn.c | 3 +-
drivers/net/bnx2x/bnx2x_cmn.h | 55 -
drivers/net/bnx2x/bnx2x_init_ops.h | 34 +-
drivers/net/bnx2x/bnx2x_link.c | 137 +--
drivers/net/bnx2x/bnx2x_link.h | 15 -
drivers/net/bnx2x/bnx2x_main.c | 55 +-
drivers/net/bonding/bond_main.c | 4 +-
drivers/net/caif/Kconfig | 7 +
drivers/net/caif/Makefile | 4 +
drivers/net/caif/caif_shm_u5500.c | 129 ++
drivers/net/caif/caif_shmcore.c | 744 ++++++++++
drivers/net/can/Kconfig | 8 +
drivers/net/can/Makefile | 1 +
drivers/net/can/at91_can.c | 95 +-
drivers/net/can/flexcan.c | 3 +-
drivers/net/can/mcp251x.c | 3 +
drivers/net/can/pch_can.c | 1463 ++++++++++++++++++++
drivers/net/can/sja1000/Kconfig | 12 +
drivers/net/can/sja1000/Makefile | 1 +
drivers/net/can/sja1000/tscan1.c | 216 +++
drivers/net/cxgb3/cxgb3_main.c | 8 +-
drivers/net/cxgb4/cxgb4.h | 1 -
drivers/net/cxgb4/cxgb4_main.c | 33 +-
drivers/net/cxgb4/sge.c | 23 +-
drivers/net/e1000/e1000_main.c | 2 +-
drivers/net/ehea/ehea.h | 2 +
drivers/net/ehea/ehea_main.c | 42 +-
drivers/net/gianfar.c | 6 +-
drivers/net/jme.c | 45 +-
drivers/net/macb.c | 27 +-
drivers/net/mlx4/icm.c | 28 +-
drivers/net/mlx4/icm.h | 2 -
drivers/net/mlx4/port.c | 11 +
drivers/net/phy/phy.c | 13 +-
drivers/net/phy/phy_device.c | 19 +-
drivers/net/qlcnic/qlcnic.h | 7 +-
drivers/net/qlcnic/qlcnic_ethtool.c | 23 +-
drivers/net/qlcnic/qlcnic_main.c | 19 +-
drivers/net/qlge/qlge.h | 12 +-
drivers/net/qlge/qlge_main.c | 24 +-
drivers/net/qlge/qlge_mpi.c | 6 +-
drivers/net/sb1000.c | 6 +-
drivers/net/sgiseeq.c | 2 +-
drivers/net/slhc.c | 15 +-
drivers/net/tg3.c | 10 +-
drivers/net/tokenring/tms380tr.c | 2 +-
drivers/net/typhoon.c | 92 +-
drivers/net/vmxnet3/upt1_defs.h | 8 +-
drivers/net/vmxnet3/vmxnet3_defs.h | 6 +-
drivers/net/vmxnet3/vmxnet3_drv.c | 22 +-
drivers/net/vmxnet3/vmxnet3_ethtool.c | 14 +-
drivers/net/vmxnet3/vmxnet3_int.h | 19 +-
drivers/net/vxge/vxge-config.c | 332 ++++--
drivers/net/vxge/vxge-config.h | 227 +---
drivers/net/vxge/vxge-ethtool.c | 2 +-
drivers/net/vxge/vxge-main.c | 64 +-
drivers/net/vxge/vxge-main.h | 59 +-
drivers/net/vxge/vxge-traffic.c | 101 +--
drivers/net/vxge/vxge-traffic.h | 134 --
drivers/net/wireless/ath/ath5k/base.c | 1 +
.../net/wireless/ath/ath9k/ar9003_2p2_initvals.h | 191 ++-
drivers/net/wireless/ath/ath9k/ar9003_paprd.c | 14 +-
drivers/net/wireless/ath/ath9k/beacon.c | 2 +-
drivers/net/wireless/ath/ath9k/init.c | 1 +
drivers/net/wireless/ath/ath9k/main.c | 7 +-
drivers/net/wireless/ath/ath9k/xmit.c | 8 +-
drivers/net/wireless/ath/carl9170/cmd.h | 51 +-
drivers/net/wireless/ath/carl9170/main.c | 2 +-
drivers/net/wireless/ath/carl9170/usb.c | 25 +-
drivers/net/wireless/b43/phy_n.c | 2 +-
drivers/net/wireless/iwlwifi/iwl-agn-tx.c | 3 +-
drivers/net/wireless/wl1251/Makefile | 8 +-
include/linux/connector.h | 8 -
include/linux/netdevice.h | 18 +-
include/linux/phy.h | 12 -
include/net/caif/caif_shm.h | 26 +
include/net/dst.h | 2 +-
include/net/fib_rules.h | 2 +-
include/net/garp.h | 2 +-
include/net/inetpeer.h | 2 +-
include/net/ip.h | 4 +-
include/net/ip6_tunnel.h | 2 +-
include/net/ipip.h | 6 +-
include/net/net_namespace.h | 2 +-
include/net/protocol.h | 4 +-
include/net/sock.h | 2 +-
include/net/xfrm.h | 4 +-
net/802/garp.c | 18 +-
net/802/stp.c | 4 +-
net/8021q/vlan.c | 6 +-
net/core/dev.c | 38 +-
net/core/fib_rules.c | 21 +-
net/core/filter.c | 4 +-
net/core/net-sysfs.c | 20 +-
net/core/net_namespace.c | 4 +-
net/core/pktgen.c | 30 +-
net/core/sock.c | 2 +-
net/core/sysctl_net_core.c | 3 +-
net/ipv4/fib_hash.c | 36 +-
net/ipv4/gre.c | 5 +-
net/ipv4/inetpeer.c | 138 ++-
net/ipv4/ip_gre.c | 1 +
net/ipv4/ip_sockglue.c | 10 +-
net/ipv4/ipip.c | 1 +
net/ipv4/protocol.c | 8 +-
net/ipv4/route.c | 75 +-
net/ipv4/tunnel4.c | 29 +-
net/ipv4/udp.c | 2 +-
net/ipv6/addrconf.c | 16 +-
net/ipv6/ip6_tunnel.c | 2 +
net/ipv6/ipv6_sockglue.c | 4 +
net/ipv6/netfilter/Kconfig | 5 +
net/ipv6/netfilter/Makefile | 5 +-
net/ipv6/netfilter/nf_conntrack_reasm.c | 5 +-
net/ipv6/protocol.c | 8 +-
net/ipv6/raw.c | 2 +-
net/ipv6/sit.c | 1 +
net/ipv6/tunnel6.c | 24 +-
net/ipv6/udp.c | 2 +-
net/l2tp/l2tp_core.c | 53 +-
net/l2tp/l2tp_core.h | 33 -
net/l2tp/l2tp_ip.c | 2 +-
net/mac80211/ibss.c | 1 +
net/mac80211/main.c | 8 +-
net/mac80211/rate.c | 3 +
net/netfilter/Kconfig | 2 +
net/netfilter/xt_TPROXY.c | 10 +-
net/netfilter/xt_socket.c | 12 +-
net/netlink/af_netlink.c | 65 +-
net/wireless/reg.c | 2 +-
145 files changed, 4008 insertions(+), 1822 deletions(-)
create mode 100644 drivers/net/caif/caif_shm_u5500.c
create mode 100644 drivers/net/caif/caif_shmcore.c
create mode 100644 drivers/net/can/pch_can.c
create mode 100644 drivers/net/can/sja1000/tscan1.c
create mode 100644 include/net/caif/caif_shm.h
* [patch] fix stack overflow in pktgen_if_write()
From: Dan Carpenter @ 2010-10-27 22:12 UTC
To: nelhage
Cc: Eric Dumazet, David S. Miller, Robert Olsson, Andy Shevchenko,
netdev
In-Reply-To: <1288206788-21063-1-git-send-email-nelhage@ksplice.com>
Nelson Elhage says he was able to oops both amd64 and i386 test
machines with 8k writes to the pktgen file. Let's just allocate the
buffer on the heap instead of on the stack.
This can only be triggered by root so there are no security issues here.
Reported-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Dan Carpenter <error27@gmail.com>
---
I saw this on twitter. Hi Nelson, could you test this?
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 2c0df0f..b5d3c70 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -887,12 +887,14 @@ static ssize_t pktgen_if_write(struct file *file,
i += len;
if (debug) {
- char tb[count + 1];
- if (copy_from_user(tb, user_buffer, count))
- return -EFAULT;
- tb[count] = 0;
+ char *tb;
+
+ tb = strndup_user(user_buffer, count + 1);
+ if (IS_ERR(tb))
+ return PTR_ERR(tb);
printk(KERN_DEBUG "pktgen: %s,%lu buffer -:%s:-\n", name,
(unsigned long)count, tb);
+ kfree(tb);
}
if (!strcmp(name, "min_pkt_size")) {
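The idea behind this patch, replacing a stack array whose size is chosen by the caller with a heap allocation, can be shown with a userspace analogue. This is only a sketch: malloc() and memcpy() stand in for the kernel helpers, the function name is hypothetical, and the size cap is an assumption added for the example rather than something the patch defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_DEBUG_COPY 4096u   /* hypothetical cap, not from the patch */

/*
 * Return a NUL-terminated heap copy of the first count bytes of buf,
 * or NULL if the request is too large or memory is exhausted.
 * The caller frees the result.
 */
static char *sketch_dup_bounded(const char *buf, size_t count)
{
    char *copy;

    assert(buf != NULL);

    if (count > SKETCH_MAX_DEBUG_COPY)
        return NULL;                  /* refuse oversized requests */
    copy = malloc(count + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, buf, count);
    copy[count] = '\0';
    return copy;
}
```

Unlike `char tb[count + 1]`, an oversized request here ends in an error return that the caller can handle, instead of running off the end of a fixed-size stack.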
* RE: [PATCH net-next] ixgbe: fix stats handling
From: Tantilov, Emil S @ 2010-10-27 22:35 UTC
To: Eric Dumazet, David Miller, Waskiewicz Jr, Peter P,
Kirsher, Jeffrey T
Cc: netdev
In-Reply-To: <1286799439.2737.21.camel@edumazet-laptop>
>-----Original Message-----
>From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
>Sent: Monday, October 11, 2010 5:17 AM
>To: David Miller; Waskiewicz Jr, Peter P; Tantilov, Emil S; Kirsher,
>Jeffrey T
>Cc: netdev
>Subject: [PATCH net-next] ixgbe: fix stats handling
>
>Hi
>
>I am sending this patch for the Intel people to review, test and acknowledge.
>
>Thanks !
>
>[PATCH net-next] ixgbe: fix stats handling
>
>Current ixgbe stats have the following problems:
>
>- Not 64-bit safe (on 32-bit arches)
>
>- Not safe in ixgbe_clean_rx_irq():
>  All CPUs dirty a common location (netdev->stats.rx_bytes and
>  netdev->stats.rx_packets) without proper synchronization.
>  This slows down multiqueue operation a bit and can miss some
>  updates.
>
>Fixes :
>
>Implement ndo_get_stats64() method to provide accurate 64bit rx|tx
>bytes/packets counters, using 64bit safe infrastructure.
>
>ixgbe_get_ethtool_stats() also uses this infrastructure to provide 64-bit
>safe counters.
>
>Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>CC: Peter Waskiewicz <peter.p.waskiewicz.jr@intel.com>
>CC: Emil Tantilov <emil.s.tantilov@intel.com>
>CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>---
> drivers/net/ixgbe/ixgbe.h | 3 +-
> drivers/net/ixgbe/ixgbe_ethtool.c | 29 +++++++++++---------
> drivers/net/ixgbe/ixgbe_main.c | 40 +++++++++++++++++++++++++---
> 3 files changed, 56 insertions(+), 16 deletions(-)
>
>diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
>index a8c47b0..944d9e2 100644
>--- a/drivers/net/ixgbe/ixgbe.h
>+++ b/drivers/net/ixgbe/ixgbe.h
>@@ -180,8 +180,9 @@ struct ixgbe_ring {
> */
>
> struct ixgbe_queue_stats stats;
>- unsigned long reinit_state;
>+ struct u64_stats_sync syncp;
> int numa_node;
>+ unsigned long reinit_state;
> u64 rsc_count; /* stat for coalesced packets */
> u64 rsc_flush; /* stats for flushed packets */
> u32 restart_queue; /* track tx queue restarts */
>diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c
>b/drivers/net/ixgbe/ixgbe_ethtool.c
>index d4ac943..3c7f15d 100644
>--- a/drivers/net/ixgbe/ixgbe_ethtool.c
>+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
>@@ -999,12 +999,11 @@ static void ixgbe_get_ethtool_stats(struct net_device
>*netdev,
> struct ethtool_stats *stats, u64
>*data)
> {
> struct ixgbe_adapter *adapter = netdev_priv(netdev);
>- u64 *queue_stat;
>- int stat_count = sizeof(struct ixgbe_queue_stats) / sizeof(u64);
> struct rtnl_link_stats64 temp;
> const struct rtnl_link_stats64 *net_stats;
>- int j, k;
>- int i;
>+ unsigned int start;
>+ struct ixgbe_ring *ring;
>+ int i, j;
> char *p = NULL;
>
> ixgbe_update_stats(adapter);
>@@ -1025,16 +1024,22 @@ static void ixgbe_get_ethtool_stats(struct
>net_device *netdev,
> sizeof(u64)) ? *(u64 *)p : *(u32 *)p;
> }
> for (j = 0; j < adapter->num_tx_queues; j++) {
>- queue_stat = (u64 *)&adapter->tx_ring[j]->stats;
>- for (k = 0; k < stat_count; k++)
>- data[i + k] = queue_stat[k];
>- i += k;
>+ ring = adapter->tx_ring[j];
>+ do {
>+ start = u64_stats_fetch_begin_bh(&ring->syncp);
>+ data[i] = ring->stats.packets;
>+ data[i+1] = ring->stats.bytes;
>+ } while (u64_stats_fetch_retry_bh(&ring->syncp, start));
>+ i += 2;
> }
> for (j = 0; j < adapter->num_rx_queues; j++) {
>- queue_stat = (u64 *)&adapter->rx_ring[j]->stats;
>- for (k = 0; k < stat_count; k++)
>- data[i + k] = queue_stat[k];
>- i += k;
>+ ring = adapter->rx_ring[j];
>+ do {
>+ start = u64_stats_fetch_begin_bh(&ring->syncp);
>+ data[i] = ring->stats.packets;
>+ data[i+1] = ring->stats.bytes;
>+ } while (u64_stats_fetch_retry_bh(&ring->syncp, start));
>+ i += 2;
> }
> if (adapter->flags & IXGBE_FLAG_DCB_ENABLED) {
> for (j = 0; j < MAX_TX_PACKET_BUFFERS; j++) {
>diff --git a/drivers/net/ixgbe/ixgbe_main.c
>b/drivers/net/ixgbe/ixgbe_main.c
>index 95dbf60..1efbcde 100644
>--- a/drivers/net/ixgbe/ixgbe_main.c
>+++ b/drivers/net/ixgbe/ixgbe_main.c
>@@ -824,8 +824,10 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_q_vector
>*q_vector,
>
> tx_ring->total_bytes += total_bytes;
> tx_ring->total_packets += total_packets;
>+ u64_stats_update_begin(&tx_ring->syncp);
> tx_ring->stats.packets += total_packets;
> tx_ring->stats.bytes += total_bytes;
>+ u64_stats_update_end(&tx_ring->syncp);
> return count < tx_ring->work_limit;
> }
>
>@@ -1172,7 +1174,6 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_q_vector
>*q_vector,
> int *work_done, int work_to_do)
> {
> struct ixgbe_adapter *adapter = q_vector->adapter;
>- struct net_device *netdev = adapter->netdev;
> struct pci_dev *pdev = adapter->pdev;
> union ixgbe_adv_rx_desc *rx_desc, *next_rxd;
> struct ixgbe_rx_buffer *rx_buffer_info, *next_buffer;
>@@ -1298,8 +1299,10 @@ static bool ixgbe_clean_rx_irq(struct ixgbe_q_vector
>*q_vector,
> rx_ring->rsc_count++;
> rx_ring->rsc_flush++;
> }
>+ u64_stats_update_begin(&rx_ring->syncp);
> rx_ring->stats.packets++;
> rx_ring->stats.bytes += skb->len;
>+ u64_stats_update_end(&rx_ring->syncp);
> } else {
> if (rx_ring->flags & IXGBE_RING_RX_PS_ENABLED) {
> rx_buffer_info->skb = next_buffer->skb;
>@@ -1375,8 +1378,6 @@ next_desc:
>
> rx_ring->total_packets += total_rx_packets;
> rx_ring->total_bytes += total_rx_bytes;
>- netdev->stats.rx_bytes += total_rx_bytes;
>- netdev->stats.rx_packets += total_rx_packets;
>
> return cleaned;
> }
>@@ -6559,6 +6560,38 @@ static void ixgbe_netpoll(struct net_device *netdev)
> }
> #endif
>
>+static struct rtnl_link_stats64 *ixgbe_get_stats64(struct net_device
>*netdev,
>+ struct rtnl_link_stats64 *stats)
>+{
>+ struct ixgbe_adapter *adapter = netdev_priv(netdev);
>+ int i;
>+
>+ /* accurate rx/tx bytes/packets stats */
>+ dev_txq_stats_fold(netdev, stats);
>+ for (i = 0; i < adapter->num_rx_queues; i++) {
>+ struct ixgbe_ring *ring = adapter->rx_ring[i];
>+ u64 bytes, packets;
>+ unsigned int start;
>+
>+ do {
>+ start = u64_stats_fetch_begin_bh(&ring->syncp);
>+ packets = ring->stats.packets;
>+ bytes = ring->stats.bytes;
>+ } while (u64_stats_fetch_retry_bh(&ring->syncp, start));
>+ stats->rx_packets += packets;
>+ stats->rx_bytes += bytes;
>+ }
>+
>+ /* following stats updated by ixgbe_watchdog_task() */
>+ stats->multicast = netdev->stats.multicast;
>+ stats->rx_errors = netdev->stats.rx_errors;
>+ stats->rx_length_errors = netdev->stats.rx_length_errors;
>+ stats->rx_crc_errors = netdev->stats.rx_crc_errors;
>+ stats->rx_missed_errors = netdev->stats.rx_missed_errors;
>+ return stats;
>+}
>+
>+
> static const struct net_device_ops ixgbe_netdev_ops = {
> .ndo_open = ixgbe_open,
> .ndo_stop = ixgbe_close,
>@@ -6578,6 +6611,7 @@ static const struct net_device_ops ixgbe_netdev_ops =
>{
> .ndo_set_vf_vlan = ixgbe_ndo_set_vf_vlan,
> .ndo_set_vf_tx_rate = ixgbe_ndo_set_vf_bw,
> .ndo_get_vf_config = ixgbe_ndo_get_vf_config,
>+ .ndo_get_stats64 = ixgbe_get_stats64,
> #ifdef CONFIG_NET_POLL_CONTROLLER
> .ndo_poll_controller = ixgbe_netpoll,
> #endif
>
Eric,
We are seeing intermittent hangs on ia32 arch which seem to point to this patch:
BUG: unable to handle kernel NULL pointer dereference at 00000040
IP: [<f7f6b537>] ixgbe_get_stats64+0x47/0x120 [ixgbe]
*pdpt = 000000002dc83001 *pde = 000000032d7e5067
Oops: 0000 [#2] SMP
last sysfs file: /sys/devices/system/cpu/cpu23/cache/index2/shared_cpu_map
Modules linked in: act_skbedit cls_u32 sch_multiq ixgbe mdio netconsole configfs autofs4 8021q garp stp llc sunrpc ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_round_robin dm_multipath power_meter hwmon sg ses enclosure dcdbas pcspkr serio_raw iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core bnx2 ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix megaraid_sas dm_mod [last unloaded: mperf]
Pid: 1939, comm: irqbalance Tainted: G D W 2.6.36-rc7-upstream-net-next-2.6-ixgbe-queue-i386-g55e1a84 #1 09CGW2/PowerEdge T610
EIP: 0060:[<f7f6b537>] EFLAGS: 00010206 CPU: 0
EIP is at ixgbe_get_stats64+0x47/0x120 [ixgbe]
EAX: 00000000 EBX: ecc45e4c ECX: ebea0400 EDX: 00000000
ESI: ebea0000 EDI: 00000018 EBP: f7f846a0 ESP: ecc45d88
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process irqbalance (pid: 1939, ti=ecc44000 task=efc63a50 task.ti=ecc44000)
Stack:
ecc45e4c 00000000 ebea0400 ebea0000 ecc45e4c ebea0000 ecc45f04 f7f846a0
<0> c0750593 ebea0000 edfec340 ebea0000 000002b5 c075063a edfec340 c0993eec
<0> ebe78000 0000021c 00000000 00000009 00000000 00000000 00000000 00000000
Call Trace:
[<c0750593>] ? dev_get_stats+0x33/0xc0
[<c075063a>] ? dev_seq_printf_stats+0x1a/0x180
[<c07507aa>] ? dev_seq_show+0xa/0x20
[<c052398f>] ? seq_read+0x22f/0x3d0
[<c0523760>] ? seq_read+0x0/0x3d0
[<c054fdde>] ? proc_reg_read+0x5e/0x90
[<c054fd80>] ? proc_reg_read+0x0/0x90
[<c050a1dd>] ? vfs_read+0x9d/0x160
[<c049d4ef>] ? audit_syscall_entry+0x20f/0x230
[<c050a971>] ? sys_read+0x41/0x70
[<c0409cdf>] ? sysenter_do_call+0x12/0x28
Code: 60 4f 7e c8 8b 44 24 08 8b b8 20 06 00 00 85 ff 7e 63 c7 44 24 04 00 00 00 00 66 90 8b 54 24 04 8b 4c 24 08 8b 84 91 00 05 00 00 <8b> 50 40 eb 06 8d 74 26 00 89 ca f6 c2 01 0f 85 ae 00 00 00 8b
EIP: [<f7f6b537>] ixgbe_get_stats64+0x47/0x120 [ixgbe] SS:ESP 0068:ecc45d88
CR2: 0000000000000040
---[ end trace 51ea89f4e57f54f1 ]---
Emil
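
Reading the oops above: the faulting bytes <8b> 50 40 decode to a load
from 0x40(%eax), EAX is 0, and CR2 is 0x40, so the ring pointer fetched
from the rx_ring[] array appears to have been NULL when the stats were
read. Below is a userspace sketch of the same retry-style reader with a
NULL check added. It is an illustration under stated assumptions, not
the driver fix: C11 atomics stand in for u64_stats_sync, every name is
invented, and skipping an unallocated ring is only one possible way to
handle the case.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-ring counter block. */
struct ring {
	atomic_uint seq;		/* odd while the single writer is mid-update */
	_Atomic uint64_t packets;
	_Atomic uint64_t bytes;
};

/* Writer side (one writer per ring), analogous to update_begin/update_end. */
static void ring_account(struct ring *r, uint64_t packets, uint64_t bytes)
{
	atomic_fetch_add(&r->seq, 1);		/* begin: seq becomes odd */
	atomic_fetch_add(&r->packets, packets);
	atomic_fetch_add(&r->bytes, bytes);
	atomic_fetch_add(&r->seq, 1);		/* end: seq becomes even */
}

/* Reader side: sum all rings, retrying a ring whose counters changed
 * while they were being read and skipping slots that are not allocated. */
static void fold_rx_stats(struct ring **rings, size_t n,
			  uint64_t *packets, uint64_t *bytes)
{
	size_t i;

	*packets = 0;
	*bytes = 0;
	for (i = 0; i < n; i++) {
		struct ring *r = rings[i];
		unsigned int start;
		uint64_t p, b;

		if (!r)
			continue;	/* assumed guard: ring not set up */
		do {
			start = atomic_load(&r->seq);
			p = atomic_load(&r->packets);
			b = atomic_load(&r->bytes);
		} while ((start & 1) || atomic_load(&r->seq) != start);
		*packets += p;
		*bytes += b;
	}
}
```

Whether the right fix is a guard like this or making sure the ring
array is fully populated before the stats hook can run is a question
for the driver authors.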
^ permalink raw reply
* Re: [patch] fix stack overflow in pktgen_if_write()
From: Dan Carpenter @ 2010-10-27 22:40 UTC (permalink / raw)
To: nelhage
Cc: Eric Dumazet, David S. Miller, Robert Olsson, Andy Shevchenko,
netdev
In-Reply-To: <20101027221234.GN6062@bicker>
On Thu, Oct 28, 2010 at 12:12:35AM +0200, Dan Carpenter wrote:
> - char tb[count + 1];
> - if (copy_from_user(tb, user_buffer, count))
> - return -EFAULT;
> - tb[count] = 0;
> + char *tb;
> +
> + tb = strndup_user(user_buffer, count + 1);
Crap... This should be memdup_user().
Sorry about that. I'll send v2.
regards,
dan carpenter
> + if (IS_ERR(tb))
> + return PTR_ERR(tb);
> printk(KERN_DEBUG "pktgen: %s,%lu buffer -:%s:-\n", name,
> (unsigned long)count, tb);
> + kfree(tb);
> }
>
> if (!strcmp(name, "min_pkt_size")) {
^ permalink raw reply
* [patch v2] fix stack overflow in pktgen_if_write()
From: Dan Carpenter @ 2010-10-27 22:43 UTC (permalink / raw)
To: nelhage
Cc: Eric Dumazet, David S. Miller, Robert Olsson, Andy Shevchenko,
netdev
In-Reply-To: <20101027221234.GN6062@bicker>
Nelson Elhage says he was able to oops both amd64 and i386 test
machines with 8k writes to the pktgen file. Let's just allocate the
buffer on the heap instead of on the stack.
This can only be triggered by root so there are no security issues here.
Reported-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: Dan Carpenter <error27@gmail.com>
---
I saw this on twitter. Hi Nelson, could you test this?
V2: strndup_user() => memdup_user()
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 2c0df0f..b5d3c70 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -887,12 +887,14 @@ static ssize_t pktgen_if_write(struct file *file,
i += len;
if (debug) {
- char tb[count + 1];
- if (copy_from_user(tb, user_buffer, count))
- return -EFAULT;
- tb[count] = 0;
+ char *tb;
+
+ tb = memdup_user(user_buffer, count + 1);
+ if (IS_ERR(tb))
+ return PTR_ERR(tb);
printk(KERN_DEBUG "pktgen: %s,%lu buffer -:%s:-\n", name,
(unsigned long)count, tb);
+ kfree(tb);
}
if (!strcmp(name, "min_pkt_size")) {
^ permalink raw reply related
* Re: [RFC][net-next-2.6 PATCH v2] 8021q: set hard_header_len when VLAN offload features are toggled
From: Jesse Gross @ 2010-10-27 23:04 UTC (permalink / raw)
To: John Fastabend; +Cc: netdev@vger.kernel.org, bhutchings@solarflare.com
In-Reply-To: <4CC89C3A.7000209@intel.com>
On Wed, Oct 27, 2010 at 2:40 PM, John Fastabend
<john.r.fastabend@intel.com> wrote:
> On 10/26/2010 7:05 PM, Jesse Gross wrote:
>> On Tue, Oct 26, 2010 at 2:59 PM, John Fastabend
>> <john.r.fastabend@intel.com> wrote:
>>> Toggling the vlan tx|rx hw offloads needs to set the hard_header_len
>>> as well otherwise we end up using LL_RESERVED_SPACE incorrectly.
>>> This results in pskb_expand_head() being used unnecessarily.
>>>
>>> This adds a check in vlan_transfer_features to catch the ETH_FLAG_TXVLAN
>>> flag and set the header length. This requires drivers to add the
>>> ETH_FLAG_TXVLAN to vlan_features.
>>>
>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>
>> I think this addresses all of the original problems. However, I don't
>> think that we want to have drivers claim to support vlan offloading as
>> a feature for vlan packets. That implies some type of QinQ
>> functionality to me. In addition, if the vlan device claims to
>> support offloading and a second vlan device is stacked on top of it,
>> then the two will clobber skb->vlan_tci. It's probably simpler to
>> just keep track of whether vlan offloading is currently enabled so we
>> can find out whether it changed.
>>
>
> Agreed. Rather than trying to be clever, this is probably the easiest approach.
>
> --- a/net/8021q/vlan.c
> +++ b/net/8021q/vlan.c
> @@ -334,6 +334,12 @@ static void vlan_transfer_features(struct net_device *dev,
> vlandev->features &= ~dev->vlan_features;
> vlandev->features |= dev->features & dev->vlan_features;
> vlandev->gso_max_size = dev->gso_max_size;
> +
> + if (dev->features & NETIF_F_HW_VLAN_TX)
> + vlandev->hard_header_len = dev->hard_header_len;
> + else
> + vlandev->hard_header_len = dev->hard_header_len + VLAN_HLEN;
> +
Great, that's even simpler than I was thinking.
I think this series is ready to go.
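
The rule in that hunk is small enough to state as a pure function: with
TX VLAN offload the hardware inserts the tag, so the VLAN device needs
no extra headroom; without it the stack must reserve VLAN_HLEN more
bytes. A standalone sketch, where vlan_hard_header_len is an
illustrative name and the two constants are copied from their usual
kernel values:

```c
#define ETH_HLEN	14	/* Ethernet header length in bytes */
#define VLAN_HLEN	4	/* 802.1Q tag length in bytes */

/* Hypothetical helper mirroring the if/else in the hunk above. */
static unsigned int vlan_hard_header_len(unsigned int real_len, int tx_offload)
{
	/* Offload on: the NIC adds the tag, so no extra room is needed. */
	return tx_offload ? real_len : real_len + VLAN_HLEN;
}
```

For a plain Ethernet parent this gives 14 with offload enabled and 18
with it disabled.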
^ permalink raw reply
* Re: [patch v2] fix stack overflow in pktgen_if_write()
From: Nelson Elhage @ 2010-10-27 23:06 UTC (permalink / raw)
To: Dan Carpenter
Cc: Eric Dumazet, David S. Miller, Robert Olsson, Andy Shevchenko,
netdev
In-Reply-To: <20101027224302.GQ6062@bicker>
You want to add a trailing NUL, or else printk will read off the end of the
buffer.
Also, by memdup()ing count + 1 bytes, you're technically reading one more byte
than userspace asked for, which could in principle lead to a spurious EFAULT.
- Nelson
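
Both points can be addressed by copying exactly count bytes into a
buffer of count + 1 bytes and writing the terminator by hand. The
sketch below is a userspace analogue only: malloc() and memcpy() stand
in for kmalloc() and copy_from_user(), and dup_with_nul is a made-up
name.

```c
#include <stdlib.h>
#include <string.h>

/* Copy count bytes from src and append a NUL so the result can be
 * printed with %s. Returns NULL on allocation failure or overflow. */
static char *dup_with_nul(const void *src, size_t count)
{
	char *buf;

	if (count + 1 == 0)		/* count + 1 would wrap to zero */
		return NULL;
	buf = malloc(count + 1);
	if (!buf)
		return NULL;
	memcpy(buf, src, count);	/* read only what the caller supplied */
	buf[count] = '\0';		/* terminator lives in the extra byte */
	return buf;
}
```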
On Thu, Oct 28, 2010 at 12:43:02AM +0200, Dan Carpenter wrote:
> Nelson Elhage says he was able to oops both amd64 and i386 test
> machines with 8k writes to the pktgen file. Let's just allocate the
> buffer on the heap instead of on the stack.
>
> This can only be triggered by root so there are no security issues here.
>
> Reported-by: Nelson Elhage <nelhage@ksplice.com>
> Signed-off-by: Dan Carpenter <error27@gmail.com>
> ---
> I saw this on twitter. Hi Nelson, could you test this?
>
> V2: strndup_user() => memdup_user()
>
> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
> index 2c0df0f..b5d3c70 100644
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -887,12 +887,14 @@ static ssize_t pktgen_if_write(struct file *file,
> i += len;
>
> if (debug) {
> - char tb[count + 1];
> - if (copy_from_user(tb, user_buffer, count))
> - return -EFAULT;
> - tb[count] = 0;
> + char *tb;
> +
> + tb = memdup_user(user_buffer, count + 1);
> + if (IS_ERR(tb))
> + return PTR_ERR(tb);
> printk(KERN_DEBUG "pktgen: %s,%lu buffer -:%s:-\n", name,
> (unsigned long)count, tb);
> + kfree(tb);
> }
>
> if (!strcmp(name, "min_pkt_size")) {
>
^ permalink raw reply
* Re: IPV6 raw socket denies bind(2)
From: Brian Haley @ 2010-10-27 23:54 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: netdev, David S. Miller, Eric Dumazet
In-Reply-To: <alpine.LNX.2.01.1010280000260.7820@obet.zrqbmnf.qr>
On 10/27/2010 06:01 PM, Jan Engelhardt wrote:
> int main(void)
> {
> struct sockaddr_in6 src = {};
> int sk;
>
> sk = socket(AF_INET6, SOCK_RAW, IPPROTO_UDP);
> memset(&src, 0, sizeof(src));
> inet_pton(AF_INET6, "::1", &src);
> src.sin6_family = AF_INET6;
>
> if (bind(sk, (void *)&src, sizeof(src)) < 0) {
> perror("bind");
> abort();
> }
> return 0;
> }
You're trashing the sockaddr, try this patch:
< inet_pton(AF_INET6, "::1", &src);
---
> inet_pton(AF_INET6, "::1", &src.sin6_addr);
-Brian
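
To spell out why the original call trashes the structure:
inet_pton(AF_INET6, ...) writes a 16-byte struct in6_addr to its
destination, so passing &src lays those bytes over sin6_family,
sin6_port and sin6_flowinfo at the start of the sockaddr_in6 instead
of filling sin6_addr. A minimal sketch of setting up the address
correctly; the helper name fill_loopback6 is made up for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *sa with the IPv6 loopback address, port 0.
 * Returns 1 on success, like inet_pton(). */
static int fill_loopback6(struct sockaddr_in6 *sa)
{
	memset(sa, 0, sizeof(*sa));
	sa->sin6_family = AF_INET6;
	/* The destination is the in6_addr member, not the whole sockaddr. */
	return inet_pton(AF_INET6, "::1", &sa->sin6_addr);
}
```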
^ permalink raw reply
* Re: [PATCH 2.6.36/stable v2] vlan: Fix crash when hwaccel rx pkt for non-existant vlan.
From: Jesse Gross @ 2010-10-28 0:11 UTC (permalink / raw)
To: Ben Greear; +Cc: netdev
In-Reply-To: <1288112797-21550-1-git-send-email-greearb@candelatech.com>
On Tue, Oct 26, 2010 at 10:06 AM, Ben Greear <greearb@candelatech.com> wrote:
> The vlan_hwaccel_do_receive code expected skb->dev to always
> be a vlan device, but if the NIC was promisc, and the VLAN
> for a particular VID was not configured, then this method
> could receive a packet where skb->dev was NOT a vlan
> device. This caused access of bad memory and a crash.
>
> Signed-off-by: Ben Greear <greearb@candelatech.com>
> ---
> v1 -> v2: Simplify patch..no need for setting pkt-type, etc.
>
> :100644 100644 0eb96f7... 0687b6c... M net/8021q/vlan_core.c
> :100644 100644 660dd41... 5dc45b9... M net/core/dev.c
> net/8021q/vlan_core.c | 3 +++
> net/core/dev.c | 5 +++--
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> index 0eb96f7..0687b6c 100644
> --- a/net/8021q/vlan_core.c
> +++ b/net/8021q/vlan_core.c
> @@ -43,6 +43,9 @@ int vlan_hwaccel_do_receive(struct sk_buff *skb)
> struct net_device *dev = skb->dev;
> struct vlan_rx_stats *rx_stats;
>
> + if (!is_vlan_dev(dev))
> + return 0;
> +
> skb->dev = vlan_dev_info(dev)->real_dev;
> netif_nit_deliver(skb);
>
What if we dropped any packet with a tag in skb->vlan_tci before it
gets to the bridge hooks? That would accomplish the original goal of
getting packets to tcpdump while preventing them from making it to
places where they aren't expected. It will provide the same behavior
as earlier kernels.
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 660dd41..5dc45b9 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2828,8 +2828,9 @@ static int __netif_receive_skb(struct sk_buff *skb)
> if (!netdev_tstamp_prequeue)
> net_timestamp_check(skb);
>
> - if (vlan_tx_tag_present(skb) && vlan_hwaccel_do_receive(skb))
> - return NET_RX_SUCCESS;
> + if (vlan_tx_tag_present(skb))
> + /* This method cannot fail at this time. */
> + vlan_hwaccel_do_receive(skb);
This is correct but it's not a bugfix, so I'm not sure that it should
go to -stable. It's already been fixed for 2.6.37.
^ permalink raw reply
* Re: [PATCH 2.6.36/stable v2] vlan: Fix crash when hwaccel rx pkt for non-existant vlan.
From: Ben Greear @ 2010-10-28 0:15 UTC (permalink / raw)
To: Jesse Gross; +Cc: netdev
In-Reply-To: <AANLkTi=EHVBSNNmsts4xTVZ2DGTBD92mHnpP0e5ZYEx1@mail.gmail.com>
On 10/27/2010 05:11 PM, Jesse Gross wrote:
> On Tue, Oct 26, 2010 at 10:06 AM, Ben Greear<greearb@candelatech.com> wrote:
>> The vlan_hwaccel_do_receive code expected skb->dev to always
>> be a vlan device, but if the NIC was promisc, and the VLAN
>> for a particular VID was not configured, then this method
>> could receive a packet where skb->dev was NOT a vlan
>> device. This caused access of bad memory and a crash.
>>
>> Signed-off-by: Ben Greear<greearb@candelatech.com>
>> ---
>> v1 -> v2: Simplify patch..no need for setting pkt-type, etc.
>>
>> :100644 100644 0eb96f7... 0687b6c... M net/8021q/vlan_core.c
>> :100644 100644 660dd41... 5dc45b9... M net/core/dev.c
>> net/8021q/vlan_core.c | 3 +++
>> net/core/dev.c | 5 +++--
>> 2 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>> index 0eb96f7..0687b6c 100644
>> --- a/net/8021q/vlan_core.c
>> +++ b/net/8021q/vlan_core.c
>> @@ -43,6 +43,9 @@ int vlan_hwaccel_do_receive(struct sk_buff *skb)
>> struct net_device *dev = skb->dev;
>> struct vlan_rx_stats *rx_stats;
>>
>> + if (!is_vlan_dev(dev))
>> + return 0;
>> +
>> skb->dev = vlan_dev_info(dev)->real_dev;
>> netif_nit_deliver(skb);
>>
>
> What if we dropped any packet with a tag in skb->vlan_tci before it
> gets to the bridge hooks? That would accomplish the original goal of
> getting packets to tcpdump while preventing them from making it to
> places where they aren't expected. It will provide the same behavior
> as earlier kernels.
The VLAN code has changed a lot since I messed with it last, so
there very well may be better ways to fix this than what I came up
with. Please propose a patch if you have a suggestion.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply
* multi-machine simultaneous kernel panic in tcp_transmit_kcb
From: Doug Hughes @ 2010-10-28 1:04 UTC (permalink / raw)
To: netdev
3 machines within 1 minute of each other (odd, by itself, but not the
root of the question).
2 of this:
2.6.18-164.15.1.el5 #1 SMP Wed Mar 17 11:30:06 EDT 2010 x86_64 x86_64
x86_64 GNU/Linux
(I have a screen shot on the kvm)
all Cent 5.4
1 Xen instance with 2.6.18-128.1.14.el5xen #1 SMP Wed Jun 17 07:10:16
EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
a slightly older kernel but crashed within one minute of the other two.
Since it's a xen, I have a text traceback:
Pid: 0, comm: swapper Not tainted 2.6.18-128.1.14.el5xen #1
RIP: e030:[<ffffffff8040e077>] [<ffffffff8040e077>] pskb_copy+0x133/0x1b1
RSP: e02b:ffffffff8066ade0 EFLAGS: 00010282
RAX: ffff8800325fa120 RBX: ffff8800434f5780 RCX: ffff88006d311930
RDX: 656363612f647074 RSI: ffff8800325fa130 RDI: 0000000000000002
RBP: ffff8800549aa680 R08: 7ffffffffffffffe R09: 0000000000000000
R10: ffff8800434f5780 R11: 00000000000000c8 R12: 0000000000000220
R13: ffff8800549aa680 R14: 0000000000000000 R15: ffffffffff578000
FS: 00002b84514af260(0000) GS:ffffffff805ba000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process swapper (pid: 0, threadinfo ffffffff8062a000, task ffffffff804e0a80)
Stack: ffffffff802886d9 ffff88006ad3d280 0000000000000001 ffffffff80222485
ffff880025665380 000000017d0f80ab 0000000000000001 ffff88006ad3d280
ffff8800549aa680 00000000ffffff8f
Call Trace:
<IRQ> [<ffffffff802886d9>] rebalance_tick+0x18b/0x3d4
[<ffffffff80222485>] tcp_transmit_skb+0x73/0x667
[<ffffffff8043903a>] tcp_retransmit_skb+0x53d/0x638
[<ffffffff8043a569>] tcp_write_timer+0x0/0x68e
[<ffffffff8043a9d6>] tcp_write_timer+0x46d/0x68e
[<ffffffff80291f8b>] run_timer_softirq+0x13f/0x1c6
[<ffffffff802130d6>] __do_softirq+0x8d/0x13b
[<ffffffff80260da4>] call_softirq+0x1c/0x278
[<ffffffff8026e0be>] do_softirq+0x31/0x98
[<ffffffff8026df39>] do_IRQ+0xec/0xf5
[<ffffffff803a7b94>] evtchn_do_upcall+0x13b/0x1fb
[<ffffffff802608d6>] do_hypervisor_callback+0x1e/0x2c
<EOI> [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
[<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
[<ffffffff8026f511>] raw_safe_halt+0x84/0xa8
[<ffffffff8026ca52>] xen_idle+0x38/0x4a
[<ffffffff8024b0d8>] cpu_idle+0x97/0xba
[<ffffffff80634b09>] start_kernel+0x21f/0x224
[<ffffffff806341e5>] _sinittext+0x1e5/0x1eb
Code: 48 8b 02 25 00 40 02 00 48 3d 00 40 02 00 75 04 48 8b 52 10
RIP [<ffffffff8040e077>] pskb_copy+0x133/0x1b1
RSP <ffffffff8066ade0>
<0>Kernel panic - not syncing: Fatal exception
---
The first 4 lines of the trace on the xen and the non-xen are the same
except for the addresses.
In fact, they are the same up until the 9th line where they start to
diverge a little bit.
The last thing in the kern log before the crash on one was an nfs server
not responding, but those happen sporadically and often enough that I
don't suspect it's related.
Given how it looks, this seemed like an appropriate question for netdev
(following a failed Google search).
^ permalink raw reply
* cxgb3: kernel access of bad area with v2.6.36-6794-g12ba8d1
From: Nishanth Aravamudan @ 2010-10-28 1:54 UTC (permalink / raw)
To: Divy Le Ray; +Cc: sonnyrao, netdev, linux-kernel
Hi,
I'm seeing the following trace w/ current git on a machine in our lab:
Chelsio T3 Network Driver - version 1.1.4-ko
cxgb3 0003:01:00.0: enabling device (0140 -> 0142)
Unable to handle kernel paging request for data at address 0x00000010
Faulting instruction address: 0xd000000008473ae8
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32 NUMA pSeries
last sysfs file: /sys/devices/virtual/block/dm-0/dev
Modules linked in: cxgb3(+) mdio ehea ib_ehca ib_core ext4 jbd2 mbcache sd_mod crc_t10dif ipr dm_mod [last unloaded: scsi_wait_scan]
NIP: d000000008473ae8 LR: d000000008473ac4 CTR: c0000000004398a0
REGS: c0000007a157f190 TRAP: 0300 Not tainted (2.6.36)
MSR: 8000000000009032 <EE,ME,IR,DR> CR: 24424444 XER: 00000000
DAR: 0000000000000010, DSISR: 0000000040000000
TASK = c0000007a3755290[741] 'modprobe' THREAD: c0000007a157c000 CPU: 24
GPR00: 0000000000000000 c0000007a157f410 d000000008486978 c0000007a526c000
GPR04: c0000000006d25dd c0000007a526c005 c0000007a526c29e 0000000000000002
GPR08: 0000000000000004 0000000000000010 c0000007a526c0a0 0000000000000000
GPR12: d000000008474aa8 c00000000eed3c00 d00000000847aeb8 0000000000000001
GPR16: 0000000000001000 0000000000000000 d000000008477aa8 00003c047ef7e000
GPR20: c0000007a8b7d280 c0000007a8b7d310 d00000000847d1c0 d00000000847d1d8
GPR24: 0000000000000003 00003c047ef7efff 0000000000000001 c0000007a3c1c000
GPR28: 0000000000000000 c0000007a526c000 d000000008484210 c0000007a3c1c000
NIP [d000000008473ae8] .init_one+0x510/0xb7c [cxgb3]
LR [d000000008473ac4] .init_one+0x4ec/0xb7c [cxgb3]
Call Trace:
[c0000007a157f410] [d000000008473ac4] .init_one+0x4ec/0xb7c [cxgb3] (unreliable)
[c0000007a157f560] [c0000000002e40bc] .local_pci_probe+0x7c/0x100
[c0000007a157f5f0] [c0000000002e5018] .pci_device_probe+0x148/0x150
[c0000007a157f6a0] [c00000000034df68] .driver_probe_device+0x128/0x330
[c0000007a157f750] [c00000000034e27c] .__driver_attach+0x10c/0x110
[c0000007a157f7e0] [c00000000034d15c] .bus_for_each_dev+0x9c/0xf0
[c0000007a157f890] [c00000000034dbc8] .driver_attach+0x28/0x40
[c0000007a157f910] [c00000000034c648] .bus_add_driver+0x218/0x3d0
[c0000007a157f9c0] [c00000000034e718] .driver_register+0x98/0x1d0
[c0000007a157fa60] [c0000000002e5354] .__pci_register_driver+0x64/0x140
[c0000007a157fb00] [d000000008474278] .cxgb3_init_module+0x2c/0x44 [cxgb3]
[c0000007a157fb80] [c000000000009754] .do_one_initcall+0x64/0x1e0
[c0000007a157fc40] [c0000000000d28b8] .SyS_init_module+0x1b8/0x1790
[c0000007a157fe30] [c000000000008564] syscall_exit+0x0/0x40
Instruction dump:
9b890018 9b090019 48000fe9 e8410028 801d0308 2f800000 419e003c 39600000
e93d0300 796045e4 7d290214 39290010 <7c0048a8> 7c00d378 7c0049ad 40a2fff4
---[ end trace 2a530df8c4ad3d70 ]---
udevd-work[600]: '/sbin/modprobe -b pci:v00001425d00000030sv00001014sd0000038Cbc02sc00i00' unexpected exit with status 0x000b
I did an objdump -ldr of cxgb3.ko and:
4c0: 48 00 00 01 bl 4c0 <.init_one+0x4c0>
4c0: R_PPC64_REL24 .alloc_etherdev_mq
4c4: 60 00 00 00 nop
4c8: 7c 7d 1b 79 mr. r29,r3
4cc: 41 82 03 28 beq- 7f4 <.init_one+0x7f4>
4d0: 39 3d 07 00 addi r9,r29,1792
4d4: fa bd 03 f8 std r21,1016(r29)
4d8: fb bb 32 08 std r29,12808(r27)
4dc: fb fd 07 00 std r31,1792(r29)
4e0: 9b 89 00 18 stb r28,24(r9)
4e4: 9b 09 00 19 stb r24,25(r9)
4e8: 48 00 00 01 bl 4e8 <.init_one+0x4e8>
4e8: R_PPC64_REL24 .netif_carrier_off
4ec: 60 00 00 00 nop
4f0: 80 1d 03 08 lwz r0,776(r29)
4f4: 2f 80 00 00 cmpwi cr7,r0,0
4f8: 41 9e 00 3c beq- cr7,534 <.init_one+0x534>
4fc: 39 60 00 00 li r11,0
500: e9 3d 03 00 ld r9,768(r29)
504: 79 60 45 e4 rldicr r0,r11,8,55
508: 7d 29 02 14 add r9,r9,r0
50c: 39 29 00 10 addi r9,r9,16
510: 7c 00 48 a8 ldarx r0,0,r9
514: 7c 00 d3 78 or r0,r0,r26
518: 7c 00 49 ad stdcx. r0,0,r9
51c: 40 a2 ff f4 bne- 510 <.init_one+0x510>
So I'm guessing it's somewhere in here:
for (i = 0; i < ai->nports0 + ai->nports1; ++i) {
struct net_device *netdev;
netdev = alloc_etherdev_mq(sizeof(struct port_info), SGE_QSETS);
if (!netdev) {
err = -ENOMEM;
goto out_free_dev;
}
SET_NETDEV_DEV(netdev, &pdev->dev);
adapter->port[i] = netdev;
pi = netdev_priv(netdev);
pi->adapter = adapter;
pi->rx_offload = T3_RX_CSUM | T3_LRO;
pi->port_id = i;
netif_carrier_off(netdev);
netif_tx_stop_all_queues(netdev);
netdev->irq = pdev->irq;
netdev->mem_start = mmio_start;
netdev->mem_end = mmio_start + mmio_len - 1;
netdev->features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
netdev->features |= NETIF_F_GRO;
if (pci_using_dac)
netdev->features |= NETIF_F_HIGHDMA;
netdev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
netdev->netdev_ops = &cxgb_netdev_ops;
SET_ETHTOOL_OPS(netdev, &cxgb_ethtool_ops);
}
Well, presuming the trace is mostly accurate? I'm not sure what else is
needed to determine the problem further. I'm building 2.6.36 as I write
this. But it doesn't seem like this code has changed much and I had a
working kernel around 2.6.36-rc7...
Let me know what else I can do to help debug.
Thanks,
Nish
--
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center
^ permalink raw reply
* Re: [PATCH] igb: Fix unused variable warning.
From: Jeff Kirsher @ 2010-10-28 2:13 UTC (permalink / raw)
To: Jesse Gross; +Cc: David Miller, netdev
In-Reply-To: <1288140963-20537-1-git-send-email-jesse@nicira.com>
On Tue, Oct 26, 2010 at 17:56, Jesse Gross <jesse@nicira.com> wrote:
> Commit eab6d18d "vlan: Don't check for vlan group before
> vlan_tx_tag_present" removed the need for the adapter variable
> in igb_xmit_frame_ring_adv(). This removes the variable as well
> to avoid the compiler warning.
>
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: Jesse Gross <jesse@nicira.com>
> ---
> drivers/net/igb/igb_main.c | 1 -
> 1 files changed, 0 insertions(+), 1 deletions(-)
>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
^ permalink raw reply
* [PATCH 1/1] net/unix: Allow Unix sockets to be treated like normal files
From: x @ 2010-10-28 2:24 UTC (permalink / raw)
To: netdev; +Cc: Dave Miller, Jeff Hansen
From: Jeff Hansen <x@jeffhansen.com>
Resent. Is there anything else I need to do for this patch to get reviewed
and/or merged? Any comments by anyone?
This allows Unix sockets to be opened, written, read, and closed, like
normal files. This can be especially handy from, for example, a shell
script that wants to send a short message to a Unix socket, but doesn't
want to and/or cannot create the socket itself.
This will try to open the Unix socket first in SOCK_DGRAM mode, then
SOCK_STREAM mode if that fails.
Signed-off-by: Jeff Hansen <x@jeffhansen.com>
---
net/unix/Kconfig | 10 +++++
net/unix/af_unix.c | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 123 insertions(+), 0 deletions(-)
diff --git a/net/unix/Kconfig b/net/unix/Kconfig
index 5a69733..68df4f1 100644
--- a/net/unix/Kconfig
+++ b/net/unix/Kconfig
@@ -19,3 +19,13 @@ config UNIX
Say Y unless you know what you are doing.
+config UNIX_FOPS
+ boolean "Allow Unix sockets to be treated like normal files"
+ depends on UNIX
+ ---help---
+ If you say Y here, Unix sockets may be opened, written, read, and
+ closed, like normal files. This is handy for sending short commands
+ to Unix sockets (i.e. from shell scripts), without having to create
+ a Unix socket.
+
+ Say Y unless you know what you are doing.
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 0ebc777..b5a6655 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -798,6 +798,114 @@ fail:
return NULL;
}
+#ifdef CONFIG_UNIX_FOPS
+static int unix_open(struct inode *inode, struct file *filp)
+{
+ int err;
+ struct socket *sock = NULL;
+ struct dentry *dentry;
+ struct sockaddr_un sunaddr = { 0 };
+ char *p;
+
+ if (!filp)
+ return -ENXIO;
+ dentry = filp->f_dentry;
+
+ if (!dentry || !dentry->d_parent)
+ return -ENXIO;
+
+ if (filp->private_data)
+ return -EBUSY;
+
+ sunaddr.sun_family = AF_UNIX;
+ p = d_path(&filp->f_path, sunaddr.sun_path, sizeof(sunaddr.sun_path));
+ if (IS_ERR(p))
+ return PTR_ERR(p);
+ memmove(sunaddr.sun_path, p, p[sizeof(sunaddr.sun_path) - 1] ?
+ sizeof(sunaddr.sun_path) : strlen(p));
+
+ err = sock_create(PF_UNIX, SOCK_DGRAM, 0, &sock);
+ if (err)
+ return err;
+
+ err = unix_dgram_connect(sock, (struct sockaddr *)&sunaddr,
+ sizeof(sunaddr), 0);
+ if (err) {
+ sock_release(sock);
+
+ err = sock_create(PF_UNIX, SOCK_STREAM, 0, &sock);
+ if (err)
+ return err;
+
+ err = unix_stream_connect(sock, (struct sockaddr *)&sunaddr,
+ sizeof(sunaddr), 0);
+
+ if (err)
+ return err;
+ }
+ filp->private_data = sock;
+
+ return err;
+}
+
+static int unix_frelease(struct inode *inode, struct file *filp)
+{
+ if (!filp->private_data)
+ return -ENXIO;
+
+ sock_release(filp->private_data);
+ filp->private_data = NULL;
+ return 0;
+}
+
+static ssize_t unix_readwrite(struct file *filp, void *buf,
+ size_t _len, loff_t *ppos, int do_write)
+{
+ struct socket *sock = filp->private_data;
+ int len = (int)_len, err;
+ struct kvec iov = {
+ .iov_base = buf,
+ .iov_len = len,
+ };
+ struct msghdr msg = {
+ /* NB: struct iovec and kvec are equal */
+ .msg_iov = (struct iovec *)&iov,
+ .msg_iovlen = 1,
+ };
+
+ if (!sock)
+ return -ENXIO;
+ if (_len > 0xffffffffLL)
+ return -E2BIG;
+
+ err = do_write ? sock_sendmsg(sock, &msg, len) :
+ sock_recvmsg(sock, &msg, len, 0);
+ if (err > 0 && ppos)
+ *ppos += err;
+
+ return err;
+}
+
+static ssize_t unix_write(struct file *filp, const char __user *buf,
+ size_t _len, loff_t *ppos)
+{
+ return unix_readwrite(filp, (void *)buf, _len, ppos, 1);
+}
+
+static ssize_t unix_read(struct file *filp, char __user *buf,
+ size_t _len, loff_t *ppos)
+{
+ return unix_readwrite(filp, (void *)buf, _len, ppos, 0);
+}
+
+const struct file_operations unix_sock_fops = {
+ .owner = THIS_MODULE,
+ .open = unix_open,
+ .release = unix_frelease,
+ .write = unix_write,
+ .read = unix_read,
+};
+#endif /* CONFIG_UNIX_FOPS */
static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
{
@@ -874,6 +982,11 @@ out_mknod_drop_write:
mnt_drop_write(nd.path.mnt);
if (err)
goto out_mknod_dput;
+
+#ifdef CONFIG_UNIX_FOPS
+ dentry->d_inode->i_fop = &unix_sock_fops;
+#endif
+
mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
dput(nd.path.dentry);
nd.path.dentry = dentry;
--
1.7.0.4
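
For comparison, this is roughly what a userspace client has to do
today to get the behaviour the patch builds into open(): try a
datagram connect to the path first and fall back to a stream connect.
A sketch only; connect_unix is an illustrative name, and error
handling is reduced to returning -1.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: connect to the Unix socket at path, trying
 * SOCK_DGRAM first and SOCK_STREAM second. Returns a connected fd,
 * or -1 if neither socket type could connect. */
static int connect_unix(const char *path)
{
	static const int types[] = { SOCK_DGRAM, SOCK_STREAM };
	struct sockaddr_un addr;
	size_t i;

	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;		/* path does not fit in sun_path */
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	for (i = 0; i < sizeof(types) / sizeof(types[0]); i++) {
		int fd = socket(AF_UNIX, types[i], 0);

		if (fd < 0)
			continue;
		if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
			return fd;	/* caller owns the descriptor */
		close(fd);		/* wrong type or no listener: try next */
	}
	return -1;
}
```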
^ permalink raw reply related
* Re: [PATCH 1/1] net/unix: Allow Unix sockets to be treated like normal files
From: David Miller @ 2010-10-28 2:32 UTC (permalink / raw)
To: x; +Cc: netdev
In-Reply-To: <1288232669-8927-1-git-send-email-x@jeffhansen.com>
From: x@jeffhansen.com
Date: Thu, 28 Oct 2010 02:24:29 +0000
> Resent. Is there anything else I need to do for this patch to get reviewed
> and/or merged? Any comments by anyone?
I don't like this idea at all.
I remember there is a reason why similar things are not allowed for
sockets, it causes all sorts of problems although I forget the exact
details.
Take a look at net/socket.c:sock_no_open(), for example.
^ permalink raw reply
* Re: [PATCH] igb: Fix unused variable warning.
From: David Miller @ 2010-10-28 2:44 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: jesse, netdev
In-Reply-To: <AANLkTi=iDMYjMsvbeR9-ZhCE5zMHacO-vM5wTXvx4S9K@mail.gmail.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Wed, 27 Oct 2010 19:13:23 -0700
> On Tue, Oct 26, 2010 at 17:56, Jesse Gross <jesse@nicira.com> wrote:
>> Commit eab6d18d "vlan: Don't check for vlan group before
>> vlan_tx_tag_present" removed the need for the adapter variable
>> in igb_xmit_frame_ring_adv(). This removes the variable as well
>> to avoid the compiler warning.
>>
>> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
>> Signed-off-by: Jesse Gross <jesse@nicira.com>
>> ---
>> drivers/net/igb/igb_main.c | 1 -
>> 1 files changed, 0 insertions(+), 1 deletions(-)
>>
>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Applied.
^ permalink raw reply
* Re: [PATCH 1/1] net/unix: Allow Unix sockets to be treated like normal files
From: Jeff Hansen @ 2010-10-28 2:50 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20101027.193212.193709254.davem@davemloft.net>
Dave,
I agree that on larger systems this doesn't really make sense, but on
embedded platforms this can save some code space since applications can
get rid of their FIFO listeners and have strictly socket listeners.
That's why I made it an option that could be disabled by default.
Do you know who originally suggested that "creepy crawlies" are
introduced by allowing sockets to be opened? I'd be interested to know
how this could affect security, if at all.
-Jeff
On 10/27/2010 08:32 PM, David Miller wrote:
> From: x@jeffhansen.com
> Date: Thu, 28 Oct 2010 02:24:29 +0000
>
>> Resent. Is there anything else I need to do for this patch to get reviewed
>> and/or merged? Any comments by anyone?
> I don't like this idea at all.
>
> I remember there is a reason why similar things are not allowed for
> sockets, it causes all sorts of problems although I forget the exact
> details.
>
> Take a look at net/socket.c:sock_no_open(), for example.
>
--
---------------------------------------------------
"If someone's gotta do it, it might as well be me."
x@jeffhansen.com
^ permalink raw reply
* [PATCH 1/2] vmxnet3: remove unnecessary byteswapping in BAR writing macros
From: Harvey Harrison @ 2010-10-28 3:12 UTC (permalink / raw)
To: sbhatewara; +Cc: netdev
readl/writel swap to little-endian internally.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
---
drivers/net/vmxnet3/vmxnet3_int.h | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/vmxnet3/vmxnet3_int.h b/drivers/net/vmxnet3/vmxnet3_int.h
index 8a2f471..edf2288 100644
--- a/drivers/net/vmxnet3/vmxnet3_int.h
+++ b/drivers/net/vmxnet3/vmxnet3_int.h
@@ -330,14 +330,14 @@ struct vmxnet3_adapter {
};
#define VMXNET3_WRITE_BAR0_REG(adapter, reg, val) \
- writel(cpu_to_le32(val), (adapter)->hw_addr0 + (reg))
+ writel((val), (adapter)->hw_addr0 + (reg))
#define VMXNET3_READ_BAR0_REG(adapter, reg) \
- le32_to_cpu(readl((adapter)->hw_addr0 + (reg)))
+ readl((adapter)->hw_addr0 + (reg))
#define VMXNET3_WRITE_BAR1_REG(adapter, reg, val) \
- writel(cpu_to_le32(val), (adapter)->hw_addr1 + (reg))
+ writel((val), (adapter)->hw_addr1 + (reg))
#define VMXNET3_READ_BAR1_REG(adapter, reg) \
- le32_to_cpu(readl((adapter)->hw_addr1 + (reg)))
+ readl((adapter)->hw_addr1 + (reg))
#define VMXNET3_WAKE_QUEUE_THRESHOLD(tq) (5)
#define VMXNET3_RX_ALLOC_THRESHOLD(rq, ring_idx, adapter) \
--
1.7.1
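
The reasoning behind this change: writel() takes a CPU-order value and
stores it little-endian, so wrapping the argument in cpu_to_le32()
swaps twice on a big-endian host and does nothing on a little-endian
one. A userspace sketch of a writel()-like store makes that concrete.
fake_writel and fake_readl are stand-ins operating on a plain byte
buffer, not real MMIO accessors.

```c
#include <stdint.h>

/* Store a CPU-order value as little-endian bytes, whatever the host
 * byte order is. */
static void fake_writel(uint32_t val, uint8_t *reg)
{
	reg[0] = (uint8_t)(val);
	reg[1] = (uint8_t)(val >> 8);
	reg[2] = (uint8_t)(val >> 16);
	reg[3] = (uint8_t)(val >> 24);
}

/* Load little-endian bytes back into a CPU-order value. */
static uint32_t fake_readl(const uint8_t *reg)
{
	return (uint32_t)reg[0] |
	       ((uint32_t)reg[1] << 8) |
	       ((uint32_t)reg[2] << 16) |
	       ((uint32_t)reg[3] << 24);
}
```

Because the accessor already fixes the byte order, the caller passes
and receives plain CPU-order values on every architecture.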
^ permalink raw reply related
* [PATCH 2/2] vmxnet: trivial annotation of protocol constant
From: Harvey Harrison @ 2010-10-28 3:12 UTC (permalink / raw)
To: sbhatewara; +Cc: netdev
In-Reply-To: <1288235555-24675-1-git-send-email-harvey.harrison@gmail.com>
Noticed by sparse:
drivers/net/vmxnet3/vmxnet3_drv.c:876:38: warning: cast from restricted __be16
drivers/net/vmxnet3/vmxnet3_drv.c:876:38: warning: cast from restricted __be16
drivers/net/vmxnet3/vmxnet3_drv.c:876:24: warning: restricted __be16 degrades to integer
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
---
drivers/net/vmxnet3/vmxnet3_drv.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index e3658e1..21314e0 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -873,7 +873,7 @@ vmxnet3_tq_xmit(struct sk_buff *skb, struct vmxnet3_tx_queue *tq,
count = VMXNET3_TXD_NEEDED(skb_headlen(skb)) +
skb_shinfo(skb)->nr_frags + 1;
- ctx.ipv4 = (skb->protocol == __constant_ntohs(ETH_P_IP));
+ ctx.ipv4 = (skb->protocol == cpu_to_be16(ETH_P_IP));
ctx.mss = skb_shinfo(skb)->gso_size;
if (ctx.mss) {
--
1.7.1
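
Why the old code worked at all: ntohs() and htons() perform the same
byte swap, so __constant_ntohs(ETH_P_IP) produced the right bits but
with the wrong endianness annotation, which is what sparse flagged. A
userspace sketch of the comparison, using htons() in place of
cpu_to_be16(); frame_is_ipv4 and the EX_ constant are illustrative
names.

```c
#include <arpa/inet.h>
#include <stdint.h>

#define EX_ETH_P_IP 0x0800	/* EtherType for IPv4, host byte order */

/* protocol_be is in network (big-endian) order, like skb->protocol. */
static int frame_is_ipv4(uint16_t protocol_be)
{
	/* Convert the constant to network order, not the field to host order. */
	return protocol_be == htons(EX_ETH_P_IP);
}
```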
^ permalink raw reply related