Netdev List
* Re: Real networking namespace
From: Stephen Smalley @ 2009-10-09 16:44 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: linux-security-module, Al Viro, netdev, Paul Moore, James Morris
In-Reply-To: <1255106246.2182.219.camel@moss-pluto.epoch.ncsc.mil>

On Fri, 2009-10-09 at 12:37 -0400, Stephen Smalley wrote:
> On Fri, 2009-10-09 at 08:38 -0700, Stephen Hemminger wrote:
> > The existing networking namespace model is unattractive for what I want,
> > has anyone investigated better alternatives?
> > 
> > I would like to be able to allow access to a network interface and associated objects
> > (routing tables etc.) to be controlled by Mandatory Access Control APIs,
> > i.e. grant access to eth0 to only certain processes.  Some of the issues
> > with the existing models are:
> >   * eth0 and associated objects don't really exist in the filesystem, so
> >     they are not subject to LSM-style control (SELinux/SMACK/TOMOYO)
> >   * network namespaces do not allow an object to exist in multiple namespaces.
> >     The current model is more restrictive than chroot jails. At least with
> >     chroot, you can put filesystem objects in multiple jails.
> > 
> > Since one of the first rules of security is "don't reinvent", surely
> > others have dealt with this issue. Any good ideas?
> 
> Is there something that prevents you from using the existing SELinux
> network access controls?  netif is a security class governed by SELinux
> policy, and routing table operations would be covered by the SELinux
> checks on netlink_route_socket.  SELinux uses a combination of LSM hooks
> and netfilter hooks to mediate network operations.

Also, depending on what you want to do, SECMARK may be useful to you.
That allows you to mark packets with security contexts via iptables, and
then use SELinux policy to control their flow.
http://paulmoore.livejournal.com/4281.html
http://james-morris.livejournal.com/11010.html

-- 
Stephen Smalley
National Security Agency



* Re: [PATCH] [CAIF-RFC 5/8-v2] CAIF Protocol Stack
From: Randy Dunlap @ 2009-10-09 16:43 UTC (permalink / raw)
  To: sjur.brandeland
  Cc: netdev, stefano.babic, randy.dunlap, kim.xx.lilliestierna,
	christian.bejram, daniel.martensson
In-Reply-To: <1255095571-6501-6-git-send-email-sjur.brandeland@stericsson.com>

On Fri, 09 Oct 2009 15:39:28 +0200 sjur.brandeland@stericsson.com wrote:

> From: Sjur Braendeland <sjur.brandeland@stericsson.com>
> 
> Change-Id: I205c5b3baf1542e1593637ce896d8684870415be
> Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
> ---
>  net/caif/Kconfig            |   61 ++
>  net/caif/Makefile           |   56 ++
>  net/caif/caif_chnlif.c      |  219 +++++++
>  net/caif/caif_chr.c         |  374 ++++++++++++
>  net/caif/caif_config_util.c |  167 ++++++
>  net/caif/chnl_chr.c         | 1393 +++++++++++++++++++++++++++++++++++++++++++
>  net/caif/chnl_net.c         |  492 +++++++++++++++
>  7 files changed, 2762 insertions(+), 0 deletions(-)
>  create mode 100644 net/caif/Kconfig
>  create mode 100644 net/caif/Makefile
>  create mode 100644 net/caif/caif_chnlif.c
>  create mode 100644 net/caif/caif_chr.c
>  create mode 100644 net/caif/caif_config_util.c
>  create mode 100644 net/caif/chnl_chr.c
>  create mode 100644 net/caif/chnl_net.c
> 
> diff --git a/net/caif/Kconfig b/net/caif/Kconfig
> new file mode 100644
> index 0000000..7fb9e9c
> --- /dev/null
> +++ b/net/caif/Kconfig
> @@ -0,0 +1,61 @@
> +#
> +# CAIF net configurations
> +#
> +
> +#menu "Caif Support"
> +comment "CAIF Support"
> +
> +menuconfig CAIF
> +	tristate "Enable Caif support"
> +	default n
> +	---help---
> +	Say Y here if you need to use a phone modem that uses CAIF as transport

	end above with period ('.').

> +	You will also need to say yes to any caif physical devices that your platform
> +	supports.
> +	This can be either built-in or as a loadable module, if you select to build it as module

s/,/;/

> +	the other CAIF also needs to built as modules

	the other CAIF {options or drivers or some other word here} also need  ... modules.
	(end with period)


> +	See Documentation/CAIF for a further explanation on how to use and configure.
> +
> +if CAIF
> +
> +config CAIF_CHARDEV
> +	tristate "CAIF character device"
> +	default CAIF
> +	---help---
> +	Say Y if you will be using the CAIF AT type character devices.
> +	This can be either built-in or as a loadable module,
> +	If you select to build it as a built in then the main caif device must also be a builtin.
> +	If unsure say Y.
> +
> +config CAIF_NETDEV
> +	tristate "CAIF Network device"
> +	default CAIF
> +	---help---
> +	Say Y if you will be using the CAIF based network device.
> +	This can be either built-in or as a loadable module,
> +	If you select to build it as a built in then the main caif device must also be a builtin.
> +	If unsure say Y.
> +
> +
> +config  CAIF_USE_PLAIN
> +	bool  "Use plain buffers instead of SKB in caif"
> +	default n
> +	---help---
> +	Use plain buffer to transport data,

	s/,/./

> +	Select what type of internal buffering CAIF should use,
> +	skb or plain.
> +	If unsure say N hre.
> +
> +config  CAIF_DEBUG
> +	bool "Enable Debug"
> +	default n
> +	--- help ---
> +	Enable the inclusion of debug code in the caif stack,
> +	be aware that doing this will impact performance.
> +	If unsure say N here.
> +
> +# Include physical drivers
> +# source "drivers/net/caif/Kconfig"

Drop the above line.

> +source "drivers/net/caif/Kconfig"
> +endif
> +#endmenu


---
~Randy


* Re: PATCH: Network Device Naming mechanism and policy
From: Marco d'Itri @ 2009-10-09 16:56 UTC (permalink / raw)
  To: Bryan Kadzban
  Cc: Matt Domsch, Narendra K, netdev, linux-hotplug, jordan_hargrave
In-Reply-To: <4ACF6367.8040401@kadzban.is-a-geek.net>


On Oct 09, Bryan Kadzban <bryan@kadzban.is-a-geek.net> wrote:

> > As has been noted here, MAC addresses are not necessarily unique to
> > an interface.
> Only in the case of e.g. qemu (virtual hardware), I think.  (Or some
> kinds of broken hardware.
Some Sun products have multiple interfaces sharing the same MAC address.

-- 
ciao,
Marco



* Re: PATCH: Network Device Naming mechanism and policy
From: Matt Domsch @ 2009-10-09 17:17 UTC (permalink / raw)
  To: Greg KH; +Cc: Narendra K, netdev, linux-hotplug, jordan_hargrave
In-Reply-To: <20091009163613.GA3414@kroah.com>

On Fri, Oct 09, 2009 at 09:36:13AM -0700, Greg KH wrote:
> On Fri, Oct 09, 2009 at 09:00:01AM -0500, Narendra K wrote:
> > On Fri, Oct 09, 2009 at 07:12:07PM +0530, K, Narendra wrote:
> > > > example udev config:
> > > > SUBSYSTEM=="net",
> > > SYMLINK+="net/by-mac/$sysfs{ifindex}.$sysfs{address}"
> > > 
> > > work as well.  But coupling the ifindex to the MAC address like this
> > > doesn't work.  (In general, coupling any two unrelated attributes when
> > > trying to do persistent names doesn't work.)
> > > 
> > Attaching the latest patch incorporating review comments.
> > 
> > By creating character devices for every network device, we can use
> > udev to maintain alternate naming policies for devices, including
> > additional names for the same device, without interfering with the
> > name that the kernel assigns a device.
> > 
> > This is conditionalized on CONFIG_NET_CDEV.  If enabled (the default),
> > device nodes will automatically be created in /dev/netdev/ for each
> > network device.  (/dev/net/ is already populated by the tun device.)
> > 
> > These device nodes are not functional at the moment - open() returns
> > -ENOSYS.  Their only purpose is to provide userspace with a kernel
> > name to ifindex mapping, in a form that udev can easily manage.
> 
> How does this patch work with the network namespace functionality?

There is a monotonically increasing static ifindex kept in
net/core/dev.c:dev_new_index(), which is shared across all namespaces.
struct net_device ifindex field is assigned from this.  So two devices
in two different namespaces can't share an ifindex value.  However,
the device can be present (or not) in the per-namespace dev_name_hash
and dev_index_hashes.  This patch doesn't change this at all.
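
To illustrate the paragraph above, here is a toy userspace model of one
global ifindex counter combined with per-namespace lookup.  This is not
kernel code: toy_register(), toy_lookup() and the fixed-size table are
made up for this sketch, and a linear scan stands in for the
per-namespace hashes.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEVS 8

/* Toy stand-in for struct net_device: just the fields discussed above. */
struct toy_dev {
	int ifindex;	/* globally unique, never reused in this model */
	int ns;		/* id of the namespace the device lives in */
	char name[16];
};

static struct toy_dev devs[MAX_DEVS];
static size_t ndevs;
static int next_ifindex;	/* one counter shared by all namespaces */

/* Register a device in namespace 'ns'; returns its ifindex or -1 if full. */
static int toy_register(int ns, const char *name)
{
	struct toy_dev *d;

	if (ndevs == MAX_DEVS)
		return -1;
	d = &devs[ndevs++];
	d->ifindex = ++next_ifindex;
	d->ns = ns;
	strncpy(d->name, name, sizeof(d->name) - 1);
	d->name[sizeof(d->name) - 1] = '\0';
	return d->ifindex;
}

/* Lookup is scoped to a namespace: a device in another one is not found. */
static const struct toy_dev *toy_lookup(int ns, int ifindex)
{
	size_t i;

	for (i = 0; i < ndevs; i++)
		if (devs[i].ns == ns && devs[i].ifindex == ifindex)
			return &devs[i];
	return NULL;
}
```

In this model two devices named "eth0" in different namespaces get
different ifindex values, and looking up one namespace's ifindex from
the other namespace returns NULL.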

uevents aren't namespaced.  Presumably that means /dev can't be
polyinstantiated.  Therefore, all devnodes in /dev/netdev/* will be
visible to all processes, where 'ifconfig' and friends would only show
device names in the process's namespace.  This doesn't mean the app
can _do_ anything (it's the same as if it tried to act on a device
using an ifindex for a device not in its namespace), but yes, the fact
that such a device exists will be exposed.

-- 
Matt Domsch
Technology Strategist, Dell Office of the CTO
linux.dell.com & www.dell.com/linux


* Re: PATCH: Network Device Naming mechanism and policy
From: Greg KH @ 2009-10-09 17:22 UTC (permalink / raw)
  To: Matt Domsch; +Cc: Narendra K, netdev, linux-hotplug, jordan_hargrave
In-Reply-To: <20091009171724.GA11004@auslistsprd01.us.dell.com>

On Fri, Oct 09, 2009 at 12:17:24PM -0500, Matt Domsch wrote:
> 
> uevents aren't namespaced.  Presumably that means /dev can't be
> polyinstantiated.  Therefore, all devnodes in /dev/netdev/* will be
> visible to all processes, where 'ifconfig' and friends would only show
> device names in the process's namespace.  This doesn't mean the app
> can _do_ anything (it's the same as if it tried to act on a device
> using an ifindex for a device not in its namespace), but yes, the fact
> that such a device exists will be exposed.

That's the problem that the sysfs namespace patches were trying to
address.

Now I'm not saying it is a valid thing to try to work with this kind of
crazy, I was just wondering how it would work out.  Looks like it
doesn't :)

thanks,

greg k-h


* behaviour question for igb on nehalem box
From: Chris Friesen @ 2009-10-09 18:43 UTC (permalink / raw)
  To: e1000-list, Linux Network Development list, Kirsher, Jeffrey T,
	"Brandeburg, Jesse" <jesse


Hi all,

I've got some general questions around the expected behaviour of the
82576 igb net device.  (On a dual quad-core Nehalem box, if it matters.)

As a caveat, the box is running Centos 5.3 with their 2.6.18 kernel.
It's using the 1.3.16-k2 igb driver though, which looks to be the one
from mainline linux.

The igb driver is being loaded with no parameters specified.  At driver
init time, it's selecting 1 tx queue and 4 rx queues per device.

My first question is whether the number of queues makes sense.  I
couldn't figure out how this would happen since the rules for selecting
the number of queues seem to be the same for rx and tx.  Also, it's not
clear to me why it's limiting itself to 4 rx queues when I have 8
physical cores (and 16 virtual ones with hyperthreading enabled).

My second question is around how the rx queues are mapped to interrupts.
 According to /proc/interrupts there appears to be a 1:1 mapping between
queues and interrupts.  However, I've set up a test with a given amount
of traffic coming in to the device (from 4 different IP addresses and 4
ports).  Under this scenario, "ethtool -S" shows the number of packets
increasing for only rx queue 0, but I see the interrupt count going up
for two interrupts.

My final question is around smp affinity for the rx and tx queue
interrupts.  Do I need to affine the interrupt for each rx queue to a
single core to guarantee proper packet ordering, or can they be handled
on arbitrary cores?  Should the tx queue be affined to a particular core
or left to be handled by all cores?

Thanks,

Chris




* Ath5k data aborts
From: Krzysztof Halasa @ 2009-10-09 19:16 UTC (permalink / raw)
  To: linux-wireless, ath5k-devel, netdev

Hi,

I have done a small investigation. IXP425 (ARM) in big-endian mode,
EABI, mini-PCI ath5k wifi card, hostapd.

Atheros Communications Inc. Atheros AR5001X+ Wireless Network Adapter (rev 01)
Subsystem: Wistron NeWeb Corp. CM9 Wireless a/b/g MiniPCI Adapter
168c:0013 subsystem 185f:1012


Results:
Bad mode in data abort handler detected
Internal error: Oops - bad mode: 0 [#1]
LR is at ath5k_beacon_config+0x150/0x1d4 [ath5k]

This means the PCI device didn't respond on the bus or something
like that. Obviously the card is then unusable and the system needs to
be restarted.

Bisecting (I had to modify the procedure a bit since it only started to
show up after other unrelated code was merged) shows the guilty commit:
e8f055f0c3ba226ca599c14c2e5fe829f6f57cbb (ath5k: Update reset code).

The problem exists with 2.6.30, 2.6.31 and current Linus' tree.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>

----------------------------------------------
2.6.30 appears to be fixed by:

--- a/drivers/net/wireless/ath5k/reset.c
+++ b/drivers/net/wireless/ath5k/reset.c
@@ -476,7 +476,7 @@ static void ath5k_hw_set_sleep_clock(struct ath5k_hw *ah, bool enable)
 		(ah->ah_mac_version == (AR5K_SREV_AR2417 >> 4))) {
 			ath5k_hw_reg_write(ah, 0x26, AR5K_PHY_SLMT);
 			ath5k_hw_reg_write(ah, 0x0d, AR5K_PHY_SCAL);
-			ath5k_hw_reg_write(ah, 0x07, AR5K_PHY_SCLOCK);
+			ath5k_hw_reg_write(ah, 0x0C, AR5K_PHY_SCLOCK);
 			ath5k_hw_reg_write(ah, 0x3f, AR5K_PHY_SDELAY);
 			AR5K_REG_WRITE_BITS(ah, AR5K_PCICFG,
 				AR5K_PCICFG_SLEEP_CLOCK_RATE, 0x02);
@@ -490,8 +490,10 @@ static void ath5k_hw_set_sleep_clock(struct ath5k_hw *ah, bool enable)
 		}
 
 		/* Enable sleep clock operation */
+#if 0
 		AR5K_REG_ENABLE_BITS(ah, AR5K_PCICFG,
 				AR5K_PCICFG_SLEEP_CLOCK_EN);
+#endif
 
 	} else {
 


The AR5K_PHY_SCLOCK change brings back the old value (from before the
commit in question); I have no idea what it is. Leaving the new value
causes the second run of hostapd to make the driver fail; the chip seems
to stop responding. It seems the value itself may be correct (as it works
with 2.6.31+) but there is some additional bug fixed after 2.6.30; gitk
shows several candidate patches for this.


Only disabling the AR5K_PCICFG write makes the data abort go away.

----------------------------------------------
2.6.31 and Linus-current only need the AR5K_PCICFG change:

--- a/drivers/net/wireless/ath/ath5k/reset.c
+++ b/drivers/net/wireless/ath/ath5k/reset.c
@@ -489,9 +489,10 @@ static void ath5k_hw_set_sleep_clock(struct ath5k_hw *ah, bool enable)
 		}
 
 		/* Enable sleep clock operation */
+#if 0
 		AR5K_REG_ENABLE_BITS(ah, AR5K_PCICFG,
 				AR5K_PCICFG_SLEEP_CLOCK_EN);
-
+#endif
 	} else {
 
 		/* Disable sleep clock operation and


The question is, obviously, how to fix that for good. I can test the
result.


Full error message (not sure why the backtrace isn't printed):

Bad mode in data abort handler detected
Internal error: Oops - bad mode: 0 [#1]
Modules linked in: ohci_hcd ehci_hcd usbcore nls_base ixp4xx_hss ath5k ath ixp4xx_eth
CPU: 0    Not tainted  (2.6.32-rc3 #123)
PC is at 0xffff01fc
LR is at ath5k_beacon_config+0x150/0x1d4 [ath5k]
pc : [<ffff01fc>]    lr : [<bf028db0>]    psr: a0000092
sp : c7dbfb90  ip : 00008050  fp : c78aa000
r10: c7dbfbd8  r9 : c78ac1c0  r8 : 00003304
r7 : c78aa000  r6 : c78aa000  r5 : 00000013  r4 : c78ac900
r3 : c88e0000  r2 : c88d0024  r1 : c88d0048  r0 : 800924b5
Flags: NzCv  IRQs off  FIQs on  Mode IRQ_32  ISA ARM  Segment user
Control: 000039ff  Table: 067e0000  DAC: 00000015
Process hostapd (pid: 258, stack limit = 0xc7dbe278)
Stack: (0xc7dbfb90 to 0xc7dc0000)
fb80:                                     800924b5 c88d0048 c88d0024 c88e0000 
fba0: c78ac900 00000013 c78aa000 c78aa000 00003304 c78ac1c0 c7dbfbd8 c78aa000 
fbc0: 00008050 c7dbfb90 bf028db0 ffff01fc a0000092 ffffffff 00000003 00000000 
fbe0: 00080000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
fc00: c78ac924 c78ac900 c78ac924 c7d34628 00000300 c7d34620 00000013 bf028ec8 
fc20: c78ac1c0 c7d52980 c0487e90 c7d342c0 c78ac1c0 c67e7140 c7caef20 0000001a 
fc40: 00000004 00000000 00000024 c0391694 c67e7140 c7caef20 0000001a c7dbfc88 
fc60: c7d342c0 c039e974 c7d52440 00000033 c7dbfcc0 c039e998 c7dbfc88 c0487d30 
fc80: c7c6e810 c0384640 00000000 00000000 00000000 00000002 00000000 00000000 
fca0: c7d34000 c78ac000 c0487bb0 c7c6e800 c7d52440 c02c5448 c0488184 00000102 
fcc0: 00000080 00000102 c7c6e800 c7c6e810 c7c6e814 c788d000 c7c6e800 c7d52440 
fce0: c02c5258 c7c54600 c04a0710 00000038 c7dbfd1c c02c42dc c047f6ac c7d52440 
fd00: c7d52440 c02c5244 c788d200 00000000 c7d52440 c02c3f14 00000024 7fffffff 
fd20: 00000000 c7dbff5c c7c54600 c7d52440 c7dbfe18 00000024 00000000 00000000 
fd40: 00000000 c02c47cc c7dbfe38 c7c54600 00000102 00000000 00000000 00000000 
fd60: 00000000 c7dbff5c c7dbfe18 00000000 00000000 00000024 c7dbfefc 00000000 
fd80: 00000008 c0282298 00000000 c7dbff1c 00000000 00000001 ffffffff 00000000 
fda0: 00000000 00000000 00000000 00000000 c7839080 00000001 00000000 00000000 
fdc0: 00000000 c7839080 c0185ccc c7dbfdcc c7dbfdcc 0000092a c7dbfec8 c038eae4 
fde0: c7dbfe18 00008b24 c7dbfec8 c037ab88 c7dbfdec c7dbfe0c c67d7360 c01aaf9c 
fe00: c7dbfe38 c0161cfc 00000000 c67da1a4 00000040 00000000 00000000 c74231bc 
fe20: 00000015 00000024 c7489380 0001d000 c7dbfd50 c7dbff5c c7dbfe7c c7dbfefc 
fe40: c7dbfe7c c7dbfefc c015a048 c7dbfefc 00000008 00000000 c7dbff5c c7dbff5c 
fe60: c7489380 00000000 c7dbfefc c02823ec c7dbff3c c7dbff3c 00000000 00100000 
fe80: 00000000 00000000 00000020 00000000 00008933 c02941a4 c67e4380 c67e4000 
fea0: 0000000a c67e4380 c7dbe000 c047bc28 c786d940 c024df48 776c616e 30000000 
fec0: 00000000 00000000 00000006 00000000 00000000 0e000000 c67e40e0 c67e4084 
fee0: 00000000 c028171c c786d340 00008933 00000000 60000013 00000007 0005c754 
ff00: 00000000 00000000 00000000 00000000 c74890e8 c0472750 00200200 00100100 
ff20: c7497338 c7401498 00200200 00100100 ffffffff ffffffff c780d5a0 c01ce3a0 
ff40: c7497338 c0472750 00200200 c01cea04 c786d340 00000000 c7497338 c7dbfe7c 
ff60: 0000000c c7dbfefc 00000001 00000000 00000000 00000000 00000000 ffffff97 
ff80: c786d340 000598f8 400722b0 000598a0 00000128 c015a048 c7dbe000 00000000 
ffa0: 00000001 c0159ea0 000598f8 400722b0 00000004 be9dfb24 00000000 00000000 
ffc0: 000598f8 400722b0 000598a0 00000128 00000000 00000000 00000001 00000001 
ffe0: be9dfb24 be9dfaf8 40039c84 402b022c 60000010 00000004 00000000 00000000 
Code: 00000000 00000000 00000000 00000000 (00000000) 
---[ end trace ff977de942e87c2d ]---

-- 
Krzysztof Halasa


* Re: [PATCH] Generalize socket rx gap / receive queue overflow cmsg (v2)
From: Neil Horman @ 2009-10-09 19:35 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, davem, socketcan, nhorman
In-Reply-To: <20091007180835.GB20524@hmsreliant.think-freely.org>

Ok, take two of this patch, taking in Eric's notes:

Change Notes:

1) Locking on dropcount cleaned up

2) Support for reading of dropcount moved to a lower level support function
(sock_recv_ts_and_drops, modeled after sock_recv_timestamp).  This should make
this work a good deal faster.

3) Socket flags moved to sk->sk_flags structure in support of (2)

Works well for me.


========================================================================

Create a new socket level option to report number of queue overflows

Recently I augmented the AF_PACKET protocol to report the number of frames lost
on the socket receive queue between any two enqueued frames.  This value was
exported via a SOL_PACKET level cmsg.  After I completed that work it was
requested that this feature be generalized so that any datagram oriented socket
could make use of this option.  As such I've created this patch.  It creates a
new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a
SOL_SOCKET level cmsg that reports the number of times the sk_receive_queue
overflowed between any two given frames.  It also augments the AF_PACKET
protocol to take advantage of this new feature (as it previously did not touch
sk->sk_drops, which this patch uses to record the overflow count).  Tested
successfully by me.

Notes:

1) Unlike my previous patch, this patch simply records the sk_drops value, which
is not a number of drops between packets, but rather a total number of drops.
Deltas must be computed in user space.

2) While this patch currently works with datagram oriented protocols, it will
also be accepted by non-datagram oriented protocols. I'm not sure if that's
agreeable to everyone, but my argument in favor of doing so is that, for those
protocols which aren't applicable to this option, sk_drops will always be zero,
and reporting no drops on a receive queue that isn't used for those
non-participating protocols seems reasonable to me.  This also saves us having
to code in a per-protocol opt in mechanism.

3) This applies cleanly to net-next assuming that commit
977750076d98c7ff6cbda51858bb5a5894a9d9ab (my af packet cmsg patch) is reverted
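
For note 1, a minimal userspace sketch of how a receiver might consume
the new cmsg and compute deltas.  It assumes the option value 40 from
the asm-generic/socket.h hunk of this patch; the helper names
(enable_rxq_ovfl, read_drop_total, drop_delta) are hypothetical and not
part of the patch.  Since sock_recv_ts_and_drops() only attaches the
cmsg when skb->dropcount is non-zero, a message without it leaves the
caller's running total untouched.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SO_RXQ_OVFL
#define SO_RXQ_OVFL 40	/* assumption: value taken from this patch */
#endif

/* Ask the kernel to attach the drop counter to received messages. */
static int enable_rxq_ovfl(int fd)
{
	int on = 1;

	return setsockopt(fd, SOL_SOCKET, SO_RXQ_OVFL, &on, sizeof(on));
}

/*
 * Scan the ancillary data filled in by recvmsg() for the SO_RXQ_OVFL
 * cmsg.  Returns 1 and stores the total drop count in *total if it is
 * present, 0 otherwise (in which case *total is left unchanged).
 */
static int read_drop_total(struct msghdr *msg, uint32_t *total)
{
	struct cmsghdr *c;

	for (c = CMSG_FIRSTHDR(msg); c; c = CMSG_NXTHDR(msg, c)) {
		if (c->cmsg_level == SOL_SOCKET &&
		    c->cmsg_type == SO_RXQ_OVFL) {
			memcpy(total, CMSG_DATA(c), sizeof(*total));
			return 1;
		}
	}
	return 0;
}

/*
 * Drops between two reported totals.  Unsigned 32-bit subtraction gives
 * the right answer even if the counter wrapped once in between.
 */
static uint32_t drop_delta(uint32_t prev, uint32_t cur)
{
	return cur - prev;
}
```

A receive loop would call enable_rxq_ovfl() once, pass a control buffer
of at least CMSG_SPACE(sizeof(uint32_t)) bytes to each recvmsg(), and
feed the previous and current totals to drop_delta().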

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 include/asm-generic/socket.h |    1 +
 include/linux/skbuff.h       |    6 ++++--
 include/net/sock.h           |   13 +++++++++++++
 net/atm/common.c             |    2 +-
 net/bluetooth/af_bluetooth.c |    2 +-
 net/bluetooth/rfcomm/sock.c  |    2 +-
 net/can/bcm.c                |    2 +-
 net/can/raw.c                |    2 +-
 net/core/sock.c              |   17 ++++++++++++++++-
 net/ieee802154/dgram.c       |    2 +-
 net/ieee802154/raw.c         |    2 +-
 net/ipv4/raw.c               |    2 +-
 net/ipv4/udp.c               |    2 +-
 net/ipv6/raw.c               |    2 +-
 net/ipv6/udp.c               |    2 +-
 net/key/af_key.c             |    2 +-
 net/packet/af_packet.c       |    7 +++----
 net/rxrpc/ar-recvmsg.c       |    2 +-
 net/sctp/socket.c            |    2 +-
 net/socket.c                 |    7 +++++++
 20 files changed, 58 insertions(+), 21 deletions(-)

diff --git a/include/asm-generic/socket.h b/include/asm-generic/socket.h
index 538991c..9a6115e 100644
--- a/include/asm-generic/socket.h
+++ b/include/asm-generic/socket.h
@@ -63,4 +63,5 @@
 #define SO_PROTOCOL		38
 #define SO_DOMAIN		39
 
+#define SO_RXQ_OVFL             40
 #endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index df7b23a..8c866b5 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -389,8 +389,10 @@ struct sk_buff {
 #ifdef CONFIG_NETWORK_SECMARK
 	__u32			secmark;
 #endif
-
-	__u32			mark;
+	union {
+		__u32		mark;
+		__u32		dropcount;
+	};
 
 	__u16			vlan_tci;
 
diff --git a/include/net/sock.h b/include/net/sock.h
index 98398bd..ae48d99 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -505,6 +505,7 @@ enum sock_flags {
 	SOCK_TIMESTAMPING_RAW_HARDWARE, /* %SOF_TIMESTAMPING_RAW_HARDWARE */
 	SOCK_TIMESTAMPING_SYS_HARDWARE, /* %SOF_TIMESTAMPING_SYS_HARDWARE */
 	SOCK_FASYNC, /* fasync() active */
+	SOCK_RXQ_OVFL,
 };
 
 static inline void sock_copy_flags(struct sock *nsk, struct sock *osk)
@@ -1493,6 +1494,18 @@ sock_recv_timestamp(struct msghdr *msg, struct sock *sk, struct sk_buff *skb)
 		sk->sk_stamp = kt;
 }
 
+extern void __sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk,
+	struct sk_buff *skb);
+
+static __inline__ void
+sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk, struct sk_buff *skb)
+{
+	sock_recv_timestamp(msg, sk, skb);
+
+	if (sock_flag(sk, SOCK_RXQ_OVFL) && skb && skb->dropcount)
+		__sock_recv_ts_and_drops(msg, sk, skb);
+}
+
 /**
  * sock_tx_timestamp - checks whether the outgoing packet is to be time stamped
  * @msg:	outgoing packet
diff --git a/net/atm/common.c b/net/atm/common.c
index 950bd16..d61e051 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -496,7 +496,7 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	error = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
 	if (error)
 		return error;
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 	pr_debug("RcvM %d -= %d\n", atomic_read(&sk->sk_rmem_alloc), skb->truesize);
 	atm_return(vcc, skb->truesize);
 	skb_free_datagram(sk, skb);
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index 1f6e49c..399e59c 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -257,7 +257,7 @@ int bt_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	skb_reset_transport_header(skb);
 	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
 	if (err == 0)
-		sock_recv_timestamp(msg, sk, skb);
+		sock_recv_ts_and_drops(msg, sk, skb);
 
 	skb_free_datagram(sk, skb);
 
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index c707865..d3bfc1b 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -703,7 +703,7 @@ static int rfcomm_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 		copied += chunk;
 		size   -= chunk;
 
-		sock_recv_timestamp(msg, sk, skb);
+		sock_recv_ts_and_drops(msg, sk, skb);
 
 		if (!(flags & MSG_PEEK)) {
 			atomic_sub(chunk, &sk->sk_rmem_alloc);
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 597da4f..2f47039 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1534,7 +1534,7 @@ static int bcm_recvmsg(struct kiocb *iocb, struct socket *sock,
 		return err;
 	}
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 
 	if (msg->msg_name) {
 		msg->msg_namelen = sizeof(struct sockaddr_can);
diff --git a/net/can/raw.c b/net/can/raw.c
index b5e8979..962fc9f 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -702,7 +702,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct socket *sock,
 		return err;
 	}
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 
 	if (msg->msg_name) {
 		msg->msg_namelen = sizeof(struct sockaddr_can);
diff --git a/net/core/sock.c b/net/core/sock.c
index 7626b6a..0897311 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -276,6 +276,8 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 {
 	int err = 0;
 	int skb_len;
+	unsigned long flags;
+	struct sk_buff_head *list = &sk->sk_receive_queue;
 
 	/* Cast sk->rcvbuf to unsigned... It's pointless, but reduces
 	   number of warnings when compiling with -W --ANK
@@ -305,7 +307,10 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	 */
 	skb_len = skb->len;
 
-	skb_queue_tail(&sk->sk_receive_queue, skb);
+	spin_lock_irqsave(&list->lock, flags);
+	skb->dropcount = atomic_read(&sk->sk_drops);
+	__skb_queue_tail(list, skb);
+	spin_unlock_irqrestore(&list->lock, flags);
 
 	if (!sock_flag(sk, SOCK_DEAD))
 		sk->sk_data_ready(sk, skb_len);
@@ -702,6 +707,12 @@ set_rcvbuf:
 
 		/* We implement the SO_SNDLOWAT etc to
 		   not be settable (1003.1g 5.3) */
+	case SO_RXQ_OVFL:
+		if (valbool)
+			sock_set_flag(sk, SOCK_RXQ_OVFL);
+		else
+			sock_reset_flag(sk, SOCK_RXQ_OVFL);
+		break;
 	default:
 		ret = -ENOPROTOOPT;
 		break;
@@ -901,6 +912,10 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
 		v.val = sk->sk_mark;
 		break;
 
+	case SO_RXQ_OVFL:
+		v.val = sock_flag(sk, SOCK_RXQ_OVFL);
+		break;
+
 	default:
 		return -ENOPROTOOPT;
 	}
diff --git a/net/ieee802154/dgram.c b/net/ieee802154/dgram.c
index a413b1b..25ad956 100644
--- a/net/ieee802154/dgram.c
+++ b/net/ieee802154/dgram.c
@@ -303,7 +303,7 @@ static int dgram_recvmsg(struct kiocb *iocb, struct sock *sk,
 	if (err)
 		goto done;
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 
 	if (flags & MSG_TRUNC)
 		copied = skb->len;
diff --git a/net/ieee802154/raw.c b/net/ieee802154/raw.c
index 30e74ee..769c8d1 100644
--- a/net/ieee802154/raw.c
+++ b/net/ieee802154/raw.c
@@ -191,7 +191,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	if (err)
 		goto done;
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 
 	if (flags & MSG_TRUNC)
 		copied = skb->len;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 757c917..f18172b 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -682,7 +682,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	if (err)
 		goto done;
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 
 	/* Copy the address. */
 	if (sin) {
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 6ec6a8a..bb96eee 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -951,7 +951,7 @@ try_again:
 		UDP_INC_STATS_USER(sock_net(sk),
 				UDP_MIB_INDATAGRAMS, is_udplite);
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 
 	/* Copy the address. */
 	if (sin) {
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 4f24570..d8375bc 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -497,7 +497,7 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 			sin6->sin6_scope_id = IP6CB(skb)->iif;
 	}
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 
 	if (np->rxopt.all)
 		datagram_recv_ctl(sk, msg, skb);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index c6a303e..b51ee64 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -252,7 +252,7 @@ try_again:
 					UDP_MIB_INDATAGRAMS, is_udplite);
 	}
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 
 	/* Copy the address. */
 	if (msg->msg_name) {
diff --git a/net/key/af_key.c b/net/key/af_key.c
index c078ae6..472f659 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3606,7 +3606,7 @@ static int pfkey_recvmsg(struct kiocb *kiocb,
 	if (err)
 		goto out_free;
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 
 	err = (flags & MSG_TRUNC) ? skb->len : copied;
 
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index f87ed48..bf3a295 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -627,15 +627,14 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev,
 
 	spin_lock(&sk->sk_receive_queue.lock);
 	po->stats.tp_packets++;
+	skb->dropcount = atomic_read(&sk->sk_drops);
 	__skb_queue_tail(&sk->sk_receive_queue, skb);
 	spin_unlock(&sk->sk_receive_queue.lock);
 	sk->sk_data_ready(sk, skb->len);
 	return 0;
 
 drop_n_acct:
-	spin_lock(&sk->sk_receive_queue.lock);
-	po->stats.tp_drops++;
-	spin_unlock(&sk->sk_receive_queue.lock);
+	po->stats.tp_drops = atomic_inc_return(&sk->sk_drops);
 
 drop_n_restore:
 	if (skb_head != skb->data && skb_shared(skb)) {
@@ -1478,7 +1477,7 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (err)
 		goto out_free;
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 
 	if (msg->msg_name)
 		memcpy(msg->msg_name, &PACKET_SKB_CB(skb)->sa,
diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
index a39bf97..60c2b94 100644
--- a/net/rxrpc/ar-recvmsg.c
+++ b/net/rxrpc/ar-recvmsg.c
@@ -146,7 +146,7 @@ int rxrpc_recvmsg(struct kiocb *iocb, struct socket *sock,
 				memcpy(msg->msg_name,
 				       &call->conn->trans->peer->srx,
 				       sizeof(call->conn->trans->peer->srx));
-			sock_recv_timestamp(msg, &rx->sk, skb);
+			sock_recv_ts_and_drops(msg, &rx->sk, skb);
 		}
 
 		/* receive the message */
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index c8d0575..0970e92 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1958,7 +1958,7 @@ SCTP_STATIC int sctp_recvmsg(struct kiocb *iocb, struct sock *sk,
 	if (err)
 		goto out_free;
 
-	sock_recv_timestamp(msg, sk, skb);
+	sock_recv_ts_and_drops(msg, sk, skb);
 	if (sctp_ulpevent_is_notification(event)) {
 		msg->msg_flags |= MSG_NOTIFICATION;
 		sp->pf->event_msgname(event, msg->msg_name, addr_len);
diff --git a/net/socket.c b/net/socket.c
index d53ad11..c82146c 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -668,6 +668,13 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
 
 EXPORT_SYMBOL_GPL(__sock_recv_timestamp);
 
+void __sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk,
+	struct sk_buff *skb)
+{
+	put_cmsg(msg, SOL_SOCKET, SO_RXQ_OVFL, sizeof(__u32), &skb->dropcount);
+}
+EXPORT_SYMBOL_GPL(__sock_recv_ts_and_drops);
+
 static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 				 struct msghdr *msg, size_t size, int flags)
 {
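
From userspace, the counter that the hunk above attaches with put_cmsg()
would be read by walking the ancillary data returned by recvmsg(). The
following is a minimal sketch, assuming the option has already been enabled
on the socket. The helper name rxq_ovfl_from_msg() is hypothetical, and the
fallback value 40 for SO_RXQ_OVFL is the generic Linux value (some
architectures use a different number).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SO_RXQ_OVFL
#define SO_RXQ_OVFL 40	/* assumption: generic Linux value; some arches differ */
#endif

/*
 * Hypothetical helper: scan the control data filled in by recvmsg() and
 * copy out the 32-bit drop counter, if one was attached.
 * Returns 1 when a SOL_SOCKET/SO_RXQ_OVFL message was found, 0 otherwise.
 */
static int rxq_ovfl_from_msg(struct msghdr *msg, uint32_t *drops)
{
	struct cmsghdr *cmsg;

	assert(msg != NULL && drops != NULL);

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SO_RXQ_OVFL &&
		    cmsg->cmsg_len >= CMSG_LEN(sizeof(*drops))) {
			/* CMSG_DATA() may be unaligned for uint32_t, so copy */
			memcpy(drops, CMSG_DATA(cmsg), sizeof(*drops));
			return 1;
		}
	}
	return 0;
}
```

A caller would point msg_control at a buffer of at least
CMSG_SPACE(sizeof(uint32_t)) bytes before recvmsg() and treat a return of 0
as "no drop information attached to this packet".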

^ permalink raw reply related

* Re: [RFCv4 PATCH 1/2] net: Introduce recvmmsg socket syscall
From: Arnaldo Carvalho de Melo @ 2009-10-09 19:35 UTC (permalink / raw)
  To: David Miller
  Cc: Caitlin Bestler, Chris Van Hoof, Clark Williams, Neil Horman,
	Nir Tzachar, Nivedita Singhvi, Paul Moore,
	Rémi Denis-Courmont, Steven Whitehouse,
	Linux Networking Development Mailing List
In-Reply-To: <20090916170738.GC7699@ghostprotocols.net>

Em Wed, Sep 16, 2009 at 02:07:38PM -0300, Arnaldo Carvalho de Melo escreveu:
> Meaning receive multiple messages, reducing the number of syscalls and
> net stack entry/exit operations.
>
> Next patches will introduce mechanisms where protocols that want to
> optimize this operation will provide an unlocked_recvmsg operation.

Hi Dave,

	The second patch in this series has issues that I still have to
investigate properly; I want to study removing the skb_queue_head lock,
as TCP does. The first patch seems to be OK, though, and is already
providing good results, at least as reported by Nir. If there aren't any
other concerns about the API, can we get it into net-next-2.6?

Best Regards,

- Arnaldo
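
To illustrate the "receive multiple messages per syscall" idea quoted
above, here is a minimal userspace sketch. It assumes the recvmmsg()
prototype exposed by glibc today (with a trailing timeout argument), which
may differ from the RFCv4 interface discussed in this thread. The helper
name recv_batch() and the BATCH/BUFSZ sizes are hypothetical.

```c
#define _GNU_SOURCE	/* recvmmsg() and struct mmsghdr are GNU extensions */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH	8
#define BUFSZ	2048

/*
 * Hypothetical helper: drain up to BATCH datagrams from fd with a single
 * syscall. lens[i] receives the length of the i-th datagram.
 * Returns the number of datagrams received, or -1 with errno set.
 */
static int recv_batch(int fd, char bufs[BATCH][BUFSZ], unsigned int lens[BATCH])
{
	struct mmsghdr msgs[BATCH];
	struct iovec iov[BATCH];
	int i, n;

	assert(fd >= 0);

	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < BATCH; i++) {
		iov[i].iov_base = bufs[i];
		iov[i].iov_len = BUFSZ;
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	/* MSG_DONTWAIT: return whatever is already queued, do not block */
	n = recvmmsg(fd, msgs, BATCH, MSG_DONTWAIT, NULL);
	for (i = 0; i < n; i++)
		lens[i] = msgs[i].msg_len;
	return n;
}
```

With three datagrams already queued, one call returns 3 where a plain
recvmsg() loop would need three syscalls.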

^ permalink raw reply

* Re: [PATCH] net: Fix struct sock bitfield annotation
From: Christoph Lameter @ 2009-10-09 19:39 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, Vegard Nossum, Linux Netdev List, Ingo Molnar
In-Reply-To: <4ACE95E1.30301@gmail.com>

On Fri, 9 Oct 2009, Eric Dumazet wrote:

> For networking guys, here is the actual mess with "struct sock" on x86_64,
> related to UDP handling (critical latencies for some people). We basically touch
> all cache lines, in every paths, bad effects on SMP...

Please keep me posted on this. I am very interested in this work.

Some simple shuffling around may do some good here.
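
As a rough illustration of what such shuffling buys, the sketch below
counts how many cache lines one code path touches before and after grouping
its fields. Everything here is an assumption made for the example: the
64-byte line size, the two demo structures (which are not the real struct
sock) and the helper lines_touched().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE	64	/* assumption: 64-byte lines, as on typical x86_64 */

/* Hypothetical, simplified stand-in for a socket; not the real struct sock. */
struct demo_sock {
	uint32_t rx_qlen;		/* rx path */
	char	 other_a[60];
	uint32_t tx_pending;		/* tx path */
	char	 other_b[60];
	uint32_t rx_drops;		/* rx path */
	char	 other_c[60];
	uint32_t rx_forward_alloc;	/* rx path */
};

/* Same members, with the rx-path fields grouped at the front. */
struct demo_sock_reordered {
	uint32_t rx_qlen;
	uint32_t rx_drops;
	uint32_t rx_forward_alloc;
	char	 other_a[52];
	uint32_t tx_pending;
	char	 other_b[60];
	char	 other_c[60];
};

/* Count the distinct cache lines covered by n field offsets (n <= 64). */
static unsigned int lines_touched(const size_t *offsets, size_t n)
{
	size_t seen[64];
	unsigned int count = 0;
	size_t i, j;

	assert(n <= 64);
	for (i = 0; i < n; i++) {
		size_t line = offsets[i] / CACHE_LINE;

		for (j = 0; j < count; j++)
			if (seen[j] == line)
				break;
		if (j == count)
			seen[count++] = line;
	}
	return count;
}
```

Feeding it the offsetof() values of the three rx fields gives three lines
for the first layout and one for the reordered layout.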


^ permalink raw reply

* Re: [PATCH] irda/sa1100_ir: check return value of startup hook
From: Dmitry Artamonow @ 2009-10-09 20:12 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: netdev, Samuel Ortiz, Russell King, David S. Miller,
	linux-arm-kernel
In-Reply-To: <4ACF293C.7070803@ru.mvista.com>

[-- Attachment #1: Type: text/plain, Size: 367 bytes --]

On 16:14 Fri 09 Oct     , Sergei Shtylyov wrote:

[...]
> > -	if (si->pdata->startup)
> > -		si->pdata->startup(si->dev);
> > +	if (si->pdata->startup)	{
> > +		ret = si->pdata->startup(si->dev);
> > +		if (ret)
> > +			return ret;
> > +		}
> 
>     Overindented brace.
> 

Nice catch, thanks!

Updated patch in attachment.

-- 
Best regards,
Dmitry "MAD" Artamonow


[-- Attachment #2: 0001-irda-sa1100_ir-check-return-value-of-startup-hook.patch --]
[-- Type: text/plain, Size: 933 bytes --]

>From ba1fe701950634aae46aa59431633e99f8bd18cc Mon Sep 17 00:00:00 2001
From: Dmitry Artamonow <mad_soft@inbox.ru>
Date: Fri, 9 Oct 2009 21:56:21 +0400
Subject: [PATCH v2] irda/sa1100_ir: check return value of startup hook

Signed-off-by: Dmitry Artamonow <mad_soft@inbox.ru>
---
 drivers/net/irda/sa1100_ir.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/irda/sa1100_ir.c b/drivers/net/irda/sa1100_ir.c
index 38bf7cf..c412e80 100644
--- a/drivers/net/irda/sa1100_ir.c
+++ b/drivers/net/irda/sa1100_ir.c
@@ -232,8 +232,11 @@ static int sa1100_irda_startup(struct sa1100_irda *si)
 	/*
 	 * Ensure that the ports for this device are setup correctly.
 	 */
-	if (si->pdata->startup)
-		si->pdata->startup(si->dev);
+	if (si->pdata->startup)	{
+		ret = si->pdata->startup(si->dev);
+		if (ret)
+			return ret;
+	}
 
 	/*
 	 * Configure PPC for IRDA - we want to drive TXD2 low.
-- 
1.6.3.4



^ permalink raw reply related

* Re: behaviour question for igb on nehalem box
From: Brandeburg, Jesse @ 2009-10-09 20:22 UTC (permalink / raw)
  To: Chris Friesen
  Cc: e1000-list, Linux Network Development list, Allan, Bruce W,
	Ronciak, John, Kirsher, Jeffrey T
In-Reply-To: <4ACF8466.5030309@nortel.com>

On Fri, 9 Oct 2009, Chris Friesen wrote:
> I've got some general questions around the expected behaviour of the
> 82576 igb net device.  (On a dual quad-core Nehalem box, if it matters.)
> 
> As a caveat, the box is running Centos 5.3 with their 2.6.18 kernel.
> It's using the 1.3.16-k2 igb driver though, which looks to be the one
> from mainline linux.
> 
> The igb driver is being loaded with no parameters specified.  At driver
> init time, it's selecting 1 tx queue and 4 rx queues per device.
> 
> My first question is whether the number of queues makes sense.  I

It does for this kernel, because 2.6.18 doesn't support multiple tx
queues.  The hardware supports RSS over receive queues, and the driver
doesn't expose the multiple receive queues to the OS.

> couldn't figure out how this would happen since the rules for selecting
> the number of queues seems to be the same for rx and tx.  Also, it's not
> clear to me why it's limiting itself to 4 rx queues when I have 8
> physical cores (and 16 virtual ones with hyperthreading enabled).

For gigabit, more queues are not necessarily better, and MQ arguably isn't
necessary at all for gigabit.  However, it can help for some workloads
by spreading out RX traffic.  The hardware you have only supports 8
queues (rx and tx), and the driver is configured to set up at most 4.

> My second question is around how the rx queues are mapped to interrupts.
>  According to /proc/interrupts there appears to be a 1:1 mapping between
> queues and interrupts.  However, I've set up at test with a given amount
> of traffic coming in to the device (from 4 different IP addresses and 4
> ports).  Under this scenario, "ethtool -S" shows the number of packets
> increasing for only rx queue 0, but I see the interrupt count going up
> for two interrupts.

One transmit interrupt and one receive interrupt?  RSS will spread the
receive work out in a flow-based way, based on the ip/xDP header.  Your test
as described should be using more than one flow (and therefore more than
one rx queue) unless you got caught out by the default arp_filter
behavior (check arp -an).
 
> My final question is around smp affinity for the rx and tx queue
> interrupts.  Do I need to affine the interrupt for each rx queue to a
> single core to guarantee proper packet ordering, or can they be handled
> on arbitrary cores?  Should the tx queue be affined to a particular core
> or left to be handled by all cores?

On RHEL5.3 you can use irqbalance; you shouldn't need to hand affine
anything.  Packets won't be received out of order unless you have the rx
interrupts going to more than one cpu per queue (smp_affinity mask has
more than one bit set).  RSS is doing flow steering.

Going to a 2.6.27 or newer kernel will get you full tx multiqueue support.

Hope this helps,
  Jesse


^ permalink raw reply

* Re: [PATCH] net: Fix struct sock bitfield annotation
From: Eric Dumazet @ 2009-10-09 20:41 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: David S. Miller, Vegard Nossum, Linux Netdev List, Ingo Molnar
In-Reply-To: <alpine.DEB.1.10.0910091539170.32209@gentwo.org>

Christoph Lameter a écrit :
> On Fri, 9 Oct 2009, Eric Dumazet wrote:
> 
>> For networking guys, here is the actual mess with "struct sock" on x86_64,
>> related to UDP handling (critical latencies for some people). We basically touch
>> all cache lines, in every paths, bad effects on SMP...
> 
> Please keep me posted on this. I am very interested in this work.
> 
> Some simple shuffling around may do some good here.
> 

Sure, will do, but first I want to suppress the lock_sock()/release_sock() in
the rx path, which was added for the sk_forward_alloc thing. This really hurts,
because of the backlog handling.

I have a preliminary patch that restores the UDP latencies we had in the past ;)

The trick is that for UDP, sk_forward_alloc is not updated by both tx and rx,
only by rx. So we can use the sk_receive_queue.lock to forbid concurrent updates.

As this lock is already hot and only used by rx, we won't have to
dirty the sk_lock, which will only be used by the tx path.

Then we can carefully reorder struct sock to lower the number of cache lines
needed for each path.


Patch against linux-2.6 git tree

 net/core/sock.c |    9 ++++
 net/ipv4/udp.c  |   89 ++++++++++++++++++++++------------------------
 2 files changed, 51 insertions(+), 47 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 7626b6a..45212d4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -276,6 +276,7 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 {
 	int err = 0;
 	int skb_len;
+	unsigned long flags;
 
 	/* Cast sk->rcvbuf to unsigned... It's pointless, but reduces
 	   number of warnings when compiling with -W --ANK
@@ -290,8 +291,12 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	if (err)
 		goto out;
 
+	skb_orphan(skb);
+
+	spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
 	if (!sk_rmem_schedule(sk, skb->truesize)) {
 		err = -ENOBUFS;
+		spin_unlock_irqrestore(&sk->sk_receive_queue.lock, flags);
 		goto out;
 	}
 
@@ -305,7 +310,9 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	 */
 	skb_len = skb->len;
 
-	skb_queue_tail(&sk->sk_receive_queue, skb);
+	__skb_queue_tail(&sk->sk_receive_queue, skb);
+
+	spin_unlock_irqrestore(&sk->sk_receive_queue.lock, flags);
 
 	if (!sock_flag(sk, SOCK_DEAD))
 		sk->sk_data_ready(sk, skb_len);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 6ec6a8a..e8a1be4 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -841,6 +841,36 @@ out:
 	return ret;
 }
 
+
+/**
+ *	first_packet_length	- return length of first packet in receive queue
+ *	@sk: socket
+ *
+ *	Drops all bad checksum frames, until a valid one is found.
+ *	Returns the length of found skb, or 0 if none is found.
+ */
+static unsigned int first_packet_length(struct sock *sk)
+{
+	struct sk_buff_head *rcvq = &sk->sk_receive_queue;
+	struct sk_buff *skb;
+	unsigned int res;
+
+	spin_lock_bh(&rcvq->lock);
+
+	while ((skb = skb_peek(rcvq)) != NULL &&
+		udp_lib_checksum_complete(skb)) {
+		UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS,
+				 IS_UDPLITE(sk));
+		__skb_unlink(skb, rcvq);
+		skb_kill_datagram(sk, skb, 0);
+	}
+	res = skb ? skb->len : 0;
+
+	spin_unlock_bh(&rcvq->lock);
+
+	return res;
+}
+
 /*
  *	IOCTL requests applicable to the UDP protocol
  */
@@ -857,21 +887,16 @@ int udp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 
 	case SIOCINQ:
 	{
-		struct sk_buff *skb;
-		unsigned long amount;
+		unsigned int amount = first_packet_length(sk);
 
-		amount = 0;
-		spin_lock_bh(&sk->sk_receive_queue.lock);
-		skb = skb_peek(&sk->sk_receive_queue);
-		if (skb != NULL) {
+		if (amount)
 			/*
 			 * We will only return the amount
 			 * of this packet since that is all
 			 * that will be read.
 			 */
-			amount = skb->len - sizeof(struct udphdr);
-		}
-		spin_unlock_bh(&sk->sk_receive_queue.lock);
+			amount -= sizeof(struct udphdr);
+
 		return put_user(amount, (int __user *)arg);
 	}
 
@@ -968,17 +993,17 @@ try_again:
 		err = ulen;
 
 out_free:
-	lock_sock(sk);
+	spin_lock_bh(&sk->sk_receive_queue.lock);
 	skb_free_datagram(sk, skb);
-	release_sock(sk);
+	spin_unlock_bh(&sk->sk_receive_queue.lock);
 out:
 	return err;
 
 csum_copy_err:
-	lock_sock(sk);
+	spin_lock_bh(&sk->sk_receive_queue.lock);
 	if (!skb_kill_datagram(sk, skb, flags))
-		UDP_INC_STATS_USER(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
-	release_sock(sk);
+		UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
+	spin_unlock_bh(&sk->sk_receive_queue.lock);
 
 	if (noblock)
 		return -EAGAIN;
@@ -1060,7 +1085,6 @@ drop:
 int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 {
 	struct udp_sock *up = udp_sk(sk);
-	int rc;
 	int is_udplite = IS_UDPLITE(sk);
 
 	/*
@@ -1140,16 +1164,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 			goto drop;
 	}
 
-	rc = 0;
-
-	bh_lock_sock(sk);
-	if (!sock_owned_by_user(sk))
-		rc = __udp_queue_rcv_skb(sk, skb);
-	else
-		sk_add_backlog(sk, skb);
-	bh_unlock_sock(sk);
-
-	return rc;
+	return __udp_queue_rcv_skb(sk, skb);
 
 drop:
 	UDP_INC_STATS_BH(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
@@ -1540,29 +1555,11 @@ unsigned int udp_poll(struct file *file, struct socket *sock, poll_table *wait)
 {
 	unsigned int mask = datagram_poll(file, sock, wait);
 	struct sock *sk = sock->sk;
-	int 	is_lite = IS_UDPLITE(sk);
 
 	/* Check for false positives due to checksum errors */
-	if ((mask & POLLRDNORM) &&
-	    !(file->f_flags & O_NONBLOCK) &&
-	    !(sk->sk_shutdown & RCV_SHUTDOWN)) {
-		struct sk_buff_head *rcvq = &sk->sk_receive_queue;
-		struct sk_buff *skb;
-
-		spin_lock_bh(&rcvq->lock);
-		while ((skb = skb_peek(rcvq)) != NULL &&
-		       udp_lib_checksum_complete(skb)) {
-			UDP_INC_STATS_BH(sock_net(sk),
-					UDP_MIB_INERRORS, is_lite);
-			__skb_unlink(skb, rcvq);
-			kfree_skb(skb);
-		}
-		spin_unlock_bh(&rcvq->lock);
-
-		/* nothing to see, move along */
-		if (skb == NULL)
-			mask &= ~(POLLIN | POLLRDNORM);
-	}
+	if ((mask & POLLRDNORM) && !(file->f_flags & O_NONBLOCK) &&
+	    !(sk->sk_shutdown & RCV_SHUTDOWN) && !first_packet_length(sk))
+		mask &= ~(POLLIN | POLLRDNORM);
 
 	return mask;
 

^ permalink raw reply related

* Re: PATCH: Network Device Naming mechanism and policy
From: Matt Domsch @ 2009-10-09 21:09 UTC (permalink / raw)
  To: netdev, linux-hotplug; +Cc: Narendra_K, jordan_hargrave
In-Reply-To: <20091009140000.GA18765@mock.linuxdev.us.dell.com>

On Fri, Oct 09, 2009 at 09:00:01AM -0500, Narendra K wrote:
> On Fri, Oct 09, 2009 at 07:12:07PM +0530, K, Narendra wrote:
> > > example udev config:
> > > SUBSYSTEM=="net",
> > SYMLINK+="net/by-mac/$sysfs{ifindex}.$sysfs{address}"
> > 
> > work as well.  But coupling the ifindex to the MAC address like this
> > doesn't work.  (In general, coupling any two unrelated attributes when
> > trying to do persistent names doesn't work.)
> > 
> Attaching the latest patch incorporating review comments.

Same patch, rebased to linux-next.

By creating character devices for every network device, we can use
udev to maintain alternate naming policies for devices, including
additional names for the same device, without interfering with the
name that the kernel assigns a device.

This is conditionalized on CONFIG_NET_CDEV.  If enabled (the default),
device nodes will automatically be created in /dev/netdev/ for each
network device.  (/dev/net/ is already populated by the tun device.)

These device nodes are not functional at the moment - open() returns
-ENOSYS.  Their only purpose is to provide userspace with a kernel
name to ifindex mapping, in a form that udev can easily manage.

Signed-off-by: Jordan Hargrave <Jordan_Hargrave@dell.com>
Signed-off-by: Narendra K <Narendra_K@dell.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>

---
 include/linux/netdevice.h |    4 ++++
 net/Kconfig               |   10 ++++++++++
 net/core/Makefile         |    1 +
 net/core/cdev.c           |   42 ++++++++++++++++++++++++++++++++++++++++++
 net/core/cdev.h           |   13 +++++++++++++
 net/core/dev.c            |   10 ++++++++++
 net/core/net-sysfs.c      |   13 +++++++++++++
 7 files changed, 93 insertions(+), 0 deletions(-)
 create mode 100644 net/core/cdev.c
 create mode 100644 net/core/cdev.h

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b332eef..a2f23b4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -44,6 +44,7 @@
 #include <linux/workqueue.h>
 
 #include <linux/ethtool.h>
+#include <linux/cdev.h>
 #include <net/net_namespace.h>
 #include <net/dsa.h>
 #ifdef CONFIG_DCB
@@ -916,6 +917,9 @@ struct net_device
 	/* max exchange id for FCoE LRO by ddp */
 	unsigned int		fcoe_ddp_xid;
 #endif
+#ifdef CONFIG_NET_CDEV
+	struct cdev cdev;
+#endif
 };
 #define to_net_dev(d) container_of(d, struct net_device, dev)
 
diff --git a/net/Kconfig b/net/Kconfig
index 041c35e..bdc5bd7 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -43,6 +43,16 @@ config COMPAT_NETLINK_MESSAGES
 	  Newly written code should NEVER need this option but do
 	  compat-independent messages instead!
 
+config NET_CDEV
+       bool "/dev files for network devices"
+       default y
+       help
+         This option causes /dev entries to be created for each
+         network device.  This allows the use of udev to create
+         alternate device naming policies.
+
+	 If unsure, say Y.
+
 menu "Networking options"
 
 source "net/packet/Kconfig"
diff --git a/net/core/Makefile b/net/core/Makefile
index 796f46e..0b40d2c 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -19,4 +19,5 @@ obj-$(CONFIG_NET_DMA) += user_dma.o
 obj-$(CONFIG_FIB_RULES) += fib_rules.o
 obj-$(CONFIG_TRACEPOINTS) += net-traces.o
 obj-$(CONFIG_NET_DROP_MONITOR) += drop_monitor.o
+obj-$(CONFIG_NET_CDEV) += cdev.o
 
diff --git a/net/core/cdev.c b/net/core/cdev.c
new file mode 100644
index 0000000..1f36076
--- /dev/null
+++ b/net/core/cdev.c
@@ -0,0 +1,42 @@
+#include <linux/fs.h>
+#include <linux/cdev.h>
+#include <linux/netdevice.h>
+#include <linux/device.h>
+
+/* Used for network dynamic major number */
+static dev_t netdev_devt;
+
+static int netdev_cdev_open(struct inode *inode, struct file *filep)
+{
+	/* no operations on this device are implemented */
+	return -ENOSYS;
+}
+
+static const struct file_operations netdev_cdev_fops = {
+	.owner = THIS_MODULE,
+	.open = netdev_cdev_open,
+};
+
+void netdev_cdev_alloc(void)
+{
+	alloc_chrdev_region(&netdev_devt, 0, 1<<20, "net");
+}
+
+void netdev_cdev_init(struct net_device *dev)
+{
+	cdev_init(&dev->cdev, &netdev_cdev_fops);
+	cdev_add(&dev->cdev, MKDEV(MAJOR(netdev_devt), dev->ifindex), 1);
+
+}
+
+void netdev_cdev_del(struct net_device *dev)
+{
+	if (dev->cdev.dev)
+		cdev_del(&dev->cdev);
+}
+
+void netdev_cdev_kobj_init(struct device *dev, struct net_device *net)
+{
+	if (net->cdev.dev)
+		dev->devt = net->cdev.dev;
+}
diff --git a/net/core/cdev.h b/net/core/cdev.h
new file mode 100644
index 0000000..9cf5a90
--- /dev/null
+++ b/net/core/cdev.h
@@ -0,0 +1,13 @@
+#include <linux/netdevice.h>
+
+#ifdef CONFIG_NET_CDEV
+void netdev_cdev_alloc(void);
+void netdev_cdev_init(struct net_device *dev);
+void netdev_cdev_del(struct net_device *dev);
+void netdev_cdev_kobj_init(struct device *dev, struct net_device *net);
+#else
+static inline void netdev_cdev_alloc(void) {}
+static inline void netdev_cdev_init(struct net_device *dev) {}
+static inline void netdev_cdev_del(struct net_device *dev) {}
+static inline void netdev_cdev_kobj_init(struct device *dev, struct net_device *net) {}
+#endif
diff --git a/net/core/dev.c b/net/core/dev.c
index a74c8fd..d771438 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -129,6 +129,7 @@
 #include <trace/events/napi.h>
 
 #include "net-sysfs.h"
+#include "cdev.h"
 
 /* Instead of increasing this, you should create a hash table. */
 #define MAX_GRO_SKBS 8
@@ -4684,6 +4685,7 @@ static void rollback_registered(struct net_device *dev)
 
 	/* Remove entries from kobject tree */
 	netdev_unregister_kobject(dev);
+	netdev_cdev_del(dev);
 
 	synchronize_net();
 
@@ -4835,6 +4837,8 @@ int register_netdevice(struct net_device *dev)
 	if (dev->features & NETIF_F_SG)
 		dev->features |= NETIF_F_GSO;
 
+	netdev_cdev_init(dev);
+
 	netdev_initialize_kobject(dev);
 
 	ret = call_netdevice_notifiers(NETDEV_POST_INIT, dev);
@@ -4870,6 +4874,7 @@ out:
 	return ret;
 
 err_uninit:
+	netdev_cdev_del(dev);
 	if (dev->netdev_ops->ndo_uninit)
 		dev->netdev_ops->ndo_uninit(dev);
 	goto out;
@@ -5377,6 +5382,7 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 	dev_addr_discard(dev);
 
 	netdev_unregister_kobject(dev);
+	netdev_cdev_del(dev);
 
 	/* Actually switch the network namespace */
 	dev_net_set(dev, net);
@@ -5393,6 +5399,8 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
 			dev->iflink = dev->ifindex;
 	}
 
+	netdev_cdev_init(dev);
+
 	/* Fixup kobjects */
 	err = netdev_register_kobject(dev);
 	WARN_ON(err);
@@ -5626,6 +5634,8 @@ static int __init net_dev_init(void)
 
 	BUG_ON(!dev_boot_phase);
 
+	netdev_cdev_alloc();
+
 	if (dev_proc_init())
 		goto out;
 
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 753c420..f4ee557 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -19,6 +19,7 @@
 #include <net/wext.h>
 
 #include "net-sysfs.h"
+#include "cdev.h"
 
 #ifdef CONFIG_SYSFS
 static const char fmt_hex[] = "%#x\n";
@@ -501,6 +502,14 @@ static void netdev_release(struct device *d)
 	kfree((char *)dev - dev->padded);
 }
 
+#ifdef CONFIG_NET_CDEV
+static char *netdev_devnode(struct device *d, mode_t *mode)
+{
+	struct net_device *dev = to_net_dev(d);
+	return kasprintf(GFP_KERNEL, "netdev/%s", dev->name);
+}
+#endif
+
 static struct class net_class = {
 	.name = "net",
 	.dev_release = netdev_release,
@@ -510,6 +519,9 @@ static struct class net_class = {
 #ifdef CONFIG_HOTPLUG
 	.dev_uevent = netdev_uevent,
 #endif
+#ifdef CONFIG_NET_CDEV
+	.devnode = netdev_devnode,
+#endif
 };
 
 /* Delete sysfs entries but hold kobject reference until after all
@@ -536,6 +548,7 @@ int netdev_register_kobject(struct net_device *net)
 	dev->class = &net_class;
 	dev->platform_data = net;
 	dev->groups = groups;
+	netdev_cdev_kobj_init(dev, net);
 
 	dev_set_name(dev, "%s", net->name);
 
-- 
1.6.0.6


^ permalink raw reply related

* pull request: wireless-next-2.6 2009-10-09
From: John W. Linville @ 2009-10-09 21:05 UTC (permalink / raw)
  To: davem; +Cc: linux-wireless, netdev

Dave,

Here is the usual big first post-window pull request for -next...
Mostly it is the usual suspects, lots of iwlwifi and ath* along
with a smattering of other bits.  There are even a few from me! :-)
Most of these have spent several days banging-around in -next (which
helped to find some Kconfig problems).

Please let me know if there are problems!

Thanks,

John

---

Individual patches are available here:

	http://www.kernel.org/pub/linux/kernel/people/linville/wireless-next-2.6/

---

The following changes since commit d519e17e2d01a0ee9abe083019532061b4438065:
  Andy Gospodarek (1):
        net: export device speed and duplex via sysfs

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6.git master

Abhijeet Kolekar (2):
      iwlwifi/iwl3945 : unify apm stop operation
      iwlwifi: replace iwl_poll_direct_bit with iwl_poll_bit for CSR access

Amitkumar Karwar (2):
      libertas: Add auto deep sleep support for SD8385/SD8686/SD8688
      libertas: Use lbs_is_cmd_allowed() check in command handling routines.

Christian Lamparter (1):
      iwlwifi: drop lib80211 dependency

Daniel C Halperin (3):
      iwlwifi: clean up rs_tx_status
      iwlwifi: do not clear TX info flags when receiving BlockAckResponse
      iwlwifi: add aggregation tables to the rate scaling algorithm

Holger Schurig (5):
      nl80211: report age of scan results
      libertas: separate libertas' Kconfig in it's own file
      libertas: first stab at cfg80211 support
      libertas: remove extraneous select FW_LOADER
      libertas: depend on CONFIG_CFG80211

Huaxu Wan (2):
      iwlwifi: add module firmware info for 1000 series
      iwlwifi: clear the translate table area

Jaswinder Singh Rajput (1):
      b43: Comment unused functions lpphy_restore_dig_flt_state and lpphy_disable_rx_gain_override

Joerg Albert (3):
      ar9170: fixed coding style, moved define
      ar9170: add heavy clip handling
      ar9170: handle overflow in tsf_low register during get_tsf

Johannes Berg (10):
      iwlwifi: clean up ht config a little
      iwlwifi: clean up ht config naming
      iwlwifi: clarify and clean up chain settings
      iwlwifi: fix a typo
      iwlwifi: default to using all chains
      iwlwifi: support idle for 6000 series hw
      wext: refactor
      iwlwifi: device tracing
      iwlwifi: LED cleanup
      wireless: make wireless drivers select core

John W. Linville (6):
      wireless: implement basic ethtool support for cfg80211 devices
      mac80211: support ETHTOOL_GPERMADDR
      iwmc3200wifi: support ETHTOOL_GPERMADDR
      ipw2200: support ETHTOOL_GPERMADDR
      orinoco: support ETHTOOL_GPERMADDR
      net/wireless/ethtool.h: drop unnecessary include of linux/ethtool.h

Kalle Valo (3):
      wl1251: remove wl1251_netlink.h
      cfg80211: add firmware and hardware version to wiphy
      at76c50x-usb: set firmware and hardware version in wiphy

Larry Finger (1):
      staging: Add proper selection of WIRELESS_EXT and WEXT_PRIV

Luis R. Rodriguez (68):
      ath9k: use ath_hw for DPRINTF() and debug init/exit
      ath9k: move btcoex core driver info to its own struct
      ath9k: move hw specific btcoex info to ath_hw
      ath9k: split bluetooth hardware coex init into two helpers
      ath9k: move driver core helpers to main.c
      ath9k: split ath9k_hw_btcoex_enable() into two helpers
      ath9k: replaces SC_OP_BTCOEX_ENABLED with a bool
      ath9k: move bt_stomp_type to driver core
      ath9k: remove unused bt_duty_cycle
      ath9k: rename btcoex_scheme to just scheme
      ath9k: rename ath_btcoex_info to ath_btcoex_hw
      ath9k: simplify ath_btcoex_bt_stomp()
      ath9k: now move ath9k_hw_btcoex_set_weight() to btcoex.c
      ath9k: move ath_btcoex_config and ath_bt_mode to btcoex.c
      ath9k: rename ath_btcoex_supported() to ath9k_hw_btcoex_supported()
      ath9k: move ps helpers onto core driver when reseting tsf
      ath9k: move ath9k_ps_wakeup() and ath9k_ps_restore() to main.c
      ath9k: avoid usage of ath9k_hw_setpower() on hw.c
      ath9k: move ath9k_hw_setpower() to main.c
      ath9k: rename driver core and hw power save helpers
      ath: move ath_bcast_mac to common header
      atheros: use get_unaligned_le*() for bssid mask setting
      ath9k: make ath9k_hw_setbssidmask() and ath9k_hw_write_associd() use ath_hw
      ath9k: Use ath9k_hw_setbssidmask() on reset
      ath9k: use ath9k_hw_write_associd() on reset
      atheros/ath9k: move macaddr, curaid, curbssid and bssidmask to common
      ar9170: make use of common macaddr and curbssid
      ath5k: use common curbssid, bssidmask and macaddr
      ath5k: initialize eeprom struct early on attach
      ath9k: move ath_common to ath_hw
      ath5k: move ath_common to ath5k_hw
      ath9k: Define bus agnostic bluetooth coex prep helper
      atheros/ath9k: add common read/write ops and port ath9k to use it
      ath5k: allocate ath5k_hw prior to initializing hw
      ath5k: define ath_common ops
      atheros: define shared bssidmask setting
      atheros: add ieee80211_hw to ath_common
      ath9k: separate core driver and hw timer code
      atheros: add common debug printing
      atheros: move tx/rx chainmask to ath_common
      ath9k: remove ath9k 25 MHz HT40 spacing stuff
      ath9k: remove ath9k_ht_macmode
      ath9k: move ATH_AMPDU_LIMIT_MAX to hw.h
      ath9k: remove driver ASSERT, just use BUG_ON()
      ath9k: clarify what hw code is and remove ath9k.h from a few files
      ath9k: move ATH9K_RSSI_BAD to hw.h
      atheros: move bus ops to ath_common
      ath9k: make ath9k_common_ops const
      ath9k: use common read/write ops on pci and debug code
      ath9k: move hw code to its own module
      ath9k_hw: print device ID if not supported
      ath9k_hw: add AR9271 srev and device ID to allow hw to support ar9271
      atheros: define a common priv struct
      ath5k: fix regression on setting bssid mask on association
      ath5k: use ath_hw_setbssidmask() for bssid mask setting upon assoc
      ath5k: fix regression introduced upon the removal of AR5K_HIGH_ID()
      ath5k: simplify passed params to ath5k_hw_set_associd()
      ath5k: remove temporary low_id and high_id vars on ath5k_hw_set_associd()
      ath5k: fix regression which triggers an SME join upon assoc
      ath5k: enable Power-Save Polls by setting the association ID
      ath9k: move common->debug_mask setting to ath_init_softc()
      ath9k: initialize hw prior to debugfs
      ath9k: add helper to un-init the hw properly
      ath9k: add a helper to clean the core driver upon module unload
      ath9k: move ath_cleanup() below helpers to avoid forward declarations
      ath9k: rename ath_beaconq_setup() to ath9k_hw_beaconq_setup()
      ath9k: use right parameter for MODULE_PARM_DESC() for debug
      libertas: remove double assignment of dev->netdev_ops

Rafael J. Wysocki (1):
      Wireless / ath5k: Simplify suspend and resume callbacks

Randy Dunlap (1):
      wireless: fix CFG80211_WEXT build problems

Senthil Balasubramanian (5):
      ath9k: Allow PSPOLL only when the interface is configured in AP mode
      ath9k: Handle ATH9K_BEACON_RESET_TSF properly
      ath9k: Reduce PLL Settle time and eliminate redundant PLL calls.
      ath9k: Advertise midband for AR5416 devices
      ath9k: Fix bugs in handling TX power

Sujith (2):
      ath9k: Update INI release for AR9287
      ath9k: Fix RTC reset for AR5416

Vasanthakumar Thiagarajan (1):
      ath9k: Update initvals

Vivek Natarajan (1):
      ath9k: Add Calibration checks

Wey-Yi Guy (19):
      iwlwifi: modify LED blink index table
      iwlwifi: remove un-supported eeprom parameters
      iwlwifi: separate nic_config for different NIC
      iwlwifi: separate set_hw_params function for 6000 series
      iwlwifi: Adjust blink rate to compensate Clock difference
      iwlwifi: show NVM version in debugfs
      iwlwifi: Use RTS/CTS as the preferred protection mechanism for 6000 series
      iwlwifi: allow user change protection mechanism for HT
      iwlwifi: EEPROM version for 1000 and 6000 series
      iwlwifi: use S_IRUGO and S_IWUSR in module parameters
      iwlwifi: send cmd to uCode to configure valid tx antenna
      iwlwifi: update PCI Subsystem ID for 1000 series
      iwlwifi: update PCI Subsystem ID for 6000 series
      iwlwifi: add LED mode to support different LED behavior
      iwlwifi: Chain Noise Calibration for 6000 series
      iwlwifi: reliable entering of critical temperature state
      iwlwifi: change valid EEPROM version for 1000 series
      iwlwifi: set default aggregation frame count limit to 31
      iwlwifi: validate the signature for EEPROM and OTP

 drivers/net/wireless/Kconfig                 |   84 +-
 drivers/net/wireless/at76c50x-usb.c          |   10 +
 drivers/net/wireless/ath/Kconfig             |    8 +
 drivers/net/wireless/ath/Makefile            |    9 +-
 drivers/net/wireless/ath/ar9170/ar9170.h     |    4 +-
 drivers/net/wireless/ath/ar9170/cmd.c        |    3 +-
 drivers/net/wireless/ath/ar9170/cmd.h        |    1 +
 drivers/net/wireless/ath/ar9170/hw.h         |    2 +
 drivers/net/wireless/ath/ar9170/mac.c        |   15 +-
 drivers/net/wireless/ath/ar9170/main.c       |   30 +-
 drivers/net/wireless/ath/ar9170/phy.c        |   99 ++-
 drivers/net/wireless/ath/ath.h               |   41 +
 drivers/net/wireless/ath/ath5k/ath5k.h       |   40 +-
 drivers/net/wireless/ath/ath5k/attach.c      |   31 +-
 drivers/net/wireless/ath/ath5k/base.c        |  116 ++-
 drivers/net/wireless/ath/ath5k/base.h        |   12 -
 drivers/net/wireless/ath/ath5k/initvals.c    |    4 +-
 drivers/net/wireless/ath/ath5k/pcu.c         |  193 +---
 drivers/net/wireless/ath/ath5k/reg.h         |    8 +-
 drivers/net/wireless/ath/ath5k/reset.c       |   16 +-
 drivers/net/wireless/ath/ath9k/Kconfig       |    8 +
 drivers/net/wireless/ath/ath9k/Makefile      |   27 +-
 drivers/net/wireless/ath/ath9k/ahb.c         |   19 +-
 drivers/net/wireless/ath/ath9k/ani.c         |  141 ++-
 drivers/net/wireless/ath/ath9k/ath9k.h       |   73 +-
 drivers/net/wireless/ath/ath9k/beacon.c      |  112 +-
 drivers/net/wireless/ath/ath9k/btcoex.c      |  383 ++----
 drivers/net/wireless/ath/ath9k/btcoex.h      |   64 +-
 drivers/net/wireless/ath/ath9k/calib.c       |  391 ++++---
 drivers/net/wireless/ath/ath9k/calib.h       |    2 +
 drivers/net/wireless/ath/ath9k/debug.c       |   55 +-
 drivers/net/wireless/ath/ath9k/debug.h       |   36 +-
 drivers/net/wireless/ath/ath9k/eeprom.c      |    8 +-
 drivers/net/wireless/ath/ath9k/eeprom.h      |    9 +-
 drivers/net/wireless/ath/ath9k/eeprom_4k.c   |   90 +-
 drivers/net/wireless/ath/ath9k/eeprom_9287.c |   97 +-
 drivers/net/wireless/ath/ath9k/eeprom_def.c  |  183 ++-
 drivers/net/wireless/ath/ath9k/hw.c          |  595 +++++-----
 drivers/net/wireless/ath/ath9k/hw.h          |   63 +-
 drivers/net/wireless/ath/ath9k/initvals.h    |   72 +-
 drivers/net/wireless/ath/ath9k/mac.c         |  162 ++-
 drivers/net/wireless/ath/ath9k/mac.h         |   11 +-
 drivers/net/wireless/ath/ath9k/main.c        |  841 +++++++++----
 drivers/net/wireless/ath/ath9k/pci.c         |   37 +-
 drivers/net/wireless/ath/ath9k/phy.c         |   50 +-
 drivers/net/wireless/ath/ath9k/phy.h         |    1 +
 drivers/net/wireless/ath/ath9k/rc.c          |   33 +-
 drivers/net/wireless/ath/ath9k/recv.c        |   62 +-
 drivers/net/wireless/ath/ath9k/reg.h         |    5 +-
 drivers/net/wireless/ath/ath9k/virtual.c     |   22 +-
 drivers/net/wireless/ath/ath9k/xmit.c        |  113 +-
 drivers/net/wireless/ath/debug.c             |   32 +
 drivers/net/wireless/ath/debug.h             |   77 ++
 drivers/net/wireless/ath/hw.c                |  126 ++
 drivers/net/wireless/ath/reg.h               |   27 +
 drivers/net/wireless/b43/phy_lp.c            |    6 +
 drivers/net/wireless/hostap/Kconfig          |    2 +
 drivers/net/wireless/ipw2x00/Kconfig         |    7 +-
 drivers/net/wireless/ipw2x00/ipw2200.c       |    1 +
 drivers/net/wireless/iwlwifi/Kconfig         |   28 +-
 drivers/net/wireless/iwlwifi/Makefile        |   12 +-
 drivers/net/wireless/iwlwifi/iwl-1000.c      |   35 +-
 drivers/net/wireless/iwlwifi/iwl-3945-led.c  |  371 +-----
 drivers/net/wireless/iwlwifi/iwl-3945-led.h  |   22 +-
 drivers/net/wireless/iwlwifi/iwl-3945.c      |   65 +-
 drivers/net/wireless/iwlwifi/iwl-3945.h      |    2 +-
 drivers/net/wireless/iwlwifi/iwl-4965.c      |   71 +-
 drivers/net/wireless/iwlwifi/iwl-5000.c      |  127 +-
 drivers/net/wireless/iwlwifi/iwl-6000.c      |  245 ++++-
 drivers/net/wireless/iwlwifi/iwl-agn-led.c   |   85 ++
 drivers/net/wireless/iwlwifi/iwl-agn-led.h   |   32 +
 drivers/net/wireless/iwlwifi/iwl-agn-rs.c    |  466 ++++----
 drivers/net/wireless/iwlwifi/iwl-agn.c       |  124 ++-
 drivers/net/wireless/iwlwifi/iwl-calib.c     |   66 +-
 drivers/net/wireless/iwlwifi/iwl-commands.h  |   12 +-
 drivers/net/wireless/iwlwifi/iwl-core.c      |  209 ++--
 drivers/net/wireless/iwlwifi/iwl-core.h      |   31 +-
 drivers/net/wireless/iwlwifi/iwl-csr.h       |    7 +-
 drivers/net/wireless/iwlwifi/iwl-debug.h     |    2 -
 drivers/net/wireless/iwlwifi/iwl-debugfs.c   |   17 +-
 drivers/net/wireless/iwlwifi/iwl-dev.h       |   31 +-
 drivers/net/wireless/iwlwifi/iwl-devtrace.c  |   13 +
 drivers/net/wireless/iwlwifi/iwl-devtrace.h  |  178 +++
 drivers/net/wireless/iwlwifi/iwl-eeprom.c    |   45 +-
 drivers/net/wireless/iwlwifi/iwl-eeprom.h    |   17 +-
 drivers/net/wireless/iwlwifi/iwl-io.h        |   16 +-
 drivers/net/wireless/iwlwifi/iwl-led.c       |  323 +----
 drivers/net/wireless/iwlwifi/iwl-led.h       |   46 +-
 drivers/net/wireless/iwlwifi/iwl-power.c     |  149 ++-
 drivers/net/wireless/iwlwifi/iwl-power.h     |    3 +
 drivers/net/wireless/iwlwifi/iwl-scan.c      |    1 -
 drivers/net/wireless/iwlwifi/iwl-tx.c        |   26 +-
 drivers/net/wireless/iwlwifi/iwl3945-base.c  |   28 +-
 drivers/net/wireless/iwmc3200wifi/main.c     |    2 +
 drivers/net/wireless/libertas/Kconfig        |   39 +
 drivers/net/wireless/libertas/Makefile       |   15 +-
 drivers/net/wireless/libertas/README         |   26 +-
 drivers/net/wireless/libertas/cfg.c          |  198 +++
 drivers/net/wireless/libertas/cfg.h          |   16 +
 drivers/net/wireless/libertas/cmd.c          |  106 ++-
 drivers/net/wireless/libertas/cmdresp.c      |   12 +
 drivers/net/wireless/libertas/decl.h         |    3 +
 drivers/net/wireless/libertas/defs.h         |    2 +
 drivers/net/wireless/libertas/dev.h          |   19 +
 drivers/net/wireless/libertas/host.h         |    1 +
 drivers/net/wireless/libertas/if_cs.c        |    3 +
 drivers/net/wireless/libertas/if_sdio.c      |   56 +
 drivers/net/wireless/libertas/if_sdio.h      |    3 +-
 drivers/net/wireless/libertas/if_spi.c       |    3 +
 drivers/net/wireless/libertas/if_usb.c       |    3 +
 drivers/net/wireless/libertas/main.c         |  171 ++-
 drivers/net/wireless/libertas/wext.c         |   54 +-
 drivers/net/wireless/orinoco/Kconfig         |    4 +-
 drivers/net/wireless/orinoco/main.c          |    1 +
 drivers/net/wireless/wl12xx/wl1251_netlink.h |   30 -
 drivers/staging/rtl8187se/Kconfig            |    3 +-
 drivers/staging/rtl8192e/Kconfig             |    3 +-
 drivers/staging/vt6655/Kconfig               |    4 +-
 drivers/staging/vt6656/Kconfig               |    4 +-
 include/linux/nl80211.h                      |    2 +
 include/net/cfg80211.h                       |    9 +-
 include/net/iw_handler.h                     |   14 +-
 include/net/net_namespace.h                  |    2 +-
 include/net/wext.h                           |   49 +-
 net/core/net-sysfs.c                         |    6 +-
 net/mac80211/iface.c                         |    5 +-
 net/socket.c                                 |    4 +-
 net/wireless/Kconfig                         |   50 +-
 net/wireless/Makefile                        |   10 +-
 net/wireless/core.c                          |   17 +-
 net/wireless/ethtool.c                       |   45 +
 net/wireless/ethtool.h                       |    6 +
 net/wireless/ibss.c                          |   10 +-
 net/wireless/mlme.c                          |    2 +-
 net/wireless/nl80211.c                       |    6 +-
 net/wireless/scan.c                          |    6 +-
 net/wireless/sme.c                           |   12 +-
 net/wireless/wext-core.c                     | 1063 +++++++++++++++
 net/wireless/wext-priv.c                     |  248 ++++
 net/wireless/wext-proc.c                     |  155 +++
 net/wireless/wext-spy.c                      |  231 ++++
 net/wireless/wext.c                          | 1775 --------------------------
 142 files changed, 6953 insertions(+), 5229 deletions(-)
 create mode 100644 drivers/net/wireless/ath/debug.c
 create mode 100644 drivers/net/wireless/ath/debug.h
 create mode 100644 drivers/net/wireless/ath/hw.c
 create mode 100644 drivers/net/wireless/ath/reg.h
 create mode 100644 drivers/net/wireless/iwlwifi/iwl-agn-led.c
 create mode 100644 drivers/net/wireless/iwlwifi/iwl-agn-led.h
 create mode 100644 drivers/net/wireless/iwlwifi/iwl-devtrace.c
 create mode 100644 drivers/net/wireless/iwlwifi/iwl-devtrace.h
 create mode 100644 drivers/net/wireless/libertas/Kconfig
 create mode 100644 drivers/net/wireless/libertas/cfg.c
 create mode 100644 drivers/net/wireless/libertas/cfg.h
 delete mode 100644 drivers/net/wireless/wl12xx/wl1251_netlink.h
 create mode 100644 net/wireless/ethtool.c
 create mode 100644 net/wireless/ethtool.h
 create mode 100644 net/wireless/wext-core.c
 create mode 100644 net/wireless/wext-priv.c
 create mode 100644 net/wireless/wext-proc.c
 create mode 100644 net/wireless/wext-spy.c
 delete mode 100644 net/wireless/wext.c

Omnibus patch is available here:

	http://www.kernel.org/pub/linux/kernel/people/linville/wireless-next-2.6-2009-10-09.patch.bz2

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply

* Re: tg3 and Broadcom PHY driver
From: David Miller @ 2009-10-09 21:25 UTC (permalink / raw)
  To: bhutchings; +Cc: felix, mcarlson, netdev
In-Reply-To: <1254185639.27790.3.camel@localhost>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Tue, 29 Sep 2009 01:53:59 +0100

> On Mon, 2009-09-28 at 14:55 -0700, David Miller wrote:
>> From: Felix Radensky <felix@embedded-sol.com>
>> Date: Mon, 28 Sep 2009 23:52:54 +0200
>> 
>> > Yes, moving CONFIG_TIGON3 right after CONFIG_PHYLIB in
>> > drivers/net/Makefile fixes the problem for me.
>> 
>> Thanks for testing.
>> 
>> We really need to fix this generically.
>> 
>> Does anyone think that moving the MDIO/MII/PHY layer objects
>> to the top of drivers/net/Makefile will break anything?
>> 
>> If not, that's what we should do I think.
> 
> Only the phylib drivers actually need to be moved to fix the
> initialisation order, but moving the others shouldn't hurt.

Ok, I'm adding the following to net-2.6 to resolve this and
will queue it up for -stable too.

Thanks everyone.

net: Link in PHY drivers before others.

We need PHY drivers to initialize in a static kernel before
the MAC drivers that use them.  So link them in first.

Based upon a report by Felix Radensky.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/Makefile |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index d866b8c..48d82e9 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -2,6 +2,10 @@
 # Makefile for the Linux network (ethercard) device drivers.
 #
 
+obj-$(CONFIG_MII) += mii.o
+obj-$(CONFIG_MDIO) += mdio.o
+obj-$(CONFIG_PHYLIB) += phy/
+
 obj-$(CONFIG_TI_DAVINCI_EMAC) += davinci_emac.o
 
 obj-$(CONFIG_E1000) += e1000/
@@ -100,10 +104,6 @@ obj-$(CONFIG_SH_ETH) += sh_eth.o
 # end link order section
 #
 
-obj-$(CONFIG_MII) += mii.o
-obj-$(CONFIG_MDIO) += mdio.o
-obj-$(CONFIG_PHYLIB) += phy/
-
 obj-$(CONFIG_SUNDANCE) += sundance.o
 obj-$(CONFIG_HAMACHI) += hamachi.o
 obj-$(CONFIG_NET) += Space.o loopback.o
-- 
1.6.4.4
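
The link-order dependence that the patch above addresses can be
illustrated with a small user-space model. This is only a sketch: every
name in it is hypothetical, and the failing return value stands in for
whatever goes wrong when a MAC driver initializes before its PHY driver.
It assumes, as the commit message does, that in a static kernel the
init functions run in the order the objects were linked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel code: all names here are hypothetical. */
static bool phy_driver_registered;

int phy_driver_init(void)
{
	phy_driver_registered = true;
	return 0;
}

/* The MAC driver can only come up if a PHY driver is already registered. */
int mac_driver_init(void)
{
	return phy_driver_registered ? 0 : -1;
}

typedef int (*initcall_fn)(void);

/* Run the init functions in array order, which stands in for link order. */
int run_initcalls(const initcall_fn *calls, size_t n)
{
	int failures = 0;
	size_t i;

	phy_driver_registered = false;
	for (i = 0; i < n; i++)
		if (calls[i]() != 0)
			failures++;
	return failures;
}

/* Old Makefile order: MAC drivers first, PHY layer near the end. */
const initcall_fn mac_first[] = { mac_driver_init, phy_driver_init };
/* New Makefile order: PHY layer at the top. */
const initcall_fn phy_first[] = { phy_driver_init, mac_driver_init };
```

With mac_first the model reports one failed init; with phy_first it
reports none. Modular builds are not affected in the same way, since
module load order is resolved at run time.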


^ permalink raw reply related

* Re: [RFCv4 PATCH 1/2] net: Introduce recvmmsg socket syscall
From: David Miller @ 2009-10-09 21:27 UTC (permalink / raw)
  To: acme
  Cc: caitlin.bestler, vanhoof, williams, nhorman, nir.tzachar, niv,
	paul.moore, remi.denis-courmont, steve, netdev
In-Reply-To: <20091009193520.GD12982@ghostprotocols.net>

From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Fri, 9 Oct 2009 16:35:20 -0300

> 	The second patch in this series has issues, I still have to
> investigate it properly, study removing the skb_queue_head lock like TCP
> does, but the first patch seems to be OK and already providing good
> results at least as reported by Nir, if there aren't any other concerns
> about the API, can we get it into net-next-2.6?

Please make a formal submission of that first patch with all proper
signoffs and without the "RFC" in the subject line and I'll apply it.

Thanks!

^ permalink raw reply

* Re: [PATCH] Generalize socket rx gap / receive queue overflow cmsg (v2)
From: Eric Dumazet @ 2009-10-09 21:31 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, davem, socketcan
In-Reply-To: <20091009193515.GA28196@hmsreliant.think-freely.org>

Neil Horman a écrit :

>  
> +extern void __sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk,
> +	struct sk_buff *skb);

Surely you meant __sock_recv_drops() ? It only deals with drops.


> +	case SO_RXQ_OVFL:
> +		v.val = sock_flag(sk, SOCK_RXQ_OVFL);
> +		break;
> +

Hmm, I'd advise using v.val = !!sock_flag(sk, SOCK_RXQ_OVFL);
so that the application gets 0 or 1, not 0 or some big value.
It's better because it allows us to change the internal SOCK_RXQ_OVFL value if necessary in the future.
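
A minimal user-space sketch of the normalization suggested above. The
helper and the bit position are hypothetical stand-ins, chosen only to
show a flag test that returns the masked bit rather than 0 or 1; they
are not the kernel's sock_flag() implementation.

```c
/* Hypothetical bit position, picked so the raw value is visibly "big". */
#define DEMO_FLAG_BIT 17

/* Stand-in for a flag test that returns the masked bit itself. */
int demo_test_flag(unsigned long flags)
{
	return (int)(flags & (1UL << DEMO_FLAG_BIT));
}

/* What the getsockopt path would hand back: always exactly 0 or 1. */
int demo_getsockopt_value(unsigned long flags)
{
	return !!demo_test_flag(flags);
}
```

The first ! maps any non-zero value to 0 and zero to 1; the second
flips that back, so the result no longer depends on which bit the
flag happens to occupy internally.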

>  drop_n_acct:
> -	spin_lock(&sk->sk_receive_queue.lock);
> -	po->stats.tp_drops++;
> -	spin_unlock(&sk->sk_receive_queue.lock);
> +	po->stats.tp_drops = atomic_inc_return(&sk->sk_drops);

Yes :)

>  EXPORT_SYMBOL_GPL(__sock_recv_timestamp);
>  
> +void __sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk,
> +	struct sk_buff *skb)
> +{
> +	put_cmsg(msg, SOL_SOCKET, SO_RXQ_OVFL, sizeof(__u32), &skb->dropcount);
> +}
> +EXPORT_SYMBOL_GPL(__sock_recv_ts_and_drops);
> +

Just change the name.

And is it really too large to be inlined ?

On the contrary, sock_recv_timestamp() is so large that I suspect
your sock_recv_ts_and_drops() should *not* be inlined; it should be an
out-of-line function that only calls the inlined helpers.

I suggest something more orthogonal, like:

static inline void sock_recv_drops(struct msghdr *msg, struct sock *sk, struct sk_buff *skb)
{
	/* Emit the cmsg only if the socket asked for it and drops occurred. */
	if (sock_flag(sk, SOCK_RXQ_OVFL) && skb && skb->dropcount)
		put_cmsg(msg, SOL_SOCKET, SO_RXQ_OVFL,
			 sizeof(__u32), &skb->dropcount);
}

void sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk, struct sk_buff *skb)
{
	sock_recv_timestamp(msg, sk, skb); /* inlined */
	sock_recv_drops(msg, sk, skb); /* inlined */
}
EXPORT_SYMBOL_GPL(sock_recv_ts_and_drops);
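
For reference, this is roughly how an application might pick up the
drop counter on the receiving side. It is a sketch under assumptions:
it presumes the cmsg is delivered at level SOL_SOCKET with type
SO_RXQ_OVFL and a 32-bit payload, as in the put_cmsg() call above, and
the fallback value 40 for SO_RXQ_OVFL is an assumption for headers
that do not define it. find_drop_count() is a hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed fallback; the option number is not settled in this thread. */
#ifndef SO_RXQ_OVFL
#define SO_RXQ_OVFL 40
#endif

/*
 * Walk the control messages returned by recvmsg().
 * Returns 1 and stores the counter in *drops if an SO_RXQ_OVFL cmsg
 * is present, 0 otherwise.
 */
int find_drop_count(struct msghdr *msg, uint32_t *drops)
{
	struct cmsghdr *cmsg;

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
	     cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SO_RXQ_OVFL &&
		    cmsg->cmsg_len >= CMSG_LEN(sizeof(uint32_t))) {
			/* CMSG_DATA may be unaligned for uint32_t, so copy. */
			memcpy(drops, CMSG_DATA(cmsg), sizeof(*drops));
			return 1;
		}
	}
	return 0;
}
```

Note that with the sock_recv_drops() variant suggested above, the cmsg
is emitted only when skb->dropcount is non-zero, so a return of 0 here
would mean "no drops reported" rather than an error.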


^ permalink raw reply

* Re: [PATCH 2.6.32-rc3] net: VMware virtual Ethernet NIC driver: vmxnet3
From: Stephen Hemminger @ 2009-10-09 21:35 UTC (permalink / raw)
  To: Shreyas Bhatewara
  Cc: Jeff, pv-drivers, netdev, linux-kernel, Andrew, Wright,
	Anthony Liguori, Greg Kroah-Hartman, Chris, Morton,
	virtualization, Garzik, David S. Miller
In-Reply-To: <alpine.LRH.2.00.0910081053460.19107@localhost.localdomain>

On Thu, 8 Oct 2009 10:59:26 -0700 (PDT)
Shreyas Bhatewara <sbhatewara@vmware.com> wrote:

> Hello all,
> 
> I do not mean to be bothersome but this thread has been unusually silent.
> Could you please review the patch for me and reply with your comments / 
> acks ?
> 
> Thanks.
> ->Shreyas  


Looks fine, but just a minor style nit (can be changed after insertion in mainline).

The code:

static void
vmxnet3_do_poll(struct vmxnet3_adapter *adapter, int budget, int *txd_done,
		int *rxd_done)
{
	if (unlikely(adapter->shared->ecr))
		vmxnet3_process_events(adapter);

	*txd_done = vmxnet3_tq_tx_complete(&adapter->tx_queue, adapter);
	*rxd_done = vmxnet3_rq_rx_complete(&adapter->rx_queue, adapter, budget);
}


static int
vmxnet3_poll(struct napi_struct *napi, int budget)
{
	struct vmxnet3_adapter *adapter = container_of(napi,
					  struct vmxnet3_adapter, napi);
	int rxd_done, txd_done;

	vmxnet3_do_poll(adapter, budget, &txd_done, &rxd_done);

	if (rxd_done < budget) {
		napi_complete(napi);
		vmxnet3_enable_intr(adapter, 0);
	}
	return rxd_done;
}


It is simpler if you just have do_poll return the rx done value. GCC
probably inlines it all anyway.

static int
vmxnet3_do_poll(struct vmxnet3_adapter *adapter, int budget)
{
	if (unlikely(adapter->shared->ecr))
		vmxnet3_process_events(adapter);

	vmxnet3_tq_tx_complete(&adapter->tx_queue, adapter);
	return vmxnet3_rq_rx_complete(&adapter->rx_queue, adapter, budget);
}


static int
vmxnet3_poll(struct napi_struct *napi, int budget)
{
	struct vmxnet3_adapter *adapter = container_of(napi,
					  struct vmxnet3_adapter, napi);
	int rxd_done;

	rxd_done = vmxnet3_do_poll(adapter, budget);
	if (rxd_done < budget) {
		napi_complete(napi);
		vmxnet3_enable_intr(adapter, 0);
	}
	return rxd_done;
}

^ permalink raw reply

* Re: pull request: wireless-next-2.6 2009-10-09
From: David Miller @ 2009-10-09 21:40 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev
In-Reply-To: <20091009210555.GC22861@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Fri, 9 Oct 2009 17:05:55 -0400

> Here is the usual big first post-window pull request for -next...
> Mostly it is the usual suspects, lots of iwlwifi and ath* along
> with a smattering of other bits.  There are even a few from me! :-)
> Most of these have spent several days banging-around in -next (which
> helped to find some Kconfig problems).
> 
> Please let me know if there are problems!

Pulled, thanks a lot John!

^ permalink raw reply

* Re: netconf notes and materials
From: Bill Fink @ 2009-10-09 21:49 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, peter.p.waskiewicz.jr
In-Reply-To: <20091007.044718.233642056.davem@davemloft.net>

On Wed, 07 Oct 2009, David Miller wrote:

> 
> Just a note that all of the available notes and slide etc.
> materials are available for netconf2009 at:
> 
> 	http://vger.kernel.org/netconf2009.html
> 
> Enjoy.

Thanks very much for the URL!

A question for Peter P Waskiewicz Jr, who presented on "NUMA scaling
issues in 10GbE":

How did you do the NUMA memory performance monitoring that was
presented on one of your slides?  This could be useful to me in
further pursuing an issue I recently raised with the subject
"Receive side performance issue with multi-10-GigE and NUMA"
(see http://article.gmane.org/gmane.linux.network/134658).

					-Thanks again

					-Bill

^ permalink raw reply

* Re: netconf notes and materials
From: David Miller @ 2009-10-09 21:51 UTC (permalink / raw)
  To: billfink; +Cc: netdev, peter.p.waskiewicz.jr
In-Reply-To: <20091009174949.467ddc50.billfink@mindspring.com>

From: Bill Fink <billfink@mindspring.com>
Date: Fri, 9 Oct 2009 17:49:49 -0400

> How did you do the NUMA memory performance monitoring that was
> presented on one of your slides?

Using proprietary internal tools Intel is unlikely to release.

On the bright side, some of those metrics will make their way into the
'perf' facilities in the kernel so they can be monitored, but not all
of them.


^ permalink raw reply

* [PATCH] Re: PACKET_TX_RING: packet size is too long
From: Gabor Gombas @ 2009-10-09 22:05 UTC (permalink / raw)
  To: netdev; +Cc: johann.baudy
In-Reply-To: <20091009090711.GG23133@boogie.lpds.sztaki.hu>

Hi,

Digging through the list archives, I suspect the current value of
size_max is a remnant of the zero-copy mode that was not merged. So I
propose the following patch, which IMHO makes the value of size_max
consistent with how the frame is actually handled in tpacket_fill_skb().

If the zero-copy mode is ever to be resurrected, then the user should
explicitly request it, and either the length of the extra padding
should be the same for 32-bit and 64-bit kernels or there must be a way
to query the value at run time.
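
To make the difference concrete, the two size_max computations can be
compared side by side in user space. This is a sketch only: the struct
and function names are hypothetical, and every size is passed in as a
parameter because the real values (sizeof(struct skb_shared_info),
LL_ALLOCATED_SPACE(dev), po->tp_hdrlen) depend on the kernel build and
the device.

```c
/* Inputs to the computation; any concrete values are illustrative. */
struct ring_sizes {
	long frame_size;      /* po->tx_ring.frame_size */
	long tp_hdrlen;       /* po->tp_hdrlen */
	long sockaddr_ll_len; /* sizeof(struct sockaddr_ll) */
	long shinfo_len;      /* sizeof(struct skb_shared_info) */
	long ll_allocated;    /* LL_ALLOCATED_SPACE(dev) */
	long mtu;             /* dev->mtu */
	long reserve;         /* "reserve" from the surrounding code */
};

/* Mirrors the "if (size_max > dev->mtu + reserve)" clamp. */
static long clamp_to_mtu(long size_max, const struct ring_sizes *s)
{
	long limit = s->mtu + s->reserve;

	return size_max > limit ? limit : size_max;
}

/* size_max as computed before the patch. */
long size_max_current(const struct ring_sizes *s)
{
	long size_max = s->frame_size - s->shinfo_len - s->tp_hdrlen
			- s->ll_allocated - s->sockaddr_ll_len;

	return clamp_to_mtu(size_max, s);
}

/* size_max as computed with the patch applied. */
long size_max_proposed(const struct ring_sizes *s)
{
	long size_max = s->frame_size - (s->tp_hdrlen - s->sockaddr_ll_len);

	return clamp_to_mtu(size_max, s);
}
```

With made-up inputs of a 1600-byte frame, tp_hdrlen 32, sockaddr_ll 20,
skb_shared_info 320, link-layer space 32, MTU 1500 and reserve 14, the
current formula yields 1196 bytes, while the proposed one yields 1588
before the clamp and 1514 after it. In that example a full-MTU packet
fits only with the proposed formula.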

Gabor

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index f9f7177..745a016 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -985,10 +985,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 		goto out_put;
 
 	size_max = po->tx_ring.frame_size
-		- sizeof(struct skb_shared_info)
-		- po->tp_hdrlen
-		- LL_ALLOCATED_SPACE(dev)
-		- sizeof(struct sockaddr_ll);
+		- (po->tp_hdrlen - sizeof(struct sockaddr_ll));
 
 	if (size_max > dev->mtu + reserve)
 		size_max = dev->mtu + reserve;

-- 
     ---------------------------------------------------------
     MTA SZTAKI Computer and Automation Research Institute
                Hungarian Academy of Sciences
     ---------------------------------------------------------

^ permalink raw reply related

* Re: Real networking namespace
From: Paul Moore @ 2009-10-09 22:12 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Stephen Hemminger, linux-security-module, Al Viro, netdev,
	James Morris
In-Reply-To: <1255106692.2182.224.camel@moss-pluto.epoch.ncsc.mil>

On Friday 09 October 2009 12:44:52 pm Stephen Smalley wrote:
> On Fri, 2009-10-09 at 12:37 -0400, Stephen Smalley wrote:
> > On Fri, 2009-10-09 at 08:38 -0700, Stephen Hemminger wrote:
> > > The existing networking namespace model is unattractive for what I
> > > want, has anyone investigated better alternatives?
> > >
> > > I would like to be able to allow access to a network interface and
> > > associated objects (routing tables etc), to be controlled by Mandatory
> > > Access Control API's. I.e grant access to eth0 and to only certain
> > > processes.  Some the issues with the existing models are:
> > >   * eth0 and associated objects don't really exist in filesystem so
> > >     not subject to LSM style control (SeLinux/SMACK/TOMOYO)

As Stephen points out, SELinux does have the ability to assign security labels
to network interfaces; check out the 'semanage' command.  A while back I wrote
up something about the SELinux network "ingress/egress" access controls:

 * http://paulmoore.livejournal.com/2128.html

Smack doesn't support controlling network access at the interface level, but
that is due to a Smack design decision and not an inherent functionality gap
in the LSM.  TOMOYO is currently working on improved network access controls
(see patches posted earlier this week); I haven't had a chance to review them
yet, so I don't know the state of TOMOYO's network access controls.

> > >   * network namespaces do not allow object to exist in multiple
> > > namespaces. The current model is more restrictive than chroot jails. At
> > > least with chroot, put filesystem objects in multiple jails.

Perhaps I don't fully understand what you are getting at here, but I don't 
think this should be an issue with a flexible LSM.

> > Is there something that prevents you from using the existing SELinux
> > network access controls?  netif is a security class governed by SELinux
> > policy, and routing table operations would be covered by the SELinux
> > checks on netlink_route_socket.  SELinux uses a combination of LSM hooks
> > and netfilter hooks to mediate network operations.
> 
> Also, depending on what you want to do, SECMARK may be useful to you.
> That allows you to mark packets with security contexts via iptables, and
> then use SELinux policy to control their flow.
> http://paulmoore.livejournal.com/4281.html
> http://james-morris.livejournal.com/11010.html

While we're at it, a few more links ... here is a presentation from last year
on Linux's labeled networking capabilities (which touches on a lot of your
questions):

 * http://paulmoore.livejournal.com/964.html

... and there is a video too:

 * http://paulmoore.livejournal.com/1329.html

-- 
paul moore
linux @ hp

^ permalink raw reply

* Re: [PATCH 0/8] SECURITY ISSUE with connector
From: Greg KH @ 2009-10-09 22:25 UTC (permalink / raw)
  To: Philipp Reisner
  Cc: linux-fbdev-devel, netdev, linux-kernel, dm-devel,
	Evgeniy Polyakov, Andrew Morton, David S. Miller
In-Reply-To: <1254487211-11810-1-git-send-email-philipp.reisner@linbit.com>

On Fri, Oct 02, 2009 at 02:40:03PM +0200, Philipp Reisner wrote:
> Affected: All code that uses connector, in kernel and out of mainline
> 
> The connector, as it is today, does not allow the in kernel receiving
> parts to do any checks on privileges of a message's sender.
> 
> I know, there are not many out there that like connector, but as
> long as it is in the kernel, we have to fix the security issues it has!
> 
> Please either drop connector, or someone who feels a bit responsible
> and has our beloved dictator's blessing, PLEASE PLEASE PLEASE take 
> this into your tree, and send the pull request to Linus.
> 
> Patches 1 to 4 are already Acked-by Evgeny, the connector's maintainer.
> Patches 5 to 7 are the obvious fixes to the connector user's code.

These don't apply to the 2.6.31-stable tree at all.

Could you provide them backported to that tree if you want to see them
go into a .31-stable release?

thanks,

greg k-h

^ permalink raw reply

