Netdev List

* Re: [PATCH 0/15] RFC: create drivers/net/legacy for ISA, EISA, MCA drivers
From: Joe Perches @ 2010-10-29 22:08 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: davem, netdev, Jeff Kirsher
In-Reply-To: <4CCB3BF1.7070000@windriver.com>

On Fri, 2010-10-29 at 17:26 -0400, Paul Gortmaker wrote:
> On 10-10-28 09:48 PM, Joe Perches wrote:
> > On Thu, 2010-10-28 at 21:19 -0400, Paul Gortmaker wrote:
> >> The drivers/net dir has a lot of files - originally there were
> >> no subdirs, but at least now subdirs are being used effectively.
> >> But the original drivers from 10+ years ago are still right
> >> there at the top.  This series creates a drivers/net/legacy dir.
> > I like this idea.
> > I suggest a bit of a further grouping by using a
> > drivers/net/ethernet directory and putting those
> > legacy drivers in a new subdirectory
> > drivers/net/ethernet/legacy.
> That is a substantially larger change, since you'd now be
> relocating nearly every remaining driver, i.e. all the
> relatively modern 100M and GigE drivers.

Files do not need immediate renames.

Renames could happen when the appropriate maintainer
wants to or gets coerced to conform to some new
file layout standard.

I had submitted a related RFC patch:

https://patchwork.kernel.org/patch/244641/

and then had some off list discussions
with Jeff Kirsher from Intel.

Perhaps Jeff will chime in.

> Plus what do you
> do with the sb1000 - create drivers/cablemodem/legacy
> just for one file?

I never looked at that particular driver before.
Maybe.  I don't have a strong opinion.  Leaving
it where it is might be OK.

> Or the ethernet drivers already in
> existing subdirs, like arm and pcmcia -- do we move those?

Maybe.  If there's no demand, there's no absolute need to
move it at all.  I think a reasonable goal is to have some
sensible and consistent file layout scheme though.

There are arch specific directories under various drivers/...
so I don't see a need to move directories like drivers/net/arm
or drivers/s390.

> With this, I tried to aim for a significant gain (close to 1/3
> less files) within what I felt was a reasonable sized change
> set that had a chance of getting an overall OK from folks.
> Giant "flag-day" type mammoth changesets are a PITA for all.

I believe there's no need for a flag-day.
File renames could happen gradually or not at all.

^ permalink raw reply

* [PATCH 1/3] offloading: Make scatter/gather more tolerant of vlans.
From: Jesse Gross @ 2010-10-29 22:14 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ben Hutchings

When checking if it is necessary to linearize a packet, we currently
use vlan_features if the packet contains either an in-band or out-
of-band vlan tag.  However, in-band tags aren't special in any way
for scatter/gather since they are part of the packet buffer and are
simply more data to DMA.  Therefore, only use vlan_features for out-
of-band tags, which could potentially have some interaction with
scatter/gather.

Signed-off-by: Jesse Gross <jesse@nicira.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
---
 net/core/dev.c |   19 ++++++++++++-------
 1 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 35dfb83..d21d655 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1977,15 +1977,20 @@ static inline void skb_orphan_try(struct sk_buff *skb)
 static inline int skb_needs_linearize(struct sk_buff *skb,
 				      struct net_device *dev)
 {
-	int features = dev->features;
+	if (skb_is_nonlinear(skb)) {
+		int features = dev->features;

-	if (skb->protocol == htons(ETH_P_8021Q) || vlan_tx_tag_present(skb))
-		features &= dev->vlan_features;
+		if (vlan_tx_tag_present(skb))
+			features &= dev->vlan_features;

-	return skb_is_nonlinear(skb) &&
-	       ((skb_has_frag_list(skb) && !(features & NETIF_F_FRAGLIST)) ||
-		(skb_shinfo(skb)->nr_frags && (!(features & NETIF_F_SG) ||
-					      illegal_highdma(dev, skb))));
+		return (skb_has_frag_list(skb) &&
+			!(features & NETIF_F_FRAGLIST)) ||
+			(skb_shinfo(skb)->nr_frags &&
+			(!(features & NETIF_F_SG) ||
+			illegal_highdma(dev, skb)));
+	}
+
+	return 0;
 }

 int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
-- 
1.7.1
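
Editor's sketch: the check described in this patch can be modelled in userspace as below. This is only an illustration under stated assumptions, not the kernel code: the F_SG and F_FRAGLIST values, struct pkt_model and struct dev_model are hypothetical stand-ins for the NETIF_F_* bits, the skb and the net_device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for NETIF_F_SG / NETIF_F_FRAGLIST (values are illustrative). */
#define F_SG       0x1u
#define F_FRAGLIST 0x2u

struct pkt_model {
    bool nonlinear;        /* models skb_is_nonlinear() */
    bool has_frag_list;    /* models skb_has_frag_list() */
    unsigned nr_frags;     /* models skb_shinfo(skb)->nr_frags */
    bool oob_vlan_tag;     /* models vlan_tx_tag_present() */
    bool highdma_illegal;  /* models illegal_highdma() */
};

struct dev_model {
    unsigned features;
    unsigned vlan_features;
};

static bool needs_linearize(const struct pkt_model *p, const struct dev_model *d)
{
    unsigned features;

    assert(p && d);

    /* A linear packet never needs linearizing. */
    if (!p->nonlinear)
        return false;

    features = d->features;
    /* Only an out-of-band tag narrows the feature set; in-band tags are just data. */
    if (p->oob_vlan_tag)
        features &= d->vlan_features;

    return (p->has_frag_list && !(features & F_FRAGLIST)) ||
           (p->nr_frags && (!(features & F_SG) || p->highdma_illegal));
}
```

With vlan_features set to 0, a fragmented packet needs linearizing only once the out-of-band tag flag is set in the model.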

^ permalink raw reply related

* [PATCH 2/3] offloading: Support multiple vlan tags in GSO.
From: Jesse Gross @ 2010-10-29 22:14 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ben Hutchings
In-Reply-To: <1288390495-28923-1-git-send-email-jesse@nicira.com>

We assume that hardware TSO can't support multiple levels of vlan tags
but we allow it to be done.  Therefore, enable GSO to parse these tags
so we can fallback to software.

Signed-off-by: Jesse Gross <jesse@nicira.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
---
 net/core/dev.c |   12 +++++++-----
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index d21d655..8bdda70 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1794,16 +1794,18 @@ struct sk_buff *skb_gso_segment(struct sk_buff *skb, int features)
 	struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT);
 	struct packet_type *ptype;
 	__be16 type = skb->protocol;
+	int vlan_depth = ETH_HLEN;
 	int err;
 
-	if (type == htons(ETH_P_8021Q)) {
-		struct vlan_ethhdr *veh;
+	while (type == htons(ETH_P_8021Q)) {
+		struct vlan_hdr *vh;
 
-		if (unlikely(!pskb_may_pull(skb, VLAN_ETH_HLEN)))
+		if (unlikely(!pskb_may_pull(skb, vlan_depth + VLAN_HLEN)))
 			return ERR_PTR(-EINVAL);
 
-		veh = (struct vlan_ethhdr *)skb->data;
-		type = veh->h_vlan_encapsulated_proto;
+		vh = (struct vlan_hdr *)(skb->data + vlan_depth);
+		type = vh->h_vlan_encapsulated_proto;
+		vlan_depth += VLAN_HLEN;
 	}
 
 	skb_reset_mac_header(skb);
-- 
1.7.1
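
Editor's sketch: the loop added by this patch walks stacked 802.1Q headers until it reaches a non-VLAN ethertype. The same walk over a raw byte buffer might look like the following. The function name parse_vlan_stack and the *_MODEL constants are made up for this illustration; the length check plays the role of pskb_may_pull().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN_MODEL  14       /* dst MAC + src MAC + ethertype */
#define VLAN_HLEN_MODEL 4        /* TCI + encapsulated ethertype */
#define ETHERTYPE_8021Q 0x8100u

/* Read a big-endian 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Walk stacked 802.1Q tags in an Ethernet frame.
 * Returns 0 and stores the innermost ethertype and the tag count,
 * or -1 if the frame is shorter than a header it announces.
 */
static int parse_vlan_stack(const uint8_t *frame, size_t len,
                            uint16_t *inner_type, unsigned *ntags)
{
    size_t depth = ETH_HLEN_MODEL;
    uint16_t type;

    assert(frame && inner_type && ntags);

    if (len < ETH_HLEN_MODEL)
        return -1;

    type = rd16(frame + 12);
    *ntags = 0;

    while (type == ETHERTYPE_8021Q) {
        /* Make sure the next VLAN header is really there before reading it. */
        if (len < depth + VLAN_HLEN_MODEL)
            return -1;
        /* The VLAN header at 'depth' is TCI (2 bytes), then the next ethertype. */
        type = rd16(frame + depth + 2);
        depth += VLAN_HLEN_MODEL;
        (*ntags)++;
    }

    *inner_type = type;
    return 0;
}
```

Tracking the running offset (vlan_depth in the patch, depth here) is what lets the same code handle one tag or several.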


^ permalink raw reply related

* [PATCH 3/3] offloading: Force software GSO for multiple vlan tags.
From: Jesse Gross @ 2010-10-29 22:14 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ben Hutchings
In-Reply-To: <1288390495-28923-1-git-send-email-jesse@nicira.com>

We currently use vlan_features to check for TSO support if there is
a vlan tag.  However, it's quite likely that the NIC is not able to
do TSO when there is an arbitrary number of tags.  Therefore if there
is more than one tag (in-band or out-of-band), fall back to software
emulation.

Signed-off-by: Jesse Gross <jesse@nicira.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
---
 include/linux/netdevice.h |    7 +++----
 net/core/dev.c            |   16 ++++++++++++++++
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 072652d..980c752 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2234,6 +2234,8 @@ unsigned long netdev_fix_features(unsigned long features, const char *name);
 void netif_stacked_transfer_operstate(const struct net_device *rootdev,
 					struct net_device *dev);
 
+int netif_get_vlan_features(struct sk_buff *skb, struct net_device *dev);
+
 static inline int net_gso_ok(int features, int gso_type)
 {
 	int feature = gso_type << NETIF_F_GSO_SHIFT;
@@ -2249,10 +2251,7 @@ static inline int skb_gso_ok(struct sk_buff *skb, int features)
 static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb)
 {
 	if (skb_is_gso(skb)) {
-		int features = dev->features;
-
-		if (skb->protocol == htons(ETH_P_8021Q) || skb->vlan_tci)
-			features &= dev->vlan_features;
+		int features = netif_get_vlan_features(skb, dev);
 
 		return (!skb_gso_ok(skb, features) ||
 			unlikely(skb->ip_summed != CHECKSUM_PARTIAL));
diff --git a/net/core/dev.c b/net/core/dev.c
index 8bdda70..8d74988 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1969,6 +1969,22 @@ static inline void skb_orphan_try(struct sk_buff *skb)
 	}
 }
 
+int netif_get_vlan_features(struct sk_buff *skb, struct net_device *dev)
+{
+	__be16 protocol = skb->protocol;
+
+	if (protocol == htons(ETH_P_8021Q)) {
+		struct vlan_ethhdr *veh = (struct vlan_ethhdr *)skb->data;
+		protocol = veh->h_vlan_encapsulated_proto;
+	} else if (!skb->vlan_tci)
+		return dev->features;
+
+	if (protocol != htons(ETH_P_8021Q))
+		return dev->features & dev->vlan_features;
+	else
+		return 0;
+}
+
 /*
  * Returns true if either:
  *	1. skb has frag_list and the device doesn't support FRAGLIST, or
-- 
1.7.1
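
Editor's sketch: the branches of netif_get_vlan_features() as posted can be mirrored in a small pure function. The name vlan_features_model and its parameters are hypothetical; 'protocol' stands for skb->protocol, 'encap_protocol' for the ethertype behind the first in-band tag, and 'oob_tag' for a non-zero skb->vlan_tci. Values are in host byte order for readability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETHERTYPE_8021Q 0x8100u

static unsigned vlan_features_model(unsigned dev_features,
                                    unsigned vlan_features,
                                    uint16_t protocol,
                                    uint16_t encap_protocol,
                                    bool oob_tag)
{
    assert((vlan_features & ~0u) == vlan_features); /* trivial sanity check */

    if (protocol == ETHERTYPE_8021Q)
        protocol = encap_protocol;      /* look behind the first in-band tag */
    else if (!oob_tag)
        return dev_features;            /* no tag at all: full feature set */

    if (protocol != ETHERTYPE_8021Q)
        return dev_features & vlan_features; /* a single level of tagging */

    return 0;                           /* another tag follows: software GSO */
}
```

Returning 0 means no hardware feature matches, so netif_needs_gso() reports that software segmentation is required.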


^ permalink raw reply related

* Re: [PATCH] ipv4: Flush per-ns routing cache more sanely.
From: David Miller @ 2010-10-29 22:21 UTC (permalink / raw)
  To: daniel.lezcano; +Cc: eric.dumazet, ebiederm, netdev
In-Reply-To: <4CCB3F94.20709@free.fr>

From: Daniel Lezcano <daniel.lezcano@free.fr>
Date: Fri, 29 Oct 2010 23:41:40 +0200

> do you plan to send another version of this patch ? Or can I test it
> as it is ?  Without removing a network device, I can check the
> routine, no ?

I'm backlogged with the current merge window work and fixing bugs,
so feel free to test what I posted if you have the time.

I'll post an updated version when time permits.

Thanks.

^ permalink raw reply

* [PATCH  kernel 2.6.36-git10] pcnet_cs: add new_id
From: Ken Kawasaki @ 2010-10-29 22:17 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20100829074501.946ebcb8.ken_kawasaki@spring.nifty.jp>


pcnet_cs:
    add new_id: "corega Ether CF-TD" 10Base-T PCMCIA card.


Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>

---

--- linux-2.6.36-git10/drivers/net/pcmcia/pcnet_cs.c.orig	2010-10-29 22:11:43.000000000 +0900
+++ linux-2.6.36-git10/drivers/net/pcmcia/pcnet_cs.c	2010-10-29 22:15:30.000000000 +0900
@@ -1536,6 +1536,7 @@ static struct pcmcia_device_id pcnet_ids
 	PCMCIA_DEVICE_PROD_ID12("COMPU-SHACK", "FASTline PCMCIA 10/100 Fast-Ethernet", 0xfa2e424d, 0x3953d9b9),
 	PCMCIA_DEVICE_PROD_ID12("CONTEC", "C-NET(PC)C-10L", 0x21cab552, 0xf6f90722),
 	PCMCIA_DEVICE_PROD_ID12("corega", "FEther PCC-TXF", 0x0a21501a, 0xa51564a2),
+	PCMCIA_DEVICE_PROD_ID12("corega", "Ether CF-TD", 0x0a21501a, 0x6589340a),
 	PCMCIA_DEVICE_PROD_ID12("corega K.K.", "corega EtherII PCC-T", 0x5261440f, 0xfa9d85bd),
 	PCMCIA_DEVICE_PROD_ID12("corega K.K.", "corega EtherII PCC-TD", 0x5261440f, 0xc49bd73d),
 	PCMCIA_DEVICE_PROD_ID12("Corega K.K.", "corega EtherII PCC-TD", 0xd4fdcbd8, 0xc49bd73d),

^ permalink raw reply

* [PATCH 5/6] netdev: can: Change mail address of Hans J. Koch
From: Hans J. Koch @ 2010-10-29 22:33 UTC (permalink / raw)
  To: LKML; +Cc: netdev
In-Reply-To: <20101029221231.GA4331@local>

My old mail address doesn't exist anymore. This changes all occurrences
to my new address.

Signed-off-by: Hans J. Koch <hjk@hansjkoch.de>
---
 drivers/net/can/at91_can.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/can/at91_can.c b/drivers/net/can/at91_can.c
index cee98fa..7ef83d0 100644
--- a/drivers/net/can/at91_can.c
+++ b/drivers/net/can/at91_can.c
@@ -1,7 +1,7 @@
 /*
  * at91_can.c - CAN network driver for AT91 SoC CAN controller
  *
- * (C) 2007 by Hans J. Koch <hjk@linutronix.de>
+ * (C) 2007 by Hans J. Koch <hjk@hansjkoch.de>
  * (C) 2008, 2009, 2010 by Marc Kleine-Budde <kernel@pengutronix.de>
  *
  * This software may be distributed under the terms of the GNU General
-- 
1.7.1

^ permalink raw reply related

* Re: [PATCH 0/15] RFC: create drivers/net/legacy for ISA, EISA, MCA drivers
From: Jeff Kirsher @ 2010-10-30  0:01 UTC (permalink / raw)
  To: Joe Perches; +Cc: Paul Gortmaker, davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <1288390127.28828.225.camel@Joe-Laptop>

[-- Attachment #1: Type: text/plain, Size: 2891 bytes --]

On Fri, 2010-10-29 at 15:08 -0700, Joe Perches wrote:
> On Fri, 2010-10-29 at 17:26 -0400, Paul Gortmaker wrote:
> > On 10-10-28 09:48 PM, Joe Perches wrote:
> > > On Thu, 2010-10-28 at 21:19 -0400, Paul Gortmaker wrote:
> > >> The drivers/net dir has a lot of files - originally there were
> > >> no subdirs, but at least now subdirs are being used effectively.
> > >> But the original drivers from 10+ years ago are still right
> > >> there at the top.  This series creates a drivers/net/legacy dir.
> > > I like this idea.
> > > I suggest a bit of a further grouping by using a
> > > drivers/net/ethernet directory and putting those
> > > legacy drivers in a new subdirectory
> > > drivers/net/ethernet/legacy.
> > That is a substantially larger change, since you'd now be
> > relocating nearly every remaining driver, i.e. all the
> > relatively modern 100M and GigE drivers.
> 

I am not particularly a fan of making a "legacy" directory and moving
old drivers into it, because the term is very subjective.  A definition
like "drivers which are X years old and not used much" is vague, and
depending on who you ask you would get varying results.  Defining
legacy as all ISA, EISA and MCA drivers (not based on their use)
would be better.

But if a legacy directory was to be made, I like Joe's suggestion of
drivers/net/ethernet/legacy.

> Files do not need immediate renames.
> 
> Renames could happen when the appropriate maintainer
> wants to or gets coerced to conform to some new
> file layout standard.
> 
> I had submitted a related RFC patch:
> 
> https://patchwork.kernel.org/patch/244641/
> 
> and then had some off list discussions
> with Jeff Kirsher from Intel.
> 
> Perhaps Jeff will chime in.
> 
> > Plus what do you
> > do with the sb1000 - create drivers/cablemodem/legacy
> > just for one file?
> 
> I never looked at that particular driver before.
> Maybe.  I don't have a strong opinion.  Leaving
> it where it is might be OK.
> 
> > Or the ethernet drivers already in
> > existing subdirs, like arm and pcmcia -- do we move those?
> 
> Maybe.  If there's no demand, there's no absolute need to
> move it at all.  I think a reasonable goal is to have some
> sensible and consistent file layout scheme though.
> 
> There are arch specific directories under various drivers/...
> so I don't see a need to move directories like drivers/net/arm
> or drivers/s390.

I agree with Joe.

> 
> > With this, I tried to aim for a significant gain (close to 1/3
> > less files) within what I felt was a reasonable sized change
> > set that had a chance of getting an overall OK from folks.
> > Giant "flag-day" type mammoth changesets are a PITA for all.
> 
> I believe there's no need for a flag-day.
> File renames could happen gradually or not at all.
> 
> 

Again I agree with Joe.


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* bonding: propagation of offload settings
From: Simon Horman @ 2010-10-30  2:54 UTC (permalink / raw)
  To: netdev; +Cc: Jay Vosburgh

Hi,

I am wondering what the desired behaviour is for propagating
offload settings between master and slaves when the settings
are modified using ethtool.

I have observed the following (using Linus' latest tree, 2.6.36+)

#1 Disabling gro on a slave device propagates to the master
   but not other slaves

   bond1: generic-receive-offload: on
   eth1: generic-receive-offload: on
   eth4: generic-receive-offload: on

   # ethtool -K eth1 gro off

   bond1: generic-receive-offload: off
   eth1: generic-receive-offload: off
   eth4: generic-receive-offload: on

   This seems to occur regardless of whether the slave is the
   active slave or not.

#2 No other propagation of settings occurs


It seems to me that from a user point of view it may make more sense to:

a) propagate settings from the master to the slaves and;
b) possibly disallow setting the slaves directly
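
Editor's sketch: proposals (a) and (b) could be modelled as below. This is not how the bonding driver behaves or is implemented; struct bond_model, struct slave_model and the choice of -EBUSY for a refused change are all assumptions made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLAVES 8

struct slave_model {
    bool gro;
    bool enslaved;
};

struct bond_model {
    bool gro;
    struct slave_model *slaves[MAX_SLAVES];
    size_t nslaves;
};

/* (a) A change made on the master is pushed to every slave. */
static void bond_set_gro(struct bond_model *b, bool on)
{
    assert(b && b->nslaves <= MAX_SLAVES);

    b->gro = on;
    for (size_t i = 0; i < b->nslaves; i++)
        b->slaves[i]->gro = on;
}

/* (b) A direct change on an enslaved device is refused. */
static int slave_set_gro(struct slave_model *s, bool on)
{
    assert(s);

    if (s->enslaved)
        return -EBUSY;   /* hypothetical error code */
    s->gro = on;
    return 0;
}
```

Under this model the master and its slaves cannot drift apart, at the cost of removing per-slave control.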


^ permalink raw reply

* Re: [ovs-dev] Flow Control and Port Mirroring
From: Simon Horman @ 2010-10-30  2:59 UTC (permalink / raw)
  To: Jesse Gross; +Cc: dev, Michael S. Tsirkin, kvm, virtualization, netdev
In-Reply-To: <AANLkTi=QXBi4wmJS1TY0P=s+11Vjp0v=AfWOdFb4vrCj@mail.gmail.com>

[ CCed VHOST contacts ]

On Thu, Oct 28, 2010 at 01:22:02PM -0700, Jesse Gross wrote:
> On Thu, Oct 28, 2010 at 4:54 AM, Simon Horman <horms@verge.net.au> wrote:
> > My reasoning is that in the non-mirroring case the guest is
> > limited by the external interface through which the packets
> > eventually flow - that is 1Gbit/s. But in the mirrored case either
> > there is no flow control or the flow control is acting on the
> > rate of dummy0, which is essentially infinite.
> >
> > Before investigating this any further I wanted to ask if
> > this behaviour is intentional.
> 
> It's not intentional but I can take a guess at what is happening.
> 
> When we send the packet to a mirror, the skb is cloned but only the
> original skb is charged to the sender.  If the original packet is
> delivered to localhost then it will be freed quickly and no longer
> accounted for, despite the fact that the "real" packet is still
> sitting in the transmit queue on the NIC.  The UDP stack will then
> send the next packet, limited only by the speed of the CPU.

That would explain what I have observed.

> Normally, this would be tracked by accounting for the memory charged
> to the socket.  However, I know that Xen tracks whether the actual
> pages of memory have been freed, which should avoid this problem since
> the memory won't be released until the last packet has been sent.  I
> don't know what KVM virtio does but I'm guessing that it is similar to
> the former, since this problem is occurring.

I am also familiar with how Xen tracks pages but less sure of the
virtio side of things.

> While it would be easy to charge the socket for all clones, I also
> want to be careful about over accounting of the same data, leading to
> a very small effective socket buffer.

Agreed, we don't want to see over-charging.
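
Editor's sketch: the accounting gap Jesse describes can be shown with a toy model in which only the original buffer is charged to the sending socket. All types and function names here (sock_model, skb_model, skb_clone_uncharged, ...) are invented for the illustration and do not correspond to kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

struct sock_model { size_t wmem; };      /* bytes charged to the sender */
struct data_model { unsigned refs; };    /* shared packet data */

struct skb_model {
    struct data_model *data;
    struct sock_model *owner;            /* NULL for an uncharged clone */
    size_t size;
};

/* Create the original buffer and charge it to the socket. */
static struct skb_model skb_new(struct data_model *d, struct sock_model *sk,
                                size_t size)
{
    struct skb_model s = { d, sk, size };

    assert(d && sk);
    d->refs = 1;
    sk->wmem += size;
    return s;
}

/* Clone for the mirror: shares the data but is not charged to anyone. */
static struct skb_model skb_clone_uncharged(const struct skb_model *orig)
{
    struct skb_model c = { orig->data, NULL, orig->size };

    orig->data->refs++;
    return c;
}

/* Free one buffer: uncharge if it was charged, then drop the data reference. */
static void skb_free(struct skb_model *s)
{
    if (s->owner)
        s->owner->wmem -= s->size;
    s->data->refs--;
}
```

Once the original is freed, the socket's charge drops to zero while the clone still holds the data, so the sender sees no back-pressure from the slower path.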


^ permalink raw reply

* [PATCH] af_unix: unix_write_space() use keyed wakeups
From: Eric Dumazet @ 2010-10-30  6:44 UTC (permalink / raw)
  To: Alban Crequy, Davide Libenzi
  Cc: David S. Miller, Stephen Hemminger, Cyrill Gorcunov,
	Alexey Dobriyan, netdev, linux-kernel, Pauli Nieminen,
	Rainer Weikusat
In-Reply-To: <1288380431.2680.3.camel@edumazet-laptop>

Le vendredi 29 octobre 2010 à 21:27 +0200, Eric Dumazet a écrit :
> Le vendredi 29 octobre 2010 à 19:18 +0100, Alban Crequy a écrit :
> > Hi,
> > 
> > When a process calls poll or select, the kernel calls (struct
> > file_operations)->poll on every file descriptor and returns a mask of
> > events which are ready. If the process is only interested in POLLIN
> > events, the mask is still computed for POLLOUT and it can be expensive.
> > For example, on Unix datagram sockets, a process running poll() with
> > POLLIN will wake up when the remote end calls read(). This is a
> > performance regression introduced when fixing another bug by
> > 3c73419c09a5ef73d56472dbfdade9e311496e9b and
> > ec0d215f9420564fc8286dcf93d2d068bb53a07e.
> > 

unix_write_space() doesn’t yet use the wake_up_interruptible_sync_poll()
to restrict wakeups to only POLLOUT | POLLWRNORM | POLLWRBAND interested
sleepers. Same for unix_dgram_recvmsg()

In your pathological case, each time the other process calls
unix_dgram_recvmsg(), it loops on 800 pollwake() /
default_wake_function() / try_to_wake_up(), which are obviously
expensive, as you pointed out with your test program, carefully designed
to show the false sharing and O(N^2) effect :)

Once do_select() thread can _really_ block, the false sharing problem
disappears for good.

We still loop on 800 items, on each wake_up_interruptible_sync_poll()
call, so maybe we want to optimize this later, adding a global key,
ORing all items' keys. I don't think it's worth the added complexity, given
the biased usage of your program (800 'listeners' to one event). Is it a
real life scenario ?

Thanks

[PATCH] af_unix: use keyed wakeups

Instead of waking up all sleepers, use wake_up_interruptible_sync_poll()
to wake up only those interested in writing to the socket.

This patch is a specialization of commit 37e5540b3c9d (epoll keyed
wakeups: make sockets use keyed wakeups).

On a test program provided by Alban Crequy :

Before:
real    0m3.101s
user    0m0.000s
sys     0m6.104s

After:

real	0m0.211s
user	0m0.000s
sys	0m0.208s

Reported-by: Alban Crequy <alban.crequy@collabora.co.uk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
---
 net/unix/af_unix.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 3c95304..f33c595 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -316,7 +316,8 @@ static void unix_write_space(struct sock *sk)
 	if (unix_writable(sk)) {
 		wq = rcu_dereference(sk->sk_wq);
 		if (wq_has_sleeper(wq))
-			wake_up_interruptible_sync(&wq->wait);
+			wake_up_interruptible_sync_poll(&wq->wait,
+				POLLOUT | POLLWRNORM | POLLWRBAND);
 		sk_wake_async(sk, SOCK_WAKE_SPACE, POLL_OUT);
 	}
 	rcu_read_unlock();
@@ -1710,7 +1711,8 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
 		goto out_unlock;
 	}
 
-	wake_up_interruptible_sync(&u->peer_wait);
+	wake_up_interruptible_sync_poll(&u->peer_wait,
+					POLLOUT | POLLWRNORM | POLLWRBAND);
 
 	if (msg->msg_name)
 		unix_copy_addr(msg, skb->sk);
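
Editor's sketch: the effect of a keyed wakeup can be modelled with a plain array of sleepers, each recording the events it cares about. EV_IN, EV_OUT and struct waiter are stand-ins invented for this example. The model also assumes every sleeper registered a key; it does not cover sleepers that ignore the key.

```c
#include <assert.h>
#include <stddef.h>

#define EV_IN  0x1u   /* stands in for read interest (POLLIN-like) */
#define EV_OUT 0x4u   /* stands in for write interest (POLLOUT-like) */

struct waiter {
    unsigned key;     /* events this sleeper is interested in */
    int woken;        /* number of times it was woken */
};

/* Unkeyed wakeup: every sleeper on the queue is woken. */
static unsigned wake_all(struct waiter *q, size_t n)
{
    unsigned count = 0;

    assert(q || n == 0);
    for (size_t i = 0; i < n; i++) {
        q[i].woken++;
        count++;
    }
    return count;
}

/* Keyed wakeup: the queue is still scanned, but only matching sleepers wake. */
static unsigned wake_keyed(struct waiter *q, size_t n, unsigned events)
{
    unsigned count = 0;

    assert(q || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (q[i].key & events) {
            q[i].woken++;
            count++;
        }
    }
    return count;
}
```

As noted in the mail, the scan still touches every entry; what the key saves is the cost of waking sleepers that would go straight back to sleep.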



^ permalink raw reply related

* [PATCH] isdn: mISDN: socket: fix information leak to userland
From: Vasiliy Kulikov @ 2010-10-30  9:04 UTC (permalink / raw)
  To: kernel-janitors
  Cc: Karsten Keil, Arnaldo Carvalho de Melo, David S. Miller,
	Tejun Heo, Eric Paris, netdev, linux-kernel

Structure mISDN_devinfo is copied to userland with the field "name"
that has the last elements uninitialized.  It leads to leaking of
contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
---
 Compile tested.

 drivers/isdn/mISDN/socket.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index 3232206..7446d8b 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -392,6 +392,7 @@ data_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 		if (dev) {
 			struct mISDN_devinfo di;
 
+			memset(&di, 0, sizeof(di));
 			di.id = dev->id;
 			di.Dprotocols = dev->Dprotocols;
 			di.Bprotocols = dev->Bprotocols | get_all_Bprotocols();
@@ -672,6 +673,7 @@ base_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 		if (dev) {
 			struct mISDN_devinfo di;
 
+			memset(&di, 0, sizeof(di));
 			di.id = dev->id;
 			di.Dprotocols = dev->Dprotocols;
 			di.Bprotocols = dev->Bprotocols | get_all_Bprotocols();
-- 
1.7.0.4
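
Editor's sketch: the pattern behind this fix is to zero a stack structure before filling it, so that bytes a string copy never touches do not carry old stack contents. The struct layout and helper names below are invented for the example and are not the real mISDN_devinfo definition.

```c
#include <assert.h>
#include <string.h>

struct devinfo_model {
    unsigned id;
    char name[20];
};

/* Zero the whole structure first, then set the fields. */
static void fill_devinfo(struct devinfo_model *di, unsigned id, const char *name)
{
    size_t n;

    assert(di && name);

    memset(di, 0, sizeof(*di));
    di->id = id;

    /* Copy only the string bytes; the memset already supplied the terminator. */
    n = strlen(name);
    if (n >= sizeof(di->name))
        n = sizeof(di->name) - 1;
    memcpy(di->name, name, n);
}

/* Return 1 if every byte of name[] after the string is zero. */
static int name_tail_is_zero(const struct devinfo_model *di)
{
    for (size_t i = strlen(di->name); i < sizeof(di->name); i++)
        if (di->name[i] != 0)
            return 0;
    return 1;
}
```

Without the memset, the bytes after the terminator (and any padding in the structure) would keep whatever was on the stack before.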

^ permalink raw reply related

* [PATCH] af_unix: optimize unix_dgram_poll()
From: Eric Dumazet @ 2010-10-30  9:53 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Alban Crequy, David S. Miller, Stephen Hemminger, Cyrill Gorcunov,
	Alexey Dobriyan, netdev, Linux Kernel Mailing List,
	Pauli Nieminen, Rainer Weikusat
In-Reply-To: <alpine.DEB.2.00.1010291339180.8517@davide-lnx1>

Le vendredi 29 octobre 2010 à 13:46 -0700, Davide Libenzi a écrit :

> Also, why not use the existing wait->key instead of adding a poll2()?

Indeed, if wait is not null, we have in wait->key the interest of
poller. If a particular poll() function is expensive, it can test these
bits.

Thanks !

Note: I chose the 'goto skip_write' to make this patch really obvious.

[PATCH] af_unix: optimize unix_dgram_poll()

unix_dgram_poll() is pretty expensive to check POLLOUT status, because
it has to lock the socket to get its peer, take a reference on the peer
to check its receive queue status, and queue another poll_wait on
peer_wait. This all can be avoided if the process calling
unix_dgram_poll() is not interested in POLLOUT status. It makes
unix_dgram_recvmsg() faster by not queueing irrelevant pollers in
peer_wait.

On a test program provided by Alban Crequy :

Before:

real    0m0.211s
user    0m0.000s
sys     0m0.208s

After:

real	0m0.044s
user	0m0.000s
sys	0m0.040s

Suggested-by: Davide Libenzi <davidel@xmailserver.org>
Reported-by: Alban Crequy <alban.crequy@collabora.co.uk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/unix/af_unix.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 3c95304..dcb84fe 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2090,6 +2090,9 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
 			return mask;
 	}

+	if (wait && !(wait->key & (POLLWRBAND | POLLWRNORM | POLLOUT)))
+		goto skip_write;
+
 	/* writable? */
 	writable = unix_writable(sk);
 	if (writable) {
@@ -2111,6 +2114,7 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
 	else
 		set_bit(SOCK_ASYNC_NOSPACE, &sk->sk_socket->flags);

+skip_write:
 	return mask;
 }
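
Editor's sketch: the early exit added by this patch can be modelled as a poll routine that skips its costly write-side work when the caller's key shows no interest in writing. The function dgram_poll_model, its parameters and the peer_checks counter are hypothetical and exist only to make the skipped work visible.

```c
#include <assert.h>

#define EV_IN  0x1u   /* stands in for POLLIN */
#define EV_OUT 0x4u   /* stands in for POLLOUT | POLLWRNORM | POLLWRBAND */

struct poll_stats {
    unsigned peer_checks;   /* times the costly peer inspection ran */
};

/*
 * 'have_key' models "wait != NULL" and 'key' models wait->key.
 * 'readable' and 'peer_has_room' are inputs a real poll routine would compute.
 */
static unsigned dgram_poll_model(int have_key, unsigned key, int readable,
                                 int peer_has_room, struct poll_stats *st)
{
    unsigned mask = 0;

    assert(st);

    if (readable)
        mask |= EV_IN;

    /* The caller stated its interest and it excludes writes: skip the rest. */
    if (have_key && !(key & EV_OUT))
        return mask;

    st->peer_checks++;      /* models the lock, peer reference and extra poll_wait */
    if (peer_has_room)
        mask |= EV_OUT;
    return mask;
}
```

A caller without a key still gets the full mask, which matches the "wait &&" guard in the patch.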

^ permalink raw reply related

* Re: [PATCH 0/1] RFC: poll/select performance on datagram sockets
From: Alban Crequy @ 2010-10-30 11:34 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller, Stephen Hemminger, Cyrill Gorcunov,
	Alexey Dobriyan, netdev, linux-kernel, Pauli Nieminen,
	Rainer Weikusat, Davide Libenzi
In-Reply-To: <1288380431.2680.3.camel@edumazet-laptop>

Le Fri, 29 Oct 2010 21:27:11 +0200,
Eric Dumazet <eric.dumazet@gmail.com> a écrit :

> Le vendredi 29 octobre 2010 à 19:18 +0100, Alban Crequy a écrit :
> > Hi,
> > 
> > When a process calls poll or select, the kernel calls (struct
> > file_operations)->poll on every file descriptor and returns a mask
> > of events which are ready. If the process is only interested in
> > POLLIN events, the mask is still computed for POLLOUT and it can be
> > expensive. For example, on Unix datagram sockets, a process running
> > poll() with POLLIN will wake up when the remote end calls read().
> > This is a performance regression introduced when fixing another bug
> > by 3c73419c09a5ef73d56472dbfdade9e311496e9b and
> > ec0d215f9420564fc8286dcf93d2d068bb53a07e.
> > 
> > The attached program illustrates the problem. It compares the
> > performance of sending/receiving data on a Unix datagram socket and
> > select(). When the datagram sockets are not connected, the
> > performance problem is not triggered, but when they are connected
> > it becomes a lot slower. On my computer, I have the following time:
> > 
> > Connected datagram sockets: >4 seconds
> > Non-connected datagram sockets: <1 second
> > 
> > The patch attached in the next email fixes the performance problem:
> > it becomes <1 second for both cases. I am not suggesting the patch
> > for inclusion; I would like to change the prototype of (struct
> > file_operations)->poll instead of adding ->poll2. But there is a
> > lot of poll functions to change (grep tells me 337 functions).
> > 
> > Any opinions?
> 
> My opinion would be to use epoll() for this kind of workload.

I found a problem with epoll() with the following program. When there
are several datagram sockets connected to the same server and the
receiving queue is full, epoll(EPOLLOUT) wakes up only the emitter who
has its skb removed from the queue, and not all the emitters. It is
because sock_wfree() runs sk->sk_write_space() only for one emitter.

poll/select do not have this problem.

-----------------------

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/wait.h>
#include <time.h>
#include <sys/epoll.h>

int
main(int argc, char *argv[])
{

// cat /proc/sys/net/unix/max_dgram_qlen
#define NB_CLIENTS (10 + 2)

  int client_fds[NB_CLIENTS];
  int server_fd;
  struct sockaddr_un addr_server;
  int epollfd = epoll_create(10);
#define MAX_EVENTS 10
  struct epoll_event ev, events[MAX_EVENTS];
  int i;
  char buffer[1024];

  int trigger = atoi(argv[1]);
  printf("trigger = %d\n", trigger);

  memset(&addr_server, 0, sizeof(addr_server));
  addr_server.sun_family = AF_UNIX;
  addr_server.sun_path[0] = '\0';
  strcpy(addr_server.sun_path + 1, "dgram_perfs_server");

  server_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
  bind(server_fd, (struct sockaddr*)&addr_server, sizeof(addr_server));

  for (i = 0 ; i < NB_CLIENTS ; i++)
    {
      client_fds[i] = socket(AF_UNIX, SOCK_DGRAM, 0);
    }

  ev.events = EPOLLOUT;
  ev.data.fd = client_fds[NB_CLIENTS-1];
  if (epoll_ctl(epollfd, EPOLL_CTL_ADD, client_fds[NB_CLIENTS-1], &ev) == -1) {
      perror("epoll_ctl: client_fds max");
      exit(EXIT_FAILURE);
  }
  if (trigger == 0)
    {
      ev.events = EPOLLOUT;
      ev.data.fd = client_fds[0];
      if (epoll_ctl(epollfd, EPOLL_CTL_ADD, client_fds[0], &ev) == -1) {
        perror("epoll_ctl: client_fds 0");
        exit(EXIT_FAILURE);
      }
    }

  for (i = 0 ; i < NB_CLIENTS ; i++)
    {
      connect(client_fds[i], (struct sockaddr*)&addr_server, sizeof(addr_server));
    }

  if (fork() > 0)
    {
      for (i = 0 ; i < NB_CLIENTS - 1 ; i++)
        sendto(client_fds[i], "S", 1, 0, (struct sockaddr*)&addr_server, sizeof(addr_server));
      printf("Everything sent successfully. Now epoll_wait...\n");

      epoll_wait(epollfd, events, MAX_EVENTS, -1);
      printf("epoll_wait works fine :-)\n");
      wait(NULL);
      exit(0);
    }

  sleep(1);

  printf("Receiving one buffer...\n");
  recv(server_fd, buffer, 1024, 0);
  printf("One buffer received\n");

  return 0;
}

^ permalink raw reply

* Re: [PATCH 0/1] RFC: poll/select performance on datagram sockets
From: Eric Dumazet @ 2010-10-30 12:53 UTC (permalink / raw)
  To: Alban Crequy
  Cc: David S. Miller, Stephen Hemminger, Cyrill Gorcunov,
	Alexey Dobriyan, netdev, linux-kernel, Pauli Nieminen,
	Rainer Weikusat, Davide Libenzi
In-Reply-To: <20101030123403.5e01540d@chocolatine.cbg.collabora.co.uk>

Le samedi 30 octobre 2010 à 12:34 +0100, Alban Crequy a écrit :
> Le Fri, 29 Oct 2010 21:27:11 +0200,
> Eric Dumazet <eric.dumazet@gmail.com> a écrit :
> 
> > Le vendredi 29 octobre 2010 à 19:18 +0100, Alban Crequy a écrit :
> > > Hi,
> > > 
> > > When a process calls poll or select, the kernel calls (struct
> > > file_operations)->poll on every file descriptor and returns a mask
> > > of events which are ready. If the process is only interested in
> > > POLLIN events, the mask is still computed for POLLOUT and it can be
> > > expensive. For example, on Unix datagram sockets, a process running
> > > poll() with POLLIN will wake up when the remote end calls read().
> > > This is a performance regression introduced when fixing another bug
> > > by 3c73419c09a5ef73d56472dbfdade9e311496e9b and
> > > ec0d215f9420564fc8286dcf93d2d068bb53a07e.
> > > 
> > > The attached program illustrates the problem. It compares the
> > > performance of sending/receiving data on a Unix datagram socket and
> > > select(). When the datagram sockets are not connected, the
> > > performance problem is not triggered, but when they are connected
> > > it becomes a lot slower. On my computer, I have the following time:
> > > 
> > > Connected datagram sockets: >4 seconds
> > > Non-connected datagram sockets: <1 second
> > > 
> > > The patch attached in the next email fixes the performance problem:
> > > it becomes <1 second for both cases. I am not suggesting the patch
> > > for inclusion; I would like to change the prototype of (struct
> > > file_operations)->poll instead of adding ->poll2. But there is a
> > > lot of poll functions to change (grep tells me 337 functions).
> > > 
> > > Any opinions?
> > 
> > My opinion would be to use epoll() for this kind of workload.
> 
> I found a problem with epoll() with the following program. When there
> are several datagram sockets connected to the same server and the
> receiving queue is full, epoll(EPOLLOUT) wakes up only the emitter who
> has its skb removed from the queue, and not all the emitters. It is
> because sock_wfree() runs sk->sk_write_space() only for one emitter.
> 

I don't think this is the reason.

sock_wfree() really is good here, since it copes with one socket (the
one that sent the message).

The problem is peer_wait, which epoll doesn't seem to be plugged into.

The bug is in unix_dgram_poll().

It calls sock_poll_wait( ... &unix_sk(other)->peer_wait,) only if the socket
is 'writable'. It's a clear bug.

Try this patch please?

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 0ebc777..315716c 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2092,7 +2092,7 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
 
 	/* writable? */
 	writable = unix_writable(sk);
-	if (writable) {
+	if (1 /*writable*/) {
 		other = unix_peer_get(sk);
 		if (other) {
 			if (unix_peer(other) != sk) {

^ permalink raw reply related

* Re: [Bugme-new] [Bug 16350] New: RTL8103EL interface new tcp connections get stuck on SYN_SENT state after 10 hours uptime
From: zic422 @ 2010-10-30 12:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: netdev, bugzilla-daemon, bugme-daemon, Francois Romieu
In-Reply-To: <20100708142526.5bf5938e.akpm@linux-foundation.org>

I can no longer reproduce this bug, and I don't know when it went away. I
have not managed to reach more than ten hours of uptime in the last two months.

On Fri, Jul 9, 2010 at 12:25 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Wed, 7 Jul 2010 14:04:59 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
>
>> https://bugzilla.kernel.org/show_bug.cgi?id=16350
>>
>>            Summary: RTL8103EL interface new tcp connections get stuck on
>>                     SYN_SENT state after 10 hours uptime
>>            Product: Networking
>>            Version: 2.5
>>     Kernel Version: 2.6.34.1
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: IPV4
>>         AssignedTo: shemminger@linux-foundation.org
>>         ReportedBy: zic422@gmail.com
>>         Regression: No
>>
>>
>> Created an attachment (id=27038)
>>  --> (https://bugzilla.kernel.org/attachment.cgi?id=27038)
>> ioports, iomem, lspci, ver_linux and kernel config
>>
>> After 9-11 hours of uptime, the network interface of the RTL8103EL card cannot
>> make new TCP connections to the internet or correctly close existing ones. ICMP
>> and UDP work fine. Besides the r8169 driver, I also tried the r8101-1.015.00 one
>> from the Realtek site with an older kernel, and the bug was still present.
>>
>> Connections inside the home network work normally. Rebooting the router,
>> reconnecting the ethernet cable, restarting the network daemon and reloading
>> the kernel module do not help. The bug disappears only after rebooting my
>> PC.
>>
>> The bug is 100% repeatable on Arch, Debian and Gentoo.
>>
>> default kernel config
>>
>> http://bugs.archlinux.org/task/19162  (this bug)

^ permalink raw reply

* Re: [PATCH 0/1] RFC: poll/select performance on datagram sockets
From: Eric Dumazet @ 2010-10-30 13:17 UTC (permalink / raw)
  To: Alban Crequy
  Cc: David S. Miller, Stephen Hemminger, Cyrill Gorcunov,
	Alexey Dobriyan, netdev, linux-kernel, Pauli Nieminen,
	Rainer Weikusat, Davide Libenzi
In-Reply-To: <1288443217.2680.962.camel@edumazet-laptop>


> The problem is peer_wait, which epoll doesn't seem to be plugged into.
> 
> The bug is in unix_dgram_poll().
> 
> It calls sock_poll_wait( ... &unix_sk(other)->peer_wait,) only if the socket
> is 'writable'. It's a clear bug.
> 
> Try this patch please?
> 
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 0ebc777..315716c 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -2092,7 +2092,7 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
>  
>  	/* writable? */
>  	writable = unix_writable(sk);
> -	if (writable) {
> +	if (1 /*writable*/) {
>  		other = unix_peer_get(sk);
>  		if (other) {
>  			if (unix_peer(other) != sk) {
> 
> 
> 

Also, you'll need to change your program to make the epoll registrations
_after_ the sockets connect(), or else epoll() won't know about the
other peer stuff.

for (i = 0 ; i < NB_CLIENTS ; i++) {
	client_fds[i] = socket(AF_UNIX, SOCK_DGRAM, 0);
}


for (i = 0 ; i < NB_CLIENTS ; i++) {
	connect(client_fds[i], (struct sockaddr*)&addr_server, sizeof(addr_server));
}

ev.events = EPOLLOUT;
ev.data.fd = client_fds[NB_CLIENTS-1];
if (epoll_ctl(epollfd, EPOLL_CTL_ADD, client_fds[NB_CLIENTS-1], &ev) == -1) {
      perror("epoll_ctl: client_fds max");
      exit(EXIT_FAILURE);
}
if (trigger == 0) {
      ev.events = EPOLLOUT;
      ev.data.fd = client_fds[0];
      if (epoll_ctl(epollfd, EPOLL_CTL_ADD, client_fds[0], &ev) == -1) {
        perror("epoll_ctl: client_fds 0");
        exit(EXIT_FAILURE);
      }
}
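
For readers who want something they can compile, here is a minimal
userland sketch of that ordering, assuming Linux (epoll and the abstract
socket namespace). The function name epollout_after_connect() and the
"epoll-demo-<pid>" socket name are invented for illustration and are not
part of the test program discussed in this thread.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Connect a datagram client first, then register it for EPOLLOUT.
 * Returns 1 if epoll_wait() reports the client writable, 0 on timeout,
 * -1 on error. The abstract socket name is made up for this sketch.
 */
int epollout_after_connect(void)
{
	struct sockaddr_un addr;
	struct epoll_event ev, ready;
	socklen_t alen;
	int server, client, epfd, n = -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	/* sun_path[0] stays NUL: Linux abstract namespace, no file to unlink. */
	snprintf(addr.sun_path + 1, sizeof(addr.sun_path) - 1,
		 "epoll-demo-%ld", (long)getpid());
	alen = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 +
			   strlen(addr.sun_path + 1));

	server = socket(AF_UNIX, SOCK_DGRAM, 0);
	client = socket(AF_UNIX, SOCK_DGRAM, 0);
	epfd = epoll_create1(0);
	if (server < 0 || client < 0 || epfd < 0)
		goto done;
	if (bind(server, (struct sockaddr *)&addr, alen) < 0)
		goto done;

	/* Step 1: connect, so the client has a peer when it is first polled. */
	if (connect(client, (struct sockaddr *)&addr, alen) < 0)
		goto done;

	/* Step 2: only now add the client to the epoll set. */
	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLOUT;
	ev.data.fd = client;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, client, &ev) < 0)
		goto done;

	/* The server queue is empty, so the client should be writable. */
	n = epoll_wait(epfd, &ready, 1, 1000);
	if (n == 1 && !(ready.events & EPOLLOUT))
		n = -1;
done:
	if (epfd >= 0)
		close(epfd);
	if (client >= 0)
		close(client);
	if (server >= 0)
		close(server);
	return n;
}
```

Calling epollout_after_connect() from main() and checking for a return
value of 1 is enough to exercise it. The sketch only covers the
registration order; it does not fill the server queue, so it does not
reproduce the wakeup problem itself.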




^ permalink raw reply

* [PATCH] bluetooth: bnep: fix information leak to userland
From: Vasiliy Kulikov @ 2010-10-30 14:26 UTC (permalink / raw)
  To: kernel-janitors
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	Eric Dumazet, Thadeu Lima de Souza Cascardo, Tejun Heo,
	Jiri Kosina, linux-bluetooth, netdev, linux-kernel

Structure bnep_conninfo is copied to userland with the field "device"
that has the last elements uninitialized.  It leads to leaking of
contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
---
 Compile tested.

 net/bluetooth/bnep/core.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/bluetooth/bnep/core.c b/net/bluetooth/bnep/core.c
index f10b41f..5868597 100644
--- a/net/bluetooth/bnep/core.c
+++ b/net/bluetooth/bnep/core.c
@@ -648,6 +648,7 @@ int bnep_del_connection(struct bnep_conndel_req *req)
 
 static void __bnep_copy_ci(struct bnep_conninfo *ci, struct bnep_session *s)
 {
+	memset(ci, 0, sizeof(*ci));
 	memcpy(ci->dst, s->eh.h_source, ETH_ALEN);
 	strcpy(ci->device, s->dev->name);
 	ci->flags = s->flags;
-- 
1.7.0.4
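
The pattern behind this fix can be sketched in plain userland C. This is
an illustration only, not the bnep code: struct conninfo_demo, NAME_LEN
and both helpers are hypothetical stand-ins. Copying a short string into
a fixed-size field leaves the bytes after the terminating NUL untouched,
so zeroing the whole structure first is what keeps stale memory out of
the exported copy.

```c
#include <string.h>

#define NAME_LEN 16	/* hypothetical size, stands in for the real field */

/* Hypothetical stand-in for a structure that is copied to userland. */
struct conninfo_demo {
	unsigned int flags;
	char device[NAME_LEN];
};

/* Fill *ci for export; the memset clears the tail of device[] as well. */
void copy_ci_demo(struct conninfo_demo *ci, const char *name,
		  unsigned int flags)
{
	size_t len = strlen(name);

	if (len >= sizeof(ci->device))
		len = sizeof(ci->device) - 1;	/* truncate, keep room for NUL */

	memset(ci, 0, sizeof(*ci));
	memcpy(ci->device, name, len);		/* NUL comes from the memset */
	ci->flags = flags;
}

/* Return 1 if every byte after the string in device[] is zero. */
int tail_is_clean(const struct conninfo_demo *ci)
{
	size_t i;

	for (i = strlen(ci->device); i < sizeof(ci->device); i++)
		if (ci->device[i] != 0)
			return 0;
	return 1;
}
```

Pre-filling a struct conninfo_demo with a 0xAA pattern, calling
copy_ci_demo() and then tail_is_clean() shows that none of the pattern
survives in the exported structure.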

^ permalink raw reply related

* [PATCH] bluetooth: cmtp: fix information leak to userland
From: Vasiliy Kulikov @ 2010-10-30 14:26 UTC (permalink / raw)
  To: kernel-janitors
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller,
	Eric Dumazet, linux-bluetooth, netdev, linux-kernel

Structure cmtp_conninfo is copied to userland with some padding fields
uninitialized.  It leads to leaking of contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
---
 Compile tested.

 net/bluetooth/cmtp/core.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/bluetooth/cmtp/core.c b/net/bluetooth/cmtp/core.c
index ec0a134..8e5f292 100644
--- a/net/bluetooth/cmtp/core.c
+++ b/net/bluetooth/cmtp/core.c
@@ -78,6 +78,7 @@ static void __cmtp_unlink_session(struct cmtp_session *session)
 
 static void __cmtp_copy_session(struct cmtp_session *session, struct cmtp_conninfo *ci)
 {
+	memset(ci, 0, sizeof(*ci));
 	bacpy(&ci->bdaddr, &session->bdaddr);
 
 	ci->flags = session->flags;
-- 
1.7.0.4

^ permalink raw reply related

* [PATCH] bluetooth: hidp: fix information leak to userland
From: Vasiliy Kulikov @ 2010-10-30 14:26 UTC (permalink / raw)
  To: kernel-janitors
  Cc: Marcel Holtmann, Gustavo F. Padovan, David S. Miller, Jiri Kosina,
	Michael Poole, Bastien Nocera, linux-bluetooth, netdev,
	linux-kernel

Structure hidp_conninfo is copied to userland with version, product,
vendor and name fields uninitialized if both session->input and session->hid
are NULL.  It leads to leaking of contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
---
 Compile tested.

 net/bluetooth/hidp/core.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
index c0ee8b3..29544c2 100644
--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -107,6 +107,7 @@ static void __hidp_unlink_session(struct hidp_session *session)
 
 static void __hidp_copy_session(struct hidp_session *session, struct hidp_conninfo *ci)
 {
+	memset(ci, 0, sizeof(*ci));
 	bacpy(&ci->bdaddr, &session->bdaddr);
 
 	ci->flags = session->flags;
@@ -115,7 +116,6 @@ static void __hidp_copy_session(struct hidp_session *session, struct hidp_connin
 	ci->vendor  = 0x0000;
 	ci->product = 0x0000;
 	ci->version = 0x0000;
-	memset(ci->name, 0, 128);
 
 	if (session->input) {
 		ci->vendor  = session->input->id.vendor;
-- 
1.7.0.4
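
A rough userland sketch of the same idea, for the case where fields are
only filled on some branches. The types dev_id_demo and hid_info_demo
and the function copy_session_demo() are hypothetical and much simpler
than the real hidp structures. Zeroing the output once at the top means
the path where neither source is present still produces a fully defined
result.

```c
#include <string.h>

/* Hypothetical source of identification data (may be absent). */
struct dev_id_demo {
	unsigned short vendor, product, version;
	const char *name;
};

/* Hypothetical structure that would be copied to userland. */
struct hid_info_demo {
	unsigned short vendor;
	unsigned short product;
	unsigned short version;
	char name[32];
};

/* Prefer "input", fall back to "hid"; either or both may be NULL. */
void copy_session_demo(struct hid_info_demo *ci,
		       const struct dev_id_demo *input,
		       const struct dev_id_demo *hid)
{
	const struct dev_id_demo *src = input ? input : hid;

	/* Clear everything first so no branch can leave a field unset. */
	memset(ci, 0, sizeof(*ci));
	if (!src)
		return;		/* no source at all: output stays all-zero */

	ci->vendor = src->vendor;
	ci->product = src->product;
	ci->version = src->version;
	if (src->name)
		strncpy(ci->name, src->name, sizeof(ci->name) - 1);
}
```

With both sources NULL the output compares equal, byte for byte, to a
zeroed structure, whatever the buffer held before the call.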

^ permalink raw reply related

* [PATCH] net: core: scm: fix information leak to userland
From: Vasiliy Kulikov @ 2010-10-30 14:26 UTC (permalink / raw)
  To: kernel-janitors
  Cc: David S. Miller, Eric W. Biederman, Eric Dumazet, Tejun Heo,
	Serge E. Hallyn, netdev, linux-kernel

Structure cmsghdr is copied to userland with padding bytes
uninitialized on architectures where __kernel_size_t is unsigned long.
It leads to leaking of contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
---
 Compile tested.

 net/core/scm.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/scm.c b/net/core/scm.c
index 413cab8..a4a9b70 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -233,6 +233,7 @@ int put_cmsg(struct msghdr * msg, int level, int type, int len, void *data)
 		msg->msg_flags |= MSG_CTRUNC;
 		cmlen = msg->msg_controllen;
 	}
+	memset(&cmhdr, 0, sizeof(cmhdr));
 	cmhdr.cmsg_level = level;
 	cmhdr.cmsg_type = type;
 	cmhdr.cmsg_len = cmlen;
-- 
1.7.0.4

^ permalink raw reply related

* [PATCH] net: core: sock: fix information leak to userland
From: Vasiliy Kulikov @ 2010-10-30 14:26 UTC (permalink / raw)
  To: kernel-janitors
  Cc: David S. Miller, Eric Dumazet, Eric W. Biederman, Herbert Xu,
	Paul E. McKenney, netdev, linux-kernel

The "address" variable might not be fully initialized by sock->ops->getname().
The only current implementation that does so is get_name(); it leaves some
padding fields of sockaddr_tipc uninitialized.  It leads to leaking of contents
of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
---
 Compile tested.

 net/core/sock.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 3eed542..759dd81 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -930,6 +930,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
 	{
 		char address[128];
 
+		memset(&address, 0, sizeof(address));
 		if (sock->ops->getname(sock, (struct sockaddr *)address, &lv, 2))
 			return -ENOTCONN;
 		if (lv < len)
-- 
1.7.0.4

^ permalink raw reply related

* [PATCH] ipv4: netfilter: arp_tables: fix information leak to userland
From: Vasiliy Kulikov @ 2010-10-30 14:26 UTC (permalink / raw)
  To: kernel-janitors
  Cc: Patrick McHardy, David S. Miller, Alexey Kuznetsov,
	Pekka Savola (ipv6), James Morris, Hideaki YOSHIFUJI,
	netfilter-devel, netfilter, coreteam, netdev, linux-kernel

Structure arpt_getinfo is copied to userland with the field "name"
that has the last elements uninitialized.  It leads to leaking of
contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
---
 Compile tested.

 net/ipv4/netfilter/arp_tables.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 3cad259..3fac340 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -927,6 +927,7 @@ static int get_info(struct net *net, void __user *user,
 			private = &tmp;
 		}
 #endif
+		memset(&info, 0, sizeof(info));
 		info.valid_hooks = t->valid_hooks;
 		memcpy(info.hook_entry, private->hook_entry,
 		       sizeof(info.hook_entry));
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH] ipv4: netfilter: ip_tables: fix information leak to userland
From: Vasiliy Kulikov @ 2010-10-30 14:26 UTC (permalink / raw)
  To: kernel-janitors
  Cc: Patrick McHardy, David S. Miller, Alexey Kuznetsov,
	Pekka Savola (ipv6), James Morris, Hideaki YOSHIFUJI,
	netfilter-devel, netfilter, coreteam, netdev, linux-kernel

Structure ipt_getinfo is copied to userland with the field "name"
that has the last elements uninitialized.  It leads to leaking of
contents of kernel stack memory.

Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
---
 Compile tested.

 net/ipv4/netfilter/ip_tables.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index d31b007..a846d63 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -1124,6 +1124,7 @@ static int get_info(struct net *net, void __user *user,
 			private = &tmp;
 		}
 #endif
+		memset(&info, 0, sizeof(info));
 		info.valid_hooks = t->valid_hooks;
 		memcpy(info.hook_entry, private->hook_entry,
 		       sizeof(info.hook_entry));
-- 
1.7.0.4

^ permalink raw reply related
