Netdev List
* Re: [PATCH net-next] net:set valid name before calling ndo_init()
From: David Miller @ 2011-05-13 20:50 UTC
  To: jpirko
  Cc: panweiping3, eric.dumazet, mirq-linux, therbert, bhutchings,
	netdev, linux-kernel
In-Reply-To: <20110513071508.GC2733@psychotron>

From: Jiri Pirko <jpirko@redhat.com>
Date: Fri, 13 May 2011 09:15:09 +0200

> Fri, May 13, 2011 at 03:46:56AM CEST, panweiping3@gmail.com wrote:
>>Commit 1c5cae815d19 (net: call dev_alloc_name from register_netdevice)
>>introduced a bonding bug; see examples 1 and 2.
 ...
> 
> Reviewed-by: Jiri Pirko <jpirko@redhat.com>

Applied, thanks everyone.


* Re: [PATCH] net: netlink: don't try unicast when dst_pid is zero for NETLINK_USERSOCK
From: David Miller @ 2011-05-13 20:48 UTC
  To: xiaosuo; +Cc: netdev
In-Reply-To: <1305267894-3314-1-git-send-email-xiaosuo@gmail.com>

From: Changli Gao <xiaosuo@gmail.com>
Date: Fri, 13 May 2011 14:24:54 +0800

> For NETLINK_USERSOCK, no one listens on PID 0, so sending a message only
> to a multicast group should not return -ECONNREFUSED.
> 
> Signed-off-by: Changli Gao <xiaosuo@gmail.com>

I don't think this is a great idea, creating different semantics for
NETLINK_USERSOCK vs. other types.

You have to set the pid to something which will receive the unicast
message, and then you can also (on top of that) send it to a multicast
group as well.

But the base operation is always the unicast send, and that is what
determines success/failure of the operation.

I'm not applying this patch.
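
A minimal userspace sketch of the addressing described above, assuming the
legacy `nl_groups` bitmask in `struct sockaddr_nl` is used to name the
multicast group. The helper name `nl_fill_dest` is hypothetical and is not
part of the patch or of any library. It only shows that the destination
carries a real unicast PID, with the multicast group added on top.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Hypothetical helper: describe a destination that is unicast to `pid`
 * and, on top of that, multicast to `group` (1-based, 1..32).
 */
static void nl_fill_dest(struct sockaddr_nl *dst, uint32_t pid,
			 unsigned int group)
{
	assert(group >= 1 && group <= 32);

	memset(dst, 0, sizeof(*dst));
	dst->nl_family = AF_NETLINK;
	dst->nl_pid = pid;			/* unicast receiver */
	dst->nl_groups = 1U << (group - 1);	/* multicast bitmask */
}
```

The filled structure would then be passed as the destination address to
sendto() or sendmsg() on a NETLINK_USERSOCK socket. Per the reply above, the
unicast part is what decides whether the call succeeds.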


* [PATCH] olympic: convert to seq_file
From: Alexey Dobriyan @ 2011-05-13 20:42 UTC
  To: davem; +Cc: netdev

->read_proc interface is going away, switch to seq_file.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 drivers/net/tokenring/olympic.c |   57 +++++++++++++++++++---------------------
 1 file changed, 28 insertions(+), 29 deletions(-)

--- a/drivers/net/tokenring/olympic.c
+++ b/drivers/net/tokenring/olympic.c
@@ -86,6 +86,7 @@
 #include <linux/timer.h>
 #include <linux/in.h>
 #include <linux/ioport.h>
+#include <linux/seq_file.h>
 #include <linux/string.h>
 #include <linux/proc_fs.h>
 #include <linux/ptrace.h>
@@ -193,7 +194,7 @@ static void olympic_arb_cmd(struct net_device *dev);
 static int olympic_change_mtu(struct net_device *dev, int mtu);
 static void olympic_srb_bh(struct net_device *dev) ; 
 static void olympic_asb_bh(struct net_device *dev) ; 
-static int olympic_proc_info(char *buffer, char **start, off_t offset, int length, int *eof, void *data) ; 
+static const struct file_operations olympic_proc_ops;
 
 static const struct net_device_ops olympic_netdev_ops = {
 	.ndo_open		= olympic_open,
@@ -272,7 +273,7 @@ static int __devinit olympic_probe(struct pci_dev *pdev, const struct pci_device
 		char proc_name[20] ; 
 		strcpy(proc_name,"olympic_") ;
 		strcat(proc_name,dev->name) ; 
-		create_proc_read_entry(proc_name,0,init_net.proc_net,olympic_proc_info,(void *)dev) ;
+		proc_create_data(proc_name, 0, init_net.proc_net, &olympic_proc_ops, dev);
 		printk("Olympic: Network Monitor information: /proc/%s\n",proc_name); 
 	}
 	return  0 ;
@@ -1615,29 +1616,25 @@ static int olympic_change_mtu(struct net_device *dev, int mtu)
 	return 0 ; 
 }
 
-static int olympic_proc_info(char *buffer, char **start, off_t offset, int length, int *eof, void *data)
+static int olympic_proc_show(struct seq_file *m, void *v)
 {
-	struct net_device *dev = (struct net_device *)data ; 
+	struct net_device *dev = m->private;
 	struct olympic_private *olympic_priv=netdev_priv(dev);
 	u8 __iomem *oat = (olympic_priv->olympic_lap + olympic_priv->olympic_addr_table_addr) ; 
 	u8 __iomem *opt = (olympic_priv->olympic_lap + olympic_priv->olympic_parms_addr) ; 
-	int size = 0 ; 
-	int len=0;
-	off_t begin=0;
-	off_t pos=0;
 	u8 addr[6];
 	u8 addr2[6];
 	int i;
 
-	size = sprintf(buffer, 
+	seq_printf(m,
 		"IBM Pit/Pit-Phy/Olympic Chipset Token Ring Adapter %s\n",dev->name);
-	size += sprintf(buffer+size, "\n%6s: Adapter Address   : Node Address      : Functional Addr\n",
+	seq_printf(m, "\n%6s: Adapter Address   : Node Address      : Functional Addr\n",
  	   dev->name); 
 
 	for (i = 0 ; i < 6 ; i++)
 		addr[i] = readb(oat+offsetof(struct olympic_adapter_addr_table,node_addr) + i);
 
-	size += sprintf(buffer+size, "%6s: %pM : %pM : %02x:%02x:%02x:%02x\n",
+	seq_printf(m, "%6s: %pM : %pM : %02x:%02x:%02x:%02x\n",
 	   dev->name,
 	   dev->dev_addr, addr,
 	   readb(oat+offsetof(struct olympic_adapter_addr_table,func_addr)), 
@@ -1645,9 +1642,9 @@ static int olympic_proc_info(char *buffer, char **start, off_t offset, int lengt
 	   readb(oat+offsetof(struct olympic_adapter_addr_table,func_addr)+2),
 	   readb(oat+offsetof(struct olympic_adapter_addr_table,func_addr)+3));
 	 
-	size += sprintf(buffer+size, "\n%6s: Token Ring Parameters Table:\n", dev->name);
+	seq_printf(m, "\n%6s: Token Ring Parameters Table:\n", dev->name);
 
-	size += sprintf(buffer+size, "%6s: Physical Addr : Up Node Address   : Poll Address      : AccPri : Auth Src : Att Code :\n",
+	seq_printf(m, "%6s: Physical Addr : Up Node Address   : Poll Address      : AccPri : Auth Src : Att Code :\n",
 	  dev->name) ; 
 
 	for (i = 0 ; i < 6 ; i++)
@@ -1655,7 +1652,7 @@ static int olympic_proc_info(char *buffer, char **start, off_t offset, int lengt
 	for (i = 0 ; i < 6 ; i++)
 		addr2[i] =  readb(opt+offsetof(struct olympic_parameters_table, poll_addr) + i);
 
-	size += sprintf(buffer+size, "%6s: %02x:%02x:%02x:%02x   : %pM : %pM : %04x   : %04x     :  %04x    :\n",
+	seq_printf(m, "%6s: %02x:%02x:%02x:%02x   : %pM : %pM : %04x   : %04x     :  %04x    :\n",
 	  dev->name,
 	  readb(opt+offsetof(struct olympic_parameters_table, phys_addr)),
 	  readb(opt+offsetof(struct olympic_parameters_table, phys_addr)+1),
@@ -1666,12 +1663,12 @@ static int olympic_proc_info(char *buffer, char **start, off_t offset, int lengt
 	  swab16(readw(opt+offsetof(struct olympic_parameters_table, auth_source_class))),
 	  swab16(readw(opt+offsetof(struct olympic_parameters_table, att_code))));
 
-	size += sprintf(buffer+size, "%6s: Source Address    : Bcn T : Maj. V : Lan St : Lcl Rg : Mon Err : Frame Correl : \n",
+	seq_printf(m, "%6s: Source Address    : Bcn T : Maj. V : Lan St : Lcl Rg : Mon Err : Frame Correl : \n",
 	  dev->name) ; 
 	
 	for (i = 0 ; i < 6 ; i++)
 		addr[i] = readb(opt+offsetof(struct olympic_parameters_table, source_addr) + i);
-	size += sprintf(buffer+size, "%6s: %pM : %04x  : %04x   : %04x   : %04x   : %04x    :     %04x     : \n",
+	seq_printf(m, "%6s: %pM : %04x  : %04x   : %04x   : %04x   : %04x    :     %04x     : \n",
 	  dev->name, addr,
 	  swab16(readw(opt+offsetof(struct olympic_parameters_table, beacon_type))),
 	  swab16(readw(opt+offsetof(struct olympic_parameters_table, major_vector))),
@@ -1680,12 +1677,12 @@ static int olympic_proc_info(char *buffer, char **start, off_t offset, int lengt
 	  swab16(readw(opt+offsetof(struct olympic_parameters_table, mon_error))),
 	  swab16(readw(opt+offsetof(struct olympic_parameters_table, frame_correl))));
 
-	size += sprintf(buffer+size, "%6s: Beacon Details :  Tx  :  Rx  : NAUN Node Address : NAUN Node Phys : \n",
+	seq_printf(m, "%6s: Beacon Details :  Tx  :  Rx  : NAUN Node Address : NAUN Node Phys : \n",
 	  dev->name) ; 
 
 	for (i = 0 ; i < 6 ; i++)
 		addr[i] = readb(opt+offsetof(struct olympic_parameters_table, beacon_naun) + i);
-	size += sprintf(buffer+size, "%6s:                :  %02x  :  %02x  : %pM : %02x:%02x:%02x:%02x    : \n",
+	seq_printf(m, "%6s:                :  %02x  :  %02x  : %pM : %02x:%02x:%02x:%02x    : \n",
 	  dev->name,
 	  swab16(readw(opt+offsetof(struct olympic_parameters_table, beacon_transmit))),
 	  swab16(readw(opt+offsetof(struct olympic_parameters_table, beacon_receive))),
@@ -1695,19 +1692,21 @@ static int olympic_proc_info(char *buffer, char **start, off_t offset, int lengt
 	  readb(opt+offsetof(struct olympic_parameters_table, beacon_phys)+2),
 	  readb(opt+offsetof(struct olympic_parameters_table, beacon_phys)+3));
 
-	len=size;
-	pos=begin+size;
-	if (pos<offset) {
-		len=0;
-		begin=pos;
-	}
-	*start=buffer+(offset-begin);	/* Start of wanted data */
-	len-=(offset-begin);		/* Start slop */
-	if(len>length)
-		len=length;		/* Ending slop */
-	return len;
+	return 0;
 }
 
+static int olympic_proc_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, olympic_proc_show, PDE(inode)->data);
+}
+
+static const struct file_operations olympic_proc_ops = {
+	.open		= olympic_proc_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
 static void __devexit olympic_remove_one(struct pci_dev *pdev) 
 {
 	struct net_device *dev = pci_get_drvdata(pdev) ; 


* [GIT] Networking
From: David Miller @ 2011-05-13 20:40 UTC
  To: torvalds; +Cc: akpm, netdev, linux-kernel


Some stragglers, the bridging one is pretty important as it hits
virtualization users:

1) Do not run ipv4 options parser on ipv6 packets in bridging,
   from Stephen Hemminger.

2) Regression fix, changes to TOS/tclass handling made ECN stop
   being done on ipv6 connections.  Fix from Steinar H. Gunderson.

3) On-wire packets for bridging modes were defined in a way (lacking
   necessary __packed directives), such that they didn't work properly
   on some architectures (namely, ARM).  Fix from Vitalii Demianets.

4) The net_device_ops conversion long ago broke several m68k drivers
   based upon the 8390 infrastructure.  Fix from Geert Uytterhoeven.

5) SFC needs to map certain chip memory as uncacheable, from Ben
   Hutchings.

6) Memory hotplug oops fix in ehea driver from Anton Blanchard.

7) IBSS oops fix in intel wireless drivers, from Stanislaw Gruszka.

8) Command pending queue locking fix in libertas.
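
As an illustration of item 3, here is a minimal sketch of why on-wire
structures need a packed attribute. The two `demo_pdu_*` structures are
made up for this example and are not the bonding or LLC structures touched
by the fix. Without the attribute the compiler may insert padding, so the
in-memory size no longer matches the wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layouts only; the ABI may pad this one. */
struct demo_pdu_plain {
	uint8_t  type;
	uint16_t value;
};

/* Same fields, but the layout is forced to match the wire: 1 + 2 bytes. */
struct demo_pdu_packed {
	uint8_t  type;
	uint16_t value;
} __attribute__((packed));

/* Compile-time check that the packed layout has no padding. */
static_assert(sizeof(struct demo_pdu_packed) == 3,
	      "packed PDU must be exactly 3 bytes");
static_assert(offsetof(struct demo_pdu_packed, value) == 1,
	      "value must directly follow type");
```

The size of the plain variant depends on the architecture and ABI, which is
why the sketch asserts nothing about it.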

Please pull, thanks a lot!

The following changes since commit 3568bd9720b4a775f28a718fcbb462ce2f386988:

  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client (2011-05-11 19:13:34 -0700)

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Anton Blanchard (1):
      ehea: Fix memory hotplug oops

Ben Hutchings (1):
      sfc: Always map MCDI shared memory as uncacheable

David S. Miller (2):
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6
      Merge branch 'sfc-2.6.39' of git://git.kernel.org/.../bwh/sfc-2.6

Geert Uytterhoeven (3):
      zorro8390: Fix regression caused during net_device_ops conversion
      hydra: Fix regression caused during net_device_ops conversion
      ne-h8300: Fix regression caused during net_device_ops conversion

Luciano Coelho (1):
      mac80211: don't start the dynamic ps timer if not associated

Mohammed Shafi Shajakhan (1):
      ath9k: Fix a warning due to a queued work during S3 state

Paul Fox (1):
      libertas: fix cmdpendingq locking

Stanislaw Gruszka (1):
      iwlegacy: fix IBSS mode crashes

Steinar H. Gunderson (1):
      ipv6: restore correct ECN handling on TCP xmit

Stephen Hemminger (1):
      bridge: fix forwarding of IPv6

Vitalii Demianets (1):
      bonding,llc: Fix structure sizeof incompatibility for some PDUs

 drivers/net/Makefile                     |    6 ++--
 drivers/net/bonding/bond_3ad.h           |   10 +++---
 drivers/net/ehea/ehea_main.c             |    6 ++--
 drivers/net/hydra.c                      |   14 ++++----
 drivers/net/ne-h8300.c                   |   16 +++++-----
 drivers/net/sfc/mcdi.c                   |   49 ++++++++++++++++++-----------
 drivers/net/sfc/nic.h                    |    2 +
 drivers/net/sfc/siena.c                  |   25 +++++++++++++--
 drivers/net/wireless/ath/ath9k/main.c    |    8 +++++
 drivers/net/wireless/iwlegacy/iwl-core.c |    7 ++++
 drivers/net/wireless/iwlegacy/iwl-dev.h  |    6 ++++
 drivers/net/wireless/libertas/cmd.c      |    6 ++-
 drivers/net/zorro8390.c                  |   12 ++++----
 include/net/inet_ecn.h                   |   16 ++++++++--
 include/net/llc_pdu.h                    |    8 ++--
 net/bridge/br_netfilter.c                |    2 +-
 net/mac80211/tx.c                        |    4 ++
 17 files changed, 132 insertions(+), 65 deletions(-)


* Re: [Bridge] [PATCH] bridge: fix forwarding of IPv6
From: Stephen Hemminger @ 2011-05-13 20:24 UTC
  To: David Miller; +Cc: eric.dumazet, netdev, noahm, bridge, herbert, ben
In-Reply-To: <20110513.160232.1127228366429050055.davem@davemloft.net>

On Fri, 13 May 2011 16:02:32 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 13 May 2011 22:00:44 +0200
> 
> > On Friday, 13 May 2011 at 12:53 -0700, Stephen Hemminger wrote:
> >> The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e
> >>     bridge: Reset IPCB when entering IP stack on NF_FORWARD
> >> broke forwarding of IPV6 packets in bridge because it would
> >> call br_parse_ip_options with an IPV6 packet.
> >> 
> >> Reported-by: Noah Meyerhans <noahm@debian.org>
> >> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> >> 
> >> ---
> >> Patch against net-next-2.6 but must be applied to net-2.6
> >> and stable as well
> >> 
> > 
> > Well, stable is not needed, since the faulty commit is not in 2.6.38
> > 
> > Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
> 
> I do need to queue it up for -stable because the faulty commit is
> also queued up there :-)

The faulty commit was in 2.6.38.4


* Re: [net-next 2/2] stmmac: fix autoneg in set_pauseparam
From: David Miller @ 2011-05-13 20:12 UTC
  To: peppe.cavallaro; +Cc: netdev
In-Reply-To: <1305268085-603-2-git-send-email-peppe.cavallaro@st.com>

From: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
Date: Fri, 13 May 2011 08:28:05 +0200

> This patch fixes a bug in the set_pauseparam
> function, which did not handle the ANE
> field correctly and returned broken values when
> using ethtool -A|-a.
> 
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>

Applied.


* Re: [net-next 1/2 (V2)] stmmac: don't go through ethtool to start auto-negotiation
From: David Miller @ 2011-05-13 20:12 UTC
  To: peppe.cavallaro; +Cc: netdev, decot
In-Reply-To: <1305268085-603-1-git-send-email-peppe.cavallaro@st.com>

From: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
Date: Fri, 13 May 2011 08:28:04 +0200

> From: David Decotigny <decot@google.com>
> 
> The driver used to call the phy's ethtool configuration routine to start
> auto-negotiation. With this change it calls the phy's auto-negotiation
> routine directly.
> 
> The initial version hid the return value of phy_start_aneg();
> this patch returns it (<0 upon error).
> 
> Tested: module compiles, tested on STM HDK7108 STB.
> 
> Signed-off-by: David Decotigny <decot@google.com>
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>

Applied.


* Re: [PATCH] drivers/isdn/hisax: Drop unused list
From: David Miller @ 2011-05-13 20:10 UTC
  To: julia; +Cc: isdn, kernel-janitors, netdev, linux-kernel
In-Reply-To: <1305296139-16773-1-git-send-email-julia@diku.dk>

From: Julia Lawall <julia@diku.dk>
Date: Fri, 13 May 2011 16:15:39 +0200

> The file st5481_init.c locally defines and initializes the adapter_list
> variable, but does not use it for anything.  Removing the list makes it
> possible to remove the list field from the st5481_adapter data structure.
> In the function probe_st5481, it also makes it possible to free the locally
> allocated adapter value on an error exit.
> 
> Signed-off-by: Julia Lawall <julia@diku.dk>

Applied, thanks a lot Julia.


* Re: [PATCH v3] net: ipv4: add IPPROTO_ICMP socket kind
From: David Miller @ 2011-05-13 20:08 UTC
  To: segoon
  Cc: solar, linux-kernel, netdev, peak, kees.cook, dan.j.rosenberg,
	eugene, nelhage, kuznet, pekkas, jmorris, yoshfuji, kaber
In-Reply-To: <20110513200100.GA3875@albatros>

From: Vasiliy Kulikov <segoon@openwall.com>
Date: Sat, 14 May 2011 00:01:00 +0400

> This patch adds IPPROTO_ICMP socket kind.

Applied, thanks for following through on all the review feedback.


* Re: [PATCH] bridge: fix forwarding of IPv6
From: Eric Dumazet @ 2011-05-13 20:05 UTC
  To: David Miller; +Cc: shemminger, noahm, herbert, ben, bridge, netdev
In-Reply-To: <20110513.160232.1127228366429050055.davem@davemloft.net>

On Friday, 13 May 2011 at 16:02 -0400, David Miller wrote:

> I do need to queue it up for -stable because the faulty commit is
> also queued up there :-)

okay ;)


* Re: [PATCH] bridge: fix forwarding of IPv6
From: David Miller @ 2011-05-13 20:03 UTC
  To: shemminger; +Cc: noahm, herbert, ben, bridge, netdev
In-Reply-To: <20110513125314.66861b31@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 13 May 2011 12:53:14 -0700

> The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e
>     bridge: Reset IPCB when entering IP stack on NF_FORWARD
> broke forwarding of IPV6 packets in bridge because it would
> call br_parse_ip_options with an IPV6 packet.
> 
> Reported-by: Noah Meyerhans <noahm@debian.org>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> ---
> Patch against net-next-2.6 but must be applied to net-2.6
> and stable as well

Applied and queued up for -stable, thanks!


* Re: [PATCH] bridge: fix forwarding of IPv6
From: David Miller @ 2011-05-13 20:02 UTC
  To: eric.dumazet; +Cc: shemminger, noahm, herbert, ben, bridge, netdev
In-Reply-To: <1305316844.3120.8.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 13 May 2011 22:00:44 +0200

> On Friday, 13 May 2011 at 12:53 -0700, Stephen Hemminger wrote:
>> The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e
>>     bridge: Reset IPCB when entering IP stack on NF_FORWARD
>> broke forwarding of IPV6 packets in bridge because it would
>> call br_parse_ip_options with an IPV6 packet.
>> 
>> Reported-by: Noah Meyerhans <noahm@debian.org>
>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>> 
>> ---
>> Patch against net-next-2.6 but must be applied to net-2.6
>> and stable as well
>> 
> 
> Well, stable is not needed, since the faulty commit is not in 2.6.38
> 
> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>

I do need to queue it up for -stable because the faulty commit is
also queued up there :-)


* Re: kernel BUG at net/ipv4/tcp_output.c:1006!
From: David Miller @ 2011-05-13 20:01 UTC
  To: eric.dumazet; +Cc: lkml, linux-kernel, netdev
In-Reply-To: <1305316058.3120.6.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 13 May 2011 21:47:38 +0200

> I suspect we should push commit 2fceec13375e5d98 (tcp: len check is
> unnecessarily devastating, change to WARN_ON) to stable if not already
> done...
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2fceec13375e5d98
> 
> David, is this commit in your stable queue ?

No, but now it is.


* [PATCH v3] net: ipv4: add IPPROTO_ICMP socket kind
From: Vasiliy Kulikov @ 2011-05-13 20:01 UTC
  To: David Miller
  Cc: solar, linux-kernel, netdev, peak, kees.cook, dan.j.rosenberg,
	eugene, nelhage, kuznet, pekkas, jmorris, yoshfuji, kaber
In-Reply-To: <20110510.121550.112583080.davem@davemloft.net>

This patch adds IPPROTO_ICMP socket kind.  It makes it possible to send
ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages
without any special privileges.  In other words, the patch makes it
possible to implement setuid-less and CAP_NET_RAW-less /bin/ping.  In
order not to increase the kernel's attack surface, the new functionality
is disabled by default, but is enabled at bootup by supporting Linux
distributions, optionally with restriction to a group or a group range
(see below).

Similar functionality is implemented in Mac OS X:
http://www.manpagez.com/man/4/icmp/

A new ping socket is created with

    socket(PF_INET, SOCK_DGRAM, IPPROTO_ICMP)

Message identifiers (octets 4-5 of the ICMP header) are interpreted as local
ports. Addresses are stored in struct sockaddr_in. No port numbers are
reserved for privileged processes; port 0 is reserved for the API ("let the
kernel pick a free number"). There is no notion of remote ports; remote
port numbers provided by the user (e.g. in connect()) are ignored.

Data sent and received include ICMP headers. This is deliberate, in order to:
1) Avoid the need to transport header values like sequence numbers by
other means.
2) Make it easier to port existing programs using raw sockets.

ICMP headers given to send() are checked and sanitized. The type must be
ICMP_ECHO and the code must be zero (future extensions might relax this,
see below). The id is set to the number (local port) of the socket, the
checksum is always recomputed.
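
A minimal userspace sketch of a buffer that would pass the checks described
above, assuming glibc's `struct icmphdr` from <netinet/ip_icmp.h>. The
helper names `inet_csum` and `build_echo` are hypothetical. The id is left
at zero because the kernel sets it to the socket's local port, and the
checksum is filled in only so that the same buffer would also be valid on a
raw socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/ip_icmp.h>

/* RFC 1071 Internet checksum over `len` bytes. */
static uint16_t inet_csum(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t sum = 0;
	uint16_t w;

	for (; len > 1; p += 2, len -= 2) {
		memcpy(&w, p, 2);
		sum += w;
	}
	if (len) {			/* odd trailing byte, zero padded */
		w = 0;
		memcpy(&w, p, 1);
		sum += w;
	}
	while (sum >> 16)		/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Build an echo request (header plus payload); returns its length. */
static size_t build_echo(uint8_t *out, size_t cap, uint16_t seq,
			 const void *payload, size_t plen)
{
	struct icmphdr h;

	assert(cap >= sizeof(h) + plen);
	memset(&h, 0, sizeof(h));
	h.type = ICMP_ECHO;		/* the only type accepted */
	h.code = 0;			/* must be zero */
	h.un.echo.sequence = htons(seq);
	memcpy(out, &h, sizeof(h));
	memcpy(out + sizeof(h), payload, plen);
	h.checksum = inet_csum(out, sizeof(h) + plen);
	memcpy(out, &h, sizeof(h));
	return sizeof(h) + plen;
}
```

The resulting buffer would be handed to send() or sendto() on a socket
created as shown earlier in this message.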

ICMP reply packets received from the network are demultiplexed according
to their id's, and are returned by recv() without any modifications.
IP header information and ICMP errors of those packets may be obtained
via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source
quenches and redirects are reported as fake errors via the error queue
(IP_RECVERR); the next hop address for redirects is saved to ee_info (in
network order).

socket(2) is restricted to the group range specified in
"/proc/sys/net/ipv4/ping_group_range".  It is "1 0" by default, meaning
that nobody (not even root) may create ping sockets.  Setting it to "100
100" would grant permissions to the single group (to either make
/sbin/ping g+s and owned by this group or to grant permissions to the
"netadmins" group), "0 4294967295" would enable it for the world, "100
4294967295" would enable it for the users, but not daemons.
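
The range semantics above can be sketched as a simple inclusive check. This
is an illustration only, not the kernel code from the patch; the function
name `ping_allowed` and the idea of passing the caller's groups as an array
are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: a process may create a ping socket if any of its
 * group IDs falls inside the inclusive range [low, high].
 */
static bool ping_allowed(uint32_t low, uint32_t high,
			 const uint32_t *gids, size_t n)
{
	size_t i;

	assert(gids != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (low <= gids[i] && gids[i] <= high)
			return true;
	return false;
}
```

With the default "1 0" the range is empty (low > high), so no group ID can
match, which is how the feature stays disabled even for root.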

The existing code might be (in the unlikely case anyone needs it)
extended rather easily to handle other similar pairs of ICMP messages
(Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply
etc.).

Userspace ping util & patch for it:
http://openwall.info/wiki/people/segoon/ping

For Openwall GNU/*/Linux it was the last step on the road to the
setuid-less distro.  A revision of this patch (for RHEL5/OpenVZ kernels)
is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs:
http://mirrors.kernel.org/openwall/Owl/current/iso/

Initially this functionality was written by Pavel Kankovsky for
Linux 2.4.32, but unfortunately it was never made public.

All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I) were tested with
the patch.

PATCH v3:
    - switched to flowi4.
    - minor changes to be consistent with raw sockets code.

PATCH v2:
    - changed ping_debug() to pr_debug().
    - removed CONFIG_IP_PING.
    - removed ping_seq_fops.owner field (unused for procfs).
    - switched to proc_net_fops_create().
    - switched to %pK in seq_printf().

PATCH v1:
    - fixed checksumming bug.
    - CAP_NET_RAW may not create icmp sockets anymore.

RFC v2:
    - minor cleanups.
    - introduced sysctl'able group range to restrict socket(2).

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
---
 include/net/netns/ipv4.h   |    2 +
 include/net/ping.h         |   57 +++
 net/ipv4/Makefile          |    2 +-
 net/ipv4/af_inet.c         |   22 +
 net/ipv4/icmp.c            |   12 +-
 net/ipv4/ping.c            |  937 ++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/sysctl_net_ipv4.c |   80 ++++
 7 files changed, 1110 insertions(+), 2 deletions(-)
 create mode 100644 include/net/ping.h
 create mode 100644 net/ipv4/ping.c

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 542195d..d786b4f 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -54,6 +54,8 @@ struct netns_ipv4 {
 	int sysctl_rt_cache_rebuild_count;
 	int current_rt_cache_rebuild_count;
 
+	unsigned int sysctl_ping_group_range[2];
+
 	atomic_t rt_genid;
 	atomic_t dev_addr_genid;
 
diff --git a/include/net/ping.h b/include/net/ping.h
new file mode 100644
index 0000000..23062c3
--- /dev/null
+++ b/include/net/ping.h
@@ -0,0 +1,57 @@
+/*
+ * INET		An implementation of the TCP/IP protocol suite for the LINUX
+ *		operating system.  INET is implemented using the  BSD Socket
+ *		interface as the means of communication with the user level.
+ *
+ *		Definitions for the "ping" module.
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ */
+#ifndef _PING_H
+#define _PING_H
+
+#include <net/netns/hash.h>
+
+/* PING_HTABLE_SIZE must be power of 2 */
+#define PING_HTABLE_SIZE 	64
+#define PING_HTABLE_MASK 	(PING_HTABLE_SIZE-1)
+
+#define ping_portaddr_for_each_entry(__sk, node, list) \
+	hlist_nulls_for_each_entry(__sk, node, list, sk_nulls_node)
+
+/*
+ * gid_t is either uint or ushort.  We want to pass it to
+ * proc_dointvec_minmax(), so it must not be larger than MAX_INT
+ */
+#define GID_T_MAX (((gid_t)~0U) >> 1)
+
+struct ping_table {
+	struct hlist_nulls_head	hash[PING_HTABLE_SIZE];
+	rwlock_t		lock;
+};
+
+struct ping_iter_state {
+	struct seq_net_private  p;
+	int			bucket;
+};
+
+extern struct proto ping_prot;
+
+
+extern void ping_rcv(struct sk_buff *);
+extern void ping_err(struct sk_buff *, u32 info);
+
+extern void inet_get_ping_group_range_net(struct net *net, unsigned int *low, unsigned int *high);
+
+#ifdef CONFIG_PROC_FS
+extern int __init ping_proc_init(void);
+extern void ping_proc_exit(void);
+#endif
+
+void __init ping_init(void);
+
+
+#endif /* _PING_H */
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index 0dc772d..f2dc69c 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -11,7 +11,7 @@ obj-y     := route.o inetpeer.o protocol.o \
 	     datagram.o raw.o udp.o udplite.o \
 	     arp.o icmp.o devinet.o af_inet.o  igmp.o \
 	     fib_frontend.o fib_semantics.o fib_trie.o \
-	     inet_fragment.o
+	     inet_fragment.o ping.o
 
 obj-$(CONFIG_SYSCTL) += sysctl_net_ipv4.o
 obj-$(CONFIG_PROC_FS) += proc.o
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 851aa05..cc14631 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -105,6 +105,7 @@
 #include <net/tcp.h>
 #include <net/udp.h>
 #include <net/udplite.h>
+#include <net/ping.h>
 #include <linux/skbuff.h>
 #include <net/sock.h>
 #include <net/raw.h>
@@ -1008,6 +1009,14 @@ static struct inet_protosw inetsw_array[] =
 		.flags =      INET_PROTOSW_PERMANENT,
        },
 
+       {
+		.type =       SOCK_DGRAM,
+		.protocol =   IPPROTO_ICMP,
+		.prot =       &ping_prot,
+		.ops =        &inet_dgram_ops,
+		.no_check =   UDP_CSUM_DEFAULT,
+		.flags =      INET_PROTOSW_REUSE,
+       },
 
        {
 	       .type =       SOCK_RAW,
@@ -1527,6 +1536,7 @@ static const struct net_protocol udp_protocol = {
 
 static const struct net_protocol icmp_protocol = {
 	.handler =	icmp_rcv,
+	.err_handler =	ping_err,
 	.no_policy =	1,
 	.netns_ok =	1,
 };
@@ -1642,6 +1652,10 @@ static int __init inet_init(void)
 	if (rc)
 		goto out_unregister_udp_proto;
 
+	rc = proto_register(&ping_prot, 1);
+	if (rc)
+		goto out_unregister_raw_proto;
+
 	/*
 	 *	Tell SOCKET that we are alive...
 	 */
@@ -1697,6 +1711,8 @@ static int __init inet_init(void)
 	/* Add UDP-Lite (RFC 3828) */
 	udplite4_register();
 
+	ping_init();
+
 	/*
 	 *	Set the ICMP layer up
 	 */
@@ -1727,6 +1743,8 @@ static int __init inet_init(void)
 	rc = 0;
 out:
 	return rc;
+out_unregister_raw_proto:
+	proto_unregister(&raw_prot);
 out_unregister_udp_proto:
 	proto_unregister(&udp_prot);
 out_unregister_tcp_proto:
@@ -1751,11 +1769,15 @@ static int __init ipv4_proc_init(void)
 		goto out_tcp;
 	if (udp4_proc_init())
 		goto out_udp;
+	if (ping_proc_init())
+		goto out_ping;
 	if (ip_misc_proc_init())
 		goto out_misc;
 out:
 	return rc;
 out_misc:
+	ping_proc_exit();
+out_ping:
 	udp4_proc_exit();
 out_udp:
 	tcp4_proc_exit();
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 853a670..7c47eca 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -83,6 +83,7 @@
 #include <net/tcp.h>
 #include <net/udp.h>
 #include <net/raw.h>
+#include <net/ping.h>
 #include <linux/skbuff.h>
 #include <net/sock.h>
 #include <linux/errno.h>
@@ -781,6 +782,15 @@ static void icmp_redirect(struct sk_buff *skb)
 			       iph->saddr, skb->dev);
 		break;
 	}
+
+	/* Ping wants to see redirects.
+         * Let's pretend they are errors of sorts... */
+	if (iph->protocol == IPPROTO_ICMP &&
+	    iph->ihl >= 5 &&
+	    pskb_may_pull(skb, (iph->ihl<<2)+8)) {
+		ping_err(skb, icmp_hdr(skb)->un.gateway);
+	}
+
 out:
 	return;
 out_err:
@@ -1041,7 +1051,7 @@ error:
  */
 static const struct icmp_control icmp_pointers[NR_ICMP_TYPES + 1] = {
 	[ICMP_ECHOREPLY] = {
-		.handler = icmp_discard,
+		.handler = ping_rcv,
 	},
 	[1] = {
 		.handler = icmp_discard,
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
new file mode 100644
index 0000000..a77e2d7
--- /dev/null
+++ b/net/ipv4/ping.c
@@ -0,0 +1,937 @@
+/*
+ * INET		An implementation of the TCP/IP protocol suite for the LINUX
+ *		operating system.  INET is implemented using the  BSD Socket
+ *		interface as the means of communication with the user level.
+ *
+ *		"Ping" sockets
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * Based on ipv4/udp.c code.
+ *
+ * Authors:	Vasiliy Kulikov / Openwall (for Linux 2.6),
+ *		Pavel Kankovsky (for Linux 2.4.32)
+ *
+ * Pavel gave all rights to bugs to Vasiliy,
+ * none of the bugs are Pavel's now.
+ *
+ */
+
+#include <asm/system.h>
+#include <linux/uaccess.h>
+#include <asm/ioctls.h>
+#include <linux/types.h>
+#include <linux/fcntl.h>
+#include <linux/socket.h>
+#include <linux/sockios.h>
+#include <linux/in.h>
+#include <linux/errno.h>
+#include <linux/timer.h>
+#include <linux/mm.h>
+#include <linux/inet.h>
+#include <linux/netdevice.h>
+#include <net/snmp.h>
+#include <net/ip.h>
+#include <net/ipv6.h>
+#include <net/icmp.h>
+#include <net/protocol.h>
+#include <linux/skbuff.h>
+#include <linux/proc_fs.h>
+#include <net/sock.h>
+#include <net/ping.h>
+#include <net/icmp.h>
+#include <net/udp.h>
+#include <net/route.h>
+#include <net/inet_common.h>
+#include <net/checksum.h>
+
+
+struct ping_table ping_table __read_mostly;
+
+u16 ping_port_rover;
+
+static inline int ping_hashfn(struct net *net, unsigned num, unsigned mask)
+{
+	int res = (num + net_hash_mix(net)) & mask;
+	pr_debug("hash(%d) = %d\n", num, res);
+	return res;
+}
+
+static inline struct hlist_nulls_head *ping_hashslot(struct ping_table *table,
+					     struct net *net, unsigned num)
+{
+	return &table->hash[ping_hashfn(net, num, PING_HTABLE_MASK)];
+}
+
+static int ping_v4_get_port(struct sock *sk, unsigned short ident)
+{
+	struct hlist_nulls_node *node;
+	struct hlist_nulls_head *hlist;
+	struct inet_sock *isk, *isk2;
+	struct sock *sk2 = NULL;
+
+	isk = inet_sk(sk);
+	write_lock_bh(&ping_table.lock);
+	if (ident == 0) {
+		u32 i;
+		u16 result = ping_port_rover + 1;
+
+		for (i = 0; i < (1L << 16); i++, result++) {
+			if (!result)
+				result++; /* avoid zero */
+			hlist = ping_hashslot(&ping_table, sock_net(sk),
+					    result);
+			ping_portaddr_for_each_entry(sk2, node, hlist) {
+				isk2 = inet_sk(sk2);
+
+				if (isk2->inet_num == result)
+					goto next_port;
+			}
+
+			/* found */
+			ping_port_rover = ident = result;
+			break;
+next_port:
+			;
+		}
+		if (i >= (1L << 16))
+			goto fail;
+	} else {
+		hlist = ping_hashslot(&ping_table, sock_net(sk), ident);
+		ping_portaddr_for_each_entry(sk2, node, hlist) {
+			isk2 = inet_sk(sk2);
+
+			if ((isk2->inet_num == ident) &&
+			    (sk2 != sk) &&
+			    (!sk2->sk_reuse || !sk->sk_reuse))
+				goto fail;
+		}
+	}
+
+	pr_debug("found port/ident = %d\n", ident);
+	isk->inet_num = ident;
+	if (sk_unhashed(sk)) {
+		pr_debug("was not hashed\n");
+		sock_hold(sk);
+		hlist_nulls_add_head(&sk->sk_nulls_node, hlist);
+		sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
+	}
+	write_unlock_bh(&ping_table.lock);
+	return 0;
+
+fail:
+	write_unlock_bh(&ping_table.lock);
+	return 1;
+}
+
+static void ping_v4_hash(struct sock *sk)
+{
+	pr_debug("ping_v4_hash(sk->port=%u)\n", inet_sk(sk)->inet_num);
+	BUG(); /* "Please do not press this button again." */
+}
+
+static void ping_v4_unhash(struct sock *sk)
+{
+	struct inet_sock *isk = inet_sk(sk);
+	pr_debug("ping_v4_unhash(isk=%p,isk->num=%u)\n", isk, isk->inet_num);
+	if (sk_hashed(sk)) {
+		struct hlist_nulls_head *hslot;
+
+		hslot = ping_hashslot(&ping_table, sock_net(sk), isk->inet_num);
+		write_lock_bh(&ping_table.lock);
+		hlist_nulls_del(&sk->sk_nulls_node);
+		sock_put(sk);
+		isk->inet_num = isk->inet_sport = 0;
+		sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
+		write_unlock_bh(&ping_table.lock);
+	}
+}
+
+struct sock *ping_v4_lookup(struct net *net, u32 saddr, u32 daddr,
+	 u16 ident, int dif)
+{
+	struct hlist_nulls_head *hslot = ping_hashslot(&ping_table, net, ident);
+	struct sock *sk = NULL;
+	struct inet_sock *isk;
+	struct hlist_nulls_node *hnode;
+
+	pr_debug("try to find: num = %d, daddr = %ld, dif = %d\n",
+			 (int)ident, (unsigned long)daddr, dif);
+	read_lock_bh(&ping_table.lock);
+
+	ping_portaddr_for_each_entry(sk, hnode, hslot) {
+		isk = inet_sk(sk);
+
+		pr_debug("found: %p: num = %d, daddr = %ld, dif = %d\n", sk,
+			 (int)isk->inet_num, (unsigned long)isk->inet_rcv_saddr,
+			 sk->sk_bound_dev_if);
+
+		pr_debug("iterate\n");
+		if (isk->inet_num != ident)
+			continue;
+		if (isk->inet_rcv_saddr && isk->inet_rcv_saddr != daddr)
+			continue;
+		if (sk->sk_bound_dev_if && sk->sk_bound_dev_if != dif)
+			continue;
+
+		sock_hold(sk);
+		goto exit;
+	}
+
+	sk = NULL;
+exit:
+	read_unlock_bh(&ping_table.lock);
+
+	return sk;
+}
+
+static int ping_init_sock(struct sock *sk)
+{
+	struct net *net = sock_net(sk);
+	gid_t group = current_egid();
+	gid_t range[2];
+	struct group_info *group_info = get_current_groups();
+	int i, j, count = group_info->ngroups;
+
+	inet_get_ping_group_range_net(net, range, range+1);
+	if (range[0] <= group && group <= range[1])
+		return 0;
+
+	for (i = 0; i < group_info->nblocks; i++) {
+		int cp_count = min_t(int, NGROUPS_PER_BLOCK, count);
+
+		for (j = 0; j < cp_count; j++) {
+			group = group_info->blocks[i][j];
+			if (range[0] <= group && group <= range[1])
+				return 0;
+		}
+
+		count -= cp_count;
+	}
+
+	return -EACCES;
+}
+
+static void ping_close(struct sock *sk, long timeout)
+{
+	pr_debug("ping_close(sk=%p,sk->num=%u)\n",
+		inet_sk(sk), inet_sk(sk)->inet_num);
+	pr_debug("isk->refcnt = %d\n", sk->sk_refcnt.counter);
+
+	sk_common_release(sk);
+}
+
+/*
+ * We need our own bind because there are no privileged id's == local ports.
+ * Moreover, we don't allow binding to multi- and broadcast addresses.
+ */
+
+static int ping_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len)
+{
+	struct sockaddr_in *addr = (struct sockaddr_in *)uaddr;
+	struct inet_sock *isk = inet_sk(sk);
+	unsigned short snum;
+	int chk_addr_ret;
+	int err;
+
+	if (addr_len < sizeof(struct sockaddr_in))
+		return -EINVAL;
+
+	pr_debug("ping_v4_bind(sk=%p,sa_addr=%08x,sa_port=%d)\n",
+		sk, addr->sin_addr.s_addr, ntohs(addr->sin_port));
+
+	chk_addr_ret = inet_addr_type(sock_net(sk), addr->sin_addr.s_addr);
+	if (addr->sin_addr.s_addr == INADDR_ANY)
+		chk_addr_ret = RTN_LOCAL;
+
+	if ((sysctl_ip_nonlocal_bind == 0 &&
+	    isk->freebind == 0 && isk->transparent == 0 &&
+	     chk_addr_ret != RTN_LOCAL) ||
+	    chk_addr_ret == RTN_MULTICAST ||
+	    chk_addr_ret == RTN_BROADCAST)
+		return -EADDRNOTAVAIL;
+
+	lock_sock(sk);
+
+	err = -EINVAL;
+	if (isk->inet_num != 0)
+		goto out;
+
+	err = -EADDRINUSE;
+	isk->inet_rcv_saddr = isk->inet_saddr = addr->sin_addr.s_addr;
+	snum = ntohs(addr->sin_port);
+	if (ping_v4_get_port(sk, snum) != 0) {
+		isk->inet_saddr = isk->inet_rcv_saddr = 0;
+		goto out;
+	}
+
+	pr_debug("after bind(): num = %d, daddr = %ld, dif = %d\n",
+		(int)isk->inet_num,
+		(unsigned long) isk->inet_rcv_saddr,
+		(int)sk->sk_bound_dev_if);
+
+	err = 0;
+	if (isk->inet_rcv_saddr)
+		sk->sk_userlocks |= SOCK_BINDADDR_LOCK;
+	if (snum)
+		sk->sk_userlocks |= SOCK_BINDPORT_LOCK;
+	isk->inet_sport = htons(isk->inet_num);
+	isk->inet_daddr = 0;
+	isk->inet_dport = 0;
+	sk_dst_reset(sk);
+out:
+	release_sock(sk);
+	pr_debug("ping_v4_bind -> %d\n", err);
+	return err;
+}
+
+/*
+ * Is this a supported type of ICMP message?
+ */
+
+static inline int ping_supported(int type, int code)
+{
+	if (type == ICMP_ECHO && code == 0)
+		return 1;
+	return 0;
+}
+
+/*
+ * This routine is called by the ICMP module when it gets some
+ * sort of error condition.
+ */
+
+static int ping_queue_rcv_skb(struct sock *sk, struct sk_buff *skb);
+
+void ping_err(struct sk_buff *skb, u32 info)
+{
+	struct iphdr *iph = (struct iphdr *)skb->data;
+	struct icmphdr *icmph = (struct icmphdr *)(skb->data+(iph->ihl<<2));
+	struct inet_sock *inet_sock;
+	int type = icmph->type;
+	int code = icmph->code;
+	struct net *net = dev_net(skb->dev);
+	struct sock *sk;
+	int harderr;
+	int err;
+
+	/* We assume the packet has already been checked by icmp_unreach */
+
+	if (!ping_supported(icmph->type, icmph->code))
+		return;
+
+	pr_debug("ping_err(type=%04x,code=%04x,id=%04x,seq=%04x)\n", type,
+		code, ntohs(icmph->un.echo.id), ntohs(icmph->un.echo.sequence));
+
+	sk = ping_v4_lookup(net, iph->daddr, iph->saddr,
+			    ntohs(icmph->un.echo.id), skb->dev->ifindex);
+	if (sk == NULL) {
+		ICMP_INC_STATS_BH(net, ICMP_MIB_INERRORS);
+		pr_debug("no socket, dropping\n");
+		return;	/* No socket for error */
+	}
+	pr_debug("err on socket %p\n", sk);
+
+	err = 0;
+	harderr = 0;
+	inet_sock = inet_sk(sk);
+
+	switch (type) {
+	default:
+	case ICMP_TIME_EXCEEDED:
+		err = EHOSTUNREACH;
+		break;
+	case ICMP_SOURCE_QUENCH:
+		/* This is not a real error but ping wants to see it.
+		 * Report it with some fake errno. */
+		err = EREMOTEIO;
+		break;
+	case ICMP_PARAMETERPROB:
+		err = EPROTO;
+		harderr = 1;
+		break;
+	case ICMP_DEST_UNREACH:
+		if (code == ICMP_FRAG_NEEDED) { /* Path MTU discovery */
+			if (inet_sock->pmtudisc != IP_PMTUDISC_DONT) {
+				err = EMSGSIZE;
+				harderr = 1;
+				break;
+			}
+			goto out;
+		}
+		err = EHOSTUNREACH;
+		if (code <= NR_ICMP_UNREACH) {
+			harderr = icmp_err_convert[code].fatal;
+			err = icmp_err_convert[code].errno;
+		}
+		break;
+	case ICMP_REDIRECT:
+		/* See ICMP_SOURCE_QUENCH */
+		err = EREMOTEIO;
+		break;
+	}
+
+	/*
+	 *      RFC1122: OK.  Passes ICMP errors back to application, as per
+	 *	4.1.3.3.
+	 */
+	if (!inet_sock->recverr) {
+		if (!harderr || sk->sk_state != TCP_ESTABLISHED)
+			goto out;
+	} else {
+		ip_icmp_error(sk, skb, err, 0 /* no remote port */,
+			 info, (u8 *)icmph);
+	}
+	sk->sk_err = err;
+	sk->sk_error_report(sk);
+out:
+	sock_put(sk);
+}
+
+/*
+ *	Copy and checksum an ICMP Echo packet from user space into a buffer.
+ */
+
+struct pingfakehdr {
+	struct icmphdr icmph;
+	struct iovec *iov;
+	u32 wcheck;
+};
+
+static int ping_getfrag(void *from, char *to,
+			int offset, int fraglen, int odd, struct sk_buff *skb)
+{
+	struct pingfakehdr *pfh = (struct pingfakehdr *)from;
+
+	if (offset == 0) {
+		if (fraglen < sizeof(struct icmphdr))
+			BUG();
+		if (csum_partial_copy_fromiovecend(to + sizeof(struct icmphdr),
+			    pfh->iov, 0, fraglen - sizeof(struct icmphdr),
+			    &pfh->wcheck))
+			return -EFAULT;
+
+		return 0;
+	}
+	if (offset < sizeof(struct icmphdr))
+		BUG();
+	if (csum_partial_copy_fromiovecend
+			(to, pfh->iov, offset - sizeof(struct icmphdr),
+			 fraglen, &pfh->wcheck))
+		return -EFAULT;
+	return 0;
+}
+
+static int ping_push_pending_frames(struct sock *sk, struct pingfakehdr *pfh, struct flowi4 *fl4)
+{
+	struct sk_buff *skb = skb_peek(&sk->sk_write_queue);
+
+	pfh->wcheck = csum_partial((char *)&pfh->icmph,
+		sizeof(struct icmphdr), pfh->wcheck);
+	pfh->icmph.checksum = csum_fold(pfh->wcheck);
+	memcpy(icmp_hdr(skb), &pfh->icmph, sizeof(struct icmphdr));
+	skb->ip_summed = CHECKSUM_NONE;
+	return ip_push_pending_frames(sk, fl4);
+}
+
+int ping_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
+		 size_t len)
+{
+	struct net *net = sock_net(sk);
+	struct flowi4 fl4;
+	struct inet_sock *inet = inet_sk(sk);
+	struct ipcm_cookie ipc;
+	struct icmphdr user_icmph;
+	struct pingfakehdr pfh;
+	struct rtable *rt = NULL;
+	struct ip_options_data opt_copy;
+	int free = 0;
+	u32 saddr, daddr, faddr;
+	u8  tos;
+	int err;
+
+	pr_debug("ping_sendmsg(sk=%p,sk->num=%u)\n", inet, inet->inet_num);
+
+
+	if (len > 0xFFFF)
+		return -EMSGSIZE;
+
+	/*
+	 *	Check the flags.
+	 */
+
+	/* Mirror BSD error message compatibility */
+	if (msg->msg_flags & MSG_OOB)
+		return -EOPNOTSUPP;
+
+	/*
+	 *	Fetch the ICMP header provided by the userland.
+	 *	iovec is modified!
+	 */
+
+	if (memcpy_fromiovec((u8 *)&user_icmph, msg->msg_iov,
+			     sizeof(struct icmphdr)))
+		return -EFAULT;
+	if (!ping_supported(user_icmph.type, user_icmph.code))
+		return -EINVAL;
+
+	/*
+	 *	Get and verify the address.
+	 */
+
+	if (msg->msg_name) {
+		struct sockaddr_in *usin = (struct sockaddr_in *)msg->msg_name;
+		if (msg->msg_namelen < sizeof(*usin))
+			return -EINVAL;
+		if (usin->sin_family != AF_INET)
+			return -EINVAL;
+		daddr = usin->sin_addr.s_addr;
+		/* no remote port */
+	} else {
+		if (sk->sk_state != TCP_ESTABLISHED)
+			return -EDESTADDRREQ;
+		daddr = inet->inet_daddr;
+		/* no remote port */
+	}
+
+	ipc.addr = inet->inet_saddr;
+	ipc.opt = NULL;
+	ipc.oif = sk->sk_bound_dev_if;
+	ipc.tx_flags = 0;
+	err = sock_tx_timestamp(sk, &ipc.tx_flags);
+	if (err)
+		return err;
+
+	if (msg->msg_controllen) {
+		err = ip_cmsg_send(sock_net(sk), msg, &ipc);
+		if (err)
+			return err;
+		if (ipc.opt)
+			free = 1;
+	}
+	if (!ipc.opt) {
+		struct ip_options_rcu *inet_opt;
+
+		rcu_read_lock();
+		inet_opt = rcu_dereference(inet->inet_opt);
+		if (inet_opt) {
+			memcpy(&opt_copy, inet_opt,
+			       sizeof(*inet_opt) + inet_opt->opt.optlen);
+			ipc.opt = &opt_copy.opt;
+		}
+		rcu_read_unlock();
+	}
+
+	saddr = ipc.addr;
+	ipc.addr = faddr = daddr;
+
+	if (ipc.opt && ipc.opt->opt.srr) {
+		if (!daddr)
+			return -EINVAL;
+		faddr = ipc.opt->opt.faddr;
+	}
+	tos = RT_TOS(inet->tos);
+	if (sock_flag(sk, SOCK_LOCALROUTE) ||
+	    (msg->msg_flags & MSG_DONTROUTE) ||
+	    (ipc.opt && ipc.opt->opt.is_strictroute)) {
+		tos |= RTO_ONLINK;
+	}
+
+	if (ipv4_is_multicast(daddr)) {
+		if (!ipc.oif)
+			ipc.oif = inet->mc_index;
+		if (!saddr)
+			saddr = inet->mc_addr;
+	}
+
+	flowi4_init_output(&fl4, ipc.oif, sk->sk_mark, tos,
+			   RT_SCOPE_UNIVERSE, sk->sk_protocol,
+			   inet_sk_flowi_flags(sk), faddr, saddr, 0, 0);
+
+	security_sk_classify_flow(sk, flowi4_to_flowi(&fl4));
+	rt = ip_route_output_flow(net, &fl4, sk);
+	if (IS_ERR(rt)) {
+		err = PTR_ERR(rt);
+		rt = NULL;
+		if (err == -ENETUNREACH)
+			IP_INC_STATS_BH(net, IPSTATS_MIB_OUTNOROUTES);
+		goto out;
+	}
+
+	err = -EACCES;
+	if ((rt->rt_flags & RTCF_BROADCAST) &&
+	    !sock_flag(sk, SOCK_BROADCAST))
+		goto out;
+
+	if (msg->msg_flags & MSG_CONFIRM)
+		goto do_confirm;
+back_from_confirm:
+
+	if (!ipc.addr)
+		ipc.addr = fl4.daddr;
+
+	lock_sock(sk);
+
+	pfh.icmph.type = user_icmph.type; /* already checked */
+	pfh.icmph.code = user_icmph.code; /* ditto */
+	pfh.icmph.checksum = 0;
+	pfh.icmph.un.echo.id = inet->inet_sport;
+	pfh.icmph.un.echo.sequence = user_icmph.un.echo.sequence;
+	pfh.iov = msg->msg_iov;
+	pfh.wcheck = 0;
+
+	err = ip_append_data(sk, &fl4, ping_getfrag, &pfh, len,
+			0, &ipc, &rt, msg->msg_flags);
+	if (err)
+		ip_flush_pending_frames(sk);
+	else
+		err = ping_push_pending_frames(sk, &pfh, &fl4);
+	release_sock(sk);
+
+out:
+	ip_rt_put(rt);
+	if (free)
+		kfree(ipc.opt);
+	if (!err) {
+		icmp_out_count(sock_net(sk), user_icmph.type);
+		return len;
+	}
+	return err;
+
+do_confirm:
+	dst_confirm(&rt->dst);
+	if (!(msg->msg_flags & MSG_PROBE) || len)
+		goto back_from_confirm;
+	err = 0;
+	goto out;
+}
+
+/*
+ *	IOCTL requests applicable to the UDP^H^H^HICMP protocol
+ */
+
+int ping_ioctl(struct sock *sk, int cmd, unsigned long arg)
+{
+	pr_debug("ping_ioctl(sk=%p,sk->num=%u,cmd=%d,arg=%lu)\n",
+		inet_sk(sk), inet_sk(sk)->inet_num, cmd, arg);
+	switch (cmd) {
+	case SIOCOUTQ:
+	case SIOCINQ:
+		return udp_ioctl(sk, cmd, arg);
+	default:
+		return -ENOIOCTLCMD;
+	}
+}
+
+int ping_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
+		 size_t len, int noblock, int flags, int *addr_len)
+{
+	struct inet_sock *isk = inet_sk(sk);
+	struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name;
+	struct sk_buff *skb;
+	int copied, err;
+
+	pr_debug("ping_recvmsg(sk=%p,sk->num=%u)\n", isk, isk->inet_num);
+
+	if (flags & MSG_OOB)
+		return -EOPNOTSUPP;	/* err is not yet set here; do not return it uninitialized */
+
+	if (addr_len)
+		*addr_len = sizeof(*sin);
+
+	if (flags & MSG_ERRQUEUE)
+		return ip_recv_error(sk, msg, len);
+
+	skb = skb_recv_datagram(sk, flags, noblock, &err);
+	if (!skb)
+		goto out;
+
+	copied = skb->len;
+	if (copied > len) {
+		msg->msg_flags |= MSG_TRUNC;
+		copied = len;
+	}
+
+	/* Don't bother checking the checksum */
+	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	if (err)
+		goto done;
+
+	sock_recv_timestamp(msg, sk, skb);
+
+	/* Copy the address. */
+	if (sin) {
+		sin->sin_family = AF_INET;
+		sin->sin_port = 0 /* skb->h.uh->source */;
+		sin->sin_addr.s_addr = ip_hdr(skb)->saddr;
+		memset(sin->sin_zero, 0, sizeof(sin->sin_zero));
+	}
+	if (isk->cmsg_flags)
+		ip_cmsg_recv(msg, skb);
+	err = copied;
+
+done:
+	skb_free_datagram(sk, skb);
+out:
+	pr_debug("ping_recvmsg -> %d\n", err);
+	return err;
+}
+
+static int ping_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
+{
+	pr_debug("ping_queue_rcv_skb(sk=%p,sk->num=%d,skb=%p)\n",
+		inet_sk(sk), inet_sk(sk)->inet_num, skb);
+	if (sock_queue_rcv_skb(sk, skb) < 0) {
+		ICMP_INC_STATS_BH(sock_net(sk), ICMP_MIB_INERRORS);
+		kfree_skb(skb);
+		pr_debug("ping_queue_rcv_skb -> failed\n");
+		return -1;
+	}
+	return 0;
+}
+
+
+/*
+ *	All we need to do is get the socket.
+ */
+
+void ping_rcv(struct sk_buff *skb)
+{
+	struct sock *sk;
+	struct net *net = dev_net(skb->dev);
+	struct iphdr *iph = ip_hdr(skb);
+	struct icmphdr *icmph = icmp_hdr(skb);
+	u32 saddr = iph->saddr;
+	u32 daddr = iph->daddr;
+
+	/* We assume the packet has already been checked by icmp_rcv */
+
+	pr_debug("ping_rcv(skb=%p,id=%04x,seq=%04x)\n",
+		skb, ntohs(icmph->un.echo.id), ntohs(icmph->un.echo.sequence));
+
+	/* Push ICMP header back */
+	skb_push(skb, skb->data - (u8 *)icmph);
+
+	sk = ping_v4_lookup(net, saddr, daddr, ntohs(icmph->un.echo.id),
+			    skb->dev->ifindex);
+	if (sk != NULL) {
+		pr_debug("rcv on socket %p\n", sk);
+		ping_queue_rcv_skb(sk, skb_get(skb));
+		sock_put(sk);
+		return;
+	}
+	pr_debug("no socket, dropping\n");
+
+	/* We're called from icmp_rcv(). kfree_skb() is done there. */
+}
+
+struct proto ping_prot = {
+	.name =		"PING",
+	.owner =	THIS_MODULE,
+	.init =		ping_init_sock,
+	.close =	ping_close,
+	.connect =	ip4_datagram_connect,
+	.disconnect =	udp_disconnect,
+	.ioctl =	ping_ioctl,
+	.setsockopt =	ip_setsockopt,
+	.getsockopt =	ip_getsockopt,
+	.sendmsg =	ping_sendmsg,
+	.recvmsg =	ping_recvmsg,
+	.bind =		ping_bind,
+	.backlog_rcv =	ping_queue_rcv_skb,
+	.hash =		ping_v4_hash,
+	.unhash =	ping_v4_unhash,
+	.get_port =	ping_v4_get_port,
+	.obj_size =	sizeof(struct inet_sock),
+};
+EXPORT_SYMBOL(ping_prot);
+
+#ifdef CONFIG_PROC_FS
+
+static struct sock *ping_get_first(struct seq_file *seq, int start)
+{
+	struct sock *sk;
+	struct ping_iter_state *state = seq->private;
+	struct net *net = seq_file_net(seq);
+
+	for (state->bucket = start; state->bucket < PING_HTABLE_SIZE;
+	     ++state->bucket) {
+		struct hlist_nulls_node *node;
+		struct hlist_nulls_head *hslot = &ping_table.hash[state->bucket];
+
+		if (hlist_nulls_empty(hslot))
+			continue;
+
+		sk_nulls_for_each(sk, node, hslot) {
+			if (net_eq(sock_net(sk), net))
+				goto found;
+		}
+	}
+	sk = NULL;
+found:
+	return sk;
+}
+
+static struct sock *ping_get_next(struct seq_file *seq, struct sock *sk)
+{
+	struct ping_iter_state *state = seq->private;
+	struct net *net = seq_file_net(seq);
+
+	do {
+		sk = sk_nulls_next(sk);
+	} while (sk && (!net_eq(sock_net(sk), net)));
+
+	if (!sk)
+		return ping_get_first(seq, state->bucket + 1);
+	return sk;
+}
+
+static struct sock *ping_get_idx(struct seq_file *seq, loff_t pos)
+{
+	struct sock *sk = ping_get_first(seq, 0);
+
+	if (sk)
+		while (pos && (sk = ping_get_next(seq, sk)) != NULL)
+			--pos;
+	return pos ? NULL : sk;
+}
+
+static void *ping_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	struct ping_iter_state *state = seq->private;
+	state->bucket = 0;
+
+	read_lock_bh(&ping_table.lock);
+
+	return *pos ? ping_get_idx(seq, *pos-1) : SEQ_START_TOKEN;
+}
+
+static void *ping_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct sock *sk;
+
+	if (v == SEQ_START_TOKEN)
+		sk = ping_get_idx(seq, 0);
+	else
+		sk = ping_get_next(seq, v);
+
+	++*pos;
+	return sk;
+}
+
+static void ping_seq_stop(struct seq_file *seq, void *v)
+{
+	read_unlock_bh(&ping_table.lock);
+}
+
+static void ping_format_sock(struct sock *sp, struct seq_file *f,
+		int bucket, int *len)
+{
+	struct inet_sock *inet = inet_sk(sp);
+	__be32 dest = inet->inet_daddr;
+	__be32 src = inet->inet_rcv_saddr;
+	__u16 destp = ntohs(inet->inet_dport);
+	__u16 srcp = ntohs(inet->inet_sport);
+
+	seq_printf(f, "%5d: %08X:%04X %08X:%04X"
+		" %02X %08X:%08X %02X:%08lX %08X %5d %8d %lu %d %pK %d%n",
+		bucket, src, srcp, dest, destp, sp->sk_state,
+		sk_wmem_alloc_get(sp),
+		sk_rmem_alloc_get(sp),
+		0, 0L, 0, sock_i_uid(sp), 0, sock_i_ino(sp),
+		atomic_read(&sp->sk_refcnt), sp,
+		atomic_read(&sp->sk_drops), len);
+}
+
+static int ping_seq_show(struct seq_file *seq, void *v)
+{
+	if (v == SEQ_START_TOKEN)
+		seq_printf(seq, "%-127s\n",
+			   "  sl  local_address rem_address   st tx_queue "
+			   "rx_queue tr tm->when retrnsmt   uid  timeout "
+			   "inode ref pointer drops");
+	else {
+		struct ping_iter_state *state = seq->private;
+		int len;
+
+		ping_format_sock(v, seq, state->bucket, &len);
+		seq_printf(seq, "%*s\n", 127 - len, "");
+	}
+	return 0;
+}
+
+static const struct seq_operations ping_seq_ops = {
+	.show		= ping_seq_show,
+	.start		= ping_seq_start,
+	.next		= ping_seq_next,
+	.stop		= ping_seq_stop,
+};
+
+static int ping_seq_open(struct inode *inode, struct file *file)
+{
+	return seq_open_net(inode, file, &ping_seq_ops,
+			   sizeof(struct ping_iter_state));
+}
+
+static const struct file_operations ping_seq_fops = {
+	.open		= ping_seq_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release_net,
+};
+
+static int ping_proc_register(struct net *net)
+{
+	struct proc_dir_entry *p;
+	int rc = 0;
+
+	p = proc_net_fops_create(net, "icmp", S_IRUGO, &ping_seq_fops);
+	if (!p)
+		rc = -ENOMEM;
+	return rc;
+}
+
+static void ping_proc_unregister(struct net *net)
+{
+	proc_net_remove(net, "icmp");
+}
+
+
+static int __net_init ping_proc_init_net(struct net *net)
+{
+	return ping_proc_register(net);
+}
+
+static void __net_exit ping_proc_exit_net(struct net *net)
+{
+	ping_proc_unregister(net);
+}
+
+static struct pernet_operations ping_net_ops = {
+	.init = ping_proc_init_net,
+	.exit = ping_proc_exit_net,
+};
+
+int __init ping_proc_init(void)
+{
+	return register_pernet_subsys(&ping_net_ops);
+}
+
+void ping_proc_exit(void)
+{
+	unregister_pernet_subsys(&ping_net_ops);
+}
+
+#endif
+
+void __init ping_init(void)
+{
+	int i;
+
+	for (i = 0; i < PING_HTABLE_SIZE; i++)
+		INIT_HLIST_NULLS_HEAD(&ping_table.hash[i], i);
+	rwlock_init(&ping_table.lock);
+}
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 321e6e8..28e8273 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -13,6 +13,7 @@
 #include <linux/seqlock.h>
 #include <linux/init.h>
 #include <linux/slab.h>
+#include <linux/nsproxy.h>
 #include <net/snmp.h>
 #include <net/icmp.h>
 #include <net/ip.h>
@@ -21,6 +22,7 @@
 #include <net/udp.h>
 #include <net/cipso_ipv4.h>
 #include <net/inet_frag.h>
+#include <net/ping.h>
 
 static int zero;
 static int tcp_retr1_max = 255;
@@ -30,6 +32,8 @@ static int tcp_adv_win_scale_min = -31;
 static int tcp_adv_win_scale_max = 31;
 static int ip_ttl_min = 1;
 static int ip_ttl_max = 255;
+static int ip_ping_group_range_min[] = { 0, 0 };
+static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
 
 /* Update system visible IP port range */
 static void set_local_port_range(int range[2])
@@ -68,6 +72,65 @@ static int ipv4_local_port_range(ctl_table *table, int write,
 	return ret;
 }
 
+
+void inet_get_ping_group_range_net(struct net *net, gid_t *low, gid_t *high)
+{
+	gid_t *data = net->ipv4.sysctl_ping_group_range;
+	unsigned seq;
+	do {
+		seq = read_seqbegin(&sysctl_local_ports.lock);
+
+		*low = data[0];
+		*high = data[1];
+	} while (read_seqretry(&sysctl_local_ports.lock, seq));
+}
+
+void inet_get_ping_group_range_table(struct ctl_table *table, gid_t *low, gid_t *high)
+{
+	gid_t *data = table->data;
+	unsigned seq;
+	do {
+		seq = read_seqbegin(&sysctl_local_ports.lock);
+
+		*low = data[0];
+		*high = data[1];
+	} while (read_seqretry(&sysctl_local_ports.lock, seq));
+}
+
+/* Update system visible ping group range */
+static void set_ping_group_range(struct ctl_table *table, gid_t range[2])
+{
+	gid_t *data = table->data;
+	write_seqlock(&sysctl_local_ports.lock);
+	data[0] = range[0];
+	data[1] = range[1];
+	write_sequnlock(&sysctl_local_ports.lock);
+}
+
+/* Validate changes from /proc interface. */
+static int ipv4_ping_group_range(ctl_table *table, int write,
+				 void __user *buffer,
+				 size_t *lenp, loff_t *ppos)
+{
+	int ret;
+	gid_t range[2];
+	ctl_table tmp = {
+		.data = &range,
+		.maxlen = sizeof(range),
+		.mode = table->mode,
+		.extra1 = &ip_ping_group_range_min,
+		.extra2 = &ip_ping_group_range_max,
+	};
+
+	inet_get_ping_group_range_table(table, range, range + 1);
+	ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
+
+	if (write && ret == 0)
+		set_ping_group_range(table, range);
+
+	return ret;
+}
+
 static int proc_tcp_congestion_control(ctl_table *ctl, int write,
 				       void __user *buffer, size_t *lenp, loff_t *ppos)
 {
@@ -677,6 +740,13 @@ static struct ctl_table ipv4_net_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec
 	},
+	{
+		.procname	= "ping_group_range",
+		.data		= &init_net.ipv4.sysctl_ping_group_range,
+		.maxlen		= sizeof(init_net.ipv4.sysctl_ping_group_range),
+		.mode		= 0644,
+		.proc_handler	= ipv4_ping_group_range,
+	},
 	{ }
 };
 
@@ -711,8 +781,18 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
 			&net->ipv4.sysctl_icmp_ratemask;
 		table[6].data =
 			&net->ipv4.sysctl_rt_cache_rebuild_count;
+		table[7].data =
+			&net->ipv4.sysctl_ping_group_range;
+
 	}
 
+	/*
+	 * Sane defaults - nobody may create ping sockets.
+	 * Boot scripts should set this to distro-specific group.
+	 */
+	net->ipv4.sysctl_ping_group_range[0] = 1;
+	net->ipv4.sysctl_ping_group_range[1] = 0;
+
 	net->ipv4.sysctl_rt_cache_rebuild_count = 4;
 
 	net->ipv4.ipv4_hdr = register_net_sysctl_table(net,
-- 
1.7.0.4
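
For readers following the sysctl part of the patch: the default range {1, 0} has its low bound above its high bound, so no gid can satisfy the inclusive test used in ping_init_sock(). The following is a minimal userspace sketch of that decision, not the kernel code itself. The type gid_sketch_t and the helpers gid_in_range() and may_create_ping_socket() are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int gid_sketch_t;	/* stand-in for the kernel's gid_t */

/* True if gid lies in the inclusive range [low, high]. */
static bool gid_in_range(gid_sketch_t gid, gid_sketch_t low, gid_sketch_t high)
{
	return low <= gid && gid <= high;
}

/*
 * Same decision as ping_init_sock() in the patch: allow the socket if the
 * effective gid or any supplementary gid falls inside the range.
 */
static bool may_create_ping_socket(gid_sketch_t egid,
				   const gid_sketch_t *groups, size_t ngroups,
				   const gid_sketch_t range[2])
{
	size_t i;

	if (gid_in_range(egid, range[0], range[1]))
		return true;
	for (i = 0; i < ngroups; i++)
		if (gid_in_range(groups[i], range[0], range[1]))
			return true;
	return false;	/* the kernel code returns -EACCES here */
}
```

With range = {1, 0} this returns false for every caller, including gid 0, which matches the "nobody may create ping sockets" default in the comment above. Writing a range such as "1000 1000" to ping_group_range would then admit only members of that one group.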

^ permalink raw reply related

* Re: [PATCH] bridge: fix forwarding of IPv6
From: Eric Dumazet @ 2011-05-13 20:00 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Noah Meyerhans, Herbert Xu, David Miller, Ben Hutchings, bridge,
	netdev
In-Reply-To: <20110513125314.66861b31@nehalam>

On Friday, 13 May 2011 at 12:53 -0700, Stephen Hemminger wrote:
> The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e
>     bridge: Reset IPCB when entering IP stack on NF_FORWARD
> broke forwarding of IPv6 packets in bridge because it would
> call br_parse_ip_options with an IPv6 packet.
> 
> Reported-by: Noah Meyerhans <noahm@debian.org>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> ---
> Patch against net-next-2.6 but must be applied to net-2.6
> and stable as well
> 

Well, stable is not needed, since the faulty commit is not in 2.6.38.

Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>




^ permalink raw reply

* [PATCH] bridge: fix forwarding of IPv6
From: Stephen Hemminger @ 2011-05-13 19:53 UTC (permalink / raw)
  To: Noah Meyerhans, Herbert Xu, David Miller; +Cc: Ben Hutchings, bridge, netdev
In-Reply-To: <20110510233540.GJ6397@morgul.net>

The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e
    bridge: Reset IPCB when entering IP stack on NF_FORWARD
broke forwarding of IPv6 packets in bridge because it would
call br_parse_ip_options with an IPv6 packet.

Reported-by: Noah Meyerhans <noahm@debian.org>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
Patch against net-next-2.6 but must be applied to net-2.6
and stable as well

--- a/net/bridge/br_netfilter.c	2011-05-13 12:37:30.289646958 -0700
+++ b/net/bridge/br_netfilter.c	2011-05-13 12:38:07.820333938 -0700
@@ -737,7 +737,7 @@ static unsigned int br_nf_forward_ip(uns
 		nf_bridge->mask |= BRNF_PKT_TYPE;
 	}
 
-	if (br_parse_ip_options(skb))
+	if (pf == PF_INET && br_parse_ip_options(skb))
 		return NF_DROP;
 
 	/* The physdev module checks on this */
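
The effect of the one-line change can be sketched in userspace as follows. This is only an illustration under stated assumptions: parse_ipv4_header() is a hypothetical stand-in for br_parse_ip_options() that merely checks the version and header-length nibbles, and the verdict enum stands in for the netfilter verdicts.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>	/* PF_INET, PF_INET6 */

enum verdict { VERDICT_DROP, VERDICT_ACCEPT };

/*
 * Hypothetical stand-in for br_parse_ip_options(): returns nonzero if the
 * buffer does not start with a plausible IPv4 header.
 */
static int parse_ipv4_header(const uint8_t *data, size_t len)
{
	if (len < 20)
		return -1;
	if ((data[0] >> 4) != 4)	/* version nibble must be 4 */
		return -1;
	if ((data[0] & 0x0f) < 5)	/* header length in 32-bit words */
		return -1;
	return 0;
}

/* Before the fix: every forwarded packet is parsed as IPv4. */
static enum verdict forward_unguarded(int pf, const uint8_t *data, size_t len)
{
	(void)pf;
	if (parse_ipv4_header(data, len))
		return VERDICT_DROP;
	return VERDICT_ACCEPT;
}

/* After the fix: only PF_INET packets go through the IPv4 parser. */
static enum verdict forward_guarded(int pf, const uint8_t *data, size_t len)
{
	if (pf == PF_INET && parse_ipv4_header(data, len))
		return VERDICT_DROP;
	return VERDICT_ACCEPT;
}
```

In this sketch an IPv6 packet (first byte 0x60) is dropped by the unguarded variant and accepted by the guarded one, while a malformed IPv4 packet is still dropped by both.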


-- 

^ permalink raw reply

* Re: kernel BUG at net/ipv4/tcp_output.c:1006!
From: Eric Dumazet @ 2011-05-13 19:47 UTC (permalink / raw)
  To: TB, David Miller; +Cc: linux-kernel, netdev
In-Reply-To: <4DCD86C0.9030904@techboom.com>

On Friday, 13 May 2011 at 15:30 -0400, TB wrote:
> On 11-05-13 01:27 PM, Eric Dumazet wrote:
> > On Friday, 13 May 2011 at 13:11 -0400, TB wrote:
> >> This is the 2.6.38.5 kernel with the patch in
> >> [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error
> >>
> > 
> > Please send us full disassembly of tcp_fragment (from vmlinux file)
> 
> 
> GCC is debian 4.3.2-1.1
> AS 2.18.0.20080103
> 
> CPU is Intel Xeon E5620
> Kernel CPU is set to MCORE2 (Core 2/newer Xeon)
> 
> 
> ffffffff814e7eb0 <tcp_fragment>:
> ffffffff814e7eb0:       41 57                   push   %r15
> ffffffff814e7eb2:       49 89 ff                mov    %rdi,%r15
> ffffffff814e7eb5:       41 56                   push   %r14
> ffffffff814e7eb7:       41 55                   push   %r13
> ffffffff814e7eb9:       41 89 d5                mov    %edx,%r13d
> ffffffff814e7ebc:       41 54                   push   %r12
> ffffffff814e7ebe:       55                      push   %rbp
> ffffffff814e7ebf:       53                      push   %rbx
> ffffffff814e7ec0:       48 89 f3                mov    %rsi,%rbx
> ffffffff814e7ec3:       48 83 ec 18             sub    $0x18,%rsp
> ffffffff814e7ec7:       89 4c 24 0c             mov    %ecx,0xc(%rsp)
> ffffffff814e7ecb:       8b 6e 68                mov    0x68(%rsi),%ebp
> ffffffff814e7ece:       39 ea                   cmp    %ebp,%edx
> ffffffff814e7ed0:       76 04                   jbe    ffffffff814e7ed6 <tcp_fragment+0x26>
> ffffffff814e7ed2:       0f 0b                   ud2a
> ffffffff814e7ed4:       eb fe                   jmp    ffffffff814e7ed4 <tcp_fragment+0x24>



So skb->len = 0x1540 and len = 0x1708

I suspect we should push commit 2fceec13375e5d98 (tcp: len check is
unnecessarily devastating, change to WARN_ON) to stable if not already
done...

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2fceec13375e5d98
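
As a rough illustration of what that change means for this report: the ud2a in the disassembly is the fatal length check at the top of tcp_fragment(). The sketch below is a userspace stand-in, assuming the referenced commit does what its title says and turns the check into a warning plus an error return. The function name check_split_point() is hypothetical, and -EINVAL is an assumed error value.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the check at the top of
 * tcp_fragment(): a split point beyond the buffer length is reported
 * and refused instead of being treated as fatal.
 */
static int check_split_point(unsigned int skb_len, unsigned int len)
{
	if (len > skb_len) {
		fprintf(stderr, "warning: len %#x > skb->len %#x\n",
			len, skb_len);
		return -EINVAL;	/* assumed error code for this sketch */
	}
	return 0;
}
```

With the values from this report (skb->len = 0x1540, len = 0x1708) the check fails, so the BUG variant stops the machine while the WARN variant only logs and makes the caller back off.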

David, is this commit in your stable queue?

Thanks!

^ permalink raw reply

* Re: kernel BUG at net/ipv4/tcp_output.c:1006!
From: TB @ 2011-05-13 19:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: linux-kernel, netdev
In-Reply-To: <1305307633.3866.61.camel@edumazet-laptop>

On 11-05-13 01:27 PM, Eric Dumazet wrote:
> On Friday, 13 May 2011 at 13:11 -0400, TB wrote:
>> This is the 2.6.38.5 kernel with the patch in
>> [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error
>>
> 
> Please send us full disassembly of tcp_fragment (from vmlinux file)


GCC is debian 4.3.2-1.1
AS 2.18.0.20080103

CPU is Intel Xeon E5620
Kernel CPU is set to MCORE2 (Core 2/newer Xeon)


ffffffff814e7eb0 <tcp_fragment>:
ffffffff814e7eb0:       41 57                   push   %r15
ffffffff814e7eb2:       49 89 ff                mov    %rdi,%r15
ffffffff814e7eb5:       41 56                   push   %r14
ffffffff814e7eb7:       41 55                   push   %r13
ffffffff814e7eb9:       41 89 d5                mov    %edx,%r13d
ffffffff814e7ebc:       41 54                   push   %r12
ffffffff814e7ebe:       55                      push   %rbp
ffffffff814e7ebf:       53                      push   %rbx
ffffffff814e7ec0:       48 89 f3                mov    %rsi,%rbx
ffffffff814e7ec3:       48 83 ec 18             sub    $0x18,%rsp
ffffffff814e7ec7:       89 4c 24 0c             mov    %ecx,0xc(%rsp)
ffffffff814e7ecb:       8b 6e 68                mov    0x68(%rsi),%ebp
ffffffff814e7ece:       39 ea                   cmp    %ebp,%edx
ffffffff814e7ed0:       76 04                   jbe    ffffffff814e7ed6 <tcp_fragment+0x26>
ffffffff814e7ed2:       0f 0b                   ud2a
ffffffff814e7ed4:       eb fe                   jmp    ffffffff814e7ed4 <tcp_fragment+0x24>
ffffffff814e7ed6:       44 8b 66 6c             mov    0x6c(%rsi),%r12d
ffffffff814e7eda:       f6 46 7c 02             testb  $0x2,0x7c(%rsi)
ffffffff814e7ede:       74 33                   je     ffffffff814e7f13 <tcp_fragment+0x63>
ffffffff814e7ee0:       8b 86 b4 00 00 00       mov    0xb4(%rsi),%eax
ffffffff814e7ee6:       48 03 86 b8 00 00 00    add    0xb8(%rsi),%rax
ffffffff814e7eed:       8b 40 28                mov    0x28(%rax),%eax
ffffffff814e7ef0:       66 ff c8                dec    %ax
ffffffff814e7ef3:       74 1e                   je     ffffffff814e7f13 <tcp_fragment+0x63>
ffffffff814e7ef5:       45 85 e4                test   %r12d,%r12d
ffffffff814e7ef8:       74 19                   je     ffffffff814e7f13 <tcp_fragment+0x63>
ffffffff814e7efa:       31 d2                   xor    %edx,%edx
ffffffff814e7efc:       31 f6                   xor    %esi,%esi
ffffffff814e7efe:       b9 20 00 00 00          mov    $0x20,%ecx
ffffffff814e7f03:       48 89 df                mov    %rbx,%rdi
ffffffff814e7f06:       e8 68 fe fb ff          callq  ffffffff814a7d73 <pskb_expand_head>
ffffffff814e7f0b:       85 c0                   test   %eax,%eax
ffffffff814e7f0d:       0f 85 23 02 00 00       jne    ffffffff814e8136 <tcp_fragment+0x286>
ffffffff814e7f13:       44 29 e5                sub    %r12d,%ebp
ffffffff814e7f16:       45 31 f6                xor    %r14d,%r14d
ffffffff814e7f19:       89 e8                   mov    %ebp,%eax
ffffffff814e7f1b:       ba 20 00 00 00          mov    $0x20,%edx
ffffffff814e7f20:       44 29 e8                sub    %r13d,%eax
ffffffff814e7f23:       4c 89 ff                mov    %r15,%rdi
ffffffff814e7f26:       44 0f 49 f0             cmovns %eax,%r14d
ffffffff814e7f2a:       44 89 f6                mov    %r14d,%esi
ffffffff814e7f2d:       e8 82 51 ff ff          callq  ffffffff814dd0b4 <sk_stream_alloc_skb>
ffffffff814e7f32:       48 89 c5                mov    %rax,%rbp
ffffffff814e7f35:       48 85 c0                test   %rax,%rax
ffffffff814e7f38:       0f 84 f8 01 00 00       je     ffffffff814e8136 <tcp_fragment+0x286>
ffffffff814e7f3e:       8b 80 c8 00 00 00       mov    0xc8(%rax),%eax
ffffffff814e7f44:       41 01 87 1c 01 00 00    add    %eax,0x11c(%r15)
ffffffff814e7f4b:       49 8b 47 28             mov    0x28(%r15),%rax
ffffffff814e7f4f:       8b 95 c8 00 00 00       mov    0xc8(%rbp),%edx
ffffffff814e7f55:       48 83 b8 c8 00 00 00    cmpq   $0x0,0xc8(%rax)
ffffffff814e7f5c:       00
ffffffff814e7f5d:       74 07                   je     ffffffff814e7f66 <tcp_fragment+0xb6>
ffffffff814e7f5f:       41 29 97 98 00 00 00    sub    %edx,0x98(%r15)
ffffffff814e7f66:       8b 43 68                mov    0x68(%rbx),%eax
ffffffff814e7f69:       4c 8d 63 28             lea    0x28(%rbx),%r12
ffffffff814e7f6d:       44 29 e8                sub    %r13d,%eax
ffffffff814e7f70:       44 89 ea                mov    %r13d,%edx
ffffffff814e7f73:       44 29 f0                sub    %r14d,%eax
ffffffff814e7f76:       01 85 c8 00 00 00       add    %eax,0xc8(%rbp)
ffffffff814e7f7c:       29 83 c8 00 00 00       sub    %eax,0xc8(%rbx)
ffffffff814e7f82:       48 8d 45 28             lea    0x28(%rbp),%rax
ffffffff814e7f86:       48 89 44 24 10          mov    %rax,0x10(%rsp)
ffffffff814e7f8b:       41 03 54 24 10          add    0x10(%r12),%edx
ffffffff814e7f90:       89 50 10                mov    %edx,0x10(%rax)
ffffffff814e7f93:       41 8b 44 24 14          mov    0x14(%r12),%eax
ffffffff814e7f98:       48 8b 4c 24 10          mov    0x10(%rsp),%rcx
ffffffff814e7f9d:       89 41 14                mov    %eax,0x14(%rcx)
ffffffff814e7fa0:       41 89 54 24 14          mov    %edx,0x14(%r12)
ffffffff814e7fa5:       41 8a 54 24 1c          mov    0x1c(%r12),%dl
ffffffff814e7faa:       88 d0                   mov    %dl,%al
ffffffff814e7fac:       83 e0 f6                and    $0xfffffffffffffff6,%eax
ffffffff814e7faf:       41 88 44 24 1c          mov    %al,0x1c(%r12)
ffffffff814e7fb4:       88 51 1c                mov    %dl,0x1c(%rcx)
ffffffff814e7fb7:       41 8a 44 24 1d          mov    0x1d(%r12),%al
ffffffff814e7fbc:       88 41 1d                mov    %al,0x1d(%rcx)
ffffffff814e7fbf:       8b 93 b4 00 00 00       mov    0xb4(%rbx),%edx
ffffffff814e7fc5:       48 8b 83 b8 00 00 00    mov    0xb8(%rbx),%rax
ffffffff814e7fcc:       66 83 3c 10 00          cmpw   $0x0,(%rax,%rdx,1)
ffffffff814e7fd1:       75 6e                   jne    ffffffff814e8041 <tcp_fragment+0x191>
ffffffff814e7fd3:       8a 43 7c                mov    0x7c(%rbx),%al
ffffffff814e7fd6:       83 e0 0c                and    $0xc,%eax
ffffffff814e7fd9:       3c 0c                   cmp    $0xc,%al
ffffffff814e7fdb:       74 64                   je     ffffffff814e8041 <tcp_fragment+0x191>
ffffffff814e7fdd:       44 89 f6                mov    %r14d,%esi
ffffffff814e7fe0:       48 89 ef                mov    %rbp,%rdi
ffffffff814e7fe3:       e8 da f7 fb ff          callq  ffffffff814a77c2 <skb_put>
ffffffff814e7fe8:       31 c9                   xor    %ecx,%ecx
ffffffff814e7fea:       48 89 c6                mov    %rax,%rsi
ffffffff814e7fed:       44 89 ef                mov    %r13d,%edi
ffffffff814e7ff0:       44 89 f2                mov    %r14d,%edx
ffffffff814e7ff3:       48 03 bb c0 00 00 00    add    0xc0(%rbx),%rdi
ffffffff814e7ffa:       e8 91 4f 05 00          callq  ffffffff8153cf90 <csum_partial_copy_nocheck>
ffffffff814e7fff:       44 89 ee                mov    %r13d,%esi
ffffffff814e8002:       89 45 74                mov    %eax,0x74(%rbp)
ffffffff814e8005:       48 89 df                mov    %rbx,%rdi
ffffffff814e8008:       e8 09 de fb ff          callq  ffffffff814a5e16 <skb_trim>
ffffffff814e800d:       8b 45 74                mov    0x74(%rbp),%eax
ffffffff814e8010:       8b 4b 74                mov    0x74(%rbx),%ecx
ffffffff814e8013:       41 80 e5 01             and    $0x1,%r13b
ffffffff814e8017:       74 15                   je     ffffffff814e802e <tcp_fragment+0x17e>
ffffffff814e8019:       89 c2                   mov    %eax,%edx
ffffffff814e801b:       c1 e8 08                shr    $0x8,%eax
ffffffff814e801e:       81 e2 ff 00 ff 00       and    $0xff00ff,%edx
ffffffff814e8024:       25 ff 00 ff 00          and    $0xff00ff,%eax
ffffffff814e8029:       c1 e2 08                shl    $0x8,%edx
ffffffff814e802c:       01 d0                   add    %edx,%eax
ffffffff814e802e:       f7 d0                   not    %eax
ffffffff814e8030:       89 c2                   mov    %eax,%edx
ffffffff814e8032:       01 ca                   add    %ecx,%edx
ffffffff814e8034:       0f 92 c0                setb   %al
ffffffff814e8037:       0f b6 c0                movzbl %al,%eax
ffffffff814e803a:       01 d0                   add    %edx,%eax
ffffffff814e803c:       89 43 74                mov    %eax,0x74(%rbx)
ffffffff814e803f:       eb 12                   jmp    ffffffff814e8053 <tcp_fragment+0x1a3>
ffffffff814e8041:       80 4b 7c 0c             orb    $0xc,0x7c(%rbx)
ffffffff814e8045:       44 89 ea                mov    %r13d,%edx
ffffffff814e8048:       48 89 ee                mov    %rbp,%rsi
ffffffff814e804b:       48 89 df                mov    %rbx,%rdi
ffffffff814e804e:       e8 f8 f7 fb ff          callq  ffffffff814a784b <skb_split>
ffffffff814e8053:       8a 53 7c                mov    0x7c(%rbx),%dl
ffffffff814e8056:       8a 45 7c                mov    0x7c(%rbp),%al
ffffffff814e8059:       83 e2 0c                and    $0xc,%edx
ffffffff814e805c:       83 e0 f3                and    $0xfffffffffffffff3,%eax
ffffffff814e805f:       48 89 de                mov    %rbx,%rsi
ffffffff814e8062:       09 d0                   or     %edx,%eax
ffffffff814e8064:       4c 89 ff                mov    %r15,%rdi
ffffffff814e8067:       88 45 7c                mov    %al,0x7c(%rbp)
ffffffff814e806a:       41 8b 44 24 18          mov    0x18(%r12),%eax
ffffffff814e806f:       48 8b 54 24 10          mov    0x10(%rsp),%rdx
ffffffff814e8074:       89 42 18                mov    %eax,0x18(%rdx)
ffffffff814e8077:       48 8b 43 10             mov    0x10(%rbx),%rax
ffffffff814e807b:       8b 93 b4 00 00 00       mov    0xb4(%rbx),%edx
ffffffff814e8081:       48 89 45 10             mov    %rax,0x10(%rbp)
ffffffff814e8085:       48 8b 83 b8 00 00 00    mov    0xb8(%rbx),%rax
ffffffff814e808c:       44 8b 64 10 04          mov    0x4(%rax,%rdx,1),%r12d
ffffffff814e8091:       8b 54 24 0c             mov    0xc(%rsp),%edx
ffffffff814e8095:       e8 3d dd ff ff          callq  ffffffff814e5dd7 <tcp_set_skb_tso_segs>
ffffffff814e809a:       8b 54 24 0c             mov    0xc(%rsp),%edx
ffffffff814e809e:       48 89 ee                mov    %rbp,%rsi
ffffffff814e80a1:       4c 89 ff                mov    %r15,%rdi
ffffffff814e80a4:       e8 2e dd ff ff          callq  ffffffff814e5dd7 <tcp_set_skb_tso_segs>
ffffffff814e80a9:       48 8b 4c 24 10          mov    0x10(%rsp),%rcx
ffffffff814e80ae:       8b 49 14                mov    0x14(%rcx),%ecx
ffffffff814e80b1:       41 39 8f 1c 04 00 00    cmp    %ecx,0x41c(%r15)
ffffffff814e80b8:       78 39                   js     ffffffff814e80f3 <tcp_fragment+0x243>
ffffffff814e80ba:       8b 8b b4 00 00 00       mov    0xb4(%rbx),%ecx
ffffffff814e80c0:       41 0f b7 d4             movzwl %r12w,%edx
ffffffff814e80c4:       48 8b 83 b8 00 00 00    mov    0xb8(%rbx),%rax
ffffffff814e80cb:       0f b7 44 08 04          movzwl 0x4(%rax,%rcx,1),%eax
ffffffff814e80d0:       8b 8d b4 00 00 00       mov    0xb4(%rbp),%ecx
ffffffff814e80d6:       29 c2                   sub    %eax,%edx
ffffffff814e80d8:       48 8b 85 b8 00 00 00    mov    0xb8(%rbp),%rax
ffffffff814e80df:       0f b7 44 08 04          movzwl 0x4(%rax,%rcx,1),%eax
ffffffff814e80e4:       29 c2                   sub    %eax,%edx
ffffffff814e80e6:       74 0b                   je     ffffffff814e80f3 <tcp_fragment+0x243>
ffffffff814e80e8:       48 89 de                mov    %rbx,%rsi
ffffffff814e80eb:       4c 89 ff                mov    %r15,%rdi
ffffffff814e80ee:       e8 1a f4 ff ff          callq  ffffffff814e750d <tcp_adjust_pcount>
ffffffff814e80f3:       8a 45 7c                mov    0x7c(%rbp),%al
ffffffff814e80f6:       a8 10                   test   $0x10,%al
ffffffff814e80f8:       74 04                   je     ffffffff814e80fe <tcp_fragment+0x24e>
ffffffff814e80fa:       0f 0b                   ud2a
ffffffff814e80fc:       eb fe                   jmp    ffffffff814e80fc <tcp_fragment+0x24c>
ffffffff814e80fe:       83 c8 10                or     $0x10,%eax
ffffffff814e8101:       88 45 7c                mov    %al,0x7c(%rbp)
ffffffff814e8104:       8b 85 b4 00 00 00       mov    0xb4(%rbp),%eax
ffffffff814e810a:       48 03 85 b8 00 00 00    add    0xb8(%rbp),%rax
ffffffff814e8111:       f0 81 40 28 00 00 01    lock addl $0x10000,0x28(%rax)
ffffffff814e8118:       00
ffffffff814e8119:       48 8b 03                mov    (%rbx),%rax
ffffffff814e811c:       48 89 5d 08             mov    %rbx,0x8(%rbp)
ffffffff814e8120:       48 89 45 00             mov    %rax,0x0(%rbp)
ffffffff814e8124:       48 89 68 08             mov    %rbp,0x8(%rax)
ffffffff814e8128:       48 89 2b                mov    %rbp,(%rbx)
ffffffff814e812b:       31 c0                   xor    %eax,%eax
ffffffff814e812d:       41 ff 87 10 01 00 00    incl   0x110(%r15)
ffffffff814e8134:       eb 05                   jmp    ffffffff814e813b <tcp_fragment+0x28b>
ffffffff814e8136:       b8 f4 ff ff ff          mov    $0xfffffff4,%eax
ffffffff814e813b:       48 83 c4 18             add    $0x18,%rsp
ffffffff814e813f:       5b                      pop    %rbx
ffffffff814e8140:       5d                      pop    %rbp
ffffffff814e8141:       41 5c                   pop    %r12
ffffffff814e8143:       41 5d                   pop    %r13
ffffffff814e8145:       41 5e                   pop    %r14
ffffffff814e8147:       41 5f                   pop    %r15
ffffffff814e8149:       c3                      retq

^ permalink raw reply

* Re: AAARGH bisection is hard (Re: [2.6.39 regression] X locks up hard right after logging in)
From: Linus Torvalds @ 2011-05-13 19:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Sixt, Andrew Lutomirski, Christian Couder, linux-kernel,
	netdev, git, Shuang He
In-Reply-To: <7vliya77xl.fsf@alter.siamese.dyndns.org>

On Fri, May 13, 2011 at 11:48 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
> Could you please clarify "off-limits"?
>
> Do you mean "anything before v2.6.38 did not even have this feature, so
> the result of testing a version in that range does not give us any
> information"?

Well, I think it's useful in two cases.

It's useful for the "before this version, the test we're doing doesn't
even make sense and cannot succeed" sense.

That doesn't have to be about hardware support, it could be any
feature. For example, in git, say that you noticed that
--dirstat-by-file stopped working at some point. You know it was good
when you merged it, so you'd do

  git bisect start
  git bisect good ac9666f84a59

but you'd also go "that's also when I introduced the *test* for it, so
I'll need to require that":

  git bisect requires ac9666f84a59

and then you can start it all off:

  git bisect bad
  git bisect run sh -c "make test"

or whatever.

Because you don't want to go into the merges that were based on code
that didn't even _have_ that feature.

Ok, so that's a made-up and contrived example (it would make more
sense for when you add a whole new flag, and your test-script is
testing that new functionality), but it kind of explains the notion:
it will not bother to run bisect on code that simply isn't _relevant_
for the issue you are bisecting.
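
To make the "good" vs "requires" distinction concrete, here is a
minimal sketch in Python. It is not git code: "git bisect requires"
is only the command proposed in this thread, and the toy history and
helper names below are assumptions made up for illustration. B plays
the role of the required commit, and S1/S2 are a side branch that
forked before B and was merged in afterwards.

```python
# Toy commit graph: commit -> list of parents (hypothetical history).
#
#   A --- B --- C --- M --- D      B is the "required" commit
#    \               /             D is the known-bad tip
#     S1 --------- S2              side branch forked before B
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "S1": ["A"],
    "S2": ["S1"],
    "M": ["C", "S2"],
    "D": ["M"],
}


def ancestors(parents, commit):
    """Return the commit itself plus everything reachable from it."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def candidates_good(parents, bad, good):
    """'bisect good X': drop everything reachable from X."""
    return ancestors(parents, bad) - ancestors(parents, good)


def candidates_requires(parents, bad, required):
    """Proposed 'bisect requires X': keep only commits that contain X."""
    return {
        commit
        for commit in ancestors(parents, bad)
        if required in ancestors(parents, commit)
    }


if __name__ == "__main__":
    print(sorted(candidates_good(PARENTS, "D", "B")))
    print(sorted(candidates_requires(PARENTS, "D", "B")))
```

With "good B" the side-branch commits S1 and S2 are still candidates,
because they are not reachable from B. With "requires B" they drop out,
because they do not contain B, and only B, C, M and D remain.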

> Upon seeing "bad" result from a version before v2.6.38, what can we conclude?

The point would be that such versions aren't even _testable_. So the
whole "seeing 'bad'" concept is a NULL concept. It's like the above
"new command line flag to 'git'" example: it's not that those commits
might not have broken something, but those commits are crazy to test.

If it turns out that a merge brought in the breakage, we'd have to do
a totally new kind of test thing. But from a bisect standpoint, it's
already very interesting if the end result is "hey, you merged that
code that didn't even _support_ the feature we're testing, and that
broke it". That gives quite a bit of information, and opens up new
avenues for testing.

For example, at that point, we might decide that "Oh, ok, now I will
need to re-run the bisect for everything that came in in that merge,
but I will do a new merge at that point to see which commit it is that
doesn't play nice with the new feature".

> The breakage cannot possibly come from the feature
> that is being checked, so the procedure to check itself is busted?

Right.

HOWEVER.

There's another reason to say "require version XYZ", and that's
essentially a "I want to do a (quicker) high-level bisect". Especially
the way the kernel merge window is done, it might be that versions
prior to v2.6.38 work perfectly _fine_, but what you want to do is to
quickly bisect down to which subsystem caused breakage.

A good way to do that would be to just say "requires v2.6.38", and
suddenly the actual set of commits that we're going to bisect is going
to be *much* smaller. We're basically throwing away all the individual
commits that were merged in the merge window, and saying something
that approximates to "we are only interested in the merge points".

Why would we do that? Just to get a quicker "this is the problematic
subsystem". So the "requires xyz" might be quite useful for that
reason too.

                 Linus

^ permalink raw reply

* Re: [PATCH v1] bonding,llc: Fix structure sizeof incompatibility for some PDUs
From: David Miller @ 2011-05-13 19:15 UTC (permalink / raw)
  To: vitas
  Cc: netdev, bonding-devel, eric.dumazet, bhutchings, joe, fubar, andy,
	acme
In-Reply-To: <201105131204.29612.vitas@nppfactor.kiev.ua>

From: Vitalii Demianets <vitas@nppfactor.kiev.ua>
Date: Fri, 13 May 2011 12:04:29 +0300

> With some combinations of arch/compiler (e.g. arm-linux-gcc) the sizeof 
> operator on structure returns value greater than expected. In cases when the 
> structure is used for mapping PDU fields it may lead to unexpected results 
> (such as holes and alignment problems in skb data). __packed prevents this 
> undesired behavior.
> 
> Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua>

Applied, thanks.
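
For readers who want to see the size effect described above, here is
a rough sketch using Python's struct module as a stand-in for the C
layout. It is not the patch itself, and the field list is a
hypothetical PDU-like header invented for illustration. It also shows
only interior alignment padding, not the trailing padding seen with
some ARM ABIs.

```python
import struct

# Hypothetical header: three 1-byte fields followed by one 16-bit field.
FIELDS = "BBBH"

# "@" uses the platform's native alignment, so padding may be inserted
# before the 16-bit field. "=" uses standard sizes with no padding,
# which is the layout a packed structure is meant to give.
native_size = struct.calcsize("@" + FIELDS)
packed_size = struct.calcsize("=" + FIELDS)

if __name__ == "__main__":
    print("native:", native_size, "packed:", packed_size)
```

Whenever the two sizes differ, code that maps the structure directly
onto wire data reads fields at the wrong offsets, which is the class
of problem that __packed avoids.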

^ permalink raw reply

* Re: [patch 0/9] [resend v2] s390: network feature patches for net-next
From: David Miller @ 2011-05-13 18:55 UTC (permalink / raw)
  To: frank.blaschka; +Cc: netdev, linux-s390
In-Reply-To: <20110513044500.190198403@de.ibm.com>

From: frank.blaschka@de.ibm.com
Date: Fri, 13 May 2011 06:45:00 +0200

> after one more iteration the hw_feature patch is complete now.
> I did some testing and could not find a problem so far. If
> we find bugs during extensive regression testing I will provide
> a bug fix so please apply this patch set now so it is available
> for the next merge window. Thx!

Series applied, please keep up the dialogue about hw_features with
Michał Mirosław.

Thanks.

^ permalink raw reply

* Re: AAARGH bisection is hard (Re: [2.6.39 regression] X locks up hard right after logging in)
From: Andrew Lutomirski @ 2011-05-13 18:55 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, Johannes Sixt, Christian Couder, linux-kernel,
	netdev, git, Shuang He
In-Reply-To: <7vliya77xl.fsf@alter.siamese.dyndns.org>

On Fri, May 13, 2011 at 2:48 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> When you say that v2.6.38 is good, that means that everything that can
>> be reached from 2.6.38 is good.
>>
>> NOT AT ALL the same thing as "git bisect requires v2.6.38" would be.
>>
>> The "requires v2.6.38" would basically say that anything that doesn't
>> contain v2.6.38 is "off-limits". It's fine to call them "good", but
>> that's not the same thing as "git bisect good v2.6.38".
>>
>> Why?
>>
>> Think about it. It's the "reachable from v2.6.38" vs "cannot reach
>> v2.6.38" difference. That's a HUGE difference.
>
> Could you please clarify "off-limits"?
>
> Do you mean "anything before v2.6.38 did not even have this feature, so
> the result of testing a version in that range does not give us any
> information"?  The feature didn't even exist, so a bug can never trigger,
> and seeing "good" from such a version does not mean everything reachable
> from it is good?  Upon seeing "bad" result from a version before v2.6.38,
> what can we conclude?  The breakage cannot possibly come from the feature
> that is being checked, so the procedure to check itself is busted?
>

In my case, if I'd given bisect a hint that commits that don't include
v2.6.38 are unlikely to work for reasons unrelated to the bug, then
there should still have been enough revisions left for bisect to tell
me "the bug was introduced by the merge of the drm tree but I can't
tell you more without testing off-limits revisions".  That would have
avoided testing three or four revisions that just failed to boot.

In my particular case I think it would also have avoided an
unnecessary set of tests to figure out why the networking merge broke
my system when the networking merge did not, in fact, break my system.
This is a coincidence -- all of the commits that didn't have the change
that fixed the bug the first time around also didn't contain v2.6.38,
so I never would have tested them.

This is maybe some further justification for a bisect mode that
follows the --first-parent path as long as possible -- it might take
one or two more kernel builds, but it avoids odd trips around the
history that can hit random crap like that.  (Of course, it could lead
to different random crap, but what can you do?)

--Andy


^ permalink raw reply

* Re: AAARGH bisection is hard (Re: [2.6.39 regression] X locks up hard right after logging in)
From: Junio C Hamano @ 2011-05-13 18:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Sixt, Andrew Lutomirski, Christian Couder, linux-kernel,
	netdev, git, Shuang He
In-Reply-To: <BANLkTi=smoaARKyzWxFjid-E7qehmyAX8w@mail.gmail.com>

Linus Torvalds <torvalds@linux-foundation.org> writes:

> When you say that v2.6.38 is good, that means that everything that can
> be reached from 2.6.38 is good.
>
> NOT AT ALL the same thing as "git bisect requires v2.6.38" would be.
>
> The "requires v2.6.38" would basically say that anything that doesn't
> contain v2.6.38 is "off-limits". It's fine to call them "good", but
> that's not the same thing as "git bisect good v2.6.38".
>
> Why?
>
> Think about it. It's the "reachable from v2.6.38" vs "cannot reach
> v2.6.38" difference. That's a HUGE difference.

Could you please clarify "off-limits"?

Do you mean "anything before v2.6.38 did not even have this feature, so
the result of testing a version in that range does not give us any
information"?  The feature didn't even exist, so a bug can never trigger,
and seeing "good" from such a version does not mean everything reachable
from it is good?  Upon seeing "bad" result from a version before v2.6.38,
what can we conclude?  The breakage cannot possibly come from the feature
that is being checked, so the procedure to check itself is busted?

^ permalink raw reply

* Re: AAARGH bisection is hard (Re: [2.6.39 regression] X locks up hard right after logging in)
From: Johannes Sixt @ 2011-05-13 18:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Lutomirski, Christian Couder, linux-kernel, netdev, git,
	Shuang He
In-Reply-To: <BANLkTi=smoaARKyzWxFjid-E7qehmyAX8w@mail.gmail.com>

Am 13.05.2011 20:41, schrieb Linus Torvalds:
> On Fri, May 13, 2011 at 11:34 AM, Johannes Sixt <j6t@kdbg.org> wrote:
>>   git bisect good v2.6.38
> 
> When you say that v2.6.38 is good, that means that everything that can
> be reached from 2.6.38 is good.
> 
> NOT AT ALL the same thing as "git bisect requires v2.6.38" would be.
> 
> Think about it. It's the "reachable from v2.6.38" vs "cannot reach
> v2.6.38" difference. That's a HUGE difference.

Oops, you're right, I got it upside-down.

Thanks,
-- Hannes

^ permalink raw reply

* Re: [PATCHv3 net-next-2.6 3/3] qlcnic: Bumped up version number to 5.0.18
From: David Miller @ 2011-05-13 18:44 UTC (permalink / raw)
  To: anirban.chakraborty; +Cc: netdev, bhutchings
In-Reply-To: <1305240515-29237-5-git-send-email-anirban.chakraborty@qlogic.com>

From: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Date: Thu, 12 May 2011 15:48:35 -0700

> Update driver version number
> 
> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>

Applied.

^ permalink raw reply

