Netdev List


* Re: [PATCH iproute2 2/5] ip fou: Support to configure foo-over-udp RX
From: Stephen Hemminger @ 2014-11-02 19:45 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <20141102113610.785543ff@urahara>

On Sun, 2 Nov 2014 11:36:10 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> On Fri,  3 Oct 2014 08:55:15 -0700
> Tom Herbert <therbert@google.com> wrote:
> 
> > Added 'ip fou...' commands to enable/disable UDP ports for doing
> > foo-over-udp and Generic UDP Encapsulation variant. Arguments are port
> > number to bind to and IP protocol to map to port (for direct FOU).
> > 
> > Examples:
> > 
> > ip fou add port 7777 gue
> > ip fou add port 8888 ipproto 4
> > 
> > The first command creates a GUE port, the second creates a direct FOU
> > port for IPIP (receive payload is assumed to be an IPv4 packet).
> > 
> > Signed-off-by: Tom Herbert <therbert@google.com>
> 
> Accepted.
> Also discovered that fou.h was missing from kernel Kbuild uapi.

I backed out the change since the rest of the patch series has
issues. Please fix and resubmit

^ permalink raw reply

* Re: [PATCH iproute2 0/5] iproute: Add FOU and GUE configuration in ip
From: Stephen Hemminger @ 2014-11-02 19:46 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <1412351718-22921-1-git-send-email-therbert@google.com>

On Fri,  3 Oct 2014 08:55:13 -0700
Tom Herbert <therbert@google.com> wrote:

> This patch set adds support in iproute2 to configure FOU and GUE ports
> for receive, and using FOU or GUE with ip tunnels (IPIP, GRE, sit) on
> transmit.
> 
> A new ip subcommand "fou" has been added to configure FOU/GUE ports.
> For example:
> 
>   ip fou add port 5555 gue 
>   ip fou add port 9999 ipproto 4
> 
> The first command creates a GUE port, the second creates a direct FOU
> port for IPIP (receive payload is assumed to be an IP packet).
> 
> To configure an IP tunnel to use FOU or GUE, encap parameters have
> been added. For example:
> 
>   ip link add name tun1 type ipip remote 192.168.1.1 local 192.168.1.2 \
>      ttl 225 encap gue encap-sport auto encap-dport 7777 encap-csum
>   ip link add name tun2 type gre remote 192.168.1.1 local 192.168.1.2 \
>      ttl 225 encap fou encap-sport auto encap-dport 8888 encap-csum
> 
> The first command configures an IPIP tunnel to use GUE on transmit. The
> peer might be configured to receive GUE packets with the
> "ip fou add port 7777 gue" command.
> 
> The second configures a GRE tunnel to use FOU encapsulation. The
> peer might be configured to receive these packets with the
> "ip fou add port 8888 ipproto 47" command.
> 
> Tom Herbert (5):
>   iplink: Fix setting of -1 as ifindex
>   ip fou: Support to configure foo-over-udp RX
>   ip tunnel: Kernel uapi definitions for fou and gue
>   ip link ipip: Add support to configure FOU and GUE
>   ip link gre: Add support to configure FOU and GUE
> 
>  include/linux/fou.h       |  41 ++++++++++++
>  include/linux/if_tunnel.h |  17 +++++
>  ip/Makefile               |   3 +-
>  ip/ip.c                   |   3 +-
>  ip/ip_common.h            |   1 +
>  ip/ipfou.c                | 158 ++++++++++++++++++++++++++++++++++++++++++++++
>  ip/iplink.c               |   2 +-
>  ip/link_gre.c             |  89 ++++++++++++++++++++++++++
>  ip/link_iptnl.c           |  89 ++++++++++++++++++++++++++
>  9 files changed, 400 insertions(+), 3 deletions(-)
>  create mode 100644 include/linux/fou.h
>  create mode 100644 ip/ipfou.c

Please resubmit this patch series.
 1. It no longer applies cleanly
 2. Address the comments about port number and -1 ifindex patch
 3. Add man pages

^ permalink raw reply

* Re: [PATCH] iproute2: ip6_tunnel mode bugfixes: any,vti6
From: Stephen Hemminger @ 2014-11-02 19:49 UTC (permalink / raw)
  To: Alexey Andriyanov; +Cc: netdev
In-Reply-To: <1414563570-9858-1-git-send-email-alan@al-an.info>

On Wed, 29 Oct 2014 09:19:30 +0300
"Alexey Andriyanov" <alan@al-an.info> wrote:

> - any ipv6 tunnel mode (proto == 0) could not be set
> due to incomplete set of cases in do_add, do_del.
> - vti6 logic was inverted: it was using "ip6_vti0" basedev
> UNLESS mode is set to vti6.
> 
> We don't need a switch by p.proto in do_add()/do_del(): it
> already exists in parse_args(). So if parse_args() call
> was successful, no need to check tunnel mode again.
> 
> CC: Stephen Hemminger <shemming@brocade.com>
> Signed-off-by: Alexey Andriyanov <alan@al-an.info>

Accepted

^ permalink raw reply

* [0/2] tun: Fix csum_start and TUN_PKT_STRIP
From: Herbert Xu @ 2014-11-02 20:29 UTC (permalink / raw)
  To: David S. Miller, netdev

Hi:

The first patch fixes a serious problem that breaks checksum offload
in VMs while the second patch fixes a problem that probably affects
no one.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [PATCH 1/2] tun: Fix csum_start with VLAN acceleration
From: Herbert Xu @ 2014-11-02 20:30 UTC (permalink / raw)
  To: David S. Miller, netdev
In-Reply-To: <20141102202929.GA24935@gondor.apana.org.au>

When VLAN acceleration is in use on the xmit path, we end up
setting csum_start to the wrong place.  The result is that
whoever ends up doing the checksum setting will corrupt the packet
instead of writing the checksum to the expected location.  Usually
this means writing the checksum with an offset of -4.

This patch fixes this by adjusting csum_start when VLAN acceleration
is detected.

Fixes: 6680ec68eff4 ("tuntap: hardware vlan tx support")
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---

 drivers/net/tun.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7302398..57e6bf7 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1235,6 +1235,10 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 	struct tun_pi pi = { 0, skb->protocol };
 	ssize_t total = 0;
 	int vlan_offset = 0, copied;
+	int vlan_hlen = 0;
+
+	if (vlan_tx_tag_present(skb))
+		vlan_hlen = VLAN_HLEN;
 
 	if (!(tun->flags & TUN_NO_PI)) {
 		if ((len -= sizeof(pi)) < 0)
@@ -1284,7 +1288,8 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 
 		if (skb->ip_summed == CHECKSUM_PARTIAL) {
 			gso.flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
-			gso.csum_start = skb_checksum_start_offset(skb);
+			gso.csum_start = skb_checksum_start_offset(skb) +
+					 vlan_hlen;
 			gso.csum_offset = skb->csum_offset;
 		} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
 			gso.flags = VIRTIO_NET_HDR_F_DATA_VALID;
@@ -1297,10 +1302,9 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 	}
 
 	copied = total;
-	total += skb->len;
-	if (!vlan_tx_tag_present(skb)) {
-		len = min_t(int, skb->len, len);
-	} else {
+	len = min_t(int, skb->len + vlan_hlen, len);
+	total += skb->len + vlan_hlen;
+	if (vlan_hlen) {
 		int copy, ret;
 		struct {
 			__be16 h_vlan_proto;
@@ -1311,8 +1315,6 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 		veth.h_vlan_TCI = htons(vlan_tx_tag_get(skb));
 
 		vlan_offset = offsetof(struct vlan_ethhdr, h_vlan_proto);
-		len = min_t(int, skb->len + VLAN_HLEN, len);
-		total += VLAN_HLEN;
 
 		copy = min_t(int, vlan_offset, len);
 		ret = skb_copy_datagram_const_iovec(skb, 0, iv, copied, copy);

^ permalink raw reply related
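
The offset problem described in the patch above can be modelled in user
space. This is only a minimal sketch, assuming the 4-byte VLAN header is
re-inserted in front of the checksummed region when the frame is copied
out; reported_csum_start() and its parameters are hypothetical names,
not code from tun.c.

```c
#include <stdbool.h>

#define VLAN_HLEN 4 /* size of the 802.1Q header re-inserted into the frame */

/*
 * Hypothetical helper: the csum_start value that should be reported to
 * the reader of the tun device.
 *
 * skb_csum_start is the offset computed on the untagged packet.  With
 * VLAN acceleration the tag is kept out of band and only written into
 * the frame on copy-out, so everything behind the MAC addresses moves
 * back by VLAN_HLEN bytes and the reported offset must move with it.
 */
static int reported_csum_start(int skb_csum_start, bool vlan_tag_present)
{
	int vlan_hlen = vlan_tag_present ? VLAN_HLEN : 0;

	return skb_csum_start + vlan_hlen;
}
```

Without the adjustment the second case would report the untagged offset,
which is the "offset of -4" mentioned in the commit message.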

* [PATCH 2/2] tun: Fix TUN_PKT_STRIP setting
From: Herbert Xu @ 2014-11-02 20:30 UTC (permalink / raw)
  To: David S. Miller, netdev
In-Reply-To: <20141102202929.GA24935@gondor.apana.org.au>

We set the flag TUN_PKT_STRIP if the user buffer provided is too
small to contain the entire packet plus meta-data.  However, this
has been broken ever since we added GSO meta-data.  VLAN acceleration
also has the same problem.

This patch fixes this by taking both into account when setting the
TUN_PKT_STRIP flag.

The fact that this has been broken for six years without anyone
realising means that nobody actually uses this flag.

Fixes: f43798c27684 ("tun: Allow GSO using virtio_net_hdr")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---

 drivers/net/tun.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 57e6bf7..9dd3746 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1236,15 +1236,19 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 	ssize_t total = 0;
 	int vlan_offset = 0, copied;
 	int vlan_hlen = 0;
+	int vnet_hdr_sz = 0;
 
 	if (vlan_tx_tag_present(skb))
 		vlan_hlen = VLAN_HLEN;
 
+	if (tun->flags & TUN_VNET_HDR)
+		vnet_hdr_sz = tun->vnet_hdr_sz;
+
 	if (!(tun->flags & TUN_NO_PI)) {
 		if ((len -= sizeof(pi)) < 0)
 			return -EINVAL;
 
-		if (len < skb->len) {
+		if (len < skb->len + vlan_hlen + vnet_hdr_sz) {
 			/* Packet will be striped */
 			pi.flags |= TUN_PKT_STRIP;
 		}
@@ -1254,9 +1258,9 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 		total += sizeof(pi);
 	}
 
-	if (tun->flags & TUN_VNET_HDR) {
+	if (vnet_hdr_sz) {
 		struct virtio_net_hdr gso = { 0 }; /* no info leak */
-		if ((len -= tun->vnet_hdr_sz) < 0)
+		if ((len -= vnet_hdr_sz) < 0)
 			return -EINVAL;
 
 		if (skb_is_gso(skb)) {
@@ -1298,7 +1302,7 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 		if (unlikely(memcpy_toiovecend(iv, (void *)&gso, total,
 					       sizeof(gso))))
 			return -EFAULT;
-		total += tun->vnet_hdr_sz;
+		total += vnet_hdr_sz;
 	}
 
 	copied = total;

^ permalink raw reply related
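
The corrected TUN_PKT_STRIP condition can be written as a small
predicate. This is a sketch only: tun_pkt_would_strip() is a
hypothetical name, and the lengths are passed in as plain numbers rather
than read from an skb or a tun_struct.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical predicate mirroring the check in the patch above.
 *
 * len_after_pi: user buffer space left once struct tun_pi is accounted for
 * skb_len:      length of the packet data
 * vlan_hlen:    4 if a VLAN header is re-inserted on copy-out, else 0
 * vnet_hdr_sz:  size of the virtio_net_hdr if TUN_VNET_HDR is set, else 0
 *
 * The packet is truncated whenever the buffer cannot hold the data plus
 * both pieces of meta-data, not merely the data itself.
 */
static bool tun_pkt_would_strip(size_t len_after_pi, size_t skb_len,
				size_t vlan_hlen, size_t vnet_hdr_sz)
{
	return len_after_pi < skb_len + vlan_hlen + vnet_hdr_sz;
}
```

For a 1500-byte packet with a VLAN header and a 10-byte virtio header, a
1510-byte buffer is too small even though it exceeds skb->len, which is
the case the old check missed.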

* Re: [PATCH v4 1/1] ip-link: add switch to show human readable output
From: Stephen Hemminger @ 2014-11-02 20:51 UTC (permalink / raw)
  To: Christian Hesse; +Cc: netdev
In-Reply-To: <1414791193-25192-1-git-send-email-mail@eworm.de>

On Fri, 31 Oct 2014 22:33:13 +0100
Christian Hesse <mail@eworm.de> wrote:

> Byte and packet count can increase to really big numbers. This adds a
> switch to show human readable output.
> 
> 4: wl: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DORMANT group default qlen 1000
>     link/ether 00:de:ad:be:ee:ef brd ff:ff:ff:ff:ff:ff
>     RX: bytes  packets  errors  dropped overrun mcast
>     1523846973 3969051  0       0       0       0
>     TX: bytes  packets  errors  dropped carrier collsns
>     8710088361 6077735  0       0       0       0
> 4: wl: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DORMANT group default qlen 1000
>     link/ether 00:de:ad:be:ee:ef brd ff:ff:ff:ff:ff:ff
>     RX: bytes  packets  errors  dropped overrun mcast
>     1.5G       3.9M     0       0       0       0
>     TX: bytes  packets  errors  dropped carrier collsns
>     8.7G       6.0M     0       0       0       0

Applied, then I did a code cleanup and added -iec as an option (similar to tc).

^ permalink raw reply
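
The sample output quoted above is consistent with scaling by powers of
1000 and truncating to one decimal digit (3969051 is shown as 3.9M, not
4.0M). A minimal sketch under that assumption follows; format_human()
is a hypothetical helper written for illustration and is not the code
that was applied to iproute2.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a counter with a K/M/G/... suffix using
 * powers of 1000 and one truncated decimal digit.  Values below 1000
 * are printed unchanged.
 */
static void format_human(uint64_t count, char *buf, size_t len)
{
	static const char units[] = { 'K', 'M', 'G', 'T', 'P', 'E' };
	uint64_t div = 1;
	int i = -1;

	/* Find the largest power of 1000 that does not exceed count. */
	while (i < 5 && count / div >= 1000) {
		div *= 1000;
		i++;
	}

	if (i < 0) {
		snprintf(buf, len, "%llu", (unsigned long long)count);
		return;
	}

	/* Integer part, then the first decimal digit without rounding. */
	snprintf(buf, len, "%llu.%llu%c",
		 (unsigned long long)(count / div),
		 (unsigned long long)(count % div / (div / 10)),
		 units[i]);
}
```

Fed the four byte and packet counters from the example, this yields the
same strings as the human readable listing.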

* Re: [PATCH iproute2 2/5] ip fou: Support to configure foo-over-udp RX
From: Stephen Hemminger @ 2014-11-02 20:53 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <20141102113610.785543ff@urahara>

On Sun, 2 Nov 2014 11:36:10 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> On Fri,  3 Oct 2014 08:55:15 -0700
> Tom Herbert <therbert@google.com> wrote:
> 
> > Added 'ip fou...' commands to enable/disable UDP ports for doing
> > foo-over-udp and Generic UDP Encapsulation variant. Arguments are port
> > number to bind to and IP protocol to map to port (for direct FOU).
> > 
> > Examples:
> > 
> > ip fou add port 7777 gue
> > ip fou add port 8888 ipproto 4
> > 
> > The first command creates a GUE port, the second creates a direct FOU
> > port for IPIP (receive payload is assumed to be an IPv4 packet).
> > 
> > Signed-off-by: Tom Herbert <therbert@google.com>
> 
> Accepted.
> Also discovered that fou.h was missing from kernel Kbuild uapi.

I backed out the change since the rest of the patch series has
issues. Please fix and resubmit

^ permalink raw reply

* Re: [PATCH] bridge: fix netfilter/NF_BR_LOCAL_OUT for own, locally generated queries
From: Herbert Xu @ 2014-11-02 22:01 UTC (permalink / raw)
  To: Linus Lüssing
  Cc: Stephen Hemminger, netdev, bridge, David S. Miller, linux-kernel
In-Reply-To: <1411342364-4791-1-git-send-email-linus.luessing@web.de>

On Mon, Sep 22, 2014 at 01:32:44AM +0200, Linus Lüssing wrote:
> Ebtables on the OUTPUT chain (NF_BR_LOCAL_OUT) would not work as expected
> for both locally generated IGMP and MLD queries. The IP header specific
> filter options are off by 14 Bytes for netfilter (actual output on
> interfaces is fine).
> 
> NF_HOOK() expects the skb->data to point to the IP header, not the
> ethernet one (while dev_queue_xmit() does not). Luckily there is an
> br_dev_queue_push_xmit() helper function already - let's just use that.
> 
> Introduced by eb1d16414339a6e113d89e2cca2556005d7ce919
> ("bridge: Add core IGMP snooping support")
> 
> Ebtables example:
> 
> $ ebtables -I OUTPUT -p IPv6 -o eth1 --logical-out br0 \
> 	--log --log-level 6 --log-ip6 --log-prefix="~EBT: " -j DROP
> 
> before (broken):
> 
> ~EBT:  IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
> 	MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
> 	SRC=64a4:39c2:86dd:6000:0000:0020:0001:fe80 IPv6 \
> 	DST=0000:0000:0000:0004:64ff:fea4:39c2:ff02, \
> 	IPv6 priority=0x3, Next Header=2
> 
> after (working):
> 
> ~EBT:  IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
> 	MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
> 	SRC=fe80:0000:0000:0000:0004:64ff:fea4:39c2 IPv6 \
> 	DST=ff02:0000:0000:0000:0000:0000:0000:0001, \
> 	IPv6 priority=0x0, Next Header=0
> 
> Signed-off-by: Linus Lüssing <linus.luessing@web.de>

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply
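
The 14-byte shift in the ebtables log above can be reproduced in user
space. The sketch below rebuilds the start of the frame from the logged
MAC addresses and IPv6 fields, then reads the IPv6 source address once
from the start of the Ethernet header and once from 14 bytes further in.
The frame contents are reconstructed from the log, and ipv6_src_at() is
a hypothetical helper, not a netfilter function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ETH_HLEN 14 /* two MAC addresses plus the 2-byte ethertype */

/* Frame reconstructed from the "after (working)" log line. */
static const uint8_t frame[54] = {
	0x33, 0x33, 0x00, 0x00, 0x00, 0x01,		/* MAC dest */
	0x02, 0x04, 0x64, 0xa4, 0x39, 0xc2,		/* MAC source */
	0x86, 0xdd,					/* proto = IPv6 */
	0x60, 0x00, 0x00, 0x00,				/* version/class/flow */
	0x00, 0x20, 0x00, 0x01,				/* len, next hdr, hop limit */
	0xfe, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,	/* IPv6 source */
	0x00, 0x04, 0x64, 0xff, 0xfe, 0xa4, 0x39, 0xc2,
	0xff, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,	/* IPv6 destination */
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01,
};

/*
 * Hypothetical helper: treat hdr as the start of an IPv6 header and
 * print the source address (bytes 8..23) as eight 16-bit groups.
 * buf must hold at least 40 bytes.
 */
static void ipv6_src_at(const uint8_t *hdr, char *buf, size_t len)
{
	const uint8_t *a = hdr + 8;
	size_t pos = 0;
	int i;

	if (len < 40) {
		if (len)
			buf[0] = '\0';
		return;
	}

	for (i = 0; i < 16; i += 2)
		pos += (size_t)snprintf(buf + pos, len - pos, "%s%02x%02x",
					i ? ":" : "", a[i], a[i + 1]);
}
```

Calling ipv6_src_at(frame, ...) gives the address from the broken log
line, while ipv6_src_at(frame + ETH_HLEN, ...) gives the link-local
address from the working one.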

* Re: [PATCH net-next v2 2/3] r8152: clear the flag of SCHEDULE_TASKLET in tasklet
From: Francois Romieu @ 2014-11-02 22:53 UTC (permalink / raw)
  To: Hayes Wang
  Cc: David Miller, netdev@vger.kernel.org, nic_swsd,
	linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org
In-Reply-To: <0835B3720019904CB8F7AA43166CEEB2ECD8A7@RTITMBSV03.realtek.com.tw>

Hayes Wang <hayeswang@realtek.com> :
>  David Miller [davem@davemloft.net]
[...]
> > If another thread of control sets the bit between the test and the
> > clear, you will lose an event.
> 
> It is fine. The flag is used to schedule a tasklet, so once the tasklet
> has started running, all other pending requests to schedule it can
> be cleared.

test_and_clear_bit (dense) or clear_bit would be more idiomatic.

-- 
Ueimor

^ permalink raw reply
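
For readers unfamiliar with the idiom, the point of test_and_clear_bit
is that the test and the clear happen as one atomic step. The sketch
below is a user-space analogy built on C11 atomics, not the kernel
bitops; test_and_clear_flag() is a hypothetical name.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical user-space analogue of test_and_clear_bit(): clear the
 * given bit and report whether it was set beforehand.  atomic_fetch_and
 * returns the previous value, so no other thread can slip in between
 * reading the bit and clearing it.
 */
static bool test_and_clear_flag(atomic_ulong *flags, unsigned int bit)
{
	unsigned long mask = 1UL << bit;
	unsigned long old = atomic_fetch_and(flags, ~mask);

	return (old & mask) != 0;
}
```

A caller would run the deferred work only when this returns true. Any
bit set after the call stays set for the next run.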

* fs: Use non-const iov in aio_read/aio_write
From: Herbert Xu @ 2014-11-02 23:05 UTC (permalink / raw)
  To: David S. Miller, netdev, Linux Kernel Mailing List; +Cc: Benjamin LaHaise

Currently the functions aio_read/aio_write use a const iov as
input.  This is unnecessary as all their callers supply a
stack-based or kmalloced iov which is never reused.  Conceptually
this is fine because iovs supplied to aio_read/aio_write ultimately
come from user-space so we always have to make a copy of them for
the kernel.

This is also a joke because for as long (since 2.1.15) as we've
had the const iov, the network stack (currently through do_sock_read
and do_sock_write) has been casting the const away.  IOW if anybody
did supply a const iov they would crash and burn if they ever
entered the network stack.

The network stack needs a non-const iov because it iterates through
the iov as it reads/writes data.

So we have two alternatives, either change the network stack to
not touch the iovs or make the iovs non-const.

As there is no reason for the iovs to be const in the first place,
I have taken the second choice and changed all aio_read/aio_write
functions to use non-const iovs.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index b30753c..dfefc79 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -434,8 +434,8 @@ prototypes:
 	loff_t (*llseek) (struct file *, loff_t, int);
 	ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
 	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
-	ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
-	ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
+	ssize_t (*aio_read) (struct kiocb *, struct iovec *, unsigned long, loff_t);
+	ssize_t (*aio_write) (struct kiocb *, struct iovec *, unsigned long, loff_t);
 	ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
 	ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
 	int (*iterate) (struct file *, struct dir_context *);
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 20bf204..a2ba142 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -811,8 +811,8 @@ struct file_operations {
 	loff_t (*llseek) (struct file *, loff_t, int);
 	ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
 	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
-	ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
-	ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
+	ssize_t (*aio_read) (struct kiocb *, struct iovec *, unsigned long, loff_t);
+	ssize_t (*aio_write) (struct kiocb *, struct iovec *, unsigned long, loff_t);
 	ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
 	ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
 	int (*iterate) (struct file *, struct dir_context *);
diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c
index c952b98..c7490bd 100644
--- a/arch/s390/hypfs/inode.c
+++ b/arch/s390/hypfs/inode.c
@@ -144,7 +144,7 @@ static int hypfs_open(struct inode *inode, struct file *filp)
 	return nonseekable_open(inode, filp);
 }
 
-static ssize_t hypfs_aio_read(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t hypfs_aio_read(struct kiocb *iocb, struct iovec *iov,
 			      unsigned long nr_segs, loff_t offset)
 {
 	char *data;
@@ -167,7 +167,7 @@ static ssize_t hypfs_aio_read(struct kiocb *iocb, const struct iovec *iov,
 
 	return ret;
 }
-static ssize_t hypfs_aio_write(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t hypfs_aio_write(struct kiocb *iocb, struct iovec *iov,
 			      unsigned long nr_segs, loff_t offset)
 {
 	int rc;
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 524b707..d94e5b0 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -598,13 +598,13 @@ static ssize_t write_null(struct file *file, const char __user *buf,
 	return count;
 }
 
-static ssize_t aio_read_null(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t aio_read_null(struct kiocb *iocb, struct iovec *iov,
 			     unsigned long nr_segs, loff_t pos)
 {
 	return 0;
 }
 
-static ssize_t aio_write_null(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t aio_write_null(struct kiocb *iocb, struct iovec *iov,
 			      unsigned long nr_segs, loff_t pos)
 {
 	return iov_length(iov, nr_segs);
diff --git a/drivers/infiniband/hw/ipath/ipath_file_ops.c b/drivers/infiniband/hw/ipath/ipath_file_ops.c
index 6d7f453..8b75de4f 100644
--- a/drivers/infiniband/hw/ipath/ipath_file_ops.c
+++ b/drivers/infiniband/hw/ipath/ipath_file_ops.c
@@ -53,7 +53,7 @@ static int ipath_open(struct inode *, struct file *);
 static int ipath_close(struct inode *, struct file *);
 static ssize_t ipath_write(struct file *, const char __user *, size_t,
 			   loff_t *);
-static ssize_t ipath_writev(struct kiocb *, const struct iovec *,
+static ssize_t ipath_writev(struct kiocb *, struct iovec *,
 			    unsigned long , loff_t);
 static unsigned int ipath_poll(struct file *, struct poll_table_struct *);
 static int ipath_mmap(struct file *, struct vm_area_struct *);
@@ -2414,7 +2414,7 @@ bail:
 	return ret;
 }
 
-static ssize_t ipath_writev(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t ipath_writev(struct kiocb *iocb, struct iovec *iov,
 			    unsigned long dim, loff_t off)
 {
 	struct file *filp = iocb->ki_filp;
diff --git a/drivers/infiniband/hw/qib/qib_file_ops.c b/drivers/infiniband/hw/qib/qib_file_ops.c
index b15e34e..8872924 100644
--- a/drivers/infiniband/hw/qib/qib_file_ops.c
+++ b/drivers/infiniband/hw/qib/qib_file_ops.c
@@ -55,7 +55,7 @@
 static int qib_open(struct inode *, struct file *);
 static int qib_close(struct inode *, struct file *);
 static ssize_t qib_write(struct file *, const char __user *, size_t, loff_t *);
-static ssize_t qib_aio_write(struct kiocb *, const struct iovec *,
+static ssize_t qib_aio_write(struct kiocb *, struct iovec *,
 			     unsigned long, loff_t);
 static unsigned int qib_poll(struct file *, struct poll_table_struct *);
 static int qib_mmapf(struct file *, struct vm_area_struct *);
@@ -2245,7 +2245,7 @@ bail:
 	return ret;
 }
 
-static ssize_t qib_aio_write(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t qib_aio_write(struct kiocb *iocb, struct iovec *iov,
 			     unsigned long dim, loff_t off)
 {
 	struct qib_filedata *fp = iocb->ki_filp->private_data;
diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 6f226de..823522e 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -761,7 +761,7 @@ err:
 	return err;
 }
 
-static ssize_t macvtap_aio_write(struct kiocb *iocb, const struct iovec *iv,
+static ssize_t macvtap_aio_write(struct kiocb *iocb, struct iovec *iv,
 				 unsigned long count, loff_t pos)
 {
 	struct file *file = iocb->ki_filp;
@@ -871,7 +871,7 @@ static ssize_t macvtap_do_read(struct macvtap_queue *q,
 	return ret;
 }
 
-static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
+static ssize_t macvtap_aio_read(struct kiocb *iocb, struct iovec *iv,
 				unsigned long count, loff_t pos)
 {
 	struct file *file = iocb->ki_filp;
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 9dd3746..8d06816 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1206,7 +1206,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	return total_len;
 }
 
-static ssize_t tun_chr_aio_write(struct kiocb *iocb, const struct iovec *iv,
+static ssize_t tun_chr_aio_write(struct kiocb *iocb, struct iovec *iv,
 			      unsigned long count, loff_t pos)
 {
 	struct file *file = iocb->ki_filp;
@@ -1371,7 +1371,7 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
 	return ret;
 }
 
-static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
+static ssize_t tun_chr_aio_read(struct kiocb *iocb, struct iovec *iv,
 			    unsigned long count, loff_t pos)
 {
 	struct file *file = iocb->ki_filp;
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 63314ed..47fec3fd 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -958,7 +958,7 @@ static int ffs_aio_cancel(struct kiocb *kiocb)
 }
 
 static ssize_t ffs_epfile_aio_write(struct kiocb *kiocb,
-				    const struct iovec *iovec,
+				    struct iovec *iovec,
 				    unsigned long nr_segs, loff_t loff)
 {
 	struct ffs_io_data *io_data;
@@ -985,7 +985,7 @@ static ssize_t ffs_epfile_aio_write(struct kiocb *kiocb,
 }
 
 static ssize_t ffs_epfile_aio_read(struct kiocb *kiocb,
-				   const struct iovec *iovec,
+				   struct iovec *iovec,
 				   unsigned long nr_segs, loff_t loff)
 {
 	struct ffs_io_data *io_data;
diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c
index c744e49..211ab83 100644
--- a/drivers/usb/gadget/legacy/inode.c
+++ b/drivers/usb/gadget/legacy/inode.c
@@ -695,7 +695,7 @@ fail:
 }
 
 static ssize_t
-ep_aio_read(struct kiocb *iocb, const struct iovec *iov,
+ep_aio_read(struct kiocb *iocb, struct iovec *iov,
 		unsigned long nr_segs, loff_t o)
 {
 	struct ep_data		*epdata = iocb->ki_filp->private_data;
@@ -712,7 +712,7 @@ ep_aio_read(struct kiocb *iocb, const struct iovec *iov,
 }
 
 static ssize_t
-ep_aio_write(struct kiocb *iocb, const struct iovec *iov,
+ep_aio_write(struct kiocb *iocb, struct iovec *iov,
 		unsigned long nr_segs, loff_t o)
 {
 	struct ep_data		*epdata = iocb->ki_filp->private_data;
diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index afd2b44..ca3db8d 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -33,13 +33,13 @@ static ssize_t bad_file_write(struct file *filp, const char __user *buf,
         return -EIO;
 }
 
-static ssize_t bad_file_aio_read(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t bad_file_aio_read(struct kiocb *iocb, struct iovec *iov,
 			unsigned long nr_segs, loff_t pos)
 {
 	return -EIO;
 }
 
-static ssize_t bad_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t bad_file_aio_write(struct kiocb *iocb, struct iovec *iov,
 			unsigned long nr_segs, loff_t pos)
 {
 	return -EIO;
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index ca88731..88ce708 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1277,7 +1277,7 @@ static ssize_t fuse_dev_do_read(struct fuse_conn *fc, struct file *file,
 	return err;
 }
 
-static ssize_t fuse_dev_read(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t fuse_dev_read(struct kiocb *iocb, struct iovec *iov,
 			      unsigned long nr_segs, loff_t pos)
 {
 	struct fuse_copy_state cs;
@@ -1881,7 +1881,7 @@ static ssize_t fuse_dev_do_write(struct fuse_conn *fc,
 	return err;
 }
 
-static ssize_t fuse_dev_write(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t fuse_dev_write(struct kiocb *iocb, struct iovec *iov,
 			      unsigned long nr_segs, loff_t pos)
 {
 	struct fuse_copy_state cs;
diff --git a/fs/ntfs/file.c b/fs/ntfs/file.c
index 643faa4..2617860 100644
--- a/fs/ntfs/file.c
+++ b/fs/ntfs/file.c
@@ -2114,7 +2114,7 @@ out:
 /**
  * ntfs_file_aio_write -
  */
-static ssize_t ntfs_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t ntfs_file_aio_write(struct kiocb *iocb, struct iovec *iov,
 		unsigned long nr_segs, loff_t pos)
 {
 	struct file *file = iocb->ki_filp;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4e41a4a..2585428 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1485,8 +1485,8 @@ struct file_operations {
 	loff_t (*llseek) (struct file *, loff_t, int);
 	ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
 	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
-	ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
-	ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
+	ssize_t (*aio_read) (struct kiocb *, struct iovec *, unsigned long, loff_t);
+	ssize_t (*aio_write) (struct kiocb *, struct iovec *, unsigned long, loff_t);
 	ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
 	ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
 	int (*iterate) (struct file *, struct dir_context *);
diff --git a/net/socket.c b/net/socket.c
index fe20c31..3c6fbab 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -114,9 +114,9 @@ unsigned int sysctl_net_busy_poll __read_mostly;
 #endif
 
 static int sock_no_open(struct inode *irrelevant, struct file *dontcare);
-static ssize_t sock_aio_read(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t sock_aio_read(struct kiocb *iocb, struct iovec *iov,
 			 unsigned long nr_segs, loff_t pos);
-static ssize_t sock_aio_write(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t sock_aio_write(struct kiocb *iocb, struct iovec *iov,
 			  unsigned long nr_segs, loff_t pos);
 static int sock_mmap(struct file *file, struct vm_area_struct *vma);
 
@@ -901,7 +901,7 @@ static struct sock_iocb *alloc_sock_iocb(struct kiocb *iocb,
 }
 
 static ssize_t do_sock_read(struct msghdr *msg, struct kiocb *iocb,
-		struct file *file, const struct iovec *iov,
+		struct file *file, struct iovec *iov,
 		unsigned long nr_segs)
 {
 	struct socket *sock = file->private_data;
@@ -915,14 +915,14 @@ static ssize_t do_sock_read(struct msghdr *msg, struct kiocb *iocb,
 	msg->msg_namelen = 0;
 	msg->msg_control = NULL;
 	msg->msg_controllen = 0;
-	msg->msg_iov = (struct iovec *)iov;
+	msg->msg_iov = iov;
 	msg->msg_iovlen = nr_segs;
 	msg->msg_flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0;
 
 	return __sock_recvmsg(iocb, sock, msg, size, msg->msg_flags);
 }
 
-static ssize_t sock_aio_read(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t sock_aio_read(struct kiocb *iocb, struct iovec *iov,
 				unsigned long nr_segs, loff_t pos)
 {
 	struct sock_iocb siocb, *x;
@@ -941,7 +941,7 @@ static ssize_t sock_aio_read(struct kiocb *iocb, const struct iovec *iov,
 }
 
 static ssize_t do_sock_write(struct msghdr *msg, struct kiocb *iocb,
-			struct file *file, const struct iovec *iov,
+			struct file *file, struct iovec *iov,
 			unsigned long nr_segs)
 {
 	struct socket *sock = file->private_data;
@@ -955,7 +955,7 @@ static ssize_t do_sock_write(struct msghdr *msg, struct kiocb *iocb,
 	msg->msg_namelen = 0;
 	msg->msg_control = NULL;
 	msg->msg_controllen = 0;
-	msg->msg_iov = (struct iovec *)iov;
+	msg->msg_iov = iov;
 	msg->msg_iovlen = nr_segs;
 	msg->msg_flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0;
 	if (sock->type == SOCK_SEQPACKET)
@@ -964,7 +964,7 @@ static ssize_t do_sock_write(struct msghdr *msg, struct kiocb *iocb,
 	return __sock_sendmsg(iocb, sock, msg, size);
 }
 
-static ssize_t sock_aio_write(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t sock_aio_write(struct kiocb *iocb, struct iovec *iov,
 			  unsigned long nr_segs, loff_t pos)
 {
 	struct sock_iocb siocb, *x;
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 166d59c..229b5a9 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -2995,7 +2995,7 @@ static ssize_t snd_pcm_write(struct file *file, const char __user *buf,
 	return result;
 }
 
-static ssize_t snd_pcm_aio_read(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t snd_pcm_aio_read(struct kiocb *iocb, struct iovec *iov,
 			     unsigned long nr_segs, loff_t pos)
 
 {
@@ -3031,7 +3031,7 @@ static ssize_t snd_pcm_aio_read(struct kiocb *iocb, const struct iovec *iov,
 	return result;
 }
 
-static ssize_t snd_pcm_aio_write(struct kiocb *iocb, const struct iovec *iov,
+static ssize_t snd_pcm_aio_write(struct kiocb *iocb, struct iovec *iov,
 			      unsigned long nr_segs, loff_t pos)
 {
 	struct snd_pcm_file *pcm_file;

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply related

* Re: fs: Use non-const iov in aio_read/aio_write
From: Al Viro @ 2014-11-03  0:16 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David S. Miller, netdev, Linux Kernel Mailing List,
	Benjamin LaHaise
In-Reply-To: <20141102230552.GA26095@gondor.apana.org.au>

On Mon, Nov 03, 2014 at 07:05:52AM +0800, Herbert Xu wrote:
> Currently the functions aio_read/aio_write use a const iov as
> input.  This is unnecessary as all their callers supply a
> stack-based or kmalloced iov which is never reused.  Conceptually
> this is fine because iovs supplied to aio_read/aio_write ultimately
> come from user-space so we always have to make a copy of them for
> the kernel.
> 
> This is also a joke because for as long (since 2.1.15) as we've
> had the const iov, the network stack (currently through do_sock_read
> and do_sock_write) has been casting the const away.  IOW if anybody
> did supply a const iov they would crash and burn if they ever
> entered the network stack.
> 
> The network stack needs a non-const iov because it iterates through
> the iov as it reads/writes data.
> 
> So we have two alternatives, either change the network stack to
> not touch the iovs or make the iovs non-const.
> 
> As there is no reason for the iovs to be const in the first place,
> I have taken the second choice and changed all aio_read/aio_write
> functions to use non-const iovs.

NAK with extreme prejudice.  The right way to deal with that is
to convert the socket side of things to iov_iter.  And give it a
consistent behaviour, while we are at it (some protocols do advance
the damn thing, some do not).  There are _very_ good reasons to have those
iovecs unchanged - if you look at the callers on the socket side, you'll
see a bunch that has to _copy_ iovec just to avoid it being buggered.
And you get rather suboptimal behaviour in memcpy_fromiovec() and friends,
exactly because you have to skip through the emptied elements.

IOW, no way in hell.
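
For readers following the argument: the difference between consuming an
iovec in place and walking it with a separate cursor can be sketched in
user-space C.  This is only an illustration of the idea, not the kernel's
iov_iter; struct iov_cursor and cursor_copy_from() are hypothetical names
invented for this sketch.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical cursor: all position state lives here, not in the iovec. */
struct iov_cursor {
	const struct iovec *iov;	/* caller's array, never written to */
	size_t nr_segs;			/* segments left, including the current one */
	size_t seg_off;			/* bytes already consumed from *iov */
};

/*
 * Copy up to len bytes from the segments into dst.  Only the cursor
 * advances; the iovec entries keep their original base and length, so
 * the caller does not need a private copy of the array.
 */
static size_t cursor_copy_from(struct iov_cursor *c, void *dst, size_t len)
{
	char *out = dst;
	size_t done = 0;

	while (done < len && c->nr_segs) {
		size_t avail = c->iov->iov_len - c->seg_off;
		size_t n = len - done < avail ? len - done : avail;

		memcpy(out + done, (const char *)c->iov->iov_base + c->seg_off, n);
		done += n;
		c->seg_off += n;
		if (c->seg_off == c->iov->iov_len) {
			/* Segment exhausted: step past it once, never revisit. */
			c->iov++;
			c->nr_segs--;
			c->seg_off = 0;
		}
	}
	return done;
}
```

Because finished segments are stepped over once, a later call does not
have to skip through emptied elements, and the array can stay const.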

^ permalink raw reply

* Re: fs: Use non-const iov in aio_read/aio_write
From: Al Viro @ 2014-11-03  0:21 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David S. Miller, netdev, Linux Kernel Mailing List,
	Benjamin LaHaise
In-Reply-To: <20141103001634.GV7996@ZenIV.linux.org.uk>

On Mon, Nov 03, 2014 at 12:16:34AM +0000, Al Viro wrote:

> NAK with extreme prejudice.  The right way to deal with that is
> to convert the socket side of things to iov_iter.  And give it a
> consistent behaviour, while we are at it (some protocols do advance
> the damn thing, some do not).  There are _very_ good reasons to have those
> iovecs unchanged - if you look at the callers on the socket side, you'll
> see a bunch that has to _copy_ iovec just to avoid it being buggered.
> And you get rather suboptimal behaviour in memcpy_fromiovec() and friends,
> exactly because you have to skip through the emptied elements.
> 
> IOW, no way in hell.

PS: I do have the beginning of that stuff sitting in the local queue since
April; see http://marc.info/?l=linux-xfs&m=139179304710494&w=2 for the
beginning of the story.

^ permalink raw reply

* Re: fs: Use non-const iov in aio_read/aio_write
From: Herbert Xu @ 2014-11-03  0:22 UTC (permalink / raw)
  To: Al Viro
  Cc: David S. Miller, netdev, Linux Kernel Mailing List,
	Benjamin LaHaise
In-Reply-To: <20141103001634.GV7996@ZenIV.linux.org.uk>

On Mon, Nov 03, 2014 at 12:16:34AM +0000, Al Viro wrote:
> 
> NAK with extreme prejudice.  The right way to deal with that is
> to convert the socket side of things to iov_iter.  And give it a
> consistent behaviour, while we are at it (some protocols do advance
> the damn thing, some do not).  There are _very_ good reasons to have those
> iovecs unchanged - if you look at the callers on the socket side, you'll
> see a bunch that has to _copy_ iovec just to avoid it being buggered.
> And you get rather suboptimal behaviour in memcpy_fromiovec() and friends,
> exactly because you have to skip through the emptied elements.
> 
> IOW, no way in hell.

You're welcome to send patches to fix every spot in the network stack
that writes to the iovec.  But until the network stack is all fixed
up, having a const struct iovec in aio_read/aio_write is a delusion.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: fs: Use non-const iov in aio_read/aio_write
From: Al Viro @ 2014-11-03  0:45 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David S. Miller, netdev, Linux Kernel Mailing List,
	Benjamin LaHaise
In-Reply-To: <20141103002207.GA26588@gondor.apana.org.au>

On Mon, Nov 03, 2014 at 08:22:07AM +0800, Herbert Xu wrote:
> On Mon, Nov 03, 2014 at 12:16:34AM +0000, Al Viro wrote:
> > 
> > NAK with extreme prejudice.  The right way to deal with that is
> > to convert the socket side of things to iov_iter.  And give it a
> > consistent behaviour, while we are at it (some protocols do advance
> > the damn thing, some do not).  There are _very_ good reasons to have those
> > iovecs unchanged - if you look at the callers on the socket side, you'll
> > see a bunch that has to _copy_ iovec just to avoid it being buggered.
> > And you get rather suboptimal behaviour in memcpy_fromiovec() and friends,
> > exactly because you have to skip through the emptied elements.
> > 
> > IOW, no way in hell.
> 
> You're welcome to send patches to fix every spot in the network stack
> that writes to the iovec.  But until the network stack is all fixed
> up, having a const struct iovec in aio_read/aio_write is a delusion.

Check how many ->aio_read() and ->aio_write() instances are left.  If you
are implying that dealing with the ones in net/* is not feasible, I invite
you to check the situation in fs/*, where we used to have quite a few.
Compare it with what used to be there in e.g. January.

Note, BTW, that there's a damn good reason to convert the socket side of
things to iov_iter - as it is, ->splice_write() there is basically done with
page-by-page mapping and doing kernel_sendmsg(); being able to deal with
"map and copy" stuff *inside* ->sendmsg() would not only reduce the overhead,
it would allow us to get rid of ->sendpage() completely.  Basically, let
->sendmsg() instances check the iov_iter type and play zerocopy games if
it's an "array of kernel pages" kind.  Compare ->sendpage() and ->sendmsg()
instances for the protocols that have nontrivial ->sendpage(); you'll see
that there's a lot of duplication.  Merging them looks very feasible, with
divergence happening only very deep in the call chain.
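
A toy model of that "check the iterator type inside ->sendmsg()" idea,
in user-space C.  Every name below (toy_iter, toy_frag, toy_sendmsg, the
ITER_* kinds) is hypothetical and made up for illustration; none of it
is kernel API, and the real structures are far richer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical iterator kinds: ordinary copy source vs. borrowed pages. */
enum iter_kind { ITER_COPY_BUF, ITER_PAGE_REFS };

struct toy_iter {
	enum iter_kind kind;
	const void *data;
	size_t len;
};

/* Hypothetical queued fragment, standing in for what a protocol builds. */
struct toy_frag {
	const void *ref;	/* zerocopy path: borrowed pointer to the data */
	char copy[64];		/* ordinary path: private copy of the data */
	size_t len;
	int zerocopy;
};

/*
 * One entry point for both cases.  The paths diverge only at the point
 * where the payload is attached, which is the property the thread argues
 * would let ->sendpage() be folded into ->sendmsg().
 * Returns bytes queued, or 0 if the data does not fit the toy buffer.
 */
static size_t toy_sendmsg(struct toy_frag *f, const struct toy_iter *it)
{
	if (it->kind == ITER_PAGE_REFS) {
		f->ref = it->data;	/* reference the pages, no copy */
		f->zerocopy = 1;
	} else {
		if (it->len > sizeof(f->copy))
			return 0;
		memcpy(f->copy, it->data, it->len);
		f->ref = NULL;
		f->zerocopy = 0;
	}
	f->len = it->len;
	return it->len;
}
```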

^ permalink raw reply

* [PATCH] ipv4: avoid divide 0 error in tcp_incr_quickack
From: Chen Weilong @ 2014-11-03  1:29 UTC (permalink / raw)
  To: davem, kuznet, jmorris, yoshfuji, kaber; +Cc: netdev, linux-kernel

From: Weilong Chen <chenweilong@huawei.com>

We got a problem like this:
 [ffff8801c1a05570] machine_kexec at ffffffff81025039
 [ffff8801c1a055d0] crash_kexec at ffffffff8109b253
 [ffff8801c1a056a0] oops_end at ffffffff81442aed
 [ffff8801c1a056d0] die at ffffffff81005603
 [ffff8801c1a05700] do_trap at ffffffff81442448
 [ffff8801c1a05760] do_divide_error at ffffffff81002c10
 [ffff8801c1a05888] tcp_send_dupack at ffffffff81385e44
 [ffff8801c1a058c8] tcp_validate_incoming at ffffffff813886b5
 [ffff8801c1a05908] tcp_rcv_state_process at ffffffff8138d0b7
 [ffff8801c1a05958] tcp_child_process at ffffffff81397255
 [ffff8801c1a05988] tcp_v4_do_rcv at ffffffff81395a70
 [ffff8801c1a059d8] tcp_v4_rcv at ffffffff81396fc8
 [ffff8801c1a05a48] ip_local_deliver_finish at ffffffff813746e9
 [ffff8801c1a05a78] ip_local_deliver at ffffffff81374a20
 [ffff8801c1a05aa8] ip_rcv_finish at ffffffff81374389
 [ffff8801c1a05ad8] ip_rcv at ffffffff81374c78
An unexpected ACK packet arrived during the TCP handshake. The socket's state
was TCP_SYN_RECV and its rcv_mss was not initialized yet, so
tcp_send_dupack -> tcp_enter_quickack_mode hit a divide-by-zero error.
This patch adds a state check before tcp_enter_quickack_mode.

Signed-off-by: Weilong Chen <chenweilong@huawei.com>
---
 net/ipv4/tcp_input.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4e4617e..9eb56dc 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3986,7 +3986,8 @@ static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
 	if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
 	    before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
 		NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
-		tcp_enter_quickack_mode(sk);
+		if (sk->sk_state != TCP_SYN_RECV)
+			tcp_enter_quickack_mode(sk);

 		if (tcp_is_sack(tp) && sysctl_tcp_dsack) {
 			u32 end_seq = TCP_SKB_CB(skb)->end_seq;
-- 
1.7.12
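
For context, the arithmetic behind the trap can be modelled in user-space
C.  This is a sketch under assumptions, not the kernel code: the helper
name quickack_budget() is hypothetical, the cap of 16 is assumed to match
TCP_MAX_QUICKACKS, and the early return is there only to make the model
safe to call.  It is not offered as the fix for the bug above.

```c
#include <stdint.h>

#define MAX_QUICKACKS 16U	/* assumed to mirror TCP_MAX_QUICKACKS */

/*
 * Model of the quickack budget: roughly how many segments fit in half
 * the receive window.  The divisor is 2 * rcv_mss, so an rcv_mss of 0
 * (not yet initialized, as in TCP_SYN_RECV) would divide by zero.
 * Returns 0 when rcv_mss is unknown; that guard is illustrative only.
 */
static unsigned int quickack_budget(uint32_t rcv_wnd, uint16_t rcv_mss)
{
	unsigned int quickacks;

	if (rcv_mss == 0)
		return 0;

	quickacks = rcv_wnd / (2U * rcv_mss);
	if (quickacks == 0)
		quickacks = 2;	/* small window: still allow a couple */

	return quickacks < MAX_QUICKACKS ? quickacks : MAX_QUICKACKS;
}
```

With rcv_wnd = 29200 and rcv_mss = 1460 the divisor is 2920 and the
budget is 10; with rcv_mss = 0 the unguarded division is what traps.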

^ permalink raw reply related

* [PATCH] netfilter: nft_reject_bridge: Fix powerpc build error
From: Guenter Roeck @ 2014-11-03  2:19 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: David S. Miller, netfilter-devel, coreteam, bridge, netdev,
	Stephen Hemminger, Guenter Roeck, Pablo Neira Ayuso

Fix:
net/bridge/netfilter/nft_reject_bridge.c:
In function 'nft_reject_br_send_v6_unreach':
net/bridge/netfilter/nft_reject_bridge.c:240:3:
	error: implicit declaration of function 'csum_ipv6_magic'
   csum_ipv6_magic(&nip6h->saddr, &nip6h->daddr,
   ^
make[3]: *** [net/bridge/netfilter/nft_reject_bridge.o] Error 1

Seen with powerpc:allmodconfig.

Fixes: 523b929d5446 ("netfilter: nft_reject_bridge: don't use IP stack to reject traffic")
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 net/bridge/netfilter/nft_reject_bridge.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/bridge/netfilter/nft_reject_bridge.c b/net/bridge/netfilter/nft_reject_bridge.c
index 654c901..48da2c5 100644
--- a/net/bridge/netfilter/nft_reject_bridge.c
+++ b/net/bridge/netfilter/nft_reject_bridge.c
@@ -18,6 +18,7 @@
 #include <net/netfilter/ipv6/nf_reject.h>
 #include <linux/ip.h>
 #include <net/ip.h>
+#include <net/ip6_checksum.h>
 #include <linux/netfilter_bridge.h>
 #include "../br_private.h"
 
-- 
1.9.1


^ permalink raw reply related

* [PATCH] ipv4: fix comment in net/ipv4/tcp_input.c to reference the correct RFC
From: James Brown @ 2014-11-03  3:11 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy
  Cc: netdev, linux-kernel, James Brown

RFC 5691 has to do with MPEG surround-sound; RFC 5961 has to do
with hardening TCP against in-window spoofing attacks

Signed-off-by: James Brown <roguelazer@roguelazer.com>
---
 net/ipv4/tcp_input.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a12b455..d285962 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5028,7 +5028,7 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 	/* step 3: check security and precedence [ignored] */
 
 	/* step 4: Check for a SYN
-	 * RFC 5691 4.2 : Send a challenge ack
+	 * RFC 5961 4.2 : Send a challenge ack
 	 */
 	if (th->syn) {
 syn_challenge:
-- 
2.1.2

^ permalink raw reply related

* Re: [PATCH] ipv4: avoid divide 0 error in tcp_incr_quickack
From: Eric Dumazet @ 2014-11-03  3:42 UTC (permalink / raw)
  To: Chen Weilong
  Cc: davem, kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel
In-Reply-To: <1414978173-6948-1-git-send-email-chenweilong@huawei.com>

On Mon, 2014-11-03 at 09:29 +0800, Chen Weilong wrote:
> From: Weilong Chen <chenweilong@huawei.com>
> 
> We got a problem like this:

> An unexpected ACK packet arrived during the TCP handshake. The socket's state
> was TCP_SYN_RECV and its rcv_mss was not initialized yet, so
> tcp_send_dupack -> tcp_enter_quickack_mode hit a divide-by-zero error.
> This patch adds a state check before tcp_enter_quickack_mode.
> 
> Signed-off-by: Weilong Chen <chenweilong@huawei.com>
> ---
>  net/ipv4/tcp_input.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4e4617e..9eb56dc 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -3986,7 +3986,8 @@ static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
>  	if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
>  	    before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
>  		NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
> -		tcp_enter_quickack_mode(sk);
> +		if (sk->sk_state != TCP_SYN_RECV)
> +			tcp_enter_quickack_mode(sk);
>  
>  		if (tcp_is_sack(tp) && sysctl_tcp_dsack) {
>  			u32 end_seq = TCP_SKB_CB(skb)->end_seq;


Sorry, I do not think this is the right fix.

We should not simply avoid the divide; we need to understand the
missing steps and fix the underlying issue.

^ permalink raw reply

* Re: [PATCH] ipv4: fix comment in net/ipv4/tcp_input.c to reference the correct RFC
From: David Miller @ 2014-11-03  3:49 UTC (permalink / raw)
  To: roguelazer; +Cc: kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel
In-Reply-To: <1414984260-26020-1-git-send-email-roguelazer@roguelazer.com>

From: James Brown <roguelazer@roguelazer.com>
Date: Sun,  2 Nov 2014 19:11:00 -0800

> RFC 5691 has to do with MPEG surround-sound; RFC 5961 has to do
> with hardening TCP against in-window spoofing attacks
> 
> Signed-off-by: James Brown <roguelazer@roguelazer.com>

Already fixed in net-next, thanks.

^ permalink raw reply

* linux-next: build failure in Linus' tree
From: Stephen Rothwell @ 2014-11-03  4:09 UTC (permalink / raw)
  To: Linus Torvalds, David Miller, netdev
  Cc: linux-next, linux-kernel, Pablo Neira Ayuso

Hi Linus,

With Linus' tree, today's linux-next build (powerpc allyesconfig)
failed like this:

net/bridge/netfilter/nft_reject_bridge.c: In function 'nft_reject_br_send_v6_unreach':
net/bridge/netfilter/nft_reject_bridge.c:240:3: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration]
   csum_ipv6_magic(&nip6h->saddr, &nip6h->daddr,
   ^

Caused by commit 523b929d5446 ("netfilter: nft_reject_bridge: don't use
IP stack to reject traffic") from Linus' tree.

I applied the following patch for today:

From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Mon, 3 Nov 2014 15:01:16 +1100
Subject: [PATCH] netfilter: nft_reject_bridge: include ip6_checksum.h for csum_ipv6_magic

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
 net/bridge/netfilter/nft_reject_bridge.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/bridge/netfilter/nft_reject_bridge.c b/net/bridge/netfilter/nft_reject_bridge.c
index 654c9018e3e7..1123f2b4a1b1 100644
--- a/net/bridge/netfilter/nft_reject_bridge.c
+++ b/net/bridge/netfilter/nft_reject_bridge.c
@@ -16,6 +16,7 @@
 #include <net/netfilter/nft_reject.h>
 #include <net/netfilter/ipv4/nf_reject.h>
 #include <net/netfilter/ipv6/nf_reject.h>
+#include <net/ip6_checksum.h>
 #include <linux/ip.h>
 #include <net/ip.h>
 #include <linux/netfilter_bridge.h>
-- 
2.1.1

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

^ permalink raw reply related

* Re: linux-next: build failure in Linus' tree
From: David Miller @ 2014-11-03  4:16 UTC (permalink / raw)
  To: sfr; +Cc: torvalds, netdev, linux-next, linux-kernel, pablo
In-Reply-To: <20141103150930.70a260c7@canb.auug.org.au>

From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Mon, 3 Nov 2014 15:09:30 +1100

> Hi Linus,
> 
> With Linus' tree, today's linux-next build (powerpc allyesconfig)
> failed like this:
> 
> net/bridge/netfilter/nft_reject_bridge.c: In function 'nft_reject_br_send_v6_unreach':
> net/bridge/netfilter/nft_reject_bridge.c:240:3: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration]
>    csum_ipv6_magic(&nip6h->saddr, &nip6h->daddr,
>    ^
> 
> Caused by commit 523b929d5446 ("netfilter: nft_reject_bridge: don't use
> IP stack to reject traffic") from Linus' tree.
> 
> I applied the following patch for today:

Yep, another person hit this today and submitted a patch too:

http://patchwork.ozlabs.org/patch/406003/

I'll get this into the net tree ASAP.

^ permalink raw reply

* Re: [PATCH] netfilter: nft_reject_bridge: Fix powerpc build error
From: David Miller @ 2014-11-03  4:17 UTC (permalink / raw)
  To: linux; +Cc: netdev, bridge, stephen, coreteam, netfilter-devel, kaber, pablo
In-Reply-To: <1414981155-27155-1-git-send-email-linux@roeck-us.net>

From: Guenter Roeck <linux@roeck-us.net>
Date: Sun,  2 Nov 2014 18:19:15 -0800

> Fix:
> net/bridge/netfilter/nft_reject_bridge.c:
> In function 'nft_reject_br_send_v6_unreach':
> net/bridge/netfilter/nft_reject_bridge.c:240:3:
> 	error: implicit declaration of function 'csum_ipv6_magic'
>    csum_ipv6_magic(&nip6h->saddr, &nip6h->daddr,
>    ^
> make[3]: *** [net/bridge/netfilter/nft_reject_bridge.o] Error 1
> 
> Seen with powerpc:allmodconfig.
> 
> Fixes: 523b929d5446 ("netfilter: nft_reject_bridge: don't use IP stack to reject traffic")
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Applied, thanks.

^ permalink raw reply

* [PATCH] ipv6: do xfrm transform after nat if necessary
From: Duan Jiong @ 2014-11-03  4:53 UTC (permalink / raw)
  To: David Miller; +Cc: netdev



In function nf_nat_ipv6_out, after nat is done, nf_xfrm_me_harder()
will be called to look up xfrm dst.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
---
 net/ipv6/ip6_output.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 8e950c2..742a845 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -124,6 +124,14 @@ static int ip6_finish_output2(struct sk_buff *skb)
 
 static int ip6_finish_output(struct sk_buff *skb)
 {
+#if defined(CONFIG_NETFILTER) && defined(CONFIG_XFRM)
+	/* Just like ipv4, policy lookup after nat yielded a new policy */
+	if (skb_dst(skb)->xfrm != NULL) {
+		IP6CB(skb)->flags |= IP6SKB_REROUTED;
+		return dst_output(skb);
+	}
+#endif
+
 	if ((skb->len > ip6_skb_dst_mtu(skb) && !skb_is_gso(skb)) ||
 	    dst_allfrag(skb_dst(skb)) ||
 	    (IP6CB(skb)->frag_max_size && skb->len > IP6CB(skb)->frag_max_size))
-- 
1.8.3.1

^ permalink raw reply related
