Netdev List
* Re: GPF in skb_flow_dissect
From: Eric Dumazet @ 2012-12-13  5:22 UTC
  To: Dave Jones, Jason Wang, David Miller; +Cc: netdev
In-Reply-To: <20121213041644.GB1611@redhat.com>

From: Eric Dumazet <edumazet@google.com>

On Wed, 2012-12-12 at 23:16 -0500, Dave Jones wrote:
> Since today's net merge, I see this when I start openvpn..
> 
> general protection fault: 0000 [#1] PREEMPT SMP 
> Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables xfs iTCO_wdt iTCO_vendor_support snd_emu10k1 snd_util_mem snd_ac97_codec coretemp ac97_bus microcode snd_hwdep snd_seq pcspkr snd_pcm snd_page_alloc snd_timer lpc_ich i2c_i801 snd_rawmidi mfd_core snd_seq_device snd e1000e soundcore emu10k1_gp gameport i82975x_edac edac_core vhost_net tun macvtap macvlan kvm_intel kvm binfmt_misc nfsd auth_rpcgss nfs_acl lockd sunrpc btrfs libcrc32c zlib_deflate firewire_ohci sata_sil firewire_core crc_itu_t radeon i2c_algo_bit drm_kms_helper ttm drm i2c_core floppy
> CPU 0 
> Pid: 1381, comm: openvpn Not tainted 3.7.0+ #14                  /D975XBX
> RIP: 0010:[<ffffffff815b54a4>]  [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
> RSP: 0018:ffff88007d0d9c48  EFLAGS: 00010206
> RAX: 000000000000055d RBX: 6b6b6b6b6b6b6b4b RCX: 1471030a0180040a
> RDX: 0000000000000005 RSI: 00000000ffffffe0 RDI: ffff8800ba83fa80
> RBP: ffff88007d0d9cb8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000101 R12: ffff8800ba83fa80
> R13: 0000000000000008 R14: ffff88007d0d9cc8 R15: ffff8800ba83fa80
> FS:  00007f6637104800(0000) GS:ffff8800bf600000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f563f5b01c4 CR3: 000000007d140000 CR4: 00000000000007f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process openvpn (pid: 1381, threadinfo ffff88007d0d8000, task ffff8800a540cd60)
> Stack:
>  ffff8800ba83fa80 0000000000000296 0000000000000000 0000000000000000
>  ffff88007d0d9cc8 ffffffff815bcff4 ffff88007d0d9ce8 ffffffff815b1831
>  ffff88007d0d9ca8 00000000703f6364 ffff8800ba83fa80 0000000000000000
> Call Trace:
>  [<ffffffff815bcff4>] ? netif_rx+0x114/0x4c0
>  [<ffffffff815b1831>] ? skb_copy_datagram_from_iovec+0x61/0x290
>  [<ffffffff815b672a>] __skb_get_rxhash+0x1a/0xd0
>  [<ffffffffa03b9538>] tun_get_user+0x418/0x810 [tun]
>  [<ffffffff8135f468>] ? delay_tsc+0x98/0xf0
>  [<ffffffff8109605c>] ? __rcu_read_unlock+0x5c/0xa0
>  [<ffffffffa03b9a41>] tun_chr_aio_write+0x81/0xb0 [tun]
>  [<ffffffff81145011>] ? __buffer_unlock_commit+0x41/0x50
>  [<ffffffff811db917>] do_sync_write+0xa7/0xe0
>  [<ffffffff811dc01f>] vfs_write+0xaf/0x190
>  [<ffffffff811dc375>] sys_write+0x55/0xa0
>  [<ffffffff81705540>] tracesys+0xdd/0xe2
> Code: 41 8b 44 24 68 41 2b 44 24 6c 01 de 29 f0 83 f8 03 0f 8e a0 00 00 00 48 63 de 49 03 9c 24 e0 00 00 00 48 85 db 0f 84 72 fe ff ff <8b> 03 41 89 46 08 b8 01 00 00 00 e9 43 fd ff ff 0f 1f 40 00 48 
> RIP  [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
>  RSP <ffff88007d0d9c48>
> ---[ end trace 6d42c834c72c002e ]---
> 
> 
> Faulting instruction is
> 
>    0:	8b 03                	mov    (%rbx),%eax
> 
> rbx is slab poison (-20) so this looks like a use-after-free here...
> 
>                         flow->ports = *ports;
>  314:   8b 03                   mov    (%rbx),%eax
>  316:   41 89 46 08             mov    %eax,0x8(%r14)
> 
> in the inlined skb_header_pointer in skb_flow_dissect
> 
> 	Dave
> 

Yes, commit 7694a3acc55a7 added this bug

It's illegal to use the skb after the call to netif_rx_ni(skb): ownership
passes to the stack, which may free it at any time.

I would try the following patch.

Thanks !

[PATCH] tuntap: dont use skb after netif_rx_ni(skb)

commit 96442e4242 (tuntap: choose the txq based on rxq) added
a use after free.

Cache rxhash in a temp variable before calling netif_rx_ni()

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Wang <jasowang@redhat.com>
---
 drivers/net/tun.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 2ac2164..40b426e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -297,13 +297,12 @@ static void tun_flow_cleanup(unsigned long data)
 	spin_unlock_bh(&tun->lock);
 }
 
-static void tun_flow_update(struct tun_struct *tun, struct sk_buff *skb,
+static void tun_flow_update(struct tun_struct *tun, u32 rxhash,
 			    u16 queue_index)
 {
 	struct hlist_head *head;
 	struct tun_flow_entry *e;
 	unsigned long delay = tun->ageing_time;
-	u32 rxhash = skb_get_rxhash(skb);
 
 	if (!rxhash)
 		return;
@@ -1010,6 +1009,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	int copylen;
 	bool zerocopy = false;
 	int err;
+	u32 rxhash;
 
 	if (!(tun->flags & TUN_NO_PI)) {
 		if ((len -= sizeof(pi)) > total_len)
@@ -1162,12 +1162,13 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 		skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY;
 	}
 
+	rxhash = skb_get_rxhash(skb);
 	netif_rx_ni(skb);
 
 	tun->dev->stats.rx_packets++;
 	tun->dev->stats.rx_bytes += len;
 
-	tun_flow_update(tun, skb, tfile->queue_index);
+	tun_flow_update(tun, rxhash, tfile->queue_index);
 	return total_len;
 }
 

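The ownership rule behind this fix can be illustrated outside the kernel. The following is only a userspace sketch, not the tun driver code: `struct packet`, `hand_off()`, `flow_update()` and `submit()` are hypothetical stand-ins for the skb, netif_rx_ni(), tun_flow_update() and tun_get_user(). It shows the same idea as the patch: read whatever is still needed from the buffer before giving it away, then work only with the cached copy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb: only the fields the sketch needs. */
struct packet {
	uint32_t hash;
	size_t len;
};

/* Hypothetical per-flow bookkeeping, loosely modelled on a flow table entry. */
struct flow_stats {
	uint32_t last_hash;
	unsigned long packets;
};

/* Consumer that takes ownership; like netif_rx_ni(), it may free at once. */
static void hand_off(struct packet *p)
{
	free(p);
}

/* Takes the hash by value, so it never touches the packet itself. */
static void flow_update(struct flow_stats *fs, uint32_t hash)
{
	if (!hash)
		return;
	fs->last_hash = hash;
	fs->packets++;
}

static int submit(struct flow_stats *fs, uint32_t hash, size_t len)
{
	struct packet *p;
	uint32_t cached;

	assert(fs != NULL);
	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	p->hash = hash;
	p->len = len;

	cached = p->hash;	/* read everything we need first */
	hand_off(p);		/* p must not be dereferenced after this */
	flow_update(fs, cached);
	return 0;
}
```

Passing the hash by value, rather than the packet pointer, makes the misuse hard to reintroduce: the helper has no way to reach the freed buffer.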

* Re: GPF in skb_flow_dissect
From: Jason Wang @ 2012-12-13  5:40 UTC
  To: Eric Dumazet; +Cc: Dave Jones, David Miller, netdev
In-Reply-To: <1355376177.12271.244.camel@edumazet-glaptop>

On 12/13/2012 01:22 PM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> On Wed, 2012-12-12 at 23:16 -0500, Dave Jones wrote:
>> Since today's net merge, I see this when I start openvpn..
>>
>> general protection fault: 0000 [#1] PREEMPT SMP 
>> Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables xfs iTCO_wdt iTCO_vendor_support snd_emu10k1 snd_util_mem snd_ac97_codec coretemp ac97_bus microcode snd_hwdep snd_seq pcspkr snd_pcm snd_page_alloc snd_timer lpc_ich i2c_i801 snd_rawmidi mfd_core snd_seq_device snd e1000e soundcore emu10k1_gp gameport i82975x_edac edac_core vhost_net tun macvtap macvlan kvm_intel kvm binfmt_misc nfsd auth_rpcgss nfs_acl lockd sunrpc btrfs libcrc32c zlib_deflate firewire_ohci sata_sil firewire_core crc_itu_t radeon i2c_algo_bit drm_kms_helper ttm drm i2c_core floppy
>> CPU 0 
>> Pid: 1381, comm: openvpn Not tainted 3.7.0+ #14                  /D975XBX
>> RIP: 0010:[<ffffffff815b54a4>]  [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
>> RSP: 0018:ffff88007d0d9c48  EFLAGS: 00010206
>> RAX: 000000000000055d RBX: 6b6b6b6b6b6b6b4b RCX: 1471030a0180040a
>> RDX: 0000000000000005 RSI: 00000000ffffffe0 RDI: ffff8800ba83fa80
>> RBP: ffff88007d0d9cb8 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000101 R12: ffff8800ba83fa80
>> R13: 0000000000000008 R14: ffff88007d0d9cc8 R15: ffff8800ba83fa80
>> FS:  00007f6637104800(0000) GS:ffff8800bf600000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f563f5b01c4 CR3: 000000007d140000 CR4: 00000000000007f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process openvpn (pid: 1381, threadinfo ffff88007d0d8000, task ffff8800a540cd60)
>> Stack:
>>  ffff8800ba83fa80 0000000000000296 0000000000000000 0000000000000000
>>  ffff88007d0d9cc8 ffffffff815bcff4 ffff88007d0d9ce8 ffffffff815b1831
>>  ffff88007d0d9ca8 00000000703f6364 ffff8800ba83fa80 0000000000000000
>> Call Trace:
>>  [<ffffffff815bcff4>] ? netif_rx+0x114/0x4c0
>>  [<ffffffff815b1831>] ? skb_copy_datagram_from_iovec+0x61/0x290
>>  [<ffffffff815b672a>] __skb_get_rxhash+0x1a/0xd0
>>  [<ffffffffa03b9538>] tun_get_user+0x418/0x810 [tun]
>>  [<ffffffff8135f468>] ? delay_tsc+0x98/0xf0
>>  [<ffffffff8109605c>] ? __rcu_read_unlock+0x5c/0xa0
>>  [<ffffffffa03b9a41>] tun_chr_aio_write+0x81/0xb0 [tun]
>>  [<ffffffff81145011>] ? __buffer_unlock_commit+0x41/0x50
>>  [<ffffffff811db917>] do_sync_write+0xa7/0xe0
>>  [<ffffffff811dc01f>] vfs_write+0xaf/0x190
>>  [<ffffffff811dc375>] sys_write+0x55/0xa0
>>  [<ffffffff81705540>] tracesys+0xdd/0xe2
>> Code: 41 8b 44 24 68 41 2b 44 24 6c 01 de 29 f0 83 f8 03 0f 8e a0 00 00 00 48 63 de 49 03 9c 24 e0 00 00 00 48 85 db 0f 84 72 fe ff ff <8b> 03 41 89 46 08 b8 01 00 00 00 e9 43 fd ff ff 0f 1f 40 00 48 
>> RIP  [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
>>  RSP <ffff88007d0d9c48>
>> ---[ end trace 6d42c834c72c002e ]---
>>
>>
>> Faulting instruction is
>>
>>    0:	8b 03                	mov    (%rbx),%eax
>>
>> rbx is slab poison (-20) so this looks like a use-after-free here...
>>
>>                         flow->ports = *ports;
>>  314:   8b 03                   mov    (%rbx),%eax
>>  316:   41 89 46 08             mov    %eax,0x8(%r14)
>>
>> in the inlined skb_header_pointer in skb_flow_dissect
>>
>> 	Dave
>>
> Yes, commit 7694a3acc55a7 added this bug
>
> It's illegal to use the skb after the call to netif_rx_ni(skb): ownership
> passes to the stack, which may free it at any time.
>
> I would try the following patch.
>
> Thanks !
>
> [PATCH] tuntap: dont use skb after netif_rx_ni(skb)
>
> commit 96442e4242 (tuntap: choose the txq based on rxq) added
> a use after free.
>
> Cache rxhash in a temp variable before calling netif_rx_ni()
>
> Reported-by: Dave Jones <davej@redhat.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/net/tun.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Acked-by: Jason Wang <jasowang@redhat.com>


* Re: [net-next:master 14/17] net/bridge/br_mdb.c:330 br_mdb_add_group() error: potential null dereference 'mp'. (br_multicast_new_group returns null)
From: Cong Wang @ 2012-12-13  7:15 UTC
  To: kbuild test robot; +Cc: netdev
In-Reply-To: <50c8d94d.VSxb5ulWJHo9Ahhi%fengguang.wu@intel.com>

On Thu, 2012-12-13 at 03:21 +0800, kbuild test robot wrote:
> tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
> head:   520dfe3a3645257bf83660f672c47f8558f3d4c4
> commit: cfd567543590f71ca0af397437e2554f9756d750 [14/17] bridge: add support of adding and deleting mdb entries
> 
> 
> smatch warnings:
> 
> + net/bridge/br_mdb.c:330 br_mdb_add_group() error: potential null dereference 'mp'.  (br_multicast_new_group returns null)

It seems impossible for br_multicast_new_group() to return NULL: it
returns either a valid (non-NULL) pointer or an errno pointer.

OTOH, br_multicast_add_group() doesn't check for NULL either.
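
For readers unfamiliar with the errno-pointer convention, here is a small userspace sketch of it. The helpers `err_ptr()`, `ptr_err()` and `is_err()` are local stand-ins written for this example (the kernel's own versions are ERR_PTR(), PTR_ERR() and IS_ERR() in linux/err.h), and `new_group()` / `add_group()` are hypothetical, not the bridge code. The point is that a function following this convention never returns NULL, so callers test with the error check rather than with a NULL check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR/PTR_ERR/IS_ERR helpers. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int is_err(const void *ptr)
{
	/* Error codes live in the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

struct group {
	int id;
};

static struct group table[4];

/* Hypothetical allocator: a valid pointer or an errno pointer, never NULL. */
static struct group *new_group(int id)
{
	if (id < 0)
		return err_ptr(-EINVAL);
	if (id >= 4)
		return err_ptr(-ENOMEM);
	table[id].id = id;
	return &table[id];
}

/* Caller: checks is_err(), after which dereferencing is safe. */
static int add_group(int id)
{
	struct group *g = new_group(id);

	if (is_err(g))
		return (int)ptr_err(g);
	return g->id;
}
```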


* Re: [RFC] net : add tx timestamp to packet mmap.
From: Paul Chavent @ 2012-12-13  7:13 UTC
  To: David Miller; +Cc: edumazet, daniel.borkmann, xemul, ebiederm, netdev
In-Reply-To: <20121212.142327.2290797438095968580.davem@davemloft.net>


After a sendmsg(), we have to call recvmsg() on the error queue
(MSG_ERRQUEUE) to get the timestamp. I find that unfortunate indeed...

So this patch fixes the TX timestamping (which takes place in sendmsg())
so that the timestamp can be retrieved (via recvmsg()).

This seems suboptimal to me, which is why I also ask whether it would be
possible to put the timestamp in the ring buffer frame before giving it
back to the user.

Thanks for reading.


On 12/12/2012 08:23 PM, David Miller wrote:
>
> You're changing the code that handles sendmsg() and then wondering why
> a recvmsg() call doesn't provide a timestamp.
>


* Re: [PATCH v2] netfilter: nf_nat: Also handle non-ESTABLISHED routing changes in MASQUERADE
From: Jozsef Kadlecsik @ 2012-12-13  8:19 UTC
  To: Andrew Collins; +Cc: netfilter-devel, netdev
In-Reply-To: <1355358229-25167-1-git-send-email-bsderandrew@gmail.com>

On Wed, 12 Dec 2012, Andrew Collins wrote:

> The MASQUERADE target now handles routing changes which affect
> the output interface of a connection, but only for ESTABLISHED
> connections.  It is also possible for NEW connections which
> already have a conntrack entry to be affected by routing changes.
> 
> This adds a check to drop entries in the NEW+conntrack state
> when the oif has changed.
> 
> Signed-off-by: Andrew Collins <bsderandrew@gmail.com>

Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary


* [RFC PATCH] xfrm: avoid to send/receive the exceeding hard lifetime data
From: roy.qing.li @ 2012-12-13  8:25 UTC
  To: netdev

From: Li RongQing <roy.qing.li@gmail.com>

If setkey sets both bh and bs (the hard and soft byte lifetimes) to 1024,
then the packet that would push the total number of bytes sent and received
past 1024 should be discarded.

For example, if the first packet is 1000 bytes, it is sent successfully. If the
second packet is 500 bytes, 1000 + 500 is larger than 1024, so the second
packet should be discarded.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com> 
---
 net/xfrm/xfrm_input.c  |    6 +++---
 net/xfrm/xfrm_output.c |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index ab2bb42..d0de8f3 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -178,6 +178,9 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 			goto drop_unlock;
 		}
 
+		x->curlft.bytes += skb->len;
+		x->curlft.packets++;
+
 		if (xfrm_state_check_expire(x)) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATEEXPIRED);
 			goto drop_unlock;
@@ -219,9 +222,6 @@ resume:
 
 		x->repl->advance(x, seq);
 
-		x->curlft.bytes += skb->len;
-		x->curlft.packets++;
-
 		spin_unlock(&x->lock);
 
 		XFRM_MODE_SKB_CB(skb)->protocol = nexthdr;
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index 95a338c..0f38cb2 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -61,6 +61,9 @@ static int xfrm_output_one(struct sk_buff *skb, int err)
 		}
 
 		spin_lock_bh(&x->lock);
+
+		x->curlft.bytes += skb->len;
+		x->curlft.packets++;
 		err = xfrm_state_check_expire(x);
 		if (err) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTSTATEEXPIRED);
@@ -73,9 +76,6 @@ static int xfrm_output_one(struct sk_buff *skb, int err)
 			goto error;
 		}
 
-		x->curlft.bytes += skb->len;
-		x->curlft.packets++;
-
 		spin_unlock_bh(&x->lock);
 
 		skb_dst_force(skb);
-- 
1.7.5.4


* Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
From: Daniel Borkmann @ 2012-12-13  8:35 UTC
  To: ani; +Cc: Michael Richardson, netdev, tcpdump-workers, Francesco Ruggeri
In-Reply-To: <alpine.OSX.2.00.1212121205040.78903@animac.local>

On 12/12/2012 10:53 PM, Ani Sinha wrote:
>> unsigned int netdev_8021q_inskb = 1;
>>
>> ...
>> 	{
>> 		.ctl_name	= NET_CORE_8021q_INSKB,
>> 		.procname	= "netdev_8021q_inskb",
>> 		.data		= &netdev_8021q_inskb,
>> 		.maxlen		= sizeof(int),
>> 		.mode		= 0444,
>> 		.proc_handler	= proc_dointvec
>> 	},
>>
>> would seem to do it to me.
>> Then pcap can fopen("/proc/sys/net/core/netdev_8021q_inskb") and if it
>> finds it, and it is >0, then do the cmsg thing.
>>
>
> Does this work? This is just an experimental patch and by no means final.
> I just want to get an idea of what everyone thinks about it. Once we have
> debated and discussed it, I can cook up a final patch worth committing.
>
> Also, instead of having this /proc interface, we could perhaps check for a
> specific kernel version that:
>
> (a) has the vlan tag info in the skb metadata (as opposed to in the packet
> itself)
> (b) has the following patch, which adds the capability to generate a filter
> based on the tag value:
>
> commit f3335031b9452baebfe49b8b5e55d3fe0c4677d1
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Sat Oct 27 02:26:17 2012 +0000
>
>      net: filter: add vlan tag access
>
> We need both of the above for userland to generate filter code that
> compares vlan tag values in the skb metadata. For kernels that have the
> vlan tag in the skb metadata but do not have commit (b) above, there is
> nothing that can be done. For older kernels that have the vlan tag info in
> the packet itself, the filter code can be generated differently to look at
> specific offsets within the packet (something that libpcap does
> currently).
>
> We have already ruled out the idea of generating a filter and trying to
> load and see if that fails (see previous emails on this thread).
>
> Hope this makes sense.

I think it doesn't, because then you are obviously considering adding one
procfs file to /proc/sys/net/core/ *for each* feature that is added to
the ancillary ops, which cannot be the right way ...

> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index c45eabc..91e2ba3 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -36,6 +36,7 @@ static inline unsigned int sk_filter_len(const struct sk_filter *fp)
>   	return fp->len * sizeof(struct sock_filter) + sizeof(*fp);
>   }
>
> +extern bool sysctl_8021q_inskb;
>   extern int sk_filter(struct sock *sk, struct sk_buff *skb);
>   extern unsigned int sk_run_filter(const struct sk_buff *skb,
>   				  const struct sock_filter *filter);
> diff --git a/net/core/filter.c b/net/core/filter.c
> index c23543c..4f5a657 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -41,6 +41,8 @@
>   #include <linux/seccomp.h>
>   #include <linux/if_vlan.h>
>
> +bool sysctl_8021q_inskb = 1;
> +
>   /* No hurry in this branch
>    *
>    * Exported for the bpf jit load helper.
> diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
> index d1b0804..f9a3700 100644
> --- a/net/core/sysctl_net_core.c
> +++ b/net/core/sysctl_net_core.c
> @@ -15,6 +15,7 @@
>   #include <linux/init.h>
>   #include <linux/slab.h>
>   #include <linux/kmemleak.h>
> +#include <linux/filter.h>
>
>   #include <net/ip.h>
>   #include <net/sock.h>
> @@ -189,6 +190,13 @@ static struct ctl_table net_core_table[] = {
>   		.mode		= 0644,
>   		.proc_handler	= proc_dointvec
>   	},
> +	{
> +		.procname	= "8021q_inskb",
> +		.data		= &sysctl_8021q_inskb,
> +		.maxlen		= sizeof(bool),
> +		.mode		= 0444,
> +		.proc_handler	= proc_dointvec
> +	},
>   	{ }
>   };


* [PATCH] xfrm: do not check x->km.state
From: roy.qing.li @ 2012-12-13  9:06 UTC
  To: netdev

From: Li RongQing <roy.qing.li@gmail.com>

Do not check x->km.state; it will be checked by the subsequent
xfrm_state_check_expire() call.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
 net/ipv6/xfrm6_input.c |    1 -
 net/xfrm/xfrm_input.c  |    4 ----
 2 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
index f8c3cf8..de4babd 100644
--- a/net/ipv6/xfrm6_input.c
+++ b/net/ipv6/xfrm6_input.c
@@ -108,7 +108,6 @@ int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr,
 		spin_lock(&x->lock);
 
 		if ((!i || (x->props.flags & XFRM_STATE_WILDRECV)) &&
-		    likely(x->km.state == XFRM_STATE_VALID) &&
 		    !xfrm_state_check_expire(x)) {
 			spin_unlock(&x->lock);
 			if (x->type->input(x, skb) > 0) {
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index ab2bb42..a8fbb09 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -163,10 +163,6 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
 		skb->sp->xvec[skb->sp->len++] = x;
 
 		spin_lock(&x->lock);
-		if (unlikely(x->km.state != XFRM_STATE_VALID)) {
-			XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATEINVALID);
-			goto drop_unlock;
-		}
 
 		if ((x->encap ? x->encap->encap_type : 0) != encap_type) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATEMISMATCH);
-- 
1.7.5.4


* [PATCH iproute2 v2] ip: use rtnetlink to manage mroute
From: Nicolas Dichtel @ 2012-12-13  9:16 UTC
  To: shemminger; +Cc: netdev, Nicolas Dichtel
In-Reply-To: <20121212102617.0a3249e4@nehalam.linuxnetplumber.net>

mroute was using /proc/net/ip_mr_[vif|cache] to display mroute entries. Hence,
only RT_TABLE_DEFAULT was displayed, and only for IPv4.
With rtnetlink, it is possible to display all tables for IPv4 and IPv6. The output
format is kept. Also, as before the patch, statistics are displayed when the user
specifies the '-s' argument.

The patch also adds support for 'ip monitor mroute', which is now possible.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---

v2: fix compilation warnings on 64-bit architectures

 ip/ip_common.h |   3 +
 ip/ipmonitor.c |  35 ++++++-
 ip/ipmroute.c  | 297 ++++++++++++++++++++++++++++++++++-----------------------
 3 files changed, 214 insertions(+), 121 deletions(-)
 create mode 100644 ip/ipmonitor.

diff --git a/ip/ip_common.h b/ip/ip_common.h
index a394669..de56810 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -16,11 +16,14 @@ extern int ipaddr_list_link(int argc, char **argv);
 extern int iproute_monitor(int argc, char **argv);
 extern void iplink_usage(void) __attribute__((noreturn));
 extern void iproute_reset_filter(void);
+extern void ipmroute_reset_filter(void);
 extern void ipaddr_reset_filter(int);
 extern void ipneigh_reset_filter(void);
 extern void ipntable_reset_filter(void);
 extern int print_route(const struct sockaddr_nl *who,
 		       struct nlmsghdr *n, void *arg);
+extern int print_mroute(const struct sockaddr_nl *who,
+			struct nlmsghdr *n, void *arg);
 extern int print_prefix(const struct sockaddr_nl *who,
 			struct nlmsghdr *n, void *arg);
 extern int print_rule(const struct sockaddr_nl *who,
diff --git a/ip/ipmonitor. b/ip/ipmonitor.
new file mode 100644
index 0000000..e69de29
diff --git a/ip/ipmonitor.c b/ip/ipmonitor.c
index d971623..09a339c 100644
--- a/ip/ipmonitor.c
+++ b/ip/ipmonitor.c
@@ -43,10 +43,26 @@ int accept_msg(const struct sockaddr_nl *who,
 		print_timestamp(fp);
 
 	if (n->nlmsg_type == RTM_NEWROUTE || n->nlmsg_type == RTM_DELROUTE) {
-		if (prefix_banner)
-			fprintf(fp, "[ROUTE]");
-		print_route(who, n, arg);
-		return 0;
+		struct rtmsg *r = NLMSG_DATA(n);
+		int len = n->nlmsg_len - NLMSG_LENGTH(sizeof(*r));
+
+		if (len < 0) {
+			fprintf(stderr, "BUG: wrong nlmsg len %d\n", len);
+			return -1;
+		}
+
+		if (r->rtm_family == RTNL_FAMILY_IPMR ||
+		    r->rtm_family == RTNL_FAMILY_IP6MR) {
+			if (prefix_banner)
+				fprintf(fp, "[MROUTE]");
+			print_mroute(who, n, arg);
+			return 0;
+		} else {
+			if (prefix_banner)
+				fprintf(fp, "[ROUTE]");
+			print_route(who, n, arg);
+			return 0;
+		}
 	}
 	if (n->nlmsg_type == RTM_NEWLINK || n->nlmsg_type == RTM_DELLINK) {
 		ll_remember_index(who, n, NULL);
@@ -123,6 +139,7 @@ int do_ipmonitor(int argc, char **argv)
 	int llink=0;
 	int laddr=0;
 	int lroute=0;
+	int lmroute=0;
 	int lprefix=0;
 	int lneigh=0;
 	int lnetconf=0;
@@ -130,6 +147,7 @@ int do_ipmonitor(int argc, char **argv)
 	rtnl_close(&rth);
 	ipaddr_reset_filter(1);
 	iproute_reset_filter();
+	ipmroute_reset_filter();
 	ipneigh_reset_filter();
 
 	while (argc > 0) {
@@ -145,6 +163,9 @@ int do_ipmonitor(int argc, char **argv)
 		} else if (matches(*argv, "route") == 0) {
 			lroute=1;
 			groups = 0;
+		} else if (matches(*argv, "mroute") == 0) {
+			lmroute=1;
+			groups = 0;
 		} else if (matches(*argv, "prefix") == 0) {
 			lprefix=1;
 			groups = 0;
@@ -180,6 +201,12 @@ int do_ipmonitor(int argc, char **argv)
 		if (!preferred_family || preferred_family == AF_INET6)
 			groups |= nl_mgrp(RTNLGRP_IPV6_ROUTE);
 	}
+	if (lmroute) {
+		if (!preferred_family || preferred_family == AF_INET)
+			groups |= nl_mgrp(RTNLGRP_IPV4_MROUTE);
+		if (!preferred_family || preferred_family == AF_INET6)
+			groups |= nl_mgrp(RTNLGRP_IPV6_MROUTE);
+	}
 	if (lprefix) {
 		if (!preferred_family || preferred_family == AF_INET6)
 			groups |= nl_mgrp(RTNLGRP_IPV6_PREFIX);
diff --git a/ip/ipmroute.c b/ip/ipmroute.c
index 945727d..defcfc5 100644
--- a/ip/ipmroute.c
+++ b/ip/ipmroute.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <syslog.h>
 #include <fcntl.h>
+#include <inttypes.h>
 #include <sys/ioctl.h>
 #include <sys/socket.h>
 #include <netinet/in.h>
@@ -26,167 +27,229 @@
 #include <linux/if_arp.h>
 #include <linux/sockios.h>
 
+#include <rt_names.h>
 #include "utils.h"
-
-char filter_dev[16];
-int  filter_family;
+#include "ip_common.h"
 
 static void usage(void) __attribute__((noreturn));
 
 static void usage(void)
 {
-	fprintf(stderr, "Usage: ip mroute show [ PREFIX ] [ from PREFIX ] [ iif DEVICE ]\n");
+	fprintf(stderr, "Usage: ip mroute show [ [ to ] PREFIX ] [ from PREFIX ] [ iif DEVICE ]\n");
 #if 0
 	fprintf(stderr, "Usage: ip mroute [ add | del ] DESTINATION from SOURCE [ iif DEVICE ] [ oif DEVICE ]\n");
 #endif
 	exit(-1);
 }
 
-static char *viftable[32];
-
 struct rtfilter
 {
+	int tb;
+	int af;
+	int iif;
 	inet_prefix mdst;
 	inet_prefix msrc;
 } filter;
 
-static void read_viftable(void)
+int print_mroute(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 {
-	char buf[256];
-	FILE *fp = fopen("/proc/net/ip_mr_vif", "r");
-
-	if (!fp)
-		return;
-
-	if (!fgets(buf, sizeof(buf), fp)) {
-		fclose(fp);
-		return;
+	FILE *fp = (FILE*)arg;
+	struct rtmsg *r = NLMSG_DATA(n);
+	int len = n->nlmsg_len;
+	struct rtattr * tb[RTA_MAX+1];
+	char abuf[256];
+	char obuf[256];
+	SPRINT_BUF(b1);
+	__u32 table;
+	int iif = 0;
+	int family;
+
+	if ((n->nlmsg_type != RTM_NEWROUTE &&
+	     n->nlmsg_type != RTM_DELROUTE) ||
+	    !(n->nlmsg_flags & NLM_F_MULTI)) {
+		fprintf(stderr, "Not a multicast route: %08x %08x %08x\n",
+			n->nlmsg_len, n->nlmsg_type, n->nlmsg_flags);
+		return 0;
 	}
-	while (fgets(buf, sizeof(buf), fp)) {
-		int vifi;
-		char dev[256];
-
-		if (sscanf(buf, "%d%s", &vifi, dev) < 2)
-			continue;
-
-		if (vifi<0 || vifi>31)
-			continue;
-
-		viftable[vifi] = strdup(dev);
+	len -= NLMSG_LENGTH(sizeof(*r));
+	if (len < 0) {
+		fprintf(stderr, "BUG: wrong nlmsg len %d\n", len);
+		return -1;
 	}
-	fclose(fp);
-}
-
-static void read_mroute_list(FILE *ofp)
-{
-	char buf[256];
-	FILE *fp = fopen("/proc/net/ip_mr_cache", "r");
-
-	if (!fp)
-		return;
-
-	if (!fgets(buf, sizeof(buf), fp)) {
-		fclose(fp);
-		return;
+	if (r->rtm_type != RTN_MULTICAST) {
+		fprintf(stderr, "Not a multicast route (type: %s)\n",
+			rtnl_rtntype_n2a(r->rtm_type, b1, sizeof(b1)));
+		return 0;
 	}
 
-	while (fgets(buf, sizeof(buf), fp)) {
-		inet_prefix maddr, msrc;
-		unsigned pkts, b, w;
-		int vifi;
-		char oiflist[256];
-		char sbuf[256];
-		char mbuf[256];
-		char obuf[256];
-
-		oiflist[0] = 0;
-		if (sscanf(buf, "%x%x%d%u%u%u %[^\n]",
-			   maddr.data, msrc.data, &vifi,
-			   &pkts, &b, &w, oiflist) < 6)
-			continue;
-
-		if (vifi!=-1 && (vifi < 0 || vifi>31))
-			continue;
-
-		if (filter_dev[0] && (vifi<0 || strcmp(filter_dev, viftable[vifi])))
-			continue;
-		if (filter.mdst.family && inet_addr_match(&maddr, &filter.mdst, filter.mdst.bitlen))
-			continue;
-		if (filter.msrc.family && inet_addr_match(&msrc, &filter.msrc, filter.msrc.bitlen))
-			continue;
-
-		snprintf(obuf, sizeof(obuf), "(%s, %s)",
-			 format_host(AF_INET, 4, &msrc.data[0], sbuf, sizeof(sbuf)),
-			 format_host(AF_INET, 4, &maddr.data[0], mbuf, sizeof(mbuf)));
-
-		fprintf(ofp, "%-32s Iif: ", obuf);
-
-		if (vifi == -1)
-			fprintf(ofp, "unresolved ");
-		else
-			fprintf(ofp, "%-10s ", viftable[vifi]);
-
-		if (oiflist[0]) {
-			char *next = NULL;
-			char *p = oiflist;
-			int ovifi, ottl;
-
-			fprintf(ofp, "Oifs: ");
-
-			while (p) {
-				next = strchr(p, ' ');
-				if (next) {
-					*next = 0;
-					next++;
-				}
-				if (sscanf(p, "%d:%d", &ovifi, &ottl)<2) {
-					p = next;
-					continue;
-				}
-				p = next;
-
-				fprintf(ofp, "%s", viftable[ovifi]);
-				if (ottl>1)
-					fprintf(ofp, "(ttl %d) ", ovifi);
-				else
-					fprintf(ofp, " ");
+	parse_rtattr(tb, RTA_MAX, RTM_RTA(r), len);
+	table = rtm_get_table(r, tb);
+
+	if (filter.tb > 0 && filter.tb != table)
+		return 0;
+
+	if (tb[RTA_IIF])
+		iif = *(int*)RTA_DATA(tb[RTA_IIF]);
+	if (filter.iif && filter.iif != iif)
+		return 0;
+
+	if (filter.af && filter.af != r->rtm_family)
+		return 0;
+
+	if (tb[RTA_DST] &&
+	    filter.mdst.bitlen > 0 &&
+	    inet_addr_match(RTA_DATA(tb[RTA_DST]), &filter.mdst, filter.mdst.bitlen))
+		return 0;
+
+	if (tb[RTA_SRC] &&
+	    filter.msrc.bitlen > 0 &&
+	    inet_addr_match(RTA_DATA(tb[RTA_SRC]), &filter.msrc, filter.msrc.bitlen))
+		return 0;
+
+	family = r->rtm_family == RTNL_FAMILY_IPMR ? AF_INET : AF_INET6;
+
+	if (n->nlmsg_type == RTM_DELROUTE)
+		fprintf(fp, "Deleted ");
+
+	if (tb[RTA_SRC])
+		len = snprintf(obuf, sizeof(obuf),
+			       "(%s, ", rt_addr_n2a(family,
+						    RTA_PAYLOAD(tb[RTA_SRC]),
+						    RTA_DATA(tb[RTA_SRC]),
+						    abuf, sizeof(abuf)));
+	else
+		len = sprintf(obuf, "(unknown, ");
+	if (tb[RTA_DST])
+		snprintf(obuf + len, sizeof(obuf) - len,
+			 "%s)", rt_addr_n2a(family, RTA_PAYLOAD(tb[RTA_DST]),
+					    RTA_DATA(tb[RTA_DST]),
+					    abuf, sizeof(abuf)));
+	else
+		snprintf(obuf + len, sizeof(obuf) - len, "unknown) ");
+
+	fprintf(fp, "%-32s Iif: ", obuf);
+	if (iif)
+		fprintf(fp, "%-10s ", ll_index_to_name(iif));
+	else
+		fprintf(fp, "unresolved ");
+
+	if (tb[RTA_MULTIPATH]) {
+		struct rtnexthop *nh = RTA_DATA(tb[RTA_MULTIPATH]);
+		int first = 1;
+
+		len = RTA_PAYLOAD(tb[RTA_MULTIPATH]);
+
+		for (;;) {
+			if (len < sizeof(*nh))
+				break;
+			if (nh->rtnh_len > len)
+				break;
+
+			if (first) {
+				fprintf(fp, "Oifs: ");
+				first = 0;
 			}
+			fprintf(fp, "%s", ll_index_to_name(nh->rtnh_ifindex));
+			if (nh->rtnh_hops > 1)
+				fprintf(fp, "(ttl %d) ", nh->rtnh_hops);
+			else
+				fprintf(fp, " ");
+			len -= NLMSG_ALIGN(nh->rtnh_len);
+			nh = RTNH_NEXT(nh);
 		}
-
-		if (show_stats && b) {
-			fprintf(ofp, "%s  %u packets, %u bytes", _SL_, pkts, b);
-			if (w)
-				fprintf(ofp, ", %u arrived on wrong iif.", w);
-		}
-		fprintf(ofp, "\n");
 	}
-	fclose(fp);
+	if (show_stats && tb[RTA_MFC_STATS]) {
+		struct rta_mfc_stats *mfcs = RTA_DATA(tb[RTA_MFC_STATS]);
+
+		fprintf(fp, "%s  %"PRIu64" packets, %"PRIu64" bytes", _SL_,
+			(uint64_t)mfcs->mfcs_packets,
+			(uint64_t)mfcs->mfcs_bytes);
+		if (mfcs->mfcs_wrong_if)
+			fprintf(fp, ", %"PRIu64" arrived on wrong iif.",
+				(uint64_t)mfcs->mfcs_wrong_if);
+	}
+	fprintf(fp, "\n");
+	fflush(fp);
+	return 0;
 }
 
+void ipmroute_reset_filter(void)
+{
+	memset(&filter, 0, sizeof(filter));
+	filter.mdst.bitlen = -1;
+	filter.msrc.bitlen = -1;
+}
 
 static int mroute_list(int argc, char **argv)
 {
+	char *id = NULL;
+	int family;
+
+	ipmroute_reset_filter();
+	if (preferred_family == AF_UNSPEC)
+		family = AF_INET;
+	else
+		family = AF_INET6;
+	if (family == AF_INET) {
+		filter.af = RTNL_FAMILY_IPMR;
+		filter.tb = RT_TABLE_DEFAULT;  /* for backward compatibility */
+	} else
+		filter.af = RTNL_FAMILY_IP6MR;
+
 	while (argc > 0) {
-		if (strcmp(*argv, "iif") == 0) {
+		if (matches(*argv, "table") == 0) {
+			__u32 tid;
 			NEXT_ARG();
-			strncpy(filter_dev, *argv, sizeof(filter_dev)-1);
+			if (rtnl_rttable_a2n(&tid, *argv)) {
+				if (strcmp(*argv, "all") == 0) {
+					filter.tb = 0;
+				} else if (strcmp(*argv, "help") == 0) {
+					usage();
+				} else {
+					invarg("table id value is invalid\n", *argv);
+				}
+			} else
+				filter.tb = tid;
+		} else if (strcmp(*argv, "iif") == 0) {
+			NEXT_ARG();
+			id = *argv;
 		} else if (matches(*argv, "from") == 0) {
 			NEXT_ARG();
-			get_prefix(&filter.msrc, *argv, AF_INET);
+			get_prefix(&filter.msrc, *argv, family);
 		} else {
 			if (strcmp(*argv, "to") == 0) {
 				NEXT_ARG();
 			}
 			if (matches(*argv, "help") == 0)
 				usage();
-			get_prefix(&filter.mdst, *argv, AF_INET);
+			get_prefix(&filter.mdst, *argv, family);
 		}
-		argv++; argc--;
+		argc--; argv++;
 	}
 
-	read_viftable();
-	read_mroute_list(stdout);
-	return 0;
+	ll_init_map(&rth);
+
+	if (id)  {
+		int idx;
+
+		if ((idx = ll_name_to_index(id)) == 0) {
+			fprintf(stderr, "Cannot find device \"%s\"\n", id);
+			return -1;
+		}
+		filter.iif = idx;
+	}
+
+	if (rtnl_wilddump_request(&rth, filter.af, RTM_GETROUTE) < 0) {
+		perror("Cannot send dump request");
+		return 1;
+	}
+
+	if (rtnl_dump_filter(&rth, print_mroute, stdout) < 0) {
+		fprintf(stderr, "Dump terminated\n");
+		exit(1);
+	}
+
+	exit(0);
 }
 
 int do_multiroute(int argc, char **argv)
-- 
1.8.0.1


* [PATCH] return the devices dummyX to the initial network namespace after container closure.
From: V. Lavrov @ 2012-12-13  9:13 UTC (permalink / raw)
  To: netdev

If a container has a dummyX network device (with lxc.network.type = phys), the device disappears from the system when the container is closed.
This patch returns the dummyX device to the initial network namespace when the container is closed.

Signed-off-by: Vitaly Lavrov <lve@guap.ru>
---
diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index bab0158..efa990c 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -160,6 +160,41 @@ static struct rtnl_link_ops dummy_link_ops __read_mostly = {
  module_param(numdummies, int, 0);
  MODULE_PARM_DESC(numdummies, "Number of dummy pseudo devices");

+
+static void __net_exit dummy_net_exit(struct net *net) {
+       struct net_device *dev, *aux;
+       int err;
+
+       if(net == &init_net) return;
+
+       rtnl_lock();
+       for_each_netdev_safe(net, dev, aux) {
+               if(dev->rtnl_link_ops == &dummy_link_ops) {
+                       err = dev_change_net_namespace(dev, &init_net, dev->name);
+                       if(err) {
+                               char fb_name[IFNAMSIZ];
+                               printk (KERN_INFO "%s: dev_change_net_namespace(init_net,%s) err: %d\n",
+                                       __func__,dev->name,err);
+                               snprintf(fb_name, IFNAMSIZ, "dev%d", dev->ifindex);
+                               err = dev_change_net_namespace(dev, &init_net, fb_name);
+                               if(err)
+                                       printk (KERN_INFO "%s: dev_change_net_namespace(%s,init_net,%s) err: %d\n",
+                                               __func__,dev->name,fb_name,err);
+                               else
+                                       printk (KERN_INFO "%s: %s rename to %s\n",
+                                               __func__,dev->name,fb_name);
+
+                       }
+               }
+       }
+       rtnl_unlock();
+}
+
+static struct pernet_operations __net_initdata dummy_net_ops = {
+       .exit = dummy_net_exit,
+};
+
+
  static int __init dummy_init_one(void)
  {
         struct net_device *dev_dummy;
@@ -184,6 +219,10 @@ static int __init dummy_init_module(void)
  {
         int i, err = 0;

+       err = register_pernet_device(&dummy_net_ops);
+       if(err)
+               return err;
+
         rtnl_lock();
         err = __rtnl_link_register(&dummy_link_ops);

@@ -191,8 +230,10 @@ static int __init dummy_init_module(void)
                 err = dummy_init_one();
                 cond_resched();
         }
-       if (err < 0)
+       if (err < 0) {
                 __rtnl_link_unregister(&dummy_link_ops);
+               unregister_pernet_device(&dummy_net_ops);
+       }
         rtnl_unlock();

         return err;
@@ -201,6 +242,7 @@ static int __init dummy_init_module(void)
  static void __exit dummy_cleanup_module(void)
  {
         rtnl_link_unregister(&dummy_link_ops);
+       unregister_pernet_device(&dummy_net_ops);
  }

  module_init(dummy_init_module);
--


* Re: [RFC PATCH] xfrm: avoid to send/receive the exceeding hard lifetime data
From: Steffen Klassert @ 2012-12-13 10:14 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1355387152-9963-1-git-send-email-roy.qing.li@gmail.com>

On Thu, Dec 13, 2012 at 04:25:52PM +0800, roy.qing.li@gmail.com wrote:
> From: Li RongQing <roy.qing.li@gmail.com>
> 
> If setkey sets both bh and bs to 1024, and the total size of the packets sent
> and received would exceed 1024 with the next packet, then that packet should be
> discarded.
> 
> For example, the first packet is 1000 bytes and is sent successfully. The second
> packet is 500 bytes; 1000+500 is larger than 1024, so the second packet should be discarded.
> 
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> 
> ---
>  net/xfrm/xfrm_input.c  |    6 +++---
>  net/xfrm/xfrm_output.c |    6 +++---
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> index ab2bb42..d0de8f3 100644
> --- a/net/xfrm/xfrm_input.c
> +++ b/net/xfrm/xfrm_input.c
> @@ -178,6 +178,9 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
>  			goto drop_unlock;
>  		}
>  
> +		x->curlft.bytes += skb->len;
> +		x->curlft.packets++;
> +

This is a bit critical on input. We should only increment these values
if the integrity check on this packet was successful. Otherwise someone
could spam us with invalid packets and trigger a state expiry.

If a synchronous crypto algorithm is used, we send at most one packet too
many. The maximal byte count was not yet reached and RFC 2401 says not
much on how to handle the packet that reaches the maximal byte count,
so this is probably ok.

But if an asynchronous crypto algorithm is used, we can send many
packets beyond the limit. So we should probably add a second expiry check
after resume from asynchronous crypto. We do this already with the replay
check.
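
To illustrate the ordering concern, here is a small userspace model. It is only
a sketch: struct demo_state and demo_input() are hypothetical names, not the
kernel's struct xfrm_state or xfrm_input(). Bytes are accounted only after the
integrity check has passed, so forged packets cannot use up the lifetime, and
the packet that reaches the hard limit is still let through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an SA's byte lifetime. */
struct demo_state {
	uint64_t cur_bytes;       /* bytes accounted so far */
	uint64_t hard_byte_limit; /* hard lifetime in bytes */
	bool     expired;
};

/* Returns true if the packet is accepted and accounted. */
static bool demo_input(struct demo_state *x, uint32_t pkt_len, bool integrity_ok)
{
	assert(x != NULL);

	if (x->expired)
		return false;
	/* Forged packets must not consume the lifetime budget. */
	if (!integrity_ok)
		return false;

	x->cur_bytes += pkt_len;
	if (x->cur_bytes >= x->hard_byte_limit)
		x->expired = true; /* the packet that reaches the limit still passes */
	return true;
}
```

With a limit of 1024 bytes, packets of 1000 and 500 bytes are both accepted and
the state then expires, while any number of packets failing the integrity check
leaves cur_bytes unchanged.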


* Re: [PATCH] xfrm: do not check x->km.state
From: Steffen Klassert @ 2012-12-13 10:19 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1355389560-7705-1-git-send-email-roy.qing.li@gmail.com>

On Thu, Dec 13, 2012 at 05:06:00PM +0800, roy.qing.li@gmail.com wrote:
> From: Li RongQing <roy.qing.li@gmail.com>
> 
> Do not check x->km.state; it will be checked by the subsequent
> xfrm_state_check_expire()
> 
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
> ---
>  net/ipv6/xfrm6_input.c |    1 -
>  net/xfrm/xfrm_input.c  |    4 ----
>  2 files changed, 0 insertions(+), 5 deletions(-)
> 
> diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
> index f8c3cf8..de4babd 100644
> --- a/net/ipv6/xfrm6_input.c
> +++ b/net/ipv6/xfrm6_input.c
> @@ -108,7 +108,6 @@ int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr,
>  		spin_lock(&x->lock);
>  
>  		if ((!i || (x->props.flags & XFRM_STATE_WILDRECV)) &&
> -		    likely(x->km.state == XFRM_STATE_VALID) &&
>  		    !xfrm_state_check_expire(x)) {
>  			spin_unlock(&x->lock);
>  			if (x->type->input(x, skb) > 0) {
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> index ab2bb42..a8fbb09 100644
> --- a/net/xfrm/xfrm_input.c
> +++ b/net/xfrm/xfrm_input.c
> @@ -163,10 +163,6 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
>  		skb->sp->xvec[skb->sp->len++] = x;
>  
>  		spin_lock(&x->lock);
> -		if (unlikely(x->km.state != XFRM_STATE_VALID)) {
> -			XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATEINVALID);
> -			goto drop_unlock;
> -		}


This would remove the only place where the LINUX_MIB_XFRMINSTATEINVALID
statistics counter is incremented. I think it would be better to ensure
a valid state before we call xfrm_state_check_expire(). This would make
the statistics more accurate and we can remove the x->km.state check
from xfrm_state_check_expire().


* Re: netconsole fun
From: Cong Wang @ 2012-12-13 10:33 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1355345957.2687.18.camel@thor>

On Wed, 12 Dec 2012 at 20:59 GMT, Peter Hurley <peter@hurleysoftware.com> wrote:
>
> Just wondering if you think something like the patch below is
> suitable/acceptable for insulating netconsole from inconsistent device
> name scenarios without changing the existing semantics. The basic idea
> is to allow an ethernet MAC address in the <dev> field of the
> netconsole= options, and if a MAC address was specified rather than a
> device name, to do the dev lookup from the MAC address instead.
>
> This doesn't extend to, but also doesn't interfere with, the dynamic
> config of netconsole via configfs.
>
> Would you mind reviewing it?
>

This is a good idea. Just that you need to complete the configfs
interface too.


* Re: tc ipt action
From: Jamal Hadi Salim @ 2012-12-13 10:58 UTC (permalink / raw)
  To: Yury Stankevich; +Cc: netdev@vger.kernel.org, pablo
In-Reply-To: <50C4821D.5090206@gmail.com>

Yury,

This appears to be an ABI breakage on iptables/netfilter side.
I will look at it (and hopefully fix it) over the weekend.

cheers,
jamal

On 12-12-09 07:20 AM, Yury Stankevich wrote:
> Hello,
>
> I am not sure this is the correct list; please advise if not.
>
> i'm trying to use ipt action, and got a problem:
>
> #tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0
> action ipt -j CONNMARK --restore-mark action mirred egress redirect dev ifb0
> -> bad action type ipt
>
> from strace:
> open("/usr/lib/tc//m_gact.so", O_RDONLY) = -1 ENOENT (No such file or
> directory)
> write(2, "bad action type ipt\n", 20bad action type ipt
>
> well. i'm trying to use xt:
> #tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0
> action xt -j CONNMARK --restore-mark action mirred egress redirect dev ifb0
> xt: unrecognized option '--restore-mark'
>
> from strace:
> open("/lib/xtables/libxt_CONNMARK.so", O_RDONLY) = 4
> read(4,
> "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200\6\0\0004\0\0\0"...,
> 512) = 512
> fstat64(4, {st_mode=S_IFREG|0644, st_size=9756, ...}) = 0
> mmap2(NULL, 12548, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0)
> = 0xf76f3000
> mmap2(0xf76f5000, 8192, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x1) = 0xf76f5000
> close(4)                                = 0
> mprotect(0xf76f5000, 4096, PROT_READ)   = 0
> socket(PF_INET, SOCK_RAW, IPPROTO_RAW)  = 4
> fcntl64(4, F_SETFD, FD_CLOEXEC)         = 0
> lstat64("/proc/net/ip_tables_names", {st_mode=S_IFREG|0440, st_size=0,
> ...}) = 0
> statfs64("/proc/net/ip_tables_names", 84, {f_type="PROC_SUPER_MAGIC",
> f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0,
> f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
> getsockopt(4, SOL_IP, 0x43 /* IP_??? */,
> "CONNMARK\0\367\f\300\0\0\0po\367l8p\367\364/p\367:}\302\1", [30]) = 0
> close(4)                                = 0
> write(2, "xt: unrecognized option '--resto"..., 41xt: unrecognized
> option '--restore-mark'
>
> So... am I doing something wrong, or is this a bug?
>
> ps: 3.6.8 kernel 64 bit kernel with 32 bit userspace, iproute 20121001
> from debian-experimental,
> module act_ipt is loaded.
> pps: please, cc me in reply.
>
>


* Re: [PATCH 1/1] net: cpts: fix for build break after ARM SoC integration
From: Tomi Valkeinen @ 2012-12-13 11:07 UTC (permalink / raw)
  To: Mugunthan V N
  Cc: netdev, davem, linux-arm-kernel, linux-omap, b-cousson, paul,
	richardcochran
In-Reply-To: <1354012034-31686-1-git-send-email-mugunthanvnm@ti.com>

Hi,

On 2012-11-27 12:27, Mugunthan V N wrote:
>   CC      drivers/net/ethernet/ti/cpts.o
> drivers/net/ethernet/ti/cpts.c:30:24: fatal error: plat/clock.h: No such file or directory
> compilation terminated.
> make[4]: *** [drivers/net/ethernet/ti/cpts.o] Error 1
> make[3]: *** [drivers/net/ethernet/ti] Error 2
> make[2]: *** [drivers/net/ethernet] Error 2
> make[1]: *** [drivers/net] Error 2
> 
> fix for build break as the header file is removed from plat-omap as part of
> the below patch

linux-next still has this build problem, I guess this patch is lingering
somewhere. Somewhat annoying, as the driver is enabled by default. (btw,
why is it "default y"?)

 Tomi


* [PATCH v2] ipv6: Change skb->data before using icmpv6_notify() to propagate redirect
From: Duan Jiong @ 2012-12-13 11:21 UTC (permalink / raw)
  To: davem; +Cc: Steffen Klassert, netdev


In function ndisc_redirect_rcv(), skb->data points to the transport
header, but icmpv6_notify() needs skb->data to point to the
inner IP packet. So before using icmpv6_notify() to propagate the redirect,
change skb->data to point to the inner IP packet that triggered the sending
of the Redirect, and introduce struct rd_msg to make this easy.

Many thanks to Steffen Klassert.

Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
---
 include/net/ndisc.h |    7 +++++++
 net/ipv6/ndisc.c    |   22 ++++++++++++++++++++++
 2 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/include/net/ndisc.h b/include/net/ndisc.h
index 980d263..6b305d7 100644
--- a/include/net/ndisc.h
+++ b/include/net/ndisc.h
@@ -78,6 +78,13 @@ struct ra_msg {
 	__be32			retrans_timer;
 };
 
+struct rd_msg {
+	struct icmp6hdr icmph;
+	struct in6_addr	target;
+	struct in6_addr	dest;
+	__u8		opt[0];
+};
+
 struct nd_opt_hdr {
 	__u8		nd_opt_type;
 	__u8		nd_opt_len;
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 2edce30..03deabc 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1333,6 +1333,12 @@ out:
 
 static void ndisc_redirect_rcv(struct sk_buff *skb)
 {
+	u8 *hdr;
+	struct ndisc_options ndopts;
+	struct rd_msg *msg = (struct rd_msg *)skb_transport_header(skb);
+	u32 ndoptlen = skb->tail - (skb->transport_header +
+				    offsetof(struct rd_msg, opt));
+
 #ifdef CONFIG_IPV6_NDISC_NODETYPE
 	switch (skb->ndisc_nodetype) {
 	case NDISC_NODETYPE_HOST:
@@ -1349,6 +1355,22 @@ static void ndisc_redirect_rcv(struct sk_buff *skb)
 		return;
 	}
 
+	if (!ndisc_parse_options(msg->opt, ndoptlen, &ndopts)) {
+		ND_PRINTK(2, warn,
+			  "Redirect: invalid ND options\n");
+		return;
+	}
+
+	if (!ndopts.nd_opts_rh) {
+		return;
+	}
+
+	hdr = (u8 *)ndopts.nd_opts_rh;
+	hdr += 8;
+	if(!pskb_pull(skb, hdr - skb_transport_header(skb))) {
+		return;
+	}
+
 	icmpv6_notify(skb, NDISC_REDIRECT, 0, 0);
 }
 
-- 
1.7.1
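
The pointer arithmetic in the patch above (find the Redirected Header option,
then skip 8 bytes) can be sketched in userspace as follows. This is an
illustration only: demo_redirect_inner_offset() and the DEMO_* constants are
hypothetical names, and the layout assumed is the one from RFC 4861 (an 8-byte
ICMPv6 header, two 16-byte addresses, then options whose length field counts
8-octet units).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RD_FIXED_LEN   40  /* 8-byte ICMPv6 header + target + destination */
#define DEMO_OPT_REDIRECTED  4  /* ND option type: Redirected Header */

/*
 * Return the offset of the inner IP packet inside an ICMPv6 Redirect
 * message, or 0 if there is no well-formed Redirected Header option.
 */
static size_t demo_redirect_inner_offset(const uint8_t *msg, size_t len)
{
	size_t off = DEMO_RD_FIXED_LEN;

	assert(msg != NULL);

	while (len >= off + 2) {
		uint8_t type = msg[off];
		size_t optlen = (size_t)msg[off + 1] * 8; /* 8-octet units */

		if (optlen == 0 || optlen > len - off)
			return 0;                 /* malformed option */
		if (type == DEMO_OPT_REDIRECTED)
			/* skip type, length and 6 reserved bytes */
			return optlen > 8 ? off + 8 : 0;
		off += optlen;
	}
	return 0;
}
```

The returned offset plays the role of `hdr - skb_transport_header(skb)` in the
patch, i.e. the number of bytes to pull before calling icmpv6_notify().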


* Re: netconsole fun
From: Neil Horman @ 2012-12-13 12:36 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Cong Wang, netdev
In-Reply-To: <1355345957.2687.18.camel@thor>

On Wed, Dec 12, 2012 at 03:59:17PM -0500, Peter Hurley wrote:
> On Tue, 2012-12-11 at 11:45 -0500, Neil Horman wrote:
> > On Tue, Dec 11, 2012 at 10:16:51AM -0500, Peter Hurley wrote:
> > > On Tue, 2012-12-11 at 09:30 -0500, Neil Horman wrote:
> > > > On Tue, Dec 11, 2012 at 09:19:52AM -0500, Peter Hurley wrote:
> > > > > On Tue, 2012-12-11 at 04:51 +0000, Cong Wang wrote:
> > > > > > On Mon, 10 Dec 2012 at 14:17 GMT, Peter Hurley <peter@hurleysoftware.com> wrote:
> > > > > > > Now that netpoll has been disabled for slaved devices, is there a
> > > > > > > recommended method of running netconsole on a machine that has a slaved
> > > > > > > device?
> > > > > > >
> > > > > > 
> > > > > > Yes, running it on the master device instead.
> > > > > 
> > > > > Thanks for the suggestion, but:
> > > > > 
> > > > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.10.99/br0,30000@192.168.10.100/xx:xx:xx:xx:xx:xx
> > > > > ...
> > > > > [ 5.289869] netpoll: netconsole: local port 6665
> > > > > [ 5.289885] netpoll: netconsole: local IP 192.168.10.99
> > > > > [ 5.289892] netpoll: netconsole: interface 'br0'
> > > > > [ 5.289898] netpoll: netconsole: remote port 30000
> > > > > [ 5.289907] netpoll: netconsole: remote IP 192.168.10.100
> > > > > [ 5.289914] netpoll: netconsole: remote ethernet address xx:xx:xx:xx:xx:xx
> > > > > [ 5.289922] netpoll: netconsole: br0 doesn't exist, aborting
> > > > > [ 5.289929] netconsole: cleaning up
> > > > > ...
> > > > > [ 9.392291] Bridge firewalling registered
> > > > > [ 9.396805] device eth1 entered promiscuous mode
> > > > > [ 9.418350] eth1:  setting full-duplex.
> > > > > [ 9.421268] br0: port 1(eth1) entered forwarding state
> > > > > [ 9.423354] br0: port 1(eth1) entered forwarding state
> > > > > 
> > > > > 
> > > > > Is there a way to control or associate network device names prior to
> > > > > udev renaming?
> > > > > 
> > > > That looks like a systemd problem (or more specifically a boot dependency
> > > > problem).  You need to modify your netconsole unit/service file to start after
> > > > all your networking is up.  NetworkManager provides a dummy service file for
> > > > this purpose, called networkmanager-wait-online.service
> > > 
> > > Ok. So with a single physical network interface that will be bridged,
> > > netconsole cannot be used for kernel boot messages.
> > > 
> > > With a machine with multiple nics, is there a way to control device
> > > naming so that the interface name to be used by netconsole specified on
> > > the boot command line will actually correspond to the intended
> > > device? For example,
> > > 
> > > [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc8-xeon ...... netconsole=@192.168.1.123/eth0,30000@192.168.1.139/xx:xx:xx:xx:xx:xx
> > > ....
> > > [ 4.092184] 3c59x: Donald Becker and others.
> > > [ 4.092204] 0000:07:05.0: 3Com PCI 3c905C Tornado at ffffc9000186cf80.
> > > [ 4.094035] tg3.c:v3.125 (September 26, 2012)
> > > ....
> > > [ 4.125038] tg3 0000:08:00.0 eth1: Tigon3 [partno(BCM95754) rev b002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
> > > [ 4.125055] tg3 0000:08:00.0 eth1: attached PHY is 5787 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
> > > [ 4.125062] tg3 0000:08:00.0 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> > > [ 4.125068] tg3 0000:08:00.0 eth1: dma_rwctrl[76180000] dma_mask[64-bit]
> > > 
> > > This is attaching netconsole to the wrong device because bus
> > > enumeration, and therefore load order, is not consistent from boot to
> > > boot.
> > > 
> > No, there's no way to do that.  As you note, device enumeration isn't consistent
> > across boots; that's why udev creates rules to rename devices based on immutable
> > (or semi-immutable) data, like MAC addresses or PCI bus locations.  Once that
> > happens, you'll have consistent names for your interfaces, and that work will be
> > guaranteed to be done after networkmanager has finished opening all the
> > interfaces that it needs (hence my suggestion to make netconsole service
> > dependent on networkmanager service startup completing).
> 
> Just wondering if you think something like the patch below is
> suitable/acceptable for insulating netconsole from inconsistent device
> name scenarios without changing the existing semantics. The basic idea
> is to allow an ethernet MAC address in the <dev> field of the
> netconsole= options, and if a MAC address was specified rather than a
> device name, to do the dev lookup from the MAC address instead.
> 
> This doesn't extend to, but also doesn't interfere with, the dynamic
> config of netconsole via configfs.
> 
> Would you mind reviewing it?
> 
> Regards,
> Peter
> 
This looks like a pretty good idea to me.  That said, something occurred to me
when you wrote your summary above.  Have you looked at the netconsole service
scripts that most distros provide in their packaging?  I'm almost positive Red
Hat/Fedora (and likely also Suse and Ubuntu) already implement this functionality
from user space.  Basically, instead of people just modprobing netconsole, they
create a service script that parses a config file that contains all the
options needed to load the netconsole module, and it has the intelligence to see
if you specified a MAC address rather than a device.  If you did, it finds
the device with the corresponding MAC address and uses that as the device.  I'm sorry, I
don't know why I didn't think of that before.  Check that out though; that will
likely give you exactly what you need.

Neil

P.S. Actually, looking at it, I think it does one better: it lets you specify the
destination netconsole address, and then dynamically looks up the routing table
entry that gets you there, and uses the output device specified in the routing
table.

http://www.cyberciti.biz/tips/linux-netconsole-log-management-tutorial.html
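
For reference, a userspace MAC-to-device lookup of the kind described above
could look roughly like this in C. It is a sketch under assumptions, not code
from any distro's netconsole script: demo_parse_mac() and demo_ifname_by_mac()
are hypothetical helpers, the parse is lenient, and the lookup relies on the
Linux behaviour of getifaddrs() reporting link-layer addresses as AF_PACKET
entries.

```c
#include <assert.h>
#include <ifaddrs.h>
#include <net/if.h>
#include <netpacket/packet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Parse "aa:bb:cc:dd:ee:ff" into 6 bytes; returns 0 on success, -1 otherwise. */
static int demo_parse_mac(const char *s, unsigned char mac[6])
{
	unsigned int b[6];
	char extra;
	int i;

	assert(s != NULL);
	/* The trailing %c catches garbage after the sixth octet. */
	if (sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x%c",
		   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &extra) != 6)
		return -1;
	for (i = 0; i < 6; i++)
		mac[i] = (unsigned char)b[i];
	return 0;
}

/* Find the interface whose link-layer address equals mac; 0 if found, -1 if not. */
static int demo_ifname_by_mac(const unsigned char mac[6], char name[IF_NAMESIZE])
{
	struct ifaddrs *ifa_list, *ifa;
	int found = -1;

	if (getifaddrs(&ifa_list) < 0)
		return -1;
	for (ifa = ifa_list; ifa; ifa = ifa->ifa_next) {
		const struct sockaddr_ll *sll;

		if (!ifa->ifa_addr || ifa->ifa_addr->sa_family != AF_PACKET)
			continue;
		sll = (const struct sockaddr_ll *)ifa->ifa_addr;
		if (sll->sll_halen == 6 && memcmp(sll->sll_addr, mac, 6) == 0) {
			snprintf(name, IF_NAMESIZE, "%s", ifa->ifa_name);
			found = 0;
			break;
		}
	}
	freeifaddrs(ifa_list);
	return found;
}
```

A service script or helper would run this after udev has finished renaming
devices and then put the resulting name into the <dev> field of the netconsole
options.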


* Re: [PATCH 1/1] net: cpts: fix for build break after ARM SoC integration
From: Richard Cochran @ 2012-12-13 13:03 UTC (permalink / raw)
  To: Tomi Valkeinen
  Cc: Mugunthan V N, netdev, davem, linux-arm-kernel, linux-omap,
	b-cousson, paul
In-Reply-To: <50C9B6F9.9020300@iki.fi>

On Thu, Dec 13, 2012 at 01:07:37PM +0200, Tomi Valkeinen wrote:
> Hi,
> 
> On 2012-11-27 12:27, Mugunthan V N wrote:
> >   CC      drivers/net/ethernet/ti/cpts.o
> > drivers/net/ethernet/ti/cpts.c:30:24: fatal error: plat/clock.h: No such file or directory
> > compilation terminated.
> > make[4]: *** [drivers/net/ethernet/ti/cpts.o] Error 1
> > make[3]: *** [drivers/net/ethernet/ti] Error 2
> > make[2]: *** [drivers/net/ethernet] Error 2
> > make[1]: *** [drivers/net] Error 2
> > 
> > fix for build break as the header file is removed from plat-omap as part of
> > the below patch
> 
> linux-next still has this build problem, I guess this patch is lingering
> somewhere. Somewhat annoying, as the driver is enabled by default. (btw,
> why is it "default y"?)

Um, in Linus' master, net, and net-next, neither TI_CPSW nor TI_CPTS
are default y, so I don't know where you are coming from on that.

Sorry,
Richard


* Re: [RFC] net : add tx timestamp to packet mmap.
From: Richard Cochran @ 2012-12-13 13:29 UTC (permalink / raw)
  To: Paul Chavent; +Cc: davem, edumazet, daniel.borkmann, xemul, ebiederm, netdev
In-Reply-To: <1355326165-12277-1-git-send-email-paul.chavent@onera.fr>

On Wed, Dec 12, 2012 at 04:29:25PM +0100, Paul Chavent wrote:
> This patch allows generating TX timestamps for packets sent through the packet mmap interface.
> 
> Currently, you can't get TX timestamps with the sample code below.
> 
> I wonder if my current implementation is good. And if not, how should I get the timestamps?

In order for time stamps to appear, somebody has to call
skb_tx_timestamp() ...

> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index e639645..948748b 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1857,6 +1857,10 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
>  	void *data;
>  	int err;
>  
> +	err = sock_tx_timestamp(&po->sk, &skb_shinfo(skb)->tx_flags);

and this call is only setting some flags.

HTH,
Richard
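
For context, the userspace half of this looks roughly like the sketch below: an
application asks for software TX time stamps with SO_TIMESTAMPING, which is what
ends up as flags on the skb. demo_enable_tx_timestamps() is a hypothetical
helper name; the flag constants come from <linux/net_tstamp.h>. Setting the
option succeeding does not by itself mean a time stamp will be produced, since
that still depends on the transmit path calling skb_tx_timestamp().

```c
#include <assert.h>
#include <linux/net_tstamp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Request software TX time stamps on fd; returns 0 on success, -1 on error. */
static int demo_enable_tx_timestamps(int fd)
{
	int flags = SOF_TIMESTAMPING_TX_SOFTWARE | SOF_TIMESTAMPING_SOFTWARE;

	assert(fd >= 0);
	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}
```

Generated time stamps are then read back from the socket's error queue with
recvmsg() and MSG_ERRQUEUE.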


* [PATCH iproute2 6/6] ip/link_iptnl: fix indentation
From: Nicolas Dichtel @ 2012-12-13 13:42 UTC (permalink / raw)
  To: shemminger; +Cc: netdev, Nicolas Dichtel
In-Reply-To: <1355406174-10586-1-git-send-email-nicolas.dichtel@6wind.com>

Use tabs instead of spaces when possible.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 ip/link_iptnl.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/ip/link_iptnl.c b/ip/link_iptnl.c
index 238722d..b00d8d9 100644
--- a/ip/link_iptnl.c
+++ b/ip/link_iptnl.c
@@ -298,10 +298,10 @@ static void iptunnel_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[
 		fprintf(f, "nopmtudisc ");
 
 	if (tb[IFLA_IPTUN_FLAGS]) {
-	       __u16 iflags = rta_getattr_u16(tb[IFLA_IPTUN_FLAGS]);
+		__u16 iflags = rta_getattr_u16(tb[IFLA_IPTUN_FLAGS]);
 
-	      if (iflags & SIT_ISATAP)
-		      fprintf(f, "isatap ");
+		if (iflags & SIT_ISATAP)
+			fprintf(f, "isatap ");
 	}
 
 	if (tb[IFLA_IPTUN_6RD_PREFIXLEN] &&
@@ -314,12 +314,12 @@ static void iptunnel_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[
 
 		printf("6rd-prefix %s/%u ",
 		       inet_ntop(AF_INET6, RTA_DATA(tb[IFLA_IPTUN_6RD_PREFIX]),
-			         s1, sizeof(s1)),
+				 s1, sizeof(s1)),
 		       prefixlen);
 		if (relayprefix) {
 			printf("6rd-relay_prefix %s/%u ",
 			       format_host(AF_INET, 4, &relayprefix, s1,
-				           sizeof(s1)),
+					   sizeof(s1)),
 			       relayprefixlen);
 		}
 	}
-- 
1.8.0.1


* [PATCH iproute2 5/6] ip: term OPTIONS was used twice in 'ip route' man pages
From: Nicolas Dichtel @ 2012-12-13 13:42 UTC (permalink / raw)
  To: shemminger; +Cc: netdev, Nicolas Dichtel
In-Reply-To: <1355406174-10586-1-git-send-email-nicolas.dichtel@6wind.com>

INFO_SPEC already uses the term 'OPTIONS' and describes it.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 man/man8/ip-route.8.in | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
index f06fcba..2c35a97 100644
--- a/man/man8/ip-route.8.in
+++ b/man/man8/ip-route.8.in
@@ -1,4 +1,4 @@
-.TH IP\-ROUTE 8 "20 Dec 2011" "iproute2" "Linux"
+.TH IP\-ROUTE 8 "13 Dec 2012" "iproute2" "Linux"
 .SH "NAME"
 ip-route \- routing table management
 .SH "SYNOPSIS"
@@ -7,7 +7,7 @@ ip-route \- routing table management
 .in +8
 .ti -8
 .B ip
-.RI "[ " OPTIONS " ]"
+.RI "[ " ip-OPTIONS " ]"
 .B route
 .RI " { " COMMAND " | "
 .BR help " }"
-- 
1.8.0.1


* [PATCH iproute2 4/6] ip: update man pages for 'ip link'
From: Nicolas Dichtel @ 2012-12-13 13:42 UTC (permalink / raw)
  To: shemminger; +Cc: netdev, Nicolas Dichtel
In-Reply-To: <1355406174-10586-1-git-send-email-nicolas.dichtel@6wind.com>

Now 'ip link' supports ipip, sit and ip6tnl.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 man/man8/ip-link.8.in | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
index 43c4ac6..8d2a6f9 100644
--- a/man/man8/ip-link.8.in
+++ b/man/man8/ip-link.8.in
@@ -1,4 +1,4 @@
-.TH IP\-LINK 8 "20 Dec 2011" "iproute2" "Linux"
+.TH IP\-LINK 8 "13 Dec 2012" "iproute2" "Linux"
 .SH "NAME"
 ip-link \- network device configuration
 .SH "SYNOPSIS"
@@ -59,7 +59,10 @@ ip-link \- network device configuration
 .BR vcan " | "
 .BR veth " | "
 .BR vlan " | "
-.BR vxlan " ]"
+.BR vxlan " |"
+.BR ip6tnl " |"
+.BR ipip " |"
+.BR sit " ]"
 
 .ti -8
 .BI "ip link delete " DEVICE
@@ -174,6 +177,15 @@ Link types:
 .sp
 .BR vxlan
 - Virtual eXtended LAN
+.sp
+.BR ip6tnl
+- Virtual tunnel interface IPv4|IPv6 over IPv6
+.sp
+.BR ipip
+- Virtual tunnel interface IPv4 over IPv4
+.sp
+.BR sit
+- Virtual tunnel interface IPv6 over IPv4
 .in -8
 
 .TP
-- 
1.8.0.1


* [PATCH iproute2 3/6] ip: update man pages and usage() for 'ip mroute'
From: Nicolas Dichtel @ 2012-12-13 13:42 UTC (permalink / raw)
  To: shemminger; +Cc: netdev, Nicolas Dichtel
In-Reply-To: <1355406174-10586-1-git-send-email-nicolas.dichtel@6wind.com>

Sync with the current code.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 ip/ipmroute.c        |  2 ++
 man/man8/ip-mroute.8 | 14 +++++++++++---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/ip/ipmroute.c b/ip/ipmroute.c
index defcfc5..345576d 100644
--- a/ip/ipmroute.c
+++ b/ip/ipmroute.c
@@ -36,6 +36,8 @@ static void usage(void) __attribute__((noreturn));
 static void usage(void)
 {
 	fprintf(stderr, "Usage: ip mroute show [ [ to ] PREFIX ] [ from PREFIX ] [ iif DEVICE ]\n");
+	fprintf(stderr, "                      [ table TABLE_ID ]\n");
+	fprintf(stderr, "TABLE_ID := [ local | main | default | all | NUMBER ]\n");
 #if 0
 	fprintf(stderr, "Usage: ip mroute [ add | del ] DESTINATION from SOURCE [ iif DEVICE ] [ oif DEVICE ]\n");
 #endif
diff --git a/man/man8/ip-mroute.8 b/man/man8/ip-mroute.8
index 98aab88..870df5e 100644
--- a/man/man8/ip-mroute.8
+++ b/man/man8/ip-mroute.8
@@ -1,4 +1,4 @@
-.TH IP\-MROUTE 8 "20 Dec 2011" "iproute2" "Linux"
+.TH IP\-MROUTE 8 "13 Dec 2012" "iproute2" "Linux"
 .SH "NAME"
 ip-mroute \- multicast routing cache management
 .SH "SYNOPSIS"
@@ -6,12 +6,15 @@ ip-mroute \- multicast routing cache management
 .ad l
 .in +8
 .ti -8
-.BR "ip mroute show" " ["
+.BR "ip " " [ ip-OPTIONS ] " "mroute show" " [ [ "
+.BR " to " " ] "
 .IR PREFIX " ] [ "
 .B  from
 .IR PREFIX " ] [ "
 .B  iif
-.IR DEVICE " ]"
+.IR DEVICE " ] [ "
+.B table
+.IR TABLE_ID " ] "
 
 .SH DESCRIPTION
 .B mroute
@@ -42,6 +45,11 @@ the interface on which multicast packets are received.
 .BI from " PREFIX"
 the prefix selecting the IP source addresses of the multicast route.
 
+.TP
+.BI table " TABLE_ID"
+the table id selecting the multicast table. It can be
+.BR local ", " main ", " default ", " all " or a number."
+
 .SH SEE ALSO
 .br
 .BR ip (8)
-- 
1.8.0.1


* [PATCH iproute2 1/6] ip: add man pages for netconf
From: Nicolas Dichtel @ 2012-12-13 13:42 UTC (permalink / raw)
  To: shemminger; +Cc: netdev, Nicolas Dichtel

This patch adds documentation for the 'ip netconf' command.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 man/man8/Makefile     |  2 +-
 man/man8/ip-netconf.8 | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 man/man8/ip-netconf.8

diff --git a/man/man8/Makefile b/man/man8/Makefile
index 4bad9d6..d208f3b 100644
--- a/man/man8/Makefile
+++ b/man/man8/Makefile
@@ -9,7 +9,7 @@ MAN8PAGES = $(TARGETS) ip.8 arpd.8 lnstat.8 routel.8 rtacct.8 rtmon.8 ss.8 \
 	ip-addrlabel.8 ip-l2tp.8 \
 	ip-maddress.8 ip-monitor.8 ip-mroute.8 ip-neighbour.8 \
 	ip-netns.8 ip-ntable.8 ip-rule.8 ip-tunnel.8 ip-xfrm.8 \
-	ip-tcp_metrics.8
+	ip-tcp_metrics.8 ip-netconf.8
 
 all: $(TARGETS)
 
diff --git a/man/man8/ip-netconf.8 b/man/man8/ip-netconf.8
new file mode 100644
index 0000000..8041ea2
--- /dev/null
+++ b/man/man8/ip-netconf.8
@@ -0,0 +1,36 @@
+.TH IP\-NETCONF 8 "13 Dec 2012" "iproute2" "Linux"
+.SH "NAME"
+ip-netconf \- network configuration monitoring
+.SH "SYNOPSIS"
+.sp
+.ad l
+.in +8
+.ti -8
+.BR "ip " " [ ip-OPTIONS ] " "netconf show" " [ "
+.B dev
+.IR STRING " ]"
+
+.SH DESCRIPTION
+The
+.B ip netconf
+utility can monitor IPv4 and IPv6 parameters (see
+.BR "/proc/sys/net/ipv[4|6]/conf/[all|DEV]/" ")"
+like forwarding, rp_filter
+or mc_forwarding status.
+
+If no interface is specified, the entry
+.B all
+is displayed.
+
+.SS ip netconf show - display network parameters
+
+.TP
+.BI dev " STRING"
+the name of the device whose network parameters are displayed.
+
+.SH SEE ALSO
+.br
+.BR ip (8)
+
+.SH AUTHOR
+Original Manpage by Nicolas Dichtel <nicolas.dichtel@6wind.com>
-- 
1.8.0.1


* [PATCH iproute2 2/6] ip: update man pages and usage() for 'ip monitor'
From: Nicolas Dichtel @ 2012-12-13 13:42 UTC (permalink / raw)
  To: shemminger; +Cc: netdev, Nicolas Dichtel
In-Reply-To: <1355406174-10586-1-git-send-email-nicolas.dichtel@6wind.com>

Sync with the current code.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 ip/ipmonitor.c        |  5 ++++-
 man/man8/ip-monitor.8 | 15 +++++++++------
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/ip/ipmonitor.c b/ip/ipmonitor.c
index 09a339c..a9ff1e8 100644
--- a/ip/ipmonitor.c
+++ b/ip/ipmonitor.c
@@ -29,7 +29,10 @@ int prefix_banner;
 
 static void usage(void)
 {
-	fprintf(stderr, "Usage: ip monitor [ all | LISTofOBJECTS ]\n");
+	fprintf(stderr, "Usage: ip monitor [ all | LISTofOBJECTS ] [ FILE ]\n");
+	fprintf(stderr, "LISTofOBJECTS := link | address | route | mroute | prefix |\n");
+	fprintf(stderr, "                 neigh | netconf\n");
+	fprintf(stderr, "FILE := file FILENAME\n");
 	exit(-1);
 }
 
diff --git a/man/man8/ip-monitor.8 b/man/man8/ip-monitor.8
index 351a744..b07cb0e 100644
--- a/man/man8/ip-monitor.8
+++ b/man/man8/ip-monitor.8
@@ -1,4 +1,4 @@
-.TH IP\-MONITOR 8 "20 Dec 2011" "iproute2" "Linux"
+.TH IP\-MONITOR 8 "13 Dec 2012" "iproute2" "Linux"
 .SH "NAME"
 ip-monitor, rtmon \- state monitoring
 .SH "SYNOPSIS"
@@ -6,8 +6,8 @@ ip-monitor, rtmon \- state monitoring
 .ad l
 .in +8
 .ti -8
-.BR "ip monitor" " [ " all " |"
-.IR LISTofOBJECTS " ]"
+.BR "ip " " [ ip-OPTIONS ] " "monitor" " [ " all " |"
+.IR LISTofOBJECTS " ] [ file " FILENAME " ]"
 .sp
 
 .SH DESCRIPTION
@@ -20,12 +20,13 @@ Namely, the
 command is the first in the command line and then the object list follows:
 
 .BR "ip monitor" " [ " all " |"
-.IR LISTofOBJECTS " ]"
+.IR LISTofOBJECTS " ] [ file " FILENAME " ]"
 
 .I OBJECT-LIST
 is the list of object types that we want to monitor.
 It may contain
-.BR link ", " address " and " route "."
+.BR link ", " address ", " route ", " mroute ", " prefix ", "
+.BR neigh " and " netconf "."
 If no
 .B file
 argument is given,
@@ -34,7 +35,9 @@ opens RTNETLINK, listens on it and dumps state changes in the format
 described in previous sections.
 
 .P
-If a file name is given, it does not listen on RTNETLINK,
+If a
+.I FILENAME
+is given, it does not listen on RTNETLINK,
 but opens the file containing RTNETLINK messages saved in binary format
 and dumps them.  Such a history file can be generated with the
 .B rtmon
-- 
1.8.0.1
