Netdev List
 help / color / mirror / Atom feed
* [GIT] Networking
From: David Miller @ 2010-05-11  7:01 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) Fix SCTP race between connect() and ICMP unreachable, from Vlad Yasevich.

2) Micrel phy driver OOPS fix, based upon a report by Ingo Molnar.

3) Firmware loading warn_on fix from Christian Lamparter.

4) UDP logs garbage addresses on short packets.  Fix from Bjørn Mork.

5) IPV6_RECVERR incorrectly handles locally generated errors, fix from
   Brian Haley.

6) IPV4 multicast purges routes too quickly due to mishandling of timers.
   Fix from Andreas Meissner.

7) Passive scans busted in iwlwifi, fix from Johannes Berg.

8) De-auth request handling fix in mac80211 from Reinette Chatre.

9) gianfar needs to purge cached recycled SKBs on MTU change, fix from
   Sebastian Andrzej Siewior

Please pull, thanks a lot!

The following changes since commit 8777c793d6a24c7f3adf52b1b1086e9706de4589:
  Linus Torvalds (1):
        Merge branch 'for-linus' of git://git.kernel.org/.../tj/wq

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Andreas Meissner (1):
      IPv4: unresolved multicast route cleanup

Bjørn Mork (1):
      ipv4: udp: fix short packet and bad checksum logging

Brian Haley (1):
      IPv6: fix IPV6_RECVERR handling of locally-generated errors

Christian Lamparter (1):
      ar9170: wait for asynchronous firmware loading

David S. Miller (3):
      phy: Fix initialization in micrel driver.
      net: Fix FDDI and TR config checks in ipv4 arp and LLC.
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6

Eric Dumazet (1):
      veth: Dont kfree_skb() after dev_forward_skb()

Johannes Berg (1):
      iwlwifi: work around passive scan issue

Reinette Chatre (1):
      mac80211: remove association work when processing deauth request

Sebastian Andrzej Siewior (1):
      net/gianfar: drop recycled skbs on MTU change

Vlad Yasevich (1):
      sctp: Fix a race between ICMP protocol unreachable and connect()

 drivers/net/gianfar.c                       |    2 +-
 drivers/net/phy/micrel.c                    |    1 +
 drivers/net/veth.c                          |    1 -
 drivers/net/wireless/ath/ar9170/usb.c       |   11 ++++++++
 drivers/net/wireless/ath/ar9170/usb.h       |    1 +
 drivers/net/wireless/iwlwifi/iwl-commands.h |    4 ++-
 drivers/net/wireless/iwlwifi/iwl-scan.c     |   23 ++++++++++++++----
 drivers/net/wireless/iwlwifi/iwl3945-base.c |    3 +-
 include/net/sctp/sm.h                       |    1 +
 include/net/sctp/structs.h                  |    3 ++
 net/core/dev.c                              |   11 ++++----
 net/ipv4/arp.c                              |    6 ++--
 net/ipv4/ipmr.c                             |    3 +-
 net/ipv4/udp.c                              |    6 ++--
 net/ipv6/datagram.c                         |    8 ++++-
 net/llc/llc_sap.c                           |    2 +-
 net/mac80211/mlme.c                         |    3 +-
 net/sctp/input.c                            |   22 ++++++++++++++---
 net/sctp/sm_sideeffect.c                    |   35 +++++++++++++++++++++++++++
 net/sctp/transport.c                        |    2 +
 20 files changed, 118 insertions(+), 30 deletions(-)

^ permalink raw reply

* Re: [PATCH 1/9] netdev: bfin_mac: add support for IEEE 1588 PTP
From: Richard Cochran @ 2010-05-11  7:07 UTC (permalink / raw)
  To: Mike Frysinger; +Cc: netdev, David S. Miller, uclinux-dist-devel, Barry Song
In-Reply-To: <1273505954-32588-1-git-send-email-vapier@gentoo.org>

On Mon, May 10, 2010 at 11:39:06AM -0400, Mike Frysinger wrote:
> diff --git a/drivers/net/bfin_mac.c b/drivers/net/bfin_mac.c
> index 587f93c..6a9519f 100644
> --- a/drivers/net/bfin_mac.c
> +++ b/drivers/net/bfin_mac.c
...
> +#define PTP_CLK 25000000
> +
> +static void bfin_mac_hwtstamp_init(struct net_device *netdev)
> +{
> +	struct bfin_mac_local *lp = netdev_priv(netdev);
> +	u64 append;
> +
> +	/* Initialize hardware timer */
> +	append = PTP_CLK * (1ULL << 32);
> +	do_div(append, get_sclk());
> +	bfin_write_EMAC_PTP_ADDEND((u32)append);

It appears that one can tune this PTP clock.

I recently posted a suggestion for a PTP clock class driver. Would you
care to take a look at that and say whether that API would also work
for the blackfin?

Thanks,
Richard


^ permalink raw reply

* Re: TCP-MD5 checksum failure on x86_64 SMP
From: Bijay Singh @ 2010-05-11  8:23 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Stephen Hemminger, David Miller, <bhaskie@gmail.com>,
	<bhutchings@solarflare.com>, <netdev@vger.kernel.org>
In-Reply-To: <1273559267.2339.6.camel@edumazet-laptop>

I need MD5 for my BGP sessions and need the jumbo packets for the IS-IS peering. MTU of 1500 results in LSPs higher that 1500 getting dropped at the peering router. 

On 11-May-2010, at 11:57 AM, Eric Dumazet wrote:

> Le mardi 11 mai 2010 à 04:08 +0000, Bijay Singh a écrit :
>> Hi Eric,
>> 
>> I guess that makes me the enviable one. So I am keen to test out this feature
> 
>> completely, as long as I know what to do as a next step, directions, patches.
>> 
> 
> MTU > 4000 is not reliable because of high order allocations on typical
> NICS. I am afraid you need NIC able to deliver page fragments.
> 
> Its working here (32bit kernel) with a tg3 NIC, but I got following
> message :
> 
> ifconfig eth3 mtu 9000
> ...
> [51492.936500] 167731 total pagecache pages
> [51492.936500] 0 pages in swap cache
> [51492.936500] Swap cache stats: add 0, delete 0, find 0/0
> [51492.936500] Free swap  = 4192928kB
> [51492.936500] Total swap = 4192928kB
> [51492.936500] 1114110 pages RAM
> [51492.936500] 885761 pages HighMem
> [51492.936500] 77073 pages reserved
> [51492.936500] 134483 pages shared
> [51492.936500] 159131 pages non-shared
> [51492.953027] tg3 0000:14:04.1: eth3: Using a smaller RX standard ring.
> Only 183 out of 511 buffers were allocated successfully
> 
> $ ethtool -g eth3
> Ring parameters for eth3:
> Pre-set maximums:
> RX:		511
> RX Mini:	0
> RX Jumbo:	0
> TX:		511
> Current hardware settings:
> RX:		183
> RX Mini:	0
> RX Jumbo:	0
> TX:		511
> 
> $ cat /proc/buddyinfo
> Node 0, zone      DMA      5      2      1      1      2      2      2      1      0      1      0 
> Node 0, zone   Normal   4285   1823    248     73      9      5      0      0      0      0      0 
> Node 0, zone  HighMem     97    199    921    583    383    261    155    117     69     41    649 
> 
> I know that if I try to stress RX path, I'll get failures.
> 
> Could you explain me why you need both big MTUS and TCP-MD5 ?
> 
> 
> 


^ permalink raw reply

* Re: [PATCH net-next 3/3] cxgb4: report GRO stats with ethtool -S
From: Dimitris Michailidis @ 2010-05-11  8:45 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev
In-Reply-To: <4BE8C8CD.1000200@candelatech.com>

Ben Greear wrote:
> On 05/10/2010 06:58 PM, Dimitris Michailidis wrote:
>> Signed-off-by: Dimitris Michailidis<dm@chelsio.com>
> 
> One of these is on-wire packets?  If so, maybe use the same
> string as Intel ixgbe uses:
> 
> rx_pkts_nic  # Pkts received by NIC from wire.
> rx_bytes_nic # Bytes received by NIC from wire.
> tx_pkts_nic  # Pkts transmitted to wire by NIC.
> tx_bytes_nic # Bytes transmitted to wire by NIC.

The two stats I'm adding don't correspond to any of the 4 above.  One is the 
superpackets coming out of GRO, the other is packets merged by GRO to create 
the superpackets.

> 
> Thanks,
> Ben
> 
> 
> 
>> ---
>>   drivers/net/cxgb4/cxgb4_main.c |    6 ++++++
>>   1 files changed, 6 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/net/cxgb4/cxgb4_main.c 
>> b/drivers/net/cxgb4/cxgb4_main.c
>> index 80c3fc5..90d375b 100644
>> --- a/drivers/net/cxgb4/cxgb4_main.c
>> +++ b/drivers/net/cxgb4/cxgb4_main.c
>> @@ -859,6 +859,8 @@ static char stats_strings[][ETH_GSTRING_LEN] = {
>>       "RxCsumGood         ",
>>       "VLANextractions    ",
>>       "VLANinsertions     ",
>> +    "GROpackets         ",
>> +    "GROmerged          ",
>>   };
>>
>>   static int get_sset_count(struct net_device *dev, int sset)
>> @@ -922,6 +924,8 @@ struct queue_port_stats {
>>       u64 rx_csum;
>>       u64 vlan_ex;
>>       u64 vlan_ins;
>> +    u64 gro_pkts;
>> +    u64 gro_merged;
>>   };
>>
>>   static void collect_sge_port_stats(const struct adapter *adap,
>> @@ -938,6 +942,8 @@ static void collect_sge_port_stats(const struct 
>> adapter *adap,
>>           s->rx_csum += rx->stats.rx_cso;
>>           s->vlan_ex += rx->stats.vlan_ex;
>>           s->vlan_ins += tx->vlan_ins;
>> +        s->gro_pkts += rx->stats.lro_pkts;
>> +        s->gro_merged += rx->stats.lro_merged;
>>       }
>>   }
>>
> 
> 


^ permalink raw reply

* Re: [PATCH 72/84] netfilter: xtables: inclusion of xt_TEE
From: Patrick McHardy @ 2010-05-11 11:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netfilter-devel, netdev
In-Reply-To: <1273524779.2590.236.camel@edumazet-laptop>

Eric Dumazet wrote:
> Le lundi 10 mai 2010 à 22:18 +0200, kaber@trash.net a écrit :
>> From: Jan Engelhardt <jengelh@medozas.de>
>>
>> xt_TEE can be used to clone and reroute a packet. This can for
>> example be used to copy traffic at a router for logging purposes
>> to another dedicated machine.
>>
>> References: http://www.gossamer-threads.com/lists/iptables/devel/68781
>> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
>> Signed-off-by: Patrick McHardy <kaber@trash.net>
>> ---
> 
>> +static bool tee_tg_route_oif(struct flowi *f, struct net *net,
>> +			     const struct xt_tee_tginfo *info)
>> +{
>> +	const struct net_device *dev;
>> +
>> +	if (*info->oif != '\0')
>> +		return true;
>> +	dev = dev_get_by_name(net, info->oif);
>> +	if (dev == NULL)
>> +		return false;
>> +	f->oif = dev->ifindex;
>> +	return true;
>> +}
>> +
> 
> This leaks a refcount on device.
> 
> But I see patch 76/84 replaces the whole thing, so this is probably
> harmless.

Correct, that patch replaces the per-packet lookup and uses
netdevice notifiers to store the ifindex of the output device,
without keeping a reference at all.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] virtif: initial interface extensions
From: Arnd Bergmann @ 2010-05-11 12:25 UTC (permalink / raw)
  To: Stefan Berger; +Cc: netdev, Scott Feldman
In-Reply-To: <OFFE8F5F70.5C07C656-ON8525771F.00787A71-8525771F.007FCDFC@us.ibm.com>

On Tuesday 11 May 2010, Stefan Berger wrote:
> Arnd Bergmann <arnd@arndb.de> wrote on 05/10/2010 05:46:37 PM:
> 
> > Stefan, can you just define the XML in a way that matches the netlink
> > definition? What you need is something like
> > 
> > 1. VF number (optional, signifies that 2/3 are done in firmware)
> 
> Shouldn't we be able to query that number via netlink starting with the
> macvtap device and the following the trail to the root and trying to find
> a VF number on the way?

No. If we have a macvtap device, there is no VF number. The VF number
should be known to libvirt in those cases where instead of creating a
macvtap device, it assigns a VF of an SR-IOV adapter to the guest.

> > 2. Lower-level protocol
> >   2.1. CDCP
> >      2.1.1 SVID
> >      2.1.2 SCID
> 
> Will the later on be qeueryable via netlink as well but not today???
> Vivek tells me svid is vlan, so that could be found out from the kernel.
> 
> So if we want to only support 1 and 2 for now, I'd rather skip them for 
> now.

svid is almost vlan (hence S-VLAN), but slightly different and is not
currently supported by the kernel. Again, if the implementation is done in
firmware, libvirt needs to set the same S-VLAN ID when setting up the
VF and when associating it to the switch.

When it's done in software, we need to create the device (or have
it created in advance), so you either know it or can query it as
you describe.

You don't need to support it yet in libvirt, but the definition should
be done in a way that leaves the option open to add it later.

> >      2.2.2 ...
> > 3. VDP
> >   3.1 VSI type/version/provider
> 
> as proposed on libvirt mailing list
> 
> >   3.2 UUID
> 
> we have a couple of UUIDs, which one?

This is a UUID that describes the VSI to the switch. It needs to be
unique in the migration domain. For a guest that has multiple
macvtap interfaces, you either need to have a single UUID and
put all MAC/VLAN pairs into the same netlink message with this
UUID, or have one UUID per device. 
 
> >   3.3 MAC/VLAN
> 
> MAC: available from libvirt
> VLAN: can be found out by querying for every interface for VLAN ID while 
> following the path towards the root device. 

Yes, in case of macvtap.

	Arnd

^ permalink raw reply

* Re: [PATCH] virtif: initial interface extensions
From: Arnd Bergmann @ 2010-05-11 12:59 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Stefan Berger, netdev, chrisw, davem
In-Reply-To: <C80DF1EC.2FFE0%scofeldm@cisco.com>

On Tuesday 11 May 2010, Scott Feldman wrote:
> On 5/10/10 2:46 PM, "Arnd Bergmann" <arnd@arndb.de> wrote:
>
> > Stefan, can you just define the XML in a way that matches the netlink
> > definition? What you need is something like
> > 
> > 1. VF number (optional, signifies that 2/3 are done in firmware)
> > 2. Lower-level protocol
> >   2.1. CDCP
> >      2.1.1 SVID
> >      2.1.2 SCID
> >   2.2. enic
> >      2.2.1 port profile name
> >      2.2.2 ...
> > 3. VDP
> >   3.1 VSI type/version/provider
> >   3.2 UUID
> >   3.3 MAC/VLAN
> > 
> > You need to have 2. or 3. or both, and 2.1/2.2 are mutually exclusive.
> 
> I'm don't think this is going in the right direction.  We're talking a
> pretty simple concept of a port-profile used to configure the virtual port
> backing a VM i/f to something that's trying to munge disjoint protocols
> based on pre-standard work into a single API.  It's forcing all those
> pre-standard protocol details into the API, into the XML, and into the mgmt
> software (libvirt), and into the admin's lap.

No. I agree that the port-profile concept is simple and that the complexity
comes from trying to merge the Linux interface with what we do for VDP.
It would be much more sensible IMHO to just unify port-profiles and CDCP
in the kernel interface, because they are conceptually more similar and
both rather simple (note that there is no S-VLAN implementation in Linux,
so CDCP is not there yet either).

If we layer VDP on top of the two and do it always in user space (LLDPAD
with lldptool), things are much simpler on the interface side.

The focus of VDP is to manage migration, which is something that
enic doesn't do (or doesn't need to do) and most of the complexity
in there comes from this.

> I want the API to pass a port-profile name plus other information associated
> with the VM i/f to some mgmt object which can setup the virtual port backing
> the VM i/f.  (And unset it).  Using netlink for API let's that object be in
> user- or kernel-space software, hardware or firmware.  How the object sets
> up the virtual port based on the passed port-profile is beyond the scope of
> the API.
> 
> My last port-profile patch is this API.  It gives us this:
> 
>     1) single netlink msg for kernel- and user-space

Except for the VF argument, which is kernel- only, and it doesn't work
if user space needs additional information that only the firmware has
in your case (something like a virtual channel ID).

>     2) single parameter set from sender's perspective (libvirt)
>     3) single XML representation of parameters
>     4) single code path in kernel and libvirt

>     5) (potential) cross-vendor-switch VM migration

You cannot do live migration between Cisco switches and those implementing
802.1Qbg, because of the differences between port profiles and VSI types,
e.g. how they handle VLANs or handover during migration.
Migration between 802.1Qbg capable switches is covered by the standard.

>     6) admin-friendly port-profile names

>     7) allows pre-standard (802.1Qbg/bh) details to change
>        without bogging down the API

We're basically nailing down the API right here. As soon as this is
supported in Linux, it will be the standard, so the details won't
be able to change any more.

> What I proposed on the libvirt list is to maintain a mapping database from
> port-profile to vendor-specific or protocol-specific parameters.  Using VDP
> as an example, a port-profile would resolve the VDP tuple:
> 
>      port-profile: "joes-garage"   --->  VSI Manager ID: 15
>                                          VSI Type ID: 12345
>                                          VSI Type ID Ver: 1
>                                          other VSI settings (preassociate)

I agree that the port profile name matches the VSI manager/type/version ID
to a large degree and that it would be nice to unify these, but I don't think
there is a point given all the other differences.

There is no way around the preassociate/associate/disassociate messages
being part of the API, because otherwise you cannot do seamless migration
across multiple switches. 

Also, the primary key on VDP is the VSI UUID, which needs to be the same
on the target host after migration, while in your case the switch never
identifies the guest itself, only the port profile.

If I have understood you correctly, the primary key identifying a port
on enic is something that is only visible to the switch and nic firmware,
but never to software, so you identify the guest by its VF number on the
user API.

	Arnd

^ permalink raw reply

* get beyond 1Gbps with pktgen  on 10Gb nic?
From: Jon Zhou @ 2010-05-11 13:13 UTC (permalink / raw)
  To: netdev@vger.kernel.org

hi there:

anyone can get beyond 1Gbps with pktgen or other SW traffic generator with 10Gb nic(intel 82599 or BCM 57711)?
found that some one had met similar situation with broadcom 10G nic but no solution yet

thanks
jon

^ permalink raw reply

* Re: get beyond 1Gbps with pktgen  on 10Gb nic?
From: Ben Hutchings @ 2010-05-11 13:35 UTC (permalink / raw)
  To: Jon Zhou; +Cc: netdev@vger.kernel.org
In-Reply-To: <4A6A2125329CFD4D8CC40C9E8ABCAB9F2497D752C7@MILEXCH2.ds.jdsu.net>

On Tue, 2010-05-11 at 06:13 -0700, Jon Zhou wrote:
> hi there:
> 
> anyone can get beyond 1Gbps with pktgen or other SW traffic generator with 10Gb nic(intel 82599 or BCM 57711)?
> found that some one had met similar situation with broadcom 10G nic but no solution yet

I don't know about those specific controllers, but you should be able to
achieve close to 10G line rate with netperf's TCP_STREAM on any recent
PC server.  UDP throughput tends to be poorer as there is less support
for offloading segmentation and reassembly.  Performance may also be
constrained by PCI Express bandwidth (you need a real 8-lane slot) and
memory bandwidth (a single memory bank may not be enough).

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* [PATCH 0/6] ipv6: ip6mr: support multiple independant multicast routing instances
From: kaber @ 2010-05-11 14:02 UTC (permalink / raw)
  To: davem; +Cc: netdev

The following patches add support for multiple independant IPv6 multicast
routing instances. This can be useful to seperate traffic when building
a multicast router that is serving multiple independant networks.

The patchset is pretty much a straight forward port from IPv4 with no
significant differences.

The patchset consists of the following parts:

- Patch 01-04 contain some preparatory work and cleanup for supporting
  multiple multicast routing instances.

- Patch 05 contains the actual changes to support multiple multicast
  routing instances.

- Patch 06 adds support for dumping multicast routing routes from all
  tables over netlink.

These patches have been tested using pim6sd and mrd6.

Please apply or pull from:

git://git.kernel.org/pub/scm/linux/kernel/git/kaber/ipmr-2.6.git master

Thanks!

^ permalink raw reply

* [PATCH 2/6] ipv6: ip6mr: remove net pointer from struct mfc6_cache
From: kaber @ 2010-05-11 14:02 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1273586551-3521-1-git-send-email-kaber@trash.net>

From: Patrick McHardy <kaber@trash.net>

Now that cache entries in unres_queue don't need to be distinguished by their
network namespace pointer anymore, we can remove it from struct mfc6_cache
add pass the namespace as function argument to the functions that need it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 include/linux/mroute6.h |   15 -----------
 net/ipv6/ip6mr.c        |   63 +++++++++++++++++++++++------------------------
 2 files changed, 31 insertions(+), 47 deletions(-)

diff --git a/include/linux/mroute6.h b/include/linux/mroute6.h
index 2caa1a8..04e2e54 100644
--- a/include/linux/mroute6.h
+++ b/include/linux/mroute6.h
@@ -183,9 +183,6 @@ struct mif_device {
 
 struct mfc6_cache {
 	struct mfc6_cache *next;		/* Next entry on cache line 	*/
-#ifdef CONFIG_NET_NS
-	struct net *mfc6_net;
-#endif
 	struct in6_addr mf6c_mcastgrp;			/* Group the entry belongs to 	*/
 	struct in6_addr mf6c_origin;			/* Source of packet 		*/
 	mifi_t mf6c_parent;			/* Source interface		*/
@@ -208,18 +205,6 @@ struct mfc6_cache {
 	} mfc_un;
 };
 
-static inline
-struct net *mfc6_net(const struct mfc6_cache *mfc)
-{
-	return read_pnet(&mfc->mfc6_net);
-}
-
-static inline
-void mfc6_net_set(struct mfc6_cache *mfc, struct net *net)
-{
-	write_pnet(&mfc->mfc6_net, hold_net(net));
-}
-
 #define MFC_STATIC		1
 #define MFC_NOTIFY		2
 
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 7236030..b3783a4 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -76,10 +76,12 @@ static DEFINE_SPINLOCK(mfc_unres_lock);
 
 static struct kmem_cache *mrt_cachep __read_mostly;
 
-static int ip6_mr_forward(struct sk_buff *skb, struct mfc6_cache *cache);
+static int ip6_mr_forward(struct net *net, struct sk_buff *skb,
+			  struct mfc6_cache *cache);
 static int ip6mr_cache_report(struct net *net, struct sk_buff *pkt,
 			      mifi_t mifi, int assert);
-static int ip6mr_fill_mroute(struct sk_buff *skb, struct mfc6_cache *c, struct rtmsg *rtm);
+static int ip6mr_fill_mroute(struct net *net, struct sk_buff *skb,
+			     struct mfc6_cache *c, struct rtmsg *rtm);
 static void mroute_clean_tables(struct net *net);
 
 
@@ -523,7 +525,6 @@ static int mif6_delete(struct net *net, int vifi, struct list_head *head)
 
 static inline void ip6mr_cache_free(struct mfc6_cache *c)
 {
-	release_net(mfc6_net(c));
 	kmem_cache_free(mrt_cachep, c);
 }
 
@@ -531,10 +532,9 @@ static inline void ip6mr_cache_free(struct mfc6_cache *c)
    and reporting error to netlink readers.
  */
 
-static void ip6mr_destroy_unres(struct mfc6_cache *c)
+static void ip6mr_destroy_unres(struct net *net, struct mfc6_cache *c)
 {
 	struct sk_buff *skb;
-	struct net *net = mfc6_net(c);
 
 	atomic_dec(&net->ipv6.cache_resolve_queue_len);
 
@@ -575,7 +575,7 @@ static void ipmr_do_expire_process(struct net *net)
 		}
 
 		*cp = c->next;
-		ip6mr_destroy_unres(c);
+		ip6mr_destroy_unres(net, c);
 	}
 
 	if (net->ipv6.mfc6_unres_queue != NULL)
@@ -599,10 +599,10 @@ static void ipmr_expire_process(unsigned long arg)
 
 /* Fill oifs list. It is called under write locked mrt_lock. */
 
-static void ip6mr_update_thresholds(struct mfc6_cache *cache, unsigned char *ttls)
+static void ip6mr_update_thresholds(struct net *net, struct mfc6_cache *cache,
+				    unsigned char *ttls)
 {
 	int vifi;
-	struct net *net = mfc6_net(cache);
 
 	cache->mfc_un.res.minvif = MAXMIFS;
 	cache->mfc_un.res.maxvif = 0;
@@ -717,24 +717,22 @@ static struct mfc6_cache *ip6mr_cache_find(struct net *net,
 /*
  *	Allocate a multicast cache entry
  */
-static struct mfc6_cache *ip6mr_cache_alloc(struct net *net)
+static struct mfc6_cache *ip6mr_cache_alloc(void)
 {
 	struct mfc6_cache *c = kmem_cache_zalloc(mrt_cachep, GFP_KERNEL);
 	if (c == NULL)
 		return NULL;
 	c->mfc_un.res.minvif = MAXMIFS;
-	mfc6_net_set(c, net);
 	return c;
 }
 
-static struct mfc6_cache *ip6mr_cache_alloc_unres(struct net *net)
+static struct mfc6_cache *ip6mr_cache_alloc_unres(void)
 {
 	struct mfc6_cache *c = kmem_cache_zalloc(mrt_cachep, GFP_ATOMIC);
 	if (c == NULL)
 		return NULL;
 	skb_queue_head_init(&c->mfc_un.unres.unresolved);
 	c->mfc_un.unres.expires = jiffies + 10 * HZ;
-	mfc6_net_set(c, net);
 	return c;
 }
 
@@ -742,7 +740,8 @@ static struct mfc6_cache *ip6mr_cache_alloc_unres(struct net *net)
  *	A cache entry has gone into a resolved state from queued
  */
 
-static void ip6mr_cache_resolve(struct mfc6_cache *uc, struct mfc6_cache *c)
+static void ip6mr_cache_resolve(struct net *net, struct mfc6_cache *uc,
+				struct mfc6_cache *c)
 {
 	struct sk_buff *skb;
 
@@ -755,7 +754,7 @@ static void ip6mr_cache_resolve(struct mfc6_cache *uc, struct mfc6_cache *c)
 			int err;
 			struct nlmsghdr *nlh = (struct nlmsghdr *)skb_pull(skb, sizeof(struct ipv6hdr));
 
-			if (ip6mr_fill_mroute(skb, c, NLMSG_DATA(nlh)) > 0) {
+			if (ip6mr_fill_mroute(net, skb, c, NLMSG_DATA(nlh)) > 0) {
 				nlh->nlmsg_len = skb_tail_pointer(skb) - (u8 *)nlh;
 			} else {
 				nlh->nlmsg_type = NLMSG_ERROR;
@@ -763,9 +762,9 @@ static void ip6mr_cache_resolve(struct mfc6_cache *uc, struct mfc6_cache *c)
 				skb_trim(skb, nlh->nlmsg_len);
 				((struct nlmsgerr *)NLMSG_DATA(nlh))->error = -EMSGSIZE;
 			}
-			err = rtnl_unicast(skb, mfc6_net(uc), NETLINK_CB(skb).pid);
+			err = rtnl_unicast(skb, net, NETLINK_CB(skb).pid);
 		} else
-			ip6_mr_forward(skb, c);
+			ip6_mr_forward(net, skb, c);
 	}
 }
 
@@ -889,7 +888,7 @@ ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
 		 */
 
 		if (atomic_read(&net->ipv6.cache_resolve_queue_len) >= 10 ||
-		    (c = ip6mr_cache_alloc_unres(net)) == NULL) {
+		    (c = ip6mr_cache_alloc_unres()) == NULL) {
 			spin_unlock_bh(&mfc_unres_lock);
 
 			kfree_skb(skb);
@@ -1133,7 +1132,7 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 	if (c != NULL) {
 		write_lock_bh(&mrt_lock);
 		c->mf6c_parent = mfc->mf6cc_parent;
-		ip6mr_update_thresholds(c, ttls);
+		ip6mr_update_thresholds(net, c, ttls);
 		if (!mrtsock)
 			c->mfc_flags |= MFC_STATIC;
 		write_unlock_bh(&mrt_lock);
@@ -1143,14 +1142,14 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 	if (!ipv6_addr_is_multicast(&mfc->mf6cc_mcastgrp.sin6_addr))
 		return -EINVAL;
 
-	c = ip6mr_cache_alloc(net);
+	c = ip6mr_cache_alloc();
 	if (c == NULL)
 		return -ENOMEM;
 
 	c->mf6c_origin = mfc->mf6cc_origin.sin6_addr;
 	c->mf6c_mcastgrp = mfc->mf6cc_mcastgrp.sin6_addr;
 	c->mf6c_parent = mfc->mf6cc_parent;
-	ip6mr_update_thresholds(c, ttls);
+	ip6mr_update_thresholds(net, c, ttls);
 	if (!mrtsock)
 		c->mfc_flags |= MFC_STATIC;
 
@@ -1178,7 +1177,7 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 	spin_unlock_bh(&mfc_unres_lock);
 
 	if (uc) {
-		ip6mr_cache_resolve(uc, c);
+		ip6mr_cache_resolve(net, uc, c);
 		ip6mr_cache_free(uc);
 	}
 	return 0;
@@ -1229,7 +1228,7 @@ static void mroute_clean_tables(struct net *net)
 		cp = &net->ipv6.mfc6_unres_queue;
 		while ((c = *cp) != NULL) {
 			*cp = c->next;
-			ip6mr_destroy_unres(c);
+			ip6mr_destroy_unres(net, c);
 		}
 		spin_unlock_bh(&mfc_unres_lock);
 	}
@@ -1497,10 +1496,10 @@ static inline int ip6mr_forward2_finish(struct sk_buff *skb)
  *	Processing handlers for ip6mr_forward
  */
 
-static int ip6mr_forward2(struct sk_buff *skb, struct mfc6_cache *c, int vifi)
+static int ip6mr_forward2(struct net *net, struct sk_buff *skb,
+			  struct mfc6_cache *c, int vifi)
 {
 	struct ipv6hdr *ipv6h;
-	struct net *net = mfc6_net(c);
 	struct mif_device *vif = &net->ipv6.vif6_table[vifi];
 	struct net_device *dev;
 	struct dst_entry *dst;
@@ -1581,11 +1580,11 @@ static int ip6mr_find_vif(struct net_device *dev)
 	return ct;
 }
 
-static int ip6_mr_forward(struct sk_buff *skb, struct mfc6_cache *cache)
+static int ip6_mr_forward(struct net *net, struct sk_buff *skb,
+			  struct mfc6_cache *cache)
 {
 	int psend = -1;
 	int vif, ct;
-	struct net *net = mfc6_net(cache);
 
 	vif = cache->mf6c_parent;
 	cache->mfc_un.res.pkt++;
@@ -1627,13 +1626,13 @@ static int ip6_mr_forward(struct sk_buff *skb, struct mfc6_cache *cache)
 			if (psend != -1) {
 				struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC);
 				if (skb2)
-					ip6mr_forward2(skb2, cache, psend);
+					ip6mr_forward2(net, skb2, cache, psend);
 			}
 			psend = ct;
 		}
 	}
 	if (psend != -1) {
-		ip6mr_forward2(skb, cache, psend);
+		ip6mr_forward2(net, skb, cache, psend);
 		return 0;
 	}
 
@@ -1674,7 +1673,7 @@ int ip6_mr_input(struct sk_buff *skb)
 		return -ENODEV;
 	}
 
-	ip6_mr_forward(skb, cache);
+	ip6_mr_forward(net, skb, cache);
 
 	read_unlock(&mrt_lock);
 
@@ -1683,11 +1682,11 @@ int ip6_mr_input(struct sk_buff *skb)
 
 
 static int
-ip6mr_fill_mroute(struct sk_buff *skb, struct mfc6_cache *c, struct rtmsg *rtm)
+ip6mr_fill_mroute(struct net *net, struct sk_buff *skb, struct mfc6_cache *c,
+		  struct rtmsg *rtm)
 {
 	int ct;
 	struct rtnexthop *nhp;
-	struct net *net = mfc6_net(c);
 	u8 *b = skb_tail_pointer(skb);
 	struct rtattr *mp_head;
 
@@ -1781,7 +1780,7 @@ int ip6mr_get_route(struct net *net,
 	if (!nowait && (rtm->rtm_flags&RTM_F_NOTIFY))
 		cache->mfc_flags |= MFC_NOTIFY;
 
-	err = ip6mr_fill_mroute(skb, cache, rtm);
+	err = ip6mr_fill_mroute(net, skb, cache, rtm);
 	read_unlock(&mrt_lock);
 	return err;
 }
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 4/6] ipv6: ip6mr: move mroute data into seperate structure
From: kaber @ 2010-05-11 14:02 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1273586551-3521-1-git-send-email-kaber@trash.net>

From: Patrick McHardy <kaber@trash.net>

Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 include/linux/mroute6.h  |    5 +-
 include/net/netns/ipv6.h |   13 +--
 net/ipv6/ip6mr.c         |  390 +++++++++++++++++++++++++---------------------
 3 files changed, 216 insertions(+), 192 deletions(-)

diff --git a/include/linux/mroute6.h b/include/linux/mroute6.h
index 94a0cb5..0370dd4 100644
--- a/include/linux/mroute6.h
+++ b/include/linux/mroute6.h
@@ -229,10 +229,7 @@ extern int ip6mr_get_route(struct net *net, struct sk_buff *skb,
 			   struct rtmsg *rtm, int nowait);
 
 #ifdef CONFIG_IPV6_MROUTE
-static inline struct sock *mroute6_socket(struct net *net)
-{
-	return net->ipv6.mroute6_sk;
-}
+extern struct sock *mroute6_socket(struct net *net);
 extern int ip6mr_sk_done(struct sock *sk);
 #else
 static inline struct sock *mroute6_socket(struct net *net) { return NULL; }
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 9cb3b5f..4e2780e 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -59,18 +59,7 @@ struct netns_ipv6 {
 	struct sock             *tcp_sk;
 	struct sock             *igmp_sk;
 #ifdef CONFIG_IPV6_MROUTE
-	struct sock		*mroute6_sk;
-	struct timer_list	ipmr_expire_timer;
-	struct list_head	mfc6_unres_queue;
-	struct list_head	*mfc6_cache_array;
-	struct mif_device	*vif6_table;
-	int			maxvif;
-	atomic_t		cache_resolve_queue_len;
-	int			mroute_do_assert;
-	int			mroute_do_pim;
-#ifdef CONFIG_IPV6_PIMSM_V2
-	int			mroute_reg_vif_num;
-#endif
+	struct mr6_table	*mrt6;
 #endif
 };
 #endif
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 08e0904..9419fce 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -51,6 +51,24 @@
 #include <linux/netfilter_ipv6.h>
 #include <net/ip6_checksum.h>
 
+struct mr6_table {
+#ifdef CONFIG_NET_NS
+	struct net		*net;
+#endif
+	struct sock		*mroute6_sk;
+	struct timer_list	ipmr_expire_timer;
+	struct list_head	mfc6_unres_queue;
+	struct list_head	mfc6_cache_array[MFC6_LINES];
+	struct mif_device	vif6_table[MAXMIFS];
+	int			maxvif;
+	atomic_t		cache_resolve_queue_len;
+	int			mroute_do_assert;
+	int			mroute_do_pim;
+#ifdef CONFIG_IPV6_PIMSM_V2
+	int			mroute_reg_vif_num;
+#endif
+};
+
 /* Big lock, protecting vif table, mrt cache and mroute socket state.
    Note that the changes are semaphored via rtnl_lock.
  */
@@ -61,7 +79,7 @@ static DEFINE_RWLOCK(mrt_lock);
  *	Multicast router control variables
  */
 
-#define MIF_EXISTS(_net, _idx) ((_net)->ipv6.vif6_table[_idx].dev != NULL)
+#define MIF_EXISTS(_mrt, _idx) ((_mrt)->vif6_table[_idx].dev != NULL)
 
 /* Special spinlock for queue of unresolved entries */
 static DEFINE_SPINLOCK(mfc_unres_lock);
@@ -76,13 +94,13 @@ static DEFINE_SPINLOCK(mfc_unres_lock);
 
 static struct kmem_cache *mrt_cachep __read_mostly;
 
-static int ip6_mr_forward(struct net *net, struct sk_buff *skb,
-			  struct mfc6_cache *cache);
-static int ip6mr_cache_report(struct net *net, struct sk_buff *pkt,
+static int ip6_mr_forward(struct net *net, struct mr6_table *mrt,
+			  struct sk_buff *skb, struct mfc6_cache *cache);
+static int ip6mr_cache_report(struct mr6_table *mrt, struct sk_buff *pkt,
 			      mifi_t mifi, int assert);
-static int ip6mr_fill_mroute(struct net *net, struct sk_buff *skb,
+static int ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
 			     struct mfc6_cache *c, struct rtmsg *rtm);
-static void mroute_clean_tables(struct net *net);
+static void mroute_clean_tables(struct mr6_table *mrt);
 
 
 #ifdef CONFIG_PROC_FS
@@ -97,11 +115,12 @@ struct ipmr_mfc_iter {
 static struct mfc6_cache *ipmr_mfc_seq_idx(struct net *net,
 					   struct ipmr_mfc_iter *it, loff_t pos)
 {
+	struct mr6_table *mrt = net->ipv6.mrt6;
 	struct mfc6_cache *mfc;
 
 	read_lock(&mrt_lock);
 	for (it->ct = 0; it->ct < MFC6_LINES; it->ct++) {
-		it->cache = &net->ipv6.mfc6_cache_array[it->ct];
+		it->cache = &mrt->mfc6_cache_array[it->ct];
 		list_for_each_entry(mfc, it->cache, list)
 			if (pos-- == 0)
 				return mfc;
@@ -109,7 +128,7 @@ static struct mfc6_cache *ipmr_mfc_seq_idx(struct net *net,
 	read_unlock(&mrt_lock);
 
 	spin_lock_bh(&mfc_unres_lock);
-	it->cache = &net->ipv6.mfc6_unres_queue;
+	it->cache = &mrt->mfc6_unres_queue;
 	list_for_each_entry(mfc, it->cache, list)
 		if (pos-- == 0)
 			return mfc;
@@ -132,11 +151,13 @@ static struct mif_device *ip6mr_vif_seq_idx(struct net *net,
 					    struct ipmr_vif_iter *iter,
 					    loff_t pos)
 {
-	for (iter->ct = 0; iter->ct < net->ipv6.maxvif; ++iter->ct) {
-		if (!MIF_EXISTS(net, iter->ct))
+	struct mr6_table *mrt = net->ipv6.mrt6;
+
+	for (iter->ct = 0; iter->ct < mrt->maxvif; ++iter->ct) {
+		if (!MIF_EXISTS(mrt, iter->ct))
 			continue;
 		if (pos-- == 0)
-			return &net->ipv6.vif6_table[iter->ct];
+			return &mrt->vif6_table[iter->ct];
 	}
 	return NULL;
 }
@@ -155,15 +176,16 @@ static void *ip6mr_vif_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 {
 	struct ipmr_vif_iter *iter = seq->private;
 	struct net *net = seq_file_net(seq);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	++*pos;
 	if (v == SEQ_START_TOKEN)
 		return ip6mr_vif_seq_idx(net, iter, 0);
 
-	while (++iter->ct < net->ipv6.maxvif) {
-		if (!MIF_EXISTS(net, iter->ct))
+	while (++iter->ct < mrt->maxvif) {
+		if (!MIF_EXISTS(mrt, iter->ct))
 			continue;
-		return &net->ipv6.vif6_table[iter->ct];
+		return &mrt->vif6_table[iter->ct];
 	}
 	return NULL;
 }
@@ -177,6 +199,7 @@ static void ip6mr_vif_seq_stop(struct seq_file *seq, void *v)
 static int ip6mr_vif_seq_show(struct seq_file *seq, void *v)
 {
 	struct net *net = seq_file_net(seq);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	if (v == SEQ_START_TOKEN) {
 		seq_puts(seq,
@@ -187,7 +210,7 @@ static int ip6mr_vif_seq_show(struct seq_file *seq, void *v)
 
 		seq_printf(seq,
 			   "%2td %-10s %8ld %7ld  %8ld %7ld %05X\n",
-			   vif - net->ipv6.vif6_table,
+			   vif - mrt->vif6_table,
 			   name, vif->bytes_in, vif->pkt_in,
 			   vif->bytes_out, vif->pkt_out,
 			   vif->flags);
@@ -229,6 +252,7 @@ static void *ipmr_mfc_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 	struct mfc6_cache *mfc = v;
 	struct ipmr_mfc_iter *it = seq->private;
 	struct net *net = seq_file_net(seq);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	++*pos;
 
@@ -238,13 +262,13 @@ static void *ipmr_mfc_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 	if (mfc->list.next != it->cache)
 		return list_entry(mfc->list.next, struct mfc6_cache, list);
 
-	if (it->cache == &net->ipv6.mfc6_unres_queue)
+	if (it->cache == &mrt->mfc6_unres_queue)
 		goto end_of_list;
 
-	BUG_ON(it->cache != &net->ipv6.mfc6_cache_array[it->ct]);
+	BUG_ON(it->cache != &mrt->mfc6_cache_array[it->ct]);
 
 	while (++it->ct < MFC6_LINES) {
-		it->cache = &net->ipv6.mfc6_cache_array[it->ct];
+		it->cache = &mrt->mfc6_cache_array[it->ct];
 		if (list_empty(it->cache))
 			continue;
 		return list_first_entry(it->cache, struct mfc6_cache, list);
@@ -252,7 +276,7 @@ static void *ipmr_mfc_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 
 	/* exhausted cache_array, show unresolved */
 	read_unlock(&mrt_lock);
-	it->cache = &net->ipv6.mfc6_unres_queue;
+	it->cache = &mrt->mfc6_unres_queue;
 	it->ct = 0;
 
 	spin_lock_bh(&mfc_unres_lock);
@@ -270,10 +294,11 @@ static void ipmr_mfc_seq_stop(struct seq_file *seq, void *v)
 {
 	struct ipmr_mfc_iter *it = seq->private;
 	struct net *net = seq_file_net(seq);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
-	if (it->cache == &net->ipv6.mfc6_unres_queue)
+	if (it->cache == &mrt->mfc6_unres_queue)
 		spin_unlock_bh(&mfc_unres_lock);
-	else if (it->cache == net->ipv6.mfc6_cache_array)
+	else if (it->cache == mrt->mfc6_cache_array)
 		read_unlock(&mrt_lock);
 }
 
@@ -281,6 +306,7 @@ static int ipmr_mfc_seq_show(struct seq_file *seq, void *v)
 {
 	int n;
 	struct net *net = seq_file_net(seq);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	if (v == SEQ_START_TOKEN) {
 		seq_puts(seq,
@@ -295,14 +321,14 @@ static int ipmr_mfc_seq_show(struct seq_file *seq, void *v)
 			   &mfc->mf6c_mcastgrp, &mfc->mf6c_origin,
 			   mfc->mf6c_parent);
 
-		if (it->cache != &net->ipv6.mfc6_unres_queue) {
+		if (it->cache != &mrt->mfc6_unres_queue) {
 			seq_printf(seq, " %8lu %8lu %8lu",
 				   mfc->mfc_un.res.pkt,
 				   mfc->mfc_un.res.bytes,
 				   mfc->mfc_un.res.wrong_if);
 			for (n = mfc->mfc_un.res.minvif;
 			     n < mfc->mfc_un.res.maxvif; n++) {
-				if (MIF_EXISTS(net, n) &&
+				if (MIF_EXISTS(mrt, n) &&
 				    mfc->mfc_un.res.ttls[n] < 255)
 					seq_printf(seq,
 						   " %2d:%-3d",
@@ -349,7 +375,8 @@ static int pim6_rcv(struct sk_buff *skb)
 	struct ipv6hdr   *encap;
 	struct net_device  *reg_dev = NULL;
 	struct net *net = dev_net(skb->dev);
-	int reg_vif_num = net->ipv6.mroute_reg_vif_num;
+	struct mr6_table *mrt = net->ipv6.mrt6;
+	int reg_vif_num = mrt->mroute_reg_vif_num;
 
 	if (!pskb_may_pull(skb, sizeof(*pim) + sizeof(*encap)))
 		goto drop;
@@ -374,7 +401,7 @@ static int pim6_rcv(struct sk_buff *skb)
 
 	read_lock(&mrt_lock);
 	if (reg_vif_num >= 0)
-		reg_dev = net->ipv6.vif6_table[reg_vif_num].dev;
+		reg_dev = mrt->vif6_table[reg_vif_num].dev;
 	if (reg_dev)
 		dev_hold(reg_dev);
 	read_unlock(&mrt_lock);
@@ -411,12 +438,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
 				      struct net_device *dev)
 {
 	struct net *net = dev_net(dev);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	read_lock(&mrt_lock);
 	dev->stats.tx_bytes += skb->len;
 	dev->stats.tx_packets++;
-	ip6mr_cache_report(net, skb, net->ipv6.mroute_reg_vif_num,
-			   MRT6MSG_WHOLEPKT);
+	ip6mr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, MRT6MSG_WHOLEPKT);
 	read_unlock(&mrt_lock);
 	kfree_skb(skb);
 	return NETDEV_TX_OK;
@@ -472,15 +499,16 @@ failure:
  *	Delete a VIF entry
  */
 
-static int mif6_delete(struct net *net, int vifi, struct list_head *head)
+static int mif6_delete(struct mr6_table *mrt, int vifi, struct list_head *head)
 {
 	struct mif_device *v;
 	struct net_device *dev;
 	struct inet6_dev *in6_dev;
-	if (vifi < 0 || vifi >= net->ipv6.maxvif)
+
+	if (vifi < 0 || vifi >= mrt->maxvif)
 		return -EADDRNOTAVAIL;
 
-	v = &net->ipv6.vif6_table[vifi];
+	v = &mrt->vif6_table[vifi];
 
 	write_lock_bh(&mrt_lock);
 	dev = v->dev;
@@ -492,17 +520,17 @@ static int mif6_delete(struct net *net, int vifi, struct list_head *head)
 	}
 
 #ifdef CONFIG_IPV6_PIMSM_V2
-	if (vifi == net->ipv6.mroute_reg_vif_num)
-		net->ipv6.mroute_reg_vif_num = -1;
+	if (vifi == mrt->mroute_reg_vif_num)
+		mrt->mroute_reg_vif_num = -1;
 #endif
 
-	if (vifi + 1 == net->ipv6.maxvif) {
+	if (vifi + 1 == mrt->maxvif) {
 		int tmp;
 		for (tmp = vifi - 1; tmp >= 0; tmp--) {
-			if (MIF_EXISTS(net, tmp))
+			if (MIF_EXISTS(mrt, tmp))
 				break;
 		}
-		net->ipv6.maxvif = tmp + 1;
+		mrt->maxvif = tmp + 1;
 	}
 
 	write_unlock_bh(&mrt_lock);
@@ -529,11 +557,12 @@ static inline void ip6mr_cache_free(struct mfc6_cache *c)
    and reporting error to netlink readers.
  */
 
-static void ip6mr_destroy_unres(struct net *net, struct mfc6_cache *c)
+static void ip6mr_destroy_unres(struct mr6_table *mrt, struct mfc6_cache *c)
 {
+	struct net *net = read_pnet(&mrt->net);
 	struct sk_buff *skb;
 
-	atomic_dec(&net->ipv6.cache_resolve_queue_len);
+	atomic_dec(&mrt->cache_resolve_queue_len);
 
 	while((skb = skb_dequeue(&c->mfc_un.unres.unresolved)) != NULL) {
 		if (ipv6_hdr(skb)->version == 0) {
@@ -553,13 +582,13 @@ static void ip6mr_destroy_unres(struct net *net, struct mfc6_cache *c)
 
 /* Timer process for all the unresolved queue. */
 
-static void ipmr_do_expire_process(struct net *net)
+static void ipmr_do_expire_process(struct mr6_table *mrt)
 {
 	unsigned long now = jiffies;
 	unsigned long expires = 10 * HZ;
 	struct mfc6_cache *c, *next;
 
-	list_for_each_entry_safe(c, next, &net->ipv6.mfc6_unres_queue, list) {
+	list_for_each_entry_safe(c, next, &mrt->mfc6_unres_queue, list) {
 		if (time_after(c->mfc_un.unres.expires, now)) {
 			/* not yet... */
 			unsigned long interval = c->mfc_un.unres.expires - now;
@@ -569,31 +598,31 @@ static void ipmr_do_expire_process(struct net *net)
 		}
 
 		list_del(&c->list);
-		ip6mr_destroy_unres(net, c);
+		ip6mr_destroy_unres(mrt, c);
 	}
 
-	if (!list_empty(&net->ipv6.mfc6_unres_queue))
-		mod_timer(&net->ipv6.ipmr_expire_timer, jiffies + expires);
+	if (!list_empty(&mrt->mfc6_unres_queue))
+		mod_timer(&mrt->ipmr_expire_timer, jiffies + expires);
 }
 
 static void ipmr_expire_process(unsigned long arg)
 {
-	struct net *net = (struct net *)arg;
+	struct mr6_table *mrt = (struct mr6_table *)arg;
 
 	if (!spin_trylock(&mfc_unres_lock)) {
-		mod_timer(&net->ipv6.ipmr_expire_timer, jiffies + 1);
+		mod_timer(&mrt->ipmr_expire_timer, jiffies + 1);
 		return;
 	}
 
-	if (!list_empty(&net->ipv6.mfc6_unres_queue))
-		ipmr_do_expire_process(net);
+	if (!list_empty(&mrt->mfc6_unres_queue))
+		ipmr_do_expire_process(mrt);
 
 	spin_unlock(&mfc_unres_lock);
 }
 
 /* Fill oifs list. It is called under write locked mrt_lock. */
 
-static void ip6mr_update_thresholds(struct net *net, struct mfc6_cache *cache,
+static void ip6mr_update_thresholds(struct mr6_table *mrt, struct mfc6_cache *cache,
 				    unsigned char *ttls)
 {
 	int vifi;
@@ -602,8 +631,8 @@ static void ip6mr_update_thresholds(struct net *net, struct mfc6_cache *cache,
 	cache->mfc_un.res.maxvif = 0;
 	memset(cache->mfc_un.res.ttls, 255, MAXMIFS);
 
-	for (vifi = 0; vifi < net->ipv6.maxvif; vifi++) {
-		if (MIF_EXISTS(net, vifi) &&
+	for (vifi = 0; vifi < mrt->maxvif; vifi++) {
+		if (MIF_EXISTS(mrt, vifi) &&
 		    ttls[vifi] && ttls[vifi] < 255) {
 			cache->mfc_un.res.ttls[vifi] = ttls[vifi];
 			if (cache->mfc_un.res.minvif > vifi)
@@ -614,16 +643,17 @@ static void ip6mr_update_thresholds(struct net *net, struct mfc6_cache *cache,
 	}
 }
 
-static int mif6_add(struct net *net, struct mif6ctl *vifc, int mrtsock)
+static int mif6_add(struct net *net, struct mr6_table *mrt,
+		    struct mif6ctl *vifc, int mrtsock)
 {
 	int vifi = vifc->mif6c_mifi;
-	struct mif_device *v = &net->ipv6.vif6_table[vifi];
+	struct mif_device *v = &mrt->vif6_table[vifi];
 	struct net_device *dev;
 	struct inet6_dev *in6_dev;
 	int err;
 
 	/* Is vif busy ? */
-	if (MIF_EXISTS(net, vifi))
+	if (MIF_EXISTS(mrt, vifi))
 		return -EADDRINUSE;
 
 	switch (vifc->mif6c_flags) {
@@ -633,7 +663,7 @@ static int mif6_add(struct net *net, struct mif6ctl *vifc, int mrtsock)
 		 * Special Purpose VIF in PIM
 		 * All the packets will be sent to the daemon
 		 */
-		if (net->ipv6.mroute_reg_vif_num >= 0)
+		if (mrt->mroute_reg_vif_num >= 0)
 			return -EADDRINUSE;
 		dev = ip6mr_reg_vif(net);
 		if (!dev)
@@ -685,22 +715,22 @@ static int mif6_add(struct net *net, struct mif6ctl *vifc, int mrtsock)
 	v->dev = dev;
 #ifdef CONFIG_IPV6_PIMSM_V2
 	if (v->flags & MIFF_REGISTER)
-		net->ipv6.mroute_reg_vif_num = vifi;
+		mrt->mroute_reg_vif_num = vifi;
 #endif
-	if (vifi + 1 > net->ipv6.maxvif)
-		net->ipv6.maxvif = vifi + 1;
+	if (vifi + 1 > mrt->maxvif)
+		mrt->maxvif = vifi + 1;
 	write_unlock_bh(&mrt_lock);
 	return 0;
 }
 
-static struct mfc6_cache *ip6mr_cache_find(struct net *net,
+static struct mfc6_cache *ip6mr_cache_find(struct mr6_table *mrt,
 					   struct in6_addr *origin,
 					   struct in6_addr *mcastgrp)
 {
 	int line = MFC6_HASH(mcastgrp, origin);
 	struct mfc6_cache *c;
 
-	list_for_each_entry(c, &net->ipv6.mfc6_cache_array[line], list) {
+	list_for_each_entry(c, &mrt->mfc6_cache_array[line], list) {
 		if (ipv6_addr_equal(&c->mf6c_origin, origin) &&
 		    ipv6_addr_equal(&c->mf6c_mcastgrp, mcastgrp))
 			return c;
@@ -734,8 +764,8 @@ static struct mfc6_cache *ip6mr_cache_alloc_unres(void)
  *	A cache entry has gone into a resolved state from queued
  */
 
-static void ip6mr_cache_resolve(struct net *net, struct mfc6_cache *uc,
-				struct mfc6_cache *c)
+static void ip6mr_cache_resolve(struct net *net, struct mr6_table *mrt,
+				struct mfc6_cache *uc, struct mfc6_cache *c)
 {
 	struct sk_buff *skb;
 
@@ -748,7 +778,7 @@ static void ip6mr_cache_resolve(struct net *net, struct mfc6_cache *uc,
 			int err;
 			struct nlmsghdr *nlh = (struct nlmsghdr *)skb_pull(skb, sizeof(struct ipv6hdr));
 
-			if (ip6mr_fill_mroute(net, skb, c, NLMSG_DATA(nlh)) > 0) {
+			if (ip6mr_fill_mroute(mrt, skb, c, NLMSG_DATA(nlh)) > 0) {
 				nlh->nlmsg_len = skb_tail_pointer(skb) - (u8 *)nlh;
 			} else {
 				nlh->nlmsg_type = NLMSG_ERROR;
@@ -758,7 +788,7 @@ static void ip6mr_cache_resolve(struct net *net, struct mfc6_cache *uc,
 			}
 			err = rtnl_unicast(skb, net, NETLINK_CB(skb).pid);
 		} else
-			ip6_mr_forward(net, skb, c);
+			ip6_mr_forward(net, mrt, skb, c);
 	}
 }
 
@@ -769,8 +799,8 @@ static void ip6mr_cache_resolve(struct net *net, struct mfc6_cache *uc,
  *	Called under mrt_lock.
  */
 
-static int ip6mr_cache_report(struct net *net, struct sk_buff *pkt, mifi_t mifi,
-			      int assert)
+static int ip6mr_cache_report(struct mr6_table *mrt, struct sk_buff *pkt,
+			      mifi_t mifi, int assert)
 {
 	struct sk_buff *skb;
 	struct mrt6msg *msg;
@@ -806,7 +836,7 @@ static int ip6mr_cache_report(struct net *net, struct sk_buff *pkt, mifi_t mifi,
 		msg = (struct mrt6msg *)skb_transport_header(skb);
 		msg->im6_mbz = 0;
 		msg->im6_msgtype = MRT6MSG_WHOLEPKT;
-		msg->im6_mif = net->ipv6.mroute_reg_vif_num;
+		msg->im6_mif = mrt->mroute_reg_vif_num;
 		msg->im6_pad = 0;
 		ipv6_addr_copy(&msg->im6_src, &ipv6_hdr(pkt)->saddr);
 		ipv6_addr_copy(&msg->im6_dst, &ipv6_hdr(pkt)->daddr);
@@ -841,7 +871,7 @@ static int ip6mr_cache_report(struct net *net, struct sk_buff *pkt, mifi_t mifi,
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
 	}
 
-	if (net->ipv6.mroute6_sk == NULL) {
+	if (mrt->mroute6_sk == NULL) {
 		kfree_skb(skb);
 		return -EINVAL;
 	}
@@ -849,7 +879,7 @@ static int ip6mr_cache_report(struct net *net, struct sk_buff *pkt, mifi_t mifi,
 	/*
 	 *	Deliver to user space multicast routing algorithms
 	 */
-	ret = sock_queue_rcv_skb(net->ipv6.mroute6_sk, skb);
+	ret = sock_queue_rcv_skb(mrt->mroute6_sk, skb);
 	if (ret < 0) {
 		if (net_ratelimit())
 			printk(KERN_WARNING "mroute6: pending queue full, dropping entries.\n");
@@ -864,14 +894,14 @@ static int ip6mr_cache_report(struct net *net, struct sk_buff *pkt, mifi_t mifi,
  */
 
 static int
-ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
+ip6mr_cache_unresolved(struct mr6_table *mrt, mifi_t mifi, struct sk_buff *skb)
 {
 	bool found = false;
 	int err;
 	struct mfc6_cache *c;
 
 	spin_lock_bh(&mfc_unres_lock);
-	list_for_each_entry(c, &net->ipv6.mfc6_unres_queue, list) {
+	list_for_each_entry(c, &mrt->mfc6_unres_queue, list) {
 		if (ipv6_addr_equal(&c->mf6c_mcastgrp, &ipv6_hdr(skb)->daddr) &&
 		    ipv6_addr_equal(&c->mf6c_origin, &ipv6_hdr(skb)->saddr)) {
 			found = true;
@@ -884,7 +914,7 @@ ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
 		 *	Create a new entry if allowable
 		 */
 
-		if (atomic_read(&net->ipv6.cache_resolve_queue_len) >= 10 ||
+		if (atomic_read(&mrt->cache_resolve_queue_len) >= 10 ||
 		    (c = ip6mr_cache_alloc_unres()) == NULL) {
 			spin_unlock_bh(&mfc_unres_lock);
 
@@ -902,7 +932,7 @@ ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
 		/*
 		 *	Reflect first query at pim6sd
 		 */
-		err = ip6mr_cache_report(net, skb, mifi, MRT6MSG_NOCACHE);
+		err = ip6mr_cache_report(mrt, skb, mifi, MRT6MSG_NOCACHE);
 		if (err < 0) {
 			/* If the report failed throw the cache entry
 			   out - Brad Parker
@@ -914,10 +944,10 @@ ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
 			return err;
 		}
 
-		atomic_inc(&net->ipv6.cache_resolve_queue_len);
-		list_add(&c->list, &net->ipv6.mfc6_unres_queue);
+		atomic_inc(&mrt->cache_resolve_queue_len);
+		list_add(&c->list, &mrt->mfc6_unres_queue);
 
-		ipmr_do_expire_process(net);
+		ipmr_do_expire_process(mrt);
 	}
 
 	/*
@@ -939,14 +969,14 @@ ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
  *	MFC6 cache manipulation by user space
  */
 
-static int ip6mr_mfc_delete(struct net *net, struct mf6cctl *mfc)
+static int ip6mr_mfc_delete(struct mr6_table *mrt, struct mf6cctl *mfc)
 {
 	int line;
 	struct mfc6_cache *c, *next;
 
 	line = MFC6_HASH(&mfc->mf6cc_mcastgrp.sin6_addr, &mfc->mf6cc_origin.sin6_addr);
 
-	list_for_each_entry_safe(c, next, &net->ipv6.mfc6_cache_array[line], list) {
+	list_for_each_entry_safe(c, next, &mrt->mfc6_cache_array[line], list) {
 		if (ipv6_addr_equal(&c->mf6c_origin, &mfc->mf6cc_origin.sin6_addr) &&
 		    ipv6_addr_equal(&c->mf6c_mcastgrp, &mfc->mf6cc_mcastgrp.sin6_addr)) {
 			write_lock_bh(&mrt_lock);
@@ -965,6 +995,7 @@ static int ip6mr_device_event(struct notifier_block *this,
 {
 	struct net_device *dev = ptr;
 	struct net *net = dev_net(dev);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 	struct mif_device *v;
 	int ct;
 	LIST_HEAD(list);
@@ -972,10 +1003,10 @@ static int ip6mr_device_event(struct notifier_block *this,
 	if (event != NETDEV_UNREGISTER)
 		return NOTIFY_DONE;
 
-	v = &net->ipv6.vif6_table[0];
-	for (ct = 0; ct < net->ipv6.maxvif; ct++, v++) {
+	v = &mrt->vif6_table[0];
+	for (ct = 0; ct < mrt->maxvif; ct++, v++) {
 		if (v->dev == dev)
-			mif6_delete(net, ct, &list);
+			mif6_delete(mrt, ct, &list);
 	}
 	unregister_netdevice_many(&list);
 
@@ -992,35 +1023,28 @@ static struct notifier_block ip6_mr_notifier = {
 
 static int __net_init ip6mr_net_init(struct net *net)
 {
+	struct mr6_table *mrt;
 	unsigned int i;
 	int err = 0;
 
-	net->ipv6.vif6_table = kcalloc(MAXMIFS, sizeof(struct mif_device),
-				       GFP_KERNEL);
-	if (!net->ipv6.vif6_table) {
+	mrt = kzalloc(sizeof(*mrt), GFP_KERNEL);
+	if (mrt == NULL) {
 		err = -ENOMEM;
 		goto fail;
 	}
 
-	/* Forwarding cache */
-	net->ipv6.mfc6_cache_array = kcalloc(MFC6_LINES,
-					     sizeof(struct list_head),
-					     GFP_KERNEL);
-	if (!net->ipv6.mfc6_cache_array) {
-		err = -ENOMEM;
-		goto fail_mfc6_cache;
-	}
+	write_pnet(&mrt->net, net);
 
 	for (i = 0; i < MFC6_LINES; i++)
-		INIT_LIST_HEAD(&net->ipv6.mfc6_cache_array[i]);
+		INIT_LIST_HEAD(&mrt->mfc6_cache_array[i]);
 
-	INIT_LIST_HEAD(&net->ipv6.mfc6_unres_queue);
+	INIT_LIST_HEAD(&mrt->mfc6_unres_queue);
 
-	setup_timer(&net->ipv6.ipmr_expire_timer, ipmr_expire_process,
-		    (unsigned long)net);
+	setup_timer(&mrt->ipmr_expire_timer, ipmr_expire_process,
+		    (unsigned long)mrt);
 
 #ifdef CONFIG_IPV6_PIMSM_V2
-	net->ipv6.mroute_reg_vif_num = -1;
+	mrt->mroute_reg_vif_num = -1;
 #endif
 
 #ifdef CONFIG_PROC_FS
@@ -1030,30 +1054,31 @@ static int __net_init ip6mr_net_init(struct net *net)
 	if (!proc_net_fops_create(net, "ip6_mr_cache", 0, &ip6mr_mfc_fops))
 		goto proc_cache_fail;
 #endif
+
+	net->ipv6.mrt6 = mrt;
 	return 0;
 
 #ifdef CONFIG_PROC_FS
 proc_cache_fail:
 	proc_net_remove(net, "ip6_mr_vif");
 proc_vif_fail:
-	kfree(net->ipv6.mfc6_cache_array);
+	kfree(mrt);
 #endif
-fail_mfc6_cache:
-	kfree(net->ipv6.vif6_table);
 fail:
 	return err;
 }
 
 static void __net_exit ip6mr_net_exit(struct net *net)
 {
+	struct mr6_table *mrt = net->ipv6.mrt6;
+
 #ifdef CONFIG_PROC_FS
 	proc_net_remove(net, "ip6_mr_cache");
 	proc_net_remove(net, "ip6_mr_vif");
 #endif
-	del_timer(&net->ipv6.ipmr_expire_timer);
-	mroute_clean_tables(net);
-	kfree(net->ipv6.mfc6_cache_array);
-	kfree(net->ipv6.vif6_table);
+	del_timer(&mrt->ipmr_expire_timer);
+	mroute_clean_tables(mrt);
+	kfree(mrt);
 }
 
 static struct pernet_operations ip6mr_net_ops = {
@@ -1105,7 +1130,8 @@ void ip6_mr_cleanup(void)
 	kmem_cache_destroy(mrt_cachep);
 }
 
-static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
+static int ip6mr_mfc_add(struct net *net, struct mr6_table *mrt,
+			 struct mf6cctl *mfc, int mrtsock)
 {
 	bool found = false;
 	int line;
@@ -1125,7 +1151,7 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 
 	line = MFC6_HASH(&mfc->mf6cc_mcastgrp.sin6_addr, &mfc->mf6cc_origin.sin6_addr);
 
-	list_for_each_entry(c, &net->ipv6.mfc6_cache_array[line], list) {
+	list_for_each_entry(c, &mrt->mfc6_cache_array[line], list) {
 		if (ipv6_addr_equal(&c->mf6c_origin, &mfc->mf6cc_origin.sin6_addr) &&
 		    ipv6_addr_equal(&c->mf6c_mcastgrp, &mfc->mf6cc_mcastgrp.sin6_addr)) {
 			found = true;
@@ -1136,7 +1162,7 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 	if (found) {
 		write_lock_bh(&mrt_lock);
 		c->mf6c_parent = mfc->mf6cc_parent;
-		ip6mr_update_thresholds(net, c, ttls);
+		ip6mr_update_thresholds(mrt, c, ttls);
 		if (!mrtsock)
 			c->mfc_flags |= MFC_STATIC;
 		write_unlock_bh(&mrt_lock);
@@ -1153,12 +1179,12 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 	c->mf6c_origin = mfc->mf6cc_origin.sin6_addr;
 	c->mf6c_mcastgrp = mfc->mf6cc_mcastgrp.sin6_addr;
 	c->mf6c_parent = mfc->mf6cc_parent;
-	ip6mr_update_thresholds(net, c, ttls);
+	ip6mr_update_thresholds(mrt, c, ttls);
 	if (!mrtsock)
 		c->mfc_flags |= MFC_STATIC;
 
 	write_lock_bh(&mrt_lock);
-	list_add(&c->list, &net->ipv6.mfc6_cache_array[line]);
+	list_add(&c->list, &mrt->mfc6_cache_array[line]);
 	write_unlock_bh(&mrt_lock);
 
 	/*
@@ -1167,21 +1193,21 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 	 */
 	found = false;
 	spin_lock_bh(&mfc_unres_lock);
-	list_for_each_entry(uc, &net->ipv6.mfc6_unres_queue, list) {
+	list_for_each_entry(uc, &mrt->mfc6_unres_queue, list) {
 		if (ipv6_addr_equal(&uc->mf6c_origin, &c->mf6c_origin) &&
 		    ipv6_addr_equal(&uc->mf6c_mcastgrp, &c->mf6c_mcastgrp)) {
 			list_del(&uc->list);
-			atomic_dec(&net->ipv6.cache_resolve_queue_len);
+			atomic_dec(&mrt->cache_resolve_queue_len);
 			found = true;
 			break;
 		}
 	}
-	if (list_empty(&net->ipv6.mfc6_unres_queue))
-		del_timer(&net->ipv6.ipmr_expire_timer);
+	if (list_empty(&mrt->mfc6_unres_queue))
+		del_timer(&mrt->ipmr_expire_timer);
 	spin_unlock_bh(&mfc_unres_lock);
 
 	if (found) {
-		ip6mr_cache_resolve(net, uc, c);
+		ip6mr_cache_resolve(net, mrt, uc, c);
 		ip6mr_cache_free(uc);
 	}
 	return 0;
@@ -1191,7 +1217,7 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
  *	Close the multicast socket, and clear the vif tables etc
  */
 
-static void mroute_clean_tables(struct net *net)
+static void mroute_clean_tables(struct mr6_table *mrt)
 {
 	int i;
 	LIST_HEAD(list);
@@ -1200,9 +1226,9 @@ static void mroute_clean_tables(struct net *net)
 	/*
 	 *	Shut down all active vif entries
 	 */
-	for (i = 0; i < net->ipv6.maxvif; i++) {
-		if (!(net->ipv6.vif6_table[i].flags & VIFF_STATIC))
-			mif6_delete(net, i, &list);
+	for (i = 0; i < mrt->maxvif; i++) {
+		if (!(mrt->vif6_table[i].flags & VIFF_STATIC))
+			mif6_delete(mrt, i, &list);
 	}
 	unregister_netdevice_many(&list);
 
@@ -1210,7 +1236,7 @@ static void mroute_clean_tables(struct net *net)
 	 *	Wipe the cache
 	 */
 	for (i = 0; i < MFC6_LINES; i++) {
-		list_for_each_entry_safe(c, next, &net->ipv6.mfc6_cache_array[i], list) {
+		list_for_each_entry_safe(c, next, &mrt->mfc6_cache_array[i], list) {
 			if (c->mfc_flags & MFC_STATIC)
 				continue;
 			write_lock_bh(&mrt_lock);
@@ -1221,25 +1247,25 @@ static void mroute_clean_tables(struct net *net)
 		}
 	}
 
-	if (atomic_read(&net->ipv6.cache_resolve_queue_len) != 0) {
+	if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
 		spin_lock_bh(&mfc_unres_lock);
-		list_for_each_entry_safe(c, next, &net->ipv6.mfc6_unres_queue, list) {
+		list_for_each_entry_safe(c, next, &mrt->mfc6_unres_queue, list) {
 			list_del(&c->list);
-			ip6mr_destroy_unres(net, c);
+			ip6mr_destroy_unres(mrt, c);
 		}
 		spin_unlock_bh(&mfc_unres_lock);
 	}
 }
 
-static int ip6mr_sk_init(struct sock *sk)
+static int ip6mr_sk_init(struct mr6_table *mrt, struct sock *sk)
 {
 	int err = 0;
 	struct net *net = sock_net(sk);
 
 	rtnl_lock();
 	write_lock_bh(&mrt_lock);
-	if (likely(net->ipv6.mroute6_sk == NULL)) {
-		net->ipv6.mroute6_sk = sk;
+	if (likely(mrt->mroute6_sk == NULL)) {
+		mrt->mroute6_sk = sk;
 		net->ipv6.devconf_all->mc_forwarding++;
 	}
 	else
@@ -1255,15 +1281,16 @@ int ip6mr_sk_done(struct sock *sk)
 {
 	int err = 0;
 	struct net *net = sock_net(sk);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	rtnl_lock();
-	if (sk == net->ipv6.mroute6_sk) {
+	if (sk == mrt->mroute6_sk) {
 		write_lock_bh(&mrt_lock);
-		net->ipv6.mroute6_sk = NULL;
+		mrt->mroute6_sk = NULL;
 		net->ipv6.devconf_all->mc_forwarding--;
 		write_unlock_bh(&mrt_lock);
 
-		mroute_clean_tables(net);
+		mroute_clean_tables(mrt);
 	} else
 		err = -EACCES;
 	rtnl_unlock();
@@ -1271,6 +1298,13 @@ int ip6mr_sk_done(struct sock *sk)
 	return err;
 }
 
+struct sock *mroute6_socket(struct net *net)
+{
+	struct mr6_table *mrt = net->ipv6.mrt6;
+
+	return mrt->mroute6_sk;
+}
+
 /*
  *	Socket options and virtual interface manipulation. The whole
  *	virtual interface system is a complete heap, but unfortunately
@@ -1285,9 +1319,10 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
 	struct mf6cctl mfc;
 	mifi_t mifi;
 	struct net *net = sock_net(sk);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	if (optname != MRT6_INIT) {
-		if (sk != net->ipv6.mroute6_sk && !capable(CAP_NET_ADMIN))
+		if (sk != mrt->mroute6_sk && !capable(CAP_NET_ADMIN))
 			return -EACCES;
 	}
 
@@ -1299,7 +1334,7 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
 		if (optlen < sizeof(int))
 			return -EINVAL;
 
-		return ip6mr_sk_init(sk);
+		return ip6mr_sk_init(mrt, sk);
 
 	case MRT6_DONE:
 		return ip6mr_sk_done(sk);
@@ -1312,7 +1347,7 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
 		if (vif.mif6c_mifi >= MAXMIFS)
 			return -ENFILE;
 		rtnl_lock();
-		ret = mif6_add(net, &vif, sk == net->ipv6.mroute6_sk);
+		ret = mif6_add(net, mrt, &vif, sk == mrt->mroute6_sk);
 		rtnl_unlock();
 		return ret;
 
@@ -1322,7 +1357,7 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
 		if (copy_from_user(&mifi, optval, sizeof(mifi_t)))
 			return -EFAULT;
 		rtnl_lock();
-		ret = mif6_delete(net, mifi, NULL);
+		ret = mif6_delete(mrt, mifi, NULL);
 		rtnl_unlock();
 		return ret;
 
@@ -1338,10 +1373,9 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
 			return -EFAULT;
 		rtnl_lock();
 		if (optname == MRT6_DEL_MFC)
-			ret = ip6mr_mfc_delete(net, &mfc);
+			ret = ip6mr_mfc_delete(mrt, &mfc);
 		else
-			ret = ip6mr_mfc_add(net, &mfc,
-					    sk == net->ipv6.mroute6_sk);
+			ret = ip6mr_mfc_add(net, mrt, &mfc, sk == mrt->mroute6_sk);
 		rtnl_unlock();
 		return ret;
 
@@ -1353,7 +1387,7 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
 		int v;
 		if (get_user(v, (int __user *)optval))
 			return -EFAULT;
-		net->ipv6.mroute_do_assert = !!v;
+		mrt->mroute_do_assert = !!v;
 		return 0;
 	}
 
@@ -1366,9 +1400,9 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
 		v = !!v;
 		rtnl_lock();
 		ret = 0;
-		if (v != net->ipv6.mroute_do_pim) {
-			net->ipv6.mroute_do_pim = v;
-			net->ipv6.mroute_do_assert = v;
+		if (v != mrt->mroute_do_pim) {
+			mrt->mroute_do_pim = v;
+			mrt->mroute_do_assert = v;
 		}
 		rtnl_unlock();
 		return ret;
@@ -1394,6 +1428,7 @@ int ip6_mroute_getsockopt(struct sock *sk, int optname, char __user *optval,
 	int olr;
 	int val;
 	struct net *net = sock_net(sk);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	switch (optname) {
 	case MRT6_VERSION:
@@ -1401,11 +1436,11 @@ int ip6_mroute_getsockopt(struct sock *sk, int optname, char __user *optval,
 		break;
 #ifdef CONFIG_IPV6_PIMSM_V2
 	case MRT6_PIM:
-		val = net->ipv6.mroute_do_pim;
+		val = mrt->mroute_do_pim;
 		break;
 #endif
 	case MRT6_ASSERT:
-		val = net->ipv6.mroute_do_assert;
+		val = mrt->mroute_do_assert;
 		break;
 	default:
 		return -ENOPROTOOPT;
@@ -1436,16 +1471,17 @@ int ip6mr_ioctl(struct sock *sk, int cmd, void __user *arg)
 	struct mif_device *vif;
 	struct mfc6_cache *c;
 	struct net *net = sock_net(sk);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	switch (cmd) {
 	case SIOCGETMIFCNT_IN6:
 		if (copy_from_user(&vr, arg, sizeof(vr)))
 			return -EFAULT;
-		if (vr.mifi >= net->ipv6.maxvif)
+		if (vr.mifi >= mrt->maxvif)
 			return -EINVAL;
 		read_lock(&mrt_lock);
-		vif = &net->ipv6.vif6_table[vr.mifi];
-		if (MIF_EXISTS(net, vr.mifi)) {
+		vif = &mrt->vif6_table[vr.mifi];
+		if (MIF_EXISTS(mrt, vr.mifi)) {
 			vr.icount = vif->pkt_in;
 			vr.ocount = vif->pkt_out;
 			vr.ibytes = vif->bytes_in;
@@ -1463,7 +1499,7 @@ int ip6mr_ioctl(struct sock *sk, int cmd, void __user *arg)
 			return -EFAULT;
 
 		read_lock(&mrt_lock);
-		c = ip6mr_cache_find(net, &sr.src.sin6_addr, &sr.grp.sin6_addr);
+		c = ip6mr_cache_find(mrt, &sr.src.sin6_addr, &sr.grp.sin6_addr);
 		if (c) {
 			sr.pktcnt = c->mfc_un.res.pkt;
 			sr.bytecnt = c->mfc_un.res.bytes;
@@ -1493,11 +1529,11 @@ static inline int ip6mr_forward2_finish(struct sk_buff *skb)
  *	Processing handlers for ip6mr_forward
  */
 
-static int ip6mr_forward2(struct net *net, struct sk_buff *skb,
-			  struct mfc6_cache *c, int vifi)
+static int ip6mr_forward2(struct net *net, struct mr6_table *mrt,
+			  struct sk_buff *skb, struct mfc6_cache *c, int vifi)
 {
 	struct ipv6hdr *ipv6h;
-	struct mif_device *vif = &net->ipv6.vif6_table[vifi];
+	struct mif_device *vif = &mrt->vif6_table[vifi];
 	struct net_device *dev;
 	struct dst_entry *dst;
 	struct flowi fl;
@@ -1511,7 +1547,7 @@ static int ip6mr_forward2(struct net *net, struct sk_buff *skb,
 		vif->bytes_out += skb->len;
 		vif->dev->stats.tx_bytes += skb->len;
 		vif->dev->stats.tx_packets++;
-		ip6mr_cache_report(net, skb, vifi, MRT6MSG_WHOLEPKT);
+		ip6mr_cache_report(mrt, skb, vifi, MRT6MSG_WHOLEPKT);
 		goto out_free;
 	}
 #endif
@@ -1566,19 +1602,19 @@ out_free:
 	return 0;
 }
 
-static int ip6mr_find_vif(struct net_device *dev)
+static int ip6mr_find_vif(struct mr6_table *mrt, struct net_device *dev)
 {
-	struct net *net = dev_net(dev);
 	int ct;
-	for (ct = net->ipv6.maxvif - 1; ct >= 0; ct--) {
-		if (net->ipv6.vif6_table[ct].dev == dev)
+
+	for (ct = mrt->maxvif - 1; ct >= 0; ct--) {
+		if (mrt->vif6_table[ct].dev == dev)
 			break;
 	}
 	return ct;
 }
 
-static int ip6_mr_forward(struct net *net, struct sk_buff *skb,
-			  struct mfc6_cache *cache)
+static int ip6_mr_forward(struct net *net, struct mr6_table *mrt,
+			  struct sk_buff *skb, struct mfc6_cache *cache)
 {
 	int psend = -1;
 	int vif, ct;
@@ -1590,30 +1626,30 @@ static int ip6_mr_forward(struct net *net, struct sk_buff *skb,
 	/*
 	 * Wrong interface: drop packet and (maybe) send PIM assert.
 	 */
-	if (net->ipv6.vif6_table[vif].dev != skb->dev) {
+	if (mrt->vif6_table[vif].dev != skb->dev) {
 		int true_vifi;
 
 		cache->mfc_un.res.wrong_if++;
-		true_vifi = ip6mr_find_vif(skb->dev);
+		true_vifi = ip6mr_find_vif(mrt, skb->dev);
 
-		if (true_vifi >= 0 && net->ipv6.mroute_do_assert &&
+		if (true_vifi >= 0 && mrt->mroute_do_assert &&
 		    /* pimsm uses asserts, when switching from RPT to SPT,
 		       so that we cannot check that packet arrived on an oif.
 		       It is bad, but otherwise we would need to move pretty
 		       large chunk of pimd to kernel. Ough... --ANK
 		     */
-		    (net->ipv6.mroute_do_pim ||
+		    (mrt->mroute_do_pim ||
 		     cache->mfc_un.res.ttls[true_vifi] < 255) &&
 		    time_after(jiffies,
 			       cache->mfc_un.res.last_assert + MFC_ASSERT_THRESH)) {
 			cache->mfc_un.res.last_assert = jiffies;
-			ip6mr_cache_report(net, skb, true_vifi, MRT6MSG_WRONGMIF);
+			ip6mr_cache_report(mrt, skb, true_vifi, MRT6MSG_WRONGMIF);
 		}
 		goto dont_forward;
 	}
 
-	net->ipv6.vif6_table[vif].pkt_in++;
-	net->ipv6.vif6_table[vif].bytes_in += skb->len;
+	mrt->vif6_table[vif].pkt_in++;
+	mrt->vif6_table[vif].bytes_in += skb->len;
 
 	/*
 	 *	Forward the frame
@@ -1623,13 +1659,13 @@ static int ip6_mr_forward(struct net *net, struct sk_buff *skb,
 			if (psend != -1) {
 				struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC);
 				if (skb2)
-					ip6mr_forward2(net, skb2, cache, psend);
+					ip6mr_forward2(net, mrt, skb2, cache, psend);
 			}
 			psend = ct;
 		}
 	}
 	if (psend != -1) {
-		ip6mr_forward2(net, skb, cache, psend);
+		ip6mr_forward2(net, mrt, skb, cache, psend);
 		return 0;
 	}
 
@@ -1647,9 +1683,10 @@ int ip6_mr_input(struct sk_buff *skb)
 {
 	struct mfc6_cache *cache;
 	struct net *net = dev_net(skb->dev);
+	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	read_lock(&mrt_lock);
-	cache = ip6mr_cache_find(net,
+	cache = ip6mr_cache_find(mrt,
 				 &ipv6_hdr(skb)->saddr, &ipv6_hdr(skb)->daddr);
 
 	/*
@@ -1658,9 +1695,9 @@ int ip6_mr_input(struct sk_buff *skb)
 	if (cache == NULL) {
 		int vif;
 
-		vif = ip6mr_find_vif(skb->dev);
+		vif = ip6mr_find_vif(mrt, skb->dev);
 		if (vif >= 0) {
-			int err = ip6mr_cache_unresolved(net, vif, skb);
+			int err = ip6mr_cache_unresolved(mrt, vif, skb);
 			read_unlock(&mrt_lock);
 
 			return err;
@@ -1670,7 +1707,7 @@ int ip6_mr_input(struct sk_buff *skb)
 		return -ENODEV;
 	}
 
-	ip6_mr_forward(net, skb, cache);
+	ip6_mr_forward(net, mrt, skb, cache);
 
 	read_unlock(&mrt_lock);
 
@@ -1679,8 +1716,8 @@ int ip6_mr_input(struct sk_buff *skb)
 
 
 static int
-ip6mr_fill_mroute(struct net *net, struct sk_buff *skb, struct mfc6_cache *c,
-		  struct rtmsg *rtm)
+ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
+		  struct mfc6_cache *c, struct rtmsg *rtm)
 {
 	int ct;
 	struct rtnexthop *nhp;
@@ -1691,19 +1728,19 @@ ip6mr_fill_mroute(struct net *net, struct sk_buff *skb, struct mfc6_cache *c,
 	if (c->mf6c_parent > MAXMIFS)
 		return -ENOENT;
 
-	if (MIF_EXISTS(net, c->mf6c_parent))
-		RTA_PUT(skb, RTA_IIF, 4, &net->ipv6.vif6_table[c->mf6c_parent].dev->ifindex);
+	if (MIF_EXISTS(mrt, c->mf6c_parent))
+		RTA_PUT(skb, RTA_IIF, 4, &mrt->vif6_table[c->mf6c_parent].dev->ifindex);
 
 	mp_head = (struct rtattr *)skb_put(skb, RTA_LENGTH(0));
 
 	for (ct = c->mfc_un.res.minvif; ct < c->mfc_un.res.maxvif; ct++) {
-		if (MIF_EXISTS(net, ct) && c->mfc_un.res.ttls[ct] < 255) {
+		if (MIF_EXISTS(mrt, ct) && c->mfc_un.res.ttls[ct] < 255) {
 			if (skb_tailroom(skb) < RTA_ALIGN(RTA_ALIGN(sizeof(*nhp)) + 4))
 				goto rtattr_failure;
 			nhp = (struct rtnexthop *)skb_put(skb, RTA_ALIGN(sizeof(*nhp)));
 			nhp->rtnh_flags = 0;
 			nhp->rtnh_hops = c->mfc_un.res.ttls[ct];
-			nhp->rtnh_ifindex = net->ipv6.vif6_table[ct].dev->ifindex;
+			nhp->rtnh_ifindex = mrt->vif6_table[ct].dev->ifindex;
 			nhp->rtnh_len = sizeof(*nhp);
 		}
 	}
@@ -1721,11 +1758,12 @@ int ip6mr_get_route(struct net *net,
 		    struct sk_buff *skb, struct rtmsg *rtm, int nowait)
 {
 	int err;
+	struct mr6_table *mrt = net->ipv6.mrt6;
 	struct mfc6_cache *cache;
 	struct rt6_info *rt = (struct rt6_info *)skb_dst(skb);
 
 	read_lock(&mrt_lock);
-	cache = ip6mr_cache_find(net, &rt->rt6i_src.addr, &rt->rt6i_dst.addr);
+	cache = ip6mr_cache_find(mrt, &rt->rt6i_src.addr, &rt->rt6i_dst.addr);
 
 	if (!cache) {
 		struct sk_buff *skb2;
@@ -1739,7 +1777,7 @@ int ip6mr_get_route(struct net *net,
 		}
 
 		dev = skb->dev;
-		if (dev == NULL || (vif = ip6mr_find_vif(dev)) < 0) {
+		if (dev == NULL || (vif = ip6mr_find_vif(mrt, dev)) < 0) {
 			read_unlock(&mrt_lock);
 			return -ENODEV;
 		}
@@ -1768,7 +1806,7 @@ int ip6mr_get_route(struct net *net,
 		ipv6_addr_copy(&iph->saddr, &rt->rt6i_src.addr);
 		ipv6_addr_copy(&iph->daddr, &rt->rt6i_dst.addr);
 
-		err = ip6mr_cache_unresolved(net, vif, skb2);
+		err = ip6mr_cache_unresolved(mrt, vif, skb2);
 		read_unlock(&mrt_lock);
 
 		return err;
@@ -1777,7 +1815,7 @@ int ip6mr_get_route(struct net *net,
 	if (!nowait && (rtm->rtm_flags&RTM_F_NOTIFY))
 		cache->mfc_flags |= MFC_NOTIFY;
 
-	err = ip6mr_fill_mroute(net, skb, cache, rtm);
+	err = ip6mr_fill_mroute(mrt, skb, cache, rtm);
 	read_unlock(&mrt_lock);
 	return err;
 }
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 3/6] ipv6: ip6mr: convert struct mfc_cache to struct list_head
From: kaber @ 2010-05-11 14:02 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1273586551-3521-1-git-send-email-kaber@trash.net>

From: Patrick McHardy <kaber@trash.net>

Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 include/linux/mroute6.h  |    2 +-
 include/net/netns/ipv6.h |    4 +-
 net/ipv6/ip6mr.c         |  127 ++++++++++++++++++++++-----------------------
 3 files changed, 65 insertions(+), 68 deletions(-)

diff --git a/include/linux/mroute6.h b/include/linux/mroute6.h
index 04e2e54..94a0cb5 100644
--- a/include/linux/mroute6.h
+++ b/include/linux/mroute6.h
@@ -182,7 +182,7 @@ struct mif_device {
 #define VIFF_STATIC 0x8000
 
 struct mfc6_cache {
-	struct mfc6_cache *next;		/* Next entry on cache line 	*/
+	struct list_head list;
 	struct in6_addr mf6c_mcastgrp;			/* Group the entry belongs to 	*/
 	struct in6_addr mf6c_origin;			/* Source of packet 		*/
 	mifi_t mf6c_parent;			/* Source interface		*/
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 43d842a..9cb3b5f 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -61,8 +61,8 @@ struct netns_ipv6 {
 #ifdef CONFIG_IPV6_MROUTE
 	struct sock		*mroute6_sk;
 	struct timer_list	ipmr_expire_timer;
-	struct mfc6_cache	*mfc6_unres_queue;
-	struct mfc6_cache	**mfc6_cache_array;
+	struct list_head	mfc6_unres_queue;
+	struct list_head	*mfc6_cache_array;
 	struct mif_device	*vif6_table;
 	int			maxvif;
 	atomic_t		cache_resolve_queue_len;
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index b3783a4..08e0904 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -89,7 +89,7 @@ static void mroute_clean_tables(struct net *net);
 
 struct ipmr_mfc_iter {
 	struct seq_net_private p;
-	struct mfc6_cache **cache;
+	struct list_head *cache;
 	int ct;
 };
 
@@ -99,18 +99,18 @@ static struct mfc6_cache *ipmr_mfc_seq_idx(struct net *net,
 {
 	struct mfc6_cache *mfc;
 
-	it->cache = net->ipv6.mfc6_cache_array;
 	read_lock(&mrt_lock);
-	for (it->ct = 0; it->ct < MFC6_LINES; it->ct++)
-		for (mfc = net->ipv6.mfc6_cache_array[it->ct];
-		     mfc; mfc = mfc->next)
+	for (it->ct = 0; it->ct < MFC6_LINES; it->ct++) {
+		it->cache = &net->ipv6.mfc6_cache_array[it->ct];
+		list_for_each_entry(mfc, it->cache, list)
 			if (pos-- == 0)
 				return mfc;
+	}
 	read_unlock(&mrt_lock);
 
-	it->cache = &net->ipv6.mfc6_unres_queue;
 	spin_lock_bh(&mfc_unres_lock);
-	for (mfc = net->ipv6.mfc6_unres_queue; mfc; mfc = mfc->next)
+	it->cache = &net->ipv6.mfc6_unres_queue;
+	list_for_each_entry(mfc, it->cache, list)
 		if (pos-- == 0)
 			return mfc;
 	spin_unlock_bh(&mfc_unres_lock);
@@ -119,9 +119,6 @@ static struct mfc6_cache *ipmr_mfc_seq_idx(struct net *net,
 	return NULL;
 }
 
-
-
-
 /*
  *	The /proc interfaces to multicast routing /proc/ip6_mr_cache /proc/ip6_mr_vif
  */
@@ -238,18 +235,19 @@ static void *ipmr_mfc_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 	if (v == SEQ_START_TOKEN)
 		return ipmr_mfc_seq_idx(net, seq->private, 0);
 
-	if (mfc->next)
-		return mfc->next;
+	if (mfc->list.next != it->cache)
+		return list_entry(mfc->list.next, struct mfc6_cache, list);
 
 	if (it->cache == &net->ipv6.mfc6_unres_queue)
 		goto end_of_list;
 
-	BUG_ON(it->cache != net->ipv6.mfc6_cache_array);
+	BUG_ON(it->cache != &net->ipv6.mfc6_cache_array[it->ct]);
 
 	while (++it->ct < MFC6_LINES) {
-		mfc = net->ipv6.mfc6_cache_array[it->ct];
-		if (mfc)
-			return mfc;
+		it->cache = &net->ipv6.mfc6_cache_array[it->ct];
+		if (list_empty(it->cache))
+			continue;
+		return list_first_entry(it->cache, struct mfc6_cache, list);
 	}
 
 	/* exhausted cache_array, show unresolved */
@@ -258,9 +256,8 @@ static void *ipmr_mfc_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 	it->ct = 0;
 
 	spin_lock_bh(&mfc_unres_lock);
-	mfc = net->ipv6.mfc6_unres_queue;
-	if (mfc)
-		return mfc;
+	if (!list_empty(it->cache))
+		return list_first_entry(it->cache, struct mfc6_cache, list);
 
  end_of_list:
 	spin_unlock_bh(&mfc_unres_lock);
@@ -560,25 +557,22 @@ static void ipmr_do_expire_process(struct net *net)
 {
 	unsigned long now = jiffies;
 	unsigned long expires = 10 * HZ;
-	struct mfc6_cache *c, **cp;
-
-	cp = &net->ipv6.mfc6_unres_queue;
+	struct mfc6_cache *c, *next;
 
-	while ((c = *cp) != NULL) {
+	list_for_each_entry_safe(c, next, &net->ipv6.mfc6_unres_queue, list) {
 		if (time_after(c->mfc_un.unres.expires, now)) {
 			/* not yet... */
 			unsigned long interval = c->mfc_un.unres.expires - now;
 			if (interval < expires)
 				expires = interval;
-			cp = &c->next;
 			continue;
 		}
 
-		*cp = c->next;
+		list_del(&c->list);
 		ip6mr_destroy_unres(net, c);
 	}
 
-	if (net->ipv6.mfc6_unres_queue != NULL)
+	if (!list_empty(&net->ipv6.mfc6_unres_queue))
 		mod_timer(&net->ipv6.ipmr_expire_timer, jiffies + expires);
 }
 
@@ -591,7 +585,7 @@ static void ipmr_expire_process(unsigned long arg)
 		return;
 	}
 
-	if (net->ipv6.mfc6_unres_queue != NULL)
+	if (!list_empty(&net->ipv6.mfc6_unres_queue))
 		ipmr_do_expire_process(net);
 
 	spin_unlock(&mfc_unres_lock);
@@ -706,12 +700,12 @@ static struct mfc6_cache *ip6mr_cache_find(struct net *net,
 	int line = MFC6_HASH(mcastgrp, origin);
 	struct mfc6_cache *c;
 
-	for (c = net->ipv6.mfc6_cache_array[line]; c; c = c->next) {
+	list_for_each_entry(c, &net->ipv6.mfc6_cache_array[line], list) {
 		if (ipv6_addr_equal(&c->mf6c_origin, origin) &&
 		    ipv6_addr_equal(&c->mf6c_mcastgrp, mcastgrp))
-			break;
+			return c;
 	}
-	return c;
+	return NULL;
 }
 
 /*
@@ -872,17 +866,20 @@ static int ip6mr_cache_report(struct net *net, struct sk_buff *pkt, mifi_t mifi,
 static int
 ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
 {
+	bool found = false;
 	int err;
 	struct mfc6_cache *c;
 
 	spin_lock_bh(&mfc_unres_lock);
-	for (c = net->ipv6.mfc6_unres_queue; c; c = c->next) {
+	list_for_each_entry(c, &net->ipv6.mfc6_unres_queue, list) {
 		if (ipv6_addr_equal(&c->mf6c_mcastgrp, &ipv6_hdr(skb)->daddr) &&
-		    ipv6_addr_equal(&c->mf6c_origin, &ipv6_hdr(skb)->saddr))
+		    ipv6_addr_equal(&c->mf6c_origin, &ipv6_hdr(skb)->saddr)) {
+			found = true;
 			break;
+		}
 	}
 
-	if (c == NULL) {
+	if (!found) {
 		/*
 		 *	Create a new entry if allowable
 		 */
@@ -918,8 +915,7 @@ ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
 		}
 
 		atomic_inc(&net->ipv6.cache_resolve_queue_len);
-		c->next = net->ipv6.mfc6_unres_queue;
-		net->ipv6.mfc6_unres_queue = c;
+		list_add(&c->list, &net->ipv6.mfc6_unres_queue);
 
 		ipmr_do_expire_process(net);
 	}
@@ -946,16 +942,15 @@ ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
 static int ip6mr_mfc_delete(struct net *net, struct mf6cctl *mfc)
 {
 	int line;
-	struct mfc6_cache *c, **cp;
+	struct mfc6_cache *c, *next;
 
 	line = MFC6_HASH(&mfc->mf6cc_mcastgrp.sin6_addr, &mfc->mf6cc_origin.sin6_addr);
 
-	for (cp = &net->ipv6.mfc6_cache_array[line];
-	     (c = *cp) != NULL; cp = &c->next) {
+	list_for_each_entry_safe(c, next, &net->ipv6.mfc6_cache_array[line], list) {
 		if (ipv6_addr_equal(&c->mf6c_origin, &mfc->mf6cc_origin.sin6_addr) &&
 		    ipv6_addr_equal(&c->mf6c_mcastgrp, &mfc->mf6cc_mcastgrp.sin6_addr)) {
 			write_lock_bh(&mrt_lock);
-			*cp = c->next;
+			list_del(&c->list);
 			write_unlock_bh(&mrt_lock);
 
 			ip6mr_cache_free(c);
@@ -997,7 +992,9 @@ static struct notifier_block ip6_mr_notifier = {
 
 static int __net_init ip6mr_net_init(struct net *net)
 {
+	unsigned int i;
 	int err = 0;
+
 	net->ipv6.vif6_table = kcalloc(MAXMIFS, sizeof(struct mif_device),
 				       GFP_KERNEL);
 	if (!net->ipv6.vif6_table) {
@@ -1007,13 +1004,18 @@ static int __net_init ip6mr_net_init(struct net *net)
 
 	/* Forwarding cache */
 	net->ipv6.mfc6_cache_array = kcalloc(MFC6_LINES,
-					     sizeof(struct mfc6_cache *),
+					     sizeof(struct list_head),
 					     GFP_KERNEL);
 	if (!net->ipv6.mfc6_cache_array) {
 		err = -ENOMEM;
 		goto fail_mfc6_cache;
 	}
 
+	for (i = 0; i < MFC6_LINES; i++)
+		INIT_LIST_HEAD(&net->ipv6.mfc6_cache_array[i]);
+
+	INIT_LIST_HEAD(&net->ipv6.mfc6_unres_queue);
+
 	setup_timer(&net->ipv6.ipmr_expire_timer, ipmr_expire_process,
 		    (unsigned long)net);
 
@@ -1105,8 +1107,9 @@ void ip6_mr_cleanup(void)
 
 static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 {
+	bool found = false;
 	int line;
-	struct mfc6_cache *uc, *c, **cp;
+	struct mfc6_cache *uc, *c;
 	unsigned char ttls[MAXMIFS];
 	int i;
 
@@ -1122,14 +1125,15 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 
 	line = MFC6_HASH(&mfc->mf6cc_mcastgrp.sin6_addr, &mfc->mf6cc_origin.sin6_addr);
 
-	for (cp = &net->ipv6.mfc6_cache_array[line];
-	     (c = *cp) != NULL; cp = &c->next) {
+	list_for_each_entry(c, &net->ipv6.mfc6_cache_array[line], list) {
 		if (ipv6_addr_equal(&c->mf6c_origin, &mfc->mf6cc_origin.sin6_addr) &&
-		    ipv6_addr_equal(&c->mf6c_mcastgrp, &mfc->mf6cc_mcastgrp.sin6_addr))
+		    ipv6_addr_equal(&c->mf6c_mcastgrp, &mfc->mf6cc_mcastgrp.sin6_addr)) {
+			found = true;
 			break;
+		}
 	}
 
-	if (c != NULL) {
+	if (found) {
 		write_lock_bh(&mrt_lock);
 		c->mf6c_parent = mfc->mf6cc_parent;
 		ip6mr_update_thresholds(net, c, ttls);
@@ -1154,29 +1158,29 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 		c->mfc_flags |= MFC_STATIC;
 
 	write_lock_bh(&mrt_lock);
-	c->next = net->ipv6.mfc6_cache_array[line];
-	net->ipv6.mfc6_cache_array[line] = c;
+	list_add(&c->list, &net->ipv6.mfc6_cache_array[line]);
 	write_unlock_bh(&mrt_lock);
 
 	/*
 	 *	Check to see if we resolved a queued list. If so we
 	 *	need to send on the frames and tidy up.
 	 */
+	found = false;
 	spin_lock_bh(&mfc_unres_lock);
-	for (cp = &net->ipv6.mfc6_unres_queue; (uc = *cp) != NULL;
-	     cp = &uc->next) {
+	list_for_each_entry(uc, &net->ipv6.mfc6_unres_queue, list) {
 		if (ipv6_addr_equal(&uc->mf6c_origin, &c->mf6c_origin) &&
 		    ipv6_addr_equal(&uc->mf6c_mcastgrp, &c->mf6c_mcastgrp)) {
-			*cp = uc->next;
+			list_del(&uc->list);
 			atomic_dec(&net->ipv6.cache_resolve_queue_len);
+			found = true;
 			break;
 		}
 	}
-	if (net->ipv6.mfc6_unres_queue == NULL)
+	if (list_empty(&net->ipv6.mfc6_unres_queue))
 		del_timer(&net->ipv6.ipmr_expire_timer);
 	spin_unlock_bh(&mfc_unres_lock);
 
-	if (uc) {
+	if (found) {
 		ip6mr_cache_resolve(net, uc, c);
 		ip6mr_cache_free(uc);
 	}
@@ -1191,6 +1195,7 @@ static void mroute_clean_tables(struct net *net)
 {
 	int i;
 	LIST_HEAD(list);
+	struct mfc6_cache *c, *next;
 
 	/*
 	 *	Shut down all active vif entries
@@ -1205,16 +1210,11 @@ static void mroute_clean_tables(struct net *net)
 	 *	Wipe the cache
 	 */
 	for (i = 0; i < MFC6_LINES; i++) {
-		struct mfc6_cache *c, **cp;
-
-		cp = &net->ipv6.mfc6_cache_array[i];
-		while ((c = *cp) != NULL) {
-			if (c->mfc_flags & MFC_STATIC) {
-				cp = &c->next;
+		list_for_each_entry_safe(c, next, &net->ipv6.mfc6_cache_array[i], list) {
+			if (c->mfc_flags & MFC_STATIC)
 				continue;
-			}
 			write_lock_bh(&mrt_lock);
-			*cp = c->next;
+			list_del(&c->list);
 			write_unlock_bh(&mrt_lock);
 
 			ip6mr_cache_free(c);
@@ -1222,12 +1222,9 @@ static void mroute_clean_tables(struct net *net)
 	}
 
 	if (atomic_read(&net->ipv6.cache_resolve_queue_len) != 0) {
-		struct mfc6_cache *c, **cp;
-
 		spin_lock_bh(&mfc_unres_lock);
-		cp = &net->ipv6.mfc6_unres_queue;
-		while ((c = *cp) != NULL) {
-			*cp = c->next;
+		list_for_each_entry_safe(c, next, &net->ipv6.mfc6_unres_queue, list) {
+			list_del(&c->list);
 			ip6mr_destroy_unres(net, c);
 		}
 		spin_unlock_bh(&mfc_unres_lock);
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 6/6] ipv6: ip6mr: add support for dumping routing tables over netlink
From: kaber @ 2010-05-11 14:02 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1273586551-3521-1-git-send-email-kaber@trash.net>

From: Patrick McHardy <kaber@trash.net>

The ip6mr /proc interface (ip6_mr_cache) can't be extended to dump routes
from any tables but the main table in a backwards compatible fashion since
the output format ends in a variable amount of output interfaces.

Introduce a new netlink interface to dump multicast routes from all tables,
similar to the netlink interface for regular routes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 net/ipv6/ip6mr.c |   96 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 89 insertions(+), 7 deletions(-)

diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index c2920a1..163850e 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -112,8 +112,10 @@ static int ip6_mr_forward(struct net *net, struct mr6_table *mrt,
 			  struct sk_buff *skb, struct mfc6_cache *cache);
 static int ip6mr_cache_report(struct mr6_table *mrt, struct sk_buff *pkt,
 			      mifi_t mifi, int assert);
-static int ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
-			     struct mfc6_cache *c, struct rtmsg *rtm);
+static int __ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
+			       struct mfc6_cache *c, struct rtmsg *rtm);
+static int ip6mr_rtm_dumproute(struct sk_buff *skb,
+			       struct netlink_callback *cb);
 static void mroute_clean_tables(struct mr6_table *mrt);
 static void ipmr_expire_process(unsigned long arg);
 
@@ -1038,7 +1040,7 @@ static void ip6mr_cache_resolve(struct net *net, struct mr6_table *mrt,
 			int err;
 			struct nlmsghdr *nlh = (struct nlmsghdr *)skb_pull(skb, sizeof(struct ipv6hdr));
 
-			if (ip6mr_fill_mroute(mrt, skb, c, NLMSG_DATA(nlh)) > 0) {
+			if (__ip6mr_fill_mroute(mrt, skb, c, NLMSG_DATA(nlh)) > 0) {
 				nlh->nlmsg_len = skb_tail_pointer(skb) - (u8 *)nlh;
 			} else {
 				nlh->nlmsg_type = NLMSG_ERROR;
@@ -1350,6 +1352,7 @@ int __init ip6_mr_init(void)
 		goto add_proto_fail;
 	}
 #endif
+	rtnl_register(RTNL_FAMILY_IP6MR, RTM_GETROUTE, NULL, ip6mr_rtm_dumproute);
 	return 0;
 #ifdef CONFIG_IPV6_PIMSM_V2
 add_proto_fail:
@@ -2007,9 +2010,8 @@ int ip6_mr_input(struct sk_buff *skb)
 }
 
 
-static int
-ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
-		  struct mfc6_cache *c, struct rtmsg *rtm)
+static int __ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
+			       struct mfc6_cache *c, struct rtmsg *rtm)
 {
 	int ct;
 	struct rtnexthop *nhp;
@@ -2111,8 +2113,88 @@ int ip6mr_get_route(struct net *net,
 	if (!nowait && (rtm->rtm_flags&RTM_F_NOTIFY))
 		cache->mfc_flags |= MFC_NOTIFY;
 
-	err = ip6mr_fill_mroute(mrt, skb, cache, rtm);
+	err = __ip6mr_fill_mroute(mrt, skb, cache, rtm);
 	read_unlock(&mrt_lock);
 	return err;
 }
 
+static int ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
+			     u32 pid, u32 seq, struct mfc6_cache *c)
+{
+	struct nlmsghdr *nlh;
+	struct rtmsg *rtm;
+
+	nlh = nlmsg_put(skb, pid, seq, RTM_NEWROUTE, sizeof(*rtm), NLM_F_MULTI);
+	if (nlh == NULL)
+		return -EMSGSIZE;
+
+	rtm = nlmsg_data(nlh);
+	rtm->rtm_family   = RTNL_FAMILY_IPMR;
+	rtm->rtm_dst_len  = 128;
+	rtm->rtm_src_len  = 128;
+	rtm->rtm_tos      = 0;
+	rtm->rtm_table    = mrt->id;
+	NLA_PUT_U32(skb, RTA_TABLE, mrt->id);
+	rtm->rtm_scope    = RT_SCOPE_UNIVERSE;
+	rtm->rtm_protocol = RTPROT_UNSPEC;
+	rtm->rtm_flags    = 0;
+
+	NLA_PUT(skb, RTA_SRC, 16, &c->mf6c_origin);
+	NLA_PUT(skb, RTA_DST, 16, &c->mf6c_mcastgrp);
+
+	if (__ip6mr_fill_mroute(mrt, skb, c, rtm) < 0)
+		goto nla_put_failure;
+
+	return nlmsg_end(skb, nlh);
+
+nla_put_failure:
+	nlmsg_cancel(skb, nlh);
+	return -EMSGSIZE;
+}
+
+static int ip6mr_rtm_dumproute(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	struct net *net = sock_net(skb->sk);
+	struct mr6_table *mrt;
+	struct mfc6_cache *mfc;
+	unsigned int t = 0, s_t;
+	unsigned int h = 0, s_h;
+	unsigned int e = 0, s_e;
+
+	s_t = cb->args[0];
+	s_h = cb->args[1];
+	s_e = cb->args[2];
+
+	read_lock(&mrt_lock);
+	ip6mr_for_each_table(mrt, net) {
+		if (t < s_t)
+			goto next_table;
+		if (t > s_t)
+			s_h = 0;
+		for (h = s_h; h < MFC6_LINES; h++) {
+			list_for_each_entry(mfc, &mrt->mfc6_cache_array[h], list) {
+				if (e < s_e)
+					goto next_entry;
+				if (ip6mr_fill_mroute(mrt, skb,
+						      NETLINK_CB(cb->skb).pid,
+						      cb->nlh->nlmsg_seq,
+						      mfc) < 0)
+					goto done;
+next_entry:
+				e++;
+			}
+			e = s_e = 0;
+		}
+		s_h = 0;
+next_table:
+		t++;
+	}
+done:
+	read_unlock(&mrt_lock);
+
+	cb->args[2] = e;
+	cb->args[1] = h;
+	cb->args[0] = t;
+
+	return skb->len;
+}
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 5/6] ipv6: ip6mr: support multiple tables
From: kaber @ 2010-05-11 14:02 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1273586551-3521-1-git-send-email-kaber@trash.net>

From: Patrick McHardy <kaber@trash.net>

This patch adds support for multiple independant multicast routing instances,
named "tables".

Userspace multicast routing daemons can bind to a specific table instance by
issuing a setsockopt call using a new option MRT6_TABLE. The table number is
stored in the raw socket data and affects all following ip6mr setsockopt(),
getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT)
is created with a default routing rule pointing to it. Newly created pim6reg
devices have the table number appended ("pim6regX"), with the exception of
devices created in the default table, which are named just "pim6reg" for
compatibility reasons.

Packets are directed to a specific table instance using routing rules,
similar to how regular routing rules work. Currently iif, oif and mark
are supported as keys, source and destination addresses could be supported
additionally.

Example usage:

- bind pimd/xorp/... to a specific table:

uint32_t table = 123;
setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table));

- create routing rules directing packets to the new table:

# ip -6 mrule add iif eth0 lookup 123
# ip -6 mrule add oif eth0 lookup 123

Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 include/linux/ipv6.h      |    1 +
 include/linux/mroute6.h   |   15 ++-
 include/linux/rtnetlink.h |    3 +-
 include/net/netns/ipv6.h  |    5 +
 net/ipv6/Kconfig          |   14 ++
 net/ipv6/ip6_output.c     |    2 +-
 net/ipv6/ip6mr.c          |  428 ++++++++++++++++++++++++++++++++++++++-------
 7 files changed, 396 insertions(+), 72 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 0e26903..99e1ab7 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -383,6 +383,7 @@ struct raw6_sock {
 	__u32			checksum;	/* perform checksum */
 	__u32			offset;		/* checksum offset  */
 	struct icmp6_filter	filter;
+	__u32			ip6mr_table;
 	/* ipv6_pinfo has to be the last member of raw6_sock, see inet6_sk_generic */
 	struct ipv6_pinfo	inet6;
 };
diff --git a/include/linux/mroute6.h b/include/linux/mroute6.h
index 0370dd4..6091ab7 100644
--- a/include/linux/mroute6.h
+++ b/include/linux/mroute6.h
@@ -24,7 +24,8 @@
 #define MRT6_DEL_MFC	(MRT6_BASE+5)	/* Delete a multicast forwarding entry	*/
 #define MRT6_VERSION	(MRT6_BASE+6)	/* Get the kernel multicast version	*/
 #define MRT6_ASSERT	(MRT6_BASE+7)	/* Activate PIM assert mode		*/
-#define MRT6_PIM	(MRT6_BASE+8)	/* enable PIM code	*/
+#define MRT6_PIM	(MRT6_BASE+8)	/* enable PIM code			*/
+#define MRT6_TABLE	(MRT6_BASE+9)	/* Specify mroute table ID		*/
 
 #define SIOCGETMIFCNT_IN6	SIOCPROTOPRIVATE	/* IP protocol privates */
 #define SIOCGETSGCNT_IN6	(SIOCPROTOPRIVATE+1)
@@ -229,11 +230,17 @@ extern int ip6mr_get_route(struct net *net, struct sk_buff *skb,
 			   struct rtmsg *rtm, int nowait);
 
 #ifdef CONFIG_IPV6_MROUTE
-extern struct sock *mroute6_socket(struct net *net);
+extern struct sock *mroute6_socket(struct net *net, struct sk_buff *skb);
 extern int ip6mr_sk_done(struct sock *sk);
 #else
-static inline struct sock *mroute6_socket(struct net *net) { return NULL; }
-static inline int ip6mr_sk_done(struct sock *sk) { return 0; }
+static inline struct sock *mroute6_socket(struct net *net, struct sk_buff *skb)
+{
+	return NULL;
+}
+static inline int ip6mr_sk_done(struct sock *sk)
+{
+	return 0;
+}
 #endif
 #endif
 
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 5a42c36..fbc8cb0 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -11,7 +11,8 @@
  * families, values above 128 may be used arbitrarily.
  */
 #define RTNL_FAMILY_IPMR		128
-#define RTNL_FAMILY_MAX			128
+#define RTNL_FAMILY_IP6MR		129
+#define RTNL_FAMILY_MAX			129
 
 /****
  *		Routing/neighbour discovery messages.
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 4e2780e..81abfcb 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -59,7 +59,12 @@ struct netns_ipv6 {
 	struct sock             *tcp_sk;
 	struct sock             *igmp_sk;
 #ifdef CONFIG_IPV6_MROUTE
+#ifndef CONFIG_IPV6_MROUTE_MULTIPLE_TABLES
 	struct mr6_table	*mrt6;
+#else
+	struct list_head	mr6_tables;
+	struct fib_rules_ops	*mr6_rules_ops;
+#endif
 #endif
 };
 #endif
diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index a578096..36d7437 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -229,6 +229,20 @@ config IPV6_MROUTE
 	  Experimental support for IPv6 multicast forwarding.
 	  If unsure, say N.
 
+config IPV6_MROUTE_MULTIPLE_TABLES
+	bool "IPv6: multicast policy routing"
+	depends on IPV6_MROUTE
+	select FIB_RULES
+	help
+	  Normally, a multicast router runs a userspace daemon and decides
+	  what to do with a multicast packet based on the source and
+	  destination addresses. If you say Y here, the multicast router
+	  will also be able to take interfaces and packet marks into
+	  account and run multiple instances of userspace daemons
+	  simultaneously, each one handling a single table.
+
+	  If unsure, say N.
+
 config IPV6_PIMSM_V2
 	bool "IPv6: PIM-SM version 2 support (EXPERIMENTAL)"
 	depends on IPV6_MROUTE
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5173aca..cd963f6 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -108,7 +108,7 @@ static int ip6_finish_output2(struct sk_buff *skb)
 		struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
 
 		if (!(dev->flags & IFF_LOOPBACK) && sk_mc_loop(skb->sk) &&
-		    ((mroute6_socket(dev_net(dev)) &&
+		    ((mroute6_socket(dev_net(dev), skb) &&
 		     !(IP6CB(skb)->flags & IP6SKB_FORWARDED)) ||
 		     ipv6_chk_mcast_addr(dev, &ipv6_hdr(skb)->daddr,
 					 &ipv6_hdr(skb)->saddr))) {
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 9419fce..c2920a1 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -42,6 +42,7 @@
 #include <linux/if_arp.h>
 #include <net/checksum.h>
 #include <net/netlink.h>
+#include <net/fib_rules.h>
 
 #include <net/ipv6.h>
 #include <net/ip6_route.h>
@@ -52,9 +53,11 @@
 #include <net/ip6_checksum.h>
 
 struct mr6_table {
+	struct list_head	list;
 #ifdef CONFIG_NET_NS
 	struct net		*net;
 #endif
+	u32			id;
 	struct sock		*mroute6_sk;
 	struct timer_list	ipmr_expire_timer;
 	struct list_head	mfc6_unres_queue;
@@ -69,6 +72,14 @@ struct mr6_table {
 #endif
 };
 
+struct ip6mr_rule {
+	struct fib_rule		common;
+};
+
+struct ip6mr_result {
+	struct mr6_table	*mrt;
+};
+
 /* Big lock, protecting vif table, mrt cache and mroute socket state.
    Note that the changes are semaphored via rtnl_lock.
  */
@@ -94,6 +105,9 @@ static DEFINE_SPINLOCK(mfc_unres_lock);
 
 static struct kmem_cache *mrt_cachep __read_mostly;
 
+static struct mr6_table *ip6mr_new_table(struct net *net, u32 id);
+static void ip6mr_free_table(struct mr6_table *mrt);
+
 static int ip6_mr_forward(struct net *net, struct mr6_table *mrt,
 			  struct sk_buff *skb, struct mfc6_cache *cache);
 static int ip6mr_cache_report(struct mr6_table *mrt, struct sk_buff *pkt,
@@ -101,12 +115,220 @@ static int ip6mr_cache_report(struct mr6_table *mrt, struct sk_buff *pkt,
 static int ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
 			     struct mfc6_cache *c, struct rtmsg *rtm);
 static void mroute_clean_tables(struct mr6_table *mrt);
+static void ipmr_expire_process(unsigned long arg);
+
+#ifdef CONFIG_IPV6_MROUTE_MULTIPLE_TABLES
+#define ip6mr_for_each_table(mrt, met) \
+	list_for_each_entry_rcu(mrt, &net->ipv6.mr6_tables, list)
+
+static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
+{
+	struct mr6_table *mrt;
 
+	ip6mr_for_each_table(mrt, net) {
+		if (mrt->id == id)
+			return mrt;
+	}
+	return NULL;
+}
+
+static int ip6mr_fib_lookup(struct net *net, struct flowi *flp,
+			    struct mr6_table **mrt)
+{
+	struct ip6mr_result res;
+	struct fib_lookup_arg arg = { .result = &res, };
+	int err;
+
+	err = fib_rules_lookup(net->ipv6.mr6_rules_ops, flp, 0, &arg);
+	if (err < 0)
+		return err;
+	*mrt = res.mrt;
+	return 0;
+}
+
+static int ip6mr_rule_action(struct fib_rule *rule, struct flowi *flp,
+			     int flags, struct fib_lookup_arg *arg)
+{
+	struct ip6mr_result *res = arg->result;
+	struct mr6_table *mrt;
+
+	switch (rule->action) {
+	case FR_ACT_TO_TBL:
+		break;
+	case FR_ACT_UNREACHABLE:
+		return -ENETUNREACH;
+	case FR_ACT_PROHIBIT:
+		return -EACCES;
+	case FR_ACT_BLACKHOLE:
+	default:
+		return -EINVAL;
+	}
+
+	mrt = ip6mr_get_table(rule->fr_net, rule->table);
+	if (mrt == NULL)
+		return -EAGAIN;
+	res->mrt = mrt;
+	return 0;
+}
+
+static int ip6mr_rule_match(struct fib_rule *rule, struct flowi *flp, int flags)
+{
+	return 1;
+}
+
+static const struct nla_policy ip6mr_rule_policy[FRA_MAX + 1] = {
+	FRA_GENERIC_POLICY,
+};
+
+static int ip6mr_rule_configure(struct fib_rule *rule, struct sk_buff *skb,
+				struct fib_rule_hdr *frh, struct nlattr **tb)
+{
+	return 0;
+}
+
+static int ip6mr_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh,
+			      struct nlattr **tb)
+{
+	return 1;
+}
+
+static int ip6mr_rule_fill(struct fib_rule *rule, struct sk_buff *skb,
+			   struct fib_rule_hdr *frh)
+{
+	frh->dst_len = 0;
+	frh->src_len = 0;
+	frh->tos     = 0;
+	return 0;
+}
+
+static const struct fib_rules_ops __net_initdata ip6mr_rules_ops_template = {
+	.family		= RTNL_FAMILY_IP6MR,
+	.rule_size	= sizeof(struct ip6mr_rule),
+	.addr_size	= sizeof(struct in6_addr),
+	.action		= ip6mr_rule_action,
+	.match		= ip6mr_rule_match,
+	.configure	= ip6mr_rule_configure,
+	.compare	= ip6mr_rule_compare,
+	.default_pref	= fib_default_rule_pref,
+	.fill		= ip6mr_rule_fill,
+	.nlgroup	= RTNLGRP_IPV6_RULE,
+	.policy		= ip6mr_rule_policy,
+	.owner		= THIS_MODULE,
+};
+
+static int __net_init ip6mr_rules_init(struct net *net)
+{
+	struct fib_rules_ops *ops;
+	struct mr6_table *mrt;
+	int err;
+
+	ops = fib_rules_register(&ip6mr_rules_ops_template, net);
+	if (IS_ERR(ops))
+		return PTR_ERR(ops);
+
+	INIT_LIST_HEAD(&net->ipv6.mr6_tables);
+
+	mrt = ip6mr_new_table(net, RT6_TABLE_DFLT);
+	if (mrt == NULL) {
+		err = -ENOMEM;
+		goto err1;
+	}
+
+	err = fib_default_rule_add(ops, 0x7fff, RT6_TABLE_DFLT, 0);
+	if (err < 0)
+		goto err2;
+
+	net->ipv6.mr6_rules_ops = ops;
+	return 0;
+
+err2:
+	kfree(mrt);
+err1:
+	fib_rules_unregister(ops);
+	return err;
+}
+
+static void __net_exit ip6mr_rules_exit(struct net *net)
+{
+	struct mr6_table *mrt, *next;
+
+	list_for_each_entry_safe(mrt, next, &net->ipv6.mr6_tables, list)
+		ip6mr_free_table(mrt);
+	fib_rules_unregister(net->ipv6.mr6_rules_ops);
+}
+#else
+#define ip6mr_for_each_table(mrt, net) \
+	for (mrt = net->ipv6.mrt6; mrt; mrt = NULL)
+
+static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
+{
+	return net->ipv6.mrt6;
+}
+
+static int ip6mr_fib_lookup(struct net *net, struct flowi *flp,
+			    struct mr6_table **mrt)
+{
+	*mrt = net->ipv6.mrt6;
+	return 0;
+}
+
+static int __net_init ip6mr_rules_init(struct net *net)
+{
+	net->ipv6.mrt6 = ip6mr_new_table(net, RT6_TABLE_DFLT);
+	return net->ipv6.mrt6 ? 0 : -ENOMEM;
+}
+
+static void __net_exit ip6mr_rules_exit(struct net *net)
+{
+	ip6mr_free_table(net->ipv6.mrt6);
+}
+#endif
+
+static struct mr6_table *ip6mr_new_table(struct net *net, u32 id)
+{
+	struct mr6_table *mrt;
+	unsigned int i;
+
+	mrt = ip6mr_get_table(net, id);
+	if (mrt != NULL)
+		return mrt;
+
+	mrt = kzalloc(sizeof(*mrt), GFP_KERNEL);
+	if (mrt == NULL)
+		return NULL;
+	mrt->id = id;
+	write_pnet(&mrt->net, net);
+
+	/* Forwarding cache */
+	for (i = 0; i < MFC6_LINES; i++)
+		INIT_LIST_HEAD(&mrt->mfc6_cache_array[i]);
+
+	INIT_LIST_HEAD(&mrt->mfc6_unres_queue);
+
+	setup_timer(&mrt->ipmr_expire_timer, ipmr_expire_process,
+		    (unsigned long)mrt);
+
+#ifdef CONFIG_IPV6_PIMSM_V2
+	mrt->mroute_reg_vif_num = -1;
+#endif
+#ifdef CONFIG_IPV6_MROUTE_MULTIPLE_TABLES
+	list_add_tail_rcu(&mrt->list, &net->ipv6.mr6_tables);
+#endif
+	return mrt;
+}
+
+static void ip6mr_free_table(struct mr6_table *mrt)
+{
+	del_timer(&mrt->ipmr_expire_timer);
+	mroute_clean_tables(mrt);
+	kfree(mrt);
+}
 
 #ifdef CONFIG_PROC_FS
 
 struct ipmr_mfc_iter {
 	struct seq_net_private p;
+	struct mr6_table *mrt;
 	struct list_head *cache;
 	int ct;
 };
@@ -115,7 +337,7 @@ struct ipmr_mfc_iter {
 static struct mfc6_cache *ipmr_mfc_seq_idx(struct net *net,
 					   struct ipmr_mfc_iter *it, loff_t pos)
 {
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt = it->mrt;
 	struct mfc6_cache *mfc;
 
 	read_lock(&mrt_lock);
@@ -144,6 +366,7 @@ static struct mfc6_cache *ipmr_mfc_seq_idx(struct net *net,
 
 struct ipmr_vif_iter {
 	struct seq_net_private p;
+	struct mr6_table *mrt;
 	int ct;
 };
 
@@ -151,7 +374,7 @@ static struct mif_device *ip6mr_vif_seq_idx(struct net *net,
 					    struct ipmr_vif_iter *iter,
 					    loff_t pos)
 {
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt = iter->mrt;
 
 	for (iter->ct = 0; iter->ct < mrt->maxvif; ++iter->ct) {
 		if (!MIF_EXISTS(mrt, iter->ct))
@@ -165,7 +388,15 @@ static struct mif_device *ip6mr_vif_seq_idx(struct net *net,
 static void *ip6mr_vif_seq_start(struct seq_file *seq, loff_t *pos)
 	__acquires(mrt_lock)
 {
+	struct ipmr_vif_iter *iter = seq->private;
 	struct net *net = seq_file_net(seq);
+	struct mr6_table *mrt;
+
+	mrt = ip6mr_get_table(net, RT6_TABLE_DFLT);
+	if (mrt == NULL)
+		return ERR_PTR(-ENOENT);
+
+	iter->mrt = mrt;
 
 	read_lock(&mrt_lock);
 	return *pos ? ip6mr_vif_seq_idx(net, seq->private, *pos - 1)
@@ -176,7 +407,7 @@ static void *ip6mr_vif_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 {
 	struct ipmr_vif_iter *iter = seq->private;
 	struct net *net = seq_file_net(seq);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt = iter->mrt;
 
 	++*pos;
 	if (v == SEQ_START_TOKEN)
@@ -198,8 +429,8 @@ static void ip6mr_vif_seq_stop(struct seq_file *seq, void *v)
 
 static int ip6mr_vif_seq_show(struct seq_file *seq, void *v)
 {
-	struct net *net = seq_file_net(seq);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct ipmr_vif_iter *iter = seq->private;
+	struct mr6_table *mrt = iter->mrt;
 
 	if (v == SEQ_START_TOKEN) {
 		seq_puts(seq,
@@ -241,8 +472,15 @@ static const struct file_operations ip6mr_vif_fops = {
 
 static void *ipmr_mfc_seq_start(struct seq_file *seq, loff_t *pos)
 {
+	struct ipmr_mfc_iter *it = seq->private;
 	struct net *net = seq_file_net(seq);
+	struct mr6_table *mrt;
+
+	mrt = ip6mr_get_table(net, RT6_TABLE_DFLT);
+	if (mrt == NULL)
+		return ERR_PTR(-ENOENT);
 
+	it->mrt = mrt;
 	return *pos ? ipmr_mfc_seq_idx(net, seq->private, *pos - 1)
 		: SEQ_START_TOKEN;
 }
@@ -252,7 +490,7 @@ static void *ipmr_mfc_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 	struct mfc6_cache *mfc = v;
 	struct ipmr_mfc_iter *it = seq->private;
 	struct net *net = seq_file_net(seq);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt = it->mrt;
 
 	++*pos;
 
@@ -293,8 +531,7 @@ static void *ipmr_mfc_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 static void ipmr_mfc_seq_stop(struct seq_file *seq, void *v)
 {
 	struct ipmr_mfc_iter *it = seq->private;
-	struct net *net = seq_file_net(seq);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt = it->mrt;
 
 	if (it->cache == &mrt->mfc6_unres_queue)
 		spin_unlock_bh(&mfc_unres_lock);
@@ -305,8 +542,6 @@ static void ipmr_mfc_seq_stop(struct seq_file *seq, void *v)
 static int ipmr_mfc_seq_show(struct seq_file *seq, void *v)
 {
 	int n;
-	struct net *net = seq_file_net(seq);
-	struct mr6_table *mrt = net->ipv6.mrt6;
 
 	if (v == SEQ_START_TOKEN) {
 		seq_puts(seq,
@@ -316,6 +551,7 @@ static int ipmr_mfc_seq_show(struct seq_file *seq, void *v)
 	} else {
 		const struct mfc6_cache *mfc = v;
 		const struct ipmr_mfc_iter *it = seq->private;
+		struct mr6_table *mrt = it->mrt;
 
 		seq_printf(seq, "%pI6 %pI6 %-3hd",
 			   &mfc->mf6c_mcastgrp, &mfc->mf6c_origin,
@@ -375,8 +611,12 @@ static int pim6_rcv(struct sk_buff *skb)
 	struct ipv6hdr   *encap;
 	struct net_device  *reg_dev = NULL;
 	struct net *net = dev_net(skb->dev);
-	struct mr6_table *mrt = net->ipv6.mrt6;
-	int reg_vif_num = mrt->mroute_reg_vif_num;
+	struct mr6_table *mrt;
+	struct flowi fl = {
+		.iif	= skb->dev->ifindex,
+		.mark	= skb->mark,
+	};
+	int reg_vif_num;
 
 	if (!pskb_may_pull(skb, sizeof(*pim) + sizeof(*encap)))
 		goto drop;
@@ -399,6 +639,10 @@ static int pim6_rcv(struct sk_buff *skb)
 	    ntohs(encap->payload_len) + sizeof(*pim) > skb->len)
 		goto drop;
 
+	if (ip6mr_fib_lookup(net, &fl, &mrt) < 0)
+		goto drop;
+	reg_vif_num = mrt->mroute_reg_vif_num;
+
 	read_lock(&mrt_lock);
 	if (reg_vif_num >= 0)
 		reg_dev = mrt->vif6_table[reg_vif_num].dev;
@@ -438,7 +682,17 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
 				      struct net_device *dev)
 {
 	struct net *net = dev_net(dev);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt;
+	struct flowi fl = {
+		.oif		= dev->ifindex,
+		.iif		= skb->skb_iif,
+		.mark		= skb->mark,
+	};
+	int err;
+
+	err = ip6mr_fib_lookup(net, &fl, &mrt);
+	if (err < 0)
+		return err;
 
 	read_lock(&mrt_lock);
 	dev->stats.tx_bytes += skb->len;
@@ -463,11 +717,17 @@ static void reg_vif_setup(struct net_device *dev)
 	dev->features		|= NETIF_F_NETNS_LOCAL;
 }
 
-static struct net_device *ip6mr_reg_vif(struct net *net)
+static struct net_device *ip6mr_reg_vif(struct net *net, struct mr6_table *mrt)
 {
 	struct net_device *dev;
+	char name[IFNAMSIZ];
+
+	if (mrt->id == RT6_TABLE_DFLT)
+		sprintf(name, "pim6reg");
+	else
+		sprintf(name, "pim6reg%u", mrt->id);
 
-	dev = alloc_netdev(0, "pim6reg", reg_vif_setup);
+	dev = alloc_netdev(0, name, reg_vif_setup);
 	if (dev == NULL)
 		return NULL;
 
@@ -665,7 +925,7 @@ static int mif6_add(struct net *net, struct mr6_table *mrt,
 		 */
 		if (mrt->mroute_reg_vif_num >= 0)
 			return -EADDRINUSE;
-		dev = ip6mr_reg_vif(net);
+		dev = ip6mr_reg_vif(net, mrt);
 		if (!dev)
 			return -ENOBUFS;
 		err = dev_set_allmulti(dev, 1);
@@ -995,7 +1255,7 @@ static int ip6mr_device_event(struct notifier_block *this,
 {
 	struct net_device *dev = ptr;
 	struct net *net = dev_net(dev);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt;
 	struct mif_device *v;
 	int ct;
 	LIST_HEAD(list);
@@ -1003,10 +1263,12 @@ static int ip6mr_device_event(struct notifier_block *this,
 	if (event != NETDEV_UNREGISTER)
 		return NOTIFY_DONE;
 
-	v = &mrt->vif6_table[0];
-	for (ct = 0; ct < mrt->maxvif; ct++, v++) {
-		if (v->dev == dev)
-			mif6_delete(mrt, ct, &list);
+	ip6mr_for_each_table(mrt, net) {
+		v = &mrt->vif6_table[0];
+		for (ct = 0; ct < mrt->maxvif; ct++, v++) {
+			if (v->dev == dev)
+				mif6_delete(mrt, ct, &list);
+		}
 	}
 	unregister_netdevice_many(&list);
 
@@ -1023,29 +1285,11 @@ static struct notifier_block ip6_mr_notifier = {
 
 static int __net_init ip6mr_net_init(struct net *net)
 {
-	struct mr6_table *mrt;
-	unsigned int i;
-	int err = 0;
+	int err;
 
-	mrt = kzalloc(sizeof(*mrt), GFP_KERNEL);
-	if (mrt == NULL) {
-		err = -ENOMEM;
+	err = ip6mr_rules_init(net);
+	if (err < 0)
 		goto fail;
-	}
-
-	write_pnet(&mrt->net, net);
-
-	for (i = 0; i < MFC6_LINES; i++)
-		INIT_LIST_HEAD(&mrt->mfc6_cache_array[i]);
-
-	INIT_LIST_HEAD(&mrt->mfc6_unres_queue);
-
-	setup_timer(&mrt->ipmr_expire_timer, ipmr_expire_process,
-		    (unsigned long)mrt);
-
-#ifdef CONFIG_IPV6_PIMSM_V2
-	mrt->mroute_reg_vif_num = -1;
-#endif
 
 #ifdef CONFIG_PROC_FS
 	err = -ENOMEM;
@@ -1055,14 +1299,13 @@ static int __net_init ip6mr_net_init(struct net *net)
 		goto proc_cache_fail;
 #endif
 
-	net->ipv6.mrt6 = mrt;
 	return 0;
 
 #ifdef CONFIG_PROC_FS
 proc_cache_fail:
 	proc_net_remove(net, "ip6_mr_vif");
 proc_vif_fail:
-	kfree(mrt);
+	ip6mr_rules_exit(net);
 #endif
 fail:
 	return err;
@@ -1070,15 +1313,11 @@ fail:
 
 static void __net_exit ip6mr_net_exit(struct net *net)
 {
-	struct mr6_table *mrt = net->ipv6.mrt6;
-
 #ifdef CONFIG_PROC_FS
 	proc_net_remove(net, "ip6_mr_cache");
 	proc_net_remove(net, "ip6_mr_vif");
 #endif
-	del_timer(&mrt->ipmr_expire_timer);
-	mroute_clean_tables(mrt);
-	kfree(mrt);
+	ip6mr_rules_exit(net);
 }
 
 static struct pernet_operations ip6mr_net_ops = {
@@ -1279,28 +1518,39 @@ static int ip6mr_sk_init(struct mr6_table *mrt, struct sock *sk)
 
 int ip6mr_sk_done(struct sock *sk)
 {
-	int err = 0;
+	int err = -EACCES;
 	struct net *net = sock_net(sk);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt;
 
 	rtnl_lock();
-	if (sk == mrt->mroute6_sk) {
-		write_lock_bh(&mrt_lock);
-		mrt->mroute6_sk = NULL;
-		net->ipv6.devconf_all->mc_forwarding--;
-		write_unlock_bh(&mrt_lock);
+	ip6mr_for_each_table(mrt, net) {
+		if (sk == mrt->mroute6_sk) {
+			write_lock_bh(&mrt_lock);
+			mrt->mroute6_sk = NULL;
+			net->ipv6.devconf_all->mc_forwarding--;
+			write_unlock_bh(&mrt_lock);
 
-		mroute_clean_tables(mrt);
-	} else
-		err = -EACCES;
+			mroute_clean_tables(mrt);
+			err = 0;
+			break;
+		}
+	}
 	rtnl_unlock();
 
 	return err;
 }
 
-struct sock *mroute6_socket(struct net *net)
+struct sock *mroute6_socket(struct net *net, struct sk_buff *skb)
 {
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt;
+	struct flowi fl = {
+		.iif	= skb->skb_iif,
+		.oif	= skb->dev->ifindex,
+		.mark	= skb->mark,
+	};
+
+	if (ip6mr_fib_lookup(net, &fl, &mrt) < 0)
+		return NULL;
 
 	return mrt->mroute6_sk;
 }
@@ -1319,7 +1569,11 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
 	struct mf6cctl mfc;
 	mifi_t mifi;
 	struct net *net = sock_net(sk);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt;
+
+	mrt = ip6mr_get_table(net, raw6_sk(sk)->ip6mr_table ? : RT6_TABLE_DFLT);
+	if (mrt == NULL)
+		return -ENOENT;
 
 	if (optname != MRT6_INIT) {
 		if (sk != mrt->mroute6_sk && !capable(CAP_NET_ADMIN))
@@ -1409,6 +1663,27 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
 	}
 
 #endif
+#ifdef CONFIG_IPV6_MROUTE_MULTIPLE_TABLES
+	case MRT6_TABLE:
+	{
+		u32 v;
+
+		if (optlen != sizeof(u32))
+			return -EINVAL;
+		if (get_user(v, (u32 __user *)optval))
+			return -EFAULT;
+		if (sk == mrt->mroute6_sk)
+			return -EBUSY;
+
+		rtnl_lock();
+		ret = 0;
+		if (!ip6mr_new_table(net, v))
+			ret = -ENOMEM;
+		raw6_sk(sk)->ip6mr_table = v;
+		rtnl_unlock();
+		return ret;
+	}
+#endif
 	/*
 	 *	Spurious command, or MRT6_VERSION which you cannot
 	 *	set.
@@ -1428,7 +1703,11 @@ int ip6_mroute_getsockopt(struct sock *sk, int optname, char __user *optval,
 	int olr;
 	int val;
 	struct net *net = sock_net(sk);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt;
+
+	mrt = ip6mr_get_table(net, raw6_sk(sk)->ip6mr_table ? : RT6_TABLE_DFLT);
+	if (mrt == NULL)
+		return -ENOENT;
 
 	switch (optname) {
 	case MRT6_VERSION:
@@ -1471,7 +1750,11 @@ int ip6mr_ioctl(struct sock *sk, int cmd, void __user *arg)
 	struct mif_device *vif;
 	struct mfc6_cache *c;
 	struct net *net = sock_net(sk);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt;
+
+	mrt = ip6mr_get_table(net, raw6_sk(sk)->ip6mr_table ? : RT6_TABLE_DFLT);
+	if (mrt == NULL)
+		return -ENOENT;
 
 	switch (cmd) {
 	case SIOCGETMIFCNT_IN6:
@@ -1683,7 +1966,16 @@ int ip6_mr_input(struct sk_buff *skb)
 {
 	struct mfc6_cache *cache;
 	struct net *net = dev_net(skb->dev);
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt;
+	struct flowi fl = {
+		.iif	= skb->dev->ifindex,
+		.mark	= skb->mark,
+	};
+	int err;
+
+	err = ip6mr_fib_lookup(net, &fl, &mrt);
+	if (err < 0)
+		return err;
 
 	read_lock(&mrt_lock);
 	cache = ip6mr_cache_find(mrt,
@@ -1758,10 +2050,14 @@ int ip6mr_get_route(struct net *net,
 		    struct sk_buff *skb, struct rtmsg *rtm, int nowait)
 {
 	int err;
-	struct mr6_table *mrt = net->ipv6.mrt6;
+	struct mr6_table *mrt;
 	struct mfc6_cache *cache;
 	struct rt6_info *rt = (struct rt6_info *)skb_dst(skb);
 
+	mrt = ip6mr_get_table(net, RT6_TABLE_DFLT);
+	if (mrt == NULL)
+		return -ENOENT;
+
 	read_lock(&mrt_lock);
 	cache = ip6mr_cache_find(mrt, &rt->rt6i_src.addr, &rt->rt6i_dst.addr);
 
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 1/6] ipv6: ip6mr: move unres_queue and timer to per-namespace data
From: kaber @ 2010-05-11 14:02 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1273586551-3521-1-git-send-email-kaber@trash.net>

From: Patrick McHardy <kaber@trash.net>

The unres_queue is currently shared between all namespaces. Following patches
will additionally allow to create multiple multicast routing tables in each
namespace. Having a single shared queue for all these users seems to excessive,
move the queue and the cleanup timer to the per-namespace data to unshare it.

As a side-effect, this fixes a bug in the seq file iteration functions: the
first entry returned is always from the current namespace, entries returned
after that may belong to any namespace.

Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 include/net/netns/ipv6.h |    2 +
 net/ipv6/ip6mr.c         |   74 ++++++++++++++++++++-------------------------
 2 files changed, 35 insertions(+), 41 deletions(-)

diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 1f11ebc..43d842a 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -60,6 +60,8 @@ struct netns_ipv6 {
 	struct sock             *igmp_sk;
 #ifdef CONFIG_IPV6_MROUTE
 	struct sock		*mroute6_sk;
+	struct timer_list	ipmr_expire_timer;
+	struct mfc6_cache	*mfc6_unres_queue;
 	struct mfc6_cache	**mfc6_cache_array;
 	struct mif_device	*vif6_table;
 	int			maxvif;
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index e0b530c..7236030 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -63,8 +63,6 @@ static DEFINE_RWLOCK(mrt_lock);
 
 #define MIF_EXISTS(_net, _idx) ((_net)->ipv6.vif6_table[_idx].dev != NULL)
 
-static struct mfc6_cache *mfc_unres_queue;		/* Queue of unresolved entries */
-
 /* Special spinlock for queue of unresolved entries */
 static DEFINE_SPINLOCK(mfc_unres_lock);
 
@@ -84,8 +82,6 @@ static int ip6mr_cache_report(struct net *net, struct sk_buff *pkt,
 static int ip6mr_fill_mroute(struct sk_buff *skb, struct mfc6_cache *c, struct rtmsg *rtm);
 static void mroute_clean_tables(struct net *net);
 
-static struct timer_list ipmr_expire_timer;
-
 
 #ifdef CONFIG_PROC_FS
 
@@ -110,11 +106,10 @@ static struct mfc6_cache *ipmr_mfc_seq_idx(struct net *net,
 				return mfc;
 	read_unlock(&mrt_lock);
 
-	it->cache = &mfc_unres_queue;
+	it->cache = &net->ipv6.mfc6_unres_queue;
 	spin_lock_bh(&mfc_unres_lock);
-	for (mfc = mfc_unres_queue; mfc; mfc = mfc->next)
-		if (net_eq(mfc6_net(mfc), net) &&
-		    pos-- == 0)
+	for (mfc = net->ipv6.mfc6_unres_queue; mfc; mfc = mfc->next)
+		if (pos-- == 0)
 			return mfc;
 	spin_unlock_bh(&mfc_unres_lock);
 
@@ -244,7 +239,7 @@ static void *ipmr_mfc_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 	if (mfc->next)
 		return mfc->next;
 
-	if (it->cache == &mfc_unres_queue)
+	if (it->cache == &net->ipv6.mfc6_unres_queue)
 		goto end_of_list;
 
 	BUG_ON(it->cache != net->ipv6.mfc6_cache_array);
@@ -257,11 +252,11 @@ static void *ipmr_mfc_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 
 	/* exhausted cache_array, show unresolved */
 	read_unlock(&mrt_lock);
-	it->cache = &mfc_unres_queue;
+	it->cache = &net->ipv6.mfc6_unres_queue;
 	it->ct = 0;
 
 	spin_lock_bh(&mfc_unres_lock);
-	mfc = mfc_unres_queue;
+	mfc = net->ipv6.mfc6_unres_queue;
 	if (mfc)
 		return mfc;
 
@@ -277,7 +272,7 @@ static void ipmr_mfc_seq_stop(struct seq_file *seq, void *v)
 	struct ipmr_mfc_iter *it = seq->private;
 	struct net *net = seq_file_net(seq);
 
-	if (it->cache == &mfc_unres_queue)
+	if (it->cache == &net->ipv6.mfc6_unres_queue)
 		spin_unlock_bh(&mfc_unres_lock);
 	else if (it->cache == net->ipv6.mfc6_cache_array)
 		read_unlock(&mrt_lock);
@@ -301,7 +296,7 @@ static int ipmr_mfc_seq_show(struct seq_file *seq, void *v)
 			   &mfc->mf6c_mcastgrp, &mfc->mf6c_origin,
 			   mfc->mf6c_parent);
 
-		if (it->cache != &mfc_unres_queue) {
+		if (it->cache != &net->ipv6.mfc6_unres_queue) {
 			seq_printf(seq, " %8lu %8lu %8lu",
 				   mfc->mfc_un.res.pkt,
 				   mfc->mfc_un.res.bytes,
@@ -559,15 +554,15 @@ static void ip6mr_destroy_unres(struct mfc6_cache *c)
 }
 
 
-/* Single timer process for all the unresolved queue. */
+/* Timer process for all the unresolved queue. */
 
-static void ipmr_do_expire_process(unsigned long dummy)
+static void ipmr_do_expire_process(struct net *net)
 {
 	unsigned long now = jiffies;
 	unsigned long expires = 10 * HZ;
 	struct mfc6_cache *c, **cp;
 
-	cp = &mfc_unres_queue;
+	cp = &net->ipv6.mfc6_unres_queue;
 
 	while ((c = *cp) != NULL) {
 		if (time_after(c->mfc_un.unres.expires, now)) {
@@ -583,19 +578,21 @@ static void ipmr_do_expire_process(unsigned long dummy)
 		ip6mr_destroy_unres(c);
 	}
 
-	if (mfc_unres_queue != NULL)
-		mod_timer(&ipmr_expire_timer, jiffies + expires);
+	if (net->ipv6.mfc6_unres_queue != NULL)
+		mod_timer(&net->ipv6.ipmr_expire_timer, jiffies + expires);
 }
 
-static void ipmr_expire_process(unsigned long dummy)
+static void ipmr_expire_process(unsigned long arg)
 {
+	struct net *net = (struct net *)arg;
+
 	if (!spin_trylock(&mfc_unres_lock)) {
-		mod_timer(&ipmr_expire_timer, jiffies + 1);
+		mod_timer(&net->ipv6.ipmr_expire_timer, jiffies + 1);
 		return;
 	}
 
-	if (mfc_unres_queue != NULL)
-		ipmr_do_expire_process(dummy);
+	if (net->ipv6.mfc6_unres_queue != NULL)
+		ipmr_do_expire_process(net);
 
 	spin_unlock(&mfc_unres_lock);
 }
@@ -880,9 +877,8 @@ ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
 	struct mfc6_cache *c;
 
 	spin_lock_bh(&mfc_unres_lock);
-	for (c = mfc_unres_queue; c; c = c->next) {
-		if (net_eq(mfc6_net(c), net) &&
-		    ipv6_addr_equal(&c->mf6c_mcastgrp, &ipv6_hdr(skb)->daddr) &&
+	for (c = net->ipv6.mfc6_unres_queue; c; c = c->next) {
+		if (ipv6_addr_equal(&c->mf6c_mcastgrp, &ipv6_hdr(skb)->daddr) &&
 		    ipv6_addr_equal(&c->mf6c_origin, &ipv6_hdr(skb)->saddr))
 			break;
 	}
@@ -923,10 +919,10 @@ ip6mr_cache_unresolved(struct net *net, mifi_t mifi, struct sk_buff *skb)
 		}
 
 		atomic_inc(&net->ipv6.cache_resolve_queue_len);
-		c->next = mfc_unres_queue;
-		mfc_unres_queue = c;
+		c->next = net->ipv6.mfc6_unres_queue;
+		net->ipv6.mfc6_unres_queue = c;
 
-		ipmr_do_expire_process(1);
+		ipmr_do_expire_process(net);
 	}
 
 	/*
@@ -1019,6 +1015,9 @@ static int __net_init ip6mr_net_init(struct net *net)
 		goto fail_mfc6_cache;
 	}
 
+	setup_timer(&net->ipv6.ipmr_expire_timer, ipmr_expire_process,
+		    (unsigned long)net);
+
 #ifdef CONFIG_IPV6_PIMSM_V2
 	net->ipv6.mroute_reg_vif_num = -1;
 #endif
@@ -1050,6 +1049,7 @@ static void __net_exit ip6mr_net_exit(struct net *net)
 	proc_net_remove(net, "ip6_mr_cache");
 	proc_net_remove(net, "ip6_mr_vif");
 #endif
+	del_timer(&net->ipv6.ipmr_expire_timer);
 	mroute_clean_tables(net);
 	kfree(net->ipv6.mfc6_cache_array);
 	kfree(net->ipv6.vif6_table);
@@ -1075,7 +1075,6 @@ int __init ip6_mr_init(void)
 	if (err)
 		goto reg_pernet_fail;
 
-	setup_timer(&ipmr_expire_timer, ipmr_expire_process, 0);
 	err = register_netdevice_notifier(&ip6_mr_notifier);
 	if (err)
 		goto reg_notif_fail;
@@ -1092,7 +1091,6 @@ add_proto_fail:
 	unregister_netdevice_notifier(&ip6_mr_notifier);
 #endif
 reg_notif_fail:
-	del_timer(&ipmr_expire_timer);
 	unregister_pernet_subsys(&ip6mr_net_ops);
 reg_pernet_fail:
 	kmem_cache_destroy(mrt_cachep);
@@ -1102,7 +1100,6 @@ reg_pernet_fail:
 void ip6_mr_cleanup(void)
 {
 	unregister_netdevice_notifier(&ip6_mr_notifier);
-	del_timer(&ipmr_expire_timer);
 	unregister_pernet_subsys(&ip6mr_net_ops);
 	kmem_cache_destroy(mrt_cachep);
 }
@@ -1167,18 +1164,17 @@ static int ip6mr_mfc_add(struct net *net, struct mf6cctl *mfc, int mrtsock)
 	 *	need to send on the frames and tidy up.
 	 */
 	spin_lock_bh(&mfc_unres_lock);
-	for (cp = &mfc_unres_queue; (uc = *cp) != NULL;
+	for (cp = &net->ipv6.mfc6_unres_queue; (uc = *cp) != NULL;
 	     cp = &uc->next) {
-		if (net_eq(mfc6_net(uc), net) &&
-		    ipv6_addr_equal(&uc->mf6c_origin, &c->mf6c_origin) &&
+		if (ipv6_addr_equal(&uc->mf6c_origin, &c->mf6c_origin) &&
 		    ipv6_addr_equal(&uc->mf6c_mcastgrp, &c->mf6c_mcastgrp)) {
 			*cp = uc->next;
 			atomic_dec(&net->ipv6.cache_resolve_queue_len);
 			break;
 		}
 	}
-	if (mfc_unres_queue == NULL)
-		del_timer(&ipmr_expire_timer);
+	if (net->ipv6.mfc6_unres_queue == NULL)
+		del_timer(&net->ipv6.ipmr_expire_timer);
 	spin_unlock_bh(&mfc_unres_lock);
 
 	if (uc) {
@@ -1230,12 +1226,8 @@ static void mroute_clean_tables(struct net *net)
 		struct mfc6_cache *c, **cp;
 
 		spin_lock_bh(&mfc_unres_lock);
-		cp = &mfc_unres_queue;
+		cp = &net->ipv6.mfc6_unres_queue;
 		while ((c = *cp) != NULL) {
-			if (!net_eq(mfc6_net(c), net)) {
-				cp = &c->next;
-				continue;
-			}
 			*cp = c->next;
 			ip6mr_destroy_unres(c);
 		}
-- 
1.7.0.4


^ permalink raw reply related

* Re: [PATCH] virtif: initial interface extensions
From: Arnd Bergmann @ 2010-05-11 14:22 UTC (permalink / raw)
  To: Stefan Berger, Chris Wright; +Cc: netdev, Scott Feldman
In-Reply-To: <OF2E2B37D4.51A81D74-ON85257720.0045FA96-85257720.004C5403@us.ibm.com>

On Tuesday 11 May 2010, Stefan Berger wrote:
> Arnd Bergmann <arnd@arndb.de> wrote on 05/11/2010 08:25:27 AM:
> > netdev, Scott Feldman
> > On Tuesday 11 May 2010, Stefan Berger wrote:
> > > Arnd Bergmann <arnd@arndb.de> wrote on 05/10/2010 05:46:37 PM:
> > No. If we have a macvtap device, there is no VF number. The VF number
> > should be known to libvirt in those cases where instead of creating a
> > macvtap device, it assigns a VF of an SR-IOV adapter to the guest.
> 
> The only interface type that currently supports the vsi parameters is the
> 'direct' type of interface which directly maps into macvtap. That's the 
> only one that would currently let you run the setup protocol with the
> switch. Regular tap devices created through other interface types 
> (bridge, network) do not support these parameters and hence you cannot
> run the protocol with the switch. I never tried passthrough but I 
> believe libvirt is not aware of what it is passing through nor do we
> currently support the parameters for passthrough devices.

Ok. I believe we will at least have to add the same kind of setup
to bridged devices as well, not just macvtap.

For SR-IOV with device assignment, I'm not sure. This will be more
important when adapters show up that actually support VEPA in hardware
and don't have their own switch, but even for those with an integrated
switch, it would be nice if we could use VDP correctly.

> > svid is almost vlan (hence S-VLAN), but slightly different and is not
> > currently supported by the kernel. Again, if the implementation is done  in
> > firmware, libvirt needs to set the same S-VLAN ID when setting up the
> > VF and when associating it to the switch.
> 
> The netlink messages go into the kernel and I suppose the driver should be
> able to find out what the S-VLAN ID is that it needs to use, no?

Possibly yes, but that will depend on how the firmware does this. It
may also be possible that adapters implement this similar to what enic
does, which does not expose the S-VLAN ID at all and uses the VF
number as the identifier.

Maybe we should leave out the CDCP stuff for now, until we start seeing
hardware for it.

> > This is a UUID that describes the VSI to the switch. It needs to be
> > unique in the migration domain. For a guest that has multiple
> > macvtap interfaces, you either need to have a single UUID and
> > put all MAC/VLAN pairs into the same netlink message with this
> > UUID, or have one UUID per device. 
> 
> In that case it's the instanceID as proposed in this XML here:
> 
>    <interface type='direct'>
>       <source dev='static' mode='vepa'/>
>       <model type='virtio'/>
>       <vsi managerid='12' typeid='0x123456' typeidversion='1'
>            instanceid='fa9b7fff-b0a0-4893-8e0e-beef4ff18f8f' />
>       <filterref filter='clean-traffic'/>
>    </interface>

Yes.

	Arnd

^ permalink raw reply

* Re: e1000e fails probe
From: Andy Gospodarek @ 2010-05-11 14:19 UTC (permalink / raw)
  To: Pete Zaitcev; +Cc: Allan, Bruce W, netdev@vger.kernel.org
In-Reply-To: <20100510224308.4334fd11@redhat.com>

On Mon, May 10, 2010 at 10:43:08PM -0600, Pete Zaitcev wrote:
> On Mon, 10 May 2010 18:33:56 -0700
> "Allan, Bruce W" <bruce.w.allan@intel.com> wrote:
> 
> > Would you be able to try a patch that is already in DaveM's net-next-2.6 tree?  If so, it is http://git.kernel.org/?p=linux/kernel/git/davem/net-next-2.6.git;a=commitdiff;h=627c8a041f7aaaea93c766f69bd61d952a277586
> 
> I was too quick to claim victory here. I did an rmmod+insmod,
> it worked. Then I rebooted and it didn't. Weird...
> 

That is exactly what I used to see on what I think is a similar system
to Pete's.  Pete and I have been talking and it looks like this system
may need a BIOS update, so I'll see if we can work to get him an updated
one.


^ permalink raw reply

* Re: get beyond 1Gbps with pktgen  on 10Gb nic?
From: Rick Jones @ 2010-05-11 15:12 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Jon Zhou, netdev@vger.kernel.org
In-Reply-To: <1273584925.2107.6.camel@achroite.uk.solarflarecom.com>

Ben Hutchings wrote:
> On Tue, 2010-05-11 at 06:13 -0700, Jon Zhou wrote:
> 
>>hi there:
>>
>> anyone can get beyond 1Gbps with pktgen or other SW traffic generator with
>> 10Gb nic(intel 82599 or BCM 57711)? found that some one had met similar
>> situation with broadcom 10G nic but no solution yet
> 
> 
> I don't know about those specific controllers, but you should be able to
> achieve close to 10G line rate with netperf's TCP_STREAM on any recent
> PC server.  UDP throughput tends to be poorer as there is less support
> for offloading segmentation and reassembly.  Performance may also be
> constrained by PCI Express bandwidth (you need a real 8-lane slot) and
> memory bandwidth (a single memory bank may not be enough).

Further, at least in the context of netperf benchmarking, depending on the 
quantity of offloads, and the speed of your cores, you may want to use the 
global -T option to bind netperf, and particularly netserver to a core other 
than the one on which the interrupts were processed.

It is also good to check whether any of the cores in the system are at or near 
saturation (eg 100%ish utilization).

happy benchmarking,

rick jones

^ permalink raw reply

* [PATCH] fix non-mergeable buffers packet too large error handling
From: David L Stevens @ 2010-05-11 15:52 UTC (permalink / raw)
  To: mst; +Cc: netdev

This patch enforces single-buffer allocation when
mergeable rx buffers is not enabled.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 309c570..c346304 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -361,13 +361,21 @@ static void handle_rx(struct vhost_net *net)
 			break;
 		}
 		/* TODO: Should check and handle checksum. */
-		if (vhost_has_feature(&net->dev, VIRTIO_NET_F_MRG_RXBUF) &&
-		    memcpy_toiovecend(vq->hdr, (unsigned char *)&headcount,
-				      offsetof(typeof(hdr), num_buffers),
-				      sizeof hdr.num_buffers)) {
-			vq_err(vq, "Failed num_buffers write");
+		if (vhost_has_feature(&net->dev, VIRTIO_NET_F_MRG_RXBUF)) {
+			if (memcpy_toiovecend(vq->hdr,
+					      (unsigned char *)&headcount,
+					      offsetof(typeof(hdr),
+					      num_buffers),
+					      sizeof hdr.num_buffers)) {
+				vq_err(vq, "Failed num_buffers write");
+				vhost_discard_desc(vq, headcount);
+				break;
+			}
+		} else if (headcount > 1) {
+			vq_err(vq, "rx packet too large (%d) for guest",
+			       sock_len);
 			vhost_discard_desc(vq, headcount);
-			break;
+			continue;
 		}
 		vhost_add_used_and_signal_n(&net->dev, vq, vq->heads,
 					    headcount);



^ permalink raw reply related

* Re: get beyond 1Gbps with pktgen  on 10Gb nic?
From: Ben Greear @ 2010-05-11 15:55 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Jon Zhou, netdev@vger.kernel.org
In-Reply-To: <1273584925.2107.6.camel@achroite.uk.solarflarecom.com>

On 05/11/2010 06:35 AM, Ben Hutchings wrote:
> On Tue, 2010-05-11 at 06:13 -0700, Jon Zhou wrote:
>> hi there:
>>
>> anyone can get beyond 1Gbps with pktgen or other SW traffic generator with 10Gb nic(intel 82599 or BCM 57711)?
>> found that some one had met similar situation with broadcom 10G nic but no solution yet
>
> I don't know about those specific controllers, but you should be able to
> achieve close to 10G line rate with netperf's TCP_STREAM on any recent
> PC server.  UDP throughput tends to be poorer as there is less support
> for offloading segmentation and reassembly.  Performance may also be
> constrained by PCI Express bandwidth (you need a real 8-lane slot) and
> memory bandwidth (a single memory bank may not be enough).

We can easily push right at 10Gbps full-duplex on two ports (sending to self)
with a 2-port 82599 NIC, 3.3Ghz quad-core Intel core i7 6GT/s processor, etc.

In fact, recent testing with a 2-port 10G NIC and a bunch of intel 1G ports showed about
50Gbps aggregate bandwidth across the network on such a system.   (We were using 9000 MTU
for the 50Gbps test, but can reach 10G send-to-self with 1500 MTU on the 10G ports by themselves.)

This is all using a slightly modified pktgen, but normal pktgen should do just fine.

Thanks,
Ben

>
> Ben.
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: [PATCH] virtif: initial interface extensions
From: Vivek Kashyap @ 2010-05-11 17:15 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Arnd Bergmann, Stefan Berger, netdev
In-Reply-To: <C80DF1EC.2FFE0%scofeldm@cisco.com>

>
> I'm don't think this is going in the right direction.  We're talking a
> pretty simple concept of a port-profile used to configure the virtual port
> backing a VM i/f to something that's trying to munge disjoint protocols
> based on pre-standard work into a single API.  It's forcing all those

The multi-channel setup and VDP can live, and are designed to live
together:
  - multiple channels per link (setup by the bridge) -- CDCP
  - association of a vsi (virtual interface) to bridge ports -- VDP

However, their implementation can be pursued separately.

> pre-standard protocol details into the API, into the XML, and into the mgmt
> software (libvirt), and into the admin's lap.
>
> I want the API to pass a port-profile name plus other information associated

The problem with defining a 'name' as the lookup key to another database is 
that one then requires additional management mechanisms to describe, 
maintain, and map to the real values needed. It lands in admin's lap 
anyway by a different path.


> with the VM i/f to some mgmt object which can setup the virtual port backing
> the VM i/f.  (And unset it).  Using netlink for API let's that object be in
> user- or kernel-space software, hardware or firmware.  How the object sets
> up the virtual port based on the passed port-profile is beyond the scope of
> the API.

Agree.  See Stefan's last patch in libvirt - it converges 802.1Qbh/portprofile
and 802.1Qbg quite well. The netlink message can be deciphered by the
recipient and meets the advantages below.  (Or, one could use a direct 
unix domain socket to communicate with a daemon as well).


>
> My last port-profile patch is this API.  It gives us this:
>
>    1) single netlink msg for kernel- and user-space
>    2) single parameter set from sender's perspective (libvirt)
>    3) single XML representation of parameters
>    4) single code path in kernel and libvirt
>    5) (potential) cross-vendor-switch VM migration
>    6) admin-friendly port-profile names
>    7) allows pre-standard (802.1Qbg/bh) details to change
>       without bogging down the API
>
> What I proposed on the libvirt list is to maintain a mapping database from
> port-profile to vendor-specific or protocol-specific parameters.  Using VDP
> as an example, a port-profile would resolve the VDP tuple:
>
>     port-profile: "joes-garage"   --->  VSI Manager ID: 15
>                                         VSI Type ID: 12345
>                                         VSI Type ID Ver: 1
>                                         other VSI settings (preassociate)
>
> How the mapping database is maintained is beyond the scope of the API.

Yes, but that will add another mechanism to set the mappings. The 
xml posted (on libvirt) defines the vsi values and avoids this 
additional management. It also allows for port-profile name for 802.1Qbh.

thanks
 	Vivek



^ permalink raw reply

* Re: [PATCH net-next-2.6 1/2] bonding: add keep_all parameter
From: Jay Vosburgh @ 2010-05-11 17:18 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: netdev
In-Reply-To: <20100511002949.GA7497@gospo.rdu.redhat.com>

Andy Gospodarek <andy@greyhouse.net> wrote:

>
>In an effort to suppress duplicate frames on certain bonding modes
>(specifically the modes that do not require additional configuration on
>the switch or switches connected to the host),

	Strictly speaking, the above is incorrect, as the duplicate
suppression is turned on for the active-backup inactive slaves as well
as 802.3ad ports that are disabled (any slave that gets the "inactive"
flag bit set).

>[...] code was added in the
>generic receive patch in 2.6.16.  The current behavior works quite well
>for most users, but there are some times it would be nice to restore old
>functionality and allow all frames to make their way up the stack.

	Reading netdev lately, it sure looks like everybody wants ways
to shut off or bypass the duplicate suppression.

>This patch adds support for a new module option and sysfs file called
>'keep_all' that will restore pre-2.6.16 functionality if the user
>desires.  The default value is '0' and retains existing behavior, but
>the user can set it to '1' and allow all frames up if desired.

	Since this is really meant for the queue tagging stuff in the
next patch, should this really be something that's enabled automatically
if the queues are configured in such a way that the inactive slave is
going to receive traffic?

	I also wonder if something like this would satisfy the FCOE guys
without making __netif_receive_skb / skb_bond_should_drop even more
complicated than they already are.

>Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
>Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
>---
> Documentation/networking/bonding.txt |   15 ++++++++++++
> drivers/net/bonding/bond_main.c      |   15 ++++++++++++
> drivers/net/bonding/bond_sysfs.c     |   43 +++++++++++++++++++++++++++++++++-
> drivers/net/bonding/bonding.h        |    1 +
> include/linux/if.h                   |    1 +
> net/core/dev.c                       |   26 +++++++++++---------
> 6 files changed, 88 insertions(+), 13 deletions(-)
>
>diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
>index 61f516b..d64fd2f 100644
>--- a/Documentation/networking/bonding.txt
>+++ b/Documentation/networking/bonding.txt
>@@ -399,6 +399,21 @@ fail_over_mac
> 	This option was added in bonding version 3.2.0.  The "follow"
> 	policy was added in bonding version 3.3.0.
>
>+keep_all
>+
>+	Option to specify whether or not you will keep all frames
>+	received on an interface that is a member of a bond.  Right
>+	now checking is done to ensure that most frames ultimately
>+	classified as duplicates are dropped to keep noise to a
>+	minimum.  The feature to drop duplicates was added in kernel
>+	version 2.6.16 (bonding driver version 3.0.2) and this will
>+	allow that original behavior to be restored if desired.
>+
>+	A value of 0 (default) will preserve the current behavior and
>+	will drop all duplicate frames the bond may receive.  A value
>+	of 1 will not attempt to avoid duplicate frames and pass all
>+	of them up the stack.

	Two thoughts (presuming for the moment that this doesn't
change): first, bump the driver version and mention when it was added;
second, mention that this only applies to active-backup mode.

> lacp_rate
>
> 	Option specifying the rate in which we'll ask our link partner
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 5e12462..eb86363 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -106,6 +106,7 @@ static int arp_interval = BOND_LINK_ARP_INTERV;
> static char *arp_ip_target[BOND_MAX_ARP_TARGETS];
> static char *arp_validate;
> static char *fail_over_mac;
>+static int keep_all	= 0;
> static struct bond_params bonding_defaults;
>
> module_param(max_bonds, int, 0);
>@@ -155,6 +156,9 @@ module_param(arp_validate, charp, 0);
> MODULE_PARM_DESC(arp_validate, "validate src/dst of ARP probes: none (default), active, backup or all");
> module_param(fail_over_mac, charp, 0);
> MODULE_PARM_DESC(fail_over_mac, "For active-backup, do not set all slaves to the same MAC.  none (default), active or follow");
>+module_param(keep_all, int, 0);
>+MODULE_PARM_DESC(keep_all, "Keep all frames received on an interface"
>+			   "0 for never (default), 1 for always.");
>
> /*----------------------------- Global variables ----------------------------*/
>
>@@ -4542,6 +4546,9 @@ static void bond_setup(struct net_device *bond_dev)
> 	if (bond->params.arp_interval)
> 		bond_dev->priv_flags |= IFF_MASTER_ARPMON;
>
>+	if (bond->params.keep_all)
>+		bond_dev->priv_flags |= IFF_BONDING_KEEP_ALL;
>+
> 	/* At first, we block adding VLANs. That's the only way to
> 	 * prevent problems that occur when adding VLANs over an
> 	 * empty bond. The block will be removed once non-challenged
>@@ -4756,6 +4763,13 @@ static int bond_check_params(struct bond_params *params)
> 		}
> 	}
>
>+	if ((keep_all != 0) && (keep_all != 1)) {
>+		pr_warning("Warning: keep_all module parameter (%d), "
>+			   "not of valid value (0/1), so it was set to "
>+			   "0\n", keep_all);
>+		keep_all = 0;
>+	}
>+
> 	/* reset values for TLB/ALB */
> 	if ((bond_mode == BOND_MODE_TLB) ||
> 	    (bond_mode == BOND_MODE_ALB)) {
>@@ -4926,6 +4940,7 @@ static int bond_check_params(struct bond_params *params)
> 	params->primary[0] = 0;
> 	params->primary_reselect = primary_reselect_value;
> 	params->fail_over_mac = fail_over_mac_value;
>+	params->keep_all = keep_all;
>
> 	if (primary) {
> 		strncpy(params->primary, primary, IFNAMSIZ);
>diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>index b8bec08..44651ce 100644
>--- a/drivers/net/bonding/bond_sysfs.c
>+++ b/drivers/net/bonding/bond_sysfs.c
>@@ -339,7 +339,6 @@ out:
>
> static DEVICE_ATTR(slaves, S_IRUGO | S_IWUSR, bonding_show_slaves,
> 		   bonding_store_slaves);
>-
> /*
>  * Show and set the bonding mode.  The bond interface must be down to
>  * change the mode.
>@@ -1472,6 +1471,47 @@ static ssize_t bonding_show_ad_partner_mac(struct device *d,
> }
> static DEVICE_ATTR(ad_partner_mac, S_IRUGO, bonding_show_ad_partner_mac, NULL);
>
>+/*
>+ * Show and set the keep_all flag.
>+ */
>+static ssize_t bonding_show_keep(struct device *d,
>+				 struct device_attribute *attr,
>+				 char *buf)
>+{
>+	struct bonding *bond = to_bond(d);
>+
>+	return sprintf(buf, "%d\n", bond->params.keep_all);
>+}
>+
>+static ssize_t bonding_store_keep(struct device *d,
>+				  struct device_attribute *attr,
>+				  const char *buf, size_t count)
>+{
>+	int new_value, ret = count;
>+	struct bonding *bond = to_bond(d);
>+
>+	if (sscanf(buf, "%d", &new_value) != 1) {
>+		pr_err("%s: no keep_all value specified.\n",
>+		       bond->dev->name);
>+		ret = -EINVAL;
>+		goto out;
>+	}
>+	if ((new_value == 0) || (new_value == 1)) {
>+		bond->params.keep_all = new_value;
>+		if (bond->params.keep_all)
>+			bond->dev->priv_flags |= IFF_BONDING_KEEP_ALL;
>+		else
>+			bond->dev->priv_flags &= ~IFF_BONDING_KEEP_ALL;
>+	} else {
>+		pr_info("%s: Ignoring invalid keep_all value %d.\n",
>+			bond->dev->name, new_value);
>+	}
>+out:
>+	return count;
>+}
>+static DEVICE_ATTR(keep_all, S_IRUGO | S_IWUSR,
>+		   bonding_show_keep, bonding_store_keep);
>+
>
>
> static struct attribute *per_bond_attrs[] = {
>@@ -1499,6 +1539,7 @@ static struct attribute *per_bond_attrs[] = {
> 	&dev_attr_ad_actor_key.attr,
> 	&dev_attr_ad_partner_key.attr,
> 	&dev_attr_ad_partner_mac.attr,
>+	&dev_attr_keep_all.attr,
> 	NULL,
> };
>
>diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
>index 2aa3367..3b7532f 100644
>--- a/drivers/net/bonding/bonding.h
>+++ b/drivers/net/bonding/bonding.h
>@@ -131,6 +131,7 @@ struct bond_params {
> 	char primary[IFNAMSIZ];
> 	int primary_reselect;
> 	__be32 arp_targets[BOND_MAX_ARP_TARGETS];
>+	int keep_all;
> };
>
> struct bond_parm_tbl {
>diff --git a/include/linux/if.h b/include/linux/if.h
>index be350e6..e9f6040 100644
>--- a/include/linux/if.h
>+++ b/include/linux/if.h
>@@ -73,6 +73,7 @@
> #define IFF_DONT_BRIDGE 0x800		/* disallow bridging this ether dev */
> #define IFF_IN_NETPOLL	0x1000		/* whether we are processing netpoll */
> #define IFF_DISABLE_NETPOLL	0x2000	/* disable netpoll at run-time */
>+#define IFF_BONDING_KEEP_ALL	0x4000	/* do not drop possible dup frames */
>
> #define IF_GET_IFACE	0x0001		/* for querying only */
> #define IF_GET_PROTO	0x0002
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 32611c8..9a9c01d 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -2758,21 +2758,23 @@ int __skb_bond_should_drop(struct sk_buff *skb, struct net_device *master)
> 		skb_bond_set_mac_by_master(skb, master);
> 	}
>
>-	if (dev->priv_flags & IFF_SLAVE_INACTIVE) {
>-		if ((dev->priv_flags & IFF_SLAVE_NEEDARP) &&
>-		    skb->protocol == __cpu_to_be16(ETH_P_ARP))
>-			return 0;
>+	if (unlikely(!(master->priv_flags & IFF_BONDING_KEEP_ALL))) {

	So it's unlikely that "keep all" will be turned off?

>+		if (dev->priv_flags & IFF_SLAVE_INACTIVE) {
>+			if ((dev->priv_flags & IFF_SLAVE_NEEDARP) &&
>+			    skb->protocol == __cpu_to_be16(ETH_P_ARP))
>+				return 0;
>
>-		if (master->priv_flags & IFF_MASTER_ALB) {
>-			if (skb->pkt_type != PACKET_BROADCAST &&
>-			    skb->pkt_type != PACKET_MULTICAST)
>+			if (master->priv_flags & IFF_MASTER_ALB) {
>+				if (skb->pkt_type != PACKET_BROADCAST &&
>+				    skb->pkt_type != PACKET_MULTICAST)
>+					return 0;
>+			}
>+			if (master->priv_flags & IFF_MASTER_8023AD &&
>+			    skb->protocol == __cpu_to_be16(ETH_P_SLOW))
> 				return 0;
>-		}
>-		if (master->priv_flags & IFF_MASTER_8023AD &&
>-		    skb->protocol == __cpu_to_be16(ETH_P_SLOW))
>-			return 0;
>
>-		return 1;
>+			return 1;
>+		}
> 	}
> 	return 0;
> }
>-- 
>1.6.2.5

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply

* [PATCH RFC] vhost: fix barrier pairing
From: Michael S. Tsirkin @ 2010-05-11 17:26 UTC (permalink / raw)
  To: Michael S. Tsirkin, Juan Quintela, Rusty Russell, David S. Miller,
	"Paul E. McKenney" <paulmc

According to memory-barriers.txt, an smp memory barrier
should always be paired with another smp memory barrier,
and I quote "a lack of appropriate pairing is almost certainly an
error".

In case of vhost, failure to flush out used index
update before looking at the interrupt disable flag
could result in missed interrupts, resulting in
networking hang under stress.

This might happen when flags read bypasses used index write.
So we see interrupts disabled and do not interrupt, at the
same time guest writes flags value to enable interrupt,
reads an old used index value, thinks that
used ring is empty and waits for interrupt.

Note: the barrier we pair with here is in
drivers/virtio/virtio_ring.c, function
vring_enable_cb.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Dave, I think this is needed in 2.6.34, I'll send a pull
request after doing some more testing.

Rusty, Juan, could you take a look as well please?
Thanks!

 drivers/vhost/vhost.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index e69d238..14fa2f5 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1035,7 +1035,10 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
 /* This actually signals the guest, using eventfd. */
 void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
-	__u16 flags = 0;
+	__u16 flags;
+	/* Flush out used index updates. */
+	smp_mb();
+
 	if (get_user(flags, &vq->avail->flags)) {
 		vq_err(vq, "Failed to get flags");
 		return;
-- 
1.7.1.12.g42b7f

^ permalink raw reply related

* [PATCH net-next] net: Remove unnecessary returns from void function()s
From: Joe Perches @ 2010-05-11 17:45 UTC (permalink / raw)
  To: netdev

This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
  xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

Signed-off-by: Joe Perches <joe@perches.com>
---
 net/9p/trans_rdma.c               |    1 -
 net/atm/br2684.c                  |    1 -
 net/atm/lec.c                     |    6 ------
 net/atm/mpc.c                     |   32 --------------------------------
 net/atm/mpoa_caches.c             |   20 --------------------
 net/bluetooth/hci_core.c          |    2 --
 net/bluetooth/l2cap.c             |    3 ---
 net/bluetooth/rfcomm/tty.c        |    2 --
 net/bluetooth/sco.c               |    1 -
 net/caif/caif_dev.c               |    1 -
 net/can/bcm.c                     |    2 --
 net/decnet/dn_dev.c               |    3 ---
 net/decnet/dn_route.c             |    1 -
 net/ipv4/cipso_ipv4.c             |    2 --
 net/ipv4/fib_trie.c               |    2 --
 net/ipv4/ip_gre.c                 |    1 -
 net/ipv4/ip_options.c             |    1 -
 net/ipv4/ipmr.c                   |    1 -
 net/ipv6/ndisc.c                  |    2 --
 net/ipv6/proc.c                   |    1 -
 net/ipv6/route.c                  |    2 --
 net/irda/iriap.c                  |    2 --
 net/irda/irnet/irnet_irda.c       |    3 ---
 net/iucv/af_iucv.c                |    1 -
 net/mac80211/debugfs.h            |    1 -
 net/mac80211/mesh.c               |    2 --
 net/mac80211/mesh_hwmp.c          |    1 -
 net/netlabel/netlabel_addrlist.h  |    2 --
 net/netlabel/netlabel_unlabeled.c |    1 -
 net/sched/cls_flow.c              |    1 -
 net/sched/sch_hfsc.c              |    1 -
 net/sched/sch_ingress.c           |    1 -
 net/sched/sch_mq.c                |    1 -
 net/sched/sch_multiq.c            |    1 -
 net/sched/sch_prio.c              |    1 -
 net/sched/sch_red.c               |    1 -
 net/sctp/associola.c              |    2 --
 net/sctp/outqueue.c               |    2 --
 net/sctp/proc.c                   |    3 ---
 net/sctp/sm_sideeffect.c          |    4 ----
 net/sctp/ulpqueue.c               |    2 --
 net/sunrpc/clnt.c                 |    1 -
 net/sunrpc/svcsock.c              |    1 -
 net/sunrpc/xprt.c                 |    1 -
 net/sunrpc/xprtsock.c             |    4 ----
 net/sysctl_net.c                  |    1 -
 net/wimax/stack.c                 |    2 --
 net/xfrm/xfrm_policy.c            |    1 -
 48 files changed, 0 insertions(+), 131 deletions(-)

diff --git a/net/9p/trans_rdma.c b/net/9p/trans_rdma.c
index 041101a..0ea20c3 100644
--- a/net/9p/trans_rdma.c
+++ b/net/9p/trans_rdma.c
@@ -308,7 +308,6 @@ handle_recv(struct p9_client *client, struct p9_trans_rdma *rdma,
 		   req, err, status);
 	rdma->state = P9_RDMA_FLUSHING;
 	client->status = Disconnected;
-	return;
 }
 
 static void
diff --git a/net/atm/br2684.c b/net/atm/br2684.c
index d6c7cea..6719af6 100644
--- a/net/atm/br2684.c
+++ b/net/atm/br2684.c
@@ -446,7 +446,6 @@ error:
 	net_dev->stats.rx_errors++;
 free_skb:
 	dev_kfree_skb(skb);
-	return;
 }
 
 /*
diff --git a/net/atm/lec.c b/net/atm/lec.c
index feeaf57..d98bde1 100644
--- a/net/atm/lec.c
+++ b/net/atm/lec.c
@@ -161,8 +161,6 @@ static void lec_handle_bridge(struct sk_buff *skb, struct net_device *dev)
 		skb_queue_tail(&sk->sk_receive_queue, skb2);
 		sk->sk_data_ready(sk, skb2->len);
 	}
-
-	return;
 }
 #endif /* defined(CONFIG_BRIDGE) || defined(CONFIG_BRIDGE_MODULE) */
 
@@ -640,7 +638,6 @@ static void lec_set_multicast_list(struct net_device *dev)
 	 * by default, all multicast frames arrive over the bus.
 	 * eventually support selective multicast service
 	 */
-	return;
 }
 
 static const struct net_device_ops lec_netdev_ops = {
@@ -1199,8 +1196,6 @@ static void __exit lane_module_cleanup(void)
 			dev_lec[i] = NULL;
 		}
 	}
-
-	return;
 }
 
 module_init(lane_module_init);
@@ -1334,7 +1329,6 @@ static void lane2_associate_ind(struct net_device *dev, const u8 *mac_addr,
 		priv->lane2_ops->associate_indicator(dev, mac_addr,
 						     tlvs, sizeoftlvs);
 	}
-	return;
 }
 
 /*
diff --git a/net/atm/mpc.c b/net/atm/mpc.c
index 436f2e1..622b471 100644
--- a/net/atm/mpc.c
+++ b/net/atm/mpc.c
@@ -455,7 +455,6 @@ static void lane2_assoc_ind(struct net_device *dev, const u8 *mac_addr,
 	if (end_of_tlvs - tlvs != 0)
 		pr_info("(%s) ignoring %Zd bytes of trailing TLV garbage\n",
 			dev->name, end_of_tlvs - tlvs);
-	return;
 }
 
 /*
@@ -684,8 +683,6 @@ static void mpc_vcc_close(struct atm_vcc *vcc, struct net_device *dev)
 
 	if (in_entry == NULL && eg_entry == NULL)
 		dprintk("(%s) unused vcc closed\n", dev->name);
-
-	return;
 }
 
 static void mpc_push(struct atm_vcc *vcc, struct sk_buff *skb)
@@ -783,8 +780,6 @@ static void mpc_push(struct atm_vcc *vcc, struct sk_buff *skb)
 
 	memset(ATM_SKB(skb), 0, sizeof(struct atm_skb_data));
 	netif_rx(new_skb);
-
-	return;
 }
 
 static struct atmdev_ops mpc_ops = { /* only send is required */
@@ -873,8 +868,6 @@ static void send_set_mps_ctrl_addr(const char *addr, struct mpoa_client *mpc)
 	mesg.type = SET_MPS_CTRL_ADDR;
 	memcpy(mesg.MPS_ctrl, addr, ATM_ESA_LEN);
 	msg_to_mpoad(&mesg, mpc);
-
-	return;
 }
 
 static void mpoad_close(struct atm_vcc *vcc)
@@ -911,8 +904,6 @@ static void mpoad_close(struct atm_vcc *vcc)
 	pr_info("(%s) going down\n",
 		(mpc->dev) ? mpc->dev->name : "<unknown>");
 	module_put(THIS_MODULE);
-
-	return;
 }
 
 /*
@@ -1122,7 +1113,6 @@ static void MPOA_trigger_rcvd(struct k_message *msg, struct mpoa_client *mpc)
 	pr_info("(%s) entry already in resolving state\n",
 		(mpc->dev) ? mpc->dev->name : "<unknown>");
 	mpc->in_ops->put(entry);
-	return;
 }
 
 /*
@@ -1166,7 +1156,6 @@ static void check_qos_and_open_shortcut(struct k_message *msg,
 	} else
 		memset(&msg->qos, 0, sizeof(struct atm_qos));
 	msg_to_mpoad(msg, client);
-	return;
 }
 
 static void MPOA_res_reply_rcvd(struct k_message *msg, struct mpoa_client *mpc)
@@ -1240,8 +1229,6 @@ static void ingress_purge_rcvd(struct k_message *msg, struct mpoa_client *mpc)
 		mpc->in_ops->put(entry);
 		entry = mpc->in_ops->get_with_mask(dst_ip, mpc, mask);
 	} while (entry != NULL);
-
-	return;
 }
 
 static void egress_purge_rcvd(struct k_message *msg, struct mpoa_client *mpc)
@@ -1260,8 +1247,6 @@ static void egress_purge_rcvd(struct k_message *msg, struct mpoa_client *mpc)
 	write_unlock_irq(&mpc->egress_lock);
 
 	mpc->eg_ops->put(entry);
-
-	return;
 }
 
 static void purge_egress_shortcut(struct atm_vcc *vcc, eg_cache_entry *entry)
@@ -1295,8 +1280,6 @@ static void purge_egress_shortcut(struct atm_vcc *vcc, eg_cache_entry *entry)
 	skb_queue_tail(&sk->sk_receive_queue, skb);
 	sk->sk_data_ready(sk, skb->len);
 	dprintk("exiting\n");
-
-	return;
 }
 
 /*
@@ -1325,8 +1308,6 @@ static void mps_death(struct k_message *msg, struct mpoa_client *mpc)
 
 	mpc->in_ops->destroy_cache(mpc);
 	mpc->eg_ops->destroy_cache(mpc);
-
-	return;
 }
 
 static void MPOA_cache_impos_rcvd(struct k_message *msg,
@@ -1353,8 +1334,6 @@ static void MPOA_cache_impos_rcvd(struct k_message *msg,
 	write_unlock_irq(&mpc->egress_lock);
 
 	mpc->eg_ops->put(entry);
-
-	return;
 }
 
 static void set_mpc_ctrl_addr_rcvd(struct k_message *mesg,
@@ -1392,8 +1371,6 @@ static void set_mpc_ctrl_addr_rcvd(struct k_message *mesg,
 			pr_info("(%s) targetless LE_ARP request failed\n",
 				mpc->dev->name);
 	}
-
-	return;
 }
 
 static void set_mps_mac_addr_rcvd(struct k_message *msg,
@@ -1409,8 +1386,6 @@ static void set_mps_mac_addr_rcvd(struct k_message *msg,
 		return;
 	}
 	client->number_of_mps_macs = 1;
-
-	return;
 }
 
 /*
@@ -1436,7 +1411,6 @@ static void clean_up(struct k_message *msg, struct mpoa_client *mpc, int action)
 
 	msg->type = action;
 	msg_to_mpoad(msg, mpc);
-	return;
 }
 
 static void mpc_timer_refresh(void)
@@ -1445,8 +1419,6 @@ static void mpc_timer_refresh(void)
 	mpc_timer.data = mpc_timer.expires;
 	mpc_timer.function = mpc_cache_check;
 	add_timer(&mpc_timer);
-
-	return;
 }
 
 static void mpc_cache_check(unsigned long checking_time)
@@ -1471,8 +1443,6 @@ static void mpc_cache_check(unsigned long checking_time)
 		mpc = mpc->next;
 	}
 	mpc_timer_refresh();
-
-	return;
 }
 
 static int atm_mpoa_ioctl(struct socket *sock, unsigned int cmd,
@@ -1561,8 +1531,6 @@ static void __exit atm_mpoa_cleanup(void)
 		kfree(qos);
 		qos = nextqos;
 	}
-
-	return;
 }
 
 module_init(atm_mpoa_init);
diff --git a/net/atm/mpoa_caches.c b/net/atm/mpoa_caches.c
index e773d83..d1b2d9a 100644
--- a/net/atm/mpoa_caches.c
+++ b/net/atm/mpoa_caches.c
@@ -182,8 +182,6 @@ static void in_cache_put(in_cache_entry *entry)
 		memset(entry, 0, sizeof(in_cache_entry));
 		kfree(entry);
 	}
-
-	return;
 }
 
 /*
@@ -221,8 +219,6 @@ static void in_cache_remove_entry(in_cache_entry *entry,
 		}
 		vcc_release_async(vcc, -EPIPE);
 	}
-
-	return;
 }
 
 /* Call this every MPC-p2 seconds... Not exactly correct solution,
@@ -248,8 +244,6 @@ static void clear_count_and_expired(struct mpoa_client *client)
 		entry = next_entry;
 	}
 	write_unlock_bh(&client->ingress_lock);
-
-	return;
 }
 
 /* Call this every MPC-p4 seconds. */
@@ -334,8 +328,6 @@ static void in_destroy_cache(struct mpoa_client *mpc)
 	while (mpc->in_cache != NULL)
 		mpc->in_ops->remove_entry(mpc->in_cache, mpc);
 	write_unlock_irq(&mpc->ingress_lock);
-
-	return;
 }
 
 static eg_cache_entry *eg_cache_get_by_cache_id(__be32 cache_id,
@@ -427,8 +419,6 @@ static void eg_cache_put(eg_cache_entry *entry)
 		memset(entry, 0, sizeof(eg_cache_entry));
 		kfree(entry);
 	}
-
-	return;
 }
 
 /*
@@ -463,8 +453,6 @@ static void eg_cache_remove_entry(eg_cache_entry *entry,
 		}
 		vcc_release_async(vcc, -EPIPE);
 	}
-
-	return;
 }
 
 static eg_cache_entry *eg_cache_add_entry(struct k_message *msg,
@@ -509,8 +497,6 @@ static void update_eg_cache_entry(eg_cache_entry *entry, uint16_t holding_time)
 	do_gettimeofday(&(entry->tv));
 	entry->entry_state = EGRESS_RESOLVED;
 	entry->ctrl_info.holding_time = holding_time;
-
-	return;
 }
 
 static void clear_expired(struct mpoa_client *client)
@@ -537,8 +523,6 @@ static void clear_expired(struct mpoa_client *client)
 		entry = next_entry;
 	}
 	write_unlock_irq(&client->egress_lock);
-
-	return;
 }
 
 static void eg_destroy_cache(struct mpoa_client *mpc)
@@ -547,8 +531,6 @@ static void eg_destroy_cache(struct mpoa_client *mpc)
 	while (mpc->eg_cache != NULL)
 		mpc->eg_ops->remove_entry(mpc->eg_cache, mpc);
 	write_unlock_irq(&mpc->egress_lock);
-
-	return;
 }
 
 
@@ -584,6 +566,4 @@ void atm_mpoa_init_cache(struct mpoa_client *mpc)
 {
 	mpc->in_ops = &ingress_ops;
 	mpc->eg_ops = &egress_ops;
-
-	return;
 }
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 5e83f8e..2f768de 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -1316,8 +1316,6 @@ void hci_send_acl(struct hci_conn *conn, struct sk_buff *skb, __u16 flags)
 	}
 
 	tasklet_schedule(&hdev->tx_task);
-
-	return;
 }
 EXPORT_SYMBOL(hci_send_acl);
 
diff --git a/net/bluetooth/l2cap.c b/net/bluetooth/l2cap.c
index 673a368..1b682a5 100644
--- a/net/bluetooth/l2cap.c
+++ b/net/bluetooth/l2cap.c
@@ -1322,8 +1322,6 @@ static void l2cap_drop_acked_frames(struct sock *sk)
 
 	if (!l2cap_pi(sk)->unacked_frames)
 		del_timer(&l2cap_pi(sk)->retrans_timer);
-
-	return;
 }
 
 static inline void l2cap_do_send(struct sock *sk, struct sk_buff *skb)
@@ -4667,7 +4665,6 @@ void l2cap_load(void)
 	/* Dummy function to trigger automatic L2CAP module loading by
 	 * other modules that use L2CAP sockets but don't use any other
 	 * symbols from it. */
-	return;
 }
 EXPORT_SYMBOL(l2cap_load);
 
diff --git a/net/bluetooth/rfcomm/tty.c b/net/bluetooth/rfcomm/tty.c
index cab71ea..309b6c2 100644
--- a/net/bluetooth/rfcomm/tty.c
+++ b/net/bluetooth/rfcomm/tty.c
@@ -1014,8 +1014,6 @@ static void rfcomm_tty_set_termios(struct tty_struct *tty, struct ktermios *old)
 		rfcomm_send_rpn(dev->dlc->session, 1, dev->dlc->dlci, baud,
 				data_bits, stop_bits, parity,
 				RFCOMM_RPN_FLOW_NONE, x_on, x_off, changes);
-
-	return;
 }
 
 static void rfcomm_tty_throttle(struct tty_struct *tty)
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 4767928..d0927d1 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -273,7 +273,6 @@ static inline void sco_recv_frame(struct sco_conn *conn, struct sk_buff *skb)
 
 drop:
 	kfree_skb(skb);
-	return;
 }
 
 /* -------- Socket interface ---------- */
diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c
index 024fd5b..e2b86f1 100644
--- a/net/caif/caif_dev.c
+++ b/net/caif/caif_dev.c
@@ -112,7 +112,6 @@ static void caif_device_destroy(struct net_device *dev)
 	spin_unlock_bh(&caifdevs->lock);
 
 	kfree(caifd);
-	return;
 }
 
 static int transmit(struct cflayer *layer, struct cfpkt *pkt)
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 907dc87..9c65e9d 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -713,8 +713,6 @@ static void bcm_remove_op(struct bcm_op *op)
 		kfree(op->last_frames);
 
 	kfree(op);
-
-	return;
 }
 
 static void bcm_rx_unreg(struct net_device *dev, struct bcm_op *op)
diff --git a/net/decnet/dn_dev.c b/net/decnet/dn_dev.c
index 615dbe3..4c409b4 100644
--- a/net/decnet/dn_dev.c
+++ b/net/decnet/dn_dev.c
@@ -1220,17 +1220,14 @@ void dn_dev_down(struct net_device *dev)
 
 void dn_dev_init_pkt(struct sk_buff *skb)
 {
-	return;
 }
 
 void dn_dev_veri_pkt(struct sk_buff *skb)
 {
-	return;
 }
 
 void dn_dev_hello(struct sk_buff *skb)
 {
-	return;
 }
 
 void dn_dev_devices_off(void)
diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index a8432e3..812e6df 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -264,7 +264,6 @@ static struct dst_entry *dn_dst_negative_advice(struct dst_entry *dst)
 
 static void dn_dst_link_failure(struct sk_buff *skb)
 {
-	return;
 }
 
 static inline int compare_keys(struct flowi *fl1, struct flowi *fl2)
diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c
index c97cd9f..3a92a76 100644
--- a/net/ipv4/cipso_ipv4.c
+++ b/net/ipv4/cipso_ipv4.c
@@ -290,8 +290,6 @@ void cipso_v4_cache_invalidate(void)
 		cipso_v4_cache[iter].size = 0;
 		spin_unlock_bh(&cipso_v4_cache[iter].lock);
 	}
-
-	return;
 }
 
 /**
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index c98f115..79d057a 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1022,8 +1022,6 @@ static void trie_rebalance(struct trie *t, struct tnode *tn)
 
 	rcu_assign_pointer(t->trie, (struct node *)tn);
 	tnode_free_flush();
-
-	return;
 }
 
 /* only used from updater-side */
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index fe381d1..a5a5501 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -502,7 +502,6 @@ static void ipgre_err(struct sk_buff *skb, u32 info)
 	t->err_time = jiffies;
 out:
 	rcu_read_unlock();
-	return;
 }
 
 static inline void ipgre_ecn_decapsulate(struct iphdr *iph, struct sk_buff *skb)
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index 4c09a31..39dc39e 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -238,7 +238,6 @@ void ip_options_fragment(struct sk_buff * skb)
 	opt->rr_needaddr = 0;
 	opt->ts_needaddr = 0;
 	opt->ts_needtime = 0;
-	return;
 }
 
 /*
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index f3f1c6b..48b9ae7 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1605,7 +1605,6 @@ static void ipmr_queue_xmit(struct net *net, struct mr_table *mrt,
 
 out_free:
 	kfree_skb(skb);
-	return;
 }
 
 static int ipmr_find_vif(struct mr_table *mrt, struct net_device *dev)
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 3f7c12b..0abdc24 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -890,8 +890,6 @@ out:
 		in6_ifa_put(ifp);
 	else
 		in6_dev_put(idev);
-
-	return;
 }
 
 static void ndisc_recv_na(struct sk_buff *skb)
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 458eabf..566798d 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -168,7 +168,6 @@ static void snmp6_seq_show_icmpv6msg(struct seq_file *seq, void __percpu **mib)
 			i & 0x100 ?  "Out" : "In", i & 0xff);
 		seq_printf(seq, "%-32s\t%lu\n", name, val);
 	}
-	return;
 }
 
 static void snmp6_seq_show_item(struct seq_file *seq, void __percpu **mib,
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 05ebd78..294cbe8 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -316,7 +316,6 @@ static void rt6_probe(struct rt6_info *rt)
 #else
 static inline void rt6_probe(struct rt6_info *rt)
 {
-	return;
 }
 #endif
 
@@ -1553,7 +1552,6 @@ void rt6_redirect(struct in6_addr *dest, struct in6_addr *src,
 
 out:
 	dst_release(&rt->u.dst);
-	return;
 }
 
 /*
diff --git a/net/irda/iriap.c b/net/irda/iriap.c
index 79a1e5a..fce364c 100644
--- a/net/irda/iriap.c
+++ b/net/irda/iriap.c
@@ -685,8 +685,6 @@ static void iriap_getvaluebyclass_indication(struct iriap_cb *self,
 	/* We have a match; send the value.  */
 	iriap_getvaluebyclass_response(self, obj->id, IAS_SUCCESS,
 				       attrib->value);
-
-	return;
 }
 
 /*
diff --git a/net/irda/irnet/irnet_irda.c b/net/irda/irnet/irnet_irda.c
index df18ab4..e98e40d 100644
--- a/net/irda/irnet/irnet_irda.c
+++ b/net/irda/irnet/irnet_irda.c
@@ -678,7 +678,6 @@ irda_irnet_destroy(irnet_socket *	self)
   self->stsap_sel = 0;
 
   DEXIT(IRDA_SOCK_TRACE, "\n");
-  return;
 }
 
 
@@ -928,7 +927,6 @@ irnet_disconnect_server(irnet_socket *	self,
   irttp_listen(self->tsap);
 
   DEXIT(IRDA_SERV_TRACE, "\n");
-  return;
 }
 
 /*------------------------------------------------------------------*/
@@ -1013,7 +1011,6 @@ irnet_destroy_server(void)
   irda_irnet_destroy(&irnet_server.s);
 
   DEXIT(IRDA_SERV_TRACE, "\n");
-  return;
 }
 
 
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index 8be324f..c8b4599 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -136,7 +136,6 @@ static void afiucv_pm_complete(struct device *dev)
 #ifdef CONFIG_PM_DEBUG
 	printk(KERN_WARNING "afiucv_pm_complete\n");
 #endif
-	return;
 }
 
 /**
diff --git a/net/mac80211/debugfs.h b/net/mac80211/debugfs.h
index 68e6a20..09cc9be 100644
--- a/net/mac80211/debugfs.h
+++ b/net/mac80211/debugfs.h
@@ -7,7 +7,6 @@ extern int mac80211_open_file_generic(struct inode *inode, struct file *file);
 #else
 static inline void debugfs_hw_add(struct ieee80211_local *local)
 {
-	return;
 }
 #endif
 
diff --git a/net/mac80211/mesh.c b/net/mac80211/mesh.c
index 7e93524..bde8103 100644
--- a/net/mac80211/mesh.c
+++ b/net/mac80211/mesh.c
@@ -287,8 +287,6 @@ void mesh_mgmt_ies_add(struct sk_buff *skb, struct ieee80211_sub_if_data *sdata)
 	*pos++ |= sdata->u.mesh.accepting_plinks ?
 	    MESHCONF_CAPAB_ACCEPT_PLINKS : 0x00;
 	*pos++ = 0x00;
-
-	return;
 }
 
 u32 mesh_table_hash(u8 *addr, struct ieee80211_sub_if_data *sdata, struct mesh_table *tbl)
diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
index d89ed7f..0705018 100644
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -624,7 +624,6 @@ static void hwmp_prep_frame_process(struct ieee80211_sub_if_data *sdata,
 fail:
 	rcu_read_unlock();
 	sdata->u.mesh.mshstats.dropped_frames_no_route++;
-	return;
 }
 
 static void hwmp_perr_frame_process(struct ieee80211_sub_if_data *sdata,
diff --git a/net/netlabel/netlabel_addrlist.h b/net/netlabel/netlabel_addrlist.h
index 07ae7fd..1c1c093 100644
--- a/net/netlabel/netlabel_addrlist.h
+++ b/net/netlabel/netlabel_addrlist.h
@@ -130,7 +130,6 @@ static inline void netlbl_af4list_audit_addr(struct audit_buffer *audit_buf,
 					     int src, const char *dev,
 					     __be32 addr, __be32 mask)
 {
-	return;
 }
 #endif
 
@@ -203,7 +202,6 @@ static inline void netlbl_af6list_audit_addr(struct audit_buffer *audit_buf,
 					     const struct in6_addr *addr,
 					     const struct in6_addr *mask)
 {
-	return;
 }
 #endif
 #endif /* IPV6 */
diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c
index a3d64aa..e2b0a68 100644
--- a/net/netlabel/netlabel_unlabeled.c
+++ b/net/netlabel/netlabel_unlabeled.c
@@ -670,7 +670,6 @@ static void netlbl_unlhsh_condremove_iface(struct netlbl_unlhsh_iface *iface)
 
 unlhsh_condremove_failure:
 	spin_unlock(&netlbl_unlhsh_lock);
-	return;
 }
 
 /**
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c
index 6ed61b1..f73542d 100644
--- a/net/sched/cls_flow.c
+++ b/net/sched/cls_flow.c
@@ -602,7 +602,6 @@ static unsigned long flow_get(struct tcf_proto *tp, u32 handle)
 
 static void flow_put(struct tcf_proto *tp, unsigned long f)
 {
-	return;
 }
 
 static int flow_dump(struct tcf_proto *tp, unsigned long fh,
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index b38b39c..03c597b 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -617,7 +617,6 @@ rtsc_min(struct runtime_sc *rtsc, struct internal_sc *isc, u64 x, u64 y)
 	rtsc->y = y;
 	rtsc->dx = dx;
 	rtsc->dy = dy;
-	return;
 }
 
 static void
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index a9e646b..f10e34a 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -44,7 +44,6 @@ static void ingress_put(struct Qdisc *sch, unsigned long cl)
 
 static void ingress_walk(struct Qdisc *sch, struct qdisc_walker *walker)
 {
-	return;
 }
 
 static struct tcf_proto **ingress_find_tcf(struct Qdisc *sch, unsigned long cl)
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index b2aba3f..fe91e50 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -174,7 +174,6 @@ static unsigned long mq_get(struct Qdisc *sch, u32 classid)
 
 static void mq_put(struct Qdisc *sch, unsigned long cl)
 {
-	return;
 }
 
 static int mq_dump_class(struct Qdisc *sch, unsigned long cl,
diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c
index c50876c..6ae2512 100644
--- a/net/sched/sch_multiq.c
+++ b/net/sched/sch_multiq.c
@@ -340,7 +340,6 @@ static unsigned long multiq_bind(struct Qdisc *sch, unsigned long parent,
 
 static void multiq_put(struct Qdisc *q, unsigned long cl)
 {
-	return;
 }
 
 static int multiq_dump_class(struct Qdisc *sch, unsigned long cl,
diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index 81672e0..0748fb1 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -303,7 +303,6 @@ static unsigned long prio_bind(struct Qdisc *sch, unsigned long parent, u32 clas
 
 static void prio_put(struct Qdisc *q, unsigned long cl)
 {
-	return;
 }
 
 static int prio_dump_class(struct Qdisc *sch, unsigned long cl, struct sk_buff *skb,
diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c
index 072cdf4..8d42bb3 100644
--- a/net/sched/sch_red.c
+++ b/net/sched/sch_red.c
@@ -303,7 +303,6 @@ static unsigned long red_get(struct Qdisc *sch, u32 classid)
 
 static void red_put(struct Qdisc *sch, unsigned long arg)
 {
-	return;
 }
 
 static void red_walk(struct Qdisc *sch, struct qdisc_walker *walker)
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index 3912420..e41feff 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -816,8 +816,6 @@ void sctp_assoc_del_nonprimary_peers(struct sctp_association *asoc,
 		if (t != primary)
 			sctp_assoc_rm_peer(asoc, t);
 	}
-
-	return;
 }
 
 /* Engage in transport control operations.
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 5d05717..c04b2eb 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -80,7 +80,6 @@ static inline void sctp_outq_head_data(struct sctp_outq *q,
 {
 	list_add(&ch->list, &q->out_chunk_list);
 	q->out_qlen += ch->skb->len;
-	return;
 }
 
 /* Take data from the front of the queue. */
@@ -103,7 +102,6 @@ static inline void sctp_outq_tail_data(struct sctp_outq *q,
 {
 	list_add_tail(&ch->list, &q->out_chunk_list);
 	q->out_qlen += ch->skb->len;
-	return;
 }
 
 /*
diff --git a/net/sctp/proc.c b/net/sctp/proc.c
index 784bcc9..61aacfb 100644
--- a/net/sctp/proc.c
+++ b/net/sctp/proc.c
@@ -181,7 +181,6 @@ static void * sctp_eps_seq_start(struct seq_file *seq, loff_t *pos)
 
 static void sctp_eps_seq_stop(struct seq_file *seq, void *v)
 {
-	return;
 }
 
 
@@ -286,7 +285,6 @@ static void * sctp_assocs_seq_start(struct seq_file *seq, loff_t *pos)
 
 static void sctp_assocs_seq_stop(struct seq_file *seq, void *v)
 {
-	return;
 }
 
 
@@ -409,7 +407,6 @@ static void *sctp_remaddr_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 
 static void sctp_remaddr_seq_stop(struct seq_file *seq, void *v)
 {
-	return;
 }
 
 static int sctp_remaddr_seq_show(struct seq_file *seq, void *v)
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 3b7230e..c8f05ca 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -857,8 +857,6 @@ static void sctp_cmd_process_fwdtsn(struct sctp_ulpq *ulpq,
 	sctp_walk_fwdtsn(skip, chunk) {
 		sctp_ulpq_skip(ulpq, ntohs(skip->stream), ntohs(skip->ssn));
 	}
-
-	return;
 }
 
 /* Helper function to remove the association non-primary peer
@@ -877,8 +875,6 @@ static void sctp_cmd_del_non_primary(struct sctp_association *asoc)
 			sctp_assoc_del_peer(asoc, &t->ipaddr);
 		}
 	}
-
-	return;
 }
 
 /* Helper function to set sk_err on a 1-1 style socket. */
diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index 3a44853..c7f7e49 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -955,7 +955,6 @@ void sctp_ulpq_skip(struct sctp_ulpq *ulpq, __u16 sid, __u16 ssn)
 	 * ordering and deliver them if needed.
 	 */
 	sctp_ulpq_reap_ordered(ulpq, sid);
-	return;
 }
 
 static __u16 sctp_ulpq_renege_list(struct sctp_ulpq *ulpq,
@@ -1064,7 +1063,6 @@ void sctp_ulpq_renege(struct sctp_ulpq *ulpq, struct sctp_chunk *chunk,
 	}
 
 	sk_mem_reclaim(asoc->base.sk);
-	return;
 }
 
 
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 19c9983..462462e 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1518,7 +1518,6 @@ call_refreshresult(struct rpc_task *task)
 	task->tk_action = call_refresh;
 	if (status != -ETIMEDOUT)
 		rpc_delay(task, 3*HZ);
-	return;
 }
 
 static __be32 *
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index ce0d5b3..76e504b 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -150,7 +150,6 @@ static void svc_set_cmsg_data(struct svc_rqst *rqstp, struct cmsghdr *cmh)
 		}
 		break;
 	}
-	return;
 }
 
 /*
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 699ade6..2e3d502 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -716,7 +716,6 @@ void xprt_connect(struct rpc_task *task)
 		xprt->stat.connect_start = jiffies;
 		xprt->ops->connect(task);
 	}
-	return;
 }
 
 static void xprt_connect_status(struct rpc_task *task)
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 9847c30..6e0df66 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1050,8 +1050,6 @@ static inline void xs_tcp_read_common(struct rpc_xprt *xprt,
 		if (transport->tcp_flags & TCP_RCV_LAST_FRAG)
 			transport->tcp_flags &= ~TCP_RCV_COPY_DATA;
 	}
-
-	return;
 }
 
 /*
@@ -2210,7 +2208,6 @@ static int bc_send_request(struct rpc_task *task)
 
 static void bc_close(struct rpc_xprt *xprt)
 {
-	return;
 }
 
 /*
@@ -2220,7 +2217,6 @@ static void bc_close(struct rpc_xprt *xprt)
 
 static void bc_destroy(struct rpc_xprt *xprt)
 {
-	return;
 }
 
 static struct rpc_xprt_ops xs_udp_ops = {
diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index 5319600..ca84212 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -82,7 +82,6 @@ static int __net_init sysctl_net_init(struct net *net)
 static void __net_exit sysctl_net_exit(struct net *net)
 {
 	WARN_ON(!list_empty(&net->sysctls.list));
-	return;
 }
 
 static struct pernet_operations sysctl_pernet_ops = {
diff --git a/net/wimax/stack.c b/net/wimax/stack.c
index 1ed65db..7380c89 100644
--- a/net/wimax/stack.c
+++ b/net/wimax/stack.c
@@ -320,7 +320,6 @@ void __wimax_state_change(struct wimax_dev *wimax_dev, enum wimax_st new_state)
 out:
 	d_fnend(3, dev, "(wimax_dev %p new_state %u [old %u]) = void\n",
 		wimax_dev, new_state, old_state);
-	return;
 }
 
 
@@ -362,7 +361,6 @@ void wimax_state_change(struct wimax_dev *wimax_dev, enum wimax_st new_state)
 	if (wimax_dev->state > __WIMAX_ST_NULL)
 		__wimax_state_change(wimax_dev, new_state);
 	mutex_unlock(&wimax_dev->mutex);
-	return;
 }
 EXPORT_SYMBOL_GPL(wimax_state_change);
 
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 31f4ba4..9cb3914 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2209,7 +2209,6 @@ EXPORT_SYMBOL(xfrm_dst_ifdown);
 static void xfrm_link_failure(struct sk_buff *skb)
 {
 	/* Impossible. Such dst must be popped before reaches point of failure. */
-	return;
 }
 
 static struct dst_entry *xfrm_negative_advice(struct dst_entry *dst)



^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox