* [PATCH net-next-2.6 1/2] cpmac: remove unused buflen
From: Yoichi Yuasa @ 2010-12-13 8:17 UTC (permalink / raw)
To: David S. Miller; +Cc: yuasa, netdev, Eugene Konev
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
---
drivers/net/cpmac.c | 4 ----
1 files changed, 0 insertions(+), 4 deletions(-)
diff --git a/drivers/net/cpmac.c b/drivers/net/cpmac.c
index fec939f..8021756 100644
--- a/drivers/net/cpmac.c
+++ b/drivers/net/cpmac.c
@@ -179,7 +179,6 @@ MODULE_PARM_DESC(dumb_switch, "Assume switch is not connected to MDIO bus");
struct cpmac_desc {
u32 hw_next;
u32 hw_data;
- u16 buflen;
u16 bufflags;
u16 datalen;
u16 dataflags;
@@ -415,7 +414,6 @@ static struct sk_buff *cpmac_rx_one(struct cpmac_priv *priv,
priv->dev->stats.rx_dropped++;
}
- desc->buflen = CPMAC_SKB_SIZE;
desc->dataflags = CPMAC_OWN;
return result;
@@ -586,7 +584,6 @@ static int cpmac_start_xmit(struct sk_buff *skb, struct net_device *dev)
DMA_TO_DEVICE);
desc->hw_data = (u32)desc->data_mapping;
desc->datalen = len;
- desc->buflen = len;
if (unlikely(netif_msg_tx_queued(priv)))
printk(KERN_DEBUG "%s: sending 0x%p, len=%d\n", dev->name, skb,
skb->len);
@@ -1005,7 +1002,6 @@ static int cpmac_open(struct net_device *dev)
CPMAC_SKB_SIZE,
DMA_FROM_DEVICE);
desc->hw_data = (u32)desc->data_mapping;
- desc->buflen = CPMAC_SKB_SIZE;
desc->dataflags = CPMAC_OWN;
desc->next = &priv->rx_head[(i + 1) % priv->ring_size];
desc->next->prev = desc;
--
1.7.3.3
^ permalink raw reply related
* [PATCH net-next-2.6 2/2] cpmac: fix warning length cast
From: Yoichi Yuasa @ 2010-12-13 8:18 UTC (permalink / raw)
To: David S. Miller; +Cc: yuasa, netdev, Eugene Konev
In-Reply-To: <20101213171725.2ed299cf.yuasa@linux-mips.org>
drivers/net/cpmac.c: In function 'cpmac_start_xmit':
drivers/net/cpmac.c:569: warning: comparison of distinct pointer types
lacks a cast
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
---
drivers/net/cpmac.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/cpmac.c b/drivers/net/cpmac.c
index 8021756..3b606a5 100644
--- a/drivers/net/cpmac.c
+++ b/drivers/net/cpmac.c
@@ -180,7 +180,7 @@ struct cpmac_desc {
u32 hw_next;
u32 hw_data;
u16 bufflags;
- u16 datalen;
+ size_t datalen;
u16 dataflags;
#define CPMAC_SOP 0x8000
#define CPMAC_EOP 0x4000
@@ -554,7 +554,8 @@ fatal_error:
static int cpmac_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
- int queue, len;
+ int queue;
+ size_t len;
struct cpmac_desc *desc;
struct cpmac_priv *priv = netdev_priv(dev);
@@ -564,7 +565,7 @@ static int cpmac_start_xmit(struct sk_buff *skb, struct net_device *dev)
if (unlikely(skb_padto(skb, ETH_ZLEN)))
return NETDEV_TX_OK;
- len = max(skb->len, ETH_ZLEN);
+ len = max_t(size_t, skb->len, ETH_ZLEN);
queue = skb_get_queue_mapping(skb);
netif_stop_subqueue(dev, queue);
--
1.7.3.3
^ permalink raw reply related
* Re: System blocks (hangs) on ifconfig up
From: Shmulik Hen @ 2010-12-13 9:14 UTC (permalink / raw)
To: netdev; +Cc: Shmulik Hen, Ben Hutchings, eric.dumazet, shemminger
In-Reply-To: <1292194983.3136.294.camel@localhost>
On 12/13/2010 01:03 AM, Ben Hutchings wrote:
> On Sun, 2010-12-12 at 17:00 +0200, Shmulik Hen wrote:
>> Hello,
>>
>> My system is Ubuntu 10.04, running kernel 2.6.32-26-generic.
>>
>> Whenever I try to bring up a specific ethernet interface for the second
>> time, my
>> system becomes unresponsive for 60 seconds - i.e. no mouse, no keyboard, no
>> screen refresh. etc.
>>
>> Looking at the driver's code, I could see that it's dev->open() method calls
>> wait_event_interruptible_timeout() with a timeout of 60 seconds - exactly
>> the delay I'm seeing.
> That seems like a stupid thing for it to do.
I agree...
>> I have narrowed the code to a bare minimum (see below - loosely based on
>> dummy.c), which only calls mdelay(10000) in it's dev->open() method, and
>> still, my system blocks for exactly 10 seconds when I run the following
>> sequence:
>>
>> > sudo ifconfig shmulik0 up
>> > sudo ifconfig shmulik0 down
>> > sudo ifconfig shmulik0 up
>>
>> At this point - the system is stuck for 10 seconds.
> Bringing an interface up or down is a synchronous operation and is
> serialised with most other network configuration operations. So this is
> the expected behaviour.
>
> Ben.
But why does this happen only the second time I run ifconfig up?
How come the entire system is totally frozen?
I can't even switch to other applications running. If I run 'top' in another
console, it stops refreshing for the entire period.
I'll try to explain better;
The driver I'm referring to is part of an embedded system development kit.
It runs on the controlling side, which may be a PC or some Linux embedded
system. It exposes a virtual interface that allows to communicate via
ethernet connection to a remote board, and performs the firmware download
to that board.
Unfortunately, the firmware download stage is done during dev->open() of
this virtual interface. The call to wait_event_interruptible_timeout()
is there to make sure the boot process of the remote board is complete via a
message. If all goes well the first time, there is no delay, but if the
operation
fails for any reason the first time, and a second attempt is made (another
ifconfig up), we see the freezing.
Since this driver is (mostly) closed source, I had to try and reproduce
the situation
in an all open-source driver - this is the sample code I attached to my
original
message. The call to mdelay() there is meant to simulate the delay of
the original
driver - it schedules.
Obviously, the correct way to fix this is to separate the firmware
download part
from the dev->open() method, but this is not as simple as it may sound - I'm
currently working on this. In the mean time I'm looking for a simpler
solution
(or answer) to our problem.
I'll appreciate any insight on this matter.
Thanks in advance,
Shmulik Hen.
^ permalink raw reply
* Re: set vlan CFI and TPID
From: Jesse Gross @ 2010-12-13 9:16 UTC (permalink / raw)
To: hong zhiyi; +Cc: netdev
In-Reply-To: <AANLkTimRZezo4WYZiXfsra83AojMubrZap1d1r5KUst1@mail.gmail.com>
On Sun, Dec 12, 2010 at 6:20 PM, hong zhiyi <zhiyi.hong@gmail.com> wrote:
> On Mon, Dec 13, 2010 at 8:20 AM, Jesse Gross <jesse@nicira.com> wrote:
>> On Tue, Dec 7, 2010 at 8:04 PM, hong zhiyi <zhiyi.hong@gmail.com> wrote:
>>> Hi All,
>>>
>>> May I ask how to set the 1 bit CFI and 2 bytesTPID in the VLAN header
>>> by vconfig?
>>
>> You can't change them using vconfig. For 802.1q/Ethernet, the TPID is
>> 0x8100 and the CFI is 0.
>>
>>> and what is the meaning of the egress map and ingress map?
>>
>> These convert between the PCP bits found in the vlan header and the
>> priority field that is used internally for QoS, which you access
>> through tc.
>>
>
> Hi Jesse,
>
> Thank you for your reply.
>
> Then how can it set the value of the TPID and the CFI? Also, I do not
> understand you mentioned the *tc* means, what is the tc?
You can't really. Like I said, the values are fixed by the standard
for basic 802.1q, which is what Linux implements. You could, of
course, directly send/receive packets through a raw socket with
whatever values you like. However, other components of the kernel
that deal with vlans won't recognize them.
tc is a userspace program that (among other things) handles QoS
settings. Google should be able to turn up information on how to use
it.
^ permalink raw reply
* correct include file for restart_syscall
From: Or Gerlitz @ 2010-12-13 10:17 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Roland Dreier, netdev
Hi,
commit 26574401fef6766f6c3ca25b5c13febe662d2a32 "net: Fix ipoib rtnl_lock
sysfs deadlock" modified some flows doing rtnl_lock calls to use rtnl_trylock
/ restart_syscall, etc. It turned out that restart_syscall was included in
ipoib through linux/inet_lro.h which is an include I'd like to remove...
Is including linux/sched.h being the right thing to do? I wasn't sure as
this call is very rare, in network drivers its called only by ipoib and
bonding and it looks like bonding includes it indirectly.
Or.
^ permalink raw reply
* [GIT PULL net-next-2.6] vhost-net: tools, cleanups, optimizations
From: Michael S. Tsirkin @ 2010-12-13 10:44 UTC (permalink / raw)
To: David Miller; +Cc: kvm, virtualization, netdev, linux-kernel
Please merge the following tree for 2.6.38.
Thanks!
The following changes since commit ad1184c6cf067a13e8cb2a4e7ccc407f947027d0:
net: au1000_eth: remove unused global variable. (2010-12-11 12:01:48 -0800)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost-net-next
Jason Wang (1):
vhost: fix typos in comment
Julia Lawall (1):
drivers/vhost/vhost.c: delete double assignment
Michael S. Tsirkin (9):
vhost: put mm after thread stop
vhost-net: batch use/unuse mm
vhost: copy_to_user -> __copy_to_user
vhost: get/put_user -> __get/__put_user
vhost: remove unused include
vhost: correctly set bits of dirty pages
vhost: better variable name in logging
vhost test module
tools/virtio: virtio_test tool
drivers/vhost/net.c | 9 +-
drivers/vhost/test.c | 320 ++++++++++++++++++++++++++++++++++
drivers/vhost/test.h | 7 +
drivers/vhost/vhost.c | 44 +++---
drivers/vhost/vhost.h | 2 +-
tools/virtio/Makefile | 12 ++
tools/virtio/linux/device.h | 2 +
tools/virtio/linux/slab.h | 2 +
tools/virtio/linux/virtio.h | 223 +++++++++++++++++++++++
tools/virtio/vhost_test/Makefile | 2 +
tools/virtio/vhost_test/vhost_test.c | 1 +
tools/virtio/virtio_test.c | 248 ++++++++++++++++++++++++++
12 files changed, 842 insertions(+), 30 deletions(-)
create mode 100644 drivers/vhost/test.c
create mode 100644 drivers/vhost/test.h
create mode 100644 tools/virtio/Makefile
create mode 100644 tools/virtio/linux/device.h
create mode 100644 tools/virtio/linux/slab.h
create mode 100644 tools/virtio/linux/virtio.h
create mode 100644 tools/virtio/vhost_test/Makefile
create mode 100644 tools/virtio/vhost_test/vhost_test.c
create mode 100644 tools/virtio/virtio_test.c
^ permalink raw reply
* Re: correct include file for restart_syscall
From: Eric W. Biederman @ 2010-12-13 10:51 UTC (permalink / raw)
To: Or Gerlitz; +Cc: Roland Dreier, netdev
In-Reply-To: <Pine.LNX.4.64.1012131211001.17382@zuben.voltaire.com>
Or Gerlitz <ogerlitz@voltaire.com> writes:
> Hi,
>
> commit 26574401fef6766f6c3ca25b5c13febe662d2a32 "net: Fix ipoib rtnl_lock
> sysfs deadlock" modified some flows doing rtnl_lock calls to use rtnl_trylock
> / restart_syscall, etc. It turned out that restart_syscall was included in
> ipoib through linux/inet_lro.h which is an include I'd like to remove...
> Is including linux/sched.h being the right thing to do? I wasn't sure as
> this call is very rare, in network drivers its called only by ipoib and
> bonding and it looks like bonding includes it indirectly.
For the definition of restart_syscall yes.
I wish I was clever enough to find a way not to need rtnl_trylock
restart_syscall but unfortunately I don't :(
Eric
^ permalink raw reply
* Re: [PATCH 2/9] drivers/net: don't use flush_scheduled_work()
From: Lennert Buytenhek @ 2010-12-13 11:22 UTC (permalink / raw)
To: Tejun Heo
Cc: linux-kernel, netdev, davem, Jay Cliburn, Michael Chan,
Divy Le Ray, e1000-devel, Vasanthy Kolluri, Samuel Ortiz,
Andrew Gallatin, Francois Romieu, Ramkrishna Vepa, Matt Carlson,
David Brownell, Shreyas Bhatewara
In-Reply-To: <1292169185-10579-3-git-send-email-tj@kernel.org>
On Sun, Dec 12, 2010 at 04:52:58PM +0100, Tejun Heo wrote:
> flush_scheduled_work() is on its way out. This patch contains simple
> conversions to replace flush_scheduled_work() usage with direct
> cancels and flushes.
>
> Directly cancel the used works on driver detach and flush them in
> other cases.
>
> The conversions are mostly straight forward and the only dangers are,
>
> * Forgetting to cancel/flush one or more used works.
>
> * Cancelling when a work should be flushed (ie. the work must be
> executed once scheduled whether the driver is detaching or not).
>
> I've gone over the changes multiple times but it would be much
> appreciated if you can review with the above points in mind.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Jay Cliburn <jcliburn@gmail.com>
> Cc: Michael Chan <mchan@broadcom.com>
> Cc: Divy Le Ray <divy@chelsio.com>
> Cc: e1000-devel@lists.sourceforge.net
> Cc: Vasanthy Kolluri <vkolluri@cisco.com>
> Cc: Samuel Ortiz <samuel@sortiz.org>
> Cc: Lennert Buytenhek <buytenh@wantstofly.org>
> Cc: Andrew Gallatin <gallatin@myri.com>
> Cc: Francois Romieu <romieu@fr.zoreil.com>
> Cc: Ramkrishna Vepa <ramkrishna.vepa@exar.com>
> Cc: Matt Carlson <mcarlson@broadcom.com>
> Cc: David Brownell <dbrownell@users.sourceforge.net>
> Cc: Shreyas Bhatewara <sbhatewara@vmware.com>
> Cc: netdev@vger.kernel.org
For the mv643xx_eth part:
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
^ permalink raw reply
* [PATCH kernel 2.6.37-rc5] axnet_cs: move id (0x1bf, 0x2328) to axnet_cs
From: Ken Kawasaki @ 2010-12-13 12:27 UTC (permalink / raw)
To: netdev
In-Reply-To: <20101114084208.b3ee991d.ken_kawasaki@spring.nifty.jp>
axnet_cs:
Accton EN2328 or compatible (id: 0x01bf, 0x2328) uses Asix chip.
So it works better with axnet_cs instead of pcnet_cs.
Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
---
--- linux-2.6.37-rc5/drivers/net/pcmcia/axnet_cs.c.orig 2010-12-12 17:26:37.000000000 +0900
+++ linux-2.6.37-rc5/drivers/net/pcmcia/axnet_cs.c 2010-12-12 17:27:22.000000000 +0900
@@ -690,6 +690,7 @@ static void block_output(struct net_devi
static struct pcmcia_device_id axnet_ids[] = {
PCMCIA_PFC_DEVICE_MANF_CARD(0, 0x016c, 0x0081),
PCMCIA_DEVICE_MANF_CARD(0x018a, 0x0301),
+ PCMCIA_DEVICE_MANF_CARD(0x01bf, 0x2328),
PCMCIA_DEVICE_MANF_CARD(0x026f, 0x0301),
PCMCIA_DEVICE_MANF_CARD(0x026f, 0x0303),
PCMCIA_DEVICE_MANF_CARD(0x026f, 0x0309),
--- linux-2.6.37-rc5/drivers/net/pcmcia/pcnet_cs.c.orig 2010-12-12 17:26:31.000000000 +0900
+++ linux-2.6.37-rc5/drivers/net/pcmcia/pcnet_cs.c 2010-12-12 17:26:58.000000000 +0900
@@ -1493,7 +1493,6 @@ static struct pcmcia_device_id pcnet_ids
PCMCIA_DEVICE_MANF_CARD(0x0149, 0x4530),
PCMCIA_DEVICE_MANF_CARD(0x0149, 0xc1ab),
PCMCIA_DEVICE_MANF_CARD(0x0186, 0x0110),
- PCMCIA_DEVICE_MANF_CARD(0x01bf, 0x2328),
PCMCIA_DEVICE_MANF_CARD(0x01bf, 0x8041),
PCMCIA_DEVICE_MANF_CARD(0x0213, 0x2452),
PCMCIA_DEVICE_MANF_CARD(0x026f, 0x0300),
^ permalink raw reply
* Re: System blocks (hangs) on ifconfig up
From: Eric Dumazet @ 2010-12-13 12:37 UTC (permalink / raw)
To: Shmulik Hen; +Cc: netdev, Shmulik Hen, Ben Hutchings, shemminger
In-Reply-To: <4D05E400.7080409@trego.co.il>
Le lundi 13 décembre 2010 à 11:14 +0200, Shmulik Hen a écrit :
> On 12/13/2010 01:03 AM, Ben Hutchings wrote:
> > On Sun, 2010-12-12 at 17:00 +0200, Shmulik Hen wrote:
> >> Hello,
> >>
> >> My system is Ubuntu 10.04, running kernel 2.6.32-26-generic.
> >>
> >> Whenever I try to bring up a specific ethernet interface for the second
> >> time, my
> >> system becomes unresponsive for 60 seconds - i.e. no mouse, no keyboard, no
> >> screen refresh. etc.
> >>
> >> Looking at the driver's code, I could see that it's dev->open() method calls
> >> wait_event_interruptible_timeout() with a timeout of 60 seconds - exactly
> >> the delay I'm seeing.
> > That seems like a stupid thing for it to do.
> I agree...
> >> I have narrowed the code to a bare minimum (see below - loosely based on
> >> dummy.c), which only calls mdelay(10000) in it's dev->open() method, and
> >> still, my system blocks for exactly 10 seconds when I run the following
> >> sequence:
> >>
> >> > sudo ifconfig shmulik0 up
> >> > sudo ifconfig shmulik0 down
> >> > sudo ifconfig shmulik0 up
> >>
> >> At this point - the system is stuck for 10 seconds.
> > Bringing an interface up or down is a synchronous operation and is
> > serialised with most other network configuration operations. So this is
> > the expected behaviour.
> >
> > Ben.
> But why does this happen only the second time I run ifconfig up?
> How come the entire system is totally frozen?
> I can't even switch to other applications running. If I run 'top' in another
> console, it stops refreshing for the entire period.
>
> I'll try to explain better;
> The driver I'm referring to is part of an embedded system development kit.
> It runs on the controlling side, which may be a PC or some Linux embedded
> system. It exposes a virtual interface that allows to communicate via
> ethernet connection to a remote board, and performs the firmware download
> to that board.
> Unfortunately, the firmware download stage is done during dev->open() of
> this virtual interface. The call to wait_event_interruptible_timeout()
> is there to make sure the boot process of the remote board is complete via a
> message. If all goes well the first time, there is no delay, but if the
> operation
> fails for any reason the first time, and a second attempt is made (another
> ifconfig up), we see the freezing.
>
> Since this driver is (mostly) closed source, I had to try and reproduce
> the situation
> in an all open-source driver - this is the sample code I attached to my
> original
> message. The call to mdelay() there is meant to simulate the delay of
> the original
> driver - it schedules.
>
mdelay() does a busy wait. If you are not SMP, this means a 'freeze'
If you want to schedule, you should use msleep()
> Obviously, the correct way to fix this is to separate the firmware
> download part
> from the dev->open() method, but this is not as simple as it may sound - I'm
> currently working on this. In the mean time I'm looking for a simpler
> solution
> (or answer) to our problem.
>
> I'll appreciate any insight on this matter.
>
> Thanks in advance,
> Shmulik Hen.
^ permalink raw reply
* Re: Using net_devices with ATM/DSL
From: Florian Fainelli @ 2010-12-13 12:50 UTC (permalink / raw)
To: Philip Prindeville; +Cc: netdev
In-Reply-To: <4D05C2D3.7010306@redfish-solutions.com>
Hello,
On Monday 13 December 2010 07:53:07 Philip Prindeville wrote:
> I was trying to get this discussion rolling on linux-atm-general but didn't
> have much luck.
>
> I was wondering what the downside to having ATM/DSL interfaces use
> net_devices would be?
I think you would have to add ATM/DSL-specific extensions (ala wext) just to
extend it, specializing an atm_dev would make more sense to me.
>
> Part of the reason for wanting to do this is to have an end-point to
> send/receive netlink messages to, so that the interface can report carrier
> state transitions, bit rates, bit-error rates, SNR, attenuation,
> constellations, transmitter gain, etc.
>
> Seems simple enough.
If you want to be able to report informations from the DSL PHY, I would rather
specialize an interface called, say dsl_phy which has a list of operations for
setting/getting the DSL PHY state, low-level counters ...
The atm stack more or less already supports an ATM PHY with
atmphy_ops, but is in my opinion too limited to query chip-speficic infos.
Once that interface is well defined, adding netlink support to it should be
rather straight forward.
>
> Why not do this?
>
> Thanks,
>
> -Philip
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: System blocks (hangs) on ifconfig up
From: Shmulik Hen @ 2010-12-13 13:11 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, Shmulik Hen, Ben Hutchings, shemminger
In-Reply-To: <1292243832.2759.1.camel@edumazet-laptop>
On 12/13/2010 02:37 PM, Eric Dumazet wrote:
>>
>> But why does this happen only the second time I run ifconfig up?
>> How come the entire system is totally frozen?
>> I can't even switch to other applications running. If I run 'top' in another
>> console, it stops refreshing for the entire period.
>>
>> I'll try to explain better;
>> The driver I'm referring to is part of an embedded system development kit.
>> It runs on the controlling side, which may be a PC or some Linux embedded
>> system. It exposes a virtual interface that allows to communicate via
>> ethernet connection to a remote board, and performs the firmware download
>> to that board.
>> Unfortunately, the firmware download stage is done during dev->open() of
>> this virtual interface. The call to wait_event_interruptible_timeout()
>> is there to make sure the boot process of the remote board is complete via a
>> message. If all goes well the first time, there is no delay, but if the
>> operation
>> fails for any reason the first time, and a second attempt is made (another
>> ifconfig up), we see the freezing.
>>
>> Since this driver is (mostly) closed source, I had to try and reproduce
>> the situation
>> in an all open-source driver - this is the sample code I attached to my
>> original
>> message. The call to mdelay() there is meant to simulate the delay of
>> the original
>> driver - it schedules.
>>
> mdelay() does a busy wait. If you are not SMP, this means a 'freeze'
>
> If you want to schedule, you should use msleep()
Correct - my bad. When I use msleep() in the sample code there is no freeze.
But I still don't get it - when using mdelay(), why does the system
freeze only
the second time but not in the first time?
And what about wait_event_interruptible_timeout()? surely it doesn't
do a busy loop - I can see the source calling schedule().
^ permalink raw reply
* [*v2 PATCH 00/22] IPVS, Network Name Space aware
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
This patch series adds network name space support to the LVS.
REVISION
This is version 2
OVERVIEW
The patch doesn't remove or add any functionality except for netns.
For users that don't use network name space (netns) this patch is
completely transparent.
Now it's possible to run LVS in a Linux container (see lxc-tools)
i.e. a light weight visualization. For example it's possible to run
one or several lvs on a real server in their own network name spaces.
>From the LVS point of view it looks like it runs on it's own machine.
IMPLEMENTATION
Basic requirements for netns awareness
- Global variables has to be moved to dyn. allocated memory.
- No or very little performance loss
Large hash tables connection hash and service hashes still resides in
global memory with net ptr added in hash key.
Most global variables now resides in a struct ipvs { } in netns/ip_vs.h.
The size of per name space is 2004 bytes (for x86_64) and a little bit less
for 32 bit archs.
Statistics counters is now lock-free i.e. incremented per CPU,
The estimator does a sum when using it.
Procfs ip_vs_stats is also changed to reflect the "per cpu"
ex.
# cat /proc/net/ip_vs_stats
Total Incoming Outgoing Incoming Outgoing
CPU Conns Packets Packets Bytes Bytes
0 0 3 1 9D 34
1 0 1 2 49 70
2 0 1 2 34 76
3 1 2 2 70 74
~ 1 7 7 18A 18E
Conns/s Pkts/s Pkts/s Bytes/s Bytes/s
0 0 0 0 0
Algorithm files are untouched except for lblc and lblcr.
STEP BY STEP
First patch creates network name space init for all files that need it.
How ever if a new name space is created an error is returned.
This will be removed in the last patch.
When net ptr ain't available init_net will be used temporarily.
CHANGES
*v2
The patches is totally reworked so each patch compile ...
Depends on the IPv6 and Persistence Backup patch.
Common hash-table per name-space for connections and services
Stats per CPU
smaller changes in lblc and lblcr
Triggered by Julians comment:
"tcp_timeout_change should work with the new struct ip_vs_proto_data
so that tcp_state_table will go to pd->state_table
and set_tcp_state will get pd instead of pp"
PATCH SET
This patch set is based upon lvs-test-2.6 / v2.6.37-rc1
and depends upon IPVS sync patches
STATUS
untested protos
- sctp
- esp_ah
and SIP for IPv6
SUMMARY
include/net/ip_vs.h | 236 +++++++---
include/net/net_namespace.h | 2 +
include/net/netns/ip_vs.h | 144 ++++++
net/netfilter/ipvs/ip_vs_app.c | 101 +++--
net/netfilter/ipvs/ip_vs_conn.c | 156 ++++---
net/netfilter/ipvs/ip_vs_core.c | 163 +++++--
net/netfilter/ipvs/ip_vs_ctl.c | 823 +++++++++++++++++--------------
net/netfilter/ipvs/ip_vs_est.c | 157 ++++--
net/netfilter/ipvs/ip_vs_ftp.c | 57 ++-
net/netfilter/ipvs/ip_vs_lblc.c | 66 +++-
net/netfilter/ipvs/ip_vs_lblcr.c | 70 +++-
net/netfilter/ipvs/ip_vs_nfct.c | 6 +-
net/netfilter/ipvs/ip_vs_proto.c | 121 +++++-
net/netfilter/ipvs/ip_vs_proto_ah_esp.c | 35 +-
net/netfilter/ipvs/ip_vs_proto_sctp.c | 134 +++---
net/netfilter/ipvs/ip_vs_proto_tcp.c | 125 +++---
net/netfilter/ipvs/ip_vs_proto_udp.c | 102 +++--
net/netfilter/ipvs/ip_vs_sync.c | 422 +++++++++-------
18 files changed, 1900 insertions(+), 1020 deletions(-)
^ permalink raw reply
* [*v2 PATCH 01/22] IPVS: netns, add basic init per netns.
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
Preparation for network name-space init, in this stage
some empty functions exists.
In most files there is a check if it is root ns i.e. init_net
if (!net_eq(net, &init_net))
return ...
this will be removed by the last patch, when enabling name-space.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/ip_vs.h | 14 ++++++++
include/net/net_namespace.h | 2 +
include/net/netns/ip_vs.h | 26 +++++++++++++++
net/netfilter/ipvs/ip_vs_app.c | 32 +++++++++++++++---
net/netfilter/ipvs/ip_vs_conn.c | 50 +++++++++++++++++++++-------
net/netfilter/ipvs/ip_vs_core.c | 67 ++++++++++++++++++++++++++++++++++----
net/netfilter/ipvs/ip_vs_ctl.c | 48 ++++++++++++++++++++++-----
net/netfilter/ipvs/ip_vs_est.c | 20 +++++++++++-
net/netfilter/ipvs/ip_vs_ftp.c | 34 +++++++++++++++++--
net/netfilter/ipvs/ip_vs_lblc.c | 37 +++++++++++++++++++--
net/netfilter/ipvs/ip_vs_lblcr.c | 38 +++++++++++++++++++--
net/netfilter/ipvs/ip_vs_proto.c | 19 +++++++++++
net/netfilter/ipvs/ip_vs_sync.c | 28 ++++++++++++++++
13 files changed, 368 insertions(+), 47 deletions(-)
create mode 100644 include/net/netns/ip_vs.h
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d858264..40b7003 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -28,6 +28,18 @@
#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
#include <net/netfilter/nf_conntrack.h>
#endif
+#include <net/net_namespace.h> /* Netw namespace */
+
+/*
+ * Generic access of ipvs struct
+ */
+static inline struct netns_ipvs * net_ipvs(struct net* net) {
+#ifdef CONFIG_NET_NS
+ return net->ipvs;
+#else
+ return init_net.ipvs;
+#endif
+}
/* Connections' size value needed by ip_vs_ctl.c */
extern int ip_vs_conn_tab_size;
@@ -922,6 +934,8 @@ extern char ip_vs_backup_mcast_ifn[IP_VS_IFNAME_MAXLEN];
extern int start_sync_thread(int state, char *mcast_ifn, __u8 syncid);
extern int stop_sync_thread(int state);
extern void ip_vs_sync_conn(struct ip_vs_conn *cp);
+extern int ip_vs_sync_init(void);
+extern void ip_vs_sync_cleanup(void);
/*
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 1bf812b..b3b4a34 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -20,6 +20,7 @@
#include <net/netns/conntrack.h>
#endif
#include <net/netns/xfrm.h>
+#include <net/netns/ip_vs.h>
struct proc_dir_entry;
struct net_device;
@@ -94,6 +95,7 @@ struct net {
#ifdef CONFIG_XFRM
struct netns_xfrm xfrm;
#endif
+ struct netns_ipvs *ipvs;
};
diff --git a/include/net/netns/ip_vs.h b/include/net/netns/ip_vs.h
new file mode 100644
index 0000000..9068d95
--- /dev/null
+++ b/include/net/netns/ip_vs.h
@@ -0,0 +1,26 @@
+/*
+ * ip_vs.h
+ *
+ * Created on: Nov 23, 2010
+ * Author: hans
+ */
+
+#ifndef IP_VS_H_
+#define IP_VS_H_
+
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/list_nulls.h>
+#include <linux/ip_vs.h>
+#include <asm/atomic.h>
+#include <linux/in.h>
+
+struct ip_vs_stats;
+struct ip_vs_sync_buff;
+struct ctl_table_header;
+
+struct netns_ipvs {
+ int inc; /* Incarnation */
+};
+
+#endif /* IP_VS_H_ */
diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index a475ede..6d10352 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -264,7 +264,7 @@ static inline void vs_fix_seq(const struct ip_vs_seq *vseq, struct tcphdr *th)
* for all packets before most recent resized pkt seq.
*/
if (vseq->delta || vseq->previous_delta) {
- if(after(seq, vseq->init_seq)) {
+ if (after(seq, vseq->init_seq)) {
th->seq = htonl(seq + vseq->delta);
IP_VS_DBG(9, "%s(): added delta (%d) to seq\n",
__func__, vseq->delta);
@@ -293,7 +293,7 @@ vs_fix_ack_seq(const struct ip_vs_seq *vseq, struct tcphdr *th)
if (vseq->delta || vseq->previous_delta) {
/* since ack_seq is the number of octet that is expected
to receive next, so compare it with init_seq+delta */
- if(after(ack_seq, vseq->init_seq+vseq->delta)) {
+ if (after(ack_seq, vseq->init_seq+vseq->delta)) {
th->ack_seq = htonl(ack_seq - vseq->delta);
IP_VS_DBG(9, "%s(): subtracted delta "
"(%d) from ack_seq\n", __func__, vseq->delta);
@@ -569,15 +569,35 @@ static const struct file_operations ip_vs_app_fops = {
};
#endif
-int __init ip_vs_app_init(void)
+static int __net_init __ip_vs_app_init(struct net *net)
{
- /* we will replace it with proc_net_ipvs_create() soon */
- proc_net_fops_create(&init_net, "ip_vs_app", 0, &ip_vs_app_fops);
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return -EPERM;
+
+ proc_net_fops_create(net, "ip_vs_app", 0, &ip_vs_app_fops);
return 0;
}
+static void __net_exit __ip_vs_app_cleanup(struct net *net)
+{
+ proc_net_remove(net, "ip_vs_app");
+}
+
+static struct pernet_operations ip_vs_app_ops = {
+ .init = __ip_vs_app_init,
+ .exit = __ip_vs_app_cleanup,
+};
+
+int __init ip_vs_app_init(void)
+{
+ int rv;
+
+ rv = register_pernet_subsys(&ip_vs_app_ops);
+ return rv;
+}
+
void ip_vs_app_cleanup(void)
{
- proc_net_remove(&init_net, "ip_vs_app");
+ unregister_pernet_subsys(&ip_vs_app_ops);
}
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 66e4662..5a9f5f8 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -1202,11 +1202,17 @@ static void ip_vs_conn_flush(void)
}
}
-
-int __init ip_vs_conn_init(void)
+int __net_init __ip_vs_conn_init(struct net *net)
{
int idx;
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return -EPERM;
+
+ /* Compute size and mask */
+ ip_vs_conn_tab_size = 1 << ip_vs_conn_tab_bits;
+ ip_vs_conn_tab_mask = ip_vs_conn_tab_size - 1;
+
/* Compute size and mask */
ip_vs_conn_tab_size = 1 << ip_vs_conn_tab_bits;
ip_vs_conn_tab_mask = ip_vs_conn_tab_size - 1;
@@ -1243,24 +1249,42 @@ int __init ip_vs_conn_init(void)
rwlock_init(&__ip_vs_conntbl_lock_array[idx].l);
}
- proc_net_fops_create(&init_net, "ip_vs_conn", 0, &ip_vs_conn_fops);
- proc_net_fops_create(&init_net, "ip_vs_conn_sync", 0, &ip_vs_conn_sync_fops);
-
- /* calculate the random value for connection hash */
- get_random_bytes(&ip_vs_conn_rnd, sizeof(ip_vs_conn_rnd));
+ proc_net_fops_create(net, "ip_vs_conn", 0, &ip_vs_conn_fops);
+ proc_net_fops_create(net, "ip_vs_conn_sync", 0, &ip_vs_conn_sync_fops);
return 0;
}
-
-
-void ip_vs_conn_cleanup(void)
+/* Cleanup and release all netns related ... */
+static void __net_exit __ip_vs_conn_cleanup(struct net *net)
{
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return;
+
/* flush all the connection entries first */
ip_vs_conn_flush();
-
/* Release the empty cache */
kmem_cache_destroy(ip_vs_conn_cachep);
- proc_net_remove(&init_net, "ip_vs_conn");
- proc_net_remove(&init_net, "ip_vs_conn_sync");
+ proc_net_remove(net, "ip_vs_conn");
+ proc_net_remove(net, "ip_vs_conn_sync");
vfree(ip_vs_conn_tab);
}
+static struct pernet_operations ipvs_conn_ops = {
+ .init = __ip_vs_conn_init,
+ .exit = __ip_vs_conn_cleanup,
+};
+
+int __init ip_vs_conn_init(void)
+{
+ int rv;
+
+ rv = register_pernet_subsys(&ipvs_conn_ops);
+
+ /* calculate the random value for connection hash */
+ get_random_bytes(&ip_vs_conn_rnd, sizeof(ip_vs_conn_rnd));
+ return rv;
+}
+
+void ip_vs_conn_cleanup(void)
+{
+ unregister_pernet_subsys(&ipvs_conn_ops);
+}
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 5287771..cc9bbce 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -41,6 +41,7 @@
#include <net/icmp.h> /* for icmp_send */
#include <net/route.h>
#include <net/ip6_checksum.h>
+#include <net/netns/generic.h> /* net_generic() */
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
@@ -68,6 +69,12 @@ EXPORT_SYMBOL(ip_vs_conn_put);
EXPORT_SYMBOL(ip_vs_get_debug_level);
#endif
+int ip_vs_net_id __read_mostly;
+#ifdef IP_VS_GENERIC_NETNS
+EXPORT_SYMBOL(ip_vs_net_id);
+#endif
+/* netns cnt used for uniqueness */
+static atomic_t ipvs_netns_cnt = ATOMIC_INIT(0);
/* ID used in ICMP lookups */
#define icmp_id(icmph) (((icmph)->un).echo.id)
@@ -1813,6 +1820,44 @@ static struct nf_hook_ops ip_vs_ops[] __read_mostly = {
#endif
};
+/*
+ * Initialize IP Virtual Server netns mem.
+ */
+static int __net_init __ip_vs_init(struct net *net)
+{
+ struct netns_ipvs *ipvs;
+
+ if (!net_eq(net, &init_net)) {
+ pr_err("The final patch for enabling netns is missing\n");
+ return -EPERM;
+ }
+ ipvs = (struct netns_ipvs *)net_generic(net, ip_vs_net_id);
+ if (ipvs == NULL) {
+ pr_err("%s(): no memory.\n", __func__);
+ return -ENOMEM;
+ }
+ /* Incarnation counters used for creating unique names */
+ ipvs->inc = atomic_read(&ipvs_netns_cnt);
+ atomic_inc(&ipvs_netns_cnt);
+ net->ipvs = ipvs;
+ printk(KERN_INFO "IPVS: Creating netns size=%lu id=%d\n",
+ sizeof(struct netns_ipvs), ipvs->inc);
+ return 0;
+}
+
+static void __net_exit __ip_vs_cleanup(struct net *net)
+{
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
+ IP_VS_DBG(10, "ipvs netns %d released\n", ipvs->inc);
+}
+
+static struct pernet_operations ipvs_core_ops = {
+ .init = __ip_vs_init,
+ .exit = __ip_vs_cleanup,
+ .id = &ip_vs_net_id,
+ .size = sizeof(struct netns_ipvs),
+};
/*
* Initialize IP Virtual Server
@@ -1821,8 +1866,11 @@ static int __init ip_vs_init(void)
{
int ret;
- ip_vs_estimator_init();
+ ret = register_pernet_subsys(&ipvs_core_ops); /* Alloc ip_vs struct */
+ if (ret < 0)
+ return ret;
+ ip_vs_estimator_init();
ret = ip_vs_control_init();
if (ret < 0) {
pr_err("can't setup control.\n");
@@ -1830,28 +1878,30 @@ static int __init ip_vs_init(void)
}
ip_vs_protocol_init();
-
ret = ip_vs_app_init();
if (ret < 0) {
pr_err("can't setup application helper.\n");
goto cleanup_protocol;
}
-
ret = ip_vs_conn_init();
if (ret < 0) {
pr_err("can't setup connection table.\n");
goto cleanup_app;
}
-
+ ret = ip_vs_sync_init();
+ if (ret < 0) {
+ pr_err("can't setup sync data.\n");
+ goto cleanup_conn;
+ }
ret = nf_register_hooks(ip_vs_ops, ARRAY_SIZE(ip_vs_ops));
if (ret < 0) {
pr_err("can't register hooks.\n");
- goto cleanup_conn;
+ goto cleanup_sync;
}
-
pr_info("ipvs loaded.\n");
return ret;
-
+ cleanup_sync:
+ ip_vs_sync_cleanup();
cleanup_conn:
ip_vs_conn_cleanup();
cleanup_app:
@@ -1861,17 +1911,20 @@ static int __init ip_vs_init(void)
ip_vs_control_cleanup();
cleanup_estimator:
ip_vs_estimator_cleanup();
+ unregister_pernet_subsys(&ipvs_core_ops); /* free ip_vs struct */
return ret;
}
static void __exit ip_vs_cleanup(void)
{
nf_unregister_hooks(ip_vs_ops, ARRAY_SIZE(ip_vs_ops));
+ ip_vs_sync_cleanup();
ip_vs_conn_cleanup();
ip_vs_app_cleanup();
ip_vs_protocol_cleanup();
ip_vs_control_cleanup();
ip_vs_estimator_cleanup();
+ unregister_pernet_subsys(&ipvs_core_ops); /* free ip_vs struct */
pr_info("ipvs unloaded.\n");
}
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index d12a13c..33511f4 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -3408,6 +3408,41 @@ static void ip_vs_genl_unregister(void)
/* End of Generic Netlink interface definitions */
+/*
+ * per netns intit/exit func.
+ */
+int __net_init __ip_vs_control_init(struct net *net)
+{
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return -EPERM;
+
+ proc_net_fops_create(net, "ip_vs", 0, &ip_vs_info_fops);
+ proc_net_fops_create(net, "ip_vs_stats", 0, &ip_vs_stats_fops);
+ sysctl_header = register_net_sysctl_table(net, net_vs_ctl_path, vs_vars);
+ if (sysctl_header == NULL)
+ goto err_reg;
+ ip_vs_new_estimator(&ip_vs_stats);
+ return 0;
+
+err_reg:
+ return -ENOMEM;
+}
+
+static void __net_exit __ip_vs_control_cleanup(struct net *net)
+{
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return;
+
+ ip_vs_kill_estimator(&ip_vs_stats);
+ unregister_net_sysctl_table(sysctl_header);
+ proc_net_remove(net, "ip_vs_stats");
+ proc_net_remove(net, "ip_vs");
+}
+
+static struct pernet_operations ipvs_control_ops = {
+ .init = __ip_vs_control_init,
+ .exit = __ip_vs_control_cleanup,
+};
int __init ip_vs_control_init(void)
{
@@ -3439,12 +3474,9 @@ int __init ip_vs_control_init(void)
return ret;
}
- proc_net_fops_create(&init_net, "ip_vs", 0, &ip_vs_info_fops);
- proc_net_fops_create(&init_net, "ip_vs_stats",0, &ip_vs_stats_fops);
-
- sysctl_header = register_sysctl_paths(net_vs_ctl_path, vs_vars);
-
- ip_vs_new_estimator(&ip_vs_stats);
+ ret = register_pernet_subsys(&ipvs_control_ops);
+ if (ret)
+ return ret;
/* Hook the defense timer */
schedule_delayed_work(&defense_work, DEFENSE_TIMER_PERIOD);
@@ -3461,9 +3493,7 @@ void ip_vs_control_cleanup(void)
cancel_rearming_delayed_work(&defense_work);
cancel_work_sync(&defense_work.work);
ip_vs_kill_estimator(&ip_vs_stats);
- unregister_sysctl_table(sysctl_header);
- proc_net_remove(&init_net, "ip_vs_stats");
- proc_net_remove(&init_net, "ip_vs");
+ unregister_pernet_subsys(&ipvs_control_ops);
ip_vs_genl_unregister();
nf_unregister_sockopt(&ip_vs_sockopts);
LeaveFunction(2);
diff --git a/net/netfilter/ipvs/ip_vs_est.c b/net/netfilter/ipvs/ip_vs_est.c
index ff28801..7417a0c 100644
--- a/net/netfilter/ipvs/ip_vs_est.c
+++ b/net/netfilter/ipvs/ip_vs_est.c
@@ -157,13 +157,31 @@ void ip_vs_zero_estimator(struct ip_vs_stats *stats)
est->outbps = 0;
}
+static int __net_init __ip_vs_estimator_init(struct net *net)
+{
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return -EPERM;
+
+ return 0;
+}
+
+static struct pernet_operations ip_vs_app_ops = {
+ .init = __ip_vs_estimator_init,
+};
+
int __init ip_vs_estimator_init(void)
{
+ int rv;
+
+ rv = register_pernet_subsys(&ip_vs_app_ops);
+ if (rv < 0)
+ return rv;
mod_timer(&est_timer, jiffies + 2 * HZ);
- return 0;
+ return rv;
}
void ip_vs_estimator_cleanup(void)
{
del_timer_sync(&est_timer);
+ unregister_pernet_subsys(&ip_vs_app_ops);
}
diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index 84aef65..0e762f3 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -399,15 +399,17 @@ static struct ip_vs_app ip_vs_ftp = {
.pkt_in = ip_vs_ftp_in,
};
-
/*
- * ip_vs_ftp initialization
+ * per netns ip_vs_ftp initialization
*/
-static int __init ip_vs_ftp_init(void)
+static int __net_init __ip_vs_ftp_init(struct net *net)
{
int i, ret;
struct ip_vs_app *app = &ip_vs_ftp;
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return -EPERM;
+
ret = register_ip_vs_app(app);
if (ret)
return ret;
@@ -427,14 +429,38 @@ static int __init ip_vs_ftp_init(void)
return ret;
}
+/*
+ * netns exit
+ */
+static void __ip_vs_ftp_exit(struct net *net)
+{
+ struct ip_vs_app *app = &ip_vs_ftp;
+
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return;
+
+ unregister_ip_vs_app(app);
+}
+
+static struct pernet_operations ip_vs_ftp_ops = {
+ .init = __ip_vs_ftp_init,
+ .exit = __ip_vs_ftp_exit,
+};
+int __init ip_vs_ftp_init(void)
+{
+ int rv;
+
+ rv = register_pernet_subsys(&ip_vs_ftp_ops);
+ return rv;
+}
/*
* ip_vs_ftp finish.
*/
static void __exit ip_vs_ftp_exit(void)
{
- unregister_ip_vs_app(&ip_vs_ftp);
+ unregister_pernet_subsys(&ip_vs_ftp_ops);
}
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 9323f89..84278fb 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -543,23 +543,54 @@ static struct ip_vs_scheduler ip_vs_lblc_scheduler =
.schedule = ip_vs_lblc_schedule,
};
+/*
+ * per netns init.
+ */
+static int __net_init __ip_vs_lblc_init(struct net *net)
+{
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return -EPERM;
+
+ sysctl_header = register_net_sysctl_table(net, net_vs_ctl_path,
+ vs_vars_table);
+ if (!sysctl_header)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void __net_exit __ip_vs_lblc_exit(struct net *net)
+{
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return;
+
+ unregister_net_sysctl_table(sysctl_header);
+}
+
+static struct pernet_operations ip_vs_lblc_ops = {
+ .init = __ip_vs_lblc_init,
+ .exit = __ip_vs_lblc_exit,
+};
static int __init ip_vs_lblc_init(void)
{
int ret;
- sysctl_header = register_sysctl_paths(net_vs_ctl_path, vs_vars_table);
+ ret = register_pernet_subsys(&ip_vs_lblc_ops);
+ if (ret)
+ return ret;
+
ret = register_ip_vs_scheduler(&ip_vs_lblc_scheduler);
if (ret)
- unregister_sysctl_table(sysctl_header);
+ unregister_pernet_subsys(&ip_vs_lblc_ops);
return ret;
}
static void __exit ip_vs_lblc_cleanup(void)
{
- unregister_sysctl_table(sysctl_header);
unregister_ip_vs_scheduler(&ip_vs_lblc_scheduler);
+ unregister_pernet_subsys(&ip_vs_lblc_ops);
}
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index dbeed8e..7c7396a 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -744,23 +744,53 @@ static struct ip_vs_scheduler ip_vs_lblcr_scheduler =
.schedule = ip_vs_lblcr_schedule,
};
+/*
+ * per netns init.
+ */
+static int __net_init __ip_vs_lblcr_init(struct net *net)
+{
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return -EPERM;
+
+ sysctl_header = register_net_sysctl_table(net, net_vs_ctl_path,
+ vs_vars_table);
+ if (!sysctl_header)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void __net_exit __ip_vs_lblcr_exit(struct net *net)
+{
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return;
+
+ unregister_net_sysctl_table(sysctl_header);
+}
+
+static struct pernet_operations ip_vs_lblcr_ops = {
+ .init = __ip_vs_lblcr_init,
+ .exit = __ip_vs_lblcr_exit,
+};
static int __init ip_vs_lblcr_init(void)
{
int ret;
- sysctl_header = register_sysctl_paths(net_vs_ctl_path, vs_vars_table);
+ ret = register_pernet_subsys(&ip_vs_lblcr_ops);
+ if (ret)
+ return ret;
+
ret = register_ip_vs_scheduler(&ip_vs_lblcr_scheduler);
if (ret)
- unregister_sysctl_table(sysctl_header);
+ unregister_pernet_subsys(&ip_vs_lblcr_ops);
return ret;
}
-
static void __exit ip_vs_lblcr_cleanup(void)
{
- unregister_sysctl_table(sysctl_header);
unregister_ip_vs_scheduler(&ip_vs_lblcr_scheduler);
+ unregister_pernet_subsys(&ip_vs_lblcr_ops);
}
diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c
index c539983..27bf034 100644
--- a/net/netfilter/ipvs/ip_vs_proto.c
+++ b/net/netfilter/ipvs/ip_vs_proto.c
@@ -236,6 +236,23 @@ ip_vs_tcpudp_debug_packet(int af, struct ip_vs_protocol *pp,
ip_vs_tcpudp_debug_packet_v4(pp, skb, offset, msg);
}
+/*
+ * per network name-space init
+ */
+static int __net_init __ip_vs_protocol_init(struct net *net)
+{
+ return 0;
+}
+
+static void __net_exit __ip_vs_protocol_cleanup(struct net *net)
+{
+ /* empty */
+}
+
+static struct pernet_operations ipvs_proto_ops = {
+ .init = __ip_vs_protocol_init,
+ .exit = __ip_vs_protocol_cleanup,
+};
int __init ip_vs_protocol_init(void)
{
@@ -265,6 +282,7 @@ int __init ip_vs_protocol_init(void)
REGISTER_PROTOCOL(&ip_vs_protocol_esp);
#endif
pr_info("Registered protocols (%s)\n", &protocols[2]);
+ return register_pernet_subsys(&ipvs_proto_ops);
return 0;
}
@@ -275,6 +293,7 @@ void ip_vs_protocol_cleanup(void)
struct ip_vs_protocol *pp;
int i;
+ unregister_pernet_subsys(&ipvs_proto_ops);
/* unregister all the ipvs protocols */
for (i = 0; i < IP_VS_PROTO_TAB_SIZE; i++) {
while ((pp = ip_vs_proto_table[i]) != NULL)
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index c1c167a..ea390f8 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1639,3 +1639,31 @@ int stop_sync_thread(int state)
return 0;
}
+
+/*
+ * Initialize data struct for each netns
+ */
+static int __net_init __ip_vs_sync_init(struct net *net)
+{
+ return 0;
+}
+
+static void __ip_vs_sync_cleanup(struct net *net)
+{
+ return;
+}
+static struct pernet_operations ipvs_sync_ops = {
+ .init = __ip_vs_sync_init,
+ .exit = __ip_vs_sync_cleanup,
+};
+
+
+int __init ip_vs_sync_init(void)
+{
+ return register_pernet_subsys(&ipvs_sync_ops);
+}
+
+void __exit ip_vs_sync_cleanup(void)
+{
+ unregister_pernet_subsys(&ipvs_sync_ops);
+}
--
1.7.2.3
^ permalink raw reply related
* [*v2 PATCH 02/22] IPVS: netns to services part 1
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
Services hash tables got netns ptr a hash arg,
While Real Servers (rs) has been moved to ipvs struct.
Two new inline functions added to get net ptr from skb.
Since ip_vs is called from different contexts there is two
places to dig for the net ptr skb->dev or skb->sk
this is handled in skb_net() and skb_sknet()
Global functions, ip_vs_service_get() ip_vs_lookup_real_service()
etc have got struct net *net as first param.
If possible get net ptr skb etc,
- if not &init_net is used at this early stage of patching.
ip_vs_ctl.c procfs not ready for netns yet.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/ip_vs.h | 62 +++++++++-
include/net/netns/ip_vs.h | 9 ++
net/netfilter/ipvs/ip_vs_conn.c | 2 +-
net/netfilter/ipvs/ip_vs_core.c | 4 +-
net/netfilter/ipvs/ip_vs_ctl.c | 223 +++++++++++++++++++--------------
net/netfilter/ipvs/ip_vs_proto_sctp.c | 5 +-
net/netfilter/ipvs/ip_vs_proto_tcp.c | 5 +-
net/netfilter/ipvs/ip_vs_proto_udp.c | 5 +-
net/netfilter/ipvs/ip_vs_sync.c | 2 +-
9 files changed, 206 insertions(+), 111 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 40b7003..7532eef 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -40,6 +40,57 @@ static inline struct netns_ipvs * net_ipvs(struct net* net) {
return init_net.ipvs;
#endif
}
+/*
+ * Get net ptr from skb in traffic cases
+ * use skb_sknet when call is from userland (ioctl or netlink)
+ */
+static inline struct net *skb_net(struct sk_buff *skb) {
+#ifdef CONFIG_NET_NS
+#ifdef CONFIG_IP_VS_DEBUG
+ /*
+ * This is used for debug only.
+ * Start with the most likely hit
+ * End with BUG
+ */
+ if (likely(skb->dev && skb->dev->nd_net))
+ return dev_net(skb->dev);
+ if (skb_dst(skb)->dev)
+ return dev_net(skb_dst(skb)->dev);
+ WARN(skb->sk,"Maybe skb_sknet should be used instead in %s() line:%d\n",
+ __func__, __LINE__);
+ if (likely(skb->sk && skb->sk->sk_net))
+ return sock_net(skb->sk);
+ pr_err("There is no net ptr to find in the skb in %s() line:%d\n",
+ __func__, __LINE__);
+ BUG();
+#else
+ return dev_net(skb->dev ? : skb_dst(skb)->dev);
+#endif
+#else
+ return &init_net;
+#endif
+}
+
+static inline struct net *skb_sknet(struct sk_buff *skb) {
+#ifdef CONFIG_NET_NS
+#ifdef CONFIG_IP_VS_DEBUG
+ /* Start with the most likely hit */
+ if (likely(skb->sk && skb->sk->sk_net))
+ return sock_net(skb->sk);
+ WARN(skb->dev,"Maybe skb_net should be used instead in %s() line:%d\n",
+ __func__, __LINE__);
+ if (likely(skb->dev && skb->dev->nd_net))
+ return dev_net(skb->dev);
+ pr_err("There is no net ptr to find in the skb in %s() line:%d\n",
+ __func__, __LINE__);
+ BUG();
+#else
+ return sock_net(skb->sk);
+#endif
+#else
+ return &init_net;
+#endif
+}
/* Connections' size value needed by ip_vs_ctl.c */
extern int ip_vs_conn_tab_size;
@@ -499,6 +550,7 @@ struct ip_vs_service {
unsigned flags; /* service status flags */
unsigned timeout; /* persistent timeout in ticks */
__be32 netmask; /* grouping granularity */
+ struct net *net;
struct list_head destinations; /* real server d-linked list */
__u32 num_dests; /* number of servers */
@@ -899,7 +951,7 @@ extern int sysctl_ip_vs_sync_ver;
extern void ip_vs_sync_switch_mode(int mode);
extern struct ip_vs_service *
-ip_vs_service_get(int af, __u32 fwmark, __u16 protocol,
+ip_vs_service_get(struct net *net, int af, __u32 fwmark, __u16 protocol,
const union nf_inet_addr *vaddr, __be16 vport);
static inline void ip_vs_service_put(struct ip_vs_service *svc)
@@ -908,7 +960,7 @@ static inline void ip_vs_service_put(struct ip_vs_service *svc)
}
extern struct ip_vs_dest *
-ip_vs_lookup_real_service(int af, __u16 protocol,
+ip_vs_lookup_real_service(struct net *net, int af, __u16 protocol,
const union nf_inet_addr *daddr, __be16 dport);
extern int ip_vs_use_count_inc(void);
@@ -916,9 +968,9 @@ extern void ip_vs_use_count_dec(void);
extern int ip_vs_control_init(void);
extern void ip_vs_control_cleanup(void);
extern struct ip_vs_dest *
-ip_vs_find_dest(int af, const union nf_inet_addr *daddr, __be16 dport,
- const union nf_inet_addr *vaddr, __be16 vport, __u16 protocol,
- __u32 fwmark);
+ip_vs_find_dest(struct net *net, int af, const union nf_inet_addr *daddr,
+ __be16 dport, const union nf_inet_addr *vaddr, __be16 vport,
+ __u16 protocol, __u32 fwmark);
extern struct ip_vs_dest *ip_vs_try_bind_dest(struct ip_vs_conn *cp);
diff --git a/include/net/netns/ip_vs.h b/include/net/netns/ip_vs.h
index 9068d95..c22fec2 100644
--- a/include/net/netns/ip_vs.h
+++ b/include/net/netns/ip_vs.h
@@ -21,6 +21,15 @@ struct ctl_table_header;
struct netns_ipvs {
int inc; /* Incarnation */
+ /*
+ * Hash table: for real service lookups
+ */
+ #define IP_VS_RTAB_BITS 4
+ #define IP_VS_RTAB_SIZE (1 << IP_VS_RTAB_BITS)
+ #define IP_VS_RTAB_MASK (IP_VS_RTAB_SIZE - 1)
+
+ struct list_head rs_table[IP_VS_RTAB_SIZE];
+
};
#endif /* IP_VS_H_ */
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 5a9f5f8..48212de 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -611,7 +611,7 @@ struct ip_vs_dest *ip_vs_try_bind_dest(struct ip_vs_conn *cp)
struct ip_vs_dest *dest;
if ((cp) && (!cp->dest)) {
- dest = ip_vs_find_dest(cp->af, &cp->daddr, cp->dport,
+ dest = ip_vs_find_dest(&init_net, cp->af, &cp->daddr, cp->dport,
&cp->vaddr, cp->vport,
cp->protocol, cp->fwmark);
ip_vs_bind_dest(cp, dest);
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index cc9bbce..90d307c 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1031,11 +1031,13 @@ drop:
static unsigned int
ip_vs_out(unsigned int hooknum, struct sk_buff *skb, int af)
{
+ struct net *net = NULL;
struct ip_vs_iphdr iph;
struct ip_vs_protocol *pp;
struct ip_vs_conn *cp;
EnterFunction(11);
+ net = skb_net(skb);
/* Already marked as IPVS request or reply? */
if (skb->ipvs_property)
@@ -1119,7 +1121,7 @@ ip_vs_out(unsigned int hooknum, struct sk_buff *skb, int af)
sizeof(_ports), _ports);
if (pptr == NULL)
return NF_ACCEPT; /* Not for me */
- if (ip_vs_lookup_real_service(af, iph.protocol,
+ if (ip_vs_lookup_real_service(net, af, iph.protocol,
&iph.saddr,
pptr[0])) {
/*
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 33511f4..37cb8d1 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -290,15 +290,6 @@ static struct list_head ip_vs_svc_table[IP_VS_SVC_TAB_SIZE];
static struct list_head ip_vs_svc_fwm_table[IP_VS_SVC_TAB_SIZE];
/*
- * Hash table: for real service lookups
- */
-#define IP_VS_RTAB_BITS 4
-#define IP_VS_RTAB_SIZE (1 << IP_VS_RTAB_BITS)
-#define IP_VS_RTAB_MASK (IP_VS_RTAB_SIZE - 1)
-
-static struct list_head ip_vs_rtable[IP_VS_RTAB_SIZE];
-
-/*
* Trash for destinations
*/
static LIST_HEAD(ip_vs_dest_trash);
@@ -314,7 +305,7 @@ static atomic_t ip_vs_nullsvc_counter = ATOMIC_INIT(0);
* Returns hash value for virtual service
*/
static __inline__ unsigned
-ip_vs_svc_hashkey(int af, unsigned proto, const union nf_inet_addr *addr,
+ip_vs_svc_hashkey(struct net *net, int af, unsigned proto, const union nf_inet_addr *addr,
__be16 port)
{
register unsigned porth = ntohs(port);
@@ -325,6 +316,7 @@ ip_vs_svc_hashkey(int af, unsigned proto, const union nf_inet_addr *addr,
addr_fold = addr->ip6[0]^addr->ip6[1]^
addr->ip6[2]^addr->ip6[3];
#endif
+ addr_fold ^= ((size_t)net>>8);
return (proto^ntohl(addr_fold)^(porth>>IP_VS_SVC_TAB_BITS)^porth)
& IP_VS_SVC_TAB_MASK;
@@ -333,13 +325,13 @@ ip_vs_svc_hashkey(int af, unsigned proto, const union nf_inet_addr *addr,
/*
* Returns hash value of fwmark for virtual service lookup
*/
-static __inline__ unsigned ip_vs_svc_fwm_hashkey(__u32 fwmark)
+static __inline__ unsigned ip_vs_svc_fwm_hashkey(struct net *net, __u32 fwmark)
{
- return fwmark & IP_VS_SVC_TAB_MASK;
+ return (((size_t)net>>8) ^ fwmark) & IP_VS_SVC_TAB_MASK;
}
/*
- * Hashes a service in the ip_vs_svc_table by <proto,addr,port>
+ * Hashes a service in the ip_vs_svc_table by <netns,proto,addr,port>
* or in the ip_vs_svc_fwm_table by fwmark.
* Should be called with locked tables.
*/
@@ -355,16 +347,16 @@ static int ip_vs_svc_hash(struct ip_vs_service *svc)
if (svc->fwmark == 0) {
/*
- * Hash it by <protocol,addr,port> in ip_vs_svc_table
+ * Hash it by <netns,protocol,addr,port> in ip_vs_svc_table
*/
- hash = ip_vs_svc_hashkey(svc->af, svc->protocol, &svc->addr,
- svc->port);
+ hash = ip_vs_svc_hashkey(svc->net, svc->af, svc->protocol,
+ &svc->addr, svc->port);
list_add(&svc->s_list, &ip_vs_svc_table[hash]);
} else {
/*
- * Hash it by fwmark in ip_vs_svc_fwm_table
+ * Hash it by fwmark in svc_fwm_table
*/
- hash = ip_vs_svc_fwm_hashkey(svc->fwmark);
+ hash = ip_vs_svc_fwm_hashkey(svc->net, svc->fwmark);
list_add(&svc->f_list, &ip_vs_svc_fwm_table[hash]);
}
@@ -376,7 +368,7 @@ static int ip_vs_svc_hash(struct ip_vs_service *svc)
/*
- * Unhashes a service from ip_vs_svc_table/ip_vs_svc_fwm_table.
+ * Unhashes a service from svc_table / svc_fwm_table.
* Should be called with locked tables.
*/
static int ip_vs_svc_unhash(struct ip_vs_service *svc)
@@ -388,10 +380,10 @@ static int ip_vs_svc_unhash(struct ip_vs_service *svc)
}
if (svc->fwmark == 0) {
- /* Remove it from the ip_vs_svc_table table */
+ /* Remove it from the svc_table table */
list_del(&svc->s_list);
} else {
- /* Remove it from the ip_vs_svc_fwm_table table */
+ /* Remove it from the svc_fwm_table table */
list_del(&svc->f_list);
}
@@ -402,20 +394,20 @@ static int ip_vs_svc_unhash(struct ip_vs_service *svc)
/*
- * Get service by {proto,addr,port} in the service table.
+ * Get service by {netns, proto,addr,port} in the service table.
*/
static inline struct ip_vs_service *
-__ip_vs_service_find(int af, __u16 protocol, const union nf_inet_addr *vaddr,
- __be16 vport)
+__ip_vs_service_find(struct net *net, int af, __u16 protocol,
+ const union nf_inet_addr *vaddr, __be16 vport)
{
unsigned hash;
struct ip_vs_service *svc;
/* Check for "full" addressed entries */
- hash = ip_vs_svc_hashkey(af, protocol, vaddr, vport);
+ hash = ip_vs_svc_hashkey(net, af, protocol, vaddr, vport);
list_for_each_entry(svc, &ip_vs_svc_table[hash], s_list){
- if ((svc->af == af)
+ if ((net_eq(svc->net, net)) && (svc->af == af)
&& ip_vs_addr_equal(af, &svc->addr, vaddr)
&& (svc->port == vport)
&& (svc->protocol == protocol)) {
@@ -432,16 +424,17 @@ __ip_vs_service_find(int af, __u16 protocol, const union nf_inet_addr *vaddr,
* Get service by {fwmark} in the service table.
*/
static inline struct ip_vs_service *
-__ip_vs_svc_fwm_find(int af, __u32 fwmark)
+__ip_vs_svc_fwm_find(struct net *net, int af, __u32 fwmark)
{
unsigned hash;
struct ip_vs_service *svc;
/* Check for fwmark addressed entries */
- hash = ip_vs_svc_fwm_hashkey(fwmark);
+ hash = ip_vs_svc_fwm_hashkey(net, fwmark);
list_for_each_entry(svc, &ip_vs_svc_fwm_table[hash], f_list) {
- if (svc->fwmark == fwmark && svc->af == af) {
+ if (net_eq(svc->net, net) && svc->fwmark == fwmark &&
+ svc->af == af) {
/* HIT */
return svc;
}
@@ -451,7 +444,7 @@ __ip_vs_svc_fwm_find(int af, __u32 fwmark)
}
struct ip_vs_service *
-ip_vs_service_get(int af, __u32 fwmark, __u16 protocol,
+ip_vs_service_get(struct net *net, int af, __u32 fwmark, __u16 protocol,
const union nf_inet_addr *vaddr, __be16 vport)
{
struct ip_vs_service *svc;
@@ -461,14 +454,14 @@ ip_vs_service_get(int af, __u32 fwmark, __u16 protocol,
/*
* Check the table hashed by fwmark first
*/
- if (fwmark && (svc = __ip_vs_svc_fwm_find(af, fwmark)))
+ if (fwmark && (svc = __ip_vs_svc_fwm_find(net, af, fwmark)))
goto out;
/*
* Check the table hashed by <protocol,addr,port>
* for "full" addressed entries
*/
- svc = __ip_vs_service_find(af, protocol, vaddr, vport);
+ svc = __ip_vs_service_find(net, af, protocol, vaddr, vport);
if (svc == NULL
&& protocol == IPPROTO_TCP
@@ -478,7 +471,7 @@ ip_vs_service_get(int af, __u32 fwmark, __u16 protocol,
* Check if ftp service entry exists, the packet
* might belong to FTP data connections.
*/
- svc = __ip_vs_service_find(af, protocol, vaddr, FTPPORT);
+ svc = __ip_vs_service_find(net, af, protocol, vaddr, FTPPORT);
}
if (svc == NULL
@@ -486,7 +479,7 @@ ip_vs_service_get(int af, __u32 fwmark, __u16 protocol,
/*
* Check if the catch-all port (port zero) exists
*/
- svc = __ip_vs_service_find(af, protocol, vaddr, 0);
+ svc = __ip_vs_service_find(net, af, protocol, vaddr, 0);
}
out:
@@ -547,10 +540,10 @@ static inline unsigned ip_vs_rs_hashkey(int af,
}
/*
- * Hashes ip_vs_dest in ip_vs_rtable by <proto,addr,port>.
+ * Hashes ip_vs_dest in rs_table by <proto,addr,port>.
* should be called with locked tables.
*/
-static int ip_vs_rs_hash(struct ip_vs_dest *dest)
+static int ip_vs_rs_hash(struct netns_ipvs *ipvs, struct ip_vs_dest *dest)
{
unsigned hash;
@@ -564,19 +557,19 @@ static int ip_vs_rs_hash(struct ip_vs_dest *dest)
*/
hash = ip_vs_rs_hashkey(dest->af, &dest->addr, dest->port);
- list_add(&dest->d_list, &ip_vs_rtable[hash]);
+ list_add(&dest->d_list, &ipvs->rs_table[hash]);
return 1;
}
/*
- * UNhashes ip_vs_dest from ip_vs_rtable.
+ * UNhashes ip_vs_dest from rs_table.
* should be called with locked tables.
*/
static int ip_vs_rs_unhash(struct ip_vs_dest *dest)
{
/*
- * Remove it from the ip_vs_rtable table.
+ * Remove it from the rs_table table.
*/
if (!list_empty(&dest->d_list)) {
list_del(&dest->d_list);
@@ -590,10 +583,11 @@ static int ip_vs_rs_unhash(struct ip_vs_dest *dest)
* Lookup real service by <proto,addr,port> in the real service table.
*/
struct ip_vs_dest *
-ip_vs_lookup_real_service(int af, __u16 protocol,
+ip_vs_lookup_real_service(struct net *net, int af, __u16 protocol,
const union nf_inet_addr *daddr,
__be16 dport)
{
+ struct netns_ipvs *ipvs = net_ipvs(net);
unsigned hash;
struct ip_vs_dest *dest;
@@ -604,7 +598,7 @@ ip_vs_lookup_real_service(int af, __u16 protocol,
hash = ip_vs_rs_hashkey(af, daddr, dport);
read_lock(&__ip_vs_rs_lock);
- list_for_each_entry(dest, &ip_vs_rtable[hash], d_list) {
+ list_for_each_entry(dest, &ipvs->rs_table[hash], d_list) {
if ((dest->af == af)
&& ip_vs_addr_equal(af, &dest->addr, daddr)
&& (dest->port == dport)
@@ -654,7 +648,8 @@ ip_vs_lookup_dest(struct ip_vs_service *svc, const union nf_inet_addr *daddr,
* ip_vs_lookup_real_service() looked promissing, but
* seems not working as expected.
*/
-struct ip_vs_dest *ip_vs_find_dest(int af, const union nf_inet_addr *daddr,
+struct ip_vs_dest *ip_vs_find_dest(struct net *net, int af,
+ const union nf_inet_addr *daddr,
__be16 dport,
const union nf_inet_addr *vaddr,
__be16 vport, __u16 protocol, __u32 fwmark)
@@ -662,7 +657,7 @@ struct ip_vs_dest *ip_vs_find_dest(int af, const union nf_inet_addr *daddr,
struct ip_vs_dest *dest;
struct ip_vs_service *svc;
- svc = ip_vs_service_get(af, fwmark, protocol, vaddr, vport);
+ svc = ip_vs_service_get(net, af, fwmark, protocol, vaddr, vport);
if (!svc)
return NULL;
dest = ip_vs_lookup_dest(svc, daddr, dport);
@@ -770,6 +765,7 @@ static void
__ip_vs_update_dest(struct ip_vs_service *svc, struct ip_vs_dest *dest,
struct ip_vs_dest_user_kern *udest, int add)
{
+ struct netns_ipvs *ipvs = net_ipvs(svc->net);
int conn_flags;
/* set the weight and the flags */
@@ -782,11 +778,11 @@ __ip_vs_update_dest(struct ip_vs_service *svc, struct ip_vs_dest *dest,
conn_flags |= IP_VS_CONN_F_NOOUTPUT;
} else {
/*
- * Put the real service in ip_vs_rtable if not present.
+ * Put the real service in rs_table if not present.
* For now only for NAT!
*/
write_lock_bh(&__ip_vs_rs_lock);
- ip_vs_rs_hash(dest);
+ ip_vs_rs_hash(ipvs, dest);
write_unlock_bh(&__ip_vs_rs_lock);
}
atomic_set(&dest->conn_flags, conn_flags);
@@ -1119,7 +1115,7 @@ ip_vs_del_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest)
* Add a service into the service hash table
*/
static int
-ip_vs_add_service(struct ip_vs_service_user_kern *u,
+ip_vs_add_service(struct net *net, struct ip_vs_service_user_kern *u,
struct ip_vs_service **svc_p)
{
int ret = 0;
@@ -1174,6 +1170,7 @@ ip_vs_add_service(struct ip_vs_service_user_kern *u,
svc->flags = u->flags;
svc->timeout = u->timeout * HZ;
svc->netmask = u->netmask;
+ svc->net = net;
INIT_LIST_HEAD(&svc->destinations);
rwlock_init(&svc->sched_lock);
@@ -1430,17 +1427,19 @@ static int ip_vs_del_service(struct ip_vs_service *svc)
/*
* Flush all the virtual services
*/
-static int ip_vs_flush(void)
+static int ip_vs_flush(struct net *net)
{
int idx;
struct ip_vs_service *svc, *nxt;
/*
- * Flush the service table hashed by <protocol,addr,port>
+ * Flush the service table hashed by <netns,protocol,addr,port>
*/
for(idx = 0; idx < IP_VS_SVC_TAB_SIZE; idx++) {
- list_for_each_entry_safe(svc, nxt, &ip_vs_svc_table[idx], s_list) {
- ip_vs_unlink_service(svc);
+ list_for_each_entry_safe(svc, nxt, &ip_vs_svc_table[idx],
+ s_list) {
+ if (net_eq(svc->net, net))
+ ip_vs_unlink_service(svc);
}
}
@@ -1450,7 +1449,8 @@ static int ip_vs_flush(void)
for(idx = 0; idx < IP_VS_SVC_TAB_SIZE; idx++) {
list_for_each_entry_safe(svc, nxt,
&ip_vs_svc_fwm_table[idx], f_list) {
- ip_vs_unlink_service(svc);
+ if (net_eq(svc->net, net))
+ ip_vs_unlink_service(svc);
}
}
@@ -1474,20 +1474,22 @@ static int ip_vs_zero_service(struct ip_vs_service *svc)
return 0;
}
-static int ip_vs_zero_all(void)
+static int ip_vs_zero_all(struct net *net)
{
int idx;
struct ip_vs_service *svc;
for(idx = 0; idx < IP_VS_SVC_TAB_SIZE; idx++) {
list_for_each_entry(svc, &ip_vs_svc_table[idx], s_list) {
- ip_vs_zero_service(svc);
+ if (net_eq(svc->net, net))
+ ip_vs_zero_service(svc);
}
}
for(idx = 0; idx < IP_VS_SVC_TAB_SIZE; idx++) {
list_for_each_entry(svc, &ip_vs_svc_fwm_table[idx], f_list) {
- ip_vs_zero_service(svc);
+ if (net_eq(svc->net, net))
+ ip_vs_zero_service(svc);
}
}
@@ -1765,6 +1767,7 @@ static struct ctl_table_header * sysctl_header;
#ifdef CONFIG_PROC_FS
struct ip_vs_iter {
+ struct seq_net_private p; /* Do not move this, netns depends upon it*/
struct list_head *table;
int bucket;
};
@@ -1791,6 +1794,7 @@ static inline const char *ip_vs_fwd_name(unsigned flags)
/* Get the Nth entry in the two lists */
static struct ip_vs_service *ip_vs_info_array(struct seq_file *seq, loff_t pos)
{
+ struct net *net = seq_file_net(seq);
struct ip_vs_iter *iter = seq->private;
int idx;
struct ip_vs_service *svc;
@@ -1798,7 +1802,7 @@ static struct ip_vs_service *ip_vs_info_array(struct seq_file *seq, loff_t pos)
/* look in hash by protocol */
for (idx = 0; idx < IP_VS_SVC_TAB_SIZE; idx++) {
list_for_each_entry(svc, &ip_vs_svc_table[idx], s_list) {
- if (pos-- == 0){
+ if (net_eq(svc->net, net) && pos-- == 0){
iter->table = ip_vs_svc_table;
iter->bucket = idx;
return svc;
@@ -1809,7 +1813,7 @@ static struct ip_vs_service *ip_vs_info_array(struct seq_file *seq, loff_t pos)
/* keep looking in fwmark */
for (idx = 0; idx < IP_VS_SVC_TAB_SIZE; idx++) {
list_for_each_entry(svc, &ip_vs_svc_fwm_table[idx], f_list) {
- if (pos-- == 0) {
+ if (net_eq(svc->net, net) && pos-- == 0) {
iter->table = ip_vs_svc_fwm_table;
iter->bucket = idx;
return svc;
@@ -1963,7 +1967,7 @@ static const struct seq_operations ip_vs_info_seq_ops = {
static int ip_vs_info_open(struct inode *inode, struct file *file)
{
- return seq_open_private(file, &ip_vs_info_seq_ops,
+ return seq_open_net(inode, file, &ip_vs_info_seq_ops,
sizeof(struct ip_vs_iter));
}
@@ -2013,7 +2017,7 @@ static int ip_vs_stats_show(struct seq_file *seq, void *v)
static int ip_vs_stats_seq_open(struct inode *inode, struct file *file)
{
- return single_open(file, ip_vs_stats_show, NULL);
+ return single_open_net(inode, file, ip_vs_stats_show);
}
static const struct file_operations ip_vs_stats_fops = {
@@ -2115,6 +2119,7 @@ static void ip_vs_copy_udest_compat(struct ip_vs_dest_user_kern *udest,
static int
do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
{
+ struct net *net = sock_net(sk);
int ret;
unsigned char arg[MAX_ARG_LEN];
struct ip_vs_service_user *usvc_compat;
@@ -2149,7 +2154,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
if (cmd == IP_VS_SO_SET_FLUSH) {
/* Flush the virtual service */
- ret = ip_vs_flush();
+ ret = ip_vs_flush(net);
goto out_unlock;
} else if (cmd == IP_VS_SO_SET_TIMEOUT) {
/* Set timeout values for (tcp tcpfin udp) */
@@ -2176,7 +2181,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
if (cmd == IP_VS_SO_SET_ZERO) {
/* if no service address is set, zero counters in all */
if (!usvc.fwmark && !usvc.addr.ip && !usvc.port) {
- ret = ip_vs_zero_all();
+ ret = ip_vs_zero_all(net);
goto out_unlock;
}
}
@@ -2193,10 +2198,10 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
/* Lookup the exact service by <protocol, addr, port> or fwmark */
if (usvc.fwmark == 0)
- svc = __ip_vs_service_find(usvc.af, usvc.protocol,
+ svc = __ip_vs_service_find(net, usvc.af, usvc.protocol,
&usvc.addr, usvc.port);
else
- svc = __ip_vs_svc_fwm_find(usvc.af, usvc.fwmark);
+ svc = __ip_vs_svc_fwm_find(net, usvc.af, usvc.fwmark);
if (cmd != IP_VS_SO_SET_ADD
&& (svc == NULL || svc->protocol != usvc.protocol)) {
@@ -2209,7 +2214,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
if (svc != NULL)
ret = -EEXIST;
else
- ret = ip_vs_add_service(&usvc, &svc);
+ ret = ip_vs_add_service(net, &usvc, &svc);
break;
case IP_VS_SO_SET_EDIT:
ret = ip_vs_edit_service(svc, &usvc);
@@ -2269,7 +2274,7 @@ ip_vs_copy_service(struct ip_vs_service_entry *dst, struct ip_vs_service *src)
}
static inline int
-__ip_vs_get_service_entries(const struct ip_vs_get_services *get,
+__ip_vs_get_service_entries(struct net *net, const struct ip_vs_get_services *get,
struct ip_vs_get_services __user *uptr)
{
int idx, count=0;
@@ -2280,7 +2285,7 @@ __ip_vs_get_service_entries(const struct ip_vs_get_services *get,
for (idx = 0; idx < IP_VS_SVC_TAB_SIZE; idx++) {
list_for_each_entry(svc, &ip_vs_svc_table[idx], s_list) {
/* Only expose IPv4 entries to old interface */
- if (svc->af != AF_INET)
+ if (svc->af != AF_INET || !net_eq(svc->net, net))
continue;
if (count >= get->num_services)
@@ -2299,7 +2304,7 @@ __ip_vs_get_service_entries(const struct ip_vs_get_services *get,
for (idx = 0; idx < IP_VS_SVC_TAB_SIZE; idx++) {
list_for_each_entry(svc, &ip_vs_svc_fwm_table[idx], f_list) {
/* Only expose IPv4 entries to old interface */
- if (svc->af != AF_INET)
+ if (svc->af != AF_INET || !net_eq(svc->net, net))
continue;
if (count >= get->num_services)
@@ -2319,7 +2324,7 @@ __ip_vs_get_service_entries(const struct ip_vs_get_services *get,
}
static inline int
-__ip_vs_get_dest_entries(const struct ip_vs_get_dests *get,
+__ip_vs_get_dest_entries(struct net *net, const struct ip_vs_get_dests *get,
struct ip_vs_get_dests __user *uptr)
{
struct ip_vs_service *svc;
@@ -2327,9 +2332,9 @@ __ip_vs_get_dest_entries(const struct ip_vs_get_dests *get,
int ret = 0;
if (get->fwmark)
- svc = __ip_vs_svc_fwm_find(AF_INET, get->fwmark);
+ svc = __ip_vs_svc_fwm_find(net, AF_INET, get->fwmark);
else
- svc = __ip_vs_service_find(AF_INET, get->protocol, &addr,
+ svc = __ip_vs_service_find(net, AF_INET, get->protocol, &addr,
get->port);
if (svc) {
@@ -2403,7 +2408,9 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
unsigned char arg[128];
int ret = 0;
unsigned int copylen;
+ struct net *net = sock_net(sk);
+ BUG_ON(!net);
if (!capable(CAP_NET_ADMIN))
return -EPERM;
@@ -2465,7 +2472,7 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
ret = -EINVAL;
goto out;
}
- ret = __ip_vs_get_service_entries(get, user);
+ ret = __ip_vs_get_service_entries(net, get, user);
}
break;
@@ -2478,9 +2485,9 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
entry = (struct ip_vs_service_entry *)arg;
addr.ip = entry->addr;
if (entry->fwmark)
- svc = __ip_vs_svc_fwm_find(AF_INET, entry->fwmark);
+ svc = __ip_vs_svc_fwm_find(net, AF_INET, entry->fwmark);
else
- svc = __ip_vs_service_find(AF_INET, entry->protocol,
+ svc = __ip_vs_service_find(net, AF_INET, entry->protocol,
&addr, entry->port);
if (svc) {
ip_vs_copy_service(entry, svc);
@@ -2504,7 +2511,7 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
ret = -EINVAL;
goto out;
}
- ret = __ip_vs_get_dest_entries(get, user);
+ ret = __ip_vs_get_dest_entries(net, get, user);
}
break;
@@ -2724,11 +2731,12 @@ static int ip_vs_genl_dump_services(struct sk_buff *skb,
int idx = 0, i;
int start = cb->args[0];
struct ip_vs_service *svc;
+ struct net *net = skb_sknet(skb);
mutex_lock(&__ip_vs_mutex);
for (i = 0; i < IP_VS_SVC_TAB_SIZE; i++) {
list_for_each_entry(svc, &ip_vs_svc_table[i], s_list) {
- if (++idx <= start)
+ if (++idx <= start || !net_eq(svc->net, net))
continue;
if (ip_vs_genl_dump_service(skb, svc, cb) < 0) {
idx--;
@@ -2739,7 +2747,7 @@ static int ip_vs_genl_dump_services(struct sk_buff *skb,
for (i = 0; i < IP_VS_SVC_TAB_SIZE; i++) {
list_for_each_entry(svc, &ip_vs_svc_fwm_table[i], f_list) {
- if (++idx <= start)
+ if (++idx <= start || !net_eq(svc->net, net))
continue;
if (ip_vs_genl_dump_service(skb, svc, cb) < 0) {
idx--;
@@ -2755,7 +2763,8 @@ nla_put_failure:
return skb->len;
}
-static int ip_vs_genl_parse_service(struct ip_vs_service_user_kern *usvc,
+static int ip_vs_genl_parse_service(struct net *net,
+ struct ip_vs_service_user_kern *usvc,
struct nlattr *nla, int full_entry,
struct ip_vs_service **ret_svc)
{
@@ -2798,9 +2807,9 @@ static int ip_vs_genl_parse_service(struct ip_vs_service_user_kern *usvc,
}
if (usvc->fwmark)
- svc = __ip_vs_svc_fwm_find(usvc->af, usvc->fwmark);
+ svc = __ip_vs_svc_fwm_find(net, usvc->af, usvc->fwmark);
else
- svc = __ip_vs_service_find(usvc->af, usvc->protocol,
+ svc = __ip_vs_service_find(net, usvc->af, usvc->protocol,
&usvc->addr, usvc->port);
*ret_svc = svc;
@@ -2837,13 +2846,14 @@ static int ip_vs_genl_parse_service(struct ip_vs_service_user_kern *usvc,
return 0;
}
-static struct ip_vs_service *ip_vs_genl_find_service(struct nlattr *nla)
+static struct ip_vs_service *ip_vs_genl_find_service(struct net *net,
+ struct nlattr *nla)
{
struct ip_vs_service_user_kern usvc;
struct ip_vs_service *svc;
int ret;
- ret = ip_vs_genl_parse_service(&usvc, nla, 0, &svc);
+ ret = ip_vs_genl_parse_service(net, &usvc, nla, 0, &svc);
return ret ? ERR_PTR(ret) : svc;
}
@@ -2911,6 +2921,7 @@ static int ip_vs_genl_dump_dests(struct sk_buff *skb,
struct ip_vs_service *svc;
struct ip_vs_dest *dest;
struct nlattr *attrs[IPVS_CMD_ATTR_MAX + 1];
+ struct net *net;
mutex_lock(&__ip_vs_mutex);
@@ -2919,7 +2930,8 @@ static int ip_vs_genl_dump_dests(struct sk_buff *skb,
IPVS_CMD_ATTR_MAX, ip_vs_cmd_policy))
goto out_err;
- svc = ip_vs_genl_find_service(attrs[IPVS_CMD_ATTR_SERVICE]);
+ net = skb_sknet(skb);
+ svc = ip_vs_genl_find_service(net, attrs[IPVS_CMD_ATTR_SERVICE]);
if (IS_ERR(svc) || svc == NULL)
goto out_err;
@@ -3104,13 +3116,15 @@ static int ip_vs_genl_set_cmd(struct sk_buff *skb, struct genl_info *info)
struct ip_vs_dest_user_kern udest;
int ret = 0, cmd;
int need_full_svc = 0, need_full_dest = 0;
+ struct net *net;
+ net = skb_sknet(skb);
cmd = info->genlhdr->cmd;
mutex_lock(&__ip_vs_mutex);
if (cmd == IPVS_CMD_FLUSH) {
- ret = ip_vs_flush();
+ ret = ip_vs_flush(net);
goto out;
} else if (cmd == IPVS_CMD_SET_CONFIG) {
ret = ip_vs_genl_set_config(info->attrs);
@@ -3135,7 +3149,7 @@ static int ip_vs_genl_set_cmd(struct sk_buff *skb, struct genl_info *info)
goto out;
} else if (cmd == IPVS_CMD_ZERO &&
!info->attrs[IPVS_CMD_ATTR_SERVICE]) {
- ret = ip_vs_zero_all();
+ ret = ip_vs_zero_all(net);
goto out;
}
@@ -3145,7 +3159,7 @@ static int ip_vs_genl_set_cmd(struct sk_buff *skb, struct genl_info *info)
if (cmd == IPVS_CMD_NEW_SERVICE || cmd == IPVS_CMD_SET_SERVICE)
need_full_svc = 1;
- ret = ip_vs_genl_parse_service(&usvc,
+ ret = ip_vs_genl_parse_service(net, &usvc,
info->attrs[IPVS_CMD_ATTR_SERVICE],
need_full_svc, &svc);
if (ret)
@@ -3175,7 +3189,7 @@ static int ip_vs_genl_set_cmd(struct sk_buff *skb, struct genl_info *info)
switch (cmd) {
case IPVS_CMD_NEW_SERVICE:
if (svc == NULL)
- ret = ip_vs_add_service(&usvc, &svc);
+ ret = ip_vs_add_service(net, &usvc, &svc);
else
ret = -EEXIST;
break;
@@ -3213,7 +3227,9 @@ static int ip_vs_genl_get_cmd(struct sk_buff *skb, struct genl_info *info)
struct sk_buff *msg;
void *reply;
int ret, cmd, reply_cmd;
+ struct net *net;
+ net = skb_sknet(skb);
cmd = info->genlhdr->cmd;
if (cmd == IPVS_CMD_GET_SERVICE)
@@ -3242,7 +3258,8 @@ static int ip_vs_genl_get_cmd(struct sk_buff *skb, struct genl_info *info)
{
struct ip_vs_service *svc;
- svc = ip_vs_genl_find_service(info->attrs[IPVS_CMD_ATTR_SERVICE]);
+ svc = ip_vs_genl_find_service(net,
+ info->attrs[IPVS_CMD_ATTR_SERVICE]);
if (IS_ERR(svc)) {
ret = PTR_ERR(svc);
goto out_err;
@@ -3413,9 +3430,16 @@ static void ip_vs_genl_unregister(void)
*/
int __net_init __ip_vs_control_init(struct net *net)
{
+ int idx;
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
if (!net_eq(net, &init_net)) /* netns not enabled yet */
return -EPERM;
+ for(idx = 0; idx < IP_VS_RTAB_SIZE; idx++) {
+ INIT_LIST_HEAD(&ipvs->rs_table[idx]);
+ }
+
proc_net_fops_create(net, "ip_vs", 0, &ip_vs_info_fops);
proc_net_fops_create(net, "ip_vs_stats", 0, &ip_vs_stats_fops);
sysctl_header = register_net_sysctl_table(net, net_vs_ctl_path, vs_vars);
@@ -3446,43 +3470,48 @@ static struct pernet_operations ipvs_control_ops = {
int __init ip_vs_control_init(void)
{
- int ret;
int idx;
+ int ret;
EnterFunction(2);
- /* Initialize ip_vs_svc_table, ip_vs_svc_fwm_table, ip_vs_rtable */
+ /* Initialize svc_table, ip_vs_svc_fwm_table, rs_table */
for(idx = 0; idx < IP_VS_SVC_TAB_SIZE; idx++) {
INIT_LIST_HEAD(&ip_vs_svc_table[idx]);
INIT_LIST_HEAD(&ip_vs_svc_fwm_table[idx]);
}
- for(idx = 0; idx < IP_VS_RTAB_SIZE; idx++) {
- INIT_LIST_HEAD(&ip_vs_rtable[idx]);
+
+ ret = register_pernet_subsys(&ipvs_control_ops);
+ if (ret) {
+ pr_err("cannot register namespace.\n");
+ goto err;
}
- smp_wmb();
+
+ smp_wmb(); /* Do wee really need it now ? */
ret = nf_register_sockopt(&ip_vs_sockopts);
if (ret) {
pr_err("cannot register sockopt.\n");
- return ret;
+ goto err_net;
}
ret = ip_vs_genl_register();
if (ret) {
pr_err("cannot register Generic Netlink interface.\n");
nf_unregister_sockopt(&ip_vs_sockopts);
- return ret;
+ goto err_net;
}
- ret = register_pernet_subsys(&ipvs_control_ops);
- if (ret)
- return ret;
-
/* Hook the defense timer */
schedule_delayed_work(&defense_work, DEFENSE_TIMER_PERIOD);
LeaveFunction(2);
return 0;
+
+err_net:
+ unregister_pernet_subsys(&ipvs_control_ops);
+err:
+ return ret;
}
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index a315159..521b827 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -12,6 +12,7 @@ static int
sctp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
int *verdict, struct ip_vs_conn **cpp)
{
+ struct net *net;
struct ip_vs_service *svc;
sctp_chunkhdr_t _schunkh, *sch;
sctp_sctphdr_t *sh, _sctph;
@@ -27,9 +28,9 @@ sctp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
sizeof(_schunkh), &_schunkh);
if (sch == NULL)
return 0;
-
+ net = skb_net(skb);
if ((sch->type == SCTP_CID_INIT) &&
- (svc = ip_vs_service_get(af, skb->mark, iph.protocol,
+ (svc = ip_vs_service_get(net, af, skb->mark, iph.protocol,
&iph.daddr, sh->dest))) {
int ignored;
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index 1cdab12..5e4da60 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -31,6 +31,7 @@ static int
tcp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
int *verdict, struct ip_vs_conn **cpp)
{
+ struct net *net;
struct ip_vs_service *svc;
struct tcphdr _tcph, *th;
struct ip_vs_iphdr iph;
@@ -42,10 +43,10 @@ tcp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
*verdict = NF_DROP;
return 0;
}
-
+ net = skb_net(skb);
/* No !th->ack check to allow scheduling on SYN+ACK for Active FTP */
if (th->syn &&
- (svc = ip_vs_service_get(af, skb->mark, iph.protocol, &iph.daddr,
+ (svc = ip_vs_service_get(net, af, skb->mark, iph.protocol, &iph.daddr,
th->dest))) {
int ignored;
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index cd398de..5ab54f6 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -31,6 +31,7 @@ static int
udp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
int *verdict, struct ip_vs_conn **cpp)
{
+ struct net *net;
struct ip_vs_service *svc;
struct udphdr _udph, *uh;
struct ip_vs_iphdr iph;
@@ -42,8 +43,8 @@ udp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
*verdict = NF_DROP;
return 0;
}
^ permalink raw reply related
* [*v2 PATCH 03/22] IPVS: netns awarness to lblcr sheduler
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
var sysctl_ip_vs_lblcr_expiration moved to ipvs struct as
sysctl_lblcr_expiration
procfs updated to handle this.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/netns/ip_vs.h | 4 +++
net/netfilter/ipvs/ip_vs_lblcr.c | 52 +++++++++++++++++++++++++------------
2 files changed, 39 insertions(+), 17 deletions(-)
diff --git a/include/net/netns/ip_vs.h b/include/net/netns/ip_vs.h
index c22fec2..17e4e3a 100644
--- a/include/net/netns/ip_vs.h
+++ b/include/net/netns/ip_vs.h
@@ -30,6 +30,10 @@ struct netns_ipvs {
struct list_head rs_table[IP_VS_RTAB_SIZE];
+ /* ip_vs_lblcr */
+ int sysctl_lblcr_expiration;
+ struct ctl_table_header *lblcr_ctl_header;
+ struct ctl_table *lblcr_ctl_table;
};
#endif /* IP_VS_H_ */
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index 7c7396a..91568c3 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -70,8 +70,6 @@
* entries that haven't been touched for a day.
*/
#define COUNT_FOR_FULL_EXPIRATION 30
-static int sysctl_ip_vs_lblcr_expiration = 24*60*60*HZ;
-
/*
* for IPVS lblcr entry hash table
@@ -296,7 +294,7 @@ struct ip_vs_lblcr_table {
static ctl_table vs_vars_table[] = {
{
.procname = "lblcr_expiration",
- .data = &sysctl_ip_vs_lblcr_expiration,
+ .data = NULL,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
@@ -304,8 +302,6 @@ static ctl_table vs_vars_table[] = {
{ }
};
-static struct ctl_table_header * sysctl_header;
-
static inline void ip_vs_lblcr_free(struct ip_vs_lblcr_entry *en)
{
list_del(&en->list);
@@ -425,13 +421,14 @@ static inline void ip_vs_lblcr_full_check(struct ip_vs_service *svc)
unsigned long now = jiffies;
int i, j;
struct ip_vs_lblcr_entry *en, *nxt;
+ struct netns_ipvs *ipvs = net_ipvs(svc->net);
for (i=0, j=tbl->rover; i<IP_VS_LBLCR_TAB_SIZE; i++) {
j = (j + 1) & IP_VS_LBLCR_TAB_MASK;
write_lock(&svc->sched_lock);
list_for_each_entry_safe(en, nxt, &tbl->bucket[j], list) {
- if (time_after(en->lastuse+sysctl_ip_vs_lblcr_expiration,
+ if (time_after(en->lastuse+ipvs->sysctl_lblcr_expiration,
now))
continue;
@@ -664,6 +661,7 @@ ip_vs_lblcr_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
read_lock(&svc->sched_lock);
en = ip_vs_lblcr_get(svc->af, tbl, &iph.daddr);
if (en) {
+ struct netns_ipvs *ipvs = net_ipvs(svc->net);
/* We only hold a read lock, but this is atomic */
en->lastuse = jiffies;
@@ -675,7 +673,7 @@ ip_vs_lblcr_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
/* More than one destination + enough time passed by, cleanup */
if (atomic_read(&en->set.size) > 1 &&
time_after(jiffies, en->set.lastmod +
- sysctl_ip_vs_lblcr_expiration)) {
+ ipvs->sysctl_lblcr_expiration)) {
struct ip_vs_dest *m;
write_lock(&en->set.lock);
@@ -749,23 +747,43 @@ static struct ip_vs_scheduler ip_vs_lblcr_scheduler =
*/
static int __net_init __ip_vs_lblcr_init(struct net *net)
{
- if (!net_eq(net, &init_net)) /* netns not enabled yet */
- return -EPERM;
-
- sysctl_header = register_net_sysctl_table(net, net_vs_ctl_path,
- vs_vars_table);
- if (!sysctl_header)
- return -ENOMEM;
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
+ if (!net_eq(net, &init_net)) {
+ ipvs->lblcr_ctl_table = kmemdup(vs_vars_table,
+ sizeof(vs_vars_table),
+ GFP_KERNEL);
+ if (ipvs->lblcr_ctl_table == NULL)
+ goto err_dup;
+ } else
+ ipvs->lblcr_ctl_table = vs_vars_table;
+ ipvs->sysctl_lblcr_expiration = 24*60*60*HZ;
+ ipvs->lblcr_ctl_table[0].data = &ipvs->sysctl_lblcr_expiration;
+
+ ipvs->lblcr_ctl_header =
+ register_net_sysctl_table(net, net_vs_ctl_path,
+ ipvs->lblcr_ctl_table);
+ if (!ipvs->lblcr_ctl_header)
+ goto err_reg;
return 0;
+
+err_reg:
+ if (!net_eq(net, &init_net))
+ kfree(ipvs->lblcr_ctl_table);
+
+err_dup:
+ return -ENOMEM;
}
static void __net_exit __ip_vs_lblcr_exit(struct net *net)
{
- if (!net_eq(net, &init_net)) /* netns not enabled yet */
- return;
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
+ unregister_net_sysctl_table(ipvs->lblcr_ctl_header);
- unregister_net_sysctl_table(sysctl_header);
+ if (!net_eq(net, &init_net))
+ kfree(ipvs->lblcr_ctl_table);
}
static struct pernet_operations ip_vs_lblcr_ops = {
--
1.7.2.3
^ permalink raw reply related
* [*v2 PATCH 04/22] IPVS: netns awarness to lblc sheduler
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
var sysctl_ip_vs_lblc_expiration moved to ipvs struct as
sysctl_lblc_expiration
procfs updated to handle this.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/netns/ip_vs.h | 4 +++
net/netfilter/ipvs/ip_vs_lblc.c | 49 ++++++++++++++++++++++++++------------
2 files changed, 37 insertions(+), 16 deletions(-)
diff --git a/include/net/netns/ip_vs.h b/include/net/netns/ip_vs.h
index 17e4e3a..4c8d751 100644
--- a/include/net/netns/ip_vs.h
+++ b/include/net/netns/ip_vs.h
@@ -30,6 +30,10 @@ struct netns_ipvs {
struct list_head rs_table[IP_VS_RTAB_SIZE];
+ /* ip_vs_lblc */
+ int sysctl_lblc_expiration;
+ struct ctl_table_header *lblc_ctl_header;
+ struct ctl_table *lblc_ctl_table;
/* ip_vs_lblcr */
int sysctl_lblcr_expiration;
struct ctl_table_header *lblcr_ctl_header;
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 84278fb..b2dbfb5 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -70,7 +70,6 @@
* entries that haven't been touched for a day.
*/
#define COUNT_FOR_FULL_EXPIRATION 30
-static int sysctl_ip_vs_lblc_expiration = 24*60*60*HZ;
/*
@@ -117,7 +116,7 @@ struct ip_vs_lblc_table {
static ctl_table vs_vars_table[] = {
{
.procname = "lblc_expiration",
- .data = &sysctl_ip_vs_lblc_expiration,
+ .data = NULL,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
@@ -125,8 +124,6 @@ static ctl_table vs_vars_table[] = {
{ }
};
-static struct ctl_table_header * sysctl_header;
-
static inline void ip_vs_lblc_free(struct ip_vs_lblc_entry *en)
{
list_del(&en->list);
@@ -248,6 +245,7 @@ static inline void ip_vs_lblc_full_check(struct ip_vs_service *svc)
struct ip_vs_lblc_entry *en, *nxt;
unsigned long now = jiffies;
int i, j;
+ struct netns_ipvs *ipvs = net_ipvs(svc->net);
for (i=0, j=tbl->rover; i<IP_VS_LBLC_TAB_SIZE; i++) {
j = (j + 1) & IP_VS_LBLC_TAB_MASK;
@@ -255,7 +253,7 @@ static inline void ip_vs_lblc_full_check(struct ip_vs_service *svc)
write_lock(&svc->sched_lock);
list_for_each_entry_safe(en, nxt, &tbl->bucket[j], list) {
if (time_before(now,
- en->lastuse + sysctl_ip_vs_lblc_expiration))
+ en->lastuse + ipvs->sysctl_lblc_expiration))
continue;
ip_vs_lblc_free(en);
@@ -548,23 +546,43 @@ static struct ip_vs_scheduler ip_vs_lblc_scheduler =
*/
static int __net_init __ip_vs_lblc_init(struct net *net)
{
- if (!net_eq(net, &init_net)) /* netns not enabled yet */
- return -EPERM;
-
- sysctl_header = register_net_sysctl_table(net, net_vs_ctl_path,
- vs_vars_table);
- if (!sysctl_header)
- return -ENOMEM;
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
+ if (!net_eq(net, &init_net)) {
+ ipvs->lblc_ctl_table = kmemdup(vs_vars_table,
+ sizeof(vs_vars_table),
+ GFP_KERNEL);
+ if (ipvs->lblc_ctl_table == NULL)
+ goto err_dup;
+ } else
+ ipvs->lblc_ctl_table = vs_vars_table;
+ ipvs->sysctl_lblc_expiration = 24*60*60*HZ;
+ ipvs->lblc_ctl_table[0].data = &ipvs->sysctl_lblc_expiration;
+
+ ipvs->lblc_ctl_header =
+ register_net_sysctl_table(net, net_vs_ctl_path,
+ ipvs->lblc_ctl_table);
+ if (!ipvs->lblc_ctl_header)
+ goto err_reg;
return 0;
+
+err_reg:
+ if (!net_eq(net, &init_net))
+ kfree(ipvs->lblc_ctl_table);
+
+err_dup:
+ return -ENOMEM;
}
static void __net_exit __ip_vs_lblc_exit(struct net *net)
{
- if (!net_eq(net, &init_net)) /* netns not enabled yet */
- return;
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
+ unregister_net_sysctl_table(ipvs->lblc_ctl_header);
- unregister_net_sysctl_table(sysctl_header);
+ if (!net_eq(net, &init_net))
+ kfree(ipvs->lblc_ctl_table);
}
static struct pernet_operations ip_vs_lblc_ops = {
@@ -586,7 +604,6 @@ static int __init ip_vs_lblc_init(void)
return ret;
}
^ permalink raw reply related
* [*v2 PATCH 05/22] IPVS: netns, prepare protocol
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
Add support for protocol data per name-space.
in struct ip_vs_protocol, appcnt will be removed when all protos
are modified for network name-space.
This patch causes warnings of unused functions, they will be used
when next patch will be applied.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/ip_vs.h | 20 +++++++++++-
include/net/netns/ip_vs.h | 3 ++
net/netfilter/ipvs/ip_vs_proto.c | 66 ++++++++++++++++++++++++++++++++++++++
3 files changed, 88 insertions(+), 1 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 7532eef..528c5e8 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -353,6 +353,7 @@ struct iphdr;
struct ip_vs_conn;
struct ip_vs_app;
struct sk_buff;
+struct ip_vs_proto_data;
struct ip_vs_protocol {
struct ip_vs_protocol *next;
@@ -367,6 +368,10 @@ struct ip_vs_protocol {
void (*exit)(struct ip_vs_protocol *pp);
+ void (*init_netns)(struct net *net, struct ip_vs_proto_data *pd);
+
+ void (*exit_netns)(struct net *net, struct ip_vs_proto_data *pd);
+
int (*conn_schedule)(int af, struct sk_buff *skb,
struct ip_vs_protocol *pp,
int *verdict, struct ip_vs_conn **cpp);
@@ -418,7 +423,20 @@ struct ip_vs_protocol {
int (*set_state_timeout)(struct ip_vs_protocol *pp, char *sname, int to);
};
-extern struct ip_vs_protocol * ip_vs_proto_get(unsigned short proto);
+/*
+ * protocol data per netns
+ */
+struct ip_vs_proto_data {
+ struct ip_vs_proto_data *next;
+ struct ip_vs_protocol *pp;
+ int *timeout_table; /* protocol timeout table */
+ atomic_t appcnt; /* counter of proto app incs. */
+ struct tcp_states_t *tcp_state_table;
+};
+
+extern struct ip_vs_protocol * ip_vs_proto_get(unsigned short proto);
+extern struct ip_vs_proto_data * ip_vs_proto_data_get(struct net *net,
+ unsigned short proto);
struct ip_vs_conn_param {
const union nf_inet_addr *caddr;
diff --git a/include/net/netns/ip_vs.h b/include/net/netns/ip_vs.h
index 4c8d751..b7d7815 100644
--- a/include/net/netns/ip_vs.h
+++ b/include/net/netns/ip_vs.h
@@ -29,6 +29,9 @@ struct netns_ipvs {
#define IP_VS_RTAB_MASK (IP_VS_RTAB_SIZE - 1)
struct list_head rs_table[IP_VS_RTAB_SIZE];
+ /* ip_vs_proto */
+ #define IP_VS_PROTO_TAB_SIZE 32 /* must be power of 2 */
+ struct ip_vs_proto_data *proto_data_table[IP_VS_PROTO_TAB_SIZE];
/* ip_vs_lblc */
int sysctl_lblc_expiration;
diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c
index 27bf034..8caaf3e 100644
--- a/net/netfilter/ipvs/ip_vs_proto.c
+++ b/net/netfilter/ipvs/ip_vs_proto.c
@@ -60,6 +60,31 @@ static int __used __init register_ip_vs_protocol(struct ip_vs_protocol *pp)
return 0;
}
+/*
+ * register an ipvs protocols netns related data
+ */
+static int
+register_ip_vs_proto_netns(struct net *net, struct ip_vs_protocol *pp)
+{
+ struct netns_ipvs *ipvs = net_ipvs(net);
+ unsigned hash = IP_VS_PROTO_HASH(pp->protocol);
+ struct ip_vs_proto_data *pd =
+ kzalloc(sizeof(struct ip_vs_proto_data), GFP_ATOMIC);
+
+ if (!pd) {
+ pr_err("%s(): no memory.\n", __func__);
+ return -ENOMEM;
+ }
+ pd->pp = pp; /* For speed issues */
+ pd->next = ipvs->proto_data_table[hash];
+ ipvs->proto_data_table[hash] = pd;
+ atomic_set(&pd->appcnt, 0); /* Init app counter */
+
+ if (pp->init_netns != NULL)
+ pp->init_netns(net, pd);
+
+ return 0;
+}
/*
* unregister an ipvs protocol
@@ -82,6 +107,29 @@ static int unregister_ip_vs_protocol(struct ip_vs_protocol *pp)
return -ESRCH;
}
+/*
+ * unregister an ipvs protocols netns data
+ */
+static int
+unregister_ip_vs_proto_netns(struct net *net, struct ip_vs_proto_data *pd)
+{
+ struct netns_ipvs *ipvs = net_ipvs(net);
+ struct ip_vs_proto_data **pd_p;
+ unsigned hash = IP_VS_PROTO_HASH(pd->pp->protocol);
+
+ pd_p = &ipvs->proto_data_table[hash];
+ for (; *pd_p; pd_p = &(*pd_p)->next) {
+ if (*pd_p == pd) {
+ *pd_p = pd->next;
+ if (pd->pp->exit_netns != NULL)
+ pd->pp->exit_netns(net, pd);
+ kfree(pd);
+ return 0;
+ }
+ }
+
+ return -ESRCH;
+}
/*
* get ip_vs_protocol object by its proto.
@@ -100,6 +148,24 @@ struct ip_vs_protocol * ip_vs_proto_get(unsigned short proto)
}
EXPORT_SYMBOL(ip_vs_proto_get);
+/*
+ * get ip_vs_protocol object data by netns and proto
+ */
+struct ip_vs_proto_data *
+ip_vs_proto_data_get(struct net *net, unsigned short proto)
+{
+ struct netns_ipvs *ipvs = net_ipvs(net);
+ struct ip_vs_proto_data *pd;
+ unsigned hash = IP_VS_PROTO_HASH(proto);
+
+ for (pd = ipvs->proto_data_table[hash]; pd; pd = pd->next) {
+ if (pd->pp->protocol == proto)
+ return pd;
+ }
+
+ return NULL;
+}
+EXPORT_SYMBOL(ip_vs_proto_data_get);
/*
* Propagate event for state change to all protocols
--
1.7.2.3
^ permalink raw reply related
* [*v2 PATCH 07/22] IPVS: netns preparation for proto_udp
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
In this phase (one), all local vars will be moved to ipvs struct.
Remaining work, add param struct net *net to a couple of
functions that is common for all protos and use ip_vs_proto_data
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/netns/ip_vs.h | 8 +++
net/netfilter/ipvs/ip_vs_proto.c | 3 +
net/netfilter/ipvs/ip_vs_proto_udp.c | 92 +++++++++++++++++++--------------
3 files changed, 64 insertions(+), 39 deletions(-)
diff --git a/include/net/netns/ip_vs.h b/include/net/netns/ip_vs.h
index 512cdd0..4975026 100644
--- a/include/net/netns/ip_vs.h
+++ b/include/net/netns/ip_vs.h
@@ -40,6 +40,14 @@ struct netns_ipvs {
struct list_head tcp_apps[TCP_APP_TAB_SIZE];
spinlock_t tcp_app_lock;
#endif
+ /* ip_vs_proto_udp */
+#ifdef CONFIG_IP_VS_PROTO_UDP
+ #define UDP_APP_TAB_BITS 4
+ #define UDP_APP_TAB_SIZE (1 << UDP_APP_TAB_BITS)
+ #define UDP_APP_TAB_MASK (UDP_APP_TAB_SIZE - 1)
+ struct list_head udp_apps[UDP_APP_TAB_SIZE];
+ spinlock_t udp_app_lock;
+#endif
/* ip_vs_lblc */
int sysctl_lblc_expiration;
diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c
index 90d69c5..ec71d47 100644
--- a/net/netfilter/ipvs/ip_vs_proto.c
+++ b/net/netfilter/ipvs/ip_vs_proto.c
@@ -310,6 +310,9 @@ static int __net_init __ip_vs_protocol_init(struct net *net)
#ifdef CONFIG_IP_VS_PROTO_TCP
register_ip_vs_proto_netns(net, &ip_vs_protocol_tcp);
#endif
+#ifdef CONFIG_IP_VS_PROTO_UDP
+ register_ip_vs_proto_netns(net, &ip_vs_protocol_udp);
+#endif
return 0;
}
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index 5ab54f6..f30cb8c 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -9,7 +9,8 @@
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*
- * Changes:
+ * Changes: Hans Schillstrom <hans.schillstrom@ericsson.com>
+ * Network name space (netns) aware.
*
*/
@@ -345,19 +346,6 @@ udp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp)
return 1;
}
-
-/*
- * Note: the caller guarantees that only one of register_app,
- * unregister_app or app_conn_bind is called each time.
- */
-
-#define UDP_APP_TAB_BITS 4
-#define UDP_APP_TAB_SIZE (1 << UDP_APP_TAB_BITS)
-#define UDP_APP_TAB_MASK (UDP_APP_TAB_SIZE - 1)
-
-static struct list_head udp_apps[UDP_APP_TAB_SIZE];
-static DEFINE_SPINLOCK(udp_app_lock);
-
static inline __u16 udp_app_hashkey(__be16 port)
{
return (((__force u16)port >> UDP_APP_TAB_BITS) ^ (__force u16)port)
@@ -371,22 +359,24 @@ static int udp_register_app(struct ip_vs_app *inc)
__u16 hash;
__be16 port = inc->port;
int ret = 0;
+ struct netns_ipvs *ipvs = net_ipvs(&init_net);
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_UDP);
hash = udp_app_hashkey(port);
- spin_lock_bh(&udp_app_lock);
- list_for_each_entry(i, &udp_apps[hash], p_list) {
+ spin_lock_bh(&ipvs->udp_app_lock);
+ list_for_each_entry(i, &ipvs->udp_apps[hash], p_list) {
if (i->port == port) {
ret = -EEXIST;
goto out;
}
}
- list_add(&inc->p_list, &udp_apps[hash]);
- atomic_inc(&ip_vs_protocol_udp.appcnt);
+ list_add(&inc->p_list, &ipvs->udp_apps[hash]);
+ atomic_inc(&pd->pp->appcnt);
out:
- spin_unlock_bh(&udp_app_lock);
+ spin_unlock_bh(&ipvs->udp_app_lock);
return ret;
}
@@ -394,15 +384,19 @@ static int udp_register_app(struct ip_vs_app *inc)
static void
udp_unregister_app(struct ip_vs_app *inc)
{
- spin_lock_bh(&udp_app_lock);
- atomic_dec(&ip_vs_protocol_udp.appcnt);
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_UDP);
+ struct netns_ipvs *ipvs = net_ipvs(&init_net);
+
+ spin_lock_bh(&ipvs->udp_app_lock);
+ atomic_dec(&pd->pp->appcnt);
list_del(&inc->p_list);
- spin_unlock_bh(&udp_app_lock);
+ spin_unlock_bh(&ipvs->udp_app_lock);
}
static int udp_app_conn_bind(struct ip_vs_conn *cp)
{
+ struct netns_ipvs *ipvs = net_ipvs(&init_net);
int hash;
struct ip_vs_app *inc;
int result = 0;
@@ -414,12 +408,12 @@ static int udp_app_conn_bind(struct ip_vs_conn *cp)
/* Lookup application incarnations and bind the right one */
hash = udp_app_hashkey(cp->vport);
- spin_lock(&udp_app_lock);
- list_for_each_entry(inc, &udp_apps[hash], p_list) {
+ spin_lock(&ipvs->udp_app_lock);
+ list_for_each_entry(inc, &ipvs->udp_apps[hash], p_list) {
if (inc->port == cp->vport) {
if (unlikely(!ip_vs_app_inc_get(inc)))
break;
- spin_unlock(&udp_app_lock);
+ spin_unlock(&ipvs->udp_app_lock);
IP_VS_DBG_BUF(9, "%s(): Binding conn %s:%u->"
"%s:%u to app %s on port %u\n",
@@ -436,14 +430,14 @@ static int udp_app_conn_bind(struct ip_vs_conn *cp)
goto out;
}
}
- spin_unlock(&udp_app_lock);
+ spin_unlock(&ipvs->udp_app_lock);
out:
return result;
}
-static int udp_timeouts[IP_VS_UDP_S_LAST+1] = {
+static const int udp_timeouts[IP_VS_UDP_S_LAST+1] = {
[IP_VS_UDP_S_NORMAL] = 5*60*HZ,
[IP_VS_UDP_S_LAST] = 2*HZ,
};
@@ -453,13 +447,18 @@ static const char *const udp_state_name_table[IP_VS_UDP_S_LAST+1] = {
[IP_VS_UDP_S_LAST] = "BUG!",
};
-
+/*
static int
udp_set_state_timeout(struct ip_vs_protocol *pp, char *sname, int to)
{
- return ip_vs_set_state_timeout(pp->timeout_table, IP_VS_UDP_S_LAST,
- udp_state_name_table, sname, to);
-}
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_UDP);
+ if (pd)
+ return ip_vs_set_state_timeout(pd->timeout_table,
+ IP_VS_UDP_S_LAST,
+ udp_state_name_table, sname, to);
+ else
+ return -ENOENT;
+} */
static const char * udp_state_name(int state)
{
@@ -473,18 +472,31 @@ udp_state_transition(struct ip_vs_conn *cp, int direction,
const struct sk_buff *skb,
struct ip_vs_protocol *pp)
{
- cp->timeout = pp->timeout_table[IP_VS_UDP_S_NORMAL];
+ struct ip_vs_proto_data *pd; /* Temp fix, pp will be replaced by pd */
+
+ pd = ip_vs_proto_data_get(&init_net, IPPROTO_UDP);
+ if (unlikely(!pd)) {
+ pr_err("UDP no ns data\n");
+ return 0;
+ }
+
+ cp->timeout = pd->timeout_table[IP_VS_UDP_S_NORMAL];
return 1;
}
-static void udp_init(struct ip_vs_protocol *pp)
+static void __udp_init(struct net *net, struct ip_vs_proto_data *pd)
{
- IP_VS_INIT_HASH_TABLE(udp_apps);
- pp->timeout_table = udp_timeouts;
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
+ ip_vs_init_hash_table(ipvs->udp_apps, UDP_APP_TAB_SIZE);
+ spin_lock_init(&ipvs->udp_app_lock);
+ pd->timeout_table = ip_vs_create_timeout_table((int *)udp_timeouts,
+ sizeof(udp_timeouts));
}
-static void udp_exit(struct ip_vs_protocol *pp)
+static void __udp_exit(struct net *net, struct ip_vs_proto_data *pd)
{
+ kfree(pd->timeout_table);
}
@@ -493,8 +505,10 @@ struct ip_vs_protocol ip_vs_protocol_udp = {
.protocol = IPPROTO_UDP,
.num_states = IP_VS_UDP_S_LAST,
.dont_defrag = 0,
- .init = udp_init,
- .exit = udp_exit,
+ .init = NULL,
+ .exit = NULL,
+ .init_netns = __udp_init,
+ .exit_netns = __udp_exit,
.conn_schedule = udp_conn_schedule,
.conn_in_get = ip_vs_conn_in_get_proto,
.conn_out_get = ip_vs_conn_out_get_proto,
@@ -508,5 +522,5 @@ struct ip_vs_protocol ip_vs_protocol_udp = {
.app_conn_bind = udp_app_conn_bind,
.debug_packet = ip_vs_tcpudp_debug_packet,
.timeout_change = NULL,
- .set_state_timeout = udp_set_state_timeout,
+/* .set_state_timeout = udp_set_state_timeout, */
};
--
1.7.2.3
^ permalink raw reply related
* [*v2 PATCH 09/22] IPVS: netns preparation for proto_ah_esp
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
In this phase (one), all local vars will be moved to ipvs struct.
Remaining work, add param struct net *net to a couple of
functions that common for all protos.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
net/netfilter/ipvs/ip_vs_proto.c | 6 ++++++
net/netfilter/ipvs/ip_vs_proto_ah_esp.c | 20 ++++----------------
2 files changed, 10 insertions(+), 16 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c
index fd3a7a8..1089248 100644
--- a/net/netfilter/ipvs/ip_vs_proto.c
+++ b/net/netfilter/ipvs/ip_vs_proto.c
@@ -316,6 +316,12 @@ static int __net_init __ip_vs_protocol_init(struct net *net)
#ifdef CONFIG_IP_VS_PROTO_SCTP
register_ip_vs_proto_netns(net, &ip_vs_protocol_sctp);
#endif
+#ifdef CONFIG_IP_VS_PROTO_AH
+ register_ip_vs_proto_netns(net, &ip_vs_protocol_ah);
+#endif
+#ifdef CONFIG_IP_VS_PROTO_ESP
+ register_ip_vs_proto_netns(net, &ip_vs_protocol_esp);
+#endif
return 0;
}
diff --git a/net/netfilter/ipvs/ip_vs_proto_ah_esp.c b/net/netfilter/ipvs/ip_vs_proto_ah_esp.c
index 3a04611..b8b37fa 100644
--- a/net/netfilter/ipvs/ip_vs_proto_ah_esp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_ah_esp.c
@@ -117,26 +117,14 @@ ah_esp_conn_schedule(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
return 0;
}
-static void ah_esp_init(struct ip_vs_protocol *pp)
-{
- /* nothing to do now */
-}
-
-
-static void ah_esp_exit(struct ip_vs_protocol *pp)
-{
- /* nothing to do now */
-}
-
-
#ifdef CONFIG_IP_VS_PROTO_AH
struct ip_vs_protocol ip_vs_protocol_ah = {
.name = "AH",
.protocol = IPPROTO_AH,
.num_states = 1,
.dont_defrag = 1,
- .init = ah_esp_init,
- .exit = ah_esp_exit,
+ .init = NULL,
+ .exit = NULL,
.conn_schedule = ah_esp_conn_schedule,
.conn_in_get = ah_esp_conn_in_get,
.conn_out_get = ah_esp_conn_out_get,
@@ -159,8 +147,8 @@ struct ip_vs_protocol ip_vs_protocol_esp = {
.protocol = IPPROTO_ESP,
.num_states = 1,
.dont_defrag = 1,
- .init = ah_esp_init,
- .exit = ah_esp_exit,
+ .init = NULL,
+ .exit = NULL,
.conn_schedule = ah_esp_conn_schedule,
.conn_in_get = ah_esp_conn_in_get,
.conn_out_get = ah_esp_conn_out_get,
--
1.7.2.3
^ permalink raw reply related
* [*v2 PATCH 10/22] IPVS: netns, use ip_vs_proto_data as param.
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
ip_vs_protocol *pp is replaced by ip_vs_proto_data *pd in
function call in ip_vs_protocol struct i.e. :,
- timeout_change()
- state_transition()
ip_vs_protocol_timeout_change() got ipvs as param, due to above
and a upcoming patch - defence work
Most of this changes are triggered by Julians comment:
"tcp_timeout_change should work with the new struct ip_vs_proto_data
so that tcp_state_table will go to pd->state_table
and set_tcp_state will get pd instead of pp"
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/ip_vs.h | 6 ++--
net/netfilter/ipvs/ip_vs_core.c | 4 ++-
net/netfilter/ipvs/ip_vs_ctl.c | 54 ++++++++++++++++++++-------------
net/netfilter/ipvs/ip_vs_proto.c | 17 ++++++++--
net/netfilter/ipvs/ip_vs_proto_sctp.c | 10 ++----
net/netfilter/ipvs/ip_vs_proto_tcp.c | 20 +++++-------
net/netfilter/ipvs/ip_vs_proto_udp.c | 5 +--
7 files changed, 65 insertions(+), 51 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 3765add..045cbe0 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -406,7 +406,7 @@ struct ip_vs_protocol {
int (*state_transition)(struct ip_vs_conn *cp, int direction,
const struct sk_buff *skb,
- struct ip_vs_protocol *pp);
+ struct ip_vs_proto_data *pd);
int (*register_app)(struct ip_vs_app *inc);
@@ -419,7 +419,7 @@ struct ip_vs_protocol {
int offset,
const char *msg);
- void (*timeout_change)(struct ip_vs_protocol *pp, int flags);
+ void (*timeout_change)(struct ip_vs_proto_data *pd, int flags);
int (*set_state_timeout)(struct ip_vs_protocol *pp, char *sname, int to);
};
@@ -919,7 +919,7 @@ static inline void ip_vs_pe_put(const struct ip_vs_pe *pe)
*/
extern int ip_vs_protocol_init(void);
extern void ip_vs_protocol_cleanup(void);
-extern void ip_vs_protocol_timeout_change(int flags);
+extern void ip_vs_protocol_timeout_change(struct netns_ipvs *ipvs, int flags);
extern int *ip_vs_create_timeout_table(int *table, int size);
extern int
ip_vs_set_state_timeout(int *table, int num, const char *const *names,
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 90d307c..c9289b5 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -181,7 +181,9 @@ ip_vs_set_state(struct ip_vs_conn *cp, int direction,
{
if (unlikely(!pp->state_transition))
return 0;
- return pp->state_transition(cp, direction, skb, pp);
+ return pp->state_transition(cp, direction, skb,
+ ip_vs_proto_data_get(skb_net(skb),
+ pp->protocol));
}
static inline int
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 37cb8d1..2c9cd5c 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -38,6 +38,7 @@
#include <linux/mutex.h>
#include <net/net_namespace.h>
+#include <linux/nsproxy.h>
#include <net/ip.h>
#ifdef CONFIG_IP_VS_IPV6
#include <net/ipv6.h>
@@ -127,7 +128,7 @@ static int __ip_vs_addr_is_local_v6(const struct in6_addr *addr)
* update_defense_level is called from keventd and from sysctl,
* so it needs to protect itself from softirqs
*/
-static void update_defense_level(void)
+static void update_defense_level(struct netns_ipvs *ipvs)
{
struct sysinfo i;
static int old_secure_tcp = 0;
@@ -241,7 +242,7 @@ static void update_defense_level(void)
}
old_secure_tcp = sysctl_ip_vs_secure_tcp;
if (to_change >= 0)
- ip_vs_protocol_timeout_change(sysctl_ip_vs_secure_tcp>1);
+ ip_vs_protocol_timeout_change(ipvs, sysctl_ip_vs_secure_tcp>1);
spin_unlock(&ip_vs_securetcp_lock);
local_bh_enable();
@@ -257,7 +258,10 @@ static DECLARE_DELAYED_WORK(defense_work, defense_work_handler);
static void defense_work_handler(struct work_struct *work)
{
- update_defense_level();
+ struct net *net = &init_net;
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
+ update_defense_level(ipvs);
if (atomic_read(&ip_vs_dropentry))
ip_vs_random_dropentry();
@@ -1502,6 +1506,7 @@ static int
proc_do_defense_mode(ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
{
+ struct net *net = current->nsproxy->net_ns;
int *valp = table->data;
int val = *valp;
int rc;
@@ -1512,7 +1517,7 @@ proc_do_defense_mode(ctl_table *table, int write,
/* Restore the correct value */
*valp = val;
} else {
- update_defense_level();
+ update_defense_level(net_ipvs(net));
}
}
return rc;
@@ -2033,8 +2038,10 @@ static const struct file_operations ip_vs_stats_fops = {
/*
* Set timeout values for tcp tcpfin udp in the timeout_table.
*/
-static int ip_vs_set_timeout(struct ip_vs_timeout_user *u)
+static int ip_vs_set_timeout(struct net *net, struct ip_vs_timeout_user *u)
{
+ struct ip_vs_proto_data *pd;
+
IP_VS_DBG(2, "Setting timeout tcp:%d tcpfin:%d udp:%d\n",
u->tcp_timeout,
u->tcp_fin_timeout,
@@ -2042,19 +2049,22 @@ static int ip_vs_set_timeout(struct ip_vs_timeout_user *u)
#ifdef CONFIG_IP_VS_PROTO_TCP
if (u->tcp_timeout) {
- ip_vs_protocol_tcp.timeout_table[IP_VS_TCP_S_ESTABLISHED]
+ pd = ip_vs_proto_data_get(net, IPPROTO_TCP);
+ pd->timeout_table[IP_VS_TCP_S_ESTABLISHED]
= u->tcp_timeout * HZ;
}
if (u->tcp_fin_timeout) {
- ip_vs_protocol_tcp.timeout_table[IP_VS_TCP_S_FIN_WAIT]
+ pd = ip_vs_proto_data_get(net, IPPROTO_TCP);
+ pd->timeout_table[IP_VS_TCP_S_FIN_WAIT]
= u->tcp_fin_timeout * HZ;
}
#endif
#ifdef CONFIG_IP_VS_PROTO_UDP
if (u->udp_timeout) {
- ip_vs_protocol_udp.timeout_table[IP_VS_UDP_S_NORMAL]
+ pd = ip_vs_proto_data_get(net, IPPROTO_TCP);
+ pd->timeout_table[IP_VS_UDP_S_NORMAL]
= u->udp_timeout * HZ;
}
#endif
@@ -2158,7 +2168,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
goto out_unlock;
} else if (cmd == IP_VS_SO_SET_TIMEOUT) {
/* Set timeout values for (tcp tcpfin udp) */
- ret = ip_vs_set_timeout((struct ip_vs_timeout_user *)arg);
+ ret = ip_vs_set_timeout(net, (struct ip_vs_timeout_user *)arg);
goto out_unlock;
} else if (cmd == IP_VS_SO_SET_STARTDAEMON) {
struct ip_vs_daemon_user *dm = (struct ip_vs_daemon_user *)arg;
@@ -2369,17 +2379,19 @@ __ip_vs_get_dest_entries(struct net *net, const struct ip_vs_get_dests *get,
}
static inline void
-__ip_vs_get_timeouts(struct ip_vs_timeout_user *u)
+__ip_vs_get_timeouts(struct net *net, struct ip_vs_timeout_user *u)
{
+ struct ip_vs_proto_data *pd;
+
#ifdef CONFIG_IP_VS_PROTO_TCP
- u->tcp_timeout =
- ip_vs_protocol_tcp.timeout_table[IP_VS_TCP_S_ESTABLISHED] / HZ;
- u->tcp_fin_timeout =
- ip_vs_protocol_tcp.timeout_table[IP_VS_TCP_S_FIN_WAIT] / HZ;
+ pd = ip_vs_proto_data_get(net, IPPROTO_TCP);
+ u->tcp_timeout = pd->timeout_table[IP_VS_TCP_S_ESTABLISHED] / HZ;
+ u->tcp_fin_timeout = pd->timeout_table[IP_VS_TCP_S_FIN_WAIT] / HZ;
#endif
#ifdef CONFIG_IP_VS_PROTO_UDP
+ pd = ip_vs_proto_data_get(net, IPPROTO_UDP);
u->udp_timeout =
- ip_vs_protocol_udp.timeout_table[IP_VS_UDP_S_NORMAL] / HZ;
+ pd->timeout_table[IP_VS_UDP_S_NORMAL] / HZ;
#endif
}
@@ -2519,7 +2531,7 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
{
struct ip_vs_timeout_user t;
- __ip_vs_get_timeouts(&t);
+ __ip_vs_get_timeouts(net, &t);
if (copy_to_user(user, &t, sizeof(t)) != 0)
ret = -EFAULT;
}
@@ -3090,11 +3102,11 @@ static int ip_vs_genl_del_daemon(struct nlattr **attrs)
return stop_sync_thread(nla_get_u32(attrs[IPVS_DAEMON_ATTR_STATE]));
}
-static int ip_vs_genl_set_config(struct nlattr **attrs)
+static int ip_vs_genl_set_config(struct net *net, struct nlattr **attrs)
{
struct ip_vs_timeout_user t;
- __ip_vs_get_timeouts(&t);
+ __ip_vs_get_timeouts(net, &t);
if (attrs[IPVS_CMD_ATTR_TIMEOUT_TCP])
t.tcp_timeout = nla_get_u32(attrs[IPVS_CMD_ATTR_TIMEOUT_TCP]);
@@ -3106,7 +3118,7 @@ static int ip_vs_genl_set_config(struct nlattr **attrs)
if (attrs[IPVS_CMD_ATTR_TIMEOUT_UDP])
t.udp_timeout = nla_get_u32(attrs[IPVS_CMD_ATTR_TIMEOUT_UDP]);
- return ip_vs_set_timeout(&t);
+ return ip_vs_set_timeout(net, &t);
}
static int ip_vs_genl_set_cmd(struct sk_buff *skb, struct genl_info *info)
@@ -3127,7 +3139,7 @@ static int ip_vs_genl_set_cmd(struct sk_buff *skb, struct genl_info *info)
ret = ip_vs_flush(net);
goto out;
} else if (cmd == IPVS_CMD_SET_CONFIG) {
- ret = ip_vs_genl_set_config(info->attrs);
+ ret = ip_vs_genl_set_config(net, info->attrs);
goto out;
} else if (cmd == IPVS_CMD_NEW_DAEMON ||
cmd == IPVS_CMD_DEL_DAEMON) {
@@ -3279,7 +3291,7 @@ static int ip_vs_genl_get_cmd(struct sk_buff *skb, struct genl_info *info)
{
struct ip_vs_timeout_user t;
- __ip_vs_get_timeouts(&t);
+ __ip_vs_get_timeouts(net, &t);
#ifdef CONFIG_IP_VS_PROTO_TCP
NLA_PUT_U32(msg, IPVS_CMD_ATTR_TIMEOUT_TCP, t.tcp_timeout);
NLA_PUT_U32(msg, IPVS_CMD_ATTR_TIMEOUT_TCP_FIN,
diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c
index 1089248..803c6ec 100644
--- a/net/netfilter/ipvs/ip_vs_proto.c
+++ b/net/netfilter/ipvs/ip_vs_proto.c
@@ -152,9 +152,8 @@ EXPORT_SYMBOL(ip_vs_proto_get);
* get ip_vs_protocol object data by netns and proto
*/
struct ip_vs_proto_data *
-ip_vs_proto_data_get(struct net *net, unsigned short proto)
+__ipvs_proto_data_get(struct netns_ipvs *ipvs, unsigned short proto)
{
- struct netns_ipvs *ipvs = net_ipvs(net);
struct ip_vs_proto_data *pd;
unsigned hash = IP_VS_PROTO_HASH(proto);
@@ -165,12 +164,20 @@ ip_vs_proto_data_get(struct net *net, unsigned short proto)
return NULL;
}
+
+struct ip_vs_proto_data *
+ip_vs_proto_data_get(struct net *net, unsigned short proto)
+{
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
+ return __ipvs_proto_data_get(ipvs, proto);
+}
EXPORT_SYMBOL(ip_vs_proto_data_get);
/*
* Propagate event for state change to all protocols
*/
-void ip_vs_protocol_timeout_change(int flags)
+void ip_vs_protocol_timeout_change(struct netns_ipvs *ipvs, int flags)
{
struct ip_vs_protocol *pp;
int i;
@@ -178,7 +185,9 @@ void ip_vs_protocol_timeout_change(int flags)
for (i = 0; i < IP_VS_PROTO_TAB_SIZE; i++) {
for (pp = ip_vs_proto_table[i]; pp; pp = pp->next) {
if (pp->timeout_change)
- pp->timeout_change(pp, flags);
+ pp->timeout_change(__ipvs_proto_data_get(ipvs,
+ pp->protocol),
+ flags);
}
}
}
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index 108ae0c..88715bd 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -916,14 +916,13 @@ return ip_vs_set_state_timeout(pp->timeout_table, IP_VS_SCTP_S_LAST,
} */
static inline int
-set_sctp_state(struct ip_vs_protocol *pp, struct ip_vs_conn *cp,
+set_sctp_state(struct ip_vs_proto_data *pd, struct ip_vs_conn *cp,
int direction, const struct sk_buff *skb)
{
sctp_chunkhdr_t _sctpch, *sch;
unsigned char chunk_type;
int event, next_state;
int ihl;
- struct ip_vs_proto_data *pd;
#ifdef CONFIG_IP_VS_IPV6
ihl = cp->af == AF_INET ? ip_hdrlen(skb) : sizeof(struct ipv6hdr);
@@ -975,7 +974,7 @@ set_sctp_state(struct ip_vs_protocol *pp, struct ip_vs_conn *cp,
IP_VS_DBG_BUF(8, "%s %s %s:%d->"
"%s:%d state: %s->%s conn->refcnt:%d\n",
- pp->name,
+ pd->pp->name,
((direction == IP_VS_DIR_OUTPUT) ?
"output " : "input "),
IP_VS_DBG_ADDR(cp->af, &cp->daddr),
@@ -999,7 +998,6 @@ set_sctp_state(struct ip_vs_protocol *pp, struct ip_vs_conn *cp,
}
}
}
- pd = ip_vs_proto_data_get(&init_net, pp->protocol); /* tmp fix */
if (likely(pd))
cp->timeout = pd->timeout_table[cp->state = next_state];
else /* What to do ? */
@@ -1010,12 +1008,12 @@ set_sctp_state(struct ip_vs_protocol *pp, struct ip_vs_conn *cp,
static int
sctp_state_transition(struct ip_vs_conn *cp, int direction,
- const struct sk_buff *skb, struct ip_vs_protocol *pp)
+ const struct sk_buff *skb, struct ip_vs_proto_data *pd)
{
int ret = 0;
spin_lock(&cp->lock);
- ret = set_sctp_state(pp, cp, direction, skb);
+ ret = set_sctp_state(pd, cp, direction, skb);
spin_unlock(&cp->lock);
return ret;
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index 8986b25..eec7563 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -448,10 +448,7 @@ static struct tcp_states_t tcp_states_dos [] = {
/*rst*/ {{sCL, sCL, sCL, sSR, sCL, sCL, sCL, sCL, sLA, sLI, sCL }},
};
-static struct tcp_states_t *tcp_state_table = tcp_states;
-
-
-static void tcp_timeout_change(struct ip_vs_protocol *pp, int flags)
+static void tcp_timeout_change(struct ip_vs_proto_data *pd, int flags)
{
int on = (flags & 1); /* secure_tcp */
@@ -461,7 +458,7 @@ static void tcp_timeout_change(struct ip_vs_protocol *pp, int flags)
** for most if not for all of the applications. Something
** like "capabilities" (flags) for each object.
*/
- tcp_state_table = (on? tcp_states_dos : tcp_states);
+ pd->tcp_state_table = (on? tcp_states_dos : tcp_states);
}
/* Remove dot used
static int
@@ -485,13 +482,12 @@ static inline int tcp_state_idx(struct tcphdr *th)
}
static inline void
-set_tcp_state(struct ip_vs_protocol *pp, struct ip_vs_conn *cp,
+set_tcp_state(struct ip_vs_proto_data *pd, struct ip_vs_conn *cp,
int direction, struct tcphdr *th)
{
int state_idx;
int new_state = IP_VS_TCP_S_CLOSE;
int state_off = tcp_state_off[direction];
- struct ip_vs_proto_data *pd; /* Temp fix */
/*
* Update state offset to INPUT_ONLY if necessary
@@ -509,7 +505,7 @@ set_tcp_state(struct ip_vs_protocol *pp, struct ip_vs_conn *cp,
goto tcp_state_out;
}
- new_state = tcp_state_table[state_off+state_idx].next_state[cp->state];
+ new_state = pd->tcp_state_table[state_off+state_idx].next_state[cp->state];
tcp_state_out:
if (new_state != cp->state) {
@@ -517,7 +513,7 @@ set_tcp_state(struct ip_vs_protocol *pp, struct ip_vs_conn *cp,
IP_VS_DBG_BUF(8, "%s %s [%c%c%c%c] %s:%d->"
"%s:%d state: %s->%s conn->refcnt:%d\n",
- pp->name,
+ pd->pp->name,
((state_off == TCP_DIR_OUTPUT) ?
"output " : "input "),
th->syn ? 'S' : '.',
@@ -547,7 +543,6 @@ set_tcp_state(struct ip_vs_protocol *pp, struct ip_vs_conn *cp,
}
}
- pd = ip_vs_proto_data_get(&init_net, pp->protocol);
if (likely(pd))
cp->timeout = pd->timeout_table[cp->state = new_state];
else /* What to do ? */
@@ -560,7 +555,7 @@ set_tcp_state(struct ip_vs_protocol *pp, struct ip_vs_conn *cp,
static int
tcp_state_transition(struct ip_vs_conn *cp, int direction,
const struct sk_buff *skb,
- struct ip_vs_protocol *pp)
+ struct ip_vs_proto_data *pd)
{
struct tcphdr _tcph, *th;
@@ -575,7 +570,7 @@ tcp_state_transition(struct ip_vs_conn *cp, int direction,
return 0;
spin_lock(&cp->lock);
- set_tcp_state(pp, cp, direction, th);
+ set_tcp_state(pd, cp, direction, th);
spin_unlock(&cp->lock);
return 1;
@@ -698,6 +693,7 @@ static void __ip_vs_tcp_init(struct net *net, struct ip_vs_proto_data *pd)
spin_lock_init(&ipvs->tcp_app_lock);
pd->timeout_table = ip_vs_create_timeout_table((int*)tcp_timeouts,
sizeof(tcp_timeouts));
+ pd->tcp_state_table = tcp_states;
}
static void __ip_vs_tcp_exit(struct net *net, struct ip_vs_proto_data *pd)
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index f30cb8c..585d140 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -470,11 +470,8 @@ static const char * udp_state_name(int state)
static int
udp_state_transition(struct ip_vs_conn *cp, int direction,
const struct sk_buff *skb,
- struct ip_vs_protocol *pp)
+ struct ip_vs_proto_data *pd)
{
- struct ip_vs_proto_data *pd; /* Temp fix, pp will be replaced by pd */
-
- pd = ip_vs_proto_data_get(&init_net, IPPROTO_UDP);
if (unlikely(!pd)) {
pr_err("UDP no ns data\n");
return 0;
--
1.7.2.3
^ permalink raw reply related
* [*v2 PATCH 11/22] IPVS: netns, common protocol changes and use of appcnt.
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
appcnt and timeout_table moved from struct ip_vs_protocol to
ip_vs proto_data.
struct net *net added as first param to
- register_app()
- unregister_app()
- app_conn_bind()
- ip_vs_conn_new()
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/ip_vs.h | 2 -
net/netfilter/ipvs/ip_vs_conn.c | 6 ++--
net/netfilter/ipvs/ip_vs_proto_sctp.c | 4 +-
net/netfilter/ipvs/ip_vs_proto_tcp.c | 5 +--
net/netfilter/ipvs/ip_vs_proto_udp.c | 4 +-
net/netfilter/ipvs/ip_vs_sync.c | 56 +++++++++++++++++---------------
6 files changed, 39 insertions(+), 38 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 045cbe0..acd795c 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -362,8 +362,6 @@ struct ip_vs_protocol {
u16 protocol;
u16 num_states;
int dont_defrag;
- atomic_t appcnt; /* counter of proto app incs */
- int *timeout_table; /* protocol timeout table */
void (*init)(struct ip_vs_protocol *pp);
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 48212de..ef35e5d 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -806,7 +806,7 @@ ip_vs_conn_new(const struct ip_vs_conn_param *p,
struct ip_vs_dest *dest, __u32 fwmark)
{
struct ip_vs_conn *cp;
- struct ip_vs_protocol *pp = ip_vs_proto_get(p->protocol);
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, p->protocol);
cp = kmem_cache_zalloc(ip_vs_conn_cachep, GFP_ATOMIC);
if (cp == NULL) {
@@ -865,8 +865,8 @@ ip_vs_conn_new(const struct ip_vs_conn_param *p,
#endif
ip_vs_bind_xmit(cp);
- if (unlikely(pp && atomic_read(&pp->appcnt)))
- ip_vs_bind_app(cp, pp);
+ if (unlikely(pd && atomic_read(&pd->appcnt)))
+ ip_vs_bind_app(cp, pd->pp);
/*
* Allow conntrack to be preserved. By default, conntrack
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index 88715bd..2460fc7 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -1044,7 +1044,7 @@ static int sctp_register_app(struct ip_vs_app *inc)
}
}
list_add(&inc->p_list, &ipvs->sctp_apps[hash]);
- atomic_inc(&pd->pp->appcnt);
+ atomic_inc(&pd->appcnt);
out:
spin_unlock_bh(&ipvs->sctp_app_lock);
@@ -1057,7 +1057,7 @@ static void sctp_unregister_app(struct ip_vs_app *inc)
struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_SCTP);
spin_lock_bh(&ipvs->sctp_app_lock);
- atomic_dec(&pd->pp->appcnt);
+ atomic_dec(&pd->appcnt);
list_del(&inc->p_list);
spin_unlock_bh(&ipvs->sctp_app_lock);
}
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index eec7563..98cfec0 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -602,7 +602,7 @@ static int tcp_register_app(struct ip_vs_app *inc)
}
}
list_add(&inc->p_list, &ipvs->tcp_apps[hash]);
- atomic_inc(&pd->pp->appcnt);
+ atomic_inc(&pd->appcnt);
out:
spin_unlock_bh(&ipvs->tcp_app_lock);
@@ -617,7 +617,7 @@ tcp_unregister_app(struct ip_vs_app *inc)
struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_TCP);
spin_lock_bh(&ipvs->tcp_app_lock);
- atomic_dec(&pd->pp->appcnt);
+ atomic_dec(&pd->appcnt);
list_del(&inc->p_list);
spin_unlock_bh(&ipvs->tcp_app_lock);
}
@@ -707,7 +707,6 @@ struct ip_vs_protocol ip_vs_protocol_tcp = {
.protocol = IPPROTO_TCP,
.num_states = IP_VS_TCP_S_LAST,
.dont_defrag = 0,
- .appcnt = ATOMIC_INIT(0),
.init = NULL,
.exit = NULL,
.init_netns = __ip_vs_tcp_init,
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index 585d140..da4add0 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -373,7 +373,7 @@ static int udp_register_app(struct ip_vs_app *inc)
}
}
list_add(&inc->p_list, &ipvs->udp_apps[hash]);
- atomic_inc(&pd->pp->appcnt);
+ atomic_inc(&pd->appcnt);
out:
spin_unlock_bh(&ipvs->udp_app_lock);
@@ -388,7 +388,7 @@ udp_unregister_app(struct ip_vs_app *inc)
struct netns_ipvs *ipvs = net_ipvs(&init_net);
spin_lock_bh(&ipvs->udp_app_lock);
- atomic_dec(&pd->pp->appcnt);
+ atomic_dec(&pd->appcnt);
list_del(&inc->p_list);
spin_unlock_bh(&ipvs->udp_app_lock);
}
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index df74b2c..8c8bb4c 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -725,17 +725,16 @@ ip_vs_conn_fill_param_sync(int af, union ip_vs_sync_conn *sc,
* Param: ...
* timeout is in sec.
*/
-static void ip_vs_proc_conn(struct ip_vs_conn_param *param, unsigned flags,
- unsigned state, unsigned protocol, unsigned type,
+static void ip_vs_proc_conn(struct net *net, struct ip_vs_conn_param *param,
+ unsigned int flags, unsigned int state,
+ unsigned int protocol, unsigned int type,
const union nf_inet_addr *daddr, __be16 dport,
unsigned long timeout, __u32 fwmark,
- struct ip_vs_sync_conn_options *opt,
- struct ip_vs_protocol *pp)
+ struct ip_vs_sync_conn_options *opt)
{
struct ip_vs_dest *dest;
struct ip_vs_conn *cp;
-
if (!(flags & IP_VS_CONN_F_TEMPLATE))
cp = ip_vs_conn_in_get(param);
else
@@ -821,17 +820,23 @@ static void ip_vs_proc_conn(struct ip_vs_conn_param *param, unsigned flags,
if (timeout > MAX_SCHEDULE_TIMEOUT / HZ)
timeout = MAX_SCHEDULE_TIMEOUT / HZ;
cp->timeout = timeout*HZ;
- } else if (!(flags & IP_VS_CONN_F_TEMPLATE) && pp->timeout_table)
- cp->timeout = pp->timeout_table[state];
- else
- cp->timeout = (3*60*HZ);
+ } else {
+ struct ip_vs_proto_data *pd;
+
+ pd = ip_vs_proto_data_get(net, protocol);
+ if (!(flags & IP_VS_CONN_F_TEMPLATE) && pd && pd->timeout_table)
+ cp->timeout = pd->timeout_table[state];
+ else
+ cp->timeout = (3*60*HZ);
+ }
ip_vs_conn_put(cp);
}
/*
* Process received multicast message for Version 0
*/
-static void ip_vs_process_message_v0(const char *buffer, const size_t buflen)
+static void ip_vs_process_message_v0(struct net *net, const char *buffer,
+ const size_t buflen)
{
struct ip_vs_sync_mesg_v0 *m = (struct ip_vs_sync_mesg_v0 *)buffer;
struct ip_vs_sync_conn_v0 *s;
@@ -843,7 +848,7 @@ static void ip_vs_process_message_v0(const char *buffer, const size_t buflen)
p = (char *)buffer + sizeof(struct ip_vs_sync_mesg_v0);
for (i=0; i<m->nr_conns; i++) {
- unsigned flags, state;
+ unsigned int flags, state;
if (p + SIMPLE_CONN_SIZE > buffer+buflen) {
IP_VS_ERR_RL("BACKUP v0, bogus conn\n");
@@ -879,7 +884,6 @@ static void ip_vs_process_message_v0(const char *buffer, const size_t buflen)
}
} else {
/* protocol in templates is not used for state/timeout */
- pp = NULL;
if (state > 0) {
IP_VS_DBG(2, "BACKUP v0, Invalid template state %u\n",
state);
@@ -894,9 +898,9 @@ static void ip_vs_process_message_v0(const char *buffer, const size_t buflen)
s->vport, ¶m);
/* Send timeout as Zero */
- ip_vs_proc_conn(¶m, flags, state, s->protocol, AF_INET,
+ ip_vs_proc_conn(net, ¶m, flags, state, s->protocol, AF_INET,
(union nf_inet_addr *)&s->daddr, s->dport,
- 0, 0, opt, pp);
+ 0, 0, opt);
}
}
@@ -945,7 +949,7 @@ static int ip_vs_proc_str(__u8 *p, unsigned int plen, unsigned int *data_len,
/*
* Process a Version 1 sync. connection
*/
-static inline int ip_vs_proc_sync_conn(__u8 *p, __u8 *msg_end)
+static inline int ip_vs_proc_sync_conn(struct net *net, __u8 *p, __u8 *msg_end)
{
struct ip_vs_sync_conn_options opt;
union ip_vs_sync_conn *s;
@@ -1043,7 +1047,6 @@ static inline int ip_vs_proc_sync_conn(__u8 *p, __u8 *msg_end)
}
} else {
/* protocol in templates is not used for state/timeout */
- pp = NULL;
if (state > 0) {
IP_VS_DBG(3, "BACKUP, Invalid template state %u\n",
state);
@@ -1058,18 +1061,18 @@ static inline int ip_vs_proc_sync_conn(__u8 *p, __u8 *msg_end)
}
/* If only IPv4, just silent skip IPv6 */
if (af == AF_INET)
- ip_vs_proc_conn(¶m, flags, state, s->v4.protocol, af,
+ ip_vs_proc_conn(net, ¶m, flags, state, s->v4.protocol, af,
(union nf_inet_addr *)&s->v4.daddr, s->v4.dport,
ntohl(s->v4.timeout), ntohl(s->v4.fwmark),
- (opt_flags & IPVS_OPT_F_SEQ_DATA ? &opt : NULL),
- pp);
+ (opt_flags & IPVS_OPT_F_SEQ_DATA ? &opt : NULL)
+ );
#ifdef CONFIG_IP_VS_IPV6
else
- ip_vs_proc_conn(¶m, flags, state, s->v6.protocol, af,
+ ip_vs_proc_conn(net, ¶m, flags, state, s->v6.protocol, af,
(union nf_inet_addr *)&s->v6.daddr, s->v6.dport,
ntohl(s->v6.timeout), ntohl(s->v6.fwmark),
- (opt_flags & IPVS_OPT_F_SEQ_DATA ? &opt : NULL),
- pp);
+ (opt_flags & IPVS_OPT_F_SEQ_DATA ? &opt : NULL)
+ );
#endif
return 0;
/* Error exit */
@@ -1083,7 +1086,8 @@ out:
* ip_vs_conn entries.
* Handles Version 0 & 1
*/
-static void ip_vs_process_message(__u8 *buffer, const size_t buflen)
+static void ip_vs_process_message(struct net *net, __u8 *buffer,
+ const size_t buflen)
{
struct ip_vs_sync_mesg *m2 = (struct ip_vs_sync_mesg *)buffer;
__u8 *p, *msg_end;
@@ -1136,7 +1140,7 @@ static void ip_vs_process_message(__u8 *buffer, const size_t buflen)
return;
}
/* Process a single sync_conn */
- if ((retc=ip_vs_proc_sync_conn(p, msg_end)) < 0) {
+ if ((retc=ip_vs_proc_sync_conn(net, p, msg_end)) < 0) {
IP_VS_ERR_RL("BACKUP, Dropping buffer, Err: %d in decoding\n",
retc);
return;
@@ -1146,7 +1150,7 @@ static void ip_vs_process_message(__u8 *buffer, const size_t buflen)
}
} else {
/* Old type of message */
- ip_vs_process_message_v0(buffer, buflen);
+ ip_vs_process_message_v0(net, buffer, buflen);
return;
}
}
@@ -1500,7 +1504,7 @@ static int sync_thread_backup(void *data)
/* disable bottom half, because it accesses the data
shared by softirq while getting/creating conns */
local_bh_disable();
- ip_vs_process_message(tinfo->buf, len);
+ ip_vs_process_message(&init_net, tinfo->buf, len);
local_bh_enable();
}
}
--
1.7.2.3
^ permalink raw reply related
* [*v2 PATCH 12/22] IPVS: netns awareness to ip_vs_app
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
All variables moved to struct ipvs,
most external changes fixed (i.e. init_net removed)
in ip_vs_protocol param struct net *net added to:
- register_app()
- unregister_app()
This affected almost all proto_xxx.c files
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/ip_vs.h | 12 +++---
include/net/netns/ip_vs.h | 5 ++
net/netfilter/ipvs/ip_vs_app.c | 72 +++++++++++++++++++--------------
net/netfilter/ipvs/ip_vs_ftp.c | 8 ++--
net/netfilter/ipvs/ip_vs_proto_sctp.c | 12 +++---
net/netfilter/ipvs/ip_vs_proto_tcp.c | 12 +++---
net/netfilter/ipvs/ip_vs_proto_udp.c | 12 +++---
7 files changed, 75 insertions(+), 58 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index acd795c..5988c5d 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -406,9 +406,9 @@ struct ip_vs_protocol {
const struct sk_buff *skb,
struct ip_vs_proto_data *pd);
- int (*register_app)(struct ip_vs_app *inc);
+ int (*register_app)(struct net *net, struct ip_vs_app *inc);
- void (*unregister_app)(struct ip_vs_app *inc);
+ void (*unregister_app)(struct net *net, struct ip_vs_app *inc);
int (*app_conn_bind)(struct ip_vs_conn *cp);
@@ -879,12 +879,12 @@ ip_vs_control_add(struct ip_vs_conn *cp, struct ip_vs_conn *ctl_cp)
* (from ip_vs_app.c)
*/
#define IP_VS_APP_MAX_PORTS 8
-extern int register_ip_vs_app(struct ip_vs_app *app);
-extern void unregister_ip_vs_app(struct ip_vs_app *app);
+extern int register_ip_vs_app(struct net *net, struct ip_vs_app *app);
+extern void unregister_ip_vs_app(struct net *net, struct ip_vs_app *app);
extern int ip_vs_bind_app(struct ip_vs_conn *cp, struct ip_vs_protocol *pp);
extern void ip_vs_unbind_app(struct ip_vs_conn *cp);
-extern int
-register_ip_vs_app_inc(struct ip_vs_app *app, __u16 proto, __u16 port);
+extern int register_ip_vs_app_inc(struct net *net, struct ip_vs_app *app,
+ __u16 proto, __u16 port);
extern int ip_vs_app_inc_get(struct ip_vs_app *inc);
extern void ip_vs_app_inc_put(struct ip_vs_app *inc);
diff --git a/include/net/netns/ip_vs.h b/include/net/netns/ip_vs.h
index fcb3c7c..f077bd3 100644
--- a/include/net/netns/ip_vs.h
+++ b/include/net/netns/ip_vs.h
@@ -29,6 +29,11 @@ struct netns_ipvs {
#define IP_VS_RTAB_MASK (IP_VS_RTAB_SIZE - 1)
struct list_head rs_table[IP_VS_RTAB_SIZE];
+ /* ip_vs_app */
+ struct list_head app_list;
+ struct mutex app_mutex;
+ struct lock_class_key app_key; /* mutex debuging */
+
/* ip_vs_proto */
#define IP_VS_PROTO_TAB_SIZE 32 /* must be power of 2 */
struct ip_vs_proto_data *proto_data_table[IP_VS_PROTO_TAB_SIZE];
diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index 6d10352..af2cd83 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -43,11 +43,6 @@ EXPORT_SYMBOL(register_ip_vs_app);
EXPORT_SYMBOL(unregister_ip_vs_app);
EXPORT_SYMBOL(register_ip_vs_app_inc);
-/* ipvs application list head */
-static LIST_HEAD(ip_vs_app_list);
-static DEFINE_MUTEX(__ip_vs_app_mutex);
-
-
/*
* Get an ip_vs_app object
*/
@@ -67,7 +62,8 @@ static inline void ip_vs_app_put(struct ip_vs_app *app)
* Allocate/initialize app incarnation and register it in proto apps.
*/
static int
-ip_vs_app_inc_new(struct ip_vs_app *app, __u16 proto, __u16 port)
+ip_vs_app_inc_new(struct net *net, struct ip_vs_app *app, __u16 proto,
+ __u16 port)
{
struct ip_vs_protocol *pp;
struct ip_vs_app *inc;
@@ -98,7 +94,7 @@ ip_vs_app_inc_new(struct ip_vs_app *app, __u16 proto, __u16 port)
}
}
- ret = pp->register_app(inc);
+ ret = pp->register_app(net, inc);
if (ret)
goto out;
@@ -119,7 +115,7 @@ ip_vs_app_inc_new(struct ip_vs_app *app, __u16 proto, __u16 port)
* Release app incarnation
*/
static void
-ip_vs_app_inc_release(struct ip_vs_app *inc)
+ip_vs_app_inc_release(struct net *net, struct ip_vs_app *inc)
{
struct ip_vs_protocol *pp;
@@ -127,7 +123,7 @@ ip_vs_app_inc_release(struct ip_vs_app *inc)
return;
if (pp->unregister_app)
- pp->unregister_app(inc);
+ pp->unregister_app(net, inc);
IP_VS_DBG(9, "%s App %s:%u unregistered\n",
pp->name, inc->name, ntohs(inc->port));
@@ -168,15 +164,16 @@ void ip_vs_app_inc_put(struct ip_vs_app *inc)
* Register an application incarnation in protocol applications
*/
int
-register_ip_vs_app_inc(struct ip_vs_app *app, __u16 proto, __u16 port)
+register_ip_vs_app_inc(struct net *net, struct ip_vs_app *app, __u16 proto, __u16 port)
{
+ struct netns_ipvs *ipvs = net_ipvs(net);
int result;
- mutex_lock(&__ip_vs_app_mutex);
+ mutex_lock(&ipvs->app_mutex);
- result = ip_vs_app_inc_new(app, proto, port);
+ result = ip_vs_app_inc_new(net, app, proto, port);
- mutex_unlock(&__ip_vs_app_mutex);
+ mutex_unlock(&ipvs->app_mutex);
return result;
}
@@ -185,16 +182,17 @@ register_ip_vs_app_inc(struct ip_vs_app *app, __u16 proto, __u16 port)
/*
* ip_vs_app registration routine
*/
-int register_ip_vs_app(struct ip_vs_app *app)
+int register_ip_vs_app(struct net *net, struct ip_vs_app *app)
{
+ struct netns_ipvs *ipvs = net_ipvs(net);
/* increase the module use count */
ip_vs_use_count_inc();
- mutex_lock(&__ip_vs_app_mutex);
+ mutex_lock(&ipvs->app_mutex);
- list_add(&app->a_list, &ip_vs_app_list);
+ list_add(&app->a_list, &ipvs->app_list);
- mutex_unlock(&__ip_vs_app_mutex);
+ mutex_unlock(&ipvs->app_mutex);
return 0;
}
@@ -204,19 +202,20 @@ int register_ip_vs_app(struct ip_vs_app *app)
* ip_vs_app unregistration routine
* We are sure there are no app incarnations attached to services
*/
-void unregister_ip_vs_app(struct ip_vs_app *app)
+void unregister_ip_vs_app(struct net *net, struct ip_vs_app *app)
{
+ struct netns_ipvs *ipvs = net_ipvs(net);
struct ip_vs_app *inc, *nxt;
- mutex_lock(&__ip_vs_app_mutex);
+ mutex_lock(&ipvs->app_mutex);
list_for_each_entry_safe(inc, nxt, &app->incs_list, a_list) {
- ip_vs_app_inc_release(inc);
+ ip_vs_app_inc_release(net, inc);
}
list_del(&app->a_list);
- mutex_unlock(&__ip_vs_app_mutex);
+ mutex_unlock(&ipvs->app_mutex);
/* decrease the module use count */
ip_vs_use_count_dec();
@@ -226,7 +225,8 @@ void unregister_ip_vs_app(struct ip_vs_app *app)
/*
* Bind ip_vs_conn to its ip_vs_app (called by cp constructor)
*/
-int ip_vs_bind_app(struct ip_vs_conn *cp, struct ip_vs_protocol *pp)
+int ip_vs_bind_app(struct ip_vs_conn *cp,
+ struct ip_vs_protocol *pp)
{
return pp->app_conn_bind(cp);
}
@@ -481,11 +481,11 @@ int ip_vs_app_pkt_in(struct ip_vs_conn *cp, struct sk_buff *skb)
* /proc/net/ip_vs_app entry function
*/
-static struct ip_vs_app *ip_vs_app_idx(loff_t pos)
+static struct ip_vs_app *ip_vs_app_idx(struct netns_ipvs *ipvs, loff_t pos)
{
struct ip_vs_app *app, *inc;
- list_for_each_entry(app, &ip_vs_app_list, a_list) {
+ list_for_each_entry(app, &ipvs->app_list, a_list) {
list_for_each_entry(inc, &app->incs_list, a_list) {
if (pos-- == 0)
return inc;
@@ -497,19 +497,24 @@ static struct ip_vs_app *ip_vs_app_idx(loff_t pos)
static void *ip_vs_app_seq_start(struct seq_file *seq, loff_t *pos)
{
- mutex_lock(&__ip_vs_app_mutex);
+ struct net *net = seq_file_net(seq);
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
+ mutex_lock(&ipvs->app_mutex);
- return *pos ? ip_vs_app_idx(*pos - 1) : SEQ_START_TOKEN;
+ return *pos ? ip_vs_app_idx(ipvs, *pos - 1) : SEQ_START_TOKEN;
}
static void *ip_vs_app_seq_next(struct seq_file *seq, void *v, loff_t *pos)
{
struct ip_vs_app *inc, *app;
struct list_head *e;
+ struct net *net = seq_file_net(seq);
+ struct netns_ipvs *ipvs = net_ipvs(net);
++*pos;
if (v == SEQ_START_TOKEN)
- return ip_vs_app_idx(0);
+ return ip_vs_app_idx(ipvs, 0);
inc = v;
app = inc->app;
@@ -518,7 +523,7 @@ static void *ip_vs_app_seq_next(struct seq_file *seq, void *v, loff_t *pos)
return list_entry(e, struct ip_vs_app, a_list);
/* go on to next application */
- for (e = app->a_list.next; e != &ip_vs_app_list; e = e->next) {
+ for (e = app->a_list.next; e != &ipvs->app_list; e = e->next) {
app = list_entry(e, struct ip_vs_app, a_list);
list_for_each_entry(inc, &app->incs_list, a_list) {
return inc;
@@ -529,7 +534,9 @@ static void *ip_vs_app_seq_next(struct seq_file *seq, void *v, loff_t *pos)
static void ip_vs_app_seq_stop(struct seq_file *seq, void *v)
{
- mutex_unlock(&__ip_vs_app_mutex);
+ struct netns_ipvs *ipvs = net_ipvs(seq_file_net(seq));
+
+ mutex_unlock(&ipvs->app_mutex);
}
static int ip_vs_app_seq_show(struct seq_file *seq, void *v)
@@ -557,7 +564,8 @@ static const struct seq_operations ip_vs_app_seq_ops = {
static int ip_vs_app_open(struct inode *inode, struct file *file)
{
- return seq_open(file, &ip_vs_app_seq_ops);
+ return seq_open_net(inode, file, &ip_vs_app_seq_ops,
+ sizeof(struct seq_net_private));
}
static const struct file_operations ip_vs_app_fops = {
@@ -571,9 +579,13 @@ static const struct file_operations ip_vs_app_fops = {
static int __net_init __ip_vs_app_init(struct net *net)
{
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
if (!net_eq(net, &init_net)) /* netns not enabled yet */
return -EPERM;
+ INIT_LIST_HEAD(&ipvs->app_list);
+ __mutex_init(&ipvs->app_mutex,"ipvs->app_mutex", &ipvs->app_key);
proc_net_fops_create(net, "ip_vs_app", 0, &ip_vs_app_fops);
return 0;
}
diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index fdcb74b..b39befd 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -413,14 +413,14 @@ static int __net_init __ip_vs_ftp_init(struct net *net)
if (!net_eq(net, &init_net)) /* netns not enabled yet */
return -EPERM;
- ret = register_ip_vs_app(app);
+ ret = register_ip_vs_app(net, app);
if (ret)
return ret;
for (i=0; i<IP_VS_APP_MAX_PORTS; i++) {
if (!ports[i])
continue;
- ret = register_ip_vs_app_inc(app, app->protocol, ports[i]);
+ ret = register_ip_vs_app_inc(net, app, app->protocol, ports[i]);
if (ret)
break;
pr_info("%s: loaded support on port[%d] = %d\n",
@@ -428,7 +428,7 @@ static int __net_init __ip_vs_ftp_init(struct net *net)
}
if (ret)
- unregister_ip_vs_app(app);
+ unregister_ip_vs_app(net, app);
return ret;
}
@@ -442,7 +442,7 @@ static void __ip_vs_ftp_exit(struct net *net)
if (!net_eq(net, &init_net)) /* netns not enabled yet */
return;
- unregister_ip_vs_app(app);
+ unregister_ip_vs_app(net, app);
}
static struct pernet_operations ip_vs_ftp_ops = {
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index 2460fc7..3c6d06d 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -1025,14 +1025,14 @@ static inline __u16 sctp_app_hashkey(__be16 port)
& SCTP_APP_TAB_MASK;
}
-static int sctp_register_app(struct ip_vs_app *inc)
+static int sctp_register_app(struct net *net, struct ip_vs_app *inc)
{
struct ip_vs_app *i;
__u16 hash;
__be16 port = inc->port;
int ret = 0;
- struct netns_ipvs *ipvs = net_ipvs(&init_net);
- struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_SCTP);
+ struct netns_ipvs *ipvs = net_ipvs(net);
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(net, IPPROTO_SCTP);
hash = sctp_app_hashkey(port);
@@ -1051,10 +1051,10 @@ out:
return ret;
}
-static void sctp_unregister_app(struct ip_vs_app *inc)
+static void sctp_unregister_app(struct net *net, struct ip_vs_app *inc)
{
- struct netns_ipvs *ipvs = net_ipvs(&init_net);
- struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_SCTP);
+ struct netns_ipvs *ipvs = net_ipvs(net);
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(net, IPPROTO_SCTP);
spin_lock_bh(&ipvs->sctp_app_lock);
atomic_dec(&pd->appcnt);
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index 98cfec0..326768d 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -583,14 +583,14 @@ static inline __u16 tcp_app_hashkey(__be16 port)
}
-static int tcp_register_app(struct ip_vs_app *inc)
+static int tcp_register_app(struct net *net, struct ip_vs_app *inc)
{
struct ip_vs_app *i;
__u16 hash;
__be16 port = inc->port;
int ret = 0;
- struct netns_ipvs *ipvs = net_ipvs(&init_net);
- struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_TCP);
+ struct netns_ipvs *ipvs = net_ipvs(net);
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(net, IPPROTO_TCP);
hash = tcp_app_hashkey(port);
@@ -611,10 +611,10 @@ static int tcp_register_app(struct ip_vs_app *inc)
static void
-tcp_unregister_app(struct ip_vs_app *inc)
+tcp_unregister_app(struct net *net, struct ip_vs_app *inc)
{
- struct netns_ipvs *ipvs = net_ipvs(&init_net);
- struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_TCP);
+ struct netns_ipvs *ipvs = net_ipvs(net);
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(net, IPPROTO_TCP);
spin_lock_bh(&ipvs->tcp_app_lock);
atomic_dec(&pd->appcnt);
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index da4add0..7e894d5 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -353,14 +353,14 @@ static inline __u16 udp_app_hashkey(__be16 port)
}
-static int udp_register_app(struct ip_vs_app *inc)
+static int udp_register_app(struct net *net, struct ip_vs_app *inc)
{
struct ip_vs_app *i;
__u16 hash;
__be16 port = inc->port;
int ret = 0;
- struct netns_ipvs *ipvs = net_ipvs(&init_net);
- struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_UDP);
+ struct netns_ipvs *ipvs = net_ipvs(net);
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(net, IPPROTO_UDP);
hash = udp_app_hashkey(port);
@@ -382,10 +382,10 @@ static int udp_register_app(struct ip_vs_app *inc)
static void
-udp_unregister_app(struct ip_vs_app *inc)
+udp_unregister_app(struct net *net, struct ip_vs_app *inc)
{
- struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, IPPROTO_UDP);
- struct netns_ipvs *ipvs = net_ipvs(&init_net);
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(net, IPPROTO_UDP);
+ struct netns_ipvs *ipvs = net_ipvs(net);
spin_lock_bh(&ipvs->udp_app_lock);
atomic_dec(&pd->appcnt);
--
1.7.2.3
^ permalink raw reply related
* [*v2 PATCH 13/22] IPVS: netns awareness to ip_vs_est
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
All variables moved to struct ipvs,
most external changes fixed (i.e. init_net removed)
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/ip_vs.h | 4 +-
include/net/netns/ip_vs.h | 4 ++
net/netfilter/ipvs/ip_vs_ctl.c | 20 ++++----
net/netfilter/ipvs/ip_vs_est.c | 114 ++++++++++++++++++++++------------------
4 files changed, 79 insertions(+), 63 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 5988c5d..34f8d2f 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1012,8 +1012,8 @@ extern void ip_vs_sync_cleanup(void);
*/
extern int ip_vs_estimator_init(void);
extern void ip_vs_estimator_cleanup(void);
-extern void ip_vs_new_estimator(struct ip_vs_stats *stats);
-extern void ip_vs_kill_estimator(struct ip_vs_stats *stats);
+extern void ip_vs_new_estimator(struct net *net, struct ip_vs_stats *stats);
+extern void ip_vs_kill_estimator(struct net *net, struct ip_vs_stats *stats);
extern void ip_vs_zero_estimator(struct ip_vs_stats *stats);
/*
diff --git a/include/net/netns/ip_vs.h b/include/net/netns/ip_vs.h
index f077bd3..6486501 100644
--- a/include/net/netns/ip_vs.h
+++ b/include/net/netns/ip_vs.h
@@ -70,6 +70,10 @@ struct netns_ipvs {
int sysctl_lblcr_expiration;
struct ctl_table_header *lblcr_ctl_header;
struct ctl_table *lblcr_ctl_table;
+ /* ip_vs_est */
+ struct list_head est_list; /* estimator list */
+ spinlock_t est_lock;
+
};
#endif /* IP_VS_H_ */
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 2c9cd5c..36458e8 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -815,7 +815,7 @@ __ip_vs_update_dest(struct ip_vs_service *svc, struct ip_vs_dest *dest,
spin_unlock(&dest->dst_lock);
if (add)
- ip_vs_new_estimator(&dest->stats);
+ ip_vs_new_estimator(svc->net, &dest->stats);
write_lock_bh(&__ip_vs_svc_lock);
@@ -1008,9 +1008,9 @@ ip_vs_edit_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest)
/*
* Delete a destination (must be already unlinked from the service)
*/
-static void __ip_vs_del_dest(struct ip_vs_dest *dest)
+static void __ip_vs_del_dest(struct net *net, struct ip_vs_dest *dest)
{
- ip_vs_kill_estimator(&dest->stats);
+ ip_vs_kill_estimator(net, &dest->stats);
/*
* Remove it from the d-linked list with the real services.
@@ -1079,6 +1079,7 @@ static int
ip_vs_del_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest)
{
struct ip_vs_dest *dest;
+ struct net *net = svc->net;
__be16 dport = udest->port;
EnterFunction(2);
@@ -1107,7 +1108,7 @@ ip_vs_del_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest)
/*
* Delete the destination
*/
- __ip_vs_del_dest(dest);
+ __ip_vs_del_dest(net, dest);
LeaveFunction(2);
@@ -1196,7 +1197,7 @@ ip_vs_add_service(struct net *net, struct ip_vs_service_user_kern *u,
else if (svc->port == 0)
atomic_inc(&ip_vs_nullsvc_counter);
- ip_vs_new_estimator(&svc->stats);
+ ip_vs_new_estimator(net, &svc->stats);
/* Count only IPv4 services for old get/setsockopt interface */
if (svc->af == AF_INET)
@@ -1344,7 +1345,7 @@ static void __ip_vs_del_service(struct ip_vs_service *svc)
if (svc->af == AF_INET)
ip_vs_num_services--;
- ip_vs_kill_estimator(&svc->stats);
+ ip_vs_kill_estimator(svc->net, &svc->stats);
/* Unbind scheduler */
old_sched = svc->scheduler;
@@ -1367,7 +1368,7 @@ static void __ip_vs_del_service(struct ip_vs_service *svc)
*/
list_for_each_entry_safe(dest, nxt, &svc->destinations, n_list) {
__ip_vs_unlink_dest(svc, dest, 0);
- __ip_vs_del_dest(dest);
+ __ip_vs_del_dest(svc->net, dest);
}
/*
@@ -3457,7 +3458,7 @@ int __net_init __ip_vs_control_init(struct net *net)
sysctl_header = register_net_sysctl_table(net, net_vs_ctl_path, vs_vars);
if (sysctl_header == NULL)
goto err_reg;
- ip_vs_new_estimator(&ip_vs_stats);
+ ip_vs_new_estimator(net, &ip_vs_stats);
return 0;
err_reg:
@@ -3469,7 +3470,7 @@ static void __net_exit __ip_vs_control_cleanup(struct net *net)
if (!net_eq(net, &init_net)) /* netns not enabled yet */
return;
- ip_vs_kill_estimator(&ip_vs_stats);
+ ip_vs_kill_estimator(net, &ip_vs_stats);
unregister_net_sysctl_table(sysctl_header);
proc_net_remove(net, "ip_vs_stats");
proc_net_remove(net, "ip_vs");
@@ -3533,7 +3534,6 @@ void ip_vs_control_cleanup(void)
ip_vs_trash_cleanup();
cancel_rearming_delayed_work(&defense_work);
cancel_work_sync(&defense_work.work);
- ip_vs_kill_estimator(&ip_vs_stats);
unregister_pernet_subsys(&ipvs_control_ops);
ip_vs_genl_unregister();
nf_unregister_sockopt(&ip_vs_sockopts);
diff --git a/net/netfilter/ipvs/ip_vs_est.c b/net/netfilter/ipvs/ip_vs_est.c
index 7417a0c..6bcabed 100644
--- a/net/netfilter/ipvs/ip_vs_est.c
+++ b/net/netfilter/ipvs/ip_vs_est.c
@@ -8,8 +8,12 @@
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*
- * Changes:
- *
+ * Changes: Hans Schillstrom <hans.schillstrom@ericsson.com>
+ * Network name space (netns) aware.
+ * Global data moved to netns i.e struct netns_ipvs
+ * Affected data: est_list and est_lock.
+ * estimation_timer() runs with a common timer, but
+ * do update every netns on timeout.
*/
#define KMSG_COMPONENT "IPVS"
@@ -49,9 +53,6 @@
static void estimation_timer(unsigned long arg);
-
-static LIST_HEAD(est_list);
-static DEFINE_SPINLOCK(est_lock);
static DEFINE_TIMER(est_timer, estimation_timer, 0, 0);
static void estimation_timer(unsigned long arg)
@@ -62,51 +63,57 @@ static void estimation_timer(unsigned long arg)
u32 n_inpkts, n_outpkts;
u64 n_inbytes, n_outbytes;
u32 rate;
-
- spin_lock(&est_lock);
- list_for_each_entry(e, &est_list, list) {
- s = container_of(e, struct ip_vs_stats, est);
-
- spin_lock(&s->lock);
- n_conns = s->ustats.conns;
- n_inpkts = s->ustats.inpkts;
- n_outpkts = s->ustats.outpkts;
- n_inbytes = s->ustats.inbytes;
- n_outbytes = s->ustats.outbytes;
-
- /* scaled by 2^10, but divided 2 seconds */
- rate = (n_conns - e->last_conns)<<9;
- e->last_conns = n_conns;
- e->cps += ((long)rate - (long)e->cps)>>2;
- s->ustats.cps = (e->cps+0x1FF)>>10;
-
- rate = (n_inpkts - e->last_inpkts)<<9;
- e->last_inpkts = n_inpkts;
- e->inpps += ((long)rate - (long)e->inpps)>>2;
- s->ustats.inpps = (e->inpps+0x1FF)>>10;
-
- rate = (n_outpkts - e->last_outpkts)<<9;
- e->last_outpkts = n_outpkts;
- e->outpps += ((long)rate - (long)e->outpps)>>2;
- s->ustats.outpps = (e->outpps+0x1FF)>>10;
-
- rate = (n_inbytes - e->last_inbytes)<<4;
- e->last_inbytes = n_inbytes;
- e->inbps += ((long)rate - (long)e->inbps)>>2;
- s->ustats.inbps = (e->inbps+0xF)>>5;
-
- rate = (n_outbytes - e->last_outbytes)<<4;
- e->last_outbytes = n_outbytes;
- e->outbps += ((long)rate - (long)e->outbps)>>2;
- s->ustats.outbps = (e->outbps+0xF)>>5;
- spin_unlock(&s->lock);
+ struct net *net;
+ struct netns_ipvs *ipvs;
+
+ for_each_net(net) {
+ ipvs = net_ipvs(net);
+ spin_lock(&ipvs->est_lock);
+ list_for_each_entry(e, &ipvs->est_list, list) {
+ s = container_of(e, struct ip_vs_stats, est);
+
+ spin_lock(&s->lock);
+ n_conns = s->ustats.conns;
+ n_inpkts = s->ustats.inpkts;
+ n_outpkts = s->ustats.outpkts;
+ n_inbytes = s->ustats.inbytes;
+ n_outbytes = s->ustats.outbytes;
+
+ /* scaled by 2^10, but divided 2 seconds */
+ rate = (n_conns - e->last_conns) << 9;
+ e->last_conns = n_conns;
+ e->cps += ((long)rate - (long)e->cps) >> 2;
+ s->ustats.cps = (e->cps + 0x1FF) >> 10;
+
+ rate = (n_inpkts - e->last_inpkts) << 9;
+ e->last_inpkts = n_inpkts;
+ e->inpps += ((long)rate - (long)e->inpps) >> 2;
+ s->ustats.inpps = (e->inpps + 0x1FF) >> 10;
+
+ rate = (n_outpkts - e->last_outpkts) << 9;
+ e->last_outpkts = n_outpkts;
+ e->outpps += ((long)rate - (long)e->outpps) >> 2;
+ s->ustats.outpps = (e->outpps + 0x1FF) >> 10;
+
+ rate = (n_inbytes - e->last_inbytes) << 4;
+ e->last_inbytes = n_inbytes;
+ e->inbps += ((long)rate - (long)e->inbps) >> 2;
+ s->ustats.inbps = (e->inbps + 0xF) >> 5;
+
+ rate = (n_outbytes - e->last_outbytes) << 4;
+ e->last_outbytes = n_outbytes;
+ e->outbps += ((long)rate - (long)e->outbps) >> 2;
+ s->ustats.outbps = (e->outbps + 0xF) >> 5;
+ spin_unlock(&s->lock);
+ }
+ spin_unlock(&ipvs->est_lock);
}
- spin_unlock(&est_lock);
mod_timer(&est_timer, jiffies + 2*HZ);
}
-void ip_vs_new_estimator(struct ip_vs_stats *stats)
+void ip_vs_new_estimator(struct net *net, struct ip_vs_stats *stats)
{
+ struct netns_ipvs *ipvs = net_ipvs(net);
struct ip_vs_estimator *est = &stats->est;
INIT_LIST_HEAD(&est->list);
@@ -126,18 +133,19 @@ void ip_vs_new_estimator(struct ip_vs_stats *stats)
est->last_outbytes = stats->ustats.outbytes;
est->outbps = stats->ustats.outbps<<5;
- spin_lock_bh(&est_lock);
- list_add(&est->list, &est_list);
- spin_unlock_bh(&est_lock);
+ spin_lock_bh(&ipvs->est_lock);
+ list_add(&est->list, &ipvs->est_list);
+ spin_unlock_bh(&ipvs->est_lock);
}
-void ip_vs_kill_estimator(struct ip_vs_stats *stats)
+void ip_vs_kill_estimator(struct net *net, struct ip_vs_stats *stats)
{
+ struct netns_ipvs *ipvs = net_ipvs(net);
struct ip_vs_estimator *est = &stats->est;
- spin_lock_bh(&est_lock);
+ spin_lock_bh(&ipvs->est_lock);
list_del(&est->list);
- spin_unlock_bh(&est_lock);
+ spin_unlock_bh(&ipvs->est_lock);
}
void ip_vs_zero_estimator(struct ip_vs_stats *stats)
@@ -159,9 +167,13 @@ void ip_vs_zero_estimator(struct ip_vs_stats *stats)
static int __net_init __ip_vs_estimator_init(struct net *net)
{
+ struct netns_ipvs *ipvs = net_ipvs(net);
+
if (!net_eq(net, &init_net)) /* netns not enabled yet */
return -EPERM;
+ INIT_LIST_HEAD(&ipvs->est_list);
+ spin_lock_init(&ipvs->est_lock);
return 0;
}
--
1.7.2.3
^ permalink raw reply related
* [*v2 PATCH 16/22] IPVS: netns, connection hash got net as param.
From: Hans Schillstrom @ 2010-12-13 13:38 UTC (permalink / raw)
To: horms, ja, daniel.lezcano, wensong, lvs-devel, netdev,
netfilter-devel
Cc: hans, Hans Schillstrom
In-Reply-To: <1292247510-753-1-git-send-email-hans.schillstrom@ericsson.com>
Connection hash table is now name space aware.
i.e. net ptr >> 8 is xor:ed to the hash,
and this is the first param to be compared.
The net struct is 0xa40 in size ( a little bit smaller for 32 bit arch:s)
and cache-line aligned, so a ptr >> 5 might be a more clever solution ?
All lookups where net is compared uses net_eq() which returns 1 when netns
is disabled, and the compiler seems to do something clever in that case.
ip_vs_conn_fill_param() have *net as first param now.
Three new inlines added to keep conn struct smaller
when names space is disabled.
- ip_vs_conn_net()
- ip_vs_conn_net_set()
- ip_vs_conn_net_eq()
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
include/net/ip_vs.h | 54 ++++++++---
include/net/netns/ip_vs.h | 2 +
net/netfilter/ipvs/ip_vs_conn.c | 165 ++++++++++++++++++-------------
net/netfilter/ipvs/ip_vs_core.c | 15 ++-
net/netfilter/ipvs/ip_vs_ftp.c | 14 ++-
net/netfilter/ipvs/ip_vs_nfct.c | 6 +-
net/netfilter/ipvs/ip_vs_proto_ah_esp.c | 15 ++-
net/netfilter/ipvs/ip_vs_proto_sctp.c | 2 +-
net/netfilter/ipvs/ip_vs_proto_tcp.c | 2 +-
net/netfilter/ipvs/ip_vs_proto_udp.c | 2 +-
net/netfilter/ipvs/ip_vs_sync.c | 13 +--
11 files changed, 177 insertions(+), 113 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 848fcda..446e417 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -105,7 +105,6 @@ static inline struct net *seq_file_single_net(struct seq_file *seq)
#endif
}
-
/* Connections' size value needed by ip_vs_ctl.c */
extern int ip_vs_conn_tab_size;
@@ -460,6 +459,7 @@ extern struct ip_vs_proto_data * ip_vs_proto_data_get(struct net *net,
unsigned short proto);
struct ip_vs_conn_param {
+ struct net * net;
const union nf_inet_addr *caddr;
const union nf_inet_addr *vaddr;
__be16 cport;
@@ -477,17 +477,19 @@ struct ip_vs_conn_param {
*/
struct ip_vs_conn {
struct list_head c_list; /* hashed list heads */
-
+#ifdef CONFIG_NET_NS
+ struct net *net; /* Name space */
+#endif
/* Protocol, addresses and port numbers */
- u16 af; /* address family */
- union nf_inet_addr caddr; /* client address */
- union nf_inet_addr vaddr; /* virtual address */
- union nf_inet_addr daddr; /* destination address */
- volatile __u32 flags; /* status flags */
- __u32 fwmark; /* Fire wall mark from skb */
- __be16 cport;
- __be16 vport;
- __be16 dport;
+ u16 af; /* address family */
+ __be16 cport;
+ __be16 vport;
+ __be16 dport;
+ __u32 fwmark; /* Fire wall mark from skb */
+ union nf_inet_addr caddr; /* client address */
+ union nf_inet_addr vaddr; /* virtual address */
+ union nf_inet_addr daddr; /* destination address */
+ volatile __u32 flags; /* status flags */
__u16 protocol; /* Which protocol (TCP/UDP) */
/* counter and timer */
@@ -530,6 +532,33 @@ struct ip_vs_conn {
__u8 pe_data_len;
};
+/*
+ * To save some memory in conn table when name space is disabled.
+ */
+static inline struct net *ip_vs_conn_net(const struct ip_vs_conn *cp)
+{
+#ifdef CONFIG_NET_NS
+ return cp->net;
+#else
+ return &init_net;
+#endif
+}
+static inline void ip_vs_conn_net_set(struct ip_vs_conn *cp, struct net *net)
+{
+#ifdef CONFIG_NET_NS
+ cp->net = net;
+#endif
+}
+
+static inline int ip_vs_conn_net_eq(const struct ip_vs_conn *cp,
+ struct net *net)
+{
+#ifdef CONFIG_NET_NS
+ return cp->net == net;
+#else
+ return 1;
+#endif
+}
/*
* Extended internal versions of struct ip_vs_service_user and
@@ -779,13 +808,14 @@ enum {
IP_VS_DIR_LAST,
};
-static inline void ip_vs_conn_fill_param(int af, int protocol,
+static inline void ip_vs_conn_fill_param(struct net *net, int af, int protocol,
const union nf_inet_addr *caddr,
__be16 cport,
const union nf_inet_addr *vaddr,
__be16 vport,
struct ip_vs_conn_param *p)
{
+ p->net = net;
p->af = af;
p->protocol = protocol;
p->caddr = caddr;
diff --git a/include/net/netns/ip_vs.h b/include/net/netns/ip_vs.h
index b6642f0..f2d5bcd 100644
--- a/include/net/netns/ip_vs.h
+++ b/include/net/netns/ip_vs.h
@@ -66,6 +66,8 @@ struct netns_ipvs {
struct ip_vs_stats *ctl_stats; /* Statistics & estimator */
struct ip_vs_stats_user __percpu *ustats; /* Statistics */
+ /* ip_vs_conn */
+ atomic_t conn_count; /* connection counter */
/* ip_vs_lblc */
int sysctl_lblc_expiration;
struct ctl_table_header *lblc_ctl_header;
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index ef35e5d..41f93cc 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -64,9 +64,6 @@ static struct list_head *ip_vs_conn_tab __read_mostly;
/* SLAB cache for IPVS connections */
static struct kmem_cache *ip_vs_conn_cachep __read_mostly;
-/* counter for current IPVS connections */
-static atomic_t ip_vs_conn_count = ATOMIC_INIT(0);
-
/* counter for no client port connections */
static atomic_t ip_vs_conn_no_cport_cnt = ATOMIC_INIT(0);
@@ -76,7 +73,7 @@ static unsigned int ip_vs_conn_rnd __read_mostly;
/*
* Fine locking granularity for big connection hash table
*/
-#define CT_LOCKARRAY_BITS 4
+#define CT_LOCKARRAY_BITS 5
#define CT_LOCKARRAY_SIZE (1<<CT_LOCKARRAY_BITS)
#define CT_LOCKARRAY_MASK (CT_LOCKARRAY_SIZE-1)
@@ -133,19 +130,19 @@ static inline void ct_write_unlock_bh(unsigned key)
/*
* Returns hash value for IPVS connection entry
*/
-static unsigned int ip_vs_conn_hashkey(int af, unsigned proto,
+static unsigned int ip_vs_conn_hashkey(struct net *net, int af, unsigned proto,
const union nf_inet_addr *addr,
__be16 port)
{
#ifdef CONFIG_IP_VS_IPV6
if (af == AF_INET6)
- return jhash_3words(jhash(addr, 16, ip_vs_conn_rnd),
- (__force u32)port, proto, ip_vs_conn_rnd)
- & ip_vs_conn_tab_mask;
+ return (jhash_3words(jhash(addr, 16, ip_vs_conn_rnd),
+ (__force u32)port, proto, ip_vs_conn_rnd) ^
+ ((size_t)net>>8)) & ip_vs_conn_tab_mask;
#endif
- return jhash_3words((__force u32)addr->ip, (__force u32)port, proto,
- ip_vs_conn_rnd)
- & ip_vs_conn_tab_mask;
+ return (jhash_3words((__force u32)addr->ip, (__force u32)port, proto,
+ ip_vs_conn_rnd) ^
+ ((size_t)net>>8)) & ip_vs_conn_tab_mask;
}
static unsigned int ip_vs_conn_hashkey_param(const struct ip_vs_conn_param *p,
@@ -166,15 +163,15 @@ static unsigned int ip_vs_conn_hashkey_param(const struct ip_vs_conn_param *p,
port = p->vport;
}
- return ip_vs_conn_hashkey(p->af, p->protocol, addr, port);
+ return ip_vs_conn_hashkey(p->net, p->af, p->protocol, addr, port);
}
static unsigned int ip_vs_conn_hashkey_conn(const struct ip_vs_conn *cp)
{
struct ip_vs_conn_param p;
- ip_vs_conn_fill_param(cp->af, cp->protocol, &cp->caddr, cp->cport,
- NULL, 0, &p);
+ ip_vs_conn_fill_param(ip_vs_conn_net(cp), cp->af, cp->protocol,
+ &cp->caddr, cp->cport, NULL, 0, &p);
if (cp->pe) {
p.pe = cp->pe;
@@ -186,7 +183,7 @@ static unsigned int ip_vs_conn_hashkey_conn(const struct ip_vs_conn *cp)
}
/*
- * Hashes ip_vs_conn in ip_vs_conn_tab by proto,addr,port.
+ * Hashes ip_vs_conn in ip_vs_conn_tab by netns,proto,addr,port.
* returns bool success.
*/
static inline int ip_vs_conn_hash(struct ip_vs_conn *cp)
@@ -268,10 +265,10 @@ __ip_vs_conn_in_get(const struct ip_vs_conn_param *p)
ct_read_lock(hash);
list_for_each_entry(cp, &ip_vs_conn_tab[hash], c_list) {
- if (cp->af == p->af &&
+ if (ip_vs_conn_net_eq(cp, p->net) && cp->af == p->af &&
+ p->cport == cp->cport && p->vport == cp->vport &&
ip_vs_addr_equal(p->af, p->caddr, &cp->caddr) &&
ip_vs_addr_equal(p->af, p->vaddr, &cp->vaddr) &&
- p->cport == cp->cport && p->vport == cp->vport &&
((!p->cport) ^ (!(cp->flags & IP_VS_CONN_F_NO_CPORT))) &&
p->protocol == cp->protocol) {
/* HIT */
@@ -313,17 +310,18 @@ ip_vs_conn_fill_param_proto(int af, const struct sk_buff *skb,
struct ip_vs_conn_param *p)
{
__be16 _ports[2], *pptr;
+ struct net *net = skb_net(skb);
pptr = skb_header_pointer(skb, proto_off, sizeof(_ports), _ports);
if (pptr == NULL)
return 1;
if (likely(!inverse))
- ip_vs_conn_fill_param(af, iph->protocol, &iph->saddr, pptr[0],
- &iph->daddr, pptr[1], p);
+ ip_vs_conn_fill_param(net, af, iph->protocol, &iph->saddr,
+ pptr[0], &iph->daddr, pptr[1], p);
else
- ip_vs_conn_fill_param(af, iph->protocol, &iph->daddr, pptr[1],
- &iph->saddr, pptr[0], p);
+ ip_vs_conn_fill_param(net, af, iph->protocol, &iph->daddr,
+ pptr[1], &iph->saddr, pptr[0], p);
return 0;
}
@@ -353,6 +351,8 @@ struct ip_vs_conn *ip_vs_ct_in_get(const struct ip_vs_conn_param *p)
ct_read_lock(hash);
list_for_each_entry(cp, &ip_vs_conn_tab[hash], c_list) {
+ if (!ip_vs_conn_net_eq(cp, p->net))
+ continue;
if (p->pe_data && p->pe->ct_match) {
if (p->pe == cp->pe && p->pe->ct_match(p, cp))
goto out;
@@ -403,10 +403,10 @@ struct ip_vs_conn *ip_vs_conn_out_get(const struct ip_vs_conn_param *p)
ct_read_lock(hash);
list_for_each_entry(cp, &ip_vs_conn_tab[hash], c_list) {
- if (cp->af == p->af &&
+ if (ip_vs_conn_net_eq(cp, p->net) && cp->af == p->af &&
+ p->vport == cp->cport && p->cport == cp->dport &&
ip_vs_addr_equal(p->af, p->vaddr, &cp->caddr) &&
ip_vs_addr_equal(p->af, p->caddr, &cp->daddr) &&
- p->vport == cp->cport && p->cport == cp->dport &&
p->protocol == cp->protocol) {
/* HIT */
atomic_inc(&cp->refcnt);
@@ -611,8 +611,8 @@ struct ip_vs_dest *ip_vs_try_bind_dest(struct ip_vs_conn *cp)
struct ip_vs_dest *dest;
if ((cp) && (!cp->dest)) {
- dest = ip_vs_find_dest(&init_net, cp->af, &cp->daddr, cp->dport,
- &cp->vaddr, cp->vport,
+ dest = ip_vs_find_dest(ip_vs_conn_net(cp), cp->af, &cp->daddr,
+ cp->dport, &cp->vaddr, cp->vport,
cp->protocol, cp->fwmark);
ip_vs_bind_dest(cp, dest);
return dest;
@@ -730,6 +730,7 @@ int ip_vs_check_template(struct ip_vs_conn *ct)
static void ip_vs_conn_expire(unsigned long data)
{
struct ip_vs_conn *cp = (struct ip_vs_conn *)data;
+ struct netns_ipvs *ipvs = net_ipvs(ip_vs_conn_net(cp));
cp->timeout = 60*HZ;
@@ -772,7 +773,7 @@ static void ip_vs_conn_expire(unsigned long data)
ip_vs_unbind_dest(cp);
if (cp->flags & IP_VS_CONN_F_NO_CPORT)
atomic_dec(&ip_vs_conn_no_cport_cnt);
- atomic_dec(&ip_vs_conn_count);
+ atomic_dec(&ipvs->conn_count);
kmem_cache_free(ip_vs_conn_cachep, cp);
return;
@@ -806,7 +807,9 @@ ip_vs_conn_new(const struct ip_vs_conn_param *p,
struct ip_vs_dest *dest, __u32 fwmark)
{
struct ip_vs_conn *cp;
- struct ip_vs_proto_data *pd = ip_vs_proto_data_get(&init_net, p->protocol);
+ struct netns_ipvs *ipvs = net_ipvs(p->net);
+ struct ip_vs_proto_data *pd = ip_vs_proto_data_get(p->net,
+ p->protocol);
cp = kmem_cache_zalloc(ip_vs_conn_cachep, GFP_ATOMIC);
if (cp == NULL) {
@@ -816,6 +819,7 @@ ip_vs_conn_new(const struct ip_vs_conn_param *p,
INIT_LIST_HEAD(&cp->c_list);
setup_timer(&cp->timer, ip_vs_conn_expire, (unsigned long)cp);
+ ip_vs_conn_net_set(cp, p->net);
cp->af = p->af;
cp->protocol = p->protocol;
ip_vs_addr_copy(p->af, &cp->caddr, p->caddr);
@@ -846,7 +850,7 @@ ip_vs_conn_new(const struct ip_vs_conn_param *p,
atomic_set(&cp->n_control, 0);
atomic_set(&cp->in_pkts, 0);
- atomic_inc(&ip_vs_conn_count);
+ atomic_inc(&ipvs->conn_count);
if (flags & IP_VS_CONN_F_NO_CPORT)
atomic_inc(&ip_vs_conn_no_cport_cnt);
@@ -888,17 +892,22 @@ ip_vs_conn_new(const struct ip_vs_conn_param *p,
* /proc/net/ip_vs_conn entries
*/
#ifdef CONFIG_PROC_FS
+struct ip_vs_iter_state {
+ struct seq_net_private p;
+ struct list_head *l;
+};
static void *ip_vs_conn_array(struct seq_file *seq, loff_t pos)
{
int idx;
struct ip_vs_conn *cp;
+ struct ip_vs_iter_state *iter = seq->private;
for (idx = 0; idx < ip_vs_conn_tab_size; idx++) {
ct_read_lock_bh(idx);
list_for_each_entry(cp, &ip_vs_conn_tab[idx], c_list) {
if (pos-- == 0) {
- seq->private = &ip_vs_conn_tab[idx];
+ iter->l = &ip_vs_conn_tab[idx];
return cp;
}
}
@@ -910,14 +919,17 @@ static void *ip_vs_conn_array(struct seq_file *seq, loff_t pos)
static void *ip_vs_conn_seq_start(struct seq_file *seq, loff_t *pos)
{
- seq->private = NULL;
+ struct ip_vs_iter_state *iter = seq->private;
+
+ iter->l = NULL;
return *pos ? ip_vs_conn_array(seq, *pos - 1) :SEQ_START_TOKEN;
}
static void *ip_vs_conn_seq_next(struct seq_file *seq, void *v, loff_t *pos)
{
struct ip_vs_conn *cp = v;
- struct list_head *e, *l = seq->private;
+ struct ip_vs_iter_state *iter = seq->private;
+ struct list_head *e, *l = iter->l;
int idx;
++*pos;
@@ -934,18 +946,19 @@ static void *ip_vs_conn_seq_next(struct seq_file *seq, void *v, loff_t *pos)
while (++idx < ip_vs_conn_tab_size) {
ct_read_lock_bh(idx);
list_for_each_entry(cp, &ip_vs_conn_tab[idx], c_list) {
- seq->private = &ip_vs_conn_tab[idx];
+ iter->l = &ip_vs_conn_tab[idx];
return cp;
}
ct_read_unlock_bh(idx);
}
- seq->private = NULL;
+ iter->l = NULL;
return NULL;
}
static void ip_vs_conn_seq_stop(struct seq_file *seq, void *v)
{
- struct list_head *l = seq->private;
+ struct ip_vs_iter_state *iter = seq->private;
+ struct list_head *l = iter->l;
if (l)
ct_read_unlock_bh(l - ip_vs_conn_tab);
@@ -959,9 +972,12 @@ static int ip_vs_conn_seq_show(struct seq_file *seq, void *v)
"Pro FromIP FPrt ToIP TPrt DestIP DPrt State Expires PEName PEData\n");
else {
const struct ip_vs_conn *cp = v;
+ struct net *net = seq_file_net(seq);
char pe_data[IP_VS_PENAME_MAXLEN + IP_VS_PEDATA_MAXLEN + 3];
size_t len = 0;
+ if (!ip_vs_conn_net_eq(cp, net))
+ return 0;
if (cp->pe_data) {
pe_data[0] = ' ';
len = strlen(cp->pe->name);
@@ -1006,7 +1022,8 @@ static const struct seq_operations ip_vs_conn_seq_ops = {
static int ip_vs_conn_open(struct inode *inode, struct file *file)
{
- return seq_open(file, &ip_vs_conn_seq_ops);
+ return seq_open_net(inode, file, &ip_vs_conn_seq_ops,
+ sizeof(struct ip_vs_iter_state));
}
static const struct file_operations ip_vs_conn_fops = {
@@ -1033,6 +1050,10 @@ static int ip_vs_conn_sync_seq_show(struct seq_file *seq, void *v)
"Pro FromIP FPrt ToIP TPrt DestIP DPrt State Origin Expires\n");
else {
const struct ip_vs_conn *cp = v;
+ struct net *net = seq_file_net(seq);
+
+ if (!ip_vs_conn_net_eq(cp, net))
+ return 0;
#ifdef CONFIG_IP_VS_IPV6
if (cp->af == AF_INET6)
@@ -1069,7 +1090,8 @@ static const struct seq_operations ip_vs_conn_sync_seq_ops = {
static int ip_vs_conn_sync_open(struct inode *inode, struct file *file)
{
- return seq_open(file, &ip_vs_conn_sync_seq_ops);
+ return seq_open_net(inode, file, &ip_vs_conn_sync_seq_ops,
+ sizeof(struct ip_vs_iter_state));
}
static const struct file_operations ip_vs_conn_sync_fops = {
@@ -1170,10 +1192,11 @@ void ip_vs_random_dropentry(void)
/*
* Flush all the connection entries in the ip_vs_conn_tab
*/
-static void ip_vs_conn_flush(void)
+static void ip_vs_conn_flush(struct net *net)
{
int idx;
struct ip_vs_conn *cp;
+ struct netns_ipvs *ipvs = net_ipvs(net);
flush_again:
for (idx = 0; idx < ip_vs_conn_tab_size; idx++) {
@@ -1183,7 +1206,8 @@ static void ip_vs_conn_flush(void)
ct_write_lock_bh(idx);
list_for_each_entry(cp, &ip_vs_conn_tab[idx], c_list) {
-
+ if (!ip_vs_conn_net_eq(cp, net))
+ continue;
IP_VS_DBG(4, "del connection\n");
ip_vs_conn_expire_now(cp);
if (cp->control) {
@@ -1196,7 +1220,7 @@ static void ip_vs_conn_flush(void)
/* the counter may be not NULL, because maybe some conn entries
are run by slow timer handler or unhashed but still referred */
- if (atomic_read(&ip_vs_conn_count) != 0) {
+ if (atomic_read(&ipvs->conn_count) != 0) {
schedule();
goto flush_again;
}
@@ -1204,14 +1228,37 @@ static void ip_vs_conn_flush(void)
int __net_init __ip_vs_conn_init(struct net *net)
{
- int idx;
+ struct netns_ipvs *ipvs = net_ipvs(net);
if (!net_eq(net, &init_net)) /* netns not enabled yet */
return -EPERM;
+ atomic_set(&ipvs->conn_count, 0);
- /* Compute size and mask */
- ip_vs_conn_tab_size = 1 << ip_vs_conn_tab_bits;
- ip_vs_conn_tab_mask = ip_vs_conn_tab_size - 1;
+ proc_net_fops_create(net, "ip_vs_conn", 0, &ip_vs_conn_fops);
+ proc_net_fops_create(net, "ip_vs_conn_sync", 0, &ip_vs_conn_sync_fops);
+
+ return 0;
+}
+/* Cleanup and release all netns related ... */
+static void __net_exit __ip_vs_conn_cleanup(struct net *net)
+{
+ if (!net_eq(net, &init_net)) /* netns not enabled yet */
+ return;
+
+ /* flush all the connection entries first */
+ ip_vs_conn_flush(net);
+ proc_net_remove(net, "ip_vs_conn");
+ proc_net_remove(net, "ip_vs_conn_sync");
+}
+static struct pernet_operations ipvs_conn_ops = {
+ .init = __ip_vs_conn_init,
+ .exit = __ip_vs_conn_cleanup,
+};
+
+int __init ip_vs_conn_init(void)
+{
+ int rv;
+ int idx;
/* Compute size and mask */
ip_vs_conn_tab_size = 1 << ip_vs_conn_tab_bits;
@@ -1249,34 +1296,6 @@ int __net_init __ip_vs_conn_init(struct net *net)
rwlock_init(&__ip_vs_conntbl_lock_array[idx].l);
}
- proc_net_fops_create(net, "ip_vs_conn", 0, &ip_vs_conn_fops);
- proc_net_fops_create(net, "ip_vs_conn_sync", 0, &ip_vs_conn_sync_fops);
-
- return 0;
-}
-/* Cleanup and release all netns related ... */
-static void __net_exit __ip_vs_conn_cleanup(struct net *net)
-{
- if (!net_eq(net, &init_net)) /* netns not enabled yet */
- return;
-
- /* flush all the connection entries first */
- ip_vs_conn_flush();
- /* Release the empty cache */
- kmem_cache_destroy(ip_vs_conn_cachep);
- proc_net_remove(net, "ip_vs_conn");
- proc_net_remove(net, "ip_vs_conn_sync");
- vfree(ip_vs_conn_tab);
-}
-static struct pernet_operations ipvs_conn_ops = {
- .init = __ip_vs_conn_init,
- .exit = __ip_vs_conn_cleanup,
-};
-
-int __init ip_vs_conn_init(void)
-{
- int rv;
-
rv = register_pernet_subsys(&ipvs_conn_ops);
/* calculate the random value for connection hash */
@@ -1287,4 +1306,8 @@ int __init ip_vs_conn_init(void)
void ip_vs_conn_cleanup(void)
{
unregister_pernet_subsys(&ipvs_conn_ops);
+ /* Release the empty cache */
+ kmem_cache_destroy(ip_vs_conn_cachep);
+ vfree(ip_vs_conn_tab);
+
}
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 679eb16..69ba8cc 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -193,7 +193,8 @@ ip_vs_conn_fill_param_persist(const struct ip_vs_service *svc,
const union nf_inet_addr *vaddr, __be16 vport,
struct ip_vs_conn_param *p)
{
- ip_vs_conn_fill_param(svc->af, protocol, caddr, cport, vaddr, vport, p);
+ ip_vs_conn_fill_param(svc->net, svc->af, protocol, caddr, cport, vaddr,
+ vport, p);
p->pe = svc->pe;
if (p->pe && p->pe->fill_param)
return p->pe->fill_param(p, skb);
@@ -336,8 +337,8 @@ ip_vs_sched_persist(struct ip_vs_service *svc,
/*
* Create a new connection according to the template
*/
- ip_vs_conn_fill_param(svc->af, iph.protocol, &iph.saddr, src_port,
- &iph.daddr, dst_port, ¶m);
+ ip_vs_conn_fill_param(svc->net, svc->af, iph.protocol, &iph.saddr,
+ src_port, &iph.daddr, dst_port, ¶m);
cp = ip_vs_conn_new(¶m, &dest->addr, dport, flags, dest, skb->mark);
if (cp == NULL) {
@@ -451,8 +452,10 @@ ip_vs_schedule(struct ip_vs_service *svc, struct sk_buff *skb,
*/
{
struct ip_vs_conn_param p;
- ip_vs_conn_fill_param(svc->af, iph.protocol, &iph.saddr,
- pptr[0], &iph.daddr, pptr[1], &p);
+
+ ip_vs_conn_fill_param(svc->net, svc->af, iph.protocol,
+ &iph.saddr, pptr[0], &iph.daddr, pptr[1],
+ &p);
cp = ip_vs_conn_new(&p, &dest->addr,
dest->port ? dest->port : pptr[1],
flags, dest, skb->mark);
@@ -518,7 +521,7 @@ int ip_vs_leave(struct ip_vs_service *svc, struct sk_buff *skb,
IP_VS_DBG(6, "%s(): create a cache_bypass entry\n", __func__);
{
struct ip_vs_conn_param p;
- ip_vs_conn_fill_param(svc->af, iph.protocol,
+ ip_vs_conn_fill_param(svc->net, svc->af, iph.protocol,
&iph.saddr, pptr[0],
&iph.daddr, pptr[1], &p);
cp = ip_vs_conn_new(&p, &daddr, 0,
diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index b39befd..78d5980 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -198,13 +198,15 @@ static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
*/
{
struct ip_vs_conn_param p;
- ip_vs_conn_fill_param(AF_INET, iph->protocol,
- &from, port, &cp->caddr, 0, &p);
+ ip_vs_conn_fill_param(ip_vs_conn_net(cp), AF_INET,
+ iph->protocol, &from, port,
+ &cp->caddr, 0, &p);
n_cp = ip_vs_conn_out_get(&p);
}
if (!n_cp) {
struct ip_vs_conn_param p;
- ip_vs_conn_fill_param(AF_INET, IPPROTO_TCP, &cp->caddr,
+ ip_vs_conn_fill_param(ip_vs_conn_net(cp),
+ AF_INET, IPPROTO_TCP, &cp->caddr,
0, &cp->vaddr, port, &p);
n_cp = ip_vs_conn_new(&p, &from, port,
IP_VS_CONN_F_NO_CPORT |
@@ -360,9 +362,9 @@ static int ip_vs_ftp_in(struct ip_vs_app *app, struct ip_vs_conn *cp,
{
struct ip_vs_conn_param p;
- ip_vs_conn_fill_param(AF_INET, iph->protocol, &to, port,
- &cp->vaddr, htons(ntohs(cp->vport)-1),
- &p);
+ ip_vs_conn_fill_param(ip_vs_conn_net(cp), AF_INET,
+ iph->protocol, &to, port, &cp->vaddr,
+ htons(ntohs(cp->vport)-1), &p);
n_cp = ip_vs_conn_in_get(&p);
if (!n_cp) {
n_cp = ip_vs_conn_new(&p, &cp->daddr,
diff --git a/net/netfilter/ipvs/ip_vs_nfct.c b/net/netfilter/ipvs/ip_vs_nfct.c
index 4680647..f454c80 100644
--- a/net/netfilter/ipvs/ip_vs_nfct.c
+++ b/net/netfilter/ipvs/ip_vs_nfct.c
@@ -141,6 +141,7 @@ static void ip_vs_nfct_expect_callback(struct nf_conn *ct,
struct nf_conntrack_tuple *orig, new_reply;
struct ip_vs_conn *cp;
struct ip_vs_conn_param p;
+ struct net *net = nf_ct_net(ct);
if (exp->tuple.src.l3num != PF_INET)
return;
@@ -155,7 +156,7 @@ static void ip_vs_nfct_expect_callback(struct nf_conn *ct,
/* RS->CLIENT */
orig = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
- ip_vs_conn_fill_param(exp->tuple.src.l3num, orig->dst.protonum,
+ ip_vs_conn_fill_param(net, exp->tuple.src.l3num, orig->dst.protonum,
&orig->src.u3, orig->src.u.tcp.port,
&orig->dst.u3, orig->dst.u.tcp.port, &p);
cp = ip_vs_conn_out_get(&p);
@@ -268,7 +269,8 @@ void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp)
" for conn " FMT_CONN "\n",
__func__, ARG_TUPLE(&tuple), ARG_CONN(cp));
- h = nf_conntrack_find_get(&init_net, NF_CT_DEFAULT_ZONE, &tuple);
+ h = nf_conntrack_find_get(ip_vs_conn_net(cp), NF_CT_DEFAULT_ZONE,
+ &tuple);
if (h) {
ct = nf_ct_tuplehash_to_ctrack(h);
/* Show what happens instead of calling nf_ct_kill() */
diff --git a/net/netfilter/ipvs/ip_vs_proto_ah_esp.c b/net/netfilter/ipvs/ip_vs_proto_ah_esp.c
index b8b37fa..ded1adb 100644
--- a/net/netfilter/ipvs/ip_vs_proto_ah_esp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_ah_esp.c
@@ -41,15 +41,16 @@ struct isakmp_hdr {
#define PORT_ISAKMP 500
static void
-ah_esp_conn_fill_param_proto(int af, const struct ip_vs_iphdr *iph,
- int inverse, struct ip_vs_conn_param *p)
+ah_esp_conn_fill_param_proto(struct net *net, int af,
+ const struct ip_vs_iphdr *iph, int inverse,
+ struct ip_vs_conn_param *p)
{
if (likely(!inverse))
- ip_vs_conn_fill_param(af, IPPROTO_UDP,
+ ip_vs_conn_fill_param(net, af, IPPROTO_UDP,
&iph->saddr, htons(PORT_ISAKMP),
&iph->daddr, htons(PORT_ISAKMP), p);
else
- ip_vs_conn_fill_param(af, IPPROTO_UDP,
+ ip_vs_conn_fill_param(net, af, IPPROTO_UDP,
&iph->daddr, htons(PORT_ISAKMP),
&iph->saddr, htons(PORT_ISAKMP), p);
}
@@ -61,8 +62,9 @@ ah_esp_conn_in_get(int af, const struct sk_buff *skb, struct ip_vs_protocol *pp,
{
struct ip_vs_conn *cp;
struct ip_vs_conn_param p;
+ struct net *net = skb_net(skb);
- ah_esp_conn_fill_param_proto(af, iph, inverse, &p);
+ ah_esp_conn_fill_param_proto(net, af, iph, inverse, &p);
cp = ip_vs_conn_in_get(&p);
if (!cp) {
/*
@@ -90,8 +92,9 @@ ah_esp_conn_out_get(int af, const struct sk_buff *skb,
{
struct ip_vs_conn *cp;
struct ip_vs_conn_param p;
+ struct net *net = skb_net(skb);
- ah_esp_conn_fill_param_proto(af, iph, inverse, &p);
+ ah_esp_conn_fill_param_proto(net, af, iph, inverse, &p);
cp = ip_vs_conn_out_get(&p);
if (!cp) {
IP_VS_DBG_BUF(12, "Unknown ISAKMP entry for inout packet "
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index 3c6d06d..c011830 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -1064,7 +1064,7 @@ static void sctp_unregister_app(struct net *net, struct ip_vs_app *inc)
static int sctp_app_conn_bind(struct ip_vs_conn *cp)
{
- struct netns_ipvs *ipvs = net_ipvs(&init_net);
+ struct netns_ipvs *ipvs = net_ipvs(ip_vs_conn_net(cp));
int hash;
struct ip_vs_app *inc;
int result = 0;
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index 326768d..2e270b6 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -626,7 +626,7 @@ tcp_unregister_app(struct net *net, struct ip_vs_app *inc)
static int
tcp_app_conn_bind(struct ip_vs_conn *cp)
{
- struct netns_ipvs *ipvs = net_ipvs(&init_net);
+ struct netns_ipvs *ipvs = net_ipvs(ip_vs_conn_net(cp));
int hash;
struct ip_vs_app *inc;
int result = 0;
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index 7e894d5..882a547 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -396,7 +396,7 @@ udp_unregister_app(struct net *net, struct ip_vs_app *inc)
static int udp_app_conn_bind(struct ip_vs_conn *cp)
{
- struct netns_ipvs *ipvs = net_ipvs(&init_net);
+ struct netns_ipvs *ipvs = net_ipvs(ip_vs_conn_net(cp));
int hash;
struct ip_vs_app *inc;
int result = 0;
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index 29c6bbb..2887c34 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -657,21 +657,21 @@ control:
* fill_param used by version 1
*/
static inline int
-ip_vs_conn_fill_param_sync(int af, union ip_vs_sync_conn *sc,
+ip_vs_conn_fill_param_sync(struct net *net, int af, union ip_vs_sync_conn *sc,
struct ip_vs_conn_param *p,
__u8 *pe_data, unsigned int pe_data_len,
__u8 *pe_name, unsigned int pe_name_len)
{
#ifdef CONFIG_IP_VS_IPV6
if (af == AF_INET6)
- ip_vs_conn_fill_param(af, sc->v6.protocol,
+ ip_vs_conn_fill_param(net, af, sc->v6.protocol,
(const union nf_inet_addr *)&sc->v6.caddr,
sc->v6.cport,
(const union nf_inet_addr *)&sc->v6.vaddr,
sc->v6.vport, p);
else
#endif
- ip_vs_conn_fill_param(af, sc->v4.protocol,
+ ip_vs_conn_fill_param(net, af, sc->v4.protocol,
(const union nf_inet_addr *)&sc->v4.caddr,
sc->v4.cport,
(const union nf_inet_addr *)&sc->v4.vaddr,
@@ -878,7 +878,7 @@ static void ip_vs_process_message_v0(struct net *net, const char *buffer,
}
}
- ip_vs_conn_fill_param(AF_INET, s->protocol,
+ ip_vs_conn_fill_param(net, AF_INET, s->protocol,
(const union nf_inet_addr *)&s->caddr,
s->cport,
(const union nf_inet_addr *)&s->vaddr,
@@ -1040,9 +1040,8 @@ static inline int ip_vs_proc_sync_conn(struct net *net, __u8 *p, __u8 *msg_end)
state = 0;
}
}
- if (ip_vs_conn_fill_param_sync(af, s, ¶m,
- pe_data, pe_data_len,
- pe_name, pe_name_len)) {
+ if (ip_vs_conn_fill_param_sync(net, af, s, ¶m, pe_data,
+ pe_data_len, pe_name, pe_name_len)) {
retc = 50;
goto out;
}
--
1.7.2.3
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox