* [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
@ 2008-06-11 17:11 Julius R. Volz
2008-06-11 17:11 ` [PATCH 01/26] IPVS: Add CONFIG_IP_VS_IPV6 option for IPv6 support Julius R. Volz
` (26 more replies)
0 siblings, 27 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC
To: lvs-devel, netdev; +Cc: horms, davem, vbusam
Hi,
This patch series adds initial, experimental IPv6 support to IPVS. I
posted it to the LVS mailing list as one huge patch a while ago, so here
is the split-up version, although it is still very big. I don't see an
easy way of breaking the series into truly independent chunks, since
most of the changes depend on each other. I'm still a kernel newbie, so
any advice is welcome :)
- Full kernel patch in one file against davem's net-2.6:
http://www-user.tu-chemnitz.de/~volz/ipvs_ipv6/ipvs_ipv6.patch
- It depends on this patch that moves the "ipvs" directory to "net/netfilter":
http://www-user.tu-chemnitz.de/~volz/ipvs_ipv6/move_ipvs_to_netfilter.patch
(this patch really only moves files from one directory to another, so
no real content changes in there)
While not all IPv6 features are working or tested, existing IPv4 features
should still work as before. However, as these changes break the
kernel<->userspace interface, you need a new version of ipvsadm to use
these patches, even for IPv4-only operation:
- ipvsadm patch and tar.gz (by Vince Busam):
http://www-user.tu-chemnitz.de/~volz/ipvs_ipv6/ipvsadm-1.25-ipv6-1.patch
http://www-user.tu-chemnitz.de/~volz/ipvs_ipv6/ipvsadm-1.25-ipv6-1.tar.gz
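Why even IPv4-only setups need the new ipvsadm can be seen from the
structure layout. The following is a minimal userspace sketch, not the
kernel header: `old_dest_user`, `new_dest_user` and `addr_user` are
hypothetical names that keep only the leading fields of
struct ip_vs_dest_user before and after this series, and the helper
functions exist purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>   /* struct in6_addr */

/* Hypothetical stand-in for the pre-series layout (leading fields only). */
struct old_dest_user {
    uint32_t addr;        /* __be32 in the kernel header */
    uint16_t port;
};

/* Hypothetical stand-in for the address union added by the series. */
union addr_user {
    uint32_t v4;
    struct in6_addr v6;
};

/* Hypothetical stand-in for the post-series layout (leading fields only). */
struct new_dest_user {
    uint16_t af;          /* new address family field comes first */
    union addr_user addr; /* now large enough for an IPv6 address */
    uint16_t port;
};

/* Byte offset of the port field in each layout. */
static size_t old_port_offset(void)
{
    return offsetof(struct old_dest_user, port);
}

static size_t new_port_offset(void)
{
    return offsetof(struct new_dest_user, port);
}
```

Since both the field offsets and the total size change, an ipvsadm
binary built against the old header would hand the kernel a buffer that
the new code interprets incorrectly, regardless of address family.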
While I have mainly been working on the kernel part, Vince Busam has been
converting ipvsadm to support the new kernel features from userspace.
To enable IPv6 support in IPVS, set CONFIG_IP_VS_IPV6.
Short overview:
What works with IPv6:
- forwarding mechanisms: NAT, DR, maybe Tunnel (not fully tested yet)
- protocols: TCP, UDP, ESP, AH (last two not tested)
- manipulation and inspection of both IPv4 and IPv6 entries with ipvsadm
- 6 out of 10 schedulers
What is not supported with IPv6:
- handling fragmentation or other extension headers
- FTP application helper (can be loaded, but only operates on v4)
- sync daemon (can be started, but only operates on v4)
- probably some incorrect handling of ICMPv6 or other corner cases
Since fragmentation and extension headers should not occur very often,
things should "mostly" work. I tested HTTP and DNS over NAT and DR
with various supported schedulers without encountering any problems.
However, we have not tested any exotic situations. There are also some
TODOs in the code for things that haven't been tested or implemented yet.
I copied many IPv4 functions and adapted them into corresponding IPv6
versions, so there is quite a lot of duplication. I chose this approach
in order to not break too much of the existing IPv4 code; the upside is
that v4 should hopefully still work exactly as before. All relevant data
structures gain an 'af' field specifying the address family and use a
union of a v4 and a v6 address to hold IP addresses.
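The 'af' field plus address union described above can be sketched in
plain userspace C. This is only an illustration under assumptions: the
names `vs_addr`, `vs_entry` and `vs_entry_addr_equal` are hypothetical
and do not appear in the patches.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */
#include <netinet/in.h>   /* struct in6_addr */

/* Hypothetical analogue of the v4/v6 address union. */
union vs_addr {
    uint32_t v4;
    struct in6_addr v6;
};

/* Hypothetical entry: the family tag says which union member is valid. */
struct vs_entry {
    uint16_t af;
    union vs_addr addr;
};

/* Compare two entries, looking only at the member selected by 'af'. */
static bool vs_entry_addr_equal(const struct vs_entry *a,
                                const struct vs_entry *b)
{
    if (a->af != b->af)
        return false;
    if (a->af == AF_INET6)
        return memcmp(&a->addr.v6, &b->addr.v6, sizeof(a->addr.v6)) == 0;
    return a->addr.v4 == b->addr.v4;
}
```

The family tag has to be checked before the address, because the v4
member overlays the first four bytes of the v6 member.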
Feel free to comment, question, criticize, ridicule... this is our first
big kernel project though, so please don't be too hard on us! ;)
include/net/ip_vs.h | 267 +++++++++--
net/netfilter/ipvs/Kconfig | 8 +
net/netfilter/ipvs/ip_vs_conn.c | 406 ++++++++++++++--
net/netfilter/ipvs/ip_vs_core.c | 865 +++++++++++++++++++++++++++++++++-
net/netfilter/ipvs/ip_vs_ctl.c | 565 ++++++++++++++++++++---
net/netfilter/ipvs/ip_vs_dh.c | 3 +
net/netfilter/ipvs/ip_vs_ftp.c | 34 +-
net/netfilter/ipvs/ip_vs_lblc.c | 3 +
net/netfilter/ipvs/ip_vs_lblcr.c | 3 +
net/netfilter/ipvs/ip_vs_lc.c | 20 +-
net/netfilter/ipvs/ip_vs_nq.c | 15 +-
net/netfilter/ipvs/ip_vs_proto.c | 67 +++-
net/netfilter/ipvs/ip_vs_proto_ah.c | 122 +++++-
net/netfilter/ipvs/ip_vs_proto_esp.c | 121 +++++-
net/netfilter/ipvs/ip_vs_proto_tcp.c | 320 ++++++++++++-
net/netfilter/ipvs/ip_vs_proto_udp.c | 294 +++++++++++-
net/netfilter/ipvs/ip_vs_rr.c | 3 +
net/netfilter/ipvs/ip_vs_sed.c | 22 +-
net/netfilter/ipvs/ip_vs_sh.c | 3 +
net/netfilter/ipvs/ip_vs_sync.c | 6 +-
net/netfilter/ipvs/ip_vs_wlc.c | 22 +-
net/netfilter/ipvs/ip_vs_wrr.c | 22 +-
net/netfilter/ipvs/ip_vs_xmit.c | 467 ++++++++++++++++++-
23 files changed, 3401 insertions(+), 257 deletions(-)
Julius Volz
* [PATCH 01/26] IPVS: Add CONFIG_IP_VS_IPV6 option for IPv6 support.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
@ 2008-06-11 17:11 ` Julius R. Volz
2008-06-11 17:11 ` [PATCH 02/26] IPVS: Change IPVS data structures to support IPv6 addresses Julius R. Volz
` (25 subsequent siblings)
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add boolean config option CONFIG_IP_VS_IPV6 for enabling experimental IPv6
support in IPVS. The option depends on EXPERIMENTAL and is only visible if
IPv6 support is set to 'y' or if both IPv6 and IPVS are built as modules.
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 8 insertions(+), 0 deletions(-)
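The visibility rule in the description can be modelled as a small
truth function. This is a hedged userspace sketch of how the expression
`EXPERIMENTAL && (IPV6 = y || IP_VS = IPV6)` evaluates, not Kconfig's
actual implementation; `enum tristate` and `ip_vs_ipv6_visible` are
hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical model of a Kconfig tristate symbol value. */
enum tristate { TRI_N, TRI_M, TRI_Y };

/*
 * Model of: depends on EXPERIMENTAL && (IPV6 = y || IP_VS = IPV6),
 * evaluated inside the enclosing "if IP_VS" block.
 */
static bool ip_vs_ipv6_visible(bool experimental,
                               enum tristate ipv6,
                               enum tristate ip_vs)
{
    if (ip_vs == TRI_N)   /* the "if IP_VS" block hides everything */
        return false;
    return experimental && (ipv6 == TRI_Y || ip_vs == ipv6);
}
```

Under this model the option is hidden when IPVS is built in but IPv6 is
a module, presumably because built-in IPVS code could not call into a
modular IPv6 stack.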
diff --git a/net/netfilter/ipvs/Kconfig b/net/netfilter/ipvs/Kconfig
index 09d0c3f..fd24182 100644
--- a/net/netfilter/ipvs/Kconfig
+++ b/net/netfilter/ipvs/Kconfig
@@ -24,6 +24,14 @@ menuconfig IP_VS
if IP_VS
+config IP_VS_IPV6
+ bool "IPv6 support for IPVS (DANGEROUS)"
+ depends on EXPERIMENTAL && (IPV6 = y || IP_VS = IPV6)
+ ---help---
+ Add IPv6 support to IPVS. This is incomplete and might be dangerous.
+
+ Say N if unsure.
+
config IP_VS_DEBUG
bool "IP virtual server debugging"
---help---
--
1.5.3.6
* [PATCH 02/26] IPVS: Change IPVS data structures to support IPv6 addresses.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
2008-06-11 17:11 ` [PATCH 01/26] IPVS: Add CONFIG_IP_VS_IPV6 option for IPv6 support Julius R. Volz
@ 2008-06-11 17:11 ` Julius R. Volz
2008-06-11 17:12 ` Patrick McHardy
2008-06-12 1:54 ` Brian Haley
2008-06-11 17:11 ` [PATCH 03/26] IPVS: Use new address family fields in IPVS structs Julius R. Volz
` (24 subsequent siblings)
26 siblings, 2 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC
To: lvs-devel, netdev; +Cc: horms, davem, vbusam
From: Vince Busam <vbusam@google.com>
Introduce new 'af' fields into IPVS structures for specifying an entry's
address family. Make IP addresses a union holding both an IPv4 and an IPv6
address. In kernel-internal structs, have the union only hold IPv4
addresses if no IPv6 support is enabled to save some space.
Bump IPVS version to 0x020000 because of this change.
Signed-off-by: Vince Busam <vbusam@google.com>
1 files changed, 39 insertions(+), 12 deletions(-)
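For reference, the version bump from 0x010201 to 0x020000 decodes as
1.2.1 to 2.0.0. The sketch below uses the same shifts and masks that are
visible in the NVERSION() macro; treating the third component as the low
byte is an assumption, and the `vs_*` names are hypothetical.

```c
#include <stdio.h>

#define VS_VERSION_OLD 0x010201u   /* value removed by this patch */
#define VS_VERSION_NEW 0x020000u   /* value introduced by this patch */

/* One byte per component, most significant component first. */
static unsigned vs_major(unsigned v) { return (v >> 16) & 0xFF; }
static unsigned vs_minor(unsigned v) { return (v >> 8) & 0xFF; }
static unsigned vs_patch(unsigned v) { return v & 0xFF; }  /* assumed */

/* Format a version code as "major.minor.patch"; returns snprintf's result. */
static int vs_version_str(unsigned v, char *buf, size_t len)
{
    return snprintf(buf, len, "%u.%u.%u",
                    vs_major(v), vs_minor(v), vs_patch(v));
}
```

A userspace tool could compare the major component it was built against
with the one reported by the kernel to detect the interface change.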
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 9a51eba..b7b181e 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -11,7 +11,12 @@
#include <linux/sysctl.h> /* For ctl_path */
-#define IP_VS_VERSION_CODE 0x010201
+#ifdef __KERNEL__
+#include <linux/in6.h> /* For struct in6_addr */
+#include <linux/ipv6.h> /* For struct ipv6hdr */
+#endif /* __KERNEL__ */
+
+#define IP_VS_VERSION_CODE 0x020000
#define NVERSION(version) \
(version >> 16) & 0xFF, \
(version >> 8) & 0xFF, \
@@ -95,6 +100,20 @@
#define IP_VS_SCHEDNAME_MAXLEN 16
#define IP_VS_IFNAME_MAXLEN 16
+union ip_vs_addr_user {
+ __be32 v4;
+ struct in6_addr v6;
+};
+
+#ifdef CONFIG_IP_VS_IPV6
+#define ip_vs_addr ip_vs_addr_user
+#define ip_vs_copy_addr(a, b) do { (a) = (b); } while (0)
+#else
+union ip_vs_addr {
+ __be32 v4;
+};
+#define ip_vs_copy_addr(a, b) do { (a).v4 = (b).v4; } while (0)
+#endif
/*
* The struct ip_vs_service_user and struct ip_vs_dest_user are
@@ -102,8 +121,9 @@
*/
struct ip_vs_service_user {
/* virtual service addresses */
+ u_int16_t af;
u_int16_t protocol;
- __be32 addr; /* virtual ip address */
+ union ip_vs_addr_user addr; /* virtual ip address */
__be16 port;
u_int32_t fwmark; /* firwall mark of service */
@@ -117,7 +137,8 @@ struct ip_vs_service_user {
struct ip_vs_dest_user {
/* destination server address */
- __be32 addr;
+ u_int16_t af;
+ union ip_vs_addr_user addr;
__be16 port;
/* real server options */
@@ -165,8 +186,9 @@ struct ip_vs_getinfo {
/* The argument to IP_VS_SO_GET_SERVICE */
struct ip_vs_service_entry {
/* which service: user fills in these */
+ u_int16_t af;
u_int16_t protocol;
- __be32 addr; /* virtual address */
+ union ip_vs_addr_user addr; /* virtual address */
__be16 port;
u_int32_t fwmark; /* firwall mark of service */
@@ -185,7 +207,8 @@ struct ip_vs_service_entry {
struct ip_vs_dest_entry {
- __be32 addr; /* destination address */
+ u_int16_t af;
+ union ip_vs_addr_user addr; /* destination address */
__be16 port;
unsigned conn_flags; /* connection flags */
int weight; /* destination weight */
@@ -205,8 +228,9 @@ struct ip_vs_dest_entry {
/* The argument to IP_VS_SO_GET_DESTS */
struct ip_vs_get_dests {
/* which service: user fills in these */
+ u_int16_t af;
u_int16_t protocol;
- __be32 addr; /* virtual address */
+ union ip_vs_addr_user addr; /* virtual address */
__be16 port;
u_int32_t fwmark; /* firwall mark of service */
@@ -472,9 +496,10 @@ struct ip_vs_conn {
struct list_head c_list; /* hashed list heads */
/* Protocol, addresses and port numbers */
- __be32 caddr; /* client address */
- __be32 vaddr; /* virtual address */
- __be32 daddr; /* destination address */
+ u_int16_t af; /* address family */
+ union ip_vs_addr caddr; /* client address */
+ union ip_vs_addr vaddr; /* virtual address */
+ union ip_vs_addr daddr; /* destination address */
__be16 cport;
__be16 vport;
__be16 dport;
@@ -527,8 +552,9 @@ struct ip_vs_service {
atomic_t refcnt; /* reference counter */
atomic_t usecnt; /* use counter */
+ u_int16_t af; /* address family */
__u16 protocol; /* which protocol (TCP/UDP) */
- __be32 addr; /* IP address for virtual service */
+ union ip_vs_addr addr; /* IP address for virtual service */
__be16 port; /* port number for the service */
__u32 fwmark; /* firewall mark of the service */
unsigned flags; /* service status flags */
@@ -555,7 +581,8 @@ struct ip_vs_dest {
struct list_head n_list; /* for the dests in the service */
struct list_head d_list; /* for table with all the dests */
- __be32 addr; /* IP address of the server */
+ u_int16_t af; /* address family */
+ union ip_vs_addr addr; /* IP address of the server */
__be16 port; /* port number of the server */
volatile unsigned flags; /* dest status flags */
atomic_t conn_flags; /* flags to copy to conn */
@@ -579,7 +606,7 @@ struct ip_vs_dest {
/* for virtual service */
struct ip_vs_service *svc; /* service it belongs to */
__u16 protocol; /* which protocol (TCP/UDP) */
- __be32 vaddr; /* virtual IP address */
+ union ip_vs_addr vaddr; /* virtual IP address */
__be16 vport; /* virtual port number */
__u32 vfwmark; /* firewall mark of service */
};
--
1.5.3.6
* [PATCH 03/26] IPVS: Use new address family fields in IPVS structs.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
2008-06-11 17:11 ` [PATCH 01/26] IPVS: Add CONFIG_IP_VS_IPV6 option for IPv6 support Julius R. Volz
2008-06-11 17:11 ` [PATCH 02/26] IPVS: Change IPVS data structures to support IPv6 addresses Julius R. Volz
@ 2008-06-11 17:11 ` Julius R. Volz
2008-06-11 17:11 ` [PATCH 04/26] IPVS: Add address family specific debugging macros Julius R. Volz
` (23 subsequent siblings)
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Adjust miscellaneous code to correctly use the new af, v4 and v6 fields in
IPVS data structures where this is not already covered by other patches.
Signed-off-by: Julius R. Volz <juliusv@google.com>
8 files changed, 72 insertions(+), 60 deletions(-)
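The added `af == AF_INET` checks matter because the v4 member of the
union overlays the first bytes of the v6 member, so an IPv4 lookup could
otherwise hit an IPv6 entry by accident. The following userspace sketch
illustrates this under assumptions; `vs_conn`, `lookup_v4` and
`lookup_unguarded` are hypothetical names and the table is a plain array
rather than the kernel's hash lists.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */
#include <netinet/in.h>   /* struct in6_addr */

union vs_addr {
    uint32_t v4;
    struct in6_addr v6;
};

/* Hypothetical, heavily reduced connection entry. */
struct vs_conn {
    uint16_t af;
    union vs_addr caddr;
    uint16_t cport;
};

/* IPv4 lookup with the family guard, as the patch adds it. */
static const struct vs_conn *
lookup_v4(const struct vs_conn *tab, size_t n, uint32_t addr, uint16_t port)
{
    for (size_t i = 0; i < n; i++) {
        if (tab[i].af == AF_INET &&
            tab[i].caddr.v4 == addr && tab[i].cport == port)
            return &tab[i];
    }
    return NULL;
}

/* Same lookup without the guard, shown only to demonstrate the problem. */
static const struct vs_conn *
lookup_unguarded(const struct vs_conn *tab, size_t n,
                 uint32_t addr, uint16_t port)
{
    for (size_t i = 0; i < n; i++) {
        if (tab[i].caddr.v4 == addr && tab[i].cport == port)
            return &tab[i];
    }
    return NULL;
}
```

With an IPv6 entry whose leading four address bytes happen to equal the
IPv4 address being looked up, the unguarded version returns the IPv6
entry, while the guarded version skips it.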
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 65f1ba1..1d81cbc 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -199,8 +199,9 @@ static inline struct ip_vs_conn *__ip_vs_conn_in_get
ct_read_lock(hash);
list_for_each_entry(cp, &ip_vs_conn_tab[hash], c_list) {
- if (s_addr==cp->caddr && s_port==cp->cport &&
- d_port==cp->vport && d_addr==cp->vaddr &&
+ if (cp->af == AF_INET &&
+ s_addr==cp->caddr.v4 && s_port==cp->cport &&
+ d_port==cp->vport && d_addr==cp->vaddr.v4 &&
((!s_port) ^ (!(cp->flags & IP_VS_CONN_F_NO_CPORT))) &&
protocol==cp->protocol) {
/* HIT */
@@ -245,8 +246,9 @@ struct ip_vs_conn *ip_vs_ct_in_get
ct_read_lock(hash);
list_for_each_entry(cp, &ip_vs_conn_tab[hash], c_list) {
- if (s_addr==cp->caddr && s_port==cp->cport &&
- d_port==cp->vport && d_addr==cp->vaddr &&
+ if (cp->af == AF_INET &&
+ s_addr==cp->caddr.v4 && s_port==cp->cport &&
+ d_port==cp->vport && d_addr==cp->vaddr.v4 &&
cp->flags & IP_VS_CONN_F_TEMPLATE &&
protocol==cp->protocol) {
/* HIT */
@@ -288,8 +290,9 @@ struct ip_vs_conn *ip_vs_conn_out_get
ct_read_lock(hash);
list_for_each_entry(cp, &ip_vs_conn_tab[hash], c_list) {
- if (d_addr == cp->caddr && d_port == cp->cport &&
- s_port == cp->dport && s_addr == cp->daddr &&
+ if (cp->af == AF_INET &&
+ d_addr == cp->caddr.v4 && d_port == cp->cport &&
+ s_port == cp->dport && s_addr == cp->daddr.v4 &&
protocol == cp->protocol) {
/* HIT */
atomic_inc(&cp->refcnt);
@@ -642,12 +645,13 @@ ip_vs_conn_new(int proto, __be32 caddr, __be16 cport, __be32 vaddr, __be16 vport
INIT_LIST_HEAD(&cp->c_list);
setup_timer(&cp->timer, ip_vs_conn_expire, (unsigned long)cp);
+ cp->af = AF_INET;
cp->protocol = proto;
- cp->caddr = caddr;
+ cp->caddr.v4 = caddr;
cp->cport = cport;
- cp->vaddr = vaddr;
+ cp->vaddr.v4 = vaddr;
cp->vport = vport;
- cp->daddr = daddr;
+ cp->daddr.v4 = daddr;
cp->dport = dport;
cp->flags = flags;
spin_lock_init(&cp->lock);
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 963981a..9a3d0df 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -234,14 +234,14 @@ ip_vs_sched_persist(struct ip_vs_service *svc,
snet, 0,
iph->daddr,
ports[1],
- dest->addr, dest->port,
+ dest->addr.v4, dest->port,
IP_VS_CONN_F_TEMPLATE,
dest);
else
ct = ip_vs_conn_new(iph->protocol,
snet, 0,
iph->daddr, 0,
- dest->addr, 0,
+ dest->addr.v4, 0,
IP_VS_CONN_F_TEMPLATE,
dest);
if (ct == NULL)
@@ -288,14 +288,14 @@ ip_vs_sched_persist(struct ip_vs_service *svc,
ct = ip_vs_conn_new(IPPROTO_IP,
snet, 0,
htonl(svc->fwmark), 0,
- dest->addr, 0,
+ dest->addr.v4, 0,
IP_VS_CONN_F_TEMPLATE,
dest);
else
ct = ip_vs_conn_new(iph->protocol,
snet, 0,
iph->daddr, 0,
- dest->addr, 0,
+ dest->addr.v4, 0,
IP_VS_CONN_F_TEMPLATE,
dest);
if (ct == NULL)
@@ -315,7 +315,7 @@ ip_vs_sched_persist(struct ip_vs_service *svc,
cp = ip_vs_conn_new(iph->protocol,
iph->saddr, ports[0],
iph->daddr, ports[1],
- dest->addr, dport,
+ dest->addr.v4, dport,
0,
dest);
if (cp == NULL) {
@@ -382,7 +382,7 @@ ip_vs_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
cp = ip_vs_conn_new(iph->protocol,
iph->saddr, pptr[0],
iph->daddr, pptr[1],
- dest->addr, dest->port?dest->port:pptr[1],
+ dest->addr.v4, dest->port?dest->port:pptr[1],
0,
dest);
if (cp == NULL)
@@ -391,9 +391,9 @@ ip_vs_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
IP_VS_DBG(6, "Schedule fwd:%c c:%u.%u.%u.%u:%u v:%u.%u.%u.%u:%u "
"d:%u.%u.%u.%u:%u conn->flags:%X conn->refcnt:%d\n",
ip_vs_fwd_tag(cp),
- NIPQUAD(cp->caddr), ntohs(cp->cport),
- NIPQUAD(cp->vaddr), ntohs(cp->vport),
- NIPQUAD(cp->daddr), ntohs(cp->dport),
+ NIPQUAD(cp->caddr.v4), ntohs(cp->cport),
+ NIPQUAD(cp->vaddr.v4), ntohs(cp->vport),
+ NIPQUAD(cp->daddr.v4), ntohs(cp->dport),
cp->flags, atomic_read(&cp->refcnt));
ip_vs_conn_stats(cp, svc);
@@ -528,14 +528,14 @@ void ip_vs_nat_icmp(struct sk_buff *skb, struct ip_vs_protocol *pp,
struct iphdr *ciph = (struct iphdr *)(icmph + 1);
if (inout) {
- iph->saddr = cp->vaddr;
+ iph->saddr = cp->vaddr.v4;
ip_send_check(iph);
- ciph->daddr = cp->vaddr;
+ ciph->daddr = cp->vaddr.v4;
ip_send_check(ciph);
} else {
- iph->daddr = cp->daddr;
+ iph->daddr = cp->daddr.v4;
ip_send_check(iph);
- ciph->saddr = cp->daddr;
+ ciph->saddr = cp->daddr.v4;
ip_send_check(ciph);
}
@@ -764,7 +764,7 @@ ip_vs_out(unsigned int hooknum, struct sk_buff *skb,
/* mangle the packet */
if (pp->snat_handler && !pp->snat_handler(skb, pp, cp))
goto drop;
- ip_hdr(skb)->saddr = cp->vaddr;
+ ip_hdr(skb)->saddr = cp->vaddr.v4;
ip_send_check(ip_hdr(skb));
/* For policy routing, packets originating from this
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 94c5767..dd84deb 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -374,7 +374,8 @@ __ip_vs_service_get(__u16 protocol, __be32 vaddr, __be16 vport)
hash = ip_vs_svc_hashkey(protocol, vaddr, vport);
list_for_each_entry(svc, &ip_vs_svc_table[hash], s_list){
- if ((svc->addr == vaddr)
+ if ((svc->af == AF_INET)
+ && (svc->addr.v4 == vaddr)
&& (svc->port == vport)
&& (svc->protocol == protocol)) {
/* HIT */
@@ -544,7 +545,8 @@ ip_vs_lookup_real_service(__u16 protocol, __be32 daddr, __be16 dport)
read_lock(&__ip_vs_rs_lock);
list_for_each_entry(dest, &ip_vs_rtable[hash], d_list) {
- if ((dest->addr == daddr)
+ if ((dest->af == AF_INET)
+ && (dest->addr.v4 == daddr)
&& (dest->port == dport)
&& ((dest->protocol == protocol) ||
dest->vfwmark)) {
@@ -570,7 +572,8 @@ ip_vs_lookup_dest(struct ip_vs_service *svc, __be32 daddr, __be16 dport)
* Find the destination for the given service
*/
list_for_each_entry(dest, &svc->destinations, n_list) {
- if ((dest->addr == daddr) && (dest->port == dport)) {
+ if ((dest->af == AF_INET)
+ && (dest->addr.v4 == daddr) && (dest->port == dport)) {
/* HIT */
return dest;
}
@@ -627,14 +630,15 @@ ip_vs_trash_get_dest(struct ip_vs_service *svc, __be32 daddr, __be16 dport)
IP_VS_DBG(3, "Destination %u/%u.%u.%u.%u:%u still in trash, "
"dest->refcnt=%d\n",
dest->vfwmark,
- NIPQUAD(dest->addr), ntohs(dest->port),
+ NIPQUAD(dest->addr.v4), ntohs(dest->port),
atomic_read(&dest->refcnt));
- if (dest->addr == daddr &&
+ if (dest->af == AF_INET &&
+ dest->addr.v4 == daddr &&
dest->port == dport &&
dest->vfwmark == svc->fwmark &&
dest->protocol == svc->protocol &&
(svc->fwmark ||
- (dest->vaddr == svc->addr &&
+ (dest->vaddr.v4 == svc->addr.v4 &&
dest->vport == svc->port))) {
/* HIT */
return dest;
@@ -647,7 +651,7 @@ ip_vs_trash_get_dest(struct ip_vs_service *svc, __be32 daddr, __be16 dport)
IP_VS_DBG(3, "Removing destination %u/%u.%u.%u.%u:%u "
"from trash\n",
dest->vfwmark,
- NIPQUAD(dest->addr), ntohs(dest->port));
+ NIPQUAD(dest->addr.v4), ntohs(dest->port));
list_del(&dest->n_list);
ip_vs_dst_reset(dest);
__ip_vs_unbind_svc(dest);
@@ -766,11 +770,12 @@ ip_vs_new_dest(struct ip_vs_service *svc, struct ip_vs_dest_user *udest,
return -ENOMEM;
}
+ dest->af = svc->af;
dest->protocol = svc->protocol;
dest->vaddr = svc->addr;
dest->vport = svc->port;
dest->vfwmark = svc->fwmark;
- dest->addr = udest->addr;
+ ip_vs_copy_addr(dest->addr, udest->addr);
dest->port = udest->port;
atomic_set(&dest->activeconns, 0);
@@ -1085,8 +1090,9 @@ ip_vs_add_service(struct ip_vs_service_user *u, struct ip_vs_service **svc_p)
atomic_set(&svc->usecnt, 1);
atomic_set(&svc->refcnt, 0);
+ svc->af = u->af;
svc->protocol = u->protocol;
- svc->addr = u->addr;
+ ip_vs_copy_addr(svc->addr, u->addr);
svc->port = u->port;
svc->fwmark = u->fwmark;
svc->flags = u->flags;
@@ -2022,8 +2028,9 @@ ip_vs_copy_stats(struct ip_vs_stats_user *dst, struct ip_vs_stats *src)
static void
ip_vs_copy_service(struct ip_vs_service_entry *dst, struct ip_vs_service *src)
{
+ dst->af = src->af;
dst->protocol = src->protocol;
- dst->addr = src->addr;
+ ip_vs_copy_addr(dst->addr, src->addr);
dst->port = src->port;
dst->fwmark = src->fwmark;
strlcpy(dst->sched_name, src->scheduler->name, sizeof(dst->sched_name));
@@ -2097,7 +2104,8 @@ __ip_vs_get_dest_entries(const struct ip_vs_get_dests *get,
if (count >= get->num_dests)
break;
- entry.addr = dest->addr;
+ entry.af = dest->af;
+ ip_vs_copy_addr(entry.addr, dest->addr);
entry.port = dest->port;
entry.conn_flags = atomic_read(&dest->conn_flags);
entry.weight = atomic_read(&dest->weight);
diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index 59aa166..6542fa9 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -174,17 +174,17 @@ static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
IP_VS_DBG(7, "PASV response (%u.%u.%u.%u:%d) -> "
"%u.%u.%u.%u:%d detected\n",
- NIPQUAD(from), ntohs(port), NIPQUAD(cp->caddr), 0);
+ NIPQUAD(from), ntohs(port), NIPQUAD(cp->caddr.v4), 0);
/*
* Now update or create an connection entry for it
*/
n_cp = ip_vs_conn_out_get(iph->protocol, from, port,
- cp->caddr, 0);
+ cp->caddr.v4, 0);
if (!n_cp) {
n_cp = ip_vs_conn_new(IPPROTO_TCP,
- cp->caddr, 0,
- cp->vaddr, port,
+ cp->caddr.v4, 0,
+ cp->vaddr.v4, port,
from, port,
IP_VS_CONN_F_NO_CPORT,
cp->dest);
@@ -198,7 +198,7 @@ static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
/*
* Replace the old passive address with the new one
*/
- from = n_cp->vaddr;
+ from = n_cp->vaddr.v4;
port = n_cp->vport;
sprintf(buf,"%d,%d,%d,%d,%d,%d", NIPQUAD(from),
(ntohs(port)>>8)&255, ntohs(port)&255);
@@ -308,16 +308,16 @@ static int ip_vs_ftp_in(struct ip_vs_app *app, struct ip_vs_conn *cp,
*/
IP_VS_DBG(7, "protocol %s %u.%u.%u.%u:%d %u.%u.%u.%u:%d\n",
ip_vs_proto_name(iph->protocol),
- NIPQUAD(to), ntohs(port), NIPQUAD(cp->vaddr), 0);
+ NIPQUAD(to), ntohs(port), NIPQUAD(cp->vaddr.v4), 0);
n_cp = ip_vs_conn_in_get(iph->protocol,
to, port,
- cp->vaddr, htons(ntohs(cp->vport)-1));
+ cp->vaddr.v4, htons(ntohs(cp->vport)-1));
if (!n_cp) {
n_cp = ip_vs_conn_new(IPPROTO_TCP,
to, port,
- cp->vaddr, htons(ntohs(cp->vport)-1),
- cp->daddr, htons(ntohs(cp->dport)-1),
+ cp->vaddr.v4, htons(ntohs(cp->vport)-1),
+ cp->daddr.v4, htons(ntohs(cp->dport)-1),
0,
cp->dest);
if (!n_cp)
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index b83dc14..6068c47 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -149,7 +149,7 @@ tcp_snat_handler(struct sk_buff *skb,
/* Adjust TCP checksums */
if (!cp->app) {
/* Only port and addr are changed, do fast csum update */
- tcp_fast_csum_update(tcph, cp->daddr, cp->vaddr,
+ tcp_fast_csum_update(tcph, cp->daddr.v4, cp->vaddr.v4,
cp->dport, cp->vport);
if (skb->ip_summed == CHECKSUM_COMPLETE)
skb->ip_summed = CHECKSUM_NONE;
@@ -157,7 +157,7 @@ tcp_snat_handler(struct sk_buff *skb,
/* full checksum calculation */
tcph->check = 0;
skb->csum = skb_checksum(skb, tcphoff, skb->len - tcphoff, 0);
- tcph->check = csum_tcpudp_magic(cp->vaddr, cp->caddr,
+ tcph->check = csum_tcpudp_magic(cp->vaddr.v4, cp->caddr.v4,
skb->len - tcphoff,
cp->protocol, skb->csum);
IP_VS_DBG(11, "O-pkt: %s O-csum=%d (+%zd)\n",
@@ -200,7 +200,7 @@ tcp_dnat_handler(struct sk_buff *skb,
*/
if (!cp->app) {
/* Only port and addr are changed, do fast csum update */
- tcp_fast_csum_update(tcph, cp->vaddr, cp->daddr,
+ tcp_fast_csum_update(tcph, cp->vaddr.v4, cp->daddr.v4,
cp->vport, cp->dport);
if (skb->ip_summed == CHECKSUM_COMPLETE)
skb->ip_summed = CHECKSUM_NONE;
@@ -208,7 +208,7 @@ tcp_dnat_handler(struct sk_buff *skb,
/* full checksum calculation */
tcph->check = 0;
skb->csum = skb_checksum(skb, tcphoff, skb->len - tcphoff, 0);
- tcph->check = csum_tcpudp_magic(cp->caddr, cp->daddr,
+ tcph->check = csum_tcpudp_magic(cp->caddr.v4, cp->daddr.v4,
skb->len - tcphoff,
cp->protocol, skb->csum);
skb->ip_summed = CHECKSUM_UNNECESSARY;
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index 75771cb..0bcc17a 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -160,7 +160,7 @@ udp_snat_handler(struct sk_buff *skb,
*/
if (!cp->app && (udph->check != 0)) {
/* Only port and addr are changed, do fast csum update */
- udp_fast_csum_update(udph, cp->daddr, cp->vaddr,
+ udp_fast_csum_update(udph, cp->daddr.v4, cp->vaddr.v4,
cp->dport, cp->vport);
if (skb->ip_summed == CHECKSUM_COMPLETE)
skb->ip_summed = CHECKSUM_NONE;
@@ -168,7 +168,7 @@ udp_snat_handler(struct sk_buff *skb,
/* full checksum calculation */
udph->check = 0;
skb->csum = skb_checksum(skb, udphoff, skb->len - udphoff, 0);
- udph->check = csum_tcpudp_magic(cp->vaddr, cp->caddr,
+ udph->check = csum_tcpudp_magic(cp->vaddr.v4, cp->caddr.v4,
skb->len - udphoff,
cp->protocol, skb->csum);
if (udph->check == 0)
@@ -213,7 +213,7 @@ udp_dnat_handler(struct sk_buff *skb,
*/
if (!cp->app && (udph->check != 0)) {
/* Only port and addr are changed, do fast csum update */
- udp_fast_csum_update(udph, cp->vaddr, cp->daddr,
+ udp_fast_csum_update(udph, cp->vaddr.v4, cp->daddr.v4,
cp->vport, cp->dport);
if (skb->ip_summed == CHECKSUM_COMPLETE)
skb->ip_summed = CHECKSUM_NONE;
@@ -221,7 +221,7 @@ udp_dnat_handler(struct sk_buff *skb,
/* full checksum calculation */
udph->check = 0;
skb->csum = skb_checksum(skb, udphoff, skb->len - udphoff, 0);
- udph->check = csum_tcpudp_magic(cp->caddr, cp->daddr,
+ udph->check = csum_tcpudp_magic(cp->caddr.v4, cp->daddr.v4,
skb->len - udphoff,
cp->protocol, skb->csum);
if (udph->check == 0)
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index eff54ef..bdd5cf0 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -245,9 +245,9 @@ void ip_vs_sync_conn(struct ip_vs_conn *cp)
s->cport = cp->cport;
s->vport = cp->vport;
s->dport = cp->dport;
- s->caddr = cp->caddr;
- s->vaddr = cp->vaddr;
- s->daddr = cp->daddr;
+ s->caddr = cp->caddr.v4;
+ s->vaddr = cp->vaddr.v4;
+ s->daddr = cp->daddr.v4;
s->flags = htons(cp->flags & ~IP_VS_CONN_F_HASHED);
s->state = htons(cp->state);
if (cp->flags & IP_VS_CONN_F_SEQ_MASK) {
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index f63006c..6b6ce6b 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -73,7 +73,7 @@ __ip_vs_get_out_rt(struct ip_vs_conn *cp, u32 rtos)
.oif = 0,
.nl_u = {
.ip4_u = {
- .daddr = dest->addr,
+ .daddr = dest->addr.v4,
.saddr = 0,
.tos = rtos, } },
};
@@ -82,12 +82,12 @@ __ip_vs_get_out_rt(struct ip_vs_conn *cp, u32 rtos)
spin_unlock(&dest->dst_lock);
IP_VS_DBG_RL("ip_route_output error, "
"dest: %u.%u.%u.%u\n",
- NIPQUAD(dest->addr));
+ NIPQUAD(dest->addr.v4));
return NULL;
}
__ip_vs_dst_set(dest, rtos, dst_clone(&rt->u.dst));
IP_VS_DBG(10, "new dst %u.%u.%u.%u, refcnt=%d, rtos=%X\n",
- NIPQUAD(dest->addr),
+ NIPQUAD(dest->addr.v4),
atomic_read(&rt->u.dst.__refcnt), rtos);
}
spin_unlock(&dest->dst_lock);
@@ -96,14 +96,14 @@ __ip_vs_get_out_rt(struct ip_vs_conn *cp, u32 rtos)
.oif = 0,
.nl_u = {
.ip4_u = {
- .daddr = cp->daddr,
+ .daddr = cp->daddr.v4,
.saddr = 0,
.tos = rtos, } },
};
if (ip_route_output_key(&init_net, &rt, &fl)) {
IP_VS_DBG_RL("ip_route_output error, dest: "
- "%u.%u.%u.%u\n", NIPQUAD(cp->daddr));
+ "%u.%u.%u.%u\n", NIPQUAD(cp->daddr.v4));
return NULL;
}
}
@@ -266,7 +266,7 @@ ip_vs_nat_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
/* mangle the packet */
if (pp->dnat_handler && !pp->dnat_handler(skb, pp, cp))
goto tx_error;
- ip_hdr(skb)->daddr = cp->daddr;
+ ip_hdr(skb)->daddr = cp->daddr.v4;
ip_send_check(ip_hdr(skb));
IP_VS_DBG_PKT(10, pp, skb, 0, "After DNAT");
--
1.5.3.6
* [PATCH 04/26] IPVS: Add address family specific debugging macros.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (2 preceding siblings ...)
2008-06-11 17:11 ` [PATCH 03/26] IPVS: Use new address family fields in IPVS structs Julius R. Volz
@ 2008-06-11 17:11 ` Julius R. Volz
2008-06-11 17:11 ` [PATCH 05/26] IPVS: Use new " Julius R. Volz
` (22 subsequent siblings)
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Define debugging/error logging macros that only do something for a
specific address family. This avoids ugly conditionals in code that uses
these debugging macros.
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 34 insertions(+), 0 deletions(-)
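The idea behind the family-specific macros can be tried out in
userspace. This is a rough sketch, assuming fprintf() in place of
printk() and a plain variable in place of ip_vs_get_debug_level(); the
`VS_DBG_*` macros, `vs_debug_level`, `vs_msgs_emitted` and `log_conn`
are hypothetical names.

```c
#include <stdio.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */

static int vs_debug_level = 7;  /* stand-in for ip_vs_get_debug_level() */
static int vs_msgs_emitted;     /* counts emitted messages, for the demo */

/* Emit the message only if 'af' matches 'want' and the level is enabled. */
#define VS_DBG_AF(want, af, level, ...)                         \
    do {                                                        \
        if ((af) == (want) && (level) <= vs_debug_level) {      \
            fprintf(stderr, "IPVS: " __VA_ARGS__);              \
            vs_msgs_emitted++;                                  \
        }                                                       \
    } while (0)

#define VS_DBG_V4(af, level, ...) VS_DBG_AF(AF_INET, af, level, __VA_ARGS__)
#define VS_DBG_V6(af, level, ...) VS_DBG_AF(AF_INET6, af, level, __VA_ARGS__)

/* The caller lists both variants; at most one of them prints. */
static void log_conn(int af, unsigned port)
{
    VS_DBG_V4(af, 6, "v4 connection, port %u\n", port);
    VS_DBG_V6(af, 6, "v6 connection, port %u\n", port);
}
```

Calling `log_conn(AF_INET, 80)` prints only the v4 line, so the call
site needs no `if (af == ...)` of its own. Unlike the macros in the
patch, this sketch parenthesizes its arguments.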
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index b7b181e..349a746 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -287,6 +287,24 @@ struct ip_vs_daemon_user {
#include <linux/net.h>
extern int ip_vs_get_debug_level(void);
+#define IP_VS_DBG_V4(af, level, msg...) \
+ do { \
+ if (af == AF_INET \
+ && level <= ip_vs_get_debug_level()) \
+ printk(KERN_DEBUG "IPVS: " msg); \
+ } while (0)
+
+#ifdef CONFIG_IP_VS_IPV6
+#define IP_VS_DBG_V6(af, level, msg...) \
+ do { \
+ if (af == AF_INET6 \
+ && level <= ip_vs_get_debug_level()) \
+ printk(KERN_DEBUG "IPVS: " msg); \
+ } while (0)
+#else
+#define IP_VS_DBG_V6(af, level, msg...) do {} while (0)
+#endif
+
#define IP_VS_DBG(level, msg...) \
do { \
if (level <= ip_vs_get_debug_level()) \
@@ -309,6 +327,8 @@ extern int ip_vs_get_debug_level(void);
pp->debug_packet(pp, skb, ofs, msg); \
} while (0)
#else /* NO DEBUGGING at ALL */
+#define IP_VS_DBG_V4(af, level, msg...) do {} while (0)
+#define IP_VS_DBG_V6(af, level, msg...) do {} while (0)
#define IP_VS_DBG(level, msg...) do {} while (0)
#define IP_VS_DBG_RL(msg...) do {} while (0)
#define IP_VS_DBG_PKT(level, pp, skb, ofs, msg) do {} while (0)
@@ -316,6 +336,20 @@ extern int ip_vs_get_debug_level(void);
#endif
#define IP_VS_BUG() BUG()
+#define IP_VS_ERR_V4(af, msg...) \
+ do { \
+ if (af == AF_INET) \
+ printk(KERN_ERR "IPVS: " msg); \
+ } while (0)
+#ifdef CONFIG_IP_VS_IPV6
+#define IP_VS_ERR_V6(af, msg...) \
+ do { \
+ if (af == AF_INET6) \
+ printk(KERN_ERR "IPVS: " msg); \
+ } while (0)
+#else
+#define IP_VS_ERR_V6(af, msg...) do {} while (0)
+#endif
#define IP_VS_ERR(msg...) printk(KERN_ERR "IPVS: " msg)
#define IP_VS_INFO(msg...) printk(KERN_INFO "IPVS: " msg)
#define IP_VS_WARNING(msg...) \
--
1.5.3.6
* [PATCH 05/26] IPVS: Use new address family specific debugging macros.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (3 preceding siblings ...)
2008-06-11 17:11 ` [PATCH 04/26] IPVS: Add address family specific debugging macros Julius R. Volz
@ 2008-06-11 17:11 ` Julius R. Volz
2008-06-11 17:14 ` Patrick McHardy
2008-06-11 17:11 ` [PATCH 06/26] IPVS: Add IPv6-specific function pointers to struct ip_vs_protocol Julius R. Volz
` (21 subsequent siblings)
26 siblings, 1 reply; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Change debug output to use address family specific debugging macros where
appropriate.
Signed-off-by: Julius R. Volz <juliusv@google.com>
10 files changed, 257 insertions(+), 112 deletions(-)
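In this patch each message is written twice, once with the IPv4 format
and once with the IPv6 format, and the macros pick the one that matches
cp->af. For comparison only, here is a userspace sketch of a different
technique: a single helper that chooses the format from the family,
using the standard inet_ntop() since NIPQUAD and NIP6_FMT are
kernel-only. The names `vs_addr` and `vs_fmt_endpoint` are hypothetical
and nothing like this helper is part of the series.

```c
#include <stdint.h>
#include <stdio.h>
#include <arpa/inet.h>    /* inet_ntop, ntohs */
#include <netinet/in.h>   /* struct in_addr, struct in6_addr */
#include <sys/socket.h>   /* AF_INET, AF_INET6 */

union vs_addr {
    struct in_addr v4;
    struct in6_addr v6;
};

/*
 * Format "addr:port" for IPv4 or "[addr]:port" for IPv6 into buf.
 * 'port_be' is in network byte order. Returns buf, or NULL on error.
 */
static const char *vs_fmt_endpoint(int af, const union vs_addr *addr,
                                   uint16_t port_be, char *buf, size_t len)
{
    char ip[INET6_ADDRSTRLEN];

    if (af != AF_INET && af != AF_INET6)
        return NULL;
    if (!inet_ntop(af, addr, ip, sizeof(ip)))
        return NULL;
    if (af == AF_INET6)
        snprintf(buf, len, "[%s]:%u", ip, (unsigned)ntohs(port_be));
    else
        snprintf(buf, len, "%s:%u", ip, (unsigned)ntohs(port_be));
    return buf;
}
```

The paired-macro approach keeps the IPv4 call sites textually close to
the old code, while a helper like this trades that for a single call
per message.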
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 349a746..5c2d48d 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -806,24 +806,41 @@ static inline void ip_vs_control_del(struct ip_vs_conn *cp)
{
struct ip_vs_conn *ctl_cp = cp->control;
if (!ctl_cp) {
- IP_VS_ERR("request control DEL for uncontrolled: "
- "%d.%d.%d.%d:%d to %d.%d.%d.%d:%d\n",
- NIPQUAD(cp->caddr),ntohs(cp->cport),
- NIPQUAD(cp->vaddr),ntohs(cp->vport));
+ IP_VS_ERR_V4(cp->af, "request control DEL for uncontrolled: "
+ "%d.%d.%d.%d:%d to %d.%d.%d.%d:%d\n",
+ NIPQUAD(cp->caddr.v4),ntohs(cp->cport),
+ NIPQUAD(cp->vaddr.v4),ntohs(cp->vport));
+
+ IP_VS_ERR_V6(cp->af, "request control DEL for uncontrolled: "
+ NIP6_FMT ":%d to " NIP6_FMT ":%d\n",
+ NIP6(cp->caddr.v6),ntohs(cp->cport),
+ NIP6(cp->vaddr.v6),ntohs(cp->vport));
+
return;
}
- IP_VS_DBG(7, "DELeting control for: "
- "cp.dst=%d.%d.%d.%d:%d ctl_cp.dst=%d.%d.%d.%d:%d\n",
- NIPQUAD(cp->caddr),ntohs(cp->cport),
- NIPQUAD(ctl_cp->caddr),ntohs(ctl_cp->cport));
+ IP_VS_DBG_V4(cp->af, 7, "DELeting control for: "
+ "cp.dst=%d.%d.%d.%d:%d ctl_cp.dst=%d.%d.%d.%d:%d\n",
+ NIPQUAD(cp->caddr.v4),ntohs(cp->cport),
+ NIPQUAD(ctl_cp->caddr.v4),ntohs(ctl_cp->cport));
+
+ IP_VS_DBG_V6(cp->af, 7, "DELeting control for: "
+ "cp.dst=" NIP6_FMT ":%d ctl_cp.dst=" NIP6_FMT ":%d\n",
+ NIP6(cp->caddr.v6),ntohs(cp->cport),
+ NIP6(ctl_cp->caddr.v6),ntohs(ctl_cp->cport));
cp->control = NULL;
if (atomic_read(&ctl_cp->n_control) == 0) {
- IP_VS_ERR("BUG control DEL with n=0 : "
- "%d.%d.%d.%d:%d to %d.%d.%d.%d:%d\n",
- NIPQUAD(cp->caddr),ntohs(cp->cport),
- NIPQUAD(cp->vaddr),ntohs(cp->vport));
+ IP_VS_ERR_V4(cp->af, "BUG control DEL with n=0 : "
+ "%d.%d.%d.%d:%d to %d.%d.%d.%d:%d\n",
+ NIPQUAD(cp->caddr.v4),ntohs(cp->cport),
+ NIPQUAD(cp->vaddr.v4),ntohs(cp->vport));
+
+ IP_VS_ERR_V6(cp->af, "BUG control DEL with n=0 : "
+ NIP6_FMT ":%d to " NIP6_FMT ":%d\n",
+ NIP6(cp->caddr.v6),ntohs(cp->cport),
+ NIP6(cp->vaddr.v6),ntohs(cp->vport));
+
return;
}
atomic_dec(&ctl_cp->n_control);
@@ -833,17 +850,28 @@ static inline void
ip_vs_control_add(struct ip_vs_conn *cp, struct ip_vs_conn *ctl_cp)
{
if (cp->control) {
- IP_VS_ERR("request control ADD for already controlled: "
- "%d.%d.%d.%d:%d to %d.%d.%d.%d:%d\n",
- NIPQUAD(cp->caddr),ntohs(cp->cport),
- NIPQUAD(cp->vaddr),ntohs(cp->vport));
+ IP_VS_ERR_V4(cp->af, "request control ADD for already controlled: "
+ "%d.%d.%d.%d:%d to %d.%d.%d.%d:%d\n",
+ NIPQUAD(cp->caddr.v4),ntohs(cp->cport),
+ NIPQUAD(cp->vaddr.v4),ntohs(cp->vport));
+
+ IP_VS_ERR_V6(cp->af, "request control ADD for already controlled: "
+ NIP6_FMT ":%d to " NIP6_FMT ":%d\n",
+ NIP6(cp->caddr.v6),ntohs(cp->cport),
+ NIP6(cp->vaddr.v6),ntohs(cp->vport));
+
ip_vs_control_del(cp);
}
- IP_VS_DBG(7, "ADDing control for: "
- "cp.dst=%d.%d.%d.%d:%d ctl_cp.dst=%d.%d.%d.%d:%d\n",
- NIPQUAD(cp->caddr),ntohs(cp->cport),
- NIPQUAD(ctl_cp->caddr),ntohs(ctl_cp->cport));
+ IP_VS_DBG_V4(cp->af, 7, "ADDing control for: "
+ "cp.dst=%d.%d.%d.%d:%d ctl_cp.dst=%d.%d.%d.%d:%d\n",
+ NIPQUAD(cp->caddr.v4),ntohs(cp->cport),
+ NIPQUAD(ctl_cp->caddr.v4),ntohs(ctl_cp->cport));
+
+ IP_VS_DBG_V6(cp->af, 7, "ADDing control for: "
+ "cp.dst=" NIP6_FMT ":%d ctl_cp.dst=" NIP6_FMT ":%d\n",
+ NIP6(cp->caddr.v6),ntohs(cp->cport),
+ NIP6(ctl_cp->caddr.v6),ntohs(ctl_cp->cport));
cp->control = ctl_cp;
atomic_inc(&ctl_cp->n_control);
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 1d81cbc..b3df938 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -407,16 +407,27 @@ ip_vs_bind_dest(struct ip_vs_conn *cp, struct ip_vs_dest *dest)
cp->flags |= atomic_read(&dest->conn_flags);
cp->dest = dest;
- IP_VS_DBG(7, "Bind-dest %s c:%u.%u.%u.%u:%d v:%u.%u.%u.%u:%d "
- "d:%u.%u.%u.%u:%d fwd:%c s:%u conn->flags:%X conn->refcnt:%d "
- "dest->refcnt:%d\n",
- ip_vs_proto_name(cp->protocol),
- NIPQUAD(cp->caddr), ntohs(cp->cport),
- NIPQUAD(cp->vaddr), ntohs(cp->vport),
- NIPQUAD(cp->daddr), ntohs(cp->dport),
- ip_vs_fwd_tag(cp), cp->state,
- cp->flags, atomic_read(&cp->refcnt),
- atomic_read(&dest->refcnt));
+ IP_VS_DBG_V4(cp->af, 7, "Bind-dest %s c:%u.%u.%u.%u:%d v:%u.%u.%u.%u:%d "
+ "d:%u.%u.%u.%u:%d fwd:%c s:%u conn->flags:%X conn->refcnt:%d "
+ "dest->refcnt:%d\n",
+ ip_vs_proto_name(cp->protocol),
+ NIPQUAD(cp->caddr.v4), ntohs(cp->cport),
+ NIPQUAD(cp->vaddr.v4), ntohs(cp->vport),
+ NIPQUAD(cp->daddr.v4), ntohs(cp->dport),
+ ip_vs_fwd_tag(cp), cp->state,
+ cp->flags, atomic_read(&cp->refcnt),
+ atomic_read(&dest->refcnt));
+
+ IP_VS_DBG_V6(cp->af, 7, "Bind-dest %s c:" NIP6_FMT ":%d v:" NIP6_FMT ":%d "
+ "d:" NIP6_FMT ":%d fwd:%c s:%u conn->flags:%X conn->refcnt:%d "
+ "dest->refcnt:%d\n",
+ ip_vs_proto_name(cp->protocol),
+ NIP6(cp->caddr.v6), ntohs(cp->cport),
+ NIP6(cp->vaddr.v6), ntohs(cp->vport),
+ NIP6(cp->daddr.v6), ntohs(cp->dport),
+ ip_vs_fwd_tag(cp), cp->state,
+ cp->flags, atomic_read(&cp->refcnt),
+ atomic_read(&dest->refcnt));
/* Update the connection counters */
if (!(cp->flags & IP_VS_CONN_F_TEMPLATE)) {
@@ -469,16 +480,27 @@ static inline void ip_vs_unbind_dest(struct ip_vs_conn *cp)
if (!dest)
return;
- IP_VS_DBG(7, "Unbind-dest %s c:%u.%u.%u.%u:%d v:%u.%u.%u.%u:%d "
- "d:%u.%u.%u.%u:%d fwd:%c s:%u conn->flags:%X conn->refcnt:%d "
- "dest->refcnt:%d\n",
- ip_vs_proto_name(cp->protocol),
- NIPQUAD(cp->caddr), ntohs(cp->cport),
- NIPQUAD(cp->vaddr), ntohs(cp->vport),
- NIPQUAD(cp->daddr), ntohs(cp->dport),
- ip_vs_fwd_tag(cp), cp->state,
- cp->flags, atomic_read(&cp->refcnt),
- atomic_read(&dest->refcnt));
+ IP_VS_DBG_V4(cp->af, 7, "Unbind-dest %s c:%u.%u.%u.%u:%d v:%u.%u.%u.%u:%d "
+ "d:%u.%u.%u.%u:%d fwd:%c s:%u conn->flags:%X conn->refcnt:%d "
+ "dest->refcnt:%d\n",
+ ip_vs_proto_name(cp->protocol),
+ NIPQUAD(cp->caddr.v4), ntohs(cp->cport),
+ NIPQUAD(cp->vaddr.v4), ntohs(cp->vport),
+ NIPQUAD(cp->daddr.v4), ntohs(cp->dport),
+ ip_vs_fwd_tag(cp), cp->state,
+ cp->flags, atomic_read(&cp->refcnt),
+ atomic_read(&dest->refcnt));
+
+ IP_VS_DBG_V6(cp->af, 7, "Unbind-dest %s c:" NIP6_FMT ":%d v:" NIP6_FMT ":%d "
+ "d:" NIP6_FMT ":%d fwd:%c s:%u conn->flags:%X conn->refcnt:%d "
+ "dest->refcnt:%d\n",
+ ip_vs_proto_name(cp->protocol),
+ NIP6(cp->caddr.v6), ntohs(cp->cport),
+ NIP6(cp->vaddr.v6), ntohs(cp->vport),
+ NIP6(cp->daddr.v6), ntohs(cp->dport),
+ ip_vs_fwd_tag(cp), cp->state,
+ cp->flags, atomic_read(&cp->refcnt),
+ atomic_read(&dest->refcnt));
/* Update the connection counters */
if (!(cp->flags & IP_VS_CONN_F_TEMPLATE)) {
@@ -530,14 +552,22 @@ int ip_vs_check_template(struct ip_vs_conn *ct)
if ((dest == NULL) ||
!(dest->flags & IP_VS_DEST_F_AVAILABLE) ||
(sysctl_ip_vs_expire_quiescent_template &&
- (atomic_read(&dest->weight) == 0))) {
- IP_VS_DBG(9, "check_template: dest not available for "
- "protocol %s s:%u.%u.%u.%u:%d v:%u.%u.%u.%u:%d "
- "-> d:%u.%u.%u.%u:%d\n",
- ip_vs_proto_name(ct->protocol),
- NIPQUAD(ct->caddr), ntohs(ct->cport),
- NIPQUAD(ct->vaddr), ntohs(ct->vport),
- NIPQUAD(ct->daddr), ntohs(ct->dport));
+ (atomic_read(&dest->weight) == 0))) {
+ IP_VS_DBG_V4(ct->af, 9, "check_template: dest not available for "
+ "protocol %s s:%u.%u.%u.%u:%d v:%u.%u.%u.%u:%d "
+ "-> d:%u.%u.%u.%u:%d\n",
+ ip_vs_proto_name(ct->protocol),
+ NIPQUAD(ct->caddr.v4), ntohs(ct->cport),
+ NIPQUAD(ct->vaddr.v4), ntohs(ct->vport),
+ NIPQUAD(ct->daddr.v4), ntohs(ct->dport));
+
+ IP_VS_DBG_V6(ct->af, 9, "check_template: dest not available for "
+ "protocol %s s:" NIP6_FMT ":%d v:" NIP6_FMT ":%d "
+ "-> d:" NIP6_FMT ":%d\n",
+ ip_vs_proto_name(ct->protocol),
+ NIP6(ct->caddr.v6), ntohs(ct->cport),
+ NIP6(ct->vaddr.v6), ntohs(ct->vport),
+ NIP6(ct->daddr.v6), ntohs(ct->dport));
/*
* Invalidate the connection template
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index dd84deb..da2e431 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -835,13 +835,22 @@ ip_vs_add_dest(struct ip_vs_service *svc, struct ip_vs_dest_user *udest)
*/
dest = ip_vs_trash_get_dest(svc, daddr, dport);
if (dest != NULL) {
- IP_VS_DBG(3, "Get destination %u.%u.%u.%u:%u from trash, "
- "dest->refcnt=%d, service %u/%u.%u.%u.%u:%u\n",
- NIPQUAD(daddr), ntohs(dport),
- atomic_read(&dest->refcnt),
- dest->vfwmark,
- NIPQUAD(dest->vaddr),
- ntohs(dest->vport));
+ IP_VS_DBG_V4(svc->af, 3, "Get destination %u.%u.%u.%u:%u from trash, "
+ "dest->refcnt=%d, service %u/%u.%u.%u.%u:%u\n",
+ NIPQUAD(daddr.v4), ntohs(dport),
+ atomic_read(&dest->refcnt),
+ dest->vfwmark,
+ NIPQUAD(dest->vaddr.v4),
+ ntohs(dest->vport));
+
+ IP_VS_DBG_V6(svc->af, 3, "Get destination " NIP6_FMT ":%u from trash, "
+ "dest->refcnt=%d, service %u/" NIP6_FMT ":%u\n",
+ NIP6(daddr.v6), ntohs(dport),
+ atomic_read(&dest->refcnt),
+ dest->vfwmark,
+ NIP6(dest->vaddr.v6),
+ ntohs(dest->vport));
+
__ip_vs_update_dest(svc, dest, udest);
/*
@@ -981,10 +990,16 @@ static void __ip_vs_del_dest(struct ip_vs_dest *dest)
atomic_dec(&dest->svc->refcnt);
kfree(dest);
} else {
- IP_VS_DBG(3, "Moving dest %u.%u.%u.%u:%u into trash, "
- "dest->refcnt=%d\n",
- NIPQUAD(dest->addr), ntohs(dest->port),
- atomic_read(&dest->refcnt));
+ IP_VS_DBG_V4(dest->af, 3, "Moving dest %u.%u.%u.%u:%u into trash, "
+ "dest->refcnt=%d\n",
+ NIPQUAD(dest->addr.v4), ntohs(dest->port),
+ atomic_read(&dest->refcnt));
+
+ IP_VS_DBG_V6(dest->af, 3, "Moving dest " NIP6_FMT ":%u into trash, "
+ "dest->refcnt=%d\n",
+ NIP6(dest->addr.v6), ntohs(dest->port),
+ atomic_read(&dest->refcnt));
+
list_add(&dest->n_list, &ip_vs_dest_trash);
atomic_inc(&dest->refcnt);
}
@@ -1953,9 +1968,14 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
/* Check for valid protocol: TCP or UDP, even for fwmark!=0 */
if (usvc->protocol!=IPPROTO_TCP && usvc->protocol!=IPPROTO_UDP) {
- IP_VS_ERR("set_ctl: invalid protocol: %d %d.%d.%d.%d:%d %s\n",
- usvc->protocol, NIPQUAD(usvc->addr),
- ntohs(usvc->port), usvc->sched_name);
+ IP_VS_ERR_V4(usvc->af, "set_ctl: invalid protocol: %d %d.%d.%d.%d:%d %s\n",
+ usvc->protocol, NIPQUAD(usvc->addr.v4),
+ ntohs(usvc->port), usvc->sched_name);
+
+ IP_VS_ERR_V6(usvc->af, "set_ctl: invalid protocol: %d " NIP6_FMT ":%d %s\n",
+ usvc->protocol, NIP6(usvc->addr.v6),
+ ntohs(usvc->port), usvc->sched_name);
+
ret = -EFAULT;
goto out_unlock;
}
diff --git a/net/netfilter/ipvs/ip_vs_lc.c b/net/netfilter/ipvs/ip_vs_lc.c
index d88fef9..e1214d1 100644
--- a/net/netfilter/ipvs/ip_vs_lc.c
+++ b/net/netfilter/ipvs/ip_vs_lc.c
@@ -86,11 +86,18 @@ ip_vs_lc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
}
}
- if (least)
- IP_VS_DBG(6, "LC: server %u.%u.%u.%u:%u activeconns %d inactconns %d\n",
- NIPQUAD(least->addr), ntohs(least->port),
- atomic_read(&least->activeconns),
- atomic_read(&least->inactconns));
+ if (!least)
+ return NULL;
+
+ IP_VS_DBG_V4(svc->af, 6, "LC: server %u.%u.%u.%u:%u activeconns %d inactconns %d\n",
+ NIPQUAD(least->addr.v4), ntohs(least->port),
+ atomic_read(&least->activeconns),
+ atomic_read(&least->inactconns));
+
+ IP_VS_DBG_V6(svc->af, 6, "LC: server " NIP6_FMT ":%u activeconns %d inactconns %d\n",
+ NIP6(least->addr.v6), ntohs(least->port),
+ atomic_read(&least->activeconns),
+ atomic_read(&least->inactconns));
return least;
}
diff --git a/net/netfilter/ipvs/ip_vs_nq.c b/net/netfilter/ipvs/ip_vs_nq.c
index bc2a9e5..5de2e34 100644
--- a/net/netfilter/ipvs/ip_vs_nq.c
+++ b/net/netfilter/ipvs/ip_vs_nq.c
@@ -122,9 +122,17 @@ ip_vs_nq_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
return NULL;
out:
- IP_VS_DBG(6, "NQ: server %u.%u.%u.%u:%u "
+
+ IP_VS_DBG_V4(svc->af, 6, "NQ: server %u.%u.%u.%u:%u "
+ "activeconns %d refcnt %d weight %d overhead %d\n",
+ NIPQUAD(least->addr.v4), ntohs(least->port),
+ atomic_read(&least->activeconns),
+ atomic_read(&least->refcnt),
+ atomic_read(&least->weight), loh);
+
+ IP_VS_DBG_V6(svc->af, 6, "NQ: server " NIP6_FMT ":%u "
"activeconns %d refcnt %d weight %d overhead %d\n",
- NIPQUAD(least->addr), ntohs(least->port),
+ NIP6(least->addr.v6), ntohs(least->port),
atomic_read(&least->activeconns),
atomic_read(&least->refcnt),
atomic_read(&least->weight), loh);
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index 6068c47..0efb3e4 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -421,19 +421,34 @@ set_tcp_state(struct ip_vs_protocol *pp, struct ip_vs_conn *cp,
if (new_state != cp->state) {
struct ip_vs_dest *dest = cp->dest;
- IP_VS_DBG(8, "%s %s [%c%c%c%c] %u.%u.%u.%u:%d->"
- "%u.%u.%u.%u:%d state: %s->%s conn->refcnt:%d\n",
- pp->name,
- (state_off==TCP_DIR_OUTPUT)?"output ":"input ",
- th->syn? 'S' : '.',
- th->fin? 'F' : '.',
- th->ack? 'A' : '.',
- th->rst? 'R' : '.',
- NIPQUAD(cp->daddr), ntohs(cp->dport),
- NIPQUAD(cp->caddr), ntohs(cp->cport),
- tcp_state_name(cp->state),
- tcp_state_name(new_state),
- atomic_read(&cp->refcnt));
+ IP_VS_DBG_V4(cp->af, 8, "%s %s [%c%c%c%c] %u.%u.%u.%u:%d->"
+ "%u.%u.%u.%u:%d state: %s->%s conn->refcnt:%d\n",
+ pp->name,
+ (state_off==TCP_DIR_OUTPUT)?"output ":"input ",
+ th->syn? 'S' : '.',
+ th->fin? 'F' : '.',
+ th->ack? 'A' : '.',
+ th->rst? 'R' : '.',
+ NIPQUAD(cp->daddr.v4), ntohs(cp->dport),
+ NIPQUAD(cp->caddr.v4), ntohs(cp->cport),
+ tcp_state_name(cp->state),
+ tcp_state_name(new_state),
+ atomic_read(&cp->refcnt));
+
+ IP_VS_DBG_V6(cp->af, 8, "%s %s [%c%c%c%c] " NIP6_FMT ":%d->"
+ NIP6_FMT ":%d state: %s->%s conn->refcnt:%d\n",
+ pp->name,
+ (state_off==TCP_DIR_OUTPUT)?"output ":"input ",
+ th->syn? 'S' : '.',
+ th->fin? 'F' : '.',
+ th->ack? 'A' : '.',
+ th->rst? 'R' : '.',
+ NIP6(cp->daddr.v6), ntohs(cp->dport),
+ NIP6(cp->caddr.v6), ntohs(cp->cport),
+ tcp_state_name(cp->state),
+ tcp_state_name(new_state),
+ atomic_read(&cp->refcnt));
+
if (dest) {
if (!(cp->flags & IP_VS_CONN_F_INACTIVE) &&
(new_state != IP_VS_TCP_S_ESTABLISHED)) {
@@ -548,12 +563,20 @@ tcp_app_conn_bind(struct ip_vs_conn *cp)
break;
spin_unlock(&tcp_app_lock);
- IP_VS_DBG(9, "%s: Binding conn %u.%u.%u.%u:%u->"
- "%u.%u.%u.%u:%u to app %s on port %u\n",
- __func__,
- NIPQUAD(cp->caddr), ntohs(cp->cport),
- NIPQUAD(cp->vaddr), ntohs(cp->vport),
- inc->name, ntohs(inc->port));
+ IP_VS_DBG_V4(cp->af, 9, "%s: Binding conn %u.%u.%u.%u:%u->"
+ "%u.%u.%u.%u:%u to app %s on port %u\n",
+ __func__,
+ NIPQUAD(cp->caddr.v4), ntohs(cp->cport),
+ NIPQUAD(cp->vaddr.v4), ntohs(cp->vport),
+ inc->name, ntohs(inc->port));
+
+ IP_VS_DBG_V6(cp->af, 9, "%s: Binding conn " NIP6_FMT ":%u->"
+ NIP6_FMT ":%u to app %s on port %u\n",
+ __func__,
+ NIP6(cp->caddr.v6), ntohs(cp->cport),
+ NIP6(cp->vaddr.v6), ntohs(cp->vport),
+ inc->name, ntohs(inc->port));
+
cp->app = inc;
if (inc->init_conn)
result = inc->init_conn(inc, cp);
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index 0bcc17a..76e97ef 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -342,12 +342,20 @@ static int udp_app_conn_bind(struct ip_vs_conn *cp)
break;
spin_unlock(&udp_app_lock);
- IP_VS_DBG(9, "%s: Binding conn %u.%u.%u.%u:%u->"
- "%u.%u.%u.%u:%u to app %s on port %u\n",
- __func__,
- NIPQUAD(cp->caddr), ntohs(cp->cport),
- NIPQUAD(cp->vaddr), ntohs(cp->vport),
- inc->name, ntohs(inc->port));
+ IP_VS_DBG_V4(cp->af, 9, "%s: Binding conn %u.%u.%u.%u:%u->"
+ "%u.%u.%u.%u:%u to app %s on port %u\n",
+ __func__,
+ NIPQUAD(cp->caddr.v4), ntohs(cp->cport),
+ NIPQUAD(cp->vaddr.v4), ntohs(cp->vport),
+ inc->name, ntohs(inc->port));
+
+ IP_VS_DBG_V6(cp->af, 9, "%s: Binding conn " NIP6_FMT ":%u->"
+ NIP6_FMT ":%u to app %s on port %u\n",
+ __func__,
+ NIP6(cp->caddr.v6), ntohs(cp->cport),
+ NIP6(cp->vaddr.v6), ntohs(cp->vport),
+ inc->name, ntohs(inc->port));
+
cp->app = inc;
if (inc->init_conn)
result = inc->init_conn(inc, cp);
diff --git a/net/netfilter/ipvs/ip_vs_sed.c b/net/netfilter/ipvs/ip_vs_sed.c
index dd7c128..e7bc810 100644
--- a/net/netfilter/ipvs/ip_vs_sed.c
+++ b/net/netfilter/ipvs/ip_vs_sed.c
@@ -124,12 +124,19 @@ ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
}
}
- IP_VS_DBG(6, "SED: server %u.%u.%u.%u:%u "
- "activeconns %d refcnt %d weight %d overhead %d\n",
- NIPQUAD(least->addr), ntohs(least->port),
- atomic_read(&least->activeconns),
- atomic_read(&least->refcnt),
- atomic_read(&least->weight), loh);
+ IP_VS_DBG_V4(svc->af, 6, "SED: server %u.%u.%u.%u:%u "
+ "activeconns %d refcnt %d weight %d overhead %d\n",
+ NIPQUAD(least->addr.v4), ntohs(least->port),
+ atomic_read(&least->activeconns),
+ atomic_read(&least->refcnt),
+ atomic_read(&least->weight), loh);
+
+ IP_VS_DBG_V6(svc->af, 6, "SED: server " NIP6_FMT ":%u "
+ "activeconns %d refcnt %d weight %d overhead %d\n",
+ NIP6(least->addr.v6), ntohs(least->port),
+ atomic_read(&least->activeconns),
+ atomic_read(&least->refcnt),
+ atomic_read(&least->weight), loh);
return least;
}
diff --git a/net/netfilter/ipvs/ip_vs_wlc.c b/net/netfilter/ipvs/ip_vs_wlc.c
index 8a9d913..ff003a7 100644
--- a/net/netfilter/ipvs/ip_vs_wlc.c
+++ b/net/netfilter/ipvs/ip_vs_wlc.c
@@ -112,12 +112,19 @@ ip_vs_wlc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
}
}
- IP_VS_DBG(6, "WLC: server %u.%u.%u.%u:%u "
- "activeconns %d refcnt %d weight %d overhead %d\n",
- NIPQUAD(least->addr), ntohs(least->port),
- atomic_read(&least->activeconns),
- atomic_read(&least->refcnt),
- atomic_read(&least->weight), loh);
+ IP_VS_DBG_V4(svc->af, 6, "WLC: server %u.%u.%u.%u:%u "
+ "activeconns %d refcnt %d weight %d overhead %d\n",
+ NIPQUAD(least->addr.v4), ntohs(least->port),
+ atomic_read(&least->activeconns),
+ atomic_read(&least->refcnt),
+ atomic_read(&least->weight), loh);
+
+ IP_VS_DBG_V6(svc->af, 6, "WLC: server " NIP6_FMT ":%u "
+ "activeconns %d refcnt %d weight %d overhead %d\n",
+ NIP6(least->addr.v6), ntohs(least->port),
+ atomic_read(&least->activeconns),
+ atomic_read(&least->refcnt),
+ atomic_read(&least->weight), loh);
return least;
}
diff --git a/net/netfilter/ipvs/ip_vs_wrr.c b/net/netfilter/ipvs/ip_vs_wrr.c
index 85c680a..3f61ab2 100644
--- a/net/netfilter/ipvs/ip_vs_wrr.c
+++ b/net/netfilter/ipvs/ip_vs_wrr.c
@@ -197,12 +197,19 @@ ip_vs_wrr_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
}
}
- IP_VS_DBG(6, "WRR: server %u.%u.%u.%u:%u "
- "activeconns %d refcnt %d weight %d\n",
- NIPQUAD(dest->addr), ntohs(dest->port),
- atomic_read(&dest->activeconns),
- atomic_read(&dest->refcnt),
- atomic_read(&dest->weight));
+ IP_VS_DBG_V4(svc->af, 6, "WRR: server %u.%u.%u.%u:%u "
+ "activeconns %d refcnt %d weight %d\n",
+ NIPQUAD(dest->addr.v4), ntohs(dest->port),
+ atomic_read(&dest->activeconns),
+ atomic_read(&dest->refcnt),
+ atomic_read(&dest->weight));
+
+ IP_VS_DBG_V6(svc->af, 6, "WRR: server " NIP6_FMT ":%u "
+ "activeconns %d refcnt %d weight %d\n",
+ NIP6(dest->addr.v6), ntohs(dest->port),
+ atomic_read(&dest->activeconns),
+ atomic_read(&dest->refcnt),
+ atomic_read(&dest->weight));
out:
write_unlock(&svc->sched_lock);
--
1.5.3.6
* [PATCH 06/26] IPVS: Add IPv6-specific function pointers to struct ip_vs_protocol.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (4 preceding siblings ...)
2008-06-11 17:11 ` [PATCH 05/26] IPVS: Use new " Julius R. Volz
@ 2008-06-11 17:11 ` Julius R. Volz
2008-06-11 17:11 ` [PATCH 07/26] IPVS: Add IPv6 handler functions to AH protocol handler Julius R. Volz
` (20 subsequent siblings)
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add IPv6-specific handler function pointers to struct ip_vs_protocol.
They are needed for handlers whose implementation may differ between
IPv4 and IPv6. Depending on the code path (processing either v4 or v6
packets), the corresponding version of a handler is called. The new
pointers exist only when CONFIG_IP_VS_IPV6 is set.
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 33 insertions(+), 0 deletions(-)
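To make the intended dispatch concrete, here is a minimal userspace sketch
of a protocol descriptor with one hook per address family. All names
(struct sketch_pkt, struct sketch_protocol, sketch_check() and the two
handlers) are hypothetical. Treating a NULL hook as "nothing to check" is
modelled on the `pp->csum_check_v6 && !pp->csum_check_v6(skb, pp)` test
used later in this series; the rest is an assumption.

```c
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical packet: only the address family matters here. */
struct sketch_pkt {
	int af;
};

/* Hypothetical protocol descriptor with a v4 and a v6 hook. */
struct sketch_protocol {
	const char *name;
	int (*csum_check)(const struct sketch_pkt *pkt);
	int (*csum_check_v6)(const struct sketch_pkt *pkt);
};

/* The handlers return distinct values only so that the chosen code
 * path is visible to the caller. */
static int sketch_csum_v4(const struct sketch_pkt *pkt)
{
	(void)pkt;
	return 4;
}

static int sketch_csum_v6(const struct sketch_pkt *pkt)
{
	(void)pkt;
	return 6;
}

/* Pick the hook that matches the packet's address family.  A NULL
 * hook means the protocol has nothing to check, reported as 1. */
static int sketch_check(const struct sketch_protocol *pp,
			const struct sketch_pkt *pkt)
{
	int (*fn)(const struct sketch_pkt *) =
		pkt->af == AF_INET6 ? pp->csum_check_v6 : pp->csum_check;

	return fn ? fn(pkt) : 1;
}
```

Separate pointers keep the v4 signatures untouched, which matters for
handlers that take a struct iphdr or struct ipv6hdr argument.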
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 5c2d48d..9ae04d0 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -477,6 +477,12 @@ struct ip_vs_protocol {
struct ip_vs_protocol *pp,
int *verdict, struct ip_vs_conn **cpp);
+#ifdef CONFIG_IP_VS_IPV6
+ int (*conn_schedule_v6)(struct sk_buff *skb,
+ struct ip_vs_protocol *pp,
+ int *verdict, struct ip_vs_conn **cpp);
+#endif
+
struct ip_vs_conn *
(*conn_in_get)(const struct sk_buff *skb,
struct ip_vs_protocol *pp,
@@ -491,13 +497,40 @@ struct ip_vs_protocol {
unsigned int proto_off,
int inverse);
+#ifdef CONFIG_IP_VS_IPV6
+ struct ip_vs_conn *
+ (*conn_in_get_v6)(const struct sk_buff *skb,
+ struct ip_vs_protocol *pp,
+ const struct ipv6hdr *iph,
+ unsigned int proto_off,
+ int inverse);
+
+ struct ip_vs_conn *
+ (*conn_out_get_v6)(const struct sk_buff *skb,
+ struct ip_vs_protocol *pp,
+ const struct ipv6hdr *iph,
+ unsigned int proto_off,
+ int inverse);
+#endif
+
int (*snat_handler)(struct sk_buff *skb,
struct ip_vs_protocol *pp, struct ip_vs_conn *cp);
int (*dnat_handler)(struct sk_buff *skb,
struct ip_vs_protocol *pp, struct ip_vs_conn *cp);
+#ifdef CONFIG_IP_VS_IPV6
+ int (*snat_handler_v6)(struct sk_buff *skb,
+ struct ip_vs_protocol *pp, struct ip_vs_conn *cp);
+
+ int (*dnat_handler_v6)(struct sk_buff *skb,
+ struct ip_vs_protocol *pp, struct ip_vs_conn *cp);
+#endif
+
int (*csum_check)(struct sk_buff *skb, struct ip_vs_protocol *pp);
+#ifdef CONFIG_IP_VS_IPV6
+ int (*csum_check_v6)(struct sk_buff *skb, struct ip_vs_protocol *pp);
+#endif
const char *(*state_name)(int state);
--
1.5.3.6
* [PATCH 07/26] IPVS: Add IPv6 handler functions to AH protocol handler.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (5 preceding siblings ...)
2008-06-11 17:11 ` [PATCH 06/26] IPVS: Add IPv6-specific function pointers to struct ip_vs_protocol Julius R. Volz
@ 2008-06-11 17:11 ` Julius R. Volz
2008-06-11 17:11 ` [PATCH 08/26] IPVS: Add IPv6 handler functions to ESP " Julius R. Volz
` (19 subsequent siblings)
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
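Define new IPv6-specific connection lookup functions in the AH protocol
handler and set the new function pointers in struct ip_vs_protocol to
point to them. conn_schedule_v6 reuses the existing ah_conn_schedule(),
and the IPv6 NAT handlers stay NULL, as for IPv4.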
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 86 insertions(+), 0 deletions(-)
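The lookups added here do not key on the AH packet itself but on the
ISAKMP (UDP port 500) connection between the same two hosts, swapping
source and destination when `inverse` is set. A small userspace sketch of
that key construction follows. struct sketch_key, sketch_isakmp_key() and
SKETCH_PORT_ISAKMP are hypothetical names, and the key layout is an
assumption rather than the real connection table format.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

#define SKETCH_PORT_ISAKMP 500	/* IKE/ISAKMP runs over UDP port 500 */

/* Hypothetical lookup key; ports are kept in network byte order. */
struct sketch_key {
	int protocol;
	struct in6_addr saddr;
	struct in6_addr daddr;
	unsigned short sport;
	unsigned short dport;
};

/* Build the key for the ISAKMP connection that belongs to a packet.
 * With inverse set, the packet addresses are used in swapped order. */
static struct sketch_key sketch_isakmp_key(const struct in6_addr *pkt_saddr,
					   const struct in6_addr *pkt_daddr,
					   int inverse)
{
	struct sketch_key k;

	memset(&k, 0, sizeof(k));
	k.protocol = IPPROTO_UDP;
	k.sport = htons(SKETCH_PORT_ISAKMP);
	k.dport = htons(SKETCH_PORT_ISAKMP);
	k.saddr = inverse ? *pkt_daddr : *pkt_saddr;
	k.daddr = inverse ? *pkt_saddr : *pkt_daddr;
	return k;
}
```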
diff --git a/net/netfilter/ipvs/ip_vs_proto_ah.c b/net/netfilter/ipvs/ip_vs_proto_ah.c
index 4bf835e..674c9d8 100644
--- a/net/netfilter/ipvs/ip_vs_proto_ah.c
+++ b/net/netfilter/ipvs/ip_vs_proto_ah.c
@@ -79,6 +79,47 @@ ah_conn_in_get(const struct sk_buff *skb,
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_conn *
+ah_conn_in_get_v6(const struct sk_buff *skb,
+ struct ip_vs_protocol *pp,
+ const struct ipv6hdr *iph,
+ unsigned int proto_off,
+ int inverse)
+{
+ struct ip_vs_conn *cp;
+
+ if (likely(!inverse)) {
+ cp = ip_vs_conn_in_get_v6(IPPROTO_UDP,
+ &iph->saddr,
+ htons(PORT_ISAKMP),
+ &iph->daddr,
+ htons(PORT_ISAKMP));
+ } else {
+ cp = ip_vs_conn_in_get_v6(IPPROTO_UDP,
+ &iph->daddr,
+ htons(PORT_ISAKMP),
+ &iph->saddr,
+ htons(PORT_ISAKMP));
+ }
+
+ if (!cp) {
+ /*
+ * We are not sure if the packet is from our
+ * service, so our conn_schedule hook should return NF_ACCEPT
+ */
+ IP_VS_DBG(12, "Unknown ISAKMP entry for outin packet "
+ "%s%s " NIP6_FMT "->" NIP6_FMT "\n",
+ inverse ? "ICMP+" : "",
+ pp->name,
+ NIP6(iph->saddr),
+ NIP6(iph->daddr));
+ }
+
+ return cp;
+}
+#endif
+
static struct ip_vs_conn *
ah_conn_out_get(const struct sk_buff *skb, struct ip_vs_protocol *pp,
@@ -112,6 +153,40 @@ ah_conn_out_get(const struct sk_buff *skb, struct ip_vs_protocol *pp,
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_conn *
+ah_conn_out_get_v6(const struct sk_buff *skb, struct ip_vs_protocol *pp,
+ const struct ipv6hdr *iph, unsigned int proto_off, int inverse)
+{
+ struct ip_vs_conn *cp;
+
+ if (likely(!inverse)) {
+ cp = ip_vs_conn_out_get_v6(IPPROTO_UDP,
+ &iph->saddr,
+ htons(PORT_ISAKMP),
+ &iph->daddr,
+ htons(PORT_ISAKMP));
+ } else {
+ cp = ip_vs_conn_out_get_v6(IPPROTO_UDP,
+ &iph->daddr,
+ htons(PORT_ISAKMP),
+ &iph->saddr,
+ htons(PORT_ISAKMP));
+ }
+
+ if (!cp) {
+ IP_VS_DBG(12, "Unknown ISAKMP entry for inout packet "
+ "%s%s " NIP6_FMT "->" NIP6_FMT "\n",
+ inverse ? "ICMP+" : "",
+ pp->name,
+ NIP6(iph->saddr),
+ NIP6(iph->daddr));
+ }
+
+ return cp;
+}
+#endif
+
static int
ah_conn_schedule(struct sk_buff *skb,
@@ -165,10 +240,21 @@ struct ip_vs_protocol ip_vs_protocol_ah = {
.init = ah_init,
.exit = ah_exit,
.conn_schedule = ah_conn_schedule,
+#ifdef CONFIG_IP_VS_IPV6
+ .conn_schedule_v6 = ah_conn_schedule,
+#endif
.conn_in_get = ah_conn_in_get,
.conn_out_get = ah_conn_out_get,
+#ifdef CONFIG_IP_VS_IPV6
+ .conn_in_get_v6 = ah_conn_in_get_v6,
+ .conn_out_get_v6 = ah_conn_out_get_v6,
+#endif
.snat_handler = NULL,
.dnat_handler = NULL,
+#ifdef CONFIG_IP_VS_IPV6
+ .snat_handler_v6 = NULL,
+ .dnat_handler_v6 = NULL,
+#endif
.csum_check = NULL,
.state_transition = NULL,
.register_app = NULL,
--
1.5.3.6
* [PATCH 08/26] IPVS: Add IPv6 handler functions to ESP protocol handler.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (6 preceding siblings ...)
2008-06-11 17:11 ` [PATCH 07/26] IPVS: Add IPv6 handler functions to AH protocol handler Julius R. Volz
@ 2008-06-11 17:11 ` Julius R. Volz
2008-06-11 17:11 ` [PATCH 09/26] IPVS: Add IPv6 handler functions to TCP " Julius R. Volz
` (18 subsequent siblings)
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
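Define new IPv6-specific connection lookup functions in the ESP protocol
handler and set the new function pointers in struct ip_vs_protocol to
point to them. conn_schedule_v6 reuses the existing esp_conn_schedule(),
and the IPv6 NAT handlers stay NULL, as for IPv4.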
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 86 insertions(+), 0 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_proto_esp.c b/net/netfilter/ipvs/ip_vs_proto_esp.c
index db6a6b7..5113df4 100644
--- a/net/netfilter/ipvs/ip_vs_proto_esp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_esp.c
@@ -79,6 +79,47 @@ esp_conn_in_get(const struct sk_buff *skb,
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_conn *
+esp_conn_in_get_v6(const struct sk_buff *skb,
+ struct ip_vs_protocol *pp,
+ const struct ipv6hdr *iph,
+ unsigned int proto_off,
+ int inverse)
+{
+ struct ip_vs_conn *cp;
+
+ if (likely(!inverse)) {
+ cp = ip_vs_conn_in_get_v6(IPPROTO_UDP,
+ &iph->saddr,
+ htons(PORT_ISAKMP),
+ &iph->daddr,
+ htons(PORT_ISAKMP));
+ } else {
+ cp = ip_vs_conn_in_get_v6(IPPROTO_UDP,
+ &iph->daddr,
+ htons(PORT_ISAKMP),
+ &iph->saddr,
+ htons(PORT_ISAKMP));
+ }
+
+ if (!cp) {
+ /*
+ * We are not sure if the packet is from our
+ * service, so our conn_schedule hook should return NF_ACCEPT
+ */
+ IP_VS_DBG(12, "Unknown ISAKMP entry for outin packet "
+ "%s%s " NIP6_FMT "->" NIP6_FMT "\n",
+ inverse ? "ICMP+" : "",
+ pp->name,
+ NIP6(iph->saddr),
+ NIP6(iph->daddr));
+ }
+
+ return cp;
+}
+#endif
+
static struct ip_vs_conn *
esp_conn_out_get(const struct sk_buff *skb, struct ip_vs_protocol *pp,
@@ -112,6 +153,40 @@ esp_conn_out_get(const struct sk_buff *skb, struct ip_vs_protocol *pp,
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_conn *
+esp_conn_out_get_v6(const struct sk_buff *skb, struct ip_vs_protocol *pp,
+ const struct ipv6hdr *iph, unsigned int proto_off, int inverse)
+{
+ struct ip_vs_conn *cp;
+
+ if (likely(!inverse)) {
+ cp = ip_vs_conn_out_get_v6(IPPROTO_UDP,
+ &iph->saddr,
+ htons(PORT_ISAKMP),
+ &iph->daddr,
+ htons(PORT_ISAKMP));
+ } else {
+ cp = ip_vs_conn_out_get_v6(IPPROTO_UDP,
+ &iph->daddr,
+ htons(PORT_ISAKMP),
+ &iph->saddr,
+ htons(PORT_ISAKMP));
+ }
+
+ if (!cp) {
+ IP_VS_DBG(12, "Unknown ISAKMP entry for inout packet "
+ "%s%s " NIP6_FMT "->" NIP6_FMT "\n",
+ inverse ? "ICMP+" : "",
+ pp->name,
+ NIP6(iph->saddr),
+ NIP6(iph->daddr));
+ }
+
+ return cp;
+}
+#endif
+
static int
esp_conn_schedule(struct sk_buff *skb, struct ip_vs_protocol *pp,
@@ -164,10 +239,21 @@ struct ip_vs_protocol ip_vs_protocol_esp = {
.init = esp_init,
.exit = esp_exit,
.conn_schedule = esp_conn_schedule,
+#ifdef CONFIG_IP_VS_IPV6
+ .conn_schedule_v6 = esp_conn_schedule,
+#endif
.conn_in_get = esp_conn_in_get,
.conn_out_get = esp_conn_out_get,
+#ifdef CONFIG_IP_VS_IPV6
+ .conn_in_get_v6 = esp_conn_in_get_v6,
+ .conn_out_get_v6 = esp_conn_out_get_v6,
+#endif
.snat_handler = NULL,
.dnat_handler = NULL,
+#ifdef CONFIG_IP_VS_IPV6
+ .snat_handler_v6 = NULL,
+ .dnat_handler_v6 = NULL,
+#endif
.csum_check = NULL,
.state_transition = NULL,
.register_app = NULL,
--
1.5.3.6
* [PATCH 09/26] IPVS: Add IPv6 handler functions to TCP protocol handler.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (7 preceding siblings ...)
2008-06-11 17:11 ` [PATCH 08/26] IPVS: Add IPv6 handler functions to ESP " Julius R. Volz
@ 2008-06-11 17:11 ` Julius R. Volz
2008-06-11 17:11 ` [PATCH 10/26] IPVS: Add IPv6 handler functions to UDP " Julius R. Volz
` (17 subsequent siblings)
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Define new IPv6-specific handler functions in the TCP protocol handler
and set the new function pointers in struct ip_vs_protocol to point to
them. Introduce a new ip_vs_check_diff16() function that incrementally
updates a checksum when a 16-byte IPv6 address is rewritten.
Signed-off-by: Julius R. Volz <juliusv@google.com>
2 files changed, 260 insertions(+), 1 deletions(-)
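For readers unfamiliar with incremental checksum updates, the idea behind
ip_vs_check_diff16() can be sketched in userspace as follows. This is not
the kernel code: it works on plain byte arrays instead of __be32 words and
does not use csum_partial(). The names sketch_sum16(), sketch_checksum()
and sketch_update16() are hypothetical. The update follows RFC 1624,
HC' = ~(~HC + ~m + m'), applied to the eight 16-bit words of the address,
and assumes the address starts on a 16-bit boundary in the summed data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add len bytes, read as big-endian 16-bit words, to a running
 * one's-complement sum and fold the result to 16 bits. */
static uint16_t sketch_sum16(const uint8_t *data, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full Internet checksum: inverted one's-complement sum. */
static uint16_t sketch_checksum(const uint8_t *data, size_t len)
{
	return (uint16_t)~sketch_sum16(data, len, 0);
}

/* Incremental update after a 16-byte address change:
 * add the inverted old words and the new words to ~check. */
static uint16_t sketch_update16(uint16_t check,
				const uint8_t old_addr[16],
				const uint8_t new_addr[16])
{
	uint8_t inv_old[16];
	uint32_t sum = (uint16_t)~check;
	size_t i;

	for (i = 0; i < 16; i++)
		inv_old[i] = (uint8_t)~old_addr[i];

	sum = sketch_sum16(inv_old, sizeof(inv_old), sum);
	sum = sketch_sum16(new_addr, 16, sum);
	return (uint16_t)~sum;
}
```

The point of the technique is that the cost depends only on the size of the
rewritten field, not on the length of the packet.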
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 9ae04d0..a6e7438 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1094,6 +1094,17 @@ static inline __wsum ip_vs_check_diff4(__be32 old, __be32 new, __wsum oldsum)
return csum_partial((char *) diff, sizeof(diff), oldsum);
}
+#ifdef CONFIG_IP_VS_IPV6
+static inline __wsum ip_vs_check_diff16(const __be32 *old, const __be32 *new,
+ __wsum oldsum)
+{
+ __be32 diff[8] = { ~old[3], ~old[2], ~old[1], ~old[0],
+ new[3], new[2], new[1], new[0] };
+
+ return csum_partial((char *) diff, sizeof(diff), oldsum);
+}
+#endif
+
static inline __wsum ip_vs_check_diff2(__be16 old, __be16 new, __wsum oldsum)
{
__be16 diff[2] = { ~old, new };
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index 0efb3e4..02bf859 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -47,6 +47,29 @@ tcp_conn_in_get(const struct sk_buff *skb, struct ip_vs_protocol *pp,
}
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_conn *
+tcp_conn_in_get_v6(const struct sk_buff *skb, struct ip_vs_protocol *pp,
+ const struct ipv6hdr *iph, unsigned int proto_off, int inverse)
+{
+ __be16 _ports[2], *pptr;
+
+ pptr = skb_header_pointer(skb, proto_off, sizeof(_ports), _ports);
+ if (pptr == NULL)
+ return NULL;
+
+ if (likely(!inverse)) {
+ return ip_vs_conn_in_get_v6(iph->nexthdr,
+ &iph->saddr, pptr[0],
+ &iph->daddr, pptr[1]);
+ } else {
+ return ip_vs_conn_in_get_v6(iph->nexthdr,
+ &iph->daddr, pptr[1],
+ &iph->saddr, pptr[0]);
+ }
+}
+#endif
+
static struct ip_vs_conn *
tcp_conn_out_get(const struct sk_buff *skb, struct ip_vs_protocol *pp,
const struct iphdr *iph, unsigned int proto_off, int inverse)
@@ -68,6 +91,29 @@ tcp_conn_out_get(const struct sk_buff *skb, struct ip_vs_protocol *pp,
}
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_conn *
+tcp_conn_out_get_v6(const struct sk_buff *skb, struct ip_vs_protocol *pp,
+ const struct ipv6hdr *iph, unsigned int proto_off, int inverse)
+{
+ __be16 _ports[2], *pptr;
+
+ pptr = skb_header_pointer(skb, proto_off, sizeof(_ports), _ports);
+ if (pptr == NULL)
+ return NULL;
+
+ if (likely(!inverse)) {
+ return ip_vs_conn_out_get_v6(iph->nexthdr,
+ &iph->saddr, pptr[0],
+ &iph->daddr, pptr[1]);
+ } else {
+ return ip_vs_conn_out_get_v6(iph->nexthdr,
+ &iph->daddr, pptr[1],
+ &iph->saddr, pptr[0]);
+ }
+}
+#endif
+
static int
tcp_conn_schedule(struct sk_buff *skb,
@@ -110,6 +156,50 @@ tcp_conn_schedule(struct sk_buff *skb,
return 1;
}
+#ifdef CONFIG_IP_VS_IPV6
+static int
+tcp_conn_schedule_v6(struct sk_buff *skb,
+ struct ip_vs_protocol *pp,
+ int *verdict, struct ip_vs_conn **cpp)
+{
+ struct ip_vs_service *svc;
+ struct tcphdr _tcph, *th;
+
+ th = skb_header_pointer(skb, sizeof(struct ipv6hdr), sizeof(_tcph),
+ &_tcph);
+ if (th == NULL) {
+ *verdict = NF_DROP;
+ return 0;
+ }
+
+ if (th->syn &&
+ (svc = ip_vs_service_get_v6(skb->mark, ipv6_hdr(skb)->nexthdr,
+ &ipv6_hdr(skb)->daddr, th->dest))) {
+ if (ip_vs_todrop()) {
+ /*
+ * It seems that we are very loaded.
+ * We have to drop this packet :(
+ */
+ ip_vs_service_put(svc);
+ *verdict = NF_DROP;
+ return 0;
+ }
+
+ /*
+ * Let the virtual server select a real server for the
+ * incoming connection, and create a connection entry.
+ */
+ *cpp = ip_vs_schedule_v6(svc, skb);
+ if (!*cpp) {
+ *verdict = ip_vs_leave_v6(svc, skb, pp);
+ return 0;
+ }
+ ip_vs_service_put(svc);
+ }
+ return 1;
+}
+#endif
+
static inline void
tcp_fast_csum_update(struct tcphdr *tcph, __be32 oldip, __be32 newip,
@@ -121,6 +211,19 @@ tcp_fast_csum_update(struct tcphdr *tcph, __be32 oldip, __be32 newip,
~csum_unfold(tcph->check))));
}
+#ifdef CONFIG_IP_VS_IPV6
+static inline void
+tcp_fast_csum_update_v6(struct tcphdr *tcph, const struct in6_addr *oldip,
+ const struct in6_addr *newip, __be16 oldport,
+ __be16 newport)
+{
+ tcph->check =
+ csum_fold(ip_vs_check_diff16(oldip->s6_addr32, newip->s6_addr32,
+ ip_vs_check_diff2(oldport, newport,
+ ~csum_unfold(tcph->check))));
+}
+#endif
+
static int
tcp_snat_handler(struct sk_buff *skb,
@@ -167,6 +270,53 @@ tcp_snat_handler(struct sk_buff *skb,
return 1;
}
+#ifdef CONFIG_IP_VS_IPV6
+static int
+tcp_snat_handler_v6(struct sk_buff *skb,
+ struct ip_vs_protocol *pp, struct ip_vs_conn *cp)
+{
+ struct tcphdr *tcph;
+ const unsigned int tcphoff = sizeof(struct ipv6hdr);
+
+ /* csum_check_v6 requires unshared skb */
+ if (!skb_make_writable(skb, tcphoff+sizeof(*tcph)))
+ return 0;
+
+ if (unlikely(cp->app != NULL)) {
+ /* Some checks before mangling */
+ if (pp->csum_check_v6 && !pp->csum_check_v6(skb, pp))
+ return 0;
+
+ /* Call application helper if needed */
+ if (!ip_vs_app_pkt_out(cp, skb))
+ return 0;
+ }
+
+ tcph = (void *)ipv6_hdr(skb) + tcphoff;
+ tcph->source = cp->vport;
+
+ /* Adjust TCP checksums */
+ if (!cp->app) {
+ /* Only port and addr are changed, do fast csum update */
+ tcp_fast_csum_update_v6(tcph, &cp->daddr.v6, &cp->vaddr.v6,
+ cp->dport, cp->vport);
+ if (skb->ip_summed == CHECKSUM_COMPLETE)
+ skb->ip_summed = CHECKSUM_NONE;
+ } else {
+ /* full checksum calculation */
+ tcph->check = 0;
+ skb->csum = skb_checksum(skb, tcphoff, skb->len - tcphoff, 0);
+ tcph->check = csum_ipv6_magic(&cp->vaddr.v6, &cp->caddr.v6,
+ skb->len - tcphoff,
+ cp->protocol, skb->csum);
+ IP_VS_DBG(11, "O-pkt: %s O-csum=%d (+%zd)\n",
+ pp->name, tcph->check,
+ (char*)&(tcph->check) - (char*)tcph);
+ }
+ return 1;
+}
+#endif
+
static int
tcp_dnat_handler(struct sk_buff *skb,
@@ -216,6 +366,56 @@ tcp_dnat_handler(struct sk_buff *skb,
return 1;
}
+#ifdef CONFIG_IP_VS_IPV6
+static int
+tcp_dnat_handler_v6(struct sk_buff *skb,
+ struct ip_vs_protocol *pp, struct ip_vs_conn *cp)
+{
+ struct tcphdr *tcph;
+ const unsigned int tcphoff = sizeof(struct ipv6hdr);
+
+ /* csum_check_v6 requires unshared skb */
+ if (!skb_make_writable(skb, tcphoff+sizeof(*tcph)))
+ return 0;
+
+ if (unlikely(cp->app != NULL)) {
+ /* Some checks before mangling */
+ if (pp->csum_check_v6 && !pp->csum_check_v6(skb, pp))
+ return 0;
+
+ /*
+ * Attempt ip_vs_app call.
+ * It will fix ip_vs_conn and iph ack_seq stuff
+ */
+ if (!ip_vs_app_pkt_in(cp, skb))
+ return 0;
+ }
+
+ tcph = (void *)ipv6_hdr(skb) + tcphoff;
+ tcph->dest = cp->dport;
+
+ /*
+ * Adjust TCP checksums
+ */
+ if (!cp->app) {
+ /* Only port and addr are changed, do fast csum update */
+ tcp_fast_csum_update_v6(tcph, &cp->vaddr.v6, &cp->daddr.v6,
+ cp->vport, cp->dport);
+ if (skb->ip_summed == CHECKSUM_COMPLETE)
+ skb->ip_summed = CHECKSUM_NONE;
+ } else {
+ /* full checksum calculation */
+ tcph->check = 0;
+ skb->csum = skb_checksum(skb, tcphoff, skb->len - tcphoff, 0);
+ tcph->check = csum_ipv6_magic(&cp->caddr.v6, &cp->daddr.v6,
+ skb->len - tcphoff,
+ cp->protocol, skb->csum);
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ }
+ return 1;
+}
+#endif
+
static int
tcp_csum_check(struct sk_buff *skb, struct ip_vs_protocol *pp)
@@ -242,6 +442,35 @@ tcp_csum_check(struct sk_buff *skb, struct ip_vs_protocol *pp)
return 1;
}
+#ifdef CONFIG_IP_VS_IPV6
+static int
+tcp_csum_check_v6(struct sk_buff *skb, struct ip_vs_protocol *pp)
+{
+ const unsigned int tcphoff = sizeof(struct ipv6hdr);
+
+ /* Validate the TCP checksum against the IPv6 pseudo-header. */
+ switch (skb->ip_summed) {
+ case CHECKSUM_NONE:
+ skb->csum = skb_checksum(skb, tcphoff, skb->len - tcphoff, 0);
+ case CHECKSUM_COMPLETE:
+ if (csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
+ &ipv6_hdr(skb)->daddr,
+ skb->len - tcphoff,
+ ipv6_hdr(skb)->nexthdr, skb->csum)) {
+ IP_VS_DBG_RL_PKT(0, pp, skb, 0,
+ "Failed checksum for");
+ return 0;
+ }
+ break;
+ default:
+ /* No need to checksum. */
+ break;
+ }
+
+ return 1;
+}
+#endif
+
#define TCP_DIR_INPUT 0
#define TCP_DIR_OUTPUT 4
@@ -477,8 +706,13 @@ tcp_state_transition(struct ip_vs_conn *cp, int direction,
struct ip_vs_protocol *pp)
{
struct tcphdr _tcph, *th;
+#ifdef CONFIG_IP_VS_IPV6
+ int ihl = cp->af == AF_INET ? ip_hdrlen(skb) : sizeof(struct ipv6hdr);
+#else
+ int ihl = ip_hdrlen(skb);
+#endif
- th = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(_tcph), &_tcph);
+ th = skb_header_pointer(skb, ihl, sizeof(_tcph), &_tcph);
if (th == NULL)
return 0;
@@ -625,11 +859,25 @@ struct ip_vs_protocol ip_vs_protocol_tcp = {
.register_app = tcp_register_app,
.unregister_app = tcp_unregister_app,
.conn_schedule = tcp_conn_schedule,
+#ifdef CONFIG_IP_VS_IPV6
+ .conn_schedule_v6 = tcp_conn_schedule_v6,
+#endif
.conn_in_get = tcp_conn_in_get,
.conn_out_get = tcp_conn_out_get,
+#ifdef CONFIG_IP_VS_IPV6
+ .conn_in_get_v6 = tcp_conn_in_get_v6,
+ .conn_out_get_v6 = tcp_conn_out_get_v6,
+#endif
.snat_handler = tcp_snat_handler,
.dnat_handler = tcp_dnat_handler,
+#ifdef CONFIG_IP_VS_IPV6
+ .snat_handler_v6 = tcp_snat_handler_v6,
+ .dnat_handler_v6 = tcp_dnat_handler_v6,
+#endif
.csum_check = tcp_csum_check,
+#ifdef CONFIG_IP_VS_IPV6
+ .csum_check_v6 = tcp_csum_check_v6,
+#endif
.state_name = tcp_state_name,
.state_transition = tcp_state_transition,
.app_conn_bind = tcp_app_conn_bind,
--
1.5.3.6
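
Note on the fast path in tcp_fast_csum_update_v6(): it relies on the Internet checksum being updatable incrementally (RFC 1624), so that when only the address and port change, the new checksum can be derived from the old one without walking the payload. The userspace sketch below illustrates that identity only. The sketch_* helpers are hypothetical names and are not the kernel's ip_vs_check_diff16() or ip_vs_check_diff2(); words are treated as plain 16-bit values rather than __be16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t sketch_fold(uint32_t acc)
{
	while (acc >> 16)
		acc = (acc & 0xffffu) + (acc >> 16);
	return (uint16_t)acc;
}

/* Full Internet checksum over n 16-bit words (hypothetical helper). */
static uint16_t sketch_csum_full(const uint16_t *w, size_t n)
{
	uint32_t acc = 0;
	size_t i;

	assert(n <= 32768);	/* keeps the 32-bit accumulator from overflowing */
	for (i = 0; i < n; i++)
		acc += w[i];
	return (uint16_t)~sketch_fold(acc);
}

/*
 * Incremental update, RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m').
 * "from" holds the n words being replaced, "to" their new values.
 */
static uint16_t sketch_csum_replace(uint16_t check, const uint16_t *from,
				    const uint16_t *to, size_t n)
{
	uint32_t acc = (uint16_t)~check;
	size_t i;

	assert(n <= 16384);	/* two additions per word */
	for (i = 0; i < n; i++) {
		acc += (uint16_t)~from[i];
		acc += to[i];
	}
	return (uint16_t)~sketch_fold(acc);
}
```

For an IPv6 NAT rewrite, "from" and "to" would cover the eight 16-bit words of the address plus the port word, which is why the cost is constant regardless of packet length.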
* [PATCH 10/26] IPVS: Add IPv6 handler functions to UDP protocol handler.
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Define new IPv6-specific handler functions in the UDP protocol handler. Set the
new function pointers in struct ip_vs_protocol to point to these functions.
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 264 insertions(+), 1 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index 76e97ef..ef0d921 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -49,6 +49,31 @@ udp_conn_in_get(const struct sk_buff *skb, struct ip_vs_protocol *pp,
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_conn *
+udp_conn_in_get_v6(const struct sk_buff *skb, struct ip_vs_protocol *pp,
+ const struct ipv6hdr *iph, unsigned int proto_off, int inverse)
+{
+ struct ip_vs_conn *cp;
+ __be16 _ports[2], *pptr;
+
+ pptr = skb_header_pointer(skb, proto_off, sizeof(_ports), _ports);
+ if (pptr == NULL)
+ return NULL;
+
+ if (likely(!inverse)) {
+ cp = ip_vs_conn_in_get_v6(iph->nexthdr,
+ &iph->saddr, pptr[0],
+ &iph->daddr, pptr[1]);
+ } else {
+ cp = ip_vs_conn_in_get_v6(iph->nexthdr,
+ &iph->daddr, pptr[1],
+ &iph->saddr, pptr[0]);
+ }
+
+ return cp;
+}
+#endif
static struct ip_vs_conn *
udp_conn_out_get(const struct sk_buff *skb, struct ip_vs_protocol *pp,
@@ -75,6 +100,33 @@ udp_conn_out_get(const struct sk_buff *skb, struct ip_vs_protocol *pp,
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_conn *
+udp_conn_out_get_v6(const struct sk_buff *skb, struct ip_vs_protocol *pp,
+ const struct ipv6hdr *iph, unsigned int proto_off, int inverse)
+{
+ struct ip_vs_conn *cp;
+ __be16 _ports[2], *pptr;
+
+ pptr = skb_header_pointer(skb, proto_off,
+ sizeof(_ports), _ports);
+ if (pptr == NULL)
+ return NULL;
+
+ if (likely(!inverse)) {
+ cp = ip_vs_conn_out_get_v6(iph->nexthdr,
+ &iph->saddr, pptr[0],
+ &iph->daddr, pptr[1]);
+ } else {
+ cp = ip_vs_conn_out_get_v6(iph->nexthdr,
+ &iph->daddr, pptr[1],
+ &iph->saddr, pptr[0]);
+ }
+
+ return cp;
+}
+#endif
+
static int
udp_conn_schedule(struct sk_buff *skb, struct ip_vs_protocol *pp,
@@ -116,6 +168,48 @@ udp_conn_schedule(struct sk_buff *skb, struct ip_vs_protocol *pp,
return 1;
}
+#ifdef CONFIG_IP_VS_IPV6
+static int
+udp_conn_schedule_v6(struct sk_buff *skb, struct ip_vs_protocol *pp,
+ int *verdict, struct ip_vs_conn **cpp)
+{
+ struct ip_vs_service *svc;
+ struct udphdr _udph, *uh;
+
+ uh = skb_header_pointer(skb, sizeof(struct ipv6hdr),
+ sizeof(_udph), &_udph);
+ if (uh == NULL) {
+ *verdict = NF_DROP;
+ return 0;
+ }
+
+ if ((svc = ip_vs_service_get_v6(skb->mark, ipv6_hdr(skb)->nexthdr,
+ &ipv6_hdr(skb)->daddr, uh->dest))) {
+ if (ip_vs_todrop()) {
+ /*
+ * It seems that we are very loaded.
+ * We have to drop this packet :(
+ */
+ ip_vs_service_put(svc);
+ *verdict = NF_DROP;
+ return 0;
+ }
+
+ /*
+ * Let the virtual server select a real server for the
+ * incoming connection, and create a connection entry.
+ */
+ *cpp = ip_vs_schedule_v6(svc, skb);
+ if (!*cpp) {
+ *verdict = ip_vs_leave_v6(svc, skb, pp);
+ return 0;
+ }
+ ip_vs_service_put(svc);
+ }
+ return 1;
+}
+#endif
+
static inline void
udp_fast_csum_update(struct udphdr *uhdr, __be32 oldip, __be32 newip,
@@ -129,6 +223,21 @@ udp_fast_csum_update(struct udphdr *uhdr, __be32 oldip, __be32 newip,
uhdr->check = CSUM_MANGLED_0;
}
+#ifdef CONFIG_IP_VS_IPV6
+static inline void
+udp_fast_csum_update_v6(struct udphdr *uhdr, const struct in6_addr *oldip,
+ const struct in6_addr *newip, __be16 oldport,
+ __be16 newport)
+{
+ uhdr->check =
+ csum_fold(ip_vs_check_diff16(oldip->s6_addr32, newip->s6_addr32,
+ ip_vs_check_diff2(oldport, newport,
+ ~csum_unfold(uhdr->check))));
+ if (!uhdr->check)
+ uhdr->check = CSUM_MANGLED_0;
+}
+#endif
+
static int
udp_snat_handler(struct sk_buff *skb,
struct ip_vs_protocol *pp, struct ip_vs_conn *cp)
@@ -180,6 +289,59 @@ udp_snat_handler(struct sk_buff *skb,
return 1;
}
+#ifdef CONFIG_IP_VS_IPV6
+static int
+udp_snat_handler_v6(struct sk_buff *skb,
+ struct ip_vs_protocol *pp, struct ip_vs_conn *cp)
+{
+ struct udphdr *udph;
+ const unsigned int udphoff = sizeof(struct ipv6hdr);
+
+ /* csum_check_v6 requires unshared skb */
+ if (!skb_make_writable(skb, udphoff+sizeof(*udph)))
+ return 0;
+
+ if (unlikely(cp->app != NULL)) {
+ /* Some checks before mangling */
+ if (pp->csum_check_v6 && !pp->csum_check_v6(skb, pp))
+ return 0;
+
+ /*
+ * Call application helper if needed
+ */
+ if (!ip_vs_app_pkt_out(cp, skb))
+ return 0;
+ }
+
+ udph = (void *)ipv6_hdr(skb) + udphoff;
+ udph->source = cp->vport;
+
+ /*
+ * Adjust UDP checksums
+ */
+ if (!cp->app && (udph->check != 0)) {
+ /* Only port and addr are changed, do fast csum update */
+ udp_fast_csum_update_v6(udph, &cp->daddr.v6, &cp->vaddr.v6,
+ cp->dport, cp->vport);
+ if (skb->ip_summed == CHECKSUM_COMPLETE)
+ skb->ip_summed = CHECKSUM_NONE;
+ } else {
+ /* full checksum calculation */
+ udph->check = 0;
+ skb->csum = skb_checksum(skb, udphoff, skb->len - udphoff, 0);
+ udph->check = csum_ipv6_magic(&cp->vaddr.v6, &cp->caddr.v6,
+ skb->len - udphoff,
+ cp->protocol, skb->csum);
+ if (udph->check == 0)
+ udph->check = CSUM_MANGLED_0;
+ IP_VS_DBG(11, "O-pkt: %s O-csum=%d (+%zd)\n",
+ pp->name, udph->check,
+ (char*)&(udph->check) - (char*)udph);
+ }
+ return 1;
+}
+#endif
+
static int
udp_dnat_handler(struct sk_buff *skb,
@@ -231,6 +393,58 @@ udp_dnat_handler(struct sk_buff *skb,
return 1;
}
+#ifdef CONFIG_IP_VS_IPV6
+static int
+udp_dnat_handler_v6(struct sk_buff *skb,
+ struct ip_vs_protocol *pp, struct ip_vs_conn *cp)
+{
+ struct udphdr *udph;
+ unsigned int udphoff = sizeof(struct ipv6hdr);
+
+ /* csum_check_v6 requires unshared skb */
+ if (!skb_make_writable(skb, udphoff+sizeof(*udph)))
+ return 0;
+
+ if (unlikely(cp->app != NULL)) {
+ /* Some checks before mangling */
+ if (pp->csum_check_v6 && !pp->csum_check_v6(skb, pp))
+ return 0;
+
+ /*
+ * Attempt ip_vs_app call.
+ * It will fix ip_vs_conn
+ */
+ if (!ip_vs_app_pkt_in(cp, skb))
+ return 0;
+ }
+
+ udph = (void *)ipv6_hdr(skb) + udphoff;
+ udph->dest = cp->dport;
+
+ /*
+ * Adjust UDP checksums
+ */
+ if (!cp->app && (udph->check != 0)) {
+ /* Only port and addr are changed, do fast csum update */
+ udp_fast_csum_update_v6(udph, &cp->vaddr.v6, &cp->daddr.v6,
+ cp->vport, cp->dport);
+ if (skb->ip_summed == CHECKSUM_COMPLETE)
+ skb->ip_summed = CHECKSUM_NONE;
+ } else {
+ /* full checksum calculation */
+ udph->check = 0;
+ skb->csum = skb_checksum(skb, udphoff, skb->len - udphoff, 0);
+ udph->check = csum_ipv6_magic(&cp->caddr.v6, &cp->daddr.v6,
+ skb->len - udphoff,
+ cp->protocol, skb->csum);
+ if (udph->check == 0)
+ udph->check = CSUM_MANGLED_0;
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ }
+ return 1;
+}
+#endif
+
static int
udp_csum_check(struct sk_buff *skb, struct ip_vs_protocol *pp)
@@ -266,6 +480,42 @@ udp_csum_check(struct sk_buff *skb, struct ip_vs_protocol *pp)
return 1;
}
+#ifdef CONFIG_IP_VS_IPV6
+static int
+udp_csum_check_v6(struct sk_buff *skb, struct ip_vs_protocol *pp)
+{
+ struct udphdr _udph, *uh;
+ const unsigned int udphoff = sizeof(struct ipv6hdr);
+
+ uh = skb_header_pointer(skb, udphoff, sizeof(_udph), &_udph);
+ if (uh == NULL)
+ return 0;
+
+ if (uh->check != 0) {
+ switch (skb->ip_summed) {
+ case CHECKSUM_NONE:
+ skb->csum = skb_checksum(skb, udphoff,
+ skb->len - udphoff, 0);
+ case CHECKSUM_COMPLETE:
+ if (csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
+ &ipv6_hdr(skb)->daddr,
+ skb->len - udphoff,
+ ipv6_hdr(skb)->nexthdr,
+ skb->csum)) {
+ IP_VS_DBG_RL_PKT(0, pp, skb, 0,
+ "Failed checksum for");
+ return 0;
+ }
+ break;
+ default:
+ /* No need to checksum. */
+ break;
+ }
+ }
+ return 1;
+}
+#endif
+
/*
* Note: the caller guarantees that only one of register_app,
@@ -413,7 +663,6 @@ static void udp_exit(struct ip_vs_protocol *pp)
{
}
-
struct ip_vs_protocol ip_vs_protocol_udp = {
.name = "UDP",
.protocol = IPPROTO_UDP,
@@ -422,11 +671,25 @@ struct ip_vs_protocol ip_vs_protocol_udp = {
.init = udp_init,
.exit = udp_exit,
.conn_schedule = udp_conn_schedule,
+#ifdef CONFIG_IP_VS_IPV6
+ .conn_schedule_v6 = udp_conn_schedule_v6,
+#endif
.conn_in_get = udp_conn_in_get,
.conn_out_get = udp_conn_out_get,
+#ifdef CONFIG_IP_VS_IPV6
+ .conn_in_get_v6 = udp_conn_in_get_v6,
+ .conn_out_get_v6 = udp_conn_out_get_v6,
+#endif
.snat_handler = udp_snat_handler,
.dnat_handler = udp_dnat_handler,
+#ifdef CONFIG_IP_VS_IPV6
+ .snat_handler_v6 = udp_snat_handler_v6,
+ .dnat_handler_v6 = udp_dnat_handler_v6,
+#endif
.csum_check = udp_csum_check,
+#ifdef CONFIG_IP_VS_IPV6
+ .csum_check_v6 = udp_csum_check_v6,
+#endif
.state_transition = udp_state_transition,
.state_name = udp_state_name,
.register_app = udp_register_app,
--
1.5.3.6
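
Note on the CSUM_MANGLED_0 substitution in the UDP handlers: a checksum field of zero on the wire means "no checksum" in UDP, so a computed result of zero has to be sent as 0xFFFF instead. The userspace sketch below shows a full UDP-over-IPv6 checksum with that substitution. It is an illustration under assumptions: sketch_udp6_checksum() and SKETCH_CSUM_MANGLED_0 are hypothetical names, the buffer is a plain byte array rather than an skb, and the caller is assumed to have zeroed the checksum field first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CSUM_MANGLED_0 0xffffu	/* stand-in for the kernel constant */

/* Add len bytes to the accumulator as big-endian 16-bit words. */
static uint32_t sketch_add_bytes(uint32_t acc, const uint8_t *p, size_t len)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		acc += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		acc += (uint32_t)p[len - 1] << 8;	/* pad odd byte with zero */
	return acc;
}

/*
 * UDP checksum over the IPv6 pseudo-header (source, destination,
 * upper-layer length, next header) followed by the UDP header and payload.
 */
static uint16_t sketch_udp6_checksum(const uint8_t src[16],
				     const uint8_t dst[16],
				     const uint8_t *udp, size_t udp_len)
{
	uint32_t acc = 0;
	uint16_t check;

	assert(udp_len >= 8 && udp_len <= 0xffff);
	acc = sketch_add_bytes(acc, src, 16);
	acc = sketch_add_bytes(acc, dst, 16);
	acc += (uint32_t)udp_len;	/* high half of the length field is zero */
	acc += 17;			/* next header: UDP */
	acc = sketch_add_bytes(acc, udp, udp_len);

	while (acc >> 16)
		acc = (acc & 0xffffu) + (acc >> 16);
	check = (uint16_t)~acc;

	/* Zero would be read as "no checksum", so send all ones instead. */
	return check ? check : SKETCH_CSUM_MANGLED_0;
}
```

Running the same function over a datagram whose checksum field is already filled in yields 0xFFFF for a valid packet, since the raw result of zero is mangled on the way out.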
* [PATCH 11/26] IPVS: Add supports_ipv6 flag to schedulers.
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add supports_ipv6 flag to struct ip_vs_scheduler to indicate whether a
scheduler supports IPv6. Set the flag to 1 in schedulers that work with
IPv6, 0 otherwise.
Signed-off-by: Julius R. Volz <juliusv@google.com>
11 files changed, 33 insertions(+), 0 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index a6e7438..2ab5d59 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -687,6 +687,9 @@ struct ip_vs_scheduler {
char *name; /* scheduler name */
atomic_t refcnt; /* reference counter */
struct module *module; /* THIS_MODULE/NULL */
+#ifdef CONFIG_IP_VS_IPV6
+ int supports_ipv6; /* scheduler has IPv6 support */
+#endif
/* scheduler initializing service */
int (*init_service)(struct ip_vs_service *svc);
diff --git a/net/netfilter/ipvs/ip_vs_dh.c b/net/netfilter/ipvs/ip_vs_dh.c
index dcf5d46..25a0ac6 100644
--- a/net/netfilter/ipvs/ip_vs_dh.c
+++ b/net/netfilter/ipvs/ip_vs_dh.c
@@ -235,6 +235,9 @@ static struct ip_vs_scheduler ip_vs_dh_scheduler =
.name = "dh",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+#ifdef CONFIG_IP_VS_IPV6
+ .supports_ipv6 = 0,
+#endif
.init_service = ip_vs_dh_init_svc,
.done_service = ip_vs_dh_done_svc,
.update_service = ip_vs_dh_update_svc,
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 3888642..55c2417 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -541,6 +541,9 @@ static struct ip_vs_scheduler ip_vs_lblc_scheduler =
.name = "lblc",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+#ifdef CONFIG_IP_VS_IPV6
+ .supports_ipv6 = 0,
+#endif
.init_service = ip_vs_lblc_init_svc,
.done_service = ip_vs_lblc_done_svc,
.update_service = ip_vs_lblc_update_svc,
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index daa260e..5b94f5f 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -730,6 +730,9 @@ static struct ip_vs_scheduler ip_vs_lblcr_scheduler =
.name = "lblcr",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+#ifdef CONFIG_IP_VS_IPV6
+ .supports_ipv6 = 0,
+#endif
.init_service = ip_vs_lblcr_init_svc,
.done_service = ip_vs_lblcr_done_svc,
.update_service = ip_vs_lblcr_update_svc,
diff --git a/net/netfilter/ipvs/ip_vs_lc.c b/net/netfilter/ipvs/ip_vs_lc.c
index e1214d1..d64b17f 100644
--- a/net/netfilter/ipvs/ip_vs_lc.c
+++ b/net/netfilter/ipvs/ip_vs_lc.c
@@ -107,6 +107,9 @@ static struct ip_vs_scheduler ip_vs_lc_scheduler = {
.name = "lc",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+#ifdef CONFIG_IP_VS_IPV6
+ .supports_ipv6 = 1,
+#endif
.init_service = ip_vs_lc_init_svc,
.done_service = ip_vs_lc_done_svc,
.update_service = ip_vs_lc_update_svc,
diff --git a/net/netfilter/ipvs/ip_vs_nq.c b/net/netfilter/ipvs/ip_vs_nq.c
index 5de2e34..aaa6321 100644
--- a/net/netfilter/ipvs/ip_vs_nq.c
+++ b/net/netfilter/ipvs/ip_vs_nq.c
@@ -146,6 +146,9 @@ static struct ip_vs_scheduler ip_vs_nq_scheduler =
.name = "nq",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+#ifdef CONFIG_IP_VS_IPV6
+ .supports_ipv6 = 1,
+#endif
.init_service = ip_vs_nq_init_svc,
.done_service = ip_vs_nq_done_svc,
.update_service = ip_vs_nq_update_svc,
diff --git a/net/netfilter/ipvs/ip_vs_rr.c b/net/netfilter/ipvs/ip_vs_rr.c
index 433f8a9..3f78cc8 100644
--- a/net/netfilter/ipvs/ip_vs_rr.c
+++ b/net/netfilter/ipvs/ip_vs_rr.c
@@ -96,6 +96,9 @@ static struct ip_vs_scheduler ip_vs_rr_scheduler = {
.name = "rr", /* name */
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+#ifdef CONFIG_IP_VS_IPV6
+ .supports_ipv6 = 1,
+#endif
.init_service = ip_vs_rr_init_svc,
.done_service = ip_vs_rr_done_svc,
.update_service = ip_vs_rr_update_svc,
diff --git a/net/netfilter/ipvs/ip_vs_sed.c b/net/netfilter/ipvs/ip_vs_sed.c
index e7bc810..99995cf 100644
--- a/net/netfilter/ipvs/ip_vs_sed.c
+++ b/net/netfilter/ipvs/ip_vs_sed.c
@@ -147,6 +147,9 @@ static struct ip_vs_scheduler ip_vs_sed_scheduler =
.name = "sed",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+#ifdef CONFIG_IP_VS_IPV6
+ .supports_ipv6 = 1,
+#endif
.init_service = ip_vs_sed_init_svc,
.done_service = ip_vs_sed_done_svc,
.update_service = ip_vs_sed_update_svc,
diff --git a/net/netfilter/ipvs/ip_vs_sh.c b/net/netfilter/ipvs/ip_vs_sh.c
index 1b25b00..49ef452 100644
--- a/net/netfilter/ipvs/ip_vs_sh.c
+++ b/net/netfilter/ipvs/ip_vs_sh.c
@@ -232,6 +232,9 @@ static struct ip_vs_scheduler ip_vs_sh_scheduler =
.name = "sh",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+#ifdef CONFIG_IP_VS_IPV6
+ .supports_ipv6 = 0,
+#endif
.init_service = ip_vs_sh_init_svc,
.done_service = ip_vs_sh_done_svc,
.update_service = ip_vs_sh_update_svc,
diff --git a/net/netfilter/ipvs/ip_vs_wlc.c b/net/netfilter/ipvs/ip_vs_wlc.c
index ff003a7..88f571b 100644
--- a/net/netfilter/ipvs/ip_vs_wlc.c
+++ b/net/netfilter/ipvs/ip_vs_wlc.c
@@ -135,6 +135,9 @@ static struct ip_vs_scheduler ip_vs_wlc_scheduler =
.name = "wlc",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+#ifdef CONFIG_IP_VS_IPV6
+ .supports_ipv6 = 1,
+#endif
.init_service = ip_vs_wlc_init_svc,
.done_service = ip_vs_wlc_done_svc,
.update_service = ip_vs_wlc_update_svc,
diff --git a/net/netfilter/ipvs/ip_vs_wrr.c b/net/netfilter/ipvs/ip_vs_wrr.c
index 3f61ab2..1bc4aad 100644
--- a/net/netfilter/ipvs/ip_vs_wrr.c
+++ b/net/netfilter/ipvs/ip_vs_wrr.c
@@ -221,6 +221,9 @@ static struct ip_vs_scheduler ip_vs_wrr_scheduler = {
.name = "wrr",
.refcnt = ATOMIC_INIT(0),
.module = THIS_MODULE,
+#ifdef CONFIG_IP_VS_IPV6
+ .supports_ipv6 = 1,
+#endif
.init_service = ip_vs_wrr_init_svc,
.done_service = ip_vs_wrr_done_svc,
.update_service = ip_vs_wrr_update_svc,
--
1.5.3.6
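
Note on how the supports_ipv6 flag could be consumed: this patch only adds the flag, and the code that reads it is not part of this message. The sketch below is therefore an assumption about one plausible use, namely refusing to bind a scheduler without IPv6 support to an IPv6 service. All sketch_* types and the function name are hypothetical, as is the choice of -EAFNOSUPPORT as the error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical, trimmed-down stand-ins for the IPVS structures. */
struct sketch_scheduler {
	const char *name;
	int supports_ipv6;	/* mirrors the flag added by this patch */
};

struct sketch_service {
	int af;			/* AF_INET or AF_INET6 */
	const struct sketch_scheduler *sched;
};

/* Bind a scheduler to a service, rejecting unsupported IPv6 pairings. */
static int sketch_bind_scheduler(struct sketch_service *svc,
				 const struct sketch_scheduler *sched)
{
	assert(svc != NULL && sched != NULL);

	if (svc->af == AF_INET6 && !sched->supports_ipv6)
		return -EAFNOSUPPORT;

	svc->sched = sched;
	return 0;
}
```

With the values set in this patch, an IPv6 service could bind "rr" or "wlc" under such a check but not "sh", "dh", "lblc" or "lblcr"; IPv4 services would be unaffected.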
* [PATCH 12/26] IPVS: Extend proto handler debug functions to handle IPv6.
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Extend protocol handler packet debug functions for TCP, UDP, AH and ESP to
handle IPv6. Make the main debug function call either a v4 or v6 version,
depending on the packet protocol version.
Signed-off-by: Julius R. Volz <juliusv@google.com>
3 files changed, 127 insertions(+), 10 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c
index 4b1c16c..8b82400 100644
--- a/net/netfilter/ipvs/ip_vs_proto.c
+++ b/net/netfilter/ipvs/ip_vs_proto.c
@@ -154,10 +154,10 @@ const char * ip_vs_state_name(__u16 proto, int state)
void
-ip_vs_tcpudp_debug_packet(struct ip_vs_protocol *pp,
- const struct sk_buff *skb,
- int offset,
- const char *msg)
+ip_vs_tcpudp_debug_packet_v4(struct ip_vs_protocol *pp,
+ const struct sk_buff *skb,
+ int offset,
+ const char *msg)
{
char buf[128];
struct iphdr _iph, *ih;
@@ -170,8 +170,8 @@ ip_vs_tcpudp_debug_packet(struct ip_vs_protocol *pp,
pp->name, NIPQUAD(ih->saddr),
NIPQUAD(ih->daddr));
else {
- __be16 _ports[2], *pptr
-;
+ __be16 _ports[2], *pptr;
+
pptr = skb_header_pointer(skb, offset + ih->ihl*4,
sizeof(_ports), _ports);
if (pptr == NULL)
@@ -191,6 +191,60 @@ ip_vs_tcpudp_debug_packet(struct ip_vs_protocol *pp,
printk(KERN_DEBUG "IPVS: %s: %s\n", msg, buf);
}
+#ifdef CONFIG_IP_VS_IPV6
+void
+ip_vs_tcpudp_debug_packet_v6(struct ip_vs_protocol *pp,
+ const struct sk_buff *skb,
+ int offset,
+ const char *msg)
+{
+ char buf[192];
+ struct ipv6hdr _iph, *ih;
+
+ ih = skb_header_pointer(skb, offset, sizeof(_iph), &_iph);
+ if (ih == NULL)
+ sprintf(buf, "%s TRUNCATED", pp->name);
+ else if (ih->nexthdr == IPPROTO_FRAGMENT)
+ sprintf(buf, "%s " NIP6_FMT "->" NIP6_FMT " frag",
+ pp->name, NIP6(ih->saddr),
+ NIP6(ih->daddr));
+ else {
+ __be16 _ports[2], *pptr;
+
+ pptr = skb_header_pointer(skb, offset + sizeof(struct ipv6hdr),
+ sizeof(_ports), _ports);
+ if (pptr == NULL)
+ sprintf(buf, "%s TRUNCATED " NIP6_FMT "->" NIP6_FMT,
+ pp->name,
+ NIP6(ih->saddr),
+ NIP6(ih->daddr));
+ else
+ sprintf(buf, "%s " NIP6_FMT ":%u->" NIP6_FMT ":%u",
+ pp->name,
+ NIP6(ih->saddr),
+ ntohs(pptr[0]),
+ NIP6(ih->daddr),
+ ntohs(pptr[1]));
+ }
+
+ printk(KERN_DEBUG "IPVS: %s: %s\n", msg, buf);
+}
+#endif
+
+
+void
+ip_vs_tcpudp_debug_packet(struct ip_vs_protocol *pp,
+ const struct sk_buff *skb,
+ int offset,
+ const char *msg)
+{
+#ifdef CONFIG_IP_VS_IPV6
+ if (skb->protocol == htons(ETH_P_IPV6))
+ ip_vs_tcpudp_debug_packet_v6(pp, skb, offset, msg);
+ else
+#endif
+ ip_vs_tcpudp_debug_packet_v4(pp, skb, offset, msg);
+}
int ip_vs_protocol_init(void)
{
diff --git a/net/netfilter/ipvs/ip_vs_proto_ah.c b/net/netfilter/ipvs/ip_vs_proto_ah.c
index 674c9d8..7c727d7 100644
--- a/net/netfilter/ipvs/ip_vs_proto_ah.c
+++ b/net/netfilter/ipvs/ip_vs_proto_ah.c
@@ -202,8 +202,8 @@ ah_conn_schedule(struct sk_buff *skb,
static void
-ah_debug_packet(struct ip_vs_protocol *pp, const struct sk_buff *skb,
- int offset, const char *msg)
+ah_debug_packet_v4(struct ip_vs_protocol *pp, const struct sk_buff *skb,
+ int offset, const char *msg)
{
char buf[256];
struct iphdr _iph, *ih;
@@ -219,6 +219,38 @@ ah_debug_packet(struct ip_vs_protocol *pp, const struct sk_buff *skb,
printk(KERN_DEBUG "IPVS: %s: %s\n", msg, buf);
}
+#ifdef CONFIG_IP_VS_IPV6
+static void
+ah_debug_packet_v6(struct ip_vs_protocol *pp, const struct sk_buff *skb,
+ int offset, const char *msg)
+{
+ char buf[256];
+ struct ipv6hdr _iph, *ih;
+
+ ih = skb_header_pointer(skb, offset, sizeof(_iph), &_iph);
+ if (ih == NULL)
+ sprintf(buf, "%s TRUNCATED", pp->name);
+ else
+ sprintf(buf, "%s " NIP6_FMT "->" NIP6_FMT,
+ pp->name, NIP6(ih->saddr),
+ NIP6(ih->daddr));
+
+ printk(KERN_DEBUG "IPVS: %s: %s\n", msg, buf);
+}
+#endif
+
+static void
+ah_debug_packet(struct ip_vs_protocol *pp, const struct sk_buff *skb,
+ int offset, const char *msg)
+{
+#ifdef CONFIG_IP_VS_IPV6
+ if (skb->protocol == htons(ETH_P_IPV6))
+ ah_debug_packet_v6(pp, skb, offset, msg);
+ else
+#endif
+ ah_debug_packet_v4(pp, skb, offset, msg);
+}
+
static void ah_init(struct ip_vs_protocol *pp)
{
diff --git a/net/netfilter/ipvs/ip_vs_proto_esp.c b/net/netfilter/ipvs/ip_vs_proto_esp.c
index 5113df4..c7606c7 100644
--- a/net/netfilter/ipvs/ip_vs_proto_esp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_esp.c
@@ -201,8 +201,8 @@ esp_conn_schedule(struct sk_buff *skb, struct ip_vs_protocol *pp,
static void
-esp_debug_packet(struct ip_vs_protocol *pp, const struct sk_buff *skb,
- int offset, const char *msg)
+esp_debug_packet_v4(struct ip_vs_protocol *pp, const struct sk_buff *skb,
+ int offset, const char *msg)
{
char buf[256];
struct iphdr _iph, *ih;
@@ -218,6 +218,37 @@ esp_debug_packet(struct ip_vs_protocol *pp, const struct sk_buff *skb,
printk(KERN_DEBUG "IPVS: %s: %s\n", msg, buf);
}
+#ifdef CONFIG_IP_VS_IPV6
+static void
+esp_debug_packet_v6(struct ip_vs_protocol *pp, const struct sk_buff *skb,
+ int offset, const char *msg)
+{
+ char buf[256];
+ struct ipv6hdr _iph, *ih;
+
+ ih = skb_header_pointer(skb, offset, sizeof(_iph), &_iph);
+ if (ih == NULL)
+ sprintf(buf, "%s TRUNCATED", pp->name);
+ else
+ sprintf(buf, "%s " NIP6_FMT "->" NIP6_FMT,
+ pp->name, NIP6(ih->saddr),
+ NIP6(ih->daddr));
+
+ printk(KERN_DEBUG "IPVS: %s: %s\n", msg, buf);
+}
+#endif
+
+static void
+esp_debug_packet(struct ip_vs_protocol *pp, const struct sk_buff *skb,
+ int offset, const char *msg)
+{
+#ifdef CONFIG_IP_VS_IPV6
+ if (skb->protocol == htons(ETH_P_IPV6))
+ esp_debug_packet_v6(pp, skb, offset, msg);
+ else
+#endif
+ esp_debug_packet_v4(pp, skb, offset, msg);
+}
static void esp_init(struct ip_vs_protocol *pp)
{
--
1.5.3.6
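
Note on the v4/v6 debug split: the debug helpers above choose a formatter by address family and write into fixed-size buffers with sprintf(). The userspace sketch below shows the same dispatch with bounded formatting. It is an illustration only: sketch_format_flow() is a hypothetical name, it uses inet_ntop() and snprintf() from libc rather than the kernel's NIP6_FMT/NIPQUAD macros, and inet_ntop() prints the compressed IPv6 form where NIP6_FMT prints all eight groups.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format "src:port->dst:port" for either address family.
 * Ports are in host byte order. Returns the snprintf() result,
 * or -1 if an address cannot be converted.
 */
static int sketch_format_flow(char *buf, size_t len, int af,
			      const void *saddr, uint16_t sport,
			      const void *daddr, uint16_t dport)
{
	char s[INET6_ADDRSTRLEN], d[INET6_ADDRSTRLEN];

	assert(af == AF_INET || af == AF_INET6);

	if (!inet_ntop(af, saddr, s, sizeof(s)) ||
	    !inet_ntop(af, daddr, d, sizeof(d)))
		return -1;

	if (af == AF_INET6)	/* brackets keep address and port apart */
		return snprintf(buf, len, "[%s]:%u->[%s]:%u",
				s, (unsigned)sport, d, (unsigned)dport);

	return snprintf(buf, len, "%s:%u->%s:%u",
			s, (unsigned)sport, d, (unsigned)dport);
}
```

Passing the buffer length explicitly means a long protocol name or address cannot overrun the buffer; the output is truncated instead.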
* [PATCH 13/26] IPVS: Turn off FTP application helper for IPv6.
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Immediately return from FTP application helper and do nothing when dealing
with IPv6 packets. IPv6 is not supported by this helper yet.
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 16 insertions(+), 0 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index 6542fa9..6b02b5f 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -149,6 +149,14 @@ static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
unsigned buf_len;
int ret;
+#ifdef CONFIG_IP_VS_IPV6
+ /* This application helper doesn't work with IPv6 yet,
+ * so turn this into a no-op for IPv6 packets
+ */
+ if (cp->af == AF_INET6)
+ return 1;
+#endif
+
*diff = 0;
/* Only useful for established sessions */
@@ -249,6 +257,14 @@ static int ip_vs_ftp_in(struct ip_vs_app *app, struct ip_vs_conn *cp,
__be16 port;
struct ip_vs_conn *n_cp;
+#ifdef CONFIG_IP_VS_IPV6
+ /* This application helper doesn't work with IPv6 yet,
+ * so turn this into a no-op for IPv6 packets
+ */
+ if (cp->af == AF_INET6)
+ return 1;
+#endif
+
/* no diff required for incoming packets */
*diff = 0;
--
1.5.3.6
* [PATCH 14/26] IPVS: Extend xmit routing cache to support IPv6.
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Extend xmit destination routing cache to support lookup of IPv6
destinations.
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 74 insertions(+), 0 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 6b6ce6b..94b8fcb 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -22,6 +22,9 @@
#include <net/udp.h>
#include <net/icmp.h> /* for icmp_send */
#include <net/route.h> /* for ip_route_output */
+#include <net/ipv6.h>
+#include <net/ip6_route.h>
+#include <linux/icmpv6.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
@@ -59,6 +62,25 @@ __ip_vs_dst_check(struct ip_vs_dest *dest, u32 rtos, u32 cookie)
return dst;
}
+#ifdef CONFIG_IP_VS_IPV6
+static inline struct dst_entry *
+__ip_vs_dst_check_v6(struct ip_vs_dest *dest, u32 cookie)
+{
+ struct dst_entry *dst = dest->dst_cache;
+
+ if (!dst)
+ return NULL;
+ if ((dst->obsolete) &&
+ dst->ops->check(dst, cookie) == NULL) {
+ dest->dst_cache = NULL;
+ dst_release(dst);
+ return NULL;
+ }
+ dst_hold(dst);
+ return dst;
+}
+#endif
+
static struct rtable *
__ip_vs_get_out_rt(struct ip_vs_conn *cp, u32 rtos)
{
@@ -111,6 +133,58 @@ __ip_vs_get_out_rt(struct ip_vs_conn *cp, u32 rtos)
return rt;
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct rt6_info *
+__ip_vs_get_out_rt_v6(struct ip_vs_conn *cp)
+{
+ struct rt6_info *rt; /* Route to the other host */
+ struct ip_vs_dest *dest = cp->dest;
+
+ if (dest) {
+ spin_lock(&dest->dst_lock);
+ if (!(rt = (struct rt6_info *)
+ __ip_vs_dst_check_v6(dest, 0))) {
+ struct flowi fl = {
+ .oif = 0,
+ .nl_u = {
+ .ip6_u = {
+ .daddr = dest->addr.v6,
+ .saddr = { .s6_addr32 = {0, 0, 0, 0} }, } },
+ };
+
+ if (!(rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl))) {
+ spin_unlock(&dest->dst_lock);
+ IP_VS_DBG_RL("ip6_route_output error, "
+ "dest: " NIP6_FMT "\n",
+ NIP6(dest->addr.v6));
+ return NULL;
+ }
+ __ip_vs_dst_set(dest, 0, dst_clone(&rt->u.dst));
+ IP_VS_DBG(10, "new dst " NIP6_FMT ", refcnt=%d\n",
+ NIP6(dest->addr.v6),
+ atomic_read(&rt->u.dst.__refcnt));
+ }
+ spin_unlock(&dest->dst_lock);
+ } else {
+ struct flowi fl = {
+ .oif = 0,
+ .nl_u = {
+ .ip6_u = {
+ .daddr = cp->daddr.v6,
+ .saddr = { .s6_addr32 = {0, 0, 0, 0}}, } },
+ };
+
+ if (!(rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl))) {
+ IP_VS_DBG_RL("ip6_route_output error, dest: "
+ NIP6_FMT "\n", NIP6(cp->daddr.v6));
+ return NULL;
+ }
+ }
+
+ return rt;
+}
+#endif
+
/*
* Release dest->dst_cache before a dest is removed
--
1.5.3.6
* [PATCH 15/26] IPVS: Modify IP_VS_XMIT() to support IPv6.
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Modify IP_VS_XMIT() to call the PF_INET6 local output hook instead of the
PF_INET one when sending IPv6 packets. The protocol family is passed as a
new first argument, pf.
Existing uses are converted to call IP_VS_XMIT(PF_INET, ...).
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 6 insertions(+), 6 deletions(-)
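
Note: the change amounts to adding a family parameter to a statement-like macro. A small user-space sketch of that shape is shown here. It is not kernel code; `enum family`, `struct packet`, `run_hook()` and `XMIT()` are hypothetical stand-ins for PF_INET/PF_INET6, the skb, NF_HOOK() and IP_VS_XMIT().

```c
/* Hypothetical stand-ins for PF_INET and PF_INET6. */
enum family { FAMILY_V4, FAMILY_V6 };

struct packet {
	int marked;		/* stand-in for skb->ipvs_property */
	enum family sent_via;	/* records which hook saw the packet */
};

/* Stand-in for NF_HOOK(): just records the family it was called with. */
static void run_hook(enum family pf, struct packet *pkt)
{
	pkt->sent_via = pf;
}

/*
 * The do { } while (0) wrapper makes the macro behave like a single
 * statement, so it is safe in an unbraced if/else. Arguments are
 * parenthesized so that expressions can be passed.
 */
#define XMIT(pf, pkt)			\
do {					\
	(pkt)->marked = 1;		\
	run_hook((pf), (pkt));		\
} while (0)

/* Callers pick the family explicitly, as the converted call sites do. */
static void send_packet(struct packet *pkt, int is_v6)
{
	if (is_v6)
		XMIT(FAMILY_V6, pkt);
	else
		XMIT(FAMILY_V4, pkt);
}
```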
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 94b8fcb..029d9ec 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -199,11 +199,11 @@ ip_vs_dst_reset(struct ip_vs_dest *dest)
dst_release(old_dst);
}
-#define IP_VS_XMIT(skb, rt) \
+#define IP_VS_XMIT(pf, skb, rt) \
do { \
(skb)->ipvs_property = 1; \
skb_forward_csum(skb); \
- NF_HOOK(PF_INET, NF_INET_LOCAL_OUT, (skb), NULL, \
+ NF_HOOK(pf, NF_INET_LOCAL_OUT, (skb), NULL, \
(rt)->u.dst.dev, dst_output); \
} while (0)
@@ -276,7 +276,7 @@ ip_vs_bypass_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
/* Another hack: avoid icmp_send in ip_fragment */
skb->local_df = 1;
- IP_VS_XMIT(skb, rt);
+ IP_VS_XMIT(PF_INET, skb, rt);
LeaveFunction(10);
return NF_STOLEN;
@@ -352,7 +352,7 @@ ip_vs_nat_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
/* Another hack: avoid icmp_send in ip_fragment */
skb->local_df = 1;
- IP_VS_XMIT(skb, rt);
+ IP_VS_XMIT(PF_INET, skb, rt);
LeaveFunction(10);
return NF_STOLEN;
@@ -543,7 +543,7 @@ ip_vs_dr_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
/* Another hack: avoid icmp_send in ip_fragment */
skb->local_df = 1;
- IP_VS_XMIT(skb, rt);
+ IP_VS_XMIT(PF_INET, skb, rt);
LeaveFunction(10);
return NF_STOLEN;
@@ -616,7 +616,7 @@ ip_vs_icmp_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
/* Another hack: avoid icmp_send in ip_fragment */
skb->local_df = 1;
- IP_VS_XMIT(skb, rt);
+ IP_VS_XMIT(PF_INET, skb, rt);
rc = NF_STOLEN;
goto out;
--
1.5.3.6
* [PATCH 16/26] IPVS: Add IPv6 xmit forwarding functions.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (14 preceding siblings ...)
2008-06-11 17:11 ` [PATCH 15/26] IPVS: Modify IP_VS_XMIT() " Julius R. Volz
@ 2008-06-11 17:11 ` Julius R. Volz
2008-06-12 1:55 ` Brian Haley
2008-06-11 17:12 ` [PATCH 17/26] IPVS: Add connection hashing function for IPv6 entries Julius R. Volz
` (10 subsequent siblings)
26 siblings, 1 reply; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:11 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add IPv6 xmit functions where these are different from IPv4 versions. Add
an ip_vs_bind_xmit_v6() function that binds these functions to an IPv6
connection entry.
Signed-off-by: Julius R. Volz <juliusv@google.com>
3 files changed, 407 insertions(+), 0 deletions(-)
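
Note: for the tunnel case, the outer IPv6 header carries the whole inner packet as payload, so its payload length is the inner payload length plus the 40-byte inner header, and the inner packet must fit into the path MTU minus one outer header. Because payload_len is stored in network byte order, the arithmetic has to be done in host order. A user-space sketch under those assumptions follows; `outer_payload_len()` and `fits_in_tunnel()` are hypothetical helper names, not functions from this patch.

```c
#include <arpa/inet.h>	/* htons, ntohs */
#include <stdint.h>

#define IPV6_HDR_LEN 40u	/* fixed IPv6 header size in bytes */

/*
 * Outer payload length (network byte order) for an IPv6-in-IPv6 packet:
 * inner payload plus the inner header. Values above 65535 are truncated
 * by the cast; a real implementation would reject such packets earlier.
 */
static uint16_t outer_payload_len(uint16_t inner_payload_len_be)
{
	return htons((uint16_t)(ntohs(inner_payload_len_be) + IPV6_HDR_LEN));
}

/*
 * Non-zero if the inner packet (header plus payload) fits into the
 * path MTU once room for one outer IPv6 header has been subtracted.
 */
static int fits_in_tunnel(uint16_t inner_payload_len_be, unsigned int path_mtu)
{
	unsigned int inner_total = ntohs(inner_payload_len_be) + IPV6_HDR_LEN;

	return path_mtu >= IPV6_HDR_LEN &&
	       inner_total <= path_mtu - IPV6_HDR_LEN;
}
```

With a 1500-byte path MTU, the largest inner packet that still fits is 1460 bytes, i.e. an inner payload of 1420 bytes.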
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 2ab5d59..d04d5c6 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1042,6 +1042,18 @@ extern int ip_vs_icmp_xmit
(struct sk_buff *skb, struct ip_vs_conn *cp, struct ip_vs_protocol *pp, int offset);
extern void ip_vs_dst_reset(struct ip_vs_dest *dest);
+#ifdef CONFIG_IP_VS_IPV6
+extern int ip_vs_bypass_xmit_v6
+(struct sk_buff *skb, struct ip_vs_conn *cp, struct ip_vs_protocol *pp);
+extern int ip_vs_nat_xmit_v6
+(struct sk_buff *skb, struct ip_vs_conn *cp, struct ip_vs_protocol *pp);
+extern int ip_vs_tunnel_xmit_v6
+(struct sk_buff *skb, struct ip_vs_conn *cp, struct ip_vs_protocol *pp);
+extern int ip_vs_dr_xmit_v6
+(struct sk_buff *skb, struct ip_vs_conn *cp, struct ip_vs_protocol *pp);
+extern int ip_vs_icmp_xmit_v6
+(struct sk_buff *skb, struct ip_vs_conn *cp, struct ip_vs_protocol *pp, int offset);
+#endif
/*
* This is a simple mechanism to ignore packets when
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index b3df938..1a4040d 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -374,6 +374,33 @@ static inline void ip_vs_bind_xmit(struct ip_vs_conn *cp)
}
}
+#ifdef CONFIG_IP_VS_IPV6
+static inline void ip_vs_bind_xmit_v6(struct ip_vs_conn *cp)
+{
+ switch (IP_VS_FWD_METHOD(cp)) {
+ case IP_VS_CONN_F_MASQ:
+ cp->packet_xmit = ip_vs_nat_xmit_v6;
+ break;
+
+ case IP_VS_CONN_F_TUNNEL:
+ cp->packet_xmit = ip_vs_tunnel_xmit_v6;
+ break;
+
+ case IP_VS_CONN_F_DROUTE:
+ cp->packet_xmit = ip_vs_dr_xmit_v6;
+ break;
+
+ case IP_VS_CONN_F_LOCALNODE:
+ cp->packet_xmit = ip_vs_null_xmit;
+ break;
+
+ case IP_VS_CONN_F_BYPASS:
+ cp->packet_xmit = ip_vs_bypass_xmit_v6;
+ break;
+ }
+}
+#endif
+
static inline int ip_vs_dest_totalconns(struct ip_vs_dest *dest)
{
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 029d9ec..9d2c424 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -289,6 +289,68 @@ ip_vs_bypass_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
return NF_STOLEN;
}
+#ifdef CONFIG_IP_VS_IPV6
+int
+ip_vs_bypass_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
+ struct ip_vs_protocol *pp)
+{
+ struct rt6_info *rt; /* Route to the other host */
+ struct ipv6hdr *iph = ipv6_hdr(skb);
+ int mtu;
+ struct flowi fl = {
+ .oif = 0,
+ .nl_u = {
+ .ip6_u = {
+ .daddr = iph->daddr,
+ .saddr = { .s6_addr32 = {0, 0, 0, 0} }, } },
+ };
+
+ EnterFunction(10);
+
+ if (!(rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl))) {
+ IP_VS_DBG_RL("ip_vs_bypass_xmit_v6(): ip6_route_output error, "
+ "dest: " NIP6_FMT "\n", NIP6(iph->daddr));
+ goto tx_error_icmp;
+ }
+
+ /* MTU checking */
+ mtu = dst_mtu(&rt->u.dst);
+ if (skb->len > mtu) {
+ dst_release(&rt->u.dst);
+ icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu, skb->dev);
+ IP_VS_DBG_RL("ip_vs_bypass_xmit_v6(): frag needed\n");
+ goto tx_error;
+ }
+
+ /*
+ * IPv6 has no header checksum, so there is no ip_send_check()
+ * equivalent to call here. Is copy-on-write needed?
+ */
+ if (unlikely((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL)) {
+ dst_release(&rt->u.dst);
+ return NF_STOLEN;
+ }
+
+ /* drop old route */
+ dst_release(skb->dst);
+ skb->dst = &rt->u.dst;
+
+ /* Another hack: avoid icmp_send in ip_fragment */
+ skb->local_df = 1;
+
+ IP_VS_XMIT(PF_INET6, skb, rt);
+
+ LeaveFunction(10);
+ return NF_STOLEN;
+
+ tx_error_icmp:
+ dst_link_failure(skb);
+ tx_error:
+ kfree_skb(skb);
+ LeaveFunction(10);
+ return NF_STOLEN;
+}
+#endif
/*
* NAT transmitter (only for outside-to-inside nat forwarding)
@@ -368,6 +430,80 @@ ip_vs_nat_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
goto tx_error;
}
+#ifdef CONFIG_IP_VS_IPV6
+int
+ip_vs_nat_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
+ struct ip_vs_protocol *pp)
+{
+ struct rt6_info *rt; /* Route to the other host */
+ int mtu;
+
+ EnterFunction(10);
+
+ /* check if it is a connection of no-client-port */
+ if (unlikely(cp->flags & IP_VS_CONN_F_NO_CPORT)) {
+ __be16 _pt, *p;
+ p = skb_header_pointer(skb, sizeof(struct ipv6hdr), sizeof(_pt), &_pt);
+ if (p == NULL)
+ goto tx_error;
+ ip_vs_conn_fill_cport(cp, *p);
+ IP_VS_DBG(10, "filled cport=%d\n", ntohs(*p));
+ }
+
+ if (!(rt = __ip_vs_get_out_rt_v6(cp)))
+ goto tx_error_icmp;
+
+ /* MTU checking */
+ mtu = dst_mtu(&rt->u.dst);
+ if (skb->len > mtu) {
+ dst_release(&rt->u.dst);
+ icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu, skb->dev);
+ IP_VS_DBG_RL_PKT(0, pp, skb, 0, "ip_vs_nat_xmit_v6(): frag needed for");
+ goto tx_error;
+ }
+
+ /* copy-on-write the packet before mangling it */
+ if (!skb_make_writable(skb, sizeof(struct ipv6hdr)))
+ goto tx_error_put;
+
+ if (skb_cow(skb, rt->u.dst.dev->hard_header_len))
+ goto tx_error_put;
+
+ /* drop old route */
+ dst_release(skb->dst);
+ skb->dst = &rt->u.dst;
+
+ /* mangle the packet */
+ if (pp->dnat_handler_v6 && !pp->dnat_handler_v6(skb, pp, cp))
+ goto tx_error;
+ ipv6_hdr(skb)->daddr = cp->daddr.v6;
+
+ IP_VS_DBG_PKT(10, pp, skb, 0, "After DNAT");
+
+ /* FIXME: when application helper enlarges the packet and the length
+ is larger than the MTU of outgoing device, there will be still
+ MTU problem. */
+
+ /* Another hack: avoid icmp_send in ip_fragment */
+ skb->local_df = 1;
+
+ IP_VS_XMIT(PF_INET6, skb, rt);
+
+ LeaveFunction(10);
+ return NF_STOLEN;
+
+ tx_error_icmp:
+ dst_link_failure(skb);
+ tx_error:
+ LeaveFunction(10);
+ kfree_skb(skb);
+ return NF_STOLEN;
+ tx_error_put:
+ dst_release(&rt->u.dst);
+ goto tx_error;
+}
+#endif
+
/*
* IP Tunneling transmitter
@@ -499,6 +635,111 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
return NF_STOLEN;
}
+#ifdef CONFIG_IP_VS_IPV6
+int
+ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
+ struct ip_vs_protocol *pp)
+{
+ struct rt6_info *rt; /* Route to the other host */
+ struct net_device *tdev; /* Device to other host */
+ struct ipv6hdr *old_iph = ipv6_hdr(skb);
+ sk_buff_data_t old_transport_header = skb->transport_header;
+ struct ipv6hdr *iph; /* Our new IP header */
+ unsigned int max_headroom; /* The extra header space needed */
+ int mtu;
+
+ EnterFunction(10);
+
+ if (skb->protocol != htons(ETH_P_IPV6)) {
+ IP_VS_DBG_RL("ip_vs_tunnel_xmit_v6(): protocol error, "
+ "ETH_P_IPV6: %d, skb protocol: %d\n",
+ htons(ETH_P_IPV6), skb->protocol);
+ goto tx_error;
+ }
+
+ if (!(rt = __ip_vs_get_out_rt_v6(cp)))
+ goto tx_error_icmp;
+
+ tdev = rt->u.dst.dev;
+
+ mtu = dst_mtu(&rt->u.dst) - sizeof(struct ipv6hdr);
+ /* TODO IPv6: do we need this check in IPv6? */
+ if (mtu < 1280) {
+ dst_release(&rt->u.dst);
+ IP_VS_DBG_RL("ip_vs_tunnel_xmit_v6(): mtu less than 1280\n");
+ goto tx_error;
+ }
+ if (skb->dst)
+ skb->dst->ops->update_pmtu(skb->dst, mtu);
+
+ if (mtu < ntohs(old_iph->payload_len) + sizeof(struct ipv6hdr)) {
+ icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu, skb->dev);
+ dst_release(&rt->u.dst);
+ IP_VS_DBG_RL("ip_vs_tunnel_xmit_v6(): frag needed\n");
+ goto tx_error;
+ }
+
+ /*
+ * Okay, now see if we can stuff it in the buffer as-is.
+ */
+ max_headroom = LL_RESERVED_SPACE(tdev) + sizeof(struct ipv6hdr);
+
+ if (skb_headroom(skb) < max_headroom
+ || skb_cloned(skb) || skb_shared(skb)) {
+ struct sk_buff *new_skb =
+ skb_realloc_headroom(skb, max_headroom);
+ if (!new_skb) {
+ dst_release(&rt->u.dst);
+ kfree_skb(skb);
+ IP_VS_ERR_RL("ip_vs_tunnel_xmit_v6(): no memory\n");
+ return NF_STOLEN;
+ }
+ kfree_skb(skb);
+ skb = new_skb;
+ old_iph = ipv6_hdr(skb);
+ }
+
+ skb->transport_header = old_transport_header;
+
+ skb_push(skb, sizeof(struct ipv6hdr));
+ skb_reset_network_header(skb);
+ memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
+
+ /* drop old route */
+ dst_release(skb->dst);
+ skb->dst = &rt->u.dst;
+
+ /*
+ * Push down and install the outer IPv6-in-IPv6 header.
+ */
+ iph = ipv6_hdr(skb);
+ iph->version = 6;
+ iph->nexthdr = IPPROTO_IPV6;
+ iph->payload_len = htons(ntohs(old_iph->payload_len) + sizeof(*old_iph));
+ iph->priority = old_iph->priority;
+ memset(&iph->flow_lbl, 0, sizeof(iph->flow_lbl));
+ iph->daddr = rt->rt6i_dst.addr;
+ iph->saddr = cp->vaddr.v6; /* rt->rt6i_src.addr; */
+ iph->hop_limit = old_iph->hop_limit;
+
+ /* Another hack: avoid icmp_send in ip_fragment */
+ skb->local_df = 1;
+
+ ip6_local_out(skb);
+
+ LeaveFunction(10);
+
+ return NF_STOLEN;
+
+ tx_error_icmp:
+ dst_link_failure(skb);
+ tx_error:
+ kfree_skb(skb);
+ LeaveFunction(10);
+ return NF_STOLEN;
+}
+#endif
+
/*
* Direct Routing transmitter
@@ -556,6 +797,58 @@ ip_vs_dr_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
return NF_STOLEN;
}
+#ifdef CONFIG_IP_VS_IPV6
+int
+ip_vs_dr_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
+ struct ip_vs_protocol *pp)
+{
+ struct rt6_info *rt; /* Route to the other host */
+ int mtu;
+
+ EnterFunction(10);
+
+ if (!(rt = __ip_vs_get_out_rt_v6(cp)))
+ goto tx_error_icmp;
+
+ /* MTU checking */
+ mtu = dst_mtu(&rt->u.dst);
+ if (skb->len > mtu) {
+ icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu, skb->dev);
+ dst_release(&rt->u.dst);
+ IP_VS_DBG_RL("ip_vs_dr_xmit_v6(): frag needed\n");
+ goto tx_error;
+ }
+
+ /*
+ * IPv6 has no header checksum, so there is no ip_send_check()
+ * equivalent to call here. Is copy-on-write needed?
+ */
+ if (unlikely((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL)) {
+ dst_release(&rt->u.dst);
+ return NF_STOLEN;
+ }
+
+ /* drop old route */
+ dst_release(skb->dst);
+ skb->dst = &rt->u.dst;
+
+ /* Another hack: avoid icmp_send in ip_fragment */
+ skb->local_df = 1;
+
+ IP_VS_XMIT(PF_INET6, skb, rt);
+
+ LeaveFunction(10);
+ return NF_STOLEN;
+
+ tx_error_icmp:
+ dst_link_failure(skb);
+ tx_error:
+ kfree_skb(skb);
+ LeaveFunction(10);
+ return NF_STOLEN;
+}
+#endif
+
/*
* ICMP packet transmitter
@@ -633,3 +926,78 @@ ip_vs_icmp_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
ip_rt_put(rt);
goto tx_error;
}
+
+#ifdef CONFIG_IP_VS_IPV6
+int
+ip_vs_icmp_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
+ struct ip_vs_protocol *pp, int offset)
+{
+ struct rt6_info *rt; /* Route to the other host */
+ int mtu;
+ int rc;
+
+ EnterFunction(10);
+
+ /* The ICMP packet for VS/TUN, VS/DR and LOCALNODE will be
+ forwarded directly here, because there is no need to
+ translate address/port back */
+ if (IP_VS_FWD_METHOD(cp) != IP_VS_CONN_F_MASQ) {
+ if (cp->packet_xmit)
+ rc = cp->packet_xmit(skb, cp, pp);
+ else
+ rc = NF_ACCEPT;
+ /* do not touch skb anymore */
+ atomic_inc(&cp->in_pkts);
+ goto out;
+ }
+
+ /*
+ * mangle and send the packet here (only for VS/NAT)
+ */
+
+ if (!(rt = __ip_vs_get_out_rt_v6(cp)))
+ goto tx_error_icmp;
+
+ /* MTU checking */
+ mtu = dst_mtu(&rt->u.dst);
+ if (skb->len > mtu) {
+ dst_release(&rt->u.dst);
+ icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu, skb->dev);
+ IP_VS_DBG_RL("ip_vs_icmp_xmit_v6(): frag needed\n");
+ goto tx_error;
+ }
+
+ /* copy-on-write the packet before mangling it */
+ if (!skb_make_writable(skb, offset))
+ goto tx_error_put;
+
+ if (skb_cow(skb, rt->u.dst.dev->hard_header_len))
+ goto tx_error_put;
+
+ /* drop the old route when skb is not shared */
+ dst_release(skb->dst);
+ skb->dst = &rt->u.dst;
+
+ ip_vs_nat_icmp_v6(skb, pp, cp, 0);
+
+ /* Another hack: avoid icmp_send in ip_fragment */
+ skb->local_df = 1;
+
+ IP_VS_XMIT(PF_INET6, skb, rt);
+
+ rc = NF_STOLEN;
+ goto out;
+
+ tx_error_icmp:
+ dst_link_failure(skb);
+ tx_error:
+ dev_kfree_skb(skb);
+ rc = NF_STOLEN;
+ out:
+ LeaveFunction(10);
+ return rc;
+ tx_error_put:
+ dst_release(&rt->u.dst);
+ goto tx_error;
+}
+#endif
--
1.5.3.6
* [PATCH 17/26] IPVS: Add connection hashing function for IPv6 entries.
From: Julius R. Volz @ 2008-06-11 17:12 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add ip_vs_conn_hashkey_v6() and call it instead of ip_vs_conn_hashkey()
when hashing IPv6 connection entries.
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 22 insertions(+), 2 deletions(-)
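
Note: the IPv6 hash key folds the 16-byte address, the port and the protocol into one value and masks it down to a bucket index. The user-space sketch below shows only that structure. It deliberately uses FNV-1a as a simple stand-in, not the kernel's jhash, and the table size, `fnv1a()` and `conn_hashkey_v6()` as written here are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define CONN_TAB_BITS 12			/* hypothetical: 4096 buckets */
#define CONN_TAB_SIZE (1u << CONN_TAB_BITS)
#define CONN_TAB_MASK (CONN_TAB_SIZE - 1)

/* FNV-1a over a byte buffer; a simple stand-in, NOT the kernel's jhash. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
	const unsigned char *p = data;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Mix a per-boot random seed, the full 128-bit address, the port and
 * the protocol, then mask the result down to a bucket index.
 */
static unsigned int conn_hashkey_v6(unsigned int proto,
				    const unsigned char addr[16],
				    uint16_t port, uint32_t seed)
{
	uint32_t h = 2166136261u ^ seed;

	h = fnv1a(h, addr, 16);
	h = fnv1a(h, &port, sizeof(port));
	h = fnv1a(h, &proto, sizeof(proto));
	return h & CONN_TAB_MASK;
}
```

Hashing all 16 address bytes matters for IPv6, since clients inside one network often differ only in the low-order bits.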
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 1a4040d..4ee5dac 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -37,6 +37,9 @@
#include <net/net_namespace.h>
#include <net/ip_vs.h>
+#ifdef CONFIG_IP_VS_IPV6
+#include <net/ipv6.h>
+#endif
/*
@@ -122,6 +125,13 @@ static unsigned int ip_vs_conn_hashkey(unsigned proto, __be32 addr, __be16 port)
& IP_VS_CONN_TAB_MASK;
}
+#ifdef CONFIG_IP_VS_IPV6
+static unsigned int ip_vs_conn_hashkey_v6(unsigned proto, const struct in6_addr *addr, __be16 port)
+{
+ return jhash_3words(jhash(addr, 16, ip_vs_conn_rnd), (__force u32)port, proto, ip_vs_conn_rnd)
+ & IP_VS_CONN_TAB_MASK;
+}
+#endif
/*
* Hashes ip_vs_conn in ip_vs_conn_tab by proto,addr,port.
@@ -133,7 +143,12 @@ static inline int ip_vs_conn_hash(struct ip_vs_conn *cp)
int ret;
/* Hash by protocol, client address and port */
- hash = ip_vs_conn_hashkey(cp->protocol, cp->caddr, cp->cport);
+#ifdef CONFIG_IP_VS_IPV6
+ if (cp->af == AF_INET6)
+ hash = ip_vs_conn_hashkey_v6(cp->protocol, &cp->caddr.v6, cp->cport);
+ else
+#endif
+ hash = ip_vs_conn_hashkey(cp->protocol, cp->caddr.v4, cp->cport);
ct_write_lock(hash);
@@ -164,7 +179,12 @@ static inline int ip_vs_conn_unhash(struct ip_vs_conn *cp)
int ret;
/* unhash it and decrease its reference counter */
- hash = ip_vs_conn_hashkey(cp->protocol, cp->caddr, cp->cport);
+#ifdef CONFIG_IP_VS_IPV6
+ if (cp->af == AF_INET6)
+ hash = ip_vs_conn_hashkey_v6(cp->protocol, &cp->caddr.v6, cp->cport);
+ else
+#endif
+ hash = ip_vs_conn_hashkey(cp->protocol, cp->caddr.v4, cp->cport);
ct_write_lock(hash);
--
1.5.3.6
* [PATCH 18/26] IPVS: Add functions for getting/creating IPv6 connections.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (16 preceding siblings ...)
2008-06-11 17:12 ` [PATCH 17/26] IPVS: Add connection hashing function for IPv6 entries Julius R. Volz
@ 2008-06-11 17:12 ` Julius R. Volz
2008-06-12 1:55 ` Brian Haley
2008-06-11 17:12 ` [PATCH 19/26] IPVS: Add scheduling functions for " Julius R. Volz
` (8 subsequent siblings)
26 siblings, 1 reply; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:12 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add functions for getting/creating IPv6 connections and connection
templates where these diverge significantly from the IPv4 versions.
Signed-off-by: Julius R. Volz <juliusv@google.com>
2 files changed, 211 insertions(+), 0 deletions(-)
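
Note: the inbound lookup compares the full address/port tuple and additionally requires that a lookup with client port 0 only matches entries created without a client port, and vice versa. That is what the XOR of the two negations in the match condition expresses. A user-space sketch of the predicate follows; `struct conn`, `CONN_F_NO_CPORT` and `conn_in_match()` are hypothetical stand-ins, and the flag value is an assumption.

```c
#include <stdint.h>
#include <string.h>

#define CONN_F_NO_CPORT 0x1u	/* hypothetical flag value */

/* Hypothetical, reduced stand-in for struct ip_vs_conn. */
struct conn {
	unsigned char caddr[16], vaddr[16];
	uint16_t cport, vport;
	int protocol;
	unsigned int flags;
};

/*
 * Non-zero if cp matches an inbound packet tuple. A lookup with
 * s_port == 0 matches only no-client-port entries; a lookup with a
 * real port matches only fully specified entries.
 */
static int conn_in_match(const struct conn *cp, int protocol,
			 const unsigned char s_addr[16], uint16_t s_port,
			 const unsigned char d_addr[16], uint16_t d_port)
{
	int want_no_cport = (s_port == 0);
	int is_no_cport = (cp->flags & CONN_F_NO_CPORT) != 0;

	return memcmp(s_addr, cp->caddr, 16) == 0 && s_port == cp->cport &&
	       memcmp(d_addr, cp->vaddr, 16) == 0 && d_port == cp->vport &&
	       want_no_cport == is_no_cport &&
	       protocol == cp->protocol;
}
```

The public lookup in the patch tries the real source port first and, only if no-client-port entries exist, retries with port 0.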
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d04d5c6..6a58dff 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -816,6 +816,19 @@ extern struct ip_vs_conn *ip_vs_ct_in_get
extern struct ip_vs_conn *ip_vs_conn_out_get
(int protocol, __be32 s_addr, __be16 s_port, __be32 d_addr, __be16 d_port);
+#ifdef CONFIG_IP_VS_IPV6
+extern struct ip_vs_conn *
+ip_vs_conn_in_get_v6(int protocol, const struct in6_addr *s_addr, __be16 s_port,
+ const struct in6_addr *d_addr, __be16 d_port);
+extern struct ip_vs_conn *
+ip_vs_ct_in_get_v6(int protocol, const struct in6_addr *s_addr, __be16 s_port,
+ const struct in6_addr *d_addr, __be16 d_port);
+extern struct ip_vs_conn *
+ip_vs_conn_out_get_v6(int protocol, const struct in6_addr *s_addr,
+ __be16 s_port, const struct in6_addr *d_addr,
+ __be16 d_port);
+#endif
+
/* put back the conn without restarting its timer */
static inline void __ip_vs_conn_put(struct ip_vs_conn *cp)
{
@@ -828,6 +841,15 @@ extern struct ip_vs_conn *
ip_vs_conn_new(int proto, __be32 caddr, __be16 cport, __be32 vaddr, __be16 vport,
__be32 daddr, __be16 dport, unsigned flags,
struct ip_vs_dest *dest);
+
+#ifdef CONFIG_IP_VS_IPV6
+extern struct ip_vs_conn *
+ip_vs_conn_new_v6(int proto, const struct in6_addr *caddr, __be16 cport,
+ const struct in6_addr *vaddr, __be16 vport,
+ const struct in6_addr *daddr, __be16 dport, unsigned flags,
+ struct ip_vs_dest *dest);
+#endif
+
extern void ip_vs_conn_expire_now(struct ip_vs_conn *cp);
extern const char * ip_vs_state_name(__u16 proto, int state);
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 4ee5dac..30e1ad2 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -236,6 +236,36 @@ static inline struct ip_vs_conn *__ip_vs_conn_in_get
return NULL;
}
+#ifdef CONFIG_IP_VS_IPV6
+static inline struct ip_vs_conn *__ip_vs_conn_in_get_v6
+(int protocol, const struct in6_addr *s_addr, __be16 s_port, const struct in6_addr *d_addr, __be16 d_port)
+{
+ unsigned hash;
+ struct ip_vs_conn *cp;
+
+ hash = ip_vs_conn_hashkey_v6(protocol, s_addr, s_port);
+
+ ct_read_lock(hash);
+
+ list_for_each_entry(cp, &ip_vs_conn_tab[hash], c_list) {
+ if (cp->af == AF_INET6 &&
+ ipv6_addr_equal(s_addr, &cp->caddr.v6) && s_port==cp->cport &&
+ ipv6_addr_equal(d_addr, &cp->vaddr.v6) && d_port==cp->vport &&
+ ((!s_port) ^ (!(cp->flags & IP_VS_CONN_F_NO_CPORT))) &&
+ protocol==cp->protocol) {
+ /* HIT */
+ atomic_inc(&cp->refcnt);
+ ct_read_unlock(hash);
+ return cp;
+ }
+ }
+
+ ct_read_unlock(hash);
+
+ return NULL;
+}
+#endif
+
struct ip_vs_conn *ip_vs_conn_in_get
(int protocol, __be32 s_addr, __be16 s_port, __be32 d_addr, __be16 d_port)
{
@@ -254,6 +284,26 @@ struct ip_vs_conn *ip_vs_conn_in_get
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+struct ip_vs_conn *ip_vs_conn_in_get_v6
+(int protocol, const struct in6_addr *s_addr, __be16 s_port, const struct in6_addr *d_addr, __be16 d_port)
+{
+ struct ip_vs_conn *cp;
+
+ cp = __ip_vs_conn_in_get_v6(protocol, s_addr, s_port, d_addr, d_port);
+ if (!cp && atomic_read(&ip_vs_conn_no_cport_cnt))
+ cp = __ip_vs_conn_in_get_v6(protocol, s_addr, 0, d_addr, d_port);
+
+ IP_VS_DBG(9, "lookup/in %s " NIP6_FMT ":%d->" NIP6_FMT ":%d %s\n",
+ ip_vs_proto_name(protocol),
+ NIP6(*s_addr), ntohs(s_port),
+ NIP6(*d_addr), ntohs(d_port),
+ cp?"hit":"not hit");
+
+ return cp;
+}
+#endif
+
/* Get reference to connection template */
struct ip_vs_conn *ip_vs_ct_in_get
(int protocol, __be32 s_addr, __be16 s_port, __be32 d_addr, __be16 d_port)
@@ -290,6 +340,44 @@ struct ip_vs_conn *ip_vs_ct_in_get
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+/* Get reference to connection template */
+struct ip_vs_conn *ip_vs_ct_in_get_v6
+(int protocol, const struct in6_addr *s_addr, __be16 s_port, const struct in6_addr *d_addr, __be16 d_port)
+{
+ unsigned hash;
+ struct ip_vs_conn *cp;
+
+ hash = ip_vs_conn_hashkey_v6(protocol, s_addr, s_port);
+
+ ct_read_lock(hash);
+
+ list_for_each_entry(cp, &ip_vs_conn_tab[hash], c_list) {
+ if (cp->af == AF_INET6 &&
+ ipv6_addr_equal(s_addr, &cp->caddr.v6) && s_port==cp->cport &&
+ ipv6_addr_equal(d_addr, &cp->vaddr.v6) && d_port==cp->vport &&
+ cp->flags & IP_VS_CONN_F_TEMPLATE &&
+ protocol==cp->protocol) {
+ /* HIT */
+ atomic_inc(&cp->refcnt);
+ goto out;
+ }
+ }
+ cp = NULL;
+
+ out:
+ ct_read_unlock(hash);
+
+ IP_VS_DBG(9, "template lookup/in %s " NIP6_FMT ":%d->" NIP6_FMT ":%d %s\n",
+ ip_vs_proto_name(protocol),
+ NIP6(*s_addr), ntohs(s_port),
+ NIP6(*d_addr), ntohs(d_port),
+ cp?"hit":"not hit");
+
+ return cp;
+}
+#endif
+
/*
* Gets ip_vs_conn associated with supplied parameters in the ip_vs_conn_tab.
* Called for pkts coming from inside-to-OUTside.
@@ -332,6 +420,44 @@ struct ip_vs_conn *ip_vs_conn_out_get
return ret;
}
+#ifdef CONFIG_IP_VS_IPV6
+struct ip_vs_conn *ip_vs_conn_out_get_v6
+(int protocol, const struct in6_addr *s_addr, __be16 s_port, const struct in6_addr *d_addr, __be16 d_port)
+{
+ unsigned hash;
+ struct ip_vs_conn *cp, *ret=NULL;
+
+ /*
+ * Check for "full" addressed entries
+ */
+ hash = ip_vs_conn_hashkey_v6(protocol, d_addr, d_port);
+
+ ct_read_lock(hash);
+
+ list_for_each_entry(cp, &ip_vs_conn_tab[hash], c_list) {
+ if (cp->af == AF_INET6 &&
+ ipv6_addr_equal(d_addr, &cp->caddr.v6) && d_port==cp->cport &&
+ ipv6_addr_equal(s_addr, &cp->daddr.v6) && s_port==cp->dport &&
+ protocol == cp->protocol) {
+ /* HIT */
+ atomic_inc(&cp->refcnt);
+ ret = cp;
+ break;
+ }
+ }
+
+ ct_read_unlock(hash);
+
+ IP_VS_DBG(9, "lookup/out %s " NIP6_FMT ":%d->" NIP6_FMT ":%d %s\n",
+ ip_vs_proto_name(protocol),
+ NIP6(*s_addr), ntohs(s_port),
+ NIP6(*d_addr), ntohs(d_port),
+ ret?"hit":"not hit");
+
+ return ret;
+}
+#endif
+
/*
* Put back the conn and restart its timer with its timeout
@@ -766,6 +892,69 @@ ip_vs_conn_new(int proto, __be32 caddr, __be16 cport, __be32 vaddr, __be16 vport
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+struct ip_vs_conn *
+ip_vs_conn_new_v6(int proto, const struct in6_addr *caddr, __be16 cport,
+ const struct in6_addr *vaddr, __be16 vport,
+ const struct in6_addr *daddr, __be16 dport, unsigned flags,
+ struct ip_vs_dest *dest)
+{
+ struct ip_vs_conn *cp;
+ struct ip_vs_protocol *pp = ip_vs_proto_get(proto);
+
+ cp = kmem_cache_zalloc(ip_vs_conn_cachep, GFP_ATOMIC);
+ if (cp == NULL) {
+ IP_VS_ERR_RL("ip_vs_conn_new_v6: no memory available.\n");
+ return NULL;
+ }
+
+ INIT_LIST_HEAD(&cp->c_list);
+ setup_timer(&cp->timer, ip_vs_conn_expire, (unsigned long)cp);
+ cp->af = AF_INET6;
+ cp->protocol = proto;
+ cp->caddr.v6 = *caddr;
+ cp->cport = cport;
+ cp->vaddr.v6 = *vaddr;
+ cp->vport = vport;
+ cp->daddr.v6 = *daddr;
+ cp->dport = dport;
+ cp->flags = flags;
+ spin_lock_init(&cp->lock);
+
+ /*
+ * Mark the entry as referenced by the current thread before hashing
+ * it into the table, so that another thread running
+ * ip_vs_random_dropentry cannot drop this entry.
+ */
+ atomic_set(&cp->refcnt, 1);
+
+ atomic_set(&cp->n_control, 0);
+ atomic_set(&cp->in_pkts, 0);
+
+ atomic_inc(&ip_vs_conn_count);
+ if (flags & IP_VS_CONN_F_NO_CPORT)
+ atomic_inc(&ip_vs_conn_no_cport_cnt);
+
+ /* Bind the connection with a destination server */
+ ip_vs_bind_dest(cp, dest);
+
+ /* Set its state and timeout */
+ cp->state = 0;
+ cp->timeout = 3*HZ;
+
+ /* Bind its packet transmitter */
+ ip_vs_bind_xmit_v6(cp);
+
+ if (unlikely(pp && atomic_read(&pp->appcnt)))
+ ip_vs_bind_app(cp, pp);
+
+ /* Hash it in the ip_vs_conn_tab finally */
+ ip_vs_conn_hash(cp);
+
+ return cp;
+}
+#endif
+
/*
* /proc/net/ip_vs_conn entries
--
1.5.3.6
* [PATCH 19/26] IPVS: Add scheduling functions for IPv6 connections.
From: Julius R. Volz @ 2008-06-11 17:12 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add ip_vs_schedule_v6() and ip_vs_sched_persist_v6() functions for
scheduling IPv6 connections.
Signed-off-by: Julius R. Volz <juliusv@google.com>
2 files changed, 241 insertions(+), 0 deletions(-)
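
Note: persistence templates are keyed on the client's network rather than its full address, so the source address is first reduced to a prefix. The user-space sketch below shows that masking step, assuming the service's netmask field holds a prefix length in bits for IPv6 services. `addr_prefix()` is a hypothetical helper written for illustration, not the kernel's ipv6_addr_prefix().

```c
#include <string.h>

/*
 * Copy the first plen bits of a 16-byte IPv6 address into out and
 * zero the remaining bits. Prefix lengths above 128 are clamped.
 */
static void addr_prefix(unsigned char out[16], const unsigned char in[16],
			unsigned int plen)
{
	unsigned int full, rem;

	if (plen > 128)
		plen = 128;
	full = plen / 8;	/* whole bytes covered by the prefix */
	rem = plen % 8;		/* leftover bits in the next byte */

	memset(out, 0, 16);
	memcpy(out, in, full);
	if (rem)
		out[full] = in[full] & (unsigned char)(0xff00u >> rem);
}
```

With a prefix length of 64, all clients in the same /64 map to one template and therefore to the same real server.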
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 6a58dff..8d28d98 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -991,6 +991,11 @@ extern struct ip_vs_scheduler *ip_vs_scheduler_get(const char *sched_name);
extern void ip_vs_scheduler_put(struct ip_vs_scheduler *scheduler);
extern struct ip_vs_conn *
ip_vs_schedule(struct ip_vs_service *svc, const struct sk_buff *skb);
+#ifdef CONFIG_IP_VS_IPV6
+extern struct ip_vs_conn *
+ip_vs_schedule_v6(struct ip_vs_service *svc, const struct sk_buff *skb);
+#endif
+
extern int ip_vs_leave(struct ip_vs_service *svc, struct sk_buff *skb,
struct ip_vs_protocol *pp);
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 9a3d0df..ccd95ff 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -333,6 +333,180 @@ ip_vs_sched_persist(struct ip_vs_service *svc,
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_conn *
+ip_vs_sched_persist_v6(struct ip_vs_service *svc,
+ const struct sk_buff *skb,
+ __be16 ports[2])
+{
+ struct ip_vs_conn *cp = NULL;
+ struct ipv6hdr *iph = ipv6_hdr(skb);
+ struct ip_vs_dest *dest;
+ struct ip_vs_conn *ct;
+ __be16 dport; /* destination port to forward */
+ struct in6_addr snet; /* source network of the client, after masking */
+
+ /* Mask saddr with the netmask to adjust template granularity */
+ ipv6_addr_prefix(&snet, &iph->saddr, svc->netmask);
+
+ IP_VS_DBG(6, "p-schedule: src " NIP6_FMT ":%u dest " NIP6_FMT ":%u "
+ "mnet " NIP6_FMT "\n",
+ NIP6(iph->saddr), ntohs(ports[0]),
+ NIP6(iph->daddr), ntohs(ports[1]),
+ NIP6(snet));
+
+ /*
+ * FTP is a complicated protocol that uses a control connection and
+ * separate data connections. For active FTP, the server initiates
+ * the data connection to the client, and its source port is often 20.
+ * For passive FTP, the server tells the client which port it
+ * passively listens on, and the client initiates the data
+ * connection. In tunneling or direct routing mode, the load
+ * balancer only sees the client-to-server half of the connection,
+ * so the port number is unknown to the load balancer. So, a conn
+ * template like <caddr, 0, vaddr, 0, daddr, 0> is created for
+ * persistent FTP service, and a template like
+ * <caddr, 0, vaddr, vport, daddr, dport> is created for other
+ * persistent services.
+ if (ports[1] == svc->port) {
+ /* Check if a template already exists */
+ if (svc->port != FTPPORT)
+ ct = ip_vs_ct_in_get_v6(iph->nexthdr, &snet, 0,
+ &iph->daddr, ports[1]);
+ else
+ ct = ip_vs_ct_in_get_v6(iph->nexthdr, &snet, 0,
+ &iph->daddr, 0);
+
+ if (!ct || !ip_vs_check_template(ct)) {
+ /*
+ * No template found or the dest of the connection
+ * template is not available.
+ */
+ dest = svc->scheduler->schedule(svc, skb);
+ if (dest == NULL) {
+ IP_VS_DBG(1, "p-schedule: no dest found.\n");
+ return NULL;
+ }
+
+ /*
+ * Create a template like <protocol,caddr,0,
+ * vaddr,vport,daddr,dport> for non-ftp service,
+ * and <protocol,caddr,0,vaddr,0,daddr,0>
+ * for ftp service.
+ */
+ if (svc->port != FTPPORT)
+ ct = ip_vs_conn_new_v6(iph->nexthdr,
+ &snet, 0,
+ &iph->daddr,
+ ports[1],
+ &dest->addr.v6, dest->port,
+ IP_VS_CONN_F_TEMPLATE,
+ dest);
+ else
+ ct = ip_vs_conn_new_v6(iph->nexthdr,
+ &snet, 0,
+ &iph->daddr, 0,
+ &dest->addr.v6, 0,
+ IP_VS_CONN_F_TEMPLATE,
+ dest);
+ if (ct == NULL)
+ return NULL;
+
+ ct->timeout = svc->timeout;
+ } else {
+ /* set destination with the found template */
+ dest = ct->dest;
+ }
+ dport = dest->port;
+ } else {
+ /*
+ * Note: persistent fwmark-based services and persistent
+ * port zero service are handled here.
+ * fwmark template: <IPPROTO_IP,caddr,0,fwmark,0,daddr,0>
+ * port zero template: <protocol,caddr,0,vaddr,0,daddr,0>
+ */
+ if (svc->fwmark) {
+ struct in6_addr fwmark = {
+ .s6_addr32 = {0, 0, 0, htonl(svc->fwmark)}
+ };
+
+ ct = ip_vs_ct_in_get_v6(IPPROTO_IP, &snet, 0,
+ &fwmark, 0);
+ } else
+ ct = ip_vs_ct_in_get_v6(iph->nexthdr, &snet, 0,
+ &iph->daddr, 0);
+
+ if (!ct || !ip_vs_check_template(ct)) {
+ /*
+ * If it is not persistent port zero, return NULL,
+ * otherwise create a connection template.
+ */
+ if (svc->port)
+ return NULL;
+
+ dest = svc->scheduler->schedule(svc, skb);
+ if (dest == NULL) {
+ IP_VS_DBG(1, "p-schedule: no dest found.\n");
+ return NULL;
+ }
+
+ /*
+ * Create a template according to the service
+ */
+ if (svc->fwmark) {
+ struct in6_addr fwmark = {
+ .s6_addr32 = {0, 0, 0, htonl(svc->fwmark)}
+ };
+
+ ct = ip_vs_conn_new_v6(IPPROTO_IP,
+ &snet, 0,
+ &fwmark, 0,
+ &dest->addr.v6, 0,
+ IP_VS_CONN_F_TEMPLATE,
+ dest);
+ }
+ else
+ ct = ip_vs_conn_new_v6(iph->nexthdr,
+ &snet, 0,
+ &iph->daddr, 0,
+ &dest->addr.v6, 0,
+ IP_VS_CONN_F_TEMPLATE,
+ dest);
+ if (ct == NULL)
+ return NULL;
+
+ ct->timeout = svc->timeout;
+ } else {
+ /* set destination with the found template */
+ dest = ct->dest;
+ }
+ dport = ports[1];
+ }
+
+ /*
+ * Create a new connection according to the template
+ */
+ cp = ip_vs_conn_new_v6(iph->nexthdr,
+ &iph->saddr, ports[0],
+ &iph->daddr, ports[1],
+ &dest->addr.v6, dport,
+ 0,
+ dest);
+ if (cp == NULL) {
+ ip_vs_conn_put(ct);
+ return NULL;
+ }
+
+ /*
+ * Add its control
+ */
+ ip_vs_control_add(cp, ct);
+ ip_vs_conn_put(ct);
+
+ ip_vs_conn_stats(cp, svc);
+ return cp;
+}
+#endif
/*
* IPVS main scheduling function
@@ -400,6 +574,68 @@ ip_vs_schedule(struct ip_vs_service *svc, const struct sk_buff *skb)
return cp;
}
+#ifdef CONFIG_IP_VS_IPV6
+struct ip_vs_conn *
+ip_vs_schedule_v6(struct ip_vs_service *svc, const struct sk_buff *skb)
+{
+ struct ip_vs_conn *cp = NULL;
+ struct ipv6hdr *iph = ipv6_hdr(skb);
+ struct ip_vs_dest *dest;
+ __be16 _ports[2], *pptr;
+
+ pptr = skb_header_pointer(skb, sizeof(struct ipv6hdr),
+ sizeof(_ports), _ports);
+ if (pptr == NULL)
+ return NULL;
+
+ /*
+ * Persistent service
+ */
+ if (svc->flags & IP_VS_SVC_F_PERSISTENT)
+ return ip_vs_sched_persist_v6(svc, skb, pptr);
+
+ /*
+ * Non-persistent service
+ */
+ if (!svc->fwmark && pptr[1] != svc->port) {
+ if (!svc->port)
+ IP_VS_ERR("Schedule: port zero only supported "
+ "in persistent services, "
+ "check your ipvs configuration\n");
+ return NULL;
+ }
+
+ dest = svc->scheduler->schedule(svc, skb);
+ if (dest == NULL) {
+ IP_VS_DBG(1, "Schedule: no dest found.\n");
+ return NULL;
+ }
+
+ /*
+ * Create a connection entry.
+ */
+ cp = ip_vs_conn_new_v6(iph->nexthdr,
+ &iph->saddr, pptr[0],
+ &iph->daddr, pptr[1],
+ &dest->addr.v6, dest->port ? dest->port : pptr[1],
+ 0,
+ dest);
+ if (cp == NULL)
+ return NULL;
+
+ IP_VS_DBG(6, "Schedule fwd:%c c:" NIP6_FMT ":%u v:" NIP6_FMT ":%u "
+ "d:" NIP6_FMT ":%u conn->flags:%X conn->refcnt:%d\n",
+ ip_vs_fwd_tag(cp),
+ NIP6(cp->caddr.v6), ntohs(cp->cport),
+ NIP6(cp->vaddr.v6), ntohs(cp->vport),
+ NIP6(cp->daddr.v6), ntohs(cp->dport),
+ cp->flags, atomic_read(&cp->refcnt));
+
+ ip_vs_conn_stats(cp, svc);
+ return cp;
+}
+#endif
+
/*
* Pass or drop the packet.
--
1.5.3.6
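
A note on the fwmark templates used by ip_vs_sched_persist_v6() in the patch above: the 32-bit firewall mark is stored in the last word of an otherwise all-zero in6_addr, in network byte order, so that it can take the place of the virtual address in the template lookup. The userspace sketch below shows that packing and its inverse. fwmark_to_in6() and in6_to_fwmark() are hypothetical helper names and are not part of the patch; it copies bytes instead of using the s6_addr32 member so that it does not depend on a libc extension.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <assert.h>
#include <netinet/in.h>  /* struct in6_addr */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: mirror ".s6_addr32 = {0, 0, 0, htonl(svc->fwmark)}"
 * from the patch by placing the mark in bytes 12..15, network byte order. */
static struct in6_addr fwmark_to_in6(uint32_t fwmark)
{
	struct in6_addr a;
	uint32_t be = htonl(fwmark);

	memset(&a, 0, sizeof(a));
	memcpy(&a.s6_addr[12], &be, sizeof(be));
	return a;
}

/* Hypothetical inverse: recover the mark from such a pseudo-address. */
static uint32_t in6_to_fwmark(const struct in6_addr *a)
{
	uint32_t be;

	assert(a != NULL);
	memcpy(&be, &a->s6_addr[12], sizeof(be));
	return ntohl(be);
}
```

For example, fwmark_to_in6(1) yields the bytes of ::1, so a fwmark template and a template for a real virtual address can only be told apart by the protocol field (IPPROTO_IP for fwmark templates in the patch).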
* [PATCH 20/26] IPVS: Add IPv6 Netfilter hooks and add/modify support functions.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (18 preceding siblings ...)
2008-06-11 17:12 ` [PATCH 19/26] IPVS: Add scheduling functions for " Julius R. Volz
@ 2008-06-11 17:12 ` Julius R. Volz
2008-06-12 1:55 ` Brian Haley
2008-06-11 17:12 ` [PATCH 21/26] IPVS: Make proc/net files output IPv6 entries correctly Julius R. Volz
` (6 subsequent siblings)
26 siblings, 1 reply; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:12 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add Netfilter hooks for IPv6 and corresponding functions that handle
incoming / outgoing IPv6 packets. Also adapt/add some helper functions
and macros to work with the v6 versions.
Signed-off-by: Julius R. Volz <juliusv@google.com>
2 files changed, 607 insertions(+), 5 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 8d28d98..ab59696 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -998,7 +998,10 @@ ip_vs_schedule_v6(struct ip_vs_service *svc, const struct sk_buff *skb);
extern int ip_vs_leave(struct ip_vs_service *svc, struct sk_buff *skb,
struct ip_vs_protocol *pp);
-
+#ifdef CONFIG_IP_VS_IPV6
+extern int ip_vs_leave_v6(struct ip_vs_service *svc, struct sk_buff *skb,
+ struct ip_vs_protocol *pp);
+#endif
/*
* IPVS control data and functions (from ip_vs_ctl.c)
@@ -1125,7 +1128,12 @@ static inline char ip_vs_fwd_tag(struct ip_vs_conn *cp)
}
extern void ip_vs_nat_icmp(struct sk_buff *skb, struct ip_vs_protocol *pp,
- struct ip_vs_conn *cp, int dir);
+ struct ip_vs_conn *cp, int dir);
+
+#ifdef CONFIG_IP_VS_IPV6
+extern void ip_vs_nat_icmp_v6(struct sk_buff *skb, struct ip_vs_protocol *pp,
+ struct ip_vs_conn *cp, int dir);
+#endif
extern __sum16 ip_vs_checksum_complete(struct sk_buff *skb, int offset);
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index ccd95ff..ded862b 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -41,6 +41,11 @@
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
+#ifdef CONFIG_IP_VS_IPV6
+#include <net/ipv6.h>
+#include <linux/netfilter_ipv6.h>
+#endif
+
#include <net/ip_vs.h>
@@ -62,6 +67,7 @@ EXPORT_SYMBOL(ip_vs_get_debug_level);
/* ID used in ICMP lookups */
#define icmp_id(icmph) (((icmph)->un).echo.id)
+#define icmpv6_id(icmph) ((icmph)->icmp6_dataun.u_echo.identifier)
const char *ip_vs_proto_name(unsigned proto)
{
@@ -76,6 +82,10 @@ const char *ip_vs_proto_name(unsigned proto)
return "TCP";
case IPPROTO_ICMP:
return "ICMP";
+#ifdef CONFIG_IP_VS_IPV6
+ case IPPROTO_ICMPV6:
+ return "ICMPv6";
+#endif
default:
sprintf(buf, "IP_%d", proto);
return buf;
@@ -716,6 +726,82 @@ int ip_vs_leave(struct ip_vs_service *svc, struct sk_buff *skb,
}
+#ifdef CONFIG_IP_VS_IPV6
+int ip_vs_leave_v6(struct ip_vs_service *svc, struct sk_buff *skb,
+ struct ip_vs_protocol *pp)
+{
+ __be16 _ports[2], *pptr;
+ struct ipv6hdr *iph = ipv6_hdr(skb);
+
+ pptr = skb_header_pointer(skb, sizeof(struct ipv6hdr),
+ sizeof(_ports), _ports);
+ if (pptr == NULL) {
+ ip_vs_service_put(svc);
+ return NF_DROP;
+ }
+
+ /* if it is fwmark-based service, the cache_bypass sysctl is up
+ and the destination is IPV6_ADDR_UNICAST (and not local), then create
+ a cache_bypass connection entry */
+ if (sysctl_ip_vs_cache_bypass && svc->fwmark
+ && (ipv6_addr_type(&iph->daddr) & IPV6_ADDR_UNICAST)) {
+ int ret, cs;
+ struct ip_vs_conn *cp;
+
+ ip_vs_service_put(svc);
+
+ /* create a new connection entry */
+ IP_VS_DBG(6, "ip_vs_leave: create a cache_bypass entry\n");
+ cp = ip_vs_conn_new_v6(iph->nexthdr,
+ &iph->saddr, pptr[0],
+ &iph->daddr, pptr[1],
+ 0, 0,
+ IP_VS_CONN_F_BYPASS,
+ NULL);
+ if (cp == NULL)
+ return NF_DROP;
+
+ /* statistics */
+ ip_vs_in_stats(cp, skb);
+
+ /* set state */
+ cs = ip_vs_set_state(cp, IP_VS_DIR_INPUT, skb, pp);
+
+ /* transmit the first SYN packet */
+ ret = cp->packet_xmit(skb, cp, pp);
+ /* do not touch skb anymore */
+
+ atomic_inc(&cp->in_pkts);
+ ip_vs_conn_put(cp);
+ return ret;
+ }
+
+ /*
+ * When a virtual FTP service is present, packets destined
+ * for other services on the VIP may get here (except services
+ * listed in the IPVS table). Pass those packets, because it is
+ * not the job of IPVS to decide to drop them.
+ */
+ if ((svc->port == FTPPORT) && (pptr[1] != FTPPORT)) {
+ ip_vs_service_put(svc);
+ return NF_ACCEPT;
+ }
+
+ ip_vs_service_put(svc);
+
+ /*
+ * Notify the client that the destination is unreachable, and
+ * release the socket buffer.
+ * Since this runs at the IP layer, no TCP socket has actually been
+ * created, so a TCP RST packet cannot be sent; instead,
+ * ICMPV6_PORT_UNREACH is sent here whether it is TCP or UDP. --WZ
+ */
+ icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_PORT_UNREACH, 0, skb->dev);
+ return NF_DROP;
+}
+#endif
+
+
/*
* It is hooked before NF_IP_PRI_NAT_SRC at the NF_INET_POST_ROUTING
* chain, and is used for VS/NAT.
@@ -750,6 +836,14 @@ static inline int ip_vs_gather_frags(struct sk_buff *skb, u_int32_t user)
return err;
}
+#ifdef CONFIG_IP_VS_IPV6
+static inline int ip_vs_gather_frags_v6(struct sk_buff *skb, u_int32_t user)
+{
+ /* TODO IPv6: Find out what to do here for IPv6 */
+ return 0;
+}
+#endif
+
/*
* Packet has been made sufficiently writable in caller
* - inout: 1=in->out, 0=out->in
@@ -798,6 +892,49 @@ void ip_vs_nat_icmp(struct sk_buff *skb, struct ip_vs_protocol *pp,
"Forwarding altered incoming ICMP");
}
+#ifdef CONFIG_IP_VS_IPV6
+void ip_vs_nat_icmp_v6(struct sk_buff *skb, struct ip_vs_protocol *pp,
+ struct ip_vs_conn *cp, int inout)
+{
+ struct ipv6hdr *iph = ipv6_hdr(skb);
+ unsigned int icmp_offset = sizeof(struct ipv6hdr);
+ struct icmp6hdr *icmph = (struct icmp6hdr *)(skb_network_header(skb) +
+ icmp_offset);
+ struct ipv6hdr *ciph = (struct ipv6hdr *)(icmph + 1);
+
+ if (inout) {
+ iph->saddr = cp->vaddr.v6;
+ ciph->daddr = cp->vaddr.v6;
+ } else {
+ iph->daddr = cp->daddr.v6;
+ ciph->saddr = cp->daddr.v6;
+ }
+
+ /* the TCP/UDP port */
+ if (IPPROTO_TCP == ciph->nexthdr || IPPROTO_UDP == ciph->nexthdr) {
+ __be16 *ports = (void *)ciph + sizeof(struct ipv6hdr);
+
+ if (inout)
+ ports[1] = cp->vport;
+ else
+ ports[0] = cp->dport;
+ }
+
+ /* And finally the ICMP checksum */
+ icmph->icmp6_cksum = 0;
+ /* TODO IPv6: is this correct for ICMPv6? */
+ ip_vs_checksum_complete(skb, icmp_offset);
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+ if (inout)
+ IP_VS_DBG_PKT(11, pp, skb, (void *)ciph - (void *)iph,
+ "Forwarding altered outgoing ICMPv6");
+ else
+ IP_VS_DBG_PKT(11, pp, skb, (void *)ciph - (void *)iph,
+ "Forwarding altered incoming ICMPv6");
+}
+#endif
+
/*
* Handle ICMP messages in the inside-to-outside direction (outgoing).
* Find any that might be relevant, check against existing connections,
@@ -904,11 +1041,112 @@ static int ip_vs_out_icmp(struct sk_buff *skb, int *related)
return verdict;
}
-static inline int is_tcp_reset(const struct sk_buff *skb)
+#ifdef CONFIG_IP_VS_IPV6
+static int ip_vs_out_icmp_v6(struct sk_buff *skb, int *related)
+{
+ struct ipv6hdr *iph;
+ struct icmp6hdr _icmph, *ic;
+ struct ipv6hdr _ciph, *cih; /* The ip header contained within the ICMP */
+ struct ip_vs_conn *cp;
+ struct ip_vs_protocol *pp;
+ unsigned int offset, verdict;
+
+ *related = 1;
+
+ /* reassemble IP fragments */
+ if (ipv6_hdr(skb)->nexthdr == IPPROTO_FRAGMENT) {
+ if (ip_vs_gather_frags_v6(skb, IP_DEFRAG_VS_OUT))
+ return NF_STOLEN;
+ }
+
+ iph = ipv6_hdr(skb);
+ offset = sizeof(struct ipv6hdr);
+ ic = skb_header_pointer(skb, offset, sizeof(_icmph), &_icmph);
+ if (ic == NULL)
+ return NF_DROP;
+
+ IP_VS_DBG(12, "Outgoing ICMPv6 (%d,%d) " NIP6_FMT "->" NIP6_FMT "\n",
+ ic->icmp6_type, ntohs(icmpv6_id(ic)),
+ NIP6(iph->saddr), NIP6(iph->daddr));
+
+ /*
+ * Work through seeing if this is for us.
+ * These checks are supposed to be in an order that means easy
+ * things are checked first to speed up processing.... however
+ * this means that some packets will manage to get a long way
+ * down this stack and then be rejected, but that's life.
+ */
+ if ((ic->icmp6_type != ICMPV6_DEST_UNREACH) &&
+ (ic->icmp6_type != ICMPV6_PKT_TOOBIG) &&
+ (ic->icmp6_type != ICMPV6_TIME_EXCEED)) {
+ *related = 0;
+ return NF_ACCEPT;
+ }
+
+ /* Now find the contained IP header */
+ offset += sizeof(_icmph);
+ cih = skb_header_pointer(skb, offset, sizeof(_ciph), &_ciph);
+ if (cih == NULL)
+ return NF_ACCEPT; /* The packet looks wrong, ignore */
+
+ pp = ip_vs_proto_get(cih->nexthdr);
+ if (!pp)
+ return NF_ACCEPT;
+
+ /* Is the embedded protocol header present? */
+ /* TODO: we don't support fragmentation at the moment anyways */
+ if (unlikely(cih->nexthdr == IPPROTO_FRAGMENT && pp->dont_defrag))
+ return NF_ACCEPT;
+
+ IP_VS_DBG_PKT(11, pp, skb, offset, "Checking outgoing ICMPv6 for");
+
+ /* The embedded headers contain source and dest in reverse order */
+ cp = pp->conn_out_get_v6(skb, pp, iph, offset, 1);
+ if (!cp)
+ return NF_ACCEPT;
+
+ verdict = NF_DROP;
+
+ if (IP_VS_FWD_METHOD(cp) != 0) {
+ IP_VS_ERR("shouldn't reach here, because the box is on the "
+ "half connection in the tun/dr module.\n");
+ }
+
+ /* Ensure the checksum is correct */
+ if (!skb_csum_unnecessary(skb)
+ && ip_vs_checksum_complete(skb, sizeof(struct ipv6hdr))) {
+ /* Failed checksum! */
+ IP_VS_DBG(1, "Forward ICMPv6: failed checksum from "
+ NIP6_FMT "!\n",
+ NIP6(iph->saddr));
+ goto out;
+ }
+
+ if (IPPROTO_TCP == cih->nexthdr || IPPROTO_UDP == cih->nexthdr)
+ offset += 2 * sizeof(__u16);
+ if (!skb_make_writable(skb, offset))
+ goto out;
+
+ ip_vs_nat_icmp_v6(skb, pp, cp, 1);
+
+ /* do the statistics and put it back */
+ ip_vs_out_stats(cp, skb);
+
+ skb->ipvs_property = 1;
+ verdict = NF_ACCEPT;
+
+ out:
+ __ip_vs_conn_put(cp);
+
+ return verdict;
+}
+#endif
+
+static inline int is_tcp_reset(const struct sk_buff *skb, int nh_len)
{
struct tcphdr _tcph, *th;
- th = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(_tcph), &_tcph);
+ th = skb_header_pointer(skb, nh_len, sizeof(_tcph), &_tcph);
if (th == NULL)
return 0;
return th->rst;
@@ -980,7 +1218,7 @@ ip_vs_out(unsigned int hooknum, struct sk_buff *skb,
* packet or not TCP packet.
*/
if (iph->protocol != IPPROTO_TCP
- || !is_tcp_reset(skb)) {
+ || !is_tcp_reset(skb, ip_hdrlen(skb))) {
icmp_send(skb,ICMP_DEST_UNREACH,
ICMP_PORT_UNREACH, 0);
return NF_DROP;
@@ -1029,6 +1267,114 @@ ip_vs_out(unsigned int hooknum, struct sk_buff *skb,
return NF_STOLEN;
}
+#ifdef CONFIG_IP_VS_IPV6
+static unsigned int
+ip_vs_out_v6(unsigned int hooknum, struct sk_buff *skb,
+ const struct net_device *in, const struct net_device *out,
+ int (*okfn)(struct sk_buff *))
+{
+ struct ipv6hdr *iph;
+ struct ip_vs_protocol *pp;
+ struct ip_vs_conn *cp;
+
+ EnterFunction(11);
+
+ if (skb->ipvs_property)
+ return NF_ACCEPT;
+
+ iph = ipv6_hdr(skb);
+ if (unlikely(iph->nexthdr == IPPROTO_ICMPV6)) {
+ int related, verdict = ip_vs_out_icmp_v6(skb, &related);
+
+ if (related)
+ return verdict;
+ iph = ipv6_hdr(skb);
+ }
+
+ /* TODO IPv6: handle extension headers */
+ pp = ip_vs_proto_get(iph->nexthdr);
+ if (unlikely(!pp))
+ return NF_ACCEPT;
+
+ /* reassemble IP fragments */
+ if (iph->nexthdr == IPPROTO_FRAGMENT) {
+ if (ip_vs_gather_frags_v6(skb, IP_DEFRAG_VS_OUT))
+ return NF_STOLEN;
+ iph = ipv6_hdr(skb);
+ }
+
+ /*
+ * Check if the packet belongs to an existing entry
+ */
+ cp = pp->conn_out_get_v6(skb, pp, iph, sizeof(struct ipv6hdr), 0);
+
+ if (unlikely(!cp)) {
+ if (sysctl_ip_vs_nat_icmp_send &&
+ (pp->protocol == IPPROTO_TCP ||
+ pp->protocol == IPPROTO_UDP)) {
+ __be16 _ports[2], *pptr;
+
+ pptr = skb_header_pointer(skb, sizeof(struct ipv6hdr),
+ sizeof(_ports), _ports);
+ if (pptr == NULL)
+ return NF_ACCEPT; /* Not for me */
+ if (ip_vs_lookup_real_service_v6(iph->nexthdr,
+ &iph->saddr, pptr[0])) {
+ /*
+ * Notify the real server: there is no
+ * existing entry if it is not RST
+ * packet or not TCP packet.
+ */
+ if (iph->nexthdr != IPPROTO_TCP
+ || !is_tcp_reset(skb, sizeof(struct ipv6hdr))) {
+ icmpv6_send(skb, ICMPV6_DEST_UNREACH,
+ ICMPV6_PORT_UNREACH, 0, skb->dev);
+ return NF_DROP;
+ }
+ }
+ }
+ IP_VS_DBG_PKT(12, pp, skb, 0,
+ "packet continues traversal as normal");
+ return NF_ACCEPT;
+ }
+
+ IP_VS_DBG_PKT(11, pp, skb, 0, "Outgoing packet");
+
+ if (!skb_make_writable(skb, sizeof(struct ipv6hdr)))
+ goto drop;
+
+ /* mangle the packet */
+ if (pp->snat_handler_v6 && !pp->snat_handler_v6(skb, pp, cp))
+ goto drop;
+ ipv6_hdr(skb)->saddr = cp->vaddr.v6;
+
+ /* For policy routing, packets originating from this
+ * machine itself may be routed differently to packets
+ * passing through. We want this packet to be routed as
+ * if it came from this machine itself. So re-compute
+ * the routing information.
+ */
+ if (ip6_route_me_harder(skb) != 0)
+ goto drop;
+
+ IP_VS_DBG_PKT(10, pp, skb, 0, "After SNAT");
+
+ ip_vs_out_stats(cp, skb);
+ ip_vs_set_state(cp, IP_VS_DIR_OUTPUT, skb, pp);
+ ip_vs_conn_put(cp);
+
+ skb->ipvs_property = 1;
+
+ LeaveFunction(11);
+ return NF_ACCEPT;
+
+ drop:
+ ip_vs_conn_put(cp);
+ kfree_skb(skb);
+ return NF_STOLEN;
+}
+#endif
+
/*
* Handle ICMP messages in the outside-to-inside direction (incoming).
@@ -1126,6 +1472,90 @@ ip_vs_in_icmp(struct sk_buff *skb, int *related, unsigned int hooknum)
return verdict;
}
+#ifdef CONFIG_IP_VS_IPV6
+static int
+ip_vs_in_icmp_v6(struct sk_buff *skb, int *related, unsigned int hooknum)
+{
+ struct ipv6hdr *iph;
+ struct icmp6hdr _icmph, *ic;
+ struct ipv6hdr _ciph, *cih; /* The ip header contained within the ICMP */
+ struct ip_vs_conn *cp;
+ struct ip_vs_protocol *pp;
+ unsigned int offset, verdict;
+
+ *related = 1;
+
+ /* reassemble IP fragments */
+ if (ipv6_hdr(skb)->nexthdr == IPPROTO_FRAGMENT) {
+ if (ip_vs_gather_frags_v6(skb, hooknum == NF_INET_LOCAL_IN ?
+ IP_DEFRAG_VS_IN : IP_DEFRAG_VS_FWD))
+ return NF_STOLEN;
+ }
+
+ iph = ipv6_hdr(skb);
+ offset = sizeof(struct ipv6hdr);
+ ic = skb_header_pointer(skb, offset, sizeof(_icmph), &_icmph);
+ if (ic == NULL)
+ return NF_DROP;
+
+ IP_VS_DBG(12, "Incoming ICMPv6 (%d,%d) " NIP6_FMT "->" NIP6_FMT "\n",
+ ic->icmp6_type, ntohs(icmpv6_id(ic)),
+ NIP6(iph->saddr), NIP6(iph->daddr));
+
+ /*
+ * Work through seeing if this is for us.
+ * These checks are supposed to be in an order that means easy
+ * things are checked first to speed up processing.... however
+ * this means that some packets will manage to get a long way
+ * down this stack and then be rejected, but that's life.
+ */
+ if ((ic->icmp6_type != ICMPV6_DEST_UNREACH) &&
+ (ic->icmp6_type != ICMPV6_PKT_TOOBIG) &&
+ (ic->icmp6_type != ICMPV6_TIME_EXCEED)) {
+ *related = 0;
+ return NF_ACCEPT;
+ }
+
+ /* Now find the contained IP header */
+ offset += sizeof(_icmph);
+ cih = skb_header_pointer(skb, offset, sizeof(_ciph), &_ciph);
+ if (cih == NULL)
+ return NF_ACCEPT; /* The packet looks wrong, ignore */
+
+ pp = ip_vs_proto_get(cih->nexthdr);
+ if (!pp)
+ return NF_ACCEPT;
+
+ /* Is the embedded protocol header present? */
+ /* TODO: we don't support fragmentation at the moment anyways */
+ if (unlikely(cih->nexthdr == IPPROTO_FRAGMENT && pp->dont_defrag))
+ return NF_ACCEPT;
+
+ IP_VS_DBG_PKT(11, pp, skb, offset, "Checking incoming ICMPv6 for");
+
+ offset += sizeof(struct ipv6hdr);
+
+ /* The embedded headers contain source and dest in reverse order */
+ cp = pp->conn_in_get_v6(skb, pp, cih, offset, 1);
+ if (!cp)
+ return NF_ACCEPT;
+
+ verdict = NF_DROP;
+
+ /* do the statistics and put it back */
+ ip_vs_in_stats(cp, skb);
+ if (IPPROTO_TCP == cih->nexthdr || IPPROTO_UDP == cih->nexthdr)
+ offset += 2 * sizeof(__u16);
+ verdict = ip_vs_icmp_xmit_v6(skb, cp, pp, offset);
+ /* do not touch skb anymore */
+
+ __ip_vs_conn_put(cp);
+
+ return verdict;
+}
+#endif
+
+
/*
* Check if it's for virtual services, look it up,
* and send it on its way...
@@ -1237,6 +1667,118 @@ ip_vs_in(unsigned int hooknum, struct sk_buff *skb,
return ret;
}
+#ifdef CONFIG_IP_VS_IPV6
+static unsigned int
+ip_vs_in_v6(unsigned int hooknum, struct sk_buff *skb,
+ const struct net_device *in, const struct net_device *out,
+ int (*okfn)(struct sk_buff *))
+{
+ struct ipv6hdr *iph;
+ struct ip_vs_protocol *pp;
+ struct ip_vs_conn *cp;
+ int ret, restart;
+
+ /*
+ * Big tappo: only PACKET_HOST (neither loopback nor mcasts)
+ * ... don't know why 1st test DOES NOT include 2nd (?)
+ */
+ if (unlikely(skb->pkt_type != PACKET_HOST
+ || skb->dev->flags & IFF_LOOPBACK || skb->sk)) {
+ IP_VS_DBG(12, "packet type=%d proto=%d daddr=" NIP6_FMT " ignored\n",
+ skb->pkt_type,
+ ipv6_hdr(skb)->nexthdr,
+ NIP6(ipv6_hdr(skb)->daddr));
+ return NF_ACCEPT;
+ }
+
+ iph = ipv6_hdr(skb);
+ if (unlikely(iph->nexthdr == IPPROTO_ICMPV6)) {
+ int related, verdict = ip_vs_in_icmp_v6(skb, &related, hooknum);
+
+ if (related)
+ return verdict;
+ iph = ipv6_hdr(skb);
+ }
+
+ /* Protocol supported? */
+ /* TODO IPv6: handle extension headers */
+ pp = ip_vs_proto_get(iph->nexthdr);
+ if (unlikely(!pp))
+ return NF_ACCEPT;
+
+ /*
+ * Check if the packet belongs to an existing connection entry
+ */
+ cp = pp->conn_in_get_v6(skb, pp, iph, sizeof(struct ipv6hdr), 0);
+
+ if (unlikely(!cp)) {
+ int v;
+
+ if (!pp->conn_schedule_v6(skb, pp, &v, &cp))
+ return v;
+ }
+
+ if (unlikely(!cp)) {
+ /* sorry, all this trouble for a no-hit :) */
+ IP_VS_DBG_PKT(12, pp, skb, 0,
+ "packet continues traversal as normal");
+ return NF_ACCEPT;
+ }
+
+ IP_VS_DBG_PKT(11, pp, skb, 0, "Incoming packet");
+
+ /* Check the server status */
+ if (cp->dest && !(cp->dest->flags & IP_VS_DEST_F_AVAILABLE)) {
+ /* the destination server is not available */
+
+ if (sysctl_ip_vs_expire_nodest_conn) {
+ /* try to expire the connection immediately */
+ ip_vs_conn_expire_now(cp);
+ }
+ /* don't restart its timer, and silently
+ drop the packet. */
+ __ip_vs_conn_put(cp);
+ return NF_DROP;
+ }
+
+ ip_vs_in_stats(cp, skb);
+ restart = ip_vs_set_state(cp, IP_VS_DIR_INPUT, skb, pp);
+ if (cp->packet_xmit)
+ ret = cp->packet_xmit(skb, cp, pp);
+ /* do not touch skb anymore */
+ else {
+ IP_VS_DBG_RL("warning: packet_xmit is null");
+ ret = NF_ACCEPT;
+ }
+
+ /* Increase its packet counter and check if it is needed
+ * to be synchronized
+ *
+ * Sync connection if it is about to close to
+ * encourage the standby servers to update the connection's timeout
+ *
+ * TODO IPv6: make sync daemon work with IPv6, disabled for now
+ */
+
+ /*
+ atomic_inc(&cp->in_pkts);
+ if ((ip_vs_sync_state & IP_VS_STATE_MASTER) &&
+ (((cp->protocol != IPPROTO_TCP ||
+ cp->state == IP_VS_TCP_S_ESTABLISHED) &&
+ (atomic_read(&cp->in_pkts) % sysctl_ip_vs_sync_threshold[1]
+ == sysctl_ip_vs_sync_threshold[0])) ||
+ ((cp->protocol == IPPROTO_TCP) && (cp->old_state != cp->state) &&
+ ((cp->state == IP_VS_TCP_S_FIN_WAIT) ||
+ (cp->state == IP_VS_TCP_S_CLOSE)))))
+ ip_vs_sync_conn(cp);
+ */
+ cp->old_state = cp->state;
+
+ ip_vs_conn_put(cp);
+ return ret;
+}
+#endif
+
/*
* It is hooked at the NF_INET_FORWARD chain, in order to catch ICMP
@@ -1260,6 +1802,21 @@ ip_vs_forward_icmp(unsigned int hooknum, struct sk_buff *skb,
return ip_vs_in_icmp(skb, &r, hooknum);
}
+#ifdef CONFIG_IP_VS_IPV6
+static unsigned int
+ip_vs_forward_icmp_v6(unsigned int hooknum, struct sk_buff *skb,
+ const struct net_device *in, const struct net_device *out,
+ int (*okfn)(struct sk_buff *))
+{
+ int r;
+
+ if (ipv6_hdr(skb)->nexthdr != IPPROTO_ICMPV6)
+ return NF_ACCEPT;
+
+ return ip_vs_in_icmp_v6(skb, &r, hooknum);
+}
+#endif
+
static struct nf_hook_ops ip_vs_ops[] __read_mostly = {
/* After packet filtering, forward packet through VS/DR, VS/TUN,
@@ -1297,6 +1854,43 @@ static struct nf_hook_ops ip_vs_ops[] __read_mostly = {
.hooknum = NF_INET_POST_ROUTING,
.priority = NF_IP_PRI_NAT_SRC-1,
},
+#ifdef CONFIG_IP_VS_IPV6
+ /* After packet filtering, forward packet through VS/DR, VS/TUN,
+ * or VS/NAT(change destination), so that filtering rules can be
+ * applied to IPVS. */
+ {
+ .hook = ip_vs_in_v6,
+ .owner = THIS_MODULE,
+ .pf = PF_INET6,
+ .hooknum = NF_INET_LOCAL_IN,
+ .priority = 100,
+ },
+ /* After packet filtering, change source only for VS/NAT */
+ {
+ .hook = ip_vs_out_v6,
+ .owner = THIS_MODULE,
+ .pf = PF_INET6,
+ .hooknum = NF_INET_FORWARD,
+ .priority = 100,
+ },
+ /* After packet filtering (but before ip_vs_out_icmp_v6), catch ICMPv6
+ * destined for ::/0, which is for incoming IPVS connections */
+ {
+ .hook = ip_vs_forward_icmp_v6,
+ .owner = THIS_MODULE,
+ .pf = PF_INET6,
+ .hooknum = NF_INET_FORWARD,
+ .priority = 99,
+ },
+ /* Before the netfilter connection tracking, exit from POST_ROUTING */
+ {
+ .hook = ip_vs_post_routing,
+ .owner = THIS_MODULE,
+ .pf = PF_INET6,
+ .hooknum = NF_INET_POST_ROUTING,
+ .priority = NF_IP6_PRI_NAT_SRC-1,
+ },
+#endif
};
--
1.5.3.6
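
On the "TODO IPv6: is this correct for ICMPv6?" left in ip_vs_nat_icmp_v6() above: unlike ICMPv4, the ICMPv6 checksum also covers an IPv6 pseudo-header (source address, destination address, upper-layer length, and next header value 58), as specified in RFC 4443, section 2.3. Because NAT rewrites the outer addresses, a checksum taken over the ICMPv6 message alone would presumably not verify at the receiver. In the kernel the pseudo-header is normally folded in with csum_ipv6_magic(). The following is only a userspace sketch of the arithmetic, not the kernel code path; sum16() and icmpv6_checksum() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a byte string to the accumulator as big-endian 16-bit words. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t acc)
{
	while (len > 1) {
		acc += (uint32_t)p[0] << 8 | p[1];
		p += 2;
		len -= 2;
	}
	if (len)			/* odd trailing byte, padded with zero */
		acc += (uint32_t)p[0] << 8;
	return acc;
}

/* Hypothetical helper: checksum of an ICMPv6 message including the IPv6
 * pseudo-header. The checksum field inside icmp[] must be zero on entry. */
static uint16_t icmpv6_checksum(const uint8_t src[16], const uint8_t dst[16],
				const uint8_t *icmp, size_t len)
{
	uint32_t acc = 0;

	assert(len <= 0xFFFF);		/* keeps the 32-bit accumulator safe */
	acc = sum16(src, 16, acc);
	acc = sum16(dst, 16, acc);
	acc += (uint32_t)len;		/* upper-layer length (high word is 0) */
	acc += 58;			/* next header: ICMPv6 */
	acc = sum16(icmp, len, acc);
	while (acc >> 16)		/* fold carries back in */
		acc = (acc & 0xFFFF) + (acc >> 16);
	return (uint16_t)~acc;
}
```

A quick sanity check for any such routine: after the returned value is written into the checksum field (bytes 2 and 3 of the ICMPv6 header, big-endian), running the same function over the message again should return 0.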
* [PATCH 21/26] IPVS: Make proc/net files output IPv6 entries correctly.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (19 preceding siblings ...)
2008-06-11 17:12 ` [PATCH 20/26] IPVS: Add IPv6 Netfilter hooks and add/modify support functions Julius R. Volz
@ 2008-06-11 17:12 ` Julius R. Volz
2008-06-11 17:12 ` [PATCH 22/26] IPVS: Add function to find out if IPv6 address is local Julius R. Volz
` (5 subsequent siblings)
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:12 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam
From: Vince Busam <vbusam@google.com>
Add support for IPv6 entry output to ip_vs_conn_seq_show() and
ip_vs_conn_sync_seq_show() proc/net file handlers.
Signed-off-by: Vince Busam <vbusam@google.com>
1 files changed, 35 insertions(+), 10 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 30e1ad2..ea0fd77 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -1032,14 +1032,26 @@ static int ip_vs_conn_seq_show(struct seq_file *seq, void *v)
else {
const struct ip_vs_conn *cp = v;
- seq_printf(seq,
- "%-3s %08X %04X %08X %04X %08X %04X %-11s %7lu\n",
+ if (cp->af == AF_INET)
+ seq_printf(seq,
+ "%-3s %08X %04X %08X %04X %08X %04X %-11s %7lu\n",
ip_vs_proto_name(cp->protocol),
- ntohl(cp->caddr), ntohs(cp->cport),
- ntohl(cp->vaddr), ntohs(cp->vport),
- ntohl(cp->daddr), ntohs(cp->dport),
+ ntohl(cp->caddr.v4), ntohs(cp->cport),
+ ntohl(cp->vaddr.v4), ntohs(cp->vport),
+ ntohl(cp->daddr.v4), ntohs(cp->dport),
ip_vs_state_name(cp->protocol, cp->state),
(cp->timer.expires-jiffies)/HZ);
+#ifdef CONFIG_IP_VS_IPV6
+ else
+ seq_printf(seq,
+ "%-3s " NIP6_FMT " %04X " NIP6_FMT " %04X " NIP6_FMT " %04X %-11s %7lu\n",
+ ip_vs_proto_name(cp->protocol),
+ NIP6(cp->caddr.v6), ntohs(cp->cport),
+ NIP6(cp->vaddr.v6), ntohs(cp->vport),
+ NIP6(cp->daddr.v6), ntohs(cp->dport),
+ ip_vs_state_name(cp->protocol, cp->state),
+ (cp->timer.expires-jiffies)/HZ);
+#endif
}
return 0;
}
@@ -1081,15 +1093,28 @@ static int ip_vs_conn_sync_seq_show(struct seq_file *seq, void *v)
else {
const struct ip_vs_conn *cp = v;
- seq_printf(seq,
- "%-3s %08X %04X %08X %04X %08X %04X %-11s %-6s %7lu\n",
+ if (cp->af == AF_INET)
+ seq_printf(seq,
+ "%-3s %08X %04X %08X %04X %08X %04X %-11s %-6s %7lu\n",
ip_vs_proto_name(cp->protocol),
- ntohl(cp->caddr), ntohs(cp->cport),
- ntohl(cp->vaddr), ntohs(cp->vport),
- ntohl(cp->daddr), ntohs(cp->dport),
+ ntohl(cp->caddr.v4), ntohs(cp->cport),
+ ntohl(cp->vaddr.v4), ntohs(cp->vport),
+ ntohl(cp->daddr.v4), ntohs(cp->dport),
ip_vs_state_name(cp->protocol, cp->state),
ip_vs_origin_name(cp->flags),
(cp->timer.expires-jiffies)/HZ);
+#ifdef CONFIG_IP_VS_IPV6
+ else
+ seq_printf(seq,
+ "%-3s " NIP6_FMT " %04X " NIP6_FMT " %04X " NIP6_FMT " %04X %-11s %-6s %7lu\n",
+ ip_vs_proto_name(cp->protocol),
+ NIP6(cp->caddr.v6), ntohs(cp->cport),
+ NIP6(cp->vaddr.v6), ntohs(cp->vport),
+ NIP6(cp->daddr.v6), ntohs(cp->dport),
+ ip_vs_state_name(cp->protocol, cp->state),
+ ip_vs_origin_name(cp->flags),
+ (cp->timer.expires-jiffies)/HZ);
+#endif
}
return 0;
}
--
1.5.3.6
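
The IPv6 branches added to ip_vs_conn_seq_show() and ip_vs_conn_sync_seq_show() above print each address through NIP6_FMT where the IPv4 branch prints a single %08X word. The userspace sketch below shows what one such address field would look like, assuming NIP6_FMT expands to eight colon-separated %04x groups without zero compression; format_nip6() is a hypothetical name, not something the patch defines.

```c
#include <assert.h>
#include <netinet/in.h>  /* struct in6_addr */
#include <stdio.h>

/* Hypothetical helper: render an address as eight uncompressed 4-digit hex
 * groups. The buffer needs room for 39 characters plus the terminating NUL. */
static int format_nip6(char *buf, size_t size, const struct in6_addr *a)
{
	int n = 0;

	assert(size >= 40);
	for (int i = 0; i < 16; i += 2)
		n += snprintf(buf + n, size - (size_t)n, "%02x%02x%s",
			      (unsigned)a->s6_addr[i],
			      (unsigned)a->s6_addr[i + 1],
			      i < 14 ? ":" : "");
	return n;
}
```

Under that assumption an IPv6 row is much wider than an IPv4 row (39 characters per address against 8), which is something parsers of these proc files would have to cope with.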
* [PATCH 22/26] IPVS: Add function to find out if IPv6 address is local.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (20 preceding siblings ...)
2008-06-11 17:12 ` [PATCH 21/26] IPVS: Make proc/net files output IPv6 entries correctly Julius R. Volz
@ 2008-06-11 17:12 ` Julius R. Volz
2008-06-11 17:12 ` [PATCH 23/26] IPVS: Add hash functions for IPv6 services and real servers Julius R. Volz
` (4 subsequent siblings)
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:12 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam
From: Vince Busam <vbusam@google.com>
Add __ip_vs_addr_is_local_v6() to find out if an IPv6 address belongs to a
local interface. This function is used to decide whether to set the
IP_VS_CONN_F_LOCALNODE flag for IPv6 destinations.
Signed-off-by: Vince Busam <vbusam@google.com>
1 files changed, 30 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index da2e431..cf52034 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -92,6 +92,23 @@ int ip_vs_get_debug_level(void)
}
#endif
+#ifdef CONFIG_IP_VS_IPV6
+/* Taken from rt6_fill_node() in net/ipv6/route.c, is there a better way? */
+static int __ip_vs_addr_is_local_v6(const struct in6_addr *addr) {
+ struct rt6_info *rt;
+ struct flowi fl = {
+ .oif = 0,
+ .nl_u = {
+ .ip6_u = {
+ .daddr = *addr,
+ .saddr = { .s6_addr32 = {0, 0, 0, 0} }, } },
+ };
+ if ((rt = (struct rt6_info *)ip6_route_output(&init_net, NULL, &fl)))
+ if (rt->rt6i_dev && (rt->rt6i_dev->flags&IFF_LOOPBACK))
+ return 1;
+ return 0;
+}
+#endif
/*
* update_defense_level is called from keventd and from sysctl,
* so it needs to protect itself from softirqs
@@ -708,9 +725,19 @@ __ip_vs_update_dest(struct ip_vs_service *svc,
conn_flags = udest->conn_flags | IP_VS_CONN_F_INACTIVE;
/* check if local node and update the flags */
- if (inet_addr_type(&init_net, udest->addr) == RTN_LOCAL) {
- conn_flags = (conn_flags & ~IP_VS_CONN_F_FWD_MASK)
- | IP_VS_CONN_F_LOCALNODE;
+#ifdef CONFIG_IP_VS_IPV6
+ if (svc->af == AF_INET6) {
+ if (__ip_vs_addr_is_local_v6(&udest->addr.v6)) {
+ conn_flags = (conn_flags & ~IP_VS_CONN_F_FWD_MASK)
+ | IP_VS_CONN_F_LOCALNODE;
+ }
+ }
+#endif
+ if (svc->af == AF_INET) {
+ if (inet_addr_type(&init_net, udest->addr.v4) == RTN_LOCAL) {
+ conn_flags = (conn_flags & ~IP_VS_CONN_F_FWD_MASK)
+ | IP_VS_CONN_F_LOCALNODE;
+ }
}
/* set the IP_VS_CONN_F_NOOUTPUT flag if not masquerading/NAT */
--
1.5.3.6
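
The flag update in __ip_vs_update_dest() above replaces only the forwarding-method bits of conn_flags and leaves every other flag alone. A small userspace sketch of that masking follows. The numeric flag values are believed to match include/linux/ip_vs.h of that period but should be treated as assumptions, and set_fwd_method() and update_dest_flags() are hypothetical names.

```c
#include <assert.h>

/* Assumed values, copied for illustration only. */
#define IP_VS_CONN_F_FWD_MASK   0x0007u  /* mask for the forwarding method */
#define IP_VS_CONN_F_MASQ       0x0000u
#define IP_VS_CONN_F_LOCALNODE  0x0001u
#define IP_VS_CONN_F_TUNNEL     0x0002u
#define IP_VS_CONN_F_DROUTE     0x0003u
#define IP_VS_CONN_F_INACTIVE   0x0100u

/* Replace the forwarding method while preserving all other flag bits. */
static unsigned set_fwd_method(unsigned conn_flags, unsigned method)
{
	assert((method & ~IP_VS_CONN_F_FWD_MASK) == 0);
	return (conn_flags & ~IP_VS_CONN_F_FWD_MASK) | method;
}

/* Hypothetical stand-in for the "check if local node" step: addr_is_local
 * would come from inet_addr_type() or __ip_vs_addr_is_local_v6(). */
static unsigned update_dest_flags(unsigned conn_flags, int addr_is_local)
{
	if (addr_is_local)
		conn_flags = set_fwd_method(conn_flags, IP_VS_CONN_F_LOCALNODE);
	return conn_flags;
}
```

With these values, a destination configured for direct routing and marked inactive keeps its inactive bit but switches to the local-node method once its address is found to be local.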
* [PATCH 23/26] IPVS: Add hash functions for IPv6 services and real servers.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (21 preceding siblings ...)
2008-06-11 17:12 ` [PATCH 22/26] IPVS: Add function to find out if IPv6 address is local Julius R. Volz
@ 2008-06-11 17:12 ` Julius R. Volz
2008-06-11 17:12 ` [PATCH 24/26] IPVS: Add IPv6 support to userspace interface Julius R. Volz
` (3 subsequent siblings)
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:12 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add hashing functions ip_vs_svc_hashkey_v6() for hashing IPv6 service
entries and ip_vs_rs_hashkey_v6() for hashing real servers.
Signed-off-by: Julius R. Volz <juliusv@google.com>
1 files changed, 46 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index cf52034..ca198b9 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -37,6 +37,10 @@
#include <net/net_namespace.h>
#include <net/ip.h>
+#ifdef CONFIG_IP_VS_IPV6
+#include <net/ipv6.h>
+#include <net/ip6_route.h>
+#endif
#include <net/route.h>
#include <net/sock.h>
@@ -308,6 +312,20 @@ ip_vs_svc_hashkey(unsigned proto, __be32 addr, __be16 port)
& IP_VS_SVC_TAB_MASK;
}
+#ifdef CONFIG_IP_VS_IPV6
+static __inline__ unsigned
+ip_vs_svc_hashkey_v6(unsigned proto, const struct in6_addr *addr, __be16 port)
+{
+ register unsigned porth = ntohs(port);
+
+ /* TODO IPv6: is this a good way to hash IPv6 entries? */
+ int addr_fold = addr->s6_addr32[0]^addr->s6_addr32[1]^
+ addr->s6_addr32[2]^addr->s6_addr32[3];
+ return (proto^ntohl(addr_fold)^(porth>>IP_VS_SVC_TAB_BITS)^porth)
+ & IP_VS_SVC_TAB_MASK;
+}
+#endif
+
/*
* Returns hash value of fwmark for virtual service lookup
*/
@@ -335,7 +353,13 @@ static int ip_vs_svc_hash(struct ip_vs_service *svc)
/*
* Hash it by <protocol,addr,port> in ip_vs_svc_table
*/
- hash = ip_vs_svc_hashkey(svc->protocol, svc->addr, svc->port);
+#ifdef CONFIG_IP_VS_IPV6
+ hash = (svc->af == AF_INET)
+ ? ip_vs_svc_hashkey(svc->protocol, svc->addr.v4, svc->port)
+ : ip_vs_svc_hashkey_v6(svc->protocol, &svc->addr.v6, svc->port);
+#else
+ hash = ip_vs_svc_hashkey(svc->protocol, svc->addr.v4, svc->port);
+#endif
list_add(&svc->s_list, &ip_vs_svc_table[hash]);
} else {
/*
@@ -506,6 +530,20 @@ static __inline__ unsigned ip_vs_rs_hashkey(__be32 addr, __be16 port)
& IP_VS_RTAB_MASK;
}
+#ifdef CONFIG_IP_VS_IPV6
+static __inline__ unsigned ip_vs_rs_hashkey_v6(const struct in6_addr *addr,
+ __be16 port)
+{
+ register unsigned porth = ntohs(port);
+
+ /* TODO IPv6: is this a good way to hash IPv6 entries? */
+ int addr_fold = addr->s6_addr32[0]^addr->s6_addr32[1]^
+ addr->s6_addr32[2]^addr->s6_addr32[3];
+ return (ntohl(addr_fold)^(porth>>IP_VS_RTAB_BITS)^porth)
+ & IP_VS_RTAB_MASK;
+}
+#endif
+
/*
* Hashes ip_vs_dest in ip_vs_rtable by <proto,addr,port>.
* should be called with locked tables.
@@ -522,7 +560,13 @@ static int ip_vs_rs_hash(struct ip_vs_dest *dest)
* Hash by proto,addr,port,
* which are the parameters of the real service.
*/
- hash = ip_vs_rs_hashkey(dest->addr, dest->port);
+#ifdef CONFIG_IP_VS_IPV6
+ if (dest->af == AF_INET6)
+ hash = ip_vs_rs_hashkey_v6(&dest->addr.v6, dest->port);
+ else
+#endif
+ hash = ip_vs_rs_hashkey(dest->addr.v4, dest->port);
+
list_add(&dest->d_list, &ip_vs_rtable[hash]);
return 1;
--
1.5.3.6
^ permalink raw reply related [flat|nested] 76+ messages in thread
* [PATCH 24/26] IPVS: Add IPv6 support to userspace interface.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (22 preceding siblings ...)
2008-06-11 17:12 ` [PATCH 23/26] IPVS: Add hash functions for IPv6 services and real servers Julius R. Volz
@ 2008-06-11 17:12 ` Julius R. Volz
2008-06-12 1:55 ` Brian Haley
2008-06-11 17:12 ` [PATCH 25/26] IPVS: Add support for IPv6 entry output in procfs files Julius R. Volz
` (2 subsequent siblings)
26 siblings, 1 reply; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:12 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add support for adding/modifying/deleting IPv6 service and dest entries by
introducing several new functions and implementing corresponding switches
and checks in existing code.
Signed-off-by: Julius R. Volz <juliusv@google.com>
3 files changed, 364 insertions(+), 19 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index ab59696..9790ed4 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1017,6 +1017,12 @@ extern struct ctl_path net_vs_ctl_path[];
extern struct ip_vs_service *
ip_vs_service_get(__u32 fwmark, __u16 protocol, __be32 vaddr, __be16 vport);
+#ifdef CONFIG_IP_VS_IPV6
+extern struct ip_vs_service *
+ip_vs_service_get_v6(__u32 fwmark, __u16 protocol,
+ const struct in6_addr *vaddr, __be16 vport);
+#endif
+
static inline void ip_vs_service_put(struct ip_vs_service *svc)
{
atomic_dec(&svc->usecnt);
@@ -1024,6 +1030,11 @@ static inline void ip_vs_service_put(struct ip_vs_service *svc)
extern struct ip_vs_dest *
ip_vs_lookup_real_service(__u16 protocol, __be32 daddr, __be16 dport);
+#ifdef CONFIG_IP_VS_IPV6
+extern struct ip_vs_dest *
+ip_vs_lookup_real_service_v6(__u16 protocol, const struct in6_addr *daddr,
+ __be16 dport);
+#endif
extern int ip_vs_use_count_inc(void);
extern void ip_vs_use_count_dec(void);
extern int ip_vs_control_init(void);
@@ -1031,6 +1042,11 @@ extern void ip_vs_control_cleanup(void);
extern struct ip_vs_dest *
ip_vs_find_dest(__be32 daddr, __be16 dport,
__be32 vaddr, __be16 vport, __u16 protocol);
+#ifdef CONFIG_IP_VS_IPV6
+extern struct ip_vs_dest *
+ip_vs_find_dest_v6(const struct in6_addr *daddr, __be16 dport,
+ const struct in6_addr *vaddr, __be16 vport, __u16 protocol);
+#endif
extern struct ip_vs_dest *ip_vs_try_bind_dest(struct ip_vs_conn *cp);
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index ea0fd77..6b031a8 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -633,8 +633,16 @@ struct ip_vs_dest *ip_vs_try_bind_dest(struct ip_vs_conn *cp)
struct ip_vs_dest *dest;
if ((cp) && (!cp->dest)) {
- dest = ip_vs_find_dest(cp->daddr, cp->dport,
- cp->vaddr, cp->vport, cp->protocol);
+#ifdef CONFIG_IP_VS_IPV6
+ if (cp->af == AF_INET6)
+ dest = ip_vs_find_dest_v6(&cp->daddr.v6, cp->dport,
+ &cp->vaddr.v6, cp->vport,
+ cp->protocol);
+ else
+#endif
+ dest = ip_vs_find_dest(cp->daddr.v4, cp->dport,
+ cp->vaddr.v4, cp->vport,
+ cp->protocol);
ip_vs_bind_dest(cp, dest);
return dest;
} else
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index ca198b9..388278a 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -428,6 +428,31 @@ __ip_vs_service_get(__u16 protocol, __be32 vaddr, __be16 vport)
return NULL;
}
+#ifdef CONFIG_IP_VS_IPV6
+static __inline__ struct ip_vs_service *
+__ip_vs_service_get_v6(__u16 protocol, const struct in6_addr *vaddr, __be16 vport)
+{
+ unsigned hash;
+ struct ip_vs_service *svc;
+
+ /* Check for "full" addressed entries */
+ hash = ip_vs_svc_hashkey_v6(protocol, vaddr, vport);
+
+ list_for_each_entry(svc, &ip_vs_svc_table[hash], s_list){
+ if ((svc->af == AF_INET6)
+ && ipv6_addr_equal(&svc->addr.v6, vaddr)
+ && (svc->port == vport)
+ && (svc->protocol == protocol)) {
+ /* HIT */
+ atomic_inc(&svc->usecnt);
+ return svc;
+ }
+ }
+
+ return NULL;
+}
+#endif
+
/*
* Get service by {fwmark} in the service table.
@@ -500,6 +525,56 @@ ip_vs_service_get(__u32 fwmark, __u16 protocol, __be32 vaddr, __be16 vport)
return svc;
}
+#ifdef CONFIG_IP_VS_IPV6
+struct ip_vs_service *
+ip_vs_service_get_v6(__u32 fwmark, __u16 protocol, const struct in6_addr *vaddr, __be16 vport)
+{
+ struct ip_vs_service *svc;
+
+ read_lock(&__ip_vs_svc_lock);
+
+ /*
+ * Check the table hashed by fwmark first
+ */
+ if (fwmark && (svc = __ip_vs_svc_fwm_get(fwmark)))
+ goto out;
+
+ /*
+ * Check the table hashed by <protocol,addr,port>
+ * for "full" addressed entries
+ */
+ svc = __ip_vs_service_get_v6(protocol, vaddr, vport);
+
+ if (svc == NULL
+ && protocol == IPPROTO_TCP
+ && atomic_read(&ip_vs_ftpsvc_counter)
+ && (vport == FTPDATA || ntohs(vport) >= PROT_SOCK)) {
+ /*
+ * Check if ftp service entry exists, the packet
+ * might belong to FTP data connections.
+ */
+ svc = __ip_vs_service_get_v6(protocol, vaddr, FTPPORT);
+ }
+
+ if (svc == NULL
+ && atomic_read(&ip_vs_nullsvc_counter)) {
+ /*
+ * Check if the catch-all port (port zero) exists
+ */
+ svc = __ip_vs_service_get_v6(protocol, vaddr, 0);
+ }
+
+ out:
+ read_unlock(&__ip_vs_svc_lock);
+
+ IP_VS_DBG(9, "lookup service: fwm %u %s " NIP6_FMT ":%u %s\n",
+ fwmark, ip_vs_proto_name(protocol),
+ NIP6(*vaddr), ntohs(vport),
+ svc?"hit":"not hit");
+
+ return svc;
+}
+#endif
static inline void
__ip_vs_bind_svc(struct ip_vs_dest *dest, struct ip_vs_service *svc)
@@ -621,6 +696,38 @@ ip_vs_lookup_real_service(__u16 protocol, __be32 daddr, __be16 dport)
return NULL;
}
+#ifdef CONFIG_IP_VS_IPV6
+struct ip_vs_dest *
+ip_vs_lookup_real_service_v6(__u16 protocol, const struct in6_addr *daddr,
+ __be16 dport)
+{
+ unsigned hash;
+ struct ip_vs_dest *dest;
+
+ /*
+ * Check for "full" addressed entries
+ * Return the first found entry
+ */
+ hash = ip_vs_rs_hashkey_v6(daddr, dport);
+
+ read_lock(&__ip_vs_rs_lock);
+ list_for_each_entry(dest, &ip_vs_rtable[hash], d_list) {
+ if ((dest->af == AF_INET6)
+ && (ipv6_addr_equal(&dest->addr.v6, daddr))
+ && (dest->port == dport)
+ && ((dest->protocol == protocol) ||
+ dest->vfwmark)) {
+ /* HIT */
+ read_unlock(&__ip_vs_rs_lock);
+ return dest;
+ }
+ }
+ read_unlock(&__ip_vs_rs_lock);
+
+ return NULL;
+}
+#endif
+
/*
* Lookup destination by {addr,port} in the given service
*/
@@ -643,6 +750,29 @@ ip_vs_lookup_dest(struct ip_vs_service *svc, __be32 daddr, __be16 dport)
return NULL;
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_dest *
+ip_vs_lookup_dest_v6(struct ip_vs_service *svc, const struct in6_addr *daddr,
+ __be16 dport)
+{
+ struct ip_vs_dest *dest;
+
+ /*
+ * Find the destination for the given service
+ */
+ list_for_each_entry(dest, &svc->destinations, n_list) {
+ if ((dest->af == AF_INET6)
+ && ipv6_addr_equal(&dest->addr.v6, daddr)
+ && (dest->port == dport)) {
+ /* HIT */
+ return dest;
+ }
+ }
+
+ return NULL;
+}
+#endif
+
/*
* Find destination by {daddr,dport,vaddr,protocol}
* Cretaed to be used in ip_vs_process_message() in
@@ -669,6 +799,25 @@ struct ip_vs_dest *ip_vs_find_dest(__be32 daddr, __be16 dport,
return dest;
}
+#ifdef CONFIG_IP_VS_IPV6
+struct ip_vs_dest *ip_vs_find_dest_v6(const struct in6_addr *daddr, __be16 dport,
+ const struct in6_addr *vaddr, __be16 vport,
+ __u16 protocol)
+{
+ struct ip_vs_dest *dest;
+ struct ip_vs_service *svc;
+
+ svc = ip_vs_service_get_v6(0, protocol, vaddr, vport);
+ if (!svc)
+ return NULL;
+ dest = ip_vs_lookup_dest_v6(svc, daddr, dport);
+ if (dest)
+ atomic_inc(&dest->refcnt);
+ ip_vs_service_put(svc);
+ return dest;
+}
+#endif
+
/*
* Lookup dest by {svc,addr,port} in the destination trash.
* The destination trash is used to hold the destinations that are removed
@@ -723,12 +872,59 @@ ip_vs_trash_get_dest(struct ip_vs_service *svc, __be32 daddr, __be16 dport)
return NULL;
}
+#ifdef CONFIG_IP_VS_IPV6
+static struct ip_vs_dest *
+ip_vs_trash_get_dest_v6(struct ip_vs_service *svc, const struct in6_addr *daddr,
+ __be16 dport)
+{
+ struct ip_vs_dest *dest, *nxt;
+
+ /*
+ * Find the destination in trash
+ */
+ list_for_each_entry_safe(dest, nxt, &ip_vs_dest_trash, n_list) {
+ IP_VS_DBG(3, "Destination %u/" NIP6_FMT ":%u still in trash, "
+ "dest->refcnt=%d\n",
+ dest->vfwmark,
+ NIP6(dest->addr.v6), ntohs(dest->port),
+ atomic_read(&dest->refcnt));
+ if (dest->af == AF_INET6 &&
+ ipv6_addr_equal(&dest->addr.v6, daddr) &&
+ dest->port == dport &&
+ dest->vfwmark == svc->fwmark &&
+ dest->protocol == svc->protocol &&
+ (svc->fwmark ||
+ (ipv6_addr_equal(&dest->vaddr.v6, &svc->addr.v6) &&
+ dest->vport == svc->port))) {
+ /* HIT */
+ return dest;
+ }
+
+ /*
+ * Try to purge the destination from trash if not referenced
+ */
+ if (atomic_read(&dest->refcnt) == 1) {
+ IP_VS_DBG(3, "Removing destination %u/" NIP6_FMT ":%u "
+ "from trash\n",
+ dest->vfwmark,
+ NIP6(dest->addr.v6), ntohs(dest->port));
+ list_del(&dest->n_list);
+ ip_vs_dst_reset(dest);
+ __ip_vs_unbind_svc(dest);
+ kfree(dest);
+ }
+ }
+
+ return NULL;
+}
+#endif
+
/*
* Clean up all the destinations in the trash
* Called by the ip_vs_control_cleanup()
*
- * When the ip_vs_control_clearup is activated by ipvs module exit,
+ * When the ip_vs_control_cleanup is activated by ipvs module exit,
* the service tables must have been flushed and all the connections
* are expired, and the refcnt of each destination in the trash must
* be 1, so we simply release them here.
@@ -831,9 +1027,19 @@ ip_vs_new_dest(struct ip_vs_service *svc, struct ip_vs_dest_user *udest,
EnterFunction(2);
- atype = inet_addr_type(&init_net, udest->addr);
- if (atype != RTN_LOCAL && atype != RTN_UNICAST)
- return -EINVAL;
+#ifdef CONFIG_IP_VS_IPV6
+ if (svc->af == AF_INET6) {
+ atype = ipv6_addr_type(&udest->addr.v6);
+ if (!(atype & IPV6_ADDR_UNICAST) &&
+ !__ip_vs_addr_is_local_v6(&udest->addr.v6))
+ return -EINVAL;
+ } else
+#endif
+ {
+ atype = inet_addr_type(&init_net, udest->addr.v4);
+ if (atype != RTN_LOCAL && atype != RTN_UNICAST)
+ return -EINVAL;
+ }
dest = kzalloc(sizeof(struct ip_vs_dest), GFP_ATOMIC);
if (dest == NULL) {
@@ -874,12 +1080,18 @@ static int
ip_vs_add_dest(struct ip_vs_service *svc, struct ip_vs_dest_user *udest)
{
struct ip_vs_dest *dest;
- __be32 daddr = udest->addr;
+ union ip_vs_addr_user daddr = udest->addr;
__be16 dport = udest->port;
int ret;
EnterFunction(2);
+ if (udest->af != svc->af) {
+ IP_VS_ERR("ip_vs_add_dest(): address families of service and "
+ "destination do not match\n");
+ return -EFAULT;
+ }
+
if (udest->weight < 0) {
IP_VS_ERR("ip_vs_add_dest(): server weight less than zero\n");
return -ERANGE;
@@ -894,7 +1106,14 @@ ip_vs_add_dest(struct ip_vs_service *svc, struct ip_vs_dest_user *udest)
/*
* Check if the dest already exists in the list
*/
- dest = ip_vs_lookup_dest(svc, daddr, dport);
+#ifdef CONFIG_IP_VS_IPV6
+ dest = (svc->af == AF_INET)
+ ? ip_vs_lookup_dest(svc, daddr.v4, dport)
+ : ip_vs_lookup_dest_v6(svc, &daddr.v6, dport);
+#else
+ dest = ip_vs_lookup_dest(svc, daddr.v4, dport);
+#endif
+
if (dest != NULL) {
IP_VS_DBG(1, "ip_vs_add_dest(): dest already exists\n");
return -EEXIST;
@@ -904,7 +1123,14 @@ ip_vs_add_dest(struct ip_vs_service *svc, struct ip_vs_dest_user *udest)
* Check if the dest already exists in the trash and
* is from the same service
*/
- dest = ip_vs_trash_get_dest(svc, daddr, dport);
+#ifdef CONFIG_IP_VS_IPV6
+ dest = (svc->af == AF_INET)
+ ? ip_vs_trash_get_dest(svc, daddr.v4, dport)
+ : ip_vs_trash_get_dest_v6(svc, &daddr.v6, dport);
+#else
+ dest = ip_vs_trash_get_dest(svc, daddr.v4, dport);
+#endif
+
if (dest != NULL) {
IP_VS_DBG_V4(svc->af, 3, "Get destination %u.%u.%u.%u:%u from trash, "
"dest->refcnt=%d, service %u/%u.%u.%u.%u:%u\n",
@@ -989,11 +1215,17 @@ static int
ip_vs_edit_dest(struct ip_vs_service *svc, struct ip_vs_dest_user *udest)
{
struct ip_vs_dest *dest;
- __be32 daddr = udest->addr;
+ union ip_vs_addr_user daddr = udest->addr;
__be16 dport = udest->port;
EnterFunction(2);
+ if (udest->af != svc->af) {
+ IP_VS_ERR("ip_vs_edit_dest(): address families of service and "
+ "destination do not match\n");
+ return -EFAULT;
+ }
+
if (udest->weight < 0) {
IP_VS_ERR("ip_vs_edit_dest(): server weight less than zero\n");
return -ERANGE;
@@ -1008,7 +1240,14 @@ ip_vs_edit_dest(struct ip_vs_service *svc, struct ip_vs_dest_user *udest)
/*
* Lookup the destination list
*/
- dest = ip_vs_lookup_dest(svc, daddr, dport);
+#ifdef CONFIG_IP_VS_IPV6
+ dest = (svc->af == AF_INET)
+ ? ip_vs_lookup_dest(svc, daddr.v4, dport)
+ : ip_vs_lookup_dest_v6(svc, &daddr.v6, dport);
+#else
+ dest = ip_vs_lookup_dest(svc, daddr.v4, dport);
+#endif
+
if (dest == NULL) {
IP_VS_DBG(1, "ip_vs_edit_dest(): dest doesn't exist\n");
return -ENOENT;
@@ -1104,15 +1343,28 @@ static void __ip_vs_unlink_dest(struct ip_vs_service *svc,
* Delete a destination server in the given service
*/
static int
-ip_vs_del_dest(struct ip_vs_service *svc,struct ip_vs_dest_user *udest)
+ip_vs_del_dest(struct ip_vs_service *svc, struct ip_vs_dest_user *udest)
{
struct ip_vs_dest *dest;
- __be32 daddr = udest->addr;
+ union ip_vs_addr_user daddr = udest->addr;
__be16 dport = udest->port;
EnterFunction(2);
- dest = ip_vs_lookup_dest(svc, daddr, dport);
+ if (udest->af != svc->af) {
+ IP_VS_ERR("ip_vs_del_dest(): address families of service and "
+ "destination do not match\n");
+ return -EFAULT;
+ }
+
+#ifdef CONFIG_IP_VS_IPV6
+ dest = (svc->af == AF_INET)
+ ? ip_vs_lookup_dest(svc, daddr.v4, dport)
+ : ip_vs_lookup_dest_v6(svc, &daddr.v6, dport);
+#else
+ dest = ip_vs_lookup_dest(svc, daddr.v4, dport);
+#endif
+
if (dest == NULL) {
IP_VS_DBG(1, "ip_vs_del_dest(): destination not found!\n");
return -ENOENT;
@@ -1165,6 +1417,19 @@ ip_vs_add_service(struct ip_vs_service_user *u, struct ip_vs_service **svc_p)
goto out_mod_dec;
}
+#ifdef CONFIG_IP_VS_IPV6
+ if (u->af == AF_INET6) {
+ if (!sched->supports_ipv6) {
+ ret = -EAFNOSUPPORT;
+ goto out_err;
+ }
+ if ((u->netmask < 1) || (u->netmask > 128)) {
+ ret = -EINVAL;
+ goto out_err;
+ }
+ }
+#endif
+
svc = kzalloc(sizeof(struct ip_vs_service), GFP_ATOMIC);
if (svc == NULL) {
IP_VS_DBG(1, "ip_vs_add_service: kmalloc failed.\n");
@@ -1253,6 +1518,19 @@ ip_vs_edit_service(struct ip_vs_service *svc, struct ip_vs_service_user *u)
}
old_sched = sched;
+#ifdef CONFIG_IP_VS_IPV6
+ if (u->af == AF_INET6) {
+ if (!sched->supports_ipv6) {
+ ret = -EAFNOSUPPORT;
+ goto out;
+ }
+ if ((u->netmask < 1) || (u->netmask > 128)) {
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+#endif
+
write_lock_bh(&__ip_vs_svc_lock);
/*
@@ -1685,6 +1963,7 @@ static struct ctl_table vs_vars[] = {
struct ctl_path net_vs_ctl_path[] = {
{ .procname = "net", .ctl_name = CTL_NET, },
+ /* TODO IPv6: possible to move / duplicate this? */
{ .procname = "ipv4", .ctl_name = NET_IPV4, },
{ .procname = "vs", },
{ }
@@ -2031,12 +2310,33 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
if (cmd == IP_VS_SO_SET_ZERO) {
/* if no service address is set, zero counters in all */
- if (!usvc->fwmark && !usvc->addr && !usvc->port) {
+#ifdef CONFIG_IP_VS_IPV6
+ struct in6_addr zero_addr = { .s6_addr32 = {0, 0, 0, 0} };
+ if (usvc->af == AF_INET6 && !usvc->fwmark &&
+ ipv6_addr_equal(&usvc->addr.v6,&zero_addr) && !usvc->port) {
+ ret = ip_vs_zero_all();
+ goto out_unlock;
+ }
+#endif
+ if (!usvc->fwmark && !usvc->addr.v4 && !usvc->port) {
ret = ip_vs_zero_all();
goto out_unlock;
}
}
+ /* Check for valid address family */
+ if (usvc->af != AF_INET) {
+#ifdef CONFIG_IP_VS_IPV6
+ if (usvc->af != AF_INET6) {
+ ret = -EAFNOSUPPORT;
+ goto out_unlock;
+ }
+#else
+ ret = -EAFNOSUPPORT;
+ goto out_unlock;
+#endif
+ }
+
/* Check for valid protocol: TCP or UDP, even for fwmark!=0 */
if (usvc->protocol!=IPPROTO_TCP && usvc->protocol!=IPPROTO_UDP) {
IP_VS_ERR_V4(usvc->af, "set_ctl: invalid protocol: %d %d.%d.%d.%d:%d %s\n",
@@ -2053,8 +2353,14 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
/* Lookup the exact service by <protocol, addr, port> or fwmark */
if (usvc->fwmark == 0)
- svc = __ip_vs_service_get(usvc->protocol,
- usvc->addr, usvc->port);
+#ifdef CONFIG_IP_VS_IPV6
+ if (usvc->af == AF_INET6)
+ svc = __ip_vs_service_get_v6(usvc->protocol,
+ &usvc->addr.v6, usvc->port);
+ else
+#endif
+ svc = __ip_vs_service_get(usvc->protocol,
+ usvc->addr.v4, usvc->port);
else
svc = __ip_vs_svc_fwm_get(usvc->fwmark);
@@ -2183,9 +2489,17 @@ __ip_vs_get_dest_entries(const struct ip_vs_get_dests *get,
if (get->fwmark)
svc = __ip_vs_svc_fwm_get(get->fwmark);
+ else if (get->af == AF_INET6)
+#ifdef CONFIG_IP_VS_IPV6
+ svc = __ip_vs_service_get_v6(get->protocol,
+ &get->addr.v6, get->port);
+#else
+ return -EAFNOSUPPORT;
+#endif
else
svc = __ip_vs_service_get(get->protocol,
- get->addr, get->port);
+ get->addr.v4, get->port);
+
if (svc) {
int count = 0;
struct ip_vs_dest *dest;
@@ -2325,9 +2639,16 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
entry = (struct ip_vs_service_entry *)arg;
if (entry->fwmark)
svc = __ip_vs_svc_fwm_get(entry->fwmark);
+#ifdef CONFIG_IP_VS_IPV6
+ else if (entry->af == AF_INET6)
+ svc = __ip_vs_service_get_v6(entry->protocol,
+ &entry->addr.v6,
+ entry->port);
+#endif
else
svc = __ip_vs_service_get(entry->protocol,
- entry->addr, entry->port);
+ entry->addr.v4, entry->port);
+
if (svc) {
ip_vs_copy_service(entry, svc);
if (copy_to_user(user, entry, sizeof(*entry)) != 0)
--
1.5.3.6
^ permalink raw reply related [flat|nested] 76+ messages in thread
* [PATCH 25/26] IPVS: Add support for IPv6 entry output in procfs files.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (23 preceding siblings ...)
2008-06-11 17:12 ` [PATCH 24/26] IPVS: Add IPv6 support to userspace interface Julius R. Volz
@ 2008-06-11 17:12 ` Julius R. Volz
2008-06-11 17:12 ` [PATCH 26/26] IPVS: Add some blame/credits for IPv6 version Julius R. Volz
2008-06-11 17:23 ` [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Patrick McHardy
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:12 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam
From: Vince Busam <vbusam@google.com>
Add support for procfs output of IPv6 service and connection entries.
Signed-off-by: Vince Busam <vbusam@google.com>
1 files changed, 36 insertions(+), 14 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 388278a..c6b737c 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -2104,15 +2104,26 @@ static int ip_vs_info_seq_show(struct seq_file *seq, void *v)
const struct ip_vs_iter *iter = seq->private;
const struct ip_vs_dest *dest;
- if (iter->table == ip_vs_svc_table)
- seq_printf(seq, "%s %08X:%04X %s ",
- ip_vs_proto_name(svc->protocol),
- ntohl(svc->addr),
- ntohs(svc->port),
- svc->scheduler->name);
- else
+ if (iter->table == ip_vs_svc_table) {
+ if (svc->af == AF_INET) {
+ seq_printf(seq, "%s %08X:%04X %s ",
+ ip_vs_proto_name(svc->protocol),
+ ntohl(svc->addr.v4),
+ ntohs(svc->port),
+ svc->scheduler->name);
+ } else if (svc->af == AF_INET6) {
+#ifdef CONFIG_IP_VS_IPV6
+ seq_printf(seq, "%s [" NIP6_FMT "]:%04X %s ",
+ ip_vs_proto_name(svc->protocol),
+ NIP6(svc->addr.v6),
+ ntohs(svc->port),
+ svc->scheduler->name);
+#endif
+ }
+ } else {
seq_printf(seq, "FWM %08X %s ",
svc->fwmark, svc->scheduler->name);
+ }
if (svc->flags & IP_VS_SVC_F_PERSISTENT)
seq_printf(seq, "persistent %d %08X\n",
@@ -2122,13 +2133,24 @@ static int ip_vs_info_seq_show(struct seq_file *seq, void *v)
seq_putc(seq, '\n');
list_for_each_entry(dest, &svc->destinations, n_list) {
- seq_printf(seq,
- " -> %08X:%04X %-7s %-6d %-10d %-10d\n",
- ntohl(dest->addr), ntohs(dest->port),
- ip_vs_fwd_name(atomic_read(&dest->conn_flags)),
- atomic_read(&dest->weight),
- atomic_read(&dest->activeconns),
- atomic_read(&dest->inactconns));
+ if (dest->af == AF_INET)
+ seq_printf(seq,
+ " -> %08X:%04X %-7s %-6d %-10d %-10d\n",
+ ntohl(dest->addr.v4), ntohs(dest->port),
+ ip_vs_fwd_name(atomic_read(&dest->conn_flags)),
+ atomic_read(&dest->weight),
+ atomic_read(&dest->activeconns),
+ atomic_read(&dest->inactconns));
+#ifdef CONFIG_IP_VS_IPV6
+ else if (dest->af == AF_INET6)
+ seq_printf(seq,
+ " -> [" NIP6_FMT "]:%04X %-7s %-6d %-10d %-10d\n",
+ NIP6(dest->addr.v6), ntohs(dest->port),
+ ip_vs_fwd_name(atomic_read(&dest->conn_flags)),
+ atomic_read(&dest->weight),
+ atomic_read(&dest->activeconns),
+ atomic_read(&dest->inactconns));
+#endif
}
}
return 0;
--
1.5.3.6
^ permalink raw reply related [flat|nested] 76+ messages in thread
* [PATCH 26/26] IPVS: Add some blame/credits for IPv6 version.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (24 preceding siblings ...)
2008-06-11 17:12 ` [PATCH 25/26] IPVS: Add support for IPv6 entry output in procfs files Julius R. Volz
@ 2008-06-11 17:12 ` Julius R. Volz
2008-06-11 17:23 ` [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Patrick McHardy
26 siblings, 0 replies; 76+ messages in thread
From: Julius R. Volz @ 2008-06-11 17:12 UTC (permalink / raw)
To: lvs-devel, netdev; +Cc: horms, davem, vbusam, Julius R. Volz
Add some blame for first IPv6 version of IPVS.
Signed-off-by: Julius R. Volz <juliusv@google.com>
7 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 6b031a8..8bb4ee8 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -21,6 +21,7 @@
* and others. Many code here is taken from IP MASQ code of kernel 2.2.
*
* Changes:
+ * Julius Volz add first IPv6 support
*
*/
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index ded862b..e1469d0 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -23,6 +23,7 @@
* Changes:
* Paul `Rusty' Russell properly handle non-linear skbs
* Harald Welte don't use nfcache
+ * Julius Volz add first IPv6 support
*
*/
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index c6b737c..06e5f74 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -17,6 +17,7 @@
* 2 of the License, or (at your option) any later version.
*
* Changes:
+ * Julius Volz, Vince Busam add first IPv6 support
*
*/
diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c
index 8b82400..7766426 100644
--- a/net/netfilter/ipvs/ip_vs_proto.c
+++ b/net/netfilter/ipvs/ip_vs_proto.c
@@ -12,6 +12,7 @@
* 2 of the License, or (at your option) any later version.
*
* Changes:
+ * Julius Volz add first IPv6 support
*
*/
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index 02bf859..c25d531 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -12,6 +12,7 @@
* 2 of the License, or (at your option) any later version.
*
* Changes:
+ * Julius Volz add first IPv6 support
*
*/
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index ef0d921..c56bade 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -12,6 +12,7 @@
* 2 of the License, or (at your option) any later version.
*
* Changes:
+ * Julius Volz add first IPv6 support
*
*/
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 9d2c424..8fd8f9f 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -12,6 +12,7 @@
* 2 of the License, or (at your option) any later version.
*
* Changes:
+ * Julius Volz add first IPv6 support
*
*/
--
1.5.3.6
^ permalink raw reply related [flat|nested] 76+ messages in thread
* Re: [PATCH 02/26] IPVS: Change IPVS data structures to support IPv6 addresses.
2008-06-11 17:11 ` [PATCH 02/26] IPVS: Change IPVS data structures to support IPv6 addresses Julius R. Volz
@ 2008-06-11 17:12 ` Patrick McHardy
[not found] ` <f4845fc0806111041u2a9a197fseefe300ffbbda3c3@mail.gmail.com>
2008-06-12 1:54 ` Brian Haley
1 sibling, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-11 17:12 UTC (permalink / raw)
To: Julius R. Volz; +Cc: lvs-devel, netdev, horms, davem, vbusam
Julius R. Volz wrote:
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index 9a51eba..b7b181e 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -11,7 +11,12 @@
>
> #include <linux/sysctl.h> /* For ctl_path */
>
> -#define IP_VS_VERSION_CODE 0x010201
> +#ifdef __KERNEL__
> +#include <linux/in6.h> /* For struct in6_addr */
> +#include <linux/ipv6.h> /* For struct ipv6hdr */
>
include/net is always kernel only. At least it should be that way.
> +#endif /* __KERNEL */
> +
> +#define IP_VS_VERSION_CODE 0x020000
> #define NVERSION(version) \
> (version >> 16) & 0xFF, \
> (version >> 8) & 0xFF, \
> @@ -95,6 +100,20 @@
> #define IP_VS_SCHEDNAME_MAXLEN 16
> #define IP_VS_IFNAME_MAXLEN 16
>
> +union ip_vs_addr_user {
> + __be32 v4;
> + struct in6_addr v6;
> +};
>
Can't you use nf_inet_addr for this?
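For illustration only, here is a rough userspace sketch of what a shared address union modeled on the kernel's nf_inet_addr could look like. The names sketch_inet_addr, sketch_addr_equal and sketch_addr_parse are made up for this example and are not part of the patch or of netfilter; the point is just that one type can back both the v4 and v6 code paths.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical userspace stand-in modeled on the kernel's union nf_inet_addr. */
union sketch_inet_addr {
	uint32_t	all[4];
	uint32_t	ip;	/* IPv4 address, network byte order */
	uint32_t	ip6[4];	/* IPv6 address, network byte order */
	struct in_addr	in;
	struct in6_addr	in6;
};

/* Compare two addresses of the same family; returns 1 if they are equal. */
int sketch_addr_equal(int af, const union sketch_inet_addr *a,
		      const union sketch_inet_addr *b)
{
	assert(af == AF_INET || af == AF_INET6);
	if (af == AF_INET)
		return a->ip == b->ip;
	return memcmp(&a->in6, &b->in6, sizeof(a->in6)) == 0;
}

/* Parse a textual address into the union; returns 1 on success. */
int sketch_addr_parse(int af, const char *text, union sketch_inet_addr *out)
{
	memset(out, 0, sizeof(*out));
	return inet_pton(af, text, out) == 1;
}
```

With something like this, the service and dest structs would carry one address member plus the af field, and most lookups would only differ in the comparison helper they call.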
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 05/26] IPVS: Use new address family specific debugging macros.
2008-06-11 17:11 ` [PATCH 05/26] IPVS: Use new " Julius R. Volz
@ 2008-06-11 17:14 ` Patrick McHardy
0 siblings, 0 replies; 76+ messages in thread
From: Patrick McHardy @ 2008-06-11 17:14 UTC (permalink / raw)
To: Julius R. Volz; +Cc: lvs-devel, netdev, horms, davem, vbusam
Julius R. Volz wrote:
> Change debug output to use address family specific debugging macros where
> appropriate.
>
> - IP_VS_ERR("request control DEL for uncontrolled: "
> - "%d.%d.%d.%d:%d to %d.%d.%d.%d:%d\n",
> - NIPQUAD(cp->caddr),ntohs(cp->cport),
> - NIPQUAD(cp->vaddr),ntohs(cp->vport));
> + IP_VS_ERR_V4(cp->af, "request control DEL for uncontrolled: "
> + "%d.%d.%d.%d:%d to %d.%d.%d.%d:%d\n",
> + NIPQUAD(cp->caddr.v4),ntohs(cp->cport),
> + NIPQUAD(cp->vaddr.v4),ntohs(cp->vport));
> +
> + IP_VS_ERR_V6(cp->af, "request control DEL for uncontrolled: "
> + NIP6_FMT ":%d to " NIP6_FMT ":%d\n",
> + NIP6(cp->caddr.v6),ntohs(cp->cport),
> + NIP6(cp->vaddr.v6),ntohs(cp->vport));
> +
>
This would look a lot cleaner if you'd use a debugging macro that
can take both families.
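As a rough idea of what is meant, the following userspace sketch formats an endpoint for either family so that a single debug call can serve both. sketch_fmt_endpoint is a hypothetical name, and it uses inet_ntop() rather than the kernel's NIPQUAD/NIP6 macros, so take it as an outline of the shape and not as kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: format "addr:port" for either family into buf.
 * IPv6 addresses are bracketed so the port separator stays unambiguous.
 * Returns buf, or NULL for an unsupported family or a too-small buffer.
 */
const char *sketch_fmt_endpoint(int af, const void *addr, uint16_t port_be,
				char *buf, size_t len)
{
	char ip[INET6_ADDRSTRLEN];
	int n;

	assert(addr != NULL && buf != NULL);
	if (af != AF_INET && af != AF_INET6)
		return NULL;
	if (inet_ntop(af, addr, ip, sizeof(ip)) == NULL)
		return NULL;
	n = snprintf(buf, len, af == AF_INET6 ? "[%s]:%u" : "%s:%u",
		     ip, (unsigned)ntohs(port_be));
	return (n < 0 || (size_t)n >= len) ? NULL : buf;
}
```

A caller would then format caddr/cport and vaddr/vport into two buffers and pass them to one error macro with "%s to %s", instead of duplicating the whole statement per family.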
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 12/26] IPVS: Extend proto handler debug functions to handle IPv6.
2008-06-11 17:11 ` [PATCH 12/26] IPVS: Extend proto handler debug functions to handle IPv6 Julius R. Volz
@ 2008-06-11 17:17 ` Patrick McHardy
0 siblings, 0 replies; 76+ messages in thread
From: Patrick McHardy @ 2008-06-11 17:17 UTC (permalink / raw)
To: Julius R. Volz; +Cc: lvs-devel, netdev, horms, davem, vbusam
Julius R. Volz wrote:
> Extend protocol handler packet debug functions for TCP, UDP, AH and ESP to
> handle IPv6. Make the main debug function call either a v4 or v6 version,
> depending on the packet protocol version.
>
The only difference appears to be the address format and header
offset. You could save some duplication by using an af-independent
address debugging function and add a helper to get the proto offset.
Similar to how x_tables and nf_conntrack do it.
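To make the offset idea concrete, here is a small userspace sketch operating on raw packet bytes. Both function names are hypothetical, and it assumes IPv6 packets without extension headers (a real helper would have to walk them); it only shows that once the offset is known, the transport-level code no longer cares about the family.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: byte offset of the transport header in a raw IP packet.
 * Assumption: IPv6 packets carry no extension headers, so the offset is the
 * fixed 40-byte header. Returns -1 for an unknown family or a short packet.
 */
int sketch_proto_offset(int af, const uint8_t *pkt, size_t len)
{
	if (af == AF_INET) {
		size_t ihl;

		if (len < 20)
			return -1;
		ihl = (size_t)(pkt[0] & 0x0f) * 4;	/* IHL is in 32-bit words */
		if (ihl < 20 || ihl > len)
			return -1;
		return (int)ihl;
	}
	if (af == AF_INET6)
		return len >= 40 ? 40 : -1;
	return -1;
}

/* Family-independent: read the 16-bit source port found at the given offset. */
int sketch_src_port(const uint8_t *pkt, size_t len, int off)
{
	assert(pkt != NULL);
	if (off < 0 || (size_t)off + 2 > len)
		return -1;
	return (pkt[off] << 8) | pkt[off + 1];
}
```

The TCP and UDP debug functions could then share one body that takes the offset from the caller.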
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 10/26] IPVS: Add IPv6 handler functions to UDP protocol handler.
2008-06-11 17:11 ` [PATCH 10/26] IPVS: Add IPv6 handler functions to UDP " Julius R. Volz
@ 2008-06-11 17:18 ` Patrick McHardy
0 siblings, 0 replies; 76+ messages in thread
From: Patrick McHardy @ 2008-06-11 17:18 UTC (permalink / raw)
To: Julius R. Volz; +Cc: lvs-devel, netdev, horms, davem, vbusam
Julius R. Volz wrote:
> Define new IPv6-specific handler functions in UDP protocol handler. Set new
> function pointers in ip_vs_protocol struct to point to these functions.
The last four or five patches could save a lot of duplication
by taking advantage of the fact that only address length and
protocol offset differ. Why don't you move the af-specific logic
to the layer above these handlers and pass the offset to them?
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
` (25 preceding siblings ...)
2008-06-11 17:12 ` [PATCH 26/26] IPVS: Add some blame/credits for IPv6 version Julius R. Volz
@ 2008-06-11 17:23 ` Patrick McHardy
2008-06-11 18:23 ` Julius Volz
26 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-11 17:23 UTC (permalink / raw)
To: Julius R. Volz; +Cc: lvs-devel, netdev, horms, davem, vbusam
Julius R. Volz wrote:
> Hi,
>
> This patch series adds first experimental IPv6 support to IPVS. I have
> already posted it to the LVS mailing list as one huge patch a while ago,
> so here's the split-up version, although it is still very big. I don't see
> an easy way of breaking up the series into truly independent chunks though,
> since most of it seems very interdependent. I'm still a kernel newbie, so
> any advice is welcome :)
>
I briefly looked over the patches I didn't comment on.
I think there's too much duplication everywhere, a lot of
them look like they could avoid almost all duplication by
handling differences at a higher layer or simply sharing
the code (like hashing).
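For the hashing case, sharing could look roughly like the userspace sketch below: fold the address down to 32 bits depending on the family, then apply the same formula the v4 hash already uses. sketch_svc_hashkey and the SKETCH_TAB_* constants are invented for this example, and the table size of 8 bits is an assumption standing in for IP_VS_SVC_TAB_BITS.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Assumed table size for the sketch; stands in for IP_VS_SVC_TAB_BITS. */
#define SKETCH_TAB_BITS 8
#define SKETCH_TAB_MASK ((1u << SKETCH_TAB_BITS) - 1)

/*
 * Hypothetical shared hash key: fold the address to 32 bits according to the
 * family, then use one formula for both. 'words' points to one (IPv4) or
 * four (IPv6) 32-bit words in network byte order.
 */
unsigned sketch_svc_hashkey(int af, unsigned proto, const uint32_t *words,
			    uint16_t port_be)
{
	unsigned porth = ntohs(port_be);
	uint32_t fold;

	assert(af == AF_INET || af == AF_INET6);
	fold = words[0];
	if (af == AF_INET6)
		fold ^= words[1] ^ words[2] ^ words[3];

	return (proto ^ ntohl(fold) ^ (porth >> SKETCH_TAB_BITS) ^ porth)
		& SKETCH_TAB_MASK;
}
```

Whether an XOR fold spreads IPv6 addresses well enough is a separate question (the TODO in the patch asks the same); the sketch only shows how the two hashkey functions could collapse into one.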
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 02/26] IPVS: Change IPVS data structures to support IPv6 addresses.
[not found] ` <485010E9.6000506@trash.net>
@ 2008-06-11 18:08 ` Julius Volz
0 siblings, 0 replies; 76+ messages in thread
From: Julius Volz @ 2008-06-11 18:08 UTC (permalink / raw)
To: Patrick McHardy; +Cc: lvs-devel, netdev
On Wed, Jun 11, 2008 at 7:52 PM, Patrick McHardy <kaber@trash.net> wrote:
> Julius Volz wrote:
>>
>> On Wed, Jun 11, 2008 at 7:12 PM, Patrick McHardy <kaber@trash.net> wrote:
>>
>>>
>>> include/net is always kernel only. At least it should be that way.
>>>
>>
>> ipvsadm has:
>>
>> INCLUDE = -I/usr/src/linux/include -I.. -I.
>>
>> in its Makefile and then includes net/ip_vs.h. Even in the current version
>> :(
>>
>
> OK, thanks for the explanation.
>
>>>> #define IP_VS_IFNAME_MAXLEN 16
>>>> +union ip_vs_addr_user {
>>>> + __be32 v4;
>>>> + struct in6_addr v6;
>>>> +};
>>>>
>>>>
>>>
>>> Can't you use nf_inet_addr for this?
>>>
>>
>> Ah, yes. I either didn't know about this or wasn't sure if it was
>> appropriate to reuse that struct from netfilter. I will look into
>> putting this in!
>>
>>
>
> Thanks. But please remember to keep netdev CCed on responses
> to review.
Yes, sorry! Re-added the lists for this response.
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 17:23 ` [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Patrick McHardy
@ 2008-06-11 18:23 ` Julius Volz
2008-06-11 18:42 ` Patrick McHardy
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-11 18:23 UTC (permalink / raw)
To: Patrick McHardy; +Cc: lvs-devel, netdev
On Wed, Jun 11, 2008 at 7:23 PM, Patrick McHardy <kaber@trash.net> wrote:
> I briefly looked over the patches I didn't comment on.
> I think there's too much duplication everywhere, a lot of
> them look like they could avoid almost all duplication by
> handling differences at a higher layer or simply sharing
> the code (like hashing).
Yes, the duplication is high unfortunately. I must admit that I didn't
feel secure enough to restructure all the existing code without
breaking it, so I copied lots of functions and modified them for IPv6.
My main goal was to keep all the old v4 stuff working first and then
remove the duplication later (or hope for smarter people).
So I obviously don't expect this to be ready for inclusion, but I will
have a lot of time to work on it (I'm doing it as an intern project)
and learn as long as I get good feedback like yours on what to
improve.
Another question I was unsure about: is the breaking of the
userspace-to-kernel interface even acceptable at all? I think the code
would get ugly (and have even more duplication) if you wanted to keep
the backwards compatibility. And you have to compile ipvsadm for your
kernel version anyways.
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 18:23 ` Julius Volz
@ 2008-06-11 18:42 ` Patrick McHardy
2008-06-11 19:05 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-11 18:42 UTC (permalink / raw)
To: Julius Volz; +Cc: lvs-devel, netdev
Julius Volz wrote:
> On Wed, Jun 11, 2008 at 7:23 PM, Patrick McHardy <kaber@trash.net> wrote:
>> I briefly looked over the patches I didn't comment on.
>> I think there's too much duplication everywhere, a lot of
>> them look like they could avoid almost all duplication by
>> handling differences at a higher layer or simply sharing
>> the code (like hashing).
>
> Yes, the duplication is high unfortunately. I must admit that I didn't
> feel secure enough to restructure all the existing code without
> breaking it, so I copied lots of functions and modified them for IPv6.
> My main goal was to keep all the old v4 stuff working first and then
> remove the duplication later (or hope for smarter people).
>
> So I obviously don't expect this to be ready for inclusion, but I will
> have a lot of time to work on it (I'm doing it as an intern project)
> and learn as long as I get good feedback like yours on what to
> improve.
Great.
> Another question I was unsure about: is the breaking of the
> userspace-to-kernel interface even acceptable at all? I think the code
> would get ugly (and have even more duplication) if you wanted to keep
> the backwards compatibility. And you have to compile ipvsadm for your
> kernel version anyways.
Usually it's not acceptable. Why do you have to compile ipvsadm
for specific kernel versions?
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 18:42 ` Patrick McHardy
@ 2008-06-11 19:05 ` Julius Volz
2008-06-11 19:10 ` Patrick McHardy
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-11 19:05 UTC (permalink / raw)
To: Patrick McHardy; +Cc: lvs-devel, netdev
On Wed, Jun 11, 2008 at 8:42 PM, Patrick McHardy <kaber@trash.net> wrote:
> Julius Volz wrote:
>> Another question I was unsure about: is the breaking of the
>> userspace-to-kernel interface even acceptable at all? I think the code
>> would get ugly (and have even more duplication) if you wanted to keep
>> the backwards compatibility. And you have to compile ipvsadm for your
>> kernel version anyways.
>
> Usually it's not acceptable. Why do you have to compile ipvsadm
> for specific kernel versions?
ipvsadm uses get/set-sockopts on a raw socket to pass commands and
structs (as defined in include/net/ip_vs.h) to the kernel. So the
passed structs have to match exactly between userspace and kernel. The
kernel ip_vs.h also includes a version number that is used to verify
that ipvsadm matches your kernel version.
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 19:05 ` Julius Volz
@ 2008-06-11 19:10 ` Patrick McHardy
2008-06-11 19:29 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-11 19:10 UTC (permalink / raw)
To: Julius Volz; +Cc: lvs-devel, netdev
Julius Volz wrote:
> On Wed, Jun 11, 2008 at 8:42 PM, Patrick McHardy <kaber@trash.net> wrote:
>> Julius Volz wrote:
>>> Another question I was unsure about: is the breaking of the
>>> userspace-to-kernel interface even acceptable at all? I think the code
>>> would get ugly (and have even more duplication) if you wanted to keep
>>> the backwards compatibility. And you have to compile ipvsadm for your
>>> kernel version anyways.
>> Usually it's not acceptable. Why do you have to compile ipvsadm
>> for specific kernel versions?
>
> ipvsadm uses get/set-sockopts on a raw socket to pass commands and
> structs (as defined in include/net/ip_vs.h) to the kernel. So the
> passed structs have to match exactly between userspace and kernel. The
> kernel ip_vs.h also includes a version number that is used to verify
> that ipvsadm matches your kernel version.
So they define an ABI, which means they must not be changed in
incompatible ways. The question is whether they are actually
changed in incompatible ways.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 19:10 ` Patrick McHardy
@ 2008-06-11 19:29 ` Julius Volz
2008-06-11 19:31 ` Patrick McHardy
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-11 19:29 UTC (permalink / raw)
To: Patrick McHardy; +Cc: lvs-devel, netdev
On Wed, Jun 11, 2008 at 9:10 PM, Patrick McHardy <kaber@trash.net> wrote:
> Julius Volz wrote:
>> ipvsadm uses get/set-sockopts on a raw socket to pass commands and
>> structs (as defined in include/net/ip_vs.h) to the kernel. So the
>> passed structs have to match exactly between userspace and kernel. The
>> kernel ip_vs.h also includes a version number that is used to verify
>> that ipvsadm matches your kernel version.
>
> So they define an ABI, which means they must not be changed in
> incompatible ways. The question is whether they are actually
> changed in incompatible ways.
It is clearly laid out to be able to be changed over time, hence the
ipvsadm version check... I think this whole interface is quite an
exception though.
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 19:29 ` Julius Volz
@ 2008-06-11 19:31 ` Patrick McHardy
2008-06-11 19:53 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-11 19:31 UTC (permalink / raw)
To: Julius Volz; +Cc: lvs-devel, netdev
Julius Volz wrote:
> On Wed, Jun 11, 2008 at 9:10 PM, Patrick McHardy <kaber@trash.net> wrote:
>
>> Julius Volz wrote:
>>
>>> ipvsadm uses get/set-sockopts on a raw socket to pass commands and
>>> structs (as defined in include/net/ip_vs.h) to the kernel. So the
>>> passed structs have to match exactly between userspace and kernel. The
>>> kernel ip_vs.h also includes a version number that is used to verify
>>> that ipvsadm matches your kernel version.
>>>
>> So they define an ABI, which means they must not be changed in
>> incompatible ways. The question is whether they are actually
>> changed in incompatible ways.
>>
>
> It is clearly laid out to be able to be changed over time, hence the
> ipvsadm version check...
The usual way is to add new members at the end. The history shows no
changes at all to these structs though.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 19:31 ` Patrick McHardy
@ 2008-06-11 19:53 ` Julius Volz
2008-06-11 20:14 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-11 19:53 UTC (permalink / raw)
To: Patrick McHardy; +Cc: lvs-devel, netdev, Vince Busam
On Wed, Jun 11, 2008 at 9:31 PM, Patrick McHardy <kaber@trash.net> wrote:
> Julius Volz wrote:
>> It is clearly laid out to be able to be changed over time, hence the
>> ipvsadm version check...
>
> The usual way is to add new members at the end. The history shows no
> changes at all to these structs though.
Not in git history at least. No version number change there either
(maybe the LVS folks know more about the history).
Adding members at the end sounds interesting! The current structs have
the v4 address members in the middle of the struct, so you couldn't
use a union of v4 and v6 anymore, but since it's only in the userspace
interface, that shouldn't matter much. If this allows us to keep the
interface backwards compatible in an easy way, that would be great.
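To make the "add new members at the end" idea concrete, here is a minimal userspace sketch. The struct and field names (svc_user_v0, svc_user_v1, addr6, and so on) are hypothetical and are not the real definitions from ip_vs.h; the only point is that every old member keeps its offset, so old code can still read the leading part of the new struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "old" layout; NOT the real ip_vs.h struct. */
struct svc_user_v0 {
	uint16_t protocol;
	uint16_t port;
	uint32_t addr;		/* IPv4 address sits in the middle */
	uint32_t fwmark;
	uint32_t flags;
};

/* Hypothetical "new" layout: old members untouched, new ones appended. */
struct svc_user_v1 {
	uint16_t protocol;
	uint16_t port;
	uint32_t addr;
	uint32_t fwmark;
	uint32_t flags;
	/* appended members start here */
	uint16_t af;		/* address family of the entry */
	uint16_t pad;		/* explicit padding, kept zero */
	uint8_t  addr6[16];	/* separate IPv6 address, no v4/v6 union */
};

/* Compile-time checks that the old prefix did not move. */
_Static_assert(offsetof(struct svc_user_v1, addr) ==
	       offsetof(struct svc_user_v0, addr),
	       "old members must keep their offsets");
_Static_assert(offsetof(struct svc_user_v1, flags) ==
	       offsetof(struct svc_user_v0, flags),
	       "old members must keep their offsets");
_Static_assert(offsetof(struct svc_user_v1, af) ==
	       sizeof(struct svc_user_v0),
	       "new members must start where the old struct ended");

/* An old reader only looks at the leading sizeof(v0) bytes. */
static void old_view_of_new(const struct svc_user_v1 *n,
			    struct svc_user_v0 *o)
{
	memcpy(o, n, sizeof(*o));
}
```

The cost, as noted above, is that the IPv4 and IPv6 addresses can no longer share one union inside the userspace struct.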
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 19:53 ` Julius Volz
@ 2008-06-11 20:14 ` Julius Volz
2008-06-11 20:55 ` Vince Busam
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-11 20:14 UTC (permalink / raw)
To: Patrick McHardy; +Cc: lvs-devel, netdev, Vince Busam
On Wed, Jun 11, 2008 at 9:53 PM, Julius Volz <juliusv@google.com> wrote:
> Adding members at the end sounds interesting! The current structs have
> the v4 address members in the middle of the struct, so you couldn't
> use a union of v4 and v6 anymore, but since it's only in the userspace
> interface, that shouldn't matter much. If this allows us to keep the
> interface backwards compatible in an easy way, that would be great.
Ah, but the set/get-sockopt calls also pass a size argument, which is
the size of the passed structs. If the kernel and userspace struct
sizes don't match, it is treated as an error. Is this in case
different compilers pad the structs differently, even if the IPVS
version stays the same?
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 20:14 ` Julius Volz
@ 2008-06-11 20:55 ` Vince Busam
2008-06-11 21:30 ` Ben Greear
0 siblings, 1 reply; 76+ messages in thread
From: Vince Busam @ 2008-06-11 20:55 UTC (permalink / raw)
To: Julius Volz; +Cc: Patrick McHardy, lvs-devel, netdev
Julius Volz wrote:
> Ah, but the set/get-sockopt calls also pass a size argument, which is
> the size of the passed structs. If the kernel and userspace struct
> sizes don't match, it is treated as an error. Is this in case
> different compilers pad the structs differently, even if the IPVS
> version stays the same?
So we could disable the size checks of the passed structs, or key on it to
determine if the older ABI was used, keeping a list of the structs that
had different sizes around, but that sounds like a gross hack which would
get worse if any other fields are added. It would also mean new userspace
binaries with the new fields wouldn't work with older kernels, is that a
problem? Is this better than the alternatives of breaking the ABI, or
duplicating code into a separate ABI?
Vince
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 20:55 ` Vince Busam
@ 2008-06-11 21:30 ` Ben Greear
2008-06-11 22:26 ` Vince Busam
0 siblings, 1 reply; 76+ messages in thread
From: Ben Greear @ 2008-06-11 21:30 UTC (permalink / raw)
To: Vince Busam; +Cc: Julius Volz, Patrick McHardy, lvs-devel, netdev
Vince Busam wrote:
> Julius Volz wrote:
>> Ah, but the set/get-sockopt calls also pass a size argument, which is
>> the size of the passed structs. If the kernel and userspace struct
>> sizes don't match, it is treated as an error. Is this in case
>> different compilers pad the structs differently, even if the IPVS
>> version stays the same?
>
> So we could disable the size checks of the passed structs, or key on it
> to determine if the older ABI was used, keeping a list of the structs
> that had different sizes around, but that sounds like a gross hack which
> would get worse if any other fields are added. It would also mean new
> userspace binaries with the new fields wouldn't work with older kernels,
> is that a problem? Is this better than the alternatives of breaking the
> ABI, or duplicating code into a separate ABI?
You can have the kernel ignore any data it doesn't understand (ie, if struct is 24 bytes,
but the kernel expects 20 bytes, just ignore the last 4). This way it should
work with newer binaries.
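A rough userspace sketch of the difference between an exact length check and the tolerant handling suggested here. The struct cmd_known and both function names are made up for illustration and do not correspond to the actual IPVS sockopt code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 20-byte command struct as the receiver knows it. */
struct cmd_known {
	uint32_t a, b, c, d, e;
};

/* Exact match: any size difference is treated as an error. */
static int read_cmd_strict(const void *buf, size_t len,
			   struct cmd_known *out)
{
	if (len != sizeof(*out))
		return -1;
	memcpy(out, buf, sizeof(*out));
	return 0;
}

/*
 * Tolerant variant: accept anything at least as long as the known
 * struct and ignore trailing bytes the receiver does not understand.
 */
static int read_cmd_tolerant(const void *buf, size_t len,
			     struct cmd_known *out)
{
	if (len < sizeof(*out))
		return -1;
	memcpy(out, buf, sizeof(*out));
	return 0;
}
```

With a 24-byte buffer from a newer sender, the strict reader rejects the call while the tolerant one reads the first 20 bytes and drops the last 4. Input shorter than the known struct is rejected by both.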
Thanks,
Ben
>
> Vince
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 21:30 ` Ben Greear
@ 2008-06-11 22:26 ` Vince Busam
2008-06-12 1:45 ` Simon Horman
0 siblings, 1 reply; 76+ messages in thread
From: Vince Busam @ 2008-06-11 22:26 UTC (permalink / raw)
To: Ben Greear; +Cc: Julius Volz, Patrick McHardy, lvs-devel, netdev
Ben Greear wrote:
> You can have the kernel ignore any data it doesn't understand (ie, if
> struct is 24 bytes,
> but the kernel expects 20 bytes, just ignore the last 4). This way it
> should
> work with newer binaries.
Currently, the IPVS code specifically checks that length, so all kernels
up to now won't play well with any changes to the structs.
Vince
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-11 22:26 ` Vince Busam
@ 2008-06-12 1:45 ` Simon Horman
2008-06-12 13:31 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Simon Horman @ 2008-06-12 1:45 UTC (permalink / raw)
To: Vince Busam; +Cc: Ben Greear, Julius Volz, Patrick McHardy, lvs-devel, netdev
On Wed, Jun 11, 2008 at 03:26:06PM -0700, Vince Busam wrote:
> Ben Greear wrote:
>> You can have the kernel ignore any data it doesn't understand (ie, if
>> struct is 24 bytes,
>> but the kernel expects 20 bytes, just ignore the last 4). This way it
>> should
>> work with newer binaries.
>
> Currently, the IPVS code specifically checks that length, so all kernels
> up to now won't play well with any changes to the structs.
Adding new features to IPVS that require ipvsadm to be extended
has always been problematic due to the set/getsockopt interface
that is used.
A long time ago, before this code was merged into the kernel, the
interface changed quite a lot and this was painful. There was an
assumption that ipvsadm and kernel versions needed to match,
and the version checking code was added basically to stop people
shooting themselves in the foot. It was quite successful at that.
Eventually the changes settled down, and for the past few years they
have been very infrequent. But the problem that the interface isn't
really extendable and that when changes are made kernel and ipvsadm
versions need to be incremented together remains. For instance, the
Debian package of ipvsadm actually ships three different ipvsadm
binaries, and a wrapper works out which one to use based on the kernel
version.
I wonder if now would be a good time to bite the bullet and design
a new interface that is extendable.
--
Horms
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 02/26] IPVS: Change IPVS data structures to support IPv6 addresses.
2008-06-11 17:11 ` [PATCH 02/26] IPVS: Change IPVS data structures to support IPv6 addresses Julius R. Volz
2008-06-11 17:12 ` Patrick McHardy
@ 2008-06-12 1:54 ` Brian Haley
2008-06-12 9:47 ` Julius Volz
1 sibling, 1 reply; 76+ messages in thread
From: Brian Haley @ 2008-06-12 1:54 UTC (permalink / raw)
To: Julius R. Volz; +Cc: lvs-devel, netdev, horms, davem, vbusam
Julius R. Volz wrote:
> +union ip_vs_addr_user {
> + __be32 v4;
> + struct in6_addr v6;
> +};
> +
> +#ifdef CONFIG_IP_VS_IPV6
> +#define ip_vs_addr ip_vs_addr_user
> +#define ip_vs_copy_addr(a, b) do { (a) = (b); } while (0)
> +#else
> +union ip_vs_addr {
> + __be32 v4;
> +};
> +#define ip_vs_copy_addr(a, b) do { (a).v4 = (b).v4; } while (0)
> +#endif
You need to use ipv6_addr_copy() with IPv6 addresses. Some of your
other patches have this same problem, I found some of them...
-Brian
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 16/26] IPVS: Add IPv6 xmit forwarding functions.
2008-06-11 17:11 ` [PATCH 16/26] IPVS: Add IPv6 xmit forwarding functions Julius R. Volz
@ 2008-06-12 1:55 ` Brian Haley
0 siblings, 0 replies; 76+ messages in thread
From: Brian Haley @ 2008-06-12 1:55 UTC (permalink / raw)
To: Julius R. Volz; +Cc: lvs-devel, netdev, horms, davem, vbusam
Julius R. Volz wrote:
> + /* mangle the packet */
> + if (pp->dnat_handler_v6 && !pp->dnat_handler_v6(skb, pp, cp))
> + goto tx_error;
> + ipv6_hdr(skb)->daddr = cp->daddr.v6;
ipv6_addr_copy().
> + /*
> + * Push down and install the IPIP header.
> + */
> + iph = ipv6_hdr(skb);
> + iph->version = 6;
> + iph->nexthdr = IPPROTO_IPV6;
> + iph->payload_len = old_iph->payload_len + sizeof(old_iph);
> + iph->priority = old_iph->priority;
> + memset(&iph->flow_lbl, 0, sizeof(iph->flow_lbl));
> + iph->daddr = rt->rt6i_dst.addr;
> + iph->saddr = cp->vaddr.v6; /* rt->rt6i_src.addr; */
ipv6_addr_copy().
-Brian
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 24/26] IPVS: Add IPv6 support to userspace interface.
2008-06-11 17:12 ` [PATCH 24/26] IPVS: Add IPv6 support to userspace interface Julius R. Volz
@ 2008-06-12 1:55 ` Brian Haley
2008-06-12 9:46 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Brian Haley @ 2008-06-12 1:55 UTC (permalink / raw)
To: Julius R. Volz; +Cc: lvs-devel, netdev, horms, davem, vbusam
Julius R. Volz wrote:
> +#ifdef CONFIG_IP_VS_IPV6
> + struct in6_addr zero_addr = { .s6_addr32 = {0, 0, 0, 0} };
> + if (usvc->af == AF_INET6 && !usvc->fwmark &&
> + ipv6_addr_equal(&usvc->addr.v6,&zero_addr) && !usvc->port) {
> + ret = ip_vs_zero_all();
> + goto out_unlock;
> + }
You can change this ipv6_addr_equal() to ipv6_addr_any(&usvc->addr.v6)
and get rid of the zero_addr variable mess.
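For comparison, a userspace analogue of the same simplification. This sketch uses the POSIX macro IN6_IS_ADDR_UNSPECIFIED from <netinet/in.h> as a stand-in for the kernel's ipv6_addr_any(); the two function names are invented for the example.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Patch style: compare against an explicitly zeroed address. */
static int is_wildcard_by_compare(const struct in6_addr *a)
{
	struct in6_addr zero_addr;

	memset(&zero_addr, 0, sizeof(zero_addr));
	return memcmp(a, &zero_addr, sizeof(zero_addr)) == 0;
}

/* Suggested style: let a helper test for the all-zero address. */
static int is_wildcard_by_helper(const struct in6_addr *a)
{
	return IN6_IS_ADDR_UNSPECIFIED(a) != 0;
}
```

Both functions give the same answer; the second just needs no temporary variable.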
-Brian
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 20/26] IPVS: Add IPv6 Netfilter hooks and add/modify support functions.
2008-06-11 17:12 ` [PATCH 20/26] IPVS: Add IPv6 Netfilter hooks and add/modify support functions Julius R. Volz
@ 2008-06-12 1:55 ` Brian Haley
0 siblings, 0 replies; 76+ messages in thread
From: Brian Haley @ 2008-06-12 1:55 UTC (permalink / raw)
To: Julius R. Volz; +Cc: lvs-devel, netdev, horms, davem, vbusam
Julius R. Volz wrote:
> +#ifdef CONFIG_IP_VS_IPV6
> +void ip_vs_nat_icmp_v6(struct sk_buff *skb, struct ip_vs_protocol *pp,
> + struct ip_vs_conn *cp, int inout)
> +{
> + struct ipv6hdr *iph = ipv6_hdr(skb);
> + unsigned int icmp_offset = sizeof(struct ipv6hdr);
> + struct icmp6hdr *icmph = (struct icmp6hdr *)(skb_network_header(skb) +
> + icmp_offset);
> + struct ipv6hdr *ciph = (struct ipv6hdr *)(icmph + 1);
> +
> + if (inout) {
> + iph->saddr = cp->vaddr.v6;
> + ciph->daddr = cp->vaddr.v6;
> + } else {
> + iph->daddr = cp->daddr.v6;
> + ciph->saddr = cp->daddr.v6;
> + }
ipv6_addr_copy().
> + /* mangle the packet */
> + if (pp->snat_handler_v6 && !pp->snat_handler_v6(skb, pp, cp))
> + goto drop;
> + ipv6_hdr(skb)->saddr = cp->vaddr.v6;
ipv6_addr_copy().
-Brian
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 18/26] IPVS: Add functions for getting/creating IPv6 connections.
2008-06-11 17:12 ` [PATCH 18/26] IPVS: Add functions for getting/creating IPv6 connections Julius R. Volz
@ 2008-06-12 1:55 ` Brian Haley
0 siblings, 0 replies; 76+ messages in thread
From: Brian Haley @ 2008-06-12 1:55 UTC (permalink / raw)
To: Julius R. Volz; +Cc: lvs-devel, netdev, horms, davem, vbusam
Julius R. Volz wrote:
> + INIT_LIST_HEAD(&cp->c_list);
> + setup_timer(&cp->timer, ip_vs_conn_expire, (unsigned long)cp);
> + cp->af = AF_INET6;
> + cp->protocol = proto;
> + cp->caddr.v6 = *caddr;
> + cp->cport = cport;
> + cp->vaddr.v6 = *vaddr;
> + cp->vport = vport;
> + cp->daddr.v6 = *daddr;
ipv6_addr_copy().
-Brian
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 24/26] IPVS: Add IPv6 support to userspace interface.
2008-06-12 1:55 ` Brian Haley
@ 2008-06-12 9:46 ` Julius Volz
0 siblings, 0 replies; 76+ messages in thread
From: Julius Volz @ 2008-06-12 9:46 UTC (permalink / raw)
To: Brian Haley; +Cc: lvs-devel, netdev
On Thu, Jun 12, 2008 at 3:55 AM, Brian Haley <brian.haley@hp.com> wrote:
> Julius R. Volz wrote:
>>
>> +#ifdef CONFIG_IP_VS_IPV6
>> + struct in6_addr zero_addr = { .s6_addr32 = {0, 0, 0, 0} };
>> + if (usvc->af == AF_INET6 && !usvc->fwmark &&
>> + ipv6_addr_equal(&usvc->addr.v6,&zero_addr) && !usvc->port) {
>> + ret = ip_vs_zero_all();
>> + goto out_unlock;
>> + }
>
> You can change this ipv6_addr_equal() to ipv6_addr_any(&usvc->addr.v6) and
> get rid of the zero_addr variable mess.
Thanks, will do that!
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 02/26] IPVS: Change IPVS data structures to support IPv6 addresses.
2008-06-12 1:54 ` Brian Haley
@ 2008-06-12 9:47 ` Julius Volz
0 siblings, 0 replies; 76+ messages in thread
From: Julius Volz @ 2008-06-12 9:47 UTC (permalink / raw)
To: Brian Haley; +Cc: lvs-devel, netdev
On Thu, Jun 12, 2008 at 3:54 AM, Brian Haley <brian.haley@hp.com> wrote:
> You need to use ipv6_addr_copy() with IPv6 addresses. Some of your other
> patches have this same problem, I found some of them...
Thanks for spotting this, I will convert it in the next version!
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-12 1:45 ` Simon Horman
@ 2008-06-12 13:31 ` Julius Volz
2008-06-12 13:38 ` Patrick McHardy
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-12 13:31 UTC (permalink / raw)
To: Simon Horman; +Cc: Vince Busam, Ben Greear, Patrick McHardy, lvs-devel, netdev
On Thu, Jun 12, 2008, Simon Horman <horms@verge.net.au> wrote:
> Eventually the changes settled down, and for the past few years they
> have been very infrequent. But the problem that the interface isn't
> really extendable and that when changes are made kernel and ipvsadm
> versions need to be incremented together remains. For instance, the
> Debian package of ipvsadm actually ships three different ipvsadm
> binaries, and a wrapper works out which one to use based on the kernel
> version.
Ugh.
> I wonder if now would be a good time to bite the bullet and design
> a new interface that is extendable.
If we really have to break it once for IPv6 anyways, it seems like a
good opportunity. Depends on how invasive the changes would need to
be, of course...
You probably already have some ideas on what a better interface would
look like? Especially, how to design it for future backwards
compatibility? And would it still use sockopts or rather one of the
other communication mechanisms?
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-12 13:31 ` Julius Volz
@ 2008-06-12 13:38 ` Patrick McHardy
2008-06-12 15:34 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-12 13:38 UTC (permalink / raw)
To: Julius Volz; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
Julius Volz wrote:
> On Thu, Jun 12, 2008, Simon Horman <horms@verge.net.au> wrote:
>> Eventually the changes settled down, and for the past few years they
>> have been very infrequent. But the problem that the interface isn't
>> really extendable and that when changes are made kernel and ipvsadm
>> versions need to be incremented together remains. For instance, the
>> Debian package of ipvsadm actually ships three different ipvsadm
>> binaries, and a wrapper works out which one to use based on the kernel
>> version.
>
> Ugh.
>
>> I wonder if now would be a good time to bite the bullet and design
>> a new interface that is extendable.
>
> If we really have to break it once for IPv6 anyways, it seems like a
> good opportunity. Depends on how invasive the changes would need to
> be, of course...
You don't need to break the old interface, just add an additional
one.
> You probably already have some ideas on what a better interface would
> look like? Especially, how to design it for future backwards
> compatibility? And would it still use sockopts or rather one of the
> other communication mechanisms?
I'd suggest genetlink or nfnetlink.
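Netlink-style interfaces are easier to extend because a message is a sequence of typed, length-prefixed attributes rather than one fixed struct, so a receiver can skip types it does not know. The following is a self-contained sketch of that idea only. It does not use the real netlink or genetlink API, and the attribute types, struct attr_hdr, attr_put() and attr_find() are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute types, for illustration only. */
enum { ATTR_PORT = 1, ATTR_ADDR4 = 2, ATTR_ADDR6 = 3 };

struct attr_hdr {
	uint16_t len;	/* header plus payload, in bytes */
	uint16_t type;
};

/* Round up to a 4-byte boundary so headers stay aligned. */
static size_t attr_align(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Append one attribute; returns the new offset, or 0 if it does not fit. */
static size_t attr_put(uint8_t *buf, size_t cap, size_t off,
		       uint16_t type, const void *data, uint16_t dlen)
{
	struct attr_hdr h;
	size_t total;

	if (dlen > UINT16_MAX - sizeof(h))
		return 0;
	total = attr_align(sizeof(h) + dlen);
	if (off > cap || total > cap - off)
		return 0;
	h.len = (uint16_t)(sizeof(h) + dlen);
	h.type = type;
	memset(buf + off, 0, total);
	memcpy(buf + off, &h, sizeof(h));
	if (dlen)
		memcpy(buf + off + sizeof(h), data, dlen);
	return off + total;
}

/* Return the payload of the first attribute of type 'want', or NULL. */
static const uint8_t *attr_find(const uint8_t *buf, size_t len,
				uint16_t want, uint16_t *dlen)
{
	size_t off = 0;

	while (off <= len && len - off >= sizeof(struct attr_hdr)) {
		struct attr_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > len - off)
			return NULL;	/* malformed attribute */
		if (h.type == want) {
			*dlen = (uint16_t)(h.len - sizeof(h));
			return buf + off + sizeof(h);
		}
		off += attr_align(h.len);	/* skip unknown or unwanted type */
	}
	return NULL;
}
```

A receiver that only knows ATTR_PORT still finds it in a message that also carries ATTR_ADDR6, which is the property the fixed-struct sockopt interface lacks.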
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-12 13:38 ` Patrick McHardy
@ 2008-06-12 15:34 ` Julius Volz
2008-06-12 15:41 ` Julius Volz
2008-06-12 15:46 ` Patrick McHardy
0 siblings, 2 replies; 76+ messages in thread
From: Julius Volz @ 2008-06-12 15:34 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Thu, Jun 12, 2008 at 3:38 PM, Patrick McHardy <kaber@trash.net> wrote:
> Julius Volz wrote:
>>> I wonder if now would be a good time to bite the bullet and design
>>> a new interface that is extendable.
>>
>> If we really have to break it once for IPv6 anyways, it seems like a
>> good opportunity. Depends on how invasive the changes would need to
>> be, of course...
>
> You don't need to break the old interface, just add an additional
> one.
Ok, then we will just keep the old one in parallel for some time.
>> You probably already have some ideas on what a better interface would
>> look like? Especially, how to design it for future backwards
>> compatibility? And would it still use sockopts or rather one of the
>> other communication mechanisms?
>
> I'd suggest genetlink or nfnetlink.
Ah, that's what I thought... Are there any simple kernel examples with
userspace counterparts to look at? I know iproute2 uses netlink, but
it seems like a rather complicated example.
Genetlink seems especially nice, although I couldn't find a general
explanation of it other than in git history.
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-12 15:34 ` Julius Volz
@ 2008-06-12 15:41 ` Julius Volz
2008-06-12 15:46 ` Patrick McHardy
1 sibling, 0 replies; 76+ messages in thread
From: Julius Volz @ 2008-06-12 15:41 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Thu, Jun 12, 2008 at 5:34 PM, Julius Volz <juliusv@google.com> wrote:
> Genetlink seems especially nice, although I couldn't find a general
> explanation of it other than in git history.
Ah, searching for "genetlink" didn't work, but "Generic Netlink" finds
the HOWTO.
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-12 15:34 ` Julius Volz
2008-06-12 15:41 ` Julius Volz
@ 2008-06-12 15:46 ` Patrick McHardy
2008-06-12 19:33 ` Julius Volz
1 sibling, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-12 15:46 UTC (permalink / raw)
To: Julius Volz; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
Julius Volz wrote:
> On Thu, Jun 12, 2008 at 3:38 PM, Patrick McHardy <kaber@trash.net> wrote:
>> Julius Volz wrote:
>>>> I wonder if now would be a good time to bite the bullet and design
>>>> a new interface that is extendable.
>>> If we really have to break it once for IPv6 anyways, it seems like a
>>> good opportunity. Depends on how invasive the changes would need to
>>> be, of course...
>> You don't need to break the old interface, just add an additional
>> one.
>
> Ok, then we will just keep the old one in parallel for some time.
>
>>> You probably already have some ideas on what a better interface would
>>> look like? Especially, how to design it for future backwards
>>> compatibility? And would it still use sockopts or rather one of the
>>> other communication mechanisms?
>> I'd suggest genetlink or nfnetlink.
>
> Ah, that's what I thought... Are there any simple kernel examples with
> userspace counterparts to look at? I know iproute2 uses netlink, but
> it seems like a rather complicated example.
For nfnetlink: net/netfilter/nf_conntrack_netlink.c and
libnfnetlink_conntrack from git.netfilter.org.
> Genetlink seems especially nice, although I couldn't find a general
> explanation of it other than in git history.
I don't have an example for genetlink, but I guess you should
find some in libnl. In this case I guess both would be fine
since ipvs is only loosely tied to the rest of netfilter.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-12 15:46 ` Patrick McHardy
@ 2008-06-12 19:33 ` Julius Volz
2008-06-13 6:26 ` Simon Horman
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-12 19:33 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Thu, Jun 12, 2008, Patrick McHardy <kaber@trash.net> wrote:
> Julius Volz wrote:
>> Ah, that's what I thought... Are there any simple kernel examples with
>> userspace counterparts to look at? I know iproute2 uses netlink, but
>> it seems like a rather complicated example.
>
> For nfnetlink: net/netfilter/nf_conntrack_netlink.c and
> libnfnetlink_conntrack from git.netfilter.org.
Thanks!
>> Genetlink seems especially nice, although I couldn't find a general
>> explanation of it other than in git history.
>
> I don't have an example for genetlink, but I guess you should
> find some in libnl. In this case I guess both would be fine
> since ipvs is only loosely tied to the rest of netfilter.
Ok, my first impression is that genetlink is aimed at being simple to
use (and has a nice howto).
So we'll work on a genetlink interface and some of the other v6 patch
issues and then post again in a while. Thanks for the feedback!
Horms: ping if you're interested or have some good ideas for this.
Julius
--
Google Switzerland GmbH
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-12 19:33 ` Julius Volz
@ 2008-06-13 6:26 ` Simon Horman
2008-06-13 14:17 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Simon Horman @ 2008-06-13 6:26 UTC (permalink / raw)
To: Julius Volz; +Cc: Patrick McHardy, Vince Busam, Ben Greear, lvs-devel, netdev
On Thu, Jun 12, 2008 at 09:33:27PM +0200, Julius Volz wrote:
> On Thu, Jun 12, 2008, Patrick McHardy <kaber@trash.net> wrote:
> > Julius Volz wrote:
> >> Ah, that's what I thought... Are there any simple kernel examples with
> >> userspace counterparts to look at? I know iproute2 uses netlink, but
> >> it seems like a rather complicated example.
> >
> > For nfnetlink: net/netfilter/nf_conntrack_netlink.c and
> > libnfnetlink_conntrack from git.netfilter.org.
>
> Thanks!
>
> >> Genetlink seems especially nice, although I couldn't find a general
> >> explanation of it other than in git history.
> >
> > I don't have an example for genetlink, but I guess you should
> > find some in libnl. In this case I guess both would be fine
> > since ipvs is only loosely tied to the rest of netfilter.
>
> Ok, my first impression is that genetlink is aimed at being simple to
> use (and has a nice howto).
>
> So we'll work on a genetlink interface and some of the other v6 patch
> issues and then post again in a while. Thanks for the feedback!
>
> Horms: ping if you're interested or have some good ideas for this.
Julius: pong
The main two problems that I see in the existing interface are
a) lack of extendibility (which is why we are here) and;
b) non-idempotent actions, especially adding and deleting
real servers, which mean that user-space programs that
manipulate ipvsadm have extra (racy) logic.
(ok, perhaps that is more a pet peeve than a problem).
I don't really have any concrete ideas about what a better
interface would look like. But I am more than happy to hash out ideas.
--
Horms
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-13 6:26 ` Simon Horman
@ 2008-06-13 14:17 ` Julius Volz
2008-06-13 15:14 ` Patrick McHardy
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-13 14:17 UTC (permalink / raw)
To: Simon Horman; +Cc: Patrick McHardy, Vince Busam, Ben Greear, lvs-devel, netdev
On Fri, Jun 13, 2008 at 8:26 AM, Simon Horman <horms@verge.net.au> wrote:
> On Thu, Jun 12, 2008 at 09:33:27PM +0200, Julius Volz wrote:
>> Ok, my first impression is that genetlink is aimed at being simple to
>> use (and has a nice howto).
>>
>> So we'll work on a genetlink interface and some of the other v6 patch
>> issues and then post again in a while. Thanks for the feedback!
>>
>> Horms: ping if you're interested or have some good ideas for this.
>
> Julius: pong
>
> The main two problems that I see in the existing interface are
> a) lack of extendibility (which is why we are here) and;
> b) non-idempotent actions, especially adding and deleting
> real servers, which mean that user-space programs that
> manipulate ipvsadm have to have extra (racy) logic.
> (ok, perhaps that is more a pet peeve than a problem).
Ok, so we probably won't focus on b) as a priority right now, unless
it happens as a side-effect.
> I don't really have any concrete ideas about what a better
> interface would look like. But I am more than happy to hash out ideas.
Good! At the moment I'm looking at various netlink docs and figuring
out how things generally work. I think netlink probably adds a lot of
complexity over the previous sockopt interface, but I hope it's worth
it.
As for compatibility and extensibility, how is that best achieved with
netlink? I've seen some examples copy whole C structs into netlink
datagrams, but that is obviously what we don't want anymore. So the
way to go seems to be to transfer each struct field as a separate
netlink attribute, right?
Julius
--
Google Switzerland GmbH
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-13 14:17 ` Julius Volz
@ 2008-06-13 15:14 ` Patrick McHardy
2008-06-16 0:14 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-13 15:14 UTC (permalink / raw)
To: Julius Volz; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
Julius Volz wrote:
> On Fri, Jun 13, 2008 at 8:26 AM, Simon Horman <horms@verge.net.au> wrote:
>> I don't really have any concrete ideas about what a better
>> interface would look like. But I am more than happy to hash out ideas.
>
> Good! At the moment I'm looking at various netlink docs and figuring
> out how things generally work. I think netlink probably adds a lot of
> complexity over the previous sockopt interface, but I hope it's worth
> it.
>
> As for compatibility and extensibility, how is that best achieved with
> netlink? I've seen some examples copy whole C structs into netlink
> datagrams, but that is obviously what we don't want anymore. So the
> way to go seems to be to transfer each struct field as a separate
> netlink attribute, right?
Yes. You can also group those which belong together logically
in nested attributes.
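To make the per-field idea concrete, here is a minimal userspace sketch in C of how individual fields end up as separate type/length/value attributes, with a nested container grouping the ones that belong together. It does not use the kernel's nla_put()/nla_nest_start() helpers: struct attr_hdr, put_attr(), nest_start(), nest_end(), build_service() and the EX_* numbers are hypothetical stand-ins that only mimic the 4-byte header and 4-byte alignment of netlink attributes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a netlink attribute header (4 bytes). */
struct attr_hdr {
	uint16_t len;	/* header + payload, excluding padding */
	uint16_t type;
};

#define ATTR_ALIGN(n)	(((n) + (size_t)3) & ~(size_t)3)
#define ATTR_HDRLEN	sizeof(struct attr_hdr)

/* Hypothetical attribute numbers, for illustration only. */
enum { EX_ATTR_SERVICE = 1 };
enum { EX_SVC_PROTOCOL = 1, EX_SVC_PORT = 2 };

struct msgbuf {
	unsigned char data[256];
	size_t used;
};

/* Append one attribute; returns 0 on success, -1 if the buffer is full. */
static int put_attr(struct msgbuf *m, uint16_t type,
		    const void *payload, size_t plen)
{
	struct attr_hdr h;
	size_t total = ATTR_ALIGN(ATTR_HDRLEN + plen);

	if (total > sizeof(m->data) - m->used)
		return -1;
	h.len = (uint16_t)(ATTR_HDRLEN + plen);
	h.type = type;
	memset(m->data + m->used, 0, total);	/* zero the padding too */
	memcpy(m->data + m->used, &h, sizeof(h));
	if (plen)
		memcpy(m->data + m->used + ATTR_HDRLEN, payload, plen);
	m->used += total;
	return 0;
}

/* Open a nested container; returns the offset of its header, or -1. */
static long nest_start(struct msgbuf *m, uint16_t type)
{
	size_t off = m->used;

	if (put_attr(m, type, NULL, 0) < 0)
		return -1;
	return (long)off;
}

/* Close a container: its length now covers everything added since. */
static void nest_end(struct msgbuf *m, size_t start)
{
	struct attr_hdr h;

	memcpy(&h, m->data + start, sizeof(h));
	h.len = (uint16_t)(m->used - start);
	memcpy(m->data + start, &h, sizeof(h));
}

/* Build SERVICE { PROTOCOL (u32), PORT (u16) }; returns bytes used, 0 on error. */
static size_t build_service(struct msgbuf *m, uint32_t proto, uint16_t port)
{
	long nest;

	m->used = 0;
	nest = nest_start(m, EX_ATTR_SERVICE);
	if (nest < 0 ||
	    put_attr(m, EX_SVC_PROTOCOL, &proto, sizeof(proto)) < 0 ||
	    put_attr(m, EX_SVC_PORT, &port, sizeof(port)) < 0)
		return 0;
	nest_end(m, (size_t)nest);
	return m->used;
}
```

Because every field carries its own type and length, a receiver can skip attribute types it does not know, which is where the extensibility over a copied C struct comes from.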
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-13 15:14 ` Patrick McHardy
@ 2008-06-16 0:14 ` Julius Volz
2008-06-16 11:47 ` Patrick McHardy
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-16 0:14 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Fri, Jun 13, 2008 at 5:14 PM, Patrick McHardy <kaber@trash.net> wrote:
> Julius Volz wrote:
>> As for compatibility and extensibility, how is that best achieved with
>> netlink? I've seen some examples copy whole C structs into netlink
>> datagrams, but that is obviously what we don't want anymore. So the
>> way to go seems to be to transfer each struct field as a separate
>> netlink attribute, right?
>
> Yes. You can also group those which belong together logically
> in nested attributes.
Thanks. I've now looked closer at Netlink and read the Genetlink implementation.
For the new IPVS interface, is there a preference for the granularity
of the top-level Genetlink operations?
I see three naive possibilities (one Genetlink op per line), if we
start out with a straight mapping from the old API to the new one:
a)
SET - covers all of previous IP_VS_SO_SET_*
GET - covers all of previous IP_VS_SO_GET_*
b) more split up
ADD - services and destinations
EDIT - services and destinations
DEL - services and destinations
SETTIMEOUT
STARTDAEMON
STOPDAEMON
ZERO
(+ granular GET commands...)
c) totally split up
ADD_SVC
ADD_DEST
EDIT_SVC
EDIT_DEST
DEL_SVC
DEL_DEST
SETTIMEOUT
STARTDAEMON
STOPDAEMON
ZERO
(+ granular GET commands...)
I find http://www.linuxfoundation.org/en/Net:Generic_Netlink_HOWTO saying:
===========================
Operation Granularity
While it may be tempting to register a single operation for a Generic
Netlink family and multiplex multiple sub-commands on the single
operation, this is strongly discouraged for security reasons.
Combining multiple behaviors into one operation makes it difficult to
restrict the operations using the existing Linux kernel security
mechanisms.
===========================
Option c) looks reasonable to me and also seems easy to handle in
general. Is this the way to go? Or do we want the interface to look
completely different this time?
Julius
--
Google Switzerland GmbH
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-16 0:14 ` Julius Volz
@ 2008-06-16 11:47 ` Patrick McHardy
2008-06-16 12:13 ` Julius Volz
` (2 more replies)
0 siblings, 3 replies; 76+ messages in thread
From: Patrick McHardy @ 2008-06-16 11:47 UTC (permalink / raw)
To: Julius Volz; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
Julius Volz wrote:
> Thanks. I've now looked closer at Netlink and read the Genetlink implementation.
>
> For the new IPVS interface, is there a preference for the granularity
> of the top-level Genetlink operations?
>
> I see three naive possibilities (one Genetlink op per line), if we
> start out with a straight mapping from the old API to the new one:
>
> a)
> SET - covers all of previous IP_VS_SO_SET_*
> GET - covers all of previous IP_VS_SO_GET_*
>
> b) more split up
> ADD - services and destinations
> EDIT - services and destinations
> DEL - services and destinations
> SETTIMEOUT
> STARTDAEMON
> STOPDAEMON
> ZERO
> (+ granular GET commands...)
>
> c) totally split up
> ADD_SVC
> ADD_DEST
> EDIT_SVC
> EDIT_DEST
> DEL_SVC
> DEL_DEST
> SETTIMEOUT
> STARTDAEMON
> STOPDAEMON
> ZERO
> (+ granular GET commands...)
>
> I find http://www.linuxfoundation.org/en/Net:Generic_Netlink_HOWTO saying:
> ===========================
> Operation Granularity
>
> While it may be tempting to register a single operation for a Generic
> Netlink family and multiplex multiple sub-commands on the single
> operation, this is strongly discouraged for security reasons.
> Combining multiple behaviors into one operation makes it difficult to
> restrict the operations using the existing Linux kernel security
> mechanisms.
> ===========================
>
> Option c) looks reasonable to me and also seems easy to handle in
> general. Is this the way to go? Or do we want the interface to look
> completely different this time?
b) or c) both look fine. You could save a few operations (ADD/EDIT
can be combined) by making use of nlmsg_flags though:
The semantics of the flags are:
- NLM_F_CREATE|NLM_F_EXCL: create if non-existent
- NLM_F_REPLACE: change existing
- NLM_F_CREATE|NLM_F_REPLACE: create if non-existent, replace otherwise
- NLM_F_EXCL: test existence
NLM_F_APPEND can be used as modifier for NLM_F_CREATE to
specify that the new entry should be added to the end instead
of the beginning.
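As a rough illustration of how a combined ADD/EDIT handler might dispatch on these flags, here is a small standalone C sketch. The flag values mirror <linux/netlink.h>, but ex_resolve() and enum ex_action are hypothetical names, and the behaviour for an existing entry when neither NLM_F_EXCL nor NLM_F_REPLACE is set (treated as a no-op here) is an assumption, since the list above does not cover that case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Values mirror <linux/netlink.h>; defined here to keep the sketch standalone. */
#ifndef NLM_F_REPLACE
#define NLM_F_REPLACE	0x100
#define NLM_F_EXCL	0x200
#define NLM_F_CREATE	0x400
#endif

/* Hypothetical result of the flag check. */
enum ex_action {
	EX_ACT_ERROR,	/* *err holds a negative errno value */
	EX_ACT_CREATE,	/* add a new entry */
	EX_ACT_UPDATE,	/* modify the existing entry */
	EX_ACT_NONE,	/* leave the existing entry alone (assumption) */
};

/*
 * Decide what a combined add/edit command should do, given the
 * nlmsg_flags of the request and whether the entry already exists.
 */
static enum ex_action ex_resolve(uint16_t flags, bool exists, int *err)
{
	*err = 0;

	if (exists) {
		if (flags & NLM_F_EXCL) {
			*err = -EEXIST;	/* "create only" or existence test */
			return EX_ACT_ERROR;
		}
		if (flags & NLM_F_REPLACE)
			return EX_ACT_UPDATE;
		return EX_ACT_NONE;
	}

	if (flags & NLM_F_CREATE)
		return EX_ACT_CREATE;

	*err = -ENOENT;		/* nothing to replace or test */
	return EX_ACT_ERROR;
}
```

With this shape, ADD maps to NLM_F_CREATE|NLM_F_EXCL and EDIT to NLM_F_REPLACE on the same operation, so no separate EDIT command is needed.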
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-16 11:47 ` Patrick McHardy
@ 2008-06-16 12:13 ` Julius Volz
2008-06-16 23:19 ` Julius Volz
2008-06-30 12:01 ` Julius Volz
2 siblings, 0 replies; 76+ messages in thread
From: Julius Volz @ 2008-06-16 12:13 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Mon, Jun 16, 2008 at 1:47 PM, Patrick McHardy <kaber@trash.net> wrote:
>> Option c) looks reasonable to me and also seems easy to handle in
>> general. Is this the way to go? Or do we want the interface to look
>> completely different this time?
>
> b) or c) both look fine.
Good, thanks.
> You could save a few operations (ADD/EDIT
> can be combined) by making use of nlmsg_flags though:
>
> The semantics of the flags are:
>
> - NLM_F_CREATE|NLM_F_EXCL: create if non-existent
> - NLM_F_REPLACE: change existing
> - NLM_F_CREATE|NLM_F_REPLACE: create if non-existent, replace otherwise
> - NLM_F_EXCL: test existence
Thanks for explaining this! Sounds good!
> NLM_F_APPEND can be used as modifier for NLM_F_CREATE to
> specify that the new entry should be added to the end instead
> of the beginning.
Interesting. This should not be needed in IPVS though, as entry order
doesn't matter much there.
Julius
--
Google Switzerland GmbH
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-16 11:47 ` Patrick McHardy
2008-06-16 12:13 ` Julius Volz
@ 2008-06-16 23:19 ` Julius Volz
2008-06-17 11:52 ` Patrick McHardy
2008-06-30 12:01 ` Julius Volz
2 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-16 23:19 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Mon, Jun 16, 2008 at 1:47 PM, Patrick McHardy <kaber@trash.net> wrote:
>> Option c) looks reasonable to me and also seems easy to handle in
>> general. Is this the way to go? Or do we want the interface to look
>> completely different this time?
>
> b) or c) both look fine.
A couple of other Netlink questions came up while reading some code:
1) There are many examples of the following two cases in the kernel:
nla_nest_start(skb, SOME_ATTR_TYPE | NLA_F_NESTED);
nla_nest_start(skb, SOME_ATTR_TYPE);
Why don't all cases have NLA_F_NESTED? Then again, this bit is never
ever read out again (at least not in the kernel), so I guess people
are just using their implicit knowledge that a specific attribute type
is always nested and never check the bit?
Btw., couldn't we change nla_nest_start() to always add NLA_F_NESTED
to the type?
2) To send an array of attributes of the same type, you just add them
serially? I was just confused at first that nla_parse() will save only
one attribute of each type (the last one) in the destination array, so
when dealing with arrays, it doesn't help. So I just iterate over the
array with nla_for_each_attr() and parse each element manually, right?
Thanks for your time!
Julius
--
Google Switzerland GmbH
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-16 23:19 ` Julius Volz
@ 2008-06-17 11:52 ` Patrick McHardy
2008-06-17 17:18 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-17 11:52 UTC (permalink / raw)
To: Julius Volz; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
Julius Volz wrote:
> On Mon, Jun 16, 2008 at 1:47 PM, Patrick McHardy <kaber@trash.net> wrote:
>>> Option c) looks reasonable to me and also seems easy to handle in
>>> general. Is this the way to go? Or do we want the interface to look
>>> completely different this time?
>> b) or c) both look fine.
>
> A couple of other Netlink questions came up while reading some code:
>
> 1) There are many examples of the following two cases in the kernel:
>
> nla_nest_start(skb, SOME_ATTR_TYPE | NLA_F_NESTED);
> nla_nest_start(skb, SOME_ATTR_TYPE);
>
> Why don't all cases have NLA_F_NESTED? Then again, this bit is never
> ever read out again (at least not in the kernel), so I guess people
> are just using their implicit knowledge that a specific attribute type
> is always nested and never check the bit?
>
> Btw., couldn't we change nla_nest_start() to always add NLA_F_NESTED
> to the type?
The NLA_F_NESTED bit originated in nfnetlink, but was moved
to netlink so the new netlink parsing helpers could also be
used for nfnetlink. It can't be added to existing attributes
since userspace needs to mask it out again to get the real
attribute value and the non-nfnetlink userspace code doesn't
expect it.
> 2) To send an array of attributes of the same type, you just add them
> serially? I was just confused at first that nla_parse() will save only
> one attribute of each type (the last one) in the destination array, so
> when dealing with arrays, it doesn't help. So I just iterate over the
> array with nla_for_each_attr() and parse each element manually, right?
Exactly, net/8021q/vlan_netlink.c has two examples of this (the
QoS mapping attributes).
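For readers without the kernel tree at hand, the manual walk can be sketched in plain userspace C as follows. This is not the kernel's nla_for_each_attr() macro: struct attr_hdr, ex_put_u32() and ex_collect_u32() are hypothetical helpers that imitate the attribute layout. EX_TYPE_MASK uses the same bit positions as the kernel's NLA_TYPE_MASK, so a type carrying the nested bit is masked before it is compared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a netlink attribute header (4 bytes). */
struct attr_hdr {
	uint16_t len;	/* header + payload, excluding padding */
	uint16_t type;
};

#define ATTR_ALIGN(n)	(((n) + (size_t)3) & ~(size_t)3)
#define ATTR_HDRLEN	sizeof(struct attr_hdr)
#define EX_F_NESTED	0x8000	/* same bit as NLA_F_NESTED */
#define EX_TYPE_MASK	0x3fff	/* clears the two flag bits of the type */

/* Append a u32 attribute at offset off; the caller guarantees enough room. */
static size_t ex_put_u32(unsigned char *buf, size_t off,
			 uint16_t type, uint32_t val)
{
	struct attr_hdr h;

	h.len = (uint16_t)(ATTR_HDRLEN + sizeof(val));
	h.type = type;
	memcpy(buf + off, &h, sizeof(h));
	memcpy(buf + off + ATTR_HDRLEN, &val, sizeof(val));
	return off + ATTR_ALIGN(ATTR_HDRLEN + sizeof(val));
}

/*
 * Walk every attribute in buf and collect the payload of each u32
 * attribute whose masked type equals want. Unlike a "one slot per
 * type" parse, repeated attributes are all seen. Returns the count.
 */
static size_t ex_collect_u32(const unsigned char *buf, size_t len,
			     uint16_t want, uint32_t *out, size_t max)
{
	size_t off = 0, n = 0;

	while (len - off >= ATTR_HDRLEN) {
		struct attr_hdr h;
		size_t step;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < ATTR_HDRLEN || h.len > len - off)
			break;		/* malformed attribute: stop */

		if ((h.type & EX_TYPE_MASK) == want &&
		    h.len == ATTR_HDRLEN + sizeof(uint32_t) && n < max)
			memcpy(&out[n++], buf + off + ATTR_HDRLEN,
			       sizeof(uint32_t));

		step = ATTR_ALIGN((size_t)h.len);
		if (step > len - off)
			break;		/* last attribute had no padding */
		off += step;
	}
	return n;
}
```

A table-based parse keeps one pointer per type, so only the last of several same-typed attributes survives; the loop above visits each one in order instead.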
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-17 11:52 ` Patrick McHardy
@ 2008-06-17 17:18 ` Julius Volz
2008-06-17 20:08 ` Patrick McHardy
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-17 17:18 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Tue, Jun 17, 2008 at 01:52:46PM +0200, Patrick McHardy wrote:
> Julius Volz wrote:
> >Btw., couldn't we change nla_nest_start() to always add NLA_F_NESTED
> >to the type?
>
> The NLA_F_NESTED bit originated in nfnetlink, but was moved
> to netlink so the new netlink parsing helpers could also be
> used for nfnetlink. It can't be added to existing attributes
> since userspace needs to mask it out again to get the real
> attribute value and the non-nfnetlink userspace code doesn't
> expect it.
>
> >2) To send an array of attributes of the same type, you just add them
> >serially? I was just confused at first that nla_parse() will save only
> >one attribute of each type (the last one) in the destination array, so
> >when dealing with arrays, it doesn't help. So I just iterate over the
> >array with nla_for_each_attr() and parse each element manually, right?
>
> Exactly, net/8021q/vlan_netlink.c has two examples of this (the
> QoS mapping attributes).
Thanks for these explanations!
Ok, so this is my draft version of the IPVS Generic Netlink interface
definition. I'm posting this to see if anyone notices general problems
with it right away.
Arrays of the same attribute type are always put into a nested container
so that it is easy to add new attributes which are parallel to the array
later on. Perhaps integer flag fields should also be split up into
NLA_FLAG attributes, haven't done that yet.
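A rough sketch of what that split could look like, assuming a flag attribute is simply an attribute header with no payload whose presence means "set". This is standalone userspace C, not kernel code: the EX_* flag bits and attribute numbers, struct attr_hdr, ex_flags_to_attrs() and ex_attrs_to_flags() are all hypothetical and only illustrate the encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a netlink attribute header (4 bytes). */
struct attr_hdr {
	uint16_t len;	/* header + payload; flags have no payload */
	uint16_t type;
};

#define ATTR_HDRLEN	sizeof(struct attr_hdr)

/* Hypothetical flag bits and their matching attribute types. */
#define EX_SVC_F_PERSISTENT	0x1u
#define EX_SVC_F_HASHED		0x2u
enum { EX_FLAG_ATTR_PERSISTENT = 1, EX_FLAG_ATTR_HASHED = 2 };

static const struct {
	uint32_t bit;
	uint16_t type;
} ex_flag_map[] = {
	{ EX_SVC_F_PERSISTENT, EX_FLAG_ATTR_PERSISTENT },
	{ EX_SVC_F_HASHED, EX_FLAG_ATTR_HASHED },
};

#define EX_NFLAGS (sizeof(ex_flag_map) / sizeof(ex_flag_map[0]))

/* Emit one header-only attribute per set bit; returns bytes written or -1. */
static int ex_flags_to_attrs(uint32_t flags, unsigned char *buf, size_t cap)
{
	size_t off = 0, i;

	for (i = 0; i < EX_NFLAGS; i++) {
		struct attr_hdr h;

		if (!(flags & ex_flag_map[i].bit))
			continue;
		if (cap - off < ATTR_HDRLEN)
			return -1;	/* buffer too small */
		h.len = (uint16_t)ATTR_HDRLEN;
		h.type = ex_flag_map[i].type;
		memcpy(buf + off, &h, sizeof(h));
		off += ATTR_HDRLEN;
	}
	return (int)off;
}

/* Rebuild the integer flag word from the attributes that are present. */
static uint32_t ex_attrs_to_flags(const unsigned char *buf, size_t len)
{
	uint32_t flags = 0;
	size_t off = 0, i;

	while (len - off >= ATTR_HDRLEN) {
		struct attr_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len != ATTR_HDRLEN)
			break;		/* not a payload-free flag attribute */
		for (i = 0; i < EX_NFLAGS; i++)
			if (h.type == ex_flag_map[i].type)
				flags |= ex_flag_map[i].bit;
		off += ATTR_HDRLEN;
	}
	return flags;
}
```

The trade-off is message size against extensibility: each flag costs a 4-byte header instead of one bit, but new flags can be added later without redefining an integer field, and unknown flag types are simply ignored by the decoder.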
First is a text listing attribute types and how they occur and nest in
all of the commands and their replies. After that are the corresponding
source excerpts (no patch material yet).
Julius
======================================
| IPVS NETLINK ATTRIBUTE TYPES |
| (grouped as enums) |
======================================
IPVS_ENTRY_ATTR_SERVICE - NLA_NESTED
IPVS_ENTRY_ATTR_SERVICES - NLA_NESTED
IPVS_ENTRY_ATTR_DEST - NLA_NESTED
IPVS_ENTRY_ATTR_DESTS - NLA_NESTED
IPVS_ENTRY_ATTR_DAEMON - NLA_NESTED
IPVS_ENTRY_ATTR_DAEMONS - NLA_NESTED
IPVS_SVC_ATTR_AF - NLA_U32
IPVS_SVC_ATTR_PROTOCOL - NLA_U32
IPVS_SVC_ATTR_ADDR - union nf_inet_addr
IPVS_SVC_ATTR_PORT - NLA_U16
IPVS_SVC_ATTR_FWMARK - NLA_U32
IPVS_SVC_ATTR_SCHED_NAME - NLA_STRING
IPVS_SVC_ATTR_FLAGS - NLA_U32
IPVS_SVC_ATTR_TIMEOUT - NLA_U32
IPVS_SVC_ATTR_NETMASK - NLA_U32
IPVS_SVC_ATTR_NUM_DESTS - NLA_U32
IPVS_SVC_ATTR_STATS - NLA_NESTED
IPVS_DEST_ATTR_AF - NLA_U32
IPVS_DEST_ATTR_ADDR - union nf_inet_addr
IPVS_DEST_ATTR_PORT - NLA_U16
IPVS_DEST_ATTR_CONN_FLAGS - NLA_U32
IPVS_DEST_ATTR_WEIGHT - NLA_U32
IPVS_DEST_ATTR_U_THRESH - NLA_U32
IPVS_DEST_ATTR_L_THRESH - NLA_U32
IPVS_DEST_ATTR_ACTIVE_CONNS - NLA_U32
IPVS_DEST_ATTR_INACT_CONNS - NLA_U32
IPVS_DEST_ATTR_PERSIST_CONNS - NLA_U32
IPVS_DEST_ATTR_STATS - NLA_NESTED
IPVS_STATS_ATTR_CONNS - NLA_U32
IPVS_STATS_ATTR_INPKTS - NLA_U32
IPVS_STATS_ATTR_OUTPKTS - NLA_U32
IPVS_STATS_ATTR_INBYTES - NLA_U64
IPVS_STATS_ATTR_OUTBYTES - NLA_U64
IPVS_STATS_ATTR_CPS - NLA_U32
IPVS_STATS_ATTR_INPPS - NLA_U32
IPVS_STATS_ATTR_OUTPPS - NLA_U32
IPVS_STATS_ATTR_INBPS - NLA_U32
IPVS_STATS_ATTR_OUTBPS - NLA_U32
IPVS_TIMEOUT_ATTR_TCP - NLA_U32
IPVS_TIMEOUT_ATTR_TCP_FIN - NLA_U32
IPVS_TIMEOUT_ATTR_UDP - NLA_U32
IPVS_DAEMON_ATTR_STATE - NLA_U32
IPVS_DAEMON_ATTR_MCAST_IFN - NLA_STRING
IPVS_DAEMON_ATTR_SYNC_ID - NLA_U32
IPVS_INFO_ATTR_VERSION - NLA_U32
IPVS_INFO_ATTR_CONNTAB_SIZE - NLA_U32
IPVS_INFO_ATTR_NUM_SERVICES - NLA_U32
====================================================
| ATTRIBUTES PASSED AS ARGUMENTS TO COMMANDS |
====================================================
IPVS_CMD_ADD_SERVICE (with add/edit flag)
IPVS_ENTRY_ATTR_SERVICE
IPVS_SVC_ATTR_AF
IPVS_SVC_ATTR_PROTOCOL
IPVS_SVC_ATTR_ADDR
IPVS_SVC_ATTR_PORT || IPVS_SVC_ATTR_FWMARK
IPVS_SVC_ATTR_SCHED_NAME
IPVS_SVC_ATTR_FLAGS
IPVS_SVC_ATTR_TIMEOUT
IPVS_SVC_ATTR_NETMASK
IPVS_CMD_DEL_SERVICE
IPVS_ENTRY_ATTR_SERVICE
IPVS_SVC_ATTR_AF
IPVS_SVC_ATTR_PROTOCOL
IPVS_SVC_ATTR_ADDR
IPVS_SVC_ATTR_PORT || IPVS_SVC_ATTR_FWMARK
IPVS_CMD_ADD_DEST (with add/edit flag)
IPVS_ENTRY_ATTR_SERVICE
IPVS_SVC_ATTR_AF
IPVS_SVC_ATTR_PROTOCOL
IPVS_SVC_ATTR_ADDR
IPVS_SVC_ATTR_PORT || IPVS_SVC_ATTR_FWMARK
IPVS_ENTRY_ATTR_DEST
IPVS_DEST_ATTR_AF
IPVS_DEST_ATTR_ADDR
IPVS_DEST_ATTR_PORT
IPVS_DEST_ATTR_CONN_FLAGS
IPVS_DEST_ATTR_WEIGHT
IPVS_DEST_ATTR_U_THRESH
IPVS_DEST_ATTR_L_THRESH
IPVS_CMD_DEL_DEST
IPVS_ENTRY_ATTR_SERVICE
IPVS_SVC_ATTR_AF
IPVS_SVC_ATTR_PROTOCOL
IPVS_SVC_ATTR_ADDR
IPVS_SVC_ATTR_PORT || IPVS_SVC_ATTR_FWMARK
IPVS_ENTRY_ATTR_DEST
IPVS_DEST_ATTR_AF
IPVS_DEST_ATTR_ADDR
IPVS_DEST_ATTR_PORT
IPVS_CMD_FLUSH (no arguments)
IPVS_CMD_SET_TIMEOUT
IPVS_TIMEOUT_ATTR_TCP
IPVS_TIMEOUT_ATTR_TCP_FIN
IPVS_TIMEOUT_ATTR_UDP
IPVS_CMD_START_DAEMON
IPVS_ENTRY_ATTR_DAEMON
IPVS_DAEMON_ATTR_STATE
IPVS_DAEMON_ATTR_MCAST_IFN
IPVS_DAEMON_ATTR_SYNC_ID
IPVS_CMD_STOP_DAEMON
IPVS_ENTRY_ATTR_DAEMON
IPVS_DAEMON_ATTR_STATE
IPVS_CMD_ZERO (no arguments)
IPVS_CMD_GET_INFO (no arguments)
IPVS_CMD_GET_SERVICES (no arguments)
IPVS_CMD_GET_SERVICE
IPVS_ENTRY_ATTR_SERVICE
IPVS_SVC_ATTR_AF
IPVS_SVC_ATTR_PROTOCOL
IPVS_SVC_ATTR_ADDR
IPVS_SVC_ATTR_PORT || IPVS_SVC_ATTR_FWMARK
IPVS_CMD_GET_DESTS
IPVS_ENTRY_ATTR_SERVICE
IPVS_SVC_ATTR_AF
IPVS_SVC_ATTR_PROTOCOL
IPVS_SVC_ATTR_ADDR
IPVS_SVC_ATTR_PORT || IPVS_SVC_ATTR_FWMARK
IPVS_CMD_GET_TIMEOUT (no arguments)
IPVS_CMD_GET_DAEMON (no arguments)
=====================================================
| ATTRIBUTES RETURNED IN RESPONSE TO COMMANDS |
=====================================================
IPVS_CMD_ADD_SERVICE (only return code)
IPVS_CMD_DEL_SERVICE (only return code)
IPVS_CMD_ADD_DEST (only return code)
IPVS_CMD_DEL_DEST (only return code)
IPVS_CMD_FLUSH (only return code)
IPVS_CMD_SET_TIMEOUT (only return code)
IPVS_CMD_START_DAEMON (only return code)
IPVS_CMD_STOP_DAEMON (only return code)
IPVS_CMD_ZERO (only return code)
IPVS_CMD_GET_INFO
IPVS_INFO_ATTR_VERSION
IPVS_INFO_ATTR_CONNTAB_SIZE
IPVS_INFO_ATTR_NUM_SERVICES
IPVS_CMD_GET_SERVICES
IPVS_ENTRY_ATTR_SERVICES
IPVS_ENTRY_ATTR_SERVICE (array)
IPVS_SVC_ATTR_AF
IPVS_SVC_ATTR_PROTOCOL
IPVS_SVC_ATTR_ADDR
IPVS_SVC_ATTR_PORT || IPVS_SVC_ATTR_FWMARK
IPVS_SVC_ATTR_SCHED_NAME
IPVS_SVC_ATTR_FLAGS
IPVS_SVC_ATTR_TIMEOUT
IPVS_SVC_ATTR_NETMASK
IPVS_SVC_ATTR_NUM_DESTS
IPVS_SVC_ATTR_STATS
IPVS_CMD_GET_SERVICE
IPVS_ENTRY_ATTR_SERVICE
IPVS_SVC_ATTR_AF
IPVS_SVC_ATTR_PROTOCOL
IPVS_SVC_ATTR_ADDR
IPVS_SVC_ATTR_PORT || IPVS_SVC_ATTR_FWMARK
IPVS_SVC_ATTR_SCHED_NAME
IPVS_SVC_ATTR_FLAGS
IPVS_SVC_ATTR_TIMEOUT
IPVS_SVC_ATTR_NETMASK
IPVS_SVC_ATTR_NUM_DESTS
IPVS_SVC_ATTR_STATS
IPVS_CMD_GET_DESTS
IPVS_ENTRY_ATTR_DESTS
IPVS_ENTRY_ATTR_DEST (array)
IPVS_DEST_ATTR_AF
IPVS_DEST_ATTR_ADDR
IPVS_DEST_ATTR_PORT
IPVS_DEST_ATTR_CONN_FLAGS
IPVS_DEST_ATTR_WEIGHT
IPVS_DEST_ATTR_U_THRESH
IPVS_DEST_ATTR_L_THRESH
IPVS_DEST_ATTR_ACTIVE_CONNS
IPVS_DEST_ATTR_INACT_CONNS
IPVS_DEST_ATTR_PERSIST_CONNS
IPVS_DEST_ATTR_STATS
IPVS_CMD_GET_TIMEOUT
IPVS_TIMEOUT_ATTR_TCP
IPVS_TIMEOUT_ATTR_TCP_FIN
IPVS_TIMEOUT_ATTR_UDP
IPVS_CMD_GET_DAEMON
IPVS_ENTRY_ATTR_DAEMONS
IPVS_ENTRY_ATTR_DAEMON (array)
IPVS_DAEMON_ATTR_STATE
IPVS_DAEMON_ATTR_MCAST_IFN
IPVS_DAEMON_ATTR_SYNC_ID
========================== include/net/ip_vs.h ==========================
/*
*
* IPVS Generic Netlink interface definitions
*
*/
/* Generic Netlink family info */
#define IPVS_GENL_NAME "IPVS"
#define IPVS_GENL_VERSION 0x1
/* Generic Netlink command attributes */
enum {
IPVS_CMD_UNSPEC = 0,
IPVS_CMD_ADD_SERVICE, /* add or modify service */
IPVS_CMD_DEL_SERVICE, /* delete service */
IPVS_CMD_ADD_DEST, /* add or modify destination */
IPVS_CMD_DEL_DEST, /* delete destination */
IPVS_CMD_FLUSH, /* flush all services and dests */
IPVS_CMD_SET_TIMEOUT, /* set TCP and UDP timeouts */
IPVS_CMD_START_DAEMON, /* start sync daemon */
IPVS_CMD_STOP_DAEMON, /* stop sync daemon */
IPVS_CMD_ZERO, /* zero all counters and stats */
IPVS_CMD_GET_INFO, /* get general IPVS info */
IPVS_CMD_GET_SERVICES, /* get list of all services */
IPVS_CMD_GET_SERVICE, /* get info about specific service */
IPVS_CMD_GET_DESTS, /* get list of all service dests */
IPVS_CMD_GET_TIMEOUT, /* get TCP and UDP timeouts */
IPVS_CMD_GET_DAEMON, /* get sync daemon status */
__IPVS_CMD_MAX,
};
#define IPVS_CMD_MAX (__IPVS_CMD_MAX - 1)
/*
* Attributes used in the first level of commands that maintain multiple entries
* of the same element type (services, destinations, sync daemons)
*
* Arrays of the same attribute type are always nested in the plural version of
* the attribute to allow adding attributes in parallel to the array later on
*/
enum {
IPVS_ENTRY_ATTR_UNSPEC = 0,
IPVS_ENTRY_ATTR_SERVICE, /* nested service attribute */
IPVS_ENTRY_ATTR_SERVICES, /* nested service list attribute */
IPVS_ENTRY_ATTR_DEST, /* nested destination attribute */
IPVS_ENTRY_ATTR_DESTS, /* nested destination list attribute */
IPVS_ENTRY_ATTR_DAEMON, /* nested sync daemon attribute */
IPVS_ENTRY_ATTR_DAEMONS, /* nested sync daemon list attribute */
__IPVS_ENTRY_ATTR_MAX,
};
#define IPVS_ENTRY_ATTR_MAX (__IPVS_ENTRY_ATTR_MAX - 1)
/*
* Attributes used to describe a service
*
* Used inside nested attribute IPVS_ENTRY_ATTR_SERVICE
*/
enum {
IPVS_SVC_ATTR_UNSPEC = 0,
IPVS_SVC_ATTR_AF, /* address family */
IPVS_SVC_ATTR_PROTOCOL, /* virtual service protocol */
IPVS_SVC_ATTR_ADDR, /* virtual service address */
IPVS_SVC_ATTR_PORT, /* virtual service port */
IPVS_SVC_ATTR_FWMARK, /* firewall mark of service */
IPVS_SVC_ATTR_SCHED_NAME, /* name of scheduler */
IPVS_SVC_ATTR_FLAGS, /* virtual service flags */
IPVS_SVC_ATTR_TIMEOUT, /* persistent timeout */
IPVS_SVC_ATTR_NETMASK, /* persistent netmask */
IPVS_SVC_ATTR_NUM_DESTS, /* number of real servers in service */
IPVS_SVC_ATTR_STATS, /* nested attribute for service stats */
__IPVS_SVC_ATTR_MAX,
};
#define IPVS_SVC_ATTR_MAX (__IPVS_SVC_ATTR_MAX - 1)
/*
* Attributes used to describe a destination (real server)
*
* Used inside nested attribute IPVS_ENTRY_ATTR_DEST
*/
enum {
IPVS_DEST_ATTR_UNSPEC = 0,
IPVS_DEST_ATTR_AF, /* address family */
IPVS_DEST_ATTR_ADDR, /* real server address */
IPVS_DEST_ATTR_PORT, /* real server port */
IPVS_DEST_ATTR_CONN_FLAGS, /* connection flags */
IPVS_DEST_ATTR_WEIGHT, /* destination weight */
IPVS_DEST_ATTR_U_THRESH, /* upper threshold */
IPVS_DEST_ATTR_L_THRESH, /* lower threshold */
IPVS_DEST_ATTR_ACTIVE_CONNS, /* active connections */
IPVS_DEST_ATTR_INACT_CONNS, /* inactive connections */
IPVS_DEST_ATTR_PERSIST_CONNS, /* persistent connections */
IPVS_DEST_ATTR_STATS, /* nested attribute for dest stats */
__IPVS_DEST_ATTR_MAX,
};
#define IPVS_DEST_ATTR_MAX (__IPVS_DEST_ATTR_MAX - 1)
/*
* Attributes describing a sync daemon
*
* Used inside nested attribute IPVS_ENTRY_ATTR_DAEMON
*/
enum {
IPVS_DAEMON_ATTR_UNSPEC = 0,
IPVS_DAEMON_ATTR_STATE, /* sync daemon state (master/backup) */
IPVS_DAEMON_ATTR_MCAST_IFN, /* multicast interface name */
IPVS_DAEMON_ATTR_SYNC_ID, /* SyncID we belong to */
__IPVS_DAEMON_ATTR_MAX,
};
#define IPVS_DAEMON_ATTR_MAX (__IPVS_DAEMON_ATTR_MAX - 1)
/*
* Attributes used to describe service or destination entry statistics
*
* Used inside nested attributes IPVS_SVC_ATTR_STATS and IPVS_DEST_ATTR_STATS
*/
enum {
IPVS_STATS_ATTR_UNSPEC = 0,
IPVS_STATS_ATTR_CONNS, /* connections scheduled */
IPVS_STATS_ATTR_INPKTS, /* incoming packets */
IPVS_STATS_ATTR_OUTPKTS, /* outgoing packets */
IPVS_STATS_ATTR_INBYTES, /* incoming bytes */
IPVS_STATS_ATTR_OUTBYTES, /* outgoing bytes */
IPVS_STATS_ATTR_CPS, /* current connection rate */
IPVS_STATS_ATTR_INPPS, /* current in packet rate */
IPVS_STATS_ATTR_OUTPPS, /* current out packet rate */
IPVS_STATS_ATTR_INBPS, /* current in byte rate */
IPVS_STATS_ATTR_OUTBPS, /* current out byte rate */
__IPVS_STATS_ATTR_MAX,
};
#define IPVS_STATS_ATTR_MAX (__IPVS_STATS_ATTR_MAX - 1)
/* Attributes used in IPVS_CMD_SET_TIMEOUT and IPVS_CMD_GET_TIMEOUT commands */
enum {
IPVS_TIMEOUT_ATTR_UNSPEC = 0,
IPVS_TIMEOUT_ATTR_TCP, /* TCP connection timeout */
IPVS_TIMEOUT_ATTR_TCP_FIN, /* TCP FIN wait timeout */
IPVS_TIMEOUT_ATTR_UDP, /* UDP timeout */
__IPVS_TIMEOUT_ATTR_MAX,
};
#define IPVS_TIMEOUT_ATTR_MAX (__IPVS_TIMEOUT_ATTR_MAX - 1)
/* Attributes used in response to IPVS_CMD_GET_INFO command */
enum {
IPVS_INFO_ATTR_UNSPEC = 0,
IPVS_INFO_ATTR_VERSION, /* IPVS version number */
IPVS_INFO_ATTR_CONNTAB_SIZE, /* size of connection hash table */
IPVS_INFO_ATTR_NUM_SERVICES, /* number of virtual services */
__IPVS_INFO_ATTR_MAX,
};
#define IPVS_INFO_ATTR_MAX (__IPVS_INFO_ATTR_MAX - 1)
/* End of Generic Netlink interface definitions */
========================== net/ipv4/ipvs/ip_vs_ctl.c ==========================
/*
*
* IPVS Generic Netlink interface definitions
*
*/
/* Generic Netlink family info */
#define IPVS_GENL_NAME "IPVS"
#define IPVS_GENL_VERSION 0x1
/* Generic Netlink command attributes */
enum {
IPVS_CMD_UNSPEC = 0,
IPVS_CMD_ADD_SERVICE, /* add or modify service */
IPVS_CMD_DEL_SERVICE, /* delete service */
IPVS_CMD_ADD_DEST, /* add or modify destination */
IPVS_CMD_DEL_DEST, /* delete destination */
IPVS_CMD_FLUSH, /* flush all services and dests */
IPVS_CMD_SET_TIMEOUT, /* set TCP and UDP timeouts */
IPVS_CMD_START_DAEMON, /* start sync daemon */
IPVS_CMD_STOP_DAEMON, /* stop sync daemon */
IPVS_CMD_ZERO, /* zero all counters and stats */
IPVS_CMD_GET_INFO, /* get general IPVS info */
IPVS_CMD_GET_SERVICES, /* get list of all services */
IPVS_CMD_GET_SERVICE, /* get info about specific service */
IPVS_CMD_GET_DESTS, /* get list of all service dests */
IPVS_CMD_GET_TIMEOUT, /* get TCP and UDP timeouts */
IPVS_CMD_GET_DAEMON, /* get sync daemon status */
__IPVS_CMD_MAX,
};
#define IPVS_CMD_MAX (__IPVS_CMD_MAX - 1)
/*
* Attributes used in the first level of commands that maintain multiple entries
* of the same element type (services, destinations, sync daemons)
*
* Arrays of the same attribute type are always nested in the plural version of
* the attribute to allow adding attributes in parallel to the array later on
*/
enum {
IPVS_ENTRY_ATTR_UNSPEC = 0,
IPVS_ENTRY_ATTR_SERVICE, /* nested service attribute */
IPVS_ENTRY_ATTR_SERVICES, /* nested service list attribute */
IPVS_ENTRY_ATTR_DEST, /* nested destination attribute */
IPVS_ENTRY_ATTR_DESTS, /* nested destination list attribute */
IPVS_ENTRY_ATTR_DAEMON, /* nested sync daemon attribute */
IPVS_ENTRY_ATTR_DAEMONS, /* nested sync daemon list attribute */
__IPVS_ENTRY_ATTR_MAX,
};
#define IPVS_ENTRY_ATTR_MAX (__IPVS_ENTRY_ATTR_MAX - 1)
/*
* Attributes used to describe a service
*
* Used inside nested attribute IPVS_ENTRY_ATTR_SERVICE
*/
enum {
IPVS_SVC_ATTR_UNSPEC = 0,
IPVS_SVC_ATTR_AF, /* address family */
IPVS_SVC_ATTR_PROTOCOL, /* virtual service protocol */
IPVS_SVC_ATTR_ADDR, /* virtual service address */
IPVS_SVC_ATTR_PORT, /* virtual service port */
IPVS_SVC_ATTR_FWMARK, /* firewall mark of service */
IPVS_SVC_ATTR_SCHED_NAME, /* name of scheduler */
IPVS_SVC_ATTR_FLAGS, /* virtual service flags */
IPVS_SVC_ATTR_TIMEOUT, /* persistent timeout */
IPVS_SVC_ATTR_NETMASK, /* persistent netmask */
IPVS_SVC_ATTR_NUM_DESTS, /* number of real servers in service */
IPVS_SVC_ATTR_STATS, /* nested attribute for service stats */
__IPVS_SVC_ATTR_MAX,
};
#define IPVS_SVC_ATTR_MAX (__IPVS_SVC_ATTR_MAX - 1)
/*
* Attributes used to describe a destination (real server)
*
* Used inside nested attribute IPVS_ENTRY_ATTR_DEST
*/
enum {
IPVS_DEST_ATTR_UNSPEC = 0,
IPVS_DEST_ATTR_AF, /* address family */
IPVS_DEST_ATTR_ADDR, /* real server address */
IPVS_DEST_ATTR_PORT, /* real server port */
IPVS_DEST_ATTR_CONN_FLAGS, /* connection flags */
IPVS_DEST_ATTR_WEIGHT, /* destination weight */
IPVS_DEST_ATTR_U_THRESH, /* upper threshold */
IPVS_DEST_ATTR_L_THRESH, /* lower threshold */
IPVS_DEST_ATTR_ACTIVE_CONNS, /* active connections */
IPVS_DEST_ATTR_INACT_CONNS, /* inactive connections */
IPVS_DEST_ATTR_PERSIST_CONNS, /* persistent connections */
IPVS_DEST_ATTR_STATS, /* nested attribute for dest stats */
__IPVS_DEST_ATTR_MAX,
};
#define IPVS_DEST_ATTR_MAX (__IPVS_DEST_ATTR_MAX - 1)
/*
* Attributes describing a sync daemon
*
* Used inside nested attribute IPVS_ENTRY_ATTR_DAEMON
*/
enum {
IPVS_DAEMON_ATTR_UNSPEC = 0,
IPVS_DAEMON_ATTR_STATE, /* sync daemon state (master/backup) */
IPVS_DAEMON_ATTR_MCAST_IFN, /* multicast interface name */
IPVS_DAEMON_ATTR_SYNC_ID, /* SyncID we belong to */
__IPVS_DAEMON_ATTR_MAX,
};
#define IPVS_DAEMON_ATTR_MAX (__IPVS_DAEMON_ATTR_MAX - 1)
/*
* Attributes used to describe service or destination entry statistics
*
* Used inside nested attributes IPVS_SVC_ATTR_STATS and IPVS_DEST_ATTR_STATS
*/
enum {
IPVS_STATS_ATTR_UNSPEC = 0,
IPVS_STATS_ATTR_CONNS, /* connections scheduled */
IPVS_STATS_ATTR_INPKTS, /* incoming packets */
IPVS_STATS_ATTR_OUTPKTS, /* outgoing packets */
IPVS_STATS_ATTR_INBYTES, /* incoming bytes */
IPVS_STATS_ATTR_OUTBYTES, /* outgoing bytes */
IPVS_STATS_ATTR_CPS, /* current connection rate */
IPVS_STATS_ATTR_INPPS, /* current in packet rate */
IPVS_STATS_ATTR_OUTPPS, /* current out packet rate */
IPVS_STATS_ATTR_INBPS, /* current in byte rate */
IPVS_STATS_ATTR_OUTBPS, /* current out byte rate */
__IPVS_STATS_ATTR_MAX,
};
#define IPVS_STATS_ATTR_MAX (__IPVS_STATS_ATTR_MAX - 1)
/* Attributes used in IPVS_CMD_SET_TIMEOUT and IPVS_CMD_GET_TIMEOUT commands */
enum {
IPVS_TIMEOUT_ATTR_UNSPEC = 0,
IPVS_TIMEOUT_ATTR_TCP, /* TCP connection timeout */
IPVS_TIMEOUT_ATTR_TCP_FIN, /* TCP FIN wait timeout */
IPVS_TIMEOUT_ATTR_UDP, /* UDP timeout */
__IPVS_TIMEOUT_ATTR_MAX,
};
#define IPVS_TIMEOUT_ATTR_MAX (__IPVS_TIMEOUT_ATTR_MAX - 1)
/* Attributes used in response to IPVS_CMD_GET_INFO command */
enum {
IPVS_INFO_ATTR_UNSPEC = 0,
IPVS_INFO_ATTR_VERSION, /* IPVS version number */
IPVS_INFO_ATTR_CONNTAB_SIZE, /* size of connection hash table */
IPVS_INFO_ATTR_NUM_SERVICES, /* number of virtual services */
__IPVS_INFO_ATTR_MAX,
};
#define IPVS_INFO_ATTR_MAX (__IPVS_INFO_ATTR_MAX - 1)
/* End of Generic Netlink interface definitions */
/*
* Generic Netlink definitions
*/
/* IPVS genetlink family */
static struct genl_family ip_vs_genl_family = {
.id = GENL_ID_GENERATE,
.hdrsize = 0,
.name = IPVS_GENL_NAME,
.version = IPVS_GENL_VERSION,
.maxattr = IPVS_CMD_MAX
};
/*
* Policy used for commands that operate on service, destination
* or daemon entries
*/
static struct nla_policy ip_vs_entries_policy[IPVS_ENTRY_ATTR_MAX + 1]
__read_mostly = {
[IPVS_ENTRY_ATTR_SERVICE] = { .type = NLA_NESTED },
[IPVS_ENTRY_ATTR_SERVICES] = { .type = NLA_NESTED },
[IPVS_ENTRY_ATTR_DEST] = { .type = NLA_NESTED },
[IPVS_ENTRY_ATTR_DESTS] = { .type = NLA_NESTED },
[IPVS_ENTRY_ATTR_DAEMON] = { .type = NLA_NESTED },
[IPVS_ENTRY_ATTR_DAEMONS] = { .type = NLA_NESTED },
};
/* Policy used for IPVS_CMD_SET_TIMEOUT command attributes */
static struct nla_policy ip_vs_timeout_policy[IPVS_TIMEOUT_ATTR_MAX + 1]
__read_mostly = {
[IPVS_TIMEOUT_ATTR_TCP] = { .type = NLA_U32 },
[IPVS_TIMEOUT_ATTR_TCP_FIN] = { .type = NLA_U32 },
[IPVS_TIMEOUT_ATTR_UDP] = { .type = NLA_U32 },
};
/* Policy used for IPVS_CMD_START_DAEMON and IPVS_CMD_STOP_DAEMON command attributes */
static struct nla_policy ip_vs_daemon_policy[IPVS_DAEMON_ATTR_MAX + 1]
__read_mostly = {
[IPVS_DAEMON_ATTR_STATE] = { .type = NLA_U32 },
[IPVS_DAEMON_ATTR_MCAST_IFN] = { .type = NLA_STRING,
.len = IP_VS_IFNAME_MAXLEN },
[IPVS_DAEMON_ATTR_SYNC_ID] = { .type = NLA_U32 },
};
/* Policy used for attributes in nested attribute IPVS_ENTRY_ATTR_SERVICE */
static struct nla_policy ip_vs_svc_policy[IPVS_SVC_ATTR_MAX + 1]
__read_mostly = {
[IPVS_SVC_ATTR_AF] = { .type = NLA_U16 },
[IPVS_SVC_ATTR_PROTOCOL] = { .type = NLA_U32 },
[IPVS_SVC_ATTR_ADDR] = { .len = sizeof(union nf_inet_addr) },
[IPVS_SVC_ATTR_PORT] = { .type = NLA_U16 },
[IPVS_SVC_ATTR_FWMARK] = { .type = NLA_U32 },
[IPVS_SVC_ATTR_SCHED_NAME] = { .type = NLA_STRING,
.len = IP_VS_SCHEDNAME_MAXLEN },
[IPVS_SVC_ATTR_FLAGS] = { .type = NLA_U32 },
[IPVS_SVC_ATTR_TIMEOUT] = { .type = NLA_U32 },
[IPVS_SVC_ATTR_NETMASK] = { .type = NLA_U32 },
[IPVS_SVC_ATTR_NUM_DESTS] = { .type = NLA_U32 },
[IPVS_SVC_ATTR_STATS] = { .type = NLA_NESTED },
};
/* Policy used for attributes in nested attribute IPVS_ENTRY_ATTR_DEST */
static struct nla_policy ip_vs_dest_policy[IPVS_DEST_ATTR_MAX + 1]
__read_mostly = {
[IPVS_DEST_ATTR_AF] = { .type = NLA_U32 },
[IPVS_DEST_ATTR_ADDR] = { .len = sizeof(union nf_inet_addr) },
[IPVS_DEST_ATTR_PORT] = { .type = NLA_U16 },
[IPVS_DEST_ATTR_CONN_FLAGS] = { .type = NLA_U32 },
[IPVS_DEST_ATTR_WEIGHT] = { .type = NLA_U32 },
[IPVS_DEST_ATTR_U_THRESH] = { .type = NLA_U32 },
[IPVS_DEST_ATTR_L_THRESH] = { .type = NLA_U32 },
[IPVS_DEST_ATTR_ACTIVE_CONNS] = { .type = NLA_U32 },
[IPVS_DEST_ATTR_INACT_CONNS] = { .type = NLA_U32 },
[IPVS_DEST_ATTR_PERSIST_CONNS] = { .type = NLA_U32 },
[IPVS_DEST_ATTR_STATS] = { .type = NLA_NESTED },
};
static struct genl_ops ip_vs_genl_ops[] __read_mostly = {
/* SET commands */
{
.cmd = IPVS_CMD_ADD_SERVICE,
.flags = GENL_ADMIN_PERM,
.policy = ip_vs_entries_policy,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_DEL_SERVICE,
.flags = GENL_ADMIN_PERM,
.policy = ip_vs_entries_policy,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_ADD_DEST,
.flags = GENL_ADMIN_PERM,
.policy = ip_vs_entries_policy,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_DEL_DEST,
.flags = GENL_ADMIN_PERM,
.policy = ip_vs_entries_policy,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_FLUSH,
.flags = GENL_ADMIN_PERM,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_SET_TIMEOUT,
.flags = GENL_ADMIN_PERM,
.policy = ip_vs_timeout_policy,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_START_DAEMON,
.flags = GENL_ADMIN_PERM,
.policy = ip_vs_daemon_policy,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_STOP_DAEMON,
.flags = GENL_ADMIN_PERM,
.policy = ip_vs_daemon_policy,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_ZERO,
.flags = GENL_ADMIN_PERM,
.doit = NULL /* TODO */
},
/* GET commands */
{
.cmd = IPVS_CMD_GET_INFO,
.flags = GENL_ADMIN_PERM,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_GET_SERVICES,
.flags = GENL_ADMIN_PERM,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_GET_SERVICE,
.flags = GENL_ADMIN_PERM,
.policy = ip_vs_entries_policy,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_GET_DESTS,
.flags = GENL_ADMIN_PERM,
.policy = ip_vs_entries_policy,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_GET_TIMEOUT,
.flags = GENL_ADMIN_PERM,
.doit = NULL /* TODO */
},
{
.cmd = IPVS_CMD_GET_DAEMON,
.flags = GENL_ADMIN_PERM,
.doit = NULL /* TODO */
},
};
/* End of Generic Netlink definitions */
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-17 17:18 ` Julius Volz
@ 2008-06-17 20:08 ` Patrick McHardy
2008-06-17 22:47 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-17 20:08 UTC (permalink / raw)
To: Julius Volz; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
Julius Volz wrote:
> Ok, so this is my draft version of the IPVS Generic Netlink interface
> definition. I'm posting this to see if anyone notices general problems
> with it right away.
I'm not familiar with the ipvs interface itself, so I'll
stick to netlink related comments.
> Arrays of the same attribute type are always put into a nested container
> so that it is easy to add new attributes which are parallel to the array
> later on.
That makes sense.
> Perhaps integer flag fields should also be split up into
> NLA_FLAG attributes, haven't done that yet.
I personally don't find NLA_FLAG very useful since for
flags you usually want the flag and a mask, otherwise
you can't unset it without the convention that userspace
always includes it, even for change requests when it
doesn't want to change it. And that is unusual for netlink
and also needlessly complicated and racy in userspace since
you'd have to query the current value before sending a change
request.
> First is a text listing attribute types and how they occur and nest in
> all of the commands and their replies. After that are the corresponding
> source excerpts (no patch material yet).
>
> Julius
>
>
> ======================================
> | IPVS NETLINK ATTRIBUTE TYPES |
> | (grouped as enums) |
> ======================================
>
> IPVS_ENTRY_ATTR_SERVICE - NLA_NESTED
> IPVS_ENTRY_ATTR_SERVICES - NLA_NESTED
> IPVS_ENTRY_ATTR_DEST - NLA_NESTED
> IPVS_ENTRY_ATTR_DESTS - NLA_NESTED
> IPVS_ENTRY_ATTR_DAEMON - NLA_NESTED
> IPVS_ENTRY_ATTR_DAEMONS - NLA_NESTED
So these are lists I assume. I don't think we have any examples
of lists of nested attributes in the mainline kernel, but in
some similar (unsubmitted) code of mine I used (names adjusted):
IPVS_SERVICE_LIST - NLA_NESTED
IPVS_DEST_LIST - NLA_NESTED
IPVS_DAEMON_LIST - NLA_NESTED
and
IPVS_LIST_ELEM - NLA_NESTED
for list elements of every kind. Since you can only put one
kind of element in the lists anyway (I think), different
types don't allow any increased flexibility and the LIST
naming is more clear in my opinion.
> IPVS_SVC_ATTR_AF - NLA_U32
> IPVS_SVC_ATTR_PROTOCOL - NLA_U32
> IPVS_SVC_ATTR_ADDR - union nf_inet_addr
This should probably use NLA_BINARY, which allows addresses
of any kind.
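To illustrate why a variable-length binary attribute fits here, the sketch below checks a received address payload against the address family. It is a userspace model only, and the helper name addr_len_ok() is hypothetical, not part of the proposed interface:

```c
#include <stddef.h>
#include <sys/socket.h>   /* AF_INET, AF_INET6 */
#include <netinet/in.h>   /* struct in_addr, struct in6_addr */

/*
 * Hypothetical helper: return 1 if a binary address payload of
 * `len` bytes has the right size for address family `af`, else 0.
 */
static int addr_len_ok(int af, size_t len)
{
	switch (af) {
	case AF_INET:
		return len == sizeof(struct in_addr);   /* 4 bytes */
	case AF_INET6:
		return len == sizeof(struct in6_addr);  /* 16 bytes */
	default:
		return 0;                               /* unknown family: reject */
	}
}
```

With a check like this, the attribute itself does not have to be sized for the largest known address type, and a further family would only need another case.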
> IPVS_SVC_ATTR_PORT - NLA_U16
> IPVS_SVC_ATTR_FWMARK - NLA_U32
> IPVS_SVC_ATTR_SCHED_NAME - NLA_STRING
NLA_NUL_STRING (at least for validation purposes)?
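The practical difference is roughly the following. This is a simplified userspace model of the two checks, not the kernel's validation code; both function names are made up for the example and a non-zero maximum length is assumed:

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified model of an NLA_STRING check: the payload must be
 * non-empty; one trailing NUL is tolerated and is not counted
 * against maxlen. A terminator is NOT required.
 */
static int string_ok(const char *data, size_t attrlen, size_t maxlen)
{
	if (attrlen < 1)
		return 0;
	if (data[attrlen - 1] == '\0')
		attrlen--;
	return attrlen <= maxlen;
}

/*
 * Simplified model of an NLA_NUL_STRING check: a NUL byte must
 * occur within the first maxlen + 1 bytes of the payload, so the
 * receiver can safely treat the data as a C string.
 */
static int nul_string_ok(const char *data, size_t attrlen, size_t maxlen)
{
	size_t scan = attrlen < maxlen + 1 ? attrlen : maxlen + 1;

	if (scan == 0)
		return 0;
	return memchr(data, '\0', scan) != NULL;
}
```

A scheduler name sent without its terminator would pass the first check but fail the second, which is the case that matters if the kernel side later uses the payload with string functions.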
> IPVS_SVC_ATTR_FLAGS - NLA_U32
As I mentioned above, you usually want a MASK in combination
with flags to allow to unset them. This is best done using
a structure.
> IPVS_SVC_ATTR_TIMEOUT - NLA_U32
> IPVS_SVC_ATTR_NETMASK - NLA_U32
Shouldn't this also be able to carry IPv6 masks?
> IPVS_SVC_ATTR_NUM_DESTS - NLA_U32
Is this number related to the IPVS_ENTRY_ATTR_DESTS list?
If so, it shouldn't be included as a separate attribute,
that just allows for potential inconsistency.
> IPVS_SVC_ATTR_STATS - NLA_NESTED
>
> IPVS_DEST_ATTR_AF - NLA_U32
Doesn't the family have to be equal for service and dest?
If so, having it specified only once avoids potential
inconsistencies.
> ========================== include/net/ip_vs.h ==========================
Please put this under include/linux, this doesn't belong
here as it's a public header.
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-17 20:08 ` Patrick McHardy
@ 2008-06-17 22:47 ` Julius Volz
2008-06-18 8:57 ` Patrick McHardy
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-17 22:47 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Tue, Jun 17, 2008 at 10:08 PM, Patrick McHardy <kaber@trash.net> wrote:
> Julius Volz wrote:
>>
>> Ok, so this is my draft version of the IPVS Generic Netlink interface
>> definition. I'm posting this to see if anyone notices general problems
>> with it right away.
>
> I'm not familiar with the ipvs interface itself, so I'll
> stick to netlink related comments.
Thanks, I very much appreciate the great feedback!
> I personally don't find NLA_FLAG very useful since for
> flags you usually want the flag and a mask, otherwise
> you can't unset it without the convention that userspace
> always includes it, even for change requests when it
> doesn't want to change it. And that is unusual for netlink
> and also needlessly complicated and racy in userspace since
> you'd have to query the current value before sending a change
> request.
Makes sense, yes!
> So these are lists I assume. I don't think we have any examples
> of lists of nested attributes in the mainline kernel, but in
> some similar (unsubmitted) code of mine I used (names adjusted):
>
> IPVS_SERVICE_LIST - NLA_NESTED
> IPVS_DEST_LIST - NLA_NESTED
> IPVS_DAEMON_LIST - NLA_NESTED
Nicer naming! I will adopt that.
> and
>
> IPVS_LIST_ELEM - NLA_NESTED
>
> for list elements of every kind. Since you can only put one
> kind of element in the lists anyway (I think), different
> types don't allow any increased flexibility and the LIST
> naming is more clear in my opinion.
However, since these container attributes (for daemons, services and
dests) also appear as single elements outside of lists, it might be
better to reuse the same names inside the list?
>> IPVS_SVC_ATTR_AF - NLA_U32
>> IPVS_SVC_ATTR_PROTOCOL - NLA_U32
>> IPVS_SVC_ATTR_ADDR - union nf_inet_addr
>
> This should probably use NLA_BINARY, which allows addresses
> of any kind.
Yes, the Netlink attribute type should really be NLA_BINARY. I just
used the union as an informal way of saying what I really intended to
store in there, though if some third address family came along, it
could be something completely different.
>> IPVS_SVC_ATTR_PORT - NLA_U16
>> IPVS_SVC_ATTR_FWMARK - NLA_U32
>> IPVS_SVC_ATTR_SCHED_NAME - NLA_STRING
>
> NLA_NUL_STRING (at least for validation purposes)?
Ah, looking closer at the validation code, that makes sense.
>> IPVS_SVC_ATTR_FLAGS - NLA_U32
>
> As I mentioned above, you usually want a MASK in combination
> with flags to allow to unset them. This is best done using
> a structure.
Hm, I'm not sure if I understand exactly what this struct is supposed
to look like. Could you give an example?
>> IPVS_SVC_ATTR_TIMEOUT - NLA_U32
>> IPVS_SVC_ATTR_NETMASK - NLA_U32
>
> Shouldn't this also be able to carry IPv6 masks?
We only need the prefix length for IPv6, for which we reused the
netmask field. This (only slightly) changes the semantics of the field
between address families. Acceptable or better have a separate field
for the prefix length?
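For reference, a prefix length carries the same information as a contiguous IPv6 mask, since the mask can always be rebuilt from it. A minimal sketch, with a hypothetical helper name that is not part of the patches:

```c
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>   /* struct in6_addr */

/*
 * Hypothetical helper: fill `mask` with the netmask that
 * corresponds to an IPv6 prefix length (0..128).
 * Returns 0 on success, -1 if the prefix length is out of range.
 */
static int plen_to_mask6(unsigned int plen, struct in6_addr *mask)
{
	unsigned int i;

	if (plen > 128)
		return -1;

	memset(mask, 0, sizeof(*mask));

	/* Whole bytes covered by the prefix are all ones. */
	for (i = 0; i < plen / 8; i++)
		mask->s6_addr[i] = 0xff;

	/* A partial byte gets its top (plen % 8) bits set. */
	if (plen % 8)
		mask->s6_addr[i] = (uint8_t)(0xff << (8 - plen % 8));

	return 0;
}
```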
>> IPVS_SVC_ATTR_NUM_DESTS - NLA_U32
>
> Is this number related to the IPVS_ENTRY_ATTR_DESTS list?
> If so, it shouldn't be included as a separate attribute,
> that just allows for potential inconsistency.
Yes, but this count is only returned from commands that do not at the
same time return the list of destinations, so there is no
inconsistency within a message. However, I'm pretty sure the count was
only used in the old interface to allocate enough memory for the
destination list, so it can probably be deleted anyways.
>> IPVS_SVC_ATTR_STATS - NLA_NESTED
>>
>> IPVS_DEST_ATTR_AF - NLA_U32
>
> Doesn't the family have to be equal for service and dest?
> If so, having it specified only once avoids potential
> inconsistencies.
Yes, this field can likely go away too. I was thinking about the fact
that the family of the non-VIP interface of the destination could
theoretically differ, but no support for that is currently planned.
>> ========================== include/net/ip_vs.h ==========================
>
> Please put this under include/linux, this doesn't belong
> here as it's a public header.
>
He, that's the thing about IPVS. The current ipvsadm already directly
includes this header from /usr/src/linux/include/net/ip_vs.h to
compile.
So we will have to keep the old header in the wrong location for old
ipvsadm sources and create a new one only for the genetlink interface
under include/linux.
Thank you for spotting all this!
Julius
--
Google Switzerland GmbH
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-17 22:47 ` Julius Volz
@ 2008-06-18 8:57 ` Patrick McHardy
2008-06-18 14:17 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-18 8:57 UTC (permalink / raw)
To: Julius Volz; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
Julius Volz wrote:
> On Tue, Jun 17, 2008 at 10:08 PM, Patrick McHardy <kaber@trash.net> wrote:
>
>> IPVS_LIST_ELEM - NLA_NESTED
>>
>> for list elements of every kind. Since you can only put one
>> kind of element in the lists anyway (I think), different
>> types don't allow any increased flexibility and the LIST
>> naming is more clear in my opinion.
>>
>
> However, since these container attributes (for daemons, services and
> dests) also appear as single elements outside of lists, it might be
> better to reuse the same names inside the list?
>
I didn't realize that. Yes, agreed.
>>> IPVS_SVC_ATTR_FLAGS - NLA_U32
>>>
>> As I mentioned above, you usually want a MASK in combination
>> with flags to allow to unset them. This is best done using
>> a structure.
>>
>
> Hm, I'm not sure if I understand exactly what this struct is supposed
> to look like. Could you give an example?
>
struct {
u32 flags;
u32 mask;
} flags;
and then:
obj->flags = (obj->flags & ~flags->mask) |
(flags->flags | flags->mask);
>>> IPVS_SVC_ATTR_TIMEOUT - NLA_U32
>>> IPVS_SVC_ATTR_NETMASK - NLA_U32
>>>
>> Shouldn't this also be able to carry IPv6 masks?
>>
>
> We only need the prefix length for IPv6, for which we reused the
> netmask field. This (only slightly) changes the semantics of the field
> between address families. Acceptable or better have a separate field
> for the prefix length?
>
I guess that's fine.
>>> IPVS_SVC_ATTR_NUM_DESTS - NLA_U32
>>>
>> Is this number related to the IPVS_ENTRY_ATTR_DESTS list?
>> If so, it shouldn't be included as a separate attribute,
>> that just allows for potential inconsistency.
>>
>
> Yes, but this count is only returned from commands that do not at the
> same time return the list of destinations, so there is no
> inconsistency within a message. However, I'm pretty sure the count was
> only used in the old interface to allocate enough memory for the
> destination list, so it can probably be deleted anyways.
>
You're probably right, this looks similar to iptables.
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-18 8:57 ` Patrick McHardy
@ 2008-06-18 14:17 ` Julius Volz
2008-06-18 14:19 ` Patrick McHardy
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-18 14:17 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Wed, Jun 18, 2008 at 10:57 AM, Patrick McHardy <kaber@trash.net> wrote:
>>> As I mentioned above, you usually want a MASK in combination
>>> with flags to allow to unset them. This is best done using
>>> a structure.
>>>
>>
>> Hm, I'm not sure if I understand exactly what this struct is supposed
>> to look like. Could you give an example?
>>
>
> struct {
> u32 flags;
> u32 mask;
> } flags;
>
> and then:
>
> obj->flags = (obj->flags & ~flags->mask) |
> (flags->flags | flags->mask);
Ah, I see. The second line should read "(flags->flags & flags->mask)", right?
Looking at how these "flags" are actually used in ipvsadm, I'm not
sure this would be needed here:
1) destination conn_flags are only set to successive integer values 0,
1, 2... (depending on the forwarding method), which are mutually
exclusive. Only internally in the kernel are other bits of this field
used in a flag-like fashion. So this Netlink attribute could be
renamed to something like *_FWD_METHOD and be a normal value field.
2) for the service flags, only one bit is set from userspace
(persistent/nonpersistent service). So this might be not too bad to
have as a single Netlink flag attribute.
Otherwise, I changed the interface according to your feedback and
we'll work on the implementation for a while now!
Thanks,
Julius
--
Google Switzerland GmbH
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-18 14:17 ` Julius Volz
@ 2008-06-18 14:19 ` Patrick McHardy
2008-06-18 14:27 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-18 14:19 UTC (permalink / raw)
To: Julius Volz; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
Julius Volz wrote:
> On Wed, Jun 18, 2008 at 10:57 AM, Patrick McHardy <kaber@trash.net> wrote:
>>>> As I mentioned above, you usually want a MASK in combination
>>>> with flags to allow to unset them. This is best done using
>>>> a structure.
>>>>
>>> Hm, I'm not sure if I understand exactly what this struct is supposed
>>> to look like. Could you give an example?
>>>
>> struct {
>> u32 flags;
>> u32 mask;
>> } flags;
>>
>> and then:
>>
>> obj->flags = (obj->flags & ~flags->mask) |
>> (flags->flags | flags->mask);
>
> Ah, I see. The second line should read "(flags->flags & flags->mask)", right?
Yes.
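Written out with that correction, the update could look like the sketch below. The struct and function names are placeholders for illustration, not proposed interface names:

```c
#include <stdint.h>

/*
 * Hypothetical attribute payload: the new flag values plus a mask
 * selecting which bits the request actually wants to change.
 */
struct flags_update {
	uint32_t flags;
	uint32_t mask;
};

/*
 * Bits outside the mask keep their current value; bits inside the
 * mask are taken from upd->flags, so they can be set or cleared.
 */
static uint32_t apply_flags(uint32_t cur, const struct flags_update *upd)
{
	return (cur & ~upd->mask) | (upd->flags & upd->mask);
}
```

A request with mask 0 leaves everything untouched, and a request with a bit in the mask but not in flags clears that bit, so userspace never has to read the current value first.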
> Looking at how these "flags" are actually used in ipvsadm, I'm not
> sure this would be needed here:
>
> 1) destination conn_flags are only set to successive integer values 0,
> 1, 2... (depending on the forwarding method), which are mutually
> exclusive. Only internally in the kernel are other bits of this field
> used in a flag-like fashion. So this Netlink attribute could be
> renamed to something like *_FWD_METHOD and be a normal value field.
Yes. The internal fields shouldn't be exported to userspace
unless necessary.
> 2) for the service flags, only one bit is set from userspace
> (persistent/nonpersistent service). So this might be not too bad to
> have as a single Netlink flag attribute.
And this bit can't be unset (or if it currently can't be,
it also wouldn't make sense to be able to unset it)?
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-18 14:19 ` Patrick McHardy
@ 2008-06-18 14:27 ` Julius Volz
2008-06-18 14:30 ` Patrick McHardy
0 siblings, 1 reply; 76+ messages in thread
From: Julius Volz @ 2008-06-18 14:27 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Wed, Jun 18, 2008 at 4:19 PM, Patrick McHardy <kaber@trash.net> wrote:
>> 2) for the service flags, only one bit is set from userspace
>> (persistent/nonpersistent service). So this might be not too bad to
>> have as a single Netlink flag attribute.
>
> And this bit can't be unset (or if it currently can't be,
> it also wouldn't make sense to be able to unset it)?
It can get unset when editing a persistent service to be
non-persistent, so you would still have to include it in a change
request that doesn't want to unset it. Since it's only one flag
though, it didn't seem too bad to me.
An alternative (also, in case of more flags in the future) would be to
put flags into a nested attribute and if this is not supplied from
userspace during an edit operation, the flags will be left untouched.
Julius
--
Google Switzerland GmbH
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-18 14:27 ` Julius Volz
@ 2008-06-18 14:30 ` Patrick McHardy
2008-06-18 14:36 ` Julius Volz
0 siblings, 1 reply; 76+ messages in thread
From: Patrick McHardy @ 2008-06-18 14:30 UTC (permalink / raw)
To: Julius Volz; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
Julius Volz wrote:
> On Wed, Jun 18, 2008 at 4:19 PM, Patrick McHardy <kaber@trash.net> wrote:
>>> 2) for the service flags, only one bit is set from userspace
>>> (persistent/nonpersistent service). So this might be not too bad to
>>> have as a single Netlink flag attribute.
>> And this bit can't be unset (or if it currently can't be,
>> it also wouldn't make sense to be able to unset it)?
>
> It can get unset when editing a persistent service to be
> non-persistent, so you would still have to include it in a change
> request that doesn't want to unset it. Since it's only one flag
> though, it didn't seem too bad to me.
The problem is that it's racy. You have to query the current
state before deciding whether to send it or not. And another
process might change it in between. An additional downside
is the overhead for the query itself.
> An alternative (also, in case of more flags in the future) would be to
> put flags into a nested attribute and if this is not supplied from
> userspace during an edit operation, the flags will be left untouched.
Then you can't unset them. I'd simply use the flags/mask
scheme.
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-18 14:30 ` Patrick McHardy
@ 2008-06-18 14:36 ` Julius Volz
0 siblings, 0 replies; 76+ messages in thread
From: Julius Volz @ 2008-06-18 14:36 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Wed, Jun 18, 2008 at 4:30 PM, Patrick McHardy <kaber@trash.net> wrote:
> Julius Volz wrote:
>>
>> On Wed, Jun 18, 2008 at 4:19 PM, Patrick McHardy <kaber@trash.net> wrote:
>>>>
>>>> 2) for the service flags, only one bit is set from userspace
>>>> (persistent/nonpersistent service). So this might be not too bad to
>>>> have as a single Netlink flag attribute.
>>>
>>> And this bit can't be unset (or if it currently can't be,
>>> it also wouldn't make sense to be able to unset it)?
>>
>> It can get unset when editing a persistent service to be
>> non-persistent, so you would still have to include it in a change
>> request that doesn't want to unset it. Since it's only one flag
>> though, it didn't seem too bad to me.
>
> The problem is that it's racy. You have to query the current
> state before deciding whether to send it or not. And another
> process might change it in between. An additional downside
> is the overhead for the query itself.
Right, forgot about this issue. Although ipvsadm does no getting and
setting based on the prior value (it only takes the value of the flag
from the command line), there might be other userspace tools that
would want to use the interface differently.
>> An alternative (also, in case of more flags in the future) would be to
>> put flags into a nested attribute and if this is not supplied from
>> userspace during an edit operation, the flags will be left untouched.
>
> Then you can't unset them. I'd simply use the flags/mask
> scheme.
Ok!
Julius
--
Google Switzerland GmbH
* Re: [PATCH 00/26] IPVS: Add first IPv6 support to IPVS.
2008-06-16 11:47 ` Patrick McHardy
2008-06-16 12:13 ` Julius Volz
2008-06-16 23:19 ` Julius Volz
@ 2008-06-30 12:01 ` Julius Volz
2 siblings, 0 replies; 76+ messages in thread
From: Julius Volz @ 2008-06-30 12:01 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Vince Busam, Ben Greear, lvs-devel, netdev
On Mon, Jun 16, 2008, Patrick McHardy wrote:
> b) or c) both look fine. You could save a few operations (ADD/EDIT
> can be combined) by making use of nlmsg_flags though:
>
> The semantics of the flags is:
>
> - NLM_F_CREATE|NLM_F_EXCL: create if non-existent
> - NLM_F_REPLACE: change existing
> - NLM_F_CREATE|NLM_F_REPLACE: create if non-existent, replace otherwise
> - NLM_F_EXCL: test existence
I just noticed that this doesn't work with genetlink. genl_rcv_msg()
treats all commands as GET requests instead of NEW requests, in which
the same flag bits mean different things (NLM_F_ROOT, NLM_F_MATCH and
so on). Specifically, it checks for NLM_F_DUMP, which overlaps with
the other interpretations of the bits.
So I think I'll just have separate commands for ADD and EDIT, similar
to what net/wireless/nl80211.c has.
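The overlap can be seen from the bit values alone. In the sketch below the constants mirror the values in <linux/netlink.h> under local names so it compiles anywhere, and looks_like_dump() only models an "any dump bit set" test; that test is an assumption for illustration, not a copy of genl_rcv_msg():

```c
/* Values mirror <linux/netlink.h>; the names here are local. */
enum {
	/* Modifiers for GET requests */
	GET_ROOT    = 0x100,                 /* NLM_F_ROOT */
	GET_MATCH   = 0x200,                 /* NLM_F_MATCH */
	GET_DUMP    = GET_ROOT | GET_MATCH,  /* NLM_F_DUMP */

	/* Modifiers for NEW requests: same bits, different meaning */
	NEW_REPLACE = 0x100,                 /* NLM_F_REPLACE */
	NEW_EXCL    = 0x200,                 /* NLM_F_EXCL */
	NEW_CREATE  = 0x400,                 /* NLM_F_CREATE */
};

/*
 * Assumed model of a receive path that treats every message as a
 * GET request: any dump-related bit routes it to the dump handler.
 */
static int looks_like_dump(unsigned int nlmsg_flags)
{
	return (nlmsg_flags & GET_DUMP) != 0;
}
```

Under that model every combination from the quoted list would be taken for a dump request, because each of them contains NLM_F_REPLACE or NLM_F_EXCL; only a bare NLM_F_CREATE would not.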
Julius
--
Google Switzerland GmbH
Thread overview: 76+ messages
2008-06-11 17:11 [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Julius R. Volz
2008-06-11 17:11 ` [PATCH 01/26] IPVS: Add CONFIG_IP_VS_IPV6 option for IPv6 support Julius R. Volz
2008-06-11 17:11 ` [PATCH 02/26] IPVS: Change IPVS data structures to support IPv6 addresses Julius R. Volz
2008-06-11 17:12 ` Patrick McHardy
[not found] ` <f4845fc0806111041u2a9a197fseefe300ffbbda3c3@mail.gmail.com>
[not found] ` <485010E9.6000506@trash.net>
2008-06-11 18:08 ` Julius Volz
2008-06-12 1:54 ` Brian Haley
2008-06-12 9:47 ` Julius Volz
2008-06-11 17:11 ` [PATCH 03/26] IPVS: Use new address family fields in IPVS structs Julius R. Volz
2008-06-11 17:11 ` [PATCH 04/26] IPVS: Add address family specific debugging macros Julius R. Volz
2008-06-11 17:11 ` [PATCH 05/26] IPVS: Use new " Julius R. Volz
2008-06-11 17:14 ` Patrick McHardy
2008-06-11 17:11 ` [PATCH 06/26] IPVS: Add IPv6-specific function pointers to struct ip_vs_protocol Julius R. Volz
2008-06-11 17:11 ` [PATCH 07/26] IPVS: Add IPv6 handler functions to AH protocol handler Julius R. Volz
2008-06-11 17:11 ` [PATCH 08/26] IPVS: Add IPv6 handler functions to ESP " Julius R. Volz
2008-06-11 17:11 ` [PATCH 09/26] IPVS: Add IPv6 handler functions to TCP " Julius R. Volz
2008-06-11 17:11 ` [PATCH 10/26] IPVS: Add IPv6 handler functions to UDP " Julius R. Volz
2008-06-11 17:18 ` Patrick McHardy
2008-06-11 17:11 ` [PATCH 11/26] IPVS: Add supports_ipv6 flag to schedulers Julius R. Volz
2008-06-11 17:11 ` [PATCH 12/26] IPVS: Extend proto handler debug functions to handle IPv6 Julius R. Volz
2008-06-11 17:17 ` Patrick McHardy
2008-06-11 17:11 ` [PATCH 13/26] IPVS: Turn off FTP application helper for IPv6 Julius R. Volz
2008-06-11 17:11 ` [PATCH 14/26] IPVS: Extend xmit routing cache to support IPv6 Julius R. Volz
2008-06-11 17:11 ` [PATCH 15/26] IPVS: Modify IP_VS_XMIT() " Julius R. Volz
2008-06-11 17:11 ` [PATCH 16/26] IPVS: Add IPv6 xmit forwarding functions Julius R. Volz
2008-06-12 1:55 ` Brian Haley
2008-06-11 17:12 ` [PATCH 17/26] IPVS: Add connection hashing function for IPv6 entries Julius R. Volz
2008-06-11 17:12 ` [PATCH 18/26] IPVS: Add functions for getting/creating IPv6 connections Julius R. Volz
2008-06-12 1:55 ` Brian Haley
2008-06-11 17:12 ` [PATCH 19/26] IPVS: Add scheduling functions for " Julius R. Volz
2008-06-11 17:12 ` [PATCH 20/26] IPVS: Add IPv6 Netfilter hooks and add/modify support functions Julius R. Volz
2008-06-12 1:55 ` Brian Haley
2008-06-11 17:12 ` [PATCH 21/26] IPVS: Make proc/net files output IPv6 entries correctly Julius R. Volz
2008-06-11 17:12 ` [PATCH 22/26] IPVS: Add function to find out if IPv6 address is local Julius R. Volz
2008-06-11 17:12 ` [PATCH 23/26] IPVS: Add hash functions for IPv6 services and real servers Julius R. Volz
2008-06-11 17:12 ` [PATCH 24/26] IPVS: Add IPv6 support to userspace interface Julius R. Volz
2008-06-12 1:55 ` Brian Haley
2008-06-12 9:46 ` Julius Volz
2008-06-11 17:12 ` [PATCH 25/26] IPVS: Add support for IPv6 entry output in procfs files Julius R. Volz
2008-06-11 17:12 ` [PATCH 26/26] IPVS: Add some blame/credits for IPv6 version Julius R. Volz
2008-06-11 17:23 ` [PATCH 00/26] IPVS: Add first IPv6 support to IPVS Patrick McHardy
2008-06-11 18:23 ` Julius Volz
2008-06-11 18:42 ` Patrick McHardy
2008-06-11 19:05 ` Julius Volz
2008-06-11 19:10 ` Patrick McHardy
2008-06-11 19:29 ` Julius Volz
2008-06-11 19:31 ` Patrick McHardy
2008-06-11 19:53 ` Julius Volz
2008-06-11 20:14 ` Julius Volz
2008-06-11 20:55 ` Vince Busam
2008-06-11 21:30 ` Ben Greear
2008-06-11 22:26 ` Vince Busam
2008-06-12 1:45 ` Simon Horman
2008-06-12 13:31 ` Julius Volz
2008-06-12 13:38 ` Patrick McHardy
2008-06-12 15:34 ` Julius Volz
2008-06-12 15:41 ` Julius Volz
2008-06-12 15:46 ` Patrick McHardy
2008-06-12 19:33 ` Julius Volz
2008-06-13 6:26 ` Simon Horman
2008-06-13 14:17 ` Julius Volz
2008-06-13 15:14 ` Patrick McHardy
2008-06-16 0:14 ` Julius Volz
2008-06-16 11:47 ` Patrick McHardy
2008-06-16 12:13 ` Julius Volz
2008-06-16 23:19 ` Julius Volz
2008-06-17 11:52 ` Patrick McHardy
2008-06-17 17:18 ` Julius Volz
2008-06-17 20:08 ` Patrick McHardy
2008-06-17 22:47 ` Julius Volz
2008-06-18 8:57 ` Patrick McHardy
2008-06-18 14:17 ` Julius Volz
2008-06-18 14:19 ` Patrick McHardy
2008-06-18 14:27 ` Julius Volz
2008-06-18 14:30 ` Patrick McHardy
2008-06-18 14:36 ` Julius Volz
2008-06-30 12:01 ` Julius Volz