From: Simon Horman <horms@verge.net.au>
To: "Siim Põder" <siim@p6drad-teel.net>
Cc: Julian Anastasov <ja@ssi.bg>,
netdev@vger.kernel.org, lvs-devel@vger.kernel.org,
Malcolm Turnbull <malcolm@loadbalancer.org>,
Julius Volz <juliusv@google.com>, Vince Busam <vbusam@google.com>,
Herbert Xu <herbert@gondor.apana.org.au>
Subject: Re: [PATCH 1/2] ipvs: load balance IPv4 connections from a local process
Date: Sat, 6 Sep 2008 17:43:54 +1000
Message-ID: <20080906074352.GA22998@verge.net.au>
In-Reply-To: <60dee71e238f7fa03da7bc5338ec4bf5.squirrel@p6drad-teel.net>
On Fri, Sep 05, 2008 at 08:49:52AM +0300, Siim Põder wrote:
> Hi
>
> > TCP checksum is not optional, why this change appeared?
>
> The new packets that we handle are on the loopback device and no checksums
> appear to be generated there. I initially changed the condition to check
> for loopback device (which we could do), but checking udp code found that
> it already handled it by checking zero checksum, hence the same
> implementation in tcp code.
I checked with Herbert Xu, and it's not legal for the checksum to be
missing for TCP. However, it is possible for loopback traffic to carry
a partial checksum. That is, only the pseudo-header is summed.
The patch below outlines a fix for this problem using the existing
structure of ip_vs_proto_tcp.c. I believe that a similar fix
is also required for UDP. I am posting it now so people can comment
on its correctness.
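To make the partial-checksum case concrete, here is a minimal userspace
sketch of the arithmetic, not kernel code. It assumes IPv4, addresses and
lengths given as host-order integers, and that a partially checksummed
segment stores the folded, uncomplemented pseudo-header sum in the TCP
checksum field. All helper names (fold16, sum_be16, pseudo_hdr_sum,
full_tcp_csum, finish_partial, partial_replace_addr) are hypothetical and
exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Sum a buffer as big-endian 16-bit words; an odd tail byte is zero-padded. */
static uint32_t sum_be16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;

	while (len > 1) {
		sum += ((uint32_t)p[0] << 8) | p[1];
		p += 2;
		len -= 2;
	}
	if (len)
		sum += (uint32_t)p[0] << 8;
	return sum;
}

/* Folded, uncomplemented sum of the IPv4 TCP pseudo-header. */
static uint16_t pseudo_hdr_sum(uint32_t saddr, uint32_t daddr, uint16_t tcp_len)
{
	uint32_t sum = 0;

	sum += (saddr >> 16) + (saddr & 0xffff);
	sum += (daddr >> 16) + (daddr & 0xffff);
	sum += 6;		/* zero byte + protocol number for TCP */
	sum += tcp_len;
	return fold16(sum);
}

/* Full checksum; the caller must have zeroed the checksum field in seg. */
static uint16_t full_tcp_csum(uint32_t saddr, uint32_t daddr,
			      const uint8_t *seg, size_t len)
{
	uint32_t sum = pseudo_hdr_sum(saddr, daddr, (uint16_t)len);

	return (uint16_t)~fold16(sum + sum_be16(seg, len));
}

/*
 * Complete a partial checksum: the checksum field already holds the
 * pseudo-header sum, so summing the segment picks it up automatically.
 */
static uint16_t finish_partial(const uint8_t *seg, size_t len)
{
	return (uint16_t)~fold16(sum_be16(seg, len));
}

/*
 * Rewrite one address inside a partial (uncomplemented) sum:
 * subtract the old words by adding their complements, add the new words.
 */
static uint16_t partial_replace_addr(uint16_t partial,
				     uint32_t old_addr, uint32_t new_addr)
{
	uint32_t sum = partial;

	sum += (uint16_t)~(old_addr >> 16);
	sum += (uint16_t)~(old_addr & 0xffff);
	sum += (new_addr >> 16) + (new_addr & 0xffff);
	return fold16(sum);
}
```

Under these assumptions, storing pseudo_hdr_sum() in the checksum field
and calling finish_partial() yields the same value as full_tcp_csum(),
and after an address rewrite partial_replace_addr() matches a freshly
computed pseudo_hdr_sum() for the new address. Because the stored value
is uncomplemented, the update adds ~old + new directly, without the
extra complement steps used when updating a finished checksum.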
Moving forward, tcp_partial_csum_update() and tcp_fast_csum_update()
could be implemented in terms of inet_proto_csum_replace* if
inet_proto_csum_replace16 were added. I will work on coding this up.
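As a rough illustration of that idea, the userspace sketch below shows how
a 4-byte and a 16-byte address replacement could share one incremental
update routine, following RFC 1624 (HC' = ~(~HC + ~m + m')) for a finished,
complemented checksum. This is not the kernel API: ones_fold,
csum_replace_words, csum_replace4_sketch and csum_replace16_sketch are
hypothetical names, and values are plain host-order 16-bit words rather
than the kernel's __sum16/__be32 types.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t ones_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Update a stored (complemented) Internet checksum after nwords 16-bit
 * words change from old_w[] to new_w[], per RFC 1624 equation 3.
 */
static uint16_t csum_replace_words(uint16_t check,
				   const uint16_t *old_w,
				   const uint16_t *new_w, size_t nwords)
{
	uint32_t sum = (uint16_t)~check;
	size_t i;

	for (i = 0; i < nwords; i++) {
		sum += (uint16_t)~old_w[i];	/* remove the old word */
		sum += new_w[i];		/* add the new word */
	}
	return (uint16_t)~ones_fold(sum);
}

/* IPv4-sized replacement: one 32-bit address is two 16-bit words. */
static uint16_t csum_replace4_sketch(uint16_t check,
				     uint32_t old_addr, uint32_t new_addr)
{
	uint16_t old_w[2] = { (uint16_t)(old_addr >> 16), (uint16_t)old_addr };
	uint16_t new_w[2] = { (uint16_t)(new_addr >> 16), (uint16_t)new_addr };

	return csum_replace_words(check, old_w, new_w, 2);
}

/* IPv6-sized replacement: one 128-bit address is eight 16-bit words. */
static uint16_t csum_replace16_sketch(uint16_t check,
				      const uint16_t old_addr[8],
				      const uint16_t new_addr[8])
{
	return csum_replace_words(check, old_addr, new_addr, 8);
}
```

With a shared word-wise helper like this, the IPv4 and IPv6 branches
differ only in how many words they pass, which is the kind of
consolidation a 16-byte variant would allow.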
diff --git a/net/ipv4/ipvs/ip_vs_proto_tcp.c b/net/ipv4/ipvs/ip_vs_proto_tcp.c
index 808e8be..537f616 100644
--- a/net/ipv4/ipvs/ip_vs_proto_tcp.c
+++ b/net/ipv4/ipvs/ip_vs_proto_tcp.c
@@ -134,12 +134,34 @@ tcp_fast_csum_update(int af, struct tcphdr *tcph,
}
+static inline void
+tcp_partial_csum_update(int af, struct tcphdr *tcph,
+ const union nf_inet_addr *oldip,
+ const union nf_inet_addr *newip,
+ __be16 oldlen, __be16 newlen)
+{
+#ifdef CONFIG_IP_VS_IPV6
+ if (af == AF_INET6)
+ tcph->check =
+ csum_fold(ip_vs_check_diff16(oldip->ip6, newip->ip6,
+ ip_vs_check_diff2(oldlen, newlen,
+ ~csum_unfold(tcph->check))));
+ else
+#endif
+ tcph->check =
+ csum_fold(ip_vs_check_diff4(oldip->ip, newip->ip,
+ ip_vs_check_diff2(oldlen, newlen,
+ ~csum_unfold(tcph->check))));
+}
+
+
static int
tcp_snat_handler(struct sk_buff *skb,
struct ip_vs_protocol *pp, struct ip_vs_conn *cp)
{
struct tcphdr *tcph;
unsigned int tcphoff;
+ int oldlen;
#ifdef CONFIG_IP_VS_IPV6
if (cp->af == AF_INET6)
@@ -147,6 +169,7 @@ tcp_snat_handler(struct sk_buff *skb,
else
#endif
tcphoff = ip_hdrlen(skb);
+ oldlen = skb->len - tcphoff;
/* csum_check requires unshared skb */
if (!skb_make_writable(skb, tcphoff+sizeof(*tcph)))
@@ -166,7 +189,11 @@ tcp_snat_handler(struct sk_buff *skb,
tcph->source = cp->vport;
/* Adjust TCP checksums */
- if (!cp->app && (tcph->check != 0)) {
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ tcp_partial_csum_update(cp->af, tcph, &cp->daddr, &cp->vaddr,
+ htonl(oldlen),
+ htonl(skb->len - tcphoff));
+ } else if (!cp->app) {
/* Only port and addr are changed, do fast csum update */
tcp_fast_csum_update(cp->af, tcph, &cp->daddr, &cp->vaddr,
cp->dport, cp->vport);
@@ -204,6 +231,7 @@ tcp_dnat_handler(struct sk_buff *skb,
{
struct tcphdr *tcph;
unsigned int tcphoff;
+ int oldlen;
#ifdef CONFIG_IP_VS_IPV6
if (cp->af == AF_INET6)
@@ -211,6 +239,7 @@ tcp_dnat_handler(struct sk_buff *skb,
else
#endif
tcphoff = ip_hdrlen(skb);
+ oldlen = skb->len - tcphoff;
/* csum_check requires unshared skb */
if (!skb_make_writable(skb, tcphoff+sizeof(*tcph)))
@@ -235,7 +264,11 @@ tcp_dnat_handler(struct sk_buff *skb,
/*
* Adjust TCP checksums
*/
- if (!cp->app && (tcph->check != 0)) {
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ tcp_partial_csum_update(cp->af, tcph, &cp->daddr, &cp->vaddr,
+ htonl(oldlen),
+ htonl(skb->len - tcphoff));
+ } else if (!cp->app) {
/* Only port and addr are changed, do fast csum update */
tcp_fast_csum_update(cp->af, tcph, &cp->vaddr, &cp->daddr,
cp->vport, cp->dport);