From: Simon Horman <horms@verge.net.au>
To: "Siim Põder" <siim@p6drad-teel.net>
Cc: Julian Anastasov <ja@ssi.bg>,
netdev@vger.kernel.org, lvs-devel@vger.kernel.org,
Malcolm Turnbull <malcolm@loadbalancer.org>,
Julius Volz <juliusv@google.com>, Vince Busam <vbusam@google.com>,
Herbert Xu <herbert@gondor.apana.org.au>
Subject: Re: [PATCH 1/2] ipvs: load balance IPv4 connections from a local process
Date: Sat, 6 Sep 2008 17:43:54 +1000 [thread overview]
Message-ID: <20080906074352.GA22998@verge.net.au> (raw)
In-Reply-To: <60dee71e238f7fa03da7bc5338ec4bf5.squirrel@p6drad-teel.net>
On Fri, Sep 05, 2008 at 08:49:52AM +0300, Siim Põder wrote:
> Hi
>
> > TCP checksum is not optional, why this change appeared?
>
> The new packets that we handle are on the loopback device and no checksums
> appear to be generated there. I initially changed the condition to check
> for loopback device (which we could do), but checking udp code found that
> it already handled it by checking zero checksum, hence the same
> implementation in tcp code.
I checked with Herbert Xu and its not legal for the checksum to be missing
for TCP. However it is possible for there to be a partial checksum for
loopback traffic. That is, only the pseudo-header is summed.
The patch outlines a fix for this problem using the existing
structure of ip_vs_proto_tcp.c. I believe that a similar fix
is also required for UDP. I am posting it now so people can comment
on its correctness.
Moving forward tcp_partial_csum_update() and tcp_fast_csum_update()
could be implemented in terms of inet_proto_csum_replace* if
inet_proto_csum_replace16 was added. I will work on coding this up.
diff --git a/net/ipv4/ipvs/ip_vs_proto_tcp.c b/net/ipv4/ipvs/ip_vs_proto_tcp.c
index 808e8be..537f616 100644
--- a/net/ipv4/ipvs/ip_vs_proto_tcp.c
+++ b/net/ipv4/ipvs/ip_vs_proto_tcp.c
@@ -134,12 +134,34 @@ tcp_fast_csum_update(int af, struct tcphdr *tcph,
}
+static inline void
+tcp_partial_csum_update(int af, struct tcphdr *tcph,
+ const union nf_inet_addr *oldip,
+ const union nf_inet_addr *newip,
+ __be16 oldlen, __be16 newlen)
+{
+#ifdef CONFIG_IP_VS_IPV6
+ if (af == AF_INET6)
+ tcph->check =
+ csum_fold(ip_vs_check_diff16(oldip->ip6, newip->ip6,
+ ip_vs_check_diff2(oldlen, newlen,
+ ~csum_unfold(tcph->check))));
+ else
+#endif
+ tcph->check =
+ csum_fold(ip_vs_check_diff4(oldip->ip, newip->ip,
+ ip_vs_check_diff2(oldlen, newlen,
+ ~csum_unfold(tcph->check))));
+}
+
+
static int
tcp_snat_handler(struct sk_buff *skb,
struct ip_vs_protocol *pp, struct ip_vs_conn *cp)
{
struct tcphdr *tcph;
unsigned int tcphoff;
+ int oldlen;
#ifdef CONFIG_IP_VS_IPV6
if (cp->af == AF_INET6)
@@ -147,6 +169,7 @@ tcp_snat_handler(struct sk_buff *skb,
else
#endif
tcphoff = ip_hdrlen(skb);
+ oldlen = skb->len - tcphoff;
/* csum_check requires unshared skb */
if (!skb_make_writable(skb, tcphoff+sizeof(*tcph)))
@@ -166,7 +189,11 @@ tcp_snat_handler(struct sk_buff *skb,
tcph->source = cp->vport;
/* Adjust TCP checksums */
- if (!cp->app && (tcph->check != 0)) {
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ tcp_partial_csum_update(cp->af, tcph, &cp->daddr, &cp->vaddr,
+ htonl(oldlen),
+ htonl(skb->len - tcphoff));
+ } else if (!cp->app) {
/* Only port and addr are changed, do fast csum update */
tcp_fast_csum_update(cp->af, tcph, &cp->daddr, &cp->vaddr,
cp->dport, cp->vport);
@@ -204,6 +231,7 @@ tcp_dnat_handler(struct sk_buff *skb,
{
struct tcphdr *tcph;
unsigned int tcphoff;
+ int oldlen;
#ifdef CONFIG_IP_VS_IPV6
if (cp->af == AF_INET6)
@@ -211,6 +239,7 @@ tcp_dnat_handler(struct sk_buff *skb,
else
#endif
tcphoff = ip_hdrlen(skb);
+ oldlen = skb->len - tcphoff;
/* csum_check requires unshared skb */
if (!skb_make_writable(skb, tcphoff+sizeof(*tcph)))
@@ -235,7 +264,11 @@ tcp_dnat_handler(struct sk_buff *skb,
/*
* Adjust TCP checksums
*/
- if (!cp->app && (tcph->check != 0)) {
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ tcp_partial_csum_update(cp->af, tcph, &cp->daddr, &cp->vaddr,
+ htonl(oldlen),
+ htonl(skb->len - tcphoff));
+ } else if (!cp->app) {
/* Only port and addr are changed, do fast csum update */
tcp_fast_csum_update(cp->af, tcph, &cp->vaddr, &cp->daddr,
cp->vport, cp->dport);
next prev parent reply other threads:[~2008-09-06 7:43 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-09-05 1:36 [PATCH 1/2] ipvs: load balance IPv4 connections from a local process Simon Horman
2008-09-05 1:37 ` [PATCH 2/2] ipvs: load balance ipv6 " Simon Horman
2008-09-05 11:40 ` Julius Volz
2008-09-05 15:55 ` Brian Haley
2008-09-05 16:37 ` Julius Volz
2008-09-06 4:14 ` Simon Horman
2008-09-06 9:26 ` Julius Volz
2008-09-08 0:30 ` Simon Horman
2008-09-08 1:48 ` Simon Horman
2008-09-08 9:30 ` Julius Volz
2008-09-08 9:50 ` Simon Horman
2008-09-05 5:12 ` [PATCH 1/2] ipvs: load balance IPv4 " Julian Anastasov
2008-09-05 5:49 ` Siim Põder
2008-09-06 7:43 ` Simon Horman [this message]
2008-09-05 11:02 ` Julius Volz
2008-09-06 3:56 ` Simon Horman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080906074352.GA22998@verge.net.au \
--to=horms@verge.net.au \
--cc=herbert@gondor.apana.org.au \
--cc=ja@ssi.bg \
--cc=juliusv@google.com \
--cc=lvs-devel@vger.kernel.org \
--cc=malcolm@loadbalancer.org \
--cc=netdev@vger.kernel.org \
--cc=siim@p6drad-teel.net \
--cc=vbusam@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).