From: jamal <hadi@cyberus.ca>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: davem@davemloft.net, kaber@trash.net, yoshfuji@linux-ipv6.org,
nakam@linux-ipv6.org, eric.dumazet@gmail.com,
netdev@vger.kernel.org,
Steffen Klassert <steffen.klassert@secunet.com>
Subject: Re: [RFC PATCH]xfrm: fix perpetual bundles
Date: Tue, 02 Mar 2010 07:11:45 -0500 [thread overview]
Message-ID: <1267531905.21749.21.camel@bigi> (raw)
In-Reply-To: <20100302112754.GA1513@gondor.apana.org.au>
On Tue, 2010-03-02 at 19:27 +0800, Herbert Xu wrote:
> On Wed, Feb 24, 2010 at 08:19:24AM -0500, jamal wrote:
> > 1)In the connect() stage, in the slow path a route cache is
> > created with the rth->fl.fl4_src of 0.0.0.0...
> > ==> policy->bundles is empty, so we do a lookup, fail, create
> > one.. (remember rth->fl.fl4_src of 0.0.0.0 at this stage and
> > thats what we end storing in the bundle/xdst for later comparison
> > instead of the skb's fl)
>
> So this is root number 1. When this stuff was first written this
> case simply wasn't possible. So I think the question we need to
> ask here is can we get a valid address there at the connect stage?
fl->fl4_src is valid non-zero. But in xfrm4_fill_dst()
we do wholesale copy of xdst->u.rt.fl = rt->fl; and rt->fl.fl4_src is
0.0.0.0.
> After all, for non-IPsec connect(2)s, you do get a valid IP address.
> So I don't see why the IPsec case should be different.
>
> Creating a bundle with a zero source address is just a hack to
> make connect(2) succeed immediately. AFAICS getting a valid IP
> address can also be done without waiting for the whole IPsec state
> to be created.
>
I did try to "fix it" above via:
+ if (!xdst->u.rt.fl.fl4_src) {
+ xdst->u.rt.fl.fl4_src = fl->fl4_src;
+ }
But this breaks again later in sendmsg bundle lookup because of
XFRM_SUB_POLICY. If i turned off config XFRM_SUB_POLICY, then
all works. I didnt look closely, but SUB_POLICY does do a memcpy
or two off the dst passed in connect() - which has the wrong src.
So i would have to "fix" a few more spots for it to work. This is
where i gave up concluding that i was just plugging with band-aids.
> Of course if anybody is still interested we could also revisit
> the neighbouresque queueing idea.
not plugged into that discussion..
> > 2)ping sends a packet (does a sendmsg)
> > ==> xfrm_find_bundle() ends up comparing skb's->fl (non-zero
> > fl->fl4_src) with what we stored from #1b. Fails.
> > ==> we create a new bundle at attach the old one at the end of it.
> > ...and now policy->bundles has two xdst entries
>
> This is the way it's supposed to work.
> > 3) Repeat #2, and now we have 3 xdsts in policy bundles
>
> This is what I don't understand. The code is supposed to look
> at every bundle attached to the policy. So why doesn't it find
> the one we created at step #2?
The issue is that the comparison is between xdst->u.rt.fl.fl4_src and
fl->fl4_src. fl->fl4_src is always non-zero. stored
xdst->u.rt.fl.fl4_src is always zero
> In conclusion, I think we have two problems, with the second
> one being the most immediate cause of your symptoms.
Remember the route cache (refer to dst copying above) is created at
connect time;->
So Steffen (on CC) tried to "fix it" by fixing at route cache creation
time. His approach:
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2778,15 +2778,26 @@ int ip_route_output_flow(struct net *net, struct
rtable **rp, struct flowi *flp,
struct sock *sk, int flags)
{
int err;
+ int update_route = 0;
if ((err = __ip_route_output_key(net, rp, flp)) != 0)
return err;
if (flp->proto) {
- if (!flp->fl4_src)
+ if (!flp->fl4_src) {
flp->fl4_src = (*rp)->rt_src;
- if (!flp->fl4_dst)
+ update_route = 1;
+ }
+ if (!flp->fl4_dst) {
flp->fl4_dst = (*rp)->rt_dst;
+ update_route = 1;
+ }
+ if (update_route) {
+ dst_release(&(*rp)->u.dst);
+ if ((err = __ip_route_output_key(net, rp,
flp)) != 0)
+ return err;
+ }
+
err = __xfrm_lookup(net, (struct dst_entry **)rp, flp,
sk,
flags ? XFRM_LOOKUP_WAIT : 0);
if (err == -EREMOTE)
--
I was worried about the impact of this on something else that expects
the behavior.
cheers,
jamal
next prev parent reply other threads:[~2010-03-02 12:11 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-02-24 13:19 [RFC PATCH]xfrm: fix perpetual bundles jamal
2010-02-25 10:40 ` Steffen Klassert
2010-02-25 13:19 ` jamal
2010-02-28 14:07 ` jamal
2010-03-01 15:33 ` Steffen Klassert
2010-03-02 11:27 ` Herbert Xu
2010-03-02 12:11 ` jamal [this message]
2010-03-02 12:51 ` Herbert Xu
2010-03-02 13:10 ` jamal
2010-03-02 13:46 ` Steffen Klassert
2010-03-02 13:54 ` jamal
2010-03-02 14:06 ` David Miller
2010-03-02 14:16 ` jamal
2010-03-02 14:06 ` Steffen Klassert
2010-03-03 0:47 ` jamal
2010-03-03 9:07 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1267531905.21749.21.camel@bigi \
--to=hadi@cyberus.ca \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=herbert@gondor.apana.org.au \
--cc=kaber@trash.net \
--cc=nakam@linux-ipv6.org \
--cc=netdev@vger.kernel.org \
--cc=steffen.klassert@secunet.com \
--cc=yoshfuji@linux-ipv6.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.