netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: jamal <hadi@cyberus.ca>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: davem@davemloft.net, kaber@trash.net, yoshfuji@linux-ipv6.org,
	nakam@linux-ipv6.org, eric.dumazet@gmail.com,
	netdev@vger.kernel.org,
	Steffen Klassert <steffen.klassert@secunet.com>
Subject: Re: [RFC PATCH]xfrm: fix perpetual bundles
Date: Tue, 02 Mar 2010 07:11:45 -0500	[thread overview]
Message-ID: <1267531905.21749.21.camel@bigi> (raw)
In-Reply-To: <20100302112754.GA1513@gondor.apana.org.au>

On Tue, 2010-03-02 at 19:27 +0800, Herbert Xu wrote:
> On Wed, Feb 24, 2010 at 08:19:24AM -0500, jamal wrote:

> > 1)In the connect() stage, in the slow path a route cache is
> > created with the rth->fl.fl4_src of 0.0.0.0...
> > ==> policy->bundles is empty, so we do a lookup, fail, create
> > one.. (remember rth->fl.fl4_src of 0.0.0.0 at this stage and
> > thats what we end storing in the bundle/xdst for later comparison
> > instead of the skb's fl)
> 
> So this is root number 1.  When this stuff was first written this
> case simply wasn't possible.  So I think the question we need to
> ask here is can we get a valid address there at the connect stage?

fl->fl4_src is valid non-zero. But in xfrm4_fill_dst()
we do wholesale copy of xdst->u.rt.fl = rt->fl; and rt->fl.fl4_src is
0.0.0.0.


> After all, for non-IPsec connect(2)s, you do get a valid IP address.
> So I don't see why the IPsec case should be different.
>
> Creating a bundle with a zero source address is just a hack to
> make connect(2) succeed immediately.  AFAICS getting a valid IP
> address can also be done without waiting for the whole IPsec state
> to be created.
> 

I did try to "fix it" above via:
+               if (!xdst->u.rt.fl.fl4_src) {
+                       xdst->u.rt.fl.fl4_src = fl->fl4_src;
+               }

But this breaks again later in sendmsg bundle lookup because of 
XFRM_SUB_POLICY. If i turned off config XFRM_SUB_POLICY, then
all works. I didnt look closely, but SUB_POLICY does do a memcpy
or two off the dst passed in connect() - which has the wrong src.
So i would have to "fix" a few more spots for it to work. This is
where i gave up concluding that i was just plugging with band-aids.

> Of course if anybody is still interested we could also revisit
> the neighbouresque queueing idea.

not plugged into that discussion..

> > 2)ping sends a packet (does a sendmsg) 
> > ==> xfrm_find_bundle() ends up comparing skb's->fl (non-zero
> > fl->fl4_src) with what we stored from #1b. Fails.
> > ==> we create a new bundle at attach the old one at the end of it.
> > ...and now policy->bundles has two xdst entries
> 
> This is the way it's supposed to work.

> > 3) Repeat #2, and now we have 3 xdsts in policy bundles
> 
> This is what I don't understand.  The code is supposed to look
> at every bundle attached to the policy.  So why doesn't it find
> the one we created at step #2?

The issue is that the comparison is between xdst->u.rt.fl.fl4_src and
fl->fl4_src. fl->fl4_src is always non-zero. stored
xdst->u.rt.fl.fl4_src is always zero 

> In conclusion, I think we have two problems, with the second
> one being the most immediate cause of your symptoms.

Remember the route cache (refer to dst copying above) is created at
connect time;->

So Steffen (on CC) tried to "fix it" by fixing at route cache creation
time. His approach:

--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2778,15 +2778,26 @@ int ip_route_output_flow(struct net *net, struct
rtable **rp, struct flowi *flp,
                         struct sock *sk, int flags)
 {
        int err;
+       int update_route = 0;
 
        if ((err = __ip_route_output_key(net, rp, flp)) != 0)
                return err;
 
        if (flp->proto) {
-               if (!flp->fl4_src)
+               if (!flp->fl4_src) {
                        flp->fl4_src = (*rp)->rt_src;
-               if (!flp->fl4_dst)
+                       update_route = 1;
+               }
+               if (!flp->fl4_dst) {
                        flp->fl4_dst = (*rp)->rt_dst;
+                       update_route = 1;
+               }
+               if (update_route) {
+                       dst_release(&(*rp)->u.dst);
+                       if ((err = __ip_route_output_key(net, rp,
flp)) != 0)
+                               return err;
+               }
+
                err = __xfrm_lookup(net, (struct dst_entry **)rp, flp,
sk,
                                    flags ? XFRM_LOOKUP_WAIT : 0);
                if (err == -EREMOTE)
--

I was worried about the impact of this on something else that expects
the behavior.

cheers,
jamal


  reply	other threads:[~2010-03-02 12:11 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-02-24 13:19 [RFC PATCH]xfrm: fix perpetual bundles jamal
2010-02-25 10:40 ` Steffen Klassert
2010-02-25 13:19   ` jamal
2010-02-28 14:07     ` jamal
2010-03-01 15:33       ` Steffen Klassert
2010-03-02 11:27 ` Herbert Xu
2010-03-02 12:11   ` jamal [this message]
2010-03-02 12:51     ` Herbert Xu
2010-03-02 13:10       ` jamal
2010-03-02 13:46         ` Steffen Klassert
2010-03-02 13:54           ` jamal
2010-03-02 14:06             ` David Miller
2010-03-02 14:16               ` jamal
2010-03-02 14:06             ` Steffen Klassert
2010-03-03  0:47               ` jamal
2010-03-03  9:07                 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1267531905.21749.21.camel@bigi \
    --to=hadi@cyberus.ca \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=kaber@trash.net \
    --cc=nakam@linux-ipv6.org \
    --cc=netdev@vger.kernel.org \
    --cc=steffen.klassert@secunet.com \
    --cc=yoshfuji@linux-ipv6.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).