From: jamal <hadi@cyberus.ca>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: davem@davemloft.net, kaber@trash.net, yoshfuji@linux-ipv6.org,
nakam@linux-ipv6.org, eric.dumazet@gmail.com,
netdev@vger.kernel.org,
Steffen Klassert <steffen.klassert@secunet.com>
Subject: Re: [RFC PATCH]xfrm: fix perpetual bundles
Date: Tue, 02 Mar 2010 07:11:45 -0500 [thread overview]
Message-ID: <1267531905.21749.21.camel@bigi> (raw)
In-Reply-To: <20100302112754.GA1513@gondor.apana.org.au>
On Tue, 2010-03-02 at 19:27 +0800, Herbert Xu wrote:
> On Wed, Feb 24, 2010 at 08:19:24AM -0500, jamal wrote:
> > 1)In the connect() stage, in the slow path a route cache is
> > created with the rth->fl.fl4_src of 0.0.0.0...
> > ==> policy->bundles is empty, so we do a lookup, fail, create
> > one.. (remember rth->fl.fl4_src of 0.0.0.0 at this stage and
> > thats what we end storing in the bundle/xdst for later comparison
> > instead of the skb's fl)
>
> So this is root number 1. When this stuff was first written this
> case simply wasn't possible. So I think the question we need to
> ask here is can we get a valid address there at the connect stage?
fl->fl4_src is valid non-zero. But in xfrm4_fill_dst()
we do wholesale copy of xdst->u.rt.fl = rt->fl; and rt->fl.fl4_src is
0.0.0.0.
> After all, for non-IPsec connect(2)s, you do get a valid IP address.
> So I don't see why the IPsec case should be different.
>
> Creating a bundle with a zero source address is just a hack to
> make connect(2) succeed immediately. AFAICS getting a valid IP
> address can also be done without waiting for the whole IPsec state
> to be created.
>
I did try to "fix it" above via:
+ if (!xdst->u.rt.fl.fl4_src) {
+ xdst->u.rt.fl.fl4_src = fl->fl4_src;
+ }
But this breaks again later in sendmsg bundle lookup because of
XFRM_SUB_POLICY. If i turned off config XFRM_SUB_POLICY, then
all works. I didnt look closely, but SUB_POLICY does do a memcpy
or two off the dst passed in connect() - which has the wrong src.
So i would have to "fix" a few more spots for it to work. This is
where i gave up concluding that i was just plugging with band-aids.
> Of course if anybody is still interested we could also revisit
> the neighbouresque queueing idea.
not plugged into that discussion..
> > 2)ping sends a packet (does a sendmsg)
> > ==> xfrm_find_bundle() ends up comparing skb's->fl (non-zero
> > fl->fl4_src) with what we stored from #1b. Fails.
> > ==> we create a new bundle at attach the old one at the end of it.
> > ...and now policy->bundles has two xdst entries
>
> This is the way it's supposed to work.
> > 3) Repeat #2, and now we have 3 xdsts in policy bundles
>
> This is what I don't understand. The code is supposed to look
> at every bundle attached to the policy. So why doesn't it find
> the one we created at step #2?
The issue is that the comparison is between xdst->u.rt.fl.fl4_src and
fl->fl4_src. fl->fl4_src is always non-zero. stored
xdst->u.rt.fl.fl4_src is always zero
> In conclusion, I think we have two problems, with the second
> one being the most immediate cause of your symptoms.
Remember the route cache (refer to dst copying above) is created at
connect time;->
So Steffen (on CC) tried to "fix it" by fixing at route cache creation
time. His approach:
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2778,15 +2778,26 @@ int ip_route_output_flow(struct net *net, struct
rtable **rp, struct flowi *flp,
struct sock *sk, int flags)
{
int err;
+ int update_route = 0;
if ((err = __ip_route_output_key(net, rp, flp)) != 0)
return err;
if (flp->proto) {
- if (!flp->fl4_src)
+ if (!flp->fl4_src) {
flp->fl4_src = (*rp)->rt_src;
- if (!flp->fl4_dst)
+ update_route = 1;
+ }
+ if (!flp->fl4_dst) {
flp->fl4_dst = (*rp)->rt_dst;
+ update_route = 1;
+ }
+ if (update_route) {
+ dst_release(&(*rp)->u.dst);
+ if ((err = __ip_route_output_key(net, rp,
flp)) != 0)
+ return err;
+ }
+
err = __xfrm_lookup(net, (struct dst_entry **)rp, flp,
sk,
flags ? XFRM_LOOKUP_WAIT : 0);
if (err == -EREMOTE)
--
I was worried about the impact of this on something else that expects
the behavior.
cheers,
jamal
next prev parent reply other threads:[~2010-03-02 12:11 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-02-24 13:19 [RFC PATCH]xfrm: fix perpetual bundles jamal
2010-02-25 10:40 ` Steffen Klassert
2010-02-25 13:19 ` jamal
2010-02-28 14:07 ` jamal
2010-03-01 15:33 ` Steffen Klassert
2010-03-02 11:27 ` Herbert Xu
2010-03-02 12:11 ` jamal [this message]
2010-03-02 12:51 ` Herbert Xu
2010-03-02 13:10 ` jamal
2010-03-02 13:46 ` Steffen Klassert
2010-03-02 13:54 ` jamal
2010-03-02 14:06 ` David Miller
2010-03-02 14:16 ` jamal
2010-03-02 14:06 ` Steffen Klassert
2010-03-03 0:47 ` jamal
2010-03-03 9:07 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1267531905.21749.21.camel@bigi \
--to=hadi@cyberus.ca \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=herbert@gondor.apana.org.au \
--cc=kaber@trash.net \
--cc=nakam@linux-ipv6.org \
--cc=netdev@vger.kernel.org \
--cc=steffen.klassert@secunet.com \
--cc=yoshfuji@linux-ipv6.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).