From: "Maciej Żenczykowski" <zenczykowski@gmail.com>
To: Linux Networking <netdev@vger.kernel.org>
Subject: SO_MARK and IPv6 and ip rule fwmark: broken
Date: Sat, 26 Sep 2009 04:03:41 -0700
Message-ID: <55a4f86e0909260403k1da86294tca3f60534da24db7@mail.gmail.com>
AFAICT the following can happen:
* userspace creates an IPv6 socket
* userspace calls setsockopt SO_MARK with a non-zero value
[sk->sk_mark = something] (requires CAP_NET_ADMIN)
* userspace attempts to connect or send a datagram in some other way
* flow struct gets initialized
* fl.mark isn't initialized and defaults to memset'ed 0 <-- *bug*
* routing decision (and potentially src ip selection, etc) gets made
based on flow with missing mark
* skb->mark = sk->sk_mark [which is non-zero] <-- *kind of correct*
* if ip6tables mangle is enabled:
  - temp_mark = skb->mark <-- *correct, although leads to weird behaviour*
  - rules get called, potentially matching and/or modifying the mark
    (ip6tables -m mark, -j MARK)
  - once all rules in the mangle table complete, we do:
    - if (skb->mark != temp_mark) || (other special mangles happened)
      re-examine [redo] previous routing decision.
This means that:
ip rule (add) fwmark [non-zero] ...
will not work for a fwmark set with SO_MARK, unless you somehow cause
the mangle table to trigger the re-examination of the routing
decision.
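To make the sequence above concrete, here is a toy model of it in Python. It is only a sketch of the behaviour as I described it, not kernel code: the names output_route, lookup_table and the (fwmark, table) rule tuples are made up for illustration, and the mangle table is reduced to a callback returning the new mark plus an "other special mangle happened" flag.

```python
DEFAULT_TABLE = "main"


def lookup_table(rules, mark):
    """Return the table of the first (fwmark, table) rule matching mark."""
    for fwmark, table in rules:
        if fwmark == mark:
            return table
    return DEFAULT_TABLE


def output_route(sk_mark, rules, mangle=None):
    """Model the v6 output path described above (hypothetical helper).

    mangle, if given, stands in for the ip6tables mangle table: it takes
    the current skb mark and returns (new_mark, other_special_change).
    """
    fl_mark = 0                           # the bug: never copied from sk_mark
    table = lookup_table(rules, fl_mark)  # routing decision on the bare flow
    skb_mark = sk_mark                    # skb->mark = sk->sk_mark
    if mangle is not None:
        temp_mark = skb_mark              # mark saved before the rules run
        skb_mark, other_change = mangle(skb_mark)
        if skb_mark != temp_mark or other_change:
            # redo the routing decision, this time with the skb mark
            table = lookup_table(rules, skb_mark)
    return table


rules = [(1234, "200")]  # models: ip rule add fwmark 1234 lookup 200
print(output_route(1234, rules))                           # no mangle table
print(output_route(1234, rules, lambda m: (m, False)))     # mangle, no change
```

In this model both calls fall through to the default table even though the socket mark matches the rule, which is the problem in a nutshell.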
For example (tested for v6 TCP; a cursory code examination leads me to
believe this is broken for UDP/DCCP/SCTP/etc over v6 as well), while:
ip rule add fwmark 1234 lookup 200
combined with a
setsockopt(SO_MARK, 1234)
is ignored,
changing the setsockopt to
setsockopt(SO_MARK, 12345)
and adding in
ip6tables -t mangle -A OUTPUT -m mark --mark 12345 -j MARK --set-mark 1234
suddenly results in the ip rule fwmark being obeyed.
I believe that
setsockopt(SO_MARK, 1234)
combined with
ip6tables -t mangle -A OUTPUT -j HL --hl-dec 1
would also work (since a changed hoplimit triggers routing
re-examination [exactly why is unclear to me...]).
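For anyone wanting to try the userspace half of this, a minimal sketch of the setsockopt call on an IPv6 TCP socket follows. Assumptions: Linux, and the fallback value 36 for SO_MARK (its value on most architectures) in case the Python socket module does not export the constant; try_mark_v6_socket is just an illustrative name. Without CAP_NET_ADMIN the call is expected to fail, so the helper reports the error instead of raising.

```python
import socket

# Assumption: 36 is the Linux value of SO_MARK on most architectures;
# it is only used if the socket module does not export the constant.
SO_MARK = getattr(socket, "SO_MARK", 36)


def try_mark_v6_socket(mark):
    """Create an IPv6 TCP socket and try to set SO_MARK on it.

    Returns (True, None) on success, or (False, error) if the socket
    could not be created or the mark could not be set (for example
    because the caller lacks CAP_NET_ADMIN).
    """
    try:
        s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError as e:
        return False, e
    try:
        s.setsockopt(socket.SOL_SOCKET, SO_MARK, mark)
        return True, None
    except OSError as e:
        return False, e
    finally:
        s.close()


ok, err = try_mark_v6_socket(1234)
print("SO_MARK set" if ok else "SO_MARK failed: %s" % err)
```

A real test would keep the socket open and connect() it after setting the mark, then check which routing table was used; the sketch stops at the setsockopt step.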
Figuring out exactly where fl.mark needs to be initialized seems not
quite trivial... (because of questions of what to do with non-TCP
protocols, or SYN-ACK syncookies, RST packets, etc)
Alternatively, temp_mark = skb->mark in the v6 mangle code could be
changed to temp_mark = 0, in which case just loading the mangle table
would cause it to work. Of course, this would be suboptimal
performance-wise, and it would still be broken without the ip6tables
mangle table being loaded, which isn't particularly desirable behaviour.
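The trade-off of the temp_mark = 0 idea can be sketched as a small truth table. Again this is a toy model in Python under my own simplifications, not the kernel logic: reroute_triggered and its parameters are hypothetical names, and it assumes no mangle rule touches the mark or anything else.

```python
def reroute_triggered(sk_mark, mangle_loaded, saved_mark_is_zero):
    """Would the mangle hook redo the routing decision? (toy model)

    saved_mark_is_zero=False models temp_mark = skb->mark (current code),
    saved_mark_is_zero=True models the temp_mark = 0 alternative.
    Assumes no rule in the mangle table changes the packet.
    """
    if not mangle_loaded:
        return False          # hook never runs, so nothing can re-route
    skb_mark = sk_mark        # skb->mark = sk->sk_mark
    temp_mark = 0 if saved_mark_is_zero else skb_mark
    return skb_mark != temp_mark


for loaded in (False, True):
    for zero in (False, True):
        print(loaded, zero, reroute_triggered(1234, loaded, zero))
```

Only the combination of a loaded mangle table and temp_mark = 0 triggers the re-route for a marked socket, and it does so for every marked packet, which is where the performance cost comes from.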
Not quite sure what to do with this...
- Maciej