* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
[not found] <200704271705.l3RH5Brw026873@hera.kernel.org>
@ 2007-05-14 10:21 ` Patrick McHardy
2007-05-14 10:35 ` David Miller
0 siblings, 1 reply; 24+ messages in thread
From: Patrick McHardy @ 2007-05-14 10:21 UTC (permalink / raw)
To: Simon Horman; +Cc: Linux Netdev List, Janusz Krzysztofik
Linux Kernel Mailing List wrote:
> Gitweb: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2d771cd86d4c3af26f34a7bcdc1b87696824cad9
> Commit: 2d771cd86d4c3af26f34a7bcdc1b87696824cad9
>
> [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
>
> this is a small patch by Janusz Krzysztofik to ip_route_output_slow()
> that allows VIP-less LVS linux director to generate packets
> originating >From VIP if sysctl_ip_nonlocal_bind is set.
>
> In a nutshell, the intention is for an LVS linux director to be able
> to send ICMP unreachable responses to end-users when real-servers are
> removed.
>
> http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00106.html
>
> Signed-off-by: Simon Horman <horms@verge.net.au>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> net/ipv4/route.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index df9fe4f..cb76e3c 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
>
> /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
> dev_out = ip_dev_find(oldflp->fl4_src);
> - if (dev_out == NULL)
> + if ((dev_out == NULL) && !(sysctl_ip_nonlocal_bind))
> goto out;
This allows any user to send spoofed packets when ip_nonlocal_bind
is set, which is a quite big change in behaviour of this option.
The TPROXY patches include a similar change, but use a flag in
struct flowi that requires CAP_NET_ADMIN to be set, which seems like
a better idea. Alternatively you could just use input routing for
non-local source addresses like ip_route_me_harder does.
BTW, there doesn't even seem to be a spot where IPVS calls
ip_route_output with the source address set. What exactly is this
needed for?
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-14 10:21 ` [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed Patrick McHardy
@ 2007-05-14 10:35 ` David Miller
2007-05-14 14:25 ` Janusz Krzysztofik
0 siblings, 1 reply; 24+ messages in thread
From: David Miller @ 2007-05-14 10:35 UTC (permalink / raw)
To: kaber; +Cc: horms, netdev, jkrzyszt
From: Patrick McHardy <kaber@trash.net>
Date: Mon, 14 May 2007 12:21:34 +0200
> This allows any user to send spoofed packets when ip_nonlocal_bind
> is set, which is a quite big change in behaviour of this option.
> The TPROXY patches include a similar change, but use a flag in
> struct flowi that requires CAP_NET_ADMIN to be set, which seems like
> a better idea. Alternatively you could just use input routing for
> non-local source addresses like ip_route_me_harder does.
Good point.
> BTW, there doesn't even seem to be a spot where IPVS calls
> ip_route_output with the source address set. What exactly is this
> needed for?
I suppose he has a patch to make use of it, but was waiting
for this route.c change to go in first.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-14 10:35 ` David Miller
@ 2007-05-14 14:25 ` Janusz Krzysztofik
2007-05-14 14:32 ` Patrick McHardy
0 siblings, 1 reply; 24+ messages in thread
From: Janusz Krzysztofik @ 2007-05-14 14:25 UTC (permalink / raw)
To: David Miller; +Cc: kaber, horms, netdev
David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Mon, 14 May 2007 12:21:34 +0200
>
>> This allows any user to send spoofed packets when ip_nonlocal_bind
>> is set, which is a quite big change in behaviour of this option.
>> The TPROXY patches include a similar change, but use a flag in
>> struct flowi that requires CAP_NET_ADMIN to be set, which seems like
>> a better idea. Alternatively you could just use input routing for
>> non-local source addresses like ip_route_me_harder does.
>
> Good point.
>
>> BTW, there doesn't even seem to be a spot where IPVS calls
>> ip_route_output with the source address set. What exactly is this
>> needed for?
>
> I suppose he has a patch to make use of it, but was waiting
> for this route.c change to go in first.
If you mean me, the answer is no, I do not have any patch making use of
the change in question. What I have is rather a complicated method of
notifying udp clients on communication problems before they are
redirected to a new real server. My method needs some IPVS related
patches, but ICMP port unreachable messages are not generated inside
IPVS code, they are just sent, with help of the patch in question, from
udp_input() or netfilter REJECT.
It was my first intention to patch IPVS to send these messages, but I
found no simple way to implement this in the current IPVS code.
Janusz
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-14 14:25 ` Janusz Krzysztofik
@ 2007-05-14 14:32 ` Patrick McHardy
2007-05-14 15:49 ` Janusz Krzysztofik
0 siblings, 1 reply; 24+ messages in thread
From: Patrick McHardy @ 2007-05-14 14:32 UTC (permalink / raw)
To: Janusz Krzysztofik; +Cc: David Miller, horms, netdev
Janusz Krzysztofik wrote:
> David Miller wrote:
>
>>> BTW, there doesn't even seem to be a spot where IPVS calls
>>> ip_route_output with the source address set. What exactly is this
>>> needed for?
>>
>>
>> I suppose he has a patch to make use of it, but was waiting
>> for this route.c change to go in first.
>
>
> If you mean me, the answer is no, I do not have any patch making use of
> the change in question. What I have is rather a complicated method of
> notifying udp clients on communication problems before they are
> redirected to a new real server. My method needs some IPVS related
> patches, but ICMP port unreachable messages are not generated inside
> IPVS code, they are just sent, with help of the patch in question, from
> udp_input() or netfilter REJECT.
Both use icmp_send(), which should always pick a local source, so I
don't understand why this change was needed. Could you describe
the specific case when the packet generated by icmp_send() does
not have a local source?
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-14 14:32 ` Patrick McHardy
@ 2007-05-14 15:49 ` Janusz Krzysztofik
2007-05-14 17:41 ` Patrick McHardy
0 siblings, 1 reply; 24+ messages in thread
From: Janusz Krzysztofik @ 2007-05-14 15:49 UTC (permalink / raw)
To: Patrick McHardy; +Cc: David Miller, horms, netdev
Patrick McHardy wrote:
> Janusz Krzysztofik wrote:
>> ... ICMP port unreachable messages are not generated inside
>> IPVS code, they are just sent, with help of the patch in question, from
>> udp_input() or netfilter REJECT.
>
> Both use icmp_send(), which should always pick a local source, so I
> don't understand why this change was needed. Could you describe
> the specific case when the packet generated by icmp_send() does
> not have a local source?
Yes, it happens when a packet with a non-local destination IP address is
routed localy in order to reach ip_vs_in(), but is not catched there
because of no associated connection and no matching service, so it is
passed through and ends up in udp_input(). Then, inside udp_input(),
icmp_send() is invoked with original non-local destination IP as source
address.
Again, all this is my own method, usnig special packet marking, of
notifying clients of dead real servers, that is not possible with "pure"
LVS methods. More details can be found several paragraphs below
http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.LVS-NAT.html#F5_snat
header.
Janusz
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-14 15:49 ` Janusz Krzysztofik
@ 2007-05-14 17:41 ` Patrick McHardy
2007-05-15 5:26 ` Simon Horman
0 siblings, 1 reply; 24+ messages in thread
From: Patrick McHardy @ 2007-05-14 17:41 UTC (permalink / raw)
To: Janusz Krzysztofik; +Cc: David Miller, horms, netdev
[-- Attachment #1: Type: text/plain, Size: 1379 bytes --]
Janusz Krzysztofik wrote:
> Patrick McHardy wrote:
>
>> Janusz Krzysztofik wrote:
>>
>>> ... ICMP port unreachable messages are not generated inside
>>> IPVS code, they are just sent, with help of the patch in question, from
>>> udp_input() or netfilter REJECT.
>>
>>
>> Both use icmp_send(), which should always pick a local source, so I
>> don't understand why this change was needed. Could you describe
>> the specific case when the packet generated by icmp_send() does
>> not have a local source?
>
>
> Yes, it happens when a packet with a non-local destination IP address is
> routed localy in order to reach ip_vs_in(), but is not catched there
> because of no associated connection and no matching service, so it is
> passed through and ends up in udp_input(). Then, inside udp_input(),
> icmp_send() is invoked with original non-local destination IP as source
> address.
So you're adding a local route for non-local destination and the
address selection in icmp_send() uses the original destination
address as source because the route has RTCF_LOCAL set, resulting
in an error in ip_route_output_slow().
If thats correct than this patch should also work, it changes
icmp_send() to check if the original destination address is
non-local when deciding whether to pick a new address (and
reverts the routing changes).
Signed-off-by: Patrick McHardy <kaber@trash.net>
[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1310 bytes --]
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index d38cbba..b964863 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -513,7 +513,7 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
*/
saddr = iph->daddr;
- if (!(rt->rt_flags & RTCF_LOCAL)) {
+ if (inet_addr_type(saddr) != RTN_LOCAL) {
if (sysctl_icmp_errors_use_inbound_ifaddr)
saddr = inet_select_addr(skb_in->dev, 0, RT_SCOPE_LINK);
else
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index cb76e3c..df9fe4f 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
/* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
dev_out = ip_dev_find(oldflp->fl4_src);
- if ((dev_out == NULL) && !(sysctl_ip_nonlocal_bind))
+ if (dev_out == NULL)
goto out;
/* I removed check for oif == dev_out->oif here.
@@ -2407,7 +2407,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
of another iface. --ANK
*/
- if (dev_out && oldflp->oif == 0
+ if (oldflp->oif == 0
&& (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
/* Special hack: user can direct multicasts
and limited broadcast via necessary interface
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-14 17:41 ` Patrick McHardy
@ 2007-05-15 5:26 ` Simon Horman
2007-05-15 9:46 ` Janusz Krzysztofik
2007-05-15 16:11 ` Patrick McHardy
0 siblings, 2 replies; 24+ messages in thread
From: Simon Horman @ 2007-05-15 5:26 UTC (permalink / raw)
To: Patrick McHardy
Cc: Janusz Krzysztofik, David Miller, netdev, Julian Anastasov
On Mon, May 14, 2007 at 07:41:48PM +0200, Patrick McHardy wrote:
> Janusz Krzysztofik wrote:
> > Patrick McHardy wrote:
> >
> >> Janusz Krzysztofik wrote:
> >>
> >>> ... ICMP port unreachable messages are not generated inside
> >>> IPVS code, they are just sent, with help of the patch in question, from
> >>> udp_input() or netfilter REJECT.
> >>
> >>
> >> Both use icmp_send(), which should always pick a local source, so I
> >> don't understand why this change was needed. Could you describe
> >> the specific case when the packet generated by icmp_send() does
> >> not have a local source?
> >
> >
> > Yes, it happens when a packet with a non-local destination IP address is
> > routed localy in order to reach ip_vs_in(), but is not catched there
> > because of no associated connection and no matching service, so it is
> > passed through and ends up in udp_input(). Then, inside udp_input(),
> > icmp_send() is invoked with original non-local destination IP as source
> > address.
>
>
> So you're adding a local route for non-local destination and the
> address selection in icmp_send() uses the original destination
> address as source because the route has RTCF_LOCAL set, resulting
> in an error in ip_route_output_slow().
I'm not entirely sure that "adding a local route" is the right
terminology, but then again, perhaps I'm missunderstanding exactly
what that means.
My undersanding of the problem is that IPVS likes to send icmp to notify
end-users when real-servers are down. The source ip of such icmp is the
VIP, that is the IP address associated with the virtual service.
However, it is quite valid for this VIP not to be configured on the
machine that is running IPVS. Thus the machine in question wants to send
icmp packets with a non-local source address.
http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00109.html
I think that your patch looks good, assuming that inet_addr_type(VIP)
is going to return RTN_LOCAL (except in the unlikely case that VIP is
multicast or something silly like that.
However, I wonder if efficiency or safety reasons it might
be better for IPVS to pass some sort of OK_ITS_SUPPSED_TO_BE_NON_LOCAL
flag into ip_route().
Just a thought.
>
> If thats correct than this patch should also work, it changes
> icmp_send() to check if the original destination address is
> non-local when deciding whether to pick a new address (and
> reverts the routing changes).
>
> Signed-off-by: Patrick McHardy <kaber@trash.net>
> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> index d38cbba..b964863 100644
> --- a/net/ipv4/icmp.c
> +++ b/net/ipv4/icmp.c
> @@ -513,7 +513,7 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info)
> */
>
> saddr = iph->daddr;
> - if (!(rt->rt_flags & RTCF_LOCAL)) {
> + if (inet_addr_type(saddr) != RTN_LOCAL) {
> if (sysctl_icmp_errors_use_inbound_ifaddr)
> saddr = inet_select_addr(skb_in->dev, 0, RT_SCOPE_LINK);
> else
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index cb76e3c..df9fe4f 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
>
> /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
> dev_out = ip_dev_find(oldflp->fl4_src);
> - if ((dev_out == NULL) && !(sysctl_ip_nonlocal_bind))
> + if (dev_out == NULL)
> goto out;
>
> /* I removed check for oif == dev_out->oif here.
> @@ -2407,7 +2407,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
> of another iface. --ANK
> */
>
> - if (dev_out && oldflp->oif == 0
> + if (oldflp->oif == 0
> && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
> /* Special hack: user can direct multicasts
> and limited broadcast via necessary interface
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-15 5:26 ` Simon Horman
@ 2007-05-15 9:46 ` Janusz Krzysztofik
2007-05-15 16:11 ` Patrick McHardy
1 sibling, 0 replies; 24+ messages in thread
From: Janusz Krzysztofik @ 2007-05-15 9:46 UTC (permalink / raw)
To: Simon Horman; +Cc: Patrick McHardy, David Miller, netdev, Julian Anastasov
Simon Horman wrote:
> On Mon, May 14, 2007 at 07:41:48PM +0200, Patrick McHardy wrote:
>> So you're adding a local route for non-local destination and the
>> address selection in icmp_send() uses the original destination
>> address as source because the route has RTCF_LOCAL set, resulting
>> in an error in ip_route_output_slow().
>
> I'm not entirely sure that "adding a local route" is the right
> terminology, but then again, perhaps I'm missunderstanding exactly
> what that means.
What I do exactly is:
ip rule add prio 1000 fwmark $IF_MARK_LVS lookup lvs
ip route replace table lvs local default dev lo
>
> My undersanding of the problem is that IPVS likes to send icmp to notify
> end-users when real-servers are down.
Yes, there is one such place in IPVS code too, inside ip_vs_leave(),
used for notifying clients on service overload.
> The source ip of such icmp is the
> VIP, that is the IP address associated with the virtual service.
> However, it is quite valid for this VIP not to be configured on the
> machine that is running IPVS. Thus the machine in question wants to send
> icmp packets with a non-local source address.
>
> http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00109.html
>
>> If thats correct than this patch should also work, it changes
>> icmp_send() to check if the original destination address is
>> non-local when deciding whether to pick a new address (and
>> reverts the routing changes).
> I think that your patch looks good, assuming that inet_addr_type(VIP)
> is going to return RTN_LOCAL (except in the unlikely case that VIP is
> multicast or something silly like that.
For now, I have no place other than my production firewall cluster to
verify this patch. I will do it as soon as possible and give you some
feedback.
>
> However, I wonder if efficiency or safety reasons it might
> be better for IPVS to pass some sort of OK_ITS_SUPPSED_TO_BE_NON_LOCAL
> flag into ip_route().
Do you mean packets that are passed through ip_vs_in()?. If not, please
remember that current IPVS code does not send any ICMP port unreachable
messages except for this rare overload case. I still have no idea how to
solve more common problem of notifying clients on dead real server
inside the IPVS code itself, to avoid my complicated tricks of marking
based on connection tracking.
On the other hand, I have to state that even if I can now send
notifications to clients using my method, this does not solve my real
problem of broken ipsec connections going through LVS director. Openswan
clients I use do not care about ICMP port unreachable messages an insist
on using connections that are invalid due to switched real server. So
maybe we should first verify if there are any real cases when notifying
udp clients with ICMP port unreachable may be realy usefull and then
decide if we do need this functionality.
Janusz
P.S. Simon, sorry for duplicated message.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-15 5:26 ` Simon Horman
2007-05-15 9:46 ` Janusz Krzysztofik
@ 2007-05-15 16:11 ` Patrick McHardy
2007-05-15 23:41 ` Julian Anastasov
1 sibling, 1 reply; 24+ messages in thread
From: Patrick McHardy @ 2007-05-15 16:11 UTC (permalink / raw)
To: Simon Horman; +Cc: Janusz Krzysztofik, David Miller, netdev, Julian Anastasov
Simon Horman wrote:
> On Mon, May 14, 2007 at 07:41:48PM +0200, Patrick McHardy wrote:
>
>>So you're adding a local route for non-local destination and the
>>address selection in icmp_send() uses the original destination
>>address as source because the route has RTCF_LOCAL set, resulting
>>in an error in ip_route_output_slow().
>
>
> I'm not entirely sure that "adding a local route" is the right
> terminology, but then again, perhaps I'm missunderstanding exactly
> what that means.
It means adding a route to the local table, which causes the
resulting dst_entry to be marked with RTCF_LOCAL.
> My undersanding of the problem is that IPVS likes to send icmp to notify
> end-users when real-servers are down. The source ip of such icmp is the
> VIP, that is the IP address associated with the virtual service.
> However, it is quite valid for this VIP not to be configured on the
> machine that is running IPVS. Thus the machine in question wants to send
> icmp packets with a non-local source address.
>
> http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00109.html
>
> I think that your patch looks good, assuming that inet_addr_type(VIP)
> is going to return RTN_LOCAL (except in the unlikely case that VIP is
> multicast or something silly like that.
I'm not familiar with the IPVS terms, but as far as I understand,
it is _not_ going to return RTN_LOCAL, so we get the desired
behaviour of selecting a local address as source.
> However, I wonder if efficiency or safety reasons it might
> be better for IPVS to pass some sort of OK_ITS_SUPPSED_TO_BE_NON_LOCAL
> flag into ip_route().
>
> Just a thought.
I'm not too thrilled about adding a route flag when it really is
ICMP address selection that is the problem here.
The patch should be completely safe since multicast and broadcast
packets are already filtered out earlier and the RTN_LOCAL test
matches exactly what ip_route_output_slow does.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-15 16:11 ` Patrick McHardy
@ 2007-05-15 23:41 ` Julian Anastasov
2007-05-17 11:25 ` Janusz Krzysztofik
2007-05-17 16:40 ` Patrick McHardy
0 siblings, 2 replies; 24+ messages in thread
From: Julian Anastasov @ 2007-05-15 23:41 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Janusz Krzysztofik, David Miller, netdev
Hello,
On Tue, 15 May 2007, Patrick McHardy wrote:
> Simon Horman wrote:
> > On Mon, May 14, 2007 at 07:41:48PM +0200, Patrick McHardy wrote:
> >
> >>So you're adding a local route for non-local destination and the
> >>address selection in icmp_send() uses the original destination
> >>address as source because the route has RTCF_LOCAL set, resulting
> >>in an error in ip_route_output_slow().
> >
> > I'm not entirely sure that "adding a local route" is the right
> > terminology, but then again, perhaps I'm missunderstanding exactly
> > what that means.
>
> It means adding a route to the local table, which causes the
> resulting dst_entry to be marked with RTCF_LOCAL.
IPVS users add local route to user-defined (not "local") routing
table to deliver locally only selected traffic, eg. by fwmark. The use
is just like for transparent proxy, deliver port X for VIP locally,
other traffic for VIP is forwarded (hits unicast route).
And it seems the idea is ICMP still to work for such scenarios,
icmp_send should know that this is authorized operation that can
avoid the source address check (saddr=VIP) in ip_route_output_slow.
If icmp_send is changed to use inet_addr_type() then ICMP will leave
with saddr != VIP and that is not nice.
I think, icmp_send is fine as is, the main problem is how
to differentiate the ip_route_output_slow users to ones that need
the check for valid source address and others (eg. NAT) that expect
and allow source address to be non-local.
It is interesting, what happens when some NAT rule maps
multiple internal addresses (one to one) to multiple non-local public
addresses, may be if packet from world is rejected we don't send ICMP
from the right public address? For example, if we are NAT router
with such rules:
internal 192.168.0.1 is mapped to public 10.0.0.1
internal 192.168.0.2 is mapped to public 10.0.0.2
...
where 10.0.0.0/16 are not configured as local IPs on the NAT router,
eg. when we don't want to add thousands of IPs.
World sends to 10.0.0.1 but the NAT router wants to send ICMP with
saddr=10.0.0.1 which is not local IP.
So, the rule here is that NAT allows ICMP to use 10.0.0.0/16
as source address, even if such IPs are not configured. Of course,
one may use ip addr add 10.0.0.0/16 dev lo for such case and may be all
such examples can be solved somehow, i simply didn't tried such setups.
To summarize, what can help is a flag (eg. RT_ANYSRC) to
ip_route_output* that all special users can provide to skip the
check, for example:
- RTCF_LOCAL packets in icmp_send() can avoid the check
- NAT can avoid the check (ip_route_me_harder can be simplified?)
Currently, all callers use the check, so may be the goal can be
to start with small set of callers that can set the new flag. It looks
like we can save some CPU cycles too, ip_route_me_harder looks too
overloaded.
> > I think that your patch looks good, assuming that inet_addr_type(VIP)
> > is going to return RTN_LOCAL (except in the unlikely case that VIP is
> > multicast or something silly like that.
>
> I'm not familiar with the IPVS terms, but as far as I understand,
> it is _not_ going to return RTN_LOCAL, so we get the desired
> behaviour of selecting a local address as source.
But what is preferred is to use VIP in ICMP.
ip route add local VIP dev lo table user_defined
returns RTCF_LOCAL but inet_addr_type() does not return RTN_LOCAL,
we fix one thing but break another :)
Regards
--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-15 23:41 ` Julian Anastasov
@ 2007-05-17 11:25 ` Janusz Krzysztofik
2007-05-17 16:41 ` Patrick McHardy
2007-05-17 16:40 ` Patrick McHardy
1 sibling, 1 reply; 24+ messages in thread
From: Janusz Krzysztofik @ 2007-05-17 11:25 UTC (permalink / raw)
To: Julian Anastasov; +Cc: Patrick McHardy, Simon Horman, David Miller, netdev
Julian Anastasov wrote:
> If icmp_send is changed to use inet_addr_type() then ICMP will leave
> with saddr != VIP and that is not nice.
> ...
>> I'm not familiar with the IPVS terms, but as far as I understand,
>> it is _not_ going to return RTN_LOCAL, so we get the desired
>> behaviour of selecting a local address as source.
>
> But what is preferred is to use VIP in ICMP.
>
> ip route add local VIP dev lo table user_defined
>
> returns RTCF_LOCAL but inet_addr_type() does not return RTN_LOCAL,
> we fix one thing but break another :)
Well, I have promissed you to give some feedback after I test the patch
proposed by Patrick, but after Julian's post I can only confirm that it
works exactly as Julian said, what is not what I would expect.
Julian, thank you for your detailed explanation of the issue, I have
nothing more to add.
Janusz
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-15 23:41 ` Julian Anastasov
2007-05-17 11:25 ` Janusz Krzysztofik
@ 2007-05-17 16:40 ` Patrick McHardy
2007-05-17 20:51 ` David Miller
` (2 more replies)
1 sibling, 3 replies; 24+ messages in thread
From: Patrick McHardy @ 2007-05-17 16:40 UTC (permalink / raw)
To: Julian Anastasov; +Cc: Simon Horman, Janusz Krzysztofik, David Miller, netdev
Julian Anastasov wrote:
> To summarize, what can help is a flag (eg. RT_ANYSRC) to
> ip_route_output* that all special users can provide to skip the
> check, for example:
> - RTCF_LOCAL packets in icmp_send() can avoid the check
> - NAT can avoid the check (ip_route_me_harder can be simplified?)
We want to be able to use iif in rules, so ip_route_me_harder still
needs to use ip_route_input().
> Currently, all callers use the check, so may be the goal can be
> to start with small set of callers that can set the new flag. It looks
> like we can save some CPU cycles too, ip_route_me_harder looks too
> overloaded.
>
>
>>>I think that your patch looks good, assuming that inet_addr_type(VIP)
>>>is going to return RTN_LOCAL (except in the unlikely case that VIP is
>>>multicast or something silly like that.
>>
>>I'm not familiar with the IPVS terms, but as far as I understand,
>>it is _not_ going to return RTN_LOCAL, so we get the desired
>>behaviour of selecting a local address as source.
>
>
> But what is preferred is to use VIP in ICMP.
>
> ip route add local VIP dev lo table user_defined
>
> returns RTCF_LOCAL but inet_addr_type() does not return RTN_LOCAL,
> we fix one thing but break another :)
Actually thats exactly the case that my patch handles. Why does it
matter which source address the ICMP packet uses, as long as its
routed properly?
In any case some better solution than the current one needs to be
found, allowing users to send spoofed packets is far worse than
using a non-desired source address for ICMP packets.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-17 11:25 ` Janusz Krzysztofik
@ 2007-05-17 16:41 ` Patrick McHardy
0 siblings, 0 replies; 24+ messages in thread
From: Patrick McHardy @ 2007-05-17 16:41 UTC (permalink / raw)
To: Janusz Krzysztofik; +Cc: Julian Anastasov, Simon Horman, David Miller, netdev
Janusz Krzysztofik wrote:
> Julian Anastasov wrote:
>
>> If icmp_send is changed to use inet_addr_type() then ICMP will leave
>> with saddr != VIP and that is not nice.
>> ...
>>
>>> I'm not familiar with the IPVS terms, but as far as I understand,
>>> it is _not_ going to return RTN_LOCAL, so we get the desired
>>> behaviour of selecting a local address as source.
>>
>>
>> But what is preferred is to use VIP in ICMP.
>>
>> ip route add local VIP dev lo table user_defined
>>
>> returns RTCF_LOCAL but inet_addr_type() does not return RTN_LOCAL,
>> we fix one thing but break another :)
>
>
> Well, I have promissed you to give some feedback after I test the patch
> proposed by Patrick, but after Julian's post I can only confirm that it
> works exactly as Julian said, what is not what I would expect.
It will pick a local source address and use that. I don't see why
it matters whether its the VIP address or some other.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-17 16:40 ` Patrick McHardy
@ 2007-05-17 20:51 ` David Miller
2007-05-18 1:06 ` Simon Horman
2007-05-18 8:40 ` Julian Anastasov
2 siblings, 0 replies; 24+ messages in thread
From: David Miller @ 2007-05-17 20:51 UTC (permalink / raw)
To: kaber; +Cc: ja, horms, jkrzyszt, netdev
From: Patrick McHardy <kaber@trash.net>
Date: Thu, 17 May 2007 18:40:28 +0200
> In any case some better solution than the current one needs to be
> found, allowing users to send spoofed packets is far worse than
> using a non-desired source address for ICMP packets.
Agreed, but it only occurs if the nonlocal bind sysctl is
enabled and almost nobody turns that thing on :-)
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-17 16:40 ` Patrick McHardy
2007-05-17 20:51 ` David Miller
@ 2007-05-18 1:06 ` Simon Horman
2007-05-18 8:40 ` Julian Anastasov
2 siblings, 0 replies; 24+ messages in thread
From: Simon Horman @ 2007-05-18 1:06 UTC (permalink / raw)
To: Patrick McHardy
Cc: Julian Anastasov, Janusz Krzysztofik, David Miller, netdev
On Thu, May 17, 2007 at 06:40:28PM +0200, Patrick McHardy wrote:
>
> Actually thats exactly the case that my patch handles. Why does it
> matter which source address the ICMP packet uses, as long as its
> routed properly?
Agreed.
> In any case some better solution than the current one needs to be
> found, allowing users to send spoofed packets is far worse than
> using a non-desired source address for ICMP packets.
No agrument there.
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-17 16:40 ` Patrick McHardy
2007-05-17 20:51 ` David Miller
2007-05-18 1:06 ` Simon Horman
@ 2007-05-18 8:40 ` Julian Anastasov
2007-05-18 9:05 ` David Miller
2 siblings, 1 reply; 24+ messages in thread
From: Julian Anastasov @ 2007-05-18 8:40 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Simon Horman, Janusz Krzysztofik, David Miller, netdev
Hello,
On Thu, 17 May 2007, Patrick McHardy wrote:
> > But what is preferred is to use VIP in ICMP.
> >
> > ip route add local VIP dev lo table user_defined
> >
> > returns RTCF_LOCAL but inet_addr_type() does not return RTN_LOCAL,
> > we fix one thing but break another :)
>
>
> Actually thats exactly the case that my patch handles. Why does it
> matter which source address the ICMP packet uses, as long as its
> routed properly?
It should work for most of the cases but it can cause problems
in closely connected hosts where using the right subnet matters.
If inet_addr_type is not considered slow for routers and this
local route justifies it then i have no more objections. May be
Janusz should test it first without sysctl_ip_nonlocal_bind change.
> In any case some better solution than the current one needs to be
> found, allowing users to send spoofed packets is far worse than
> using a non-desired source address for ICMP packets.
yes, I would prefer the sysctl_ip_nonlocal_bind change to be
removed until such solution is found.
Regards
--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-18 8:40 ` Julian Anastasov
@ 2007-05-18 9:05 ` David Miller
2007-05-30 9:38 ` KOVACS Krisztian
0 siblings, 1 reply; 24+ messages in thread
From: David Miller @ 2007-05-18 9:05 UTC (permalink / raw)
To: ja; +Cc: kaber, horms, jkrzyszt, netdev
From: Julian Anastasov <ja@ssi.bg>
Date: Fri, 18 May 2007 11:40:54 +0300 (EEST)
> On Thu, 17 May 2007, Patrick McHardy wrote:
>
> > In any case some better solution than the current one needs to be
> > found, allowing users to send spoofed packets is far worse than
> > using a non-desired source address for ICMP packets.
>
> yes, I would prefer the sysctl_ip_nonlocal_bind change to be
> removed until such solution is found.
Ok, I'll revert it.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-18 9:05 ` David Miller
@ 2007-05-30 9:38 ` KOVACS Krisztian
2007-05-31 0:21 ` Julian Anastasov
0 siblings, 1 reply; 24+ messages in thread
From: KOVACS Krisztian @ 2007-05-30 9:38 UTC (permalink / raw)
To: David Miller; +Cc: ja, kaber, horms, jkrzyszt, netdev
Hi,
On Friday 18 May 2007 11:05, David Miller wrote:
> From: Julian Anastasov <ja@ssi.bg>
> Date: Fri, 18 May 2007 11:40:54 +0300 (EEST)
>
> > On Thu, 17 May 2007, Patrick McHardy wrote:
> > > In any case some better solution than the current one needs to be
> > > found, allowing users to send spoofed packets is far worse than
> > > using a non-desired source address for ICMP packets.
> >
> > yes, I would prefer the sysctl_ip_nonlocal_bind change to be
> > removed until such solution is found.
>
> Ok, I'll revert it.
I'm just about to publish the next round of tproxy patches (with the
routing code modifications completely removed), but this issue is still
present.
I've posted a few patches making omitting this check possible
selectively back in March. Do those changes look acceptable?
http://marc.info/?l=linux-netdev&m=117310979823297&w=3
And the related socket layer changes:
http://marc.info/?l=linux-netdev&m=117310979815374&w=3
http://marc.info/?l=linux-netdev&m=117310979902806&w=3
http://marc.info/?l=linux-netdev&m=117310980027541&w=3
--
Regards,
Krisztian Kovacs
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-30 9:38 ` KOVACS Krisztian
@ 2007-05-31 0:21 ` Julian Anastasov
2007-05-31 12:50 ` KOVACS Krisztian
0 siblings, 1 reply; 24+ messages in thread
From: Julian Anastasov @ 2007-05-31 0:21 UTC (permalink / raw)
To: KOVACS Krisztian; +Cc: David Miller, kaber, horms, jkrzyszt, netdev
Hello,
On Wed, 30 May 2007, KOVACS Krisztian wrote:
> I'm just about to publish the next round of tproxy patches (with the
> routing code modifications completely removed), but this issue is still
> present.
>
> I've posted a few patches making omitting this check possible
> selectively back in March. Do those changes look acceptable?
>
> http://marc.info/?l=linux-netdev&m=117310979823297&w=3
Just a small note for now:
The code
'if (dev_out == NULL && !(oldflp->flags & FLOWI_FLAG_TRANSPARENT))'
can lead to problem in 'fl.oif = dev_out->ifindex;' if fl4_src is not
present.
Also, i'm not sure if FLOWI_FLAG_TRANSPARENT should cause
different values for flags to be cached many times. Users without this
flag get EINVAL when fl4_src is not configured, other failures are not
cached too. And as fl4_src is considered in both cases (both kinds of
callers get same path on success) we don't need changes except in
ip_route_output_slow()? By this way I hope we can avoid any possible
forking of cache entries just by different flags.
Then we can use some more generic name, only for the flowi flag,
eg. FLOWI_FLAG_ANYSRC or something better?
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-31 0:21 ` Julian Anastasov
@ 2007-05-31 12:50 ` KOVACS Krisztian
2007-05-31 23:18 ` Julian Anastasov
0 siblings, 1 reply; 24+ messages in thread
From: KOVACS Krisztian @ 2007-05-31 12:50 UTC (permalink / raw)
To: Julian Anastasov; +Cc: David Miller, kaber, horms, jkrzyszt, netdev
Hi,
On Thursday 31 May 2007 02:21, Julian Anastasov wrote:
> > I've posted a few patches making omitting this check possible
> > selectively back in March. Do those changes look acceptable?
> >
> > http://marc.info/?l=linux-netdev&m=117310979823297&w=3
> Also, i'm not sure if FLOWI_FLAG_TRANSPARENT should cause
> different values for flags to be cached many times. Users without this
> flag get EINVAL when fl4_src is not configured, other failures are not
> cached too. And as fl4_src is considered in both cases (both kinds of
> callers get same path on success) we don't need changes except in
> ip_route_output_slow()? By this way I hope we can avoid any possible
> forking of cache entries just by different flags.
Indeed, for output it probably does not matter, I've removed the flags
check from the flow index compare routine.
> Then we can use some more generic name, only for the flowi flag,
> eg. FLOWI_FLAG_ANYSRC or something better?
You're right, _TRANSPARENT was a bad idea. I'm not very good at
choosing names.
So what about this one?
Loosen source address check on IPv4 output
From: KOVACS Krisztian <hidden@balabit.hu>
ip_route_output() contains a check to make sure that no flows with
non-local source IP addresses are routed. This obviously makes using
such addresses impossible.
This patch introduces a flowi flag which makes omitting this check
possible. The new flag provides a way of handling transparent and
non-transparent connections differently.
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
---
include/net/flow.h | 1 +
net/ipv4/route.c | 47 +++++++++++++++++++++++++----------------------
2 files changed, 26 insertions(+), 22 deletions(-)
diff --git a/include/net/flow.h b/include/net/flow.h
index f3cc1f8..1bfc0dc 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -49,6 +49,7 @@ struct flowi {
__u8 proto;
__u8 flags;
#define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01
+#define FLOWI_FLAG_ANYSRC 0x02
union {
struct {
__be16 sport;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8603cfb..88d0a79 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
/* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
dev_out = ip_dev_find(oldflp->fl4_src);
- if (dev_out == NULL)
+ if (dev_out == NULL && !(oldflp->flags & FLOWI_FLAG_ANYSRC))
goto out;
/* I removed check for oif == dev_out->oif here.
@@ -2407,29 +2407,32 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
of another iface. --ANK
*/
- if (oldflp->oif == 0
- && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
- /* Special hack: user can direct multicasts
- and limited broadcast via necessary interface
- without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
- This hack is not just for fun, it allows
- vic,vat and friends to work.
- They bind socket to loopback, set ttl to zero
- and expect that it will work.
- From the viewpoint of routing cache they are broken,
- because we are not allowed to build multicast path
- with loopback source addr (look, routing cache
- cannot know, that ttl is zero, so that packet
- will not leave this host and route is valid).
- Luckily, this hack is good workaround.
- */
+ if (dev_out) {
+ if (oldflp->oif == 0
+ && (MULTICAST(oldflp->fl4_dst)
+ || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
+ /* Special hack: user can direct multicasts
+ and limited broadcast via necessary interface
+ without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
+ This hack is not just for fun, it allows
+ vic,vat and friends to work.
+ They bind socket to loopback, set ttl to zero
+ and expect that it will work.
+ From the viewpoint of routing cache they are broken,
+ because we are not allowed to build multicast path
+ with loopback source addr (look, routing cache
+ cannot know, that ttl is zero, so that packet
+ will not leave this host and route is valid).
+ Luckily, this hack is good workaround.
+ */
+
+ fl.oif = dev_out->ifindex;
+ goto make_route;
+ }
- fl.oif = dev_out->ifindex;
- goto make_route;
- }
- if (dev_out)
dev_put(dev_out);
- dev_out = NULL;
+ dev_out = NULL;
+ }
}
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-31 12:50 ` KOVACS Krisztian
@ 2007-05-31 23:18 ` Julian Anastasov
2007-06-01 12:55 ` KOVACS Krisztian
2007-06-20 10:57 ` Balazs Scheidler
0 siblings, 2 replies; 24+ messages in thread
From: Julian Anastasov @ 2007-05-31 23:18 UTC (permalink / raw)
To: KOVACS Krisztian; +Cc: David Miller, kaber, horms, jkrzyszt, netdev
Hello,
On Thu, 31 May 2007, KOVACS Krisztian wrote:
> So what about this one?
May be we can try with better coding style. Also, this version
adds undefined behavior for using FLOWI_FLAG_ANYSRC with multicast
oldflp->fl4_dst
> Loosen source address check on IPv4 output
>
> From: KOVACS Krisztian <hidden@balabit.hu>
>
> ip_route_output() contains a check to make sure that no flows with
> non-local source IP addresses are routed. This obviously makes using
> such addresses impossible.
>
> This patch introduces a flowi flag which makes omitting this check
> possible. The new flag provides a way of handling transparent and
> non-transparent connections differently.
>
> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
> ---
>
> include/net/flow.h | 1 +
> net/ipv4/route.c | 47 +++++++++++++++++++++++++----------------------
> 2 files changed, 26 insertions(+), 22 deletions(-)
>
> diff --git a/include/net/flow.h b/include/net/flow.h
> index f3cc1f8..1bfc0dc 100644
> --- a/include/net/flow.h
> +++ b/include/net/flow.h
> @@ -49,6 +49,7 @@ struct flowi {
> __u8 proto;
> __u8 flags;
> #define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01
> +#define FLOWI_FLAG_ANYSRC 0x02
> union {
> struct {
> __be16 sport;
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 8603cfb..88d0a79 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
>
> /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
> dev_out = ip_dev_find(oldflp->fl4_src);
> - if (dev_out == NULL)
> + if (dev_out == NULL && !(oldflp->flags & FLOWI_FLAG_ANYSRC))
> goto out;
>
> /* I removed check for oif == dev_out->oif here.
> @@ -2407,29 +2407,32 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
> of another iface. --ANK
> */
>
> - if (oldflp->oif == 0
> - && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
> - /* Special hack: user can direct multicasts
> - and limited broadcast via necessary interface
> - without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
> - This hack is not just for fun, it allows
> - vic,vat and friends to work.
> - They bind socket to loopback, set ttl to zero
> - and expect that it will work.
> - From the viewpoint of routing cache they are broken,
> - because we are not allowed to build multicast path
> - with loopback source addr (look, routing cache
> - cannot know, that ttl is zero, so that packet
> - will not leave this host and route is valid).
> - Luckily, this hack is good workaround.
> - */
> + if (dev_out) {
> + if (oldflp->oif == 0
> + && (MULTICAST(oldflp->fl4_dst)
> + || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
> + /* Special hack: user can direct multicasts
> + and limited broadcast via necessary interface
> + without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
> + This hack is not just for fun, it allows
> + vic,vat and friends to work.
> + They bind socket to loopback, set ttl to zero
> + and expect that it will work.
> + From the viewpoint of routing cache they are broken,
> + because we are not allowed to build multicast path
> + with loopback source addr (look, routing cache
> + cannot know, that ttl is zero, so that packet
> + will not leave this host and route is valid).
> + Luckily, this hack is good workaround.
> + */
> +
> + fl.oif = dev_out->ifindex;
> + goto make_route;
> + }
>
> - fl.oif = dev_out->ifindex;
> - goto make_route;
> - }
> - if (dev_out)
> dev_put(dev_out);
> - dev_out = NULL;
> + dev_out = NULL;
> + }
> }
What about something like this, it even reduces checks
in the fast path. You can post new version if the following change
looks good to you and to other developers. If additional sign line is
needed here it is:
Signed-off-by: Julian Anastasov <ja@ssi.bg>
@@ -2396,8 +2396,6 @@ static int ip_route_output_slow(struct r
/* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
dev_out = ip_dev_find(oldflp->fl4_src);
- if (dev_out == NULL)
- goto out;
/* I removed check for oif == dev_out->oif here.
It was wrong for two reasons:
@@ -2409,6 +2407,8 @@ static int ip_route_output_slow(struct r
if (oldflp->oif == 0
&& (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
+ if (dev_out == NULL)
+ goto out;
/* Special hack: user can direct multicasts
and limited broadcast via necessary interface
without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
@@ -2427,9 +2427,11 @@ static int ip_route_output_slow(struct r
fl.oif = dev_out->ifindex;
goto make_route;
}
- if (dev_out)
+ if (dev_out) {
dev_put(dev_out);
- dev_out = NULL;
+ dev_out = NULL;
+ } else if (!(oldflp->flags & FLOWI_FLAG_ANYSRC))
+ goto out;
}
Or we can go further and to avoid ip_dev_find? For me, this
second variant is preferred because calling ip_dev_find() is useless for
FLOWI_FLAG_ANYSRC.
@@ -2394,11 +2394,6 @@ static int ip_route_output_slow(struct r
ZERONET(oldflp->fl4_src))
goto out;
- /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
- dev_out = ip_dev_find(oldflp->fl4_src);
- if (dev_out == NULL)
- goto out;
-
/* I removed check for oif == dev_out->oif here.
It was wrong for two reasons:
1. ip_dev_find(saddr) can return wrong iface, if saddr is
@@ -2409,6 +2404,11 @@ static int ip_route_output_slow(struct r
if (oldflp->oif == 0
&& (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
+ /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
+ dev_out = ip_dev_find(oldflp->fl4_src);
+ if (dev_out == NULL)
+ goto out;
+
/* Special hack: user can direct multicasts
and limited broadcast via necessary interface
without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
@@ -2427,9 +2427,14 @@ static int ip_route_output_slow(struct r
fl.oif = dev_out->ifindex;
goto make_route;
}
- if (dev_out)
+ if (!(oldflp->flags & FLOWI_FLAG_ANYSRC)) {
+ /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
+ dev_out = ip_dev_find(oldflp->fl4_src);
+ if (dev_out == NULL)
+ goto out;
dev_put(dev_out);
- dev_out = NULL;
+ dev_out = NULL;
+ }
}
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-31 23:18 ` Julian Anastasov
@ 2007-06-01 12:55 ` KOVACS Krisztian
2007-06-20 10:57 ` Balazs Scheidler
1 sibling, 0 replies; 24+ messages in thread
From: KOVACS Krisztian @ 2007-06-01 12:55 UTC (permalink / raw)
To: Julian Anastasov; +Cc: David Miller, kaber, horms, jkrzyszt, hidden, netdev
Hi,
On Friday 01 June 2007 01:18, Julian Anastasov wrote:
> What about something like this, it even reduces checks
> in the fast path. You can post new version if the following change
> looks good to you and to other developers. If additional sign line is
> needed here it is:
>
> Signed-off-by: Julian Anastasov <ja@ssi.bg>
>
>[...]
> Or we can go further and to avoid ip_dev_find? For me, this
> second variant is preferred because calling ip_dev_find() is useless
> for FLOWI_FLAG_ANYSRC.
You're right. Although I don't really like duplicating the ip_dev_find()
call, it's still better than the previous patch.
--
Regards,
Krisztian Kovacs
Loosen source address check on IPv4 output
ip_route_output() contains a check to make sure that no flows with
non-local source IP addresses are routed. This obviously makes using
such addresses impossible.
This patch introduces a flowi flag which makes omitting this check
possible.
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
include/net/flow.h | 1 +
net/ipv4/route.c | 20 +++++++++++++-------
2 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/include/net/flow.h b/include/net/flow.h
index f3cc1f8..1bfc0dc 100644
--- a/include/net/flow.h
+++ b/include/net/flow.h
@@ -49,6 +49,7 @@ struct flowi {
__u8 proto;
__u8 flags;
#define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01
+#define FLOWI_FLAG_ANYSRC 0x02
union {
struct {
__be16 sport;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8603cfb..4acd3de 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2394,11 +2394,6 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
ZERONET(oldflp->fl4_src))
goto out;
- /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
- dev_out = ip_dev_find(oldflp->fl4_src);
- if (dev_out == NULL)
- goto out;
-
/* I removed check for oif == dev_out->oif here.
It was wrong for two reasons:
1. ip_dev_find(saddr) can return wrong iface, if saddr is
@@ -2409,6 +2404,11 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
if (oldflp->oif == 0
&& (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
+ /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
+ dev_out = ip_dev_find(oldflp->fl4_src);
+ if (dev_out == NULL)
+ goto out;
+
/* Special hack: user can direct multicasts
and limited broadcast via necessary interface
without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
@@ -2427,9 +2427,15 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
fl.oif = dev_out->ifindex;
goto make_route;
}
- if (dev_out)
+
+ if (!(oldflp->flags & FLOWI_FLAG_ANYSRC)) {
+ /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
+ dev_out = ip_dev_find(oldflp->fl4_src);
+ if (dev_out == NULL)
+ goto out;
dev_put(dev_out);
- dev_out = NULL;
+ dev_out = NULL;
+ }
}
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-05-31 23:18 ` Julian Anastasov
2007-06-01 12:55 ` KOVACS Krisztian
@ 2007-06-20 10:57 ` Balazs Scheidler
2007-06-21 7:56 ` Julian Anastasov
1 sibling, 1 reply; 24+ messages in thread
From: Balazs Scheidler @ 2007-06-20 10:57 UTC (permalink / raw)
To: Julian Anastasov
Cc: KOVACS Krisztian, David Miller, kaber, horms, jkrzyszt, netdev
Hi,
Is there a chance that this, or a patch with similar spirit (e.g. a way
to send packets from non-local IP addresses) could be merged?
On Fri, 2007-06-01 at 02:18 +0300, Julian Anastasov wrote:
> Hello,
>
> On Thu, 31 May 2007, KOVACS Krisztian wrote:
>
> > So what about this one?
>
> May be we can try with better coding style. Also, this version
> adds undefined behavior for using FLOWI_FLAG_ANYSRC with multicast
> oldflp->fl4_dst
>
> > Loosen source address check on IPv4 output
> >
> > From: KOVACS Krisztian <hidden@balabit.hu>
> >
> > ip_route_output() contains a check to make sure that no flows with
> > non-local source IP addresses are routed. This obviously makes using
> > such addresses impossible.
> >
> > This patch introduces a flowi flag which makes omitting this check
> > possible. The new flag provides a way of handling transparent and
> > non-transparent connections differently.
> >
> > Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
> > ---
> >
> > include/net/flow.h | 1 +
> > net/ipv4/route.c | 47 +++++++++++++++++++++++++----------------------
> > 2 files changed, 26 insertions(+), 22 deletions(-)
> >
> > diff --git a/include/net/flow.h b/include/net/flow.h
> > index f3cc1f8..1bfc0dc 100644
> > --- a/include/net/flow.h
> > +++ b/include/net/flow.h
> > @@ -49,6 +49,7 @@ struct flowi {
> > __u8 proto;
> > __u8 flags;
> > #define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01
> > +#define FLOWI_FLAG_ANYSRC 0x02
> > union {
> > struct {
> > __be16 sport;
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index 8603cfb..88d0a79 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
> >
> > /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
> > dev_out = ip_dev_find(oldflp->fl4_src);
> > - if (dev_out == NULL)
> > + if (dev_out == NULL && !(oldflp->flags & FLOWI_FLAG_ANYSRC))
> > goto out;
> >
> > /* I removed check for oif == dev_out->oif here.
> > @@ -2407,29 +2407,32 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp)
> > of another iface. --ANK
> > */
> >
> > - if (oldflp->oif == 0
> > - && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
> > - /* Special hack: user can direct multicasts
> > - and limited broadcast via necessary interface
> > - without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
> > - This hack is not just for fun, it allows
> > - vic,vat and friends to work.
> > - They bind socket to loopback, set ttl to zero
> > - and expect that it will work.
> > - From the viewpoint of routing cache they are broken,
> > - because we are not allowed to build multicast path
> > - with loopback source addr (look, routing cache
> > - cannot know, that ttl is zero, so that packet
> > - will not leave this host and route is valid).
> > - Luckily, this hack is good workaround.
> > - */
> > + if (dev_out) {
> > + if (oldflp->oif == 0
> > + && (MULTICAST(oldflp->fl4_dst)
> > + || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
> > + /* Special hack: user can direct multicasts
> > + and limited broadcast via necessary interface
> > + without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
> > + This hack is not just for fun, it allows
> > + vic,vat and friends to work.
> > + They bind socket to loopback, set ttl to zero
> > + and expect that it will work.
> > + From the viewpoint of routing cache they are broken,
> > + because we are not allowed to build multicast path
> > + with loopback source addr (look, routing cache
> > + cannot know, that ttl is zero, so that packet
> > + will not leave this host and route is valid).
> > + Luckily, this hack is good workaround.
> > + */
> > +
> > + fl.oif = dev_out->ifindex;
> > + goto make_route;
> > + }
> >
> > - fl.oif = dev_out->ifindex;
> > - goto make_route;
> > - }
> > - if (dev_out)
> > dev_put(dev_out);
> > - dev_out = NULL;
> > + dev_out = NULL;
> > + }
> > }
>
> What about something like this, it even reduces checks
> in the fast path. You can post new version if the following change
> looks good to you and to other developers. If additional sign line is
> needed here it is:
>
> Signed-off-by: Julian Anastasov <ja@ssi.bg>
>
> @@ -2396,8 +2396,6 @@ static int ip_route_output_slow(struct r
>
> /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
> dev_out = ip_dev_find(oldflp->fl4_src);
> - if (dev_out == NULL)
> - goto out;
>
> /* I removed check for oif == dev_out->oif here.
> It was wrong for two reasons:
> @@ -2409,6 +2407,8 @@ static int ip_route_output_slow(struct r
>
> if (oldflp->oif == 0
> && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
> + if (dev_out == NULL)
> + goto out;
> /* Special hack: user can direct multicasts
> and limited broadcast via necessary interface
> without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
> @@ -2427,9 +2427,11 @@ static int ip_route_output_slow(struct r
> fl.oif = dev_out->ifindex;
> goto make_route;
> }
> - if (dev_out)
> + if (dev_out) {
> dev_put(dev_out);
> - dev_out = NULL;
> + dev_out = NULL;
> + } else if (!(oldflp->flags & FLOWI_FLAG_ANYSRC))
> + goto out;
> }
>
>
>
> Or we can go further and to avoid ip_dev_find? For me, this
> second variant is preferred because calling ip_dev_find() is useless for
> FLOWI_FLAG_ANYSRC.
>
> @@ -2394,11 +2394,6 @@ static int ip_route_output_slow(struct r
> ZERONET(oldflp->fl4_src))
> goto out;
>
> - /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
> - dev_out = ip_dev_find(oldflp->fl4_src);
> - if (dev_out == NULL)
> - goto out;
> -
> /* I removed check for oif == dev_out->oif here.
> It was wrong for two reasons:
> 1. ip_dev_find(saddr) can return wrong iface, if saddr is
> @@ -2409,6 +2404,11 @@ static int ip_route_output_slow(struct r
>
> if (oldflp->oif == 0
> && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) {
> + /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
> + dev_out = ip_dev_find(oldflp->fl4_src);
> + if (dev_out == NULL)
> + goto out;
> +
> /* Special hack: user can direct multicasts
> and limited broadcast via necessary interface
> without fiddling with IP_MULTICAST_IF or IP_PKTINFO.
> @@ -2427,9 +2427,14 @@ static int ip_route_output_slow(struct r
> fl.oif = dev_out->ifindex;
> goto make_route;
> }
> - if (dev_out)
> + if (!(oldflp->flags & FLOWI_FLAG_ANYSRC)) {
> + /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */
> + dev_out = ip_dev_find(oldflp->fl4_src);
> + if (dev_out == NULL)
> + goto out;
> dev_put(dev_out);
> - dev_out = NULL;
> + dev_out = NULL;
> + }
> }
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
Bazsi
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed
2007-06-20 10:57 ` Balazs Scheidler
@ 2007-06-21 7:56 ` Julian Anastasov
0 siblings, 0 replies; 24+ messages in thread
From: Julian Anastasov @ 2007-06-21 7:56 UTC (permalink / raw)
To: Balazs Scheidler
Cc: KOVACS Krisztian, David Miller, kaber, horms, jkrzyszt, netdev
Hello,
On Wed, 20 Jun 2007, Balazs Scheidler wrote:
> Is there a chance that this, or a patch with similar spirit (e.g. a way
> to send packets from non-local IP addresses) could be merged?
Last patch from Krisztian is exactly what I preferred
to see, I hope that someone really tested it :). But i'm not the one
that decides, so far we don't see comments from other interested
parties, may be there are other opinions?
Regards
--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2007-06-21 7:57 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <200704271705.l3RH5Brw026873@hera.kernel.org>
2007-05-14 10:21 ` [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed Patrick McHardy
2007-05-14 10:35 ` David Miller
2007-05-14 14:25 ` Janusz Krzysztofik
2007-05-14 14:32 ` Patrick McHardy
2007-05-14 15:49 ` Janusz Krzysztofik
2007-05-14 17:41 ` Patrick McHardy
2007-05-15 5:26 ` Simon Horman
2007-05-15 9:46 ` Janusz Krzysztofik
2007-05-15 16:11 ` Patrick McHardy
2007-05-15 23:41 ` Julian Anastasov
2007-05-17 11:25 ` Janusz Krzysztofik
2007-05-17 16:41 ` Patrick McHardy
2007-05-17 16:40 ` Patrick McHardy
2007-05-17 20:51 ` David Miller
2007-05-18 1:06 ` Simon Horman
2007-05-18 8:40 ` Julian Anastasov
2007-05-18 9:05 ` David Miller
2007-05-30 9:38 ` KOVACS Krisztian
2007-05-31 0:21 ` Julian Anastasov
2007-05-31 12:50 ` KOVACS Krisztian
2007-05-31 23:18 ` Julian Anastasov
2007-06-01 12:55 ` KOVACS Krisztian
2007-06-20 10:57 ` Balazs Scheidler
2007-06-21 7:56 ` Julian Anastasov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).