* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed [not found] <200704271705.l3RH5Brw026873@hera.kernel.org> @ 2007-05-14 10:21 ` Patrick McHardy 2007-05-14 10:35 ` David Miller 0 siblings, 1 reply; 24+ messages in thread From: Patrick McHardy @ 2007-05-14 10:21 UTC (permalink / raw) To: Simon Horman; +Cc: Linux Netdev List, Janusz Krzysztofik Linux Kernel Mailing List wrote: > Gitweb: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2d771cd86d4c3af26f34a7bcdc1b87696824cad9 > Commit: 2d771cd86d4c3af26f34a7bcdc1b87696824cad9 > > [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed > > this is a small patch by Janusz Krzysztofik to ip_route_output_slow() > that allows VIP-less LVS linux director to generate packets > originating >From VIP if sysctl_ip_nonlocal_bind is set. > > In a nutshell, the intention is for an LVS linux director to be able > to send ICMP unreachable responses to end-users when real-servers are > removed. > > http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00106.html > > Signed-off-by: Simon Horman <horms@verge.net.au> > Signed-off-by: David S. Miller <davem@davemloft.net> > --- > net/ipv4/route.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/net/ipv4/route.c b/net/ipv4/route.c > index df9fe4f..cb76e3c 100644 > --- a/net/ipv4/route.c > +++ b/net/ipv4/route.c > @@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) > > /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ > dev_out = ip_dev_find(oldflp->fl4_src); > - if (dev_out == NULL) > + if ((dev_out == NULL) && !(sysctl_ip_nonlocal_bind)) > goto out; This allows any user to send spoofed packets when ip_nonlocal_bind is set, which is a quite big change in behaviour of this option. The TPROXY patches include a similar change, but use a flag in struct flowi that requires CAP_NET_ADMIN to be set, which seems like a better idea. Alternatively you could just use input routing for non-local source addresses like ip_route_me_harder does. BTW, there doesn't even seem to be a spot where IPVS calls ip_route_output with the source address set. What exactly is this needed for? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-14 10:21 ` [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed Patrick McHardy @ 2007-05-14 10:35 ` David Miller 2007-05-14 14:25 ` Janusz Krzysztofik 0 siblings, 1 reply; 24+ messages in thread From: David Miller @ 2007-05-14 10:35 UTC (permalink / raw) To: kaber; +Cc: horms, netdev, jkrzyszt From: Patrick McHardy <kaber@trash.net> Date: Mon, 14 May 2007 12:21:34 +0200 > This allows any user to send spoofed packets when ip_nonlocal_bind > is set, which is a quite big change in behaviour of this option. > The TPROXY patches include a similar change, but use a flag in > struct flowi that requires CAP_NET_ADMIN to be set, which seems like > a better idea. Alternatively you could just use input routing for > non-local source addresses like ip_route_me_harder does. Good point. > BTW, there doesn't even seem to be a spot where IPVS calls > ip_route_output with the source address set. What exactly is this > needed for? I suppose he has a patch to make use of it, but was waiting for this route.c change to go in first. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-14 10:35 ` David Miller @ 2007-05-14 14:25 ` Janusz Krzysztofik 2007-05-14 14:32 ` Patrick McHardy 0 siblings, 1 reply; 24+ messages in thread From: Janusz Krzysztofik @ 2007-05-14 14:25 UTC (permalink / raw) To: David Miller; +Cc: kaber, horms, netdev David Miller wrote: > From: Patrick McHardy <kaber@trash.net> > Date: Mon, 14 May 2007 12:21:34 +0200 > >> This allows any user to send spoofed packets when ip_nonlocal_bind >> is set, which is a quite big change in behaviour of this option. >> The TPROXY patches include a similar change, but use a flag in >> struct flowi that requires CAP_NET_ADMIN to be set, which seems like >> a better idea. Alternatively you could just use input routing for >> non-local source addresses like ip_route_me_harder does. > > Good point. > >> BTW, there doesn't even seem to be a spot where IPVS calls >> ip_route_output with the source address set. What exactly is this >> needed for? > > I suppose he has a patch to make use of it, but was waiting > for this route.c change to go in first. If you mean me, the answer is no, I do not have any patch making use of the change in question. What I have is rather a complicated method of notifying udp clients on communication problems before they are redirected to a new real server. My method needs some IPVS related patches, but ICMP port unreachable messages are not generated inside IPVS code, they are just sent, with help of the patch in question, from udp_input() or netfilter REJECT. It was my first intention to patch IPVS to send these messages, but I found no simple way to implement this in the current IPVS code. Janusz ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-14 14:25 ` Janusz Krzysztofik @ 2007-05-14 14:32 ` Patrick McHardy 2007-05-14 15:49 ` Janusz Krzysztofik 0 siblings, 1 reply; 24+ messages in thread From: Patrick McHardy @ 2007-05-14 14:32 UTC (permalink / raw) To: Janusz Krzysztofik; +Cc: David Miller, horms, netdev Janusz Krzysztofik wrote: > David Miller wrote: > >>> BTW, there doesn't even seem to be a spot where IPVS calls >>> ip_route_output with the source address set. What exactly is this >>> needed for? >> >> >> I suppose he has a patch to make use of it, but was waiting >> for this route.c change to go in first. > > > If you mean me, the answer is no, I do not have any patch making use of > the change in question. What I have is rather a complicated method of > notifying udp clients on communication problems before they are > redirected to a new real server. My method needs some IPVS related > patches, but ICMP port unreachable messages are not generated inside > IPVS code, they are just sent, with help of the patch in question, from > udp_input() or netfilter REJECT. Both use icmp_send(), which should always pick a local source, so I don't understand why this change was needed. Could you describe the specific case when the packet generated by icmp_send() does not have a local source? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-14 14:32 ` Patrick McHardy @ 2007-05-14 15:49 ` Janusz Krzysztofik 2007-05-14 17:41 ` Patrick McHardy 0 siblings, 1 reply; 24+ messages in thread From: Janusz Krzysztofik @ 2007-05-14 15:49 UTC (permalink / raw) To: Patrick McHardy; +Cc: David Miller, horms, netdev Patrick McHardy wrote: > Janusz Krzysztofik wrote: >> ... ICMP port unreachable messages are not generated inside >> IPVS code, they are just sent, with help of the patch in question, from >> udp_input() or netfilter REJECT. > > Both use icmp_send(), which should always pick a local source, so I > don't understand why this change was needed. Could you describe > the specific case when the packet generated by icmp_send() does > not have a local source? Yes, it happens when a packet with a non-local destination IP address is routed localy in order to reach ip_vs_in(), but is not catched there because of no associated connection and no matching service, so it is passed through and ends up in udp_input(). Then, inside udp_input(), icmp_send() is invoked with original non-local destination IP as source address. Again, all this is my own method, usnig special packet marking, of notifying clients of dead real servers, that is not possible with "pure" LVS methods. More details can be found several paragraphs below http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.LVS-NAT.html#F5_snat header. Janusz ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-14 15:49 ` Janusz Krzysztofik @ 2007-05-14 17:41 ` Patrick McHardy 2007-05-15 5:26 ` Simon Horman 0 siblings, 1 reply; 24+ messages in thread From: Patrick McHardy @ 2007-05-14 17:41 UTC (permalink / raw) To: Janusz Krzysztofik; +Cc: David Miller, horms, netdev [-- Attachment #1: Type: text/plain, Size: 1379 bytes --] Janusz Krzysztofik wrote: > Patrick McHardy wrote: > >> Janusz Krzysztofik wrote: >> >>> ... ICMP port unreachable messages are not generated inside >>> IPVS code, they are just sent, with help of the patch in question, from >>> udp_input() or netfilter REJECT. >> >> >> Both use icmp_send(), which should always pick a local source, so I >> don't understand why this change was needed. Could you describe >> the specific case when the packet generated by icmp_send() does >> not have a local source? > > > Yes, it happens when a packet with a non-local destination IP address is > routed localy in order to reach ip_vs_in(), but is not catched there > because of no associated connection and no matching service, so it is > passed through and ends up in udp_input(). Then, inside udp_input(), > icmp_send() is invoked with original non-local destination IP as source > address. So you're adding a local route for non-local destination and the address selection in icmp_send() uses the original destination address as source because the route has RTCF_LOCAL set, resulting in an error in ip_route_output_slow(). If thats correct than this patch should also work, it changes icmp_send() to check if the original destination address is non-local when deciding whether to pick a new address (and reverts the routing changes). Signed-off-by: Patrick McHardy <kaber@trash.net> [-- Attachment #2: x --] [-- Type: text/plain, Size: 1310 bytes --] diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index d38cbba..b964863 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -513,7 +513,7 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info) */ saddr = iph->daddr; - if (!(rt->rt_flags & RTCF_LOCAL)) { + if (inet_addr_type(saddr) != RTN_LOCAL) { if (sysctl_icmp_errors_use_inbound_ifaddr) saddr = inet_select_addr(skb_in->dev, 0, RT_SCOPE_LINK); else diff --git a/net/ipv4/route.c b/net/ipv4/route.c index cb76e3c..df9fe4f 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ dev_out = ip_dev_find(oldflp->fl4_src); - if ((dev_out == NULL) && !(sysctl_ip_nonlocal_bind)) + if (dev_out == NULL) goto out; /* I removed check for oif == dev_out->oif here. @@ -2407,7 +2407,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) of another iface. --ANK */ - if (dev_out && oldflp->oif == 0 + if (oldflp->oif == 0 && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { /* Special hack: user can direct multicasts and limited broadcast via necessary interface ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-14 17:41 ` Patrick McHardy @ 2007-05-15 5:26 ` Simon Horman 2007-05-15 9:46 ` Janusz Krzysztofik 2007-05-15 16:11 ` Patrick McHardy 0 siblings, 2 replies; 24+ messages in thread From: Simon Horman @ 2007-05-15 5:26 UTC (permalink / raw) To: Patrick McHardy Cc: Janusz Krzysztofik, David Miller, netdev, Julian Anastasov On Mon, May 14, 2007 at 07:41:48PM +0200, Patrick McHardy wrote: > Janusz Krzysztofik wrote: > > Patrick McHardy wrote: > > > >> Janusz Krzysztofik wrote: > >> > >>> ... ICMP port unreachable messages are not generated inside > >>> IPVS code, they are just sent, with help of the patch in question, from > >>> udp_input() or netfilter REJECT. > >> > >> > >> Both use icmp_send(), which should always pick a local source, so I > >> don't understand why this change was needed. Could you describe > >> the specific case when the packet generated by icmp_send() does > >> not have a local source? > > > > > > Yes, it happens when a packet with a non-local destination IP address is > > routed localy in order to reach ip_vs_in(), but is not catched there > > because of no associated connection and no matching service, so it is > > passed through and ends up in udp_input(). Then, inside udp_input(), > > icmp_send() is invoked with original non-local destination IP as source > > address. > > > So you're adding a local route for non-local destination and the > address selection in icmp_send() uses the original destination > address as source because the route has RTCF_LOCAL set, resulting > in an error in ip_route_output_slow(). I'm not entirely sure that "adding a local route" is the right terminology, but then again, perhaps I'm missunderstanding exactly what that means. My undersanding of the problem is that IPVS likes to send icmp to notify end-users when real-servers are down. The source ip of such icmp is the VIP, that is the IP address associated with the virtual service. However, it is quite valid for this VIP not to be configured on the machine that is running IPVS. Thus the machine in question wants to send icmp packets with a non-local source address. http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00109.html I think that your patch looks good, assuming that inet_addr_type(VIP) is going to return RTN_LOCAL (except in the unlikely case that VIP is multicast or something silly like that. However, I wonder if efficiency or safety reasons it might be better for IPVS to pass some sort of OK_ITS_SUPPSED_TO_BE_NON_LOCAL flag into ip_route(). Just a thought. > > If thats correct than this patch should also work, it changes > icmp_send() to check if the original destination address is > non-local when deciding whether to pick a new address (and > reverts the routing changes). > > Signed-off-by: Patrick McHardy <kaber@trash.net> > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c > index d38cbba..b964863 100644 > --- a/net/ipv4/icmp.c > +++ b/net/ipv4/icmp.c > @@ -513,7 +513,7 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info) > */ > > saddr = iph->daddr; > - if (!(rt->rt_flags & RTCF_LOCAL)) { > + if (inet_addr_type(saddr) != RTN_LOCAL) { > if (sysctl_icmp_errors_use_inbound_ifaddr) > saddr = inet_select_addr(skb_in->dev, 0, RT_SCOPE_LINK); > else > diff --git a/net/ipv4/route.c b/net/ipv4/route.c > index cb76e3c..df9fe4f 100644 > --- a/net/ipv4/route.c > +++ b/net/ipv4/route.c > @@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) > > /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ > dev_out = ip_dev_find(oldflp->fl4_src); > - if ((dev_out == NULL) && !(sysctl_ip_nonlocal_bind)) > + if (dev_out == NULL) > goto out; > > /* I removed check for oif == dev_out->oif here. > @@ -2407,7 +2407,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) > of another iface. --ANK > */ > > - if (dev_out && oldflp->oif == 0 > + if (oldflp->oif == 0 > && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { > /* Special hack: user can direct multicasts > and limited broadcast via necessary interface -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-15 5:26 ` Simon Horman @ 2007-05-15 9:46 ` Janusz Krzysztofik 2007-05-15 16:11 ` Patrick McHardy 1 sibling, 0 replies; 24+ messages in thread From: Janusz Krzysztofik @ 2007-05-15 9:46 UTC (permalink / raw) To: Simon Horman; +Cc: Patrick McHardy, David Miller, netdev, Julian Anastasov Simon Horman wrote: > On Mon, May 14, 2007 at 07:41:48PM +0200, Patrick McHardy wrote: >> So you're adding a local route for non-local destination and the >> address selection in icmp_send() uses the original destination >> address as source because the route has RTCF_LOCAL set, resulting >> in an error in ip_route_output_slow(). > > I'm not entirely sure that "adding a local route" is the right > terminology, but then again, perhaps I'm missunderstanding exactly > what that means. What I do exactly is: ip rule add prio 1000 fwmark $IF_MARK_LVS lookup lvs ip route replace table lvs local default dev lo > > My undersanding of the problem is that IPVS likes to send icmp to notify > end-users when real-servers are down. Yes, there is one such place in IPVS code too, inside ip_vs_leave(), used for notifying clients on service overload. > The source ip of such icmp is the > VIP, that is the IP address associated with the virtual service. > However, it is quite valid for this VIP not to be configured on the > machine that is running IPVS. Thus the machine in question wants to send > icmp packets with a non-local source address. > > http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00109.html > >> If thats correct than this patch should also work, it changes >> icmp_send() to check if the original destination address is >> non-local when deciding whether to pick a new address (and >> reverts the routing changes). > I think that your patch looks good, assuming that inet_addr_type(VIP) > is going to return RTN_LOCAL (except in the unlikely case that VIP is > multicast or something silly like that. For now, I have no place other than my production firewall cluster to verify this patch. I will do it as soon as possible and give you some feedback. > > However, I wonder if efficiency or safety reasons it might > be better for IPVS to pass some sort of OK_ITS_SUPPSED_TO_BE_NON_LOCAL > flag into ip_route(). Do you mean packets that are passed through ip_vs_in()?. If not, please remember that current IPVS code does not send any ICMP port unreachable messages except for this rare overload case. I still have no idea how to solve more common problem of notifying clients on dead real server inside the IPVS code itself, to avoid my complicated tricks of marking based on connection tracking. On the other hand, I have to state that even if I can now send notifications to clients using my method, this does not solve my real problem of broken ipsec connections going through LVS director. Openswan clients I use do not care about ICMP port unreachable messages an insist on using connections that are invalid due to switched real server. So maybe we should first verify if there are any real cases when notifying udp clients with ICMP port unreachable may be realy usefull and then decide if we do need this functionality. Janusz P.S. Simon, sorry for duplicated message. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-15 5:26 ` Simon Horman 2007-05-15 9:46 ` Janusz Krzysztofik @ 2007-05-15 16:11 ` Patrick McHardy 2007-05-15 23:41 ` Julian Anastasov 1 sibling, 1 reply; 24+ messages in thread From: Patrick McHardy @ 2007-05-15 16:11 UTC (permalink / raw) To: Simon Horman; +Cc: Janusz Krzysztofik, David Miller, netdev, Julian Anastasov Simon Horman wrote: > On Mon, May 14, 2007 at 07:41:48PM +0200, Patrick McHardy wrote: > >>So you're adding a local route for non-local destination and the >>address selection in icmp_send() uses the original destination >>address as source because the route has RTCF_LOCAL set, resulting >>in an error in ip_route_output_slow(). > > > I'm not entirely sure that "adding a local route" is the right > terminology, but then again, perhaps I'm missunderstanding exactly > what that means. It means adding a route to the local table, which causes the resulting dst_entry to be marked with RTCF_LOCAL. > My undersanding of the problem is that IPVS likes to send icmp to notify > end-users when real-servers are down. The source ip of such icmp is the > VIP, that is the IP address associated with the virtual service. > However, it is quite valid for this VIP not to be configured on the > machine that is running IPVS. Thus the machine in question wants to send > icmp packets with a non-local source address. > > http://archive.linuxvirtualserver.org/html/lvs-users/2007-01/msg00109.html > > I think that your patch looks good, assuming that inet_addr_type(VIP) > is going to return RTN_LOCAL (except in the unlikely case that VIP is > multicast or something silly like that. I'm not familiar with the IPVS terms, but as far as I understand, it is _not_ going to return RTN_LOCAL, so we get the desired behaviour of selecting a local address as source. > However, I wonder if efficiency or safety reasons it might > be better for IPVS to pass some sort of OK_ITS_SUPPSED_TO_BE_NON_LOCAL > flag into ip_route(). > > Just a thought. I'm not too thrilled about adding a route flag when it really is ICMP address selection that is the problem here. The patch should be completely safe since multicast and broadcast packets are already filtered out earlier and the RTN_LOCAL test matches exactly what ip_route_output_slow does. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-15 16:11 ` Patrick McHardy @ 2007-05-15 23:41 ` Julian Anastasov 2007-05-17 11:25 ` Janusz Krzysztofik 2007-05-17 16:40 ` Patrick McHardy 0 siblings, 2 replies; 24+ messages in thread From: Julian Anastasov @ 2007-05-15 23:41 UTC (permalink / raw) To: Patrick McHardy; +Cc: Simon Horman, Janusz Krzysztofik, David Miller, netdev Hello, On Tue, 15 May 2007, Patrick McHardy wrote: > Simon Horman wrote: > > On Mon, May 14, 2007 at 07:41:48PM +0200, Patrick McHardy wrote: > > > >>So you're adding a local route for non-local destination and the > >>address selection in icmp_send() uses the original destination > >>address as source because the route has RTCF_LOCAL set, resulting > >>in an error in ip_route_output_slow(). > > > > I'm not entirely sure that "adding a local route" is the right > > terminology, but then again, perhaps I'm missunderstanding exactly > > what that means. > > It means adding a route to the local table, which causes the > resulting dst_entry to be marked with RTCF_LOCAL. IPVS users add local route to user-defined (not "local") routing table to deliver locally only selected traffic, eg. by fwmark. The use is just like for transparent proxy, deliver port X for VIP locally, other traffic for VIP is forwarded (hits unicast route). And it seems the idea is ICMP still to work for such scenarios, icmp_send should know that this is authorized operation that can avoid the source address check (saddr=VIP) in ip_route_output_slow. If icmp_send is changed to use inet_addr_type() then ICMP will leave with saddr != VIP and that is not nice. I think, icmp_send is fine as is, the main problem is how to differentiate the ip_route_output_slow users to ones that need the check for valid source address and others (eg. NAT) that expect and allow source address to be non-local. It is interesting, what happens when some NAT rule maps multiple internal addresses (one to one) to multiple non-local public addresses, may be if packet from world is rejected we don't send ICMP from the right public address? For example, if we are NAT router with such rules: internal 192.168.0.1 is mapped to public 10.0.0.1 internal 192.168.0.2 is mapped to public 10.0.0.2 ... where 10.0.0.0/16 are not configured as local IPs on the NAT router, eg. when we don't want to add thousands of IPs. World sends to 10.0.0.1 but the NAT router wants to send ICMP with saddr=10.0.0.1 which is not local IP. So, the rule here is that NAT allows ICMP to use 10.0.0.0/16 as source address, even if such IPs are not configured. Of course, one may use ip addr add 10.0.0.0/16 dev lo for such case and may be all such examples can be solved somehow, i simply didn't tried such setups. To summarize, what can help is a flag (eg. RT_ANYSRC) to ip_route_output* that all special users can provide to skip the check, for example: - RTCF_LOCAL packets in icmp_send() can avoid the check - NAT can avoid the check (ip_route_me_harder can be simplified?) Currently, all callers use the check, so may be the goal can be to start with small set of callers that can set the new flag. It looks like we can save some CPU cycles too, ip_route_me_harder looks too overloaded. > > I think that your patch looks good, assuming that inet_addr_type(VIP) > > is going to return RTN_LOCAL (except in the unlikely case that VIP is > > multicast or something silly like that. > > I'm not familiar with the IPVS terms, but as far as I understand, > it is _not_ going to return RTN_LOCAL, so we get the desired > behaviour of selecting a local address as source. But what is preferred is to use VIP in ICMP. ip route add local VIP dev lo table user_defined returns RTCF_LOCAL but inet_addr_type() does not return RTN_LOCAL, we fix one thing but break another :) Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-15 23:41 ` Julian Anastasov @ 2007-05-17 11:25 ` Janusz Krzysztofik 2007-05-17 16:41 ` Patrick McHardy 2007-05-17 16:40 ` Patrick McHardy 1 sibling, 1 reply; 24+ messages in thread From: Janusz Krzysztofik @ 2007-05-17 11:25 UTC (permalink / raw) To: Julian Anastasov; +Cc: Patrick McHardy, Simon Horman, David Miller, netdev Julian Anastasov wrote: > If icmp_send is changed to use inet_addr_type() then ICMP will leave > with saddr != VIP and that is not nice. > ... >> I'm not familiar with the IPVS terms, but as far as I understand, >> it is _not_ going to return RTN_LOCAL, so we get the desired >> behaviour of selecting a local address as source. > > But what is preferred is to use VIP in ICMP. > > ip route add local VIP dev lo table user_defined > > returns RTCF_LOCAL but inet_addr_type() does not return RTN_LOCAL, > we fix one thing but break another :) Well, I have promissed you to give some feedback after I test the patch proposed by Patrick, but after Julian's post I can only confirm that it works exactly as Julian said, what is not what I would expect. Julian, thank you for your detailed explanation of the issue, I have nothing more to add. Janusz ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-17 11:25 ` Janusz Krzysztofik @ 2007-05-17 16:41 ` Patrick McHardy 0 siblings, 0 replies; 24+ messages in thread From: Patrick McHardy @ 2007-05-17 16:41 UTC (permalink / raw) To: Janusz Krzysztofik; +Cc: Julian Anastasov, Simon Horman, David Miller, netdev Janusz Krzysztofik wrote: > Julian Anastasov wrote: > >> If icmp_send is changed to use inet_addr_type() then ICMP will leave >> with saddr != VIP and that is not nice. >> ... >> >>> I'm not familiar with the IPVS terms, but as far as I understand, >>> it is _not_ going to return RTN_LOCAL, so we get the desired >>> behaviour of selecting a local address as source. >> >> >> But what is preferred is to use VIP in ICMP. >> >> ip route add local VIP dev lo table user_defined >> >> returns RTCF_LOCAL but inet_addr_type() does not return RTN_LOCAL, >> we fix one thing but break another :) > > > Well, I have promissed you to give some feedback after I test the patch > proposed by Patrick, but after Julian's post I can only confirm that it > works exactly as Julian said, what is not what I would expect. It will pick a local source address and use that. I don't see why it matters whether its the VIP address or some other. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-15 23:41 ` Julian Anastasov 2007-05-17 11:25 ` Janusz Krzysztofik @ 2007-05-17 16:40 ` Patrick McHardy 2007-05-17 20:51 ` David Miller ` (2 more replies) 1 sibling, 3 replies; 24+ messages in thread From: Patrick McHardy @ 2007-05-17 16:40 UTC (permalink / raw) To: Julian Anastasov; +Cc: Simon Horman, Janusz Krzysztofik, David Miller, netdev Julian Anastasov wrote: > To summarize, what can help is a flag (eg. RT_ANYSRC) to > ip_route_output* that all special users can provide to skip the > check, for example: > - RTCF_LOCAL packets in icmp_send() can avoid the check > - NAT can avoid the check (ip_route_me_harder can be simplified?) We want to be able to use iif in rules, so ip_route_me_harder still needs to use ip_route_input(). > Currently, all callers use the check, so may be the goal can be > to start with small set of callers that can set the new flag. It looks > like we can save some CPU cycles too, ip_route_me_harder looks too > overloaded. > > >>>I think that your patch looks good, assuming that inet_addr_type(VIP) >>>is going to return RTN_LOCAL (except in the unlikely case that VIP is >>>multicast or something silly like that. >> >>I'm not familiar with the IPVS terms, but as far as I understand, >>it is _not_ going to return RTN_LOCAL, so we get the desired >>behaviour of selecting a local address as source. > > > But what is preferred is to use VIP in ICMP. > > ip route add local VIP dev lo table user_defined > > returns RTCF_LOCAL but inet_addr_type() does not return RTN_LOCAL, > we fix one thing but break another :) Actually thats exactly the case that my patch handles. Why does it matter which source address the ICMP packet uses, as long as its routed properly? In any case some better solution than the current one needs to be found, allowing users to send spoofed packets is far worse than using a non-desired source address for ICMP packets. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-17 16:40 ` Patrick McHardy @ 2007-05-17 20:51 ` David Miller 2007-05-18 1:06 ` Simon Horman 2007-05-18 8:40 ` Julian Anastasov 2 siblings, 0 replies; 24+ messages in thread From: David Miller @ 2007-05-17 20:51 UTC (permalink / raw) To: kaber; +Cc: ja, horms, jkrzyszt, netdev From: Patrick McHardy <kaber@trash.net> Date: Thu, 17 May 2007 18:40:28 +0200 > In any case some better solution than the current one needs to be > found, allowing users to send spoofed packets is far worse than > using a non-desired source address for ICMP packets. Agreed, but it only occurs if the nonlocal bind sysctl is enabled and almost nobody turns that thing on :-) ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-17 16:40 ` Patrick McHardy 2007-05-17 20:51 ` David Miller @ 2007-05-18 1:06 ` Simon Horman 2007-05-18 8:40 ` Julian Anastasov 2 siblings, 0 replies; 24+ messages in thread From: Simon Horman @ 2007-05-18 1:06 UTC (permalink / raw) To: Patrick McHardy Cc: Julian Anastasov, Janusz Krzysztofik, David Miller, netdev On Thu, May 17, 2007 at 06:40:28PM +0200, Patrick McHardy wrote: > > Actually thats exactly the case that my patch handles. Why does it > matter which source address the ICMP packet uses, as long as its > routed properly? Agreed. > In any case some better solution than the current one needs to be > found, allowing users to send spoofed packets is far worse than > using a non-desired source address for ICMP packets. No agrument there. -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-17 16:40 ` Patrick McHardy 2007-05-17 20:51 ` David Miller 2007-05-18 1:06 ` Simon Horman @ 2007-05-18 8:40 ` Julian Anastasov 2007-05-18 9:05 ` David Miller 2 siblings, 1 reply; 24+ messages in thread From: Julian Anastasov @ 2007-05-18 8:40 UTC (permalink / raw) To: Patrick McHardy; +Cc: Simon Horman, Janusz Krzysztofik, David Miller, netdev Hello, On Thu, 17 May 2007, Patrick McHardy wrote: > > But what is preferred is to use VIP in ICMP. > > > > ip route add local VIP dev lo table user_defined > > > > returns RTCF_LOCAL but inet_addr_type() does not return RTN_LOCAL, > > we fix one thing but break another :) > > > Actually thats exactly the case that my patch handles. Why does it > matter which source address the ICMP packet uses, as long as its > routed properly? It should work for most of the cases but it can cause problems in closely connected hosts where using the right subnet matters. If inet_addr_type is not considered slow for routers and this local route justifies it then i have no more objections. May be Janusz should test it first without sysctl_ip_nonlocal_bind change. > In any case some better solution than the current one needs to be > found, allowing users to send spoofed packets is far worse than > using a non-desired source address for ICMP packets. yes, I would prefer the sysctl_ip_nonlocal_bind change to be removed until such solution is found. Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-18 8:40 ` Julian Anastasov @ 2007-05-18 9:05 ` David Miller 2007-05-30 9:38 ` KOVACS Krisztian 0 siblings, 1 reply; 24+ messages in thread From: David Miller @ 2007-05-18 9:05 UTC (permalink / raw) To: ja; +Cc: kaber, horms, jkrzyszt, netdev From: Julian Anastasov <ja@ssi.bg> Date: Fri, 18 May 2007 11:40:54 +0300 (EEST) > On Thu, 17 May 2007, Patrick McHardy wrote: > > > In any case some better solution than the current one needs to be > > found, allowing users to send spoofed packets is far worse than > > using a non-desired source address for ICMP packets. > > yes, I would prefer the sysctl_ip_nonlocal_bind change to be > removed until such solution is found. Ok, I'll revert it. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-18 9:05 ` David Miller @ 2007-05-30 9:38 ` KOVACS Krisztian 2007-05-31 0:21 ` Julian Anastasov 0 siblings, 1 reply; 24+ messages in thread From: KOVACS Krisztian @ 2007-05-30 9:38 UTC (permalink / raw) To: David Miller; +Cc: ja, kaber, horms, jkrzyszt, netdev Hi, On Friday 18 May 2007 11:05, David Miller wrote: > From: Julian Anastasov <ja@ssi.bg> > Date: Fri, 18 May 2007 11:40:54 +0300 (EEST) > > > On Thu, 17 May 2007, Patrick McHardy wrote: > > > In any case some better solution than the current one needs to be > > > found, allowing users to send spoofed packets is far worse than > > > using a non-desired source address for ICMP packets. > > > > yes, I would prefer the sysctl_ip_nonlocal_bind change to be > > removed until such solution is found. > > Ok, I'll revert it. I'm just about to publish the next round of tproxy patches (with the routing code modifications completely removed), but this issue is still present. I've posted a few patches making omitting this check possible selectively back in March. Do those changes look acceptable? http://marc.info/?l=linux-netdev&m=117310979823297&w=3 And the related socket layer changes: http://marc.info/?l=linux-netdev&m=117310979815374&w=3 http://marc.info/?l=linux-netdev&m=117310979902806&w=3 http://marc.info/?l=linux-netdev&m=117310980027541&w=3 -- Regards, Krisztian Kovacs ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-30 9:38 ` KOVACS Krisztian @ 2007-05-31 0:21 ` Julian Anastasov 2007-05-31 12:50 ` KOVACS Krisztian 0 siblings, 1 reply; 24+ messages in thread From: Julian Anastasov @ 2007-05-31 0:21 UTC (permalink / raw) To: KOVACS Krisztian; +Cc: David Miller, kaber, horms, jkrzyszt, netdev Hello, On Wed, 30 May 2007, KOVACS Krisztian wrote: > I'm just about to publish the next round of tproxy patches (with the > routing code modifications completely removed), but this issue is still > present. > > I've posted a few patches making omitting this check possible > selectively back in March. Do those changes look acceptable? > > http://marc.info/?l=linux-netdev&m=117310979823297&w=3 Just a small note for now: The code 'if (dev_out == NULL && !(oldflp->flags & FLOWI_FLAG_TRANSPARENT))' can lead to problem in 'fl.oif = dev_out->ifindex;' if fl4_src is not present. Also, i'm not sure if FLOWI_FLAG_TRANSPARENT should cause different values for flags to be cached many times. Users without this flag get EINVAL when fl4_src is not configured, other failures are not cached too. And as fl4_src is considered in both cases (both kinds of callers get same path on success) we don't need changes except in ip_route_output_slow()? By this way I hope we can avoid any possible forking of cache entries just by different flags. Then we can use some more generic name, only for the flowi flag, eg. FLOWI_FLAG_ANYSRC or something better? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-31 0:21 ` Julian Anastasov @ 2007-05-31 12:50 ` KOVACS Krisztian 2007-05-31 23:18 ` Julian Anastasov 0 siblings, 1 reply; 24+ messages in thread From: KOVACS Krisztian @ 2007-05-31 12:50 UTC (permalink / raw) To: Julian Anastasov; +Cc: David Miller, kaber, horms, jkrzyszt, netdev Hi, On Thursday 31 May 2007 02:21, Julian Anastasov wrote: > > I've posted a few patches making omitting this check possible > > selectively back in March. Do those changes look acceptable? > > > > http://marc.info/?l=linux-netdev&m=117310979823297&w=3 > Also, i'm not sure if FLOWI_FLAG_TRANSPARENT should cause > different values for flags to be cached many times. Users without this > flag get EINVAL when fl4_src is not configured, other failures are not > cached too. And as fl4_src is considered in both cases (both kinds of > callers get same path on success) we don't need changes except in > ip_route_output_slow()? By this way I hope we can avoid any possible > forking of cache entries just by different flags. Indeed, for output it probably does not matter, I've removed the flags check from the flow index compare routine. > Then we can use some more generic name, only for the flowi flag, > eg. FLOWI_FLAG_ANYSRC or something better? You're right, _TRANSPARENT was a bad idea. I'm not very good at choosing names. So what about this one? Loosen source address check on IPv4 output From: KOVACS Krisztian <hidden@balabit.hu> ip_route_output() contains a check to make sure that no flows with non-local source IP addresses are routed. This obviously makes using such addresses impossible. This patch introduces a flowi flag which makes omitting this check possible. The new flag provides a way of handling transparent and non-transparent connections differently. Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> --- include/net/flow.h | 1 + net/ipv4/route.c | 47 +++++++++++++++++++++++++---------------------- 2 files changed, 26 insertions(+), 22 deletions(-) diff --git a/include/net/flow.h b/include/net/flow.h index f3cc1f8..1bfc0dc 100644 --- a/include/net/flow.h +++ b/include/net/flow.h @@ -49,6 +49,7 @@ struct flowi { __u8 proto; __u8 flags; #define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01 +#define FLOWI_FLAG_ANYSRC 0x02 union { struct { __be16 sport; diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 8603cfb..88d0a79 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ dev_out = ip_dev_find(oldflp->fl4_src); - if (dev_out == NULL) + if (dev_out == NULL && !(oldflp->flags & FLOWI_FLAG_ANYSRC)) goto out; /* I removed check for oif == dev_out->oif here. @@ -2407,29 +2407,32 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) of another iface. --ANK */ - if (oldflp->oif == 0 - && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { - /* Special hack: user can direct multicasts - and limited broadcast via necessary interface - without fiddling with IP_MULTICAST_IF or IP_PKTINFO. - This hack is not just for fun, it allows - vic,vat and friends to work. - They bind socket to loopback, set ttl to zero - and expect that it will work. - From the viewpoint of routing cache they are broken, - because we are not allowed to build multicast path - with loopback source addr (look, routing cache - cannot know, that ttl is zero, so that packet - will not leave this host and route is valid). - Luckily, this hack is good workaround. - */ + if (dev_out) { + if (oldflp->oif == 0 + && (MULTICAST(oldflp->fl4_dst) + || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { + /* Special hack: user can direct multicasts + and limited broadcast via necessary interface + without fiddling with IP_MULTICAST_IF or IP_PKTINFO. + This hack is not just for fun, it allows + vic,vat and friends to work. + They bind socket to loopback, set ttl to zero + and expect that it will work. + From the viewpoint of routing cache they are broken, + because we are not allowed to build multicast path + with loopback source addr (look, routing cache + cannot know, that ttl is zero, so that packet + will not leave this host and route is valid). + Luckily, this hack is good workaround. + */ + + fl.oif = dev_out->ifindex; + goto make_route; + } - fl.oif = dev_out->ifindex; - goto make_route; - } - if (dev_out) dev_put(dev_out); - dev_out = NULL; + dev_out = NULL; + } } ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-31 12:50 ` KOVACS Krisztian @ 2007-05-31 23:18 ` Julian Anastasov 2007-06-01 12:55 ` KOVACS Krisztian 2007-06-20 10:57 ` Balazs Scheidler 0 siblings, 2 replies; 24+ messages in thread From: Julian Anastasov @ 2007-05-31 23:18 UTC (permalink / raw) To: KOVACS Krisztian; +Cc: David Miller, kaber, horms, jkrzyszt, netdev Hello, On Thu, 31 May 2007, KOVACS Krisztian wrote: > So what about this one? May be we can try with better coding style. Also, this version adds undefined behavior for using FLOWI_FLAG_ANYSRC with multicast oldflp->fl4_dst > Loosen source address check on IPv4 output > > From: KOVACS Krisztian <hidden@balabit.hu> > > ip_route_output() contains a check to make sure that no flows with > non-local source IP addresses are routed. This obviously makes using > such addresses impossible. > > This patch introduces a flowi flag which makes omitting this check > possible. The new flag provides a way of handling transparent and > non-transparent connections differently. > > Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> > --- > > include/net/flow.h | 1 + > net/ipv4/route.c | 47 +++++++++++++++++++++++++---------------------- > 2 files changed, 26 insertions(+), 22 deletions(-) > > diff --git a/include/net/flow.h b/include/net/flow.h > index f3cc1f8..1bfc0dc 100644 > --- a/include/net/flow.h > +++ b/include/net/flow.h > @@ -49,6 +49,7 @@ struct flowi { > __u8 proto; > __u8 flags; > #define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01 > +#define FLOWI_FLAG_ANYSRC 0x02 > union { > struct { > __be16 sport; > diff --git a/net/ipv4/route.c b/net/ipv4/route.c > index 8603cfb..88d0a79 100644 > --- a/net/ipv4/route.c > +++ b/net/ipv4/route.c > @@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) > > /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ > dev_out = ip_dev_find(oldflp->fl4_src); > - if (dev_out == NULL) > + if (dev_out == NULL && !(oldflp->flags & FLOWI_FLAG_ANYSRC)) > goto out; > > /* I removed check for oif == dev_out->oif here. > @@ -2407,29 +2407,32 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) > of another iface. --ANK > */ > > - if (oldflp->oif == 0 > - && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { > - /* Special hack: user can direct multicasts > - and limited broadcast via necessary interface > - without fiddling with IP_MULTICAST_IF or IP_PKTINFO. > - This hack is not just for fun, it allows > - vic,vat and friends to work. > - They bind socket to loopback, set ttl to zero > - and expect that it will work. > - From the viewpoint of routing cache they are broken, > - because we are not allowed to build multicast path > - with loopback source addr (look, routing cache > - cannot know, that ttl is zero, so that packet > - will not leave this host and route is valid). > - Luckily, this hack is good workaround. > - */ > + if (dev_out) { > + if (oldflp->oif == 0 > + && (MULTICAST(oldflp->fl4_dst) > + || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { > + /* Special hack: user can direct multicasts > + and limited broadcast via necessary interface > + without fiddling with IP_MULTICAST_IF or IP_PKTINFO. > + This hack is not just for fun, it allows > + vic,vat and friends to work. > + They bind socket to loopback, set ttl to zero > + and expect that it will work. > + From the viewpoint of routing cache they are broken, > + because we are not allowed to build multicast path > + with loopback source addr (look, routing cache > + cannot know, that ttl is zero, so that packet > + will not leave this host and route is valid). > + Luckily, this hack is good workaround. > + */ > + > + fl.oif = dev_out->ifindex; > + goto make_route; > + } > > - fl.oif = dev_out->ifindex; > - goto make_route; > - } > - if (dev_out) > dev_put(dev_out); > - dev_out = NULL; > + dev_out = NULL; > + } > } What about something like this, it even reduces checks in the fast path. You can post new version if the following change looks good to you and to other developers. If additional sign line is needed here it is: Signed-off-by: Julian Anastasov <ja@ssi.bg> @@ -2396,8 +2396,6 @@ static int ip_route_output_slow(struct r /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ dev_out = ip_dev_find(oldflp->fl4_src); - if (dev_out == NULL) - goto out; /* I removed check for oif == dev_out->oif here. It was wrong for two reasons: @@ -2409,6 +2407,8 @@ static int ip_route_output_slow(struct r if (oldflp->oif == 0 && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { + if (dev_out == NULL) + goto out; /* Special hack: user can direct multicasts and limited broadcast via necessary interface without fiddling with IP_MULTICAST_IF or IP_PKTINFO. @@ -2427,9 +2427,11 @@ static int ip_route_output_slow(struct r fl.oif = dev_out->ifindex; goto make_route; } - if (dev_out) + if (dev_out) { dev_put(dev_out); - dev_out = NULL; + dev_out = NULL; + } else if (!(oldflp->flags & FLOWI_FLAG_ANYSRC)) + goto out; } Or we can go further and to avoid ip_dev_find? For me, this second variant is preferred because calling ip_dev_find() is useless for FLOWI_FLAG_ANYSRC. @@ -2394,11 +2394,6 @@ static int ip_route_output_slow(struct r ZERONET(oldflp->fl4_src)) goto out; - /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ - dev_out = ip_dev_find(oldflp->fl4_src); - if (dev_out == NULL) - goto out; - /* I removed check for oif == dev_out->oif here. It was wrong for two reasons: 1. ip_dev_find(saddr) can return wrong iface, if saddr is @@ -2409,6 +2404,11 @@ static int ip_route_output_slow(struct r if (oldflp->oif == 0 && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { + /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ + dev_out = ip_dev_find(oldflp->fl4_src); + if (dev_out == NULL) + goto out; + /* Special hack: user can direct multicasts and limited broadcast via necessary interface without fiddling with IP_MULTICAST_IF or IP_PKTINFO. @@ -2427,9 +2427,14 @@ static int ip_route_output_slow(struct r fl.oif = dev_out->ifindex; goto make_route; } - if (dev_out) + if (!(oldflp->flags & FLOWI_FLAG_ANYSRC)) { + /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ + dev_out = ip_dev_find(oldflp->fl4_src); + if (dev_out == NULL) + goto out; dev_put(dev_out); - dev_out = NULL; + dev_out = NULL; + } } ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-31 23:18 ` Julian Anastasov @ 2007-06-01 12:55 ` KOVACS Krisztian 2007-06-20 10:57 ` Balazs Scheidler 1 sibling, 0 replies; 24+ messages in thread From: KOVACS Krisztian @ 2007-06-01 12:55 UTC (permalink / raw) To: Julian Anastasov; +Cc: David Miller, kaber, horms, jkrzyszt, hidden, netdev Hi, On Friday 01 June 2007 01:18, Julian Anastasov wrote: > What about something like this, it even reduces checks > in the fast path. You can post new version if the following change > looks good to you and to other developers. If additional sign line is > needed here it is: > > Signed-off-by: Julian Anastasov <ja@ssi.bg> > >[...] > Or we can go further and to avoid ip_dev_find? For me, this > second variant is preferred because calling ip_dev_find() is useless > for FLOWI_FLAG_ANYSRC. You're right. Although I don't really like duplicating the ip_dev_find() call, it's still better than the previous patch. -- Regards, Krisztian Kovacs Loosen source address check on IPv4 output ip_route_output() contains a check to make sure that no flows with non-local source IP addresses are routed. This obviously makes using such addresses impossible. This patch introduces a flowi flag which makes omitting this check possible. Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Julian Anastasov <ja@ssi.bg> --- include/net/flow.h | 1 + net/ipv4/route.c | 20 +++++++++++++------- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/include/net/flow.h b/include/net/flow.h index f3cc1f8..1bfc0dc 100644 --- a/include/net/flow.h +++ b/include/net/flow.h @@ -49,6 +49,7 @@ struct flowi { __u8 proto; __u8 flags; #define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01 +#define FLOWI_FLAG_ANYSRC 0x02 union { struct { __be16 sport; diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 8603cfb..4acd3de 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2394,11 +2394,6 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) ZERONET(oldflp->fl4_src)) goto out; - /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ - dev_out = ip_dev_find(oldflp->fl4_src); - if (dev_out == NULL) - goto out; - /* I removed check for oif == dev_out->oif here. It was wrong for two reasons: 1. ip_dev_find(saddr) can return wrong iface, if saddr is @@ -2409,6 +2404,11 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) if (oldflp->oif == 0 && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { + /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ + dev_out = ip_dev_find(oldflp->fl4_src); + if (dev_out == NULL) + goto out; + /* Special hack: user can direct multicasts and limited broadcast via necessary interface without fiddling with IP_MULTICAST_IF or IP_PKTINFO. @@ -2427,9 +2427,15 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) fl.oif = dev_out->ifindex; goto make_route; } - if (dev_out) + + if (!(oldflp->flags & FLOWI_FLAG_ANYSRC)) { + /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ + dev_out = ip_dev_find(oldflp->fl4_src); + if (dev_out == NULL) + goto out; dev_put(dev_out); - dev_out = NULL; + dev_out = NULL; + } } ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-05-31 23:18 ` Julian Anastasov 2007-06-01 12:55 ` KOVACS Krisztian @ 2007-06-20 10:57 ` Balazs Scheidler 2007-06-21 7:56 ` Julian Anastasov 1 sibling, 1 reply; 24+ messages in thread From: Balazs Scheidler @ 2007-06-20 10:57 UTC (permalink / raw) To: Julian Anastasov Cc: KOVACS Krisztian, David Miller, kaber, horms, jkrzyszt, netdev Hi, Is there a chance that this, or a patch with similar spirit (e.g. a way to send packets from non-local IP addresses) could be merged? On Fri, 2007-06-01 at 02:18 +0300, Julian Anastasov wrote: > Hello, > > On Thu, 31 May 2007, KOVACS Krisztian wrote: > > > So what about this one? > > May be we can try with better coding style. Also, this version > adds undefined behavior for using FLOWI_FLAG_ANYSRC with multicast > oldflp->fl4_dst > > > Loosen source address check on IPv4 output > > > > From: KOVACS Krisztian <hidden@balabit.hu> > > > > ip_route_output() contains a check to make sure that no flows with > > non-local source IP addresses are routed. This obviously makes using > > such addresses impossible. > > > > This patch introduces a flowi flag which makes omitting this check > > possible. The new flag provides a way of handling transparent and > > non-transparent connections differently. > > > > Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> > > --- > > > > include/net/flow.h | 1 + > > net/ipv4/route.c | 47 +++++++++++++++++++++++++---------------------- > > 2 files changed, 26 insertions(+), 22 deletions(-) > > > > diff --git a/include/net/flow.h b/include/net/flow.h > > index f3cc1f8..1bfc0dc 100644 > > --- a/include/net/flow.h > > +++ b/include/net/flow.h > > @@ -49,6 +49,7 @@ struct flowi { > > __u8 proto; > > __u8 flags; > > #define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01 > > +#define FLOWI_FLAG_ANYSRC 0x02 > > union { > > struct { > > __be16 sport; > > diff --git a/net/ipv4/route.c b/net/ipv4/route.c > > index 8603cfb..88d0a79 100644 > > --- a/net/ipv4/route.c > > +++ b/net/ipv4/route.c > > @@ -2396,7 +2396,7 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) > > > > /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ > > dev_out = ip_dev_find(oldflp->fl4_src); > > - if (dev_out == NULL) > > + if (dev_out == NULL && !(oldflp->flags & FLOWI_FLAG_ANYSRC)) > > goto out; > > > > /* I removed check for oif == dev_out->oif here. > > @@ -2407,29 +2407,32 @@ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) > > of another iface. --ANK > > */ > > > > - if (oldflp->oif == 0 > > - && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { > > - /* Special hack: user can direct multicasts > > - and limited broadcast via necessary interface > > - without fiddling with IP_MULTICAST_IF or IP_PKTINFO. > > - This hack is not just for fun, it allows > > - vic,vat and friends to work. > > - They bind socket to loopback, set ttl to zero > > - and expect that it will work. > > - From the viewpoint of routing cache they are broken, > > - because we are not allowed to build multicast path > > - with loopback source addr (look, routing cache > > - cannot know, that ttl is zero, so that packet > > - will not leave this host and route is valid). > > - Luckily, this hack is good workaround. > > - */ > > + if (dev_out) { > > + if (oldflp->oif == 0 > > + && (MULTICAST(oldflp->fl4_dst) > > + || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { > > + /* Special hack: user can direct multicasts > > + and limited broadcast via necessary interface > > + without fiddling with IP_MULTICAST_IF or IP_PKTINFO. > > + This hack is not just for fun, it allows > > + vic,vat and friends to work. > > + They bind socket to loopback, set ttl to zero > > + and expect that it will work. > > + From the viewpoint of routing cache they are broken, > > + because we are not allowed to build multicast path > > + with loopback source addr (look, routing cache > > + cannot know, that ttl is zero, so that packet > > + will not leave this host and route is valid). > > + Luckily, this hack is good workaround. > > + */ > > + > > + fl.oif = dev_out->ifindex; > > + goto make_route; > > + } > > > > - fl.oif = dev_out->ifindex; > > - goto make_route; > > - } > > - if (dev_out) > > dev_put(dev_out); > > - dev_out = NULL; > > + dev_out = NULL; > > + } > > } > > What about something like this, it even reduces checks > in the fast path. You can post new version if the following change > looks good to you and to other developers. If additional sign line is > needed here it is: > > Signed-off-by: Julian Anastasov <ja@ssi.bg> > > @@ -2396,8 +2396,6 @@ static int ip_route_output_slow(struct r > > /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ > dev_out = ip_dev_find(oldflp->fl4_src); > - if (dev_out == NULL) > - goto out; > > /* I removed check for oif == dev_out->oif here. > It was wrong for two reasons: > @@ -2409,6 +2407,8 @@ static int ip_route_output_slow(struct r > > if (oldflp->oif == 0 > && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { > + if (dev_out == NULL) > + goto out; > /* Special hack: user can direct multicasts > and limited broadcast via necessary interface > without fiddling with IP_MULTICAST_IF or IP_PKTINFO. > @@ -2427,9 +2427,11 @@ static int ip_route_output_slow(struct r > fl.oif = dev_out->ifindex; > goto make_route; > } > - if (dev_out) > + if (dev_out) { > dev_put(dev_out); > - dev_out = NULL; > + dev_out = NULL; > + } else if (!(oldflp->flags & FLOWI_FLAG_ANYSRC)) > + goto out; > } > > > > Or we can go further and to avoid ip_dev_find? For me, this > second variant is preferred because calling ip_dev_find() is useless for > FLOWI_FLAG_ANYSRC. > > @@ -2394,11 +2394,6 @@ static int ip_route_output_slow(struct r > ZERONET(oldflp->fl4_src)) > goto out; > > - /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ > - dev_out = ip_dev_find(oldflp->fl4_src); > - if (dev_out == NULL) > - goto out; > - > /* I removed check for oif == dev_out->oif here. > It was wrong for two reasons: > 1. ip_dev_find(saddr) can return wrong iface, if saddr is > @@ -2409,6 +2404,11 @@ static int ip_route_output_slow(struct r > > if (oldflp->oif == 0 > && (MULTICAST(oldflp->fl4_dst) || oldflp->fl4_dst == htonl(0xFFFFFFFF))) { > + /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ > + dev_out = ip_dev_find(oldflp->fl4_src); > + if (dev_out == NULL) > + goto out; > + > /* Special hack: user can direct multicasts > and limited broadcast via necessary interface > without fiddling with IP_MULTICAST_IF or IP_PKTINFO. > @@ -2427,9 +2427,14 @@ static int ip_route_output_slow(struct r > fl.oif = dev_out->ifindex; > goto make_route; > } > - if (dev_out) > + if (!(oldflp->flags & FLOWI_FLAG_ANYSRC)) { > + /* It is equivalent to inet_addr_type(saddr) == RTN_LOCAL */ > + dev_out = ip_dev_find(oldflp->fl4_src); > + if (dev_out == NULL) > + goto out; > dev_put(dev_out); > - dev_out = NULL; > + dev_out = NULL; > + } > } > > > - > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Bazsi ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed 2007-06-20 10:57 ` Balazs Scheidler @ 2007-06-21 7:56 ` Julian Anastasov 0 siblings, 0 replies; 24+ messages in thread From: Julian Anastasov @ 2007-06-21 7:56 UTC (permalink / raw) To: Balazs Scheidler Cc: KOVACS Krisztian, David Miller, kaber, horms, jkrzyszt, netdev Hello, On Wed, 20 Jun 2007, Balazs Scheidler wrote: > Is there a chance that this, or a patch with similar spirit (e.g. a way > to send packets from non-local IP addresses) could be merged? Last patch from Krisztian is exactly what I preferred to see, I hope that someone really tested it :). But i'm not the one that decides, so far we don't see comments from other interested parties, may be there are other opinions? Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2007-06-21 7:57 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <200704271705.l3RH5Brw026873@hera.kernel.org>
2007-05-14 10:21 ` [IPV4] LVS: Allow to send ICMP unreachable responses when real-servers are removed Patrick McHardy
2007-05-14 10:35 ` David Miller
2007-05-14 14:25 ` Janusz Krzysztofik
2007-05-14 14:32 ` Patrick McHardy
2007-05-14 15:49 ` Janusz Krzysztofik
2007-05-14 17:41 ` Patrick McHardy
2007-05-15 5:26 ` Simon Horman
2007-05-15 9:46 ` Janusz Krzysztofik
2007-05-15 16:11 ` Patrick McHardy
2007-05-15 23:41 ` Julian Anastasov
2007-05-17 11:25 ` Janusz Krzysztofik
2007-05-17 16:41 ` Patrick McHardy
2007-05-17 16:40 ` Patrick McHardy
2007-05-17 20:51 ` David Miller
2007-05-18 1:06 ` Simon Horman
2007-05-18 8:40 ` Julian Anastasov
2007-05-18 9:05 ` David Miller
2007-05-30 9:38 ` KOVACS Krisztian
2007-05-31 0:21 ` Julian Anastasov
2007-05-31 12:50 ` KOVACS Krisztian
2007-05-31 23:18 ` Julian Anastasov
2007-06-01 12:55 ` KOVACS Krisztian
2007-06-20 10:57 ` Balazs Scheidler
2007-06-21 7:56 ` Julian Anastasov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).