* [PATCH] ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output
@ 2016-06-29 18:47 Shmulik Ladkani
2016-06-29 21:42 ` Hannes Frederic Sowa
2016-06-30 13:03 ` David Miller
0 siblings, 2 replies; 3+ messages in thread
From: Shmulik Ladkani @ 2016-06-29 18:47 UTC (permalink / raw)
To: David S . Miller, Hannes Frederic Sowa, Tom Herbert
Cc: Andy Zhou, Stephen Hemminger, netdev, Shmulik Ladkani
ip_skb_dst_mtu uses skb->sk, assuming it is an AF_INET socket (e.g. it
calls ip_sk_use_pmtu which casts sk as an inet_sk).
However, in the case of UDP tunneling, the skb->sk is not necessarily an
inet socket (could be AF_PACKET socket, or AF_UNSPEC if arriving from
tun/tap).
OTOH, the sk passed as an argument throughout IP stack's output path is
the one which is of PMTU interest:
- In case of local sockets, sk is same as skb->sk;
- In case of a udp tunnel, sk is the tunneling socket.
Fix, by passing ip_finish_output's sk to ip_skb_dst_mtu.
This augments 7026b1ddb6 'netfilter: Pass socket pointer down through okfn().'
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
---
Discovered while debugging other issue in adjacent code area.
My setup seems to be handling this well; However I'd appreciate a
Tested-by or Reviewed-by as one can only cover relatively few of the
possible code flows leading to ip_skb_dst_mtu.
include/net/ip.h | 5 ++---
net/bridge/br_netfilter_hooks.c | 2 +-
net/ipv4/ip_output.c | 4 ++--
3 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/include/net/ip.h b/include/net/ip.h
index 37165fba37..08f36cd2b8 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -313,10 +313,9 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
return min(dst->dev->mtu, IP_MAX_MTU);
}
-static inline unsigned int ip_skb_dst_mtu(const struct sk_buff *skb)
+static inline unsigned int ip_skb_dst_mtu(struct sock *sk,
+ const struct sk_buff *skb)
{
- struct sock *sk = skb->sk;
-
if (!sk || !sk_fullsock(sk) || ip_sk_use_pmtu(sk)) {
bool forwarding = IPCB(skb)->flags & IPSKB_FORWARDED;
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 2d25979273..77e7f69bf8 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -700,7 +700,7 @@ static int
br_nf_ip_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
int (*output)(struct net *, struct sock *, struct sk_buff *))
{
- unsigned int mtu = ip_skb_dst_mtu(skb);
+ unsigned int mtu = ip_skb_dst_mtu(sk, skb);
struct iphdr *iph = ip_hdr(skb);
if (unlikely(((iph->frag_off & htons(IP_DF)) && !skb->ignore_df) ||
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index cbac493c91..e23f141c9b 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -271,7 +271,7 @@ static int ip_finish_output(struct net *net, struct sock *sk, struct sk_buff *sk
return dst_output(net, sk, skb);
}
#endif
- mtu = ip_skb_dst_mtu(skb);
+ mtu = ip_skb_dst_mtu(sk, skb);
if (skb_is_gso(skb))
return ip_finish_output_gso(net, sk, skb, mtu);
@@ -541,7 +541,7 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
iph = ip_hdr(skb);
- mtu = ip_skb_dst_mtu(skb);
+ mtu = ip_skb_dst_mtu(sk, skb);
if (IPCB(skb)->frag_max_size && IPCB(skb)->frag_max_size < mtu)
mtu = IPCB(skb)->frag_max_size;
--
2.7.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output
2016-06-29 18:47 [PATCH] ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output Shmulik Ladkani
@ 2016-06-29 21:42 ` Hannes Frederic Sowa
2016-06-30 13:03 ` David Miller
1 sibling, 0 replies; 3+ messages in thread
From: Hannes Frederic Sowa @ 2016-06-29 21:42 UTC (permalink / raw)
To: Shmulik Ladkani, David S . Miller, Hannes Frederic Sowa,
Tom Herbert
Cc: Andy Zhou, Stephen Hemminger, netdev
On Wed, Jun 29, 2016, at 20:47, Shmulik Ladkani wrote:
> ip_skb_dst_mtu uses skb->sk, assuming it is an AF_INET socket (e.g. it
> calls ip_sk_use_pmtu which casts sk as an inet_sk).
>
> However, in the case of UDP tunneling, the skb->sk is not necessarily an
> inet socket (could be AF_PACKET socket, or AF_UNSPEC if arriving from
> tun/tap).
>
> OTOH, the sk passed as an argument throughout IP stack's output path is
> the one which is of PMTU interest:
> - In case of local sockets, sk is same as skb->sk;
> - In case of a udp tunnel, sk is the tunneling socket.
>
> Fix, by passing ip_finish_output's sk to ip_skb_dst_mtu.
> This augments 7026b1ddb6 'netfilter: Pass socket pointer down through
> okfn().'
>
> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
> ---
>
> Discovered while debugging other issue in adjacent code area.
>
> My setup seems to be handling this well; However I'd appreciate a
> Tested-by or Reviewed-by as one can only cover relatively few of the
> possible code flows leading to ip_skb_dst_mtu.
Well spotted.
It all looks good, the socket's pmtudisc field is initialized
accordingly and I suspect people want to have vxlan to use the path mtu
when transmitting packets, even during forwarding.
Maybe this can be tweaked with an additional patch, if people want to
change this behavior.
But as your patch doesn't change any of this,
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Thanks!
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output
2016-06-29 18:47 [PATCH] ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output Shmulik Ladkani
2016-06-29 21:42 ` Hannes Frederic Sowa
@ 2016-06-30 13:03 ` David Miller
1 sibling, 0 replies; 3+ messages in thread
From: David Miller @ 2016-06-30 13:03 UTC (permalink / raw)
To: shmulik.ladkani; +Cc: hannes, tom, azhou, stephen, netdev
From: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Date: Wed, 29 Jun 2016 21:47:03 +0300
> ip_skb_dst_mtu uses skb->sk, assuming it is an AF_INET socket (e.g. it
> calls ip_sk_use_pmtu which casts sk as an inet_sk).
>
> However, in the case of UDP tunneling, the skb->sk is not necessarily an
> inet socket (could be AF_PACKET socket, or AF_UNSPEC if arriving from
> tun/tap).
>
> OTOH, the sk passed as an argument throughout IP stack's output path is
> the one which is of PMTU interest:
> - In case of local sockets, sk is same as skb->sk;
> - In case of a udp tunnel, sk is the tunneling socket.
>
> Fix, by passing ip_finish_output's sk to ip_skb_dst_mtu.
> This augments 7026b1ddb6 'netfilter: Pass socket pointer down through okfn().'
>
> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Applied, thanks.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-06-30 13:11 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-29 18:47 [PATCH] ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output Shmulik Ladkani
2016-06-29 21:42 ` Hannes Frederic Sowa
2016-06-30 13:03 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).