* [PATCH nf-next 0/4] netfilter: nftables: tcp mss mangling support
@ 2017-08-08 13:15 Florian Westphal
2017-08-08 13:15 ` [PATCH nf-next 1/4] netfilter: exthdr: factor out tcp option access Florian Westphal
` (5 more replies)
0 siblings, 6 replies; 9+ messages in thread
From: Florian Westphal @ 2017-08-08 13:15 UTC (permalink / raw)
To: netfilter-devel
This series adds the needed kernel parts to support tcp mss mangling.
First two patches rework exthdr so we don't have to copy-paste too much,
patch 3 adds tcp option mangling support.
The last patch allows retrieving the path tcp mss via the rt expression, so
that we can support iptables TCPMSS --clamp-mss-to-pmtu by combining the two, i.e.:
nft add rule inet mangle forward tcp option mss set rt mss
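
As a rough illustration of what the combined rule amounts to, here is a
plain userspace C sketch (not kernel code). The helper name clamp_mss and
its parameters are hypothetical; the only behaviour taken from the series
is that the MSS is lowered and never raised.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * cur_mss:  value currently carried in the TCP MSS option
 * path_mss: value derived from the route (what "rt mss" would yield)
 *
 * The option is only ever lowered; raising it could stall the connection.
 */
static uint16_t clamp_mss(uint16_t cur_mss, uint16_t path_mss)
{
	return path_mss < cur_mss ? path_mss : cur_mss;
}
```

With a SYN advertising 1460 and a path value of 1400, the sketch yields
1400; a SYN advertising 1200 is left alone.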
* [PATCH nf-next 1/4] netfilter: exthdr: factor out tcp option access
2017-08-08 13:15 [PATCH nf-next 0/4] netfilter: nftables: tcp mss mangling support Florian Westphal
@ 2017-08-08 13:15 ` Florian Westphal
2017-08-08 13:15 ` [PATCH nf-next 2/4] netfilter: exthdr: split netlink dump function Florian Westphal
` (4 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2017-08-08 13:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nft_exthdr.c | 33 +++++++++++++++++++++------------
1 file changed, 21 insertions(+), 12 deletions(-)
diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c
index 1ec49fe5845f..921c95f2c583 100644
--- a/net/netfilter/nft_exthdr.c
+++ b/net/netfilter/nft_exthdr.c
@@ -61,6 +61,26 @@ static void nft_exthdr_ipv6_eval(const struct nft_expr *expr,
regs->verdict.code = NFT_BREAK;
}
+static void *
+nft_tcp_header_pointer(const struct nft_pktinfo *pkt,
+ unsigned int len, void *buffer, unsigned int *tcphdr_len)
+{
+ struct tcphdr *tcph;
+
+ if (!pkt->tprot_set || pkt->tprot != IPPROTO_TCP)
+ return NULL;
+
+ tcph = skb_header_pointer(pkt->skb, pkt->xt.thoff, sizeof(*tcph), buffer);
+ if (!tcph)
+ return NULL;
+
+ *tcphdr_len = __tcp_hdrlen(tcph);
+ if (*tcphdr_len < sizeof(*tcph) || *tcphdr_len > len)
+ return NULL;
+
+ return skb_header_pointer(pkt->skb, pkt->xt.thoff, *tcphdr_len, buffer);
+}
+
static void nft_exthdr_tcp_eval(const struct nft_expr *expr,
struct nft_regs *regs,
const struct nft_pktinfo *pkt)
@@ -72,18 +92,7 @@ static void nft_exthdr_tcp_eval(const struct nft_expr *expr,
struct tcphdr *tcph;
u8 *opt;
- if (!pkt->tprot_set || pkt->tprot != IPPROTO_TCP)
- goto err;
-
- tcph = skb_header_pointer(pkt->skb, pkt->xt.thoff, sizeof(*tcph), buff);
- if (!tcph)
- goto err;
-
- tcphdr_len = __tcp_hdrlen(tcph);
- if (tcphdr_len < sizeof(*tcph))
- goto err;
-
- tcph = skb_header_pointer(pkt->skb, pkt->xt.thoff, tcphdr_len, buff);
+ tcph = nft_tcp_header_pointer(pkt, sizeof(buff), buff, &tcphdr_len);
if (!tcph)
goto err;
--
2.13.0
* [PATCH nf-next 2/4] netfilter: exthdr: split netlink dump function
2017-08-08 13:15 [PATCH nf-next 0/4] netfilter: nftables: tcp mss mangling support Florian Westphal
2017-08-08 13:15 ` [PATCH nf-next 1/4] netfilter: exthdr: factor out tcp option access Florian Westphal
@ 2017-08-08 13:15 ` Florian Westphal
2017-08-08 13:15 ` [PATCH nf-next 3/4] netfilter: exthdr: tcp option set support Florian Westphal
` (3 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2017-08-08 13:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
so eval and upcoming eval_set versions can reuse a common helper.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nft_exthdr.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c
index 921c95f2c583..e3a6eebe7e0c 100644
--- a/net/netfilter/nft_exthdr.c
+++ b/net/netfilter/nft_exthdr.c
@@ -180,12 +180,8 @@ static int nft_exthdr_init(const struct nft_ctx *ctx,
NFT_DATA_VALUE, priv->len);
}
-static int nft_exthdr_dump(struct sk_buff *skb, const struct nft_expr *expr)
+static int nft_exthdr_dump_common(struct sk_buff *skb, const struct nft_exthdr *priv)
{
- const struct nft_exthdr *priv = nft_expr_priv(expr);
-
- if (nft_dump_register(skb, NFTA_EXTHDR_DREG, priv->dreg))
- goto nla_put_failure;
if (nla_put_u8(skb, NFTA_EXTHDR_TYPE, priv->type))
goto nla_put_failure;
if (nla_put_be32(skb, NFTA_EXTHDR_OFFSET, htonl(priv->offset)))
@@ -202,6 +198,16 @@ static int nft_exthdr_dump(struct sk_buff *skb, const struct nft_expr *expr)
return -1;
}
+static int nft_exthdr_dump(struct sk_buff *skb, const struct nft_expr *expr)
+{
+ const struct nft_exthdr *priv = nft_expr_priv(expr);
+
+ if (nft_dump_register(skb, NFTA_EXTHDR_DREG, priv->dreg))
+ return -1;
+
+ return nft_exthdr_dump_common(skb, priv);
+}
+
static struct nft_expr_type nft_exthdr_type;
static const struct nft_expr_ops nft_exthdr_ipv6_ops = {
.type = &nft_exthdr_type,
--
2.13.0
* [PATCH nf-next 3/4] netfilter: exthdr: tcp option set support
2017-08-08 13:15 [PATCH nf-next 0/4] netfilter: nftables: tcp mss mangling support Florian Westphal
2017-08-08 13:15 ` [PATCH nf-next 1/4] netfilter: exthdr: factor out tcp option access Florian Westphal
2017-08-08 13:15 ` [PATCH nf-next 2/4] netfilter: exthdr: split netlink dump function Florian Westphal
@ 2017-08-08 13:15 ` Florian Westphal
2017-08-08 13:15 ` [PATCH nf-next 4/4] netfilter: rt: add support to fetch path mss Florian Westphal
` (2 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2017-08-08 13:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
This allows setting 2 and 4 byte quantities in the tcp option space.
Main purpose is to allow native replacement for xt_TCPMSS to
work around pmtu blackholes.
Writes to kind and len are not allowed at the moment; it does not seem
useful to do this as it causes corruption of the tcp option space.
We can always lift this restriction later if a use-case appears.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
include/uapi/linux/netfilter/nf_tables.h | 4 +-
net/netfilter/nft_exthdr.c | 164 ++++++++++++++++++++++++++++++-
2 files changed, 165 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index be25cf69295b..40fd199f7531 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -732,7 +732,8 @@ enum nft_exthdr_op {
* @NFTA_EXTHDR_OFFSET: extension header offset (NLA_U32)
* @NFTA_EXTHDR_LEN: extension header length (NLA_U32)
* @NFTA_EXTHDR_FLAGS: extension header flags (NLA_U32)
- * @NFTA_EXTHDR_OP: option match type (NLA_U8)
+ * @NFTA_EXTHDR_OP: option match type (NLA_U32)
+ * @NFTA_EXTHDR_SREG: source register (NLA_U32: nft_registers)
*/
enum nft_exthdr_attributes {
NFTA_EXTHDR_UNSPEC,
@@ -742,6 +743,7 @@ enum nft_exthdr_attributes {
NFTA_EXTHDR_LEN,
NFTA_EXTHDR_FLAGS,
NFTA_EXTHDR_OP,
+ NFTA_EXTHDR_SREG,
__NFTA_EXTHDR_MAX
};
#define NFTA_EXTHDR_MAX (__NFTA_EXTHDR_MAX - 1)
diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c
index e3a6eebe7e0c..f5a0bf5e3bdd 100644
--- a/net/netfilter/nft_exthdr.c
+++ b/net/netfilter/nft_exthdr.c
@@ -8,6 +8,7 @@
* Development of this code funded by Astaro AG (http://www.astaro.com/)
*/
+#include <asm/unaligned.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
@@ -23,6 +24,7 @@ struct nft_exthdr {
u8 len;
u8 op;
enum nft_registers dreg:8;
+ enum nft_registers sreg:8;
u8 flags;
};
@@ -124,6 +126,88 @@ static void nft_exthdr_tcp_eval(const struct nft_expr *expr,
regs->verdict.code = NFT_BREAK;
}
+static void nft_exthdr_tcp_set_eval(const struct nft_expr *expr,
+ struct nft_regs *regs,
+ const struct nft_pktinfo *pkt)
+{
+ u8 buff[sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE];
+ struct nft_exthdr *priv = nft_expr_priv(expr);
+ unsigned int i, optl, tcphdr_len, offset;
+ struct tcphdr *tcph;
+ u8 *opt;
+ u32 src;
+
+ tcph = nft_tcp_header_pointer(pkt, sizeof(buff), buff, &tcphdr_len);
+ if (!tcph)
+ return;
+
+ opt = (u8 *)tcph;
+ for (i = sizeof(*tcph); i < tcphdr_len - 1; i += optl) {
+ union {
+ u8 octet;
+ __be16 v16;
+ __be32 v32;
+ } old, new;
+
+ optl = optlen(opt, i);
+
+ if (priv->type != opt[i])
+ continue;
+
+ if (i + optl > tcphdr_len || priv->len + priv->offset > optl)
+ return;
+
+ if (!skb_make_writable(pkt->skb, pkt->xt.thoff + i + priv->len))
+ return;
+
+ tcph = nft_tcp_header_pointer(pkt, sizeof(buff), buff,
+ &tcphdr_len);
+ if (!tcph)
+ return;
+
+ src = regs->data[priv->sreg];
+ offset = i + priv->offset;
+
+ switch (priv->len) {
+ case 2:
+ old.v16 = get_unaligned((u16 *)(opt + offset));
+ new.v16 = src;
+
+ switch (priv->type) {
+ case TCPOPT_MSS:
+ /* increase can cause connection to stall */
+ if (ntohs(old.v16) <= ntohs(new.v16))
+ return;
+ break;
+ }
+
+ if (old.v16 == new.v16)
+ return;
+
+ put_unaligned(new.v16, (u16*)(opt + offset));
+ inet_proto_csum_replace2(&tcph->check, pkt->skb,
+ old.v16, new.v16, false);
+ break;
+ case 4:
+ new.v32 = src;
+ old.v32 = get_unaligned((u32 *)(opt + offset));
+
+ if (old.v32 == new.v32)
+ return;
+
+ put_unaligned(new.v32, (u32*)(opt + offset));
+ inet_proto_csum_replace4(&tcph->check, pkt->skb,
+ old.v32, new.v32, false);
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ break;
+ }
+
+ return;
+ }
+}
+
static const struct nla_policy nft_exthdr_policy[NFTA_EXTHDR_MAX + 1] = {
[NFTA_EXTHDR_DREG] = { .type = NLA_U32 },
[NFTA_EXTHDR_TYPE] = { .type = NLA_U8 },
@@ -180,6 +264,55 @@ static int nft_exthdr_init(const struct nft_ctx *ctx,
NFT_DATA_VALUE, priv->len);
}
+static int nft_exthdr_tcp_set_init(const struct nft_ctx *ctx,
+ const struct nft_expr *expr,
+ const struct nlattr * const tb[])
+{
+ struct nft_exthdr *priv = nft_expr_priv(expr);
+ u32 offset, len, flags = 0, op = NFT_EXTHDR_OP_IPV6;
+ int err;
+
+ if (!tb[NFTA_EXTHDR_SREG] ||
+ !tb[NFTA_EXTHDR_TYPE] ||
+ !tb[NFTA_EXTHDR_OFFSET] ||
+ !tb[NFTA_EXTHDR_LEN])
+ return -EINVAL;
+
+ if (tb[NFTA_EXTHDR_DREG] || tb[NFTA_EXTHDR_FLAGS])
+ return -EINVAL;
+
+ err = nft_parse_u32_check(tb[NFTA_EXTHDR_OFFSET], U8_MAX, &offset);
+ if (err < 0)
+ return err;
+
+ err = nft_parse_u32_check(tb[NFTA_EXTHDR_LEN], U8_MAX, &len);
+ if (err < 0)
+ return err;
+
+ if (offset < 2)
+ return -EOPNOTSUPP;
+
+ switch (len) {
+ case 2: break;
+ case 4: break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ err = nft_parse_u32_check(tb[NFTA_EXTHDR_OP], U8_MAX, &op);
+ if (err < 0)
+ return err;
+
+ priv->type = nla_get_u8(tb[NFTA_EXTHDR_TYPE]);
+ priv->offset = offset;
+ priv->len = len;
+ priv->sreg = nft_parse_register(tb[NFTA_EXTHDR_SREG]);
+ priv->flags = flags;
+ priv->op = op;
+
+ return nft_validate_register_load(priv->sreg, priv->len);
+}
+
static int nft_exthdr_dump_common(struct sk_buff *skb, const struct nft_exthdr *priv)
{
if (nla_put_u8(skb, NFTA_EXTHDR_TYPE, priv->type))
@@ -208,6 +341,16 @@ static int nft_exthdr_dump(struct sk_buff *skb, const struct nft_expr *expr)
return nft_exthdr_dump_common(skb, priv);
}
+static int nft_exthdr_dump_set(struct sk_buff *skb, const struct nft_expr *expr)
+{
+ const struct nft_exthdr *priv = nft_expr_priv(expr);
+
+ if (nft_dump_register(skb, NFTA_EXTHDR_SREG, priv->sreg))
+ return -1;
+
+ return nft_exthdr_dump_common(skb, priv);
+}
+
static struct nft_expr_type nft_exthdr_type;
static const struct nft_expr_ops nft_exthdr_ipv6_ops = {
.type = &nft_exthdr_type,
@@ -225,6 +368,14 @@ static const struct nft_expr_ops nft_exthdr_tcp_ops = {
.dump = nft_exthdr_dump,
};
+static const struct nft_expr_ops nft_exthdr_tcp_set_ops = {
+ .type = &nft_exthdr_type,
+ .size = NFT_EXPR_SIZE(sizeof(struct nft_exthdr)),
+ .eval = nft_exthdr_tcp_set_eval,
+ .init = nft_exthdr_tcp_set_init,
+ .dump = nft_exthdr_dump_set,
+};
+
static const struct nft_expr_ops *
nft_exthdr_select_ops(const struct nft_ctx *ctx,
const struct nlattr * const tb[])
@@ -234,12 +385,21 @@ nft_exthdr_select_ops(const struct nft_ctx *ctx,
if (!tb[NFTA_EXTHDR_OP])
return &nft_exthdr_ipv6_ops;
+ if (tb[NFTA_EXTHDR_SREG] && tb[NFTA_EXTHDR_DREG])
+ return ERR_PTR(-EOPNOTSUPP);
+
op = ntohl(nla_get_u32(tb[NFTA_EXTHDR_OP]));
switch (op) {
case NFT_EXTHDR_OP_TCPOPT:
- return &nft_exthdr_tcp_ops;
+ if (tb[NFTA_EXTHDR_SREG])
+ return &nft_exthdr_tcp_set_ops;
+ if (tb[NFTA_EXTHDR_DREG])
+ return &nft_exthdr_tcp_ops;
+ break;
case NFT_EXTHDR_OP_IPV6:
- return &nft_exthdr_ipv6_ops;
+ if (tb[NFTA_EXTHDR_DREG])
+ return &nft_exthdr_ipv6_ops;
+ break;
}
return ERR_PTR(-EOPNOTSUPP);
--
2.13.0
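
The set path in patch 3/4 boils down to: walk the tcp option space, find
the option, overwrite the value, and patch the checksum incrementally
instead of recomputing it. The following userspace sketch shows that idea
for the 2-byte MSS case only. It is not the kernel code: csum_full,
csum_replace16, swap16 and lower_mss are hypothetical names, the checksum
is computed over the option bytes alone to keep the example small, and the
option area is assumed to start at an even offset (true after the 20-byte
fixed tcp header).

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_EOL 0	/* end of option list */
#define OPT_NOP 1	/* one-byte padding */
#define OPT_MSS 2	/* kind 2, length 4, 16-bit value */

/* One's-complement checksum over buf, words taken in network order. */
static uint16_t csum_full(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)(buf[i] << 8 | buf[i + 1]);
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m') */
static uint16_t csum_replace16(uint16_t check, uint16_t from, uint16_t to)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~from;
	sum += to;
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

static uint16_t swap16(uint16_t v)
{
	return (uint16_t)(v << 8 | v >> 8);
}

/*
 * Lower the MSS option in opts[0..len) to new_mss and fix up *check.
 * Returns 1 if the option was rewritten, 0 if it was left untouched
 * (not found, malformed option space, or new_mss would not lower it).
 */
static int lower_mss(uint8_t *opts, size_t len, uint16_t new_mss,
		     uint16_t *check)
{
	size_t i = 0;

	while (i < len) {
		uint8_t kind = opts[i];
		size_t optl;

		if (kind == OPT_EOL)
			break;
		if (kind == OPT_NOP) {
			i++;
			continue;
		}
		if (i + 1 >= len)
			break;		/* no room for the length byte */
		optl = opts[i + 1];
		if (optl < 2 || i + optl > len)
			break;		/* malformed option */

		if (kind == OPT_MSS && optl == 4) {
			uint16_t old_mss = (uint16_t)(opts[i + 2] << 8 | opts[i + 3]);
			uint16_t from = old_mss, to = new_mss;

			if (old_mss <= new_mss)
				return 0;	/* never raise the MSS */

			/* value at an odd offset straddles two checksum words */
			if ((i + 2) & 1) {
				from = swap16(from);
				to = swap16(to);
			}

			opts[i + 2] = (uint8_t)(new_mss >> 8);
			opts[i + 3] = (uint8_t)(new_mss & 0xff);
			*check = csum_replace16(*check, from, to);
			return 1;
		}
		i += optl;
	}
	return 0;
}
```

After a successful lower_mss() call the incrementally updated checksum
should match what csum_full() returns for the modified buffer, which is a
convenient way to sanity-check the arithmetic.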
* [PATCH nf-next 4/4] netfilter: rt: add support to fetch path mss
2017-08-08 13:15 [PATCH nf-next 0/4] netfilter: nftables: tcp mss mangling support Florian Westphal
` (2 preceding siblings ...)
2017-08-08 13:15 ` [PATCH nf-next 3/4] netfilter: exthdr: tcp option set support Florian Westphal
@ 2017-08-08 13:15 ` Florian Westphal
2017-08-08 13:37 ` Eric Dumazet
2017-08-08 13:48 ` [PATCH v2 " Florian Westphal
2017-08-19 12:05 ` [PATCH nf-next 0/4] netfilter: nftables: tcp mss mangling support Pablo Neira Ayuso
5 siblings, 1 reply; 9+ messages in thread
From: Florian Westphal @ 2017-08-08 13:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
to be used in combination with tcp option set support to mimic
iptables TCPMSS --clamp-mss-to-pmtu.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
include/uapi/linux/netfilter/nf_tables.h | 2 +
net/netfilter/nft_rt.c | 65 ++++++++++++++++++++++++++++++++
2 files changed, 67 insertions(+)
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 40fd199f7531..b49da72efa68 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -811,11 +811,13 @@ enum nft_meta_keys {
* @NFT_RT_CLASSID: realm value of packet's route (skb->dst->tclassid)
* @NFT_RT_NEXTHOP4: routing nexthop for IPv4
* @NFT_RT_NEXTHOP6: routing nexthop for IPv6
+ * @NFT_RT_TCPMSS: fetch current path tcp mss
*/
enum nft_rt_keys {
NFT_RT_CLASSID,
NFT_RT_NEXTHOP4,
NFT_RT_NEXTHOP6,
+ NFT_RT_TCPMSS,
};
/**
diff --git a/net/netfilter/nft_rt.c b/net/netfilter/nft_rt.c
index c7383d8f88d0..69ed601d6fc6 100644
--- a/net/netfilter/nft_rt.c
+++ b/net/netfilter/nft_rt.c
@@ -23,6 +23,41 @@ struct nft_rt {
enum nft_registers dreg:8;
};
+static u16 get_tcpmss(const struct nft_pktinfo *pkt, const struct dst_entry *skbdst)
+{
+ u32 minlen = sizeof(struct ipv6hdr), mtu = dst_mtu(skbdst);
+ const struct sk_buff *skb = pkt->skb;
+ const struct nf_afinfo *ai;
+ struct dst_entry *dst;
+ struct flowi fl;
+
+ memset(&fl, 0, sizeof(fl));
+
+ switch (nft_pf(pkt)) {
+ case NFPROTO_IPV4:
+ fl.u.ip4.daddr = ip_hdr(skb)->saddr;
+ minlen = sizeof(struct iphdr);
+ break;
+ case NFPROTO_IPV6:
+ fl.u.ip6.daddr = ipv6_hdr(skb)->saddr;
+ break;
+ }
+
+ ai = nf_get_afinfo(nft_pf(pkt));
+ if (ai)
+ ai->route(nft_net(pkt), &dst, &fl, false);
+
+ if (dst) {
+ mtu = min(mtu, dst_mtu(dst));
+ dst_release(dst);
+ }
+
+ if (mtu <= minlen || mtu > 0xffff)
+ return TCP_MSS_DEFAULT;
+
+ return mtu - minlen;
+}
+
static void nft_rt_get_eval(const struct nft_expr *expr,
struct nft_regs *regs,
const struct nft_pktinfo *pkt)
@@ -57,6 +92,9 @@ static void nft_rt_get_eval(const struct nft_expr *expr,
&ipv6_hdr(skb)->daddr),
sizeof(struct in6_addr));
break;
+ case NFT_RT_TCPMSS:
+ nft_reg_store16(dest, get_tcpmss(pkt, dst));
+ break;
default:
WARN_ON(1);
goto err;
@@ -94,6 +132,9 @@ static int nft_rt_get_init(const struct nft_ctx *ctx,
case NFT_RT_NEXTHOP6:
len = sizeof(struct in6_addr);
break;
+ case NFT_RT_TCPMSS:
+ len = sizeof(u16);
+ break;
default:
return -EOPNOTSUPP;
}
@@ -118,6 +159,29 @@ static int nft_rt_get_dump(struct sk_buff *skb,
return -1;
}
+static int nft_rt_validate(const struct nft_ctx *ctx, const struct nft_expr *expr,
+ const struct nft_data **data)
+{
+ const struct nft_rt *priv = nft_expr_priv(expr);
+ unsigned int hooks;
+
+ switch (priv->key) {
+ case NFT_RT_NEXTHOP4:
+ case NFT_RT_NEXTHOP6:
+ case NFT_RT_CLASSID:
+ return 0;
+ case NFT_RT_TCPMSS:
+ hooks = (1 << NF_INET_FORWARD) |
+ (1 << NF_INET_LOCAL_OUT) |
+ (1 << NF_INET_POST_ROUTING);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return nft_chain_validate_hooks(ctx->chain, hooks);
+}
+
static struct nft_expr_type nft_rt_type;
static const struct nft_expr_ops nft_rt_get_ops = {
.type = &nft_rt_type,
@@ -125,6 +189,7 @@ static const struct nft_expr_ops nft_rt_get_ops = {
.eval = nft_rt_get_eval,
.init = nft_rt_get_init,
.dump = nft_rt_get_dump,
+ .validate = nft_rt_validate,
};
static struct nft_expr_type nft_rt_type __read_mostly = {
--
2.13.0
* Re: [PATCH nf-next 4/4] netfilter: rt: add support to fetch path mss
2017-08-08 13:15 ` [PATCH nf-next 4/4] netfilter: rt: add support to fetch path mss Florian Westphal
@ 2017-08-08 13:37 ` Eric Dumazet
2017-08-08 13:47 ` Florian Westphal
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2017-08-08 13:37 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
On Tue, 2017-08-08 at 15:15 +0200, Florian Westphal wrote:
> to be used in combination with tcp option set support to mimic
> iptables TCPMSS --clamp-mss-to-pmtu.
>
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
> include/uapi/linux/netfilter/nf_tables.h | 2 +
> net/netfilter/nft_rt.c | 65 ++++++++++++++++++++++++++++++++
> 2 files changed, 67 insertions(+)
>
> diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
> index 40fd199f7531..b49da72efa68 100644
> --- a/include/uapi/linux/netfilter/nf_tables.h
> +++ b/include/uapi/linux/netfilter/nf_tables.h
> @@ -811,11 +811,13 @@ enum nft_meta_keys {
> * @NFT_RT_CLASSID: realm value of packet's route (skb->dst->tclassid)
> * @NFT_RT_NEXTHOP4: routing nexthop for IPv4
> * @NFT_RT_NEXTHOP6: routing nexthop for IPv6
> + * @NFT_RT_TCPMSS: fetch current path tcp mss
> */
> enum nft_rt_keys {
> NFT_RT_CLASSID,
> NFT_RT_NEXTHOP4,
> NFT_RT_NEXTHOP6,
> + NFT_RT_TCPMSS,
> };
>
> /**
> diff --git a/net/netfilter/nft_rt.c b/net/netfilter/nft_rt.c
> index c7383d8f88d0..69ed601d6fc6 100644
> --- a/net/netfilter/nft_rt.c
> +++ b/net/netfilter/nft_rt.c
> @@ -23,6 +23,41 @@ struct nft_rt {
> enum nft_registers dreg:8;
> };
>
> +static u16 get_tcpmss(const struct nft_pktinfo *pkt, const struct dst_entry *skbdst)
> +{
> + u32 minlen = sizeof(struct ipv6hdr), mtu = dst_mtu(skbdst);
> + const struct sk_buff *skb = pkt->skb;
> + const struct nf_afinfo *ai;
> + struct dst_entry *dst;
> + struct flowi fl;
> +
> + memset(&fl, 0, sizeof(fl));
> +
> + switch (nft_pf(pkt)) {
> + case NFPROTO_IPV4:
> + fl.u.ip4.daddr = ip_hdr(skb)->saddr;
> + minlen = sizeof(struct iphdr);
> + break;
> + case NFPROTO_IPV6:
> + fl.u.ip6.daddr = ipv6_hdr(skb)->saddr;
> + break;
> + }
> +
> + ai = nf_get_afinfo(nft_pf(pkt));
> + if (ai)
> + ai->route(nft_net(pkt), &dst, &fl, false);
> +
if ai is NULL,
dst is not initialized and might contain garbage.
> + if (dst) {
> + mtu = min(mtu, dst_mtu(dst));
> + dst_release(dst);
> + }
> +
> + if (mtu <= minlen || mtu > 0xffff)
> + return TCP_MSS_DEFAULT;
> +
> + return mtu - minlen;
* Re: [PATCH nf-next 4/4] netfilter: rt: add support to fetch path mss
2017-08-08 13:37 ` Eric Dumazet
@ 2017-08-08 13:47 ` Florian Westphal
0 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2017-08-08 13:47 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Florian Westphal, netfilter-devel
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2017-08-08 at 15:15 +0200, Florian Westphal wrote:
> > + struct dst_entry *dst;
> > + struct flowi fl;
[..]
> > + ai = nf_get_afinfo(nft_pf(pkt));
> > + if (ai)
> > + ai->route(nft_net(pkt), &dst, &fl, false);
> > +
>
> if ai is NULL,
>
> dst is not initialized and might contain garbage.
Right, thanks for pointing this out, I sent a v2.
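
The pattern under discussion can be reduced to a small standalone C
sketch: an output pointer that is only written by an optional callback
has to start out as NULL, otherwise the later NULL test reads garbage.
All names here (struct route, lookup_fn, sample_lookup, effective_mtu)
are made up for the illustration and are not kernel interfaces.

```c
#include <stddef.h>

struct route {
	unsigned int mtu;
};

/* Optional lookup hook; may be absent, and may leave *out untouched. */
typedef int (*lookup_fn)(struct route **out);

static struct route sample_route = { 1400 };

/* Example hook that always finds a route. */
static int sample_lookup(struct route **out)
{
	*out = &sample_route;
	return 0;
}

static unsigned int effective_mtu(unsigned int mtu, lookup_fn lookup)
{
	struct route *rt = NULL;	/* without this, rt is garbage if lookup is NULL */

	if (lookup)
		lookup(&rt);

	if (rt && rt->mtu < mtu)
		mtu = rt->mtu;

	return mtu;
}
```

Calling effective_mtu(1500, NULL) exercises exactly the case pointed out
in the review: no hook, so rt must already be NULL for the check to be safe.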
* [PATCH v2 nf-next 4/4] netfilter: rt: add support to fetch path mss
2017-08-08 13:15 [PATCH nf-next 0/4] netfilter: nftables: tcp mss mangling support Florian Westphal
` (3 preceding siblings ...)
2017-08-08 13:15 ` [PATCH nf-next 4/4] netfilter: rt: add support to fetch path mss Florian Westphal
@ 2017-08-08 13:48 ` Florian Westphal
2017-08-19 12:05 ` [PATCH nf-next 0/4] netfilter: nftables: tcp mss mangling support Pablo Neira Ayuso
5 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2017-08-08 13:48 UTC (permalink / raw)
To: netfilter-devel; +Cc: Florian Westphal
to be used in combination with tcp option set support to mimic
iptables TCPMSS --clamp-mss-to-pmtu.
v2: Eric Dumazet points out dst must be initialized.
Signed-off-by: Florian Westphal <fw@strlen.de>
---
include/uapi/linux/netfilter/nf_tables.h | 2 +
net/netfilter/nft_rt.c | 66 ++++++++++++++++++++++++++++++++
2 files changed, 68 insertions(+)
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 40fd199f7531..b49da72efa68 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -811,11 +811,13 @@ enum nft_meta_keys {
* @NFT_RT_CLASSID: realm value of packet's route (skb->dst->tclassid)
* @NFT_RT_NEXTHOP4: routing nexthop for IPv4
* @NFT_RT_NEXTHOP6: routing nexthop for IPv6
+ * @NFT_RT_TCPMSS: fetch current path tcp mss
*/
enum nft_rt_keys {
NFT_RT_CLASSID,
NFT_RT_NEXTHOP4,
NFT_RT_NEXTHOP6,
+ NFT_RT_TCPMSS,
};
/**
diff --git a/net/netfilter/nft_rt.c b/net/netfilter/nft_rt.c
index c7383d8f88d0..e142e65d3176 100644
--- a/net/netfilter/nft_rt.c
+++ b/net/netfilter/nft_rt.c
@@ -23,6 +23,42 @@ struct nft_rt {
enum nft_registers dreg:8;
};
+static u16 get_tcpmss(const struct nft_pktinfo *pkt, const struct dst_entry *skbdst)
+{
+ u32 minlen = sizeof(struct ipv6hdr), mtu = dst_mtu(skbdst);
+ const struct sk_buff *skb = pkt->skb;
+ const struct nf_afinfo *ai;
+ struct flowi fl;
+
+ memset(&fl, 0, sizeof(fl));
+
+ switch (nft_pf(pkt)) {
+ case NFPROTO_IPV4:
+ fl.u.ip4.daddr = ip_hdr(skb)->saddr;
+ minlen = sizeof(struct iphdr);
+ break;
+ case NFPROTO_IPV6:
+ fl.u.ip6.daddr = ipv6_hdr(skb)->saddr;
+ break;
+ }
+
+ ai = nf_get_afinfo(nft_pf(pkt));
+ if (ai) {
+ struct dst_entry *dst = NULL;
+
+ ai->route(nft_net(pkt), &dst, &fl, false);
+ if (dst) {
+ mtu = min(mtu, dst_mtu(dst));
+ dst_release(dst);
+ }
+ }
+
+ if (mtu <= minlen || mtu > 0xffff)
+ return TCP_MSS_DEFAULT;
+
+ return mtu - minlen;
+}
+
static void nft_rt_get_eval(const struct nft_expr *expr,
struct nft_regs *regs,
const struct nft_pktinfo *pkt)
@@ -57,6 +93,9 @@ static void nft_rt_get_eval(const struct nft_expr *expr,
&ipv6_hdr(skb)->daddr),
sizeof(struct in6_addr));
break;
+ case NFT_RT_TCPMSS:
+ nft_reg_store16(dest, get_tcpmss(pkt, dst));
+ break;
default:
WARN_ON(1);
goto err;
@@ -94,6 +133,9 @@ static int nft_rt_get_init(const struct nft_ctx *ctx,
case NFT_RT_NEXTHOP6:
len = sizeof(struct in6_addr);
break;
+ case NFT_RT_TCPMSS:
+ len = sizeof(u16);
+ break;
default:
return -EOPNOTSUPP;
}
@@ -118,6 +160,29 @@ static int nft_rt_get_dump(struct sk_buff *skb,
return -1;
}
+static int nft_rt_validate(const struct nft_ctx *ctx, const struct nft_expr *expr,
+ const struct nft_data **data)
+{
+ const struct nft_rt *priv = nft_expr_priv(expr);
+ unsigned int hooks;
+
+ switch (priv->key) {
+ case NFT_RT_NEXTHOP4:
+ case NFT_RT_NEXTHOP6:
+ case NFT_RT_CLASSID:
+ return 0;
+ case NFT_RT_TCPMSS:
+ hooks = (1 << NF_INET_FORWARD) |
+ (1 << NF_INET_LOCAL_OUT) |
+ (1 << NF_INET_POST_ROUTING);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return nft_chain_validate_hooks(ctx->chain, hooks);
+}
+
static struct nft_expr_type nft_rt_type;
static const struct nft_expr_ops nft_rt_get_ops = {
.type = &nft_rt_type,
@@ -125,6 +190,7 @@ static const struct nft_expr_ops nft_rt_get_ops = {
.eval = nft_rt_get_eval,
.init = nft_rt_get_init,
.dump = nft_rt_get_dump,
+ .validate = nft_rt_validate,
};
static struct nft_expr_type nft_rt_type __read_mostly = {
--
2.13.0
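
For readers who want the arithmetic of get_tcpmss() without the routing
lookup, here is a userspace sketch. It mirrors the patch as posted: take
the smaller of the two MTUs, subtract the IP header length, and fall back
to 536 (the value of the kernel's TCP_MSS_DEFAULT) when the result would
be unusable. The function name mss_from_mtu, its parameters and the
constants are hypothetical. Note that a TCP MSS conventionally excludes
the TCP header as well; the sketch deliberately follows the posted code,
which subtracts only the IP header.

```c
#include <stdint.h>

#define IPV4_HDR_LEN		20u	/* sizeof(struct iphdr) */
#define IPV6_HDR_LEN		40u	/* sizeof(struct ipv6hdr) */
#define TCP_MSS_FALLBACK	536u	/* same value as TCP_MSS_DEFAULT */

/*
 * fwd_mtu: MTU of the route the packet is taking
 * rev_mtu: MTU of the route back towards the packet's source
 * is_ipv6: non-zero for IPv6, zero for IPv4
 */
static uint16_t mss_from_mtu(uint32_t fwd_mtu, uint32_t rev_mtu, int is_ipv6)
{
	uint32_t minlen = is_ipv6 ? IPV6_HDR_LEN : IPV4_HDR_LEN;
	uint32_t mtu = fwd_mtu < rev_mtu ? fwd_mtu : rev_mtu;

	/* too small to carry any payload, or does not fit in 16 bits */
	if (mtu <= minlen || mtu > 0xffff)
		return TCP_MSS_FALLBACK;

	return (uint16_t)(mtu - minlen);
}
```

For example, two 1500-byte IPv4 routes give 1480, while an IPv6 path with
a 1400-byte reverse route gives 1360.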
* Re: [PATCH nf-next 0/4] netfilter: nftables: tcp mss mangling support
2017-08-08 13:15 [PATCH nf-next 0/4] netfilter: nftables: tcp mss mangling support Florian Westphal
` (4 preceding siblings ...)
2017-08-08 13:48 ` [PATCH v2 " Florian Westphal
@ 2017-08-19 12:05 ` Pablo Neira Ayuso
5 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2017-08-19 12:05 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel
On Tue, Aug 08, 2017 at 03:15:26PM +0200, Florian Westphal wrote:
> This series adds the needed kernel parts to support tcp mss mangling.
> First two patches rework exthdr so we don't have to copy-paste too much,
> patch 3 adds tcp option mangling support.
>
> Last patch allows to retrieve path tcpmss via rt expression, this is so we
> can support iptables TCPMSS --clamp-to-pmtu by combining the two, i.e.:
>
> nft add rule inet mangle forward tcp option mss set rt mss
Series applied, thanks Florian.
Please, post your userspace patchset.