From: Jiri Olsa <jolsa@redhat.com>
To: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org,
Netfilter Developer Mailing List
<netfilter-devel@vger.kernel.org>
Subject: Re: no reassembly for outgoing packets on RAW socket
Date: Fri, 11 Jun 2010 10:16:04 +0200 [thread overview]
Message-ID: <20100611081604.GA1739@jolsa.Belkin> (raw)
In-Reply-To: <4C10B8C8.2050201@trash.net>
On Thu, Jun 10, 2010 at 12:04:56PM +0200, Patrick McHardy wrote:
> Jiri Olsa wrote:
> > On Thu, Jun 10, 2010 at 11:14:04AM +0200, Patrick McHardy wrote:
> >
> >> Jiri Olsa wrote:
> >>
> >>> On Wed, Jun 09, 2010 at 04:16:42PM +0200, Patrick McHardy wrote:
> >>>
> >>>
> >>>>> If this is not the way, I'd appreciatte any hint.. my goal is
> >>>>> to put malformed packet on the wire (more frags bit set for a
> >>>>> non fragmented packet)
> >>>>>
> >>>>>
> >>>> I don't have any good suggestions besides adding a flag to the IPCB
> >>>> and skipping defragmentation based on that.
> >>>>
> >>>>
> >>> ok,
> >>>
> >>> I can see a way when I set this via setsockopt to the socket,
> >>> and check the value before the defragmentation.. would such a new
> >>> setsock option be acceptable?
> >>>
> >>> I'm not sure I can see a way via IPCB, AFAICS it's for skb bound flags
> >>> which arise during the skb processing.
> >>>
> >>>
> >> Yes, a socket option is basically what I was suggesting, using the
> >> IPCB to mark the packet. But just marking the socket is fine of
> >> course.
> >>
> >>
> >>
> >
> > one last thought before the socket option.. :)
> >
> > there's IP_HDRINCL option which is enabled for RAW sockets
> > (can be disabled later by setsockopt)
> >
> > The 'man 7 ip' says:
> > "the user supplies an IP header in front of the user data"
> >
> > but does not mention the outgoing defragmentation.
> >
> > It kind of looks to me more appropriate to preserve the user suplied
> > IP header.. moreover if there's a way to switch this off and have
> > netfilter defragmentation + connection tracking for RAW socket.
> >
> > please check the following patch..
> > (there's no special need for the IPSKB_NODEFRAG, it could check the
> > socket->hdrincl flag directly..)
> >
> > thoughts?
>
> My main concern is that users might expect netfilter to properly
> track fragmented packets created using IP_HDRINCL.
>
I prepared the patch implementing IP_NODEFRAG option for IPv4 socket.
Also I just got an idea, that there could be no reassembly if there are
no rules for connection tracing set.. not sure how can I check that best
so far.. any idea?
thanks,
jirka
---
diff --git a/include/linux/in.h b/include/linux/in.h
index 583c76f..41d88a4 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -85,6 +85,7 @@ struct in_addr {
#define IP_RECVORIGDSTADDR IP_ORIGDSTADDR
#define IP_MINTTL 21
+#define IP_NODEFRAG 22
/* IP_MTU_DISCOVER values */
#define IP_PMTUDISC_DONT 0 /* Never send DF frames */
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 1653de5..1989cfd 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -137,7 +137,8 @@ struct inet_sock {
hdrincl:1,
mc_loop:1,
transparent:1,
- mc_all:1;
+ mc_all:1,
+ nodefrag:1;
int mc_index;
__be32 mc_addr;
struct ip_mc_socklist *mc_list;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 551ce56..84d2c8e 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -355,6 +355,8 @@ lookup_protocol:
inet = inet_sk(sk);
inet->is_icsk = (INET_PROTOSW_ICSK & answer_flags) != 0;
+ inet->nodefrag = 0;
+
if (SOCK_RAW == sock->type) {
inet->inet_num = protocol;
if (IPPROTO_RAW == protocol)
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index ce23178..5aea0eb 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -449,7 +449,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
(1<<IP_MTU_DISCOVER) | (1<<IP_RECVERR) |
(1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) |
(1<<IP_PASSSEC) | (1<<IP_TRANSPARENT) |
- (1<<IP_MINTTL))) ||
+ (1<<IP_MINTTL) | (1<<IP_NODEFRAG))) ||
optname == IP_MULTICAST_TTL ||
optname == IP_MULTICAST_ALL ||
optname == IP_MULTICAST_LOOP ||
@@ -572,6 +572,14 @@ static int do_ip_setsockopt(struct sock *sk, int level,
}
inet->hdrincl = val ? 1 : 0;
break;
+ case IP_NODEFRAG:
+ if (sk->sk_type != SOCK_RAW) {
+ err = -ENOPROTOOPT;
+ break;
+ }
+ inet->nodefrag = val ? 1 : 0;
+ printk("IP_NODEFRAG %p -> %d\n", inet, inet->nodefrag);
+ break;
case IP_MTU_DISCOVER:
if (val < IP_PMTUDISC_DONT || val > IP_PMTUDISC_PROBE)
goto e_inval;
diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
index cb763ae..eab8de3 100644
--- a/net/ipv4/netfilter/nf_defrag_ipv4.c
+++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
@@ -66,6 +66,11 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
+ struct inet_sock *inet = inet_sk(skb->sk);
+
+ if (inet && inet->nodefrag)
+ return NF_ACCEPT;
+
#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
#if !defined(CONFIG_NF_NAT) && !defined(CONFIG_NF_NAT_MODULE)
/* Previously seen (loopback)? Ignore. Do this before
next prev parent reply other threads:[~2010-06-11 8:16 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20100604112708.GA1958@jolsa.lab.eng.brq.redhat.com>
[not found] ` <4C08EB85.3050900@trash.net>
[not found] ` <20100607145558.GA1939@jolsa.lab.eng.brq.redhat.com>
2010-06-09 14:16 ` no reassembly for outgoing packets on RAW socket Patrick McHardy
2010-06-09 15:15 ` Jan Engelhardt
2010-06-09 15:16 ` Patrick McHardy
2010-06-09 15:20 ` Jan Engelhardt
2010-06-10 6:57 ` Jiri Olsa
2010-06-10 6:56 ` Jiri Olsa
2010-06-10 9:14 ` Patrick McHardy
2010-06-10 9:53 ` Jiri Olsa
2010-06-10 10:04 ` Patrick McHardy
2010-06-11 8:16 ` Jiri Olsa [this message]
2010-06-11 9:53 ` Jan Engelhardt
2010-06-11 13:10 ` Jiri Olsa
2010-06-15 6:53 ` [PATCH] net: IP_NODEFRAG option for IPv4 socket Jiri Olsa
2010-06-15 7:13 ` Eric Dumazet
2010-06-15 9:18 ` Jiri Olsa
2010-06-15 9:49 ` Eric Dumazet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100611081604.GA1739@jolsa.Belkin \
--to=jolsa@redhat.com \
--cc=kaber@trash.net \
--cc=netdev@vger.kernel.org \
--cc=netfilter-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).