* no reassembly for outgoing packets on RAW socket
@ 2010-06-04 11:27 Jiri Olsa
2010-06-04 12:03 ` Patrick McHardy
0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2010-06-04 11:27 UTC (permalink / raw)
To: netdev
hi,
I'd like to be able to send out a single IP packet with the MF (more fragments) flag set.
When using RAW sockets, the packet gets stuck in
netfilter (the nf_defrag_ipv4 reassembly unit at NF_INET_LOCAL_OUT)
and never makes it out.
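For reference, here is a minimal userspace sketch of the kind of header
described above, together with the test the defragmentation hook applies
to it. The constants mirror the kernel's IP_MF and IP_OFFSET values; the
struct and helper names (ex_iphdr, ex_fill_header, ex_is_fragment) are
made up for illustration and are not kernel or libc API.

```c
#include <arpa/inet.h>  /* htons */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_IP_MF     0x2000  /* "more fragments" flag, same value as the kernel's IP_MF */
#define EX_IP_OFFSET 0x1FFF  /* fragment offset mask, same value as the kernel's IP_OFFSET */

/* Hypothetical, simplified IPv4 header; multi-byte fields are in network byte order. */
struct ex_iphdr {
    uint8_t  ver_ihl;   /* version (4 bits) + header length in 32-bit words */
    uint8_t  tos;
    uint16_t tot_len;
    uint16_t id;
    uint16_t frag_off;  /* flags (3 bits) + fragment offset (13 bits) */
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t check;
    uint32_t saddr;
    uint32_t daddr;
};

/* Fill a header for a single, unfragmented packet that still carries MF. */
static void ex_fill_header(struct ex_iphdr *iph, uint16_t payload_len)
{
    assert(iph != NULL);
    memset(iph, 0, sizeof(*iph));
    iph->ver_ihl  = (4 << 4) | 5;      /* IPv4, 20-byte header */
    iph->tot_len  = htons((uint16_t)(sizeof(*iph) + payload_len));
    iph->frag_off = htons(EX_IP_MF);   /* MF set, fragment offset 0 */
    iph->ttl      = 64;
    iph->protocol = 1;                 /* ICMP */
}

/* Same predicate as the "Gather fragments" test in ipv4_conntrack_defrag(). */
static int ex_is_fragment(const struct ex_iphdr *iph)
{
    assert(iph != NULL);
    return (iph->frag_off & htons(EX_IP_MF | EX_IP_OFFSET)) != 0;
}
```

A header built this way satisfies ex_is_fragment(), so the reassembly
unit queues it and waits for further fragments that never arrive.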
I made a change which bypasses the outgoing reassembly for
RAW sockets, but I'm not sure whether it's too invasive.
Is there any standard for RAW socket behaviour?
Or another way around? :)
thanks,
jirka
---
diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
index cb763ae..5ef8ab2 100644
--- a/net/ipv4/netfilter/nf_defrag_ipv4.c
+++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
@@ -74,6 +74,10 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
return NF_ACCEPT;
#endif
#endif
+ /* Do not reassemble for raw sockets. */
+ if (skb->sk && skb->sk->sk_type == SOCK_RAW)
+ return NF_ACCEPT;
+
/* Gather fragments. */
if (ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET)) {
enum ip_defrag_users user = nf_ct_defrag_user(hooknum, skb);
diff --git a/net/ipv4/netfilter/nf_nat_standalone.c b/net/ipv4/netfilter/nf_nat_standalone.c
index beb2581..a9aa19c 100644
--- a/net/ipv4/netfilter/nf_nat_standalone.c
+++ b/net/ipv4/netfilter/nf_nat_standalone.c
@@ -86,8 +86,14 @@ nf_nat_fn(unsigned int hooknum,
enum nf_nat_manip_type maniptype = HOOK2MANIP(hooknum);
/* We never see fragments: conntrack defrags on pre-routing
- and local-out, and nf_nat_out protects post-routing. */
- NF_CT_ASSERT(!(ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET)));
+ and local-out, and nf_nat_out protects post-routing.
+ With the exception of RAW sockets. */
+#ifdef CONFIG_NETFILTER_DEBUG
+ int raw = (skb->sk && skb->sk->sk_type == SOCK_RAW);
+ int frag = (ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET));
+
+ NF_CT_ASSERT(!frag || raw);
+#endif
ct = nf_ct_get(skb, &ctinfo);
/* Can't track? It's not due to stress, or conntrack would
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-04 11:27 no reassembly for outgoing packets on RAW socket Jiri Olsa
@ 2010-06-04 12:03 ` Patrick McHardy
2010-06-07 14:55 ` Jiri Olsa
0 siblings, 1 reply; 19+ messages in thread
From: Patrick McHardy @ 2010-06-04 12:03 UTC (permalink / raw)
To: Jiri Olsa; +Cc: netdev
Jiri Olsa wrote:
> hi,
>
> I'd like to be able to sendout a single IP packet with MF flag set.
>
> When using RAW sockets the packet will get stuck in the
> netfilter (NF_INET_LOCAL_OUT nf_defrag_ipv4 reassembly unit)
> and wont ever make it out..
>
> I made a change which bypass the outgoing reassembly for
> RAW sockets, but I'm not sure wether it's too invasive..
That would break reassembly (and thus connection tracking) for cases
where it's really intended.
> Is there any standard for RAW sockets behaviour?
> Or another way around? :)
You could use the NOTRACK target to bypass connection tracking.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-04 12:03 ` Patrick McHardy
@ 2010-06-07 14:55 ` Jiri Olsa
2010-06-09 14:16 ` Patrick McHardy
0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2010-06-07 14:55 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netdev
On Fri, Jun 04, 2010 at 02:03:17PM +0200, Patrick McHardy wrote:
> Jiri Olsa wrote:
> > hi,
> >
> > I'd like to be able to sendout a single IP packet with MF flag set.
> >
> > When using RAW sockets the packet will get stuck in the
> > netfilter (NF_INET_LOCAL_OUT nf_defrag_ipv4 reassembly unit)
> > and wont ever make it out..
> >
> > I made a change which bypass the outgoing reassembly for
> > RAW sockets, but I'm not sure wether it's too invasive..
>
> That would break reassembly (and thus connection tracking) for cases
> where its really intended.
>
> > Is there any standard for RAW sockets behaviour?
> > Or another way around? :)
>
> You could use the NOTRACK target to bypass connection tracking.
ok,
I tried the NOTRACK target, but the packet still goes
through reassembly, because the RAW table hook (priority -300) runs
after the conntrack defragmentation hook (priority -400).
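To illustrate the ordering problem, here is a small userspace sketch.
Netfilter runs the hooks registered at one hook point in ascending
priority order, so the lowest value runs first. The priority values are
the ones from enum nf_ip_hook_priorities; the ex_* names are
hypothetical and only model the ordering, not the real hook machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Priority values as they appear in enum nf_ip_hook_priorities. */
enum {
    EX_PRI_CONNTRACK_DEFRAG = -400,
    EX_PRI_RAW              = -300,
    EX_PRI_CONNTRACK        = -200,
    EX_PRI_MANGLE           = -150
};

/* Hypothetical stand-in for a registered hook. */
struct ex_hook {
    const char *name;
    int priority;
};

static int ex_cmp_priority(const void *a, const void *b)
{
    const struct ex_hook *ha = a;
    const struct ex_hook *hb = b;

    return (ha->priority > hb->priority) - (ha->priority < hb->priority);
}

/* Put hooks into the order they would run in: lowest priority value first. */
static void ex_sort_hooks(struct ex_hook *hooks, size_t n)
{
    assert(hooks != NULL);
    qsort(hooks, n, sizeof(*hooks), ex_cmp_priority);
}

/* Return the position of the named hook in the array, or -1 if absent. */
static int ex_hook_position(const struct ex_hook *hooks, size_t n,
                            const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(hooks[i].name, name) == 0)
            return (int)i;
    return -1;
}
```

With these values the defrag hook always sorts ahead of the raw table,
so a NOTRACK rule in the raw table is evaluated only after the packet
has already been handed to the reassembly unit.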
I was able to bypass it with the attached patch and the following
command:
iptables -v -t raw -A OUTPUT -p icmp -j NOTRACK
again, not sure if this is too invasive ;)
If this is not the way, I'd appreciate any hint. My goal is
to put a malformed packet on the wire (more fragments bit set for a
non-fragmented packet).
thanks for help,
jirka
---
diff --git a/include/linux/netfilter_ipv4.h b/include/linux/netfilter_ipv4.h
index 29c7727..d249b6a 100644
--- a/include/linux/netfilter_ipv4.h
+++ b/include/linux/netfilter_ipv4.h
@@ -53,8 +53,8 @@
enum nf_ip_hook_priorities {
NF_IP_PRI_FIRST = INT_MIN,
- NF_IP_PRI_CONNTRACK_DEFRAG = -400,
- NF_IP_PRI_RAW = -300,
+ NF_IP_PRI_RAW = -400,
+ NF_IP_PRI_CONNTRACK_DEFRAG = -300,
NF_IP_PRI_SELINUX_FIRST = -225,
NF_IP_PRI_CONNTRACK = -200,
NF_IP_PRI_MANGLE = -150,
diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
index cb763ae..cb865d1 100644
--- a/net/ipv4/netfilter/nf_defrag_ipv4.c
+++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
@@ -74,6 +74,9 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
return NF_ACCEPT;
#endif
#endif
+ if (nf_ct_is_untracked(skb))
+ return NF_ACCEPT;
+
/* Gather fragments. */
if (ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET)) {
enum ip_defrag_users user = nf_ct_defrag_user(hooknum, skb);
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-07 14:55 ` Jiri Olsa
@ 2010-06-09 14:16 ` Patrick McHardy
2010-06-09 15:15 ` Jan Engelhardt
2010-06-10 6:56 ` Jiri Olsa
0 siblings, 2 replies; 19+ messages in thread
From: Patrick McHardy @ 2010-06-09 14:16 UTC (permalink / raw)
To: Jiri Olsa; +Cc: netdev, Netfilter Developer Mailing List
Jiri Olsa wrote:
> On Fri, Jun 04, 2010 at 02:03:17PM +0200, Patrick McHardy wrote:
>
>> Jiri Olsa wrote:
>>
>>> hi,
>>>
>>> I'd like to be able to sendout a single IP packet with MF flag set.
>>>
>>> When using RAW sockets the packet will get stuck in the
>>> netfilter (NF_INET_LOCAL_OUT nf_defrag_ipv4 reassembly unit)
>>> and wont ever make it out..
>>>
>>> I made a change which bypass the outgoing reassembly for
>>> RAW sockets, but I'm not sure wether it's too invasive..
>>>
>> That would break reassembly (and thus connection tracking) for cases
>> where its really intended.
>>
>>
>>> Is there any standard for RAW sockets behaviour?
>>> Or another way around? :)
>>>
>> You could use the NOTRACK target to bypass connection tracking.
>>
>
> ok,
>
> I tried the NOTRACK target, but the packet is still going
> throught reassembly, because the RAW filter has lower priority
> then the connection track defragmentation..
>
Right.
> I was able to get it bypassed by attached patch and following
> command:
>
> iptables -v -t raw -A OUTPUT -p icmp -j NOTRACK
>
> again, not sure if this is too invasive ;)
>
Well, we can't change it in the mainline kernel.
> If this is not the way, I'd appreciatte any hint.. my goal is
> to put malformed packet on the wire (more frags bit set for a
> non fragmented packet)
I don't have any good suggestions besides adding a flag to the IPCB
and skipping defragmentation based on that.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-09 14:16 ` Patrick McHardy
@ 2010-06-09 15:15 ` Jan Engelhardt
2010-06-09 15:16 ` Patrick McHardy
2010-06-10 6:56 ` Jiri Olsa
1 sibling, 1 reply; 19+ messages in thread
From: Jan Engelhardt @ 2010-06-09 15:15 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Jiri Olsa, netdev, Netfilter Developer Mailing List
On Wednesday 2010-06-09 16:16, Patrick McHardy wrote:
>>>> I'd like to be able to sendout a single IP packet with MF flag set.
>>>>
>>>> When using RAW sockets the packet will get stuck in the
>>>> netfilter (NF_INET_LOCAL_OUT nf_defrag_ipv4 reassembly unit)
>>>> and wont ever make it out..
>>>>
>>>> I made a change which bypass the outgoing reassembly for
>>>> RAW sockets, but I'm not sure wether it's too invasive..
>>>>
>>> That would break reassembly (and thus connection tracking) for cases
>>> where its really intended.
>>>
>>>> Is there any standard for RAW sockets behaviour?
>>>> Or another way around? :)
>>>>
>>> You could use the NOTRACK target to bypass connection tracking.
>>
>> I tried the NOTRACK target, but the packet is still going
>> throught reassembly, because the RAW filter has lower priority
>> then the connection track defragmentation..
>
>Right.
Blech. That reminds me of
http://marc.info/?l=netfilter-devel&m=126581823826735&w=2
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-09 15:15 ` Jan Engelhardt
@ 2010-06-09 15:16 ` Patrick McHardy
2010-06-09 15:20 ` Jan Engelhardt
0 siblings, 1 reply; 19+ messages in thread
From: Patrick McHardy @ 2010-06-09 15:16 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Jiri Olsa, netdev, Netfilter Developer Mailing List
Jan Engelhardt wrote:
> On Wednesday 2010-06-09 16:16, Patrick McHardy wrote:
>>>> You could use the NOTRACK target to bypass connection tracking.
>>>>
>>> I tried the NOTRACK target, but the packet is still going
>>> throught reassembly, because the RAW filter has lower priority
>>> then the connection track defragmentation..
>>>
>> Right.
>>
>
> Blech. That reminds me of
> http://marc.info/?l=netfilter-devel&m=126581823826735&w=2
>
We already fixed that.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-09 15:16 ` Patrick McHardy
@ 2010-06-09 15:20 ` Jan Engelhardt
2010-06-10 6:57 ` Jiri Olsa
0 siblings, 1 reply; 19+ messages in thread
From: Jan Engelhardt @ 2010-06-09 15:20 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Jiri Olsa, netdev, Netfilter Developer Mailing List
On Wednesday 2010-06-09 17:16, Patrick McHardy wrote:
>Jan Engelhardt wrote:
>> On Wednesday 2010-06-09 16:16, Patrick McHardy wrote:
>>>>> You could use the NOTRACK target to bypass connection tracking.
>>>>>
>>>> I tried the NOTRACK target, but the packet is still going
>>>> throught reassembly, because the RAW filter has lower priority
>>>> then the connection track defragmentation..
>>>
>>> Right.
>>
>> Blech. That reminds me of
>> http://marc.info/?l=netfilter-devel&m=126581823826735&w=2
>
>We already fixed that.
I know, and I posted it for the understanding of the OP
as to why RAW is after DEFRAG.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-09 14:16 ` Patrick McHardy
2010-06-09 15:15 ` Jan Engelhardt
@ 2010-06-10 6:56 ` Jiri Olsa
2010-06-10 9:14 ` Patrick McHardy
1 sibling, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2010-06-10 6:56 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netdev, Netfilter Developer Mailing List
On Wed, Jun 09, 2010 at 04:16:42PM +0200, Patrick McHardy wrote:
> Jiri Olsa wrote:
> > On Fri, Jun 04, 2010 at 02:03:17PM +0200, Patrick McHardy wrote:
> >
> >> Jiri Olsa wrote:
> >>
> >>> hi,
> >>>
> >>> I'd like to be able to sendout a single IP packet with MF flag set.
> >>>
> >>> When using RAW sockets the packet will get stuck in the
> >>> netfilter (NF_INET_LOCAL_OUT nf_defrag_ipv4 reassembly unit)
> >>> and wont ever make it out..
> >>>
> >>> I made a change which bypass the outgoing reassembly for
> >>> RAW sockets, but I'm not sure wether it's too invasive..
> >>>
> >> That would break reassembly (and thus connection tracking) for cases
> >> where its really intended.
> >>
> >>
> >>> Is there any standard for RAW sockets behaviour?
> >>> Or another way around? :)
> >>>
> >> You could use the NOTRACK target to bypass connection tracking.
> >>
> >
> > ok,
> >
> > I tried the NOTRACK target, but the packet is still going
> > throught reassembly, because the RAW filter has lower priority
> > then the connection track defragmentation..
> >
>
> Right.
> > I was able to get it bypassed by attached patch and following
> > command:
> >
> > iptables -v -t raw -A OUTPUT -p icmp -j NOTRACK
> >
> > again, not sure if this is too invasive ;)
> >
>
> Well, we can't change it in the mainline kernel.
> > If this is not the way, I'd appreciatte any hint.. my goal is
> > to put malformed packet on the wire (more frags bit set for a
> > non fragmented packet)
>
> I don't have any good suggestions besides adding a flag to the IPCB
> and skipping defragmentation based on that.
ok,
I can see a way to do this: set a flag on the socket via setsockopt
and check it before the defragmentation. Would such a new
socket option be acceptable?
I'm not sure I can see a way via IPCB; AFAICS it's for skb-bound flags
which arise during skb processing.
thanks,
jirka
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-09 15:20 ` Jan Engelhardt
@ 2010-06-10 6:57 ` Jiri Olsa
0 siblings, 0 replies; 19+ messages in thread
From: Jiri Olsa @ 2010-06-10 6:57 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Patrick McHardy, netdev, Netfilter Developer Mailing List
On Wed, Jun 09, 2010 at 05:20:37PM +0200, Jan Engelhardt wrote:
>
> On Wednesday 2010-06-09 17:16, Patrick McHardy wrote:
> >Jan Engelhardt wrote:
> >> On Wednesday 2010-06-09 16:16, Patrick McHardy wrote:
> >>>>> You could use the NOTRACK target to bypass connection tracking.
> >>>>>
> >>>> I tried the NOTRACK target, but the packet is still going
> >>>> throught reassembly, because the RAW filter has lower priority
> >>>> then the connection track defragmentation..
> >>>
> >>> Right.
> >>
> >> Blech. That reminds me of
> >> http://marc.info/?l=netfilter-devel&m=126581823826735&w=2
> >
> >We already fixed that.
>
> I know, and I posted it for the understanding of the OP
> as to why RAW is after DEFRAG.
thanks, it's helpful
jirka
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-10 6:56 ` Jiri Olsa
@ 2010-06-10 9:14 ` Patrick McHardy
2010-06-10 9:53 ` Jiri Olsa
0 siblings, 1 reply; 19+ messages in thread
From: Patrick McHardy @ 2010-06-10 9:14 UTC (permalink / raw)
To: Jiri Olsa; +Cc: netdev, Netfilter Developer Mailing List
Jiri Olsa wrote:
> On Wed, Jun 09, 2010 at 04:16:42PM +0200, Patrick McHardy wrote:
>
>>> If this is not the way, I'd appreciatte any hint.. my goal is
>>> to put malformed packet on the wire (more frags bit set for a
>>> non fragmented packet)
>>>
>> I don't have any good suggestions besides adding a flag to the IPCB
>> and skipping defragmentation based on that.
>>
> ok,
>
> I can see a way when I set this via setsockopt to the socket,
> and check the value before the defragmentation.. would such a new
> setsock option be acceptable?
>
> I'm not sure I can see a way via IPCB, AFAICS it's for skb bound flags
> which arise during the skb processing.
>
Yes, a socket option is basically what I was suggesting, using the
IPCB to mark the packet. But just marking the socket is fine of
course.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-10 9:14 ` Patrick McHardy
@ 2010-06-10 9:53 ` Jiri Olsa
2010-06-10 10:04 ` Patrick McHardy
0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2010-06-10 9:53 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netdev, Netfilter Developer Mailing List
On Thu, Jun 10, 2010 at 11:14:04AM +0200, Patrick McHardy wrote:
> Jiri Olsa wrote:
> > On Wed, Jun 09, 2010 at 04:16:42PM +0200, Patrick McHardy wrote:
> >
> >>> If this is not the way, I'd appreciatte any hint.. my goal is
> >>> to put malformed packet on the wire (more frags bit set for a
> >>> non fragmented packet)
> >>>
> >> I don't have any good suggestions besides adding a flag to the IPCB
> >> and skipping defragmentation based on that.
> >>
> > ok,
> >
> > I can see a way when I set this via setsockopt to the socket,
> > and check the value before the defragmentation.. would such a new
> > setsock option be acceptable?
> >
> > I'm not sure I can see a way via IPCB, AFAICS it's for skb bound flags
> > which arise during the skb processing.
> >
>
> Yes, a socket option is basically what I was suggesting, using the
> IPCB to mark the packet. But just marking the socket is fine of
> course.
>
>
one last thought before the socket option.. :)
there's IP_HDRINCL option which is enabled for RAW sockets
(can be disabled later by setsockopt)
The 'man 7 ip' says:
"the user supplies an IP header in front of the user data"
but does not mention the outgoing defragmentation.
It seems more appropriate to me to preserve the user-supplied
IP header, especially since there's a way to switch this off (by clearing
IP_HDRINCL) and get netfilter defragmentation + connection tracking for the RAW socket.
please check the following patch..
(there's no special need for the IPSKB_NODEFRAG, it could check the
socket->hdrincl flag directly..)
thoughts?
thanks,
jirka
---
diff --git a/include/net/ip.h b/include/net/ip.h
index 452f229..201a17e 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -42,6 +42,7 @@ struct inet_skb_parm {
#define IPSKB_XFRM_TRANSFORMED 4
#define IPSKB_FRAG_COMPLETE 8
#define IPSKB_REROUTED 16
+#define IPSKB_NODEFRAG 32
};
static inline unsigned int ip_hdrlen(const struct sk_buff *skb)
diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
index cb763ae..0355bea 100644
--- a/net/ipv4/netfilter/nf_defrag_ipv4.c
+++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
@@ -74,6 +74,9 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
return NF_ACCEPT;
#endif
#endif
+ if (IPCB(skb)->flags & IPSKB_NODEFRAG)
+ return NF_ACCEPT;
+
/* Gather fragments. */
if (ip_hdr(skb)->frag_off & htons(IP_MF | IP_OFFSET)) {
enum ip_defrag_users user = nf_ct_defrag_user(hooknum, skb);
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 2c7a163..978e813 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -354,6 +354,13 @@ static int raw_send_hdrinc(struct sock *sk, void *from, size_t length,
if (memcpy_fromiovecend((void *)iph, from, 0, length))
goto error_free;
+ /*
+ * The header is created by the user, preserve the fragment
+ * settings through the defragmentation unit.
+ */
+ if (iph->frag_off & htons(IP_MF|IP_OFFSET))
+ IPCB(skb)->flags |= IPSKB_NODEFRAG;
+
iphlen = iph->ihl * 4;
/*
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-10 9:53 ` Jiri Olsa
@ 2010-06-10 10:04 ` Patrick McHardy
2010-06-11 8:16 ` Jiri Olsa
0 siblings, 1 reply; 19+ messages in thread
From: Patrick McHardy @ 2010-06-10 10:04 UTC (permalink / raw)
To: Jiri Olsa; +Cc: netdev, Netfilter Developer Mailing List
Jiri Olsa wrote:
> On Thu, Jun 10, 2010 at 11:14:04AM +0200, Patrick McHardy wrote:
>
>> Jiri Olsa wrote:
>>
>>> On Wed, Jun 09, 2010 at 04:16:42PM +0200, Patrick McHardy wrote:
>>>
>>>
>>>>> If this is not the way, I'd appreciatte any hint.. my goal is
>>>>> to put malformed packet on the wire (more frags bit set for a
>>>>> non fragmented packet)
>>>>>
>>>>>
>>>> I don't have any good suggestions besides adding a flag to the IPCB
>>>> and skipping defragmentation based on that.
>>>>
>>>>
>>> ok,
>>>
>>> I can see a way when I set this via setsockopt to the socket,
>>> and check the value before the defragmentation.. would such a new
>>> setsock option be acceptable?
>>>
>>> I'm not sure I can see a way via IPCB, AFAICS it's for skb bound flags
>>> which arise during the skb processing.
>>>
>>>
>> Yes, a socket option is basically what I was suggesting, using the
>> IPCB to mark the packet. But just marking the socket is fine of
>> course.
>>
>>
>>
>
> one last thought before the socket option.. :)
>
> there's IP_HDRINCL option which is enabled for RAW sockets
> (can be disabled later by setsockopt)
>
> The 'man 7 ip' says:
> "the user supplies an IP header in front of the user data"
>
> but does not mention the outgoing defragmentation.
>
> It kind of looks to me more appropriate to preserve the user suplied
> IP header.. moreover if there's a way to switch this off and have
> netfilter defragmentation + connection tracking for RAW socket.
>
> please check the following patch..
> (there's no special need for the IPSKB_NODEFRAG, it could check the
> socket->hdrincl flag directly..)
>
> thoughts?
My main concern is that users might expect netfilter to properly
track fragmented packets created using IP_HDRINCL.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-10 10:04 ` Patrick McHardy
@ 2010-06-11 8:16 ` Jiri Olsa
2010-06-11 9:53 ` Jan Engelhardt
0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2010-06-11 8:16 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netdev, Netfilter Developer Mailing List
On Thu, Jun 10, 2010 at 12:04:56PM +0200, Patrick McHardy wrote:
> Jiri Olsa wrote:
> > On Thu, Jun 10, 2010 at 11:14:04AM +0200, Patrick McHardy wrote:
> >
> >> Jiri Olsa wrote:
> >>
> >>> On Wed, Jun 09, 2010 at 04:16:42PM +0200, Patrick McHardy wrote:
> >>>
> >>>
> >>>>> If this is not the way, I'd appreciatte any hint.. my goal is
> >>>>> to put malformed packet on the wire (more frags bit set for a
> >>>>> non fragmented packet)
> >>>>>
> >>>>>
> >>>> I don't have any good suggestions besides adding a flag to the IPCB
> >>>> and skipping defragmentation based on that.
> >>>>
> >>>>
> >>> ok,
> >>>
> >>> I can see a way when I set this via setsockopt to the socket,
> >>> and check the value before the defragmentation.. would such a new
> >>> setsock option be acceptable?
> >>>
> >>> I'm not sure I can see a way via IPCB, AFAICS it's for skb bound flags
> >>> which arise during the skb processing.
> >>>
> >>>
> >> Yes, a socket option is basically what I was suggesting, using the
> >> IPCB to mark the packet. But just marking the socket is fine of
> >> course.
> >>
> >>
> >>
> >
> > one last thought before the socket option.. :)
> >
> > there's IP_HDRINCL option which is enabled for RAW sockets
> > (can be disabled later by setsockopt)
> >
> > The 'man 7 ip' says:
> > "the user supplies an IP header in front of the user data"
> >
> > but does not mention the outgoing defragmentation.
> >
> > It kind of looks to me more appropriate to preserve the user suplied
> > IP header.. moreover if there's a way to switch this off and have
> > netfilter defragmentation + connection tracking for RAW socket.
> >
> > please check the following patch..
> > (there's no special need for the IPSKB_NODEFRAG, it could check the
> > socket->hdrincl flag directly..)
> >
> > thoughts?
>
> My main concern is that users might expect netfilter to properly
> track fragmented packets created using IP_HDRINCL.
>
I prepared the patch implementing IP_NODEFRAG option for IPv4 socket.
Also I just got an idea: there could be no reassembly if no
connection tracking rules are set. I'm not sure how best to check that
so far. Any ideas?
thanks,
jirka
---
diff --git a/include/linux/in.h b/include/linux/in.h
index 583c76f..41d88a4 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -85,6 +85,7 @@ struct in_addr {
#define IP_RECVORIGDSTADDR IP_ORIGDSTADDR
#define IP_MINTTL 21
+#define IP_NODEFRAG 22
/* IP_MTU_DISCOVER values */
#define IP_PMTUDISC_DONT 0 /* Never send DF frames */
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 1653de5..1989cfd 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -137,7 +137,8 @@ struct inet_sock {
hdrincl:1,
mc_loop:1,
transparent:1,
- mc_all:1;
+ mc_all:1,
+ nodefrag:1;
int mc_index;
__be32 mc_addr;
struct ip_mc_socklist *mc_list;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 551ce56..84d2c8e 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -355,6 +355,8 @@ lookup_protocol:
inet = inet_sk(sk);
inet->is_icsk = (INET_PROTOSW_ICSK & answer_flags) != 0;
+ inet->nodefrag = 0;
+
if (SOCK_RAW == sock->type) {
inet->inet_num = protocol;
if (IPPROTO_RAW == protocol)
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index ce23178..5aea0eb 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -449,7 +449,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
(1<<IP_MTU_DISCOVER) | (1<<IP_RECVERR) |
(1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) |
(1<<IP_PASSSEC) | (1<<IP_TRANSPARENT) |
- (1<<IP_MINTTL))) ||
+ (1<<IP_MINTTL) | (1<<IP_NODEFRAG))) ||
optname == IP_MULTICAST_TTL ||
optname == IP_MULTICAST_ALL ||
optname == IP_MULTICAST_LOOP ||
@@ -572,6 +572,14 @@ static int do_ip_setsockopt(struct sock *sk, int level,
}
inet->hdrincl = val ? 1 : 0;
break;
+ case IP_NODEFRAG:
+ if (sk->sk_type != SOCK_RAW) {
+ err = -ENOPROTOOPT;
+ break;
+ }
+ inet->nodefrag = val ? 1 : 0;
+ printk("IP_NODEFRAG %p -> %d\n", inet, inet->nodefrag);
+ break;
case IP_MTU_DISCOVER:
if (val < IP_PMTUDISC_DONT || val > IP_PMTUDISC_PROBE)
goto e_inval;
diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
index cb763ae..eab8de3 100644
--- a/net/ipv4/netfilter/nf_defrag_ipv4.c
+++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
@@ -66,6 +66,11 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
+ struct inet_sock *inet = inet_sk(skb->sk);
+
+ if (inet && inet->nodefrag)
+ return NF_ACCEPT;
+
#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
#if !defined(CONFIG_NF_NAT) && !defined(CONFIG_NF_NAT_MODULE)
/* Previously seen (loopback)? Ignore. Do this before
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-11 8:16 ` Jiri Olsa
@ 2010-06-11 9:53 ` Jan Engelhardt
2010-06-11 13:10 ` Jiri Olsa
0 siblings, 1 reply; 19+ messages in thread
From: Jan Engelhardt @ 2010-06-11 9:53 UTC (permalink / raw)
To: Jiri Olsa; +Cc: Patrick McHardy, netdev, Netfilter Developer Mailing List
On Friday 2010-06-11 10:16, Jiri Olsa wrote:
>
>I prepared the patch implementing IP_NODEFRAG option for IPv4 socket.
>
>Also I just got an idea, that there could be no reassembly if there are
>no rules for connection tracing set.. not sure how can I check that best
>so far.. any idea?
>
>@@ -572,6 +572,14 @@ static int do_ip_setsockopt(struct sock *sk, int level,
> }
> inet->hdrincl = val ? 1 : 0;
> break;
>+ case IP_NODEFRAG:
>+ if (sk->sk_type != SOCK_RAW) {
>+ err = -ENOPROTOOPT;
>+ break;
>+ }
>+ inet->nodefrag = val ? 1 : 0;
>+ printk("IP_NODEFRAG %p -> %d\n", inet, inet->nodefrag);
>+ break;
You want to get rid of this printk, otherwise it spams the logs.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: no reassembly for outgoing packets on RAW socket
2010-06-11 9:53 ` Jan Engelhardt
@ 2010-06-11 13:10 ` Jiri Olsa
2010-06-15 6:53 ` [PATCH] net: IP_NODEFRAG option for IPv4 socket Jiri Olsa
0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2010-06-11 13:10 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Patrick McHardy, netdev, Netfilter Developer Mailing List
On Fri, Jun 11, 2010 at 11:53:32AM +0200, Jan Engelhardt wrote:
>
> On Friday 2010-06-11 10:16, Jiri Olsa wrote:
> >
> >I prepared the patch implementing IP_NODEFRAG option for IPv4 socket.
> >
> >Also I just got an idea, that there could be no reassembly if there are
> >no rules for connection tracing set.. not sure how can I check that best
> >so far.. any idea?
> >
> >@@ -572,6 +572,14 @@ static int do_ip_setsockopt(struct sock *sk, int level,
> > }
> > inet->hdrincl = val ? 1 : 0;
> > break;
> >+ case IP_NODEFRAG:
> >+ if (sk->sk_type != SOCK_RAW) {
> >+ err = -ENOPROTOOPT;
> >+ break;
> >+ }
> >+ inet->nodefrag = val ? 1 : 0;
> >+ printk("IP_NODEFRAG %p -> %d\n", inet, inet->nodefrag);
> >+ break;
>
> You want to get rid of this printk otherwise it spews the logs.
oops, I forgot to remove this one... thanks
new patch is attached
wbr,
jirka
---
diff --git a/include/linux/in.h b/include/linux/in.h
index 583c76f..41d88a4 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -85,6 +85,7 @@ struct in_addr {
#define IP_RECVORIGDSTADDR IP_ORIGDSTADDR
#define IP_MINTTL 21
+#define IP_NODEFRAG 22
/* IP_MTU_DISCOVER values */
#define IP_PMTUDISC_DONT 0 /* Never send DF frames */
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 1653de5..1989cfd 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -137,7 +137,8 @@ struct inet_sock {
hdrincl:1,
mc_loop:1,
transparent:1,
- mc_all:1;
+ mc_all:1,
+ nodefrag:1;
int mc_index;
__be32 mc_addr;
struct ip_mc_socklist *mc_list;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 551ce56..84d2c8e 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -355,6 +355,8 @@ lookup_protocol:
inet = inet_sk(sk);
inet->is_icsk = (INET_PROTOSW_ICSK & answer_flags) != 0;
+ inet->nodefrag = 0;
+
if (SOCK_RAW == sock->type) {
inet->inet_num = protocol;
if (IPPROTO_RAW == protocol)
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index ce23178..d8196e1 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -449,7 +449,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
(1<<IP_MTU_DISCOVER) | (1<<IP_RECVERR) |
(1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) |
(1<<IP_PASSSEC) | (1<<IP_TRANSPARENT) |
- (1<<IP_MINTTL))) ||
+ (1<<IP_MINTTL) | (1<<IP_NODEFRAG))) ||
optname == IP_MULTICAST_TTL ||
optname == IP_MULTICAST_ALL ||
optname == IP_MULTICAST_LOOP ||
@@ -572,6 +572,13 @@ static int do_ip_setsockopt(struct sock *sk, int level,
}
inet->hdrincl = val ? 1 : 0;
break;
+ case IP_NODEFRAG:
+ if (sk->sk_type != SOCK_RAW) {
+ err = -ENOPROTOOPT;
+ break;
+ }
+ inet->nodefrag = val ? 1 : 0;
+ break;
case IP_MTU_DISCOVER:
if (val < IP_PMTUDISC_DONT || val > IP_PMTUDISC_PROBE)
goto e_inval;
diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
index cb763ae..eab8de3 100644
--- a/net/ipv4/netfilter/nf_defrag_ipv4.c
+++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
@@ -66,6 +66,11 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
+ struct inet_sock *inet = inet_sk(skb->sk);
+
+ if (inet && inet->nodefrag)
+ return NF_ACCEPT;
+
#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
#if !defined(CONFIG_NF_NAT) && !defined(CONFIG_NF_NAT_MODULE)
/* Previously seen (loopback)? Ignore. Do this before
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH] net: IP_NODEFRAG option for IPv4 socket
2010-06-11 13:10 ` Jiri Olsa
@ 2010-06-15 6:53 ` Jiri Olsa
2010-06-15 7:13 ` Eric Dumazet
0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2010-06-15 6:53 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Patrick McHardy, netdev, Netfilter Developer Mailing List
hi,
I prepared the patch implementing IP_NODEFRAG option for IPv4 socket.
The reason is that there's currently no other way to send out a packet
whose user-supplied header carries fragmentation fields (e.g. the MF flag)
without it being caught by the reassembly unit.
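A minimal sketch of how userspace might use the option, assuming a
kernel with this patch applied. The fallback define uses the value 22
from the patch for headers that do not provide IP_NODEFRAG;
ex_set_nodefrag is a hypothetical helper name, not an existing API.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IP_NODEFRAG
#define IP_NODEFRAG 22  /* value from the patch; older headers may lack it */
#endif

/* Turn IP_NODEFRAG on or off for a socket. Returns 0 on success or -errno. */
static int ex_set_nodefrag(int fd, int on)
{
    int val = on ? 1 : 0;

    if (setsockopt(fd, IPPROTO_IP, IP_NODEFRAG, &val, sizeof(val)) < 0)
        return -errno;
    return 0;
}
```

The intended use is on a raw socket, e.g. one created with
socket(AF_INET, SOCK_RAW, IPPROTO_RAW), which needs CAP_NET_RAW. With
the patch as posted, any other socket type gets -ENOPROTOOPT.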
wbr,
jirka
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
---
diff --git a/include/linux/in.h b/include/linux/in.h
index 583c76f..41d88a4 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -85,6 +85,7 @@ struct in_addr {
#define IP_RECVORIGDSTADDR IP_ORIGDSTADDR
#define IP_MINTTL 21
+#define IP_NODEFRAG 22
/* IP_MTU_DISCOVER values */
#define IP_PMTUDISC_DONT 0 /* Never send DF frames */
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 1653de5..1989cfd 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -137,7 +137,8 @@ struct inet_sock {
hdrincl:1,
mc_loop:1,
transparent:1,
- mc_all:1;
+ mc_all:1,
+ nodefrag:1;
int mc_index;
__be32 mc_addr;
struct ip_mc_socklist *mc_list;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 551ce56..84d2c8e 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -355,6 +355,8 @@ lookup_protocol:
inet = inet_sk(sk);
inet->is_icsk = (INET_PROTOSW_ICSK & answer_flags) != 0;
+ inet->nodefrag = 0;
+
if (SOCK_RAW == sock->type) {
inet->inet_num = protocol;
if (IPPROTO_RAW == protocol)
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index ce23178..d8196e1 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -449,7 +449,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
(1<<IP_MTU_DISCOVER) | (1<<IP_RECVERR) |
(1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) |
(1<<IP_PASSSEC) | (1<<IP_TRANSPARENT) |
- (1<<IP_MINTTL))) ||
+ (1<<IP_MINTTL) | (1<<IP_NODEFRAG))) ||
optname == IP_MULTICAST_TTL ||
optname == IP_MULTICAST_ALL ||
optname == IP_MULTICAST_LOOP ||
@@ -572,6 +572,13 @@ static int do_ip_setsockopt(struct sock *sk, int level,
}
inet->hdrincl = val ? 1 : 0;
break;
+ case IP_NODEFRAG:
+ if (sk->sk_type != SOCK_RAW) {
+ err = -ENOPROTOOPT;
+ break;
+ }
+ inet->nodefrag = val ? 1 : 0;
+ break;
case IP_MTU_DISCOVER:
if (val < IP_PMTUDISC_DONT || val > IP_PMTUDISC_PROBE)
goto e_inval;
diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
index cb763ae..eab8de3 100644
--- a/net/ipv4/netfilter/nf_defrag_ipv4.c
+++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
@@ -66,6 +66,11 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
+ struct inet_sock *inet = inet_sk(skb->sk);
+
+ if (inet && inet->nodefrag)
+ return NF_ACCEPT;
+
#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
#if !defined(CONFIG_NF_NAT) && !defined(CONFIG_NF_NAT_MODULE)
/* Previously seen (loopback)? Ignore. Do this before
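
For readers following the thread, here is a minimal userspace sketch of how the proposed option might be used. It assumes the patch above is applied and that IP_NODEFRAG keeps the value 22. The helper names (enable_nodefrag, build_mf_header, is_fragment) and the FRAG_* constants are hypothetical and exist only for this illustration; FRAG_MF and FRAG_OFFMASK are meant to mirror the kernel's IP_MF and IP_OFFSET. Opening the raw socket itself (which needs CAP_NET_RAW) and sending are left out.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef IP_NODEFRAG
#define IP_NODEFRAG 22          /* value proposed in the patch above */
#endif

/* Illustration-only names; they mirror the kernel's IP_MF and IP_OFFSET. */
#define FRAG_MF      0x2000     /* "more fragments" flag */
#define FRAG_OFFMASK 0x1fff     /* fragment offset bits */

/* Ask the kernel to skip netfilter defragmentation for this socket.
 * With the patch applied, this is expected to fail with ENOPROTOOPT
 * on anything other than a SOCK_RAW socket.
 * Returns the result of setsockopt(). */
static int enable_nodefrag(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_NODEFRAG, &on, sizeof(on));
}

/* Fill a minimal IPv4 header that claims to be the first fragment of a
 * larger datagram (MF set, offset 0). Addresses are in network byte
 * order. The checksum is left at zero; filling it in is up to the caller
 * or the stack. */
static void build_mf_header(struct iphdr *iph, uint32_t saddr,
                            uint32_t daddr, uint16_t payload_len)
{
    assert(iph != NULL);
    memset(iph, 0, sizeof(*iph));
    iph->version  = 4;
    iph->ihl      = 5;                          /* 20-byte header */
    iph->tot_len  = htons(sizeof(*iph) + payload_len);
    iph->id       = htons(0x1234);              /* arbitrary example id */
    iph->frag_off = htons(FRAG_MF);
    iph->ttl      = 64;
    iph->protocol = IPPROTO_UDP;
    iph->saddr    = saddr;
    iph->daddr    = daddr;
}

/* Same kind of test the defrag hook applies: a packet counts as a
 * fragment if MF is set or the fragment offset is non-zero. */
static int is_fragment(const struct iphdr *iph)
{
    return (iph->frag_off & htons(FRAG_MF | FRAG_OFFMASK)) != 0;
}
```

A caller would typically open a SOCK_RAW socket with IP_HDRINCL enabled, call enable_nodefrag() on it, and then send a buffer that starts with a header built as above. Without the option, such a packet matches is_fragment() and ends up waiting in the reassembly queue at NF_INET_LOCAL_OUT.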
* Re: [PATCH] net: IP_NODEFRAG option for IPv4 socket
2010-06-15 6:53 ` [PATCH] net: IP_NODEFRAG option for IPv4 socket Jiri Olsa
@ 2010-06-15 7:13 ` Eric Dumazet
2010-06-15 9:18 ` Jiri Olsa
0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2010-06-15 7:13 UTC (permalink / raw)
To: Jiri Olsa
Cc: Jan Engelhardt, Patrick McHardy, netdev,
Netfilter Developer Mailing List
On Tuesday, 15 June 2010 at 08:53 +0200, Jiri Olsa wrote:
> hi,
>
> I prepared a patch implementing the IP_NODEFRAG option for IPv4 sockets.
> The reason is that there is currently no other way to send out a packet
> whose fragmentation-related header fields are set by the user.
>
Obviously, you need to update documentation and man pages as well.
MAN-PAGES: MANUAL PAGES FOR LINUX -- Sections 2, 3, 4, 5, and 7
M: Michael Kerrisk <mtk.manpages@gmail.com>
W: http://www.kernel.org/doc/man-pages
L: linux-man@vger.kernel.org
S: Maintained
> wbr,
> jirka
>
>
> Signed-off-by: Jiri Olsa <jolsa@redhat.com>
> ---
> diff --git a/include/linux/in.h b/include/linux/in.h
> index 583c76f..41d88a4 100644
> --- a/include/linux/in.h
> +++ b/include/linux/in.h
> @@ -85,6 +85,7 @@ struct in_addr {
> #define IP_RECVORIGDSTADDR IP_ORIGDSTADDR
>
> #define IP_MINTTL 21
> +#define IP_NODEFRAG 22
>
> /* IP_MTU_DISCOVER values */
> #define IP_PMTUDISC_DONT 0 /* Never send DF frames */
> diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
> index 1653de5..1989cfd 100644
> --- a/include/net/inet_sock.h
> +++ b/include/net/inet_sock.h
> @@ -137,7 +137,8 @@ struct inet_sock {
> hdrincl:1,
> mc_loop:1,
> transparent:1,
> - mc_all:1;
> + mc_all:1,
> + nodefrag:1;
> int mc_index;
> __be32 mc_addr;
> struct ip_mc_socklist *mc_list;
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 551ce56..84d2c8e 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -355,6 +355,8 @@ lookup_protocol:
> inet = inet_sk(sk);
> inet->is_icsk = (INET_PROTOSW_ICSK & answer_flags) != 0;
>
> + inet->nodefrag = 0;
> +
Hmm... what about cloning ?
> if (SOCK_RAW == sock->type) {
> inet->inet_num = protocol;
> if (IPPROTO_RAW == protocol)
> diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
> index ce23178..d8196e1 100644
> --- a/net/ipv4/ip_sockglue.c
> +++ b/net/ipv4/ip_sockglue.c
> @@ -449,7 +449,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
> (1<<IP_MTU_DISCOVER) | (1<<IP_RECVERR) |
> (1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) |
> (1<<IP_PASSSEC) | (1<<IP_TRANSPARENT) |
> - (1<<IP_MINTTL))) ||
> + (1<<IP_MINTTL) | (1<<IP_NODEFRAG))) ||
> optname == IP_MULTICAST_TTL ||
> optname == IP_MULTICAST_ALL ||
> optname == IP_MULTICAST_LOOP ||
> @@ -572,6 +572,13 @@ static int do_ip_setsockopt(struct sock *sk, int level,
> }
> inet->hdrincl = val ? 1 : 0;
> break;
> + case IP_NODEFRAG:
> + if (sk->sk_type != SOCK_RAW) {
> + err = -ENOPROTOOPT;
> + break;
> + }
> + inet->nodefrag = val ? 1 : 0;
> + break;
> case IP_MTU_DISCOVER:
> if (val < IP_PMTUDISC_DONT || val > IP_PMTUDISC_PROBE)
> goto e_inval;
> diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
> index cb763ae..eab8de3 100644
> --- a/net/ipv4/netfilter/nf_defrag_ipv4.c
> +++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
> @@ -66,6 +66,11 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
> const struct net_device *out,
> int (*okfn)(struct sk_buff *))
> {
> + struct inet_sock *inet = inet_sk(skb->sk);
> +
> + if (inet && inet->nodefrag)
> + return NF_ACCEPT;
> +
> #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
> #if !defined(CONFIG_NF_NAT) && !defined(CONFIG_NF_NAT_MODULE)
> /* Previously seen (loopback)? Ignore. Do this before
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH] net: IP_NODEFRAG option for IPv4 socket
2010-06-15 7:13 ` Eric Dumazet
@ 2010-06-15 9:18 ` Jiri Olsa
2010-06-15 9:49 ` Eric Dumazet
0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2010-06-15 9:18 UTC (permalink / raw)
To: Eric Dumazet
Cc: Jan Engelhardt, Patrick McHardy, netdev,
Netfilter Developer Mailing List
On Tue, Jun 15, 2010 at 09:13:49AM +0200, Eric Dumazet wrote:
> On Tuesday, 15 June 2010 at 08:53 +0200, Jiri Olsa wrote:
> > hi,
> >
> > I prepared a patch implementing the IP_NODEFRAG option for IPv4 sockets.
> > The reason is that there is currently no other way to send out a packet
> > whose fragmentation-related header fields are set by the user.
> >
>
> Obviously, you need to update documentation and man pages as well.
>
> MAN-PAGES: MANUAL PAGES FOR LINUX -- Sections 2, 3, 4, 5, and 7
> M: Michael Kerrisk <mtk.manpages@gmail.com>
> W: http://www.kernel.org/doc/man-pages
> L: linux-man@vger.kernel.org
> S: Maintained
hi,
I updated the man page and will send it in a separate post.
As for the in-tree documentation, did you have a specific document in mind?
I haven't found any part that covers the setsockopt options.
>
>
> > wbr,
> > jirka
> >
> >
> > Signed-off-by: Jiri Olsa <jolsa@redhat.com>
> > ---
> > diff --git a/include/linux/in.h b/include/linux/in.h
> > index 583c76f..41d88a4 100644
> > --- a/include/linux/in.h
> > +++ b/include/linux/in.h
> > @@ -85,6 +85,7 @@ struct in_addr {
> > #define IP_RECVORIGDSTADDR IP_ORIGDSTADDR
> >
> > #define IP_MINTTL 21
> > +#define IP_NODEFRAG 22
> >
> > /* IP_MTU_DISCOVER values */
> > #define IP_PMTUDISC_DONT 0 /* Never send DF frames */
> > diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
> > index 1653de5..1989cfd 100644
> > --- a/include/net/inet_sock.h
> > +++ b/include/net/inet_sock.h
> > @@ -137,7 +137,8 @@ struct inet_sock {
> > hdrincl:1,
> > mc_loop:1,
> > transparent:1,
> > - mc_all:1;
> > + mc_all:1,
> > + nodefrag:1;
> > int mc_index;
> > __be32 mc_addr;
> > struct ip_mc_socklist *mc_list;
> > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> > index 551ce56..84d2c8e 100644
> > --- a/net/ipv4/af_inet.c
> > +++ b/net/ipv4/af_inet.c
> > @@ -355,6 +355,8 @@ lookup_protocol:
> > inet = inet_sk(sk);
> > inet->is_icsk = (INET_PROTOSW_ICSK & answer_flags) != 0;
> >
> > + inet->nodefrag = 0;
> > +
>
> Hmm... what about cloning ?
I think that since this is a property of the socket (not of the skb),
cloning does not affect it.
thanks,
jirka
>
> > if (SOCK_RAW == sock->type) {
> > inet->inet_num = protocol;
> > if (IPPROTO_RAW == protocol)
> > diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
> > index ce23178..d8196e1 100644
> > --- a/net/ipv4/ip_sockglue.c
> > +++ b/net/ipv4/ip_sockglue.c
> > @@ -449,7 +449,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
> > (1<<IP_MTU_DISCOVER) | (1<<IP_RECVERR) |
> > (1<<IP_ROUTER_ALERT) | (1<<IP_FREEBIND) |
> > (1<<IP_PASSSEC) | (1<<IP_TRANSPARENT) |
> > - (1<<IP_MINTTL))) ||
> > + (1<<IP_MINTTL) | (1<<IP_NODEFRAG))) ||
> > optname == IP_MULTICAST_TTL ||
> > optname == IP_MULTICAST_ALL ||
> > optname == IP_MULTICAST_LOOP ||
> > @@ -572,6 +572,13 @@ static int do_ip_setsockopt(struct sock *sk, int level,
> > }
> > inet->hdrincl = val ? 1 : 0;
> > break;
> > + case IP_NODEFRAG:
> > + if (sk->sk_type != SOCK_RAW) {
> > + err = -ENOPROTOOPT;
> > + break;
> > + }
> > + inet->nodefrag = val ? 1 : 0;
> > + break;
> > case IP_MTU_DISCOVER:
> > if (val < IP_PMTUDISC_DONT || val > IP_PMTUDISC_PROBE)
> > goto e_inval;
> > diff --git a/net/ipv4/netfilter/nf_defrag_ipv4.c b/net/ipv4/netfilter/nf_defrag_ipv4.c
> > index cb763ae..eab8de3 100644
> > --- a/net/ipv4/netfilter/nf_defrag_ipv4.c
> > +++ b/net/ipv4/netfilter/nf_defrag_ipv4.c
> > @@ -66,6 +66,11 @@ static unsigned int ipv4_conntrack_defrag(unsigned int hooknum,
> > const struct net_device *out,
> > int (*okfn)(struct sk_buff *))
> > {
> > + struct inet_sock *inet = inet_sk(skb->sk);
> > +
> > + if (inet && inet->nodefrag)
> > + return NF_ACCEPT;
> > +
> > #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
> > #if !defined(CONFIG_NF_NAT) && !defined(CONFIG_NF_NAT_MODULE)
> > /* Previously seen (loopback)? Ignore. Do this before
>
>
* Re: [PATCH] net: IP_NODEFRAG option for IPv4 socket
2010-06-15 9:18 ` Jiri Olsa
@ 2010-06-15 9:49 ` Eric Dumazet
0 siblings, 0 replies; 19+ messages in thread
From: Eric Dumazet @ 2010-06-15 9:49 UTC (permalink / raw)
To: Jiri Olsa
Cc: Jan Engelhardt, Patrick McHardy, netdev,
Netfilter Developer Mailing List
On Tuesday, 15 June 2010 at 11:18 +0200, Jiri Olsa wrote:
> > Hmm... what about cloning ?
>
> I think that since this is a property of the socket (not of the skb),
> cloning does not affect it.
>
Sorry, I was thinking of sk_clone(). I sometimes forget that sock_copy()
copies all fields.
This is a non-issue for raw sockets, as long as IP_NODEFRAG stays limited
to RAW sockets now and in the future.
end of thread
Thread overview: 19+ messages
2010-06-04 11:27 no reassembly for outgoing packets on RAW socket Jiri Olsa
2010-06-04 12:03 ` Patrick McHardy
2010-06-07 14:55 ` Jiri Olsa
2010-06-09 14:16 ` Patrick McHardy
2010-06-09 15:15 ` Jan Engelhardt
2010-06-09 15:16 ` Patrick McHardy
2010-06-09 15:20 ` Jan Engelhardt
2010-06-10 6:57 ` Jiri Olsa
2010-06-10 6:56 ` Jiri Olsa
2010-06-10 9:14 ` Patrick McHardy
2010-06-10 9:53 ` Jiri Olsa
2010-06-10 10:04 ` Patrick McHardy
2010-06-11 8:16 ` Jiri Olsa
2010-06-11 9:53 ` Jan Engelhardt
2010-06-11 13:10 ` Jiri Olsa
2010-06-15 6:53 ` [PATCH] net: IP_NODEFRAG option for IPv4 socket Jiri Olsa
2010-06-15 7:13 ` Eric Dumazet
2010-06-15 9:18 ` Jiri Olsa
2010-06-15 9:49 ` Eric Dumazet