netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
@ 2016-05-23 14:23 Alan Davey
  2016-05-31 18:39 ` David Miller
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Davey @ 2016-05-23 14:23 UTC (permalink / raw)
  To: davem; +Cc: netdev, kuznet, jmorris, yoshfuji, kaber, Alan Davey

One of the bugs documented in the raw(7) man page is as follows: When the
IP_HDRINCL option is set, datagrams will not be fragmented and are limited to
the interface MTU.

This patch fixes the bug by removing the check for "length > rt->dst.dev->mtu"
in raw_send_hdrinc() (net/ipv4/raw.c).  Datagrams are no longer limited to the
interface MTU size if the IP_HDRINCL option is set, but are fragmented, if
necessary, in the same way as all other datagrams.

Signed-off-by: Alan Davey <alan.davey@metaswitch.com>
---
 net/ipv4/raw.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 8d22de7..de690b3 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -351,11 +351,6 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4,
 	struct rtable *rt = *rtp;
 	int hlen, tlen;
 
-	if (length > rt->dst.dev->mtu) {
-		ip_local_error(sk, EMSGSIZE, fl4->daddr, inet->inet_dport,
-			       rt->dst.dev->mtu);
-		return -EMSGSIZE;
-	}
 	if (flags&MSG_PROBE)
 		goto out;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-05-23 14:23 [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set Alan Davey
@ 2016-05-31 18:39 ` David Miller
  2016-06-08  8:41   ` Alan Davey
  2016-06-15 13:03   ` Hannes Frederic Sowa
  0 siblings, 2 replies; 14+ messages in thread
From: David Miller @ 2016-05-31 18:39 UTC (permalink / raw)
  To: alan.davey; +Cc: netdev, kuznet, jmorris, yoshfuji, kaber

From: Alan Davey <alan.davey@metaswitch.com>
Date: Mon, 23 May 2016 15:23:45 +0100

> One of the bugs documented in the raw(7) man page is as follows: When the
> IP_HDRINCL option is set, datagrams will not be fragmented and are limited to
> the interface MTU.
> 
> This patch fixes the bug by removing the check for "length > rt->dst.dev->mtu"
> in raw_send_hdrinc() (net/ipv4/raw.c).  Datagrams are no longer limited to the
> interface MTU size if the IP_HDRINCL option is set, but are fragmented, if
> necessary, in the same way as all other datagrams.
> 
> Signed-off-by: Alan Davey <alan.davey@metaswitch.com>

This is not a bug, it's a feature and it's how RAW ipv4 sockets have behaved
for two decades.

If the user wants to use hdr inclusion, he can send multiple frames and set
the fragmentation bits appropriately.

I'm not applying this patch.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-05-31 18:39 ` David Miller
@ 2016-06-08  8:41   ` Alan Davey
  2016-06-08  9:33     ` YOSHIFUJI Hideaki
                       ` (2 more replies)
  2016-06-15 13:03   ` Hannes Frederic Sowa
  1 sibling, 3 replies; 14+ messages in thread
From: Alan Davey @ 2016-06-08  8:41 UTC (permalink / raw)
  To: David Miller
  Cc: netdev@vger.kernel.org, kuznet@ms2.inr.ac.ru, jmorris@namei.org,
	yoshfuji@linux-ipv6.org, kaber@trash.net

Hi David

Thank you for your email.  I understand raw IP sockets have worked this way for a long time, but I think that there is real benefit in this patch for little risk.  Please take a look at the following and let me know what you think.

-  The current behaviour is counter-intuitive (fragmentation takes place in all other cases) and therefore different to what everyone expects.

-  Consequently, everyone has to fix the same bug and work around it by fragmenting in their application (we have seen this happen several dozen times just in our experience).

-  The end result is that the fragmentation code ends up being implemented in many places, instead of just once, using the existing kernel code.

-  The patch is a low risk fix; removing 5 lines of code and using existing code to perform the fragmentation.  It should be back-compatible because
    o  existing code written to work round the feature will continue to work
    o  it seems very unlikely that anyone relies on the current behaviour of oversized packets being rejected, and would not prefer the new behavior.  

Therefore, whether it is a bug or a feature, I think there is value in fixing the behaviour.

Regards
Alan

-----Original Message-----
From: David Miller [mailto:davem@davemloft.net] 
Sent: 31 May 2016 19:39
To: Alan Davey <Alan.Davey@metaswitch.com>
Cc: netdev@vger.kernel.org; kuznet@ms2.inr.ac.ru; jmorris@namei.org; yoshfuji@linux-ipv6.org; kaber@trash.net
Subject: Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.

From: Alan Davey <alan.davey@metaswitch.com>
Date: Mon, 23 May 2016 15:23:45 +0100

> One of the bugs documented in the raw(7) man page is as follows: When 
> the IP_HDRINCL option is set, datagrams will not be fragmented and are 
> limited to the interface MTU.
> 
> This patch fixes the bug by removing the check for "length > rt->dst.dev->mtu"
> in raw_send_hdrinc() (net/ipv4/raw.c).  Datagrams are no longer 
> limited to the interface MTU size if the IP_HDRINCL option is set, but 
> are fragmented, if necessary, in the same way as all other datagrams.
> 
> Signed-off-by: Alan Davey <alan.davey@metaswitch.com>

This is not a bug, it's a feature and it's how RAW ipv4 sockets have behaved for two decades.

If the user wants to use hdr inclusion, he can send multiple frames and set the fragmentation bits appropriately.

I'm not applying this patch.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-06-08  8:41   ` Alan Davey
@ 2016-06-08  9:33     ` YOSHIFUJI Hideaki
  2016-06-08  9:39     ` Paul Jakma
  2016-06-08 17:25     ` David Miller
  2 siblings, 0 replies; 14+ messages in thread
From: YOSHIFUJI Hideaki @ 2016-06-08  9:33 UTC (permalink / raw)
  To: Alan Davey, David Miller
  Cc: hideaki.yoshifuji, netdev@vger.kernel.org, kuznet@ms2.inr.ac.ru,
	jmorris@namei.org, yoshfuji@linux-ipv6.org, kaber@trash.net

Hi

Please do not use "top-posting" here.

Alan Davey wrote:
> Hi David
> 
> Thank you for your email.  I understand raw IP sockets have worked this way for a long time, but I think that there is real benefit in this patch for little risk.  Please take a look at the following and let me know what you think.
> 
> -  The current behaviour is counter-intuitive (fragmentation takes place in all other cases) and therefore different to what everyone expects.

I do NEVER expect fragmentation occurs with IP_HDRINCL, at least.

-- 
Hideaki Yoshifuji <hideaki.yoshifuji@miraclelinux.com>
Technical Division, MIRACLE LINUX CORPORATION

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-06-08  8:41   ` Alan Davey
  2016-06-08  9:33     ` YOSHIFUJI Hideaki
@ 2016-06-08  9:39     ` Paul Jakma
  2016-06-08 17:25     ` David Miller
  2 siblings, 0 replies; 14+ messages in thread
From: Paul Jakma @ 2016-06-08  9:39 UTC (permalink / raw)
  To: Alan Davey
  Cc: David Miller, netdev@vger.kernel.org, kuznet@ms2.inr.ac.ru,
	jmorris@namei.org, yoshfuji@linux-ipv6.org, kaber@trash.net

Hi,

We have to re-create IPv4 fragmentation in user-space in ospfd in Quagga 
cause of this, on raw IPv4/OSPF sockets. It'd be really nice to be able 
to deprecate that code and just have existing kernel code do that for 
us.

It should only happen if the app indicates it though, e.g. via a 
sockopt, to avoid compatibility issues.

regards,

Paul

On Wed, 8 Jun 2016, Alan Davey wrote:

> -  Consequently, everyone has to fix the same bug and work around it
>   by fragmenting in their application (we have seen this happen
>   several dozen times just in our experience).
>
> -  The end result is that the fragmentation code ends up being
>   implemented in many places, instead of just once, using the existing
>   kernel code.
>
> -  The patch is a low risk fix; removing 5 lines of code and using existing code to perform the fragmentation.  It should be back-compatible because

>    o  existing code written to work round the feature will continue to work
>    o  it seems very unlikely that anyone relies on the current behaviour of oversized packets being rejected, and would not prefer the new behavior.
>
> Therefore, whether it is a bug or a feature, I think there is value in 
> fixing the behaviour.

> Regards
> Alan
>
> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: 31 May 2016 19:39
> To: Alan Davey <Alan.Davey@metaswitch.com>
> Cc: netdev@vger.kernel.org; kuznet@ms2.inr.ac.ru; jmorris@namei.org; yoshfuji@linux-ipv6.org; kaber@trash.net
> Subject: Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
>
> From: Alan Davey <alan.davey@metaswitch.com>
> Date: Mon, 23 May 2016 15:23:45 +0100
>
>> One of the bugs documented in the raw(7) man page is as follows: When
>> the IP_HDRINCL option is set, datagrams will not be fragmented and are
>> limited to the interface MTU.
>>
>> This patch fixes the bug by removing the check for "length > rt->dst.dev->mtu"
>> in raw_send_hdrinc() (net/ipv4/raw.c).  Datagrams are no longer
>> limited to the interface MTU size if the IP_HDRINCL option is set, but
>> are fragmented, if necessary, in the same way as all other datagrams.
>>
>> Signed-off-by: Alan Davey <alan.davey@metaswitch.com>
>
> This is not a bug, it's a feature and it's how RAW ipv4 sockets have behaved for two decades.
>
> If the user wants to use hdr inclusion, he can send multiple frames and set the fragmentation bits appropriately.
>
> I'm not applying this patch.
>

-- 
Paul Jakma | paul@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
I don't want to bore you, but there's nobody else around for me to bore.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-06-08  8:41   ` Alan Davey
  2016-06-08  9:33     ` YOSHIFUJI Hideaki
  2016-06-08  9:39     ` Paul Jakma
@ 2016-06-08 17:25     ` David Miller
  2016-06-15 10:41       ` Alan Davey
  2 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2016-06-08 17:25 UTC (permalink / raw)
  To: Alan.Davey; +Cc: netdev, kuznet, jmorris, yoshfuji, kaber

From: Alan Davey <Alan.Davey@metaswitch.com>
Date: Wed, 8 Jun 2016 08:41:31 +0000

> -  The current behaviour is counter-intuitive (fragmentation takes
> -  place in all other cases) and therefore different to what
> -  everyone expects.

But it's what all existing applications must expect, and as you have
seen in these replied they absolutely do.

You cannot just break things on people like this.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-06-08 17:25     ` David Miller
@ 2016-06-15 10:41       ` Alan Davey
  2016-07-08 12:55         ` Paul Jakma
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Davey @ 2016-06-15 10:41 UTC (permalink / raw)
  To: David Miller
  Cc: netdev@vger.kernel.org, kuznet@ms2.inr.ac.ru, jmorris@namei.org,
	yoshfuji@linux-ipv6.org, kaber@trash.net



From: David Miller [mailto:davem@davemloft.net] 
Sent: 08 June 2016 18:26

>> -  The current behaviour is counter-intuitive (fragmentation takes
>> -  place in all other cases) and therefore different to what
>> -  everyone expects.
>
> But it's what all existing applications must expect, and as you have seen in these replied they absolutely do.
>
> You cannot just break things on people like this.

The only case that would break is that where an application relies on the existing (documented as a bug) feature of getting an EMSGSIZE return code in the case of an over-sized packet.  Applications that perform their own fragmentation would be unaffected.

I think that the benefit of the patch, in moving all fragmentation and reassembly into the kernel, outweigh the very small chance that applications rely on the send of an over-sized packet failing.

What is your thinking on taking the patch?

Regards
Alan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-05-31 18:39 ` David Miller
  2016-06-08  8:41   ` Alan Davey
@ 2016-06-15 13:03   ` Hannes Frederic Sowa
  1 sibling, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2016-06-15 13:03 UTC (permalink / raw)
  To: David Miller, alan.davey; +Cc: netdev, kuznet, jmorris, yoshfuji, kaber

On 31.05.2016 20:39, David Miller wrote:
> From: Alan Davey <alan.davey@metaswitch.com>
> Date: Mon, 23 May 2016 15:23:45 +0100
> 
>> One of the bugs documented in the raw(7) man page is as follows: When the
>> IP_HDRINCL option is set, datagrams will not be fragmented and are limited to
>> the interface MTU.
>>
>> This patch fixes the bug by removing the check for "length > rt->dst.dev->mtu"
>> in raw_send_hdrinc() (net/ipv4/raw.c).  Datagrams are no longer limited to the
>> interface MTU size if the IP_HDRINCL option is set, but are fragmented, if
>> necessary, in the same way as all other datagrams.
>>
>> Signed-off-by: Alan Davey <alan.davey@metaswitch.com>
> 
> This is not a bug, it's a feature and it's how RAW ipv4 sockets have behaved
> for two decades.
> 
> If the user wants to use hdr inclusion, he can send multiple frames and set
> the fragmentation bits appropriately.

Actually this is difficult, as only the kernel is in control of the
IP_ID generator and it should never leak its state to user space.

The probability of IP_ID collisions would increase drastically because
of this, even imagine different programs doing so concurrently.

Basically those checks are pretty much unnecessary anyway, if we set raw
sockets by default into inet->pmtudisc IPPMTUDISC_DO mode, so a user can
easily control the fragmentation on a raw socket themselves based on
already provided infrastructure.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-06-15 10:41       ` Alan Davey
@ 2016-07-08 12:55         ` Paul Jakma
  2016-07-08 22:41           ` David Miller
  0 siblings, 1 reply; 14+ messages in thread
From: Paul Jakma @ 2016-07-08 12:55 UTC (permalink / raw)
  To: Alan Davey
  Cc: David Miller, netdev@vger.kernel.org, kuznet@ms2.inr.ac.ru,
	jmorris@namei.org, yoshfuji@linux-ipv6.org, kaber@trash.net

On Wed, 15 Jun 2016, Alan Davey wrote:

> The only case that would break is that where an application relies on 
> the existing (documented as a bug) feature of getting an EMSGSIZE 
> return code in the case of an over-sized packet.  Applications that 
> perform their own fragmentation would be unaffected.

If this doesn't break existing applications that are doing fragmentation 
in userspace on raw sockets (e.g. Quagga ospfd), that's better.

As per previous email, I'd love to be able to get rid of that code and 
have the kernel do it for me. However, I also don't want to have to do 
anything other non-trivial to that code either. :)

The issue for us is, how would we know on any given host whether the 
kernel will do the fragmentation or whether ospfd has to do it? We need 
to be able to probe for that capability, surely?

I guess "send an oversized packet and see if we get EMSGSIZE comes back 
or not" could be one way, though the 'man 7 raw' man page ony my system 
says that is returned on raw socket if the /IP/ max size (64 KiB) is 
exceeded - it doesn't say anything about that being returned if output 
MTU is exceeded. Also, that implies a failed packet send on startup on 
older kernels and having to rejiggle the packet we've constructed to 
retry.

Some kind of "I want kernel to do the fragmentation" sockopt, that was 
guaranteed to error with ENOPROTOOPT on non-implementing kernels, that 
we could try set when creating the socket, would seem simpler/nicer from 
my POV of my corner of user-space.

?

regards,
-- 
Paul Jakma | paul@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
It would be nice to be sure of anything the way some people are of everything.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-07-08 12:55         ` Paul Jakma
@ 2016-07-08 22:41           ` David Miller
  2016-07-08 23:21             ` Alexey Kuznetsov
  2016-07-12 12:34             ` Alan Davey
  0 siblings, 2 replies; 14+ messages in thread
From: David Miller @ 2016-07-08 22:41 UTC (permalink / raw)
  To: paul; +Cc: Alan.Davey, netdev, kuznet, jmorris, yoshfuji, kaber

From: Paul Jakma <paul@jakma.org>
Date: Fri, 8 Jul 2016 13:55:11 +0100 (BST)

> On Wed, 15 Jun 2016, Alan Davey wrote:
> 
>> The only case that would break is that where an application relies on
>> the existing (documented as a bug) feature of getting an EMSGSIZE
>> return code in the case of an over-sized packet.  Applications that
>> perform their own fragmentation would be unaffected.
> 
> If this doesn't break existing applications that are doing
> fragmentation in userspace on raw sockets (e.g. Quagga ospfd), that's
> better.
> 
> As per previous email, I'd love to be able to get rid of that code and
> have the kernel do it for me. However, I also don't want to have to do
> anything other non-trivial to that code either. :)
> 
> The issue for us is, how would we know on any given host whether the
> kernel will do the fragmentation or whether ospfd has to do it? We
> need to be able to probe for that capability, surely?

The fact is, regardless of whether you could probe for the capability
or not, you have to keep the fragmentation code around forever.

And that is yet another reason I do not want to add this change at all.

It doesn't make any existing server any simpler, in fact it makes them
all more complicated because not only do they keep the fragmentation
code, they also get new logic to test for the feature that would allow
them to avoid using it.

Sorry, there is no way I am adding this, it's a net lose.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-07-08 22:41           ` David Miller
@ 2016-07-08 23:21             ` Alexey Kuznetsov
  2016-07-12 12:34             ` Alan Davey
  1 sibling, 0 replies; 14+ messages in thread
From: Alexey Kuznetsov @ 2016-07-08 23:21 UTC (permalink / raw)
  To: netdev

Hello!

I can tell why it has not been done initially.

Main problem was in IP options, which can be present in raw packet.
They have to be properly fragmented, some options are to be deleted
on fragments. Not that it is too complicated, it is just boring and ugly
and inconsistent with IP_HDRINCL logic.

So, it was done in logically consistent way: did you order IP_HDRINCL?
Please, then deal with fragments yourself.

Alexey

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-07-08 22:41           ` David Miller
  2016-07-08 23:21             ` Alexey Kuznetsov
@ 2016-07-12 12:34             ` Alan Davey
  2016-07-12 18:11               ` David Miller
  1 sibling, 1 reply; 14+ messages in thread
From: Alan Davey @ 2016-07-12 12:34 UTC (permalink / raw)
  To: David Miller, paul@jakma.org
  Cc: netdev@vger.kernel.org, kuznet@ms2.inr.ac.ru, jmorris@namei.org,
	yoshfuji@linux-ipv6.org, kaber@trash.net

>> On Wed, 15 Jun 2016, Alan Davey wrote:
>> 
>>> The only case that would break is that where an application relies on 
>>> the existing (documented as a bug) feature of getting an EMSGSIZE 
>>> return code in the case of an over-sized packet.  Applications that 
>>> perform their own fragmentation would be unaffected.
>> 
>> If this doesn't break existing applications that are doing 
>> fragmentation in userspace on raw sockets (e.g. Quagga ospfd), that's 
>> better.
>> 
>> As per previous email, I'd love to be able to get rid of that code and 
>> have the kernel do it for me. However, I also don't want to have to do 
>> anything other non-trivial to that code either. :)
>> 
>> The issue for us is, how would we know on any given host whether the 
>> kernel will do the fragmentation or whether ospfd has to do it? We 
>> need to be able to probe for that capability, surely?
>
>The fact is, regardless of whether you could probe for the capability or not, you have to keep the fragmentation code around forever.
>
>And that is yet another reason I do not want to add this change at all.
>
>It doesn't make any existing server any simpler, in fact it makes them all more complicated because not only do they keep the fragmentation code, they also >get new logic to test for the feature that would allow them to avoid using it.
>
>Sorry, there is no way I am adding this, it's a net lose.

David,

I accept you don't want to take this patch.  However, I don't yet understand why, so I have a few more questions (which will hopefully help me produce patches that are likely to be accepted in future).

Adding the patch means that some existing applications will continue to contain their own fragmentation code, which becomes unnecessary.  This is OK; those applications will continue to fragment packets, and will continue to work with an updated kernel.  Do you agree?

Not adding the patch means that 
-  all future applications have to continue to implement their own fragmentation code, duplicating that which already exists in the kernel
-  hardware vendors may choose to apply the patch themselves (they do not want to implement function that already exists), but would surely prefer to have it in the standard kernel.

So I still don't understand what part of taking the patch would have a negative result on existing applications, it should be neutral to them.  Could you help me out here and explain?

Regards
Alan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-07-12 12:34             ` Alan Davey
@ 2016-07-12 18:11               ` David Miller
  2021-12-14 12:03                 ` Senthil Kumar Nagappan
  0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2016-07-12 18:11 UTC (permalink / raw)
  To: Alan.Davey; +Cc: paul, netdev, kuznet, jmorris, yoshfuji, kaber

From: Alan Davey <Alan.Davey@metaswitch.com>
Date: Tue, 12 Jul 2016 12:34:07 +0000

> -  all future applications have to continue to implement their own fragmentation code, duplicating that which already exists in the kernel

They have to do this anyways, don't you see this?

Otherwise they don't support %99 of the kernels out there.

Even if I put this change in now, it would take years for it to
propagate to even a moderate percentage of Linux machines out there.

And applications doing a "we only support kernels > version x.y.z" is
a situation I don't want to promote if you think that's a reasonable
way to handle this.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
  2016-07-12 18:11               ` David Miller
@ 2021-12-14 12:03                 ` Senthil Kumar Nagappan
  0 siblings, 0 replies; 14+ messages in thread
From: Senthil Kumar Nagappan @ 2021-12-14 12:03 UTC (permalink / raw)
  To: David Miller, Alan.Davey; +Cc: paul, netdev, kuznet, jmorris, yoshfuji, kaber


On 7/12/2016 11:41 PM, David Miller wrote:
> From: Alan Davey <Alan.Davey@metaswitch.com>
> Date: Tue, 12 Jul 2016 12:34:07 +0000
>
>> -  all future applications have to continue to implement their own fragmentation code, duplicating that which already exists in the kernel
> They have to do this anyways, don't you see this?
>
> Otherwise they don't support %99 of the kernels out there.
>
> Even if I put this change in now, it would take years for it to
> propagate to even a moderate percentage of Linux machines out there.
>
> And applications doing a "we only support kernels > version x.y.z" is
> a situation I don't want to promote if you think that's a reasonable
> way to handle this.
>
> Dave patch functionality is good to have in the kernel.
> The reason David says it not convincing,  since its there for decades and "we only support kernels > version x.y.z", As Paul suggested we could have a socket option to avoid compatibility issues.
>
> Dave patch has minimal changes, but i see some downsides to it and suggestion to do this differently.
> 1. We will not be able to support MSG_MORE if we take this path.
> 2. we will try to allocate one big skb based on the user data size and kmalloc might fail to allocate a big contiguous buffer, which is not better as compared to the path taken using ip_append_data where we will take care allocating skb's based on the mtu.
> 3. other suggestion is, may be we can take the ip_append_data path, some were down the line we can use the user provided header.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-12-14 12:03 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-23 14:23 [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set Alan Davey
2016-05-31 18:39 ` David Miller
2016-06-08  8:41   ` Alan Davey
2016-06-08  9:33     ` YOSHIFUJI Hideaki
2016-06-08  9:39     ` Paul Jakma
2016-06-08 17:25     ` David Miller
2016-06-15 10:41       ` Alan Davey
2016-07-08 12:55         ` Paul Jakma
2016-07-08 22:41           ` David Miller
2016-07-08 23:21             ` Alexey Kuznetsov
2016-07-12 12:34             ` Alan Davey
2016-07-12 18:11               ` David Miller
2021-12-14 12:03                 ` Senthil Kumar Nagappan
2016-06-15 13:03   ` Hannes Frederic Sowa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).