netdev.vger.kernel.org archive mirror
From: Paul Jakma <paul@jakma.org>
To: Alan Davey <Alan.Davey@metaswitch.com>
Cc: David Miller <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"kuznet@ms2.inr.ac.ru" <kuznet@ms2.inr.ac.ru>,
	"jmorris@namei.org" <jmorris@namei.org>,
	"yoshfuji@linux-ipv6.org" <yoshfuji@linux-ipv6.org>,
	"kaber@trash.net" <kaber@trash.net>
Subject: RE: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
Date: Wed, 8 Jun 2016 10:39:38 +0100 (BST)
Message-ID: <alpine.LFD.2.20.1606081036350.11583@stoner.jakma.org>
In-Reply-To: <BN3PR0201MB1059316BEC3EC7F6CBF2F0E4F95E0@BN3PR0201MB1059.namprd02.prod.outlook.com>

Hi,

We have to re-create IPv4 fragmentation in user-space in ospfd in Quagga
because of this, on raw IPv4/OSPF sockets. It'd be really nice to be able
to deprecate that code and just have the existing kernel code do that for
us.

It should only happen if the app asks for it, though, e.g. via a
sockopt, to avoid compatibility issues.

regards,

Paul

On Wed, 8 Jun 2016, Alan Davey wrote:

> -  Consequently, everyone has to fix the same bug and work around it
>   by fragmenting in their application (we have seen this happen
>   several dozen times just in our experience).
>
> -  The end result is that the fragmentation code ends up being
>   implemented in many places, instead of just once, using the existing
>   kernel code.
>
> -  The patch is a low-risk fix: it removes 5 lines of code and uses
>   existing code to perform the fragmentation.  It should be
>   backward-compatible because
>
>    o  existing code written to work round the feature will continue
>       to work
>    o  it seems very unlikely that anyone relies on the current
>       behaviour of oversized packets being rejected, and would not
>       prefer the new behaviour.
>
> Therefore, whether it is a bug or a feature, I think there is value in 
> fixing the behaviour.
>
> Regards
> Alan
>
> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: 31 May 2016 19:39
> To: Alan Davey <Alan.Davey@metaswitch.com>
> Cc: netdev@vger.kernel.org; kuznet@ms2.inr.ac.ru; jmorris@namei.org; yoshfuji@linux-ipv6.org; kaber@trash.net
> Subject: Re: [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set.
>
> From: Alan Davey <alan.davey@metaswitch.com>
> Date: Mon, 23 May 2016 15:23:45 +0100
>
>> One of the bugs documented in the raw(7) man page is as follows: When
>> the IP_HDRINCL option is set, datagrams will not be fragmented and are
>> limited to the interface MTU.
>>
>> This patch fixes the bug by removing the check for "length > rt->dst.dev->mtu"
>> in raw_send_hdrinc() (net/ipv4/raw.c).  Datagrams are no longer
>> limited to the interface MTU size if the IP_HDRINCL option is set, but
>> are fragmented, if necessary, in the same way as all other datagrams.
>>
>> Signed-off-by: Alan Davey <alan.davey@metaswitch.com>
>
> This is not a bug, it's a feature and it's how RAW ipv4 sockets have behaved for two decades.
>
> If the user wants to use hdr inclusion, he can send multiple frames and set the fragmentation bits appropriately.
>
> I'm not applying this patch.
>
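
A minimal sketch, not taken from Quagga or from the patch, of the kind of
user-space fragmentation discussed above. It splits one IPv4 datagram into
fragments that fit an MTU and sets the MF bit and the fragment offset (in
8-byte units, per RFC 791) in each copy of the header. It assumes a 20-byte
header without options and a non-empty payload; the names `fragment` and
`checksum` are hypothetical. Linux fills in the total length and header
checksum itself on IP_HDRINCL sockets, so setting them here mainly helps
when inspecting the result offline.

```python
import struct
from typing import List

IP_MF = 0x2000  # "more fragments" bit in the 16-bit flags/offset field


def checksum(data: bytes) -> int:
    """Internet checksum (one's-complement sum of 16-bit words)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def fragment(header: bytes, payload: bytes, mtu: int) -> List[bytes]:
    """Split one IPv4 datagram into fragments no larger than mtu bytes."""
    if len(header) != 20:
        raise ValueError("sketch assumes a 20-byte header without options")
    # Every fragment except the last must carry a multiple of 8 bytes.
    chunk = (mtu - len(header)) // 8 * 8
    if chunk <= 0:
        raise ValueError("MTU too small to carry any payload")
    frags = []
    for off in range(0, len(payload), chunk):
        part = payload[off:off + chunk]
        more = IP_MF if off + chunk < len(payload) else 0
        hdr = bytearray(header)
        struct.pack_into("!H", hdr, 2, len(hdr) + len(part))  # total length
        struct.pack_into("!H", hdr, 6, more | (off // 8))     # flags + offset
        struct.pack_into("!H", hdr, 10, 0)                    # zero, then
        struct.pack_into("!H", hdr, 10, checksum(bytes(hdr))) # recompute
        frags.append(bytes(hdr) + part)
    return frags
```

For a 3000-byte payload and an MTU of 1500, this yields three fragments
carrying 1480, 1480 and 40 payload bytes, at offsets 0, 185 and 370 (in
8-byte units), with MF set on the first two.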

-- 
Paul Jakma | paul@jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
I don't want to bore you, but there's nobody else around for me to bore.

Thread overview: 14+ messages
2016-05-23 14:23 [PATCH] net: Fragment large datagrams even when IP_HDRINCL is set Alan Davey
2016-05-31 18:39 ` David Miller
2016-06-08  8:41   ` Alan Davey
2016-06-08  9:33     ` YOSHIFUJI Hideaki
2016-06-08  9:39     ` Paul Jakma [this message]
2016-06-08 17:25     ` David Miller
2016-06-15 10:41       ` Alan Davey
2016-07-08 12:55         ` Paul Jakma
2016-07-08 22:41           ` David Miller
2016-07-08 23:21             ` Alexey Kuznetsov
2016-07-12 12:34             ` Alan Davey
2016-07-12 18:11               ` David Miller
2021-12-14 12:03                 ` Senthil Kumar Nagappan
2016-06-15 13:03   ` Hannes Frederic Sowa
