netdev.vger.kernel.org archive mirror
* [PATCH resend] net: sock: Add option for memory optimized hints.
@ 2016-06-17 13:58 peter enderborg
  2016-06-17 14:14 ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: peter enderborg @ 2016-06-17 13:58 UTC
  To: open list:PTP HARDWARE CLOCK SUPPORT

From: Peter Enderborg <peter.enderborg@sonymobile.com>

When sending data, the socket allocates memory for the
payload from a slab cache or from the page allocator. A page
allocation might then trigger compaction, which can take a long time.
This can be avoided by sending smaller chunks, but
userspace cannot know the right size for
the smaller sends. For this we add an SO_SIZEHINT
getsockopt option, from which userspace can get the send size
that will fit into one page (order 0) or
the maximum for a slab cache allocation.

Signed-off-by: Peter Enderborg <peter.enderborg@sonymobile.com>
---
  include/uapi/asm-generic/socket.h |  2 ++
  include/uapi/linux/socket.h       |  9 +++++++++
  net/core/sock.c                   | 17 +++++++++++++++++
  3 files changed, 28 insertions(+)

diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 67d632f..f6a4921 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -92,4 +92,6 @@

  #define SO_CNX_ADVICE          53

+#define SO_SIZEHINT            54
+
  #endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/include/uapi/linux/socket.h b/include/uapi/linux/socket.h
index 76ab0c6..16db7e8 100644
--- a/include/uapi/linux/socket.h
+++ b/include/uapi/linux/socket.h
@@ -1,6 +1,8 @@
  #ifndef _UAPI_LINUX_SOCKET_H
  #define _UAPI_LINUX_SOCKET_H

+#include <linux/types.h>
+
  /*
   * Desired design of maximum size and alignment (see RFC2553)
   */
@@ -18,4 +20,11 @@ struct __kernel_sockaddr_storage {
                                 /* _SS_MAXSIZE value minus size of ss_family */
  } __attribute__ ((aligned(_K_SS_ALIGNSIZE)));  /* force desired alignment */

+struct sock_sizehint {
+       __u32   order_zero_size;
+               /* max payload size that can fit into one page in kernel */
+       __u32   cache_size;
+               /* max payload size that can fit in socket slab cache */
+};
+
  #endif /* _UAPI_LINUX_SOCKET_H */
diff --git a/net/core/sock.c b/net/core/sock.c
index 7e73c26..4c9cd92 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1254,6 +1254,23 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
                 v.val = sk->sk_incoming_cpu;
                 break;

+       case SO_SIZEHINT:
+       {
+               struct sock_sizehint hint;
+
+               if (len > sizeof(hint))
+                       len = sizeof(hint);
+
+               hint.order_zero_size = PAGE_SIZE -
+                       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+               hint.cache_size =  KMALLOC_MAX_CACHE_SIZE -
+                       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+
+               if (copy_to_user(optval, &hint, len))
+                       return -EFAULT;
+               goto lenout;
+       }
+
         default:
                 /* We implement the SO_SNDLOWAT etc to not be settable
                  * (1003.1g 7).
-- 
2.4.2
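
The commit message above says userspace has to change its send pattern to benefit from the hint. A minimal sketch of what that could look like follows. It assumes the option number (54) and the struct layout from this patch, neither of which is in any released kernel header, so the names `SO_SIZEHINT_PROPOSED`, `struct sizehint_proposed`, `pick_chunk_size` and `send_chunked` are hypothetical. On a kernel without the patch the getsockopt call is expected to fail, and the code falls back to a caller-supplied size.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Assumption: value and layout copied from the proposed patch; they are
 * not defined by any released kernel header. */
#define SO_SIZEHINT_PROPOSED 54

struct sizehint_proposed {
	uint32_t order_zero_size; /* max payload fitting one order-0 page */
	uint32_t cache_size;      /* max payload fitting the slab cache */
};

/* Returns the kernel's order-0 hint when the option is supported and
 * looks sane, otherwise the caller's fallback. */
size_t pick_chunk_size(int fd, size_t fallback)
{
	struct sizehint_proposed hint;
	socklen_t len = sizeof(hint);

	if (getsockopt(fd, SOL_SOCKET, SO_SIZEHINT_PROPOSED, &hint, &len) == 0 &&
	    len == sizeof(hint) && hint.order_zero_size > 0)
		return hint.order_zero_size;
	return fallback;
}

/* Sends len bytes in pieces of at most chunk bytes. Returns the number
 * of bytes sent, or -1 with errno set if nothing could be sent. */
ssize_t send_chunked(int fd, const void *buf, size_t len, size_t chunk)
{
	const char *p = buf;
	size_t done = 0;

	assert(chunk > 0);
	while (done < len) {
		size_t want = len - done < chunk ? len - done : chunk;
		ssize_t n = send(fd, p + done, want, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry the same piece */
			return done ? (ssize_t)done : -1;
		}
		if (n == 0)
			break; /* no progress, stop rather than spin */
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

A caller would query once per socket, for example `size_t chunk = pick_chunk_size(fd, 2048);`, and then route large writes through `send_chunked(fd, buf, len, chunk)`. The 2048-byte fallback is an arbitrary illustrative value, not one taken from the patch.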


* Re: [PATCH resend] net: sock: Add option for memory optimized hints.
  2016-06-17 13:58 [PATCH resend] net: sock: Add option for memory optimized hints peter enderborg
@ 2016-06-17 14:14 ` Eric Dumazet
  2016-06-17 14:39   ` peter enderborg
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2016-06-17 14:14 UTC
  To: peter enderborg; +Cc: open list:PTP HARDWARE CLOCK SUPPORT

On Fri, 2016-06-17 at 15:58 +0200, peter enderborg wrote:
> From: Peter Enderborg <peter.enderborg@sonymobile.com>
> 
> When sending data, the socket allocates memory for the
> payload from a slab cache or from the page allocator. A page
> allocation might then trigger compaction, which can take a long time.
> This can be avoided by sending smaller chunks, but
> userspace cannot know the right size for
> the smaller sends. For this we add an SO_SIZEHINT
> getsockopt option, from which userspace can get the send size
> that will fit into one page (order 0) or
> the maximum for a slab cache allocation.

For which kind of sockets exactly do you hit a problem?

Sorry, this patch is probably not helping in any way.


* Re: [PATCH resend] net: sock: Add option for memory optimized hints.
  2016-06-17 14:14 ` Eric Dumazet
@ 2016-06-17 14:39   ` peter enderborg
  2016-06-17 16:03     ` Eric Dumazet
  2016-06-17 16:07     ` Eric Dumazet
  0 siblings, 2 replies; 5+ messages in thread
From: peter enderborg @ 2016-06-17 14:39 UTC
  To: Eric Dumazet; +Cc: open list:PTP HARDWARE CLOCK SUPPORT

On 06/17/2016 04:14 PM, Eric Dumazet wrote:
> On Fri, 2016-06-17 at 15:58 +0200, peter enderborg wrote:
>> From: Peter Enderborg <peter.enderborg@sonymobile.com>
>>
>> When sending data, the socket allocates memory for the
>> payload from a slab cache or from the page allocator. A page
>> allocation might then trigger compaction, which can take a long time.
>> This can be avoided by sending smaller chunks, but
>> userspace cannot know the right size for
>> the smaller sends. For this we add an SO_SIZEHINT
>> getsockopt option, from which userspace can get the send size
>> that will fit into one page (order 0) or
>> the maximum for a slab cache allocation.
>
> For which kind of sockets exactly do you hit a problem?
>
> Sorry, this patch is probably not helping in any way.
>
It is mainly for af_unix sockets, and the effect is
quite significant when you hit compaction or, with
this patch, avoid getting into compaction. It
can also be used to reduce memory pressure
for TCP. The patches you suggested have been
applied (with the addition of "af_unix: fix bug on large send()").
I see that there have been a lot of other compaction fixes
recently, but the problem is still there. And of course,
to make any difference you need to change your
userland application too. But in our Qualcomm/Google
bastard of a kernel, it makes a huge difference to the
behaviour of send(). I also do not see this as a
perfect solution. A wake-up function that has
the buffers reserved would be better, as would a preallocated
send buffer. But I don't expect that
Linux will have a real-time socket implementation in the
near future.


* Re: [PATCH resend] net: sock: Add option for memory optimized hints.
  2016-06-17 14:39   ` peter enderborg
@ 2016-06-17 16:03     ` Eric Dumazet
  2016-06-17 16:07     ` Eric Dumazet
  1 sibling, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2016-06-17 16:03 UTC
  To: peter enderborg; +Cc: open list:PTP HARDWARE CLOCK SUPPORT

On Fri, 2016-06-17 at 16:39 +0200, peter enderborg wrote:
> On 06/17/2016 04:14 PM, Eric Dumazet wrote:
> > On Fri, 2016-06-17 at 15:58 +0200, peter enderborg wrote:
> >> From: Peter Enderborg <peter.enderborg@sonymobile.com>
> >>
> >> When sending data, the socket allocates memory for the
> >> payload from a slab cache or from the page allocator. A page
> >> allocation might then trigger compaction, which can take a long time.
> >> This can be avoided by sending smaller chunks, but
> >> userspace cannot know the right size for
> >> the smaller sends. For this we add an SO_SIZEHINT
> >> getsockopt option, from which userspace can get the send size
> >> that will fit into one page (order 0) or
> >> the maximum for a slab cache allocation.
> >
> > For which kind of sockets exactly do you hit a problem?
> >
> > Sorry, this patch is probably not helping in any way.
> >
> It is mainly for af_unix sockets, and the effect is
> quite significant when you hit compaction or, with
> this patch, avoid getting into compaction. It
> can also be used to reduce memory pressure
> for TCP. The patches you suggested have been
> applied (with the addition of "af_unix: fix bug on large send()").
> I see that there have been a lot of other compaction fixes
> recently, but the problem is still there. And of course,
> to make any difference you need to change your
> userland application too. But in our Qualcomm/Google
> bastard of a kernel, it makes a huge difference to the
> behaviour of send(). I also do not see this as a
> perfect solution. A wake-up function that has
> the buffers reserved would be better, as would a preallocated
> send buffer. But I don't expect that
> Linux will have a real-time socket implementation in the
> near future.

I have no evidence the problem you describe still exists in current
linux kernels.

Please patch your kernels, but do not send networking patches that seem
to work around a mm-layer problem, without notifying mm maintainers.


* Re: [PATCH resend] net: sock: Add option for memory optimized hints.
  2016-06-17 14:39   ` peter enderborg
  2016-06-17 16:03     ` Eric Dumazet
@ 2016-06-17 16:07     ` Eric Dumazet
  1 sibling, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2016-06-17 16:07 UTC
  To: peter enderborg; +Cc: open list:PTP HARDWARE CLOCK SUPPORT

On Fri, 2016-06-17 at 16:39 +0200, peter enderborg wrote:

> It is mainly for af_unix sockets, and the effect is
> quite significant when you hit compaction or, with
> this patch, avoid getting into compaction. It
> can also be used to reduce memory pressure
> for TCP.

BTW, TCP always attempts order-3 allocations, even if you do a write(fd,
buffer, 4000).

So really your patch won't help.

We need to fix the mm layer (if needed), not add various workarounds.
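
For readers unfamiliar with allocation orders, the arithmetic behind the "order 0" hint in the patch and the "order-3" remark above can be sketched as follows. This is a userspace stand-in, not kernel code: `order_for_size` and `payload_for_order` are hypothetical helpers, the 4096-byte page size is an assumption, and the 320-byte aligned size of `struct skb_shared_info` used in the usage note is an assumed, configuration-dependent figure.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest order such that (page_size << order) >= size; a userspace
 * approximation of what the kernel's get_order() computes. */
unsigned int order_for_size(size_t size, size_t page_size)
{
	unsigned int order = 0;

	assert(size > 0 && page_size > 0);
	while ((page_size << order) < size)
		order++;
	return order;
}

/* Payload left in one block of the given order after reserving room for
 * the shared-info trailer, mirroring the subtraction in the patch.
 * shinfo_aligned is an assumed value supplied by the caller. */
size_t payload_for_order(unsigned int order, size_t page_size,
			 size_t shinfo_aligned)
{
	assert((page_size << order) > shinfo_aligned);
	return (page_size << order) - shinfo_aligned;
}
```

With 4096-byte pages, `order_for_size(4000, 4096)` is 0 and `order_for_size(32768, 4096)` is 3, so an order-3 request asks for eight contiguous pages. Under the assumed 320-byte trailer, `payload_for_order(0, 4096, 320)` gives 3776 bytes, which is the kind of value the patch would report in `order_zero_size`.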
