linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Greg Thelen <gthelen@google.com>,
	Greg Rose <grose@lightfleet.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.8 06/35] netlink: do not enter direct reclaim from netlink_dump()
Date: Sun, 13 Nov 2016 12:27:20 +0100	[thread overview]
Message-ID: <20161113112421.103043667@linuxfoundation.org> (raw)
In-Reply-To: <20161113112420.863033770@linuxfoundation.org>

4.8-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit d35c99ff77ecb2eb239731b799386f3b3637a31e ]

Since linux-3.15, netlink_dump() can use up to 16384 bytes skb
allocations.

Due to struct skb_shared_info ~320 bytes overhead, we end up using
order-3 (on x86) page allocations, that might trigger direct reclaim and
add stress.

The intent was really to attempt a large allocation but immediately
fallback to a smaller one (order-1 on x86) in case of memory stress.

On recent kernels (linux-4.4), we can remove __GFP_DIRECT_RECLAIM to
meet the goal. Old kernels would need to remove __GFP_WAIT

While we are at it, since we do an order-3 allocation, allow to use
all the allocated bytes instead of 16384 to reduce syscalls during
large dumps.

iproute2 already uses 32KB recvmsg() buffer sizes.

Alexei provided an initial patch downsizing to SKB_WITH_OVERHEAD(16384)

Fixes: 9063e21fb026 ("netlink: autosize skb lengthes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Reviewed-by: Greg Rose <grose@lightfleet.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/netlink/af_netlink.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1832,7 +1832,7 @@ static int netlink_recvmsg(struct socket
 	/* Record the max length of recvmsg() calls for future allocations */
 	nlk->max_recvmsg_len = max(nlk->max_recvmsg_len, len);
 	nlk->max_recvmsg_len = min_t(size_t, nlk->max_recvmsg_len,
-				     16384);
+				     SKB_WITH_OVERHEAD(32768));
 
 	copied = data_skb->len;
 	if (len < copied) {
@@ -2083,8 +2083,9 @@ static int netlink_dump(struct sock *sk)
 
 	if (alloc_min_size < nlk->max_recvmsg_len) {
 		alloc_size = nlk->max_recvmsg_len;
-		skb = alloc_skb(alloc_size, GFP_KERNEL |
-					    __GFP_NOWARN | __GFP_NORETRY);
+		skb = alloc_skb(alloc_size,
+				(GFP_KERNEL & ~__GFP_DIRECT_RECLAIM) |
+				__GFP_NOWARN | __GFP_NORETRY);
 	}
 	if (!skb) {
 		alloc_size = alloc_min_size;

  parent reply	other threads:[~2016-11-13 11:29 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20161113112745epcas3p44b79602b71457dccccc98057db31f68c@epcas3p4.samsung.com>
2016-11-13 11:27 ` [PATCH 4.8 00/35] 4.8.8-stable review Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 02/35] net: pktgen: fix pkt_size Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 03/35] net/sched: act_vlan: Push skb->data to mac_header prior calling skb_vlan_*() functions Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 04/35] net: Add netdev all_adj_list refcnt propagation to fix panic Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 05/35] packet: call fanout_release, while UNREGISTERING a netdev Greg Kroah-Hartman
2016-11-13 11:27   ` Greg Kroah-Hartman [this message]
2016-11-13 11:27   ` [PATCH 4.8 07/35] drivers/ptp: Fix kernel memory disclosure Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 08/35] net_sched: reorder pernet ops and act ops registrations Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 09/35] ipv6: tcp: restore IP6CB for pktoptions skbs Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 10/35] net: phy: Trigger state machine on state change and not polling Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 11/35] ip6_tunnel: fix ip6_tnl_lookup Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 12/35] ipv6: correctly add local routes when lo goes up Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 13/35] IB/ipoib: move back IB LL address into the hard header Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 14/35] net/mlx4_en: fixup xdp tx irq to match rx Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 15/35] net: pktgen: remove rcu locking in pktgen_change_name() Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 16/35] bridge: multicast: restore perm router ports on multicast enable Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 17/35] switchdev: Execute bridge ndos only for bridge ports Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 18/35] rtnetlink: Add rtnexthop offload flag to compare mask Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 19/35] net: core: Correctly iterate over lower adjacency list Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 21/35] ipv4: disable BH in set_ping_group_range() Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 22/35] ipv4: use the right lock for ping_group_range Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 23/35] net: fec: Call swap_buffer() prior to IP header alignment Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 24/35] net: sctp, forbid negative length Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 25/35] sctp: fix the panic caused by route update Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 26/35] udp: fix IP_CHECKSUM handling Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 27/35] netvsc: fix incorrect receive checksum offloading Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 28/35] macsec: Fix header length if SCI is added if explicitly disabled Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 29/35] net: ipv6: Do not consider link state for nexthop validation Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 30/35] net sched filters: fix notification of filter delete with proper handle Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 31/35] sctp: validate chunk len before actually using it Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 32/35] ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit() Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 33/35] packet: on direct_xmit, limit tso and csum to supported devices Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 34/35] arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofold Greg Kroah-Hartman
2016-11-13 11:27   ` [PATCH 4.8 35/35] usb: dwc3: gadget: properly account queued requests Greg Kroah-Hartman
2016-11-13 20:41   ` [PATCH 4.8 00/35] 4.8.8-stable review Guenter Roeck
2016-11-14  7:44     ` Greg Kroah-Hartman
     [not found]   ` <5828ae8b.46bb1c0a.31433.9533@mx.google.com>
2016-11-14  7:53     ` Greg Kroah-Hartman
2016-11-14 16:48   ` Shuah Khan
2016-11-14 17:18     ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161113112421.103043667@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=ast@kernel.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=grose@lightfleet.com \
    --cc=gthelen@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).