From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Ronen Arad <ronen.arad@intel.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.1 14/46] netlink: Trim skb to alloc size to avoid MSG_TRUNC
Date: Fri, 23 Oct 2015 10:46:00 -0700 [thread overview]
Message-ID: <20151023174621.223657032@linuxfoundation.org> (raw)
In-Reply-To: <20151023174620.779720995@linuxfoundation.org>
4.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Arad, Ronen" <ronen.arad@intel.com>
[ Upstream commit db65a3aaf29ecce2e34271d52e8d2336b97bd9fe ]
netlink_dump() allocates skb based on the calculated min_dump_alloc or
a per socket max_recvmsg_len.
min_alloc_size is maximum space required for any single netdev
attributes as calculated by rtnl_calcit().
max_recvmsg_len tracks the user provided buffer to netlink_recvmsg.
It is capped at 16KiB.
The intention is to avoid small allocations and to minimize the number
of calls required to obtain dump information for all net devices.
netlink_dump packs as many small messages as could fit within an skb
that was sized for the largest single netdev information. The actual
space available within an skb is larger than what is requested. It could
be much larger and up to near 2x with align to next power of 2 approach.
Allowing netlink_dump to use all the space available within the
allocated skb increases the buffer size a user has to provide to avoid
truncaion (i.e. MSG_TRUNG flag set).
It was observed that with many VLANs configured on at least one netdev,
a larger buffer of near 64KiB was necessary to avoid "Message truncated"
error in "ip link" or "bridge [-c[ompressvlans]] vlan show" when
min_alloc_size was only little over 32KiB.
This patch trims skb to allocated size in order to allow the user to
avoid truncation with more reasonable buffer size.
Signed-off-by: Ronen Arad <ronen.arad@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/netlink/af_netlink.c | 34 ++++++++++++++++++++++------------
1 file changed, 22 insertions(+), 12 deletions(-)
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2683,6 +2683,7 @@ static int netlink_dump(struct sock *sk)
struct sk_buff *skb = NULL;
struct nlmsghdr *nlh;
int len, err = -ENOBUFS;
+ int alloc_min_size;
int alloc_size;
mutex_lock(nlk->cb_mutex);
@@ -2691,9 +2692,6 @@ static int netlink_dump(struct sock *sk)
goto errout_skb;
}
- cb = &nlk->cb;
- alloc_size = max_t(int, cb->min_dump_alloc, NLMSG_GOODSIZE);
-
if (!netlink_rx_is_mmaped(sk) &&
atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf)
goto errout_skb;
@@ -2703,23 +2701,35 @@ static int netlink_dump(struct sock *sk)
* to reduce number of system calls on dump operations, if user
* ever provided a big enough buffer.
*/
- if (alloc_size < nlk->max_recvmsg_len) {
- skb = netlink_alloc_skb(sk,
- nlk->max_recvmsg_len,
- nlk->portid,
+ cb = &nlk->cb;
+ alloc_min_size = max_t(int, cb->min_dump_alloc, NLMSG_GOODSIZE);
+
+ if (alloc_min_size < nlk->max_recvmsg_len) {
+ alloc_size = nlk->max_recvmsg_len;
+ skb = netlink_alloc_skb(sk, alloc_size, nlk->portid,
GFP_KERNEL |
__GFP_NOWARN |
__GFP_NORETRY);
- /* available room should be exact amount to avoid MSG_TRUNC */
- if (skb)
- skb_reserve(skb, skb_tailroom(skb) -
- nlk->max_recvmsg_len);
}
- if (!skb)
+ if (!skb) {
+ alloc_size = alloc_min_size;
skb = netlink_alloc_skb(sk, alloc_size, nlk->portid,
GFP_KERNEL);
+ }
if (!skb)
goto errout_skb;
+
+ /* Trim skb to allocated size. User is expected to provide buffer as
+ * large as max(min_dump_alloc, 16KiB (mac_recvmsg_len capped at
+ * netlink_recvmsg())). dump will pack as many smaller messages as
+ * could fit within the allocated skb. skb is typically allocated
+ * with larger space than required (could be as much as near 2x the
+ * requested size with align to next power of 2 approach). Allowing
+ * dump to use the excess space makes it difficult for a user to have a
+ * reasonable static buffer based on the expected largest dump of a
+ * single netdev. The outcome is MSG_TRUNC error.
+ */
+ skb_reserve(skb, skb_tailroom(skb) - alloc_size);
netlink_skb_set_owner_r(skb, sk);
len = cb->dump(skb, cb);
next prev parent reply other threads:[~2015-10-23 17:55 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-23 17:45 [PATCH 4.1 00/46] 4.1.12-stable review Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 01/46] net/ibm/emac: bump version numbers for correct work with ethtool Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 02/46] l2tp: protect tunnel->del_work by ref_count Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 03/46] skbuff: Fix skb checksum flag on skb pull Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 04/46] skbuff: Fix skb checksum partial check Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 05/46] inet: fix races in reqsk_queue_hash_req() Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 06/46] net: add pfmemalloc check in sk_add_backlog() Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 07/46] ppp: dont override sk->sk_state in pppoe_flush_dev() Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 08/46] inet: fix race in reqsk_queue_unlink() Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 09/46] bpf: fix panic in SO_GET_FILTER with native ebpf programs Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 10/46] ovs: do not allocate memory from offline numa node Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 11/46] act_mirred: clear sender cpu before sending to tx Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 12/46] ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings Greg Kroah-Hartman
2015-10-23 17:45 ` [PATCH 4.1 13/46] tipc: move fragment importance field to new header position Greg Kroah-Hartman
2015-10-23 17:46 ` Greg Kroah-Hartman [this message]
2015-10-23 17:46 ` [PATCH 4.1 15/46] af_unix: Convert the unix_sk macro to an inline function for type safety Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 16/46] af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 17/46] net/unix: fix logic about sk_peek_offset Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 18/46] drm: Fix locking for sysfs dpms file Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 19/46] crypto: sparc - initialize blkcipher.ivsize Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 20/46] crypto: ahash - ensure statesize is non-zero Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 21/46] memcg: convert threshold to bytes Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 22/46] btrfs: check unsupported filters in balance arguments Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 23/46] btrfs: fix use after free iterating extrefs Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 24/46] arm64: errata: use KBUILD_CFLAGS_MODULE for erratum #843419 Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 25/46] nfsd/blocklayout: accept any minlength Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 26/46] mfd: max77843: Fix max77843_chg_init() return on error Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 27/46] i2c: rcar: enable RuntimePM before registering to the core Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 28/46] i2c: s3c2410: " Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 29/46] i2c: designware: Do not use parameters from ACPI on Dell Inspiron 7348 Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 30/46] i2c: designware-platdrv: enable RuntimePM before registering to the core Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 31/46] workqueue: make sure delayed work run in local cpu Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 32/46] drm/nouveau/fbcon: take runpm reference when userspace has an open fd Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 33/46] drm/dp/mst: make mst i2c transfer code more robust Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 34/46] drm/radeon: attach tile property to mst connector Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 35/46] drm/radeon: add pm sysfs files late Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 36/46] dm thin: fix missing pool reference count decrement in pool_ctr error path Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 37/46] rbd: fix double free on rbd_dev->header_name Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 38/46] sched/preempt: Rename PREEMPT_CHECK_OFFSET to PREEMPT_DISABLE_OFFSET Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 39/46] sched/preempt: Fix cond_resched_lock() and cond_resched_softirq() Greg Kroah-Hartman
2015-10-23 20:14 ` Thomas Backlund
2015-10-23 23:21 ` Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 41/46] arm64: Fix THP protection change logic Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 42/46] svcrdma: handle rdma read with a non-zero initial page offset Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 43/46] locks: have flock_lock_file take an inode pointer instead of a filp Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 44/46] locks: new helpers - flock_lock_inode_wait and posix_lock_inode_wait Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 45/46] locks: inline posix_lock_file_wait and flock_lock_file_wait Greg Kroah-Hartman
2015-10-23 17:46 ` [PATCH 4.1 46/46] nfs4: have do_vfs_lock take an inode pointer Greg Kroah-Hartman
2015-10-23 20:34 ` [PATCH 4.1 00/46] 4.1.12-stable review Shuah Khan
2015-10-23 23:22 ` Greg Kroah-Hartman
2015-10-24 1:11 ` Guenter Roeck
2015-10-24 3:15 ` Guenter Roeck
2015-10-24 13:20 ` Greg Kroah-Hartman
[not found] ` <562b9ffc.e8acc20a.c45a5.08fd@mx.google.com>
2015-10-24 15:16 ` Kevin Hilman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151023174621.223657032@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=linux-kernel@vger.kernel.org \
--cc=ronen.arad@intel.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).