From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Shahar Klein <shahark@mellanox.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	Amit Pundir <amit.pundir@linaro.org>
Subject: [PATCH 3.18 30/92] net, sched: fix soft lockup in tc_classify
Date: Wed,  9 Aug 2017 13:36:58 -0700
Message-ID: <20170809202156.718304446@linuxfoundation.org>
In-Reply-To: <20170809202155.435709888@linuxfoundation.org>

3.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <daniel@iogearbox.net>

commit 628185cfddf1dfb701c4efe2cfd72cf5b09f5702 upstream.

Shahar reported a soft lockup in tc_classify(), where we run into an
endless loop when walking the classifier chain due to tp->next == tp
which is a state we should never run into. The issue only seems to
trigger under load in the tc control path.

What happens is that in tc_ctl_tfilter(), thread A allocates a new
tp, initializes it, sets tp_created to 1, and calls into tp->ops->change()
with it. In that classifier callback we had to unlock/lock the rtnl
mutex and returned with -EAGAIN. One reason why we need to drop the
lock there is, for example, that we need to request an action module
to be loaded.

This happens via tcf_exts_validate() -> tcf_action_init/_1(), meaning
that after we loaded and found the requested action, we need to redo
the whole request so we don't race against others. While rtnl was
unlocked during that time, thread B's request was processed next on
that CPU. Thread B added a new tp instance successfully to the
classifier chain. When thread A returned, it grabbed the rtnl mutex
again, propagated -EAGAIN and destroyed its tp instance, which never
got linked. We then goto replay and redo A's request.

This time when walking the classifier chain in tc_ctl_tfilter() for
checking for existing tp instances we had a priority match and found
the tp instance that was created and linked by thread B. Now calling
again into tp->ops->change() with that tp was successful and returned
without error.

tp_created was never cleared in the second round, thus the kernel thinks
that we need to link it into the classifier chain (once again). tp and
*back point to the same object due to the match we had earlier on. Thus
for thread B's already public tp, we reset tp->next to tp itself and
link it into the chain, which eventually causes the mentioned endless
loop in tc_classify() once a packet hits the data path.
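
The double-link step described above can be reproduced in a small
userspace model. This is only a sketch: struct tp_model, link_tp(),
walk_chain() and stale_flag_creates_self_loop() are hypothetical names
for illustration, plain pointers stand in for the RCU accessors, and
the walk is bounded so that the cycle is reported rather than spun on.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct tcf_proto: only the chain link and priority. */
struct tp_model {
	struct tp_model *next;
	unsigned int prio;
};

/*
 * Model of the linking step at the end of tc_ctl_tfilter():
 * insert tp in front of whatever *back currently points to.
 */
static void link_tp(struct tp_model **back, struct tp_model *tp)
{
	tp->next = *back;
	*back = tp;
}

/*
 * Walk the chain the way a classifier loop would, but give up after
 * 'limit' steps so a cycle is reported instead of spinning forever.
 * Returns the number of nodes visited, or -1 if the limit was hit.
 */
static int walk_chain(const struct tp_model *head, int limit)
{
	int n = 0;

	assert(limit > 0);
	for (const struct tp_model *tp = head; tp; tp = tp->next) {
		if (++n > limit)
			return -1;
	}
	return n;
}

/*
 * Models thread A's second round: the lookup found thread B's tp, so
 * 'back' points at the slot that already holds it, and the stale
 * tp_created flag causes it to be linked a second time.
 * Returns 1 if the chain ends up with tp->next == tp.
 */
static int stale_flag_creates_self_loop(void)
{
	struct tp_model b = { .next = NULL, .prio = 1 };
	struct tp_model *chain = NULL;
	struct tp_model **back = &chain;
	int tp_created = 1;	/* left over from the first, failed round */

	link_tp(back, &b);	/* thread B links its tp: chain -> b */

	struct tp_model *tp = *back;	/* thread A's lookup matches b */
	if (tp_created)
		link_tp(back, tp);	/* tp->next = *back, i.e. b.next = &b */

	return b.next == &b && walk_chain(chain, 16) == -1;
}
```

In this model, calling stale_flag_creates_self_loop() from a test
program shows that linking an already linked tp through its own slot
leaves tp->next pointing at tp, which is the state the unbounded walk
in tc_classify() cannot get out of.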

The fix is to clear tp_created at the beginning of each request, also when
we replay it. On the paths that can cause -EAGAIN we already destroy
the original tp instance we had and on replay we really need to start
from scratch. It seems that this issue was first introduced in commit
12186be7d2e1 ("net_cls: fix unconfigured struct tcf_proto keeps chaining
and avoid kernel panic when we use cls_cgroup").
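
The effect of moving the initialization below the replay label can be
sketched in isolation. This is a minimal userspace model, not the
kernel code: attempt() and run_request() are hypothetical helpers, and
the reset_on_replay parameter exists only so that the pre-patch and
patched behaviour can be compared side by side.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for one pass of request handling: the first
 * attempt allocates a new object and fails with -EAGAIN, the second
 * finds an existing object and succeeds.
 */
static int attempt(int round, int *created)
{
	if (round == 0) {
		*created = 1;	/* new object allocated, then destroyed */
		return -EAGAIN;
	}
	return 0;		/* existing object found, nothing created */
}

/*
 * Runs the request with a replay loop and returns the value of the
 * 'created' flag seen after the final successful attempt.
 * reset_on_replay selects between the pre-patch and patched behaviour.
 */
static int run_request(int reset_on_replay)
{
	int created = 0;
	int round = 0;
	int err;

replay:
	if (reset_on_replay)
		created = 0;	/* the fix: start every pass from scratch */

	err = attempt(round++, &created);
	if (err == -EAGAIN)
		goto replay;

	assert(err == 0);
	return created;
}
```

In this model, run_request(0) returns with the flag still set from the
failed first pass, whereas run_request(1) returns with it cleared, so
the caller would not try to link an object it did not create.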

Fixes: 12186be7d2e1 ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup")
Reported-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/sched/cls_api.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -137,13 +137,15 @@ static int tc_ctl_tfilter(struct sk_buff
 	unsigned long cl;
 	unsigned long fh;
 	int err;
-	int tp_created = 0;
+	int tp_created;
 
 	if ((n->nlmsg_type != RTM_GETTFILTER) &&
 	    !netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN))
 		return -EPERM;
 
 replay:
+	tp_created = 0;
+
 	err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, NULL);
 	if (err < 0)
 		return err;
