public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Mukesh Ojha <quic_mojha@quicinc.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.4 66/67] devcoredump : Serialize devcd_del work
Date: Mon, 11 Dec 2023 19:22:50 +0100	[thread overview]
Message-ID: <20231211182017.816803161@linuxfoundation.org> (raw)
In-Reply-To: <20231211182015.049134368@linuxfoundation.org>

5.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mukesh Ojha <quic_mojha@quicinc.com>

[ Upstream commit 01daccf748323dfc61112f474cf2ba81015446b0 ]

In following scenario(diagram), when one thread X running dev_coredumpm()
adds devcd device to the framework which sends uevent notification to
userspace and another thread Y reads this uevent and call to
devcd_data_write() which eventually try to delete the queued timer that
is not initialized/queued yet.

So, debug object reports some warning and in the meantime, timer is
initialized and queued from X path. and from Y path, it gets reinitialized
again and timer->entry.pprev=NULL and try_to_grab_pending() stucks.

To fix this, introduce mutex and a boolean flag to serialize the behaviour.

 	cpu0(X)			                cpu1(Y)

    dev_coredump() uevent sent to user space
    device_add()  ======================> user space process Y reads the
                                          uevents writes to devcd fd
                                          which results into writes to

                                         devcd_data_write()
                                           mod_delayed_work()
                                             try_to_grab_pending()
                                               del_timer()
                                                 debug_assert_init()
   INIT_DELAYED_WORK()
   schedule_delayed_work()
                                                   debug_object_fixup()
                                                     timer_fixup_assert_init()
                                                       timer_setup()
                                                         do_init_timer()
                                                       /*
                                                        Above call reinitializes
                                                        the timer to
                                                        timer->entry.pprev=NULL
                                                        and this will be checked
                                                        later in timer_pending() call.
                                                       */
                                                 timer_pending()
                                                  !hlist_unhashed_lockless(&timer->entry)
                                                    !h->pprev
                                                /*
                                                  del_timer() checks h->pprev and finds
                                                  it to be NULL due to which
                                                  try_to_grab_pending() stucks.
                                                */

Link: https://lore.kernel.org/lkml/2e1f81e2-428c-f11f-ce92-eb11048cb271@quicinc.com/
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Link: https://lore.kernel.org/r/1663073424-13663-1-git-send-email-quic_mojha@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: af54d778a038 ("devcoredump: Send uevent once devcd is ready")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/base/devcoredump.c | 83 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 81 insertions(+), 2 deletions(-)

diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index e42d0b5143849..ab2fbb43ccdb5 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -29,6 +29,47 @@ struct devcd_entry {
 	struct device devcd_dev;
 	void *data;
 	size_t datalen;
+	/*
+	 * Here, mutex is required to serialize the calls to del_wk work between
+	 * user/kernel space which happens when devcd is added with device_add()
+	 * and that sends uevent to user space. User space reads the uevents,
+	 * and calls to devcd_data_write() which try to modify the work which is
+	 * not even initialized/queued from devcoredump.
+	 *
+	 *
+	 *
+	 *        cpu0(X)                                 cpu1(Y)
+	 *
+	 *        dev_coredump() uevent sent to user space
+	 *        device_add()  ======================> user space process Y reads the
+	 *                                              uevents writes to devcd fd
+	 *                                              which results into writes to
+	 *
+	 *                                             devcd_data_write()
+	 *                                               mod_delayed_work()
+	 *                                                 try_to_grab_pending()
+	 *                                                   del_timer()
+	 *                                                     debug_assert_init()
+	 *       INIT_DELAYED_WORK()
+	 *       schedule_delayed_work()
+	 *
+	 *
+	 * Also, mutex alone would not be enough to avoid scheduling of
+	 * del_wk work after it get flush from a call to devcd_free()
+	 * mentioned as below.
+	 *
+	 *	disabled_store()
+	 *        devcd_free()
+	 *          mutex_lock()             devcd_data_write()
+	 *          flush_delayed_work()
+	 *          mutex_unlock()
+	 *                                   mutex_lock()
+	 *                                   mod_delayed_work()
+	 *                                   mutex_unlock()
+	 * So, delete_work flag is required.
+	 */
+	struct mutex mutex;
+	bool delete_work;
 	struct module *owner;
 	ssize_t (*read)(char *buffer, loff_t offset, size_t count,
 			void *data, size_t datalen);
@@ -88,7 +129,12 @@ static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
 	struct device *dev = kobj_to_dev(kobj);
 	struct devcd_entry *devcd = dev_to_devcd(dev);
 
-	mod_delayed_work(system_wq, &devcd->del_wk, 0);
+	mutex_lock(&devcd->mutex);
+	if (!devcd->delete_work) {
+		devcd->delete_work = true;
+		mod_delayed_work(system_wq, &devcd->del_wk, 0);
+	}
+	mutex_unlock(&devcd->mutex);
 
 	return count;
 }
@@ -116,7 +162,12 @@ static int devcd_free(struct device *dev, void *data)
 {
 	struct devcd_entry *devcd = dev_to_devcd(dev);
 
+	mutex_lock(&devcd->mutex);
+	if (!devcd->delete_work)
+		devcd->delete_work = true;
+
 	flush_delayed_work(&devcd->del_wk);
+	mutex_unlock(&devcd->mutex);
 	return 0;
 }
 
@@ -126,6 +177,30 @@ static ssize_t disabled_show(struct class *class, struct class_attribute *attr,
 	return sprintf(buf, "%d\n", devcd_disabled);
 }
 
+/*
+ *
+ *	disabled_store()                                	worker()
+ *	 class_for_each_device(&devcd_class,
+ *		NULL, NULL, devcd_free)
+ *         ...
+ *         ...
+ *	   while ((dev = class_dev_iter_next(&iter))
+ *                                                             devcd_del()
+ *                                                               device_del()
+ *                                                                 put_device() <- last reference
+ *             error = fn(dev, data)                           devcd_dev_release()
+ *             devcd_free(dev, data)                           kfree(devcd)
+ *             mutex_lock(&devcd->mutex);
+ *
+ *
+ * In the above diagram, It looks like disabled_store() would be racing with parallely
+ * running devcd_del() and result in memory abort while acquiring devcd->mutex which
+ * is called after kfree of devcd memory  after dropping its last reference with
+ * put_device(). However, this will not happens as fn(dev, data) runs
+ * with its own reference to device via klist_node so it is not its last reference.
+ * so, above situation would not occur.
+ */
+
 static ssize_t disabled_store(struct class *class, struct class_attribute *attr,
 			      const char *buf, size_t count)
 {
@@ -282,13 +357,16 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 	devcd->read = read;
 	devcd->free = free;
 	devcd->failing_dev = get_device(dev);
+	devcd->delete_work = false;
 
+	mutex_init(&devcd->mutex);
 	device_initialize(&devcd->devcd_dev);
 
 	dev_set_name(&devcd->devcd_dev, "devcd%d",
 		     atomic_inc_return(&devcd_count));
 	devcd->devcd_dev.class = &devcd_class;
 
+	mutex_lock(&devcd->mutex);
 	if (device_add(&devcd->devcd_dev))
 		goto put_device;
 
@@ -302,10 +380,11 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 
 	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
 	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
-
+	mutex_unlock(&devcd->mutex);
 	return;
  put_device:
 	put_device(&devcd->devcd_dev);
+	mutex_unlock(&devcd->mutex);
  put_module:
 	module_put(owner);
  free:
-- 
2.42.0




  parent reply	other threads:[~2023-12-11 18:44 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-11 18:21 [PATCH 5.4 00/67] 5.4.264-rc1 review Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 01/67] hrtimers: Push pending hrtimers away from outgoing CPU earlier Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 02/67] netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 03/67] tg3: Move the [rt]x_dropped counters to tg3_napi Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 04/67] tg3: Increment tx_dropped in tg3_tso_bug() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 05/67] kconfig: fix memory leak from range properties Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 06/67] drm/amdgpu: correct chunk_ptr to a pointer to chunk Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 07/67] of: base: Add of_get_cpu_state_node() to get idle states for a CPU node Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 08/67] ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 09/67] ACPI/IORT: Make iort_msi_map_rid() PCI agnostic Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 10/67] of/iommu: Make of_map_rid() " Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 11/67] of/irq: make of_msi_map_get_device_domain() bus agnostic Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 12/67] of/irq: Make of_msi_map_rid() PCI " Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 13/67] of: base: Fix some formatting issues and provide missing descriptions Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 14/67] of: Fix kerneldoc output formatting Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 15/67] of: Add missing Return section in kerneldoc comments Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 16/67] of: dynamic: Fix of_reconfig_get_state_change() return value documentation Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 17/67] ipv6: fix potential NULL deref in fib6_add() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 18/67] hv_netvsc: rndis_filter needs to select NLS Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 19/67] net: arcnet: Fix RESET flag handling Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 20/67] net: arcnet: com20020 fix error handling Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 21/67] arcnet: restoring support for multiple Sohard Arcnet cards Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 22/67] ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 23/67] net: hns: fix fake link up on xge port Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 24/67] netfilter: xt_owner: Fix for unsafe access of sk->sk_socket Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 25/67] tcp: do not accept ACK of bytes we never sent Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 26/67] bpf: sockmap, updating the sg structure should also update curr Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 27/67] RDMA/bnxt_re: Correct module description string Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 28/67] hwmon: (acpi_power_meter) Fix 4.29 MW bug Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 29/67] ASoC: wm_adsp: fix memleak in wm_adsp_buffer_populate Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 30/67] tracing: Fix a warning when allocating buffered events fails Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 31/67] scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 32/67] ARM: imx: Check return value of devm_kasprintf in imx_mmdc_perf_init Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 33/67] ARM: dts: imx: make gpt node name generic Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 34/67] ARM: dts: imx7: Declare timers compatible with fsl,imx6dl-gpt Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 35/67] ALSA: pcm: fix out-of-bounds in snd_pcm_state_names Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 36/67] nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 37/67] tracing: Always update snapshot buffer size Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 38/67] tracing: Fix incomplete locking when disabling buffered events Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 39/67] tracing: Fix a possible race " Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 40/67] packet: Move reference count in packet_sock to atomic_long_t Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 41/67] arm64: dts: mediatek: mt7622: fix memory node warning check Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 42/67] arm64: dts: mediatek: mt8173-evb: Fix regulator-fixed node names Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 43/67] perf/core: Add a new read format to get a number of lost samples Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 44/67] perf: Fix perf_event_validate_size() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 45/67] gpiolib: sysfs: Fix error handling on failed export Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 46/67] mmc: core: add helpers mmc_regulator_enable/disable_vqmmc Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 47/67] mmc: sdhci-sprd: Fix vqmmc not shutting down after the card was pulled Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 48/67] usb: gadget: f_hid: fix report descriptor allocation Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 49/67] parport: Add support for Brainboxes IX/UC/PX parallel cards Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 50/67] usb: typec: class: fix typec_altmode_put_partner to put plugs Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 51/67] ARM: PL011: Fix DMA support Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 52/67] serial: sc16is7xx: address RX timeout interrupt errata Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 53/67] serial: 8250_omap: Add earlycon support for the AM654 UART controller Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 54/67] x86/CPU/AMD: Check vendor in the AMD microcode callback Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 55/67] KVM: s390/mm: Properly reset no-dat Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 56/67] nilfs2: fix missing error check for sb_set_blocksize call Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 57/67] io_uring/af_unix: disable sending io_uring over sockets Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 58/67] netlink: dont call ->netlink_bind with table lock held Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 59/67] genetlink: add CAP_NET_ADMIN test for multicast bind Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 60/67] psample: Require CAP_NET_ADMIN when joining "packets" group Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 61/67] drop_monitor: Require CAP_SYS_ADMIN when joining "events" group Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 62/67] tools headers UAPI: Sync linux/perf_event.h with the kernel sources Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 63/67] Revert "btrfs: add dmesg output for first mount and last unmount of a filesystem" Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 64/67] cifs: Fix non-availability of dedup breaking generic/304 Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 65/67] smb: client: fix potential NULL deref in parse_dfs_referrals() Greg Kroah-Hartman
2023-12-11 18:22 ` Greg Kroah-Hartman [this message]
2023-12-11 18:22 ` [PATCH 5.4 67/67] devcoredump: Send uevent once devcd is ready Greg Kroah-Hartman
2023-12-11 19:04 ` [PATCH 5.4 00/67] 5.4.264-rc1 review Florian Fainelli
2023-12-12 13:23 ` Harshit Mogalapalli
2023-12-12 16:13 ` Shuah Khan
2023-12-12 17:00 ` Guenter Roeck
2023-12-12 17:05 ` Naresh Kamboju
2023-12-12 22:19 ` Jon Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231211182017.816803161@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=patches@lists.linux.dev \
    --cc=quic_mojha@quicinc.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox