All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Mukesh Ojha <quic_mojha@quicinc.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 54/55] devcoredump : Serialize devcd_del work
Date: Mon, 11 Dec 2023 19:22:04 +0100	[thread overview]
Message-ID: <20231211182014.282007034@linuxfoundation.org> (raw)
In-Reply-To: <20231211182012.263036284@linuxfoundation.org>

4.19-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mukesh Ojha <quic_mojha@quicinc.com>

[ Upstream commit 01daccf748323dfc61112f474cf2ba81015446b0 ]

In following scenario(diagram), when one thread X running dev_coredumpm()
adds devcd device to the framework which sends uevent notification to
userspace and another thread Y reads this uevent and call to
devcd_data_write() which eventually try to delete the queued timer that
is not initialized/queued yet.

So, debug object reports some warning and in the meantime, timer is
initialized and queued from X path. and from Y path, it gets reinitialized
again and timer->entry.pprev=NULL and try_to_grab_pending() stucks.

To fix this, introduce mutex and a boolean flag to serialize the behaviour.

 	cpu0(X)			                cpu1(Y)

    dev_coredump() uevent sent to user space
    device_add()  ======================> user space process Y reads the
                                          uevents writes to devcd fd
                                          which results into writes to

                                         devcd_data_write()
                                           mod_delayed_work()
                                             try_to_grab_pending()
                                               del_timer()
                                                 debug_assert_init()
   INIT_DELAYED_WORK()
   schedule_delayed_work()
                                                   debug_object_fixup()
                                                     timer_fixup_assert_init()
                                                       timer_setup()
                                                         do_init_timer()
                                                       /*
                                                        Above call reinitializes
                                                        the timer to
                                                        timer->entry.pprev=NULL
                                                        and this will be checked
                                                        later in timer_pending() call.
                                                       */
                                                 timer_pending()
                                                  !hlist_unhashed_lockless(&timer->entry)
                                                    !h->pprev
                                                /*
                                                  del_timer() checks h->pprev and finds
                                                  it to be NULL due to which
                                                  try_to_grab_pending() stucks.
                                                */

Link: https://lore.kernel.org/lkml/2e1f81e2-428c-f11f-ce92-eb11048cb271@quicinc.com/
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Link: https://lore.kernel.org/r/1663073424-13663-1-git-send-email-quic_mojha@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: af54d778a038 ("devcoredump: Send uevent once devcd is ready")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/base/devcoredump.c | 83 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 81 insertions(+), 2 deletions(-)

diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index f1a3353f34946..1ba4af7f2f21f 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -29,6 +29,47 @@ struct devcd_entry {
 	struct device devcd_dev;
 	void *data;
 	size_t datalen;
+	/*
+	 * Here, mutex is required to serialize the calls to del_wk work between
+	 * user/kernel space which happens when devcd is added with device_add()
+	 * and that sends uevent to user space. User space reads the uevents,
+	 * and calls to devcd_data_write() which try to modify the work which is
+	 * not even initialized/queued from devcoredump.
+	 *
+	 *
+	 *
+	 *        cpu0(X)                                 cpu1(Y)
+	 *
+	 *        dev_coredump() uevent sent to user space
+	 *        device_add()  ======================> user space process Y reads the
+	 *                                              uevents writes to devcd fd
+	 *                                              which results into writes to
+	 *
+	 *                                             devcd_data_write()
+	 *                                               mod_delayed_work()
+	 *                                                 try_to_grab_pending()
+	 *                                                   del_timer()
+	 *                                                     debug_assert_init()
+	 *       INIT_DELAYED_WORK()
+	 *       schedule_delayed_work()
+	 *
+	 *
+	 * Also, mutex alone would not be enough to avoid scheduling of
+	 * del_wk work after it get flush from a call to devcd_free()
+	 * mentioned as below.
+	 *
+	 *	disabled_store()
+	 *        devcd_free()
+	 *          mutex_lock()             devcd_data_write()
+	 *          flush_delayed_work()
+	 *          mutex_unlock()
+	 *                                   mutex_lock()
+	 *                                   mod_delayed_work()
+	 *                                   mutex_unlock()
+	 * So, delete_work flag is required.
+	 */
+	struct mutex mutex;
+	bool delete_work;
 	struct module *owner;
 	ssize_t (*read)(char *buffer, loff_t offset, size_t count,
 			void *data, size_t datalen);
@@ -88,7 +129,12 @@ static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
 	struct device *dev = kobj_to_dev(kobj);
 	struct devcd_entry *devcd = dev_to_devcd(dev);
 
-	mod_delayed_work(system_wq, &devcd->del_wk, 0);
+	mutex_lock(&devcd->mutex);
+	if (!devcd->delete_work) {
+		devcd->delete_work = true;
+		mod_delayed_work(system_wq, &devcd->del_wk, 0);
+	}
+	mutex_unlock(&devcd->mutex);
 
 	return count;
 }
@@ -116,7 +162,12 @@ static int devcd_free(struct device *dev, void *data)
 {
 	struct devcd_entry *devcd = dev_to_devcd(dev);
 
+	mutex_lock(&devcd->mutex);
+	if (!devcd->delete_work)
+		devcd->delete_work = true;
+
 	flush_delayed_work(&devcd->del_wk);
+	mutex_unlock(&devcd->mutex);
 	return 0;
 }
 
@@ -126,6 +177,30 @@ static ssize_t disabled_show(struct class *class, struct class_attribute *attr,
 	return sprintf(buf, "%d\n", devcd_disabled);
 }
 
+/*
+ *
+ *	disabled_store()                                	worker()
+ *	 class_for_each_device(&devcd_class,
+ *		NULL, NULL, devcd_free)
+ *         ...
+ *         ...
+ *	   while ((dev = class_dev_iter_next(&iter))
+ *                                                             devcd_del()
+ *                                                               device_del()
+ *                                                                 put_device() <- last reference
+ *             error = fn(dev, data)                           devcd_dev_release()
+ *             devcd_free(dev, data)                           kfree(devcd)
+ *             mutex_lock(&devcd->mutex);
+ *
+ *
+ * In the above diagram, It looks like disabled_store() would be racing with parallely
+ * running devcd_del() and result in memory abort while acquiring devcd->mutex which
+ * is called after kfree of devcd memory  after dropping its last reference with
+ * put_device(). However, this will not happens as fn(dev, data) runs
+ * with its own reference to device via klist_node so it is not its last reference.
+ * so, above situation would not occur.
+ */
+
 static ssize_t disabled_store(struct class *class, struct class_attribute *attr,
 			      const char *buf, size_t count)
 {
@@ -291,13 +366,16 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 	devcd->read = read;
 	devcd->free = free;
 	devcd->failing_dev = get_device(dev);
+	devcd->delete_work = false;
 
+	mutex_init(&devcd->mutex);
 	device_initialize(&devcd->devcd_dev);
 
 	dev_set_name(&devcd->devcd_dev, "devcd%d",
 		     atomic_inc_return(&devcd_count));
 	devcd->devcd_dev.class = &devcd_class;
 
+	mutex_lock(&devcd->mutex);
 	if (device_add(&devcd->devcd_dev))
 		goto put_device;
 
@@ -311,10 +389,11 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 
 	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
 	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
-
+	mutex_unlock(&devcd->mutex);
 	return;
  put_device:
 	put_device(&devcd->devcd_dev);
+	mutex_unlock(&devcd->mutex);
  put_module:
 	module_put(owner);
  free:
-- 
2.42.0




  parent reply	other threads:[~2023-12-11 18:26 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-11 18:21 [PATCH 4.19 00/55] 4.19.302-rc1 review Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 01/55] spi: imx: add a device specific prepare_message callback Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 02/55] spi: imx: move wml setting to later than setup_transfer Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 03/55] spi: imx: correct wml as the last sg length Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 04/55] spi: imx: mx51-ecspi: Move some initialisation to prepare_message hook Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 05/55] media: davinci: vpif_capture: fix potential double free Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 06/55] block: introduce multi-page bvec helpers Greg Kroah-Hartman
2023-12-12  5:47   ` Christoph Hellwig
2023-12-12  8:38     ` Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 07/55] hrtimers: Push pending hrtimers away from outgoing CPU earlier Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 08/55] netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 09/55] tg3: Move the [rt]x_dropped counters to tg3_napi Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 10/55] tg3: Increment tx_dropped in tg3_tso_bug() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 11/55] kconfig: fix memory leak from range properties Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 12/55] drm/amdgpu: correct chunk_ptr to a pointer to chunk Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 13/55] ipv6: fix potential NULL deref in fib6_add() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 14/55] hv_netvsc: rndis_filter needs to select NLS Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 15/55] net: arcnet: Fix RESET flag handling Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 16/55] net: arcnet: com20020 fix error handling Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 17/55] arcnet: restoring support for multiple Sohard Arcnet cards Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 18/55] ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 19/55] net: hns: fix fake link up on xge port Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 20/55] netfilter: xt_owner: Add supplementary groups option Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 21/55] netfilter: xt_owner: Fix for unsafe access of sk->sk_socket Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 22/55] tcp: do not accept ACK of bytes we never sent Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 23/55] RDMA/bnxt_re: Correct module description string Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 24/55] hwmon: (acpi_power_meter) Fix 4.29 MW bug Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 25/55] tracing: Fix a warning when allocating buffered events fails Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 26/55] scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 27/55] ARM: imx: Check return value of devm_kasprintf in imx_mmdc_perf_init Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 28/55] ARM: dts: imx: make gpt node name generic Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 29/55] ARM: dts: imx7: Declare timers compatible with fsl,imx6dl-gpt Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 30/55] ALSA: pcm: fix out-of-bounds in snd_pcm_state_names Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 31/55] packet: Move reference count in packet_sock to atomic_long_t Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 32/55] nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 33/55] tracing: Always update snapshot buffer size Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 34/55] tracing: Fix incomplete locking when disabling buffered events Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 35/55] tracing: Fix a possible race " Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 36/55] perf/core: Add a new read format to get a number of lost samples Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 37/55] perf: Fix perf_event_validate_size() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 38/55] gpiolib: sysfs: Fix error handling on failed export Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 39/55] usb: gadget: f_hid: fix report descriptor allocation Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 40/55] parport: Add support for Brainboxes IX/UC/PX parallel cards Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 41/55] usb: typec: class: fix typec_altmode_put_partner to put plugs Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 42/55] ARM: PL011: Fix DMA support Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 43/55] serial: sc16is7xx: address RX timeout interrupt errata Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 44/55] serial: 8250_omap: Add earlycon support for the AM654 UART controller Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 45/55] x86/CPU/AMD: Check vendor in the AMD microcode callback Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 46/55] KVM: s390/mm: Properly reset no-dat Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 47/55] nilfs2: fix missing error check for sb_set_blocksize call Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 48/55] netlink: dont call ->netlink_bind with table lock held Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 4.19 49/55] genetlink: add CAP_NET_ADMIN test for multicast bind Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 4.19 50/55] psample: Require CAP_NET_ADMIN when joining "packets" group Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 4.19 51/55] drop_monitor: Require CAP_SYS_ADMIN when joining "events" group Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 4.19 52/55] tools headers UAPI: Sync linux/perf_event.h with the kernel sources Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 4.19 53/55] IB/isert: Fix unaligned immediate-data handling Greg Kroah-Hartman
2023-12-11 18:22 ` Greg Kroah-Hartman [this message]
2023-12-11 18:22 ` [PATCH 4.19 55/55] devcoredump: Send uevent once devcd is ready Greg Kroah-Hartman
2023-12-11 21:04 ` [PATCH 4.19 00/55] 4.19.302-rc1 review Daniel Díaz
2023-12-12 10:39   ` Greg Kroah-Hartman
2023-12-12 16:14 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231211182014.282007034@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=patches@lists.linux.dev \
    --cc=quic_mojha@quicinc.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.