From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Anatol Pomozov <anatol.pomozov@gmail.com>,
Benjamin LaHaise <bcrl@kvack.org>, Jan Kara <jack@suse.cz>
Subject: [PATCH 3.14 86/94] aio: block io_destroy() until all context requests are completed
Date: Mon, 7 Jul 2014 16:58:16 -0700 [thread overview]
Message-ID: <20140707235759.305892948@linuxfoundation.org> (raw)
In-Reply-To: <20140707235756.780319003@linuxfoundation.org>
3.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anatol Pomozov <anatol.pomozov@gmail.com>
commit e02ba72aabfade4c9cd6e3263e9b57bf890ad25c upstream.
deletes aio context and all resources related to. It makes sense that
no IO operations connected to the context should be running after the context
is destroyed. As we removed io_context we have no chance to
get requests status or call io_getevents().
man page for io_destroy says that this function may block until
all context's requests are completed. Before kernel 3.11 io_destroy()
blocked indeed, but since aio refactoring in 3.11 it is not true anymore.
Here is a pseudo-code that shows a testcase for a race condition discovered
in 3.11:
initialize io_context
io_submit(read to buffer)
io_destroy()
// context is destroyed so we can free the resources
free(buffers);
// if the buffer is allocated by some other user he'll be surprised
// to learn that the buffer still filled by an outstanding operation
// from the destroyed io_context
The fix is straight-forward - add a completion struct and wait on it
in io_destroy, complete() should be called when number of in-fligh requests
reaches zero.
If two or more io_destroy() called for the same context simultaneously then
only the first one waits for IO completion, other calls behaviour is undefined.
Tested: ran http://pastebin.com/LrPsQ4RL testcase for several hours and
do not see the race condition anymore.
Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/aio.c | 36 ++++++++++++++++++++++++++++++++----
1 file changed, 32 insertions(+), 4 deletions(-)
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -112,6 +112,11 @@ struct kioctx {
struct work_struct free_work;
+ /*
+ * signals when all in-flight requests are done
+ */
+ struct completion *requests_done;
+
struct {
/*
* This counts the number of available slots in the ringbuffer,
@@ -508,6 +513,10 @@ static void free_ioctx_reqs(struct percp
{
struct kioctx *ctx = container_of(ref, struct kioctx, reqs);
+ /* At this point we know that there are no any in-flight requests */
+ if (ctx->requests_done)
+ complete(ctx->requests_done);
+
INIT_WORK(&ctx->free_work, free_ioctx);
schedule_work(&ctx->free_work);
}
@@ -718,7 +727,8 @@ err:
* when the processes owning a context have all exited to encourage
* the rapid destruction of the kioctx.
*/
-static void kill_ioctx(struct mm_struct *mm, struct kioctx *ctx)
+static void kill_ioctx(struct mm_struct *mm, struct kioctx *ctx,
+ struct completion *requests_done)
{
if (!atomic_xchg(&ctx->dead, 1)) {
struct kioctx_table *table;
@@ -747,7 +757,11 @@ static void kill_ioctx(struct mm_struct
if (ctx->mmap_size)
vm_munmap(ctx->mmap_base, ctx->mmap_size);
+ ctx->requests_done = requests_done;
percpu_ref_kill(&ctx->users);
+ } else {
+ if (requests_done)
+ complete(requests_done);
}
}
@@ -809,7 +823,7 @@ void exit_aio(struct mm_struct *mm)
*/
ctx->mmap_size = 0;
- kill_ioctx(mm, ctx);
+ kill_ioctx(mm, ctx, NULL);
}
}
@@ -1187,7 +1201,7 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_e
if (!IS_ERR(ioctx)) {
ret = put_user(ioctx->user_id, ctxp);
if (ret)
- kill_ioctx(current->mm, ioctx);
+ kill_ioctx(current->mm, ioctx, NULL);
percpu_ref_put(&ioctx->users);
}
@@ -1205,8 +1219,22 @@ SYSCALL_DEFINE1(io_destroy, aio_context_
{
struct kioctx *ioctx = lookup_ioctx(ctx);
if (likely(NULL != ioctx)) {
- kill_ioctx(current->mm, ioctx);
+ struct completion requests_done =
+ COMPLETION_INITIALIZER_ONSTACK(requests_done);
+
+ /* Pass requests_done to kill_ioctx() where it can be set
+ * in a thread-safe way. If we try to set it here then we have
+ * a race condition if two io_destroy() called simultaneously.
+ */
+ kill_ioctx(current->mm, ioctx, &requests_done);
percpu_ref_put(&ioctx->users);
+
+ /* Wait until all IO for the context are done. Otherwise kernel
+ * keep using user-space buffers even if user thinks the context
+ * is destroyed.
+ */
+ wait_for_completion(&requests_done);
+
return 0;
}
pr_debug("EINVAL: io_destroy: invalid context id\n");
next prev parent reply other threads:[~2014-07-08 0:26 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-07 23:56 [PATCH 3.14 00/94] 3.14.12-stable review Greg Kroah-Hartman
2014-07-07 23:56 ` [PATCH 3.14 01/94] ibmvscsi: Abort init sequence during error recovery Greg Kroah-Hartman
2014-07-07 23:56 ` [PATCH 3.14 02/94] ibmvscsi: Add memory barriers for send / receive Greg Kroah-Hartman
2014-07-07 23:56 ` [PATCH 3.14 03/94] virtio-scsi: avoid cancelling uninitialized work items Greg Kroah-Hartman
2014-07-07 23:56 ` [PATCH 3.14 04/94] scsi_error: fix invalid setting of host byte Greg Kroah-Hartman
2014-07-07 23:56 ` [PATCH 3.14 05/94] virtio-scsi: fix various bad behavior on aborted requests Greg Kroah-Hartman
2014-07-07 23:56 ` [PATCH 3.14 06/94] xhci: Use correct SLOT ID when handling a reset device command Greg Kroah-Hartman
2014-07-07 23:56 ` [PATCH 3.14 07/94] xhci: correct burst count field for isoc transfers on 1.0 xhci hosts Greg Kroah-Hartman
2014-07-07 23:56 ` [PATCH 3.14 08/94] xhci: Fix runtime suspended xhci from blocking system suspend Greg Kroah-Hartman
2014-07-07 23:56 ` [PATCH 3.14 09/94] USB: option: add device ID for SpeedUp SU9800 usb 3g modem Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 11/94] usb: musb: ux500: dont propagate the OF node Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 12/94] usb: musb: Ensure that cppi41 timer gets armed on premature DMA TX irq Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 13/94] usb: musb: Fix panic upon musb_am335x module removal Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 14/94] usb: chipidea: udc: delete td from reqs td list at ep_dequeue Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 15/94] USB: ftdi_sio: fix null deref at port probe Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 18/94] rt2x00: disable TKIP on USB Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 19/94] rt2x00: fix rfkill regression on rt2500pci Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 20/94] mtd: eLBC NAND: fix subpage write support Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 21/94] mtd: nand: omap: fix BCHx ecc.correct to return detected bit-flips in erased-page Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 22/94] mtd: pxa3xx_nand: make the driver work on big-endian systems Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 23/94] vgaswitcheroo: switch the mux to the igp on power down when runpm is enabled Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 24/94] drm/nouveau/kms/nv04-nv40: fix pageflip events via special case Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 25/94] drm/nouveau/disp/nv04-nv40: abort scanoutpos query on vga analog Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 26/94] drm/nouveau/kms: reference vblank for crtc during pageflip Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 27/94] drm/radeon: only apply hdmi bpc pll flags when encoder mode is hdmi Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 28/94] drm/radeon: fix typo in radeon_connector_is_dp12_capable() Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 29/94] drm/radeon/dp: fix lane/clock setup for dp 1.2 capable devices Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 30/94] drm/radeon/atom: fix dithering on certain panels Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 32/94] drm/radeon/dpm: fix typo in vddci setup for eg/btc Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 34/94] drm/radeon/cik: fix typo in EOP packet Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 35/94] drm/nv50-/mc: fix kms pageflip events by reordering irq handling order Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 36/94] drm/gk208/gr: add missing registers to grctx init Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 38/94] drm/i915: set backlight duty cycle after backlight enable for gen4 Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 39/94] drm/vmwgfx: Fix incorrect write to read-only register v2: Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 40/94] Bluetooth: Fix SSP acceptor just-works confirmation without MITM Greg Kroah-Hartman
2014-07-07 23:57 ` Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 41/94] Bluetooth: Fix check for connection encryption Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 42/94] Bluetooth: Fix indicating discovery state when canceling inquiry Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 43/94] Bluetooth: Fix locking of hdev when calling into SMP code Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 44/94] Bluetooth: Allow change security level on ATT_CID in slave role Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 45/94] dm thin: update discard_granularity to reflect the thin-pool blocksize Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 46/94] rbd: use reference counts for image requests Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 47/94] rbd: handle parent_overlap on writes correctly Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 48/94] hwmon: (ina2xx) Cast to s16 on shunt and current regs Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 49/94] intel_pstate: Correct rounding in busy calculation Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 51/94] mac80211: dont check netdev state for debugfs read/write Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 53/94] iwlwifi: pcie: try to get ownership several times Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 54/94] hugetlb: fix copy_hugetlb_page_range() to handle migration/hwpoisoned entry Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 55/94] mm, pcp: allow restoring percpu_pagelist_fraction default Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 56/94] arm64: mm: Make icache synchronisation logic huge page aware Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 57/94] ARM: OMAP2+: Fix parser-bug in platform muxing code Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 59/94] net: allwinner: emac: Add missing free_irq Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 62/94] CIFS: fix mount failure with broken pathnames when smb3 mount with mapchars option Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 63/94] blkcg: fix use-after-free in __blkg_release_rcu() by making blkcg_gq refcnt an atomic_t Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 64/94] ext4: Fix buffer double free in ext4_alloc_branch() Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 65/94] ext4: Fix hole punching for files with indirect blocks Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 66/94] KVM: x86: Increase the number of fixed MTRR regs to 10 Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 67/94] KVM: x86: preserve the high 32-bits of the PAT register Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 68/94] kvm: fix wrong address when writing Hyper-V tsc page Greg Kroah-Hartman
2014-07-07 23:57 ` Greg Kroah-Hartman
2014-07-07 23:57 ` [PATCH 3.14 69/94] iio: of_iio_channel_get_by_name() returns non-null pointers for error legs Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 70/94] staging: iio/ad7291: fix error code in ad7291_probe() Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 71/94] nfsd: fix rare symlink decoding bug Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 72/94] tools: ffs-test: fix header values endianess Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 73/94] tracing: Remove ftrace_stop/start() from reading the trace file Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 74/94] md: flush writes before starting a recovery Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 75/94] irqchip: spear_shirq: Fix interrupt offset Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 77/94] mlx4_core: Fix incorrect FLAGS1 bitmap test in mlx4_QUERY_FUNC_CAP Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 78/94] clk: qcom: Fix clk_rcg2_is_enabled() check Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 79/94] clk: qcom: Fix mmcc-8974s PLL configurations Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 80/94] serial: Fix IGNBRK handling Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 81/94] tty: Correct INPCK handling Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 82/94] netfilter: nf_nat: fix oops on netns removal Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 83/94] brcmfmac: Fix brcmf_chip_ai_coredisable not applying reset bits to BCMA_IOCTL Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 84/94] mmc: rtsx: add R1-no-CRC mmc command type handle Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 85/94] drm/i915: fix display power sw state reporting Greg Kroah-Hartman
2014-07-07 23:58 ` Greg Kroah-Hartman [this message]
2014-07-07 23:58 ` [PATCH 3.14 87/94] audit: remove superfluous new- prefix in AUDIT_LOGIN messages Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 88/94] ALSA: usb-audio: Suppress repetitive debug messages from retire_playback_urb() Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 89/94] ALSA: usb-audio: Prevent printk ratelimiting from spamming kernel log while DEBUG not defined Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 90/94] arch/unicore32/mm/alignment.c: include "asm/pgtable.h" to avoid compiling error Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 91/94] drivers/video/fbdev/fb-puv3.c: Add header files for function unifb_mmap Greg Kroah-Hartman
2014-07-07 23:58 ` Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 92/94] mm/numa: Remove BUG_ON() in __handle_mm_fault() Greg Kroah-Hartman
2014-07-07 23:58 ` Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 93/94] slab: fix oops when reading /proc/slab_allocators Greg Kroah-Hartman
2014-07-07 23:58 ` [PATCH 3.14 94/94] sym53c8xx_2: Set DID_REQUEUE return code when aborting squeue Greg Kroah-Hartman
2014-07-08 13:24 ` [PATCH 3.14 00/94] 3.14.12-stable review Guenter Roeck
2014-07-08 22:15 ` Greg Kroah-Hartman
2014-07-15 1:00 ` Greg Kroah-Hartman
2014-07-08 19:31 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140707235759.305892948@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=anatol.pomozov@gmail.com \
--cc=bcrl@kvack.org \
--cc=jack@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.