From: Greg KH <greg@kroah.com>
To: Stefan Bader <stefan.bader@canonical.com>
Cc: Greg KH <gregkh@suse.de>, Kiyoshi Ueda <k-ueda@ct.jp.nec.com>,
linux-kernel@vger.kernel.org, stable-review@kernel.org,
alan@lxorguk.ukuu.org.uk, Junichi Nomura <j-nomura@ce.jp.nec.com>,
akpm@linux-foundation.org, torvalds@linux-foundation.org,
stable@kernel.org, Alasdair G Kergon <agk@redhat.com>
Subject: Re: [Stable-review] [93/93] dm mpath: fix stall when requeueing io
Date: Tue, 23 Feb 2010 07:33:26 -0800 [thread overview]
Message-ID: <20100223153326.GA4275@kroah.com> (raw)
In-Reply-To: <4B815A3D.9040402@canonical.com>
On Sun, Feb 21, 2010 at 05:07:25PM +0100, Stefan Bader wrote:
> Greg KH wrote:
> > 2.6.32-stable review patch. If anyone has any objections, please let us know.
> >
> > ------------------
> >
> > From: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
> >
> > commit 9eef87da2a8ea4920e0d913ff977cac064b68ee0 upstream.
> >
> > This patch fixes the problem that system may stall if target's ->map_rq
> > returns DM_MAPIO_REQUEUE in map_request().
> > E.g. stall happens on 1 CPU box when a dm-mpath device with queue_if_no_path
> > bounces between all-paths-down and paths-up on I/O load.
> >
> > When target's ->map_rq returns DM_MAPIO_REQUEUE, map_request() requeues
> > the request and returns to dm_request_fn(). Then, dm_request_fn()
> > doesn't exit the I/O dispatching loop and continues processing
> > the requeued request again.
> > This map and requeue loop can be done with interrupt disabled,
> > so 1 CPU system can be stalled if this situation happens.
> >
> > For example, commands below can stall my 1 CPU box within 1 minute or so:
> > # dmsetup table mp
> > mp: 0 2097152 multipath 1 queue_if_no_path 0 1 1 service-time 0 1 2 8:144 1 1
> > # while true; do dd if=/dev/mapper/mp of=/dev/null bs=1M count=100; done &
> > # while true; do \
> > > dmsetup message mp 0 "fail_path 8:144" \
> > > dmsetup suspend --noflush mp \
> > > dmsetup resume mp \
> > > dmsetup message mp 0 "reinstate_path 8:144" \
> > > done
> >
> > To fix the problem above, this patch changes dm_request_fn() to exit
> > the I/O dispatching loop once if a request is requeued in map_request().
> >
> > Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
> > Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
> > Signed-off-by: Alasdair G Kergon <agk@redhat.com>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
> >
> > ---
> > drivers/md/dm.c | 21 ++++++++++++++++-----
> > 1 file changed, 16 insertions(+), 5 deletions(-)
> >
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1487,11 +1487,15 @@ static int dm_prep_fn(struct request_que
> > return BLKPREP_OK;
> > }
> >
> > -static void map_request(struct dm_target *ti, struct request *rq,
> > - struct mapped_device *md)
> > +/*
> > + * Returns:
> > + * 0 : the request has been processed (not requeued)
> > + * !0 : the request has been requeued
> > + */
> > +static int map_request(struct dm_target *ti, struct request *clone,
> > + struct mapped_device *md)
> > {
> > - int r;
> > - struct request *clone = rq->special;
>
> This change requires the argument to this function to be a rq->special
> pointer. This is changed in the map_request function by the following
> patch:
>
> commit b4324feeae304ae39e631a254d238a7d63be004b
> Author: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
> Date: Thu Dec 10 23:52:16 2009 +0000
>
> dm: use md pending for in flight IO counting
>
> > + int r, requeued = 0;
> > struct dm_rq_target_io *tio = clone->end_io_data;
> >
> > /*
> > @@ -1516,6 +1520,7 @@ static void map_request(struct dm_target
> > case DM_MAPIO_REQUEUE:
> > /* The target wants to requeue the I/O */
> > dm_requeue_unmapped_request(clone);
> > + requeued = 1;
> > break;
> > default:
> > if (r > 0) {
> > @@ -1527,6 +1532,8 @@ static void map_request(struct dm_target
> > dm_kill_unmapped_request(clone, r);
> > break;
> > }
> > +
> > + return requeued;
> > }
> >
> > /*
> > @@ -1568,12 +1575,16 @@ static void dm_request_fn(struct request
> >
> > blk_start_request(rq);
> > spin_unlock(q->queue_lock);
> > - map_request(ti, rq, md);
> > + if (map_request(ti, rq, md))
> > + goto requeued;
> > spin_lock_irq(q->queue_lock);
> > }
>
> That is the current state of dm_request_function:
>
> clone = rq->special;
> atomic_inc(&md->pending[rq_data_dir(clone)]);
>
> spin_unlock(q->queue_lock);
> if (map_request(ti, clone, md))
>
> While looking over the code I also noticed that the spinlock is dropped with
> spin_unlock and then reacquired with spin_lock_irq. Isn't the irq version too
> much in that case? Or was the intention to have interrupts enabled when unlocking?
Ick, thanks for the review. I'll drop this patch for now and fix it up
in the next release if needed.
thanks,
greg k-h
next prev parent reply other threads:[~2010-02-23 15:34 UTC|newest]
Thread overview: 105+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-02-19 16:57 [00/93] 2.6.32.9-stable review Greg KH
2010-02-19 16:28 ` [01/93] Fix potential crash with sys_move_pages Greg KH
2010-02-19 16:28 ` [02/93] futex_lock_pi() key refcnt fix Greg KH
2010-02-19 16:28 ` [03/93] futex: Handle user space corruption gracefully Greg KH
2010-02-19 16:28 ` [04/93] futex: Handle futex value " Greg KH
2010-02-19 16:28 ` [05/93] Fix race in tty_fasync() properly Greg KH
2010-02-19 16:28 ` [06/93] hwmon: (w83781d) Request I/O ports individually for probing Greg KH
2010-02-19 16:29 ` [07/93] hwmon: (lm78) " Greg KH
2010-02-19 16:29 ` [08/93] hwmon: (adt7462) Wrong ADT7462_VOLT_COUNT Greg KH
2010-02-19 16:29 ` [09/93] ALSA: ctxfi - fix PTP address initialization Greg KH
2010-02-19 16:29 ` [10/93] drm/i915: disable hotplug detect before Ironlake CRT detect Greg KH
2010-02-19 16:29 ` [11/93] drm/i915: enable self-refresh on 965 Greg KH
2010-02-19 16:29 ` [12/93] drm/i915: Disable SR when more than one pipe is enabled Greg KH
2010-02-19 16:29 ` [13/93] drm/i915: Fix DDC on some systems by clearing BIOS GMBUS setup Greg KH
2010-02-19 16:29 ` [14/93] drm/i915: Add HP nx9020/SamsungSX20S to ACPI LID quirk list Greg KH
2010-02-19 16:29 ` [15/93] drm/i915: Fix the incorrect DMI string for Samsung SX20S laptop Greg KH
2010-02-19 16:29 ` [16/93] drm/i915: Add MALATA PC-81005 to ACPI LID quirk list Greg KH
2010-02-19 16:29 ` [17/93] usb: r8a66597-hcd: Flush the D-cache for the pipe-in transfer buffers Greg KH
2010-02-19 16:29 ` [18/93] i2c-tiny-usb: Fix on big-endian systems Greg KH
2010-02-19 16:29 ` [19/93] drm/i915: handle FBC and self-refresh better Greg KH
2010-02-19 16:29 ` [20/93] drm/i915: Increase fb alignment to 64k Greg KH
2010-02-19 16:29 ` [21/93] drm/i915: Update write_domains on active list after flush Greg KH
2010-02-19 16:29 ` [22/93] regulator: Fix display of null constraints for regulators Greg KH
2010-02-19 16:29 ` [23/93] ALSA: hda-intel: Avoid divide by zero crash Greg KH
2010-02-19 16:29 ` [24/93] CPUFREQ: Fix use after free of struct powernow_k8_data Greg KH
2010-02-19 16:29 ` [25/93] freeze_bdev: dont deactivate successfully frozen MS_RDONLY sb Greg KH
2010-02-19 16:29 ` [26/93] cciss: Make cciss_seq_show handle holes in the h->drv[] array Greg KH
2010-02-19 16:29 ` [27/93] ioat: fix infinite timeout checking in ioat2_quiesce Greg KH
2010-02-19 16:29 ` [28/93] resource: add helpers for fetching rlimits Greg KH
2010-02-19 16:29 ` [29/93] fs/exec.c: restrict initial stack space expansion to rlimit Greg KH
2010-02-21 6:42 ` Michael Neuling
2010-02-23 15:34 ` [stable] " Greg KH
2010-02-23 20:42 ` Michael Neuling
2010-02-19 16:29 ` [30/93] cifs: fix length calculation for converted unicode readdir names Greg KH
2010-02-19 16:29 ` [31/93] NFS: Fix a reference leak in nfs_wb_cancel_page() Greg KH
2010-02-19 16:29 ` [32/93] NFS: Try to commit unstable writes in nfs_release_page() Greg KH
2010-02-19 16:29 ` [33/93] NFSv4: Dont allow posix locking against servers that dont support it Greg KH
2010-02-19 16:29 ` [34/93] NFSv4: Ensure that the NFSv4 locking can recover from stateid errors Greg KH
2010-02-19 16:29 ` [35/93] NFS: Fix an Oops when truncating a file Greg KH
2010-02-19 16:29 ` [36/93] NFS: Fix a umount race Greg KH
2010-02-19 16:29 ` [37/93] NFS: Fix a bug in nfs_fscache_release_page() Greg KH
2010-02-19 16:29 ` [38/93] NFS: Fix the mapping of the NFSERR_SERVERFAULT error Greg KH
2010-02-19 16:29 ` [39/93] md: fix degraded calculation when starting a reshape Greg KH
2010-02-19 16:29 ` [40/93] V4L/DVB: dvb-core: fix initialization of feeds list in demux filter Greg KH
2010-02-19 16:29 ` [41/93] Export the symbol of getboottime and mmonotonic_to_bootbased Greg KH
2010-02-19 16:29 ` [42/93] kvmclock: count total_sleep_time when updating guest clock Greg KH
2010-02-19 16:29 ` [43/93] KVM: PIT: control word is write-only Greg KH
2010-02-19 16:29 ` [44/93] tpm_infineon: fix suspend/resume handler for pnp_driver Greg KH
2010-02-19 16:29 ` [45/93] amd64_edac: Do not falsely trigger kerneloops Greg KH
2010-02-19 16:29 ` [46/93] netfilter: nf_conntrack: fix memory corruption with multiple namespaces Greg KH
2010-02-19 16:29 ` [47/93] netfilter: nf_conntrack: per netns nf_conntrack_cachep Greg KH
2010-02-19 16:29 ` [48/93] netfilter: nf_conntrack: restrict runtime expect hashsize modifications Greg KH
2010-02-19 16:29 ` [49/93] netfilter: xtables: compat out of scope fix Greg KH
2010-02-19 16:29 ` [50/93] netfilter: nf_conntrack: fix hash resizing with namespaces Greg KH
2010-02-19 16:29 ` [51/93] drm/i915: remove full registers dump debug Greg KH
2010-02-19 16:29 ` [52/93] drm/i915: add i915_lp_ring_sync helper Greg KH
2010-02-19 16:29 ` [53/93] drm/i915: Dont wait interruptible for possible plane buffer flush Greg KH
2010-02-19 16:29 ` [54/93] [S390] dasd: remove strings from s390dbf Greg KH
2010-02-19 16:29 ` [55/93] crypto: padlock-sha - Add import/export support Greg KH
2010-02-19 16:29 ` [56/93] wmi: Free the allocated acpi objects through wmi_get_event_data Greg KH
2010-02-19 16:29 ` [57/93] dell-wmi, hp-wmi, msi-wmi: check wmi_get_event_data() return value Greg KH
2010-02-19 16:29 ` [58/93] /dev/mem: introduce size_inside_page() Greg KH
2010-02-19 16:29 ` [59/93] devmem: check vmalloc address on kmem read/write Greg KH
2010-02-19 16:29 ` [60/93] devmem: fix kmem write bug on memory holes Greg KH
2010-02-19 16:29 ` [61/93] SCSI: mptfusion : mptscsih_abort return value should be SUCCESS instead of value 0 Greg KH
2010-02-19 16:29 ` [62/93] sh: Couple kernel and user write page perm bits for CONFIG_X2TLB Greg KH
2010-02-19 16:29 ` [63/93] ALSA: hda - use WARN_ON_ONCE() for zero-division detection Greg KH
2010-02-19 16:29 ` [64/93] dst: call cond_resched() in dst_gc_task() Greg KH
2010-02-19 16:29 ` [65/93] ALSA: hda - Improved MacBook (Pro) 5,1 / 5,2 support Greg KH
2010-02-19 16:29 ` [66/93] befs: fix leak Greg KH
2010-02-19 16:30 ` [67/93] rtc-fm3130: add missing braces Greg KH
2010-02-19 16:30 ` [68/93] [libata] Call flush_dcache_page after PIO data transfers in libata-sff.c Greg KH
2010-02-19 16:30 ` [69/93] ahci: add Acer G725 to broken suspend list Greg KH
2010-02-19 16:30 ` [70/93] pktgen: Fix freezing problem Greg KH
2010-02-19 16:30 ` [71/93] x86/amd-iommu: Fix IOMMU-API initialization for iommu=pt Greg KH
2010-02-19 16:30 ` [72/93] x86/amd-iommu: Fix deassignment of a device from the pt_domain Greg KH
2010-02-19 16:30 ` [73/93] x86: Re-get cfg_new in case reuse/move irq_desc Greg KH
2010-02-19 16:30 ` [74/93] Staging: fix rtl8187se compilation errors with mac80211 Greg KH
2010-02-19 16:30 ` [75/93] ALSA: usb-audio - Avoid Oops after disconnect Greg KH
2010-02-19 16:30 ` [76/93] serial: 8250: add serial transmitter fully empty test Greg KH
2010-02-19 16:30 ` [77/93] sysfs: sysfs_sd_setattr set iattrs unconditionally Greg KH
2010-02-19 16:30 ` [78/93] class: Free the class private data in class_release Greg KH
2010-02-19 16:30 ` [79/93] USB: usbfs: only copy the actual data received Greg KH
2010-02-19 16:30 ` [80/93] USB: usbfs: properly clean up the as structure on error paths Greg KH
2010-02-19 16:30 ` [81/93] rtl8187: Add new device ID Greg KH
2010-02-19 16:30 ` [82/93] ACPI: Add NULL pointer check in acpi_bus_start Greg KH
2010-02-19 16:30 ` [83/93] ACPI: fix High cpu temperature with 2.6.32 Greg KH
2010-02-19 16:30 ` [84/93] drm/radeon/kms: use udelay for short delays Greg KH
2010-02-19 16:30 ` [85/93] NFS: Too many GETATTR and ACCESS calls after direct I/O Greg KH
2010-02-19 16:30 ` [86/93] eCryptfs: Add getattr function Greg KH
2010-02-19 16:30 ` [87/93] b43: Fix throughput regression Greg KH
2010-02-19 16:30 ` [88/93] ath9k: Fix sequence numbers for PAE frames Greg KH
2010-02-19 16:30 ` [89/93] mac80211: Fix probe request filtering in IBSS mode Greg KH
2010-02-19 16:30 ` [90/93] iwlwifi: Fix to set correct ht configuration Greg KH
2010-02-19 16:30 ` [91/93] dm stripe: avoid divide by zero with invalid stripe count Greg KH
2010-02-19 16:30 ` [92/93] dm log: userspace fix overhead_size calcuations Greg KH
2010-02-19 16:30 ` [93/93] dm mpath: fix stall when requeueing io Greg KH
2010-02-21 16:07 ` [Stable-review] " Stefan Bader
2010-02-22 10:16 ` Kiyoshi Ueda
2010-02-23 18:12 ` Alasdair G Kergon
2010-02-24 1:12 ` Kiyoshi Ueda
2010-02-24 22:30 ` Mikulas Patocka
[not found] ` <20100223175331.GE560@agk-dp.fab.redhat.com>
[not found] ` <Pine.LNX.4.64.1002231429210.29863@hs20-bc2-1.build.redhat.com>
[not found] ` <4B84E05C.4070300@ct.jp.nec.com>
[not found] ` <Pine.LNX.4.64.1002241732300.2537@hs20-bc2-1.build.redhat.com>
[not found] ` <4B862103.1040802@ct.jp.nec.com>
2010-02-26 20:57 ` How to unload a module? (Was: [dm-devel] rqdm: bad usage of dm_get/dm_put) Mikulas Patocka
2010-02-23 15:33 ` Greg KH [this message]
2010-02-23 17:00 ` [Stable-review] [93/93] dm mpath: fix stall when requeueing io Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100223153326.GA4275@kroah.com \
--to=greg@kroah.com \
--cc=agk@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=gregkh@suse.de \
--cc=j-nomura@ce.jp.nec.com \
--cc=k-ueda@ct.jp.nec.com \
--cc=linux-kernel@vger.kernel.org \
--cc=stable-review@kernel.org \
--cc=stable@kernel.org \
--cc=stefan.bader@canonical.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox