From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Seiji Aguchi <seiji.aguchi@hds.com>,
Don Zickus <dzickus@redhat.com>, Tony Luck <tony.luck@intel.com>,
CAI Qian <caiqian@redhat.com>
Subject: [ 61/77] pstore: Avoid deadlock in panic and emergency-restart path
Date: Fri, 1 Mar 2013 11:44:46 -0800
Message-ID: <20130301194358.383669505@linuxfoundation.org>
In-Reply-To: <20130301194351.913471337@linuxfoundation.org>
3.8-stable review patch. If anyone has any objections, please let me know.
------------------
From: Seiji Aguchi <seiji.aguchi@hds.com>
commit 9f244e9cfd70c7c0f82d3c92ce772ab2a92d9f64 upstream.
[Issue]
When pstore runs in the panic or emergency-restart path, it may block
there because it unconditionally takes a spin lock.
This is an example scenario in which pstore may hang in the panic path:
- cpuA grabs psinfo->buf_lock
- cpuB panics and calls smp_send_stop
- smp_send_stop sends IRQ to cpuA
- after 1 second, cpuB gives up on cpuA and sends an NMI instead
- cpuA is now in an NMI handler while still holding buf_lock
- cpuB tries to take buf_lock to dump the log and is deadlocked
This case may happen if the firmware has a bug and
cpuA is stuck talking with it for more than one second.
A similar scenario exists in the emergency-restart path:
- cpuA grabs psinfo->buf_lock and gets stuck in firmware
- cpuB triggers an emergency restart via either sysrq-b or the hangcheck timer
cpuB is then deadlocked trying to take psinfo->buf_lock.
[Solution]
This patch avoids the deadlocks in both the panic and emergency-restart
paths by introducing a function, pstore_cannot_block_path(), that checks
whether a cpu is allowed to block in the current path.
With this patch, pstore no longer blocks in those paths even if another
cpu holds the spin lock, because it uses spin_trylock_irqsave there
instead of spin_lock_irqsave.
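The locking decision described above can be sketched in userspace C. This
is only an illustration, not the kernel code: the enum dump_reason values,
the atomic_flag based lock, and the helpers try_lock(), lock_blocking(),
unlock(), cannot_block_path() and dump() are all hypothetical stand-ins,
and the in_nmi() check from the real patch is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's enum kmsg_dump_reason values. */
enum dump_reason { DUMP_PANIC, DUMP_OOPS, DUMP_EMERG };

/* Hypothetical userspace spin lock standing in for psinfo->buf_lock. */
static atomic_flag buf_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was acquired, false if it is already held. */
static bool try_lock(atomic_flag *lock)
{
	return !atomic_flag_test_and_set_explicit(lock, memory_order_acquire);
}

static void lock_blocking(atomic_flag *lock)
{
	while (!try_lock(lock))
		; /* spin until the holder releases the lock */
}

static void unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Panic and emergency restart must never wait for the lock. */
static bool cannot_block_path(enum dump_reason reason)
{
	switch (reason) {
	case DUMP_PANIC:
	case DUMP_EMERG:
		return true;
	default:
		return false;
	}
}

/* Returns true if the dump ran with the lock held, false if it ran without. */
static bool dump(enum dump_reason reason)
{
	bool is_locked = true;

	if (cannot_block_path(reason)) {
		is_locked = try_lock(&buf_lock);
		if (!is_locked)
			fprintf(stderr, "dump blocked, record may be corrupted\n");
	} else {
		lock_blocking(&buf_lock);
	}

	/* ... the record would be written here ... */

	if (is_locked)
		unlock(&buf_lock);
	return is_locked;
}
```

The trade-off is the same one the patch makes: when the lock is already
held, the non-blocking path carries on without it and accepts a possibly
corrupted record rather than hanging forever.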
In addition, according to the comment on emergency_restart() in kernel/sys.c,
locks shouldn't be taken in the emergency-restart path, to avoid
deadlock. This patch conforms to that comment, quoted below.
<snip>
/**
 * emergency_restart - reboot the system
 *
 * Without shutting down any hardware or taking any locks
 * reboot the system. This is called when we know we are in
 * trouble so this is our best effort to reboot. This is
 * safe to call in interrupt context.
 */
void emergency_restart(void)
<snip>
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: CAI Qian <caiqian@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/pstore/platform.c | 35 +++++++++++++++++++++++++++++------
include/linux/pstore.h | 6 ++++++
2 files changed, 35 insertions(+), 6 deletions(-)
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -96,6 +96,27 @@ static const char *get_reason_str(enum k
 	}
 }
 
+bool pstore_cannot_block_path(enum kmsg_dump_reason reason)
+{
+	/*
+	 * In case of NMI path, pstore shouldn't be blocked
+	 * regardless of reason.
+	 */
+	if (in_nmi())
+		return true;
+
+	switch (reason) {
+	/* In panic case, other cpus are stopped by smp_send_stop(). */
+	case KMSG_DUMP_PANIC:
+	/* Emergency restart shouldn't be blocked by spin lock. */
+	case KMSG_DUMP_EMERG:
+		return true;
+	default:
+		return false;
+	}
+}
+EXPORT_SYMBOL_GPL(pstore_cannot_block_path);
+
 /*
  * callback from kmsg_dump. (s2,l2) has the most recently
  * written bytes, older bytes are in (s1,l1). Save as much
@@ -114,10 +135,12 @@ static void pstore_dump(struct kmsg_dump
 
 	why = get_reason_str(reason);
 
-	if (in_nmi()) {
-		is_locked = spin_trylock(&psinfo->buf_lock);
-		if (!is_locked)
-			pr_err("pstore dump routine blocked in NMI, may corrupt error record\n");
+	if (pstore_cannot_block_path(reason)) {
+		is_locked = spin_trylock_irqsave(&psinfo->buf_lock, flags);
+		if (!is_locked) {
+			pr_err("pstore dump routine blocked in %s path, may corrupt error record\n"
+				       , in_nmi() ? "NMI" : why);
+		}
 	} else
 		spin_lock_irqsave(&psinfo->buf_lock, flags);
 	oopscount++;
@@ -143,9 +166,9 @@ static void pstore_dump(struct kmsg_dump
 		total += hsize + len;
 		part++;
 	}
-	if (in_nmi()) {
+	if (pstore_cannot_block_path(reason)) {
 		if (is_locked)
-			spin_unlock(&psinfo->buf_lock);
+			spin_unlock_irqrestore(&psinfo->buf_lock, flags);
 	} else
 		spin_unlock_irqrestore(&psinfo->buf_lock, flags);
 }
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -68,12 +68,18 @@ struct pstore_info {
 
 #ifdef CONFIG_PSTORE
 extern int pstore_register(struct pstore_info *);
+extern bool pstore_cannot_block_path(enum kmsg_dump_reason reason);
 #else
 static inline int
 pstore_register(struct pstore_info *psi)
 {
 	return -ENODEV;
 }
+static inline bool
+pstore_cannot_block_path(enum kmsg_dump_reason reason)
+{
+	return false;
+}
 #endif
 
 #endif /*_LINUX_PSTORE_H*/