From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Jann Horn <jannh@google.com>,
Luis Chamberlain <mcgrof@kernel.org>,
Kees Cook <keescook@chromium.org>,
Eric Biggers <ebiggers@google.com>
Subject: [PATCH 4.19 71/80] exit: Put an upper limit on how often we can oops
Date: Fri, 3 Feb 2023 11:13:05 +0100 [thread overview]
Message-ID: <20230203101018.267182220@linuxfoundation.org> (raw)
In-Reply-To: <20230203101015.263854890@linuxfoundation.org>
From: Jann Horn <jannh@google.com>
commit d4ccd54d28d3c8598e2354acc13e28c060961dbb upstream.
Many Linux systems are configured to not panic on oops; but allowing an
attacker to oops the system **really** often can make even bugs that look
completely unexploitable exploitable (like NULL dereferences and such) if
each crash elevates a refcount by one or a lock is taken in read mode, and
this causes a counter to eventually overflow.
The most interesting counters for this are 32 bits wide (like open-coded
refcounts that don't use refcount_t). (The ldsem reader count on 32-bit
platforms is just 16 bits, but probably nobody cares about 32-bit platforms
that much nowadays.)
So let's panic the system if the kernel is constantly oopsing.
The speed of oopsing 2^32 times probably depends on several factors, like
how long the stack trace is and which unwinder you're using; an empirically
important one is whether your console is showing a graphical environment or
a text console that oopses will be printed to.
In a quick single-threaded benchmark, it looks like oopsing in a vfork()
child with a very short stack trace only takes ~510 microseconds per run
when a graphical console is active; but switching to a text console that
oopses are printed to slows it down around 87x, to ~45 milliseconds per
run.
(Adding more threads makes this faster, but the actual oops printing
happens under &die_lock on x86, so you can maybe speed this up by a factor
of around 2 and then any further improvement gets eaten up by lock
contention.)
It looks like it would take around 8-12 days to overflow a 32-bit counter
with repeated oopsing on a multi-core X86 system running a graphical
environment; both me (in an X86 VM) and Seth (with a distro kernel on
normal hardware in a standard configuration) got numbers in that ballpark.
12 days aren't *that* short on a desktop system, and you'd likely need much
longer on a typical server system (assuming that people don't run graphical
desktop environments on their servers), and this is a *very* noisy and
violent approach to exploiting the kernel; and it also seems to take orders
of magnitude longer on some machines, probably because stuff like EFI
pstore will slow it down a ton if that's active.
Signed-off-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20221107201317.324457-1-jannh@google.com
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221117234328.594699-2-keescook@chromium.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
Documentation/sysctl/kernel.txt | 9 ++++++++
kernel/exit.c | 43 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 52 insertions(+)
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -51,6 +51,7 @@ show up in /proc/sys/kernel:
- msgmnb
- msgmni
- nmi_watchdog
+- oops_limit
- osrelease
- ostype
- overflowgid
@@ -555,6 +556,14 @@ scanned for a given scan.
==============================================================
+oops_limit:
+
+Number of kernel oopses after which the kernel should panic when
+``panic_on_oops`` is not set. Setting this to 0 or 1 has the same effect
+as setting ``panic_on_oops=1``.
+
+==============================================================
+
osrelease, ostype & version:
# cat osrelease
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -68,6 +68,33 @@
#include <asm/pgtable.h>
#include <asm/mmu_context.h>
+/*
+ * The default value should be high enough to not crash a system that randomly
+ * crashes its kernel from time to time, but low enough to at least not permit
+ * overflowing 32-bit refcounts or the ldsem writer count.
+ */
+static unsigned int oops_limit = 10000;
+
+#ifdef CONFIG_SYSCTL
+static struct ctl_table kern_exit_table[] = {
+ {
+ .procname = "oops_limit",
+ .data = &oops_limit,
+ .maxlen = sizeof(oops_limit),
+ .mode = 0644,
+ .proc_handler = proc_douintvec,
+ },
+ { }
+};
+
+static __init int kernel_exit_sysctls_init(void)
+{
+ register_sysctl_init("kernel", kern_exit_table);
+ return 0;
+}
+late_initcall(kernel_exit_sysctls_init);
+#endif
+
static void __unhash_process(struct task_struct *p, bool group_dead)
{
nr_threads--;
@@ -924,10 +951,26 @@ EXPORT_SYMBOL_GPL(do_exit);
void __noreturn make_task_dead(int signr)
{
+ static atomic_t oops_count = ATOMIC_INIT(0);
+
/*
* Take the task off the cpu after something catastrophic has
* happened.
*/
+
+ /*
+ * Every time the system oopses, if the oops happens while a reference
+ * to an object was held, the reference leaks.
+ * If the oops doesn't also leak memory, repeated oopsing can cause
+ * reference counters to wrap around (if they're not using refcount_t).
+ * This means that repeated oopsing can make unexploitable-looking bugs
+ * exploitable through repeated oopsing.
+ * To make sure this can't happen, place an upper bound on how often the
+ * kernel may oops without panic().
+ */
+ if (atomic_inc_return(&oops_count) >= READ_ONCE(oops_limit))
+ panic("Oopsed too often (kernel.oops_limit is %d)", oops_limit);
+
do_exit(signr);
}
next prev parent reply other threads:[~2023-02-03 10:20 UTC|newest]
Thread overview: 90+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-03 10:11 [PATCH 4.19 00/80] 4.19.272-rc1 review Greg Kroah-Hartman
2023-02-03 10:11 ` [PATCH 4.19 01/80] memory: mvebu-devbus: Fix missing clk_disable_unprepare in mvebu_devbus_probe() Greg Kroah-Hartman
2023-02-03 10:11 ` [PATCH 4.19 02/80] ARM: dts: imx6qdl-gw560x: Remove incorrect uart-has-rtscts Greg Kroah-Hartman
2023-02-03 10:11 ` [PATCH 4.19 03/80] HID: intel_ish-hid: Add check for ishtp_dma_tx_map Greg Kroah-Hartman
2023-02-03 10:11 ` [PATCH 4.19 04/80] EDAC/highbank: Fix memory leak in highbank_mc_probe() Greg Kroah-Hartman
2023-02-03 10:11 ` [PATCH 4.19 05/80] tomoyo: fix broken dependency on *.conf.default Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 06/80] IB/hfi1: Reject a zero-length user expected buffer Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 07/80] IB/hfi1: Reserve user expected TIDs Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 08/80] IB/hfi1: Fix expected receive setup error exit issues Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 09/80] affs: initialize fsdata in affs_truncate() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 10/80] amd-xgbe: TX Flow Ctrl Registers are h/w ver dependent Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 11/80] amd-xgbe: Delay AN timeout during KR training Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 12/80] bpf: Fix pointer-leak due to insufficient speculative store bypass mitigation Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 13/80] phy: rockchip-inno-usb2: Fix missing clk_disable_unprepare() in rockchip_usb2phy_power_on() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 14/80] net: nfc: Fix use-after-free in local_cleanup() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 15/80] wifi: rndis_wlan: Prevent buffer overflow in rndis_query_oid Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 16/80] net: usb: sr9700: Handle negative len Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 17/80] net: mdio: validate parameter addr in mdiobus_get_phy() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 18/80] HID: check empty report_list in hid_validate_values() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 19/80] usb: gadget: f_fs: Prevent race during ffs_ep0_queue_wait Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 20/80] usb: gadget: f_fs: Ensure ep0req is dequeued before free_request Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 21/80] net: mlx5: eliminate anonymous module_init & module_exit Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 22/80] dmaengine: Fix double increment of client_count in dma_chan_get() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 23/80] net: macb: fix PTP TX timestamp failure due to packet padding Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 24/80] HID: betop: check shape of output reports Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 25/80] dmaengine: xilinx_dma: commonize DMA copy size calculation Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 26/80] dmaengine: xilinx_dma: program hardware supported buffer length Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 27/80] dmaengine: xilinx_dma: Fix devm_platform_ioremap_resource error handling Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 28/80] dmaengine: xilinx_dma: call of_node_put() when breaking out of for_each_child_of_node() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 29/80] tcp: avoid the lookup process failing to get sk in ehash table Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 30/80] w1: fix deadloop in __w1_remove_master_device() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 31/80] w1: fix WARNING after calling w1_process() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 32/80] netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 33/80] block: fix and cleanup bio_check_ro Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 34/80] perf env: Do not return pointers to local variables Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 35/80] fs: reiserfs: remove useless new_opts in reiserfs_remount Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 36/80] Bluetooth: hci_sync: cancel cmd_timer if hci_open failed Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 37/80] scsi: hpsa: Fix allocation size for scsi_host_alloc() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 38/80] module: Dont wait for GOING modules Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 39/80] tracing: Make sure trace_printk() can output as soon as it can be used Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 40/80] trace_events_hist: add check for return value of create_hist_field Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 41/80] smbd: Make upper layer decide when to destroy the transport Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 42/80] cifs: Fix oops due to uncleared server->smbd_conn in reconnect Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 43/80] ARM: 9280/1: mm: fix warning on phys_addr_t to void pointer assignment Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 44/80] EDAC/device: Respect any driver-supplied workqueue polling value Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 45/80] net: fix UaF in netns ops registration error path Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 46/80] netfilter: nft_set_rbtree: skip elements in transaction from garbage collection Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 47/80] netlink: remove hash::nelems check in netlink_insert Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 48/80] netlink: annotate data races around nlk->portid Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 49/80] netlink: annotate data races around dst_portid and dst_group Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 50/80] netlink: annotate data races around sk_state Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 51/80] ipv4: prevent potential spectre v1 gadget in ip_metrics_convert() Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 52/80] netfilter: conntrack: fix vtag checks for ABORT/SHUTDOWN_COMPLETE Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 53/80] netrom: Fix use-after-free of a listening socket Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 54/80] sctp: fail if no bound addresses can be used for a given scope Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 55/80] net: ravb: Fix possible hang if RIS2_QFF1 happen Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 56/80] net/tg3: resolve deadlock in tg3_reset_task() during EEH Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 57/80] Revert "Input: synaptics - switch touchpad on HP Laptop 15-da3001TU to RMI mode" Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 58/80] x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 59/80] drm/i915/display: fix compiler warning about array overrun Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 60/80] x86/asm: Fix an assembler warning with current binutils Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 61/80] x86/entry/64: Add instruction suffix to SYSRET Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 62/80] ARM: dts: imx: Fix pca9547 i2c-mux node name Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 63/80] dmaengine: imx-sdma: Fix a possible memory leak in sdma_transfer_init Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 64/80] sysctl: add a new register_sysctl_init() interface Greg Kroah-Hartman
2023-02-03 10:12 ` [PATCH 4.19 65/80] panic: unset panic_on_warn inside panic() Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 66/80] exit: Add and use make_task_dead Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 67/80] objtool: Add a missing comma to avoid string concatenation Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 68/80] hexagon: Fix function name in die() Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 69/80] h8300: Fix build errors from do_exit() to make_task_dead() transition Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 70/80] ia64: make IA64_MCA_RECOVERY bool instead of tristate Greg Kroah-Hartman
2023-02-03 10:13 ` Greg Kroah-Hartman [this message]
2023-02-03 10:13 ` [PATCH 4.19 72/80] exit: Expose "oops_count" to sysfs Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 73/80] exit: Allow oops_limit to be disabled Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 74/80] panic: Consolidate open-coded panic_on_warn checks Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 75/80] panic: Introduce warn_limit Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 76/80] panic: Expose "warn_count" to sysfs Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 77/80] docs: Fix path paste-o for /sys/kernel/warn_count Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 78/80] exit: Use READ_ONCE() for all oops/warn limit reads Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 79/80] ipv6: ensure sane device mtu in tunnels Greg Kroah-Hartman
2023-02-03 10:13 ` [PATCH 4.19 80/80] usb: host: xhci-plat: add wakeup entry at sysfs Greg Kroah-Hartman
2023-02-03 11:04 ` [PATCH 4.19 00/80] 4.19.272-rc1 review Naresh Kamboju
2023-02-03 12:28 ` Krzysztof Kozlowski
2023-02-03 15:51 ` Guenter Roeck
2023-02-03 16:56 ` Krzysztof Kozlowski
2023-02-03 17:04 ` Greg Kroah-Hartman
2023-02-03 16:38 ` Greg Kroah-Hartman
2023-02-03 16:50 ` Greg Kroah-Hartman
2023-02-03 17:19 ` Guenter Roeck
2023-02-04 1:04 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230203101018.267182220@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=ebiggers@google.com \
--cc=jannh@google.com \
--cc=keescook@chromium.org \
--cc=mcgrof@kernel.org \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.