* [PATCH AUTOSEL 5.10 18/31] um: random: Register random as hwrng-core device
[not found] <20201230130314.3636961-1-sashal@kernel.org>
@ 2020-12-30 13:03 ` Sasha Levin
2020-12-30 13:03 ` [PATCH AUTOSEL 5.10 19/31] um: ubd: Submit all data segments atomically Sasha Levin
2020-12-30 13:03 ` [PATCH AUTOSEL 5.10 20/31] um: allocate a guard page to helper threads Sasha Levin
2 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2020-12-30 13:03 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, Richard Weinberger, Alexander Neville, linux-um,
Sjoerd Simons, linux-crypto, Christopher Obbard, Anton Ivanov
From: Christopher Obbard <chris.obbard@collabora.com>
[ Upstream commit 72d3e093afae79611fa38f8f2cfab9a888fe66f2 ]
The UML random driver creates a dummy device under the guest,
/dev/hw_random. When this file is read from the guest, the driver
reads from the host machine's /dev/random, in-turn reading from
the host kernel's entropy pool. This entropy pool could have been
filled by a hardware random number generator or just the host
kernel's internal software entropy generator.
Currently the driver does not fill the guests kernel entropy pool,
this requires a userspace tool running inside the guest (like
rng-tools) to read from the dummy device provided by this driver,
which then would fill the guest's internal entropy pool.
This all seems quite pointless when we are already reading from an
entropy pool, so this patch aims to register the device as a hwrng
device using the hwrng-core framework. This not only improves and
cleans up the driver, but also fills the guest's entropy pool
without having to resort to using extra userspace tools in the guest.
This is typically a nuisance when booting a guest: the random pool
takes a long time (~200s) to build up enough entropy since the dummy
hwrng is not used to fill the guest's pool.
This port was originally attempted by Alexander Neville "dark" (in CC,
discussion in Link), but the conversation there stalled since the
handling of -EAGAIN errors were no removed and longer handled by the
driver. This patch attempts to use the existing method of error
handling but utilises the new hwrng core.
The issue can be noticed when booting a UML guest:
[ 2.560000] random: fast init done
[ 214.000000] random: crng init done
With the patch applied, filling the pool becomes a lot quicker:
[ 2.560000] random: fast init done
[ 12.000000] random: crng init done
Cc: Alexander Neville <dark@volatile.bz>
Link: https://lore.kernel.org/lkml/20190828204609.02a7ff70@TheDarkness/
Link: https://lore.kernel.org/lkml/20190829135001.6a5ff940@TheDarkness.local/
Cc: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Signed-off-by: Christopher Obbard <chris.obbard@collabora.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/um/drivers/random.c | 101 ++++++++-------------------------
drivers/char/hw_random/Kconfig | 16 +++---
2 files changed, 33 insertions(+), 84 deletions(-)
diff --git a/arch/um/drivers/random.c b/arch/um/drivers/random.c
index ce115fce52f02..e4b9b2ce9abf4 100644
--- a/arch/um/drivers/random.c
+++ b/arch/um/drivers/random.c
@@ -11,6 +11,7 @@
#include <linux/fs.h>
#include <linux/interrupt.h>
#include <linux/miscdevice.h>
+#include <linux/hw_random.h>
#include <linux/delay.h>
#include <linux/uaccess.h>
#include <init.h>
@@ -18,9 +19,8 @@
#include <os.h>
/*
- * core module and version information
+ * core module information
*/
-#define RNG_VERSION "1.0.0"
#define RNG_MODULE_NAME "hw_random"
/* Changed at init time, in the non-modular case, and at module load
@@ -28,88 +28,36 @@
* protects against a module being loaded twice at the same time.
*/
static int random_fd = -1;
-static DECLARE_WAIT_QUEUE_HEAD(host_read_wait);
+static struct hwrng hwrng = { 0, };
+static DECLARE_COMPLETION(have_data);
-static int rng_dev_open (struct inode *inode, struct file *filp)
+static int rng_dev_read(struct hwrng *rng, void *buf, size_t max, bool block)
{
- /* enforce read-only access to this chrdev */
- if ((filp->f_mode & FMODE_READ) == 0)
- return -EINVAL;
- if ((filp->f_mode & FMODE_WRITE) != 0)
- return -EINVAL;
+ int ret;
- return 0;
-}
-
-static atomic_t host_sleep_count = ATOMIC_INIT(0);
-
-static ssize_t rng_dev_read (struct file *filp, char __user *buf, size_t size,
- loff_t *offp)
-{
- u32 data;
- int n, ret = 0, have_data;
-
- while (size) {
- n = os_read_file(random_fd, &data, sizeof(data));
- if (n > 0) {
- have_data = n;
- while (have_data && size) {
- if (put_user((u8) data, buf++)) {
- ret = ret ? : -EFAULT;
- break;
- }
- size--;
- ret++;
- have_data--;
- data >>= 8;
- }
- }
- else if (n == -EAGAIN) {
- DECLARE_WAITQUEUE(wait, current);
-
- if (filp->f_flags & O_NONBLOCK)
- return ret ? : -EAGAIN;
-
- atomic_inc(&host_sleep_count);
+ for (;;) {
+ ret = os_read_file(random_fd, buf, max);
+ if (block && ret == -EAGAIN) {
add_sigio_fd(random_fd);
- add_wait_queue(&host_read_wait, &wait);
- set_current_state(TASK_INTERRUPTIBLE);
+ ret = wait_for_completion_killable(&have_data);
- schedule();
- remove_wait_queue(&host_read_wait, &wait);
+ ignore_sigio_fd(random_fd);
+ deactivate_fd(random_fd, RANDOM_IRQ);
- if (atomic_dec_and_test(&host_sleep_count)) {
- ignore_sigio_fd(random_fd);
- deactivate_fd(random_fd, RANDOM_IRQ);
- }
+ if (ret < 0)
+ break;
+ } else {
+ break;
}
- else
- return n;
-
- if (signal_pending (current))
- return ret ? : -ERESTARTSYS;
}
- return ret;
-}
-static const struct file_operations rng_chrdev_ops = {
- .owner = THIS_MODULE,
- .open = rng_dev_open,
- .read = rng_dev_read,
- .llseek = noop_llseek,
-};
-
-/* rng_init shouldn't be called more than once at boot time */
-static struct miscdevice rng_miscdev = {
- HWRNG_MINOR,
- RNG_MODULE_NAME,
- &rng_chrdev_ops,
-};
+ return ret != -EAGAIN ? ret : 0;
+}
static irqreturn_t random_interrupt(int irq, void *data)
{
- wake_up(&host_read_wait);
+ complete(&have_data);
return IRQ_HANDLED;
}
@@ -126,18 +74,19 @@ static int __init rng_init (void)
goto out;
random_fd = err;
-
err = um_request_irq(RANDOM_IRQ, random_fd, IRQ_READ, random_interrupt,
0, "random", NULL);
if (err)
goto err_out_cleanup_hw;
sigio_broken(random_fd, 1);
+ hwrng.name = RNG_MODULE_NAME;
+ hwrng.read = rng_dev_read;
+ hwrng.quality = 1024;
- err = misc_register (&rng_miscdev);
+ err = hwrng_register(&hwrng);
if (err) {
- printk (KERN_ERR RNG_MODULE_NAME ": misc device register "
- "failed\n");
+ pr_err(RNG_MODULE_NAME " registering failed (%d)\n", err);
goto err_out_cleanup_hw;
}
out:
@@ -161,8 +110,8 @@ static void cleanup(void)
static void __exit rng_cleanup(void)
{
+ hwrng_unregister(&hwrng);
os_close_file(random_fd);
- misc_deregister (&rng_miscdev);
}
module_init (rng_init);
diff --git a/drivers/char/hw_random/Kconfig b/drivers/char/hw_random/Kconfig
index e92c4d9469d82..5952210526aaa 100644
--- a/drivers/char/hw_random/Kconfig
+++ b/drivers/char/hw_random/Kconfig
@@ -540,15 +540,15 @@ endif # HW_RANDOM
config UML_RANDOM
depends on UML
- tristate "Hardware random number generator"
+ select HW_RANDOM
+ tristate "UML Random Number Generator support"
help
This option enables UML's "hardware" random number generator. It
attaches itself to the host's /dev/random, supplying as much entropy
as the host has, rather than the small amount the UML gets from its
- own drivers. It registers itself as a standard hardware random number
- generator, major 10, minor 183, and the canonical device name is
- /dev/hwrng.
- The way to make use of this is to install the rng-tools package
- (check your distro, or download from
- http://sourceforge.net/projects/gkernel/). rngd periodically reads
- /dev/hwrng and injects the entropy into /dev/random.
+ own drivers. It registers itself as a rng-core driver thus providing
+ a device which is usually called /dev/hwrng. This hardware random
+ number generator does feed into the kernel's random number generator
+ entropy pool.
+
+ If unsure, say Y.
--
2.27.0
_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH AUTOSEL 5.10 19/31] um: ubd: Submit all data segments atomically
[not found] <20201230130314.3636961-1-sashal@kernel.org>
2020-12-30 13:03 ` [PATCH AUTOSEL 5.10 18/31] um: random: Register random as hwrng-core device Sasha Levin
@ 2020-12-30 13:03 ` Sasha Levin
2020-12-30 13:03 ` [PATCH AUTOSEL 5.10 20/31] um: allocate a guard page to helper threads Sasha Levin
2 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2020-12-30 13:03 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, Martyn Welch, Richard Weinberger, linux-um,
Gabriel Krisman Bertazi, Christopher Obbard, Anton Ivanov
From: Gabriel Krisman Bertazi <krisman@collabora.com>
[ Upstream commit fc6b6a872dcd48c6f39c7975836d75113db67d37 ]
Internally, UBD treats each physical IO segment as a separate command to
be submitted in the execution pipe. If the pipe returns a transient
error after a few segments have already been written, UBD will tell the
block layer to requeue the request, but there is no way to reclaim the
segments already submitted. When a new attempt to dispatch the request
is done, those segments already submitted will get duplicated, causing
the WARN_ON below in the best case, and potentially data corruption.
In my system, running a UML instance with 2GB of RAM and a 50M UBD disk,
I can reproduce the WARN_ON by simply running mkfs.fvat against the
disk on a freshly booted system.
There are a few ways to around this, like reducing the pressure on
the pipe by reducing the queue depth, which almost eliminates the
occurrence of the problem, increasing the pipe buffer size on the host
system, or by limiting the request to one physical segment, which causes
the block layer to submit way more requests to resolve a single
operation.
Instead, this patch modifies the format of a UBD command, such that all
segments are sent through a single element in the communication pipe,
turning the command submission atomic from the point of view of the
block layer. The new format has a variable size, depending on the
number of elements, and looks like this:
+------------+-----------+-----------+------------
| cmd_header | segment 0 | segment 1 | segment ...
+------------+-----------+-----------+------------
With this format, we push a pointer to cmd_header in the submission
pipe.
This has the advantage of reducing the memory footprint of executing a
single request, since it allow us to merge some fields in the header.
It is possible to reduce even further each segment memory footprint, by
merging bitmap_words and cow_offset, for instance, but this is not the
focus of this patch and is left as future work. One issue with the
patch is that for a big number of segments, we now perform one big
memory allocation instead of multiple small ones, but I wasn't able to
trigger any real issues or -ENOMEM because of this change, that wouldn't
be reproduced otherwise.
This was tested using fio with the verify-crc32 option, and by running
an ext4 filesystem over this UBD device.
The original WARN_ON was:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0x13f/0x141
refcount_t: underflow; use-after-free.
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc6-00002-g2a5bb2cf75c8 #346
Stack:
6084eed0 6063dc77 00000009 6084ef60
00000000 604b8d9f 6084eee0 6063dcbc
6084ef40 6006ab8d e013d780 1c00000000
Call Trace:
[<600a0c1c>] ? printk+0x0/0x94
[<6004a888>] show_stack+0x13b/0x155
[<6063dc77>] ? dump_stack_print_info+0xdf/0xe8
[<604b8d9f>] ? refcount_warn_saturate+0x13f/0x141
[<6063dcbc>] dump_stack+0x2a/0x2c
[<6006ab8d>] __warn+0x107/0x134
[<6008da6c>] ? wake_up_process+0x17/0x19
[<60487628>] ? blk_queue_max_discard_sectors+0x0/0xd
[<6006b05f>] warn_slowpath_fmt+0xd1/0xdf
[<6006af8e>] ? warn_slowpath_fmt+0x0/0xdf
[<600acc14>] ? raw_read_seqcount_begin.constprop.0+0x0/0x15
[<600619ae>] ? os_nsecs+0x1d/0x2b
[<604b8d9f>] refcount_warn_saturate+0x13f/0x141
[<6048bc8f>] refcount_sub_and_test.constprop.0+0x2f/0x37
[<6048c8de>] blk_mq_free_request+0xf1/0x10d
[<6048ca06>] __blk_mq_end_request+0x10c/0x114
[<6005ac0f>] ubd_intr+0xb5/0x169
[<600a1a37>] __handle_irq_event_percpu+0x6b/0x17e
[<600a1b70>] handle_irq_event_percpu+0x26/0x69
[<600a1bd9>] handle_irq_event+0x26/0x34
[<600a1bb3>] ? handle_irq_event+0x0/0x34
[<600a5186>] ? unmask_irq+0x0/0x37
[<600a57e6>] handle_edge_irq+0xbc/0xd6
[<600a131a>] generic_handle_irq+0x21/0x29
[<60048f6e>] do_IRQ+0x39/0x54
[...]
---[ end trace c6e7444e55386c0f ]---
Cc: Christopher Obbard <chris.obbard@collabora.com>
Reported-by: Martyn Welch <martyn@collabora.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Tested-by: Christopher Obbard <chris.obbard@collabora.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/um/drivers/ubd_kern.c | 191 ++++++++++++++++++++++---------------
1 file changed, 115 insertions(+), 76 deletions(-)
diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index eae8c83364f71..b12c1b0d3e1d0 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -47,18 +47,25 @@
/* Max request size is determined by sector mask - 32K */
#define UBD_MAX_REQUEST (8 * sizeof(long))
+struct io_desc {
+ char *buffer;
+ unsigned long length;
+ unsigned long sector_mask;
+ unsigned long long cow_offset;
+ unsigned long bitmap_words[2];
+};
+
struct io_thread_req {
struct request *req;
int fds[2];
unsigned long offsets[2];
unsigned long long offset;
- unsigned long length;
- char *buffer;
int sectorsize;
- unsigned long sector_mask;
- unsigned long long cow_offset;
- unsigned long bitmap_words[2];
int error;
+
+ int desc_cnt;
+ /* io_desc has to be the last element of the struct */
+ struct io_desc io_desc[];
};
@@ -525,12 +532,7 @@ static void ubd_handler(void)
blk_queue_max_write_zeroes_sectors(io_req->req->q, 0);
blk_queue_flag_clear(QUEUE_FLAG_DISCARD, io_req->req->q);
}
- if ((io_req->error) || (io_req->buffer == NULL))
- blk_mq_end_request(io_req->req, io_req->error);
- else {
- if (!blk_update_request(io_req->req, io_req->error, io_req->length))
- __blk_mq_end_request(io_req->req, io_req->error);
- }
+ blk_mq_end_request(io_req->req, io_req->error);
kfree(io_req);
}
}
@@ -946,6 +948,7 @@ static int ubd_add(int n, char **error_out)
blk_queue_write_cache(ubd_dev->queue, true, false);
blk_queue_max_segments(ubd_dev->queue, MAX_SG);
+ blk_queue_segment_boundary(ubd_dev->queue, PAGE_SIZE - 1);
err = ubd_disk_register(UBD_MAJOR, ubd_dev->size, n, &ubd_gendisk[n]);
if(err){
*error_out = "Failed to register device";
@@ -1289,37 +1292,74 @@ static void cowify_bitmap(__u64 io_offset, int length, unsigned long *cow_mask,
*cow_offset += bitmap_offset;
}
-static void cowify_req(struct io_thread_req *req, unsigned long *bitmap,
+static void cowify_req(struct io_thread_req *req, struct io_desc *segment,
+ unsigned long offset, unsigned long *bitmap,
__u64 bitmap_offset, __u64 bitmap_len)
{
- __u64 sector = req->offset >> SECTOR_SHIFT;
+ __u64 sector = offset >> SECTOR_SHIFT;
int i;
- if (req->length > (sizeof(req->sector_mask) * 8) << SECTOR_SHIFT)
+ if (segment->length > (sizeof(segment->sector_mask) * 8) << SECTOR_SHIFT)
panic("Operation too long");
if (req_op(req->req) == REQ_OP_READ) {
- for (i = 0; i < req->length >> SECTOR_SHIFT; i++) {
+ for (i = 0; i < segment->length >> SECTOR_SHIFT; i++) {
if(ubd_test_bit(sector + i, (unsigned char *) bitmap))
ubd_set_bit(i, (unsigned char *)
- &req->sector_mask);
+ &segment->sector_mask);
+ }
+ } else {
+ cowify_bitmap(offset, segment->length, &segment->sector_mask,
+ &segment->cow_offset, bitmap, bitmap_offset,
+ segment->bitmap_words, bitmap_len);
+ }
+}
+
+static void ubd_map_req(struct ubd *dev, struct io_thread_req *io_req,
+ struct request *req)
+{
+ struct bio_vec bvec;
+ struct req_iterator iter;
+ int i = 0;
+ unsigned long byte_offset = io_req->offset;
+ int op = req_op(req);
+
+ if (op == REQ_OP_WRITE_ZEROES || op == REQ_OP_DISCARD) {
+ io_req->io_desc[0].buffer = NULL;
+ io_req->io_desc[0].length = blk_rq_bytes(req);
+ } else {
+ rq_for_each_segment(bvec, req, iter) {
+ BUG_ON(i >= io_req->desc_cnt);
+
+ io_req->io_desc[i].buffer =
+ page_address(bvec.bv_page) + bvec.bv_offset;
+ io_req->io_desc[i].length = bvec.bv_len;
+ i++;
+ }
+ }
+
+ if (dev->cow.file) {
+ for (i = 0; i < io_req->desc_cnt; i++) {
+ cowify_req(io_req, &io_req->io_desc[i], byte_offset,
+ dev->cow.bitmap, dev->cow.bitmap_offset,
+ dev->cow.bitmap_len);
+ byte_offset += io_req->io_desc[i].length;
}
+
}
- else cowify_bitmap(req->offset, req->length, &req->sector_mask,
- &req->cow_offset, bitmap, bitmap_offset,
- req->bitmap_words, bitmap_len);
}
-static int ubd_queue_one_vec(struct blk_mq_hw_ctx *hctx, struct request *req,
- u64 off, struct bio_vec *bvec)
+static struct io_thread_req *ubd_alloc_req(struct ubd *dev, struct request *req,
+ int desc_cnt)
{
- struct ubd *dev = hctx->queue->queuedata;
struct io_thread_req *io_req;
- int ret;
+ int i;
- io_req = kmalloc(sizeof(struct io_thread_req), GFP_ATOMIC);
+ io_req = kmalloc(sizeof(*io_req) +
+ (desc_cnt * sizeof(struct io_desc)),
+ GFP_ATOMIC);
if (!io_req)
- return -ENOMEM;
+ return NULL;
io_req->req = req;
if (dev->cow.file)
@@ -1327,26 +1367,41 @@ static int ubd_queue_one_vec(struct blk_mq_hw_ctx *hctx, struct request *req,
else
io_req->fds[0] = dev->fd;
io_req->error = 0;
-
- if (bvec != NULL) {
- io_req->buffer = page_address(bvec->bv_page) + bvec->bv_offset;
- io_req->length = bvec->bv_len;
- } else {
- io_req->buffer = NULL;
- io_req->length = blk_rq_bytes(req);
- }
-
io_req->sectorsize = SECTOR_SIZE;
io_req->fds[1] = dev->fd;
- io_req->cow_offset = -1;
- io_req->offset = off;
- io_req->sector_mask = 0;
+ io_req->offset = (u64) blk_rq_pos(req) << SECTOR_SHIFT;
io_req->offsets[0] = 0;
io_req->offsets[1] = dev->cow.data_offset;
- if (dev->cow.file)
- cowify_req(io_req, dev->cow.bitmap,
- dev->cow.bitmap_offset, dev->cow.bitmap_len);
+ for (i = 0 ; i < desc_cnt; i++) {
+ io_req->io_desc[i].sector_mask = 0;
+ io_req->io_desc[i].cow_offset = -1;
+ }
+
+ return io_req;
+}
+
+static int ubd_submit_request(struct ubd *dev, struct request *req)
+{
+ int segs = 0;
+ struct io_thread_req *io_req;
+ int ret;
+ int op = req_op(req);
+
+ if (op == REQ_OP_FLUSH)
+ segs = 0;
+ else if (op == REQ_OP_WRITE_ZEROES || op == REQ_OP_DISCARD)
+ segs = 1;
+ else
+ segs = blk_rq_nr_phys_segments(req);
+
+ io_req = ubd_alloc_req(dev, req, segs);
+ if (!io_req)
+ return -ENOMEM;
+
+ io_req->desc_cnt = segs;
+ if (segs)
+ ubd_map_req(dev, io_req, req);
ret = os_write_file(thread_fd, &io_req, sizeof(io_req));
if (ret != sizeof(io_req)) {
@@ -1357,22 +1412,6 @@ static int ubd_queue_one_vec(struct blk_mq_hw_ctx *hctx, struct request *req,
return ret;
}
-static int queue_rw_req(struct blk_mq_hw_ctx *hctx, struct request *req)
-{
- struct req_iterator iter;
- struct bio_vec bvec;
- int ret;
- u64 off = (u64)blk_rq_pos(req) << SECTOR_SHIFT;
-
- rq_for_each_segment(bvec, req, iter) {
- ret = ubd_queue_one_vec(hctx, req, off, &bvec);
- if (ret < 0)
- return ret;
- off += bvec.bv_len;
- }
- return 0;
-}
-
static blk_status_t ubd_queue_rq(struct blk_mq_hw_ctx *hctx,
const struct blk_mq_queue_data *bd)
{
@@ -1385,17 +1424,12 @@ static blk_status_t ubd_queue_rq(struct blk_mq_hw_ctx *hctx,
spin_lock_irq(&ubd_dev->lock);
switch (req_op(req)) {
- /* operations with no lentgth/offset arguments */
case REQ_OP_FLUSH:
- ret = ubd_queue_one_vec(hctx, req, 0, NULL);
- break;
case REQ_OP_READ:
case REQ_OP_WRITE:
- ret = queue_rw_req(hctx, req);
- break;
case REQ_OP_DISCARD:
case REQ_OP_WRITE_ZEROES:
- ret = ubd_queue_one_vec(hctx, req, (u64)blk_rq_pos(req) << 9, NULL);
+ ret = ubd_submit_request(ubd_dev, req);
break;
default:
WARN_ON_ONCE(1);
@@ -1483,22 +1517,22 @@ static int map_error(int error_code)
* will result in unpredictable behaviour and/or crashes.
*/
-static int update_bitmap(struct io_thread_req *req)
+static int update_bitmap(struct io_thread_req *req, struct io_desc *segment)
{
int n;
- if(req->cow_offset == -1)
+ if (segment->cow_offset == -1)
return map_error(0);
- n = os_pwrite_file(req->fds[1], &req->bitmap_words,
- sizeof(req->bitmap_words), req->cow_offset);
- if (n != sizeof(req->bitmap_words))
+ n = os_pwrite_file(req->fds[1], &segment->bitmap_words,
+ sizeof(segment->bitmap_words), segment->cow_offset);
+ if (n != sizeof(segment->bitmap_words))
return map_error(-n);
return map_error(0);
}
-static void do_io(struct io_thread_req *req)
+static void do_io(struct io_thread_req *req, struct io_desc *desc)
{
char *buf = NULL;
unsigned long len;
@@ -1513,21 +1547,20 @@ static void do_io(struct io_thread_req *req)
return;
}
- nsectors = req->length / req->sectorsize;
+ nsectors = desc->length / req->sectorsize;
start = 0;
do {
- bit = ubd_test_bit(start, (unsigned char *) &req->sector_mask);
+ bit = ubd_test_bit(start, (unsigned char *) &desc->sector_mask);
end = start;
while((end < nsectors) &&
- (ubd_test_bit(end, (unsigned char *)
- &req->sector_mask) == bit))
+ (ubd_test_bit(end, (unsigned char *) &desc->sector_mask) == bit))
end++;
off = req->offset + req->offsets[bit] +
start * req->sectorsize;
len = (end - start) * req->sectorsize;
- if (req->buffer != NULL)
- buf = &req->buffer[start * req->sectorsize];
+ if (desc->buffer != NULL)
+ buf = &desc->buffer[start * req->sectorsize];
switch (req_op(req->req)) {
case REQ_OP_READ:
@@ -1567,7 +1600,8 @@ static void do_io(struct io_thread_req *req)
start = end;
} while(start < nsectors);
- req->error = update_bitmap(req);
+ req->offset += len;
+ req->error = update_bitmap(req, desc);
}
/* Changed in start_io_thread, which is serialized by being called only
@@ -1600,8 +1634,13 @@ int io_thread(void *arg)
}
for (count = 0; count < n/sizeof(struct io_thread_req *); count++) {
+ struct io_thread_req *req = (*io_req_buffer)[count];
+ int i;
+
io_count++;
- do_io((*io_req_buffer)[count]);
+ for (i = 0; !req->error && i < req->desc_cnt; i++)
+ do_io(req, &(req->io_desc[i]));
+
}
written = 0;
--
2.27.0
_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH AUTOSEL 5.10 20/31] um: allocate a guard page to helper threads
[not found] <20201230130314.3636961-1-sashal@kernel.org>
2020-12-30 13:03 ` [PATCH AUTOSEL 5.10 18/31] um: random: Register random as hwrng-core device Sasha Levin
2020-12-30 13:03 ` [PATCH AUTOSEL 5.10 19/31] um: ubd: Submit all data segments atomically Sasha Levin
@ 2020-12-30 13:03 ` Sasha Levin
2020-12-30 14:48 ` Johannes Berg
2 siblings, 1 reply; 5+ messages in thread
From: Sasha Levin @ 2020-12-30 13:03 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, Richard Weinberger, linux-um, Johannes Berg
From: Johannes Berg <johannes.berg@intel.com>
[ Upstream commit ef4459a6da0955b533ebfc97a7d756ac090f50c9 ]
We've been running into stack overflows in helper threads
corrupting memory (e.g. because somebody put printf() or
os_info() there), so to avoid those causing hard-to-debug
issues later on, allocate a guard page for helper thread
stacks and mark it read-only.
Unfortunately, the crash dump at that point is useless as
the stack tracer will try to backtrace the *kernel* thread,
not the helper thread, but at least we don't survive to a
random issue caused by corruption.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/um/drivers/ubd_kern.c | 2 +-
arch/um/include/shared/kern_util.h | 2 +-
arch/um/kernel/process.c | 11 +++++++----
arch/um/os-Linux/helper.c | 4 ++--
4 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index b12c1b0d3e1d0..e337702c30c78 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -1195,7 +1195,7 @@ static int __init ubd_driver_init(void){
/* Letting ubd=sync be like using ubd#s= instead of ubd#= is
* enough. So use anyway the io thread. */
}
- stack = alloc_stack(0, 0);
+ stack = alloc_stack(0);
io_pid = start_io_thread(stack + PAGE_SIZE - sizeof(void *),
&thread_fd);
if(io_pid < 0){
diff --git a/arch/um/include/shared/kern_util.h b/arch/um/include/shared/kern_util.h
index ccafb62e8ccec..c3ccbc2bbd2ce 100644
--- a/arch/um/include/shared/kern_util.h
+++ b/arch/um/include/shared/kern_util.h
@@ -19,7 +19,7 @@ extern int kmalloc_ok;
#define UML_ROUND_UP(addr) \
((((unsigned long) addr) + PAGE_SIZE - 1) & PAGE_MASK)
-extern unsigned long alloc_stack(int order, int atomic);
+extern unsigned long alloc_stack(int atomic);
extern void free_stack(unsigned long stack, int order);
struct pt_regs;
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
index 9505a7e87396a..bb352dc446870 100644
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -32,6 +32,7 @@
#include <os.h>
#include <skas.h>
#include <linux/time-internal.h>
+#include <asm/set_memory.h>
/*
* This is a per-cpu array. A processor only modifies its entry and it only
@@ -62,16 +63,18 @@ void free_stack(unsigned long stack, int order)
free_pages(stack, order);
}
-unsigned long alloc_stack(int order, int atomic)
+unsigned long alloc_stack(int atomic)
{
- unsigned long page;
+ unsigned long addr;
gfp_t flags = GFP_KERNEL;
if (atomic)
flags = GFP_ATOMIC;
- page = __get_free_pages(flags, order);
+ addr = __get_free_pages(flags, 1);
- return page;
+ set_memory_ro(addr, 1);
+
+ return addr + PAGE_SIZE;
}
static inline void set_current(struct task_struct *task)
diff --git a/arch/um/os-Linux/helper.c b/arch/um/os-Linux/helper.c
index 9fa6e4187d4fb..feb48d796e005 100644
--- a/arch/um/os-Linux/helper.c
+++ b/arch/um/os-Linux/helper.c
@@ -45,7 +45,7 @@ int run_helper(void (*pre_exec)(void *), void *pre_data, char **argv)
unsigned long stack, sp;
int pid, fds[2], ret, n;
- stack = alloc_stack(0, __cant_sleep());
+ stack = alloc_stack(__cant_sleep());
if (stack == 0)
return -ENOMEM;
@@ -116,7 +116,7 @@ int run_helper_thread(int (*proc)(void *), void *arg, unsigned int flags,
unsigned long stack, sp;
int pid, status, err;
- stack = alloc_stack(0, __cant_sleep());
+ stack = alloc_stack(__cant_sleep());
if (stack == 0)
return -ENOMEM;
--
2.27.0
_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH AUTOSEL 5.10 20/31] um: allocate a guard page to helper threads
2020-12-30 13:03 ` [PATCH AUTOSEL 5.10 20/31] um: allocate a guard page to helper threads Sasha Levin
@ 2020-12-30 14:48 ` Johannes Berg
2021-01-04 14:21 ` Sasha Levin
0 siblings, 1 reply; 5+ messages in thread
From: Johannes Berg @ 2020-12-30 14:48 UTC (permalink / raw)
To: Sasha Levin, linux-kernel, stable; +Cc: Richard Weinberger, linux-um
On Wed, 2020-12-30 at 08:03 -0500, Sasha Levin wrote:
>
> We've been running into stack overflows in helper threads
> corrupting memory
For the record, that was mostly referring to "while development", so
while this change makes a few things a bit safer, I don't think there's
all that much point in backporting to stable; presumably things are
working OK there for people, and developers will (hopefully) be working
on mainline anyway.
johannes
_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH AUTOSEL 5.10 20/31] um: allocate a guard page to helper threads
2020-12-30 14:48 ` Johannes Berg
@ 2021-01-04 14:21 ` Sasha Levin
0 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2021-01-04 14:21 UTC (permalink / raw)
To: Johannes Berg; +Cc: Richard Weinberger, linux-um, linux-kernel, stable
On Wed, Dec 30, 2020 at 03:48:57PM +0100, Johannes Berg wrote:
>On Wed, 2020-12-30 at 08:03 -0500, Sasha Levin wrote:
>>
>> We've been running into stack overflows in helper threads
>> corrupting memory
>
>For the record, that was mostly referring to "while development", so
>while this change makes a few things a bit safer, I don't think there's
>all that much point in backporting to stable; presumably things are
>working OK there for people, and developers will (hopefully) be working
>on mainline anyway.
I'll drop it, thanks!
--
Thanks,
Sasha
_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-01-04 14:21 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20201230130314.3636961-1-sashal@kernel.org>
2020-12-30 13:03 ` [PATCH AUTOSEL 5.10 18/31] um: random: Register random as hwrng-core device Sasha Levin
2020-12-30 13:03 ` [PATCH AUTOSEL 5.10 19/31] um: ubd: Submit all data segments atomically Sasha Levin
2020-12-30 13:03 ` [PATCH AUTOSEL 5.10 20/31] um: allocate a guard page to helper threads Sasha Levin
2020-12-30 14:48 ` Johannes Berg
2021-01-04 14:21 ` Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox