* [PATCH AUTOSEL 3.18 03/41] sysctl: handle overflow for file-max
[not found] <20190327182518.19394-1-sashal@kernel.org>
@ 2019-03-27 18:24 ` Sasha Levin
2019-03-27 18:24 ` [PATCH AUTOSEL 3.18 11/41] fs: fix guard_bio_eod to check for real EOD errors Sasha Levin
1 sibling, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2019-03-27 18:24 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Christian Brauner, Alexey Dobriyan, Al Viro, Dominik Brodowski,
Eric W. Biederman, Joe Lawrence, Luis Chamberlain, Waiman Long,
Andrew Morton, Linus Torvalds, Sasha Levin, linux-fsdevel
From: Christian Brauner <christian@brauner.io>
[ Upstream commit 32a5ad9c22852e6bd9e74bdec5934ef9d1480bc5 ]
Currently, when writing
echo 18446744073709551616 > /proc/sys/fs/file-max
/proc/sys/fs/file-max will overflow and be set to 0. That quickly
crashes the system.
This commit sets the max and min value for file-max. The max value is
set to long int. Any higher value cannot currently be used as the
percpu counters are long ints and not unsigned integers.
Note that the file-max value is ultimately parsed via
__do_proc_doulongvec_minmax(). This function does not report error when
min or max are exceeded. Which means if a value largen that long int is
written userspace will not receive an error instead the old value will be
kept. There is an argument to be made that this should be changed and
__do_proc_doulongvec_minmax() should return an error when a dedicated min
or max value are exceeded. However this has the potential to break
userspace so let's defer this to an RFC patch.
Link: http://lkml.kernel.org/r/20190107222700.15954-3-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Waiman Long <longman@redhat.com>
[christian@brauner.io: v4]
Link: http://lkml.kernel.org/r/20190210203943.8227-3-christian@brauner.io
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/sysctl.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 27f8aa765493..a68fed9f6922 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -125,6 +125,7 @@ static int __maybe_unused one = 1;
static int __maybe_unused two = 2;
static int __maybe_unused four = 4;
static unsigned long one_ul = 1;
+static unsigned long long_max = LONG_MAX;
static int one_hundred = 100;
#ifdef CONFIG_PRINTK
static int ten_thousand = 10000;
@@ -1521,6 +1522,8 @@ static struct ctl_table fs_table[] = {
.maxlen = sizeof(files_stat.max_files),
.mode = 0644,
.proc_handler = proc_doulongvec_minmax,
+ .extra1 = &zero,
+ .extra2 = &long_max,
},
{
.procname = "nr_open",
--
2.19.1
^ permalink raw reply related [flat|nested] 2+ messages in thread* [PATCH AUTOSEL 3.18 11/41] fs: fix guard_bio_eod to check for real EOD errors
[not found] <20190327182518.19394-1-sashal@kernel.org>
2019-03-27 18:24 ` [PATCH AUTOSEL 3.18 03/41] sysctl: handle overflow for file-max Sasha Levin
@ 2019-03-27 18:24 ` Sasha Levin
1 sibling, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2019-03-27 18:24 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Carlos Maiolino, Jens Axboe, Sasha Levin, linux-fsdevel
From: Carlos Maiolino <cmaiolino@redhat.com>
[ Upstream commit dce30ca9e3b676fb288c33c1f4725a0621361185 ]
guard_bio_eod() can truncate a segment in bio to allow it to do IO on
odd last sectors of a device.
It already checks if the IO starts past EOD, but it does not consider
the possibility of an IO request starting within device boundaries can
contain more than one segment past EOD.
In such cases, truncated_bytes can be bigger than PAGE_SIZE, and will
underflow bvec->bv_len.
Fix this by checking if truncated_bytes is lower than PAGE_SIZE.
This situation has been found on filesystems such as isofs and vfat,
which doesn't check the device size before mount, if the device is
smaller than the filesystem itself, a readahead on such filesystem,
which spans EOD, can trigger this situation, leading a call to
zero_user() with a wrong size possibly corrupting memory.
I didn't see any crash, or didn't let the system run long enough to
check if memory corruption will be hit somewhere, but adding
instrumentation to guard_bio_end() to check truncated_bytes size, was
enough to see the error.
The following script can trigger the error.
MNT=/mnt
IMG=./DISK.img
DEV=/dev/loop0
mkfs.vfat $IMG
mount $IMG $MNT
cp -R /etc $MNT &> /dev/null
umount $MNT
losetup -D
losetup --find --show --sizelimit 16247280 $IMG
mount $DEV $MNT
find $MNT -type f -exec cat {} + >/dev/null
Kudos to Eric Sandeen for coming up with the reproducer above
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/buffer.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/buffer.c b/fs/buffer.c
index 20805db2c987..47b42e8ddca2 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2986,6 +2986,13 @@ void guard_bio_eod(int rw, struct bio *bio)
/* Uhhuh. We've got a bio that straddles the device size! */
truncated_bytes = bio->bi_iter.bi_size - (maxsector << 9);
+ /*
+ * The bio contains more than one segment which spans EOD, just return
+ * and let IO layer turn it into an EIO
+ */
+ if (truncated_bytes > bvec->bv_len)
+ return;
+
/* Truncate the bio.. */
bio->bi_iter.bi_size -= truncated_bytes;
bvec->bv_len -= truncated_bytes;
--
2.19.1
^ permalink raw reply related [flat|nested] 2+ messages in thread