* [PATCH AUTOSEL 4.4 06/63] sysctl: handle overflow for file-max
[not found] <20190327182323.18577-1-sashal@kernel.org>
@ 2019-03-27 18:22 ` Sasha Levin
2019-03-27 18:22 ` [PATCH AUTOSEL 4.4 14/63] fs/file.c: initialize init_files.resize_wait Sasha Levin
2019-03-27 18:22 ` [PATCH AUTOSEL 4.4 18/63] fs: fix guard_bio_eod to check for real EOD errors Sasha Levin
2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2019-03-27 18:22 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Christian Brauner, Alexey Dobriyan, Al Viro, Dominik Brodowski,
Eric W. Biederman, Joe Lawrence, Luis Chamberlain, Waiman Long,
Andrew Morton, Linus Torvalds, Sasha Levin, linux-fsdevel
From: Christian Brauner <christian@brauner.io>
[ Upstream commit 32a5ad9c22852e6bd9e74bdec5934ef9d1480bc5 ]
Currently, when writing
echo 18446744073709551616 > /proc/sys/fs/file-max
/proc/sys/fs/file-max will overflow and be set to 0. That quickly
crashes the system.
This commit sets the max and min value for file-max. The max value is
set to long int. Any higher value cannot currently be used as the
percpu counters are long ints and not unsigned integers.
Note that the file-max value is ultimately parsed via
__do_proc_doulongvec_minmax(). This function does not report error when
min or max are exceeded. Which means if a value largen that long int is
written userspace will not receive an error instead the old value will be
kept. There is an argument to be made that this should be changed and
__do_proc_doulongvec_minmax() should return an error when a dedicated min
or max value are exceeded. However this has the potential to break
userspace so let's defer this to an RFC patch.
Link: http://lkml.kernel.org/r/20190107222700.15954-3-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Waiman Long <longman@redhat.com>
[christian@brauner.io: v4]
Link: http://lkml.kernel.org/r/20190210203943.8227-3-christian@brauner.io
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/sysctl.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index beadcf83ceba..2f98b11477b8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -126,6 +126,7 @@ static int __maybe_unused one = 1;
static int __maybe_unused two = 2;
static int __maybe_unused four = 4;
static unsigned long one_ul = 1;
+static unsigned long long_max = LONG_MAX;
static int one_hundred = 100;
#ifdef CONFIG_PRINTK
static int ten_thousand = 10000;
@@ -1603,6 +1604,8 @@ static struct ctl_table fs_table[] = {
.maxlen = sizeof(files_stat.max_files),
.mode = 0644,
.proc_handler = proc_doulongvec_minmax,
+ .extra1 = &zero,
+ .extra2 = &long_max,
},
{
.procname = "nr_open",
--
2.19.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH AUTOSEL 4.4 14/63] fs/file.c: initialize init_files.resize_wait
[not found] <20190327182323.18577-1-sashal@kernel.org>
2019-03-27 18:22 ` [PATCH AUTOSEL 4.4 06/63] sysctl: handle overflow for file-max Sasha Levin
@ 2019-03-27 18:22 ` Sasha Levin
2019-03-27 18:22 ` [PATCH AUTOSEL 4.4 18/63] fs: fix guard_bio_eod to check for real EOD errors Sasha Levin
2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2019-03-27 18:22 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Shuriyc Chu, Al Viro, Andrew Morton, Linus Torvalds, Sasha Levin,
linux-fsdevel
From: Shuriyc Chu <sureeju@gmail.com>
[ Upstream commit 5704a06810682683355624923547b41540e2801a ]
(Taken from https://bugzilla.kernel.org/show_bug.cgi?id=200647)
'get_unused_fd_flags' in kthread cause kernel crash. It works fine on
4.1, but causes crash after get 64 fds. It also cause crash on
ubuntu1404/1604/1804, centos7.5, and the crash messages are almost the
same.
The crash message on centos7.5 shows below:
start fd 61
start fd 62
start fd 63
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: __wake_up_common+0x2e/0x90
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: test(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink sunrpc kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg ppdev pcspkr virtio_balloon parport_pc parport i2c_piix4 joydev ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi virtio_scsi virtio_console virtio_net cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel drm ata_piix serio_raw libata virtio_pci virtio_ring i2c_core
virtio floppy dm_mirror dm_region_hash dm_log dm_mod
CPU: 2 PID: 1820 Comm: test_fd Kdump: loaded Tainted: G OE ------------ 3.10.0-862.3.3.el7.x86_64 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
task: ffff8e92b9431fa0 ti: ffff8e94247a0000 task.ti: ffff8e94247a0000
RIP: 0010:__wake_up_common+0x2e/0x90
RSP: 0018:ffff8e94247a2d18 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffffffff9d09daa0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffff9d09daa0
RBP: ffff8e94247a2d50 R08: 0000000000000000 R09: ffff8e92b95dfda8
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9d09daa8
R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff8e9434e80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000017c686000 CR4: 00000000000207e0
Call Trace:
__wake_up+0x39/0x50
expand_files+0x131/0x250
__alloc_fd+0x47/0x170
get_unused_fd_flags+0x30/0x40
test_fd+0x12a/0x1c0 [test]
kthread+0xd1/0xe0
ret_from_fork_nospec_begin+0x21/0x21
Code: 66 90 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 54 49 89 fc 49 83 c4 08 53 48 83 ec 10 48 8b 47 08 89 55 cc 4c 89 45 d0 <48> 8b 08 49 39 c4 48 8d 78 e8 4c 8d 69 e8 75 08 eb 3b 4c 89 ef
RIP __wake_up_common+0x2e/0x90
RSP <ffff8e94247a2d18>
CR2: 0000000000000000
This issue exists since CentOS 7.5 3.10.0-862 and CentOS 7.4
(3.10.0-693.21.1 ) is ok. Root cause: the item 'resize_wait' is not
initialized before being used.
Reported-by: Richard Zhang <zhang.zijian@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/file.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/file.c b/fs/file.c
index 39f8f15921da..7e9eb65a2912 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -474,6 +474,7 @@ struct files_struct init_files = {
.full_fds_bits = init_files.full_fds_bits_init,
},
.file_lock = __SPIN_LOCK_UNLOCKED(init_files.file_lock),
+ .resize_wait = __WAIT_QUEUE_HEAD_INITIALIZER(init_files.resize_wait),
};
static unsigned long find_next_fd(struct fdtable *fdt, unsigned long start)
--
2.19.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH AUTOSEL 4.4 18/63] fs: fix guard_bio_eod to check for real EOD errors
[not found] <20190327182323.18577-1-sashal@kernel.org>
2019-03-27 18:22 ` [PATCH AUTOSEL 4.4 06/63] sysctl: handle overflow for file-max Sasha Levin
2019-03-27 18:22 ` [PATCH AUTOSEL 4.4 14/63] fs/file.c: initialize init_files.resize_wait Sasha Levin
@ 2019-03-27 18:22 ` Sasha Levin
2 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2019-03-27 18:22 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Carlos Maiolino, Jens Axboe, Sasha Levin, linux-fsdevel
From: Carlos Maiolino <cmaiolino@redhat.com>
[ Upstream commit dce30ca9e3b676fb288c33c1f4725a0621361185 ]
guard_bio_eod() can truncate a segment in bio to allow it to do IO on
odd last sectors of a device.
It already checks if the IO starts past EOD, but it does not consider
the possibility of an IO request starting within device boundaries can
contain more than one segment past EOD.
In such cases, truncated_bytes can be bigger than PAGE_SIZE, and will
underflow bvec->bv_len.
Fix this by checking if truncated_bytes is lower than PAGE_SIZE.
This situation has been found on filesystems such as isofs and vfat,
which doesn't check the device size before mount, if the device is
smaller than the filesystem itself, a readahead on such filesystem,
which spans EOD, can trigger this situation, leading a call to
zero_user() with a wrong size possibly corrupting memory.
I didn't see any crash, or didn't let the system run long enough to
check if memory corruption will be hit somewhere, but adding
instrumentation to guard_bio_end() to check truncated_bytes size, was
enough to see the error.
The following script can trigger the error.
MNT=/mnt
IMG=./DISK.img
DEV=/dev/loop0
mkfs.vfat $IMG
mount $IMG $MNT
cp -R /etc $MNT &> /dev/null
umount $MNT
losetup -D
losetup --find --show --sizelimit 16247280 $IMG
mount $DEV $MNT
find $MNT -type f -exec cat {} + >/dev/null
Kudos to Eric Sandeen for coming up with the reproducer above
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/buffer.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/buffer.c b/fs/buffer.c
index 6f7d519a093b..f278e27bd8c0 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2985,6 +2985,13 @@ void guard_bio_eod(int rw, struct bio *bio)
/* Uhhuh. We've got a bio that straddles the device size! */
truncated_bytes = bio->bi_iter.bi_size - (maxsector << 9);
+ /*
+ * The bio contains more than one segment which spans EOD, just return
+ * and let IO layer turn it into an EIO
+ */
+ if (truncated_bytes > bvec->bv_len)
+ return;
+
/* Truncate the bio.. */
bio->bi_iter.bi_size -= truncated_bytes;
bvec->bv_len -= truncated_bytes;
--
2.19.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-03-27 18:38 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20190327182323.18577-1-sashal@kernel.org>
2019-03-27 18:22 ` [PATCH AUTOSEL 4.4 06/63] sysctl: handle overflow for file-max Sasha Levin
2019-03-27 18:22 ` [PATCH AUTOSEL 4.4 14/63] fs/file.c: initialize init_files.resize_wait Sasha Levin
2019-03-27 18:22 ` [PATCH AUTOSEL 4.4 18/63] fs: fix guard_bio_eod to check for real EOD errors Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox