From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751531AbbJDMPC (ORCPT ); Sun, 4 Oct 2015 08:15:02 -0400
Received: from mail-wi0-f177.google.com ([209.85.212.177]:38664 "EHLO mail-wi0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751315AbbJDMO7 (ORCPT ); Sun, 4 Oct 2015 08:14:59 -0400
MIME-Version: 1.0
In-Reply-To:
References: <20150930082754.401022511@linutronix.de> <20150930083302.694788319@linutronix.de> <560F2C42.4020500@oracle.com>
From: Dmitry Vyukov
Date: Sun, 4 Oct 2015 14:14:38 +0200
Message-ID:
Subject: Re: [patch 1/2] x86/process: Add proper bound checks in 64bit get_wchan()
To: Andrey Ryabinin
Cc: Thomas Gleixner, Sasha Levin, LKML, Andy Lutomirski, Andrey Konovalov, Kostya Serebryany, Alexander Potapenko, kasan-dev, Borislav Petkov, Denys Vlasenko, Andi Kleen, "x86@kernel.org", Wolfram Gloger
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Oct 3, 2015 at 1:31 PM, Andrey Ryabinin wrote:
> 2015-10-03 13:54 GMT+03:00 Thomas Gleixner:
>> On Fri, 2 Oct 2015, Sasha Levin wrote:
>>> I'm seeing a different issue with this patch:
>>>
>>> [ 5228.736320] BUG: KASAN: out-of-bounds in get_wchan+0xf9/0x1b0 at addr ffff88049d2b7c50
>>> [ 5228.737560] Read of size 8 by task killall/22177
>>> [ 5228.738304] page:ffffea001274adc0 count:0 mapcount:0 mapping: (null) index:0x0
>>> [ 5228.739374] flags: 0x6fffff80000000()
>>> [ 5228.739862] page dumped because: kasan: bad access detected
>>> [ 5228.741764] CPU: 8 PID: 22177 Comm: killall Not tainted 4.3.0-rc3-next-20151002-sasha-00076-gde7fa56-dirty #2590
>>> [ 5228.743337]  ffff882c80967828 000000007a901a83 ffff882c80967790 ffffffffacd2c8c8
>>> [ 5228.744409]  ffff88049d2b7c50 ffff882c80967818 ffffffffab74befb ffff882c8bd00000
>>> [ 5228.745436]  0000000000000002 0000000000000282 ffff882c8bd00cf8 0000000000000001
>>> [ 5228.746446] Call Trace:
>>> [ 5228.746881] dump_stack (lib/dump_stack.c:52)
>>> [ 5228.747720] kasan_report_error (include/linux/kasan.h:28 mm/kasan/report.c:170 mm/kasan/report.c:237)
>>> [ 5228.748670] __asan_report_load8_noabort (mm/kasan/report.c:279)
>>> [ 5228.750563] get_wchan (arch/x86/kernel/process.c:561)
>>> [ 5228.751378] do_task_stat (fs/proc/array.c:458)
>>> [ 5228.755912] proc_tgid_stat (fs/proc/array.c:565)
>>> [ 5228.756770] proc_single_show (./arch/x86/include/asm/atomic.h:118 include/linux/sched.h:2012 fs/proc/base.c:789)
>>> [ 5228.759066] seq_read (fs/seq_file.c:238)
>>> [ 5228.762360] __vfs_read (fs/read_write.c:432)
>>> [ 5228.767957] vfs_read (fs/read_write.c:454)
>>> [ 5228.769368] SyS_read (fs/read_write.c:570 fs/read_write.c:562)
>>> [ 5228.778344] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
>>> [ 5228.779272] Memory state around the buggy address:
>>> [ 5228.779971]  ffff88049d2b7b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> [ 5228.780992]  ffff88049d2b7b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> [ 5228.782021] >ffff88049d2b7c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> [ 5228.783066]                                                  ^
>>> [ 5228.783936]  ffff88049d2b7c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> [ 5228.784994]  ffff88049d2b7d00: 00 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3
>>>
>>>         fp = READ_ONCE(*(unsigned long *)sp);
>>>         do {
>>>                 if (fp < bottom || fp > top)
>>>                         return 0;
>>>                 ip = READ_ONCE(*(unsigned long *)(fp + sizeof(unsigned long)));
>>>                 if (!in_sched_functions(ip))
>>>                         return ip;
>>>                 fp = READ_ONCE(*(unsigned long *)fp); <=== Here
>>
>> Weird, we accessed
>>
>>      *(unsigned long *)(fp + sizeof(unsigned long))
>>
>> a few lines above, i.e. ffff88049d2b7c58
>>
>> But what's more weird is that the memory dump does not really look
>> like a stack at all.
>>
>> ffff88049d2b7c50 is stored on the stack:
>>
>>> [ 5228.744409]  ffff88049d2b7c50 ffff882c80967818 ffffffffab74befb ffff882c8bd00000
>>
>> But if it is not inside the stack bounds, how do we end up
>> dereferencing it. Confused....
>>
>
> I'm sure that we in stack bounds here.
> But we are not inside bounds of some stack variable
> and KASAN doesn't like it.

Agree.

> KASAN changes stack frame of each function, e.g.
> void foo() {
>         int a;
> }
>
> transformed to:
> void foo() {
>         char redzone1[32];
>         int a;
>         char redzone2[28];
>         char redzone3[32];
> }
>
> So any access to redzones will be reported.
>
> I could make a patch which will tell KASAN to ignore get_wchan(),
> but if there are any real bugs left in get_wchan() they will be ignored too.

Yes, we need KASAN to ignore it. False positives are not acceptable. I
wanted to mail a fix after the fix for the real bug.
I am thinking about kasan_disable_current/kasan_enable_current around
the region that does the stack accesses; the reports are rare enough,
so it should not have any significant performance effect.
I don't mind if you send the patch.
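
For illustration only, here is a rough userspace sketch of that
bracketing. It is not the kernel patch. It mirrors the loop quoted
above and wraps only the raw stack reads in a disable/enable pair.
Everything in it is an assumption made for the example:
kasan_disable_current() and kasan_enable_current() are stand-ins that
just bump a counter, in_sched_functions() and its text range are
hypothetical, walk_frames() and demo_walk() are invented names, and
unsigned long is assumed to be pointer-sized (LP64), as in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins: only a depth counter, no real KASAN involved. */
static int kasan_depth;
static void kasan_disable_current(void) { kasan_depth++; }
static void kasan_enable_current(void)  { kasan_depth--; }

/* Hypothetical stand-in: pretend [0x1000, 0x2000) is scheduler text. */
static int in_sched_functions(unsigned long ip)
{
	return ip >= 0x1000UL && ip < 0x2000UL;
}

/*
 * Follow the saved frame-pointer chain starting at sp. Every fp is
 * checked against [bottom, top] before it is dereferenced; top must
 * leave room for the return address stored one word above fp.
 * Returns the first return address outside scheduler text, or 0.
 */
static unsigned long walk_frames(unsigned long sp, unsigned long bottom,
				 unsigned long top)
{
	unsigned long fp, ip = 0;
	int count = 0;

	if (sp < bottom || sp > top)
		return 0;

	kasan_disable_current();	/* raw stack reads start here */
	fp = *(volatile unsigned long *)sp;
	do {
		if (fp < bottom || fp > top) {
			ip = 0;
			break;
		}
		ip = *(volatile unsigned long *)(fp + sizeof(unsigned long));
		if (!in_sched_functions(ip))
			break;		/* found the caller outside the scheduler */
		fp = *(volatile unsigned long *)fp;
		ip = 0;			/* nothing found yet in this frame */
	} while (count++ < 16);
	kasan_enable_current();		/* raw stack reads end here */

	return ip;
}

/* Build a tiny fake stack with two frames and walk it. */
static unsigned long demo_walk(void)
{
	unsigned long stack[8] = { 0 };
	unsigned long base = (unsigned long)stack;
	unsigned long ip;

	stack[0] = base + 2 * sizeof(unsigned long);	/* sp slot: fp of frame A */
	stack[2] = base + 5 * sizeof(unsigned long);	/* frame A: saved fp -> frame B */
	stack[3] = 0x1800UL;				/* frame A: return address, "scheduler" */
	stack[5] = 0;					/* frame B: end of chain */
	stack[6] = 0xabcdUL;				/* frame B: return address, outside */

	ip = walk_frames(base, base, base + 6 * sizeof(unsigned long));
	assert(kasan_depth == 0);	/* the disable/enable pair is balanced */
	return ip;
}
```

Every exit from the loop goes through a single kasan_enable_current()
call, so the pair cannot get unbalanced on an early return. The bounds
checks stay in place, which means a frame pointer that really leaves
the stack is still rejected even while reporting is suppressed.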