From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 6 Oct 2022 11:24:30 +0100
From: Catalin Marinas
To: James Morse
Cc: Andrey Konovalov, Linux ARM, LKML, syzkaller-bugs, tongtiangen@huawei.com, Vincenzo Frascino, Kefeng Wang, Will Deacon, syzbot, Evgenii Stepanov, Peter Collingbourne, Dmitry Vyukov
Subject: Re: [syzbot] KASAN: invalid-access Read in copy_page
References: <0000000000004387dc05e5888ae5@google.com> <830e8c64-0118-9a2d-5dcf-5cad55425dc2@arm.com>
In-Reply-To: <830e8c64-0118-9a2d-5dcf-5cad55425dc2@arm.com>

On Wed, Oct 05, 2022 at 01:38:55PM +0100, James Morse wrote:
> On 27/09/2022 17:55, Andrey Konovalov wrote:
> > On Tue, Sep 6, 2022 at 6:23 PM Catalin Marinas wrote:
> >> On Tue, Sep 06, 2022 at 04:39:57PM +0200, Andrey Konovalov wrote:
> >>> On Tue, Sep 6, 2022 at 4:29 PM Catalin Marinas wrote:
> >>>>>> Does it take long to reproduce this kasan warning?
> >>>>>
> >>>>> syzbot finds several such cases every day (200 crashes for the past 35 days):
> >>>>> https://syzkaller.appspot.com/bug?extid=c2c79c6d6eddc5262b77
> >>>>> So once it reaches the tested tree, we should have an answer within a day.
[...]
> I've reproduced this with the latest qemu and v6.0 kernel using ubuntu 15.04 user-space.
>
> The reproducer is just to log in once its booted. The vm has swap, and I've turned the
> memory down low enough to force it to swap. The round trip time is about 15 minutes.
>
> I've not managed to reproduce it without swap, or with more memory. (but it may be a
> timing thing)

Thanks James. I got the error without swap enabled. Just booted Debian
under Qemu with 256MB of RAM (no graphics), did an 'ls -lR /' and it
triggered shortly after. There's no MTE used in user-space.

==================================================================
BUG: KASAN: invalid-access in copy_page+0x10/0xd0
Read at addr f9ff0000050ba000 by task kcompactd0/28
Pointer tag: [f9], memory tag: [f8]

CPU: 0 PID: 28 Comm: kcompactd0 Tainted: G        W          6.0.0-rc3-dirty #1
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
Call trace:
 dump_backtrace.part.0+0xdc/0xf0
 show_stack+0x1c/0x4c
 dump_stack_lvl+0x68/0x84
 print_report+0x104/0x610
 kasan_report+0x90/0xb0
 __do_kernel_fault+0x70/0x194
 do_tag_check_fault+0x7c/0x90
 do_mem_abort+0x48/0xa0
 el1_abort+0x40/0x60
 el1h_64_sync_handler+0xdc/0xec
 el1h_64_sync+0x64/0x68
 copy_page+0x10/0xd0
 folio_copy+0x50/0xb0
 migrate_folio+0x50/0x9c
 move_to_new_folio+0xc0/0x1d4
 migrate_pages+0x16b4/0x1740
 compact_zone+0x66c/0xb0c
 proactive_compact_node+0x70/0xac
 kcompactd+0x1b4/0x370
 kthread+0x110/0x114
 ret_from_fork+0x10/0x20

The buggy address belongs to the physical page:
page:000000007339140a refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff90019 pfn:0x450ba
memcg:f9ff0000052e4000
anon flags: 0x3fffc180088000d(locked|uptodate|dirty|swapbacked|arch_2|node=0|zone=0|lastcpupid=0xffff|kasantag=0x6)
raw: 03fffc180088000d fffffc0000142e48 ffff80000815bd68 fdff000001c738c1
raw: 0000000ffff90019 0000000000000000 00000001ffffffff f9ff0000052e4000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff0000050b9e00: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
 ffff0000050b9f00: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
>ffff0000050ba000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
                   ^
 ffff0000050ba100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
 ffff0000050ba200: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================

It looks like it always happens on read. Something updated the tag in
page->flags for an existing page (or repainted the page, though less
likely as I think the page is in use). I'm surprised that even without
MTE in user-space, we still get PG_mte_tagged (arch_2) set.

Time for more printks.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
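
For readers following the tag arithmetic in the report above, here is a minimal Python sketch that decodes the quoted values. It is an illustration, not kernel code. The helper names are hypothetical, and the XOR with 0xff when reading the tag back from page->flags is an assumption based on how page_kasan_tag() in include/linux/mm.h is understood to work. Under that assumption, kasantag=0x6 in page->flags corresponds to the pointer tag f9, while the memory is painted f8, which fits the remark that the tag in page->flags and the memory tags have diverged.

```python
# Values copied from the KASAN report quoted in this message.
FAULT_ADDR = 0xF9FF0000050BA000   # "Read at addr f9ff0000050ba000"
MEMORY_TAG = 0xF8                 # "memory tag: [f8]"
PAGE_FLAGS_KASANTAG = 0x6         # "kasantag=0x6" in the page flags dump

KASAN_TAG_KERNEL = 0xFF           # match-all tag carried by untagged kernel pointers


def pointer_tag(addr: int) -> int:
    """Return the tag held in the top byte (bits 63:56) of a 64-bit address."""
    return (addr >> 56) & 0xFF


def reset_tag(addr: int) -> int:
    """Return the address with its top byte forced to the match-all tag."""
    return addr | (KASAN_TAG_KERNEL << 56)


def tag_from_page_flags(stored: int) -> int:
    """Recover the KASAN tag from the value kept in page->flags.

    Assumption: the stored field is the tag XOR 0xff, as in page_kasan_tag().
    """
    return (stored ^ 0xFF) & 0xFF


if __name__ == "__main__":
    ptr = pointer_tag(FAULT_ADDR)
    print(f"pointer tag:          {ptr:#04x}")
    print(f"untagged address:     {reset_tag(FAULT_ADDR):#018x}")
    print(f"tag from page->flags: {tag_from_page_flags(PAGE_FLAGS_KASANTAG):#04x}")
    print(f"pointer matches memory tag: {ptr == MEMORY_TAG}")
```

The untagged address lines up with the row marked ">" in the memory state dump, and the mismatch between the pointer tag and the memory tag is what do_tag_check_fault reports.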