From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f49.google.com (mail-wm0-f49.google.com [74.125.82.49])
    by kanga.kvack.org (Postfix) with ESMTP id 699B26B0005
    for ; Mon, 18 Jan 2016 08:38:56 -0500 (EST)
Received: by mail-wm0-f49.google.com with SMTP id u188so104429626wmu.1
    for ; Mon, 18 Jan 2016 05:38:56 -0800 (PST)
Received: from mail-wm0-x22b.google.com (mail-wm0-x22b.google.com. [2a00:1450:400c:c09::22b])
    by mx.google.com with ESMTPS id b1si25393713wmi.42.2016.01.18.05.38.55
    for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
    Mon, 18 Jan 2016 05:38:55 -0800 (PST)
Received: by mail-wm0-x22b.google.com with SMTP id 123so52793467wmz.0
    for ; Mon, 18 Jan 2016 05:38:55 -0800 (PST)
Date: Mon, 18 Jan 2016 15:38:52 +0200
From: "Kirill A. Shutemov"
Subject: Re: mm: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected in split_huge_page_to_list
Message-ID: <20160118133852.GC14531@node.shutemov.name>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Dmitry Vyukov
Cc: "Kirill A. Shutemov", Sasha Levin, "Aneesh Kumar K.V", jmarchan@redhat.com,
    Hugh Dickins, Dave Hansen, David Rientjes, Andrew Morton, Linus Torvalds,
    Vlastimil Babka, Mel Gorman, Ebru Akagunduz, Dan Williams, Minchan Kim,
    "linux-mm@kvack.org", LKML, syzkaller, Kostya Serebryany, Alexander Potapenko

On Mon, Jan 18, 2016 at 02:08:15PM +0100, Dmitry Vyukov wrote:
> Hello,
>
> While running syzkaller fuzzer I've hit the following report.
>
> Looks like cause by the recent commit
> e9b61f19858a5d6c42ce2298cf138279375d0d9b "thp: reintroduce
> split_huge_page()".
>
> ======================================================
> [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
> 4.4.0+ #259 Tainted: G W
> ------------------------------------------------------
> syz-executor/18183 [HC0[0]:SC0[2]:HE0:SE0] is trying to acquire:
> (split_queue_lock){+.+...}, at: []
> free_transhuge_page+0x24/0x90 mm/huge_memory.c:3436
>
> and this task is already holding:
> (slock-AF_INET){+.-...}, at: [< inline >] spin_lock_bh
> include/linux/spinlock.h:307
> (slock-AF_INET){+.-...}, at: []
> lock_sock_fast+0x45/0x120 net/core/sock.c:2462
> which would create a new lock dependency:
> (slock-AF_INET){+.-...} -> (split_queue_lock){+.+...}
>
> but this new dependency connects a SOFTIRQ-irq-safe lock:
> (slock-AF_INET){+.-...}
> ... which became SOFTIRQ-irq-safe at:
> [< inline >] mark_irqflags kernel/locking/lockdep.c:2799
> [] __lock_acquire+0xfd8/0x4700 kernel/locking/lockdep.c:3162
> [] lock_acquire+0x1dc/0x430 kernel/locking/lockdep.c:3585
> [< inline >] __raw_spin_lock include/linux/spinlock_api_smp.h:144
> [] _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
> [< inline >] spin_lock include/linux/spinlock.h:302
> [] udp_queue_rcv_skb+0x781/0x1550 net/ipv4/udp.c:1680
> [] flush_stack+0x50/0x330 net/ipv6/udp.c:799
> [] __udp4_lib_mcast_deliver+0x694/0x7f0 net/ipv4/udp.c:1798
> [] __udp4_lib_rcv+0x17dc/0x23e0 net/ipv4/udp.c:1888
> [] udp_rcv+0x21/0x30 net/ipv4/udp.c:2108
> [] ip_local_deliver_finish+0x2b3/0xa50
> net/ipv4/ip_input.c:216
> [< inline >] NF_HOOK_THRESH include/linux/netfilter.h:226
> [< inline >] NF_HOOK include/linux/netfilter.h:249
> [] ip_local_deliver+0x1c4/0x2f0 net/ipv4/ip_input.c:257
> [< inline >] dst_input include/net/dst.h:498
> [] ip_rcv_finish+0x5ec/0x1730 net/ipv4/ip_input.c:365
> [< inline >] NF_HOOK_THRESH include/linux/netfilter.h:226
> [< inline >] NF_HOOK include/linux/netfilter.h:249
> [] ip_rcv+0x963/0x1080 net/ipv4/ip_input.c:455
> [] __netif_receive_skb_core+0x1620/0x2f80
> net/core/dev.c:4154
> [] __netif_receive_skb+0x2a/0x160 net/core/dev.c:4189
> [] netif_receive_skb_internal+0x1b5/0x390
> net/core/dev.c:4217
> [< inline >] napi_skb_finish net/core/dev.c:4542
> [] napi_gro_receive+0x2bd/0x3c0 net/core/dev.c:4572
> [] e1000_clean_rx_irq+0x4e2/0x1100
> drivers/net/ethernet/intel/e1000e/netdev.c:1038
> [] e1000_clean+0xa08/0x24a0
> drivers/net/ethernet/intel/e1000/e1000_main.c:3819
> [< inline >] napi_poll net/core/dev.c:5074
> [] net_rx_action+0x7eb/0xdf0 net/core/dev.c:5139
> [] __do_softirq+0x26a/0x920 kernel/softirq.c:273
> [< inline >] invoke_softirq kernel/softirq.c:350
> [] irq_exit+0x18f/0x1d0 kernel/softirq.c:391
> [< inline >] exiting_irq ./arch/x86/include/asm/apic.h:659
> [] do_IRQ+0x86/0x1a0 arch/x86/kernel/irq.c:252
> [] ret_from_intr+0x0/0x20 arch/x86/entry/entry_64.S:520
> [< inline >] arch_safe_halt ./arch/x86/include/asm/paravirt.h:117
> [] default_idle+0x52/0x2e0 arch/x86/kernel/process.c:304
> [] arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:295
> [] default_idle_call+0x48/0xa0 kernel/sched/idle.c:92
> [< inline >] cpuidle_idle_call kernel/sched/idle.c:156
> [< inline >] cpu_idle_loop kernel/sched/idle.c:252
> [] cpu_startup_entry+0x554/0x710 kernel/sched/idle.c:300
> [] rest_init+0x192/0x1a0 init/main.c:412
> [] start_kernel+0x678/0x69e init/main.c:683
> [] x86_64_start_reservations+0x2a/0x2c
> arch/x86/kernel/head64.c:195
> [] x86_64_start_kernel+0x158/0x167
> arch/x86/kernel/head64.c:184
>
> to a SOFTIRQ-irq-unsafe lock:
> (split_queue_lock){+.+...}
> ... which became SOFTIRQ-irq-unsafe at:
> ... [< inline >] mark_irqflags kernel/locking/lockdep.c:2817
> ... [] __lock_acquire+0x146e/0x4700
> kernel/locking/lockdep.c:3162
> [] lock_acquire+0x1dc/0x430 kernel/locking/lockdep.c:3585
> [< inline >] __raw_spin_lock include/linux/spinlock_api_smp.h:144
> [] _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
> [< inline >] spin_lock include/linux/spinlock.h:302
> [] split_huge_page_to_list+0xcc0/0x1c50
> mm/huge_memory.c:3399
> [< inline >] split_huge_page include/linux/huge_mm.h:99
> [] queue_pages_pte_range+0xa38/0xef0 mm/mempolicy.c:507
> [< inline >] walk_pmd_range mm/pagewalk.c:50
> [< inline >] walk_pud_range mm/pagewalk.c:90
> [< inline >] walk_pgd_range mm/pagewalk.c:116
> [] __walk_page_range+0x653/0xcd0 mm/pagewalk.c:204
> [] walk_page_range+0xfe/0x2b0 mm/pagewalk.c:281
> [] queue_pages_range+0xfb/0x130 mm/mempolicy.c:687
> [< inline >] migrate_to_node mm/mempolicy.c:1004
> [] do_migrate_pages+0x370/0x4e0 mm/mempolicy.c:1109
> [< inline >] SYSC_migrate_pages mm/mempolicy.c:1453
> [] SyS_migrate_pages+0x640/0x730 mm/mempolicy.c:1374
> [] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
>
> other info that might help us debug this:
>
>  Possible interrupt unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(split_queue_lock);
>                                local_irq_disable();
>                                lock(slock-AF_INET);
>                                lock(split_queue_lock);
>   <Interrupt>
>     lock(slock-AF_INET);

Thanks for the report. I think this should fix the issue: