From mboxrd@z Thu Jan 1 00:00:00 1970
From: Trond Myklebust <trond.myklebust@primarydata.com>
To: dvyukov@google.com,
	bot+d8fe95298ef830cd7d05e33eefa4a5a6f6f334d4@syzkaller.appspotmail.com
Cc: linux-kernel@vger.kernel.org, bfields@fieldses.org,
	linux-nfs@vger.kernel.org, jlayton@poochiereds.net,
	jiangshanlai@gmail.com, anna.schumaker@netapp.com,
	netdev@vger.kernel.org, syzkaller-bugs@googlegroups.com,
	tj@kernel.org, davem@davemloft.net
Subject: Re: possible deadlock in flush_work (2)
Date: Sun, 5 Nov 2017 16:00:17 +0000
Message-ID: <1509897615.5851.1.camel@primarydata.com>
References: <001a113ee9baf95598055d384ecb@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 2017-11-05 at 11:53 +0300, Dmitry Vyukov wrote:
> On Sun, Nov 5, 2017 at 11:41 AM, syzbot
> <bot+d8fe95298ef830cd7d05e33eefa4a5a6f6f334d4@syzkaller.appspotmail.com>
> wrote:
> > Hello,
> >
> > syzkaller hit the following crash on
> > 0f611fb6dcc0d6d91b4e1fec911321f434a3b858
> > git://git.cmpxchg.org/linux-mmots.git/master
> > compiler: gcc (GCC) 7.1.1 20170620
> > .config is attached
> > Raw console output is attached.
> >
> > xs_tcp_setup_socket: connect returned unhandled error -113
> > xs_tcp_setup_socket: connect returned unhandled error -113
> > xs_tcp_setup_socket: connect returned unhandled error -113
> >
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 4.14.0-rc5-mm1+ #20 Not tainted
> > ------------------------------------------------------
> > kworker/0:3/3400 is trying to acquire lock:
> >  ("xprtiod"){+.+.}, at: [] start_flush_work kernel/workqueue.c:2850 [inline]
> >  ("xprtiod"){+.+.}, at: [] flush_work+0x55a/0x8a0 kernel/workqueue.c:2882
> >
> > but task is already holding lock:
> >  ((&task->u.tk_work)){+.+.}, at: [] process_one_work+0xb32/0x1bc0 kernel/workqueue.c:2087
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 ((&task->u.tk_work)){+.+.}:
> >        process_one_work+0xba2/0x1bc0 kernel/workqueue.c:2088
> >        worker_thread+0x223/0x1990 kernel/workqueue.c:2246
> >        kthread+0x38b/0x470 kernel/kthread.c:242
> >        ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
> >
> > -> #0 ("xprtiod"){+.+.}:
> >        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3991
> >        start_flush_work kernel/workqueue.c:2851 [inline]
> >        flush_work+0x57f/0x8a0 kernel/workqueue.c:2882
> >        __cancel_work_timer+0x30a/0x7e0 kernel/workqueue.c:2954
> >        cancel_work_sync+0x17/0x20 kernel/workqueue.c:2990
> >        xprt_destroy+0xa1/0x130 net/sunrpc/xprt.c:1467
> >        xprt_destroy_kref net/sunrpc/xprt.c:1477 [inline]
> >        kref_put include/linux/kref.h:70 [inline]
> >        xprt_put+0x38/0x40 net/sunrpc/xprt.c:1501
> >        rpc_task_release_client+0x299/0x430 net/sunrpc/clnt.c:986
> >        rpc_release_resources_task+0x7f/0xa0 net/sunrpc/sched.c:1020
> >        rpc_release_task net/sunrpc/sched.c:1059 [inline]
> >        __rpc_execute+0x4d9/0xe70 net/sunrpc/sched.c:824
> >        rpc_async_schedule+0x16/0x20 net/sunrpc/sched.c:848
> >        process_one_work+0xbf0/0x1bc0 kernel/workqueue.c:2112
> >        worker_thread+0x223/0x1990 kernel/workqueue.c:2246
> >        kthread+0x38b/0x470 kernel/kthread.c:242
> >        ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
> >
> > other info that might help us debug this:
> >
> >  Possible unsafe locking scenario:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock((&task->u.tk_work));
> >                                lock("xprtiod");
> >                                lock((&task->u.tk_work));
> >   lock("xprtiod");
> >
> >  *** DEADLOCK ***
> >
> > 2 locks held by kworker/0:3/3400:
> >  #0:  ("rpciod"){+.+.}, at: [] __write_once_size include/linux/compiler.h:305 [inline]
> >  #0:  ("rpciod"){+.+.}, at: [] atomic64_set arch/x86/include/asm/atomic64_64.h:33 [inline]
> >  #0:  ("rpciod"){+.+.}, at: [] atomic_long_set include/asm-generic/atomic-long.h:56 [inline]
> >  #0:  ("rpciod"){+.+.}, at: [] set_work_data kernel/workqueue.c:618 [inline]
> >  #0:  ("rpciod"){+.+.}, at: [] set_work_pool_and_clear_pending kernel/workqueue.c:645 [inline]
> >  #0:  ("rpciod"){+.+.}, at: [] process_one_work+0xadf/0x1bc0 kernel/workqueue.c:2083
> >  #1:  ((&task->u.tk_work)){+.+.}, at: [] process_one_work+0xb32/0x1bc0 kernel/workqueue.c:2087
> >
> > stack backtrace:
> > CPU: 0 PID: 3400 Comm: kworker/0:3 Not tainted 4.14.0-rc5-mm1+ #20
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Workqueue: rpciod rpc_async_schedule
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:16 [inline]
> >  dump_stack+0x194/0x257 lib/dump_stack.c:52
> >  print_circular_bug.isra.41+0x342/0x36a kernel/locking/lockdep.c:1258
> >  check_prev_add kernel/locking/lockdep.c:1901 [inline]
> >  check_prevs_add kernel/locking/lockdep.c:2018 [inline]
> >  validate_chain kernel/locking/lockdep.c:2460 [inline]
> >  __lock_acquire+0x2f55/0x3d50 kernel/locking/lockdep.c:3487
> >  lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3991
> >  start_flush_work kernel/workqueue.c:2851 [inline]
> >  flush_work+0x57f/0x8a0 kernel/workqueue.c:2882
> >  __cancel_work_timer+0x30a/0x7e0 kernel/workqueue.c:2954
> >  cancel_work_sync+0x17/0x20 kernel/workqueue.c:2990
> >  xprt_destroy+0xa1/0x130 net/sunrpc/xprt.c:1467
> >  xprt_destroy_kref net/sunrpc/xprt.c:1477 [inline]
> >  kref_put include/linux/kref.h:70 [inline]
> >  xprt_put+0x38/0x40 net/sunrpc/xprt.c:1501
> >  rpc_task_release_client+0x299/0x430 net/sunrpc/clnt.c:986
> >  rpc_release_resources_task+0x7f/0xa0 net/sunrpc/sched.c:1020
> >  rpc_release_task net/sunrpc/sched.c:1059 [inline]
> >  __rpc_execute+0x4d9/0xe70 net/sunrpc/sched.c:824
> >  rpc_async_schedule+0x16/0x20 net/sunrpc/sched.c:848
> >  process_one_work+0xbf0/0x1bc0 kernel/workqueue.c:2112
> >  worker_thread+0x223/0x1990 kernel/workqueue.c:2246
> >  kthread+0x38b/0x470 kernel/kthread.c:242
> >  ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
>
> +sunrpc maintainers

A fix for this has already been merged. Please retest with an up-to-date
kernel.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
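
[Archive editor's note: the lockdep report above warns that a work item running
on the "rpciod" workqueue blocks in flush_work()/cancel_work_sync() on work
bound to "xprtiod", while a recorded dependency runs the other way, so the two
can wait on each other forever. A simplified user-space sketch of the same
class of bug (blocking on work that cannot run until the blocked worker is
free) can be written with a one-thread pool; all names below are invented for
illustration and this is not the kernel code:]

```python
# Sketch: a task running inside a one-thread pool submits more work to the
# same pool and then blocks waiting for it -- the analogue of calling
# flush_work() from inside a work item whose completion the flushed work
# (transitively) depends on. Without the timeout, this would hang forever.
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutTimeout

pool = ThreadPoolExecutor(max_workers=1)  # plays the role of a workqueue

def inner():
    return "done"

def outer():
    # inner() is queued behind us on the only worker thread, so waiting
    # for it here can never succeed -- a self-dependency deadlock.
    fut = pool.submit(inner)
    try:
        return fut.result(timeout=1.0)  # stand-in for flush_work()
    except FutTimeout:
        return "deadlock"

result = pool.submit(outer).result()
print(result)  # -> "deadlock"
```

The kernel fix direction implied by the thread is to break the cycle so that
xprt teardown no longer flushes one workqueue from inside the other.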