From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-lj1-f193.google.com ([209.85.208.193]:46897 "EHLO mail-lj1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727788AbeIMVdF (ORCPT ); Thu, 13 Sep 2018 17:33:05 -0400
Received: by mail-lj1-f193.google.com with SMTP id 203-v6so5096557ljj.13 for ; Thu, 13 Sep 2018 09:22:51 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <5c0b3895-05c8-63b1-195a-316d3eaca072@suse.com>
References: <8c2c436d404bca00617614d08e9720c1@t-5.eu>
 <63ab2fb7-15a8-f807-4a2f-04ce53f3f168@suse.com>
 <165d2939520.27fe.1e2eed663022c8efc8eff86f8ee324b8@t-5.eu>
 <7956cebe-3227-f153-6f0e-be272abe2c61@suse.com>
 <165d2c48478.27fe.1e2eed663022c8efc8eff86f8ee324b8@t-5.eu>
 <91b2f76b-5b1c-6df3-ac8c-058696f27788@suse.com>
 <165d2e8e110.27fe.1e2eed663022c8efc8eff86f8ee324b8@t-5.eu>
 <322267c4-5671-73f3-acca-797dd6fe3572@suse.com>
 <5c0b3895-05c8-63b1-195a-316d3eaca072@suse.com>
From: Chris Murphy
Date: Thu, 13 Sep 2018 10:22:50 -0600
Message-ID:
Subject: Re: btrfs send hangs after partial transfer and blocks all IO
To: Btrfs BTRFS
Cc: Jürgen Herrmann, Nikolay Borisov
Content-Type: text/plain; charset="UTF-8"
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

(resend to all)

On Thu, Sep 13, 2018 at 9:44 AM, Nikolay Borisov wrote:
>
>
> On 13.09.2018 18:30, Chris Murphy wrote:
>> This is the 2nd or 3rd thread containing hanging btrfs send, with
>> kernel 4.18.x. The subject of one is "btrfs send hung in pipe_wait"
>> and the other I can't find at the moment. In that case though the hang
>> is reproducible in 4.14.x and weirdly it only happens when a snapshot
>> contains (perhaps many) reflinks. Scrub and check lowmem find nothing
>> wrong.
>>
>> I have snapshots with a few reflinks (cp --reflink and also
>> deduplication), and I see maybe 15-30 second hangs where nothing is
>> apparently happening (in top or iotop), but I'm also not seeing any
>> blocked tasks or high CPU usage.
>> Perhaps in my case it's just
>> recovering quickly.
>>
>> Are there any kernel config options in "# Debug Lockups and Hangs"
>> that might hint at what's going on? Some of these are enabled in
>> Fedora debug kernels, which are built practically daily, e.g. right
>> now the latest in the build system is 4.19.0-0.rc3.git2.1 - which
>> translates to git 54eda9df17f3.
>
> If it's a lock-related problem then you need Lock Debugging => Lock
> debugging: prove locking correctness

OK looks like that's under a different section as CONFIG_PROVE_LOCKING
which is enabled on Fedora debug kernels.

# Debug Lockups and Hangs
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
CONFIG_HARDLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=0

# Lock Debugging (spinlocks, mutexes, etc...)
CONFIG_LOCK_DEBUGGING_SUPPORT=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCK_STAT=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_LOCKDEP=y
# CONFIG_DEBUG_LOCKDEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
CONFIG_LOCK_TORTURE_TEST=m

-- 
Chris Murphy
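
For anyone wanting to check whether another kernel has the same lockup and lock-debugging options enabled, the following is a minimal sketch, not a definitive tool. It parses kernel config text in the format quoted above and reports the state of a few of the options named in this message. The names `parse_kconfig`, `report` and `WANTED` are hypothetical and introduced only for illustration. The `/boot/config-<release>` path is an assumption: it matches where Fedora installs the config, but other distributions may store it elsewhere.

```python
import os
import platform
import re

# Options of interest, taken from the config listing in this message.
WANTED = [
    "CONFIG_LOCKUP_DETECTOR",
    "CONFIG_SOFTLOCKUP_DETECTOR",
    "CONFIG_HARDLOCKUP_DETECTOR",
    "CONFIG_PROVE_LOCKING",
    "CONFIG_LOCKDEP",
    "CONFIG_DEBUG_LOCKDEP",
]

def parse_kconfig(text):
    """Map option name -> value ('y', 'm', a literal), or 'n' if 'is not set'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.match(r"# (CONFIG_\w+) is not set$", line)
        if unset:
            opts[unset.group(1)] = "n"
            continue
        assigned = re.match(r"(CONFIG_\w+)=(.*)$", line)
        if assigned:
            opts[assigned.group(1)] = assigned.group(2)
    return opts

def report(text, wanted=WANTED):
    """Return the state of each wanted option; 'missing' if not mentioned at all."""
    opts = parse_kconfig(text)
    return {name: opts.get(name, "missing") for name in wanted}

if __name__ == "__main__":
    # Assumed location of the running kernel's config; adjust as needed.
    path = "/boot/config-" + platform.release()
    if os.path.exists(path):
        try:
            with open(path) as f:
                for name, state in report(f.read()).items():
                    print(f"{name}: {state}")
        except OSError as err:
            print(f"could not read {path}: {err}")
    else:
        print(f"{path} not found; pass the config text to report() directly")
```

An option reported as "missing" is absent from the file altogether, which usually means its dependencies are not enabled, whereas "n" means the file explicitly lists it as not set.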