From: David Sterba <dsterba@suse.cz>
To: dsterba@suse.cz, Josef Bacik <josef@toxicpanda.com>,
	Chris Murphy <chris@colorremedies.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	David Sterba <dsterba@suse.com>
Subject: Re: 5.5.0-0.rc1 hang, could be zstd compression related
Date: Wed, 11 Dec 2019 16:59:31 +0100
Message-ID: <20191211155931.GQ3929@twin.jikos.cz>
In-Reply-To: <20191211155553.GP3929@twin.jikos.cz>

On Wed, Dec 11, 2019 at 04:55:53PM +0100, David Sterba wrote:
> On Wed, Dec 11, 2019 at 09:58:45AM -0500, Josef Bacik wrote:
> > On 12/10/19 11:00 PM, Chris Murphy wrote:
> > > I could continue to chat in one application and the desktop environment
> > > was responsive, but no shells worked, I couldn't get to a tty, and I
> > > couldn't ssh in remotely. It looks like the journal has everything up
> > > until I pressed and held down the power button.
> > > 
> > > 
> > > /dev/nvme0n1p7 on / type btrfs
> > > (rw,noatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=274,subvol=/root)
> > > 
> > > dmesg pretty
> > > https://pastebin.com/pvG3ERnd
> > > 
> > > dmesg
> > > [10224.184137] flap.local kernel: perf: interrupt took too long (2522 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
> > > [14712.698184] flap.local kernel: perf: interrupt took too long (3153 > 3152), lowering kernel.perf_event_max_sample_rate to 63000
> > > [17903.211976] flap.local kernel: Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
> > > [22877.667177] flap.local kernel: BUG: kernel NULL pointer dereference, address: 00000000000006c8
> > > [22877.667182] flap.local kernel: #PF: supervisor read access in kernel mode
> > > [22877.667184] flap.local kernel: #PF: error_code(0x0000) - not-present page
> > > [22877.667187] flap.local kernel: PGD 0 P4D 0
> > > [22877.667191] flap.local kernel: Oops: 0000 [#1] SMP PTI
> > > [22877.667194] flap.local kernel: CPU: 2 PID: 14747 Comm: kworker/u8:7 Not tainted 5.5.0-0.rc1.git0.1.fc32.x86_64+debug #1
> > > [22877.667196] flap.local kernel: Hardware name: HP HP Spectre Notebook/81A0, BIOS F.43 04/16/2019
> > > [22877.667226] flap.local kernel: Workqueue: btrfs-delalloc btrfs_work_helper [btrfs]
> > > [22877.667233] flap.local kernel: RIP: 0010:bio_associate_blkg_from_css+0x1c/0x3b0
> > 
> > This looks like the extent_map bdev cleanup thing that was supposed to be fixed.
> > Did you send the patch without the fix for it, Dave?  Thanks,
> 
> The fix for NULL bdev was added in 429aebc0a9a063667dba21 (and tested
> with cgroups v2) and it's in a different function than the one that
> appears on the stacktrace.
> 
> This seems to be another instance where the bdev is needed right after
> the bio is created but way earlier than it's actually known for real,
> yet still needed for the blkcg thing.
> 
>         bio = btrfs_bio_alloc(first_byte);
>         bio->bi_opf = REQ_OP_WRITE | write_flags;
>         bio->bi_private = cb;
>         bio->bi_end_io = end_compressed_bio_write;
>
>         if (blkcg_css) {
>                 bio->bi_opf |= REQ_CGROUP_PUNT;
>                 bio_associate_blkg_from_css(bio, blkcg_css);
>         }
> 
> Strange that it takes so long to reproduce, meaning the 'if' branch is
> not taken often.

Compile tested only:

--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -446,6 +446,7 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
        bio->bi_end_io = end_compressed_bio_write;
 
        if (blkcg_css) {
+               bio_set_dev(bio, fs_info->fs_devices->latest_bdev);
                bio->bi_opf |= REQ_CGROUP_PUNT;
                bio_associate_blkg_from_css(bio, blkcg_css);
        }
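
For illustration only, here is a small userspace model of why the ordering
matters. It assumes, as the oops suggests, that the blkcg association has to
reach the request queue through the device recorded on the bio, so a bio
without a device faults. All model_* names and types are hypothetical
stand-ins and are not the kernel's definitions; the real fix is the one-line
patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are the kernel's real types. */
struct model_queue { int id; };
struct model_disk { struct model_queue *queue; };
struct model_bio {
	struct model_disk *disk;         /* NULL until a device is assigned */
	struct model_queue *assoc_queue; /* filled in by the association step */
};

/* Plays the role of bio_set_dev(): record the target device on the bio. */
static void model_set_dev(struct model_bio *bio, struct model_disk *disk)
{
	bio->disk = disk;
}

/*
 * Plays the role of the blkcg association: it has to reach the queue
 * through the bio's device. Returns -1 where the kernel would fault.
 */
static int model_associate(struct model_bio *bio)
{
	if (bio->disk == NULL)
		return -1;
	bio->assoc_queue = bio->disk->queue;
	return 0;
}

/* Build a fresh bio and run the association, optionally setting the device first. */
int model_submit(int set_dev_first)
{
	struct model_queue q = { .id = 1 };
	struct model_disk d = { .queue = &q };
	struct model_bio bio = { NULL, NULL };

	if (set_dev_first)
		model_set_dev(&bio, &d);
	return model_associate(&bio);
}
```

In this model, model_submit(0) reports the failure case (no device set before
the association) and model_submit(1) succeeds, which mirrors the intent of
setting the device before calling bio_associate_blkg_from_css().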


