From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from plane.gmane.org ([80.91.229.3]:33438 "EHLO plane.gmane.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751021AbbEYPdu (ORCPT ); Mon, 25 May 2015 11:33:50 -0400
Received: from list by plane.gmane.org with local (Exim 4.69)
	(envelope-from ) id 1YwuNs-0007ft-SU
	for linux-btrfs@vger.kernel.org; Mon, 25 May 2015 17:33:44 +0200
Received: from ip18864262.dynamic.kabel-deutschland.de ([24.134.66.98])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00 for ; Mon, 25 May 2015 17:33:44 +0200
Received: from hurikhan77 by ip18864262.dynamic.kabel-deutschland.de
	with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00 for ; Mon, 25 May 2015 17:33:44 +0200
To: linux-btrfs@vger.kernel.org
From: Kai Krakow
Subject: booting btrfs RAID with dracut/systemd results in open_ctree failed
Date: Mon, 25 May 2015 17:33:22 +0200
Message-ID: <2vrb3c-fsb.ln1@hurikhan77.spdns.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"
Cc: systemd-devel@lists.freedesktop.org, linux-bcache@vger.kernel.org
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

Hi!

I need to boot with dracut to get my btrfs root partition properly initialized (because it is a multi-device btrfs). Today, after upgrading to systemd v220, I tracked a booting issue down to what looks like a general problem with the btrfs udev rules distributed with systemd:

If I drop down to an emergency shell through rd.break=pre-mount and try to mount sysroot, I get the errors "open_ctree failed" and "BTRFS: failed to read the system array". This generally happens when the btrfs devices have not been probed yet.

So I looked into the dracut sources and found that it brings its own udev rule which properly does this. The caveat, however, is: if it already finds a udev rule for btrfs, it won't install its own rule.
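To see whether that caveat applies on a given system, one can check whether a btrfs udev rule is already present in a rules directory. The following is only a minimal sketch: the helper name has_btrfs_rule and the "*btrfs*.rules" file-name pattern are assumptions made for illustration, and dracut's real detection logic may differ.

```shell
#!/bin/sh
# Hypothetical helper (not part of dracut or systemd): succeed if the given
# udev rules directory already contains a rule file whose name mentions btrfs.
# $1: path to a udev rules directory
has_btrfs_rule() {
    for rule in "$1"/*btrfs*.rules; do
        # With no match the glob stays literal, so test that the file exists.
        [ -e "$rule" ] && return 0
    done
    return 1
}
```

For example, "has_btrfs_rule /usr/lib/udev/rules.d && echo present" could be run on the host, or against the rules directory of an unpacked initramfs.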
The rule in question is:

$ cat 64-btrfs.rules
# do not edit this file, it will be overwritten on update

SUBSYSTEM!="block", GOTO="btrfs_end"
ACTION=="remove", GOTO="btrfs_end"
ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"

# let the kernel know about this btrfs filesystem, and check if it is complete
IMPORT{builtin}="btrfs ready $devnode"

# mark the device as not ready to be used by the system
ENV{ID_BTRFS_READY}=="0", ENV{SYSTEMD_READY}="0"

LABEL="btrfs_end"

It comes distributed with systemd, so I believe this is a systemd issue.

I fixed it by adding the following work-around:

/usr/lib/dracut/modules.d/99btrfs-device-scan/btrfs_device_scan.sh:

#!/bin/sh

type getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh

info "Scanning for all btrfs devices"
/sbin/btrfs device scan >/dev/null 2>&1

/usr/lib/dracut/modules.d/99btrfs-device-scan/module-setup.sh:

#!/bin/bash

# called by dracut
check() {
    local _rootdev
    # if we don't have btrfs installed on the host system,
    # no point in trying to support it in the initramfs.
    require_binaries btrfs || return 1

    [[ $hostonly ]] || [[ $mount_needs ]] && {
        for fs in "${host_fs_types[@]}"; do
            [[ "$fs" == "btrfs" ]] && return 0
        done
        return 255
    }

    return 0
}

# called by dracut
depends() {
    echo btrfs
    return 0
}

# called by dracut
install() {
    inst_hook pre-mount 99 "$moddir/btrfs_device_scan.sh"
}

This issues an explicit "btrfs device scan" in the pre-mount hook. However, looking at systemd's udev rules for btrfs, they should accomplish more or less the same. So something is buggy or racy there.

I noticed that only one of the following lines appeared in dmesg when the problem was present:

[    5.514318] BTRFS: device label system devid 5 transid 2779055 /dev/bcache2
[    5.514422] BTRFS: device label system devid 6 transid 2779055 /dev/bcache1
[    5.514521] BTRFS: device label system devid 4 transid 2779055 /dev/bcache0

Without my "fix", only one line showed up in the log, probably exactly at mount time when systemd's sysroot.mount unit started.
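The dmesg observation above can be turned into a quick check. The following is a rough sketch, not anything shipped by dracut or systemd: the function name count_btrfs_devices is made up for illustration, and it assumes kernel log lines of exactly the "BTRFS: device label <label> devid <n> ..." form quoted above, with a label that contains no spaces.

```shell
#!/bin/sh
# Hypothetical helper: count the distinct devids the kernel has announced
# for one btrfs filesystem label. Reads dmesg-style text on stdin.
# $1: filesystem label (assumed to contain no whitespace)
count_btrfs_devices() {
    awk -v label="$1" '
        /BTRFS: device label / {
            # Locate the "label <name> devid <n>" run, wherever it sits
            # after the timestamp, and remember the devid.
            for (i = 1; i <= NF; i++) {
                if ($i == "label" && $(i + 1) == label && $(i + 2) == "devid") {
                    seen[$(i + 3)] = 1
                }
            }
        }
        END {
            n = 0
            for (d in seen) n++
            print n
        }
    '
}
```

It could be used as "dmesg | count_btrfs_devices system" from the emergency shell. On the three-device setup described here, a result lower than 3 before mounting would match the failure case where only one device line shows up.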
It wasn't always the same line, though. With v219 I only sometimes had this problem, and a reboot usually fixed it. This supports my theory that the rule is racy somewhere, especially around the line ENV{ID_BTRFS_READY}=="0", ENV{SYSTEMD_READY}="0".

My btrfs setup looks like this:

Overall:
    Device size:                   2.71TiB
    Device allocated:              1.85TiB
    Device unallocated:          880.47GiB
    Device missing:                  0.00B
    Used:                          1.30TiB
    Free (estimated):              1.41TiB      (min: 1003.50GiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,RAID0: Size:1.84TiB, Used:1.29TiB
   /dev/bcache0  628.00GiB
   /dev/bcache1  628.00GiB
   /dev/bcache2  628.00GiB

Metadata,RAID1: Size:6.00GiB, Used:4.21GiB
   /dev/bcache0    4.00GiB
   /dev/bcache1    4.00GiB
   /dev/bcache2    4.00GiB

System,RAID1: Size:32.00MiB, Used:120.00KiB
   /dev/bcache0   32.00MiB
   /dev/bcache2   32.00MiB

Unallocated:
   /dev/bcache0  293.48GiB
   /dev/bcache1  293.51GiB
   /dev/bcache2  293.48GiB

Dracut is v041, systemd is v220, kernel is 4.0.4, cmdline is:

root=/dev/bcache0 ro snd_hda_intel.enable_msi=1 rootfstype=btrfs rootflags=compress=lzo zswap.enabled=1 splash quiet

It may be worth noting that I'm using bcache, whose udev rules may interfere with those for btrfs.

CC'ing bcache-devel and btrfs-devel just in case, f'up btrfs-devel.

-- 
Replies to list only preferred.