linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Filipe David Borba Manana <fdmanana@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2] Btrfs: fix btrfs boot when compiled as built-in
Date: Wed, 29 Jan 2014 19:11:35 -0800	[thread overview]
Message-ID: <20140130031135.GA8796@birch.djwong.org> (raw)
In-Reply-To: <1389493366-15264-1-git-send-email-fdmanana@gmail.com>

On Sun, Jan 12, 2014 at 02:22:46AM +0000, Filipe David Borba Manana wrote:
> After the change titled "Btrfs: add support for inode properties", if
> btrfs was built-in the kernel (i.e. not as a module), it would cause a
> kernel panic, as reported recently by Fengguang:
> 
> [    2.024722] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [    2.027814] IP: [<ffffffff81501594>] crc32c+0xc/0x6b
> [    2.028684] PGD 0
> [    2.028684] Oops: 0000 [#1] SMP
> [    2.028684] Modules linked in:
> [    2.028684] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc7-04795-ga7b57c2 #1
> [    2.028684] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    2.028684] task: ffff88000edba100 ti: ffff88000edd6000 task.ti: ffff88000edd6000
> [    2.028684] RIP: 0010:[<ffffffff81501594>]  [<ffffffff81501594>] crc32c+0xc/0x6b
> [    2.028684] RSP: 0000:ffff88000edd7e58  EFLAGS: 00010246
> [    2.028684] RAX: 0000000000000000 RBX: ffffffff82295550 RCX: 0000000000000000
> [    2.028684] RDX: 0000000000000011 RSI: ffffffff81efe393 RDI: 00000000fffffffe
> [    2.028684] RBP: ffff88000edd7e60 R08: 0000000000000003 R09: 0000000000015d20
> [    2.028684] R10: ffffffff81ef225e R11: ffffffff811b0222 R12: ffffffffffffffff
> [    2.028684] R13: 0000000000000239 R14: 0000000000000000 R15: 0000000000000000
> [    2.028684] FS:  0000000000000000(0000) GS:ffff88000fa00000(0000) knlGS:0000000000000000
> [    2.028684] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    2.028684] CR2: 0000000000000000 CR3: 000000000220c000 CR4: 00000000000006f0
> [    2.028684] Stack:
> [    2.028684]  ffffffff82295550 ffff88000edd7e80 ffffffff8238af62 ffffffff8238ac05
> [    2.028684]  0000000000000000 ffff88000edd7e98 ffffffff8238ac0f ffffffff8238ac05
> [    2.028684]  ffff88000edd7f08 ffffffff810002ba ffff88000edd7f00 ffffffff810e2404
> [    2.028684] Call Trace:
> [    2.028684]  [<ffffffff8238af62>] btrfs_props_init+0x4f/0x96
> [    2.028684]  [<ffffffff8238ac05>] ? ftrace_define_fields_btrfs_space_reservation+0x145/0x145
> [    2.028684]  [<ffffffff8238ac0f>] init_btrfs_fs+0xa/0xf0
> [    2.028684]  [<ffffffff8238ac05>] ? ftrace_define_fields_btrfs_space_reservation+0x145/0x145
> [    2.028684]  [<ffffffff810002ba>] do_one_initcall+0xa4/0x13a
> [    2.028684]  [<ffffffff810e2404>] ? parse_args+0x25f/0x33d
> [    2.028684]  [<ffffffff8234cf75>] kernel_init_freeable+0x1aa/0x230
> [    2.028684]  [<ffffffff8234c785>] ? do_early_param+0x88/0x88
> [    2.028684]  [<ffffffff819f61b5>] ? rest_init+0x89/0x89
> [    2.028684]  [<ffffffff819f61c3>] kernel_init+0xe/0x109
> 
> The issue here is that the initialization function of btrfs (super.c:init_btrfs_fs)
> started using crc32c (from lib/libcrc32c.c). But when it needs to call crc32c (as
> part of the properties initialization routine), the libcrc32c is not yet initialized,
> so crc32c derreferenced a NULL pointer (lib/libcrc32c.c:tfm), causing the kernel
> panic on boot.
> 
> The approach to fix this is to use crypto component directly to use its crc32c (which
> is basically what lib/libcrc32c.c is, a wrapper around crypto). This is what ext4 is
> doing as well, it uses crypto directly to get crc32c functionality.

Sorry, I didn't even see this until your other patch today...

Yes, ext4 bypasses libcrc32c, but notice how ext4/jbd2 only call
crypto_alloc_shash() when you mount the filesystem.  This silly trick enables
ext4 to take advantage of faster crc32c implementations that don't get loaded
until after the initial insmod, just in case you do this:

# modprobe btrfs
# mount /dev/X /		<-- / uses software crc32c implementation
# modprobe crc32c-intel
# mount /dev/Y /home		<-- /home still uses software crc32c, because
                                    *tfm still points to the sw implementation!

Of course, one could argue that this is a distro problem, and that the
initramfs or kernel config should take care to load the appropriate hardware
accelerated drivers /before/ btrfs.

On the other hand, btrfs might as well always enjoy the fastest implementation
that it can get its hands on.

(Also I suppose metadata_csum is optional in ext4/jbd2...)

--D

> 
> Verified this works both when btrfs is built-in and when it's loadable kernel module.
> 
> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
> ---
> 
> V2: Removed unnecessary __exit qualifier.
> 
>  fs/btrfs/Kconfig       |    3 ++-
>  fs/btrfs/Makefile      |    2 +-
>  fs/btrfs/extent-tree.c |    6 +++---
>  fs/btrfs/hash.c        |   49 ++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/btrfs/hash.h        |   11 ++++++++---
>  fs/btrfs/super.c       |   10 +++++++++-
>  6 files changed, 72 insertions(+), 9 deletions(-)
>  create mode 100644 fs/btrfs/hash.c
> 
> diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
> index aa976ec..a66768e 100644
> --- a/fs/btrfs/Kconfig
> +++ b/fs/btrfs/Kconfig
> @@ -1,6 +1,7 @@
>  config BTRFS_FS
>  	tristate "Btrfs filesystem support"
> -	select LIBCRC32C
> +	select CRYPTO
> +	select CRYPTO_CRC32C
>  	select ZLIB_INFLATE
>  	select ZLIB_DEFLATE
>  	select LZO_COMPRESS
> diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
> index af7f000..f341a98 100644
> --- a/fs/btrfs/Makefile
> +++ b/fs/btrfs/Makefile
> @@ -9,7 +9,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
>  	   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
>  	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
>  	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
> -	   uuid-tree.o props.o
> +	   uuid-tree.o props.o hash.o
>  
>  btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
>  btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 77acc08..db19ed7 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -1072,11 +1072,11 @@ static u64 hash_extent_data_ref(u64 root_objectid, u64 owner, u64 offset)
>  	__le64 lenum;
>  
>  	lenum = cpu_to_le64(root_objectid);
> -	high_crc = crc32c(high_crc, &lenum, sizeof(lenum));
> +	high_crc = btrfs_crc32c(high_crc, &lenum, sizeof(lenum));
>  	lenum = cpu_to_le64(owner);
> -	low_crc = crc32c(low_crc, &lenum, sizeof(lenum));
> +	low_crc = btrfs_crc32c(low_crc, &lenum, sizeof(lenum));
>  	lenum = cpu_to_le64(offset);
> -	low_crc = crc32c(low_crc, &lenum, sizeof(lenum));
> +	low_crc = btrfs_crc32c(low_crc, &lenum, sizeof(lenum));
>  
>  	return ((u64)high_crc << 31) ^ (u64)low_crc;
>  }
> diff --git a/fs/btrfs/hash.c b/fs/btrfs/hash.c
> new file mode 100644
> index 0000000..ab3a938
> --- /dev/null
> +++ b/fs/btrfs/hash.c
> @@ -0,0 +1,49 @@
> +/*
> + * Copyright (C) 2014 Filipe David Borba Manana <fdmanana@gmail.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License v2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + */
> +
> +#include <crypto/hash.h>
> +#include "hash.h"
> +
> +static struct crypto_shash *tfm;
> +
> +int __init btrfs_hash_init(void)
> +{
> +	tfm = crypto_alloc_shash("crc32c", 0, 0);
> +	if (IS_ERR(tfm))
> +		return PTR_ERR(tfm);
> +
> +	return 0;
> +}
> +
> +void btrfs_hash_exit(void)
> +{
> +	crypto_free_shash(tfm);
> +}
> +
> +u32 btrfs_crc32c(u32 crc, const void *address, unsigned int length)
> +{
> +	struct {
> +		struct shash_desc shash;
> +		char ctx[crypto_shash_descsize(tfm)];
> +	} desc;
> +	int err;
> +
> +	desc.shash.tfm = tfm;
> +	desc.shash.flags = 0;
> +	*(u32 *)desc.ctx = crc;
> +
> +	err = crypto_shash_update(&desc.shash, address, length);
> +	BUG_ON(err);
> +
> +	return *(u32 *)desc.ctx;
> +}
> diff --git a/fs/btrfs/hash.h b/fs/btrfs/hash.h
> index 1d98281..118a231 100644
> --- a/fs/btrfs/hash.h
> +++ b/fs/btrfs/hash.h
> @@ -19,10 +19,15 @@
>  #ifndef __HASH__
>  #define __HASH__
>  
> -#include <linux/crc32c.h>
> +int __init btrfs_hash_init(void);
> +
> +void btrfs_hash_exit(void);
> +
> +u32 btrfs_crc32c(u32 crc, const void *address, unsigned int length);
> +
>  static inline u64 btrfs_name_hash(const char *name, int len)
>  {
> -	return crc32c((u32)~1, name, len);
> +	return btrfs_crc32c((u32)~1, name, len);
>  }
>  
>  /*
> @@ -31,7 +36,7 @@ static inline u64 btrfs_name_hash(const char *name, int len)
>  static inline u64 btrfs_extref_hash(u64 parent_objectid, const char *name,
>  				    int len)
>  {
> -	return (u64) crc32c(parent_objectid, name, len);
> +	return (u64) btrfs_crc32c(parent_objectid, name, len);
>  }
>  
>  #endif
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 461e41c..f44cc6a 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -48,6 +48,7 @@
>  #include "transaction.h"
>  #include "btrfs_inode.h"
>  #include "print-tree.h"
> +#include "hash.h"
>  #include "props.h"
>  #include "xattr.h"
>  #include "volumes.h"
> @@ -1866,11 +1867,15 @@ static int __init init_btrfs_fs(void)
>  {
>  	int err;
>  
> +	err = btrfs_hash_init();
> +	if (err)
> +		return err;
> +
>  	btrfs_props_init();
>  
>  	err = btrfs_init_sysfs();
>  	if (err)
> -		return err;
> +		goto free_hash;
>  
>  	btrfs_init_compress();
>  
> @@ -1945,6 +1950,8 @@ free_cachep:
>  free_compress:
>  	btrfs_exit_compress();
>  	btrfs_exit_sysfs();
> +free_hash:
> +	btrfs_hash_exit();
>  	return err;
>  }
>  
> @@ -1963,6 +1970,7 @@ static void __exit exit_btrfs_fs(void)
>  	btrfs_exit_sysfs();
>  	btrfs_cleanup_fs_uuids();
>  	btrfs_exit_compress();
> +	btrfs_hash_exit();
>  }
>  
>  module_init(init_btrfs_fs)
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2014-01-30  3:11 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-11 21:27 [PATCH] Btrfs: fix btrfs boot when compiled as built-in Filipe David Borba Manana
2014-01-12  2:22 ` [PATCH v2] " Filipe David Borba Manana
2014-01-30  3:11   ` Darrick J. Wong [this message]
2014-01-30 13:18     ` Filipe David Manana
2014-01-30 14:56       ` Filipe David Manana
2014-01-30 21:56       ` Darrick J. Wong
2014-02-01 10:05         ` Ahmet Inan
2014-02-01 10:29           ` Ahmet Inan
2014-02-01 15:18             ` Filipe David Manana
2014-02-01 17:25               ` Ahmet Inan
2014-02-01 19:14                 ` Filipe David Manana
2014-02-01 19:42                   ` Ahmet Inan
2014-02-01 21:28                     ` Filipe David Manana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140130031135.GA8796@birch.djwong.org \
    --to=darrick.wong@oracle.com \
    --cc=fdmanana@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).