From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Alexandre Oliva <oliva@gnu.org>,
Filipe Manana <fdmanana@suse.com>, Chris Mason <clm@fb.com>
Subject: [PATCH 3.14 12/34] Btrfs: make xattr replace operations atomic
Date: Wed, 1 Jul 2015 11:40:19 -0700 [thread overview]
Message-ID: <20150701183955.782177086@linuxfoundation.org> (raw)
In-Reply-To: <20150701183955.306219425@linuxfoundation.org>
3.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana <fdmanana@suse.com>
commit 5f5bc6b1e2d5a6f827bc860ef2dc5b6f365d1339 upstream.
Replacing a xattr consists of doing a lookup for its existing value, delete
the current value from the respective leaf, release the search path and then
finally insert the new value. This leaves a time window where readers (getxattr,
listxattrs) won't see any value for the xattr. Xattrs are used to store ACLs,
so this has security implications.
This change also fixes 2 other existing issues which were:
*) Deleting the old xattr value without verifying first if the new xattr will
fit in the existing leaf item (in case multiple xattrs are packed in the
same item due to name hash collision);
*) Returning -EEXIST when the flag XATTR_CREATE is given and the xattr doesn't
exist but we have have an existing item that packs muliple xattrs with
the same name hash as the input xattr. In this case we should return ENOSPC.
A test case for xfstests follows soon.
Thanks to Alexandre Oliva for reporting the non-atomicity of the xattr replace
implementation.
Reported-by: Alexandre Oliva <oliva@gnu.org>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/btrfs/ctree.c | 2
fs/btrfs/ctree.h | 5 +
fs/btrfs/dir-item.c | 10 +--
fs/btrfs/xattr.c | 150 ++++++++++++++++++++++++++++++++--------------------
4 files changed, 102 insertions(+), 65 deletions(-)
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2963,7 +2963,7 @@ done:
*/
if (!p->leave_spinning)
btrfs_set_path_blocking(p);
- if (ret < 0)
+ if (ret < 0 && !p->skip_release_on_error)
btrfs_release_path(p);
return ret;
}
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -608,6 +608,7 @@ struct btrfs_path {
unsigned int skip_locking:1;
unsigned int leave_spinning:1;
unsigned int search_commit_root:1;
+ unsigned int skip_release_on_error:1;
};
/*
@@ -3609,6 +3610,10 @@ struct btrfs_dir_item *btrfs_lookup_xatt
int verify_dir_item(struct btrfs_root *root,
struct extent_buffer *leaf,
struct btrfs_dir_item *dir_item);
+struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_root *root,
+ struct btrfs_path *path,
+ const char *name,
+ int name_len);
/* orphan.c */
int btrfs_insert_orphan_item(struct btrfs_trans_handle *trans,
--- a/fs/btrfs/dir-item.c
+++ b/fs/btrfs/dir-item.c
@@ -21,10 +21,6 @@
#include "hash.h"
#include "transaction.h"
-static struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_root *root,
- struct btrfs_path *path,
- const char *name, int name_len);
-
/*
* insert a name into a directory, doing overflow properly if there is a hash
* collision. data_size indicates how big the item inserted should be. On
@@ -383,9 +379,9 @@ struct btrfs_dir_item *btrfs_lookup_xatt
* this walks through all the entries in a dir item and finds one
* for a specific name.
*/
-static struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_root *root,
- struct btrfs_path *path,
- const char *name, int name_len)
+struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_root *root,
+ struct btrfs_path *path,
+ const char *name, int name_len)
{
struct btrfs_dir_item *dir_item;
unsigned long name_ptr;
--- a/fs/btrfs/xattr.c
+++ b/fs/btrfs/xattr.c
@@ -29,6 +29,7 @@
#include "xattr.h"
#include "disk-io.h"
#include "props.h"
+#include "locking.h"
ssize_t __btrfs_getxattr(struct inode *inode, const char *name,
@@ -91,7 +92,7 @@ static int do_setxattr(struct btrfs_tran
struct inode *inode, const char *name,
const void *value, size_t size, int flags)
{
- struct btrfs_dir_item *di;
+ struct btrfs_dir_item *di = NULL;
struct btrfs_root *root = BTRFS_I(inode)->root;
struct btrfs_path *path;
size_t name_len = strlen(name);
@@ -103,84 +104,119 @@ static int do_setxattr(struct btrfs_tran
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
+ path->skip_release_on_error = 1;
+
+ if (!value) {
+ di = btrfs_lookup_xattr(trans, root, path, btrfs_ino(inode),
+ name, name_len, -1);
+ if (!di && (flags & XATTR_REPLACE))
+ ret = -ENODATA;
+ else if (di)
+ ret = btrfs_delete_one_dir_name(trans, root, path, di);
+ goto out;
+ }
+ /*
+ * For a replace we can't just do the insert blindly.
+ * Do a lookup first (read-only btrfs_search_slot), and return if xattr
+ * doesn't exist. If it exists, fall down below to the insert/replace
+ * path - we can't race with a concurrent xattr delete, because the VFS
+ * locks the inode's i_mutex before calling setxattr or removexattr.
+ */
if (flags & XATTR_REPLACE) {
- di = btrfs_lookup_xattr(trans, root, path, btrfs_ino(inode), name,
- name_len, -1);
- if (IS_ERR(di)) {
- ret = PTR_ERR(di);
- goto out;
- } else if (!di) {
+ ASSERT(mutex_is_locked(&inode->i_mutex));
+ di = btrfs_lookup_xattr(NULL, root, path, btrfs_ino(inode),
+ name, name_len, 0);
+ if (!di) {
ret = -ENODATA;
goto out;
}
- ret = btrfs_delete_one_dir_name(trans, root, path, di);
- if (ret)
- goto out;
btrfs_release_path(path);
+ di = NULL;
+ }
+ ret = btrfs_insert_xattr_item(trans, root, path, btrfs_ino(inode),
+ name, name_len, value, size);
+ if (ret == -EOVERFLOW) {
/*
- * remove the attribute
+ * We have an existing item in a leaf, split_leaf couldn't
+ * expand it. That item might have or not a dir_item that
+ * matches our target xattr, so lets check.
*/
- if (!value)
- goto out;
- } else {
- di = btrfs_lookup_xattr(NULL, root, path, btrfs_ino(inode),
- name, name_len, 0);
- if (IS_ERR(di)) {
- ret = PTR_ERR(di);
+ ret = 0;
+ btrfs_assert_tree_locked(path->nodes[0]);
+ di = btrfs_match_dir_item_name(root, path, name, name_len);
+ if (!di && !(flags & XATTR_REPLACE)) {
+ ret = -ENOSPC;
goto out;
}
- if (!di && !value)
- goto out;
- btrfs_release_path(path);
+ } else if (ret == -EEXIST) {
+ ret = 0;
+ di = btrfs_match_dir_item_name(root, path, name, name_len);
+ ASSERT(di); /* logic error */
+ } else if (ret) {
+ goto out;
}
-again:
- ret = btrfs_insert_xattr_item(trans, root, path, btrfs_ino(inode),
- name, name_len, value, size);
- /*
- * If we're setting an xattr to a new value but the new value is say
- * exactly BTRFS_MAX_XATTR_SIZE, we could end up with EOVERFLOW getting
- * back from split_leaf. This is because it thinks we'll be extending
- * the existing item size, but we're asking for enough space to add the
- * item itself. So if we get EOVERFLOW just set ret to EEXIST and let
- * the rest of the function figure it out.
- */
- if (ret == -EOVERFLOW)
+ if (di && (flags & XATTR_CREATE)) {
ret = -EEXIST;
+ goto out;
+ }
- if (ret == -EEXIST) {
- if (flags & XATTR_CREATE)
- goto out;
+ if (di) {
/*
- * We can't use the path we already have since we won't have the
- * proper locking for a delete, so release the path and
- * re-lookup to delete the thing.
+ * We're doing a replace, and it must be atomic, that is, at
+ * any point in time we have either the old or the new xattr
+ * value in the tree. We don't want readers (getxattr and
+ * listxattrs) to miss a value, this is specially important
+ * for ACLs.
*/
- btrfs_release_path(path);
- di = btrfs_lookup_xattr(trans, root, path, btrfs_ino(inode),
- name, name_len, -1);
- if (IS_ERR(di)) {
- ret = PTR_ERR(di);
- goto out;
- } else if (!di) {
- /* Shouldn't happen but just in case... */
- btrfs_release_path(path);
- goto again;
+ const int slot = path->slots[0];
+ struct extent_buffer *leaf = path->nodes[0];
+ const u16 old_data_len = btrfs_dir_data_len(leaf, di);
+ const u32 item_size = btrfs_item_size_nr(leaf, slot);
+ const u32 data_size = sizeof(*di) + name_len + size;
+ struct btrfs_item *item;
+ unsigned long data_ptr;
+ char *ptr;
+
+ if (size > old_data_len) {
+ if (btrfs_leaf_free_space(root, leaf) <
+ (size - old_data_len)) {
+ ret = -ENOSPC;
+ goto out;
+ }
}
- ret = btrfs_delete_one_dir_name(trans, root, path, di);
- if (ret)
- goto out;
+ if (old_data_len + name_len + sizeof(*di) == item_size) {
+ /* No other xattrs packed in the same leaf item. */
+ if (size > old_data_len)
+ btrfs_extend_item(root, path,
+ size - old_data_len);
+ else if (size < old_data_len)
+ btrfs_truncate_item(root, path, data_size, 1);
+ } else {
+ /* There are other xattrs packed in the same item. */
+ ret = btrfs_delete_one_dir_name(trans, root, path, di);
+ if (ret)
+ goto out;
+ btrfs_extend_item(root, path, data_size);
+ }
+ item = btrfs_item_nr(slot);
+ ptr = btrfs_item_ptr(leaf, slot, char);
+ ptr += btrfs_item_size(leaf, item) - data_size;
+ di = (struct btrfs_dir_item *)ptr;
+ btrfs_set_dir_data_len(leaf, di, size);
+ data_ptr = ((unsigned long)(di + 1)) + name_len;
+ write_extent_buffer(leaf, value, data_ptr, size);
+ btrfs_mark_buffer_dirty(leaf);
+ } else {
/*
- * We have a value to set, so go back and try to insert it now.
+ * Insert, and we had space for the xattr, so path->slots[0] is
+ * where our xattr dir_item is and btrfs_insert_xattr_item()
+ * filled it.
*/
- if (value) {
- btrfs_release_path(path);
- goto again;
- }
}
out:
btrfs_free_path(path);
next prev parent reply other threads:[~2015-07-01 18:47 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-01 18:40 [PATCH 3.14 00/34] 3.14.47-stable review Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 01/34] arm64: dma-mapping: always clear allocated buffers Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 02/34] kprobes/x86: Return correct length in __copy_instruction() Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 03/34] config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 05/34] sb_edac: Fix erroneous bytes->gigabytes conversion Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 06/34] hpsa: refine the pci enable/disable handling Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 07/34] netfilter: Zero the tuple in nfnl_cthelper_parse_tuple() Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 08/34] netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 09/34] netfilter: nf_tables: allow to change chain policy without hook if it exists Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 10/34] hpsa: add missing pci_set_master in kdump path Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 11/34] x86/microcode/intel: Guard against stack overflow in the loader Greg Kroah-Hartman
2015-07-01 18:40 ` Greg Kroah-Hartman [this message]
2015-07-01 18:40 ` [PATCH 3.14 13/34] net/mlx4_en: Dont attempt to TX offload the outer UDP checksum for VXLAN Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 14/34] splice: Apply generic position and size checks to each write Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 15/34] ARM: clk-imx6q: refine satas parent Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 16/34] KVM: nSVM: Check for NRIPS support before updating control field Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 17/34] bus: mvebu: pass the coherency availability information at init time Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 18/34] ARM/arm64: KVM: fix use of WnR bit in kvm_is_write_fault() Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 19/34] KVM: ARM: vgic: plug irq injection race Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 20/34] arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 21/34] arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 22/34] arm: kvm: fix CPU hotplug Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 23/34] arm/arm64: KVM: fix potential NULL dereference in user_mem_abort() Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 24/34] arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 25/34] arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 26/34] arm64: KVM: fix unmapping with 48-bit VAs Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 27/34] arm/arm64: KVM: vgic: Fix error code in kvm_vgic_create() Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 28/34] arm64/kvm: Fix assembler compatibility of macros Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 29/34] arm/arm64: kvm: drop inappropriate use of kvm_is_mmio_pfn() Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 30/34] arm/arm64: KVM: Dont clear the VCPU_POWER_OFF flag Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 31/34] arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 32/34] arm/arm64: KVM: Reset the HCR on each vcpu when resetting the vcpu Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 33/34] arm/arm64: KVM: Introduce stage2_unmap_vm Greg Kroah-Hartman
2015-07-01 18:40 ` [PATCH 3.14 34/34] arm/arm64: KVM: Dont allow creating VCPUs after vgic_initialized Greg Kroah-Hartman
2015-07-01 22:35 ` [PATCH 3.14 00/34] 3.14.47-stable review Shuah Khan
2015-07-02 2:19 ` Guenter Roeck
2015-07-02 4:30 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150701183955.782177086@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=clm@fb.com \
--cc=fdmanana@suse.com \
--cc=linux-kernel@vger.kernel.org \
--cc=oliva@gnu.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox