All of lore.kernel.org
 help / color / mirror / Atom feed
* non-atomic xattr replacement in btrfs => rsync random errors
@ 2014-11-07  0:37 Alexandre Oliva
  2014-11-07  1:39 ` Alexandre Oliva
  2014-11-07 11:21 ` Filipe David Manana
  0 siblings, 2 replies; 4+ messages in thread
From: Alexandre Oliva @ 2014-11-07  0:37 UTC (permalink / raw)
  To: linux-btrfs, rsync

A few days ago, I started using rsync batches to archive old copies of
ceph OSD snapshots for certain kinds of disaster recovery.  This seems
to exercise an unexpected race condition in rsync, which happens to
expose what appears to be a race condition in btrfs, causing random
scary but harmless errors when replaying the rsync batches.


strace has revealed that the two rsync processes running concurrently to
apply the batch both attempt to access xattrs of the same directory
concurrently.  I understand rsync is supposed to avoid this, but
something's going wrong with that.  Here's the smoking gun, snipped from
strace -p 27251 -p 27253 -o smoking.gun, where both processes are
started from a single rsync --read-batch=- -aHAX --del ... run:

0: 27251 stat("osd/0.6ed_head/DIR_D/DIR_E/DIR_6",  <unfinished ...>
1: 27253 stat("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", {st_mode=S_IFDIR|0755, st_size=5470, ...}) = 0
2: 27251 <... stat resumed> {st_mode=S_IFDIR|0755, st_size=5470, ...}) = 0
3: 27253 llistxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", "user.cephos.phash.contents\0", 1024) = 27
4: 27251 llistxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6",  <unfinished ...>
5: 27253 lsetxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", "user.cephos.phash.contents", "\x01F\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00\x03\x00\x00", 17, 0 <unfinished ...>
6: 27251 <... llistxattr resumed> "user.cephos.phash.contents\0", 1024) = 27
7: 27251 lgetxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", "user.cephos.phash.contents", 0x0, 0) = -1 ENODATA (No data available)
8: 27253 <... lsetxattr resumed> )         = 0
9: 27253 utimensat(AT_FDCWD, "osd/0.6ed_head/DIR_D/DIR_E/DIR_6", {UTIME_NOW, {1407992261, 0}}, AT_SYMLINK_NOFOLLOW) = 0
a: 27251 write(2, "rsync: get_xattr_data: lgetxattr"..., 181) = 181

lines 0-2, 3-6 and 5-8, show concurrent access of both rsync processes
to the same directory.  This wouldn't be a problem, not even for
replaying batches, for the lsetxattr would put the intended xattr value
in there regardless of whether the scanner saw the xattr value before or
after that.


What makes the problem visible is that btrfs appears to have a race in
its handling of xattr replacement, leaving a window between the removal
of the old value and the insertion of the new one, as shown by lines
5-8.  line 3 show the attribute existed before, and lines 5-8 show it
disappears in line 7, while lsetxattr still runs to replace it.

If rsync tries hard enough to hit this window, the lgetxattr concurrent
to the lsetxattr eventually hits, and then rsync reports an error:

rsync: get_xattr_data: lgetxattr(""/media/px/snapshots/cluster/20141102-to-20140816/osd/0.6ed_head/DIR_D/DIR_E/DIR_6"","user.cephos.phash.contents",0) failed: No data available (61)

At the end, it exits with a nonzero status, even though nothing really
wrong went on and the tree ended up looking just as it was supposed to.


Now, I'm a bit concerned because the btrfs race condition, if exercised
on security-related xattrs or ACLs, could cause data to become visible
that shouldn't, which could turn this into a locally exploitable
security issue.  Sure enough nobody goes nuts repeatedly changing the
ACLs of a dir or file containing information that should be guarded by
it, so as to increase the likelihood that an attacker succeeds in
accessing the data, but still...  I don't think the temporary removal of
the xattr for subsequent insertion should be visible at all.

I'm sorry for reporting a potential security issue like that, but by the
time it occurred to me that it might have potential security
implications, I'd already mentioned the problem on #btrfs at FreeNode,
so the horse was out of the barn already :-(


I hope this helps,

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: non-atomic xattr replacement in btrfs => rsync random errors
  2014-11-07  0:37 non-atomic xattr replacement in btrfs => rsync random errors Alexandre Oliva
@ 2014-11-07  1:39 ` Alexandre Oliva
  2014-11-07  9:29   ` Holger Hoffstätte
  2014-11-07 11:21 ` Filipe David Manana
  1 sibling, 1 reply; 4+ messages in thread
From: Alexandre Oliva @ 2014-11-07  1:39 UTC (permalink / raw)
  To: linux-btrfs

[dropping rsync@lists.samba.org, it rejects posts from non-subscribers;
 refer to https://bugzilla.samba.org/show_bug.cgi?id=10925 instead]

On Nov  6, 2014, Alexandre Oliva <oliva@gnu.org> wrote:

> What makes the problem visible is that btrfs appears to have a race in
> its handling of xattr replacement, leaving a window between the removal
> of the old value and the insertion of the new one

The bugs described above occurred with rsync-3.1.0-5.fc20.x86_64 and
kernel-libre-3.16.7-200.fc20.gnu.x86_64.  The btrfs code in kernel-libre
is unchanged from the corresponding Fedora kernel.  The distro is BLAG
200k/x86_64, under development.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: non-atomic xattr replacement in btrfs => rsync random errors
  2014-11-07  1:39 ` Alexandre Oliva
@ 2014-11-07  9:29   ` Holger Hoffstätte
  0 siblings, 0 replies; 4+ messages in thread
From: Holger Hoffstätte @ 2014-11-07  9:29 UTC (permalink / raw)
  To: linux-btrfs

On Thu, 06 Nov 2014 23:39:02 -0200, Alexandre Oliva wrote:

> [dropping rsync@lists.samba.org, it rejects posts from non-subscribers;
>  refer to https://bugzilla.samba.org/show_bug.cgi?id=10925 instead]
> 
> On Nov  6, 2014, Alexandre Oliva <oliva@gnu.org> wrote:
> 
>> What makes the problem visible is that btrfs appears to have a race in
>> its handling of xattr replacement, leaving a window between the removal
>> of the old value and the insertion of the new one
> 
> The bugs described above occurred with rsync-3.1.0-5.fc20.x86_64 and
> kernel-libre-3.16.7-200.fc20.gnu.x86_64.  The btrfs code in kernel-libre
> is unchanged from the corresponding Fedora kernel.  The distro is BLAG
> 200k/x86_64, under development.

Just a shot in the dark - could this be related to commit:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b0d5d10f41a0f1cd839408dd94427f2db3553bca

I don't think that made it into 3.16.x.

-h


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: non-atomic xattr replacement in btrfs => rsync random errors
  2014-11-07  0:37 non-atomic xattr replacement in btrfs => rsync random errors Alexandre Oliva
  2014-11-07  1:39 ` Alexandre Oliva
@ 2014-11-07 11:21 ` Filipe David Manana
  1 sibling, 0 replies; 4+ messages in thread
From: Filipe David Manana @ 2014-11-07 11:21 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: linux-btrfs@vger.kernel.org

On Fri, Nov 7, 2014 at 12:37 AM, Alexandre Oliva <oliva@gnu.org> wrote:
> A few days ago, I started using rsync batches to archive old copies of
> ceph OSD snapshots for certain kinds of disaster recovery.  This seems
> to exercise an unexpected race condition in rsync, which happens to
> expose what appears to be a race condition in btrfs, causing random
> scary but harmless errors when replaying the rsync batches.
>
>
> strace has revealed that the two rsync processes running concurrently to
> apply the batch both attempt to access xattrs of the same directory
> concurrently.  I understand rsync is supposed to avoid this, but
> something's going wrong with that.  Here's the smoking gun, snipped from
> strace -p 27251 -p 27253 -o smoking.gun, where both processes are
> started from a single rsync --read-batch=- -aHAX --del ... run:
>
> 0: 27251 stat("osd/0.6ed_head/DIR_D/DIR_E/DIR_6",  <unfinished ...>
> 1: 27253 stat("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", {st_mode=S_IFDIR|0755, st_size=5470, ...}) = 0
> 2: 27251 <... stat resumed> {st_mode=S_IFDIR|0755, st_size=5470, ...}) = 0
> 3: 27253 llistxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", "user.cephos.phash.contents\0", 1024) = 27
> 4: 27251 llistxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6",  <unfinished ...>
> 5: 27253 lsetxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", "user.cephos.phash.contents", "\x01F\x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x00\x03\x00\x00", 17, 0 <unfinished ...>
> 6: 27251 <... llistxattr resumed> "user.cephos.phash.contents\0", 1024) = 27
> 7: 27251 lgetxattr("osd/0.6ed_head/DIR_D/DIR_E/DIR_6", "user.cephos.phash.contents", 0x0, 0) = -1 ENODATA (No data available)
> 8: 27253 <... lsetxattr resumed> )         = 0
> 9: 27253 utimensat(AT_FDCWD, "osd/0.6ed_head/DIR_D/DIR_E/DIR_6", {UTIME_NOW, {1407992261, 0}}, AT_SYMLINK_NOFOLLOW) = 0
> a: 27251 write(2, "rsync: get_xattr_data: lgetxattr"..., 181) = 181
>
> lines 0-2, 3-6 and 5-8, show concurrent access of both rsync processes
> to the same directory.  This wouldn't be a problem, not even for
> replaying batches, for the lsetxattr would put the intended xattr value
> in there regardless of whether the scanner saw the xattr value before or
> after that.
>
>
> What makes the problem visible is that btrfs appears to have a race in
> its handling of xattr replacement, leaving a window between the removal
> of the old value and the insertion of the new one, as shown by lines
> 5-8.  line 3 show the attribute existed before, and lines 5-8 show it
> disappears in line 7, while lsetxattr still runs to replace it.
>
> If rsync tries hard enough to hit this window, the lgetxattr concurrent
> to the lsetxattr eventually hits, and then rsync reports an error:
>
> rsync: get_xattr_data: lgetxattr(""/media/px/snapshots/cluster/20141102-to-20140816/osd/0.6ed_head/DIR_D/DIR_E/DIR_6"","user.cephos.phash.contents",0) failed: No data available (61)
>
> At the end, it exits with a nonzero status, even though nothing really
> wrong went on and the tree ended up looking just as it was supposed to.
>
>
> Now, I'm a bit concerned because the btrfs race condition, if exercised
> on security-related xattrs or ACLs, could cause data to become visible
> that shouldn't, which could turn this into a locally exploitable
> security issue.  Sure enough nobody goes nuts repeatedly changing the
> ACLs of a dir or file containing information that should be guarded by
> it, so as to increase the likelihood that an attacker succeeds in
> accessing the data, but still...  I don't think the temporary removal of
> the xattr for subsequent insertion should be visible at all.

Indeed, the xattr replace code is not atomic in btrfs.
I'll send a fix soon.

thanks

>
> I'm sorry for reporting a potential security issue like that, but by the
> time it occurred to me that it might have potential security
> implications, I'd already mentioned the problem on #btrfs at FreeNode,
> so the horse was out of the barn already :-(
>
>
> I hope this helps,
>
> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-11-07 11:21 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-07  0:37 non-atomic xattr replacement in btrfs => rsync random errors Alexandre Oliva
2014-11-07  1:39 ` Alexandre Oliva
2014-11-07  9:29   ` Holger Hoffstätte
2014-11-07 11:21 ` Filipe David Manana

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.