From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH 06/31] e2p: Fix f[gs]etflags argument size mismatch
Date: Mon, 7 Oct 2013 16:23:58 -0700 [thread overview]
Message-ID: <20131007232358.GG6860@birch.djwong.org> (raw)
In-Reply-To: <20131007204055.GF6860@birch.djwong.org>
Well, it's nastier than this. For an ioctl that only writes to userspace,
libfuse allocates an internal buffer with whatever junk is in memory at the
time. If the fuse server's ioctl handler doesn't zero the buffer before
returning, the remnants of this garbage are passed from the server back to the
kernel, which dutifully copies the garbage into the user program's memory.
Even if the result is some error code!
fuse2fs' getflags handler would write an int to the lower end of this buffer,
so garbage + flags would get sent back to userspace. In my particular case,
gcc was applying some kind of optimization to the chattr binary that places the
flags variable next to the 'name' variable in change_attributes(), and so the
return from the ioctl (through fs/fuse/file.c) would write a 64-bit value over
flags and the lower 32 bits of name. The next time anyone tried to use name,
it would get a garbage pointer and (probably) crash.
Yuck. FUSE assumes an interface contract (the data size encoded in the ioctl
number) that neither userspace nor kernel actually abide. This has gone on for
years with no problems, since both components made the same implicit assumption
about data size in the same way. Unfortunately, userspace breaks only on FUSE,
so I don't know what to do.
It's possible to adapt both e2p and FUSE servers to allocate an unsigned long
and shift the values around accordingly, but that won't help avoid a crash with
existing e2p binaries. Even if I comment out the .ioctl function pointer in
fuse2fs.c, running chattr will still crash--FUSE copies the buffer even for an
error result.
Long term I guess we could define a new pair of ioctls that work with pointers
to 64-bit values and deprecate the old ones. Or perhaps there's a better
suggestion than "don't run chattr/lsattr on a FUSE"?
--D
On Mon, Oct 07, 2013 at 01:40:55PM -0700, Darrick J. Wong wrote:
> On Mon, Oct 07, 2013 at 09:33:04AM -0400, Theodore Ts'o wrote:
> > On Mon, Sep 30, 2013 at 06:27:21PM -0700, Darrick J. Wong wrote:
> >
> > > The EXT2_IOC_[GS]ETFLAGS ioctls take longs as arguments, however
> > > this code only reserves enough storage for an int. The kernel
> > > drivers (so far) don't transfer more than an int but FUSE sees the
> > > long and assumes that it's ok to write the full size of the long,
> > > which crashes if sizeof(long) > sizeof(int).
> >
> > All of the kernel code I was able to audit is treating the
> > EXT2_IOC_[SG]ETFLAGS ioctls as taking an int, not a long. So the
> > defacto definition of [SG]ETFLAGS is that that they take ints, not
> > longs. If we make this change which you are proposing, it will cause
> > problems on big-endian systems where sizeof(long) > sizeof(int) ---
> > for example, as would be the case on the ppc64 architecture.
> >
> > I'm not sure what the FUSE problem is? Can you say more? Is there
> > some other way we can work around the problem you are trying to solve?
>
> If I mount a FUSE fs (specifically the fuse2fs thing in the last patch) and try
> to run 'chattr +i /mnt/foo', chattr segfaults. valgrind had this[1] to offer.
> It seems that FUSE's ioctl implementation depends on the 'size' argument to
> _IOR (in the EXT2_IOC_[SG]ETFLAGS definition) to figure out how much data to
> transfer in or out of userspace. Unfortunately for fgetflags, it passes in a
> pointer to an int, and fuse seems to smash up the stack trying to write back a
> long. Valgrind and gdb show that the lower 32-bits of name get overwritten,
> which causes the segfault.
>
> Unfortunately, this is a bit of a heisenbug; if I build with -O0 (gcc 4.6.3,
> Ubuntu 12.04, libfuse 2.9.2 backport) it goes away. The stack corruption also
> seems to go away if I print the address of f, though save_errno gets
> overwritten with some suspicious looking 32695 value.
>
> I'll keep poking at what's going on here, though I'll try to come up with a
> clever solution for BE machines.
>
> --D
>
> [1] Valgrind messsage:
> ==3512== Memcheck, a memory error detector
> ==3512== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
> ==3512== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
> ==3512== Command: chattr +i /mnt/foo
> ==3512==
> ==3512== Syscall param lstat(file_name) points to unaddressable byte(s)
> ==3512== at 0x53232B5: _lxstat (lxstat.c:38)
> ==3512== by 0x4E34610: fsetflags (in /lib/x86_64-linux-gnu/libe2p.so.2.3)
> ==3512== by 0x401350: ??? (in /usr/bin/chattr)
> ==3512== by 0x4010A0: ??? (in /usr/bin/chattr)
> ==3512== by 0x525E76C: (below main) (libc-start.c:226)
> ==3512== Address 0x700007fb7 is not stack'd, malloc'd or (recently) free'd
> ==3512==
> ==3512== Syscall param open(filename) points to unaddressable byte(s)
> ==3512== at 0x53236C0: __open_nocancel (syscall-template.S:82)
> ==3512== by 0x4E3463E: fsetflags (in /lib/x86_64-linux-gnu/libe2p.so.2.3)
> ==3512== by 0x401350: ??? (in /usr/bin/chattr)
> ==3512== by 0x4010A0: ??? (in /usr/bin/chattr)
> ==3512== by 0x525E76C: (below main) (libc-start.c:226)
> ==3512== Address 0x700007fb7 is not stack'd, malloc'd or (recently) free'd
> ==3512==
> chattr: Bad address ==3512== Invalid read of size 1
> ==3512== at 0x52883B1: vfprintf (vfprintf.c:1630)
> ==3512== by 0x528B1A3: buffered_vfprintf (vfprintf.c:2313)
> ==3512== by 0x5285BDD: vfprintf (vfprintf.c:1316)
> ==3512== by 0x53463D7: __vfprintf_chk (vfprintf_chk.c:35)
> ==3512== by 0x503ABA8: ??? (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
> ==3512== by 0x503AD1E: com_err (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
> ==3512== by 0x401435: ??? (in /usr/bin/chattr)
> ==3512== by 0x4010A0: ??? (in /usr/bin/chattr)
> ==3512== by 0x525E76C: (below main) (libc-start.c:226)
> ==3512== Address 0x700007fb7 is not stack'd, malloc'd or (recently) free'd
> ==3512==
> ==3512==
> ==3512== Process terminating with default action of signal 11 (SIGSEGV)
> ==3512== Access not within mapped region at address 0x700007FB7
> ==3512== at 0x52883B1: vfprintf (vfprintf.c:1630)
> ==3512== by 0x528B1A3: buffered_vfprintf (vfprintf.c:2313)
> ==3512== by 0x5285BDD: vfprintf (vfprintf.c:1316)
> ==3512== by 0x53463D7: __vfprintf_chk (vfprintf_chk.c:35)
> ==3512== by 0x503ABA8: ??? (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
> ==3512== by 0x503AD1E: com_err (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
> ==3512== by 0x401435: ??? (in /usr/bin/chattr)
> ==3512== by 0x4010A0: ??? (in /usr/bin/chattr)
> ==3512== by 0x525E76C: (below main) (libc-start.c:226)
> ==3512== If you believe this happened as a result of a stack
> ==3512== overflow in your program's main thread (unlikely but
> ==3512== possible), you can try to increase the size of the
> ==3512== main thread stack using the --main-stacksize= flag.
> ==3512== The main thread stack size used in this run was 8388608.
> ==3512==
> ==3512== HEAP SUMMARY:
> ==3512== in use at exit: 0 bytes in 0 blocks
> ==3512== total heap usage: 247 allocs, 247 frees, 22,481 bytes allocated
> ==3512==
> ==3512== All heap blocks were freed -- no leaks are possible
> ==3512==
> ==3512== For counts of detected and suppressed errors, rerun with: -v
> ==3512== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 2 from 2)
> Segmentation fault
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2013-10-07 23:24 UTC|newest]
Thread overview: 90+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-01 1:26 [PATCH v1 00/31] e2fsprogs September 2013 patchbomb Darrick J. Wong
2013-10-01 1:26 ` [PATCH 01/31] tune2fs: Don't convert block # to cluster # when clearing uninit_bg Darrick J. Wong
2013-10-03 16:53 ` Lukáš Czerner
2013-10-03 19:04 ` Darrick J. Wong
2013-10-07 12:49 ` Lukáš Czerner
2013-10-07 13:03 ` Theodore Ts'o
2013-10-09 22:10 ` Darrick J. Wong
2013-10-10 0:26 ` Theodore Ts'o
2013-10-10 22:04 ` Darrick J. Wong
2013-10-13 3:09 ` Theodore Ts'o
2013-10-01 1:26 ` [PATCH 02/31] libext2fs: Only link an inode into a directory once Darrick J. Wong
2013-10-01 15:37 ` jon ernst
2013-10-01 21:11 ` Darrick J. Wong
2013-10-07 13:17 ` Theodore Ts'o
2013-10-07 18:53 ` Darrick J. Wong
2013-10-01 1:27 ` [PATCH 03/31] Define an error code for block bitmap checksum failures Darrick J. Wong
2013-10-13 3:12 ` Theodore Ts'o
2013-10-01 1:27 ` [PATCH 04/31] libext2fs: Fix a minor grammatical error in the error catalog Darrick J. Wong
2013-10-07 13:20 ` Theodore Ts'o
2013-10-01 1:27 ` [PATCH 05/31] libext2fs: Add space for metadata checksum when unconverting a hashed directory block Darrick J. Wong
2013-10-13 3:16 ` Theodore Ts'o
2013-10-01 1:27 ` [PATCH 06/31] e2p: Fix f[gs]etflags argument size mismatch Darrick J. Wong
2013-10-07 13:33 ` Theodore Ts'o
2013-10-07 20:40 ` Darrick J. Wong
2013-10-07 23:23 ` Darrick J. Wong [this message]
2013-10-08 0:06 ` Theodore Ts'o
2013-10-08 0:28 ` Darrick J. Wong
2013-10-01 1:27 ` [PATCH 07/31] libext2fs: When writing a file that has a i_size > 2GB, set the large_file feature flag and update the superblock Darrick J. Wong
2013-10-07 13:14 ` Theodore Ts'o
2013-10-01 1:27 ` [PATCH 08/31] libext2fs: Fix off-by-one error in file truncation Darrick J. Wong
2013-10-07 14:02 ` Lukáš Czerner
2013-10-08 15:52 ` Theodore Ts'o
2013-10-01 1:27 ` [PATCH 09/31] libext2fs: Rewind extent pointer when totally deleting an extent Darrick J. Wong
2013-10-07 13:37 ` Theodore Ts'o
2013-10-07 18:24 ` Darrick J. Wong
2013-10-01 1:27 ` [PATCH 10/31] libext2fs: Allow callers to punch a single block Darrick J. Wong
2013-10-01 19:09 ` jon ernst
2013-10-01 21:25 ` Darrick J. Wong
2013-10-07 13:40 ` Theodore Ts'o
2013-10-08 15:54 ` Theodore Ts'o
2013-10-01 1:27 ` [PATCH 11/31] libext2fs: ind_punch() must not stop examining blocks prematurely Darrick J. Wong
2013-10-07 13:43 ` Theodore Ts'o
2013-10-01 1:27 ` [PATCH 12/31] e2fsprogs: Fix blk_t <- blk64_t assignment mismatches Darrick J. Wong
2013-10-07 13:52 ` Theodore Ts'o
2013-10-01 1:28 ` [PATCH 13/31] e2fsprogs: Less critical fixes to use the appropriate blk*t types Darrick J. Wong
2013-10-07 13:59 ` Theodore Ts'o
2013-10-01 1:28 ` [PATCH 14/31] libext2fs: Fix ext2fs_open2() truncation of the superblock parameter Darrick J. Wong
2013-10-07 14:30 ` Lukáš Czerner
2013-10-07 18:42 ` Darrick J. Wong
2013-10-08 15:58 ` Theodore Ts'o
2013-10-08 17:47 ` Darrick J. Wong
2013-10-01 1:28 ` [PATCH 15/31] e2fsck: Teach EA refcounting code to handle 48bit block addresses Darrick J. Wong
2013-10-07 15:30 ` Lukáš Czerner
2013-10-07 18:37 ` Darrick J. Wong
2013-10-08 16:01 ` Theodore Ts'o
2013-10-09 21:53 ` Darrick J. Wong
2013-10-01 1:28 ` [PATCH 16/31] debugfs: Handle 64bit block numbers Darrick J. Wong
2013-10-07 15:49 ` Lukáš Czerner
2013-10-07 18:49 ` Darrick J. Wong
2013-10-01 1:28 ` [PATCH 17/31] libext2fs: Refactor u32-list to handle 32 and 64-bit data types Darrick J. Wong
2013-10-10 14:46 ` Lukáš Czerner
2013-10-10 18:05 ` Darrick J. Wong
2013-10-01 1:28 ` [PATCH 18/31] libext2fs: Badblocks should handle 48-bit block numbers correctly Darrick J. Wong
2013-10-08 16:03 ` Theodore Ts'o
2013-10-09 21:57 ` Darrick J. Wong
2013-10-01 1:28 ` [PATCH 19/31] badblocks: Use the new badblocks APIs for 64-bit block numbers Darrick J. Wong
2013-10-10 15:01 ` Lukáš Czerner
2013-10-01 1:28 ` [PATCH 20/31] e2fsprogs: Add (optional) sparse checking to the build Darrick J. Wong
2013-10-12 3:13 ` Theodore Ts'o
2013-10-01 1:28 ` [PATCH 21/31] libext2fs: Be more thorough in searching a range of blocks for a cluster Darrick J. Wong
2013-10-08 16:09 ` Theodore Ts'o
2013-10-01 1:29 ` [PATCH 22/31] libext2fs: During punch, only free a cluster if we're sure that all blocks in the cluster are being punched Darrick J. Wong
2013-10-10 15:53 ` Lukáš Czerner
2013-10-10 19:29 ` Darrick J. Wong
2013-10-01 1:29 ` [PATCH 23/31] libext2fs: expanddir and mkjournal need not update the summary counts when performing an implied cluster allocation Darrick J. Wong
2013-10-10 16:02 ` Lukáš Czerner
2013-10-01 1:29 ` [PATCH 24/31] libext2fs: Use ext2fs_punch() to truncate quota file Darrick J. Wong
2013-10-10 16:06 ` Lukáš Czerner
2013-10-01 1:29 ` [PATCH 25/31] e2fsck: Only release clusters when shortening a directory during a rehash Darrick J. Wong
2013-10-10 16:13 ` Lukáš Czerner
2013-10-01 1:29 ` [PATCH 26/31] libext2fs: openfs() musn't allow bigalloc without EXT2_FLAGS_64BITS Darrick J. Wong
2013-10-07 12:50 ` Lukáš Czerner
2013-10-12 1:36 ` Theodore Ts'o
2013-10-01 1:29 ` [PATCH 27/31] resize2fs: Convert fs to and from 64bit mode Darrick J. Wong
2013-10-01 1:29 ` [PATCH 28/31] mke2fs: Complain about creating 64bit filesystems without extents Darrick J. Wong
2013-10-12 1:14 ` Theodore Ts'o
2013-10-01 1:29 ` [PATCH 29/31] e2fsck: Enable extents on all 64bit filesystems Darrick J. Wong
2013-10-12 1:19 ` Theodore Ts'o
2013-10-01 1:29 ` [PATCH 30/31] libext2fs: Support modifying arbitrary extended attributes Darrick J. Wong
2013-10-01 1:30 ` [PATCH 31/31] misc: Add fuse2fs, a FUSE server for e2fsprogs Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131007232358.GG6860@birch.djwong.org \
--to=darrick.wong@oracle.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).