From: Matthew Cengia <mattcen@cyber.com.au>
To: linux-kernel@vger.kernel.org
Cc: Matthew Cengia <mattcen@gmail.com>, "Trent W. Buck" <twb@cyber.com.au>
Subject: overlayfs+selinux error: OPNOTSUPP
Date: Fri, 18 Sep 2015 12:07:43 +1000
Message-ID: <20150918020743.GC22582@cyber.com.au>


[-- Attachment #1.1: Type: text/plain, Size: 3194 bytes --]

Hi all,

Please CC me directly when responding, as I'm not subscribed to the
mailing list.


Summary
-------
I deploy diskless Debian kiosks in prisons, for use by inmates.
As part of the Debian 7 to 8 upgrade, I want to enable SELinux.
My initrd uses overlayfs to combine a ro squashfs and a rw tmpfs.

When I add SELinux into the mix, I get a lot of EOPNOTSUPP.


Long and boring history
-----------------------
I was happy with Debian 7 / Linux 3.16 / sysvinit / aufs.
Then, new hardware arrived, which needed a newer Xorg.
So I had to switch to Debian 8 / Linux 3.16.
Debian 8 defaults to systemd, so I went with that.

I used to put $XDG_RUNTIME_DIR under a /tmp mounted -onoexec.
Systemd v215 is hard-coded to mount $XDG_RUNTIME_DIR as a dedicated tmpfs,
and provides no way to mount/remount it with -onoexec.

    src/login/logind-user.c:336:user_mkdir_runtime_path()

When I complained about this, regulars on #systemd on Freenode said:

    Just use SELinux, already!
    -o noexec might break something, and it won't stop interpreters.

...which was mostly reasonable.
So adopting SELinux was reprioritized from "some day" to "right now!"

aufs doesn't support SELinux, so I had to switch to overlayfs.
So now my target is Debian 8 / Linux 4.1 / systemd / overlayfs / SELinux.


Current problem
---------------
When I built & booted that combination, hostnames didn't resolve.

The initrd uses klibc ipconfig as a DHCP client,
then tries to create /etc/resolv.conf in the rootfs.
(This happens before switch_root.)
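
As an illustration of the step that fails, here is a minimal sketch of
how such an initrd hook might turn the DHCP lease into a resolv.conf.
It is not the script in use here: write_resolv_conf is a hypothetical
helper, and the /run/net-<device>.conf location and the IPV4DNS0,
IPV4DNS1 and DNSDOMAIN variable names are assumptions about what klibc
ipconfig writes out.

```shell
#!/bin/sh
# write_resolv_conf is a hypothetical helper, for illustration only.
write_resolv_conf() {
    netconf=$1      # e.g. /run/net-eth0.conf (assumed location)
    target=$2       # e.g. /root/etc/resolv.conf
    # Use a subshell so the sourced variables do not leak to the caller.
    (
        IPV4DNS0='' IPV4DNS1='' DNSDOMAIN=''
        # The file is assumed to hold shell-style VAR='value' lines.
        . "$netconf"
        {
            if [ -n "$DNSDOMAIN" ]; then
                echo "search $DNSDOMAIN"
            fi
            for ns in "$IPV4DNS0" "$IPV4DNS1"; do
                # Skip empty and all-zero entries (assumed to mean "unset").
                case $ns in
                    ''|0.0.0.0) ;;
                    *) echo "nameserver $ns" ;;
                esac
            done
        } > "$target"   # this redirection is the open-for-write step
    )
}
```

Called as `write_resolv_conf /run/net-eth0.conf /root/etc/resolv.conf`
(eth0 being an example device name), the redirection to the target is
the kind of open-for-write that this report is about.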

When SELinux is enabled, resolv.conf can't be opened for writing.
The attached strace (output.txt) shows open(2) gets EOPNOTSUPP.
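
To pick the failing call out of a long trace, a filter along these
lines may help. It is only a sketch: failed_with is a made-up helper
name, and it relies on strace's usual "= -1 ERRNO (description)" format
for failed syscalls.

```shell
#!/bin/sh
# failed_with is a hypothetical helper, not part of strace.
# Print only the syscalls in a saved strace log that failed with the
# given errno name.
failed_with() {
    errno_name=$1   # e.g. EOPNOTSUPP
    log=$2          # path to a saved strace log
    # Failed calls end in "= -1 ERRNO (description)".
    grep -E "= -1 ${errno_name} \(" "$log"
}
```

For example, `failed_with EOPNOTSUPP output.txt` should print just the
open(2) line and skip the harmless ENOENT results around it.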


Tests completed
---------------
This problem *ONLY* occurs in the initrd,
which is *BEFORE* the SELinux policy loads.
I'm not sure if this is relevant.

This problem *DOES NOT* occur if the file/directory being written to
already exists in the read/write portion of the overlay mount before the
overlayfs is mounted. I've attached a script to demonstrate this.
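
That observation suggests a possible stopgap: mirror the lower layer's
directory tree into the upper layer before mounting the overlay. The
sketch below is an assumption extrapolated from the test above, not a
confirmed fix; precreate_upper_dirs is an illustrative name, and the
example paths are the ones used in the attached script.

```shell
#!/bin/sh
# precreate_upper_dirs is a hypothetical helper, for illustration only.
# Mirror the directory tree of the lower layer into the (not yet
# mounted) upper layer, so no directory needs copying up later.
precreate_upper_dirs() {
    lower=$1    # e.g. /filesystem (the mounted squashfs)
    upper=$2    # e.g. /live/overlay/rw
    # List directories only, without crossing filesystem boundaries.
    (cd "$lower" && find . -xdev -type d) |
    while IFS= read -r dir; do
        # CAVEAT: mkdir -p applies default modes. Ownership, permissions
        # and SELinux labels of the lower directories are NOT preserved,
        # and the upper directory's attributes are the ones that show
        # through the overlay.
        mkdir -p "$upper/$dir"
    done
}
```

It would have to run after the squashfs is mounted and before the
overlay mount. It only covers creating new files in existing
directories; a file that exists only in the lower layer would still
need copying up when it is first written.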

Booting the kernel with permissive=1 *DOES NOT* prevent the problem.


Test script
-----------
Attached is a script called 'bootstrap'.
When run on a Debian Jessie system with debootstrap, squashfs-tools, and kvm installed,
and SELinux installed and enabled (even if only in permissive mode),
'bootstrap' will:

 * Mount a tmpfs without -o nodev at /tmp/bootstrap, to build in;
 * Build an SOE in /tmp/bootstrap/live/;
 * Create a squashfs of the built system;
 * Leave the squashfs, kernel, and initrd in /tmp/bootstrap/live/boot/; and
 * Start up a VM using KVM to demonstrate the behaviour.

The script that the initrd runs does several things, all of which are
detailed within the script, and in output.txt; look for lines
containing '-->'.
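
A small sketch for pulling just those annotated steps out of a saved
console log (show_steps is a made-up name; the only assumption is that
the log is plain text, as output.txt is):

```shell
#!/bin/sh
# show_steps is a hypothetical helper, for illustration only.
# Print the lines of a saved console log that carry the '-->' marker.
show_steps() {
    # -F matches the marker literally; -e is needed because the pattern
    # begins with '-' and would otherwise be parsed as an option.
    grep -F -e '-->' "$1"
}
```

Running `show_steps output.txt` gives a step-by-step outline of what
the initrd script did, without the surrounding command output.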

output.txt contains a full KVM run of the system exhibiting the problem,
in which I've also run an 'strace touch' to demonstrate the failing
syscall.


Help?
-----
How can I set about debugging this problem further?
Has anybody dealt with this before?
How can I solve (or work around) this problem?

-- 
Regards,
Matthew Cengia

[-- Attachment #1.2: bootstrap --]
[-- Type: text/plain, Size: 4599 bytes --]

#!/bin/bash
set -eEux
set -o pipefail
trap 'echo >&2 "$0: unknown error"' ERR

export LC_ALL=C DEBIAN_FRONTEND=noninteractive
a=amd64 r=jessie t=live M=http://httpredir.debian.org/debian

# We can't use /tmp as it may (reasonably) be mounted -onodev.
mkdir -p /tmp/bootstrap
grep -q '^tmpfs /tmp/bootstrap tmpfs' /proc/mounts ||
mount tmpfs /tmp/bootstrap -ttmpfs -omode=700,size=80%
cd /tmp/bootstrap
rm -rf $t                                                       # Delete previous build (if any).

debootstrap --variant minbase --arch $a $r $t $M
>$t/etc/debian_chroot                   echo bootstrap
>$t/etc/apt/sources.list                printf 'deb %s %s main\n'       $M $r   $M $r-updates   $M $r-backports   http://security.debian.org $r/updates
>$t/etc/apt/sources.list.d/30selinux.list printf 'deb %s %s selinux\n' http://www.coker.com.au $r
>$t/etc/apt/apt.conf.d/10stable         echo "APT::Default-Release \"$r\";"
>$t/etc/apt/apt.conf.d/10bootstrap      echo 'APT::Get::Assume-Yes "1"; APT::Get::AutomaticRemove "1"; APT::Install-Recommends "0"; Quiet "1";'
>$t/usr/sbin/policy-rc.d                printf '#!/bin/sh\nexit 101'
chmod +x $t/usr/sbin/policy-rc.d
chroot $t apt-key adv --keyserver hkp://pool.sks-keyservers.net --recv-key D141CD30FC4B8F79
chroot $t apt-get update
chroot $t apt-get install -y initramfs-tools
>$t/etc/kernel-img.conf                 echo link_in_boot=yes
sed -i 's/^root:[^:]*:/root::/' $t/etc/shadow                   # root has null password
>$t/etc/initramfs-tools/modules         printf '%s\n' overlay squashfs
>$t/etc/initramfs-tools/scripts/overlaytest cat <<EOF
mountroot()
{
  set -x
  : "--> Mount a tmpfs on /live for use by overlay"
  mkdir /live
  mount -t tmpfs tmpfs /live
  : "--> Make the two subdirs required by overlay"
  mkdir -p /live/overlay/rw /live/overlay/work
  : "--> Make /filesystem to mount the read-only squashfs"
  mkdir /filesystem
  : "--> Create a /etc directory in what will become the writable portion of"
  : "--> the overlay filesystem"
  mkdir -p /live/overlay/rw/etc
  : "--> Mount the squashfs"
  mount -t squashfs /dev/vda /filesystem
  : "--> Union the tmpfs and the squashfs with overlayfs and mount them on"
  : "--> /root"
  mount -t overlay -o noatime,lowerdir=/filesystem/,upperdir=/live/overlay/rw,workdir=/live/overlay/work overlay /root/

  : "--> Demonstrate that creating a file..."
  touch /root/newfile
  : "--> ... creating a directory..."
  mkdir -p /root/newdir
  : "--> ... and creating a file in the new directory all work in the"
  : "--> root of the overlay filesystem..."
  touch /root/newdir/newfile
  : "--> ...before cleaning up those files/dirs"
  rm -r /root/newfile /root/newdir/newfile /root/newdir
  : "--> Demonstrate that touching an existing directory (/etc, which we"
  : "--> created earlier), and a file within it, works"
  touch /root/etc/
  touch /root/etc/newfile
  : "--> Demonstrate that touching a directory or file not already present in"
  : "--> the read-write part of overlay does *NOT* work"
  touch /root/home/
  touch /root/home/newfile

  set +x

  maybe_break

}
EOF
>$t/etc/initramfs-tools/hooks/strace cat <<\EOF
#!/bin/bash

set -e
if [[ prereqs = $1 ]]
then exit 0
fi

. /usr/share/initramfs-tools/hook-functions

copy_exec /usr/bin/strace
EOF
chmod a+x $t/etc/initramfs-tools/hooks/strace
chroot $t apt-get install -y --no-install-recommends linux-image-4.1.0-0.bpo.1-amd64 busybox selinux-basics selinux-policy-default auditd strace

# SELinux relabel
# NOTE: This requires SELinux to be enabled on the build host, even if it
# is set to permissive!
setfiles -r $t/ $t/etc/selinux/default/contexts/files/file_contexts $t/

exclusions=(
    # Since boot/* is needed outside the squashfs, don't duplicate it inside.
    '^boot$/.'
    # Filesystems created at boot time.
    '^(dev|tmp|run)$/.'
    '^var$/^(lock|run|tmp)$/.'
    # Build-time configuration and cache.
    '^etc$/^(debian_chroot|hostname|hosts|motd(\.tail)?|mtab|resolv.conf)$'
    '^etc$/^apt$/^apt.conf.d$/^10bootstrap$'
    '^etc$/^network$/^interfaces$'
    '^usr$/^sbin$/^policy-rc\.d$'
    '^var$/^cache$/^apt$/^(src)?pkgcache\.bin$'
    '^var$/^cache$/^apt$/^archives$/\.deb$'
    '^var$/^cache$/^bootstrap$'
    '^var$/^lib$/^apt$/^lists$/.'
    '^var$/^log$/.'
)

mksquashfs $t $t/boot/filesystem.squashfs -regex -e "${exclusions[@]}"

kvm -m 256 -nographic -kernel $t/boot/vmlinuz -initrd $t/boot/initrd.img -append 'console=ttyS0 root=/dev/vda loglevel=1 security=selinux boot=overlaytest' -drive file=$t/boot/filesystem.squashfs,index=0,media=disk,if=virtio -net nic,model=virtio

[-- Attachment #1.3: output.txt --]
[-- Type: text/plain, Size: 4638 bytes --]

+ kvm -m 256 -nographic -kernel live/boot/vmlinuz -initrd live/boot/initrd.img -append 'console=ttyS0 root=/dev/vda loglevel=1 security=selinux boot=overlaytest' -drive file=live/boot/filesystem.squashfs,index=0,media=disk,if=virtio -net nic,model=virtio
Warning: vlan 0 is not connected to host network
Loading, please wait...
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/nfs-top ... done.
Begin: Running /scripts/nfs-premount ... done.
+ : --> Mount a tmpfs on /live for use by overlay
+ mkdir /live
+ mount -t tmpfs tmpfs /live
+ : --> Make the two subdirs required by overlay
+ mkdir -p /live/overlay/rw /live/overlay/work
+ : --> Make /filesystem to mount the read-only squashfs
+ mkdir /filesystem
+ : --> Create a /etc directory in what will become the writable portion of
+ : --> the overlay filesystem
+ mkdir -p /live/overlay/rw/etc
+ : --> Mount the squashfs
+ mount -t squashfs /dev/vda /filesystem
+ : --> Union the tmpfs and the squashfs with overlayfs and mount them on
+ : --> /root
+ mount -t overlay -o noatime,lowerdir=/filesystem/,upperdir=/live/overlay/rw,workdir=/live/overlay/work overlay /root/
+ : --> Demonstrate that creating a file...
+ touch /root/newfile
+ : --> ... creating a directory...
+ mkdir -p /root/newdir
+ : --> ... and creating a file in the new directory all work in the
+ : --> root of the overlay filesystem...
+ touch /root/newdir/newfile
+ : --> ...before cleaning up those files/dirs
+ rm -r /root/newfile /root/newdir/newfile /root/newdir
+ : --> Demonstrate that touching an existing directory (/etc, which we
+ : --> created earlier), and a file within it, works
+ touch /root/etc/
+ touch /root/etc/newfile
+ : --> Demonstrate that touching a directory or file not already present in
+ : --> the read-write part of overlay does *NOT* work
+ touch /root/home/
touch: /root/home/: Operation not supported
+ touch /root/home/newfile
touch: /root/home/newfile: Operation not supported
+ set +x
Spawning shell within the initramfs
modprobe: module ehci-orion not found in modules.dep


BusyBox v1.22.1 (Debian 1:1.22.0-9+deb8u1) built-in shell (ash)
Enter 'help' for a list of built-in commands.

/bin/sh: can't access tty; job control turned off
(initramfs) strace touch /root/home/newfile
execve("/bin/touch", ["touch", "/root/home/newfile"], [/* 32 vars */]) = 0
brk(0)                                  = 0x1db8000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f64fcfce000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=1358, ...}) = 0
mmap(NULL, 1358, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f64fcfcd000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\34\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1729984, ...}) = 0
mmap(NULL, 3836448, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f64fca07000
mprotect(0x7f64fcba6000, 2097152, PROT_NONE) = 0
mmap(0x7f64fcda6000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19f000) = 0x7f64fcda6000
mmap(0x7f64fcdac000, 14880, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f64fcdac000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f64fcfcc000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f64fcfcb000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f64fcfca000
arch_prctl(ARCH_SET_FS, 0x7f64fcfcb700) = 0
mprotect(0x7f64fcda6000, 16384, PROT_READ) = 0
mprotect(0x69a000, 4096, PROT_READ)     = 0
mprotect(0x7f64fcfd0000, 4096, PROT_READ) = 0
munmap(0x7f64fcfcd000, 1358)            = 0
getuid()                                = 0
utimes("/root/home/newfile", NULL)      = -1 ENOENT (No such file or directory)
open("/root/home/newfile", O_RDWR|O_CREAT, 0666) = -1 EOPNOTSUPP (Operation not supported)
brk(0)                                  = 0x1db8000
brk(0x1dd9000)                          = 0x1dd9000
write(2, "touch: /root/home/newfile: Opera"..., 51touch: /root/home/newfile: Operation not supported
) = 51
exit_group(1)                           = ?
+++ exited with 1 +++
(initramfs)

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 966 bytes --]

