* overlayfs+selinux error: OPNOTSUPP
@ 2015-09-18 2:07 Matthew Cengia
0 siblings, 0 replies; 9+ messages in thread
From: Matthew Cengia @ 2015-09-18 2:07 UTC (permalink / raw)
To: linux-kernel; +Cc: Matthew Cengia, Trent W. Buck
[-- Attachment #1.1: Type: text/plain, Size: 3194 bytes --]
Hi all,
Please CC me directly when responding, as I'm not subscribed to the
mailing list.
Summary
-------
I deploy diskless Debian kiosks in prisons, for use by inmates.
As part of the Debian 7 to 8 upgrade, I want to enable SELinux.
My initrd uses overlayfs to combine a ro squashfs and a rw tmpfs.
When I add SELinux into the mix, I get a lot of EOPNOTSUPP.
Long and boring history
-----------------------
I was happy with Debian 7 / Linux 3.16 / sysvinit / aufs.
Then, new hardware arrived, which needed a newer Xorg.
So I had to switch to Debian 8 / Linux 3.16.
Debian 8 defaults to systemd, so I went with that.
I used to put $XDG_RUNTIME_DIR under a /tmp mounted -onoexec.
Systemd v215 is hard-coded to mount $XDG_RUNTIME_DIR as a dedicated tmpfs,
and provides no way to mount/remount it with -onoexec.
src/login/logind-user.c:336:user_mkdir_runtime_path()
When I complained about this, regulars on #systemd on Freenode said:
Just use SELinux, already!
-o noexec might break something, and it won't stop interpreters.
...which was mostly reasonable.
So adopting SELinux was reprioritized from "some day" to "right now!"
aufs doesn't support SELinux, so I had to switch to overlayfs.
So now my target is Debian 8 / Linux 4.1 / systemd / overlayfs / SELinux.
Current problem
---------------
When I built & booted that combination, hostnames didn't resolve.
The initrd uses klibc ipconfig as a DHCP client,
then tries to create /etc/resolv.conf in the rootfs.
(This happens before switch_root.)
When SELinux is enabled, resolv.conf can't be opened for writing.
The attached strace (output.txt) shows open(2) gets EOPNOTSUPP.
Tests completed
---------------
This problem *ONLY* occurs in the initrd,
which is *BEFORE* the SELinux policy loads.
I'm not sure if this is relevant.
This problem *DOES NOT* occur if the file/directory being written to
already exists in the read/write portion of the overlay mount before the
overlayfs is mounted. I've attached a script to demonstrate this.
Booting the kernel with permissive=1 *DOES NOT* prevent the problem.
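The second observation suggests a possible stopgap: create the directories in the upper layer before the overlay is mounted, as the test script already does for /etc. What follows is a minimal sketch, assuming a POSIX shell and a find applet in the initrd. precreate_upper_dirs is a hypothetical helper name, not part of initramfs-tools. It does not copy modes, ownership, or labels (as far as I understand, a merged directory shows the upper copy's metadata), and it does nothing for existing lower-layer files that are later modified.

```shell
# Hypothetical helper: mirror the directory tree of a lower layer into the
# upper layer, so that no directory needs to be copied up at write time.
#   $1 = lower (read-only) directory, $2 = upper (writable) directory
precreate_upper_dirs() {
    # List directories relative to the lower root, then recreate each one
    # under the upper root. Regular files are deliberately skipped.
    (cd "$1" && find . -type d) | while IFS= read -r dir; do
        mkdir -p "$2/$dir"
    done
}
```

In the test script this would be called as `precreate_upper_dirs /filesystem /live/overlay/rw` after the squashfs is mounted and before the `mount -t overlay` line.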
Test script
-----------
Attached is a script called 'bootstrap'.
When run on a Debian Jessie system with debootstrap, squashfs-tools, and kvm installed,
and selinux installed and enabled (even if it's in permissive mode),
'bootstrap' will:
* Mount a tmpfs without -o nodev at /tmp/bootstrap, to build in;
* Build an SOE in /tmp/bootstrap/live/;
* Create a squashfs of the built system;
* Leave the squashfs, kernel, and initrd in /tmp/bootstrap/live/boot/; and
* Start up a VM using KVM to demonstrate the behaviour.
The script that the initrd runs does several things, all of which are
detailed within the script, and in output.txt; look for lines
containing '-->'.
output.txt contains a full KVM run of the system exhibiting the problem,
in which I've also run an 'strace touch' to demonstrate the failing
syscall.
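When reading a trace like the one in output.txt, it can help to filter out the successful calls. A small sketch, assuming the usual strace line layout of `call(args) = -1 ERRNO (message)`; failed_syscalls is a hypothetical helper name.

```shell
# Hypothetical helper: read strace output on stdin and print only the
# system calls that returned -1 with an errno name.
failed_syscalls() {
    grep -E '\) += -1 E[A-Z0-9]+ \('
}
```

Since strace writes to stderr, a typical invocation would be `strace touch /root/home/newfile 2>&1 | failed_syscalls`.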
Help?
-----
How can I set about debugging this problem further?
Has anybody dealt with this before?
How can I solve (or work around) this problem?
--
Regards,
Matthew Cengia
[-- Attachment #1.2: bootstrap --]
[-- Type: text/plain, Size: 4599 bytes --]
#!/bin/bash
set -eEux
set -o pipefail
trap 'echo >&2 "$0: unknown error"' ERR
export LC_ALL=C DEBIAN_FRONTEND=noninteractive
a=amd64 r=jessie t=live M=http://httpredir.debian.org/debian
# We can't use /tmp as it may (reasonably) be mounted -onodev.
mkdir -p /tmp/bootstrap
grep -q '^tmpfs /tmp/bootstrap tmpfs' /proc/mounts ||
mount tmpfs /tmp/bootstrap -ttmpfs -omode=700,size=80%
cd /tmp/bootstrap
rm -rf $t # Delete previous build (if any).
debootstrap --variant minbase --arch $a $r $t $M
>$t/etc/debian_chroot echo bootstrap
>$t/etc/apt/sources.list printf 'deb %s %s main\n' $M $r $M $r-updates $M $r-backports http://security.debian.org $r/updates
>$t/etc/apt/sources.list.d/30selinux.list printf 'deb %s %s selinux\n' http://www.coker.com.au $r
>$t/etc/apt/apt.conf.d/10stable echo "APT::Default-Release \"$r\";"
>$t/etc/apt/apt.conf.d/10bootstrap echo 'APT::Get::Assume-Yes "1"; APT::Get::AutomaticRemove "1"; APT::Install-Recommends "0"; Quiet "1";'
>$t/usr/sbin/policy-rc.d printf '#!/bin/sh\nexit 101'
chmod +x $t/usr/sbin/policy-rc.d
chroot $t apt-key adv --keyserver hkp://pool.sks-keyservers.net --recv-key D141CD30FC4B8F79
chroot $t apt-get update
chroot $t apt-get install -y initramfs-tools
>$t/etc/kernel-img.conf echo link_in_boot=yes
sed -i 's/^root:[^:]*:/root::/' $t/etc/shadow # root has null password
>$t/etc/initramfs-tools/modules printf '%s\n' overlay squashfs
>$t/etc/initramfs-tools/scripts/overlaytest cat <<EOF
mountroot()
{
set -x
: "--> Mount a tmpfs on /live for use by overlay"
mkdir /live
mount -t tmpfs tmpfs /live
: "--> Make the two subdirs required by overlay"
mkdir -p /live/overlay/rw /live/overlay/work
: "--> Make /filesystem to mount the read-only squashfs"
mkdir /filesystem
: "--> Create a /etc directory in what will become the writable portion of"
: "--> the overlay filesystem"
mkdir -p /live/overlay/rw/etc
: "--> Mount the squashfs"
mount -t squashfs /dev/vda /filesystem
: "--> Union the tmpfs and the squashfs with overlayfs and mount them on"
: "--> /root"
mount -t overlay -o noatime,lowerdir=/filesystem/,upperdir=/live/overlay/rw,workdir=/live/overlay/work overlay /root/
: "--> Demonstrate that creating a file..."
touch /root/newfile
: "--> ... creating a directory..."
mkdir -p /root/newdir
: "--> ... and creating a file in the new directory all work in the"
: "--> root of the overlay filesystem..."
touch /root/newdir/newfile
: "--> ...before cleaning up those files/dirs"
rm -r /root/newfile /root/newdir/newfile /root/newdir
: "--> Demonstrate that touching an existing directory (/etc, which we"
: "--> created earlier), and a file within it, works"
touch /root/etc/
touch /root/etc/newfile
: "--> Demonstrate that touching a directory or file not already present in"
: "--> the read-write part of overlay does *NOT* work"
touch /root/home/
touch /root/home/newfile
set +x
maybe_break
}
EOF
>$t/etc/initramfs-tools/hooks/strace cat <<\EOF
#!/bin/bash
set -e
if [[ prereqs = $1 ]]
then exit 0
fi
. /usr/share/initramfs-tools/hook-functions
copy_exec /usr/bin/strace
EOF
chmod a+x $t/etc/initramfs-tools/hooks/strace
chroot $t apt-get install -y --no-install-recommends linux-image-4.1.0-0.bpo.1-amd64 busybox selinux-basics selinux-policy-default auditd strace
# SELinux relabel
# NOTE: This requires SELinux to be enabled on the build host, even if it
# is set to permissive!
setfiles -r $t/ $t/etc/selinux/default/contexts/files/file_contexts $t/
exclusions=(
# Since boot/* is needed outside the squashfs, don't duplicate it inside.
'^boot$/.'
# Filesystems created at boot time.
'^(dev|tmp|run)$/.'
'^var$/^(lock|run|tmp)$/.'
# Build-time configuration and cache.
'^etc$/^(debian_chroot|hostname|hosts|motd(\.tail)?|mtab|resolv.conf)$'
'^etc$/^apt$/^apt.conf.d$/^10bootstrap$'
'^etc$/^network$/^interfaces$'
'^usr$/^sbin$/^policy-rc\.d$'
'^var$/^cache$/^apt$/^(src)?pkgcache\.bin$'
'^var$/^cache$/^apt$/^archives$/\.deb$'
'^var$/^cache$/^bootstrap$'
'^var$/^lib$/^apt$/^lists$/.'
'^var$/^log$/.'
)
mksquashfs $t $t/boot/filesystem.squashfs -regex -e "${exclusions[@]}"
kvm -m 256 -nographic -kernel $t/boot/vmlinuz -initrd $t/boot/initrd.img -append 'console=ttyS0 root=/dev/vda loglevel=1 security=selinux boot=overlaytest' -drive file=$t/boot/filesystem.squashfs,index=0,media=disk,if=virtio -net nic,model=virtio
[-- Attachment #1.3: output.txt --]
[-- Type: text/plain, Size: 4638 bytes --]
+ kvm -m 256 -nographic -kernel live/boot/vmlinuz -initrd live/boot/initrd.img -append 'console=ttyS0 root=/dev/vda loglevel=1 security=selinux boot=overlaytest' -drive file=live/boot/filesystem.squashfs,index=0,media=disk,if=virtio -net nic,model=virtio
Warning: vlan 0 is not connected to host network
Loading, please wait...
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/nfs-top ... done.
Begin: Running /scripts/nfs-premount ... done.
+ : --> Mount a tmpfs on /live for use by overlay
+ mkdir /live
+ mount -t tmpfs tmpfs /live
+ : --> Make the two subdirs required by overlay
+ mkdir -p /live/overlay/rw /live/overlay/work
+ : --> Make /filesystem to mount the read-only squashfs
+ mkdir /filesystem
+ : --> Create a /etc directory in what will become the writable portion of
+ : --> the overlay filesystem
+ mkdir -p /live/overlay/rw/etc
+ : --> Mount the squashfs
+ mount -t squashfs /dev/vda /filesystem
+ : --> Union the tmpfs and the squashfs with overlayfs and mount them on
+ : --> /root
+ mount -t overlay -o noatime,lowerdir=/filesystem/,upperdir=/live/overlay/rw,workdir=/live/overlay/work overlay /root/
+ : --> Demonstrate that creating a file...
+ touch /root/newfile
+ : --> ... creating a directory...
+ mkdir -p /root/newdir
+ : --> ... and creating a file in the new directory all work in the
+ : --> root of the overlay filesystem...
+ touch /root/newdir/newfile
+ : --> ...before cleaning up those files/dirs
+ rm -r /root/newfile /root/newdir/newfile /root/newdir
+ : --> Demonstrate that touching an existing directory (/etc, which we
+ : --> created earlier), and a file within it, works
+ touch /root/etc/
+ touch /root/etc/newfile
+ : --> Demonstrate that touching a directory or file not already present in
+ : --> the read-write part of overlay does *NOT* work
+ touch /root/home/
touch: /root/home/: Operation not supported
+ touch /root/home/newfile
touch: /root/home/newfile: Operation not supported
+ set +x
Spawning shell within the initramfs
modprobe: module ehci-orion not found in modules.dep
BusyBox v1.22.1 (Debian 1:1.22.0-9+deb8u1) built-in shell (ash)
Enter 'help' for a list of built-in commands.
/bin/sh: can't access tty; job control turned off
(initramfs) strace touch /root/home/newfile
execve("/bin/touch", ["touch", "/root/home/newfile"], [/* 32 vars */]) = 0
brk(0) = 0x1db8000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f64fcfce000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=1358, ...}) = 0
mmap(NULL, 1358, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f64fcfcd000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\34\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1729984, ...}) = 0
mmap(NULL, 3836448, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f64fca07000
mprotect(0x7f64fcba6000, 2097152, PROT_NONE) = 0
mmap(0x7f64fcda6000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19f000) = 0x7f64fcda6000
mmap(0x7f64fcdac000, 14880, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f64fcdac000
close(3) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f64fcfcc000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f64fcfcb000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f64fcfca000
arch_prctl(ARCH_SET_FS, 0x7f64fcfcb700) = 0
mprotect(0x7f64fcda6000, 16384, PROT_READ) = 0
mprotect(0x69a000, 4096, PROT_READ) = 0
mprotect(0x7f64fcfd0000, 4096, PROT_READ) = 0
munmap(0x7f64fcfcd000, 1358) = 0
getuid() = 0
utimes("/root/home/newfile", NULL) = -1 ENOENT (No such file or directory)
open("/root/home/newfile", O_RDWR|O_CREAT, 0666) = -1 EOPNOTSUPP (Operation not supported)
brk(0) = 0x1db8000
brk(0x1dd9000) = 0x1dd9000
write(2, "touch: /root/home/newfile: Opera"..., 51touch: /root/home/newfile: Operation not supported
) = 51
exit_group(1) = ?
+++ exited with 1 +++
(initramfs)
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 966 bytes --]
* overlayfs+selinux error: OPNOTSUPP
@ 2015-09-21 2:25 Matthew Cengia
2015-09-21 20:42 ` Stephen Smalley
0 siblings, 1 reply; 9+ messages in thread
From: Matthew Cengia @ 2015-09-21 2:25 UTC (permalink / raw)
To: selinux; +Cc: Matthew Cengia, russell
[-- Attachment #1.1: Type: text/plain, Size: 3297 bytes --]
NOTE: I originally sent this to LKML
(https://lkml.org/lkml/2015/9/17/888), but was directed here.
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 966 bytes --]
* Re: overlayfs+selinux error: OPNOTSUPP
2015-09-21 2:25 overlayfs+selinux error: OPNOTSUPP Matthew Cengia
@ 2015-09-21 20:42 ` Stephen Smalley
2015-09-21 20:47 ` Stephen Smalley
2015-09-23 3:23 ` Russell Coker
0 siblings, 2 replies; 9+ messages in thread
From: Stephen Smalley @ 2015-09-21 20:42 UTC (permalink / raw)
To: Matthew Cengia, selinux; +Cc: russell, Matthew Cengia
On 09/20/2015 10:25 PM, Matthew Cengia wrote:
> NOTE: I originally sent this to LKML
> (https://lkml.org/lkml/2015/9/17/888), but was directed here.
>
> Hi all,
>
> Please CC me directly when responding, as I'm not subscribed to the
> mailing list.
>
>
> Summary
> -------
> I deploy diskless Debian kiosks in prisons, for use by inmates.
> As part of the Debian 7 to 8 upgrade, I want to enable SELinux.
> My initrd uses overlayfs to combine a ro squashfs and a rw tmpfs.
>
> When I add SELinux into the mix, I get a lot of EOPNOTSUPP.
>
>
> Long and boring history
> -----------------------
> I was happy with Debian 7 / Linux 3.16 / sysvinit / aufs.
> Then, new hardware arrived, which needed a newer Xorg.
> So I had to switch to Debian 8 / Linux 3.16.
> Debian 8 defaults to systemd, so I went with that.
>
> I used to put $XDG_RUNTIME_DIR under a /tmp mounted -onoexec.
> Systemd v215 is hard-coded to mount $XDG_RUNTIME_DIR as a dedicated tmpfs,
> and provides no way to mount/remount it with -onoexec.
>
> src/login/logind-user.c:336:user_mkdir_runtime_path()
>
> When I complained about this, regulars on #systemd on Freenode said:
>
> Just use SELinux, already!
> -o noexec might break something, and it won't stop interpreters.
>
> ...which was mostly reasonable.
> So adopting SELinux was reprioritized from "some day" to "right now!"
>
> aufs doesn't support SELinux, so I had to switch to overlayfs.
> So now my target is Debian 8 / Linux 4.1 / systemd / overlayfs / SELinux.
>
>
> Current problem
> ---------------
> When I built & booted that combination, hostnames didn't resolve.
>
> The initrd uses klibc ipconfig as a DHCP client,
> then tries to create /etc/resolv.conf in the rootfs.
> (This happens before switch_root.)
>
> When SELinux is enabled, resolv.conf can't be opened for writing.
> The attached strace (output.txt) shows open(2) gets EOPNOTSUPP.
>
>
> Tests completed
> ---------------
> This problem *ONLY* occurs in the initrd,
> which is *BEFORE* the SELinux policy loads.
> I'm not sure if this is relevant.
Yes, I believe it is. Most likely culprit is:
security/selinux/hooks.c (starting at line 2890):
    static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
                                      const void *value, size_t size, int flags)
    {
            struct inode *inode = dentry->d_inode;
            struct inode_security_struct *isec = inode->i_security;
            struct superblock_security_struct *sbsec;
            struct common_audit_data ad;
            u32 newsid, sid = current_sid();
            int rc = 0;

            if (strcmp(name, XATTR_NAME_SELINUX))
                    return selinux_inode_setotherxattr(dentry, name);

            sbsec = inode->i_sb->s_security;
            if (!(sbsec->flags & SBLABEL_MNT))
                    return -EOPNOTSUPP;
                           ^^^^^^^^^^^
That's to prevent setting SELinux attributes on a filesystem that does
not support labeling due to use of a context= mount or policy genfscon
rules to override any xattrs on the filesystem. Maybe that should be
exempted if no policy is loaded (!ss_initialized).
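For anyone wanting to check this from userspace: as far as I know, SELinux reports mounts that have SBLABEL_MNT set by adding "seclabel" to their options in /proc/mounts, though only once a policy is loaded, so it would not show anything useful inside the initrd itself. A minimal sketch under that assumption; has_seclabel is a hypothetical helper name, and the optional second argument exists only so that a saved copy of /proc/mounts can be examined.

```shell
# Hypothetical helper: check whether a mount point lists "seclabel" among
# its mount options.
#   $1 = mount point, $2 = mounts file (defaults to /proc/mounts)
# Exit status: 0 = seclabel present, 1 = absent, 2 = mount point not found.
has_seclabel() {
    awk -v mp="$1" '
        $2 == mp { opts = $4; found = 1 }   # the last matching entry wins
        END {
            if (!found) exit 2
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "seclabel") exit 0
            exit 1
        }' "${2:-/proc/mounts}"
}
```

For example, `has_seclabel /live && echo labelable` after the policy has loaded.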
At this point, I have to ask: which is easier? Patching systemd to do
what you want, loading policy earlier (in general, the earlier you load
SELinux policy, the better), or patching the kernel?
>
> This problem *DOES NOT* occur if the file/directory being written to
> already exists in the read/write portion of the overlay mount before the
> overlayfs is mounted. I've attached a script to demonstrate this.
>
> Booting the kernel with permissive=1 *DOES NOT* prevent the problem.
>
>
> Test script
> -----------
> Attached is a script called 'bootstrap'.
> When run on a Debian Jessie system with debootstrap, squashfs-tools, and kvm installed,
> and selinux installed and enabled (even if it's in permissive mode),
> 'bootstrap' will:
>
> * Mount a tmpfs without -o nodev at /tmp/bootstrap/live, to build in;
> * Build an SOE in /tmp/bootstrap/live/;
> * Create a squashfs of the built system;
> * Leave the squashfs, kernel, and initrd in /tmp/bootstrap/live/boot/; and
> * Start up a VM using KVM to demonstrate the behaviour.
>
> The script that the initrd runs does several things, all of which are
> detailed within the script, and in output.txt; look for lines
> containing '-->'.
>
> output.txt contains a full KVM run of the system exhibiting the problem,
> in which I've also run an 'strace touch' to demonstrate the failing
> syscall.
>
>
> Help?
> -----
> How can I set about debugging this problem further?
> Has anybody dealt with this before?
> How can I solve (or workaround) this problem?
>
>
>
>
* Re: overlayfs+selinux error: OPNOTSUPP
2015-09-21 20:42 ` Stephen Smalley
@ 2015-09-21 20:47 ` Stephen Smalley
2015-09-22 1:24 ` Matthew Cengia
2015-09-23 3:23 ` Russell Coker
1 sibling, 1 reply; 9+ messages in thread
From: Stephen Smalley @ 2015-09-21 20:47 UTC (permalink / raw)
To: Matthew Cengia, selinux; +Cc: russell, Matthew Cengia
On 09/21/2015 04:42 PM, Stephen Smalley wrote:
> On 09/20/2015 10:25 PM, Matthew Cengia wrote:
>> NOTE: I originally sent this to LKML
>> (https://lkml.org/lkml/2015/9/17/888), but was directed here.
>>
>> Hi all,
>>
>> Please CC me directly when responding, as I'm not subscribed to the
>> mailing list.
>>
>>
>> Summary
>> -------
>> I deploy diskless Debian kiosks in prisons, for use by inmates.
>> As part of the Debian 7 to 8 upgrade, I want to enable SELinux.
>> My initrd uses overlayfs to combine a ro squashfs and a rw tmpfs.
>>
>> When I add SELinux into the mix, I get a lot of EOPNOTSUPP.
>>
>>
>> Long and boring history
>> -----------------------
>> I was happy with Debian 7 / Linux 3.16 / sysvinit / aufs.
>> Then, new hardware arrived, which needed a newer Xorg.
>> So I had to switch to Debian 8 / Linux 3.16.
>> Debian 8 defaults to systemd, so I went with that.
>>
>> I used to put $XDG_RUNTIME_DIR under a /tmp mounted -onoexec.
>> Systemd v215 is hard-coded to mount $XDG_RUNTIME_DIR as a dedicated tmpfs,
>> and provides no way to mount/remount it with -onoexec.
>>
>> src/login/logind-user.c:336:user_mkdir_runtime_path()
>>
>> When I complained about this, regulars on #systemd on Freenode said:
>>
>> Just use SELinux, already!
>> -o noexec might break something, and it won't stop interpreters.
>>
>> ...which was mostly reasonable.
>> So adopting SELinux was reprioritized from "some day" to "right now!"
>>
>> aufs doesn't support SELinux, so I had to switch to overlayfs.
>> So now my target is Debian 8 / Linux 4.1 / systemd / overlayfs / SELinux.
>>
>>
>> Current problem
>> ---------------
>> When I built & booted that combination, hostnames didn't resolve.
>>
>> The initrd uses klibc ipconfig as a DHCP client,
>> then tries to create /etc/resolv.conf in the rootfs.
>> (This happens before switch_root.)
>>
>> When SELinux is enabled, resolv.conf can't be opened for writing.
>> The attached strace (output.txt) shows open(2) gets EOPNOTSUPP.
>>
>>
>> Tests completed
>> ---------------
>> This problem *ONLY* occurs in the initrd,
>> which is *BEFORE* the SELinux policy loads.
>> I'm not sure if this is relevant.
>
> Yes, I believe it is. Most likely culprit is:
> security/selinux/hooks.c, lines 2890-2905:
>
> static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
>                                   const void *value, size_t size, int flags)
> {
>         struct inode *inode = dentry->d_inode;
>         struct inode_security_struct *isec = inode->i_security;
>         struct superblock_security_struct *sbsec;
>         struct common_audit_data ad;
>         u32 newsid, sid = current_sid();
>         int rc = 0;
>
>         if (strcmp(name, XATTR_NAME_SELINUX))
>                 return selinux_inode_setotherxattr(dentry, name);
>
>         sbsec = inode->i_sb->s_security;
>         if (!(sbsec->flags & SBLABEL_MNT))
>                 return -EOPNOTSUPP;
>                        ^^^^^^^^^^^^
> That's to prevent setting SELinux attributes on a filesystem that does
> not support labeling due to use of a context= mount or policy genfscon
> rules to override any xattrs on the filesystem. Maybe that should be
> exempted if no policy is loaded (!ss_initialized).
>
> At this point, I have to ask: which is easier, patching systemd to do
> what you want, loading policy earlier (in general, the earlier you load
> SELinux policy, the better), or patching the kernel.
BTW, IIUC, the reason that this manifests on an open(2) call is that
overlayfs is trying to copy-up any xattrs from the lower filesystem to
the upper filesystem when you touch the file, which triggers a
vfs_getxattr on the lower filesystem and then a vfs_setxattr on the
upper filesystem, and then we fail here. Not something we would see on
open(2) otherwise.
>
>>
>> This problem *DOES NOT* occur if the file/directory being written to
>> already exists in the read/write portion of the overlay mount before the
>> overlayfs is mounted. I've attached a script to demonstrate this.
>>
>> Booting the kernel with permissive=1 *DOES NOT* prevent the problem.
>>
>>
>> Test script
>> -----------
>> Attached is a script called 'bootstrap'.
>> When run on a Debian Jessie system with debootstrap, squashfs-tools, and kvm installed,
>> and selinux installed and enabled (even if it's in permissive mode),
>> 'bootstrap' will:
>>
>> * Mount a tmpfs without -o nodev at /tmp/bootstrap/live, to build in;
>> * Build an SOE in /tmp/bootstrap/live/;
>> * Create a squashfs of the built system;
>> * Leave the squashfs, kernel, and initrd in /tmp/bootstrap/live/boot/; and
>> * Start up a VM using KVM to demonstrate the behaviour.
>>
>> The script that the initrd runs does several things, all of which are
>> detailed within the script, and in output.txt; look for lines
>> containing '-->'.
>>
>> output.txt contains a full KVM run of the system exhibiting the problem,
>> in which I've also run an 'strace touch' to demonstrate the failing
>> syscall.
>>
>>
>> Help?
>> -----
>> How can I set about debugging this problem further?
>> Has anybody dealt with this before?
>> How can I solve (or workaround) this problem?
* Re: overlayfs+selinux error: OPNOTSUPP
2015-09-21 20:47 ` Stephen Smalley
@ 2015-09-22 1:24 ` Matthew Cengia
2015-09-22 13:36 ` Stephen Smalley
0 siblings, 1 reply; 9+ messages in thread
From: Matthew Cengia @ 2015-09-22 1:24 UTC (permalink / raw)
To: Stephen Smalley; +Cc: selinux, russell
On 2015-09-21 16:47, Stephen Smalley wrote:
[...]
> >> This problem *ONLY* occurs in the initrd,
> >> which is *BEFORE* the SELinux policy loads.
> >> I'm not sure if this is relevant.
> >
> > Yes, I believe it is. Most likely culprit is:
> > security/selinux/hooks.c, lines 2890-2905:
> >
> > static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
> >                                   const void *value, size_t size, int flags)
> > {
> >         struct inode *inode = dentry->d_inode;
> >         struct inode_security_struct *isec = inode->i_security;
> >         struct superblock_security_struct *sbsec;
> >         struct common_audit_data ad;
> >         u32 newsid, sid = current_sid();
> >         int rc = 0;
> >
> >         if (strcmp(name, XATTR_NAME_SELINUX))
> >                 return selinux_inode_setotherxattr(dentry, name);
> >
> >         sbsec = inode->i_sb->s_security;
> >         if (!(sbsec->flags & SBLABEL_MNT))
> >                 return -EOPNOTSUPP;
> >                        ^^^^^^^^^^^^
> > That's to prevent setting SELinux attributes on a filesystem that does
> > not support labeling due to use of a context= mount or policy genfscon
> > rules to override any xattrs on the filesystem. Maybe that should be
> > exempted if no policy is loaded (!ss_initialized).
> >
> > At this point, I have to ask: which is easier, patching systemd to do
> > what you want, loading policy earlier (in general, the earlier you load
> > SELinux policy, the better), or patching the kernel.
>
> BTW, IIUC, the reason that this manifests on an open(2) call is that
> overlayfs is trying to copy-up any xattrs from the lower filesystem to
> the upper filesystem when you touch the file, which triggers a
> vfs_getxattr on the lower filesystem and then a vfs_setxattr on the
> upper filesystem, and then we fail here. Not something we would see on
> open(2) otherwise.
Thanks for your response, Stephen!

Let me confirm I understand correctly. The problem doesn't occur when I
write a file to the root of the overlay mountpoint. Are you saying this
is because I'm not attempting to copy/set any SELinux attributes on this
file, but when I write something to /etc or /home, copy-up attempts and
fails to write the SELinux attribute xattr?

As for possible solutions: I'm not sure I want to contemplate patching
systemd, so I'll leave that as a last resort.

I'm happy to investigate loading the policy earlier; I'll need to talk
to some Debian SELinux people to understand better how the policy gets
loaded so I can duplicate that functionality into the initrd; I'm still
getting my head around SELinux.

Your final suggestion was a kernel change (!ss_initialized). Are you
suggesting this is something you'd consider changing in mainline, or
something I might want to patch for my specific instance? I want to
avoid the latter, but if you think the former may be sensible, that'd be
cool. Would that mean, however, that SELinux attributes may not be set
correctly and that the files created wouldn't be accessible by everything
that needs them after the policy has loaded?

--
Regards,
Matthew Cengia
* Re: overlayfs+selinux error: OPNOTSUPP
2015-09-22 1:24 ` Matthew Cengia
@ 2015-09-22 13:36 ` Stephen Smalley
0 siblings, 0 replies; 9+ messages in thread
From: Stephen Smalley @ 2015-09-22 13:36 UTC (permalink / raw)
To: Matthew Cengia; +Cc: selinux, russell
On 09/21/2015 09:24 PM, Matthew Cengia wrote:
> On 2015-09-21 16:47, Stephen Smalley wrote:
> [...]
>>>> This problem *ONLY* occurs in the initrd,
>>>> which is *BEFORE* the SELinux policy loads.
>>>> I'm not sure if this is relevant.
>>>
>>> Yes, I believe it is. Most likely culprit is:
>>> security/selinux/hooks.c, lines 2890-2905:
>>>
>>> static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
>>>                                   const void *value, size_t size, int flags)
>>> {
>>>         struct inode *inode = dentry->d_inode;
>>>         struct inode_security_struct *isec = inode->i_security;
>>>         struct superblock_security_struct *sbsec;
>>>         struct common_audit_data ad;
>>>         u32 newsid, sid = current_sid();
>>>         int rc = 0;
>>>
>>>         if (strcmp(name, XATTR_NAME_SELINUX))
>>>                 return selinux_inode_setotherxattr(dentry, name);
>>>
>>>         sbsec = inode->i_sb->s_security;
>>>         if (!(sbsec->flags & SBLABEL_MNT))
>>>                 return -EOPNOTSUPP;
>>>                        ^^^^^^^^^^^^
>>> That's to prevent setting SELinux attributes on a filesystem that does
>>> not support labeling due to use of a context= mount or policy genfscon
>>> rules to override any xattrs on the filesystem. Maybe that should be
>>> exempted if no policy is loaded (!ss_initialized).
>>>
>>> At this point, I have to ask: which is easier, patching systemd to do
>>> what you want, loading policy earlier (in general, the earlier you load
>>> SELinux policy, the better), or patching the kernel.
>>
>> BTW, IIUC, the reason that this manifests on an open(2) call is that
>> overlayfs is trying to copy-up any xattrs from the lower filesystem to
>> the upper filesystem when you touch the file, which triggers a
>> vfs_getxattr on the lower filesystem and then a vfs_setxattr on the
>> upper filesystem, and then we fail here. Not something we would see on
>> open(2) otherwise.
>
> Thanks for your response Stephen!
>
> Let me confirm I understand correctly. The problem doesn't occur when I
> write a file to the root of the overlay mountpoint. Are you saying this
> is because I'm not attempting to copy/set any SELinux attributes on this
> file, but when I write something to /etc or /home, copy-up attempts and
> fails to write the SELinux attribute xattr?
That's my theory, yes.
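
For illustration, here is a minimal sketch of that theory as a shell
function. It is a simplified model and not overlayfs code: it assumes a
parent directory needs copy-up when it exists in the lower layer but not
yet in the upper layer, and it ignores whiteouts and opaque directories.
The function name and its arguments are made up for this example.

```shell
#!/bin/sh
# Simplified model (an assumption, not kernel behaviour verbatim):
# print the parent directories that would need copy-up before a file at
# REL (relative to the overlay root) can be created.
#   $1 = lower (read-only) layer, $2 = upper (read-write) layer, $3 = REL
copy_up_candidates() {
    lower=$1 upper=$2 rel=$3
    dir=$(dirname "$rel")
    out=""
    # Walk from the file's parent up to the overlay root.
    while [ "$dir" != "." ] && [ "$dir" != "/" ]; do
        # Present below, absent above: this directory would be copied up,
        # which is where its xattrs get read and re-set.
        if [ -d "$lower/$dir" ] && [ ! -e "$upper/$dir" ]; then
            out="$dir${out:+ $out}"
        fi
        dir=$(dirname "$dir")
    done
    printf '%s\n' "$out"
}
```

With only lower/etc present, asking about etc/resolv.conf reports etc,
while a new file directly in the overlay root reports nothing. Once
upper/etc exists, etc/resolv.conf reports nothing either, which matches
the cases described above.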
> As for possible solutions: I'm not sure I want to contemplate patching
> systemd, so I'll leave that as a last resort.
I will caveat that working through all the potential issues of SELinux +
overlayfs + squashfs (and I am not sure how well or if it actually does
work with that combination) and getting a functional policy that works
for your setup and actually enforces the equivalent to noexec for your
XDG_RUNTIME_DIR might be a lot more work than just patching systemd to
add the noexec mount option if that is really all you wanted. Not
trying to dissuade you, but just want you to be aware of the potential
work factor.
> I'm happy to investigate loading the policy earlier; I'll need to talk
> to some Debian SELinux people to understand better how the policy gets
> loaded so I can duplicate that functionality into the initrd; I'm still
> getting my head around SELinux.
To do that, you'd have to copy the policy files (i.e. entire
/etc/selinux tree) into the initrd itself. I think systemd will then
load them automatically if you are using systemd as the initrd's init,
or if not, you can always add a direct call to load_policy -i to your
/linuxrc or equivalent.
> Your final suggestion was a kernel change (!ss_initialized). Are you
> suggesting this is something you'd consider changing in mainline, or
> something I might want to patch for my specific instance?
I'd be open to the former, although I can't say that I have fully
considered all the implications.
> I want to
> avoid the latter, but if you think the former may be sensible, that'd be
> cool. Would that mean, however, that SELinux attributes may not be set
> correctly and that the files created wouldn't be accessible by everything
> that needs them after the policy has loaded?
Potentially, yes. That's why it is better to load policy as early as
possible. On the other hand, if the files are only ever used prior to
the switch root (and thus prior to the normal policy load from the real
root), then it might not matter.
* Re: overlayfs+selinux error: OPNOTSUPP
2015-09-21 20:42 ` Stephen Smalley
2015-09-21 20:47 ` Stephen Smalley
@ 2015-09-23 3:23 ` Russell Coker
2015-09-23 16:25 ` Stephen Smalley
2015-09-24 7:00 ` Matthew Cengia
1 sibling, 2 replies; 9+ messages in thread
From: Russell Coker @ 2015-09-23 3:23 UTC (permalink / raw)
To: Stephen Smalley; +Cc: Matthew Cengia, selinux, Matthew Cengia
On Tue, 22 Sep 2015 06:42:34 AM Stephen Smalley wrote:
> At this point, I have to ask: which is easier, patching systemd to do
> what you want, loading policy earlier (in general, the earlier you load
> SELinux policy, the better), or patching the kernel.

Patching the kernel is unreasonably difficult and not something you want to
maintain going forward (note that I maintained the SE Linux kernel patch
package in Debian for some years before it was accepted upstream - I've had
practice at such things and it wasn't much fun).

What changes to systemd are you referring to? If you mean making it possible
to mount /tmp noexec I don't think that's a good idea. While I think the risk
of breakage is low (it's a constrained environment where general purpose use
isn't the aim) the benefits also seem minimal. As an aside the default SE
Linux policy permits regular users to execute files they create in /tmp but
that could be changed. It seems likely that Matthew will end up making a
custom policy anyway.

Regarding loading policy earlier, I thought we had already established that
loading policy in the initrd was generally a bad idea. It makes the initrd
bigger (which can cause problems in some situations) and it requires that the
initrd be changed whenever significant changes are made to the policy (which
realistically means changing the initrd every time you change the policy to be
certain). Loading the policy in the initrd is probably the best solution to
this use of an overlayfs system but I think it should be considered as an
unusual solution to an unusual problem rather than something that's generally
good.

To load the policy in the initrd you need to copy
/etc/selinux/default/policy/policy.* and /usr/sbin/load_policy to the initrd
and first mount /proc and the selinuxfs before loading the policy. It will be
a little fiddly to set up (as is anything involving the initrd) but not any
great challenge.

Also it's unlikely that systemd has been tested in a situation where an initrd
loads the policy. In case anyone wonders, I think it should be considered a
bug if systemd or SysVInit fails to work when the policy was loaded in the
initrd.
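
As a rough sketch of those steps, assuming a POSIX shell in the initrd and
that /sys is already mounted: the two function names and their arguments
are invented for this example. The sketch copies the whole /etc/selinux
tree rather than only policy.*, as suggested earlier in the thread, and
uses load_policy -i for the initial load. A plain cp does not bring along
the shared libraries load_policy links against, so a real initrd build
would need to add those too.

```shell
#!/bin/sh
# Build time: copy the policy tree and load_policy from ROOT into the
# initrd staging directory DEST. Both arguments are placeholders, and
# DEST/etc/selinux is assumed not to exist yet.
stage_selinux() {
    root=$1 dest=$2
    mkdir -p "$dest/etc" "$dest/usr/sbin" || return 1
    cp -a "$root/etc/selinux" "$dest/etc/" || return 1
    cp -a "$root/usr/sbin/load_policy" "$dest/usr/sbin/" || return 1
}

# Boot time: meant to run as root inside the initrd, before anything
# writes to the overlay. Mounts /proc if needed, then selinuxfs, then
# performs the initial policy load.
load_policy_early() {
    [ -e /proc/self ] || mount -t proc proc /proc || return 1
    mount -t selinuxfs selinuxfs /sys/fs/selinux || return 1
    load_policy -i
}
```

Only stage_selinux is safe to try on a running system (against scratch
directories); load_policy_early changes kernel state and belongs in the
initrd.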
--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
* Re: overlayfs+selinux error: OPNOTSUPP
2015-09-23 3:23 ` Russell Coker
@ 2015-09-23 16:25 ` Stephen Smalley
2015-09-24 7:00 ` Matthew Cengia
1 sibling, 0 replies; 9+ messages in thread
From: Stephen Smalley @ 2015-09-23 16:25 UTC (permalink / raw)
To: Russell Coker; +Cc: Matthew Cengia, selinux, Matthew Cengia
On 09/22/2015 11:23 PM, Russell Coker wrote:
> On Tue, 22 Sep 2015 06:42:34 AM Stephen Smalley wrote:
>> At this point, I have to ask: which is easier, patching systemd to do
>> what you want, loading policy earlier (in general, the earlier you load
>> SELinux policy, the better), or patching the kernel.
>
> Patching the kernel is unreasonably difficult and not something you want to
> maintain going forward (note that I maintained the SE Linux kernel patch
> package in Debian for some years before it was accepted upstream - I've had
> practice at such things and it wasn't much fun).
If my theory on why he is encountering EOPNOTSUPP on open(2) on
overlayfs+squashfs is correct, then he has to either patch the kernel
(but we are only talking about a one-line patch, and one that I would
consider taking upstream too) or he has to move up policy loading to the
initrd. That's all I'm suggesting there.
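
To make the shape of that change concrete, here is a toy model of the
quoted check, written as a shell function so it can be run anywhere. It is
not kernel code. The function name and argument layout are invented for
this sketch, it only models the first two tests in
selinux_inode_setxattr(), and the exemption shown is the idea floated in
this thread (skip the check when !ss_initialized), not an actual patch.

```shell
#!/bin/sh
# Toy model of the start of selinux_inode_setxattr(). Hypothetical arguments:
#   $1 = xattr name
#   $2 = 1 if the superblock has SBLABEL_MNT set, else 0
#   $3 = 1 if a policy is loaded (ss_initialized), else 0
#   $4 = 1 to apply the proposed !ss_initialized exemption, else 0
# Prints "EOPNOTSUPP" or "continue" (later checks are not modelled).
setxattr_label_check() {
    name=$1 sblabel=$2 initialized=$3 exempt=$4
    if [ "$name" != "security.selinux" ]; then
        # Other xattrs go to selinux_inode_setotherxattr() in the kernel.
        echo continue
        return 0
    fi
    if [ "$exempt" = 1 ] && [ "$initialized" = 0 ]; then
        # Proposed behaviour: no policy loaded yet, so skip the check.
        echo continue
        return 0
    fi
    if [ "$sblabel" = 0 ]; then
        # Filesystem is not marked as supporting labeling.
        echo EOPNOTSUPP
        return 0
    fi
    echo continue
}
```

In this model, setting security.selinux with no policy loaded and no
SBLABEL_MNT yields EOPNOTSUPP without the exemption and continues with it.
Once a policy is loaded, the exemption no longer applies.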
> What changes to systemd are you referring to? If you mean making it possible
> to mount /tmp noexec I don't think that's a good idea. While I think the risk
> of breakage is low (it's a constrained environment where general purpose use
> isn't the aim) the benefits also seem minimal. As an aside the default SE
> Linux policy permits regular users to execute files they create in /tmp but
> that could be changed. It seems likely that Matthew will end up making a
> custom policy anyway.
Yes, that is what I meant; he identified noexec XDG_RUNTIME_DIR as his
motivating reason for even enabling SELinux. If that's his only goal,
then it is a lot easier to do a one-line patch to systemd than to work
through all the details of getting SELinux working with
overlayfs+squashfs _and_ having to create a custom policy, don't you
think? Now, I'm all for him enabling SELinux; I just wanted him to
understand the potential cost up front.
> Regarding loading policy earlier, I thought we had already established that
> loading policy in the initrd was generally a bad idea. It makes the initrd
> bigger (which can cause problems in some situations) and it requires that the
> initrd be changed whenever significant changes are made to the policy (which
> realistically means changing the initrd every time you change the policy to be
> certain). Loading the policy in the initrd is probably the best solution to
> this use of an overlayfs system but I think it should be considered as an
> unusual solution to an unusual problem rather than something that's generally
> good.
I'll stand by my statement. The earlier policy is loaded, the better.
The later you load policy, the greater the potential for processes,
files, and other objects to be mislabeled and to need retroactive fixing
via restorecon or similar means. Also, as in this particular situation,
SELinux behavior when no policy is loaded may lead to surprising
results, as that doesn't get much testing. I understand the reasons why
the distributions tend to defer policy load to the real root, but I
suspect the optimal approach even there would actually be to load an
initial bootstrap policy from the initrd (which hopefully can remain
fairly small and stable) and then reload a more complete, more
frequently updated policy from the real root. Certainly less potential
for surprises there.
> To load the policy in the initrd you need to copy
> /etc/selinux/default/policy/policy.* and /usr/sbin/load_policy to the initrd
> and first mount /proc and the selinuxfs before loading the policy. It will be
> a little fiddly to set up (as is anything involving the initrd) but not any
> great challenge.
>
> Also it's unlikely that systemd has been tested in a situation where an initrd
> loads the policy. In case anyone wonders, I think it should be considered a
> bug if systemd or SysVInit fails to work when the policy was loaded in the
> initrd.
If you unpack the initramfs image on modern Fedora, the /init in the
initramfs is in fact systemd, and it already has the support for loading
policy. A cursory look at the code suggests that it supports loading
policy from the initrd and from the real root, and correctly handles the
case where policy only exists in one or the other as well as the case
where it exists in both. But YMMV.
* Re: overlayfs+selinux error: OPNOTSUPP
2015-09-23 3:23 ` Russell Coker
2015-09-23 16:25 ` Stephen Smalley
@ 2015-09-24 7:00 ` Matthew Cengia
1 sibling, 0 replies; 9+ messages in thread
From: Matthew Cengia @ 2015-09-24 7:00 UTC (permalink / raw)
To: Russell Coker; +Cc: Stephen Smalley, selinux
On 2015-09-23 13:23, Russell Coker wrote:
[...]
> To load the policy in the initrd you need to copy
> /etc/selinux/default/policy/policy.* and /usr/sbin/load_policy to the initrd
> and first mount /proc and the selinuxfs before loading the policy. It will be
> a little fiddly to set up (as is anything involving the initrd) but not any
> great challenge.
>
> Also it's unlikely that systemd has been tested in a situation where an initrd
> loads the policy. In case anyone wonders, I think it should be considered a
> bug if systemd or SysVInit fails to work when the policy was loaded in the
> initrd.

Thanks Russell!

Just a quick email to let you all know that with the below code in my
build script, I successfully copied in and loaded the SELinux policy in
the initrd, and my EOPNOTSUPP errors disappeared, allowing me to boot
the system!

That's Good Enough™ for me for the moment, and I can revisit using a
more minimal policy or similar later if necessary. For now, I'm going to
ensure I'm using Russell's latest Debian Jessie-compatible SELinux
policy to see what AVC denials I'm getting, before deciding whether to
just go with that and tweak it to do the few lockdowns I need, or learn
how to write a new one from scratch.

I suspect writing a full policy will be deferred until quite some time
later if I can get something small sorted out in the short-term, as I'm
on a reasonably tight timeline at the moment, having spent a month
fighting this problem and the ones leading up to it.

Thanks again for all your help everyone!

diff --git c/bootstrap w/bootstrap
index c38651e..cb45635 100755
--- c/bootstrap
+++ w/bootstrap
@@ -401,0 +402,16 @@ setfiles -r $t/ $t/etc/selinux/default/contexts/files/file_contexts $t/
+>$t/etc/initramfs-tools/hooks/selinux cat <<'EOF' && chmod +x $t/etc/initramfs-tools/hooks/selinux
+#!/bin/bash
+[ prereqs = "$1" ] && exit 0
+. /usr/share/initramfs-tools/hook-functions
+
+copy_exec /usr/sbin/load_policy
+cp -a {,"$DESTDIR"}/etc/selinux/
+EOF
+>$t/etc/initramfs-tools/scripts/init-top/selinux cat <<'EOF' && chmod +x $t/etc/initramfs-tools/scripts/init-top/selinux
+#!/bin/sh
+[ prereqs = "$1" ] && exit 0
+
+mount -t selinuxfs selinuxfs /sys/fs/selinux
+load_policy
+EOF
+chroot $t update-initramfs -u -k all
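
A small sanity check for the result, as a sketch only: given a directory
holding an unpacked copy of the generated initrd (however it was
extracted), it confirms that load_policy and at least one policy file made
it in. The function name is invented, and the paths assume the hook above
placed load_policy at usr/sbin/load_policy and the policy under
etc/selinux/<type>/policy/.

```shell
#!/bin/sh
# Check an unpacked initrd tree ($1) for the pieces the hook should add.
# Returns 0 if both are present, 1 (with a message) otherwise.
initrd_has_policy() {
    dir=$1
    if [ ! -x "$dir/usr/sbin/load_policy" ]; then
        echo "missing load_policy"
        return 1
    fi
    # Any policy type directory will do (Debian uses "default").
    for f in "$dir"/etc/selinux/*/policy/policy.*; do
        [ -f "$f" ] && return 0
    done
    echo "missing policy file"
    return 1
}
```

Running it against the unpacked tree before booting catches a hook that
silently copied nothing, which is cheaper than finding out from another
round of EOPNOTSUPP in the VM.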
--
Regards,
Matthew Cengia