From: Jonathan Corbet <corbet@lwn.net>
To: Shuah Khan <skhan@linuxfoundation.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>,
kstewart@linuxfoundation.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] docs: add system-state document to admin-guide
Date: Thu, 23 Mar 2023 11:55:32 -0600
Message-ID: <877cv7cpyj.fsf@meer.lwn.net>
In-Reply-To: <20230322152049.12723-1-skhan@linuxfoundation.org>
Shuah Khan <skhan@linuxfoundation.org> writes:
> Add a new system state document to the admin-guide. This document is
> intended to be used as a guide on how to gather higher level information
> about a system and its run-time activity.
>
> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
> ---
> Changes since v1:
> -- Addressed review comments
>
> Documentation/admin-guide/index.rst | 1 +
> Documentation/admin-guide/system-state.rst | 350 +++++++++++++++++++++
> 2 files changed, 351 insertions(+)
> create mode 100644 Documentation/admin-guide/system-state.rst
>
> diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
> index f475554382e2..541372672c55 100644
> --- a/Documentation/admin-guide/index.rst
> +++ b/Documentation/admin-guide/index.rst
> @@ -66,6 +66,7 @@ subsystems expectations will be found here.
> :maxdepth: 1
>
> workload-tracing
> + system-state
>
> The rest of this manual consists of various unordered guides on how to
> configure specific aspects of kernel behavior to your liking.
> diff --git a/Documentation/admin-guide/system-state.rst b/Documentation/admin-guide/system-state.rst
> new file mode 100644
> index 000000000000..2a6fdf85c35c
> --- /dev/null
> +++ b/Documentation/admin-guide/system-state.rst
> @@ -0,0 +1,350 @@
> +.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)
> +
> +===========================================================
> +Discovering system calls and features supported on a system
> +===========================================================
> +
> +:Author: Shuah Khan <skhan@linuxfoundation.org>
> +:maintained-by: Shuah Khan <skhan@linuxfoundation.org>
Rather than adding lines like this, I think everybody would be better
served with a MAINTAINERS file entry. get_maintainer.pl doesn't know
about these lines.
> +Key Points
> +==========
> +
> + * System state includes system calls, features, static and dynamic
> + modules enabled in the kernel configuration.
> + * Supported system calls and Kernel features are architecture dependent.
> + * auditd, checksyscalls.sh, and get_feat.pl tools can be used to discover
> + static system state.
> + * Understanding Linux kernel hardening configurations options and making
> + sure they are enabled will make a system more secure.
> + * Employing run-time tracing can shed light on the dynamic system state.
> + * Workloads could change the system state by loading and unloading dynamic
> + modules and tuning system parameters.
What I'm missing, even before this, is a paragraph saying what this
document is actually for. Who is the intended audience, and why might
they want to read this document?
> +System State Visualization
> +==========================
> +
> +The kernel system state can be viewed as a combination of static and
> +dynamic features and modules. Let’s first define what static and dynamic
> +system states are and then explore how we can visualize the static and
> +dynamic system parts of the kernel.
> +
> +Static System View comprises system calls, features, static and dynamic
> +modules enabled in the kernel configuration. Supported system calls
So the "static system view" includes *dynamic* modules? Fine if that's
what you intended, but it reads a bit strangely.
> +and Kernel features are architecture dependent. System call numbering is
> +different on different architectures. We can get the supported system call
> +information using auditd utilities.
> +
> +ausyscall –dump prints out the supported system calls on a system and allows
Some clever software turned your "--" into an en dash here.
> +mapping syscall names and numbers. You can install the auditd package on
> +Debian based systems::
> +
> + sudo apt-get install auditd
> +
> +scripts/checksyscalls.sh can be used to check if current architecture is
> +missing any system calls compared to i386.
> +
> +scripts/get_feat.pl can be used to list the Kernel feature support matrix
> +for an architecture.
> +
> +Dynamic System View comprises system calls, ioctls invoked, and subsystems
> +used during the runtime. A workload could load and unload modules and also
> +change the dynamic system configuration to suit its needs by tuning system
> +parameters.
> +
> +What is the methodology?
> +========================
> +
> +The first step is gathering the default system state such as the dynamic
> +and static modules loaded on the system. lsmod command prints out the
*The* lsmod command
> +dynamically loaded modules on a system. Statically configured modules can
> +be found in the kernel configuration file.
> +
> +The next step is discovering system activity during run-time. You can do so
> +by enabling event tracing and then running your favorite application. After
> +a period of time, gather the event logs, and kernel messages.
Might your intended readers need a hint on enabling tracing? A cross
reference to the appropriate docs if nothing else.
[Later I see you get to this; adding an "as described below" would help
here.]
> +Once you have the necessary information, you can extract the system call
> +numbers from the event trace log and map them to the supported system calls.
> +
> +Finding supported system calls
> +==============================
> +
> +As mentioned earlier, ausyscall prints out supported system calls
> +on a system and allows mapping syscalls names and numbers::
> +
> + ausyscall --dump
> +
> +You can look for specific system calls as shown in the below::
> +
> + ausyscall open
> + open 2
> + mq_open 240
> + openat 257
> + perf_event_open 298
> + open_by_handle_at 304
> + open_tree 428
> + fsopen 430
> + pidfd_open 434
> + openat2 437
> +
> + ausyscall time
> +
> + getitimer 36
> + setitimer 38
> + gettimeofday 96
> + times 100
> + rt_sigtimedwait 128
> + utime 132
> + adjtimex 159
> + settimeofday 164
> + time 201
> + semtimedop 220
> + timer_create 222
> + timer_settime 223
> + timer_gettime 224
> + timer_getoverrun 225
> + timer_delete 226
> + clock_settime 227
> + clock_gettime 228
> + utimes 235
> + mq_timedsend 242
> + mq_timedreceive 243
> + futimesat 261
> + utimensat 280
> + timerfd_create 283
> + timerfd_settime 286
> + timerfd_gettime 287
> + clock_adjtime 305
> +
> +Finding unsupported system calls
> +================================
> +
> +As mentioned earlier, scripts/checksyscalls.sh checks missing system calls
> +on current architecture compared to i386. Example run::
> +
> + checksyscalls.sh gcc
> + warning: #warning syscall mmap2 not implemented [-Wcpp]
> + warning: #warning syscall truncate64 not implemented [-Wcpp]
> + warning: #warning syscall ftruncate64 not implemented [-Wcpp]
> + warning: #warning syscall fcntl64 not implemented [-Wcpp]
> + warning: #warning syscall sendfile64 not implemented [-Wcpp]
> + warning: #warning syscall statfs64 not implemented [-Wcpp]
> + warning: #warning syscall fstatfs64 not implemented [-Wcpp]
> + warning: #warning syscall fadvise64_64 not implemented [-Wcpp]
> +
> +Let's check this against ausyscall now::
> +
> + ausyscall map
> + mmap 9
> + munmap 11
> + mremap 25
> + remap_file_pages 216
> +
> + ausyscall trunc
> + truncate 76
> + ftruncate 77
> +
> +As you can see, ausyscall shows mmap2, truncate64, and ftruncate64 aren't
> +implemented on this system. This matches what checksyscalls.sh shows.
> +
> +Finding supported features
> +==========================
> +
> +scripts/get_feat.pl can be used to list the Kernel feature support matrix
> +for an architecture::
> +
> + get_feat.pl list
> + get_feat.pl list –arch=arm64 lists
Lost the "--" again here
> +This scripts parses Documentation/features to find the support status
script (singular)
> +information. It can be used to validate the contents of the files under
> +Documentation/features or simply list them::
> +
> + --arch Outputs features for an specific architecture, optionally filtering
> + for a single specific feature.
> + --feat or --feature Output features for a single specific feature.
> +
> +Here is how you can find if stackprotector and hread-info-in-task features
and *thread*-info-in-task
> +are supported::
> +
> + scripts/get_feat.pl --arch=arm64 --feat=stackprotector list
> + #
> + # Kernel feature support matrix of the 'arm64' architecture:
> + #
> + debug/ stackprotector : ok | HAVE_STACKPROTECTOR #
> + arch supports compiler driven stack overflow protection
> +
> + scripts/get_feat.pl --feat=thread-info-in-task list
> + #
> + # Kernel feature support matrix of the 'x86' architecture:
> + #
> + core/ thread-info-in-task : ok | THREAD_INFO_IN_TASK #
> + arch makes use of the core kernel facility to embed thread_info in
> + task_struct
> +
> +Finding kernel module status
> +============================
> +
> +lsmod command shows the kernel modules that are currently loaded. This
> +program displays the contents of /proc/modules. Let's pick uvcvideo
*The* lsmod
*the* uvcvideo
> +module which is found on most laptops::
> +
> + lsmod | grep uvc
> + uvcvideo 126976 0
> + videobuf2_vmalloc 20480 1 uvcvideo
> + uvc 16384 1 uvcvideo
> + videobuf2_v4l2 36864 1 uvcvideo
> + videodev 315392 2 videobuf2_v4l2,uvcvideo
> + videobuf2_common 65536 4 videobuf2_vmalloc,videobuf2_v4l2,uvcvideo,videobuf2_memops
> + mc 77824 4 videodev,videobuf2_v4l2,uvcvideo,videobuf2_common
> +
> +You can see that lsmod shows uvcvideo and the modules it depends on and how
> +many modules are using them. videobuf2_common is in use by 4 other modules.
> +In other words, this is the reference count for this module and rmmod will
> +refuse to unload it as long as the reference count is > 0.
> +
> +You can get the same information from /proc.modules::
> +
> + less /proc/modules | grep uvc
Why not just "grep uvc /proc/modules"?
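If the point is the reference count rather than the whole line, a small
awk wrapper might read better. This is only a sketch: mod_refs is a
made-up helper name, and the optional second argument exists only so it
can be tried against a saved copy of /proc/modules.

```shell
# Print name, reference count, and users for modules whose name matches
# a regular expression. mod_refs is a hypothetical helper, not an
# existing tool. Fields follow the /proc/modules layout quoted above:
# $1 = name, $3 = reference count, $4 = comma-separated users.
mod_refs() {
    pattern=$1
    file=${2:-/proc/modules}
    awk -v pat="$pattern" '$1 ~ pat { print $1, $3, $4 }' "$file"
}
```

For example, "mod_refs uvc" would list every module with "uvc" in its
name; no pager or second process is needed.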
> + uvcvideo 126976 0 - Live 0x0000000000000000
> + videobuf2_vmalloc 20480 1 uvcvideo, Live 0x0000000000000000
> + uvc 16384 1 uvcvideo, Live 0x0000000000000000
> + videobuf2_v4l2 36864 1 uvcvideo, Live 0x0000000000000000
> + videodev 315392 2 uvcvideo,videobuf2_v4l2, Live 0x0000000000000000
> + videobuf2_common 65536 4 uvcvideo,videobuf2_vmalloc,videobuf2_memops,videobuf2_v4l2, Live 0x0000000000000000
> + mc 77824 4 uvcvideo,videobuf2_v4l2,videodev,videobuf2_common, Live 0x0000000000000000
> +
> +The information is similar with a few more extra fields. The address is the
> +base address for the module in kernel virtual memory space. When run as a
> +normal user, the address is all zeros. The same command when run as root will
> +be as follows::
> +
> + sudo less /proc/modules | grep uvc
> + uvcvideo 126976 0 - Live 0xffffffffc1c8b000
> + videobuf2_vmalloc 20480 1 uvcvideo, Live 0xffffffffc167f000
> + uvc 16384 1 uvcvideo, Live 0xffffffffc0ab0000
> + videobuf2_v4l2 36864 1 uvcvideo, Live 0xffffffffc0a28000
> + videodev 315392 2 uvcvideo,videobuf2_v4l2, Live 0xffffffffc16e9000
> + videobuf2_common 65536 4 uvcvideo,videobuf2_vmalloc,videobuf2_memops,videobuf2_v4l2, Live 0xffffffffc094d000
> + mc 77824 4 uvcvideo,videobuf2_v4l2,videodev,videobuf2_common, Live 0xffffffffc15eb000
> +
> +Let's check what modinfo shows that is important for us::
> +
> + /sbin/modinfo uvcvideo
> + filename: /lib/modules/6.3.0-rc2/kernel/drivers/media/usb/uvc/uvcvideo.ko
> + license: GPL
> + description: USB Video Class driver
> + depends: videobuf2-v4l2,videodev,mc,uvc,videobuf2-common,videobuf2-vmalloc
> + retpoline: Y
> + intree: Y
> + name: uvcvideo
> + vermagic: 6.3.0-rc2 SMP preempt mod_unload modversions
> + sig_id: PKCS#7
> + signer: Build time autogenerated kernel key
> +
> +This tells us that this module is built intree and the signed with a build
> +time autogenerated key.
> +
> +Let's do one last sanity check on the system to see if the following two
> +command outputs match::
> +
> + ps ax | wc -l
> + ls -d /proc/* | grep [0-9]|wc -l
> +
> +If they don't match, examine your system closely. kernel rootkits install
> +their own ps, find, etc. utilities to mask their activity. The outputs
> +match on my system. Do they on yours?
This would assume that there is no other activity on the system, of
course. Worth saying to avoid unnecessary panic.
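There are also a couple of mechanical reasons the two numbers can
differ: "ps ax" prints a header line, and the ps and wc processes in
the pipeline are themselves counted. A sketch of a somewhat tighter
count on the /proc side, assuming a hypothetical helper named
count_pid_dirs (the proc_root argument is there only so it can be
tried on a scratch directory):

```shell
# Count the numeric entries under /proc, one per process (thread group
# leader). count_pid_dirs is a hypothetical helper name.
count_pid_dirs() {
    proc_root=${1:-/proc}
    n=0
    for d in "$proc_root"/[0-9]*; do
        # If nothing matches, the glob stays literal and is skipped here.
        if [ -d "$d" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

Comparing that with "ps -e -o pid= | wc -l" avoids the header line, but
processes still come and go between the two commands, so a difference
of one or two is not, by itself, a sign of trouble.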
> +Is my system as secure as it could be?
> +======================================
> +
> +Linux kernel supports several hardening options to make system secure.
*The* Linux kernel ... to make *the* system secure.
The whole document could use a pass for article use.
> +kconfig-hardened-check tool sanity checks kernel configuration for
> +security. You can clone the latest kconfig-hardened-check repository::
> +
> + git clone https://github.com/a13xp0p0v/kconfig-hardened-check.git
> + cd kconfig-hardened-check
> + bin/kconfig-hardened-check --config <config file> --cmdline /proc/cmdline
Should you say what <config file> is?
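Something like the following could show readers where the running
kernel's configuration usually lives. It is a sketch under assumptions:
find_kernel_config is a made-up name, /boot/config-$(uname -r) is a
distribution convention rather than a guarantee, and /proc/config.gz
exists only when the kernel was built with CONFIG_IKCONFIG_PROC. The
second argument is a path prefix used only for trying the function out.

```shell
# Print the path of a readable kernel config for the given release
# (default: the running kernel). find_kernel_config is a hypothetical
# helper. Returns non-zero if no candidate is readable.
find_kernel_config() {
    release=${1:-$(uname -r)}
    root=${2:-}
    for f in "$root/boot/config-$release" "$root/proc/config.gz"; do
        if [ -r "$f" ]; then
            echo "$f"
            return 0
        fi
    done
    return 1
}
```

Note that /proc/config.gz is compressed, so it may need to go through
zcat before being handed to a tool that expects a plain-text config.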
> +This will generate detailed report of kernel security configuration and
> +command line options that are enabled (OK) and the ones that aren't (FAIL)
> +and a summary line at the end::
> +
> + [+] Config check is finished: 'OK' - 100 / 'FAIL' - 100
> +
> +You will have to analyze the information to determine which options make
> +sense to enable on your system.
> +
> +Understanding system run-time activity
> +======================================
> +
> +Enabling event tracing gives insight into system run-time activity. This is
> +a good way to identify which parts of the kernel are used at a higher level
> +while system is in and/or while a specific workload/process is running.
> +
> +Event tracing depends on the CONFIG_EVENT_TRACING option enabled. You can
> +enable event tracing before starting workload/process. Event tracing allows
> +you to dynamically enable and disable tracing on supported/available events.
> +You can find available events, tracers, and filter functions in the following
> +files::
> +
> + /sys/kernel/debug/tracing/available_events
> + /sys/kernel/debug/tracing/available_filter_functions
> + /sys/kernel/debug/tracing/available_tracers
> +
> +Now this is how you can enable tracing::
> +
> + sudo echo 1 > /sys/kernel/debug/tracing/events/enable
> +
> +Once the workload/process stops or when you decide you have the status you
> +need, you can disable event tracing::
> +
> + sudo echo 0 > /sys/kernel/debug/tracing/events/enable
> +
> +You can find the tracing information in the file::
> +
> + /sys/kernel/debug/tracing
> +
> +Here is the information shown in this file::
> +
> + cat trace
> + # tracer: nop
> + #
> + # entries-in-buffer/entries-written: 0/0 #P:16
> + #
> + # _-----=> irqs-off/BH-disabled
> + # / _----=> need-resched
> + # | / _---=> hardirq/softirq
> + # || / _--=> preempt-depth
> + # ||| / _-=> migrate-disable
> + # |||| / delay
> + # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> + # | | | ||||| | |
> +
That looks like the header, certainly not "the information" found in the
file. Including some actual output would make the following discussion
more comprehensible.
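Separately, on the enable/disable commands quoted above: in
"sudo echo 1 > file" the redirection is done by the invoking shell
before sudo runs, so it fails with "Permission denied" unless that
shell is already root. A sketch of a form that works from a root shell,
assuming a hypothetical helper named set_event_tracing (the second
argument is there only so it can be pointed at a scratch directory):

```shell
# Write 0 or 1 to events/enable under the tracing directory.
# set_event_tracing is a hypothetical helper name; it must run with
# enough privilege to write to the file.
set_event_tracing() {
    value=$1
    tracing_dir=${2:-/sys/kernel/debug/tracing}
    printf '%s\n' "$value" | tee "$tracing_dir/events/enable" > /dev/null
}
```

From an unprivileged shell the one-line equivalent is
"echo 1 | sudo tee /sys/kernel/debug/tracing/events/enable", where tee,
not the shell, opens the file.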
> +Analyzing traces
> +================
> +
> +You will be able map the functions to system calls and other kernel features
> +to get insight into the overall system activity while a workload/process is
> +running.
> +
> +Map the NR (syscal) numbers from the trace to syscalls from the syscalls dump.
(syscall)
> +Categorize system calls and map them to Linux subsystems.
Not sure what that sentence is trying to tell readers. Again, who is
the audience; will a readership that needs to be told how to install
auditd be able to make sense of this and act on it?
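If the mapping step stays, a concrete example would help. A sketch,
assuming the output of "ausyscall --dump" has been saved to a file with
the syscall number in the first column and the name in the second (my
understanding of its format), and using a made-up helper name:

```shell
# Look up a syscall name by number in a saved "ausyscall --dump" file.
# syscall_name is a hypothetical helper; the two-column layout
# (number, then name) is an assumption about the dump format.
syscall_name() {
    nr=$1
    dump_file=$2
    awk -v nr="$nr" '$1 == nr { print $2; exit }' "$dump_file"
}
```

With the x86_64 numbers quoted earlier, "syscall_name 257 dump.txt"
would print openat. As far as I know, ausyscall also accepts a number
directly and prints the matching name.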
> +Conclusion
> +==========
> +
> +This document is intended to be used as a guide on how to gather higher level
> +information about a system and its run-time activity. The approach described
> +in this document helps us get insight into supported system calls, features,
> +assess how secure a system is, and its run-time activity.
> +
> +References
> +==========
> +
> + * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/checksyscalls.sh
> + * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/get_feat.pl
> + * https://github.com/a13xp0p0v/kconfig-hardened-check
> + * https://docs.kernel.org/trace/index.html
Thanks,
jon