linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Peter Collingbourne <pcc@google.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,  Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	 Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	 Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	 Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	 Daniel Bristot de Oliveira <bristot@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	 Andy Lutomirski <luto@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Masahiro Yamada <masahiroy@kernel.org>,
	 Sami Tolvanen <samitolvanen@google.com>,
	YiFei Zhu <yifeifz2@illinois.edu>,
	 Mark Rutland <mark.rutland@arm.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Peter Collingbourne <pcc@google.com>,
	Gabriel Krisman Bertazi <krisman@collabora.com>,
	 Chris Hyser <chris.hyser@oracle.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	 Chris Wilson <chris@chris-wilson.co.uk>,
	Arnd Bergmann <arnd@arndb.de>,
	 Dmitry Vyukov <dvyukov@google.com>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	 "Eric W. Biederman" <ebiederm@xmission.com>,
	Alexey Gladkov <legion@kernel.org>,
	Ran Xiaokai <ran.xiaokai@zte.com.cn>,
	David Hildenbrand <david@redhat.com>,
	 Xiaofeng Cao <caoxiaofeng@yulong.com>,
	Cyrill Gorcunov <gorcunov@gmail.com>,
	 Thomas Cedeno <thomascedeno@google.com>,
	Marco Elver <elver@google.com>,
	 Alexander Potapenko <glider@google.com>
Cc: linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	 Evgenii Stepanov <eugenis@google.com>
Subject: [PATCH v4 6/7] Documentation: document uaccess logging
Date: Thu,  9 Dec 2021 14:15:43 -0800	[thread overview]
Message-ID: <20211209221545.2333249-7-pcc@google.com> (raw)
In-Reply-To: <20211209221545.2333249-1-pcc@google.com>

Add documentation for the uaccess logging feature.

Link: https://linux-review.googlesource.com/id/Ia626c0ca91bc0a3d8067d7f28406aa40693b65a2
Signed-off-by: Peter Collingbourne <pcc@google.com>
---
v3:
- document what happens if passing NULL to prctl
- be explicit about meaning of addr and size

 Documentation/admin-guide/index.rst           |   1 +
 Documentation/admin-guide/uaccess-logging.rst | 151 ++++++++++++++++++
 2 files changed, 152 insertions(+)
 create mode 100644 Documentation/admin-guide/uaccess-logging.rst

diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 1bedab498104..4f6ee447ab2f 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -54,6 +54,7 @@ ABI will be found here.
    :maxdepth: 1
 
    sysfs-rules
+   uaccess-logging
 
 The rest of this manual consists of various unordered guides on how to
 configure specific aspects of kernel behavior to your liking.
diff --git a/Documentation/admin-guide/uaccess-logging.rst b/Documentation/admin-guide/uaccess-logging.rst
new file mode 100644
index 000000000000..24def38bbdf8
--- /dev/null
+++ b/Documentation/admin-guide/uaccess-logging.rst
@@ -0,0 +1,151 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+Uaccess Logging
+===============
+
+Background
+----------
+
+Userspace tools such as sanitizers (ASan, MSan, HWASan) and tools
+making use of the ARM Memory Tagging Extension (MTE) need to
+monitor all memory accesses in a program so that they can detect
+memory errors. Furthermore, fuzzing tools such as syzkaller need to
+monitor all memory accesses so that they know which parts of memory
+to fuzz. For accesses made purely in userspace, this is achieved
+via compiler instrumentation, or for MTE, via direct hardware
+support. However, accesses made by the kernel on behalf of the user
+program via syscalls (i.e. uaccesses) are normally invisible to
+these tools.
+
+Traditionally, the sanitizers have handled this by interposing the libc
+syscall stubs with a wrapper that checks the memory based on what we
+believe the uaccesses will be. However, this creates a maintenance
+burden: each syscall must be annotated with its uaccesses in order
+to be recognized by the sanitizer, and these annotations must be
+continuously updated as the kernel changes.
+
+The kernel's uaccess logging feature provides userspace tools with
+the address and size of each userspace access, thereby allowing these
+tools to report memory errors involving these accesses without needing
+annotations for every syscall.
+
+By relying on the kernel's actual uaccesses, rather than a
+reimplementation of them, the userspace memory safety tools may
+play a dual role of verifying the validity of kernel accesses. Even
+a sanitizer whose syscall wrappers have complete knowledge of the
+kernel's intended API may vary from the kernel's actual uaccesses due
+to kernel bugs. A sanitizer with knowledge of the kernel's actual
+uaccesses may produce more accurate error reports that reveal such
+bugs. For example, a kernel that accesses more memory than expected
+by the userspace program could indicate that either userspace or the
+kernel has the wrong idea about which kernel functionality is being
+requested -- either way, there is a bug.
+
+Interface
+---------
+
+The feature may be used via the following prctl:
+
+.. code-block:: c
+
+  uint64_t addr = 0; /* Generally will be a TLS slot or equivalent */
+  prctl(PR_SET_UACCESS_DESCRIPTOR_ADDR_ADDR, &addr, 0, 0, 0);
+
+Supplying a non-zero address as the second argument to ``prctl``
+will cause the kernel to read an address (referred to as the *uaccess
+descriptor address*) from that address on each kernel entry. Specifying
+an address of NULL as the second argument will restore the kernel's
+default behavior, i.e. no uaccess descriptor address is read.
+
+When entering the kernel with a non-zero uaccess descriptor address
+to handle a syscall, the kernel will read a data structure of type
+``struct uaccess_descriptor`` from the uaccess descriptor address,
+which is defined as follows:
+
+.. code-block:: c
+
+  struct uaccess_descriptor {
+    uint64_t addr, size;
+  };
+
+This data structure contains the address and size (in array elements)
+of a *uaccess buffer*, which is an array of data structures of type
+``struct uaccess_buffer_entry``. Before returning to userspace, the
+kernel will log information about uaccesses to sequential entries
+in the uaccess buffer. It will also store ``NULL`` to the uaccess
+descriptor address, and store the address and size of the unused
+portion of the uaccess buffer to the uaccess descriptor.
+
+The format of a uaccess buffer entry is defined as follows:
+
+.. code-block:: c
+
+  struct uaccess_buffer_entry {
+    uint64_t addr, size, flags;
+  };
+
+``addr`` and ``size`` contain the address and size of the user memory
+access. On arm64, tag bits are preserved in the ``addr`` field. There
+is currently one flag bit assignment for the ``flags`` field:
+
+.. code-block:: c
+
+  #define UACCESS_BUFFER_FLAG_WRITE 1
+
+This flag is set if the access was a write, or clear if it was a
+read. The meaning of all other flag bits is reserved.
+
+When entering the kernel with a non-zero uaccess descriptor
+address for a reason other than a syscall (for example, when
+IPI'd due to an incoming asynchronous signal), any signals other
+than ``SIGKILL`` and ``SIGSTOP`` are masked as if by calling
+``sigprocmask(SIG_SETMASK, set, NULL)`` where ``set`` has been
+initialized with ``sigfillset(set)``. This is to prevent incoming
+signals from interfering with uaccess logging.
+
+Example
+-------
+
+Here is an example of a code snippet that will enumerate the accesses
+performed by a ``uname(2)`` syscall:
+
+.. code-block:: c
+
+  struct uaccess_buffer_entry entries[64];
+  struct uaccess_descriptor desc;
+  uint64_t desc_addr = 0;
+  prctl(PR_SET_UACCESS_DESCRIPTOR_ADDR_ADDR, &desc_addr, 0, 0, 0);
+
+  desc.addr = (uint64_t)&entries;
+  desc.size = 64;
+  desc_addr = (uint64_t)&desc;
+
+  struct utsname un;
+  uname(&un);
+
+  struct uaccess_buffer_entry* entries_end = (struct uaccess_buffer_entry*)desc.addr;
+  for (struct uaccess_buffer_entry* entry = entries; entry != entries_end; ++entry) {
+    printf("%s at 0x%lx size 0x%lx\n", entry->flags & UACCESS_BUFFER_FLAG_WRITE ? "WRITE" : "READ",
+           (unsigned long)entry->addr, (unsigned long)entry->size);
+  }
+
+Limitations
+-----------
+
+This feature is currently only supported on the arm64, s390 and x86
+architectures.
+
+Uaccess buffers are a "best-effort" mechanism for logging uaccesses. Of
+course, not all of the accesses may fit in the buffer, but aside from
+that, not all internal kernel APIs that access userspace memory are
+covered. Therefore, userspace programs should tolerate unreported
+accesses.
+
+On the other hand, the kernel guarantees that it will not
+(intentionally) report accessing more data than it is specified
+to read. For example, if the kernel implements a syscall that is
+specified to read a data structure of size ``N`` bytes by first
+reading a page's worth of data and then only using the first ``N``
+bytes from it, the kernel will either report reading ``N`` bytes or
+not report the access at all.
-- 
2.34.1.173.g76aa8bc2d0-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2021-12-09 22:22 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-09 22:15 [PATCH v4 0/7] kernel: introduce uaccess logging Peter Collingbourne
2021-12-09 22:15 ` [PATCH v4 1/7] include: split out uaccess instrumentation into a separate header Peter Collingbourne
2021-12-10 12:45   ` Marco Elver
2021-12-09 22:15 ` [PATCH v4 2/7] uaccess-buffer: add core code Peter Collingbourne
2021-12-10  3:52   ` Dmitry Vyukov
2021-12-10 12:39   ` Marco Elver
2021-12-09 22:15 ` [PATCH v4 3/7] fs: use copy_from_user_nolog() to copy mount() data Peter Collingbourne
2021-12-09 22:15 ` [PATCH v4 4/7] uaccess-buffer: add CONFIG_GENERIC_ENTRY support Peter Collingbourne
2021-12-11 11:50   ` Thomas Gleixner
2021-12-16  1:25     ` Peter Collingbourne
2021-12-16 13:05       ` Thomas Gleixner
2021-12-17  0:09         ` Peter Collingbourne
2021-12-17 18:42           ` Thomas Gleixner
2022-01-10 21:43             ` Peter Collingbourne
2021-12-09 22:15 ` [PATCH v4 5/7] arm64: add support for uaccess logging Peter Collingbourne
2021-12-09 22:15 ` Peter Collingbourne [this message]
2021-12-09 22:15 ` [PATCH v4 7/7] selftests: test " Peter Collingbourne
2021-12-10 13:30   ` Marco Elver
2021-12-11 17:23 ` [PATCH v4 0/7] kernel: introduce " David Laight
2021-12-13 19:48   ` Peter Collingbourne
2021-12-13 23:07     ` David Laight
2021-12-14  3:47       ` Peter Collingbourne
2021-12-15  4:27         ` Peter Collingbourne

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211209221545.2333249-7-pcc@google.com \
    --to=pcc@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=arnd@arndb.de \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=caoxiaofeng@yulong.com \
    --cc=catalin.marinas@arm.com \
    --cc=chris.hyser@oracle.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=christian.brauner@ubuntu.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=david@redhat.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=dvyukov@google.com \
    --cc=ebiederm@xmission.com \
    --cc=elver@google.com \
    --cc=eugenis@google.com \
    --cc=frederic@kernel.org \
    --cc=glider@google.com \
    --cc=gorcunov@gmail.com \
    --cc=juri.lelli@redhat.com \
    --cc=keescook@chromium.org \
    --cc=krisman@collabora.com \
    --cc=legion@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=masahiroy@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=ran.xiaokai@zte.com.cn \
    --cc=rostedt@goodmis.org \
    --cc=samitolvanen@google.com \
    --cc=tglx@linutronix.de \
    --cc=thomascedeno@google.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    --cc=will@kernel.org \
    --cc=yifeifz2@illinois.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).