* Re: [RFC PATCH 1/3] Introduce per thread user-kernel shared structure
@ 2021-08-28 10:18 kernel test robot
0 siblings, 0 replies; 4+ messages in thread
From: kernel test robot @ 2021-08-28 10:18 UTC (permalink / raw)
To: kbuild
[-- Attachment #1: Type: text/plain, Size: 7004 bytes --]
CC: kbuild-all(a)lists.01.org
In-Reply-To: <1630107736-18269-2-git-send-email-prakash.sangappa@oracle.com>
References: <1630107736-18269-2-git-send-email-prakash.sangappa@oracle.com>
TO: Prakash Sangappa <prakash.sangappa@oracle.com>
Hi Prakash,
[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on linus/master]
[also build test WARNING on v5.14-rc7]
[cannot apply to tip/sched/core hnaz-linux-mm/master tip/x86/asm next-20210827]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Prakash-Sangappa/Provide-fast-access-to-thread-specific-data/20210828-073533
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 8f9d0349841a2871624bb1e85309e03e9867c16e
:::::: branch date: 11 hours ago
:::::: commit date: 11 hours ago
config: x86_64-randconfig-s022-20210827 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.3-348-gf0e6938b-dirty
# https://github.com/0day-ci/linux/commit/4afb2fb1653308287e0f2347dfff5c499acedee7
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Prakash-Sangappa/Provide-fast-access-to-thread-specific-data/20210828-073533
git checkout 4afb2fb1653308287e0f2347dfff5c499acedee7
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=x86_64 SHELL=/bin/bash
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
sparse warnings: (new ones prefixed by >>)
>> mm/task_shared.c:264:1: sparse: sparse: unused label 'out'
vim +/out +264 mm/task_shared.c
4afb2fb1653308 Prakash Sangappa 2021-08-27  200
4afb2fb1653308 Prakash Sangappa 2021-08-27  201
4afb2fb1653308 Prakash Sangappa 2021-08-27  202  /*
4afb2fb1653308 Prakash Sangappa 2021-08-27  203   * Allocate task_ushared struct for calling thread.
4afb2fb1653308 Prakash Sangappa 2021-08-27  204   */
4afb2fb1653308 Prakash Sangappa 2021-08-27  205  static int task_ushared_alloc(void)
4afb2fb1653308 Prakash Sangappa 2021-08-27  206  {
4afb2fb1653308 Prakash Sangappa 2021-08-27  207  	struct mm_struct *mm = current->mm;
4afb2fb1653308 Prakash Sangappa 2021-08-27  208  	struct ushared_pg *ent = NULL;
4afb2fb1653308 Prakash Sangappa 2021-08-27  209  	struct task_ushrd_struct *ushrd;
4afb2fb1653308 Prakash Sangappa 2021-08-27  210  	struct ushared_pages *usharedpg;
4afb2fb1653308 Prakash Sangappa 2021-08-27  211  	int tryalloc = 0;
4afb2fb1653308 Prakash Sangappa 2021-08-27  212  	int slot = -1;
4afb2fb1653308 Prakash Sangappa 2021-08-27  213  	int ret = -ENOMEM;
4afb2fb1653308 Prakash Sangappa 2021-08-27  214
4afb2fb1653308 Prakash Sangappa 2021-08-27  215  	if (mm->usharedpg == NULL && init_mm_ushared(mm))
4afb2fb1653308 Prakash Sangappa 2021-08-27  216  		return ret;
4afb2fb1653308 Prakash Sangappa 2021-08-27  217
4afb2fb1653308 Prakash Sangappa 2021-08-27  218  	if (current->task_ushrd == NULL && init_task_ushrd(current))
4afb2fb1653308 Prakash Sangappa 2021-08-27  219  		return ret;
4afb2fb1653308 Prakash Sangappa 2021-08-27  220
4afb2fb1653308 Prakash Sangappa 2021-08-27  221  	usharedpg = mm->usharedpg;
4afb2fb1653308 Prakash Sangappa 2021-08-27  222  	ushrd = current->task_ushrd;
4afb2fb1653308 Prakash Sangappa 2021-08-27  223  repeat:
4afb2fb1653308 Prakash Sangappa 2021-08-27  224  	if (mmap_write_lock_killable(mm))
4afb2fb1653308 Prakash Sangappa 2021-08-27  225  		return -EINTR;
4afb2fb1653308 Prakash Sangappa 2021-08-27  226
4afb2fb1653308 Prakash Sangappa 2021-08-27  227  	ent = list_empty(&usharedpg->frlist) ? NULL :
4afb2fb1653308 Prakash Sangappa 2021-08-27  228  		list_entry(usharedpg->frlist.next,
4afb2fb1653308 Prakash Sangappa 2021-08-27  229  		struct ushared_pg, fr_list);
4afb2fb1653308 Prakash Sangappa 2021-08-27  230
4afb2fb1653308 Prakash Sangappa 2021-08-27  231  	if (ent == NULL || ent->slot_count == 0) {
4afb2fb1653308 Prakash Sangappa 2021-08-27  232  		if (tryalloc == 0) {
4afb2fb1653308 Prakash Sangappa 2021-08-27  233  			mmap_write_unlock(mm);
4afb2fb1653308 Prakash Sangappa 2021-08-27  234  			(void)ushared_allocpg();
4afb2fb1653308 Prakash Sangappa 2021-08-27  235  			tryalloc = 1;
4afb2fb1653308 Prakash Sangappa 2021-08-27  236  			goto repeat;
4afb2fb1653308 Prakash Sangappa 2021-08-27  237  		} else {
4afb2fb1653308 Prakash Sangappa 2021-08-27  238  			ent = NULL;
4afb2fb1653308 Prakash Sangappa 2021-08-27  239  		}
4afb2fb1653308 Prakash Sangappa 2021-08-27  240  	}
4afb2fb1653308 Prakash Sangappa 2021-08-27  241
4afb2fb1653308 Prakash Sangappa 2021-08-27  242  	if (ent) {
4afb2fb1653308 Prakash Sangappa 2021-08-27  243  		slot = find_first_zero_bit((unsigned long *)(&ent->bitmap),
4afb2fb1653308 Prakash Sangappa 2021-08-27  244  				TASK_USHARED_SLOTS);
4afb2fb1653308 Prakash Sangappa 2021-08-27  245  		BUG_ON(slot >= TASK_USHARED_SLOTS);
4afb2fb1653308 Prakash Sangappa 2021-08-27  246
4afb2fb1653308 Prakash Sangappa 2021-08-27  247  		set_bit(slot, (unsigned long *)(&ent->bitmap));
4afb2fb1653308 Prakash Sangappa 2021-08-27  248
4afb2fb1653308 Prakash Sangappa 2021-08-27  249  		ushrd->uaddr = (struct task_ushared *)(ent->vaddr +
4afb2fb1653308 Prakash Sangappa 2021-08-27  250  			(slot * sizeof(union task_shared)));
4afb2fb1653308 Prakash Sangappa 2021-08-27  251  		ushrd->kaddr = (struct task_ushared *)(ent->kaddr +
4afb2fb1653308 Prakash Sangappa 2021-08-27  252  			(slot * sizeof(union task_shared)));
4afb2fb1653308 Prakash Sangappa 2021-08-27  253  		ushrd->upg = ent;
4afb2fb1653308 Prakash Sangappa 2021-08-27  254  		ent->slot_count--;
4afb2fb1653308 Prakash Sangappa 2021-08-27  255  		/* move it to tail */
4afb2fb1653308 Prakash Sangappa 2021-08-27  256  		if (ent->slot_count == 0) {
4afb2fb1653308 Prakash Sangappa 2021-08-27  257  			list_del(&ent->fr_list);
4afb2fb1653308 Prakash Sangappa 2021-08-27  258  			list_add_tail(&ent->fr_list, &usharedpg->frlist);
4afb2fb1653308 Prakash Sangappa 2021-08-27  259  		}
4afb2fb1653308 Prakash Sangappa 2021-08-27  260
4afb2fb1653308 Prakash Sangappa 2021-08-27  261  		ret = 0;
4afb2fb1653308 Prakash Sangappa 2021-08-27  262  	}
4afb2fb1653308 Prakash Sangappa 2021-08-27  263
4afb2fb1653308 Prakash Sangappa 2021-08-27 @264  out:
4afb2fb1653308 Prakash Sangappa 2021-08-27  265  	mmap_write_unlock(mm);
4afb2fb1653308 Prakash Sangappa 2021-08-27  266  	return ret;
4afb2fb1653308 Prakash Sangappa 2021-08-27  267  }
4afb2fb1653308 Prakash Sangappa 2021-08-27  268
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 33997 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* [RFC PATCH 0/3] Provide fast access to thread-specific data
@ 2021-08-27 23:42 Prakash Sangappa
2021-08-27 23:42 ` [RFC PATCH 1/3] Introduce per thread user-kernel shared structure Prakash Sangappa
0 siblings, 1 reply; 4+ messages in thread
From: Prakash Sangappa @ 2021-08-27 23:42 UTC (permalink / raw)
To: linux-kernel; +Cc: prakash.sangappa
Sending this as RFC, looking for feedback.
Some applications, such as databases, need to read thread-specific stats
frequently from the kernel in latency-sensitive code paths. The overhead of
reading these stats through system calls affects performance.
One use case is reading a thread's scheduler stats from its /proc schedstat
file (/proc/pid/schedstat) to collect the time the thread spent executing
on the cpu (sum_exec_runtime) and the time it was blocked waiting on the
runq (run_delay). These scheduler stats, read several times per transaction
in a latency-sensitive code path, are used to measure the time taken by DB
operations.
This patch series proposes a mechanism for the kernel to share thread
stats through a per-thread structure shared between user space and the
kernel. The per-thread shared structure is allocated on a page that is
shared-mapped between user space and the kernel, which provides a fast
communication path between the two. The kernel publishes stats in this
shared structure, and the application thread reads them in user space
without making system calls.
Similarly, there can be other use cases for such a shared structure
mechanism.
Introduce 'off cpu' time:
The time a thread spends executing on a cpu (sum_exec_runtime), currently
available through the thread's schedstat file, can be shared through the
shared structure mentioned above. However, while a thread is running on
the cpu, this value is only updated periodically, as part of scheduler
tick processing, so it can lag by up to 1ms or more. If the application
has to measure cpu time consumed across some DB operations, using
'sum_exec_runtime' will not be accurate. To address this, the proposal is
to introduce a thread's 'off cpu' time, which is measured at context
switch, similar to how time on the runq (i.e. run_delay in the schedstat
file) is measured, and should be more accurate. With that, the application
can determine the cpu time consumed by taking the elapsed time and
subtracting the off cpu time. The off cpu time will be made available
through the shared structure along with the other schedstats from the
/proc/pid/schedstat file.
The elapsed time itself can be measured using clock_gettime, which is
vdso optimized and would be fast. The schedstats (runq time & off cpu
time) published in the shared structure will be accumulated times, the
same as what is available through the schedstat file, all in units of
nanoseconds. The application would take the difference of the values from
before and after the operation for measurement.
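The measurement described above can be sketched in user space roughly as
follows. This is only an illustration: struct op_sample and the helpers
now_ns(), take_sample() and cpu_time_ns() are hypothetical names, not part
of the patches, and the sketch assumes 'ts' already points at a
struct task_schedstat published by the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Mirrors the struct task_schedstat layout quoted in this cover letter. */
struct task_schedstat {
	volatile uint64_t sum_exec_runtime;
	volatile uint64_t run_delay;
	volatile uint64_t pcount;
	volatile uint64_t off_cpu;
};

/* Hypothetical helper type: one snapshot taken before or after an operation. */
struct op_sample {
	uint64_t wall_ns;	/* CLOCK_MONOTONIC reading, in nanoseconds */
	uint64_t off_cpu_ns;	/* accumulated off cpu time, in nanoseconds */
};

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec t;
	int rc = clock_gettime(CLOCK_MONOTONIC, &t);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)t.tv_sec * 1000000000ull + (uint64_t)t.tv_nsec;
}

/* Read the wall clock and the accumulated off cpu time together. */
static struct op_sample take_sample(const struct task_schedstat *ts)
{
	struct op_sample s = { now_ns(), ts->off_cpu };

	return s;
}

/*
 * cpu time = elapsed wall time - off cpu time accumulated in the interval.
 * Clamped at zero in case the two clocks disagree slightly.
 */
static uint64_t cpu_time_ns(struct op_sample before, struct op_sample after)
{
	uint64_t elapsed = after.wall_ns - before.wall_ns;
	uint64_t off = after.off_cpu_ns - before.off_cpu_ns;

	return off < elapsed ? elapsed - off : 0;
}
```

A caller would take one sample before the DB operation and one after it,
then pass both to cpu_time_ns(); no system call is needed beyond the vdso
clock read.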
Preliminary results from a simple cached-read database workload show a
performance benefit when the database uses the shared struct for reading
stats vs. reading from /proc directly.
Implementation:
A new system call is added to request use of the shared structure by a
user thread. The kernel will allocate page(s), shared-mapped with user
space, in which the per-thread shared structures are allocated. These
structures are padded to 128 bytes. Each will contain struct members or
nested structures corresponding to supported stats, like the thread's
schedstats, published by the kernel for user space consumption. More
struct members can be added as new feature support is implemented.
Multiple such shared structures will be allocated from a page (up to 32
per 4k page), which avoids having to allocate one page per thread of a
process, although this will need optimizing for locality. Additional pages
will be allocated as needed to accommodate more threads requesting use of
shared structures. The aim is to not expose the layout of the shared
structure itself to the application, which will allow future
enhancements/changes without affecting the API.
The system call will return a pointer (user space mapped address) to the
per-thread shared structure members. The application would save this
per-thread pointer in a TLS variable and reference it.
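The slot arithmetic implied above (128-byte structures, up to 32 per 4k
page) can be sketched in plain C as below. Patch 1/3 describes tracking
free space on a page with a bitmap; this sketch follows that idea, but
struct page_slots, SKETCH_PAGE_SIZE and the slot_*() helpers are
hypothetical stand-ins for illustration, not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_SIZE		128u	/* per-thread structure padded to 128 bytes */
#define SKETCH_PAGE_SIZE	4096u	/* assumption: 4k pages */
#define SLOTS_PER_PAGE		(SKETCH_PAGE_SIZE / SLOT_SIZE)	/* 32 */

/* Hypothetical stand-in for the per-page tracking state. */
struct page_slots {
	uint64_t bitmap;	/* bit n set: slot n is in use */
	uintptr_t base;		/* address the page is mapped at */
};

/* Claim the lowest free slot; returns its index, or -1 if the page is full. */
static int slot_alloc(struct page_slots *pg)
{
	for (unsigned int i = 0; i < SLOTS_PER_PAGE; i++) {
		if (!(pg->bitmap & (UINT64_C(1) << i))) {
			pg->bitmap |= UINT64_C(1) << i;
			return (int)i;
		}
	}
	return -1;
}

/* Address of a slot within the mapped page. */
static uintptr_t slot_addr(const struct page_slots *pg, int slot)
{
	return pg->base + (uintptr_t)slot * SLOT_SIZE;
}

/* Inverse of slot_addr(): recover the slot index from an address. */
static int slot_of(const struct page_slots *pg, uintptr_t addr)
{
	return (int)((addr - pg->base) / SLOT_SIZE);
}

/* Release a slot so that another thread can reuse it. */
static void slot_free(struct page_slots *pg, int slot)
{
	pg->bitmap &= ~(UINT64_C(1) << slot);
}
```

Once all 32 slots of a page are taken, slot_alloc() returns -1, which is
the point at which another page would have to be allocated.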
The system call is of the form:
int task_getshared(int option, int flags, void __user *uaddr)
// Currently only TASK_SCHEDSTAT option is supported - returns pointer
// to struct task_schedstat. The struct task_schedstat is nested within
// the shared structure.
struct task_schedstat {
	volatile u64 sum_exec_runtime;
	volatile u64 run_delay;
	volatile u64 pcount;
	volatile u64 off_cpu;
};
Usage:
__thread struct task_schedstat *ts;
task_getshared(TASK_SCHEDSTAT, 0, &ts);
Subsequently the stats are accessed using the 'ts' pointer by the thread.
Prakash Sangappa (3):
Introduce per thread user-kernel shared structure
Publish tasks's scheduler stats thru the shared structure
Introduce task's 'off cpu' time
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
include/linux/mm_types.h | 2 +
include/linux/sched.h | 9 +
include/linux/syscalls.h | 2 +
include/linux/task_shared.h | 92 ++++++++++
include/uapi/asm-generic/unistd.h | 5 +-
include/uapi/linux/task_shared.h | 23 +++
kernel/fork.c | 7 +
kernel/sched/deadline.c | 1 +
kernel/sched/fair.c | 1 +
kernel/sched/rt.c | 1 +
kernel/sched/sched.h | 1 +
kernel/sched/stats.h | 55 ++++--
kernel/sched/stop_task.c | 1 +
kernel/sys_ni.c | 3 +
mm/Makefile | 2 +-
mm/task_shared.c | 314 +++++++++++++++++++++++++++++++++
18 files changed, 501 insertions(+), 20 deletions(-)
create mode 100644 include/linux/task_shared.h
create mode 100644 include/uapi/linux/task_shared.h
create mode 100644 mm/task_shared.c
--
2.7.4
^ permalink raw reply [flat|nested] 4+ messages in thread
* [RFC PATCH 1/3] Introduce per thread user-kernel shared structure
2021-08-27 23:42 [RFC PATCH 0/3] Provide fast access to thread-specific data Prakash Sangappa
@ 2021-08-27 23:42 ` Prakash Sangappa
2021-08-28 7:36 ` kernel test robot
2021-08-28 8:21 ` kernel test robot
0 siblings, 2 replies; 4+ messages in thread
From: Prakash Sangappa @ 2021-08-27 23:42 UTC (permalink / raw)
To: linux-kernel; +Cc: prakash.sangappa
A structure per thread is allocated from a page that is shared mapped
between user space and kernel as means for faster communication. This will
facilitate sharing information, e.g. per thread stats shared between kernel
and user space, that can be read by applications without the need for
making frequent system calls in latency sensitive code path.
A new system call is added, which will allocate the shared structure and
return its mapped user address. Multiple such structures will be allocated
on a page to accommodate requests from different threads of a
multithreaded process. Available space on a page is managed using a bitmap.
When a thread exits, the shared structure is freed and can get reused for
another thread that requests the shared structure. More pages will be
allocated and used as needed based on the number of threads requesting use
of shared structures. These pages are all freed when the process exits.
Each of these shared structures is rounded to 128 bytes. Available space
in this structure can be used to accommodate additional per thread stats,
state etc. as needed. In future, if more space beyond 128 bytes is needed,
multiple such shared structures per thread could be allocated and managed
by the kernel. However, space in the shared structure for sharing any kind
of stats or state should be used sparingly. Therefore the shared structure
layout is not exposed to user space. The system call will return the
mapped user address of a specific member or nested structure within the
shared structure corresponding to the stats requested. This would allow
future enhancements/changes without breaking the API.
Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
---
arch/x86/entry/syscalls/syscall_32.tbl | 1 +
arch/x86/entry/syscalls/syscall_64.tbl | 1 +
include/linux/mm_types.h | 2 +
include/linux/sched.h | 3 +
include/linux/syscalls.h | 2 +
include/linux/task_shared.h | 57 +++++++
include/uapi/asm-generic/unistd.h | 5 +-
kernel/fork.c | 7 +
kernel/sys_ni.c | 3 +
mm/Makefile | 2 +-
mm/task_shared.c | 301 +++++++++++++++++++++++++++++++++
11 files changed, 382 insertions(+), 2 deletions(-)
create mode 100644 include/linux/task_shared.h
create mode 100644 mm/task_shared.c
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index ce763a1..a194581 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -452,3 +452,4 @@
 445	i386	landlock_add_rule	sys_landlock_add_rule
 446	i386	landlock_restrict_self	sys_landlock_restrict_self
 447	i386	memfd_secret		sys_memfd_secret
+448	i386	task_getshared		sys_task_getshared
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index f6b5779..9dda907 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -369,6 +369,7 @@
 445	common	landlock_add_rule	sys_landlock_add_rule
 446	common	landlock_restrict_self	sys_landlock_restrict_self
 447	common	memfd_secret		sys_memfd_secret
+448	common	task_getshared		sys_task_getshared
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 52bbd2b..5ec26ed 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -572,6 +572,8 @@ struct mm_struct {
 #ifdef CONFIG_IOMMU_SUPPORT
 		u32 pasid;
 #endif
+		/* user shared pages */
+		void *usharedpg;
 	} __randomize_layout;
 
 	/*
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ec8d07d..237aa21 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1400,6 +1400,9 @@ struct task_struct {
 	struct llist_head kretprobe_instances;
 #endif
 
+	/* user shared struct */
+	void *task_ushrd;
+
 	/*
 	 * New fields for task_struct should be added above here, so that
 	 * they are included in the randomized portion of task_struct.
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 69c9a70..09680b7 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -1052,6 +1052,8 @@ asmlinkage long sys_landlock_add_rule(int ruleset_fd, enum landlock_rule_type ru
 asmlinkage long sys_landlock_restrict_self(int ruleset_fd, __u32 flags);
 asmlinkage long sys_memfd_secret(unsigned int flags);
 
+asmlinkage long sys_task_getshared(long opt, long flags, void __user *uaddr);
+
 /*
  * Architecture-specific system calls
  */
diff --git a/include/linux/task_shared.h b/include/linux/task_shared.h
new file mode 100644
index 0000000..de17849
--- /dev/null
+++ b/include/linux/task_shared.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __TASK_SHARED_H__
+#define __TASK_SHARED_H__
+
+#include <linux/mm_types.h>
+
+/*
+ * Track user-kernel shared pages referred by mm_struct
+ */
+struct ushared_pages {
+	struct list_head plist;
+	struct list_head frlist;
+	unsigned long pcount;
+};
+
+/*
+ * Following is the per task struct shared with kernel for
+ * fast communication.
+ */
+struct task_ushared {
+	long version;
+};
+
+/*
+ * Following is used for cacheline aligned allocations in a page.
+ */
+union task_shared {
+	struct task_ushared tu;
+	char s[128];
+};
+
+/*
+ * Struct to track per page slots
+ */
+struct ushared_pg {
+	struct list_head list;
+	struct list_head fr_list;
+	struct page *pages[2];
+	u64 bitmap; /* free slots */
+	int slot_count;
+	unsigned long kaddr;
+	unsigned long vaddr; /* user address */
+	struct vm_special_mapping ushrd_mapping;
+};
+
+/*
+ * Following struct is referred by task_struct
+ */
+struct task_ushrd_struct {
+	struct task_ushared *kaddr; /* kernel address */
+	struct task_ushared *uaddr; /* user address */
+	struct ushared_pg *upg;
+};
+
+extern void task_ushared_free(struct task_struct *t);
+extern void mm_ushared_clear(struct mm_struct *mm);
+#endif /* __TASK_SHARED_H__ */
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index a9d6fcd..7c985b1 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -878,8 +878,11 @@ __SYSCALL(__NR_landlock_restrict_self, sys_landlock_restrict_self)
 __SYSCALL(__NR_memfd_secret, sys_memfd_secret)
 #endif
 
+#define __NR_task_getshared 448
+__SYSCALL(__NR_task_getshared, sys_task_getshared)
+
 #undef __NR_syscalls
-#define __NR_syscalls 448
+#define __NR_syscalls 449
 
 /*
  * 32 bit systems traditionally used different
diff --git a/kernel/fork.c b/kernel/fork.c
index bc94b2c..f84bac0 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -97,6 +97,7 @@
 #include <linux/scs.h>
 #include <linux/io_uring.h>
 #include <linux/bpf.h>
+#include <linux/task_shared.h>
 
 #include <asm/pgalloc.h>
 #include <linux/uaccess.h>
@@ -903,6 +904,9 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	if (err)
 		goto free_stack;
 
+	/* task's ushared struct not inherited across fork */
+	tsk->task_ushrd = NULL;
+
 #ifdef CONFIG_SECCOMP
 	/*
 	 * We must handle setting up seccomp filters once we're under
@@ -1049,6 +1053,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	mm->pmd_huge_pte = NULL;
 #endif
+	mm->usharedpg = NULL;
 	mm_init_uprobes_state(mm);
 
 	if (current->mm) {
@@ -1099,6 +1104,7 @@ static inline void __mmput(struct mm_struct *mm)
 	ksm_exit(mm);
 	khugepaged_exit(mm); /* must run before exit_mmap */
 	exit_mmap(mm);
+	mm_ushared_clear(mm);
 	mm_put_huge_zero_page(mm);
 	set_mm_exe_file(mm, NULL);
 	if (!list_empty(&mm->mmlist)) {
@@ -1308,6 +1314,7 @@ static int wait_for_vfork_done(struct task_struct *child,
 static void mm_release(struct task_struct *tsk, struct mm_struct *mm)
 {
 	uprobe_free_utask(tsk);
+	task_ushared_free(tsk);
 
 	/* Get rid of any cached register state */
 	deactivate_mm(tsk, mm);
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 30971b1..8fbdc55 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -481,3 +481,6 @@ COND_SYSCALL(setuid16);
 
 /* restartable sequence */
 COND_SYSCALL(rseq);
+
+/* task shared */
+COND_SYSCALL(task_getshared);
diff --git a/mm/Makefile b/mm/Makefile
index e343674..03f88fe 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -52,7 +52,7 @@ obj-y := filemap.o mempool.o oom_kill.o fadvise.o \
 			   mm_init.o percpu.o slab_common.o \
 			   compaction.o vmacache.o \
 			   interval_tree.o list_lru.o workingset.o \
-			   debug.o gup.o mmap_lock.o $(mmu-y)
+			   debug.o gup.o mmap_lock.o task_shared.o $(mmu-y)
 
 # Give 'page_alloc' its own module-parameter namespace
 page-alloc-y := page_alloc.o
diff --git a/mm/task_shared.c b/mm/task_shared.c
new file mode 100644
index 0000000..3ec5eb6
--- /dev/null
+++ b/mm/task_shared.c
@@ -0,0 +1,301 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/mm.h>
+#include <linux/uio.h>
+#include <linux/sched.h>
+#include <linux/sched/mm.h>
+#include <linux/highmem.h>
+#include <linux/ptrace.h>
+#include <linux/slab.h>
+#include <linux/syscalls.h>
+#include <linux/task_shared.h>
+
+/* Shared page */
+
+#define TASK_USHARED_SLOTS (PAGE_SIZE/sizeof(union task_shared))
+
+/*
+ * Called once to init struct ushared_pages pointer.
+ */
+static int init_mm_ushared(struct mm_struct *mm)
+{
+	struct ushared_pages *usharedpg;
+
+	usharedpg = kmalloc(sizeof(struct ushared_pages), GFP_KERNEL);
+	if (usharedpg == NULL)
+		return 1;
+
+	INIT_LIST_HEAD(&usharedpg->plist);
+	INIT_LIST_HEAD(&usharedpg->frlist);
+	usharedpg->pcount = 0;
+	mmap_write_lock(mm);
+	if (mm->usharedpg == NULL) {
+		mm->usharedpg = usharedpg;
+		usharedpg = NULL;
+	}
+	mmap_write_unlock(mm);
+	if (usharedpg != NULL)
+		kfree(usharedpg);
+	return 0;
+}
+
+static int init_task_ushrd(struct task_struct *t)
+{
+	struct task_ushrd_struct *ushrd;
+
+	ushrd = kzalloc(sizeof(struct task_ushrd_struct), GFP_KERNEL);
+	if (ushrd == NULL)
+		return 1;
+
+	mmap_write_lock(t->mm);
+	if (t->task_ushrd == NULL) {
+		t->task_ushrd = ushrd;
+		ushrd = NULL;
+	}
+	mmap_write_unlock(t->mm);
+	if (ushrd != NULL)
+		kfree(ushrd);
+	return 0;
+}
+
+/*
+ * Called from __mmput(), mm is going away
+ */
+void mm_ushared_clear(struct mm_struct *mm)
+{
+	struct ushared_pg *upg;
+	struct ushared_pg *tmp;
+	struct ushared_pages *usharedpg;
+
+	if (mm == NULL || mm->usharedpg == NULL)
+		return;
+
+	usharedpg = mm->usharedpg;
+	if (list_empty(&usharedpg->frlist))
+		goto out;
+
+	list_for_each_entry_safe(upg, tmp, &usharedpg->frlist, fr_list) {
+		list_del(&upg->fr_list);
+		put_page(upg->pages[0]);
+		kfree(upg);
+	}
+out:
+	kfree(mm->usharedpg);
+	mm->usharedpg = NULL;
+
+}
+
+void task_ushared_free(struct task_struct *t)
+{
+	struct task_ushrd_struct *ushrd = t->task_ushrd;
+	struct mm_struct *mm = t->mm;
+	struct ushared_pages *usharedpg;
+	int slot;
+
+	if (mm == NULL || mm->usharedpg == NULL || ushrd == NULL)
+		return;
+
+	usharedpg = mm->usharedpg;
+	mmap_write_lock(mm);
+
+	if (ushrd->upg == NULL)
+		goto out;
+
+	slot = (unsigned long)((unsigned long)ushrd->uaddr
+			- ushrd->upg->vaddr) / sizeof(union task_shared);
+	clear_bit(slot, (unsigned long *)(&ushrd->upg->bitmap));
+
+	/* move to head */
+	if (ushrd->upg->slot_count == 0) {
+		list_del(&ushrd->upg->fr_list);
+		list_add(&ushrd->upg->fr_list, &usharedpg->frlist);
+	}
+
+	ushrd->upg->slot_count++;
+
+	ushrd->uaddr = ushrd->kaddr = NULL;
+	ushrd->upg = NULL;
+
+out:
+	t->task_ushrd = NULL;
+	mmap_write_unlock(mm);
+	kfree(ushrd);
+}
+
+/* map shared page */
+static int task_shared_add_vma(struct ushared_pg *pg)
+{
+	struct vm_area_struct *vma;
+	struct mm_struct *mm = current->mm;
+	unsigned long ret = 1;
+
+
+	if (!pg->vaddr) {
+		/* Try to map as high as possible, this is only a hint. */
+		pg->vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE,
+					PAGE_SIZE, 0, 0);
+		if (pg->vaddr & ~PAGE_MASK) {
+			ret = 0;
+			goto fail;
+		}
+	}
+
+	vma = _install_special_mapping(mm, pg->vaddr, PAGE_SIZE,
+			VM_SHARED|VM_READ|VM_MAYREAD|VM_DONTCOPY,
+			&pg->ushrd_mapping);
+	if (IS_ERR(vma)) {
+		ret = 0;
+		pg->vaddr = 0;
+		goto fail;
+	}
+
+	pg->kaddr = (unsigned long)page_address(pg->pages[0]);
+fail:
+	return ret;
+}
+
+/*
+ * Allocate a page, map user address and add to freelist
+ */
+static struct ushared_pg *ushared_allocpg(void)
+{
+
+	struct ushared_pg *pg;
+	struct mm_struct *mm = current->mm;
+	struct ushared_pages *usharedpg = mm->usharedpg;
+
+	if (usharedpg == NULL)
+		return NULL;
+	pg = kzalloc(sizeof(*pg), GFP_KERNEL);
+
+	if (unlikely(!pg))
+		return NULL;
+	pg->ushrd_mapping.name = "[task_shared]";
+	pg->ushrd_mapping.fault = NULL;
+	pg->ushrd_mapping.pages = pg->pages;
+	pg->pages[0] = alloc_page(GFP_KERNEL);
+	if (!pg->pages[0])
+		goto out;
+	pg->pages[1] = NULL;
+	pg->bitmap = 0;
+
+	/*
+	 * page size should be 4096 or 8192
+	 */
+	pg->slot_count = TASK_USHARED_SLOTS;
+
+	mmap_write_lock(mm);
+	if (task_shared_add_vma(pg)) {
+		list_add(&pg->fr_list, &usharedpg->frlist);
+		usharedpg->pcount++;
+		mmap_write_unlock(mm);
+		return pg;
+	}
+	mmap_write_unlock(mm);
+
+out:
+	__free_page(pg->pages[0]);
+	kfree(pg);
+	return NULL;
+}
+
+
+/*
+ * Allocate task_ushared struct for calling thread.
+ */
+static int task_ushared_alloc(void)
+{
+	struct mm_struct *mm = current->mm;
+	struct ushared_pg *ent = NULL;
+	struct task_ushrd_struct *ushrd;
+	struct ushared_pages *usharedpg;
+	int tryalloc = 0;
+	int slot = -1;
+	int ret = -ENOMEM;
+
+	if (mm->usharedpg == NULL && init_mm_ushared(mm))
+		return ret;
+
+	if (current->task_ushrd == NULL && init_task_ushrd(current))
+		return ret;
+
+	usharedpg = mm->usharedpg;
+	ushrd = current->task_ushrd;
+repeat:
+	if (mmap_write_lock_killable(mm))
+		return -EINTR;
+
+	ent = list_empty(&usharedpg->frlist) ? NULL :
+		list_entry(usharedpg->frlist.next,
+		struct ushared_pg, fr_list);
+
+	if (ent == NULL || ent->slot_count == 0) {
+		if (tryalloc == 0) {
+			mmap_write_unlock(mm);
+			(void)ushared_allocpg();
+			tryalloc = 1;
+			goto repeat;
+		} else {
+			ent = NULL;
+		}
+	}
+
+	if (ent) {
+		slot = find_first_zero_bit((unsigned long *)(&ent->bitmap),
+				TASK_USHARED_SLOTS);
+		BUG_ON(slot >= TASK_USHARED_SLOTS);
+
+		set_bit(slot, (unsigned long *)(&ent->bitmap));
+
+		ushrd->uaddr = (struct task_ushared *)(ent->vaddr +
+			(slot * sizeof(union task_shared)));
+		ushrd->kaddr = (struct task_ushared *)(ent->kaddr +
+			(slot * sizeof(union task_shared)));
+		ushrd->upg = ent;
+		ent->slot_count--;
+		/* move it to tail */
+		if (ent->slot_count == 0) {
+			list_del(&ent->fr_list);
+			list_add_tail(&ent->fr_list, &usharedpg->frlist);
+		}
+
+		ret = 0;
+	}
+
+out:
+	mmap_write_unlock(mm);
+	return ret;
+}
+
+
+/*
+ * Task Shared : allocate if needed, and return address of shared struct for
+ * this thread/task.
+ */
+static long task_getshared(u64 opt, u64 flags, void __user *uaddr)
+{
+	struct task_ushrd_struct *ushrd = current->task_ushrd;
+
+	/* We have address, return. */
+	if (ushrd != NULL && ushrd->upg != NULL) {
+		if (copy_to_user(uaddr, &ushrd->uaddr,
+			sizeof(struct task_ushared *)))
+			return (-EFAULT);
+		return 0;
+	}
+
+	task_ushared_alloc();
+	ushrd = current->task_ushrd;
+	if (ushrd != NULL && ushrd->upg != NULL) {
+		if (copy_to_user(uaddr, &ushrd->uaddr,
+			sizeof(struct task_ushared *)))
+			return (-EFAULT);
+		return 0;
+	}
+	return (-ENOMEM);
+}
+
+
+SYSCALL_DEFINE3(task_getshared, u64, opt, u64, flags, void __user *, uaddr)
+{
+	return task_getshared(opt, flags, uaddr);
+}
--
2.7.4
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [RFC PATCH 1/3] Introduce per thread user-kernel shared structure
2021-08-27 23:42 ` [RFC PATCH 1/3] Introduce per thread user-kernel shared structure Prakash Sangappa
@ 2021-08-28 7:36 ` kernel test robot
2021-08-28 8:21 ` kernel test robot
1 sibling, 0 replies; 4+ messages in thread
From: kernel test robot @ 2021-08-28 7:36 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 1976 bytes --]
Hi Prakash,
[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on linus/master]
[also build test WARNING on v5.14-rc7]
[cannot apply to tip/sched/core hnaz-linux-mm/master tip/x86/asm next-20210827]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Prakash-Sangappa/Provide-fast-access-to-thread-specific-data/20210828-073533
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 8f9d0349841a2871624bb1e85309e03e9867c16e
config: arm-randconfig-r001-20210827 (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/4afb2fb1653308287e0f2347dfff5c499acedee7
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Prakash-Sangappa/Provide-fast-access-to-thread-specific-data/20210828-073533
git checkout 4afb2fb1653308287e0f2347dfff5c499acedee7
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=arm
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
>> <stdin>:1554:2: warning: #warning syscall task_getshared not implemented [-Wcpp]
--
>> <stdin>:1554:2: warning: #warning syscall task_getshared not implemented [-Wcpp]
--
>> <stdin>:1554:2: warning: #warning syscall task_getshared not implemented [-Wcpp]
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 39269 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC PATCH 1/3] Introduce per thread user-kernel shared structure
  2021-08-27 23:42 ` [RFC PATCH 1/3] Introduce per thread user-kernel shared structure Prakash Sangappa
  2021-08-28  7:36   ` kernel test robot
@ 2021-08-28  8:21   ` kernel test robot
  1 sibling, 0 replies; 4+ messages in thread
From: kernel test robot @ 2021-08-28 8:21 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 5755 bytes --]
Hi Prakash,
[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on linus/master]
[also build test ERROR on v5.14-rc7]
[cannot apply to tip/sched/core hnaz-linux-mm/master tip/x86/asm next-20210827]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Prakash-Sangappa/Provide-fast-access-to-thread-specific-data/20210828-073533
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 8f9d0349841a2871624bb1e85309e03e9867c16e
config: nios2-defconfig (attached as .config)
compiler: nios2-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/4afb2fb1653308287e0f2347dfff5c499acedee7
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Prakash-Sangappa/Provide-fast-access-to-thread-specific-data/20210828-073533
git checkout 4afb2fb1653308287e0f2347dfff5c499acedee7
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=nios2
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):

   mm/task_shared.c: In function 'task_ushared_alloc':
   mm/task_shared.c:264:1: warning: label 'out' defined but not used [-Wunused-label]
     264 | out:
         | ^~~
   In file included from mm/task_shared.c:9:
   mm/task_shared.c: At top level:
>> include/linux/syscalls.h:241:25: error: conflicting types for 'sys_task_getshared'; have 'long int(u64, u64, void *)' {aka 'long int(long long unsigned int, long long unsigned int, void *)'}
     241 |         asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))       \
         |                         ^~~
   include/linux/syscalls.h:227:9: note: in expansion of macro '__SYSCALL_DEFINEx'
     227 |         __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
         |         ^~~~~~~~~~~~~~~~~
   include/linux/syscalls.h:218:36: note: in expansion of macro 'SYSCALL_DEFINEx'
     218 | #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)
         |                                    ^~~~~~~~~~~~~~~
   mm/task_shared.c:298:1: note: in expansion of macro 'SYSCALL_DEFINE3'
     298 | SYSCALL_DEFINE3(task_getshared, u64, opt, u64, flags, void __user *, uaddr)
         | ^~~~~~~~~~~~~~~
   In file included from mm/task_shared.c:9:
   include/linux/syscalls.h:1055:17: note: previous declaration of 'sys_task_getshared' with type 'long int(long int, long int, void *)'
    1055 | asmlinkage long sys_task_getshared(long opt, long flags, void __user *uaddr);
         |                 ^~~~~~~~~~~~~~~~~~

vim +241 include/linux/syscalls.h

1bd21c6c21e848 Dominik Brodowski   2018-04-05  230
e145242ea0df6b Dominik Brodowski   2018-04-09  231  /*
e145242ea0df6b Dominik Brodowski   2018-04-09  232   * The asmlinkage stub is aliased to a function named __se_sys_*() which
e145242ea0df6b Dominik Brodowski   2018-04-09  233   * sign-extends 32-bit ints to longs whenever needed. The actual work is
e145242ea0df6b Dominik Brodowski   2018-04-09  234   * done within __do_sys_*().
e145242ea0df6b Dominik Brodowski   2018-04-09  235   */
1bd21c6c21e848 Dominik Brodowski   2018-04-05  236  #ifndef __SYSCALL_DEFINEx
bed1ffca022cc8 Frederic Weisbecker 2009-03-13  237  #define __SYSCALL_DEFINEx(x, name, ...) \
bee20031772af3 Arnd Bergmann       2018-06-19  238      __diag_push(); \
bee20031772af3 Arnd Bergmann       2018-06-19  239      __diag_ignore(GCC, 8, "-Wattribute-alias", \
bee20031772af3 Arnd Bergmann       2018-06-19  240                    "Type aliasing is used to sanitize syscall arguments");\
83460ec8dcac14 Andi Kleen          2013-11-12 @241      asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
e145242ea0df6b Dominik Brodowski   2018-04-09  242          __attribute__((alias(__stringify(__se_sys##name)))); \
c9a211951c7c79 Howard McLauchlan   2018-03-21  243      ALLOW_ERROR_INJECTION(sys##name, ERRNO); \
e145242ea0df6b Dominik Brodowski   2018-04-09  244      static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
e145242ea0df6b Dominik Brodowski   2018-04-09  245      asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
e145242ea0df6b Dominik Brodowski   2018-04-09  246      asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
1a94bc34768e46 Heiko Carstens      2009-01-14  247      { \
e145242ea0df6b Dominik Brodowski   2018-04-09  248          long ret = __do_sys##name(__MAP(x,__SC_CAST,__VA_ARGS__));\
07fe6e00f6cca6 Al Viro             2013-01-21  249          __MAP(x,__SC_TEST,__VA_ARGS__); \
2cf0966683430b Al Viro             2013-01-21  250          __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
2cf0966683430b Al Viro             2013-01-21  251          return ret; \
1a94bc34768e46 Heiko Carstens      2009-01-14  252      } \
bee20031772af3 Arnd Bergmann       2018-06-19  253      __diag_pop(); \
e145242ea0df6b Dominik Brodowski   2018-04-09  254      static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
1bd21c6c21e848 Dominik Brodowski   2018-04-05  255  #endif /* __SYSCALL_DEFINEx */
1a94bc34768e46 Heiko Carstens      2009-01-14  256

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 10143 bytes --]
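The conflict reported above comes down to two declarations of sys_task_getshared that disagree on the first two parameter types: the prototype at include/linux/syscalls.h:1055 uses long, while the SYSCALL_DEFINE3 expansion uses u64. The following is a minimal user-space sketch of that mismatch, not kernel code. The typedef names, the stub function, and the HAS_TYPE macro are hypothetical and exist only for illustration; u64 is mirrored as unsigned long long to match the "aka" text in the compiler output, and the __user annotation is dropped because it has no meaning outside the kernel.

```c
#include <assert.h>

/* Mirrors the "aka long long unsigned int" in the compiler output above. */
typedef unsigned long long u64;

/* Parameter types as declared at include/linux/syscalls.h:1055 in the report. */
typedef long (*header_proto_t)(long, long, void *);

/* Parameter types produced by SYSCALL_DEFINE3(task_getshared, u64, opt, u64, flags, ...). */
typedef long (*define_proto_t)(u64, u64, void *);

/* Hypothetical stand-in for the syscall body; not the patch's implementation. */
static long task_getshared_stub(u64 opt, u64 flags, void *uaddr)
{
	(void)opt;
	(void)flags;
	(void)uaddr;
	return 0;
}

/* C11 _Generic yields 1 only when the function's type is compatible with T. */
#define HAS_TYPE(fn, T) _Generic(&(fn), T: 1, default: 0)

/* 1 if the definition agrees with the header prototype, 0 otherwise. */
int matches_header_proto(void)
{
	return HAS_TYPE(task_getshared_stub, header_proto_t);
}

/* 1 if the definition agrees with the SYSCALL_DEFINE3 parameter list. */
int matches_define_proto(void)
{
	return HAS_TYPE(task_getshared_stub, define_proto_t);
}

/* Asserts the situation the report describes: the two signatures differ. */
void check_signatures(void)
{
	assert(matches_define_proto() == 1);
	assert(matches_header_proto() == 0);
}
```

Calling check_signatures() passes because long and unsigned long long are distinct types, which is why the compiler rejects the second declaration. Making the prototype and the SYSCALL_DEFINE3 parameter list use the same types (either both u64 or both long) would presumably resolve the error; the report does not say which direction the author intends.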
Thread overview: 4+ messages
2021-08-28 10:18 [RFC PATCH 1/3] Introduce per thread user-kernel shared structure kernel test robot
-- strict thread matches above, loose matches on Subject: below --
2021-08-27 23:42 [RFC PATCH 0/3] Provide fast access to thread-specific data Prakash Sangappa
2021-08-27 23:42 ` [RFC PATCH 1/3] Introduce per thread user-kernel shared structure Prakash Sangappa
2021-08-28  7:36   ` kernel test robot
2021-08-28  8:21   ` kernel test robot