From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29F9AC433ED for ; Thu, 1 Apr 2021 14:50:37 +0000 (UTC) Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6104561366 for ; Thu, 1 Apr 2021 14:50:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6104561366 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kernelnewbies-bounces@kernelnewbies.org Received: from localhost ([::1] helo=shelob.surriel.com) by shelob.surriel.com with esmtp (Exim 4.94) (envelope-from ) id 1lRyeL-00063Q-3c; Thu, 01 Apr 2021 10:50:21 -0400 Received: from mail-ej1-x636.google.com ([2a00:1450:4864:20::636]) by shelob.surriel.com with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94) (envelope-from ) id 1lRyeJ-000638-GM for kernelnewbies@kernelnewbies.org; Thu, 01 Apr 2021 10:50:19 -0400 Received: by mail-ej1-x636.google.com with SMTP id r12so3295985ejr.5 for ; Thu, 01 Apr 2021 07:50:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=sTHPHeHJrfRrcVAc3zwqgABmiyP/WJCFr8EUyYOWAPg=; b=BCZMFVTdS3ijzrcZcZa+PsX8lJyCJruJwc9qIYvqY9FvbP5cr69VeoFjGLfDhA0qX8 Vi8qfuZfkYJNSV44vIlWd9eMROzhLOtOfWKdeoO6DHuNP5QwjnHjOETxQE0uvPR8DUXS 5r0+5WWndtiH1lzKzsBUAjHTnaFCuImsYIksD29V2sfGpRERKusvwMntSd1BlitIpH+8 0cPMkc1dDjmrCU9RhitSjAUxcYfWDznqL3ltK+VJgS+Knf7y65aBq5YwUhGN9j4QHFhf LuEhCGTfv1GIezC8P4Y+8u0dTIohQ7trvyYd5axxi+C63M3Y0dAPedjm6Ad5B3Lc4a0m fpfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=sTHPHeHJrfRrcVAc3zwqgABmiyP/WJCFr8EUyYOWAPg=; b=DPTKSmnH3qJYsqDysc0agbbei1sEx4n7lqgElHc9R6y+8ExGyq1qVRwfh2fdw2BcDR wm8ZGUtDWldgdS023FL/uN37HJ/dz7lQYGQ/6vw7xKbhQg6lQn/UMZrxM9PXumXH9aNn nGLZTGxwEMNkgZ2dbQBBza64+bt26b/+S6C1V4a/LV8k2+cNYK1aowohwWP3WQ8ZSEkn ADawdDVdq40E0oirINyZW00NGkjLih9PITWm+tgWgmLkKl3fi+KruZ+XEF+knJM5yB9O qOENBF4JsZbsELz8KqkO95s8H2zqABeaCfD33My4ZXlxiek/1UwRX1j1CHLQfqEYX64u muDQ== X-Gm-Message-State: AOAM532qIDjL6wOFopp06xS6VVafCPy2eqQ37MUDFhN1LJK/7v5gRC2D 4j6exq5UNP4BSoYK+cjjGkkW723XYZglp2W1EOmrlIE2 X-Google-Smtp-Source: ABdhPJz8/NhdCuJ2okvLJu/hhxOKsTOwRpKp3Ed+hy9mjsQ3LqYcMD5Z0qiDnZV7sJTDND/ziMY4MqWBHyuRPMZ28Gk= X-Received: by 2002:a17:907:2062:: with SMTP id qp2mr9436208ejb.397.1617288615458; Thu, 01 Apr 2021 07:50:15 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Navin P Date: Thu, 1 Apr 2021 20:20:04 +0530 Message-ID: Subject: Re: pid start time and new display field in proc pid stat [ race in task->start_boottime ,start_time] To: kernelnewbies X-BeenThere: kernelnewbies@kernelnewbies.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Learn about the Linux kernel List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: kernelnewbies-bounces@kernelnewbies.org On Fri, Mar 26, 2021 at 8:31 AM Navin P wrote: > > On Thu, Mar 25, 2021 at 8:51 PM wrote: > > > > > > > > On Thu, Mar 25, 2021 at 6:53 AM Navin P wrote: > >> > >> Hi, > >> > >> As of 5.11 kernel (pid,pid_start_time) is not unique /monotonic even > >> though the underlying counters are . > >> I chose start_boottime because i wanted the counter to increase > >> during suspend as well. > >> > My patch doesn't create a new field, it uses the existing field and > prints it in /proc/[pid]/stat. Stat files are created > on first lookup. These are used by ps or top. > > + seq_put_decimal_ull(m, " ", task->start_boottime); After printing the task->start_boottime as 53rd field, i found multiple process containing same value due to a race. The race exists for p->start_time as well. Working on the above i found a race in the kernel/fork.c at p->start_boottime = ktime_get_boottime_ns(); i logged the output and compared them to find identical values for many process. I made a patch and this fixes the issue but i'm not happy with the change because it locks the tasklist_lock which is a big lock. What are the other ways that can be used to fix this problem ? I was thinking of changing start_boottime as struct start_boottime { u64 boottime, rw_lock }; Now all my access to boottime go through the rw_lock . Another rwlock has to be created for start_time as well. >From a048ac81b468ec94cb10ca41416cfe9641be3c5b Mon Sep 17 00:00:00 2001 From: Navin P Date: Thu, 1 Apr 2021 20:04:24 +0530 Subject: [PATCH] Wrap task->start_boottime in write_lock_irq during creation and filling leader->task_boottime in fs/exec.c . Also wrap read_lock in /proc/fs/array.c for reading task->start_boottime to remove the race. Signed-off-by: Navin P --- fs/exec.c | 2 ++ fs/proc/array.c | 8 ++++++-- kernel/fork.c | 6 +++++- 3 files changed, 13 insertions(+), 3 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 18594f11c31f..6e29698bdd90 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1109,7 +1109,9 @@ static int de_thread(struct task_struct *tsk) * also take its birthdate (always earlier than our own). */ tsk->start_time = leader->start_time; + write_lock_irq(&tasklist_lock); tsk->start_boottime = leader->start_boottime; + write_unlock_irq(&tasklist_lock); BUG_ON(!same_thread_group(leader, tsk)); /* diff --git a/fs/proc/array.c b/fs/proc/array.c index 74389aaefa9c..dae7e973b9de 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -469,7 +469,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns, int num_threads = 0; int permitted; struct mm_struct *mm; - unsigned long long start_time; + unsigned long long start_time,curr_boottime; unsigned long cmin_flt = 0, cmaj_flt = 0; unsigned long min_flt = 0, maj_flt = 0; u64 cutime, cstime, utime, stime; @@ -563,8 +563,12 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns, nice = task_nice(task); /* apply timens offset for boottime and convert nsec -> ticks */ + + read_lock(&tasklist_lock); start_time = nsec_to_clock_t(timens_add_boottime_ns(task->start_boottime)); + curr_boottime = task->start_boottime; + read_unlock(&tasklist_lock); seq_put_decimal_ull(m, "", pid_nr_ns(pid, ns)); seq_puts(m, " ("); @@ -645,7 +649,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns, else seq_puts(m, " 0"); - seq_put_decimal_ull(m, " ", task->start_boottime); + seq_put_decimal_ull(m, " ", curr_boottime); seq_putc(m, '\n'); if (mm) mmput(mm); diff --git a/kernel/fork.c b/kernel/fork.c index 426cd0c51f9e..41728b9bb76d 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2219,11 +2219,15 @@ static __latent_entropy struct task_struct *copy_process( * communication until we take the tasklist-lock. In particular, we do * not want user-space to be able to predict the process start-time by * stalling fork(2) after we recorded the start_time but before it is - * visible to the system. + * visible to the system. Quickly take write lock and fill in details + * preventing race. */ p->start_time = ktime_get_ns(); + write_lock_irq(&tasklist_lock); p->start_boottime = ktime_get_boottime_ns(); + write_unlock_irq(&tasklist_lock); + /* * Make it visible to the rest of the system, but dont wake it up yet. -- 2.25.1 _______________________________________________ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies