From mboxrd@z Thu Jan 1 00:00:00 1970
From: Minchan Kim
Subject: Re: [PATCH v3 5/5] psi: introduce psi monitor
Date: Tue, 29 Jan 2019 08:53:58 +0900
Message-ID: <20190128235358.GA211479@google.com>
References: <20190124211518.244221-1-surenb@google.com>
 <20190124211518.244221-6-surenb@google.com>
In-Reply-To: <20190124211518.244221-6-surenb@google.com>
To: Suren Baghdasaryan
Cc: gregkh@linuxfoundation.org, tj@kernel.org, lizefan@huawei.com,
 hannes@cmpxchg.org, axboe@kernel.dk, dennis@kernel.org,
 dennisszhou@gmail.com, mingo@redhat.com, peterz@infradead.org,
 akpm@linux-foundation.org, corbet@lwn.net, cgroups@vger.kernel.org,
 linux-mm@kvack.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com

Hi Suren,

When I reviewed this the first time, it was rather hard to understand
because of the naming, so the comments below are mostly cleanups or minor
points. I won't object strongly if you don't find them helpful. Feel free
to pick the parts you like.

Thanks.

On Thu, Jan 24, 2019 at 01:15:18PM -0800, Suren Baghdasaryan wrote:
> Psi monitor aims to provide a low-latency short-term pressure
> detection mechanism configurable by users. It allows users to
> monitor psi metrics growth and trigger events whenever a metric
> raises above user-defined threshold within user-defined time window.
>
> Time window and threshold are both expressed in usecs. Multiple psi
> resources with different thresholds and window sizes can be monitored
> concurrently.
>
> Psi monitors activate when system enters stall state for the monitored
> psi metric and deactivate upon exit from the stall state. While system
> is in the stall state psi signal growth is monitored at a rate of 10 times
> per tracking window. Min window size is 500ms, therefore the min monitoring
> interval is 50ms. Max window size is 10s with monitoring interval of 1s.
>
> When activated psi monitor stays active for at least the duration of one
> tracking window to avoid repeated activations/deactivations when psi
> signal is bouncing.
>
> Notifications to the users are rate-limited to one per tracking window.
>
> Signed-off-by: Suren Baghdasaryan
> Signed-off-by: Johannes Weiner
> ---
>  Documentation/accounting/psi.txt | 104 ++++++
>  include/linux/psi.h              |  10 +
>  include/linux/psi_types.h        |  59 ++++
>  kernel/cgroup/cgroup.c           | 107 +++++-
>  kernel/sched/psi.c               | 562 +++++++++++++++++++++++++++++--
>  5 files changed, 808 insertions(+), 34 deletions(-)
>
> diff --git a/Documentation/accounting/psi.txt b/Documentation/accounting/psi.txt
> index b8ca28b60215..6b21c72aa87c 100644
> --- a/Documentation/accounting/psi.txt
> +++ b/Documentation/accounting/psi.txt
> @@ -63,6 +63,107 @@ tracked and exported as well, to allow detection of latency spikes
>  which wouldn't necessarily make a dent in the time averages, or to
>  average trends over custom time frames.
>
> +Monitoring for pressure thresholds
> +==================================
> +
> +Users can register triggers and use poll() to be woken up when resource
> +pressure exceeds certain thresholds.
> +
> +A trigger describes the maximum cumulative stall time over a specific
> +time window, e.g. 100ms of total stall time within any 500ms window to
> +generate a wakeup event.
> +
> +To register a trigger user has to open psi interface file under
> +/proc/pressure/ representing the resource to be monitored and write the
> +desired threshold and time window. The open file descriptor should be
> +used to wait for trigger events using select(), poll() or epoll().
> +The following format is used:
> +
> +