linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Suren Baghdasaryan <surenb@google.com>
To: gregkh@linuxfoundation.org
Cc: tj@kernel.org, lizefan@huawei.com, hannes@cmpxchg.org,
	axboe@kernel.dk, dennis@kernel.org, dennisszhou@gmail.com,
	mingo@redhat.com, peterz@infradead.org,
	akpm@linux-foundation.org, corbet@lwn.net,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com, Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH v2 0/5] psi: pressure stall monitors v2
Date: Thu, 10 Jan 2019 14:07:13 -0800	[thread overview]
Message-ID: <20190110220718.261134-1-surenb@google.com> (raw)

This is respin of:
  https://lwn.net/ml/linux-kernel/20181214171508.7791-1-surenb@google.com/

Android is adopting psi to detect and remedy memory pressure that
results in stuttering and decreased responsiveness on mobile devices.

Psi gives us the stall information, but because we're dealing with
latencies in the millisecond range, periodically reading the pressure
files to detect stalls in a timely fashion is not feasible. Psi also
doesn't aggregate its averages at a high-enough frequency right now.

This patch series extends the psi interface such that users can
configure sensitive latency thresholds and use poll() and friends to
be notified when these are breached.

As high-frequency aggregation is costly, it implements an aggregation
method that is optimized for fast, short-interval averaging, and makes
the aggregation frequency adaptive, such that high-frequency updates
only happen while monitored stall events are actively occurring.

With these patches applied, Android can monitor for, and ward off,
mounting memory shortages before they cause problems for the user.
For example, using memory stall monitors in userspace low memory
killer daemon (lmkd) we can detect mounting pressure and kill less
important processes before device becomes visibly sluggish. In our
memory stress testing psi memory monitors produce roughly 10x less
false positives compared to vmpressure signals. Having ability to
specify multiple triggers for the same psi metric allows other parts
of Android framework to monitor memory state of the device and act
accordingly.

The new interface is straight-forward. The user opens one of the
pressure files for writing and writes a trigger description into the
file descriptor that defines the stall state - some or full, and the
maximum stall time over a given window of time. E.g.:

        /* Signal when stall time exceeds 100ms of a 1s window */
        char trigger[] = "full 100000 1000000"
        fd = open("/proc/pressure/memory")
        write(fd, trigger, sizeof(trigger))
        while (poll() >= 0) {
                ...
        };
        close(fd);

When the monitored stall state is entered, psi adapts its aggregation
frequency according to what the configured time window requires in
order to emit event signals in a timely fashion. Once the stalling
subsides, aggregation reverts back to normal.

The trigger is associated with the open file descriptor. To stop
monitoring, the user only needs to close the file descriptor and the
trigger is discarded.

Patches 1-4 prepare the psi code for polling support. Patch 5
implements the adaptive polling logic, the pressure growth detection
optimized for short intervals, and hooks up write() and poll() on the
pressure files.

The patches were developed in collaboration with Johannes Weiner.

The patches are based on 4.20-rc7.

Johannes Weiner (2):
  fs: kernfs: add poll file operation
  kernel: cgroup: add poll file operation

Suren Baghdasaryan (3):
  psi: introduce state_mask to represent stalled psi states
  psi: rename psi fields in preparation for psi trigger addition
  psi: introduce psi monitor

 Documentation/accounting/psi.txt | 104 ++++++
 fs/kernfs/file.c                 |  31 +-
 include/linux/cgroup-defs.h      |   4 +
 include/linux/kernfs.h           |   6 +
 include/linux/psi.h              |  10 +
 include/linux/psi_types.h        |  83 ++++-
 kernel/cgroup/cgroup.c           | 119 +++++-
 kernel/sched/psi.c               | 597 ++++++++++++++++++++++++++++---
 8 files changed, 885 insertions(+), 69 deletions(-)

Changes in v2:
- Preserved psi idle mode, as per Peter
- Changed input parser to use sscanf, as per Peter
- Removed percentage threshold support, as per Johannes and Joel
- Added explicit numbers corresponding to NR_PSI_TASK_COUNTS,
NR_PSI_RESOURCES and NR_PSI_STATES, as per Peter
- Added a note about psi_group_cpu struct new size in the changelog
of the patch introducing new member (0003-...), as per Peter
- Fixed 64-bit division using div_u64, as per kbuild test robot

-- 
2.20.1.97.g81188d93c3-goog

             reply	other threads:[~2019-01-10 22:07 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-10 22:07 Suren Baghdasaryan [this message]
2019-01-10 22:07 ` [PATCH v2 1/5] fs: kernfs: add poll file operation Suren Baghdasaryan
2019-01-11  6:25   ` Greg KH
2019-01-10 22:07 ` [PATCH v2 2/5] kernel: cgroup: " Suren Baghdasaryan
2019-01-10 22:07 ` [PATCH v2 3/5] psi: introduce state_mask to represent stalled psi states Suren Baghdasaryan
2019-01-10 22:07 ` [PATCH v2 4/5] psi: rename psi fields in preparation for psi trigger addition Suren Baghdasaryan
2019-01-10 22:07 ` [PATCH v2 5/5] psi: introduce psi monitor Suren Baghdasaryan
2019-01-14 10:21   ` Peter Zijlstra
2019-01-14 19:30     ` Suren Baghdasaryan
2019-01-14 19:30       ` Suren Baghdasaryan
2019-01-14 19:42       ` Johannes Weiner
2019-01-14 20:47         ` Suren Baghdasaryan
2019-01-14 20:47           ` Suren Baghdasaryan
2019-01-16 13:24       ` Peter Zijlstra
2019-01-16 17:39         ` Suren Baghdasaryan
2019-01-16 17:39           ` Suren Baghdasaryan
2019-01-16 19:17           ` Johannes Weiner
2019-01-16 19:27             ` Johannes Weiner
2019-01-16 21:29               ` Suren Baghdasaryan
2019-01-16 21:29                 ` Suren Baghdasaryan
2019-01-17  8:13             ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190110220718.261134-1-surenb@google.com \
    --to=surenb@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=cgroups@vger.kernel.org \
    --cc=corbet@lwn.net \
    --cc=dennis@kernel.org \
    --cc=dennisszhou@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@android.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizefan@huawei.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).