linux-mm.kvack.org archive mirror
From: Suren Baghdasaryan <surenb@google.com>
To: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
Cc: David Hildenbrand <david@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Mike Rapoport <rppt@kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	 Anshuman Khandual <anshuman.khandual@arm.com>,
	mark.rutland@arm.com, will@kernel.org,
	 virtualization@lists.linux-foundation.org, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	 linux-arm-msm@vger.kernel.org,
	Trilok Soni <quic_tsoni@quicinc.com>,
	 Sukadev Bhattiprolu <quic_sukadev@quicinc.com>,
	Srivatsa Vaddagiri <quic_svaddagi@quicinc.com>,
	 Patrick Daly <quic_pdaly@quicinc.com>
Subject: Re: [PATCH] psi: reduce min window size to 50ms
Date: Fri, 10 Feb 2023 18:13:38 -0800
Message-ID: <CAJuCfpHihLgHCcsAqMJ_o2u7Ux9B5HFGsV2y_L2_5GXYAGYLnw@mail.gmail.com>
In-Reply-To: <15cd8816-b474-0535-d854-41982d3bbe5c@quicinc.com>

On Fri, Feb 10, 2023 at 5:46 PM Sudarshan Rajagopalan
<quic_sudaraja@quicinc.com> wrote:
>
>
> On 2/10/2023 5:09 PM, Suren Baghdasaryan wrote:
> > On Fri, Feb 10, 2023 at 4:45 PM Sudarshan Rajagopalan
> > <quic_sudaraja@quicinc.com> wrote:
> >>
> >> On 2/10/2023 3:03 PM, Suren Baghdasaryan wrote:
> >>> On Fri, Feb 10, 2023 at 2:31 PM Sudarshan Rajagopalan
> >>> <quic_sudaraja@quicinc.com> wrote:
> >>>> The PSI mechanism is a useful tool to monitor pressure stall
> >>>> information in the system. Currently, the minimum window size
> >>>> is set to 500ms. May we know the rationale for this?
> >>> The limit was set to avoid regressions in performance and power
> >>> consumption if the window is set too small and the system ends up
> >>> polling too frequently. That said, the limit was chosen based on
> >>> results of specific experiments which might not represent all
> >> As you rightly said, the effect on power and performance depends on the
> >> type of system: embedded systems, Android mobile, commercial VMs, or
> >> servers. With more frequent PSI sampling, there may not be much power
> >> impact on embedded systems with low-tier chipsets, or much performance
> >> impact on powerful servers.
> >>
> >>> usecases. If you want to change this limit, you would need to describe
> >>> why the new limit is inherently better than the current one (why not
> >>> higher, why not lower).
> >> This is in regard to the userspace daemon [1] that we are working on,
> >> which dynamically resizes the VM memory based on PSI memory pressure
> >> events. With the current min window size of 500ms, the PSI monitor
> >> sampling period would be 50ms. So to detect an increase in memory demand
> >> in the system and plug memory into the VM when pressure goes up, the
> >> process needs to stall for at least 50ms before an event can be generated
> >> and sent out to userspace and the daemon can act.
> >>
> >> Again, I'm talking about lightweight embedded systems, where even
> >> background kswapd/kcompd activity (which I'm calling natural memory
> >> pressure) in the system would cause less than 5-10ms of stall. So any
> >> stall of more than 5-10ms would "hint" to us that a memory-consuming
> >> usecase has run and memory may need to be plugged in.
> >>
> >> So in these cases, having a psimon sampling time as low as 5ms would give
> >> us a faster reaction time, and the daemon can respond more quickly. In
> >> general, this will reduce the malloc latencies significantly.
> >>
> >> Pasting here the same excerpt I mentioned in [1].
> > My question is: why do you think 5ms is the optimal limit here? I want
> > to avoid a race to the bottom where next time someone can argue that
> > they would like to detect a stall within a lower period than 5ms.
> > Technically the limit can be as small as one wants but at some point I
> > think we should consider the possibility of this being used for a DoS
> > attack.
>
> Well, the optimal limit should be something which is least destructive? I
> do understand the possibility of DoS attacks, but wouldn't that still
> be possible with the 500ms window today? Which will at least be 1/10th less
> severe compared to a 50ms window. The way I see it, min pressure
> sampling should be such that even the smallest pressure stall which we
> think is significant should be captured (this could be 5ms or 50ms at
> present) while balancing the power and performance impact across all
> usecases.
>
> At present, Android's LMKD sets 1000ms as the window, for which it considers
> 100ms sampling to be significant. And here, with the psi_daemon usecase, we
> are saying 5ms sampling would be significant. So there's no actual
> optimal limit, but we must limit as much as possible without affecting
> power or performance as a whole. Also, this is just the "minimum
> allowable" window, and system admins can configure it as per the system
> type/requirement.

Ok, let me ask you another way which might be more productive. What
caused you to choose 5ms as the time you care to react to a stall
buildup?

>
> Also, about possible DoS attacks - file permissions for
> /proc/pressure/... can be set such that not just any random user can
> register for psi events, right?

True. We have a CAP_SYS_RESOURCE check for the writers of these files.

>
> >
> >> "
> >>
> >> 4. Detecting increase in memory demand - when a certain usecase starts
> >> in the VM that does memory allocations, it will stall, causing the PSI
> >> mechanism to generate a memory pressure event to userspace. Simply put,
> >> when pressure increases beyond a certain set threshold, it can make an
> >> educated guess that a memory-requiring usecase has run and the VM system
> >> needs memory to be added.
> >>
> >> "
> >>
> >> [1]
> >> https://lore.kernel.org/linux-arm-kernel/1bf30145-22a5-cc46-e583-25053460b105@redhat.com/T/#m95ccf038c568271e759a277a08b8e44e51e8f90b
> >>
> >>> Thanks,
> >>> Suren.
> >>>
> >>>> For lightweight systems such as Linux Embedded Systems, PSI
> >>>> can be used to monitor and track memory pressure building up
> >>>> in the system and respond quickly to such memory demands.
> >>>> For example, the Linux Embedded System could be a secondary VM
> >>>> system which requests memory from the primary host. With a 500ms
> >>>> window size, the sampling period is 50ms (one-tenth of the window
> >>>> size). So the minimum amount of time the process needs to stall,
> >>>> so that a PSI event can be generated and actions can be taken,
> >>>> is 50ms. This reaction time can be much reduced by reducing the
> >>>> sampling time (by reducing the window size), so that such
> >>>> memory pressure in the system can be serviced much quicker.
> >>>>
> >>>> Please let us know your thoughts on reducing window size to 50ms.
> >>>>
> >>>> Sudarshan Rajagopalan (1):
> >>>>     psi: reduce min window size to 50ms
> >>>>
> >>>>    kernel/sched/psi.c | 2 +-
> >>>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> --
> >>>> 2.7.4
> >>>>

