From: Suren Baghdasaryan
Date: Tue, 28 Feb 2023 17:49:18 -0800
Subject: Re: [PATCH] psi: reduce min window size to 50ms
To: Michal Hocko
Cc: Sudarshan Rajagopalan, David Hildenbrand, Johannes Weiner,
 Mike Rapoport, Oscar Salvador, Anshuman Khandual, mark.rutland@arm.com,
 will@kernel.org, virtualization@lists.linux-foundation.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-arm-msm@vger.kernel.org,
 Trilok Soni, Sukadev Bhattiprolu, Srivatsa Vaddagiri, Patrick Daly,
 johunt@akamai.com
References: <15cd8816-b474-0535-d854-41982d3bbe5c@quicinc.com>
 <82406da2-799e-f0b4-bce0-7d47486030d4@quicinc.com>

On Tue, Feb 28, 2023 at 10:18 AM Suren Baghdasaryan wrote:
>
> On Tue, Feb 28, 2023 at 5:50 AM Michal Hocko wrote:
> >
> > On Mon 27-02-23 11:50:48, Suren Baghdasaryan wrote:
> > > On Mon, Feb 27, 2023 at 11:11 AM Michal Hocko wrote:
> > > >
> > > > On Mon 27-02-23 09:49:59, Suren Baghdasaryan wrote:
> > > > > On Mon, Feb 27, 2023 at 5:34 AM Michal Hocko wrote:
> > > > > >
> > > > > > On Fri 24-02-23 13:07:57, Suren Baghdasaryan wrote:
> > > > > > > On Fri, Feb 24, 2023 at 4:47 AM Michal Hocko wrote:
> > > > [...]
> > > > > > > > Btw. it seems that there is only a limit of a single trigger per fd
> > > > > > > > but no limit per user, so it doesn't sound too hard to end up with too
> > > > > > > > much polling even with larger timeouts. To me it seems like we need to
> > > > > > > > contain the polling thread so that it is bound by the cpu controller.
> > > > > > >
> > > > > > > Hmm. We have one "psimon" thread per cgroup (+1 system-level one) and
> > > > > > > poll_min_period for each thread is chosen as the min() of the polling
> > > > > > > periods of the triggers created in that group. So a bad trigger that
> > > > > > > causes overly aggressive polling, and the polling thread being throttled,
> > > > > > > might affect other triggers in that cgroup.
> > > > > >
> > > > > > Yes, and why would that be a problem?
> > > > >
> > > > > If unprivileged processes are allowed to add new triggers then a
> > > > > malicious process can add a bad trigger and affect other legit
> > > > > processes. That sounds like a problem to me.
> > > >
> > > > Hmm, I am not sure we are on the same page. My argument was that the
> > > > monitoring kernel thread should be bound by the same cpu controller so
> > > > even if it was excessive it would be bound to the cgroup constraints.
> > >
> > > Right. But if cgroup constraints are violated then the psimon thread's
> > > activity will be impacted by throttling. In such cases won't that
> > > affect other "good" triggers served by that thread even if they are
> > > using higher polling periods?
> >
> > That is no different from any other part of the workload running within
> > the same cpu bound cgroup going overboard with its cpu consumption. I
> > do not see why psimon or anything else should be any different.
> >
> > Actually the only difference here is that the psi monitoring is
> > outsourced to a kernel thread which is running outside of any constraints.
> > I am not sure where we stand with kernel thread cpu cgroup accounting
> > and I suspect this is not a trivial thing to do ATM. Hence the longer
> > term plan.
>
> Yeah, that sounds right.
> In the meantime I think the prudent thing to do is to add a
> CAP_SYS_RESOURCE check for the cgroup interface, for consistency with
> the system-wide one. After that we can change the min period to be
> anything more than 0 and let privileged userspace services implement
> policies to limit trigger cpu consumption (might be via the cpu
> controller, limiting the number of triggers/their periods, etc).
> Sudarshan, I'll post the CAP_SYS_RESOURCE change shortly and you can
> follow up with the change to the min trigger period.

The patch to require CAP_SYS_RESOURCE for writing per-cgroup psi files is
posted at
https://lore.kernel.org/all/20230301014651.1370939-1-surenb@google.com/

> Thanks for the input folks!
>
> > --
> > Michal Hocko
> > SUSE Labs
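
For illustration only, here is a minimal userspace sketch of the min() rule
discussed above: the shared polling period of a group is the smallest period
among its triggers, so one aggressive trigger speeds up polling for all of
them. This is not the kernel code. The function name group_poll_period_us and
the example periods are made up for this sketch, and it assumes periods are
given directly in microseconds.

```python
from typing import Iterable, Optional


def group_poll_period_us(trigger_periods_us: Iterable[int]) -> Optional[int]:
    """Return the shared polling period for one group: the minimum of all
    trigger polling periods, or None when the group has no triggers.

    Hypothetical helper, not a kernel or libc API.
    """
    periods = list(trigger_periods_us)
    if any(p <= 0 for p in periods):
        raise ValueError("polling periods must be positive")
    return min(periods) if periods else None


# Hypothetical cgroup with two well-behaved triggers (periods in microseconds).
good_triggers = [100_000, 200_000]

# The same cgroup after one aggressive trigger is added.
with_bad_trigger = good_triggers + [1_000]

print(group_poll_period_us(good_triggers))
print(group_poll_period_us(with_bad_trigger))
```

With the aggressive trigger present, the whole group polls at its period, which
is why the thread above argues either for restricting who may create triggers
or for charging the polling work to the cgroup's cpu controller.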