Message-ID: <62c296e6.1c69fb81.41c44.1cca@mx.google.com>
Date: Mon, 4 Jul 2022 07:29:41 +0000
From: CGEL
To: Michal Hocko
Cc: David Hildenbrand, linux-mm@kvack.org, linux-kernel@vger.kernel.org, vbabka@suse.cz, minchan@kernel.org, oleksandr@redhat.com, xu xin, Jann Horn, Andrew Morton
Subject: Re: [PATCH linux-next] mm/madvise: allow KSM hints for process_madvise

On Mon, Jul 04, 2022 at 08:48:06AM +0200, Michal Hocko wrote:
> On Fri 01-07-22 21:12:56, David Hildenbrand wrote:
> > On 01.07.22 15:19, Michal Hocko wrote:
> > > On Fri 01-07-22 14:39:24, David Hildenbrand wrote:
> > >>> I am not sure about exact details of the KSM implementation but if that
> > >>> is not a desirable behavior then it should be handled on the KSM level.
> > >>> The very same thing can easily happen in a multithreaded (or in general
> > >>> multi-process with shared mm) environment as well.
> > >>
> > >> I don't quite get what you mean.
> > >
> > > I meant to say that if KSM needs to be aware of a special CoW semantic
> > > then it should be handled on the KSM layer regardless of whether KSM
> > > has been set by the process itself or any other process that has access
> > > to the MM. process_madvise is just another way to access a remote MM
> > > other than sharing the full MM.
> >
> > Okay.
> >
> > KSM has been a corner case feature that was restricted to well-defined
> > and well-tested environments. Until recently, R/O pins of any KSM pages
> > were essentially completely unreliable. And applications don't expect
> > such surprises. The shared zeropage is most probably the last
> > problematic piece.
> >
> > Yes, we're getting there that it's a real feature that can see more
> > (forced) wide-spread use. However, until the known issues in KSM have
> > been fixed (e.g., below -- there is a whole list of papers regarding
> > attacks on memory deduplication), it should be limited to well defined
> > environments and applications only -- IMHO.
>
> Very much agreed on all this! To be completely honest I am not really
> sure that all those consequences are widely understood and optimizing
> solely on memory savings is a very short-sighted strategy IMO. But, it
> seems that there is a demand for this feature and previous attempts for
> APIs were much worse both from the semantic and maintainability POV. I
> am not sure we can get anything more sane than madvise.
>
> I also very much agree that current shortcomings have to be addressed
> first before we open this can of worms to 3rd party actors. I was not
> aware of those so thanks for bringing them up. Maybe I was overly
> optimistic here.
>
> So I guess we have the following questions to answer:
> 1) Do we really want to support KSM triggered by 3rd party? Does it
> impose new challenges other than existing ones in multi "threaded"
> environments?
> 2) If yes, is the process_madvise the most appropriate existing API? Or
> do we need a new one? Maybe new semantics are needed, similar to
> MADV_NOHUGEPAGE, which ensures that there will *not* be huge pages.
> 3) Should this be a highly privileged operation or do we want to allow
> userspace to shoot its feet because consequences are subtle and not very
> well understood?
>
> > So what I want to express here is that if we're adding an interface that
> > can be used to just enable KSM on the whole system easily, it might be a
> > bit too soon for that. No matter what you document, people will ignore it.
>
> Agreed.
>
Agree too. Thanks.