References: <20210619092002.1791322-1-pcc@google.com> <20210628122455.sqo77q4jfxtiwt5b@box.shutemov.name>
In-Reply-To: <20210628122455.sqo77q4jfxtiwt5b@box.shutemov.name>
From: Peter Collingbourne
Date: Fri, 16 Jul 2021 19:58:31 -0700
Subject: Re: [PATCH v4] mm: introduce reference pages
To: "Kirill A.
Shutemov"
Cc: John Hubbard, Matthew Wilcox, Andrew Morton, Catalin Marinas, Evgenii Stepanov, Jann Horn, Linux ARM, Linux Memory Management List, kernel test robot, Linux API, linux-doc@vger.kernel.org

On Mon, Jun 28, 2021 at 5:24 AM Kirill A. Shutemov wrote:
>
> On Sat, Jun 19, 2021 at 02:20:02AM -0700, Peter Collingbourne wrote:
> > #include
> > #include
> > #include
> > #include
> > #include
> >
> > constexpr unsigned char pattern_byte = 0xaa;
> >
> > #define PAGE_SIZE 4096
> >
> > _Alignas(PAGE_SIZE) static unsigned char pattern[PAGE_SIZE];
> >
> > int main(int argc, char **argv) {
> >   if (argc < 3)
> >     return 1;
> >   bool use_refpage = argc > 3;
> >   size_t mmap_size = atoi(argv[1]);
> >   size_t touch_size = atoi(argv[2]);
> >
> >   int refpage_fd;
> >   if (use_refpage) {
> >     memset(pattern, pattern_byte, PAGE_SIZE);
> >     refpage_fd = syscall(448, pattern, 0);
> >   }
> >   for (unsigned i = 0; i != 1000; ++i) {
> >     char *p;
> >     if (use_refpage) {
> >       p = (char *)mmap(0, mmap_size, PROT_READ | PROT_WRITE, MAP_PRIVATE,
> >                        refpage_fd, 0);
> >     } else {
> >       p = (char *)mmap(0, mmap_size, PROT_READ | PROT_WRITE,
> >                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> >       memset(p, pattern_byte, mmap_size);
> >     }
> >     for (unsigned j = 0; j < touch_size; j += PAGE_SIZE)
> >       p[j] = 0;
> >     munmap(p, mmap_size);
> >   }
> > }
>
> I don't like the interface.
> It is tied to PAGE_SIZE and this doesn't seem
> to be very future looking. How would it work with THPs?

The idea with this interface is that the FD would be passed to mmap,
and anything that uses mmap already needs to be tied to the page size
to some extent. For THPs I would expect that the kernel would
duplicate the contents of the page as needed.

Another reason that I thought to use a page-size-based interface was
to allow future optimizations that may reuse the actual page passed to
the syscall. So for example if libc.so contained a page filled with
the required pattern and the allocator passed a pointer to that page,
then it could be shared between all of the processes on the system
that link against that libc.

But I suppose that such optimizations would not require passing in a
whole page like that. For pattern-based optimizations we could use a
reference-counted hash table or something similar, and for larger
patterns we could activate the optimization only if the size argument
were equal to the page size.

> Maybe we should consider passing down a filling pattern to kernel and let
> kernel allocate appropriate page size on read page fault? The pattern has
> to be a power of 2 and limited in length.

Okay, so this sounds like my idea for handling THPs except applied to
any size. This seems reasonable enough to me. However, in order to
optimize use cases where the page is only ever read, let's have the
kernel prepare the reference page once instead of recreating it on
every fault.

In v5 I've adopted Matthew's proposed prototype:

int refpage_create(const void *__user content, unsigned int size,
                   unsigned long pattern, unsigned long flags);

Peter
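
As an illustration of the pattern-based approach discussed above, here
is a minimal userspace sketch in C of preparing a reference page once
from a small pattern and replicating it to whatever page size is
needed. It is not code from the patch: make_refpage and is_pow2 are
hypothetical names, and the power-of-2 restriction on the pattern
length follows Kirill's suggestion rather than anything the kernel is
known to enforce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true if n is a nonzero power of two. */
static bool is_pow2(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical userspace model of "prepare the reference page once":
 * replicate a pattern of pattern_len bytes across a buffer of page_size
 * bytes. Both sizes must be powers of two, so the same routine covers a
 * 4 KiB base page and a larger (THP-sized) page.
 * Returns a malloc'd buffer the caller must free, or NULL on bad input
 * or allocation failure.
 */
static unsigned char *make_refpage(const void *pattern, size_t pattern_len,
                                   size_t page_size)
{
    size_t filled;
    unsigned char *page;

    if (!is_pow2(pattern_len) || !is_pow2(page_size) ||
        pattern_len > page_size)
        return NULL;

    page = (unsigned char *)malloc(page_size);
    if (page == NULL)
        return NULL;

    /* Seed the page with one copy of the pattern. */
    memcpy(page, pattern, pattern_len);

    /* Double the filled prefix until it covers the whole page. */
    for (filled = pattern_len; filled < page_size; filled *= 2)
        memcpy(page + filled, page, filled);

    /* Powers of two guarantee the doubling lands exactly on page_size. */
    assert(filled == page_size);
    return page;
}
```

For example, calling make_refpage with a one-byte 0xaa pattern and a
size of 4096 models the test program's pattern page, and the same call
with a 2 MiB size models the THP case without the caller having to
supply a whole page.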