From: David Hildenbrand
Organization: Red Hat
Date: Mon, 17 Oct 2022 13:33:48 +0200
Subject: Re: [External] Re: [PATCH] mm: hugetlb: support get/set_policy for hugetlb_vm_ops
To: 黄杰
Cc: songmuchun@bytedance.com, Mike Kravetz, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Message-ID: <2f41fc4c-68eb-ab7d-970b-fcb10f474fd4@redhat.com>
References: <20221012081526.73067-1-huangjie.albert@bytedance.com> <2aaf2c3a-6e49-abb9-b9c8-19ce87404982@redhat.com>

On 17.10.22 11:48, 黄杰 wrote:
> David Hildenbrand wrote on Mon, Oct 17, 2022 at 16:44:
>>
>> On 12.10.22 10:15, Albert Huang wrote:
>>> From: "huangjie.albert"
>>>
>>> Implement these two functions so that we can set the mempolicy on
>>> the inode of the hugetlb file. This ensures that the mempolicy of
>>> all processes sharing this huge page file is consistent.
>>>
>>> In some scenarios where huge pages are shared:
>>> if we need to limit the memory usage of a VM to node0, we bind QEMU's
>>> mempolicy to node0. But if another process (such as virtiofsd) shares
>>> memory with the VM and the page fault is triggered by virtiofsd, the
>>> allocated memory may go to node1, depending on virtiofsd.
>>>
>>
>> Any VM that uses hugetlb should be preallocating memory. For example,
>> this is the expected default under QEMU when using huge pages.
>>
>> Once preallocation does the right thing regarding NUMA policy, there is
>> no need to worry about it in other sub-processes.
>>
>
> Hi, David
> thanks for your reminder
>
> Yes, you are absolutely right: the pre-allocation mechanism
> does solve this problem.
> However, some scenarios are not suited to the pre-allocation mechanism, such as
> scenarios that are sensitive to virtual machine startup time, or
> scenarios that require high memory utilization. On-demand allocation
> may be better there, so the key point is to find a way to support a
> shared policy.

Using hugetlb -- with a fixed pool size -- without preallocation is like
playing with fire. Hugetlb reservation makes one believe that on-demand
allocation is going to work, but there are various scenarios where that
can go seriously wrong, and you can run out of huge pages.

If you're using hugetlb as the memory backend for a VM without
preallocation, you really have to be very careful. I can only advise
against doing that.

Also: why does another process read/write *first* to a guest physical
memory location before the OS running inside the VM has even initialized
that memory? That sounds very wrong. What am I missing?

-- 
Thanks,

David / dhildenb
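
To make the "run out of huge pages" concern concrete, here is a minimal sketch that is not part of the original thread. It parses the hugetlb counters in the /proc/meminfo text format and reports how many pages of the fixed pool are free and not yet reserved by any mapping. The field names (HugePages_Total, HugePages_Free, HugePages_Rsvd, HugePages_Surp) are the ones the kernel exports; the helper names and the sample text are hypothetical and only serve as illustration. Note that these counters are global: they say nothing about which NUMA node the remaining pages sit on, so a policy-bound on-demand fault may still fail even when the number looks healthy.

```python
import re

# Counter names as they appear in /proc/meminfo on Linux.
FIELDS = ("HugePages_Total", "HugePages_Free", "HugePages_Rsvd", "HugePages_Surp")


def parse_hugepage_counters(meminfo_text):
    """Return the hugetlb pool counters found in meminfo-formatted text."""
    counters = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)", line)
        if match and match.group(1) in FIELDS:
            counters[match.group(1)] = int(match.group(2))
    return counters


def unreserved_free(counters):
    """Free pages that no mapping has reserved yet (never negative)."""
    return max(counters["HugePages_Free"] - counters["HugePages_Rsvd"], 0)


# Hypothetical sample, not taken from a real machine.
SAMPLE = """\
MemTotal:       16316412 kB
HugePages_Total:     512
HugePages_Free:      100
HugePages_Rsvd:       90
HugePages_Surp:        0
Hugepagesize:       2048 kB
"""

if __name__ == "__main__":
    counters = parse_hugepage_counters(SAMPLE)
    print(counters)
    print("unreserved free pages:", unreserved_free(counters))
```

On a real system one would pass the contents of /proc/meminfo instead of SAMPLE. In the sample, 100 pages are free but 90 of them are already promised to existing mappings, so only 10 remain for any new on-demand user.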