From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755248AbeASMuJ (ORCPT ); Fri, 19 Jan 2018 07:50:09 -0500
Received: from mx2.suse.de ([195.135.220.15]:57807 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751212AbeASMuB (ORCPT ); Fri, 19 Jan 2018 07:50:01 -0500
Date: Fri, 19 Jan 2018 13:49:57 +0100
From: Michal Hocko
To: Nitin Gupta
Cc: steven.sistare@oracle.com, Nitin Gupta, Andrew Morton, Ingo Molnar,
        Mel Gorman, Nadav Amit, Minchan Kim, "Kirill A. Shutemov",
        Peter Zijlstra, Vegard Nossum, "Levin, Alexander (Sasha Levin)",
        Mike Rapoport, Hillf Danton, Shaohua Li, Anshuman Khandual,
        Andrea Arcangeli, David Rientjes, Rik van Riel, Jan Kara,
        Dave Jiang, Jérôme Glisse, Matthew Wilcox, Ross Zwisler,
        Hugh Dickins, Tobin C Harding, linux-kernel@vger.kernel.org,
        linux-mm@kvack.org
Subject: Re: [PATCH v2] mm: Reduce memory bloat with THP
Message-ID: <20180119124957.GA6584@dhcp22.suse.cz>
References: <1516318444-30868-1-git-send-email-nitingupta910@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1516318444-30868-1-git-send-email-nitingupta910@gmail.com>
User-Agent: Mutt/1.9.2 (2017-12-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 18-01-18 15:33:16, Nitin Gupta wrote:
> From: Nitin Gupta
>
> Currently, if the THP enabled policy is "always", or the mode
> is "madvise" and a region is marked as MADV_HUGEPAGE, a hugepage
> is allocated on a page fault if the pud or pmd is empty. This
> yields the best VA translation performance, but increases memory
> consumption if some small page ranges within the huge page are
> never accessed.

Yes, this is true but hardly unexpected for MADV_HUGEPAGE or THP always
users.

> An alternate behavior for such page faults is to install a
> hugepage only when a region is actually found to be (almost)
> fully mapped and active. This is a compromise between
> translation performance and memory consumption. Currently there
> is no way for an application to choose this compromise for the
> page fault conditions above.

Is that really true? We have
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none
This is not reflected during the PF of course but you can control the
behavior there as well. Either by the global setting or a per-process
prctl.

> With this change, whenever an application issues MADV_DONTNEED on a
> memory region, the region is marked as "space-efficient". For such
> regions, a hugepage is not immediately allocated on first write.

Kirill didn't like it in the previous version and I do not like this
either. You are adding a very subtle side effect which might be
completely unexpected. Consider a userspace memory allocator which uses
MADV_DONTNEED to free up unused memory. Now you have put it out of THP
usage basically.

If the memory is really used sparsely then we have MADV_NOHUGEPAGE.

-- 
Michal Hocko
SUSE Labs
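
For illustration only, and not part of the original message: a minimal
userspace sketch of the two existing controls mentioned above, reading
the khugepaged max_ptes_none knob and opting a sparsely used region out
of THP with MADV_NOHUGEPAGE. The helper names read_max_ptes_none and
map_sparse_region are hypothetical, and the sketch assumes Linux with
Python 3.8 or later (for mmap.madvise). Both helpers degrade gracefully
when THP support or the sysfs file is absent.

```python
import mmap
from pathlib import Path

# sysfs knob named in the reply above
KHUGEPAGED_MAX_PTES_NONE = Path(
    "/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none"
)


def read_max_ptes_none(path=KHUGEPAGED_MAX_PTES_NONE):
    """Return the current max_ptes_none value, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def map_sparse_region(length):
    """Create a private anonymous mapping and try to opt it out of THP.

    Returns (mapping, opted_out). opted_out is False when the platform
    does not expose MADV_NOHUGEPAGE or the kernel rejects the advice
    (for example, a kernel built without THP support).
    """
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE)
    advice = getattr(mmap, "MADV_NOHUGEPAGE", None)
    opted_out = False
    if advice is not None and hasattr(region, "madvise"):
        try:
            region.madvise(advice)
            opted_out = True
        except OSError:
            pass
    return region, opted_out


if __name__ == "__main__":
    print("max_ptes_none:", read_max_ptes_none())
    region, opted_out = map_sparse_region(4 * 1024 * 1024)
    region[0] = 1  # touch a single small page in the region
    print("opted out of THP:", opted_out)
    region.close()
```

The point of the sketch is that the opt-out is explicit: the application
states that the region is sparsely used, rather than having that
inferred from an earlier MADV_DONTNEED call.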