From: "Huang, Ying"
To: Shaohua Li
Cc: "Huang, Ying", Rik van Riel, Andrew Morton, Hugh Dickins, Minchan Kim, Andrea Arcangeli, "Kirill A. Shutemov", Vladimir Davydov, Johannes Weiner, Michal Hocko
Subject: Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out
Date: Mon, 26 Sep 2016 11:25:27 +0800
In-Reply-To: <20160925191849.GA83300@kernel.org> (Shaohua Li's message of "Sun, 25 Sep 2016 12:18:49 -0700")
Message-ID: <877f9zs5p4.fsf@yhuang-dev.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Shaohua Li writes:

> On Fri, Sep 23, 2016 at 10:32:39AM +0800, Huang, Ying wrote:
>> Rik van Riel writes:
>>
>> > On Thu, 2016-09-22 at 15:56 -0700, Shaohua Li wrote:
>> >> On Wed, Sep 07, 2016 at 09:45:59AM -0700, Huang, Ying wrote:
>> >> >
>> >> > - It will help the memory fragmentation, especially when the THP is
>> >> >   heavily used by the applications.  The 2M continuous pages will be
>> >> >   free up after THP swapping out.
>> >>
>> >> So this is impossible without THP swapin. While 2M swapout makes a lot
>> >> of sense, I doubt 2M swapin is really useful. What kind of application
>> >> is 'optimized' to do sequential memory access?
>> >
>> > I suspect a lot of this will depend on the ratio of storage
>> > speed to CPU & RAM speed.
>> >
>> > When swapping to a spinning disk, it makes sense to avoid
>> > extra memory use on swapin, and work in 4kB blocks.
>>
>> For spinning disk, the THP swap optimization will be turned off in
>> current implementation.  Because huge swap cluster allocation based on
>> swap cluster management, which is available only for non-rotating block
>> devices (blk_queue_nonrot()).
>
> For 2m swapin, as long as one byte is changed in the 2m, next time we must do
> 2m swapout. There is huge waste of memory and IO bandwidth and increases
> unnecessary memory pressure. 2M IO will very easily saturate a very fast SSD
> and makes IO the bottleneck. Not sure about NVRAM though.

One solution is to make 2M swapin configurable, maybe via a sysfs file
in /sys/kernel/mm/transparent_hugepage/, so that we can turn it on only
for really fast storage devices, such as NVRAM.

Best Regards,
Huang, Ying
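
To put a rough number on the write-back concern quoted above, here is a minimal sketch in plain userspace C, not kernel code. It assumes a 4 KiB base page and a 2 MiB THP (the x86-64 sizes), assumes a page that was not modified after swapin needs no new write, and assumes dirtiness is tracked per mapped page, so a region swapped in as one THP is written back whole once any byte in it changes. The names swapout_bytes() and write_amplification() are hypothetical and exist only for this illustration.

```c
#include <assert.h>

#define BASE_PAGE_SIZE   (4UL * 1024)            /* assumed 4 KiB base page */
#define HUGE_PAGE_SIZE   (2UL * 1024 * 1024)     /* assumed 2 MiB THP */
#define SUBPAGES_PER_THP (HUGE_PAGE_SIZE / BASE_PAGE_SIZE)

/*
 * Hypothetical model: bytes written on the next swapout of one 2 MiB region
 * in which dirty_subpages 4 KiB pages were modified after swapin.
 * huge_swapin != 0 means the region came back as a single THP.
 */
unsigned long swapout_bytes(unsigned long dirty_subpages, int huge_swapin)
{
    assert(dirty_subpages <= SUBPAGES_PER_THP);

    if (dirty_subpages == 0)
        return 0;                   /* nothing changed, nothing to rewrite */
    if (huge_swapin)
        return HUGE_PAGE_SIZE;      /* one dirty byte dirties the whole THP */
    return dirty_subpages * BASE_PAGE_SIZE;
}

/* Ratio of 2M-granularity to 4K-granularity write-back for the same dirtying. */
unsigned long write_amplification(unsigned long dirty_subpages)
{
    assert(dirty_subpages > 0);
    return swapout_bytes(dirty_subpages, 1) / swapout_bytes(dirty_subpages, 0);
}
```

Under these assumptions, touching a single 4 KiB page costs 2097152 bytes of write-back with 2M swapin against 4096 bytes with 4K swapin, a factor of 512. The factor drops to 1 only when all 512 subpages are dirtied, which is the case where 2M swapin costs nothing extra. That spread is why a per-system switch, rather than a fixed policy, looks attractive.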