From: Anshuman Khandual
To: Balbir Singh, Anshuman Khandual
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, mhocko@suse.com,
	vbabka@suse.cz, mgorman@suse.de, minchan@kernel.org,
	aneesh.kumar@linux.vnet.ibm.com, srikar@linux.vnet.ibm.com,
	haren@linux.vnet.ibm.com, jglisse@redhat.com, dave.hansen@intel.com,
	dan.j.williams@intel.com, zi.yan@cs.rutgers.edu
Subject: Re: [PATCH 0/6] Enable parallel page migration
Date: Wed, 22 Feb 2017 11:25:13 +0530
Message-Id: <4efb25de-e036-4015-e764-70b4c911ca67@linux.vnet.ibm.com>
In-Reply-To: <20170222050425.GB9967@balbir.ozlabs.ibm.com>
References: <20170217112453.307-1-khandual@linux.vnet.ibm.com>
	<20170222050425.GB9967@balbir.ozlabs.ibm.com>

On 02/22/2017 10:34 AM, Balbir Singh wrote:
> On Fri, Feb 17, 2017 at 04:54:47PM +0530, Anshuman Khandual wrote:
>> This patch series is based on the work posted by Zi Yan back in
>> November 2016 (https://lkml.org/lkml/2016/11/22/457) but includes some
>> amount of clean-up and re-organization. This series depends on the THP
>> migration optimization patch series posted by Naoya Horiguchi on 8th
>> November 2016 (https://lwn.net/Articles/705879/). Though Zi Yan has
>> recently reposted V3 of the THP migration patch series
>> (https://lwn.net/Articles/713667/), this series is yet to be rebased
>> on top of it.
>>
>> The primary motivation behind this patch series is to achieve higher
>> memory migration bandwidth whenever possible by using a multi-threaded
>> copy instead of a single-threaded one. All the experiments were done
>> on a two-socket X86 system (Intel(R) Xeon(R) CPU E5-2650). All the
>> experiments here use the same allocation size of 4K * 100000 (which
>> does not split evenly into 2MB huge pages). Here are the results.
>>
>> Vanilla:
>>
>> Moved 100000 normal pages in 247.000000 msecs 1.544412 GBs
>> Moved 100000 normal pages in 238.000000 msecs 1.602814 GBs
>> Moved 195 huge pages in 252.000000 msecs 1.513769 GBs
>> Moved 195 huge pages in 257.000000 msecs 1.484318 GBs
>>
>> THP migration improvements:
>>
>> Moved 100000 normal pages in 302.000000 msecs 1.263145 GBs
>
> Is there a decrease here for normal pages?

Yeah.
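For context, the bandwidth numbers above come from a small user-space
test that faults in 100000 4K pages, migrates them across sockets with
move_pages(2), and reports elapsed time. The actual test program is not
part of this posting; the sketch below is a hypothetical reconstruction,
and both the target node number and the GiB/s reading of "GBs" are
assumptions on my part:

/*
 * Hypothetical reconstruction of the test (not the actual program):
 * fault in NR_PAGES 4K pages, migrate them to another node with
 * move_pages(2), and print elapsed time and bandwidth. Build with
 * -lnuma.
 */
#include <numaif.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NR_PAGES	100000
#define PAGE_SZ		4096UL

int main(void)
{
	void **pages = malloc(NR_PAGES * sizeof(void *));
	int *nodes = malloc(NR_PAGES * sizeof(int));
	int *status = malloc(NR_PAGES * sizeof(int));
	char *buf = aligned_alloc(PAGE_SZ, NR_PAGES * PAGE_SZ);
	struct timespec t0, t1;
	double msecs;
	long i;

	for (i = 0; i < NR_PAGES; i++) {
		buf[i * PAGE_SZ] = 1;	/* touch to allocate the page */
		pages[i] = buf + i * PAGE_SZ;
		nodes[i] = 1;		/* assumed target node */
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (move_pages(0, NR_PAGES, pages, nodes, status, MPOL_MF_MOVE))
		perror("move_pages");
	clock_gettime(CLOCK_MONOTONIC, &t1);

	msecs = (t1.tv_sec - t0.tv_sec) * 1000.0 +
		(t1.tv_nsec - t0.tv_nsec) / 1000000.0;
	/* "GBs" matches the numbers above if read as GiB/s */
	printf("Moved %d normal pages in %f msecs %f GBs\n", NR_PAGES,
	       msecs, NR_PAGES * PAGE_SZ / (msecs / 1000.0) / (1UL << 30));
	return 0;
}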
>> Moved 100000 normal pages in 262.000000 msecs 1.455991 GBs
>> Moved 195 huge pages in 120.000000 msecs 3.178914 GBs
>> Moved 195 huge pages in 129.000000 msecs 2.957130 GBs
>>
>> THP migration improvements + Multi threaded page copy:
>>
>> Moved 100000 normal pages in 1589.000000 msecs 0.240069 GBs **
>
> Ditto?

Yeah, I have already mentioned this after the data in the cover letter.
The new flag is controlled from user space while invoking the system
calls. Users should be careful to use it in scenarios where it helps
and to avoid it in cases where it hurts (see the copy sketch at the end
of this mail).

>> Moved 100000 normal pages in 1932.000000 msecs 0.197448 GBs **
>> Moved 195 huge pages in 54.000000 msecs 7.064254 GBs ***
>> Moved 195 huge pages in 86.000000 msecs 4.435694 GBs ***
>>
> Could you also comment on the CPU utilization impact of these
> patches.

Yeah, it really makes sense to analyze this impact. I have mentioned it
in the outstanding issues section of the series. But what exactly do we
need to analyze from a CPU utilization point of view? For example, what
is the probability that the workqueue jobs will push tasks off the run
queue and make them starve for some more time? Could you please give
some details on this?
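To make the trade-off above concrete: the multi-threaded copy splits
each page copy into chunks handled by several workers, so the same
bytes move in less wall time but occupy more CPUs, which is exactly
where the utilization question bites. A minimal user-space analogy of
the chunked copy follows; the series itself drives kernel workqueue
items rather than pthreads, and the thread count and names here are
just for illustration. Build with gcc -pthread.

#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define HPAGE_SIZE	(2UL << 20)	/* one 2MB huge page */
#define NR_COPY_THREADS	4		/* arbitrary worker count */

struct copy_chunk {
	char *dst;
	const char *src;
	size_t len;
};

static void *copy_worker(void *arg)
{
	struct copy_chunk *c = arg;

	/* each worker copies one slice of the page */
	memcpy(c->dst, c->src, c->len);
	return NULL;
}

int main(void)
{
	pthread_t threads[NR_COPY_THREADS];
	struct copy_chunk chunks[NR_COPY_THREADS];
	size_t slice = HPAGE_SIZE / NR_COPY_THREADS;
	char *src = aligned_alloc(HPAGE_SIZE, HPAGE_SIZE);
	char *dst = aligned_alloc(HPAGE_SIZE, HPAGE_SIZE);
	int i;

	memset(src, 0xa5, HPAGE_SIZE);

	/* fan the copy out: N slices move in parallel on N CPUs */
	for (i = 0; i < NR_COPY_THREADS; i++) {
		chunks[i] = (struct copy_chunk){
			.dst = dst + i * slice,
			.src = src + i * slice,
			.len = slice,
		};
		pthread_create(&threads[i], NULL, copy_worker, &chunks[i]);
	}
	for (i = 0; i < NR_COPY_THREADS; i++)
		pthread_join(threads[i], NULL);

	return 0;
}

This is also why the flag has to stay opt-in from user space: for lone
4K pages the fan-out overhead costs more than it saves (the normal page
regression in the numbers above), while for bulk 2MB copies it buys the
2-4x gains seen in the huge page numbers.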