From: "Michael S. Tsirkin" <mst@redhat.com>
To: Nitesh Narayan Lal <nitesh@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, pbonzini@redhat.com, lcapitulino@redhat.com,
	pagupta@redhat.com, wei.w.wang@intel.com,
	yang.zhang.wz@gmail.com, riel@surriel.com, david@redhat.com,
	dodgen@google.com, konrad.wilk@oracle.com, dhildenb@redhat.com,
	aarcange@redhat.com, alexander.duyck@gmail.com
Subject: Re: [RFC][Patch v10 0/2] mm: Support for page hinting
Date: Mon, 3 Jun 2019 14:04:40 -0400
Message-ID: <20190603140304-mutt-send-email-mst@kernel.org>
In-Reply-To: <20190603170306.49099-1-nitesh@redhat.com>

On Mon, Jun 03, 2019 at 01:03:04PM -0400, Nitesh Narayan Lal wrote:
> This patch series proposes an efficient mechanism for communicating free memory
> from a guest to its hypervisor. It especially enables guests with no page cache
> (e.g., nvdimm, virtio-pmem) or with small page caches (e.g., ram > disk) to
> rapidly hand back free memory to the hypervisor.
> This approach has a minimal impact on the existing core-mm infrastructure.

Could you help us compare with Alex's series?
What are the main differences?

> Measurement results (measurement details appended to this email):
> * With active page hinting, 3 more guests of 5 GB each could be launched (total
> 5 vs. 2) on a 15 GB (single NUMA) system without swapping.
> * With active page hinting, on a system with 15 GB of (single NUMA) memory and
> 4 GB of swap, the runtime of "memhog 6G" in 3 guests (run sequentially) dropped
> to 37s for the last invocation, compared to 3m35s without page hinting.
> 
> This approach tracks all freed pages of the order MAX_ORDER - 2 in bitmaps.
> A new hook after buddy merging is used to set the bits in the bitmap.
> Currently, the bits are only cleared when pages are hinted, not when pages are
> re-allocated.
> 
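
The tracking scheme quoted above can be modelled roughly as follows. This is
only a sketch of the idea, not the code from the series: the class and method
names are hypothetical, MAX_ORDER = 11 is an assumed value that the cover letter
does not state, and the handling of blocks larger than MAX_ORDER - 2 (one bit
per contained block) is my own assumption. Zone locking is left out.

```python
MAX_ORDER = 11                    # assumption: not stated in the thread
HINT_ORDER = MAX_ORDER - 2        # order tracked by the series
PAGES_PER_BLOCK = 1 << HINT_ORDER


class ZoneBitmap:
    """Hypothetical per-zone bitmap: one bit per HINT_ORDER-sized block."""

    def __init__(self, start_pfn, nr_pages):
        self.start_pfn = start_pfn
        self.nbits = (nr_pages + PAGES_PER_BLOCK - 1) // PAGES_PER_BLOCK
        self.bits = bytearray((self.nbits + 7) // 8)

    def _index(self, pfn):
        return (pfn - self.start_pfn) >> HINT_ORDER

    def on_free(self, pfn, order):
        """Models the hook after buddy merging; returns the number of bits set."""
        if order < HINT_ORDER:
            return 0              # smaller blocks are not tracked
        first = self._index(pfn)
        count = 1 << (order - HINT_ORDER)   # assumption: one bit per contained block
        for i in range(first, first + count):
            self.bits[i >> 3] |= 1 << (i & 7)
        return count

    def test(self, i):
        return bool(self.bits[i >> 3] & (1 << (i & 7)))

    def clear_on_hint(self, i):
        """Bits are cleared only when a block is hinted, not on re-allocation."""
        self.bits[i >> 3] &= ~(1 << (i & 7)) & 0xFF
```

Because bits survive re-allocation, a set bit only means "was free at some
point", so whoever consumes the bitmap has to recheck that the block is still
free.
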
> Bitmaps are stored on a per-zone basis and are protected by the zone lock. A
> workqueue asynchronously processes the bitmaps as soon as a pre-defined memory
> threshold is met, trying to isolate and report pages that are still free.
> 
> The isolated pages are reported via virtio-balloon, which is responsible for
> sending batched pages to the host synchronously. Once the hypervisor has
> processed the hinting request, the isolated pages are returned to the buddy.
> 
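
As I read it, the processing step amounts to something like the sketch below.
The function and callback names are hypothetical stand-ins (they are not from
the patches), the memory-threshold trigger and the zone lock are omitted, and
the limit of 16 is the per-batch limit mentioned in the cover letter.

```python
BATCH_LIMIT = 16  # per-batch limit taken from the cover letter


def process_zone(set_bits, is_still_free, report, batch_limit=BATCH_LIMIT):
    """Scan tracked blocks, isolate the ones still free, report them in batches.

    set_bits:      iterable of block indices whose bit is set in the bitmap
    is_still_free: callback(index) -> bool, rechecks the block (hypothetical)
    report:        callback(list) that hands one batch to the host and
                   returns only after the host has processed it (hypothetical)
    Returns the number of blocks reported.
    """
    batch = []
    reported = 0
    for index in set_bits:
        # Bits are not cleared on re-allocation, so a set bit may be stale.
        if not is_still_free(index):
            continue
        batch.append(index)               # "isolate" the block
        if len(batch) == batch_limit:
            report(batch)                 # synchronous report to the host
            reported += len(batch)
            batch = []                    # blocks go back to the buddy here
    if batch:
        report(batch)
        reported += len(batch)
    return reported
```

For example, with 40 tracked blocks of which every second one is still free,
this produces one batch of 16 and one of 4.
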
> The key changes made in this series compared to v9[1] are:
> * Pages are reported to the hypervisor only in chunks of "MAX_ORDER - 2" so as
> not to break up the THP.
> * Only a set of 16 pages can be isolated and reported to the host at a time, to
> avoid any false OOMs.
> * page_hinting.c is moved under mm/ from virt/kvm/ as the feature is dependent
> on virtio and not on KVM itself. This would enable any other hypervisor to use
> this feature by implementing virtio devices.
> * The sysctl variable is replaced with a virtio-balloon parameter to
> enable/disable page-hinting.
> 
> Pending items:
> * Test device assigned guests to ensure that hinting doesn't break it.
> * Follow up on VIRTIO_BALLOON_F_PAGE_POISON's device side support.
> * Compare reporting free pages via vring with vhost.
> * Decide between MADV_DONTNEED and MADV_FREE.
> * Look into memory hotplug, more efficient locking, possible races when
> disabling.
> * Come up with proper/traceable error-message/logs.
> * Minor reworks and simplifications (e.g., virtio protocol).
> 
> Benefit analysis:
> 1. Use-case - Number of guests that can be launched without swap usage
> NUMA Nodes = 1 with 15 GB memory
> Guest Memory = 5 GB
> Number of cores in guest = 1
> Workload = test allocation program allocates 4GB memory, touches it via memset
> and exits.
> Procedure =
> The first guest is launched and once its console is up, the test allocation
> program is executed with 4 GB memory request (Due to this the guest occupies
> almost 4-5 GB of memory in the host in a system without page hinting). Once
> this program exits, another guest is launched in the host and the same
> process is followed. This is repeated for as long as swap is not used.
> 
> Results:
> Without hinting = 3, swap usage at the end 1.1GB.
> With hinting = 5, swap usage at the end 0.
> 
> 2. Use-case - memhog execution time
> Guest Memory = 6GB
> Number of cores = 4
> NUMA Nodes = 1 with 15 GB memory
> Process: 3 Guests are launched and the ‘memhog 6G’ execution time is monitored
> one after the other in each of them.
> Without Hinting - Guest1:47s, Guest2:53s, Guest3:3m35s, End swap usage: 3.5G
> With Hinting - Guest1:40s, Guest2:44s, Guest3:37s, End swap usage: 0
> 
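
To put the memhog numbers above side by side, here is a small sketch that
converts the quoted runtimes to seconds and computes the per-guest ratio. The
helper name is made up for illustration; the values are the ones quoted above.
The last guest comes out roughly 5.8 times faster with hinting (215s vs. 37s).

```python
def to_seconds(text):
    """Convert a duration such as '47s' or '3m35s' to seconds."""
    minutes, _, seconds = text.rpartition("m")
    return int(minutes or 0) * 60 + int(seconds.rstrip("s"))


# Runtimes as quoted in the cover letter, Guest1..Guest3.
without_hinting = ["47s", "53s", "3m35s"]
with_hinting = ["40s", "44s", "37s"]

speedups = [
    to_seconds(slow) / to_seconds(fast)
    for slow, fast in zip(without_hinting, with_hinting)
]

for guest, ratio in enumerate(speedups, start=1):
    print(f"Guest{guest}: {ratio:.2f}x faster with hinting")
```
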
> Performance analysis:
> 1. will-it-scale's page_fault1:
> Guest Memory = 6GB
> Number of cores = 24
> 
> Without Hinting:
> tasks,processes,processes_idle,threads,threads_idle,linear
> 0,0,100,0,100,0
> 1,315890,95.82,317633,95.83,317633
> 2,570810,91.67,531147,91.94,635266
> 3,826491,87.54,713545,88.53,952899
> 4,1087434,83.40,901215,85.30,1270532
> 5,1277137,79.26,916442,83.74,1588165
> 6,1503611,75.12,1113832,79.89,1905798
> 7,1683750,70.99,1140629,78.33,2223431
> 8,1893105,66.85,1157028,77.40,2541064
> 9,2046516,62.50,1179445,76.48,2858697
> 10,2291171,58.57,1209247,74.99,3176330
> 11,2486198,54.47,1217265,75.13,3493963
> 12,2656533,50.36,1193392,74.42,3811596
> 13,2747951,46.21,1185540,73.45,4129229
> 14,2965757,42.09,1161862,72.20,4446862
> 15,3049128,37.97,1185923,72.12,4764495
> 16,3150692,33.83,1163789,70.70,5082128
> 17,3206023,29.70,1174217,70.11,5399761
> 18,3211380,25.62,1179660,69.40,5717394
> 19,3202031,21.44,1181259,67.28,6035027
> 20,3218245,17.35,1196367,66.75,6352660
> 21,3228576,13.26,1129561,66.74,6670293
> 22,3207452,9.15,1166517,66.47,6987926
> 23,3153800,5.09,1172877,61.57,7305559
> 24,3184542,0.99,1186244,58.36,7623192
> 
> With Hinting:
> tasks,processes,processes_idle,threads,threads_idle,linear
> 0,0,100,0,100,0
> 1,306737,95.82,305130,95.78,306737
> 2,573207,91.68,530453,91.92,613474
> 3,810319,87.53,695281,88.58,920211
> 4,1074116,83.40,880602,85.48,1226948
> 5,1308283,79.26,1109257,81.23,1533685
> 6,1501987,75.12,1093661,80.19,1840422
> 7,1695300,70.99,1104207,79.03,2147159
> 8,1901523,66.85,1193613,76.90,2453896
> 9,2051288,62.73,1200913,76.22,2760633
> 10,2275771,58.60,1192992,75.66,3067370
> 11,2435016,54.48,1191472,74.66,3374107
> 12,2623114,50.35,1196911,74.02,3680844
> 13,2766071,46.22,1178589,73.02,3987581
> 14,2932163,42.10,1166414,72.96,4294318
> 15,3000853,37.96,1177177,72.62,4601055
> 16,3113738,33.85,1165444,70.54,4907792
> 17,3132135,29.77,1165055,68.51,5214529
> 18,3175121,25.69,1166969,69.27,5521266
> 19,3205490,21.61,1159310,65.65,5828003
> 20,3220855,17.52,1171827,62.04,6134740
> 21,3182568,13.48,1138918,65.05,6441477
> 22,3130543,9.30,1128185,60.60,6748214
> 23,3087426,5.15,1127912,55.36,7054951
> 24,3099457,1.04,1176100,54.96,7361688
> 
> [1] https://lkml.org/lkml/2019/3/6/413
> 
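
For reading the two will-it-scale tables against each other, a sketch like the
following computes the relative change per column. Only three rows (1, 12 and
24 tasks) are copied from the tables above to keep it short; the function names
are hypothetical. At 24 tasks the "processes" column is about 2.7% lower with
hinting.

```python
import csv
import io

HEADER = "tasks,processes,processes_idle,threads,threads_idle,linear"

# Rows copied from the quoted tables (1, 12 and 24 tasks only).
WITHOUT = """1,315890,95.82,317633,95.83,317633
12,2656533,50.36,1193392,74.42,3811596
24,3184542,0.99,1186244,58.36,7623192"""

WITH = """1,306737,95.82,305130,95.78,306737
12,2623114,50.35,1196911,74.02,3680844
24,3099457,1.04,1176100,54.96,7361688"""


def load(rows):
    """Parse CSV rows into a dict keyed by task count."""
    reader = csv.DictReader(io.StringIO(HEADER + "\n" + rows))
    return {int(row["tasks"]): row for row in reader}


def percent_change(column):
    """Relative change (in percent) with hinting vs. without, per task count."""
    base, hinted = load(WITHOUT), load(WITH)
    return {
        tasks: 100.0 * (int(hinted[tasks][column]) - int(base[tasks][column]))
        / int(base[tasks][column])
        for tasks in base
    }


for tasks, change in percent_change("processes").items():
    print(f"{tasks:2d} tasks: {change:+.2f}% processes")
```
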
