linux-mm.kvack.org archive mirror
* [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
@ 2023-02-20 16:31 Pasha Tatashin
  2023-02-20 23:51 ` Gavin Shan
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Pasha Tatashin @ 2023-02-20 16:31 UTC (permalink / raw)
  To: lsf-pc; +Cc: linux-mm

Hello,

As part of ongoing work at Google to replace some containerized
workloads with virtual machines, I have been working on making memory
translations faster.

I would like to propose the following topic for this year's LSF/MM/BPF:

Discuss a set of techniques that can improve guest performance, reduce
memory footprint overhead, and improve the observability and
manageability of virtual machines by hypervirtualizing the guest
memory to the extreme. The end goal is to allow very lightweight
virtual machines to come closer to containers in performance.

The following items are going to be discussed in this topic:
- Reducing the cost of SLAT page table translations.
- Reducing the memory footprint overhead.
- Reducing the memory management overhead.
- Increasing the observability of guest memory.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-20 16:31 [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough Pasha Tatashin
@ 2023-02-20 23:51 ` Gavin Shan
  2023-02-22 13:43   ` Pasha Tatashin
  2023-02-21  4:38 ` Zhu Yanjun
  2023-02-22 17:08 ` Gupta, Pankaj
  2 siblings, 1 reply; 12+ messages in thread
From: Gavin Shan @ 2023-02-20 23:51 UTC (permalink / raw)
  To: Pasha Tatashin, lsf-pc; +Cc: linux-mm

Hi Pasha,

On 2/21/23 3:31 AM, Pasha Tatashin wrote:
> 
> As a part of an ongoing work of replacing some containerized work load
> with virtual machines within Google, I have worked on making the
> memory translations faster.
> 
> I would like to propose the following topic for this year's LSF/MM/BPF:
> 
> Discuss  a set of techniques that can improve the guest performance,
> memory footprint overhead, observability, and manageability of virtual
> machines by hypervirtualizing the guest memory to the extreme. The end
> goal is to allow very lightweight virtual machines to be closer in
> performance to the containers.
> 
> The following items are going to be discussed in this topic:
> - Reducing the cost of SLAT page table translations.
> - Reducing the memory footprint overhead.
> - Reducing the memory management overhead.
> - Increasing the observability of guest memory.
> 

It's all about understanding the problem and the possible solutions or
directions.

I googled 'SLAT' and it pointed me to x86's EPT. ARM64 has a similar
thing called the stage-2 page table. The usual way to reduce page table
translation cost is to map contiguous memory through PUD/PMD entries.
I'm not sure if there are other solutions we're heading for?

A guest's memory is usually backed by a virtual memory area (VMA),
which is either an anonymous or a hugetlb region. As I understand it,
excessive page fault handling is needed to populate the requested
memory. I'm not sure if reducing the memory management overhead means
making that faster, or something else? :)

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-20 16:31 [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough Pasha Tatashin
  2023-02-20 23:51 ` Gavin Shan
@ 2023-02-21  4:38 ` Zhu Yanjun
  2023-02-22 13:44   ` Pasha Tatashin
  2023-02-22 17:08 ` Gupta, Pankaj
  2 siblings, 1 reply; 12+ messages in thread
From: Zhu Yanjun @ 2023-02-21  4:38 UTC (permalink / raw)
  To: Pasha Tatashin; +Cc: lsf-pc, linux-mm

On Tue, Feb 21, 2023 at 12:32 AM Pasha Tatashin
<pasha.tatashin@soleen.com> wrote:
>
> Hello,
>
> As a part of an ongoing work of replacing some containerized work load
> with virtual machines within Google, I have worked on making the
> memory translations faster.
>
> I would like to propose the following topic for this year's LSF/MM/BPF:
>
> Discuss  a set of techniques that can improve the guest performance,
> memory footprint overhead, observability, and manageability of virtual
> machines by hypervirtualizing the guest memory to the extreme. The end
> goal is to allow very lightweight virtual machines to be closer in
> performance to the containers.
>
> The following items are going to be discussed in this topic:
> - Reducing the cost of SLAT page table translations.

Intel's implementation of SLAT, known as Extended Page Tables (EPT),
was introduced in the Nehalem microarchitecture found in certain Core
i7, Core i5, and Core i3 processors.
ARM's virtualization extensions support SLAT in the form of stage-2
page tables, provided by a stage-2 MMU; the guest uses the stage-1 MMU.
Support was added as optional in the ARMv7ve architecture and is also
present in the ARMv8 (32-bit and 64-bit) architectures.
I am interested in this. I hope we can find a better solution to reduce
the cost of SLAT.

> - Reducing the memory footprint overhead.
> - Reducing the memory management overhead.
> - Increasing the observability of guest memory.
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-20 23:51 ` Gavin Shan
@ 2023-02-22 13:43   ` Pasha Tatashin
  2023-02-22 15:31     ` Zi Yan
  0 siblings, 1 reply; 12+ messages in thread
From: Pasha Tatashin @ 2023-02-22 13:43 UTC (permalink / raw)
  To: Gavin Shan; +Cc: lsf-pc, linux-mm

On Mon, Feb 20, 2023 at 6:51 PM Gavin Shan <gshan@redhat.com> wrote:
>
> Hi Pasha,
>
> On 2/21/23 3:31 AM, Pasha Tatashin wrote:
> >
> > As a part of an ongoing work of replacing some containerized work load
> > with virtual machines within Google, I have worked on making the
> > memory translations faster.
> >
> > I would like to propose the following topic for this year's LSF/MM/BPF:
> >
> > Discuss  a set of techniques that can improve the guest performance,
> > memory footprint overhead, observability, and manageability of virtual
> > machines by hypervirtualizing the guest memory to the extreme. The end
> > goal is to allow very lightweight virtual machines to be closer in
> > performance to the containers.
> >
> > The following items are going to be discussed in this topic:
> > - Reducing the cost of SLAT page table translations.
> > - Reducing the memory footprint overhead.
> > - Reducing the memory management overhead.
> > - Increasing the observability of guest memory.
> >
>
> It's all about to understand the problem and possible solution or directions.
>
> I googled for 'SLAT' and direct me to x86's EPT. ARM64 has similar thing called
> stage-2 page table. The usual way to reduce page table translation cost is to map
> the contiguous memory through PUD/PMD. I'm not sure if there are other solutions
> we're heading for?
>
> Guest's memory is usually backed up by virtual memory area (VMA), which is either
> a anonymous or hugetlb region. As I understand, the page fault handling is excessive
> to populate the requested memory. I'm not sure if reducing the memory management
> overhead is to get it faster, or something else? :)

Hi Gavin,

In a non-virtualized environment, converting a VA to a PA loads each
level of the page table, so translating to a 4K page takes 4 or 5
loads, depending on the page table format used. However, in a
virtualized environment, the number of loads needed to convert a guest
VA to a host PA is not the sum of the SLAT and guest page table levels;
rather, with n guest levels and m SLAT levels, it is n*m + n + m. This
is because every guest page table level is itself a guest PA that must
also be translated to a host PA.
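
To make the arithmetic concrete: with 4-level guest page tables (n = 4)
and 4-level EPT/stage-2 tables (m = 4), a fully missed translation takes
4*4 + 4 + 4 = 24 loads, versus 4 for a native walk. A rough sketch of
that accounting (illustrative only, not existing kernel code):

static unsigned int nested_walk_loads(unsigned int n, unsigned int m)
{
	/*
	 * n = guest page table levels, m = SLAT (EPT/stage-2) levels.
	 * Each of the n guest levels holds a guest PA that needs an
	 * m-level SLAT walk, the final guest PA needs one more m-level
	 * walk, and the n guest-level loads themselves are added on top.
	 */
	return n * m + n + m;
}

/* nested_walk_loads(4, 4) == 24, vs. 4 loads for a native 4-level walk */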

One way to minimize the number of loads is for the guest to use huge
pages, for example, 1-Gbyte pages. However, this normally wastes a lot
of memory. The idea is that we can use guest physical memory in a
virtual way: create 1-Gbyte pages that are only partially backed by
host memory, yet improve the access performance due to fewer TLB
misses and faster translations through guest + SLAT page tables. I
would like to discuss how this can be achieved.
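
For illustration, a minimal host-side sketch of the "partially backed"
idea, assuming the VMM simply reserves the guest RAM region without
committing host memory and lets demand paging populate only what the
guest actually touches (the helper and its name are made up for this
example):

#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical VMM helper: reserve guest RAM without committing host
 * memory. MAP_NORESERVE avoids up-front commitment; host pages (4K, or
 * 2M via THP) are faulted in only as the guest touches them, while the
 * guest itself can still map the same range with 1G pages.
 */
static void *reserve_guest_ram(size_t size)
{
	return mmap(NULL, size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
}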

Thanks,
Pasha

>
> Thanks,
> Gavin
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-21  4:38 ` Zhu Yanjun
@ 2023-02-22 13:44   ` Pasha Tatashin
  0 siblings, 0 replies; 12+ messages in thread
From: Pasha Tatashin @ 2023-02-22 13:44 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: lsf-pc, linux-mm

On Mon, Feb 20, 2023 at 11:38 PM Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
>
> On Tue, Feb 21, 2023 at 12:32 AM Pasha Tatashin
> <pasha.tatashin@soleen.com> wrote:
> >
> > Hello,
> >
> > As a part of an ongoing work of replacing some containerized work load
> > with virtual machines within Google, I have worked on making the
> > memory translations faster.
> >
> > I would like to propose the following topic for this year's LSF/MM/BPF:
> >
> > Discuss  a set of techniques that can improve the guest performance,
> > memory footprint overhead, observability, and manageability of virtual
> > machines by hypervirtualizing the guest memory to the extreme. The end
> > goal is to allow very lightweight virtual machines to be closer in
> > performance to the containers.
> >
> > The following items are going to be discussed in this topic:
> > - Reducing the cost of SLAT page table translations.
>
> Intel's implementation of SLAT, known as Extended Page Table (EPT),
> was introduced in the Nehalem microarchitecture found in certain Core
> i7, Core i5, and Core i3 processors.
> ARM's virtualization extensions support SLAT, known as Stage-2
> page-tables provided by a Stage-2 MMU. The guest uses the Stage-1 MMU.
> Support was added as optional in the ARMv7ve architecture and is also
> supported in the ARMv8 (32-bit and 64-bit) architectures.
> I am interested in this. Hope we have a better solution to reduce the
> cost of SLAT.

Hi Zhu,

Please take a look at my previous reply to Gavin Shan where I clarify
the SLAT performance improvements.

Thanks,
Pasha

>
> > - Reducing the memory footprint overhead.
> > - Reducing the memory management overhead.
> > - Increasing the observability of guest memory.
> >


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-22 13:43   ` Pasha Tatashin
@ 2023-02-22 15:31     ` Zi Yan
  2023-02-22 15:43       ` Pasha Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Zi Yan @ 2023-02-22 15:31 UTC (permalink / raw)
  To: Pasha Tatashin; +Cc: Gavin Shan, lsf-pc, linux-mm

[-- Attachment #1: Type: text/plain, Size: 3201 bytes --]

On 22 Feb 2023, at 8:43, Pasha Tatashin wrote:

> On Mon, Feb 20, 2023 at 6:51 PM Gavin Shan <gshan@redhat.com> wrote:
>>
>> Hi Pasha,
>>
>> On 2/21/23 3:31 AM, Pasha Tatashin wrote:
>>>
>>> As a part of an ongoing work of replacing some containerized work load
>>> with virtual machines within Google, I have worked on making the
>>> memory translations faster.
>>>
>>> I would like to propose the following topic for this year's LSF/MM/BPF:
>>>
>>> Discuss  a set of techniques that can improve the guest performance,
>>> memory footprint overhead, observability, and manageability of virtual
>>> machines by hypervirtualizing the guest memory to the extreme. The end
>>> goal is to allow very lightweight virtual machines to be closer in
>>> performance to the containers.
>>>
>>> The following items are going to be discussed in this topic:
>>> - Reducing the cost of SLAT page table translations.
>>> - Reducing the memory footprint overhead.
>>> - Reducing the memory management overhead.
>>> - Increasing the observability of guest memory.
>>>
>>
>> It's all about to understand the problem and possible solution or directions.
>>
>> I googled for 'SLAT' and direct me to x86's EPT. ARM64 has similar thing called
>> stage-2 page table. The usual way to reduce page table translation cost is to map
>> the contiguous memory through PUD/PMD. I'm not sure if there are other solutions
>> we're heading for?
>>
>> Guest's memory is usually backed up by virtual memory area (VMA), which is either
>> a anonymous or hugetlb region. As I understand, the page fault handling is excessive
>> to populate the requested memory. I'm not sure if reducing the memory management
>> overhead is to get it faster, or something else? :)
>
> Hi Gavin,
>
> In a non-virtualized environment, when converting VA to PA, we load
> each level of page table, so converting to a 4K page takes 4 or 5
> loads, depending on the page table type used. However, in a
> virtualized environment, the number of loads to convert guest VA to
> host PA is not a summation of SLAT page table levels and Guest page
> table levels; rather, it is equal to: n*m + n + m. This is because
> each guest's page table level must also be converted from guest PA to
> host PA.
>
> One way to minimize the number of loads is for the guest to use huge
> pages, for example, 1-Gbyte pages. However, this normally wastes a lot
> of memory. The idea is that we can use guest physical memory in a
> virtual way: create 1-Gbyte pages that are only partially backed by
> host memory, yet improve the access performance due to fewer TLB
> misses and faster translations through guest + SLAT page tables. I
> would like to discuss how this can be achieved.

Do you mean allocating 1GB pages in the guest and backing them with
2MB and/or 4KB pages in the host? From my understanding, for virtual
machines the TLB caches guest VA to host PA translations, so the number
of TLB entries would be the same as when using 2MB or 4KB pages in the
guest (as long as the guest page and the host page backing it have the
same size). What am I missing here?

For a TLB miss, it will be faster, since fewer page table walk steps
are needed with 1GB pages in the guest.


--
Best Regards,
Yan, Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 854 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-22 15:31     ` Zi Yan
@ 2023-02-22 15:43       ` Pasha Tatashin
  0 siblings, 0 replies; 12+ messages in thread
From: Pasha Tatashin @ 2023-02-22 15:43 UTC (permalink / raw)
  To: Zi Yan; +Cc: Gavin Shan, lsf-pc, linux-mm

On Wed, Feb 22, 2023 at 10:31 AM Zi Yan <ziy@nvidia.com> wrote:
>
> On 22 Feb 2023, at 8:43, Pasha Tatashin wrote:
>
> > On Mon, Feb 20, 2023 at 6:51 PM Gavin Shan <gshan@redhat.com> wrote:
> >>
> >> Hi Pasha,
> >>
> >> On 2/21/23 3:31 AM, Pasha Tatashin wrote:
> >>>
> >>> As a part of an ongoing work of replacing some containerized work load
> >>> with virtual machines within Google, I have worked on making the
> >>> memory translations faster.
> >>>
> >>> I would like to propose the following topic for this year's LSF/MM/BPF:
> >>>
> >>> Discuss  a set of techniques that can improve the guest performance,
> >>> memory footprint overhead, observability, and manageability of virtual
> >>> machines by hypervirtualizing the guest memory to the extreme. The end
> >>> goal is to allow very lightweight virtual machines to be closer in
> >>> performance to the containers.
> >>>
> >>> The following items are going to be discussed in this topic:
> >>> - Reducing the cost of SLAT page table translations.
> >>> - Reducing the memory footprint overhead.
> >>> - Reducing the memory management overhead.
> >>> - Increasing the observability of guest memory.
> >>>
> >>
> >> It's all about to understand the problem and possible solution or directions.
> >>
> >> I googled for 'SLAT' and direct me to x86's EPT. ARM64 has similar thing called
> >> stage-2 page table. The usual way to reduce page table translation cost is to map
> >> the contiguous memory through PUD/PMD. I'm not sure if there are other solutions
> >> we're heading for?
> >>
> >> Guest's memory is usually backed up by virtual memory area (VMA), which is either
> >> a anonymous or hugetlb region. As I understand, the page fault handling is excessive
> >> to populate the requested memory. I'm not sure if reducing the memory management
> >> overhead is to get it faster, or something else? :)
> >
> > Hi Gavin,
> >
> > In a non-virtualized environment, when converting VA to PA, we load
> > each level of page table, so converting to a 4K page takes 4 or 5
> > loads, depending on the page table type used. However, in a
> > virtualized environment, the number of loads to convert guest VA to
> > host PA is not a summation of SLAT page table levels and Guest page
> > table levels; rather, it is equal to: n*m + n + m. This is because
> > each guest's page table level must also be converted from guest PA to
> > host PA.
> >
> > One way to minimize the number of loads is for the guest to use huge
> > pages, for example, 1-Gbyte pages. However, this normally wastes a lot
> > of memory. The idea is that we can use guest physical memory in a
> > virtual way: create 1-Gbyte pages that are only partially backed by
> > host memory, yet improve the access performance due to fewer TLB
> > misses and faster translations through guest + SLAT page tables. I
> > would like to discuss how this can be achieved.
>
> Do you mean allocating 1GB pages in the guest and backing them using
> 2MB and/or 4KB pages in the host? From my understanding, for virtual

Yes, that is exactly right. However, only a subset of the 1G page is
backed: the whole page is not zeroed on allocation or on the first
fault in the guest; instead, the host faults memory in on demand as new
parts of the 1G page are touched.

> machines, TLB caches guestVA to hostPA, so the number of TLB entries
> would be the same as using 2MB or 4KB pages in the guest (as long as
> the guest page and the host page backing it have the same size).
> What am I missing here?

Yes, the way the TLB works, the smaller of the host and guest page
sizes determines the TLB entry size. So 1G guest pages backed by 2M
host pages yield 2M TLB entries, and 1G guest pages backed by 4K host
pages yield 4K TLB entries. The saving comes from always having 1G
pages in the guest, so that when the host backs them with 2M pages, 2M
TLB entries are used.
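
Stated as a rule (a trivial sketch, purely illustrative; not from any
existing implementation): the effective TLB entry size for a nested
mapping is the minimum of the guest and host page sizes covering it.

/* Effective TLB entry size for a nested mapping. */
static unsigned long effective_tlb_page_size(unsigned long guest_page_size,
					     unsigned long host_page_size)
{
	return guest_page_size < host_page_size ? guest_page_size
						: host_page_size;
}

/* 1G guest page on 2M host pages -> 2M entries; on 4K host pages -> 4K. */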

>
> For a TLB miss, it will be faster since fewer page table walks are
> needed for 1GB pages in the guest.

That is exactly right; the faster page table walk, i.e. the faster
SLAT translation, is what this approach achieves.

Thanks,
Pasha


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-20 16:31 [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough Pasha Tatashin
  2023-02-20 23:51 ` Gavin Shan
  2023-02-21  4:38 ` Zhu Yanjun
@ 2023-02-22 17:08 ` Gupta, Pankaj
  2023-02-22 18:18   ` Pasha Tatashin
  2 siblings, 1 reply; 12+ messages in thread
From: Gupta, Pankaj @ 2023-02-22 17:08 UTC (permalink / raw)
  To: Pasha Tatashin, lsf-pc; +Cc: linux-mm


> Hello,
> 
> As a part of an ongoing work of replacing some containerized work load
> with virtual machines within Google, I have worked on making the
> memory translations faster.
> 
> I would like to propose the following topic for this year's LSF/MM/BPF:
> 
> Discuss  a set of techniques that can improve the guest performance,
> memory footprint overhead, observability, and manageability of virtual
> machines by hypervirtualizing the guest memory to the extreme. The end
> goal is to allow very lightweight virtual machines to be closer in
> performance to the containers.
> 
> The following items are going to be discussed in this topic:
> - Reducing the cost of SLAT page table translations.
> - Reducing the memory footprint overhead.

Coming from a virtio-pmem and free page hinting background, I am
interested in this discussion. I saw your proposal about a single owner
memory driver in another thread and could not entirely connect the dots
between that idea and "reducing the memory footprint overhead for
virtual machines". Do we plan to coordinate guest memory state with the
corresponding host state for more efficient memory reclaim decisions?
Or are we targeting something entirely different here?

Thanks,
Pankaj

> - Reducing the memory management overhead.
> - Increasing the observability of guest memory.
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-22 17:08 ` Gupta, Pankaj
@ 2023-02-22 18:18   ` Pasha Tatashin
  2023-02-22 20:27     ` Gupta, Pankaj
  0 siblings, 1 reply; 12+ messages in thread
From: Pasha Tatashin @ 2023-02-22 18:18 UTC (permalink / raw)
  To: Gupta, Pankaj; +Cc: lsf-pc, linux-mm

> Coming from the virtio-pmem and some free page hinting background, I am
> interested in this discussion. I saw your proposal about single owner
> memory driver in other thread and could not entirely link the dots about
> applicability of the idea with "reducing the memory footprint overhead
> for virtual machines". Do we plan to co-ordinate guest memory state with
> corresponding host state for efficient memory reclaim decisions?
> Or something entirely different we are targeting here?

Hi Pankaj,

The plan is to have a /dev/memctl driver and a corresponding VMM agent
that synchronously pass information about how the guest would like its
memory to be backed on the host.

For example, the following hints can come from the guest for a range
of guest physical addresses:
MADV_NOHUGEPAGE
MADV_HUGEPAGE
MADV_DONTNEED
PR_SET_VMA_ANON_NAME
etc.

Altogether, this should help by performing memory management
operations only on the host side, reducing the number of operations
performed in the guest.
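
To make the shape of such an interface concrete, a purely hypothetical
sketch; the structure, field names, and ioctl number below are
illustrative assumptions, not an existing or proposed ABI:

#include <linux/ioctl.h>
#include <linux/types.h>

/* Hypothetical /dev/memctl request; not an existing or proposed ABI. */
struct memctl_hint {
	__u64 gpa;	/* start of the guest-physical range */
	__u64 size;	/* length of the range in bytes */
	__u32 advice;	/* e.g. MADV_HUGEPAGE, MADV_DONTNEED, ... */
	__u32 flags;	/* reserved, must be zero */
};

/*
 * Hypothetical ioctl: the guest driver forwards the hint to the VMM,
 * which applies the equivalent madvise()/prctl() to the host mapping
 * that backs the given guest-physical range.
 */
#define MEMCTL_IOC_HINT	_IOW('M', 0x01, struct memctl_hint)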

/dev/som can help support anonymous memory in the guest with 1G pages
that are only partially backed on the host side, thus yielding faster
guest VA to host PA translations.

Pasha


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-22 18:18   ` Pasha Tatashin
@ 2023-02-22 20:27     ` Gupta, Pankaj
  2023-02-22 20:56       ` Pasha Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Gupta, Pankaj @ 2023-02-22 20:27 UTC (permalink / raw)
  To: Pasha Tatashin; +Cc: lsf-pc, linux-mm

On 2/22/2023 7:18 PM, Pasha Tatashin wrote:

Hi Pasha,

>> Coming from the virtio-pmem and some free page hinting background, I am
>> interested in this discussion. I saw your proposal about single owner
>> memory driver in other thread and could not entirely link the dots about
>> applicability of the idea with "reducing the memory footprint overhead
>> for virtual machines". Do we plan to co-ordinate guest memory state with
>> corresponding host state for efficient memory reclaim decisions?
>> Or something entirely different we are targeting here?
> 
> Hi Pankaj,
> 
> The plan is to have a driver /dev/memctl and corresponding VMM agent
> that synchronously passes information about how guest would like its
> memory to be backed on the host.
> 
> For example the following information can come from guest for a range
> of physical addresses:
> MADV_NOHUGEPAGE
> MADV_HUGEPAGE
> MADV_DONTNEED
> PR_SET_VMA_ANON_NAME
> etc.
> 
> All together this should help by doing memory management operations
> only on the host side, and reduce the number of operations that are
> performed on the guest.

OK. That sounds like the guest will have a *special* interface
(paravirt?) for some of the memory management operations, coordinated
with the host.

The guest would still allow other regular memory operations, which
would get fulfilled by the guest itself? Just wondering whether this
solution would only be useful for specific workloads that are aware of
the known MADV calls, and that might not do/require continuous
allocation/deallocation of memory?

Thanks,
Pankaj
> 
> The /dev/som can help with allowing support for anonymous memory in
> the guest with 1G pages that are only partially backed on the host
> side, thus yielding to faster guestVA hostPA translations.
> 
> Pasha



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-22 20:27     ` Gupta, Pankaj
@ 2023-02-22 20:56       ` Pasha Tatashin
  2023-02-23  9:11         ` Gupta, Pankaj
  0 siblings, 1 reply; 12+ messages in thread
From: Pasha Tatashin @ 2023-02-22 20:56 UTC (permalink / raw)
  To: Gupta, Pankaj; +Cc: lsf-pc, linux-mm

On Wed, Feb 22, 2023 at 3:27 PM Gupta, Pankaj <pankaj.gupta@amd.com> wrote:
>
> On 2/22/2023 7:18 PM, Pasha Tatashin wrote:
>
> Hi Pasha,
>
> >> Coming from the virtio-pmem and some free page hinting background, I am
> >> interested in this discussion. I saw your proposal about single owner
> >> memory driver in other thread and could not entirely link the dots about
> >> applicability of the idea with "reducing the memory footprint overhead
> >> for virtual machines". Do we plan to co-ordinate guest memory state with
> >> corresponding host state for efficient memory reclaim decisions?
> >> Or something entirely different we are targeting here?
> >
> > Hi Pankaj,
> >
> > The plan is to have a driver /dev/memctl and corresponding VMM agent
> > that synchronously passes information about how guest would like its
> > memory to be backed on the host.
> >
> > For example the following information can come from guest for a range
> > of physical addresses:
> > MADV_NOHUGEPAGE
> > MADV_HUGEPAGE
> > MADV_DONTNEED
> > PR_SET_VMA_ANON_NAME
> > etc.
> >
> > All together this should help by doing memory management operations
> > only on the host side, and reduce the number of operations that are
> > performed on the guest.
>
> o.k. That sounds like guest will have a *special* interface (paravirt?)
> for some of the memory management operations with the coordination of host.

That is correct, hence memory passthrough.

>
> Guest would still allow other regular memory operations? which would get
> full-filled by the guest? Just wondering if this solution only be
> useful for specific workloads which are aware of known MADV calls?

Depending on the flexibility of the interface, we are currently
working on supporting tcmalloc() and mmap(MAP_ANONYMOUS), but in the
future this can be extended to more types of memory.

> And might not do/require continuous allocation/deallocation of memory?

Contiguous memory allocation on the host is not required.

Thanks,
Pasha


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough
  2023-02-22 20:56       ` Pasha Tatashin
@ 2023-02-23  9:11         ` Gupta, Pankaj
  0 siblings, 0 replies; 12+ messages in thread
From: Gupta, Pankaj @ 2023-02-23  9:11 UTC (permalink / raw)
  To: Pasha Tatashin; +Cc: lsf-pc, linux-mm, David Hildenbrand


>> Hi Pasha,
>>
>>>> Coming from the virtio-pmem and some free page hinting background, I am
>>>> interested in this discussion. I saw your proposal about single owner
>>>> memory driver in other thread and could not entirely link the dots about
>>>> applicability of the idea with "reducing the memory footprint overhead
>>>> for virtual machines". Do we plan to co-ordinate guest memory state with
>>>> corresponding host state for efficient memory reclaim decisions?
>>>> Or something entirely different we are targeting here?
>>>
>>> Hi Pankaj,
>>>
>>> The plan is to have a driver /dev/memctl and corresponding VMM agent
>>> that synchronously passes information about how guest would like its
>>> memory to be backed on the host.
>>>
>>> For example the following information can come from guest for a range
>>> of physical addresses:
>>> MADV_NOHUGEPAGE
>>> MADV_HUGEPAGE
>>> MADV_DONTNEED
>>> PR_SET_VMA_ANON_NAME
>>> etc.
>>>
>>> All together this should help by doing memory management operations
>>> only on the host side, and reduce the number of operations that are
>>> performed on the guest.
>>
>> o.k. That sounds like guest will have a *special* interface (paravirt?)
>> for some of the memory management operations with the coordination of host.
> 
> That is correct, hence memory passthrough.

ya.

> 
>>
>> Guest would still allow other regular memory operations? which would get
>> full-filled by the guest? Just wondering if this solution only be
>> useful for specific workloads which are aware of known MADV calls?
> 
> Depending on the flexibility of the interface, we are currently
> working supporting tcmalloc(), and also mmap(MAP_ANONYMOUS), but in
> the future can be extended to more types of memory.

Not sure whether it's worth extending the existing paravirt memory
management interfaces like virtio-mem or virtio-balloon, or creating a
new driver altogether? Adding David (in Cc) for his thoughts.

Thanks,
Pankaj

> 
>> And might not do/require continuous allocation/deallocation of memory?
> 
> Contiguous memory allocation on the host is not required.
> 
> Thanks,
> Pasha



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-02-23  9:12 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2023-02-20 16:31 [LSF/MM/BPF TOPIC] Virtual Machine Memory Passthrough Pasha Tatashin
2023-02-20 23:51 ` Gavin Shan
2023-02-22 13:43   ` Pasha Tatashin
2023-02-22 15:31     ` Zi Yan
2023-02-22 15:43       ` Pasha Tatashin
2023-02-21  4:38 ` Zhu Yanjun
2023-02-22 13:44   ` Pasha Tatashin
2023-02-22 17:08 ` Gupta, Pankaj
2023-02-22 18:18   ` Pasha Tatashin
2023-02-22 20:27     ` Gupta, Pankaj
2023-02-22 20:56       ` Pasha Tatashin
2023-02-23  9:11         ` Gupta, Pankaj

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).