From: Anthony Liguori <anthony@codemonkey.ws>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: npiggin@suse.de, akpm@osdl.org, jeremy@goop.org,
xen-devel@lists.xensource.com, tmem-devel@oss.oracle.com,
kurt.hackel@oracle.com, Rusty Russell <rusty@rustcorp.com.au>,
linux-kernel@vger.kernel.org, dave.mccracken@oracle.com,
linux-mm@kvack.org, chris.mason@oracle.com,
sunil.mushran@oracle.com, Avi Kivity <avi@redhat.com>,
Schwidefsky <schwidefsky@de.ibm.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
alan@lxorguk.ukuu.org.uk,
Balbir Singh <balbir@linux.vnet.ibm.com>
Subject: Re: [Xen-devel] Re: [RFC PATCH 0/4] (Take 2): transcendent memory ("tmem") for Linux
Date: Wed, 08 Jul 2009 18:57:38 -0500
Message-ID: <4A553272.5050909@codemonkey.ws>
In-Reply-To: <ac5dec0d-e593-4a82-8c9d-8aa374e8c6ed@default>
Dan Magenheimer wrote:
> Hi Anthony --
>
> Thanks for the comments.
>
>
>> I have trouble mapping this to a VMM capable of overcommit
>> without just coming back to CMM2.
>>
>> In CMM2 parlance, ephemeral tmem pools is just normal kernel memory
>> marked in the volatile state, no?
>>
>
> They are similar in concept, but a volatile-marked kernel page
> is still a kernel page, can be changed by a kernel (or user)
> store instruction, and counts as part of the memory used
> by the VM. An ephemeral tmem page cannot be directly written
> by a kernel (or user) store,
Why does tmem require a special store?

A VMM can trap write operations, and pages can be stored on disk
transparently by the VMM if necessary. I guess that's the bit I'm missing.
>> It seems to me that an architecture built around hinting
>> would be more
>> robust than having to use separate memory pools for this type
>> of memory
>> (especially since you are requiring a copy to/from the pool).
>>
>
> Depends on what you mean by robust, I suppose. Once you
> understand the basics of tmem, it is very simple and this
> is borne out in the low invasiveness of the Linux patch.
> Simplicity is another form of robustness.
>
The main disadvantage I see is that you need to explicitly convert
portions of the kernel to use a data-copying API. That seems like an
invasive change to me. Hinting, on the other hand, can be done in a
less invasive way.
I'm not really arguing against tmem, just the need to have explicit
get/put mechanisms for the transcendent memory areas.
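
To make the distinction concrete, here is a minimal sketch in C of the two
models being compared. Every name in it (ephemeral_pool, pool_put, pool_get,
pool_reclaim, hinted_page, page_set_hint, host_try_discard) is hypothetical
and is not taken from the tmem patch or from CMM2. It only illustrates that a
copy-based pool pays one page copy per put and per get, while a hinted page
stays where it is and only changes state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_POOL_SLOTS 4

/* Model 1 (hypothetical): copy-based pool. The guest never stores into
 * pool pages directly; data moves only through explicit put/get calls. */
struct pool_slot {
    bool valid;
    unsigned long key;
    unsigned char data[SKETCH_PAGE_SIZE];
};

struct ephemeral_pool {
    struct pool_slot slots[SKETCH_POOL_SLOTS];
};

/* Copy a page into the pool. Reuses the slot holding the same key,
 * otherwise the first free slot. Returns false if the pool is full. */
static bool pool_put(struct ephemeral_pool *p, unsigned long key,
                     const unsigned char *page)
{
    struct pool_slot *target = NULL;

    for (size_t i = 0; i < SKETCH_POOL_SLOTS; i++) {
        struct pool_slot *s = &p->slots[i];
        if (s->valid && s->key == key) {
            target = s;
            break;
        }
        if (!s->valid && target == NULL)
            target = s;
    }
    if (target == NULL)
        return false;

    target->valid = true;
    target->key = key;
    memcpy(target->data, page, SKETCH_PAGE_SIZE);   /* the copy in question */
    return true;
}

/* Copy a page back out. A miss is legal: the pool owner may have
 * dropped the page at any time. */
static bool pool_get(struct ephemeral_pool *p, unsigned long key,
                     unsigned char *page)
{
    for (size_t i = 0; i < SKETCH_POOL_SLOTS; i++) {
        struct pool_slot *s = &p->slots[i];
        if (s->valid && s->key == key) {
            memcpy(page, s->data, SKETCH_PAGE_SIZE); /* second copy */
            return true;
        }
    }
    return false;
}

/* The pool owner reclaims memory by invalidating every slot. */
static void pool_reclaim(struct ephemeral_pool *p)
{
    for (size_t i = 0; i < SKETCH_POOL_SLOTS; i++)
        p->slots[i].valid = false;
}

/* Model 2 (hypothetical): hinting. The page stays ordinary guest memory;
 * the guest only marks it, and the host may discard marked pages. */
enum page_hint { PAGE_STABLE, PAGE_VOLATILE };

struct hinted_page {
    enum page_hint hint;
    bool discarded;
    unsigned char data[SKETCH_PAGE_SIZE];
};

/* Guest side: change the hint; no data is copied. */
static void page_set_hint(struct hinted_page *pg, enum page_hint hint)
{
    pg->hint = hint;
}

/* Host side: only pages marked volatile may be thrown away. */
static bool host_try_discard(struct hinted_page *pg)
{
    if (pg->hint != PAGE_VOLATILE)
        return false;
    pg->discarded = true;
    return true;
}
```

In the first model every caller has to be converted to call put and get and
to handle a miss; in the second the caller keeps using the page in place and
only has to cope with it having been discarded.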
> The copy may be expensive on an older machine, but on newer
> machines copying a page is relatively inexpensive.
I don't think that's a true statement at all :-) If you had a workload
where data never came into the CPU cache (zero-copy) and now you
introduce a copy, even with a new system, you're going to see a
significant performance hit.
> On a reasonable
> multi-VM-kernbench-like benchmark I'll be presenting at Linux
> Symposium next week, the overhead is on the order of 0.01%
> for a fairly significant savings in IOs.
>
But how would something like SPECweb do, where you should be doing
zero-copy IO from the disk to the network? This is the area where I
would be concerned. For something like kernbench, you're already
bringing the disk data into the CPU cache anyway, so I can appreciate
that the copy could get lost in the noise.
Regards,
Anthony Liguori