From: Avi Kivity <avi@redhat.com>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
jeremy@goop.org, hugh.dickins@tiscali.co.uk, ngupta@vflare.org,
JBeulich@novell.com, chris.mason@oracle.com,
kurt.hackel@oracle.com, dave.mccracken@oracle.com,
npiggin@suse.de, akpm@linux-foundation.org, riel@redhat.com
Subject: Re: Frontswap [PATCH 0/4] (was Transcendent Memory): overview
Date: Mon, 26 Apr 2010 09:01:51 +0300
Message-ID: <4BD52C4F.40505@redhat.com>
In-Reply-To: <7264e3c0-15fe-4b70-a3d8-2c36a2b934df@default>

On 04/25/2010 06:29 PM, Dan Magenheimer wrote:
>>> While I admit that I started this whole discussion by implying
>>> that frontswap (and cleancache) might be useful for SSDs, I think
>>> we are going far astray here. Frontswap is synchronous for a
>>> reason: It uses real RAM, but RAM that is not directly addressable
>>> by a (guest) kernel. SSD's (at least today) are still I/O devices;
>>> even though they may be very fast, they still live on a PCI (or
>>> slower) bus and use DMA. Frontswap is not intended for use with
>>> I/O devices.
>>>
>>> Today's memory technologies are either RAM that can be addressed
>>> by the kernel, or I/O devices that sit on an I/O bus. The
>>> exotic memories that I am referring to may be a hybrid:
>>> memory that is fast enough to live on a QPI/hypertransport,
>>> but slow enough that you wouldn't want to randomly mix and
>>> hand out to userland apps some pages from "exotic RAM" and some
>>> pages from "normal RAM". Such memory makes no sense today
>>> because OS's wouldn't know what to do with it. But it MAY
>>> make sense with frontswap (and cleancache).
>>>
>>> Nevertheless, frontswap works great today with a bare-metal
>>> hypervisor. I think it stands on its own merits, regardless
>>> of one's vision of future SSD/memory technologies.
>>>
>> Even when frontswapping to RAM on a bare metal hypervisor it makes sense
>> to use an async API, in case you have a DMA engine on board.
>>
> When pages are 2MB, this may be true. When pages are 4KB and
> copied individually, it may take longer to program a DMA engine
> than to just copy 4KB.
>
Of course, you have to use a batching API, like virtio or Xen's rings,
to avoid the overhead.
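
To make the batching point concrete, here is a minimal sketch of a submission ring in C. It is not virtio's or Xen's actual ring layout; the names (copy_ring, ring_submit, ring_kick) and the slot count are hypothetical, and memcpy stands in for whatever the copy engine would do. The only thing it is meant to show is that the per-notification cost (the "doorbell") is paid once per batch rather than once per 4KB page.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096u
#define RING_SLOTS 64u /* hypothetical size; power of two so wrap is a mask */

/* One queued page copy. */
struct copy_req {
    void *dst;
    const void *src;
};

/* Illustrative ring, not the virtio or Xen layout. */
struct copy_ring {
    struct copy_req slots[RING_SLOTS];
    uint32_t head;      /* next slot the producer fills */
    uint32_t tail;      /* next slot the consumer drains */
    uint32_t doorbells; /* number of times the "engine" was notified */
    uint32_t completed; /* total pages copied so far */
};

static void ring_init(struct copy_ring *r)
{
    memset(r, 0, sizeof(*r));
}

/* Queue one page copy without notifying anyone.
 * Returns 0 on success, -1 if the ring is full. */
static int ring_submit(struct copy_ring *r, void *dst, const void *src)
{
    if (r->head - r->tail == RING_SLOTS)
        return -1;
    r->slots[r->head & (RING_SLOTS - 1)] = (struct copy_req){ dst, src };
    r->head++;
    return 0;
}

/* Notify once, then drain everything queued so far.
 * memcpy is a stand-in for the copy engine (assumption).
 * Returns the number of pages copied by this kick. */
static uint32_t ring_kick(struct copy_ring *r)
{
    uint32_t n = 0;

    if (r->head == r->tail)
        return 0; /* nothing queued, so no doorbell is rung */

    r->doorbells++;
    while (r->tail != r->head) {
        struct copy_req *q = &r->slots[r->tail & (RING_SLOTS - 1)];
        memcpy(q->dst, q->src, PAGE_SIZE_BYTES);
        r->tail++;
        n++;
    }
    r->completed += n;
    return n;
}
```

Queuing eight pages with ring_submit() and then calling ring_kick() once copies all eight while incrementing doorbells only once, which is the amortization being argued for. Whether that wins over a plain synchronous copy still depends on the real setup cost of the engine.
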
> But in any case, frontswap works fine on all existing machines
> today. If/when most commodity CPUs have an asynchronous RAM DMA
> engine, an asynchronous API may be appropriate. Or the existing
> swap API might be appropriate. Or the synchronous frontswap API
> may work fine too. Speculating further about non-existent
> hardware that might exist in the (possibly far) future is irrelevant
> to the proposed patch, which works today on all existing x86 hardware
> and on shipping software.
>
DMA engines are present on commodity hardware now:
http://en.wikipedia.org/wiki/I/O_Acceleration_Technology
I don't know if consumer machines have them, but servers certainly do.
modprobe ioatdma.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.