From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Petros Koutoupis <petros@petroskoutoupis.com>,
Christoph Hellwig <hch@infradead.org>
Cc: linux-kernel@vger.kernel.org,
"devel@rapiddisk.org" <devel@rapiddisk.org>
Subject: Re: [PATCH] Patch to integrate RapidDisk and RapidCache RAM Drive / Caching modules into the kernel
Date: Wed, 30 Sep 2015 11:17:10 -0400
Message-ID: <560BFCF6.9000203@gmail.com>
In-Reply-To: <560BF1DF.3000506@petroskoutoupis.com>
On 2015-09-30 10:29, Petros Koutoupis wrote:
> Christoph and Austin,
>
> You both have provided me with some valuable feedback. I will do what I
> can to clean this patch up and in turn apply the same dynamic
> functionality to the already in-kernel module. Also please see my
> replies below.
>
> On 9/29/15 9:32 AM, Austin S Hemmelgarn wrote:
>> On 2015-09-28 12:45, Petros Koutoupis wrote:
>>> Christoph,
>>>
>>> See my replies below....
>>>
>>> On 9/28/15 11:29 AM, Christoph Hellwig wrote:
>>>> Hi Petros,
>>>>
>>>> On Mon, Sep 28, 2015 at 09:12:13AM -0500, Petros Koutoupis wrote:
>>>>> 1. Unlike the already mainline ramdisk driver, RapidDisk is designed to be
>>>>> managed dynamically. That is, instead of configuring a fixed number of
>>>>> volumes and volume sizes as compile/boot time variables, RapidDisk will
>>>>> allow you to add, remove, and resize your RAM drive(s) at runtime. Besides,
>>>>> the built in module is designed to work with smaller sizes in mind while
>>>>> RapidDisk focuses on larger sizes that can reach to the multiple Gigabytes
>>>>> or even Terabytes. Much like the built in module, it will allocate pages as
>>>>> they are needed which allows for over provisioning (not that it is advised)
>>>>> of volume sizes.
>>>> The ramdisk driver allows selecting sizes and count at module load
>>>> time. I agree that having runtime control would be even better, but
>>>> that's best done by adding a runtime interface to the existing driver
>>>> instead of duplicating it.
>>> I understand the concern and I will definitely scope out this approach,
>>> although at the moment, I am not sure how both approaches will play nice
>>> together. As mentioned above, the current implementation requires the
>>> predefined number of ram drives with the specified size to be configured
>>> at boot time (or compiled into the kernel). The only wiggle room I see
>>> for runtime control is resizing individual volumes.
>> Just because there is not code currently to do dynamic
>> allocation/freeing of ramdisks in the current driver doesn't mean that
>> it isn't possible, it just means that nobody has written code to do it
>> yet. This functionality would be extremely useful (I often use
>> ramdisks on a VM host as a small amount of very fast swap space for
>> the virtual machines). On top of that, the deduplication would be a
>> wonderful feature, although it may already be indirectly implemented
>> through KSM (that is, when KSM is on and configured to scan
>> everything, I'm not sure if it scans memory used by the ramdisks or not).
>>
> To my understanding KSM is only applied to KVM deployments. One way I
> have seen my caching module work is users/vendors have a block device,
> map it to a RapidDisk RAM drive as a RAM based Write-Through caching
> node and in turn export it via a traditional SAN. The idea behind adding
> deduplication to this module is to minimize the RAM drive footprint when
> used as a block level cache.
KSM is usually used in KVM or other userspace VM deployments, but that
is by no means the only use case. I actually use it regularly on most
of my systems, and it does help in some cases. For example, I run a lot
of distributed computing apps, often with multiple instances of the
same app, and those don't always share memory to the degree they should;
KSM helps with this.
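For anyone wanting to check whether KSM is actually merging anything on a
given system: its counters are exposed under /sys/kernel/mm/ksm, and as far
as I know it only scans regions that a process has opted in with
madvise(MADV_MERGEABLE). Below is a minimal sketch in Python that reads
those counters. The function names and the ksm_dir parameter are made up
for illustration; only the sysfs file names come from the kernel's KSM
documentation.

```python
import os

KSM_DIR = "/sys/kernel/mm/ksm"  # standard sysfs location on Linux

# Counter files documented for KSM; others exist but are not needed here.
KSM_FILES = ("run", "pages_shared", "pages_sharing", "pages_unshared")


def read_ksm_stats(ksm_dir=KSM_DIR):
    """Return the integer KSM counters found in ksm_dir as a dict.

    Files that are missing or unreadable (for example on a kernel built
    without KSM) are skipped, so the result may be empty.
    """
    stats = {}
    for name in KSM_FILES:
        path = os.path.join(ksm_dir, name)
        try:
            with open(path) as f:
                stats[name] = int(f.read().strip())
        except (OSError, ValueError):
            continue
    return stats


def pages_saved(stats):
    """pages_sharing counts the extra mappings that point at an already
    shared page, so it approximates how many pages KSM has saved."""
    return stats.get("pages_sharing", 0)


if __name__ == "__main__":
    stats = read_ksm_stats()
    if not stats:
        print("KSM does not appear to be available on this system")
    else:
        print("KSM run state:", stats.get("run"))
        print("approximate pages saved:", pages_saved(stats))
```

If pages_sharing stays at zero while run is 1, nothing on the system has
marked its memory as mergeable, or there is simply nothing identical to merge.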
The write-through caching may be worth looking into, although I think
(not certain about this) that you can force the page cache to do
write-through caching only, except that can only be done globally.
It would probably be better to improve upon the existing page cache
implementation anyway. Ideally, I would love to see:
1. The ability to tell the page cache to claim some minimum amount of
memory that only it can use.
2. The ability to easily tune cache parameters on a per-device (or even
better, per-filesystem) basis.
3. Conversion to a framework that would allow for easy development and
testing of different caching algorithms (although this is probably never
going to happen).
>>>>> 2. The majority of RapidDisk code focuses on the use of Volatile memory.
>>>>> The support for Non-Volatile memory is a bit newer and there may be some
>>>>> overlap here with the recently integrated pmem code. The only advantage to
>>>>> having this code within RapidDisk is to provide the user with the ability
>>>>> to manage both technologies simultaneously, through a single interface.
>>>> Which really doesn't sound like a good enough reason to duplicate it.
>>> I do not disagree with your comment here. This component does not have
>>> to be patched into the mainline.
>>>
>>>>> 3. The RapidCache component is designed around the Non-Volatile
>>>>> functionality of RapidDisk (hence the block-level Write-Through caching).
>>>>> It is also coded and optimized around the RapidDisk sizes/variables,
>>>>> out-of-box. It is worth noting that I am in the process of expanding this
>>>>> module to add deduplication support. This will leverage RapidDisk's ability
>>>>> to allocate pages only when needed and reduce the cache's memory footprint;
>>>>> making more out of less.
>>>> Still needs some code comparison to our existing two caching solutions.
>>>>
>>>> I'd love to see you go ahead with the dynamic ramdisk configuration as
>>>> this is clearly a very useful feature. A caching solution that is
>>>> optimized for non-volatile memory does sound useful, but we'll still
>>>> need a patch better explaining how it actually is as useful as it might
>>>> sound.
>>> CORRECTION: I meant to say Volatile and NOT Non-Volatile. RapidCache is
>>> designed around Volatile memory. I guess I was a little too excited in my
>>> response and I do apologize for that. I will provide a code comparison
>>> in my next e-mail, after I go through the existing RAM drive code.
>> To a certain extent, I see that as potentially less useful than
>> optimized for non-volatile memory. While the current incarnation of
>> the pagecache in Linux could stand to have some serious performance
>> improvements (just think how fast things would be if we used ARC
>> instead of plain LRU), it does still do its job well for most
>> workloads (although being able to tell the kernel to reserve some
>> portion of memory _just_ for the pagecache would be an interesting and
>> probably very useful feature).
>>
> My only concern with an ARC is CPU utilization. A lot more is required
> to manage two lists.
Actually, most of the CPU time spent in an ARC cache is in the
auto-tuning (the 'adaptive' bit). I've done testing just in userspace,
and SLRU (ARC without the adaptive sizing of the lists) uses only a
little more CPU time than traditional LRU, somewhat less than ARC, and
does a much better job of handling COW-based workloads. COW is a tough
workload for LRU caching (which is why ZFS uses ARC and not traditional
LRU), as a read-modify-write cycle ends up with the read data not being
needed ever again, which in turn means that MRU caching can be better in
many cases for heavy read-write COW workloads.
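To make the LRU versus SLRU difference concrete, here is a small userspace
sketch in Python. It is a textbook two-segment SLRU written for
illustration, not the code used in the testing mentioned above and not
anything from the kernel or ZFS. The class names, the hit_count helper and
the trace are all invented for the example. Blocks touched only once stay
in the probationary segment and get evicted from there, so they never push
out blocks that have already been reused.

```python
from collections import OrderedDict


class LRUCache:
    """Plain LRU: one list, evict the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key):
        """Touch key; return True on a hit, False on a miss."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        self.items[key] = None
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the LRU entry
        return False


class SLRUCache:
    """Segmented LRU: new keys enter a probationary list and are only
    promoted to the protected list when they are hit a second time."""

    def __init__(self, probation_size, protected_size):
        self.probation_size = probation_size
        self.protected_size = protected_size
        self.probation = OrderedDict()
        self.protected = OrderedDict()

    def access(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)
            return True
        if key in self.probation:
            # Second touch: promote, demoting the protected LRU if full.
            del self.probation[key]
            self.protected[key] = None
            if len(self.protected) > self.protected_size:
                demoted, _ = self.protected.popitem(last=False)
                self.probation[demoted] = None
            self._trim_probation()
            return True
        self.probation[key] = None
        self._trim_probation()
        return False

    def _trim_probation(self):
        while len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)


def hit_count(cache, trace):
    """Replay trace against cache and return the number of hits."""
    return sum(1 for key in trace if cache.access(key))


if __name__ == "__main__":
    # Two hot blocks, with runs of read-once blocks in between.
    trace = ["a", "b", "a", "b", 1, 2, 3, 4, "a", "b", 5, 6, 7, 8, "a", "b"]
    print("LRU hits: ", hit_count(LRUCache(4), trace))
    print("SLRU hits:", hit_count(SLRUCache(2, 2), trace))
```

With the same total capacity of four entries, the read-once blocks flush the
two hot blocks out of the plain LRU each time, while the SLRU keeps them in
its protected segment. The extra bookkeeping per access is one more
dictionary lookup and an occasional demotion, with no adaptive resizing of
the two lists.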