* Interest in Implementing Cache layer on top of NVME Driver [GSoC]
From: Rajath Shashidhara @ 2016-03-01 8:31 UTC (permalink / raw)
To: ceph-devel; +Cc: haomai
Hello,
I am a GSoC 2016 aspirant. I have browsed through your ideas page and
I find that the above project best aligns with my experience and
interests.
During my studies, I have developed an interest in building
performance-critical software. I am familiar with the parallel and
distributed computing paradigms and I would like to gain further
experience in this field.
I have experience developing page buffers for efficient I/O from
secondary storage. Over the last few months I have led the development
of external-memory (secondary storage) data structures [1] for C++. We
have implemented custom buffering (modifications of the LRU strategy)
specific to each data structure. Features of the page buffer include
support for page pinning, priority paging (pages with higher priority
are harder to replace than low-priority pages with a comparable
last-access time), and prefetching (with a strategy specific to each
data structure). It is an ongoing project, and we are expanding the
number of data structures and corresponding optimizations.
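A minimal sketch of the page-buffer behaviour described above, written
in Python for brevity. It is illustrative only and is not taken from the
project in [1]: the class name, the `load_page` callback and the exact
eviction rule (lowest priority first, then least recently used) are
assumptions.

```python
from collections import OrderedDict


class PageBuffer:
    """Toy page buffer: LRU replacement with pinning and a priority hint."""

    def __init__(self, capacity, load_page):
        self.capacity = capacity
        self.load_page = load_page   # hypothetical callback: page_id -> data
        self.pages = OrderedDict()   # page_id -> data, oldest access first
        self.pinned = set()          # page ids that must not be evicted
        self.priority = {}           # page_id -> int, higher = harder to evict

    def get(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)   # mark as most recently used
            return self.pages[page_id]
        if len(self.pages) >= self.capacity:
            self._evict()
        self.pages[page_id] = self.load_page(page_id)
        self.priority[page_id] = 0
        return self.pages[page_id]

    def set_priority(self, page_id, value):
        self.priority[page_id] = value

    def pin(self, page_id):
        self.pinned.add(page_id)

    def unpin(self, page_id):
        self.pinned.discard(page_id)

    def _evict(self):
        # Scan from least to most recently used. Among unpinned pages pick
        # the lowest priority; ties go to the page accessed longest ago.
        victim = None
        for pid in self.pages:
            if pid in self.pinned:
                continue
            if victim is None or self.priority[pid] < self.priority[victim]:
                victim = pid
        if victim is None:
            raise RuntimeError("all pages are pinned")
        del self.pages[victim]
        del self.priority[victim]
```

A real buffer would weigh priority against recency rather than ranking
strictly by priority, and would add the per-structure prefetching
mentioned above; both are left out here.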
I will familiarize myself with the architecture of Ceph, the userspace
NVMe driver and the cache requirements. Please point me to the right
resources and documentation related to this problem. It would be
great if you could suggest a bug for me to fix, or any other
contribution that would help me understand the underlying source code.
Note: I successfully completed GSoC 2013 with Apache OpenOffice.
Thank you,
Rajath Shashidhara
[1] https://github.com/ExternalMemoryDS/external-mem-ds
* Re: Interest in Implementing Cache layer on top of NVME Driver [GSoC]
From: Haomai Wang @ 2016-03-01 14:01 UTC (permalink / raw)
To: Rajath Shashidhara; +Cc: ceph-devel@vger.kernel.org
Hmm, I assume you have a basic overview of the Ceph OSD side.
Currently BlueStore manages the block device directly, so the page
cache (as well as the buffer cache) is used to cache data and metadata.
The userspace NVMe driver, however, bypasses the kernel layer and with
it the page cache, so we can't make use of any kernel-side cache.
In general, the metadata cache is very important for I/O latency.
Without a cache, most I/O will hit the persistent device, which adds
significant latency compared with keeping metadata cached in memory,
even if the backend is an NVMe SSD.
So we need to implement a cache layer above the NVMe driver to
accelerate metadata and data reads.
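As a rough illustration of what "a cache layer above the driver" could
look like, here is a read-through block cache sketched in Python. It is
not Ceph code: `raw_read` stands in for whatever read call the userspace
driver exposes, and `BLOCK_SIZE` and the LRU policy are assumptions.

```python
from collections import OrderedDict

BLOCK_SIZE = 4096  # assumed cache block size, not a Ceph constant


class CachedReader:
    """Read-through cache that sits between callers and a raw device read."""

    def __init__(self, raw_read, max_blocks):
        self.raw_read = raw_read      # hypothetical: (offset, length) -> bytes
        self.max_blocks = max_blocks
        self.blocks = OrderedDict()   # block index -> bytes, oldest first
        self.device_reads = 0         # counts reads that reached the device

    def _block(self, index):
        if index in self.blocks:
            self.blocks.move_to_end(index)    # cache hit: refresh recency
            return self.blocks[index]
        data = self.raw_read(index * BLOCK_SIZE, BLOCK_SIZE)
        self.device_reads += 1
        self.blocks[index] = data
        if len(self.blocks) > self.max_blocks:
            self.blocks.popitem(last=False)   # drop least recently used block
        return data

    def read(self, offset, length):
        if length <= 0:
            return b""
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        buf = b"".join(self._block(i) for i in range(first, last + 1))
        start = offset - first * BLOCK_SIZE
        return buf[start:start + length]
```

Repeated reads that fall in an already cached block never touch
`raw_read`, which is the latency saving described above. Write handling
and invalidation are deliberately left out.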
I think that's the initial background for the idea. Since we also have
ObjectCacher on the client side, it would be good if we could unify the
two. But ObjectCacher is obviously complex right now, so you may not
want to deal with it at first...
@sage @sam do you have any other points about this idea?
* Re: Interest in Implementing Cache layer on top of NVME Driver [GSoC]
From: Josh Durgin @ 2016-03-02 0:27 UTC (permalink / raw)
To: Haomai Wang, Rajath Shashidhara; +Cc: ceph-devel@vger.kernel.org
On 03/01/2016 06:01 AM, Haomai Wang wrote:
> I think it's a initial idea background, but since we also have
> ObjectCacher in client side, if we can unify this it will be good. But
> obviously ObjectCacher is complexity now, you may not take care of it
> at first...
I'd suggest ignoring ObjectCacher, since it's got ordering and buffer
assumptions you probably don't need, plus a single global lock. I'd
expect a read-focused cache to be pretty different.
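One common way to avoid a single global lock in a read-focused cache is
to shard it, so that threads touching different keys rarely contend.
This is an illustration of that general technique, not something
proposed in this thread; the class and parameter names are made up.

```python
import threading


class ShardedCache:
    """Read cache split into shards, each guarded by its own lock."""

    def __init__(self, num_shards=8):
        # One (table, lock) pair per shard instead of one global lock.
        self.shards = [({}, threading.Lock()) for _ in range(num_shards)]

    def _shard(self, key):
        return self.shards[hash(key) % len(self.shards)]

    def get(self, key, loader):
        table, lock = self._shard(key)
        with lock:
            # The loader runs under the shard lock, so each key is loaded
            # at most once, at the cost of blocking that shard meanwhile.
            if key not in table:
                table[key] = loader(key)
            return table[key]
```

Holding the shard lock during the load keeps the sketch short; a real
cache would more likely release it while the device read is in flight.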
Josh
* Re: Interest in Implementing Cache layer on top of NVME Driver [GSoC]
From: Rajath Shashidhara @ 2016-03-02 19:11 UTC (permalink / raw)
To: Josh Durgin; +Cc: Haomai Wang, ceph-devel@vger.kernel.org
I did some reading on user-space drivers, and NVMe drivers specifically
[1,2]. I understand that user-space drivers run in "user mode" rather
than "kernel mode" to save context-switch time. But this also places
several restrictions on the driver, such as the unavailability of the
buffer cache that the kernel normally provides to optimize reads and
writes to storage devices.
I would like to know more about the following to understand the cache
requirements:
[1] Are there any assumptions that I should make about locality of access?
[2] Are multiple threads going to be accessing the cache at the same
time (thread safety)?
[3] Which components in Ceph are going to interact directly with the
cache? (Reading their code might give me a better idea of what is
expected of me.)
When designing a cache, several parameters need to be addressed, such
as page replacement policies, page sizes, and prefetching. I expect I
will be able to make the right design choices once I better understand
the cache requirements and access profile.
This is my first exposure to Ceph. Please excuse me if I am asking
basic questions.
Please point me to the right resources so that I can better understand
the project. I would also be happy to fix a bug to familiarize myself
with the source code. Any suggestions on bugs that are relevant to the
project will be of great help.
Thank you,
Rajath Shashidhara
[1] http://www.enea.com/Documents/Resources/Whitepapers/Enea-User-Space-Drivers-in-Linux_Whitepaper_2013.pdf
[2] http://www.nvmexpress.org/about/nvm-express-overview/
* Re: Interest in Implementing Cache layer on top of NVME Driver [GSoC]
From: Haomai Wang @ 2016-03-03 6:12 UTC (permalink / raw)
To: Rajath Shashidhara; +Cc: Josh Durgin, ceph-devel@vger.kernel.org
On Thu, Mar 3, 2016 at 3:11 AM, Rajath Shashidhara
<rajath.shashidhara@gmail.com> wrote:
> I did some reading on user space drivers and specifically NVME drivers
> [1,2]. I understand that user space drivers run in "user mode" as
> opposed to the "kernel mode" to save context switch time. But, this
> also places several restrictions on the driver, like the
> unavailability of buffer cache usually present to optimize read/write
> from storage devices in the kernel.
>
> I would like to know more about the following things to understand the
> cache requirements :
> [1] Are there any assumptions that I should make about locality of access ?
Yes. I think you should first learn about the BlueStore architecture.
This slide deck
(http://www.slideshare.net/sageweil1/ceph-and-rocksdb?qid=3021e4f1-58b5-4758-aa64-7005e3b98f51&v=&b=&from_search=7)
is a great introduction.
I think it would be better if we could separate the data cache from the
metadata cache (just an idea).
The metadata workload is critical for the cache design and
implementation, so take a look at the details of how RocksDB accesses
BlockDevice. This covers RocksDB itself and RocksDBEnv on the Ceph
side.
As for the data part, I think it should be generic.
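To make the "separate data and metadata cache" idea concrete, here is a
small Python sketch with two independent byte budgets, so that a large
data scan cannot push metadata out of memory. The class names, key
format and budget sizes are assumptions for illustration, not BlueStore
structures.

```python
from collections import OrderedDict


class ByteLRU:
    """LRU cache bounded by total bytes of cached values."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.items = OrderedDict()   # key -> bytes, oldest access first

    def put(self, key, value):
        if key in self.items:
            self.used -= len(self.items.pop(key))
        self.items[key] = value
        self.used += len(value)
        # Evict least recently used entries until back under budget.
        while self.used > self.max_bytes and self.items:
            _, old = self.items.popitem(last=False)
            self.used -= len(old)

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # refresh recency on a hit
        return self.items[key]


class SplitCache:
    """Two independent caches: one for metadata, one for data."""

    def __init__(self, meta_bytes, data_bytes):
        self.meta = ByteLRU(meta_bytes)
        self.data = ByteLRU(data_bytes)
```

For example, with `SplitCache(64, 1024)` a metadata entry stays cached
no matter how many data blocks are inserted afterwards, because
evictions in `data` never touch `meta`.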
> [2] Are multiple threads going to be accessing the cache at the same
> time (thread safety) ?
Yes or no: you can implement a shared cache that is accessed by
multiple threads, or you can build a local cache that is owned by a
single thread. I think the best design is whichever matches BlueStore.
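The thread-owned option can be sketched in Python with `threading.local`:
each thread gets its own table, so no lock is needed, but the same key
may be loaded once per thread. The class name and `loader` callback are
hypothetical.

```python
import threading


class ThreadLocalCache:
    """Each thread owns a private cache, so lookups need no locking."""

    def __init__(self, loader):
        self.loader = loader             # hypothetical callback: key -> value
        self.local = threading.local()   # per-thread attribute storage

    def get(self, key):
        table = getattr(self.local, "table", None)
        if table is None:
            table = self.local.table = {}   # first use in this thread
        if key not in table:
            table[key] = self.loader(key)
        return table[key]
```

The trade-off against a shared cache is memory and duplicated loads
versus lock contention; which side wins depends on how BlueStore's
threads partition the work.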
> [3] What are the components in Ceph that are going to directly
> interact with the cache ? (Reading their code might give me a better
> idea about what is expected of me)
Given the current status, I think we only need to focus on the
BlueStore part.