* Re: Interest in Implementing Cache layer on top of NVME Driver [GSoC]
From: Haomai Wang @ 2016-03-01 14:01 UTC
To: Rajath Shashidhara; +Cc: ceph-devel@vger.kernel.org

On Tue, Mar 1, 2016 at 4:26 PM, Rajath Shashidhara
<rajath.shashidhara@gmail.com> wrote:
> Hello,
>
> I am a GSoC 2016 aspirant. I have browsed through your ideas page and I
> find that the above project best aligns with my experience and interest.
>
> During the course of my academics, I have developed an interest in
> building performance-critical software. I am familiar with the parallel
> and distributed computing paradigms and I would like to gain further
> experience in this field.
>
> I have experience in the development of page buffers for efficient I/O
> from secondary storage. I have led the development of External Memory
> (secondary storage) data structures [1] for C++ over the last few months.
> We have implemented custom buffering (modifications of the LRU strategy)
> specific to each data structure. Features of the page buffer include
> support for page pinning, priority paging (pages with higher priority are
> harder to replace than pages with low priority and a comparable recent
> access time), and prefetching (with a strategy specific to each data
> structure). (It is an ongoing project and we are expanding the number of
> data structures and corresponding optimizations.)
>
> I will familiarize myself with the architecture of Ceph, the userspace
> NVME driver and the cache requirements. Please point me to the right
> resources and the documentation related to this problem. It would be
> great if you could suggest a bug to fix or any contribution that would
> help me understand the underlying source code.

Hmm, I suppose you already have a basic overview of the Ceph OSD side.
Currently BlueStore manages the block device directly, so the page cache
(as well as the buffer cache) is used to cache data and metadata. But the
userspace NVMe driver bypasses the kernel layer, and with it the page
cache, so we can't make use of any kernel-side cache.

In general, a metadata cache is very important to IO latency: without a
cache, most IO will hit the persistent device, which adds significant
latency compared with metadata cached in memory, even if the backend is
an NVMe SSD. So we need to implement a cache layer above the NVMe driver
to accelerate metadata or data reads.

I think that is the initial background for the idea. Since we also have
ObjectCacher on the client side, it would be good if we could unify the
two. But ObjectCacher is clearly complex right now, so you may not need
to deal with it at first...

@sage @sam do you have any other points about this idea?

>
> Note : I have successfully completed GSoC 2013 with Apache OpenOffice.
>
> Thank you,
> Rajath Shashidhara
>
> [1] https://github.com/ExternalMemoryDS/external-mem-ds
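As a rough illustration of the cache layer described in this message, the following is a minimal sketch of a read-through LRU cache that sits in front of a block read function. It is not Ceph code: the class name `ReadThroughCache`, the `BackendRead` callback (standing in for a read issued through the userspace NVMe driver), and the choice of plain LRU replacement are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical read-through LRU cache keyed by block number.
// BackendRead is an assumed stand-in for a device read; it is not a Ceph API.
class ReadThroughCache {
 public:
  using BackendRead = std::function<std::string(std::uint64_t)>;

  ReadThroughCache(std::size_t capacity, BackendRead backend)
      : capacity_(capacity), backend_(std::move(backend)) {
    assert(capacity_ > 0);
  }

  // Returns the block contents, calling the backend only on a miss.
  std::string read(std::uint64_t block) {
    auto it = index_.find(block);
    if (it != index_.end()) {
      ++hits_;
      // Move the entry to the front: it is now the most recently used.
      lru_.splice(lru_.begin(), lru_, it->second);
      return it->second->second;
    }
    ++misses_;
    if (lru_.size() == capacity_) {
      // Evict the least recently used entry (back of the list).
      index_.erase(lru_.back().first);
      lru_.pop_back();
    }
    lru_.emplace_front(block, backend_(block));
    index_[block] = lru_.begin();
    return lru_.front().second;
  }

  std::size_t hits() const { return hits_; }
  std::size_t misses() const { return misses_; }

 private:
  using Entry = std::pair<std::uint64_t, std::string>;
  std::size_t capacity_;
  BackendRead backend_;
  std::list<Entry> lru_;  // front = most recently used
  std::unordered_map<std::uint64_t, std::list<Entry>::iterator> index_;
  std::size_t hits_ = 0;
  std::size_t misses_ = 0;
};
```

A caller would construct the cache with a capacity and a callback, then route every read through `read()`. Repeated reads of a hot metadata block are then served from memory, and only misses reach the device. The sketch is single-threaded and ignores writes and invalidation, both of which a real cache would have to handle.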
* Re: Interest in Implementing Cache layer on top of NVME Driver [GSoC]
From: Josh Durgin @ 2016-03-02 0:27 UTC
To: Haomai Wang, Rajath Shashidhara; +Cc: ceph-devel@vger.kernel.org

On 03/01/2016 06:01 AM, Haomai Wang wrote:
> I think that is the initial background for the idea. Since we also have
> ObjectCacher on the client side, it would be good if we could unify the
> two. But ObjectCacher is clearly complex right now, so you may not need
> to deal with it at first...

I'd suggest ignoring ObjectCacher, since it's got ordering and buffer
assumptions you probably don't need, plus a single global lock. I'd
expect a read-focused cache to be pretty different.

Josh
* Re: Interest in Implementing Cache layer on top of NVME Driver [GSoC]
From: Rajath Shashidhara @ 2016-03-02 19:11 UTC
To: Josh Durgin; +Cc: Haomai Wang, ceph-devel@vger.kernel.org

I did some reading on user space drivers and specifically NVME drivers
[1,2]. I understand that user space drivers run in "user mode" as
opposed to "kernel mode" to save context switch time. But this also
places several restrictions on the driver, such as the unavailability of
the buffer cache that the kernel usually provides to optimize reads and
writes to storage devices.

I would like to know more about the following things to understand the
cache requirements :
[1] Are there any assumptions that I should make about locality of access ?
[2] Are multiple threads going to be accessing the cache at the same
time (thread safety) ?
[3] What are the components in Ceph that are going to directly
interact with the cache ? (Reading their code might give me a better
idea about what is expected of me)

When designing a cache there are several parameters/factors that need
to be addressed, such as page replacement policies, page sizes,
prefetching, etc. I guess I will be able to make the right design
choices if I better understand the cache requirements and access
profile.

This is my first exposure to Ceph. Please excuse me if I am asking
basic questions.

Please point me to the right resources so that I can better understand
the project. I would also be happy to fix a bug to familiarize myself
with the source code. Any suggestions on bugs that are relevant to the
project will be of great help.

Thank you,
Rajath Shashidhara

[1] http://www.enea.com/Documents/Resources/Whitepapers/Enea-User-Space-Drivers-in-Linux_Whitepaper_2013.pdf
[2] http://www.nvmexpress.org/about/nvm-express-overview/

On Wed, Mar 2, 2016 at 5:57 AM, Josh Durgin <jdurgin@redhat.com> wrote:
> On 03/01/2016 06:01 AM, Haomai Wang wrote:
>>
>> I think that is the initial background for the idea. Since we also have
>> ObjectCacher on the client side, it would be good if we could unify the
>> two. But ObjectCacher is clearly complex right now, so you may not need
>> to deal with it at first...
>
>
> I'd suggest ignoring ObjectCacher, since it's got ordering and buffer
> assumptions you probably don't need, plus a single global lock. I'd
> expect a read-focused cache to be pretty different.
>
> Josh
* Re: Interest in Implementing Cache layer on top of NVME Driver [GSoC]
From: Haomai Wang @ 2016-03-03 6:12 UTC
To: Rajath Shashidhara; +Cc: Josh Durgin, ceph-devel@vger.kernel.org

On Thu, Mar 3, 2016 at 3:11 AM, Rajath Shashidhara
<rajath.shashidhara@gmail.com> wrote:
> I did some reading on user space drivers and specifically NVME drivers
> [1,2]. I understand that user space drivers run in "user mode" as
> opposed to "kernel mode" to save context switch time. But this also
> places several restrictions on the driver, such as the unavailability of
> the buffer cache that the kernel usually provides to optimize reads and
> writes to storage devices.
>
> I would like to know more about the following things to understand the
> cache requirements :
> [1] Are there any assumptions that I should make about locality of access ?

Yes. I think you should first get to know the BlueStore architecture.
These slides
(http://www.slideshare.net/sageweil1/ceph-and-rocksdb?qid=3021e4f1-58b5-4758-aa64-7005e3b98f51&v=&b=&from_search=7)
are a great introduction.

I think it would be better if we could separate the data cache from the
metadata cache (just an idea). The metadata workload is critical to the
cache design and implementation, so take a look at the details of how
RocksDB accesses BlockDevice. That path covers RocksDB itself and
RocksDBEnv on the Ceph side. As for the data part, I think it should be
generic.

> [2] Are multiple threads going to be accessing the cache at the same
> time (thread safety) ?

Either is possible: you can implement a shared cache that is accessed by
multiple threads, or a local cache that is owned by a single thread. I
think whichever design matches BlueStore best is the right one.

> [3] What are the components in Ceph that are going to directly
> interact with the cache ?
> (Reading their code might give me a better
> idea about what is expected of me)

For now, I think we only need to focus on the BlueStore part.

>
> When designing a cache there are several parameters/factors that need
> to be addressed, such as page replacement policies, page sizes,
> prefetching, etc. I guess I will be able to make the right design
> choices if I better understand the cache requirements and access
> profile.
>
> This is my first exposure to Ceph. Please excuse me if I am asking
> basic questions.
>
> Please point me to the right resources so that I can better understand
> the project. I would also be happy to fix a bug to familiarize myself
> with the source code. Any suggestions on bugs that are relevant to the
> project will be of great help.
>
> Thank you,
> Rajath Shashidhara
>
> [1] http://www.enea.com/Documents/Resources/Whitepapers/Enea-User-Space-Drivers-in-Linux_Whitepaper_2013.pdf
> [2] http://www.nvmexpress.org/about/nvm-express-overview/
>
> On Wed, Mar 2, 2016 at 5:57 AM, Josh Durgin <jdurgin@redhat.com> wrote:
>> On 03/01/2016 06:01 AM, Haomai Wang wrote:
>>>
>>> I think that is the initial background for the idea. Since we also have
>>> ObjectCacher on the client side, it would be good if we could unify the
>>> two. But ObjectCacher is clearly complex right now, so you may not need
>>> to deal with it at first...
>>
>>
>> I'd suggest ignoring ObjectCacher, since it's got ordering and buffer
>> assumptions you probably don't need, plus a single global lock. I'd
>> expect a read-focused cache to be pretty different.
>>
>> Josh
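To make the shared-versus-local trade-off discussed in this reply concrete, here is a minimal sketch of a shared cache that avoids a single global lock by splitting its keys across independently locked shards. The names `ShardedCache` and `BlockCaches` are hypothetical, eviction is left out for brevity, and nothing here reflects how BlueStore actually organizes its caches.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical shared cache split into independently locked shards, so
// threads that touch different keys rarely contend on the same mutex.
template <std::size_t NumShards = 16>
class ShardedCache {
 public:
  void put(std::uint64_t key, std::string value) {
    Shard& s = shard_for(key);
    std::lock_guard<std::mutex> lock(s.mu);
    s.map[key] = std::move(value);
  }

  // Copies the cached value into *out and returns true on a hit.
  bool get(std::uint64_t key, std::string* out) {
    assert(out != nullptr);
    Shard& s = shard_for(key);
    std::lock_guard<std::mutex> lock(s.mu);
    auto it = s.map.find(key);
    if (it == s.map.end()) return false;
    *out = it->second;
    return true;
  }

 private:
  struct Shard {
    std::mutex mu;
    std::unordered_map<std::uint64_t, std::string> map;
  };

  // Picks the shard from the key's hash; the same key always maps to the
  // same shard, so one lock protects each entry.
  Shard& shard_for(std::uint64_t key) {
    return shards_[std::hash<std::uint64_t>{}(key) % NumShards];
  }

  std::array<Shard, NumShards> shards_;
};

// Assumed layout for keeping metadata and data apart: two separate
// instances, so a burst of data reads cannot displace hot metadata.
struct BlockCaches {
  ShardedCache<16> metadata;
  ShardedCache<4> data;
};
```

A thread-local cache would drop the mutexes entirely at the cost of duplicating hot entries across threads and needing some cross-thread invalidation scheme. Which of the two fits better depends on how BlueStore's threads partition their work, which this sketch does not try to model.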
* Interest in Implementing Cache layer on top of NVME Driver [GSoC]
From: Rajath Shashidhara @ 2016-03-01 8:31 UTC
To: ceph-devel; +Cc: haomai

Hello,

I am a GSoC 2016 aspirant. I have browsed through your ideas page and I
find that the above project best aligns with my experience and interest.

During the course of my academics, I have developed an interest in
building performance-critical software. I am familiar with the parallel
and distributed computing paradigms and I would like to gain further
experience in this field.

I have experience in the development of page buffers for efficient I/O
from secondary storage. I have led the development of External Memory
(secondary storage) data structures [1] for C++ over the last few months.
We have implemented custom buffering (modifications of the LRU strategy)
specific to each data structure. Features of the page buffer include
support for page pinning, priority paging (pages with higher priority are
harder to replace than pages with low priority and a comparable recent
access time), and prefetching (with a strategy specific to each data
structure). (It is an ongoing project and we are expanding the number of
data structures and corresponding optimizations.)

I will familiarize myself with the architecture of Ceph, the userspace
NVME driver and the cache requirements. Please point me to the right
resources and the documentation related to this problem. It would be
great if you could suggest a bug to fix or any contribution that would
help me understand the underlying source code.

Note : I have successfully completed GSoC 2013 with Apache OpenOffice.

Thank you,
Rajath Shashidhara

[1] https://github.com/ExternalMemoryDS/external-mem-ds
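The page-pinning feature mentioned in this message can be sketched as an LRU buffer whose eviction scan skips pinned pages. This is only an illustration under stated assumptions: `PinnedLruBuffer` and its methods are hypothetical names, it is not taken from the external-mem-ds project, and priority paging and prefetching are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

// Hypothetical page buffer: LRU replacement that never evicts pinned pages.
class PinnedLruBuffer {
 public:
  explicit PinnedLruBuffer(std::size_t capacity) : capacity_(capacity) {
    assert(capacity_ > 0);
  }

  // Touches a page, loading it if absent. Returns false when the buffer is
  // full and every resident page is pinned, so nothing can be evicted.
  bool access(std::uint64_t page) {
    auto it = index_.find(page);
    if (it != index_.end()) {
      lru_.splice(lru_.begin(), lru_, it->second);  // now most recently used
      return true;
    }
    if (lru_.size() == capacity_ && !evict_one()) return false;
    lru_.push_front(Frame{page, 0});
    index_[page] = lru_.begin();
    return true;
  }

  // Pins a resident page; returns false if the page is not in the buffer.
  bool pin(std::uint64_t page) {
    auto it = index_.find(page);
    if (it == index_.end()) return false;
    ++it->second->pins;
    return true;
  }

  // Releases one pin; a page becomes evictable again at zero pins.
  void unpin(std::uint64_t page) {
    auto it = index_.find(page);
    if (it != index_.end() && it->second->pins > 0) --it->second->pins;
  }

  bool resident(std::uint64_t page) const { return index_.count(page) != 0; }

 private:
  struct Frame {
    std::uint64_t page;
    unsigned pins;
  };

  // Scans from the least recently used end for the first unpinned frame.
  bool evict_one() {
    for (auto it = lru_.end(); it != lru_.begin();) {
      --it;
      if (it->pins == 0) {
        index_.erase(it->page);
        lru_.erase(it);
        return true;
      }
    }
    return false;
  }

  std::size_t capacity_;
  std::list<Frame> lru_;  // front = most recently used
  std::unordered_map<std::uint64_t, std::list<Frame>::iterator> index_;
};
```

Priority paging could be layered on the same structure by having `evict_one()` compare a per-frame priority among the oldest candidates rather than taking the first unpinned frame, but the exact policy used in the project is not described in the message.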