Integration work

* Integration work
@ 2012-08-28 18:12 Ross Turk
  2012-08-28 18:32 ` Plaetinck, Dieter
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Ross Turk @ 2012-08-28 18:12 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org

Hi, ceph-devel! It's me, your friendly community guy.

Inktank has an engineering team dedicated to Ceph, and we want to work 
on the right stuff. From time to time, I'd like to check in with you to 
make sure that we are.

Over the past several months, Inktank's engineers have focused on core 
stability, radosgw, and feature expansion for RBD. At the same time, 
they have been regularly allocating cycles to integration work. 
Recently, this has consisted of improvements to the way Ceph works 
within OpenStack (even though OpenStack isn't the only technology that 
we think Ceph should play nicely with).

What other sorts of integrations would you like to see Inktank engineers 
work on? For example, are you interested in seeing Inktank spend more of 
its resources improving interoperability with Apache CloudStack or 
Eucalyptus? How about Xen?

Please share your thoughts. We want to contribute in the best way 
possible with the resources we have, and your input can help.

Thx,
Ross

--
Ross Turk
Community, Ceph
@rossturk @inktank @ceph

* Re: Integration work
  2012-08-28 18:12 Integration work Ross Turk
@ 2012-08-28 18:32 ` Plaetinck, Dieter
  2012-08-28 21:03   ` Florian Haas
  2012-08-28 18:51 ` Dieter Kasper
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Plaetinck, Dieter @ 2012-08-28 18:32 UTC (permalink / raw)
  To: Ross Turk; +Cc: ceph-devel@vger.kernel.org

On Tue, 28 Aug 2012 11:12:16 -0700
Ross Turk <ross@inktank.com> wrote:

> 
> Hi, ceph-devel! It's me, your friendly community guy.
> 
> Inktank has an engineering team dedicated to Ceph, and we want to work 
> on the right stuff. From time to time, I'd like to check in with you to 
> make sure that we are.
> 
> Over the past several months, Inktank's engineers have focused on core 
> stability, radosgw, and feature expansion for RBD. At the same time, 
> they have been regularly allocating cycles to integration work. 
> Recently, this has consisted of improvements to the way Ceph works 
> within OpenStack (even though OpenStack isn't the only technology that 
> we think Ceph should play nicely with).
> 
> What other sorts of integrations would you like to see Inktank engineers 
> work on?

Are we only supposed to give answers wrt. integration with other software?
If not, I would suggest writing documentation,
and also integration with CM tools like Puppet/Chef.
Both of these points can give a shorter "time from zero to working cluster", which IMHO is critical
in attracting new users (myself included).

* Re: Integration work
  2012-08-28 18:12 Integration work Ross Turk
  2012-08-28 18:32 ` Plaetinck, Dieter
@ 2012-08-28 18:51 ` Dieter Kasper
  2012-08-28 18:57   ` Smart Weblications GmbH - Florian Wiessner
  2012-08-28 20:46   ` Tren Blackburn
  2012-08-29  8:20 ` Sylvain Munaut
  2012-09-01  6:02 ` Ryan Nicholson
  3 siblings, 2 replies; 20+ messages in thread
From: Dieter Kasper @ 2012-08-28 18:51 UTC (permalink / raw)
  To: Ross Turk; +Cc: ceph-devel@vger.kernel.org, Dieter Kasper (KD)

Hi Ross,

focusing on core stability and feature expansion for RBD was the right approach 
in the past and I feel you have reached an adequate maturity level here.

Performance enhancements - especially to reduce the latency of a single IO / increase IOPS -
and a stronger engagement on the CephFS client would be very much appreciated.
A stable and fast CephFS client would allow an efficient integration with
- (clustered) NFS (v3 and v4)
- (clustered) Samba v4


Cheers,
-Dieter


On Tue, Aug 28, 2012 at 08:12:16PM +0200, Ross Turk wrote:
> 
> Hi, ceph-devel! It's me, your friendly community guy.
> 
> Inktank has an engineering team dedicated to Ceph, and we want to work 
> on the right stuff. From time to time, I'd like to check in with you to 
> make sure that we are.
> 
> Over the past several months, Inktank's engineers have focused on core 
> stability, radosgw, and feature expansion for RBD. At the same time, 
> they have been regularly allocating cycles to integration work. 
> Recently, this has consisted of improvements to the way Ceph works 
> within OpenStack (even though OpenStack isn't the only technology that 
> we think Ceph should play nicely with).
> 
> What other sorts of integrations would you like to see Inktank engineers 
> work on? For example, are you interested in seeing Inktank spend more of 
> its resources improving interoperability with Apache CloudStack or 
> Eucalyptus? How about Xen?
> 
> Please share your thoughts. We want to contribute in the best way 
> possible with the resources we have, and your input can help.
> 
> Thx,
> Ross
> 
> --
> Ross Turk
> Community, Ceph
> @rossturk @inktank @ceph
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Integration work
  2012-08-28 18:51 ` Dieter Kasper
@ 2012-08-28 18:57   ` Smart Weblications GmbH - Florian Wiessner
  2012-08-28 20:05     ` Dieter Kasper
  2012-08-28 20:46   ` Tren Blackburn
  1 sibling, 1 reply; 20+ messages in thread
From: Smart Weblications GmbH - Florian Wiessner @ 2012-08-28 18:57 UTC (permalink / raw)
  To: Dieter Kasper; +Cc: Ross Turk, ceph-devel@vger.kernel.org

On 28.08.2012 20:51, Dieter Kasper wrote:

> Performance enhancements - especially to reduce the latency of a single IO / increase IOPS -
> and a stronger engagement on the CephFS client would be very much appreciated.
> A stable and fast CephFS client would allow an efficient integration with
> - (clustered) NFS (v3 and v4)
> - (clustered) Samba v4

Have you tried ocfs2 on top of rbd in the meantime, until CephFS is ready?


-- 

Kind regards,

Florian Wiessner

Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Registered office: Naila
Managing Director: Florian Wiessner
Commercial register no.: HRB 3840, Amtsgericht Hof
*from German landlines; prices from mobile networks may differ

* Re: Integration work
  2012-08-28 18:57   ` Smart Weblications GmbH - Florian Wiessner
@ 2012-08-28 20:05     ` Dieter Kasper
  0 siblings, 0 replies; 20+ messages in thread
From: Dieter Kasper @ 2012-08-28 20:05 UTC (permalink / raw)
  To: Smart Weblications GmbH - Florian Wiessner
  Cc: Ross Turk, ceph-devel@vger.kernel.org

On Tue, Aug 28, 2012 at 08:57:02PM +0200, Smart Weblications GmbH - Florian Wiessner wrote:
> On 28.08.2012 20:51, Dieter Kasper wrote:
> 
> > Performance enhancements - especially to reduce the latency of a single IO / increase IOPS -
> > and a stronger engagement on the CephFS client would be very much appreciated.
> > A stable and fast CephFS client would allow an efficient integration with
> > - (clustered) NFS (v3 and v4)
> > - (clustered) Samba v4
> 
> Have you tried ocfs2 on top of rbd in the meantime, until CephFS is ready?
No, I haven't, but I know its limitations.

OCFS2 (like GFS/GFS2 from Sistina/RH) is built on the cluster-FS design
of the 90s.
I'm looking for a cluster FS which is based on these assumptions:
+ the system is inherently dynamic
+ failures in a cluster are the norm, rather than the exception
+ workload characteristics constantly shift over time
+ the system is inevitably built incrementally
+ the system is self-managing
= Ceph (RBD + CephFS)


Cheers,
Dieter Kasper

> 
> 
> -- 
> 
> Kind regards,
> 
> Florian Wiessner
> 
> Smart Weblications GmbH
> Martinsberger Str. 1
> D-95119 Naila
> 
> fon.: +49 9282 9638 200
> fax.: +49 9282 9638 205
> 24/7: +49 900 144 000 00 - 0,99 EUR/Min*
> http://www.smart-weblications.de
> 
> --
> Registered office: Naila
> Managing Director: Florian Wiessner
> Commercial register no.: HRB 3840, Amtsgericht Hof
> *from German landlines; prices from mobile networks may differ

-- 
Principal Consultant, Data Center Storage Architecture and Technology
FTS CTO
FUJITSU TECHNOLOGY SOLUTIONS GMBH
Mies-van-der-Rohe-Straße 8 / 4F
80807 München
Germany

Telephone:      +49 89 62060     1898
Telefax:	+49 89 62060 329 1898
Mobile: 	+49 170 8563173
Email:  	dieter.kasper@ts.fujitsu.com
Internet:       http://ts.fujitsu.com
Company Details: http://ts.fujitsu.com/imprint.html


* Re: Integration work
  2012-08-28 18:51 ` Dieter Kasper
  2012-08-28 18:57   ` Smart Weblications GmbH - Florian Wiessner
@ 2012-08-28 20:46   ` Tren Blackburn
  2012-08-29  7:06     ` Amon Ott
  1 sibling, 1 reply; 20+ messages in thread
From: Tren Blackburn @ 2012-08-28 20:46 UTC (permalink / raw)
  To: Dieter Kasper; +Cc: Ross Turk, ceph-devel@vger.kernel.org

On Tue, Aug 28, 2012 at 11:51 AM, Dieter Kasper <d.kasper@kabelmail.de> wrote:
> Hi Ross,
>
> focusing on core stability and feature expansion for RBD was the right approach
> in the past and I feel you have reached an adequate maturity level here.
>
> Performance enhancements - especially to reduce the latency of a single IO / increase IOPS -
> and a stronger engagement on the CephFS client would be very much appreciated.
> A stable and fast CephFS client would allow an efficient integration with
> - (clustered) NFS (v3 and v4)
> - (clustered) Samba v4

+1 to CephFS being worked on. Things like the multi-mds being improved
upon would be amazing.

Regards,

Tren

* Re: Integration work
  2012-08-28 18:32 ` Plaetinck, Dieter
@ 2012-08-28 21:03   ` Florian Haas
  2012-08-28 21:15     ` Tommi Virtanen
  0 siblings, 1 reply; 20+ messages in thread
From: Florian Haas @ 2012-08-28 21:03 UTC (permalink / raw)
  To: ceph-devel@vger.kernel.org; +Cc: Plaetinck, Dieter, Ross Turk

On 08/28/2012 11:32 AM, Plaetinck, Dieter wrote:
> On Tue, 28 Aug 2012 11:12:16 -0700
> Ross Turk <ross@inktank.com> wrote:
> 
>>
>> Hi, ceph-devel! It's me, your friendly community guy.
>>
>> Inktank has an engineering team dedicated to Ceph, and we want to work 
>> on the right stuff. From time to time, I'd like to check in with you to 
>> make sure that we are.
>>
>> Over the past several months, Inktank's engineers have focused on core 
>> stability, radosgw, and feature expansion for RBD. At the same time, 
>> they have been regularly allocating cycles to integration work. 
>> Recently, this has consisted of improvements to the way Ceph works 
>> within OpenStack (even though OpenStack isn't the only technology that 
>> we think Ceph should play nicely with).
>>
>> What other sorts of integrations would you like to see Inktank engineers 
>> work on?
> 
> are we only supposed to give answers wrt. integration with other software?
> if not, I would suggest to write documentation.

If I may say so, the amount of work that John has poured into this in
recent weeks has been incredible (http://www.ceph.com/docs/master/). So
while it's definitely not complete nor perfect, I'm sure he would
appreciate a little more specific information as to where you believe
documentation is lacking.

I for my part, in the documentation space, would love for the admin
tools to become self-documenting. For example, I would love a "help"
subcommand at any level of the ceph shell, listing the supported
subcommands in that level. As in "ceph help", "ceph mon help", "ceph osd
getmap help".

Even better, the ceph shell could support a general-purpose hook that
bash-completion can use (kind of like "hg" does in Mercurial), and this
and the above-conjectured help facility could arguably share quite a bit
of code.
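
A minimal sketch of how a per-level "help" and a completion hook could share one command table. Everything here is an assumption for illustration: the `COMMANDS` table, its descriptions, and the function names are hypothetical and do not reflect how the real "ceph" tool is implemented.

```python
# Hypothetical command tree; the real "ceph" tool's commands and layout may differ.
COMMANDS = {
    "mon": {
        "stat": "show monitor status",
        "dump": "dump the monitor map",
    },
    "osd": {
        "getmap": "fetch the OSD map",
        "dump": "dump OSD state",
    },
    "health": "show cluster health",
}


def lookup(path):
    """Walk the tree along path and return the node reached, or None."""
    node = COMMANDS
    for word in path:
        if not isinstance(node, dict) or word not in node:
            return None
        node = node[word]
    return node


def help_text(path):
    """Return usage text for the command level named by path."""
    node = lookup(path)
    if node is None:
        return "unknown command: " + " ".join(path)
    if isinstance(node, str):
        # Leaf command: show its one-line description.
        return " ".join(path) + ": " + node
    # Inner level: list the supported subcommands.
    return "\n".join(sorted(node))


def complete(path, prefix):
    """Return subcommands under path that start with prefix (completion hook)."""
    node = lookup(path)
    if not isinstance(node, dict):
        return []
    return sorted(name for name in node if name.startswith(prefix))
```

Both `help_text` and `complete` go through the same `lookup`, which is the code sharing suggested above: adding a command to the table makes it appear in help and in completion at once.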

> and also integration with CM like puppet/chef

+1, although people are already working on both. So maybe this is just
about the need to tell more people about that. :)

Cheers,
Florian

* Re: Integration work
  2012-08-28 21:03   ` Florian Haas
@ 2012-08-28 21:15     ` Tommi Virtanen
  2012-08-28 21:20       ` Josh Durgin
  0 siblings, 1 reply; 20+ messages in thread
From: Tommi Virtanen @ 2012-08-28 21:15 UTC (permalink / raw)
  To: Florian Haas; +Cc: ceph-devel@vger.kernel.org, Plaetinck, Dieter, Ross Turk

On Tue, Aug 28, 2012 at 5:03 PM, Florian Haas <florian@hastexo.com> wrote:
> I for my part, in the documentation space, would love for the admin
> tools to become self-documenting. For example, I would love a "help"
> subcommand at any level of the ceph shell, listing the supported
> subcommands in that level. As in "ceph help", "ceph mon help", "ceph osd
> getmap help".
>
> Even better, the ceph shell could support a general-purpose hook that
> bash-completion can use (kind of like "hg" does in Mercurial), and this
> and the above-conjectured help facility could arguably share quite a bit
> of code.

I would love to see all of that. But, a lot of the "ceph" tool
functionality is implemented by shoveling strings in and out of the
monitors. It largely doesn't understand what's happening.

If we were to redo that from scratch, I'd convert that to have some
sort of API to monitors, and make the cli understand all the relevant
things. Understandably, that can feel a little bit more rigid; to add
a command means adding it to both the server and a client, whereas
currently the client is very very generic.

>> and also integration with CM like puppet/chef
> +1, although people are already working on both. So maybe this is just
> about the need to tell more people about that. :)

Please do give constructive feedback on

http://ceph.com/docs/master/install/chef/
http://ceph.com/docs/master/config-cluster/chef/

* Re: Integration work
  2012-08-28 21:15     ` Tommi Virtanen
@ 2012-08-28 21:20       ` Josh Durgin
  2012-08-30 14:00         ` João Eduardo Luís
  0 siblings, 1 reply; 20+ messages in thread
From: Josh Durgin @ 2012-08-28 21:20 UTC (permalink / raw)
  To: Tommi Virtanen
  Cc: Florian Haas, ceph-devel@vger.kernel.org, Plaetinck, Dieter,
	Ross Turk

On 08/28/2012 02:15 PM, Tommi Virtanen wrote:
> On Tue, Aug 28, 2012 at 5:03 PM, Florian Haas <florian@hastexo.com> wrote:
>> I for my part, in the documentation space, would love for the admin
>> tools to become self-documenting. For example, I would love a "help"
>> subcommand at any level of the ceph shell, listing the supported
>> subcommands in that level. As in "ceph help", "ceph mon help", "ceph osd
>> getmap help".
>>
>> Even better, the ceph shell could support a general-purpose hook that
>> bash-completion can use (kind of like "hg" does in Mercurial), and this
>> and the above-conjectured help facility could arguably share quite a bit
>> of code.
>
> I would love to see all of that. But, a lot of the "ceph" tool
> functionality is implemented by shoveling strings in and out of the
> monitors. It largely doesn't understand what's happening.

It doesn't need to understand what's happening to give basic usage info 
though - the monitors can provide that themselves in the short term
while we don't have an admin api like you describe below.

I added a feature request for this a little while back:

http://www.tracker.newdream.net/issues/2894
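
A rough sketch of the short-term approach described above, where the client stays generic and the monitor answers "help" requests itself. The command table, the reply strings, and the function names are all hypothetical; this is not the actual monitor protocol.

```python
# Hypothetical monitor-side command table; names and descriptions are illustrative only.
MON_COMMANDS = {
    "mon stat": "show monitor status",
    "osd dump": "dump OSD state",
    "osd getmap": "fetch the OSD map",
}


def monitor_handle(command):
    """Monitor side: run a command, or answer with usage for a trailing 'help'."""
    words = command.split()
    if words and words[-1] == "help":
        prefix = " ".join(words[:-1])
        # An empty prefix matches everything; otherwise match the exact
        # command or any command below that level.
        matches = sorted(
            name for name in MON_COMMANDS
            if not prefix or name == prefix or name.startswith(prefix + " ")
        )
        if not matches:
            return "no commands match: " + prefix
        return "\n".join(name + ": " + MON_COMMANDS[name] for name in matches)
    normalized = " ".join(words)
    if normalized in MON_COMMANDS:
        return "ok: " + normalized
    return "unknown command, try: help"


def generic_client(command, send=monitor_handle):
    """Client side: forward the string unchanged and return whatever comes back."""
    return send(command)
```

The client never inspects the command, so new monitor commands gain usage text without any client change.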

> If we were to redo that from scratch, I'd convert that to have some
> sort of API to monitors, and make the cli understand all the relevant
> things. Understandably, that can feel a little bit more rigid; to add
> a command means adding it to both the server and a client, whereas
> currently the client is very very generic.


* Re: Integration work
  2012-08-28 20:46   ` Tren Blackburn
@ 2012-08-29  7:06     ` Amon Ott
  0 siblings, 0 replies; 20+ messages in thread
From: Amon Ott @ 2012-08-29  7:06 UTC (permalink / raw)
  To: ceph-devel

On Tuesday 28 August 2012 you wrote:
> On Tue, Aug 28, 2012 at 11:51 AM, Dieter Kasper <d.kasper@kabelmail.de> wrote:
> > Hi Ross,
> >
> > focusing on core stability and feature expansion for RBD was the right
> > approach in the past and I feel you have reached an adequate maturity
> > level here.
> >
> > Performance enhancements - especially to reduce the latency of a single
> > IO / increase IOPS - and a stronger engagement on the CephFS client would
> > be very much appreciated. A stable and fast CephFS client would allow an
> > efficient integration with
> > - (clustered) NFS (v3 and v4)
> > - (clustered) Samba v4
>
> +1 to CephFS being worked on. Things like the multi-mds being improved
> upon would be amazing.

Stable CephFS is what we need, too - many concurrent write accesses from 
multiple clients with a real file system underneath. However, most stability 
problems we have had so far were crashes in the daemons, not the Linux 
kernel. What would be great for me is some solution to what we call the Domino 
effect - one daemon crashes, the next takes over, crashes at the same place 
(same data...), until the whole cluster is down. There will always be bugs, 
but they should not kill the whole cluster.

In our tests, single MDS was no real bottleneck, it was only lacking 
stability. I have not tested the newest releases, so it might be better now. 
Improved performance with many small files being written concurrently would 
be great, but CephFS has been getting significantly faster over the last year 
and performance is being worked on all the time.

Amon Ott
-- 
Dr. Amon Ott
m-privacy GmbH           Tel: +49 30 24342334
Am Köllnischen Park 1    Fax: +49 30 99296856
10179 Berlin             http://www.m-privacy.de

Amtsgericht Charlottenburg, HRB 84946

Managing Directors:
 Dipl.-Kfm. Holger Maczkowsky,
 Roman Maczkowsky

GnuPG-Key-ID: 0x2DD3A649

* Re: Integration work
  2012-08-28 18:12 Integration work Ross Turk
  2012-08-28 18:32 ` Plaetinck, Dieter
  2012-08-28 18:51 ` Dieter Kasper
@ 2012-08-29  8:20 ` Sylvain Munaut
  2012-08-29  9:53   ` Wido den Hollander
  2012-09-01  6:02 ` Ryan Nicholson
  3 siblings, 1 reply; 20+ messages in thread
From: Sylvain Munaut @ 2012-08-29  8:20 UTC (permalink / raw)
  To: Ross Turk; +Cc: ceph-devel@vger.kernel.org

Hi,

> How about Xen?

I vote for this :)

Using RBD storage for Xen VM images / disks is IMHO a very nice fit,
the same way people do with QEMU. This should even allow live
migration of VM.

Currently we have to rely on the RBD kernel driver which has some
downsides (no caching / need recent kernel to get latest ceph
patches). There also seem to be some weird interactions between RBD
and Xen that lead to significant performance hits that are not present
when using only RBD or only Xen.

One possibility would be to develop a blktap driver for xen to provide
block device backend in userspace using librbd rather than kernel mode
rbd.
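
To make the userspace idea concrete, here is a small sketch of the request translation such a backend would need: sector-addressed guest requests become byte-addressed image reads and writes. This is not the blktap API and it does not call librbd; `InMemoryImage` is a hypothetical stand-in for an RBD image, and `handle_request` is an illustrative name.

```python
SECTOR_SIZE = 512  # bytes per sector, the usual block-layer unit


class InMemoryImage:
    """Stand-in for an RBD image; a real backend would call into librbd here."""

    def __init__(self, size):
        self.data = bytearray(size)

    def _check(self, offset, length):
        # Reject requests that fall outside the image.
        if offset < 0 or length < 0 or offset + length > len(self.data):
            raise ValueError("request outside image bounds")

    def read(self, offset, length):
        self._check(offset, length)
        return bytes(self.data[offset:offset + length])

    def write(self, offset, payload):
        self._check(offset, len(payload))
        self.data[offset:offset + len(payload)] = payload
        return len(payload)


def handle_request(image, op, sector, payload=None, nr_sectors=0):
    """Translate one sector-addressed request into a byte-addressed image call."""
    offset = sector * SECTOR_SIZE
    if op == "read":
        return image.read(offset, nr_sectors * SECTOR_SIZE)
    if op == "write":
        if payload is None or len(payload) % SECTOR_SIZE != 0:
            raise ValueError("write payload must be a whole number of sectors")
        return image.write(offset, payload)
    raise ValueError("unsupported operation: " + op)
```

Keeping the image behind a two-method interface means the same dispatch code could sit in front of any backend that offers byte-addressed reads and writes.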

Cheers,

    Sylvain

* Re: Integration work
  2012-08-29  8:20 ` Sylvain Munaut
@ 2012-08-29  9:53   ` Wido den Hollander
  2012-08-29 12:35     ` Sylvain Munaut
  0 siblings, 1 reply; 20+ messages in thread
From: Wido den Hollander @ 2012-08-29  9:53 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Ross Turk, ceph-devel@vger.kernel.org

On 08/29/2012 10:20 AM, Sylvain Munaut wrote:
> Hi,
>
>> How about Xen?
>
> I vote for this :)
>
> Using RBD storage for Xen VM images / disks is IMHO a very nice fit,
> the same way people do with QEMU. This should even allow live
> migration of VM.
>

Correct me if I'm wrong, but when I was at Citrix in May this year 
somebody there told me that Xen was going 100% Qemu?

By going 100% Qemu they would also get RBD support.

Wido

> Currently we have to rely on the RBD kernel driver which has some
> downsides (no caching / need recent kernel to get latest ceph
> patches). There also seem to be some weird interactions between RBD
> and Xen that lead to significant performance hits that are not present
> when using only RBD or only Xen.
>
> One possibility would be to develop a blktap driver for xen to provide
> block device backend in userspace using librbd rather than kernel mode
> rbd.
>
> Cheers,
>
>      Sylvain

* Re: Integration work
  2012-08-29  9:53   ` Wido den Hollander
@ 2012-08-29 12:35     ` Sylvain Munaut
  2012-08-29 13:40       ` Wido den Hollander
  0 siblings, 1 reply; 20+ messages in thread
From: Sylvain Munaut @ 2012-08-29 12:35 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: Ross Turk, ceph-devel@vger.kernel.org

> Correct me if I'm wrong, but when I was at Citrix in May this year somebody
> there told me that Xen was going 100% Qemu?

Huh ... I've never heard this. Also the guys in ##xen haven't either.
I'm not really involved in xen dev and don't follow it closely but
that seems unlikely. The few slides I looked at from the Xen Summit a
couple days ago show that they really like their PV model.

AFAIK QEMU is only used for HVM guests to emulate the hw. And even for
those HVM guests, it's recommended to use PV drivers for performance,
which bypass the qemu layer altogether.

Cheers,

    Sylvain

* Re: Integration work
  2012-08-29 12:35     ` Sylvain Munaut
@ 2012-08-29 13:40       ` Wido den Hollander
  2012-08-29 13:43         ` Tommi Virtanen
  0 siblings, 1 reply; 20+ messages in thread
From: Wido den Hollander @ 2012-08-29 13:40 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Ross Turk, ceph-devel@vger.kernel.org

On 08/29/2012 02:35 PM, Sylvain Munaut wrote:
>> Correct me if I'm wrong, but when I was at Citrix in May this year somebody
>> there told me that Xen was going 100% Qemu?
>
> Huh ... I've never heard this. Also the guys in ##xen haven't either.
> I'm not really involved in xen dev and don't follow it closely but
> that seems unlikely. The few slides I looked at from the Xen Summit a
> couple days ago show that they really like their PV model.
>

I must be wrong then!

> AFAIK QEMU is only used for HVM guests to emulate the hw. And even for
> those HVM guests, it's recommended to use PV drivers for performance,
> which bypass the qemu layer altogether.
>

Not sure where this came from then, but in that case it would take work 
from Citrix to get RBD in Xen.

Wido

> Cheers,
>
>      Sylvain

* Re: Integration work
  2012-08-29 13:40       ` Wido den Hollander
@ 2012-08-29 13:43         ` Tommi Virtanen
  2012-08-29 15:19           ` Joseph Glanville
  2012-08-29 15:19           ` Joseph Glanville
  0 siblings, 2 replies; 20+ messages in thread
From: Tommi Virtanen @ 2012-08-29 13:43 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: Sylvain Munaut, Ross Turk, ceph-devel@vger.kernel.org

On Wed, Aug 29, 2012 at 9:40 AM, Wido den Hollander <wido@widodh.nl> wrote:
>> Huh ... I've never heard this. Also the guys in ##xen haven't either.
>> I'm not really involved in xen dev and don't follow it closely but
>> that seems unlikely. The few slides I looked at from the Xen Summit a
>> couple days ago show that they really like their PV model.
> I must be wrong then!

They are (at least, Red Hat is) looking at using more qemu for
xen-hvm. Whether that has any effect on the PV side, I wouldn't know.
It might make sense for them to use virtio even for PV, so they might
use qemu to implement the hypervisor side of virtio too, and that
would get you librbd support.

* Re: Integration work
  2012-08-29 13:43         ` Tommi Virtanen
@ 2012-08-29 15:19           ` Joseph Glanville
  2012-08-29 15:19           ` Joseph Glanville
  1 sibling, 0 replies; 20+ messages in thread
From: Joseph Glanville @ 2012-08-29 15:19 UTC (permalink / raw)
  To: Tommi Virtanen
  Cc: Wido den Hollander, Sylvain Munaut, Ross Turk,
	ceph-devel@vger.kernel.org, xen-devel

On 29 August 2012 23:43, Tommi Virtanen <tv@inktank.com> wrote:
> On Wed, Aug 29, 2012 at 9:40 AM, Wido den Hollander <wido@widodh.nl> wrote:
>>> Huh ... I've never heard this. Also the guys in ##xen haven't either.
>>> I'm not really involved in xen dev and don't follow it closely but
>>> that seems unlikely. The few slides I looked at from the Xen Summit a
>>> couple days ago show that they really like their PV model.
>> I must be wrong then!
>
> They are (at least, Red Hat is) looking at using more qemu for
> xen-hvm. Whether that has any effect on the PV side, I wouldn't know.
> It might make sense for them to use virtio even for PV, so they might
> use qemu to implement the hypervisor side of virtio too, and that
> would get you librbd support.

I don't think there is much going on in terms of increasing use of QEMU,
only that Xen can now use upstream QEMU rather than the Xen-specific
fork (qemu-xen-traditional).
There was a GSoC project to build a virtio front/backend for Xen but I
am not sure if this would be the way to go.
As far as I can see Xen dominates KVM in terms of network and I/O
performance on every benchmark, so apart from compatibility the gains
of using virtio don't seem that great... Xen's blkback/netback PV
system is just that much faster and more scalable with large numbers
of domains or 100k+ IOPS.

With regards to blktap: blktap2 is currently included in only a
handful of distros and is non-upstreamable.
blktap3, which is coming, will be fully userspace, but I have never been
a big fan of userspace block devices, YMMV.
That being said, building blktap devices is really easy (similar to tuntap).

Ideally improving the kernel RBD device would provide the best
performance across the board and the most compatibility (anything can
use a raw block device).

Joseph.

-- 
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846

* Re: Integration work
  2012-08-28 21:20       ` Josh Durgin
@ 2012-08-30 14:00         ` João Eduardo Luís
  0 siblings, 0 replies; 20+ messages in thread
From: João Eduardo Luís @ 2012-08-30 14:00 UTC (permalink / raw)
  To: Josh Durgin
  Cc: Tommi Virtanen, Florian Haas, ceph-devel@vger.kernel.org,
	Plaetinck, Dieter, Ross Turk

On 08/28/2012 10:20 PM, Josh Durgin wrote:
> On 08/28/2012 02:15 PM, Tommi Virtanen wrote:
>> On Tue, Aug 28, 2012 at 5:03 PM, Florian Haas <florian@hastexo.com>
>> wrote:
>>> I for my part, in the documentation space, would love for the admin
>>> tools to become self-documenting. For example, I would love a "help"
>>> subcommand at any level of the ceph shell, listing the supported
>>> subcommands in that level. As in "ceph help", "ceph mon help", "ceph osd
>>> getmap help".
>>>
>>> Even better, the ceph shell could support a general-purpose hook that
>>> bash-completion can use (kind of like "hg" does in Mercurial), and this
>>> and the above-conjectured help facility could arguably share quite a bit
>>> of code.
>>
>> I would love to see all of that. But, a lot of the "ceph" tool
>> functionality is implemented by shoveling strings in and out of the
>> monitors. It largely doesn't understand what's happening.
> 
> It doesn't need to understand what's happening to give basic usage info
> though - the monitors can provide that themselves in the short term
> while we don't have an admin api like you describe below.
> 
> I added a feature request for this a little while back:
> 
> http://www.tracker.newdream.net/issues/2894

I believe this is pretty straightforward to get done.
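
A minimal sketch of how the "help" subcommand and a bash-completion hook
could share code, assuming the supported commands are held in one nested
table. The table contents, the descriptions and the function names below are
hypothetical and are not taken from the actual ceph tool or the monitors.

```python
# Hypothetical command table: nested dicts are command levels, strings are
# one-line usage texts. None of this is taken from the real ceph tool.
COMMANDS = {
    "mon": {
        "stat": "show monitor status",
        "dump": "dump the monitor map",
    },
    "osd": {
        "getmap": "fetch the OSD map",
        "tree": "show the OSD tree",
    },
}


def resolve(tree, words):
    """Walk the table along the given words and return the node reached."""
    node = tree
    for word in words:
        if not isinstance(node, dict) or word not in node:
            raise KeyError(word)
        node = node[word]
    return node


def help_text(tree, words):
    """Answer '<words...> help': list subcommands, or the leaf's usage text."""
    node = resolve(tree, words)
    if isinstance(node, dict):
        return sorted(node)
    return [node]


def complete(tree, words, partial):
    """Completion hook: subcommands at this level that start with 'partial'."""
    node = resolve(tree, words)
    if not isinstance(node, dict):
        return []
    return sorted(name for name in node if name.startswith(partial))


if __name__ == "__main__":
    print(help_text(COMMANDS, []))               # like "ceph help"
    print(help_text(COMMANDS, ["osd", "getmap"]))  # like "ceph osd getmap help"
    print(complete(COMMANDS, ["osd"], "g"))      # completion for "ceph osd g<TAB>"
```

Both facilities only walk the same table, so whichever component owns the
table (the tool, or the monitors in the short term) is enough to serve both.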


-- 
João Eduardo Luís
gpg key: 477C26E5 from pool.keyserver.eu


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 554 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: Integration work
  2012-08-28 18:12 Integration work Ross Turk
                   ` (2 preceding siblings ...)
  2012-08-29  8:20 ` Sylvain Munaut
@ 2012-09-01  6:02 ` Ryan Nicholson
  2012-09-04 15:52   ` Tommi Virtanen
  3 siblings, 1 reply; 20+ messages in thread
From: Ryan Nicholson @ 2012-09-01  6:02 UTC (permalink / raw)
  To: Ross Turk, ceph-devel@vger.kernel.org

Ross, All:

I've read through several recommendations, and I'd like to add 2 to that list for consideration.

First: For my local project, I'm using rbd with Oracle VM and VM manager, mainly because of the other engineers' familiarity with the Oracle platforms, and they're certified by MS to run Windows on Xen (using Oracle's stuff).

Now, due to necessity, I'll be working on a Storage plugin that allows the Oracle VM to understand RBD, to make pools, etc. for our project. I would be interested to know if anyone else has actually started on their own version of the same.

Secondly: Through some trials, I've found that if you lose all of your monitors in a way that they also lose their disks, you basically lose your cluster. I would like to recommend a lower priority shift in design that allows for "recovery of the entire monitor set from data/snapshots automatically stored on the OSDs".

For example, a monitor boots:
	-keyring file and ceph.conf are available
	-monitor sees that it is missing its local copy of maps, etc.
	-goes onto the first OSD's it sees and pulls down a snapshot of the same
	-checks for another running monitor, syncs with it, if not,
	-boots at quorum 0, verifying OSD states
	-life continues.
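
The boot sequence above can be sketched as a small decision function. This
is only an illustration of the proposed flow, not existing monitor behaviour;
the function name, its parameters and the step strings are all hypothetical.

```python
def plan_monitor_boot(has_keyring_and_conf, has_local_maps,
                      osd_snapshot_available, peer_monitor_running):
    """Return the ordered steps a monitor would take under the proposal.

    Every input is a plain boolean; discovering these facts on a real
    cluster is outside the scope of this sketch.
    """
    if not has_keyring_and_conf:
        return ["abort: keyring file and ceph.conf are required"]

    steps = []
    if not has_local_maps:
        if not osd_snapshot_available:
            return ["abort: no local maps and no snapshot to restore from"]
        # Proposed new step: pull the stored snapshot down from an OSD.
        steps.append("restore maps from OSD snapshot")

    if peer_monitor_running:
        steps.append("sync with running monitor")
    else:
        steps.append("start a new quorum alone and verify OSD states")

    steps.append("resume normal operation")
    return steps


if __name__ == "__main__":
    # Worst case from the proposal: disks lost and no other monitor alive.
    for step in plan_monitor_boot(True, False, True, False):
        print(step)
```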

The big deal here is that while the entire cluster is able to recover from failures using one storage philosophy, the monitors are using an entirely different, and more legacy, storage philosophy: basically local RAID/power in numbers. Perhaps this has already been considered, and I would be interested in knowing what people think here, as well. Or perhaps I missed something and this is already done?

Thanks, for your time!

Ryan Nicholson


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Integration work
  2012-09-01  6:02 ` Ryan Nicholson
@ 2012-09-04 15:52   ` Tommi Virtanen
  0 siblings, 0 replies; 20+ messages in thread
From: Tommi Virtanen @ 2012-09-04 15:52 UTC (permalink / raw)
  To: Ryan Nicholson; +Cc: Ross Turk, ceph-devel@vger.kernel.org

On Fri, Aug 31, 2012 at 11:02 PM, Ryan Nicholson
<Ryan.Nicholson@kcrg.com> wrote:
> Secondly: Through some trials, I've found that if one loses all of his Monitors in a way that they also lose their disks, one basically loses their cluster. I would like to recommend a lower priority shift in design that allows for "recovery of the entire monitor set from data/snapshots automatically stored at the osd's".
>
> For example, a monitor boots:
>         -keyring file and ceph.conf are available
>         -monitor sees that it is missing its local copy of maps, etc.
>         -goes onto the first OSD's it sees and pulls down a snapshot of the same
>         -checks for another running monitor, syncs with it, if not,
>         -boots at quorum 0, verifying OSD states
>         -life continues.

Monitor fetching initial information from an OSD is full of
challenges. The monitor won't know what IP addresses and ports the
OSDs are on, the OSDs won't trust the monitor to talk to them, etc (it
lost its crypto keys, after all). It wouldn't even know which OSD to
talk to, and I highly doubt having the backup on every OSD would be a
good idea.

> The big deal here, is that while the entire cluster is able to recover from failures using one storage philosophy, the monitors are using an entirely different, and more legacy storage philosophy - basically local RAID/power in numbers. Perhaps this has already been considered, and I would be interested in knowing what people think here, as well. Or perhaps I missed something and this is already done?

That's why you run multiple monitors: they provide High Availability
to the monitor service as a whole. Losing all of your monitors at once
disrupts operation of the cluster. Losing all of their stable storage
really is disastrous. This is why you are supposed to deploy them in
different failure domains, e.g. in different rows or rooms.
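
As a rough illustration of why the number of monitors matters, here is a
small sketch of the arithmetic, assuming the monitors need a strict majority
of their members to form a quorum. The function names are made up for this
example.

```python
def quorum_size(monitors):
    """Smallest strict majority of a monitor set of the given size."""
    if monitors < 1:
        raise ValueError("need at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors):
    """How many monitors can be down while a majority is still up."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, "monitors: quorum", quorum_size(n),
              "tolerates", tolerated_failures(n), "down")
```

Under that assumption an even count buys nothing over the odd count below
it: four monitors tolerate one failure, exactly like three.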

If a monitor has its mon. keyring and ceph.conf, it should be able to
join an existing monitor cluster as a new member, no special-case
recovery needed.

I'm not sure what kind of architecture you have that makes losing all
of the monitor disks somehow likely, but perhaps you should just
take backups of their disks, with plain-old backup tools? Don't try to
store that backup in the same Ceph cluster, though. It would be
interesting to hear more about what you're thinking of, here.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2012-09-04 15:53 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-28 18:12 Integration work Ross Turk
2012-08-28 18:32 ` Plaetinck, Dieter
2012-08-28 21:03   ` Florian Haas
2012-08-28 21:15     ` Tommi Virtanen
2012-08-28 21:20       ` Josh Durgin
2012-08-30 14:00         ` João Eduardo Luís
2012-08-28 18:51 ` Dieter Kasper
2012-08-28 18:57   ` Smart Weblications GmbH - Florian Wiessner
2012-08-28 20:05     ` Dieter Kasper
2012-08-28 20:46   ` Tren Blackburn
2012-08-29  7:06     ` Amon Ott
2012-08-29  8:20 ` Sylvain Munaut
2012-08-29  9:53   ` Wido den Hollander
2012-08-29 12:35     ` Sylvain Munaut
2012-08-29 13:40       ` Wido den Hollander
2012-08-29 13:43         ` Tommi Virtanen
2012-08-29 15:19           ` Joseph Glanville
2012-08-29 15:19           ` Joseph Glanville
2012-09-01  6:02 ` Ryan Nicholson
2012-09-04 15:52   ` Tommi Virtanen
