the state of cephfs in giant


* the state of cephfs in giant
@ 2014-10-13 18:16 Sage Weil
  2014-10-13 18:20 ` Wido den Hollander
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Sage Weil @ 2014-10-13 18:16 UTC (permalink / raw)
  To: ceph-users-Qp0mS5GaXlQ, ceph-devel-u79uwXL29TY76Z2rM5mHXA

We've been doing a lot of work on CephFS over the past few months. This
is an update on the current state of things as of Giant.

What we've been working on:

* better mds/cephfs health reports to the monitor
* mds journal dump/repair tool
* many kernel and ceph-fuse/libcephfs client bug fixes
* file size recovery improvements
* client session management fixes (and tests)
* admin socket commands for diagnosis and admin intervention
* many bug fixes

We started using CephFS to back the teuthology (QA) infrastructure in the
lab about three months ago. We fixed a bunch of stuff over the first
month or two (several kernel bugs, a few MDS bugs). We've had no problems
for the last month or so. We're currently running 0.86 (giant release
candidate) with a single MDS and ~70 OSDs. Clients are running a 3.16
kernel plus several fixes that went into 3.17.

With Giant, we are at a point where we would ask that everyone try
things out for any non-production workloads. We are very interested in
feedback around stability, usability, feature gaps, and performance. We
recommend:

* Single active MDS. You can run any number of standby MDS's, but we are
  not focusing on multi-mds bugs just yet (and our existing multimds test
  suite is already hitting several).
* No snapshots. These are disabled by default and require a scary admin
  command to enable them. Although these mostly work, there are
  several known issues that we haven't addressed and they complicate
  things immensely. Please avoid them for now.
* Either the kernel client (kernel 3.17 or later) or a userspace client
  (ceph-fuse or libcephfs). Both are in good working order.
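
The kernel-client recommendation above can be checked mechanically before
mounting. What follows is a minimal Python sketch, not part of Ceph: the
helper name kernel_at_least and the way it parses the release string are
assumptions made for illustration, and it cannot see fixes that a distro
has backported into an older kernel.

```python
import platform
import re

MIN_KERNEL = (3, 17)  # minimum recommended for the CephFS kernel client in this post


def kernel_at_least(release, minimum=MIN_KERNEL):
    """Return True if a release string such as '3.16.0-4-amd64' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # Compare (major, minor) as a tuple so that 3.9 sorts below 3.17.
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    release = platform.release()
    verdict = "ok" if kernel_at_least(release) else "older than recommended"
    print(f"kernel {release}: {verdict}")
```

Only the major and minor numbers are compared, since the recommendation
is stated as "3.17 or later".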

The key missing feature right now is fsck (both check and repair). This is 
*the* development focus for Hammer.

Here's a more detailed rundown of the status of various features:

* multi-mds: implemented. limited test coverage. several known issues.
  use only for non-production workloads and expect some stability
  issues that could lead to data loss.

* snapshots: implemented. limited test coverage. several known issues.
  use only for non-production workloads and expect some stability issues
  that could lead to data loss.

* hard links: stable. no known issues, but there is somewhat limited
  test coverage (we don't test creating huge link farms).

* direct io: implemented and tested for kernel client. no special
  support for ceph-fuse (the kernel fuse driver handles this).

* xattrs: implemented, stable, tested. no known issues (for both kernel
  and userspace clients).

* ACLs: implemented, tested for kernel client. not implemented for
  ceph-fuse.

* file locking (fcntl, flock): supported and tested for kernel client.
  limited test coverage. one known minor issue for kernel with fix
  pending. implementation in progress for ceph-fuse/libcephfs.

* kernel fscache support: implemented. no test coverage. used in
  production by adfin.

* hadoop bindings: implemented, limited test coverage. a few known
  issues.

* samba VFS integration: implemented, limited test coverage.

* ganesha NFS integration: implemented, no test coverage.

* kernel NFS reexport: implemented. limited test coverage. no known
  issues.

Anybody who has experienced bugs in the past should be excited by:

* new MDS admin socket commands to look at pending operations and client 
  session states. (Check them out with "ceph daemon mds.a help"!) These 
  will make diagnosing, debugging, and even fixing issues a lot simpler.

* the cephfs_journal_tool, which is capable of manipulating mds journal 
  state without doing difficult exports/imports and using hexedit.
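
For scripting the admin socket, a thin wrapper around the command quoted
above may be enough. This is a hedged sketch: the function names are
hypothetical, the daemon name "mds.a" is just the example from this
message, and only the "help" subcommand is taken from the text. Any other
subcommand should be confirmed against the "help" output of your own MDS.

```python
import subprocess


def mds_admin_argv(mds_name, *command):
    """Build the argv for `ceph daemon <mds_name> <command...>`."""
    if not command:
        raise ValueError("an admin socket command is required")
    return ["ceph", "daemon", mds_name, *command]


def mds_admin(mds_name, *command):
    """Run the command and return its output as text (raises on failure)."""
    result = subprocess.run(
        mds_admin_argv(mds_name, *command),
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout


# Example (needs access to the MDS admin socket, so run it on the MDS host):
#   print(mds_admin("mds.a", "help"))
```

Keeping the argv construction separate from the subprocess call makes
the wrapper easy to test without a running cluster.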

Thanks!
sage

* Re: the state of cephfs in giant
  2014-10-13 18:16 the state of cephfs in giant Sage Weil
@ 2014-10-13 18:20 ` Wido den Hollander
  2014-10-13 18:26   ` Sage Weil
  2014-10-13 19:03 ` [ceph-users] " Eric Eastman
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Wido den Hollander @ 2014-10-13 18:20 UTC (permalink / raw)
  To: Sage Weil, ceph-users, ceph-devel

On 13-10-14 20:16, Sage Weil wrote:
> We've been doing a lot of work on CephFS over the past few months. This
> is an update on the current state of things as of Giant.
> 
> What we've been working on:
> 
> * better mds/cephfs health reports to the monitor
> * mds journal dump/repair tool
> * many kernel and ceph-fuse/libcephfs client bug fixes
> * file size recovery improvements
> * client session management fixes (and tests)
> * admin socket commands for diagnosis and admin intervention
> * many bug fixes
> 
> We started using CephFS to back the teuthology (QA) infrastructure in the
> lab about three months ago. We fixed a bunch of stuff over the first
> month or two (several kernel bugs, a few MDS bugs). We've had no problems
> for the last month or so. We're currently running 0.86 (giant release
> candidate) with a single MDS and ~70 OSDs. Clients are running a 3.16
> kernel plus several fixes that went into 3.17.
> 
> 
> With Giant, we are at a point where we would ask that everyone try
> things out for any non-production workloads. We are very interested in
> feedback around stability, usability, feature gaps, and performance. We
> recommend:
> 

A question to clarify this for anybody out there. Do you think it is
safe to run CephFS on a cluster which is doing production RBD/RGW I/O?

Will it be the MDS/CephFS part which breaks, or are there potential issues
due to OSD classes which might cause OSDs to crash due to bugs in CephFS?

I know you can't fully rule it out, but it would be useful to have this
clarified.

> * Single active MDS. You can run any number of standby MDS's, but we are
>   not focusing on multi-mds bugs just yet (and our existing multimds test
>   suite is already hitting several).
> * No snapshots. These are disabled by default and require a scary admin
>   command to enable them. Although these mostly work, there are
>   several known issues that we haven't addressed and they complicate
>   things immensely. Please avoid them for now.
> * Either the kernel client (kernel 3.17 or later) or userspace (ceph-fuse
>   or libcephfs) clients are in good working order.
> 
> The key missing feature right now is fsck (both check and repair). This is 
> *the* development focus for Hammer.
> 
> 
> Here's a more detailed rundown of the status of various features:
> 
> * multi-mds: implemented. limited test coverage. several known issues.
>   use only for non-production workloads and expect some stability
>   issues that could lead to data loss.
> 
> * snapshots: implemented. limited test coverage. several known issues.
>   use only for non-production workloads and expect some stability issues
>   that could lead to data loss.
> 
> * hard links: stable. no known issues, but there is somewhat limited
>   test coverage (we don't test creating huge link farms).
> 
> * direct io: implemented and tested for kernel client. no special
>   support for ceph-fuse (the kernel fuse driver handles this).
> 
> * xattrs: implemented, stable, tested. no known issues (for both kernel
>   and userspace clients).
> 
> * ACLs: implemented, tested for kernel client. not implemented for
>   ceph-fuse.
> 
> * file locking (fcntl, flock): supported and tested for kernel client.
>   limited test coverage. one known minor issue for kernel with fix
>   pending. implementation in progress for ceph-fuse/libcephfs.
> 
> * kernel fscache support: implemented. no test coverage. used in
>   production by adfin.
> 
> * hadoop bindings: implemented, limited test coverage. a few known
>   issues.
> 
> * samba VFS integration: implemented, limited test coverage.
> 
> * ganesha NFS integration: implemented, no test coverage.
> 
> * kernel NFS reexport: implemented. limited test coverage. no known
>   issues.
> 
> 
> Anybody who has experienced bugs in the past should be excited by:
> 
> * new MDS admin socket commands to look at pending operations and client 
>   session states. (Check them out with "ceph daemon mds.a help"!) These 
>   will make diagnosing, debugging, and even fixing issues a lot simpler.
> 
> * the cephfs_journal_tool, which is capable of manipulating mds journal 
>   state without doing difficult exports/imports and using hexedit.
> 
> Thanks!
> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on

* Re: the state of cephfs in giant
  2014-10-13 18:20 ` Wido den Hollander
@ 2014-10-13 18:26   ` Sage Weil
  0 siblings, 0 replies; 20+ messages in thread
From: Sage Weil @ 2014-10-13 18:26 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: ceph-users, ceph-devel

On Mon, 13 Oct 2014, Wido den Hollander wrote:
> On 13-10-14 20:16, Sage Weil wrote:
> > With Giant, we are at a point where we would ask that everyone try
> > things out for any non-production workloads. We are very interested in
> > feedback around stability, usability, feature gaps, and performance. We
> > recommend:
> 
> A question to clarify this for anybody out there. Do you think it is
> safe to run CephFS on a cluster which is doing production RBD/RGW I/O?
> 
> Will it be the MDS/CephFS part which breaks, or are there potential issues
> due to OSD classes which might cause OSDs to crash due to bugs in CephFS?
> 
> I know you can't fully rule it out, but it would be useful to have this
> clarified.

I can't think of any issues that this would cause with the OSDs.  CephFS 
isn't using any rados classes; just core rados functionality that RGW also 
uses.

On the monitor side, there is a reasonable probability of triggering a 
CephFS-related health warning.  There is also the potential for the code in 
MDSMonitor.cc to crash the mon, but I don't think we've seen any 
problems there recently.

So, probably safe.

sage

* Re: [ceph-users] the state of cephfs in giant
  2014-10-13 18:16 the state of cephfs in giant Sage Weil
  2014-10-13 18:20 ` Wido den Hollander
@ 2014-10-13 19:03 ` Eric Eastman
  2014-10-13 20:56   ` Sage Weil
  2014-10-14  7:31 ` Amon Ott
       [not found] ` <alpine.DEB.2.00.1410131114130.10561-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
  3 siblings, 1 reply; 20+ messages in thread
From: Eric Eastman @ 2014-10-13 19:03 UTC (permalink / raw)
  To: sage, ceph-users, ceph-devel

I would be interested in testing the Samba VFS and Ganesha NFS
integration with CephFS.  Are there any notes on how to configure these
two interfaces with CephFS?

Eric

> We've been doing a lot of work on CephFS over the past few months. This
> is an update on the current state of things as of Giant.
> ...
> * samba VFS integration: implemented, limited test coverage.
> * ganesha NFS integration: implemented, no test coverage.
> ...
> Thanks!
> sage

* Re: [ceph-users] the state of cephfs in giant
  2014-10-13 19:03 ` [ceph-users] " Eric Eastman
@ 2014-10-13 20:56   ` Sage Weil
  0 siblings, 0 replies; 20+ messages in thread
From: Sage Weil @ 2014-10-13 20:56 UTC (permalink / raw)
  To: Eric Eastman; +Cc: ceph-users, ceph-devel, matt

On Mon, 13 Oct 2014, Eric Eastman wrote:
> I would be interested in testing the Samba VFS and Ganesha NFS integration
> with CephFS.  Are there any notes on how to configure these two interfaces
> with CephFS?

For samba, based on 
https://github.com/ceph/ceph-qa-suite/blob/master/tasks/samba.py#L106
I think you need something like

[myshare]
path = /
writeable = yes
vfs objects = ceph
ceph:config_file = /etc/ceph/ceph.conf

Not sure what the ganesha config looks like.  Matt and the other folks at 
cohortfs would know more.
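
If the share stanza above is generated or checked by a script, something
like the following Python sketch could do it. The option names and values
are the ones from the stanza. The helper names render_share and
uses_ceph_vfs are hypothetical, and a real smb.conf would also need its
usual [global] section, which is not shown here.

```python
import configparser


def render_share(name, options):
    """Render one smb.conf share section from a dict of options."""
    lines = [f"[{name}]"]
    lines.extend(f"{key} = {value}" for key, value in options.items())
    return "\n".join(lines) + "\n"


def uses_ceph_vfs(smb_conf_text, share):
    """Return True if the share lists 'ceph' among its 'vfs objects'."""
    # Only '=' is a delimiter, so keys such as 'ceph:config_file' stay whole.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(smb_conf_text)
    return "ceph" in parser.get(share, "vfs objects", fallback="").split()


share_text = render_share("myshare", {
    "path": "/",
    "writeable": "yes",
    "vfs objects": "ceph",
    "ceph:config_file": "/etc/ceph/ceph.conf",
})
```

Restricting the parser to "=" matters because the default configparser
delimiters include ":", which would split the "ceph:config_file" key.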

sage

* Re: the state of cephfs in giant
  2014-10-13 18:16 the state of cephfs in giant Sage Weil
  2014-10-13 18:20 ` Wido den Hollander
  2014-10-13 19:03 ` [ceph-users] " Eric Eastman
@ 2014-10-14  7:31 ` Amon Ott
  2014-10-14 13:09   ` Sage Weil
  2014-10-14 14:23   ` [ceph-users] " Sage Weil
       [not found] ` <alpine.DEB.2.00.1410131114130.10561-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
  3 siblings, 2 replies; 20+ messages in thread
From: Amon Ott @ 2014-10-14  7:31 UTC (permalink / raw)
  To: Sage Weil, ceph-devel, ceph-users

Am 13.10.2014 20:16, schrieb Sage Weil:
> We've been doing a lot of work on CephFS over the past few months. This
> is an update on the current state of things as of Giant.
...
> * Either the kernel client (kernel 3.17 or later) or userspace (ceph-fuse
>   or libcephfs) clients are in good working order.

Thanks for all the work and especially for concentrating on CephFS! We
have been watching and testing for years now and really hope to
move our clusters to CephFS soon.

For kernel maintenance reasons, we only want to run longterm stable
kernels. And for performance reasons and because of severe known
problems we want to avoid Fuse. How good are our chances of a stable
system with the kernel client in the latest longterm kernel 3.14? Will
there be further bugfixes or feature backports?

Thanks again,

Amon Ott
-- 
Dr. Amon Ott
m-privacy GmbH           Tel: +49 30 24342334
Werner-Voß-Damm 62       Fax: +49 30 99296856
12101 Berlin             http://www.m-privacy.de

Amtsgericht Charlottenburg, HRB 84946

Geschäftsführer:
 Dipl.-Kfm. Holger Maczkowsky,
 Roman Maczkowsky

GnuPG-Key-ID: 0x2DD3A649

* Re: the state of cephfs in giant
  2014-10-14  7:31 ` Amon Ott
@ 2014-10-14 13:09   ` Sage Weil
  2014-10-14 14:23   ` [ceph-users] " Sage Weil
  1 sibling, 0 replies; 20+ messages in thread
From: Sage Weil @ 2014-10-14 13:09 UTC (permalink / raw)
  To: Amon Ott; +Cc: ceph-devel, ceph-users

On Tue, 14 Oct 2014, Amon Ott wrote:
> Am 13.10.2014 20:16, schrieb Sage Weil:
> > We've been doing a lot of work on CephFS over the past few months. This
> > is an update on the current state of things as of Giant.
> ...
> > * Either the kernel client (kernel 3.17 or later) or userspace (ceph-fuse
> >   or libcephfs) clients are in good working order.
> 
> Thanks for all the work and specially for concentrating on CephFS! We
> have been watching and testing for years by now and really hope to
> change our Clusters to CephFS soon.
> 
> For kernel maintenance reasons, we only want to run longterm stable
> kernels. And for performance reasons and because of severe known
> problems we want to avoid Fuse. How good are our chances of a stable
> system with the kernel client in the latest longterm kernel 3.14? Will
> there be further bugfixes or feature backports?

We haven't been backporting CephFS bug fixes to the stable kernels the 
same way we've been doing RBD bugs; it's a bit of a chore.  This can be 
done retroactively but no promises.  Probably 3.14 makes the most sense.  
The RHEL7/CentOS7 kernel is also a likely target.

sage

* Re: [ceph-users] the state of cephfs in giant
  2014-10-14  7:31 ` Amon Ott
  2014-10-14 13:09   ` Sage Weil
@ 2014-10-14 14:23   ` Sage Weil
  2014-10-15  0:16     ` Alphe Salas
       [not found]     ` <alpine.DEB.2.00.1410140718050.10462-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
  1 sibling, 2 replies; 20+ messages in thread
From: Sage Weil @ 2014-10-14 14:23 UTC (permalink / raw)
  To: Amon Ott; +Cc: ceph-devel, ceph-users

On Tue, 14 Oct 2014, Amon Ott wrote:
> Am 13.10.2014 20:16, schrieb Sage Weil:
> > We've been doing a lot of work on CephFS over the past few months. This
> > is an update on the current state of things as of Giant.
> ...
> > * Either the kernel client (kernel 3.17 or later) or userspace (ceph-fuse
> >   or libcephfs) clients are in good working order.
> 
> Thanks for all the work and specially for concentrating on CephFS! We
> have been watching and testing for years by now and really hope to
> change our Clusters to CephFS soon.
> 
> For kernel maintenance reasons, we only want to run longterm stable
> kernels. And for performance reasons and because of severe known
> problems we want to avoid Fuse. How good are our chances of a stable
> system with the kernel client in the latest longterm kernel 3.14? Will
> there be further bugfixes or feature backports?

There are important bug fixes missing from 3.14.  IIRC, the EC, cache 
tiering, and firefly CRUSH changes aren't there yet either (they landed in 
3.15), and feature changes like those are not appropriate for a stable series.

The bug fixes can be backported, but no commitment yet on that :)

sage

* Re: [ceph-users] the state of cephfs in giant
  2014-10-14 14:23   ` [ceph-users] " Sage Weil
@ 2014-10-15  0:16     ` Alphe Salas
       [not found]       ` <543DBCE9.2080605-g2h0fw6BmCNmR6Xm/wNWPw@public.gmane.org>
       [not found]     ` <alpine.DEB.2.00.1410140718050.10462-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
  1 sibling, 1 reply; 20+ messages in thread
From: Alphe Salas @ 2014-10-15  0:16 UTC (permalink / raw)
  To: Sage Weil, Amon Ott; +Cc: ceph-devel, ceph-users

Hello Sage, the last time I used CephFS it behaved strangely: when it was
used in conjunction with an NFS re-export of the CephFS mount point, parts
of the directory tree randomly disappeared.

According to people on the mailing list it was a kernel module bug (I was not
using ceph-fuse). Do you know if any work has been done recently on that
topic?

best regards

Alphe Salas
I.T. engineer

On 10/14/2014 11:23 AM, Sage Weil wrote:
> On Tue, 14 Oct 2014, Amon Ott wrote:
>> Am 13.10.2014 20:16, schrieb Sage Weil:
>>> We've been doing a lot of work on CephFS over the past few months. This
>>> is an update on the current state of things as of Giant.
>> ...
>>> * Either the kernel client (kernel 3.17 or later) or userspace (ceph-fuse
>>>    or libcephfs) clients are in good working order.
>>
>> Thanks for all the work and specially for concentrating on CephFS! We
>> have been watching and testing for years by now and really hope to
>> change our Clusters to CephFS soon.
>>
>> For kernel maintenance reasons, we only want to run longterm stable
>> kernels. And for performance reasons and because of severe known
>> problems we want to avoid Fuse. How good are our chances of a stable
>> system with the kernel client in the latest longterm kernel 3.14? Will
>> there be further bugfixes or feature backports?
>
> There are important bug fixes missing from 3.14.  IIRC, the EC, cache
> tiering, and firefly CRUSH changes aren't there yet either (they landed in
> 3.15), and that is not appropriate for a stable series.
>
> They can be backported, but no commitment yet on that :)
>
> sage

[parent not found: <543DBCE9.2080605-g2h0fw6BmCNmR6Xm/wNWPw@public.gmane.org>]

* Re: the state of cephfs in giant
       [not found]       ` <543DBCE9.2080605-g2h0fw6BmCNmR6Xm/wNWPw@public.gmane.org>
@ 2014-10-15  2:06         ` Sage Weil
  0 siblings, 0 replies; 20+ messages in thread
From: Sage Weil @ 2014-10-15  2:06 UTC (permalink / raw)
  To: Alphe Salas
  Cc: ceph-devel-u79uwXL29TY76Z2rM5mHXA, Amon Ott,
	ceph-users-Qp0mS5GaXlQ

This sounds like any number of readdir bugs that Zheng has fixed over the 
last 6 months.

sage


On Tue, 14 Oct 2014, Alphe Salas wrote:

> Hello sage, last time I used CephFS it had a strange behaviour when if used in
> conjunction with a nfs reshare of the cephfs mount point, I experienced a
> partial random disapearance of the tree folders.
> 
> According to people in the mailing list it was a kernel module bug (not using
> ceph-fuse) do you know if any work has been done recently in that topic?
> 
> best regards
> 
> Alphe Salas
> I.T ingeneer
> 
> On 10/14/2014 11:23 AM, Sage Weil wrote:
> > On Tue, 14 Oct 2014, Amon Ott wrote:
> > > Am 13.10.2014 20:16, schrieb Sage Weil:
> > > > We've been doing a lot of work on CephFS over the past few months. This
> > > > is an update on the current state of things as of Giant.
> > > ...
> > > > * Either the kernel client (kernel 3.17 or later) or userspace
> > > > (ceph-fuse
> > > >    or libcephfs) clients are in good working order.
> > > 
> > > Thanks for all the work and specially for concentrating on CephFS! We
> > > have been watching and testing for years by now and really hope to
> > > change our Clusters to CephFS soon.
> > > 
> > > For kernel maintenance reasons, we only want to run longterm stable
> > > kernels. And for performance reasons and because of severe known
> > > problems we want to avoid Fuse. How good are our chances of a stable
> > > system with the kernel client in the latest longterm kernel 3.14? Will
> > > there be further bugfixes or feature backports?
> > 
> > There are important bug fixes missing from 3.14.  IIRC, the EC, cache
> > tiering, and firefly CRUSH changes aren't there yet either (they landed in
> > 3.15), and that is not appropriate for a stable series.
> > 
> > They can be backported, but no commitment yet on that :)
> > 
> > sage
> _______________________________________________
> ceph-users mailing list
> ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 

[parent not found: <alpine.DEB.2.00.1410140718050.10462-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>]

* Re: the state of cephfs in giant
       [not found]     ` <alpine.DEB.2.00.1410140718050.10462-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
@ 2014-10-15  6:43       ` Amon Ott
  2014-10-15 12:11         ` [ceph-users] " Ric Wheeler
  0 siblings, 1 reply; 20+ messages in thread
From: Amon Ott @ 2014-10-15  6:43 UTC (permalink / raw)
  To: Sage Weil, Amon Ott
  Cc: ceph-devel-u79uwXL29TY76Z2rM5mHXA, ceph-users-Qp0mS5GaXlQ

Am 14.10.2014 16:23, schrieb Sage Weil:
> On Tue, 14 Oct 2014, Amon Ott wrote:
>> Am 13.10.2014 20:16, schrieb Sage Weil:
>>> We've been doing a lot of work on CephFS over the past few months. This
>>> is an update on the current state of things as of Giant.
>> ...
>>> * Either the kernel client (kernel 3.17 or later) or userspace (ceph-fuse
>>>   or libcephfs) clients are in good working order.
>>
>> Thanks for all the work and specially for concentrating on CephFS! We
>> have been watching and testing for years by now and really hope to
>> change our Clusters to CephFS soon.
>>
>> For kernel maintenance reasons, we only want to run longterm stable
>> kernels. And for performance reasons and because of severe known
>> problems we want to avoid Fuse. How good are our chances of a stable
>> system with the kernel client in the latest longterm kernel 3.14? Will
>> there be further bugfixes or feature backports?
> 
> There are important bug fixes missing from 3.14.  IIRC, the EC, cache 
> tiering, and firefly CRUSH changes aren't there yet either (they landed in 
> 3.15), and that is not appropriate for a stable series.
> 
> They can be backported, but no commitment yet on that :)

If the bugfixes are easily identified in one of your Ceph git branches,
I would even try to backport them myself. Still, I would rather see
someone from the Ceph team with deeper knowledge of the code port them.

IMHO, it would be good for Ceph to have stable support in at least the
latest longterm kernel. No need for new features, but bugfixes should be
there.

Amon Ott
-- 
Dr. Amon Ott
m-privacy GmbH           Tel: +49 30 24342334
Werner-Voß-Damm 62       Fax: +49 30 99296856
12101 Berlin             http://www.m-privacy.de

Amtsgericht Charlottenburg, HRB 84946

Geschäftsführer:
 Dipl.-Kfm. Holger Maczkowsky,
 Roman Maczkowsky

GnuPG-Key-ID: 0x2DD3A649

* Re: [ceph-users] the state of cephfs in giant
  2014-10-15  6:43       ` Amon Ott
@ 2014-10-15 12:11         ` Ric Wheeler
       [not found]           ` <543E645E.4080405-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Ric Wheeler @ 2014-10-15 12:11 UTC (permalink / raw)
  To: Amon Ott, Sage Weil; +Cc: ceph-devel, ceph-users

On 10/15/2014 08:43 AM, Amon Ott wrote:
> Am 14.10.2014 16:23, schrieb Sage Weil:
>> On Tue, 14 Oct 2014, Amon Ott wrote:
>>> Am 13.10.2014 20:16, schrieb Sage Weil:
>>>> We've been doing a lot of work on CephFS over the past few months. This
>>>> is an update on the current state of things as of Giant.
>>> ...
>>>> * Either the kernel client (kernel 3.17 or later) or userspace (ceph-fuse
>>>>    or libcephfs) clients are in good working order.
>>> Thanks for all the work and specially for concentrating on CephFS! We
>>> have been watching and testing for years by now and really hope to
>>> change our Clusters to CephFS soon.
>>>
>>> For kernel maintenance reasons, we only want to run longterm stable
>>> kernels. And for performance reasons and because of severe known
>>> problems we want to avoid Fuse. How good are our chances of a stable
>>> system with the kernel client in the latest longterm kernel 3.14? Will
>>> there be further bugfixes or feature backports?
>> There are important bug fixes missing from 3.14.  IIRC, the EC, cache
>> tiering, and firefly CRUSH changes aren't there yet either (they landed in
>> 3.15), and that is not appropriate for a stable series.
>>
>> They can be backported, but no commitment yet on that :)
> If the bugfixes are easily identified in one of your Ceph git branches,
> I would even try to backport them myself. Still, I would rather see
> someone from the Ceph team with deeper knowledge of the code port them.
>
> IMHO, it would be good for Ceph to have stable support in at least the
> latest longterm kernel. No need for new features, but bugfixes should be
> there.
>
> Amon Ott

Long-term support and aggressive, tedious backports are what you normally go to 
distro vendors for. I don't think that it is generally good practice to 
continually backport anything to stable series kernels that is not a 
bugfix/security issue (or else the stable branches rapidly become just a stale 
version of the upstream tip :)).

Not meant as a commercial for RH, other vendors also do this kind of thing of 
course...

Regards,

Ric


[parent not found: <543E645E.4080405-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]

* Re: the state of cephfs in giant
       [not found]           ` <543E645E.4080405-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2014-10-15 13:13             ` Amon Ott
  2014-10-15 14:58               ` [ceph-users] " Sage Weil
  0 siblings, 1 reply; 20+ messages in thread
From: Amon Ott @ 2014-10-15 13:13 UTC (permalink / raw)
  To: Ric Wheeler, Sage Weil
  Cc: ceph-devel-u79uwXL29TY76Z2rM5mHXA, ceph-users-Qp0mS5GaXlQ

Am 15.10.2014 14:11, schrieb Ric Wheeler:
> On 10/15/2014 08:43 AM, Amon Ott wrote:
>> Am 14.10.2014 16:23, schrieb Sage Weil:
>>> On Tue, 14 Oct 2014, Amon Ott wrote:
>>>> Am 13.10.2014 20:16, schrieb Sage Weil:
>>>>> We've been doing a lot of work on CephFS over the past few months.
>>>>> This
>>>>> is an update on the current state of things as of Giant.
>>>> ...
>>>>> * Either the kernel client (kernel 3.17 or later) or userspace
>>>>> (ceph-fuse
>>>>>    or libcephfs) clients are in good working order.
>>>> Thanks for all the work and specially for concentrating on CephFS! We
>>>> have been watching and testing for years by now and really hope to
>>>> change our Clusters to CephFS soon.
>>>>
>>>> For kernel maintenance reasons, we only want to run longterm stable
>>>> kernels. And for performance reasons and because of severe known
>>>> problems we want to avoid Fuse. How good are our chances of a stable
>>>> system with the kernel client in the latest longterm kernel 3.14? Will
>>>> there be further bugfixes or feature backports?
>>> There are important bug fixes missing from 3.14.  IIRC, the EC, cache
>>> tiering, and firefly CRUSH changes aren't there yet either (they
>>> landed in
>>> 3.15), and that is not appropriate for a stable series.
>>>
>>> They can be backported, but no commitment yet on that :)
>> If the bugfixes are easily identified in one of your Ceph git branches,
>> I would even try to backport them myself. Still, I would rather see
>> someone from the Ceph team with deeper knowledge of the code port them.
>>
>> IMHO, it would be good for Ceph to have stable support in at least the
>> latest longterm kernel. No need for new features, but bugfixes should be
>> there.
>>
>> Amon Ott
> 
> Long term support and aggressive, tedious backports are what you go to
> distro vendors for normally - I don't think that it is generally a good
> practice to continually backport anything to stable series kernels that
> is not a bugfix/security issue (or else, the stable branches rapidly
> just a stale version of the upstream tip :)).

bugfix/security is exactly what I am looking for.

Amon Ott
-- 
Dr. Amon Ott
m-privacy GmbH           Tel: +49 30 24342334
Werner-Voß-Damm 62       Fax: +49 30 99296856
12101 Berlin             http://www.m-privacy.de

Amtsgericht Charlottenburg, HRB 84946

Geschäftsführer:
 Dipl.-Kfm. Holger Maczkowsky,
 Roman Maczkowsky

GnuPG-Key-ID: 0x2DD3A649

* Re: [ceph-users] the state of cephfs in giant
  2014-10-15 13:13             ` Amon Ott
@ 2014-10-15 14:58               ` Sage Weil
       [not found]                 ` <alpine.DEB.2.00.1410150754560.10462-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Sage Weil @ 2014-10-15 14:58 UTC (permalink / raw)
  To: Amon Ott; +Cc: Ric Wheeler, ceph-devel, ceph-users

On Wed, 15 Oct 2014, Amon Ott wrote:
> Am 15.10.2014 14:11, schrieb Ric Wheeler:
> > Long term support and aggressive, tedious backports are what you go to
> > distro vendors for normally - I don't think that it is generally a good
> > practice to continually backport anything to stable series kernels that
> > is not a bugfix/security issue (or else, the stable branches rapidly
> > become just a stale version of the upstream tip :)).
> 
> bugfix/security is exactly what I am looking for.

Right; sorry if I was unclear.  We make a point of sending bug fixes to 
stable@vger.kernel.org but haven't been aggressive with cephfs because 
the code is less stable.  There will be catch-up required to get 3.14 in 
good working order.

Definitely hear you that this is important, just can't promise when we'll 
have the time to do it.  There's probably a half day's effort to pick out 
the right patches and make sure they build properly, and then some time to 
feed it through the test suite.
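
As a quick way to spot clients that fall below the recommended kernel, a
sketch along these lines could be used. The 3.17 threshold is taken from the
original announcement; the function names are made up for illustration, and a
plain version comparison cannot see fixes that a distro has backported, so
treat the result as a hint only.

```python
import platform
import re

# Minimum kernel recommended for the kernel client in the announcement.
MIN_KERNEL = (3, 17)


def parse_release(release):
    """Extract (major, minor) from a kernel release string such as
    '3.16.0-4-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_client_recommended(release=None):
    """Return True if the release is at or above MIN_KERNEL.
    Defaults to the kernel of the machine running the script."""
    if release is None:
        release = platform.release()
    return parse_release(release) >= MIN_KERNEL
```

For example, kernel_client_recommended("3.14.22") is False, which matches the
point above that 3.14 needs catch-up work first.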

sage


* Re: the state of cephfs in giant
       [not found]                 ` <alpine.DEB.2.00.1410150754560.10462-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
@ 2014-10-15 16:47                   ` Alphe Salas
  0 siblings, 0 replies; 20+ messages in thread
From: Alphe Salas @ 2014-10-15 16:47 UTC (permalink / raw)
  To: Sage Weil, Amon Ott
  Cc: ceph-devel-u79uwXL29TY76Z2rM5mHXA, ceph-users-Qp0mS5GaXlQ

As a humble ceph user, I find it really hard to follow which version of
which product will get the changes I require.

Let me explain. I use ceph at my company, which specialises in disk
recovery. We need a flexible, easy to maintain, trustworthy way to store
the data from our clients' disks.

We tried the usual way: JBOD boxes connected to a single server with a SAS
RAID card, and a ZFS mirror to handle replicas and merge the disks into one
big disk. The result was really slow. (We used to run ZFS on Solaris 11 on
x86 servers. With OpenZFS on Ubuntu 14.04 the performance is much better,
but nowhere near comparable with ceph: on a gigabit Ethernet LAN, transfers
between a client and the ceph cluster reach around 80 MB/s, while client to
OpenZFS/Ubuntu is around 25 MB/s.)

On my path with ceph I first used cephfs. It worked fine, until I noticed
that parts of the folder tree would suddenly and randomly disappear, which
forced me to remount the partitions periodically.

Then I chose to forget about cephfs and use rbd images. That worked fine,
until I noticed that rbd replicas were never freed or overwritten. With the
replica count set to 2 (data and 1 replica) and a 13 TB image, after some
time of write/erase cycles on the same rbd image I reached an overall data
use of 34 TB out of the 36 TB available on my cluster, so there was a real
problem with "space management". The data part of the rbd image was properly
managed, with old deleted data being overwritten at the OS level, so the
only logical explanation for the growth in overall data use was that the
replicas were never freed.
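
The arithmetic behind that observation can be written out as a rough check.
It assumes that raw usage should be at most the image size times the number
of copies, and it ignores journal and filesystem overhead; only the four
input numbers come from the message.

```python
image_tb = 13      # size of the rbd image
copies = 2         # "data and 1 replica"
observed_tb = 34   # reported overall data use
capacity_tb = 36   # reported cluster capacity

# Upper bound on raw usage if every block of the image is written once
# and stored `copies` times.
expected_max_tb = image_tb * copies

# How far the reported usage exceeds that bound.
excess_tb = observed_tb - expected_max_tb

print("expected at most %d TB, observed %d TB, excess %d TB"
      % (expected_max_tb, observed_tb, excess_tb))
print("cluster fill ratio: %.2f" % (observed_tb / capacity_tb))
```

Under that assumption the ceiling is 26 TB, so the reported 34 TB is 8 TB
more than the image alone should be able to consume.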

All that time I was keeping an eye on the bugs, features and advances of
ceph.
But those issues are not really ceph issues; they are in the kernel modules
used by the "ceph clients". So new features and bug fixes are delivered
partly in the ceph-common package (for the server-related mechanics) and
partly at the kernel level.

For convenience I use Ubuntu, which is not exactly quick to ship the very
latest kernel with all the bug-fixed modules.

So when I see this great news about giant, and that a lot of work has been
done to solve most of the problems we all faced with ceph, I also notice
that it will take around a year or so for those fixes to be available for
production in Ubuntu. There is some inertia there that doesn't match the
pace of the work on ceph.

People may then ask me, "why do you use Ubuntu?"
The answer is simple: I have a cluster of 10 machines and 1 proxy. If I had
to compile the latest ceph and the latest kernel from source, my maintenance
time would be much greater, and I would be more likely to get something
wrong and end up with a machine that doesn't reboot.
I know what I am talking about. For several months I ran ceph on Arch Linux,
compiling the kernel and ceph from source, until the gcc installed on my
test server became too new: a compile option had been removed and ceph no
longer compiled. That approach was discarded because it was not stable
enough to deliver production-level quality.

So, as far as I understand things, I will get both the enhanced cephfs and
rbd discard support by combining ceph giant with linux kernel 3.18 and up?

Regards, and thank you again for your hard work. I wish I could do more to
help.

---
Alphe Salas
I.T. engineer


* Re: the state of cephfs in giant
       [not found] ` <alpine.DEB.2.00.1410131114130.10561-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
@ 2014-10-14  9:57   ` Thomas Lemarchand
  2014-10-14 13:11     ` [ceph-users] " Sage Weil
  2014-10-30 10:55   ` Florian Haas
  1 sibling, 1 reply; 20+ messages in thread
From: Thomas Lemarchand @ 2014-10-14  9:57 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel-u79uwXL29TY76Z2rM5mHXA, ceph-users-Qp0mS5GaXlQ

Thanks for this information.

I plan to use CephFS on Giant, with production workload, knowing the
risks and having a hot backup near. I hope to be able to provide useful
feedback.

My cluster is made of 7 servers (3mon, 3osd (27 osd inside), 1mds). I
use ceph-fuse on clients.

You wrote about hardlinks, but what about symlinks ? I use some (on
cephFS firefly) without any problem for now.

Do you suggest something for backup of CephFS ? For now I use a simple
rsync, it works quite well.

Thanks !

-- 
Thomas Lemarchand
Cloud Solutions SAS - Responsable des systèmes d'information




_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


* Re: [ceph-users] the state of cephfs in giant
  2014-10-14  9:57   ` Thomas Lemarchand
@ 2014-10-14 13:11     ` Sage Weil
  0 siblings, 0 replies; 20+ messages in thread
From: Sage Weil @ 2014-10-14 13:11 UTC (permalink / raw)
  To: Thomas Lemarchand; +Cc: ceph-users, ceph-devel

On Tue, 14 Oct 2014, Thomas Lemarchand wrote:
> Thanks for theses informations.
> 
> I plan to use CephFS on Giant, with production workload, knowing the
> risks and having a hot backup near. I hope to be able to provide useful
> feedback.
> 
> My cluster is made of 7 servers (3mon, 3osd (27 osd inside), 1mds). I
> use ceph-fuse on clients.

Cool!  Please be careful, and have a plan B.  :)

> You wrote about hardlinks, but what about symlinks ? I use some (on
> cephFS firefly) without any problem for now.

Symlinks are simple and cheap; no issues there.

> Do you suggest something for backup of CephFS ? For now I use a simple
> rsync, it works quite well.

rsync is fine.  There is some opportunity to do clever things with the 
recursive ctime metadata, but nobody has wired it up to any tools yet.
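
A rough sketch of what such a tool might look like: walk the tree, but skip
any subtree whose recursive ctime is not newer than the last backup. This
assumes the client exposes the value as the virtual xattr "ceph.dir.rctime"
in a "<seconds>.<nanoseconds>" text form; the function names are invented
for illustration, and the reader is injectable so the pruning logic can be
tried on any filesystem. As far as I know the recursive stats propagate up
the tree lazily, so a real tool would want some safety margin on `since`.

```python
import os


def read_rctime(path):
    """Return the recursive ctime of a CephFS directory in whole seconds.

    Assumption: the value is available as the virtual xattr
    'ceph.dir.rctime' and starts with the seconds followed by a dot.
    """
    raw = os.getxattr(path, "ceph.dir.rctime").decode()
    return int(raw.split(".")[0])


def changed_dirs(root, since, rctime=read_rctime):
    """Yield directories under root whose subtree changed after `since`.

    A directory whose rctime is <= since is skipped together with
    everything below it, which is where the savings over a full walk
    would come from.
    """
    stack = [root]
    while stack:
        directory = stack.pop()
        if rctime(directory) <= since:
            continue  # nothing newer anywhere in this subtree
        yield directory
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
```

The directories it yields could then be handed to whatever copy tool is
already in use, one directory at a time, instead of rescanning everything.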

sage



* Re: the state of cephfs in giant
       [not found] ` <alpine.DEB.2.00.1410131114130.10561-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
  2014-10-14  9:57   ` Thomas Lemarchand
@ 2014-10-30 10:55   ` Florian Haas
  2014-10-30 14:36     ` [ceph-users] " John Spray
       [not found]     ` <CAPUexz_+jD7RMNSZEgy3h6WqKS4PSMj1fbyRgLKxQWHvctviNA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 2 replies; 20+ messages in thread
From: Florian Haas @ 2014-10-30 10:55 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, ceph-users

Hi Sage,

sorry to be late to this thread; I just caught this one as I was
reviewing the Giant release notes. A few questions below:

On Mon, Oct 13, 2014 at 8:16 PM, Sage Weil <sage@newdream.net> wrote:
> [...]
> * ACLs: implemented, tested for kernel client. not implemented for
>   ceph-fuse.
> [...]
> * samba VFS integration: implemented, limited test coverage.

ACLs are kind of a must-have feature for most Samba admins. The Samba
Ceph VFS builds on userspace libcephfs directly, neither the kernel
client nor ceph-fuse, so I'm trying to understand whether ACLs are
available to Samba users or not. Can you clarify please?

> * ganesha NFS integration: implemented, no test coverage.

I understood from a conversation I had with John in London that
flock() and fcntl() support had recently been added to ceph-fuse, can
this be expected to Just Work™ in Ganesha as well?

Also, can you make a general statement as to the stability of flock()
and fcntl() support in the kernel client and in libcephfs/ceph-fuse?
This too is particularly interesting for Samba admins who rely on
byte-range locking for Samba CTDB support.

> * kernel NFS reexport: implemented. limited test coverage. no known
>   issues.

In this scenario, is there any specific magic that the kernel client
does to avoid producing deadlocks under memory pressure? Or are you
referring to FUSE-mounted CephFS reexported via kernel NFS?

Cheers,
Florian

* Re: [ceph-users] the state of cephfs in giant
  2014-10-30 10:55   ` Florian Haas
@ 2014-10-30 14:36     ` John Spray
       [not found]     ` <CAPUexz_+jD7RMNSZEgy3h6WqKS4PSMj1fbyRgLKxQWHvctviNA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 0 replies; 20+ messages in thread
From: John Spray @ 2014-10-30 14:36 UTC (permalink / raw)
  To: Florian Haas; +Cc: Sage Weil, ceph-devel@vger.kernel.org, ceph-users

On Thu, Oct 30, 2014 at 10:55 AM, Florian Haas <florian@hastexo.com> wrote:
>> * ganesha NFS integration: implemented, no test coverage.
>
> I understood from a conversation I had with John in London that
> flock() and fcntl() support had recently been added to ceph-fuse, can
> this be expected to Just Work™ in Ganesha as well?

To clarify this comment: flock in ceph-fuse was recently implemented
(by Yan Zheng) in *master* rather than giant, so it's in line for
hammer.

Cheers,
John
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: the state of cephfs in giant
       [not found]     ` <CAPUexz_+jD7RMNSZEgy3h6WqKS4PSMj1fbyRgLKxQWHvctviNA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2014-10-30 15:28       ` Sage Weil
  0 siblings, 0 replies; 20+ messages in thread
From: Sage Weil @ 2014-10-30 15:28 UTC (permalink / raw)
  To: Florian Haas
  Cc: ceph-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, ceph-users

On Thu, 30 Oct 2014, Florian Haas wrote:
> Hi Sage,
> 
> sorry to be late to this thread; I just caught this one as I was
> reviewing the Giant release notes. A few questions below:
> 
> On Mon, Oct 13, 2014 at 8:16 PM, Sage Weil <sage-BnTBU8nroG7k1uMJSBkQmQ@public.gmane.org> wrote:
> > [...]
> > * ACLs: implemented, tested for kernel client. not implemented for
> >   ceph-fuse.
> > [...]
> > * samba VFS integration: implemented, limited test coverage.
> 
> ACLs are kind of a must-have feature for most Samba admins. The Samba
> Ceph VFS builds on userspace libcephfs directly, neither the kernel
> client nor ceph-fuse, so I'm trying to understand whether ACLs are
> available to Samba users or not. Can you clarify please?

I believe that with the current integration, Samba is doing all of the 
ACLs and storing them as xattrs.  They will work for CIFS users, but won't 
be coherent for users accessing the same file system directly via the 
kernel cephfs client or NFS or some other means.

This is a general problem with NFS vs CIFS.  The richacl project built a 
coherent ACL structure that captures both NFS4 and windows ACLs but it has 
not made it into the mainline kernel.  :/

> > * ganesha NFS integration: implemented, no test coverage.
> 
> I understood from a conversation I had with John in London that
> flock() and fcntl() support had recently been added to ceph-fuse, can
> this be expected to Just Work™ in Ganesha as well?

It probably could without much trouble, but I don't think it has been 
wired up.  This is probably a pretty simple matter...

> Also, can you make a general statement as to the stability of flock()
> and fcntl() support in the kernel client and in libcephfs/ceph-fuse?
> This too is particularly interesting for Samba admins who rely on
> byte-range locking for Samba CTDB support.

Zheng fixed a bug or two with the existing kernel and MDS support when he 
did the ceph-fuse/libcephfs implementation.  At this point there are no 
known issues.  I would not expect problems, but will of course be very 
interested to hear bug reports.
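
For anyone who wants to exercise byte-range locking on their own mount
before reporting, a minimal smoke test could look like the sketch below. It
is not part of any ceph test suite; the function names and the file name are
made up, and it only checks that one process's exclusive fcntl lock on a
byte range is honoured by a second process on the same client.

```python
import fcntl
import os


def try_byte_range_lock(path, start, length):
    """Open path and try an exclusive, non-blocking POSIX lock on the bytes
    [start, start + length). Returns the open fd, or None if refused."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start,
                    os.SEEK_SET)
    except OSError:
        os.close(fd)
        return None
    return fd


def child_can_lock(path, start, length):
    """Fork a child and report whether it could lock the range. A second
    process is needed because POSIX locks never conflict with locks held
    by the same process."""
    pid = os.fork()
    if pid == 0:
        code = 1
        try:
            if try_byte_range_lock(path, start, length) is not None:
                code = 0
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


def lock_smoke_test(directory):
    """Hold bytes 0-99 in this process, then probe an overlapping and a
    disjoint range from a child.
    Returns (overlapping_granted, disjoint_granted)."""
    path = os.path.join(directory, "lock-smoke-test")
    fd = try_byte_range_lock(path, 0, 100)
    if fd is None:
        raise RuntimeError("could not take the initial lock")
    try:
        return (child_can_lock(path, 50, 100),
                child_can_lock(path, 100, 100))
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling lock_smoke_test() on a directory inside the mount should return
(False, True) if byte-range locks behave; anything else would be worth a bug
report. Testing across two different clients, which is what CTDB relies on,
would need the holder and the prober to run on separate hosts.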

> > * kernel NFS reexport: implemented. limited test coverage. no known
> >   issues.
> 
> In this scenario, is there any specific magic that the kernel client
> does to avoid producing deadlocks under memory pressure? Or are you
> referring to FUSE-mounted CephFS reexported via kernel NFS?

I'm not aware of any memory deadlock issues with NFS reexport.  Unless the 
ceph daemons are running on the same host as the client/exporter... but 
that is not specific to NFS.

Hope that helps!
sage


Thread overview: 20+ messages
2014-10-13 18:16 the state of cephfs in giant Sage Weil
2014-10-13 18:20 ` Wido den Hollander
2014-10-13 18:26   ` Sage Weil
2014-10-13 19:03 ` [ceph-users] " Eric Eastman
2014-10-13 20:56   ` Sage Weil
2014-10-14  7:31 ` Amon Ott
2014-10-14 13:09   ` Sage Weil
2014-10-14 14:23   ` [ceph-users] " Sage Weil
2014-10-15  0:16     ` Alphe Salas
     [not found]       ` <543DBCE9.2080605-g2h0fw6BmCNmR6Xm/wNWPw@public.gmane.org>
2014-10-15  2:06         ` Sage Weil
     [not found]     ` <alpine.DEB.2.00.1410140718050.10462-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
2014-10-15  6:43       ` Amon Ott
2014-10-15 12:11         ` [ceph-users] " Ric Wheeler
     [not found]           ` <543E645E.4080405-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2014-10-15 13:13             ` Amon Ott
2014-10-15 14:58               ` [ceph-users] " Sage Weil
     [not found]                 ` <alpine.DEB.2.00.1410150754560.10462-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
2014-10-15 16:47                   ` Alphe Salas
     [not found] ` <alpine.DEB.2.00.1410131114130.10561-vIokxiIdD2AQNTJnQDzGJqxOck334EZe@public.gmane.org>
2014-10-14  9:57   ` Thomas Lemarchand
2014-10-14 13:11     ` [ceph-users] " Sage Weil
2014-10-30 10:55   ` Florian Haas
2014-10-30 14:36     ` [ceph-users] " John Spray
     [not found]     ` <CAPUexz_+jD7RMNSZEgy3h6WqKS4PSMj1fbyRgLKxQWHvctviNA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2014-10-30 15:28       ` Sage Weil
