* [GIT PULL] Ceph distributed file system client for 2.6.33
From: Sage Weil @ 2009-12-07 23:25 UTC
To: torvalds; +Cc: linux-kernel, linux-fsdevel

Hi Linus,

Please pull from the 'master' branch of

  git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git master

to receive the Ceph distributed file system client.  The fs has made a
half dozen rounds on linux-fsdevel, and has been in linux-next for the
last month or so.  Although review has been sparse, Andrew said the code
looks reasonable for 2.6.33.

The git tree includes the full patchset posted in October and incremental
changes since then.  I've tried to cram in all the anticipated protocol
changes, but the file system is still strictly EXPERIMENTAL and is marked
as such.  Merging now will attract new eyes and make it easier to test and
evaluate the system (both the client and server side).

Basic features include:

 * High availability and reliability.  No single points of failure.
 * Strong data and metadata consistency between clients
 * N-way replication of all data across storage nodes
 * Seamless scaling from 1 to potentially many thousands of nodes
 * Fast recovery from node failures
 * Automatic rebalancing of data on node addition/removal
 * Easy deployment: most FS components are userspace daemons

More info on Ceph at

  http://ceph.newdream.net/

Thanks-
sage

Julia Lawall (2):
      fs/ceph: introduce missing kfree
      fs/ceph: Move a dereference below a NULL test

Noah Watkins (3):
      ceph: replace list_entry with container_of
      ceph: remove redundant use of le32_to_cpu
      ceph: fix intra strip unit length calculation

Sage Weil (93):
      ceph: documentation
      ceph: on-wire types
      ceph: client types
      ceph: ref counted buffer
      ceph: super.c
      ceph: inode operations
      ceph: directory operations
      ceph: file operations
      ceph: address space operations
      ceph: MDS client
      ceph: OSD client
      ceph: CRUSH mapping algorithm
      ceph: monitor client
      ceph: capability management
      ceph: snapshot management
      ceph: messenger library
      ceph: message pools
      ceph: nfs re-export support
      ceph: ioctls
      ceph: debugfs
      ceph: Kconfig, Makefile
      ceph: document shared files in README
      ceph: show meaningful version on module load
      ceph: include preferred_osd in file layout virtual xattr
      ceph: gracefully avoid empty crush buckets
      ceph: fix mdsmap decoding when multiple mds's are present
      ceph: renew mon subscription before it expires
      ceph: fix osd request submission race
      ceph: revoke osd request message on request completion
      ceph: fail gracefully on corrupt osdmap (bad pg_temp mapping)
      ceph: reset osd session on fault, not peer_reset
      ceph: cancel osd requests before resending them
      ceph: update to mon client protocol v15
      ceph: add file layout validation
      ceph: ignore trailing data in monamp
      ceph: remove unused CEPH_MSG_{OSD,MDS}_GETMAP
      ceph: add version field to message header
      ceph: convert encode/decode macros to inlines
      ceph: initialize sb->s_bdi, bdi_unregister after kill_anon_super
      ceph: move generic flushing code into helper
      ceph: flush dirty caps via the cap_dirty list
      ceph: correct subscribe_ack msgpool payload size
      ceph: warn on allocation from msgpool with larger front_len
      ceph: move dirty caps code around
      ceph: enable readahead
      ceph: include preferred osd in placement seed
      ceph: v0.17 of client
      ceph: move directory size logic to ceph_getattr
      ceph: remove small mon addr limit; use CEPH_MAX_MON where appropriate
      ceph: reduce parse_mount_args stack usage
      ceph: silence uninitialized variable warning
      ceph: fix, clean up string mount arg parsing
      ceph: allocate and parse mount args before client instance
      ceph: correct comment to match striping calculation
      ceph: fix object striping calculation for non-default striping schemes
      ceph: fix uninitialized err variable
      crush: always return a value from crush_bucket_choose
      ceph: init/destroy bdi in client create/destroy helpers
      ceph: use fixed endian encoding for ceph_entity_addr
      ceph: fix endian conversions for ceph_pg
      ceph: fix sparse endian warning
      ceph: convert port endianness
      ceph: clean up 'osd%d down' console msg
      ceph: make CRUSH hash functions non-inline
      ceph: use strong hash function for mapping objects to pgs
      ceph: make object hash a pg_pool property
      ceph: make CRUSH hash function a bucket property
      ceph: do not confuse stale and dead (unreconnected) caps
      ceph: separate banner and connect during handshake into distinct stages
      ceph: remove recon_gen logic
      ceph: exclude snapdir from readdir results
      ceph: initialize i_size/i_rbytes on snapdir
      ceph: pr_info when mds reconnect completes
      ceph: build cleanly without CONFIG_DEBUG_FS
      ceph: fix page invalidation deadlock
      ceph: remove bad calls to ceph_con_shutdown
      ceph: remove unnecessary ceph_con_shutdown
      ceph: handle errors during osd client init
      ceph: negotiate authentication protocol; implement AUTH_NONE protocol
      ceph: move mempool creation to ceph_create_client
      ceph: small cleanup in hash function
      ceph: fix debugfs entry, simplify fsid checks
      ceph: decode updated mdsmap format
      ceph: reset requested max_size after mds reconnect
      ceph: reset msgr backoff during open, not after successful handshake
      ceph: remove dead code
      ceph: remove useless IS_ERR checks
      ceph: plug leak of request_mutex
      ceph: whitespace cleanup
      ceph: hide /.ceph from readdir results
      ceph: allow preferred osd to be get/set via layout ioctl
      ceph: update MAINTAINERS entry with correct git URL
      ceph: mark v0.18 release

Yehuda Sadeh (1):
      ceph: mount fails immediately on error

----
 Documentation/filesystems/ceph.txt   |  139 ++
 Documentation/ioctl/ioctl-number.txt |    1 +
 MAINTAINERS                          |    9 +
 fs/Kconfig                           |    1 +
 fs/Makefile                          |    1 +
 fs/ceph/Kconfig                      |   26 +
 fs/ceph/Makefile                     |   37 +
 fs/ceph/README                       |   20 +
 fs/ceph/addr.c                       | 1115 +++++++++++++
 fs/ceph/auth.c                       |  225 +++
 fs/ceph/auth.h                       |   77 +
 fs/ceph/auth_none.c                  |  120 ++
 fs/ceph/auth_none.h                  |   28 +
 fs/ceph/buffer.c                     |   34 +
 fs/ceph/buffer.h                     |   55 +
 fs/ceph/caps.c                       | 2863 ++++++++++++++++++++++++++++++++
 fs/ceph/ceph_debug.h                 |   37 +
 fs/ceph/ceph_frag.c                  |   21 +
 fs/ceph/ceph_frag.h                  |  109 ++
 fs/ceph/ceph_fs.c                    |   74 +
 fs/ceph/ceph_fs.h                    |  648 ++++++++
 fs/ceph/ceph_hash.c                  |  118 ++
 fs/ceph/ceph_hash.h                  |   13 +
 fs/ceph/ceph_strings.c               |  176 ++
 fs/ceph/crush/crush.c                |  151 ++
 fs/ceph/crush/crush.h                |  180 ++
 fs/ceph/crush/hash.c                 |  149 ++
 fs/ceph/crush/hash.h                 |   17 +
 fs/ceph/crush/mapper.c               |  596 +++++++
 fs/ceph/crush/mapper.h               |   20 +
 fs/ceph/debugfs.c                    |  450 +++++
 fs/ceph/decode.h                     |  159 ++
 fs/ceph/dir.c                        | 1222 ++++++++++++++
 fs/ceph/export.c                     |  223 +++
 fs/ceph/file.c                       |  904 +++++++++++
 fs/ceph/inode.c                      | 1624 +++++++++++++++++++
 fs/ceph/ioctl.c                      |  160 ++
 fs/ceph/ioctl.h                      |   40 +
 fs/ceph/mds_client.c                 | 2976 ++++++++++++++++++++++++++++++++++
 fs/ceph/mds_client.h                 |  327 ++++
 fs/ceph/mdsmap.c                     |  170 ++
 fs/ceph/mdsmap.h                     |   54 +
 fs/ceph/messenger.c                  | 2103 ++++++++++++++++++++++++
 fs/ceph/messenger.h                  |  253 +++
 fs/ceph/mon_client.c                 |  751 +++++++++
 fs/ceph/mon_client.h                 |  115 ++
 fs/ceph/msgpool.c                    |  181 ++
 fs/ceph/msgpool.h                    |   27 +
 fs/ceph/msgr.h                       |  167 ++
 fs/ceph/osd_client.c                 | 1364 ++++++++++++++++
 fs/ceph/osd_client.h                 |  150 ++
 fs/ceph/osdmap.c                     |  916 +++++++++++
 fs/ceph/osdmap.h                     |  124 ++
 fs/ceph/rados.h                      |  370 +++++
 fs/ceph/snap.c                       |  887 ++++++++++
 fs/ceph/super.c                      |  984 +++++++++++
 fs/ceph/super.h                      |  895 ++++++++++
 fs/ceph/types.h                      |   29 +
 fs/ceph/xattr.c                      |  842 ++++++++++
 59 files changed, 25527 insertions(+), 0 deletions(-)
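[Several of the patches above (the CRUSH mapping algorithm, the strong
object-to-pg hash, the placement seed changes) revolve around hash-based
data placement, which is what makes the "no single points of failure" and
"automatic rebalancing" bullets possible: any client can compute where an
object lives rather than asking a central lookup server.  The sketch below
illustrates the idea with plain rendezvous (highest-random-weight) hashing.
It is emphatically not Ceph's actual CRUSH code, which adds weights and
hierarchical buckets; every name in it is invented for illustration.]

/*
 * Illustrative sketch only -- NOT Ceph's CRUSH implementation.
 * Rendezvous (highest-random-weight) hashing: every client computes an
 * object's location from the object name and the node count alone, with
 * no central lookup table.  Adding or removing a node moves only ~1/n
 * of the objects.  CRUSH generalizes this with weighted, hierarchical
 * buckets.
 */
#include <stdint.h>
#include <stdio.h>

/* FNV-1a mixed with the node id; any decent mixing hash would do. */
static uint64_t mix(const char *key, uint32_t node)
{
	uint64_t h = 14695981039346656037ULL;

	for (; *key; key++)
		h = (h ^ (uint8_t)*key) * 1099511628211ULL;
	h ^= node;
	return h * 1099511628211ULL;
}

/* Place the object on whichever node scores highest for it. */
static uint32_t place(const char *object, uint32_t num_nodes)
{
	uint32_t best_node = 0;
	uint64_t best_score = 0;

	for (uint32_t n = 0; n < num_nodes; n++) {
		uint64_t score = mix(object, n);

		if (score > best_score) {
			best_score = score;
			best_node = n;
		}
	}
	return best_node;
}

int main(void)
{
	printf("obj 'foo.0001' -> osd%u of 8\n", place("foo.0001", 8));
	printf("obj 'foo.0001' -> osd%u of 9\n", place("foo.0001", 9));
	return 0;
}

[Going from 8 to 9 nodes changes the answer only for objects whose score on
the new node now wins, roughly 1/9 of them, which is the behavior behind
the rebalancing bullet.]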
* Re: [GIT PULL] Ceph distributed file system client for 2.6.33
From: Sage Weil @ 2009-12-18 20:54 UTC
To: torvalds; +Cc: akpm, linux-kernel, linux-fsdevel

Hi Linus,

I would still like to see ceph merged for 2.6.33.  It's certainly not
production ready, but it would be greatly beneficial to be in mainline for
the same reasons other file systems like btrfs and exofs were merged
early.

Is there more information you'd like to see from me before pulling?  If
there was a reason you decided not to pull, please let me know.

Thanks-
sage

On Mon, 7 Dec 2009, Sage Weil wrote:
> Hi Linus,
>
> Please pull from the 'master' branch of
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git master
>
> to receive the Ceph distributed file system client.
>
> [rest of the original pull request, patch list, and diffstat quoted in
> full above; trimmed here]
* Re: [GIT PULL] Ceph distributed file system client for 2.6.33
From: Linus Torvalds @ 2009-12-18 21:38 UTC
To: Sage Weil, Gregory Haskins
Cc: Andrew Morton, Linux Kernel Mailing List, linux-fsdevel

On Fri, 18 Dec 2009, Sage Weil wrote:
>
> I would still like to see ceph merged for 2.6.33.  It's certainly not
> production ready, but it would be greatly beneficial to be in mainline for
> the same reasons other file systems like btrfs and exofs were merged
> early.

So what happened to ceph is the same thing that happened to the alacrityvm
pull request (Greg Haskins added to cc): I pretty much continually had a
_lot_ of pull requests, and all the time the priority of the ceph and
alacrityvm pull requests was just low enough on my list that I never felt
I had reason to look into the background enough to make even a half-assed
decision of whether to pull or not.

And no, "just pull" is not my default answer - if I don't have a reason,
the default action is "don't pull".

I used to say that "my job is to say 'no'", although I've been so good at
farming out submaintainers that most of the time my real job is to pull
from submaintainers who hopefully know how to say 'no'.  But when it comes
to whole new driver features, I'm still "no by default - tell me _why_ I
should pull".

So what is a new subsystem person to do?

The best thing to do is to have users that are vocal about the feature and
talk about how great it is.  Some advocates for it, in other words.  Just
a few other people saying "hey, I use this, it's great" is actually a big
deal to me.  For alacrityvm and cephfs, I didn't have that, or they just
weren't loud enough for me to hear.

So since you mentioned btrfs as an "early merge", I'll mention it too, as
a great example of how something got merged early because it had easily
gotten past my "people are asking for it" filter, to the point where _I_
was interested in trying it out personally, and asked Chris & co to tell
me when it was ready.

Ok, so that was somewhat unusual - I'm not suggesting you need to drum up
quite _that_ much hype - but it kind of illustrates the opposite extreme
of your issue.  Get some PR going, get people talking about it, get people
testing it out.  Get people outside of your area saying "hey, I use it,
and I hate having to merge it every release".

Then, when I see a pull request during the merge window, the pull suddenly
has a much higher priority, and I go "Ok, I know people are using this".

So no astroturfing, but real grass-roots support really does help (or
top-down feedback for that matter - if a _distribution_ says "we're going
to merge this in our distro regardless", that also counts as a big hint to
me that people actually expect to use it and would like to not go through
the pain of merging).

		Linus
* Re: [GIT PULL] Ceph distributed file system client for 2.6.33
From: Jim Garlick @ 2009-12-18 23:15 UTC
To: Linus Torvalds
Cc: Sage Weil, Gregory Haskins, Andrew Morton, Linux Kernel Mailing List, linux-fsdevel

On Fri, Dec 18, 2009 at 01:38:00PM -0800, Linus Torvalds wrote:
> On Fri, 18 Dec 2009, Sage Weil wrote:
> >
> > I would still like to see ceph merged for 2.6.33.  It's certainly not
> > production ready, but it would be greatly beneficial to be in mainline for
> > the same reasons other file systems like btrfs and exofs were merged
> > early.
>
> The best thing to do is to have users that are vocal about the
> feature and talk about how great it is.  Some advocates for it, in other
> words.  Just a few other people saying "hey, I use this, it's great" is
> actually a big deal to me.  For alacrityvm and cephfs, I didn't have that,
> or they just weren't loud enough for me to hear.

FWIW: I'd like to see it go in.

Ceph is new and experimental, so you're not going to see production shops
like ours jumping up and down saying we use it and are tired of merging
it, the way we would if Lustre were (again) on the table.

However, I will say Ceph looks good, and in the interest of nurturing
future options, I'm for merging it!

Jim Garlick
Lawrence Livermore National Laboratory
* Re: [GIT PULL] Ceph distributed file system client for 2.6.33
From: Andi Kleen @ 2009-12-19 11:01 UTC
To: Jim Garlick
Cc: Linus Torvalds, Sage Weil, Gregory Haskins, Andrew Morton, Linux Kernel Mailing List, linux-fsdevel, greg

Jim Garlick <garlick@llnl.gov> writes:
>
> Ceph is new and experimental, so you're not going to see production shops

One issue with ceph is that I'm not sure it has any users at all.  The
mailing list seems to be pretty much dead?

On a philosophical level, I agree that network file systems are definitely
an area that could use some more improvement.

> like ours jumping up and down saying we use it and are tired of merging
> it, the way we would if Lustre were (again) on the table.

OT, but I took a look at a Lustre srpm a few months ago and it didn't seem
to still require all the horrible VFS patches that the older versions were
plagued with (or perhaps I missed them).  Because it definitely seems to
have a large real-world user base, perhaps it would at least be something
for staging these days?

-Andi

--
ak@linux.intel.com -- Speaking for myself only.
* Re: [GIT PULL] Ceph distributed file system client for 2.6.33
From: Sage Weil @ 2009-12-21 16:42 UTC
To: Andi Kleen
Cc: Jim Garlick, Linus Torvalds, Gregory Haskins, Andrew Morton, Linux Kernel Mailing List, linux-fsdevel, greg

On Sat, 19 Dec 2009, Andi Kleen wrote:
> Jim Garlick <garlick@llnl.gov> writes:
> >
> > Ceph is new and experimental, so you're not going to see production shops
>
> One issue with ceph is that I'm not sure it has any users at all.
> The mailing list seems to be pretty much dead?
>
> On a philosophical level, I agree that network file systems are
> definitely an area that could use some more improvement.

The list is slow.  The developers all work in the same office, so most of
the technical discussion ends up face to face (we're working on moving
more of it to the list).  I also tend to send users actively testing it to
the irc channel.

That said, there aren't many active users.  I see lots of interested
people lurking on the list and 'waiting for stability,' but I think the
prospect of testing an unstable cluster fs is much more daunting than
testing an unstable local one.

If you want stability, then it's probably too early to merge.  If you want
active users, that essentially hinges on stability too.  But if it's
interest in/demand for an alternative distributed fs, then the sooner it's
merged the better.

From my point of view, merging now will be a bit rockier with coordinating
releases, bug fixes, and dealing with any unforeseen client-side changes,
but I think it'll be worth it.  OTOH, another release cycle will bring
greater stability and better first impressions.

sage
* Re: [GIT PULL] Ceph distributed file system client for 2.6.33
From: Josef Bacik @ 2010-02-09 20:43 UTC
To: Linus Torvalds
Cc: Sage Weil, Andrew Morton, Linux Kernel Mailing List, linux-fsdevel, jdarcy, rwheeler

On Fri, Dec 18, 2009 at 01:38:00PM -0800, Linus Torvalds wrote:
> On Fri, 18 Dec 2009, Sage Weil wrote:
> >
> > I would still like to see ceph merged for 2.6.33.  It's certainly not
> > production ready, but it would be greatly beneficial to be in mainline for
> > the same reasons other file systems like btrfs and exofs were merged
> > early.
>
> [Linus's full reply, quoted above in this thread, trimmed here]
>
> So no astroturfing, but real grass-roots support really does help (or
> top-down feedback for that matter - if a _distribution_ says "we're going
> to merge this in our distro regardless", that also counts as a big hint to
> me that people actually expect to use it and would like to not go through
> the pain of merging).

We have had bugzillas opened with us (Red Hat) requesting that Ceph be
included in Fedora/RHEL, so I'm here to yell loudly that somebody wants
it :).

The problem for these particular users is that sucking down a git tree,
applying patches, and building a kernel is a very high entry cost to test
something they are very excited about, so they depend on distributions to
ship the new fun stuff for them to start testing.  The problem is that
distributions do not want to ship new fun stuff that's not upstream if at
all possible (especially when it comes to filesystems).

I personally have no issue with just sucking a bunch of patches into the
Fedora kernel so people can start testing it, but I think that sends the
wrong message, since we're supposed to be following upstream and
encouraging people to push their code upstream first.  Not to mention that
it makes the actual Fedora kernel team antsy, and I already bug them
enough with what I pull in for btrfs :).  So for the time being I'm just
going to pull the userspace stuff into Fedora.

If you still feel that there are not enough users to justify pulling Ceph
in, I will probably pull the patches into the rawhide Fedora kernel when
F13 branches off, and hopefully that will pull in even more users.

Thanks,
Josef
* Re: [GIT PULL] Ceph distributed file system client for 2.6.33
From: Valdis.Kletnieks @ 2009-12-19 5:33 UTC
To: Sage Weil; +Cc: torvalds, akpm, linux-kernel, linux-fsdevel

On Fri, 18 Dec 2009 12:54:02 PST, Sage Weil said:
> I would still like to see ceph merged for 2.6.33.  It's certainly not
> production ready, but it would be greatly beneficial to be in mainline for
> the same reasons other file systems like btrfs and exofs were merged
> early.

Is the on-the-wire protocol believed to be correct, complete, and stable?
How about any userspace APIs and on-disk formats?  In other words...

> > The git tree includes the full patchset posted in October and incremental
> > changes since then.  I've tried to cram in all the anticipated protocol
> > changes, but the file system is still strictly EXPERIMENTAL and is marked

... anything left dangling on the changes?
* Re: [GIT PULL] Ceph distributed file system client for 2.6.33
From: Sage Weil @ 2009-12-21 16:42 UTC
To: Valdis.Kletnieks; +Cc: torvalds, akpm, linux-kernel, linux-fsdevel

On Sat, 19 Dec 2009, Valdis.Kletnieks@vt.edu wrote:
> Is the on-the-wire protocol believed to be correct, complete, and stable?
> How about any userspace APIs and on-disk formats?  In other words...
>
> ... anything left dangling on the changes?

The wire protocol is close.  There is a corner case with MDS failure
recovery that needs attention, but it can be resolved in a
backward-compatible way.  I think a compat/incompat flags mechanism during
the initial handshake might be appropriate to make changes easier going
forward.  I don't anticipate any other changes there.

There are some as-yet unresolved interface and performance issues with the
way the storage nodes interact with btrfs that have on-disk format
implications.  I hope to resolve those shortly.  Those of course do not
impact the client code.

sage
* Re: [GIT PULL] Ceph distributed file system client for 2.6.33
From: Andreas Dilger @ 2009-12-21 18:04 UTC
To: Sage Weil
Cc: Valdis.Kletnieks, torvalds, akpm, linux-kernel, linux-fsdevel

On 2009-12-21, at 09:42, Sage Weil wrote:
> I think a compat/incompat flags mechanism during the
> initial handshake might be appropriate to make changes easier going
> forward.

Having compat/incompat flags for the network protocol, implemented
correctly, is really critical for long-term maintenance.  For Lustre, we
ended up using a single set of compatibility flags:

- the client sends the full set of features that it understands
- the server replies with the strict subset of those flags that it also
  understands (i.e. client_features & server_supported_features)
- if the client doesn't have required support for a feature needed by the
  server, the server refuses to allow the client to mount
- if the server doesn't have a feature required by the client (e.g. it
  understands only some older implementation the client no longer
  supports), the client refuses to mount the filesystem

We've been able to use this mechanism for the past 5 years to maintain
protocol interoperability for Lustre, though we don't promise perpetual
interoperability - only for about 3 years or so before users have to
upgrade to a newer release.  That allows us to drop support for ancient
code instead of having to carry around baggage for every possible
combination of old features.

Using simple version numbers for the protocol means you have to carry the
baggage of every single previous version, and it isn't possible to have
"experimental" features that go out into the wild but eventually don't
make sense to keep around forever.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
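[For readers following along, here is a minimal sketch in C of the
negotiation Andreas describes.  It is not Lustre's (or Ceph's) actual
code: the feature bits, struct, and function names are all hypothetical.
The point is that both sides end up operating on a single agreed bitmask,
and each side independently refuses the mount if one of its required bits
fell out of the intersection.]

/*
 * Sketch of a compat/incompat feature-flag handshake.  All names are
 * invented for illustration; this is not Lustre or Ceph source.
 */
#include <stdbool.h>
#include <stdint.h>

#define FEAT_SNAPSHOTS   (1ULL << 0)	/* hypothetical feature bits */
#define FEAT_RECONNECT2  (1ULL << 1)
#define FEAT_NEWCRUSH    (1ULL << 2)

struct handshake {
	uint64_t offered;	/* client -> server: everything client knows */
	uint64_t agreed;	/* server -> client: offered & server set */
};

/* Server side: reply with the intersection; veto if its required bits
 * are not in it (the client never offered them). */
static bool server_accept(struct handshake *hs,
			  uint64_t server_supported,
			  uint64_t server_required)
{
	hs->agreed = hs->offered & server_supported;
	return (hs->agreed & server_required) == server_required;
}

/* Client side: veto if a feature the client requires was dropped from
 * the agreed set (the server doesn't support it). */
static bool client_accept(const struct handshake *hs,
			  uint64_t client_required)
{
	return (hs->agreed & client_required) == client_required;
}

int main(void)
{
	struct handshake hs = { .offered = FEAT_SNAPSHOTS | FEAT_RECONNECT2 };
	uint64_t server_supported = FEAT_SNAPSHOTS | FEAT_NEWCRUSH;
	uint64_t server_required  = FEAT_SNAPSHOTS;

	if (server_accept(&hs, server_supported, server_required) &&
	    client_accept(&hs, FEAT_SNAPSHOTS))
		return 0;	/* mount proceeds; hs.agreed == FEAT_SNAPSHOTS */
	return 1;		/* refuse the mount */
}

[The advantage over a plain protocol version number is that a feature bit
can be introduced as optional, later made required, or quietly retired,
without either side having to emulate every historical version.]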