* long object names @ 2011-04-21 4:42 Sage Weil 2011-04-21 18:56 ` Tommi Virtanen 0 siblings, 1 reply; 19+ messages in thread From: Sage Weil @ 2011-04-21 4:42 UTC (permalink / raw) To: ceph-devel Yehuda and I talked about the lfn branch today and we're not on the same page yet about the best way to proceed. The current code keeps the long file name translating independent of the other naming/mangling that FileStore does (collection/ prefix, escaping, and sobject_t -> <object>_<snapid|head>). I see that it's nice to do one thing at once, but I'm also not sure the long files are useful anywhere else. Other thoughts: The escaping may make more sense in the same layer as the long name stuff? Eventually we'll be prehashing the pg dir contents into subdirs, and that translation will have to be done somewhere too. That will mean possibly looking in two locations during the rehashing process, similar to how the lfn stuff has to peek at xattrs. One thing to keep in mind is that the hash value will need to be passed down and stored with the file... it's usually hash(object name), but not always when the object_locator_t::key is set. Where will this fit in? We may eventually want to adjust the ObjectStore interface to include collection/dir handles so that the full path isn't traversed in kernel for every operation (the OSD could maintain an open handle/fd for each pg it has open). I think the lfn_open/_get type interface below all of the operation methods will allow all of those things. I think it'll be simpler to push as much of the filename rendering into that layer as possible, though (possibly including the sobject_t mangling). Having all the mangling/rendering done in one place will also make it easy to extend without making multiple passes in different layers... Unfortunately I'm out tomorrow. Any other opinions? One other thing: the xattr names are mangled too (user.ceph. prefix). 
As long as the long name xattr has a different prefix we don't have to worry about those getting mixed up. sage ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: long object names 2011-04-21 4:42 long object names Sage Weil @ 2011-04-21 18:56 ` Tommi Virtanen 2011-04-21 19:27 ` Colin McCabe 0 siblings, 1 reply; 19+ messages in thread From: Tommi Virtanen @ 2011-04-21 18:56 UTC (permalink / raw) To: Sage Weil; +Cc: ceph-devel On Wed, Apr 20, 2011 at 09:42:48PM -0700, Sage Weil wrote: > I think the lfn_open/_get type interface below all of the operation > methods will allow all of those things. I think it'll be simpler to push > as much of the filename rendering into that layer as possible, though > (possibly including the sobject_t mangling). Having all the > mangling/rendering done in one place will also make it easy to extend > without making multiple passes in different layers... So apparently, this email confused many. Let's try to clarify, to the best of my understanding: - Yehuda's code would currently write this: ./rados mkpool kitties ./rados create --pool=foo "longcatisl$(python -c 'print 300*"o"')ng" >>> FILENAME_SHORT_LEN=16 >>> FILENAME_HASH_LEN=3 >>> FILENAME_COOKIE="cLFN" >>> FILENAME_PREFIX_LEN = (FILENAME_SHORT_LEN - FILENAME_HASH_LEN - 1 - (len(FILENAME_COOKIE) - 1) - 1) >>> orig = "longcatisl"+300*"o"+"ng" >>> storable = orig + "_head" # plus backslash escaping, but let's ignore that now >>> munged = orig[:FILENAME_PREFIX_LEN] + "_" + FILENAME_COOKIE + "_%s_%d" % ("zzz", 42) >>> munged 'longcati_cLFN_zzz_42' So it loses the _head suffix, and all such things. - Sage wanted (as far as I understand) to have the munging be where the backslash escaping is, so the end result would look like 'longcati_cLFN_zzz_42_head'. Note the suffix. I hope I got that right. As for Yehuda's approach, I'm not very happy to see layers upon layers of rewriting the filenames.. It just seems more brittle. So Sage's version looks nicer to me, can do all the work it needs to do in a single pass, and lets us see the _head etc suffixes without reading xattrs, which might be useful for fsck-style things. 
As for both, I'm especially not fond of the very limited hashing, and the loops that keep calling build_filename. I fear collisions and races. Bumping up the hash size significantly will help the common case, but I still fear the races. I also think that Ceph, and especially the RGW bits, needs to be written to be fairly robust against DoS attacks. Nasty things happen out there, and having somebody able to trigger a "slow mode" on your server with fairly cheap operations is bad. Here's a concrete proposal: split the filename into subdirs if needed, and map the names 1:1, just to avoid the unpredictability of the above approach. And to get significantly less code and branching in the fast path. That is, I think I'd go for something like (Python written in C style to make it more direct to translate): # how much overhead to reserve in filenames to always have # prefix/suffix not split by slashes LONGEST_PREFIX_SUFFIX_LEN = len("_head") SAFE_FILENAME_LEN = 255 - LONGEST_PREFIX_SUFFIX_LEN def munge(path): dirprefix = None while len(path) > SAFE_FILENAME_LEN: head = path[:SAFE_FILENAME_LEN] if dirprefix is None: dirprefix = head else: dirprefix = dirprefix + '/' + head path = path[SAFE_FILENAME_LEN:] return dirprefix + '/' + path and now >>> munged = munge(orig) >>> munged 'longcatisloooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo/oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong' >>> munged + '_head' 'longcatisloooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo/oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong_head' And the caller would just mkdir all the leading dirs 
(ignore EEXIST errors). They can either be left around, or with a small loop handling a race between mkdir & rmdir, can be cleaned up either on unlink or in fsck/scrub/etc. Also, I'm not thrilled to have something this core, *and* being string manipulation in C, go without unit tests that exercise the corner cases. Finally, here's some misc notes on the existing code that are probably obvious, I just wanted to make sure: - hash is always "zzz" - xattr user.ceph._lfn conflicts with actual end-user xattrs "_lfn"? - escaping can lengthen the filename, does it handle that (I guess yes because this is a layer after that, but I can't tell without reading a lot of code) -- :(){ :|:&};:
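The directory-splitting proposal in the message above can be sketched as runnable Python. This is only an illustration under stated assumptions: the object name is assumed to be already escaped (so it contains no '/'), short names are assumed to pass through unchanged, and unmunge() and create() are hypothetical helpers that do not appear in the thread.

```python
import os

# Reserve room so a suffix such as "_head" never pushes a component past 255 bytes.
LONGEST_PREFIX_SUFFIX_LEN = len("_head")
SAFE_FILENAME_LEN = 255 - LONGEST_PREFIX_SUFFIX_LEN


def munge(name):
    """Split a long name into fixed-size path components (1:1 mapping)."""
    parts = []
    while len(name) > SAFE_FILENAME_LEN:
        parts.append(name[:SAFE_FILENAME_LEN])
        name = name[SAFE_FILENAME_LEN:]
    parts.append(name)  # short names come back unchanged (assumed intent)
    return "/".join(parts)


def unmunge(path):
    """Inverse of munge(); valid only if the original name contains no '/'."""
    return path.replace("/", "")


def create(root, name, suffix="_head"):
    """Hypothetical caller: mkdir the leading dirs (EEXIST ignored), then create the file."""
    rel = munge(name) + suffix
    full = os.path.join(root, rel)
    os.makedirs(os.path.dirname(full), exist_ok=True)
    with open(full, "x"):  # "x" fails if the file already exists
        pass
    return rel
```

Because every chunk except the last has exactly SAFE_FILENAME_LEN characters, the original name can be recovered by removing the slashes, which is what keeps the mapping 1:1 without any xattr lookup.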
* Re: long object names 2011-04-21 18:56 ` Tommi Virtanen @ 2011-04-21 19:27 ` Colin McCabe 2011-04-21 19:32 ` Tommi Virtanen 0 siblings, 1 reply; 19+ messages in thread From: Colin McCabe @ 2011-04-21 19:27 UTC (permalink / raw) To: Tommi Virtanen; +Cc: Sage Weil, ceph-devel On Thu, Apr 21, 2011 at 11:56 AM, Tommi Virtanen <tommi.virtanen@dreamhost.com> wrote: > I also think that Ceph, and especially the RGW bits, needs to be > written to be fairly robust against DoS attacks. Nasty things happen > out there, and having somebody able to trigger a "slow mode" on your > server with fairly cheap operations is bad. Yeah. > Here's a concrete proposal: split the filename into subdirs if needed, > and map the names 1:1, just to avoid the unpredictability of the above > approach. And to get significantly less code and branching in the fast > path. That is, I think I'd go for something like (Python written in C > style to make it more direct to translate): I like this idea a lot. It does involve extra expense, but only for long file names. It also avoids object name collisions completely. One additional idea: can we make the chunking configurable? If we did a translation like this: abcdefg -> abc/def/g 123456789 -> 123/456/789 prefix search would become a *lot* more efficient for rgw. On the other hand, the filesystem layer doesn't care about prefix search, so it could just configure the chunking to be after 200 characters or something (at which point it's basically a no-op.) cheers, Colin ^ permalink raw reply [flat|nested] 19+ messages in thread
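A rough sketch of the configurable chunking idea from the message above, in Python. The function names chunk() and prefix_dir() are hypothetical, and the sketch ignores suffixes such as "_head" and any escaping.

```python
def chunk(name, size):
    """Split name into path components of at most `size` characters."""
    if size <= 0:
        raise ValueError("chunk size must be positive")
    return "/".join(name[i:i + size] for i in range(0, len(name), size))


def prefix_dir(prefix, size):
    """Return (directory to list, leftover filter) for a prefix search.

    Only the whole chunks of the prefix name a directory; the leftover
    characters must be matched against the entries inside it.
    """
    whole = len(prefix) - len(prefix) % size
    return chunk(prefix[:whole], size), prefix[whole:]
```

With size 3, chunk("abcdefg", 3) gives "abc/def/g" and chunk("123456789", 3) gives "123/456/789", matching the examples in the message. With a size larger than the name the result is the name itself, which is the "basically a no-op" case. Note that the on-disk path depends on the chosen size, so changing the setting changes where an existing object is expected to live.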
* Re: long object names 2011-04-21 19:27 ` Colin McCabe @ 2011-04-21 19:32 ` Tommi Virtanen 2011-04-21 20:03 ` Gregory Farnum 0 siblings, 1 reply; 19+ messages in thread From: Tommi Virtanen @ 2011-04-21 19:32 UTC (permalink / raw) To: Colin McCabe; +Cc: Sage Weil, ceph-devel On Thu, Apr 21, 2011 at 12:27:01PM -0700, Colin McCabe wrote: > I like this idea a lot. It does involve extra expense, but only for > long file names. It also avoids object name collisions completely. > > One additional idea: can we make the chunking configurable? > If we did a translation like this: > abcdefg -> abc/def/g > 123456789 -> 123/456/789 > > prefix search would become a *lot* more efficient for rgw. > On the other hand, the filesystem layer doesn't care about prefix > search, so it could just configure the chunking to be after 200 > characters or something (at which point it's basically a no-op.) The one big downside is that with configurable chunking, you no longer have an always correct 1:1 mapping between object and file. You might argue for always (not configurably) chunking at some smaller, fixed boundary, so on the average you'd need to readdir() less to serve a prefix search. I think this is what your last sentence refers to. But that means more overhead with the directories. The only real answers are available via benchmarks. -- :(){ :|:&};: ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: long object names 2011-04-21 19:32 ` Tommi Virtanen @ 2011-04-21 20:03 ` Gregory Farnum 2011-04-21 21:09 ` Colin McCabe 2011-04-21 22:00 ` Tommi Virtanen 0 siblings, 2 replies; 19+ messages in thread From: Gregory Farnum @ 2011-04-21 20:03 UTC (permalink / raw) To: ceph-devel I really don't see how pushing the naming complexity into the local filesystem, where it adds lots of otherwise-useless inodes and dentries, is going to help us. I like what Yehuda has here for its relative simplicity -- though I think we should just up the hash size enough that we don't need to handle collisions, and leave out the retry looping so as to make it simpler still -- but given the relative simplicity I think it might be nice to push all the name mangling into a flat space so that we can preserve the prefix- and post-fixing -- this would keep snapshots of one object more identifiable than hashing over the entire name like it's doing right now. -Greg On Thursday, April 21, 2011 at 12:32 PM, Tommi Virtanen wrote: > On Thu, Apr 21, 2011 at 12:27:01PM -0700, Colin McCabe wrote: > > I like this idea a lot. It does involve extra expense, but only for > > long file names. It also avoids object name collisions completely. > > > > One additional idea: can we make the chunking configurable? > > If we did a translation like this: > > abcdefg -> abc/def/g > > 123456789 -> 123/456/789 > > > > prefix search would become a *lot* more efficient for rgw. > > On the other hand, the filesystem layer doesn't care about prefix > > search, so it could just configure the chunking to be after 200 > > characters or something (at which point it's basically a no-op.) > > The one big downside is that with configurable chunking, you no longer > have an always correct 1:1 mapping between object and file. > > You might argue for always (not configurably) chunking at some > smaller, fixed boundary, so on the average you'd need to readdir() > less to serve a prefix search. 
I think this is what your last sentence > refers to. But that means more overhead with the directories. > > The only real answers are available via benchmarks. > > -- > :(){ :|:&};:
* Re: long object names 2011-04-21 20:03 ` Gregory Farnum @ 2011-04-21 21:09 ` Colin McCabe 2011-04-21 21:23 ` Yehuda Sadeh Weinraub 2011-04-21 22:00 ` Tommi Virtanen 1 sibling, 1 reply; 19+ messages in thread From: Colin McCabe @ 2011-04-21 21:09 UTC (permalink / raw) To: Gregory Farnum; +Cc: ceph-devel On Thu, Apr 21, 2011 at 1:03 PM, Gregory Farnum <gregory.farnum@dreamhost.com> wrote: > I really don't see how pushing the naming complexity into the local filesystem, > where it adds lots of otherwise-useless inodes and dentries, is going to help us. Here is a quick summary of how TV's proposal would help us. 1. It avoids collisions entirely 2. You don't ever have to do an extra xattr lookup, no matter how short or long the object name is. My add-on proposal helps us: 3. get reasonable prefix search performance (with those supposedly "useless" dentries) > I like what Yehuda has here for its relative simplicity -- though I think we should just up > the hash size enough that we don't need to handle collisions, Personally, I think the xattr proposal is more complex. I guess that is a matter of taste. No matter how big your hash table will be, there are still collisions! That is the nature of hashing. And since the code is open source, it's pretty easy for an attacker to read the source and then create two objects whose names collide. So far, the only disadvantage that has been pointed out to TV's scheme is that it creates extra dentries. But those extra dentries only affect long object names, not the ones that (for example) the Ceph FS creates. Also, when long object names occur in S3, they don't tend to come out of the blue. They come about because the organization has a sort of directory structure like this: foocorp/business_data/business_reports/year_2008/input/foo foocorp/business_data/business_reports/year_2008/input/bar Of course we "know" that there are no such things as directories in S3. But people like to structure their object names as if there were. 
In cases like that, TV's scheme only incurs the cost of creating the extra dentries once per long prefix. Colin
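To put rough numbers on the "just up the hash size" versus "there are still collisions" exchange, the usual birthday approximation can be evaluated: for n names and a b-bit hash, the chance of at least one collision is about 1 - exp(-n(n-1) / (2 * 2^b)). The sketch below assumes a uniformly distributed hash and names that are not chosen by an attacker, so it says nothing about deliberately constructed collisions; the function name is hypothetical.

```python
import math


def collision_probability(n, bits):
    """Birthday approximation: P(at least one collision) among n random hashes."""
    exponent = n * (n - 1) / (2.0 * 2.0 ** bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for bits in (12, 32, 64, 128):
        print(bits, collision_probability(2_000_000, bits))
```

The probability shrinks quickly as the hash grows but never reaches zero for more than one name, which is why the disagreement in the thread is about whether collision handling can be dropped, not about whether collisions can be made rare.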
* Re: long object names 2011-04-21 21:09 ` Colin McCabe @ 2011-04-21 21:23 ` Yehuda Sadeh Weinraub 2011-04-21 21:44 ` Colin McCabe 0 siblings, 1 reply; 19+ messages in thread From: Yehuda Sadeh Weinraub @ 2011-04-21 21:23 UTC (permalink / raw) To: Colin McCabe; +Cc: Gregory Farnum, ceph-devel On Thu, Apr 21, 2011 at 2:09 PM, Colin McCabe <cmccabe@alumni.cmu.edu> wrote: > On Thu, Apr 21, 2011 at 1:03 PM, Gregory Farnum > <gregory.farnum@dreamhost.com> wrote: >> I really don't see how pushing the naming complexity into the local filesystem, >> where it adds lots of otherwise-useless inodes and dentries, is going to help us. > > Here is a quick summary of how the TV's proposal would help us. > 1. it avoids collisions entirely > 2. You don't ever have do an extra xattr lookup, no matter how short > or long the object name is. Yeah, but you read more directories. Note that btrfs stores the xattrs on the directories, so reading those xattrs will have a lower IO impact than traversing directories recursively. > > My add-on proposal helps us: > 3. get reasonable prefix search performance (with those supposedly > "useless" dentries) > >> I like what Yehuda has here for its relative simplicity -- though I think we should just up >> the hash size enough that we don't need to handle collisions, > > Personally, I think the xattr proposal is more complex. I guess that > is a matter of taste. > > No matter how big your hash table will be, there are still collisions! > That is the nature of hashing. And since the code is open source, it's > pretty easy for an attacker to read the source and then create two > objects whose names collide. Sure there will be, and the code should handle it. With a good hashing scheme having a collision will be pretty rare. > > So far, the only disadvantage that has been pointed out to TV's scheme > is that it creates extra dentries. But those extra dentries only > affect long object names, not the ones that (for example) the Ceph FS > creates. 
Also, when long object names occur in S3, they don't tend to > come out of the blue. They come about because the organization has a > sort of directory structure like this: > > foocorp/business_data/business_reports/year_2008/input/foo > foocorp/business_data/business_reports/year_2008/input/bar > > Of course we "know" that there are no such things as directories in > S3. But people like to structure their object names as if there were. > In cases like that, TV's scheme only incurs the cost of creating the > extra dentries once per long prefix. > As I said above, for most cases reading xattrs should be more efficient. Yehuda
* Re: long object names 2011-04-21 21:23 ` Yehuda Sadeh Weinraub @ 2011-04-21 21:44 ` Colin McCabe 2011-04-21 21:54 ` Yehuda Sadeh Weinraub 0 siblings, 1 reply; 19+ messages in thread From: Colin McCabe @ 2011-04-21 21:44 UTC (permalink / raw) To: Yehuda Sadeh Weinraub; +Cc: Gregory Farnum, ceph-devel On Thu, Apr 21, 2011 at 2:23 PM, Yehuda Sadeh Weinraub <yehudasa@gmail.com> wrote: > On Thu, Apr 21, 2011 at 2:09 PM, Colin McCabe <cmccabe@alumni.cmu.edu> wrote: >> On Thu, Apr 21, 2011 at 1:03 PM, Gregory Farnum >> <gregory.farnum@dreamhost.com> wrote: >>> I really don't see how pushing the naming complexity into the local filesystem, >>> where it adds lots of otherwise-useless inodes and dentries, is going to help us. >> >> Here is a quick summary of how the TV's proposal would help us. >> 1. it avoids collisions entirely >> 2. You don't ever have do an extra xattr lookup, no matter how short >> or long the object name is. > > Yeah, but you read more directories. Note that btrfs stores the xattrs > on the directories, so reading those xattrs will have a lower IO > impact than traversing directories recursively. It does seem like btrfs' extended attribute implementation is fairly efficient. But Linux's dentry cache (dcache) is also pretty efficient. TV's approach involves fewer syscalls and no loop. I also wonder how xattr performance is on ext3/4 these days. I think benchmarks would be needed to really settle this question. I'm almost tempted to write one... sincerely, Colin > >> >> My add-on proposal helps us: >> 3. get reasonable prefix search performance (with those supposedly >> "useless" dentries) >> >>> I like what Yehuda has here for its relative simplicity -- though I think we should just up >>> the hash size enough that we don't need to handle collisions, >> >> Personally, I think the xattr proposal is more complex. I guess that >> is a matter of taste. >> >> No matter how big your hash table will be, there are still collisions! 
>> That is the nature of hashing. And since the code is open source, it's >> pretty easy for an attacker to read the source and then create two >> objects whose names collide. > > Sure there will be, and the code should handle it. With a good hashing > scheme having a collision will be pretty rare. > >> >> So far, the only disadvantage that has been pointed out to TV's scheme >> is that it creates extra dentries. But those extra dentries only >> affect long object names, not the ones that (for example) the Ceph FS >> creates. Also, when long object names occur in S3, they don't tend to >> come out of the blue. They come about because the organization has a >> sort of directory structure like this: >> >> foocorp/business_data/business_reports/year_2008/input/foo >> foocorp/business_data/business_reports/year_2008/input/bar >> >> Of course we "know" that there are no such things as directories in >> S3. But people like to structure their object names as if there were. >> In cases like that, TV's scheme only incurs the cost of creating the >> extra dentries once per long prefix. >> > As I said above, for most cases reading xattrs should be more efficient. > > > Yehuda > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: long object names 2011-04-21 21:44 ` Colin McCabe @ 2011-04-21 21:54 ` Yehuda Sadeh Weinraub 2011-04-21 22:01 ` Colin McCabe 0 siblings, 1 reply; 19+ messages in thread From: Yehuda Sadeh Weinraub @ 2011-04-21 21:54 UTC (permalink / raw) To: Colin McCabe; +Cc: Gregory Farnum, ceph-devel On Thu, Apr 21, 2011 at 2:44 PM, Colin McCabe <cmccabe@alumni.cmu.edu> wrote: > On Thu, Apr 21, 2011 at 2:23 PM, Yehuda Sadeh Weinraub > <yehudasa@gmail.com> wrote: >> On Thu, Apr 21, 2011 at 2:09 PM, Colin McCabe <cmccabe@alumni.cmu.edu> wrote: >>> On Thu, Apr 21, 2011 at 1:03 PM, Gregory Farnum >>> <gregory.farnum@dreamhost.com> wrote: >>>> I really don't see how pushing the naming complexity into the local filesystem, >>>> where it adds lots of otherwise-useless inodes and dentries, is going to help us. >>> >>> Here is a quick summary of how the TV's proposal would help us. >>> 1. it avoids collisions entirely >>> 2. You don't ever have do an extra xattr lookup, no matter how short >>> or long the object name is. >> >> Yeah, but you read more directories. Note that btrfs stores the xattrs >> on the directories, so reading those xattrs will have a lower IO >> impact than traversing directories recursively. > > It does seem like btrfs' extended attribute implementation is fairly > efficient. But Linux's dentry cache (dcache) is also pretty efficient. > (resending to list) It needs to be populated first before being efficient. And it'll be less efficient now that you populate it with extra entries. Yehuda ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: long object names 2011-04-21 21:54 ` Yehuda Sadeh Weinraub @ 2011-04-21 22:01 ` Colin McCabe 2011-04-21 22:58 ` Zenon Panoussis 0 siblings, 1 reply; 19+ messages in thread From: Colin McCabe @ 2011-04-21 22:01 UTC (permalink / raw) To: Yehuda Sadeh Weinraub; +Cc: Gregory Farnum, ceph-devel On Thu, Apr 21, 2011 at 2:54 PM, Yehuda Sadeh Weinraub <yehudasa@gmail.com> wrote: > On Thu, Apr 21, 2011 at 2:44 PM, Colin McCabe <cmccabe@alumni.cmu.edu> wrote: >> On Thu, Apr 21, 2011 at 2:23 PM, Yehuda Sadeh Weinraub >> <yehudasa@gmail.com> wrote: >>> On Thu, Apr 21, 2011 at 2:09 PM, Colin McCabe <cmccabe@alumni.cmu.edu> wrote: >>>> On Thu, Apr 21, 2011 at 1:03 PM, Gregory Farnum >>>> <gregory.farnum@dreamhost.com> wrote: >>>>> I really don't see how pushing the naming complexity into the local filesystem, >>>>> where it adds lots of otherwise-useless inodes and dentries, is going to help us. >>>> >>>> Here is a quick summary of how the TV's proposal would help us. >>>> 1. it avoids collisions entirely >>>> 2. You don't ever have do an extra xattr lookup, no matter how short >>>> or long the object name is. >>> >>> Yeah, but you read more directories. Note that btrfs stores the xattrs >>> on the directories, so reading those xattrs will have a lower IO >>> impact than traversing directories recursively. >> >> It does seem like btrfs' extended attribute implementation is fairly >> efficient. But Linux's dentry cache (dcache) is also pretty efficient. >> > (resending to list) > > It needs to be populated first before being efficient. And it'll be > less efficient now that you populate it with extra entries. That is a good point. However, xattrs also have a cost. It seems like btrfs sometimes creates an inode for xattrs, and sometimes just stashes them in the dentry (presumably if there aren't many and they're small?) The xattr-scheme always creates an extra xattr per entry. 
The directory-based scheme creates extra directories, but not that many, assuming a lot of objects have names with similar prefixes-- an assumption that is likely to be true nearly all the time. I think both schemes are doable, but I still lean towards the directory-based one, just because I like fast prefix search. Colin
* Re: long object names 2011-04-21 22:01 ` Colin McCabe @ 2011-04-21 22:58 ` Zenon Panoussis 2011-04-21 23:04 ` Yehuda Sadeh Weinraub 0 siblings, 1 reply; 19+ messages in thread From: Zenon Panoussis @ 2011-04-21 22:58 UTC (permalink / raw) To: ceph-devel >> It needs to be populated first before being efficient. And it'll be >> less efficient now that you populate it with extra entries. At the risk of being run out of town covered in tar and feathers, I'll venture voicing the opinion of an end-user who doesn't know ceph, is not a developer, and doesn't even understand half of the technicalities of this discussion. From my end-user point of view, efficiency is great and very desirable, but is still secondary. Simplicity of code and the reduction of bugs that comes with it is great and adds elegance to intelligence, but is still secondary. The safety of data though, now, that is primary and above everything else when it comes to a file system. A file system's *only* purpose is to store and retrieve data. Efficiency and speed are features, positive qualities that make a file system better, but only as long as it actually can fulfil its purpose of storing and retrieving data without losing or corrupting them. Looking at it this way, the potential of a hash collision is catastrophic no matter how small it might be. The measure of this problem is not the objective likelihood that it will occur, but the subjective level of worry that it might occur. Simply put, even if there's one chance of a hash collision in 10 billion and I only have a couple of million files, I still end up being unable to trust the integrity of *any* of them. One might argue here that no file system in this world offers a 100% file integrity guarantee. That's absolutely true, but it is and should remain a shortcoming and not be elevated to an intentional design feature. Z ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: long object names 2011-04-21 22:58 ` Zenon Panoussis @ 2011-04-21 23:04 ` Yehuda Sadeh Weinraub 0 siblings, 0 replies; 19+ messages in thread From: Yehuda Sadeh Weinraub @ 2011-04-21 23:04 UTC (permalink / raw) To: Zenon Panoussis; +Cc: ceph-devel On Thu, Apr 21, 2011 at 3:58 PM, Zenon Panoussis <oracle@provocation.net> wrote: > >>> It needs to be populated first before being efficient. And it'll be >>> less efficient now that you populate it with extra entries. > > At the risk of being run out of town covered in tar and feathers, I'll > venture voicing the opinion of an end-user who doesn't know ceph, is not > a developer, and doesn't even understand half of the technicalities of > this discussion. > > From my end-user point of view, efficiency is great and very desirable, > but is still secondary. Simplicity of code and the reduction of bugs that > comes with it is great and adds elegance to intelligence, but is still > secondary. The safety of data though, now, that is primary and above > everything else when it comes to a file system. A file system's *only* > purpose is to store and retrieve data. Efficiency and speed are features, > positive qualities that make a file system better, but only as long as > it actually can fulfil its purpose of storing and retrieving data without > losing or corrupting them. > > Looking at it this way, the potential of a hash collision is catastrophic > no matter how small it might be. The measure of this problem is not the > objective likelihood that it will occur, but the subjective level of worry > that it might occur. Simply put, even if there's one chance of a hash > collision in 10 billion and I only have a couple of million files, I still > end up being unable to trust the integrity of *any* of them. > > One might argue here that no file system in this world offers a 100% file > integrity guarantee. 
That's absolutely true, but it is and should remain > a shortcoming and not be elevated to an intentional design feature. > We fully understand your worry, and in any case with the hashing solution it doesn't mean that when there's a collision you lose the data, just that the data lookup needs to traverse more objects. HTH, Yehuda
* Re: long object names 2011-04-21 20:03 ` Gregory Farnum 2011-04-21 21:09 ` Colin McCabe @ 2011-04-21 22:00 ` Tommi Virtanen 2011-04-21 22:23 ` Gregory Farnum 2011-04-21 22:25 ` Yehuda Sadeh Weinraub 1 sibling, 2 replies; 19+ messages in thread From: Tommi Virtanen @ 2011-04-21 22:00 UTC (permalink / raw) To: Gregory Farnum; +Cc: ceph-devel On Thu, Apr 21, 2011 at 01:03:57PM -0700, Gregory Farnum wrote: > I like what Yehuda has here for its relative simplicity It's far from simple. Let's look at the unlink path: static int lfn_unlink(const char *pathname) { const char *filename; char short_fn[PATH_MAX]; char short_fn2[PATH_MAX]; int r, i, exist, err; int path_len; int is_lfn; ** helper function to split the path to dir and file, figure out a ** short name for this longname, count the lenght of the directory ** part of the path and other things; loops through the candidates, ** comparing against the xattr r = lfn_get(pathname, short_fn, sizeof(short_fn), &filename, &exist, &is_lfn); if (r < 0) return r; ** if the filename wasn't actually too long, take the easy way out if (!is_lfn) return unlink(pathname); if (!exist) { errno = ENOENT; return -1; } ** actual file unlink here err = unlink(short_fn); if (err < 0) return err; ** and then, rename all the collisions, one by one, because they have ** a sequential number in them! 
path_len = filename - pathname; memcpy(short_fn2, pathname, path_len); ** this loop finds the highest sequential number in this hash ** collision bucket, saves it in i for (i = r + 1; ; i++) { struct stat buf; int ret; build_filename(&short_fn2[path_len], sizeof(short_fn2) - path_len, filename, i); ret = stat(short_fn2, &buf); if (ret < 0) { if (i == r + 1) return 0; break; } } ** and then the highest seq number munged filename gets renamed to ** fill the gap we left behind build_filename(&short_fn2[path_len], sizeof(short_fn2) - path_len, filename, i - 1); generic_dout(0) << "renaming " << short_fn2 << " -> " << short_fn << dendl; if (rename(short_fn2, short_fn) < 0) { generic_derr << "ERROR: could not rename " << short_fn2 << " -> " << short_fn << dendl; assert(0); } return 0; } Now, imagine a colliding file create between the stat and the rename -> boom. This is not the only race in there. The underlying problem is that you're constructing an atomic operation out of multiple underlying operations, and you're not obsessively careful about ordering them. Once you get obsessive about ordering them, the extra directory my scheme creates will seem very cheap. If you say that's not relevant because of some locking that the OSD does, then 1) you're building a lot of assumptions on the locking never changing 2) I can construct similar bugs with a single actor, with a crash at the wrong moment. Simple code makes Tv happy. You don't want an unhappy Tv all up in your codebase. -- :(){ :|:&};: ^ permalink raw reply [flat|nested] 19+ messages in thread
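The race described above (a colliding create landing between the scan and the rename) can be replayed with a toy model in Python. This is not the lfn code: the directory is a plain dict mapping short file name to the long name stored in its xattr, short_name() is a hypothetical stand-in for build_filename() with a constant hash so that every name collides, and the two halves of unlink are separate functions so that another operation can be interleaved between them.

```python
def short_name(long_name, i):
    # Hypothetical stand-in for build_filename(); constant hash "zzz"
    # forces every long name into the same collision bucket.
    return "%s_cLFN_zzz_%d" % (long_name[:8], i)


def lookup(d, long_name):
    """Probe slots 0, 1, 2, ... and stop at the first missing slot."""
    i = 0
    while short_name(long_name, i) in d:
        if d[short_name(long_name, i)] == long_name:
            return i
        i += 1
    return None


def create(d, long_name):
    """Store the object in the first free slot of its bucket."""
    i = 0
    while short_name(long_name, i) in d:
        i += 1
    d[short_name(long_name, i)] = long_name


def unlink_scan(d, long_name):
    """First half of unlink: remove the file, then find the highest used slot."""
    gap = lookup(d, long_name)
    del d[short_name(long_name, gap)]
    i = gap + 1
    while short_name(long_name, i) in d:
        i += 1
    return gap, i - 1  # last == gap means there is nothing to move


def unlink_fill_gap(d, long_name, gap, last):
    """Second half of unlink: rename the highest slot into the gap."""
    if last > gap:
        # Like rename(2), this silently replaces whatever sits at the target.
        d[short_name(long_name, gap)] = d.pop(short_name(long_name, last))


def run_interleaving():
    a, b, c, new = ("longcati" + ch * 300 for ch in "ABCD")
    d = {}
    for name in (a, b, c):
        create(d, name)
    gap, last = unlink_scan(d, a)        # unlink(a) starts
    create(d, new)                       # a colliding create sneaks in
    unlink_fill_gap(d, a, gap, last)     # unlink(a) finishes
    return d, new
```

In this model the interleaved create takes the freshly emptied slot, and the rename that follows overwrites it, so lookup() can no longer find the new object. Whether the real code hits exactly this interleaving depends on locking that the model does not represent.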
* Re: long object names
  2011-04-21 22:00 ` Tommi Virtanen
@ 2011-04-21 22:23 ` Gregory Farnum
  2011-04-21 22:25 ` Yehuda Sadeh Weinraub
  1 sibling, 0 replies; 19+ messages in thread
From: Gregory Farnum @ 2011-04-21 22:23 UTC
To: Tommi Virtanen; +Cc: ceph-devel

On Thursday, April 21, 2011 at 3:00 PM, Tommi Virtanen wrote:
> On Thu, Apr 21, 2011 at 01:03:57PM -0700, Gregory Farnum wrote:
> > I like what Yehuda has here for its relative simplicity
>
> It's far from simple.
>
> Let's look at the unlink path:
>
> [...]
>
> Now, imagine a colliding file create between the stat and the rename
> -> boom. This is not the only race in there.
>
> The underlying problem is that you're constructing an atomic operation
> out of multiple underlying operations, and you're not obsessively
> careful about ordering them. Once you get obsessive about ordering
> them, the extra directory my scheme creates will seem very cheap.
>
> If you say that's not relevant because of some locking that the OSD
> does, then 1) you're building a lot of assumptions on the locking
> never changing 2) I can construct similar bugs with a single actor,
> with a crash at the wrong moment.
>
> Simple code makes Tv happy. You don't want an unhappy Tv all up in
> your codebase.

I said "relatively simple". In fact I also suggested just ditching the
collision handling precisely because of issues like this -- keep in
mind that we have 200+ characters to make a hash out of[1] and PGs
really shouldn't ever grow big enough for collisions to happen -- and
if we instead make a folder structure out of long names that's not
exactly going to remove any races.

I understand that Colin likes making folders so as to speed up the
prefix searches but I don't think we should optimize for RGW -- if
we're going to do that we should (God help us) implement multiple
ObjectStore classes and choose the appropriate one to use based on what
kind of data the cluster is serving.

I think that you're inflating the cost of doing hashing and an xattr,
especially in btrfs where we get the xattrs on lookup anyway, when
compared to deep dir lookups. I'm also concerned about issues that may
crop up when we take a 4k object name and translate it directly into a
path of 4k + slashes, since at that point we're not going to be able to
address it all in one go and will need to pull tricks like moving in
and out of directories, which endlessly complicates your simple little
loops. :(
-Greg

[1]: The current code has short hashes precisely because Yehuda wants
to test his collision-handling, and it is a work in progress as you can
see by the random "fix blah" patches at the end. :)
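The "moving in and out of directories" concern can be made concrete. A path built from a 4k object name plus slashes exceeds `PATH_MAX`, so it cannot be handed to a single `open()`. The sketch below, which is an illustration and not FileStore code, walks such a path one component at a time with `openat()`, so each system call only ever sees a single name. The function name `open_deep` is an assumption made for this example; every component must still fit within `NAME_MAX`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` relative to dirfd one component at a
 * time, so the total path length may exceed PATH_MAX.
 * Intermediate components are opened as directories; the last one is
 * opened with the caller's flags and mode. Returns an fd or -errno.
 */
int open_deep(int dirfd, const char *path, int flags, mode_t mode)
{
    char comp[NAME_MAX + 1];
    int cur = dirfd;            /* directory we are currently standing in */
    const char *p = path;

    assert(path != NULL);

    for (;;) {
        const char *slash = strchr(p, '/');
        size_t len = slash ? (size_t)(slash - p) : strlen(p);

        if (len == 0 || len > NAME_MAX) {
            if (cur != dirfd)
                close(cur);
            /* empty component ("a//b", leading or trailing slash) */
            return len ? -ENAMETOOLONG : -EINVAL;
        }
        memcpy(comp, p, len);
        comp[len] = '\0';

        int next = slash ? openat(cur, comp, O_RDONLY | O_DIRECTORY)
                         : openat(cur, comp, flags, mode);
        int err = errno;        /* save before close() can clobber it */

        if (cur != dirfd)
            close(cur);         /* step out of the previous directory */
        if (next < 0)
            return -err;
        if (!slash)
            return next;        /* that was the final component */

        cur = next;
        p = slash + 1;
    }
}
```

The cost is one `openat()` and one `close()` per path level, which is the per-operation overhead the message above is worried about.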
* Re: long object names
  2011-04-21 22:00 ` Tommi Virtanen
  2011-04-21 22:23 ` Gregory Farnum
@ 2011-04-21 22:25 ` Yehuda Sadeh Weinraub
  2011-04-21 23:07 ` Tommi Virtanen
  1 sibling, 1 reply; 19+ messages in thread
From: Yehuda Sadeh Weinraub @ 2011-04-21 22:25 UTC
To: Tommi Virtanen; +Cc: Gregory Farnum, ceph-devel

On Thu, Apr 21, 2011 at 3:00 PM, Tommi Virtanen
<tommi.virtanen@dreamhost.com> wrote:
> On Thu, Apr 21, 2011 at 01:03:57PM -0700, Gregory Farnum wrote:
>> I like what Yehuda has here for its relative simplicity
>
> It's far from simple.
>
> Let's look at the unlink path:
>
> static int lfn_unlink(const char *pathname)
> {
> [...]
> }

This is a work in progress; proper locking is required and will be
applied.

> Now, imagine a colliding file create between the stat and the rename
> -> boom. This is not the only race in there.

Yeah, we're well aware of those races. Note that splitting into
subdirectories is racy too. Imagine one thread/process creating an
object, while the other one is removing a similar object with the same
prefix. The first one tries to create a subtree, while the other is
trying to remove the same subtree. I've seen these issues before,
they're real.

The chances of hitting these issues with a non-hashed structure are
much greater than the chances of hitting those races when an
appropriate hash algorithm is used (the 'zzz' hash is just a filler).

Yehuda
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: long object names
  2011-04-21 22:25 ` Yehuda Sadeh Weinraub
@ 2011-04-21 23:07 ` Tommi Virtanen
  2011-04-22 15:44 ` Sage Weil
  0 siblings, 1 reply; 19+ messages in thread
From: Tommi Virtanen @ 2011-04-21 23:07 UTC
To: Yehuda Sadeh Weinraub; +Cc: Gregory Farnum, ceph-devel

On Thu, Apr 21, 2011 at 03:25:35PM -0700, Yehuda Sadeh Weinraub wrote:
> Yeah, we're well aware of those races. Note that splitting to
> subdirectories is racey too. Imagine one thread/process creating an
> object, while the other one removing a similar object with the same
> prefix. The first one tries to create a subtree, while the other is
> trying to remove the same subtree. I've seen these issues before,
> they're real.

Yup, that's why I said there's a rmdir/mkdir race. You can fix that
two ways:

1. Don't rmdir; there's not going to be that much junk there
   (punting it, but not badly; no harm done, just littering).

2. Make the mkdir & create file case just handle the race; all you
   need is a simple retry loop, there are no problems and the races
   can't cause actual harm.

And more to the point, this is the only kind of race there is.
If FileStore needs to support arbitrary rename etc operations,
they all need this same retry loop, but it's still just the
same retry loop, and can probably be put in a nice utility function.

*There are no other kinds of races*, and it seems FileStore doesn't
really do renames etc anyway.



#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

// try to create a file, using the dynamic dirs trick for long
// filenames. note that this is only needed for file creation; opening
// an existing file needs no mkdir trickery. temporarily modifies
// pathname in place (each slash is restored right after its mkdir),
// returns fd or <0 on errors. pathname is relative to dirfd.
int really_create(int dirfd, char *pathname, int flags, mode_t mode) {
  int ret;
  char *cursor;

  if (!strchr(pathname, '/')) {
    // pathname has no slashes, safe to just open
    ret = openat(dirfd, pathname, flags, mode);
    return ret < 0 ? -errno : ret;
  }

  // go through leading prefixes and mkdir them
 retry:
  cursor = pathname;
  while (1) {
    cursor = strchr(cursor, '/');
    if (!cursor)
      break;
    // terminate the string here temporarily, mkdir that
    *cursor = '\0';
    ret = mkdirat(dirfd, pathname, 0755);
    // restore the slash so we don't forget
    *cursor = '/';
    // and nudge us past the slash
    cursor++;
    if (ret < 0) {
      switch (errno) {
      case EEXIST:
        // it already exists; ignore
        break;
      case ENOENT:
        // somebody rmdir'd a parent path; retry from the top
        goto retry;
      default:
        return -errno;
      }
    }
    // loop back to find the next slash and mkdir that
  }

  // leading path is created (unless we lost a race just now); now do
  // the file operation on the full, untruncated pathname
  ret = openat(dirfd, pathname, flags, mode);
  if (ret < 0) {
    switch (errno) {
    case ENOENT:
      // it seems we lost a race at the last second; do mkdirs again
      goto retry;
    default:
      return -errno;
    }
  }
  return ret;
}


--
:(){ :|:&};:
* Re: long object names
  2011-04-21 23:07 ` Tommi Virtanen
@ 2011-04-22 15:44 ` Sage Weil
  2011-04-22 16:34 ` Tommi Virtanen
  2011-04-22 17:36 ` Colin McCabe
  0 siblings, 2 replies; 19+ messages in thread
From: Sage Weil @ 2011-04-22 15:44 UTC
To: Tommi Virtanen; +Cc: Yehuda Sadeh Weinraub, Gregory Farnum, ceph-devel

Few things:

- I think the xattr approach is always going to be faster. xattrs are
  stored adjacent to the inode in the btree, while creating intervening
  directories means a new inode is allocated, seeked to, and loaded, and
  _then_ the directory content is looked up in another part of the btree
  before the final inode is located. For each level you add two seeks
  (although in the common case, at least, those inodes will be close by).

- The xattrs may make it harder to inspect things out of band (need to
  peek at xattrs instead of subdirectories). OTOH, it's a 1:1 mapping of
  dirent to object, while subdirs are not.

- You can't make intervening directories both rare (long) and useful for
  prefix search (short) unless you really think people will be searching
  on 100+ character prefixes.

- Hash collisions will be rare for all but our test cases. If we only
  hash for long filenames (say, 200+ characters) that means someone has
  to find a SHA-256 collision (has anybody??). And even then they only
  turn 1 stat into 2. Only if someone can generate an arbitrary number of
  inputs that hash to the same value do they get anywhere. I don't think
  that's something we should worry about. If someone breaks a crypto hash
  there are much bigger things to worry about. (Even if we are super
  paranoid, then just sha(name + sha(name)).)

- We can easily wrap the non-fast path with a mutex to avoid the races
  (because, again, collisions are vanishingly rare except in our test
  cases).

- I'm somewhat attracted to the idea of not escaping / and creating
  intervening directories because that's how people frequently use it.
  It's worth noting though that S3 at least doesn't treat / as anything
  special (you can delimit using anything) so we'd only optimize for the
  common case here. And it will slow down _everything_else_ besides
  prefix search. So... bleh.

- Those mkdir helpers may be useful for the prehashing. Or we can just
  precreate the hash dirs (there'll be a fixed power-of-two number of
  them).

- For simplicity, I still think the simplest thing will be to push all
  the escaping/mangling into one layer. One place to audit and unit test.

sage
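The "only hash for long filenames" idea from the list above can be sketched as a single rendering function. This is a hedged illustration, not the lfn branch: `render_name`, `LFN_THRESHOLD`, and `LFN_PREFIX` are names invented here, the 200-character cutoff is taken from the "say, 200+ characters" remark, and FNV-1a is used purely as a dependency-free stand-in for the SHA-256 the message proposes. A real implementation would also store the full name (for example in an xattr) so a lookup can confirm it found the right object.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LFN_THRESHOLD 200   /* assumed cutoff: names up to this pass through */
#define LFN_PREFIX    32    /* assumed: readable prefix kept from long names */

/* FNV-1a, 64-bit. A stand-in only; the thread proposes SHA-256. */
static uint64_t fnv1a64(const char *s)
{
    uint64_t h = 0xcbf29ce484222325ULL;     /* FNV offset basis */
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 0x100000001b3ULL;              /* FNV prime */
    }
    return h;
}

/*
 * Render an object name into an on-disk file name.
 * Short names are copied unchanged (the fast path, no hashing at all);
 * long names become "<first LFN_PREFIX chars>_<16 hex digits of hash>".
 * Returns the rendered length, or -1 if out is too small.
 */
int render_name(const char *name, char *out, size_t out_size)
{
    size_t len = strlen(name);
    int n;

    assert(out != NULL && out_size > 0);

    if (len <= LFN_THRESHOLD)
        n = snprintf(out, out_size, "%s", name);
    else
        n = snprintf(out, out_size, "%.*s_%016llx", LFN_PREFIX, name,
                     (unsigned long long)fnv1a64(name));

    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

Keeping the whole decision in one function matches the "one place to audit and unit test" argument: escaping and the snapid/head suffix could be added to the same routine without another pass in a different layer.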
* Re: long object names
  2011-04-22 15:44 ` Sage Weil
@ 2011-04-22 16:34 ` Tommi Virtanen
  2011-04-22 17:36 ` Colin McCabe
  1 sibling, 0 replies; 19+ messages in thread
From: Tommi Virtanen @ 2011-04-22 16:34 UTC
To: Sage Weil; +Cc: Yehuda Sadeh Weinraub, Gregory Farnum, ceph-devel

On Fri, Apr 22, 2011 at 08:44:49AM -0700, Sage Weil wrote:
> - We can easily wrap the non-fast path with a mutex to avoid the races
> (because, again, collisions are vanishingly rare except in our test
> cases).

How do you guard against crashes, e.g. the create+set_xattr crashing
before set_xattr?

How do you guard against gaps in the sequence number thing? (Perhaps
make that part a random string, and change consumers to listdir
instead of probing 1,2,3...)

How do you convince yourself you've covered all the races?

> - For simplicity, I still think the simplest thing will be to push all the
> escaping/mangling into one layer. One place to audit and unit test.

I think the big functional benefit with that is that you can have the
suffix not be obscured by the hash; FOO_a43fec_n_head not FOO_a43fec_n

--
:(){ :|:&};:
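One common answer to the first question above (a crash between create and set_xattr) is to never expose the short name until the xattr is already in place: create under a temporary name, set the attribute, then rename. The sketch below is an assumption-laden illustration, not what the lfn branch does. `create_with_longname` and the attribute name `user.lfn.name` are hypothetical; `fsetxattr()` is the Linux call from `<sys/xattr.h>` and needs a filesystem with user xattr support.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/xattr.h>
#include <unistd.h>

#define LFN_XATTR "user.lfn.name"   /* hypothetical attribute name */

/*
 * Hypothetical helper: create `short_name` in dirfd so that it never
 * exists without its long-name xattr. A crash before the rename leaves
 * only a "*.tmp.*" file, which a startup scan could delete.
 * Caveats: rename replaces an existing short_name, and making the rename
 * itself durable would also need an fsync on the directory.
 * Returns an open fd or -errno.
 */
int create_with_longname(int dirfd, const char *short_name,
                         const char *long_name)
{
    char tmp[300];
    int err = 0;

    assert(short_name != NULL && long_name != NULL);

    int n = snprintf(tmp, sizeof(tmp), "%s.tmp.%ld", short_name,
                     (long)getpid());
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -ENAMETOOLONG;

    int fd = openat(dirfd, tmp, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd < 0)
        return -errno;

    /* order matters: xattr first, flush, and only then publish the name */
    if (fsetxattr(fd, LFN_XATTR, long_name, strlen(long_name), 0) < 0 ||
        fsync(fd) < 0 ||
        renameat(dirfd, tmp, dirfd, short_name) < 0)
        err = errno;

    if (err) {
        unlinkat(dirfd, tmp, 0);    /* best-effort cleanup */
        close(fd);
        return -err;
    }
    return fd;
}
```

This addresses only the create path. Gaps in sequence numbers and the unlink path raised in the same message are separate problems that this sketch does not touch.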
* Re: long object names
  2011-04-22 15:44 ` Sage Weil
  2011-04-22 16:34 ` Tommi Virtanen
@ 2011-04-22 17:36 ` Colin McCabe
  1 sibling, 0 replies; 19+ messages in thread
From: Colin McCabe @ 2011-04-22 17:36 UTC
To: Sage Weil
Cc: Tommi Virtanen, Yehuda Sadeh Weinraub, Gregory Farnum, ceph-devel

On Fri, Apr 22, 2011 at 8:44 AM, Sage Weil <sage@newdream.net> wrote:
> Few things:
>
> - I think the xattr approach is always going to be faster. xattrs are
> stored adjacent to the inode in the btree, while creating intervening
> directories means a new inode is allocated, seeked to, and loaded, and
> _then_ the directory content is looked up in another part of the btree
> before the final inode is located. For each level you add two seeks
> (although in the common case, at least, those inodes will be close by).

Fair enough.

> - You can't make intervening directories both rare (long) and useful for
> prefix search (short) unless you really think people will be searching on
> 100+ character prefixes.

Earlier I suggested making it configurable, so that we could have it
tuned to a short value on the cluster backing rgw, but a long value
elsewhere.

> - Hash collisions will be rare for all but our test cases. If we only
> hash for long filenames (say, 200+ characters) that means someone has to
> find a SHA-256 collision (has anybody??). And even then they only turn 1
> stat into 2. Only if someone can generate an arbitrary number of inputs
> that hash to the same value do they get anywhere. I don't think that's
> something we should worry about. If someone breaks a crypto hash there
> are much bigger things to worry about. (Even if we are super paranoid,
> then just sha(name + sha(name)).)

A good guide to choosing a crypto hash:
http://valerieaurora.org/hash.html

> - We can easily wrap the non-fast path with a mutex to avoid the races
> (because, again, collisions are vanishingly rare except in our test
> cases).

I believe that all these operations are already done under the PG
lock. So there are no race conditions in normal operation. TV is
talking about a case where there has been a crash and we're resuming
from some intermediate state. Based on our earlier discussion, perhaps
this is not a problem on btrfs because of the snapshotting mechanic?

cheers,
Colin