* Re: UTF-8 and case-insensitivity
From: Pascal Schmidt @ 2004-02-19 0:06 UTC
To: tridge; +Cc: linux-kernel

On Thu, 19 Feb 2004 00:40:21 +0100, you wrote in linux.kernel:

> Because a large number of file operations are on filenames that don't
> exist. I have to *prove* they don't exist. That includes:

Evil question: do you need to be case-preserving? 'Cause if not, you
could simply squash all incoming filenames from case-insensitive clients
to some canonical form (say, all lower-case) and use that.

--
Ciao,
Pascal

* Re: UTF-8 and case-insensitivity
From: tridge @ 2004-02-19 1:01 UTC
To: Pascal Schmidt; +Cc: linux-kernel

Pascal,

> Evil question: do you need to be case-preserving? 'Cause if not, you
> could simply squash all incoming filenames from case-insensitive clients
> to some canonical form (say, all lower-case) and use that.

Yes, we have to be case preserving, but that's not the problem. Keeping
some name mapping in user space or xattrs is tedious but conceptually
easy and potentially quite efficient.

The problem is that Samba isn't the only program to be accessing these
directories. Multi-protocol file servers and file servers where users
also have local access are common. That means we can't assume that
some other filesystem user hasn't created a file which matches in a
case-insensitive manner. That means we need to do an awful lot of
directory scans.

I also understand the decision Linus has made that we won't be doing
anything fundamental at the filesystem level to fix this, so we will
just have to live with it.

Cheers, Tridge

* RE: UTF-8 and case-insensitivity
From: Hua Zhong @ 2004-02-19 1:08 UTC
To: tridge, 'Pascal Schmidt'; +Cc: linux-kernel

> The problem is that Samba isn't the only program to be accessing these
> directories. Multi-protocol file servers and file servers where users
> also have local access are common. That means we can't assume that
> some other filesystem user hasn't created a file which matches in a
> case-insensitive manner. That means we need to do an awful lot of
> directory scans.

Do you also require NFSD or other file daemons to do the same
case-insensitivity check?

Say you create a foo, how do you prevent NFSD from creating FOO? What
could you do about that?

* RE: UTF-8 and case-insensitivity
From: tridge @ 2004-02-19 1:46 UTC
To: hzhong; +Cc: 'Pascal Schmidt', linux-kernel

Hua,

> Do you also require NFSD or other file daemons to do the same
> case-insensitivity check?

No. That's the point of the per-process check. Only Samba needs to pay
the price.

> Say you create a foo, how do you prevent NFSD from creating FOO?
> What could you do about that?

You don't need to do anything in particular about it. I did explain
this earlier in this thread, but here goes again:

 * samba always tries the name exactly as given by the client. If that
   succeeds then we are done.

 * if it doesn't find an exact match then it does a directory scan. It
   uses the first case-insensitive matching name it finds, or if it
   reaches the end of the directory then it concludes that the file
   doesn't exist.

So if FOO and foo both exist in the filesystem, and someone asks for
FoO then it's pretty much random which one they get (ok, not exactly
random, but close enough for this argument).

The thing is that just making an arbitrary choice is a perfectly fine
set of semantics. You can't deal with this situation any more sanely,
so don't even try.

Well, actually, there is something you could do that we don't do. We
could have some special marker that distinguishes files created by
windows clients and files created by unix clients, and preferentially
return the one created by windows clients. I just don't think this is
worth doing. Nobody has even complained (within earshot of me anyway)
of the current "pick one" method.

Cheers, Tridge

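The two-step lookup described in this message can be sketched in C.
This is a minimal illustration under stated assumptions, not Samba's
code: pick_name and ascii_ci_equal are hypothetical names, the entries
array stands in for what a readdir() loop would return, and the
comparison is ASCII-only rather than a real Unicode case mapping.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* ASCII-only case-insensitive equality (assumption: a real server
 * would need a proper Unicode case table instead). */
int ascii_ci_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings end together */
}

/* Hypothetical helper: `entries` models the directory listing.
 * Returns the on-disk name to use, or NULL if no case variant exists. */
const char *pick_name(const char *const *entries, size_t n,
                      const char *wanted)
{
    size_t i;

    /* Step 1: the name exactly as the client gave it. In real code
     * this is a single stat() or open(), not a scan. */
    for (i = 0; i < n; i++)
        if (strcmp(entries[i], wanted) == 0)
            return entries[i];

    /* Step 2: full scan; the first case-insensitive match wins. */
    for (i = 0; i < n; i++)
        if (ascii_ci_equal(entries[i], wanted))
            return entries[i];

    /* Only after visiting every entry is the negative proven. */
    return NULL;
}
```

Note that the NULL result is the costly path: it is reached only after
every entry has been compared, which is the "prove it doesn't exist"
cost discussed throughout the thread.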
* Re: UTF-8 and case-insensitivity
From: Theodore Ts'o @ 2004-02-19 2:44 UTC
To: tridge; +Cc: Pascal Schmidt, linux-kernel

On Thu, Feb 19, 2004 at 12:01:53PM +1100, tridge@samba.org wrote:
> The problem is that Samba isn't the only program to be accessing these
> directories. Multi-protocol file servers and file servers where users
> also have local access are common. That means we can't assume that
> some other filesystem user hasn't created a file which matches in a
> case-insensitive manner. That means we need to do an awful lot of
> directory scans.

Actually, not necessarily. What if Samba gets notifications of all
filename renames and creates in the directory, so that after the
initial directory scan, it can keep track of what filenames are
present in the directory? It can then "prove the negative", as you
put it, without having to continuously do directory scans.

Yeah, there can be some race conditions, but Samba already has to deal
with the race condition where it tries to create "MaKeFiLe" either
just before or just after a POSIX process creates "Makefile".

						- Ted

* Re: UTF-8 and case-insensitivity
From: tridge @ 2004-02-19 3:20 UTC
To: Theodore Ts'o; +Cc: Pascal Schmidt, linux-kernel

Ted,

> Actually, not necessarily. What if Samba gets notifications of all
> filename renames and creates in the directory, so that after the
> initial directory scan, it can keep track of what filenames are
> present in the directory? It can then "prove the negative", as you
> put it, without having to continuously do directory scans.

Currently dnotify doesn't give you the filename that is being
added/deleted/renamed. It just tells you that something has happened,
but not enough to actually maintain a name cache in user space.

That could be changed, so that on a dnotify event you do a fcntl() to
ask for the name of the file. Or perhaps we could cram it into the
structure the signal handler gets passed? I doubt that would make
sense, but maybe some signal guru can tell me otherwise. Maybe we
could even invent a new dnotify system where you do a read on a file
descriptor to get details on what event happened, and give some
"everything has changed" error when you run out of buffers.

If that happened then we could build our own dcache in user space, but
it will be a very second rate dcache, with a racy and slow update
mechanism that will in itself chew cpu. Maybe that's the best we can
do, or maybe I should be asking distro vendors if they would consider
a case-insensitive patch, especially the vendors aiming for
"enterprise" scalability which might include serving windows clients.

> Yeah, there can be some race conditions, but Samba already has to deal
> with the race condition where it tries to create "MaKeFiLe" either
> just before or just after a Posix process creates "Makefile".

Yes, that's true. The races aren't my primary concern really.

I've spent the last week doing profiling of a large Samba install, and
after fixing a horrendous scalability problem to do with fcntl locking
(more on that later) the next thing on the profile is stat() and
directory scans. That's why the efficiency of this stuff is a hot
topic for me right now.

It's not all as bleak as perhaps I make it seem though. I suspect
there is still quite a bit of improvement that can be made in Samba
just because our code is so messy that sometimes we do a stat() call
or a directory scan when perhaps we can prove that we don't need to.
The Samba4 code is much cleaner, and maybe we have room to keep
improving things for a couple of years by finding those inefficiencies
and fixing them. We will eventually hit a wall, but it could be a fair
way off. Maybe windows will be dead by then.

Cheers, Tridge

* Re: UTF-8 and case-insensitivity
From: Helge Hafting @ 2004-02-19 10:18 UTC
To: tridge; +Cc: Theodore Ts'o, Pascal Schmidt, linux-kernel

tridge@samba.org wrote:
> Ted,
>
> > Actually, not necessarily. What if Samba gets notifications of all
> > filename renames and creates in the directory, so that after the
> > initial directory scan, it can keep track of what filenames are
> > present in the directory? It can then "prove the negative", as you
> > put it, without having to continuously do directory scans.
>
> Currently dnotify doesn't give you the filename that is being
> added/deleted/renamed. It just tells you that something has happened,
> but not enough to actually maintain a name cache in user space.
>
You can still keep per-directory caches that you simply invalidate on
each dnotify, and rebuild when necessary. At least it would help the
"repeated lookup of nonexistent filenames" case. Path searches for
executables usually happen on directories that don't see much writing.

Helge Hafting

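A per-directory cache that is simply dropped on each notification, as
suggested here, might look roughly like the sketch below. Everything in
it is hypothetical and for illustration only: the dir_cache structure,
the scan callback (which in real code would wrap opendir()/readdir()),
and the demo_scan stand-in directory. The dnotify plumbing itself is
left out; the assumption is that whatever receives the notification
calls cache_invalidate(). The name comparison is ASCII-only.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 64

/* Hypothetical rescan callback: fills `names`, returns the count. */
typedef size_t (*scan_fn)(const char **names, size_t max);

struct dir_cache {
    int valid;               /* cleared on every change notification */
    size_t count;
    const char *names[MAX_NAMES];
    scan_fn scan;
    unsigned long rescans;   /* how many full scans we have paid for */
};

void cache_init(struct dir_cache *c, scan_fn scan)
{
    memset(c, 0, sizeof *c);
    c->scan = scan;
}

/* To be called whenever the directory is reported as changed. */
void cache_invalidate(struct dir_cache *c)
{
    c->valid = 0;
}

/* ASCII-only case-insensitive equality. */
int ci_equal(const char *a, const char *b)
{
    for (; *a && *b; a++, b++)
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
    return *a == *b;
}

/* Returns 1 if some case variant of `name` is cached, 0 if not.
 * Rescans only when the cache has been invalidated. */
int cache_contains(struct dir_cache *c, const char *name)
{
    size_t i;

    if (!c->valid) {
        c->count = c->scan(c->names, MAX_NAMES);
        c->valid = 1;
        c->rescans++;
    }
    for (i = 0; i < c->count; i++)
        if (ci_equal(c->names[i], name))
            return 1;
    return 0;
}

/* Demo stand-in for a real directory; one file can "appear" later. */
int demo_extra_present = 0;

size_t demo_scan(const char **names, size_t max)
{
    size_t n = 0;

    if (n < max) names[n++] = "notepad.exe";
    if (n < max) names[n++] = "kernel32.dll";
    if (demo_extra_present && n < max) names[n++] = "new.dll";
    return n;
}
```

With this shape, repeated negative lookups in a quiet directory cost
one scan in total, and a change costs one extra scan on the next
lookup. Between the change and the notification the cache is stale,
which is the race already acknowledged in the thread.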
* Re: UTF-8 and case-insensitivity
From: Paulo Marques @ 2004-02-19 12:11 UTC
To: tridge; +Cc: Theodore Ts'o, Pascal Schmidt, linux-kernel

tridge@samba.org wrote:

> Currently dnotify doesn't give you the filename that is being
> added/deleted/renamed. It just tells you that something has happened,
> but not enough to actually maintain a name cache in user space.

This might be a crazy / stupid idea, so flame at will :)

Wouldn't it be possible to do a samba "super-server" mode, in which samba
would assume that it controlled the directories it is exporting?

In this mode a "corporate" Samba server, serving Windows clients, could
improve performance by assuming that its cache was always up-to-date.

If we wanted to access the directory locally we could always mount
locally using samba, and access the files anyway, albeit a lot slower and
without linux permissions, etc.

What we would gain was the ability to say "I want to give priority to my
samba server" (and set it to "super-server" mode) or "my priority is to the
linux native filesystem, and just want to share my files with windows users
anyway" (and keep using samba as always).

--
Paulo Marques - www.grupopie.com

"In a world without walls and fences who needs windows and gates?"

* Re: UTF-8 and case-insensitivity
From: Helge Hafting @ 2004-02-19 19:04 UTC
To: Paulo Marques; +Cc: tridge, Theodore Ts'o, Pascal Schmidt, linux-kernel

On Thu, Feb 19, 2004 at 12:11:32PM +0000, Paulo Marques wrote:
> tridge@samba.org wrote:
>
> >Currently dnotify doesn't give you the filename that is being
> >added/deleted/renamed. It just tells you that something has happened,
> >but not enough to actually maintain a name cache in user space.
>
> This might be a crazy / stupid idea, so flame at will :)
>
> Wouldn't it be possible to do a samba "super-server" mode, in which samba
> would assume that it controlled the directories it is exporting?
>
> In this mode a "corporate" Samba server, serving Windows clients, could
> improve performance by assuming that its cache was always up-to-date.
>
> If if we wanted to access the directory locally we could always mount
> locally using samba, and access the files anyway, albeit a lot slower and
> without linux permissions, etc.
>
You don't really need to go to such extremes. Samba can use dnotify,
and run with caching and great performance as long as nobody touches
the files in other ways. There is no need to _enforce_ it though;
samba can cope by invalidating the cache on those rare occasions the
files are accessed in other ways. It won't happen often, because:

1. Linux/nfs people have no business in a directory full of
   windows .dll's and .exe's
2. On a corporate server you simply tell people to stay out.
   nfs may export another set of homedirs for the unix people.

> What we would gain was the ability to say "I want to give priority to my
> samba server" (and set it to "super-server" mode) or "my priority is to the
> linux native filesystem, and just want to share my files with windows users
> anyway" (and keep using samba as always).
>
Thanks to dnotify even the "linux priority" setup will be able to
benefit from a cache. Particularly if we can get a dnotify that
doesn't trip when samba is the one making changes.

Helge Hafting

* Re: UTF-8 and case-insensitivity
From: Theodore Ts'o @ 2004-02-19 14:08 UTC
To: tridge; +Cc: Pascal Schmidt, linux-kernel

On Thu, Feb 19, 2004 at 02:20:44PM +1100, tridge@samba.org wrote:
> Currently dnotify doesn't give you the filename that is being
> added/deleted/renamed. It just tells you that something has happened,
> but not enough to actually maintain a name cache in user space.
>
> That could be changed, so that on a dnotify event you do a fcntl() to
> ask for the name of the file. Or perhaps we could cram it into the
> structure the signal handler gets passed? I doubt that would make
> sense, but maybe some signal guru can tell me otherwise. Maybe we
> could even invent a new dnotify system where you do a read on a file
> descriptor to get details on what event happened, and give some
> "everything has changed" error when you run out of buffers.

Yes, that's what I was suggesting. One advantage of such a scheme is
that it's not just for Windows compatibility. A richer directory
change notification scheme would also be useful for graphical file
managers, automatic indexing tools, and many, many other applications.

No, it's not everything you were requesting, but it may very well
represent three-quarters of a loaf, instead of nothing.

> If that happened then we could build our own dcache in user space, but
> it will be a very second rate dcache, with a racy and slow update
> mechanism that will in itself chew cpu. Maybe thats the best we can
> do, or maybe I should be asking distro vendors if they would consider
> a case-insensitive patch, especially the vendors aiming for
> "enterprise" scalability which might include serving windows clients.

I don't know that the update mechanism has to seriously chew that much
CPU. It can certainly be designed to minimize the amount of CPU that
is consumed, especially if it is read via a file descriptor so that
multiple updates can be sent via a single read() system call, instead
of sending a signal every single time a directory entry is created,
renamed, or deleted.

The problem with a case-insensitive patch is that for most modern
filesystems (i.e., any filesystem that does better than O(n) directory
searches), it will have to involve a format change, since the case
insensitivity has to be built into the hash function or the tree
comparison function, or both. At this point, the filesystem author has
to make the choice of whether to try to solve the Windows-specific
problem, in which case the fundamental filesystem format would have to
be tailored to the Windows case mapping table, or try to solve the
more general I18N case mapping problem. (Lots of luck! It's
constantly changing over time as new character sets are added or
modified...)

Yes, a few such filesystems might have this support already, but I
doubt distributions would be willing to accept patches that make
filesystem format-incompatible changes just for the sake of
accelerating Samba operations.

I don't know if the distributions would be willing to accept a
case-insensitive patch, but my suspicion is that it would be
difficult, and I would argue that it might be more efficient to get a
richer directory change notification system, for the reasons I argued
above.

						- Ted

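The batched scheme discussed here (several changes delivered by one
read(), plus an "everything has changed" marker when buffers run out)
could be consumed along the following lines. This is only a sketch
under assumptions: the dir_event and name_set types, the event codes
and apply_events are all invented for illustration, no such kernel
interface is implied, and a rename is modelled as a delete followed by
a create.

```c
#include <stddef.h>
#include <string.h>

#define SET_MAX 32
#define NAMELEN 64

/* Hypothetical event kinds a richer notifier might report. */
enum ev_type { EV_CREATE, EV_DELETE, EV_OVERFLOW };

struct dir_event {
    enum ev_type type;
    const char *name;            /* unused for EV_OVERFLOW */
};

/* User-space mirror of one directory's names. */
struct name_set {
    int in_sync;                 /* 0 means: rescan the directory */
    size_t count;
    char names[SET_MAX][NAMELEN];
};

/* Index of `name`, or s->count if it is not present. */
size_t find_name(const struct name_set *s, const char *name)
{
    size_t i;

    for (i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0)
            return i;
    return s->count;
}

/* Apply one batch of events, as if decoded from a single read(). */
void apply_events(struct name_set *s, const struct dir_event *ev, size_t n)
{
    size_t i, pos;

    for (i = 0; i < n && s->in_sync; i++) {
        switch (ev[i].type) {
        case EV_CREATE:
            if (find_name(s, ev[i].name) != s->count)
                break;                       /* already tracked */
            if (s->count == SET_MAX || strlen(ev[i].name) >= NAMELEN) {
                s->in_sync = 0;              /* cannot track it */
                break;
            }
            strcpy(s->names[s->count++], ev[i].name);
            break;
        case EV_DELETE:
            pos = find_name(s, ev[i].name);
            if (pos != s->count) {
                s->count--;                  /* move last entry down */
                memmove(s->names[pos], s->names[s->count], NAMELEN);
            }
            break;
        case EV_OVERFLOW:
            s->in_sync = 0;                  /* events were lost */
            break;
        }
    }
}
```

Once in_sync drops to 0 the caller has to fall back to a full directory
scan, so the overflow path degrades to the invalidate-and-rebuild
behaviour rather than to wrong answers.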
* RE: UTF-8 and case-insensitivity
From: Robert White @ 2004-02-19 20:12 UTC
To: tridge, 'Theodore Ts'o'; +Cc: 'Pascal Schmidt', linux-kernel

(I may, of course, be overly naive... but a thought occurs... 8-)

It would seem that there is a moment of opportunity at the
dentry_operations invocation point to harvest all the information you
would need to maintain a specialized dcache in a separate module.
Unfortunately, since the individual file systems get to tweak their
own pointer(s) to this/these struct-of-calls it could get hard to
hijack things at that level.

With two changes to core Linux behavior, which could easily be
implemented as a configurable kernel option, you could create an
advisory hook.

1) add a usually-null pointer(*) to dentry_operations structure to the
   superblock data structure in vfs (and, of course, an install/remove
   structure call pair) as a look-aside mechanism, and

2) if-not-null "parallel" invocations of these "advisory" calls are
   then added to the fixed vfs invocation points alongside the normal
   dentry notices...

You could then add any imaginable advisory behavior to any file system.

A well crafted module could then attach to file systems, flag
directories (+), and get low-level advisory service at core dentry
action time. A module so attached could answer all your negative
enquiries quickly and yet remain nicely segregated. You could probably
create the magic_open dream logic of your choice and net near, if not
absolute, race elimination.

You still might have to readdir a whole directory from time to time
just to clean up a partially aged cache, but there would be no need
for the stepwise transfer of this information into the user context.

100% of the native function of each file system is preserved and there
are probably other applications for this look-aside feature like
low-level security auditing or semantic mirroring (a-la real-time
rdist).

But, you know, just a thought...

Rob.

(*) this should, if enabled, be arranged as a linked list of structures
so that multiple modules could be installed for different purposes.

(+) flagging and un-flagging directories of interest ad-hoc is needed
to prevent saturation of resources.

* Re: UTF-8 and case-insensitivity
From: Andy Lutomirski @ 2004-02-18 1:09 UTC
To: Kernel Mailing List; +Cc: Andrew Tridgell, Linus Torvalds, Al Viro

Linus Torvalds wrote:

> 	int magic_open(
> 		/* Input arguments */
> 		const char *pathname,
> 		unsigned long flags,
> 		mode_t mode,
>
> 		/* output arguments */
> 		int *fd,
> 		struct stat *st,
> 		int *successful_path_length);
>
> ie the system call would:
>
>  - look up as far into the pathname (using _exact_ lookup) as possible
>  - return the error code of the last failure
>  - the "flags" could be extended so that you can specify that you mustn't
>    traverse ".." or symlinks (ie those would count as failures)
>
> but also:
>
>  - fill in the "struct stat" information for the last _successful_
>    pathname component.
>  - fill in the "fd" with a fd of the last _successful_ pathname component.
>  - tell how much of the pathname it could traverse.

Aside from just case-insensitivity, I imagine this could give lots of
other benefits:

 - file servers that don't want to follow symlinks can do it quickly.
 - Apache could serve things like http://www.foo.com/a/b/c/d.php/e/f/g
   a lot faster.
 - a flag to avoid traversing mountpoints could help someone
 - a flag for root to see _through_ mountpoints would make it possible
   to clean up initramfs and such that got mounted over, or to do other
   useful and currently impossible tasks. (e.g. I could see what's
   under my devfs mount...)

It would be nice to see this added even if it's not the perfect
solution for samba :)

BTW, here's a thought for solving samba's negative lookup problem:

	int ugly_stat(char *pattern, struct stat *st, char *match_out)

Pattern would be some description of what the filename should look
like. Something like:

 - pattern is an array of slash-delimited groups of characters
   separated by nulls and terminated by two nulls. For example,
   ugly_stat("F/f\0O/o\0O/o\0\0", ...) finds a file called foo,
   case-insensitively in English, while
   ugly_stat("F\0i\0l\0e\011/22/33") finds "File" followed by either
   11, 22, or 33.

 - the dcache problem is easy: don't use it. All Andrew wants (I
   think) is proof that there is no such file or the name if there is
   one. Samba can cache it itself; I don't think the kernel should
   involve itself in trying to cache this.

 - ugly_stat does not traverse directories -- that's why the slash
   trick is safe.

 - st gets the stat data, and match_out gets the filename if any

 - if there are multiple matches, one is arbitrarily selected.

If the file-system doesn't have specific support for this, then either
VFS or the caller could emulate it (probably VFS -- it would avoid
lots of syscalls).

Would ugly_stat + magic_open be sufficient?

--Andy

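To make the proposed pattern format concrete, here is a sketch of a
matcher for a single filename. It is one reading of the description
above, not an existing interface: ugly_match is a hypothetical name,
and it only covers the per-name test that a filesystem or VFS
emulation would apply to each directory entry. One practical wrinkle
when writing such patterns as C string literals: "\0" directly
followed by digits is parsed as one octal escape ("\011" is a tab), so
the second example has to be split into adjacent literals.

```c
#include <stddef.h>
#include <string.h>

/* Match `name` against a pattern made of NUL-terminated groups, each
 * group listing '/'-separated alternatives, with an empty group (two
 * consecutive NULs) marking the end. Returns 1 on match, 0 otherwise.
 * The caller must guarantee the double-NUL terminator. */
int ugly_match(const char *pat, const char *name)
{
    size_t group_len;
    const char *group_end, *next, *alt;

    if (*pat == '\0')              /* empty group: end of pattern */
        return *name == '\0';      /* the name must be used up too */

    group_len = strlen(pat);
    group_end = pat + group_len;
    next = group_end + 1;          /* first byte of the next group */
    alt = pat;

    for (;;) {
        const char *slash = memchr(alt, '/', (size_t)(group_end - alt));
        size_t alt_len = (size_t)((slash ? slash : group_end) - alt);

        /* Try this alternative, then the rest of the pattern. */
        if (strncmp(name, alt, alt_len) == 0 &&
            ugly_match(next, name + alt_len))
            return 1;

        if (slash == NULL)
            return 0;              /* no alternatives left */
        alt = slash + 1;
    }
}

/* "foo" in any ASCII case; the literal's implicit NUL supplies the
 * second terminator. */
const char foo_any_case[] = "F/f\0O/o\0O/o\0";

/* "File" then 11, 22 or 33; split so "\0" "11" is not read as \011. */
const char file_numbered[] = "F\0i\0l\0e\0" "11/22/33\0";
```

Because alternatives are tried in order with backtracking, groups whose
alternatives differ in length (as multi-byte UTF-8 case pairs can) are
handled as well, at the price of extra comparisons.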
* UTF-8 and case-insensitivity
From: tridge @ 2004-02-17 4:12 UTC
To: linux-kernel

Given how much pain the "kernel is agnostic to charset encoding"
attitude has cost me in terms of programming pain, I thought I should
de-cloak from lurk mode and put my 2c into the UTF-8 issue.

Personally I think that eventually the Linux kernel will have to
embrace the interpretation of the byte streams that applications have
given it, despite the fact that this will be very painful and
potentially quite complex. The reason is that I think that eventually
the Linux kernel will need to efficiently support a userspace policy
of case-insensitivity and the only way to do case-insensitive filename
operations is to interpret those byte streams as a particular
encoding.

Personally I much prefer the systems I use to be case-sensitive, but
there are important applications that require case-insensitivity for
interoperability. Right now it is not possible to write a case
insensitive application on Linux in an efficient manner. With the
current "encoding agnostic" APIs a simple open() or stat() call
becomes a horrendously expensive operation and one that is fraught
with race conditions. Providing the same functionality in the kernel
is dirt cheap by comparison (not cheap in terms of code complexity,
but cheap in terms of runtime efficiency).

Cheers, Tridge

* Re: UTF-8 and case-insensitivity
From: Linus Torvalds @ 2004-02-17 5:11 UTC
To: Andrew Tridgell; +Cc: Kernel Mailing List, Al Viro

[ Al cc'd, because while I'm pretty certain that he agrees with me 100% on
  the craziness of case-insensitive name lookups, he may have some input
  on the "samba helper" function approach. That input may well boil down
  to "Linus is crazy", of course. Wouldn't be the first time ;)

  Andrew - you really should assume that case insensitivity is a hell of a
  lot more costly than you think it is, and forget that particular idea.
  Let's see if there are acceptable half-measures. ]

On Tue, 17 Feb 2004 tridge@samba.org wrote:
>
> Given how much pain the "kernel is agnostic to charset encoding"
> attitude has cost me in terms of programming pain, I thought I should
> de-cloak from lurk mode and put my 2c into the UTF-8 issue.
>
> Personally I think that eventually the Linux kernel will have to
> embrace the interpretation of the byte streams that applications have
> given it, despite the fact that this will be very painful and
> potentially quite complex.

I seriously doubt it. There just isn't any point.

> The reason is that I think that eventually
> the Linux kernel will need to efficiently support a userspace policy
> of case-insensitivity and the only way to do case-insensitive filename
> operations is to interpret those byte streams as a particular
> encoding.

The thing is, if you want to do efficient user-space case-insensitive
lookups, that is a _completely_ different matter from having the kernel do
case-insensitivity.

Kernel-level case insensitivity is a total disaster, and your "very
painful and potentially quite complex" assertion is the understatement of
the year. The thing is, you can't sanely do dentry caching, since the case
insensitivity has to be per-open or at least per-process (you MUST NOT be
case-insensitive in a POSIX process).

So the only way to do case-insensitive names is to do all lookups very
slowly. I'm willing to bet that WNT opens files a hell of a lot slower
than Linux does, and one big portion of that is exactly the fact that
Linux can do a really good job with the dentry cache.

And that _depends_ on a well-defined and unique filename setup (by
changing the hashing function and compare function, a filesystem can do a
limited kind of case-insensitivity right now in Linux, but then it will
have to be not only fairly slow, but also case-insensitive for _everybody_
which is unacceptable in a mixed POSIX/samba environment).

In other words, just forget the whole notion. The only set of people who
have any reason at _all_ to want it is the samba team, and we can solve
the samba-specific problems other ways.

Just take that as a simple fact - case insensitivity in the kernel is such
a horribly bad idea, that you really shouldn't go there.

With that destructive criticism out of the way, let's look at somewhat
more constructive approaches, ie some way to allow certain processes that
need it better help in their quest for case insensitivity.

Let's start with some assumptions:

 - MOST name lookups are likely results of some kind of "readdir()"
   lookup, and tend to have the case right in the first place. So that
   should go fast. Maybe Tridge has some statistics on this one?

 - samba probably has certain pretty well-defined special patterns for
   what it wants to do with a filename, so you probably don't need a
   generic "everything that takes a filename should be case-insensitive",
   and it would be acceptable to have a few _very_ specific system calls.

With those assumptions out of the way, we could think of an interface that
exports some partial functionality of the "lookup_path()" code in the
kernel as a special system call. In particular, something that takes an
input pathname, and is able to stop at any point of the name when a lookup
fails.

So some variation of the interface

	int magic_open(
		/* Input arguments */
		const char *pathname,
		unsigned long flags,
		mode_t mode,

		/* output arguments */
		int *fd,
		struct stat *st,
		int *successful_path_length);

ie the system call would:

 - look up as far into the pathname (using _exact_ lookup) as possible
 - return the error code of the last failure
 - the "flags" could be extended so that you can specify that you mustn't
   traverse ".." or symlinks (ie those would count as failures)

but also:

 - fill in the "struct stat" information for the last _successful_
   pathname component.
 - fill in the "fd" with a fd of the last _successful_ pathname component.
 - tell how much of the pathname it could traverse.

so that the user can do a "readdir" and try to "fix up" the problem
without having to restart the whole thing.

For the (hopefully common case) where the cases match, this would just
boil down to an "open with stat information" thing.

We'd need something more interesting to guarantee unique filename on file
create, possibly even including letting a trusted process maintain some
locks in the VFS layer.

The point being that the kernel can _help_ some specific usage, but making
case-insensitive names be part of the VFS layer proper is not acceptable.

I suspect we can do case-insensitive names faster than WNT even with a
fairly complex user-mode interface. Just because _not_ having them in the
kernel allows us to have much faster default behaviour.

		Linus

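The "tell how much of the pathname it could traverse" part of the
proposal can be illustrated with a small user-space model. This is a
sketch only: magic_open does not exist, exact_prefix_len and the
exists_fn predicate are hypothetical names, and the demo_exists
predicate with its tiny in-memory tree stands in for what a real caller
would implement with stat(). It does not open anything or fill in stat
data; it just shows where the exact walk stops.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: nonzero if this exact path exists. */
typedef int (*exists_fn)(const char *path);

/* Return how many leading bytes of `path` (cut at a component
 * boundary) resolve by exact lookup; 0 if not even the first
 * component does. */
size_t exact_prefix_len(const char *path, exists_fn exists)
{
    char buf[4096];
    size_t len = strlen(path);
    size_t good = 0;
    size_t i;

    if (len >= sizeof buf)
        return 0;

    for (i = 1; i <= len; i++) {
        if (i != len && path[i] != '/')
            continue;              /* still inside a component */
        if (path[i - 1] == '/')
            continue;              /* empty component, e.g. "a//b" */

        memcpy(buf, path, i);      /* prefix ending at this component */
        buf[i] = '\0';
        if (!exists(buf))
            break;                 /* first failing component */
        good = i;
    }
    return good;
}

/* Demo stand-in for the filesystem: three exact paths exist. */
int demo_exists(const char *path)
{
    static const char *const tree[] = {
        "/srv", "/srv/share", "/srv/share/Docs",
    };
    size_t i;

    for (i = 0; i < sizeof tree / sizeof tree[0]; i++)
        if (strcmp(tree[i], path) == 0)
            return 1;
    return 0;
}
```

For "/srv/share/docs/a.txt" the walk stops after "/srv/share", so the
caller knows exactly which directory to readdir() for a case-insensitive
match of "docs" and can then resume from there instead of restarting at
the root.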
* Re: UTF-8 and case-insensitivity 2004-02-17 5:11 ` Linus Torvalds @ 2004-02-17 6:54 ` tridge 2004-02-17 8:33 ` Neil Brown 2004-02-17 15:13 ` Linus Torvalds 2004-02-19 2:53 ` Daniel Newby 1 sibling, 2 replies; 69+ messages in thread From: tridge @ 2004-02-17 6:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List, Al Viro Linus, > Kernel-level case insensitivity is a total disaster, and your "very > painful and potentially quite complex" assertion is the understatement of > the year. The thing is, you can't sanely do dentry caching, since the case > insensitivity has to be per-open or at least per-process (you MUST NOT be > case-insensitive in a POSIX process). right, and the patches to add this support to Linux that I have been involved with in the past have been per-process. You are right that it is messy, but it is not *horribly* messy. In fact I'd say it is no worse than many of the other things we already have in the kernel, although it certainly is much harder than sticking to the "bag of bytes" interpretation of filenames. I just think that in this case the simple solution is also wrong. > So the only way to do case-insensitive names is to do all lookups very > slowly. I don't agree with this at all. I agree that the worst-case will get worse, but I see absolutely no reason why the average case will get sigificantly worse and I think that the worst case will be rare. In fact, John Bonesio did a patch to the 2.4 kernel with XFS that implemened per-process case-insensitivity. It's been a long time since I played with that patch, but I certainly don't recall any significant slowdowns. The patch was messy, but it wasn't grossly inefficient. (that patch was just a proof of concept, and just used strcasecmp() instead of doing a proper UTF-8 case-insensitive compare, so there will be some amount of additional cost to adding that). >From memory, the patch added new classes of dentries to the current "+ve" and "-ve" dentries. 
It added concepts like a "-ve case-insensitive" dentry and a "-ve case-sensitive" dentry. It certainly adds more code in trying to deal with these variants, but I see no reason why it should be significantly computationally less efficient. > I'm willing to bet that WNT opens files a hell of a lot slower > than Linux does, and one big portion of that is exactly the fact that > Linux can do a really good job with the dentry cache. Anyone have any lmbench filesystem numbers for w2k3? The only windows boxes I use are in vmware sessions, so running performance tests myself is pretty pointless. > And that _depends_ on a well-defined and unique filename setup (by > changing the hashing function and compare function, a filesystem can do a > limited kind of case-insensitivity right now in Linux, but then it will > have to be not only fairly slow, but also case-insensitive for _everybody_ > which is unacceptable in a mixed POSIX/samba environment). right, and thats why bones made it per-process in his patch. It was set using a process personality bit, which really wasn't ideal (that was one of my contributions to the patch) but it did work. > In other words, just forget the whole notion. The only set people who have > any reason at _all_ to want it is the samba team, and we can solve the > samba-specific problems other ways. Nope, its not just Samba, though perhaps Samba is the app that cares the most about the actual performance. The other obvious people who care are wine and anyone porting an application from windows. Also, the problem isn't just one of performance, its also hard to make it raceless from userspace. I also think that if the choice were given then some linux distros (the likes of Lindows comes to mind) would choose to run all processes case-insensitive. These sorts of distros are aiming at the sorts of users that would want everything to be case-insensitive. 
> Just take that as a simple fact - case insensitivity in the kernel is such > a horribly bad idea, that you really shouldn't go there. I'm yet to be convinced :) > - MOST name lookups are likely results of some kind of "readdir()" > lookup, and tend to have the case right in the first place. So that > should go fast. Maybe Tridge has some statistics on this one? ok, the first thing you need to understand about case-insensitivity on a case-sensitive system is that the hardest thing to do is prove that a file doesn't exist. File operations on non-existant files are *very* common. If you can come up with a solution that allows me to prove that a file doesn't exist in any case combination then we will be most of the way there. That immediately throws out most of the "why don't you just use a cache" arguments that everyone seems to come up with. We *do* use a cache that primes the "most likely" filename code, its just that a cache is almost useless when you are trying to prove that a file definately doesn't exist. > - samba probably has certain pretty well-defined special patterns for > what it wants to do with a filename, do you probably don't need a > generic "everything that takes a filename should be case-insensitive", > and it would be acceptable to have a few _very_ specific system calls. yes, if we had a single function that took a pathname and gave us either -1/ENOENT or the pathname of a file that matches case-insensitively then that would be great. Then again, if we had such a function then it would be really easy to use that function in the VFS to make the Linux case-insensitive on a per-process basis. So lets imagine we have such a function like this: int ci_normalize(char *path); Lets assume it takes a pathname and returns either -1/ENOENT or modifies the pathname in place (totally ignoring the fact that the length of the pathname could change, and that the "char *" is really a "const char *" - pedants go home). 
now lets build a ci_unlink() on top of that:

        int ci_unlink(char *path)
        {
                if (task_is_case_sensitive(current)) {
                        return unlink(path);
                }
                if (ci_normalize(path) == -1) {
                        return -1;
                }
                return unlink(path);
        }

The problem is the negative dentries. If you do the above then case-sensitive processes will be fast, but case-insensitive processes will effectively be running without the negative dcache, so unlink() on paths that don't exist will be slow each and every time. That's why doing this with any sort of decent efficiency needs dcache changes.

btw, I already know that Al is completely and utterly opposed to putting any case-insensitivity in the dcache (I think the phrase "over my dead body" was mentioned), so I know that I'm fighting an uphill battle here, but I like trying every now and again to see if I can make any progress.

> With those assumptions out of the way, we could think of an interface that
> exports some partial functionality of the "lookup_path()" code the kernel
> as a special system call. In particular, something that takes an input
> pathname, and is able to stop at any point of the name when a lookup
> fails.
> So some variation of the interface
>
> int magic_open( ....

how would this interact with the negative dcache entries? That is the key.

> I suspect we can do case-insensitive names faster than WNT even with a
> fairly complex user-mode interface. Just because _not_ having them in the
> kernel allows us to have much faster default behaviour.

on this I completely disagree. Any solution that doesn't cope with case insensitive properties of negative dentries is just going to start filling the dcache with lots of useless entries (case combinations) or effectively not end up using the dcache at all. Either way its a big loss compared to making the dcache know about case insensitivity properly.

Cheers, Tridge

PS: ahh, what timing, someone just posted a request to the rsync list asking for case-insensitivity in rsync.
^ permalink raw reply [flat|nested] 69+ messages in thread
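To make the cost described in the message above concrete, here is a minimal userspace sketch of a ci_normalize()-style lookup done by scanning a directory. The name ci_lookup(), its signature and the `scanned` counter are illustrative assumptions, not code from Samba or from the XFS patch mentioned in the thread, and strcasecmp() stands in for a proper UTF-8 case-insensitive compare. A hit can stop at the first matching entry, but a miss can only be reported once every entry has been read.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <strings.h>

/*
 * Hypothetical userspace stand-in for the ci_normalize() idea: look for an
 * entry in `dir` whose name matches `name` ignoring case.
 * On a hit, copies the on-disk spelling into `out` and returns 0.
 * On a miss, returns -1 with errno set to ENOENT.
 * `scanned` (optional) receives the number of entries examined.
 */
int ci_lookup(const char *dir, const char *name,
              char *out, size_t outlen, size_t *scanned)
{
    assert(out != NULL && outlen > 0);

    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;                      /* errno was set by opendir() */

    size_t seen = 0;
    int found = 0;
    struct dirent *de;
    while ((de = readdir(d)) != NULL) {
        seen++;
        /* In the default C locale strcasecmp() only folds ASCII letters. */
        if (strcasecmp(de->d_name, name) == 0) {
            snprintf(out, outlen, "%s", de->d_name);
            found = 1;
            break;                      /* a hit can stop early */
        }
    }
    closedir(d);

    if (scanned != NULL)
        *scanned = seen;
    if (!found) {
        errno = ENOENT;                 /* only known after a full scan */
        return -1;
    }
    return 0;
}
```

Nothing in this sketch remembers a negative result, so a repeated miss rescans the whole directory every time. A userspace cache of misses would also have no way to learn that another process has since created a matching name, which is the invalidation problem the thread keeps returning to.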
* Re: UTF-8 and case-insensitivity 2004-02-17 6:54 ` tridge @ 2004-02-17 8:33 ` Neil Brown 2004-02-17 22:48 ` tridge 2004-02-17 15:13 ` Linus Torvalds 1 sibling, 1 reply; 69+ messages in thread From: Neil Brown @ 2004-02-17 8:33 UTC (permalink / raw) To: tridge; +Cc: Linus Torvalds, Kernel Mailing List, Al Viro On Tuesday February 17, tridge@samba.org wrote: > > I also think that if the choice were given then some linux distros > (the likes of Lindows comes to mind) would choose to run all processes > case-insensitive. These sorts of distros are aiming at the sorts of > users that would want everything to be case-insensitive. This is the bit I don't understand. Surely the value of case-insensitivity is that you can type in a filename from memory and not worry about what case you used when you created the file. Yet with Lindows / MS-Windows style interfaces, you virtually never type the name of a pre-existing file. So case-insensitivity doesn't seem to be a win to the user. I thought the value of a case-insensitive filenames was for legacy applications which have been written to the WIN32 API and took lots of liberties with "pretty-casing" filenames between readdir and open. NeilBrown ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 8:33 ` Neil Brown @ 2004-02-17 22:48 ` tridge 2004-02-18 0:06 ` Neil Brown 0 siblings, 1 reply; 69+ messages in thread From: tridge @ 2004-02-17 22:48 UTC (permalink / raw) To: Neil Brown; +Cc: Linus Torvalds, Kernel Mailing List, Al Viro Neil, > I thought the value of a case-insensitive filenames was for > legacy applications which have been written to the WIN32 API and took > lots of liberties with "pretty-casing" filenames between readdir and > open. No, thats a common misconception. It does happen (the "pretty-casing") but its relatively rare these days. The real problem is *proving* that a file doesn't exist. If a file does exist then there are all sorts of heuristic and cache mechanisms that can be used to get the real filename quickly on average, but if you have to prove absolutely that a file does not exist then all of that stuff is pretty much useless. Samba (and any other system that wants case-insensitive semantics on Linux) can't make do with "oh, it probably doesn't exist". That way leads to data loss. You have to know with 100% certainty that the file doesn't exist in any case combination. Unfortunately, that is also the hardest thing to do. Cheers, Tridge ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 22:48 ` tridge @ 2004-02-18 0:06 ` Neil Brown 2004-02-18 9:47 ` Helge Hafting 0 siblings, 1 reply; 69+ messages in thread From: Neil Brown @ 2004-02-18 0:06 UTC (permalink / raw) To: tridge; +Cc: Linus Torvalds, Kernel Mailing List, Al Viro On Wednesday February 18, tridge@samba.org wrote: > > Samba (and any other system that wants case-insensitive semantics on > Linux) can't make do with "oh, it probably doesn't exist". That way > leads to data loss. You have to know with 100% certainty that the file > doesn't exist in any case combination. > > Unfortunately, that is also the hardest thing to do. Hi Tridge, Maybe if it is so hard, we should just define it to be easy.... just change the universe a bit..... I'm, sure you've thought about this a lot more that I have or will, so I must be missing something, but there seems to be a solution that is efficient, predictable, and should we acceptable. The first observation is that POSIX applications and WIN32 application cannot both get exactly the file system, semantics they expect in the same directory. The example: POSIX: create "Makefile" create "makefile" WIN32: unlink "MakeFile" seems to show that. So decide up front that a WIN32 application will see something different, and decide what the best thing for it to see would be (i.e. change the universe). First cut: An application that wants case-insensitive filenames only sees those filenames that are in a case-insensitive-canonical-form. So the interface maps all file names in requests to a canonical form, and the readdir equivalent discards all non-canonical names. Thus in the above example, the WIN32 app would unlink "makefile" and never notice that "Makefile" exists. This has (to me) two problems. 1/ case gets lost, so if I save "My File", I will find "my file" has been created (unless the application pretty-cases things, in which case I can expect case to change anyway). 
2/ Files created by posix apps might be invisible. To answer 2/, I'd say "tough". If you want posix files to be visible to WIN32 apps, choose appropriate names. However I would allow there to be a process, either once-off or periodic, which creates symlinks from canonical names to non-canonical filenames. This would allow you to access pre-existing files where there was no ambiguity. To answer 1/ I would suggest a second cut at the problem... Second cut: As above, but readdir tries to be clever. If it sees two (or more) names which have the same canonical form, it chooses just one of them (predictably), preferring a non-canonical name which is a symlink to the canonical name. Then when creating an object, you create it with the canonical name and (if that succeeds) subsequently create a symlink from the requested name to the canonical name (if that is possible, don't worry if it isn't). Given this approach: If only case-insensitive apps use a linux filesystem, they will see exactly the semantics they expect, with minimal performance impact. If case-sensitive and case-insensitive apps use a linux filesystem, they will each see a consistent view and though they may not see the same view, there will be well-defined mechanisms which can work at a user-space level to resolve or highlight any issues. The biggest cost I see with this is with large directories. The "readdir" equivalent would need to read the whole directory before it could reliably return any of it. However dropping the "guarantee to preserve case" semantic on really large directories probably isn't an enormous cost (and could be configurable). NeilBrown ^ permalink raw reply [flat|nested] 69+ messages in thread
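The "second cut" readdir rule in the message above can be sketched in a few lines of C over an in-memory list of names. Everything here is an assumption made for illustration: the struct entry layout, the ci_view() name, the use of ASCII-only strcasecmp() as the canonical-form test, and the strcmp() tie-break used when no symlink is available (the message only asks that the choice be predictable).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* One directory entry as the hypothetical filter sees it. */
struct entry {
    const char *name;
    int links_to_canonical;  /* nonzero: symlink pointing at the canonical name */
};

/* Nonzero if `a` is a better representative than `b`.
 * Both are assumed to share the same canonical form. */
static int better(const struct entry *a, const struct entry *b)
{
    if (a->links_to_canonical != b->links_to_canonical)
        return a->links_to_canonical;      /* prefer the case-preserving symlink */
    return strcmp(a->name, b->name) < 0;   /* assumed tie-break: byte order */
}

/*
 * Picks one representative per case-insensitive name.
 * Writes the chosen indexes into `out` (room for `n` entries) and returns
 * how many distinct names the case-insensitive view contains.
 * Quadratic on purpose, to keep the sketch short.
 */
size_t ci_view(const struct entry *ents, size_t n, size_t *out)
{
    size_t count = 0;

    assert(ents != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t j;
        for (j = 0; j < count; j++) {
            if (strcasecmp(ents[out[j]].name, ents[i].name) == 0) {
                if (better(&ents[i], &ents[out[j]]))
                    out[j] = i;            /* replace the earlier choice */
                break;
            }
        }
        if (j == count)
            out[count++] = i;              /* first name with this canonical form */
    }
    return count;
}
```

Note that ci_view() needs the complete list before any choice is final, since a later entry can displace an earlier one. That is the large-directory cost the message points out.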
* Re: UTF-8 and case-insensitivity 2004-02-18 0:06 ` Neil Brown @ 2004-02-18 9:47 ` Helge Hafting 0 siblings, 0 replies; 69+ messages in thread From: Helge Hafting @ 2004-02-18 9:47 UTC (permalink / raw) To: Neil Brown; +Cc: linux-kernel Neil Brown wrote: > 1/ case gets lost, so if I save "My File", I will find "my file" > has been created (unless the application pretty-cases things, in > which case I can expect case to change anyway). > > 2/ Files created by posix apps might be invisible. > > > To answer 2/, I'd say "tough". If you want posix files to be This is a bit worse than just "tough". win32: rmdir foo directory not empty! win32: there are _no_ files there? Helge Hafting ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 6:54 ` tridge 2004-02-17 8:33 ` Neil Brown @ 2004-02-17 15:13 ` Linus Torvalds 2004-02-17 16:57 ` Linus Torvalds 2004-02-17 23:20 ` tridge 1 sibling, 2 replies; 69+ messages in thread From: Linus Torvalds @ 2004-02-17 15:13 UTC (permalink / raw) To: tridge; +Cc: Kernel Mailing List, Al Viro On Tue, 17 Feb 2004 tridge@samba.org wrote: > > From memory, the patch added new classes of dentries to the current > "+ve" and "-ve" dentries. It added concepts like a "-ve > case-insensitive" dentry and a "-ve case-sensitive" dentry. It > certainly adds more code in trying to deal with these variants, but I > see no reason why it should be significantly computationally less > efficient. Yes, we could add context sensitivity to the dcache with a context bitmask. However, it's _not_ correct. It assumes that there is only one way to do lower/upper case, which just isn't true. What about different locales that have different case rules? Your "one bit per dentry" becomes "one bit per locale per dentry". That's just horribly hard to do. I don't know how Windows does it, so maybe this thing is hardcoded, and you don't even want "true" case insensitivity. How "correct" is Windows? (And don't even bother telling me about the translation table in NTFS volumes - I'm not interested. This would have to work on a sane filesystem to be useful, even for samba.) Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 15:13 ` Linus Torvalds @ 2004-02-17 16:57 ` Linus Torvalds 2004-02-17 19:44 ` viro ` (2 more replies) 2004-02-17 23:20 ` tridge 1 sibling, 3 replies; 69+ messages in thread From: Linus Torvalds @ 2004-02-17 16:57 UTC (permalink / raw) To: tridge; +Cc: Kernel Mailing List, Al Viro On Tue, 17 Feb 2004, Linus Torvalds wrote: > > It assumes that there is only one way to do lower/upper case, which just > isn't true. What about different locales that have different case rules? > Your "one bit per dentry" becomes "one bit per locale per dentry". That's > just horribly hard to do. It's also hard to know what to do when there are two filenames that literally _are_ the same when not comparing cases. Which can obviously happen under Linux - you'd have a case-sensitive app that creates a both "makefile" and "Makefile", and now you have a case-insensitive app that looks it up (or worse, removes it), and what the *heck* is the dcache now supposed to really do? This is why I'd hate for the generic Linux dcache to know about case sensitivity, and I'd be a lot happier having a separate path (which isn't as speed-critical) that can be used to help implement helper functions for doing case-insensitive things. That way the bugs and strange behaviour would be all be limited to the case-insensitive special code, and not pollute the "sane" side. For example, I fundamentally can't easily do an atomic exclusive case-insensitive "create" or "rename", but we _could_ expose things like directory generation counts to the special interfaces, and thus allow at least "local-atomic" operations (but they would _not_ be atomic over a network, to give you an idea of the kinds of _fundamental_ limitations there are here). That's why I'd advocate having a few very special system calls for doing the operations that samba (and I'll throw wine into the pot too) wants to do. 
So you could literally do an atomic create with something like - regular atomic create of random case-_sensitive_ name using something tempnam()-like (use a prefix that is invalid on windows or something: make the first character be 0xff or whatever). - "read directory local sequence count" - readdir to make sure that the new name is still unique even in the case-insensitive sense - "atomic move conditionally on the local sequence count still being X" The thing is, we can do hack like the above, and yes, we could do them all inside the kernel, and give user space a reasonably nice interface with "pseudo-atomic" behaviour (ie it will _not_ be atomic if multiple clients do this over NFS, but I doubt you care). But it wouldn't be "open()" and "rename()". It would be a totally separate kernel path. It would be in the "case-insensitivity-module". It would be _outside_ the regular VFS layer, although it would have some visibility into it (ie it could follow dentries on its own, and know about the RCU etc locking rules). We can even allow that case-insensitive module to set some flags in the dentries (so that you can create negative dentries that have a flag set "this is negative for all cases"). Trust me, this is much less intrusive, and a lot easier to debug too. It won't be as fast as the regular path operations, but depending on what the common cases are (hopefully "look up name that is exact"), it would likely not be horrible either. And it could probably be debugged as a real module, without impacting any existing code, which would make it a lot easier to create. See where I'm going? Would this be acceptable to you? Are there any samba people who are knowledgeable about the VFS-layer and have the time/energy to try something like this? Al? What do you think? Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 16:57 ` Linus Torvalds @ 2004-02-17 19:44 ` viro 2004-02-17 20:10 ` Linus Torvalds 2004-02-17 21:08 ` Robin Rosenberg 2004-02-17 23:57 ` tridge 2 siblings, 1 reply; 69+ messages in thread From: viro @ 2004-02-17 19:44 UTC (permalink / raw) To: Linus Torvalds; +Cc: tridge, Kernel Mailing List On Tue, Feb 17, 2004 at 08:57:40AM -0800, Linus Torvalds wrote: > Trust me, this is much less intrusive, and a lot easier to debug too. It > won't be as fast as the regular path operations, but depending on what the > common cases are (hopefully "look up name that is exact"), it would likely > not be horrible either. And it could probably be debugged as a real > module, without impacting any existing code, which would make it a lot > easier to create. > > See where I'm going? Would this be acceptable to you? Are there any samba > people who are knowledgeable about the VFS-layer and have the time/energy > to try something like this? > > Al? What do you think? What will protect your generation counts during the operation itself? ->i_sem? If anything, I'd suggest doing it as cretinous_rename(dir_fd, name1, name2) with the following semantics: * if directory had been changed since open() that gave us dir_fd - -EFOAD * otherwise, rename name1 to name2 (no cross-directory renames here). No need to expose generation counts to userland - we can just compare the count at open() time with that at operation time. The rest can be done in userland (including creation of files). We _definitely_ don't want to put "UTF-8 case-insensitive comparison" anywhere near the kernel - it's insane. If samba wants it, they get to pay the price, both in performance and keeping butt-ugly code (after all, the goal of project is to imitate butt-ugly system for butt-ugly clients). The same goes for Wine. And we really don't want to encourage those who port Windows userland in not fixing the idiotic semantics. As for Lindows... 
let's just say that I can't find any way to describe what I really think of those clowns, their intellect and their morals that wouldn't lead to a lawsuit from them. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 19:44 ` viro @ 2004-02-17 20:10 ` Linus Torvalds 2004-02-17 20:17 ` viro 0 siblings, 1 reply; 69+ messages in thread From: Linus Torvalds @ 2004-02-17 20:10 UTC (permalink / raw) To: viro; +Cc: tridge, Kernel Mailing List On Tue, 17 Feb 2004 viro@parcelfarce.linux.theplanet.co.uk wrote: > > What will protect your generation counts during the operation itself? > ->i_sem? Yes. You have to take it anyway, so why not? > If anything, I'd suggest doing it as > cretinous_rename(dir_fd, name1, name2) > with the following semantics: > > * if directory had been changed since open() that gave us dir_fd - > -EFOAD > * otherwise, rename name1 to name2 (no cross-directory renames here). Sure, that works. > No need to expose generation counts to userland - we can just compare the > count at open() time with that at operation time. The rest can be done > in userland (including creation of files). Note that I'm not sure we would expose generation counts at all to user space: we might keep all of this inside the "crapola windows behaviour" module, and user space could actually see some easier highlevel interface. Something like yours, but I suspect we'd want to see what the whole user-level loop would look like to know what the architecture should be like. I do believe we'd need to have some way to "refresh" the fd in your example, without restarting the whole lookup. So that when the user gets EFOAD, it can do refresh(fd); readdir(fd); /* Check that nothing clashes */ goto try_again; or similar. So the generation count _semantics_ would be exposed, even if the numbers themselves would be hidden inside the kernel. > We _definitely_ don't want to put "UTF-8 case-insensitive comparison" anywhere > near the kernel - it's insane. If samba wants it, they get to pay the price, > both in performance and keeping butt-ugly code (after all, the goal of project > is to imitate butt-ugly system for butt-ugly clients). The same goes for Wine. 
I agree. We'd need to let user space do the equality comparisons, I just don't see how to sanely do it in kernel land. > And we really don't want to encourage those who port Windows userland in > not fixing the idiotic semantics. As for Lindows... let's just say that > I can't find any way to describe what I really think of those clowns, their > intellect and their morals that wouldn't lead to a lawsuit from them. Heh. I suspect most people don't care that much, but I also suspect that projects like samba have to have a "anal mode" where they really act like Windows, even when it's "wrong". People can then choose to say "screw that idiocy", but by just _having_ a very compatible mode you deflect a lot of criticism. Regardless of whether people want the anal mode or not in real life. Backwards compatibility is King. It's _hugely_ important. It's one of the most important things to me in the kernel, and by the same logic I do see that it is important to others as well - even when the backwards compatibility ends up being inherited from a broken Windows setup. So while I hate case-insensitive names, I do understand that people want to have some way to emulate the braindamage for some _really_ "ass-backwards" compatibility reasons. So I think it's worth some pain, as long as we keep that compatibility from starting to encrust the _good_ stuff. Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
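The pieces discussed in the last few messages (a placeholder name, a per-directory generation count, a rename that fails if the directory changed, and a retry loop) can be put together in a toy in-memory model. This is only a sketch under stated assumptions: no such system calls exist, every model_* and ci_* name is hypothetical, ESTALE stands in for the joke errno used in the thread, and the single-threaded model cannot show a real race, only the control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>

#define MAX_ENTRIES 16
#define NAME_LEN 64

/* Toy in-memory directory; `generation` is bumped on every change. */
struct model_dir {
    char names[MAX_ENTRIES][NAME_LEN];
    size_t count;
    unsigned long generation;
};

/* Toy directory handle remembering the generation it last saw. */
struct model_fd {
    struct model_dir *dir;
    unsigned long seen;
};

/* Index of `name` in `d`, or -1. `fold` selects an ASCII case-insensitive compare. */
static int model_find(const struct model_dir *d, const char *name, int fold)
{
    for (size_t i = 0; i < d->count; i++) {
        int same = fold ? strcasecmp(d->names[i], name) == 0
                        : strcmp(d->names[i], name) == 0;
        if (same)
            return (int)i;
    }
    return -1;
}

/* Ordinary case-sensitive exclusive create. */
int model_create(struct model_dir *d, const char *name)
{
    if (model_find(d, name, 0) >= 0) { errno = EEXIST; return -1; }
    if (d->count == MAX_ENTRIES)     { errno = ENOSPC; return -1; }
    snprintf(d->names[d->count++], NAME_LEN, "%s", name);
    d->generation++;
    return 0;
}

/* Ordinary case-sensitive unlink; the last entry fills the hole. */
int model_unlink(struct model_dir *d, const char *name)
{
    int i = model_find(d, name, 0);
    if (i < 0) { errno = ENOENT; return -1; }
    if ((size_t)i != d->count - 1)
        memcpy(d->names[i], d->names[d->count - 1], NAME_LEN);
    d->count--;
    d->generation++;
    return 0;
}

/* Rename that fails with ESTALE if the directory changed since `fd` was refreshed. */
int model_rename_if_unchanged(struct model_fd *fd, const char *from, const char *to)
{
    struct model_dir *d = fd->dir;
    if (fd->seen != d->generation) { errno = ESTALE; return -1; }
    int i = model_find(d, from, 0);
    if (i < 0) { errno = ENOENT; return -1; }
    snprintf(d->names[i], NAME_LEN, "%s", to);
    fd->seen = ++d->generation;          /* our own change does not make us stale */
    return 0;
}

/* Case-insensitive exclusive create built from the pieces above. */
int ci_create_excl(struct model_dir *d, const char *name)
{
    const char *tmp = "\xff" "tmp";      /* placeholder name with a 0xff first byte */
    struct model_fd fd = { d, 0 };

    assert(d != NULL && name != NULL);
    if (model_create(d, tmp) != 0)
        return -1;
    for (;;) {
        fd.seen = d->generation;         /* "refresh" the handle */
        if (model_find(d, name, 1) >= 0) {   /* readdir-style clash check */
            model_unlink(d, tmp);
            errno = EEXIST;
            return -1;
        }
        if (model_rename_if_unchanged(&fd, tmp, name) == 0)
            return 0;
        if (errno != ESTALE)
            return -1;
        /* the directory changed under us: rescan and try again */
    }
}
```

In this model the case-folding comparison lives entirely in ci_create_excl(), the "userland" side, while the directory itself only compares bytes and counts changes. That mirrors the split argued for above, where the kernel would keep the generation semantics and user space would own the equality rules.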
* Re: UTF-8 and case-insensitivity 2004-02-17 20:10 ` Linus Torvalds @ 2004-02-17 20:17 ` viro 2004-02-17 20:23 ` Linus Torvalds 0 siblings, 1 reply; 69+ messages in thread From: viro @ 2004-02-17 20:17 UTC (permalink / raw) To: Linus Torvalds; +Cc: tridge, Kernel Mailing List On Tue, Feb 17, 2004 at 12:10:23PM -0800, Linus Torvalds wrote: > I do believe we'd need to have some way to "refresh" the fd in your > example, without restarting the whole lookup. So that when the user gets > EFOAD, it can do > > refresh(fd); lseek(fd, 0, 0); > > And we really don't want to encourage those who port Windows userland in > > not fixing the idiotic semantics. As for Lindows... let's just say that > > I can't find any way to describe what I really think of those clowns, their > > intellect and their morals that wouldn't lead to a lawsuit from them. > > Heh. > > I suspect most people don't care that much, but I also suspect that > projects like samba have to have a "anal mode" where they really act like > Windows, even when it's "wrong". People can then choose to say "screw that > idiocy", but by just _having_ a very compatible mode you deflect a lot of > criticism. Regardless of whether people want the anal mode or not in real > life. Umm... Samba deals with Windows clients. Windows software allegedly being ported to Linux is a different story and in that case there's no excuse for demanding case-insensitive operations. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 20:17 ` viro @ 2004-02-17 20:23 ` Linus Torvalds 0 siblings, 0 replies; 69+ messages in thread From: Linus Torvalds @ 2004-02-17 20:23 UTC (permalink / raw) To: viro; +Cc: tridge, Kernel Mailing List On Tue, 17 Feb 2004 viro@parcelfarce.linux.theplanet.co.uk wrote: > > > refresh(fd); > > lseek(fd, 0, 0); Yes. We can make that implicitly refresh, I'm certainly ok with that. > > I suspect most people don't care that much, but I also suspect that > > projects like samba have to have a "anal mode" where they really act like > > Windows, even when it's "wrong". People can then choose to say "screw that > > idiocy", but by just _having_ a very compatible mode you deflect a lot of > > criticism. Regardless of whether people want the anal mode or not in real > > life. > > Umm... Samba deals with Windows clients. Windows software allegedly being > ported to Linux is a different story and in that case there's no excuse for > demanding case-insensitive operations. "wine". It's not porting, it's emulation. But yes, I agree, I don't see any other cases where we want it. We basically want to support broken clients - whether they be on the other side of the network, or the other side of an emulation interface. That is the only valid reason to do this crap. It's a fairly sizeable reason, though. On another front ("World Domination, Fast!") we'll try to fix the problem another way, but there's nothing wrong with fighting on multiple fronts if you have the man-power. Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 16:57 ` Linus Torvalds 2004-02-17 19:44 ` viro @ 2004-02-17 21:08 ` Robin Rosenberg 2004-02-17 21:17 ` Linus Torvalds 2004-02-17 23:57 ` tridge 2 siblings, 1 reply; 69+ messages in thread From: Robin Rosenberg @ 2004-02-17 21:08 UTC (permalink / raw) To: Linus Torvalds; +Cc: tridge, Kernel Mailing List, Al Viro On Tuesday 17 February 2004 17.57, Linus Torvalds wrote: [case-insanesititvity proposal ///] > See where I'm going? Would this be acceptable to you? Are there any samba > people who are knowledgeable about the VFS-layer and have the time/energy > to try something like this? So the same guy that strongly insist that a file is a string of bytes and nothing else, now thinks it is sane to even think of "case" of a byte. That's impossible unless you actually DO believe its a bunch of characters. What is it? -- robin ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 21:08 ` Robin Rosenberg @ 2004-02-17 21:17 ` Linus Torvalds 2004-02-17 22:27 ` Robin Rosenberg 0 siblings, 1 reply; 69+ messages in thread From: Linus Torvalds @ 2004-02-17 21:17 UTC (permalink / raw) To: Robin Rosenberg; +Cc: tridge, Kernel Mailing List, Al Viro On Tue, 17 Feb 2004, Robin Rosenberg wrote: > > On Tuesday 17 February 2004 17.57, Linus Torvalds wrote: > [case-insanesititvity proposal ///] > > See where I'm going? Would this be acceptable to you? Are there any samba > > people who are knowledgeable about the VFS-layer and have the time/energy > > to try something like this? > > So the same guy that strongly insist that a file is a string of bytes and nothing else, > now thinks it is sane to even think of "case" of a byte. That's impossible unless you > actually DO believe its a bunch of characters. What is it? Which part of my argumen don't you understand? The kernel proper thinks it's just a stream of bytes, and all the existing interfaces do likewise. But we'd have a kernel helper module to let samba do what it already does now, except help it do so more efficiently? The fact that _I_ think pathnames are just a nice stream of bytes sadly doesn't make Windows clients do the same. Some day when I'm King Of The World, and I can outlaw windows clients, we'll finally get rid of the braindamage, but until then I'm pragmatic enough to say "let's help out the poor samba people who have to deal with the crap day in and day out". What's your problem with that? Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 21:17 ` Linus Torvalds @ 2004-02-17 22:27 ` Robin Rosenberg 2004-02-18 3:02 ` tridge 0 siblings, 1 reply; 69+ messages in thread From: Robin Rosenberg @ 2004-02-17 22:27 UTC (permalink / raw) To: Linus Torvalds; +Cc: tridge, Kernel Mailing List, Al Viro On Tuesday 17 February 2004 22.17, Linus Torvalds wrote: > The fact that _I_ think pathnames are just a nice stream of bytes sadly > doesn't make Windows clients do the same. Some day when I'm King Of The > World, and I can outlaw windows clients, we'll finally get rid of the LPA = Linus' Patriot Act. > braindamage, but until then I'm pragmatic enough to say "let's help out > the poor samba people who have to deal with the crap day in and day out". > > What's your problem with that? Nothing wrong with helping people. Having to put up with the existence of Windows day in and out is the reason I'm still on an eight-bit encoding. Sorry for not explaining the REAL problem, but only a partial problem. I need to support all kinds of clients on Windows with protocols that convey no character set info. With samba that's no problem. Having to put up with a Unix world running ISO-8859-1 (or ISO-8859-15) is another. Ofcourse that means Linux machines also add to the disturbance by not storing things as unicode. The real obstable is file names, everything else including content of files, I can handle (I think). Maybe I'll find a solution for the filenames too, but usually some hot discussions are needed for the brain to kick into the right gear. I want to switch to UTF-8 to work better with the outside world, but as things are people will start to take notice of what OS is running in the shadows when they see the filename problems, and start demanding Windows, and ... You see; I'm not mean; I don't want to do that to them (or myself), -- robin ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 22:27 ` Robin Rosenberg @ 2004-02-18 3:02 ` tridge 0 siblings, 0 replies; 69+ messages in thread From: tridge @ 2004-02-18 3:02 UTC (permalink / raw) To: Robin Rosenberg; +Cc: Linus Torvalds, Kernel Mailing List, Al Viro Robin, > Having to put up with the existence of Windows day in and out is > the reason I'm still on an eight-bit encoding. Sorry for not > explaining the REAL problem, but only a partial problem. I need to > support all kinds of clients on Windows with protocols that convey > no character set info. With samba that's no problem. Having to put > up with a Unix world running ISO-8859-1 (or ISO-8859-15) is > another. Ofcourse that means Linux machines also add to the > disturbance by not storing things as unicode. The real obstable is > file names, everything else including content of files, I can > handle (I think). Maybe I'll find a solution for the filenames too, > but usually some hot discussions are needed for the brain to kick > into the right gear. I suspect you are running Samba 2.x, which negotiated all that multi-byte stuff on the wire. Samba 3.x does the same as windows servers have done for years and negotiates UCS-2, which means that every windows box that connects to it no matter what locale it is in uses the same charset encoding as every other windows box. There are still some legacy interfaces on the wire that use the old encodings, but they are rare and getting rarer. To support these, Samba3 juggles 4 character set encodings internally: * the unix-charset, which it uses to talk to the OS, and defaults to UTF-8 * the windows wire charset, which is always UCS-2 * the dos-charset for legacy parts of the protocol, which you have to configure in the samba config if you care about these legacy parts of the protocol (for example if you have older apps). It defaults to either CP850 or ASCII depending on what autoconf discovers. 
* the display-charset which is used to put stuff on an admins terminal for utilities like smbclient. The default depends on your LOCALE setting, or if nothing is set it uses ASCII. Internally Samba3 only ever stores stuff in the "unix-charset" encoding, which is usually UTF-8. It converts to the others as needed when talking on the wire or to terminals. > I want to switch to UTF-8 to work better with the outside world, > but as things are people will start to take notice of what OS is > running in the shadows when they see the filename problems, and > start demanding Windows, and ... You see; I'm not mean; I don't > want to do that to them (or myself), If you use Samba3 then they will not notice what charset you are using on your Linux filesystems. The windows clients will just see UCS-2. Cheers, Tridge ^ permalink raw reply [flat|nested] 69+ messages in thread
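As an illustration of the unix-charset to wire-charset step described above, here is a small sketch using the standard iconv(3) interface. It is not Samba's conversion code: the function name utf8_to_wire() is made up for this example, and it asks iconv for "UTF-16LE", which is assumed here as a widely available stand-in that is byte-identical to little-endian UCS-2 for characters in the Basic Multilingual Plane.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Converts `inlen` bytes of UTF-8 at `in` into UTF-16LE at `out`.
 * Returns the number of bytes written, or (size_t)-1 on any failure
 * (unsupported conversion, invalid input, or output buffer too small).
 */
size_t utf8_to_wire(const char *in, size_t inlen, char *out, size_t outlen)
{
    assert(in != NULL && out != NULL);

    iconv_t cd = iconv_open("UTF-16LE", "UTF-8");   /* (to, from) */
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *src = (char *)in;   /* iconv() takes char **; the input is not modified */
    char *dst = out;
    size_t inleft = inlen;
    size_t outleft = outlen;
    size_t rc = iconv(cd, &src, &inleft, &dst, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return outlen - outleft;
}
```

On some systems iconv lives in a separate library, so the link line may need -liconv. The point of the sketch is only that the on-disk bytes and the wire bytes are different encodings of the same name, so the client never sees which charset the Linux filesystem uses.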
* Re: UTF-8 and case-insensitivity 2004-02-17 16:57 ` Linus Torvalds 2004-02-17 19:44 ` viro 2004-02-17 21:08 ` Robin Rosenberg @ 2004-02-17 23:57 ` tridge 2 siblings, 0 replies; 69+ messages in thread From: tridge @ 2004-02-17 23:57 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List, Al Viro Linus, > It's also hard to know what to do when there are two filenames that > literally _are_ the same when not comparing cases. Which can obviously > happen under Linux - you'd have a case-sensitive app that creates a both > "makefile" and "Makefile", and now you have a case-insensitive app that > looks it up (or worse, removes it), and what the *heck* is the dcache now > supposed to really do? This is really not as bad as it first seems. Just think what the absolutely obvious thing to do is and do that. It's like all those things in POSIX where it says "if you do XXX then the behaviour is undefined" and the implementations end up doing whatever the heck they find easiest to do. It's the same here. In the example you give then you just give whatever file you come across first or happen to have in the dcache. You can't do better than that, as the problem is fundamentally insoluble in a sane fashion, so just don't try. We've been doing exactly that in Samba for 12 years (picking the first file we come across) and I can't recall a *single* complaint about that behaviour. Users *expect* the server to just pick one, and have no pre-conceived idea of which one it will pick. Of course, some samba-tuned filesystem could have a mount option to refuse to allow the creation of filenames that conflict in this way, but don't even try to enforce this in the kernel core. > This is why I'd hate for the generic Linux dcache to know about case > sensitivity, and I'd be a lot happier having a separate path (which isn't > as speed-critical) that can be used to help implement helper functions for > doing case-insensitive things. 
The problem is that if that separate path doesn't go via the dcache then we won't get the invalidation of our negative dentries so we won't be able to do any better than scanning the whole directory every time to prove files don't exist. The dcache has to know about this as its the only place where all the information that is needed comes together (I'm sure you'll correct me if I'm wrong about this). > That way the bugs and strange behaviour would be all be limited to the > case-insensitive special code, and not pollute the "sane" side. except when something like a file create happens on the "sane" side of things and we then have no way of knowing that our name space has just changed. I suppose we could create a completely new dcache in parallel with the current one and have some sort of notify between the "sane" and "insane" worlds, but I suspect the glue code between them would be worse than just adding that context bit to the main dcache. > For example, I fundamentally can't easily do an atomic exclusive > case-insensitive "create" or "rename", but we _could_ expose things like > directory generation counts to the special interfaces, and thus allow at > least "local-atomic" operations (but they would _not_ be atomic over a > network, to give you an idea of the kinds of _fundamental_ limitations > there are here). yes, doing atomic network file operations sucks, but please don't let that stop us doing it in a reasonable fashion for local filesystems. Doing a nice atomic case-insensitive create or rename is really *no* different from what we do now in Linux, it just means that we need to have case-insensitive dentries that mean "this is a negative dentry that covers all possible case combinations of the name it contains". It is up to the filesystem to provide you with that -ve dentry (just like the filesystem provides the case-sensitive -ve dentries now) and the dcache just has to use it in the same way that it uses the existing ones. 
If you really don't want to do this then fine, in which case I'll ask again in a year or twos time and see if I can convince you then. I know this would make the code messier, and making code messier for the sake of interoperability with windows is perhaps reason enough not to do it. But please don't tell me it *can't* be done or that it is just too hard. That's just not true. > - regular atomic create of random case-_sensitive_ name using something > tempnam()-like (use a prefix that is invalid on windows or something: > make the first character be 0xff or whatever). > - "read directory local sequence count" > - readdir to make sure that the new name is still unique even in the > case-insensitive sense > - "atomic move conditionally on the local sequence count still being X" that could make things atomic, but it won't make it fast. Think about the fact that modern filesystems are now using better than linear lists for directories. So in most cases lookups in large directories can be done in much better than O(n) time (for reasonable values of n). The above solution means Samba will never be better than O(n), so for large directories we will always suck performance wise. It doesn't have to be that way. > We can even allow that case-insensitive module to set some flags in the > dentries (so that you can create negative dentries that have a flag set > "this is negative for all cases"). ahh! yipee! yes, if we have that dentry bit then we have a hope. Without that I think it won't help much. > See where I'm going? Would this be acceptable to you? Are there any samba > people who are knowledgeable about the VFS-layer and have the time/energy > to try something like this? I'll discuss this with some of the people here in OzLabs and see if we can come up with a plan. I suspect most of OzLabs will be avoiding me for a day or two in an attempt to not be the one to do this :-) Cheers, Tridge ^ permalink raw reply [flat|nested] 69+ messages in thread
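The four-step sequence quoted in the message above (temporary create, read the sequence count, readdir, conditional move) can be sketched with a toy in-memory directory. This is only a model under stated assumptions: `ToyDirectory`, `rename_if_seq` and `create_case_insensitive` are hypothetical names, no such kernel interface is being described, and `str.lower()` stands in for whatever case table would really be used.

```python
class ToyDirectory:
    """In-memory stand-in for a directory with a local sequence count."""

    def __init__(self):
        self.names = set()
        self.seq = 0  # bumped on every change to the directory

    def create(self, name):
        if name in self.names:
            raise FileExistsError(name)
        self.names.add(name)
        self.seq += 1

    def unlink(self, name):
        self.names.remove(name)
        self.seq += 1

    def rename_if_seq(self, old, new, expected_seq):
        """Move old to new only if nothing changed since expected_seq was read."""
        if self.seq != expected_seq or new in self.names:
            return False
        self.names.remove(old)
        self.names.add(new)
        self.seq += 1
        return True


def create_case_insensitive(d, name, max_tries=10):
    """Create `name` unless some case variant of it already exists."""
    tmp = "\xff" + name                      # step 1: case-sensitive temp name
    d.create(tmp)
    folded = name.lower()
    for _ in range(max_tries):
        seq = d.seq                          # step 2: read the sequence count
        # step 3: scan the directory for a case-insensitive clash (O(n))
        if any(n.lower() == folded for n in d.names if n != tmp):
            break
        if d.rename_if_seq(tmp, name, seq):  # step 4: conditional move
            return True
    d.unlink(tmp)                            # clash, or too many retries
    return False
```

Step 3 is the linear scan the reply objects to: the scheme is atomic locally, but every create still walks the whole directory.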
* Re: UTF-8 and case-insensitivity 2004-02-17 15:13 ` Linus Torvalds 2004-02-17 16:57 ` Linus Torvalds @ 2004-02-17 23:20 ` tridge 2004-02-17 23:43 ` Linus Torvalds 2004-02-18 2:37 ` H. Peter Anvin 1 sibling, 2 replies; 69+ messages in thread From: tridge @ 2004-02-17 23:20 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List, Al Viro Linus, > Yes, we could add context sensitivity to the dcache with a context > bitmask. > > However, it's _not_ correct. > > It assumes that there is only one way to do lower/upper case, which just > isn't true. What about different locales that have different case rules? > Your "one bit per dentry" becomes "one bit per locale per dentry". That's > just horribly hard to do. I think you're making it sound much harder than it really is. We just add a VFS hook in the filesystems. The filesystem chooses the encoding specific comparison function. If the filesystem doesn't provide one then don't do case insensitivity. If the filesystem does provide one (for example NTFS, JFS) then use it. Then all I need to do is convince one of the filesystem maintainers to add a mount time option to specify the case table (for example by specifying the name of a file in the filesystem that holds it). So, all the really ugly stuff is then in the per-filesystem code, and all the VFS and dcache has to do is know about a single context bit per dentry. > I don't know how Windows does it, so maybe this thing is hardcoded, and > you don't even want "true" case insensitivity. NTFS has a 128k table on disk, created at mkfs time and indexed by the UCS2 character. The interesting thing about this table is that it doesn't seem to vary between different locales as one might expect. I have checked 3 locales so far (Swedish, Japanese and English) and all have the same 128k table. I should check a few more locales to see if it really is the same everywhere. 
Contact me off-list if you have a NTFS filesystem created in a different locale and would be willing to run a test program against it to see if the table is different from the one we have in Samba. There is stuff in the charset handling of every locale that does vary in windows, but it isn't the case table, its the "valid characters" map used to determine what characters are allowed when converting strings into legacy multi-byte encodings. Even I don't think that the kernel will ever have to deal with that crap unless someone is foolish enough to port Samba into the kernel (several people have actually done that despite the insanity of the idea, but they all did an absolutely terrible job of it and certainly didn't take care to get all the charset handling right). > How "correct" is Windows? from my rather limited point of view I always have to assume that windows is "correct", unless I can show that its behaviour leads to data loss, a security hole or something equally extreme. Cheers, Tridge ^ permalink raw reply [flat|nested] 69+ messages in thread
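For a sense of what a 128k table indexed by a 16-bit character looks like, here is a rough Python sketch. It is an assumption-laden illustration: the mapping comes from Python's own Unicode data via `str.upper()`, not from the on-disk NTFS table, and choosing upper case as the canonical form is my choice for the example. Only one-to-one mappings are kept, so a character whose upper-case form is longer than one character maps to itself.

```python
from array import array


def build_upcase_table():
    """Dense table: index = 16-bit code unit, value = its upper-case code unit."""
    table = array("H", range(0x10000))      # start with the identity mapping
    for code in range(0x10000):
        upper = chr(code).upper()
        # Keep only simple one-to-one mappings that stay within 16 bits.
        if len(upper) == 1 and ord(upper) <= 0xFFFF:
            table[code] = ord(upper)
    return table


def upcase_ucs2(name, table):
    """Upper-case a string of 16-bit-range characters by table lookup."""
    return "".join(chr(table[ord(ch)]) for ch in name)


table = build_upcase_table()
size_bytes = len(table) * table.itemsize    # 65536 entries of 2 bytes each
```

With 65536 two-byte entries the table comes to 128 KiB, and a lookup is a single array index with no locale logic at all.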
* Re: UTF-8 and case-insensitivity 2004-02-17 23:20 ` tridge @ 2004-02-17 23:43 ` Linus Torvalds 2004-02-18 3:26 ` tridge 2004-02-18 2:37 ` H. Peter Anvin 1 sibling, 1 reply; 69+ messages in thread From: Linus Torvalds @ 2004-02-17 23:43 UTC (permalink / raw) To: tridge; +Cc: Kernel Mailing List, Al Viro On Wed, 18 Feb 2004 tridge@samba.org wrote: > > I think you're making it sound much harder than it really is. I think I'm just making the mistake of assuming that anybody would care to do it "right", while everybody really only cares to get it be compatible with Windows. For example, if you only want to be compatible with Windows, you don't have to worry about UCS-4, you only have the UCS-2 part, which means that you can do a silly array-lookup based thing or something. > We just add a VFS hook in the filesystems. The filesystem chooses the > encoding specific comparison function. If the filesystem doesn't > provide one then don't do case insensitivity. If the filesystem does > provide one (for example NTFS, JFS) then use it. Then all I need to do > is convince one of the filesystem maintainers to add a mount time > option to specify the case table (for example by specifying the name > of a file in the filesystem that holds it). Ugh. What a horrible kludge, and it won't work without "preparing" the filesystem at mount-time. I'd much rather leave the translation table in user space, and just give it as an argument to the "look up case insensitive" special thing. That would mean that we can hold the directory semaphore over the whole thing, which would simplify _my_ kludge, since there would be no need to worry about user space having separate stages. The hard part would be negative dentries. We'd have to invalidate all "case-insensitive" negative dentries when creating any new file in a directory, and that would be something the generic VFS layer would have to know about, and that might be unacceptable to Al. 
Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 23:43 ` Linus Torvalds @ 2004-02-18 3:26 ` tridge 2004-02-18 5:33 ` H. Peter Anvin 2004-02-18 7:54 ` Marc Lehmann 0 siblings, 2 replies; 69+ messages in thread From: tridge @ 2004-02-18 3:26 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List, Al Viro Linus, > For example, if you only want to be compatible with Windows, you don't > have to worry about UCS-4, you only have the UCS-2 part, which means that > you can do a silly array-lookup based thing or something. Even within UCS-2 land the case-mapping table is sparse as only some characters have an upper/lower mapping. In fact, there are just 636 characters out of 64k that have an upper/lower case mapping that isn't the identity. That is across *all* languages that windows uses for UCS-2. In Samba that's not sparse enough that its worth saving the single mmap of 128k to encode it sparsely in memory, but in UCS-4 land you would obviously use a sparse mapping, and that mapping table would probably be just a few k in size. If you allow for extents then I expect you could encode it in a couple of hundred bytes. (I experimented with using a sparse mapping in Samba, and it was a slight loss on the machine I was testing on compared to just doing the mmap, so I went with the mmap. Maybe someone else can do a better sparse encoding than I did and actually get a win due to better cache behaviour.) > Ugh. What a horrible kludge, and it won't work without "preparing" the > filesystem at mount-time. I'd much rather leave the translation table in > user space, and just give it as an argument to the "look up case > insensitive" special thing. The case mapping table must remain the same for the lifetime of the mounted filesystem, otherwise you'd get chaos. That's why tying it to the filesystem (ie. hanging it off the superblock) makes sense. > The hard part would be negative dentries. 
We'd have to invalidate all > "case-insensitive" negative dentries when creating any new file in a > directory, and that would be something the generic VFS layer would have to > know about Right, the handling of negative dentries is the key. I don't think its quite as bad as you say though, as you can do this: 1) use a filesystem provided case-insensitive hash in the dcache. If the filesystem provided hash isn't case-insensitive then don't try to do case-insensitive lookups on this filesystem. 2) you only need to potentially invalidate entries in the same hash bucket as the name you are creating. 3) Even better, you don't need to invalidate entries that don't have the same hash value (presuming your hash values are larger than your truncated hash keys). > and that might be unacceptable to Al. yes, and I'm quite sympathetic to that point of view. I just want to make sure that if we don't do this then we use honest reasons for not doing it, not "that's impossible" reasons which are bogus when you examine them. Cheers, Tridge ^ permalink raw reply [flat|nested] 69+ messages in thread
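Points 1) to 3) in the message above can be modelled in a few lines of Python. This is a toy, not the dcache: `NegativeCache`, `ci_hash` and the bucket count are invented for the example, CRC32 over the lower-cased name stands in for a filesystem-provided case-insensitive hash, and `str.lower()` stands in for the real case table.

```python
import zlib

NBUCKETS = 64  # hypothetical table size; the bucket index is a truncated hash


def ci_hash(name):
    """Case-insensitive hash: names differing only in case hash identically."""
    return zlib.crc32(name.lower().encode("utf-8"))


class NegativeCache:
    """Toy cache of 'no file matches this name in any case' results."""

    def __init__(self):
        self.buckets = [[] for _ in range(NBUCKETS)]

    def add_negative(self, name):
        h = ci_hash(name)
        self.buckets[h % NBUCKETS].append((h, name))

    def is_negative(self, name):
        h = ci_hash(name)
        folded = name.lower()
        return any(eh == h and en.lower() == folded
                   for eh, en in self.buckets[h % NBUCKETS])

    def on_create(self, name):
        """Invalidate on create: look in one bucket only, and drop only the
        entries whose full hash equals the new name's hash."""
        h = ci_hash(name)
        bucket = self.buckets[h % NBUCKETS]
        before = len(bucket)
        bucket[:] = [(eh, en) for eh, en in bucket if eh != h]
        return before - len(bucket)   # number of entries invalidated
```

A create therefore touches one bucket and leaves every unrelated negative entry in place, which is the reason the message argues the invalidation cost is small.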
* Re: UTF-8 and case-insensitivity 2004-02-18 3:26 ` tridge @ 2004-02-18 5:33 ` H. Peter Anvin 2004-02-18 7:54 ` Marc Lehmann 1 sibling, 0 replies; 69+ messages in thread From: H. Peter Anvin @ 2004-02-18 5:33 UTC (permalink / raw) To: linux-kernel Followup to: <16434.56190.639555.554525@samba.org> By author: tridge@samba.org In newsgroup: linux.dev.kernel > > In Samba that's not sparse enough that its worth saving the single > mmap of 128k to encode it sparsely in memory, but in UCS-4 land you > would obviously use a sparse mapping, and that mapping table would > probably be just a few k in size. If you allow for extents then I > expect you could encode it in a couple of hundred bytes. > If all you care about is the UTF-16-compatible range, you only need 1088K entries in your table; small enough that it can be reasonably had in userspace. > (I experimented with using a sparse mapping in Samba, and it was a > slight loss on the machine I was testing on compared to just doing the > mmap, so I went with the mmap. Maybe someone else can do a better > sparse encoding than I did and actually get a win due to better cache > behaviour.) The thing is, you're probably only touching small parts of your table, so the kernel and the CPU cache works quite well on the large table as it is. Wouldn't work in kernel space, though. -hpa ^ permalink raw reply [flat|nested] 69+ messages in thread
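The sparse encoding with extents that is discussed above can be sketched as follows. This is a hedged illustration, not anything from Samba: the mapping again comes from Python's `str.upper()` rather than the Windows table, so the number of entries will differ from the 636 quoted earlier, and the `(start, length, delta)` extent format is just one plausible choice.

```python
def simple_upper(code):
    """One-to-one upper-case mapping for a code point, else the identity."""
    up = chr(code).upper()
    return ord(up) if len(up) == 1 else code


def build_extents(limit=0x10000):
    """Run-length encode the non-identity mappings as (start, length, delta)."""
    extents = []
    for code in range(limit):
        delta = simple_upper(code) - code
        if delta == 0:
            continue
        if extents:
            start, length, d = extents[-1]
            if d == delta and start + length == code:
                extents[-1] = (start, length + 1, d)   # extend the current run
                continue
        extents.append((code, 1, delta))
    return extents


def sparse_upper(code, extents):
    """Binary-search the extent list; unmapped code points map to themselves."""
    lo, hi = 0, len(extents)
    while lo < hi:
        mid = (lo + hi) // 2
        start, length, delta = extents[mid]
        if code < start:
            hi = mid
        elif code >= start + length:
            lo = mid + 1
        else:
            return code + delta
    return code


extents = build_extents()
```

The trade-off is the one described in the thread: the dense table costs memory but one index per character, while the extent list is small but needs a search per character. For the full UTF-16-compatible range, 0x110000 code points divided by 1024 gives the 1088K entries mentioned above.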
* Re: UTF-8 and case-insensitivity 2004-02-18 3:26 ` tridge 2004-02-18 5:33 ` H. Peter Anvin @ 2004-02-18 7:54 ` Marc Lehmann 1 sibling, 0 replies; 69+ messages in thread From: Marc Lehmann @ 2004-02-18 7:54 UTC (permalink / raw) To: linux-kernel On Wed, Feb 18, 2004 at 02:26:54PM +1100, tridge@samba.org wrote: > Even within UCS-2 land the case-mapping table is sparse as only some > characters have an upper/lower mapping. In fact, there are just 636 > characters out of 64k that have an upper/lower case mapping that isn't > the identity. That is across *all* languages that windows uses for > UCS-2. This is because scripts differentiating between upper and lower case are rare exceptions in the world. Unfortunately, commonly used exceptions, and still locale dependent. Having a samba-helper kernel module that would contain this table (I am confident that it's only a single table in existing versions of windows, but maybe they improve that in future versions) could solve this problem. I still wonder whether it ever can be made efficient, though. -- Marc Lehmann, pcg@goof.com ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 23:20 ` tridge 2004-02-17 23:43 ` Linus Torvalds @ 2004-02-18 2:37 ` H. Peter Anvin 2004-02-18 3:03 ` Linus Torvalds 2004-02-18 4:08 ` tridge 1 sibling, 2 replies; 69+ messages in thread From: H. Peter Anvin @ 2004-02-18 2:37 UTC (permalink / raw) To: linux-kernel Followup to: <16434.41376.453823.260362@samba.org> By author: tridge@samba.org In newsgroup: linux.dev.kernel > > > I don't know how Windows does it, so maybe this thing is hardcoded, and > > you don't even want "true" case insensitivity. > > NTFS has a 128k table on disk, created at mkfs time and indexed by the > UCS2 character. So you're hosed if anyone uses characters outside the UCS-2 character set... > The interesting thing about this table is that it doesn't seem to > vary between different locales as one might expect. I have checked 3 > locales so far (Swedish, Japanese and English) and all have the same > 128k table. I should check a few more locales to see if it really is > the same everywhere. Contact me off-list if you have a NTFS > filesystem created in a different locale and would be willing to run > a test program against it to see if the table is different from the > one we have in Samba. There is a "standard" table, which is published by the Unicode consortium. However, the "standard" table isn't what you want in certain locales, e.g. Turkish. > There is stuff in the charset handling of every locale that does vary > in windows, but it isn't the case table, its the "valid characters" > map used to determine what characters are allowed when converting > strings into legacy multi-byte encodings. Even I don't think that the > kernel will ever have to deal with that crap unless someone is foolish > enough to port Samba into the kernel (several people have actually > done that despite the insanity of the idea, but they all did an > absolutely terrible job of it and certainly didn't take care to get > all the charset handling right). 
> > > How "correct" is Windows? > > from my rather limited point of view I always have to assume that > windows is "correct", unless I can show that its behaviour leads to > data loss, a security hole or something equally extreme. Well, we don't want to support a bunch of hacks to make it behave like Windows if what Windows does doesn't make sense. If so you should use a metalayer where you canonicalize the filenames and don't store "Makefile" on the disk; store "makefile" and keep the "real" filename stashed elsewhere, perhaps an EA. -hpa ^ permalink raw reply [flat|nested] 69+ messages in thread
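The metalayer suggested at the end of the message above (store a canonical name, stash the real one elsewhere) might look roughly like this. It is a toy under stated assumptions: `CanonicalStore` is a made-up class, a Python set stands in for the directory, a dict stands in for the extended attribute, and lower-casing with `str.lower()` is the assumed canonical form.

```python
class CanonicalStore:
    """Toy store: canonical (lower-case) names on 'disk', display names aside."""

    def __init__(self):
        self.on_disk = set()       # stand-in for the real directory contents
        self.display_name = {}     # stand-in for a per-file extended attribute

    def create(self, name):
        key = name.lower()
        if key in self.on_disk:    # one exact lookup settles existence
            raise FileExistsError(name)
        self.on_disk.add(key)
        self.display_name[key] = name

    def lookup(self, name):
        """Return the case-preserved name, or None if no case variant exists."""
        return self.display_name.get(name.lower())
```

Existence checks become exact-match lookups, but only as long as every writer to the directory goes through the same layer.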
* Re: UTF-8 and case-insensitivity 2004-02-18 2:37 ` H. Peter Anvin @ 2004-02-18 3:03 ` Linus Torvalds 2004-02-18 3:14 ` H. Peter Anvin 2004-02-18 4:08 ` tridge 1 sibling, 1 reply; 69+ messages in thread From: Linus Torvalds @ 2004-02-18 3:03 UTC (permalink / raw) To: H. Peter Anvin; +Cc: linux-kernel On Wed, 18 Feb 2004, H. Peter Anvin wrote: > > Well, we don't want to support a bunch of hacks to make it behave like > Windows if what Windows does doesn't make sense. I'd disagree, for a very simple reason: case-insensitivity itself simply does not make sense, so the _only_ reason for having a bunch of hacks is literally to support windows file exports and nothing else. I obviously agree with the fact that we should _not_ put those hacks into the VFS layer proper - we should keep them as a separate thing, and we should make it clear that it makes no sense _except_ for Windows compatibility. Think of it as nothing more than a binary compatibility layer, the same way we have hooks to support "lcall 7,0" for binary compatibility with some silly (and much less interesting) x86 OSes through external modules. Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 3:03 ` Linus Torvalds @ 2004-02-18 3:14 ` H. Peter Anvin 2004-02-18 3:27 ` Linus Torvalds 0 siblings, 1 reply; 69+ messages in thread From: H. Peter Anvin @ 2004-02-18 3:14 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel Linus Torvalds wrote: > > On Wed, 18 Feb 2004, H. Peter Anvin wrote: > >>Well, we don't want to support a bunch of hacks to make it behave like >>Windows if what Windows does doesn't make sense. > > > I'd disagree, for a very simple reason: case-insensitivity itself simply > does not make sense, so the _only_ reason for having a bunch of hacks is > literally to support windows file exports and nothing else. > > I obviously agree with the fact that we should _not_ put those hacks into > the VFS layer proper - we should keep them as a separate thing, and we > should make it clear that it makes no sense _except_ for Windows > compatibility. > > Think of it as nothing more than a binary compatibility layer, the same > way we have hooks to support "lcall 7,0" for binary compatibility with > some silly (and much less interesting) x86 OSes through external modules. > Well, this is also true :) I still say it belongs in userspace. For 100% bug-compatibility with Windows, though, it is probably worthwhile to have the filename in the native filesystem be not what a Windows user would see, but rather the normalized filename. That makes a userspace implementation much easier. -hpa ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 3:14 ` H. Peter Anvin @ 2004-02-18 3:27 ` Linus Torvalds 2004-02-18 21:31 ` tridge 0 siblings, 1 reply; 69+ messages in thread From: Linus Torvalds @ 2004-02-18 3:27 UTC (permalink / raw) To: H. Peter Anvin; +Cc: linux-kernel On Tue, 17 Feb 2004, H. Peter Anvin wrote: > > Well, this is also true :) I still say it belongs in userspace. The thing is, I do agree with Tridge on one simple fact: it's very hard indeed to do atomic file operations from user space. That's not necessarily a problem if samba is the only process accessing the directories in question, since then samba could do all locking internally and make sure that it never does anything inconsistent. However, clearly people who run samba on a machine want to potentially _also_ export that same filesystem as a NFS volume, as a way to have both Windows and UNIX clients access the same data. And that pretty much means that other people _will_ access the directories, and that samba can't do its internal locking in that kind of environment. This is why I am sympathetic to the need to add _some_ kind of support for this. And the only common place ends up being the kernel. > For 100% bug-compatibility with Windows, though, it is probably > worthwhile to have the filename in the native filesystem be not what a > Windows user would see, but rather the normalized filename. That makes > a userspace implementation much easier. Oh, absolutely. But that's something that samba can easily do internally: it can choose to just entirely ignore filenames that aren't normalized, or it can export it on the wire (obviously in the normalized UCS-2 format), and just consider non-normalized names to be another "case". In fact, that's what the naive implementation would do anyway, so that's not any added complexity. (And samba clearly _cannot_ show the client a non-normalized name per se, since the smb protocol ends up using UCS-2). 
Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 3:27 ` Linus Torvalds @ 2004-02-18 21:31 ` tridge 2004-02-18 22:23 ` Linus Torvalds 0 siblings, 1 reply; 69+ messages in thread From: tridge @ 2004-02-18 21:31 UTC (permalink / raw) To: Linus Torvalds; +Cc: H. Peter Anvin, linux-kernel Linus, > The thing is, I do agree with Tridge on one simple fact: it's very hard > indeed to do atomic file operations from user space. I'm glad I'm making progress :) The second basic fact that I think is relevant is that its not possible to do case-insensitive filesystem operations efficiently without the filesystem having knowledge of the fact that you want a case-insensitive lookup. The reason for this is that modern filesystems do much better than an O(n) linear scan for lookups in directories. They use a hash, or a tree or whatever you like to take advantage of an ordering function on the names in the directory. The days of linear scans in directories are fast dwindling. The only way you are going to avoid the linear scan for a case-insensitive lookup is to make that ordering function case-insensitive. The question really is whether we are willing to pay the price in terms of complexity for doing that. I've tried to make the claim in this thread that the code complexity cost of doing this isn't really all that high, but it is definitely non-zero. So your magic_open() proposal would probably be a help, and would certainly reduce the amount of code we would need in userspace, but it doesn't change the fundamental linear scan of directories problem at all. That doesn't mean I won't take you up on the magic_open() proposal, it's just that I'd need to try it to see if its a sufficient win to justify using it given the limitations. Cheers, Tridge ^ permalink raw reply [flat|nested] 69+ messages in thread
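The claim above, that only a case-insensitive ordering function avoids the linear scan, can be illustrated with a sorted index. This is a sketch in Python rather than filesystem code: `FoldedIndex` and `linear_find` are hypothetical names, a sorted list with `bisect` stands in for an on-disk tree or hash, and `str.lower()` stands in for the real case table.

```python
import bisect


class FoldedIndex:
    """Directory index kept sorted by a case-insensitive key."""

    def __init__(self, names=()):
        self.entries = sorted((n.lower(), n) for n in names)

    def add(self, name):
        bisect.insort(self.entries, (name.lower(), name))

    def find(self, name):
        """O(log n) lookup of every stored case variant of `name`."""
        key = name.lower()
        i = bisect.bisect_left(self.entries, (key, ""))
        matches = []
        while i < len(self.entries) and self.entries[i][0] == key:
            matches.append(self.entries[i][1])
            i += 1
        return matches


def linear_find(names, name):
    """What a case-sensitive index forces: compare against every entry."""
    key = name.lower()
    return [n for n in names if n.lower() == key]
```

An empty result from `find` proves absence in logarithmic time; with a case-sensitive ordering the same proof needs `linear_find`.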
* Re: UTF-8 and case-insensitivity 2004-02-18 21:31 ` tridge @ 2004-02-18 22:23 ` Linus Torvalds 2004-02-18 22:28 ` Linus Torvalds 0 siblings, 1 reply; 69+ messages in thread From: Linus Torvalds @ 2004-02-18 22:23 UTC (permalink / raw) To: tridge; +Cc: H. Peter Anvin, linux-kernel On Thu, 19 Feb 2004 tridge@samba.org wrote: > > The second basic fact that I think is relevant is that its not > possible to do case-insensitive filesystem operations efficiently > without the filesystem having knowledge of the fact that you want a > case-insensitive lookup. That's not my problem. That is _your_ problem, and I don't care. I disagree violently with the notion that we would push this down to a filesystem level. Sorry, but there are limits to how much we care about broken operating systems. Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 22:23 ` Linus Torvalds @ 2004-02-18 22:28 ` Linus Torvalds 2004-02-18 22:50 ` tridge 0 siblings, 1 reply; 69+ messages in thread From: Linus Torvalds @ 2004-02-18 22:28 UTC (permalink / raw) To: tridge; +Cc: H. Peter Anvin, linux-kernel On Wed, 18 Feb 2004, Linus Torvalds wrote: > > That's not my problem. That is _your_ problem, and I don't care. I > disagree violently with the notion that we would push this down to a > filesystem level. > > Sorry, but there are limits to how much we care about broken operating > systems. Side note: this only matters for cold cache entries anyway, so I doubt you'll see any performance improvement on a file server from passing the brain damage down to the lower levels. And I bet the performance advantages of _not_ doing native case insensitivity are likely to dominate hugely. Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 22:28 ` Linus Torvalds @ 2004-02-18 22:50 ` tridge 2004-02-18 22:59 ` Linus Torvalds 0 siblings, 1 reply; 69+ messages in thread From: tridge @ 2004-02-18 22:50 UTC (permalink / raw) To: Linus Torvalds; +Cc: H. Peter Anvin, linux-kernel Linus, > And I bet the performance advantages of _not_ doing native case > insensitivity are likely to dominate hugely. This part I just don't understand at all. The proposed changes would be extremely cheap performance wise as you are just replacing one hash with another, and dealing with one extra context bit in the dcache. There is no way that this could come anywhere near the cost of doing linear directory scans. The hash function would be slightly more expensive (when enabled), but not much, especially when you put in the obvious optimisation for 7 bit characters. The string comparison function in a couple of places would also become more expensive, but once again it would only be expensive for case-insensitive processes and benefits from the 7 bit optimisation so that the average case will only be very slightly more expensive than the current function. Fair enough that you don't want to do this for code complexity reasons, but please don't tell me it would be slower than what we have to do now. Try an strace of Samba trying to unlink() a non-existent file in a large directory. It's enough to make you want to curl up and die :) Cheers, Tridge ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 22:50 ` tridge @ 2004-02-18 22:59 ` Linus Torvalds 2004-02-18 23:09 ` tridge 0 siblings, 1 reply; 69+ messages in thread From: Linus Torvalds @ 2004-02-18 22:59 UTC (permalink / raw) To: tridge; +Cc: H. Peter Anvin, linux-kernel On Thu, 19 Feb 2004 tridge@samba.org wrote: > > > And I bet the performance advantages of _not_ doing native case > > insensitivity are likely to dominate hugely. > > This part I just don't understand at all. The proposed changes would > be extremely cheap performance wise as you are just replacing one hash > with another, and dealing with one extra context bit in the > dcache. There is no way that this could come anywhere near the cost of > doing linear directory scans. Why do you focus on linear directory scans? They simply do not happen under any reasonable IO patterns. You look up names under the same name that they are on the disk. So the _only_ thing that should matter is the exact match. The inexact matches should be a case of "make them correct". Screw performance. And tell people that they are slower. Sure, I can imagine that MS would make some benchmark to show that case, but at that point I just don't care. Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 22:59 ` Linus Torvalds @ 2004-02-18 23:09 ` tridge 2004-02-18 23:16 ` Linus Torvalds 0 siblings, 1 reply; 69+ messages in thread From: tridge @ 2004-02-18 23:09 UTC (permalink / raw) To: Linus Torvalds; +Cc: H. Peter Anvin, linux-kernel Linus, > Why do you focus on linear directory scans? Because a large number of file operations are on filenames that don't exist. I have to *prove* they don't exist. That includes: * every file create. I have to prove there wasn't an existing file under a different case combination. * every rename. Again, I have to prove that the destination name doesn't exist. * every open of a non-existent name (*very* common, its what MS office does all the time). etc etc. If I had a single function that could quickly tell me that a file does not exist in any case combination then I would be much better off. > They simply do not happen under any reasonable IO patterns. You look up > names under the same name that they are on the disk. So the _only_ thing > that should matter is the exact match. nope, see above. The most common pattern of accesses involves doing a full directory scan on every access. > Sure, I can imagine that MS would make some benchmark to show that case, > but at that point I just don't care. It's not just "some benchmark". It's the normal use case. Cheers, Tridge ^ permalink raw reply [flat|nested] 69+ messages in thread
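What "proving a name does not exist" costs from user space can be shown with a short Python sketch. This is not Samba's implementation: `find_any_case` is a made-up helper, `str.lower()` stands in for the real case table, and returning the first match found mirrors the "pick the first file we come across" behaviour described earlier in the thread.

```python
import os
import tempfile


def find_any_case(dirpath, name):
    """Return the on-disk spelling matching `name` in any case, else None.

    An exact-case hit is cheap. A miss forces a scan of the whole directory,
    because only that proves no other case variant exists.
    """
    if os.path.lexists(os.path.join(dirpath, name)):
        return name
    folded = name.lower()
    for entry in os.listdir(dirpath):      # O(n) in the number of entries
        if entry.lower() == folded:
            return entry                   # first match wins
    return None


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "Report.DOC"), "w").close()
    hit = find_any_case(d, "report.doc")    # found, after a scan
    miss = find_any_case(d, "missing.doc")  # full scan just to say "no"
```

Every create, rename and failed open listed in the message above takes the `miss` path, which is why the scan dominates in large directories.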
* Re: UTF-8 and case-insensitivity 2004-02-18 23:09 ` tridge @ 2004-02-18 23:16 ` Linus Torvalds 2004-02-19 8:10 ` Jamie Lokier 0 siblings, 1 reply; 69+ messages in thread From: Linus Torvalds @ 2004-02-18 23:16 UTC (permalink / raw) To: tridge; +Cc: H. Peter Anvin, linux-kernel On Thu, 19 Feb 2004 tridge@samba.org wrote: > > > Why do you focus on linear directory scans? > > Because a large number of file operations are on filenames that don't > exist. I have to *prove* they don't exist. And you only need to do that ONCE per name. There is zero reason to do it over and over again, and there is zero reason to push case insensitivity deep into the filesystem. Have you checked how many filesystems we have? Hint: ls -l fs/ | grep '^d' | wc The thing is, you have to realize that Windows-compatibility is very very much second-class. If you refuse to realize that, you can't argue effectively, because you are arguing for things that simply WILL NOT happen. So instead of having this crazy windows-centric idea, I would suggest you try to come up with ways to make it easier for you. I can tell you already that it won't be everything you want or need, but quite frankly, your choice is between _nada_ and something reasonable. So give it up. We're not making the same STUPID mistakes that Microsoft has done. Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 23:16 ` Linus Torvalds @ 2004-02-19 8:10 ` Jamie Lokier 2004-02-19 16:09 ` Linus Torvalds 0 siblings, 1 reply; 69+ messages in thread From: Jamie Lokier @ 2004-02-19 8:10 UTC (permalink / raw) To: Linus Torvalds; +Cc: tridge, H. Peter Anvin, linux-kernel Linus Torvalds wrote: > > > Why do you focus on linear directory scans? > > > > Because a large number of file operations are on filenames that don't > > exist. I have to *prove* they don't exist. > > And you only need to do that ONCE per name. > > There is zero reason to do it over and over again, and there is zero > reason to push case insensitivity deep into the filesystem. Linus, while I agree with you wholeheartedly on everything else in this thread - how can Samba only do that lookup ONCE per name if a client is issuing many requests for non-existent opens or stats? Example: A client has a search path for executables or libraries. Each time SomeThing.DLL is looked up by the client, it will issue an open() for each entry in the path, until it finds the file it wants. For each request, Samba must readdir() every directory in the path until the file is found. If a directory doesn't change between requests, Samba can use dnotify to cache the negative lookups. However, if any change occurs in a directory, or if the directory is not dnotify-capable, Samba is not allowed to cache these negative results: It has to do the readdir() for _every_ request. -- Jamie ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-19 8:10 ` Jamie Lokier @ 2004-02-19 16:09 ` Linus Torvalds 2004-02-19 16:38 ` Jamie Lokier 0 siblings, 1 reply; 69+ messages in thread From: Linus Torvalds @ 2004-02-19 16:09 UTC (permalink / raw) To: Jamie Lokier; +Cc: tridge, H. Peter Anvin, linux-kernel On Thu, 19 Feb 2004, Jamie Lokier wrote: > > Linus, while I agree with you wholeheartedly on everything else in > this thread - how can Samba only do that lookup ONCE per name if a > client is issuing many requests for non-existent opens or stats? While I'm not willing to push case insensitivity deep into the filesystems, I _am_ willing to entertain the notion of an extra flag to a dcache entry that the regular VFS operations ignore (apart from clearing it when they change anything and having to flush them under some circumstances), which would basically be "this dentry has been judged unique in a case-insensitive environment". So assuming nobody else is touching the directory, the case-insensitive special module could create these kinds of dentries to its heart's content when it does a lookup. > Example: A client has a search path for executables or libraries. > > Each time SomeThing.DLL is looked up by the client, it will issue an > open() for each entry in the path, until it finds the file it wants. > > For each request, Samba must readdir() every directory in the path > until the file is found. > > If a directory doesn't change between requests, Samba can use dnotify > to cache the negative lookups. > > However, if any change occurs in a directory, or if the directory is > not dnotify-capable, Samba is not allowed to cache these negative > results: It has to do the readdir() for _every_ request. But this is exactly what I _am_ willing to entertain: have some limited special logic inside the kernel (but outside the VFS layer proper), that allows samba to use special interfaces that avoid this. 
For example, the rule can be that _any_ regular dentry create will invalidate all the "case-insensitive" dentries. Just to be simple about it. But if samba is the only thing that accesses a certain directory (or the directory is not written to, like / and /usr etc usually behave), the "windows hack" interface will be able to populate it with its fake dentries all it wants. Or something like this. Basically, I'm convinced that the problem _can_ be solved without going deep into the VFS layer. Maybe I'm wrong. But I'd better not be, because we're definitely not going to screw up the VFS layer for Windows. Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
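The "any create invalidates everything" rule from the message above can be modelled in a few lines. This is only a userspace illustration of the bookkeeping, not kernel or dcache code; the class and method names are hypothetical, and `casefold()` again stands in for the real case table.

```python
class NegativeNameCache:
    """Per-directory record of names proven absent in every case combination."""

    def __init__(self):
        # directory path -> set of case-folded names known not to exist
        self._absent = {}

    def mark_absent(self, directory, name):
        # Called after a full scan found no case-insensitive match.
        self._absent.setdefault(directory, set()).add(name.casefold())

    def is_known_absent(self, directory, name):
        return name.casefold() in self._absent.get(directory, set())

    def on_regular_create(self, directory):
        # The deliberately simple rule: any ordinary create in the directory
        # throws away every cached answer for that directory.
        self._absent.pop(directory, None)
```

With this rule a quiet directory answers repeated misses from the cache, while a busy one falls back to scanning after every create.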
* Re: UTF-8 and case-insensitivity 2004-02-19 16:09 ` Linus Torvalds @ 2004-02-19 16:38 ` Jamie Lokier 2004-02-19 16:54 ` Linus Torvalds 0 siblings, 1 reply; 69+ messages in thread From: Jamie Lokier @ 2004-02-19 16:38 UTC (permalink / raw) To: Linus Torvalds; +Cc: tridge, H. Peter Anvin, linux-kernel Linus Torvalds wrote: > For example, the rule can be that _any_ regular dentry create will > invalidate all the "case-insensitive" dentries. Just to be simple about > it. If that's the rule, then with exactly the same algorithmic efficiency, readdir+dnotify can be used to maintain the cache in userspace instead. There is nothing gained by using the helper module in that case. It follows that a helper module is only useful if readdir+dnotify isn't fast enough, and the invalidation rule has to be more selective. (Although, maybe there are atomicity concerns I haven't thought of). -- Jamie ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-19 16:38 ` Jamie Lokier @ 2004-02-19 16:54 ` Linus Torvalds 2004-02-19 18:29 ` Jamie Lokier 2004-02-19 19:08 ` Helge Hafting 0 siblings, 2 replies; 69+ messages in thread From: Linus Torvalds @ 2004-02-19 16:54 UTC (permalink / raw) To: Jamie Lokier; +Cc: tridge, H. Peter Anvin, linux-kernel On Thu, 19 Feb 2004, Jamie Lokier wrote: > Linus Torvalds wrote: > > For example, the rule can be that _any_ regular dentry create will > > invalidate all the "case-insensitive" dentries. Just to be simple about > > it. > > If that's the rule, then with exactly the same algorithmic efficiency, > readdir+dnotify can be used to maintain the cache in userspace > instead. There is nothing gained by using the helper module in that case. Wrong. Because the dnotify would trigger EVEN FOR SAMBA OPERATIONS. Think about it. Think about samba doing a "rename()" within the directory. Linus ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-19 16:54 ` Linus Torvalds @ 2004-02-19 18:29 ` Jamie Lokier 2004-02-19 19:08 ` Helge Hafting 1 sibling, 0 replies; 69+ messages in thread From: Jamie Lokier @ 2004-02-19 18:29 UTC (permalink / raw) To: Linus Torvalds; +Cc: tridge, H. Peter Anvin, linux-kernel Linus Torvalds wrote: > > > For example, the rule can be that _any_ regular dentry create will > > > invalidate all the "case-insensitive" dentries. Just to be simple about > > > it. > > > > If that's the rule, then with exactly the same algorithmic efficiency, > > readdir+dnotify can be used to maintain the cache in userspace > > instead. There is nothing gained by using the helper module in that case. > > Wrong. > Because the dnotify would trigger EVEN FOR SAMBA OPERATIONS. Ah, I didn't know you meant "_any_ regular dentry create (except for Samba operations)". To apply that rule, you either need alternate versions of rename() and other file syscalls, or something akin to a process-specific flag (set by the helper module) saying that this is a Samba process and dentry creation _by this process_ shouldn't invalidate case-insensitive dentries. And if you have either of those, the bit of code which says "don't invalidate case-insensitive dentries because this is a Samba process" can just as easily say "don't send dnotify events to the current process". And once you've done that, it's easier just to add a DN_IGNORE_SELF flag to dnotify meaning to ignore events caused by the current process, and forget about the helper module. That'd be useful for other programs, too. -- Jamie ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-19 16:54 ` Linus Torvalds 2004-02-19 18:29 ` Jamie Lokier @ 2004-02-19 19:08 ` Helge Hafting 1 sibling, 0 replies; 69+ messages in thread From: Helge Hafting @ 2004-02-19 19:08 UTC (permalink / raw) To: Linus Torvalds; +Cc: Jamie Lokier, tridge, H. Peter Anvin, linux-kernel On Thu, Feb 19, 2004 at 08:54:51AM -0800, Linus Torvalds wrote: > > > On Thu, 19 Feb 2004, Jamie Lokier wrote: > > Linus Torvalds wrote: > > > For example, the rule can be that _any_ regular dentry create will > > > invalidate all the "case-insensitive" dentries. Just to be simple about > > > it. > > > > If that's the rule, then with exactly the same algorithmic efficiency, > > readdir+dnotify can be used to maintain the cache in userspace > > instead. There is nothing gained by using the helper module in that case. > > Wrong. > > Because the dnotify would trigger EVEN FOR SAMBA OPERATIONS. > > Think about it. Think about samba doing a "rename()" within the directory. Avoiding its own operations is a nice one. Could dnotify pass some information, such as the inode number involved to samba? samba could then look up the filename in its cache and take a closer look at that file only. That would avoid losing the cache, even in case of other processes intruding. Helge Hafting ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 2:37 ` H. Peter Anvin 2004-02-18 3:03 ` Linus Torvalds @ 2004-02-18 4:08 ` tridge 2004-02-18 10:05 ` Robin Rosenberg 1 sibling, 1 reply; 69+ messages in thread From: tridge @ 2004-02-18 4:08 UTC (permalink / raw) To: hpa; +Cc: Kernel Mailing List Hpa, > So you're hosed if anyone uses characters outside the UCS-2 character > set... I've heard they are re-defining all those 16 bit numbers to be UCS-16 instead of UCS-2 for exactly that reason. This is rather similar to the move in the Unix community to start using UTF-8. Note that I am not at all proposing that we use UCS-2 in the Linux kernel (except in places where you have to, like the NTFS filesystem). I am proposing that the filesystems be able to offer a case-insensitive hash function to the dcache, and I would expect that this function would be based on UTF-8. The function might operate internally by converting UTF-8 to UCS-2, or it might use a sparse mapping table. It would almost certainly have a fast-path that looked first to see if there are any bytes with the top bit set, and if there are none then it can do a really easy 7 bit table based hash which would make this really fast for most users. The point is that the kernel proper (the VFS and dcache in particular) won't have to care how this hash works. They're just consumers of it. > There is a "standard" table, which is published by the Unicode > consortium. The table used in windows is not exactly the same as the one on unicode.org. Which is "correct" I will leave up to the pedants to discuss, as all that Samba cares about is that it uses the same table as w2k. > However, the "standard" table isn't what you want in certain > locales, e.g. Turkish. I'd really like someone to confirm this for me by volunteering to run a tool I provide on a Turkish NTFS filesystem or sending me a compressed empty Turkish NTFS volume (please ask first by email - I only need one of these). 
Up to now I have only ever seen the one 128k table used across all windows locales. If this table really *is* different in some locales then I need to know. Cheers, Tridge ^ permalink raw reply [flat|nested] 69+ messages in thread
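The fast-path idea in the message above (check for bytes with the top bit set, and fold with a small table when there are none) can be sketched as follows. This is an illustration under assumptions, not the proposed kernel function: the names are hypothetical, the hash is an arbitrary multiply-and-add, and the slow path uses Python's `casefold()` where a real implementation would use the Windows-compatible table.

```python
# Fold table for the fast path. bytes.translate() needs 256 entries, but only
# the first 128 (7-bit ASCII) are ever used on this path.
_ASCII_FOLD = bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in range(256))


def ci_name_hash(name: bytes) -> int:
    """Hash a UTF-8 filename so that names differing only in case collide."""
    if all(b < 0x80 for b in name):
        # Fast path: no byte has the top bit set, so the name is pure ASCII.
        folded = name.translate(_ASCII_FOLD)
    else:
        # Slow path: decode and fold with a full table (stand-in shown here).
        folded = name.decode("utf-8").casefold().encode("utf-8")
    h = 5381
    for b in folded:
        h = (h * 33 + b) & 0xFFFFFFFF  # keep the value within 32 bits
    return h
```

A caller only compares hash values and never needs to know which path produced them, which is the property the message asks of the VFS and dcache.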
* Re: UTF-8 and case-insensitivity 2004-02-18 4:08 ` tridge @ 2004-02-18 10:05 ` Robin Rosenberg 2004-02-18 11:43 ` tridge 0 siblings, 1 reply; 69+ messages in thread From: Robin Rosenberg @ 2004-02-18 10:05 UTC (permalink / raw) To: tridge; +Cc: hpa, Kernel Mailing List On Wednesday 18 February 2004 05.08, tridge@samba.org wrote: > Hpa, > > > So you're hosed if anyone uses characters outside the UCS-2 character > > set... > > I've heard they are re-defining all those 16 bit numbers to be UCS-16 > instead of UCS-2 for exactly that reason. This is rather similar to > the move in the Unix community to start using UTF-8. I've read it also: http://www.microsoft.com/globaldev/getwr/steps/wrg_unicode.mspx "The fundamental representation of text in Windows NT-based operating systems is UTF-16" -- robin ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 10:05 ` Robin Rosenberg @ 2004-02-18 11:43 ` tridge 2004-02-18 12:31 ` Robin Rosenberg 0 siblings, 1 reply; 69+ messages in thread From: tridge @ 2004-02-18 11:43 UTC (permalink / raw) To: Robin Rosenberg; +Cc: hpa, Kernel Mailing List Robin, > I've read it also: > http://www.microsoft.com/globaldev/getwr/steps/wrg_unicode.mspx > "The fundamental representation of text in Windows NT-based > operating systems is UTF-16" yep, in this thread I've been mistakenly using the term UCS-16 when I should have said UTF-16 (ie. the variable length, 2 byte encoding). Samba currently treats the bytes on the wire from windows as UCS-2 (a 2 byte fixed width encoding), whereas perhaps it should be treating them as UTF-16. I should write a smbtorture test to detect the difference and see what different versions of windows actually use. luckily the new charset handling stuff in samba3 and samba4 will make this easy to fix :-) ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 11:43 ` tridge @ 2004-02-18 12:31 ` Robin Rosenberg 2004-02-18 16:48 ` H. Peter Anvin 0 siblings, 1 reply; 69+ messages in thread From: Robin Rosenberg @ 2004-02-18 12:31 UTC (permalink / raw) To: tridge; +Cc: hpa, Kernel Mailing List On Wednesday 18 February 2004 12.43, tridge@samba.org wrote: > Robin, > > I've read it also: > > http://www.microsoft.com/globaldev/getwr/steps/wrg_unicode.mspx > > "The fundamental representation of text in Windows NT-based > > operating systems is UTF-16" I believe (please correct me if this is wrong) that Windows never actually supported any of the UCS-2 code that were in conflict with UTF-16. The cost of this operation was that some of the "private" code blocks of unicode 2.0, i.e. U+D800..U+DFFF were redefined as "surrogates" in Unicode 3.0 making the UTF-16 encoding more or less backwards compatible with UCS-2. And it's UTF-16LE and UCS-2LE, but I suspect you knew that :-) > yep, in this thread I've been mistakenly using the term UCS-16 when I > should have said UTF-16 (ie. the variable length, 2 byte encoding). > > Samba currently treats the bytes on the wire from windows as UCS-2 (a > 2 byte fixed width encoding), whereas perhaps it should be treating > them as UTF-16. I should write a smbtorture test to detect the > difference and see what different versions of windows actually use. See above, and most importantly the definition in Amendment 1 of the unicode 3.0 standard. > luckily the new charset handling stuff in samba3 and samba4 will make > this easy to fix :-) Happy man! -- robin ^ permalink raw reply [flat|nested] 69+ messages in thread
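The UCS-2 versus UTF-16 distinction discussed above only shows up for characters beyond U+FFFF, which UTF-16 encodes as a pair of 16-bit units taken from the surrogate range U+D800..U+DFFF. A small sketch (helper names are hypothetical) that makes the difference visible:

```python
def utf16_units(text):
    """Return the 16-bit code units of `text` as encoded in UTF-16LE."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def same_in_ucs2_and_utf16(text):
    """True if every character fits in one 16-bit unit, so both readings agree."""
    return all(ord(ch) <= 0xFFFF for ch in text)
```

For example, `utf16_units("\U00010400")` gives `[0xD801, 0xDC00]`: a receiver that treats the wire format as fixed-width UCS-2 sees two characters where a UTF-16 reader sees one.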
* Re: UTF-8 and case-insensitivity 2004-02-18 12:31 ` Robin Rosenberg @ 2004-02-18 16:48 ` H. Peter Anvin 2004-02-18 20:00 ` H. Peter Anvin 0 siblings, 1 reply; 69+ messages in thread From: H. Peter Anvin @ 2004-02-18 16:48 UTC (permalink / raw) To: Robin Rosenberg; +Cc: tridge, Kernel Mailing List Robin Rosenberg wrote: > > I believe (please correct me if this is wrong) that Windows never actually > supported any of the UCS-2 code that were in conflict with UTF-16. The cost > of this operation was that some of the "private" code blocks of unicode 2.0, i.e. > U+D800..U+DFFF were redefined as "surrogates" in Unicode 3.0 making the > UTF-16 encoding more or less backwards compatible with UCS-2. And it's > UTF-16LE and UCS-2LE, but I suspect you knew that :-) > Make that Unicode 1.0 and 1.1, and you're correct. -hpa ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-18 16:48 ` H. Peter Anvin @ 2004-02-18 20:00 ` H. Peter Anvin 0 siblings, 0 replies; 69+ messages in thread From: H. Peter Anvin @ 2004-02-18 20:00 UTC (permalink / raw) To: linux-kernel Followup to: <4033974F.4090706@zytor.com> By author: "H. Peter Anvin" <hpa@zytor.com> In newsgroup: linux.dev.kernel > > Robin Rosenberg wrote: > > > > I believe (please correct me if this is wrong) that Windows never actually > > supported any of the UCS-2 code that were in conflict with UTF-16. The cost > > of this operation was that some of the "private" code blocks of unicode 2.0, i.e. > > U+D800..U+DFFF were redefined as "surrogates" in Unicode 3.0 making the > > UTF-16 encoding more or less backwards compatible with UCS-2. And it's > > UTF-16LE and UCS-2LE, but I suspect you knew that :-) > > > > Make that Unicode 1.0 and 1.1, and you're correct. > Err, that was supposed to be 1.1 and 2.0. Unicode 1.1 reshuffled the private use range from Unicode 1.0, in order to make room for surrogates in Unicode 2.0. UTF-16, what a horrible ugly hack. -hpa ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 5:11 ` Linus Torvalds 2004-02-17 6:54 ` tridge @ 2004-02-19 2:53 ` Daniel Newby 1 sibling, 0 replies; 69+ messages in thread From: Daniel Newby @ 2004-02-19 2:53 UTC (permalink / raw) To: Linus Torvalds, Andrew Tridgell, Kernel Mailing List Linus Torvalds wrote: > So some variation of the interface > > int magic_open( > /* Input arguments */ > const char *pathname, > unsigned long flags, > mode_t mode, What about making the pathname hold the alternative cases for each character, not just an exact string? If Samba wanted to open "A File.txt", it would do magic_open( "[a|A][ ][f|F][i|I][l|L][e|E][.][t|T][x|X][t|T]", ... ) The syntax shown is conceptual; the actual code would use binary packing. Characters would be variable length to support UTF-8 and the like. Userland would be responsible for making a useful pathname. If it tried something like "[aL|P|#][m|m]", the kernel would cheerfully use it. The only sanity checking would be that special characters like "/" and ":" cannot have alternatives. Pros: 1. Filesystem names are looked up in kernel mode, where it might be efficient. (Less grossly slow at least.) 2. But the kernel doesn't care about encodings and character sets. 3. No new kernel infrastructure needed. (I hope?) The case-insensitive system calls don't take a performance hit. 4. The kernel can detect name collisions and decide what to do based on a flag. 5. Lookup tables are totally in userland and outside locks. Each app can use the table it finds appropriate. 6. A naughty app can't deadlock the filesystem. 7. Case-insensitive calls can be atomic, if you're willing to pay the performance price. It's straightforward for magic_creat() to refuse to create collisions. Cons: 1. Looking up multiple alternatives is hairy. (Not that the other approaches are much prettier.) 2. Massive filenames would get turned into something *really* massive (five times as many bytes for a simple packing). Does this break anything? 
-- Daniel Newby ^ permalink raw reply [flat|nested] 69+ messages in thread
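A rough sketch of the userland half of this proposal: building the per-character alternatives and checking a candidate name against them. Everything here is hypothetical illustration (function names, the bracket rendering, and the use of Python's `lower()`/`upper()` as the case table); the message itself says the real format would be binary-packed.

```python
def case_alternatives(name, fixed="/:"):
    """Build, for each character of `name`, the list of acceptable alternatives."""
    pattern = []
    for ch in name:
        if ch in fixed:
            variants = [ch]  # special characters may not have alternatives
        else:
            variants = []
            for other in (ch.lower(), ch.upper(), ch):
                # Skip mappings that expand to several characters (e.g. sharp s).
                if len(other) == 1 and other not in variants:
                    variants.append(other)
        pattern.append(variants)
    return pattern


def render(pattern):
    """Render the alternatives in the conceptual bracket syntax."""
    return "".join("[" + "|".join(v) + "]" for v in pattern)


def matches(pattern, candidate):
    """True if `candidate` picks one listed alternative at every position."""
    return len(candidate) == len(pattern) and all(
        ch in variants for ch, variants in zip(candidate, pattern)
    )
```

The kernel side would only ever run something like `matches()`, which involves no case table at all; the table lives entirely in whatever builds the pattern.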
* Re: UTF-8 and case-insensitivity 2004-02-17 4:12 tridge 2004-02-17 5:11 ` Linus Torvalds @ 2004-02-17 5:25 ` Tim Connors 2004-02-17 7:43 ` H. Peter Anvin ` (2 subsequent siblings) 4 siblings, 0 replies; 69+ messages in thread From: Tim Connors @ 2004-02-17 5:25 UTC (permalink / raw) To: linux-kernel tridge@samba.org said on Tue, 17 Feb 2004 15:12:06 +1100: > Given how much pain the "kernel is agnostic to charset encoding" > attitude has cost me in terms of programming pain, I thought I should > de-cloak from lurk mode and put my 2c into the UTF-8 issue. > > Personally I think that eventually the Linux kernel will have to > embrace the interpretation of the byte streams that applications have > given it, What applications? > despite the fact that this will be very painful and > potentially quite complex. The reason is that I think that eventually > the Linux kernel will need to efficiently support a userspace policy > of case-insensitivity and the only way to do case-insensitive filename > operations is to interpret those byte streams as a particular > encoding. > > Personally I much prefer the systems I use to be case-sensitive, but > there are important applications that require case-insensitivity for > interoperability. Why? Sounds pretty idiotic to me. If you don't like it, use some microshit filesystem like vfat. I'll keep using ext3 etc, thanks. -- TimC -- http://astronomy.swin.edu.au/staff/tconnors/ Conclusion to my thesis -- "It is trivial to show that it is clearly obvious that this is not woofly." ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 4:12 tridge 2004-02-17 5:11 ` Linus Torvalds 2004-02-17 5:25 ` Tim Connors @ 2004-02-17 7:43 ` H. Peter Anvin 2004-02-17 8:05 ` H. Peter Anvin 2004-02-17 14:25 ` Dave Kleikamp 2004-02-18 0:16 ` Robert White 4 siblings, 1 reply; 69+ messages in thread From: H. Peter Anvin @ 2004-02-17 7:43 UTC (permalink / raw) To: linux-kernel Followup to: <16433.38038.881005.468116@samba.org> By author: tridge@samba.org In newsgroup: linux.dev.kernel > > Given how much pain the "kernel is agnostic to charset encoding" > attitude has cost me in terms of programming pain, I thought I should > de-cloak from lurk mode and put my 2c into the UTF-8 issue. > > Personally I think that eventually the Linux kernel will have to > embrace the interpretation of the byte streams that applications have > given it, despite the fact that this will be very painful and > potentially quite complex. The reason is that I think that eventually > the Linux kernel will need to efficiently support a userspace policy > of case-insensitivity and the only way to do case-insensitive filename > operations is to interpret those byte streams as a particular > encoding. > Realistically, the only sane way to do this is to set our foot down and say: UTF-8 is *the* encoding. A good step in that direction would be to set utf-8 to be the default NLS in the kernel, but as long as people keep the whole sick idea that we can continue to use locale-dependent encoding we're in for a world of hurt. That's really the long and short of it. Until people are willing to say "we support UTF-8, anything else and it's anyone's guess what happens" then nothing is going to happen. -hpa -- PGP public key available - finger hpa@zytor.com Key fingerprint: 2047/2A960705 BA 03 D3 2C 14 A8 A8 BD 1E DF FE 69 EE 35 BD 74 "The earth is but one country, and mankind its citizens." -- Bahá'u'lláh Just Say No to Morden * The Shadows were defeated -- Babylon 5 is renewed!! 
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: UTF-8 and case-insensitivity 2004-02-17 7:43 ` H. Peter Anvin @ 2004-02-17 8:05 ` H. Peter Anvin 0 siblings, 0 replies; 69+ messages in thread From: H. Peter Anvin @ 2004-02-17 8:05 UTC (permalink / raw) To: linux-kernel Followup to: <c0sgnc$ngo$1@terminus.zytor.com> By author: hpa@zytor.com (H. Peter Anvin) In newsgroup: linux.dev.kernel > > Realistically, the only sane way to do this is to set our foot down > and say: UTF-8 is *the* encoding. A good step in that direction would > be to set utf-8 to be the default NLS in the kernel, but as long as > people keep the whole sick idea that we can continue to use > locale-dependent encoding we're in for a world of hurt. > > That's really the long and short of it. Until people are willing to > say "we support UTF-8, anything else and it's anyone's guess what > happens" then nothing is going to happen. > Oh yes, on top of that, if you want case insensitivity, then you also need to start thinking about a whole lot of other things, including what normalization form(s) you care about. Keeping normalization (as well as case-conversion) data for the entire Unicode space in the kernel is a boatload of memory. Then, you have to deal with your filesystem going sour on you when two files suddenly alias, because there is a new revision of the mapping tables. Case seemed simple when we were dealing with the "let's teach them all English" world, but even when you're dealing with languages like German (ß) or Dutch (IJ) things get fuzzy... what's worse, in Turkish the uppercase equivalent of "i" (U+0069) isn't "I" (U+0049), it's "İ" (U+0130)! There is no table which can tell you that, since it's context-dependent. Thus, you may now need to consider larger equivalence classes, but is the other user expecting the same thing? 
You can't just use the same base letter being equivalent everywhere, or a Swedish user would beat the sh*t out of you for confusing the words "vas" and "väs". On the other hand, the Swedish user would be perfectly happy having "ä" equivalent with "æ" and "ü" equivalent with "y"! Therein lies madness. -hpa -- PGP public key available - finger hpa@zytor.com Key fingerprint: 2047/2A960705 BA 03 D3 2C 14 A8 A8 BD 1E DF FE 69 EE 35 BD 74 "The earth is but one country, and mankind its citizens." -- Bahá'u'lláh Just Say No to Morden * The Shadows were defeated -- Babylon 5 is renewed!! ^ permalink raw reply [flat|nested] 69+ messages in thread
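The points about normalization, German ß and Turkish i can be checked with a short sketch. It uses Python's `unicodedata` module and built-in case methods as an example of one locale-independent table; the two helper names are hypothetical, and `turkish_upper` is only an illustration of a locale-specific rule, not a complete Turkish tailoring.

```python
import unicodedata


def simple_fold(name):
    """Locale-independent folding: NFC-normalize, then apply Unicode case folding."""
    return unicodedata.normalize("NFC", name).casefold()


def turkish_upper(text):
    """Uppercase with the Turkish rule that dotted i maps to dotted capital I."""
    return text.replace("i", "\u0130").upper()
```

With these, `simple_fold("a\u0308")` equals `simple_fold("\u00e4")` (two byte sequences, one name after normalization), `simple_fold("straße")` equals `simple_fold("STRASSE")` (two distinct names that alias under this table), and `"istanbul".upper()` yields `"ISTANBUL"` while `turkish_upper("istanbul")` yields `"İSTANBUL"`. A single table cannot satisfy both the Turkish and the non-Turkish expectation.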
* Re: UTF-8 and case-insensitivity 2004-02-17 4:12 tridge ` (2 preceding siblings ...) 2004-02-17 7:43 ` H. Peter Anvin @ 2004-02-17 14:25 ` Dave Kleikamp 2004-02-18 0:16 ` Robert White 4 siblings, 0 replies; 69+ messages in thread From: Dave Kleikamp @ 2004-02-17 14:25 UTC (permalink / raw) To: tridge; +Cc: linux-kernel On Mon, 2004-02-16 at 22:12, tridge@samba.org wrote: > Given how much pain the "kernel is agnostic to charset encoding" > attitude has cost me in terms of programming pain, I thought I should > de-cloak from lurk mode and put my 2c into the UTF-8 issue. > > Personally I think that eventually the Linux kernel will have to > embrace the interpretation of the byte streams that applications have > given it, despite the fact that this will be very painful and > potentially quite complex. The reason is that I think that eventually > the Linux kernel will need to efficiently support a userspace policy > of case-insensitivity and the only way to do case-insensitive filename > operations is to interpret those byte streams as a particular > encoding. > > Personally I much prefer the systems I use to be case-sensitive, but > there are important applications that require case-insensitivity for > interoperability. Right now it is not possible to write a case > insensitive application on Linux in an efficient manner. With the > current "encoding agnostic" APIs a simple open() or stat() call > becomes a horrendously expensive operation and one that is fraught > with race conditions. Providing the same functionality in the kernel > is dirt cheap by comparison (not cheap in terms of code complexity, > but cheap in terms of runtime efficiency). This would be easy to do in JFS due to the baggage we carried over to be compatible with OS/2-formatted volumes. In OS/2, the directories were ordered in a case-insensitive fashion. This would have to be a mkfs option, and would not be a per-process option. The directories must be created either case-sensitive or not. 
Shaggy -- David Kleikamp IBM Linux Technology Center ^ permalink raw reply [flat|nested] 69+ messages in thread
* RE: UTF-8 and case-insensitivity 2004-02-17 4:12 tridge ` (3 preceding siblings ...) 2004-02-17 14:25 ` Dave Kleikamp @ 2004-02-18 0:16 ` Robert White 2004-02-18 0:20 ` Linus Torvalds 2004-02-18 2:48 ` tridge 4 siblings, 2 replies; 69+ messages in thread From: Robert White @ 2004-02-18 0:16 UTC (permalink / raw) To: tridge, linux-kernel Cc: 'Linus Torvalds', 'Kernel Mailing List', 'Al Viro', 'Neil Brown' OK, so I wrote the below, but then in the summary I realized that there was a significant factor that doesn't fit in with the rest of the post. Case insensitivity, and more generally locale equivalence rules, is a security nightmare. Consider the number of different file names that "su" could map to if you apply case insensitivity (4) and/or worse yet the various accents and umlauts (ü, etc.) that are sort-equivalent to "u" in some locales. The user types "su" and runs "S(u-umlaut)" etc. ==== In point of fact (ok in point of "technically abstract truth"), it is a "bad thing" that Windows (and seemingly only Windows these days) is case insensitive. It is sometimes said that windows is really an application and not an OS. If you ignore the occasionally snide *way* it is said you can find some technical truth to the matter. In point of fact the entire windows application space has a singular active locale at any one time and there is a well-defined but horrible layer of indirection where "long names" like "My Documents" become "real names" like "MYDOCU~1". Essentially every windows file name is subject to a double-indirect file name translation. The first pass is the strcasecmp() locale-dependent traversal of the "long name" list. The second is the strcasecmp() frozen-locale-spec-dependent traversal of (US Latin?) 8.3 file naming standard list of media elements (files/directories). In point of fact, Windows is *not* "properly" case insensitive at the file system level. Use "dir /x" more often on your windows box to relive the experience. 
The "real" file names are mangled to good old 8.3 uppercase internally(1). You don't usually have to think about this, but if you have ever lost the long-to-short file name mapping on a drive you know the hell that ensues. (see also iso9660.) So the application file naming interface wedge thingy (in windows) creates and maintains the mixed case names as an illusion. It just happens to be an illusion planted so deeply in the application space that it appears to be coming up from the "operating system level". OK, as time has moved on, some later versions of later file systems *may* (I honestly don't know) have modified the double-indirection model, but if they have, they must have done so in a guaranteed-to-look-the-same way. Either way it ends up being quite costly. Further, the model only really works because a DOS (and therefore windows) based program invariably and individually takes responsibility for doing all sorts of tasks like wildcard expansions (etc) in the application space (often "free" through comctl32.dll). [This tends to be foreign to Linux (UNIX) programmers where shells and such do the expansion.] The line is then blurred further by the subsequent steady creep of wildcarding and file selection back into common DLLs. (more comctl32.dll and friends.) The thing is, to match this ersatz "functionality" on a system where more than one locale may be used at the same time, you end up with a kind of Cartesian product of user locales and filesystem native locales. The cost could get extreme and can only really be amortized if Linux were to declare our own 8.3 style pronouncement for the character classes used for the "real" file name storage (etc). Late stage case insensitivity isn't that hard to put in a linux application, just crack open your file selection dialog boxes and have them use strcasecmp() in all their select/sort logic. Also then replace open() with CaseOpen() which does a find/search operation before daring to creat(). 
That is, in every practical way, how Windows handles these problems. It just happens in some fairly interesting and hard-to-predict places depending on context. It is easier, IMHO, to bring the users into the 20th century (let alone the 21st 8-) by making them mean what they say (if they deign to step out from behind their GUIs). So what was I saying... Oh yea... -- Single Locale storage standard required to prevent multiplicative cost. -- Not that hard to fake case insensitivity "when necessary". -- Cheaper in CPU/Space to mix case. -- Native file names in native locales simplifies administration and expectations. (not elaborated above, but true.) -- Case insensitivity and locale equivalence leads to uncertainties about what/which file may be intended in a given context, which could often lead to exploitable error. Rob. (1) The actual truth is a tad uglier than this, the media can have the 8.3 names stored in interesting ways, but essentially a "toupper()" is done on every file name as it is retrieved and processed. This cuts out a lot of possibilities and leads to a lot of "tildes of shame" in even some of the more harmless seeming name conflicts. ^ permalink raw reply [flat|nested] 69+ messages in thread
* RE: UTF-8 and case-insensitivity
  2004-02-18 0:16 ` Robert White
@ 2004-02-18 0:20 ` Linus Torvalds
  2004-02-18 1:03 ` Robert White
  2004-02-18 21:48 ` Ville Herva
  2004-02-18 2:48 ` tridge
  1 sibling, 2 replies; 69+ messages in thread
From: Linus Torvalds @ 2004-02-18 0:20 UTC
To: Robert White
Cc: tridge, 'Kernel Mailing List', 'Al Viro', 'Neil Brown'

On Tue, 17 Feb 2004, Robert White wrote:
>
> OK, so I wrote the below, but then in the summary I realized that there was
> a significant factor that doesn't fit in with the rest of the post. Case
> insensitivity, and more generally locale equivalence rules, is a security
> nightmare. Consider the number of different file names that "su" could map
> to if you apply case insensitivity (4) and/or worse yet the various accents
> and umlauts (ü, etc) that sort-equivalent for "u" in some locales. The user
> types "su" and runs "S(u-umlaut)" etc.

This is but one reason why I will _refuse_ to make case insensitivity magically start happening on regular "open()" etc calls.

You'd literally have to use a _different_ system call to do a case-insensitive file open.

Exactly because anything else would be very confusing to existing apps (and thus be potential security holes).

Linus
* RE: UTF-8 and case-insensitivity
  2004-02-18 0:20 ` Linus Torvalds
@ 2004-02-18 1:03 ` Robert White
  2004-02-18 21:48 ` Ville Herva
  1 sibling, 0 replies; 69+ messages in thread
From: Robert White @ 2004-02-18 1:03 UTC
To: tridge
Cc: 'Kernel Mailing List', 'Al Viro', 'Neil Brown', 'Linus Torvalds'

P.S.

Given that the GUI libraries (almost invariably) already deal with displaying things in a case insensitive way, the "best place to cut" to add case insensitivity to the user command-line experience would be adding a flag to file name completion in bash. Bash is already doing file name finds and lookups when you press tab; and the user is actively looking at the correctness and singularity/duality of the results.

So the proverbial "vi makef{tab}" would, if the flag was set, show you makefile, Makefile, and MakeFile (etc) as existent or just switch makef to "Makefile" if the name were unique.

It doesn't make lives easier for the API level project programmer people (c.f. samba), but it could uber-happy the incoming newbies, and people like me who have to interoperate within a vast wasteland of directories full of inconsistently named files created by windows programmers (like SOCKET.C, Socket.H, constants.h, and ss_switch.c all in one directory tree with hundreds of their friends. 8-)

I would however, be forced to throttle myself with my own intestine if the kernel started doing this magic mapping "for me", especially "in some calls/contexts but not in others".

(Not that I want to provide my possible death as a strong motivation for adding the feature. 8-)

Rob.
* Re: UTF-8 and case-insensitivity
  2004-02-18 0:20 ` Linus Torvalds
  2004-02-18 1:03 ` Robert White
@ 2004-02-18 21:48 ` Ville Herva
  1 sibling, 0 replies; 69+ messages in thread
From: Ville Herva @ 2004-02-18 21:48 UTC
To: Linus Torvalds
Cc: Robert White, tridge, 'Kernel Mailing List', 'Al Viro', 'Neil Brown'

On Tue, Feb 17, 2004 at 04:20:26PM -0800, you [Linus Torvalds] wrote:
>
> This is but one reason why I will _refuse_ to make case insensitivity
> magically start happening on regular "open()" etc calls.
>
> You'd literally have to use a _different_ system call to do a
> case-insensitive file open.

Tongue-in-cheek:

    int Open(const char *pathname, int flags);

?

-- v --

v@iki.fi
* RE: UTF-8 and case-insensitivity
  2004-02-18 0:16 ` Robert White
  2004-02-18 0:20 ` Linus Torvalds
@ 2004-02-18 2:48 ` tridge
  2004-02-18 20:56 ` Robert White
  1 sibling, 1 reply; 69+ messages in thread
From: tridge @ 2004-02-18 2:48 UTC
To: Robert White
Cc: linux-kernel, 'Linus Torvalds', 'Al Viro', 'Neil Brown'

Robert,

Just about everything in your posting is either years out of date or just totally wrong.

> OK, so I wrote the below, but then in the summary I realized that there was
> a significant factor that doesn't fit in with the rest of the post. Case
> insensitivity, and more generally locale equivalence rules, is a security
> nightmare. Consider the number of different file names that "su" could map
> to if you apply case insensitivity (4) and/or worse yet the various accents
> and umlauts (ü, etc) that sort-equivalent for "u" in some locales. The user
> types "su" and runs "S(u-umlaut)" etc.

This is no different from the "stupid admin puts . in $PATH" problem. Simple solutions:

 1) don't mount your root filesystem with case insensitive naming
 2) use a sane $PATH
 3) don't allow untrusted users to create files in your $PATH
 4) don't run bash in case insensitive mode if for some reason you can't do (1) or (2) or (3)

any of (1), (2) or (3) solves this.

> In point of fact the entire windows application space has a
> singular active locale at any one time and there is a well-defined
> but horrible layer of indirection where "long names" like "My
> Documents" become "real names" like "MYDOCU~1". Essentially every
> windows file name is subject to a double-indirect file name
> translation. The first pass is the strcasecmp() locale-dependent
> traversal of the "long name" list. The second is the strcasecmp()
> frozen-locale-spec-dependent traversal of (US Latin?) 8.3 file
> naming standard list of media elements (files/directories).

this is just total crap. That might have been true for msdos and even possibly win9x, but it's totally untrue for NTFS. There are enough stupidities in windows without having to invent more.

NTFS is case insensitive at the filesystem level. In fact, it's selectable whether it's case sensitive or case insensitive per-process (a process can switch between the two models). The case mapping table is built into the filesystem itself. That mapping has absolutely *zero* to do with US Latin or any other legacy multi-byte encoding.

What you have done is the equivalent of stating that Linux can only do 14 character filenames, because once upon a time Linux had a filesystem called minix. We've moved beyond that and so has windows.

> In point of fact, Windows is *not* "properly" case insensitive at the file
> system level. Use "dir /x" more often on your windows box to relive the
> experience. The "real" file names are mangled to good old 8.3 uppercase
> internally(1). You don't usually have to think about this, but if you have
> ever lost the long-to-short file name mapping on a drive you know the hell
> that ensues. (see also iso9660.)

again, this is just complete crap. NTFS has had the ability to completely disable 8.3 "alternative name" support for ages. Microsoft is even starting to use this switch in their published benchmark results, and I suspect it will become the default in a couple of years.

We've been through the same transition in Samba:

 - Samba 0.x only supported 8.3
 - Samba 1.x was oriented towards 8.3, but also supported long names
 - Samba 2.x and 3.x are oriented towards long names, and can disable 8.3 names to some extent

by the time Samba 4.x comes out (I am working on it now) we may see a significant number of sites disabling 8.3 completely.

> The thing is, to match this ersatz "functionality" on a system where more
> than one locale may be used at the same time, you end up with a kind of
> Cartesian product of user locales and filesystem native locales. The cost
> could get extreme and can only really be amortized if Linux were to declare
> our own 8.3 style pronouncement for the character classes used for the
> "real" file name storage (etc).

you are *way* out of date here. All recent windows apps use the UCS-2 interfaces which provide a single charset encoding across all locales. I've heard that they may be redefining this as UTF-16 to allow for an even larger range of characters, although I haven't seen this popping up on the wire yet (then again, I just might not have noticed).

I wish they had chosen UTF-8 instead of UCS-2, but at least they chose something and got it into every part of the OS years ago.

> Late stage case insensitivity isn't that hard to put in a linux application,
> just crack open your file selection dialog boxes and have them use
> strcasecmp() in all their select/sort logic. Also then replace open() with
> CaseOpen() which does a find/search operation before daring to
> creat().

Have you read *any* of what I've been saying about how expensive this is??

> That is, in every practical way, how Windows handles these problems. It
> just happens in some fairly interesting and hard-to-predict places depending
> on context.

No, that is *not* how current versions of windows do things.

> So what was I saying... Oh yea...
>
> -- Single Locale storage standard required to prevent multiplicative cost.

windows has this. Linux doesn't.

> -- Not that hard to fake case insensitivity "when necessary".

ditto

> -- Cheaper in CPU/Space to mix case.

ditto

> -- Native file names in native locales simplifies administration and
> expectations. (not elaborated above, but true.)

?? single locale storage makes this just a no-op

> -- Case insensitivity and locale equivalence leads to uncertainties about
> what/which file may be intended in a given context, which could often lead
> to exploitable error.

and that is just a complete load of crap. Windows has had exploitable bugs due to case insensitivity, but the cause was things like leaving directories in the search path writeable by unprivileged users. It was *not* due to anything fundamentally insecure about case-insensitive names in filesystems.

> (1) The actual truth is a tad uglier than this, the media can have the 8.3
> names stored in interesting ways, but essentially a "toupper()" is done on
> every file name as it is retrieved and processed. This cuts out a lot of
> possibilities and leads to a lot of "tildes of shame" in even some of the
> more harmless seeming name conflicts.

oh i get it, you're just a troll ....

Cheers, Tridge
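As a rough illustration of comparing names through a fixed case mapping table over 16-bit code units, rather than through a per-locale strcasecmp(), consider the sketch below. Everything in it is an assumption made for the example: the table contents (identity plus ASCII folding only), the function names, and the comparison order. It does not reproduce any real filesystem's table.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per 16-bit code unit. Illustrative only: a real table would
 * fold every cased character, not just ASCII. */
static uint16_t upcase[65536];

/* Fill the table: identity everywhere, ASCII a-z folded to A-Z. */
void upcase_init(void)
{
    for (uint32_t c = 0; c < 65536; c++)
        upcase[c] = (uint16_t)c;
    for (uint16_t c = 'a'; c <= 'z'; c++)
        upcase[c] = (uint16_t)(c - 'a' + 'A');
}

/* Compare two 16-bit names of the given lengths through the table.
 * Returns a negative value, 0, or a positive value, like memcmp(). */
int name_cmp_nocase(const uint16_t *a, size_t alen,
                    const uint16_t *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;

    for (size_t i = 0; i < n; i++) {
        uint16_t ca = upcase[a[i]];
        uint16_t cb = upcase[b[i]];
        if (ca != cb)
            return ca < cb ? -1 : 1;
    }
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;
}
```

With this table "su" and "SU" compare equal, while "s" followed by U+00FC does not match "su", because the table is locale-independent and only folds what it is told to fold. Whether accented forms are equivalent then depends on the table alone, not on the caller's locale.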
* RE: UTF-8 and case-insensitivity
  2004-02-18 2:48 ` tridge
@ 2004-02-18 20:56 ` Robert White
  0 siblings, 0 replies; 69+ messages in thread
From: Robert White @ 2004-02-18 20:56 UTC
To: tridge
Cc: linux-kernel, 'Linus Torvalds', 'Al Viro', 'Neil Brown'

I guess I don't get it...

tridge@samba.org [mailto:tridge@samba.org] said:
> NTFS is case insensitive at the filesystem level. In fact, its
> selectable whether its case sensitive or case insensitive per-process
> (a process can switch between the two models). The case mapping table
> is built into the filesystem itself. That mapping has absolutely
> *zero* to do with US Latin or any other legacy multi-byte encoding.

If the process selects whether it wants to be case insensitive or not how is NTFS case insensitive "at the file-system level"? Let me guess, they have two complete paths through the logic? Lots of DLLs? Redundant conflicting access semantics^Wfeatures?

> you are *way* out of date here. All recent windows apps use the UCS-2
> interfaces which provides a single charset encoding across all locales.

Which kind of directly supports where I said that to amortize the expense Linux would have to set up its *own* canon about all file systems using the same encoding. The fact that I kept bringing up 8.3 was out of date. Point to you. The point that picking an arbitrary encoding will lead to Linux getting out of date, or at least require a catastrophic realignment of every program that deigns to open() any file anywhere, remains germane.

> Have you read *any* of what I've been saying about how expensive this is??

Yes, I understand the expense. I have *paid* that expense in excruciating detail on several occasions. You want to have the kernel pay that expense (in place of the application) as a fixed (amortized) cost or you want to codify the file names with a standard encoding which would penalize the entire system uniformly by raising the base cost to localize.

I appreciate the unbounded regex-like expense of iteratively applying case/encoding insensitivity to a list of files. I really don't want to pay that cost in every application when I only need it at the front end. Sue me.

I also understand the pain of having to load any/each entire directory into memory one blasted dirent at a time, and appreciate that since the kernel is bulk loading them at the filesystem interface it seems (is) wasteful to have to spoon them across the kernel/user-space interface. I really do understand. (ASIDE: a bulk-fetch-directory-into-buffer call might be nice, I haven't looked lately, but I presume none such exists.)

Your proposed "single locale storage" would penalize all us embedded systems types with our space sensitive embedded file systems and low-powered CPUs so that the larger systems that _can_ afford to pay the cost only when necessary don't have to. Two-bytes for one in every file name isn't a good trade off when you are dealing with a 32k file system image.

I kind of tried (and apparently utterly failed) to make the points about how the Windows model worked and what it would cost by describing the basis for the model, not the current implementation. That is kind of why I *started* the message with "(ok in point of "technically abstract truth")" and mentioned later that what I was saying may have changed, but if so, it changed in a way consistent with the model as described.

Windows has been digging themselves steadily out of the deep hole of case-insensitive file name handling for years; which does nothing to entice me to jump in and join them. So bully for windows that they have, iteration after iteration, managed to reduce the cost of their mistake.

Even *with* a standardized file name character set/encoding case insensitivity would still be very bad-off in some important areas. Consider a simple security log. "[date] user command xx satisfied with executive Xx." etc. I can think of *lots* of times when I would have to open a file and then have to ask what the real name of the file I opened actually was. "I asked for 'Bob', what did I get?" isn't a fun question to have to answer *after* an open. Yes, all this *can* be addressed by scrubbing paths, but history suggests that this doesn't happen and the more the system does for you, the more likely you are to miss something.

At the application level, since I have to sort file names for a picklist anyway, I'd rather pay the case insensitivity cost while I was sorting. It's actually cleaner and I am already paying to sort.

I used to write SMB based applications (yes, I'm still way out of date) and I appreciate the painful tit-for-tat non-streaming ugliness. I feel your pain at having to read a whole directory and doing the sort/search. I understand the race condition that occurs between the directory read and the actual open where the file could be renamed or replaced. I really do. But "fixing" Linux so that it can share Windows' pain doesn't seem wise.

I can imagine a mod/module that would graft a localized and/or case-insensitive companion hash onto the dirent(s) as the central facility was doing its work. I can imagine an alternate open that traversed this alternate tree. Creating sort of a giant look-aside into the current file information tree. But I can't imagine any winning scenario that came from making that alternate hash the normal access method. Too many people and projects would suddenly break.

(And I try not to troll, but I apparently have a knack for getting people's dander in a bunch when I write. I think it is because I write as I speak, and the loss of tone and inflection in writing makes my turn-of-phrase come off very priggish. I'm not sure how to fix that. /sigh 8-)

Rob.
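Paying the case-insensitivity cost while sorting a picklist, as described in the message above, might look like this minimal sketch. The function names are hypothetical, and strcasecmp() only folds case according to the current locale's single-byte rules, so this says nothing about multi-byte encodings.

```c
#include <stdlib.h>
#include <strings.h>

/* qsort() comparator over an array of `const char *` elements. */
static int cmp_nocase(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;

    return strcasecmp(sa, sb);
}

/* Hypothetical helper: sort `n` file names in place, ignoring case.
 * Names that differ only in case end up adjacent, in unspecified order. */
void sort_picklist(const char **names, size_t n)
{
    qsort(names, n, sizeof(names[0]), cmp_nocase);
}
```

Because names differing only in case become neighbours after the sort, a front end can detect and display such collisions with a single pass over the sorted list, without the kernel being involved.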
Thread overview: 69+ messages
[not found] <1q4Si-658-5@gated-at.bofh.it>
[not found] ` <1q7no-8ss-7@gated-at.bofh.it>
[not found] ` <1qfb7-7s5-19@gated-at.bofh.it>
[not found] ` <1qmPm-6Gl-11@gated-at.bofh.it>
[not found] ` <1qpWI-1Sa-1@gated-at.bofh.it>
[not found] ` <1qqpO-2lx-3@gated-at.bofh.it>
[not found] ` <1qqzv-2tr-3@gated-at.bofh.it>
[not found] ` <1qqJc-2A2-5@gated-at.bofh.it>
[not found] ` <1qHAR-2Wm-49@gated-at.bofh.it>
[not found] ` <1qIwr-5GB-11@gated-at.bofh.it>
[not found] ` <1qIwr-5GB-9@gated-at.bofh.it>
[not found] ` <1qIQ1-5WR-27@gated-at.bofh.it>
[not found] ` <1qIZt-6b9-11@gated-at.bofh.it>
[not found] ` <1qJsF-6Be-45@gated-at.bofh.it>
2004-02-19 0:06 ` UTF-8 and case-insensitivity Pascal Schmidt
2004-02-19 1:01 ` tridge
2004-02-19 1:08 ` Hua Zhong
2004-02-19 1:46 ` tridge
2004-02-19 2:44 ` Theodore Ts'o
2004-02-19 3:20 ` tridge
2004-02-19 10:18 ` Helge Hafting
2004-02-19 12:11 ` Paulo Marques
2004-02-19 19:04 ` Helge Hafting
2004-02-19 14:08 ` Theodore Ts'o
2004-02-19 20:12 ` Robert White
[not found] <fa.epf5o9k.1rkudgo@ifi.uio.no>
[not found] ` <fa.idvvhjl.1jge92d@ifi.uio.no>
2004-02-18 1:09 ` Andy Lutomirski
2004-02-17 4:12 tridge
2004-02-17 5:11 ` Linus Torvalds
2004-02-17 6:54 ` tridge
2004-02-17 8:33 ` Neil Brown
2004-02-17 22:48 ` tridge
2004-02-18 0:06 ` Neil Brown
2004-02-18 9:47 ` Helge Hafting
2004-02-17 15:13 ` Linus Torvalds
2004-02-17 16:57 ` Linus Torvalds
2004-02-17 19:44 ` viro
2004-02-17 20:10 ` Linus Torvalds
2004-02-17 20:17 ` viro
2004-02-17 20:23 ` Linus Torvalds
2004-02-17 21:08 ` Robin Rosenberg
2004-02-17 21:17 ` Linus Torvalds
2004-02-17 22:27 ` Robin Rosenberg
2004-02-18 3:02 ` tridge
2004-02-17 23:57 ` tridge
2004-02-17 23:20 ` tridge
2004-02-17 23:43 ` Linus Torvalds
2004-02-18 3:26 ` tridge
2004-02-18 5:33 ` H. Peter Anvin
2004-02-18 7:54 ` Marc Lehmann
2004-02-18 2:37 ` H. Peter Anvin
2004-02-18 3:03 ` Linus Torvalds
2004-02-18 3:14 ` H. Peter Anvin
2004-02-18 3:27 ` Linus Torvalds
2004-02-18 21:31 ` tridge
2004-02-18 22:23 ` Linus Torvalds
2004-02-18 22:28 ` Linus Torvalds
2004-02-18 22:50 ` tridge
2004-02-18 22:59 ` Linus Torvalds
2004-02-18 23:09 ` tridge
2004-02-18 23:16 ` Linus Torvalds
2004-02-19 8:10 ` Jamie Lokier
2004-02-19 16:09 ` Linus Torvalds
2004-02-19 16:38 ` Jamie Lokier
2004-02-19 16:54 ` Linus Torvalds
2004-02-19 18:29 ` Jamie Lokier
2004-02-19 19:08 ` Helge Hafting
2004-02-18 4:08 ` tridge
2004-02-18 10:05 ` Robin Rosenberg
2004-02-18 11:43 ` tridge
2004-02-18 12:31 ` Robin Rosenberg
2004-02-18 16:48 ` H. Peter Anvin
2004-02-18 20:00 ` H. Peter Anvin
2004-02-19 2:53 ` Daniel Newby
2004-02-17 5:25 ` Tim Connors
2004-02-17 7:43 ` H. Peter Anvin
2004-02-17 8:05 ` H. Peter Anvin
2004-02-17 14:25 ` Dave Kleikamp
2004-02-18 0:16 ` Robert White
2004-02-18 0:20 ` Linus Torvalds
2004-02-18 1:03 ` Robert White
2004-02-18 21:48 ` Ville Herva
2004-02-18 2:48 ` tridge
2004-02-18 20:56 ` Robert White