* cp command, lookup intents and fs/namei.c horrors
From: Steve French @ 2006-12-01 1:57 UTC (permalink / raw)
To: linux-fsdevel
I have been looking at why the command:
cp source target
does very strange things when the target file is open and not writable
but the attributes (file size) are writable.
cp doesn't create a new file; if the target already exists, it simply
truncates and opens the target file as follows:
open("/mnt/ttargetfile", O_WRONLY|O_TRUNC|O_LARGEFILE)
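For illustration, the effect of that open() call can be reproduced from
user space. The sketch below is not cp's source. It assumes a local
temporary file (the name "targetfile" is made up for the example), uses
Python's os.open with the same flags minus O_LARGEFILE, and checks that
the existing inode is kept and emptied rather than replaced.

```python
import os
import tempfile

def truncating_open(path):
    """Open an existing file with O_WRONLY|O_TRUNC and no O_CREAT."""
    fd = os.open(path, os.O_WRONLY | os.O_TRUNC)
    os.close(fd)

def demo():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "targetfile")  # hypothetical file name
        with open(target, "w") as f:
            f.write("old contents")
        before = os.stat(target)
        truncating_open(target)
        after = os.stat(target)
        # Same inode number means the file was reused, not recreated.
        return before.st_ino == after.st_ino, before.st_size, after.st_size

same_inode, size_before, size_after = demo()
```

The truncation is a side effect of the open itself, which is why the
order in which the kernel performs the two steps matters.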
The problem with this behavior is that it is possible to have
permission to change a file's attributes (including its size) without
having permission to open it for write. If the server is Windows this
is even more common, because the target file could be opened "deny
write" by another client; even OpenOffice does this (at least on
Windows).
The result of all this is that open_namei (in fs/namei.c) can "fail"
with side effects: it first sets the file size to zero (deleting all of
the data in the target in the process!), then fails to open the file
for writing, and so returns an error.
The simplest case to reproduce this is:
1) open a file in OpenOffice on Windows (which does an open with "deny
   write")
2) mount the Windows share from Linux
3) try to cp a local Linux file over the remote file opened by
   OpenOffice
4) the copy will report failure, but the file will be set to zero size
Basically, what is happening in the open call in Linux is that namei.c
does something more complicated than seemed plausible:
  1) look up the target (without setting the lookup intent to open!)
  2) if the target exists and O_TRUNC is set, shrink the inode to zero
     size and set the attributes (notify attribute change)
  3) try to open the file for write (which fails when the target file
     is open with deny modes)
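The consequence of that ordering can be shown with a toy model. This is
only a sketch: ToyServer, DenyWriteError and both helper functions are
hypothetical names invented here, not kernel or CIFS code. The server
accepts a size change but refuses an open for write, as in the "deny
write" case described above.

```python
class DenyWriteError(Exception):
    """Raised when the toy server refuses an open for write."""

class ToyServer:
    """Stand-in for a server that allows a size change while another
    client holds the file open 'deny write'. Entirely hypothetical."""
    def __init__(self, data, deny_write=True):
        self.data = data
        self.deny_write = deny_write

    def set_size(self, size):
        self.data = self.data[:size]

    def open_for_write(self):
        if self.deny_write:
            raise DenyWriteError("sharing violation")

def truncate_then_open(server):
    """Order described above: setattr(size=0) first, open second."""
    server.set_size(0)
    server.open_for_write()

def open_then_truncate(server):
    """Alternative order: truncate only once the open has succeeded."""
    server.open_for_write()
    server.set_size(0)

def run(order):
    server = ToyServer(b"important data")
    try:
        order(server)
        failed = False
    except DenyWriteError:
        failed = True
    return failed, server.data

lost = run(truncate_then_open)   # open fails, data already gone
kept = run(open_then_truncate)   # open fails, data untouched
```

Both orders report the same error to the caller; only the first one
destroys the file contents on the way there.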
Since the lookup in namei.c does not set the lookup intent to
LOOKUP_OPEN or LOOKUP_CREATE, a filesystem can't try the open before
the truncate. This seems like a bug. And since the truncate (setattr)
does not pass the intent either, the filesystem can't know that it
should defer the truncate until after the open (or, better, simply pass
the flags to the server so that the server can do both as one
operation).
Ideally a filesystem should be able to hook open_namei, since the open
needs to be more atomic for network and cluster filesystems (it would
be much faster for them too).
Any ideas how to fix this?
1) Allow an fs to hook open_namei?
or
2) pass the proper lookup intent on the lookup called from open_namei
or
3) pass lookup intents on setattr (so a network fs can tell why the
   truncate is being done and combine operations)
or
4) find some clue to hint to the filesystem that it can defer the
   setattr until later or throw it away
or
...
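To make option 2 concrete, here is a rough sketch of what a filesystem
could do if the lookup carried the open flags. Everything in it is an
assumption made for illustration: ToyNetFS, ToyFile, SharingViolation
and the flag values are invented, and the "server" is just a dictionary.
The point is only that, once the intent is known at lookup time, access
can be checked before any data is touched and the whole thing can go
out as one request.

```python
from dataclasses import dataclass

O_WRONLY = 0x1   # illustrative flag values, not the kernel's
O_TRUNC = 0x2

class SharingViolation(Exception):
    """Raised when the toy server refuses a write open."""

@dataclass
class ToyFile:
    data: bytes
    deny_write: bool = False

class ToyNetFS:
    """Hypothetical network filesystem that honours an open intent."""
    def __init__(self, files):
        self.files = files
        self.requests = []   # what was sent to the "server"

    def lookup(self, name, open_flags=None):
        f = self.files[name]
        if open_flags is None:
            # No intent: plain lookup, the caller must open separately.
            self.requests.append("lookup")
            return f, False
        # Intent known: one combined request, access check comes first.
        self.requests.append("open+trunc" if open_flags & O_TRUNC else "open")
        if (open_flags & O_WRONLY) and f.deny_write:
            raise SharingViolation(name)
        if open_flags & O_TRUNC:
            f.data = b""
        return f, True

def open_with_intent(fs, name, flags):
    f, _opened = fs.lookup(name, open_flags=flags)
    return f

fs = ToyNetFS({"busy": ToyFile(b"report", deny_write=True),
               "free": ToyFile(b"draft")})
try:
    open_with_intent(fs, "busy", O_WRONLY | O_TRUNC)
    busy_failed = False
except SharingViolation:
    busy_failed = True
open_with_intent(fs, "free", O_WRONLY | O_TRUNC)
```

In this model the failed open leaves "busy" intact, and each open costs
one request instead of a lookup, a setattr and an open.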
* Re: cp command, lookup intents and fs/namei.c horrors
From: Trond Myklebust @ 2006-12-01 5:46 UTC (permalink / raw)
To: Steve French; +Cc: linux-fsdevel
On Thu, 2006-11-30 at 19:57 -0600, Steve French wrote:
> Any ideas how to fix this?
>
> 1) Allow an fs to hook open_namei?
Not a natural solution. You don't know which filesystem the path will
end up referring to until all path elements have been resolved. This is
why lookup intents were invented in the first place.
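As a rough illustration of that point, the sketch below walks a path
one component at a time against a made-up mount table. The resolve
function, the mount table and the filesystem names are assumptions for
the example, not kernel code. The filesystem that owns the final
component is only known once the walk has finished, so a per-filesystem
open hook could not be chosen up front.

```python
def resolve(path, mounts, root_fs="rootfs"):
    """Walk path component by component. Return the filesystem owning
    the final component and the list of filesystems crossed."""
    current = root_fs
    walked = ""
    crossed = [current]
    for part in path.strip("/").split("/"):
        walked += "/" + part
        if walked in mounts:        # stepping onto a mount point
            current = mounts[walked]
            crossed.append(current)
    return current, crossed

# Hypothetical mount table: a local fs on /mnt, a share below it.
mounts = {"/mnt": "ext3", "/mnt/share": "cifs"}

owner, crossed = resolve("/mnt/share/targetfile", mounts)
other_owner, _ = resolve("/mnt/other/targetfile", mounts)
```

Two paths that share their first component end up on different
filesystems, which an intent passed down with the lookup handles
naturally.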
> or
> 2) pass the proper lookup intent on the lookup called from open_namei
Definitely if this is not already the case! I'm confused as to which
code path is missing the lookup intents, though. AFAIK, this all works
for NFS.
> or
> 3) pass lookup intents on setattr (so a network fs can tell why the
> truncate is being done and combine operations)
Possibly. Note, though, that if the fs is able to do the truncate as
part of the lookup(), then the test for i_size==0 means that the setattr
will not truncate.
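A size test of that kind might look roughly like the following. This is
a sketch of the idea only, not the NFS source: the function name is
invented and the ATTR_SIZE value is used here purely as an example bit.
If the file is already at the requested size, the size bit is dropped
and no truncate needs to be sent.

```python
ATTR_SIZE = 0x8   # example bit value for "size is being changed"

def remaining_attrs(current_size, new_size, valid=ATTR_SIZE):
    """Return the attribute bits that still need to go to the server."""
    if (valid & ATTR_SIZE) and new_size == current_size:
        valid &= ~ATTR_SIZE   # size already matches, skip the truncate
    return valid

already_empty = remaining_attrs(current_size=0, new_size=0)
still_needed = remaining_attrs(current_size=4096, new_size=0)
```

So if the truncate has already happened as part of an intent-driven
open, the later setattr becomes a no-op for the size.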
> or
> 4) find some clue to hint the filesystem that it can defer the setattr
> until later or throw it away
> or
> ...
Cheers
Trond