* Re: file offset corruption on 32-bit machines?
From: Bodo Eggert @ 2008-04-11 12:24 UTC
To: Diego Calleja, Jiri Kosina, Jan Kara, Michal Hocko, Meelis Roos,
    Linux Kernel list <linux-kernel

Diego Calleja <diegocg@gmail.com> wrote:
> On Thu, 10 Apr 2008 16:31:09 +0200 (CEST), Jiri Kosina <jkosina@suse.cz>
> wrote:
>
>> I think this is worth fixing.
>
> This question comes up very often, and Linus even wrote a patch
> (http://lkml.org/lkml/2006/4/13/124 , http://lkml.org/lkml/2006/4/13/130)
>
> But apparently there's not much interest in fixing it, because it would
> slow down some workloads...

As far as I understand, the race is e.g.:

fpos := A:a, we want to make process/thread a read A:b or B:a without it
being a correct value in fpos. a!=b!=c, A!=B, A!=C.

a: read fpos.high (A:?)
b: write fpos     (B:b)
a: read fpos.low  (A:b)


If you change this to

a: read fpos.high
a: read fpos.low
a: read fpos.high
a: read fpos.low

and compare the results, you need to

a: read fpos.high (A:?)
b: write fpos     (B:b)
a: read fpos.low  (A:b)
b: write fpos     (A:c)
a: read fpos.high (A:b),(A:?)
b: write fpos     (C:b)
a: read fpos.low  (A:b),(A:b)

That would be winning three races in order to hit the bug.


OTOH, writers MUST NOT be interrupted, because:

b: write fpos.high (B:a)
a: read fpos.high  (B:?)
a: read fpos.low   (B:a)
a: read fpos.high  (B:a),(B:?)
a: read fpos.low   (B:a),(B:a)
b: write fpos.low  (B:b)

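The interleavings in the message above can be replayed deterministically in user space. What follows is only a single-threaded model, not kernel code: `struct split_pos`, `store_pos`, `torn_read` and `double_read_one_race` are hypothetical names, and the "racing" write is simply called in the middle of the reader. Under those assumptions it shows that one racing write yields the mixed value A:b, and that reading twice and comparing rejects that single-race case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a 64-bit offset kept as two 32-bit words. */
struct split_pos {
    uint32_t high;
    uint32_t low;
};

static uint64_t join(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/* Writer "b": stores both halves without being interrupted. */
static void store_pos(struct split_pos *p, uint64_t v)
{
    p->high = (uint32_t)(v >> 32);
    p->low = (uint32_t)v;
}

/* Reader "a", with one whole write landing between its two loads. */
static uint64_t torn_read(struct split_pos *p, uint64_t racing_value)
{
    uint32_t high = p->high;      /* a: read fpos.high */
    store_pos(p, racing_value);   /* b: write fpos     */
    uint32_t low = p->low;        /* a: read fpos.low  */
    return join(high, low);
}

/*
 * Reader "a" reading high/low twice and comparing, again with a single
 * racing write after the first load. Returns true and fills *out only
 * when both reads agree.
 */
static bool double_read_one_race(struct split_pos *p, uint64_t racing_value,
                                 uint64_t *out)
{
    uint32_t high1 = p->high;
    store_pos(p, racing_value);
    uint32_t low1 = p->low;
    uint32_t high2 = p->high;
    uint32_t low2 = p->low;

    if (join(high1, low1) != join(high2, low2))
        return false;             /* mismatch: a caller would retry */
    *out = join(high1, low1);
    return true;
}
```

Real concurrent code would also need atomic or volatile accesses and memory barriers; the model leaves those out on purpose to keep the interleaving visible.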
* Re: file offset corruption on 32-bit machines?
From: Lennart Sorensen @ 2008-04-11 13:55 UTC
To: Bodo Eggert
Cc: Diego Calleja, Jiri Kosina, Jan Kara, Michal Hocko, Meelis Roos,
    Linux Kernel list, linux-fsdevel

On Fri, Apr 11, 2008 at 02:24:34PM +0200, Bodo Eggert wrote:
> As far as I understand, the race is e.g.:
>
> fpos := A:a, we want to make process/thread a read A:b or B:a without it
> being a correct value in fpos. a!=b!=c, A!=B, A!=C.
>
> a: read fpos.high (A:?)
> b: write fpos     (B:b)
> a: read fpos.low  (A:b)
>
>
> If you change this to
>
> a: read fpos.high
> a: read fpos.low
> a: read fpos.high
> a: read fpos.low
>
> and compare the results, you need to
>
> a: read fpos.high (A:?)
> b: write fpos     (B:b)
> a: read fpos.low  (A:b)
> b: write fpos     (A:c)
> a: read fpos.high (A:b),(A:?)
> b: write fpos     (C:b)
> a: read fpos.low  (A:b),(A:b)
>
> That would be winning three races in order to hit the bug.
>
>
> OTOH, writers MUST NOT be interrupted, because:
>
> b: write fpos.high (B:a)
> a: read fpos.high  (B:?)
> a: read fpos.low   (B:a)
> a: read fpos.high  (B:a),(B:?)
> a: read fpos.low   (B:a),(B:a)
> b: write fpos.low  (B:b)

So if you write multithreaded code and don't understand what locking
around shared resources is for, then your application might break. Can
you give an example where locking is being used correctly where this can
possibly fail? The kernel can't prevent idiots from writing bad code
that breaks.

I just don't get this "problem".

--
Len Sorensen

* Re: file offset corruption on 32-bit machines?
From: Bryan Henderson @ 2008-04-11 16:59 UTC
To: Lennart Sorensen
Cc: Bodo Eggert, Diego Calleja, Jan Kara, Jiri Kosina, linux-fsdevel,
    Linux Kernel list, Michal Hocko, Meelis Roos

>So if you write multithreaded code and don't understand what locking
>around shared resources is for, then your application might break.

I think I know what locking around shared resources is for, which is why
I'm surprised the kernel doesn't do it.

Is it normal for a kernel resource not to be thread-safe (i.e. you don't
get advertised/sensible results if two threads access it at the same
time)?

>Can you give an example where locking is being used correctly where this can
>possibly fail?

I could accept (though I haven't thought about it) that there aren't any
real-world applications that do simultaneous reads and writes through the
same file pointer. I might even accept that there can be no useful
application that does. But can you say such an application is incorrect?

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems

* Re: file offset corruption on 32-bit machines?
From: Lennart Sorensen @ 2008-04-11 17:15 UTC
To: Bryan Henderson
Cc: Bodo Eggert, Diego Calleja, Jan Kara, Jiri Kosina, linux-fsdevel,
    Linux Kernel list, Michal Hocko, Meelis Roos

On Fri, Apr 11, 2008 at 09:59:45AM -0700, Bryan Henderson wrote:
> >So if you write multithreaded code and don't understand what locking
> >around shared resources is for, then your application might break.
>
> I think I know what locking around shared resources is for, which is why
> I'm surprised the kernel doesn't do it.
>
> Is it normal for a kernel resource not to be thread-safe (i.e. you don't
> get advertised/sensible results if two threads access it at the same
> time)?

If two threads are changing one filehandle at the same time, then the
program is broken. I can't see how the kernel making updates to 64bit
filehandles "atomic" helps. You could still seek in one thread, then
seek in another and then start the write in the first and get a wrong
result. Changes to a shared filehandle of any kind require locking to
work reliably, so additional slowdowns and locking in the kernel won't
fix anything.

> I could accept (though I haven't thought about it) that there aren't any
> real-world applications that do simultaneous reads and writes through the
> same file pointer. I might even accept that there can be no useful
> application that does. But can you say such an application is incorrect?

Unless the application has its own locking to ensure multiple threads
don't screw up each other's fileposition, it simply wouldn't work.

What is the difference between doing:

threadA: seek(positionA)
threadB: seek(positionB)
threadA: write
threadB: write

versus

threadA: seek(positionA) but only set half the 64bits
threadB: seek(positionB) set all 64bits
threadA: complete seek operation setting the other half of the bits
threadA: write
threadB: write

Either way you end up writing to the wrong file location, even though in
the first case the kernel made the setting of the fileposition atomic and
in the second case it didn't.

The application has to do:

threadA: lock access to filehandle
threadA: seek(positionA)
threadB: try to get lock and wait
threadA: write
threadA: unlock
threadB: get lock finally
threadB: seek(positionB)
threadB: write
threadB: unlock

Once the application does locking, it doesn't matter if the setting of
the fileposition is atomic or not, since no other thread can touch the
filehandle anyhow.

It doesn't matter if you read or write. If it's a shared filehandle you
have only one current position, so to share it you have to lock access
while doing a seek + read or write operation if you want predictable
results.

--
Len Sorensen

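The lock / seek / write / unlock sequence described above could look roughly like this in an application. This is a minimal sketch, assuming POSIX threads and a single process-wide mutex guarding one shared descriptor; `fd_lock` and `locked_write_at` are hypothetical names, not anything from the thread or the kernel.

```c
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical lock that every thread sharing the descriptor must take. */
static pthread_mutex_t fd_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Seek and write as one unit with respect to other callers of this
 * helper. Returns the result of write(), or -1 if the seek failed.
 */
static ssize_t locked_write_at(int fd, off_t pos, const void *buf, size_t len)
{
    ssize_t n = -1;

    pthread_mutex_lock(&fd_lock);            /* lock access to filehandle */
    if (lseek(fd, pos, SEEK_SET) != (off_t)-1)
        n = write(fd, buf, len);             /* position cannot move here */
    pthread_mutex_unlock(&fd_lock);          /* unlock */
    return n;
}
```

The lock only helps if every thread touching the descriptor goes through the same helper; a stray lseek() elsewhere in the program would reintroduce the ordering problem regardless of what the kernel does.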
* Re: file offset corruption on 32-bit machines?
From: Bryan Henderson @ 2008-04-11 21:29 UTC
To: Lennart Sorensen
Cc: Bodo Eggert, Diego Calleja, Jan Kara, Jiri Kosina, linux-fsdevel,
    Linux Kernel list, Michal Hocko, Meelis Roos

>What is the difference between doing:
>
>threadA: seek(positionA)
>threadB: seek(positionB)
>threadA: write
>threadB: write
>
>versus
>
>threadA: seek(positionA) but only set half the 64bits
>threadB: seek(positionB) set all 64bits
>threadA: complete seek operation setting the other half of the bits
>threadA: write
>threadB: write
>
>either way you end up writing to the wrong file location

Only if you make an assumption about what this program considers the right
location. One difference is that in the first case, data gets written
only at a place to which the program seeked, while in the second, it gets
written to a totally illogical place. Another is that in the first, the
data gets written as specified in standards and in the second, it doesn't.

I can imagine a program that would be satisfied with the first and not the
second, and for such a program, I cannot use the word "incorrect" or
"broken" or say the programmer doesn't understand shared resources.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems

* Re: file offset corruption on 32-bit machines?
From: Pavel Machek @ 2008-04-12 8:48 UTC
To: Lennart Sorensen
Cc: Bryan Henderson, Bodo Eggert, Diego Calleja, Jan Kara, Jiri Kosina,
    linux-fsdevel, Linux Kernel list, Michal Hocko, Meelis Roos

Hi!

> > >So if you write multithreaded code and don't understand what locking
> > >around shared resources is for, then your application might break.
> >
> > I think I know what locking around shared resources is for, which is why
> > I'm surprised the kernel doesn't do it.
> >
> > Is it normal for a kernel resource not to be thread-safe (i.e. you don't
> > get advertised/sensible results if two threads access it at the same
> > time)?
>
> If two threads are changing one filehandle at the same time, then the
> program is broken. I can't see how the kernel making updates to 64bit
> filehandles "atomic" helps. You could still seek in one thread, then
> seek in another and then start the write in the first and get a wrong
> result. Changes to a shared filehandle of any kind require locking to
> work reliably, so additional slowdowns and locking in the kernel won't
> fix anything.

Well, the app may be broken, or it may be trying to confuse you.

If you were stracing an app, and it seeked to 1GB and to 7GB, then did
read(), you'd certainly be very surprised if it read secret data at
3GB, right? And ptrace monitors do exist.

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: file offset corruption on 32-bit machines?
From: Jan Kara @ 2008-04-14 16:20 UTC
To: Lennart Sorensen
Cc: Bodo Eggert, Diego Calleja, Jiri Kosina, Jan Kara, Michal Hocko,
    Meelis Roos, Linux Kernel list, linux-fsdevel

On Fri 11-04-08 09:55:44, Lennart Sorensen wrote:
> On Fri, Apr 11, 2008 at 02:24:34PM +0200, Bodo Eggert wrote:
> > As far as I understand, the race is e.g.:
> >
> > fpos := A:a, we want to make process/thread a read A:b or B:a without it
> > being a correct value in fpos. a!=b!=c, A!=B, A!=C.
> >
> > a: read fpos.high (A:?)
> > b: write fpos     (B:b)
> > a: read fpos.low  (A:b)
> >
> >
> > If you change this to
> >
> > a: read fpos.high
> > a: read fpos.low
> > a: read fpos.high
> > a: read fpos.low
> >
> > and compare the results, you need to
> >
> > a: read fpos.high (A:?)
> > b: write fpos     (B:b)
> > a: read fpos.low  (A:b)
> > b: write fpos     (A:c)
> > a: read fpos.high (A:b),(A:?)
> > b: write fpos     (C:b)
> > a: read fpos.low  (A:b),(A:b)
> >
> > That would be winning three races in order to hit the bug.
> >
> >
> > OTOH, writers MUST NOT be interrupted, because:
> >
> > b: write fpos.high (B:a)
> > a: read fpos.high  (B:?)
> > a: read fpos.low   (B:a)
> > a: read fpos.high  (B:a),(B:?)
> > a: read fpos.low   (B:a),(B:a)
> > b: write fpos.low  (B:b)
>
> So if you write multithreaded code and don't understand what locking
> around shared resources is for, then your application might break. Can
> you give an example where locking is being used correctly where this can
> possibly fail? The kernel can't prevent idiots from writing bad code
> that breaks.
>
> I just don't get this "problem".

  Well, as Jiri Kosina wrote, this isn't a problem unless someone finds
a way to use this race for some attack (and, for example, making f_pos
negative compromises security, so it is not as far-fetched as it would
seem). So proactively fixing this makes some sense.

									Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: file offset corruption on 32-bit machines?
From: Lennart Sorensen @ 2008-04-14 16:22 UTC
To: Jan Kara
Cc: Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos,
    Linux Kernel list, linux-fsdevel

On Mon, Apr 14, 2008 at 06:20:31PM +0200, Jan Kara wrote:
>   Well, as Jiri Kosina wrote, this isn't a problem unless someone finds
> a way to use this race for some attack (and, for example, making f_pos
> negative compromises security, so it is not as far-fetched as it would
> seem). So proactively fixing this makes some sense.

But you would have to be part of that process to affect the filehandle,
wouldn't you? If you are part of the process already, wouldn't it be
easier to manipulate things directly rather than playing with the
filehandle position?

--
Len Sorensen

* Re: file offset corruption on 32-bit machines?
From: Jan Kara @ 2008-04-14 16:53 UTC
To: Lennart Sorensen
Cc: Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos,
    Linux Kernel list, linux-fsdevel

On Mon 14-04-08 12:22:02, Lennart Sorensen wrote:
> On Mon, Apr 14, 2008 at 06:20:31PM +0200, Jan Kara wrote:
> >   Well, as Jiri Kosina wrote, this isn't a problem unless someone finds
> > a way to use this race for some attack (and, for example, making f_pos
> > negative compromises security, so it is not as far-fetched as it would
> > seem). So proactively fixing this makes some sense.
>
> But you would have to be part of that process to affect the filehandle,
> wouldn't you? If you are part of the process already, wouldn't it be
> easier to manipulate things directly rather than playing with the
> filehandle position?

  Well, but imagine you have a file /proc/my_secret_file from which you
are able to read from position A:a and B:b but not from position
A:b. Conceivably, checks for the file position could be bypassed because of
this race... I know this is kind of a dumb example, but I can imagine someone
eventually finding something like this. So I guess one spin lock/unlock
pair is a price worth paying in a callpath which is quite long anyway.

									Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: file offset corruption on 32-bit machines?
From: Alan Cox @ 2008-04-14 16:54 UTC
To: Jan Kara
Cc: Lennart Sorensen, Bodo Eggert, Diego Calleja, Jiri Kosina,
    Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel

>   Well, but imagine you have a file /proc/my_secret_file from which you
> are able to read from position A:a and B:b but not from position
> A:b. Conceivably, checks for the file position could be bypassed because of
> this race... I know this is kind of a dumb example, but I can imagine someone

Unlikely, as the ppos passed to the driver is a private copy and the user
could equally use pread/pwrite to specify that offset.

* Re: file offset corruption on 32-bit machines?
From: Alexey Dobriyan @ 2008-04-14 18:34 UTC
To: Alan Cox
Cc: Jan Kara, Lennart Sorensen, Bodo Eggert, Diego Calleja, Jiri Kosina,
    Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel

On Mon, Apr 14, 2008 at 05:54:52PM +0100, Alan Cox wrote:
> >   Well, but imagine you have a file /proc/my_secret_file from which you
> > are able to read from position A:a and B:b but not from position
> > A:b. Conceivably, checks for the file position could be bypassed because of
> > this race... I know this is kind of a dumb example, but I can imagine someone
>
> Unlikely, as the ppos passed to the driver is a private copy and the user
> could equally use pread/pwrite to specify that offset.

pread is banned on proc files implemented via seq_files.

And in the non-seq_file case, there are MAX_NON_LFS checks, and that
limit fits into 32 bits.

* Re: file offset corruption on 32-bit machines?
From: Lennart Sorensen @ 2008-04-14 17:06 UTC
To: Jan Kara
Cc: Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos,
    Linux Kernel list, linux-fsdevel

On Mon, Apr 14, 2008 at 06:53:54PM +0200, Jan Kara wrote:
>   Well, but imagine you have a file /proc/my_secret_file from which you
> are able to read from position A:a and B:b but not from position
> A:b. Conceivably, checks for the file position could be bypassed because of
> this race... I know this is kind of a dumb example, but I can imagine someone
> eventually finding something like this. So I guess one spin lock/unlock
> pair is a price worth paying in a callpath which is quite long anyway.

But only two threads within the process can read from the filehandle, and
hence the process would be doing locking. An external attacker can't
break the internal locking of the process between the threads, and even
if you do open the file in /proc that the process is using, being an
external process you would have your own file handle and hence your own
file position, since you aren't part of that process.

--
Len Sorensen

* Re: file offset corruption on 32-bit machines?
From: Jan Kara @ 2008-04-14 19:03 UTC
To: Lennart Sorensen
Cc: Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos,
    Linux Kernel list, linux-fsdevel

On Mon 14-04-08 13:06:13, Lennart Sorensen wrote:
> On Mon, Apr 14, 2008 at 06:53:54PM +0200, Jan Kara wrote:
> >   Well, but imagine you have a file /proc/my_secret_file from which you
> > are able to read from position A:a and B:b but not from position
> > A:b. Conceivably, checks for the file position could be bypassed because of
> > this race... I know this is kind of a dumb example, but I can imagine someone
> > eventually finding something like this. So I guess one spin lock/unlock
> > pair is a price worth paying in a callpath which is quite long anyway.
>
> But only two threads within the process can read from the filehandle, and
> hence the process would be doing locking. An external attacker can't

  Why would it be doing locking? If some nasty user runs the process, he
*wants* his two threads to race as much as possible and trigger the race.
And then use the corrupted f_pos.

> break the internal locking of the process between the threads, and even
> if you do open the file in /proc that the process is using, being an
> external process you would have your own file handle and hence your own
> file position, since you aren't part of that process.

									Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: file offset corruption on 32-bit machines?
From: Lennart Sorensen @ 2008-04-14 19:29 UTC
To: Jan Kara
Cc: Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos,
    Linux Kernel list, linux-fsdevel

On Mon, Apr 14, 2008 at 09:03:09PM +0200, Jan Kara wrote:
>   Why would it be doing locking? If some nasty user runs the process, he
> *wants* his two threads to race as much as possible and trigger the race.
> And then use the corrupted f_pos.

Why would you want to? You can already set the filepointer explicitly
to any value you want if you have the filehandle.

If you had a file with some security checks for whether the user could
read from it implemented based on locations, then you would check it when
you read/write, not when you seek, since after all you could just keep
reading until you get to the desired position.

--
Len Sorensen

* Re: file offset corruption on 32-bit machines?
From: Jan Kara @ 2008-04-14 19:42 UTC
To: Lennart Sorensen
Cc: Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos,
    Linux Kernel list, linux-fsdevel

On Mon 14-04-08 15:29:28, Lennart Sorensen wrote:
> On Mon, Apr 14, 2008 at 09:03:09PM +0200, Jan Kara wrote:
> >   Why would it be doing locking? If some nasty user runs the process, he
> > *wants* his two threads to race as much as possible and trigger the race.
> > And then use the corrupted f_pos.
>
> Why would you want to? You can already set the filepointer explicitly
> to any value you want if you have the filehandle.
>
> If you had a file with some security checks for whether the user could
> read from it implemented based on locations, then you would check it when
> you read/write, not when you seek, since after all you could just keep
> reading until you get to the desired position.

  Yes and no - for example, if you manage to corrupt f_pos so that it
becomes negative, you have won, because it is checked only in seek, pread,
pwrite, but not in read or write, which rely on the check in seek...

									Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: file offset corruption on 32-bit machines?
From: Lennart Sorensen @ 2008-04-14 19:45 UTC
To: Jan Kara
Cc: Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos,
    Linux Kernel list, linux-fsdevel

On Mon, Apr 14, 2008 at 09:42:46PM +0200, Jan Kara wrote:
> > Why would you want to? You can already set the filepointer explicitly
> > to any value you want if you have the filehandle.
> >
> > If you had a file with some security checks for whether the user could
> > read from it implemented based on locations, then you would check it when
> > you read/write, not when you seek, since after all you could just keep
> > reading until you get to the desired position.
>   Yes and no - for example, if you manage to corrupt f_pos so that it
> becomes negative, you have won, because it is checked only in seek, pread,
> pwrite, but not in read or write, which rely on the check in seek...

The only file that could possibly implement any such silly security
based on position would be in /proc or /sys or similar, in which case
whatever driver implements it can check the position during any
read/write operation, and it would have to if it wants to implement such
a silly security system. Any sane system would put the secured data in a
separate file from the unsecured data, obviously.

Trying to read from a negative position on a normal file should clearly
fail, and if it doesn't, then that is a separate issue to fix and has
nothing to do with the file position being set atomically.

--
Len Sorensen

* Re: file offset corruption on 32-bit machines?
From: Pavel Machek @ 2008-04-15 8:57 UTC
To: Lennart Sorensen
Cc: Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko,
    Meelis Roos, Linux Kernel list, linux-fsdevel

> On Mon, Apr 14, 2008 at 09:03:09PM +0200, Jan Kara wrote:
> >   Why would it be doing locking? If some nasty user runs the process, he
> > *wants* his two threads to race as much as possible and trigger the race.
> > And then use the corrupted f_pos.
>
> Why would you want to? You can already set the filepointer explicitly
> to any value you want if you have the filehandle.
>
> If you had a file with some security checks for whether the user could
> read from it implemented based on locations, then you would check it when
> you read/write, not when you seek, since after all you could just keep
> reading until you get to the desired position.

Not if you tried to do the checking from a ptrace monitor.

And heck, yes, it is very confusing to see

	seek(somewhere)
	write()

on ptrace and the write going somewhere else.

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: file offset corruption on 32-bit machines?
From: Lennart Sorensen @ 2008-04-15 15:32 UTC
To: Pavel Machek
Cc: Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko,
    Meelis Roos, Linux Kernel list, linux-fsdevel

On Tue, Apr 15, 2008 at 10:57:41AM +0200, Pavel Machek wrote:
> Not if you tried to do the checking from a ptrace monitor.
>
> And heck, yes, it is very confusing to see
>
> 	seek(somewhere)
> 	write()
>
> on ptrace and the write going somewhere else.

Yes, bugs are confusing. An application can't do this on demand, so you
can't write code that relies on the effect between threads. So it would
only be a bug, not a bizarre feature (that wouldn't even work on 64bit
machines).

--
Len Sorensen

* Re: file offset corruption on 32-bit machines?
From: Pavel Machek @ 2008-04-15 17:34 UTC
To: Lennart Sorensen
Cc: Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko,
    Meelis Roos, Linux Kernel list, linux-fsdevel

Hi!

> > Not if you tried to do the checking from a ptrace monitor.
> >
> > And heck, yes, it is very confusing to see
> >
> > 	seek(somewhere)
> > 	write()
> >
> > on ptrace and the write going somewhere else.
>
> Yes, bugs are confusing. An application can't do this on demand, so you
> can't write code that relies on the effect between threads. So it would
> only be a bug, not a bizarre feature (that wouldn't even work on 64bit
> machines).

Yes, kernel bugs are confusing ;-).

The "application" could be malware trying to confuse a debugger, for
example.

The "application" could be something you are trying to debug.

I did a brief reading of the lseek man pages, and they do not mention
"kernel may seek to a random place if you attempt to seek from two
threads at the same time". So this is a kernel or manpages bug.

Maybe you can take a look at POSIX to see if it permits this behaviour?

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: file offset corruption on 32-bit machines?
From: Lennart Sorensen @ 2008-04-15 18:24 UTC
To: Pavel Machek
Cc: Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko,
    Meelis Roos, Linux Kernel list, linux-fsdevel

On Tue, Apr 15, 2008 at 07:34:30PM +0200, Pavel Machek wrote:
> Yes, kernel bugs are confusing ;-).

I only see an application bug so far.

> The "application" could be malware trying to confuse a debugger, for
> example.

If you can't do it on demand (which I can't see any way to do) then I
don't think malware can take advantage of it.

> The "application" could be something you are trying to debug.

True, but even without this behaviour, doing seeks and read/writes from
multiple threads without locking will already show plenty of problems
even if you somehow manage to hit this issue. Not only that, you have
to have threads writing to different 4GB aligned chunks of the file to
cause a problem, since otherwise they would all be setting the top bits
the same. I would hope anyone doing multithreaded work on a file that
big would like to avoid the locking issue by using pread and pwrite
instead, in which case there is no problem either.

> I did a brief reading of the lseek man pages, and they do not mention
> "kernel may seek to a random place if you attempt to seek from two
> threads at the same time". So this is a kernel or manpages bug.

Does it say anything about what happens if you try to seek from two
places at once?

> Maybe you can take a look at POSIX to see if it permits this behaviour?

The parts I can find for POSIX don't say one way or the other.

--
Len Sorensen

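The pread/pwrite route mentioned above avoids the shared file position altogether, because each call carries its own offset and does not change the descriptor's current position. A minimal sketch, assuming POSIX pwrite(); `write_all_at` is a hypothetical helper name, and on a 32-bit glibc build offsets beyond 2GB additionally need a 64-bit off_t (for example by compiling with -D_FILE_OFFSET_BITS=64).

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes at an explicit offset, looping over short writes.
 * The descriptor's own file position is neither used nor modified,
 * so threads calling this concurrently do not need a shared lock
 * just to protect the position.
 * Returns the number of bytes written, or -1 on error.
 */
static ssize_t write_all_at(int fd, const void *buf, size_t len, off_t pos)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = pwrite(fd, p + done, len - done, pos + (off_t)done);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                /* no progress: report what was written */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Each thread can then pass its own offset, e.g. `write_all_at(fd, buf, len, my_offset)`, without any seek on the shared descriptor. The sketch does not retry on EINTR; a caller that installs signal handlers may want to add that.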
* Re: file offset corruption on 32-bit machines?
From: Pavel Machek @ 2008-04-15 19:12 UTC
To: Lennart Sorensen
Cc: Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko,
    Meelis Roos, Linux Kernel list, linux-fsdevel

Hi!

> > Yes, kernel bugs are confusing ;-).
>
> I only see an application bug so far.

Really?

> The lseek() function repositions the offset of the open file associated
> with the file descriptor fildes to the argument offset according to the
> directive whence as follows:

It does not say "repositions the offset to the random number" nor
"under certain conditions repositions the offsets" nor "it repositions
the offset unless you are unlucky and hit kernel race". More
seriously, it does not contain note "not safe from multithreaded
programs" nor "multithreaded behaviour is undefined".

So this pretty clearly is application bug.

> > The "application" could be malware trying to confuse a debugger, for
> > example.
>
> If you can't do it on demand (which I can't see any way to do) then I
> don't think malware can take advantage of it.

Really? I see an application: detecting whether I'm being debugged. Try
to hit the race 1000 times; if you hit it, you are probably not being
debugged (because a debugger would be very likely to make that race hard
to hit). Will only work on multicores, but...

[Plus, there's "strace saw it writing to either offset A or offset B,
but I see the data at offset C, WTF?"]

> > The "application" could be something you are trying to debug.
>
> True, but even without this behaviour, doing seeks and read/writes from
> multiple threads without locking will already show plenty of problems
> even if you somehow manage to hit this issue. Not only that, you have
> to have threads writing to different 4GB aligned chunks of the file to
> cause a problem, since otherwise they would all be setting the top bits
> the same. I would hope anyone doing multithreaded work on a file that
> big would like to avoid the locking issue by using pread and pwrite
> instead, in which case there is no problem either.

I'm not saying this kernel bug is likely to hit in practice. It is
still a kernel bug.

Is the slowdown of lseek worth getting rid of this minor bug? Not
sure, probably yes.

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: file offset corruption on 32-bit machines? 2008-04-15 19:12 ` Pavel Machek @ 2008-04-15 19:49 ` Lennart Sorensen 2008-04-15 20:06 ` Pavel Machek 0 siblings, 1 reply; 53+ messages in thread From: Lennart Sorensen @ 2008-04-15 19:49 UTC (permalink / raw) To: Pavel Machek Cc: Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Tue, Apr 15, 2008 at 09:12:38PM +0200, Pavel Machek wrote: > It does not say "repositions the offset to the random number" nor > "under certain conditions repositions the offsets" nor "it repositions > the offset unless you are unlucky and hit kernel race". More > seriously, it does not contain note "not safe from multithreaded > programs" nor "multithreaded behaviour is undefined". And if you debug it on a 64bit system then it won't be able to do that. So not exactly a useful thing to try, and even trying 1000 times you are unlikely to hit it, so you can't know for sure unless you happen to be lucky and hit it. > So this pretty clearly is kernel bug. > Really? I see an application to detecting if I'm being debugged. Try > to hit the race 1000 times, if you hit it, you are probably not > debugged (because debugger would be very likely to make that race hard > to hit). Will only work on multicores, but... If lseek not being atomic breaks your application, then your application would be broken already. Any weird debug detection you might be able to do using the fact it isn't atomic could I suppose be considered a kernel bug if you think being able to do such detection is a bug. Nothing prevents the debugger from preloading an override to the access to lseek that uses its own locks to make the call atomic and hence prevent such use. So other than that, is there any case in which lseek being not atomic can cause an application to break if it wasn't already broken (due to having a race condition by trying to do 2 or more seeks on the same file handle at the same time)? 
If not, I think adding any kind of locking to seek in the kernel (which would I think have to cause a slight slow down) is a bad move. But hey that's just my opinion. :) I won't be upset either way. > [Plus, there's "strace seen it writing to either offset A or offset B, > but I see the data at offset C, WTF?] Most likely it would also be a program where you see it randomly seek to A and write or seek to A then B then write depending on how it happens to get scheduled when you run it. Already the program is clearly doing something unreliable. And C only happens to vary from B if A and B differ in the upper 32 bits of the file position. > I'm not saying this kernel bug is likely to hit in practice. It is > still a kernel bug. > > Is the slowdown of lseek worth getting rid of this minor bug? Not > sure, probably yes. I think a slow down is the worse choice. Adding a note to the documentation saying that "By the way, on 32bit systems the seek call is not atomic for 64bit file offsets, so if you happen to issue two at the same time to the same file pointer to offsets that differ in the upper 32bits, then the result of the seek might not be either of A or B but will contain the upper 32bits of either A or B and the lower 32bits of either A or B. You should of course use locking for your file access to ensure you know where your threads end up writing so this should be a non issue." -- Len Sorensen ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-15 19:49 ` Lennart Sorensen @ 2008-04-15 20:06 ` Pavel Machek 2008-04-15 20:28 ` Peter Zijlstra 2008-04-15 20:29 ` Lennart Sorensen 0 siblings, 2 replies; 53+ messages in thread From: Pavel Machek @ 2008-04-15 20:06 UTC (permalink / raw) To: Lennart Sorensen Cc: Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel Hi! > So other than that, is there any case in which lseek being not atomic > can cause an application to break if it wasn't already broken (due to > having a race condition by trying to do 2 or more seeks on the same file > handle at the same time)? If not, I think adding any kind of locking to > seek in the kernel (which would I think have to cause a slight slow > down) is a bad move. But hey that's just my opinion. :) I won't be > upset either way. Of course I can write an application that will be broken by this, and was not broken before. It will be slightly nasty code. Come on, you can do this too ;-). > > I'm not saying this kernel bug is likely to hit in practice. It is > > still a kernel bug. > > > > Is the slowdown of lseek worth getting rid of this minor bug? Not > > sure, probably yes. > > I think a slow down is the worse choice. Adding a note to the > documentation saying that "By the way, on 32bit systems the seek call is > not atomic for 64bit file offsets, so if you happen to issue two at That would be very wrong addition to documentation. If you really wanted to do something like this, you would probably want to say something like "Doing concurrent seeks on one file is undefined. Kernel may end up with seeking to some other place." Unfortunately, you'd have to get this addition into POSIX standard... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-15 20:06 ` Pavel Machek @ 2008-04-15 20:28 ` Peter Zijlstra 2008-04-16 8:15 ` Pavel Machek 2008-04-15 20:29 ` Lennart Sorensen 1 sibling, 1 reply; 53+ messages in thread From: Peter Zijlstra @ 2008-04-15 20:28 UTC (permalink / raw) To: Pavel Machek Cc: Lennart Sorensen, Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Tue, 2008-04-15 at 22:06 +0200, Pavel Machek wrote: > > > I'm not saying this kernel bug is likely to hit in practice. It is > > > still a kernel bug. > > > > > > Is the slowdown of lseek worth getting rid of this minor bug? Not > > > sure, probably yes. > > > > I think a slow down is the worse choice. Adding a note to the > > documentation saying that "By the way, on 32bit systems the seek call is > > not atomic for 64bit file offsets, so if you happen to issue two at > > That would be very wrong addition to documentation. If you really > wanted to do something like this, you would probably want to say > something like > > "Doing concurrent seeks on one file is undefined. Kernel may end up > with seeking to some other place." > > Unfortunately, you'd have to get this addition into POSIX standard... Is not treating the point not similar to undefined? And undefined semantics cover pretty much anything, including the current behaviour. FWIW I really think this issue is a non-issue; one cannot expect sane behaviour of unsynchronized usage of a shared resource. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-15 20:28 ` Peter Zijlstra @ 2008-04-16 8:15 ` Pavel Machek 2008-04-16 8:20 ` Peter Zijlstra ` (2 more replies) 0 siblings, 3 replies; 53+ messages in thread From: Pavel Machek @ 2008-04-16 8:15 UTC (permalink / raw) To: Peter Zijlstra Cc: Lennart Sorensen, Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Tue 2008-04-15 22:28:55, Peter Zijlstra wrote: > On Tue, 2008-04-15 at 22:06 +0200, Pavel Machek wrote: > > > > > I'm not saying this kernel bug is likely to hit in practice. It is > > > > still a kernel bug. > > > > > > > > Is the slowdown of lseek worth getting rid of this minor bug? Not > > > > sure, probably yes. > > > > > > I think a slow down is the worse choice. Adding a note to the > > > documentation saying that "By the way, on 32bit systems the seek call is > > > not atomic for 64bit file offsets, so if you happen to issue two at > > > > That would be very wrong addition to documentation. If you really > > wanted to do something like this, you would probably want to say > > something like > > > > "Doing concurrent seeks on one file is undefined. Kernel may end up > > with seeking to some other place." > > > > Unfortunately, you'd have to get this addition into POSIX standard... > > Is not treating the point not similar to undefined? And undefined > semantics cover pretty much anything, including the current behaviour. > > FWIW I really think this issue is a non-issue; one cannot expect sane > behaviour of unsynchronized usage of a shared resource. Why not? Kernel syscalls are traditionally atomic, and Lennard seems to have found sentence in POSIX that says so. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-16 8:15 ` Pavel Machek @ 2008-04-16 8:20 ` Peter Zijlstra 2008-04-16 10:54 ` Alan Cox 2008-04-16 13:57 ` Lennart Sorensen 2 siblings, 0 replies; 53+ messages in thread From: Peter Zijlstra @ 2008-04-16 8:20 UTC (permalink / raw) To: Pavel Machek Cc: Lennart Sorensen, Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Wed, 2008-04-16 at 10:15 +0200, Pavel Machek wrote: > On Tue 2008-04-15 22:28:55, Peter Zijlstra wrote: > > On Tue, 2008-04-15 at 22:06 +0200, Pavel Machek wrote: > > > > > > > I'm not saying this kernel bug is likely to hit in practice. It is > > > > > still a kernel bug. > > > > > > > > > > Is the slowdown of lseek worth getting rid of this minor bug? Not > > > > > sure, probably yes. > > > > > > > > I think a slow down is the worse choice. Adding a note to the > > > > documentation saying that "By the way, on 32bit systems the seek call is > > > > not atomic for 64bit file offsets, so if you happen to issue two at > > > > > > That would be very wrong addition to documentation. If you really > > > wanted to do something like this, you would probably want to say > > > something like > > > > > > "Doing concurrent seeks on one file is undefined. Kernel may end up > > > with seeking to some other place." > > > > > > Unfortunately, you'd have to get this addition into POSIX standard... > > > > Is not treating the point not similar to undefined? And undefined > > semantics cover pretty much anything, including the current behaviour. > > > > FWIW I really think this issue is a non-issue; one cannot expect sane > > behaviour of unsynchronized usage of a shared resource. > > Why not? Kernel syscalls are traditionally atomic, and Lennard seems > to have found sentence in POSIX that says so. Ah, ok missed that part. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-16 8:15 ` Pavel Machek 2008-04-16 8:20 ` Peter Zijlstra @ 2008-04-16 10:54 ` Alan Cox 2008-04-16 13:57 ` Lennart Sorensen 2 siblings, 0 replies; 53+ messages in thread From: Alan Cox @ 2008-04-16 10:54 UTC (permalink / raw) To: Pavel Machek Cc: Peter Zijlstra, Lennart Sorensen, Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel > Why not? Kernel syscalls are traditionally atomic, and Lennard seems > to have found sentence in POSIX that says so. Almost no call is atomic or has atomicity guarantees. There are specific rules for certain disk access and pipe queueing but almost nothing else. The same is as true (often more true) for all Unix systems. Alan ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-16 8:15 ` Pavel Machek 2008-04-16 8:20 ` Peter Zijlstra 2008-04-16 10:54 ` Alan Cox @ 2008-04-16 13:57 ` Lennart Sorensen 2 siblings, 0 replies; 53+ messages in thread From: Lennart Sorensen @ 2008-04-16 13:57 UTC (permalink / raw) To: Pavel Machek Cc: Peter Zijlstra, Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Wed, Apr 16, 2008 at 10:15:23AM +0200, Pavel Machek wrote: > Why not? Kernel syscalls are traditionally atomic, and Lennard seems > to have found sentence in POSIX that says so. Well it didn't say atomic, but it did say "thread safe" which I suppose comes down to about the same thing. -- Len Sorensen ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-15 20:06 ` Pavel Machek 2008-04-15 20:28 ` Peter Zijlstra @ 2008-04-15 20:29 ` Lennart Sorensen 2008-04-15 22:11 ` Bryan Henderson 1 sibling, 1 reply; 53+ messages in thread From: Lennart Sorensen @ 2008-04-15 20:29 UTC (permalink / raw) To: Pavel Machek Cc: Jan Kara, Bodo Eggert, Diego Calleja, Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Tue, Apr 15, 2008 at 10:06:47PM +0200, Pavel Machek wrote: > Of course I can write an application that will be broken by this, and > was not broken before. It will be slightly nasty code. Come on, you > can do this too ;-). Well it would take seriously hard work to make a program that would work correctly if it was atomic and would break if it isn't. Certainly a normal program that just tries to seek and read/write should never have any issue. > That would be very wrong addition to documentation. If you really > wanted to do something like this, you would probably want to say > something like > > "Doing concurrent seeks on one file is undefined. Kernel may end up > with seeking to some other place." Well perhaps that is a lot simpler. > Unfortunately, you'd have to get this addition into POSIX standard... Well I do see something in a PDF on POSIX I found that says all POSIX functions (at least in POSIX.1 which I think might be an old name for it) are thread safe unless stated otherwise, so since lseek doesn't state otherwise I suppose it better be completely thread safe in all cases. It seems a bit stupid given any program that wants to work reliably has to do its own locking already, so why waste time on it in the kernel. Any way the kernel could know how many copies of the filehandle exist (yeah right, of course not) to ensure that it only has to lock if there are multiple accesses going on? Darn. Those stupid standards documents. :) -- Len Sorensen ^ permalink raw reply [flat|nested] 53+ messages in thread
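The application-side locking referred to above could look something like the sketch below. It is an assumption about what "its own locking" might mean in practice, not code from any poster: struct locked_fd, locked_seek_write() and locked_seek_write_demo() are hypothetical names, and the scheme only helps if every thread touching the descriptor goes through the same wrapper.

```c
#define _XOPEN_SOURCE 700
#define _FILE_OFFSET_BITS 64 /* 64-bit off_t even on 32-bit builds */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: one mutex per shared descriptor. */
struct locked_fd {
    int fd;
    pthread_mutex_t lock;
};

/* Seek and write as one unit, so no other thread using this wrapper can
 * move the shared offset between the two calls. Returns the result of
 * write(), or -1 if the seek failed. */
ssize_t locked_seek_write(struct locked_fd *f, off_t off,
                          const void *buf, size_t len)
{
    ssize_t n = -1;

    assert(f != NULL);
    pthread_mutex_lock(&f->lock);
    if (lseek(f->fd, off, SEEK_SET) != (off_t)-1)
        n = write(f->fd, buf, len);
    pthread_mutex_unlock(&f->lock);
    return n;
}

/* Hypothetical self-check: write 3 bytes at offset 100 and confirm the
 * shared offset ends up at 103. Returns 0 on success. */
int locked_seek_write_demo(void)
{
    struct locked_fd f;
    FILE *tmp = tmpfile();
    int ok;

    if (tmp == NULL)
        return -1;
    f.fd = fileno(tmp);
    if (pthread_mutex_init(&f.lock, NULL) != 0) {
        fclose(tmp);
        return -1;
    }
    ok = locked_seek_write(&f, 100, "abc", 3) == 3
      && lseek(f.fd, 0, SEEK_CUR) == 103;
    pthread_mutex_destroy(&f.lock);
    fclose(tmp);
    return ok ? 0 : -1;
}
```

With the seek and the write under one lock, the program never issues two concurrent updates of the same file offset, so the torn update cannot occur regardless of how the kernel stores f_pos.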
* Re: file offset corruption on 32-bit machines? 2008-04-15 20:29 ` Lennart Sorensen @ 2008-04-15 22:11 ` Bryan Henderson 2008-04-16 9:40 ` Jamie Lokier 0 siblings, 1 reply; 53+ messages in thread From: Bryan Henderson @ 2008-04-15 22:11 UTC (permalink / raw) To: Lennart Sorensen Cc: Bodo Eggert, Diego Calleja, Jan Kara, Jiri Kosina, linux-fsdevel, Linux Kernel list, Michal Hocko, Meelis Roos, Pavel Machek > Well it would take seriously hard work to make a program that would work > correctly if it was atomic and would break if it isn't. Certainly a > normal program that just tries to seek and read/write should never have > any issue. I can easily imagine such a program. I think you aren't exercising enough imagination about the kinds of requirements a program might be implementing. That lack of imagination (in all of us) is the reason we shouldn't tolerate something working not as designed or not as expected just because we went through every possible use scenario and it didn't matter in any of them. Just focus on the layer in question. The easiest way to imagine a program not doing locking and being useful anyway (as long as the kernel is thread-safe) is to use the same arguments you use for the kernel doing it: there's a higher level user responsible for locking. The code in question doesn't guarantee that user writes all its stuff to the right place, but at least it guarantees that that user's lack of locking doesn't screw some other user of the file. It does that by ensuring it never seeks to a place the user doesn't own and that no two separate users ever access the file at the same time. I'd even like to accomodate the poor user trying to debug the broken locking in his application. He sees the file getting corrupted and immediately thinks, "what if my thread serialization isn't working right?" But he notices that the corruption isn't consistent with that hypothesis. 
He knows he was working with only the beginning and the end of the file and the corruption happened in the middle. So he wastes a week considering other hypotheses, including a kernel bug, until someone points out a paragraph in the lseek() man page that says contrary to all Unix convention, that particular function and system call is not thread-safe, and it doesn't necessarily seek to the place mentioned in its argument. -- Bryan Henderson IBM Almaden Research Center San Jose CA Filesystems ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-15 22:11 ` Bryan Henderson @ 2008-04-16 9:40 ` Jamie Lokier 0 siblings, 0 replies; 53+ messages in thread From: Jamie Lokier @ 2008-04-16 9:40 UTC (permalink / raw) To: Bryan Henderson Cc: Lennart Sorensen, Bodo Eggert, Diego Calleja, Jan Kara, Jiri Kosina, linux-fsdevel, Linux Kernel list, Michal Hocko, Meelis Roos, Pavel Machek Bryan Henderson wrote: > The easiest way to imagine a program not doing locking and being useful > anyway (as long as the kernel is thread-safe) is to use the same arguments > you use for the kernel doing it: there's a higher level user responsible > for locking. The code in question doesn't guarantee that user writes all > its stuff to the right place, but at least it guarantees that that user's > lack of locking doesn't screw some other user of the file. It does that > by ensuring it never seeks to a place the user doesn't own and that no two > separate users ever access the file at the same time. > > I'd even like to accomodate the poor user trying to debug the broken > locking in his application. He sees the file getting corrupted and > immediately thinks, "what if my thread serialization isn't working right?" > But he notices that the corruption isn't consistent with that hypothesis. > He knows he was working with only the beginning and the end of the file > and the corruption happened in the middle. So he wastes a week > considering other hypotheses, including a kernel bug, until someone points > out a paragraph in the lseek() man page that says contrary to all Unix > convention, that particular function and system call is not thread-safe, > and it doesn't necessarily seek to the place mentioned in its argument. I think that argument is the strongest yet. Wasted debugging time due to totally surprising and hardly justifiable kernel behaviour. Strace / GDB on the application shows a trace which doesn't relate at all to the unexpected file changes. 
There is also POSIX specification: http://www.opengroup.org/onlinepubs/000095399/functions/xsh_chap02_09.html "All functions defined by this volume of IEEE Std 1003.1-2001 shall be thread-safe, except that the following functions need not be thread-safe." [List which does not include lseek(), therefore lseek() shall be thread-safe. Same for read() and write().] Docs for HP-UX and AIX say the same as POSIX about thread-safety. -- Jamie ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <Pine.SOC.4.64.0804081101430.28938@math.ut.ee>]
* Re: file offset corruption on 32-bit machines? [not found] <Pine.SOC.4.64.0804081101430.28938@math.ut.ee> @ 2008-04-10 13:55 ` Michal Hocko 2008-04-10 14:01 ` Jiri Kosina ` (2 more replies) 0 siblings, 3 replies; 53+ messages in thread From: Michal Hocko @ 2008-04-10 13:55 UTC (permalink / raw) To: Meelis Roos; +Cc: Linux Kernel list, linux-fsdevel [Adding fsdevel list] On Tuesday 08 April 2008 10:05:47 am Meelis Roos wrote: > Jeff Robertson analyzes the behaviour of different operating systems' > 64-bit file offset implementation and concludes that on 32-bit > machines, Linux and Solaris lack any locking to keep the two 32-bit > halves in sync and this could cause rare file offset corruption. > > http://jeffr-tech.livejournal.com/21014.html AFAICS, this race is theoretically possible, but it is very hard (almost impossible) to trigger with a sane file usage pattern. Note that you have to access a shared struct file (same file descriptor) in different threads, which should be synchronized by the caller anyway (*). I also don't see any security implications from this race, but maybe someone with more knowledge about fs can see some (f_pos is used in many places in the kernel code). I think that it is better to live with tiny-race-on-broken-patterns rather than paying for synchronization which is not needed for correct paths. [*] file_pos_{read,write} (fs/read_write.c) are not called under lock (in sys_read, sys_write, ...), so even if f_pos is written atomically, you will be able to get races when accessing a shared descriptor from different threads. I think that POSIX states that behavior is undefined under these conditions. Best regards -- Michal Hocko SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 13:55 ` Michal Hocko @ 2008-04-10 14:01 ` Jiri Kosina 2008-04-10 14:27 ` Jan Kara 2008-04-10 14:31 ` Michal Hocko 2008-04-10 14:11 ` Martin Mares 2008-04-10 15:33 ` Andi Kleen 2 siblings, 2 replies; 53+ messages in thread From: Jiri Kosina @ 2008-04-10 14:01 UTC (permalink / raw) To: Michal Hocko; +Cc: Meelis Roos, Linux Kernel list, linux-fsdevel On Thu, 10 Apr 2008, Michal Hocko wrote: > > Jeff Robertson analyzes the behaviour of different operating systems' > > 64-bit file offset implementation and concludes that on 32-bit > > machines, Linux and Solaris lack any locking to keep the two 32-bit > > halves in sync and this could cause rare file offset corruption. > > http://jeffr-tech.livejournal.com/21014.html > AFAICS, this race is theoretically possible, but it is very hard (almost > impossible) to trigger with a sane file usage pattern. Note that you > have to access shared struct file (same file descriptor) in different > threads which should be synchronized by caller anyway (*). ... but not in cases the caller is an intentionally evil code, right? :) > I also don't see any security implications from this race, but maybe > someone with more knowlage about fs can see (f_pos is used at many > places in the kernel code). The f_pos races are in fact exploitable, we've already been there. See for example http://www.isec.pl/vulnerabilities/isec-0016-procleaks.txt -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:01 ` Jiri Kosina @ 2008-04-10 14:27 ` Jan Kara 2008-04-10 14:31 ` Jiri Kosina 2008-04-11 19:26 ` Pavel Machek 2008-04-10 14:31 ` Michal Hocko 1 sibling, 2 replies; 53+ messages in thread From: Jan Kara @ 2008-04-10 14:27 UTC (permalink / raw) To: Jiri Kosina; +Cc: Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel > On Thu, 10 Apr 2008, Michal Hocko wrote: > > > > Jeff Robertson analyzes the behaviour of different operating systems' > > > 64-bit file offset implementation and concludes that on 32-bit > > > machines, Linux and Solaris lack any locking to keep the two 32-bit > > > halves in sync and this could cause rare file offset corruption. > > > http://jeffr-tech.livejournal.com/21014.html > > AFAICS, this race is theoretically possible, but it is very hard (almost > > impossible) to trigger with a sane file usage pattern. Note that you > > have to access shared struct file (same file descriptor) in different > > threads which should be synchronized by caller anyway (*). > > ... but not in cases the caller is an intentionally evil code, right? :) Yes. > > I also don't see any security implications from this race, but maybe > > someone with more knowlage about fs can see (f_pos is used at many > > places in the kernel code). > > The f_pos races are in fact exploitable, we've already been there. See > for example http://www.isec.pl/vulnerabilities/isec-0016-procleaks.txt Well, this race is more subtle - the window is just one instruction wide (stores to f_pos from CPU2 must come between the store of lower and upper 32-bits of f_pos on CPU1). And the only result is that f_pos has 32-bits from one file pointer and 32-bits from the other one. So I can hardly imagine this would be exploitable... Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs ^ permalink raw reply [flat|nested] 53+ messages in thread
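The outcome described above (32 bits from one file pointer and 32 bits from the other) can be written down as plain arithmetic. The sketch below only models the resulting value; it does not reproduce the race. torn_offset() and torn_offset_example() are hypothetical names, and the two offsets are made-up examples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the torn result: a 64-bit offset assembled from
 * the upper half of one store and the lower half of another. */
uint64_t torn_offset(uint64_t upper_from, uint64_t lower_from)
{
    return (upper_from & 0xFFFFFFFF00000000ULL)
         | (lower_from & 0x00000000FFFFFFFFULL);
}

/* Worked example with A = 5 GiB + 16 and B = 1 GiB + 32. Both mixed
 * values differ from A and from B, because the two offsets differ in
 * their upper and in their lower 32 bits. */
void torn_offset_example(void)
{
    const uint64_t a = 0x0000000140000010ULL;
    const uint64_t b = 0x0000000040000020ULL;

    assert(torn_offset(a, b) == 0x0000000140000020ULL);
    assert(torn_offset(b, a) == 0x0000000040000010ULL);
}
```

If the two offsets share the same upper 32 bits (or the same lower 32 bits), the mixed value is equal to one of the two requested offsets, so the tearing is invisible.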
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:27 ` Jan Kara @ 2008-04-10 14:31 ` Jiri Kosina 2008-04-10 14:48 ` Matthew Wilcox ` (2 more replies) 2008-04-11 19:26 ` Pavel Machek 1 sibling, 3 replies; 53+ messages in thread From: Jiri Kosina @ 2008-04-10 14:31 UTC (permalink / raw) To: Jan Kara; +Cc: Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Thu, 10 Apr 2008, Jan Kara wrote: > > The f_pos races are in fact exploitable, we've already been there. See > > for example http://www.isec.pl/vulnerabilities/isec-0016-procleaks.txt > Well, this race is more subtle - the window is just one instruction > wide (stores to f_pos from CPU2 must come between the store of lower and > upper 32-bits of f_pos on CPU1). And the only result is that f_pos has > 32-bits from one file pointer and 32-bits from the other one. So I can > hardly imagine this would be exploitable... Supposing you are not holding any spinlock and are running with preemptible kernel (pretty common scenario nowadays), there is nothing that would prevent kernel from rescheduling between the two instructions, enlarging the race window to be more comfortable for attacker, right? I think this is worth fixing. -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:31 ` Jiri Kosina @ 2008-04-10 14:48 ` Matthew Wilcox 2008-04-10 15:22 ` Jan Kara 2008-04-10 15:19 ` Jan Kara 2008-04-10 16:03 ` Diego Calleja 2 siblings, 1 reply; 53+ messages in thread From: Matthew Wilcox @ 2008-04-10 14:48 UTC (permalink / raw) To: Jiri Kosina Cc: Jan Kara, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Thu, Apr 10, 2008 at 04:31:09PM +0200, Jiri Kosina wrote: > > Well, this race is more subtle - the window is just one instruction > > wide (stores to f_pos from CPU2 must come between the store of lower and > > upper 32-bits of f_pos on CPU1). And the only result is that f_pos has > > 32-bits from one file pointer and 32-bits from the other one. So I can > > hardly imagine this would be exploitable... > > Supposing you are not holding any spinlock and are running with > preemptible kernel (pretty common scenario nowadays), there is nothing > that would prevent kernel from rescheduling between the two instructions, > enlarging the race window to be more comfortable for attacker, right? > > I think this is worth fixing. Seems a lot like reading jiffies to me. Is the seqlock the right solution to use for fixing this? -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:48 ` Matthew Wilcox @ 2008-04-10 15:22 ` Jan Kara 2008-04-10 15:30 ` Matthew Wilcox 0 siblings, 1 reply; 53+ messages in thread From: Jan Kara @ 2008-04-10 15:22 UTC (permalink / raw) To: Matthew Wilcox Cc: Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel > On Thu, Apr 10, 2008 at 04:31:09PM +0200, Jiri Kosina wrote: > > > Well, this race is more subtle - the window is just one instruction > > > wide (stores to f_pos from CPU2 must come between the store of lower and > > > upper 32-bits of f_pos on CPU1). And the only result is that f_pos has > > > 32-bits from one file pointer and 32-bits from the other one. So I can > > > hardly imagine this would be exploitable... > > > > Supposing you are not holding any spinlock and are running with > > preemptible kernel (pretty common scenario nowadays), there is nothing > > that would prevent kernel from rescheduling between the two instructions, > > enlarging the race window to be more comfortable for attacker, right? > > > > I think this is worth fixing. > > Seems a lot like reading jiffies to me. Is the seqlock the right > solution to use for fixing this? You can get your inspiration in the implementation of i_size_read() and i_size_write() functions :). They deal with exactly the same problem. But in the case of f_pos, the number of readers and writers is balanced so maybe a spinlock would be fine as well... Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 15:22 ` Jan Kara @ 2008-04-10 15:30 ` Matthew Wilcox 0 siblings, 0 replies; 53+ messages in thread From: Matthew Wilcox @ 2008-04-10 15:30 UTC (permalink / raw) To: Jan Kara Cc: Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Thu, Apr 10, 2008 at 05:22:12PM +0200, Jan Kara wrote: > You can get your inspiration in the implementation of i_size_read() > and i_size_write() functions :). They deal with exactly the same problem. > But in the case of f_pos, the number of readers and writers is balanced so > maybe a spinlock would be fine as well... It's not quite balanced -- see sys_getdents() for a counterexample. i_size_read/write use a seqcount rather than a seqlock, but the principle is the same. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 53+ messages in thread
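The sequence-counter idea mentioned above can be modelled in userspace roughly as follows. This is a sketch with C11 atomics, not the kernel's seqcount/seqlock API and not the actual i_size_read()/i_size_write() code: struct pos64 and the pos64_* functions are hypothetical, the default sequentially consistent ordering is used for simplicity where the kernel would use explicit barriers, and counter wraparound is ignored. Since f_pos can have concurrent writers, the writer side here also claims the counter before storing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a 64-bit offset kept as two 32-bit
 * halves and guarded by a sequence counter. */
struct pos64 {
    atomic_uint seq;       /* even: stable; odd: a write is in progress */
    _Atomic uint32_t lo;
    _Atomic uint32_t hi;
};

void pos64_init(struct pos64 *p, uint64_t v)
{
    assert(p != NULL);
    atomic_init(&p->seq, 0u);
    atomic_init(&p->lo, (uint32_t)v);
    atomic_init(&p->hi, (uint32_t)(v >> 32));
}

void pos64_write(struct pos64 *p, uint64_t v)
{
    unsigned s;

    /* Claim the writer side by moving seq from even to odd; other
     * writers spin until the counter is even again. */
    for (;;) {
        s = atomic_load(&p->seq);
        if ((s & 1u) == 0 &&
            atomic_compare_exchange_weak(&p->seq, &s, s + 1u))
            break;
    }
    atomic_store(&p->lo, (uint32_t)v);
    atomic_store(&p->hi, (uint32_t)(v >> 32));
    atomic_store(&p->seq, s + 2u);   /* even again: value is stable */
}

uint64_t pos64_read(struct pos64 *p)
{
    unsigned s1, s2;
    uint32_t lo, hi;

    /* Retry if a write was in progress or completed while the two
     * halves were being read. */
    do {
        s1 = atomic_load(&p->seq);
        lo = atomic_load(&p->lo);
        hi = atomic_load(&p->hi);
        s2 = atomic_load(&p->seq);
    } while ((s1 & 1u) != 0 || s1 != s2);
    return ((uint64_t)hi << 32) | lo;
}
```

Readers never block writers in this scheme; they only retry, which is the property that makes a sequence counter attractive when reads dominate. As noted above, f_pos does not have that read-mostly profile everywhere, which is part of why a plain spinlock was also on the table.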
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:31 ` Jiri Kosina 2008-04-10 14:48 ` Matthew Wilcox @ 2008-04-10 15:19 ` Jan Kara 2008-04-10 15:37 ` Michal Hocko 2008-04-10 16:03 ` Diego Calleja 2 siblings, 1 reply; 53+ messages in thread From: Jan Kara @ 2008-04-10 15:19 UTC (permalink / raw) To: Jiri Kosina; +Cc: Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel > On Thu, 10 Apr 2008, Jan Kara wrote: > > > > The f_pos races are in fact exploitable, we've already been there. See > > > for example http://www.isec.pl/vulnerabilities/isec-0016-procleaks.txt > > Well, this race is more subtle - the window is just one instruction > > wide (stores to f_pos from CPU2 must come between the store of lower and > > upper 32-bits of f_pos on CPU1). And the only result is that f_pos has > > 32-bits from one file pointer and 32-bits from the other one. So I can > > hardly imagine this would be exploitable... > > Supposing you are not holding any spinlock and are running with > preemptible kernel (pretty common scenario nowadays), there is nothing > that would prevent kernel from rescheduling between the two instructions, > enlarging the race window to be more comfortable for attacker, right? Yes, this is theoretically possible. > I think this is worth fixing. Hmm, maybe it is, although I still don't see how to exploit it :). Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 15:19 ` Jan Kara @ 2008-04-10 15:37 ` Michal Hocko 2008-04-10 15:56 ` Jan Kara 0 siblings, 1 reply; 53+ messages in thread From: Michal Hocko @ 2008-04-10 15:37 UTC (permalink / raw) To: Jan Kara; +Cc: Jiri Kosina, Meelis Roos, Linux Kernel list, linux-fsdevel On Thursday 10 April 2008 05:19:45 pm Jan Kara wrote: > > On Thu, 10 Apr 2008, Jan Kara wrote: > > > > The f_pos races are in fact exploitable, we've already been there. > > > > See for example > > > > http://www.isec.pl/vulnerabilities/isec-0016-procleaks.txt > > > > > > Well, this race is more subtle - the window is just one instruction > > > wide (stores to f_pos from CPU2 must come between the store of lower > > > and upper 32-bits of f_pos on CPU1). And the only result is that f_pos > > > has 32-bits from one file pointer and 32-bits from the other one. So I > > > can hardly imagine this would be exploitable... > > > > Supposing you are not holding any spinlock and are running with > > preemptible kernel (pretty common scenario nowadays), there is nothing > > that would prevent kernel from rescheduling between the two instructions, > > enlarging the race window to be more comfortable for attacker, right? > > Yes, this is theoretically possible. > > > I think this is worth fixing. > > Hmm, maybe it is, although I still don't see how to exploit it :). Maybe (just a guess) some high-priority malicious process could try to preempt the reading thread at exactly the bad moment (when only half of f_pos has been written), forcing it to read bad data (you usually don't check that the file position grows after each read; you only wait for the end of the file). But I agree, I still don't see anything with real security implications (privileged processes usually don't work with such big files). > > Honza Best regards -- Michal Hocko SUSE LINUX s.r.o.
Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 15:37 ` Michal Hocko @ 2008-04-10 15:56 ` Jan Kara 0 siblings, 0 replies; 53+ messages in thread From: Jan Kara @ 2008-04-10 15:56 UTC (permalink / raw) To: Michal Hocko; +Cc: Jiri Kosina, Meelis Roos, Linux Kernel list, linux-fsdevel On Thu 10-04-08 17:37:16, Michal Hocko wrote: > On Thursday 10 April 2008 05:19:45 pm Jan Kara wrote: > > > On Thu, 10 Apr 2008, Jan Kara wrote: > > > > > The f_pos races are in fact exploitable, we've already been there. > > > > > See for example > > > > > http://www.isec.pl/vulnerabilities/isec-0016-procleaks.txt > > > > > > > > Well, this race is more subtle - the window is just one instruction > > > > wide (stores to f_pos from CPU2 must come between the store of lower > > > > and upper 32-bits of f_pos on CPU1). And the only result is that f_pos > > > > has 32-bits from one file pointer and 32-bits from the other one. So I > > > > can hardly imagine this would be exploitable... > > > > > > Supposing you are not holding any spinlock and are running with > > > preemptible kernel (pretty common scenario nowadays), there is nothing > > > that would prevent kernel from rescheduling between the two instructions, > > > enlarging the race window to be more comfortable for attacker, right? > > > > Yes, this is theoretically possible. > > > > > I think this is worth fixing. > > > > Hmm, maybe it is, although I still don't see how to exploit it :). > > Maybe (just guess) some high priority malicious process could try to preempt > reading thread to always in the bad moment (when the half of the f_pos is > written) and thus forcing it to read bad data (you usually don't check that > file position is growing after each read and you wait only for end of the > file). > But do agree, I still don't see something with really security implications > (privileged processes usually don't work with such a big files). 
Well, but for this to work, the process you are trying to attack must access the file from several threads in parallel without any locking... And I'm not aware of anybody really doing this. Really, the only attack vector I can imagine is that you create several malicious processes which try to corrupt f_pos and then use it (for example, if you could make it negative, I could imagine this triggering some bug somewhere). But since the possible corruptions are quite limited, I don't see how to corrupt f_pos into something at least remotely "useful". Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:31 ` Jiri Kosina 2008-04-10 14:48 ` Matthew Wilcox 2008-04-10 15:19 ` Jan Kara @ 2008-04-10 16:03 ` Diego Calleja 2008-04-10 16:15 ` Jan Kara 2 siblings, 1 reply; 53+ messages in thread From: Diego Calleja @ 2008-04-10 16:03 UTC (permalink / raw) To: Jiri Kosina Cc: Jan Kara, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Thu, 10 Apr 2008 16:31:09 +0200 (CEST), Jiri Kosina <jkosina@suse.cz> wrote: > I think this is worth fixing. This question comes up very often, and Linus even wrote a patch (http://lkml.org/lkml/2006/4/13/124 , http://lkml.org/lkml/2006/4/13/130). But apparently there isn't much interest in fixing it, because it would slow down some workloads... ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 16:03 ` Diego Calleja @ 2008-04-10 16:15 ` Jan Kara 0 siblings, 0 replies; 53+ messages in thread From: Jan Kara @ 2008-04-10 16:15 UTC (permalink / raw) To: Diego Calleja Cc: Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Thu 10-04-08 18:03:35, Diego Calleja wrote: > On Thu, 10 Apr 2008 16:31:09 +0200 (CEST), Jiri Kosina <jkosina@suse.cz> wrote: > > > I think this is worth fixing. > > This question comes very often, and Linus even wrote a patch > (http://lkml.org/lkml/2006/4/13/124 , http://lkml.org/lkml/2006/4/13/130) > > But apparently there's no much interest in fixing it, because it would > slow down some workloads... Well, what Linus writes about is a different issue (and with a more costly solution). Here we are concerned just with the problem that file->f_pos = pos; isn't atomic on some archs. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 53+ messages in thread
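To make the non-atomic store concrete: on a 32-bit machine a 64-bit assignment is typically carried out as two 32-bit stores, so a concurrent reader can see one half from the old value and one half from the new one. The small sketch below just computes those two possible mixed values; the function names are hypothetical and nothing here is kernel code.

```c
#include <stdint.h>

/* Value visible after only the low 32 bits of new_pos have been stored. */
uint64_t torn_after_low_store(uint64_t old_pos, uint64_t new_pos)
{
    return (old_pos & 0xFFFFFFFF00000000ULL) | (new_pos & 0x00000000FFFFFFFFULL);
}

/* Value visible after only the high 32 bits of new_pos have been stored. */
uint64_t torn_after_high_store(uint64_t old_pos, uint64_t new_pos)
{
    return (new_pos & 0xFFFFFFFF00000000ULL) | (old_pos & 0x00000000FFFFFFFFULL);
}
```

For example, moving the offset from 0xFFFFFFFF to 0x100000000 (one byte across the 4 GiB boundary) can momentarily expose either 0 or 0x1FFFFFFFF, neither of which was ever a real file position.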
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:27 ` Jan Kara 2008-04-10 14:31 ` Jiri Kosina @ 2008-04-11 19:26 ` Pavel Machek 2008-04-14 16:25 ` Jan Kara 1 sibling, 1 reply; 53+ messages in thread From: Pavel Machek @ 2008-04-11 19:26 UTC (permalink / raw) To: Jan Kara Cc: Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Thu 2008-04-10 16:27:00, Jan Kara wrote: > > On Thu, 10 Apr 2008, Michal Hocko wrote: > > > > > > Jeff Robertson analyzes the behaviour of different operating systems' > > > > 64-bit file offset implementation and concludes that on 32-bit > > > > machines, Linux and Solaris lack any locking to keep the two 32-bit > > > > halves in sync and this could cause rare file offset corruption. > > > > http://jeffr-tech.livejournal.com/21014.html > > > AFAICS, this race is theoretically possible, but it is very hard (almost > > > impossible) to trigger with a sane file usage pattern. Note that you > > > have to access shared struct file (same file descriptor) in different > > > threads which should be synchronized by caller anyway (*). > > > > ... but not in cases the caller is an intentionally evil code, right? :) > Yes. > > > > I also don't see any security implications from this race, but maybe > > > someone with more knowlage about fs can see (f_pos is used at many > > > places in the kernel code). > > > > The f_pos races are in fact exploitable, we've already been there. See > > for example http://www.isec.pl/vulnerabilities/isec-0016-procleaks.txt > Well, this race is more subtle - the window is just one instruction > wide (stores to f_pos from CPU2 must come between the store of lower and > upper 32-bits of f_pos on CPU1). And the only result is that f_pos has > 32-bits from one file pointer and 32-bits from the other one. So I can > hardly imagine this would be exploitable... Don't we have rlimit on max file size? I'd guess this could work around it? 
Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-11 19:26 ` Pavel Machek @ 2008-04-14 16:25 ` Jan Kara 0 siblings, 0 replies; 53+ messages in thread From: Jan Kara @ 2008-04-14 16:25 UTC (permalink / raw) To: Pavel Machek Cc: Jiri Kosina, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Fri 11-04-08 21:26:56, Pavel Machek wrote: > On Thu 2008-04-10 16:27:00, Jan Kara wrote: > > > On Thu, 10 Apr 2008, Michal Hocko wrote: > > > > > > > > Jeff Robertson analyzes the behaviour of different operating systems' > > > > > 64-bit file offset implementation and concludes that on 32-bit > > > > > machines, Linux and Solaris lack any locking to keep the two 32-bit > > > > > halves in sync and this could cause rare file offset corruption. > > > > > http://jeffr-tech.livejournal.com/21014.html > > > > AFAICS, this race is theoretically possible, but it is very hard (almost > > > > impossible) to trigger with a sane file usage pattern. Note that you > > > > have to access shared struct file (same file descriptor) in different > > > > threads which should be synchronized by caller anyway (*). > > > > > > ... but not in cases the caller is an intentionally evil code, right? :) > > Yes. > > > > > > I also don't see any security implications from this race, but maybe > > > > someone with more knowlage about fs can see (f_pos is used at many > > > > places in the kernel code). > > > > > > The f_pos races are in fact exploitable, we've already been there. See > > > for example http://www.isec.pl/vulnerabilities/isec-0016-procleaks.txt > > Well, this race is more subtle - the window is just one instruction > > wide (stores to f_pos from CPU2 must come between the store of lower and > > upper 32-bits of f_pos on CPU1). And the only result is that f_pos has > > 32-bits from one file pointer and 32-bits from the other one. So I can > > hardly imagine this would be exploitable... > > Don't we have rlimit on max file size? I'd guess this could work > around it? 
There is such a limit, but AFAIK it limits the maximum size of a file you are able to create. And write/truncate already check their local variable, i.e. the real value that is used later. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:01 ` Jiri Kosina 2008-04-10 14:27 ` Jan Kara @ 2008-04-10 14:31 ` Michal Hocko 2008-04-10 14:35 ` Jiri Kosina 1 sibling, 1 reply; 53+ messages in thread From: Michal Hocko @ 2008-04-10 14:31 UTC (permalink / raw) To: Jiri Kosina; +Cc: Meelis Roos, Linux Kernel list, linux-fsdevel On Thursday 10 April 2008 04:01:27 pm Jiri Kosina wrote: > On Thu, 10 Apr 2008, Michal Hocko wrote: > > > Jeff Robertson analyzes the behaviour of different operating systems' > > > 64-bit file offset implementation and concludes that on 32-bit > > > machines, Linux and Solaris lack any locking to keep the two 32-bit > > > halves in sync and this could cause rare file offset corruption. > > > http://jeffr-tech.livejournal.com/21014.html > > > > AFAICS, this race is theoretically possible, but it is very hard (almost > > impossible) to trigger with a sane file usage pattern. Note that you > > have to access shared struct file (same file descriptor) in different > > threads which should be synchronized by caller anyway (*). > > ... but not in cases the caller is an intentionally evil code, right? :) OK, but evil code needs to have access to your struct file, and in that case it can do worse things ;) Or do you have some concrete (innocent-looking) example? > > > I also don't see any security implications from this race, but maybe > > someone with more knowlage about fs can see (f_pos is used at many > > places in the kernel code). > > The f_pos races are in fact exploitable, we've already been there. See > for example http://www.isec.pl/vulnerabilities/isec-0016-procleaks.txt This is a different race on the file position, IMO. If I understand the report correctly, the problem there was copy_to_user sleeping while f_pos changed. Best regards -- Michal Hocko SUSE LINUX s.r.o. Lihovarska 1060/12 190 00 Praha 9 Czech Republic ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:31 ` Michal Hocko @ 2008-04-10 14:35 ` Jiri Kosina 0 siblings, 0 replies; 53+ messages in thread From: Jiri Kosina @ 2008-04-10 14:35 UTC (permalink / raw) To: Michal Hocko; +Cc: Meelis Roos, Linux Kernel list, linux-fsdevel On Thu, 10 Apr 2008, Michal Hocko wrote: > This is different race with file position IMO. If I understand the > report correctly, problem was with sleeping copy_to_user while the f_pos > has changed. Is this really different in principle from getting rescheduled between the two mov instructions? -- Jiri Kosina SUSE Labs ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 13:55 ` Michal Hocko 2008-04-10 14:01 ` Jiri Kosina @ 2008-04-10 14:11 ` Martin Mares 2008-04-10 15:12 ` Jan Kara 2008-04-10 15:14 ` Jamie Lokier 2008-04-10 15:33 ` Andi Kleen 2 siblings, 2 replies; 53+ messages in thread From: Martin Mares @ 2008-04-10 14:11 UTC (permalink / raw) To: Michal Hocko; +Cc: Meelis Roos, Linux Kernel list, linux-fsdevel Hello! > [*] file_pos_{read,write} (fs/read_write.c) are not called under lock (in > sys_read, sys_write, ...), so even if f_pos is written atomically, you will > be able to get races when accessing shared descriptor from different threads. There are however cases when such behavior is perfectly valid: For example you can have a file of records of a fixed size, whose order does not matter. Then multiple processes can produce the records in parallel, sharing a single fd. > I think that POSIX states, that behavior is undefined under these conditions. Do you have a pointer to that? Have a nice fortnight -- Martin `MJ' Mares <mj@ucw.cz> http://mj.ucw.cz/ Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth Mr. Worf, scan that ship." "Aye, Captain... 600 DPI? ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:11 ` Martin Mares @ 2008-04-10 15:12 ` Jan Kara 2008-04-10 15:14 ` Jamie Lokier 1 sibling, 0 replies; 53+ messages in thread From: Jan Kara @ 2008-04-10 15:12 UTC (permalink / raw) To: Martin Mares; +Cc: Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel > Hello! > > > [*] file_pos_{read,write} (fs/read_write.c) are not called under lock (in > > sys_read, sys_write, ...), so even if f_pos is written atomically, you will > > be able to get races when accessing shared descriptor from different threads. > > There are however cases when such behavior is perfectly valid: For example > you can have a file of records of a fixed size, whose order does not matter. > Then multiple processes can produce the records in parallel, sharing > a single fd. Well, but no one guarantees that both processes don't read the same data. > > I think that POSIX states, that behavior is undefined under these conditions. > > Do you have a pointer to that? SUSv3 says: On files that support seeking (for example, a regular file), the read() shall start at a position in the file given by the file offset associated with fildes. The file offset shall be incremented by the number of bytes actually read. But it is nowhere specified when this happens, so the OS is perfectly free to advance f_pos after the read finishes, when a read from the other process is already running. And Linux does exactly that - actually, we do:
  pos = f_pos
  do reading which advances pos
  f_pos = pos
So in theory it can even happen that one thread reads entries 1,2,3,2 because the other thread in the meantime finished reading entry 1... Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs ^ permalink raw reply [flat|nested] 53+ messages in thread
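The three-step sequence Jan describes (snapshot f_pos, read, store the advanced position back) can be replayed deterministically to see why two readers may get the same record. The sketch below is a toy model only: `fake_file`, `begin_read` and `finish_read` are hypothetical names, and the interleaving is forced by calling order rather than by real threads.

```c
#include <stdint.h>

/* Toy stand-in for struct file; only the offset matters here. */
struct fake_file {
    int64_t f_pos;
};

/* Step 1: take a private snapshot of the shared offset. */
int64_t begin_read(const struct fake_file *f)
{
    return f->f_pos;
}

/* Step 3: after reading nbytes starting at pos, publish the advanced offset. */
void finish_read(struct fake_file *f, int64_t pos, int64_t nbytes)
{
    f->f_pos = pos + nbytes;
}
```

If two readers both call `begin_read()` before either calls `finish_read()`, they start from the same position, read the same 100-byte record, and the shared offset ends up advanced by 100 rather than 200. That outcome does not require a torn 64-bit store at all, which is the point of the remark about file_pos_{read,write} not being called under a lock.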
* Re: file offset corruption on 32-bit machines? 2008-04-10 14:11 ` Martin Mares 2008-04-10 15:12 ` Jan Kara @ 2008-04-10 15:14 ` Jamie Lokier 2008-04-10 15:21 ` Matthew Wilcox 2008-04-10 15:28 ` Jan Kara 1 sibling, 2 replies; 53+ messages in thread From: Jamie Lokier @ 2008-04-10 15:14 UTC (permalink / raw) To: Martin Mares; +Cc: Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel Martin Mares wrote: > > [*] file_pos_{read,write} (fs/read_write.c) are not called under > > lock (in sys_read, sys_write, ...), so even if f_pos is written > > atomically, you will be able to get races when accessing shared > > descriptor from different threads. > > There are however cases when such behavior is perfectly valid: For example > you can have a file of records of a fixed size, whose order does not matter. > Then multiple processes can produce the records in parallel, sharing > a single fd. A rather more common thing: Does this problem apply when appending lines or records to a log file, with or without O_APPEND? Also, can this problem affect programs doing concurrent reads/writes using pread/pwrite (or the AIO equivalents)? -- Jamie ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: file offset corruption on 32-bit machines? 2008-04-10 15:14 ` Jamie Lokier @ 2008-04-10 15:21 ` Matthew Wilcox 2008-04-10 15:28 ` Jan Kara 1 sibling, 0 replies; 53+ messages in thread From: Matthew Wilcox @ 2008-04-10 15:21 UTC (permalink / raw) To: Martin Mares, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel On Thu, Apr 10, 2008 at 04:14:06PM +0100, Jamie Lokier wrote: > Also, can this problem affect programs doing concurrent reads/writes > using pread/pwrite (or the AIO equivalents)? pread/pwrite specify an explicit offset and do not change the file offset, so there's no way they can be affected. See the manpage. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 53+ messages in thread
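A quick user-space check of the pread() behaviour described above, as a sketch under the assumption of a POSIX system: it writes ten bytes to a temporary file, reads four of them with pread() at an explicit offset, and compares the descriptor's offset before and after. The helper name `pread_keeps_offset` is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if pread() left the file offset untouched, 0 if not, -1 on error. */
int pread_keeps_offset(void)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;

    int fd = fileno(tmp);
    int result = -1;
    char buf[4];

    if (write(fd, "0123456789", 10) == 10) {
        off_t before = lseek(fd, 0, SEEK_CUR);      /* offset after the write */
        if (before != (off_t)-1 &&
            pread(fd, buf, sizeof buf, 2) == (ssize_t)sizeof buf) {
            off_t after = lseek(fd, 0, SEEK_CUR);
            /* buf now holds "2345"; the shared offset should not have moved */
            result = (after == before && buf[0] == '2');
        }
    }

    fclose(tmp);
    return result;
}
```

Because each call carries its own offset, threads sharing a descriptor and using only pread()/pwrite() never read or update f_pos, so the race discussed in this thread does not apply to them.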
* Re: file offset corruption on 32-bit machines? 2008-04-10 15:14 ` Jamie Lokier 2008-04-10 15:21 ` Matthew Wilcox @ 2008-04-10 15:28 ` Jan Kara 1 sibling, 0 replies; 53+ messages in thread From: Jan Kara @ 2008-04-10 15:28 UTC (permalink / raw) To: Martin Mares, Michal Hocko, Meelis Roos, Linux Kernel list, linux-fsdevel > Martin Mares wrote: > > > [*] file_pos_{read,write} (fs/read_write.c) are not called under > > > lock (in sys_read, sys_write, ...), so even if f_pos is written > > > atomically, you will be able to get races when accessing shared > > > descriptor from different threads. > > > > There are however cases when such behavior is perfectly valid: For example > > you can have a file of records of a fixed size, whose order does not matter. > > Then multiple processes can produce the records in parallel, sharing > > a single fd. > > A rather more common thing: > > Does this problem apply when appending lines or records to a log file, > with or without O_APPEND? O_APPEND works correctly in all cases (it ignores f_pos in the descriptor). Without O_APPEND you can hit the race (but I'd like to see a sensible use case of this ;). > Also, can this problem affect programs doing concurrent reads/writes > using pread/pwrite (or the AIO equivalents)? As Matthew said, pread/pwrite are safe, parallel read can hit the race, write was described above... Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs ^ permalink raw reply [flat|nested] 53+ messages in thread
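The O_APPEND claim can also be checked from user space. The sketch below assumes a POSIX system and a local file system; it sets O_APPEND on a temporary file, deliberately seeks back to offset 0, writes again, and then verifies that the second write still landed at the end of the file. The helper name `append_ignores_offset` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if an O_APPEND write ignored the current offset, 0 if not, -1 on error. */
int append_ignores_offset(void)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;

    int fd = fileno(tmp);
    int result = -1;
    char buf[8];
    int flags = fcntl(fd, F_GETFL);

    if (flags != -1 &&
        write(fd, "AAAA", 4) == 4 &&
        fcntl(fd, F_SETFL, flags | O_APPEND) != -1 &&
        lseek(fd, 0, SEEK_SET) == 0 &&              /* move the offset to the start */
        write(fd, "BBBB", 4) == 4 &&                /* still appended at the end */
        pread(fd, buf, sizeof buf, 0) == (ssize_t)sizeof buf) {
        result = (memcmp(buf, "AAAABBBB", 8) == 0);
    }

    fclose(tmp);
    return result;
}
```

This matches the log-file case raised earlier: writers that open the log with O_APPEND do not depend on the shared offset being consistent, whereas writers without O_APPEND do.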
* Re: file offset corruption on 32-bit machines? 2008-04-10 13:55 ` Michal Hocko 2008-04-10 14:01 ` Jiri Kosina 2008-04-10 14:11 ` Martin Mares @ 2008-04-10 15:33 ` Andi Kleen 2 siblings, 0 replies; 53+ messages in thread From: Andi Kleen @ 2008-04-10 15:33 UTC (permalink / raw) To: Michal Hocko; +Cc: Meelis Roos, Linux Kernel list, linux-fsdevel Michal Hocko <mhocko@suse.cz> writes: > [Adding fsdevel list] > > On Tuesday 08 April 2008 10:05:47 am Meelis Roos wrote: >> Jeff Robertson analyzes the behaviour of different operating systems' >> 64-bit file offset implementation and concludes that on 32-bit >> machines, Linux and Solaris lack any locking to keep the two 32-bit >> halves in sync and this could cause rare file offset corruption. >> >> http://jeffr-tech.livejournal.com/21014.html > > AFAICS, this race is theoretically possible, but it is very hard (almost > impossible) to trigger with a sane file usage pattern. We discussed this extensively some time ago in http://thread.gmane.org/gmane.linux.file-systems/20712/focus=20771 No solution so far -Andi ^ permalink raw reply [flat|nested] 53+ messages in thread