* Write is not atomic?
From: Juliusz Chroboczek @ 2012-10-15 21:36 UTC (permalink / raw)
To: linux-kernel
Hi,

The Linux manual page for write(2) says:

    The adjustment of the file offset and the write operation are
    performed as an atomic step.

This is apparently an extension to POSIX, which says

    This volume of IEEE Std 1003.1-2001 does not specify behavior of
    concurrent writes to a file from multiple processes. Applications
    should use some form of concurrency control.

The following fragment of code

    int fd;
    fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC, 0666);
    fork();
    write(fd, "Ouille", 6);
    close(fd);

produces "OuilleOuille", as expected, on ext4 on two machines running
Linux 3.2 AMD64. However, over XFS on an old Pentium III at 500 MHz
running 2.6.32, it produces just "Ouille" roughly once in three times.

Sorry for not being able to give more test cases, but I cannot easily
change either the filesystem or the kernel on the Pentium server.

-- Juliusz Chroboczek

P.S. I'll appreciate being copied with any replies.
* Re: Write is not atomic?
From: Max Filippov @ 2012-10-15 22:21 UTC (permalink / raw)
To: Juliusz Chroboczek; +Cc: linux-kernel
On Tue, Oct 16, 2012 at 1:36 AM, Juliusz Chroboczek <jch@pps.jussieu.fr> wrote:
> Hi,
>
> The Linux manual page for write(2) says:
>
> The adjustment of the file offset and the write operation are
> performed as an atomic step.
>
> This is apparently an extension to POSIX, which says
>
> This volume of IEEE Std 1003.1-2001 does not specify behavior of
> concurrent writes to a file from multiple processes. Applications
> should use some form of concurrency control.
>
> The following fragment of code
>
> int fd;
> fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC, 0666);
> fork();
> write(fd, "Ouille", 6);
You don't check return code here, does write succeed at all?
> close(fd);
>
> produces "OuilleOuille", as expected, on ext4 on two machines running
> Linux 3.2 AMD64. However, over XFS on an old Pentium III at 500 MHz
> running 2.6.32, it produces just "Ouille" roughly once in three times.
Does it ever produce e.g. OuOuilleille (as this is what atomicity is about
here)?
--
Thanks.
-- Max
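Both questions above (do the writes succeed, and is the output ever
interleaved) can be checked mechanically. A minimal sketch, assuming a
POSIX system and a writable path; the helper name run_checked and its
parameters are hypothetical and do not come from the thread:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread: run the fork/write experiment
 * once on `path`, checking every return value, then copy the resulting
 * file contents into `out` (NUL-terminated).
 * Returns the number of bytes found in the file, or -1 on any error. */
long run_checked(const char *path, char *out, size_t outsz)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0666);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }

    /* Parent and child both write through the descriptor opened above. */
    ssize_t n = write(fd, "Ouille", 6);
    close(fd);

    if (pid == 0)
        _exit(n == 6 ? 0 : 1);      /* child reports its write result */

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0 || n != 6)
        return -1;                  /* one of the two writes was short or failed */

    /* Read the result back so the caller can inspect it. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t len = read(fd, out, outsz - 1);
    close(fd);
    if (len < 0)
        return -1;
    out[len] = '\0';
    return (long)len;
}
```

A caller would pass a scratch path and a small buffer, then compare the
buffer against "Ouille" and "OuilleOuille"; any other content would
indicate interleaving within a single write.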
* Re: Write is not atomic?
From: Dave Chinner @ 2012-10-15 23:13 UTC (permalink / raw)
To: Juliusz Chroboczek; +Cc: linux-kernel
On Mon, Oct 15, 2012 at 11:36:15PM +0200, Juliusz Chroboczek wrote:
> Hi,
>
> The Linux manual page for write(2) says:
>
> The adjustment of the file offset and the write operation are
> performed as an atomic step.
That's wrong. The file offset update is not synchronised at all with
the write, and for a shared fd the update will race.
> This is apparently an extension to POSIX, which says
>
> This volume of IEEE Std 1003.1-2001 does not specify behavior of
> concurrent writes to a file from multiple processes. Applications
> should use some form of concurrency control.
This is how Linux behaves.
> The following fragment of code
>
> int fd;
> fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC, 0666);
> fork();
> write(fd, "Ouille", 6);
> close(fd);
>
> produces "OuilleOuille", as expected, on ext4 on two machines running
> Linux 3.2 AMD64. However, over XFS on an old Pentium III at 500 MHz
> running 2.6.32, it produces just "Ouille" roughly once in three times.
ext4, on 3.6:

$ for i in `seq 0 10000`; do ./a.out ; cat /mnt/scratch/foo ; echo ; done | sort |uniq -c
     39 Ouille
   9962 OuilleOuille
$

XFS, on the same kernel, hardware and block device:

$ for i in `seq 0 10000`; do ./a.out ; cat /mnt/scratch/foo ; echo ; done | sort |uniq -c
     40 Ouille
   9961 OuilleOuille
$

So both filesystems behave according to the POSIX definition of
concurrent writes....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
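The shell loop above can also be approximated in a single C program that
repeats the experiment and tallies the resulting file sizes. This is only
a sketch under stated assumptions: the names struct tally, one_run and
run_many are hypothetical, and the file size (6 or 12 bytes) stands in for
the "Ouille" / "OuilleOuille" lines counted by uniq -c.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical tally of outcomes, for illustration only. */
struct tally {
    int single;   /* runs that left 6 bytes  ("Ouille")       */
    int doubled;  /* runs that left 12 bytes ("OuilleOuille") */
    int other;    /* errors, or any other length              */
};

/* One run of the fork/write experiment.
 * Returns the final file size, or -1 on error. */
long one_run(const char *path)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0666);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }

    ssize_t n = write(fd, "Ouille", 6);
    if (pid == 0) {
        close(fd);
        _exit(n == 6 ? 0 : 1);
    }

    /* Wait for the child so that both writes are finished before sizing. */
    int status;
    int ok = waitpid(pid, &status, 0) == pid
          && WIFEXITED(status) && WEXITSTATUS(status) == 0
          && n == 6;

    struct stat st;
    if (fstat(fd, &st) < 0)
        ok = 0;
    close(fd);
    return ok ? (long)st.st_size : -1;
}

/* Repeat the experiment and count each outcome. */
struct tally run_many(const char *path, int iterations)
{
    struct tally t = {0, 0, 0};
    for (int i = 0; i < iterations; i++) {
        long size = one_run(path);
        if (size == 6)
            t.single++;
        else if (size == 12)
            t.doubled++;
        else
            t.other++;
    }
    return t;
}
```

Whether the single count is ever non-zero depends on the kernel and
filesystem under test; the sketch makes no claim about which result a
given system will show.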
* Re: Write is not atomic?
From: Philippe Troin @ 2012-10-15 23:24 UTC (permalink / raw)
To: Dave Chinner; +Cc: Juliusz Chroboczek, linux-kernel
On Tue, 2012-10-16 at 10:13 +1100, Dave Chinner wrote:
> On Mon, Oct 15, 2012 at 11:36:15PM +0200, Juliusz Chroboczek wrote:
> > Hi,
> >
> > The Linux manual page for write(2) says:
> >
> > The adjustment of the file offset and the write operation are
> > performed as an atomic step.
>
> That's wrong. The file offset update is not synchronised at all with
> the write, and for a shared fd the update will race.
That's what O_APPEND or pread/pwrite are for.
> > This is apparently an extension to POSIX, which says
> >
> > This volume of IEEE Std 1003.1-2001 does not specify behavior of
> > concurrent writes to a file from multiple processes. Applications
> > should use some form of concurrency control.
>
> This is how Linux behaves.
>
> > The following fragment of code
> >
> > int fd;
> > fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC, 0666);
> > fork();
> > write(fd, "Ouille", 6);
> > close(fd);
can be replaced with:

    int fd;
    fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC | O_APPEND, 0666);
    fork();
    write(fd, "Ouille", 6);
    close(fd);

or:

    int fd;
    fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC, 0666);
    pid_t pid = fork();
    pwrite(fd, "Ouille", 6, strlen("Ouille")*(pid == 0));
    close(fd);

(both code fragments untested)
Phil.
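For reference, the pwrite() variant can be fleshed out into a checked,
compilable form. This is a sketch rather than Phil's tested code: the
function name write_at_fixed_offsets is hypothetical, and the error
handling is an addition. Because pwrite() takes its offset as an argument
and neither uses nor changes the file offset, the two processes cannot
race on it.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread: parent and child each write
 * "Ouille" at a fixed, non-overlapping offset using pwrite().
 * Returns 0 if both writes succeeded, -1 otherwise. */
int write_at_fixed_offsets(const char *path)
{
    static const char msg[] = "Ouille";
    const size_t len = sizeof msg - 1;      /* 6, without the trailing NUL */

    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0666);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }

    /* Parent writes at offset 0, child at offset len, as in the
     * fragment above (strlen("Ouille") * (pid == 0)). */
    off_t where = (pid == 0) ? (off_t)len : 0;
    ssize_t n = pwrite(fd, msg, len, where);
    close(fd);

    if (pid == 0)
        _exit(n == (ssize_t)len ? 0 : 1);   /* child reports its result */

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return n == (ssize_t)len ? 0 : -1;
}
```

After a successful call the file holds "OuilleOuille" regardless of which
process runs first, since each process owns its own byte range.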
* Re: Write is not atomic?
From: Juliusz Chroboczek @ 2012-10-15 23:36 UTC (permalink / raw)
To: Max Filippov; +Cc: linux-kernel
> You don't check return code here, does write succeed at all?
Yes, both writes return 6.
> Does it ever produce e.g. OuOuilleille
No.
> (as this is what atomicity is about here)?
I was referring to the claim that under Linux writing and adjusting the
file offset are performed as an atomic step, not to the atomicity of the
write operation itself.
-- Juliusz
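The question only arises because the descriptor inherited across fork()
refers to the same open file description, and therefore the same file
offset, in both processes. A minimal sketch of that sharing, with no
concurrency involved; the function name offset_seen_by_parent is
hypothetical:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not from the thread: the child writes 6 bytes and
 * exits; the parent, which never writes, then asks for the current offset
 * of its own copy of the descriptor.
 * Returns that offset, or -1 on error. */
long offset_seen_by_parent(const char *path)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0666);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }

    if (pid == 0) {
        /* Only the child writes. */
        ssize_t n = write(fd, "Ouille", 6);
        _exit(n == 6 ? 0 : 1);
    }

    /* Wait for the child, so there is no race in this sketch. */
    int status;
    if (waitpid(pid, &status, 0) != pid
        || !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* The offset was advanced by the child's write. */
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return (long)pos;
}
```

The parent sees offset 6 even though it never wrote anything. When both
processes write at the same time, each one reads and then updates this
single shared offset, which is where the race discussed in this thread
can occur.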
* Re: Write is not atomic?
From: Jochen Striepe @ 2012-10-15 23:40 UTC (permalink / raw)
To: Juliusz Chroboczek; +Cc: linux-kernel
Hello,
On Mon, Oct 15, 2012 at 11:36:15PM +0200, Juliusz Chroboczek wrote:
> The Linux manual page for write(2) says:
>
> The adjustment of the file offset and the write operation are
> performed as an atomic step.
This seems out of context.
Over here write(2) reads:
If the file was open(2)ed with O_APPEND, the file offset is first
set to the end of the file before writing. The adjustment of the
file offset and the write operation are performed as an atomic
step.
Sounds different, doesn't it?
Hth,
Jochen.
* Re: Write is not atomic?
From: Max Filippov @ 2012-10-15 23:42 UTC (permalink / raw)
To: Philippe Troin; +Cc: Dave Chinner, Juliusz Chroboczek, linux-kernel
On Tue, Oct 16, 2012 at 3:24 AM, Philippe Troin <phil@fifi.org> wrote:
> On Tue, 2012-10-16 at 10:13 +1100, Dave Chinner wrote:
>> On Mon, Oct 15, 2012 at 11:36:15PM +0200, Juliusz Chroboczek wrote:
>> > The following fragment of code
>> >
>> > int fd;
>> > fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC, 0666);
>> > fork();
>> > write(fd, "Ouille", 6);
>> > close(fd);
>
> can be replaced with:
>
> int fd;
> fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC | O_APPEND, 0666);
> fork();
> write(fd, "Ouille", 6);
> close(fd);
Fails the same way as the original. I guess O_APPEND doesn't work this way
for writes to the shared file descriptor.
--
Thanks.
-- Max
* Re: Write is not atomic?
From: Juliusz Chroboczek @ 2012-10-16 6:21 UTC (permalink / raw)
To: Jochen Striepe; +Cc: linux-kernel
> This seems out of context.
> If the file was open(2)ed with O_APPEND, the file offset is first
> set to the end of the file before writing. The adjustment of the
> file offset and the write operation are performed as an atomic
> step.
> Sounds different, doesn't it?
Yes, it does -- thanks for the clarification. (And thanks to Dave for
the interesting tests.)
Sorry for the confusion,
-- Juliusz