public inbox for linux-kernel@vger.kernel.org
* Writing more than 4096 bytes with O_SYNC flag does not persist all previously written data if system crashes
@ 2026-02-18 13:29 Vyacheslav Kovalevsky
From: Vyacheslav Kovalevsky @ 2026-02-18 13:29 UTC (permalink / raw)
  To: tytso, adilger.kernel; +Cc: linux-ext4, linux-kernel

Detailed description
====================

Hello, there seems to be an issue with ext4 crash behavior:

1. Create and sync a new file.
2. Open the file and write some data (must be more than 4096 bytes).
3. Close the file.
4. Open the file with O_SYNC flag and write some data.

After a system crash, the file has the wrong size and some of the
previously written data is lost.

According to the Linux manual
<https://man7.org/linux/man-pages/man2/open.2.html>, O_SYNC can be replaced
with an fsync() call after each write operation:

```
By the time write(2) (or similar) returns, the output data
and associated file metadata have been transferred to the
underlying hardware (i.e., as though each write(2) was
followed by a call to fsync(2)).
```

In this case that does not hold: using O_SYNC does not persist the data the
way fsync() does (see test below).
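For comparison, here is a minimal sketch of the fsync() variant that the man page describes as equivalent to O_SYNC. It is not part of the reproducer below. The helper name `write_then_fsync` is hypothetical and is used only for illustration. It opens the file without O_SYNC, writes the data, and then calls fsync() explicitly.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the report): overwrite the start of `path`
 * with `len` bytes from `data`, then call fsync() explicitly instead of
 * relying on O_SYNC.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
int write_then_fsync(const char *path, const void *data, size_t len) {
    assert(path != NULL && data != NULL);

    int fd = open(path, O_WRONLY); /* note: no O_SYNC here */
    if (fd < 0)
        return -1;

    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before writing anything, retry */
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n; /* handle short writes */
        left -= (size_t)n;
    }

    /* Explicit flush of the file's data and metadata. */
    if (fsync(fd) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

Replacing the O_SYNC open/write/close sequence in the test with `write_then_fsync("file", "Test data!", 10)` gives the fsync() variant to compare against after a crash.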


System info
===========

Linux version 6.19.2


How to reproduce
================

```
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define BUFFER_LEN 5000 // should be at least ~ 4096+1

int main() {
   int status;
   int file_fd0;
   int file_fd1;
   int file_fd2;

   char buffer[BUFFER_LEN + 1] = {};
   for (int i = 0; i <= BUFFER_LEN; ++i) {
     buffer[i] = (char)i;
   }

   status = creat("file", S_IRWXU | S_IRWXG | S_IROTH | S_IXOTH);
   printf("CREAT: %d\n", status);
   file_fd0 = status;

   status = close(file_fd0);
   printf("CLOSE: %d\n", status);

   sync();

   status = open("file", O_WRONLY);
   printf("OPEN: %d\n", status);
   file_fd1 = status;

   status = write(file_fd1, buffer, BUFFER_LEN);
   printf("WRITE: %d\n", status);

   status = close(file_fd1);
   printf("CLOSE: %d\n", status);

   status = open("file", O_WRONLY | O_SYNC);
   printf("OPEN: %d\n", status);
   file_fd2 = status;

   status = write(file_fd2, "Test data!", 10);
   printf("WRITE: %d\n", status);

   status = close(file_fd2);
   printf("CLOSE: %d\n", status);
}
// after crash file size is 4096 instead of 5000
```

Output:

```
CREAT: 3
CLOSE: 0
OPEN: 3
WRITE: 5000
CLOSE: 0
OPEN: 3
WRITE: 10
CLOSE: 0
```

File content after crash:

```
$ xxd file
00000000: 5465 7374 2064 6174 6121 0a0b 0c0d 0e0f  Test data!......
00000010: 1011 1213 1415 1617 1819 1a1b 1c1d 1e1f  ................
00000020: 2021 2223 2425 2627 2829 2a2b 2c2d 2e2f   !"#$%&'()*+,-./

.........

00000ff0: f0f1 f2f3 f4f5 f6f7 f8f9 fafb fcfd feff  ................
```

Steps:

1. Create and mount a new ext4 file system in the default configuration.
2. Change directory to the root of the file system and run the compiled test.
3. Cause a hard system crash (e.g. QEMU `system_reset` command).
4. Remount the file system after the crash.
5. Observe that the file size is 4096 instead of 5000.
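The size check in the last step can also be scripted. Below is a minimal sketch using stat(2). The helper name `check_file_size` is hypothetical and is not part of the test above. The expected size of 5000 comes from BUFFER_LEN in the reproducer.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the report): compare the size of `path`
 * with `expected` bytes.
 * Returns 1 if the sizes match, 0 if they differ, -1 if stat() fails.
 */
int check_file_size(const char *path, long long expected) {
    assert(path != NULL);

    struct stat st;
    if (stat(path, &st) < 0) {
        perror("stat");
        return -1;
    }

    long long actual = (long long)st.st_size;
    printf("%s: size %lld, expected %lld\n", path, actual, expected);
    return actual == expected ? 1 : 0;
}
```

After remounting, calling `check_file_size("file", 5000)` from a small main() in the root of the file system would return 0 if the file was truncated as described, and 1 if all 5000 bytes of the first write are accounted for in the file size.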

Notes:

- This also seems to affect XFS in the same way.

