* [PATCH 00/32] e2fsprogs patchbomb 2/14
From: Darrick J. Wong @ 2014-03-02 7:16 UTC
To: tytso, darrick.wong; +Cc: linux-ext4
Well it's been a while, but this time there aren't as many patches. :)
The first two patches provide some minor tweaks to the extended
attribute editing code that had been sitting (unreleased :/) in my
tree when Ted pulled in v4 of the extended attribute patches. Most
notable is a fix for the delete method being unable to remove the last
xattr attached to an inode.
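
To illustrate why deleting the last attribute can fail when the storage area is not cleared first, here is a minimal sketch. It does not use the real ext4 xattr layout: the format (NUL-terminated names ended by an empty name) and the names pack_names() and count_names() are hypothetical stand-ins. With zero names to write, a writer that does not clear the buffer leaves the old entry in place.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical format: a run of NUL-terminated names; an empty name
 * (a lone NUL byte) marks the end of the list. */

/* Count the names a reader would find in buf. */
size_t count_names(const char *buf, size_t size)
{
	size_t n = 0, off = 0;

	while (off < size && buf[off] != '\0') {
		const char *end = memchr(buf + off, '\0', size - off);

		if (!end)
			break;
		off = (size_t)(end - buf) + 1;
		n++;
	}
	return n;
}

/* Pack names into buf.  When clear_first is 0, every byte the loop does
 * not overwrite keeps its previous contents. */
int pack_names(char *buf, size_t size, const char *const *names,
	       size_t count, int clear_first)
{
	size_t off = 0, i;

	if (clear_first)
		memset(buf, 0, size);
	for (i = 0; i < count; i++) {
		size_t len = strlen(names[i]) + 1;

		if (off + len >= size)	/* keep room for the terminator */
			return -1;
		memcpy(buf + off, names[i], len);
		off += len;
	}
	return 0;
}
```

Packing an empty list with clear_first set to 0 leaves count_names() reporting the old entry; with clear_first set to 1 it reports zero.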
Patches 3-6 implement various minor bug fixes and cleanups, some of
which are based on complaints from clang and cppcheck.
Patches 7-8 fix some warts I've noticed while running e2fsck with
regard to inline data and printing runs of duplicate blocks.
Patches 9-10 make some alterations to metadata checksumming support;
by default, e2fsck will now check the inode before verifying the
checksum. There's a command line option to restore the "just scrape
it off the system" behavior for heavily damaged filesystems. There's
also a command line option to dumpe2fs to ignore checksum failures.
Patch 11 enables block_validity for new filesystems. See patch 30 for
a performance microbenchmark.
Patches 12-13 enhance ext2fs_bmap2() to allow the creation of
uninitialized extents. The functionality is already there; really it
just adds a flag to indicate uninitialized. There's also a patch to
the fileio routines to handle uninitialized extents. These patches
are unchanged from December.
Patches 14-16 add to resize2fs the ability to convert a filesystem to
and from 64bit mode. These patches are unchanged from December.
Patches 17-20 implement readahead for e2fsck. The first patch tries
to reduce system call overhead by using pread/pwrite if available.
The next two patches plumb in the IO manager and library changes
necessary to read metadata blocks into the page cache (on Linux). The
final patch teaches e2fsck to use the library readahead functions in a
separate thread.
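
As a rough illustration of the system call saving from pread, the sketch below reads one block both ways. This is not the code from the patch; the helper names read_block_seek() and read_block_pread() are made up for the example, and only the POSIX calls lseek(), read() and pread() are real.

```c
#define _XOPEN_SOURCE 700	/* for pread() under strict -std= modes */
#include <sys/types.h>
#include <unistd.h>

/* Two system calls per block: position the file offset, then read. */
int read_block_seek(int fd, void *buf, size_t blocksize, off_t blk)
{
	if (lseek(fd, blk * (off_t)blocksize, SEEK_SET) == (off_t)-1)
		return -1;
	return read(fd, buf, blocksize) == (ssize_t)blocksize ? 0 : -1;
}

/* One system call per block; the file offset is left untouched. */
int read_block_pread(int fd, void *buf, size_t blocksize, off_t blk)
{
	return pread(fd, buf, blocksize, blk * (off_t)blocksize) ==
	       (ssize_t)blocksize ? 0 : -1;
}
```

Both helpers return 0 on a full read and -1 on an error or short read. Because pread() does not move the file offset, it is also easier to share one descriptor between threads.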
Crude testing has been done via:
# echo 3 > /proc/sys/vm/drop_caches
# e2fsck -Fnfvtt /dev/XXX
So far in my crude testing on a cold system, I've seen about a 20%
speedup on an SSD, a 40% speedup on a 3x RAID1 SATA array, and a 10%
speedup on a single-spindle SATA disk. On a single-queue USB
HDD, performance doesn't change much. It looks as though low end
storage like USB HDDs will not benefit, which doesn't surprise me.
There's around a 2% regression for USB HDDs, though it doesn't seem
statistically significant. The SSD numbers are harder to quantify
since they're already fast. Somewhat unexpectedly, the readahead code
speeds up e2fsck even when the page cache has already been warmed up.
This third version of the readahead patches tries to prevent page cache
thrashing by limiting the amount of (user-configurable) readahead to a
default of half of physical memory. It also tries to release some of
the memory pages if it can conclude that it's totally done with a
block, and it can now detect very slow readahead and disable it.
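
One way to compute a "half of physical memory" default is sketched below. This is an assumption about how such a limit could be derived, not the code in the patches: phys_mem_bytes() and readahead_budget() are hypothetical names, and _SC_PHYS_PAGES is a Linux/glibc extension rather than POSIX.

```c
#include <unistd.h>

/* Physical memory in bytes, or 0 if the system will not say.
 * _SC_PHYS_PAGES is a Linux/glibc extension, not POSIX. */
unsigned long long phys_mem_bytes(void)
{
	long pages = sysconf(_SC_PHYS_PAGES);
	long page_size = sysconf(_SC_PAGESIZE);

	if (pages <= 0 || page_size <= 0)
		return 0;
	return (unsigned long long)pages * (unsigned long long)page_size;
}

/* requested == 0 means "no user setting": fall back to half of memory. */
unsigned long long readahead_budget(unsigned long long requested,
				    unsigned long long phys_bytes)
{
	return requested ? requested : phys_bytes / 2;
}
```

A caller would pass phys_mem_bytes() as the second argument. A budget of 0 (memory size unknown, no user setting) can then be treated as "readahead disabled".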
Patches 21-25 implement fallocate for e2fsprogs, and modify Ted's
mk_hugefiles functionality to use it. The general fallocate API call
is (regrettably) much more complex than Ted's, since it must grapple
with the possibility that the file already has mapped blocks. There
were also a lot of bigalloc related subtleties.
Patches 26-29 implement fuse2fs, a FUSE server based on libext2fs.
Primarily I've been using it to shake out bugs in the library via
xfstests and the metadata checksumming test program. It can also be
used to mount ext4 on any OS supporting FUSE, and it can also mount
64k-block filesystems on x86, though I'd be wary of using rw mode.
fuse2fs depends on these new APIs: xattr editing, uninit extent
handling, and the new fallocate call.
Patches 30-32 provide the metadata checksumming test script. Its
primary advantage over 'make check' is that it allows one to specify a
variety of different mkfs and mount options. It's also growing more
tests as a result of exercising fuse2fs.
I've tested these e2fsprogs changes against the -next branch as of
3/1. These days, I use an 8GB ramdisk and a 20T "disk" I constructed
out of dm-snapshot to test in an x64 VM. The make check tests should
pass, and most of the xfstests should pass when run against fuse2fs.
Comments and questions are, as always, welcome.
--D
* [PATCH 01/32] libext2fs: support modifying arbitrary extended attributes (v5)
From: Darrick J. Wong @ 2014-03-02 7:16 UTC
To: tytso, darrick.wong; +Cc: linux-ext4
v5: Add magic number checking to the extended attribute editing
handle; move inline data to the head of the attribute list when
writing so that inline data ends up in the inode area; and always zero
the attribute space before writing to ensure that we can delete the
last xattr.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
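For readers unfamiliar with handle magic numbers, the pattern is sketched below in self-contained form. Everything prefixed demo_ or DEMO_ is hypothetical and only stands in for the libext2fs names used in the diff. Since the check returns an error code from the calling function, that function must itself return an error code, which is consistent with the count accessor in this patch switching to an errcode_t return and an output parameter.

```c
#include <stddef.h>

#define DEMO_MAGIC_HANDLE	0x58415454L	/* hypothetical magic value */
#define DEMO_ERR_BAD_MAGIC	1L		/* hypothetical error code */

struct demo_handle {
	long magic;	/* set once by demo_open() */
	size_t count;
};

/* Leave the calling function with an error unless h is a live handle. */
#define DEMO_CHECK_MAGIC(h) \
	do { \
		if (!(h) || (h)->magic != DEMO_MAGIC_HANDLE) \
			return DEMO_ERR_BAD_MAGIC; \
	} while (0)

void demo_open(struct demo_handle *h)
{
	h->magic = DEMO_MAGIC_HANDLE;
	h->count = 0;
}

/* The count goes out through a pointer so the return value can carry
 * the result of the magic check. */
long demo_count(struct demo_handle *h, size_t *count)
{
	DEMO_CHECK_MAGIC(h);
	*count = h->count;
	return 0;
}
```

A handle that never went through demo_open() is rejected before any of its fields are trusted, and the caller's count variable is left untouched.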
debugfs/debugfs.c | 4 +++-
lib/ext2fs/ext2_err.et.in | 3 +++
lib/ext2fs/ext2fs.h | 2 +-
lib/ext2fs/ext_attr.c | 39 ++++++++++++++++++++++++++++++++++++---
4 files changed, 43 insertions(+), 5 deletions(-)
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index a3fcbf4..aded072 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -561,6 +561,7 @@ static int dump_attr(char *name, char *value, size_t value_len, void *data)
static void dump_inode_attributes(FILE *out, ext2_ino_t ino)
{
struct ext2_xattr_handle *h;
+ size_t sz;
errcode_t err;
err = ext2fs_xattrs_open(current_fs, ino, &h);
@@ -571,7 +572,8 @@ static void dump_inode_attributes(FILE *out, ext2_ino_t ino)
if (err)
goto out;
- if (ext2fs_xattrs_count(h) == 0)
+ err = ext2fs_xattrs_count(h, &sz);
+ if (err || sz == 0)
goto out;
fprintf(out, "Extended attributes:\n");
diff --git a/lib/ext2fs/ext2_err.et.in b/lib/ext2fs/ext2_err.et.in
index 0a69aa3..198a56c 100644
--- a/lib/ext2fs/ext2_err.et.in
+++ b/lib/ext2fs/ext2_err.et.in
@@ -503,4 +503,7 @@ ec EXT2_ET_EA_NO_SPACE,
ec EXT2_ET_MISSING_EA_FEATURE,
"Filesystem is missing ext_attr or inline_data feature"
+ec EXT2_ET_MAGIC_EA_HANDLE,
+ "Wrong magic number for extended attribute structure"
+
end
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index dd6404e..b1b9d3d 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1180,7 +1180,7 @@ errcode_t ext2fs_xattrs_open(ext2_filsys fs, ext2_ino_t ino,
errcode_t ext2fs_xattrs_close(struct ext2_xattr_handle **handle);
errcode_t ext2fs_free_ext_attr(ext2_filsys fs, ext2_ino_t ino,
struct ext2_inode_large *inode);
-size_t ext2fs_xattrs_count(struct ext2_xattr_handle *handle);
+errcode_t ext2fs_xattrs_count(struct ext2_xattr_handle *handle, size_t *count);
/* extent.c */
extern errcode_t ext2fs_extent_header_verify(void *ptr, int size);
diff --git a/lib/ext2fs/ext_attr.c b/lib/ext2fs/ext_attr.c
index e69275e..44d7615 100644
--- a/lib/ext2fs/ext_attr.c
+++ b/lib/ext2fs/ext_attr.c
@@ -195,6 +195,7 @@ struct ext2_xattr {
};
struct ext2_xattr_handle {
+ errcode_t magic;
ext2_filsys fs;
struct ext2_xattr *attrs;
size_t length, count;
@@ -238,6 +239,24 @@ static struct ea_name_index ea_names[] = {
{0, NULL},
};
+static void move_inline_data_to_front(struct ext2_xattr_handle *h)
+{
+ struct ext2_xattr *x;
+ struct ext2_xattr tmp;
+
+ for (x = h->attrs + 1; x < h->attrs + h->length; x++) {
+ if (!x->name)
+ continue;
+
+ if (strcmp(x->name, "system.data") == 0) {
+ memcpy(&tmp, x, sizeof(tmp));
+ memcpy(x, h->attrs, sizeof(tmp));
+ memcpy(h->attrs, &tmp, sizeof(tmp));
+ return;
+ }
+ }
+}
+
static const char *find_ea_prefix(int index)
{
struct ea_name_index *e;
@@ -412,6 +431,7 @@ static errcode_t write_xattrs_to_buffer(struct ext2_xattr_handle *handle,
unsigned int entry_size, value_size;
int idx, ret;
+ memset(entries_start, 0, storage_size);
/* For all remaining x... */
for (; x < handle->attrs + handle->length; x++) {
if (!x->name)
@@ -471,6 +491,7 @@ errcode_t ext2fs_xattrs_write(struct ext2_xattr_handle *handle)
unsigned int i;
errcode_t err;
+ EXT2_CHECK_MAGIC(handle, EXT2_ET_MAGIC_EA_HANDLE);
i = EXT2_INODE_SIZE(handle->fs->super);
if (i < sizeof(*inode))
i = sizeof(*inode);
@@ -484,6 +505,8 @@ errcode_t ext2fs_xattrs_write(struct ext2_xattr_handle *handle)
if (err)
goto out;
+ move_inline_data_to_front(handle);
+
x = handle->attrs;
/* Does the inode have size for EA? */
if (EXT2_INODE_SIZE(handle->fs->super) <= EXT2_GOOD_OLD_INODE_SIZE +
@@ -511,7 +534,7 @@ errcode_t ext2fs_xattrs_write(struct ext2_xattr_handle *handle)
write_ea_block:
/* Write the EA block */
- err = ext2fs_get_memzero(handle->fs->blocksize, &block_buf);
+ err = ext2fs_get_mem(handle->fs->blocksize, &block_buf);
if (err)
goto out;
@@ -590,6 +613,7 @@ static errcode_t read_xattrs_from_buffer(struct ext2_xattr_handle *handle,
x++;
entry = entries;
+ remain = storage_size;
while (!EXT2_EXT_IS_LAST_ENTRY(entry)) {
__u32 hash;
@@ -682,6 +706,7 @@ errcode_t ext2fs_xattrs_read(struct ext2_xattr_handle *handle)
int i;
errcode_t err;
+ EXT2_CHECK_MAGIC(handle, EXT2_ET_MAGIC_EA_HANDLE);
i = EXT2_INODE_SIZE(handle->fs->super);
if (i < sizeof(*inode))
i = sizeof(*inode);
@@ -781,6 +806,7 @@ errcode_t ext2fs_xattrs_iterate(struct ext2_xattr_handle *h,
errcode_t err;
int ret;
+ EXT2_CHECK_MAGIC(h, EXT2_ET_MAGIC_EA_HANDLE);
for (x = h->attrs; x < h->attrs + h->length; x++) {
if (!x->name)
continue;
@@ -802,6 +828,7 @@ errcode_t ext2fs_xattr_get(struct ext2_xattr_handle *h, const char *key,
void *val;
errcode_t err;
+ EXT2_CHECK_MAGIC(h, EXT2_ET_MAGIC_EA_HANDLE);
for (x = h->attrs; x < h->attrs + h->length; x++) {
if (!x->name)
continue;
@@ -829,6 +856,7 @@ errcode_t ext2fs_xattr_set(struct ext2_xattr_handle *handle,
char *new_value;
errcode_t err;
+ EXT2_CHECK_MAGIC(handle, EXT2_ET_MAGIC_EA_HANDLE);
last_empty = NULL;
for (x = handle->attrs; x < handle->attrs + handle->length; x++) {
if (!x->name) {
@@ -894,6 +922,7 @@ errcode_t ext2fs_xattr_remove(struct ext2_xattr_handle *handle,
struct ext2_xattr *x;
errcode_t err;
+ EXT2_CHECK_MAGIC(handle, EXT2_ET_MAGIC_EA_HANDLE);
for (x = handle->attrs; x < handle->attrs + handle->length; x++) {
if (!x->name)
continue;
@@ -927,6 +956,7 @@ errcode_t ext2fs_xattrs_open(ext2_filsys fs, ext2_ino_t ino,
if (err)
return err;
+ h->magic = EXT2_ET_MAGIC_EA_HANDLE;
h->length = 4;
err = ext2fs_get_arrayzero(h->length, sizeof(struct ext2_xattr),
&h->attrs);
@@ -946,6 +976,7 @@ errcode_t ext2fs_xattrs_close(struct ext2_xattr_handle **handle)
struct ext2_xattr_handle *h = *handle;
errcode_t err;
+ EXT2_CHECK_MAGIC(h, EXT2_ET_MAGIC_EA_HANDLE);
if (h->dirty) {
err = ext2fs_xattrs_write(h);
if (err)
@@ -958,7 +989,9 @@ errcode_t ext2fs_xattrs_close(struct ext2_xattr_handle **handle)
return 0;
}
-size_t ext2fs_xattrs_count(struct ext2_xattr_handle *handle)
+errcode_t ext2fs_xattrs_count(struct ext2_xattr_handle *handle, size_t *count)
{
- return handle->count;
+ EXT2_CHECK_MAGIC(handle, EXT2_ET_MAGIC_EA_HANDLE);
+ *count = handle->count;
+ return 0;
}
* [PATCH 02/32] debugfs: create commands to edit extended attributes
From: Darrick J. Wong @ 2014-03-02 7:16 UTC
To: tytso, darrick.wong; +Cc: linux-ext4
Enhance debugfs to be able to display and modify extended attributes, and
create some simple tests for the extended attribute editing functions.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
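The display code moved into xattrs.c prints a value as text only when it is "printable enough", and as hex bytes otherwise. The sketch below isolates that test: more than 7/8 of the bytes must be printable. The function name mostly_printable() is made up for this example, and the cast to unsigned char is an addition not present in the patch, since passing a negative char to isprint() is undefined behavior.

```c
#include <ctype.h>
#include <stddef.h>

/* Return 1 if more than 7/8 of the bytes are printable, else 0. */
int mostly_printable(const char *str, size_t len)
{
	size_t printable = 0, i;

	for (i = 0; i < len; i++)
		if (isprint((unsigned char)str[i]))
			printable++;

	/* Integer division, as in the patch: 8 bytes need all 8 printable. */
	return printable > len * 7 / 8;
}
```

With this rule the 108-byte test value (107 digits plus a newline) still counts as text, which matches the "\012" escape in the expect file.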
debugfs/Makefile.in | 15 ++
debugfs/debug_cmds.ct | 12 ++
debugfs/debugfs.c | 62 ---------
debugfs/debugfs.h | 3
debugfs/xattrs.c | 297 ++++++++++++++++++++++++++++++++++++++++++++
tests/d_xattr_edits/expect | 51 ++++++++
tests/d_xattr_edits/name | 1
tests/d_xattr_edits/script | 135 ++++++++++++++++++++
8 files changed, 511 insertions(+), 65 deletions(-)
create mode 100644 debugfs/xattrs.c
create mode 100644 tests/d_xattr_edits/expect
create mode 100644 tests/d_xattr_edits/name
create mode 100644 tests/d_xattr_edits/script
diff --git a/debugfs/Makefile.in b/debugfs/Makefile.in
index 5ddeab7..ce63673 100644
--- a/debugfs/Makefile.in
+++ b/debugfs/Makefile.in
@@ -18,17 +18,18 @@ MK_CMDS= _SS_DIR_OVERRIDE=../lib/ss ../lib/ss/mk_cmds
DEBUG_OBJS= debug_cmds.o debugfs.o util.o ncheck.o icheck.o ls.o \
lsdel.o dump.o set_fields.o logdump.o htree.o unused.o e2freefrag.o \
- filefrag.o extent_cmds.o extent_inode.o zap.o
+ filefrag.o extent_cmds.o extent_inode.o zap.o xattrs.o
RO_DEBUG_OBJS= ro_debug_cmds.o ro_debugfs.o util.o ncheck.o icheck.o ls.o \
lsdel.o logdump.o htree.o e2freefrag.o filefrag.o extent_cmds.o \
- extent_inode.o
+ extent_inode.o xattrs.o
SRCS= debug_cmds.c $(srcdir)/debugfs.c $(srcdir)/util.c $(srcdir)/ls.c \
$(srcdir)/ncheck.c $(srcdir)/icheck.c $(srcdir)/lsdel.c \
$(srcdir)/dump.c $(srcdir)/set_fields.c ${srcdir}/logdump.c \
$(srcdir)/htree.c $(srcdir)/unused.c ${srcdir}/../misc/e2freefrag.c \
- $(srcdir)/filefrag.c $(srcdir)/extent_inode.c $(srcdir)/zap.c
+ $(srcdir)/filefrag.c $(srcdir)/extent_inode.c $(srcdir)/zap.c \
+ $(srcdir)/xattrs.c
LIBS= $(LIBEXT2FS) $(LIBE2P) $(LIBSS) $(LIBCOM_ERR) $(LIBBLKID) \
$(LIBUUID) $(SYSLIBS)
@@ -206,3 +207,11 @@ unused.o: $(srcdir)/unused.c $(srcdir)/debugfs.h \
$(top_srcdir)/lib/et/com_err.h $(top_srcdir)/lib/ext2fs/ext2_io.h \
$(top_builddir)/lib/ext2fs/ext2_err.h \
$(top_srcdir)/lib/ext2fs/ext2_ext_attr.h $(top_srcdir)/lib/ext2fs/bitops.h
+xattrs.o: $(srcdir)/xattrs.c $(srcdir)/debugfs.h \
+ $(top_srcdir)/lib/ext2fs/ext2_fs.h $(top_builddir)/lib/ext2fs/ext2_types.h \
+ $(top_srcdir)/lib/ext2fs/ext2fs.h $(top_srcdir)/lib/ext2fs/ext3_extents.h \
+ $(top_srcdir)/lib/et/com_err.h $(top_srcdir)/lib/ext2fs/ext2_io.h \
+ $(top_builddir)/lib/ext2fs/ext2_err.h \
+ $(top_srcdir)/lib/ext2fs/ext2_ext_attr.h $(top_srcdir)/lib/ext2fs/bitops.h \
+ $(srcdir)/jfs_user.h $(top_srcdir)/lib/ext2fs/kernel-jbd.h \
+ $(top_srcdir)/lib/ext2fs/jfs_compat.h $(top_srcdir)/lib/ext2fs/kernel-list.h
diff --git a/debugfs/debug_cmds.ct b/debugfs/debug_cmds.ct
index 96ff00f..666032b 100644
--- a/debugfs/debug_cmds.ct
+++ b/debugfs/debug_cmds.ct
@@ -190,5 +190,17 @@ request do_zap_block, "Zap block: fill with 0, pattern, flip bits etc.",
request do_block_dump, "Dump contents of a block",
block_dump, bd;
+request do_list_xattr, "List extended attributes of an inode",
+ ea_list;
+
+request do_get_xattr, "Get an extended attribute of an inode",
+ ea_get;
+
+request do_set_xattr, "Set an extended attribute of an inode",
+ ea_set;
+
+request do_rm_xattr, "Remove an extended attribute of an inode",
+ ea_rm;
+
end;
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index aded072..bc435b8 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -504,27 +504,6 @@ static int list_blocks_proc(ext2_filsys fs EXT2FS_ATTR((unused)),
return 0;
}
-static void dump_xattr_string(FILE *out, const char *str, int len)
-{
- int printable = 0;
- int i;
-
- /* check: is string "printable enough?" */
- for (i = 0; i < len; i++)
- if (isprint(str[i]))
- printable++;
-
- if (printable <= len*7/8)
- printable = 0;
-
- for (i = 0; i < len; i++)
- if (printable)
- fprintf(out, isprint(str[i]) ? "%c" : "\\%03o",
- (unsigned char)str[i]);
- else
- fprintf(out, "%02x ", (unsigned char)str[i]);
-}
-
static void internal_dump_inode_extra(FILE *out,
const char *prefix EXT2FS_ATTR((unused)),
ext2_ino_t inode_num EXT2FS_ATTR((unused)),
@@ -544,47 +523,6 @@ static void internal_dump_inode_extra(FILE *out,
}
}
-/* Dump extended attributes */
-static int dump_attr(char *name, char *value, size_t value_len, void *data)
-{
- FILE *out = data;
-
- fprintf(out, " ");
- dump_xattr_string(out, name, strlen(name));
- fprintf(out, " = \"");
- dump_xattr_string(out, value, value_len);
- fprintf(out, "\" (%zu)\n", value_len);
-
- return 0;
-}
-
-static void dump_inode_attributes(FILE *out, ext2_ino_t ino)
-{
- struct ext2_xattr_handle *h;
- size_t sz;
- errcode_t err;
-
- err = ext2fs_xattrs_open(current_fs, ino, &h);
- if (err)
- return;
-
- err = ext2fs_xattrs_read(h);
- if (err)
- goto out;
-
- err = ext2fs_xattrs_count(h, &sz);
- if (err || sz == 0)
- goto out;
-
- fprintf(out, "Extended attributes:\n");
- err = ext2fs_xattrs_iterate(h, dump_attr, out);
- if (err)
- goto out;
-
-out:
- err = ext2fs_xattrs_close(&h);
-}
-
static void dump_blocks(FILE *f, const char *prefix, ext2_ino_t inode)
{
struct list_blocks_struct lb;
diff --git a/debugfs/debugfs.h b/debugfs/debugfs.h
index 33389fa..7113119 100644
--- a/debugfs/debugfs.h
+++ b/debugfs/debugfs.h
@@ -174,6 +174,9 @@ extern void do_filefrag(int argc, char *argv[]);
/* util.c */
extern time_t string_to_time(const char *arg);
+/* xattrs.c */
+void dump_inode_attributes(FILE *out, ext2_ino_t ino);
+
/* zap.c */
extern void do_zap_block(int argc, char **argv);
extern void do_block_dump(int argc, char **argv);
diff --git a/debugfs/xattrs.c b/debugfs/xattrs.c
new file mode 100644
index 0000000..0a29521
--- /dev/null
+++ b/debugfs/xattrs.c
@@ -0,0 +1,297 @@
+/*
+ * xattrs.c --- Modify extended attributes via debugfs.
+ *
+ * Copyright (C) 2014 Oracle. This file may be redistributed
+ * under the terms of the GNU Public License.
+ */
+
+#include "config.h"
+#include <stdio.h>
+#ifdef HAVE_GETOPT_H
+#include <getopt.h>
+#else
+extern int optind;
+extern char *optarg;
+#endif
+#include <ctype.h>
+
+#include "debugfs.h"
+
+/* Dump extended attributes */
+static void dump_xattr_string(FILE *out, const char *str, int len)
+{
+ int printable = 0;
+ int i;
+
+ /* check: is string "printable enough?" */
+ for (i = 0; i < len; i++)
+ if (isprint(str[i]))
+ printable++;
+
+ if (printable <= len*7/8)
+ printable = 0;
+
+ for (i = 0; i < len; i++)
+ if (printable)
+ fprintf(out, isprint(str[i]) ? "%c" : "\\%03o",
+ (unsigned char)str[i]);
+ else
+ fprintf(out, "%02x ", (unsigned char)str[i]);
+}
+
+static int dump_attr(char *name, char *value, size_t value_len, void *data)
+{
+ FILE *out = data;
+
+ fprintf(out, " ");
+ dump_xattr_string(out, name, strlen(name));
+ fprintf(out, " = \"");
+ dump_xattr_string(out, value, value_len);
+ fprintf(out, "\" (%zu)\n", value_len);
+
+ return 0;
+}
+
+void dump_inode_attributes(FILE *out, ext2_ino_t ino)
+{
+ struct ext2_xattr_handle *h;
+ size_t sz;
+ errcode_t err;
+
+ err = ext2fs_xattrs_open(current_fs, ino, &h);
+ if (err)
+ return;
+
+ err = ext2fs_xattrs_read(h);
+ if (err)
+ goto out;
+
+ err = ext2fs_xattrs_count(h, &sz);
+ if (err || sz == 0)
+ goto out;
+
+ fprintf(out, "Extended attributes:\n");
+ err = ext2fs_xattrs_iterate(h, dump_attr, out);
+ if (err)
+ goto out;
+
+out:
+ err = ext2fs_xattrs_close(&h);
+}
+
+void do_list_xattr(int argc, char **argv)
+{
+ ext2_ino_t ino;
+
+ if (argc != 2) {
+ printf("%s: Usage: %s <file>\n", argv[0],
+ argv[0]);
+ return;
+ }
+
+ if (check_fs_open(argv[0]))
+ return;
+
+ ino = string_to_inode(argv[1]);
+ if (!ino)
+ return;
+
+ dump_inode_attributes(stdout, ino);
+}
+
+void do_get_xattr(int argc, char **argv)
+{
+ ext2_ino_t ino;
+ struct ext2_xattr_handle *h;
+ FILE *fp = NULL;
+ char *buf = NULL;
+ size_t buflen;
+ int i;
+ errcode_t err;
+
+ reset_getopt();
+ while ((i = getopt(argc, argv, "f:")) != -1) {
+ switch (i) {
+ case 'f':
+ fp = fopen(optarg, "w");
+ if (fp == NULL) {
+ perror(optarg);
+ return;
+ }
+ break;
+ default:
+ printf("%s: Usage: %s <file> <attr> [-f outfile]\n",
+ argv[0], argv[0]);
+ return;
+ }
+ }
+
+ if (optind != argc - 2) {
+ printf("%s: Usage: %s <file> <attr> [-f outfile]\n", argv[0],
+ argv[0]);
+ return;
+ }
+
+ if (check_fs_open(argv[0]))
+ return;
+
+ ino = string_to_inode(argv[optind]);
+ if (!ino)
+ return;
+
+ err = ext2fs_xattrs_open(current_fs, ino, &h);
+ if (err)
+ return;
+
+ err = ext2fs_xattrs_read(h);
+ if (err)
+ goto out;
+
+ err = ext2fs_xattr_get(h, argv[optind + 1], (void **)&buf, &buflen);
+ if (err)
+ goto out;
+
+ if (fp) {
+ fwrite(buf, buflen, 1, fp);
+ fclose(fp);
+ } else {
+ dump_xattr_string(stdout, buf, buflen);
+ printf("\n");
+ }
+
+ if (buf)
+ ext2fs_free_mem(&buf);
+out:
+ ext2fs_xattrs_close(&h);
+ if (err)
+ com_err(argv[0], err, "while getting extended attribute");
+}
+
+void do_set_xattr(int argc, char **argv)
+{
+ ext2_ino_t ino;
+ struct ext2_xattr_handle *h;
+ FILE *fp = NULL;
+ char *buf = NULL;
+ size_t buflen;
+ int i;
+ errcode_t err;
+
+ reset_getopt();
+ while ((i = getopt(argc, argv, "f:")) != -1) {
+ switch (i) {
+ case 'f':
+ fp = fopen(optarg, "r");
+ if (fp == NULL) {
+ perror(optarg);
+ return;
+ }
+ break;
+ default:
+ printf("%s: Usage: %s <file> <attr> [-f infile | "
+ "value]\n", argv[0], argv[0]);
+ return;
+ }
+ }
+
+ if (optind != argc - 2 && optind != argc - 3) {
+ printf("%s: Usage: %s <file> <attr> [-f infile | value]\n",
+ argv[0], argv[0]);
+ return;
+ }
+
+ if (check_fs_open(argv[0]))
+ return;
+ if (check_fs_read_write(argv[0]))
+ return;
+ if (check_fs_bitmaps(argv[0]))
+ return;
+
+ ino = string_to_inode(argv[optind]);
+ if (!ino)
+ return;
+
+ err = ext2fs_xattrs_open(current_fs, ino, &h);
+ if (err)
+ return;
+
+ err = ext2fs_xattrs_read(h);
+ if (err)
+ goto out;
+
+ if (fp) {
+ err = ext2fs_get_mem(current_fs->blocksize, &buf);
+ if (err)
+ goto out;
+ buflen = fread(buf, 1, current_fs->blocksize, fp);
+ } else {
+ buf = argv[optind + 2];
+ buflen = strlen(argv[optind + 2]);
+ }
+
+ err = ext2fs_xattr_set(h, argv[optind + 1], buf, buflen);
+ if (err)
+ goto out;
+
+ err = ext2fs_xattrs_write(h);
+ if (err)
+ goto out;
+
+out:
+ if (fp) {
+ fclose(fp);
+ ext2fs_free_mem(&buf);
+ }
+ ext2fs_xattrs_close(&h);
+ if (err)
+ com_err(argv[0], err, "while setting extended attribute");
+}
+
+void do_rm_xattr(int argc, char **argv)
+{
+ ext2_ino_t ino;
+ struct ext2_xattr_handle *h;
+ int i;
+ errcode_t err;
+
+ if (argc < 3) {
+ printf("%s: Usage: %s <file> <attrs>...\n", argv[0], argv[0]);
+ return;
+ }
+
+ if (check_fs_open(argv[0]))
+ return;
+ if (check_fs_read_write(argv[0]))
+ return;
+ if (check_fs_bitmaps(argv[0]))
+ return;
+
+ ino = string_to_inode(argv[1]);
+ if (!ino)
+ return;
+
+ err = ext2fs_xattrs_open(current_fs, ino, &h);
+ if (err)
+ return;
+
+ err = ext2fs_xattrs_read(h);
+ if (err)
+ goto out;
+
+ for (i = 2; i < argc; i++) {
+ size_t buflen;
+ char *buf;
+
+ err = ext2fs_xattr_remove(h, argv[i]);
+ if (err)
+ goto out;
+ }
+
+ err = ext2fs_xattrs_write(h);
+ if (err)
+ goto out;
+out:
+ ext2fs_xattrs_close(&h);
+ if (err)
+ com_err(argv[0], err, "while removing extended attribute");
+}
diff --git a/tests/d_xattr_edits/expect b/tests/d_xattr_edits/expect
new file mode 100644
index 0000000..10e30c1
--- /dev/null
+++ b/tests/d_xattr_edits/expect
@@ -0,0 +1,51 @@
+debugfs edit extended attributes
+mke2fs -Fq -b 1024 test.img 512
+Exit status is 0
+ea_set / user.joe smith
+Exit status is 0
+ea_set / user.moo FEE_FIE_FOE_FUMMMMMM
+Exit status is 0
+ea_list /
+Extended attributes:
+ user.joe = "smith" (5)
+ user.moo = "FEE_FIE_FOE_FUMMMMMM" (20)
+Exit status is 0
+ea_get / user.moo
+FEE_FIE_FOE_FUMMMMMM
+Exit status is 0
+ea_get / nosuchea
+ea_get: Extended attribute key not found while getting extended attribute
+Exit status is 0
+ea_rm / user.moo
+Exit status is 0
+ea_rm / nosuchea
+ea_rm: Extended attribute key not found while removing extended attribute
+Exit status is 0
+ea_list /
+Extended attributes:
+ user.joe = "smith" (5)
+Exit status is 0
+ea_get / user.moo
+ea_get: Extended attribute key not found while getting extended attribute
+Exit status is 0
+ea_rm / user.joe
+Exit status is 0
+ea_list /
+Exit status is 0
+ea_set / user.file_based_xattr -f d_xattr_edits.tmp
+Exit status is 0
+ea_list /
+Extended attributes:
+ user.file_based_xattr = "12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567\012" (108)
+Exit status is 0
+ea_get / user.file_based_xattr -f d_xattr_edits.ver.tmp
+Exit status is 0
+Compare big attribute
+e2fsck -yf -N test_filesys
+Pass 1: Checking inodes, blocks, and sizes
+Pass 2: Checking directory structure
+Pass 3: Checking directory connectivity
+Pass 4: Checking reference counts
+Pass 5: Checking group summary information
+test_filesys: 11/64 files (0.0% non-contiguous), 29/512 blocks
+Exit status is 0
diff --git a/tests/d_xattr_edits/name b/tests/d_xattr_edits/name
new file mode 100644
index 0000000..c0c428c
--- /dev/null
+++ b/tests/d_xattr_edits/name
@@ -0,0 +1 @@
+edit extended attributes in debugfs
diff --git a/tests/d_xattr_edits/script b/tests/d_xattr_edits/script
new file mode 100644
index 0000000..1e33716
--- /dev/null
+++ b/tests/d_xattr_edits/script
@@ -0,0 +1,135 @@
+if test -x $DEBUGFS_EXE; then
+
+OUT=$test_name.log
+EXP=$test_dir/expect
+VERIFY_FSCK_OPT=-yf
+
+TEST_DATA=$test_name.tmp
+VERIFY_DATA=$test_name.ver.tmp
+
+echo "debugfs edit extended attributes" > $OUT
+
+dd if=/dev/zero of=$TMPFILE bs=1k count=512 > /dev/null 2>&1
+
+echo "mke2fs -Fq -b 1024 test.img 512" >> $OUT
+
+$MKE2FS -Fq $TMPFILE 512 > /dev/null 2>&1
+status=$?
+echo Exit status is $status >> $OUT
+
+echo "ea_set / user.joe smith" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_set / user.joe smith" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_set / user.moo FEE_FIE_FOE_FUMMMMMM" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_set / user.moo FEE_FIE_FOE_FUMMMMMM" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_list /" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_list /" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_get / user.moo" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_get / user.moo" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_get / nosuchea" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_get / nosuchea" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_rm / user.moo" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_rm / user.moo" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_rm / nosuchea" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_rm / nosuchea" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_list /" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_list /" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_get / user.moo" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_get / user.moo" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_rm / user.joe" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_rm / user.joe" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_list /" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_list /" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567" > $TEST_DATA
+echo "ea_set / user.file_based_xattr -f $TEST_DATA" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_set / user.file_based_xattr -f $TEST_DATA" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_list /" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_list /" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "ea_get / user.file_based_xattr -f $VERIFY_DATA" > $OUT.new
+$DEBUGFS -w $TMPFILE -R "ea_get / user.file_based_xattr -f $VERIFY_DATA" >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo "Compare big attribute" > $OUT.new
+diff -u $TEST_DATA $VERIFY_DATA >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+echo e2fsck $VERIFY_FSCK_OPT -N test_filesys > $OUT.new
+$FSCK $VERIFY_FSCK_OPT -N test_filesys $TMPFILE >> $OUT.new 2>&1
+status=$?
+echo Exit status is $status >> $OUT.new
+sed -f $cmd_dir/filter.sed $OUT.new >> $OUT
+
+#
+# Do the verification
+#
+
+rm -f $TMPFILE $OUT.new
+cmp -s $OUT $EXP
+status=$?
+
+if [ "$status" = 0 ] ; then
+ echo "$test_name: $test_description: ok"
+ touch $test_name.ok
+else
+ echo "$test_name: $test_description: failed"
+ diff $DIFF_OPTS $EXP $OUT > $test_name.failed
+fi
+
+unset VERIFY_FSCK_OPT NATIVE_FSCK_OPT OUT EXP TEST_DATA VERIFY_DATA
+
+else #if test -x $DEBUGFS_EXE; then
+ echo "$test_name: $test_description: skipped"
+fi
* [PATCH 03/32] libext2fs: fix 64bit overflow in ext2fs_block_alloc_stats_range
From: Darrick J. Wong @ 2014-03-02 7:16 UTC
To: tytso, darrick.wong; +Cc: linux-ext4
In ext2fs_block_alloc_stats_range(), the quantity "-inuse * n" is
calculated as a 32-bit quantity because 'n' is declared as blk_t, an
unsigned 32-bit type. With gcc (4.6.3 on Ubuntu 12.04) the result is
not sign-extended to fill the blk64_t parameter that
ext2fs_free_blocks_count_add() wants, so the end result is that the
superblock gets a ridiculously huge free block count.
Changing the declaration of 'n' to blk64_t seems to fix this.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
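The arithmetic can be reproduced outside libext2fs. The sketch below assumes a 32-bit int and uses uint32_t and uint64_t as stand-ins for blk_t and blk64_t. The two adjust_free_*() helpers are hypothetical and only mimic adding "-inuse * n" to a 64-bit free block counter. Multiplying an int by a 32-bit unsigned value gives a 32-bit unsigned result, which is zero-extended when widened.

```c
#include <stdint.h>

/* n is 32-bit unsigned: -inuse is converted to unsigned int, so the
 * product wraps at 2^32 and is then zero-extended to 64 bits. */
uint64_t adjust_free_32(uint64_t free_blocks, int inuse, uint32_t n)
{
	return free_blocks + (-inuse * n);
}

/* n is 64-bit unsigned: -inuse is converted to a 64-bit value, so the
 * product wraps at 2^64 and the addition behaves like a subtraction. */
uint64_t adjust_free_64(uint64_t free_blocks, int inuse, uint64_t n)
{
	return free_blocks + (-inuse * n);
}
```

Starting from 1000 free blocks and marking 8 blocks in use, adjust_free_32(1000, 1, 8) yields 4294968288 while adjust_free_64(1000, 1, 8) yields the intended 992.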
lib/ext2fs/alloc_stats.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/ext2fs/alloc_stats.c b/lib/ext2fs/alloc_stats.c
index 5bb86ef..4feb24d 100644
--- a/lib/ext2fs/alloc_stats.c
+++ b/lib/ext2fs/alloc_stats.c
@@ -129,7 +129,7 @@ void ext2fs_block_alloc_stats_range(ext2_filsys fs, blk64_t blk,
while (num) {
int group = ext2fs_group_of_blk2(fs, blk);
blk64_t last_blk = ext2fs_group_last_block2(fs, group);
- blk_t n = num;
+ blk64_t n = num;
if (blk + num > last_blk)
n = last_blk - blk + 1;
* [PATCH 04/32] misc: fix header complaints and resource leaks in e2fsprogs
From: Darrick J. Wong @ 2014-03-02 7:17 UTC
To: tytso, darrick.wong; +Cc: linux-ext4
Fix a few minor bugs that cppcheck complained about.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
debugfs/debugfs.c | 1 +
debugfs/util.c | 2 +-
e2fsck/unix.c | 1 +
lib/ext2fs/icount.c | 2 ++
util/subst.c | 3 +++
5 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index bc435b8..f0c5373 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -669,6 +669,7 @@ static void dump_extents(FILE *f, const char *prefix, ext2_ino_t ino,
}
if (printed)
fprintf(f, "\n");
+ ext2fs_extent_free(handle);
}
void internal_dump_inode(FILE *out, const char *prefix,
diff --git a/debugfs/util.c b/debugfs/util.c
index 9ddfe0b..5cc4e22 100644
--- a/debugfs/util.c
+++ b/debugfs/util.c
@@ -201,7 +201,7 @@ char *time_to_string(__u32 cl)
tz = ss_safe_getenv("TZ");
if (!tz)
tz = "";
- do_gmt = !strcmp(tz, "GMT") | !strcmp(tz, "GMT0");
+ do_gmt = !strcmp(tz, "GMT") || !strcmp(tz, "GMT0");
}
return asctime((do_gmt) ? gmtime(&t) : localtime(&t));
diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index 429f1cd..f73a252 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -1016,6 +1016,7 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
strcat(newpath, oldpath);
}
putenv(newpath);
+ free(newpath);
}
#ifdef CONFIG_JBD_DEBUG
jbd_debug = getenv("E2FSCK_JBD_DEBUG");
diff --git a/lib/ext2fs/icount.c b/lib/ext2fs/icount.c
index a3b20f0..7d1b3d5 100644
--- a/lib/ext2fs/icount.c
+++ b/lib/ext2fs/icount.c
@@ -198,6 +198,7 @@ errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
fd = mkstemp(fn);
if (fd < 0) {
retval = errno;
+ ext2fs_free_mem(&fn);
goto errout;
}
umask(save_umask);
@@ -216,6 +217,7 @@ errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
close(fd);
if (icount->tdb == NULL) {
retval = errno;
+ ext2fs_free_mem(&fn);
goto errout;
}
*ret = icount;
diff --git a/util/subst.c b/util/subst.c
index 6a5eab1..602546c 100644
--- a/util/subst.c
+++ b/util/subst.c
@@ -17,6 +17,9 @@
#include <fcntl.h>
#include <time.h>
#include <utime.h>
+#ifdef HAVE_SYS_TIME_H
+#include <sys/time.h>
+#endif
#ifdef HAVE_GETOPT_H
#include <getopt.h>
* [PATCH 05/32] libext2fs: fix memory leak when drastically shrinking extent tree depth
From: Darrick J. Wong @ 2014-03-02 7:17 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
In ext2fs_extent_free(), h(andle)->max_depth is used as a loop
conditional variable to free all the h->path[].buf pointers. However,
ext2fs_extent_delete() sets max_depth = 0 if we've removed everything
from the extent tree, which causes a subsequent _free() to leak some
buf pointers. max_depth can be re-incremented when splitting extent
nodes, but there's no guarantee that it'll reach the old value before
the free.
Therefore, remember the size of h->path[] separately, and use that
when freeing the extent handle.
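
The idea can be sketched outside of libext2fs as follows. The structures
and function names here are hypothetical, pared-down stand-ins for the
extent handle; only the max_depth and max_paths field names are taken
from the patch. The point is that the free loop is bounded by the
allocation size, so it still releases every buffer after max_depth has
been reset to zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the extent handle structures. */
struct path_entry {
	char *buf;
};

struct handle {
	int max_depth;		/* current tree depth; may shrink to 0 */
	int max_paths;		/* number of entries allocated in path[] */
	struct path_entry *path;
};

/* Allocate path[] for a tree of the given depth, plus a buffer per level. */
static int handle_open(struct handle *h, int depth)
{
	int i;

	h->max_depth = depth;
	h->max_paths = depth + 1;
	h->path = calloc(h->max_paths, sizeof(*h->path));
	if (!h->path)
		return -1;
	/* Level 0 lives inside the inode in the real code, so skip it. */
	for (i = 1; i < h->max_paths; i++)
		h->path[i].buf = malloc(16);
	return 0;
}

/* Free every allocated level; returns how many buffers were released. */
static int handle_free(struct handle *h)
{
	int i, freed = 0;

	for (i = 1; i < h->max_paths; i++) {
		if (h->path[i].buf) {
			free(h->path[i].buf);
			freed++;
		}
	}
	free(h->path);
	h->path = NULL;
	return freed;
}
```

Had handle_free() looped up to max_depth instead, a handle whose depth
dropped from 3 to 0 would have released nothing.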
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
lib/ext2fs/extent.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/lib/ext2fs/extent.c b/lib/ext2fs/extent.c
index 3ccae66..f27344e 100644
--- a/lib/ext2fs/extent.c
+++ b/lib/ext2fs/extent.c
@@ -58,6 +58,7 @@ struct ext2_extent_handle {
int type;
int level;
int max_depth;
+ int max_paths;
struct extent_path *path;
};
@@ -168,7 +169,7 @@ void ext2fs_extent_free(ext2_extent_handle_t handle)
return;
if (handle->path) {
- for (i=1; i <= handle->max_depth; i++) {
+ for (i = 1; i < handle->max_paths; i++) {
if (handle->path[i].buf)
ext2fs_free_mem(&handle->path[i].buf);
}
@@ -242,11 +243,10 @@ errcode_t ext2fs_extent_open2(ext2_filsys fs, ext2_ino_t ino,
handle->max_depth = ext2fs_le16_to_cpu(eh->eh_depth);
handle->type = ext2fs_le16_to_cpu(eh->eh_magic);
- retval = ext2fs_get_mem(((handle->max_depth+1) *
- sizeof(struct extent_path)),
- &handle->path);
- memset(handle->path, 0,
- (handle->max_depth+1) * sizeof(struct extent_path));
+ handle->max_paths = handle->max_depth + 1;
+ retval = ext2fs_get_memzero(handle->max_paths *
+ sizeof(struct extent_path),
+ &handle->path);
handle->path[0].buf = (char *) handle->inode->i_block;
handle->path[0].left = handle->path[0].entries =
@@ -912,13 +912,11 @@ errcode_t ext2fs_extent_node_split(ext2_extent_handle_t handle)
if (handle->level == 0) {
new_root = 1;
tocopy = ext2fs_le16_to_cpu(eh->eh_entries);
- retval = ext2fs_get_mem(((handle->max_depth+2) *
- sizeof(struct extent_path)),
- &newpath);
+ retval = ext2fs_get_memzero((handle->max_paths + 1) *
+ sizeof(struct extent_path),
+ &newpath);
if (retval)
goto done;
- memset(newpath, 0,
- ((handle->max_depth+2) * sizeof(struct extent_path)));
} else {
tocopy = ext2fs_le16_to_cpu(eh->eh_entries) / 2;
}
@@ -996,13 +994,14 @@ errcode_t ext2fs_extent_node_split(ext2_extent_handle_t handle)
/* current path now has fewer active entries, we copied some out */
if (handle->level == 0) {
memcpy(newpath, path,
- sizeof(struct extent_path) * (handle->max_depth+1));
+ sizeof(struct extent_path) * handle->max_paths);
handle->path = newpath;
newpath = path;
path = handle->path;
path->entries = 1;
path->left = path->max_entries - 1;
handle->max_depth++;
+ handle->max_paths++;
eh->eh_depth = ext2fs_cpu_to_le16(handle->max_depth);
} else {
path->entries -= tocopy;
* [PATCH 06/32] libext2fs: fix parents when modifying extents
From: Darrick J. Wong @ 2014-03-02 7:17 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
In ext2fs_extent_set_bmap() and ext2fs_punch_extent(), fix the parents
when altering either end of an extent so that the parent nodes reflect
the added mapping.
There's a slight complication to using fix_parents: if there are two
mappings to an lblk in the tree, the value of handle->path->curr can
point to either extent afterwards, which is documented in a comment.
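
For readers unfamiliar with what "fixing the parents" means, the
following toy model may help. It is not libext2fs code: the structures
and the toy_fix_parents() name are made up for illustration, and it only
models the start block of each entry along one path from root to leaf.
When the current leaf entry is the first in its node, its start block is
copied into the parent index entry, and so on upwards for as long as
each node is itself reached through its parent's first entry.

```c
#include <assert.h>

#define MAX_LEVELS 4
#define MAX_ENTRIES 8

/* Hypothetical toy model: one node per level along the current path. */
struct level {
	unsigned int start[MAX_ENTRIES];	/* first lblk of each entry */
	int curr;				/* entry the path goes through */
};

struct toy_path {
	int depth;				/* index of the leaf level */
	struct level lvl[MAX_LEVELS];
};

/*
 * Propagate the start block of the current leaf entry into the parent
 * index entries.  Stops as soon as the current entry is not the first
 * one in its node, or a parent already carries the right value.
 * Returns the number of parent entries rewritten.
 */
static int toy_fix_parents(struct toy_path *p)
{
	int l = p->depth;
	int fixed = 0;
	unsigned int start = p->lvl[l].start[p->lvl[l].curr];

	while (l > 0 && p->lvl[l].curr == 0) {
		struct level *parent = &p->lvl[--l];

		if (parent->start[parent->curr] == start)
			break;
		parent->start[parent->curr] = start;
		fixed++;
	}
	return fixed;
}
```

The toy leaves the position untouched, so it does not reproduce the
repositioning subtlety mentioned above; it only shows which index
entries get rewritten.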
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
lib/ext2fs/extent.c | 30 ++++++++++++++++++++++++------
lib/ext2fs/punch.c | 14 ++++++++++----
2 files changed, 34 insertions(+), 10 deletions(-)
diff --git a/lib/ext2fs/extent.c b/lib/ext2fs/extent.c
index f27344e..80ce88f 100644
--- a/lib/ext2fs/extent.c
+++ b/lib/ext2fs/extent.c
@@ -720,7 +720,14 @@ errcode_t ext2fs_extent_goto(ext2_extent_handle_t handle,
* and so on.
*
* Safe to call for any position in node; if not at the first entry,
- * will simply return.
+ * it will simply return.
+ *
+ * Note a subtlety of this function -- if there happen to be two extents
+ * mapping the same lblk and someone calls fix_parents on the second of the two
+ * extents, the position of the extent handle after the call will be the second
+ * extent if nothing happened, or the first extent if something did. A caller
+ * in this situation must use ext2fs_extent_goto() after calling this function.
+ * Or simply don't map the same lblk with two extents, ever.
*/
errcode_t ext2fs_extent_fix_parents(ext2_extent_handle_t handle)
{
@@ -1379,17 +1386,25 @@ errcode_t ext2fs_extent_set_bmap(ext2_extent_handle_t handle,
&next_extent);
if (retval)
goto done;
- retval = ext2fs_extent_fix_parents(handle);
- if (retval)
- goto done;
} else
retval = ext2fs_extent_insert(handle,
EXT2_EXTENT_INSERT_AFTER, &newextent);
if (retval)
goto done;
- /* Now pointing at inserted extent; move back to prev */
+ retval = ext2fs_extent_fix_parents(handle);
+ if (retval)
+ goto done;
+ /*
+ * Now pointing at inserted extent; move back to prev.
+ *
+ * We cannot use EXT2_EXTENT_PREV to go back; note the
+ * subtlety in the comment for fix_parents().
+ */
+ retval = ext2fs_extent_goto(handle, logical);
+ if (retval)
+ goto done;
retval = ext2fs_extent_get(handle,
- EXT2_EXTENT_PREV_LEAF,
+ EXT2_EXTENT_CURRENT,
&extent);
if (retval)
goto done;
@@ -1422,6 +1437,9 @@ errcode_t ext2fs_extent_set_bmap(ext2_extent_handle_t handle,
0, &newextent);
if (retval)
goto done;
+ retval = ext2fs_extent_fix_parents(handle);
+ if (retval)
+ goto done;
retval = ext2fs_extent_get(handle,
EXT2_EXTENT_NEXT_LEAF,
&extent);
diff --git a/lib/ext2fs/punch.c b/lib/ext2fs/punch.c
index a3d020e..2a2cf10 100644
--- a/lib/ext2fs/punch.c
+++ b/lib/ext2fs/punch.c
@@ -343,10 +343,16 @@ static errcode_t ext2fs_punch_extent(ext2_filsys fs, ext2_ino_t ino,
EXT2_EXTENT_INSERT_AFTER, &newex);
if (retval)
goto errout;
- /* Now pointing at inserted extent; so go back */
- retval = ext2fs_extent_get(handle,
- EXT2_EXTENT_PREV_LEAF,
- &newex);
+ retval = ext2fs_extent_fix_parents(handle);
+ if (retval)
+ goto errout;
+ /*
+ * Now pointing at inserted extent; so go back.
+ *
+ * We cannot use EXT2_EXTENT_PREV to go back; note the
+ * subtlety in the comment for fix_parents().
+ */
+ retval = ext2fs_extent_goto(handle, extent.e_lblk);
if (retval)
goto errout;
}
* [PATCH 07/32] e2fsck: fix inline_data flag errors in pass1
From: Darrick J. Wong @ 2014-03-02 7:17 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
In pass1, check for a file with inline_data set on a filesystem that
doesn't support it.
If we decide to clear the inline_data flag on the inode and the inode
doesn't use extents, write the inode out to disk so that
block_iterate3 doesn't see the inline_data flag when it re-reads the
inode, thereby aborting e2fsck.
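
Reduced to its essentials, the check looks roughly like the sketch
below. The structures, the function name and the user_says_fix parameter
are hypothetical, and the two constants are illustrative values that
should be treated as assumptions rather than as the authoritative
definitions. The return value plays the role of the dirty flag: when it
is nonzero, the inode has to be written back before anything re-reads it
from disk.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values; treat them as assumptions, not authoritative. */
#define INLINE_DATA_FL			0x10000000u
#define FEATURE_INCOMPAT_INLINE_DATA	0x8000u

/* Hypothetical, pared-down inode and superblock. */
struct toy_inode { uint32_t i_flags; };
struct toy_super { uint32_t s_feature_incompat; };

/*
 * Clear the inline data flag if the filesystem lacks the feature and
 * the user agreed to the fix.  Returns 1 when the inode was modified
 * and must be written back, 0 otherwise.
 */
static int check_inline_data_flag(const struct toy_super *sb,
				  struct toy_inode *inode, int user_says_fix)
{
	if (!(inode->i_flags & INLINE_DATA_FL))
		return 0;
	if (sb->s_feature_incompat & FEATURE_INCOMPAT_INLINE_DATA)
		return 0;
	if (!user_says_fix)
		return 0;
	inode->i_flags &= ~INLINE_DATA_FL;
	return 1;
}
```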
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
e2fsck/pass1.c | 14 ++++++++++++++
e2fsck/problem.c | 4 ++++
e2fsck/problem.h | 4 ++++
tests/f_bad_disconnected_inode/expect.1 | 9 +++++++++
4 files changed, 31 insertions(+)
diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 7554f4e..cf84db6 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -2163,6 +2163,16 @@ static void check_blocks(e2fsck_t ctx, struct problem_context *pctx,
}
}
+ if (inode->i_flags & EXT4_INLINE_DATA_FL) {
+ if (!(fs->super->s_feature_incompat &
+ EXT4_FEATURE_INCOMPAT_INLINE_DATA)) {
+ if (fix_problem(ctx, PR_1_INLINE_DATA_SET, pctx)) {
+ inode->i_flags &= ~EXT4_INLINE_DATA_FL;
+ dirty_inode++;
+ }
+ }
+ }
+
if (ext2fs_file_acl_block(fs, inode) &&
check_ext_attr(ctx, pctx, block_buf)) {
if (ctx->flags & E2F_FLAG_SIGNAL_MASK)
@@ -2174,6 +2184,10 @@ static void check_blocks(e2fsck_t ctx, struct problem_context *pctx,
if (extent_fs && (inode->i_flags & EXT4_EXTENTS_FL))
check_blocks_extents(ctx, pctx, &pb);
else {
+ if (dirty_inode)
+ e2fsck_write_inode(ctx, ino, inode,
+ "check_blocks");
+ dirty_inode = 0;
pctx->errcode = ext2fs_block_iterate3(fs, ino,
pb.is_dir ? BLOCK_FLAG_HOLE : 0,
block_buf, process_block, &pb);
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index be9d3ec..7d9cfd6 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -1020,6 +1020,10 @@ static struct e2fsck_problem problem_table[] = {
N_("@i %i, end of extent exceeds allowed value\n\t(logical @b %c, physical @b %b, len %N)\n"),
PROMPT_CLEAR, 0 },
+ /* Inline data flag set on filesystem without inline data support */
+ { PR_1_INLINE_DATA_SET,
+ N_("@i %i has inline data flag set on @f without inline data support.\n"),
+ PROMPT_CLEAR, 0 },
/* Pass 1b errors */
diff --git a/e2fsck/problem.h b/e2fsck/problem.h
index 8999a64..3304caa 100644
--- a/e2fsck/problem.h
+++ b/e2fsck/problem.h
@@ -593,6 +593,10 @@ struct problem_context {
#define PR_1_EXTENT_INDEX_START_INVALID 0x01006D
#define PR_1_EXTENT_END_OUT_OF_BOUNDS 0x01006E
+
+/* Inline data flag set on filesystem without inline data support */
+#define PR_1_INLINE_DATA_SET 0x01006F
+
/*
* Pass 1b errors
*/
diff --git a/tests/f_bad_disconnected_inode/expect.1 b/tests/f_bad_disconnected_inode/expect.1
index 11862f6..dec22e1 100644
--- a/tests/f_bad_disconnected_inode/expect.1
+++ b/tests/f_bad_disconnected_inode/expect.1
@@ -2,12 +2,21 @@ Pass 1: Checking inodes, blocks, and sizes
Inode 1 has EXTENTS_FL flag set on filesystem without extents support.
Clear? yes
+Inode 9 has inline data flag set on filesystem without inline data support.
+Clear? yes
+
+Inode 10 has inline data flag set on filesystem without inline data support.
+Clear? yes
+
Inode 15 has EXTENTS_FL flag set on filesystem without extents support.
Clear? yes
Inode 16 has EXTENTS_FL flag set on filesystem without extents support.
Clear? yes
+Inode 14 has inline data flag set on filesystem without inline data support.
+Clear? yes
+
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found. Create? yes
* [PATCH 08/32] e2fsck: print runs of duplicate blocks instead of all of them
From: Darrick J. Wong @ 2014-03-02 7:17 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
When pass1 finds blocks that are mapped to multiple files, it will
print every duplicated block. If there are long sequences of
duplicate blocks (e.g. the e_pblk field is wrong in an extent), this
can cause a gigantic flood of output when a range could convey the
same information. Therefore, teach pass1b to print ranges when
possible.
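
The coalescing itself can be sketched independently of e2fsck. The
function below is a hypothetical standalone helper, not the pass1b code:
it takes the duplicate block numbers in the order they are encountered
and collapses each run of consecutive blocks into "first--last", with a
leading space before every item as in the new message format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format block numbers into out, collapsing runs of consecutive blocks
 * into "first--last".  Each item is preceded by a space.  The caller is
 * expected to supply a buffer large enough for the whole text; once the
 * buffer is full, further items are dropped.
 */
static size_t format_dup_blocks(const unsigned long long *blks, size_t n,
				char *out, size_t outlen)
{
	size_t i, used = 0;
	unsigned long long first = 0, last = 0;

	if (outlen)
		out[0] = '\0';
	for (i = 0; i <= n; i++) {
		if (i < n && i > 0 && blks[i] == last + 1) {
			last = blks[i];		/* extend the current run */
			continue;
		}
		if (i > 0 && used < outlen) {	/* flush the finished run */
			if (first == last)
				used += snprintf(out + used, outlen - used,
						 " %llu", first);
			else
				used += snprintf(out + used, outlen - used,
						 " %llu--%llu", first, last);
		}
		if (i < n)
			first = last = blks[i];	/* start a new run */
	}
	return used;
}
```

Fed the sequence 3 4 6 1 from the f_dupfsblks test, this yields
" 3--4 6 1", which matches the updated expected output.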
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
e2fsck/pass1b.c | 23 +++++++++++++++++++++--
e2fsck/problem.c | 5 +++++
e2fsck/problem.h | 3 +++
tests/f_bbfile/expect.1 | 4 ++--
tests/f_dup/expect.1 | 4 ++--
tests/f_dup2/expect.1 | 6 +++---
tests/f_dup_ba/expect.1 | 12 ++++++------
tests/f_dup_resize/expect.1 | 4 ++--
tests/f_dupfsblks/expect.1 | 4 ++--
tests/f_dupsuper/expect.1 | 2 +-
10 files changed, 47 insertions(+), 20 deletions(-)
diff --git a/e2fsck/pass1b.c b/e2fsck/pass1b.c
index 41a82cf..d7c5e55 100644
--- a/e2fsck/pass1b.c
+++ b/e2fsck/pass1b.c
@@ -262,6 +262,7 @@ struct process_block_struct {
ext2_ino_t ino;
int dup_blocks;
blk64_t cur_cluster;
+ blk64_t last_blk;
struct ext2_inode *inode;
struct problem_context *pctx;
};
@@ -274,6 +275,7 @@ static void pass1b(e2fsck_t ctx, char *block_buf)
ext2_inode_scan scan;
struct process_block_struct pb;
struct problem_context pctx;
+ problem_t op;
clear_problem_context(&pctx);
@@ -314,6 +316,8 @@ static void pass1b(e2fsck_t ctx, char *block_buf)
pb.dup_blocks = 0;
pb.inode = &inode;
pb.cur_cluster = ~0;
+ pb.last_blk = 0;
+ pb.pctx->blk = pb.pctx->blk2 = 0;
if (ext2fs_inode_has_valid_blocks2(fs, &inode) ||
(ino == EXT2_BAD_INO))
@@ -329,6 +333,11 @@ static void pass1b(e2fsck_t ctx, char *block_buf)
ext2fs_file_acl_block_set(fs, &inode, blk);
}
if (pb.dup_blocks) {
+ if (ino != EXT2_BAD_INO) {
+ op = pctx.blk == pctx.blk2 ?
+ PR_1B_DUP_BLOCK : PR_1B_DUP_RANGE;
+ fix_problem(ctx, op, pb.pctx);
+ }
end_problem_latch(ctx, PR_LATCH_DBLOCK);
if (ino >= EXT2_FIRST_INODE(fs->super) ||
ino == EXT2_ROOT_INO)
@@ -351,6 +360,7 @@ static int process_pass1b_block(ext2_filsys fs EXT2FS_ATTR((unused)),
struct process_block_struct *p;
e2fsck_t ctx;
blk64_t lc;
+ problem_t op;
if (HOLE_BLKADDR(*block_nr))
return 0;
@@ -363,8 +373,17 @@ static int process_pass1b_block(ext2_filsys fs EXT2FS_ATTR((unused)),
/* OK, this is a duplicate block */
if (p->ino != EXT2_BAD_INO) {
- p->pctx->blk = *block_nr;
- fix_problem(ctx, PR_1B_DUP_BLOCK, p->pctx);
+ if (p->last_blk + 1 != *block_nr) {
+ if (p->last_blk) {
+ op = p->pctx->blk == p->pctx->blk2 ?
+ PR_1B_DUP_BLOCK :
+ PR_1B_DUP_RANGE;
+ fix_problem(ctx, op, p->pctx);
+ }
+ p->pctx->blk = *block_nr;
+ }
+ p->pctx->blk2 = *block_nr;
+ p->last_blk = *block_nr;
}
p->dup_blocks++;
ext2fs_mark_inode_bitmap2(inode_dup_map, p->ino);
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index 7d9cfd6..f4c1363 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -1068,6 +1068,11 @@ static struct e2fsck_problem problem_table[] = {
N_("Error adjusting refcount for @a @b %b (@i %i): %m\n"),
PROMPT_NONE, 0 },
+ /* Duplicate/bad block range in inode */
+ { PR_1B_DUP_RANGE,
+ " %b--%c",
+ PROMPT_NONE, PR_LATCH_DBLOCK | PR_PREEN_NOHDR },
+
/* Pass 1C: Scan directories for inodes with multiply-claimed blocks. */
{ PR_1C_PASS_HEADER,
N_("Pass 1C: Scanning directories for @is with @m @bs\n"),
diff --git a/e2fsck/problem.h b/e2fsck/problem.h
index 3304caa..62f9032 100644
--- a/e2fsck/problem.h
+++ b/e2fsck/problem.h
@@ -625,6 +625,9 @@ struct problem_context {
/* Error adjusting EA refcount */
#define PR_1B_ADJ_EA_REFCOUNT 0x011007
+/* Duplicate/bad block range in inode */
+#define PR_1B_DUP_RANGE 0x011008
+
/* Pass 1C: Scan directories for inodes with dup blocks. */
#define PR_1C_PASS_HEADER 0x012000
diff --git a/tests/f_bbfile/expect.1 b/tests/f_bbfile/expect.1
index 1d639f6..ec1a36e 100644
--- a/tests/f_bbfile/expect.1
+++ b/tests/f_bbfile/expect.1
@@ -8,8 +8,8 @@ Relocating group 0's inode bitmap from 4 to 43...
Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
Multiply-claimed block(s) in inode 2: 21
-Multiply-claimed block(s) in inode 11: 9 10 11 12 13 14 15 16 17 18 19 20
-Multiply-claimed block(s) in inode 12: 25 26
+Multiply-claimed block(s) in inode 11: 9--20
+Multiply-claimed block(s) in inode 12: 25--26
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
(There are 3 inodes containing multiply-claimed blocks.)
diff --git a/tests/f_dup/expect.1 b/tests/f_dup/expect.1
index e7128f3..075e62c 100644
--- a/tests/f_dup/expect.1
+++ b/tests/f_dup/expect.1
@@ -4,8 +4,8 @@ Pass 1: Checking inodes, blocks, and sizes
Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
-Multiply-claimed block(s) in inode 12: 25 26
-Multiply-claimed block(s) in inode 13: 25 26
+Multiply-claimed block(s) in inode 12: 25--26
+Multiply-claimed block(s) in inode 13: 25--26
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
(There are 2 inodes containing multiply-claimed blocks.)
diff --git a/tests/f_dup2/expect.1 b/tests/f_dup2/expect.1
index 0476005..69aa21b 100644
--- a/tests/f_dup2/expect.1
+++ b/tests/f_dup2/expect.1
@@ -4,9 +4,9 @@ Pass 1: Checking inodes, blocks, and sizes
Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
-Multiply-claimed block(s) in inode 12: 25 26
-Multiply-claimed block(s) in inode 13: 25 26 57 58
-Multiply-claimed block(s) in inode 14: 57 58
+Multiply-claimed block(s) in inode 12: 25--26
+Multiply-claimed block(s) in inode 13: 25--26 57--58
+Multiply-claimed block(s) in inode 14: 57--58
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
(There are 3 inodes containing multiply-claimed blocks.)
diff --git a/tests/f_dup_ba/expect.1 b/tests/f_dup_ba/expect.1
index f0ad457..f4581c4 100644
--- a/tests/f_dup_ba/expect.1
+++ b/tests/f_dup_ba/expect.1
@@ -6,12 +6,12 @@ Inode 16, i_blocks is 128, should be 896. Fix? yes
Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
-Multiply-claimed block(s) in inode 16: 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239
-Multiply-claimed block(s) in inode 17: 160 161
-Multiply-claimed block(s) in inode 18: 176 177
-Multiply-claimed block(s) in inode 19: 192 193
-Multiply-claimed block(s) in inode 20: 208 209
-Multiply-claimed block(s) in inode 21: 224 225
+Multiply-claimed block(s) in inode 16: 160--239
+Multiply-claimed block(s) in inode 17: 160--161
+Multiply-claimed block(s) in inode 18: 176--177
+Multiply-claimed block(s) in inode 19: 192--193
+Multiply-claimed block(s) in inode 20: 208--209
+Multiply-claimed block(s) in inode 21: 224--225
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
(There are 6 inodes containing multiply-claimed blocks.)
diff --git a/tests/f_dup_resize/expect.1 b/tests/f_dup_resize/expect.1
index dd8fe05..aaf7769 100644
--- a/tests/f_dup_resize/expect.1
+++ b/tests/f_dup_resize/expect.1
@@ -4,8 +4,8 @@ Pass 1: Checking inodes, blocks, and sizes
Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
-Multiply-claimed block(s) in inode 7: 4 5 6 7
-Multiply-claimed block(s) in inode 12: 4 5 6 7
+Multiply-claimed block(s) in inode 7: 4--7
+Multiply-claimed block(s) in inode 12: 4--7
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
(There are 1 inodes containing multiply-claimed blocks.)
diff --git a/tests/f_dupfsblks/expect.1 b/tests/f_dupfsblks/expect.1
index 3f70109..6751986 100644
--- a/tests/f_dupfsblks/expect.1
+++ b/tests/f_dupfsblks/expect.1
@@ -8,8 +8,8 @@ Inode 13, i_size is 0, should be 2048. Fix? yes
Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
-Multiply-claimed block(s) in inode 12: 3 4 6 1
-Multiply-claimed block(s) in inode 13: 2 3
+Multiply-claimed block(s) in inode 12: 3--4 6 1
+Multiply-claimed block(s) in inode 13: 2--3
Multiply-claimed block(s) in inode 14: 2
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
diff --git a/tests/f_dupsuper/expect.1 b/tests/f_dupsuper/expect.1
index 830370a..2107e2d 100644
--- a/tests/f_dupsuper/expect.1
+++ b/tests/f_dupsuper/expect.1
@@ -4,7 +4,7 @@ Pass 1: Checking inodes, blocks, and sizes
Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
-Multiply-claimed block(s) in inode 12: 2 3 1
+Multiply-claimed block(s) in inode 12: 2--3 1
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
(There are 1 inodes containing multiply-claimed blocks.)
* [PATCH 09/32] e2fsck: verify checksums after checking everything else
From: Darrick J. Wong @ 2014-03-02 7:17 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
There's a particular problem with e2fsck's user interface where
checksum errors are concerned: Fixing the first complaint about
a checksum problem results in the inode being cleared even if e2fsck
could otherwise have recovered it. While this mode is useful for
cleaning the remaining broken crud off the filesystem, we could at
least default to checking everything /else/ and only complaining about
the incorrect checksum if fsck finds nothing else wrong.
So, plumb in a config option. We default to "verify and checksum"
unless the user tells us otherwise.
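
The gating logic amounts to a single early return, which can be
sketched in isolation as below. The two constants use the values that
appear in this patch; the should_offer_fix() function is a hypothetical
reduction for illustration and does not exist in e2fsck.

```c
#include <assert.h>

/* Values as they appear in this patch. */
#define E2F_OPT_CSUM_FIRST	0x4000
#define PR_INITIAL_CSUM		0x200000

/*
 * Hypothetical reduction of the new early return in fix_problem():
 * returns 1 if a problem with these flags should be offered for fixing
 * right away, 0 if it should be skipped because the user wants the
 * checksum verified only after all other checks.
 */
static int should_offer_fix(int problem_flags, int ctx_options)
{
	if ((problem_flags & PR_INITIAL_CSUM) &&
	    !(ctx_options & E2F_OPT_CSUM_FIRST))
		return 0;
	return 1;
}
```

Problems that do not carry PR_INITIAL_CSUM are unaffected by the option.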
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
e2fsck/e2fsck.8.in | 12 ++++++++++++
e2fsck/e2fsck.conf.5.in | 20 ++++++++++++++++++++
e2fsck/e2fsck.h | 1 +
e2fsck/problem.c | 18 ++++++++++++++----
e2fsck/problemP.h | 1 +
e2fsck/unix.c | 11 +++++++++++
6 files changed, 59 insertions(+), 4 deletions(-)
diff --git a/e2fsck/e2fsck.8.in b/e2fsck/e2fsck.8.in
index f5ed758..43ee063 100644
--- a/e2fsck/e2fsck.8.in
+++ b/e2fsck/e2fsck.8.in
@@ -207,6 +207,18 @@ option may prevent you from further manual data recovery.
.BI nodiscard
Do not attempt to discard free blocks and unused inode blocks. This option is
exactly the opposite of discard option. This is set as default.
+.TP
+.BI strict_csums
+Verify each metadata object's checksum before checking any other fields
+in the metadata object. If the verification fails, offer to clear the item,
+also before checking any of the other fields. This option causes e2fsck to
+favor throwing away broken objects over trying to salvage them.
+.TP
+.BI no_strict_csums
+Perform all regular checks of a metadata object and only verify the checksum if
+no problems were found. This option causes e2fsck to try to salvage slightly
+damaged metadata objects, at the cost of spending processing time on recovering
+data. This is set as the default.
.RE
.TP
.B \-f
diff --git a/e2fsck/e2fsck.conf.5.in b/e2fsck/e2fsck.conf.5.in
index 9ebfbbf..a8219a8 100644
--- a/e2fsck/e2fsck.conf.5.in
+++ b/e2fsck/e2fsck.conf.5.in
@@ -222,6 +222,26 @@ If this boolean relation is true, e2fsck will run as if the option
.B -v
is always specified. This will cause e2fsck to print some additional
information at the end of each full file system check.
+.TP
+.I strict_csums
+If this boolean relation is true, e2fsck will run as if
+.B -E strict_csums
+is set. This causes e2fsck to verify each metadata object's checksum before
+checking any other fields in the metadata object. If the verification
+fails, offer to clear the item, also before checking any of the other fields.
+This option causes e2fsck to favor throwing away broken objects over trying to
+salvage them.
+.IP
+If the boolean relation is false, e2fsck will run as if
+.B -E no_strict_csums
+is set. In this case, e2fsck will perform all regular checks of a metadata
+object and only verify the checksum if no problems were found. This option
+causes e2fsck to try to salvage slightly damaged metadata objects, at the cost
+of spending processing time on recovering data.
+.IP
+The default is for e2fsck to behave as if
+.B -E no_strict_csums
+is set.
.SH THE [problems] STANZA
Each tag in the
.I [problems]
diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h
index dbd6ea8..d7a7be9 100644
--- a/e2fsck/e2fsck.h
+++ b/e2fsck/e2fsck.h
@@ -167,6 +167,7 @@ struct resource_track {
#define E2F_OPT_FRAGCHECK 0x0800
#define E2F_OPT_JOURNAL_ONLY 0x1000 /* only replay the journal */
#define E2F_OPT_DISCARD 0x2000
+#define E2F_OPT_CSUM_FIRST 0x4000
/*
* E2fsck flags
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index f4c1363..aa09c91 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -970,7 +970,7 @@ static struct e2fsck_problem problem_table[] = {
/* inode checksum does not match inode */
{ PR_1_INODE_CSUM_INVALID,
N_("@i %i checksum does not match @i. "),
- PROMPT_CLEAR, PR_PREEN_OK },
+ PROMPT_CLEAR, PR_PREEN_OK | PR_INITIAL_CSUM },
/* inode passes checks, but checksum does not match inode */
{ PR_1_INODE_ONLY_CSUM_INVALID,
@@ -981,7 +981,7 @@ static struct e2fsck_problem problem_table[] = {
{ PR_1_EXTENT_CSUM_INVALID,
N_("@i %i extent block checksum does not match extent\n\t(logical @b "
"%c, @n physical @b %b, len %N)\n"),
- PROMPT_CLEAR, 0 },
+ PROMPT_CLEAR, PR_INITIAL_CSUM },
/*
* Inode extent block passes checks, but checksum does not match
@@ -996,7 +996,7 @@ static struct e2fsck_problem problem_table[] = {
{ PR_1_EA_BLOCK_CSUM_INVALID,
N_("Extended attribute @a @b %b checksum for @i %i does not "
"match. "),
- PROMPT_CLEAR, 0 },
+ PROMPT_CLEAR, PR_INITIAL_CSUM },
/*
* Extended attribute block passes checks, but checksum for inode does
@@ -1465,7 +1465,7 @@ static struct e2fsck_problem problem_table[] = {
/* leaf node fails checksum */
{ PR_2_LEAF_NODE_CSUM_INVALID,
N_("@d @i %i, %B, offset %N: @d fails checksum\n"),
- PROMPT_SALVAGE, PR_PREEN_OK },
+ PROMPT_SALVAGE, PR_PREEN_OK | PR_INITIAL_CSUM },
/* leaf node has no checksum */
{ PR_2_LEAF_NODE_MISSING_CSUM,
@@ -1934,6 +1934,16 @@ int fix_problem(e2fsck_t ctx, problem_t code, struct problem_context *pctx)
printf(_("Unhandled error code (0x%x)!\n"), code);
return 0;
}
+
+ /*
+ * If there is a problem with the initial csum verification and the
+ * user told e2fsck to verify csums /after/ checking everything else,
+ * then don't "fix" anything.
+ */
+ if ((ptr->flags & PR_INITIAL_CSUM) &&
+ !(ctx->options & E2F_OPT_CSUM_FIRST))
+ return 0;
+
if (!(ptr->flags & PR_CONFIG)) {
char key[9], *new_desc = NULL;
diff --git a/e2fsck/problemP.h b/e2fsck/problemP.h
index 7944cd6..a983598 100644
--- a/e2fsck/problemP.h
+++ b/e2fsck/problemP.h
@@ -44,3 +44,4 @@ struct latch_descr {
#define PR_CONFIG 0x080000 /* This problem has been customized
from the config file */
#define PR_FORCE_NO 0x100000 /* Force the answer to be no */
+#define PR_INITIAL_CSUM 0x200000 /* User can ignore initial csum check */
diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index f73a252..67b3578 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -692,6 +692,10 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts)
else
ctx->log_fn = string_copy(ctx, arg, 0);
continue;
+ } else if (strcmp(token, "strict_csums") == 0) {
+ ctx->options |= E2F_OPT_CSUM_FIRST;
+ } else if (strcmp(token, "no_strict_csums") == 0) {
+ ctx->options &= ~E2F_OPT_CSUM_FIRST;
} else {
fprintf(stderr, _("Unknown extended option: %s\n"),
token);
@@ -710,6 +714,8 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts)
fputs(("\tjournal_only\n"), stderr);
fputs(("\tdiscard\n"), stderr);
fputs(("\tnodiscard\n"), stderr);
+ fputs(("\tstrict_csums\n"), stderr);
+ fputs(("\tno_strict_csums\n"), stderr);
fputc('\n', stderr);
exit(1);
}
@@ -944,6 +950,11 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
profile_set_syntax_err_cb(syntax_err_report);
profile_init(config_fn, &ctx->profile);
+ profile_get_boolean(ctx->profile, "options", "strict_csums", NULL,
+ 0, &c);
+ if (c)
+ ctx->options |= E2F_OPT_CSUM_FIRST;
+
profile_get_boolean(ctx->profile, "options", "report_time", 0, 0,
&c);
if (c)
* [PATCH 10/32] dumpe2fs: add switch to disable checksum verification
From: Darrick J. Wong @ 2014-03-02 7:17 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
Add a -n switch to turn off checksum verification.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
misc/dumpe2fs.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/misc/dumpe2fs.c b/misc/dumpe2fs.c
index ae54f8a..45eddaf 100644
--- a/misc/dumpe2fs.c
+++ b/misc/dumpe2fs.c
@@ -582,7 +582,9 @@ int main (int argc, char ** argv)
if (argc && *argv)
program_name = *argv;
- while ((c = getopt (argc, argv, "bfhixVo:")) != EOF) {
+ flags = EXT2_FLAG_JOURNAL_DEV_OK | EXT2_FLAG_SOFTSUPP_FEATURES |
+ EXT2_FLAG_64BITS;
+ while ((c = getopt(argc, argv, "bfhixVo:n")) != EOF) {
switch (c) {
case 'b':
print_badblocks++;
@@ -608,6 +610,9 @@ int main (int argc, char ** argv)
case 'x':
hex_format++;
break;
+ case 'n':
+ flags |= EXT2_FLAG_IGNORE_CSUM_ERRORS;
+ break;
default:
usage();
}
@@ -615,7 +620,6 @@ int main (int argc, char ** argv)
if (optind > argc - 1)
usage();
device_name = argv[optind++];
- flags = EXT2_FLAG_JOURNAL_DEV_OK | EXT2_FLAG_SOFTSUPP_FEATURES | EXT2_FLAG_64BITS;
if (force)
flags |= EXT2_FLAG_FORCE;
if (image_dump)
* [PATCH 11/32] mke2fs: set block_validity as a default mount option
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (9 preceding siblings ...)
2014-03-02 7:17 ` [PATCH 10/32] dumpe2fs: add switch to disable checksum verification Darrick J. Wong
@ 2014-03-02 7:17 ` Darrick J. Wong
2014-03-02 7:17 ` [PATCH 12/32] libext2fs: support allocating uninit blocks in bmap2() Darrick J. Wong
` (18 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:17 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
The block_validity mount option spot-checks block allocations against
a bitmap of known group metadata blocks. This helps us to prevent
self-inflicted catastrophic failures such as trying to "share"
critical metadata (think bitmaps) with file data, which usually
results in filesystem destruction.
In order to test the overhead of the mount option, I re-used the speed
tests in the metadata checksum testing script. In short, the program
creates what looks like 15 copies of a kernel source tree, except that
it uses fallocate to strip out the overhead of writing the file data
so that we can focus on metadata overhead. On a 64G RAM disk, the
overhead was generally about 0.9% and at most 1.6%. On a 160G USB
disk, the overhead was about 0.8% and peaked at 1.2%.
When I changed the test to write out files instead of merely
fallocating space, the overhead was negligible.
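For anyone unfamiliar with the idea, the check amounts to an overlap
test between a data extent and the known metadata ranges. The following
is only a user-space sketch of that idea, not the kernel's
implementation; struct meta_zone and data_range_valid() are names
invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one run of group metadata blocks. */
struct meta_zone {
	uint64_t start;	/* first block of the zone */
	uint64_t count;	/* number of blocks in the zone */
};

/*
 * Return true if [blk, blk + len) lies inside the filesystem and does
 * not overlap any metadata zone.  Illustrative only; the kernel keeps
 * its zones in a different data structure.
 */
static bool data_range_valid(const struct meta_zone *zones, size_t nr_zones,
			     uint64_t fs_blocks, uint64_t blk, uint64_t len)
{
	size_t i;

	if (len == 0 || blk >= fs_blocks || len > fs_blocks - blk)
		return false;

	for (i = 0; i < nr_zones; i++) {
		uint64_t zend = zones[i].start + zones[i].count;

		/* Half-open ranges overlap if each starts before the other ends. */
		if (blk < zend && zones[i].start < blk + len)
			return false;
	}
	return true;
}
```

A linear scan like this is enough to show the check; the cost measured
above comes from running such a test on every block mapping.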
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
misc/mke2fs.conf.in | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/misc/mke2fs.conf.in b/misc/mke2fs.conf.in
index 178733f..3919f3b 100644
--- a/misc/mke2fs.conf.in
+++ b/misc/mke2fs.conf.in
@@ -1,6 +1,6 @@
[defaults]
base_features = sparse_super,filetype,resize_inode,dir_index,ext_attr
- default_mntopts = acl,user_xattr
+ default_mntopts = acl,user_xattr,block_validity
enable_periodic_fsck = 0
blocksize = 4096
inode_size = 256
* [PATCH 12/32] libext2fs: support allocating uninit blocks in bmap2()
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (10 preceding siblings ...)
2014-03-02 7:17 ` [PATCH 11/32] mke2fs: set block_validity as a default mount option Darrick J. Wong
@ 2014-03-02 7:17 ` Darrick J. Wong
2014-03-02 7:18 ` [PATCH 13/32] libext2fs: file IO routines should handle uninit blocks Darrick J. Wong
` (17 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:17 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
In order to support fallocate, we need to be able to have
ext2fs_bmap2() allocate blocks and put them into uninitialized
extents. There's a flag to do this in the extent code, but it's not
exposed to the bmap2 interface, so plumb that in. Eventually fuse2fs
or somebody will use it.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
lib/ext2fs/bmap.c | 24 ++++++++++++++++++++++--
lib/ext2fs/ext2fs.h | 1 +
lib/ext2fs/mkjournal.c | 17 +++++++++++++++++
3 files changed, 40 insertions(+), 2 deletions(-)
diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
index db2fd72..07455f8 100644
--- a/lib/ext2fs/bmap.c
+++ b/lib/ext2fs/bmap.c
@@ -72,6 +72,11 @@ static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
block_buf + fs->blocksize, &b);
if (retval)
return retval;
+ if (flags & BMAP_UNINIT) {
+ retval = ext2fs_zero_blocks2(fs, b, 1, NULL, NULL);
+ if (retval)
+ return retval;
+ }
#ifdef WORDS_BIGENDIAN
((blk_t *) block_buf)[nr] = ext2fs_swab32(b);
@@ -214,10 +219,13 @@ static errcode_t extent_bmap(ext2_filsys fs, ext2_ino_t ino,
errcode_t retval = 0;
blk64_t blk64 = 0;
int alloc = 0;
+ int set_flags;
+
+ set_flags = bmap_flags & BMAP_UNINIT ? EXT2_EXTENT_SET_BMAP_UNINIT : 0;
if (bmap_flags & BMAP_SET) {
retval = ext2fs_extent_set_bmap(handle, block,
- *phys_blk, 0);
+ *phys_blk, set_flags);
return retval;
}
retval = ext2fs_extent_goto(handle, block);
@@ -254,7 +262,7 @@ got_block:
alloc++;
set_extent:
retval = ext2fs_extent_set_bmap(handle, block,
- blk64, 0);
+ blk64, set_flags);
if (retval) {
ext2fs_block_alloc_stats2(fs, blk64, -1);
return retval;
@@ -338,6 +346,12 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
goto done;
}
+ if ((bmap_flags & BMAP_SET) && (bmap_flags & BMAP_UNINIT)) {
+ retval = ext2fs_zero_blocks2(fs, *phys_blk, 1, NULL, NULL);
+ if (retval)
+ goto done;
+ }
+
if (block < EXT2_NDIR_BLOCKS) {
if (bmap_flags & BMAP_SET) {
b = *phys_blk;
@@ -353,6 +367,12 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
retval = ext2fs_alloc_block(fs, b, block_buf, &b);
if (retval)
goto done;
+ if (bmap_flags & BMAP_UNINIT) {
+ retval = ext2fs_zero_blocks2(fs, b, 1, NULL,
+ NULL);
+ if (retval)
+ goto done;
+ }
inode_bmap(inode, block) = b;
blocks_alloc++;
*phys_blk = b;
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index b1b9d3d..645285b 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -525,6 +525,7 @@ typedef struct ext2_icount *ext2_icount_t;
*/
#define BMAP_ALLOC 0x0001
#define BMAP_SET 0x0002
+#define BMAP_UNINIT 0x0004
/*
* Returned flags from ext2fs_bmap
diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index 884d9c0..ecc3912 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -174,6 +174,23 @@ errcode_t ext2fs_zero_blocks2(ext2_filsys fs, blk64_t blk, int num,
return ENOMEM;
memset(buf, 0, fs->blocksize * STRIDE_LENGTH);
}
+
+ /* Try discard, if it zeroes data... */
+ if (io_channel_discard_zeroes_data(fs->io)) {
+ memset(buf + fs->blocksize, 0, fs->blocksize);
+ retval = io_channel_discard(fs->io, blk, num);
+ if (retval)
+ goto skip_discard;
+ retval = io_channel_read_blk64(fs->io, blk, 1, buf);
+ if (retval)
+ goto skip_discard;
+ if (memcmp(buf, buf + fs->blocksize, fs->blocksize) == 0)
+ return 0;
+ /* Hah! Discard doesn't zero! */
+ fs->io->flags &= ~CHANNEL_FLAGS_DISCARD_ZEROES;
+ }
+skip_discard:
+
/* OK, do the write loop */
j=0;
while (j < num) {
* [PATCH 13/32] libext2fs: file IO routines should handle uninit blocks
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (11 preceding siblings ...)
2014-03-02 7:17 ` [PATCH 12/32] libext2fs: support allocating uninit blocks in bmap2() Darrick J. Wong
@ 2014-03-02 7:18 ` Darrick J. Wong
2014-03-02 7:18 ` [PATCH 14/32] resize2fs: convert fs to and from 64bit mode Darrick J. Wong
` (16 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:18 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
The file IO routines do not handle uninit blocks at all. The read
method should check for the uninit flag and return a buffer of zeroes,
and the write routine should convert unwritten extents.
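The intended semantics can be shown with a toy model that has nothing
to do with the real fileio.c structures: a block carries an "uninit"
marker, reads of such a block return zeroes regardless of what is on
disk, and the first write clears the marker. All names below are made
up for the sketch.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_BLOCK_SIZE 16

/* Toy stand-in for one mapped block; not a libext2fs structure. */
struct toy_block {
	bool uninit;				/* mapped but never written */
	unsigned char data[TOY_BLOCK_SIZE];	/* possibly stale contents */
};

/* Read: an uninit block must look like zeroes, whatever is stored. */
static void toy_read(const struct toy_block *b, unsigned char *out)
{
	if (b->uninit)
		memset(out, 0, TOY_BLOCK_SIZE);
	else
		memcpy(out, b->data, TOY_BLOCK_SIZE);
}

/* Write: store the data, then convert the block to "written". */
static void toy_write(struct toy_block *b, const unsigned char *in)
{
	memcpy(b->data, in, TOY_BLOCK_SIZE);
	b->uninit = false;
}
```

In the patch itself the conversion is done by remapping the block with
BMAP_SET (without BMAP_UNINIT) at flush time, and the read path skips
the disk read when BMAP_RET_UNINIT is reported.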
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
lib/ext2fs/fileio.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/lib/ext2fs/fileio.c b/lib/ext2fs/fileio.c
index 5a39c32..607609f 100644
--- a/lib/ext2fs/fileio.c
+++ b/lib/ext2fs/fileio.c
@@ -123,6 +123,8 @@ errcode_t ext2fs_file_flush(ext2_file_t file)
{
errcode_t retval;
ext2_filsys fs;
+ int ret_flags;
+ blk64_t dontcare;
EXT2_CHECK_MAGIC(file, EXT2_ET_MAGIC_EXT2_FILE);
fs = file->fs;
@@ -131,6 +133,22 @@ errcode_t ext2fs_file_flush(ext2_file_t file)
!(file->flags & EXT2_FILE_BUF_DIRTY))
return 0;
+ /* Is this an uninit block? */
+ if (file->physblock && file->inode.i_flags & EXT4_EXTENTS_FL) {
+ retval = ext2fs_bmap2(fs, file->ino, &file->inode, BMAP_BUFFER,
+ 0, file->blockno, &ret_flags, &dontcare);
+ if (retval)
+ return retval;
+ if (ret_flags & BMAP_RET_UNINIT) {
+ retval = ext2fs_bmap2(fs, file->ino, &file->inode,
+ BMAP_BUFFER, BMAP_SET,
+ file->blockno, 0,
+ &file->physblock);
+ if (retval)
+ return retval;
+ }
+ }
+
/*
* OK, the physical block hasn't been allocated yet.
* Allocate it.
@@ -185,15 +203,17 @@ static errcode_t load_buffer(ext2_file_t file, int dontfill)
{
ext2_filsys fs = file->fs;
errcode_t retval;
+ int ret_flags;
if (!(file->flags & EXT2_FILE_BUF_VALID)) {
retval = ext2fs_bmap2(fs, file->ino, &file->inode,
- BMAP_BUFFER, 0, file->blockno, 0,
+ BMAP_BUFFER, 0, file->blockno, &ret_flags,
&file->physblock);
if (retval)
return retval;
if (!dontfill) {
- if (file->physblock) {
+ if (file->physblock &&
+ !(ret_flags & BMAP_RET_UNINIT)) {
retval = io_channel_read_blk64(fs->io,
file->physblock,
1, file->buf);
* [PATCH 14/32] resize2fs: convert fs to and from 64bit mode
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (12 preceding siblings ...)
2014-03-02 7:18 ` [PATCH 13/32] libext2fs: file IO routines should handle uninit blocks Darrick J. Wong
@ 2014-03-02 7:18 ` Darrick J. Wong
2014-03-02 7:18 ` [PATCH 15/32] resize2fs: when toggling 64bit, don't free in-use bg data clusters Darrick J. Wong
` (15 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:18 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
resize2fs does its magic by loading a filesystem, duplicating the
in-memory image of that fs, moving relevant blocks out of the way of
whatever new metadata get created, and finally writing everything back
out to disk. Enabling 64bit mode enlarges the group descriptors,
which makes resize2fs a reasonable vehicle for taking care of the rest
of the bookkeeping requirements, so add to resize2fs the ability to
convert a filesystem to 64bit mode and back.
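The core of the descriptor conversion is a re-pack of a table of
fixed-size records into a different record size. A generic sketch of
that step, independent of the resize2fs data structures (the function
name and signature are invented for the example):

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Re-pack `count` fixed-size descriptors from old_size to new_size
 * bytes each.  Growing zero-fills the new tail of every descriptor;
 * shrinking drops it.  Returns a calloc()ed table that the caller must
 * free(), or NULL on failure.
 */
static unsigned char *repack_descs(const unsigned char *old, size_t count,
				   size_t old_size, size_t new_size)
{
	size_t copy = old_size < new_size ? old_size : new_size;
	unsigned char *new_tbl, *n;
	size_t i;

	if (count == 0 || new_size == 0)
		return NULL;
	new_tbl = calloc(count, new_size);
	if (!new_tbl)
		return NULL;

	n = new_tbl;
	for (i = 0; i < count; i++) {
		memcpy(n, old, copy);	/* keep the common prefix */
		n += new_size;
		old += old_size;
	}
	return new_tbl;
}
```

The sketch walks the tables with unsigned char pointers; arithmetic on
void pointers, as in the patch, relies on a GNU C extension.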
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
resize/main.c | 40 ++++++-
resize/resize2fs.8.in | 18 +++
resize/resize2fs.c | 282 ++++++++++++++++++++++++++++++++++++++++++++++++-
resize/resize2fs.h | 3 +
4 files changed, 336 insertions(+), 7 deletions(-)
diff --git a/resize/main.c b/resize/main.c
index 2b7abff..e37521a 100644
--- a/resize/main.c
+++ b/resize/main.c
@@ -42,7 +42,7 @@ static char *device_name, *io_options;
static void usage (char *prog)
{
fprintf (stderr, _("Usage: %s [-d debug_flags] [-f] [-F] [-M] [-P] "
- "[-p] device [new_size]\n\n"), prog);
+ "[-p] device [-b|-s|new_size]\n\n"), prog);
exit (1);
}
@@ -200,7 +200,7 @@ int main (int argc, char ** argv)
if (argc && *argv)
program_name = *argv;
- while ((c = getopt (argc, argv, "d:fFhMPpS:")) != EOF) {
+ while ((c = getopt(argc, argv, "d:fFhMPpS:bs")) != EOF) {
switch (c) {
case 'h':
usage(program_name);
@@ -226,6 +226,12 @@ int main (int argc, char ** argv)
case 'S':
use_stride = atoi(optarg);
break;
+ case 'b':
+ flags |= RESIZE_ENABLE_64BIT;
+ break;
+ case 's':
+ flags |= RESIZE_DISABLE_64BIT;
+ break;
default:
usage(program_name);
}
@@ -384,6 +390,10 @@ int main (int argc, char ** argv)
if (sys_page_size > fs->blocksize)
new_size &= ~((sys_page_size / fs->blocksize)-1);
}
+ /* If changing 64bit, don't change the filesystem size. */
+ if (flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)) {
+ new_size = ext2fs_blocks_count(fs->super);
+ }
if (!EXT2_HAS_INCOMPAT_FEATURE(fs->super,
EXT4_FEATURE_INCOMPAT_64BIT)) {
/* Take 16T down to 2^32-1 blocks */
@@ -435,7 +445,31 @@ int main (int argc, char ** argv)
fs->blocksize / 1024, new_size);
exit(1);
}
- if (new_size == ext2fs_blocks_count(fs->super)) {
+ if ((flags & RESIZE_DISABLE_64BIT) && (flags & RESIZE_ENABLE_64BIT)) {
+ fprintf(stderr, _("Cannot set and unset 64bit feature.\n"));
+ exit(1);
+ } else if (flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)) {
+ new_size = ext2fs_blocks_count(fs->super);
+ if (new_size >= (1ULL << 32)) {
+ fprintf(stderr, _("Cannot change the 64bit feature "
+ "on a filesystem that is larger than "
+ "2^32 blocks.\n"));
+ exit(1);
+ }
+ if (mount_flags & EXT2_MF_MOUNTED) {
+ fprintf(stderr, _("Cannot change the 64bit feature "
+ "while the filesystem is mounted.\n"));
+ exit(1);
+ }
+ if (flags & RESIZE_ENABLE_64BIT &&
+ !EXT2_HAS_INCOMPAT_FEATURE(fs->super,
+ EXT3_FEATURE_INCOMPAT_EXTENTS)) {
+ fprintf(stderr, _("Please enable the extents feature "
+ "with tune2fs before enabling the 64bit "
+ "feature.\n"));
+ exit(1);
+ }
+ } else if (new_size == ext2fs_blocks_count(fs->super)) {
fprintf(stderr, _("The filesystem is already %llu blocks "
"long. Nothing to do!\n\n"), new_size);
exit(0);
diff --git a/resize/resize2fs.8.in b/resize/resize2fs.8.in
index a1f3099..1c75816 100644
--- a/resize/resize2fs.8.in
+++ b/resize/resize2fs.8.in
@@ -8,7 +8,7 @@ resize2fs \- ext2/ext3/ext4 file system resizer
.SH SYNOPSIS
.B resize2fs
[
-.B \-fFpPM
+.B \-fFpPMbs
]
[
.B \-d
@@ -85,8 +85,21 @@ to shrink the size of filesystem. Then you may use
to shrink the size of the partition. When shrinking the size of
the partition, make sure you do not make it smaller than the new size
of the ext2 filesystem!
+.PP
+The
+.B \-b
+and
+.B \-s
+options enable and disable the 64bit feature, respectively. The resize2fs
+program will, of course, take care of resizing the block group descriptors
+and moving other data blocks out of the way, as needed. It is not possible
+to resize the filesystem concurrent with changing the 64bit status.
.SH OPTIONS
.TP
+.B \-b
+Turns on the 64bit feature, resizes the group descriptors as necessary, and
+moves other metadata out of the way.
+.TP
.B \-d \fIdebug-flags
Turns on various resize2fs debugging features, if they have been compiled
into the binary.
@@ -126,6 +139,9 @@ of what the program is doing.
.B \-P
Print the minimum size of the filesystem and exit.
.TP
+.B \-s
+Turns off the 64bit feature and frees blocks that are no longer in use.
+.TP
.B \-S \fIRAID-stride
The
.B resize2fs
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index 7122b2f..339885b 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -56,6 +56,9 @@ static errcode_t mark_table_blocks(ext2_filsys fs,
static errcode_t clear_sparse_super2_last_group(ext2_resize_t rfs);
static errcode_t reserve_sparse_super2_last_group(ext2_resize_t rfs,
ext2fs_block_bitmap meta_bmap);
+static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size);
+static errcode_t move_bg_metadata(ext2_resize_t rfs);
+static errcode_t zero_high_bits_in_inodes(ext2_resize_t rfs);
/*
* Some helper CPP macros
@@ -122,13 +125,30 @@ errcode_t resize_fs(ext2_filsys fs, blk64_t *new_size, int flags,
if (retval)
goto errout;
+ init_resource_track(&rtrack, "resize_group_descriptors", fs->io);
+ retval = resize_group_descriptors(rfs, *new_size);
+ if (retval)
+ goto errout;
+ print_resource_track(rfs, &rtrack, fs->io);
+
+ init_resource_track(&rtrack, "move_bg_metadata", fs->io);
+ retval = move_bg_metadata(rfs);
+ if (retval)
+ goto errout;
+ print_resource_track(rfs, &rtrack, fs->io);
+
+ init_resource_track(&rtrack, "zero_high_bits_in_metadata", fs->io);
+ retval = zero_high_bits_in_inodes(rfs);
+ if (retval)
+ goto errout;
+ print_resource_track(rfs, &rtrack, fs->io);
+
init_resource_track(&rtrack, "adjust_superblock", fs->io);
retval = adjust_superblock(rfs, *new_size);
if (retval)
goto errout;
print_resource_track(rfs, &rtrack, fs->io);
-
init_resource_track(&rtrack, "fix_uninit_block_bitmaps 2", fs->io);
fix_uninit_block_bitmaps(rfs->new_fs);
print_resource_track(rfs, &rtrack, fs->io);
@@ -231,6 +251,259 @@ errout:
return retval;
}
+/* Toggle 64bit mode */
+static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
+{
+ void *o, *n, *new_group_desc;
+ dgrp_t i;
+ int copy_size;
+ errcode_t retval;
+
+ if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
+ return 0;
+
+ if (new_size != ext2fs_blocks_count(rfs->new_fs->super) ||
+ ext2fs_blocks_count(rfs->new_fs->super) >= (1ULL << 32) ||
+ (rfs->flags & RESIZE_DISABLE_64BIT &&
+ rfs->flags & RESIZE_ENABLE_64BIT))
+ return EXT2_ET_INVALID_ARGUMENT;
+
+ if (rfs->flags & RESIZE_DISABLE_64BIT) {
+ rfs->new_fs->super->s_feature_incompat &=
+ ~EXT4_FEATURE_INCOMPAT_64BIT;
+ rfs->new_fs->super->s_desc_size = EXT2_MIN_DESC_SIZE;
+ } else if (rfs->flags & RESIZE_ENABLE_64BIT) {
+ rfs->new_fs->super->s_feature_incompat |=
+ EXT4_FEATURE_INCOMPAT_64BIT;
+ rfs->new_fs->super->s_desc_size = EXT2_MIN_DESC_SIZE_64BIT;
+ }
+
+ if (EXT2_DESC_SIZE(rfs->old_fs->super) ==
+ EXT2_DESC_SIZE(rfs->new_fs->super))
+ return 0;
+
+ o = rfs->new_fs->group_desc;
+ rfs->new_fs->desc_blocks = ext2fs_div_ceil(
+ rfs->old_fs->group_desc_count,
+ EXT2_DESC_PER_BLOCK(rfs->new_fs->super));
+ retval = ext2fs_get_arrayzero(rfs->new_fs->desc_blocks,
+ rfs->old_fs->blocksize, &new_group_desc);
+ if (retval)
+ return retval;
+
+ n = new_group_desc;
+
+ if (EXT2_DESC_SIZE(rfs->old_fs->super) <=
+ EXT2_DESC_SIZE(rfs->new_fs->super))
+ copy_size = EXT2_DESC_SIZE(rfs->old_fs->super);
+ else
+ copy_size = EXT2_DESC_SIZE(rfs->new_fs->super);
+ for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
+ memcpy(n, o, copy_size);
+ n += EXT2_DESC_SIZE(rfs->new_fs->super);
+ o += EXT2_DESC_SIZE(rfs->old_fs->super);
+ }
+
+ ext2fs_free_mem(&rfs->new_fs->group_desc);
+ rfs->new_fs->group_desc = new_group_desc;
+
+ for (i = 0; i < rfs->old_fs->group_desc_count; i++)
+ ext2fs_group_desc_csum_set(rfs->new_fs, i);
+
+ return 0;
+}
+
+/* Move bitmaps/inode tables out of the way. */
+static errcode_t move_bg_metadata(ext2_resize_t rfs)
+{
+ dgrp_t i;
+ blk64_t b, c, d;
+ ext2fs_block_bitmap old_map, new_map;
+ int old, new;
+ errcode_t retval;
+ int zero = 0, one = 1;
+
+ if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
+ return 0;
+
+ retval = ext2fs_allocate_block_bitmap(rfs->old_fs, "oldfs", &old_map);
+ if (retval)
+ return retval;
+
+ retval = ext2fs_allocate_block_bitmap(rfs->new_fs, "newfs", &new_map);
+ if (retval)
+ goto out;
+
+ /* Construct bitmaps of super/descriptor blocks in old and new fs */
+ for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
+ retval = ext2fs_super_and_bgd_loc2(rfs->old_fs, i, &b, &c, &d,
+ NULL);
+ if (retval)
+ goto out;
+ ext2fs_mark_block_bitmap2(old_map, b);
+ ext2fs_mark_block_bitmap2(old_map, c);
+ ext2fs_mark_block_bitmap2(old_map, d);
+
+ retval = ext2fs_super_and_bgd_loc2(rfs->new_fs, i, &b, &c, &d,
+ NULL);
+ if (retval)
+ goto out;
+ ext2fs_mark_block_bitmap2(new_map, b);
+ ext2fs_mark_block_bitmap2(new_map, c);
+ ext2fs_mark_block_bitmap2(new_map, d);
+ }
+
+ /* Find changes in block allocations for bg metadata */
+ for (b = 0;
+ b < ext2fs_blocks_count(rfs->new_fs->super);
+ b += EXT2FS_CLUSTER_RATIO(rfs->new_fs)) {
+ old = ext2fs_test_block_bitmap2(old_map, b);
+ new = ext2fs_test_block_bitmap2(new_map, b);
+
+ if (old && !new)
+ ext2fs_unmark_block_bitmap2(rfs->new_fs->block_map, b);
+ else if (!old && new)
+ ; /* empty ext2fs_mark_block_bitmap2(new_map, b); */
+ else
+ ext2fs_unmark_block_bitmap2(new_map, b);
+ }
+ /* new_map now shows blocks that have been newly allocated. */
+
+ /* Move any conflicting bitmaps and inode tables */
+ for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
+ b = ext2fs_block_bitmap_loc(rfs->new_fs, i);
+ if (ext2fs_test_block_bitmap2(new_map, b))
+ ext2fs_block_bitmap_loc_set(rfs->new_fs, i, 0);
+
+ b = ext2fs_inode_bitmap_loc(rfs->new_fs, i);
+ if (ext2fs_test_block_bitmap2(new_map, b))
+ ext2fs_inode_bitmap_loc_set(rfs->new_fs, i, 0);
+
+ c = ext2fs_inode_table_loc(rfs->new_fs, i);
+ for (b = 0; b < rfs->new_fs->inode_blocks_per_group; b++) {
+ if (ext2fs_test_block_bitmap2(new_map, b + c)) {
+ ext2fs_inode_table_loc_set(rfs->new_fs, i, 0);
+ break;
+ }
+ }
+ }
+
+out:
+ if (old_map)
+ ext2fs_free_block_bitmap(old_map);
+ if (new_map)
+ ext2fs_free_block_bitmap(new_map);
+ return retval;
+}
+
+/* Zero out the high bits of extent fields */
+static errcode_t zero_high_bits_in_extents(ext2_filsys fs, ext2_ino_t ino,
+ struct ext2_inode *inode)
+{
+ ext2_extent_handle_t handle;
+ struct ext2fs_extent extent;
+ int op = EXT2_EXTENT_ROOT;
+ errcode_t errcode;
+
+ if (!(inode->i_flags & EXT4_EXTENTS_FL))
+ return 0;
+
+ errcode = ext2fs_extent_open(fs, ino, &handle);
+ if (errcode)
+ return errcode;
+
+ while (1) {
+ errcode = ext2fs_extent_get(handle, op, &extent);
+ if (errcode)
+ break;
+
+ op = EXT2_EXTENT_NEXT_SIB;
+
+ if (extent.e_pblk > (1ULL << 32)) {
+ extent.e_pblk &= (1ULL << 32) - 1;
+ errcode = ext2fs_extent_replace(handle, 0, &extent);
+ if (errcode)
+ break;
+ }
+ }
+
+ /* Ok if we run off the end */
+ if (errcode == EXT2_ET_EXTENT_NO_NEXT)
+ errcode = 0;
+ return errcode;
+}
+
+/* Zero out the high bits of inodes. */
+static errcode_t zero_high_bits_in_inodes(ext2_resize_t rfs)
+{
+ ext2_filsys fs = rfs->new_fs;
+ int length = EXT2_INODE_SIZE(fs->super);
+ struct ext2_inode *inode = NULL;
+ ext2_inode_scan scan = NULL;
+ errcode_t retval;
+ ext2_ino_t ino;
+ blk64_t file_acl_block;
+ int inode_dirty;
+
+ if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
+ return 0;
+
+ if (fs->super->s_creator_os != EXT2_OS_LINUX)
+ return 0;
+
+ retval = ext2fs_open_inode_scan(fs, 0, &scan);
+ if (retval)
+ return retval;
+
+ retval = ext2fs_get_mem(length, &inode);
+ if (retval)
+ goto out;
+
+ do {
+ retval = ext2fs_get_next_inode_full(scan, &ino, inode, length);
+ if (retval)
+ goto out;
+ if (!ino)
+ break;
+ if (!ext2fs_test_inode_bitmap2(fs->inode_map, ino))
+ continue;
+
+ /*
+ * Here's how we deal with high block number fields:
+ *
+ * - i_size_high has been written out with i_size_lo
+ * since the ext2 days, so no conversion is needed.
+ *
+ * - i_blocks_hi is guarded by both the huge_file feature and
+ * inode flags and has always been written out with
+ * i_blocks_lo if the feature is set. The field is only
+ * ever read if both feature and inode flag are set, so
+ * we don't need to zero it now.
+ *
+ * - i_file_acl_high can be uninitialized, so zero it if
+ * it isn't already.
+ */
+ if (inode->osd2.linux2.l_i_file_acl_high) {
+ inode->osd2.linux2.l_i_file_acl_high = 0;
+ retval = ext2fs_write_inode_full(fs, ino, inode,
+ length);
+ if (retval)
+ goto out;
+ }
+
+ retval = zero_high_bits_in_extents(fs, ino, inode);
+ if (retval)
+ goto out;
+ } while (ino);
+
+out:
+ if (inode)
+ ext2fs_free_mem(&inode);
+ if (scan)
+ ext2fs_close_inode_scan(scan);
+ return retval;
+}
+
/*
* Clean up the bitmaps for unitialized bitmaps
*/
@@ -455,7 +728,8 @@ retry:
/*
* Reallocate the group descriptors as necessary.
*/
- if (old_fs->desc_blocks != fs->desc_blocks) {
+ if (EXT2_DESC_SIZE(old_fs->super) == EXT2_DESC_SIZE(fs->super) &&
+ old_fs->desc_blocks != fs->desc_blocks) {
retval = ext2fs_resize_mem(old_fs->desc_blocks *
fs->blocksize,
fs->desc_blocks * fs->blocksize,
@@ -1006,7 +1280,9 @@ static errcode_t blocks_to_move(ext2_resize_t rfs)
if (retval)
goto errout;
- if (old_blocks == new_blocks) {
+ if (EXT2_DESC_SIZE(rfs->old_fs->super) ==
+ EXT2_DESC_SIZE(rfs->new_fs->super) &&
+ old_blocks == new_blocks) {
retval = 0;
goto errout;
}
diff --git a/resize/resize2fs.h b/resize/resize2fs.h
index 7aeab91..829fcd8 100644
--- a/resize/resize2fs.h
+++ b/resize/resize2fs.h
@@ -82,6 +82,9 @@ typedef struct ext2_sim_progress *ext2_sim_progmeter;
#define RESIZE_PERCENT_COMPLETE 0x0100
#define RESIZE_VERBOSE 0x0200
+#define RESIZE_ENABLE_64BIT 0x0400
+#define RESIZE_DISABLE_64BIT 0x0800
+
/*
* This structure is used for keeping track of how much resources have
* been used for a particular resize2fs pass.
* [PATCH 15/32] resize2fs: when toggling 64bit, don't free in-use bg data clusters
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (13 preceding siblings ...)
2014-03-02 7:18 ` [PATCH 14/32] resize2fs: convert fs to and from 64bit mode Darrick J. Wong
@ 2014-03-02 7:18 ` Darrick J. Wong
2014-03-02 7:18 ` [PATCH 16/32] resize2fs: adjust reserved_gdt_blocks when changing group descriptor size Darrick J. Wong
` (14 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:18 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
Currently, move_bg_metadata() assumes that if a block containing a
superblock or a group descriptor is no longer needed, then it is safe
to free the whole cluster. This of course isn't true, for bitmaps and
inode tables can share these clusters. Therefore, check a little more
carefully before freeing clusters.
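The rule being enforced can be stated as: a cluster released by the
superblock/descriptor move may be freed only if no other metadata block
still lives in it. A simplified sketch of that test, using a plain
per-block flag array instead of the libext2fs bitmaps (names are
hypothetical):

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if `cluster` holds no block that is still in use.
 * `in_use[i]` is true when block i holds a bitmap or inode table block;
 * `ratio` is the number of blocks per cluster.
 */
static bool cluster_can_be_freed(const bool *in_use, size_t nr_blocks,
				 size_t cluster, size_t ratio)
{
	size_t first = cluster * ratio;
	size_t i;

	for (i = first; i < first + ratio && i < nr_blocks; i++)
		if (in_use[i])
			return false;
	return true;
}
```

The patch reaches the same result differently: it clears the old_map
bit for every bitmap and inode table location, then frees whatever is
still marked.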
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
resize/resize2fs.c | 71 ++++++++++++++++++++++++++++++++++++++++------------
1 file changed, 55 insertions(+), 16 deletions(-)
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index 339885b..646c65e 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -317,11 +317,11 @@ static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
static errcode_t move_bg_metadata(ext2_resize_t rfs)
{
dgrp_t i;
- blk64_t b, c, d;
+ blk64_t b, c, d, old_desc_blocks, new_desc_blocks, j;
ext2fs_block_bitmap old_map, new_map;
int old, new;
errcode_t retval;
- int zero = 0, one = 1;
+ int zero = 0, one = 1, cluster_ratio;
if (!(rfs->flags & (RESIZE_DISABLE_64BIT | RESIZE_ENABLE_64BIT)))
return 0;
@@ -334,6 +334,17 @@ static errcode_t move_bg_metadata(ext2_resize_t rfs)
if (retval)
goto out;
+ if (EXT2_HAS_INCOMPAT_FEATURE(rfs->old_fs->super,
+ EXT2_FEATURE_INCOMPAT_META_BG)) {
+ old_desc_blocks = rfs->old_fs->super->s_first_meta_bg;
+ new_desc_blocks = rfs->new_fs->super->s_first_meta_bg;
+ } else {
+ old_desc_blocks = rfs->old_fs->desc_blocks +
+ rfs->old_fs->super->s_reserved_gdt_blocks;
+ new_desc_blocks = rfs->new_fs->desc_blocks +
+ rfs->new_fs->super->s_reserved_gdt_blocks;
+ }
+
/* Construct bitmaps of super/descriptor blocks in old and new fs */
for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
retval = ext2fs_super_and_bgd_loc2(rfs->old_fs, i, &b, &c, &d,
@@ -341,7 +352,8 @@ static errcode_t move_bg_metadata(ext2_resize_t rfs)
if (retval)
goto out;
ext2fs_mark_block_bitmap2(old_map, b);
- ext2fs_mark_block_bitmap2(old_map, c);
+ for (j = 0; c != 0 && j < old_desc_blocks; j++)
+ ext2fs_mark_block_bitmap2(old_map, c + j);
ext2fs_mark_block_bitmap2(old_map, d);
retval = ext2fs_super_and_bgd_loc2(rfs->new_fs, i, &b, &c, &d,
@@ -349,45 +361,72 @@ static errcode_t move_bg_metadata(ext2_resize_t rfs)
if (retval)
goto out;
ext2fs_mark_block_bitmap2(new_map, b);
- ext2fs_mark_block_bitmap2(new_map, c);
+ for (j = 0; c != 0 && j < new_desc_blocks; j++)
+ ext2fs_mark_block_bitmap2(new_map, c + j);
ext2fs_mark_block_bitmap2(new_map, d);
}
+ cluster_ratio = EXT2FS_CLUSTER_RATIO(rfs->new_fs);
+
/* Find changes in block allocations for bg metadata */
for (b = 0;
b < ext2fs_blocks_count(rfs->new_fs->super);
- b += EXT2FS_CLUSTER_RATIO(rfs->new_fs)) {
+ b += cluster_ratio) {
old = ext2fs_test_block_bitmap2(old_map, b);
new = ext2fs_test_block_bitmap2(new_map, b);
- if (old && !new)
- ext2fs_unmark_block_bitmap2(rfs->new_fs->block_map, b);
- else if (!old && new)
- ; /* empty ext2fs_mark_block_bitmap2(new_map, b); */
- else
+ if (old && !new) {
+ /* mark old_map, unmark new_map */
+ if (cluster_ratio == 1)
+ ext2fs_unmark_block_bitmap2(
+ rfs->new_fs->block_map, b);
+ } else if (!old && new)
+ ; /* unmark old_map, mark new_map */
+ else {
+ ext2fs_unmark_block_bitmap2(old_map, b);
ext2fs_unmark_block_bitmap2(new_map, b);
+ }
}
- /* new_map now shows blocks that have been newly allocated. */
- /* Move any conflicting bitmaps and inode tables */
+ /*
+ * new_map now shows blocks that have been newly allocated.
+ * old_map now shows blocks that have been newly freed.
+ */
+
+ /*
+ * Move any conflicting bitmaps and inode tables. Ensure that we
+ * don't try to free clusters associated with bitmaps or tables.
+ */
for (i = 0; i < rfs->old_fs->group_desc_count; i++) {
b = ext2fs_block_bitmap_loc(rfs->new_fs, i);
if (ext2fs_test_block_bitmap2(new_map, b))
ext2fs_block_bitmap_loc_set(rfs->new_fs, i, 0);
+ else if (ext2fs_test_block_bitmap2(old_map, b))
+ ext2fs_unmark_block_bitmap2(old_map, b);
b = ext2fs_inode_bitmap_loc(rfs->new_fs, i);
if (ext2fs_test_block_bitmap2(new_map, b))
ext2fs_inode_bitmap_loc_set(rfs->new_fs, i, 0);
+ else if (ext2fs_test_block_bitmap2(old_map, b))
+ ext2fs_unmark_block_bitmap2(old_map, b);
c = ext2fs_inode_table_loc(rfs->new_fs, i);
- for (b = 0; b < rfs->new_fs->inode_blocks_per_group; b++) {
- if (ext2fs_test_block_bitmap2(new_map, b + c)) {
+ for (b = 0;
+ b < rfs->new_fs->inode_blocks_per_group;
+ b++) {
+ if (ext2fs_test_block_bitmap2(new_map, b + c))
ext2fs_inode_table_loc_set(rfs->new_fs, i, 0);
- break;
- }
+ else if (ext2fs_test_block_bitmap2(old_map, b + c))
+ ext2fs_unmark_block_bitmap2(old_map, b + c);
}
}
+ /* Free unused clusters */
+ for (b = 0;
+ cluster_ratio > 1 && b < ext2fs_blocks_count(rfs->new_fs->super);
+ b += cluster_ratio)
+ if (ext2fs_test_block_bitmap2(old_map, b))
+ ext2fs_unmark_block_bitmap2(rfs->new_fs->block_map, b);
out:
if (old_map)
ext2fs_free_block_bitmap(old_map);
* [PATCH 16/32] resize2fs: adjust reserved_gdt_blocks when changing group descriptor size
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (14 preceding siblings ...)
2014-03-02 7:18 ` [PATCH 15/32] resize2fs: when toggling 64bit, don't free in-use bg data clusters Darrick J. Wong
@ 2014-03-02 7:18 ` Darrick J. Wong
2014-03-02 7:18 ` [PATCH 17/32] libext2fs: have UNIX IO manager use pread/pwrite Darrick J. Wong
` (13 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:18 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
Since we're constructing the fantasy that new_fs has always been a
64bit fs, we need to adjust reserved_gdt_blocks when we start resizing
the metadata so that the size of the gdt space in the new fs reflects
the fantasy throughout the resize process.
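The adjustment itself is simple arithmetic: whatever the descriptor
table gains or loses in blocks is taken from or given back to the
reserved GDT blocks, clamped to the range [0, blocksize/4]. A
standalone sketch of that calculation (the function name and parameter
list are made up for the example):

```c
/*
 * Keep desc_blocks + reserved_gdt_blocks constant where possible.
 * Returns the new reserved count, clamped to [0, blocksize / 4].
 */
static unsigned int adjust_reserved(unsigned int reserved,
				    unsigned int old_desc_blocks,
				    unsigned int new_desc_blocks,
				    unsigned int blocksize)
{
	long long v = (long long)reserved +
		      (long long)old_desc_blocks - (long long)new_desc_blocks;
	long long max = blocksize / 4;

	if (v < 0)
		v = 0;
	if (v > max)
		v = max;
	return (unsigned int)v;
}
```

For example, with 4096-byte blocks, a filesystem that goes from 1 to 2
descriptor blocks with 256 reserved blocks ends up with 255 reserved
blocks, so the region stays at 257 blocks.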
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
resize/resize2fs.c | 37 ++++++++++++++++++++++++-------------
1 file changed, 24 insertions(+), 13 deletions(-)
diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index 646c65e..54e0053 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -251,6 +251,24 @@ errout:
return retval;
}
+/* Keep the size of the group descriptor region constant */
+static void adjust_reserved_gdt_blocks(ext2_filsys old_fs, ext2_filsys fs)
+{
+ if ((fs->super->s_feature_compat &
+ EXT2_FEATURE_COMPAT_RESIZE_INODE) &&
+ (old_fs->desc_blocks != fs->desc_blocks)) {
+ int new;
+
+ new = ((int) fs->super->s_reserved_gdt_blocks) +
+ (old_fs->desc_blocks - fs->desc_blocks);
+ if (new < 0)
+ new = 0;
+ if (new > (int) fs->blocksize/4)
+ new = fs->blocksize/4;
+ fs->super->s_reserved_gdt_blocks = new;
+ }
+}
+
/* Toggle 64bit mode */
static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
{
@@ -310,6 +328,8 @@ static errcode_t resize_group_descriptors(ext2_resize_t rfs, blk64_t new_size)
for (i = 0; i < rfs->old_fs->group_desc_count; i++)
ext2fs_group_desc_csum_set(rfs->new_fs, i);
+ adjust_reserved_gdt_blocks(rfs->old_fs, rfs->new_fs);
+
return 0;
}
@@ -787,20 +807,11 @@ retry:
* number of descriptor blocks, then adjust
* s_reserved_gdt_blocks if possible to avoid needing to move
* the inode table either now or in the future.
+ *
+ * Note: If we're converting to 64bit mode, we did this earlier.
*/
- if ((fs->super->s_feature_compat &
- EXT2_FEATURE_COMPAT_RESIZE_INODE) &&
- (old_fs->desc_blocks != fs->desc_blocks)) {
- int new;
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 17/32] libext2fs: have UNIX IO manager use pread/pwrite
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (15 preceding siblings ...)
2014-03-02 7:18 ` [PATCH 16/32] resize2fs: adjust reserved_gdt_blocks when changing group descriptor size Darrick J. Wong
@ 2014-03-02 7:18 ` Darrick J. Wong
2014-03-02 7:18 ` [PATCH 18/32] ext2fs: add readahead method to improve scanning Darrick J. Wong
` (12 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:18 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
If pread/pwrite are present, have the UNIX IO manager use them for
aligned IOs (instead of the current seek -> read/write), thereby
saving us a (minor) amount of system call overhead.
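As a rough illustration of the idea, here is a minimal sketch of a positioned read that falls back to seek + read when pread() does not deliver the full amount. The helper names read_at() and read_file_at() are hypothetical and the alignment checks done by the UNIX IO manager are left out; only the "try pread first, otherwise use the old path" shape mirrors the patch.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `size` bytes at `offset`; 0 on success, else an errno value. */
static int read_at(int fd, void *buf, size_t size, off_t offset)
{
	ssize_t actual;

	assert(buf != NULL);

	/* Fast path: one positioned read, no separate seek call. */
	actual = pread(fd, buf, size, offset);
	if (actual >= 0 && (size_t)actual == size)
		return 0;

	/* Slow path: seek, then read. */
	if (lseek(fd, offset, SEEK_SET) != offset)
		return errno ? errno : EIO;
	actual = read(fd, buf, size);
	if (actual < 0)
		return errno ? errno : EIO;
	return ((size_t)actual == size) ? 0 : EIO;
}

/* Hypothetical convenience wrapper: open `path`, read one range, close. */
static int read_file_at(const char *path, void *buf, size_t size,
			off_t offset)
{
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno ? errno : EIO;
	err = read_at(fd, buf, size, offset);
	close(fd);
	return err;
}
```

In the common case this costs one system call per IO instead of two. A short read is reported as EIO in this sketch, which is a simplification of the error handling in unix_io.c.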
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
configure | 2 +-
configure.in | 2 ++
lib/config.h.in | 6 ++++++
lib/ext2fs/unix_io.c | 24 ++++++++++++++++++++++++
4 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/configure b/configure
index 6449f59..7b0a0d1 100755
--- a/configure
+++ b/configure
@@ -11155,7 +11155,7 @@ if test "$ac_res" != no; then :
fi
fi
-for ac_func in __secure_getenv backtrace blkid_probe_get_topology chflags fadvise64 fallocate fallocate64 fchown fdatasync fstat64 ftruncate64 futimes getcwd getdtablesize getmntinfo getpwuid_r getrlimit getrusage jrand48 llseek lseek64 mallinfo mbstowcs memalign mempcpy mmap msync nanosleep open64 pathconf posix_fadvise posix_fadvise64 posix_memalign prctl secure_getenv setmntent setresgid setresuid srandom stpcpy strcasecmp strdup strnlen strptime strtoull sync_file_range sysconf usleep utime valloc
+for ac_func in __secure_getenv backtrace blkid_probe_get_topology chflags fadvise64 fallocate fallocate64 fchown fdatasync fstat64 ftruncate64 futimes getcwd getdtablesize getmntinfo getpwuid_r getrlimit getrusage jrand48 llseek lseek64 mallinfo mbstowcs memalign mempcpy mmap msync nanosleep open64 pathconf posix_fadvise posix_fadvise64 posix_memalign prctl pread pwrite secure_getenv setmntent setresgid setresuid srandom stpcpy strcasecmp strdup strnlen strptime strtoull sync_file_range sysconf usleep utime valloc
do :
as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.in b/configure.in
index 8a033b0..f28bd46 100644
--- a/configure.in
+++ b/configure.in
@@ -1135,6 +1135,8 @@ AC_CHECK_FUNCS(m4_flatten([
posix_fadvise64
posix_memalign
prctl
+ pread
+ pwrite
secure_getenv
setmntent
setresgid
diff --git a/lib/config.h.in b/lib/config.h.in
index 12ac1e0..e0384ee 100644
--- a/lib/config.h.in
+++ b/lib/config.h.in
@@ -311,9 +311,15 @@
/* Define to 1 if you have the `prctl' function. */
#undef HAVE_PRCTL
+/* Define to 1 if you have the `pread' function. */
+#undef HAVE_PREAD
+
/* Define to 1 if you have the `putenv' function. */
#undef HAVE_PUTENV
+/* Define to 1 if you have the `pwrite' function. */
+#undef HAVE_PWRITE
+
/* Define to 1 if dirent has d_reclen */
#undef HAVE_RECLEN_DIRENT
diff --git a/lib/ext2fs/unix_io.c b/lib/ext2fs/unix_io.c
index c3185b6..a818c13 100644
--- a/lib/ext2fs/unix_io.c
+++ b/lib/ext2fs/unix_io.c
@@ -130,6 +130,18 @@ static errcode_t raw_read_blk(io_channel channel,
size = (count < 0) ? -count : count * channel->block_size;
data->io_stats.bytes_read += size;
location = ((ext2_loff_t) block * channel->block_size) + data->offset;
+
+#ifdef HAVE_PREAD
+ /* Try an aligned pread */
+ if ((channel->align == 0) ||
+ (IS_ALIGNED(buf, channel->align) &&
+ IS_ALIGNED(size, channel->align))) {
+ actual = pread(data->dev, buf, size, location);
+ if (actual == size)
+ return 0;
+ }
+#endif /* HAVE_PREAD */
+
if (ext2fs_llseek(data->dev, location, SEEK_SET) != location) {
retval = errno ? errno : EXT2_ET_LLSEEK_FAILED;
goto error_out;
@@ -200,6 +212,18 @@ static errcode_t raw_write_blk(io_channel channel,
data->io_stats.bytes_written += size;
location = ((ext2_loff_t) block * channel->block_size) + data->offset;
+
+#ifdef HAVE_PWRITE
+ /* Try an aligned pwrite */
+ if ((channel->align == 0) ||
+ (IS_ALIGNED(buf, channel->align) &&
+ IS_ALIGNED(size, channel->align))) {
+ actual = pwrite(data->dev, buf, size, location);
+ if (actual == size)
+ return 0;
+ }
+#endif /* HAVE_PWRITE */
+
if (ext2fs_llseek(data->dev, location, SEEK_SET) != location) {
retval = errno ? errno : EXT2_ET_LLSEEK_FAILED;
goto error_out;
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 18/32] ext2fs: add readahead method to improve scanning
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (16 preceding siblings ...)
2014-03-02 7:18 ` [PATCH 17/32] libext2fs: have UNIX IO manager use pread/pwrite Darrick J. Wong
@ 2014-03-02 7:18 ` Darrick J. Wong
2014-03-02 7:18 ` [PATCH 19/32] libext2fs: allow clients to read-ahead metadata Darrick J. Wong
` (11 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:18 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4, Andreas Dilger
From: Andreas Dilger <adilger@whamcloud.com>
Add a readahead method for prefetching ranges of disk blocks. This is
useful for inode table scanning, and other large contiguous ranges of
blocks, and may also prove useful for random block prefetch, since it
will allow reordering of the IO without waiting synchronously for the
reads to complete.
It is currently using the posix_fadvise(POSIX_FADV_WILLNEED)
interface, as this proved most efficient during our testing.
[darrick.wong@oracle.com]
Add a cache_release method for advising the pagecache to discard disk
cache blocks. Make the arguments to the readahead function take the
same ULL values as the other IO functions, and return an appropriate
error code when fadvise isn't available.
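For readers unfamiliar with the interface, the following is a minimal sketch of block-range prefetch and release on a plain file descriptor using posix_fadvise(). The names prefetch_blocks() and release_blocks() are hypothetical, and returning ENOTSUP when the advice constants are missing is an assumption of this sketch (the library itself returns EXT2_ET_OP_NOT_SUPPORTED).

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Ask the kernel to start reading `count` blocks beginning at `block`. */
static int prefetch_blocks(int fd, unsigned long long block,
			   unsigned long long count, unsigned int block_size)
{
	assert(block_size > 0);
#ifdef POSIX_FADV_WILLNEED
	/* posix_fadvise() returns an error number directly, not -1. */
	return posix_fadvise(fd, (off_t)block * block_size,
			     (off_t)count * block_size, POSIX_FADV_WILLNEED);
#else
	(void)fd; (void)block; (void)count;
	return ENOTSUP;
#endif
}

/* Tell the kernel the cached copies of these blocks may be dropped. */
static int release_blocks(int fd, unsigned long long block,
			  unsigned long long count, unsigned int block_size)
{
	assert(block_size > 0);
#ifdef POSIX_FADV_DONTNEED
	return posix_fadvise(fd, (off_t)block * block_size,
			     (off_t)count * block_size, POSIX_FADV_DONTNEED);
#else
	(void)fd; (void)block; (void)count;
	return ENOTSUP;
#endif
}
```

Both calls are advisory: they return as soon as the request is queued, which is what lets a scanner keep working while the reads are in flight.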
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
lib/ext2fs/ext2_io.h | 12 ++++++++++++
lib/ext2fs/io_manager.c | 18 ++++++++++++++++++
lib/ext2fs/unix_io.c | 46 +++++++++++++++++++++++++++++++++++++++++++---
3 files changed, 73 insertions(+), 3 deletions(-)
diff --git a/lib/ext2fs/ext2_io.h b/lib/ext2fs/ext2_io.h
index 1894fb8..636f797 100644
--- a/lib/ext2fs/ext2_io.h
+++ b/lib/ext2fs/ext2_io.h
@@ -90,6 +90,12 @@ struct struct_io_manager {
int count, const void *data);
errcode_t (*discard)(io_channel channel, unsigned long long block,
unsigned long long count);
+ errcode_t (*cache_readahead)(io_channel channel,
+ unsigned long long block,
+ unsigned long long count);
+ errcode_t (*cache_release)(io_channel channel,
+ unsigned long long block,
+ unsigned long long count);
long reserved[16];
};
@@ -124,6 +130,12 @@ extern errcode_t io_channel_discard(io_channel channel,
unsigned long long count);
extern errcode_t io_channel_alloc_buf(io_channel channel,
int count, void *ptr);
+extern errcode_t io_channel_cache_readahead(io_channel io,
+ unsigned long long block,
+ unsigned long long count);
+extern errcode_t io_channel_cache_release(io_channel io,
+ unsigned long long block,
+ unsigned long long count);
/* unix_io.c */
extern io_manager unix_io_manager;
diff --git a/lib/ext2fs/io_manager.c b/lib/ext2fs/io_manager.c
index 34e4859..a1258c4 100644
--- a/lib/ext2fs/io_manager.c
+++ b/lib/ext2fs/io_manager.c
@@ -128,3 +128,21 @@ errcode_t io_channel_alloc_buf(io_channel io, int count, void *ptr)
else
return ext2fs_get_mem(size, ptr);
}
+
+errcode_t io_channel_cache_readahead(io_channel io, unsigned long long block,
+ unsigned long long count)
+{
+ if (!io->manager->cache_readahead)
+ return EXT2_ET_OP_NOT_SUPPORTED;
+
+ return io->manager->cache_readahead(io, block, count);
+}
+
+errcode_t io_channel_cache_release(io_channel io, unsigned long long block,
+ unsigned long long count)
+{
+ if (!io->manager->cache_release)
+ return EXT2_ET_OP_NOT_SUPPORTED;
+
+ return io->manager->cache_release(io, block, count);
+}
diff --git a/lib/ext2fs/unix_io.c b/lib/ext2fs/unix_io.c
index a818c13..a95e289 100644
--- a/lib/ext2fs/unix_io.c
+++ b/lib/ext2fs/unix_io.c
@@ -15,6 +15,9 @@
* %End-Header%
*/
+#define _XOPEN_SOURCE 600
+#define _DARWIN_C_SOURCE
+#define _FILE_OFFSET_BITS 64
#define _LARGEFILE_SOURCE
#define _LARGEFILE64_SOURCE
#ifndef _GNU_SOURCE
@@ -35,6 +38,9 @@
#ifdef __linux__
#include <sys/utsname.h>
#endif
+#if HAVE_SYS_TYPES_H
+#include <sys/types.h>
+#endif
#ifdef HAVE_SYS_IOCTL_H
#include <sys/ioctl.h>
#endif
@@ -44,9 +50,6 @@
#if HAVE_SYS_STAT_H
#include <sys/stat.h>
#endif
-#if HAVE_SYS_TYPES_H
-#include <sys/types.h>
-#endif
#if HAVE_SYS_RESOURCE_H
#include <sys/resource.h>
#endif
@@ -97,6 +100,7 @@ struct unix_private_data {
#define IS_ALIGNED(n, align) ((((unsigned long) n) & \
((unsigned long) ((align)-1))) == 0)
+
static errcode_t unix_get_stats(io_channel channel, io_stats *stats)
{
errcode_t retval = 0;
@@ -810,6 +814,40 @@ static errcode_t unix_write_blk64(io_channel channel, unsigned long long block,
#endif /* NO_IO_CACHE */
}
+static errcode_t unix_cache_readahead(io_channel channel,
+ unsigned long long block,
+ unsigned long long count)
+{
+#ifdef POSIX_FADV_WILLNEED
+ struct unix_private_data *data;
+
+ data = (struct unix_private_data *)channel->private_data;
+ return posix_fadvise(data->dev,
+ (ext2_loff_t)block * channel->block_size,
+ (ext2_loff_t)count * channel->block_size,
+ POSIX_FADV_WILLNEED);
+#else
+ return EXT2_ET_OP_NOT_SUPPORTED;
+#endif
+}
+
+static errcode_t unix_cache_release(io_channel channel,
+ unsigned long long block,
+ unsigned long long count)
+{
+#ifdef POSIX_FADV_DONTNEED
+ struct unix_private_data *data;
+
+ data = (struct unix_private_data *)channel->private_data;
+ return posix_fadvise(data->dev,
+ (ext2_loff_t)block * channel->block_size,
+ (ext2_loff_t)count * channel->block_size,
+ POSIX_FADV_DONTNEED);
+#else
+ return EXT2_ET_OP_NOT_SUPPORTED;
+#endif
+}
+
static errcode_t unix_write_blk(io_channel channel, unsigned long block,
int count, const void *buf)
{
@@ -961,6 +999,8 @@ static struct struct_io_manager struct_unix_manager = {
unix_read_blk64,
unix_write_blk64,
unix_discard,
+ unix_cache_readahead,
+ unix_cache_release,
};
io_manager unix_io_manager = &struct_unix_manager;
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 19/32] libext2fs: allow clients to read-ahead metadata
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (17 preceding siblings ...)
2014-03-02 7:18 ` [PATCH 18/32] ext2fs: add readahead method to improve scanning Darrick J. Wong
@ 2014-03-02 7:18 ` Darrick J. Wong
2014-03-02 7:18 ` [PATCH 20/32] e2fsck: read-ahead metadata during passes 1, 2, and 4 Darrick J. Wong
` (10 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:18 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
This patch adds to libext2fs the ability to prefetch metadata into the
page cache, in the hope of speeding up libext2fs clients.
There are two main library functions: ext2fs_readahead_dblist() lets a
client read ahead a list of blocks, and ext2fs_readahead() is a helper
that uses that mechanism to load group data (superblocks, group
descriptors, bitmaps, inode tables). A third function,
ext2fs_can_readahead(), reports whether the IO channel supports
readahead at all.
e2fsck will employ both readahead methods to speed itself up.
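The block-list readahead works by sorting the requested blocks and issuing one readahead call per contiguous run. A minimal sketch of that run-building step, on a plain array rather than an ext2_dblist, might look like this; struct block_run and coalesce_blocks() are hypothetical names, and dropping duplicate block numbers is a choice made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

struct block_run {
	unsigned long long start;	/* first block of the run */
	unsigned long long len;		/* number of blocks in the run */
};

/* Three-way compare; avoids truncating a 64-bit difference to int. */
static int cmp_block(const void *a, const void *b)
{
	unsigned long long x = *(const unsigned long long *)a;
	unsigned long long y = *(const unsigned long long *)b;

	return (x > y) - (x < y);
}

/*
 * Sort `blocks` in place and merge adjacent block numbers into runs.
 * `runs` must have room for `nblocks` entries. Returns the run count.
 */
static size_t coalesce_blocks(unsigned long long *blocks, size_t nblocks,
			      struct block_run *runs)
{
	size_t i, nruns = 0;

	assert(blocks != NULL && runs != NULL);
	qsort(blocks, nblocks, sizeof(*blocks), cmp_block);

	for (i = 0; i < nblocks; i++) {
		if (nruns) {
			struct block_run *last = &runs[nruns - 1];

			if (blocks[i] < last->start + last->len)
				continue;	/* duplicate, already covered */
			if (blocks[i] == last->start + last->len) {
				last->len++;	/* extends the current run */
				continue;
			}
		}
		runs[nruns].start = blocks[i];
		runs[nruns].len = 1;
		nruns++;
	}
	return nruns;
}
```

Each resulting run would then be handed to the IO channel's readahead method as a single (start, length) request, which keeps the number of advisory calls small when the metadata is laid out contiguously.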
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
lib/ext2fs/Makefile.in | 4 +
lib/ext2fs/ext2fs.h | 13 +++
lib/ext2fs/readahead.c | 188 ++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 205 insertions(+)
create mode 100644 lib/ext2fs/readahead.c
diff --git a/lib/ext2fs/Makefile.in b/lib/ext2fs/Makefile.in
index 88808a3..dde4c6d 100644
--- a/lib/ext2fs/Makefile.in
+++ b/lib/ext2fs/Makefile.in
@@ -77,6 +77,7 @@ OBJS= $(DEBUGFS_LIB_OBJS) $(RESIZE_LIB_OBJS) $(E2IMAGE_LIB_OBJS) \
qcow2.o \
read_bb.o \
read_bb_file.o \
+ readahead.o \
res_gdt.o \
rw_bitmaps.o \
swapfs.o \
@@ -153,6 +154,7 @@ SRCS= ext2_err.c \
$(srcdir)/qcow2.c \
$(srcdir)/read_bb.c \
$(srcdir)/read_bb_file.c \
+ $(srcdir)/readahead.c \
$(srcdir)/res_gdt.c \
$(srcdir)/rw_bitmaps.c \
$(srcdir)/swapfs.c \
@@ -887,6 +889,8 @@ read_bb_file.o: $(srcdir)/read_bb_file.c $(top_builddir)/lib/config.h \
$(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h $(top_srcdir)/lib/et/com_err.h \
$(srcdir)/ext2_io.h $(top_builddir)/lib/ext2fs/ext2_err.h \
$(srcdir)/ext2_ext_attr.h $(srcdir)/bitops.h
+readahead.o: $(srcdir)/readahead.c $(top_builddir)/lib/config.h \
+ $(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h $(top_builddir)/lib/ext2fs/ext2_err.h
res_gdt.o: $(srcdir)/res_gdt.c $(top_builddir)/lib/config.h \
$(top_builddir)/lib/dirpaths.h $(srcdir)/ext2_fs.h \
$(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h \
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 645285b..8aa0ac9 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1543,6 +1543,19 @@ extern errcode_t ext2fs_read_bb_FILE(ext2_filsys fs, FILE *f,
void (*invalid)(ext2_filsys fs,
blk_t blk));
+/* readahead.c */
+#define EXT2_READA_SUPER (0x01)
+#define EXT2_READA_GDT (0x02)
+#define EXT2_READA_BBITMAP (0x04)
+#define EXT2_READA_IBITMAP (0x08)
+#define EXT2_READA_ITABLE (0x10)
+#define EXT2_READA_ALL_FLAGS (0x1F)
+errcode_t ext2fs_readahead(ext2_filsys fs, int flags, dgrp_t start,
+ dgrp_t ngroups);
+errcode_t ext2fs_readahead_dblist(ext2_filsys fs, int flags,
+ ext2_dblist dblist);
+int ext2fs_can_readahead(ext2_filsys fs);
+
/* res_gdt.c */
extern errcode_t ext2fs_create_resize_inode(ext2_filsys fs);
diff --git a/lib/ext2fs/readahead.c b/lib/ext2fs/readahead.c
new file mode 100644
index 0000000..ed6e555
--- /dev/null
+++ b/lib/ext2fs/readahead.c
@@ -0,0 +1,188 @@
+/*
+ * readahead.c -- Try to convince the OS to prefetch metadata.
+ *
+ * Copyright (C) 2014 Oracle.
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Library
+ * General Public License, version 2.
+ * %End-Header%
+ */
+
+#include "config.h"
+#include <string.h>
+
+#include "ext2_fs.h"
+#include "ext2fs.h"
+
+#undef DEBUG
+
+#ifdef DEBUG
+# define dbg_printf(f, a...) do {printf(f, ## a); fflush(stdout); } while (0)
+#else
+# define dbg_printf(f, a...)
+#endif
+
+struct read_dblist {
+ errcode_t err;
+ blk64_t run_start;
+ blk64_t run_len;
+};
+
+static EXT2_QSORT_TYPE readahead_dir_block_cmp(const void *a, const void *b)
+{
+ const struct ext2_db_entry2 *db_a =
+ (const struct ext2_db_entry2 *) a;
+ const struct ext2_db_entry2 *db_b =
+ (const struct ext2_db_entry2 *) b;
+
+ return (int) (db_a->blk - db_b->blk);
+}
+
+static int readahead_dir_block(ext2_filsys fs, struct ext2_db_entry2 *db,
+ void *priv_data)
+{
+ errcode_t err = 0;
+ struct read_dblist *pr = priv_data;
+
+ if (!pr->run_len || db->blk != pr->run_start + pr->run_len) {
+ if (pr->run_len) {
+ pr->err = io_channel_cache_readahead(fs->io,
+ pr->run_start,
+ pr->run_len);
+ dbg_printf("readahead start=%llu len=%llu err=%d\n",
+ pr->run_start, pr->run_len,
+ (int)pr->err);
+ }
+ pr->run_start = db->blk;
+ pr->run_len = 0;
+ }
+ pr->run_len += db->blockcnt;
+
+ return pr->err ? DBLIST_ABORT : 0;
+}
+
+errcode_t ext2fs_readahead_dblist(ext2_filsys fs, int flags,
+ ext2_dblist dblist)
+{
+ errcode_t err;
+ struct read_dblist pr;
+
+ dbg_printf("%s: flags=0x%x\n", __func__, flags);
+ if (flags)
+ return EXT2_ET_INVALID_ARGUMENT;
+
+ ext2fs_dblist_sort2(dblist, readahead_dir_block_cmp);
+
+ memset(&pr, 0, sizeof(pr));
+ err = ext2fs_dblist_iterate2(dblist, readahead_dir_block, &pr);
+ if (pr.err)
+ return pr.err;
+ if (err)
+ return err;
+
+ if (pr.run_len)
+ err = io_channel_cache_readahead(fs->io, pr.run_start,
+ pr.run_len);
+
+ return err;
+}
+
+errcode_t ext2fs_readahead(ext2_filsys fs, int flags, dgrp_t start,
+ dgrp_t ngroups)
+{
+ blk64_t super, old_gdt, new_gdt;
+ blk_t blocks;
+ dgrp_t i;
+ ext2_dblist dblist;
+ dgrp_t end = start + ngroups;
+ errcode_t err = 0;
+
+ dbg_printf("%s: flags=0x%x start=%d groups=%d\n", __func__, flags,
+ start, ngroups);
+ if (flags & ~EXT2_READA_ALL_FLAGS)
+ return EXT2_ET_INVALID_ARGUMENT;
+
+ if (end > fs->group_desc_count)
+ end = fs->group_desc_count;
+
+ if (flags == 0)
+ return 0;
+
+ err = ext2fs_init_dblist(fs, &dblist);
+ if (err)
+ return err;
+
+ for (i = start; i < end; i++) {
+ err = ext2fs_super_and_bgd_loc2(fs, i, &super, &old_gdt,
+ &new_gdt, &blocks);
+ if (err)
+ break;
+
+ if (flags & EXT2_READA_SUPER) {
+ err = ext2fs_add_dir_block2(dblist, 0, super, 0);
+ if (err)
+ break;
+ }
+
+ if (flags & EXT2_READA_GDT) {
+ if (old_gdt)
+ err = ext2fs_add_dir_block2(dblist, 0, old_gdt,
+ blocks);
+ else if (new_gdt)
+ err = ext2fs_add_dir_block2(dblist, 0, new_gdt,
+ blocks);
+ else
+ err = 0;
+ if (err)
+ break;
+ }
+
+ if ((flags & EXT2_READA_BBITMAP) &&
+ !ext2fs_bg_flags_test(fs, i, EXT2_BG_BLOCK_UNINIT) &&
+ ext2fs_bg_free_blocks_count(fs, i) <
+ fs->super->s_blocks_per_group) {
+ super = ext2fs_block_bitmap_loc(fs, i);
+ err = ext2fs_add_dir_block2(dblist, 0, super, 1);
+ if (err)
+ break;
+ }
+
+ if ((flags & EXT2_READA_IBITMAP) &&
+ !ext2fs_bg_flags_test(fs, i, EXT2_BG_INODE_UNINIT) &&
+ ext2fs_bg_free_inodes_count(fs, i) <
+ fs->super->s_inodes_per_group) {
+ super = ext2fs_inode_bitmap_loc(fs, i);
+ err = ext2fs_add_dir_block2(dblist, 0, super, 1);
+ if (err)
+ break;
+ }
+
+ if ((flags & EXT2_READA_ITABLE) &&
+ ext2fs_bg_free_inodes_count(fs, i) <
+ fs->super->s_inodes_per_group) {
+ super = ext2fs_inode_table_loc(fs, i);
+ blocks = fs->inode_blocks_per_group -
+ (ext2fs_bg_itable_unused(fs, i) *
+ EXT2_INODE_SIZE(fs->super) / fs->blocksize);
+ err = ext2fs_add_dir_block2(dblist, 0, super, blocks);
+ if (err)
+ break;
+ }
+ }
+
+ if (!err)
+ err = ext2fs_readahead_dblist(fs, 0, dblist);
+
+ ext2fs_free_dblist(dblist);
+ return err;
+}
+
+int ext2fs_can_readahead(ext2_filsys fs)
+{
+ errcode_t err;
+
+ err = io_channel_cache_readahead(fs->io, 0, 1);
+ dbg_printf("%s: supp=%d\n", __func__, err != EXT2_ET_OP_NOT_SUPPORTED);
+ return err != EXT2_ET_OP_NOT_SUPPORTED;
+}
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 20/32] e2fsck: read-ahead metadata during passes 1, 2, and 4
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (18 preceding siblings ...)
2014-03-02 7:18 ` [PATCH 19/32] libext2fs: allow clients to read-ahead metadata Darrick J. Wong
@ 2014-03-02 7:18 ` Darrick J. Wong
2014-03-02 7:18 ` [PATCH 21/32] libext2fs: when appending to a file, don't split an index block in equal halves Darrick J. Wong
` (9 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:18 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
e2fsck pass1 is modified to use the block group data prefetch function
to try to fetch the inode tables into the pagecache before they are
needed. In order to avoid cache thrashing, we limit ourselves to
prefetching at most half of physical memory by default.
pass2 is modified to use the dirblock prefetching function to prefetch
the list of directory blocks that are assembled in pass1. So long as
we don't anticipate rehashing the dirs (pass 3a), we can release the
dirblocks as soon as we're done checking them.
pass4 is modified to prefetch the block and inode bitmaps in
anticipation of pass 5, because pass4 is entirely CPU bound.
In general, these mechanisms can halve fsck time, if the host system
has sufficient memory and the storage system can provide a lot of
IOPS. SSDs and multi-spindle RAIDs see the most speedup; single disks
experience a modest speedup, and single-spindle USB mass storage
devices see hardly any benefit.
By default, readahead will try to fill half the physical memory in the
system. The -R option specifies the amount of memory (in KiB) to use
for readahead, or zero to disable it entirely; the same limits can be
set in e2fsck.conf via readahead_mem_kb and readahead_mem_pct.
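To make the memory budget concrete, here is a minimal sketch of how such a limit could be derived. It is not the code in e2fsck/util.c: the function names are hypothetical, sysconf(_SC_PHYS_PAGES) is assumed to be available (it is on Linux, but it is not required by POSIX), and mapping the budget to a number of block groups is an assumption about how a caller might use it.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <unistd.h>

/* Physical memory in bytes, or 0 if it cannot be determined. */
static long long physical_memory_bytes(void)
{
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
	long pages = sysconf(_SC_PHYS_PAGES);
	long page_size = sysconf(_SC_PAGESIZE);

	if (pages > 0 && page_size > 0)
		return (long long)pages * page_size;
#endif
	return 0;
}

/* Readahead budget in KiB: `pct` percent of `mem_bytes`. */
static unsigned long long readahead_budget_kb(long long mem_bytes,
					      unsigned int pct)
{
	assert(pct <= 100);
	if (mem_bytes <= 0)
		return 0;	/* unknown memory size: no readahead */
	return (unsigned long long)mem_bytes / 1024 * pct / 100;
}

/* How many block groups' inode tables fit within the budget. */
static unsigned long long groups_in_budget(unsigned long long budget_kb,
					   unsigned int inode_blocks_per_group,
					   unsigned int block_size)
{
	unsigned long long itable_kb =
		(unsigned long long)inode_blocks_per_group * block_size / 1024;

	return itable_kb ? budget_kb / itable_kb : 0;
}
```

With 8 GiB of RAM and the default of 50%, the budget is 4194304 KiB; if each group has 512 inode table blocks of 4096 bytes (2048 KiB per group), that covers 2048 groups per readahead window.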
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
MCONFIG.in | 1
configure | 49 +++++++++++++++++
configure.in | 6 ++
e2fsck/Makefile.in | 4 +
e2fsck/e2fsck.8.in | 9 +++
e2fsck/e2fsck.c | 136 +++++++++++++++++++++++++++++++++++++++++++++++
e2fsck/e2fsck.conf.5.in | 13 ++++
e2fsck/e2fsck.h | 25 +++++++++
e2fsck/pass1.c | 83 +++++++++++++++++++++++++++++
e2fsck/pass2.c | 97 +++++++++++++++++++++++++++++++++-
e2fsck/pass4.c | 22 ++++++++
e2fsck/prof_err.et | 1
e2fsck/rehash.c | 10 +++
e2fsck/unix.c | 35 +++++++++++-
e2fsck/util.c | 51 ++++++++++++++++++
lib/config.h.in | 9 +++
16 files changed, 545 insertions(+), 6 deletions(-)
diff --git a/MCONFIG.in b/MCONFIG.in
index 5ed4df0..aeab004 100644
--- a/MCONFIG.in
+++ b/MCONFIG.in
@@ -110,6 +110,7 @@ LIBUUID = @LIBUUID@ @SOCKET_LIB@
LIBQUOTA = @STATIC_LIBQUOTA@
LIBBLKID = @LIBBLKID@ @PRIVATE_LIBS_CMT@ $(LIBUUID)
LIBINTL = @LIBINTL@
+LIBPTHREADS = @PTHREADS_LIB@
SYSLIBS = @LIBS@
DEPLIBSS = $(LIB)/libss@LIB_EXT@
DEPLIBCOM_ERR = $(LIB)/libcom_err@LIB_EXT@
diff --git a/configure b/configure
index 7b0a0d1..5b89229 100755
--- a/configure
+++ b/configure
@@ -639,6 +639,7 @@ CYGWIN_CMT
LINUX_CMT
UNI_DIFF_OPTS
SEM_INIT_LIB
+PTHREADS_LIB
SOCKET_LIB
SIZEOF_OFF_T
SIZEOF_LONG_LONG
@@ -10474,7 +10475,7 @@ fi
done
fi
-for ac_header in dirent.h errno.h execinfo.h getopt.h malloc.h mntent.h paths.h semaphore.h setjmp.h signal.h stdarg.h stdint.h stdlib.h termios.h termio.h unistd.h utime.h linux/falloc.h linux/fd.h linux/major.h linux/loop.h net/if_dl.h netinet/in.h sys/disklabel.h sys/file.h sys/ioctl.h sys/mkdev.h sys/mman.h sys/prctl.h sys/queue.h sys/resource.h sys/select.h sys/socket.h sys/sockio.h sys/stat.h sys/syscall.h sys/sysmacros.h sys/time.h sys/types.h sys/un.h sys/wait.h
+for ac_header in dirent.h errno.h execinfo.h getopt.h malloc.h mntent.h paths.h semaphore.h setjmp.h signal.h stdarg.h stdint.h stdlib.h termios.h termio.h unistd.h utime.h linux/falloc.h linux/fd.h linux/major.h linux/loop.h net/if_dl.h netinet/in.h sys/disklabel.h sys/file.h sys/ioctl.h sys/mkdev.h sys/mman.h sys/prctl.h sys/queue.h sys/resource.h sys/select.h sys/socket.h sys/sockio.h sys/stat.h sys/syscall.h sys/sysctl.h sys/sysmacros.h sys/time.h sys/types.h sys/un.h sys/wait.h
do :
as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default"
@@ -11235,6 +11236,52 @@ if test $ac_cv_have_optreset = yes; then
$as_echo "#define HAVE_OPTRESET 1" >>confdefs.h
fi
+PTHREADS_LIB='-lpthread'
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for pthread_create in -lpthread" >&5
+$as_echo_n "checking for pthread_create in -lpthread... " >&6; }
+if ${ac_cv_lib_pthread_pthread_create+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ ac_check_lib_save_LIBS=$LIBS
+LIBS="-lpthread $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+
+/* Override any GCC internal prototype to avoid an error.
+ Use char because int might match the return type of a GCC
+ builtin and then its argument prototype would still apply. */
+#ifdef __cplusplus
+extern "C"
+#endif
+char pthread_create ();
+int
+main ()
+{
+return pthread_create ();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ ac_cv_lib_pthread_pthread_create=yes
+else
+ ac_cv_lib_pthread_pthread_create=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+LIBS=$ac_check_lib_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_pthread_pthread_create" >&5
+$as_echo "$ac_cv_lib_pthread_pthread_create" >&6; }
+if test "x$ac_cv_lib_pthread_pthread_create" = xyes; then :
+ cat >>confdefs.h <<_ACEOF
+#define HAVE_LIBPTHREAD 1
+_ACEOF
+
+ LIBS="-lpthread $LIBS"
+
+fi
+
SEM_INIT_LIB=''
ac_fn_c_check_func "$LINENO" "sem_init" "ac_cv_func_sem_init"
diff --git a/configure.in b/configure.in
index f28bd46..d2cfe41 100644
--- a/configure.in
+++ b/configure.in
@@ -961,6 +961,7 @@ AC_CHECK_HEADERS(m4_flatten([
sys/sockio.h
sys/stat.h
sys/syscall.h
+ sys/sysctl.h
sys/sysmacros.h
sys/time.h
sys/types.h
@@ -1173,6 +1174,11 @@ if test $ac_cv_have_optreset = yes; then
AC_DEFINE(HAVE_OPTRESET, 1, [Define to 1 if optreset for getopt is present])
fi
dnl
+dnl Test for pthread_create in -lpthread
+dnl
+PTHREADS_LIB='-lpthread'
+AC_CHECK_LIB(pthread, pthread_create, AC_SUBST(PTHREADS_LIB))
+dnl
dnl Test for sem_init, and which library it might require:
dnl
AH_TEMPLATE([HAVE_SEM_INIT], [Define to 1 if sem_init() exists])
diff --git a/e2fsck/Makefile.in b/e2fsck/Makefile.in
index c23f1cb..67f4e76 100644
--- a/e2fsck/Makefile.in
+++ b/e2fsck/Makefile.in
@@ -16,13 +16,13 @@ MANPAGES= e2fsck.8
FMANPAGES= e2fsck.conf.5
LIBS= $(LIBQUOTA) $(LIBEXT2FS) $(LIBCOM_ERR) $(LIBBLKID) $(LIBUUID) \
- $(LIBINTL) $(LIBE2P) $(SYSLIBS)
+ $(LIBINTL) $(LIBE2P) $(SYSLIBS) $(LIBPTHREADS)
DEPLIBS= $(DEPLIBQUOTA) $(LIBEXT2FS) $(DEPLIBCOM_ERR) $(DEPLIBBLKID) \
$(DEPLIBUUID) $(DEPLIBE2P)
STATIC_LIBS= $(STATIC_LIBQUOTA) $(STATIC_LIBEXT2FS) $(STATIC_LIBCOM_ERR) \
$(STATIC_LIBBLKID) $(STATIC_LIBUUID) $(LIBINTL) $(STATIC_LIBE2P) \
- $(SYSLIBS)
+ $(SYSLIBS) $(LIBPTHREADS)
STATIC_DEPLIBS= $(DEPSTATIC_LIBQUOTA) $(STATIC_LIBEXT2FS) \
$(DEPSTATIC_LIBCOM_ERR) $(DEPSTATIC_LIBBLKID) \
$(DEPSTATIC_LIBUUID) $(DEPSTATIC_LIBE2P)
diff --git a/e2fsck/e2fsck.8.in b/e2fsck/e2fsck.8.in
index 43ee063..90eda4c 100644
--- a/e2fsck/e2fsck.8.in
+++ b/e2fsck/e2fsck.8.in
@@ -34,6 +34,10 @@ e2fsck \- check a Linux ext2/ext3/ext4 file system
.B \-E
.I extended_options
]
+[
+.B \-R
+.I readahead_mem_kb
+]
.I device
.SH DESCRIPTION
.B e2fsck
@@ -302,6 +306,11 @@ options.
This option does nothing at all; it is provided only for backwards
compatibility.
.TP
+.B \-R
+Use at most this many KiB to pre-fetch metadata in the hopes of reducing
+e2fsck runtime. By default, this uses half the physical memory in the
+system; setting this value to zero disables readahead entirely.
+.TP
.B \-t
Print timing statistics for
.BR e2fsck .
diff --git a/e2fsck/e2fsck.c b/e2fsck/e2fsck.c
index 0ec1540..c5d823c 100644
--- a/e2fsck/e2fsck.c
+++ b/e2fsck/e2fsck.c
@@ -15,6 +15,10 @@
#include "e2fsck.h"
#include "problem.h"
+#ifdef HAVE_PTHREAD_H
+#include <pthread.h>
+#endif
+
/*
* This function allocates an e2fsck context
*/
@@ -44,6 +48,8 @@ errcode_t e2fsck_allocate_context(e2fsck_t *ret)
context->flags |= E2F_FLAG_TIME_INSANE;
}
+ e2fsck_init_thread(&context->ra_thread);
+
*ret = context;
return 0;
}
@@ -209,6 +215,7 @@ int e2fsck_run(e2fsck_t ctx)
{
int i;
pass_t e2fsck_pass;
+ errcode_t err;
#ifdef HAVE_SETJMP_H
if (setjmp(ctx->abort_loc)) {
@@ -226,6 +233,10 @@ int e2fsck_run(e2fsck_t ctx)
e2fsck_pass(ctx);
if (ctx->progress)
(void) (ctx->progress)(ctx, 0, 0, 0);
+ err = e2fsck_stop_thread(&ctx->ra_thread, NULL);
+ if (err)
+ com_err(ctx->program_name, err, "%s",
+ _("while stopping readahead"));
}
ctx->flags &= ~E2F_FLAG_SETJMP_OK;
@@ -233,3 +244,128 @@ int e2fsck_run(e2fsck_t ctx)
return (ctx->flags & E2F_FLAG_RUN_RETURN);
return 0;
}
+
+#ifdef HAVE_PTHREAD_H
+struct run_threaded {
+ struct e2fsck_thread *thread;
+ void * (*func)(void *);
+ void (*cleanup)(void *);
+ void *arg;
+};
+
+static void run_threaded_cleanup(void *p)
+{
+ struct run_threaded *rt = p;
+
+ if (rt->cleanup)
+ rt->cleanup(rt->arg);
+ pthread_mutex_lock(&rt->thread->lock);
+ rt->thread->running = 0;
+ pthread_mutex_unlock(&rt->thread->lock);
+ ext2fs_free_mem(&rt);
+}
+
+static void *run_threaded_helper(void *p)
+{
+ int old;
+ struct run_threaded *rt = p;
+ void *ret;
+
+ pthread_cleanup_push(run_threaded_cleanup, rt);
+ pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old);
+ ret = rt->func(rt->arg);
+ pthread_setcanceltype(old, NULL);
+ pthread_cleanup_pop(1);
+ pthread_exit(ret);
+ return NULL;
+}
+#endif /* HAVE_PTHREAD_H */
+
+errcode_t e2fsck_init_thread(struct e2fsck_thread *thread)
+{
+ errcode_t err = 0;
+
+ thread->magic = E2FSCK_ET_MAGIC_RUN_THREAD;
+#ifdef HAVE_PTHREAD_H
+ err = pthread_mutex_init(&thread->lock, NULL);
+#endif /* HAVE_PTHREAD_H */
+
+ return err;
+}
+
+errcode_t e2fsck_run_thread(struct e2fsck_thread *thread,
+ void * (*func)(void *), void (*cleanup)(void *),
+ void *arg)
+{
+#ifdef HAVE_PTHREAD_H
+ struct run_threaded *rt;
+#endif
+ errcode_t err = 0, err2;
+
+ EXT2_CHECK_MAGIC(thread, E2FSCK_ET_MAGIC_RUN_THREAD);
+#ifdef HAVE_PTHREAD_H
+ err = pthread_mutex_lock(&thread->lock);
+ if (err)
+ return err;
+
+ if (thread->running) {
+ err = EAGAIN;
+ goto out;
+ }
+
+ err = pthread_join(thread->tid, NULL);
+ if (err && err != ESRCH)
+ goto out;
+
+ err = ext2fs_get_mem(sizeof(*rt), &rt);
+ if (err)
+ goto out;
+
+ rt->thread = thread;
+ rt->func = func;
+ rt->cleanup = cleanup;
+ rt->arg = arg;
+
+ err = pthread_create(&thread->tid, NULL, run_threaded_helper, rt);
+ if (err)
+ ext2fs_free_mem(&rt);
+ else
+ thread->running = 1;
+out:
+ pthread_mutex_unlock(&thread->lock);
+#else
+ thread->ret = func(arg);
+ if (cleanup)
+ cleanup(arg);
+#endif /* HAVE_PTHREAD_H */
+
+ return err;
+}
+
+errcode_t e2fsck_stop_thread(struct e2fsck_thread *thread, void **ret)
+{
+ errcode_t err = 0, err2;
+
+ EXT2_CHECK_MAGIC(thread, E2FSCK_ET_MAGIC_RUN_THREAD);
+
+#ifdef HAVE_PTHREAD_H
+ err = pthread_mutex_lock(&thread->lock);
+ if (err)
+ return err;
+ if (thread->running)
+ err = pthread_cancel(thread->tid);
+ if (err == ESRCH)
+ err = 0;
+ err2 = pthread_mutex_unlock(&thread->lock);
+ if (!err && err2)
+ err = err2;
+ if (!err)
+ err = pthread_join(thread->tid, ret);
+ if (err == ESRCH)
+ err = 0;
+#else
+ if (ret)
+ *ret = thread->ret;
+#endif
+ return err;
+}
diff --git a/e2fsck/e2fsck.conf.5.in b/e2fsck/e2fsck.conf.5.in
index a8219a8..fcda392 100644
--- a/e2fsck/e2fsck.conf.5.in
+++ b/e2fsck/e2fsck.conf.5.in
@@ -205,6 +205,19 @@ of that type are squelched. This can be useful if the console is slow
(i.e., connected to a serial port) and so a large amount of output could
end up delaying the boot process for a long time (potentially hours).
.TP
+.I readahead_mem_pct
+Use no more than this percentage of memory to try to read in metadata blocks
+ahead of the main e2fsck thread. This should reduce run times, depending on
+the speed of the underlying storage and the amount of free memory. By default,
+this is set to 50%.
+.TP
+.I readahead_mem_kb
+Use no more than this amount of memory to read in metadata blocks ahead of the
+main checking thread. Setting this value to zero disables readahead entirely.
+There is no default, but see
+.B readahead_mem_pct
+for more details.
+.TP
.I report_features
If this boolean relation is true, e2fsck will print the file system
features as part of its verbose reporting (i.e., if the
diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h
index d7a7be9..8ceeff9 100644
--- a/e2fsck/e2fsck.h
+++ b/e2fsck/e2fsck.h
@@ -11,6 +11,7 @@
#include <stdio.h>
#include <string.h>
+#include <stdint.h>
#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif
@@ -69,6 +70,24 @@
#include "quota/mkquota.h"
+/* Functions to run something asynchronously */
+struct e2fsck_thread {
+ int magic;
+#ifdef HAVE_PTHREAD_H
+ int running;
+ pthread_t tid;
+ pthread_mutex_t lock;
+#else
+ void *ret;
+#endif /* HAVE_PTHREAD_H */
+};
+
+errcode_t e2fsck_init_thread(struct e2fsck_thread *thread);
+errcode_t e2fsck_run_thread(struct e2fsck_thread *thread,
+ void * (*func)(void *), void (*cleanup)(void *),
+ void *arg);
+errcode_t e2fsck_stop_thread(struct e2fsck_thread *thread, void **ret);
+
/*
* Exit codes used by fsck-type programs
*/
@@ -373,6 +392,10 @@ struct e2fsck_struct {
* e2fsck functions themselves.
*/
void *priv_data;
+
+ /* How much are we allowed to readahead? */
+ unsigned long long readahead_mem_kb;
+ struct e2fsck_thread ra_thread;
};
/* Used by the region allocation code */
@@ -495,6 +518,7 @@ void e2fsck_rehash_dir_later(e2fsck_t ctx, ext2_ino_t ino);
int e2fsck_dir_will_be_rehashed(e2fsck_t ctx, ext2_ino_t ino);
errcode_t e2fsck_rehash_dir(e2fsck_t ctx, ext2_ino_t ino);
void e2fsck_rehash_directories(e2fsck_t ctx);
+int e2fsck_will_rehash_dirs(e2fsck_t ctx);
/* sigcatcher.c */
void sigcatcher_setup(void);
@@ -573,6 +597,7 @@ extern errcode_t e2fsck_allocate_subcluster_bitmap(ext2_filsys fs,
int default_type,
const char *profile_name,
ext2fs_block_bitmap *ret);
+int64_t get_memory_size(void);
/* unix.c */
extern void e2fsck_clear_progbar(e2fsck_t ctx);
diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index cf84db6..3475ed0 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -574,6 +574,49 @@ static errcode_t recheck_bad_inode_checksum(ext2_filsys fs, ext2_ino_t ino,
return 0;
}
+struct pass1ra_ctx {
+ ext2_filsys fs;
+ dgrp_t group;
+ dgrp_t ngroups;
+};
+
+static void pass1_readahead_cleanup(void *p)
+{
+ struct pass1ra_ctx *c = p;
+
+ ext2fs_free_mem(&p);
+}
+
+static void *pass1_readahead(void *p)
+{
+ struct pass1ra_ctx *c = p;
+ errcode_t err;
+
+ ext2fs_readahead(c->fs, EXT2_READA_ITABLE, c->group, c->ngroups);
+ return NULL;
+}
+
+static errcode_t initiate_readahead(e2fsck_t ctx, dgrp_t group, dgrp_t ngroups)
+{
+ struct pass1ra_ctx *ractx;
+ errcode_t err;
+
+ err = ext2fs_get_mem(sizeof(*ractx), &ractx);
+ if (err)
+ return err;
+
+ ractx->fs = ctx->fs;
+ ractx->group = group;
+ ractx->ngroups = ngroups;
+
+ err = e2fsck_run_thread(&ctx->ra_thread, pass1_readahead,
+ pass1_readahead_cleanup, ractx);
+ if (err)
+ ext2fs_free_mem(&ractx);
+
+ return err;
+}
+
void e2fsck_pass1(e2fsck_t ctx)
{
int i;
@@ -596,10 +639,37 @@ void e2fsck_pass1(e2fsck_t ctx)
int busted_fs_time = 0;
int inode_size;
int failed_csum = 0;
+ dgrp_t grp;
+ ext2_ino_t ra_threshold = 0;
+ dgrp_t ra_groups = 0;
+ errcode_t err;
init_resource_track(&rtrack, ctx->fs->io);
clear_problem_context(&pctx);
+ /* If we can do readahead, figure out how many groups to pull in. */
+ if (!ext2fs_can_readahead(ctx->fs))
+ ctx->readahead_mem_kb = 0;
+ if (ctx->readahead_mem_kb) {
+ ra_groups = ctx->readahead_mem_kb /
+ (fs->inode_blocks_per_group * fs->blocksize /
+ 1024);
+ if (ra_groups < 16)
+ ra_groups = 0;
+ else if (ra_groups > fs->group_desc_count)
+ ra_groups = fs->group_desc_count;
+ if (ra_groups) {
+ err = initiate_readahead(ctx, 0, ra_groups);
+ if (err) {
+ com_err(ctx->program_name, err, "%s",
+ _("while starting pass1 readahead"));
+ ra_groups = 0;
+ }
+ ra_threshold = ra_groups *
+ fs->super->s_inodes_per_group;
+ }
+ }
+
if (!(ctx->options & E2F_OPT_PREEN))
fix_problem(ctx, PR_1_PASS_HEADER, &pctx);
@@ -761,6 +831,19 @@ void e2fsck_pass1(e2fsck_t ctx)
if (e2fsck_mmp_update(fs))
fatal_error(ctx, 0);
}
+ if (ra_groups && ino > ra_threshold) {
+ grp = (ino - 1) / fs->super->s_inodes_per_group;
+ ra_threshold = (grp + ra_groups) *
+ fs->super->s_inodes_per_group;
+ err = initiate_readahead(ctx, grp, ra_groups);
+ if (err == EAGAIN) {
+ printf("Disabling slow readahead.\n");
+ ra_groups = 0;
+ } else if (err) {
+ com_err(ctx->program_name, err, "%s",
+ _("while starting pass1 readahead"));
+ }
+ }
old_op = ehandler_operation(_("getting next inode from scan"));
pctx.errcode = ext2fs_get_next_inode_full(scan, &ino,
inode, inode_size);
diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index 5a2745a..3e22a18 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -61,6 +61,9 @@
* Keeps track of how many times an inode is referenced.
*/
static void deallocate_inode(e2fsck_t ctx, ext2_ino_t ino, char* block_buf);
+static int check_dir_block2(ext2_filsys fs,
+ struct ext2_db_entry2 *dir_blocks_info,
+ void *priv_data);
static int check_dir_block(ext2_filsys fs,
struct ext2_db_entry2 *dir_blocks_info,
void *priv_data);
@@ -77,8 +80,67 @@ struct check_dir_struct {
struct problem_context pctx;
int count, max;
e2fsck_t ctx;
+ int save_readahead;
+};
+
+struct pass2_readahead_data {
+ ext2_filsys fs;
+ ext2_dblist dblist;
};
+static int readahead_dir_block(ext2_filsys fs, struct ext2_db_entry2 *db,
+ void *priv_data)
+{
+ db->blockcnt = 1;
+ return 0;
+}
+
+static void pass2_readahead_cleanup(void *p)
+{
+ struct pass2_readahead_data *pr = p;
+
+ ext2fs_free_dblist(pr->dblist);
+ ext2fs_free_mem(&pr);
+}
+
+static void *pass2_readahead(void *p)
+{
+ struct pass2_readahead_data *pr = p;
+
+ ext2fs_readahead_dblist(pr->fs, 0, pr->dblist);
+ return NULL;
+}
+
+static errcode_t initiate_readahead(e2fsck_t ctx)
+{
+ struct pass2_readahead_data *pr;
+ errcode_t err;
+
+ err = ext2fs_get_mem(sizeof(*pr), &pr);
+ if (err)
+ return err;
+ pr->fs = ctx->fs;
+ err = ext2fs_copy_dblist(ctx->fs->dblist, &pr->dblist);
+ if (err)
+ goto out_pr;
+ err = ext2fs_dblist_iterate2(pr->dblist, readahead_dir_block,
+ NULL);
+ if (err)
+ goto out_dblist;
+ err = e2fsck_run_thread(&ctx->ra_thread, pass2_readahead,
+ pass2_readahead_cleanup, pr);
+ if (err)
+ goto out_dblist;
+
+ return 0;
+
+out_dblist:
+ ext2fs_free_dblist(pr->dblist);
+out_pr:
+ ext2fs_free_mem(&pr);
+ return err;
+}
+
void e2fsck_pass2(e2fsck_t ctx)
{
struct ext2_super_block *sb = ctx->fs->super;
@@ -96,6 +158,10 @@ void e2fsck_pass2(e2fsck_t ctx)
int i, depth;
problem_t code;
int bad_dir;
+ int (*check_dir_func)(ext2_filsys fs,
+ struct ext2_db_entry2 *dir_blocks_info,
+ void *priv_data);
+ errcode_t err;
init_resource_track(&rtrack, ctx->fs->io);
clear_problem_context(&cd.pctx);
@@ -139,6 +205,7 @@ void e2fsck_pass2(e2fsck_t ctx)
cd.ctx = ctx;
cd.count = 1;
cd.max = ext2fs_dblist_count2(fs->dblist);
+ cd.save_readahead = e2fsck_will_rehash_dirs(ctx);
if (ctx->progress)
(void) (ctx->progress)(ctx, 2, 0, cd.max);
@@ -146,7 +213,16 @@ void e2fsck_pass2(e2fsck_t ctx)
if (fs->super->s_feature_compat & EXT2_FEATURE_COMPAT_DIR_INDEX)
ext2fs_dblist_sort2(fs->dblist, special_dir_block_cmp);
- cd.pctx.errcode = ext2fs_dblist_iterate2(fs->dblist, check_dir_block,
+ if (ctx->readahead_mem_kb) {
+ check_dir_func = check_dir_block2;
+ err = initiate_readahead(ctx);
+ if (err)
+ com_err(ctx->program_name, err, "%s",
+ _("while starting pass2 readahead"));
+ } else
+ check_dir_func = check_dir_block;
+
+ cd.pctx.errcode = ext2fs_dblist_iterate2(fs->dblist, check_dir_func,
&cd);
if (ctx->flags & E2F_FLAG_SIGNAL_MASK || ctx->flags & E2F_FLAG_RESTART)
return;
@@ -655,6 +731,7 @@ clear_and_exit:
clear_htree(cd->ctx, cd->pctx.ino);
dx_dir->numblocks = 0;
e2fsck_rehash_dir_later(cd->ctx, cd->pctx.ino);
+ cd->save_readahead = 1;
}
#endif /* ENABLE_HTREE */
@@ -730,6 +807,19 @@ static void salvage_directory(ext2_filsys fs,
}
}
+static int check_dir_block2(ext2_filsys fs,
+ struct ext2_db_entry2 *db,
+ void *priv_data)
+{
+ int err;
+ struct check_dir_struct *cd = priv_data;
+
+ err = check_dir_block(fs, db, priv_data);
+ if (!cd->save_readahead)
+ io_channel_cache_release(fs->io, db->blk, 1);
+ return err;
+}
+
static int check_dir_block(ext2_filsys fs,
struct ext2_db_entry2 *db,
void *priv_data)
@@ -894,6 +984,7 @@ out_htree:
&cd->pctx))
goto skip_checksum;
e2fsck_rehash_dir_later(ctx, ino);
+ cd->save_readahead = 1;
goto skip_checksum;
}
if (failed_csum) {
@@ -1162,6 +1253,7 @@ skip_checksum:
pctx.dirent = dirent;
fix_problem(ctx, PR_2_REPORT_DUP_DIRENT, &pctx);
e2fsck_rehash_dir_later(ctx, ino);
+ cd->save_readahead = 1;
dups_found++;
} else
dict_alloc_insert(&de_dict, dirent, dirent);
@@ -1209,7 +1301,10 @@ skip_checksum:
EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) &&
is_leaf &&
!ext2fs_dirent_has_tail(fs, (struct ext2_dir_entry *)buf))
+ {
e2fsck_rehash_dir_later(ctx, ino);
+ cd->save_readahead = 1;
+ }
write_and_fix:
if (e2fsck_dir_will_be_rehashed(ctx, ino))
diff --git a/e2fsck/pass4.c b/e2fsck/pass4.c
index 21d93f0..959dfc3 100644
--- a/e2fsck/pass4.c
+++ b/e2fsck/pass4.c
@@ -87,6 +87,21 @@ static int disconnect_inode(e2fsck_t ctx, ext2_ino_t i,
return 0;
}
+/* Since pass4 is mostly CPU bound, start readahead of bitmaps for pass 5. */
+static void *pass5_readahead(void *p)
+{
+ ext2_filsys fs = p;
+
+ ext2fs_readahead(fs, EXT2_READA_BBITMAP | EXT2_READA_IBITMAP, 0,
+ fs->group_desc_count);
+ return NULL;
+}
+
+static errcode_t initiate_readahead(e2fsck_t ctx)
+{
+ return e2fsck_run_thread(&ctx->ra_thread, pass5_readahead, NULL,
+ ctx->fs);
+}
void e2fsck_pass4(e2fsck_t ctx)
{
@@ -100,12 +115,19 @@ void e2fsck_pass4(e2fsck_t ctx)
__u16 link_count, link_counted;
char *buf = 0;
dgrp_t group, maxgroup;
+ errcode_t err;
init_resource_track(&rtrack, ctx->fs->io);
#ifdef MTRACE
mtrace_print("Pass 4");
#endif
+ if (ctx->readahead_mem_kb) {
+ err = initiate_readahead(ctx);
+ if (err)
+ com_err(ctx->program_name, err, "%s",
+ _("while starting pass5 readahead"));
+ }
clear_problem_context(&pctx);
diff --git a/e2fsck/prof_err.et b/e2fsck/prof_err.et
index c9316c7..21fb524 100644
--- a/e2fsck/prof_err.et
+++ b/e2fsck/prof_err.et
@@ -62,5 +62,6 @@ error_code PROF_BAD_INTEGER, "Invalid integer value"
error_code PROF_MAGIC_FILE_DATA, "Bad magic value in profile_file_data_t"
+error_code E2FSCK_ET_MAGIC_RUN_THREAD, "Wrong magic number for e2fsck_thread structure"
end
diff --git a/e2fsck/rehash.c b/e2fsck/rehash.c
index 9b90353..283515c 100644
--- a/e2fsck/rehash.c
+++ b/e2fsck/rehash.c
@@ -71,6 +71,16 @@ int e2fsck_dir_will_be_rehashed(e2fsck_t ctx, ext2_ino_t ino)
return ext2fs_u32_list_test(ctx->dirs_to_hash, ino);
}
+/* Ask if there will be a pass 3A. */
+int e2fsck_will_rehash_dirs(e2fsck_t ctx)
+{
+ if (ctx->options & E2F_OPT_COMPRESS_DIRS)
+ return 1;
+ if (!ctx->dirs_to_hash)
+ return 0;
+ return ext2fs_u32_list_count(ctx->dirs_to_hash) > 0;
+}
+
struct fill_dir_struct {
char *buf;
struct ext2_inode *inode;
diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index 67b3578..fac8cc9 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -74,7 +74,7 @@ static void usage(e2fsck_t ctx)
_("Usage: %s [-panyrcdfvtDFV] [-b superblock] [-B blocksize]\n"
"\t\t[-I inode_buffer_blocks] [-P process_inode_size]\n"
"\t\t[-l|-L bad_blocks_file] [-C fd] [-j external_journal]\n"
- "\t\t[-E extended-options] device\n"),
+ "\t\t[-E extended-options] [-R readahead_kb] device\n"),
ctx->program_name);
fprintf(stderr, "%s", _("\nEmergency help:\n"
@@ -90,6 +90,7 @@ static void usage(e2fsck_t ctx)
" -j external_journal Set location of the external journal\n"
" -l bad_blocks_file Add to badblocks list\n"
" -L bad_blocks_file Set badblocks list\n"
+ " -R readahead_kb Allow this much readahead.\n"
));
exit(FSCK_USAGE);
@@ -749,6 +750,7 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
#ifdef CONFIG_JBD_DEBUG
char *jbd_debug;
#endif
+ unsigned long long phys_mem_kb, reada_kb;
retval = e2fsck_allocate_context(&ctx);
if (retval)
@@ -775,8 +777,16 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
else
ctx->program_name = "e2fsck";
- while ((c = getopt (argc, argv, "panyrcC:B:dE:fvtFVM:b:I:j:P:l:L:N:SsDk")) != EOF)
+ phys_mem_kb = get_memory_size() / 1024;
+ reada_kb = ~0ULL;
+ while ((c = getopt(argc, argv,
+ "panyrcC:B:dE:fvtFVM:b:I:j:P:l:L:N:SsDkR:")) != EOF)
switch (c) {
+ case 'R':
+ res = sscanf(optarg, "%llu", &reada_kb);
+ if (res != 1)
+ goto sscanf_err;
+ break;
case 'C':
ctx->progress = e2fsck_update_progress;
res = sscanf(optarg, "%d", &ctx->progress_fd);
@@ -964,6 +974,22 @@ static errcode_t PRS(int argc, char *argv[], e2fsck_t *ret_ctx)
if (c)
verbose = 1;
+ /* Figure out how much memory goes to readahead */
+ profile_get_integer(ctx->profile, "options", "readahead_mem_pct", 0,
+ 50, &c);
+ if (c >= 0 && c <= 100)
+ ctx->readahead_mem_kb = phys_mem_kb * c / 100;
+ else
+ ctx->readahead_mem_kb = phys_mem_kb / 2;
+ profile_get_integer(ctx->profile, "options", "readahead_mem_kb", 0,
+ -1, &c);
+ if (c >= 0)
+ ctx->readahead_mem_kb = c;
+ if (reada_kb != ~0ULL)
+ ctx->readahead_mem_kb = reada_kb;
+ if (ctx->readahead_mem_kb > phys_mem_kb)
+ ctx->readahead_mem_kb = phys_mem_kb;
+
/* Turn off discard in read-only mode */
if ((ctx->options & E2F_OPT_NO) &&
(ctx->options & E2F_OPT_DISCARD))
@@ -1781,6 +1807,11 @@ no_journal:
}
}
+ retval = e2fsck_stop_thread(&ctx->ra_thread, NULL);
+ if (retval)
+ com_err(ctx->program_name, retval, "%s",
+ _("while stopping readahead"));
+
e2fsck_write_bitmaps(ctx);
io_channel_flush(ctx->fs->io);
print_resource_track(ctx, NULL, &ctx->global_rtrack, ctx->fs->io);
diff --git a/e2fsck/util.c b/e2fsck/util.c
index e7e8704..b88f9f5 100644
--- a/e2fsck/util.c
+++ b/e2fsck/util.c
@@ -37,6 +37,10 @@
#include <errno.h>
#endif
+#ifdef HAVE_SYS_SYSCTL_H
+#include <sys/sysctl.h>
+#endif
+
#include "e2fsck.h"
extern e2fsck_t e2fsck_global_ctx; /* Try your very best not to use this! */
@@ -845,3 +849,50 @@ errcode_t e2fsck_allocate_subcluster_bitmap(ext2_filsys fs, const char *descr,
fs->default_bitmap_type = save_type;
return retval;
}
+
+/* Return memory size in bytes */
+int64_t get_memory_size(void)
+{
+#if defined(_SC_PHYS_PAGES)
+# if defined(_SC_PAGESIZE)
+ return (int64_t)sysconf(_SC_PHYS_PAGES) *
+ (int64_t)sysconf(_SC_PAGESIZE);
+# elif defined(_SC_PAGE_SIZE)
+ return (int64_t)sysconf(_SC_PHYS_PAGES) *
+ (int64_t)sysconf(_SC_PAGE_SIZE);
+# endif
+#elif defined(_SC_AIX_REALMEM)
+ return (int64_t)sysconf(_SC_AIX_REALMEM) * (int64_t)1024L;
+#elif defined(CTL_HW)
+# if (defined(HW_MEMSIZE) || defined(HW_PHYSMEM64))
+# define CTL_HW_INT64
+# elif (defined(HW_PHYSMEM) || defined(HW_REALMEM))
+# define CTL_HW_UINT
+# endif
+ int mib[2];
+ mib[0] = CTL_HW;
+# if defined(HW_MEMSIZE)
+ mib[1] = HW_MEMSIZE;
+# elif defined(HW_PHYSMEM64)
+ mib[1] = HW_PHYSMEM64;
+# elif defined(HW_REALMEM)
+ mib[1] = HW_REALMEM;
+# elif defined(HW_PHYSMEM)
+ mib[1] = HW_PHYSMEM;
+# endif
+# if defined(CTL_HW_INT64)
+ int64_t size = 0;
+# elif defined(CTL_HW_UINT)
+ unsigned int size = 0;
+# endif
+# if defined(CTL_HW_INT64) || defined(CTL_HW_UINT)
+ size_t len = sizeof( size );
+ if ( sysctl( mib, 2, &size, &len, NULL, 0 ) == 0 )
+ return (int64_t)size;
+# endif
+ return 0;
+#else
+# warning "Don't know how to detect memory on your platform?"
+ return 0;
+#endif
+}
diff --git a/lib/config.h.in b/lib/config.h.in
index e0384ee..836c2df 100644
--- a/lib/config.h.in
+++ b/lib/config.h.in
@@ -203,6 +203,9 @@
/* Define if your <locale.h> file defines LC_MESSAGES. */
#undef HAVE_LC_MESSAGES
+/* Define to 1 if you have the `pthread' library (-lpthread). */
+#undef HAVE_LIBPTHREAD
+
/* Define to 1 if you have the <limits.h> header file. */
#undef HAVE_LIMITS_H
@@ -314,6 +317,9 @@
/* Define to 1 if you have the `pread' function. */
#undef HAVE_PREAD
+/* Define to 1 if you have the <pthread.h> header file. */
+#undef HAVE_PTHREAD_H
+
/* Define to 1 if you have the `putenv' function. */
#undef HAVE_PUTENV
@@ -465,6 +471,9 @@
/* Define to 1 if you have the <sys/syscall.h> header file. */
#undef HAVE_SYS_SYSCALL_H
+/* Define to 1 if you have the <sys/sysctl.h> header file. */
+#undef HAVE_SYS_SYSCTL_H
+
/* Define to 1 if you have the <sys/sysmacros.h> header file. */
#undef HAVE_SYS_SYSMACROS_H
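For readers following the PRS() and pass1 hunks above, here is a rough standalone sketch of how the readahead budget appears to be derived: percentage first, then the readahead_mem_kb profile value, then -R, then a clamp to physical memory, and finally a conversion into a number of block groups. The function names, the have_cli flag (standing in for the ~0ULL sentinel) and the zero-size guard are made up for illustration and are not part of the patch.

```c
/*
 * Hypothetical helper mirroring the precedence used in PRS():
 * readahead_mem_pct, then readahead_mem_kb, then -R, then clamp.
 * pct and conf_kb model the two profile integers; conf_kb < 0 means
 * "not set". have_cli replaces the patch's ~0ULL "not given" sentinel.
 */
unsigned long long toy_readahead_kb(unsigned long long phys_mem_kb,
				    int pct, int conf_kb, int have_cli,
				    unsigned long long cli_kb)
{
	unsigned long long kb;

	if (pct >= 0 && pct <= 100)
		kb = phys_mem_kb * pct / 100;
	else
		kb = phys_mem_kb / 2;	/* out-of-range percentage: use 50% */
	if (conf_kb >= 0)
		kb = conf_kb;		/* explicit size overrides percentage */
	if (have_cli)
		kb = cli_kb;		/* command line overrides the profile */
	if (kb > phys_mem_kb)
		kb = phys_mem_kb;	/* never exceed physical memory */
	return kb;
}

/*
 * Hypothetical helper mirroring the pass 1 setup: how many groups' worth
 * of inode tables fit in the budget. Fewer than 16 disables readahead.
 */
unsigned int toy_ra_groups(unsigned long long readahead_kb,
			   unsigned int inode_blocks_per_group,
			   unsigned int blocksize, unsigned int group_count)
{
	unsigned long long kb_per_group =
		(unsigned long long)inode_blocks_per_group * blocksize / 1024;
	unsigned long long groups;

	if (kb_per_group == 0)	/* guard added for this sketch only */
		return 0;
	groups = readahead_kb / kb_per_group;
	if (groups < 16)
		return 0;
	if (groups > group_count)
		return group_count;
	return (unsigned int)groups;
}
```

With 512 inode-table blocks of 4096 bytes per group, each group costs 2048 KB, so a 40960 KB budget covers 20 groups and anything below 32768 KB turns pass 1 readahead off.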
* [PATCH 21/32] libext2fs: when appending to a file, don't split an index block in equal halves
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (19 preceding siblings ...)
2014-03-02 7:18 ` [PATCH 20/32] e2fsck: read-ahead metadata during passes 1, 2, and 4 Darrick J. Wong
@ 2014-03-02 7:18 ` Darrick J. Wong
2014-03-02 7:18 ` [PATCH 22/32] libext2fs: find inode goal when allocating blocks Darrick J. Wong
` (8 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:18 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
When we're appending an extent to the end of a file and the index
block is full, don't split the index block into two half-full index
blocks because this leaves us with underutilized index blocks, at
least in the fallocate case. Instead, copy the last extent from the
full block into the new block. This isn't perfect utilization, but
there's a lot of work involved in teaching extent.c to be able to go to
a nonexistent node in a newly allocated (and empty) extent block.
This patch does not fix the general problem of keeping the extent tree
balanced.
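To make the split policy concrete, here is a minimal standalone sketch of the decision described above. It is not the libext2fs code: the function names and the left_at_level array (the count of entries to the right of the current position at each tree level, root first) are invented for illustration, and the separate new-root path is ignored.

```c
/*
 * Hypothetical model of the "are we appending at EOF?" test: true only
 * if we are below the root and no level has entries to our right.
 */
int toy_at_eof(const int *left_at_level, int depth)
{
	int i;

	if (depth == 0)		/* the root level never takes the shortcut */
		return 0;
	for (i = depth; i >= 0; i--)
		if (left_at_level[i])
			return 0;	/* something lies to our right */
	return 1;
}

/* How many entries move from the full node into the new node. */
int entries_to_move(int entries, const int *left_at_level,
		    int depth, int expand_allowed)
{
	if (expand_allowed && toy_at_eof(left_at_level, depth))
		return 1;	/* append case: move only the last entry */
	return entries / 2;	/* default: balanced split */
}
```

For a full 340-entry node at the tail of the file this moves one entry instead of 170, so the old block stays nearly full while the new one fills up as the file grows.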
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
lib/ext2fs/extent.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 72 insertions(+), 7 deletions(-)
diff --git a/lib/ext2fs/extent.c b/lib/ext2fs/extent.c
index 80ce88f..cf75a0b 100644
--- a/lib/ext2fs/extent.c
+++ b/lib/ext2fs/extent.c
@@ -29,6 +29,8 @@
#include "ext2fsP.h"
#include "e2image.h"
+#undef DEBUG
+
/*
* Definitions to be dropped in lib/ext2fs/ext2fs.h
*/
@@ -122,11 +124,39 @@ static void dbg_print_extent(char *desc, struct ext2fs_extent *extent)
}
+static void dump_path(const char *tag, struct ext2_extent_handle *handle,
+ struct extent_path *path)
+{
+ struct extent_path *ppp = path;
+ printf("%s: level=%d\n", tag, handle->level);
+
+ do {
+ printf("%s: path=%ld buf=%p entries=%d max_entries=%d left=%d "
+ "visit_num=%d flags=0x%x end_blk=%llu curr=%p(%ld)\n",
+ tag, (ppp - handle->path), ppp->buf, ppp->entries,
+ ppp->max_entries, ppp->left, ppp->visit_num, ppp->flags,
+ ppp->end_blk, ppp->curr, ppp->curr - (void *)ppp->buf);
+ printf(" ");
+ dbg_show_header((struct ext3_extent_header *)ppp->buf);
+ if (ppp->curr) {
+ printf(" ");
+ dbg_show_index(ppp->curr);
+ printf(" ");
+ dbg_show_extent(ppp->curr);
+ }
+ ppp--;
+ } while (ppp >= handle->path);
+ fflush(stdout);
+
+ return;
+}
+
#else
#define dbg_show_header(eh) do { } while (0)
#define dbg_show_index(ix) do { } while (0)
#define dbg_show_extent(ex) do { } while (0)
#define dbg_print_extent(desc, ex) do { } while (0)
+#define dump_path(tag, handle, path) do { } while (0)
#endif
/*
@@ -837,12 +867,31 @@ errcode_t ext2fs_extent_replace(ext2_extent_handle_t handle,
return 0;
}
+static int splitting_at_eof(struct ext2_extent_handle *handle,
+ struct extent_path *path)
+{
+ struct extent_path *ppp = path;
+ dump_path(__func__, handle, path);
+
+ if (handle->level == 0)
+ return 0;
+
+ do {
+ if (ppp->left)
+ return 0;
+ ppp--;
+ } while (ppp >= handle->path);
+
+ return 1;
+}
+
/*
* allocate a new block, move half the current node to it, and update parent
*
* handle will be left pointing at original record.
*/
-errcode_t ext2fs_extent_node_split(ext2_extent_handle_t handle)
+static errcode_t extent_node_split(ext2_extent_handle_t handle,
+ int expand_allowed)
{
errcode_t retval = 0;
blk64_t new_node_pblk;
@@ -857,6 +906,7 @@ errcode_t ext2fs_extent_node_split(ext2_extent_handle_t handle)
int tocopy;
int new_root = 0;
struct ext2_extent_info info;
+ int no_balance;
/* basic sanity */
EXT2_CHECK_MAGIC(handle, EXT2_ET_MAGIC_EXTENT_HANDLE);
@@ -897,7 +947,7 @@ errcode_t ext2fs_extent_node_split(ext2_extent_handle_t handle)
goto done;
goal_blk = extent.e_pblk;
- retval = ext2fs_extent_node_split(handle);
+ retval = extent_node_split(handle, expand_allowed);
if (retval)
goto done;
@@ -912,6 +962,14 @@ errcode_t ext2fs_extent_node_split(ext2_extent_handle_t handle)
if (!path->curr)
return EXT2_ET_NO_CURRENT_NODE;
+ /*
+ * Normally, we try to split a full node in half. This doesn't turn
+ * out so well if we're tacking extents on the end of the file because
+ * then we're stuck with a tree of half-full extent blocks. This of
+ * course doesn't apply to the root level.
+ */
+ no_balance = expand_allowed ? splitting_at_eof(handle, path) : 0;
+
/* extent header of the current node we'll split */
eh = (struct ext3_extent_header *)path->buf;
@@ -925,7 +983,10 @@ errcode_t ext2fs_extent_node_split(ext2_extent_handle_t handle)
if (retval)
goto done;
} else {
- tocopy = ext2fs_le16_to_cpu(eh->eh_entries) / 2;
+ if (no_balance)
+ tocopy = 1;
+ else
+ tocopy = ext2fs_le16_to_cpu(eh->eh_entries) / 2;
}
#ifdef DEBUG
@@ -934,7 +995,7 @@ errcode_t ext2fs_extent_node_split(ext2_extent_handle_t handle)
handle->level);
#endif
- if (!tocopy) {
+ if (!tocopy && !no_balance) {
#ifdef DEBUG
printf("Nothing to copy to new block!\n");
#endif
@@ -1059,8 +1120,7 @@ errcode_t ext2fs_extent_node_split(ext2_extent_handle_t handle)
goto done;
/* new node hooked in, so update inode block count (do this here?) */
- handle->inode->i_blocks += (handle->fs->blocksize *
- EXT2FS_CLUSTER_RATIO(handle->fs)) / 512;
+ ext2fs_iblk_add_blocks(handle->fs, handle->inode, 1);
retval = ext2fs_write_inode(handle->fs, handle->ino,
handle->inode);
if (retval)
@@ -1074,6 +1134,11 @@ done:
return retval;
}
+errcode_t ext2fs_extent_node_split(ext2_extent_handle_t handle)
+{
+ return extent_node_split(handle, 0);
+}
+
errcode_t ext2fs_extent_insert(ext2_extent_handle_t handle, int flags,
struct ext2fs_extent *extent)
{
@@ -1105,7 +1170,7 @@ errcode_t ext2fs_extent_insert(ext2_extent_handle_t handle, int flags,
printf("node full (level %d) - splitting\n",
handle->level);
#endif
- retval = ext2fs_extent_node_split(handle);
+ retval = extent_node_split(handle, 1);
if (retval)
return retval;
path = handle->path + handle->level;
* [PATCH 22/32] libext2fs: find inode goal when allocating blocks
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (20 preceding siblings ...)
2014-03-02 7:18 ` [PATCH 21/32] libext2fs: when appending to a file, don't split an index block in equal halves Darrick J. Wong
@ 2014-03-02 7:18 ` Darrick J. Wong
2014-03-02 7:19 ` [PATCH 23/32] libext2fs: find a range of empty blocks Darrick J. Wong
` (7 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:18 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
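Try to be a little smarter about where we go to allocate blocks for an
inode: instead of searching from block 0, start at the first block of
the inode's block group (or of its flex group, if flex_bg is in use).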
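As a rough illustration of the goal calculation, the sketch below redoes the arithmetic on a simplified geometry struct. The struct and function names are hypothetical; the real helper reads these values from the superblock through the existing libext2fs accessors.

```c
#include <stdint.h>

/* Hypothetical, simplified geometry; real values live in the superblock. */
struct toy_geometry {
	uint32_t first_data_block;	/* 0 or 1, depending on block size */
	uint32_t blocks_per_group;
	uint32_t inodes_per_group;
	uint8_t  log_groups_per_flex;	/* 0 means flex_bg is not in use */
};

/* Return the block at which a search for this inode's blocks should start. */
uint64_t toy_find_inode_goal(const struct toy_geometry *g, uint32_t ino)
{
	/* Inode numbers start at 1, so inode 1 lives in group 0. */
	uint32_t group = (ino - 1) / g->inodes_per_group;

	/* Round down to the first group of the flex group, if any. */
	if (g->log_groups_per_flex)
		group &= ~((1U << g->log_groups_per_flex) - 1);

	return g->first_data_block + (uint64_t)group * g->blocks_per_group;
}
```

With 16 groups per flex group (log value 4), an inode in group 20 gets a goal at the start of group 16, which keeps its blocks near the packed metadata of that flex group.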
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
e2fsck/pass2.c | 3 ++-
e2fsck/pass3.c | 2 +-
lib/ext2fs/alloc.c | 10 ++++++++++
lib/ext2fs/bmap.c | 5 +++--
lib/ext2fs/expanddir.c | 2 +-
lib/ext2fs/ext2fs.h | 1 +
lib/ext2fs/ext_attr.c | 3 +--
lib/ext2fs/extent.c | 10 ++--------
lib/ext2fs/mkdir.c | 3 ++-
lib/ext2fs/symlink.c | 3 ++-
10 files changed, 25 insertions(+), 17 deletions(-)
diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index 3e22a18..e2974eb 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -1619,7 +1619,8 @@ static int allocate_dir_block(e2fsck_t ctx,
/*
* First, find a free block
*/
- pctx->errcode = ext2fs_new_block2(fs, 0, ctx->block_found_map, &blk);
+ blk = ext2fs_find_inode_goal(fs, db->ino);
+ pctx->errcode = ext2fs_new_block2(fs, blk, ctx->block_found_map, &blk);
if (pctx->errcode) {
pctx->str = "ext2fs_new_block";
fix_problem(ctx, PR_2_ALLOC_DIRBOCK, pctx);
diff --git a/e2fsck/pass3.c b/e2fsck/pass3.c
index aaf177c..6769b05 100644
--- a/e2fsck/pass3.c
+++ b/e2fsck/pass3.c
@@ -812,7 +812,7 @@ errcode_t e2fsck_expand_directory(e2fsck_t ctx, ext2_ino_t dir,
es.num = num;
es.guaranteed_size = guaranteed_size;
- es.last_block = 0;
+ es.last_block = ext2fs_find_inode_goal(fs, dir);
es.err = 0;
es.newblocks = 0;
es.ctx = ctx;
diff --git a/lib/ext2fs/alloc.c b/lib/ext2fs/alloc.c
index 1be4ecc..aa084ac 100644
--- a/lib/ext2fs/alloc.c
+++ b/lib/ext2fs/alloc.c
@@ -293,3 +293,13 @@ void ext2fs_set_alloc_block_callback(ext2_filsys fs,
fs->get_alloc_block = func;
}
+
+blk64_t ext2fs_find_inode_goal(ext2_filsys fs, ext2_ino_t ino)
+{
+ dgrp_t group = ext2fs_group_of_ino(fs, ino);
+ __u8 log_flex = fs->super->s_log_groups_per_flex;
+
+ if (log_flex)
+ group = group & ~((1 << (log_flex)) - 1);
+ return ext2fs_group_first_block2(fs, group);
+}
diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
index 07455f8..837d745 100644
--- a/lib/ext2fs/bmap.c
+++ b/lib/ext2fs/bmap.c
@@ -252,7 +252,7 @@ got_block:
retval = extent_bmap(fs, ino, inode, handle, block_buf,
0, block-1, 0, blocks_alloc, &blk64);
if (retval)
- blk64 = 0;
+ blk64 = ext2fs_find_inode_goal(fs, ino);
retval = ext2fs_alloc_block2(fs, blk64, block_buf,
&blk64);
if (retval)
@@ -361,7 +361,8 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
}
*phys_blk = inode_bmap(inode, block);
- b = block ? inode_bmap(inode, block-1) : 0;
+ b = block ? inode_bmap(inode, block-1) :
+ ext2fs_find_inode_goal(fs, ino);
if ((*phys_blk == 0) && (bmap_flags & BMAP_ALLOC)) {
retval = ext2fs_alloc_block(fs, b, block_buf, &b);
diff --git a/lib/ext2fs/expanddir.c b/lib/ext2fs/expanddir.c
index 09a15fa..0463462 100644
--- a/lib/ext2fs/expanddir.c
+++ b/lib/ext2fs/expanddir.c
@@ -110,7 +110,7 @@ errcode_t ext2fs_expand_dir(ext2_filsys fs, ext2_ino_t dir)
es.done = 0;
es.err = 0;
- es.goal = 0;
+ es.goal = ext2fs_find_inode_goal(fs, dir);
es.newblocks = 0;
es.dir = dir;
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index 8aa0ac9..edbb92b 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -687,6 +687,7 @@ extern void ext2fs_set_alloc_block_callback(ext2_filsys fs,
errcode_t (**old)(ext2_filsys fs,
blk64_t goal,
blk64_t *ret));
+blk64_t ext2fs_find_inode_goal(ext2_filsys fs, ext2_ino_t ino);
/* alloc_sb.c */
extern int ext2fs_reserve_super_and_bgd(ext2_filsys fs,
diff --git a/lib/ext2fs/ext_attr.c b/lib/ext2fs/ext_attr.c
index 44d7615..9302fdf 100644
--- a/lib/ext2fs/ext_attr.c
+++ b/lib/ext2fs/ext_attr.c
@@ -404,8 +404,7 @@ static errcode_t prep_ea_block_for_write(ext2_filsys fs, ext2_ino_t ino,
}
/* Allocate a block */
- grp = ext2fs_group_of_ino(fs, ino);
- goal = ext2fs_inode_table_loc(fs, grp);
+ goal = ext2fs_find_inode_goal(fs, ino);
err = ext2fs_alloc_block2(fs, goal, NULL, &blk);
if (err)
goto out2;
diff --git a/lib/ext2fs/extent.c b/lib/ext2fs/extent.c
index cf75a0b..5a6c5b5 100644
--- a/lib/ext2fs/extent.c
+++ b/lib/ext2fs/extent.c
@@ -1010,14 +1010,8 @@ static errcode_t extent_node_split(ext2_extent_handle_t handle,
goto done;
}
- if (!goal_blk) {
- dgrp_t group = ext2fs_group_of_ino(handle->fs, handle->ino);
- __u8 log_flex = handle->fs->super->s_log_groups_per_flex;
-
- if (log_flex)
- group = group & ~((1 << (log_flex)) - 1);
- goal_blk = ext2fs_group_first_block2(handle->fs, group);
- }
+ if (!goal_blk)
+ goal_blk = ext2fs_find_inode_goal(handle->fs, handle->ino);
retval = ext2fs_alloc_block2(handle->fs, goal_blk, block_buf,
&new_node_pblk);
if (retval)
diff --git a/lib/ext2fs/mkdir.c b/lib/ext2fs/mkdir.c
index 4a85439..9864645 100644
--- a/lib/ext2fs/mkdir.c
+++ b/lib/ext2fs/mkdir.c
@@ -57,7 +57,8 @@ errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
/*
* Allocate a data block for the directory
*/
- retval = ext2fs_new_block2(fs, 0, 0, &blk);
+ retval = ext2fs_new_block2(fs, ext2fs_find_inode_goal(fs, ino), 0,
+ &blk);
if (retval)
goto cleanup;
diff --git a/lib/ext2fs/symlink.c b/lib/ext2fs/symlink.c
index b2ef66c..cb3a2e7 100644
--- a/lib/ext2fs/symlink.c
+++ b/lib/ext2fs/symlink.c
@@ -53,7 +53,8 @@ errcode_t ext2fs_symlink(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t ino,
*/
fastlink = (target_len < sizeof(inode.i_block));
if (!fastlink) {
- retval = ext2fs_new_block2(fs, 0, 0, &blk);
+ retval = ext2fs_new_block2(fs, ext2fs_find_inode_goal(fs, ino),
+ 0, &blk);
if (retval)
goto cleanup;
retval = ext2fs_get_mem(fs->blocksize, &block_buf);
* [PATCH 23/32] libext2fs: find a range of empty blocks
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (21 preceding siblings ...)
2014-03-02 7:18 ` [PATCH 22/32] libext2fs: find inode goal when allocating blocks Darrick J. Wong
@ 2014-03-02 7:19 ` Darrick J. Wong
2014-03-02 7:19 ` [PATCH 24/32] libext2fs: provide a function to set inode size Darrick J. Wong
` (6 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:19 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
Provide a function that, given a goal pblk and a range, will try to
find a run of free blocks to satisfy the allocation. By default the
function will look anywhere in the filesystem for the run, though this
can be constrained with optional flags. One flag indicates that the
range must start at the goal block; the other flag indicates that we
should not return a range shorter than len.
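The intended semantics of the two flags can be sketched on a toy in-memory map. This is only a model under stated assumptions: the TOY_* names and the byte-per-block used[] array are invented here, block 0 stands in for the first data block, and the linear rescan is far less efficient than the bitmap searches in the patch.

```c
#include <stddef.h>

#define TOY_FIXED_GOAL   0x1	/* the run must begin exactly at goal */
#define TOY_EXACT_LENGTH 0x2	/* never return a run shorter than len */

/*
 * Toy model: used[i] != 0 means block i is in use. Returns 0 and fills
 * in *pblk and *plen on success, -1 if no suitable run exists.
 */
int toy_new_range(const unsigned char *used, size_t nblocks, int flags,
		  size_t goal, size_t len, size_t *pblk, size_t *plen)
{
	size_t scanned, start, run;

	if (len == 0 || nblocks == 0)
		return -1;
	if (goal >= nblocks)
		goal = 0;

	/* Visit each candidate start once, beginning at goal and wrapping. */
	for (scanned = 0; scanned < nblocks; scanned++) {
		start = (goal + scanned) % nblocks;
		if (used[start]) {
			if (flags & TOY_FIXED_GOAL)
				return -1;	/* goal itself is taken */
			continue;
		}
		/* Measure the free run, capped at len and at the map end. */
		for (run = 0; run < len && start + run < nblocks &&
			      !used[start + run]; run++)
			;
		if (!(flags & TOY_EXACT_LENGTH) || run >= len) {
			*pblk = start;
			*plen = run;
			return 0;
		}
		if (flags & TOY_FIXED_GOAL)
			return -1;	/* run at goal is too short */
	}
	return -1;
}
```

Without flags the caller may get a shorter run than requested and is expected to call again for the remainder; with both flags set the call either returns exactly the requested extent at the goal or fails.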
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
lib/ext2fs/alloc.c | 104 +++++++++++++++++++++++++++++++++++++++++++++++++++
lib/ext2fs/ext2fs.h | 6 +++
2 files changed, 110 insertions(+)
diff --git a/lib/ext2fs/alloc.c b/lib/ext2fs/alloc.c
index aa084ac..7bc86f1 100644
--- a/lib/ext2fs/alloc.c
+++ b/lib/ext2fs/alloc.c
@@ -26,6 +26,16 @@
#include "ext2_fs.h"
#include "ext2fs.h"
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+#undef DEBUG
+
+#ifdef DEBUG
+# define dbg_printf(f, a...) do {printf(f, ## a); fflush(stdout); } while (0)
+#else
+# define dbg_printf(f, a...)
+#endif
+
/*
* Clear the uninit block bitmap flag if necessary
*/
@@ -303,3 +313,97 @@ blk64_t ext2fs_find_inode_goal(ext2_filsys fs, ext2_ino_t ino)
group = group & ~((1 << (log_flex)) - 1);
return ext2fs_group_first_block2(fs, group);
}
+
+/*
+ * Starting at _goal_, scan around the filesystem to find a run of free blocks
+ * that's at least _len_ blocks long. If EXT2_NEWRANGE_FIXED_GOAL is given,
+ * then the range of blocks must start at _goal_. If
+ * EXT2_NEWRANGE_EXACT_LENGTH is given, do not return an allocation shorter than
+ * _len_.
+ *
+ * The starting block is returned in _pblk_ and the length is returned via
+ * _plen_.
+ */
+errcode_t ext2fs_new_range(ext2_filsys fs, int flags, blk64_t goal,
+ blk64_t len, ext2fs_block_bitmap map, blk64_t *pblk,
+ blk64_t *plen)
+{
+ errcode_t retval;
+ blk64_t start, end, b;
+ int looped = 0;
+ blk64_t max_blocks = ext2fs_blocks_count(fs->super);
+
+ dbg_printf("%s: flags=0x%x goal=%llu len=%llu\n", __func__, flags,
+ goal, len);
+ EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
+ if (len == 0 || (flags & ~EXT2_NEWRANGE_ALL_FLAGS))
+ return EXT2_ET_INVALID_ARGUMENT;
+ if (!map)
+ map = fs->block_map;
+ if (!map)
+ return EXT2_ET_NO_BLOCK_BITMAP;
+ if (!goal || goal >= ext2fs_blocks_count(fs->super))
+ goal = fs->super->s_first_data_block;
+
+ start = goal;
+ while (!looped || start <= goal) {
+ retval = ext2fs_find_first_zero_block_bitmap2(map,
+ start, max_blocks - 1, &start);
+ if (retval == ENOENT) {
+ /*
+ * If there are no free blocks beyond the starting
+ * point, try scanning the whole filesystem, unless the
+ * user told us only to allocate from _goal_, or if
+ * we're already scanning the whole filesystem.
+ */
+ if (flags & EXT2_NEWRANGE_FIXED_GOAL ||
+ start == fs->super->s_first_data_block)
+ goto fail;
+ start = fs->super->s_first_data_block;
+ continue;
+ } else if (retval)
+ goto errout;
+
+ if (flags & EXT2_NEWRANGE_FIXED_GOAL && start != goal)
+ goto fail;
+
+ b = min(start + len - 1, max_blocks - 1);
+ retval = ext2fs_find_first_set_block_bitmap2(map,
+ start, b, &end);
+ if (retval == ENOENT)
+ end = b + 1;
+ else if (retval)
+ goto errout;
+
+ if (!(flags & EXT2_NEWRANGE_EXACT_LENGTH) || (end - start) >= len) {
+ *pblk = start;
+ *plen = end - start;
+ dbg_printf("%s: new_range goal=%llu--%llu "
+ "blk=%llu--%llu %llu\n",
+ __func__, goal, goal + len - 1,
+ *pblk, *pblk + *plen - 1, *plen);
+
+ for (b = start; b < end;
+ b += fs->super->s_blocks_per_group)
+ clear_block_uninit(fs,
+ ext2fs_group_of_blk2(fs, b));
+ return 0;
+ }
+
+try_again:
+ if (flags & EXT2_NEWRANGE_FIXED_GOAL)
+ goto fail;
+ start = end;
+ if (start >= max_blocks) {
+ if (looped)
+ goto fail;
+ looped = 1;
+ start = fs->super->s_first_data_block;
+ }
+ }
+
+fail:
+ retval = EXT2_ET_BLOCK_ALLOC_FAIL;
+errout:
+ return retval;
+}
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index edbb92b..a37c06b 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -688,6 +688,12 @@ extern void ext2fs_set_alloc_block_callback(ext2_filsys fs,
blk64_t goal,
blk64_t *ret));
blk64_t ext2fs_find_inode_goal(ext2_filsys fs, ext2_ino_t ino);
+#define EXT2_NEWRANGE_FIXED_GOAL (0x1)
+#define EXT2_NEWRANGE_EXACT_LENGTH (0x2)
+#define EXT2_NEWRANGE_ALL_FLAGS (0x3)
+errcode_t ext2fs_new_range(ext2_filsys fs, int flags, blk64_t goal,
+ blk64_t len, ext2fs_block_bitmap map, blk64_t *pblk,
+ blk64_t *plen);
/* alloc_sb.c */
extern int ext2fs_reserve_super_and_bgd(ext2_filsys fs,
* [PATCH 24/32] libext2fs: provide a function to set inode size
From: Darrick J. Wong @ 2014-03-02 7:19 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
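Provide an API, ext2fs_inode_set_size(), that sets i_size (and
i_size_high) in an inode and takes care of all required feature flag
modifications, i.e. setting large_file in the superblock when a
regular file needs it. The function returns EXT2_ET_FILE_TOO_BIG if
asked to make anything other than a regular file 4GB or larger.
Refactor the existing open-coded i_size updates to use this new
function.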
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
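For reviewers who want to check the size arithmetic without building
the library, here is a minimal standalone sketch of the split that
ext2fs_inode_set_size() performs. Everything prefixed demo_/DEMO_ is
hypothetical and exists only for this illustration: struct demo_inode
stands in for struct ext2_inode, is_regular stands in for
LINUX_S_ISREG(i_mode), and the 2GB threshold is my assumption about
what ext2fs_needs_large_file_feature() tests. It is not the library
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant parts of struct ext2_inode. */
struct demo_inode {
	int      is_regular;	/* stands in for LINUX_S_ISREG(i_mode) */
	uint32_t i_size;	/* low 32 bits of the file size */
	uint32_t i_size_high;	/* high 32 bits of the file size */
};

/* Hypothetical stand-in for EXT2_ET_FILE_TOO_BIG. */
#define DEMO_ERR_FILE_TOO_BIG 1

/*
 * Split a 64-bit size across i_size and i_size_high.
 * On success, *needs_large_file says whether the caller would have to
 * set the large_file feature (assumed threshold: 2GB, regular files only).
 * On failure the inode is left untouched.
 */
static int demo_set_size(struct demo_inode *inode, uint64_t size,
			 int *needs_large_file)
{
	assert(inode != NULL && needs_large_file != NULL);

	/* Only regular files get to be 4GB or larger. */
	if (!inode->is_regular && (size >> 32))
		return DEMO_ERR_FILE_TOO_BIG;

	*needs_large_file = inode->is_regular && size >= 0x80000000ULL;
	inode->i_size = (uint32_t)(size & 0xffffffff);
	inode->i_size_high = (uint32_t)(size >> 32);
	return 0;
}
```

With the size from tests/f_big_sparse (4398050758656 bytes) this gives
i_size_high = 1024 and i_size = 4247552, and reports that large_file
is needed. That matches why the test no longer expects a later pass to
complain about the missing flag.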
debugfs/debugfs.c | 7 ++++++-
e2fsck/pass1.c | 9 ++++-----
e2fsck/pass2.c | 11 +++++++++--
e2fsck/pass3.c | 5 +++--
e2fsck/rehash.c | 5 ++++-
lib/ext2fs/bb_inode.c | 5 ++++-
lib/ext2fs/ext2fs.h | 2 ++
lib/ext2fs/fileio.c | 41 ++++++++++++++++++++++++++++-------------
lib/ext2fs/mkjournal.c | 8 +++-----
lib/ext2fs/res_gdt.c | 9 +++------
lib/ext2fs/symlink.c | 2 +-
tests/f_big_sparse/expect.1 | 5 -----
12 files changed, 67 insertions(+), 42 deletions(-)
diff --git a/debugfs/debugfs.c b/debugfs/debugfs.c
index f0c5373..5f61fc5 100644
--- a/debugfs/debugfs.c
+++ b/debugfs/debugfs.c
@@ -1681,7 +1681,12 @@ void do_write(int argc, char *argv[])
inode.i_atime = inode.i_ctime = inode.i_mtime =
current_fs->now ? current_fs->now : time(0);
inode.i_links_count = 1;
- inode.i_size = statbuf.st_size;
+ retval = ext2fs_inode_set_size(current_fs, &inode, statbuf.st_size);
+ if (retval) {
+ com_err(argv[2], retval, 0);
+ close(fd);
+ return;
+ }
if (current_fs->super->s_feature_incompat &
EXT3_FEATURE_INCOMPAT_EXTENTS) {
int i;
diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 3475ed0..f80076e 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -264,8 +264,7 @@ static void check_size(e2fsck_t ctx, struct problem_context *pctx)
if (!fix_problem(ctx, PR_1_SET_NONZSIZE, pctx))
return;
- inode->i_size = 0;
- inode->i_size_high = 0;
+ ext2fs_inode_set_size(ctx->fs, inode, 0);
e2fsck_write_inode(ctx, pctx->ino, pctx->inode, "pass1");
}
@@ -2373,9 +2372,9 @@ static void check_blocks(e2fsck_t ctx, struct problem_context *pctx,
pctx->num = (pb.last_block+1) * fs->blocksize;
pctx->group = bad_size;
if (fix_problem(ctx, PR_1_BAD_I_SIZE, pctx)) {
- inode->i_size = pctx->num;
- if (!LINUX_S_ISDIR(inode->i_mode))
- inode->i_size_high = pctx->num >> 32;
+ if (LINUX_S_ISDIR(inode->i_mode))
+ pctx->num &= 0xFFFFFFFFULL;
+ ext2fs_inode_set_size(fs, inode, pctx->num);
dirty_inode++;
}
pctx->num = 0;
diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c
index e2974eb..6b44cac 100644
--- a/e2fsck/pass2.c
+++ b/e2fsck/pass2.c
@@ -1658,8 +1658,15 @@ static int allocate_dir_block(e2fsck_t ctx,
*/
e2fsck_read_inode(ctx, db->ino, &inode, "allocate_dir_block");
ext2fs_iblk_add_blocks(fs, &inode, 1);
- if (inode.i_size < (db->blockcnt+1) * fs->blocksize)
- inode.i_size = (db->blockcnt+1) * fs->blocksize;
+ if (EXT2_I_SIZE(&inode) < (db->blockcnt+1) * fs->blocksize) {
+ pctx->errcode = ext2fs_inode_set_size(fs, &inode,
+ (db->blockcnt+1) * fs->blocksize);
+ if (pctx->errcode) {
+ pctx->str = "ext2fs_inode_set_size";
+ fix_problem(ctx, PR_2_ALLOC_DIRBOCK, pctx);
+ return 1;
+ }
+ }
e2fsck_write_inode(ctx, db->ino, &inode, "allocate_dir_block");
/*
diff --git a/e2fsck/pass3.c b/e2fsck/pass3.c
index 6769b05..b8d0ccc 100644
--- a/e2fsck/pass3.c
+++ b/e2fsck/pass3.c
@@ -848,8 +848,9 @@ errcode_t e2fsck_expand_directory(e2fsck_t ctx, ext2_ino_t dir,
return retval;
sz = (es.last_block + 1) * fs->blocksize;
- inode.i_size = sz;
- inode.i_size_high = sz >> 32;
+ retval = ext2fs_inode_set_size(fs, &inode, sz);
+ if (retval)
+ return retval;
ext2fs_iblk_add_blocks(fs, &inode, es.newblocks);
quota_data_add(ctx->qctx, &inode, dir, es.newblocks * fs->blocksize);
diff --git a/e2fsck/rehash.c b/e2fsck/rehash.c
index 283515c..3a0e568 100644
--- a/e2fsck/rehash.c
+++ b/e2fsck/rehash.c
@@ -783,7 +783,10 @@ static errcode_t write_directory(e2fsck_t ctx, ext2_filsys fs,
inode.i_flags &= ~EXT2_INDEX_FL;
else
inode.i_flags |= EXT2_INDEX_FL;
- inode.i_size = outdir->num * fs->blocksize;
+ retval = ext2fs_inode_set_size(fs, &inode,
+ outdir->num * fs->blocksize);
+ if (retval)
+ return retval;
ext2fs_iblk_sub_blocks(fs, &inode, wd.cleared);
e2fsck_write_inode(ctx, ino, &inode, "rehash_dir");
diff --git a/lib/ext2fs/bb_inode.c b/lib/ext2fs/bb_inode.c
index 268eecf..3d9132b 100644
--- a/lib/ext2fs/bb_inode.c
+++ b/lib/ext2fs/bb_inode.c
@@ -128,7 +128,10 @@ errcode_t ext2fs_update_bb_inode(ext2_filsys fs, ext2_badblocks_list bb_list)
if (!inode.i_ctime)
inode.i_ctime = fs->now ? fs->now : time(0);
ext2fs_iblk_set(fs, &inode, rec.bad_block_count);
- inode.i_size = rec.bad_block_count * fs->blocksize;
+ retval = ext2fs_inode_set_size(fs, &inode,
+ rec.bad_block_count * fs->blocksize);
+ if (retval)
+ goto cleanup;
retval = ext2fs_write_inode(fs, EXT2_BAD_INO, &inode);
if (retval)
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index a37c06b..fd1f583 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1240,6 +1240,8 @@ errcode_t ext2fs_file_get_lsize(ext2_file_t file, __u64 *ret_size);
extern ext2_off_t ext2fs_file_get_size(ext2_file_t file);
extern errcode_t ext2fs_file_set_size(ext2_file_t file, ext2_off_t size);
extern errcode_t ext2fs_file_set_size2(ext2_file_t file, ext2_off64_t size);
+errcode_t ext2fs_inode_set_size(ext2_filsys fs, struct ext2_inode *inode,
+ ext2_off64_t size);
/* finddev.c */
extern char *ext2fs_find_block_device(dev_t device);
diff --git a/lib/ext2fs/fileio.c b/lib/ext2fs/fileio.c
index 607609f..7d34258 100644
--- a/lib/ext2fs/fileio.c
+++ b/lib/ext2fs/fileio.c
@@ -461,6 +461,31 @@ out:
return retval;
}
+errcode_t ext2fs_inode_set_size(ext2_filsys fs, struct ext2_inode *inode,
+ ext2_off64_t size)
+{
+ /* Only regular files get to be larger than 4GB */
+ if (!LINUX_S_ISREG(inode->i_mode) && (size >> 32))
+ return EXT2_ET_FILE_TOO_BIG;
+
+ /* If we're writing a large file, set the large_file flag */
+ if (LINUX_S_ISREG(inode->i_mode) &&
+ ext2fs_needs_large_file_feature(size) &&
+ (!EXT2_HAS_RO_COMPAT_FEATURE(fs->super,
+ EXT2_FEATURE_RO_COMPAT_LARGE_FILE) ||
+ fs->super->s_rev_level == EXT2_GOOD_OLD_REV)) {
+ fs->super->s_feature_ro_compat |=
+ EXT2_FEATURE_RO_COMPAT_LARGE_FILE;
+ ext2fs_update_dynamic_rev(fs);
+ ext2fs_mark_super_dirty(fs);
+ }
+
+ inode->i_size = size & 0xffffffff;
+ inode->i_size_high = (size >> 32);
+
+ return 0;
+}
+
/*
* This function sets the size of the file, truncating it if necessary
*
@@ -482,20 +507,10 @@ errcode_t ext2fs_file_set_size2(ext2_file_t file, ext2_off64_t size)
old_truncate = ((old_size + file->fs->blocksize - 1) >>
EXT2_BLOCK_SIZE_BITS(file->fs->super));
- /* If we're writing a large file, set the large_file flag */
- if (LINUX_S_ISREG(file->inode.i_mode) &&
- ext2fs_needs_large_file_feature(EXT2_I_SIZE(&file->inode)) &&
- (!EXT2_HAS_RO_COMPAT_FEATURE(file->fs->super,
- EXT2_FEATURE_RO_COMPAT_LARGE_FILE) ||
- file->fs->super->s_rev_level == EXT2_GOOD_OLD_REV)) {
- file->fs->super->s_feature_ro_compat |=
- EXT2_FEATURE_RO_COMPAT_LARGE_FILE;
- ext2fs_update_dynamic_rev(file->fs);
- ext2fs_mark_super_dirty(file->fs);
- }
+ retval = ext2fs_inode_set_size(file->fs, &file->inode, size);
+ if (retval)
+ return retval;
- file->inode.i_size = size & 0xffffffff;
- file->inode.i_size_high = (size >> 32);
if (file->ino) {
retval = ext2fs_write_inode(file->fs, file->ino, &file->inode);
if (retval)
diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
index ecc3912..11f33ab 100644
--- a/lib/ext2fs/mkjournal.c
+++ b/lib/ext2fs/mkjournal.c
@@ -400,15 +400,13 @@ static errcode_t write_journal_inode(ext2_filsys fs, ext2_ino_t journal_ino,
goto errout;
inode_size = (unsigned long long)fs->blocksize * num_blocks;
- inode.i_size = inode_size & 0xFFFFFFFF;
- inode.i_size_high = (inode_size >> 32) & 0xFFFFFFFF;
- if (ext2fs_needs_large_file_feature(inode_size))
- fs->super->s_feature_ro_compat |=
- EXT2_FEATURE_RO_COMPAT_LARGE_FILE;
ext2fs_iblk_add_blocks(fs, &inode, es.newblocks);
inode.i_mtime = inode.i_ctime = fs->now ? fs->now : time(0);
inode.i_links_count = 1;
inode.i_mode = LINUX_S_IFREG | 0600;
+ retval = ext2fs_inode_set_size(fs, &inode, inode_size);
+ if (retval)
+ goto errout;
if ((retval = ext2fs_write_new_inode(fs, journal_ino, &inode)))
goto errout;
diff --git a/lib/ext2fs/res_gdt.c b/lib/ext2fs/res_gdt.c
index e61c330..1343ce6 100644
--- a/lib/ext2fs/res_gdt.c
+++ b/lib/ext2fs/res_gdt.c
@@ -133,12 +133,9 @@ errcode_t ext2fs_create_resize_inode(ext2_filsys fs)
dindir_dirty = inode_dirty = 1;
inode_size = apb*apb + apb + EXT2_NDIR_BLOCKS;
inode_size *= fs->blocksize;
- inode.i_size = inode_size & 0xFFFFFFFF;
- inode.i_size_high = (inode_size >> 32) & 0xFFFFFFFF;
- if(inode.i_size_high) {
- sb->s_feature_ro_compat |=
- EXT2_FEATURE_RO_COMPAT_LARGE_FILE;
- }
+ retval = ext2fs_inode_set_size(fs, &inode, inode_size);
+ if (retval)
+ goto out_free;
inode.i_ctime = fs->now ? fs->now : time(0);
}
diff --git a/lib/ext2fs/symlink.c b/lib/ext2fs/symlink.c
index cb3a2e7..4147181 100644
--- a/lib/ext2fs/symlink.c
+++ b/lib/ext2fs/symlink.c
@@ -80,7 +80,7 @@ errcode_t ext2fs_symlink(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t ino,
inode.i_uid = inode.i_gid = 0;
ext2fs_iblk_set(fs, &inode, fastlink ? 0 : 1);
inode.i_links_count = 1;
- inode.i_size = target_len;
+ ext2fs_inode_set_size(fs, &inode, target_len);
/* The time fields are set by ext2fs_write_new_inode() */
if (fastlink) {
diff --git a/tests/f_big_sparse/expect.1 b/tests/f_big_sparse/expect.1
index 437ade7..eac82ed 100644
--- a/tests/f_big_sparse/expect.1
+++ b/tests/f_big_sparse/expect.1
@@ -2,11 +2,6 @@ Pass 1: Checking inodes, blocks, and sizes
Inode 12, i_size is 61440, should be 4398050758656. Fix? yes
Pass 2: Checking directory structure
-Filesystem contains large files, but lacks LARGE_FILE flag in superblock.
-Fix? yes
-
-Filesystem has feature flag(s) set, but is a revision 0 filesystem. Fix? yes
* [PATCH 25/32] libext2fs: implement fallocate
From: Darrick J. Wong @ 2014-03-02 7:19 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
Create a library function, ext2fs_fallocate(), to perform fallocation
on arbitrary files, and wire up a few users for this function. This
is more involved than Ted's original mk_hugefiles implementation
because we have to honor any blocks that may already be allocated to
the file.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
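Much of the bigalloc handling in this patch comes down to three bits
of mask arithmetic on logical block numbers. The standalone sketch
below restates them so they can be checked in isolation. The demo_
helpers are hypothetical names invented for this note; "ratio" plays
the role of EXT2FS_CLUSTER_RATIO(fs) and "ratio - 1" the role of
EXT2FS_CLUSTER_MASK(fs), on the assumption that the ratio is a power
of two.

```c
#include <assert.h>
#include <stdint.h>

/* ratio must be a nonzero power of two (blocks per cluster). */
static int demo_ratio_ok(uint64_t ratio)
{
	return ratio != 0 && (ratio & (ratio - 1)) == 0;
}

/*
 * Blocks from lblk up to the next cluster boundary, or 0 if lblk is
 * already cluster-aligned. Mirrors the left-hand cluster_fill math.
 */
static uint64_t demo_fill_to_cluster_end(uint64_t lblk, uint64_t ratio)
{
	uint64_t mask = ratio - 1;

	assert(demo_ratio_ok(ratio));
	return (ratio - (lblk & mask)) & mask;
}

/*
 * Blocks between the start of lblk's cluster and lblk itself.
 * Mirrors the right-hand cluster_fill math.
 */
static uint64_t demo_fill_from_cluster_start(uint64_t lblk, uint64_t ratio)
{
	assert(demo_ratio_ok(ratio));
	return lblk & (ratio - 1);
}

/*
 * Move goal to the same offset within its cluster that lblk has
 * within its own, as done for alloc_goal and range_start.
 */
static uint64_t demo_align_goal(uint64_t goal, uint64_t lblk, uint64_t ratio)
{
	uint64_t mask = ratio - 1;

	assert(demo_ratio_ok(ratio));
	return (goal & ~mask) | (lblk & mask);
}
```

For example, with 16 blocks per cluster and a range starting at
logical block 35, there are 13 blocks to the end of the cluster and 3
blocks back to its start, and a goal of 1000 is adjusted to 995. With
a ratio of 1 (no bigalloc) both fill amounts are 0 and the goal is
left alone.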
lib/ext2fs/Makefile.in | 8
lib/ext2fs/ext2fs.h | 10 +
lib/ext2fs/fallocate.c | 835 ++++++++++++++++++++++++++++++++++++++++++++++++
misc/mk_hugefiles.c | 84 +----
4 files changed, 863 insertions(+), 74 deletions(-)
create mode 100644 lib/ext2fs/fallocate.c
diff --git a/lib/ext2fs/Makefile.in b/lib/ext2fs/Makefile.in
index dde4c6d..aab52e1 100644
--- a/lib/ext2fs/Makefile.in
+++ b/lib/ext2fs/Makefile.in
@@ -44,6 +44,7 @@ OBJS= $(DEBUGFS_LIB_OBJS) $(RESIZE_LIB_OBJS) $(E2IMAGE_LIB_OBJS) \
expanddir.o \
ext_attr.o \
extent.o \
+ fallocate.o \
fileio.o \
finddev.o \
flushb.o \
@@ -675,6 +676,13 @@ extent.o: $(srcdir)/extent.c $(top_builddir)/lib/config.h \
$(top_srcdir)/lib/et/com_err.h $(srcdir)/ext2_io.h \
$(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/ext2_ext_attr.h \
$(srcdir)/bitops.h $(srcdir)/e2image.h
+fallocate.o: $(srcdir)/fallocate.c $(top_builddir)/lib/config.h \
+ $(top_builddir)/lib/dirpaths.h $(srcdir)/ext2_fs.h \
+ $(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fsP.h \
+ $(srcdir)/ext2fs.h $(srcdir)/ext2_fs.h $(srcdir)/ext3_extents.h \
+ $(top_srcdir)/lib/et/com_err.h $(srcdir)/ext2_io.h \
+ $(top_builddir)/lib/ext2fs/ext2_err.h $(srcdir)/ext2_ext_attr.h \
+ $(srcdir)/bitops.h $(srcdir)/e2image.h
fileio.o: $(srcdir)/fileio.c $(top_builddir)/lib/config.h \
$(top_builddir)/lib/dirpaths.h $(srcdir)/ext2_fs.h \
$(top_builddir)/lib/ext2fs/ext2_types.h $(srcdir)/ext2fs.h \
diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index fd1f583..f7b7195 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1217,6 +1217,16 @@ extern errcode_t ext2fs_extent_goto2(ext2_extent_handle_t handle,
int leaf_level, blk64_t blk);
extern errcode_t ext2fs_extent_fix_parents(ext2_extent_handle_t handle);
+/* fallocate.c */
+#define EXT2_FALLOCATE_ZERO_BLOCKS (0x1)
+#define EXT2_FALLOCATE_FORCE_INIT (0x2)
+#define EXT2_FALLOCATE_FORCE_UNINIT (0x4)
+#define EXT2_FALLOCATE_INIT_BEYOND_EOF (0x8)
+#define EXT2_FALLOCATE_ALL_FLAGS (0xF)
+errcode_t ext2fs_fallocate(ext2_filsys fs, int flags, ext2_ino_t ino,
+ struct ext2_inode *inode,
+ blk64_t start, blk64_t len);
+
/* fileio.c */
extern errcode_t ext2fs_file_open2(ext2_filsys fs, ext2_ino_t ino,
struct ext2_inode *inode,
diff --git a/lib/ext2fs/fallocate.c b/lib/ext2fs/fallocate.c
new file mode 100644
index 0000000..5e91037
--- /dev/null
+++ b/lib/ext2fs/fallocate.c
@@ -0,0 +1,835 @@
+/*
+ * fallocate.c -- Allocate large chunks of file.
+ *
+ * Copyright (C) 2014 Oracle.
+ *
+ * %Begin-Header%
+ * This file may be redistributed under the terms of the GNU Library
+ * General Public License, version 2.
+ * %End-Header%
+ */
+
+#include "config.h"
+
+#include "ext2_fs.h"
+#include "ext2fs.h"
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+#undef DEBUG
+
+#ifdef DEBUG
+# define dbg_printf(f, a...) do {printf(f, ## a); fflush(stdout); } while (0)
+#else
+# define dbg_printf(f, a...)
+#endif
+
+/*
+ * Extent-based fallocate code.
+ *
+ * Find runs of unmapped logical blocks by starting at start and walking the
+ * extents until we reach the end of the range we want.
+ *
+ * For each run of unmapped blocks, try to find the extents on either side of
+ * the range. If there's a left extent that can grow by at least a cluster and
+ * there are lblocks between start and the next lcluster after start, see if
+ * there's an implied cluster allocation; if so, zero the blocks (if the left
+ * extent is initialized) and adjust the extent. Ditto for the blocks between
+ * the end of the last full lcluster and end, if there's a right extent.
+ *
+ * Try to attach as much as we can to the left extent, then try to attach as
+ * much as we can to the right extent. For the remainder, try to allocate the
+ * whole range; map in whatever we get; and repeat until we're done.
+ *
+ * To attach to a left extent, figure out the maximum amount we can add to the
+ * extent and try to allocate that much, and append if successful. To attach
+ * to a right extent, figure out the max we can add to the extent, try to
+ * allocate that much, and prepend if successful.
+ *
+ * We need an alloc_range function that tells us how much we can allocate given
+ * a maximum length and one of a suggested start, a fixed start, or a fixed end
+ * point.
+ *
+ * Every time we modify the extent tree we also need to update the block stats.
+ *
+ * At the end, update i_blocks and i_size appropriately.
+ */
+
+static void dbg_print_extent(char *desc, struct ext2fs_extent *extent)
+{
+#ifdef DEBUG
+ if (desc)
+ printf("%s: ", desc);
+ printf("extent: lblk %llu--%llu, len %u, pblk %llu, flags: ",
+ extent->e_lblk, extent->e_lblk + extent->e_len - 1,
+ extent->e_len, extent->e_pblk);
+ if (extent->e_flags & EXT2_EXTENT_FLAGS_LEAF)
+ fputs("LEAF ", stdout);
+ if (extent->e_flags & EXT2_EXTENT_FLAGS_UNINIT)
+ fputs("UNINIT ", stdout);
+ if (extent->e_flags & EXT2_EXTENT_FLAGS_SECOND_VISIT)
+ fputs("2ND_VISIT ", stdout);
+ if (!extent->e_flags)
+ fputs("(none)", stdout);
+ fputc('\n', stdout);
+ fflush(stdout);
+#endif
+}
+
+static errcode_t claim_range(ext2_filsys fs, struct ext2_inode *inode,
+ blk64_t blk, blk64_t len)
+{
+ blk64_t clusters;
+
+ clusters = (len + EXT2FS_CLUSTER_RATIO(fs) - 1) /
+ EXT2FS_CLUSTER_RATIO(fs);
+ ext2fs_block_alloc_stats_range(fs, blk,
+ clusters * EXT2FS_CLUSTER_RATIO(fs), +1);
+ return ext2fs_iblk_add_blocks(fs, inode, clusters);
+}
+
+static errcode_t ext_falloc_helper(ext2_filsys fs,
+ int flags,
+ ext2_ino_t ino,
+ struct ext2_inode *inode,
+ ext2_extent_handle_t handle,
+ struct ext2fs_extent *left_ext,
+ struct ext2fs_extent *right_ext,
+ blk64_t range_start, blk64_t range_len,
+ blk64_t alloc_goal)
+{
+ struct ext2fs_extent newex, ex;
+ int op;
+ blk64_t fillable, pblk, plen, x, cluster_fill, y;
+ blk64_t eof_blk;
+ errcode_t err;
+ blk_t max_extent_len, max_uninit_len, max_init_len;
+
+#ifdef DEBUG
+ printf("%s: ", __func__);
+ if (left_ext)
+ printf("left_ext=%llu--%llu, ", left_ext->e_lblk,
+ left_ext->e_lblk + left_ext->e_len - 1);
+ if (right_ext)
+ printf("right_ext=%llu--%llu, ", right_ext->e_lblk,
+ right_ext->e_lblk + right_ext->e_len - 1);
+ printf("start=%llu len=%llu, goal=%llu\n", range_start, range_len,
+ alloc_goal);
+ fflush(stdout);
+#endif
+ /* Can't create initialized extents past EOF? */
+ if (!(flags & EXT2_FALLOCATE_INIT_BEYOND_EOF))
+ eof_blk = EXT2_I_SIZE(inode) / fs->blocksize;
+
+ /* The allocation goal must be as far into a cluster as range_start. */
+ alloc_goal = (alloc_goal & ~EXT2FS_CLUSTER_MASK(fs)) |
+ (range_start & EXT2FS_CLUSTER_MASK(fs));
+
+ max_uninit_len = EXT_UNINIT_MAX_LEN & ~EXT2FS_CLUSTER_MASK(fs);
+ max_init_len = EXT_INIT_MAX_LEN & ~EXT2FS_CLUSTER_MASK(fs);
+
+ /* We must lengthen the left extent to the end of the cluster */
+ if (left_ext && EXT2FS_CLUSTER_RATIO(fs) > 1) {
+ /* How many more blocks can be attached to left_ext? */
+ if (left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)
+ fillable = max_uninit_len - left_ext->e_len;
+ else
+ fillable = max_init_len - left_ext->e_len;
+
+ if (fillable > range_len)
+ fillable = range_len;
+ if (fillable == 0)
+ goto expand_right;
+
+ /*
+ * If range_start isn't on a cluster boundary, try an
+ * implied cluster allocation for left_ext.
+ */
+ cluster_fill = EXT2FS_CLUSTER_RATIO(fs) -
+ (range_start & EXT2FS_CLUSTER_MASK(fs));
+ cluster_fill &= EXT2FS_CLUSTER_MASK(fs);
+ if (cluster_fill == 0)
+ goto expand_right;
+
+ if (cluster_fill > fillable)
+ cluster_fill = fillable;
+
+ /* Don't expand an initialized left_ext beyond EOF */
+ if (!(flags & EXT2_FALLOCATE_INIT_BEYOND_EOF)) {
+ x = left_ext->e_lblk + left_ext->e_len - 1;
+ dbg_printf("%s: lend=%llu newlend=%llu eofblk=%llu\n",
+ __func__, x, x + cluster_fill, eof_blk);
+ if (eof_blk >= x && eof_blk <= x + cluster_fill)
+ cluster_fill = eof_blk - x;
+ if (cluster_fill == 0)
+ goto expand_right;
+ }
+
+ err = ext2fs_extent_goto(handle, left_ext->e_lblk);
+ if (err)
+ goto expand_right;
+ left_ext->e_len += cluster_fill;
+ range_start += cluster_fill;
+ range_len -= cluster_fill;
+ alloc_goal += cluster_fill;
+
+ dbg_print_extent("ext_falloc clus left+", left_ext);
+ err = ext2fs_extent_replace(handle, 0, left_ext);
+ if (err)
+ goto out;
+ err = ext2fs_extent_fix_parents(handle);
+ if (err)
+ goto out;
+
+ /* Zero blocks */
+ if (!(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)) {
+ err = ext2fs_zero_blocks2(fs, left_ext->e_pblk +
+ left_ext->e_len -
+ cluster_fill, cluster_fill,
+ NULL, NULL);
+ if (err)
+ goto out;
+ }
+ }
+
+expand_right:
+ /* We must lengthen the right extent to the beginning of the cluster */
+ if (right_ext && EXT2FS_CLUSTER_RATIO(fs) > 1) {
+ /* How much can we attach to right_ext? */
+ if (right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)
+ fillable = max_uninit_len - right_ext->e_len;
+ else
+ fillable = max_init_len - right_ext->e_len;
+
+ if (fillable > range_len)
+ fillable = range_len;
+ if (fillable == 0)
+ goto try_merge;
+
+ /*
+ * If range_end isn't on a cluster boundary, try an implied
+ * cluster allocation for right_ext.
+ */
+ cluster_fill = right_ext->e_lblk & EXT2FS_CLUSTER_MASK(fs);
+ if (cluster_fill == 0)
+ goto try_merge;
+
+ err = ext2fs_extent_goto(handle, right_ext->e_lblk);
+ if (err)
+ goto out;
+
+ if (cluster_fill > fillable)
+ cluster_fill = fillable;
+ right_ext->e_lblk -= cluster_fill;
+ right_ext->e_pblk -= cluster_fill;
+ right_ext->e_len += cluster_fill;
+ range_len -= cluster_fill;
+
+ dbg_print_extent("ext_falloc clus right+", right_ext);
+ err = ext2fs_extent_replace(handle, 0, right_ext);
+ if (err)
+ goto out;
+ err = ext2fs_extent_fix_parents(handle);
+ if (err)
+ goto out;
+
+ /* Zero blocks if necessary */
+ if (!(right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)) {
+ err = ext2fs_zero_blocks2(fs, right_ext->e_pblk,
+ cluster_fill, NULL, NULL);
+ if (err)
+ goto out;
+ }
+ }
+
+try_merge:
+ /* Merge both extents together, perhaps? */
+ if (left_ext && right_ext) {
+ /* Are the two extents mergeable? */
+ if ((left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT) !=
+ (right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT))
+ goto try_left;
+
+ /* User requires init/uninit but extent is uninit/init. */
+ if (((flags & EXT2_FALLOCATE_FORCE_INIT) &&
+ (left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)) ||
+ ((flags & EXT2_FALLOCATE_FORCE_UNINIT) &&
+ !(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)))
+ goto try_left;
+
+ /*
+ * Skip initialized extent unless user wants to zero blocks
+ * or requires init extent.
+ */
+ if (!(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+ (!(flags & EXT2_FALLOCATE_ZERO_BLOCKS) ||
+ !(flags & EXT2_FALLOCATE_FORCE_INIT)))
+ goto try_left;
+
+ /* Will it even fit? */
+ x = left_ext->e_len + range_len + right_ext->e_len;
+ if (x > (left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT ?
+ max_uninit_len : max_init_len))
+ goto try_left;
+
+ err = ext2fs_extent_goto(handle, left_ext->e_lblk);
+ if (err)
+ goto try_left;
+
+ /* Allocate blocks */
+ y = left_ext->e_pblk + left_ext->e_len;
+ err = ext2fs_new_range(fs, EXT2_NEWRANGE_FIXED_GOAL |
+ EXT2_NEWRANGE_EXACT_LENGTH, y,
+ right_ext->e_pblk - y + 1, NULL,
+ &pblk, &plen);
+ if (err)
+ goto try_left;
+ if (pblk + plen != right_ext->e_pblk)
+ goto try_left;
+ err = claim_range(fs, inode, pblk, plen);
+ if (err)
+ goto out;
+
+ /* Modify extents */
+ left_ext->e_len = x;
+ dbg_print_extent("ext_falloc merge", left_ext);
+ err = ext2fs_extent_replace(handle, 0, left_ext);
+ if (err)
+ goto out;
+ err = ext2fs_extent_fix_parents(handle);
+ if (err)
+ goto out;
+ err = ext2fs_extent_get(handle, EXT2_EXTENT_NEXT_LEAF, &newex);
+ if (err)
+ goto out;
+ err = ext2fs_extent_delete(handle, 0);
+ if (err)
+ goto out;
+ err = ext2fs_extent_fix_parents(handle);
+ if (err)
+ goto out;
+ *right_ext = *left_ext;
+
+ /* Zero blocks */
+ if (!(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+ (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+ err = ext2fs_zero_blocks2(fs, range_start, range_len,
+ NULL, NULL);
+ if (err)
+ goto out;
+ }
+
+ return 0;
+ }
+
+try_left:
+ /* Extend the left extent */
+ if (left_ext) {
+ /* How many more blocks can be attached to left_ext? */
+ if (left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)
+ fillable = max_uninit_len - left_ext->e_len;
+ else if (flags & EXT2_FALLOCATE_ZERO_BLOCKS)
+ fillable = max_init_len - left_ext->e_len;
+ else
+ fillable = 0;
+
+ /* User requires init/uninit but extent is uninit/init. */
+ if (((flags & EXT2_FALLOCATE_FORCE_INIT) &&
+ (left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)) ||
+ ((flags & EXT2_FALLOCATE_FORCE_UNINIT) &&
+ !(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)))
+ goto try_right;
+
+ if (fillable > range_len)
+ fillable = range_len;
+
+ /* Don't expand an initialized left_ext beyond EOF */
+ x = left_ext->e_lblk + left_ext->e_len - 1;
+ if (!(flags & EXT2_FALLOCATE_INIT_BEYOND_EOF)) {
+ dbg_printf("%s: lend=%llu newlend=%llu eofblk=%llu\n",
+ __func__, x, x + fillable, eof_blk);
+ if (eof_blk >= x && eof_blk <= x + fillable)
+ fillable = eof_blk - x;
+ }
+
+ if (fillable == 0)
+ goto try_right;
+
+ /* Test if the right edge of the range is already mapped? */
+ if (EXT2FS_CLUSTER_RATIO(fs) > 1) {
+ err = ext2fs_map_cluster_block(fs, ino, inode,
+ x + fillable, &pblk);
+ if (err)
+ goto out;
+ if (pblk)
+ fillable -= 1 + ((x + fillable)
+ & EXT2FS_CLUSTER_MASK(fs));
+ if (fillable == 0)
+ goto try_right;
+ }
+
+ /* Allocate range of blocks */
+ x = left_ext->e_pblk + left_ext->e_len;
+ err = ext2fs_new_range(fs, EXT2_NEWRANGE_FIXED_GOAL |
+ EXT2_NEWRANGE_EXACT_LENGTH,
+ x, fillable, NULL, &pblk, &plen);
+ if (err)
+ goto try_right;
+ err = claim_range(fs, inode, pblk, plen);
+ if (err)
+ goto out;
+
+ /* Modify left_ext */
+ err = ext2fs_extent_goto(handle, left_ext->e_lblk);
+ if (err)
+ goto out;
+ range_start += plen;
+ range_len -= plen;
+ left_ext->e_len += plen;
+ dbg_print_extent("ext_falloc left+", left_ext);
+ err = ext2fs_extent_replace(handle, 0, left_ext);
+ if (err)
+ goto out;
+ err = ext2fs_extent_fix_parents(handle);
+ if (err)
+ goto out;
+
+ /* Zero blocks if necessary */
+ if (!(left_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+ (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+ err = ext2fs_zero_blocks2(fs, pblk, plen, NULL, NULL);
+ if (err)
+ goto out;
+ }
+ }
+
+try_right:
+ /* Extend the right extent */
+ if (right_ext) {
+ /* How much can we attach to right_ext? */
+ if (right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)
+ fillable = max_uninit_len - right_ext->e_len;
+ else if (flags & EXT2_FALLOCATE_ZERO_BLOCKS)
+ fillable = max_init_len - right_ext->e_len;
+ else
+ fillable = 0;
+
+ /* User requires init/uninit but extent is uninit/init. */
+ if (((flags & EXT2_FALLOCATE_FORCE_INIT) &&
+ (right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)) ||
+ ((flags & EXT2_FALLOCATE_FORCE_UNINIT) &&
+ !(right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT)))
+ goto try_anywhere;
+
+ if (fillable > range_len)
+ fillable = range_len;
+ if (fillable == 0)
+ goto try_anywhere;
+
+ /* Test if the left edge of the range is already mapped? */
+ if (EXT2FS_CLUSTER_RATIO(fs) > 1) {
+ err = ext2fs_map_cluster_block(fs, ino, inode,
+ right_ext->e_lblk - fillable, &pblk);
+ if (err)
+ goto out;
+ if (pblk)
+ fillable -= EXT2FS_CLUSTER_RATIO(fs) -
+ ((right_ext->e_lblk - fillable)
+ & EXT2FS_CLUSTER_MASK(fs));
+ if (fillable == 0)
+ goto try_anywhere;
+ }
+
+ /*
+ * FIXME: It would be nice if we could handle allocating a
+ * variable range from a fixed end point instead of just
+ * skipping to the general allocator if the whole range is
+ * unavailable.
+ */
+ err = ext2fs_new_range(fs, EXT2_NEWRANGE_FIXED_GOAL |
+ EXT2_NEWRANGE_EXACT_LENGTH,
+ right_ext->e_pblk - fillable,
+ fillable, NULL, &pblk, &plen);
+ if (err)
+ goto try_anywhere;
+ err = claim_range(fs, inode,
+ pblk & ~EXT2FS_CLUSTER_MASK(fs),
+ plen + (pblk & EXT2FS_CLUSTER_MASK(fs)));
+ if (err)
+ goto out;
+
+ /* Modify right_ext */
+ err = ext2fs_extent_goto(handle, right_ext->e_lblk);
+ if (err)
+ goto out;
+ range_len -= plen;
+ right_ext->e_lblk -= plen;
+ right_ext->e_pblk -= plen;
+ right_ext->e_len += plen;
+ dbg_print_extent("ext_falloc right+", right_ext);
+ err = ext2fs_extent_replace(handle, 0, right_ext);
+ if (err)
+ goto out;
+ err = ext2fs_extent_fix_parents(handle);
+ if (err)
+ goto out;
+
+ /* Zero blocks if necessary */
+ if (!(right_ext->e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+ (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+ err = ext2fs_zero_blocks2(fs, pblk,
+ plen + cluster_fill, NULL, NULL);
+ if (err)
+ goto out;
+ }
+ }
+
+try_anywhere:
+ /* Try implied cluster alloc on the left and right ends */
+ if (range_len > 0 && (range_start & EXT2FS_CLUSTER_MASK(fs))) {
+ cluster_fill = EXT2FS_CLUSTER_RATIO(fs) -
+ (range_start & EXT2FS_CLUSTER_MASK(fs));
+ cluster_fill &= EXT2FS_CLUSTER_MASK(fs);
+ if (cluster_fill > range_len)
+ cluster_fill = range_len;
+ newex.e_lblk = range_start;
+ err = ext2fs_map_cluster_block(fs, ino, inode, newex.e_lblk,
+ &pblk);
+ if (err)
+ goto out;
+ if (pblk == 0)
+ goto try_right_implied;
+ newex.e_pblk = pblk;
+ newex.e_len = cluster_fill;
+ newex.e_flags = (flags & EXT2_FALLOCATE_FORCE_INIT ? 0 :
+ EXT2_EXTENT_FLAGS_UNINIT);
+ dbg_print_extent("ext_falloc iclus left+", &newex);
+ ext2fs_extent_goto(handle, newex.e_lblk);
+ err = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT,
+ &ex);
+ if (err == EXT2_ET_NO_CURRENT_NODE)
+ ex.e_lblk = 0;
+ else if (err)
+ goto out;
+
+ if (ex.e_lblk > newex.e_lblk)
+ op = 0; /* insert before */
+ else
+ op = EXT2_EXTENT_INSERT_AFTER;
+ dbg_printf("%s: inserting %s lblk %llu newex=%llu\n",
+ __func__, op ? "after" : "before", ex.e_lblk,
+ newex.e_lblk);
+ err = ext2fs_extent_insert(handle, op, &newex);
+ if (err)
+ goto out;
+ err = ext2fs_extent_fix_parents(handle);
+ if (err)
+ goto out;
+
+ if (!(newex.e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+ (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+ err = ext2fs_zero_blocks2(fs, newex.e_pblk,
+ newex.e_len, NULL, NULL);
+ if (err)
+ goto out;
+ }
+
+ range_start += cluster_fill;
+ range_len -= cluster_fill;
+ }
+
+try_right_implied:
+ y = range_start + range_len;
+ if (range_len > 0 && (y & EXT2FS_CLUSTER_MASK(fs))) {
+ cluster_fill = y & EXT2FS_CLUSTER_MASK(fs);
+ if (cluster_fill > range_len)
+ cluster_fill = range_len;
+ newex.e_lblk = y & ~EXT2FS_CLUSTER_MASK(fs);
+ err = ext2fs_map_cluster_block(fs, ino, inode, newex.e_lblk,
+ &pblk);
+ if (err)
+ goto out;
+ if (pblk == 0)
+ goto no_implied;
+ newex.e_pblk = pblk;
+ newex.e_len = cluster_fill;
+ newex.e_flags = (flags & EXT2_FALLOCATE_FORCE_INIT ? 0 :
+ EXT2_EXTENT_FLAGS_UNINIT);
+ dbg_print_extent("ext_falloc iclus right+", &newex);
+ ext2fs_extent_goto(handle, newex.e_lblk);
+ err = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT,
+ &ex);
+ if (err == EXT2_ET_NO_CURRENT_NODE)
+ ex.e_lblk = 0;
+ else if (err)
+ goto out;
+
+ if (ex.e_lblk > newex.e_lblk)
+ op = 0; /* insert before */
+ else
+ op = EXT2_EXTENT_INSERT_AFTER;
+ dbg_printf("%s: inserting %s lblk %llu newex=%llu\n",
+ __func__, op ? "after" : "before", ex.e_lblk,
+ newex.e_lblk);
+ err = ext2fs_extent_insert(handle, op, &newex);
+ if (err)
+ goto out;
+ err = ext2fs_extent_fix_parents(handle);
+ if (err)
+ goto out;
+
+ if (!(newex.e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+ (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+ err = ext2fs_zero_blocks2(fs, newex.e_pblk,
+ newex.e_len, NULL, NULL);
+ if (err)
+ goto out;
+ }
+
+ range_len -= cluster_fill;
+ }
+
+no_implied:
+ if (range_len == 0)
+ return 0;
+
+ newex.e_lblk = range_start;
+ if (flags & EXT2_FALLOCATE_FORCE_INIT) {
+ max_extent_len = max_init_len;
+ newex.e_flags = 0;
+ } else {
+ max_extent_len = max_uninit_len;
+ newex.e_flags = EXT2_EXTENT_FLAGS_UNINIT;
+ }
+ pblk = alloc_goal;
+ y = range_len;
+ for (x = 0; x < y;) {
+ cluster_fill = newex.e_lblk & EXT2FS_CLUSTER_MASK(fs);
+ fillable = min(range_len + cluster_fill, max_extent_len);
+ err = ext2fs_new_range(fs, 0, pblk & ~EXT2FS_CLUSTER_MASK(fs),
+ fillable,
+ NULL, &pblk, &plen);
+ if (err)
+ goto out;
+ err = claim_range(fs, inode, pblk, plen);
+ if (err)
+ goto out;
+
+ /* Create extent */
+ newex.e_pblk = pblk + cluster_fill;
+ newex.e_len = plen - cluster_fill;
+ dbg_print_extent("ext_falloc create", &newex);
+ ext2fs_extent_goto(handle, newex.e_lblk);
+ err = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT,
+ &ex);
+ if (err == EXT2_ET_NO_CURRENT_NODE)
+ ex.e_lblk = 0;
+ else if (err)
+ goto out;
+
+ if (ex.e_lblk > newex.e_lblk)
+ op = 0; /* insert before */
+ else
+ op = EXT2_EXTENT_INSERT_AFTER;
+ dbg_printf("%s: inserting %s lblk %llu newex=%llu\n",
+ __func__, op ? "after" : "before", ex.e_lblk,
+ newex.e_lblk);
+ err = ext2fs_extent_insert(handle, op, &newex);
+ if (err)
+ goto out;
+ err = ext2fs_extent_fix_parents(handle);
+ if (err)
+ goto out;
+
+ if (!(newex.e_flags & EXT2_EXTENT_FLAGS_UNINIT) &&
+ (flags & EXT2_FALLOCATE_ZERO_BLOCKS)) {
+ err = ext2fs_zero_blocks2(fs, pblk, plen, NULL, NULL);
+ if (err)
+ goto out;
+ }
+
+ /* Update variables at end of loop */
+ x += plen - cluster_fill;
+ range_len -= plen - cluster_fill;
+ newex.e_lblk += plen - cluster_fill;
+ pblk += plen - cluster_fill;
+ if (pblk >= ext2fs_blocks_count(fs->super))
+ pblk = fs->super->s_first_data_block;
+ }
+
+out:
+ return err;
+}
+
+static errcode_t extent_fallocate(ext2_filsys fs, int flags, ext2_ino_t ino,
+ struct ext2_inode *inode,
+ blk64_t start, blk64_t len)
+{
+ ext2_extent_handle_t handle;
+ struct ext2fs_extent left_extent, right_extent;
+ struct ext2fs_extent *left_adjacent, *right_adjacent;
+ errcode_t err;
+ blk64_t range_start, range_end = 0, end, next;
+ blk64_t count, goal, goal_distance;
+
+ end = start + len - 1;
+ err = ext2fs_extent_open2(fs, ino, inode, &handle);
+ if (err)
+ return err;
+
+ /*
+ * Find the extent closest to the start of the alloc range. We don't
+ * check the return value because _goto() sets the current node to the
+ * next-lowest extent if 'start' is in a hole; or the next-highest
+ * extent if there aren't any lower ones; or doesn't set a current node
+ * if there was a real error reading the extent tree. In that case,
+ * _get() will error out.
+ */
+start_again:
+ ext2fs_extent_goto(handle, start);
+ err = ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT, &left_extent);
+ if (err == EXT2_ET_NO_CURRENT_NODE) {
+ blk64_t max_blocks = ext2fs_blocks_count(fs->super);
+ goal = ext2fs_find_inode_goal(fs, ino);
+ err = ext2fs_find_first_zero_block_bitmap2(fs->block_map,
+ goal, max_blocks - 1, &goal);
+ goal += start;
+ err = ext_falloc_helper(fs, flags, ino, inode, handle, NULL,
+ NULL, start, len, goal);
+ goto errout;
+ } else if (err)
+ goto errout;
+
+ dbg_print_extent("ext_falloc initial", &left_extent);
+ next = left_extent.e_lblk + left_extent.e_len;
+ if (left_extent.e_lblk > start) {
+ /* The nearest extent we found was beyond start??? */
+ goal = left_extent.e_pblk - (left_extent.e_lblk - start);
+ err = ext_falloc_helper(fs, flags, ino, inode, handle, NULL,
+ &left_extent, start,
+ left_extent.e_lblk - start, goal);
+ if (err)
+ goto errout;
+
+ goto start_again;
+ } else if (next >= start) {
+ range_start = next;
+ left_adjacent = &left_extent;
+ } else {
+ range_start = start;
+ left_adjacent = NULL;
+ }
+ goal = left_extent.e_pblk + (range_start - left_extent.e_lblk);
+ goal_distance = range_start - next;
+
+ do {
+ err = ext2fs_extent_get(handle, EXT2_EXTENT_NEXT_LEAF,
+ &right_extent);
+ dbg_printf("%s: ino=%d get next =%d\n", __func__, ino,
+ (int)err);
+ dbg_print_extent("ext_falloc next", &right_extent);
+ /* Stop if we've seen this extent before */
+ if (!err && right_extent.e_lblk <= left_extent.e_lblk)
+ err = EXT2_ET_EXTENT_NO_NEXT;
+
+ if (err && err != EXT2_ET_EXTENT_NO_NEXT)
+ goto errout;
+ if (err == EXT2_ET_EXTENT_NO_NEXT ||
+ right_extent.e_lblk > end + 1) {
+ range_end = end;
+ right_adjacent = NULL;
+ } else {
+ /* Handle right_extent.e_lblk <= end */
+ range_end = right_extent.e_lblk - 1;
+ right_adjacent = &right_extent;
+ }
+ if (err != EXT2_ET_EXTENT_NO_NEXT &&
+ goal_distance > (range_end - right_extent.e_lblk)) {
+ goal = right_extent.e_pblk -
+ (right_extent.e_lblk - range_start);
+ goal_distance = range_end - right_extent.e_lblk;
+ }
+
+ dbg_printf("%s: ino=%d rstart=%llu rend=%llu\n", __func__, ino,
+ range_start, range_end);
+ err = 0;
+ if (range_start <= range_end) {
+ count = range_end - range_start + 1;
+ err = ext_falloc_helper(fs, flags, ino, inode, handle,
+ left_adjacent, right_adjacent,
+ range_start, count, goal);
+ if (err)
+ goto errout;
+ }
+
+ if (range_end == end)
+ break;
+
+ err = ext2fs_extent_goto(handle, right_extent.e_lblk);
+ if (err)
+ goto errout;
+ next = right_extent.e_lblk + right_extent.e_len;
+ left_extent = right_extent;
+ left_adjacent = &left_extent;
+ range_start = next;
+ goal = left_extent.e_pblk + (range_start - left_extent.e_lblk);
+ goal_distance = range_start - next;
+ } while (range_end < end);
+
+errout:
+ ext2fs_zero_blocks2(NULL, 0, 0, NULL, NULL);
+ ext2fs_extent_free(handle);
+ return err;
+}
+
+errcode_t ext2fs_fallocate(ext2_filsys fs, int flags, ext2_ino_t ino,
+ struct ext2_inode *inode,
+ blk64_t start, blk64_t len)
+{
+ struct ext2_inode inode_buf;
+ blk64_t blk, x;
+ errcode_t err;
+
+ if (((flags & EXT2_FALLOCATE_FORCE_INIT) &&
+ (flags & EXT2_FALLOCATE_FORCE_UNINIT)) ||
+ (flags & ~EXT2_FALLOCATE_ALL_FLAGS))
+ return EXT2_ET_INVALID_ARGUMENT;
+
+ if (len > ext2fs_blocks_count(fs->super))
+ return EXT2_ET_BLOCK_ALLOC_FAIL;
+ else if (len == 0)
+ return 0;
+
+ /* Read inode structure if necessary */
+ if (!inode) {
+ err = ext2fs_read_inode(fs, ino, &inode_buf);
+ if (err)
+ return err;
+ inode = &inode_buf;
+ }
+ dbg_printf("%s: ino=%d start=%llu len=%llu\n", __func__, ino, start,
+ len);
+
+ if (inode->i_flags & EXT4_EXTENTS_FL) {
+ err = extent_fallocate(fs, flags, ino, inode, start, len);
+ goto out;
+ }
+
+ /* XXX: Allocate a bunch of blocks the slow way */
+ for (blk = start; blk < start + len; blk++) {
+ err = ext2fs_bmap2(fs, ino, inode, NULL, 0, blk, 0, &x);
+ if (err)
+ return err;
+ if (x)
+ continue;
+
+ err = ext2fs_bmap2(fs, ino, inode, NULL,
+ BMAP_ALLOC | BMAP_UNINIT, blk, 0, &x);
+ if (err)
+ return err;
+ }
+
+out:
+ if (inode == &inode_buf)
+ ext2fs_write_inode(fs, ino, inode);
+ return err;
+}
diff --git a/misc/mk_hugefiles.c b/misc/mk_hugefiles.c
index d4dadc4..19892c5 100644
--- a/misc/mk_hugefiles.c
+++ b/misc/mk_hugefiles.c
@@ -144,84 +144,20 @@ static errcode_t mk_hugefile(ext2_filsys fs, blk64_t num,
ext2fs_inode_alloc_stats2(fs, *ino, +1, 0);
- retval = ext2fs_extent_open2(fs, *ino, &inode, &handle);
+ if (EXT2_HAS_INCOMPAT_FEATURE(fs->super,
+ EXT3_FEATURE_INCOMPAT_EXTENTS))
+ inode.i_flags |= EXT4_EXTENTS_FL;
+ retval = ext2fs_fallocate(fs,
+ EXT2_FALLOCATE_FORCE_INIT |
+ EXT2_FALLOCATE_ZERO_BLOCKS,
+ *ino, &inode, 0, num);
if (retval)
return retval;
-
- lblk = 0;
- left = num ? num : 1;
- while (left) {
- blk64_t pblk, end;
- blk64_t n = left;
-
- retval = ext2fs_find_first_zero_block_bitmap2(fs->block_map,
- goal, ext2fs_blocks_count(fs->super) - 1, &end);
- if (retval)
- goto errout;
- goal = end;
-
- retval = ext2fs_find_first_set_block_bitmap2(fs->block_map, goal,
- ext2fs_blocks_count(fs->super) - 1, &bend);
- if (retval == ENOENT) {
- bend = ext2fs_blocks_count(fs->super);
- if (num == 0)
- left = 0;
- }
- if (!num || bend - goal < left)
- n = bend - goal;
- pblk = goal;
- if (num)
- left -= n;
- goal += n;
- count += n;
- ext2fs_block_alloc_stats_range(fs, pblk, n, +1);
-
- if (zero_hugefile) {
- blk64_t ret_blk;
- retval = ext2fs_zero_blocks2(fs, pblk, n,
- &ret_blk, NULL);
-
- if (retval)
- com_err(program_name, retval,
- _("while zeroing block %llu "
- "for hugefile"), ret_blk);
- }
-
- while (n) {
- blk64_t l = n;
- struct ext2fs_extent newextent;
-
- if (l > EXT_INIT_MAX_LEN)
- l = EXT_INIT_MAX_LEN;
-
- newextent.e_len = l;
- newextent.e_pblk = pblk;
- newextent.e_lblk = lblk;
- newextent.e_flags = 0;
-
- retval = ext2fs_extent_insert(handle,
- EXT2_EXTENT_INSERT_AFTER, &newextent);
- if (retval)
- return retval;
- pblk += l;
- lblk += l;
- n -= l;
- }
- }
-
- retval = ext2fs_read_inode(fs, *ino, &inode);
+ retval = ext2fs_inode_set_size(fs, &inode, num * fs->blocksize);
if (retval)
- goto errout;
* [PATCH 27/32] fuse2fs: translate ACL structures
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (24 preceding siblings ...)
2014-03-02 7:19 ` [PATCH 25/32] libext2fs: implement fallocate Darrick J. Wong
@ 2014-03-02 7:19 ` Darrick J. Wong
2014-03-02 7:19 ` [PATCH 28/32] fuse2fs: handle 64-bit dates correctly Darrick J. Wong
` (3 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:19 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
Translate "native" ACL structures into ext4 ACL structures when
reading or writing the ACL EAs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
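Illustration only, not part of the patch: a minimal sketch of why the two
ACL formats differ in size. It assumes the layouts shown in the diff below
(4-byte header; 4-byte "short" entries for tags that carry no id; 8-byte
entries for ACL_USER and ACL_GROUP), whereas the format exchanged with FUSE
always uses 8-byte entries. The tag values are assumed to mirror
<sys/acl.h>, and packed_acl_size() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag values assumed to mirror ACL_USER_OBJ etc. from <sys/acl.h>. */
enum {
	TAG_USER_OBJ  = 0x01,
	TAG_USER      = 0x02,
	TAG_GROUP_OBJ = 0x04,
	TAG_GROUP     = 0x08,
	TAG_MASK      = 0x10,
	TAG_OTHER     = 0x20
};

#define HDR_SIZE    4	/* sizeof(ext4_acl_header) in the patch */
#define SHORT_ENTRY 4	/* sizeof(ext4_acl_entry_short): tag + perm */
#define FULL_ENTRY  8	/* sizeof(ext4_acl_entry): tag + perm + id */

/*
 * Hypothetical helper: bytes needed to store an ACL with the given tags
 * in the packed ext4 layout, or -1 if a tag is not recognized.
 */
static long packed_acl_size(const uint16_t *tags, size_t count)
{
	long size = HDR_SIZE;
	size_t i;

	assert(tags != NULL || count == 0);
	for (i = 0; i < count; i++) {
		switch (tags[i]) {
		case TAG_USER:
		case TAG_GROUP:
			size += FULL_ENTRY;	/* entry carries a uid/gid */
			break;
		case TAG_USER_OBJ:
		case TAG_GROUP_OBJ:
		case TAG_MASK:
		case TAG_OTHER:
			size += SHORT_ENTRY;	/* no id stored */
			break;
		default:
			return -1;
		}
	}
	return size;
}
```

For a five-entry ACL (USER_OBJ, USER, GROUP_OBJ, MASK, OTHER) this gives
4 + 4*4 + 8 = 28 bytes, against 4 + 5*8 = 44 bytes on the FUSE side.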
configure | 3 +
configure.in | 7 +-
misc/fuse2fs.c | 260 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 263 insertions(+), 7 deletions(-)
diff --git a/configure b/configure
index ce6a4ef..6cf6380 100755
--- a/configure
+++ b/configure
@@ -11228,6 +11228,7 @@ else
do :
as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
ac_fn_c_check_header_compile "$LINENO" "$ac_header" "$as_ac_Header" "#define _FILE_OFFSET_BITS 64
+#define FUSE_USE_VERSION 29
"
if eval test \"x\$"$as_ac_Header"\" = x"yes"; then :
cat >>confdefs.h <<_ACEOF
@@ -11246,6 +11247,7 @@ done
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h. */
+#define FUSE_USE_VERSION 29
#ifdef __linux__
#include <linux/fs.h>
#include <linux/falloc.h>
@@ -11365,6 +11367,7 @@ else
do :
as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
ac_fn_c_check_header_compile "$LINENO" "$ac_header" "$as_ac_Header" "#define _FILE_OFFSET_BITS 64
+#define FUSE_USE_VERSION 29
#ifdef __linux__
# include <linux/fs.h>
# include <linux/falloc.h>
diff --git a/configure.in b/configure.in
index 2c455af..2ceca2d 100644
--- a/configure.in
+++ b/configure.in
@@ -1177,10 +1177,12 @@ then
else
AC_CHECK_HEADERS([pthread.h fuse.h], [],
[AC_MSG_FAILURE([Cannot find fuse2fs headers.])],
-[#define _FILE_OFFSET_BITS 64])
+[#define _FILE_OFFSET_BITS 64
+#define FUSE_USE_VERSION 29])
AC_PREPROC_IFELSE(
-[AC_LANG_PROGRAM([[#ifdef __linux__
+[AC_LANG_PROGRAM([[#define FUSE_USE_VERSION 29
+#ifdef __linux__
#include <linux/fs.h>
#include <linux/falloc.h>
#include <linux/xattr.h>
@@ -1195,6 +1197,7 @@ fi
,
AC_CHECK_HEADERS([pthread.h fuse.h], [], [FUSE_CMT="#"],
[#define _FILE_OFFSET_BITS 64
+#define FUSE_USE_VERSION 29
#ifdef __linux__
# include <linux/fs.h>
# include <linux/falloc.h>
diff --git a/misc/fuse2fs.c b/misc/fuse2fs.c
index 8dc3c4e..a790d85 100644
--- a/misc/fuse2fs.c
+++ b/misc/fuse2fs.c
@@ -17,9 +17,13 @@
# include <linux/falloc.h>
# include <linux/xattr.h>
# define FUSE_PLATFORM_OPTS ",nonempty,big_writes"
+# define TRANSLATE_LINUX_ACLS
#else
# define FUSE_PLATFORM_OPTS ""
#endif
+#ifdef TRANSLATE_LINUX_ACLS
+# include <sys/acl.h>
+#endif
#include <sys/ioctl.h>
#include <unistd.h>
#include <fuse.h>
@@ -59,6 +63,199 @@ static ext2_filsys global_fs; /* Try not to use this directly */
# define FL_PUNCH_HOLE_FLAG (0)
#endif
+/* ACL translation stuff */
+#ifdef TRANSLATE_LINUX_ACLS
+/*
+ * Copied from acl_ea.h in libacl source; ACLs have to be sent to and from fuse
+ * in this format... at least on Linux.
+ */
+#define ACL_EA_ACCESS "system.posix_acl_access"
+#define ACL_EA_DEFAULT "system.posix_acl_default"
+
+#define ACL_EA_VERSION 0x0002
+
+typedef struct {
+ u_int16_t e_tag;
+ u_int16_t e_perm;
+ u_int32_t e_id;
+} acl_ea_entry;
+
+typedef struct {
+ u_int32_t a_version;
+ acl_ea_entry a_entries[0];
+} acl_ea_header;
+
+static inline size_t acl_ea_size(int count)
+{
+ return sizeof(acl_ea_header) + count * sizeof(acl_ea_entry);
+}
+
+static inline int acl_ea_count(size_t size)
+{
+ if (size < sizeof(acl_ea_header))
+ return -1;
+ size -= sizeof(acl_ea_header);
+ if (size % sizeof(acl_ea_entry))
+ return -1;
+ return size / sizeof(acl_ea_entry);
+}
+
+/*
+ * ext4 ACL structures, copied from fs/ext4/acl.h.
+ */
+#define EXT4_ACL_VERSION 0x0001
+
+typedef struct {
+ __u16 e_tag;
+ __u16 e_perm;
+ __u32 e_id;
+} ext4_acl_entry;
+
+typedef struct {
+ __u16 e_tag;
+ __u16 e_perm;
+} ext4_acl_entry_short;
+
+typedef struct {
+ __u32 a_version;
+} ext4_acl_header;
+
+static inline size_t ext4_acl_size(int count)
+{
+ if (count <= 4) {
+ return sizeof(ext4_acl_header) +
+ count * sizeof(ext4_acl_entry_short);
+ } else {
+ return sizeof(ext4_acl_header) +
+ 4 * sizeof(ext4_acl_entry_short) +
+ (count - 4) * sizeof(ext4_acl_entry);
+ }
+}
+
+static inline int ext4_acl_count(size_t size)
+{
+ ssize_t s;
+ size -= sizeof(ext4_acl_header);
+ s = size - 4 * sizeof(ext4_acl_entry_short);
+ if (s < 0) {
+ if (size % sizeof(ext4_acl_entry_short))
+ return -1;
+ return size / sizeof(ext4_acl_entry_short);
+ } else {
+ if (s % sizeof(ext4_acl_entry))
+ return -1;
+ return s / sizeof(ext4_acl_entry) + 4;
+ }
+}
+
+static errcode_t fuse_to_ext4_acl(acl_ea_header *facl, size_t facl_sz,
+ ext4_acl_header **eacl, size_t *eacl_sz)
+{
+ int i, facl_count;
+ ext4_acl_header *h;
+ size_t h_sz;
+ ext4_acl_entry *e;
+ acl_ea_entry *a;
+ void *hptr;
+ errcode_t err;
+
+ facl_count = acl_ea_count(facl_sz);
+ if (facl_count < 0 || facl->a_version != ACL_EA_VERSION)
+ return EXT2_ET_INVALID_ARGUMENT;
+ h_sz = ext4_acl_size(facl_count);
+
+ err = ext2fs_get_mem(h_sz, &h);
+ if (err)
+ return err;
+
+ h->a_version = ext2fs_cpu_to_le32(EXT4_ACL_VERSION);
+ hptr = h + 1;
+ for (i = 0, a = facl->a_entries; i < facl_count; i++, a++) {
+ e = hptr;
+ e->e_tag = ext2fs_cpu_to_le16(a->e_tag);
+ e->e_perm = ext2fs_cpu_to_le16(a->e_perm);
+
+ switch (a->e_tag) {
+ case ACL_USER:
+ case ACL_GROUP:
+ e->e_id = ext2fs_cpu_to_le32(a->e_id);
+ hptr += sizeof(ext4_acl_entry);
+ break;
+ case ACL_USER_OBJ:
+ case ACL_GROUP_OBJ:
+ case ACL_MASK:
+ case ACL_OTHER:
+ hptr += sizeof(ext4_acl_entry_short);
+ break;
+ default:
+ err = EXT2_ET_INVALID_ARGUMENT;
+ goto out;
+ }
+ }
+
+ *eacl = h;
+ *eacl_sz = h_sz;
+ return err;
+out:
+ ext2fs_free_mem(&h);
+ return err;
+}
+
+static errcode_t ext4_to_fuse_acl(acl_ea_header **facl, size_t *facl_sz,
+ ext4_acl_header *eacl, size_t eacl_sz)
+{
+ int i, eacl_count;
+ acl_ea_header *f;
+ ext4_acl_entry *e;
+ acl_ea_entry *a;
+ size_t f_sz;
+ void *hptr;
+ errcode_t err;
+
+ eacl_count = ext4_acl_count(eacl_sz);
+ if (eacl_count < 0 ||
+ eacl->a_version != ext2fs_cpu_to_le32(EXT4_ACL_VERSION))
+ return EXT2_ET_INVALID_ARGUMENT;
+ f_sz = acl_ea_size(eacl_count);
+
+ err = ext2fs_get_mem(f_sz, &f);
+ if (err)
+ return err;
+
+ f->a_version = ACL_EA_VERSION;
+ hptr = eacl + 1;
+ for (i = 0, a = f->a_entries; i < eacl_count; i++, a++) {
+ e = hptr;
+ a->e_tag = ext2fs_le16_to_cpu(e->e_tag);
+ a->e_perm = ext2fs_le16_to_cpu(e->e_perm);
+
+ switch (a->e_tag) {
+ case ACL_USER:
+ case ACL_GROUP:
+ a->e_id = ext2fs_le32_to_cpu(e->e_id);
+ hptr += sizeof(ext4_acl_entry);
+ break;
+ case ACL_USER_OBJ:
+ case ACL_GROUP_OBJ:
+ case ACL_MASK:
+ case ACL_OTHER:
+ hptr += sizeof(ext4_acl_entry_short);
+ break;
+ default:
+ err = EXT2_ET_INVALID_ARGUMENT;
+ goto out;
+ }
+ }
+
+ *facl = f;
+ *facl_sz = f_sz;
+ return err;
+out:
+ ext2fs_free_mem(&f);
+ return err;
+}
+#endif /* TRANSLATE_LINUX_ACLS */
+
/*
* ext2_file_t contains a struct inode, so we can't leave files open.
* Use this as a proxy instead.
@@ -2100,6 +2297,30 @@ static int op_statfs(const char *path, struct statvfs *buf)
return 0;
}
+typedef errcode_t (*xattr_xlate_get)(void **cooked_buf, size_t *cooked_sz,
+ const void *raw_buf, size_t raw_sz);
+typedef errcode_t (*xattr_xlate_set)(const void *cooked_buf, size_t cooked_sz,
+ void **raw_buf, size_t *raw_sz);
+struct xattr_translate {
+ const char *prefix;
+ xattr_xlate_get get;
+ xattr_xlate_set set;
+};
+
+#define XATTR_TRANSLATOR(p, g, s) \
+ {.prefix = (p), \
+ .get = (xattr_xlate_get)(g), \
+ .set = (xattr_xlate_set)(s)}
+
+static struct xattr_translate xattr_translators[] = {
+#ifdef TRANSLATE_LINUX_ACLS
+ XATTR_TRANSLATOR(ACL_EA_ACCESS, ext4_to_fuse_acl, fuse_to_ext4_acl),
+ XATTR_TRANSLATOR(ACL_EA_DEFAULT, ext4_to_fuse_acl, fuse_to_ext4_acl),
+#endif
+ XATTR_TRANSLATOR(NULL, NULL, NULL),
+};
+#undef XATTR_TRANSLATOR
+
static int op_getxattr(const char *path, const char *key, char *value,
size_t len)
{
@@ -2107,8 +2328,9 @@ static int op_getxattr(const char *path, const char *key, char *value,
struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
ext2_filsys fs;
struct ext2_xattr_handle *h;
- void *ptr;
- size_t plen;
+ struct xattr_translate *xt;
+ void *ptr, *cptr;
+ size_t plen, clen;
ext2_ino_t ino;
errcode_t err;
int ret = 0;
@@ -2151,6 +2373,19 @@ static int op_getxattr(const char *path, const char *key, char *value,
goto out2;
}
+ for (xt = xattr_translators; xt->prefix != NULL; xt++) {
+ if (strncmp(key, xt->prefix, strlen(xt->prefix)) == 0) {
+ err = xt->get(&cptr, &clen, ptr, plen);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out3;
+ }
+ ext2fs_free_mem(&ptr);
+ ptr = cptr;
+ plen = clen;
+ }
+ }
+
if (!len) {
ret = plen;
} else if (len < plen) {
@@ -2160,6 +2393,7 @@ static int op_getxattr(const char *path, const char *key, char *value,
ret = plen;
}
+out3:
ext2fs_free_mem(&ptr);
out2:
err = ext2fs_xattrs_close(&h);
@@ -2274,6 +2508,9 @@ static int op_setxattr(const char *path, const char *key, const char *value,
struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
ext2_filsys fs;
struct ext2_xattr_handle *h;
+ struct xattr_translate *xt;
+ void *cvalue;
+ size_t clen;
ext2_ino_t ino;
errcode_t err;
int ret = 0;
@@ -2313,19 +2550,34 @@ static int op_setxattr(const char *path, const char *key, const char *value,
goto out2;
}
- err = ext2fs_xattr_set(h, key, value, len);
+ cvalue = (void *)value;
+ clen = len;
+ for (xt = xattr_translators; xt->prefix != NULL; xt++) {
+ if (strncmp(key, xt->prefix, strlen(xt->prefix)) == 0) {
+ err = xt->set(value, len, &cvalue, &clen);
+ if (err) {
+ ret = translate_error(fs, ino, err);
+ goto out3;
+ }
+ }
+ }
+
+ err = ext2fs_xattr_set(h, key, cvalue, clen);
if (err) {
ret = translate_error(fs, ino, err);
- goto out2;
+ goto out3;
}
err = ext2fs_xattrs_write(h);
if (err) {
ret = translate_error(fs, ino, err);
- goto out2;
+ goto out3;
}
ret = update_ctime(fs, ino, NULL);
+out3:
+ if (cvalue != value)
+ ext2fs_free_mem(&cvalue);
out2:
err = ext2fs_xattrs_close(&h);
if (!ret && err)
* [PATCH 28/32] fuse2fs: handle 64-bit dates correctly
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (25 preceding siblings ...)
2014-03-02 7:19 ` [PATCH 27/32] fuse2fs: translate ACL structures Darrick J. Wong
@ 2014-03-02 7:19 ` Darrick J. Wong
2014-03-02 7:19 ` [PATCH 29/32] fuse2fs: implement fallocate Darrick J. Wong
` (2 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:19 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
Fix fuse2fs' interpretation of 64-bit date quantities to match the
kernel.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
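Illustration only, not part of the patch: a minimal sketch of the
encode/decode arithmetic described above. It assumes two epoch bits in the
low end of the extra field with the nanoseconds shifted above them, which
is what EXT4_EPOCH_BITS, EXT4_EPOCH_MASK and EXT4_NSEC_MASK are taken to
mean here. The function names are hypothetical, and two's-complement
truncation is assumed when narrowing to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match EXT4_EPOCH_BITS / EXT4_EPOCH_MASK / EXT4_NSEC_MASK. */
#define EPOCH_BITS 2
#define EPOCH_MASK ((1u << EPOCH_BITS) - 1)
#define NSEC_MASK  (~0u << EPOCH_BITS)

/* Low 32 bits of the seconds count, as kept in the base inode field. */
static int32_t low_seconds(int64_t sec)
{
	return (int32_t)(uint32_t)sec;	/* two's-complement truncation assumed */
}

/* Build the extra field: epoch bits below, nanoseconds above. */
static uint32_t encode_extra(int64_t sec, uint32_t nsec)
{
	uint32_t epoch;

	assert(nsec < 1000000000u);
	/* Subtract the sign-extended low half first, then take the high bits. */
	epoch = (uint32_t)((sec - low_seconds(sec)) >> 32) & EPOCH_MASK;
	return epoch | (nsec << EPOCH_BITS);
}

/* Recombine: the epoch bits are added to, not OR-ed into, the seconds. */
static int64_t decode_seconds(int32_t low, uint32_t extra)
{
	return (int64_t)low + ((int64_t)(extra & EPOCH_MASK) << 32);
}

static uint32_t decode_nsec(uint32_t extra)
{
	return (extra & NSEC_MASK) >> EPOCH_BITS;
}
```

The subtraction before the shift is what keeps timestamps whose low 32 bits
are negative (for example, the first second after the 2038 rollover) from
being assigned the wrong epoch.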
misc/fuse2fs.c | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/misc/fuse2fs.c b/misc/fuse2fs.c
index a790d85..2b4a489 100644
--- a/misc/fuse2fs.c
+++ b/misc/fuse2fs.c
@@ -321,15 +321,24 @@ static int __translate_error(ext2_filsys fs, errcode_t err, ext2_ino_t ino,
static inline __u32 ext4_encode_extra_time(const struct timespec *time)
{
- return (sizeof(time->tv_sec) > 4 ?
- (time->tv_sec >> 32) & EXT4_EPOCH_MASK : 0) |
- ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK);
+ __u32 extra = sizeof(time->tv_sec) > 4 ?
+ ((time->tv_sec - (__s32)time->tv_sec) >> 32) &
+ EXT4_EPOCH_MASK : 0;
+ return extra | (time->tv_nsec << EXT4_EPOCH_BITS);
}
static inline void ext4_decode_extra_time(struct timespec *time, __u32 extra)
{
- if (sizeof(time->tv_sec) > 4)
- time->tv_sec |= (__u64)((extra) & EXT4_EPOCH_MASK) << 32;
+ if (sizeof(time->tv_sec) > 4 && (extra & EXT4_EPOCH_MASK)) {
+ __u64 extra_bits = extra & EXT4_EPOCH_MASK;
+ /*
+ * Prior to kernel 3.14?, we had a broken decode function,
+ * wherein we effectively did this:
+ * if (extra_bits == 3)
+ * extra_bits = 0;
+ */
+ time->tv_sec += extra_bits << 32;
+ }
time->tv_nsec = ((extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
}
@@ -355,7 +364,7 @@ do { \
(timespec)->tv_sec = (signed)((raw_inode)->xtime); \
if (EXT4_FITS_IN_INODE(raw_inode, xtime ## _extra)) \
ext4_decode_extra_time((timespec), \
- raw_inode->xtime ## _extra); \
+ (raw_inode)->xtime ## _extra); \
else \
(timespec)->tv_nsec = 0; \
} while (0)
@@ -717,6 +726,7 @@ static int stat_inode(ext2_filsys fs, ext2_ino_t ino, struct stat *statbuf)
dev_t fakedev = 0;
errcode_t err;
int ret = 0;
+ struct timespec tv;
memset(&inode, 0, sizeof(inode));
err = ext2fs_read_inode_full(fs, ino, (struct ext2_inode *)&inode,
@@ -734,9 +744,12 @@ static int stat_inode(ext2_filsys fs, ext2_ino_t ino, struct stat *statbuf)
statbuf->st_size = EXT2_I_SIZE(&inode);
statbuf->st_blksize = fs->blocksize;
statbuf->st_blocks = blocks_from_inode(fs, &inode);
- statbuf->st_atime = inode.i_atime;
- statbuf->st_mtime = inode.i_mtime;
- statbuf->st_ctime = inode.i_ctime;
+ EXT4_INODE_GET_XTIME(i_atime, &tv, &inode);
+ statbuf->st_atime = tv.tv_sec;
+ EXT4_INODE_GET_XTIME(i_mtime, &tv, &inode);
+ statbuf->st_mtime = tv.tv_sec;
+ EXT4_INODE_GET_XTIME(i_ctime, &tv, &inode);
+ statbuf->st_ctime = tv.tv_sec;
if (LINUX_S_ISCHR(inode.i_mode) ||
LINUX_S_ISBLK(inode.i_mode)) {
if (inode.i_block[0])
* [PATCH 29/32] fuse2fs: implement fallocate
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (26 preceding siblings ...)
2014-03-02 7:19 ` [PATCH 28/32] fuse2fs: handle 64-bit dates correctly Darrick J. Wong
@ 2014-03-02 7:19 ` Darrick J. Wong
2014-03-02 7:19 ` [PATCH 31/32] tests: enable using fuse2fs with metadata checksum test Darrick J. Wong
2014-03-02 7:20 ` [PATCH 32/32] tests: test date handling Darrick J. Wong
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:19 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
Use the (new) ext2fs_fallocate() to fallocate file space.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
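Illustration only, not part of the patch: a minimal sketch of the
byte-range to block-range conversion that fallocate_helper() performs
before calling ext2fs_fallocate(). The struct and function names are
hypothetical, and the sketch assumes len > 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: first logical block and number of blocks. */
struct blk_range {
	uint64_t start;
	uint64_t count;
};

/*
 * Convert a byte range [offset, offset + len) into the range of blocks
 * that it touches. Partial blocks at either end count as whole blocks.
 */
static struct blk_range bytes_to_blocks(uint64_t offset, uint64_t len,
					uint32_t blocksize)
{
	struct blk_range r;
	uint64_t end;

	assert(len > 0 && blocksize > 0);
	r.start = offset / blocksize;
	end = (offset + len - 1) / blocksize;	/* last block touched */
	r.count = end - r.start + 1;
	return r;
}
```

A two-byte range that straddles a block boundary (offset 4095, len 2,
blocksize 4096) therefore maps to two blocks starting at block 0.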
misc/fuse2fs.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 58 insertions(+), 1 deletion(-)
diff --git a/misc/fuse2fs.c b/misc/fuse2fs.c
index 2b4a489..e642594 100644
--- a/misc/fuse2fs.c
+++ b/misc/fuse2fs.c
@@ -3259,7 +3259,64 @@ out:
static int fallocate_helper(struct fuse_file_info *fp, int mode, off_t offset,
off_t len)
{
- return -EOPNOTSUPP;
+ struct fuse_context *ctxt = fuse_get_context();
+ struct fuse2fs *ff = (struct fuse2fs *)ctxt->private_data;
+ struct fuse2fs_file_handle *fh = (struct fuse2fs_file_handle *)fp->fh;
+ ext2_filsys fs;
+ struct ext2_inode_large inode;
+ blk64_t start, end, x;
+ __u64 fsize;
+ errcode_t err;
+ int flags;
+ int ret = 0;
+
+ FUSE2FS_CHECK_CONTEXT(ff);
+ fs = ff->fs;
+ FUSE2FS_CHECK_MAGIC(fs, fh, FUSE2FS_FILE_MAGIC);
+ start = offset / fs->blocksize;
+ end = (offset + len - 1) / fs->blocksize;
+ dbg_printf("%s: ino=%d mode=0x%x start=%jd end=%llu\n", __func__,
+ fh->ino, mode, offset / fs->blocksize, end);
+ if (!fs_can_allocate(ff, len / fs->blocksize))
+ return -ENOSPC;
+
+ memset(&inode, 0, sizeof(inode));
+ err = ext2fs_read_inode_full(fs, fh->ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, fh->ino, err);
+ fsize = EXT2_I_SIZE(&inode);
+
+ /* Allocate a bunch of blocks */
+ flags = (mode & FL_KEEP_SIZE_FLAG ? 0 :
+ EXT2_FALLOCATE_INIT_BEYOND_EOF);
+ err = ext2fs_fallocate(fs, flags, fh->ino,
+ (struct ext2_inode *)&inode,
+ start, end - start + 1);
+ if (err && err != EXT2_ET_BLOCK_ALLOC_FAIL)
+ return translate_error(fs, fh->ino, err);
+
+ /* Update i_size */
+ if (!(mode & FL_KEEP_SIZE_FLAG)) {
+ if (offset + len > fsize) {
+ err = ext2fs_inode_set_size(fs,
+ (struct ext2_inode *)&inode,
+ offset + len);
+ if (err)
+ return translate_error(fs, fh->ino, err);
+ }
+ }
+
+ err = update_mtime(fs, fh->ino, &inode);
+ if (err)
+ return err;
+
+ err = ext2fs_write_inode_full(fs, fh->ino, (struct ext2_inode *)&inode,
+ sizeof(inode));
+ if (err)
+ return translate_error(fs, fh->ino, err);
+
+ return err;
}
static errcode_t clean_block_middle(ext2_filsys fs, ext2_ino_t ino,
* [PATCH 31/32] tests: enable using fuse2fs with metadata checksum test
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (27 preceding siblings ...)
2014-03-02 7:19 ` [PATCH 29/32] fuse2fs: implement fallocate Darrick J. Wong
@ 2014-03-02 7:19 ` Darrick J. Wong
2014-03-02 7:20 ` [PATCH 32/32] tests: test date handling Darrick J. Wong
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:19 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
Create custom mount/umount commands so that we can run the metadata
checksumming tests against fuse2fs.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
tests/fuse2fs/mount | 28 ++++++++++++++++++++++++++++
tests/fuse2fs/umount | 21 +++++++++++++++++++++
2 files changed, 49 insertions(+)
create mode 100755 tests/fuse2fs/mount
create mode 100755 tests/fuse2fs/umount
diff --git a/tests/fuse2fs/mount b/tests/fuse2fs/mount
new file mode 100755
index 0000000..321b1f5
--- /dev/null
+++ b/tests/fuse2fs/mount
@@ -0,0 +1,28 @@
+#!/bin/bash
+
+# Mount ext4 via fuse. Put tests/fuse2fs/ at the start of PATH if you want
+# to run the metadata checksumming tests with fuse2fs.
+
+for arg in "$@"; do
+ if [ -b "${arg}" ]; then
+ DEV="${arg}"
+ elif [ -d "${arg}" ]; then
+ MNT="${arg}"
+ fi
+done
+
+if [ -z "${DEV}" -o -z "${MNT}" ]; then
+ echo "Please specify a device and a mountpoint." >&2; exit 1
+fi
+
+DIR="$(readlink -f "$(dirname "$0")")"
+if [ -n "${FUSE2FS_DEBUG}" ]; then
+ "${DIR}/../../misc/fuse2fs" "${DEV}" "${MNT}" -d >> "${FUSE2FS_DEBUG}" 2>&1 &
+ sleep 1
+ exit 0
+else
+ "${DIR}/../../misc/fuse2fs" "${DEV}" "${MNT}"
+ ERR=$?
+ sleep 1
+ exit "${ERR}"
+fi
diff --git a/tests/fuse2fs/umount b/tests/fuse2fs/umount
new file mode 100755
index 0000000..715bee1
--- /dev/null
+++ b/tests/fuse2fs/umount
@@ -0,0 +1,21 @@
+#!/bin/bash
+
+# unmount a filesystem
+sync
+sync
+sync
+
+N=1
+if [ -x /bin/umount ]; then
+ /bin/umount "$@"
+ ERR=$?
+elif [ -x /sbin/umount ]; then
+ /sbin/umount "$@"
+ ERR=$?
+else
+ echo "Where is umount?"
+ exit 5
+fi
+sleep 1
+
+exit "${ERR}"
* [PATCH 32/32] tests: test date handling
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
` (28 preceding siblings ...)
2014-03-02 7:19 ` [PATCH 31/32] tests: enable using fuse2fs with metadata checksum test Darrick J. Wong
@ 2014-03-02 7:20 ` Darrick J. Wong
29 siblings, 0 replies; 31+ messages in thread
From: Darrick J. Wong @ 2014-03-02 7:20 UTC (permalink / raw)
To: tytso, darrick.wong; +Cc: linux-ext4
Test our ability to handle the entire range of valid dates.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
tests/metadata-checksum-test.sh | 59 +++++++++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)
diff --git a/tests/metadata-checksum-test.sh b/tests/metadata-checksum-test.sh
index 05b88bc..7887d1a 100755
--- a/tests/metadata-checksum-test.sh
+++ b/tests/metadata-checksum-test.sh
@@ -3746,6 +3746,65 @@ ${fsck_cmd} -C0 -f -n "${DEV}"
${E2FSPROGS}/debugfs/debugfs -R 'ex /fragfile' "${DEV}" | tail -n 15
}
+#####################################
+function date_test {
+msg "date_test"
+
+rm -rf /tmp/ls.before /tmp/ls.after /tmp/debugfs.diff
+
+INODE_SIZE="$(${E2FSPROGS}/misc/dumpe2fs -h "${DEV}" | grep 'Inode size:' | awk '{print $3}')"
+if [ "${INODE_SIZE}" -gt 128 ]; then
+ LAST_YEAR=2430
+else
+ LAST_YEAR=2030
+fi
+
+# Write dates
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+seq 1910 20 "${LAST_YEAR}" | while read year; do
+ DATE="${year}-01-01 00:00:00.000000000"
+ FNAME="$(echo "${DATE}" | tr '[ \-:.]' '____')"
+ touch -d "${DATE}" "${MNT}/${FNAME}"
+ echo "${FNAME} ${DATE}" >> /tmp/ls.before
+done
+umount "${MNT}"
+${fsck_cmd} -C0 -f -n "${DEV}"
+
+# debugfs
+seq 1910 20 "${LAST_YEAR}" | while read year; do
+ DATE="${year}-01-01 00:00:00.000000000"
+ FNAME="$(echo "${DATE}" | tr '[ \-:.]' '____')"
+ echo "${FNAME}" "$(${E2FSPROGS}/debugfs/debugfs -R "stat ${FNAME}" "${DEV}" | grep 'mtime:')"
+done > /tmp/debugfs.before
+
+# Re-read from kernel
+${mount_cmd} ${MOUNT_OPTS} "${DEV}" "${MNT}" -t ext4 -o journal_checksum
+seq 1910 20 "${LAST_YEAR}" | while read year; do
+ DATE="${year}-01-01 00:00:00.000000000"
+ FNAME="$(echo "${DATE}" | tr '[ \-:.]' '____')"
+ FDATE="$(stat -c '%y' "${MNT}/${FNAME}" | sed -e 's/......$//g')"
+ echo "${FNAME}" "${FDATE}" >> /tmp/ls.after
+done
+umount "${MNT}"
+
+# Did the kernel work?
+diff -u /tmp/ls.before /tmp/ls.after > /tmp/ls.diff || true
+
+# Does debugfs work?
+touch /tmp/debugfs.diff
+sed -e 's/^\(....\).*\(....\)$/\1 \2/g' /tmp/debugfs.before | while read date fdate rest; do
+ if [ "${date}" != "${fdate}" ]; then
+ echo "${date} != ${fdate}" >> /tmp/debugfs.diff
+ fi
+done
+
+if [ "$(cat /tmp/debugfs.diff /tmp/ls.diff | wc -l)" -gt 0 ]; then
+ echo "BROKEN DATE HANDLING"
+ cat /tmp/debugfs.diff /tmp/ls.diff
+ false
+fi
+}
+
# This test should be the last one (before speed tests, anyway)
#### ALL SPEED TESTS GO AT THE END
Thread overview: 31+ messages
2014-03-02 7:16 [PATCH 00/32] e2fsprogs patchbomb 2/14 Darrick J. Wong
2014-03-02 7:16 ` [PATCH 01/32] libext2fs: support modifying arbitrary extended attributes (v5) Darrick J. Wong
2014-03-02 7:16 ` [PATCH 02/32] debugfs: create commands to edit extended attributes Darrick J. Wong
2014-03-02 7:16 ` [PATCH 03/32] libext2fs: fix 64bit overflow in ext2fs_block_alloc_stats_range Darrick J. Wong
2014-03-02 7:17 ` [PATCH 04/32] misc: fix header complaints and resource leaks in e2fsprogs Darrick J. Wong
2014-03-02 7:17 ` [PATCH 05/32] libext2fs: fix memory leak when drastically shrinking extent tree depth Darrick J. Wong
2014-03-02 7:17 ` [PATCH 06/32] libext2fs: fix parents when modifying extents Darrick J. Wong
2014-03-02 7:17 ` [PATCH 07/32] e2fsck: fix inline_data flag errors in pass1 Darrick J. Wong
2014-03-02 7:17 ` [PATCH 08/32] e2fsck: print runs of duplicate blocks instead of all of them Darrick J. Wong
2014-03-02 7:17 ` [PATCH 09/32] e2fsck: verify checksums after checking everything else Darrick J. Wong
2014-03-02 7:17 ` [PATCH 10/32] dumpe2fs: add switch to disable checksum verification Darrick J. Wong
2014-03-02 7:17 ` [PATCH 11/32] mke2fs: set block_validity as a default mount option Darrick J. Wong
2014-03-02 7:17 ` [PATCH 12/32] libext2fs: support allocating uninit blocks in bmap2() Darrick J. Wong
2014-03-02 7:18 ` [PATCH 13/32] libext2fs: file IO routines should handle uninit blocks Darrick J. Wong
2014-03-02 7:18 ` [PATCH 14/32] resize2fs: convert fs to and from 64bit mode Darrick J. Wong
2014-03-02 7:18 ` [PATCH 15/32] resize2fs: when toggling 64bit, don't free in-use bg data clusters Darrick J. Wong
2014-03-02 7:18 ` [PATCH 16/32] resize2fs: adjust reserved_gdt_blocks when changing group descriptor size Darrick J. Wong
2014-03-02 7:18 ` [PATCH 17/32] libext2fs: have UNIX IO manager use pread/pwrite Darrick J. Wong
2014-03-02 7:18 ` [PATCH 18/32] ext2fs: add readahead method to improve scanning Darrick J. Wong
2014-03-02 7:18 ` [PATCH 19/32] libext2fs: allow clients to read-ahead metadata Darrick J. Wong
2014-03-02 7:18 ` [PATCH 20/32] e2fsck: read-ahead metadata during passes 1, 2, and 4 Darrick J. Wong
2014-03-02 7:18 ` [PATCH 21/32] libext2fs: when appending to a file, don't split an index block in equal halves Darrick J. Wong
2014-03-02 7:18 ` [PATCH 22/32] libext2fs: find inode goal when allocating blocks Darrick J. Wong
2014-03-02 7:19 ` [PATCH 23/32] libext2fs: find a range of empty blocks Darrick J. Wong
2014-03-02 7:19 ` [PATCH 24/32] libext2fs: provide a function to set inode size Darrick J. Wong
2014-03-02 7:19 ` [PATCH 25/32] libext2fs: implement fallocate Darrick J. Wong
2014-03-02 7:19 ` [PATCH 27/32] fuse2fs: translate ACL structures Darrick J. Wong
2014-03-02 7:19 ` [PATCH 28/32] fuse2fs: handle 64-bit dates correctly Darrick J. Wong
2014-03-02 7:19 ` [PATCH 29/32] fuse2fs: implement fallocate Darrick J. Wong
2014-03-02 7:19 ` [PATCH 31/32] tests: enable using fuse2fs with metadata checksum test Darrick J. Wong
2014-03-02 7:20 ` [PATCH 32/32] tests: test date handling Darrick J. Wong