From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Theodore Tso <tytso@MIT.EDU>
Cc: linux-ext4@vger.kernel.org
Subject: Re: What's cooking in e2fsprogs.git (topics)
Date: Thu, 24 Apr 2008 00:28:01 +0530 [thread overview]
Message-ID: <20080423185801.GC21899@skywalker> (raw)
In-Reply-To: <20080421164144.GG9700@mit.edu>
On Mon, Apr 21, 2008 at 12:41:44PM -0400, Theodore Tso wrote:
>
> * ak/undo-mgr (Mon Aug 13 15:56:26 2007 +0530) 6 commits
> - Add test cases for undoe2fs: u_undoe2fs_mke2fs and
> u_undoe2fs_tune2fs
> - Fix the resize inode test case
> - tune2fs: Support for large inode migration.
> - mke2fs: Add support for the undo I/O manager.
> - Add undoe2fs command
> - libext2fs: Add undo I/O manager
>
> These patches have been *significantly* rototilled.
>
> * The test cases weren't correctly setting and deleting the
> test_name.ok and test_name.failed files
>
> * mke2fs would bomb out with an incomprehensible error message
> if run twice in a row, or if the user didn't have write access
> to /var/lib/e2fsprogs (some users run mke2fs as a non-root
> user!)
>
> * the undoe2fs program was calling com_err and passing
> in uninitialized retval values, and was otherwise confused
> about how to do proper error handling, resulting in garbage
> getting printed if the file passed in didn't exist
>
> * there was a terrible performance problem because in the
> mke2fs case, the undo manager was using a tdb_data_size of
> 512.
>
> * I added the ability to configure the location of the scratch
> dirctory via mke2fs.conf for mk2efs. What we should do for
> tune2fs is less clear, and still needs to be addressed. It
> is also possible to force an undo file to be always created,
> and not just when doing a lazy inode table initialization.
> By using a 4k instead of 512 tdb_data_size, the time to
> speed up an mke2fs was cut in half, while still using an
> undo file. I suspect if we force the tdb_data_size to be
> even larger, say 32k or 64k the overhead would shrink even
> more.
>
> Unfortunately, there appears to be some kind of data
> corruption bug if I force a tdb_data_size of 32768, so I'm not
> entirely sure I trust the undo manager to be working
> correctly. The undo_manager code itself needs a pretty
> serious auditing and checking to make sure it's doing the
> right thing with large tdb_data_sizes.
>
The bug was in the changes added to support block size via set_options
We had two records in the data for blocksize. one 1024. configured via
set_blk_size and other 32K configured via set_options. So the undoe2fs
was replaying with 1K as the block size.
The below changes get everything working for me. The patch is not clean.
so they are not to apply. I still need to think a little bit more about
what to do when we attempt to read tdb_data_size from the end of file
system. Will send the cleanup patch later.
Let me know whether you want to keep the debug printf
diff --git a/lib/ext2fs/undo_io.c b/lib/ext2fs/undo_io.c
index 9bee1b6..4ad0fd6 100644
--- a/lib/ext2fs/undo_io.c
+++ b/lib/ext2fs/undo_io.c
@@ -169,6 +169,9 @@ static errcode_t write_block_size(TDB_CONTEXT *tdb, int block_size)
tdb_key.dsize = sizeof("filesystem BLKSIZE");
tdb_data.dptr = (unsigned char *)&(block_size);
tdb_data.dsize = sizeof(block_size);
+#ifdef DEBUG
+ printf("Setting blocksize %d\n", block_size);
+#endif
retval = tdb_store(tdb, tdb_key, tdb_data, TDB_INSERT);
if (retval == -1) {
@@ -182,13 +185,14 @@ static errcode_t undo_write_tdb(io_channel channel,
unsigned long block, int count)
{
- int size, loop_count = 0, i;
+ int size, i;
unsigned long block_num, backing_blk_num;
errcode_t retval = 0;
ext2_loff_t offset;
struct undo_private_data *data;
TDB_DATA tdb_key, tdb_data;
char *read_ptr;
+ unsigned long end_block;
data = (struct undo_private_data *) channel->private_data;
@@ -208,21 +212,26 @@ static errcode_t undo_write_tdb(io_channel channel,
size = count * channel->block_size;
}
+#ifdef DEBUG
+ printf("Writing at %ld of size %ld\n", block, size);
+#endif
/*
* Data is stored in tdb database as blocks of tdb_data_size size
* This helps in efficient lookup further.
*
* We divide the disk to blocks of tdb_data_size.
*/
+ /*FIXME need to check end of file system. */
- block_num = ((block*channel->block_size)+data->offset) /
- data->tdb_data_size;
-
- loop_count = (size + data->tdb_data_size -1) /
- data->tdb_data_size;
+ offset = (block * channel->block_size) + data->offset ;
+ block_num = offset / data->tdb_data_size;
+ end_block = (offset + size) / data->tdb_data_size;
+#ifdef DEBUG
+ printf("Writing from %ld to %ld\n", block_num, end_block);
+#endif
tdb_transaction_start(data->tdb);
- for (i = 0; i < loop_count; i++) {
+ while (block_num <= end_block ) {
tdb_key.dptr = (unsigned char *)&block_num;
tdb_key.dsize = sizeof(block_num);
@@ -231,8 +240,11 @@ static errcode_t undo_write_tdb(io_channel channel,
* Check if we have the record already
*/
if (tdb_exists(data->tdb, tdb_key)) {
-
/* Try the next block */
+#ifdef DEBUG
+ printf("The block %d already exist in database\n",
+ block_num);
+#endif
block_num++;
continue;
}
@@ -258,6 +270,9 @@ static errcode_t undo_write_tdb(io_channel channel,
}
memset(read_ptr, 0, count);
+#ifdef DEBUG
+ printf("Reading from %ld to %ld\n", backing_blk_num, count);
+#endif
retval = io_channel_read_blk(data->real, backing_blk_num,
-count, read_ptr);
@@ -276,10 +291,22 @@ static errcode_t undo_write_tdb(io_channel channel,
printf("Printing with key %ld data %x and size %d\n",
block_num,
tdb_data.dptr,
- channel->tdb_data_size);
+ data->tdb_data_size);
#endif
- data->tdb_written = 1;
+ if (!data->tdb_written) {
+ data->tdb_written = 1;
+
+ /* Write the blocksize to tdb file */
+ retval = write_block_size(data->tdb,
+ data->tdb_data_size);
+ if (retval) {
+ tdb_transaction_cancel(data->tdb);
+ retval = EXT2_ET_TDB_ERR_IO;
+ free(read_ptr);
+ return retval;
+ }
+ }
retval = tdb_store(data->tdb, tdb_key, tdb_data, TDB_INSERT);
if (retval == -1) {
/*
@@ -417,11 +444,6 @@ static errcode_t undo_set_blksize(io_channel channel, int blksize)
*/
if (!data->tdb_data_size) {
data->tdb_data_size = blksize;
-
- /* Write it to tdb file */
- retval = write_block_size(data->tdb, data->tdb_data_size);
- if (retval)
- return retval;
}
channel->block_size = blksize;
@@ -540,12 +562,6 @@ static errcode_t undo_set_option(io_channel channel, const char *option,
return EXT2_ET_INVALID_ARGUMENT;
if (!data->tdb_data_size || !data->tdb_written) {
data->tdb_data_size = tmp;
-
- /* Write it to tdb file */
- retval = write_block_size(data->tdb,
- data->tdb_data_size);
- if (retval)
- return retval;
}
return 0;
}
diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index 593b743..be772ee 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -1745,6 +1745,7 @@ static int should_do_undo(const char *name)
io_manager manager = unix_io_manager;
int csum_flag, force_undo;
+
csum_flag = EXT2_HAS_RO_COMPAT_FEATURE(&fs_param,
EXT4_FEATURE_RO_COMPAT_GDT_CSUM);
force_undo = get_int_from_profile(fs_types, "force_undo", 0);
@@ -1873,7 +1874,7 @@ int main (int argc, char *argv[])
com_err(device_name, retval, _("while setting up superblock"));
exit(1);
}
-#if 0 /* XXX bug in undo_io.c if we set this? */
+#if 1 /* XXX bug in undo_io.c if we set this? */
io_channel_set_options(fs->io, "tdb_data_size=32768");
#endif
next prev parent reply other threads:[~2008-04-23 18:58 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-11-04 22:42 What's cooking in e2fsprogs.git (topics) Theodore Ts'o, Theodore Ts'o
2007-11-05 2:15 ` Andreas Dilger
2007-11-05 4:32 ` Theodore Tso
2007-11-05 15:06 ` Andreas Dilger
2007-12-17 17:11 ` Theodore Tso
2007-12-17 22:34 ` Andreas Dilger
2007-12-17 22:59 ` Theodore Tso
2007-12-17 23:36 ` Andreas Dilger
2007-12-18 3:32 ` Theodore Tso
2007-12-18 8:13 ` Florian Weimer
2007-12-18 19:10 ` What's cooking in e2fsprogs.git (topics) - [RFC] FLEX_BG bmap and itable allocation patch Jose R. Santos
2008-02-11 4:51 ` What's cooking in e2fsprogs.git (topics) Theodore Tso
2008-02-11 5:08 ` Eric Sandeen
2008-02-11 7:24 ` Theodore Tso
2008-02-19 5:09 ` Theodore Tso
2008-02-20 18:46 ` Eric Sandeen
2008-02-21 14:05 ` Theodore Tso
2008-02-21 16:40 ` Eric Sandeen
2008-02-22 23:14 ` Andreas Dilger
2008-02-23 0:15 ` Theodore Tso
2008-02-25 4:20 ` Andreas Dilger
2008-02-25 15:13 ` Theodore Tso
2008-02-25 16:01 ` Aneesh Kumar K.V
2008-02-25 17:32 ` Eric Sandeen
2008-02-25 20:23 ` Theodore Tso
2008-02-29 15:43 ` Theodore Tso
2008-02-29 19:59 ` Andreas Dilger
2008-02-29 22:49 ` Theodore Tso
2008-03-02 3:24 ` Jose R. Santos
2008-03-05 16:59 ` Jose R. Santos
2008-03-13 18:11 ` Theodore Tso
2008-03-20 20:32 ` Theodore Tso
2008-04-02 0:09 ` Theodore Tso
2008-04-07 17:12 ` Theodore Tso
2008-04-18 18:43 ` Theodore Tso
2008-04-21 16:41 ` Theodore Tso
2008-04-23 7:32 ` Aneesh Kumar K.V
2008-04-23 11:55 ` Theodore Tso
2008-04-23 18:58 ` Aneesh Kumar K.V [this message]
2008-04-28 19:44 ` The changes I made to the undo-mgr (Re: What's cooking in e2fsprogs.git (topics)) Theodore Tso
2008-05-24 23:54 ` What's cooking in e2fsprogs.git (topics) Theodore Tso
2008-06-03 2:40 ` Theodore Tso
2008-06-17 12:03 ` Theodore Tso
2008-03-02 23:50 ` Christian Kujau
-- strict thread matches above, loose matches on Subject: below --
2007-10-16 4:48 Theodore Ts'o
2007-10-16 5:36 ` Aneesh Kumar K.V
2007-10-16 16:41 ` Jose R. Santos
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080423185801.GC21899@skywalker \
--to=aneesh.kumar@linux.vnet.ibm.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@MIT.EDU \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.