From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>,
linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Crazy idea of cleanup the inode_record btrfsck things with SQL?
Date: Mon, 01 Dec 2014 07:53:08 -0500
Message-ID: <547C64B4.2050802@gmail.com>
In-Reply-To: <547BCB43.5020505@cn.fujitsu.com>
On 2014-11-30 20:58, Qu Wenruo wrote:
> [BACKGROUND]
> I'm trying to implement a function to repair a missing inode item.
> In that case, the inode type must be salvaged (although it can fall back to
> FILE).
>
> One such case: if any dir_item/index or inode_ref refers to the
> inode as its parent, the type of that inode must be DIR.
>
> However, with the current btrfsck implementation (inode_record only records
> backrefs), we are unable to search for the inode_backrefs whose parent is a
> given inode number.
>
> [FIRST IMPLEMENT DESIGN]
> My first thought was to implement a generic inode-relation structure
> recording parent ino, child ino, name and namelen, and to store the
> structures in an rbtree rather than in the child's/parent's list.
>
> But I soon realized that this is a perfect use case for a relational
> database, with 'ino' as the primary key for the INODE table and
> ('parent_ino', 'child_ino', 'name') as the primary key for the INODE_REF
> table.
>
> [CRAZY IDEA]
> So why not use SQL to implement the btrfsck inode-record things?
>
> With such a crazy idea, it would be much easier to do any iteration from a
> given ino, and with an already mature RDB implementation like sqlite3, we
> can save hundreds of lines of code implementing the rb-tree or list.
>
> [PROS]
> 1. Easy to maintain
> We no longer need to maintain the rbtree searching or list iteration,
> only simple SQL statements and their wrappers.
>
> 2. Easy to extend
> If we need to record something more, like extents and their relation to
> inodes, we only need to create 2 tables and several SQL statements and
> wrappers.
>
> 3. Reduced memory usage for HUGE fs.
> When metadata grows to several TB or even more, the current rb-tree based
> implementation may run short of memory, since the records are all stored
> in memory.
> But if we use SQL, an RDBMS like sqlite3 can store things either in memory
> or on disk, which may hugely reduce the memory usage for a huge btrfs.
>
> Without an existing RDBMS, we would need to implement a complicated memory
> control system to manage memory in userland.
>
> [CONS]
> 1. Heavy implementation
> SQL hides the rb-tree or B+ tree implementation but costs more memory (if
> not compressed) and CPU cycles, so it will be slower than the simple
> rb-tree implementation, even with a lightweight RDBMS like sqlite3.
>
> 2. Heavy dependency
> If we use it, btrfs-progs will gain an RDBMS as a build-time and runtime
> dependency.
> Having such low-level progs depend on a high-level program like sqlite3
> may be very strange.
>
> 3. A lot of rework on existing code.
> Even though SQL is easier to maintain and extend, if we use it we still
> need to reimplement several hundred or even thousands of lines of code,
> not to mention the regression tests.
>
> 4. Copyright
> Will it cause any copyright problem to use a non-GPL RDBMS like sqlite3 in
> GPLv2 btrfs-progs?
>
> [NEED FEEDBACK]
> Any feedback or discussion on this crazy idea is welcome. Since this may
> need a lot of work, the idea definitely needs a lot of review before it
> turns into code.
>
So, I think this does a good job of highlighting one of the bigger
issues with btrfsck when it is compared to ext* and/or xfs. Despite
this being a problem, I really don't think using an RDBMS is the way to
fix it, both for reasons outlined in other responses, and because fsck
should be as fast as possible when nothing is wrong with the fs.
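
For reference, the schema proposed in the quoted message can be sketched
roughly as follows. This is only an illustration, assuming Python's built-in
sqlite3 module; the table and column names follow the quoted proposal, while
the helper names (open_db, add_ref, children_of, guess_type) are hypothetical
and do not come from btrfs-progs. It shows the reverse lookup the proposal is
after: finding every reference whose parent is a given inode number, and
falling back to FILE when none exists.

```python
import sqlite3

# Hypothetical schema: names follow the quoted proposal, not btrfs-progs code.
SCHEMA = """
CREATE TABLE inode (
    ino  INTEGER PRIMARY KEY,
    type TEXT                 -- NULL while the inode item is missing
);
CREATE TABLE inode_ref (
    parent_ino INTEGER NOT NULL,
    child_ino  INTEGER NOT NULL,
    name       TEXT    NOT NULL,
    PRIMARY KEY (parent_ino, child_ino, name)
);
"""


def open_db(path=":memory:"):
    """Open the database in memory by default, or on disk if a path is given."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_ref(conn, parent_ino, child_ino, name):
    """Record that parent_ino contains an entry 'name' pointing at child_ino."""
    conn.execute(
        "INSERT INTO inode_ref (parent_ino, child_ino, name) VALUES (?, ?, ?)",
        (parent_ino, child_ino, name),
    )


def children_of(conn, ino):
    """Return (child_ino, name) for every reference whose parent is ino."""
    cur = conn.execute(
        "SELECT child_ino, name FROM inode_ref WHERE parent_ino = ? ORDER BY name",
        (ino,),
    )
    return cur.fetchall()


def guess_type(conn, ino):
    """Return 'DIR' if any reference names ino as its parent, else 'FILE'."""
    row = conn.execute(
        "SELECT 1 FROM inode_ref WHERE parent_ino = ? LIMIT 1", (ino,)
    ).fetchone()
    return "DIR" if row else "FILE"


conn = open_db()
add_ref(conn, 256, 257, "a.txt")
add_ref(conn, 256, 258, "sub")
print(guess_type(conn, 256))   # DIR
print(guess_type(conn, 257))   # FILE
print(children_of(conn, 256))  # [(257, 'a.txt'), (258, 'sub')]
```

Because parent_ino is the leading column of the composite primary key, SQLite
can serve the parent lookup from that key's index. Passing a file path to
open_db instead of ":memory:" keeps the records on disk, which is the memory
trade-off the quoted message describes. Whether this is fast enough for fsck
is exactly the open question.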