Re: Crazy idea of cleanup the inode_record btrfsck things with SQL?

Linux Btrfs filesystem development

From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Crazy idea of cleanup the inode_record btrfsck things with SQL?
Date: Mon, 01 Dec 2014 07:53:08 -0500
Message-ID: <547C64B4.2050802@gmail.com>
In-Reply-To: <547BCB43.5020505@cn.fujitsu.com>

On 2014-11-30 20:58, Qu Wenruo wrote:
> [BACKGROUND]
> I'm trying to implement a function to repair a missing inode item.
> In that case, the inode type must be salvaged (although it can fall back
> to FILE).
>
> One case: if any dir_item/index or inode_ref refers to the inode as its
> parent, the type of that inode must be DIR.
>
> However, with the current btrfsck implementation (inode_record only
> records backrefs), we are unable to search for the inode_backref whose
> parent is a given inode number.
>
> [FIRST IMPLEMENT DESIGN]
> My first thought was to implement a generic inode-relation structure,
> recording parent ino, child ino, name and namelen, and to store the
> structures in an rbtree rather than in the child's/parent's list.
>
> But I soon recognized that this is a perfect use case for a relational
> database, with 'ino' as the primary key for the INODE table and
> ('parent_ino', 'child_ino', 'name') as the primary key for the INODE_REF
> table.
>
> [CRAZY IDEA]
> So why not use SQL to implement the btrfsck inode-record things?
>
> With such a crazy idea, it will be much easier to do any iteration from
> a given ino, and with an already mature RDB implementation like sqlite3,
> we can save hundreds of lines of code implementing the rb-tree or list.
>
> [PROS]
> 1. Easy to maintain
>     We no longer need to maintain the rbtree searching or list iteration,
>     only simple SQL statements and their wrappers.
>
> 2. Easy to extend
>     If we need to record something more, like extents and their relation
>     to inodes, we only need to create 2 tables and several SQL statements
>     and wrappers.
>
> 3. Reduced memory usage for HUGE fs
>     When metadata grows to several TB or even more, the current rb-tree
>     based implementation may run short of memory, since everything is
>     stored in memory.
>     But with SQL, an RDBMS like sqlite3 can store things either in memory
>     or on disk, which may hugely reduce memory usage for a huge btrfs.
>
>     Without an existing RDBMS, we would need to implement a complicated
>     memory control system to manage memory in userland.
>
> [CONS]
> 1. Heavy implementation
>     SQL hides the rb-tree or B+ tree implementation but costs more memory
>     (if not compressed) and CPU cycles, so it will be slower than the
>     simple rb-tree implementation even with a lightweight RDBMS like
>     sqlite3.
>
> 2. Heavy dependency
>     If we use it, btrfs-progs will gain an RDBMS as a build and runtime
>     dependency.
>     Having such low-level progs depend on a high-level program like
>     sqlite3 may be very strange.
>
> 3. A lot of rework on existing code
>     Even if SQL is easier to maintain and extend, we would still need to
>     reimplement several hundred or even thousands of lines of code to
>     use it, not to mention the regression tests.
>
> 4. Copyright
>     Will it cause any copyright problem to use a non-GPL RDBMS like
>     sqlite3 in GPLv2 btrfs-progs?
>
> [NEED FEEDBACK]
> Any feedback or discussion on this crazy idea is welcome. Since this may
> need a lot of work, the idea definitely needs a lot of review before it
> comes to code.
>
So, I think this does a good job of highlighting one of the bigger
issues with btrfsck when it is compared to ext* and/or xfs.  Despite
this being a problem, I really don't think using an RDBMS is the way to
fix it, both for reasons outlined in other responses, and because fsck
should be as fast as possible when nothing is wrong with the fs.
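
For anyone following the thread, the schema in the quoted proposal can be
sketched roughly as below. This is only an illustration, not btrfs-progs
code: it uses Python's sqlite3 module for brevity, the table and column
names are taken from the quoted text, and the `type` column, the
`guess_type` helper and the sample inode numbers are assumptions made up
for the example.

```python
import sqlite3


def build_db():
    # In-memory database for the example; sqlite3 can also use a file on disk.
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE inode (
            ino  INTEGER PRIMARY KEY,
            type TEXT               -- assumed column; NULL if the item is missing
        );
        CREATE TABLE inode_ref (
            parent_ino INTEGER NOT NULL,
            child_ino  INTEGER NOT NULL,
            name       TEXT    NOT NULL,
            PRIMARY KEY (parent_ino, child_ino, name)
        );
    """)
    return db


def guess_type(db, ino):
    # Hypothetical helper: if any ref names `ino` as its parent, the inode
    # must be a directory; otherwise fall back to FILE as in the proposal.
    row = db.execute(
        "SELECT 1 FROM inode_ref WHERE parent_ino = ? LIMIT 1", (ino,)
    ).fetchone()
    return "DIR" if row else "FILE"


db = build_db()
# Made-up sample refs: (parent_ino, child_ino, name)
db.executemany(
    "INSERT INTO inode_ref VALUES (?, ?, ?)",
    [(256, 257, "a.txt"), (256, 258, "sub"), (258, 259, "b.txt")],
)
print(guess_type(db, 258))
print(guess_type(db, 259))
```

Because parent_ino is the leading column of the composite primary key, the
"who has this inode as parent" lookup is an index search rather than a
table scan, which is the search the current inode_record layout cannot do.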




