* Rework qgroup accounting
From: Josef Bacik @ 2013-12-18 21:07 UTC
  To: linux-btrfs

People have been complaining about autodefrag/defrag killing their box with OOM.
This is because the snapshot-aware defrag stuff super sucks if you have lots of
snapshots, and so that needs to be reworked.  The problem is that once that is
fixed you start to hit horrible lock contention on the delayed refs lock,
because we have thousands of similar entries that can't be merged until we
actually run the delayed ref.  This problem exists because of the delayed ref
sequence number.

The major user of the delayed ref sequence number is the qgroup code.  It passes
it into btrfs_find_all_roots to see which roots pointed to a particular bytenr
either before or including the current operation.  It needs this information to
know whether we were removing the last ref overall or just the last ref for this
particular root.  The problem with this is that it has made the delayed ref
code incredibly fragile and has forced us to do things like
btrfs_merge_delayed_refs, which is what is causing us so much pain when we have
thousands of ref updates for the same block.

In order to fix this I'm introducing a new way of adjusting quota counts.  I've
called them qgroup operations, and we apply them in very specific situations.
We only add these when we add or remove the only ref for a particular root.
Obviously we have to account for shared refs as well so there is some extra code
for these special cases, but basically we make the qgroup accounting only happen
when we know there was a real change (or likely a real change in the case of
shared refs).
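
A rough sketch of the rule described above, not the code from the patches: an
operation is recorded only when a root's ref count on an extent goes from zero
to non-zero or back to zero.  The enum, the function name and the idea of
passing the old count plus a signed delta are all assumptions made for
illustration, and the extra handling for shared refs is left out entirely.

```c
#include <stdint.h>

/* Hypothetical operation kinds; names are illustrative, not from the patches. */
enum qgroup_op_kind {
	QGROUP_OP_NONE,	/* root still holds (or still lacks) a ref: nothing to do */
	QGROUP_OP_ADD,	/* root went from no refs to at least one */
	QGROUP_OP_SUB,	/* root dropped its last ref */
};

/*
 * Decide whether one ref update for one root needs a qgroup operation.
 * old_refs: refs this root held on the extent before the update.
 * delta:    signed change applied by the update.
 */
enum qgroup_op_kind qgroup_op_for_update(int64_t old_refs, int64_t delta)
{
	int64_t new_refs = old_refs + delta;

	if (old_refs == 0 && new_refs > 0)
		return QGROUP_OP_ADD;
	if (old_refs > 0 && new_refs == 0)
		return QGROUP_OP_SUB;
	return QGROUP_OP_NONE;
}
```

With this shape, thousands of updates that only move a root's count between
non-zero values produce no qgroup work at all.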

In order to do this I've also introduced lock_ref/unlock_ref.  This only gets
used if we actually have qgroups enabled, and even then it should be relatively
low cost as it only locks the bytenr for reference updates.  Delayed ref updates
will not trip over this since we only do one at a time anyway, so we'll only
have contention if we have delayed refs running at the same time as a qgroup
operation update.
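
To illustrate what locking a bytenr for reference updates can look like, here
is a minimal userspace sketch.  It is a stand-in, not the lock_ref/unlock_ref
implementation from patch 1/3: it uses a fixed table of spin flags keyed by
bytenr, and the function names, the bucket count and the 4K extent alignment
are all assumptions.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define REF_LOCK_BUCKETS 64	/* assumed table size, chosen arbitrarily */

/* Static atomics are zero-initialized, so every bucket starts unlocked. */
static atomic_bool ref_locked[REF_LOCK_BUCKETS];

/* Map a bytenr to a bucket, assuming extents are 4K aligned. */
static unsigned int ref_bucket(uint64_t bytenr)
{
	return (unsigned int)((bytenr >> 12) % REF_LOCK_BUCKETS);
}

/* Returns true if the caller now holds the lock for this bytenr's bucket. */
bool bytenr_trylock(uint64_t bytenr)
{
	/* atomic_exchange returns the old value; false means it was free. */
	return !atomic_exchange(&ref_locked[ref_bucket(bytenr)], true);
}

/* Spin until the lock for this bytenr's bucket is acquired. */
void bytenr_lock(uint64_t bytenr)
{
	while (!bytenr_trylock(bytenr))
		;
}

/* Release the lock for this bytenr's bucket. */
void bytenr_unlock(uint64_t bytenr)
{
	atomic_store(&ref_locked[ref_bucket(bytenr)], false);
}
```

Two updates only contend when they hit the same bucket, which matches the
point above that contention needs a delayed ref and a qgroup operation working
on the same bytenr at the same time.  In this sketch distinct bytenrs can share
a bucket, so a caller must not hold two locks at once.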

Then all we need to account for is the fact that we will get the full view of
the roots at the time we run the operations, not what they were when our
particular operation occurred.  This is ok because we will either ignore our
root in the case of add or not ignore it in case of remove when calculating the
ref counts.  We use the same ref counting scheme that Arne developed as it's
pretty freaking awesome, and just adjust how we count the ref counts based on
our operations.
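
Read literally, the adjustment above amounts to counting the roots seen at run
time while skipping our own root for an add and counting it for a remove.  The
sketch below shows only that counting step; the names and the flat array of
root ids are assumptions, and it does not reproduce Arne's ref counting scheme
or the real qgroup code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation kinds for this sketch. */
enum op_kind { OP_ADD, OP_REMOVE };

/*
 * Count the roots referencing an extent as seen when the operation is run.
 * For an add, our own root is skipped; for a remove it is counted if present.
 */
size_t count_roots_for_op(const uint64_t *roots, size_t nr_roots,
			  uint64_t our_root, enum op_kind kind)
{
	size_t count = 0;

	for (size_t i = 0; i < nr_roots; i++) {
		if (kind == OP_ADD && roots[i] == our_root)
			continue;
		count++;
	}
	return count;
}
```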

In addition to all of this new code I've added a big set of sanity tests to make
sure everything is working right.  Between this and the qgroups xfstests I'm
pretty certain I haven't broken anything obvious with qgroups.  This is just the
first step in getting rid of the delayed ref sequence number and fixing the
defrag OOM mess but it is the biggest part.  Thanks,

Josef

Thread overview: 14+ messages
2013-12-18 21:07 Rework qgroup accounting Josef Bacik
2013-12-18 21:07 ` [PATCH 1/3] Btrfs: introduce lock_ref/unlock_ref Josef Bacik
2013-12-19  4:01   ` Dave Chinner
2013-12-19 14:37     ` Josef Bacik
2013-12-18 21:07 ` [PATCH 2/3] Btrfs: rework qgroup accounting Josef Bacik
2013-12-21  8:01   ` Wang Shilong
2013-12-21 14:13     ` Josef Bacik
2013-12-21  8:56   ` Wang Shilong
2013-12-21 14:14     ` Josef Bacik
2014-01-07 16:43     ` Josef Bacik
2014-01-08 14:33   ` David Sterba
2014-01-08 14:42     ` Josef Bacik
2013-12-18 21:07 ` [PATCH 3/3] Btrfs: add sanity tests for new qgroup accounting code Josef Bacik
2013-12-19  2:00 ` Rework qgroup accounting Liu Bo