From: "Bartłomiej Święcki" <bartlomiej.swiecki@corp.ovh.com>
To: Ceph Development <ceph-devel@vger.kernel.org>
Subject: Understanding mon space usage during recovery
Date: Wed, 8 Jun 2016 11:25:14 +0200
Message-ID: <5757E47A.2010203@corp.ovh.com>

Hi,

I was recently trying to understand why mon disk usage grows during
recovery in one of our clusters. I wanted to know whether we can reduce
that usage somehow or whether we simply have to provision more space for
our mons. The cluster runs 0.94.6 and has just over 300 OSDs. Leveldb
compaction does reduce space usage, but it quickly grows back to the
previous level. What I found is that most of the leveldb data is osdmap
history.

For each osdmap version, leveldb contains both a full and an incremental
entry, so I was wondering whether we really need to store full osdmaps
for all versions. If we have incremental changes for every version
anyway, wouldn't it be sufficient to keep only the first full version and
reconstruct any later ones by applying incrementals?
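
As a rough illustration of the question above, here is a minimal sketch of rebuilding a later full map from one stored full map plus a chain of incrementals. It is a toy model only: the dict-based map, the "set"/"remove" keys, and the function names are hypothetical and do not correspond to Ceph's actual OSDMap or OSDMap::Incremental structures.

```python
# Toy model: a "full map" is a dict of osd_id -> state, and an
# "incremental" is a dict of changes that turns epoch N-1 into epoch N.
# These structures are hypothetical; Ceph's real osdmaps are far richer.

def apply_incremental(full_map, inc):
    """Return a new full map with one incremental applied."""
    new_map = dict(full_map)
    new_map.update(inc.get("set", {}))
    for osd_id in inc.get("remove", []):
        new_map.pop(osd_id, None)
    return new_map


def rebuild_full_map(base_epoch, base_map, incrementals, target_epoch):
    """Rebuild the full map at target_epoch from a single stored full map.

    incrementals maps epoch -> the incremental that produces that epoch.
    """
    if target_epoch < base_epoch:
        raise ValueError("target epoch is older than the stored full map")
    current = dict(base_map)
    for epoch in range(base_epoch + 1, target_epoch + 1):
        current = apply_incremental(current, incrementals[epoch])
    return current


# Only the full map for epoch 10 is stored; epochs 11..13 are incrementals.
base = {0: "up", 1: "up", 2: "up"}
incs = {
    11: {"set": {1: "down"}},
    12: {"set": {3: "up"}},
    13: {"remove": [2]},
}
map_at_13 = rebuild_full_map(10, base, incs, 13)
print(map_at_13)
```

In this model, reading epoch N costs one incremental application per epoch since the stored full map, so with tens of thousands of retained epochs a scheme like this would presumably still need occasional full checkpoints to keep lookups cheap.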

I was also trying to understand how Ceph decides which range of osdmap
versions to keep. After analyzing the code, I thought the obvious answer
was in PGMap::calc_min_last_epoch_clean(). On our production cluster the
difference between the min and max clean epochs was around 30k during
recovery, and one full osdmap blob in leveldb is around 250k in size.
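
For a sense of scale, the two figures above can be multiplied out. This is back-of-the-envelope arithmetic only, and it assumes (the message does not say so explicitly) that "250k" means roughly 250,000 bytes and that one full osdmap is kept for every epoch in the retained range; incrementals and leveldb overhead are ignored.

```python
# Assumptions, not stated in the message: "250k" is about 250,000 bytes,
# and one full osdmap is stored per epoch in the retained range.
EPOCHS_RETAINED = 30_000
FULL_MAP_BYTES = 250_000

total_bytes = EPOCHS_RETAINED * FULL_MAP_BYTES
total_gb = total_bytes / 1e9  # decimal gigabytes

print(f"{total_gb:.1f} GB")  # prints "7.5 GB"
```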

I also tried to test this on my dev cluster, where I could run gdb (15
OSDs, 4 of them nearfull, and lots of misplaced objects). What I found is
that execution in OSDMonitor::get_trim_to() almost never entered the
first 'if': mon->pgmon()->is_readable() returned false. I debugged it
once, and that was the result of Paxos::is_lease_valid() returning false.
I was able to get into that 'if' only once the cluster was back in a
healthy state. Is this expected behavior?
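
To show why this matters for disk usage, here is a simplified model of the gating described above. It is not Ceph source code: the function names only echo the ones mentioned in this message, the use of 0 to mean "trim nothing" is an assumption of the sketch, and the epoch numbers are made up for illustration.

```python
# Simplified model of the behaviour described above. Not Ceph source code;
# the return value 0 meaning "trim nothing" is an assumption of this sketch.

def get_trim_to(pgmon_readable, min_last_epoch_clean):
    """Return the epoch up to which old osdmaps may be trimmed."""
    if pgmon_readable:
        # The branch that was rarely reached during recovery.
        return min_last_epoch_clean
    return 0


def retained_epochs(first_epoch, last_epoch, pgmon_readable,
                    min_last_epoch_clean):
    """Count osdmap epochs still stored after one trim attempt."""
    trim_to = get_trim_to(pgmon_readable, min_last_epoch_clean)
    new_first = max(first_epoch, trim_to)
    return last_epoch - new_first + 1


# Made-up epoch numbers, chosen only to show the effect of the gate.
stuck = retained_epochs(1000, 31000, False, 30500)
healthy = retained_epochs(1000, 31000, True, 30500)
print(stuck, healthy)
```

In this model, as long as the readable check fails, nothing is trimmed and the retained range keeps growing with every new epoch, which would match the space growth seen during recovery.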

Thanks,
Bartek

Thread overview:
2016-06-08  9:25 Bartłomiej Święcki [this message]
2016-06-08 15:33 ` Understanding mon space usage during recovery Gregory Farnum
2016-06-10  8:58   ` Bartłomiej Święcki
