From: "Piotr Dałek" <branch@predictor.org.pl>
To: Mark Nelson <mnelson@redhat.com>
Cc: Sage Weil <sweil@redhat.com>, Haomai Wang <haomai@xsky.com>,
	"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: pg_stat_t is 500+ bytes
Date: Wed, 24 Aug 2016 18:20:31 +0200
Message-ID: <20160824162031.GA26181@predictor>
In-Reply-To: <e20c13f4-ae6f-e06a-ac1b-c5736737e642@redhat.com>

On Wed, Aug 24, 2016 at 11:12:24AM -0500, Mark Nelson wrote:
> 
> 
> On 08/24/2016 11:09 AM, Sage Weil wrote:
> >On Wed, 24 Aug 2016, Haomai Wang wrote:
> >>On Wed, Aug 24, 2016 at 11:01 AM, Haomai Wang <haomai@xsky.com> wrote:
> >>>On Wed, Aug 24, 2016 at 2:13 AM, Sage Weil <sweil@redhat.com> wrote:
> >>>>This is huge.  It takes the pg_info_t str from 306 bytes to 847 bytes, and
> >>>>this _info omap key is rewritten on *every* IO.
> >>>>
> >>>>We could shrink this down significantly with varint and/or delta encoding
> >>>>since a huge portion of it is just a bunch of uint64_t counters.  This
> >>>>will probably cost some CPU time, but OTOH it'll also shrink our metadata
> >>>>down a fair bit too which will pay off later.
> >>>>
> >>>>Anybody want to tackle this?
> >>>
> >>>What about separating "object_stat_collection_t stats" from pg_stat_t?
> >>>pg info should be unchanged most of the time, so we could update only
> >>>the object-related stats. This may cut the bytes written roughly in half.
> >
> >I don't think this will work, since every op changes last_update in
> >pg_info_t *and* the stats (write op count, total bytes, objects, etc.).
> >
> >>Or we store only the incremental values and keep the full structure in
> >>memory (may reduce to 20-30 bytes). Periodically we store the full
> >>structure (only hundreds of bytes)....
> >
> >A delta is probably very compressible (only a few fields in the stats
> >struct change).  The question is how fast can we make it in CPU time.
> >Probably a simple delta (which will be almost all 0's) and a trivial
> >run-length-encoding scheme that just gets rid of the 0's would do well
> >enough...
> 
> Do we have any rough idea of how many/often consecutive 0s we end up
> with in the current encoding?

Or how high these counters get? We could try transposing the byte matrix
made of those counters (one counter per row, one byte per column). At least
the two most significant bytes of most of those counters are zero most of
the time, so after transposing they end up next to each other and a simple
RLE becomes feasible. In any case, I'm not sure that *all* of these fields
need to be uint64_t.
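
A rough sketch of the transposition idea, in Python for brevity. This is
not Ceph code: the function names, the big-endian byte order and the RLE
format (a 0x00 byte followed by a run length of 1..255) are all assumptions
made up for illustration.

```python
import struct

def transpose_bytes(counters):
    """View N uint64 counters as an N x 8 byte matrix and emit it column by
    column, most significant byte column first."""
    rows = [struct.pack(">Q", c) for c in counters]  # big-endian: column 0 = MSB
    return bytes(row[col] for col in range(8) for row in rows)

def untranspose_bytes(data):
    """Inverse of transpose_bytes."""
    n = len(data) // 8
    return [
        struct.unpack(">Q", bytes(data[col * n + i] for col in range(8)))[0]
        for i in range(n)
    ]

def rle_zeros(data):
    """Hypothetical zero-only RLE: a run of zeros becomes 0x00 plus its length
    (capped at 255); every other byte is copied verbatim."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            run = 1
            while i + run < len(data) and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes((0, run))
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

def unrle_zeros(data):
    """Inverse of rle_zeros."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            out += bytes(data[i + 1])  # bytes(n) yields n zero bytes
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

def encode(counters):
    return rle_zeros(transpose_bytes(counters))

def decode(blob):
    return untranspose_bytes(unrle_zeros(blob))
```

Whether this beats running the same RLE over the untransposed bytes depends
on how many counters there are and how similar their magnitudes are, so it
would have to be measured on real pg_stat_t contents.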

-- 
Piotr Dałek
branch@predictor.org.pl
http://blog.predictor.org.pl

