From: caifeng.zhu@uniswdc.com
To: Sage Weil <sage@newdream.net>
Cc: caifeng.zhu@uniswdc.com, ceph-devel@vger.kernel.org
Subject: Re: incorrect object stat sum in PG info after pg split
Date: Wed, 11 Jan 2017 16:08:35 +0800
Message-ID: <20170111080835.GA24093@T530I>
In-Reply-To: <alpine.DEB.2.11.1701101240520.5745@piezo.novalocal>
Hi, Sage
Thanks for your suggestion. It works for us.
Best Regards
On Tue, Jan 10, 2017 at 12:44:50PM +0000, Sage Weil wrote:
> On Tue, 10 Jan 2017, caifeng.zhu@uniswdc.com wrote:
> > Hi, all
> >
> > We find that after the number of pgs is increased, the object stat sum
> > in pg info is incorrect.
> >
> > The following steps reproduce the problem.
> > 0 assume the object store is a filestore.
> > 1 create a pool 'foo' with some number of pgs, e.g. 64.
> > 2 write data into the pool 'foo' through clients (rbd, cephfs or rgw).
> > 3 increase the number of pgs in the pool 'foo', e.g. to 128.
> > 4 after the pgs have settled, use 'ceph pg x.y query' and look at the
> > field 'num_objects'.
> > 5 find the osd where pg x.y resides with 'ceph pg map x.y' and
> > count the number of objects on that osd with a command like
> > 'find /var/lib/ceph/osd/ceph-0/current/x.y_head/ -type f | wc -l'
> >
> > The code flow to increase the pg number is as follows:
> > OSD::advance_pg
> > -> OSD::split_pgs
> > -> object_stat_sum::split
> > -> ReplicatedPG::split_colls
> > -> PG::_create
> > -> ObjectStore::Transaction::split_collection
> > /* indirectly call FileStore::_split_collection
> > * when applying transaction into file system.
> > */
> > -> PG::split_into
> >
> > Comparing object_stat_sum::split with FileStore::_split_collection, the
> > splitting logic differs, which makes stat.sum differ from the actual number
> > of objects in the collection.
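
To illustrate why the two code paths can disagree, here is a minimal sketch, not the Ceph implementation. It assumes the stat counters are divided evenly among the resulting PGs, while the objects themselves move according to one extra bit of their hash when pg_num doubles. The function names and the use of a plain modulo (valid only for power-of-two pg_num) are simplifications made for this example.

```python
import random


def even_split(total, n_children):
    """Divide a counter evenly among n_children PGs, spreading the remainder."""
    base, extra = divmod(total, n_children)
    return [base + (1 if i < extra else 0) for i in range(n_children)]


def hash_split(object_hashes, old_pg_num, parent_seed):
    """Count objects that stay in the parent PG vs. move to its child
    when pg_num doubles (old_pg_num is assumed to be a power of two)."""
    stay = move = 0
    for h in object_hashes:
        if h % old_pg_num != parent_seed:
            continue  # object belongs to a different PG
        if h & old_pg_num:
            move += 1  # the newly significant hash bit is set
        else:
            stay += 1
    return [stay, move]


# Hypothetical workload: 10000 objects with random 32-bit hashes.
rng = random.Random(0)
hashes = [rng.getrandbits(32) for _ in range(10000)]
actual = hash_split(hashes, 64, 5)        # where the objects really end up
estimate = even_split(sum(actual), 2)     # what an even division would report
print("actual:", actual, "estimate:", estimate)
```

The totals always agree, but the per-PG values only match when the hashes happen to split exactly in half, which is the kind of mismatch seen in 'num_objects' after the split.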
> >
> > The question is that should we fix this difference? If so, how to fix?
> > In current design, it seems very difficult to fix the problem.
>
> Right, it's expected to be out of sync. The pg_stats structure has a bool
> flag indicating the stats are not strictly accurate (only an
> approximation), and will be corrected during the next scrub. You can
> force this to happen explicitly on a test pg with 'ceph pg scrub <pgid>'
> and then verify that afterwards the stats are accurate. You can also see
> the full stats structure (including the flag) with 'ceph pg dump -f
> json-pretty'.
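
As a rough sketch of how that flag could be checked from a script: the code below takes the per-PG stat entries from a JSON dump and lists the PGs whose stats are marked inexact. The field names 'stats_invalid', 'pgid' and 'stat_sum' and the list layout are assumptions about the dump format, which varies between releases, and the sample data is invented for illustration.

```python
import json


def inexact_pgs(pg_stats):
    """Return {pgid: num_objects} for PGs whose stats are flagged inexact."""
    result = {}
    for pg in pg_stats:
        flag = pg.get("stats_invalid", False)
        # Assumption: the flag may be printed as "0"/"1" instead of a boolean.
        if isinstance(flag, str):
            flag = flag not in ("0", "false", "")
        if flag:
            result[pg["pgid"]] = pg["stat_sum"]["num_objects"]
    return result


# Hypothetical excerpt of the per-PG stats; real dumps carry many more fields.
sample = json.loads("""
[
  {"pgid": "1.2a", "stats_invalid": true,  "stat_sum": {"num_objects": 40}},
  {"pgid": "1.6a", "stats_invalid": "1",   "stat_sum": {"num_objects": 41}},
  {"pgid": "1.3",  "stats_invalid": false, "stat_sum": {"num_objects": 80}}
]
""")
print(inexact_pgs(sample))
```

PGs listed by this check would be the candidates for 'ceph pg scrub <pgid>' before comparing 'num_objects' against the on-disk count.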
>
> It would be very hard to make the ObjectStore backend (FileStore or
> BlueStore) be able to split a collection in O(1) time *and* provide an
> accurate split of the stats (and its many fields) as well. And not that
> important; the approximation is sufficient for most purposes. The only
> one it's not good enough for is the cache tiering agent; that is disabled
> until the next scrub happens on the PG.
>
> sage
>
> >
> > A similar bug is reported as tracker.ceph.com/issues/16671, which occurs
> > if all the existing data in pool 'foo' is deleted.
> >
> > Best Regards