From: Vladimir Bashkirtsev <vladimir@bashkirtsev.com>
To: ceph-devel@vger.kernel.org
Subject: OSD weighting
Date: Fri, 20 Apr 2012 16:53:47 +0930
Message-ID: <4F910F03.6070604@bashkirtsev.com>
Dear devs,
While playing around with ceph and gradually moving it from a toy setup
into production, I wanted to give ceph a run for its money (so to
speak). I have assembled a number of OSDs built on very different
hardware, ranging from an old P4 with 512MB of RAM to a high-end Dell
server, on a mixture of 100 and 1000 Mbit networks. I will not say much
about the performance of MONs and MDSes, as they do fairly well no
matter what I throw at them. But with OSDs it is a different story. Even
one full OSD will stall the whole ceph cluster. I've read that this is
normal and that a good way of dealing with it is a periodic health check
to see that no OSD is approaching full status. However, I believe it
would be better if ceph reduced the weighting of OSDs approaching full
status, which would effectively prevent an OSD from getting full. It
should be reasonably simple to implement, and it would avoid major grief
when some OSD goes from near-full to full status quickly and unnoticed.
I guess reweight-by-utilization is an attempt to address the issue based
on CPU performance.
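To make the suggestion concrete, here is a minimal sketch of one way a
weight could be scaled down as an OSD fills up. It is not how ceph
behaves; the function name adjusted_weight, the linear ramp, and the
0.85/0.95 thresholds are assumptions chosen purely for illustration.

```python
def adjusted_weight(base_weight, utilization, nearfull=0.85, full=0.95):
    """Return a reduced weight for an OSD that is approaching full.

    Hypothetical policy: keep the base weight below `nearfull`, drop to
    zero at `full`, and ramp down linearly in between.
    `utilization`, `nearfull` and `full` are fractions in [0, 1].
    """
    if utilization <= nearfull:
        return base_weight
    if utilization >= full:
        return 0.0
    # Linear ramp from base_weight (at nearfull) down to 0 (at full).
    return base_weight * (full - utilization) / (full - nearfull)


if __name__ == "__main__":
    for used in (0.50, 0.85, 0.90, 0.95):
        print(f"{used:.2f} used -> weight {adjusted_weight(2.0, used):.3f}")
```

With such a ramp, an OSD would receive progressively less new data
between the two thresholds instead of filling up unnoticed.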
In the meantime I have reverted to manual weighting of OSDs, and I
found that there is no clear explanation of how weights are actually
applied. I've seen the suggestion to keep the weight equal to the number
of TBs on the OSD. Doing so in a single rack achieved the expected
result: data spread itself in proportion to OSD sizes. But when I
started to move OSDs from the toy rack into the production rack, I also
changed the weights of the racks in the pool. I had 6 OSDs and moved 2
of them, so I changed the toyrack weight to 4.000 and productionrack to
2.000. I waited for the data to settle, only to find that disk use was
no longer proportional. Then I changed the rack weights to the total
number of TBs in each rack; data reshuffled and settled, but again I did
not achieve the expected result. So I guess the function of the weights
for racks, hosts and devices is not as straightforward as I originally
thought. This calls for a clear explanation of how weights are used in
the case of the straw algorithm.
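As a rough illustration of why the rack weights matter, the sketch below
assumes an idealized model in which each level of the hierarchy picks a
child with probability proportional to its weight, so an OSD's expected
share of the data is the product of its weight fractions at each level.
The OSD names and sizes are hypothetical (the real sizes are not given
above), and replica placement rules are ignored, so this is not a
description of what the straw algorithm actually does.

```python
def expected_shares(tree):
    """Expected data share per OSD in a two-level rack/OSD hierarchy.

    `tree` maps rack name -> (rack_weight, {osd_name: osd_weight}).
    Idealized assumption: a child is chosen with probability
    weight / sum(sibling weights) at every level.
    """
    total_rack_weight = sum(weight for weight, _ in tree.values())
    shares = {}
    for rack_weight, osds in tree.values():
        total_osd_weight = sum(osds.values())
        for osd, osd_weight in osds.items():
            shares[osd] = (rack_weight / total_rack_weight) * (
                osd_weight / total_osd_weight
            )
    return shares


# Hypothetical OSD sizes in TB: 3 TB in toyrack, 4 TB in productionrack.
toy_osds = {"osd.0": 1.0, "osd.1": 1.0, "osd.2": 0.5, "osd.3": 0.5}
prod_osds = {"osd.4": 2.0, "osd.5": 2.0}

# Rack weights set to the number of OSDs in each rack (4.000 and 2.000).
by_count = expected_shares(
    {"toyrack": (4.0, toy_osds), "productionrack": (2.0, prod_osds)}
)

# Rack weights set to the total TB in each rack.
by_capacity = expected_shares(
    {"toyrack": (3.0, toy_osds), "productionrack": (4.0, prod_osds)}
)

if __name__ == "__main__":
    for osd in sorted(by_count):
        print(f"{osd}: by count {by_count[osd]:.3f}, "
              f"by capacity {by_capacity[osd]:.3f}")
```

Under this model, rack weights of 4.000 and 2.000 give each 2 TB
production OSD about one sixth of the data, although it holds 2 of the
7 TB in total; rack weights equal to the summed OSD weights restore
proportionality. A rule that forces replicas into separate racks could
still skew real disk use away from these figures.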
Regards,
Vladimir