From: Wido den Hollander <wido@42on.com>
To: John Spray <john.spray@redhat.com>,
	Robert LeBlanc <robert@leblancnet.us>
Cc: Sage Weil <sage@newdream.net>, Gregory Farnum <greg@gregs42.com>,
	ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: rbd top
Date: Tue, 16 Jun 2015 13:05:20 +0200
Message-ID: <558002F0.5060607@42on.com>
In-Reply-To: <557F02D9.1040303@redhat.com>

On 06/15/2015 06:52 PM, John Spray wrote:
> 
> 
> On 15/06/2015 17:10, Robert LeBlanc wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA256
>>
>> John, let me see if I understand what you are saying...
>>
>> When a person runs `rbd top`, each OSD would receive a message saying
>> please capture all the performance, grouped by RBD and limit it to
>> 'X'. That way the OSD doesn't have to constantly update performance
>> for each object, but when it is requested it starts tracking it?
> 
> Right, initially the OSD isn't collecting anything, it starts as soon as
> it sees a query get loaded up (published via OSDMap or some other
> mechanism).
> 

I like that idea very much. The OSDs are already CPU bound: a lot of
time goes into processing a request while it is not waiting on the
disk.

Although tracking IOps might seem like a small and cheap thing to do,
it's yet more CPU time spent by the system on something other than
processing the I/O.

So I'm in favor of not always collecting, but only on demand.
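
To make the "only on demand" idea concrete, here is a minimal sketch in
Python. It is not Ceph code: the class name, the method names and the
dict-shaped op are all hypothetical, and a real OSD would do this in C++.
It only illustrates that the per-op cost can be a single check for as long
as no query is loaded.

```python
from collections import defaultdict


class OnDemandOpStats:
    """Hypothetical per-daemon tracker: counts ops only while a query is loaded."""

    def __init__(self):
        self._queries = {}  # query name -> function mapping an op to a group key
        self._counts = {}   # query name -> {group key: op count}

    def load_query(self, name, group_by):
        """Start collecting for this query (e.g. when it is published to the daemon)."""
        self._queries[name] = group_by
        self._counts[name] = defaultdict(int)

    def unload_query(self, name):
        """Stop collecting and return whatever was gathered; nothing is persisted."""
        self._queries.pop(name, None)
        return dict(self._counts.pop(name, {}))

    def record_op(self, op):
        """Called once per op; does no accounting when no query is loaded."""
        if not self._queries:
            return
        for name, group_by in self._queries.items():
            self._counts[name][group_by(op)] += 1

    def top(self, name, limit):
        """Return the `limit` busiest groups for a loaded query, busiest first."""
        counts = self._counts[name]
        return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:limit]


stats = OnDemandOpStats()
stats.record_op({"image": "vm-1"})  # ignored: nothing is loaded yet
stats.load_query("rbd_top", lambda op: op["image"])
for image in ["vm-1", "vm-2", "vm-2"]:
    stats.record_op({"image": image})
print(stats.top("rbd_top", limit=1))
```

The counters live in memory only and are dropped when the query is
unloaded, which matches what John describes further down.
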

Go for performance, low-latency and high IOps.

Wido

> That said, in practice I can see people having some set of queries that
> they always have loaded and feeding into graphite in the background.
>>
>> If so, that is an interesting idea. I wonder if that would be simpler
>> than tracking the performance of each/MRU objects in some format like
>> /proc/diskstats where it is in memory and not necessarily consistent.
>> The benefit is that you could have "lifelong" stats that show up like
>> iostat and it would be a simple operation.
> 
> Hmm, not sure we're on the same page about this part, what I'm talking
> about is all in memory and would be lost across daemon restarts.  Some
> other component would be responsible for gathering the stats across all
> the daemons in one place (that central part could persist stats if
> desired).
> 
>> Each object should be able
>> to reference back to RBD/CephFS upon request and the client could even
>> be responsible for that load. Client performance data would need stats
>> in addition to the object stats.
> 
> You could extend the mechanism to clients.  However, as much as possible
> it's a good thing to keep it server side, as servers are generally fewer
> (still have to reduce these stats across N servers to present to user),
> and we have multiple client implementations (kernel/userspace).  What
> kind of thing do you want to get from clients?
>> My concern is that adding additional SQL like logic to each op is
>> going to get very expensive. I guess if we could push that to another
>> thread early in the op, then it might not be too bad. I'm enjoying the
>> discussion and new ideas.
> 
> Hopefully in most cases the query can be applied very cheaply, for
> operations like comparing pool ID or grouping by client ID. However, I
> would also envisage an optional sampling number, such that e.g. only 1
> in every 100 ops would go through the query processing.  Useful for
> systems where keeping highest throughput is paramount, and the numbers
> will still be useful if clients are doing many thousands of ops per second.
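
The sampling number described above could look roughly like the following
Python sketch. It is an illustration only: the class name, the
`sample_every` parameter and the dict-shaped op are hypothetical, and the
counter-based 1-in-N choice is an assumption (random sampling would avoid
aliasing with periodic workloads).

```python
class SampledQuery:
    """Hypothetical query that inspects only every Nth op it is shown."""

    def __init__(self, group_by, sample_every=100):
        if sample_every < 1:
            raise ValueError("sample_every must be at least 1")
        self.group_by = group_by          # maps an op to a group key
        self.sample_every = sample_every  # 100 means 1 in every 100 ops
        self._seen = 0                    # ops offered to this query so far
        self.sampled_counts = {}          # group key -> number of sampled ops

    def record_op(self, op):
        """Count the op only if it is the Nth one since the last sample."""
        self._seen += 1
        if self._seen % self.sample_every != 0:
            return  # skipped ops never reach the grouping logic
        key = self.group_by(op)
        self.sampled_counts[key] = self.sampled_counts.get(key, 0) + 1

    def estimated_counts(self):
        """Scale sampled counts back up to an estimate of the true op counts."""
        return {k: c * self.sample_every for k, c in self.sampled_counts.items()}


query = SampledQuery(lambda op: op["client"], sample_every=100)
for _ in range(1000):
    query.record_op({"client": "client.4123"})
print(query.estimated_counts())
```

With many thousands of ops per second the scaled-up estimate stays close
to the real count, while 99 out of 100 ops pay only for a counter
increment and a modulo.
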
> 
> Cheers,
> John


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
