From: Andreas Bluemle <andreas.bluemle@itxperts.de>
To: Ceph Development <ceph-devel@vger.kernel.org>
Subject: FileStore performance: coalescing operations
Date: Thu, 26 Feb 2015 15:28:07 +0100
Message-ID: <20150226152807.5b71a93c@doppio>
[-- Attachment #1: Type: text/plain, Size: 817 bytes --]
Hi,
during the weekly performance meeting, I had mentioned
my experiences concerning the transaction structure
for write requests at the level of the FileStore.
Such a transaction not only contains the OP_WRITE
operation to the object in the file system, but also
a series of OP_OMAP_SETKEYS and OP_SETATTR operations.
Find attached a README and source code patch, which
describe a prototype for coalescing the OP_OMAP_SETKEYS
operations and the performance impact of this change.
Regards
Andreas Bluemle
--
Andreas Bluemle mailto:Andreas.Bluemle@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910
Company details: http://www.itxperts.de/imprint.htm
[-- Attachment #2: ceph-master.file-store-omap_setkeys-colaescing.patch --]
[-- Type: text/x-patch, Size: 1864 bytes --]
diff --git a/src/os/FileStore.cc b/src/os/FileStore.cc
index f6c3bb8..29382b2 100644
--- a/src/os/FileStore.cc
+++ b/src/os/FileStore.cc
@@ -2260,10 +2260,24 @@ int FileStore::_check_replay_guard(int fd, const SequencerPosition& spos)
}
}
+void FileStore::_coalesce(map<string, bufferlist> &target, map<string, bufferlist> &source)
+{
+ for (map<string, bufferlist>::iterator p = source.begin();
+ p != source.end();
+ p++) {
+ target[p->first] = p->second;
+ }
+ return;
+}
+
unsigned FileStore::_do_transaction(
Transaction& t, uint64_t op_seq, int trans_num,
ThreadPool::TPHandle *handle)
{
+ map<string, bufferlist> collected_aset;
+ coll_t collected_cid;
+ ghobject_t collected_oid;
+
dout(10) << "_do_transaction on " << &t << dendl;
#ifdef WITH_LTTNG
@@ -2282,6 +2296,22 @@ unsigned FileStore::_do_transaction(
_inject_failure();
+ if (op->op == Transaction::OP_OMAP_SETKEYS) {
+ collected_cid = i.get_cid(op->cid);
+ collected_oid = i.get_oid(op->oid);
+ map<string, bufferlist> aset;
+ i.decode_attrset(aset);
+ _coalesce(collected_aset, aset);
+ continue;
+ } else {
+ if (collected_aset.empty() == false) {
+ tracepoint(objectstore, omap_setkeys_enter, osr_name);
+ r = _omap_setkeys(collected_cid, collected_oid, collected_aset, spos);
+ tracepoint(objectstore, omap_setkeys_exit, r);
+ collected_aset.clear();
+ }
+ }
+
switch (op->op) {
case Transaction::OP_NOP:
break;
diff --git a/src/os/FileStore.h b/src/os/FileStore.h
index af1fb8d..a039731 100644
--- a/src/os/FileStore.h
+++ b/src/os/FileStore.h
@@ -449,6 +449,8 @@ public:
int statfs(struct statfs *buf);
+ void _coalesce( map<string, bufferlist> &target, map<string, bufferlist> &source);
+
int _do_transactions(
list<Transaction*> &tls, uint64_t op_seq,
ThreadPool::TPHandle *handle);
[-- Attachment #3: README.file-store-coalescing --]
[-- Type: text/plain, Size: 2554 bytes --]
Coalescing OMAP_SETKEYS operations in a write transaction
---------------------------------------------------------
Description
-----------
At the level of FileStore, every write request is embedded in a transaction
which consists of
    6 key-value pair settings in 3 OMAP_SETKEYS operations
    the actual OP_WRITE
    2 settings in the extended file system attributes.
The modification of FileStore::_do_transaction() coalesces the
6 key-value pairs into a single operation, with the side effect of
reducing the number of key-value pairs to 5: one key appears twice
and only its last value is set.
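The coalescing described above can be modelled in a few lines. This is a minimal sketch, not the FileStore code: `SetKeys`, `OtherOp`, `run_transaction()` and the key names are hypothetical stand-ins, and values are plain strings rather than bufferlists. It also flushes once more after the loop, which is an assumption of this sketch; the patch hunk above only shows the flush that happens when a different operation follows.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for transaction operations; these are not Ceph types.
using KeyValues = std::map<std::string, std::string>;
struct SetKeys { KeyValues kv; };          // models OP_OMAP_SETKEYS
struct OtherOp { std::string name; };      // models OP_WRITE, OP_SETATTR, ...
using Op = std::variant<SetKeys, OtherOp>;

// Merge source into target. A key that already exists is overwritten,
// so the value from the later operation wins.
void coalesce(KeyValues& target, const KeyValues& source) {
  for (const auto& [key, value] : source) {
    target[key] = value;
  }
}

// Walk the operations and return the key-value maps that would actually
// be written, one entry per omap_setkeys call.
std::vector<KeyValues> run_transaction(const std::vector<Op>& ops) {
  std::vector<KeyValues> applied;
  KeyValues collected;
  auto flush = [&] {
    if (!collected.empty()) {
      applied.push_back(collected);
      collected.clear();
    }
  };
  for (const Op& op : ops) {
    if (const auto* setkeys = std::get_if<SetKeys>(&op)) {
      coalesce(collected, setkeys->kv);  // defer the write
      continue;
    }
    flush();  // pending keys are written before any other operation runs
    // ... the other operation would be executed here ...
  }
  flush();  // assumption of this sketch: also flush at the end of the transaction
  return applied;
}

// The write transaction from the description: 3 SETKEYS operations carrying
// 6 pairs (key_c appears twice), then the write and 2 attribute settings.
std::vector<Op> example_transaction() {
  return {
      SetKeys{KeyValues{{"key_a", "1"}, {"key_b", "2"}}},
      SetKeys{KeyValues{{"key_c", "old"}, {"key_d", "4"}}},
      SetKeys{KeyValues{{"key_c", "new"}, {"key_e", "6"}}},
      OtherOp{"write"},
      OtherOp{"setattr"},
      OtherOp{"setattr"},
  };
}
```

Running `run_transaction(example_transaction())` in this model yields a single map with 5 entries, with `key_c` holding the value from the last SETKEYS operation. The model keeps one pending map and therefore assumes, like the prototype, that all SETKEYS operations in a run target the same object.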
Performance improvement
-----------------------
Cluster with 3 storage nodes, 4 OSDs (SAS disk, SSD journal) per node,
separate client node with rbd using the kernel client,
test load generated by fio: random write, 4K block size, iodepth 16.
client improvement: approx. 5 % (12890 iops vs. 13369 iops)
storage node improvement: reduction in CPU consumption of the ceph-osd daemon
by 10%; see the following table (derived from /proc/<pid>/schedstat):
ceph-osd process and              CPU usage          | CPU usage
thread classes                    v0.91 unmodified   | v0.91 with coalescing
-----------------------------------------------------+----------------------
total cpu usage:                  43.17 CPU-seconds  | 39.33 CPU-seconds
                                                     |
ThreadPool::WorkThread::entry():  15.56  36.04%      | 12.45  31.66%
ShardedThreadPool::workers:        8.07  18.70%      |  7.94  20.18%
Pipe::Reader::                     5.81  13.45%      |  5.92  15.04%
Pipe::Writer::entry():             4.59  10.63%      |  4.73  12.02%
FileJournal::Writer::              2.41   5.57%      |  2.45   6.22%
Finisher::finisher_thread:         2.86   6.63%      |  1.03   2.61%
                                                     |
WBThrottle::entry:                  n/a     n/a      |  0.81   2.06%
Interesting: with coalescing active, WBThrottle shows up in the CPU usage.
In the unmodified case, it was almost invisible.
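The percentages quoted in this section are rounded. As a small worked check, the relative changes can be recomputed from the figures given above. The sketch assumes that 12890 iops belongs to the unmodified run and 13369 iops to the run with coalescing; the names `Measurement`, `kMeasurements` and `relative_change_percent()` are made up for this example. With these inputs the result is about 3.7 % more iops, about 8.9 % less total CPU time, about 20 % less CPU time in ThreadPool::WorkThread::entry() and about 64 % less in Finisher::finisher_thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Relative change from baseline to modified, in percent.
// A negative result means a reduction.
double relative_change_percent(double baseline, double modified) {
  return (modified - baseline) / baseline * 100.0;
}

// Figures copied from the text and table above. Which iops value belongs
// to which run is an assumption of this sketch.
struct Measurement {
  const char* name;
  double unmodified;
  double coalescing;
};

const Measurement kMeasurements[] = {
    {"client iops", 12890.0, 13369.0},
    {"total CPU-seconds", 43.17, 39.33},
    {"ThreadPool::WorkThread::entry() CPU-seconds", 15.56, 12.45},
    {"Finisher::finisher_thread CPU-seconds", 2.86, 1.03},
};

// Print one line per measurement with its relative change.
void print_changes() {
  for (const Measurement& m : kMeasurements) {
    std::printf("%-45s %+7.2f %%\n", m.name,
                relative_change_percent(m.unmodified, m.coalescing));
  }
}
```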
Source/Patch
------------
https://www.github.com/andreas-bluemle/ceph
commit f33c48358f762cbeb5d30724efacf78ff5438e9e
patches:
    relative to pull request at https://www.github.com/andreas-bluemle/ceph
        ceph-andreas-bluemle.file-store-omap_setkeys-colaescing.patch
    relative to ceph master at https://www.github.com
    (commit a7a70cabe25fdfe3322c784f6797231d14e112c2)
        ceph-master.file-store-omap_setkeys-colaescing.patch