From mboxrd@z Thu Jan 1 00:00:00 1970
From: Igor Fedotov
Subject: Re: ceph-osd mem usage growth
Date: Fri, 11 Dec 2015 19:09:14 +0300
Message-ID: <566AF52A.4070604@mirantis.com>
References: <5669A726.8080009@mirantis.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail-lb0-f182.google.com ([209.85.217.182]:34825 "EHLO
    mail-lb0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1755296AbbLKQJR (ORCPT );
    Fri, 11 Dec 2015 11:09:17 -0500
Received: by lbpu9 with SMTP id u9so66459608lbp.2 for ;
    Fri, 11 Dec 2015 08:09:15 -0800 (PST)
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Samuel Just
Cc: ceph-devel

Hi Samuel,

thanks for your answer.

One more question: why do erasure-coded pools use the PG log for the
setxattr op while replicated ones don't? What's the rationale for that?

Thanks,
Igor.

On 10.12.2015 20:46, Samuel Just wrote:
> The short answer is that you aren't supposed to store large things in
> xattrs at all. If you feel it's a "vulnerability", then we could add
> a config option to reject xattrs over a particular size.
> -Sam
>
> On Thu, Dec 10, 2015 at 8:24 AM, Igor Fedotov wrote:
>> Hi Cephers,
>>
>> while implementing compression support for EC pools I ran into an issue
>> that can be summarized as follows.
>>
>> Imagine a client that continuously extends a specific object xattr by
>> rewriting the complete attribute with a new portion of data appended.
>> As a result one can observe steadily increasing memory usage for the
>> ceph-osd processes. This happens for objects in EC pools only.
>>
>> I briefly investigated the root cause and it looks like this is due to
>> PG log memory consumption growth. The PG log entry count is pretty
>> stable, but each entry consumes more and more memory over time since it
>> contains the full attribute value.
>> As far as I understand, replicated pools do not log the setattr operation
>> (they actually mark it as unrollbackable), which is why the issue isn't
>> observed there.
>>
>> With 3000 log entries and e.g. a 64 KB attribute value, the memory
>> consumption is pretty visible.
>>
>> So the questions are:
>> * Are there any ideas on how to resolve this issue? The obvious solution
>> is to refactor attribute extending by using multiple keys... Anything else?
>> * Does it make sense to resolve it at all? IMO it's a sort of
>> vulnerability for a Ceph process to behave this way...
>>
>> Please find below a Python script to reproduce the issue, to be started
>> from the folder where ceph.conf is located, with the pool name as its
>> only argument:
>>
>> python repro.py <pool name>
>>
>> ######################################
>> import sys
>>
>> import psutil
>> import rados
>>
>> def print_process_mem_usage(pid):
>>     # Report virtual and resident memory of one process in MB.
>>     mem = psutil.Process(pid).memory_info()
>>     res_mb = mem.rss // (2 ** 20)
>>     virt_mb = mem.vms // (2 ** 20)
>>     print("pid %d: Virt: %i MB, Res: %i MB" % (pid, virt_mb, res_mb))
>>
>> def print_processes_mem_usage():
>>     for proc in psutil.process_iter():
>>         try:
>>             if 'ceph-osd' in proc.name():
>>                 print_process_mem_usage(proc.pid)
>>         except psutil.NoSuchProcess:
>>             pass
>>
>> cluster = rados.Rados(conffile='./ceph.conf')
>> cluster.connect()
>>
>> # The pool name is taken from the first command-line argument.
>> ioctx = cluster.open_ioctx(sys.argv[1])
>> try:
>>     ioctx.remove_object("pyobject")
>> except rados.ObjectNotFound:
>>     pass
>>
>> for i in range(25000):
>>     # Rewrite the whole attribute, 15 bytes longer on every step.
>>     s = b''.zfill(i * 15)
>>     ioctx.set_xattr('pyobject', 'somekey', s)
>>     if (i % 500) == 0:
>>         print('%d-th step, attr len = %d' % (i, len(s)))
>>         print_processes_mem_usage()
>>
>> ioctx.close()
>> #########################
>> Sample output is as below:
>> 0-th step, attr len = 0
>> pid 23723: Virt: 700 MB, Res: 30 MB
>> pid 23922: Virt: 701 MB, Res: 32 MB
>> pid 24142: Virt: 700 MB, Res: 32 MB
>> ...
>> 4000-th step, attr len = 60000
>> pid 23723: Virt: 896 MB, Res: 207 MB
>> pid 23922: Virt: 900 MB, Res: 212 MB
>> pid 24142: Virt: 897 MB, Res: 210 MB
>> ...
>> 6000-th step, attr len = 90000
>> pid 23723: Virt: 1025 MB, Res: 331 MB
>> pid 23922: Virt: 1032 MB, Res: 338 MB
>> pid 24142: Virt: 1025 MB, Res: 333 MB
>> ...
>>
>>
>> Thanks,
>> Igor
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
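
For reference, the "multiple keys" idea from the original message could look roughly like the sketch below. It is only an illustration, not tested against a real cluster: ChunkedXattrWriter, estimate_log_bytes and the 4096-byte chunk size are made-up names and values, and the estimate assumes that each PG log entry keeps one copy of the value written by the op and nothing else. The writer accepts any callable with the (object, key, value) shape of ioctx.set_xattr, so it can be tried against a plain dict first.

```python
def estimate_log_bytes(entries, value_len):
    """Rough lower bound on PG log memory.

    Assumption: every log entry holds exactly one copy of the written value.
    """
    return entries * value_len


class ChunkedXattrWriter:
    """Append-only value spread over keys '<base>.000000', '<base>.000001', ...

    Hypothetical helper, not part of librados. `set_xattr` is any callable
    taking (object_name, key, value), e.g. ioctx.set_xattr.
    """

    def __init__(self, set_xattr, obj, base_key, chunk_size=4096):
        self._set_xattr = set_xattr
        self._obj = obj
        self._base = base_key
        self._chunk_size = chunk_size
        self._index = 0    # index of the chunk currently being filled
        self._tail = b""   # client-side copy of that chunk's contents

    def _write(self, index, data):
        self._set_xattr(self._obj, "%s.%06d" % (self._base, index), data)

    def append(self, data):
        """Append bytes; only the last chunk (plus any new ones) is rewritten."""
        buf = self._tail + data
        written = 0
        # Flush full chunks and move on to the next key.
        while len(buf) > self._chunk_size:
            self._write(self._index, buf[:self._chunk_size])
            written += self._chunk_size
            buf = buf[self._chunk_size:]
            self._index += 1
        # Rewrite the partially filled tail chunk.
        self._write(self._index, buf)
        written += len(buf)
        self._tail = buf
        return written


# Dry run against a dict instead of a cluster.
store = {}
writer = ChunkedXattrWriter(
    lambda obj, key, value: store.__setitem__((obj, key), value),
    "pyobject", "somekey")
for _ in range(1000):
    writer.append(b"0" * 15)

# Under the one-copy-per-entry assumption:
full_rewrite = estimate_log_bytes(3000, 64 * 1024)  # 196608000 bytes
chunked = estimate_log_bytes(3000, 4096)            # 12288000 bytes
```

With a real pool the writer would be created as ChunkedXattrWriter(ioctx.set_xattr, 'pyobject', 'somekey'). Each set_xattr call then carries at most one chunk, so the per-entry cost should stay bounded by the chunk size no matter how long the logical value grows, provided the log entry records only the key being rewritten. Readers would have to concatenate the keys in order, which is the price of this approach.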