From: Josh Durgin <josh.durgin@dreamhost.com>
To: Sage Weil <sage@newdream.net>
Cc: ceph-devel@vger.kernel.org
Subject: Re: efficient removal of old objects
Date: Tue, 31 Jan 2012 17:19:52 -0800
Message-ID: <4F289338.8030501@dreamhost.com>
In-Reply-To: <Pine.LNX.4.64.1201311549170.21770@cobra.newdream.net>
(sorry for the extra email)
On 01/31/2012 04:33 PM, Sage Weil wrote:
> Currently rgw logs objects it wants to delete after some period of time,
> and an radosgw-admin command comes back later to process the log. It
> works, but is currently slow (one sync op at a time).
>
> A better approach would be to mark objects for later removal, and have the
> OSD do it in some more efficient way. wip-objs-expire has a client side
> (librados) interface for this.
>
> I think there are a couple questions:
>
> Should this be generalized to saying "do these osd ops at time X" instead
> of "delete at time X". Then it could setxattr, remove, call into a class,
> whatever.
What are some other use cases for this? It may be useful in the future,
but if the only immediate use is speeding up radosgw-admin, I don't think
it's worth further complicating the OSD and all the layers above it.
> How would the OSD implement this? A kludgey way would be to do it during
> scrub. The current scrub implementation may make that problematic because
> it does a whole PG at a time, and we probably don't want to issue a whole
> PG's worth of deletes at a time. Is there a way to make that less
> painful?
This would also tie it to scrub actually happening. OSDs under high
load would never process the operations unless you disable the load
check, in which case scrubbing slows down OSDs that are already
heavily loaded.
> Not using scrub means we need some sort of index to keep track of objects
> with delayed events. Using a collection for this might work, but loading
> all this state into memory would be slow if there were too many events
> registered.
>
> Given all that, and that we need a solution to the expiration soon
> (weeks), do we
> - do a complete solution now,
> - parallelize radosgw-admin log processing,
I'm in favor of this, since it's much simpler and easier to maintain
than a full-blown time-based op, and the scrub kludge will be even
worse to maintain (plus it turns a read-only operation into a
read-write one).
> - or hack it into scrub?
>
> sage
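To illustrate the "parallelize radosgw-admin log processing" option, here is a minimal sketch of the idea: instead of issuing one synchronous removal at a time, keep a bounded number of removals in flight. This is not the actual radosgw-admin code. The names process_removal_log, remove_object, and max_in_flight are hypothetical, and remove_object stands in for whatever synchronous call removes a single object.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_removal_log(entries, remove_object, max_in_flight=16):
    """Remove every logged object with up to max_in_flight removals running.

    entries:       iterable of object names read from the removal log
                   (hypothetical format; the real log layout is not shown here).
    remove_object: callable that synchronously removes one object and
                   raises on failure (hypothetical stand-in).
    Returns (removed, failed), where failed maps object name -> exception.
    """
    removed, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # Queue one removal per log entry; the pool runs at most
        # max_in_flight of them concurrently.
        futures = {pool.submit(remove_object, name): name for name in entries}
        for fut in as_completed(futures):
            name = futures[fut]
            try:
                fut.result()
                removed.append(name)
            except Exception as exc:
                # Keep failed entries so a later pass can retry them.
                failed[name] = exc
    return removed, failed
```

Entries that fail stay in the returned failed map, so a later run can retry them, much as the existing log would be reprocessed. For a very large log, the entries would probably need to be fed to the pool in batches rather than queued all at once.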