From: Alex Elsayed <eternaleye@gmail.com>
To: ceph-devel@vger.kernel.org
Subject: Re: Forcing Ceph into mapping all objects to a single PG
Date: Tue, 22 Jul 2014 15:44:16 -0700
Message-ID: <lqmpg2$chu$1@ger.gmane.org>
In-Reply-To: <CAPYLRzj4tzBCSrGPCsaLvy1+7ac4=Um1EHDeZhLsCpuZZy7miA@mail.gmail.com>
Gregory Farnum wrote:
> On Mon, Jul 21, 2014 at 3:27 PM, Daniel Hofmann <daniel@trvx.org> wrote:
>> Preamble: you might want to read the decent formatted version of this
>> mail at:
>>> https://gist.github.com/daniel-j-h/2daae2237bb21596c97d
<snip aggressively>
>> -------
>>
>> Ceph's object mapping depends on the rjenkins hash function.
>> It's possible to force Ceph into mapping all objects to a single PG.
>>
>> Please discuss!
>
> Yes, this is an attack vector. It functions against...well, any system
> using hash-based placement.
Sort of. How well it functions is a function (heh) of how easy it is to find
preimages for the hash. A collision only gives you a pair of names; you need
preimages to get beyond that.
With rjenkins, preimages aren't particularly difficult to find. With a
more robust hash[1], preimages become more computationally expensive,
since you need to brute-force each value rather than take advantage of
a weakness in the algorithm.
This doesn't buy a huge amount, since the brute-force effort per colliding
name is still bounded by the number of PGs, but it does help, and it means
that as PGs are split, resistance to the attack increases as well.
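
To make the "bounded by the number of PGs" point concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: pg_of() is a hypothetical placement function, not Ceph's, keyed BLAKE2b from hashlib stands in for SipHash, and a plain modulo stands in for Ceph's actual PG reduction. It just counts how many candidate names an attacker has to try per name that lands in a chosen PG.

```python
import hashlib
import itertools

def pg_of(name: str, key: bytes, pg_num: int) -> int:
    """Hypothetical placement: keyed 64-bit hash reduced modulo pg_num."""
    digest = hashlib.blake2b(name.encode(), key=key, digest_size=8).digest()
    return int.from_bytes(digest, "little") % pg_num

def find_names(target_pg: int, key: bytes, pg_num: int, count: int):
    """Brute-force `count` names landing in target_pg; return (names, trials)."""
    names = []
    trials = 0
    for i in itertools.count():
        trials += 1
        name = f"obj-{i}"
        if pg_of(name, key, pg_num) == target_pg:
            names.append(name)
            if len(names) == count:
                return names, trials

key = b"illustrative-key"  # made-up key, only for this sketch
for pg_num in (64, 128, 256):
    names, trials = find_names(0, key, pg_num, count=200)
    # Average number of candidates tried per name that hit the target PG.
    print(pg_num, trials / len(names))
```

With no structural weakness to exploit, the average cost per hit should sit near pg_num, so doubling the PG count roughly doubles the attacker's work and nothing more.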
> RGW mangles names on its own, although the mangling is deterministic
> enough that an attacker could perhaps manipulate it into mangling them
> onto the same PG. (Within the constraints, though, it'd be pretty
> difficult.)
> RBD names objects in a way that users can't really control, so I guess
> it's safe, sort of? (But users of rbd will still have write permission
> to some class of objects in which they may be able to find an attack.)
>
> The real issue though, is that any user with permission to write to
> *any* set of objects directly in the cluster will be able to exploit
> this regardless of what barriers we erect. Deterministic placement, in
> that anybody directly accessing the cluster can compute data
> locations, is central to Ceph's design. We could add "salts" or
> something to try and prevent attackers from *outside* the direct set
> (eg, users of RGW) exploiting it directly, but anybody who can read or
> write from the cluster would need to be able to read the salt in order
> to compute locations themselves.
Actually, doing (say) per-pool salts does help in a notable way: even
someone who can write to two pools can't reuse the computation of colliding
values across pools. It forces them to expend the work factor for each pool
they attack, rather than being able to amortize.
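
A small sketch of why the work can't be amortized, under the same assumptions as any toy model here: pg_of() is hypothetical, keyed BLAKE2b stands in for whatever salted hash would really be used, and the salts and names are made up. Names brute-forced into one PG of pool A are simply re-evaluated under pool B's salt.

```python
import hashlib
from itertools import count, islice

def pg_of(name: str, pool_salt: bytes, pg_num: int) -> int:
    """Hypothetical placement: hash keyed with a per-pool salt, modulo pg_num."""
    digest = hashlib.blake2b(name.encode(), key=pool_salt, digest_size=8).digest()
    return int.from_bytes(digest, "little") % pg_num

PG_NUM = 128
salt_a = b"pool-a-salt"  # made-up salts, only for this sketch
salt_b = b"pool-b-salt"

# The attacker pays the brute-force cost once, against pool A's salt.
candidates = (f"obj-{i}" for i in count())
colliding = list(islice((n for n in candidates
                         if pg_of(n, salt_a, PG_NUM) == 0), 500))

pgs_in_a = {pg_of(n, salt_a, PG_NUM) for n in colliding}
pgs_in_b = {pg_of(n, salt_b, PG_NUM) for n in colliding}

# Distinct PGs hit in each pool: a single PG in A, spread out in B.
print(len(pgs_in_a), len(pgs_in_b))
```

The same 500 names that pile onto one PG in pool A scatter across pool B, so the attacker has to redo the search per pool.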
> So I'm pretty sure this attack vector
> is:
> 1) Inherent to all hash-placement systems,
> 2) not something we can reasonably defend against *anyway*.
I'd agree that in the absolute sense it's inherent and insoluble, but that
doesn't imply that _mitigations_ are worthless.
A more drastic option would be to look at how the sfq network scheduler
handles it - it hashes flows onto a fixed number of queues, and gets around
collisions by periodically perturbing the salt (resulting in a _stochastic_
avoidance of clumping). It'd definitely require some research to find a way
to do this such that it doesn't cause huge data movement, but it might be
worth thinking about for the longer term.
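
As a toy model of both halves of that tradeoff (again all assumptions: pg_of() and salt_for_epoch() are hypothetical, an epoch counter stands in for a fresh random salt, keyed BLAKE2b stands in for the real hash), the sketch below perturbs the salt once and measures what happens to an attacker's clumped names and to ordinary objects.

```python
import hashlib
from collections import Counter
from itertools import count, islice

def pg_of(name: str, salt: bytes, pg_num: int) -> int:
    """Hypothetical placement: salted 64-bit hash reduced modulo pg_num."""
    digest = hashlib.blake2b(name.encode(), key=salt, digest_size=8).digest()
    return int.from_bytes(digest, "little") % pg_num

def salt_for_epoch(epoch: int) -> bytes:
    """Stand-in for a fresh random salt chosen at each perturbation."""
    return epoch.to_bytes(8, "little")

def max_load(names, epoch: int, pg_num: int) -> int:
    """Largest number of the given names sharing one PG in this epoch."""
    return max(Counter(pg_of(n, salt_for_epoch(epoch), pg_num)
                       for n in names).values())

PG_NUM = 128
# Names the attacker brute-forced into PG 0 under the epoch-0 salt.
attack = list(islice((n for n in (f"evil-{i}" for i in count())
                      if pg_of(n, salt_for_epoch(0), PG_NUM) == 0), 500))
honest = [f"file-{i}" for i in range(5000)]

before = max_load(attack, 0, PG_NUM)
after = max_load(attack, 1, PG_NUM)
moved = sum(pg_of(n, salt_for_epoch(0), PG_NUM) != pg_of(n, salt_for_epoch(1), PG_NUM)
            for n in honest)

print(before, after, moved / len(honest))
```

The clump dissolves after one perturbation, but almost every honest object changes PG too, which is exactly the data-movement problem that would need research before anything like this could be used.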
[1] I'm thinking along the lines of SipHash, not any heavyweight
cryptographic hash; however, with network latencies on the table, those might
not be too bad regardless.