From: Alex Elsayed <eternaleye@gmail.com>
To: ceph-devel@vger.kernel.org
Subject: Re: Forcing Ceph into mapping all objects to a single PG
Date: Tue, 22 Jul 2014 15:44:16 -0700
Message-ID: <lqmpg2$chu$1@ger.gmane.org>
In-Reply-To: CAPYLRzj4tzBCSrGPCsaLvy1+7ac4=Um1EHDeZhLsCpuZZy7miA@mail.gmail.com

Gregory Farnum wrote:

> On Mon, Jul 21, 2014 at 3:27 PM, Daniel Hofmann <daniel@trvx.org> wrote:
>> Preamble: you might want to read the decent formatted version of this
>> mail at:
>>> https://gist.github.com/daniel-j-h/2daae2237bb21596c97d
<snip aggressively>
>> -------
>>
>> Ceph's object mapping depends on the rjenkins hash function.
>> It's possible to force Ceph into mapping all objects to a single PG.
>>
>> Please discuss!
> 
> Yes, this is an attack vector. It functions against...well, any system
> using hash-based placement.

Sort of. How well it functions is a function (heh) of how easy it is to find 
a preimage for the hash (a collision only gives you a pair; you need 
preimages to get beyond that).

With rjenkins, preimages aren't particularly difficult to find. By using a 
more robust hash[1], preimages become more computationally expensive, 
since you need to brute-force each value rather than take advantage of 
a weakness in the algorithm.

This doesn't buy a huge amount, since the brute-force effort per colliding 
name is still bounded by the number of PGs, but it does help - and it means 
that as PGs are split, resistance to the attack increases as well.
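
To make the "bounded by the number of PGs" point concrete, here is a minimal 
sketch in Python. It is a toy model, not Ceph's placement code: pg_of() and 
brute_force() are hypothetical names, keyed BLAKE2b from the standard library 
stands in for a SipHash-style hash, and a plain modulo stands in for the real 
PG mapping.

```python
import hashlib
from itertools import count

def pg_of(name: str, pg_num: int, key: bytes = b"") -> int:
    """Toy placement (assumption): keyed BLAKE2b digest reduced modulo pg_num."""
    digest = hashlib.blake2b(name.encode(), digest_size=8, key=key).digest()
    return int.from_bytes(digest, "little") % pg_num

def brute_force(target_pg: int, pg_num: int, wanted: int, key: bytes = b""):
    """Try candidate names in order until `wanted` of them land in target_pg.

    Returns the matching names and the total number of candidates tried.
    """
    hits = []
    for tries in count(1):
        name = f"obj-{tries}"
        if pg_of(name, pg_num, key) == target_pg:
            hits.append(name)
            if len(hits) == wanted:
                return hits, tries

hits, tries = brute_force(target_pg=0, pg_num=64, wanted=20)
print(f"{len(hits)} names in PG 0 after {tries} candidates")
```

With no structural weakness to exploit, each candidate hits the target PG 
with probability of about 1/pg_num, so the attacker pays on the order of 
pg_num hash evaluations per colliding name. Doubling pg_num roughly doubles 
that cost.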

> RGW mangles names on its own, although the mangling is deterministic
> enough that an attacker could perhaps manipulate it into mangling them
> onto the same PG. (Within the constraints, though, it'd be pretty
> difficult.)
> RBD names objects in a way that users can't really control, so I guess
> it's safe, sort of? (But users of rbd will still have write permission
> to some class of objects in which they may be able to find an attack.)
> 
> The real issue though, is that any user with permission to write to
> *any* set of objects directly in the cluster will be able to exploit
> this regardless of what barriers we erect. Deterministic placement, in
> that anybody directly accessing the cluster can compute data
> locations, is central to Ceph's design. We could add "salts" or
> something to try and prevent attackers from *outside* the direct set
> (eg, users of RGW) exploiting it directly, but anybody who can read or
> write from the cluster would need to be able to read the salt in order
> to compute locations themselves.

Actually, doing (say) per-pool salts does help in a notable way: even 
someone who can write to two pools can't reuse the computation of colliding 
values across pools. It forces them to expend the work factor for each pool 
they attack, rather than being able to amortize.
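
A minimal sketch of the per-pool salt argument, again as a toy model rather 
than anything Ceph implements: salted_pg() and colliding_names() are 
hypothetical, the salts are made-up byte strings, and keyed BLAKE2b stands in 
for the salted hash.

```python
import hashlib

def salted_pg(name: str, pg_num: int, pool_salt: bytes) -> int:
    """Toy placement (assumption): hash keyed by a per-pool salt, modulo pg_num."""
    digest = hashlib.blake2b(name.encode(), digest_size=8, key=pool_salt).digest()
    return int.from_bytes(digest, "little") % pg_num

def colliding_names(pool_salt: bytes, pg_num: int, target_pg: int, wanted: int):
    """Brute-force `wanted` names that all map to target_pg under pool_salt."""
    names = []
    i = 0
    while len(names) < wanted:
        name = f"obj-{i}"
        if salted_pg(name, pg_num, pool_salt) == target_pg:
            names.append(name)
        i += 1
    return names

PG_NUM = 64
# Work done once, against pool A's salt.
names_a = colliding_names(b"pool-a-salt", PG_NUM, target_pg=0, wanted=20)
# The same names evaluated under pool A's salt and under pool B's salt.
pgs_in_a = {salted_pg(n, PG_NUM, b"pool-a-salt") for n in names_a}
pgs_in_b = {salted_pg(n, PG_NUM, b"pool-b-salt") for n in names_a}
print("distinct PGs in pool A:", len(pgs_in_a))
print("distinct PGs in pool B:", len(pgs_in_b))
```

The names computed against pool A's salt all land in one PG there, but under 
pool B's salt they scatter like any other set of names, so the search has to 
be repeated from scratch for pool B.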

> So I'm pretty sure this attack vector
> is:
> 1) Inherent to all hash-placement systems,
> 2) not something we can reasonably defend against *anyway*.

I'd agree that in the absolute sense it's inherent and insoluble, but that 
doesn't imply that _mitigations_ are worthless.

A more drastic option would be to look at how the sfq network scheduler 
handles it - it hashes flows onto a fixed number of queues, and gets around 
collisions by periodically perturbing the salt (resulting in a _stochastic_ 
avoidance of clumping). It'd definitely require some research to find a way 
to do this such that it doesn't cause huge data movement, but it might be 
worth thinking about for the longer term.
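
For illustration, a naive version of sfq-style perturbation, sketched under 
the same toy assumptions as before (pg_at_epoch() and fraction_moved() are 
hypothetical names; the salt is simply a base secret plus an epoch counter). 
It mainly shows why the data-movement problem needs research:

```python
import hashlib

def pg_at_epoch(name: str, pg_num: int, base_salt: bytes, epoch: int) -> int:
    """Toy placement (assumption): the hash key changes with every epoch."""
    key = base_salt + epoch.to_bytes(8, "little")
    digest = hashlib.blake2b(name.encode(), digest_size=8, key=key).digest()
    return int.from_bytes(digest, "little") % pg_num

def fraction_moved(names, pg_num, base_salt, old_epoch, new_epoch):
    """Fraction of names whose PG differs between two epochs."""
    moved = sum(
        pg_at_epoch(n, pg_num, base_salt, old_epoch)
        != pg_at_epoch(n, pg_num, base_salt, new_epoch)
        for n in names
    )
    return moved / len(names)

names = [f"obj-{i}" for i in range(1000)]
moved = fraction_moved(names, 64, b"cluster-secret", old_epoch=0, new_epoch=1)
print(f"fraction of objects remapped by one perturbation: {moved:.2f}")
```

Any set of names an attacker precomputed for one epoch is scattered in the 
next, but so is nearly everything else: with a plain re-salt almost every 
object changes PG. That is cheap for sfq, where only queue assignments 
change, and expensive for a storage cluster, where it means moving data.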

[1] I'm thinking along the lines of SipHash, not any heavyweight 
cryptographic hash; however, with network latencies on the table, those 
might not be too bad regardless.

