From: Igor Fedotov <ifedotov@mirantis.com>
To: Samuel Just <sjust@redhat.com>, Robert LeBlanc <robert@leblancnet.us>
Cc: Sage Weil <sweil@redhat.com>, Gregory Farnum <gfarnum@redhat.com>,
	ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: Adding Data-At-Rest compression support to Ceph
Date: Fri, 25 Sep 2015 14:59:42 +0300
Message-ID: <5605372E.9020004@mirantis.com>
In-Reply-To: <CAN=+7FUrBdTc7hD1mQNs5GPO_uKUZsbxR73VU8P9KqaiZ_hAJA@mail.gmail.com>

Another thing to note is that we don't have the whole object available for
compression. We only have a new data block written (appended) to the
object. So we must either compress just that block and save the mapping
data mentioned below, or decompress the existing object data and compress
the whole object again.
IMO, introducing seek points is largely similar to what we were
talking about: it requires some kind of offset mapping as well.
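
To make the first option concrete, here is a minimal sketch of compressing
each appended block on its own and keeping a logical offset -> compressed
offset map, so a small read only decompresses the blocks it touches. The
class name, the field names and the use of zlib are my assumptions for
illustration; nothing here is existing Ceph code.

```python
import bisect
import zlib


class CompressedObject:
    """Hypothetical append-only object stored as independently compressed blocks."""

    def __init__(self):
        self.blob = bytearray()    # concatenated compressed blocks ("on disk")
        self.logical_starts = []   # logical offset at which each block begins
        self.extents = []          # (compressed offset, compressed length) per block
        self.logical_size = 0

    def append(self, data: bytes) -> None:
        """Compress only the newly appended block and record its mapping."""
        if not data:
            return
        compressed = zlib.compress(data)
        self.logical_starts.append(self.logical_size)
        self.extents.append((len(self.blob), len(compressed)))
        self.blob += compressed
        self.logical_size += len(data)

    def read(self, offset: int, length: int) -> bytes:
        """Read a logical range (offset >= 0), decompressing only touched blocks."""
        end = min(offset + length, self.logical_size)
        if offset >= end:
            return b""
        out = bytearray()
        # Find the block that contains `offset`.
        i = bisect.bisect_right(self.logical_starts, offset) - 1
        while i < len(self.extents) and self.logical_starts[i] < end:
            c_off, c_len = self.extents[i]
            block = zlib.decompress(bytes(self.blob[c_off:c_off + c_len]))
            start = self.logical_starts[i]
            out += block[max(offset - start, 0):end - start]
            i += 1
        return bytes(out)
```

The cost is that the map grows with the number of appends, and many small
appends compress worse than one large block would. That is the trade-off
against decompressing and recompressing the whole object.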

Compression at the OSD probably has some pros as well, but it wouldn't
eliminate the need to "muck with stripe sizes or anything".
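
A rough illustration of why: compressed output has an arbitrary length, so
it still has to be aligned before it reaches the EC layer, and the real
compressed length has to be stored somewhere. The stripe width value and
both function names below are hypothetical, chosen only for this sketch.

```python
import zlib

STRIPE_WIDTH = 4096  # hypothetical stripe width in bytes, for illustration only


def pad_to_stripe(payload: bytes, stripe_width: int = STRIPE_WIDTH) -> bytes:
    """Zero-pad payload up to the next multiple of stripe_width."""
    remainder = len(payload) % stripe_width
    if remainder == 0:
        return payload
    return payload + b"\x00" * (stripe_width - remainder)


def compress_then_align(block: bytes, stripe_width: int = STRIPE_WIDTH):
    """Compress first, then align; return (aligned bytes, compressed length)."""
    compressed = zlib.compress(block)
    return pad_to_stripe(compressed, stripe_width), len(compressed)


block = b"x" * (2 * STRIPE_WIDTH)
aligned, compressed_len = compress_then_align(block)
# The padding hides where the compressed data ends, so the length is kept
# alongside it and used to trim before decompressing.
restored = zlib.decompress(aligned[:compressed_len])
```

So the stripe size still matters with OSD-side compression; only the place
where the padding is applied changes.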

On 24.09.2015 20:53, Samuel Just wrote:
> The catch is that currently accessing 4k in the middle of a 4MB object
> does not require reading the whole object, so you'd need some kind of
> logical offset -> compressed offset mapping.
> -Sam
>
> On Thu, Sep 24, 2015 at 10:36 AM, Robert LeBlanc <robert@leblancnet.us> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA256
>>
>> I'm probably missing something, but since we are talking about data at
>> rest, can't we just have the OSD compress the object as it goes to
>> disk? Instead of
>> rbd\udata.1ba49c10d9b00c.0000000000006859__head_2AD1002B__11 it would
>> be rbd\udata.1ba49c10d9b00c.0000000000006859__head_2AD1002B__11.{gz,xz,bz2,lzo,etc}.
>> Then it seems that you don't have to muck with stripe sizes or
>> anything. For compressible objects they would be less than 4MB, some
>> of these algorithms already say if it is not compressible enough,
>> just store it.
>>
>> Something like zlib Z_FULL_FLUSH may help provide some seek points
>> within an archive to prevent decompressing the whole object for reads?
>>
>> - ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>>
>>
>> On Thu, Sep 24, 2015 at 10:25 AM, Igor Fedotov wrote:
>>>
>>> On 24.09.2015 19:03, Sage Weil wrote:
>>>> On Thu, 24 Sep 2015, Igor Fedotov wrote:
>>>>
>>>> Dynamic stripe sizes are possible but it's a significant change from the
>>>> way the EC pool currently works. I would make that a separate project (as
>>>> it's useful in its own right) and not complicate the compression situation.
>>>> Or, if it simplifies the compression approach, then I'd make that change
>>>> first. sage
>>> Just to clarify a bit, here is what I saw when I played with Ceph. Please
>>> correct me if I'm wrong.
>>>
>>> For low-level RADOS access, client data written to an EC pool has to be
>>> aligned with the stripe size. The last block can be unaligned, but no more
>>> appends are permitted in that case.
>>> Data copied from the cache goes in blocks of up to 8 MB. In the general
>>> case, the last block seems to have an unaligned size too.
>>>
>>> The EC pool additionally aligns incoming blocks to the stripe boundary
>>> internally. This way, blocks going to the EC lib are always aligned.
>>> We should probably perform compression prior to this alignment.
>>> Thus some dependency on stripe size is present in EC pools, but it's not
>>> that strict.
>>>
>>> Thanks,
>>> Igor
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> -----BEGIN PGP SIGNATURE-----
>> Version: Mailvelope v1.1.0
>> Comment: https://www.mailvelope.com
>>
>> wsFcBAEBCAAQBQJWBDSDCRDmVDuy+mK58QAAmwwP/3q0tbLZA95RVsvSLrXk
>> ipuhjiGPvAX8o2kTYFtf5tXkMuiJIJIy+WK1uD6zs+CXM/2JR6SJthS3tE9A
>> meaFW7W5lropbWKRZ8TkpUNQAXDyRrpSEcTDBWciq+EOca5tlP+17KDevVnZ
>> PWDCNPlZmbHyBy91iJju4TTzaJYoD8mXU/+4xLCicePDPomlpO4oyndDfOmI
>> JP5uRDmgP0ecsxfcyoYSTCJylfnBsmK0IMyxZoV2Mx+SEcqgtECPCOY7Uc/4
>> wwXGhu//zO7twyOvtsk4OQGjLX9wpSpVWz+zcR2RYiYfw3YSTSzGvbBC5hpb
>> pfQya5DbypJra2oz5BZkikvwYPhxPoI0FcdTCYFFxclm0jMwQqh2b141kN8Z
>> eR7v8ttfnbACumWP74j2KSpHRm/1l65nN4wqzg3ovoesjoJDvb2miz8AX7ag
>> FXVa54JpIcoIzCkIkqvpCfzhatGU55yQiyt7aFAhJfpmP/cNpxmAete8buTK
>> 6aFMiYWFJe+md/bLOrk5g/cyr9BUq+tHT7Qf+mRmgw9fuECUXMXMzf6vOUk8
>> 0JnYiYVk0j+twZeuDaVPBrXEMKuYuq7NlILuHJDF3meRPM2xekan8ARZoJxL
>> XAOzvaEFly0TH5DJfItSVOL86qtp+1orULSrVbtvolxzQtv8xiNOzJYBKEnO
>> ouVI
>> =d8mm
>> -----END PGP SIGNATURE-----

