* RadosGW objects to Rados object mapping
From: Abhishek L @ 2014-09-17 14:39 UTC (permalink / raw)
  To: 'ceph-devel@vger.kernel.org'


Hi,

I'm trying to understand the internals of RadosGW, specifically how
buckets/containers and objects are mapped to rados objects. I couldn't
find any docs, but a previous mailing list discussion[1] explained how
S3/Swift objects are cut into rados objects, and how manifests work. I
was able to reconstruct a file uploaded to RadosGW by using the manifest
to work out the rados object names and then fetching those objects.
For example:
```
# random.txt is an 8 MB text file
[r@ra:~/ceph/src]$ s3 -us put my-first-bucket/random filename=random.txt 
[r@ra:~/ceph/src]$ ./radosgw-admin object stat --bucket=my-first-bucket --object=random  | grep prefix 
      "prefix": "._op2xmptte2DD7z3_9EjQKgmmRcWRWL_",

```

I then fetched the objects with rados and joined them back together:

```
[r@ra:~/ceph/src]$ ./rados --pool .rgw.buckets ls | grep _op2xm
default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_2
default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_1
[r@ra:~/ceph/src]$ ./rados get default.4124.1_random random.part0 --pool .rgw.buckets
[r@ra:~/ceph/src]$ ./rados get default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_1 random.part1 --pool .rgw.buckets
[r@ra:~/ceph/src]$ ./rados get default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_2 random.part2 --pool .rgw.buckets

# Now join the objects back; diff printing nothing means the files match
[r@ra:~/ceph/src]$ cat random.part0 random.part1 random.part2 > random.rados.txt
[r@ra:~/ceph/src]$ diff random.txt random.rados.txt
```
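
The naming pattern visible in the listing above can be captured in a
small shell helper. This is only a sketch based on this one example: the
function name rgw_part_names and its arguments are made up for
illustration, and the pattern (a head object named <marker>_<object>
followed by numbered __shadow_ objects) is inferred from the output
above, so it may well not hold for multipart uploads or for other
radosgw versions.

```shell
# Hypothetical helper, not part of ceph: print the rados object names that
# make up one RadosGW object, in the order they should be concatenated.
#   $1 bucket marker (e.g. default.4124.1)
#   $2 object name
#   $3 manifest prefix as reported by `radosgw-admin object stat`
#   $4 number of shadow objects
rgw_part_names() {
  marker=$1 object=$2 prefix=$3 nshadow=$4
  # Head object: <marker>_<object>
  printf '%s_%s\n' "$marker" "$object"
  # Tail objects: <marker>__shadow_<prefix><n>, for n = 1..nshadow
  i=1
  while [ "$i" -le "$nshadow" ]; do
    printf '%s__shadow_%s%s\n' "$marker" "$prefix" "$i"
    i=$((i + 1))
  done
}
```

Each name it prints could then be fetched with rados get, as in the
commands above, and the parts concatenated in the printed order.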

I'm trying to find similar information on how radosgw stores buckets
and their metadata in rados objects: what information those objects
contain and how they are updated when, say, an object is added. I found
the bucket name and bucket metadata stored in the .rgw pool, but I'm
not sure how a bucket knows which objects it holds, or how the buckets
owned by a user are tracked.

[1] https://www.mail-archive.com/ceph-devel@vger.kernel.org/msg19747.html 

Thanks
-- 
Abhishek


* Re: RadosGW objects to Rados object mapping
From: Yehuda Sadeh @ 2014-09-17 15:52 UTC (permalink / raw)
  To: Abhishek L; +Cc: ceph-devel@vger.kernel.org

The bucket doesn't know who owns each object; this info is stored in
the object's info. The bucket index is stored as omap information in
the bucket instance object. The list of buckets per user is kept in
the user metadata object (also as omap information). There's a rados
command that lets you list the omap keys for each rados object.

Yehuda


* Re: RadosGW objects to Rados object mapping
From: Abhishek L @ 2014-09-17 18:23 UTC (permalink / raw)
  To: Yehuda Sadeh; +Cc: ceph-devel@vger.kernel.org


Yehuda Sadeh writes:

> The bucket doesn't know who owns each object; this info is stored in
> the object's info. The bucket index is stored as omap information in
> the bucket instance object.

Ah thanks, I was able to list the objects in a bucket by getting the
omap keys from the .rgw.buckets.index pool:

```
[r@ra:~/ceph/src](⎇ master)$ ./rados -p .rgw.buckets.index ls
.dir.default.4124.2
.dir.default.4124.1

[r@ra:~/ceph/src](⎇ master)$ ./rados -p .rgw.buckets.index listomapkeys .dir.default.4124.1
big-object
file-1
object-8
random
```
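
The lookup above can be wrapped in a couple of small shell functions.
This is a sketch, not anything shipped with ceph: the function names and
the RADOS override variable are invented here, and the ".dir.<marker>"
naming is simply taken from the listing above. The marker itself
(default.4124.1 in this case) has to come from elsewhere; I believe
radosgw-admin bucket stats reports it, but treat that as an assumption.

```shell
# Hypothetical helper: name of the bucket index object for a bucket marker,
# assuming the ".dir.<marker>" naming seen in .rgw.buckets.index above.
bucket_index_object() {
  printf '.dir.%s\n' "$1"
}

# List the object names recorded in a bucket's index (its omap keys).
# RADOS may be set to point at a different rados binary, e.g. ./rados.
list_bucket_entries() {
  "${RADOS:-rados}" -p .rgw.buckets.index listomapkeys "$(bucket_index_object "$1")"
}
```

For the bucket above, RADOS=./rados list_bucket_entries default.4124.1
would run the same listomapkeys command shown in the transcript.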
 
> The list of buckets per user is kept in
> the user metadata object (also as omap information). There's a rados
> command that lets you list the omap keys for each rados object.

I was also able to get this by inspecting the <uid>.buckets objects in
the .users.uid pool:

```
./rados -p .users.uid listomapkeys testid.buckets
another-bucket
my-first-bucket
```
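
The same idea gives a quick check for whether a bucket appears in a
user's list. Again this is only a sketch: the function names and the
RADOS override are hypothetical, and the "<uid>.buckets" object name and
the .users.uid pool are taken from the command above rather than from
any documented layout.

```shell
# Hypothetical helper: name of the object holding a user's bucket list,
# assuming the "<uid>.buckets" naming seen above.
user_buckets_object() {
  printf '%s.buckets\n' "$1"
}

# Succeed if bucket $2 is among the omap keys of user $1's bucket list.
# RADOS may be set to point at a different rados binary, e.g. ./rados.
user_has_bucket() {
  "${RADOS:-rados}" -p .users.uid listomapkeys "$(user_buckets_object "$1")" |
    grep -Fxq -- "$2"   # -F fixed string, -x whole line, -q quiet
}
```

For example, user_has_bucket testid my-first-bucket && echo found.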

Thanks for the info. I'll try to combine these mailing list discussions
into a starting point for developer docs on storage in radosgw.

Cheers
-- 
Abhishek
