* Rados and user-provided buffers
From: Rutger ter Borg @ 2013-09-18 19:57 UTC
  To: ceph-devel


Dear all,

I've a question regarding buffers in rados (using the C++ API). I'm 
allocating and using my own buffers, and would like to read and write 
directly into and from them. I'm using a bufferlist consisting of 
static_buffers, which are passed to Rados' aio_read and aio_write.

For aio_write, rados works as expected, i.e., the bufferlist is returned 
as it was before the call. However, when doing aio_read, it seems that 
the bufferlist is destroyed (not used) in the call, despite all the 
buffers being static.

Is this expected behaviour? I read a thread "read/write on RADOS using 
external buffer" from this mailing list from 2010, but wasn't able to 
figure out whether rados does or doesn't support reading into 
user-provided static_buffers.

Thanks in advance,
Cheers,

Rutger





* Re: Rados and user-provided buffers
From: Sage Weil @ 2013-09-18 20:01 UTC
  To: Rutger ter Borg; +Cc: ceph-devel


The read-into-existing-buffer is only wired up properly for the C 
interface.  For the C++ it isn't generally necessary: we allocate and read 
the data off the network, and pass the reference directly back to the user 
without making another copy.  The 2010 thread is about similarly avoiding 
such a copy for the C API.  We didn't contemplate the situation where you 
specifically want the bytes to go to a particular address via C++.  If 
that's what you need, the C++ API needs to be extended, or you can just 
use the C call for that case.

sage



* Re: Rados and user-provided buffers
From: Rutger ter Borg @ 2013-09-18 20:31 UTC
  To: ceph-devel


Hey Sage,

my particular use case is a pager that uses Rados as a backend. Striping 
of pages works identically to the striping mechanism of Ceph. Reads and 
writes of multiple pages may be combined into one aio_ call with one 
bufferlist. Pages are allocated by the pager.

AFAICT, the C call provides reading into a contiguous buffer, whereas I 
would like to read into a bufferlist. What would need to be done to add 
support for this in rados?
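
For illustration, a minimal sketch of the workaround this implies with a
contiguous read: fetch the data into one staging buffer and then scatter
it into the pager-owned pages. The names Page and scatter_into_pages are
hypothetical and not part of librados. The extra memcpy per page is the
cost that reading straight into a bufferlist would avoid.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical view of a pager-owned page: caller-allocated memory.
struct Page {
  char* data;
  std::size_t size;
};

// Copy `len` bytes from a contiguous staging buffer into the pages, in
// order. Returns the number of bytes copied, which is less than `len`
// if the pages run out of room.
std::size_t scatter_into_pages(const char* staging, std::size_t len,
                               const std::vector<Page>& pages) {
  assert(staging != nullptr || len == 0);
  std::size_t copied = 0;
  for (const Page& p : pages) {
    if (copied == len) break;
    const std::size_t n = std::min(p.size, len - copied);
    std::memcpy(p.data, staging + copied, n);
    copied += n;
  }
  return copied;
}
```

A caller would compare the return value against the requested length to
detect that the pages were too small for the data read.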

Thanks,

Rutger





* Re: Rados and user-provided buffers
From: Sage Weil @ 2013-09-18 20:52 UTC
  To: Rutger ter Borg; +Cc: ceph-devel


Hmm, looking at the code, I'm surprised that this isn't working.  The C 
aio_read call is just doing

  bufferlist bl;
  bufferptr bp = buffer::create_static(len, buf);
  bl.push_back(bp);

  ret = ctx->read(oid, bl, len, off);
  if (ret >= 0) {
    if (bl.length() > len)
      return -ERANGE;
    if (bl.c_str() != buf)
      bl.copy(0, bl.length(), buf);


My guess is the rx_buffers machinery is broken and we are triggering that 
bl.copy() all the time.  In principle, here is what is supposed to happen:

- the outbl is passed into Objecter and associated with the request.

- in Objecter::send_op(), we do

  if (op->outbl && op->outbl->length()) {
    ldout(cct, 20) << " posting rx buffer for " << op->tid << " on " << op->session->con << dendl;
    op->con = op->session->con;
    op->con->post_rx_buffer(op->tid, *op->outbl);
  }

- in msg/Pipe.cc when we are reading a message, we find that bufferlist 
and use it directly instead of allocating a new one.

      connection_state->lock.Lock();
      map<tid_t,pair<bufferlist,int> >::iterator p = connection_state->rx_buffers.find(header.tid);
      if (p != connection_state->rx_buffers.end()) {
	if (rxbuf.length() == 0 || p->second.second != rxbuf_version) {
	  ldout(msgr->cct,10) << "reader seleting rx buffer v " << p->second.second
		   << " at offset " << offset
		   << " len " << p->second.first.length() << dendl;
...

As a first step I would 'debug objecter = 20' and 'debug ms = 20' and see 
if you see those debug messages going by for a single read request.
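
The intended mechanism can be modelled in a few lines. This is a toy
sketch, not Ceph code: Connection, post_rx_buffer and receive below are
simplified stand-ins (the real post_rx_buffer takes a bufferlist and the
lookup lives in msg/Pipe.cc), and plain char pointers replace
bufferlists. It only shows the intent: a buffer posted under a tid is
filled in place, and a reply with no posted buffer lands in freshly
allocated memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a connection that tracks posted receive buffers by tid.
struct Connection {
  std::map<std::uint64_t, std::pair<char*, std::size_t>> rx_buffers;
  std::vector<char> fallback;  // used when nothing usable was posted

  // Register a caller-owned buffer for the reply to transaction `tid`.
  void post_rx_buffer(std::uint64_t tid, char* buf, std::size_t len) {
    assert(buf != nullptr);
    rx_buffers[tid] = std::make_pair(buf, len);
  }

  // Deliver `payload` for `tid`; returns where the bytes ended up.
  const char* receive(std::uint64_t tid, const std::string& payload) {
    auto p = rx_buffers.find(tid);
    if (p != rx_buffers.end() && payload.size() <= p->second.second) {
      char* dst = p->second.first;
      std::memcpy(dst, payload.data(), payload.size());
      rx_buffers.erase(p);
      return dst;  // read straight into the posted buffer
    }
    fallback.assign(payload.begin(), payload.end());
    return fallback.data();  // no posted buffer: allocate instead
  }
};
```

Comparing the returned pointer with the posted buffer plays the same
role as the bl.c_str() != buf check in the snippet above.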

sage


* Re: Rados and user-provided buffers
From: Rutger ter Borg @ 2013-09-19  9:28 UTC
  To: ceph-devel

On 2013-09-18 22:52, Sage Weil wrote:

> Hmm, looking at the code, I'm surprised that this isn't working.  The C
> aio_read call is just doing
>
>    bufferlist bl;
>    bufferptr bp = buffer::create_static(len, buf);
>    bl.push_back(bp);
>
>    ret = ctx->read(oid, bl, len, off);
>    if (ret >= 0) {
>      if (bl.length() > len)
>        return -ERANGE;
>      if (bl.c_str() != buf)
>        bl.copy(0, bl.length(), buf);


Hey Sage,

thanks for the hints. You're citing the synchronous version, 
rados_read, not rados_aio_read. The difference between the two C calls 
is that rados_read passes a bufferlist, while rados_aio_read uses an 
overload taking a char* buf. Both delegate to IoCtxImpl. IoCtxImpl has 
two overloads for aio_read, one accepting a bufferlist and one accepting 
a char*, but only one overload for read, accepting a bufferlist.

IoCtxImpl's overloads for aio_read are almost identical; the difference 
is that the bufferlist overload sets a bufferlist pointer on 
AioCompletionImpl* c,

c->pbl = pbl;

and the char* buffer-overload sets a buffer

c->buf = buf;

AioCompletionImpl contains multiple data members: a bufferlist (bl), a 
pointer to a bufferlist (pbl), and a pointer to a character array (buf).
The calls to the objecter that follow in both of IoCtxImpl's aio_read 
overloads are identical:

objecter->read(oid, oloc,
          off, len, snapid, &c->bl, 0,
          onack, &c->objver);

Looking at the Objecter read and deeper in the call chain, it seems that 
information about data member pbl in AioCompletionImpl* is lost. The 
objecter only knows about a Context, not about AioCompletionImpl. The 
user-provided buffer is not passed on.

My preliminary conclusion is that my problem is caused by information 
lost in IoCtxImpl's aio_read overloads. Maybe it can be solved by 
modifying IoCtxImpl.cc:

  * removing line 615. Not sure why the AioCompletionImpl needs to know 
anything about buffers?
  * replacing '&c->bl' with 'pbl' on line 619 of IoCtxImpl.cc, making 
the call to the objecter

objecter->read(oid, oloc,
          off, len, snapid, pbl, 0,
          onack, &c->objver);

In that way, the bufferlist is passed through, and not thrown away.
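
To make the difference concrete, here is a stripped-down model of the
two call patterns. It is a sketch only: Buf, Completion, objecter_read,
aio_read_current and aio_read_proposed are hypothetical stand-ins for
bufferlist, AioCompletionImpl, Objecter::read and the IoCtxImpl
overload. The real code is asynchronous, and whatever happens to pbl at
completion time is not modelled here.

```cpp
#include <cassert>
#include <string>

using Buf = std::string;  // stand-in for a bufferlist

struct Completion {       // stand-in for AioCompletionImpl
  Buf bl;                 // internal buffer
  Buf* pbl = nullptr;     // pointer to the caller's buffer
};

// Stand-in for the objecter: fills whichever buffer it is handed.
void objecter_read(Buf* out) { *out = "object data"; }

// Pattern described above: pbl is recorded on the completion, but the
// objecter is handed the completion's internal buffer.
void aio_read_current(Completion* c, Buf* pbl) {
  assert(pbl != nullptr);
  c->pbl = pbl;
  objecter_read(&c->bl);
}

// Proposed change: hand the caller's buffer straight to the objecter.
void aio_read_proposed(Completion* c, Buf* pbl) {
  assert(pbl != nullptr);
  (void)c;  // the completion no longer needs to know about buffers
  objecter_read(pbl);
}
```

In this model the first variant leaves the caller's buffer untouched by
the objecter, while the second fills it directly.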

Thanks,

Rutger






* Re: Rados and user-provided buffers
From: Rutger ter Borg @ 2013-09-19 12:39 UTC
  To: ceph-devel


FWIW, it works for me.

Cheers,

Rutger






