* Rados and user-provided buffers
@ 2013-09-18 19:57 Rutger ter Borg
2013-09-18 20:01 ` Sage Weil
0 siblings, 1 reply; 6+ messages in thread
From: Rutger ter Borg @ 2013-09-18 19:57 UTC
To: ceph-devel
Dear all,
I've a question regarding buffers in rados (using the C++ API). I'm
allocating and using my own buffers, and would like to read and write
directly into and from them. I'm using a bufferlist consisting of
static_buffers, which are passed to Rados' aio_read and aio_write.
For aio_write, rados works as expected, i.e., the bufferlist is returned
as it was before the call. However, when doing aio_read, it seems that
the bufferlist is destroyed (not used) in the call, despite all the
buffers being static.
Is this expected behaviour? I read a thread "read/write on RADOS using
external buffer" from this mailing list from 2010, but wasn't able to
figure out whether rados does or doesn't support reading into
user-provided static_buffers.
Thanks in advance,
Cheers,
Rutger
* Re: Rados and user-provided buffers
2013-09-18 19:57 Rados and user-provided buffers Rutger ter Borg
@ 2013-09-18 20:01 ` Sage Weil
2013-09-18 20:31 ` Rutger ter Borg
0 siblings, 1 reply; 6+ messages in thread
From: Sage Weil @ 2013-09-18 20:01 UTC
To: Rutger ter Borg; +Cc: ceph-devel
On Wed, 18 Sep 2013, Rutger ter Borg wrote:
>
> Dear all,
>
> I've a question regarding buffers in rados (using the C++ API). I'm allocating
> and using my own buffers, and would like to read and write directly into and
> from them. I'm using a bufferlist consisting of static_buffers, which are
> passed to Rados' aio_read and aio_write.
>
> For aio_write, rados works as expected, i.e., the bufferlist is returned as it
> was before the call. However, when doing aio_read, it seems that the
> bufferlist is destroyed (not used) in the call, despite all the buffers being
> static.
>
> Is this expected behaviour? I read a thread "read/write on RADOS using
> external buffer" from this mailing list from 2010, but wasn't able to figure
> out whether rados does or doesn't support reading into user-provided
> static_buffers.
The read-into-existing-buffer is only wired up properly for the C
interface. For the C++ it isn't generally necessary: we allocate and read
the data off the network, and pass the reference directly back to the user
without making another copy. The 2010 thread is about similarly avoiding
such a copy for the C API. We didn't contemplate the situation where you
specifically want the bytes to go to a particular address via C++. If
that's what you need, the C++ API needs to be extended, or you can just
use the C call for that case.
sage
* Re: Rados and user-provided buffers
2013-09-18 20:01 ` Sage Weil
@ 2013-09-18 20:31 ` Rutger ter Borg
2013-09-18 20:52 ` Sage Weil
0 siblings, 1 reply; 6+ messages in thread
From: Rutger ter Borg @ 2013-09-18 20:31 UTC
To: ceph-devel
On 2013-09-18 22:01, Sage Weil wrote:
>
> The read-into-existing-buffer is only wired up properly for the C
> interface. For the C++ it isn't generally necessary: we allocate and read
> the data off the network, and pass the reference directly back to the user
> without making another copy. The 2010 thread is about similarly avoiding
> such a copy for the C API. We didn't contemplate the situation where you
> specifically want the bytes to go to a particular address via C++. If
> that's what you need, the C++ API needs to be extended, or you can just
> use the C call for that case.
>
> sage
>
Hey Sage,
my particular use case is a pager that uses Rados as a backend. Striping
of pages works identically to the striping mechanism of Ceph. Reads and
writes of multiple pages may be combined into one aio_ call with one
bufferlist. Pages are allocated by the pager.
AFAICT, the C call provides reading into a contiguous buffer, whereas I
would like to read into a bufferlist. What would need to be done to add
support for this in rados?
Thanks,
Rutger
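The contrast drawn above, between reading into one contiguous buffer and reading into a list of separately allocated pages, can be illustrated with a small stand-alone model. With only a contiguous target, a pager would have to copy the result into its pages afterwards; the sketch below shows that extra scatter step. It does not use librados, and `Page` and `scatter_into_pages` are hypothetical names introduced purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for one pager-owned page; not a librados type.
struct Page {
  char* data;       // memory owned by the pager
  std::size_t len;  // page size in bytes
};

// Copy a contiguous payload into a list of user-owned pages, in order.
// Returns the number of bytes copied; stops when either side runs out.
std::size_t scatter_into_pages(const char* src, std::size_t src_len,
                               const std::vector<Page>& pages) {
  std::size_t copied = 0;
  for (const Page& p : pages) {
    if (copied == src_len) break;
    assert(p.data != nullptr);
    const std::size_t n = std::min(p.len, src_len - copied);
    std::memcpy(p.data, src + copied, n);
    copied += n;
  }
  return copied;
}
```

Reading directly into a bufferlist made of the pager's pages would make this second pass unnecessary, which is the motivation behind the question.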
* Re: Rados and user-provided buffers
2013-09-18 20:31 ` Rutger ter Borg
@ 2013-09-18 20:52 ` Sage Weil
2013-09-19 9:28 ` Rutger ter Borg
0 siblings, 1 reply; 6+ messages in thread
From: Sage Weil @ 2013-09-18 20:52 UTC
To: Rutger ter Borg; +Cc: ceph-devel
On Wed, 18 Sep 2013, Rutger ter Borg wrote:
> On 2013-09-18 22:01, Sage Weil wrote:
> >
> > The read-into-existing-buffer is only wired up properly for the C
> > interface. For the C++ it isn't generally necessary: we allocate and read
> > the data off the network, and pass the reference directly back to the user
> > without making another copy. The 2010 thread is about similarly avoiding
> > such a copy for the C API. We didn't contemplate the situation where you
> > specifically want the bytes to go to a particular address via C++. If
> > that's what you need, the C++ API needs to be extended, or you can just
> > use the C call for that case.
> >
> > sage
> >
>
> Hey Sage,
>
> my particular use case is a pager that uses Rados as a backend. Striping of
> pages works identically to the striping mechanism of Ceph. Reads and writes of
> multiple pages may be combined into one aio_ call with one bufferlist. Pages
> are allocated by the pager.
>
> AFAICT, the C call provides reading into a contiguous buffer, whereas I would
> like to read into a bufferlist. What would need to be done to add support for
> this in rados?
Hmm, looking at the code, I'm surprised that this isn't working. The C
aio_read call is just doing
  bufferlist bl;
  bufferptr bp = buffer::create_static(len, buf);
  bl.push_back(bp);
  ret = ctx->read(oid, bl, len, off);
  if (ret >= 0) {
    if (bl.length() > len)
      return -ERANGE;
    if (bl.c_str() != buf)
      bl.copy(0, bl.length(), buf);
My guess is the rx_buffers machinery is broken and we are triggering that
bl.copy() all the time. In principle, this is what is supposed to happen:
- the outbl is passed into Objecter and associated with the request.
- in Objecter::send_op(), we do
    if (op->outbl && op->outbl->length()) {
      ldout(cct, 20) << " posting rx buffer for " << op->tid << " on " << op->session->con << dendl;
      op->con = op->session->con;
      op->con->post_rx_buffer(op->tid, *op->outbl);
    }
- in msg/Pipe.cc, when we are reading a message, we find that bufferlist
and use it directly instead of allocating a new one.
    connection_state->lock.Lock();
    map<tid_t,pair<bufferlist,int> >::iterator p = connection_state->rx_buffers.find(header.tid);
    if (p != connection_state->rx_buffers.end()) {
      if (rxbuf.length() == 0 || p->second.second != rxbuf_version) {
        ldout(msgr->cct,10) << "reader seleting rx buffer v " << p->second.second
                            << " at offset " << offset
                            << " len " << p->second.first.length() << dendl;
        ...
As a first step I would 'debug objecter = 20' and 'debug ms = 20' and see
if you see those debug messages going by for a single read request.
sage
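The three steps described above (associate the caller's buffer with the request, post it on the connection keyed by transaction id, and have the reader use it instead of allocating) can be sketched as a small stand-alone model. This is not Ceph code: `RxBuffer`, `Connection`, and `receive` are hypothetical, the `memcpy` merely stands in for reading bytes off a socket, and the size check is an assumption of this sketch rather than something stated in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical, simplified model of the rx-buffer idea; not Ceph code.
struct RxBuffer {
  char* data;       // caller-owned destination memory
  std::size_t len;  // capacity of the destination in bytes
};

class Connection {
 public:
  // Remember where the reply for transaction `tid` should land.
  void post_rx_buffer(std::uint64_t tid, RxBuffer buf) {
    rx_buffers_[tid] = buf;
  }

  // Deliver a reply payload and return where the bytes ended up.
  // If a large-enough buffer was posted for `tid`, the payload is written
  // there; otherwise fallback storage is allocated by the connection.
  const char* receive(std::uint64_t tid, const char* payload,
                      std::size_t len) {
    auto it = rx_buffers_.find(tid);
    if (it != rx_buffers_.end() && it->second.len >= len) {
      char* dst = it->second.data;
      std::memcpy(dst, payload, len);  // stands in for the socket read
      rx_buffers_.erase(it);
      return dst;
    }
    fallback_.emplace_back(payload, payload + len);
    return fallback_.back().data();
  }

 private:
  std::map<std::uint64_t, RxBuffer> rx_buffers_;
  std::vector<std::vector<char>> fallback_;
};
```

In this model, comparing the pointer returned by `receive` with the caller's own buffer plays the role of the `bl.c_str() != buf` check in the C wrapper: if the two differ, the data did not land in the posted buffer and a copy is needed.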
* Re: Rados and user-provided buffers
2013-09-18 20:52 ` Sage Weil
@ 2013-09-19 9:28 ` Rutger ter Borg
2013-09-19 12:39 ` Rutger ter Borg
0 siblings, 1 reply; 6+ messages in thread
From: Rutger ter Borg @ 2013-09-19 9:28 UTC
To: ceph-devel
On 2013-09-18 22:52, Sage Weil wrote:
> Hmm, looking at the code, I'm surprised that this isn't working. The C
> aio_read call is just doing
>
>   bufferlist bl;
>   bufferptr bp = buffer::create_static(len, buf);
>   bl.push_back(bp);
>
>   ret = ctx->read(oid, bl, len, off);
>   if (ret >= 0) {
>     if (bl.length() > len)
>       return -ERANGE;
>     if (bl.c_str() != buf)
>       bl.copy(0, bl.length(), buf);
Hey Sage,
thanks for the hints. You're citing the synchronous version of
rados_read, not rados_aio_read. The difference between rados_read and
rados_aio_read (the C-versions) is that rados_read uses a bufferlist,
and rados_aio_read uses an overload with a char* buf. Both are delegated
to IoCtxImpl. IoCtxImpl has two overloads for aio_read, one accepting a
bufferlist and one accepting a char*, but only one overload for read,
which accepts a bufferlist only.
IoCtxImpl's overloads for aio_read are almost identical; the difference
is that the bufferlist overload sets a bufferlist pointer on
AioCompletionImpl* c,
  c->pbl = pbl;
and the char* overload sets a buffer pointer,
  c->buf = buf;
AioCompletionImpl contains multiple data members: a bufferlist (bl), a
pointer to a bufferlist (pbl), and a pointer to a character array (buf).
The following calls (in IoCtxImpl's aio_read overloads) to the objecter
are identical:
  objecter->read(oid, oloc,
                 off, len, snapid, &c->bl, 0,
                 onack, &c->objver);
Looking at the Objecter read and deeper in the call chain, it seems that
information about data member pbl in AioCompletionImpl* is lost. The
objecter only knows about a Context, not about AioCompletionImpl. The
user-provided buffer is not passed on.
My preliminary conclusion is that my problem is caused by information
lost in IoCtxImpl's aio_read overloads. Maybe it can be solved by
modifying IoCtxImpl.cc:
* removing line 615. Not sure why the AioCompletionImpl needs to know
anything about buffers?
* replacing '&c->bl' with 'pbl' on line 619 of IoCtxImpl.cc, making
the call to the objecter
  objecter->read(oid, oloc,
                 off, len, snapid, pbl, 0,
                 onack, &c->objver);
In that way, the bufferlist is passed through, and not thrown away.
Thanks,
Rutger
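The diagnosis above (the caller's bufferlist is recorded on the completion, but the lower layer is handed the completion's internal list) can be modelled in a few lines. This is a deliberately simplified sketch and not Ceph code: `BufferList` is just an alias for `std::string`, and `Completion`, `lower_read`, `aio_read_before`, and `aio_read_after` are hypothetical names that only mirror the roles discussed in the message.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-ins; this is not Ceph code.
using BufferList = std::string;

struct Completion {
  BufferList bl;              // internal list owned by the completion
  BufferList* pbl = nullptr;  // caller's list, only recorded here
};

// Stand-in for the lower-level read: writes the payload into `out`.
void lower_read(const std::string& payload, BufferList* out) {
  assert(out != nullptr);
  *out = payload;
}

// Before the proposed change: the caller's list is recorded on the
// completion, while the lower layer fills the completion's internal list.
void aio_read_before(Completion& c, BufferList* pbl,
                     const std::string& payload) {
  c.pbl = pbl;
  lower_read(payload, &c.bl);
}

// After the proposed change: the caller's list is handed straight to the
// lower layer, so the data lands where the caller asked for it.
void aio_read_after(Completion& c, BufferList* pbl,
                    const std::string& payload) {
  (void)c;  // the completion no longer needs to know about the buffer
  lower_read(payload, pbl);
}
```

In the "before" variant the caller's object is never written by the read itself; in the "after" variant it is the only object written, which matches the intent of replacing `&c->bl` with `pbl`.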
* Re: Rados and user-provided buffers
2013-09-19 9:28 ` Rutger ter Borg
@ 2013-09-19 12:39 ` Rutger ter Borg
0 siblings, 0 replies; 6+ messages in thread
From: Rutger ter Borg @ 2013-09-19 12:39 UTC
To: ceph-devel
On 2013-09-19 11:28, Rutger ter Borg wrote:
>
>
> Hey Sage,
>
> thanks for the hints. You're citing the synchronous version of
> rados_read, not rados_aio_read. The difference between rados_read and
> rados_aio_read (the C-versions) is that rados_read uses a bufferlist,
> and rados_aio_read uses an overload with a char* buf. Both are delegated
> to IoCtxImpl. IoCtxImpl has two overloads for aio_read, one accepting a
> bufferlist and one accepting a char*, but only one overload for read,
> which accepts a bufferlist only.
>
> IoCtxImpl's overloads for aio_read are almost identical, the difference
> is that the buflist overload sets a bufferlist on AioCompletionImpl* c,
>
> c->pbl = pbl;
>
> and the char* buffer-overload sets a buffer
>
> c->buf = buf;
>
> AioCompletionImpl contains multiple data members: a bufferlist (bl), a
> pointer to a bufferlist (pbl), and a pointer to a character array buf.
> The following calls (in IoCtxImpl's aio_read overloads) to the objecter
> are identical:
>
> objecter->read(oid, oloc,
> off, len, snapid, &c->bl, 0,
> onack, &c->objver);
>
> Looking at the Objecter read and deeper in the call chain, it seems that
> information about data member pbl in AioCompletionImpl* is lost. The
> objecter only knows about a Context, not about AioCompletionImpl. The
> user-provided buffer is not passed on.
>
> My preliminary conclusion is that my problem is caused by information
> lost in IoCtxImpl's aio_read overloads. Maybe it can be solved by
> modifying IoCtxImpl.cc:
>
> * removing line 615. Not sure why the AioCompletionImpl needs to know
> anything about buffers?
> * replacing '&c->bl' with 'pbl' on line 619 of IoCtxImpl.cc, making
> the call to the objecter
>
> objecter->read(oid, oloc,
> off, len, snapid, pbl, 0,
> onack, &c->objver);
>
> In that way, the bufferlist is passed through, and not thrown away.
>
> Thanks,
>
> Rutger
>
FWIW, it works for me.
Cheers,
Rutger