* Rados and user-provided buffers
From: Rutger ter Borg @ 2013-09-18 19:57 UTC
To: ceph-devel

Dear all,

I've a question regarding buffers in rados (using the C++ API). I'm allocating
and using my own buffers, and would like to read and write directly into and
from them. I'm using a bufferlist consisting of static_buffers, which are
passed to Rados' aio_read and aio_write.

For aio_write, rados works as expected, i.e., the bufferlist is returned as it
was before the call. However, when doing aio_read, it seems that the
bufferlist is destroyed (not used) in the call, despite all the buffers being
static.

Is this expected behaviour? I read a thread "read/write on RADOS using
external buffer" from this mailing list from 2010, but wasn't able to figure
out whether rados does or doesn't support reading into user-provided
static_buffers.

Thanks in advance,

Cheers,

Rutger

* Re: Rados and user-provided buffers
From: Sage Weil @ 2013-09-18 20:01 UTC
To: Rutger ter Borg; +Cc: ceph-devel

On Wed, 18 Sep 2013, Rutger ter Borg wrote:
> Dear all,
>
> I've a question regarding buffers in rados (using the C++ API). I'm allocating
> and using my own buffers, and would like to read and write directly into and
> from them. I'm using a bufferlist consisting of static_buffers, which are
> passed to Rados' aio_read and aio_write.
>
> For aio_write, rados works as expected, i.e., the bufferlist is returned as it
> was before the call. However, when doing aio_read, it seems that the
> bufferlist is destroyed (not used) in the call, despite all the buffers being
> static.
>
> Is this expected behaviour? I read a thread "read/write on RADOS using
> external buffer" from this mailing list from 2010, but wasn't able to figure
> out whether rados does or doesn't support reading into user-provided
> static_buffers.

The read-into-existing-buffer is only wired up properly for the C
interface. For the C++ it isn't generally necessary: we allocate and read
the data off the network, and pass the reference directly back to the user
without making another copy. The 2010 thread is about similarly avoiding
such a copy for the C API. We didn't contemplate the situation where you
specifically want the bytes to go to a particular address via C++. If
that's what you need, the C++ API needs to be extended, or you can just
use the C call for that case.

sage

* Re: Rados and user-provided buffers
From: Rutger ter Borg @ 2013-09-18 20:31 UTC
To: ceph-devel

On 2013-09-18 22:01, Sage Weil wrote:
>
> The read-into-existing-buffer is only wired up properly for the C
> interface. For the C++ it isn't generally necessary: we allocate and read
> the data off the network, and pass the reference directly back to the user
> without making another copy. The 2010 thread is about similarly avoiding
> such a copy for the C API. We didn't contemplate the situation where you
> specifically want the bytes to go to a particular address via C++. If
> that's what you need, the C++ API needs to be extended, or you can just
> use the C call for that case.
>
> sage
>

Hey Sage,

my particular use case is a pager that uses Rados as a backend. Striping of
pages works identically to the striping mechanism of Ceph. Reads and writes of
multiple pages may be combined into one aio_ call with one bufferlist. Pages
are allocated by the pager.

AFAICT, the C call provides reading into a contiguous buffer, whereas I would
like to read into a bufferlist. What would need to be done to add support for
this in rados?

Thanks,

Rutger

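As a stopgap for the pager case described above, one option is to read into a
single contiguous buffer (which the C call supports) and then scatter the bytes
into the individual pages. The sketch below shows only that scatter step in
plain C++. `PageRef` and `scatter_into_pages` are hypothetical names and are
not part of librados, and the extra copy it performs is exactly what reading
straight into a bufferlist would avoid.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical page descriptor: the memory is owned by the pager.
struct PageRef {
    char*       data;
    std::size_t len;
};

// Copy src_len bytes from one contiguous buffer into the pages, in order.
// Stops when the source is exhausted or the pages are full.
// Returns the number of bytes actually copied.
std::size_t scatter_into_pages(const char* src, std::size_t src_len,
                               const std::vector<PageRef>& pages) {
    assert(src != nullptr || src_len == 0);
    std::size_t copied = 0;
    for (const PageRef& page : pages) {
        if (copied == src_len) {
            break;
        }
        const std::size_t n = std::min(page.len, src_len - copied);
        std::memcpy(page.data, src + copied, n);
        copied += n;
    }
    return copied;
}
```

If the return value is smaller than the number of bytes read, the pages were
too small for the data and the caller has to decide how to handle the rest.
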
* Re: Rados and user-provided buffers
From: Sage Weil @ 2013-09-18 20:52 UTC
To: Rutger ter Borg; +Cc: ceph-devel

On Wed, 18 Sep 2013, Rutger ter Borg wrote:
> On 2013-09-18 22:01, Sage Weil wrote:
> >
> > The read-into-existing-buffer is only wired up properly for the C
> > interface. For the C++ it isn't generally necessary: we allocate and read
> > the data off the network, and pass the reference directly back to the user
> > without making another copy. The 2010 thread is about similarly avoiding
> > such a copy for the C API. We didn't contemplate the situation where you
> > specifically want the bytes to go to a particular address via C++. If
> > that's what you need, the C++ API needs to be extended, or you can just
> > use the C call for that case.
> >
> > sage
> >
>
> Hey Sage,
>
> my particular use case is a pager that uses Rados as a backend. Striping of
> pages works identical to the striping mechanism of Ceph. Reads and writes of
> multiple pages may be combined into one aio_ call with one bufferlist. Pages
> are allocated by the pager.
>
> AFAICT, the C call provides reading into a contiguous buffer, whereas I would
> like to read into a bufferlist. What would need to be done to add support for
> this in rados?

Hmm, looking at the code, I'm surprised that this isn't working. The C
aio_read call is just doing

    bufferlist bl;
    bufferptr bp = buffer::create_static(len, buf);
    bl.push_back(bp);

    ret = ctx->read(oid, bl, len, off);
    if (ret >= 0) {
      if (bl.length() > len)
        return -ERANGE;
      if (bl.c_str() != buf)
        bl.copy(0, bl.length(), buf);

My guess is the rx_buffers machinery is broken and we are triggering that
bl.copy() all the time.

In principle, what is supposed to happen:

- the outbl is passed into Objecter and associated with the request.

- in Objecter::send_op(), we do

    if (op->outbl && op->outbl->length()) {
      ldout(cct, 20) << " posting rx buffer for " << op->tid
                     << " on " << op->session->con << dendl;
      op->con = op->session->con;
      op->con->post_rx_buffer(op->tid, *op->outbl);
    }

- in msg/Pipe.cc when we are reading a message, we find that bufferlist
  and use it directly instead of allocating a new one.

    connection_state->lock.Lock();
    map<tid_t,pair<bufferlist,int> >::iterator p =
      connection_state->rx_buffers.find(header.tid);
    if (p != connection_state->rx_buffers.end()) {
      if (rxbuf.length() == 0 || p->second.second != rxbuf_version) {
        ldout(msgr->cct,10) << "reader seleting rx buffer v "
                            << p->second.second << " at offset " << offset
                            << " len " << p->second.first.length() << dendl;
    ...

As a first step I would 'debug objecter = 20' and 'debug ms = 20' and see
if you see those debug messages going by for a single read request.

sage

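The three steps above can be pictured with a small stand-alone model: a buffer
is posted under a transaction id, and the receive path looks it up and writes
into it, falling back to its own allocation when nothing suitable was posted.
This is only a toy sketch in standard C++. `ToyConnection`, `RxBuffer` and
`receive` are hypothetical names, and the real messenger code handles locking,
buffer versions and partial reads, all of which are left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

// Caller-owned memory registered for an expected reply.
struct RxBuffer {
    char*       data;
    std::size_t len;
};

class ToyConnection {
public:
    // Remember where the reply for `tid` should land
    // (analogous in spirit to post_rx_buffer() in the thread).
    void post_rx_buffer(std::uint64_t tid, char* data, std::size_t len) {
        assert(data != nullptr);
        rx_buffers_[tid] = RxBuffer{data, len};
    }

    // Deliver a reply payload; returns the address where the bytes ended up.
    const char* receive(std::uint64_t tid, const char* payload, std::size_t len) {
        auto it = rx_buffers_.find(tid);
        if (it != rx_buffers_.end() && it->second.len >= len) {
            // A posted buffer exists and is large enough: fill it in place.
            char* dst = it->second.data;
            std::memcpy(dst, payload, len);
            rx_buffers_.erase(it);
            return dst;
        }
        // Nothing usable was posted: fall back to storage owned by the connection.
        fallback_.assign(payload, payload + len);
        return fallback_.data();
    }

private:
    std::map<std::uint64_t, RxBuffer> rx_buffers_;
    std::vector<char>                 fallback_;
};
```

In this model, a caller can tell whether the zero-copy path was taken by
comparing the returned address with the buffer it posted, which is the same
kind of check as the `bl.c_str() != buf` test in the C read path quoted above.
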
* Re: Rados and user-provided buffers
From: Rutger ter Borg @ 2013-09-19 9:28 UTC
To: ceph-devel

On 2013-09-18 22:52, Sage Weil wrote:
> Hmm, looking at the code, I'm surprised that this isn't working. The C
> aio_read call is just doing
>
>     bufferlist bl;
>     bufferptr bp = buffer::create_static(len, buf);
>     bl.push_back(bp);
>
>     ret = ctx->read(oid, bl, len, off);
>     if (ret >= 0) {
>       if (bl.length() > len)
>         return -ERANGE;
>       if (bl.c_str() != buf)
>         bl.copy(0, bl.length(), buf);

Hey Sage,

thanks for the hints. You're citing the synchronous version, rados_read,
not rados_aio_read. The difference between rados_read and rados_aio_read
(the C versions) is that rados_read uses a bufferlist, and rados_aio_read
uses an overload with a char* buf. Both are delegated to IoCtxImpl.
IoCtxImpl has two overloads for aio_read, one accepting a bufferlist and
one accepting a char*, but only one overload for read, accepting a
bufferlist only.

IoCtxImpl's overloads for aio_read are almost identical; the difference
is that the bufferlist overload sets a bufferlist on AioCompletionImpl* c,

    c->pbl = pbl;

and the char* buffer overload sets a buffer

    c->buf = buf;

AioCompletionImpl contains multiple data members: a bufferlist (bl), a
pointer to a bufferlist (pbl), and a pointer to a character array (buf).
The following calls (in IoCtxImpl's aio_read overloads) to the objecter
are identical:

    objecter->read(oid, oloc,
                   off, len, snapid, &c->bl, 0,
                   onack, &c->objver);

Looking at the Objecter read and deeper in the call chain, it seems that
information about data member pbl in AioCompletionImpl* is lost. The
objecter only knows about a Context, not about AioCompletionImpl. The
user-provided buffer is not passed on.

My preliminary conclusion is that my problem is caused by information
lost in IoCtxImpl's aio_read overloads. Maybe it can be solved by
modifying IoCtxImpl.cc:

  * removing line 615. Not sure why the AioCompletionImpl needs to know
    anything about buffers?
  * replacing '&c->bl' with 'pbl' on line 619 of IoCtxImpl.cc, making
    the call to the objecter

    objecter->read(oid, oloc,
                   off, len, snapid, pbl, 0,
                   onack, &c->objver);

In that way, the bufferlist is passed through, and not thrown away.

Thanks,

Rutger

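The effect of the proposed change can be illustrated with a toy model in
standard C++. Nothing below is librados code: `ToyBufferList`,
`ToyCompletion`, `toy_objecter_read`, `aio_read_before` and `aio_read_after`
are hypothetical stand-ins, and the model assumes that a reader fills
caller-owned segments in place when the list it is handed has any, and
allocates its own storage otherwise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

struct Segment {
    char*       data;  // caller-owned memory
    std::size_t len;
};

struct ToyBufferList {
    std::vector<Segment> segments;  // static, caller-owned pieces (may be empty)
    std::vector<char>    owned;     // storage allocated by the reader
};

struct ToyCompletion {
    ToyBufferList  bl;             // list owned by the completion
    ToyBufferList* pbl = nullptr;  // list supplied by the caller
};

// Stand-in for the objecter read: fills whichever list it is handed.
void toy_objecter_read(const std::string& src, ToyBufferList* out) {
    assert(out != nullptr);
    if (out->segments.empty()) {
        out->owned.assign(src.begin(), src.end());  // no posted memory: allocate
        return;
    }
    std::size_t off = 0;
    for (const Segment& s : out->segments) {
        if (off == src.size()) break;
        const std::size_t n = std::min(s.len, src.size() - off);
        std::memcpy(s.data, src.data() + off, n);  // fill caller memory in place
        off += n;
    }
}

// Behaviour described in the thread: pbl is stored, but &c.bl is passed down.
void aio_read_before(ToyCompletion& c, ToyBufferList* pbl, const std::string& src) {
    c.pbl = pbl;
    toy_objecter_read(src, &c.bl);
}

// Proposed change: pass the caller's list straight through.
void aio_read_after(ToyCompletion& c, ToyBufferList* pbl, const std::string& src) {
    (void)c;  // the completion no longer needs to know about the buffer
    toy_objecter_read(src, pbl);
}
```

With `aio_read_before` the caller's page is left untouched and the data ends
up in the completion's own list; with `aio_read_after` the page is filled
directly.
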
* Re: Rados and user-provided buffers
From: Rutger ter Borg @ 2013-09-19 12:39 UTC
To: ceph-devel

On 2013-09-19 11:28, Rutger ter Borg wrote:
>
> Hey Sage,
>
> thanks for the hints. You're citing the synchronous version, rados_read,
> not rados_aio_read. The difference between rados_read and rados_aio_read
> (the C versions) is that rados_read uses a bufferlist, and rados_aio_read
> uses an overload with a char* buf. Both are delegated to IoCtxImpl.
> IoCtxImpl has two overloads for aio_read, one accepting a bufferlist and
> one accepting a char*, but only one overload for read, accepting a
> bufferlist only.
>
> IoCtxImpl's overloads for aio_read are almost identical; the difference
> is that the bufferlist overload sets a bufferlist on AioCompletionImpl* c,
>
>     c->pbl = pbl;
>
> and the char* buffer overload sets a buffer
>
>     c->buf = buf;
>
> AioCompletionImpl contains multiple data members: a bufferlist (bl), a
> pointer to a bufferlist (pbl), and a pointer to a character array (buf).
> The following calls (in IoCtxImpl's aio_read overloads) to the objecter
> are identical:
>
>     objecter->read(oid, oloc,
>                    off, len, snapid, &c->bl, 0,
>                    onack, &c->objver);
>
> Looking at the Objecter read and deeper in the call chain, it seems that
> information about data member pbl in AioCompletionImpl* is lost. The
> objecter only knows about a Context, not about AioCompletionImpl. The
> user-provided buffer is not passed on.
>
> My preliminary conclusion is that my problem is caused by information
> lost in IoCtxImpl's aio_read overloads. Maybe it can be solved by
> modifying IoCtxImpl.cc:
>
>   * removing line 615. Not sure why the AioCompletionImpl needs to know
>     anything about buffers?
>   * replacing '&c->bl' with 'pbl' on line 619 of IoCtxImpl.cc, making
>     the call to the objecter
>
>     objecter->read(oid, oloc,
>                    off, len, snapid, pbl, 0,
>                    onack, &c->objver);
>
> In that way, the bufferlist is passed through, and not thrown away.
>
> Thanks,
>
> Rutger
>

FWIW, it works for me.

Cheers,

Rutger