* Re: Compare And Write against unwritten ranges
From: David Disseldorp @ 2016-07-27 12:57 UTC
To: Mike Christie; +Cc: ceph-devel@vger.kernel.org
Thanks for the feedback, Mike...
On Tue, 26 Jul 2016 13:29:03 -0500, Mike Christie wrote:
> On 07/26/2016 07:14 AM, David Disseldorp wrote:
> > Hi Mike,
> >
> > Returning to the OSD cmpext functionality in
> > https://github.com/ceph/ceph/pull/8911 , I'm wondering how such
> > requests should be handled against unwritten ranges.
> >
> > Currently an OSD will return -EINVAL to the client, as the short read
> > will be caught via:
> > https://github.com/ceph/ceph/pull/8911/commits/440895ea9f2604756c9f3c81e5c4ec5ca40401d7#diff-72747d40a424e7b5404366b557ff12a3R3722
> > -EINVAL then means that krbd will return an error for the corresponding
> > client I/O.
> >
> > For read requests, rbd_img_obj_request_read_callback() handles
> > zero-filling read buffers that cover unwritten RBD ranges. For SCSI
> > Compare And Write the OSD is responsible for atomicity, so zero-filling
> > on the client side is problematic.
> > One potential option could be to add a truncate/zero operation to the
> > Compare And Write compound request, or optionally support truncate_seq
> > and truncate_size parameters in cmpext. Any thoughts/suggestions on the
> > approach here?
>
> We have a similar problem if the data needed to be copyup'd, right? I

Similar, but different: copyup should be handled via the
rbd_img_obj_parent_read_full() logic in rbd_img_obj_request_submit().

> think the multi-op route might be nice because it could work for both cases.
>
> Did you already try the multi op zero/truncate approach? Did you have to
> make changes to the OSD code too?

I'm working on a multi-op prototype now, and will send you the patches
when done. I don't expect any changes on the OSD side.

> A long while back, I was working on the copyup part of the problem but I
> hit another problem. It was something like the copyup's write would
> succeed, but when the cmpext op did the read it would still fail. If I
> sent it down as a multi-op, some other bits/structs on the OSD side
> needed to be updated before I could do the cmpext op. I cannot find the
> patches and I never submitted them because I had just hacked it in for
> testing. Did you hit something similar?

Yeah, I've seen issues with copyup+cmpext+write, but am treating that
as a separate problem for now.

Cheers, David
* Re: Compare And Write against unwritten ranges
From: David Disseldorp @ 2016-07-29 11:09 UTC
To: Mike Christie; +Cc: ceph-devel@vger.kernel.org, Josh Durgin
On Tue, 26 Jul 2016 13:29:03 -0500, Mike Christie wrote:
> Did you already try the multi op zero/truncate approach? Did you have to
> make changes to the OSD code too?

I'm a little stumped by the OSD handling of these requests once truncate
is added to the mix...
As mentioned, with set-alloc-hint+cmpext(512~512)+write(512~512), the
cmpext/sync_read obtains an empty read buffer against the unwritten
range.
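
To make the short-read case concrete, here is a minimal Python sketch. It is
a toy model and not the OSD code: the names Result, cmpext_strict and
cmpext_zero_padded are invented for illustration. The strict variant mirrors
the behaviour described earlier in the thread (a short read is rejected),
while the zero-padded variant only shows what a compare would look like if
unwritten bytes were treated as zeros, as the client-side read path does.

```python
from enum import Enum


class Result(Enum):
    """Hypothetical outcome labels for this sketch only."""
    MATCH = "match"
    MISMATCH = "mismatch"
    SHORT_READ = "short read (reported as -EINVAL in the thread)"


def cmpext_strict(stored: bytes, offset: int, expected: bytes) -> Result:
    """Compare against stored data; a read shorter than `expected` fails."""
    got = stored[offset:offset + len(expected)]
    if len(got) != len(expected):
        return Result.SHORT_READ
    return Result.MATCH if got == expected else Result.MISMATCH


def cmpext_zero_padded(stored: bytes, offset: int, expected: bytes) -> Result:
    """Compare after padding a short read with zeros up to len(expected)."""
    got = stored[offset:offset + len(expected)]
    got = got + bytes(len(expected) - len(got))  # unwritten bytes read as zero
    return Result.MATCH if got == expected else Result.MISMATCH


# An object with nothing written at 512~512 yields an empty read buffer.
unwritten = b""
print(cmpext_strict(unwritten, 512, bytes(512)))
print(cmpext_zero_padded(unwritten, 512, bytes(512)))
```

In this model the strict compare against the unwritten range reports a short
read, whereas the zero-padded compare matches an all-zero expected buffer.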
With set-alloc-hint+truncate(4194304)+cmpext(512~512)+write(512~512),
the cmpext/sync_read gets ENOENT from the filestore. The truncate
immediately prior doesn't appear to hit the filestore - vstart logs
below.
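
One possible reading of this, offered as an assumption rather than a
confirmed diagnosis: if the mutating ops in a compound request are only
staged until the whole request succeeds, while the cmpext read goes straight
to the backing store, then the read cannot see the preceding truncate. The
Python sketch below is a toy model of that assumption only. ToyStore,
run_compound and TOY_MISMATCH are hypothetical names and do not correspond to
any Ceph interface.

```python
import errno

TOY_MISMATCH = -9999  # hypothetical placeholder, not a real OSD return code


class ToyStore:
    """Stand-in for the backing store: object name -> bytearray."""

    def __init__(self):
        self.objects = {}

    def read(self, name, offset, length):
        if name not in self.objects:
            return -errno.ENOENT, b""
        return 0, bytes(self.objects[name][offset:offset + length])


def run_compound(store, name, ops):
    """Run ops in order. Assumption: truncate/write are staged and applied
    only if every op succeeds; cmpext reads the store directly."""
    staged = []
    for op in ops:
        if op[0] in ("truncate", "write"):
            staged.append(op)
            continue
        _, offset, expected = op  # ("cmpext", offset, expected_bytes)
        ret, got = store.read(name, offset, len(expected))
        if ret < 0:
            return ret  # e.g. -ENOENT: the staged truncate is not visible
        if len(got) != len(expected):
            return -errno.EINVAL  # short read
        if got != expected:
            return TOY_MISMATCH
    for op in staged:  # every op succeeded, so apply the staged mutations
        buf = store.objects.setdefault(name, bytearray())
        if op[0] == "truncate":
            size = op[1]
            if len(buf) < size:
                buf.extend(bytes(size - len(buf)))
            else:
                del buf[size:]
        else:
            _, offset, data = op
            end = offset + len(data)
            if len(buf) < end:
                buf.extend(bytes(end - len(buf)))
            buf[offset:end] = data
    return 0


store = ToyStore()
ops = [("truncate", 4194304),
       ("cmpext", 512, bytes(512)),
       ("write", 512, b"\x01" * 512)]
print(run_compound(store, "rbd_data.example", ops))
```

Under this assumption the request against a nonexistent object fails with
-ENOENT and nothing is created, which matches the shape of the log but does
not prove that this is what the OSD is doing.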
Cheers, David
7fd9bbfff700 10 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] do_op 0:841a7acf:::rbd_data.100e74b0dc51.0000000000000000:head [set-alloc-hint object_size 4194304 write_size 4194304,truncate 4194304,cmpext 512~512,write 512~512] ov 0'0 av 11'1 snapc 0=[] snapset 0=[]:[]
7fd9bbfff700 10 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] taking ondisk_read_lock
7fd9bbfff700 10 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] do_osd_op 0:841a7acf:::rbd_data.100e74b0dc51.0000000000000000:head [set-alloc-hint object_size 4194304 write_size 4194304,truncate 4194304,cmpext 512~512,write 512~512]
7fd9bbfff700 10 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] do_osd_op set-alloc-hint object_size 4194304 write_size 4194304
7fd9bbfff700 10 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] do_osd_op truncate 4194304
7fd9bbfff700 10 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] do_osd_op cmpext 512~512
7fd9bbfff700 10 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] do_osd_op 0:841a7acf:::rbd_data.100e74b0dc51.0000000000000000:head [sync_read 512~512]
7fd9bbfff700 10 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] do_osd_op sync_read 512~512
7fd9bbfff700 15 filestore(/home/ddiss/isms/ceph/src/dev/osd2) read 0.1_head/#0:841a7acf:::rbd_data.100e74b0dc51.0000000000000000:head# 512~512
7fd9bbfff700 10 filestore(/home/ddiss/isms/ceph/src/dev/osd2) error opening file /home/ddiss/isms/ceph/src/dev/osd2/current/0.1_head/rbd\udata.100e74b0dc51.0000000000000000__head_F35E5821__0 with flags=2: (2) No such file or directory
7fd9bbfff700 10 filestore(/home/ddiss/isms/ceph/src/dev/osd2) FileStore::read(0.1_head/#0:841a7acf:::rbd_data.100e74b0dc51.0000000000000000:head#) open error: (2) No such file or directory
7fd9bbfff700 10 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] read got -2 / 0 bytes from obj 0:841a7acf:::rbd_data.100e74b0dc51.0000000000000000:head
7fd9bbfff700 -1 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] do_extent_cmp do_osd_ops failed -2
7fd9bbfff700 10 osd.2 pg_epoch: 11 pg[0.1( empty local-les=9 n=0 ec=1 les/c/f 9/9/0 8/8/8) [2,0,1] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] dropping ondisk_read_lock
7fd9bbfff700 1 -- 192.168.155.1:6808/7807 --> 192.168.155.101:0/3185149525 -- osd_op_reply(78 rbd_data.100e74b0dc51.0000000000000000 [set-alloc-hint object_size 4194304 write_size 4194304,truncate 4194304,cmpext 512~512,write 512~512] v0'0 uv0 ondisk = -2 ((2) No such file or directory)) v7 -- ?+0 0x7fd9d002bde0 con 0x7fda08006230