* Re: Testing latest linux-next 4.9 ib_srp and ib_srpt
[not found] ` <1928955380.8220327.1482302041315.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-12-22 2:10 ` Laurence Oberman
[not found] ` <1337539588.8422488.1482372617094.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Laurence Oberman @ 2016-12-22 2:10 UTC (permalink / raw)
To: Bart Van Assche, Christoph Hellwig
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
----- Original Message -----
> From: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> To: "Bart Van Assche" <Bart.VanAssche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Sent: Wednesday, December 21, 2016 1:34:01 AM
> Subject: Re: Testing latest linux-next 4.9 ib_srp and ib_srpt
>
>
>
> ----- Original Message -----
> > From: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > To: "Bart Van Assche" <Bart.VanAssche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> > Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > Sent: Tuesday, December 20, 2016 10:31:34 PM
> > Subject: Re: Testing latest linux-next 4.9 ib_srp and ib_srpt
> >
> >
> >
> > ----- Original Message -----
> > > From: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > > To: "Bart Van Assche" <Bart.VanAssche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> > > Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > Sent: Tuesday, December 20, 2016 3:44:42 PM
> > > Subject: Re: Testing latest linux-next 4.9 ib_srp and ib_srpt
> > >
> > >
> > >
> > > ----- Original Message -----
> > > > From: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > > > To: "Bart Van Assche" <Bart.VanAssche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> > > > Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > > Sent: Tuesday, December 20, 2016 2:43:48 PM
> > > > Subject: Re: Testing latest linux-next 4.9 ib_srp and ib_srpt
> > > >
> > > >
> > > >
> > > > ----- Original Message -----
> > > > > From: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > > > > To: "Bart Van Assche" <Bart.VanAssche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> > > > > Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > > > Sent: Tuesday, December 20, 2016 2:33:26 PM
> > > > > Subject: Re: Testing latest linux-next 4.9 ib_srp and ib_srpt
> > > > >
> > > > > Hello Bart
> > > > >
> > > > > I pulled the latest linux-next and built kernels for both server and
> > > > > client
> > > > > to rerun all my EDR tests for srp.
> > > > >
> > > > > For some reason the I/O size is being capped again to 1MB in my
> > > > > testing.
> > > > > Using my same testbed.
> > > > > Remember we spent a lot of time making sure we could do 4MB I/O :)
> > > > >
> > > > > Its working fine in the RHEL 7.3 kernel so before I start going back
> > > > > testing
> > > > > upstream kernels decided to ask.
> > > > >
> > > > > Have you tested large I/O with latest linux-next
> > > > >
> > > > > Server Configuration
> > > > > ---------------------
> > > > > Linux fedstorage.bos.redhat.com 4.9.0+
> > > > >
> > > > > [root@fedstorage modprobe.d]# cat ib_srp.conf
> > > > > options ib_srp cmd_sg_entries=64 indirect_sg_entries=2048
> > > > >
> > > > > [root@fedstorage modprobe.d]# cat ib_srpt.conf
> > > > > options ib_srpt srp_max_req_size=8296
> > > > >
> > > > > Also Using
> > > > >
> > > > > # Set the srp_sq_size
> > > > > for i in
> > > > > /sys/kernel/config/target/srpt/0xfe800000000000007cfe900300726e4e
> > > > > /sys/kernel/config/target/srpt/0xfe800000000000007cfe900300726e4f
> > > > > do
> > > > > echo 16384 > $i/tpgt_1/attrib/srp_sq_size
> > > > > done
> > > > >
> > > > > Client Configuration
> > > > > --------------------
> > > > > Linux ibclient 4.9.0+
> > > > >
> > > > > [root@ibclient modprobe.d]# cat ib_srp.conf
> > > > > options ib_srp cmd_sg_entries=255 indirect_sg_entries=2048
> > > > >
> > > > > dd if=/dev/sdw bs=4096k of=/dev/null iflag=direct
> > > > >
> > > > > ### RECORD 4 >>> ibclient <<< (1482261733.001) (Tue Dec 20
> > > > > 14:22:13
> > > > > 2016)
> > > > > ###
> > > > > # DISK STATISTICS (/sec)
> > > > > #
> > > > > <---------reads---------------><---------writes--------------><--------averages-------->
> > > > > Pct
> > > > > #Time Name KBytes Merged IOs Size Wait KBytes Merged
> > > > > IOs
> > > > > Size
> > > > > Wait RWSize QLen Wait SvcTim Util
> > > > > 14:22:13 sdw 1373184 0 1341 1024 2 0 0
> > > > > 0
> > > > > 0
> > > > > 0 1024 3 2 0 97
> > > > >
> > > > >
> > > > > If I reboot into my 7.3 kernel its back to what I expect
> > > > >
> > > > > dd if=/dev/sdw bs=4096k of=/dev/null iflag=direct
> > > > >
> > > > >
> > > > > ### RECORD 3 >>> ibclient <<< (1482262254.001) (Tue Dec 20
> > > > > 14:30:54
> > > > > 2016)
> > > > > ###
> > > > > # DISK STATISTICS (/sec)
> > > > > #
> > > > > <---------reads---------------><---------writes--------------><--------averages-------->
> > > > > Pct
> > > > > #Time Name KBytes Merged IOs Size Wait KBytes Merged
> > > > > IOs
> > > > > Size
> > > > > Wait RWSize QLen Wait SvcTim Util
> > > > > 14:30:54 sdw 172032 129 42 4096 3 0 0
> > > > > 0
> > > > > 0
> > > > > 0 4096 1 3 3 130
> > > > >
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-rdma"
> > > > > in
> > > > > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > >
> > > >
> > > > Hi Bart,
> > > >
> > > > Just FYI
> > > >
> > > > That dd snap was just as I had stopped the dd.
> > > >
> > > > Here is a stable dd snap with the RHEL kernel, I just noticed the
> > > > merging,
> > > > need to reboot back into upstream to compare again.
> > > > No merging seen in upstream.
> > > >
> > > > Thanks
> > > > Laurence
> > > >
> > > > ### RECORD 6 >>> ibclient <<< (1482262723.001) (Tue Dec 20 14:38:43
> > > > 2016)
> > > > ###
> > > > # DISK STATISTICS (/sec)
> > > > #
> > > > <---------reads---------------><---------writes--------------><--------averages-------->
> > > > Pct
> > > > #Time Name KBytes Merged IOs Size Wait KBytes Merged IOs
> > > > Size
> > > > Wait RWSize QLen Wait SvcTim Util
> > > > 14:38:43 sdw 1200128 879 293 4096 3 0 0 0
> > > > 0
> > > > 0 4096 1 3 3 95
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-rdma"
> > > > in
> > > > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > >
> > >
> > > Replying to my own message to keep the thread going
> > >
> > >
> > > This is Linux ibclient 4.8.0-rc4
> > >
> > > Behaves like I expected, and I see the 4MB I/O sizes. as I had already
> > > tested
> > > this.
> > >
> > > dd if=/dev/sdw bs=4096k of=/dev/null iflag=direct
> > >
> > >
> > > ### RECORD 6 >>> ibclient <<< (1482266543.001) (Tue Dec 20 15:42:23
> > > 2016)
> > > ###
> > > # DISK STATISTICS (/sec)
> > > #
> > > <---------reads---------------><---------writes--------------><--------averages-------->
> > > Pct
> > > #Time Name KBytes Merged IOs Size Wait KBytes Merged IOs
> > > Size
> > > Wait RWSize QLen Wait SvcTim Util
> > > 15:42:23 sdw 278528 201 68 4096 2 0 0 0
> > > 0
> > > 0 4096 1 2 2 206
> > >
> > > Rebooting back into 4.9 and will bisect leading to it once I confirm
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > >
> >
> > So this is where I got to here:
> >
> > These all worked and I get the 4MB I/O size with direct I/O
> >
> > v4.8-rc4
> > v4.8-rc5
> > v4.8-rc6
> > v4.8-rc7
> > v4.8-rc8
> > v4.9
> > v4.9-rc1
> > v4.9-rc2
> > v4.9-rc3
> > v4.9-rc4
> > v4.9-rc5
> > v4.9-rc6
> > v4.9-rc7
> > v4.9-rc8
> >
> > Then git checkout master and build final test kernel
> > 4.9.0+
> >
> > This one fails
> >
> > # DISK STATISTICS (/sec)
> > #
> > <---------reads---------------><---------writes--------------><--------averages-------->
> > Pct
> > #Time Name KBytes Merged IOs Size Wait KBytes Merged IOs Size
> > Wait RWSize QLen Wait SvcTim Util
> > 22:12:48 sdw 1413120 0 1380 1024 2 0 0 0
> > 0
> > 0 1024 3 2 0 99
> > 22:12:49 sdw 1409024 0 1376 1024 2 0 0 0
> > 0
> > 0 1024 3 2 0 98
> > 22:12:50 sdw 1445888 0 1412 1024 2 0 0 0
> > 0
> > 0 1024 3 2 0 98
> > 22:12:51 sdw 1429504 0 1396 1024 2 0 0 0
> > 0
> > 0 1024 3 2 0 98
> > 22:12:52 sdw 1426432 0 1393 1024 2 0 0 0
> > 0
> > 0 1024 3 2 0 98
> > 22:12:53 sdw 1408000 0 1375 1024 2 0 0 0
> > 0
> > 0 1024 3 2 0 98
> > *** ****
> > ****
> >
> > The last commit for 4.9-rc8 was 3e5de27
> >
> > Between this and the master checkout HEAD there are 2313 lines of commits
> > This was the giant merge.
> >
> > Something broke here.
> > I guess I will have to bisect the hard way, unless somebody realizes which
> > of
> > the commits broke it.
> >
> > Commits for SRP, none of these look like candidates to break the I/O size
> > and
> > merge
> >
> > 9032ad7 Merge branches 'misc', 'qedr', 'reject-helpers', 'rxe' and 'srp'
> > into
> > merge-test
> > 4fa354c IB/srp: Make writing the add_target sysfs attr interruptible
> > 290081b IB/srp: Make mapping failures easier to debug
> > 3787d99 IB/srp: Make login failures easier to debug
> > 042dd76 IB/srp: Introduce a local variable in srp_add_one()
> > 1a1faf7 IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build
> > 0d38c24 IB/srpt: Report login failures only once
> >
> > dm related, however I am testing against an sd, not dm device here
> >
> > ef548c5 dm flakey: introduce "error_writes" feature
> > e99dda8f dm cache policy smq: use hash_32() instead of hash_32_generic()
> > 027c431 dm crypt: reject key strings containing whitespace chars
> > b446396 dm space map: always set ev if sm_ll_mutate() succeeds
> > 0c79ce0 dm space map metadata: skip useless memcpy in
> > metadata_ll_init_index()
> > 314c25c dm space map metadata: fix 'struct sm_metadata' leak on failed
> > create
> > 58fc4fe Documentation: dm raid: define data_offset status field
> > 11e2968 dm raid: fix discard support regression
> > affa9d2 dm raid: don't allow "write behind" with raid4/5/6
> > 54cd640 dm mpath: use hw_handler_params if attached hw_handler is same as
> > requested
> > c538f6e dm crypt: add ability to use keys from the kernel key retention
> > service
> > 0637018 dm array: remove a dead assignment in populate_ablock_with_values()
> > 6080758 dm ioctl: use offsetof() instead of open-coding it
> > b23df0d dm rq: simplify use_blk_mq initialization
> > 41c73a4 dm bufio: drop the lock when doing GFP_NOIO allocation
> > d12067f dm bufio: don't take the lock in dm_bufio_shrink_count
> > 9ea61ca dm bufio: avoid sleeping while holding the dm_bufio lock
> > 5b8c01f dm table: simplify dm_table_determine_type()
> > 301fc3f dm table: an 'all_blk_mq' table must be loaded for a blk-mq DM
> > device
> > 6936c12 dm table: fix 'all_blk_mq' inconsistency when an empty table is
> > loaded
> >
> > I have time next week so will try narrow it down via bisect by commit
> >
> > Thanks
> > Laurence
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
>
> Started with bisect at > 6000 revisions and 13 cycles between v4.9-rc8 and
> master.
> bisect good for v4.9-rc8, bisect bad for master HEAD
>
> Down to around 700 now, but its late, going to bed now so will finish
> tomorrow
>
> Thanks
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Hi Bart, Christoph
After multiple bisects (6000 revisions, 13 cycles), I got to this one.
Of course there are a huge amount of block layer changes as we know in rc10.
[loberman@ibclient linux-next.orig]$ git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[542ff7bf18c63cf403e36a4a1c71d86dc120d924] block: new direct I/O implementation
This commit is the one that seems to have changed the behavior.
Max I/O size restricted 1MB, even when 4MB I/O is requested, no merging seen.
This is not going to only affect SRP targets.
I will review the code and will be happy to test any patches.
I will leave the test bed in place.
commit 542ff7bf18c63cf403e36a4a1c71d86dc120d924
Author: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Date: Wed Nov 16 23:14:22 2016 -0700
block: new direct I/O implementation
Similar to the simple fast path, but we now need a dio structure to
track multiple-bio completions. It's basically a cut-down version
of the new iomap-based direct I/O code for filesystems, but without
all the logic to call into the filesystem for extent lookup or
allocation, and without the complex I/O completion workqueue handler
for AIO - instead we just use the FUA bit on the bios to ensure
data is flushed to stable storage.
Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Signed-off-by: Jens Axboe <axboe-b10kYP2dOMg@public.gmane.org>
diff --git a/fs/block_dev.c b/fs/block_dev.c
index a1b9abe..35cc494 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -270,11 +270,161 @@ static void blkdev_bio_end_io_simple(struct bio *bio)
return ret;
}
+struct blkdev_dio {
+ union {
+ struct kiocb *iocb;
+ struct task_struct *waiter;
+ };
+ size_t size;
+ atomic_t ref;
+ bool multi_bio : 1;
+ bool should_dirty : 1;
+ bool is_sync : 1;
+ struct bio bio;
+};
+
+static struct bio_set *blkdev_dio_pool __read_mostly;
+
+static void blkdev_bio_end_io(struct bio *bio)
+{
+ struct blkdev_dio *dio = bio->bi_private;
+ bool should_dirty = dio->should_dirty;
+
+ if (dio->multi_bio && !atomic_dec_and_test(&dio->ref)) {
+ if (bio->bi_error && !dio->bio.bi_error)
+ dio->bio.bi_error = bio->bi_error;
+ } else {
+ if (!dio->is_sync) {
+ struct kiocb *iocb = dio->iocb;
+ ssize_t ret = dio->bio.bi_error;
+
+ if (likely(!ret)) {
+ ret = dio->size;
+ iocb->ki_pos += ret;
+ }
+
+ dio->iocb->ki_complete(iocb, ret, 0);
+ bio_put(&dio->bio);
+ } else {
+ struct task_struct *waiter = dio->waiter;
+
+ WRITE_ONCE(dio->waiter, NULL);
+ wake_up_process(waiter);
+ }
+ }
+
+ if (should_dirty) {
+ bio_check_pages_dirty(bio);
+ } else {
+ struct bio_vec *bvec;
+ int i;
+
+ bio_for_each_segment_all(bvec, bio, i)
+ put_page(bvec->bv_page);
+ bio_put(bio);
+ }
+}
+
static ssize_t
-blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+__blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
{
struct file *file = iocb->ki_filp;
struct inode *inode = bdev_file_inode(file);
+ struct block_device *bdev = I_BDEV(inode);
+ unsigned blkbits = blksize_bits(bdev_logical_block_size(bdev));
+ struct blkdev_dio *dio;
+ struct bio *bio;
+ bool is_read = (iov_iter_rw(iter) == READ);
+ loff_t pos = iocb->ki_pos;
+ blk_qc_t qc = BLK_QC_T_NONE;
+ int ret;
+
+ if ((pos | iov_iter_alignment(iter)) & ((1 << blkbits) - 1))
+ return -EINVAL;
+
+ bio = bio_alloc_bioset(GFP_KERNEL, nr_pages, blkdev_dio_pool);
+ bio_get(bio); /* extra ref for the completion handler */
+
+ dio = container_of(bio, struct blkdev_dio, bio);
+ dio->is_sync = is_sync_kiocb(iocb);
+ if (dio->is_sync)
+ dio->waiter = current;
+ else
+ dio->iocb = iocb;
+
+ dio->size = 0;
+ dio->multi_bio = false;
+ dio->should_dirty = is_read && (iter->type == ITER_IOVEC);
+
+ for (;;) {
+ bio->bi_bdev = bdev;
+ bio->bi_iter.bi_sector = pos >> blkbits;
+ bio->bi_private = dio;
+ bio->bi_end_io = blkdev_bio_end_io;
+
+ ret = bio_iov_iter_get_pages(bio, iter);
+ if (unlikely(ret)) {
+ bio->bi_error = ret;
+ bio_endio(bio);
+ break;
+ }
+
+ if (is_read) {
+ bio->bi_opf = REQ_OP_READ;
+ if (dio->should_dirty)
+ bio_set_pages_dirty(bio);
+ } else {
+ bio->bi_opf = dio_bio_write_op(iocb);
+ task_io_account_write(bio->bi_iter.bi_size);
+ }
+
+ dio->size += bio->bi_iter.bi_size;
+ pos += bio->bi_iter.bi_size;
+
+ nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
+ if (!nr_pages) {
+ qc = submit_bio(bio);
+ break;
+ }
+
+ if (!dio->multi_bio) {
+ dio->multi_bio = true;
+ atomic_set(&dio->ref, 2);
+ } else {
+ atomic_inc(&dio->ref);
+ }
+
+ submit_bio(bio);
+ bio = bio_alloc(GFP_KERNEL, nr_pages);
+ }
+
+ if (!dio->is_sync)
+ return -EIOCBQUEUED;
+
+ for (;;) {
+ set_current_state(TASK_UNINTERRUPTIBLE);
+ if (!READ_ONCE(dio->waiter))
+ break;
+
+ if (!(iocb->ki_flags & IOCB_HIPRI) ||
+ !blk_mq_poll(bdev_get_queue(bdev), qc))
+ io_schedule();
+ }
+ __set_current_state(TASK_RUNNING);
+
+ ret = dio->bio.bi_error;
+ if (likely(!ret)) {
+ ret = dio->size;
+ iocb->ki_pos += ret;
+ }
+
+ bio_put(&dio->bio);
+ return ret;
+}
+
+static ssize_t
+blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
+{
int nr_pages;
nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES + 1);
@@ -282,10 +432,18 @@ static void blkdev_bio_end_io_simple(struct bio *bio)
return 0;
if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
- return __blockdev_direct_IO(iocb, inode, I_BDEV(inode), iter,
- blkdev_get_block, NULL, NULL,
- DIO_SKIP_DIO_COUNT);
+
+ return __blkdev_direct_IO(iocb, iter, min(nr_pages, BIO_MAX_PAGES));
+}
+
+static __init int blkdev_init(void)
+{
+ blkdev_dio_pool = bioset_create(4, offsetof(struct blkdev_dio, bio));
+ if (!blkdev_dio_pool)
+ return -ENOMEM;
+ return 0;
}
+module_init(blkdev_init);
int __sync_blockdev(struct block_device *bdev, int wait)
{
Thanks
Laurence
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: Testing latest linux-next 4.9 ib_srp and ib_srpt
[not found] ` <1337539588.8422488.1482372617094.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-12-22 6:23 ` Christoph Hellwig
[not found] ` <20161222062321.GA30326-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2016-12-22 6:23 UTC (permalink / raw)
To: Laurence Oberman
Cc: Bart Van Assche, Christoph Hellwig,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
I just got CC'ed to a massive chain of full quotes. If you want an
answer from me please write a readable mail, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Testing latest linux-next 4.9 ib_srp and ib_srpt sees I/O capped at 1MB and no merging
[not found] ` <20161222062321.GA30326-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-12-22 13:17 ` Laurence Oberman
[not found] ` <1661819060.8462293.1482412678092.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Laurence Oberman @ 2016-12-22 13:17 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Bart Van Assche, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA
Hello Christoph, apologies, here is a clear summary of the issue.
During testing of the latest linux-next with rc-10 block layer changes I noticed that I/O was being capped at 1MB size and no merging was seen.
The issue was not apparent on 4.8.0-rc8 or earlier.
dd if=/dev/sdw bs=4096k of=/dev/null iflag=direct
### RECORD 6 >>> ibclient <<< (1482266543.001) (Tue Dec 20 15:42:23 2016) ###
# DISK STATISTICS (/sec)
# <---------reads---------------><---------writes--------------><--------averages--------> Pct
#Time Name KBytes Merged IOs Size Wait KBytes Merged IOs Size Wait RWSize QLen Wait SvcTim Util
15:42:23 sdw 278528 201 68 4096 2 0 0 0 0 0 4096 1 2 2 206
Then git checkout master and build final test kernel
4.9.0+
This one clearly shows the I.O at 1MB and no merging.
# DISK STATISTICS (/sec)
# <---------reads---------------><---------writes--------------><--------averages--------> Pct
#Time Name KBytes Merged IOs Size Wait KBytes Merged IOs Size Wait RWSize QLen Wait SvcTim Util
22:12:48 sdw 1413120 0 1380 1024 2 0 0 0 0 0 1024 3 2 0 99
22:12:49 sdw 1409024 0 1376 1024 2 0 0 0 0 0 1024 3 2 0 98
22:12:50 sdw 1445888 0 1412 1024 2 0 0 0 0 0 1024 3 2 0 98
22:12:51 sdw 1429504 0 1396 1024 2 0 0 0 0 0 1024 3 2 0 98
22:12:52 sdw 1426432 0 1393 1024 2 0 0 0 0 0 1024 3 2 0 98
22:12:53 sdw 1408000 0 1375 1024 2 0 0 0 0 0 1024 3 2 0 98
*** **** ****
After multiple bisects (6000 revisions, 13 cycles), I got to this one.
Of course there are a huge amount of block layer changes as we know in rc10.
[loberman@ibclient linux-next.orig]$ git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[542ff7bf18c63cf403e36a4a1c71d86dc120d924] block: new direct I/O implementation
This commit is the one that seems to have changed the behavior.
Max I/O size restricted 1MB, even when 4MB I/O is requested, no merging seen.
This is not going to only affect SRP targets.
I will be happy to test any patches and the test bed is always in place.
commit 542ff7bf18c63cf403e36a4a1c71d86dc120d924
Author: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Date: Wed Nov 16 23:14:22 2016 -0700
block: new direct I/O implementation
Similar to the simple fast path, but we now need a dio structure to
track multiple-bio completions. It's basically a cut-down version
of the new iomap-based direct I/O code for filesystems, but without
all the logic to call into the filesystem for extent lookup or
allocation, and without the complex I/O completion workqueue handler
for AIO - instead we just use the FUA bit on the bios to ensure
data is flushed to stable storage.
Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Signed-off-by: Jens Axboe <axboe-b10kYP2dOMg@public.gmane.org>
Many Thanks
Laurence
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Testing latest linux-next 4.9 ib_srp and ib_srpt sees I/O capped at 1MB and no merging
[not found] ` <1661819060.8462293.1482412678092.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-12-22 15:40 ` Christoph Hellwig
[not found] ` <20161222154049.GA4638-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-12-22 19:54 ` kbuild test robot
0 siblings, 2 replies; 7+ messages in thread
From: Christoph Hellwig @ 2016-12-22 15:40 UTC (permalink / raw)
To: Laurence Oberman
Cc: Christoph Hellwig, Bart Van Assche,
linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA
Hi Laurance,
please try the patch below:
---
>From 69febe1cfb55844862f768447432249781001f9c Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Date: Thu, 22 Dec 2016 16:38:29 +0100
Subject: block: add back plugging in __blkdev_direct_IO
This allows sending larger than 1 MB requess to devices that support
large I/O sizes.
Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Reported-by: Laurence Oberman <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
fs/block_dev.c | 3 +++
fs/iomap.c | 1 -
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 7c45072..206a92a 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -328,6 +328,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
struct file *file = iocb->ki_filp;
struct inode *inode = bdev_file_inode(file);
struct block_device *bdev = I_BDEV(inode);
+ struct blk_plug plug;
struct blkdev_dio *dio;
struct bio *bio;
bool is_read = (iov_iter_rw(iter) == READ);
@@ -353,6 +354,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
dio->multi_bio = false;
dio->should_dirty = is_read && (iter->type == ITER_IOVEC);
+ blk_start_plug(&plug);
for (;;) {
bio->bi_bdev = bdev;
bio->bi_iter.bi_sector = pos >> 9;
@@ -394,6 +396,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
submit_bio(bio);
bio = bio_alloc(GFP_KERNEL, nr_pages);
}
+ blk_finish_plug(&plug);
if (!dio->is_sync)
return -EIOCBQUEUED;
diff --git a/fs/iomap.c b/fs/iomap.c
index 354a123..3adf1e1 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -844,7 +844,6 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, struct iomap_ops *ops,
size_t count = iov_iter_count(iter);
loff_t pos = iocb->ki_pos, end = iocb->ki_pos + count - 1, ret = 0;
unsigned int flags = IOMAP_DIRECT;
- struct blk_plug plug;
struct iomap_dio *dio;
lockdep_assert_held(&inode->i_rwsem);
--
2.1.4
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: Testing latest linux-next 4.9 ib_srp and ib_srpt sees I/O capped at 1MB and no merging
[not found] ` <20161222154049.GA4638-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-12-22 17:59 ` Laurence Oberman
2016-12-22 19:55 ` block: add back plugging in __blkdev_direct_IO kbuild test robot
1 sibling, 0 replies; 7+ messages in thread
From: Laurence Oberman @ 2016-12-22 17:59 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Bart Van Assche, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA
----- Original Message -----
> From: "Christoph Hellwig" <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> To: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: "Christoph Hellwig" <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>, "Bart Van Assche" <Bart.VanAssche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>,
> linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Sent: Thursday, December 22, 2016 10:40:49 AM
> Subject: Re: Testing latest linux-next 4.9 ib_srp and ib_srpt sees I/O capped at 1MB and no merging
>
> Hi Laurance,
>
> please try the patch below:
>
> ---
> From 69febe1cfb55844862f768447432249781001f9c Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
> Date: Thu, 22 Dec 2016 16:38:29 +0100
> Subject: block: add back plugging in __blkdev_direct_IO
>
> This allows sending larger than 1 MB requess to devices that support
> large I/O sizes.
>
> Signed-off-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
> Reported-by: Laurence Oberman <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
> fs/block_dev.c | 3 +++
> fs/iomap.c | 1 -
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 7c45072..206a92a 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -328,6 +328,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter, int nr_pages)
> struct file *file = iocb->ki_filp;
> struct inode *inode = bdev_file_inode(file);
> struct block_device *bdev = I_BDEV(inode);
> + struct blk_plug plug;
> struct blkdev_dio *dio;
> struct bio *bio;
> bool is_read = (iov_iter_rw(iter) == READ);
> @@ -353,6 +354,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter, int nr_pages)
> dio->multi_bio = false;
> dio->should_dirty = is_read && (iter->type == ITER_IOVEC);
>
> + blk_start_plug(&plug);
> for (;;) {
> bio->bi_bdev = bdev;
> bio->bi_iter.bi_sector = pos >> 9;
> @@ -394,6 +396,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter, int nr_pages)
> submit_bio(bio);
> bio = bio_alloc(GFP_KERNEL, nr_pages);
> }
> + blk_finish_plug(&plug);
>
> if (!dio->is_sync)
> return -EIOCBQUEUED;
> diff --git a/fs/iomap.c b/fs/iomap.c
> index 354a123..3adf1e1 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -844,7 +844,6 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> struct iomap_ops *ops,
> size_t count = iov_iter_count(iter);
> loff_t pos = iocb->ki_pos, end = iocb->ki_pos + count - 1, ret = 0;
> unsigned int flags = IOMAP_DIRECT;
> - struct blk_plug plug;
> struct iomap_dio *dio;
>
> lockdep_assert_held(&inode->i_rwsem);
> --
> 2.1.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Hello Christoph
The patch works and I now see 4MB I/O
# DISK STATISTICS (/sec)
# <---------reads---------><---------writes---------><--------averages--------> Pct
#Time Name KBytes Merged IOs Size KBytes Merged IOs Size RWSize QLen Wait SvcTim Util
11:53:58 sdah 143360 105 35 4096 0 0 0 0 4096 1 28 28 99
11:53:59 sdah 139264 102 34 4096 0 0 0 0 4096 1 29 29 99
11:54:00 sdah 143360 105 35 4096 0 0 0 0 4096 1 28 28 99
I think you forgot to remove calls to blk_start_plug and blk_finish_plug in fs/iomap.c in your patch.
I took them out and built the test kernel that way.
Let me know if you will just remove those in the final or want a patch.
Thanks fo rthe super quick response
Regards
Laurence
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: block: add back plugging in __blkdev_direct_IO
2016-12-22 15:40 ` Christoph Hellwig
[not found] ` <20161222154049.GA4638-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-12-22 19:54 ` kbuild test robot
1 sibling, 0 replies; 7+ messages in thread
From: kbuild test robot @ 2016-12-22 19:54 UTC (permalink / raw)
Cc: kbuild-all, Laurence Oberman, Christoph Hellwig, Bart Van Assche,
linux-rdma, linux-scsi
[-- Attachment #1: Type: text/plain, Size: 1878 bytes --]
Hi Christoph,
[auto build test ERROR on linus/master]
[also build test ERROR on next-20161222]
[cannot apply to v4.9]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Christoph-Hellwig/block-add-back-plugging-in-__blkdev_direct_IO/20161223-002453
config: x86_64-rhel (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All errors (new ones prefixed by >>):
fs/iomap.c: In function 'iomap_dio_rw':
>> fs/iomap.c:897:18: error: 'plug' undeclared (first use in this function)
blk_start_plug(&plug);
^~~~
fs/iomap.c:897:18: note: each undeclared identifier is reported only once for each function it appears in
vim +/plug +897 fs/iomap.c
ff6a9292 Christoph Hellwig 2016-11-30 891 WARN_ON_ONCE(ret);
ff6a9292 Christoph Hellwig 2016-11-30 892 ret = 0;
ff6a9292 Christoph Hellwig 2016-11-30 893 }
ff6a9292 Christoph Hellwig 2016-11-30 894
ff6a9292 Christoph Hellwig 2016-11-30 895 inode_dio_begin(inode);
ff6a9292 Christoph Hellwig 2016-11-30 896
ff6a9292 Christoph Hellwig 2016-11-30 @897 blk_start_plug(&plug);
ff6a9292 Christoph Hellwig 2016-11-30 898 do {
ff6a9292 Christoph Hellwig 2016-11-30 899 ret = iomap_apply(inode, pos, count, flags, ops, dio,
ff6a9292 Christoph Hellwig 2016-11-30 900 iomap_dio_actor);
:::::: The code at line 897 was first introduced by commit
:::::: ff6a9292e6f633d596826be5ba70d3ef90cc3300 iomap: implement direct I/O
:::::: TO: Christoph Hellwig <hch@lst.de>
:::::: CC: Dave Chinner <david@fromorbit.com>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 38270 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: block: add back plugging in __blkdev_direct_IO
[not found] ` <20161222154049.GA4638-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-12-22 17:59 ` Laurence Oberman
@ 2016-12-22 19:55 ` kbuild test robot
1 sibling, 0 replies; 7+ messages in thread
From: kbuild test robot @ 2016-12-22 19:55 UTC (permalink / raw)
Cc: kbuild-all-JC7UmRfGjtg, Laurence Oberman, Christoph Hellwig,
Bart Van Assche, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-scsi-u79uwXL29TY76Z2rM5mHXA
[-- Attachment #1: Type: text/plain, Size: 1925 bytes --]
Hi Christoph,
[auto build test ERROR on linus/master]
[also build test ERROR on next-20161222]
[cannot apply to v4.9]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Christoph-Hellwig/block-add-back-plugging-in-__blkdev_direct_IO/20161223-002453
config: x86_64-lkp (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All errors (new ones prefixed by >>):
fs/iomap.c: In function 'iomap_dio_rw':
>> fs/iomap.c:897:18: error: 'plug' undeclared (first use in this function)
blk_start_plug(&plug);
^~~~
fs/iomap.c:897:18: note: each undeclared identifier is reported only once for each function it appears in
vim +/plug +897 fs/iomap.c
ff6a9292 Christoph Hellwig 2016-11-30 891 WARN_ON_ONCE(ret);
ff6a9292 Christoph Hellwig 2016-11-30 892 ret = 0;
ff6a9292 Christoph Hellwig 2016-11-30 893 }
ff6a9292 Christoph Hellwig 2016-11-30 894
ff6a9292 Christoph Hellwig 2016-11-30 895 inode_dio_begin(inode);
ff6a9292 Christoph Hellwig 2016-11-30 896
ff6a9292 Christoph Hellwig 2016-11-30 @897 blk_start_plug(&plug);
ff6a9292 Christoph Hellwig 2016-11-30 898 do {
ff6a9292 Christoph Hellwig 2016-11-30 899 ret = iomap_apply(inode, pos, count, flags, ops, dio,
ff6a9292 Christoph Hellwig 2016-11-30 900 iomap_dio_actor);
:::::: The code at line 897 was first introduced by commit
:::::: ff6a9292e6f633d596826be5ba70d3ef90cc3300 iomap: implement direct I/O
:::::: TO: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
:::::: CC: Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24656 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2016-12-22 19:55 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <07c07529-4636-fafb-2598-7358d8a1460d@sandisk.com>
[not found] ` <52e8398f-a146-721c-3b92-0892b4abbff8@intel.com>
[not found] ` <1482166487.25336.10.camel@sandisk.com>
[not found] ` <1918919536.8196250.1482262406057.JavaMail.zimbra@redhat.com>
[not found] ` <2052479881.8196880.1482263028727.JavaMail.zimbra@redhat.com>
[not found] ` <1668746735.8200653.1482266682585.JavaMail.zimbra@redhat.com>
[not found] ` <120559766.8215321.1482291094018.JavaMail.zimbra@redhat.com>
[not found] ` <1928955380.8220327.1482302041315.JavaMail.zimbra@redhat.com>
[not found] ` <1928955380.8220327.1482302041315.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-12-22 2:10 ` Testing latest linux-next 4.9 ib_srp and ib_srpt Laurence Oberman
[not found] ` <1337539588.8422488.1482372617094.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-12-22 6:23 ` Christoph Hellwig
[not found] ` <20161222062321.GA30326-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-12-22 13:17 ` Testing latest linux-next 4.9 ib_srp and ib_srpt sees I/O capped at 1MB and no merging Laurence Oberman
[not found] ` <1661819060.8462293.1482412678092.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-12-22 15:40 ` Christoph Hellwig
[not found] ` <20161222154049.GA4638-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-12-22 17:59 ` Laurence Oberman
2016-12-22 19:55 ` block: add back plugging in __blkdev_direct_IO kbuild test robot
2016-12-22 19:54 ` kbuild test robot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).