On 01/09/2015 02:07 PM, Jens Axboe wrote:
> On 01/09/2015 12:49 PM, Mike Snitzer wrote:
>> On Wed, Jan 07 2015 at 3:40pm -0500,
>> Keith Busch wrote:
>>
>>> On Wed, 7 Jan 2015, Bart Van Assche wrote:
>>>> On 01/06/15 17:15, Jens Axboe wrote:
>>>>> blk-mq request allocation is pretty much as optimized/fast as it
>>>>> can be. The slowdown must be due to one of two reasons:
>>>>>
>>>>> - A bug related to running out of requests, perhaps a missing queue
>>>>>   run or something like that.
>>>>> - A smaller number of available requests, due to the requested
>>>>>   queue depth.
>>>>>
>>>>> Looking at Barts results, it looks like it's usually fast, but
>>>>> sometimes very slow. That would seem to indicate it's option #1
>>>>> above that is the issue. Bart, since this seems to wait for quite
>>>>> a bit, would it be possible to cat the 'tags' file for that queue
>>>>> when it is stuck like that?
>>>>
>>>> Hello Jens,
>>>>
>>>> Thanks for the assistance. Is this the output you were looking for
>>>
>>> I'm a little confused by the later comments given the below data. It
>>> says multipath_clone_and_map() is stuck at bt_get, but that doesn't
>>> block unless there are no tags available. The tags should be coming
>>> from one of dm-1's path queues, and I'm assuming these queues are
>>> provided by sdc and sdd. All their tags are free, so that looks like
>>> a missing wake_up when the queue idles.
>>
>> Like I said in an earlier email, I cannot reproduce Bart's hangs running
>> mkfs.xfs against a multipath device that is built ontop of a virtio
>> device in a KVM guest.
>>
>> But I can hit __bt_get() failures on the virtio-blk device that I'm
>> using for the root device on this guest. Bart I'd be interested to see
>> what you get when running the attached debug patch (likely will just
>> echo the same type of info you've already provided).
>>
>> There does appear to be something weird going on with bt_get(). With
>> the debug patch I'm seeing the following when I simply run "make install"
>> of the kernel (it'll run dracut to build the initramfs, etc):
>>
>> You'll note that in all instances where __bt_get() returns -1 nr_free
>> isn't 0.
>
> Yeah, that doesn't look good. Can you try with this patch? The second
> hunk is the interesting bit, the first is more of a cleanup.

Actually, try this one instead, it should be a bit more precise than the
first.

-- 
Jens Axboe