Linux block layer

* Re: [PATCH 05/25] fs: Get proper reference for s_bdi
From: Christoph Hellwig @ 2017-04-12  8:09 UTC
  To: Jan Kara; +Cc: linux-fsdevel, linux-block, Christoph Hellwig
In-Reply-To: <20170329105623.18241-6-jack@suse.cz>

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 04/25] fs: Provide infrastructure for dynamic BDIs in filesystems
From: Christoph Hellwig @ 2017-04-12  8:09 UTC
  To: Jan Kara
  Cc: linux-fsdevel, linux-block, Christoph Hellwig, linux-mtd,
	linux-nfs, Petr Vandrovec, linux-nilfs, cluster-devel, osd-dev,
	codalist, linux-afs, ecryptfs, linux-cifs, ceph-devel,
	linux-btrfs, v9fs-developer, lustre-devel
In-Reply-To: <20170329105623.18241-5-jack@suse.cz>

> +	if (sb->s_iflags & SB_I_DYNBDI) {
> +		bdi_put(sb->s_bdi);
> +		sb->s_bdi = &noop_backing_dev_info;

At some point I'd really like to get rid of noop_backing_dev_info and
have a NULL here..

Otherwise this looks fine..

Reviewed-by: Christoph Hellwig <hch@lst.de>
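
The put-then-reset pattern in the quoted hunk can be sketched in userspace as below. This is only a minimal model under stated assumptions: `bdi_model`, `sb_model`, `noop_bdi_model`, `SB_I_DYNBDI_MODEL` and the helper functions are hypothetical stand-ins, not the kernel's definitions, and clearing the flag after the put is an assumption made here so that a repeated call is harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are the kernel definitions. */
#define SB_I_DYNBDI_MODEL 0x1UL	/* illustrative flag value */

struct bdi_model {
	int refcnt;
};

struct sb_model {
	unsigned long s_iflags;
	struct bdi_model *s_bdi;
};

/* Static sentinel playing the role of noop_backing_dev_info; never freed. */
static struct bdi_model noop_bdi_model = { .refcnt = 1 };

/* Counts how many dynamically allocated objects have been freed. */
static int bdi_model_freed;

/* Allocate an object holding one reference; returns NULL on failure. */
static struct bdi_model *bdi_model_alloc(void)
{
	struct bdi_model *bdi = calloc(1, sizeof(*bdi));

	if (bdi)
		bdi->refcnt = 1;
	return bdi;
}

/* Drop one reference and free the object when the last one goes away. */
static void bdi_model_put(struct bdi_model *bdi)
{
	assert(bdi->refcnt > 0);
	if (--bdi->refcnt == 0) {
		free(bdi);
		bdi_model_freed++;
	}
}

/* Mirrors the quoted hunk: drop the reference, then point at the sentinel. */
static void sb_model_release_bdi(struct sb_model *sb)
{
	if (sb->s_iflags & SB_I_DYNBDI_MODEL) {
		bdi_model_put(sb->s_bdi);
		sb->s_bdi = &noop_bdi_model;
		/* Assumption: clear the flag so a second call is a no-op. */
		sb->s_iflags &= ~SB_I_DYNBDI_MODEL;
	}
}
```

In this model, pointing s_bdi at a static sentinel means later readers can dereference it without a check. Storing NULL instead, as suggested above, would presumably require every such reader to test for NULL first.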

* Re: [PATCH 03/25] bdi: Export bdi_alloc_node() and bdi_put()
From: Christoph Hellwig @ 2017-04-12  8:08 UTC
  To: Jan Kara; +Cc: linux-fsdevel, linux-block, Christoph Hellwig
In-Reply-To: <20170329105623.18241-4-jack@suse.cz>

On Wed, Mar 29, 2017 at 12:56:01PM +0200, Jan Kara wrote:
> MTD will want to call bdi_alloc_node() and bdi_put() directly. Export
> these functions.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 02/25] block: Unregister bdi on last reference drop
From: Christoph Hellwig @ 2017-04-12  8:06 UTC
  To: Jan Kara; +Cc: linux-fsdevel, linux-block, Christoph Hellwig
In-Reply-To: <20170329105623.18241-3-jack@suse.cz>

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 01/25] bdi: Provide bdi_register_va() and bdi_alloc()
From: Christoph Hellwig @ 2017-04-12  8:06 UTC
  To: Jan Kara; +Cc: linux-fsdevel, linux-block, Christoph Hellwig
In-Reply-To: <20170329105623.18241-2-jack@suse.cz>

Normally this would be two separate patches.

Otherwise looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v5] lightnvm: pblk
From: Javier González @ 2017-04-12  8:04 UTC
  To: Bart Van Assche
  Cc: Matias Bjørling, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org
In-Reply-To: <1491949415.2654.23.camel@sandisk.com>

> On 12 Apr 2017, at 00.23, Bart Van Assche <bart.vanassche@sandisk.com> wrote:
> 
> On Wed, 2017-04-12 at 00:13 +0200, Javier González wrote:
>> please point out to any other tools/concerns you may have.
> 
> Hello Javier,
> 
> Do you already have an account at https://scan.coverity.com/? Any Linux
> kernel developer can get an account for free. A full Coverity scan of
> Linus' tree is available at https://scan.coverity.com/projects/linux.

Hi Bart,

No I did not. Thanks for the invite. I just created an account now;
waiting for approval.

Javier

* Re: [PATCH 0/25 v2] fs: Convert all embedded bdis into separate ones
From: Jan Kara @ 2017-04-12  7:32 UTC
  To: linux-fsdevel; +Cc: linux-block, Christoph Hellwig, Jens Axboe
In-Reply-To: <20170329105623.18241-1-jack@suse.cz>

Jens, Christoph, ping? Any opinion on this?

								Honza
On Wed 29-03-17 12:55:58, Jan Kara wrote:
> Hello,
> 
> this is the second revision of the patch series which converts all embedded
> occurrences of struct backing_dev_info to use standalone dynamically allocated
> structures. This makes bdi handling unified across all bdi users and generally
> removes some boilerplate code from filesystems setting up their own bdi. It
> also allows us to remove some code from generic bdi implementation.
> 
> The patches were only compile-tested for most filesystems (I've tested
> mounting only for NFS & btrfs) so fs maintainers please have a look whether
> the changes look sound to you.
> 
> This series is based on top of bdi fixes that were merged into linux-block
> git tree into for-next branch. I have pushed out the result as a branch to
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git bdi
> 
> Changes since v1:
> * Added some acks
> * Added further FUSE cleanup patch
> * Added removal of unused argument to bdi_register()
> * Fixed up some compilation failures spotted by 0-day testing
> 
> 								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH rfc 0/6] Automatic affinity settings for nvme over rdma
From: Christoph Hellwig @ 2017-04-12  6:34 UTC
  To: Steve Wise
  Cc: Sagi Grimberg, linux-rdma, linux-nvme, linux-block, netdev,
	Saeed Mahameed, Or Gerlitz, Christoph Hellwig
In-Reply-To: <14fd128d-7155-ab13-492f-952f072808d5@opengridcomputing.com>

On Mon, Apr 10, 2017 at 01:05:50PM -0500, Steve Wise wrote:
> I'll test cxgb4 if you convert it. :)

That will take a lot of work.  The problem with cxgb4 is that it
allocates all the interrupts at device enable time, but then only
allocates them to ULDs when they attach, while this scheme assumes
a way to map out queues / vectors at initialization time.

* Re: [PATCH] remove the mg_disk driver
From: Hannes Reinecke @ 2017-04-12  6:16 UTC
  To: Christoph Hellwig, axboe, donari75; +Cc: linux-block
In-Reply-To: <20170412055837.GA20204@lst.de>

On 04/12/2017 07:58 AM, Christoph Hellwig wrote:
> Any comments?  Getting rid of this driver which was never wired up
> at all would help with some of the pending block work..
> 
> On Thu, Apr 06, 2017 at 01:28:46PM +0200, Christoph Hellwig wrote:
>> This driver was added in 2008, but as far as I can tell we never had a
>> single platform that actually registered resources for the platform driver.
>>
>> It's also been unmaintained for a long time and apparently has an ATA mode
>> that can be driven using the IDE/libata subsystem.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> ---
>>  Documentation/blockdev/mflash.txt |   84 ---
>>  drivers/block/Kconfig             |   17 -
>>  drivers/block/Makefile            |    1 -
>>  drivers/block/mg_disk.c           | 1110 -------------------------------------
>>  include/linux/mg_disk.h           |   45 --
>>  5 files changed, 1257 deletions(-)
>>  delete mode 100644 Documentation/blockdev/mflash.txt
>>  delete mode 100644 drivers/block/mg_disk.c
>>  delete mode 100644 include/linux/mg_disk.h
>>
Go.

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

* Re: [PATCH V3 00/16] Introduce the BFQ I/O scheduler
From: Paolo Valente @ 2017-04-12  6:01 UTC
  To: Bart Van Assche
  Cc: linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	fchecconi@gmail.com, linus.walleij@linaro.org, axboe@kernel.dk,
	Arianna Avanzini, broonie@kernel.org, tj@kernel.org,
	ulf.hansson@linaro.org
In-Reply-To: <1491935512.2654.16.camel@sandisk.com>

> On 11 Apr 2017, at 20:31, Bart Van Assche <bart.vanassche@sandisk.com> wrote:
>
> On Tue, 2017-04-11 at 19:37 +0200, Paolo Valente wrote:
>> Just pushed:
>> https://github.com/Algodev-github/bfq-mq/tree/add-bfq-mq-logical
>
> Thanks!
>
> But are you aware that the code on that branch doesn't build?
>
> $ make all
> [ ... ]
> ERROR: "bfq_mark_bfqq_busy" [block/bfq-wf2q.ko] undefined!
> ERROR: "bfqg_stats_update_dequeue" [block/bfq-wf2q.ko] undefined!
> [ ... ]
>
> $ PAGER= git grep bfq_mark_bfqq_busy
> block/bfq-wf2q.c:       bfq_mark_bfqq_busy(bfqq);
>

That's exactly the complaint of the kbuild test robot.  As I wrote, the
build completes with no problem on my test system (Ubuntu 16.04, gcc
5.4.0), even with the exact offending tree and .config that the robot
reports.

I don't understand what is going on.  In your case, as well as for
the test robot, the compilation of the file block/bfq-wf2q.c as a
module component fails, because that file does not contain the
definition of the reported functions.  But that definition is
(uniquely) in the file block/bfq-iosched.c, which is to be compiled
with the former file, according to the following rule in
block/Makefile:
obj-$(CONFIG_IOSCHED_BFQ)       += bfq-iosched.o bfq-wf2q.o bfq-cgroup.o

I have tried all combinations of configurations for bfq (built-in or
module, with or without cgroups support), always successfully.  If it
makes any sense to share this information, these are the exact
commands I used to test all combinations (in addition to making full
builds in some cases, and trying make all as in your case):

make O=builddir M=block

and

make O=builddir M=block modules

Where is my mistake?

Thanks,
Paolo

> Bart.

* Re: [PATCH] remove the mg_disk driver
From: Christoph Hellwig @ 2017-04-12  5:58 UTC
  To: axboe, donari75; +Cc: linux-block
In-Reply-To: <20170406112846.31078-1-hch@lst.de>

Any comments?  Getting rid of this driver which was never wired up
at all would help with some of the pending block work..

On Thu, Apr 06, 2017 at 01:28:46PM +0200, Christoph Hellwig wrote:
> This driver was added in 2008, but as far as I can tell we never had a
> single platform that actually registered resources for the platform driver.
> 
> It's also been unmaintained for a long time and apparently has an ATA mode
> that can be driven using the IDE/libata subsystem.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  Documentation/blockdev/mflash.txt |   84 ---
>  drivers/block/Kconfig             |   17 -
>  drivers/block/Makefile            |    1 -
>  drivers/block/mg_disk.c           | 1110 -------------------------------------
>  include/linux/mg_disk.h           |   45 --
>  5 files changed, 1257 deletions(-)
>  delete mode 100644 Documentation/blockdev/mflash.txt
>  delete mode 100644 drivers/block/mg_disk.c
>  delete mode 100644 include/linux/mg_disk.h
> 
> diff --git a/Documentation/blockdev/mflash.txt b/Documentation/blockdev/mflash.txt
> deleted file mode 100644
> index f7e050551487..000000000000
> --- a/Documentation/blockdev/mflash.txt
> +++ /dev/null
> @@ -1,84 +0,0 @@
> -This document describes m[g]flash support in linux.
> -
> -Contents
> -  1. Overview
> -  2. Reserved area configuration
> -  3. Example of mflash platform driver registration
> -
> -1. Overview
> -
> -Mflash and gflash are embedded flash drive. The only difference is mflash is
> -MCP(Multi Chip Package) device. These two device operate exactly same way.
> -So the rest mflash repersents mflash and gflash altogether.
> -
> -Internally, mflash has nand flash and other hardware logics and supports
> -2 different operation (ATA, IO) modes. ATA mode doesn't need any new
> -driver and currently works well under standard IDE subsystem. Actually it's
> -one chip SSD. IO mode is ATA-like custom mode for the host that doesn't have
> -IDE interface.
> -
> -Following are brief descriptions about IO mode.
> -A. IO mode based on ATA protocol and uses some custom command. (read confirm,
> -write confirm)
> -B. IO mode uses SRAM bus interface.
> -C. IO mode supports 4kB boot area, so host can boot from mflash.
> -
> -2. Reserved area configuration
> -If host boot from mflash, usually needs raw area for boot loader image. All of
> -the mflash's block device operation will be taken this value as start offset.
> -Note that boot loader's size of reserved area and kernel configuration value
> -must be same.
> -
> -3. Example of mflash platform driver registration
> -Working mflash is very straight forward. Adding platform device stuff to board
> -configuration file is all. Here is some pseudo example.
> -
> -static struct mg_drv_data mflash_drv_data = {
> -	/* If you want to polling driver set to 1 */
> -	.use_polling = 0,
> -	/* device attribution */
> -	.dev_attr = MG_BOOT_DEV
> -};
> -
> -static struct resource mg_mflash_rsc[] = {
> -	/* Base address of mflash */
> -	[0] = {
> -		.start = 0x08000000,
> -		.end = 0x08000000 + SZ_64K - 1,
> -		.flags = IORESOURCE_MEM
> -	},
> -	/* mflash interrupt pin */
> -	[1] = {
> -		.start = IRQ_GPIO(84),
> -		.end = IRQ_GPIO(84),
> -		.flags = IORESOURCE_IRQ
> -	},
> -	/* mflash reset pin */
> -	[2] = {
> -		.start = 43,
> -		.end = 43,
> -		.name = MG_RST_PIN,
> -		.flags = IORESOURCE_IO
> -	},
> -	/* mflash reset-out pin
> -	 * If you use mflash as storage device (i.e. other than MG_BOOT_DEV),
> -	 * should assign this */
> -	[3] = {
> -		.start = 51,
> -		.end = 51,
> -		.name = MG_RSTOUT_PIN,
> -		.flags = IORESOURCE_IO
> -	}
> -};
> -
> -static struct platform_device mflash_dev = {
> -	.name = MG_DEV_NAME,
> -	.id = -1,
> -	.dev = {
> -		.platform_data = &mflash_drv_data,
> -	},
> -	.num_resources = ARRAY_SIZE(mg_mflash_rsc),
> -	.resource = mg_mflash_rsc
> -};
> -
> -platform_device_register(&mflash_dev);
> diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
> index a1c2e816128f..ebe8c1a6195e 100644
> --- a/drivers/block/Kconfig
> +++ b/drivers/block/Kconfig
> @@ -434,23 +434,6 @@ config ATA_OVER_ETH
>  	This driver provides Support for ATA over Ethernet block
>  	devices like the Coraid EtherDrive (R) Storage Blade.
>  
> -config MG_DISK
> -	tristate "mGine mflash, gflash support"
> -	depends on ARM && GPIOLIB
> -	help
> -	  mGine mFlash(gFlash) block device driver
> -
> -config MG_DISK_RES
> -	int "Size of reserved area before MBR"
> -	depends on MG_DISK
> -	default 0
> -	help
> -	  Define size of reserved area that usually used for boot. Unit is KB.
> -	  All of the block device operation will be taken this value as start
> -	  offset
> -	  Examples:
> -			1024 => 1 MB
> -
>  config SUNVDC
>  	tristate "Sun Virtual Disk Client support"
>  	depends on SUN_LDOMS
> diff --git a/drivers/block/Makefile b/drivers/block/Makefile
> index b12c772bbeb3..5ceead8b52d7 100644
> --- a/drivers/block/Makefile
> +++ b/drivers/block/Makefile
> @@ -19,7 +19,6 @@ obj-$(CONFIG_BLK_CPQ_CISS_DA)  += cciss.o
>  obj-$(CONFIG_BLK_DEV_DAC960)	+= DAC960.o
>  obj-$(CONFIG_XILINX_SYSACE)	+= xsysace.o
>  obj-$(CONFIG_CDROM_PKTCDVD)	+= pktcdvd.o
> -obj-$(CONFIG_MG_DISK)		+= mg_disk.o
>  obj-$(CONFIG_SUNVDC)		+= sunvdc.o
>  obj-$(CONFIG_BLK_DEV_SKD)	+= skd.o
>  obj-$(CONFIG_BLK_DEV_OSD)	+= osdblk.o
> diff --git a/drivers/block/mg_disk.c b/drivers/block/mg_disk.c
> deleted file mode 100644
> index e88e7b06c616..000000000000
> --- a/drivers/block/mg_disk.c
> +++ /dev/null
> @@ -1,1110 +0,0 @@
> -/*
> - *  drivers/block/mg_disk.c
> - *
> - *  Support for the mGine m[g]flash IO mode.
> - *  Based on legacy hd.c
> - *
> - * (c) 2008 mGine Co.,LTD
> - * (c) 2008 unsik Kim <donari75@gmail.com>
> - *
> - *  This program is free software; you can redistribute it and/or modify
> - *  it under the terms of the GNU General Public License version 2 as
> - *  published by the Free Software Foundation.
> - */
> -
> -#include <linux/kernel.h>
> -#include <linux/module.h>
> -#include <linux/fs.h>
> -#include <linux/blkdev.h>
> -#include <linux/hdreg.h>
> -#include <linux/ata.h>
> -#include <linux/interrupt.h>
> -#include <linux/delay.h>
> -#include <linux/platform_device.h>
> -#include <linux/gpio.h>
> -#include <linux/mg_disk.h>
> -#include <linux/slab.h>
> -
> -#define MG_RES_SEC (CONFIG_MG_DISK_RES << 1)
> -
> -/* name for block device */
> -#define MG_DISK_NAME "mgd"
> -
> -#define MG_DISK_MAJ 0
> -#define MG_DISK_MAX_PART 16
> -#define MG_SECTOR_SIZE 512
> -#define MG_MAX_SECTS 256
> -
> -/* Register offsets */
> -#define MG_BUFF_OFFSET			0x8000
> -#define MG_REG_OFFSET			0xC000
> -#define MG_REG_FEATURE			(MG_REG_OFFSET + 2)	/* write case */
> -#define MG_REG_ERROR			(MG_REG_OFFSET + 2)	/* read case */
> -#define MG_REG_SECT_CNT			(MG_REG_OFFSET + 4)
> -#define MG_REG_SECT_NUM			(MG_REG_OFFSET + 6)
> -#define MG_REG_CYL_LOW			(MG_REG_OFFSET + 8)
> -#define MG_REG_CYL_HIGH			(MG_REG_OFFSET + 0xA)
> -#define MG_REG_DRV_HEAD			(MG_REG_OFFSET + 0xC)
> -#define MG_REG_COMMAND			(MG_REG_OFFSET + 0xE)	/* write case */
> -#define MG_REG_STATUS			(MG_REG_OFFSET + 0xE)	/* read  case */
> -#define MG_REG_DRV_CTRL			(MG_REG_OFFSET + 0x10)
> -#define MG_REG_BURST_CTRL		(MG_REG_OFFSET + 0x12)
> -
> -/* handy status */
> -#define MG_STAT_READY	(ATA_DRDY | ATA_DSC)
> -#define MG_READY_OK(s)	(((s) & (MG_STAT_READY | (ATA_BUSY | ATA_DF | \
> -				 ATA_ERR))) == MG_STAT_READY)
> -
> -/* error code for others */
> -#define MG_ERR_NONE		0
> -#define MG_ERR_TIMEOUT		0x100
> -#define MG_ERR_INIT_STAT	0x101
> -#define MG_ERR_TRANSLATION	0x102
> -#define MG_ERR_CTRL_RST		0x103
> -#define MG_ERR_INV_STAT		0x104
> -#define MG_ERR_RSTOUT		0x105
> -
> -#define MG_MAX_ERRORS	6	/* Max read/write errors */
> -
> -/* command */
> -#define MG_CMD_RD 0x20
> -#define MG_CMD_WR 0x30
> -#define MG_CMD_SLEEP 0x99
> -#define MG_CMD_WAKEUP 0xC3
> -#define MG_CMD_ID 0xEC
> -#define MG_CMD_WR_CONF 0x3C
> -#define MG_CMD_RD_CONF 0x40
> -
> -/* operation mode */
> -#define MG_OP_CASCADE (1 << 0)
> -#define MG_OP_CASCADE_SYNC_RD (1 << 1)
> -#define MG_OP_CASCADE_SYNC_WR (1 << 2)
> -#define MG_OP_INTERLEAVE (1 << 3)
> -
> -/* synchronous */
> -#define MG_BURST_LAT_4 (3 << 4)
> -#define MG_BURST_LAT_5 (4 << 4)
> -#define MG_BURST_LAT_6 (5 << 4)
> -#define MG_BURST_LAT_7 (6 << 4)
> -#define MG_BURST_LAT_8 (7 << 4)
> -#define MG_BURST_LEN_4 (1 << 1)
> -#define MG_BURST_LEN_8 (2 << 1)
> -#define MG_BURST_LEN_16 (3 << 1)
> -#define MG_BURST_LEN_32 (4 << 1)
> -#define MG_BURST_LEN_CONT (0 << 1)
> -
> -/* timeout value (unit: ms) */
> -#define MG_TMAX_CONF_TO_CMD	1
> -#define MG_TMAX_WAIT_RD_DRQ	10
> -#define MG_TMAX_WAIT_WR_DRQ	500
> -#define MG_TMAX_RST_TO_BUSY	10
> -#define MG_TMAX_HDRST_TO_RDY	500
> -#define MG_TMAX_SWRST_TO_RDY	500
> -#define MG_TMAX_RSTOUT		3000
> -
> -#define MG_DEV_MASK (MG_BOOT_DEV | MG_STORAGE_DEV | MG_STORAGE_DEV_SKIP_RST)
> -
> -/* main structure for mflash driver */
> -struct mg_host {
> -	struct device *dev;
> -
> -	struct request_queue *breq;
> -	struct request *req;
> -	spinlock_t lock;
> -	struct gendisk *gd;
> -
> -	struct timer_list timer;
> -	void (*mg_do_intr) (struct mg_host *);
> -
> -	u16 id[ATA_ID_WORDS];
> -
> -	u16 cyls;
> -	u16 heads;
> -	u16 sectors;
> -	u32 n_sectors;
> -	u32 nres_sectors;
> -
> -	void __iomem *dev_base;
> -	unsigned int irq;
> -	unsigned int rst;
> -	unsigned int rstout;
> -
> -	u32 major;
> -	u32 error;
> -};
> -
> -/*
> - * Debugging macro and defines
> - */
> -#undef DO_MG_DEBUG
> -#ifdef DO_MG_DEBUG
> -#  define MG_DBG(fmt, args...) \
> -	printk(KERN_DEBUG "%s:%d "fmt, __func__, __LINE__, ##args)
> -#else /* CONFIG_MG_DEBUG */
> -#  define MG_DBG(fmt, args...) do { } while (0)
> -#endif /* CONFIG_MG_DEBUG */
> -
> -static void mg_request(struct request_queue *);
> -
> -static bool mg_end_request(struct mg_host *host, int err, unsigned int nr_bytes)
> -{
> -	if (__blk_end_request(host->req, err, nr_bytes))
> -		return true;
> -
> -	host->req = NULL;
> -	return false;
> -}
> -
> -static bool mg_end_request_cur(struct mg_host *host, int err)
> -{
> -	return mg_end_request(host, err, blk_rq_cur_bytes(host->req));
> -}
> -
> -static void mg_dump_status(const char *msg, unsigned int stat,
> -		struct mg_host *host)
> -{
> -	char *name = MG_DISK_NAME;
> -
> -	if (host->req)
> -		name = host->req->rq_disk->disk_name;
> -
> -	printk(KERN_ERR "%s: %s: status=0x%02x { ", name, msg, stat & 0xff);
> -	if (stat & ATA_BUSY)
> -		printk("Busy ");
> -	if (stat & ATA_DRDY)
> -		printk("DriveReady ");
> -	if (stat & ATA_DF)
> -		printk("WriteFault ");
> -	if (stat & ATA_DSC)
> -		printk("SeekComplete ");
> -	if (stat & ATA_DRQ)
> -		printk("DataRequest ");
> -	if (stat & ATA_CORR)
> -		printk("CorrectedError ");
> -	if (stat & ATA_ERR)
> -		printk("Error ");
> -	printk("}\n");
> -	if ((stat & ATA_ERR) == 0) {
> -		host->error = 0;
> -	} else {
> -		host->error = inb((unsigned long)host->dev_base + MG_REG_ERROR);
> -		printk(KERN_ERR "%s: %s: error=0x%02x { ", name, msg,
> -				host->error & 0xff);
> -		if (host->error & ATA_BBK)
> -			printk("BadSector ");
> -		if (host->error & ATA_UNC)
> -			printk("UncorrectableError ");
> -		if (host->error & ATA_IDNF)
> -			printk("SectorIdNotFound ");
> -		if (host->error & ATA_ABORTED)
> -			printk("DriveStatusError ");
> -		if (host->error & ATA_AMNF)
> -			printk("AddrMarkNotFound ");
> -		printk("}");
> -		if (host->error & (ATA_BBK | ATA_UNC | ATA_IDNF | ATA_AMNF)) {
> -			if (host->req)
> -				printk(", sector=%u",
> -				       (unsigned int)blk_rq_pos(host->req));
> -		}
> -		printk("\n");
> -	}
> -}
> -
> -static unsigned int mg_wait(struct mg_host *host, u32 expect, u32 msec)
> -{
> -	u8 status;
> -	unsigned long expire, cur_jiffies;
> -	struct mg_drv_data *prv_data = host->dev->platform_data;
> -
> -	host->error = MG_ERR_NONE;
> -	expire = jiffies + msecs_to_jiffies(msec);
> -
> -	/* These 2 times dummy status read prevents reading invalid
> -	 * status. A very little time (3 times of mflash operating clk)
> -	 * is required for busy bit is set. Use dummy read instead of
> -	 * busy wait, because mflash's PLL is machine dependent.
> -	 */
> -	if (prv_data->use_polling) {
> -		status = inb((unsigned long)host->dev_base + MG_REG_STATUS);
> -		status = inb((unsigned long)host->dev_base + MG_REG_STATUS);
> -	}
> -
> -	status = inb((unsigned long)host->dev_base + MG_REG_STATUS);
> -
> -	do {
> -		cur_jiffies = jiffies;
> -		if (status & ATA_BUSY) {
> -			if (expect == ATA_BUSY)
> -				break;
> -		} else {
> -			/* Check the error condition! */
> -			if (status & ATA_ERR) {
> -				mg_dump_status("mg_wait", status, host);
> -				break;
> -			}
> -
> -			if (expect == MG_STAT_READY)
> -				if (MG_READY_OK(status))
> -					break;
> -
> -			if (expect == ATA_DRQ)
> -				if (status & ATA_DRQ)
> -					break;
> -		}
> -		if (!msec) {
> -			mg_dump_status("not ready", status, host);
> -			return MG_ERR_INV_STAT;
> -		}
> -
> -		status = inb((unsigned long)host->dev_base + MG_REG_STATUS);
> -	} while (time_before(cur_jiffies, expire));
> -
> -	if (time_after_eq(cur_jiffies, expire) && msec)
> -		host->error = MG_ERR_TIMEOUT;
> -
> -	return host->error;
> -}
> -
> -static unsigned int mg_wait_rstout(u32 rstout, u32 msec)
> -{
> -	unsigned long expire;
> -
> -	expire = jiffies + msecs_to_jiffies(msec);
> -	while (time_before(jiffies, expire)) {
> -		if (gpio_get_value(rstout) == 1)
> -			return MG_ERR_NONE;
> -		msleep(10);
> -	}
> -
> -	return MG_ERR_RSTOUT;
> -}
> -
> -static void mg_unexpected_intr(struct mg_host *host)
> -{
> -	u32 status = inb((unsigned long)host->dev_base + MG_REG_STATUS);
> -
> -	mg_dump_status("mg_unexpected_intr", status, host);
> -}
> -
> -static irqreturn_t mg_irq(int irq, void *dev_id)
> -{
> -	struct mg_host *host = dev_id;
> -	void (*handler)(struct mg_host *) = host->mg_do_intr;
> -
> -	spin_lock(&host->lock);
> -
> -	host->mg_do_intr = NULL;
> -	del_timer(&host->timer);
> -	if (!handler)
> -		handler = mg_unexpected_intr;
> -	handler(host);
> -
> -	spin_unlock(&host->lock);
> -
> -	return IRQ_HANDLED;
> -}
> -
> -/* local copy of ata_id_string() */
> -static void mg_id_string(const u16 *id, unsigned char *s,
> -			 unsigned int ofs, unsigned int len)
> -{
> -	unsigned int c;
> -
> -	BUG_ON(len & 1);
> -
> -	while (len > 0) {
> -		c = id[ofs] >> 8;
> -		*s = c;
> -		s++;
> -
> -		c = id[ofs] & 0xff;
> -		*s = c;
> -		s++;
> -
> -		ofs++;
> -		len -= 2;
> -	}
> -}
> -
> -/* local copy of ata_id_c_string() */
> -static void mg_id_c_string(const u16 *id, unsigned char *s,
> -			   unsigned int ofs, unsigned int len)
> -{
> -	unsigned char *p;
> -
> -	mg_id_string(id, s, ofs, len - 1);
> -
> -	p = s + strnlen(s, len - 1);
> -	while (p > s && p[-1] == ' ')
> -		p--;
> -	*p = '\0';
> -}
> -
> -static int mg_get_disk_id(struct mg_host *host)
> -{
> -	u32 i;
> -	s32 err;
> -	const u16 *id = host->id;
> -	struct mg_drv_data *prv_data = host->dev->platform_data;
> -	char fwrev[ATA_ID_FW_REV_LEN + 1];
> -	char model[ATA_ID_PROD_LEN + 1];
> -	char serial[ATA_ID_SERNO_LEN + 1];
> -
> -	if (!prv_data->use_polling)
> -		outb(ATA_NIEN, (unsigned long)host->dev_base + MG_REG_DRV_CTRL);
> -
> -	outb(MG_CMD_ID, (unsigned long)host->dev_base + MG_REG_COMMAND);
> -	err = mg_wait(host, ATA_DRQ, MG_TMAX_WAIT_RD_DRQ);
> -	if (err)
> -		return err;
> -
> -	for (i = 0; i < (MG_SECTOR_SIZE >> 1); i++)
> -		host->id[i] = le16_to_cpu(inw((unsigned long)host->dev_base +
> -					MG_BUFF_OFFSET + i * 2));
> -
> -	outb(MG_CMD_RD_CONF, (unsigned long)host->dev_base + MG_REG_COMMAND);
> -	err = mg_wait(host, MG_STAT_READY, MG_TMAX_CONF_TO_CMD);
> -	if (err)
> -		return err;
> -
> -	if ((id[ATA_ID_FIELD_VALID] & 1) == 0)
> -		return MG_ERR_TRANSLATION;
> -
> -	host->n_sectors = ata_id_u32(id, ATA_ID_LBA_CAPACITY);
> -	host->cyls = id[ATA_ID_CYLS];
> -	host->heads = id[ATA_ID_HEADS];
> -	host->sectors = id[ATA_ID_SECTORS];
> -
> -	if (MG_RES_SEC && host->heads && host->sectors) {
> -		/* modify cyls, n_sectors */
> -		host->cyls = (host->n_sectors - MG_RES_SEC) /
> -			host->heads / host->sectors;
> -		host->nres_sectors = host->n_sectors - host->cyls *
> -			host->heads * host->sectors;
> -		host->n_sectors -= host->nres_sectors;
> -	}
> -
> -	mg_id_c_string(id, fwrev, ATA_ID_FW_REV, sizeof(fwrev));
> -	mg_id_c_string(id, model, ATA_ID_PROD, sizeof(model));
> -	mg_id_c_string(id, serial, ATA_ID_SERNO, sizeof(serial));
> -	printk(KERN_INFO "mg_disk: model: %s\n", model);
> -	printk(KERN_INFO "mg_disk: firm: %.8s\n", fwrev);
> -	printk(KERN_INFO "mg_disk: serial: %s\n", serial);
> -	printk(KERN_INFO "mg_disk: %d + reserved %d sectors\n",
> -			host->n_sectors, host->nres_sectors);
> -
> -	if (!prv_data->use_polling)
> -		outb(0, (unsigned long)host->dev_base + MG_REG_DRV_CTRL);
> -
> -	return err;
> -}
> -
> -
> -static int mg_disk_init(struct mg_host *host)
> -{
> -	struct mg_drv_data *prv_data = host->dev->platform_data;
> -	s32 err;
> -	u8 init_status;
> -
> -	/* hdd rst low */
> -	gpio_set_value(host->rst, 0);
> -	err = mg_wait(host, ATA_BUSY, MG_TMAX_RST_TO_BUSY);
> -	if (err)
> -		return err;
> -
> -	/* hdd rst high */
> -	gpio_set_value(host->rst, 1);
> -	err = mg_wait(host, MG_STAT_READY, MG_TMAX_HDRST_TO_RDY);
> -	if (err)
> -		return err;
> -
> -	/* soft reset on */
> -	outb(ATA_SRST | (prv_data->use_polling ? ATA_NIEN : 0),
> -			(unsigned long)host->dev_base + MG_REG_DRV_CTRL);
> -	err = mg_wait(host, ATA_BUSY, MG_TMAX_RST_TO_BUSY);
> -	if (err)
> -		return err;
> -
> -	/* soft reset off */
> -	outb(prv_data->use_polling ? ATA_NIEN : 0,
> -			(unsigned long)host->dev_base + MG_REG_DRV_CTRL);
> -	err = mg_wait(host, MG_STAT_READY, MG_TMAX_SWRST_TO_RDY);
> -	if (err)
> -		return err;
> -
> -	init_status = inb((unsigned long)host->dev_base + MG_REG_STATUS) & 0xf;
> -
> -	if (init_status == 0xf)
> -		return MG_ERR_INIT_STAT;
> -
> -	return err;
> -}
> -
> -static void mg_bad_rw_intr(struct mg_host *host)
> -{
> -	if (host->req)
> -		if (++host->req->errors >= MG_MAX_ERRORS ||
> -		    host->error == MG_ERR_TIMEOUT)
> -			mg_end_request_cur(host, -EIO);
> -}
> -
> -static unsigned int mg_out(struct mg_host *host,
> -		unsigned int sect_num,
> -		unsigned int sect_cnt,
> -		unsigned int cmd,
> -		void (*intr_addr)(struct mg_host *))
> -{
> -	struct mg_drv_data *prv_data = host->dev->platform_data;
> -
> -	if (mg_wait(host, MG_STAT_READY, MG_TMAX_CONF_TO_CMD))
> -		return host->error;
> -
> -	if (!prv_data->use_polling) {
> -		host->mg_do_intr = intr_addr;
> -		mod_timer(&host->timer, jiffies + 3 * HZ);
> -	}
> -	if (MG_RES_SEC)
> -		sect_num += MG_RES_SEC;
> -	outb((u8)sect_cnt, (unsigned long)host->dev_base + MG_REG_SECT_CNT);
> -	outb((u8)sect_num, (unsigned long)host->dev_base + MG_REG_SECT_NUM);
> -	outb((u8)(sect_num >> 8), (unsigned long)host->dev_base +
> -			MG_REG_CYL_LOW);
> -	outb((u8)(sect_num >> 16), (unsigned long)host->dev_base +
> -			MG_REG_CYL_HIGH);
> -	outb((u8)((sect_num >> 24) | ATA_LBA | ATA_DEVICE_OBS),
> -			(unsigned long)host->dev_base + MG_REG_DRV_HEAD);
> -	outb(cmd, (unsigned long)host->dev_base + MG_REG_COMMAND);
> -	return MG_ERR_NONE;
> -}
> -
> -static void mg_read_one(struct mg_host *host, struct request *req)
> -{
> -	u16 *buff = (u16 *)bio_data(req->bio);
> -	u32 i;
> -
> -	for (i = 0; i < MG_SECTOR_SIZE >> 1; i++)
> -		*buff++ = inw((unsigned long)host->dev_base + MG_BUFF_OFFSET +
> -			      (i << 1));
> -}
> -
> -static void mg_read(struct request *req)
> -{
> -	struct mg_host *host = req->rq_disk->private_data;
> -
> -	if (mg_out(host, blk_rq_pos(req), blk_rq_sectors(req),
> -		   MG_CMD_RD, NULL) != MG_ERR_NONE)
> -		mg_bad_rw_intr(host);
> -
> -	MG_DBG("requested %d sects (from %ld), buffer=0x%p\n",
> -	       blk_rq_sectors(req), blk_rq_pos(req), bio_data(req->bio));
> -
> -	do {
> -		if (mg_wait(host, ATA_DRQ,
> -			    MG_TMAX_WAIT_RD_DRQ) != MG_ERR_NONE) {
> -			mg_bad_rw_intr(host);
> -			return;
> -		}
> -
> -		mg_read_one(host, req);
> -
> -		outb(MG_CMD_RD_CONF, (unsigned long)host->dev_base +
> -				MG_REG_COMMAND);
> -	} while (mg_end_request(host, 0, MG_SECTOR_SIZE));
> -}
> -
> -static void mg_write_one(struct mg_host *host, struct request *req)
> -{
> -	u16 *buff = (u16 *)bio_data(req->bio);
> -	u32 i;
> -
> -	for (i = 0; i < MG_SECTOR_SIZE >> 1; i++)
> -		outw(*buff++, (unsigned long)host->dev_base + MG_BUFF_OFFSET +
> -		     (i << 1));
> -}
> -
> -static void mg_write(struct request *req)
> -{
> -	struct mg_host *host = req->rq_disk->private_data;
> -	unsigned int rem = blk_rq_sectors(req);
> -
> -	if (mg_out(host, blk_rq_pos(req), rem,
> -		   MG_CMD_WR, NULL) != MG_ERR_NONE) {
> -		mg_bad_rw_intr(host);
> -		return;
> -	}
> -
> -	MG_DBG("requested %d sects (from %ld), buffer=0x%p\n",
> -	       rem, blk_rq_pos(req), bio_data(req->bio));
> -
> -	if (mg_wait(host, ATA_DRQ,
> -		    MG_TMAX_WAIT_WR_DRQ) != MG_ERR_NONE) {
> -		mg_bad_rw_intr(host);
> -		return;
> -	}
> -
> -	do {
> -		mg_write_one(host, req);
> -
> -		outb(MG_CMD_WR_CONF, (unsigned long)host->dev_base +
> -				MG_REG_COMMAND);
> -
> -		rem--;
> -		if (rem > 1 && mg_wait(host, ATA_DRQ,
> -					MG_TMAX_WAIT_WR_DRQ) != MG_ERR_NONE) {
> -			mg_bad_rw_intr(host);
> -			return;
> -		} else if (mg_wait(host, MG_STAT_READY,
> -					MG_TMAX_WAIT_WR_DRQ) != MG_ERR_NONE) {
> -			mg_bad_rw_intr(host);
> -			return;
> -		}
> -	} while (mg_end_request(host, 0, MG_SECTOR_SIZE));
> -}
> -
> -static void mg_read_intr(struct mg_host *host)
> -{
> -	struct request *req = host->req;
> -	u32 i;
> -
> -	/* check status */
> -	do {
> -		i = inb((unsigned long)host->dev_base + MG_REG_STATUS);
> -		if (i & ATA_BUSY)
> -			break;
> -		if (!MG_READY_OK(i))
> -			break;
> -		if (i & ATA_DRQ)
> -			goto ok_to_read;
> -	} while (0);
> -	mg_dump_status("mg_read_intr", i, host);
> -	mg_bad_rw_intr(host);
> -	mg_request(host->breq);
> -	return;
> -
> -ok_to_read:
> -	mg_read_one(host, req);
> -
> -	MG_DBG("sector %ld, remaining=%ld, buffer=0x%p\n",
> -	       blk_rq_pos(req), blk_rq_sectors(req) - 1, bio_data(req->bio));
> -
> -	/* send read confirm */
> -	outb(MG_CMD_RD_CONF, (unsigned long)host->dev_base + MG_REG_COMMAND);
> -
> -	if (mg_end_request(host, 0, MG_SECTOR_SIZE)) {
> -		/* set handler if read remains */
> -		host->mg_do_intr = mg_read_intr;
> -		mod_timer(&host->timer, jiffies + 3 * HZ);
> -	} else /* goto next request */
> -		mg_request(host->breq);
> -}
> -
> -static void mg_write_intr(struct mg_host *host)
> -{
> -	struct request *req = host->req;
> -	u32 i;
> -	bool rem;
> -
> -	/* check status */
> -	do {
> -		i = inb((unsigned long)host->dev_base + MG_REG_STATUS);
> -		if (i & ATA_BUSY)
> -			break;
> -		if (!MG_READY_OK(i))
> -			break;
> -		if ((blk_rq_sectors(req) <= 1) || (i & ATA_DRQ))
> -			goto ok_to_write;
> -	} while (0);
> -	mg_dump_status("mg_write_intr", i, host);
> -	mg_bad_rw_intr(host);
> -	mg_request(host->breq);
> -	return;
> -
> -ok_to_write:
> -	if ((rem = mg_end_request(host, 0, MG_SECTOR_SIZE))) {
> -		/* write 1 sector and set handler if remains */
> -		mg_write_one(host, req);
> -		MG_DBG("sector %ld, remaining=%ld, buffer=0x%p\n",
> -		       blk_rq_pos(req), blk_rq_sectors(req), bio_data(req->bio));
> -		host->mg_do_intr = mg_write_intr;
> -		mod_timer(&host->timer, jiffies + 3 * HZ);
> -	}
> -
> -	/* send write confirm */
> -	outb(MG_CMD_WR_CONF, (unsigned long)host->dev_base + MG_REG_COMMAND);
> -
> -	if (!rem)
> -		mg_request(host->breq);
> -}
> -
> -static void mg_times_out(unsigned long data)
> -{
> -	struct mg_host *host = (struct mg_host *)data;
> -	char *name;
> -
> -	spin_lock_irq(&host->lock);
> -
> -	if (!host->req)
> -		goto out_unlock;
> -
> -	host->mg_do_intr = NULL;
> -
> -	name = host->req->rq_disk->disk_name;
> -	printk(KERN_DEBUG "%s: timeout\n", name);
> -
> -	host->error = MG_ERR_TIMEOUT;
> -	mg_bad_rw_intr(host);
> -
> -out_unlock:
> -	mg_request(host->breq);
> -	spin_unlock_irq(&host->lock);
> -}
> -
> -static void mg_request_poll(struct request_queue *q)
> -{
> -	struct mg_host *host = q->queuedata;
> -
> -	while (1) {
> -		if (!host->req) {
> -			host->req = blk_fetch_request(q);
> -			if (!host->req)
> -				break;
> -		}
> -
> -		switch (req_op(host->req)) {
> -		case REQ_OP_READ:
> -			mg_read(host->req);
> -			break;
> -		case REQ_OP_WRITE:
> -			mg_write(host->req);
> -			break;
> -		default:
> -			mg_end_request_cur(host, -EIO);
> -			break;
> -		}
> -	}
> -}
> -
> -static unsigned int mg_issue_req(struct request *req,
> -		struct mg_host *host,
> -		unsigned int sect_num,
> -		unsigned int sect_cnt)
> -{
> -	switch (req_op(host->req)) {
> -	case REQ_OP_READ:
> -		if (mg_out(host, sect_num, sect_cnt, MG_CMD_RD, &mg_read_intr)
> -				!= MG_ERR_NONE) {
> -			mg_bad_rw_intr(host);
> -			return host->error;
> -		}
> -		break;
> -	case REQ_OP_WRITE:
> -		/* TODO : handler */
> -		outb(ATA_NIEN, (unsigned long)host->dev_base + MG_REG_DRV_CTRL);
> -		if (mg_out(host, sect_num, sect_cnt, MG_CMD_WR, &mg_write_intr)
> -				!= MG_ERR_NONE) {
> -			mg_bad_rw_intr(host);
> -			return host->error;
> -		}
> -		del_timer(&host->timer);
> -		mg_wait(host, ATA_DRQ, MG_TMAX_WAIT_WR_DRQ);
> -		outb(0, (unsigned long)host->dev_base + MG_REG_DRV_CTRL);
> -		if (host->error) {
> -			mg_bad_rw_intr(host);
> -			return host->error;
> -		}
> -		mg_write_one(host, req);
> -		mod_timer(&host->timer, jiffies + 3 * HZ);
> -		outb(MG_CMD_WR_CONF, (unsigned long)host->dev_base +
> -				MG_REG_COMMAND);
> -		break;
> -	default:
> -		mg_end_request_cur(host, -EIO);
> -		break;
> -	}
> -	return MG_ERR_NONE;
> -}
> -
> -/* This function also called from IRQ context */
> -static void mg_request(struct request_queue *q)
> -{
> -	struct mg_host *host = q->queuedata;
> -	struct request *req;
> -	u32 sect_num, sect_cnt;
> -
> -	while (1) {
> -		if (!host->req) {
> -			host->req = blk_fetch_request(q);
> -			if (!host->req)
> -				break;
> -		}
> -		req = host->req;
> -
> -		/* check unwanted request call */
> -		if (host->mg_do_intr)
> -			return;
> -
> -		del_timer(&host->timer);
> -
> -		sect_num = blk_rq_pos(req);
> -		/* deal whole segments */
> -		sect_cnt = blk_rq_sectors(req);
> -
> -		/* sanity check */
> -		if (sect_num >= get_capacity(req->rq_disk) ||
> -				((sect_num + sect_cnt) >
> -				 get_capacity(req->rq_disk))) {
> -			printk(KERN_WARNING
> -					"%s: bad access: sector=%d, count=%d\n",
> -					req->rq_disk->disk_name,
> -					sect_num, sect_cnt);
> -			mg_end_request_cur(host, -EIO);
> -			continue;
> -		}
> -
> -		if (!mg_issue_req(req, host, sect_num, sect_cnt))
> -			return;
> -	}
> -}
> -
> -static int mg_getgeo(struct block_device *bdev, struct hd_geometry *geo)
> -{
> -	struct mg_host *host = bdev->bd_disk->private_data;
> -
> -	geo->cylinders = (unsigned short)host->cyls;
> -	geo->heads = (unsigned char)host->heads;
> -	geo->sectors = (unsigned char)host->sectors;
> -	return 0;
> -}
> -
> -static const struct block_device_operations mg_disk_ops = {
> -	.getgeo = mg_getgeo
> -};
> -
> -#ifdef CONFIG_PM_SLEEP
> -static int mg_suspend(struct device *dev)
> -{
> -	struct mg_drv_data *prv_data = dev->platform_data;
> -	struct mg_host *host = prv_data->host;
> -
> -	if (mg_wait(host, MG_STAT_READY, MG_TMAX_CONF_TO_CMD))
> -		return -EIO;
> -
> -	if (!prv_data->use_polling)
> -		outb(ATA_NIEN, (unsigned long)host->dev_base + MG_REG_DRV_CTRL);
> -
> -	outb(MG_CMD_SLEEP, (unsigned long)host->dev_base + MG_REG_COMMAND);
> -	/* wait until mflash deep sleep */
> -	msleep(1);
> -
> -	if (mg_wait(host, MG_STAT_READY, MG_TMAX_CONF_TO_CMD)) {
> -		if (!prv_data->use_polling)
> -			outb(0, (unsigned long)host->dev_base + MG_REG_DRV_CTRL);
> -		return -EIO;
> -	}
> -
> -	return 0;
> -}
> -
> -static int mg_resume(struct device *dev)
> -{
> -	struct mg_drv_data *prv_data = dev->platform_data;
> -	struct mg_host *host = prv_data->host;
> -
> -	if (mg_wait(host, MG_STAT_READY, MG_TMAX_CONF_TO_CMD))
> -		return -EIO;
> -
> -	outb(MG_CMD_WAKEUP, (unsigned long)host->dev_base + MG_REG_COMMAND);
> -	/* wait until mflash wakeup */
> -	msleep(1);
> -
> -	if (mg_wait(host, MG_STAT_READY, MG_TMAX_CONF_TO_CMD))
> -		return -EIO;
> -
> -	if (!prv_data->use_polling)
> -		outb(0, (unsigned long)host->dev_base + MG_REG_DRV_CTRL);
> -
> -	return 0;
> -}
> -#endif
> -
> -static SIMPLE_DEV_PM_OPS(mg_pm, mg_suspend, mg_resume);
> -
> -static int mg_probe(struct platform_device *plat_dev)
> -{
> -	struct mg_host *host;
> -	struct resource *rsc;
> -	struct mg_drv_data *prv_data = plat_dev->dev.platform_data;
> -	int err = 0;
> -
> -	if (!prv_data) {
> -		printk(KERN_ERR	"%s:%d fail (no driver_data)\n",
> -				__func__, __LINE__);
> -		err = -EINVAL;
> -		goto probe_err;
> -	}
> -
> -	/* alloc mg_host */
> -	host = kzalloc(sizeof(struct mg_host), GFP_KERNEL);
> -	if (!host) {
> -		printk(KERN_ERR "%s:%d fail (no memory for mg_host)\n",
> -				__func__, __LINE__);
> -		err = -ENOMEM;
> -		goto probe_err;
> -	}
> -	host->major = MG_DISK_MAJ;
> -
> -	/* link each other */
> -	prv_data->host = host;
> -	host->dev = &plat_dev->dev;
> -
> -	/* io remap */
> -	rsc = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
> -	if (!rsc) {
> -		printk(KERN_ERR "%s:%d platform_get_resource fail\n",
> -				__func__, __LINE__);
> -		err = -EINVAL;
> -		goto probe_err_2;
> -	}
> -	host->dev_base = ioremap(rsc->start, resource_size(rsc));
> -	if (!host->dev_base) {
> -		printk(KERN_ERR "%s:%d ioremap fail\n",
> -				__func__, __LINE__);
> -		err = -EIO;
> -		goto probe_err_2;
> -	}
> -	MG_DBG("dev_base = 0x%x\n", (u32)host->dev_base);
> -
> -	/* get reset pin */
> -	rsc = platform_get_resource_byname(plat_dev, IORESOURCE_IO,
> -			MG_RST_PIN);
> -	if (!rsc) {
> -		printk(KERN_ERR "%s:%d get reset pin fail\n",
> -				__func__, __LINE__);
> -		err = -EIO;
> -		goto probe_err_3;
> -	}
> -	host->rst = rsc->start;
> -
> -	/* init rst pin */
> -	err = gpio_request(host->rst, MG_RST_PIN);
> -	if (err)
> -		goto probe_err_3;
> -	gpio_direction_output(host->rst, 1);
> -
> -	/* reset out pin */
> -	if (!(prv_data->dev_attr & MG_DEV_MASK)) {
> -		err = -EINVAL;
> -		goto probe_err_3a;
> -	}
> -
> -	if (prv_data->dev_attr != MG_BOOT_DEV) {
> -		rsc = platform_get_resource_byname(plat_dev, IORESOURCE_IO,
> -				MG_RSTOUT_PIN);
> -		if (!rsc) {
> -			printk(KERN_ERR "%s:%d get reset-out pin fail\n",
> -					__func__, __LINE__);
> -			err = -EIO;
> -			goto probe_err_3a;
> -		}
> -		host->rstout = rsc->start;
> -		err = gpio_request(host->rstout, MG_RSTOUT_PIN);
> -		if (err)
> -			goto probe_err_3a;
> -		gpio_direction_input(host->rstout);
> -	}
> -
> -	/* disk reset */
> -	if (prv_data->dev_attr == MG_STORAGE_DEV) {
> -		/* If POR seq. not yet finished, wait */
> -		err = mg_wait_rstout(host->rstout, MG_TMAX_RSTOUT);
> -		if (err)
> -			goto probe_err_3b;
> -		err = mg_disk_init(host);
> -		if (err) {
> -			printk(KERN_ERR "%s:%d fail (err code : %d)\n",
> -					__func__, __LINE__, err);
> -			err = -EIO;
> -			goto probe_err_3b;
> -		}
> -	}
> -
> -	/* get irq resource */
> -	if (!prv_data->use_polling) {
> -		host->irq = platform_get_irq(plat_dev, 0);
> -		if (host->irq == -ENXIO) {
> -			err = host->irq;
> -			goto probe_err_3b;
> -		}
> -		err = request_irq(host->irq, mg_irq,
> -				IRQF_TRIGGER_RISING,
> -				MG_DEV_NAME, host);
> -		if (err) {
> -			printk(KERN_ERR "%s:%d fail (request_irq err=%d)\n",
> -					__func__, __LINE__, err);
> -			goto probe_err_3b;
> -		}
> -
> -	}
> -
> -	/* get disk id */
> -	err = mg_get_disk_id(host);
> -	if (err) {
> -		printk(KERN_ERR "%s:%d fail (err code : %d)\n",
> -				__func__, __LINE__, err);
> -		err = -EIO;
> -		goto probe_err_4;
> -	}
> -
> -	err = register_blkdev(host->major, MG_DISK_NAME);
> -	if (err < 0) {
> -		printk(KERN_ERR "%s:%d register_blkdev fail (err code : %d)\n",
> -				__func__, __LINE__, err);
> -		goto probe_err_4;
> -	}
> -	if (!host->major)
> -		host->major = err;
> -
> -	spin_lock_init(&host->lock);
> -
> -	if (prv_data->use_polling)
> -		host->breq = blk_init_queue(mg_request_poll, &host->lock);
> -	else
> -		host->breq = blk_init_queue(mg_request, &host->lock);
> -
> -	if (!host->breq) {
> -		err = -ENOMEM;
> -		printk(KERN_ERR "%s:%d (blk_init_queue) fail\n",
> -				__func__, __LINE__);
> -		goto probe_err_5;
> -	}
> -	host->breq->queuedata = host;
> -
> -	/* mflash is random device, thanx for the noop */
> -	err = elevator_change(host->breq, "noop");
> -	if (err) {
> -		printk(KERN_ERR "%s:%d (elevator_init) fail\n",
> -				__func__, __LINE__);
> -		goto probe_err_6;
> -	}
> -	blk_queue_max_hw_sectors(host->breq, MG_MAX_SECTS);
> -	blk_queue_logical_block_size(host->breq, MG_SECTOR_SIZE);
> -
> -	setup_timer(&host->timer, mg_times_out, (unsigned long)host);
> -
> -	host->gd = alloc_disk(MG_DISK_MAX_PART);
> -	if (!host->gd) {
> -		printk(KERN_ERR "%s:%d (alloc_disk) fail\n",
> -				__func__, __LINE__);
> -		err = -ENOMEM;
> -		goto probe_err_7;
> -	}
> -	host->gd->major = host->major;
> -	host->gd->first_minor = 0;
> -	host->gd->fops = &mg_disk_ops;
> -	host->gd->queue = host->breq;
> -	host->gd->private_data = host;
> -	sprintf(host->gd->disk_name, MG_DISK_NAME"a");
> -
> -	set_capacity(host->gd, host->n_sectors);
> -
> -	add_disk(host->gd);
> -
> -	return err;
> -
> -probe_err_7:
> -	del_timer_sync(&host->timer);
> -probe_err_6:
> -	blk_cleanup_queue(host->breq);
> -probe_err_5:
> -	unregister_blkdev(host->major, MG_DISK_NAME);
> -probe_err_4:
> -	if (!prv_data->use_polling)
> -		free_irq(host->irq, host);
> -probe_err_3b:
> -	gpio_free(host->rstout);
> -probe_err_3a:
> -	gpio_free(host->rst);
> -probe_err_3:
> -	iounmap(host->dev_base);
> -probe_err_2:
> -	kfree(host);
> -probe_err:
> -	return err;
> -}
> -
> -static int mg_remove(struct platform_device *plat_dev)
> -{
> -	struct mg_drv_data *prv_data = plat_dev->dev.platform_data;
> -	struct mg_host *host = prv_data->host;
> -	int err = 0;
> -
> -	/* delete timer */
> -	del_timer_sync(&host->timer);
> -
> -	/* remove disk */
> -	if (host->gd) {
> -		del_gendisk(host->gd);
> -		put_disk(host->gd);
> -	}
> -	/* remove queue */
> -	if (host->breq)
> -		blk_cleanup_queue(host->breq);
> -
> -	/* unregister blk device */
> -	unregister_blkdev(host->major, MG_DISK_NAME);
> -
> -	/* free irq */
> -	if (!prv_data->use_polling)
> -		free_irq(host->irq, host);
> -
> -	/* free reset-out pin */
> -	if (prv_data->dev_attr != MG_BOOT_DEV)
> -		gpio_free(host->rstout);
> -
> -	/* free rst pin */
> -	if (host->rst)
> -		gpio_free(host->rst);
> -
> -	/* unmap io */
> -	if (host->dev_base)
> -		iounmap(host->dev_base);
> -
> -	/* free mg_host */
> -	kfree(host);
> -
> -	return err;
> -}
> -
> -static struct platform_driver mg_disk_driver = {
> -	.probe = mg_probe,
> -	.remove = mg_remove,
> -	.driver = {
> -		.name = MG_DEV_NAME,
> -		.pm = &mg_pm,
> -	}
> -};
> -
> -/****************************************************************************
> - *
> - * Module stuff
> - *
> - ****************************************************************************/
> -
> -static int __init mg_init(void)
> -{
> -	printk(KERN_INFO "mGine mflash driver, (c) 2008 mGine Co.\n");
> -	return platform_driver_register(&mg_disk_driver);
> -}
> -
> -static void __exit mg_exit(void)
> -{
> -	printk(KERN_INFO "mflash driver : bye bye\n");
> -	platform_driver_unregister(&mg_disk_driver);
> -}
> -
> -module_init(mg_init);
> -module_exit(mg_exit);
> -
> -MODULE_LICENSE("GPL");
> -MODULE_AUTHOR("unsik Kim <donari75@gmail.com>");
> -MODULE_DESCRIPTION("mGine m[g]flash device driver");
> diff --git a/include/linux/mg_disk.h b/include/linux/mg_disk.h
> deleted file mode 100644
> index e11f4d9f1c2e..000000000000
> --- a/include/linux/mg_disk.h
> +++ /dev/null
> @@ -1,45 +0,0 @@
> -/*
> - *  include/linux/mg_disk.c
> - *
> - *  Private data for mflash platform driver
> - *
> - * (c) 2008 mGine Co.,LTD
> - * (c) 2008 unsik Kim <donari75@gmail.com>
> - *
> - *  This program is free software; you can redistribute it and/or modify
> - *  it under the terms of the GNU General Public License version 2 as
> - *  published by the Free Software Foundation.
> - */
> -
> -#ifndef __MG_DISK_H__
> -#define __MG_DISK_H__
> -
> -/* name for platform device */
> -#define MG_DEV_NAME "mg_disk"
> -
> -/* names of GPIO resource */
> -#define MG_RST_PIN	"mg_rst"
> -/* except MG_BOOT_DEV, reset-out pin should be assigned */
> -#define MG_RSTOUT_PIN	"mg_rstout"
> -
> -/* device attribution */
> -/* use mflash as boot device */
> -#define MG_BOOT_DEV		(1 << 0)
> -/* use mflash as storage device */
> -#define MG_STORAGE_DEV		(1 << 1)
> -/* same as MG_STORAGE_DEV, but bootloader already done reset sequence */
> -#define MG_STORAGE_DEV_SKIP_RST	(1 << 2)
> -
> -/* private driver data */
> -struct mg_drv_data {
> -	/* disk resource */
> -	u32 use_polling;
> -
> -	/* device attribution */
> -	u32 dev_attr;
> -
> -	/* internally used */
> -	void *host;
> -};
> -
> -#endif
> -- 
> 2.11.0
---end quoted text---

^ permalink raw reply

* Re: [PATCH 2/8] target: remove iblock WRITE_SAME passthrough support
From: Christoph Hellwig @ 2017-04-12  5:51 UTC (permalink / raw)
  To: Nicholas A. Bellinger
  Cc: Christoph Hellwig, axboe, martin.petersen, philipp.reisner,
	lars.ellenberg, target-devel, linux-block, linux-scsi, drbd-dev,
	dm-devel
In-Reply-To: <1491975041.8231.131.camel@haakon3.risingtidesystems.com>

Hi Nic,

this patch looks fine, and I'll include it for the next post.  I'll
move some of the explanation in this mail into the patch, though.

^ permalink raw reply

* Re: [PATCH 2/8] target: remove iblock WRITE_SAME passthrough support
From: Nicholas A. Bellinger @ 2017-04-12  5:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, martin.petersen, philipp.reisner, lars.ellenberg,
	target-devel, linux-block, linux-scsi, drbd-dev, dm-devel
In-Reply-To: <20170410160807.23674-3-hch@lst.de>

On Mon, 2017-04-10 at 18:08 +0200, Christoph Hellwig wrote:
> Use the pscsi driver to support arbitrary command passthrough
> instead.
> 

The people who are actively using iblock_execute_write_same_direct() are
doing so in the context of ESX VAAI BlockZero, together with
EXTENDED_COPY and COMPARE_AND_WRITE primitives.  Just using PSCSI is not
an option for them.

In practice though I've not seen any users of IBLOCK WRITE_SAME for
anything other than VAAI BlockZero, so just using blkdev_issue_zeroout()
when available, and falling back to iblock_execute_write_same() if the
WRITE_SAME buffer contains anything other than zeros should be OK.

How about something like the following below..?

This would bring parity with how blkdev_issue_write_same() works atm wrt
synchronous bio completions.  However, most folks with a raw
make_request or blk-mq backend driver that supports multiple GB/sec of
zero bandwidth end up changing IBLOCK to support asynchronous
REQ_WRITE_SAME completions anyways.

I'd be happy to add support for that using __blkdev_issue_zeroout() once
the basic conversion is in place.

From ff74012eaff38f9fa0d74aca60507b9964f484ce Mon Sep 17 00:00:00 2001
From: Nicholas Bellinger <nab@linux-iscsi.org>
Date: Tue, 11 Apr 2017 22:21:47 -0700
Subject: [PATCH] target/iblock: Convert WRITE_SAME to blkdev_issue_zeroout

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/target_core_iblock.c | 44 +++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index d316ed5..5bfde20 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -86,6 +86,7 @@ static int iblock_configure_device(struct se_device *dev)
 	struct block_device *bd = NULL;
 	struct blk_integrity *bi;
 	fmode_t mode;
+	unsigned int max_write_zeroes_sectors;
 	int ret = -ENOMEM;
 
 	if (!(ib_dev->ibd_flags & IBDF_HAS_UDEV_PATH)) {
@@ -129,7 +130,11 @@ static int iblock_configure_device(struct se_device *dev)
 	 * Enable write same emulation for IBLOCK and use 0xFFFF as
 	 * the smaller WRITE_SAME(10) only has a two-byte block count.
 	 */
-	dev->dev_attrib.max_write_same_len = 0xFFFF;
+	max_write_zeroes_sectors = bdev_write_zeroes_sectors(bd);
+	if (max_write_zeroes_sectors)
+		dev->dev_attrib.max_write_same_len = max_write_zeroes_sectors;
+	else
+		dev->dev_attrib.max_write_same_len = 0xFFFF;
 
 	if (blk_queue_nonrot(q))
 		dev->dev_attrib.is_nonrot = 1;
@@ -415,28 +420,31 @@ static void iblock_end_io_flush(struct bio *bio)
 }
 
 static sense_reason_t
-iblock_execute_write_same_direct(struct block_device *bdev, struct se_cmd *cmd)
+iblock_execute_zero_out(struct block_device *bdev, struct se_cmd *cmd)
 {
 	struct se_device *dev = cmd->se_dev;
 	struct scatterlist *sg = &cmd->t_data_sg[0];
-	struct page *page = NULL;
-	int ret;
+	unsigned char *buf;
+	int rc, ret;
 
-	if (sg->offset) {
-		page = alloc_page(GFP_KERNEL);
-		if (!page)
-			return TCM_OUT_OF_RESOURCES;
-		sg_copy_to_buffer(sg, cmd->t_data_nents, page_address(page),
-				  dev->dev_attrib.block_size);
-	}
+	buf = kmap(sg_page(sg)) + sg->offset;
+	if (!buf)
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
+	/*
+	 * Fall back to iblock_execute_write_same() slow-path if
+	 * incoming WRITE_SAME payload does not contain zeros.
+	 */
+	rc = memchr_inv(buf, 0x00, cmd->data_length) != NULL;
+	kunmap(sg_page(sg));
+
+	if (rc)
+		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
 
-	ret = blkdev_issue_write_same(bdev,
+	ret = blkdev_issue_zeroout(bdev,
 				target_to_linux_sector(dev, cmd->t_task_lba),
 				target_to_linux_sector(dev,
 					sbc_get_write_same_sectors(cmd)),
-				GFP_KERNEL, page ? page : sg_page(sg));
-	if (page)
-		__free_page(page);
+				GFP_KERNEL, false);
 	if (ret)
 		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
 
@@ -472,8 +480,10 @@ static void iblock_end_io_flush(struct bio *bio)
 		return TCM_INVALID_CDB_FIELD;
 	}
 
-	if (bdev_write_same(bdev))
-		return iblock_execute_write_same_direct(bdev, cmd);
+	if (bdev_write_zeroes_sectors(bdev)) {
+		if (!iblock_execute_zero_out(bdev, cmd))
+			return 0;
+	}
 
 	ibr = kzalloc(sizeof(struct iblock_req), GFP_KERNEL);
 	if (!ibr)
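
The fast-path/slow-path decision in the patch above can be sketched in
plain userspace C.  This is only an illustration under stated assumptions:
payload_is_all_zeroes(), choose_write_same_path() and the enum are
hypothetical names, not kernel or target-core API, and the real code would
issue blkdev_issue_zeroout() or build WRITE_SAME bios instead of returning
an enum.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace helper: returns true when every byte of the
 * payload is zero.  Each byte is compared against zero individually, so
 * nothing outside buf[0..len-1] is ever read.
 */
static bool payload_is_all_zeroes(const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (buf[i] != 0)
			return false;
	}
	return true;
}

/* Hypothetical labels for the two ways a WRITE_SAME could be served. */
enum write_same_path {
	PATH_ZEROOUT,		/* fast path: zero-out offload */
	PATH_WRITE_SAME,	/* slow path: emulated WRITE_SAME */
};

/*
 * Take the zero-out fast path only when the device advertises write-zeroes
 * support and the payload really is all zeroes; otherwise fall back.
 */
static enum write_same_path
choose_write_same_path(bool dev_supports_write_zeroes,
		       const unsigned char *payload, size_t len)
{
	if (dev_supports_write_zeroes && payload_is_all_zeroes(payload, len))
		return PATH_ZEROOUT;
	return PATH_WRITE_SAME;
}
```

A payload with a single non-zero byte, or a device without write-zeroes
support, selects the slow path in this sketch.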

^ permalink raw reply related

* Re: [PATCH V3 02/16] block, bfq: add full hierarchical scheduling and cgroups support
From: Paolo Valente @ 2017-04-12  5:22 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Fabio Checconi, Arianna Avanzini, linux-block,
	Linux-Kernal, Ulf Hansson, Linus Walleij, broonie
In-Reply-To: <20170411214702.GA31551@wtj.duckdns.org>


> On 11 Apr 2017, at 23:47, Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> On Tue, Apr 11, 2017 at 03:43:01PM +0200, Paolo Valente wrote:
>> From: Arianna Avanzini <avanzini.arianna@gmail.com>
>>
>> Add complete support for full hierarchical scheduling, with a cgroups
>> interface. Full hierarchical scheduling is implemented through the
>> 'entity' abstraction: both bfq_queues, i.e., the internal BFQ queues
>> associated with processes, and groups are represented in general by
>> entities. Given the bfq_queues associated with the processes belonging
>> to a given group, the entities representing these queues are sons of
>> the entity representing the group. At higher levels, if a group, say
>> G, contains other groups, then the entity representing G is the parent
>> entity of the entities representing the groups in G.
>>
>> Hierarchical scheduling is performed as follows: if the timestamps of
>> a leaf entity (i.e., of a bfq_queue) change, and such a change lets
>> the entity become the next-to-serve entity for its parent entity, then
>> the timestamps of the parent entity are recomputed as a function of
>> the budget of its new next-to-serve leaf entity. If the parent entity
>> belongs, in its turn, to a group, and its new timestamps let it become
>> the next-to-serve for its parent entity, then the timestamps of the
>> latter parent entity are recomputed as well, and so on. When a new
>> bfq_queue must be set in service, the reverse path is followed: the
>> next-to-serve highest-level entity is chosen, then its next-to-serve
>> child entity, and so on, until the next-to-serve leaf entity is
>> reached, and the bfq_queue that this entity represents is set in
>> service.
>>
>> Writeback is accounted for on a per-group basis, i.e., for each group,
>> the async I/O requests of the processes of the group are enqueued in a
>> distinct bfq_queue, and the entity associated with this queue is a
>> child of the entity associated with the group.
>>
>> Weights can be assigned explicitly to groups and processes through the
>> cgroups interface, differently from what happens, for single
>> processes, if the cgroups interface is not used (as explained in the
>> description of the previous patch). In particular, since each node has
>> a full scheduler, each group can be assigned its own weight.
>
> Can we please hold off on cgroup support for now?  I've been trying to
> chase down cpu scheduler latency issues lately and have some doubts
> about implementing cgroup support by simply nesting the timelines like
> this.
>

Hi Tejun,
could you elaborate a bit more on this?  I mean, cgroups support has
been in BFQ (and CFQ) for almost ten years, perfectly working as far
as I know.  Of course it is perfectly working in terms of I/O and not
of CPU bandwidth distribution; and, for the moment, it is effective
only for devices below 30-50KIOPS.  What's the point in throwing
(momentarily?) away such a fundamental feature?  What am I missing?

Thanks,
Paolo


> Thanks
>
> --
> tejun
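
For reference, the hierarchical scheduling described in the quoted commit
message can be sketched as a toy model in userspace C.  Everything here is
an assumption made for illustration: struct entity, update_upward() and
pick_leaf() are hypothetical names, the single "finish" value stands in for
BFQ's real timestamps, and copying the child's value into the parent is a
placeholder for the actual recomputation from the budget.  The toy only
handles an entity moving ahead of its siblings, not falling behind them.

```c
#include <stddef.h>

/*
 * Toy scheduling entity.  A leaf stands for a queue, an inner node for a
 * group.  A smaller 'finish' value is served first.
 */
struct entity {
	struct entity *parent;		/* NULL for the top-level entity */
	struct entity *next_in_service;	/* NULL for a leaf (a queue) */
	unsigned long finish;		/* toy timestamp */
};

/*
 * Upward path: after a leaf's timestamp changed, walk towards the root for
 * as long as the changed entity becomes (or stays) next-to-serve for its
 * parent, refreshing the parent's timestamp on the way.
 */
static void update_upward(struct entity *child)
{
	struct entity *parent = child->parent;

	while (parent) {
		struct entity *cur = parent->next_in_service;

		if (cur && cur != child && cur->finish <= child->finish)
			break;	/* child did not become next-to-serve */
		parent->next_in_service = child;
		parent->finish = child->finish;	/* toy recomputation */
		child = parent;
		parent = child->parent;
	}
}

/*
 * Reverse path: start at the highest-level entity and follow the
 * next-to-serve pointers down to the leaf that would be set in service.
 */
static struct entity *pick_leaf(struct entity *top)
{
	struct entity *e = top;

	while (e && e->next_in_service)
		e = e->next_in_service;
	return e;
}
```

With two groups under one root, a queue whose timestamp is smaller than
that of the current choice replaces it all the way up, while a queue with a
larger timestamp stops propagating at its own group.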

^ permalink raw reply

* Re: [PATCH] blk-mq: Fix blk_execute_rq_nowait() handling of dying queues
From: Ming Lei @ 2017-04-12  5:01 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Jens Axboe, linux-block, Mike Snitzer, stable
In-Reply-To: <20170411235848.8686-1-bart.vanassche@sandisk.com>

On Wed, Apr 12, 2017 at 7:58 AM, Bart Van Assche
<bart.vanassche@sandisk.com> wrote:
> Although blk_execute_rq_nowait() asks blk_mq_sched_insert_request()
> to run the queue, the function that should run the queue
> (__blk_mq_delay_run_hw_queue()) skips hardware queues for which
> .tags == NULL. Since blk_mq_free_tag_set() clears .tags this means
> if blk_execute_rq_nowait() is called after the tag set has been

Just wondering how that can happen, because we usually call
blk_mq_free_tag_set() after blk_cleanup_queue() is completed.

> freed that the request that has been queued will never be executed.
> In my tests I noticed that every now and then an SG_IO request that
> got queued by multipathd on a dm device did not get executed. This
> resulted in either a memory leak complaint about the SG_IO code or
> the dm device becoming unremovable with e.g. the following state:
>
> $ grep busy= /sys/kernel/debug/block/dm*/mq/*
> /sys/kernel/debug/block/dm-0/mq/state:SAME_COMP STACKABLE IO_STAT INIT_DONE POLL REGISTERED, pg_init_in_progress=0, nr_valid_paths=4, flags= RETAIN_ATTACHED_HW_HANDLER, paths: [0:0] active=1 busy=0 dying dead [1:0] active=1 busy=0 dying dead [2:0] active=1 busy=0 dying dead [3:0] active=1 busy=0 dying dead
> $ multipath -ll
> mpathu (3600140572616d6469736b32000000000) dm-0 ##,##
> size=984M features='3 retain_attached_hw_handler queue_mode mq' hwhandler='1 alua' wp=rw
> |-+- policy='service-time 0' prio=0 status=active
> |-+- policy='service-time 0' prio=0 status=undef
> |-+- policy='service-time 0' prio=0 status=undef
> `-+- policy='service-time 0' prio=0 status=undef
>
> Avoid that blk_execute_rq_nowait() is called to queue a request
> onto a dying queue by changing the blk_freeze_queue_start() call
> in blk_set_queue_dying() into a blk_freeze_queue() call.

blk_mq_freeze_queue_wait() only waits for completion of pending IO, so
could you explain a bit why _wait() is required?

In this case, neither blk_freeze_queue_start() nor blk_freeze_queue() can
prevent the rq from coming into the queue, because we only hold/check
q_usage_counter before allocating a request, but blk_execute_rq_nowait()
has got the request already.

>
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Cc: Ming Lei <tom.leiming@gmail.com>
> Cc: <stable@vger.kernel.org>
> ---
>  block/blk-core.c | 9 +++++----
>  block/blk-exec.c | 7 +++++--
>  2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 8654aa0cef6d..21314b995887 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -501,11 +501,12 @@ void blk_set_queue_dying(struct request_queue *q)
>         spin_unlock_irq(q->queue_lock);
>
>         /*
> -        * When queue DYING flag is set, we need to block new req
> -        * entering queue, so we call blk_freeze_queue_start() to
> -        * prevent I/O from crossing blk_queue_enter().
> +        * When queue DYING flag is set, we need to block new requests
> +        * from being queued. Hence call blk_freeze_queue() to make
> +        * new blk_queue_enter() calls fail and to wait until all pending
> +        * I/O has finished.
>          */
> -       blk_freeze_queue_start(q);
> +       blk_freeze_queue(q);
>
>         if (q->mq_ops)
>                 blk_mq_wake_waiters(q);
> diff --git a/block/blk-exec.c b/block/blk-exec.c
> index 8cd0e9bc8dc8..f7d9bed2cb15 100644
> --- a/block/blk-exec.c
> +++ b/block/blk-exec.c
> @@ -57,10 +57,13 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
>         rq->end_io = done;
>
>         /*
> -        * don't check dying flag for MQ because the request won't
> -        * be reused after dying flag is set
> +        * The blk_freeze_queue() call in blk_set_queue_dying() and the
> +        * test of the "dying" flag in blk_queue_enter() guarantee that
> +        * blk_execute_rq_nowait() won't be called anymore after the "dying"
> +        * flag has been set.

That can never be guaranteed; see the following case:

1) blk_get_request() is called just before the queue is set as dying in
another path

2) the request is allocated successfully and passed to
blk_execute_rq_nowait() even though the queue has been set as dying


Thanks,
Ming Lei
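
The ordering problem in the two steps above can be modelled with a small
single-threaded userspace sketch.  All names (toy_queue, toy_get_request(),
toy_set_dying(), toy_execute_nowait()) are hypothetical stand-ins, not block
layer API; the point is only that the dying check happens at allocation
time, so a request allocated earlier can still be inserted afterwards.

```c
#include <stdbool.h>

/* Toy queue: 'usage' stands in for q_usage_counter. */
struct toy_queue {
	bool dying;
	int usage;
	int inserted_while_dying;	/* counts the problematic inserts */
};

struct toy_request {
	struct toy_queue *q;
};

/*
 * Stand-in for request allocation: the only place where the dying flag is
 * checked and the usage counter is taken.
 */
static bool toy_get_request(struct toy_queue *q, struct toy_request *rq)
{
	if (q->dying)
		return false;
	q->usage++;
	rq->q = q;
	return true;
}

/* Stand-in for marking the queue as dying from another path. */
static void toy_set_dying(struct toy_queue *q)
{
	q->dying = true;
}

/*
 * Stand-in for the nowait execute path: it already owns a request, so no
 * allocation-time check can stop it.  The toy just records that an insert
 * happened on a dying queue.
 */
static void toy_execute_nowait(struct toy_request *rq)
{
	if (rq->q->dying)
		rq->q->inserted_while_dying++;
}
```

Calling toy_get_request(), then toy_set_dying(), then toy_execute_nowait()
reproduces the interleaving: the insert is recorded on a dying queue, while
a second toy_get_request() issued after toy_set_dying() fails as intended.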

^ permalink raw reply

* Re: [PATCH v4 6/6] dm rq: Avoid that request processing stalls sporadically
From: Ming Lei @ 2017-04-12  3:42 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: snitzer@redhat.com, linux-scsi@vger.kernel.org,
	dm-devel@redhat.com, linux-block@vger.kernel.org, axboe@kernel.dk
In-Reply-To: <1491934715.2654.14.camel@sandisk.com>

On Tue, Apr 11, 2017 at 06:18:36PM +0000, Bart Van Assche wrote:
> On Tue, 2017-04-11 at 14:03 -0400, Mike Snitzer wrote:
> > Rather than working so hard to use DM code against me, your argument
> > should be: "blk-mq drivers X, Y and Z rerun the hw queue; this is a well
> > established pattern"
> > 
> > I see drivers/nvme/host/fc.c:nvme_fc_start_fcp_op() does.  But that is
> > only one other driver out of ~20 BLK_MQ_RQ_QUEUE_BUSY returns
> > tree-wide.
> > 
> > Could be there are some others, but hardly a well-established pattern.
> 
> Hello Mike,
> 
> Several blk-mq drivers that can return BLK_MQ_RQ_QUEUE_BUSY from their
> .queue_rq() implementation stop the request queue (blk_mq_stop_hw_queue())
> before returning "busy" and restart the queue after the busy condition has
> been cleared (blk_mq_start_stopped_hw_queues()). Examples are virtio_blk and
> xen-blkfront. However, this approach is not appropriate for the dm-mq core
> nor for the scsi core since both drivers already use the "stopped" state for
> another purpose than tracking whether or not a hardware queue is busy. Hence
> the blk_mq_delay_run_hw_queue() and blk_mq_run_hw_queue() calls in these last
> two drivers to rerun a hardware queue after the busy state has been cleared.

But it looks like this patch just reruns the hw queue after 100ms, which
isn't the same as rerunning it after the busy state has been cleared, right?

Actually, if BLK_MQ_RQ_QUEUE_BUSY is returned from .queue_rq(), blk-mq
will buffer this request into hctx->dispatch and run the hw queue again,
so my first impression is that blk_mq_delay_run_hw_queue() shouldn't be
needed in this situation. Or maybe Bart has more background on this usage,
in which case it would be better to add a comment about it?

Thanks,
Ming

^ permalink raw reply

* [PATCH] blk-mq: Fix blk_execute_rq_nowait() handling of dying queues
From: Bart Van Assche @ 2017-04-11 23:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Bart Van Assche, Mike Snitzer, Ming Lei, stable

Although blk_execute_rq_nowait() asks blk_mq_sched_insert_request()
to run the queue, the function that should run the queue
(__blk_mq_delay_run_hw_queue()) skips hardware queues for which
.tags == NULL. Since blk_mq_free_tag_set() clears .tags this means
if blk_execute_rq_nowait() is called after the tag set has been
freed that the request that has been queued will never be executed.
In my tests I noticed that every now and then an SG_IO request that
got queued by multipathd on a dm device did not get executed. This
resulted in either a memory leak complaint about the SG_IO code or
the dm device becoming unremovable with e.g. the following state:

$ grep busy= /sys/kernel/debug/block/dm*/mq/*
/sys/kernel/debug/block/dm-0/mq/state:SAME_COMP STACKABLE IO_STAT INIT_DONE POLL REGISTERED, pg_init_in_progress=0, nr_valid_paths=4, flags= RETAIN_ATTACHED_HW_HANDLER, paths: [0:0] active=1 busy=0 dying dead [1:0] active=1 busy=0 dying dead [2:0] active=1 busy=0 dying dead [3:0] active=1 busy=0 dying dead
$ multipath -ll
mpathu (3600140572616d6469736b32000000000) dm-0 ##,##
size=984M features='3 retain_attached_hw_handler queue_mode mq' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=0 status=active
|-+- policy='service-time 0' prio=0 status=undef
|-+- policy='service-time 0' prio=0 status=undef
`-+- policy='service-time 0' prio=0 status=undef

Avoid that blk_execute_rq_nowait() is called to queue a request
onto a dying queue by changing the blk_freeze_queue_start() call
in blk_set_queue_dying() into a blk_freeze_queue() call.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: <stable@vger.kernel.org>
---
 block/blk-core.c | 9 +++++----
 block/blk-exec.c | 7 +++++--
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 8654aa0cef6d..21314b995887 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -501,11 +501,12 @@ void blk_set_queue_dying(struct request_queue *q)
 	spin_unlock_irq(q->queue_lock);
 
 	/*
-	 * When queue DYING flag is set, we need to block new req
-	 * entering queue, so we call blk_freeze_queue_start() to
-	 * prevent I/O from crossing blk_queue_enter().
+	 * When queue DYING flag is set, we need to block new requests
+	 * from being queued. Hence call blk_freeze_queue() to make
+	 * new blk_queue_enter() calls fail and to wait until all pending
+	 * I/O has finished.
 	 */
-	blk_freeze_queue_start(q);
+	blk_freeze_queue(q);
 
 	if (q->mq_ops)
 		blk_mq_wake_waiters(q);
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 8cd0e9bc8dc8..f7d9bed2cb15 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -57,10 +57,13 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	rq->end_io = done;
 
 	/*
-	 * don't check dying flag for MQ because the request won't
-	 * be reused after dying flag is set
+	 * The blk_freeze_queue() call in blk_set_queue_dying() and the
+	 * test of the "dying" flag in blk_queue_enter() guarantee that
+	 * blk_execute_rq_nowait() won't be called anymore after the "dying"
+	 * flag has been set.
 	 */
 	if (q->mq_ops) {
+		WARN_ON_ONCE(blk_queue_dying(q));
 		blk_mq_sched_insert_request(rq, at_head, true, false, false);
 		return;
 	}
-- 
2.12.2

^ permalink raw reply related

* Re: [PATCH v5] lightnvn: pblk
From: Bart Van Assche @ 2017-04-11 22:23 UTC (permalink / raw)
  To: jg@lightnvm.io
  Cc: mb@lightnvm.io, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org
In-Reply-To: <8123618D-0CBB-420B-BC68-64A0A36E2532@lightnvm.io>

On Wed, 2017-04-12 at 00:13 +0200, Javier González wrote:
> please point out to any other tools/concerns you may have.

Hello Javier,

Do you already have an account at https://scan.coverity.com/? Any Linux
kernel developer can get an account for free. A full Coverity scan of
Linus' tree is available at https://scan.coverity.com/projects/linux.

Bart.

^ permalink raw reply

* Re: [PATCH v5] lightnvn: pblk
From: Javier González @ 2017-04-11 22:13 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Matias Bjørling, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org
In-Reply-To: <1491923975.2654.4.camel@sandisk.com>

Hi Bart,

> On 11 Apr 2017, at 17.19, Bart Van Assche <Bart.VanAssche@sandisk.com> wrote:
> 
> On Tue, 2017-04-11 at 16:31 +0200, Javier González wrote:
>> Changes since v4:
>> * Rebase on top of Matias' for-4.12/core
>> * Fix type implicit conversions reported by sparse (reported by Bart Van
>>  Assche)
>> * Make error and debug statistics long atomic variables.
> 
> Hello Javier,
> 
> Thanks for the quick respin. But have you already had a look at the
> diagnostics reported by smatch? Smatch reports e.g.
> 
> drivers/lightnvm/pblk-rb.c:783: pblk_rb_tear_down_check() error: we previously assumed 'rb->entries' could be null (see line 779)
> 
> on the following code:
> 
> 	if (rb->entries)
> 		goto out;
> 
> 	for (i = 0; i < rb->nr_entries; i++) {
> 		entry = &rb->entries[i];
> 
> 		if (entry->data)
> 			goto out;
> 	}
> 
> Is that "if (rb->entries)" check correct or should that perhaps have been
> "if (!rb->entries)"? Smatch is available at http://repo.or.cz/w/smatch.git.

I have run smatch over the code (did not know the tool, so thanks!).
This particular error has been fixed on v5. The only standing warning
relates to a semaphore on pblk-map that is taken on the erase path. This
is a false positive; it is intended that the semaphore lock is taken
here and then released on the completion path. Sparse and coccicheck
have also been used on v5, but please point out any other
tools/concerns you may have.

> Bart.

Thanks,

Javier

^ permalink raw reply
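
For reference, the smatch diagnostic quoted in the message above concerns a
pointer that is tested and then dereferenced on the path where the test says
it may be NULL. The following is a minimal user-space sketch of the inverted
check that Bart suggests. The type and function names (struct ring_buf,
ring_buf_tear_down_check) and the return convention are hypothetical
stand-ins, not the pblk code, and the thread does not show how v5 actually
fixed it.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the ring buffer types. */
struct rb_entry {
	void *data;
};

struct ring_buf {
	struct rb_entry *entries;
	unsigned int nr_entries;
};

/*
 * Assumed convention: return 0 if the buffer holds no data, 1 otherwise.
 * The NULL test bails out *before* the loop dereferences rb->entries,
 * which is the inversion ("!rb->entries") discussed in the thread.
 */
int ring_buf_tear_down_check(const struct ring_buf *rb)
{
	unsigned int i;

	if (!rb->entries)
		return 1;

	for (i = 0; i < rb->nr_entries; i++) {
		if (rb->entries[i].data)
			return 1;
	}

	return 0;
}
```

With the original "if (rb->entries)" form, the loop would only ever run when
rb->entries is NULL, which is what smatch reports.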

* Re: [PATCH V3 02/16] block, bfq: add full hierarchical scheduling and cgroups support
From: Tejun Heo @ 2017-04-11 21:47 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Jens Axboe, Fabio Checconi, Arianna Avanzini, linux-block,
	linux-kernel, ulf.hansson, linus.walleij, broonie
In-Reply-To: <20170411134315.44135-3-paolo.valente@linaro.org>

Hello,

On Tue, Apr 11, 2017 at 03:43:01PM +0200, Paolo Valente wrote:
> From: Arianna Avanzini <avanzini.arianna@gmail.com>
> 
> Add complete support for full hierarchical scheduling, with a cgroups
> interface. Full hierarchical scheduling is implemented through the
> 'entity' abstraction: both bfq_queues, i.e., the internal BFQ queues
> associated with processes, and groups are represented in general by
> entities. Given the bfq_queues associated with the processes belonging
> to a given group, the entities representing these queues are sons of
> the entity representing the group. At higher levels, if a group, say
> G, contains other groups, then the entity representing G is the parent
> entity of the entities representing the groups in G.
> 
> Hierarchical scheduling is performed as follows: if the timestamps of
> a leaf entity (i.e., of a bfq_queue) change, and such a change lets
> the entity become the next-to-serve entity for its parent entity, then
> the timestamps of the parent entity are recomputed as a function of
> the budget of its new next-to-serve leaf entity. If the parent entity
> belongs, in its turn, to a group, and its new timestamps let it become
> the next-to-serve for its parent entity, then the timestamps of the
> latter parent entity are recomputed as well, and so on. When a new
> bfq_queue must be set in service, the reverse path is followed: the
> next-to-serve highest-level entity is chosen, then its next-to-serve
> child entity, and so on, until the next-to-serve leaf entity is
> reached, and the bfq_queue that this entity represents is set in
> service.
> 
> Writeback is accounted for on a per-group basis, i.e., for each group,
> the async I/O requests of the processes of the group are enqueued in a
> distinct bfq_queue, and the entity associated with this queue is a
> child of the entity associated with the group.
> 
> Weights can be assigned explicitly to groups and processes through the
> cgroups interface, differently from what happens, for single
> processes, if the cgroups interface is not used (as explained in the
> description of the previous patch). In particular, since each node has
> a full scheduler, each group can be assigned its own weight.

Can we please hold off on cgroup support for now?  I've been trying to
chase down cpu scheduler latency issues lately and have some doubts
about implementing cgroup support by simply nesting the timelines like
this.

Thanks

-- 
tejun

^ permalink raw reply
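
The top-down selection described in the quoted commit message (pick the
next-to-serve entity at the highest level, then its next-to-serve child, and
so on down to a leaf) can be illustrated with a small user-space sketch. This
is a deliberate simplification and not the BFQ implementation: struct entity,
next_to_serve() and pick_leaf() are hypothetical names, and "smallest finish
timestamp wins" is an assumption standing in for BFQ's real selection rule
and timestamp recomputation.

```c
#include <stddef.h>

/* Hypothetical entity: a leaf stands for a queue, a non-leaf for a group. */
struct entity {
	const char *name;
	unsigned long long finish;	/* assumed virtual finish time */
	struct entity **children;	/* NULL for a leaf */
	size_t nr_children;		/* 0 for a leaf */
};

/* Assumption: the child with the smallest finish time is served next. */
struct entity *next_to_serve(struct entity *parent)
{
	struct entity *best = NULL;
	size_t i;

	for (i = 0; i < parent->nr_children; i++) {
		struct entity *c = parent->children[i];

		if (!best || c->finish < best->finish)
			best = c;
	}
	return best;
}

/* Walk from the root down to the leaf that would be set in service. */
struct entity *pick_leaf(struct entity *root)
{
	struct entity *e = root;

	while (e && e->nr_children)
		e = next_to_serve(e);
	return e;
}
```

In this model a group competes with its siblings through its own timestamp,
and the queues inside the group only compete with each other, which is the
nesting of timelines that the reply above raises concerns about.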

* [PATCH 6/6] scsi: Implement blk_mq_ops.show_rq()
From: Bart Van Assche @ 2017-04-11 20:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Bart Van Assche, Martin K . Petersen,
	James Bottomley, Omar Sandoval, Hannes Reinecke, linux-scsi
In-Reply-To: <20170411205842.28137-1-bart.vanassche@sandisk.com>

Show the SCSI CDB, .eh_eflags and .result for pending SCSI commands
in /sys/kernel/debug/block/*/mq/*/dispatch and */rq_list.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: <linux-scsi@vger.kernel.org>
---
 drivers/scsi/scsi_lib.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7bc4513bf4e4..7d3efb8924ee 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2126,6 +2126,32 @@ static void scsi_exit_rq(struct request_queue *q, struct request *rq)
 	scsi_free_sense_buffer(shost, cmd->sense_buffer);
 }
 
+static const char *const ehflag_name[] = {
+	[ilog2(SCSI_EH_CANCEL_CMD)]	 = "CANCEL_CMD",
+	[ilog2(SCSI_EH_ABORT_SCHEDULED)] = "ABORT_SCHEDULED",
+};
+
+static void scsi_show_rq(struct request *rq, char *info, unsigned int info_sz)
+{
+	char *p = info, *const end = info + info_sz;
+	struct scsi_cmnd *cmd = container_of(scsi_req(rq), typeof(*cmd), req);
+	unsigned int i;
+
+	p += scnprintf(p, end - p, ".cmd =");
+	for (i = 0; i < cmd->cmd_len; i++)
+		p += scnprintf(p, end - p, " %02x", cmd->cmnd[i]);
+	p += scnprintf(p, end - p, ", .eh_eflags =");
+	for (i = 0; i < sizeof(cmd->eh_eflags) * BITS_PER_BYTE; i++) {
+		if (!(cmd->eh_eflags & BIT(i)))
+			continue;
+		if (i < ARRAY_SIZE(ehflag_name) && ehflag_name[i])
+			p += scnprintf(p, end - p, " %s", ehflag_name[i]);
+		else
+			p += scnprintf(p, end - p, " %d", i);
+	}
+	p += scnprintf(p, end - p, ", .result = %#06x", cmd->result);
+}
+
 struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
 {
 	struct Scsi_Host *shost = sdev->host;
@@ -2158,6 +2184,7 @@ static const struct blk_mq_ops scsi_mq_ops = {
 	.queue_rq	= scsi_queue_rq,
 	.complete	= scsi_softirq_done,
 	.timeout	= scsi_timeout,
+	.show_rq	= scsi_show_rq,
 	.init_request	= scsi_init_request,
 	.exit_request	= scsi_exit_request,
 	.map_queues	= scsi_map_queues,
-- 
2.12.0

^ permalink raw reply related
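
The patch above builds its output with repeated "p += scnprintf(p, end - p,
...)" calls, which stay within the buffer because scnprintf() returns the
number of characters actually stored rather than the number that would have
been needed. A user-space sketch of the same idiom follows. scnprintf() is
kernel-only, so bounded_printf() below is a hypothetical stand-in built on
vsnprintf(), and format_cdb() is an illustrative helper, not part of the
patch.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical user-space stand-in for the kernel's scnprintf(): returns
 * the number of characters stored in buf, excluding the trailing NUL,
 * which is never more than size - 1.
 */
int bounded_printf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0) {
		buf[0] = '\0';
		return 0;
	}
	if ((size_t)n >= size)
		return (int)(size - 1);	/* output was truncated */
	return n;
}

/* Append ".cmd =" followed by each command byte in hex, never overflowing. */
void format_cdb(const unsigned char *cdb, size_t len,
		char *info, size_t info_sz)
{
	char *p = info, *const end = info + info_sz;
	size_t i;

	p += bounded_printf(p, (size_t)(end - p), ".cmd =");
	for (i = 0; i < len; i++)
		p += bounded_printf(p, (size_t)(end - p), " %02x", cdb[i]);
}
```

Because each call advances p by at most the remaining space minus one, p
never reaches end, and once the buffer is full the later calls store nothing.
Using plain snprintf() return values in the same way would advance p past
the end of the buffer on truncation.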

* [PATCH 5/6] blk-mq: Add blk_mq_ops.show_rq()
From: Bart Van Assche @ 2017-04-11 20:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Bart Van Assche, Omar Sandoval, Hannes Reinecke
In-Reply-To: <20170411205842.28137-1-bart.vanassche@sandisk.com>

This new callback function will be used in the next patch to show
more information about SCSI requests.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
---
 block/blk-mq-debugfs.c | 10 ++++++++--
 include/linux/blk-mq.h |  6 ++++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 161f30fc236f..6b28d92d4c0e 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -316,7 +316,9 @@ static const char *const rqf_name[] = {
 static int blk_mq_debugfs_rq_show(struct seq_file *m, void *v)
 {
 	struct request *rq = list_entry_rq(v);
+	const struct blk_mq_ops *const mq_ops = rq->q->mq_ops;
 	const unsigned int op = rq->cmd_flags & REQ_OP_MASK;
+	char drv_info[200];
 
 	seq_printf(m, "%p {.op=", rq);
 	if (op < ARRAY_SIZE(op_name) && op_name[op])
@@ -329,8 +331,12 @@ static int blk_mq_debugfs_rq_show(struct seq_file *m, void *v)
 	seq_puts(m, ", .rq_flags=");
 	blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
 		       ARRAY_SIZE(rqf_name));
-	seq_printf(m, ", .tag=%d, .internal_tag=%d}\n", rq->tag,
-		   rq->internal_tag);
+	if (mq_ops->show_rq)
+		mq_ops->show_rq(rq, drv_info, sizeof(drv_info));
+	else
+		drv_info[0] = '\0';
+	seq_printf(m, ", .tag=%d, .internal_tag=%d%s%s}\n", rq->tag,
+		   rq->internal_tag, drv_info[0] ? ", " : "", drv_info);
 	return 0;
 }
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index dea255dee359..de2a1c2ddd84 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -120,6 +120,12 @@ struct blk_mq_ops {
 	softirq_done_fn		*complete;
 
 	/*
+	 * Used by the debugfs implementation to show driver-specific
+	 * information about a request.
+	 */
+	void (*show_rq)(struct request *rq, char *info, unsigned int info_sz);
+
+	/*
 	 * Called when the block layer side of a hardware queue has been
 	 * set up, allowing the driver to allocate/init matching structures.
 	 * Ditto for exit/teardown.
-- 
2.12.0

^ permalink raw reply related
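
The optional-callback pattern introduced by the patch above can be sketched
in user space as follows. All names here (struct demo_request, struct
demo_ops, format_request(), demo_show_rq()) are hypothetical and only mirror
the shape of the change. One deliberate difference from the patch: the sketch
clears drv_info before invoking the callback, so a callback that writes
nothing still leaves a valid empty string.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for struct request and struct blk_mq_ops. */
struct demo_request {
	int tag;
	int internal_tag;
};

struct demo_ops {
	/* Optional: may be NULL if the driver has nothing to add. */
	void (*show_rq)(const struct demo_request *rq, char *info,
			unsigned int info_sz);
};

/* Format the generic fields, then append driver info if a callback exists. */
int format_request(const struct demo_ops *ops, const struct demo_request *rq,
		   char *out, size_t out_sz)
{
	char drv_info[200];

	drv_info[0] = '\0';
	if (ops->show_rq)
		ops->show_rq(rq, drv_info, sizeof(drv_info));

	return snprintf(out, out_sz, "{.tag=%d, .internal_tag=%d%s%s}",
			rq->tag, rq->internal_tag,
			drv_info[0] ? ", " : "", drv_info);
}

/* Example driver callback; the ".demo" field is made up for illustration. */
void demo_show_rq(const struct demo_request *rq, char *info,
		  unsigned int info_sz)
{
	snprintf(info, info_sz, ".demo=%d", rq->tag * 2);
}
```

The ", " separator is only emitted when the driver produced output, so the
line format is unchanged for drivers that do not implement the callback.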

* [PATCH 4/6] blk-mq: Show operation, cmd_flags and rq_flags names
From: Bart Van Assche @ 2017-04-11 20:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Bart Van Assche, Omar Sandoval, Hannes Reinecke
In-Reply-To: <20170411205842.28137-1-bart.vanassche@sandisk.com>

Show the operation name, .cmd_flags and .rq_flags as names instead
of numbers.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
---
 block/blk-mq-debugfs.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 69 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index aae4b7c7b7b0..161f30fc236f 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -258,13 +258,79 @@ static const struct file_operations hctx_flags_fops = {
 	.release	= single_release,
 };
 
+static const char *const op_name[] = {
+	[REQ_OP_READ]		= "READ",
+	[REQ_OP_WRITE]		= "WRITE",
+	[REQ_OP_FLUSH]		= "FLUSH",
+	[REQ_OP_DISCARD]	= "DISCARD",
+	[REQ_OP_ZONE_REPORT]	= "ZONE_REPORT",
+	[REQ_OP_SECURE_ERASE]	= "SECURE_ERASE",
+	[REQ_OP_ZONE_RESET]	= "ZONE_RESET",
+	[REQ_OP_WRITE_SAME]	= "WRITE_SAME",
+	[REQ_OP_WRITE_ZEROES]	= "WRITE_ZEROES",
+	[REQ_OP_SCSI_IN]	= "SCSI_IN",
+	[REQ_OP_SCSI_OUT]	= "SCSI_OUT",
+	[REQ_OP_DRV_IN]		= "DRV_IN",
+	[REQ_OP_DRV_OUT]	= "DRV_OUT",
+};
+
+static const char *const cmd_flag_name[] = {
+	[__REQ_FAILFAST_DEV]		= "FAILFAST_DEV",
+	[__REQ_FAILFAST_TRANSPORT]	= "FAILFAST_TRANSPORT",
+	[__REQ_FAILFAST_DRIVER]		= "FAILFAST_DRIVER",
+	[__REQ_SYNC]			= "SYNC",
+	[__REQ_META]			= "META",
+	[__REQ_PRIO]			= "PRIO",
+	[__REQ_NOMERGE]			= "NOMERGE",
+	[__REQ_IDLE]			= "IDLE",
+	[__REQ_INTEGRITY]		= "INTEGRITY",
+	[__REQ_FUA]			= "FUA",
+	[__REQ_PREFLUSH]		= "PREFLUSH",
+	[__REQ_RAHEAD]			= "RAHEAD",
+	[__REQ_BACKGROUND]		= "BACKGROUND",
+	[__REQ_NR_BITS]			= "NR_BITS",
+};
+
+static const char *const rqf_name[] = {
+	[ilog2(RQF_SORTED)]		= "SORTED",
+	[ilog2(RQF_STARTED)]		= "STARTED",
+	[ilog2(RQF_QUEUED)]		= "QUEUED",
+	[ilog2(RQF_SOFTBARRIER)]	= "SOFTBARRIER",
+	[ilog2(RQF_FLUSH_SEQ)]		= "FLUSH_SEQ",
+	[ilog2(RQF_MIXED_MERGE)]	= "MIXED_MERGE",
+	[ilog2(RQF_MQ_INFLIGHT)]	= "MQ_INFLIGHT",
+	[ilog2(RQF_DONTPREP)]		= "DONTPREP",
+	[ilog2(RQF_PREEMPT)]		= "PREEMPT",
+	[ilog2(RQF_COPY_USER)]		= "COPY_USER",
+	[ilog2(RQF_FAILED)]		= "FAILED",
+	[ilog2(RQF_QUIET)]		= "QUIET",
+	[ilog2(RQF_ELVPRIV)]		= "ELVPRIV",
+	[ilog2(RQF_IO_STAT)]		= "IO_STAT",
+	[ilog2(RQF_ALLOCED)]		= "ALLOCED",
+	[ilog2(RQF_PM)]			= "PM",
+	[ilog2(RQF_HASHED)]		= "HASHED",
+	[ilog2(RQF_STATS)]		= "STATS",
+	[ilog2(RQF_SPECIAL_PAYLOAD)]	= "SPECIAL_PAYLOAD",
+};
+
 static int blk_mq_debugfs_rq_show(struct seq_file *m, void *v)
 {
 	struct request *rq = list_entry_rq(v);
+	const unsigned int op = rq->cmd_flags & REQ_OP_MASK;
 
-	seq_printf(m, "%p {.cmd_flags=0x%x, .rq_flags=0x%x, .tag=%d, .internal_tag=%d}\n",
-		   rq, rq->cmd_flags, (__force unsigned int)rq->rq_flags,
-		   rq->tag, rq->internal_tag);
+	seq_printf(m, "%p {.op=", rq);
+	if (op < ARRAY_SIZE(op_name) && op_name[op])
+		seq_printf(m, "%s", op_name[op]);
+	else
+		seq_printf(m, "%d", op);
+	seq_puts(m, ", .cmd_flags=");
+	blk_flags_show(m, rq->cmd_flags ^ op, cmd_flag_name,
+		       ARRAY_SIZE(cmd_flag_name));
+	seq_puts(m, ", .rq_flags=");
+	blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
+		       ARRAY_SIZE(rqf_name));
+	seq_printf(m, ", .tag=%d, .internal_tag=%d}\n", rq->tag,
+		   rq->internal_tag);
 	return 0;
 }
 
-- 
2.12.0

^ permalink raw reply related
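
The name tables in the patch above rely on designated initializers indexed by
bit position, with a numeric fallback for bits that have no entry. A
user-space sketch of that lookup follows. The DEMO_* bit positions,
demo_flag_name[] and flags_to_string() are hypothetical; the sketch writes to
a caller-supplied buffer instead of a seq_file, and the single-space
separator is an assumption.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bit positions; bit 2 is intentionally left unnamed. */
enum { DEMO_SYNC = 0, DEMO_META = 1, DEMO_FUA = 3 };

const char *const demo_flag_name[] = {
	[DEMO_SYNC]	= "SYNC",
	[DEMO_META]	= "META",
	[DEMO_FUA]	= "FUA",
};

/*
 * Write the names of all set bits into out, separated by spaces. Bits
 * without a table entry are printed as their bit number. No trailing
 * newline is added, so several translations can share one output line.
 */
void flags_to_string(unsigned long flags, const char *const *names,
		     size_t nr_names, char *out, size_t out_sz)
{
	size_t used = 0;
	unsigned int i;

	if (out_sz == 0)
		return;
	out[0] = '\0';

	for (i = 0; i < sizeof(flags) * CHAR_BIT; i++) {
		const char *sep = used ? " " : "";
		int n;

		if (!(flags & (1UL << i)))
			continue;
		if (i < nr_names && names[i])
			n = snprintf(out + used, out_sz - used, "%s%s",
				     sep, names[i]);
		else
			n = snprintf(out + used, out_sz - used, "%s%u",
				     sep, i);
		if (n < 0 || (size_t)n >= out_sz - used)
			return;	/* error or truncated; out stays terminated */
		used += (size_t)n;
	}
}
```

Gaps in a designated-initializer array are NULL, which is why the lookup
checks both the array bound and the entry itself before using a name.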

* [PATCH 3/6] blk-mq: Make blk_flags_show() callers append a newline character
From: Bart Van Assche @ 2017-04-11 20:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Bart Van Assche, Omar Sandoval, Hannes Reinecke
In-Reply-To: <20170411205842.28137-1-bart.vanassche@sandisk.com>

This patch does not change any functionality but makes it possible
to produce a single line of output with multiple flag-to-name
translations.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
---
 block/blk-mq-debugfs.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 564470d4af52..aae4b7c7b7b0 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -60,7 +60,6 @@ static int blk_flags_show(struct seq_file *m, const unsigned long flags,
 		else
 			seq_printf(m, "%d", i);
 	}
-	seq_puts(m, "\n");
 	return 0;
 }
 
@@ -102,6 +101,7 @@ static int blk_queue_flags_show(struct seq_file *m, void *v)
 
 	blk_flags_show(m, q->queue_flags, blk_queue_flag_name,
 		       ARRAY_SIZE(blk_queue_flag_name));
+	seq_puts(m, "\n");
 	return 0;
 }
 
@@ -198,6 +198,7 @@ static int hctx_state_show(struct seq_file *m, void *v)
 
 	blk_flags_show(m, hctx->state, hctx_state_name,
 		       ARRAY_SIZE(hctx_state_name));
+	seq_puts(m, "\n");
 	return 0;
 }
 
@@ -241,6 +242,7 @@ static int hctx_flags_show(struct seq_file *m, void *v)
 	blk_flags_show(m,
 		       hctx->flags ^ BLK_ALLOC_POLICY_TO_MQ_FLAG(alloc_policy),
 		       hctx_flag_name, ARRAY_SIZE(hctx_flag_name));
+	seq_puts(m, "\n");
 	return 0;
 }
 
-- 
2.12.0

^ permalink raw reply related

* [PATCH 2/6] blk-mq: Move the "state" debugfs attribute one level down
From: Bart Van Assche @ 2017-04-11 20:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Bart Van Assche, Omar Sandoval, Hannes Reinecke
In-Reply-To: <20170411205842.28137-1-bart.vanassche@sandisk.com>

Move the "state" attribute from the top level to the "mq" directory
as requested by Omar.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
---
 block/blk-mq-debugfs.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index a1ce823578c7..564470d4af52 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -149,11 +149,6 @@ static const struct file_operations blk_queue_flags_fops = {
 	.write		= blk_queue_flags_store,
 };
 
-static const struct blk_mq_debugfs_attr blk_queue_attrs[] = {
-	{"state", 0600, &blk_queue_flags_fops},
-	{},
-};
-
 static void print_stat(struct seq_file *m, struct blk_rq_stat *stat)
 {
 	if (stat->nr_samples) {
@@ -762,6 +757,7 @@ static const struct file_operations ctx_completed_fops = {
 
 static const struct blk_mq_debugfs_attr blk_mq_debugfs_queue_attrs[] = {
 	{"poll_stat", 0400, &queue_poll_stat_fops},
+	{"state", 0600, &blk_queue_flags_fops},
 	{},
 };
 
@@ -877,9 +873,6 @@ int blk_mq_debugfs_register_hctxs(struct request_queue *q)
 	if (!q->debugfs_dir)
 		return -ENOENT;
 
-	if (!debugfs_create_files(q->debugfs_dir, q, blk_queue_attrs))
-		goto err;
-
 	q->mq_debugfs_dir = debugfs_create_dir("mq", q->debugfs_dir);
 	if (!q->mq_debugfs_dir)
 		goto err;
-- 
2.12.0

^ permalink raw reply related
