* [GIT PULL] Block changes for 4.21-rc
@ 2018-12-21 4:04 Jens Axboe
2018-12-23 3:35 ` Jens Axboe
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Jens Axboe @ 2018-12-21 4:04 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-block@vger.kernel.org
Hi Linus,
Sending this out a bit early to get this sorted for the holiday break.
This is the main pull request for block/storage for 4.21. Larger than
usual, it was a busy round with lots of goodies queued up. Most notable
is the removal of the old IO stack, which has been a long time coming.
No new features have gone in for a while; everything coming in this
week has been fixes for things that were previously merged.
Note that I've pulled in 4.20-rc a few times, partly to resolve a few
conflicts, but mostly to get the important fixes that went into mainline
late in this series.
The IRQ changes are staged in Thomas's tree, but I pulled a branch from
him to get them in here as well, as the multiple queue maps feature
depends on it.
This pull request contains:
- Use atomic counters instead of semaphores for mtip32xx (Arnd)
- Cleanup of the mtip32xx request setup (Christoph)
- Fix for circular locking dependency in loop (Jan, Tetsuo)
- bcache (Coly, Guoju, Shenghui)
    - Optimizations for writeback caching
    - Various fixes and improvements
- nvme (Chaitanya, Christoph, Sagi, Jay, me, Keith)
    - host and target support for NVMe over TCP
    - Error log page support
    - Support for separate read/write/poll queues
    - Much improved polling
    - discard OOM fallback
    - Tracepoint improvements
- lightnvm (Hans, Hua, Igor, Matias, Javier)
    - Igor added packed metadata to pblk. Now drives without metadata
      per LBA can be used as well.
    - Fix from Geert on uninitialized value on chunk metadata reads.
    - Fixes from Hans and Javier to pblk recovery and write path.
    - Fix from Hua Su to fix a race condition in the pblk recovery code.
    - Scan optimization added to pblk recovery from Zhoujie.
    - Small geometry cleanup from me.
- Conversion of the last few drivers that used the legacy path to
  blk-mq (me)
- Removal of legacy IO path in SCSI (me, Christoph)
- Removal of legacy IO stack and schedulers (me)
- Support for much better polling, now without interrupts at all. blk-mq
  adds support for multiple queue maps, which enables us to have a map
  per type. This in turn enables nvme to have separate completion queues
  for polling, which can then be interrupt-less. Also means we're ready
  for async polled IO, which is hopefully coming in the next release.
- Killing of (now) unused block exports (Christoph)
- Unification of the blk-rq-qos and blk-wbt wait handling (Josef)
- Support for zoned testing with null_blk (Masato)
- sx8 conversion to per-host tag sets (Christoph)
- IO priority improvements (Damien)
- mq-deadline zoned fix (Damien)
- Ref count blkcg series (Dennis)
- Lots of blk-mq improvements and speedups (me)
- sbitmap scalability improvements (me)
- Make core inflight IO accounting per-cpu (Mikulas)
- Export timeout setting in sysfs (Weiping)
- Cleanup the direct issue path (Jianchao)
- Export blk-wbt internals in block debugfs for easier debugging (Ming)
- Lots of other fixes and improvements
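As a rough illustration of the "multiple queue maps" item above (one map
per IO type, with polled IO getting its own queues), here is a small
userspace model in C. It is only a sketch under assumptions: the names
queue_type, queue_map, tag_set and pick_queue_type are hypothetical and
are not the kernel's types or helpers. The idea shown is that a request
prefers the poll map when it is high priority, the read map when it is a
read, and falls back to the default map whenever the preferred map has
no queues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of per-type queue maps; not the kernel API. */
enum queue_type { QT_DEFAULT, QT_READ, QT_POLL, QT_MAX };

struct queue_map {
	unsigned int nr_queues;	/* 0: the driver did not populate this map */
};

struct tag_set {
	struct queue_map map[QT_MAX];
};

/*
 * Choose which map a request should use. A map with zero queues is
 * skipped, so everything ends up on the default map if the driver only
 * set up that one.
 */
static enum queue_type pick_queue_type(const struct tag_set *set,
				       bool is_read, bool high_prio)
{
	assert(set != NULL);

	if (high_prio && set->map[QT_POLL].nr_queues)
		return QT_POLL;
	if (is_read && set->map[QT_READ].nr_queues)
		return QT_READ;
	return QT_DEFAULT;
}
```

With a set that has default and read queues but no poll queues, a
high-priority read in this model lands on the read map; once poll queues
are configured, the same request moves to the poll map.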
Please pull!
git://git.kernel.dk/linux-block.git tags/for-4.21/block-20181221
----------------------------------------------------------------
Arnd Bergmann (1):
mtip32xx: avoid using semaphores
Balbir Singh (1):
block: add cmd_flags to print_req_error
Chaitanya Kulkarni (17):
nvme: consolidate memset calls in the nvme_setup_cmd path
nvmet: use IOCB_NOWAIT for file-ns buffered I/O
nvmet: use unlikely for req status check
nvmet: fix the structure member indentation
nvme: remove nvme_common command cdw10 array
nvme: add error log page slot definition
nvmet: add error-log definitions
nvmet: add interface to update error-log page
nvmet: add error log support in the core
nvmet: add error log support for fabrics-cmd
nvmet: add error log support for rdma backend
nvmet: add error log support for admin-cmd
nvmet: add error log support for bdev backend
nvmet: add error log support for file backend
nvmet: add error log page cmd handler
nvmet: update smart log with num err log entries
nvmet: use a macro for default error location
Chengguang Xu (3):
nvme: add __exit annotation
aoe: add __exit annotation
block: loop: check error using IS_ERR instead of IS_ERR_OR_NULL in loop_add()
Christoph Hellwig (76):
sx8: cleanup queue and disk allocation / freeing
sx8: use a per-host tag_set
mtip32xx: move the blk_rq_map_sg call to mtip_hw_submit_io
mtip32xx: merge mtip_submit_request into mtip_queue_rq
mtip32xx: return a blk_status_t from mtip_send_trim
mtip32xx: remove __force_bit2int
mtip32xx: add missing endianess annotations on struct smart_attr
mtip32xx: remove mtip_init_cmd_header
mtip32xx: remove mtip_get_int_command
mtip32xx: don't use req->special
mtip32xxx: use for_each_sg
block: remove req->timeout_list
ide: cleanup ->prep_rq calling convention
scsi: simplify scsi_prep_state_check
scsi: push blk_status_t up into scsi_setup_{fs,scsi}_cmnd
scsi: clean up error handling in scsi_init_io
scsi: return blk_status_t from scsi_init_io and ->init_command
scsi: return blk_status_t from device handler ->prep_fn
block: remove the BLKPREP_* values.
fnic: fix fnic_scsi_host_{start,end}_tag
nullb: remove leftover legacy request code
skd_main: don't use req->special
aoe: replace ->special use with private data in the request
pd: replace ->special use with private data in the request
ide: don't use req->special
block: remove QUEUE_FLAG_BYPASS and ->bypass
block: remove deadline __deadline manipulation helpers
block: don't hold the queue_lock over blk_abort_request
block: use atomic bitops for ->queue_flags
block: remove queue_lockdep_assert_held
block: remove the unused lock argument to rq_qos_throttle
block: update a few comments for the legacy request removal
block: remove a few unused exports
blk-cgroup: consolidate error handling in blkcg_init_queue
blk-cgroup: move locking into blkg_destroy_all
drbd: don't override the queue_lock
umem: don't override the queue_lock
mmc: simplify queue initialization
mmc: stop abusing the request queue_lock pointer
block: remove the lock argument to blk_alloc_queue_node
block: remove the queue_lock indirection
block: remove the rq_alloc_data request_queue field
floppy: remove queue_lock around floppy_end_request
pktcdvd: remove queue_lock around blk_queue_max_hw_sectors
ide: don't acquire queue lock in ide_pm_execute_rq
ide: don't acquire queue_lock in ide_complete_pm_rq
mmc: stop abusing the request queue_lock pointer
block: avoid extra bio reference for async O_DIRECT
aio: clear IOCB_HIPRI
block: move queues types to the block layer
nvme-pci: use atomic bitops to mark a queue enabled
nvme-pci: cleanup SQ allocation a bit
nvme-pci: only allow polling with separate poll queues
nvme-pci: consolidate code for polling non-dedicated queues
nvme-pci: refactor nvme_disable_io_queues
nvme-pci: don't poll from irq context when deleting queues
nvme-pci: remove the CQ lock for interrupt driven queues
nvme-rdma: remove I/O polling support
nvme-mpath: remove I/O polling support
block: remove ->poll_fn
block: only allow polling if a poll queue_map exists
block: enable polling by default if a poll map is initalized
nvmet: mark nvmet_genctr static
block: remove the bio_phys_segments export
block: remove the blk_recount_segments export
block: remove the unused bio_iov_iter_get_pages export
block: remove the unused bio_set_pages_dirty and bio_check_pages_dirty exports
block: remove the bioset_integrity_free export
block: remove the bio_integrity_advance export
block: clear REQ_HIPRI if polling is not supported
blk-mq: only dispatch to non-defauly queue maps if they have queues
nvme-pci: don't share queue maps
nvme-pci: only set nr_maps to 2 if poll queues are supported
nvme-pci: refactor nvme_poll_irqdisable to make sparse happy
nvmet-tcp: fix endianess annotations
nvme-tcp: fix endianess annotations
Colin Ian King (4):
ms_block: remove unused pointer 'set'
block: clean up dead code that is now redundant
nvmet: fix comparison of a u16 with -1
nvme-tcp: fix spelling mistake "attepmpt" -> "attempt"
Coly Li (5):
bcache: introduce force_wake_up_gc()
bcache: option to automatically run gc thread after writeback
bcache: add MODULE_DESCRIPTION information
bcache: make cutoff_writeback and cutoff_writeback_sync tunable
bcache: set writeback_percent in a flexible range
Damien Le Moal (8):
aio: Comment use of IOCB_FLAG_IOPRIO aio flag
block: Remove bio->bi_ioc
block: Introduce get_current_ioprio()
aio: Fix fallback I/O priority value
block: prevent merging of requests with different priorities
block: Initialize BIO I/O priority early
block: update sysfs documentation
block: mq-deadline: Fix write completion handling
Dan Carpenter (3):
ataflop: fix error handling in atari_floppy_init()
blk-mq: Add a NULL check in blk_mq_free_map_and_requests()
scsi: Fix a harmless double shift bug
Dennis Zhou (17):
blkcg: fix ref count issue with bio_blkcg() using task_css
blkcg: update blkg_lookup_create() to do locking
blkcg: convert blkg_lookup_create() to find closest blkg
blkcg: introduce common blkg association logic
dm: set the static flush bio device on demand
blkcg: associate blkg when associating a device
blkcg: consolidate bio_issue_init() to be a part of core
blkcg: associate a blkg for pages being evicted by swap
blkcg: associate writeback bios with a blkg
blkcg: remove bio->bi_css and instead use bio->bi_blkg
blkcg: remove additional reference to the css
blkcg: remove bio_disassociate_task()
blkcg: change blkg reference counting to use percpu_ref
blkcg: rename blkg_try_get() to blkg_tryget()
blkcg: put back rcu lock in blkcg_bio_issue_check()
blkcg: handle dying request_queue when associating a blkg
block: fix blk-iolatency accounting underflow
Eric Biggers (1):
block: make blk_try_req_merge() static
Geert Uytterhoeven (1):
lightnvm: Fix uninitialized return value in nvm_get_chunk_meta()
Guoju Fang (1):
bcache: print number of keys in trace_bcache_journal_write
Hannes Reinecke (1):
nvme: add a numa_node field to struct nvme_ctrl
Hans Holmberg (8):
lightnvm: pblk: fix chunk close trace event check
lightnvm: pblk: fix resubmission of overwritten write err lbas
lightnvm: pblk: account for write error sectors in emeta
lightnvm: pblk: stop writes gracefully when running out of lines
lightnvm: pblk: set conservative threshold for user writes
lightnvm: pblk: remove unused macro
lightnvm: pblk: fix pblk_lines_init error handling path
lightnvm: pblk: remove dead code in pblk_recov_l2p
Hua Su (2):
lightnvm: pblk: fix spelling in comment
lightnvm: pblk: add lock protection to list operations
Igor Konopko (6):
lightnvm: pblk: move lba list to partial read context
lightnvm: pblk: add helpers for OOB metadata
lightnvm: dynamic DMA pool entry size
lightnvm: disable interleaved metadata
lightnvm: pblk: support packed metadata
lightnvm: pblk: do not overwrite ppa list with meta list
Israel Rukshin (3):
nvme: Remove unused forward declaration
nvmet-rdma: Add unlikely for response allocated check
nvme: remove unused function nvme_ctrl_ready
James Smart (1):
nvmet-fc: remove the IN_ISR deferred scheduling options
Jan Kara (14):
loop: Fold __loop_release into loop_release
loop: Get rid of loop_index_mutex
loop: Push lo_ctl_mutex down into individual ioctls
loop: Split setting of lo_state from loop_clr_fd
loop: Push loop_ctl_mutex down into loop_clr_fd()
loop: Push loop_ctl_mutex down to loop_get_status()
loop: Push loop_ctl_mutex down to loop_set_status()
loop: Push loop_ctl_mutex down to loop_set_fd()
loop: Push loop_ctl_mutex down to loop_change_fd()
loop: Move special partition reread handling in loop_clr_fd()
loop: Move loop_reread_partitions() out of loop_ctl_mutex
loop: Fix deadlock when calling blkdev_reread_part()
loop: Avoid circular locking dependency between loop_ctl_mutex and bd_mutex
loop: Get rid of 'nested' acquisition of loop_ctl_mutex
Javier González (2):
lightnvm: pblk: add comments wrt locking in recovery path
lightnvm: pblk: avoid ref warning on cache creation
Jay Sternberg (7):
nvmet: provide aen bit functions for multiple controller types
nvmet: change aen mask functions to use bit numbers
nvmet: allow Keep Alive for Discovery controller
nvmet: make kato and AEN processing for use by other controllers
nvmet: add defines for discovery change async events
nvmet: add support to Discovery controllers for commands
nvmet: enable Discovery Controller AENs
Jens Axboe (108):
genirq/affinity: Add support for allocating interrupt sets
sunvdc: convert to blk-mq
ms_block: convert to blk-mq
mspro_block: convert to blk-mq
ide: convert to blk-mq
blk-mq: remove the request_list usage
blk-mq: remove legacy check in queue blk_freeze_queue()
blk-mq: provide mq_ops->busy() hook
scsi: provide mq_ops->busy() hook
scsi: kill off the legacy IO path
block: remove q->lld_busy_fn()
dasd: remove dead code
bsg: pass in desired timeout handler
bsg: provide bsg_remove_queue() helper
bsg: convert to use blk-mq
block: remove blk_complete_request()
blk-wbt: kill check for legacy queue type
blk-cgroup: remove legacy queue bypassing
block: remove legacy rq tagging
block: remove non mq parts from the flush code
block: cleanup kick/queued handling
block: remove legacy IO schedulers
block: remove dead elevator code
block: get rid of MQ scheduler ops union
block: remove __blk_put_request()
block: kill legacy parts of timeout handling
bsg: move bsg-lib parts outside of request queue
block: remove request_list code
block: kill request slab cache
block: remove req_no_special_merge() from merging code
blk-merge: kill dead queue lock held check
block: get rid of blk_queued_rq()
block: get rid of q->softirq_done_fn()
block: kill request ->cpu member
Merge branch 'irq/for-block' of git://git.kernel.org/.../tip/tip into for-4.21/block
blk-mq: kill q->mq_map
blk-mq: abstract out queue map
blk-mq: provide dummy blk_mq_map_queue_type() helper
blk-mq: pass in request/bio flags to queue mapping
blk-mq: allow software queue to map to multiple hardware queues
blk-mq: add 'type' attribute to the sysfs hctx directory
blk-mq: support multiple hctx maps
blk-mq: separate number of hardware queues from nr_cpu_ids
blk-mq: cache request hardware queue mapping
blk-mq: cleanup and improve list insertion
blk-mq: improve plug list sorting
blk-mq: initial support for multiple queue maps
nvme: utilize two queue maps, one for reads and one for writes
block: add REQ_HIPRI and inherit it from IOCB_HIPRI
nvme: add separate poll queue map
sunvdc: fix compiler warning
blk-mq-tag: change busy_iter_fn to return whether to continue or not
blk-mq: provide a helper to check if a queue is busy
blk-mq-tag: document tag iteration helper return value
null_blk: remove unused nullb device
ide: don't clear special on ide_queue_rq() entry
nvme: fix boot hang with only being able to get one IRQ vector
block: remove dead queue members
block: add wbt_disable_default export for BFQ
nvme: fix handling of EINVAL on pci_alloc_irq_vectors_affinity()
ide: clear ide_req()->special for non-passthrough requests
nvme: provide optimized poll function for separate poll queues
block: add queue_is_mq() helper
blk-rq-qos: inline check for q->rq_qos functions
block: add polled wakeup task helper
block: for async O_DIRECT, mark us as polling if asked to
block: don't plug for aio/O_DIRECT HIPRI IO
floppy: remove now unused 'flags' variable
Merge tag 'v4.20-rc3' into for-4.21/block
nvme: default to 0 poll queues
block: avoid ordered task state change for polled IO
block: have ->poll_fn() return number of entries polled
nvme-fc: remove ->poll implementation
block: fix attempt to assign NULL io_context
blk-mq: when polling for IO, look for any completion
blk-mq: remove 'tag' parameter from mq_ops->poll()
nvme: remove opportunistic polling from bdev target
block: make blk_poll() take a parameter on whether to spin or not
blk-mq: ensure mq_ops ->poll() is entered at least once
blk-mq: never redirect polled IO completions
block: sum requests in the plug structure
blk-mq: fix failure to decrement plug count on single rq removal
block: improve logic around when to sort a plug list
blk-mq: add mq_ops->commit_rqs()
nvme: implement mq_ops->commit_rqs() hook
virtio_blk: implement mq_ops->commit_rqs() hook
ataflop: implement mq_ops->commit_rqs() hook
blk-mq: use bd->last == true for list inserts
blk-mq: use plug for devices that implement ->commits_rqs()
sbitmap: don't loop for find_next_zero_bit() for !round_robin
sbitmap: ammortize cost of clearing bits
sbitmap: optimize wakeup check
blk-mq: don't call ktime_get_ns() if we don't need it
Merge tag 'v4.20-rc5' into for-4.21/block
blk-mq: remove QUEUE_FLAG_POLL from default MQ flags
sbitmap: silence bogus lockdep IRQ warning
Merge tag 'v4.20-rc6' into for-4.21/block
mtip32xx: use BLK_STS_DEV_RESOURCE for device resources
dm: fix inflight IO check
nvme: fix irq vs io_queue calculations
sbitmap: flush deferred clears for resize and shallow gets
nvme: provide fallback for discard alloc failure
Merge branch 'nvme-4.21' of git://git.infradead.org/nvme into for-4.21/block
blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()
Merge branch 'nvme-4.21' of git://git.infradead.org/nvme into for-4.21/block
dm: don't reuse bio for flushes
sbitmap: add helpers for add/del wait queue handling
kyber: use sbitmap add_wait_queue/list_del wait helpers
Jianchao Wang (3):
blk-mq: refactor the code of issue request directly
blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests
blk-mq: replace and kill blk_mq_request_issue_directly
Josef Bacik (3):
block: add rq_qos_wait to rq_qos
block: convert wbt_wait() to use rq_qos_wait()
block: convert io-latency to use rq_qos_wait
Keith Busch (4):
blk-mq: Return true if request was completed
scsi: Do not rely on blk-mq for double completions
blk-mq: Simplify request completion state
nvme: implement Enhanced Command Retry
Long Li (1):
genirq/affinity: Spread IRQs to all available NUMA nodes
Masato Suzuki (1):
null_blk: Add conventional zone configuration for zoned support
Matias Bjørling (1):
lightnvm: simplify geometry enumeration
Mike Snitzer (3):
dm rq: leverage blk_mq_queue_busy() to check for outstanding IO
block: stop passing 'cpu' to all percpu stats methods
dm: fix request-based dm's use of dm_wait_for_completion
Mikulas Patocka (5):
dm: dont rewrite dm_disk(md)->part0.in_flight
block: delete part_round_stats and switch to less precise counting
block: switch to per-cpu in-flight counters
block: return just one value from part_in_flight
dm: remove the pending IO accounting
Ming Lei (13):
genirq/affinity: Move two stage affinity spreading into a helper function
genirq/affinity: Pass first vector to __irq_build_affinity_masks()
blk-mq: not embed .mq_kobj and ctx->kobj into queue instance
blk-mq: re-build queue map in case of kdump kernel
block: deactivate blk_stat timer in wbt_disable_default()
blk-mq-debugfs: support rq_qos
blk-wbt: export internal state via debugfs
blk-mq: fix allocation for queue mapping table
blk-mq: export hctx->type in debugfs instead of sysfs
blk-mq: fix dispatch from sw queue
blk-mq: skip zero-queue maps in blk_mq_map_swqueue
blk-mq: enable IO poll if .nr_queues of type poll > 0
block: save irq state in blkg_lookup_create()
Omar Sandoval (1):
sbitmap: fix sbitmap_for_each_set()
Sagi Grimberg (34):
nvme: introduce ctrl attributes enumeration
nvme: cache controller attributes
nvme: support traffic based keep-alive
nvmet: support for traffic based keep-alive
nvmet: allow host connect even if no allowed subsystems are exported
nvmet: support fabrics sq flow control
nvmet: don't override treq upon modification.
nvmet: expose support for fabrics SQ flow control disable in treq
nvme: disable fabrics SQ flow control when asked by the user
ath6kl: add ath6kl_ prefix to crypto_type
datagram: open-code copy_page_to_iter
iov_iter: pass void csum pointer to csum_and_copy_to_iter
datagram: consolidate datagram copy to iter helpers
iov_iter: introduce hash_and_copy_to_iter helper
datagram: introduce skb_copy_and_hash_datagram_iter helper
nvmet: Add install_queue callout
nvme-fabrics: allow user passing header digest
nvme-fabrics: allow user passing data digest
nvme-tcp: Add protocol header
nvmet-tcp: add NVMe over TCP target driver
nvmet: allow configfs tcp trtype configuration
nvme-tcp: add NVMe over TCP host driver
nvmet: remove unused variable
blk-mq-rdma: pass in queue map to blk_mq_rdma_map_queues
nvme-fabrics: add missing nvmf_ctrl_options documentation
nvme-fabrics: allow user to set nr_write_queues for separate queue maps
nvme-tcp: support separate queue maps for read and write
nvme-rdma: support separate queue maps for read and write
nvme: fix kernel paging oops
block: make request_to_qc_t public
nvme-core: optionally poll sync commands
nvme-fabrics: allow nvmf_connect_io_queue to poll
nvme-fabrics: allow user to pass in nr_poll_queues
nvme-rdma: implement polling queue map
Shenghui Wang (6):
bcache: add comment for cache_set->fill_iter
bcache: do not check if debug dentry is ERR or NULL explicitly on remove
bcache: update comment for bch_data_insert
bcache: update comment in sysfs.c
bcache: do not mark writeback_running too early
bcache: cannot set writeback_running via sysfs if no writeback kthread created
Tetsuo Handa (3):
block/loop: Don't grab "struct file" for vfs_getattr() operation.
block/loop: Use global lock for ioctl() operation.
loop: Fix double mutex_unlock(&loop_ctl_mutex) in loop_control_ioctl()
Weiping Zhang (1):
block: add io timeout to sysfs
Young Xiao (1):
sunvdc: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN
YueHaibing (1):
block: remove set but not used variable 'et'
Yufen Yu (1):
block: use rcu_work instead of call_rcu to avoid sleep in softirq
Zhoujie Wu (1):
lightnvm: pblk: ignore the smeta oob area scan
yupeng (1):
nvme-pci: trace SQ status on completions
Documentation/ABI/testing/sysfs-block | 12 +-
Documentation/admin-guide/cgroup-v2.rst | 8 +-
Documentation/block/biodoc.txt | 88 -
Documentation/block/cfq-iosched.txt | 291 --
Documentation/block/queue-sysfs.txt | 29 +-
Documentation/scsi/scsi-parameters.txt | 5 -
block/Kconfig | 6 -
block/Kconfig.iosched | 61 -
block/Makefile | 5 +-
block/bfq-cgroup.c | 6 +-
block/bfq-iosched.c | 21 +-
block/bio-integrity.c | 2 -
block/bio.c | 202 +-
block/blk-cgroup.c | 272 +-
block/blk-core.c | 2278 +------------
block/blk-exec.c | 20 +-
block/blk-flush.c | 188 +-
block/blk-ioc.c | 54 +-
block/blk-iolatency.c | 75 +-
block/blk-merge.c | 53 +-
block/blk-mq-cpumap.c | 19 +-
block/blk-mq-debugfs.c | 147 +-
block/blk-mq-debugfs.h | 17 +
block/blk-mq-pci.c | 10 +-
block/blk-mq-rdma.c | 8 +-
block/blk-mq-sched.c | 82 +-
block/blk-mq-sched.h | 25 +-
block/blk-mq-sysfs.c | 35 +-
block/blk-mq-tag.c | 41 +-
block/blk-mq-virtio.c | 8 +-
block/blk-mq.c | 757 +++--
block/blk-mq.h | 70 +-
block/blk-pm.c | 20 +-
block/blk-pm.h | 6 +-
block/blk-rq-qos.c | 154 +-
block/blk-rq-qos.h | 96 +-
block/blk-settings.c | 65 +-
block/blk-softirq.c | 27 +-
block/blk-stat.c | 4 -
block/blk-stat.h | 5 +
block/blk-sysfs.c | 107 +-
block/blk-tag.c | 378 --
block/blk-throttle.c | 39 +-
block/blk-timeout.c | 117 +-
block/blk-wbt.c | 176 +-
block/blk-zoned.c | 2 +-
block/blk.h | 188 +-
block/bounce.c | 3 +-
block/bsg-lib.c | 146 +-
block/bsg.c | 2 +-
block/cfq-iosched.c | 4916 ---------------------------
block/deadline-iosched.c | 560 ---
block/elevator.c | 477 +--
block/genhd.c | 63 +-
block/kyber-iosched.c | 37 +-
block/mq-deadline.c | 15 +-
block/noop-iosched.c | 124 -
block/partition-generic.c | 18 +-
drivers/ata/libata-eh.c | 4 -
drivers/block/aoe/aoe.h | 4 +
drivers/block/aoe/aoeblk.c | 1 +
drivers/block/aoe/aoecmd.c | 27 +-
drivers/block/aoe/aoedev.c | 11 +-
drivers/block/aoe/aoemain.c | 2 +-
drivers/block/ataflop.c | 26 +-
drivers/block/drbd/drbd_main.c | 2 +-
drivers/block/floppy.c | 6 -
drivers/block/loop.c | 415 ++-
drivers/block/loop.h | 1 -
drivers/block/mtip32xx/mtip32xx.c | 226 +-
drivers/block/mtip32xx/mtip32xx.h | 48 +-
drivers/block/nbd.c | 3 +-
drivers/block/null_blk.h | 1 +
drivers/block/null_blk_main.c | 21 +-
drivers/block/null_blk_zoned.c | 27 +-
drivers/block/paride/pd.c | 30 +-
drivers/block/pktcdvd.c | 2 -
drivers/block/skd_main.c | 16 +-
drivers/block/sunvdc.c | 153 +-
drivers/block/sx8.c | 434 +--
drivers/block/umem.c | 3 +-
drivers/block/virtio_blk.c | 17 +-
drivers/ide/ide-atapi.c | 27 +-
drivers/ide/ide-cd.c | 179 +-
drivers/ide/ide-devsets.c | 4 +-
drivers/ide/ide-disk.c | 15 +-
drivers/ide/ide-eh.c | 2 +-
drivers/ide/ide-floppy.c | 2 +-
drivers/ide/ide-io.c | 112 +-
drivers/ide/ide-park.c | 8 +-
drivers/ide/ide-pm.c | 46 +-
drivers/ide/ide-probe.c | 69 +-
drivers/ide/ide-tape.c | 2 +-
drivers/ide/ide-taskfile.c | 2 +-
drivers/lightnvm/core.c | 25 +-
drivers/lightnvm/pblk-core.c | 77 +-
drivers/lightnvm/pblk-init.c | 103 +-
drivers/lightnvm/pblk-map.c | 63 +-
drivers/lightnvm/pblk-rb.c | 5 +-
drivers/lightnvm/pblk-read.c | 66 +-
drivers/lightnvm/pblk-recovery.c | 46 +-
drivers/lightnvm/pblk-rl.c | 5 +-
drivers/lightnvm/pblk-sysfs.c | 7 +
drivers/lightnvm/pblk-write.c | 64 +-
drivers/lightnvm/pblk.h | 43 +-
drivers/md/bcache/bcache.h | 20 +-
drivers/md/bcache/btree.c | 5 +
drivers/md/bcache/btree.h | 18 +
drivers/md/bcache/debug.c | 3 +-
drivers/md/bcache/journal.c | 2 +-
drivers/md/bcache/request.c | 6 +-
drivers/md/bcache/super.c | 48 +-
drivers/md/bcache/sysfs.c | 61 +-
drivers/md/bcache/writeback.c | 30 +-
drivers/md/bcache/writeback.h | 12 +-
drivers/md/dm-core.h | 5 -
drivers/md/dm-rq.c | 7 +-
drivers/md/dm-table.c | 4 +-
drivers/md/dm.c | 79 +-
drivers/md/md.c | 7 +-
drivers/md/raid0.c | 2 +-
drivers/memstick/core/ms_block.c | 109 +-
drivers/memstick/core/ms_block.h | 1 +
drivers/memstick/core/mspro_block.c | 121 +-
drivers/mmc/core/block.c | 26 +-
drivers/mmc/core/queue.c | 110 +-
drivers/mmc/core/queue.h | 4 +-
drivers/net/wireless/ath/ath6kl/cfg80211.c | 2 +-
drivers/net/wireless/ath/ath6kl/common.h | 2 +-
drivers/net/wireless/ath/ath6kl/wmi.c | 6 +-
drivers/net/wireless/ath/ath6kl/wmi.h | 6 +-
drivers/nvdimm/pmem.c | 2 +-
drivers/nvme/host/Kconfig | 15 +
drivers/nvme/host/Makefile | 3 +
drivers/nvme/host/core.c | 191 +-
drivers/nvme/host/fabrics.c | 61 +-
drivers/nvme/host/fabrics.h | 17 +-
drivers/nvme/host/fc.c | 43 +-
drivers/nvme/host/lightnvm.c | 33 +-
drivers/nvme/host/multipath.c | 20 +-
drivers/nvme/host/nvme.h | 24 +-
drivers/nvme/host/pci.c | 518 ++-
drivers/nvme/host/rdma.c | 119 +-
drivers/nvme/host/tcp.c | 2278 +++++++++++++
drivers/nvme/host/trace.c | 3 +
drivers/nvme/host/trace.h | 27 +-
drivers/nvme/target/Kconfig | 10 +
drivers/nvme/target/Makefile | 2 +
drivers/nvme/target/admin-cmd.c | 146 +-
drivers/nvme/target/configfs.c | 43 +-
drivers/nvme/target/core.c | 220 +-
drivers/nvme/target/discovery.c | 139 +-
drivers/nvme/target/fabrics-cmd.c | 64 +-
drivers/nvme/target/fc.c | 66 +-
drivers/nvme/target/io-cmd-bdev.c | 89 +-
drivers/nvme/target/io-cmd-file.c | 165 +-
drivers/nvme/target/loop.c | 2 +-
drivers/nvme/target/nvmet.h | 68 +-
drivers/nvme/target/rdma.c | 12 +-
drivers/nvme/target/tcp.c | 1737 ++++++++++
drivers/pci/msi.c | 14 +
drivers/s390/block/dasd_ioctl.c | 22 +-
drivers/scsi/Kconfig | 12 -
drivers/scsi/bnx2i/bnx2i_hwi.c | 8 +-
drivers/scsi/csiostor/csio_scsi.c | 8 +-
drivers/scsi/cxlflash/main.c | 6 -
drivers/scsi/device_handler/scsi_dh_alua.c | 21 +-
drivers/scsi/device_handler/scsi_dh_emc.c | 8 +-
drivers/scsi/device_handler/scsi_dh_hp_sw.c | 7 +-
drivers/scsi/device_handler/scsi_dh_rdac.c | 7 +-
drivers/scsi/fnic/fnic_scsi.c | 4 +-
drivers/scsi/hosts.c | 29 +-
drivers/scsi/libsas/sas_ata.c | 5 -
drivers/scsi/libsas/sas_scsi_host.c | 10 +-
drivers/scsi/lpfc/lpfc_scsi.c | 2 +-
drivers/scsi/osd/osd_initiator.c | 4 +-
drivers/scsi/osst.c | 2 +-
drivers/scsi/qedi/qedi_main.c | 3 +-
drivers/scsi/qla2xxx/qla_nvme.c | 12 -
drivers/scsi/qla2xxx/qla_os.c | 37 +-
drivers/scsi/scsi.c | 5 +-
drivers/scsi/scsi_debug.c | 3 +-
drivers/scsi/scsi_error.c | 24 +-
drivers/scsi/scsi_lib.c | 806 +----
drivers/scsi/scsi_priv.h | 1 -
drivers/scsi/scsi_scan.c | 10 +-
drivers/scsi/scsi_sysfs.c | 8 +-
drivers/scsi/scsi_transport_fc.c | 71 +-
drivers/scsi/scsi_transport_iscsi.c | 7 +-
drivers/scsi/scsi_transport_sas.c | 10 +-
drivers/scsi/sd.c | 85 +-
drivers/scsi/sd.h | 6 +-
drivers/scsi/sd_zbc.c | 10 +-
drivers/scsi/sg.c | 2 +-
drivers/scsi/smartpqi/smartpqi_init.c | 3 +-
drivers/scsi/sr.c | 12 +-
drivers/scsi/st.c | 2 +-
drivers/scsi/ufs/ufs_bsg.c | 4 +-
drivers/scsi/virtio_scsi.c | 3 +-
drivers/target/iscsi/iscsi_target_util.c | 12 +-
drivers/target/target_core_pscsi.c | 2 +-
fs/aio.c | 13 +-
fs/block_dev.c | 50 +-
fs/buffer.c | 10 +-
fs/direct-io.c | 4 +-
fs/ext4/page-io.c | 2 +-
fs/iomap.c | 16 +-
include/linux/bio.h | 29 +-
include/linux/blk-cgroup.h | 227 +-
include/linux/blk-mq-pci.h | 4 +-
include/linux/blk-mq-rdma.h | 2 +-
include/linux/blk-mq-virtio.h | 4 +-
include/linux/blk-mq.h | 83 +-
include/linux/blk_types.h | 24 +-
include/linux/blkdev.h | 250 +-
include/linux/bsg-lib.h | 6 +-
include/linux/cgroup.h | 2 +
include/linux/elevator.h | 94 +-
include/linux/fs.h | 2 +-
include/linux/genhd.h | 57 +-
include/linux/ide.h | 14 +-
include/linux/init.h | 1 -
include/linux/interrupt.h | 4 +
include/linux/ioprio.h | 13 +
include/linux/lightnvm.h | 3 +-
include/linux/nvme-fc-driver.h | 17 -
include/linux/nvme-tcp.h | 189 +
include/linux/nvme.h | 73 +-
include/linux/sbitmap.h | 89 +-
include/linux/skbuff.h | 3 +
include/linux/uio.h | 5 +-
include/linux/writeback.h | 5 +-
include/scsi/scsi_cmnd.h | 6 +-
include/scsi/scsi_dh.h | 2 +-
include/scsi/scsi_driver.h | 3 +-
include/scsi/scsi_host.h | 18 +-
include/scsi/scsi_tcq.h | 14 +-
include/trace/events/bcache.h | 27 +-
include/uapi/linux/aio_abi.h | 2 +
init/do_mounts_initrd.c | 3 -
init/initramfs.c | 6 -
init/main.c | 12 -
kernel/cgroup/cgroup.c | 48 +-
kernel/irq/affinity.c | 148 +-
kernel/trace/blktrace.c | 4 +-
lib/iov_iter.c | 19 +-
lib/sbitmap.c | 170 +-
mm/page_io.c | 9 +-
net/core/datagram.c | 159 +-
249 files changed, 10663 insertions(+), 14494 deletions(-)
delete mode 100644 Documentation/block/cfq-iosched.txt
delete mode 100644 block/blk-tag.c
delete mode 100644 block/cfq-iosched.c
delete mode 100644 block/deadline-iosched.c
delete mode 100644 block/noop-iosched.c
create mode 100644 drivers/nvme/host/tcp.c
create mode 100644 drivers/nvme/target/tcp.c
create mode 100644 include/linux/nvme-tcp.h
--
Jens Axboe
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [GIT PULL] Block changes for 4.21-rc
2018-12-21 4:04 [GIT PULL] Block changes for 4.21-rc Jens Axboe
@ 2018-12-23 3:35 ` Jens Axboe
2018-12-28 21:48 ` Linus Torvalds
2018-12-29 1:30 ` pr-tracker-bot
2 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2018-12-23 3:35 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-block@vger.kernel.org
On 12/20/18 9:04 PM, Jens Axboe wrote:
> Hi Linus,
>
> Sending this out a bit early to get this sorted for the holiday break.
> This is the main pull request for block/storage for 4.21. Larger than
> usual, it was a busy round with lots of goodies queued up. Most notable
> is the removal of the old IO stack, which has been a long time coming.
> No new features for a while, everything coming in this week has all
> been fixes for things that were previously merged.
>
> Note that I've pulled in 4.20-rc a few times to both resolve a few
> conflicts, but mostly to get the important fixes that went into mainline
> late in this series.
The SD discard fix [1] you just merged causes a conflict. Just for
reference, below is the straightforward way to resolve it.
[1] 61cce6f6eeced5ddd9cac55e807fe28b4f18c1ba
diff --cc drivers/scsi/sd.c
index bd0a5c694a97,4a6ed2fc8c71..a1a44f52e0e8
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@@ -760,10 -759,9 +760,10 @@@ static blk_status_t sd_setup_unmap_cmnd
unsigned int data_len = 24;
char *buf;
- rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+ rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
if (!rq->special_vec.bv_page)
- return BLKPREP_DEFER;
+ return BLK_STS_RESOURCE;
+ clear_highpage(rq->special_vec.bv_page);
rq->special_vec.bv_offset = 0;
rq->special_vec.bv_len = data_len;
rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
@@@ -794,10 -793,9 +795,10 @@@ static blk_status_t sd_setup_write_same
u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9);
u32 data_len = sdp->sector_size;
- rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+ rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
if (!rq->special_vec.bv_page)
- return BLKPREP_DEFER;
+ return BLK_STS_RESOURCE;
+ clear_highpage(rq->special_vec.bv_page);
rq->special_vec.bv_offset = 0;
rq->special_vec.bv_len = data_len;
rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
@@@ -825,10 -824,9 +827,10 @@@ static blk_status_t sd_setup_write_same
u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9);
u32 data_len = sdp->sector_size;
- rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+ rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
if (!rq->special_vec.bv_page)
- return BLKPREP_DEFER;
+ return BLK_STS_RESOURCE;
+ clear_highpage(rq->special_vec.bv_page);
rq->special_vec.bv_offset = 0;
rq->special_vec.bv_len = data_len;
rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
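The resolution keeps both sides: the page now comes from sd_page_pool
via mempool_alloc() without __GFP_ZERO, so it is cleared explicitly with
clear_highpage(), and the failure return uses the new BLK_STS_RESOURCE
status. The reason the explicit clear matters can be sketched in user
space. This is only a toy model under stated assumptions: page_pool,
pool_alloc(), pool_free() and get_zeroed_payload() are hypothetical
names, the pool holds a single reserved element, and memset() stands in
for clear_highpage(). It is not the kernel mempool implementation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ 4096  /* assumed page size for this model */

/* Hypothetical one-element reserve pool, loosely modeled on a mempool. */
struct page_pool {
    unsigned char *reserve;   /* preallocated element, may hold stale data */
};

static int pool_init(struct page_pool *p)
{
    p->reserve = malloc(PAGE_SZ);
    return p->reserve ? 0 : -1;
}

/* Hand out the reserved page as-is: nothing is zeroed here. */
static unsigned char *pool_alloc(struct page_pool *p)
{
    unsigned char *page = p->reserve;

    p->reserve = NULL;
    return page;              /* NULL when the reserve is empty */
}

/* Return a page to the reserve with its contents untouched. */
static void pool_free(struct page_pool *p, unsigned char *page)
{
    p->reserve = page;
}

/* Caller-side pattern from the diff: allocate, check, then clear. */
static unsigned char *get_zeroed_payload(struct page_pool *p)
{
    unsigned char *page = pool_alloc(p);

    if (!page)
        return NULL;          /* the driver would return BLK_STS_RESOURCE */
    memset(page, 0, PAGE_SZ); /* stands in for clear_highpage() */
    return page;
}
```

A page that was dirtied and returned to the pool comes back with its old
contents unless the caller clears it, which is what the added
clear_highpage() line takes care of in each of the three hunks.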
--
Jens Axboe
* Re: [GIT PULL] Block changes for 4.21-rc
2018-12-21 4:04 [GIT PULL] Block changes for 4.21-rc Jens Axboe
2018-12-23 3:35 ` Jens Axboe
@ 2018-12-28 21:48 ` Linus Torvalds
2018-12-28 21:57 ` Linus Torvalds
2018-12-31 20:06 ` Jens Axboe
2018-12-29 1:30 ` pr-tracker-bot
2 siblings, 2 replies; 6+ messages in thread
From: Linus Torvalds @ 2018-12-28 21:48 UTC (permalink / raw)
To: Jens Axboe, Christoph Hellwig; +Cc: linux-block@vger.kernel.org
On Thu, Dec 20, 2018 at 8:05 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> Jens Axboe (108):
> block: avoid ordered task state change for polled IO
This one seems *very* questionable.
The commit message is misleading:

    For the core poll helper, the task state setting don't need to imply any
    atomics, as it's the current task itself that is being modified and
    we're not going to sleep.

    For IRQ driven, the wakeup path have the necessary barriers to not need
    us using the heavy handed version of the task state setting.

What? Barriers on one side have *no* meaning or point if the *other*
side doesn't have any barriers. That's completely magical thinking.
But it's also not at all the case that such barriers even exist. The
wakeup side does:

        struct task_struct *waiter = dio->submit.waiter;
        WRITE_ONCE(dio->submit.waiter, NULL);
        blk_wake_io_task(waiter);

and here the sleeping side now does:

        __set_current_state(TASK_UNINTERRUPTIBLE);

        if (!READ_ONCE(dio->submit.waiter))
                break;

which is entirely unordered. So just what protects against this:

    Sleeper:                          Waker (different cpu)

    read submit.waiter
      (sees non-NULL)

                                      WRITE_ONCE(dio->submit.waiter, NULL);
                                      blk_wake_io_task(waiter);

    write TASK_UNINTERRUPTIBLE
    io_schedule()

and now the sleeper sleeps forever, because there will be nobody who
ever wakes it up (the wakeup happened before the write of
TASK_UNINTERRUPTIBLE).
Notice how it does not matter AT ALL what barriers that other CPU is
doing when waking things up. The problem is the sleeping CPU having
reordered the check for "oh, there's a waiter", and "tell people I'm
going to sleep".
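
The interleaving above can be replayed deterministically in user space.
The following is a minimal sketch, not kernel code: struct model,
waker() and lost_wakeup() are hypothetical names, the two fields stand
in for current->state and dio->submit.waiter, and the ordering that
set_current_state() provides is modeled simply as "the state store
happens before the waiter load". The reordered case models what
__set_current_state() permits.

```c
#include <stdbool.h>

enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE };

/* Toy stand-ins for current->state and dio->submit.waiter. */
struct model {
    enum task_state state;
    bool waiter_set;
};

/* Waker: WRITE_ONCE(waiter, NULL), then wake the task. */
static void waker(struct model *m)
{
    m->waiter_set = false;
    m->state = TASK_RUNNING;  /* models the wakeup making the task runnable */
}

/*
 * Run the sleeper's two memory operations in the given order, with the
 * waker running before step 0, before step 1, or after both
 * (waker_at = 0, 1, 2). Returns true if the sleeper would block with no
 * wakeup left to come.
 */
static bool lost_wakeup(bool load_before_store, int waker_at)
{
    struct model m = { TASK_RUNNING, true };
    bool saw_waiter = false;

    for (int step = 0; step < 2; step++) {
        if (waker_at == step)
            waker(&m);
        /* Step 0 is the load only when the load was reordered first. */
        if ((step == 0) == load_before_store)
            saw_waiter = m.waiter_set;
        else
            m.state = TASK_UNINTERRUPTIBLE;
    }
    if (waker_at == 2)
        waker(&m);

    /*
     * io_schedule() is reached only if a waiter was seen, and it blocks
     * for good only if the state is still TASK_UNINTERRUPTIBLE.
     */
    return saw_waiter && m.state == TASK_UNINTERRUPTIBLE;
}
```

With the store ordered before the load, none of the three waker
positions loses the wakeup. With the load first, the middle position
(waker_at == 1) is exactly the diagram: the sleeper saw a non-NULL
waiter, the wakeup came and went, and the late state store then puts the
task to sleep.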
Those memory barriers matter. You can't just remove them randomly and
claim they don't.
Why would you even remove it in the first place?
Guys, if you don't understand memory ordering, then you have
absolutely no business using __set_current_state(). And talking about
barriers on the other side of the equation clearly shows that you
don't seem to understand memory ordering.
Barriers are only *ever* meaningful if they exist on both sides
(although sometimes said barriers can be implicit: an address
dependency or similar in RCU, now that we've decided to not care about
the crazy alpha read-barrier-depends)
Maybe I'm missing something, but this really looks like a completely
invalid "optimization" to me. And it's entirely bogus too. If that
memory barrier matters, you're almost certainly doing something wrong
(most likely benchmarking something pointless).
Linus
* Re: [GIT PULL] Block changes for 4.21-rc
2018-12-28 21:48 ` Linus Torvalds
@ 2018-12-28 21:57 ` Linus Torvalds
2018-12-31 20:06 ` Jens Axboe
1 sibling, 0 replies; 6+ messages in thread
From: Linus Torvalds @ 2018-12-28 21:57 UTC (permalink / raw)
To: Jens Axboe, Christoph Hellwig; +Cc: linux-block@vger.kernel.org
On Fri, Dec 28, 2018 at 1:48 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Maybe I'm missing something, but this really looks like a completely
> invalid "optimization" to me. And it's entirely bogus too. If that
> memory barrier matters, you're almost certainly doing something wrong
> (most likely benchmarking something pointless).
Note: I have pulled the tree, but I expect this to be either reverted,
or explained why it really is correct.
Because right now it just looks to me like a race condition that
generates faster - but incorrect - code.
The race may be practically impossible to hit simply because the other
side is slow and heavy (and you need to hit the timing just right),
but I don't see what would keep it from fundamentally happening. The
"this happens in a blue moon on just very specific hardware" bugs are
the worst kind of bugs.
Linus
* Re: [GIT PULL] Block changes for 4.21-rc
2018-12-28 21:48 ` Linus Torvalds
2018-12-28 21:57 ` Linus Torvalds
@ 2018-12-31 20:06 ` Jens Axboe
1 sibling, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2018-12-31 20:06 UTC (permalink / raw)
To: Linus Torvalds, Christoph Hellwig; +Cc: linux-block@vger.kernel.org
On 12/28/18 2:48 PM, Linus Torvalds wrote:
> On Thu, Dec 20, 2018 at 8:05 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Jens Axboe (108):
>> block: avoid ordered task state change for polled IO
>
> This one seems *very* questionable.
>
> The commit message is misleading:
>
> For the core poll helper, the task state setting don't need to imply any
> atomics, as it's the current task itself that is being modified and
> we're not going to sleep.
>
> For IRQ driven, the wakeup path have the necessary barriers to not need
> us using the heavy handed version of the task state setting.
>
> What? Barriers on one side have *no* meaning or point if the *other*
> side doesn't have any barriers. That's completely magical thinking.
>
> But it's also not at all the case that such barriers even exist. The
> wakeup side does:
>
>         struct task_struct *waiter = dio->submit.waiter;
>         WRITE_ONCE(dio->submit.waiter, NULL);
>         blk_wake_io_task(waiter);
>
> and here the sleeping side now does:
>
>         __set_current_state(TASK_UNINTERRUPTIBLE);
>
>         if (!READ_ONCE(dio->submit.waiter))
>                 break;
>
> which is entirely unordered. So just what protects against this:
>
>     Sleeper:                          Waker (different cpu)
>
>     read submit.waiter
>       (sees non-NULL)
>
>                                       WRITE_ONCE(dio->submit.waiter, NULL);
>                                       blk_wake_io_task(waiter);
>
>     write TASK_UNINTERRUPTIBLE
>     io_schedule()
>
> and now the sleeper sleeps forever, because there will be nobody who
> ever wakes it up (the wakeup happened before the write of
> TASK_UNINTERRUPTIBLE).
>
> Notice how it does not matter AT ALL what barriers that other CPU is
> doing when waking things up. The problem is the sleeping CPU having
> reordered the check for "oh, there's a waiter", and "tell people I'm
> going to sleep".
>
> Those memory barriers matter. You can't just remove them randomly and
> claim they don't.
>
> Why would you even remove it in the first place?
>
> Guys, if you don't understand memory ordering, then you have
> absolutely no business using __set_current_state(). And talking about
> barriers on the other side of the equation clearly shows that you
> don't seem to understand memory ordering.
>
> Barriers are only *ever* meaningful if they exist on both sides
> (although sometimes said barriers can be implicit: an address
> dependency or similar in RCU, now that we've decided to not care about
> the crazy alpha read-barrier-depends)
>
> Maybe I'm missing something, but this really looks like a completely
> invalid "optimization" to me. And it's entirely bogus too. If that
> memory barrier matters, you're almost certainly doing something wrong
> (most likely benchmarking something pointless).
I think what went wrong here is that the patch morphed through
multiple iterations, and I agree the IRQ side does not look correct.
The polled part is fine.
Sorry for the late reply, in and out this time of year. I'll get
something posted and queued up for the later pull I was planning
for this merge window, for the other stragglers.
--
Jens Axboe
* Re: [GIT PULL] Block changes for 4.21-rc
2018-12-21 4:04 [GIT PULL] Block changes for 4.21-rc Jens Axboe
2018-12-23 3:35 ` Jens Axboe
2018-12-28 21:48 ` Linus Torvalds
@ 2018-12-29 1:30 ` pr-tracker-bot
2 siblings, 0 replies; 6+ messages in thread
From: pr-tracker-bot @ 2018-12-29 1:30 UTC (permalink / raw)
To: Jens Axboe; +Cc: Linus Torvalds, linux-block@vger.kernel.org
The pull request you sent on Thu, 20 Dec 2018 21:04:50 -0700:
> git://git.kernel.dk/linux-block.git tags/for-4.21/block-20181221
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/0e9da3fbf7d81f0f913b491c8de1ba7883d4f217
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker