All of lore.kernel.org
 help / color / mirror / Atom feed
From: Guenter Roeck <linux@roeck-us.net>
To: Ming Lei <ming.lei@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
	linux-scsi@vger.kernel.org,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	Bart Van Assche <bvanassche@acm.org>,
	"Ewan D . Milne" <emilne@redhat.com>,
	Hannes Reinecke <hare@suse.com>
Subject: Re: [PATCH V4 3/3] scsi: core: avoid to pre-allocate big chunk for sg list
Date: Mon, 3 Jun 2019 13:44:22 -0700	[thread overview]
Message-ID: <20190603204422.GA7240@roeck-us.net> (raw)
In-Reply-To: <20190428073932.9898-4-ming.lei@redhat.com>

On Sun, Apr 28, 2019 at 03:39:32PM +0800, Ming Lei wrote:
> Now scsi_mq_setup_tags() pre-allocates a big buffer for IO sg list,
> and the buffer size is scsi_mq_sgl_size() which depends on smaller
> value between shost->sg_tablesize and SG_CHUNK_SIZE.
> 
> Modern HBA's DMA is often capable of deadling with very big segment
> number, so scsi_mq_sgl_size() is often big. Suppose the max sg number
> of SG_CHUNK_SIZE is taken, scsi_mq_sgl_size() will be 4KB.
> 
> Then if one HBA has lots of queues, and each hw queue's depth is
> high, pre-allocation for sg list can consume huge memory.
> For example of lpfc, nr_hw_queues can be 70, each queue's depth
> can be 3781, so the pre-allocation for data sg list is 70*3781*2k
> =517MB for single HBA.
> 
> There is Red Hat internal report that scsi_debug based tests can't
> be run any more since legacy io path is killed because too big
> pre-allocation.
> 
> So switch to runtime allocation for sg list, meantime pre-allocate 2
> inline sg entries. This way has been applied to NVMe PCI for a while,
> so it should be fine for SCSI too. Also runtime sg entries allocation
> has verified and run always in the original legacy io path.
> 
> Not see performance effect in my big BS test on scsi_debug.
> 

This patch causes a variety of boot failures in -next. Typical failure
pattern is scsi hangs or failure to find a root file system. For example,
on alpha, trying to boot from usb:

scsi 0:0:0:0: Direct-Access     QEMU     QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
sd 0:0:0:0: Power-on or device reset occurred
sd 0:0:0:0: [sda] 20480 512-byte logical blocks: (10.5 MB/10.0 MiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] Attached SCSI disk
usb 1-1: reset full-speed USB device number 2 using ohci-pci
sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 00 58 00 00 42 00
print_req_error: I/O error, dev sda, sector 88 flags 80000
EXT4-fs error (device sda): __ext4_get_inode_loc:4703: inode #2: block 44: comm
swapper: unable to read itable block
EXT4-fs (sda): get root inode failed
EXT4-fs (sda): mount failed
VFS: Cannot open root device "sda" or unknown-block(8,0): error -5

Similar problems are seen with other architectures.
Reverting the patch fixes the problem. Bisect log attached.

Guenter

---
# bad: [ae3cad8f39ccf8d31775d9737488bccf0e44d370] Add linux-next specific files for 20190603
# good: [f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a] Linux 5.2-rc3
git bisect start 'HEAD' 'v5.2-rc3'
# good: [8ff6f4c6e067a9d3f3bbacf02c4ea5eb81fe2c6a] Merge remote-tracking branch 'crypto/master'
git bisect good 8ff6f4c6e067a9d3f3bbacf02c4ea5eb81fe2c6a
# good: [6c93755861ce6a6dd904df9cdae9f08671132dbe] Merge remote-tracking branch 'iommu/next'
git bisect good 6c93755861ce6a6dd904df9cdae9f08671132dbe
# good: [1a567956cb3be5754d94ce9380a2151e57e204a7] Merge remote-tracking branch 'cgroup/for-next'
git bisect good 1a567956cb3be5754d94ce9380a2151e57e204a7
# bad: [a6878ca73cf30b83efbdfb1ecc443d7cfb2d8193] Merge remote-tracking branch 'rtc/rtc-next'
git bisect bad a6878ca73cf30b83efbdfb1ecc443d7cfb2d8193
# bad: [25e7f57cf9f86076704b543628a0f02d3d733726] Merge remote-tracking branch 'gpio/for-next'
git bisect bad 25e7f57cf9f86076704b543628a0f02d3d733726
# bad: [28e85db94534c92fa00d787264e3ea1843d1dc42] scsi: lpfc: Fix nvmet handling of received ABTS for unmapped frames
git bisect bad 28e85db94534c92fa00d787264e3ea1843d1dc42
# bad: [cf9eddf616bb03387e597629142b0be34111e8d0] scsi: hpsa: check for tag collision
git bisect bad cf9eddf616bb03387e597629142b0be34111e8d0
# good: [4e74166c52a836f05d4bd8270835703908b34d3e] scsi: libsas: switch sas_ata.[ch] to SPDX tags
git bisect good 4e74166c52a836f05d4bd8270835703908b34d3e
# good: [f186090846c29fd9760917fb3d01f095c39262e0] scsi: lib/sg_pool.c: improve APIs for allocating sg pool
git bisect good f186090846c29fd9760917fb3d01f095c39262e0
# bad: [12b6b55806928690359bb21de3a19199412289fd] scsi: sd: Inline sd_probe_part2()
git bisect bad 12b6b55806928690359bb21de3a19199412289fd
# bad: [c3288dd8c2328a74cb2157dff18d13f6a75cedd2] scsi: core: avoid pre-allocating big SGL for data
git bisect bad c3288dd8c2328a74cb2157dff18d13f6a75cedd2
# good: [0f0e744eae6c8af361d227d3a2286973351e5a2a] scsi: core: avoid pre-allocating big SGL for protection information
git bisect good 0f0e744eae6c8af361d227d3a2286973351e5a2a
# first bad commit: [c3288dd8c2328a74cb2157dff18d13f6a75cedd2] scsi: core: avoid pre-allocating big SGL for data

  parent reply	other threads:[~2019-06-03 21:43 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-28  7:39 [PATCH V4 0/3] scsi: core: avoid big pre-allocation for sg list Ming Lei
2019-04-28  7:39 ` Ming Lei
2019-04-28  7:39 ` [PATCH V4 1/3] lib/sg_pool.c: improve APIs for allocating sg pool Ming Lei
2019-04-28  7:39   ` Ming Lei
2019-04-28 12:04   ` Christoph Hellwig
2019-04-28 12:04     ` Christoph Hellwig
2019-04-28  7:39 ` [PATCH V4 2/3] scsi: core: avoid to pre-allocate big chunk for protection meta data Ming Lei
2019-04-29 18:15   ` Bart Van Assche
2019-04-28  7:39 ` [PATCH V4 3/3] scsi: core: avoid to pre-allocate big chunk for sg list Ming Lei
2019-04-29 18:16   ` Bart Van Assche
2019-06-03 20:44   ` Guenter Roeck [this message]
2019-06-04  1:00     ` Ming Lei
2019-06-04  3:49       ` Guenter Roeck
2019-06-04  4:10         ` Ming Lei
2019-06-04 14:51           ` Guenter Roeck
2019-06-04  6:55         ` Ming Lei
2019-05-05  1:10 ` [PATCH V4 0/3] scsi: core: avoid big pre-allocation " Ming Lei
2019-05-05  1:10   ` Ming Lei
2019-05-14  2:06 ` Martin K. Petersen
2019-05-14  2:06   ` Martin K. Petersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190603204422.GA7240@roeck-us.net \
    --to=linux@roeck-us.net \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=bvanassche@acm.org \
    --cc=emilne@redhat.com \
    --cc=hare@suse.com \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=ming.lei@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.