public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Kent Overstreet <kent.overstreet@gmail.com>
To: "Artem S. Tashkinov" <t.artem@lycos.com>
Cc: Ming Lei <tom.leiming@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Christoph Hellwig <hch@lst.de>, Ming Lin <ming.l@ssi.samsung.com>,
	Jens Axboe <axboe@fb.com>,
	"Artem S. Tashkinov" <t.artem@mailcity.com>,
	Steven Whitehouse <swhiteho@redhat.com>,
	Tejun Heo <tj@kernel.org>, IDE-ML <linux-ide@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Martin K. Petersen" <martin.petersen@oracle.com>
Subject: Re: IO errors after "block: remove bio_get_nr_vecs()"
Date: Sun, 20 Dec 2015 17:32:37 -0900	[thread overview]
Message-ID: <20151221023237.GB20661@kmo-pixel> (raw)
In-Reply-To: <20aa515947cdc15799f520f904ad99a2@lycos.com>

On Mon, Dec 21, 2015 at 06:50:21AM +0500, Artem S. Tashkinov wrote:
> On 2015-12-21 06:38, Ming Lei wrote:
> >On Mon, Dec 21, 2015 at 1:51 AM, Linus Torvalds wrote:
> >>Kent, Jens, Christoph et al,
> >> please see this bugzilla:
> >>
> >>  https://bugzilla.kernel.org/show_bug.cgi?id=109661
> >>
> >>where Artem Tashkinov bisected his problems with 4.3 down to commit
> >>b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
> >>signed off on.
> >>
> >>(Also Tejun - maybe you can see what's up - maybe that error message
> >>tells you something)
> >>
> >>I'm not sure what's up with his machine, the disk doesn't seem to be
> >>anyuthing particularly unusual, it looks like a 1TB Seagate Barracuda:
> >>
> >>  ata1.00: ATA-8: ST1000DM003-1CH162, CC44, max UDMA/133
> >>
> >>which doesn't strike me as odd.
> >>
> >>Looking at the dmesg, it also looks like it's a pretty normal
> >>Sandybridge setup with Intel chipset. Artem, can you confirm? The PCI
> >>ID for the AHCI chip seems to be (INTEL, 0x1c02).
> >>
> >>Any ideas? Anybody?
> >
> >BTW, I have posted very similar issue in the link:
> >
> >http://marc.info/?l=linux-ide&m=145066119623811&w=2
> >
> >Artem, I noticed from bugzillar that the hardware is i386, just
> >wondering if PAE is enabled?  If yes, I am more confident
> >that both the two kinds of report are similar or same.
> >
> 
> Yes, I'm on i686 with PAE (16GB of RAM here) - it's specifically mentioned
> in the corresponding bug report.
> 
> P.S. I know Linus doesn't condone PAE but I still find it more preferrable
> than running a mixed environment with almost zero benefit in regard to
> performance and quite obvious performance regressions related to an
> increased number of libraries being loaded (i686 + x86_64) and slightly
> bloated code which sometimes cannot fit in the CPU cache. Call me old
> fashioned but I won't upgrade to x86_64 until most of the things that I run
> locally are available for x86_64 and that won't happen any time soon.

oy vey. WTF's been happening in blk-merge.c?

Theyy're not the same bug. The bug in your thread was introduced by Jens in
5014c311ba "block: fix bogus compiler warnings in blk-merge.c", where he screwed
up the bvprv handling - but that patch comes after the patch Artem bisected to.

blk_bio_segment_split() looks correct in b54ffb73ca. 

What we need to do is:
 in the _driver_, immediately before handing the sglist off to the device, walk
 the sglist and verify it obeys all the restrictions for that particular device
 - and if it's not, print out exactly what we screwed up.

I don't know where that code lives in the ahci driver, and more importantly I
don't know where the dma restrictions come from, but if someone who knows the
driver code can walk me through it I'll write the patch.

--------------

Also - Ming, Christoph, anyone else who might be working on this stuff in the
future:

The way all the queue limits stuff works is still way too fragile; this has been
a recurring source of bugs. There's way too many different restrictions
different devices need, and it's easy for a driver to specify the restrictions
incorrectly in a way that just happens to work, but for the wrong reasons - e.g.
"I can't handle more than x segments, but saying I can't handle more than x
sectors happens to work for now because of some other bug in the upper layers" -
and then when we have to debug that later, we're screwed.

My intent when I was working on this was to eventually push the implementation
of the limits down as much as possible to the actual drivers - i.e. there the
limitations come from, so the driver can say, for example:

"ok, my device can only do scatter/gather dma to max 20 different addresses, so
I'll allocate sglists with 20 entries, and it doesn't matter if the bio or
request or whatever is bigger because when I call blk_rq_map_sg() it's just
going to map as much of the request as will fit in a given sglist and requests
will get processed incrementally until they're finished - and if a particular sg
entry can only be a particular size, or has alignment restrictions or whatever,
I'll just pass that directly to blk_rq_map_sg()"

so that the driver is ideally specifying _only_ its real restrictions, and
they're being specified in the code exactly where they're being used.

-------

Basically, blk_queue_split() was only meant to be an interim solution, so I'd
suggest that instead of doing performance optimizations on that codepath a
better use of time and effort would be to work towards ripping it out entirely.

  parent reply	other threads:[~2015-12-21  2:32 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-20 17:51 IO errors after "block: remove bio_get_nr_vecs()" Linus Torvalds
2015-12-20 18:18 ` Christoph Hellwig
2015-12-20 18:41   ` Linus Torvalds
2015-12-20 23:36     ` Artem S. Tashkinov
2015-12-21 11:21     ` Dan Aloni
2015-12-20 18:44   ` Kent Overstreet
2015-12-20 23:41     ` Artem S. Tashkinov
2015-12-20 23:25   ` Artem S. Tashkinov
2015-12-20 23:42     ` Kent Overstreet
2015-12-20 23:49       ` Artem S. Tashkinov
2015-12-20 23:23 ` Artem S. Tashkinov
2015-12-21  1:38 ` Ming Lei
2015-12-21  1:50   ` Artem S. Tashkinov
2015-12-21  2:18     ` Ming Lei
2015-12-21  2:25       ` Artem S. Tashkinov
2015-12-21  2:32     ` Kent Overstreet [this message]
2015-12-21  3:21       ` Ming Lei
2015-12-21  3:36         ` Artem S. Tashkinov
2015-12-21  4:32     ` Linus Torvalds
2015-12-21  4:43       ` Artem S. Tashkinov
2015-12-21  4:47         ` Linus Torvalds
2015-12-21  5:23           ` Linus Torvalds
2015-12-21  7:31             ` Artem S. Tashkinov
2015-12-22  4:06             ` Artem S. Tashkinov
2015-12-21  4:26 ` Tejun Heo
2015-12-21  5:10   ` Linus Torvalds
2015-12-21  6:55 ` Tejun Heo
2015-12-21  7:25   ` Artem S. Tashkinov
2015-12-21 19:35     ` Tejun Heo
2015-12-21 20:07       ` Tejun Heo
2015-12-21 21:08         ` Tejun Heo
2015-12-22  3:43           ` Kent Overstreet
2015-12-22  3:59           ` Kent Overstreet
2015-12-22  5:26             ` Junichi Nomura
2015-12-22  5:37               ` Kent Overstreet
2015-12-22  5:38               ` Kent Overstreet
2015-12-22  5:52                 ` Artem S. Tashkinov
2015-12-22  5:55                   ` Kent Overstreet
2015-12-22  5:59                     ` Artem S. Tashkinov
2015-12-22  6:02                       ` Kent Overstreet
2015-12-22 17:28               ` Jens Axboe
2015-12-22  4:45           ` Kent Overstreet
2015-12-22  5:10         ` Artem S. Tashkinov
2015-12-22  5:20         ` Artem S. Tashkinov
2015-12-21 22:51       ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151221023237.GB20661@kmo-pixel \
    --to=kent.overstreet@gmail.com \
    --cc=axboe@fb.com \
    --cc=hch@lst.de \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=ming.l@ssi.samsung.com \
    --cc=swhiteho@redhat.com \
    --cc=t.artem@lycos.com \
    --cc=t.artem@mailcity.com \
    --cc=tj@kernel.org \
    --cc=tom.leiming@gmail.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox