From: Don Dutile <ddutile@redhat.com>
To: Richard Yang <weiyang@linux.vnet.ibm.com>
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: One problem in reassign pci bus number?
Date: Mon, 23 Apr 2012 15:46:03 -0400
Message-ID: <4F95B17B.3030401@redhat.com>
In-Reply-To: <20120422155202.GA13540@richard>
On 04/22/2012 11:52 AM, Richard Yang wrote:
> All,
>
> I am reading the pci_scan_bridge() and not sure what will happen in
> following situation.
>
> Suppose the kernel is not passed the pci=assign-busses.
>
> Below is a picture about the pci system.
>
>              +-------+
>              |       | root bridge(0,255)
>              +---+---+
>                  | Bus 0
> -----+-----------+------------------------------+--
>      |                                          |
>      |                                          |
>      |                                          |
> +----+----+                               +-----+-----+
> |         | B1(1,15)                      |           |B2(16,28)
> +----+----+                               +-----+-----+
>      | Bus 1                                    | Bus 16
> -----+-----------------------         ----------+----------------
>      |
> +----+----+
> |         | B3
> +---------+
>
> Suppose B1 and B2 works fine with the BIOS, which get the right bus
> number and range.
>
> B3 does not works fine with the BIOS, which doesn't get the bus number.
>
> So in pci_scan_bridge(), B3 will be met in the second pass and get bus
> number 16?

Unfortunately, today, the answer is yes.

I ran into a similar problem recently when trying to use pci=assign-busses
with an SRIOV device behind a non-ARI-capable PCIe switch.
In that scenario, the assign-busses code assigned the next bus number,
which conflicted with one already in use on the system, and hung the
system -- two bridges responding to the same PCI bus number evidently
confuses the hw! ;-)

The PCI code is supposed to do two bus scans -- pass=0 to see what the BIOS
has set up, and then pass=1 to assign bus numbers to devices the BIOS did not set up.
But what I'm finding is that when pci=assign-busses is set, the
pass=0 scan does not do a full PCI tree scan and register all
the BIOS-setup busses first; instead it tries to do extended bus assignment in pass=0,
not pass=1. In the above configuration, it expands B1's bus-number range from (1,15)
to (1,16), then tries to scan behind it. That creates an overlap between
B1's & B2's sec/sub bus-number ranges, and they both respond to a Type 1 config cycle
with a bus number of 16 (typically when trying to read the VID register of 16:0.0
in this case).... boom! ... or more like silence due to a system hang...
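
To make the overlap concrete, here is a small model of the rule a bridge
applies to a Type 1 config cycle: it claims the cycle when the target bus
number lies inside its secondary..subordinate window. This is only an
illustration, not kernel code; the Bridge class and the claimants() helper
are names made up for the sketch, and the windows are the ones from the
diagram above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bridge:
    """Illustrative model of a PCI-to-PCI bridge's bus-number window."""
    name: str
    secondary: int    # bus directly behind the bridge
    subordinate: int  # highest bus number reachable behind the bridge

    def claims(self, bus: int) -> bool:
        # The bridge claims a Type 1 config cycle (forwarding or
        # converting it) when the target bus is inside its window.
        return self.secondary <= bus <= self.subordinate


def claimants(bridges, bus):
    """Return the names of all bridges that would claim `bus`."""
    return [b.name for b in bridges if b.claims(bus)]


# Windows as set up by the BIOS in the diagram.
bios = [Bridge("B1", 1, 15), Bridge("B2", 16, 28)]
# B1 after its window is stretched by one to make room behind it.
stretched = [Bridge("B1", 1, 16), Bridge("B2", 16, 28)]

print(claimants(bios, 16))       # ['B2']
print(claimants(stretched, 16))  # ['B1', 'B2']
```

With the stretched window, a config read aimed at bus 16 has two claimants,
which is the situation described above.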

*If* the system spaces bus ranges apart, e.g., in your config above,
if the BIOS set up B1(1,15) and B2(24,32), then pci=assign-busses will
work because bus number 16 is free: no two bridges will both think they
should respond to a Type 1 PCI config cycle (with bus-number=16 lying in
their sec/sub bus-number range), and all will (luckily) work.
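
Put another way, the question is whether a bridge's subordinate bus number
can grow by one without landing in a sibling's window. A rough sketch of
that check follows; can_extend() is a hypothetical helper, not a kernel
function, and windows are given as (secondary, subordinate) tuples.

```python
def can_extend(target, siblings, max_bus=255):
    """Return True if `target`'s subordinate bus can grow by one
    without falling inside any sibling's (secondary, subordinate) window."""
    new_sub = target[1] + 1
    if new_sub > max_bus:
        return False
    return all(not (sec <= new_sub <= sub) for sec, sub in siblings)


# Layout from the diagram: B2 starts right after B1, so there is no room.
print(can_extend((1, 15), [(16, 28)]))  # False
# Spaced-apart layout: bus 16 is free, so B1 can grow to (1,16).
print(can_extend((1, 15), [(24, 32)]))  # True
```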

Unfortunately, I'm in & out of work due to at-home time requirements,
so I haven't had a chance to work out a proper patch.

What should happen in the above case is that the kernel prints a warning saying
it couldn't do the needed assign-busses operations due to configuration constraints...
and continues to do PCI (pass=1) bridge scanning.... and does not wedge the system
as it does now.

The base problem is that
(a) pass=0 is doing bus assignment, which shouldn't be done
    until pass=1, after all BIOS-setup busses are known, and
(b) the code doesn't have a nice warning and continuation when this
    conflict occurs.
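
The behaviour suggested in (a) and (b) could look roughly like the sketch
below. It is a deliberately flat model, not the kernel's pci_scan_bridge()
logic: scan_bridges() is a made-up name, all BIOS-configured bridges are
assumed to be siblings on the same bus, and "the next bus number" is assumed
to be the parent's subordinate number plus one.

```python
import warnings


def scan_bridges(configured, unconfigured):
    """Two-pass sketch.

    configured:   {name: (secondary, subordinate)} as set up by the BIOS
    unconfigured: {name: parent_name} for bridges the BIOS left alone
    Returns the windows known after both passes.
    """
    # Pass 0: only record what the BIOS set up; assign nothing yet.
    windows = dict(configured)

    # Pass 1: try to give each unconfigured bridge the bus number just
    # past its parent's window, growing the parent by one.
    for name, parent in unconfigured.items():
        sec, sub = windows[parent]
        want = sub + 1
        clash = [n for n, (s, e) in windows.items()
                 if n != parent and s <= want <= e]
        if want > 255 or clash:
            # Warn and keep scanning instead of wedging the system.
            warnings.warn(f"{name}: bus {want} is not free; "
                          "leaving bridge unassigned")
            continue
        windows[parent] = (sec, want)
        windows[name] = (want, want)
    return windows


# Spaced-apart layout: B3 gets bus 16 and B1 grows to (1,16).
print(scan_bridges({"B1": (1, 15), "B2": (24, 32)}, {"B3": "B1"}))
```

With the layout from the diagram, B1(1,15) and B2(16,28), the same call emits
a warning and returns the BIOS windows unchanged, leaving B3 unassigned.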
> Would this be a conflict?
>
Summary: yes.