public inbox for linux-kernel@vger.kernel.org
From: Matthew Wilcox <matthew@wil.cx>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
	Alex Chiang <achiang@hp.com>,
	jbarnes@virtuousgeek.org, linux-arch@vger.kernel.org,
	Kyle McMartin <kyle@mcmartin.ca>, Tony Luck <tony.luck@intel.com>,
	Russell King <linux@arm.linux.org.uk>,
	Arnd Bergmann <arnd@arndb.de>,
	Yoshinori Sato <ysato@users.sourceforge.jp>,
	Jeff Dike <jdike@addtoit.com>,
	linux-kernel@vger.kernel.org, Ralf Baechle <ralf@linux-mips.org>,
	David Howells <dhowells@redhat.com>,
	Paul Mundt <lethal@linux-sh.org>,
	Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
	Ingo Molnar <mingo@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Avi Kivity <avi@redhat.com>
Subject: Re: [PATCH] PCI: remove pcibios_scan_all_fns()
Date: Tue, 23 Jun 2009 13:08:27 -0600
Message-ID: <20090623190826.GJ19977@parisc-linux.org>
In-Reply-To: <1245714008.4017.7.camel@pasglop>

On Tue, Jun 23, 2009 at 09:40:08AM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2009-06-22 at 12:30 -0600, Matthew Wilcox wrote:
> > 
> > That would be correct.  I'm guessing your out-of-tree code sets
> > pcibios_scan_all_fns()?
> > 
> > Now, there are various options.  One is that you could remap config
> > space accesses -- domain:bus:dev.fn in the guest don't have to match
> > domain:bus:dev.fn in the host.  That's a certain amount of overhead in
> > every config space access, but it doesn't have to be a large one.
> > 
> That's tricky. Some devices have internal registers that -do- depend on
> what function they are on. In fact, I remember seeing that in
> multifunction devices that are meant to be virtualized but still need to
> have some registers be accessed differently depending on the function
> (ugh) though don't ask me who that was ...

That's pretty horrific.  PA-RISC has a NS87415 superio chip that is
full of that kind of bogosity, but nobody's ever suggested it should
support v12n.
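
For reference, the remapping option quoted above amounts to a table
lookup on each config space access.  A minimal userspace sketch, not
kernel code: struct bdf, struct bdf_map and remap_guest_bdf() are
hypothetical names invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address of a PCI function: domain:bus:dev.fn */
struct bdf {
	uint16_t domain;
	uint8_t bus;
	uint8_t dev;	/* 0..31 */
	uint8_t fn;	/* 0..7 */
};

/* One entry of a hypothetical guest-to-host translation table. */
struct bdf_map {
	struct bdf guest;
	struct bdf host;
};

static int bdf_equal(const struct bdf *a, const struct bdf *b)
{
	return a->domain == b->domain && a->bus == b->bus &&
	       a->dev == b->dev && a->fn == b->fn;
}

/*
 * Translate the address the guest used into the host address backing it.
 * Returns 0 and fills *host on success, -1 if the guest has no such device.
 * A linear search is used here only to keep the sketch short.
 */
static int remap_guest_bdf(const struct bdf_map *map, size_t n,
			   const struct bdf *guest, struct bdf *host)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (bdf_equal(&map[i].guest, guest)) {
			*host = map[i].host;
			return 0;
		}
	}
	return -1;
}
```

With an entry mapping guest 0000:00:03.0 to host 0000:05:00.1, the guest
sees the passed-through function at function 0 and an ordinary scan finds
it.  It does nothing for devices whose registers depend on the real
function number, which is the objection raised above.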

> > Another would be that you could create dummy devices in the guest at
> > function 0, and then the guest would scan all the functions.  A little
> > ugly, perhaps.
> 
> But less ugly than the above.

There's some nastiness when you want to later migrate function 0 into
the VM, and it looks a little ugly to have a fake func 0 in the VM with
nothing attached to it.

> > A third would be for guests to not do this scanning at all.  You could
> > present the devices through something like the openfirmware tree, and
> > create them instead of scanning for them.  If you care about startup
> > time, this is probably the way to go.
> 
> Which is what we do on powerpc nowadays. In fact, this code is currently
> inside arch/powerpc and arch/sparc (2 copies slightly diverged) but I
> had plans to make it common and move it over to either drivers/of or
> drivers/pci (most probably the latter).

I'd support a drivers/pci/of.c.  Definitely better than having two copies
of it under arch/, and you'd be well within your rights to complain if
we changed something and didn't fix it up.

> > There are probably other ways I haven't thought of ...
> 
> Well, making up the devices without actual config space probing is nice
> and fast but I don't think we want to see too many occurrences of such
> code in the kernel. We already had breakage once in powerpc land iirc
> due to changes in drivers/pci/probe.c that we didn't reflect properly. I
> think the normal and OF methods should be enough.

Particularly since we have these people creating fake OF trees for
embedded platforms so we don't have to probe them.  The v12n people
should definitely take advantage of this work.

> At this stage, it does look to me like a trivial tweak like
> pcibios_scan_all_fns() but maybe done a bit nicely, would still be the
> simplest solution in terms of amount of code involved etc...
> 
> Maybe something like
> 
> pcibios_get_slot_fn_mask() which returns a bitmask of functions to be
> scanned, whose default implementation (weak) would basically check
> the header type for function 0 ? As I said, I don't -need- that right
> now on powerpc "server" platforms but heh... 
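
A rough userspace sketch of the bitmask idea quoted above, assuming the
default looks only at the multi-function bit (bit 7) of function 0's
header type.  default_slot_fn_mask(), guest_slot_fn_mask() and
fn_in_mask() are hypothetical names, and the "weak" linkage of the
proposal is not modelled:

```c
#include <stdint.h>

#define HDR_TYPE_MULTIFUNCTION 0x80	/* bit 7 of the header type register */

/*
 * Default policy: bit N of the result set means "probe function N".
 * The caller is assumed to have read function 0's header type already.
 */
static uint8_t default_slot_fn_mask(uint8_t fn0_header_type)
{
	if (fn0_header_type & HDR_TYPE_MULTIFUNCTION)
		return 0xff;	/* multi-function: probe functions 0-7 */
	return 0x01;		/* single-function: probe function 0 only */
}

/*
 * Hypothetical override for a guest that was handed functions without a
 * function 0: probe everything regardless of the header type.
 */
static uint8_t guest_slot_fn_mask(uint8_t fn0_header_type)
{
	(void)fn0_header_type;
	return 0xff;
}

/* Nonzero if function fn (0..7) is selected by mask. */
static int fn_in_mask(uint8_t mask, unsigned fn)
{
	return fn < 8 && (mask & (1u << fn)) != 0;
}
```

The scan loop would then walk fn = 0..7 and skip any function for which
fn_in_mask() is zero.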

So we need to tweak this code anyway for Alternative Routing-ID
Interpretation (ARI), and what I've ended up doing is creating a bunch of
different functions that can be called to determine what the next function
to probe should be, given the current device and function.

Take a look: http://marc.info/?l=linux-pci&m=124578246906927&w=2

It wouldn't be hard to continue supporting pcibios_scan_all_fns() with
this scheme; it's an extra two lines:

+	else if (pcibios_scan_all_fns())
+		next_fn = next_trad_fn;
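
To make the shape of that concrete, here is a self-contained userspace
sketch of picking a "next function" callback.  It is not the linked patch:
struct fake_dev, next_ari_fn(), no_next_fn(), pick_next_fn() and
scan_slot() are assumptions made for illustration; only the names next_fn
and next_trad_fn come from the two lines above.  A return value of 0 means
"stop scanning this slot".

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pci_dev. */
struct fake_dev {
	int multifunction;	/* header type bit 7 set on function 0 */
	int ari_enabled;	/* ARI forwarding in use for this device */
	unsigned ari_next_fn;	/* "next function" value the device reports */
};

typedef unsigned (*next_fn_t)(const struct fake_dev *dev, unsigned fn);

/* Single-function device: nothing after function 0. */
static unsigned no_next_fn(const struct fake_dev *dev, unsigned fn)
{
	(void)dev;
	(void)fn;
	return 0;
}

/* Traditional scan: 0, 1, ..., 7, then wrap to 0 to stop. */
static unsigned next_trad_fn(const struct fake_dev *dev, unsigned fn)
{
	(void)dev;
	return (fn + 1) % 8;
}

/* ARI-style scan: follow the device's own pointer, never backwards. */
static unsigned next_ari_fn(const struct fake_dev *dev, unsigned fn)
{
	if (!dev)
		return 0;
	return dev->ari_next_fn > fn ? dev->ari_next_fn : 0;
}

/* fn0 may be NULL when nothing responded at function 0. */
static next_fn_t pick_next_fn(const struct fake_dev *fn0, int scan_all_fns)
{
	if (fn0 && fn0->ari_enabled)
		return next_ari_fn;
	if (fn0 && fn0->multifunction)
		return next_trad_fn;
	if (scan_all_fns)	/* the pcibios_scan_all_fns() case */
		return next_trad_fn;
	return no_next_fn;
}

/* Record the function numbers a scan would visit; returns how many. */
static size_t scan_slot(const struct fake_dev *fn0, int scan_all_fns,
			unsigned *visited, size_t max)
{
	next_fn_t next_fn = pick_next_fn(fn0, scan_all_fns);
	unsigned fn = 0;
	size_t n = 0;

	do {
		if (n < max)
			visited[n++] = fn;
		fn = next_fn(fn0, fn);
	} while (fn != 0 && n < max);
	return n;
}
```

In this sketch a slot with no function 0 is walked across all eight
functions only when scan_all_fns is set; otherwise the scan stops after
function 0.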

That said, I think simply materialising the devices, either the way the
OF code does or the way the IOV code does, is the best route forwards.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
