From: Daniel Vetter <daniel@ffwll.ch>
To: "Luis R. Rodriguez" <mcgrof@suse.com>
Cc: "Daniel Vetter" <daniel@ffwll.ch>,
"Luis R. Rodriguez" <mcgrof@kernel.org>,
vw@iommu.org, "Oded Gabbay" <oded.gabbay@gmail.com>,
"Joerg Roedel" <joro@8bytes.org>,
"Christian König" <christian.koenig@amd.com>,
"alexander.deucher@amd.com" <alexander.deucher@amd.com>,
"Dave Airlie" <airlied@linux.ie>,
iommu@lists.linux-foundation.org,
"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
dri-devel <dri-devel@lists.freedesktop.org>,
gregkh@linuxfoundation.org, hpa@zytor.com
Subject: Re: [RFT v3] drm: use late_initcall() for amdkfd and radeon
Date: Tue, 31 May 2016 21:04:53 +0200 [thread overview]
Message-ID: <20160531190453.GL7231@phenom.ffwll.local> (raw)
In-Reply-To: <20160531165834.GG11948@wotan.suse.de>
On Tue, May 31, 2016 at 06:58:34PM +0200, Luis R. Rodriguez wrote:
> On Sun, May 29, 2016 at 08:27:07PM +0200, Daniel Vetter wrote:
> > On Fri, May 27, 2016 at 3:18 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > To get KFD support in radeon we need the following
> > > initialization to happen in this order, their
> > > respective driver file that has its init routine
> > > listed next to it:
> > >
> > > 0. AMD IOMMUv1: arch/x86/kernel/pci-dma.c
> > > 1. AMD IOMMUv2: drivers/iommu/amd_iommu_v2.c
> > > 2. AMD KFD: drivers/gpu/drm/amd/amdkfd/kfd_module.c
> > > 3. AMD Radeon: drivers/gpu/drm/radeon/radeon_drv.c
> > >
> > > Order is rather implicit, but these drivers can currently
> > > only do so much given the amount of leg room available.
> > > Below are the respective init routines and how they are
> > > initialized:
> > >
> > > arch/x86/kernel/pci-dma.c rootfs_initcall(pci_iommu_init);
> > > drivers/iommu/amd_iommu_v2.c module_init(amd_iommu_v2_init);
> > > drivers/gpu/drm/amd/amdkfd/kfd_module.c module_init(kfd_module_init);
> > > drivers/gpu/drm/radeon/radeon_drv.c module_init(radeon_init);
> > >
> > > When a driver is built-in module_init() folds to use
> > > device_initcall(), and we have the following possible
> > > orders:
> > >
> > > #define pure_initcall(fn) __define_initcall(fn, 0)
> > > #define core_initcall(fn) __define_initcall(fn, 1)
> > > #define postcore_initcall(fn)__define_initcall(fn, 2)
> > > #define arch_initcall(fn) __define_initcall(fn, 3)
> > > #define subsys_initcall(fn) __define_initcall(fn, 4)
> > > #define fs_initcall(fn) __define_initcall(fn, 5)
> > > ---------------------------------------------------------
> > > #define rootfs_initcall(fn) __define_initcall(fn, rootfs)
> > > #define device_initcall(fn) __define_initcall(fn, 6)
> > > #define late_initcall(fn) __define_initcall(fn, 7)
> > >
> > > Since we start off from rootfs_initcall(), it gives us 3 more
> > > levels of leg room to play with for order semantics, this isn't
> > > enough to address all required levels of dependencies, this
> > > is specially true given that AMD-KFD needs to be loaded before
> > > the radeon driver -- -but this it not enforced by symbols.
> > > If the AMD-KFD driver is not loaded prior to the radeon driver
> > > because otherwise the radeon driver will not initialize the
> > > AMD-KFD driver and you get no KFD functionality in userspace.
> > >
> > > Commit 1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in
> > > Makefile") works around some of the possibe races between
> > > the AMD IOMMU v2 and GPU drivers by changing the link order.
> > > This is fragile, however its the bets we can do, given that
> > > making the GPU drivers use late_initcall() would also implicate
> > > a similar race between them. That possible race is fortunatley
> > > addressed given that the drm Makefile currently has amdkfd
> > > linked prior to radeon:
> > >
> > > drivers/gpu/drm/Makefile
> > > ...
> > > obj-$(CONFIG_HSA_AMD) += amd/amdkfd/
> > > obj-$(CONFIG_DRM_RADEON)+= radeon/
> > > ...
> > >
> > > Changing amdkfd and radeon to late_initcall() however is
> > > still the right call in orde to annotate explicitly a
> > > delayed dependency requirement between the GPU drivers
> > > and the IOMMUs.
> > >
> > > We can't address the fragile nature of the link order
> > > right now, but in the future that might be possible.
> > >
> > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > ---
> > >
> > > Please note, the changes to drivers/Makefile are just
> > > for the sake of forcing the possible race to occur,
> > > if this works well the actual [PATCH] submission will
> > > skip those changes as its pointless to remove those
> > > work arounds as it stands, due to the limited nature
> > > of the levels available for addressing requirements.
> > >
> > > Also, if you are aware of further dependency hell
> > > things like these -- please do let me know as I am
> > > interested in looking at addressing them.
> >
> > This should be fixed with -EDEFER_PROBE instead. Frobbing initcall
> > levels should then just be done as an optimization to avoid too much
> > reprobing.
>
> radeon already uses -EDEFER_PROBE but it assumes that amdkfd *is* loaded first,
> and only if it is already loaded can it count on getting -EDEFER_PROBE. The
> radeon driver will deffer probe *iff* kgd2kfd_init() returns -EDEFER_PROBE,
> which only happens when amdkfd_init_completed is no longer present. If radeon
> gets linked first though the symbol fetch for kgd2kfd_init() will make it as
> not-present though. So the current heuristic used does not address proper link
> / load order. Part of the issue mentioned on the commit log is another race
> underneath the hood with the AMD IOMMU v2 which is needed for amdkfd. The
> underlying issue however really is the lack of available clear semantics for
> dependencies over 3 levels here. This is solved one way or another by link
> order in the Makefiles, but as I've noted so far this has been rather implicit
> and fragile. The change here makes at least the order of the GPU drivers
> explicitly later than the IOMMUs. The specific race between radeon and amdfkd
> is solved currently through link order through the Makefiles. In the future we
> maybe able to make things more explicit.
Sounds like the EDEFER_PROBE handling is broken - if the module isn't set
up yet but selected in Kconfig, and needed for that hw generation then it
should not just silently fail.
> -EDEFER_PROBE also introduces a latency on load which we should not need if we
> can handle proper link / load order dependency annotations. This change is a
> small part of that work, but as it the commit log also notes future further
> work is possible to build stronger semantics. Some of the work I'm doing with
> linker-tables may help with this in the future [0], but for now this should
> help with what the semantics we have in place.
>
> [0] http://lkml.kernel.org/r/1455889559-9428-1-git-send-email-mcgrof@kernel.org
That's what I meant with "avoiding too much reprobing". But in the end the
current solution to cross-driver deps we have is EDEFER_PROBE. Fiddling
with the link order is all well for optimizing stuff, but imo _way_ too
fragile for correctness.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
next prev parent reply other threads:[~2016-05-31 19:05 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-15 22:12 [RFT] iommu/amd: use subsys_initcall() on amdv2 iommu Luis R. Rodriguez
2016-03-16 7:02 ` Oded Gabbay
2016-03-16 10:14 ` Joerg Roedel
2016-03-16 10:16 ` Oded Gabbay
2016-03-16 16:17 ` Luis R. Rodriguez
2016-03-16 16:39 ` Joerg Roedel
2016-03-16 16:57 ` Luis R. Rodriguez
2016-03-16 17:17 ` Joerg Roedel
2016-03-29 17:41 ` [RFT v2] " Luis R. Rodriguez
2016-04-09 0:25 ` Luis R. Rodriguez
2016-04-11 13:28 ` Christian König
2016-04-11 13:39 ` Oded Gabbay
2016-04-11 13:52 ` Christian König
2016-04-12 22:07 ` Luis R. Rodriguez
2016-04-18 6:48 ` Oded Gabbay
[not found] ` <CAB=NE6WL7j_azrFxQUG3bybXtu67ew51CyzYvkBct6tCdARKDg@mail.gmail.com>
2016-04-18 7:02 ` Oded Gabbay
2016-04-18 12:03 ` Luis R. Rodriguez
2016-04-19 2:02 ` Wan Zongshun
2016-05-27 0:12 ` Luis R. Rodriguez
2016-04-25 10:23 ` Joerg Roedel
2016-05-27 0:46 ` Luis R. Rodriguez
2016-05-27 1:18 ` [RFT v3] drm: use late_initcall() for amdkfd and radeon Luis R. Rodriguez
2016-05-29 14:49 ` Oded Gabbay
2016-05-31 17:15 ` Luis R. Rodriguez
2016-05-31 17:33 ` Oded Gabbay
2016-05-29 18:27 ` Daniel Vetter
2016-05-31 16:58 ` Luis R. Rodriguez
2016-05-31 19:04 ` Daniel Vetter [this message]
2016-06-01 21:11 ` Luis R. Rodriguez
2016-11-10 22:12 ` Luis R. Rodriguez
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160531190453.GL7231@phenom.ffwll.local \
--to=daniel@ffwll.ch \
--cc=airlied@linux.ie \
--cc=alexander.deucher@amd.com \
--cc=christian.koenig@amd.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=gregkh@linuxfoundation.org \
--cc=hpa@zytor.com \
--cc=iommu@lists.linux-foundation.org \
--cc=joro@8bytes.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=mcgrof@suse.com \
--cc=oded.gabbay@gmail.com \
--cc=vw@iommu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox