All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <deathsimple@vodafone.de>
To: Oded Gabbay <oded.gabbay@amd.com>, dri-devel@lists.freedesktop.org
Cc: Alexander.Deucher@amd.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] amdkfd: Don't clear *kfd2kgd on kfd_module_init
Date: Sun, 21 Dec 2014 16:57:37 +0100	[thread overview]
Message-ID: <5496EDF1.7080106@vodafone.de> (raw)
In-Reply-To: <5496CA0F.8000800@amd.com>

> There should be, but when the modules are compiled in, they are loaded 
> based on
> link order only, if they are in the same group, and the groups are 
> loaded by a
> pre-defined order. 
Is that really still up to date? I've seen effort to change that 
something like 10+ years ago when Rusty reworked the module system. And 
it is comming up on the lists again from time to time.

> I don't want to move iommu before gpu, so I don't have a solution for 
> the order between amdkfd and amd_iommu_v2. 
Why not? That's still better than creating a kernel workqueue, 
scheduling one work item on it, rescheduling the task until everything 
is completed and you can continue.

Christian.

Am 21.12.2014 um 14:24 schrieb Oded Gabbay:
>
>
> On 12/21/2014 03:06 PM, Oded Gabbay wrote:
>>
>>
>> On 12/21/2014 02:19 PM, Christian König wrote:
>>> Am 21.12.2014 um 12:34 schrieb Oded Gabbay:
>>>>
>>>>
>>>> On 12/21/2014 01:27 PM, Christian König wrote:
>>>>> Am 20.12.2014 um 21:46 schrieb Oded Gabbay:
>>>>>> When amdkfd and radeon are compiled inside the kernel image (not 
>>>>>> as modules),
>>>>>> radeon will load before amdkfd and will set *kfd2kgd to its 
>>>>>> interface
>>>>>> structure. Therefore, we must not set *kfd2kgd to NULL when 
>>>>>> amdkfd is loaded
>>>>>> because it will override radeon's initialization and cause kernel 
>>>>>> BUG.
>>>>>>
>>>>>> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
>>>>>
>>>>> You should probably rather fix the dependency between the two 
>>>>> modules to get an
>>>>> determined load order instead of doing such nasty workarounds.
>>>>>
>>>>> Christian.
>>>>
>>>> The problem is that when modules are compiled inside the kernel, 
>>>> there is NO
>>>> determined load order and there is no mechanism to enforce that. If 
>>>> there
>>>> is/was such a mechanism, I would of course prefer to use it.
>>>
>>> There should be an determined order based on the symbol use, otherwise
>>> initializing most of the kernel modules wouldn't work as expected. 
>>> For example
>>> radeon depends on the drm module must be loaded before radeon is 
>>> loaded.
>> There should be, but when the modules are compiled in, they are 
>> loaded based on
>> link order only, if they are in the same group, and the groups are 
>> loaded by a
>> pre-defined order.
>> The groups are: pure, core, postcore, arch, subsys, fs, device (which 
>> represents
>> all the modules) and late. See init.h
>>
>> So radeon, amdkfd and amd_iommu_v2 are all in device group, and in 
>> the group
>> they are ordered by their link order.
>>
>> Yes, radeon loads after drm, because drm*.o are before radeon*.o in the
>> Makefile. See
>> http://stackoverflow.com/questions/5669647/linux-order-of-statically-linked-module-loading 
>>
>>
>
> So I tried moving amdkfd inside the Makefile before radeon, and that 
> made amdkfd load before radeon did. This solves my first problem - 
> order between amdkfd and radeon. However, amd_iommu_v2 doesn't belong 
> to the drm Makefile, and I don't want to move iommu before gpu, so I 
> don't have a solution for the order between amdkfd and amd_iommu_v2.
>
>     Oded
>>
>>
>>>
>>>>
>>>> Actually, I don't understand why the kernel doesn't enforce the order
>>>> according to the use of exported symbols, like it does with modules.
>>>
>>> Yeah, that's indeed rather strange. There must be something in the 
>>> amdkfd code
>>> which broke that somehow.
>> IMO, that's a far-fetched guess. Could you point to something more 
>> specific ?
>>
>>>
>>> As far as I understand you the desired init order is radeon and 
>>> amd_iommu_v2
>>> first and then amdkfd, right?
>> Actually no. The preferred order is amd_iommu_v2, amdkfd and radeon 
>> last. This
>> is the order that happens when all three are built as modules. More 
>> accurately,
>> radeon inits, but its init triggers amdkfd init, which triggers 
>> amd_iommu_v2
>> init. So before radeon reaches its probe stage, all the modules were 
>> initialized.
>>
>> So what happens when you boot with radeon,
>>> amd_iommu_v2 and amdkfd blacklisted for automatically load and only 
>>> load amdkfd
>>> manually?
>> As said above, that's ok.
>>>
>>>> There will always be dependencies between kgd (radeon) and amdkfd 
>>>> and between
>>>> amdkfd and amd_iommu_v2. I don't think I can eliminate those 
>>>> dependencies, not
>>>> without a very complex solution. And the fact that this complex 
>>>> solution
>>>> occurs only in a very specific use case (all modules compiled in), 
>>>> makes me
>>>> less inclined to do that.
>>>>
>>>> So I don't see it as a "nasty workaround". I would call it just a 
>>>> "workaround"
>>>> for a specific use case, which should be solved by a generic 
>>>> solution to the
>>>> kernel enforcing load orders.
>>>
>>> The normal kernel module handling already should provide the correct 
>>> init order,
>>> so I would still call this a rather nasty workaround because we 
>>> couldn't find
>>> the underlying problem.
>> Well, the normal kernel module handling doesn't work when all modules 
>> are
>> compiled in. I'm not a huge expert on this issue so I had Joerg 
>> Roedel help me
>> analyze this (thanks Joerg) and he agreed that there is no 
>> enforcement of order
>> in this case.
>>
>>>
>>> Christian.
>>>
>>>>
>>>>     Oded
>>>>>
>>>>>> ---
>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_module.c | 5 ++---
>>>>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> index 95d5af1..236562f 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> @@ -34,7 +34,7 @@
>>>>>>   #define KFD_DRIVER_MINOR    7
>>>>>>   #define KFD_DRIVER_PATCHLEVEL    0
>>>>>> -const struct kfd2kgd_calls *kfd2kgd;
>>>>>> +const struct kfd2kgd_calls *kfd2kgd = NULL;
>>>>>>   static const struct kgd2kfd_calls kgd2kfd = {
>>>>>>       .exit        = kgd2kfd_exit,
>>>>>>       .probe        = kgd2kfd_probe,
>>>>>> @@ -84,14 +84,13 @@ EXPORT_SYMBOL(kgd2kfd_init);
>>>>>>   void kgd2kfd_exit(void)
>>>>>>   {
>>>>>> +    kfd2kgd = NULL;
>>>>>>   }
>>>>>>   static int __init kfd_module_init(void)
>>>>>>   {
>>>>>>       int err;
>>>>>> -    kfd2kgd = NULL;
>>>>>> -
>>>>>>       /* Verify module parameters */
>>>>>>       if ((sched_policy < KFD_SCHED_POLICY_HWS) ||
>>>>>>           (sched_policy > KFD_SCHED_POLICY_NO_HWS)) {
>>>>>
>>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

WARNING: multiple messages have this Message-ID (diff)
From: "Christian König" <deathsimple@vodafone.de>
To: Oded Gabbay <oded.gabbay@amd.com>, dri-devel@lists.freedesktop.org
Cc: Alexander.Deucher@amd.com, linux-kernel@vger.kernel.org,
	Joerg Roedel <joro@8bytes.org>
Subject: Re: [PATCH 1/3] amdkfd: Don't clear *kfd2kgd on kfd_module_init
Date: Sun, 21 Dec 2014 16:57:37 +0100	[thread overview]
Message-ID: <5496EDF1.7080106@vodafone.de> (raw)
In-Reply-To: <5496CA0F.8000800@amd.com>

> There should be, but when the modules are compiled in, they are loaded 
> based on
> link order only, if they are in the same group, and the groups are 
> loaded by a
> pre-defined order. 
Is that really still up to date? I've seen effort to change that 
something like 10+ years ago when Rusty reworked the module system. And 
it is comming up on the lists again from time to time.

> I don't want to move iommu before gpu, so I don't have a solution for 
> the order between amdkfd and amd_iommu_v2. 
Why not? That's still better than creating a kernel workqueue, 
scheduling one work item on it, rescheduling the task until everything 
is completed and you can continue.

Christian.

Am 21.12.2014 um 14:24 schrieb Oded Gabbay:
>
>
> On 12/21/2014 03:06 PM, Oded Gabbay wrote:
>>
>>
>> On 12/21/2014 02:19 PM, Christian König wrote:
>>> Am 21.12.2014 um 12:34 schrieb Oded Gabbay:
>>>>
>>>>
>>>> On 12/21/2014 01:27 PM, Christian König wrote:
>>>>> Am 20.12.2014 um 21:46 schrieb Oded Gabbay:
>>>>>> When amdkfd and radeon are compiled inside the kernel image (not 
>>>>>> as modules),
>>>>>> radeon will load before amdkfd and will set *kfd2kgd to its 
>>>>>> interface
>>>>>> structure. Therefore, we must not set *kfd2kgd to NULL when 
>>>>>> amdkfd is loaded
>>>>>> because it will override radeon's initialization and cause kernel 
>>>>>> BUG.
>>>>>>
>>>>>> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
>>>>>
>>>>> You should probably rather fix the dependency between the two 
>>>>> modules to get an
>>>>> determined load order instead of doing such nasty workarounds.
>>>>>
>>>>> Christian.
>>>>
>>>> The problem is that when modules are compiled inside the kernel, 
>>>> there is NO
>>>> determined load order and there is no mechanism to enforce that. If 
>>>> there
>>>> is/was such a mechanism, I would of course prefer to use it.
>>>
>>> There should be an determined order based on the symbol use, otherwise
>>> initializing most of the kernel modules wouldn't work as expected. 
>>> For example
>>> radeon depends on the drm module must be loaded before radeon is 
>>> loaded.
>> There should be, but when the modules are compiled in, they are 
>> loaded based on
>> link order only, if they are in the same group, and the groups are 
>> loaded by a
>> pre-defined order.
>> The groups are: pure, core, postcore, arch, subsys, fs, device (which 
>> represents
>> all the modules) and late. See init.h
>>
>> So radeon, amdkfd and amd_iommu_v2 are all in device group, and in 
>> the group
>> they are ordered by their link order.
>>
>> Yes, radeon loads after drm, because drm*.o are before radeon*.o in the
>> Makefile. See
>> http://stackoverflow.com/questions/5669647/linux-order-of-statically-linked-module-loading 
>>
>>
>
> So I tried moving amdkfd inside the Makefile before radeon, and that 
> made amdkfd load before radeon did. This solves my first problem - 
> order between amdkfd and radeon. However, amd_iommu_v2 doesn't belong 
> to the drm Makefile, and I don't want to move iommu before gpu, so I 
> don't have a solution for the order between amdkfd and amd_iommu_v2.
>
>     Oded
>>
>>
>>>
>>>>
>>>> Actually, I don't understand why the kernel doesn't enforce the order
>>>> according to the use of exported symbols, like it does with modules.
>>>
>>> Yeah, that's indeed rather strange. There must be something in the 
>>> amdkfd code
>>> which broke that somehow.
>> IMO, that's a far-fetched guess. Could you point to something more 
>> specific ?
>>
>>>
>>> As far as I understand you the desired init order is radeon and 
>>> amd_iommu_v2
>>> first and then amdkfd, right?
>> Actually no. The preferred order is amd_iommu_v2, amdkfd and radeon 
>> last. This
>> is the order that happens when all three are built as modules. More 
>> accurately,
>> radeon inits, but its init triggers amdkfd init, which triggers 
>> amd_iommu_v2
>> init. So before radeon reaches its probe stage, all the modules were 
>> initialized.
>>
>> So what happens when you boot with radeon,
>>> amd_iommu_v2 and amdkfd blacklisted for automatically load and only 
>>> load amdkfd
>>> manually?
>> As said above, that's ok.
>>>
>>>> There will always be dependencies between kgd (radeon) and amdkfd 
>>>> and between
>>>> amdkfd and amd_iommu_v2. I don't think I can eliminate those 
>>>> dependencies, not
>>>> without a very complex solution. And the fact that this complex 
>>>> solution
>>>> occurs only in a very specific use case (all modules compiled in), 
>>>> makes me
>>>> less inclined to do that.
>>>>
>>>> So I don't see it as a "nasty workaround". I would call it just a 
>>>> "workaround"
>>>> for a specific use case, which should be solved by a generic 
>>>> solution to the
>>>> kernel enforcing load orders.
>>>
>>> The normal kernel module handling already should provide the correct 
>>> init order,
>>> so I would still call this a rather nasty workaround because we 
>>> couldn't find
>>> the underlying problem.
>> Well, the normal kernel module handling doesn't work when all modules 
>> are
>> compiled in. I'm not a huge expert on this issue so I had Joerg 
>> Roedel help me
>> analyze this (thanks Joerg) and he agreed that there is no 
>> enforcement of order
>> in this case.
>>
>>>
>>> Christian.
>>>
>>>>
>>>>     Oded
>>>>>
>>>>>> ---
>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_module.c | 5 ++---
>>>>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> index 95d5af1..236562f 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>>>>>> @@ -34,7 +34,7 @@
>>>>>>   #define KFD_DRIVER_MINOR    7
>>>>>>   #define KFD_DRIVER_PATCHLEVEL    0
>>>>>> -const struct kfd2kgd_calls *kfd2kgd;
>>>>>> +const struct kfd2kgd_calls *kfd2kgd = NULL;
>>>>>>   static const struct kgd2kfd_calls kgd2kfd = {
>>>>>>       .exit        = kgd2kfd_exit,
>>>>>>       .probe        = kgd2kfd_probe,
>>>>>> @@ -84,14 +84,13 @@ EXPORT_SYMBOL(kgd2kfd_init);
>>>>>>   void kgd2kfd_exit(void)
>>>>>>   {
>>>>>> +    kfd2kgd = NULL;
>>>>>>   }
>>>>>>   static int __init kfd_module_init(void)
>>>>>>   {
>>>>>>       int err;
>>>>>> -    kfd2kgd = NULL;
>>>>>> -
>>>>>>       /* Verify module parameters */
>>>>>>       if ((sched_policy < KFD_SCHED_POLICY_HWS) ||
>>>>>>           (sched_policy > KFD_SCHED_POLICY_NO_HWS)) {
>>>>>
>>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel


  reply	other threads:[~2014-12-21 15:58 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-20 20:46 [PATCH 0/3] Use workqueue for device init in amdkfd Oded Gabbay
2014-12-20 20:46 ` Oded Gabbay
2014-12-20 20:46 ` [PATCH 1/3] amdkfd: Don't clear *kfd2kgd on kfd_module_init Oded Gabbay
2014-12-20 20:46   ` Oded Gabbay
2014-12-20 21:25   ` Greg KH
2014-12-20 21:25     ` Greg KH
2014-12-21 11:12     ` Oded Gabbay
2014-12-21 11:12       ` Oded Gabbay
2014-12-21 11:27   ` Christian König
2014-12-21 11:27     ` Christian König
2014-12-21 11:34     ` Oded Gabbay
2014-12-21 11:34       ` Oded Gabbay
2014-12-21 12:19       ` Christian König
2014-12-21 12:19         ` Christian König
2014-12-21 13:06         ` Oded Gabbay
2014-12-21 13:06           ` Oded Gabbay
2014-12-21 13:24           ` Oded Gabbay
2014-12-21 13:24             ` Oded Gabbay
2014-12-21 15:57             ` Christian König [this message]
2014-12-21 15:57               ` Christian König
2014-12-21 16:03               ` Oded Gabbay
2014-12-21 16:03                 ` Oded Gabbay
2014-12-21 16:10                 ` Christian König
2014-12-21 16:10                   ` Christian König
2014-12-22  7:34                   ` Oded Gabbay
2014-12-22  7:34                     ` Oded Gabbay
2014-12-22  7:40                     ` Dave Airlie
2014-12-22  7:40                       ` Dave Airlie
2014-12-22  7:43                       ` Oded Gabbay
2014-12-22  7:43                         ` Oded Gabbay
2014-12-22  8:57                         ` Christian König
2014-12-22  8:57                           ` Christian König
2014-12-22  9:26                           ` Oded Gabbay
2014-12-22  9:26                             ` Oded Gabbay
2014-12-22 10:22                             ` Oded Gabbay
2014-12-22 10:22                               ` Oded Gabbay
2014-12-20 20:46 ` [PATCH 2/3] amdkfd: Track when amdkfd init is complete Oded Gabbay
2014-12-20 20:46   ` Oded Gabbay
2014-12-20 20:46 ` [PATCH 3/3] amdkfd: Use workqueue for GPU init Oded Gabbay
2014-12-20 20:46   ` Oded Gabbay

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5496EDF1.7080106@vodafone.de \
    --to=deathsimple@vodafone.de \
    --cc=Alexander.Deucher@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oded.gabbay@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.