public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
* Question about call to pci_assign_unassigned_bus_resources in amdgpu
@ 2019-06-24  4:27 Benjamin Herrenschmidt
  2019-06-24  8:42 ` Koenig, Christian
  0 siblings, 1 reply; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-24  4:27 UTC (permalink / raw)
  To: Christian König; +Cc: Alex Deucher, linux-pci

Hi Christian !

While cleaning up & consolidating resource management code accross
archs I stumbled upon this call to pci_assign_unassigned_bus_resources
in amdgpu:amdgpu_device_resize_fb_bar()

Why do you need this ? My understanding is that pci_resize_resource()
will already be calling pci_reassign_bridge_resources() on the parent
bridge which should have the same effect. Or am I missing something ?

I'd like to remove that call if possible...

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about call to pci_assign_unassigned_bus_resources in amdgpu
  2019-06-24  4:27 Question about call to pci_assign_unassigned_bus_resources in amdgpu Benjamin Herrenschmidt
@ 2019-06-24  8:42 ` Koenig, Christian
  2019-06-24  8:47   ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 9+ messages in thread
From: Koenig, Christian @ 2019-06-24  8:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Deucher, Alexander, linux-pci@vger.kernel.org

Hi Ben,

Am 24.06.19 um 06:27 schrieb Benjamin Herrenschmidt:
> Hi Christian !
>
> While cleaning up & consolidating resource management code accross
> archs I stumbled upon this call to pci_assign_unassigned_bus_resources
> in amdgpu:amdgpu_device_resize_fb_bar()
>
> Why do you need this ? My understanding is that pci_resize_resource()
> will already be calling pci_reassign_bridge_resources() on the parent
> bridge which should have the same effect. Or am I missing something ?
>
> I'd like to remove that call if possible...

Bjorn suggested to do it this way and it indeed proved to be less error 
prone.

Basically amdgpu_device_resize_fb_bar() frees the resources for our VRAM 
and doorbell BAR:
>         /* Free the VRAM and doorbell BAR, we most likely need to move 
> both. */
>         amdgpu_device_doorbell_fini(adev);
>         if (adev->asic_type >= CHIP_BONAIRE)
>                 pci_release_resource(adev->pdev, 2);
>
>         pci_release_resource(adev->pdev, 0);

Then we resize the VRAM BAR by calling pci_resize_resource(). That in 
turn tries to resize and shuffle around the parent bridge resources 
using pci_reassign_bridge_resources().

But pci_reassign_bridge_resources() does not assign any device 
resources, it just tries to make sure the upstream bridges have enough 
space to fit everything in.

Independent if we succeeded or failed with handling the bridge(s) we 
call pci_assign_unassigned_bus_resources() to re-assign the previously 
freed up VRAM and doorbell BARs.

So yeah, this definitely necessary, or otherwise the driver would crash 
soon after because the resources are not assigned again.

Regards,
Christian.

>
> Cheers,
> Ben.
>
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about call to pci_assign_unassigned_bus_resources in amdgpu
  2019-06-24  8:42 ` Koenig, Christian
@ 2019-06-24  8:47   ` Benjamin Herrenschmidt
  2019-06-24  8:56     ` Benjamin Herrenschmidt
  2019-06-24  9:08     ` Koenig, Christian
  0 siblings, 2 replies; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-24  8:47 UTC (permalink / raw)
  To: Koenig, Christian; +Cc: Deucher, Alexander, linux-pci@vger.kernel.org

On Mon, 2019-06-24 at 08:42 +0000, Koenig, Christian wrote:
> Then we resize the VRAM BAR by calling pci_resize_resource(). That in 
> turn tries to resize and shuffle around the parent bridge resources 
> using pci_reassign_bridge_resources().
> 
> But pci_reassign_bridge_resources() does not assign any device 
> resources, it just tries to make sure the upstream bridges have enough 
> space to fit everything in.
> 
> Independent if we succeeded or failed with handling the bridge(s) we 
> call pci_assign_unassigned_bus_resources() to re-assign the previously 
> freed up VRAM and doorbell BARs.
> 
> So yeah, this definitely necessary, or otherwise the driver would crash 
> soon after because the resources are not assigned again.

Oh, I missed that pci_reassign_bridge_resources() didn't reassign the
resources under the bridge... ugh... that code is a bloody mess.

We have 4 or 5 "assign bus resources" functions, all subtly different
for no clear reasons (the historical changelogs from Yinghai may as
well don't exist, they are basically encrypted), and in most case for
no good reasons....

Question: Do you ever need to assign anything other than that one
device though ? In my branch, I've added this typically for the case
where a single device needs to be reassigned:

+void pci_dev_assign_resources(struct pci_dev *dev)
+{
+	LIST_HEAD(head);
+
+	/* Assign non-fixed resources */
+	__dev_sort_resources(dev, &head);
+	__assign_resources_sorted(&head, NULL, NULL);
+
+	/* Assign fixed ones if any */
+	pdev_assign_fixed_resources(dev);
+}
+EXPORT_SYMBOL(pci_dev_assign_resources);

Would that work for you ?

If yes, I'll replace pci_assign_unassigned_bus_resources(). I'd like to
eventually kill it..

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about call to pci_assign_unassigned_bus_resources in amdgpu
  2019-06-24  8:47   ` Benjamin Herrenschmidt
@ 2019-06-24  8:56     ` Benjamin Herrenschmidt
  2019-06-24  9:10       ` Koenig, Christian
  2019-06-24  9:08     ` Koenig, Christian
  1 sibling, 1 reply; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-24  8:56 UTC (permalink / raw)
  To: Koenig, Christian; +Cc: Deucher, Alexander, linux-pci@vger.kernel.org

On Mon, 2019-06-24 at 18:47 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2019-06-24 at 08:42 +0000, Koenig, Christian wrote:
> > Then we resize the VRAM BAR by calling pci_resize_resource(). That in 
> > turn tries to resize and shuffle around the parent bridge resources 
> > using pci_reassign_bridge_resources().
> > 
> > But pci_reassign_bridge_resources() does not assign any device 
> > resources, it just tries to make sure the upstream bridges have enough 
> > space to fit everything in.

Hrm... are you sure of this ? Maybe it has changed... or I'm missing
something. Because right in the middle of it I see:


 	__pci_bus_size_bridges(bridge->subordinate, &added);
	__pci_bridge_assign_resources(bridge, &added, &failed);

Now the second of these will call __pci_bus_assign_resources() on the
bridge->subordinate, which will recursively assign all devices below
the bridge.

Or am I overlooking something ?

It could be that if it fails, then you need to restore your device
resources indeed... but the normal case should work from my reading of
the code.

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about call to pci_assign_unassigned_bus_resources in amdgpu
  2019-06-24  8:47   ` Benjamin Herrenschmidt
  2019-06-24  8:56     ` Benjamin Herrenschmidt
@ 2019-06-24  9:08     ` Koenig, Christian
  2019-06-24  9:15       ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 9+ messages in thread
From: Koenig, Christian @ 2019-06-24  9:08 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Deucher, Alexander, linux-pci@vger.kernel.org

Am 24.06.19 um 10:47 schrieb Benjamin Herrenschmidt:
> On Mon, 2019-06-24 at 08:42 +0000, Koenig, Christian wrote:
>> Then we resize the VRAM BAR by calling pci_resize_resource(). That in
>> turn tries to resize and shuffle around the parent bridge resources
>> using pci_reassign_bridge_resources().
>>
>> But pci_reassign_bridge_resources() does not assign any device
>> resources, it just tries to make sure the upstream bridges have enough
>> space to fit everything in.
>>
>> Independent if we succeeded or failed with handling the bridge(s) we
>> call pci_assign_unassigned_bus_resources() to re-assign the previously
>> freed up VRAM and doorbell BARs.
>>
>> So yeah, this definitely necessary, or otherwise the driver would crash
>> soon after because the resources are not assigned again.
> Oh, I missed that pci_reassign_bridge_resources() didn't reassign the
> resources under the bridge... ugh... that code is a bloody mess.
>
> We have 4 or 5 "assign bus resources" functions, all subtly different
> for no clear reasons (the historical changelogs from Yinghai may as
> well don't exist, they are basically encrypted), and in most case for
> no good reasons....

Yeah, can't agree more :) It was one of the main challenges implementing 
the resizing support.

If you want to clean this up feel free to CC me and I can take a look as 
well. But honestly I was scared of touching it when I worked on this, 
because of all the little corner cases you have in PCI.

> Question: Do you ever need to assign anything other than that one
> device though ? In my branch, I've added this typically for the case
> where a single device needs to be reassigned:
>
> +void pci_dev_assign_resources(struct pci_dev *dev)
> +{
> +	LIST_HEAD(head);
> +
> +	/* Assign non-fixed resources */
> +	__dev_sort_resources(dev, &head);
> +	__assign_resources_sorted(&head, NULL, NULL);
> +
> +	/* Assign fixed ones if any */
> +	pdev_assign_fixed_resources(dev);
> +}
> +EXPORT_SYMBOL(pci_dev_assign_resources);
>
> Would that work for you ?

That should work perfectly fine.

Regards,
Christian.

>
> If yes, I'll replace pci_assign_unassigned_bus_resources(). I'd like to
> eventually kill it..
>
> Cheers,
> Ben.
>
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about call to pci_assign_unassigned_bus_resources in amdgpu
  2019-06-24  8:56     ` Benjamin Herrenschmidt
@ 2019-06-24  9:10       ` Koenig, Christian
  0 siblings, 0 replies; 9+ messages in thread
From: Koenig, Christian @ 2019-06-24  9:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Deucher, Alexander, linux-pci@vger.kernel.org

Am 24.06.19 um 10:56 schrieb Benjamin Herrenschmidt:
> On Mon, 2019-06-24 at 18:47 +1000, Benjamin Herrenschmidt wrote:
>> On Mon, 2019-06-24 at 08:42 +0000, Koenig, Christian wrote:
>>> Then we resize the VRAM BAR by calling pci_resize_resource(). That in
>>> turn tries to resize and shuffle around the parent bridge resources
>>> using pci_reassign_bridge_resources().
>>>
>>> But pci_reassign_bridge_resources() does not assign any device
>>> resources, it just tries to make sure the upstream bridges have enough
>>> space to fit everything in.
> Hrm... are you sure of this ? Maybe it has changed... or I'm missing
> something. Because right in the middle of it I see:
>
>
>   	__pci_bus_size_bridges(bridge->subordinate, &added);
> 	__pci_bridge_assign_resources(bridge, &added, &failed);
>
> Now the second of these will call __pci_bus_assign_resources() on the
> bridge->subordinate, which will recursively assign all devices below
> the bridge.
>
> Or am I overlooking something ?

It is perfectly possible that this changed later in the patch set. We 
had something like 4 or 5 iterations until everything settled.

> It could be that if it fails, then you need to restore your device
> resources indeed... but the normal case should work from my reading of
> the code.

Yeah, it is definitely still necessary for error handling.

Regards,
Christian.

>
> Cheers,
> Ben.
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about call to pci_assign_unassigned_bus_resources in amdgpu
  2019-06-24  9:08     ` Koenig, Christian
@ 2019-06-24  9:15       ` Benjamin Herrenschmidt
  2019-06-24  9:42         ` Koenig, Christian
  0 siblings, 1 reply; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-24  9:15 UTC (permalink / raw)
  To: Koenig, Christian; +Cc: Deucher, Alexander, linux-pci@vger.kernel.org

On Mon, 2019-06-24 at 09:08 +0000, Koenig, Christian wrote:
> 
> Yeah, can't agree more :) It was one of the main challenges implementing 
> the resizing support.
> 
> If you want to clean this up feel free to CC me and I can take a look as 
> well. But honestly I was scared of touching it when I worked on this, 
> because of all the little corner cases you have in PCI.
> 
> > Question: Do you ever need to assign anything other than that one
> > device though ? In my branch, I've added this typically for the case
> > where a single device needs to be reassigned:
> > 
> > +void pci_dev_assign_resources(struct pci_dev *dev)
> > +{
> > +	LIST_HEAD(head);
> > +
> > +	/* Assign non-fixed resources */
> > +	__dev_sort_resources(dev, &head);
> > +	__assign_resources_sorted(&head, NULL, NULL);
> > +
> > +	/* Assign fixed ones if any */
> > +	pdev_assign_fixed_resources(dev);
> > +}
> > +EXPORT_SYMBOL(pci_dev_assign_resources);
> > 
> > Would that work for you ?
> 
> That should work perfectly fine.

Ok, Ill plumb it that way in my branch, I'll let you know when it's
worth testing. BTW. Which GPUs typically are affected ? I'm pretty sure
my old R9 290 isn't :-) But I was thinking of upgrading so...

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about call to pci_assign_unassigned_bus_resources in amdgpu
  2019-06-24  9:15       ` Benjamin Herrenschmidt
@ 2019-06-24  9:42         ` Koenig, Christian
  2019-06-24 10:27           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 9+ messages in thread
From: Koenig, Christian @ 2019-06-24  9:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Deucher, Alexander, linux-pci@vger.kernel.org

Am 24.06.19 um 11:15 schrieb Benjamin Herrenschmidt:
> On Mon, 2019-06-24 at 09:08 +0000, Koenig, Christian wrote:
>> Yeah, can't agree more :) It was one of the main challenges implementing
>> the resizing support.
>>
>> If you want to clean this up feel free to CC me and I can take a look as
>> well. But honestly I was scared of touching it when I worked on this,
>> because of all the little corner cases you have in PCI.
>>
>>> Question: Do you ever need to assign anything other than that one
>>> device though ? In my branch, I've added this typically for the case
>>> where a single device needs to be reassigned:
>>>
>>> +void pci_dev_assign_resources(struct pci_dev *dev)
>>> +{
>>> +	LIST_HEAD(head);
>>> +
>>> +	/* Assign non-fixed resources */
>>> +	__dev_sort_resources(dev, &head);
>>> +	__assign_resources_sorted(&head, NULL, NULL);
>>> +
>>> +	/* Assign fixed ones if any */
>>> +	pdev_assign_fixed_resources(dev);
>>> +}
>>> +EXPORT_SYMBOL(pci_dev_assign_resources);
>>>
>>> Would that work for you ?
>> That should work perfectly fine.
> Ok, Ill plumb it that way in my branch, I'll let you know when it's
> worth testing. BTW. Which GPUs typically are affected ? I'm pretty sure
> my old R9 290 isn't :-) But I was thinking of upgrading so...

Well in theory we have the functionality 10+ years now, but only 
activated it in all hardware versions recently.

Polaris and Vega should definitely have it, older hardware most likely not.

You can check the PCI capabilities and look for #15, if it's present 
then that should be supported.

Christian.

>
> Cheers,
> Ben.
>
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Question about call to pci_assign_unassigned_bus_resources in amdgpu
  2019-06-24  9:42         ` Koenig, Christian
@ 2019-06-24 10:27           ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin Herrenschmidt @ 2019-06-24 10:27 UTC (permalink / raw)
  To: Koenig, Christian; +Cc: Deucher, Alexander, linux-pci@vger.kernel.org

On Mon, 2019-06-24 at 09:42 +0000, Koenig, Christian wrote:
> 
> > Ok, Ill plumb it that way in my branch, I'll let you know when it's
> > worth testing. BTW. Which GPUs typically are affected ? I'm pretty
> > sure
> > my old R9 290 isn't :-) But I was thinking of upgrading so...
> 
> Well in theory we have the functionality 10+ years now, but only 
> activated it in all hardware versions recently.
> 
> Polaris and Vega should definitely have it, older hardware most
> likely not.
> 
> You can check the PCI capabilities and look for #15, if it's present 
> then that should be supported.

Ok, well, we'll see if I decide to get myself a Navi when it comes out,
otherwise I'll rely on your testing :)

That said, I'm keen on having a discussion about our resource
assignment code at Plumbers with whoever can make it. The current
situation is rather horrible, it would be good to ensure at least that
we all understand and agree on what it's trying to do, what it's
actually doing, and what we want to do, which the more I stare at it
I'm reasonably sure are very different things depending on the
context...

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-06-24 10:27 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-06-24  4:27 Question about call to pci_assign_unassigned_bus_resources in amdgpu Benjamin Herrenschmidt
2019-06-24  8:42 ` Koenig, Christian
2019-06-24  8:47   ` Benjamin Herrenschmidt
2019-06-24  8:56     ` Benjamin Herrenschmidt
2019-06-24  9:10       ` Koenig, Christian
2019-06-24  9:08     ` Koenig, Christian
2019-06-24  9:15       ` Benjamin Herrenschmidt
2019-06-24  9:42         ` Koenig, Christian
2019-06-24 10:27           ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox