From: Thomas Hellstrom <thomas@shipmail.org>
To: skeggsb@gmail.com
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/ttm: fix two regressions since move_notify changes
Date: Wed, 25 Jan 2012 09:39:47 +0100 [thread overview]
Message-ID: <4F1FBFD3.90105@shipmail.org> (raw)
In-Reply-To: <1327478723.5189.7.camel@nisroch>
On 01/25/2012 09:05 AM, Ben Skeggs wrote:
> On Wed, 2012-01-25 at 08:43 +0100, Thomas Hellstrom wrote:
>> On 01/25/2012 06:34 AM, Ben Skeggs wrote:
>>> From: Ben Skeggs<bskeggs@redhat.com>
>>>
>>> Both changes in dc97b3409a790d2a21aac6e5cdb99558b5944119 cause serious
>>> regressions in the nouveau driver.
>>>
>>> move_notify() was originally able to presume that bo->mem is the old node,
>>> and new_mem is the new node. The above commit moves the call to
>>> move_notify() to after move() has been done, which means that now, sometimes,
>>> new_mem isn't the new node at all, bo->mem is, and new_mem points at a
>>> stale, possibly-just-been-killed-by-move node.
>>>
>>> This is clearly not a good situation. This patch reverts this change, and
>>> replaces it with a cleanup in the move() failure path instead.
>> While I have nothing against having move_notify() happening before the
>> move, but
>> I still very much dislike the failure path thing, since it puts a
>> requirement on TTM to support
>> driver moves through move_notify(), which is done by Radeon.
>> I've kindly asked Jerome to change that, stating a number of reasons,
>> but he refuses, and I'm
>> not prepared to let support for that mode of operation sneak in like this.
> What requirement is there? All it's asking the driver is to do a
> move_notify with old/new nodes swapped, which would effectively undo the
> previous move_notify().
>
> move_notify() *cannot* happen after the move for the reasons I mentioned
> already, so the choice is apparently either to completely ignore
> cleaning up if the subsequent move() fails (previous behaviour prior to
> the patch which caused these regressions), or to make the change I
> did...
I agree. What I'm proposing is that we should go through with the first
part of
your patch and ignore the cleanup.
>
>>
>> The second issue is that the call to move_notify() from cleanup_memtype_use()
>> causes the TTM ghost objects to get passed into the driver. This is clearly
>> bad as the driver knows nothing about these "fake" TTM BOs, and ends up
>> accessing uninitialised memory.
>>
>> I worked around this in nouveau's move_notify() hook by ensuring the BO
>> destructor was nouveau's. I don't particularly like this solution, and
>> would rather TTM never pass the driver these objects. However, I don't
>> clearly understand the reason why we're calling move_notify() here anyway
>> and am happy to work around the problem in nouveau instead of breaking the
>> behaviour expected by other drivers.
>>
>>
>> move_notify() here gives the driver a chance to unbind any GPU maps
>> remaining on the BO
>> before it changes placement or is destroyed.
>> We've previously required the driver to support any type of BO in the
>> driver hooks, I agree
>> with you that it would be desirable to make that go away.
> Okay, that's fine I guess. As the commit says, I'm happy to make
> nouveau just ignore operations on BOs it doesn't "own".
>
> I know *you* think of move_notify() as only being around to deal with
> unmapping, but as I mentioned previously, it'd have never taken the new
> node as a parameter if this is were the case.
That's not exactly true. After you pointed that out, I went back and
check the
old vmwgfx use, and the new node was used to determine whether
an unbind was actually necessary, or whether it could leave the gpu map
untouched.
> I have zero issue with
> nouveau using it to set up the GPU mappings currently. I know you've
> suggested alternatives previously, which may be possible down the track
> if it's *really* necessary, but it seems like a bad idea to pursue this
> for 3.3.
My main concern is that we blindly and unnecessarily set up GPU bindings and
end up with unnecessary code in TTM, and furthermore that we communicate
that bad practice to future driver writers.
Thomas, what do you suggest to move forward with this? Both of these
bugs are serious regressions that make nouveau unusable with the current
3.3-rc series. Ben.
My number one choice would of course be to have the drivers set up their
private GPU mappings after a
successful validate call, as I originally suggested and you agreed to.
If that's not possible (I realize it's late in the release series), I'll
ack this patch if you and Jerome agree not to block
attempts to move in that direction for future kernel releases.
Thanks,
Thomas
>> Signed-off-by: Ben Skeggs<bskeggs@redhat.com>
>> Cc: Jerome Glisse<j.glisse@gmail.com>
>> ---
>> drivers/gpu/drm/nouveau/nouveau_bo.c | 4 ++++
>> drivers/gpu/drm/ttm/ttm_bo.c | 17 +++++++++++++----
>> 2 files changed, 17 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
>> index 724b41a..ec54364 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
>> @@ -812,6 +812,10 @@ nouveau_bo_move_ntfy(struct ttm_buffer_object *bo, struct ttm_mem_reg *new_mem)
>> struct nouveau_bo *nvbo = nouveau_bo(bo);
>> struct nouveau_vma *vma;
>>
>> + /* ttm can now (stupidly) pass the driver bos it didn't create... */
>> + if (bo->destroy != nouveau_bo_del_ttm)
>> + return;
>> +
>> list_for_each_entry(vma,&nvbo->vma_list, head) {
>> if (new_mem&& new_mem->mem_type == TTM_PL_VRAM) {
>> nouveau_vm_map(vma, new_mem->mm_node);
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index 2f0eab6..7c3a57d 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -404,6 +404,9 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>> }
>> }
>>
>> + if (bdev->driver->move_notify)
>> + bdev->driver->move_notify(bo, mem);
>> +
>> if (!(old_man->flags& TTM_MEMTYPE_FLAG_FIXED)&&
>> !(new_man->flags& TTM_MEMTYPE_FLAG_FIXED))
>> ret = ttm_bo_move_ttm(bo, evict, no_wait_reserve, no_wait_gpu, mem);
>> @@ -413,11 +416,17 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>> else
>> ret = ttm_bo_move_memcpy(bo, evict, no_wait_reserve, no_wait_gpu, mem);
>>
>> - if (ret)
>> - goto out_err;
>> + if (ret) {
>> + if (bdev->driver->move_notify) {
>> + struct ttm_mem_reg tmp_mem = *mem;
>> + *mem = bo->mem;
>> + bo->mem = tmp_mem;
>> + bdev->driver->move_notify(bo, mem);
>> + bo->mem = *mem;
>> + }
>>
>> - if (bdev->driver->move_notify)
>> - bdev->driver->move_notify(bo, mem);
>> + goto out_err;
>> + }
>>
>> moved:
>> if (bo->evicted) {
>>
>>
>>
>>
>
next prev parent reply other threads:[~2012-01-25 8:39 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-15 21:31 [next] Null pointer dereference in nouveau_vm_map_sg Martin Nyhus
2012-01-16 20:30 ` Jerome Glisse
2012-01-16 23:57 ` Martin Nyhus
2012-01-22 18:33 ` Konrad Rzeszutek Wilk
2012-01-24 22:33 ` Jerome Glisse
2012-01-25 0:12 ` Martin Nyhus
2012-01-25 16:54 ` Jerome Glisse
2012-01-25 5:34 ` [PATCH] drm/ttm: fix two regressions since move_notify changes Ben Skeggs
2012-01-25 7:43 ` Thomas Hellstrom
2012-01-25 8:05 ` Ben Skeggs
2012-01-25 8:39 ` Thomas Hellstrom [this message]
2012-01-25 9:41 ` Ben Skeggs
2012-01-25 14:33 ` Thomas Hellstrom
2012-01-25 15:21 ` Ben Skeggs
2012-01-25 15:37 ` Jerome Glisse
2012-01-25 17:15 ` Thomas Hellstrom
2012-01-25 17:19 ` Thomas Hellstrom
2012-01-25 18:12 ` Dave Airlie
2012-01-25 18:21 ` Thomas Hellstrom
2012-01-25 18:51 ` Jerome Glisse
2012-01-25 8:24 ` Dave Airlie
2012-01-25 8:38 ` Ben Skeggs
2012-01-25 17:32 ` Thomas Hellstrom
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F1FBFD3.90105@shipmail.org \
--to=thomas@shipmail.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=skeggsb@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.