From: Tejas Joglekar <Tejas.Joglekar@synopsys.com>
To: Mathias Nyman <mathias.nyman@linux.intel.com>,
	Tejas Joglekar <Tejas.Joglekar@synopsys.com>
Cc: "linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	John Youn <John.Youn@synopsys.com>
Subject: Re: [RFC PATCH 1/4] usb: xhci: Synopsys xHC consolidate TRBs
Date: Mon, 16 Mar 2020 17:09:40 +0000
Message-ID: <cb3b73da-d7f2-3bbc-f822-6a9ca4edc3d4@synopsys.com>
In-Reply-To: <37e8bc7b-4dd4-704c-ab1d-933a56534213@linux.intel.com>

Hi,
On 3/3/2020 8:17 PM, Mathias Nyman wrote:
> On 3.3.2020 12.24, Tejas Joglekar wrote:
>> Hi,
>> On 2/14/2020 2:06 PM, Tejas Joglekar wrote:
>>> Hi,
>>> On 2/12/2020 8:34 PM, Mathias Nyman wrote:
>>>> On 7.2.2020 19.17, Tejas Joglekar wrote:
>>>>> Hi,
>>>>> Thanks for the review comments.
>>>>> On 1/3/2020 10:14 PM, Mathias Nyman wrote:
>>>>>> On 20.12.2019 15.39, Tejas Joglekar wrote:
>>>>>>> The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
>>>>>>> each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
>>>>>>> for HS. The controller loads and updates the TRB cache from the transfer
>>>>>>> ring in system memory whenever the driver issues a start transfer or
>>>>>>> update transfer command.
>>>>>>>
>>>>>>> For chained TRBs, the Synopsys xHC requires that the total number of
>>>>>>> bytes for all TRBs loaded in the TRB cache be greater than or equal to
>>>>>>> 1 MPS, or that the chain ends within the TRB cache (with a last TRB).
>>>>>>>
>>>>>>> If this requirement is not met, the controller will not be able to send
>>>>>>> or receive a packet and it will hang causing a driver timeout and error.
>>>>>>>
>>>>>>> This can be a problem if a class driver queues SG requests with many
>>>>>>> small-buffer entries. The XHCI driver will create a chained TRB for each
>>>>>>> entry which may trigger this issue.
>>>>>>>
>>>>>>> This patch adds logic to the XHCI driver to detect and prevent this from
>>>>>>> happening.
>>>>>>>
>>>>>>> For every (TRB_CACHE_SIZE - 2) TRBs, we check the total buffer size of
>>>>>>> the TRBs and if the chain continues and we don't make up at least 1 MPS,
>>>>>>> we create a bounce buffer to consolidate up to the next (4 * MPS) TRBs
>>>>>>> into the last TRB.
>>>>>>>
>>>>>>> We check at (TRB_CACHE_SIZE - 2) because a link and/or event data TRB
>>>>>>> may take up to 2 of the cache entries, and we consolidate the next
>>>>>>> (4 * MPS) to improve performance.
>>>>>>>
>>>>>>> We discovered this issue with devices on other platforms but have not
>>>>>>> yet come across any device that triggers it on Linux. But it could be
>>>>>>> a real problem now or in the future; all it takes is enough small
>>>>>>> chained TRBs. And other instances of the Synopsys IP may have smaller
>>>>>>> values for TRB_CACHE_SIZE, which would exacerbate the problem.
>>>>>>>
>>>>>>> We verified this patch using our internal driver and testing framework.
>>>>>>
>>>>>>
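For illustration, the check described in that commit message could be sketched
roughly as below. The helper name, the fixed-window scan, and the hard-coded
cache sizes are assumptions for illustration, not the actual patch code:

/*
 * Illustrative sketch only; needs <linux/usb.h> and <linux/scatterlist.h>.
 * Scan the urb's sg list in windows of (TRB_CACHE_SIZE - 2) entries and
 * report whether a still-continuing chain fails to reach one max packet.
 */
static bool xhci_sg_needs_consolidation(struct urb *urb)
{
        /* 16 TRB cache entries for SuperSpeed, 8 for high-speed */
        unsigned int cache = urb->dev->speed >= USB_SPEED_SUPER ? 16 : 8;
        unsigned int mps = usb_endpoint_maxp(&urb->ep->desc);
        struct scatterlist *sg;
        unsigned int len = 0, cnt = 0;
        int i;

        for_each_sg(urb->sg, sg, urb->num_sgs, i) {
                len += sg->length;
                /* leave 2 cache slots for a link and/or event data TRB */
                if (++cnt < cache - 2)
                        continue;
                /* chain continues past the window with < 1 MPS queued */
                if (len < mps && i < (int)urb->num_sgs - 1)
                        return true;
                cnt = 0;
                len = 0;
        }
        return false;
}
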
>>>>> I understand that at first look it seems a complex solution, but you can suggest
>>>>> the modifications/changes that would be required to make the patch more readable.
>>>>> I have tried to keep the implementation similar to the bounce buffer implementation,
>>>>> only with the addition of a bounce buffer list. For the spinlock case, you can decide
>>>>> whether it is required or not.
>>>>
>>>> In your case you know the need for a bounce buffer much earlier than in the
>>>> existing TD fragment case.
>>>>
>>>> Have you looked into the struct hc_driver map_urb_for_dma() and
>>>> unmap_urb_for_dma() hooks? In those you could detect the need for a bounce
>>>> buffer, allocate it, and bluntly copy the entire scatter-gather buffer to
>>>> the bounce buffer. It should be fairly small anyway.
>>>>
>>> I will look into it, and get back to you. Thanks for the suggestion.
>>>  
>> I looked into it and I have a question about the approach you suggested.
>> I can detect the need for the bounce buffer in map_urb_for_dma() and can allocate the
>> bounce buffer and bluntly copy into it, but when the SG list holds several MB of data,
>> I think the bounce buffer allocation might fail over time. Do you think that is possible?
> 
> Doesn't sound very likely: the sg list would need more than 16 entries, of which
> 16 consecutive entries have a total length of less than 1024 bytes, and the
> following entries would then have to be large enough to fail memory allocation.
> (For HS: a total length < 512 for 8 entries, and the rest huge.)
> 
> And no real-world device has yet triggered even the first condition of having 16 (8)
> sg entries with a total length less than the max packet size.
> 
>>
>> So to avoid that, I thought of keeping a list of such bounce buffers held with the URB:
>> I could break the SG list into bounce buffers of some fixed length (e.g. 16 KB or 32 KB)
>> and copy the SG into that bounce buffer list.
>>
>> Another option is to create a bounce sg: on detecting the bounce buffer requirement,
>> the small SG entries are combined into a new SG list that satisfies the
>> TRB cache requirement of the SNPS controller.
>>
>> What do you suggest? Which is the best way to go about solving the problem?
> 
> As this is extremely rare (never reported in a real use case),
> I'd keep this as simple as possible: allocate one bounce buffer when needed, and copy all.
> If memory allocation fails (unlikely), print a warning; then we immediately know what to fix.
> 
> We are talking about sizes where 16 sg entries hold a total of 1024 bytes of data.
> 
Ok, I agree.
> ehci-tegra.c does something related to what you want: it replaces
> urb->transfer_buffer with a DMA-aligned bounce buffer.
> 
I tried this (in hcd.c) and it seems to be working with my tests. I want to override
unmap_urb_for_dma() and map_urb_for_dma() and keep everything else the same as the xHCI
driver. I thought of using xhci_driver_override for this instead of creating a whole new
glue driver. Would that be OK?
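
For reference, a rough sketch of what those overridden hooks might look like,
reusing the detection helper sketched earlier. The bounce marker (urb->sg left
set with num_sgs cleared) and the error handling are assumptions rather than
the actual implementation:

static int xhci_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
                                gfp_t mem_flags)
{
        if (urb->num_sgs && xhci_sg_needs_consolidation(urb)) {
                void *buf = kmalloc(urb->transfer_buffer_length, mem_flags);

                if (!buf) {
                        xhci_warn(hcd_to_xhci(hcd),
                                  "consolidation bounce buffer alloc failed\n");
                        return -ENOMEM;
                }
                if (usb_urb_dir_out(urb))
                        sg_copy_to_buffer(urb->sg, urb->num_sgs, buf,
                                          urb->transfer_buffer_length);
                /*
                 * Map the linear bounce buffer instead of the sg list.
                 * urb->sg is left set (with num_sgs cleared) as a marker
                 * for unmap; a real driver would track bounced URBs more
                 * robustly.
                 */
                urb->transfer_buffer = buf;
                urb->num_sgs = 0;
        }
        return usb_hcd_map_urb_for_dma(hcd, urb, mem_flags);
}

static void xhci_unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
{
        usb_hcd_unmap_urb_for_dma(hcd, urb);

        if (urb->sg && !urb->num_sgs && urb->transfer_buffer) {
                if (usb_urb_dir_in(urb))
                        sg_copy_from_buffer(urb->sg, sg_nents(urb->sg),
                                            urb->transfer_buffer,
                                            urb->actual_length);
                kfree(urb->transfer_buffer);
                urb->transfer_buffer = NULL;
                urb->num_sgs = sg_nents(urb->sg);
        }
}

These two hooks would then replace the defaults in the hc_driver, with
everything else left as in the stock xHCI driver.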

> -Mathias
> 


Thread overview: 15+ messages
2019-12-20 13:38 [RFC PATCH 0/4] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
2019-12-20 13:39 ` [RFC PATCH 1/4] usb: xhci: Synopsys xHC consolidate TRBs Tejas Joglekar
2020-01-02  9:08   ` David Laight
2020-01-03 16:44   ` Mathias Nyman
2020-02-07 17:17     ` Tejas Joglekar
2020-02-12 15:04       ` Mathias Nyman
2020-02-14  8:36         ` Tejas Joglekar
2020-03-03 10:24           ` Tejas Joglekar
2020-03-03 14:47             ` Mathias Nyman
2020-03-16 17:09               ` Tejas Joglekar [this message]
2020-03-17  8:48                 ` Mathias Nyman
2019-12-20 13:39 ` [RFC PATCH 2/4] usb: dwc3: Add device property consolidate-trbs Tejas Joglekar
2019-12-20 13:40 ` [RFC PATCH 3/4] usb: xhci: Set quirk for XHCI_CONSOLIDATE_TRBS Tejas Joglekar
2019-12-20 13:40 ` [RFC PATCH 4/4] dt-bindings: usb: Add snps,consolidate-trbs & consolidate-trbs Tejas Joglekar
2020-01-08 15:19   ` Rob Herring
