public inbox for linux-usb@vger.kernel.org
 help / color / mirror / Atom feed
From: Michael Grzeschik <mgr@pengutronix.de>
To: Avichal Rakesh <arakesh@google.com>
Cc: Alan Stern <stern@rowland.harvard.edu>,
	Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
	Daniel Scally <dan.scally@ideasonboard.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jayant Chowdhary <jchowdhary@google.com>,
	"etalvala@google.com" <etalvala@google.com>,
	Michael Riesch <michael.riesch@wolfvision.net>,
	"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Subject: Re: [PATCH 0/3] usb: gadget: uvc: allocate requests based on frame interval length and buffersize
Date: Wed, 29 May 2024 23:24:18 +0200	[thread overview]
Message-ID: <ZledAuxYrxZlJ0ow@pengutronix.de> (raw)
In-Reply-To: <adabc6f5-1b87-4bbe-9070-984f0acc8e75@google.com>

[-- Attachment #1: Type: text/plain, Size: 13000 bytes --]

On Tue, May 28, 2024 at 05:33:46PM -0700, Avichal Rakesh wrote:
>
>
>On 5/28/24 15:43, Michael Grzeschik wrote:
>> On Tue, May 28, 2024 at 02:27:34PM -0700, Avichal Rakesh wrote:
>>>
>>>
>>> On 5/28/24 13:22, Michael Grzeschik wrote:
>>>> On Tue, May 28, 2024 at 10:30:30AM -0700, Avichal Rakesh wrote:
>>>>>
>>>>>
>>>>> On 5/22/24 10:37, Michael Grzeschik wrote:
>>>>>> On Wed, May 22, 2024 at 05:17:02PM +0000, Thinh Nguyen wrote:
>>>>>>> On Wed, May 22, 2024, Alan Stern wrote:
>>>>>>>> On Wed, May 22, 2024 at 01:41:42AM +0000, Thinh Nguyen wrote:
>>>>>>>> > On Wed, May 22, 2024, Michael Grzeschik wrote:
>>>>>>>> > > On Fri, May 17, 2024 at 01:44:05AM +0000, Thinh Nguyen wrote:
>>>>>>>> > > > For isoc endpoint IN, yes. If the host requests for isoc data IN while
>>>>>>>> > > > no TRB is prepared, then the controller will automatically send 0-length
>>>>>>>> > > > packet respond.
>>>>>>>> > >
>>>>>>>> > > Perfect! This will help a lot and will make active queueing of own
>>>>>>>> > > zero-length requests run unnecessary.
>>>>>>>> >
>>>>>>>> > Yes, if we rely on the current start/stop isoc transfer scheme for UVC,
>>>>>>>> > then this will work.
>>>>>>>>
>>>>>>>> You shouldn't rely on this behavior.  Other device controllers might not
>>>>>>>> behave this way; they might send no packet at all to the host (causing a
>>>>>>>> USB protocol error) instead of sending a zero-length packet.
>>>>>>>
>>>>>>> I agree. The dwc3 driver has this workaround to somewhat work with the
>>>>>>> UVC. This behavior is not something the controller expected, and this
>>>>>>> workaround should not be a common behavior for different function
>>>>>>> driver/protocol. Since this behavior was added a long time ago, it will
>>>>>>> remain the default behavior in dwc3 to avoid regression with UVC (at
>>>>>>> least until the UVC is changed). However, it would be nice for UVC to
>>>>>>> not rely on this.
>>>>>>
>>>>>> With "this" you mean exactly the following commit, right?
>>>>>>
>>>>>> (f5e46aa4 usb: dwc3: gadget: when the started list is empty stop the active xfer)
>>>>>>
>>>>>> When we start questioning this, then lets dig deeper here.
>>>>>>
>>>>>> With the fast datarate of at least usb superspeed shouldn't they not all
>>>>>> completely work asynchronous with their in flight trbs?
>>>>>>
>>>>>> In my understanding this validates that, with at least superspeed we are
>>>>>> unlikely to react fast enough to maintain a steady isoc dataflow, since
>>>>>> the driver above has to react to errors in the processing context.
>>>>>>
>>>>>> This runs the above patch (f5e46aa4) a gadget independent solution
>>>>>> which has nothing to do with uvc in particular IMHO.
>>>>>>
>>>>>> How do other controllers and their drivers work?
>>>>>>
>>>>>>> Side note, when the dwc3 driver reschedules/starts isoc transfer again,
>>>>>>> the first transfer will be scheduled go out at some future interval and
>>>>>>> not the next immediate microframe. For UVC, it probably won't be a
>>>>>>> problem since it doesn't seem to need data going out every interval.
>>>>>>
>>>>>> It should not make a difference. [TM]
>>>>>>
>>>>>
>>>>>
>>>>> Sorry for being absent for a lot of this discussion.
>>>>>
>>>>> I want to take a step back from the details of how we're
>>>>> solving the problem to what problems we're trying to solve.
>>>>>
>>>>> So, question(s) for Michael, because I don't see an explicit
>>>>> answer in this thread (and my sincerest apologies if they've
>>>>> been answered already and I missed it):
>>>>>
>>>>> What exactly is the bug (or bugs) we're trying to solve here?
>>>>>
>>>>> So far, it looks like there are two main problems being
>>>>> discussed:
>>>>>
>>>>> 1. Reducing the bandwidth usage of individual usb_requests
>>>>> 2. Error handling for when transmission over the wire fails.
>>>>>
>>>>> Is that correct, or are there other issues at play here?
>>>>
>>>> That is correct.
>>>>
>>>>> (1) in isolation should be relatively easy to solve: Just
>>>>> smear the encoded frame across some percentage
>>>>> (prefereably < 100%) of the usb_requests in one
>>>>> video frame interval.
>>>>
>>>> Right.
>>>>
>>>>> (2) is more complicated, and your suggestion is to have a
>>>>> sentinel request between two video frames that tells the
>>>>> host if the transmission of "current" video frame was
>>>>> successful or not. (Is that correct?)
>>>>
>>>> Right.
>>>>
>>>>> Assuming my understanding of (2) is correct, how should
>>>>> the host behave if the transmission of the sentinel
>>>>> request fails after the video frame was sent
>>>>> successfully (isoc is best effort so transmission is
>>>>> never guaranteed)?
>>>>
>>>> If we have transmitted all requests of the frame but would only miss the
>>>> sentinel request to the host that includes the EOF, the host could
>>>> rather show it or drop it. The drop would not be necessary since the
>>>> host did see all data of the frame. The user would not see any
>>>> distortion in both cases.
>>>>
>>>> If we have transmitted only partial data of the frame it would be
>>>> preferred if the host would drop the frame that did not finish with an
>>>> proper EOF tag.
>>>>
>>>> AFAIK the linux kernel would still show the frame if the FID of the
>>>> currently handled request would change and would take this as the end of
>>>> frame indication. But I am not totally sure if this is by spec or
>>>> optional.
>>>>
>>>> One option to be totally sure would be to resend the sentinel request to
>>>> be properly transmitted before starting the next frame. This resend
>>>> polling would probably include some extra zero-length requests. But also
>>>> if this resend keeps failing for n times, the driver should doubt there
>>>> is anything sane going on with the USB connection and bail out somehow.
>>>>
>>>> Since we try to tackle case (1) to avoid transmit errors and also avoid
>>>> creating late enqueued requests in the running isoc transfer, the over
>>>> all chance to trigger missed transfers should be minimal.
>>>
>>> Gotcha. It seems like the UVC gadget driver implicitly assumes that EOF
>>> flag will be used although the userspace application can technically
>>> make it optional.
>>
>> That is not all. The additional UVC_STREAM_ERR tag on the sentinel
>> request can be set optional by the host driver. But by spec the
>> userspace application has to drop the frame when the flag was set.
>
>Looking at the UVC specs, the ERR bit doesn't seem to refer to actual
>transmission error, only errors in frame generation (Section 4.3.1.7
>of UVC 1.5 Class Specification). Maybe "data discontinuity" can be
>used but the examples given are bad media, and encoder issues, which
>suggests errors at higher level than the wire.

Oh! That is a new perspective I did not consider.

With the definition of UVC_STREAM_ERR by spec, the uvc_video driver
would in no case set this header bit for the current frame on its own?
Is that correct?

>> With my proposal this flag will be set, whenever we find out that
>> the currently transferred frame was erroneous.
>>
>>> Summarizing some of the discussions above:
>>> 1. UVC gadget driver should _not_ rely on the usb controller to
>>>   enqueue 0-length requests on UVC gadget drivers behalf;
>>> 2. However keeping up the backpressure to the controller means the
>>>   EOF request will be delayed behind all the zero-length requests.
>>
>> Exactly, this is why we have to somehow finetune the timedelay between
>> requests that trigger interrupts. And also monitor the amount of
>> requests currently enqueued in the hw ringbuffer. So that our drivers
>> enqueue dequeue mechanism is virtually adding only the minimum amount
>> of necessary zero-length requests in the hardware. This should be
>> possible.
>>
>> I am currently thinking through the remaining steps the pump worker has
>> to do on each wakeup to maintain the minimum threshold while waiting
>> with submitting requests that contain actual image payload.
>>
>>> Out of curiosity: What is wrong with letting the host rely on
>>> FID alone? Decoding the jpeg payload _should_ fail if any of the
>>> usb_requests containing the payload failed to transmit.
>>
>> This is not totally true. We saw partially rendered jpeg frames on the
>> host stream. How the host behaves with broken data is totally undefined
>> if the typical uvc flags EOF/ERR are not used as specified. Then think
>> about uncompressed formats. So relying on the transferred image format
>> to solve our problems is just as wrong as relying on the gadgets
>> hardware behavior.
>
>Do you know if the partially rendered frames were valid JPEGs, or
>if the host was simply making a best effort at displaying a broken
>JPEG? Perhaps the fix should go to the host instead?

I can fully reproduce this with linux and windows hosts. For linux
machines I saw that the host was taking the FID change as a marker
to see the previous frame as ready and just rendered what got through.
This did not lead to garbage but only to partially displayed frames
with jpeg macroblock alignment.

>Following is my opinion, feel free to disagree (and correct me if
>something is factually incorrect):
>
>The fundamental issue here is that ISOC doesn't guarantee
>delivery of usb_requests or even basic data consistency upon delivery.
>So the gadget driver has no way to know the state of transmitted data.
>The gadget driver is notified of underruns but not of any other issues,
>and ideally we should never have an underrun if the zero-length
>backpressure is working as intended.
>
>So, UVC gadget driver can reduce the number of errors, but it'll never be
>able to guarantee that the data transmitted to the host isn't somehow
>corrupted or missing unless a more reliable mode of transmission
>(bulk, for example) is used.
>
>All of this to say: The host absolutely needs to be able to handle
>all sorts of invalid and broken payloads. How the host handles it
>might be undefined, but the host can never rely on perfect knowledge
>about the transmission state. In cases like these, where the underlying
>transport is unreliable, the burden of enforcing consistency moves up
>a layer, i.e. to the encoded payload in this case. So it is perfectly
>fine for the host to rely on the encoding to determine if the payload
>is corrupt and handle it accordingly.

Right.

>As for uncompressed format, you're correct that subtle corruptions
>may not be caught, but outright missing usb_requests can be easily
>checked by simply looking at the number of bytes in the payload. YUV
>frames are all of the same (predetermined) size for a given resolution.

That was also my thought about five minutes after I did send you the
previous mail. So sure, this is no real issue for the host.

>So my recommendation is the following:
>1. Fix the bandwidth problem by splitting the encoded video frame
>   into more usb_requests (as your patch already does) making sure
>   there are enough free usb_request to encode the video frame in
>   one burst so we don't accidentally inflate the transmission
>   duration of a video frame by sneaking in zero-length requests in
>   the middle.

Ack. This should already solve a lot of issues.

For this I would still suggest to move the usb_ep_queue to be done in
the pump worker again. Its a bit back and forth, but IMHO its worth the
extra mile since only this way we would respect the dwc3 interrupt
threads assumption to run *very* short.

>2. Unless there is an unusually high rate of transmission failures
>   when using the UVC gadget driver, it might be worth fixing the
>   host side driver to handle broken frames better instead (assuming
>   host is linux as well).

Agreed, but this needs a separate scoped undestanding of the host side
behaviour over all layers.

>2. Tighten up the error checking in UVC gadget driver -- We drop the
>   current frame whenever an EXDEV happens which is wrong. We should
>   only be dropping the current frame if the EXDEV corresponds to the
>   frame currently being encoded.

What do you mean by drop?

I would suggest to immediatly switch the uvc_buffer that is being
enqueued and start queueing prepared requests from the next buffers prep
list. As suggested, the idea is to have per uvc_buffer prep_list
requests which would make this task easy.

>   If the frame is already fully queued to the usb controller, the host
>   can handle missing payload as it sees fit.

Ack

This roadmap sounds like a good one.

Michael

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-05-29 21:24 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-09 21:24 [PATCH 0/3] usb: gadget: uvc: allocate requests based on frame interval length and buffersize Michael Grzeschik
2024-04-09 21:24 ` [PATCH 1/3] usb: gadget: function: uvc: set req_size once when the vb2 queue is calculated Michael Grzeschik
2024-04-09 21:24 ` [PATCH 2/3] usb: gadget: uvc: add g_parm and s_parm for frame interval Michael Grzeschik
2024-04-09 21:24 ` [PATCH 3/3] usb: gadget: uvc: set req_size and n_requests based on the " Michael Grzeschik
2024-04-21 23:25 ` [PATCH 0/3] usb: gadget: uvc: allocate requests based on frame interval length and buffersize Michael Grzeschik
2024-04-23  0:21   ` Avichal Rakesh
2024-04-23 14:25     ` Michael Grzeschik
2024-04-23 23:23       ` Avichal Rakesh
2024-04-24  2:28         ` Thinh Nguyen
2024-05-12 22:10           ` Michael Grzeschik
2024-05-17  1:44             ` Thinh Nguyen
2024-05-22  0:08               ` Michael Grzeschik
2024-05-22  1:41                 ` Thinh Nguyen
2024-05-22 14:50                   ` Alan Stern
2024-05-22 17:17                     ` Thinh Nguyen
2024-05-22 17:37                       ` Michael Grzeschik
2024-05-22 18:23                         ` Thinh Nguyen
2024-05-22 18:58                         ` Alan Stern
2024-05-28 17:30                         ` Avichal Rakesh
2024-05-28 20:22                           ` Michael Grzeschik
2024-05-28 21:27                             ` Avichal Rakesh
2024-05-28 22:43                               ` Michael Grzeschik
2024-05-29  0:33                                 ` Avichal Rakesh
2024-05-29 21:24                                   ` Michael Grzeschik [this message]
2024-06-04 22:32                                     ` Avichal Rakesh
2024-06-10 21:44                                       ` Michael Grzeschik
2024-05-22 11:16         ` Michael Grzeschik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZledAuxYrxZlJ0ow@pengutronix.de \
    --to=mgr@pengutronix.de \
    --cc=Thinh.Nguyen@synopsys.com \
    --cc=arakesh@google.com \
    --cc=dan.scally@ideasonboard.com \
    --cc=etalvala@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jchowdhary@google.com \
    --cc=laurent.pinchart@ideasonboard.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=michael.riesch@wolfvision.net \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox