All of lore.kernel.org
 help / color / mirror / Atom feed
From: Felipe Balbi <balbi@kernel.org>
To: "Noralf Trønnes" <noralf@tronnes.org>, dri-devel@lists.freedesktop.org
Cc: linux-usb@vger.kernel.org, sam@ravnborg.org
Subject: Re: [PATCH v3 6/6] usb: gadget: function: Add Generic USB Display support
Date: Wed, 03 Jun 2020 10:10:47 +0300	[thread overview]
Message-ID: <87h7vs1v60.fsf@kernel.org> (raw)
In-Reply-To: <0d0ec3f5-4b9a-e128-5d50-9e7096b3f984@tronnes.org>

[-- Attachment #1: Type: text/plain, Size: 5359 bytes --]


Hi,

Noralf Trønnes <noralf@tronnes.org> writes:
>> I missed this completely until now.
>> Noralf Trønnes <noralf@tronnes.org> writes:
>>> This adds the gadget side support for the Generic USB Display. It presents
>>> a DRM display device as a USB Display configured through configfs.
>>>
>>> The display is implemented as a vendor type USB interface with one bulk
>>> out endpoint. The protocol is implemented using control requests.
>>> lz4 compressed framebuffer data/pixels are sent over the bulk endpoint.
>>>
>>> The DRM part of the gadget is placed in the DRM subsystem since it reaches
>>> into the DRM internals.
>> 
>> First and foremost, could this be done in userspace using the raw gadget
>> or f_fs?
>> 
>
> An uncompressed 1920x1080-RGB565 frame is ~4MB. All frames can be
> compressed (lz4) even if just a little, so this is decompressed into the
> framebuffer of the attached DRM device. AFAIU I would need to be able to
> mmap the received bulk buffer if I were to do this from userspace
> without killing performance. So it doesn't look like I can use raw
> gadget or f_fs.

oh, yeah we couldn't map that much. I was thinking that maybe we could
transfer several small buffers instead of a single large one, but
perhaps that would complicate decompression?

>>> diff --git a/drivers/usb/gadget/function/f_gud_drm.c b/drivers/usb/gadget/function/f_gud_drm.c
>>> new file mode 100644
>>> index 000000000000..9a2d6bb9739f
>>> --- /dev/null
>>> +++ b/drivers/usb/gadget/function/f_gud_drm.c
>>> @@ -0,0 +1,678 @@
>>> +struct f_gud_drm {
>>> +	struct usb_function func;
>>> +	struct work_struct worker;
>> 
>> why do you need a worker?
>> 
>
> The gadget runs in interrupt context and I need to call into the DRM
> subsystem which can sleep.

At some point someone wanted to provide a patch to run endpoint giveback
routine in process context, much like usb host stack does if
requested. That's currently not implemented, but should be doable by
modifying usb_gadget_giveback_request().

>>> +	size_t max_buffer_size;
>>> +	void *ctrl_req_buf;
>>> +
>>> +	u8 interface_id;
>>> +	struct usb_request *ctrl_req;
>>> +
>>> +	struct usb_ep *bulk_ep;
>>> +	struct usb_request *bulk_req;
>> 
>> single request? Don't you want to amortize the latency of
>> queue->complete by using a series of requests?
>> 
>
> I use only one request per update or partial update.
> I kmalloc the biggest buffer I can get (default 4MB) and tell the host
> about this size. If a frame doesn't fit, the host splits it up into
> partial updates. I already support partial updates so this is built in.
> Userspace can tell the graphics driver which portion of the framebuffer
> it has touched to avoid sending the full frame each time.
> Having one continous buffer simplifies decompression.

got it

> There's a control request preceding the bulk request which tells the
> area the update is for and whether it is compressed or not.
> I did some testing to see if I should avoid the control request overhead
> for split updates, but it turns out that the bottleneck is the
> decompression. The control request is just 400-500us, while the total
> time from control request to buffer is decompressed is 50-100ms
> (depending on how well the frame compresses).

yeah, that makes sense.

>>> +	struct gud_drm_gadget *gdg;
>>> +
>>> +	spinlock_t lock; /* Protects the following members: */
>>> +	bool ctrl_pending;
>>> +	bool status_pending;
>>> +	bool bulk_pending;
>>> +	bool disable_pending;
>> 
>> could this be a single u32 with #define'd flags? That would be atomic,
>> right?
>> 
>
> I have never grasped all the concurrency issues, but wouldn't I need
> memory barriers as well?

As far as I understand, {test_and_,}{set,clear,change}_bit() handle all
the required steps to guarantee proper atomic behavior. I haven't looked
at the implementation myself, though.

>>> +	u8 errno;
>> 
>> a global per-function error? Why?
>> 
>
> This is the result of the previously request operation. The host will
> request this value to see how it went. I might switch to using a bulk
> endpoint for status following a discussion with Alan Stern in the host
> driver thread in this patch series. If so I might not need this.

got it.

>>> +	u16 request;
>>> +	u16 value;
>> 
>> also why? Looks odd
>> 
>
> These values contains the operation (in addition to the payload) that
> the worker shall perform following the control-OUT requests.
>
> control-IN requests can run in interrupt context so in that case the
> payload is queued up immediately.

cool

>>> +static void f_gud_drm_bulk_complete(struct usb_ep *ep, struct usb_request *req)
>>> +{
>>> +	struct f_gud_drm *fgd = req->context;
>>> +	unsigned long flags;
>>> +
>>> +	if (req->status || req->actual != req->length)
>>> +		return;
>> 
>> so, if we complete with an erroneous status or a short packet, you'll
>> ignore it?
>> 
>
> Hmm yeah. When I wrote this I thought that the bottleneck was the USB
> transfers, so I didn't want the host to slow down performance by
> requesting this status. Now I know it's the decompression that takes
> time, so I could actually do a status check and retry the frame if the
> device detects an error.

sounds good :-)

-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

WARNING: multiple messages have this Message-ID (diff)
From: Felipe Balbi <balbi@kernel.org>
To: "Noralf Trønnes" <noralf@tronnes.org>, dri-devel@lists.freedesktop.org
Cc: linux-usb@vger.kernel.org, sam@ravnborg.org
Subject: Re: [PATCH v3 6/6] usb: gadget: function: Add Generic USB Display support
Date: Wed, 03 Jun 2020 10:10:47 +0300	[thread overview]
Message-ID: <87h7vs1v60.fsf@kernel.org> (raw)
In-Reply-To: <0d0ec3f5-4b9a-e128-5d50-9e7096b3f984@tronnes.org>


[-- Attachment #1.1: Type: text/plain, Size: 5359 bytes --]


Hi,

Noralf Trønnes <noralf@tronnes.org> writes:
>> I missed this completely until now.
>> Noralf Trønnes <noralf@tronnes.org> writes:
>>> This adds the gadget side support for the Generic USB Display. It presents
>>> a DRM display device as a USB Display configured through configfs.
>>>
>>> The display is implemented as a vendor type USB interface with one bulk
>>> out endpoint. The protocol is implemented using control requests.
>>> lz4 compressed framebuffer data/pixels are sent over the bulk endpoint.
>>>
>>> The DRM part of the gadget is placed in the DRM subsystem since it reaches
>>> into the DRM internals.
>> 
>> First and foremost, could this be done in userspace using the raw gadget
>> or f_fs?
>> 
>
> An uncompressed 1920x1080-RGB565 frame is ~4MB. All frames can be
> compressed (lz4) even if just a little, so this is decompressed into the
> framebuffer of the attached DRM device. AFAIU I would need to be able to
> mmap the received bulk buffer if I were to do this from userspace
> without killing performance. So it doesn't look like I can use raw
> gadget or f_fs.

oh, yeah we couldn't map that much. I was thinking that maybe we could
transfer several small buffers instead of a single large one, but
perhaps that would complicate decompression?

>>> diff --git a/drivers/usb/gadget/function/f_gud_drm.c b/drivers/usb/gadget/function/f_gud_drm.c
>>> new file mode 100644
>>> index 000000000000..9a2d6bb9739f
>>> --- /dev/null
>>> +++ b/drivers/usb/gadget/function/f_gud_drm.c
>>> @@ -0,0 +1,678 @@
>>> +struct f_gud_drm {
>>> +	struct usb_function func;
>>> +	struct work_struct worker;
>> 
>> why do you need a worker?
>> 
>
> The gadget runs in interrupt context and I need to call into the DRM
> subsystem which can sleep.

At some point someone wanted to provide a patch to run endpoint giveback
routine in process context, much like usb host stack does if
requested. That's currently not implemented, but should be doable by
modifying usb_gadget_giveback_request().

>>> +	size_t max_buffer_size;
>>> +	void *ctrl_req_buf;
>>> +
>>> +	u8 interface_id;
>>> +	struct usb_request *ctrl_req;
>>> +
>>> +	struct usb_ep *bulk_ep;
>>> +	struct usb_request *bulk_req;
>> 
>> single request? Don't you want to amortize the latency of
>> queue->complete by using a series of requests?
>> 
>
> I use only one request per update or partial update.
> I kmalloc the biggest buffer I can get (default 4MB) and tell the host
> about this size. If a frame doesn't fit, the host splits it up into
> partial updates. I already support partial updates so this is built in.
> Userspace can tell the graphics driver which portion of the framebuffer
> it has touched to avoid sending the full frame each time.
> Having one continous buffer simplifies decompression.

got it

> There's a control request preceding the bulk request which tells the
> area the update is for and whether it is compressed or not.
> I did some testing to see if I should avoid the control request overhead
> for split updates, but it turns out that the bottleneck is the
> decompression. The control request is just 400-500us, while the total
> time from control request to buffer is decompressed is 50-100ms
> (depending on how well the frame compresses).

yeah, that makes sense.

>>> +	struct gud_drm_gadget *gdg;
>>> +
>>> +	spinlock_t lock; /* Protects the following members: */
>>> +	bool ctrl_pending;
>>> +	bool status_pending;
>>> +	bool bulk_pending;
>>> +	bool disable_pending;
>> 
>> could this be a single u32 with #define'd flags? That would be atomic,
>> right?
>> 
>
> I have never grasped all the concurrency issues, but wouldn't I need
> memory barriers as well?

As far as I understand, {test_and_,}{set,clear,change}_bit() handle all
the required steps to guarantee proper atomic behavior. I haven't looked
at the implementation myself, though.

>>> +	u8 errno;
>> 
>> a global per-function error? Why?
>> 
>
> This is the result of the previously request operation. The host will
> request this value to see how it went. I might switch to using a bulk
> endpoint for status following a discussion with Alan Stern in the host
> driver thread in this patch series. If so I might not need this.

got it.

>>> +	u16 request;
>>> +	u16 value;
>> 
>> also why? Looks odd
>> 
>
> These values contains the operation (in addition to the payload) that
> the worker shall perform following the control-OUT requests.
>
> control-IN requests can run in interrupt context so in that case the
> payload is queued up immediately.

cool

>>> +static void f_gud_drm_bulk_complete(struct usb_ep *ep, struct usb_request *req)
>>> +{
>>> +	struct f_gud_drm *fgd = req->context;
>>> +	unsigned long flags;
>>> +
>>> +	if (req->status || req->actual != req->length)
>>> +		return;
>> 
>> so, if we complete with an erroneous status or a short packet, you'll
>> ignore it?
>> 
>
> Hmm yeah. When I wrote this I thought that the bottleneck was the USB
> transfers, so I didn't want the host to slow down performance by
> requesting this status. Now I know it's the decompression that takes
> time, so I could actually do a status check and retry the frame if the
> device detects an error.

sounds good :-)

-- 
balbi

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2020-06-03  7:10 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-29 17:56 [PATCH v3 0/6] Generic USB Display driver Noralf Trønnes
2020-05-29 17:56 ` Noralf Trønnes
2020-05-29 17:56 ` [PATCH v3 1/6] drm/client: Add drm_client_init_from_id() Noralf Trønnes
2020-05-29 17:56   ` Noralf Trønnes
2020-05-29 17:56 ` [PATCH v3 2/6] drm/client: Add drm_client_modeset_disable() Noralf Trønnes
2020-05-29 17:56   ` Noralf Trønnes
2020-05-29 17:56 ` [PATCH v3 3/6] drm/client: Add a way to set modeset, properties and rotation Noralf Trønnes
2020-05-29 17:56   ` Noralf Trønnes
2020-05-29 17:56 ` [PATCH v3 4/6] drm: Add Generic USB Display driver Noralf Trønnes
2020-05-29 17:56   ` Noralf Trønnes
2020-05-29 22:45   ` Peter Stuge
2020-05-29 22:45     ` Peter Stuge
2020-06-01 16:57     ` Noralf Trønnes
2020-06-01 16:57       ` Noralf Trønnes
2020-06-02  0:12       ` Peter Stuge
2020-06-02  0:12         ` Peter Stuge
2020-06-02  2:32         ` Alan Stern
2020-06-02  2:32           ` Alan Stern
2020-06-02  5:21           ` Peter Stuge
2020-06-02  5:21             ` Peter Stuge
2020-06-02 15:27             ` Alan Stern
2020-06-02 15:27               ` Alan Stern
2020-06-02 18:38               ` Peter Stuge
2020-06-02 18:38                 ` Peter Stuge
2020-06-05 12:03                 ` Noralf Trønnes
2020-06-05 12:03                   ` Noralf Trønnes
2020-06-02 11:46           ` Noralf Trønnes
2020-06-02 11:46             ` Noralf Trønnes
2020-07-14 15:30             ` Noralf Trønnes
2020-07-14 15:30               ` Noralf Trønnes
2020-06-03 19:17         ` Noralf Trønnes
2020-06-03 19:17           ` Noralf Trønnes
2020-05-29 17:56 ` [PATCH v3 5/6] drm/gud: Add functionality for the USB gadget side Noralf Trønnes
2020-05-29 17:56   ` Noralf Trønnes
2020-05-29 17:56 ` [PATCH v3 6/6] usb: gadget: function: Add Generic USB Display support Noralf Trønnes
2020-05-29 17:56   ` Noralf Trønnes
2020-06-02 17:05   ` Felipe Balbi
2020-06-02 17:05     ` Felipe Balbi
2020-06-02 19:14     ` Noralf Trønnes
2020-06-02 19:14       ` Noralf Trønnes
2020-06-03  7:10       ` Felipe Balbi [this message]
2020-06-03  7:10         ` Felipe Balbi
2020-07-09 16:32 ` [PATCH v3 0/6] Generic USB Display driver Lubomir Rintel
2020-07-09 16:32   ` Lubomir Rintel
2020-07-14 13:33   ` Noralf Trønnes
2020-07-14 13:33     ` Noralf Trønnes
2020-07-14 17:40     ` Peter Stuge
2020-07-14 17:40       ` Peter Stuge
2020-07-14 19:03       ` Noralf Trønnes
2020-07-14 19:03         ` Noralf Trønnes
2020-07-14 19:38         ` Peter Stuge
2020-07-14 19:38           ` Peter Stuge
2020-07-16 17:43           ` Noralf Trønnes
2020-07-16 17:43             ` Noralf Trønnes
2020-07-15  7:30         ` Lubomir Rintel
2020-07-15  7:30           ` Lubomir Rintel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h7vs1v60.fsf@kernel.org \
    --to=balbi@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=noralf@tronnes.org \
    --cc=sam@ravnborg.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.