* [RFC] Memory allocation requirements, videobuf integration, pluggable allocators
@ 2010-05-25 12:33 Pawel Osciak
2010-05-26 5:05 ` Hiremath, Vaibhav
2010-05-27 18:44 ` Andy Walls
0 siblings, 2 replies; 4+ messages in thread
From: Pawel Osciak @ 2010-05-25 12:33 UTC (permalink / raw)
To: linux-media; +Cc: kyungmin.park, 'Marek Szyprowski'
Hello,
this RFC concerns video buffer allocation in videobuf, as well as in V4L in
general.
Its main purpose is to discuss issues, gather comments and specific
requirements, proposals and ideas for allocation mechanisms from
interested parties.
Background
======================
V4L drivers use memory buffers for storing video/media data, such as video
frames. There are many different ways to acquire such memory and devices may
have special requirements for it. Further handling of it also differs between
drivers.
Typical ways of acquiring memory for media devices include:
- allocating a number of non-contiguous pages (e.g. alloc_page)
- acquiring a number of physically contiguous pages:
* bootmem allocation
* other custom solutions?
- allocating virtually contiguous memory (vmalloc)
- device-specific/private/on-board memory
- others?
The above examples are quite standard, but just to give you an idea of more
exotic cases:
- allocation of memory from specific memory banks
- allocation of buffers in a particular arrangement
- allocation with specific CPU flags, etc.
If the above sounds too unrealistic/too abstract to you: these are the actual
requirements for our (Samsung) devices.
Furthermore, there might be some additional considerations:
- VM_PFNMAP memory - may need additional refcounting
- how to handle problems with remapping memory with different flags
- others?
Of course, freeing can also be handled in a plethora of ways.
Moreover, related to the above are specific operations that may have to be
performed, such as syncing caches, page pinning, etc.
Motivation
======================
Videobuf framework memory-type code (videobuf-vmalloc, videobuf-dma-sg,
videobuf-dma-contig) has been created to help developers in some of the
above-mentioned cases. Unfortunately, I see the following main, inherent
problems with it:
- memory allocation is performed in videobuf code in a fixed way. There is
no way for drivers to override this; e.g., dma_alloc_coherent is used for
dma-contig memory,
- it is performed during mmap (dma-contig, vmalloc) or even on VM fault
sometimes (dma-sg); this does not conform to the V4L2 API, which states
that allocation should be done on REQBUFS call,
- freeing is not centralized (it is also performed on STREAMOFF, which is
really bad, but this is a topic for a separate RFC)
This prevents driver developers from using videobuf. Sometimes they just use
parts of it, add custom/incompatible modifications, or are, as a matter of fact,
"forced" to duplicate parts of its code. Some drivers are dependent on
boot-time allocation mechanisms. Examples include (apologies to the authors
if I am mistaken here):
- Intel Moorestown
- OMAP
- multimedia devices in Samsung SoCs: drivers are not yet posted for the sole
reason that we need a bootmem-based allocator mechanism, which is hard to get
accepted. Custom allocators in kernel for every platform/device are not
received well, for good reasons.
- others?
From various discussions I believe that there are more parties interested in
having custom memory allocation mechanisms. Moreover, the current situation in
videobuf calls for fixing (e.g. allocation should be performed on REQBUFS).
A request for requirements, ideas and comments
================================================
We would like to change this situation. Before proposing anything, we would like
to first gather:
- device-specific requirements and, possibly, peculiarities,
- more general ideas and requirements for a generic allocator framework
for media devices,
- a list of devices that would benefit from this,
- a list of drivers that do not use videobuf because of problems with adapting
to its memory allocation scheme.
They do not have to be videobuf-specific, although we would like to integrate
the resulting solution with either the current videobuf, or its future rework.
To clarify, I am mainly trying to gather requirements for videobuf or similar
frameworks to introduce a generic interface for plugging-in custom memory
allocation mechanisms, not really trying to implement a solving-world-hunger
memory allocator.
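To make the idea of a "generic interface for plugging-in custom memory
allocation mechanisms" a bit more concrete, here is a minimal userspace
sketch of what such an interface could look like. It is only an illustration:
every name in it (buf_allocator_ops, buf_allocator, heap_ops,
framework_reqbufs_alloc, framework_release) is hypothetical and does not exist
in videobuf, and a plain malloc() backend stands in for a real bank- or
bootmem-based allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator interface; none of these names exist in videobuf. */
struct buf_allocator_ops {
	void *(*alloc)(void *priv, size_t size);
	void (*free)(void *priv, void *buf);
};

struct buf_allocator {
	const struct buf_allocator_ops *ops;
	void *priv;	/* allocator-specific state, e.g. a memory bank */
};

/* Example backend: plain heap memory; priv points to a live-buffer counter. */
static void *heap_alloc(void *priv, size_t size)
{
	void *buf = malloc(size);

	if (buf)
		++*(unsigned int *)priv;
	return buf;
}

static void heap_free(void *priv, void *buf)
{
	if (!buf)
		return;
	free(buf);
	--*(unsigned int *)priv;
}

static const struct buf_allocator_ops heap_ops = {
	.alloc = heap_alloc,
	.free = heap_free,
};

/* Framework side: allocate through whatever backend the driver plugged in. */
static void *framework_reqbufs_alloc(struct buf_allocator *a, size_t size)
{
	return a->ops->alloc(a->priv, size);
}

static void framework_release(struct buf_allocator *a, void *buf)
{
	a->ops->free(a->priv, buf);
}
```

A driver would fill in a struct buf_allocator with its own ops and private
state, and the framework would only ever call through the ops table. Whether
the real interface should look anything like this is exactly what this RFC is
meant to find out.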
There is also the topic of a video buffer pool, discussed last year. How it
relates to this topic and its integration (in any form) with videobuf could
also be of interest here.
I will be grateful for any comments, thoughts or ideas. Thank you!
=========================
Below is a (hopefully complete) list of required features for Samsung multimedia
devices, related to memory allocation:
- physically contiguous memory buffers of different sizes (up to several
megabytes)
- memory allocation from particular memory banks (ranges of physical addresses)
- partitioning areas of memory into custom zones and an ability to allocate from
a chosen zone
- an ability to share memory buffers across different devices in a pipeline
- automatic video buffer allocation from videobuf is the main use case, but
direct access to the memory allocator from drivers (for temporary buffers,
firmware etc.) is also required
- ability to pre-configure allocator behavior by drivers
Some nice-to-have features:
- pluggable memory allocation strategies
- cacheable/non-cacheable buffers
- CPU cache synchronization for non-coherent areas
- support for VM_PFNMAP memory, such as framebuffer memory, etc. (alternative
methods of reference counting required)
- shared memory (shmem) support, zero-copy X server interoperability
- support for contiguous memory allocated by userspace, including contiguity
checks and (maybe) bounce buffers
- usage/fragmentation statistics
Best regards
--
Pawel Osciak
Linux Platform Group
Samsung Poland R&D Center
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: [RFC] Memory allocation requirements, videobuf integration, pluggable allocators
2010-05-25 12:33 [RFC] Memory allocation requirements, videobuf integration, pluggable allocators Pawel Osciak
@ 2010-05-26 5:05 ` Hiremath, Vaibhav
2010-05-27 8:37 ` Zhang, Xiaolin
2010-05-27 18:44 ` Andy Walls
1 sibling, 1 reply; 4+ messages in thread
From: Hiremath, Vaibhav @ 2010-05-26 5:05 UTC (permalink / raw)
To: Pawel Osciak, linux-media@vger.kernel.org
Cc: kyungmin.park@samsung.com, 'Marek Szyprowski'
> -----Original Message-----
> From: linux-media-owner@vger.kernel.org [mailto:linux-media-
> owner@vger.kernel.org] On Behalf Of Pawel Osciak
> Sent: Tuesday, May 25, 2010 6:04 PM
> To: linux-media@vger.kernel.org
> Cc: kyungmin.park@samsung.com; 'Marek Szyprowski'
> Subject: [RFC] Memory allocation requirements, videobuf integration,
> pluggable allocators
>
> Hello,
>
> this RFC concerns video buffer allocation in videobuf, as well as in V4L in
> general.
>
> Its main purpose is to discuss issues, gather comments and specific
> requirements, proposals and ideas for allocation mechanisms from
> interested parties.
>
[Hiremath, Vaibhav] Thanks Pawel for summarizing VideoBuf need into RFC.
> Background
> ======================
> V4L drivers use memory buffers for storing video/media data, such as video
> frames. There are many different ways to acquire such memory and devices may
> have special requirements for it. Further handling of it also differs
> between
> drivers.
>
> Typical ways of acquiring memory for media devices include:
> - allocating a number of non-contiguous pages (e.g. alloc_page)
> - acquiring a number of physically contiguous pages:
> * bootmem allocation
> * other custom solutions?
> - allocating virtually contiguous memory (vmalloc)
> - device-specific/private/on-board memory
> - others?
>
> The above examples are quite standard, but just to give you an idea of more
> exotic cases:
> - allocation of memory from specific memory banks
> - allocation of buffers in a particular arrangement
> - allocation with specific CPU flags, etc.
>
> If the above sounds too unrealistic/too abstract to you: these are the
> actual
> requirements for our (Samsung) devices.
>
> Furthermore, there might be some additional considerations:
> - VM_PFNMAP memory - may need additional refcounting
> - how to handle problems with remapping memory with different flags
> - others?
>
[Hiremath, Vaibhav] We do have similar requirement for OMAP devices.
> Of course, freeing can also be handled in a plethora of ways.
>
> Moreover, related to the above are specific operations that may have to be
> performed, such as syncing caches, page pinning, etc.
>
>
> Motivation
> ======================
> Videobuf framework memory-type code (videobuf-vmalloc, videobuf-dma-sg,
> videobuf-dma-contig) has been created to help developers in some of the
> above-mentioned case. Unfortunately, I see the following main, inherent
> problems with it:
>
> - memory allocation is performed in videobuf code in a fixed way. There is
> no way for drivers to override this; e.g., dma_alloc_coherent is used for
> dma-contig memory,
>
> - it is performed during mmap (dma-contig, vmalloc) or even on VM fault
> sometimes (dma-sg); this does not conform to the V4L2 API, which states
> that allocation should be done on REQBUFS call,
>
> - freeing is not centralized (it is also performed on STREAMOFF, which is
> really bad, but this is a topic for a separate RFC)
>
> This prevents driver developers from using videobuf, sometimes they just use
> parts of it, add custom/incompatible modifications, or are, as a matter of
> fact,
> "forced" to duplicate parts of its code. Some drivers are dependent on
> boot-time allocation mechanisms. Examples include (apologies to the authors
> if I am mistaken here):
>
> - Intel Moorestown
> - OMAP
[Hiremath, Vaibhav] Let me explain the V4L2 display allocation scheme here.
We have defined module parameters to specify the buffer size and the number of buffers, which the user can configure through boot arguments or at module insertion time.
The driver always keeps the specified number of buffers until it is removed from the system. That means we do not free the memory (allocated at boot time) in either the streamoff or the close API.
Since the driver manages buffer allocation, we do not use the videobuf mmap_mapper function here; we have our own mmap_mapper function.
> - multimedia devices in Samsung SoCs- drivers are not yet posted for the
> sole
> reason that we need a bootmem-based allocator mechanism, which is hard to
> have
> accepted. Custom allocators in kernel for every platform/device are not
> received well, for good reasons.
> - others?
>
> From various discussions I believe that there are more parties interested in
> having custom memory allocation mechanisms. Moreover, the current situation
> in
> videobuf calls for fixing (e.g. allocation should be performed on REQBUFS).
>
>
>
> A request for requirements, ideas and comments
> ================================================
> We would like to change this situation. Before proposing anything, we would
> like
> to first gather:
>
> - device-specific requirements and, possibly, peculiarities,
> - more general ideas and requirements for a generic allocator framework
> for media devices,
> - a list of devices that would benefit from this,
[Hiremath, Vaibhav] You can safely consider/add the following TI devices:
OMAP2/3/4 (OMAP3/OMAP4) series of devices
Davinci series of devices
AM/DM37x series of devices
Some future devices in this segment...
> - a list of drivers that do not use videobuf because of problems with
> adapting
> to its memory allocation scheme.
[Hiremath, Vaibhav] As I mentioned, we do use videobuf partially.
>
> They do not have to be videobuf-specific, although we would like to
> integrate
> the resulting solution with either the current videobuf, or its future
> rework.
>
> To clarify, I am mainly trying to gather requirements for videobuf or
> similar
> frameworks to introduce a generic interface for plugging-in custom memory
> allocation mechanisms, not really trying to implement a solving-world-hunger
> memory allocator.
>
> There is also the topic of a video buffer pool, discussed last year. How it
> relates to this topic and its integration (in any form) with videobuf could
> also be of interest here.
>
>
> I will be grateful for any comments, thoughts or ideas. Thank you!
>
>
> =========================
>
> Below is a (hopefully complete) list of required features for Samsung
> multimedia
> devices, related to memory allocation:
>
> - physically contiguous memory buffers of different sizes (up to several
> megabytes)
> - memory allocation from particular memory banks (ranges of physical
> addresses)
> - partitioning areas of memory into custom zones and an ability to allocate
> from
> a chosen zone
> - an ability to share memory buffers across different devices in a pipeline
> - automatic video buffer allocation from videobuf is the main use case, but
> direct access to the memory allocator from drivers (for temporary buffers,
> firmware etc.) is also required
> - ability to pre-configure allocator behavior by drivers
>
[Hiremath, Vaibhav] If you already have something in mind for this, I would suggest proposing it here in this forum for discussion.
Below are some suggestions and opinions of mine:
-------------------------------
1) Irrespective of this RFC, I think we should change the videobuf behavior so that buffer allocation happens in VIDIOC_REQBUFS and not in mmap.
2) It would be really nice if videobuf checked whether a buffer has already been allocated by the driver before allocating it, giving the driver the flexibility to handle buffer allocation on its own.
3) I am thinking of a thin layer that takes a boot-time argument specifying the total buffer size for the videobuf layer, which is taken away from the Linux kernel. Allocation would happen in VIDIOC_REQBUFS and buffers would be released in either the streamoff or the close API.
videobuf_size=40M
The videobuf layer would take 40M away from the Linux kernel at boot time, and would then manage and use this pool for memory allocation.
Note: There are other issues that still need to be discussed, as mentioned by Pawel in the RFC, such as caching and requests from multiple drivers with different sizes, which define the complexity of this thin layer.
-------------------------------
At least from the OMAP2/3 display driver point of view, points 1 and 2 are important and would help the driver use the standard videobuf APIs completely.
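To illustrate point 3, here is a rough userspace sketch of such a thin layer,
under a number of assumptions. It parses a size argument like the
videobuf_size=40M value above and hands out page-aligned offsets from the
reserved region. The names parse_size, vb_pool, vb_pool_alloc and
vb_pool_release_all are made up for this example, the 4096-byte alignment is
an assumed page size, and in-kernel code would presumably use an existing
helper such as memparse() rather than a hand-written parser. The pool is a
simple bump allocator that can only be released as a whole; a real layer would
need per-buffer freeing and some answer to fragmentation.

```c
#include <stddef.h>
#include <stdlib.h>

#define POOL_ALIGN 4096ULL	/* assumed page size */

/* Parse a size such as "40M"; returns 0 on malformed input. */
static unsigned long long parse_size(const char *s)
{
	char *end;
	unsigned long long v = strtoull(s, &end, 10);

	if (end == s)
		return 0;
	switch (*end) {
	case 'G':
		v <<= 30;
		end++;
		break;
	case 'M':
		v <<= 20;
		end++;
		break;
	case 'K':
		v <<= 10;
		end++;
		break;
	}
	return *end == '\0' ? v : 0;
}

/* Hypothetical pool carved out of memory reserved at boot time. */
struct vb_pool {
	unsigned long long size;	/* bytes reserved at boot */
	unsigned long long used;	/* bytes handed out so far */
};

/* Returns the offset of the new buffer within the pool, or -1 if full. */
static long long vb_pool_alloc(struct vb_pool *p, unsigned long long size)
{
	unsigned long long aligned = (size + POOL_ALIGN - 1) & ~(POOL_ALIGN - 1);
	unsigned long long off = p->used;

	/* First test catches wrap-around of the alignment round-up. */
	if (aligned < size || aligned > p->size - p->used)
		return -1;
	p->used += aligned;
	return (long long)off;
}

/* Release everything at once, e.g. on streamoff or close. */
static void vb_pool_release_all(struct vb_pool *p)
{
	p->used = 0;
}
```

With a 40M pool, two 720x576 buffers at 2 bytes per pixel (829440 bytes each)
would land at offsets 0 and 831488, since each allocation is rounded up to the
next page boundary.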
Thanks,
Vaibhav
> Some nice-to-have features:
>
> - pluggable memory allocation strategies
> - cacheable/non-cacheable buffers
> - CPU cache synchronization for non-coherent areas
> - support for VM_PFNMAP memory, such as framebuffer memory, etc.
> (alternative
> methods of reference counting required)
> - shared memory (shmem) support, zero-copy X server interoperability
> - support for contiguous memory allocated by userspace, including contiguity
> checks and (maybe) bounce buffers
> - usage/fragmentation statistics
>
> Best regards
> --
> Pawel Osciak
> Linux Platform Group
> Samsung Poland R&D Center
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-media" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: [RFC] Memory allocation requirements, videobuf integration, pluggable allocators
2010-05-26 5:05 ` Hiremath, Vaibhav
@ 2010-05-27 8:37 ` Zhang, Xiaolin
0 siblings, 0 replies; 4+ messages in thread
From: Zhang, Xiaolin @ 2010-05-27 8:37 UTC (permalink / raw)
To: Hiremath, Vaibhav, Pawel Osciak, linux-media@vger.kernel.org
Cc: kyungmin.park@samsung.com, 'Marek Szyprowski'
Looking forward to the proposal. Indeed, Intel Moorestown suffers from the videobuf framework limitations.
Xiaolin
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC] Memory allocation requirements, videobuf integration, pluggable allocators
2010-05-25 12:33 [RFC] Memory allocation requirements, videobuf integration, pluggable allocators Pawel Osciak
2010-05-26 5:05 ` Hiremath, Vaibhav
@ 2010-05-27 18:44 ` Andy Walls
1 sibling, 0 replies; 4+ messages in thread
From: Andy Walls @ 2010-05-27 18:44 UTC (permalink / raw)
To: Pawel Osciak; +Cc: linux-media, kyungmin.park, 'Marek Szyprowski'
On Tue, 2010-05-25 at 14:33 +0200, Pawel Osciak wrote:
> I will be grateful for any comments, thoughts or ideas. Thank you!
I'll talk a little bit about the cx18 driver for the CX23418. The ivtv
driver for the CX23415/6 is similar in many respects, but slightly more
complicated: the chip's DMA engine gets set up differently and the
CX23415 has video and vbi output.
1. Both of these drivers support the read()/write() method for video
capture and output and use their own buffer allocation and handling
mechanism. Neither streaming I/O method (memory mapping or user
pointers) is currently supported.
2. A CX2341x chip can DMA buffers for several concurrent capture
streams:
MPEG,
VBI (raw samples from the VBI lines),
YUV (uncompressed video frames)
PCM (uncompressed audio)
IDX (MPEG stream index records: offsets of I, P, and B frames)
The CX23415/6 chips can be told to send VBI and YUV stream buffers once
per frame. For the CX23418 to send YUV or VBI data once per n frames,
the chip must be told the buffers are the *exact* size required to hold
the expected data (even if the buffers are actually larger).
The MPEG stream has a variable length of data per video frame,
obviously.
The IDX stream has very small records; 24 bytes per I, P, or B index
record.
Giving very large buffers to the CX23418 for MPEG or IDX streams can
add latency before MPEG data becomes available for playback, or before
IDX records become available.
3. The CX23415/6/8 chips are all PCI bus devices, not PCIe. The
CX23417 is a stand alone unit that can output MPEG compressed video to
other bridge chips (like a CX23885).
4. The CX23418 supports scatter gather DMA. The cx18 driver trivially
uses S-G DMA (1 buffer per S-G list) for all capture streams except the
YUV stream. For YUV DMA, there is one video frame's worth of buffers in
each S-G list. (The cx18 driver tells the CX23418 that there is
*exactly* one frame's worth of space in each YUV S-G list, to ensure
frame sync is maintained in the event of a missed notification.) Since
the YUV stream is in an unusual macroblock format - which always assumes
720 pixels per line and deals with 32 lines at a time - image heights are
forced to multiples of 32 lines by the cx18 driver and buffers in the
S-G list for YUV video need to be a multiple of 720*32*3/2 bytes = 33.75 kB.
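
The arithmetic in item 4 can be sketched in C as below. This is only an
illustration of the numbers quoted above, not code from the cx18 driver:
the function and constant names are made up, and rounding the height *up*
(rather than down) to a multiple of 32 lines is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Values from the description above: the YUV macroblock format assumes
 * 720 pixels per line, handles 32 lines at a time, and uses 3/2 bytes
 * per pixel. All names here are hypothetical. */
enum {
    YUV_LINE_PIXELS = 720,
    YUV_UNIT_LINES  = 32
};

/* Bytes in one 32-line unit: 720 * 32 * 3 / 2 = 34560 (33.75 kB). */
static size_t yuv_unit_bytes(void)
{
    return (size_t)YUV_LINE_PIXELS * YUV_UNIT_LINES * 3 / 2;
}

/* Round an image height up to the next multiple of 32 lines.
 * Rounding up is an assumption; the text only says "forced to multiples". */
static unsigned int yuv_round_height(unsigned int height)
{
    return (height + YUV_UNIT_LINES - 1) / YUV_UNIT_LINES * YUV_UNIT_LINES;
}

/* Exact number of bytes one frame of the given height occupies. */
static size_t yuv_frame_bytes(unsigned int height)
{
    assert(height > 0);
    return (size_t)(yuv_round_height(height) / YUV_UNIT_LINES)
           * yuv_unit_bytes();
}
```

With these helpers a 480-line frame comes out as 15 units and a 576-line
frame as 18 units of 34560 bytes each.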
5. The CX23418 can handle 63 S-G lists per stream. The cx18 driver
logically extends that limit in software, if requested via module
options.
6. DMA buffer allocation is done for all the various streams at module
load time. kmalloc( , GFP_KERNEL) is used for all DMA
buffers. pci_map_single() is used to get the dma_addr of each
buffer for the CX2341x chip to use as an address on the PCI bus. For
simplicity, the buffers stay mapped until the module is unloaded.
7. The number of buffers and size of buffers allocated is determined by:
- default values
- per stream type (MPEG, YUV, PCM, IDX, VBI) module options
- payload sizes (Max VBI data per frame, YUV macroblocks)
The size of a stream's S-G list and the size of buffer storage in an S-G
list reported to the CX23418 is a function of the video standard for VBI
and YUV (525/60 or 625/50) and also the screen size for YUV (how many 32
line units per frame requiring 33.75 kB each).
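
As a rough sketch of how the YUV S-G list size in item 7 could follow from
the screen size: the helper below computes how many buffers one frame
needs and the exact byte count that would be reported to the chip. The
struct and function names are hypothetical, and the heights of 480 and 576
lines for full-size 525/60 and 625/50 frames are assumptions, not values
taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* One 32-line unit of 720-pixel lines at 3/2 bytes per pixel. */
#define UNIT_BYTES (720u * 32u * 3u / 2u)   /* 34560 bytes = 33.75 kB */

/* Hypothetical result type: how one frame maps onto one S-G list. */
struct sg_plan {
    unsigned int nbufs;     /* buffers placed in the S-G list */
    size_t reported_bytes;  /* exact frame size reported to the chip */
};

/* Plan one YUV S-G list for a frame of 'height' lines (a multiple of 32)
 * using buffers of 'buf_bytes' each (a multiple of UNIT_BYTES). */
static struct sg_plan plan_yuv_sg_list(unsigned int height, size_t buf_bytes)
{
    struct sg_plan plan;
    unsigned int units;

    assert(height > 0 && height % 32 == 0);
    assert(buf_bytes >= UNIT_BYTES && buf_bytes % UNIT_BYTES == 0);

    units = height / 32;
    plan.reported_bytes = (size_t)units * UNIT_BYTES;
    /* Enough whole buffers to hold the frame; the last one may be
     * only partly used, which is why the exact size is tracked. */
    plan.nbufs = (unsigned int)((plan.reported_bytes + buf_bytes - 1)
                                / buf_bytes);
    return plan;
}
```

With 101.25 kB (103680 byte) buffers, a 480-line frame needs 5 buffers and
a 576-line frame needs 6. A 32-line frame would need 1 buffer but report
only 34560 bytes, which is the "exact size, even if the buffers are
actually larger" case from item 2.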
8. I once tried to allocate very large buffers for DMA (to get a full
sized YUV frame all in one buffer). Without an actual hardware IOMMU on
the machine, this fails badly on x86. The Linux software I/O TLB (a
software IOMMU) implementation can panic the machine when trying to
allocate large bounce buffers (IIRC).
9. Most Linux apps that support YUV only use the streaming I/O interface
and not the read()/write() interface. Supporting streaming I/O on YUV
(and PCM and VBI?) while maintaining read()/write() for MPEG is desired
for both the ivtv and cx18 drivers.
10. The cx18 and ivtv drivers are so reliable right now that gutting
their buffer handling implementation is not a good option. (Cards based
on the CX23415/6/8 PCI bridges will be less viable in the consumer
market as PCI slots become scarce, so why break the drivers now?). I do
want to support mmap() for certain stream types, but still maintain the
current driver buffer handling and enforcement of odd rules on buffer
sizes to maintain frame sync. I can't have userland requesting any old
buffer size and number and expect good results. I would need a way to
hint or enforce buffer size requests from userland.
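
One possible shape for such a hint, sketched in plain C: the driver takes
whatever userland asked for and adjusts it to something it can live with.
Everything here (struct buf_request, adjust_request(), the min/max
parameters) is hypothetical and is not an existing videobuf or V4L2
interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of what userland requested. */
struct buf_request {
    unsigned int count;  /* number of buffers asked for */
    size_t size;         /* size of each buffer in bytes */
};

/* Return the request adjusted to the driver's rules: the size is rounded
 * up to a whole number of 'unit_bytes' (at least one unit) and the count
 * is clamped to the range [min_count, max_count]. */
static struct buf_request adjust_request(struct buf_request req,
                                         size_t unit_bytes,
                                         unsigned int min_count,
                                         unsigned int max_count)
{
    assert(unit_bytes > 0);
    assert(min_count <= max_count);

    if (req.size < unit_bytes)
        req.size = unit_bytes;
    else
        req.size = (req.size + unit_bytes - 1) / unit_bytes * unit_bytes;

    if (req.count < min_count)
        req.count = min_count;
    else if (req.count > max_count)
        req.count = max_count;

    return req;
}
```

For the YUV stream the unit would be the 34560 byte macroblock row from
item 4, so a request for 50000 byte buffers would come back as 69120
bytes. The caller would see the adjusted values and could decide whether
to proceed.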
11. The cx18 driver also supports a DVB TS stream (since the CX23418 is
used in hybrid Analog TV / Digital TV card designs). Right now the
cx18 internal buffer management scheme handles the S-G lists and buffers
for that stream as well.
12. Here is a quick look at the default buffer memory allocated for one
cx18 card (the IDX stream isn't shown, but it's rather small):
cx18-0: Registered device video1 for encoder MPEG (64 x 32.00 kB)
cx18-0: Registered DVB adapter0 for TS (32 x 32.00 kB)
cx18-0: Registered device video33 for encoder YUV (20 x 101.25 kB)
cx18-0: Registered device vbi1 for encoder VBI (20 x 51984 bytes)
cx18-0: Registered device video25 for encoder PCM audio (256 x 4.00 kB)
That's almost 7 MiB of buffers allocated per card by default.
Hmmm, maybe deferred allocation would be a good thing, as most users
only use the TS, MPEG, and VBI streams. Module parameters allow a
stream to be disabled (0 buffers for that stream type) to save memory,
but that is on a global basis, not a per card basis.
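
The "almost 7 MiB" figure can be checked from the log lines above with a
small C sketch. The table just restates the logged defaults; the struct
and function names are made up, and "kB" is taken to mean 1024 bytes
(which matches 33.75 kB = 34560 bytes in item 4).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: one stream's default buffer pool. */
struct stream_pool {
    const char *name;
    unsigned int count;  /* number of buffers */
    size_t buf_bytes;    /* size of each buffer */
};

/* Defaults as printed in the log above (IDX omitted, as in the log). */
static const struct stream_pool default_pools[] = {
    { "MPEG",  64, 32 * 1024 },
    { "TS",    32, 32 * 1024 },
    { "YUV",   20, 103680 },    /* 101.25 kB */
    { "VBI",   20, 51984 },
    { "PCM",  256, 4 * 1024 },
};

/* Sum count * size over all pools. */
static size_t total_pool_bytes(const struct stream_pool *pools, size_t n)
{
    size_t total = 0;
    size_t i;

    assert(pools != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += (size_t)pools[i].count * pools[i].buf_bytes;
    return total;
}
```

Summed over the five streams this gives 7307584 bytes, about 6.97 MiB.
The YUV and PCM pools alone account for 3122176 bytes of that, so
deferring their allocation until first use would save just under 3 MiB
per card for users who never open those streams.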
Regards,
Andy
> =========================
>
> Below is a (hopefully complete) list of required features for Samsung multimedia
> devices, related to memory allocation:
>
> - physically contiguous memory buffers of different sizes (up to several
> megabytes)
> - memory allocation from particular memory banks (ranges of physical addresses)
> - partitioning areas of memory into custom zones and an ability to allocate from
> a chosen zone
> - an ability to share memory buffers across different devices in a pipeline
> - automatic video buffer allocation from videobuf is the main use case, but
> direct access to the memory allocator from drivers (for temporary buffers,
> firmware etc.) is also required
> - ability to pre-configure allocator behavior by drivers
>
> Some nice-to-have features:
>
> - pluggable memory allocation strategies
> - cacheable/non-cacheable buffers
> - CPU cache synchronization for non-coherent areas
> - support for VM_PFNMAP memory, such as framebuffer memory, etc. (alternative
> methods of reference counting required)
> - shared memory (shmem) support, zero-copy X server interoperability
> - support for contiguous memory allocated by userspace, including contiguity
> checks and (maybe) bounce buffers
> - usage/fragmentation statistics
>
> Best regards
> --
> Pawel Osciak
> Linux Platform Group
> Samsung Poland R&D Center