* [PATCH v5 0/4] Add AIE driver support for mt8188
@ 2025-04-03 7:38 bo.kong
2025-04-03 7:38 ` [PATCH v5 1/4] media: dt-bindings: add MT8188 AIE bo.kong
` (5 more replies)
0 siblings, 6 replies; 17+ messages in thread
From: bo.kong @ 2025-04-03 7:38 UTC (permalink / raw)
To: Rob Herring, AngeloGioacchino Del Regno, Mauro Carvalho Chehab,
mtk29348, linux-media, linux-kernel, linux-arm-kernel,
linux-mediatek
Cc: zhaoyuan.chen, Teddy.Chen, Project_Global_Chrome_Upstream_Group
From: Bo Kong <Bo.Kong@mediatek.com>
AIE(AI Engine) is one of the units in mt8188 ISP which provides hardware accelerated face detection function, it can detect different sizes of faces in a raw image.
Bo Kong (4):
media: dt-bindings: add MT8188 AIE
arm64: dts: mt8188: add aie node
uapi: linux: add MT8188 AIE
media: mediatek: add MT8188 AIE driver
.../bindings/media/mediatek,mt8188-aie.yaml | 78 +
arch/arm64/boot/dts/mediatek/mt8188.dtsi | 33 +
drivers/media/platform/mediatek/Kconfig | 1 +
drivers/media/platform/mediatek/Makefile | 1 +
drivers/media/platform/mediatek/aie/Kconfig | 12 +
drivers/media/platform/mediatek/aie/Makefile | 5 +
drivers/media/platform/mediatek/aie/mtk_aie.h | 870 ++++++
.../media/platform/mediatek/aie/mtk_aie_drv.c | 2398 +++++++++++++++++
.../platform/mediatek/aie/mtk_aie_v4l2.c | 1295 +++++++++
drivers/media/v4l2-core/v4l2-ioctl.c | 3 +
include/uapi/linux/mtk_aie_v4l2_controls.h | 308 +++
include/uapi/linux/videodev2.h | 6 +
12 files changed, 5010 insertions(+)
create mode 100644 Documentation/devicetree/bindings/media/mediatek,mt8188-aie.yaml
create mode 100644 drivers/media/platform/mediatek/aie/Kconfig
create mode 100644 drivers/media/platform/mediatek/aie/Makefile
create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie.h
create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie_drv.c
create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c
create mode 100644 include/uapi/linux/mtk_aie_v4l2_controls.h
--
2.45.2
^ permalink raw reply [flat|nested] 17+ messages in thread* [PATCH v5 1/4] media: dt-bindings: add MT8188 AIE 2025-04-03 7:38 [PATCH v5 0/4] Add AIE driver support for mt8188 bo.kong @ 2025-04-03 7:38 ` bo.kong 2025-04-04 6:04 ` Krzysztof Kozlowski 2025-04-03 7:38 ` [PATCH v5 2/4] arm64: dts: mt8188: add aie node bo.kong ` (4 subsequent siblings) 5 siblings, 1 reply; 17+ messages in thread From: bo.kong @ 2025-04-03 7:38 UTC (permalink / raw) To: Rob Herring, AngeloGioacchino Del Regno, Mauro Carvalho Chehab, mtk29348, linux-media, linux-kernel, linux-arm-kernel, linux-mediatek Cc: zhaoyuan.chen, Teddy.Chen, Project_Global_Chrome_Upstream_Group From: Bo Kong <Bo.Kong@mediatek.com> Add YAML device tree bindings for MT8188 AIE. Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> --- Changes in v5: 1. Modify the description to make it more concise. 2. Delete the description of reg. 3. Modify the description of iommus and delete the maxItems of iommus. 4. Delete all mediatek,larb. 5. Modify the name of clock, change _ to -. Changes in v4: 1. Remove address-cells and size-cells 2. Remove larb12 related content 3. Update id content Changes in v3: None Changes in v2: 1. Fix coding style --- .../bindings/media/mediatek,mt8188-aie.yaml | 78 +++++++++++++++++++ 1 file changed, 78 insertions(+) create mode 100644 Documentation/devicetree/bindings/media/mediatek,mt8188-aie.yaml diff --git a/Documentation/devicetree/bindings/media/mediatek,mt8188-aie.yaml b/Documentation/devicetree/bindings/media/mediatek,mt8188-aie.yaml new file mode 100644 index 000000000000..861110bb0c98 --- /dev/null +++ b/Documentation/devicetree/bindings/media/mediatek,mt8188-aie.yaml @@ -0,0 +1,78 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/media/mediatek,mt8188-aie.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: The AI Engine Unit of MediaTek Camera System + +maintainers: + - Bo Kong <bo.kong@mediatek.com> + +description: + AIE(AI Engine) provides a hardware-accelerated face detection function. + +properties: + compatible: + items: + - const: mediatek,mt8188-aie + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + iommus: + description: + List of the hardware port in respective IOMMU block for current Socs. + Refer to bindings/iommu/mediatek,iommu.yaml. + + power-domains: + maxItems: 1 + + clocks: + items: + - description: clock for imgsys main ipe + - description: clock for ipe fdvt + - description: clock for ipe top + + clock-names: + items: + - const: img-ipe + - const: ipe-fdvt + - const: ipe-top + +required: + - compatible + - reg + - interrupts + - iommus + - power-domains + - clocks + - clock-names + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/arm-gic.h> + #include <dt-bindings/memory/mediatek,mt8188-memory-port.h> + #include <dt-bindings/power/mediatek,mt8188-power.h> + #include <dt-bindings/clock/mediatek,mt8188-clk.h> + aie@15310000 { + compatible = "mediatek,mt8188-aie"; + reg = <0x15310000 0x1000>; + interrupts = <GIC_SPI 787 IRQ_TYPE_LEVEL_HIGH 0>; + iommus = <&vpp_iommu M4U_PORT_L12_FDVT_RDA_0>, + <&vpp_iommu M4U_PORT_L12_FDVT_RDB_0>, + <&vpp_iommu M4U_PORT_L12_FDVT_WRA_0>, + <&vpp_iommu M4U_PORT_L12_FDVT_WRB_0>; + power-domains = <&spm MT8188_POWER_DOMAIN_IPE>; + clocks = <&imgsys CLK_IMGSYS_MAIN_IPE>, + <&ipesys CLK_IPE_FDVT>, + <&ipesys CLK_IPESYS_TOP>; + clock-names = "img-ipe", + "ipe-fdvt", + "ipe-top"; + }; -- 2.45.2 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v5 1/4] media: dt-bindings: add MT8188 AIE 2025-04-03 7:38 ` [PATCH v5 1/4] media: dt-bindings: add MT8188 AIE bo.kong @ 2025-04-04 6:04 ` Krzysztof Kozlowski 0 siblings, 0 replies; 17+ messages in thread From: Krzysztof Kozlowski @ 2025-04-04 6:04 UTC (permalink / raw) To: bo.kong, Rob Herring, AngeloGioacchino Del Regno, Mauro Carvalho Chehab, linux-media, linux-kernel, linux-arm-kernel, linux-mediatek Cc: zhaoyuan.chen, Teddy.Chen, Project_Global_Chrome_Upstream_Group On 03/04/2025 09:38, bo.kong wrote: > From: Bo Kong <Bo.Kong@mediatek.com> > > Add YAML device tree bindings for MT8188 AIE. > > Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> > --- <form letter> Please use scripts/get_maintainers.pl to get a list of necessary people and lists to CC. It might happen, that command when run on an older kernel, gives you outdated entries. Therefore please be sure you base your patches on recent Linux kernel. Tools like b4 or scripts/get_maintainer.pl provide you proper list of people, so fix your workflow. Tools might also fail if you work on some ancient tree (don't, instead use mainline) or work on fork of kernel (don't, instead use mainline). Just use b4 and everything should be fine, although remember about `b4 prep --auto-to-cc` if you added new patches to the patchset. You missed at least devicetree list (maybe more), so this won't be tested by automated tooling. Performing review on untested code might be a waste of time. Please kindly resend and include all necessary To/Cc entries. </form letter> Best regards, Krzysztof ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v5 2/4] arm64: dts: mt8188: add aie node 2025-04-03 7:38 [PATCH v5 0/4] Add AIE driver support for mt8188 bo.kong 2025-04-03 7:38 ` [PATCH v5 1/4] media: dt-bindings: add MT8188 AIE bo.kong @ 2025-04-03 7:38 ` bo.kong 2025-04-04 6:04 ` Krzysztof Kozlowski 2025-04-03 7:38 ` [PATCH v5 3/4] uapi: linux: add MT8188 AIE bo.kong ` (3 subsequent siblings) 5 siblings, 1 reply; 17+ messages in thread From: bo.kong @ 2025-04-03 7:38 UTC (permalink / raw) To: Rob Herring, AngeloGioacchino Del Regno, Mauro Carvalho Chehab, mtk29348, linux-media, linux-kernel, linux-arm-kernel, linux-mediatek Cc: zhaoyuan.chen, Teddy.Chen, Project_Global_Chrome_Upstream_Group From: Bo Kong <Bo.Kong@mediatek.com> Add aie node and related node Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> --- Changes in v5: 1.Modify the name of clock, change _ to -. Changes in v4: None Changes in v3: 1. Remove dts non-MMIO nodes Changes in v2: 1. Add AIE node and related node --- arch/arm64/boot/dts/mediatek/mt8188.dtsi | 33 ++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/arch/arm64/boot/dts/mediatek/mt8188.dtsi b/arch/arm64/boot/dts/mediatek/mt8188.dtsi index 69a8423d3858..641de110321a 100644 --- a/arch/arm64/boot/dts/mediatek/mt8188.dtsi +++ b/arch/arm64/boot/dts/mediatek/mt8188.dtsi @@ -2330,12 +2330,45 @@ imgsys_wpe1: clock-controller@15220000 { #clock-cells = <1>; }; + aie: aie@15310000 { + compatible = "mediatek,mt8188-aie"; + reg = <0 0x15310000 0 0x1000>; + interrupts = <GIC_SPI 787 IRQ_TYPE_LEVEL_HIGH 0>; + mediatek,larb = <&larb12>; + iommus = <&vpp_iommu M4U_PORT_L12_FDVT_RDA_0>, + <&vpp_iommu M4U_PORT_L12_FDVT_RDB_0>, + <&vpp_iommu M4U_PORT_L12_FDVT_WRA_0>, + <&vpp_iommu M4U_PORT_L12_FDVT_WRB_0>; + power-domains = <&spm MT8188_POWER_DOMAIN_IPE>; + clocks = <&imgsys CLK_IMGSYS_MAIN_IPE>, + <&ipesys CLK_IPE_FDVT>, + <&ipesys CLK_IPE_SMI_LARB12>, + <&ipesys CLK_IPESYS_TOP>; + clock-names = "img-ipe", + "ipe-fdvt", + "ipe-smi-larb12", + "ipe-top"; + }; + ipesys: clock-controller@15330000 { compatible = "mediatek,mt8188-ipesys"; reg = <0 0x15330000 0 0x1000>; #clock-cells = <1>; }; + larb12: larb@15340000 { + compatible = "mediatek,mt8188-smi-larb"; + reg = <0 0x15340000 0 0x1000>; + mediatek,larb-id = <SMI_L12_ID>; + mediatek,smi = <&vpp_smi_common>; + mediatek,smi-sub-comm = <&smi_img1>; + mediatek,smi-sub-comm-inport = <0>; + clocks = <&imgsys CLK_IMGSYS_MAIN_IPE>, + <&ipesys CLK_IPE_SMI_LARB12>; + clock-names = "apb", "smi"; + power-domains = <&spm MT8188_POWER_DOMAIN_IPE>; + }; + imgsys_wpe2: clock-controller@15520000 { compatible = "mediatek,mt8188-imgsys-wpe2"; reg = <0 0x15520000 0 0x1000>; -- 2.45.2 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v5 2/4] arm64: dts: mt8188: add aie node 2025-04-03 7:38 ` [PATCH v5 2/4] arm64: dts: mt8188: add aie node bo.kong @ 2025-04-04 6:04 ` Krzysztof Kozlowski 0 siblings, 0 replies; 17+ messages in thread From: Krzysztof Kozlowski @ 2025-04-04 6:04 UTC (permalink / raw) To: bo.kong, Rob Herring, AngeloGioacchino Del Regno, Mauro Carvalho Chehab, linux-media, linux-kernel, linux-arm-kernel, linux-mediatek Cc: zhaoyuan.chen, Teddy.Chen, Project_Global_Chrome_Upstream_Group On 03/04/2025 09:38, bo.kong wrote: > From: Bo Kong <Bo.Kong@mediatek.com> > > Add aie node and related node > > Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> > --- <form letter> Please use scripts/get_maintainers.pl to get a list of necessary people and lists to CC (and consider --no-git-fallback argument, so you will not CC people just because they made one commit years ago). It might happen, that command when run on an older kernel, gives you outdated entries. Therefore please be sure you base your patches on recent Linux kernel. Tools like b4 or scripts/get_maintainer.pl provide you proper list of people, so fix your workflow. Tools might also fail if you work on some ancient tree (don't, instead use mainline) or work on fork of kernel (don't, instead use mainline). Just use b4 and everything should be fine, although remember about `b4 prep --auto-to-cc` if you added new patches to the patchset. </form letter> Best regards, Krzysztof ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v5 3/4] uapi: linux: add MT8188 AIE 2025-04-03 7:38 [PATCH v5 0/4] Add AIE driver support for mt8188 bo.kong 2025-04-03 7:38 ` [PATCH v5 1/4] media: dt-bindings: add MT8188 AIE bo.kong 2025-04-03 7:38 ` [PATCH v5 2/4] arm64: dts: mt8188: add aie node bo.kong @ 2025-04-03 7:38 ` bo.kong 2025-04-07 3:57 ` CK Hu (胡俊光) 2025-04-03 7:38 ` [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver bo.kong ` (2 subsequent siblings) 5 siblings, 1 reply; 17+ messages in thread From: bo.kong @ 2025-04-03 7:38 UTC (permalink / raw) To: Rob Herring, AngeloGioacchino Del Regno, Mauro Carvalho Chehab, mtk29348, linux-media, linux-kernel, linux-arm-kernel, linux-mediatek Cc: zhaoyuan.chen, Teddy.Chen, Project_Global_Chrome_Upstream_Group From: Bo Kong <Bo.Kong@mediatek.com> Add AIE control related definitions. Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> --- Changes in v5: 1. Add an introduction for feature_threshold. 2. Change the name of v4l2_aie_roi to aie_roi_coordinate. 3. Change the name of v4l2_aie_padding to aie_padding_size. 4. Add an explanation for en_padding and describe the functions of the three modes of fd_mode. 5. Move the structures in the mtk_aie.h file to the uapi directory. Changes in v4: 1. Add document to describe the detail of V4L2_META_FMT_MTFD_RESULT 2. Add the introduction of related variables Changes in v3: None Changes in v2: 1. Fix coding style --- drivers/media/v4l2-core/v4l2-ioctl.c | 3 + include/uapi/linux/mtk_aie_v4l2_controls.h | 308 +++++++++++++++++++++ include/uapi/linux/videodev2.h | 6 + 3 files changed, 317 insertions(+) create mode 100644 include/uapi/linux/mtk_aie_v4l2_controls.h diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c index a16fb44c7246..74d0d13f841f 100644 --- a/drivers/media/v4l2-core/v4l2-ioctl.c +++ b/drivers/media/v4l2-core/v4l2-ioctl.c @@ -1479,6 +1479,9 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt) case V4L2_META_FMT_GENERIC_CSI2_16: descr = "8-bit Generic Meta, 16b CSI-2"; break; case V4L2_META_FMT_GENERIC_CSI2_20: descr = "8-bit Generic Meta, 20b CSI-2"; break; case V4L2_META_FMT_GENERIC_CSI2_24: descr = "8-bit Generic Meta, 24b CSI-2"; break; + case V4L2_META_FMT_MTFD_RESULT: + descr = "MTK AIE Result Fmt"; + break; default: /* Compressed formats */ diff --git a/include/uapi/linux/mtk_aie_v4l2_controls.h b/include/uapi/linux/mtk_aie_v4l2_controls.h new file mode 100644 index 000000000000..952dcdb23283 --- /dev/null +++ b/include/uapi/linux/mtk_aie_v4l2_controls.h @@ -0,0 +1,308 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* + * AIE Controls Header + * + * Copyright (c) 2020 MediaTek Inc. + * Author: Bo Kong <bo.kong@mediatek.com> + */ + +#ifndef __MTK_AIE_V4L2_CONTROLS_H__ +#define __MTK_AIE_V4L2_CONTROLS_H__ + +#include <linux/types.h> + +/* + * The base for the mediatek Face Detection driver controls. + * We reserve 16 controls for this driver. + * Each CID represents different stages of AIE, with different structures and functions + * and cannot be reused + */ +#define V4L2_CID_USER_MTK_FD_BASE (V4L2_CID_USER_BASE + 0x1fd0) +#define V4L2_CID_MTK_AIE_INIT (V4L2_CID_USER_MTK_FD_BASE + 1) +#define V4L2_CID_MTK_AIE_PARAM (V4L2_CID_USER_MTK_FD_BASE + 2) + +#define MAX_FACE_NUM 1024 +#define FLD_CUR_LANDMARK 11 +#define FLD_MAX_FRAME 15 + +/** + * struct v4l2_ctrl_aie_init - aie init parameters. + * + * @max_img_width: maximum width of the source image. + * @max_img_height: maximum height of the source image. + * @pyramid_width: maximum width of the base pyramid. + * @pyramid_height: maximum height of the base pyramid. + * @feature_threshold: The threshold for the face confidence.Range: 100 ~ 400. + * The larger the value,the lower the face recognition rate + */ +struct v4l2_ctrl_aie_init { + __u32 max_img_width; + __u32 max_img_height; + __u32 pyramid_width; + __u32 pyramid_height; + __s32 feature_threshold; +}; + +/** + * struct aie_roi_coordinate - aie roi parameters. + * + * @x1: x1 of the roi coordinate. + * @y1: y1 of the roi coordinate. + * @x2: x2 of the roi coordinate. + * @y2: y2 of the roi coordinate. + */ +struct aie_roi_coordinate { + __u32 x1; + __u32 y1; + __u32 x2; + __u32 y2; +}; + +/** + * struct aie_padding_size - aie padding parameters. + * + * @left: the size of padding left. + * @right: the size of padding right. + * @down: the size of padding below. + * @up: the size of padding above. + */ +struct aie_padding_size { + __u32 left; + __u32 right; + __u32 down; + __u32 up; +}; + +/** + * struct v4l2_fld_crop_rip_rop - aie fld parameters. + * + * @fld_in_crop_x1: x1 of the crop coordinate. + * @fld_in_crop_y1: y1 of the crop coordinate. + * @fld_in_crop_x2: x2 of the crop coordinate. + * @fld_in_crop_y2: y2 of the crop coordinate. + * @fld_in_rip: fld in rip. + * @fld_in_rop: fld in rop. + */ +struct v4l2_fld_crop_rip_rop { + __u32 fld_in_crop_x1; + __u32 fld_in_crop_y1; + __u32 fld_in_crop_x2; + __u32 fld_in_crop_y2; + __u32 fld_in_rip; + __u32 fld_in_rop; +}; + +/** + * struct fd_ret - aie fd result parameters. + * + * @anchor_x0: X coordinate of the top-left corner of each detected face. + * @anchor_x1: X coordinate of the bottom-right corner of each detected face. + * @anchor_y0: Y coordinate of the top-left corner of each detected face. + * @anchor_y1: Y coordinate of the bottom-right corner of each detected face. + * @rop_landmark_score0: Score for the first rotation pose landmark. + * @rop_landmark_score1: Score for the second rotation pose landmark. + * @rop_landmark_score2: Score for the third rotation pose landmark. + * @anchor_score: Score for the anchor points. + * @rip_landmark_score0: Score for the first rotation-invariant pose landmark. + * @rip_landmark_score1: Score for the second rotation-invariant pose landmark. + * @rip_landmark_score2: Score for the third rotation-invariant pose landmark. + * @rip_landmark_score3: Score for the fourth rotation-invariant pose landmark. + * @rip_landmark_score4: Score for the fifth rotation-invariant pose landmark. + * @rip_landmark_score5: Score for the sixth rotation-invariant pose landmark. + * @rip_landmark_score6: Score for the seventh rotation-invariant pose landmark. + * @face_result_index: Index of each detected face. + * @anchor_index: Index of the anchor points. + * @fd_partial_result: Partial face detection result. + */ +struct fd_ret { + __u16 anchor_x0[MAX_FACE_NUM]; + __u16 anchor_x1[MAX_FACE_NUM]; + __u16 anchor_y0[MAX_FACE_NUM]; + __u16 anchor_y1[MAX_FACE_NUM]; + __s16 rop_landmark_score0[MAX_FACE_NUM]; + __s16 rop_landmark_score1[MAX_FACE_NUM]; + __s16 rop_landmark_score2[MAX_FACE_NUM]; + __s16 anchor_score[MAX_FACE_NUM]; + __s16 rip_landmark_score0[MAX_FACE_NUM]; + __s16 rip_landmark_score1[MAX_FACE_NUM]; + __s16 rip_landmark_score2[MAX_FACE_NUM]; + __s16 rip_landmark_score3[MAX_FACE_NUM]; + __s16 rip_landmark_score4[MAX_FACE_NUM]; + __s16 rip_landmark_score5[MAX_FACE_NUM]; + __s16 rip_landmark_score6[MAX_FACE_NUM]; + __u16 face_result_index[MAX_FACE_NUM]; + __u16 anchor_index[MAX_FACE_NUM]; + __u32 fd_partial_result; +}; + +/** + * struct fd_result - Face detection results for different pyramid levels. + * + * @fd_pyramid0_num: Number of faces detected at pyramid level 0. + * @fd_pyramid1_num: Number of faces detected at pyramid level 1. + * @fd_pyramid2_num: Number of faces detected at pyramid level 2. + * @fd_total_num: Total number of faces detected across all pyramid levels. + * @pyramid0_result: Detection results for pyramid level 0. + * @pyramid1_result: Detection results for pyramid level 1. + * @pyramid2_result: Detection results for pyramid level 2. + */ +struct fd_result { + __u16 fd_pyramid0_num; + __u16 fd_pyramid1_num; + __u16 fd_pyramid2_num; + __u16 fd_total_num; + struct fd_ret pyramid0_result; + struct fd_ret pyramid1_result; + struct fd_ret pyramid2_result; +}; + +/** + * struct attr_result - Attribute detection results parameters. + * + * @gender_ret: Gender detection results. + * @ethnicity_ret: Ethnicity detection results. + * @merged_age_ret: Merged age detection results. + * @merged_gender_ret: Merged gender detection results. + * @merged_is_indian_ret: Merged results indicating if the subject is Indian. + * @merged_ethnicity_ret: Merged ethnicity detection results. + */ +struct attr_result { + __s16 gender_ret[2][64]; + __s16 ethnicity_ret[4][64]; + __s16 merged_age_ret[2]; + __s16 merged_gender_ret[2]; + __s16 merged_is_indian_ret[2]; + __s16 merged_ethnicity_ret[4]; +}; + +/** + * struct fld_landmark - FLD coordinates parameters. + * + * @x: X coordinate of the facial landmark. + * @y: Y coordinate of the facial landmark. + */ +struct fld_landmark { + __u16 x; + __u16 y; +}; + +/** + * struct fld_result - FLD detection results parameters. + * + * @fld_landmark: Array of facial landmarks, each with X and Y coordinates. + * @fld_out_rip: Output rotation-invariant pose value. + * @fld_out_rop: Output rotation pose value. + * @confidence: Confidence score of the facial landmark detection. + * @blinkscore: Blink score indicating the likelihood of eye blink. + */ +struct fld_result { + struct fld_landmark fld_landmark[FLD_CUR_LANDMARK]; + __u16 fld_out_rip; + __u16 fld_out_rop; + __u16 confidence; + __s16 blinkscore; +}; + +/** + * struct aie_enq_info - V4L2 Kernelspace parameters. + * + * @sel_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for current fd. + * FDMODE: Face Detection. + * ATTRIBUTEMODE: Gender and ethnicity detection + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and mouth + * @src_img_fmt: source image format. + * @src_img_width: the width of the source image. + * @src_img_height: the height of the source image. + * @src_img_stride: the stride of the source image. + * @pyramid_base_width: pyramid is the size of resizer, the width of the base pyramid. + * @pyramid_base_height: pyramid is the size of resizer, the width of the base pyramid. + * @number_of_pyramid: number of pyramid, min: 1, max: 3. + * @rotate_degree: the rotate degree of the image. + * @en_roi: enable roi(roi is a box diagram that selects a rectangle in a picture). + * when en_roi is enable, AIE will return a rectangle face detection result + * @src_roi: roi params. + * @en_padding: enable padding, this is only used on the hardware of yuv to rgb. + * and has noting to do with fd_mode + * @src_padding: padding params. + * @freq_level: frequency level, Get value from user space enque. + * @fld_face_num: the number of faces in fld. + * user space tells driver the number of detections. + * @fld_input: fld input params. + * @src_img_addr: Source image address. + * @src_img_addr_uv: Source image address for UV plane. + * @fd_out: Face detection results. + * @attr_out: Attribute detection results. + * @fld_out: Array of facial landmark detection results for multiple frames. + * @irq_status: Interrupt request status. + */ +struct aie_enq_info { + __u32 sel_mode; + __u32 src_img_fmt; + __u32 src_img_width; + __u32 src_img_height; + __u32 src_img_stride; + __u32 pyramid_base_width; + __u32 pyramid_base_height; + __u32 number_of_pyramid; + __u32 rotate_degree; + int en_roi; + struct aie_roi_coordinate src_roi; + int en_padding; + struct aie_padding_size src_padding; + unsigned int freq_level; + unsigned int fld_face_num; + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; + __u32 src_img_addr; + __u32 src_img_addr_uv; + struct fd_result fd_out; + struct attr_result attr_out; + struct fld_result fld_out[FLD_MAX_FRAME]; + __u32 irq_status; +}; + +/** + * struct v4l2_ctrl_aie_param - V4L2 Userspace parameters. + * + * @fd_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for current fd. + * FDMODE: Face Detection. + * ATTRIBUTEMODE: Gender and ethnicity detection + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and mouth + * @src_img_fmt: source image format. + * @src_img_width: the width of the source image. + * @src_img_height: the height of the source image. + * @src_img_stride: the stride of the source image. + * @pyramid_base_width: pyramid is the size of resizer, the width of the base pyramid. + * @pyramid_base_height: pyramid is the size of resizer, the width of the base pyramid. + * @number_of_pyramid: number of pyramid, min: 1, max: 3. + * @rotate_degree: the rotate degree of the image. + * @en_roi: enable roi(roi is a box diagram that selects a rectangle in a picture). + * when en_roi is enable, AIE will return a rectangle face detection result + * @src_roi: roi params. + * @en_padding: enable padding, this is only used on the hardware of yuv to rgb. + * and has noting to do with fd_mode + * @src_padding: padding params. + * @freq_level: frequency level, Get value from user space enque. + * @fld_face_num: the number of faces in fld. + * user space tells driver the number of detections. + * @fld_input: fld input params. + */ +struct v4l2_ctrl_aie_param { + __u32 fd_mode; + __u32 src_img_fmt; + __u32 src_img_width; + __u32 src_img_height; + __u32 src_img_stride; + __u32 pyramid_base_width; + __u32 pyramid_base_height; + __u32 number_of_pyramid; + __u32 rotate_degree; + __s32 en_roi; + struct aie_roi_coordinate src_roi; + __s32 en_padding; + struct aie_padding_size src_padding; + __u32 freq_level; + __u32 fld_face_num; + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; +}; + +#endif /* __MTK_AIE_V4L2_CONTROLS_H__ */ diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h index c8cb2796130f..329bbac9239e 100644 --- a/include/uapi/linux/videodev2.h +++ b/include/uapi/linux/videodev2.h @@ -858,6 +858,9 @@ struct v4l2_pix_format { #define V4L2_META_FMT_RK_ISP1_STAT_3A v4l2_fourcc('R', 'K', '1', 'S') /* Rockchip ISP1 3A Statistics */ #define V4L2_META_FMT_RK_ISP1_EXT_PARAMS v4l2_fourcc('R', 'K', '1', 'E') /* Rockchip ISP1 3a Extensible Parameters */ +/* Vendor-specific definition: used for the MediaTek camera subsystem's face detection results */ +#define V4L2_META_FMT_MTFD_RESULT v4l2_fourcc('M', 'T', 'f', 'd') + /* Vendor specific - used for RaspberryPi PiSP */ #define V4L2_META_FMT_RPI_BE_CFG v4l2_fourcc('R', 'P', 'B', 'C') /* PiSP BE configuration */ #define V4L2_META_FMT_RPI_FE_CFG v4l2_fourcc('R', 'P', 'F', 'C') /* PiSP FE configuration */ @@ -1965,6 +1968,9 @@ enum v4l2_ctrl_type { V4L2_CTRL_TYPE_AV1_TILE_GROUP_ENTRY = 0x281, V4L2_CTRL_TYPE_AV1_FRAME = 0x282, V4L2_CTRL_TYPE_AV1_FILM_GRAIN = 0x283, + + V4L2_CTRL_TYPE_AIE_INIT = 0x0290, + V4L2_CTRL_TYPE_AIE_PARAM = 0x0291, }; /* Used in the VIDIOC_QUERYCTRL ioctl for querying controls */ -- 2.45.2 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v5 3/4] uapi: linux: add MT8188 AIE 2025-04-03 7:38 ` [PATCH v5 3/4] uapi: linux: add MT8188 AIE bo.kong @ 2025-04-07 3:57 ` CK Hu (胡俊光) 2025-04-23 4:02 ` Bo Kong (孔波) 2025-04-23 9:29 ` Bo Kong (孔波) 0 siblings, 2 replies; 17+ messages in thread From: CK Hu (胡俊光) @ 2025-04-07 3:57 UTC (permalink / raw) To: linux-media@vger.kernel.org, AngeloGioacchino Del Regno, robh@kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, mchehab@kernel.org, Bo Kong (孔波), linux-mediatek@lists.infradead.org Cc: Zhaoyuan Chen (陈兆远), Teddy Chen (陳乾元), Project_Global_Chrome_Upstream_Group On Thu, 2025-04-03 at 15:38 +0800, bo.kong wrote: > From: Bo Kong <Bo.Kong@mediatek.com> > > Add AIE control related definitions. Laurent has asked you to provide document of the UAPI. If you have no idea what to write to this document, I ask you question in this mail and you write the answer to the document. > > Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> > --- [snip] > diff --git a/include/uapi/linux/mtk_aie_v4l2_controls.h b/include/uapi/linux/mtk_aie_v4l2_controls.h > new file mode 100644 > index 000000000000..952dcdb23283 > --- /dev/null > +++ b/include/uapi/linux/mtk_aie_v4l2_controls.h > @@ -0,0 +1,308 @@ > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ > +/* > + * AIE Controls Header > + * > + * Copyright (c) 2020 MediaTek Inc. > + * Author: Bo Kong <bo.kong@mediatek.com> > + */ > + > +#ifndef __MTK_AIE_V4L2_CONTROLS_H__ > +#define __MTK_AIE_V4L2_CONTROLS_H__ > + > +#include <linux/types.h> > + > +/* > + * The base for the mediatek Face Detection driver controls. > + * We reserve 16 controls for this driver. > + * Each CID represents different stages of AIE, with different structures and functions > + * and cannot be reused > + */ > +#define V4L2_CID_USER_MTK_FD_BASE (V4L2_CID_USER_BASE + 0x1fd0) > +#define V4L2_CID_MTK_AIE_INIT (V4L2_CID_USER_MTK_FD_BASE + 1) > +#define V4L2_CID_MTK_AIE_PARAM (V4L2_CID_USER_MTK_FD_BASE + 2) > + > +#define MAX_FACE_NUM 1024 > +#define FLD_CUR_LANDMARK 11 > +#define FLD_MAX_FRAME 15 > + > +/** > + * struct v4l2_ctrl_aie_init - aie init parameters. > + * > + * @max_img_width: maximum width of the source image. > + * @max_img_height: maximum height of the source image. > + * @pyramid_width: maximum width of the base pyramid. > + * @pyramid_height: maximum height of the base pyramid. > + * @feature_threshold: The threshold for the face confidence.Range: 100 ~ 400. > + * The larger the value,the lower the face recognition rate > + */ > +struct v4l2_ctrl_aie_init { > + __u32 max_img_width; > + __u32 max_img_height; max image width/height is hardware limitation. This should be stored in driver. User space program should use some V4L2 query interface to get this information. > + __u32 pyramid_width; > + __u32 pyramid_height; Ditto. > + __s32 feature_threshold; How to decide this value? I guess there is a tuning process to find a reasonable threshold. Is there only one reasonable threshold for all scenario? If so, write done this one threshold in driver and user space is not necessary to set this value to driver. If the threshold would vary for different scenario, for example, 168 for indoor video and 258 for outdoor video, add comment to describe this so user space program have to detect the scenario of input video. > +}; > + > +/** > + * struct aie_roi_coordinate - aie roi parameters. > + * > + * @x1: x1 of the roi coordinate. > + * @y1: y1 of the roi coordinate. > + * @x2: x2 of the roi coordinate. > + * @y2: y2 of the roi coordinate. > + */ > +struct aie_roi_coordinate { > + __u32 x1; > + __u32 y1; > + __u32 x2; > + __u32 y2; > +}; > + > +/** > + * struct aie_padding_size - aie padding parameters. > + * > + * @left: the size of padding left. > + * @right: the size of padding right. > + * @down: the size of padding below. > + * @up: the size of padding above. > + */ > +struct aie_padding_size { > + __u32 left; > + __u32 right; > + __u32 down; > + __u32 up; > +}; > + > +/** > + * struct v4l2_fld_crop_rip_rop - aie fld parameters. > + * > + * @fld_in_crop_x1: x1 of the crop coordinate. > + * @fld_in_crop_y1: y1 of the crop coordinate. > + * @fld_in_crop_x2: x2 of the crop coordinate. > + * @fld_in_crop_y2: y2 of the crop coordinate. > + * @fld_in_rip: fld in rip. > + * @fld_in_rop: fld in rop. > + */ > +struct v4l2_fld_crop_rip_rop { > + __u32 fld_in_crop_x1; > + __u32 fld_in_crop_y1; > + __u32 fld_in_crop_x2; > + __u32 fld_in_crop_y2; > + __u32 fld_in_rip; > + __u32 fld_in_rop; > +}; > + > +/** > + * struct fd_ret - aie fd result parameters. > + * > + * @anchor_x0: X coordinate of the top-left corner of each detected face. > + * @anchor_x1: X coordinate of the bottom-right corner of each detected face. > + * @anchor_y0: Y coordinate of the top-left corner of each detected face. > + * @anchor_y1: Y coordinate of the bottom-right corner of each detected face. > + * @rop_landmark_score0: Score for the first rotation pose landmark. > + * @rop_landmark_score1: Score for the second rotation pose landmark. > + * @rop_landmark_score2: Score for the third rotation pose landmark. > + * @anchor_score: Score for the anchor points. > + * @rip_landmark_score0: Score for the first rotation-invariant pose landmark. > + * @rip_landmark_score1: Score for the second rotation-invariant pose landmark. > + * @rip_landmark_score2: Score for the third rotation-invariant pose landmark. > + * @rip_landmark_score3: Score for the fourth rotation-invariant pose landmark. > + * @rip_landmark_score4: Score for the fifth rotation-invariant pose landmark. > + * @rip_landmark_score5: Score for the sixth rotation-invariant pose landmark. > + * @rip_landmark_score6: Score for the seventh rotation-invariant pose landmark. > + * @face_result_index: Index of each detected face. > + * @anchor_index: Index of the anchor points. > + * @fd_partial_result: Partial face detection result. > + */ > +struct fd_ret { > + __u16 anchor_x0[MAX_FACE_NUM]; > + __u16 anchor_x1[MAX_FACE_NUM]; > + __u16 anchor_y0[MAX_FACE_NUM]; > + __u16 anchor_y1[MAX_FACE_NUM]; I don't know why use 'anchor' for face coordinate. Maybe this is trivial for AI domain, but I think there should be some brief introduction for who don't understand AI. > + __s16 rop_landmark_score0[MAX_FACE_NUM]; > + __s16 rop_landmark_score1[MAX_FACE_NUM]; > + __s16 rop_landmark_score2[MAX_FACE_NUM]; I don't know what is landmark. Why there are 'first', 'second', and 'third' landmark? Maybe this is trivial for AI domain, but I think there should be some brief introduction for who don't understand AI. > + __s16 anchor_score[MAX_FACE_NUM]; > + __s16 rip_landmark_score0[MAX_FACE_NUM]; > + __s16 rip_landmark_score1[MAX_FACE_NUM]; > + __s16 rip_landmark_score2[MAX_FACE_NUM]; > + __s16 rip_landmark_score3[MAX_FACE_NUM]; > + __s16 rip_landmark_score4[MAX_FACE_NUM]; > + __s16 rip_landmark_score5[MAX_FACE_NUM]; > + __s16 rip_landmark_score6[MAX_FACE_NUM]; > + __u16 face_result_index[MAX_FACE_NUM]; What is this index? If fd_pyramid0_num is 100, so anchor_x0[0] ~ anchor_x0[99] is valid. I don't know why need this index? > + __u16 anchor_index[MAX_FACE_NUM]; What is this index? > + __u32 fd_partial_result; What is this result? Describe more so that everyone could understand this. > +}; It's better that one face information is placed together in DRAM because it's easy to cache hit. Then this structure would reform as: struct face { __u16 anchor_x0; __u16 anchor_x1; __u16 anchor_y0; __u16 anchor_y1; ... __u16 anchor_index; ; struct fd_ret { struct face face_result[MAX_FACE_NUM]; __u32 fd_partial_result; }; > + > +/** > + * struct fd_result - Face detection results for different pyramid levels. > + * > + * @fd_pyramid0_num: Number of faces detected at pyramid level 0. > + * @fd_pyramid1_num: Number of faces detected at pyramid level 1. > + * @fd_pyramid2_num: Number of faces detected at pyramid level 2. > + * @fd_total_num: Total number of faces detected across all pyramid levels. > + * @pyramid0_result: Detection results for pyramid level 0. > + * @pyramid1_result: Detection results for pyramid level 1. > + * @pyramid2_result: Detection results for pyramid level 2. > + */ > +struct fd_result { > + __u16 fd_pyramid0_num; > + __u16 fd_pyramid1_num; > + __u16 fd_pyramid2_num; > + __u16 fd_total_num; fd_total_num is redundant. User space program could use (fd_pyramid0_num + fd_pyramid1_num + fd_pyramid2_num) to get this. So drop this. > + struct fd_ret pyramid0_result; > + struct fd_ret pyramid1_result; > + struct fd_ret pyramid2_result; > +}; > + > +/** > + * struct attr_result - Attribute detection results parameters. > + * > + * @gender_ret: Gender detection results. > + * @ethnicity_ret: Ethnicity detection results. > + * @merged_age_ret: Merged age detection results. > + * @merged_gender_ret: Merged gender detection results. > + * @merged_is_indian_ret: Merged results indicating if the subject is Indian. > + * @merged_ethnicity_ret: Merged ethnicity detection results. > + */ > +struct attr_result { > + __s16 gender_ret[2][64]; Is the value 0 for woman and 1 for man? Why the array is [2][64]? Explain what the first index mean and what the second index mean. attribute mode has no coordinate information, so we just know how many man and how many woman, but we don't know which one (the position in video) is man or woman? > + __s16 ethnicity_ret[4][64]; 0 for Chinese, 1 for Japanese, 2 for Korean? > + __s16 merged_age_ret[2]; Why the term 'merged'? > + __s16 merged_gender_ret[2]; Ditto. > + __s16 merged_is_indian_ret[2]; indian is India people or indigenous people of America or both? > + __s16 merged_ethnicity_ret[4]; > +}; > + > +/** > + * struct fld_landmark - FLD coordinates parameters. > + * > + * @x: X coordinate of the facial landmark. > + * @y: Y coordinate of the facial landmark. > + */ > +struct fld_landmark { > + __u16 x; > + __u16 y; > +}; > + > +/** > + * struct fld_result - FLD detection results parameters. > + * > + * @fld_landmark: Array of facial landmarks, each with X and Y coordinates. > + * @fld_out_rip: Output rotation-invariant pose value. > + * @fld_out_rop: Output rotation pose value. > + * @confidence: Confidence score of the facial landmark detection. > + * @blinkscore: Blink score indicating the likelihood of eye blink. > + */ > +struct fld_result { > + struct fld_landmark fld_landmark[FLD_CUR_LANDMARK]; > + __u16 fld_out_rip; > + __u16 fld_out_rop; > + __u16 confidence; > + __s16 blinkscore; > +}; > + > +/** > + * struct aie_enq_info - V4L2 Kernelspace parameters. > + * > + * @sel_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for current fd. > + * FDMODE: Face Detection. > + * ATTRIBUTEMODE: Gender and ethnicity detection > + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and mouth > + * @src_img_fmt: source image format. > + * @src_img_width: the width of the source image. > + * @src_img_height: the height of the source image. > + * @src_img_stride: the stride of the source image. > + * @pyramid_base_width: pyramid is the size of resizer, the width of the base pyramid. > + * @pyramid_base_height: pyramid is the size of resizer, the width of the base pyramid. > + * @number_of_pyramid: number of pyramid, min: 1, max: 3. > + * @rotate_degree: the rotate degree of the image. > + * @en_roi: enable roi(roi is a box diagram that selects a rectangle in a picture). > + * when en_roi is enable, AIE will return a rectangle face detection result > + * @src_roi: roi params. > + * @en_padding: enable padding, this is only used on the hardware of yuv to rgb. > + * and has noting to do with fd_mode > + * @src_padding: padding params. > + * @freq_level: frequency level, Get value from user space enque. > + * @fld_face_num: the number of faces in fld. > + * user space tells driver the number of detections. > + * @fld_input: fld input params. > + * @src_img_addr: Source image address. > + * @src_img_addr_uv: Source image address for UV plane. > + * @fd_out: Face detection results. > + * @attr_out: Attribute detection results. > + * @fld_out: Array of facial landmark detection results for multiple frames. > + * @irq_status: Interrupt request status. > + */ > +struct aie_enq_info { > + __u32 sel_mode; > + __u32 src_img_fmt; > + __u32 src_img_width; > + __u32 src_img_height; > + __u32 src_img_stride; > + __u32 pyramid_base_width; > + __u32 pyramid_base_height; > + __u32 number_of_pyramid; > + __u32 rotate_degree; > + int en_roi; > + struct aie_roi_coordinate src_roi; > + int en_padding; > + struct aie_padding_size src_padding; > + unsigned int freq_level; > + unsigned int fld_face_num; > + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; Above information is set by user space, so driver is not necessary to return these information back to user space. Drop these. > + __u32 src_img_addr; > + __u32 src_img_addr_uv; If user space program need to access source image buffer, it should map it itself. Do not pass this address information from driver. Drop these. > + struct fd_result fd_out; > + struct attr_result attr_out; > + struct fld_result fld_out[FLD_MAX_FRAME]; Union these three structure because hardware would work in one mode in one time. > + __u32 irq_status; User space program should not process irq_status. So drop this. > +}; > + > +/** > + * struct v4l2_ctrl_aie_param - V4L2 Userspace parameters. > + * > + * @fd_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for current fd. > + * FDMODE: Face Detection. > + * ATTRIBUTEMODE: Gender and ethnicity detection > + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and mouth > + * @src_img_fmt: source image format. > + * @src_img_width: the width of the source image. > + * @src_img_height: the height of the source image. > + * @src_img_stride: the stride of the source image. > + * @pyramid_base_width: pyramid is the size of resizer, the width of the base pyramid. > + * @pyramid_base_height: pyramid is the size of resizer, the width of the base pyramid. > + * @number_of_pyramid: number of pyramid, min: 1, max: 3. > + * @rotate_degree: the rotate degree of the image. > + * @en_roi: enable roi(roi is a box diagram that selects a rectangle in a picture). > + * when en_roi is enable, AIE will return a rectangle face detection result > + * @src_roi: roi params. > + * @en_padding: enable padding, this is only used on the hardware of yuv to rgb. > + * and has noting to do with fd_mode > + * @src_padding: padding params. > + * @freq_level: frequency level, Get value from user space enque. > + * @fld_face_num: the number of faces in fld. > + * user space tells driver the number of detections. > + * @fld_input: fld input params. > + */ > +struct v4l2_ctrl_aie_param { > + __u32 fd_mode; > + __u32 src_img_fmt; > + __u32 src_img_width; > + __u32 src_img_height; > + __u32 src_img_stride; Use V4L2 standard interface to set source image format/width/height/stride. > + __u32 pyramid_base_width; > + __u32 pyramid_base_height; Does pyramid width/height has limitation? If source image resolution is 3840x2160, does pyramid resolution could be 3840x2160? If there are hardware limitation, query this limitation from driver. Some AI model just accept some certain resolution, for example 360x360. In this case, any source resolution should resize to this resolution. Does AIE need to resize to some fixed resolution? Or any resolution is acceptable? > + __u32 number_of_pyramid; Why need multiple pyramid? Does this AI model need the face resolution around 100x100? If a face it 400x400, so it would not be detect so need to resize this face to multiple smaller resolution. In first pyramid face resolution is 400x400. In second pyramid face resolution is 200x200. In third pyramid face resolution is 100x100. So this face would be detected in third pyramid. If this is right, add this information to document. If not, add right information to document. > + __u32 rotate_degree; > + __s32 en_roi; > + struct aie_roi_coordinate src_roi; > + __s32 en_padding; If padding has nothing to do with fd_mode, drop it. > + struct aie_padding_size src_padding; > + __u32 freq_level; freq_level is FPS? What is freq_level and how hardware work with this? The comment says "Get value from user space enque", what's this? Describe more detail about this. > + __u32 fld_face_num; > + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; Does the fld mode need fd mode result? The face_num is get from fd mode, and the fld_input coordinate is get from fd mode also. If so, describe more about this. Regards, CK > +}; > + > +#endif /* __MTK_AIE_V4L2_CONTROLS_H__ */ > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h > index c8cb2796130f..329bbac9239e 100644 > --- a/include/uapi/linux/videodev2.h > +++ b/include/uapi/linux/videodev2.h > @@ -858,6 +858,9 @@ struct v4l2_pix_format { > #define V4L2_META_FMT_RK_ISP1_STAT_3A v4l2_fourcc('R', 'K', '1', 'S') /* Rockchip ISP1 3A Statistics */ > #define V4L2_META_FMT_RK_ISP1_EXT_PARAMS v4l2_fourcc('R', 'K', '1', 'E') /* Rockchip ISP1 3a Extensible Parameters */ > > +/* Vendor-specific definition: used for the MediaTek camera subsystem's face detection results */ > +#define V4L2_META_FMT_MTFD_RESULT v4l2_fourcc('M', 'T', 'f', 'd') > + > /* Vendor specific - used for RaspberryPi PiSP */ > #define V4L2_META_FMT_RPI_BE_CFG v4l2_fourcc('R', 'P', 'B', 'C') /* PiSP BE configuration */ > #define V4L2_META_FMT_RPI_FE_CFG v4l2_fourcc('R', 'P', 'F', 'C') /* PiSP FE configuration */ > @@ -1965,6 +1968,9 @@ enum v4l2_ctrl_type { > V4L2_CTRL_TYPE_AV1_TILE_GROUP_ENTRY = 0x281, > V4L2_CTRL_TYPE_AV1_FRAME = 0x282, > V4L2_CTRL_TYPE_AV1_FILM_GRAIN = 0x283, > + > + V4L2_CTRL_TYPE_AIE_INIT = 0x0290, > + V4L2_CTRL_TYPE_AIE_PARAM = 0x0291, > }; > > /* Used in the VIDIOC_QUERYCTRL ioctl for querying controls */ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v5 3/4] uapi: linux: add MT8188 AIE 2025-04-07 3:57 ` CK Hu (胡俊光) @ 2025-04-23 4:02 ` Bo Kong (孔波) 2025-04-24 1:43 ` CK Hu (胡俊光) 2025-04-24 5:22 ` CK Hu (胡俊光) 2025-04-23 9:29 ` Bo Kong (孔波) 1 sibling, 2 replies; 17+ messages in thread From: Bo Kong (孔波) @ 2025-04-23 4:02 UTC (permalink / raw) To: linux-mediatek@lists.infradead.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, mchehab@kernel.org, CK Hu (胡俊光), robh@kernel.org, linux-arm-kernel@lists.infradead.org, AngeloGioacchino Del Regno Cc: Zhaoyuan Chen (陈兆远), Teddy Chen (陳乾元), Project_Global_Chrome_Upstream_Group Hi,CK, Thanks for the reviews. On Mon, 2025-04-07 at 03:57 +0000, CK Hu (胡俊光) wrote: > On Thu, 2025-04-03 at 15:38 +0800, bo.kong wrote: > > From: Bo Kong <Bo.Kong@mediatek.com> > > > > Add AIE control related definitions. > > Laurent has asked you to provide document of the UAPI. > If you have no idea what to write to this document, > I ask you question in this mail and you write the answer to the > document. OK, I will reply with the answer here. > > > > > Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> > > --- > > [snip] > > > diff --git a/include/uapi/linux/mtk_aie_v4l2_controls.h > > b/include/uapi/linux/mtk_aie_v4l2_controls.h > > new file mode 100644 > > index 000000000000..952dcdb23283 > > --- /dev/null > > +++ b/include/uapi/linux/mtk_aie_v4l2_controls.h > > @@ -0,0 +1,308 @@ > > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ > > +/* > > + * AIE Controls Header > > + * > > + * Copyright (c) 2020 MediaTek Inc. > > + * Author: Bo Kong <bo.kong@mediatek.com> > > + */ > > + > > +#ifndef __MTK_AIE_V4L2_CONTROLS_H__ > > +#define __MTK_AIE_V4L2_CONTROLS_H__ > > + > > +#include <linux/types.h> > > + > > +/* > > + * The base for the mediatek Face Detection driver controls. > > + * We reserve 16 controls for this driver. > > + * Each CID represents different stages of AIE, with different > > structures and functions > > + * and cannot be reused > > + */ > > +#define V4L2_CID_USER_MTK_FD_BASE (V4L2_CID_USER_BASE + 0x1fd0) > > +#define V4L2_CID_MTK_AIE_INIT (V4L2_CID_USER_MTK_FD_BASE + 1) > > +#define V4L2_CID_MTK_AIE_PARAM (V4L2_CID_USER_MTK_FD_BASE + 2) > > + > > +#define MAX_FACE_NUM 1024 > > +#define FLD_CUR_LANDMARK 11 > > +#define FLD_MAX_FRAME 15 > > + > > +/** > > + * struct v4l2_ctrl_aie_init - aie init parameters. > > + * > > + * @max_img_width: maximum width of the source image. > > + * @max_img_height: maximum height of the source image. > > + * @pyramid_width: maximum width of the base pyramid. > > + * @pyramid_height: maximum height of the base pyramid. > > + * @feature_threshold: The threshold for the face > > confidence.Range: 100 ~ 400. > > + * The larger the value,the lower the face > > recognition rate > > + */ > > +struct v4l2_ctrl_aie_init { > > + __u32 max_img_width; > > + __u32 max_img_height; > > max image width/height is hardware limitation. This should be stored > in driver. > User space program should use some V4L2 query interface to get this > information. This is the max size of the image sent by the user, not the maximum size of the hardware. For example, for a 640x480 image, the user needs to set max_img_width = 640 and max_img_height = 480. > > > + __u32 pyramid_width; > > + __u32 pyramid_height; > > Ditto. > > > + __s32 feature_threshold; > > How to decide this value? > I guess there is a tuning process to find a reasonable threshold. > Is there only one reasonable threshold for all scenario? > If so, write done this one threshold in driver and user space is not > necessary to set this value to driver. > If the threshold would vary for different scenario, for example, 168 > for indoor video and 258 for outdoor video, > add comment to describe this so user space program have to detect the > scenario of input video. This value depends on whether the user prioritizes recognition accuracy or recognition speed. There is no standard value, as it provides users with multiple options. > > > +}; > > + > > +/** > > + * struct aie_roi_coordinate - aie roi parameters. > > + * > > + * @x1: x1 of the roi coordinate. > > + * @y1: y1 of the roi coordinate. > > + * @x2: x2 of the roi coordinate. > > + * @y2: y2 of the roi coordinate. > > + */ > > +struct aie_roi_coordinate { > > + __u32 x1; > > + __u32 y1; > > + __u32 x2; > > + __u32 y2; > > +}; > > + > > +/** > > + * struct aie_padding_size - aie padding parameters. > > + * > > + * @left: the size of padding left. > > + * @right: the size of padding right. > > + * @down: the size of padding below. > > + * @up: the size of padding above. > > + */ > > +struct aie_padding_size { > > + __u32 left; > > + __u32 right; > > + __u32 down; > > + __u32 up; > > +}; > > + > > +/** > > + * struct v4l2_fld_crop_rip_rop - aie fld parameters. > > + * > > + * @fld_in_crop_x1: x1 of the crop coordinate. > > + * @fld_in_crop_y1: y1 of the crop coordinate. > > + * @fld_in_crop_x2: x2 of the crop coordinate. > > + * @fld_in_crop_y2: y2 of the crop coordinate. > > + * @fld_in_rip: fld in rip. > > + * @fld_in_rop: fld in rop. > > + */ > > +struct v4l2_fld_crop_rip_rop { > > + __u32 fld_in_crop_x1; > > + __u32 fld_in_crop_y1; > > + __u32 fld_in_crop_x2; > > + __u32 fld_in_crop_y2; > > + __u32 fld_in_rip; > > + __u32 fld_in_rop; > > +}; > > + > > +/** > > + * struct fd_ret - aie fd result parameters. > > + * > > + * @anchor_x0: X coordinate of the top-left corner of each > > detected face. > > + * @anchor_x1: X coordinate of the bottom-right corner of each > > detected face. > > + * @anchor_y0: Y coordinate of the top-left corner of each > > detected face. > > + * @anchor_y1: Y coordinate of the bottom-right corner of each > > detected face. > > + * @rop_landmark_score0: Score for the first rotation pose > > landmark. > > + * @rop_landmark_score1: Score for the second rotation pose > > landmark. > > + * @rop_landmark_score2: Score for the third rotation pose > > landmark. > > + * @anchor_score: Score for the anchor points. > > + * @rip_landmark_score0: Score for the first rotation-invariant > > pose landmark. > > + * @rip_landmark_score1: Score for the second rotation-invariant > > pose landmark. > > + * @rip_landmark_score2: Score for the third rotation-invariant > > pose landmark. > > + * @rip_landmark_score3: Score for the fourth rotation-invariant > > pose landmark. > > + * @rip_landmark_score4: Score for the fifth rotation-invariant > > pose landmark. > > + * @rip_landmark_score5: Score for the sixth rotation-invariant > > pose landmark. > > + * @rip_landmark_score6: Score for the seventh rotation-invariant > > pose landmark. > > + * @face_result_index: Index of each detected face. > > + * @anchor_index: Index of the anchor points. > > + * @fd_partial_result: Partial face detection result. > > + */ > > +struct fd_ret { > > + __u16 anchor_x0[MAX_FACE_NUM]; > > + __u16 anchor_x1[MAX_FACE_NUM]; > > + __u16 anchor_y0[MAX_FACE_NUM]; > > + __u16 anchor_y1[MAX_FACE_NUM]; > > I don't know why use 'anchor' for face coordinate. > Maybe this is trivial for AI domain, but I think there should be some > brief introduction for who don't understand AI. This naming is based on the hardware specifications. The four variables othe anchor represent the coordinates of the four corners of the rectangular box. > > > + __s16 rop_landmark_score0[MAX_FACE_NUM]; > > + __s16 rop_landmark_score1[MAX_FACE_NUM]; > > + __s16 rop_landmark_score2[MAX_FACE_NUM]; > > I don't know what is landmark. > Why there are 'first', 'second', and 'third' landmark? > Maybe this is trivial for AI domain, but I think there should be some > brief introduction for who don't understand AI. rop (Rotation Out of Plane) refers to 3D rotation, which will select the highest scoring face direction among the front face, right profile, and left profile. > > > + __s16 anchor_score[MAX_FACE_NUM]; > > + __s16 rip_landmark_score0[MAX_FACE_NUM]; > > + __s16 rip_landmark_score1[MAX_FACE_NUM]; > > + __s16 rip_landmark_score2[MAX_FACE_NUM]; > > + __s16 rip_landmark_score3[MAX_FACE_NUM]; > > + __s16 rip_landmark_score4[MAX_FACE_NUM]; > > + __s16 rip_landmark_score5[MAX_FACE_NUM]; > > + __s16 rip_landmark_score6[MAX_FACE_NUM]; > > + __u16 face_result_index[MAX_FACE_NUM]; > > What is this index? > If fd_pyramid0_num is 100, so anchor_x0[0] ~ anchor_x0[99] is valid. > I don't know why need this index? RIP (Rotation In Plane) refers to planar rotation, which will determine the rotation angle by selecting the highest scoring angle among the six possible face rotation angles. face_result_index indicates which face is output by the hardware. > > > + __u16 anchor_index[MAX_FACE_NUM]; > > What is this index? The meaning of anchor_index is which anchor detected it. > > > + __u32 fd_partial_result; > > What is this result? > Describe more so that everyone could understand this. fd_partial_result indicates the number of faces detected in three different resized images (large, medium, and small) by the face detection (FD) process. > > > +}; > > It's better that one face information is placed together in DRAM > because it's easy to cache hit. > Then this structure would reform as: > > struct face { > __u16 anchor_x0; > __u16 anchor_x1; > __u16 anchor_y0; > __u16 anchor_y1; > ... > __u16 anchor_index; > ; > > struct fd_ret { > struct face face_result[MAX_FACE_NUM]; > __u32 fd_partial_result; > }; > > > + > > +/** > > + * struct fd_result - Face detection results for different pyramid > > levels. > > + * > > + * @fd_pyramid0_num: Number of faces detected at pyramid level 0. > > + * @fd_pyramid1_num: Number of faces detected at pyramid level 1. > > + * @fd_pyramid2_num: Number of faces detected at pyramid level 2. > > + * @fd_total_num: Total number of faces detected across all > > pyramid levels. > > + * @pyramid0_result: Detection results for pyramid level 0. > > + * @pyramid1_result: Detection results for pyramid level 1. > > + * @pyramid2_result: Detection results for pyramid level 2. > > + */ > > +struct fd_result { > > + __u16 fd_pyramid0_num; > > + __u16 fd_pyramid1_num; > > + __u16 fd_pyramid2_num; > > + __u16 fd_total_num; > > fd_total_num is redundant. > User space program could use (fd_pyramid0_num + fd_pyramid1_num + > fd_pyramid2_num) to get this. > So drop this. fd_total_num represents the final number of faces detected by the hardware. fd_pyramind0, fd_pyramind1, and fd_pyramind2 are the numbers of faces detected in the resized images (large, medium, and small) respectively. These are different variables and fd_total_num is not obtained by summing fd_pyramind0, fd_pyramind1, and fd_pyramind2. > > > + struct fd_ret pyramid0_result; > > + struct fd_ret pyramid1_result; > > + struct fd_ret pyramid2_result; > > +}; > > + > > +/** > > + * struct attr_result - Attribute detection results parameters. > > + * > > + * @gender_ret: Gender detection results. > > + * @ethnicity_ret: Ethnicity detection results. > > + * @merged_age_ret: Merged age detection results. > > + * @merged_gender_ret: Merged gender detection results. > > + * @merged_is_indian_ret: Merged results indicating if the subject > > is Indian. > > + * @merged_ethnicity_ret: Merged ethnicity detection results. > > + */ > > +struct attr_result { > > + __s16 gender_ret[2][64]; > > Is the value 0 for woman and 1 for man? > Why the array is [2][64]? > Explain what the first index mean and what the second index mean. > attribute mode has no coordinate information, so we just know how > many man and how many woman, > but we don't know which one (the position in video) is man or woman? > 0:male, 1:female > > + __s16 ethnicity_ret[4][64]; > > 0 for while, 1 for black, 2 for asian? 0: while, 1:black, 2:asian, 3:indian > > > + __s16 merged_age_ret[2]; > > Why the term 'merged'? > This is just naming > > + __s16 merged_gender_ret[2]; > > Ditto. > > > + __s16 merged_is_indian_ret[2]; > > indian is India people or indigenous people of America or both? > indian is India people > > + __s16 merged_ethnicity_ret[4]; > > +}; > > + > > +/** > > + * struct fld_landmark - FLD coordinates parameters. > > + * > > + * @x: X coordinate of the facial landmark. > > + * @y: Y coordinate of the facial landmark. > > + */ > > +struct fld_landmark { > > + __u16 x; > > + __u16 y; > > +}; > > + > > +/** > > + * struct fld_result - FLD detection results parameters. > > + * > > + * @fld_landmark: Array of facial landmarks, each with X and Y > > coordinates. > > + * @fld_out_rip: Output rotation-invariant pose value. > > + * @fld_out_rop: Output rotation pose value. > > + * @confidence: Confidence score of the facial landmark detection. > > + * @blinkscore: Blink score indicating the likelihood of eye > > blink. > > + */ > > +struct fld_result { > > + struct fld_landmark fld_landmark[FLD_CUR_LANDMARK]; > > + __u16 fld_out_rip; > > + __u16 fld_out_rop; > > + __u16 confidence; > > + __s16 blinkscore; > > +}; > > + > > +/** > > + * struct aie_enq_info - V4L2 Kernelspace parameters. > > + * > > + * @sel_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for > > current fd. > > + * FDMODE: Face Detection. > > + * ATTRIBUTEMODE: Gender and ethnicity detection > > + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and > > mouth > > + * @src_img_fmt: source image format. > > + * @src_img_width: the width of the source image. > > + * @src_img_height: the height of the source image. > > + * @src_img_stride: the stride of the source image. > > + * @pyramid_base_width: pyramid is the size of resizer, the width > > of the base pyramid. > > + * @pyramid_base_height: pyramid is the size of resizer, the width > > of the base pyramid. > > + * @number_of_pyramid: number of pyramid, min: 1, max: 3. > > + * @rotate_degree: the rotate degree of the image. > > + * @en_roi: enable roi(roi is a box diagram that selects a > > rectangle in a picture). > > + * when en_roi is enable, AIE will return a rectangle > > face detection result > > + * @src_roi: roi params. > > + * @en_padding: enable padding, this is only used on the hardware > > of yuv to rgb. > > + * and has noting to do with fd_mode > > + * @src_padding: padding params. > > + * @freq_level: frequency level, Get value from user space enque. > > + * @fld_face_num: the number of faces in fld. > > + * user space tells driver the number of > > detections. > > + * @fld_input: fld input params. > > + * @src_img_addr: Source image address. > > + * @src_img_addr_uv: Source image address for UV plane. > > + * @fd_out: Face detection results. > > + * @attr_out: Attribute detection results. > > + * @fld_out: Array of facial landmark detection results for > > multiple frames. > > + * @irq_status: Interrupt request status. > > + */ > > +struct aie_enq_info { > > + __u32 sel_mode; > > + __u32 src_img_fmt; > > + __u32 src_img_width; > > + __u32 src_img_height; > > + __u32 src_img_stride; > > + __u32 pyramid_base_width; > > + __u32 pyramid_base_height; > > + __u32 number_of_pyramid; > > + __u32 rotate_degree; > > + int en_roi; > > + struct aie_roi_coordinate src_roi; > > + int en_padding; > > + struct aie_padding_size src_padding; > > + unsigned int freq_level; > > + unsigned int fld_face_num; > > + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; > > Above information is set by user space, so driver is not necessary to > return these information back to user space. > Drop these. > This is the V4L2 enqueue structure, which requires information sent from user space. It cannot be dropped; otherwise, the FLD input will not receive data from the user. > > + __u32 src_img_addr; > > + __u32 src_img_addr_uv; > > If user space program need to access source image buffer, it should > map it itself. > Do not pass this address information from driver. > Drop these. > This is where the user provides the addresses of the Y and UV components of the image for the AIE to use. > > + struct fd_result fd_out; > > + struct attr_result attr_out; > > + struct fld_result fld_out[FLD_MAX_FRAME]; > > Union these three structure because hardware would work in one mode > in one time. > > > + __u32 irq_status; > > User space program should not process irq_status. > So drop this. > The irq status is provided for users to check the current status of the AIE. Users need to use it to monitor the AIE status in order to perform some custom operations in advance. > > +}; > > + > > +/** > > + * struct v4l2_ctrl_aie_param - V4L2 Userspace parameters. > > + * > > + * @fd_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for > > current fd. > > + * FDMODE: Face Detection. > > + * ATTRIBUTEMODE: Gender and ethnicity detection > > + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and > > mouth > > + * @src_img_fmt: source image format. > > + * @src_img_width: the width of the source image. > > + * @src_img_height: the height of the source image. > > + * @src_img_stride: the stride of the source image. > > + * @pyramid_base_width: pyramid is the size of resizer, the width > > of the base pyramid. > > + * @pyramid_base_height: pyramid is the size of resizer, the width > > of the base pyramid. > > + * @number_of_pyramid: number of pyramid, min: 1, max: 3. > > + * @rotate_degree: the rotate degree of the image. > > + * @en_roi: enable roi(roi is a box diagram that selects a > > rectangle in a picture). > > + * when en_roi is enable, AIE will return a rectangle > > face detection result > > + * @src_roi: roi params. > > + * @en_padding: enable padding, this is only used on the hardware > > of yuv to rgb. > > + * and has noting to do with fd_mode > > + * @src_padding: padding params. > > + * @freq_level: frequency level, Get value from user space enque. > > + * @fld_face_num: the number of faces in fld. > > + * user space tells driver the number of > > detections. > > + * @fld_input: fld input params. > > + */ > > +struct v4l2_ctrl_aie_param { > > + __u32 fd_mode; > > + __u32 src_img_fmt; > > + __u32 src_img_width; > > + __u32 src_img_height; > > + __u32 src_img_stride; > > Use V4L2 standard interface to set source image > format/width/height/stride. > This structure is used to obtain data sent by the user through s_ctrl. Users only need to set it up once, without repeatedly calling multiple IOCTLs to set width, height, and format. This approach can reduce system overhead. > > + __u32 pyramid_base_width; > > + __u32 pyramid_base_height; > > Does pyramid width/height has limitation? > If source image resolution is 3840x2160, does pyramid resolution > could be 3840x2160? > If there are hardware limitation, query this limitation from driver. > > Some AI model just accept some certain resolution, for example > 360x360. > In this case, any source resolution should resize to this resolution. > Does AIE need to resize to some fixed resolution? Or any resolution > is acceptable? > The AIE only supports a maximum source image size of 640x480. Similarly, the pyramid width and height are also 640x480. > > + __u32 number_of_pyramid; > > Why need multiple pyramid? > Does this AI model need the face resolution around 100x100? > If a face it 400x400, so it would not be detect so need to resize > this face to multiple smaller resolution. > In first pyramid face resolution is 400x400. > In second pyramid face resolution is 200x200. > In third pyramid face resolution is 100x100. > So this face would be detected in third pyramid. > > If this is right, add this information to document. > If not, add right information to document. > You are correct.The AIE hardware needs to downscale the source image twice to obtain three images. It then performs face detection on each of these three images. This process is designed to improve the recognition rate and speed of the AIE. such as source image:640x480 Pyramid0: 480x460, Detect small face Pyramid1:0.5x, 240x180 Pyramid2:0.25x, 120x90, Detect large face > > + __u32 rotate_degree; > > + __s32 en_roi; > > + struct aie_roi_coordinate src_roi; > > + __s32 en_padding; > > If padding has nothing to do with fd_mode, drop it. > Padding is used to determine whether additional areas outside the detected face should be included. Whether to enable it depends on the user. > > + struct aie_padding_size src_padding; > > + __u32 freq_level; > > freq_level is FPS? > What is freq_level and how hardware work with this? > The comment says "Get value from user space enque", what's this? > Describe more detail about this. > The freq_level is no longer in use, I will drop it > > + __u32 fld_face_num; > > + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; > > Does the fld mode need fd mode result? > The face_num is get from fd mode, and the fld_input coordinate is get > from fd mode also. > If so, describe more about this. > FLD is using detected face window information and head pose detected by AIE, to perform Binary Tree Traversal to approximate the final landmark coordination. FLD is using the AIE detection result, FLD and AIE won’t be operating at the same time. Therefore, FLD can share AIE’s DMAs and SRAM > Regards, > CK > > > +}; > > + > > +#endif /* __MTK_AIE_V4L2_CONTROLS_H__ */ > > diff --git a/include/uapi/linux/videodev2.h > > b/include/uapi/linux/videodev2.h > > index c8cb2796130f..329bbac9239e 100644 > > --- a/include/uapi/linux/videodev2.h > > +++ b/include/uapi/linux/videodev2.h > > @@ -858,6 +858,9 @@ struct v4l2_pix_format { > > #define V4L2_META_FMT_RK_ISP1_STAT_3A v4l2_fourcc('R', 'K', > > '1', 'S') /* Rockchip ISP1 3A Statistics */ > > #define V4L2_META_FMT_RK_ISP1_EXT_PARAMS v4l2_fourcc('R', 'K', > > '1', 'E') /* Rockchip ISP1 3a Extensible Parameters */ > > > > +/* Vendor-specific definition: used for the MediaTek camera > > subsystem's face detection results */ > > +#define V4L2_META_FMT_MTFD_RESULT v4l2_fourcc('M', 'T', 'f', 'd') > > + > > /* Vendor specific - used for RaspberryPi PiSP */ > > #define V4L2_META_FMT_RPI_BE_CFG v4l2_fourcc('R', 'P', 'B', 'C') > > /* PiSP BE configuration */ > > #define V4L2_META_FMT_RPI_FE_CFG v4l2_fourcc('R', 'P', 'F', 'C') > > /* PiSP FE configuration */ > > @@ -1965,6 +1968,9 @@ enum v4l2_ctrl_type { > > V4L2_CTRL_TYPE_AV1_TILE_GROUP_ENTRY = 0x281, > > V4L2_CTRL_TYPE_AV1_FRAME = 0x282, > > V4L2_CTRL_TYPE_AV1_FILM_GRAIN = 0x283, > > + > > + V4L2_CTRL_TYPE_AIE_INIT = 0x0290, > > + V4L2_CTRL_TYPE_AIE_PARAM = 0x0291, > > }; > > > > /* Used in the VIDIOC_QUERYCTRL ioctl for querying controls */ > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v5 3/4] uapi: linux: add MT8188 AIE 2025-04-23 4:02 ` Bo Kong (孔波) @ 2025-04-24 1:43 ` CK Hu (胡俊光) 2025-04-24 5:22 ` CK Hu (胡俊光) 1 sibling, 0 replies; 17+ messages in thread From: CK Hu (胡俊光) @ 2025-04-24 1:43 UTC (permalink / raw) To: linux-media@vger.kernel.org, AngeloGioacchino Del Regno, robh@kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, mchehab@kernel.org, Bo Kong (孔波), linux-mediatek@lists.infradead.org Cc: Zhaoyuan Chen (陈兆远), Teddy Chen (陳乾元), Project_Global_Chrome_Upstream_Group On Wed, 2025-04-23 at 04:02 +0000, Bo Kong (孔波) wrote: > Hi,CK, > Thanks for the reviews. > > On Mon, 2025-04-07 at 03:57 +0000, CK Hu (胡俊光) wrote: > > On Thu, 2025-04-03 at 15:38 +0800, bo.kong wrote: > > > From: Bo Kong <Bo.Kong@mediatek.com> > > > > > > Add AIE control related definitions. > > > > Laurent has asked you to provide document of the UAPI. > > If you have no idea what to write to this document, > > I ask you question in this mail and you write the answer to the > > document. > > OK, I will reply with the answer here. > > > > > > > > Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> > > > --- > > > > [snip] > > > > > diff --git a/include/uapi/linux/mtk_aie_v4l2_controls.h > > > b/include/uapi/linux/mtk_aie_v4l2_controls.h > > > new file mode 100644 > > > index 000000000000..952dcdb23283 > > > --- /dev/null > > > +++ b/include/uapi/linux/mtk_aie_v4l2_controls.h > > > @@ -0,0 +1,308 @@ > > > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ > > > +/* > > > + * AIE Controls Header > > > + * > > > + * Copyright (c) 2020 MediaTek Inc. > > > + * Author: Bo Kong <bo.kong@mediatek.com> > > > + */ > > > + > > > +#ifndef __MTK_AIE_V4L2_CONTROLS_H__ > > > +#define __MTK_AIE_V4L2_CONTROLS_H__ > > > + > > > +#include <linux/types.h> > > > + > > > +/* > > > + * The base for the mediatek Face Detection driver controls. > > > + * We reserve 16 controls for this driver. > > > + * Each CID represents different stages of AIE, with different > > > structures and functions > > > + * and cannot be reused > > > + */ > > > +#define V4L2_CID_USER_MTK_FD_BASE (V4L2_CID_USER_BASE + 0x1fd0) > > > +#define V4L2_CID_MTK_AIE_INIT (V4L2_CID_USER_MTK_FD_BASE + 1) > > > +#define V4L2_CID_MTK_AIE_PARAM (V4L2_CID_USER_MTK_FD_BASE + 2) > > > + > > > +#define MAX_FACE_NUM 1024 > > > +#define FLD_CUR_LANDMARK 11 > > > +#define FLD_MAX_FRAME 15 > > > + > > > +/** > > > + * struct v4l2_ctrl_aie_init - aie init parameters. > > > + * > > > + * @max_img_width: maximum width of the source image. > > > + * @max_img_height: maximum height of the source image. > > > + * @pyramid_width: maximum width of the base pyramid. > > > + * @pyramid_height: maximum height of the base pyramid. > > > + * @feature_threshold: The threshold for the face > > > confidence.Range: 100 ~ 400. > > > + * The larger the value,the lower the face > > > recognition rate > > > + */ > > > +struct v4l2_ctrl_aie_init { > > > + __u32 max_img_width; > > > + __u32 max_img_height; > > > > max image width/height is hardware limitation. This should be stored > > in driver. > > User space program should use some V4L2 query interface to get this > > information. > > This is the max size of the image sent by the user, not the maximum > size of the hardware. For example, for a 640x480 image, the user needs > to set max_img_width = 640 and max_img_height = 480. I trace the driver and find out that max_img_width and max_img_height are used for pure software checking. This checking could be done by user space program. When user space program receive a new input image, it could check its resolution before send it to driver. Checking earlier is better, so it's not necessary pass max_img_width and max_img_height to driver. > > > > > > + __u32 pyramid_width; > > > + __u32 pyramid_height; > > > > Ditto. > > > > > + __s32 feature_threshold; > > > > How to decide this value? > > I guess there is a tuning process to find a reasonable threshold. > > Is there only one reasonable threshold for all scenario? > > If so, write done this one threshold in driver and user space is not > > necessary to set this value to driver. > > If the threshold would vary for different scenario, for example, 168 > > for indoor video and 258 for outdoor video, > > add comment to describe this so user space program have to detect the > > scenario of input video. > > This value depends on whether the user prioritizes recognition accuracy > or recognition speed. There is no standard value, as it provides users > with multiple options. So, 2147483647 would have best accuracy and lowest speed, -2147483648 would have worst accuracy and highest speed, and 0 is a balance of accuracy and speed. If so, add this to document. > > > > > > +}; > > > + > > > +/** > > > + * struct aie_roi_coordinate - aie roi parameters. > > > + * > > > + * @x1: x1 of the roi coordinate. > > > + * @y1: y1 of the roi coordinate. > > > + * @x2: x2 of the roi coordinate. > > > + * @y2: y2 of the roi coordinate. > > > + */ > > > +struct aie_roi_coordinate { > > > + __u32 x1; > > > + __u32 y1; > > > + __u32 x2; > > > + __u32 y2; > > > +}; > > > + > > > +/** > > > + * struct aie_padding_size - aie padding parameters. > > > + * > > > + * @left: the size of padding left. > > > + * @right: the size of padding right. > > > + * @down: the size of padding below. > > > + * @up: the size of padding above. > > > + */ > > > +struct aie_padding_size { > > > + __u32 left; > > > + __u32 right; > > > + __u32 down; > > > + __u32 up; > > > +}; > > > + > > > +/** > > > + * struct v4l2_fld_crop_rip_rop - aie fld parameters. > > > + * > > > + * @fld_in_crop_x1: x1 of the crop coordinate. > > > + * @fld_in_crop_y1: y1 of the crop coordinate. > > > + * @fld_in_crop_x2: x2 of the crop coordinate. > > > + * @fld_in_crop_y2: y2 of the crop coordinate. > > > + * @fld_in_rip: fld in rip. > > > + * @fld_in_rop: fld in rop. > > > + */ > > > +struct v4l2_fld_crop_rip_rop { > > > + __u32 fld_in_crop_x1; > > > + __u32 fld_in_crop_y1; > > > + __u32 fld_in_crop_x2; > > > + __u32 fld_in_crop_y2; > > > + __u32 fld_in_rip; > > > + __u32 fld_in_rop; > > > +}; > > > + > > > +/** > > > + * struct fd_ret - aie fd result parameters. > > > + * > > > + * @anchor_x0: X coordinate of the top-left corner of each > > > detected face. > > > + * @anchor_x1: X coordinate of the bottom-right corner of each > > > detected face. > > > + * @anchor_y0: Y coordinate of the top-left corner of each > > > detected face. > > > + * @anchor_y1: Y coordinate of the bottom-right corner of each > > > detected face. > > > + * @rop_landmark_score0: Score for the first rotation pose > > > landmark. > > > + * @rop_landmark_score1: Score for the second rotation pose > > > landmark. > > > + * @rop_landmark_score2: Score for the third rotation pose > > > landmark. > > > + * @anchor_score: Score for the anchor points. > > > + * @rip_landmark_score0: Score for the first rotation-invariant > > > pose landmark. > > > + * @rip_landmark_score1: Score for the second rotation-invariant > > > pose landmark. > > > + * @rip_landmark_score2: Score for the third rotation-invariant > > > pose landmark. > > > + * @rip_landmark_score3: Score for the fourth rotation-invariant > > > pose landmark. > > > + * @rip_landmark_score4: Score for the fifth rotation-invariant > > > pose landmark. > > > + * @rip_landmark_score5: Score for the sixth rotation-invariant > > > pose landmark. > > > + * @rip_landmark_score6: Score for the seventh rotation-invariant > > > pose landmark. > > > + * @face_result_index: Index of each detected face. > > > + * @anchor_index: Index of the anchor points. > > > + * @fd_partial_result: Partial face detection result. > > > + */ > > > +struct fd_ret { > > > + __u16 anchor_x0[MAX_FACE_NUM]; > > > + __u16 anchor_x1[MAX_FACE_NUM]; > > > + __u16 anchor_y0[MAX_FACE_NUM]; > > > + __u16 anchor_y1[MAX_FACE_NUM]; > > > > I don't know why use 'anchor' for face coordinate. > > Maybe this is trivial for AI domain, but I think there should be some > > brief introduction for who don't understand AI. > > This naming is based on the hardware specifications. The four variables > othe anchor represent the coordinates of the four corners of the > rectangular box. > > > > > > + __s16 rop_landmark_score0[MAX_FACE_NUM]; > > > + __s16 rop_landmark_score1[MAX_FACE_NUM]; > > > + __s16 rop_landmark_score2[MAX_FACE_NUM]; > > > > I don't know what is landmark. > > Why there are 'first', 'second', and 'third' landmark? > > Maybe this is trivial for AI domain, but I think there should be some > > brief introduction for who don't understand AI. > > rop (Rotation Out of Plane) refers to 3D rotation, which will select the highest scoring face direction among the front face, right profile, and left profile. > > > > > > + __s16 anchor_score[MAX_FACE_NUM]; > > > + __s16 rip_landmark_score0[MAX_FACE_NUM]; > > > + __s16 rip_landmark_score1[MAX_FACE_NUM]; > > > + __s16 rip_landmark_score2[MAX_FACE_NUM]; > > > + __s16 rip_landmark_score3[MAX_FACE_NUM]; > > > + __s16 rip_landmark_score4[MAX_FACE_NUM]; > > > + __s16 rip_landmark_score5[MAX_FACE_NUM]; > > > + __s16 rip_landmark_score6[MAX_FACE_NUM]; > > > + __u16 face_result_index[MAX_FACE_NUM]; > > > > What is this index? > > If fd_pyramid0_num is 100, so anchor_x0[0] ~ anchor_x0[99] is valid. > > I don't know why need this index? > > RIP (Rotation In Plane) refers to planar rotation, which will determine > the rotation angle by selecting the highest scoring angle among the six > possible face rotation angles. Even though user space know ROP and RIP, what could it do by using this score? This looks like a temporary information to detect face. If hardware has already detection the position of a face, what we could do by using these rotation information? > > face_result_index indicates which face is output by the hardware. If face_result_index[0] = 0 face_result_index[1] = 1 face_result_index[2] = 0 face_result_index[3] = 1 Does this mean that face 0 and face 2 is not a face, and face 1 and face 3 is face? > > > > > > + __u16 anchor_index[MAX_FACE_NUM]; > > > > What is this index? > > The meaning of anchor_index is which anchor detected it. This is the came as ROP and RIP. What could user space program do by using this information? > > > > > > + __u32 fd_partial_result; > > > > What is this result? > > Describe more so that everyone could understand this. > > fd_partial_result indicates the number of faces detected in three > different resized images (large, medium, and small) by the face > detection (FD) process. This is identical to fd_pyramidX_num (X = 0, 1, 2) in struct fd_result, and this naming does not show the 'number of faces', so remove this. > > > > > > +}; > > > > It's better that one face information is placed together in DRAM > > because it's easy to cache hit. > > Then this structure would reform as: > > > > struct face { > > __u16 anchor_x0; > > __u16 anchor_x1; > > __u16 anchor_y0; > > __u16 anchor_y1; > > ... > > __u16 anchor_index; > > ; > > > > struct fd_ret { > > struct face face_result[MAX_FACE_NUM]; > > __u32 fd_partial_result; > > }; > > > > > + > > > +/** > > > + * struct fd_result - Face detection results for different pyramid > > > levels. > > > + * > > > + * @fd_pyramid0_num: Number of faces detected at pyramid level 0. > > > + * @fd_pyramid1_num: Number of faces detected at pyramid level 1. > > > + * @fd_pyramid2_num: Number of faces detected at pyramid level 2. > > > + * @fd_total_num: Total number of faces detected across all > > > pyramid levels. > > > + * @pyramid0_result: Detection results for pyramid level 0. > > > + * @pyramid1_result: Detection results for pyramid level 1. > > > + * @pyramid2_result: Detection results for pyramid level 2. > > > + */ > > > +struct fd_result { > > > + __u16 fd_pyramid0_num; > > > + __u16 fd_pyramid1_num; > > > + __u16 fd_pyramid2_num; > > > + __u16 fd_total_num; > > > > fd_total_num is redundant. > > User space program could use (fd_pyramid0_num + fd_pyramid1_num + > > fd_pyramid2_num) to get this. > > So drop this. > > fd_total_num represents the final number of faces detected by the > hardware. fd_pyramind0, fd_pyramind1, and fd_pyramind2 are the numbers > of faces detected in the resized images (large, medium, and small) > respectively. These are different variables and fd_total_num is not > obtained by summing fd_pyramind0, fd_pyramind1, and fd_pyramind2. How to use fd_total_num? If fd_pyramid0_num = 5 fd_pyramid1_num = 4 fd_pyramid2_num = 6 fd_total_num = 12 We could find 5 face coordinate in pyramid0_result; We could find 4 face coordinate in pyramid1_result; We could find 6 face coordinate in pyramid2_result; How do we find 12 face coordinate of fd_total_num? We have no any information of these 12 faces. > > > > > > + struct fd_ret pyramid0_result; > > > + struct fd_ret pyramid1_result; > > > + struct fd_ret pyramid2_result; > > > +}; > > > + > > > +/** > > > + * struct attr_result - Attribute detection results parameters. > > > + * > > > + * @gender_ret: Gender detection results. > > > + * @ethnicity_ret: Ethnicity detection results. > > > + * @merged_age_ret: Merged age detection results. > > > + * @merged_gender_ret: Merged gender detection results. > > > + * @merged_is_indian_ret: Merged results indicating if the subject > > > is Indian. > > > + * @merged_ethnicity_ret: Merged ethnicity detection results. > > > + */ > > > +struct attr_result { > > > + __s16 gender_ret[2][64]; > > > > Is the value 0 for woman and 1 for man? > > Why the array is [2][64]? > > Explain what the first index mean and what the second index mean. > > attribute mode has no coordinate information, so we just know how > > many man and how many woman, > > but we don't know which one (the position in video) is man or woman? > > > > 0:male, 1:female You forgot to answer what does the array index mean. Or you just want to answer in document. > > > > + __s16 ethnicity_ret[4][64]; > > > > 0 for while, 1 for black, 2 for asian? > > 0: while, 1:black, 2:asian, 3:indian When add this information to document, 3: India people > > > > > > + __s16 merged_age_ret[2]; > > > > Why the term 'merged'? > > > > This is just naming If 'merged' has no any meaning, remove this prefix. > > > > + __s16 merged_gender_ret[2]; > > > > Ditto. > > > > > + __s16 merged_is_indian_ret[2]; > > > > indian is India people or indigenous people of America or both? > > > > indian is India people > > > > + __s16 merged_ethnicity_ret[4]; > > > +}; > > > + > > > +/** > > > + * struct fld_landmark - FLD coordinates parameters. > > > + * > > > + * @x: X coordinate of the facial landmark. > > > + * @y: Y coordinate of the facial landmark. > > > + */ > > > +struct fld_landmark { > > > + __u16 x; > > > + __u16 y; > > > +}; > > > + > > > +/** > > > + * struct fld_result - FLD detection results parameters. > > > + * > > > + * @fld_landmark: Array of facial landmarks, each with X and Y > > > coordinates. > > > + * @fld_out_rip: Output rotation-invariant pose value. > > > + * @fld_out_rop: Output rotation pose value. > > > + * @confidence: Confidence score of the facial landmark detection. > > > + * @blinkscore: Blink score indicating the likelihood of eye > > > blink. > > > + */ > > > +struct fld_result { > > > + struct fld_landmark fld_landmark[FLD_CUR_LANDMARK]; > > > + __u16 fld_out_rip; > > > + __u16 fld_out_rop; > > > + __u16 confidence; > > > + __s16 blinkscore; > > > +}; > > > + > > > +/** > > > + * struct aie_enq_info - V4L2 Kernelspace parameters. > > > + * > > > + * @sel_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for > > > current fd. > > > + * FDMODE: Face Detection. > > > + * ATTRIBUTEMODE: Gender and ethnicity detection > > > + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and > > > mouth > > > + * @src_img_fmt: source image format. > > > + * @src_img_width: the width of the source image. > > > + * @src_img_height: the height of the source image. > > > + * @src_img_stride: the stride of the source image. > > > + * @pyramid_base_width: pyramid is the size of resizer, the width > > > of the base pyramid. > > > + * @pyramid_base_height: pyramid is the size of resizer, the width > > > of the base pyramid. > > > + * @number_of_pyramid: number of pyramid, min: 1, max: 3. > > > + * @rotate_degree: the rotate degree of the image. > > > + * @en_roi: enable roi(roi is a box diagram that selects a > > > rectangle in a picture). > > > + * when en_roi is enable, AIE will return a rectangle > > > face detection result > > > + * @src_roi: roi params. > > > + * @en_padding: enable padding, this is only used on the hardware > > > of yuv to rgb. > > > + * and has noting to do with fd_mode > > > + * @src_padding: padding params. > > > + * @freq_level: frequency level, Get value from user space enque. > > > + * @fld_face_num: the number of faces in fld. > > > + * user space tells driver the number of > > > detections. > > > + * @fld_input: fld input params. > > > + * @src_img_addr: Source image address. > > > + * @src_img_addr_uv: Source image address for UV plane. > > > + * @fd_out: Face detection results. > > > + * @attr_out: Attribute detection results. > > > + * @fld_out: Array of facial landmark detection results for > > > multiple frames. > > > + * @irq_status: Interrupt request status. > > > + */ > > > +struct aie_enq_info { > > > + __u32 sel_mode; > > > + __u32 src_img_fmt; > > > + __u32 src_img_width; > > > + __u32 src_img_height; > > > + __u32 src_img_stride; > > > + __u32 pyramid_base_width; > > > + __u32 pyramid_base_height; > > > + __u32 number_of_pyramid; > > > + __u32 rotate_degree; > > > + int en_roi; > > > + struct aie_roi_coordinate src_roi; > > > + int en_padding; > > > + struct aie_padding_size src_padding; > > > + unsigned int freq_level; > > > + unsigned int fld_face_num; > > > + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; > > > > Above information is set by user space, so driver is not necessary to > > return these information back to user space. > > Drop these. > > > > This is the V4L2 enqueue structure, which requires information sent > from user space. It cannot be dropped; otherwise, the FLD input will > not receive data from the user. > > > > + __u32 src_img_addr; > > > + __u32 src_img_addr_uv; > > > > If user space program need to access source image buffer, it should > > map it itself. > > Do not pass this address information from driver. > > Drop these. > > > > This is where the user provides the addresses of the Y and UV > components of the image for the AIE to use. No, src_img_addr and src_img_addr_uv is not provided by user. In driver, mtk_aie_job_ready(), src_img[0].dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0); if (ctx->src_fmt.num_planes == 2) { src_img[1].dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 1); } if ((fd->aie_cfg->sel_mode == FDMODE || fd->aie_cfg->sel_mode == ATTRIBUTEMODE) && fd->aie_cfg->src_img_fmt == FMT_YUV420_1P) { src_img[1].dma_addr = src_img[0].dma_addr + ctx->user_param.src_img_stride * ctx->user_param.src_img_height; } fd->aie_cfg->src_img_addr = src_img[0].dma_addr; fd->aie_cfg->src_img_addr_uv = src_img[1].dma_addr; Driver would get these address itself. It's impossible for user space program to provide these address. So drop src_img_addr and src_img_addr_uv from the user space interface. > > > > + struct fd_result fd_out; > > > + struct attr_result attr_out; > > > + struct fld_result fld_out[FLD_MAX_FRAME]; > > > > Union these three structure because hardware would work in one mode > > in one time. > > > > > + __u32 irq_status; > > > > User space program should not process irq_status. > > So drop this. > > > > The irq status is provided for users to check the current status of the > AIE. Users need to use it to monitor the AIE status in order to perform > some custom operations in advance. It's not a good idea to expose hardware information to user space. In new SoC, irq_status may have different definition and this interface would vary by SoC. It's better to define a type which is not direct hardware information. For example enum AIE_STATUS { AIE_SUCCESS, AIE_ERR_XXX, AIE_ERR_YYY, AIE_ERR_ZZZ, }; I'm curious what user space program could do when AIE driver has error? If user space program just do nothing, I think it's not necessary to return AIE_STATUS, and just return FD, ATTR, FLD with no data. When user space get no data, it would also do nothing. > > > > +}; > > > + > > > +/** > > > + * struct v4l2_ctrl_aie_param - V4L2 Userspace parameters. > > > + * > > > + * @fd_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for > > > current fd. > > > + * FDMODE: Face Detection. > > > + * ATTRIBUTEMODE: Gender and ethnicity detection > > > + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and > > > mouth > > > + * @src_img_fmt: source image format. > > > + * @src_img_width: the width of the source image. > > > + * @src_img_height: the height of the source image. > > > + * @src_img_stride: the stride of the source image. > > > + * @pyramid_base_width: pyramid is the size of resizer, the width > > > of the base pyramid. > > > + * @pyramid_base_height: pyramid is the size of resizer, the width > > > of the base pyramid. > > > + * @number_of_pyramid: number of pyramid, min: 1, max: 3. > > > + * @rotate_degree: the rotate degree of the image. > > > + * @en_roi: enable roi(roi is a box diagram that selects a > > > rectangle in a picture). > > > + * when en_roi is enable, AIE will return a rectangle > > > face detection result > > > + * @src_roi: roi params. > > > + * @en_padding: enable padding, this is only used on the hardware > > > of yuv to rgb. > > > + * and has noting to do with fd_mode > > > + * @src_padding: padding params. > > > + * @freq_level: frequency level, Get value from user space enque. > > > + * @fld_face_num: the number of faces in fld. > > > + * user space tells driver the number of > > > detections. > > > + * @fld_input: fld input params. > > > + */ > > > +struct v4l2_ctrl_aie_param { > > > + __u32 fd_mode; > > > + __u32 src_img_fmt; > > > + __u32 src_img_width; > > > + __u32 src_img_height; > > > + __u32 src_img_stride; > > > > Use V4L2 standard interface to set source image > > format/width/height/stride. > > > > This structure is used to obtain data sent by the user through s_ctrl. > Users only need to set it up once, without repeatedly calling multiple > IOCTLs to set width, height, and format. This approach can reduce > system overhead. > > > > + __u32 pyramid_base_width; > > > + __u32 pyramid_base_height; > > > > Does pyramid width/height has limitation? > > If source image resolution is 3840x2160, does pyramid resolution > > could be 3840x2160? > > If there are hardware limitation, query this limitation from driver. > > > > Some AI model just accept some certain resolution, for example > > 360x360. > > In this case, any source resolution should resize to this resolution. > > Does AIE need to resize to some fixed resolution? Or any resolution > > is acceptable? > > > > The AIE only supports a maximum source image size of 640x480. > Similarly, the pyramid width and height are also 640x480. > > > > + __u32 number_of_pyramid; > > > > Why need multiple pyramid? > > Does this AI model need the face resolution around 100x100? > > If a face it 400x400, so it would not be detect so need to resize > > this face to multiple smaller resolution. > > In first pyramid face resolution is 400x400. > > In second pyramid face resolution is 200x200. > > In third pyramid face resolution is 100x100. > > So this face would be detected in third pyramid. > > > > If this is right, add this information to document. > > If not, add right information to document. > > > > You are correct.The AIE hardware needs to downscale the source image > twice to obtain three images. It then performs face detection on each > of these three images. This process is designed to improve the > recognition rate and speed of the AIE. > > such as source image:640x480 > Pyramid0: 480x460, Detect small face > Pyramid1:0.5x, 240x180 > Pyramid2:0.25x, 120x90, Detect large face > > > > + __u32 rotate_degree; > > > + __s32 en_roi; > > > + struct aie_roi_coordinate src_roi; > > > + __s32 en_padding; > > > > If padding has nothing to do with fd_mode, drop it. > > > > Padding is used to determine whether additional areas outside the > detected face should be included. Whether to enable it depends on the > user. I trace the driver, and find that padding is just to enlarge source roi. You could enlarge source roi in user space program in advance. So padding is redundant. Drop padding interface. > > > > + struct aie_padding_size src_padding; > > > + __u32 freq_level; > > > > freq_level is FPS? > > What is freq_level and how hardware work with this? > > The comment says "Get value from user space enque", what's this? > > Describe more detail about this. > > > > The freq_level is no longer in use, I will drop it > > > > + __u32 fld_face_num; > > > + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; > > > > Does the fld mode need fd mode result? > > The face_num is get from fd mode, and the fld_input coordinate is get > > from fd mode also. > > If so, describe more about this. > > > > FLD is using detected face window information and head pose detected by > AIE, to perform Binary Tree Traversal to approximate the final landmark > coordination. FLD is using the AIE detection result, FLD and AIE won’t > be operating at the same time. Therefore, FLD can share AIE’s DMAs and > SRAM Provide a sample code to show how to use it. Regards, CK > > > Regards, > > CK > > > > > +}; > > > + > > > +#endif /* __MTK_AIE_V4L2_CONTROLS_H__ */ > > > diff --git a/include/uapi/linux/videodev2.h > > > b/include/uapi/linux/videodev2.h > > > index c8cb2796130f..329bbac9239e 100644 > > > --- a/include/uapi/linux/videodev2.h > > > +++ b/include/uapi/linux/videodev2.h > > > @@ -858,6 +858,9 @@ struct v4l2_pix_format { > > > #define V4L2_META_FMT_RK_ISP1_STAT_3A v4l2_fourcc('R', 'K', > > > '1', 'S') /* Rockchip ISP1 3A Statistics */ > > > #define V4L2_META_FMT_RK_ISP1_EXT_PARAMS v4l2_fourcc('R', 'K', > > > '1', 'E') /* Rockchip ISP1 3a Extensible Parameters */ > > > > > > +/* Vendor-specific definition: used for the MediaTek camera > > > subsystem's face detection results */ > > > +#define V4L2_META_FMT_MTFD_RESULT v4l2_fourcc('M', 'T', 'f', 'd') > > > + > > > /* Vendor specific - used for RaspberryPi PiSP */ > > > #define V4L2_META_FMT_RPI_BE_CFG v4l2_fourcc('R', 'P', 'B', 'C') > > > /* PiSP BE configuration */ > > > #define V4L2_META_FMT_RPI_FE_CFG v4l2_fourcc('R', 'P', 'F', 'C') > > > /* PiSP FE configuration */ > > > @@ -1965,6 +1968,9 @@ enum v4l2_ctrl_type { > > > V4L2_CTRL_TYPE_AV1_TILE_GROUP_ENTRY = 0x281, > > > V4L2_CTRL_TYPE_AV1_FRAME = 0x282, > > > V4L2_CTRL_TYPE_AV1_FILM_GRAIN = 0x283, > > > + > > > + V4L2_CTRL_TYPE_AIE_INIT = 0x0290, > > > + V4L2_CTRL_TYPE_AIE_PARAM = 0x0291, > > > }; > > > > > > /* Used in the VIDIOC_QUERYCTRL ioctl for querying controls */ > > > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v5 3/4] uapi: linux: add MT8188 AIE 2025-04-23 4:02 ` Bo Kong (孔波) 2025-04-24 1:43 ` CK Hu (胡俊光) @ 2025-04-24 5:22 ` CK Hu (胡俊光) 1 sibling, 0 replies; 17+ messages in thread From: CK Hu (胡俊光) @ 2025-04-24 5:22 UTC (permalink / raw) To: linux-media@vger.kernel.org, AngeloGioacchino Del Regno, robh@kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, mchehab@kernel.org, Bo Kong (孔波), linux-mediatek@lists.infradead.org Cc: Zhaoyuan Chen (陈兆远), Teddy Chen (陳乾元), Project_Global_Chrome_Upstream_Group On Wed, 2025-04-23 at 04:02 +0000, Bo Kong (孔波) wrote: > Hi,CK, > Thanks for the reviews. > > > > > +/** > > > + * struct aie_enq_info - V4L2 Kernelspace parameters. > > > + * > > > + * @sel_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for > > > current fd. > > > + * FDMODE: Face Detection. > > > + * ATTRIBUTEMODE: Gender and ethnicity detection > > > + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and > > > mouth > > > + * @src_img_fmt: source image format. > > > + * @src_img_width: the width of the source image. > > > + * @src_img_height: the height of the source image. > > > + * @src_img_stride: the stride of the source image. > > > + * @pyramid_base_width: pyramid is the size of resizer, the width > > > of the base pyramid. > > > + * @pyramid_base_height: pyramid is the size of resizer, the width > > > of the base pyramid. > > > + * @number_of_pyramid: number of pyramid, min: 1, max: 3. > > > + * @rotate_degree: the rotate degree of the image. > > > + * @en_roi: enable roi(roi is a box diagram that selects a > > > rectangle in a picture). > > > + * when en_roi is enable, AIE will return a rectangle > > > face detection result > > > + * @src_roi: roi params. > > > + * @en_padding: enable padding, this is only used on the hardware > > > of yuv to rgb. > > > + * and has noting to do with fd_mode > > > + * @src_padding: padding params. > > > + * @freq_level: frequency level, Get value from user space enque. > > > + * @fld_face_num: the number of faces in fld. > > > + * user space tells driver the number of > > > detections. > > > + * @fld_input: fld input params. > > > + * @src_img_addr: Source image address. > > > + * @src_img_addr_uv: Source image address for UV plane. > > > + * @fd_out: Face detection results. > > > + * @attr_out: Attribute detection results. > > > + * @fld_out: Array of facial landmark detection results for > > > multiple frames. > > > + * @irq_status: Interrupt request status. > > > + */ > > > +struct aie_enq_info { > > > + __u32 sel_mode; > > > + __u32 src_img_fmt; > > > + __u32 src_img_width; > > > + __u32 src_img_height; > > > + __u32 src_img_stride; > > > + __u32 pyramid_base_width; > > > + __u32 pyramid_base_height; > > > + __u32 number_of_pyramid; > > > + __u32 rotate_degree; > > > + int en_roi; > > > + struct aie_roi_coordinate src_roi; > > > + int en_padding; > > > + struct aie_padding_size src_padding; > > > + unsigned int freq_level; > > > + unsigned int fld_face_num; > > > + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; > > > > Above information is set by user space, so driver is not necessary to > > return these information back to user space. > > Drop these. > > > > This is the V4L2 enqueue structure, which requires information sent > from user space. It cannot be dropped; otherwise, the FLD input will > not receive data from the user. In struct v4l2_ctrl_aie_param, you have already define these parameter, so it's not necessary to set these parameter by two interface to driver. So drop these. Regards, CK > > > > + __u32 src_img_addr; > > > + __u32 src_img_addr_uv; > > > > If user space program need to access source image buffer, it should > > map it itself. > > Do not pass this address information from driver. > > Drop these. > > > > This is where the user provides the addresses of the Y and UV > components of the image for the AIE to use. > > > > + struct fd_result fd_out; > > > + struct attr_result attr_out; > > > + struct fld_result fld_out[FLD_MAX_FRAME]; > > > > Union these three structure because hardware would work in one mode > > in one time. > > > > > + __u32 irq_status; > > > > User space program should not process irq_status. > > So drop this. > > > > The irq status is provided for users to check the current status of the > AIE. Users need to use it to monitor the AIE status in order to perform > some custom operations in advance. > > > > +}; > > > + ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v5 3/4] uapi: linux: add MT8188 AIE 2025-04-07 3:57 ` CK Hu (胡俊光) 2025-04-23 4:02 ` Bo Kong (孔波) @ 2025-04-23 9:29 ` Bo Kong (孔波) 1 sibling, 0 replies; 17+ messages in thread From: Bo Kong (孔波) @ 2025-04-23 9:29 UTC (permalink / raw) To: CK Hu (胡俊光), linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org Cc: Teddy Chen (陳乾元), Bo Kong (孔波) Hi,CK, Thanks for the reviews. It seems the previous email wasn't updated to patchwork, so I'm resending it here. On Mon, 2025-04-07 at 03:57 +0000, CK Hu (胡俊光) wrote: > On Thu, 2025-04-03 at 15:38 +0800, bo.kong wrote: > > From: Bo Kong <Bo.Kong@mediatek.com> > > > > Add AIE control related definitions. > > Laurent has asked you to provide document of the UAPI. > If you have no idea what to write to this document, > I ask you question in this mail and you write the answer to the > document. OK, I will reply with the answer here. > > > > > Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> > > --- > > [snip] > > > diff --git a/include/uapi/linux/mtk_aie_v4l2_controls.h > > b/include/uapi/linux/mtk_aie_v4l2_controls.h > > new file mode 100644 > > index 000000000000..952dcdb23283 > > --- /dev/null > > +++ b/include/uapi/linux/mtk_aie_v4l2_controls.h > > @@ -0,0 +1,308 @@ > > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ > > +/* > > + * AIE Controls Header > > + * > > + * Copyright (c) 2020 MediaTek Inc. > > + * Author: Bo Kong <bo.kong@mediatek.com> > > + */ > > + > > +#ifndef __MTK_AIE_V4L2_CONTROLS_H__ > > +#define __MTK_AIE_V4L2_CONTROLS_H__ > > + > > +#include <linux/types.h> > > + > > +/* > > + * The base for the mediatek Face Detection driver controls. > > + * We reserve 16 controls for this driver. > > + * Each CID represents different stages of AIE, with different > > structures and functions > > + * and cannot be reused > > + */ > > +#define V4L2_CID_USER_MTK_FD_BASE (V4L2_CID_USER_BASE + 0x1fd0) > > +#define V4L2_CID_MTK_AIE_INIT (V4L2_CID_USER_MTK_FD_BASE + 1) > > +#define V4L2_CID_MTK_AIE_PARAM (V4L2_CID_USER_MTK_FD_BASE + 2) > > + > > +#define MAX_FACE_NUM 1024 > > +#define FLD_CUR_LANDMARK 11 > > +#define FLD_MAX_FRAME 15 > > + > > +/** > > + * struct v4l2_ctrl_aie_init - aie init parameters. > > + * > > + * @max_img_width: maximum width of the source image. > > + * @max_img_height: maximum height of the source image. > > + * @pyramid_width: maximum width of the base pyramid. > > + * @pyramid_height: maximum height of the base pyramid. > > + * @feature_threshold: The threshold for the face > > confidence.Range: 100 ~ 400. > > + * The larger the value,the lower the face > > recognition rate > > + */ > > +struct v4l2_ctrl_aie_init { > > + __u32 max_img_width; > > + __u32 max_img_height; > > max image width/height is hardware limitation. This should be stored > in driver. > User space program should use some V4L2 query interface to get this > information. This is the max size of the image sent by the user, not the maximum size of the hardware. For example, for a 640x480 image, the user needs to set max_img_width = 640 and max_img_height = 480. > > > + __u32 pyramid_width; > > + __u32 pyramid_height; > > Ditto. > > > + __s32 feature_threshold; > > How to decide this value? > I guess there is a tuning process to find a reasonable threshold. > Is there only one reasonable threshold for all scenario? > If so, write done this one threshold in driver and user space is not > necessary to set this value to driver. > If the threshold would vary for different scenario, for example, 168 > for indoor video and 258 for outdoor video, > add comment to describe this so user space program have to detect the > scenario of input video. This value depends on whether the user prioritizes recognition accuracy or recognition speed. There is no standard value, as it provides users with multiple options. > > > +}; > > + > > +/** > > + * struct aie_roi_coordinate - aie roi parameters. > > + * > > + * @x1: x1 of the roi coordinate. > > + * @y1: y1 of the roi coordinate. > > + * @x2: x2 of the roi coordinate. > > + * @y2: y2 of the roi coordinate. > > + */ > > +struct aie_roi_coordinate { > > + __u32 x1; > > + __u32 y1; > > + __u32 x2; > > + __u32 y2; > > +}; > > + > > +/** > > + * struct aie_padding_size - aie padding parameters. > > + * > > + * @left: the size of padding left. > > + * @right: the size of padding right. > > + * @down: the size of padding below. > > + * @up: the size of padding above. > > + */ > > +struct aie_padding_size { > > + __u32 left; > > + __u32 right; > > + __u32 down; > > + __u32 up; > > +}; > > + > > +/** > > + * struct v4l2_fld_crop_rip_rop - aie fld parameters. > > + * > > + * @fld_in_crop_x1: x1 of the crop coordinate. > > + * @fld_in_crop_y1: y1 of the crop coordinate. > > + * @fld_in_crop_x2: x2 of the crop coordinate. > > + * @fld_in_crop_y2: y2 of the crop coordinate. > > + * @fld_in_rip: fld in rip. > > + * @fld_in_rop: fld in rop. > > + */ > > +struct v4l2_fld_crop_rip_rop { > > + __u32 fld_in_crop_x1; > > + __u32 fld_in_crop_y1; > > + __u32 fld_in_crop_x2; > > + __u32 fld_in_crop_y2; > > + __u32 fld_in_rip; > > + __u32 fld_in_rop; > > +}; > > + > > +/** > > + * struct fd_ret - aie fd result parameters. > > + * > > + * @anchor_x0: X coordinate of the top-left corner of each > > detected face. > > + * @anchor_x1: X coordinate of the bottom-right corner of each > > detected face. > > + * @anchor_y0: Y coordinate of the top-left corner of each > > detected face. > > + * @anchor_y1: Y coordinate of the bottom-right corner of each > > detected face. > > + * @rop_landmark_score0: Score for the first rotation pose > > landmark. > > + * @rop_landmark_score1: Score for the second rotation pose > > landmark. > > + * @rop_landmark_score2: Score for the third rotation pose > > landmark. > > + * @anchor_score: Score for the anchor points. > > + * @rip_landmark_score0: Score for the first rotation-invariant > > pose landmark. > > + * @rip_landmark_score1: Score for the second rotation-invariant > > pose landmark. > > + * @rip_landmark_score2: Score for the third rotation-invariant > > pose landmark. > > + * @rip_landmark_score3: Score for the fourth rotation-invariant > > pose landmark. > > + * @rip_landmark_score4: Score for the fifth rotation-invariant > > pose landmark. > > + * @rip_landmark_score5: Score for the sixth rotation-invariant > > pose landmark. > > + * @rip_landmark_score6: Score for the seventh rotation-invariant > > pose landmark. > > + * @face_result_index: Index of each detected face. > > + * @anchor_index: Index of the anchor points. > > + * @fd_partial_result: Partial face detection result. > > + */ > > +struct fd_ret { > > + __u16 anchor_x0[MAX_FACE_NUM]; > > + __u16 anchor_x1[MAX_FACE_NUM]; > > + __u16 anchor_y0[MAX_FACE_NUM]; > > + __u16 anchor_y1[MAX_FACE_NUM]; > > I don't know why use 'anchor' for face coordinate. > Maybe this is trivial for AI domain, but I think there should be some > brief introduction for who don't understand AI. This naming is based on the hardware specifications. The four variables othe anchor represent the coordinates of the four corners of the rectangular box. > > > + __s16 rop_landmark_score0[MAX_FACE_NUM]; > > + __s16 rop_landmark_score1[MAX_FACE_NUM]; > > + __s16 rop_landmark_score2[MAX_FACE_NUM]; > > I don't know what is landmark. > Why there are 'first', 'second', and 'third' landmark? > Maybe this is trivial for AI domain, but I think there should be some > brief introduction for who don't understand AI. rop (Rotation Out of Plane) refers to 3D rotation, which will select the highest scoring face direction among the front face, right profile, and left profile. > > > + __s16 anchor_score[MAX_FACE_NUM]; > > + __s16 rip_landmark_score0[MAX_FACE_NUM]; > > + __s16 rip_landmark_score1[MAX_FACE_NUM]; > > + __s16 rip_landmark_score2[MAX_FACE_NUM]; > > + __s16 rip_landmark_score3[MAX_FACE_NUM]; > > + __s16 rip_landmark_score4[MAX_FACE_NUM]; > > + __s16 rip_landmark_score5[MAX_FACE_NUM]; > > + __s16 rip_landmark_score6[MAX_FACE_NUM]; > > + __u16 face_result_index[MAX_FACE_NUM]; > > What is this index? > If fd_pyramid0_num is 100, so anchor_x0[0] ~ anchor_x0[99] is valid. > I don't know why need this index? RIP (Rotation In Plane) refers to planar rotation, which will determine the rotation angle by selecting the highest scoring angle among the six possible face rotation angles. > > > + __u16 anchor_index[MAX_FACE_NUM]; > > What is this index? > The meaning of anchor_index is which anchor detected it. > > + __u32 fd_partial_result; > > What is this result? > Describe more so that everyone could understand this. > fd_partial_result indicates the number of faces detected in three different resized images (large, medium, and small) by the face detection (FD) process. > > +}; > > It's better that one face information is placed together in DRAM > because it's easy to cache hit. > Then this structure would reform as: > > struct face { > __u16 anchor_x0; > __u16 anchor_x1; > __u16 anchor_y0; > __u16 anchor_y1; > ... > __u16 anchor_index; > ; > > struct fd_ret { > struct face face_result[MAX_FACE_NUM]; > __u32 fd_partial_result; > }; > > > + > > +/** > > + * struct fd_result - Face detection results for different pyramid > > levels. > > + * > > + * @fd_pyramid0_num: Number of faces detected at pyramid level 0. > > + * @fd_pyramid1_num: Number of faces detected at pyramid level 1. > > + * @fd_pyramid2_num: Number of faces detected at pyramid level 2. > > + * @fd_total_num: Total number of faces detected across all > > pyramid levels. > > + * @pyramid0_result: Detection results for pyramid level 0. > > + * @pyramid1_result: Detection results for pyramid level 1. > > + * @pyramid2_result: Detection results for pyramid level 2. > > + */ > > +struct fd_result { > > + __u16 fd_pyramid0_num; > > + __u16 fd_pyramid1_num; > > + __u16 fd_pyramid2_num; > > + __u16 fd_total_num; > > fd_total_num is redundant. > User space program could use (fd_pyramid0_num + fd_pyramid1_num + > fd_pyramid2_num) to get this. > So drop this. > fd_total_num represents the final number of faces detected by the hardware. fd_pyramind0, fd_pyramind1, and fd_pyramind2 are the numbers of faces detected in the resized images (large, medium, and small) respectively. These are different variables and fd_total_num is not obtained by summing fd_pyramind0, fd_pyramind1, and fd_pyramind2. > > + struct fd_ret pyramid0_result; > > + struct fd_ret pyramid1_result; > > + struct fd_ret pyramid2_result; > > +}; > > + > > +/** > > + * struct attr_result - Attribute detection results parameters. > > + * > > + * @gender_ret: Gender detection results. > > + * @ethnicity_ret: Ethnicity detection results. > > + * @merged_age_ret: Merged age detection results. > > + * @merged_gender_ret: Merged gender detection results. > > + * @merged_is_indian_ret: Merged results indicating if the subject > > is Indian. > > + * @merged_ethnicity_ret: Merged ethnicity detection results. > > + */ > > +struct attr_result { > > + __s16 gender_ret[2][64]; > > Is the value 0 for woman and 1 for man? > Why the array is [2][64]? > Explain what the first index mean and what the second index mean. > attribute mode has no coordinate information, so we just know how > many man and how many woman, > but we don't know which one (the position in video) is man or woman? > 0:male, 1:female > > + __s16 ethnicity_ret[4][64]; > > 0 for Chinese, 1 for Japanese, 2 for Korean? > 0: while, 1:black, 2:asian, 3:indian > > + __s16 merged_age_ret[2]; > > Why the term 'merged'? > > > + __s16 merged_gender_ret[2]; > > Ditto. > > > + __s16 merged_is_indian_ret[2]; > > indian is India people or indigenous people of America or both? > indian is India people > > + __s16 merged_ethnicity_ret[4]; > > +}; > > + > > +/** > > + * struct fld_landmark - FLD coordinates parameters. > > + * > > + * @x: X coordinate of the facial landmark. > > + * @y: Y coordinate of the facial landmark. > > + */ > > +struct fld_landmark { > > + __u16 x; > > + __u16 y; > > +}; > > + > > +/** > > + * struct fld_result - FLD detection results parameters. > > + * > > + * @fld_landmark: Array of facial landmarks, each with X and Y > > coordinates. > > + * @fld_out_rip: Output rotation-invariant pose value. > > + * @fld_out_rop: Output rotation pose value. > > + * @confidence: Confidence score of the facial landmark detection. > > + * @blinkscore: Blink score indicating the likelihood of eye > > blink. > > + */ > > +struct fld_result { > > + struct fld_landmark fld_landmark[FLD_CUR_LANDMARK]; > > + __u16 fld_out_rip; > > + __u16 fld_out_rop; > > + __u16 confidence; > > + __s16 blinkscore; > > +}; > > + > > +/** > > + * struct aie_enq_info - V4L2 Kernelspace parameters. > > + * > > + * @sel_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for > > current fd. > > + * FDMODE: Face Detection. > > + * ATTRIBUTEMODE: Gender and ethnicity detection > > + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and > > mouth > > + * @src_img_fmt: source image format. > > + * @src_img_width: the width of the source image. > > + * @src_img_height: the height of the source image. > > + * @src_img_stride: the stride of the source image. > > + * @pyramid_base_width: pyramid is the size of resizer, the width > > of the base pyramid. > > + * @pyramid_base_height: pyramid is the size of resizer, the width > > of the base pyramid. > > + * @number_of_pyramid: number of pyramid, min: 1, max: 3. > > + * @rotate_degree: the rotate degree of the image. > > + * @en_roi: enable roi(roi is a box diagram that selects a > > rectangle in a picture). > > + * when en_roi is enable, AIE will return a rectangle > > face detection result > > + * @src_roi: roi params. > > + * @en_padding: enable padding, this is only used on the hardware > > of yuv to rgb. > > + * and has noting to do with fd_mode > > + * @src_padding: padding params. > > + * @freq_level: frequency level, Get value from user space enque. > > + * @fld_face_num: the number of faces in fld. > > + * user space tells driver the number of > > detections. > > + * @fld_input: fld input params. > > + * @src_img_addr: Source image address. > > + * @src_img_addr_uv: Source image address for UV plane. > > + * @fd_out: Face detection results. > > + * @attr_out: Attribute detection results. > > + * @fld_out: Array of facial landmark detection results for > > multiple frames. > > + * @irq_status: Interrupt request status. > > + */ > > +struct aie_enq_info { > > + __u32 sel_mode; > > + __u32 src_img_fmt; > > + __u32 src_img_width; > > + __u32 src_img_height; > > + __u32 src_img_stride; > > + __u32 pyramid_base_width; > > + __u32 pyramid_base_height; > > + __u32 number_of_pyramid; > > + __u32 rotate_degree; > > + int en_roi; > > + struct aie_roi_coordinate src_roi; > > + int en_padding; > > + struct aie_padding_size src_padding; > > + unsigned int freq_level; > > + unsigned int fld_face_num; > > + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; > > Above information is set by user space, so driver is not necessary to > return these information back to user space. > Drop these. > This is the V4L2 enqueue structure, which requires information sent from user space. It cannot be dropped; otherwise, the FLD input will not receive data from the user. > > + __u32 src_img_addr; > > + __u32 src_img_addr_uv; > > If user space program need to access source image buffer, it should > map it itself. > Do not pass this address information from driver. > Drop these. > This is where the user provides the addresses of the Y and UV components of the image for the AIE to use. > > + struct fd_result fd_out; > > + struct attr_result attr_out; > > + struct fld_result fld_out[FLD_MAX_FRAME]; > > Union these three structure because hardware would work in one mode > in one time. > > > + __u32 irq_status; > > User space program should not process irq_status. > So drop this. > The irq status is provided for users to check the current status of the AIE. Users need to use it to monitor the AIE status in order to perform some custom operations in advance. > > +}; > > + > > +/** > > + * struct v4l2_ctrl_aie_param - V4L2 Userspace parameters. > > + * > > + * @fd_mode: select a mode(FDMODE, ATTRIBUTEMODE, FLDMODE) for > > current fd. > > + * FDMODE: Face Detection. > > + * ATTRIBUTEMODE: Gender and ethnicity detection > > + * FLDMODE: Locations of eyebrows, eyes, ears, nose,and > > mouth > > + * @src_img_fmt: source image format. > > + * @src_img_width: the width of the source image. > > + * @src_img_height: the height of the source image. > > + * @src_img_stride: the stride of the source image. > > + * @pyramid_base_width: pyramid is the size of resizer, the width > > of the base pyramid. > > + * @pyramid_base_height: pyramid is the size of resizer, the width > > of the base pyramid. > > + * @number_of_pyramid: number of pyramid, min: 1, max: 3. > > + * @rotate_degree: the rotate degree of the image. > > + * @en_roi: enable roi(roi is a box diagram that selects a > > rectangle in a picture). > > + * when en_roi is enable, AIE will return a rectangle > > face detection result > > + * @src_roi: roi params. > > + * @en_padding: enable padding, this is only used on the hardware > > of yuv to rgb. > > + * and has noting to do with fd_mode > > + * @src_padding: padding params. > > + * @freq_level: frequency level, Get value from user space enque. > > + * @fld_face_num: the number of faces in fld. > > + * user space tells driver the number of > > detections. > > + * @fld_input: fld input params. > > + */ > > +struct v4l2_ctrl_aie_param { > > + __u32 fd_mode; > > + __u32 src_img_fmt; > > + __u32 src_img_width; > > + __u32 src_img_height; > > + __u32 src_img_stride; > > Use V4L2 standard interface to set source image > format/width/height/stride. > This structure is used to obtain data sent by the user through s_ctrl. Users only need to set it up once, without repeatedly calling multiple IOCTLs to set width, height, and format. This approach can reduce system overhead. > > + __u32 pyramid_base_width; > > + __u32 pyramid_base_height; > > Does pyramid width/height has limitation? > If source image resolution is 3840x2160, does pyramid resolution > could be 3840x2160? > If there are hardware limitation, query this limitation from driver. > > Some AI model just accept some certain resolution, for example > 360x360. > In this case, any source resolution should resize to this resolution. > Does AIE need to resize to some fixed resolution? Or any resolution > is acceptable? > The AIE only supports a maximum source image size of 640x480. Similarly, the pyramid width and height are also 640x480. > > + __u32 number_of_pyramid; > > Why need multiple pyramid? > Does this AI model need the face resolution around 100x100? > If a face it 400x400, so it would not be detect so need to resize > this face to multiple smaller resolution. > In first pyramid face resolution is 400x400. > In second pyramid face resolution is 200x200. > In third pyramid face resolution is 100x100. > So this face would be detected in third pyramid. > > If this is right, add this information to document. > If not, add right information to document. > You are correct.The AIE hardware needs to downscale the source image twice to obtain three images. It then performs face detection on each of these three images. This process is designed to improve the recognition rate and speed of the AIE. such as source image:640x480 Pyramid0: 480x460, Detect small face Pyramid1:0.5x, 240x180 Pyramid2:0.25x, 120x90, Detect large face > > + __u32 rotate_degree; > > + __s32 en_roi; > > + struct aie_roi_coordinate src_roi; > > + __s32 en_padding; > > If padding has nothing to do with fd_mode, drop it. > Padding is used to determine whether additional areas outside the detected face should be included. Whether to enable it depends on the user. > > + struct aie_padding_size src_padding; > > + __u32 freq_level; > > freq_level is FPS? > What is freq_level and how hardware work with this? > The comment says "Get value from user space enque", what's this? > Describe more detail about this. > The freq_level is no longer in use, I will drop it > > + __u32 fld_face_num; > > + struct v4l2_fld_crop_rip_rop fld_input[FLD_MAX_FRAME]; > > Does the fld mode need fd mode result? > The face_num is get from fd mode, and the fld_input coordinate is get > from fd mode also. > If so, describe more about this. > FLD is using detected face window information and head pose detected by AIE, to perform Binary Tree Traversal to approximate the final landmark coordination. FLD is using the AIE detection result, FLD and AIE won’t be operating at the same time. Therefore, FLD can share AIE’s DMAs and SRAM > Regards, > CK > > > +}; > > + > > +#endif /* __MTK_AIE_V4L2_CONTROLS_H__ */ > > diff --git a/include/uapi/linux/videodev2.h > > b/include/uapi/linux/videodev2.h > > index c8cb2796130f..329bbac9239e 100644 > > --- a/include/uapi/linux/videodev2.h > > +++ b/include/uapi/linux/videodev2.h > > @@ -858,6 +858,9 @@ struct v4l2_pix_format { > > #define V4L2_META_FMT_RK_ISP1_STAT_3A v4l2_fourcc('R', 'K', > > '1', 'S') /* Rockchip ISP1 3A Statistics */ > > #define V4L2_META_FMT_RK_ISP1_EXT_PARAMS v4l2_fourcc('R', 'K', > > '1', 'E') /* Rockchip ISP1 3a Extensible Parameters */ > > > > +/* Vendor-specific definition: used for the MediaTek camera > > subsystem's face detection results */ > > +#define V4L2_META_FMT_MTFD_RESULT v4l2_fourcc('M', 'T', 'f', 'd') > > + > > /* Vendor specific - used for RaspberryPi PiSP */ > > #define V4L2_META_FMT_RPI_BE_CFG v4l2_fourcc('R', 'P', 'B', 'C') > > /* PiSP BE configuration */ > > #define V4L2_META_FMT_RPI_FE_CFG v4l2_fourcc('R', 'P', 'F', 'C') > > /* PiSP FE configuration */ > > @@ -1965,6 +1968,9 @@ enum v4l2_ctrl_type { > > V4L2_CTRL_TYPE_AV1_TILE_GROUP_ENTRY = 0x281, > > V4L2_CTRL_TYPE_AV1_FRAME = 0x282, > > V4L2_CTRL_TYPE_AV1_FILM_GRAIN = 0x283, > > + > > + V4L2_CTRL_TYPE_AIE_INIT = 0x0290, > > + V4L2_CTRL_TYPE_AIE_PARAM = 0x0291, > > }; > > > > /* Used in the VIDIOC_QUERYCTRL ioctl for querying controls */ > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver 2025-04-03 7:38 [PATCH v5 0/4] Add AIE driver support for mt8188 bo.kong ` (2 preceding siblings ...) 2025-04-03 7:38 ` [PATCH v5 3/4] uapi: linux: add MT8188 AIE bo.kong @ 2025-04-03 7:38 ` bo.kong 2025-04-03 19:31 ` Nicolas Dufresne ` (2 more replies) 2025-04-03 14:35 ` [PATCH v5 0/4] Add AIE driver support for mt8188 Rob Herring (Arm) 2025-04-03 19:52 ` Laurent Pinchart 5 siblings, 3 replies; 17+ messages in thread From: bo.kong @ 2025-04-03 7:38 UTC (permalink / raw) To: Rob Herring, AngeloGioacchino Del Regno, Mauro Carvalho Chehab, mtk29348, linux-media, linux-kernel, linux-arm-kernel, linux-mediatek Cc: zhaoyuan.chen, Teddy.Chen, Project_Global_Chrome_Upstream_Group From: Bo Kong <Bo.Kong@mediatek.com> Add a V4L2 sub-device driver for MT8188 AIE. Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> --- Changes in V5: 1. Modify the description of FLDMODE. 2. Replace all instances of the keyword "race" with "ethnicity" (e.g., race_result) to improve readability. 3. Remove NUM_FORMATS. 4. Correct some English spelling errors. 5. Change the name of clock, replacing _ with -. 6. Add an explanation for the aie_copy_fw related bin files. 7. Use enums to save Y2R, RS, and FD related variables, removing the define format, and add new enums Y2R_CONFIG, RS_CONFIG, and FD_CONFIG. 8. Optimize the expression of the RPN_LOOP_NUM and PYM_START_LOOP variables. 9. Remove unnecessary define variables. 10. Remove the fd_wdma_en structure, replacing it with out_stride_size for judgment. 11. Optimize the fd_ker_rdma_size structure, add the get_fd_ker_rdma_size function, optimize the attr_ker_rdma_size structure, and add the get_attr_ker_rdma_size function. 12. Delete the fd_stride and fd_maxpool structures. 13. Remove duplicate structures in mtk_aie.h and mtk_aie_v4l2_controls.h. 14. Remove fd_version, attr_version, and pose_version. 15. Move the aie_enq_info and related structures to the uapi directory. 16. Rename mtk_aie_53.c to mtk_aie_v4l2.c and update the Makefile accordingly. 17. Optimize some code to reduce redundant conditional checks. Changes in v4: 1. Remove ccflags that are not used in Makefile. 2. Optimize the write register operation. 3. Remove excess config from Kconfig. 4. Remove the probe function related to hw_version. 5. Remove mtk_aie_hw_connect,mtk_aie_hw_enable,mtk_aie_hw_disconnect etc. 6. Fix coding style in mtk_aie.h. 7. Replace signed short with s16. 8. Remove aie_reset_output_buf function, optimize this part of the code. 9. Remove struct race_result and other structures, and use s16 instead. of structures. 10. Change multiple different types of width and height into the same structure. 11. Remove all unnecessary variable initialization operations,retaining some variables that need to be initialized. 12. Delete print physical address. 13. Remove aie_init_table and combine aie_init_table into the aie_update_table function. 14. Optimize the readability of some codes, add aie_calculate_pa function. 15. Optimize the aie_execute function and divide this function into aie_execute_face_detection, aie_execute_attribute_detection, aie_execute_fld_detection functions. 16. Modify arrays with all 5 into define ANCHOR_EN_NUM. 17. Remove mtk_aie_get_variant function. 18. The version in the code was deleted, but the variables in the enque info structure cannot be deleted. Once deleted, the AIE will not work because this structure needs to be aligned with User Space. Changes in v3: 1. Remove not used include file, include only headers which AIE use 2. Remove Makefile some private driver headers Changes in v2: 1. Fix coding style --- drivers/media/platform/mediatek/Kconfig | 1 + drivers/media/platform/mediatek/Makefile | 1 + drivers/media/platform/mediatek/aie/Kconfig | 12 + drivers/media/platform/mediatek/aie/Makefile | 5 + drivers/media/platform/mediatek/aie/mtk_aie.h | 870 ++++++ .../media/platform/mediatek/aie/mtk_aie_drv.c | 2398 +++++++++++++++++ .../platform/mediatek/aie/mtk_aie_v4l2.c | 1295 +++++++++ 7 files changed, 4582 insertions(+) create mode 100644 drivers/media/platform/mediatek/aie/Kconfig create mode 100644 drivers/media/platform/mediatek/aie/Makefile create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie.h create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie_drv.c create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c diff --git a/drivers/media/platform/mediatek/Kconfig b/drivers/media/platform/mediatek/Kconfig index 84104e2cd024..cd161272666b 100644 --- a/drivers/media/platform/mediatek/Kconfig +++ b/drivers/media/platform/mediatek/Kconfig @@ -2,6 +2,7 @@ comment "Mediatek media platform drivers" +source "drivers/media/platform/mediatek/aie/Kconfig" source "drivers/media/platform/mediatek/jpeg/Kconfig" source "drivers/media/platform/mediatek/mdp/Kconfig" source "drivers/media/platform/mediatek/vcodec/Kconfig" diff --git a/drivers/media/platform/mediatek/Makefile b/drivers/media/platform/mediatek/Makefile index 38e6ba917fe5..23a096fdf21c 100644 --- a/drivers/media/platform/mediatek/Makefile +++ b/drivers/media/platform/mediatek/Makefile @@ -1,4 +1,5 @@ # SPDX-License-Identifier: GPL-2.0-only +obj-y += aie/ obj-y += jpeg/ obj-y += mdp/ obj-y += vcodec/ diff --git a/drivers/media/platform/mediatek/aie/Kconfig b/drivers/media/platform/mediatek/aie/Kconfig new file mode 100644 index 000000000000..ffbe0a79a2ea --- /dev/null +++ b/drivers/media/platform/mediatek/aie/Kconfig @@ -0,0 +1,12 @@ +config VIDEO_MTK_AIE + tristate "MediaTek AI engine function" + depends on OF + select V4L2_MEM2MEM_DEV + select VIDEOBUF2_DMA_CONTIG + select MEDIA_CONTROLLER_REQUEST_API + help + Support the AI engine (AIE) feature + + AIE driver is a V4L2 memory-to-memory device driver which + provides hardware accelerated face detection function, + it can detect different sizes of faces in a raw image. diff --git a/drivers/media/platform/mediatek/aie/Makefile b/drivers/media/platform/mediatek/aie/Makefile new file mode 100644 index 000000000000..0b4252452ca2 --- /dev/null +++ b/drivers/media/platform/mediatek/aie/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 +mtk-aie-$(CONFIG_VIDEO_MTK_AIE) += mtk_aie_v4l2.o +mtk-aie-$(CONFIG_VIDEO_MTK_AIE) += mtk_aie_drv.o + +obj-$(CONFIG_VIDEO_MTK_AIE) += mtk-aie.o \ No newline at end of file diff --git a/drivers/media/platform/mediatek/aie/mtk_aie.h b/drivers/media/platform/mediatek/aie/mtk_aie.h new file mode 100644 index 000000000000..1de8d8b82499 --- /dev/null +++ b/drivers/media/platform/mediatek/aie/mtk_aie.h @@ -0,0 +1,870 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2020 MediaTek Inc. + * Author: Bo Kong <bo.kong@mediatek.com> + */ + +#ifndef __MTK_AIE_H__ +#define __MTK_AIE_H__ + +#include <media/v4l2-ctrls.h> +#include <media/v4l2-device.h> +#include <linux/mtk_aie_v4l2_controls.h> + +#define RPN_NUM 3 +#define RPN_LOOP_NUM 29 +#define FD_LOOP_NUM (RPN_NUM * RPN_LOOP_NUM) + +#define RPNX_LOOP_NUM(x) ((3 - (x)) * RPN_LOOP_NUM - 1) +#define PYMX_START_LOOP(x) ((2 - (x)) * RPN_LOOP_NUM) + +#define ATTR_LOOP_NUM 26 +#define AGE_OUT_RGS 17 +#define GENDER_OUT_RGS 20 +#define INDIAN_OUT_RGS 22 +#define ETHNICITY_OUT_RGS 25 + +#define INPUT_WDMA_WRA_NUM 4 +#define OUTPUT_WDMA_WRA_NUM 4 +#define KERNEL_RDMA_RA_NUM 2 + +#define MAX_ENQUE_FRAME_NUM 10 +#define PYM_NUM 3 +#define COLOR_NUM 3 + +#define ATTR_MODE_PYRAMID_WIDTH 128 +#define ATTR_OUT_SIZE 32 + +#define AIE_START_REG 0x000 +#define AIE_ENABLE_REG 0x004 +#define AIE_LOOP_REG 0x008 +#define AIE_YUV2RGB_CON_BASE_ADR_REG 0x00c +#define AIE_RS_CON_BASE_ADR_REG 0x010 +#define AIE_FD_CON_BASE_ADR_REG 0x014 +#define AIE_INT_EN_REG 0x018 +#define AIE_INT_REG 0x01c +#define AIE_RESULT_0_REG 0x08c +#define AIE_RESULT_1_REG 0x090 +#define AIE_DMA_CTL_REG 0x094 + +/* AIE 3.0 register */ +#define AIE_YUV2RGB_CON_BASE_ADR_MSB 0x14C +#define AIE_RS_CON_BASE_ADR_MSB 0x150 +#define AIE_FD_CON_BASE_ADR_MSB 0x154 + +/* AIE 3.0 FLD register */ +#define FLD_EN 0x400 +#define FLD_BASE_ADDR_FACE_0 0x404 +#define FLD_MODEL_PARA1 0x4FC +#define FLD_MODEL_PARA14 0x530 + +#define FLD_BUSY 0x544 +#define FLD_DONE 0x548 +#define FLD_SRC_WD_HT 0x54C + +/* n: min 0, max 14 */ +#define FLD_PL_IN_BASE_ADDR_0_(n) (0x550 + 4 * (n)) +#define FLD_PL_IN_BASE_ADDR_1_(n) (0x5C8 + 4 * (n)) +#define FLD_PL_IN_BASE_ADDR_2_(n) (0x640 + 4 * (n)) +#define FLD_PL_IN_BASE_ADDR_3_(n) (0x6B8 + 4 * (n)) +#define FLD_SH_IN_BASE_ADDR_(n) (0x85C + 4 * (n)) + +#define FLD_PL_IN_SIZE_0 0x730 +#define FLD_PL_IN_STRIDE_0 0x734 +#define FLD_PL_IN_SIZE_1 0x738 +#define FLD_PL_IN_STRIDE_1 0x73C +#define FLD_PL_IN_SIZE_2_0 0x740 +#define FLD_PL_IN_STRIDE_2_0 0x744 +#define FLD_PL_IN_SIZE_2_1 0x748 +#define FLD_PL_IN_STRIDE_2_1 0x74C +#define FLD_PL_IN_SIZE_2_2 0x750 +#define FLD_PL_IN_STRIDE_2_2 0x754 +#define FLD_PL_IN_SIZE_3 0x758 +#define FLD_PL_IN_STRIDE_3 0x75C + +#define FLD_SH_IN_SIZE_0 0x7D8 +#define FLD_SH_IN_STRIDE_0 0x7DC +#define FLD_TR_OUT_BASE_ADDR_0 0x7E0 +#define FLD_TR_OUT_SIZE_0 0x7E4 +#define FLD_TR_OUT_STRIDE_0 0x7E8 +#define FLD_PP_OUT_BASE_ADDR_0 0x7EC +#define FLD_PP_OUT_SIZE_0 0x7F0 +#define FLD_PP_OUT_STRIDE_0 0x7F4 +#define FLD_SPARE 0x7F8 + +#define FLD_BASE_ADDR_FACE_0_7_MSB 0x7FC +#define FLD_BASE_ADDR_FACE_8_14_MSB 0x800 + +#define FLD_PL_IN_BASE_ADDR_0_0_7_MSB 0x804 +#define FLD_PL_IN_BASE_ADDR_0_8_15_MSB 0x808 +#define FLD_PL_IN_BASE_ADDR_0_16_23_MSB 0x80C +#define FLD_PL_IN_BASE_ADDR_0_24_29_MSB 0x810 + +#define FLD_PL_IN_BASE_ADDR_1_0_7_MSB 0x814 +#define FLD_PL_IN_BASE_ADDR_1_8_15_MSB 0x818 +#define FLD_PL_IN_BASE_ADDR_1_16_23_MSB 0x81C +#define FLD_PL_IN_BASE_ADDR_1_24_29_MSB 0x820 + +#define FLD_PL_IN_BASE_ADDR_2_0_7_MSB 0x824 +#define FLD_PL_IN_BASE_ADDR_2_8_15_MSB 0x828 +#define FLD_PL_IN_BASE_ADDR_2_16_23_MSB 0x82C +#define FLD_PL_IN_BASE_ADDR_2_24_29_MSB 0x830 + +#define FLD_PL_IN_BASE_ADDR_3_0_7_MSB 0x834 +#define FLD_PL_IN_BASE_ADDR_3_8_15_MSB 0x838 +#define FLD_PL_IN_BASE_ADDR_3_16_23_MSB 0x83C +#define FLD_PL_IN_BASE_ADDR_3_24_29_MSB 0x840 + +#define FLD_SH_IN_BASE_ADDR_0_7_MSB 0x844 +#define FLD_SH_IN_BASE_ADDR_8_15_MSB 0x848 +#define FLD_SH_IN_BASE_ADDR_16_23_MSB 0x84C +#define FLD_SH_IN_BASE_ADDR_24_29_MSB 0x850 + +#define FLD_BS_IN_BASE_ADDR_0_7_MSB 0x8d4 +#define FLD_BS_IN_BASE_ADDR_8_15_MSB 0x8d8 + +#define FLD_TR_OUT_BASE_ADDR_0_MSB 0x854 +#define FLD_PP_OUT_BASE_ADDR_0_MSB 0x858 + +#define FLD_BS_IN_BASE_ADDR_14 0x894 + +#define FLD_BS_BIAS 0x8E4 +#define FLD_CV_FM_RANGE_0 0x8E8 +#define FLD_CV_FM_RANGE_1 0x8EC +#define FLD_CV_PM_RANGE_0 0x8F0 +#define FLD_CV_PM_RANGE_1 0x8F4 +#define FLD_BS_RANGE_0 0x8F8 +#define FLD_BS_RANGE_1 0x8FC + +#define MTK_FD_OUTPUT_MIN_WIDTH 16U +#define MTK_FD_OUTPUT_MIN_HEIGHT 16U +#define MTK_FD_OUTPUT_MAX_WIDTH 4096U +#define MTK_FD_OUTPUT_MAX_HEIGHT 4096U + +#define MTK_FD_HW_TIMEOUT_IN_MSEC 2000 +#define RLT_NUM 48 +#define GENDER_OUT 32 + +#define ETHNICITY_RST_X_NUM 4 +#define ETHNICITY_RST_Y_NUM 64 +#define GENDER_RST_X_NUM 2 +#define GENDER_RST_Y_NUM 64 +#define METHNICITY_RST_NUM 4 +#define MGENDER_RST_NUM 2 +#define MAGE_RST_NUM 2 +#define MINDIAN_RST_NUM 2 + +#define FLD_FOREST 14 +#define FLD_POINT 500 + +#define FLD_STEP_NUM 6 +#define FLD_MAX_FRAME 15 + +#define FLD_STEP_BLINK 0 +#define FLD_STEP_CV 1 +#define FLD_STEP_FP 2 +#define FLD_STEP_LEAF 3 +#define FLD_STEP_KM02 4 +#define FLD_STEP_KM13 5 + +#define FLD_BLINK_WEIGHT_FOREST14_SIZE 6416 +#define FLD_CV_SIZE 19392 +#define FLD_FP_SIZE 80160 +#define FLD_LEAFNODE_SIZE 4608000 +#define FLD_TREE_KM02_SIZE 120000 +#define FLD_TREE_KM13_SIZE 120000 +#define FLD_OUTPUT_SIZE 112 + +/* FLD_OUTPUT_X_SIZE: min: 1, max: 15 */ +#define FLD_OUTPUT_X_SIZE 9 + +#define RESULT_SIZE (RLT_NUM * MAX_FACE_NUM) + +#define SRZ_BIT (BIT_MASK(16) | BIT_MASK(12) | BIT_MASK(8) | BIT_MASK(0)) +#define RESET_BIT16 BIT(16) +#define RESET_BIT17 BIT(17) +#define RESET_BIT (RESET_BIT16 | RESET_BIT17) + +#define ANCHOR_EN_NUM 5 + +#define FD_STRIDE_SIZE 3 +#define FD_MAXPOOL_SIZE 3 + +#define ATTR_FD_STRIDE(n) ((n) == 0 ? 2 : 1) +#define ATTR_FD_MAXPOOL(n) ((n) == 0 ? 1 : 0) +#define FLD_FACE_INFO(m, n) (fld_face_info_0[n] + 4 * (m)) + +enum Y2R_CONFIG { + Y2R_SRC_DST_FORMAT, + Y2R_IN_W_H, + Y2R_OUT_W_H, + Y2R_RA0_RA1_EN, + Y2R_IN_X_Y_SIZE0, + Y2R_IN_STRIDE0_BUS_SIZE0, + Y2R_IN_X_Y_SIZE1, + Y2R_IN_STRIDE1_BUS_SIZE1, + Y2R_OUT_X_Y_SIZE0, + Y2R_OUT_STRIDE0_BUS_SIZE0, + Y2R_OUT_X_Y_SIZE1, + Y2R_OUT_STRIDE1_BUS_SIZE1, + Y2R_OUT_X_Y_SIZE2, + Y2R_OUT_STRIDE2_BUS_SIZE2, + Y2R_IN_0, + Y2R_IN_1, + Y2R_OUT_0, + Y2R_OUT_1, + Y2R_OUT_2, + Y2R_RS_SEL_SRZ_EN, + Y2R_X_Y_MAG, + y2r_reserve0, + Y2R_SRZ_HORI_STEP, + Y2R_SRZ_VERT_STEP, + y2r_reserve1, + y2r_reserve2, + Y2R_PADDING_EN_UP_DOWN, + Y2R_PADDING_RIGHT_LEFT, + Y2R_CO2_FMT_MODE_EN, + Y2R_CO2_CROP_X, + Y2R_CO2_CROP_Y, + Y2R_CON_IN_BA_MSB, + Y2R_CON_OUT_BA_MSB +}; + +enum RS_CONFIG { + RS_X_Y_MAG = 1, + RS_SRZ_HORI_STEP = 3, + RS_SRZ_VERT_STEP, + RS_INPUT_W_H = 7, + RS_OUTPUT_W_H, + RS_IN_X_Y_SIZE0 = 10, + RS_IN_STRIDE0, + RS_IN_X_Y_SIZE1, + RS_IN_STRIDE1, + RS_IN_X_Y_SIZE2, + RS_IN_STRIDE2, + RS_OUT_X_Y_SIZE0, + RS_OUT_STRIDE0, + RS_OUT_X_Y_SIZE1, + RS_OUT_STRIDE1, + RS_OUT_X_Y_SIZE2, + RS_OUT_STRIDE2, + RS_IN_0, + RS_IN_1, + RS_IN_2, + RS_OUT_0, + RS_OUT_1, + RS_OUT_2, + RS_CON_IN_BA_MSB, + RS_CON_OUT_BA_MSB, +}; + +enum FD_CONFIG { + FD_INPUT_ROTATE = 1, + FD_CONV_WIDTH_MOD6, + FD_CONV_IMG_W_H = 4, + FD_IN_IMG_W_H, + FD_OUT_IMG_W_H, + FD_IN_X_Y_SIZE0 = 9, + FD_IN_STRIDE0_BUS_SIZE0, + FD_IN_X_Y_SIZE1, + FD_IN_STRIDE1_BUS_SIZE1, + FD_IN_X_Y_SIZE2, + FD_IN_STRIDE2_BUS_SIZE2, + FD_IN_X_Y_SIZE3, + FD_IN_STRIDE3_BUS_SIZE3, + FD_OUT_X_Y_SIZE0, + FD_OUT_STRIDE0_BUS_SIZE0, + FD_OUT_X_Y_SIZE1, + FD_OUT_STRIDE1_BUS_SIZE1, + FD_OUT_X_Y_SIZE2, + FD_OUT_STRIDE2_BUS_SIZE2, + FD_OUT_X_Y_SIZE3, + FD_OUT_STRIDE3_BUS_SIZE3, + FD_IN_0 = 27, + FD_IN_1, + FD_IN_2, + FD_IN_3, + FD_OUT_0, + FD_OUT_1, + FD_OUT_2, + FD_OUT_3, + FD_KERNEL_0, + FD_KERNEL_1, + FD_RPN_SET, + FD_IMAGE_COORD, + FD_IMAGE_COORD_XY_OFST, + FD_BIAS_ACCU = 47, + FD_SRZ_FDRZ_RS, + FD_SRZ_HORI_STEP, + FD_SRZ_VERT_STEP, + FD_SRZ_HORI_SUB_INT_OFST, + FD_SRZ_VERT_SUB_INT_OFST, + FD_CON_IN_BA_MSB, + FD_CON_OUT_BA_MSB, + FD_CON_KERNEL_BA_MSB +}; + +static const unsigned int out_stride_size[FD_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { + { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 0, 2, 0 }, { 1, 0, 0, 0 }, + { 1, 1, 2, 2 }, { 1, 1, 2, 2 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, + { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 1, 0, 0 }, + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, + { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, + { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, + { 3, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 0, 2, 0 }, + { 1, 0, 0, 0 }, { 1, 1, 2, 2 }, { 1, 1, 2, 2 }, { 1, 0, 0, 0 }, + { 1, 0, 2, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, + { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 0, 0 }, + { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 1, 1 }, + { 1, 1, 1, 1 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, + { 1, 0, 0, 0 }, { 3, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, + { 1, 0, 2, 0 }, { 1, 0, 0, 0 }, { 1, 1, 2, 2 }, { 1, 1, 2, 2 }, + { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, + { 1, 0, 2, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, + { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, + { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, + { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 3, 0, 0, 0 } +}; + +static const unsigned int fd_ker_rdma_size[RPN_LOOP_NUM] = { + 240, 1168, 1168, 272, + 2320, 2080, 1040, 4624, + 3104, 9232, 4624, 4128, + 1040, 4624, 4624, 1552, + 4624, 4624, 4128, 1040, + 1040, 528, 4160, 4160, + 2080, 2080, 2080, 1040, + 0 +}; + +static const unsigned int fd_out_stride2_in[FD_LOOP_NUM] = { + 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, + 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 +}; + +static const unsigned int fd_stride_indices[FD_STRIDE_SIZE] = { + 0, 29, 58 +}; + +static const unsigned int fd_maxpool_indices[FD_MAXPOOL_SIZE] = { + 1, 30, 59 +}; + +static const unsigned int out_2size[FD_LOOP_NUM] = { + 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, + 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 +}; + +static const unsigned int in_ch_pack[FD_LOOP_NUM] = { + 1, 16, 16, 16, 16, 16, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, + 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 0, 1, 16, 16, 16, 16, 16, 32, + 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, + 32, 32, 32, 0, 1, 16, 16, 16, 16, 16, 32, 32, 32, 32, 32, 32, 32, 32, + 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 0 +}; + +static const unsigned int outlayer[FD_LOOP_NUM] = { + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, + 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0 +}; + +static const unsigned int out_ch_pack[FD_LOOP_NUM] = { + 16, 16, 16, 16, 16, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, + 32, 16, 16, 16, 32, 32, 32, 32, 32, 32, 0, 16, 16, 16, 16, 16, 32, 32, + 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 16, 16, 16, 32, 32, 32, + 32, 32, 32, 0, 16, 16, 16, 16, 16, 32, 32, 32, 32, 32, 32, 32, 32, 32, + 32, 32, 32, 32, 32, 16, 16, 16, 32, 32, 32, 32, 32, 32, 0 +}; + +static const signed int fd_rdma_en[FD_LOOP_NUM][INPUT_WDMA_WRA_NUM][2] = { + { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, + { { 0, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 1, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 1, 0 }, { 2, 0 }, { -1, -1 }, { -1, -1 } }, + { { 3, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 1, 2 }, { 2, 2 }, { 4, 2 }, { 4, 3 } }, + { { 5, 0 }, { 5, 1 }, { -1, -1 }, { -1, -1 } }, + { { 6, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 5, 0 }, { 5, 1 }, { 7, 0 }, { -1, -1 } }, + { { 8, 0 }, { 8, 1 }, { -1, -1 }, { -1, -1 } }, + { { 9, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 5, 2 }, { 5, 3 }, { 7, 2 }, { 10, 2 } }, + { { 11, 0 }, { 11, 1 }, { -1, -1 }, { -1, -1 } }, + { { 12, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 13, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 11, 0 }, { 11, 1 }, { 14, 0 }, { -1, -1 } }, + { { 15, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 16, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 11, 0 }, { 11, 1 }, { 14, 0 }, { 17, 0 } }, + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, + { { 19, 0 }, { 22, 0 }, { 22, 1 }, { 25, 0 } }, + { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, + { { 29, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 30, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 30, 0 }, { 31, 0 }, { -1, -1 }, { -1, -1 } }, + { { 32, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 30, 2 }, { 31, 2 }, { 33, 2 }, { 33, 3 } }, + { { 34, 0 }, { 34, 1 }, { -1, -1 }, { -1, -1 } }, + { { 35, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 34, 0 }, { 34, 1 }, { 36, 0 }, { -1, -1 } }, + { { 37, 0 }, { 37, 1 }, { -1, -1 }, { -1, -1 } }, + { { 38, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 34, 2 }, { 34, 3 }, { 36, 2 }, { 39, 2 } }, + { { 40, 0 }, { 40, 1 }, { -1, -1 }, { -1, -1 } }, + { { 41, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 42, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 40, 0 }, { 40, 1 }, { 43, 0 }, { -1, -1 } }, + { { 44, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 45, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 40, 0 }, { 40, 1 }, { 43, 0 }, { 46, 0 } }, + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, + { { 48, 0 }, { 51, 0 }, { 51, 1 }, { 54, 0 } }, + { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, + { { 58, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 59, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 59, 0 }, { 60, 0 }, { -1, -1 }, { -1, -1 } }, + { { 61, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 59, 2 }, { 60, 2 }, { 62, 2 }, { 62, 3 } }, + { { 63, 0 }, { 63, 1 }, { -1, -1 }, { -1, -1 } }, + { { 64, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 63, 0 }, { 63, 1 }, { 65, 0 }, { -1, -1 } }, + { { 66, 0 }, { 66, 1 }, { -1, -1 }, { -1, -1 } }, + { { 67, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 63, 2 }, { 63, 3 }, { 65, 2 }, { 68, 2 } }, + { { 69, 0 }, { 69, 1 }, { -1, -1 }, { -1, -1 } }, + { { 70, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 71, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 69, 0 }, { 69, 1 }, { 72, 0 }, { -1, -1 } }, + { { 73, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 74, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 69, 0 }, { 69, 1 }, { 72, 0 }, { 75, 0 } }, + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, + { { 77, 0 }, { 80, 0 }, { 80, 1 }, { 83, 0 } } +}; + +static const unsigned int attr_wdma_en[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { + { 1, 0, 1, 0 }, { 1, 0, 1, 0 }, { 1, 0, 0, 0 }, { 1, 1, 1, 1 }, + { 1, 1, 1, 1 }, { 1, 0, 1, 0 }, { 1, 1, 0, 0 }, { 1, 0, 1, 0 }, + { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 0, 0 }, + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, + { 1, 0, 0, 0 }, { 1, 0, 0, 0 } +}; + +static const unsigned int attr_ker_rdma_size[ATTR_LOOP_NUM] = { + 240, 1168, 272, 2320, + 2080, 9232, 3104, 9232, + 4128, 1040, 4624, 4624, + 1552, 4624, 4624, 4128, + 9232, 272, 9232, 2320, + 144, 9232, 272, 9232, + 2320, 144 +}; + +static const unsigned int attr_out_stride2_as_in[ATTR_LOOP_NUM] = { + 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 +}; + +static const unsigned int attr_out_2size[ATTR_LOOP_NUM] = { /* O */ + 1, 1, 0, 1, 1, 1, 0, + 1, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0 +}; + +/* [loop][ch][output_index] */ +static const signed int attr_rdma_en[ATTR_LOOP_NUM][INPUT_WDMA_WRA_NUM][2] = { + { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, + { { 0, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 0, 0 }, { 1, 0 }, { -1, -1 }, { -1, -1 } }, + { { 2, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 0, 2 }, { 1, 2 }, { 3, 2 }, { 3, 3 } }, + { { 4, 0 }, { 4, 1 }, { -1, -1 }, { -1, -1 } }, + { { 4, 0 }, { 4, 1 }, { 5, 0 }, { -1, -1 } }, + { { 6, 0 }, { 6, 1 }, { -1, -1 }, { -1, -1 } }, + { { 4, 2 }, { 4, 3 }, { 5, 2 }, { 7, 2 } }, + { { 8, 0 }, { 8, 1 }, { -1, -1 }, { -1, -1 } }, + { { 9, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 10, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 8, 0 }, { 8, 1 }, { 11, 0 }, { -1, -1 } }, + { { 12, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 13, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 8, 0 }, { 8, 1 }, { 11, 0 }, { 14, 0 } }, + { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, + { { 16, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, + { { 18, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 19, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, + { { 21, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, + { { 23, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, + { { 24, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } } +}; + +static const unsigned int attr_wdma_size[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { + { 16384, 0, 4096, 0 }, + { 16384, 0, 4096, 0 }, + { 16384, 0, 0, 0 }, + { 16384, 16384, 4096, 4096 }, + { 8192, 8192, 2048, 2048 }, + { 8192, 0, 2048, 0 }, + { 8192, 8192, 0, 0 }, + { 8192, 0, 2048, 0 }, + { 2048, 2048, 0, 0 }, + { 2048, 0, 0, 0 }, + { 2048, 0, 0, 0 }, + { 2048, 0, 0, 0 }, + { 2048, 0, 0, 0 }, + { 2048, 0, 0, 0 }, + { 2048, 0, 0, 0 }, + { 2048, 2048, 0, 0 }, + { 2048, 0, 0, 0 }, + { 0, 0, 0, 0 }, + { 2048, 0, 0, 0 }, + { 1024, 0, 0, 0 }, + { 0, 0, 0, 0 }, + { 2048, 0, 0, 0 }, + { 0, 0, 0, 0 }, + { 2048, 0, 0, 0 }, + { 1024, 0, 0, 0 }, + { 0, 0, 0, 0 } +}; + +static const unsigned int fld_face_info_0[FLD_MAX_FRAME] = { + 0x440, 0x44C, 0x458, 0x464, 0x470, 0x47C, 0x488, 0x494, 0x4A4, + 0x4B0, 0x4BC, 0x4C8, 0x4D4, 0x4E0, 0x4EC +}; + +struct aie_static_info_element { + u32 fd_wdma_size[OUTPUT_WDMA_WRA_NUM]; + u32 out_xsize_plus_1; + u32 out_height; + u32 out_ysize_plus_1_stride2; + u32 out_stride; + u32 out_stride_stride2; + u32 out_width; + u32 img_width; + u32 img_height; + u32 stride2_out_width; + u32 stride2_out_height; + u32 out_xsize_plus_1_stride2; + u32 input_xsize_plus_1; +}; + +struct aie_static_info { + struct aie_static_info_element inf_elm[FD_LOOP_NUM]; +}; + +enum aie_state { + STATE_NA, + STATE_INIT, + STATE_OPEN +}; + +/** + * enum aie_mode - AIE Feature Mode + * @FDMODE: Face Detection. + * @ATTRIBUTEMODE: Gender and ethnicity detection. + * @FLDMODE: Locations of eyebrows, eyes, ears, nose,and mouth. + */ +enum aie_mode { + FDMODE, + ATTRIBUTEMODE, + FLDMODE +}; + +enum aie_format { + FMT_NA, + FMT_YUV_2P, + FMT_YVU_2P, + FMT_YUYV, + FMT_YVYU, + FMT_UYVY, + FMT_VYUY, + FMT_MONO, + FMT_YUV420_2P, + FMT_YUV420_1P +}; + +enum aie_input_degree { + DEGREE_0, + DEGREE_90, + DEGREE_270, + DEGREE_180 +}; + +struct ethnicity_result { + signed short result[4][64]; +}; + +struct merged_ethnicity_result { + signed short result[4]; +}; + +struct merged_gender_result { + signed short result[2]; +}; + +struct merged_age_result { + signed short result[2]; +}; + +struct merged_is_indian_result { + signed short result[2]; +}; + +struct aie_reg_cfg { + u32 rs_adr; + u32 yuv2rgb_adr; + u32 fd_adr; + u32 fd_pose_adr; + u32 fd_mode; + u32 hw_result; + u32 hw_result1; + u32 reserved; +}; + +struct aie_hw_rect { + u16 width; + u16 height; +}; + +struct aie_para { + void *fd_fd_cfg_va; + void *fd_rs_cfg_va; + void *fd_yuv2rgb_cfg_va; + + void *attr_fd_cfg_va[MAX_ENQUE_FRAME_NUM]; + void *attr_yuv2rgb_cfg_va[MAX_ENQUE_FRAME_NUM]; + + void *rs_pym_rst_va[PYM_NUM][COLOR_NUM]; + + dma_addr_t fd_fd_cfg_pa; + dma_addr_t fd_rs_cfg_pa; + dma_addr_t fd_yuv2rgb_cfg_pa; + + dma_addr_t attr_fd_cfg_pa[MAX_ENQUE_FRAME_NUM]; + dma_addr_t attr_yuv2rgb_cfg_pa[MAX_ENQUE_FRAME_NUM]; + + dma_addr_t rs_pym_rst_pa[PYM_NUM][COLOR_NUM]; + + u32 sel_mode; + u32 src_img_fmt; + u32 rotate_degree; + s16 rpn_anchor_thrd; + u16 number_of_pyramid; + u32 src_img_addr; + u32 src_img_addr_uv; + + struct aie_hw_rect max_img_rect; + struct aie_hw_rect img_rect; + struct aie_hw_rect crop_rect; + struct aie_hw_rect pyramid_rect; + struct aie_hw_rect max_pyramid_rect; +}; + +struct aie_attr_para { + u32 w_idx; + u32 r_idx; + u32 sel_mode[MAX_ENQUE_FRAME_NUM]; + u16 img_width[MAX_ENQUE_FRAME_NUM]; + u16 img_height[MAX_ENQUE_FRAME_NUM]; + u16 crop_width[MAX_ENQUE_FRAME_NUM]; + u16 crop_height[MAX_ENQUE_FRAME_NUM]; + u32 src_img_fmt[MAX_ENQUE_FRAME_NUM]; + u32 rotate_degree[MAX_ENQUE_FRAME_NUM]; + u32 src_img_addr[MAX_ENQUE_FRAME_NUM]; + u32 src_img_addr_uv[MAX_ENQUE_FRAME_NUM]; +}; + +struct aie_fd_dma_para { + void *fd_out_hw_va[FD_LOOP_NUM][OUTPUT_WDMA_WRA_NUM]; + void *fd_kernel_va[FD_LOOP_NUM][KERNEL_RDMA_RA_NUM]; + void *attr_out_hw_va[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM]; + void *attr_kernel_va[ATTR_LOOP_NUM][KERNEL_RDMA_RA_NUM]; + + void *age_out_hw_va[MAX_ENQUE_FRAME_NUM]; + void *gender_out_hw_va[MAX_ENQUE_FRAME_NUM]; + void *is_indian_out_hw_va[MAX_ENQUE_FRAME_NUM]; + void *ethnicity_out_hw_va[MAX_ENQUE_FRAME_NUM]; + + dma_addr_t fd_out_hw_pa[FD_LOOP_NUM][OUTPUT_WDMA_WRA_NUM]; + dma_addr_t fd_kernel_pa[FD_LOOP_NUM][KERNEL_RDMA_RA_NUM]; + dma_addr_t attr_out_hw_pa[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM]; + dma_addr_t attr_kernel_pa[ATTR_LOOP_NUM][KERNEL_RDMA_RA_NUM]; + + dma_addr_t age_out_hw_pa[MAX_ENQUE_FRAME_NUM]; + dma_addr_t gender_out_hw_pa[MAX_ENQUE_FRAME_NUM]; + dma_addr_t is_indian_out_hw_pa[MAX_ENQUE_FRAME_NUM]; + dma_addr_t ethnicity_out_hw_pa[MAX_ENQUE_FRAME_NUM]; +}; + +struct aie_fd_fld_para { + void *fld_step_va[FLD_STEP_NUM][FLD_MAX_FRAME]; + void *fld_output_va[FLD_MAX_FRAME]; + dma_addr_t fld_step_pa[FLD_STEP_NUM][FLD_MAX_FRAME]; + dma_addr_t fld_output_pa[FLD_MAX_FRAME]; +}; + +struct imem_buf_info { + void *va; + dma_addr_t pa; + unsigned int size; + unsigned int reserved; +}; + +struct fd_buffer { + /* used by DMA HW */ + u32 dma_addr; +}; + +struct aie_clocks { + struct clk_bulk_data *clks; + unsigned int clk_num; +}; + +struct mtk_aie_req_work { + struct work_struct work; + struct mtk_aie_dev *fd_dev; +}; + +struct mtk_aie_variant { + unsigned int y2r_cfg_size; + unsigned int rs_cfg_size; + unsigned int fd_cfg_size; +}; + +struct mtk_aie_dev { + struct device *dev; + struct mtk_aie_ctx *ctx; + struct v4l2_m2m_dev *m2m_dev; + struct device *larb; + struct aie_para *base_para; + struct aie_attr_para *attr_para; + struct aie_fd_dma_para *dma_para; + + struct aie_fd_fld_para *fld_para; + + struct aie_enq_info *aie_cfg; + struct workqueue_struct *frame_done_wq; + void __iomem *fd_base; + const struct mtk_aie_variant *variant; + + /* Input Buffer Pointer */ + struct imem_buf_info rs_cfg_data; + struct imem_buf_info fd_cfg_data; + struct imem_buf_info yuv2rgb_cfg_data; + /* HW Output Buffer Pointer */ + struct imem_buf_info rs_output_hw; + struct imem_buf_info fd_dma_hw; + struct imem_buf_info fd_dma_result_hw; + struct imem_buf_info fd_kernel_hw; + struct imem_buf_info fd_attr_dma_hw; + struct aie_static_info st_info; + + struct aie_reg_cfg reg_cfg; + + /* Fld fw buffer */ + struct media_device mdev; + struct video_device vfd; + struct aie_clocks aie_clk; + struct v4l2_device v4l2_dev; + + /* Lock for V4L2 operations */ + struct mutex vfd_lock; + /* Lock for device operations */ + struct mutex dev_lock; + /* Lock for performance optimization */ + struct mutex fd_lock; + struct imem_buf_info fd_fld_step_data; + struct imem_buf_info fd_fld_out_hw; + + int irq; + struct completion fd_job_finished; + struct delayed_work job_timeout_work; + + /* DRAM Buffer Size */ + unsigned int fd_rs_cfg_size; + unsigned int fd_fd_cfg_size; + unsigned int fd_yuv2rgb_cfg_size; + unsigned int attr_fd_cfg_size; + unsigned int attr_yuv2rgb_cfg_size; + + /* HW Output Buffer Size */ + unsigned int rs_pym_out_size[PYM_NUM]; + unsigned int fd_dma_max_size; + unsigned int fd_dma_rst_max_size; + unsigned int fd_fd_kernel_size; + unsigned int fd_attr_kernel_size; + unsigned int fd_attr_dma_max_size; + unsigned int fd_attr_dma_rst_max_size; + + /* Fld size */ + unsigned int fld_step_size; + unsigned int fld_out_size; + + wait_queue_head_t flushing_waitq; + atomic_t num_composing; + struct mtk_aie_req_work req_work; + unsigned int fd_state; + unsigned int fd_mem_size; + u32 fd_stream_count; +}; + +struct mtk_aie_ctx { + struct mtk_aie_dev *fd_dev; + struct device *dev; + struct v4l2_fh fh; + struct v4l2_ctrl_handler hdl; + struct v4l2_pix_format_mplane src_fmt; + struct v4l2_meta_format dst_fmt; + struct v4l2_ctrl_aie_init user_init; + struct v4l2_ctrl_aie_param user_param; +}; + +void aie_reset(struct mtk_aie_dev *fd); +int aie_init(struct mtk_aie_dev *fd, struct v4l2_ctrl_aie_init *user_init); +void aie_uninit(struct mtk_aie_dev *fd); +void aie_prepare(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg); +void aie_execute(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg); +void aie_irqhandle(struct mtk_aie_dev *fd); +void aie_get_fd_result(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg); +void aie_get_attr_result(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg); +void aie_get_fld_result(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg); +#endif /*__MTK_AIE_H__*/ diff --git a/drivers/media/platform/mediatek/aie/mtk_aie_drv.c b/drivers/media/platform/mediatek/aie/mtk_aie_drv.c new file mode 100644 index 000000000000..a327d55e2ef2 --- /dev/null +++ b/drivers/media/platform/mediatek/aie/mtk_aie_drv.c @@ -0,0 +1,2398 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2020 MediaTek Inc. + * Author: Bo Kong <bo.kong@mediatek.com> + */ + +#include <linux/firmware.h> +#include <linux/mtk_aie_v4l2_controls.h> +#include "mtk_aie.h" + +static u32 aie_cmb_u16(u16 low, u16 high) +{ + return ((u32)high << 16) | low; +} + +static u32 aie_cmb_stride(u16 low, u16 high) +{ + return ((u32)high << 16) | (low & 0x000F); +} + +static inline u16 dif_x(const struct aie_enq_info *aie_cfg) +{ + return (u16)(aie_cfg->src_roi.x2 - aie_cfg->src_roi.x1); +} + +static inline u16 dif_y(const struct aie_enq_info *aie_cfg) +{ + return (u16)(aie_cfg->src_roi.y2 - aie_cfg->src_roi.y1); +} + +static inline void set_cmb_cfg(u32 *tbl, u16 index, u16 stride) +{ + tbl[index] = aie_cmb_u16(tbl[index], stride); +} + +static inline void set_cmbst_cfg(u32 *tbl, u16 index, u16 stride) +{ + tbl[index] = aie_cmb_stride(tbl[index], stride); +} + +static int get_fd_ker_rdma_size(int i, int j) +{ + if (i < 0 || i >= FD_LOOP_NUM || j < 0 || j >= KERNEL_RDMA_RA_NUM) + return 0; + + return fd_ker_rdma_size[i % RPN_LOOP_NUM]; +} + +static int get_fd_stride(int index) +{ + int i; + + for (i = 0; i < FD_STRIDE_SIZE; i++) { + if (fd_stride_indices[i] == index) + return 2; + } + return 1; +} + +static int is_fd_maxpool(int index) +{ + int i; + + for (i = 0; i < FD_MAXPOOL_SIZE; i++) { + if (fd_maxpool_indices[i] == index) + return 1; + } + return 0; +} + +static int get_attr_ker_rdma_size(int index, int j) +{ + if (index < 0 || index >= ATTR_LOOP_NUM || j < 0 || j >= KERNEL_RDMA_RA_NUM) + return 0; + + return attr_ker_rdma_size[index]; +} + +static inline unsigned int fld_step_align_size(int step_num, int frame) +{ + if (step_num < 0 || step_num >= FLD_STEP_NUM || frame < 0 || frame >= FLD_MAX_FRAME) + return 0; + + switch (step_num) { + case FLD_STEP_BLINK: + return frame == (FLD_MAX_FRAME - 1) ? 6528 : 0; + case FLD_STEP_CV: + return frame == 0 ? 1536 : 1280; + case FLD_STEP_FP: + return 5376; + case FLD_STEP_LEAF: + return 307200; + case FLD_STEP_KM02: + case FLD_STEP_KM13: + return 8064; + } + return 0; +} + +static int aie_imem_alloc(struct mtk_aie_dev *fd, u32 size, + struct imem_buf_info *bufinfo) +{ + struct device *dev = fd->dev; + dma_addr_t dma_handle; + void *va; + + if (size == 0) { + dev_dbg(fd->dev, "%s: size(%d)\n", __func__, size); + return -EINVAL; + } + + fd->fd_mem_size += size; + + va = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL); + if (!va || dma_handle == 0) + return -ENOMEM; + + bufinfo->va = va; + bufinfo->pa = dma_handle; + bufinfo->size = size; + + dev_dbg(fd->dev, "%s: vAddr(0x%p) size(%d)\n", __func__, va, size); + + return 0; +} + +static void aie_imem_free(struct mtk_aie_dev *fd, struct imem_buf_info *bufinfo) +{ + dev_dbg(fd->dev, "%s: vAddr(0x%p) size(%d)\n", __func__, bufinfo->va, bufinfo->size); + + if (bufinfo->va) + dma_free_coherent(fd->dev, bufinfo->size, bufinfo->va, bufinfo->pa); +} + +static void aie_update_table(struct mtk_aie_dev *fd, u16 pym_width, + u16 pym_height, bool binit) +{ + int i, stridesize; + struct aie_static_info *pstv = &fd->st_info; + + pstv->inf_elm[PYMX_START_LOOP(2)].img_width = pym_width / 4; + pstv->inf_elm[PYMX_START_LOOP(2)].img_height = pym_height / 4; + + pstv->inf_elm[PYMX_START_LOOP(1)].img_width = pym_width / 2; + pstv->inf_elm[PYMX_START_LOOP(1)].img_height = pym_height / 2; + + pstv->inf_elm[PYMX_START_LOOP(0)].img_width = pym_width; + pstv->inf_elm[PYMX_START_LOOP(0)].img_height = pym_height; + + for (i = 0; i < FD_LOOP_NUM; i++) { + if (i != PYMX_START_LOOP(2) && i != PYMX_START_LOOP(1) && + i != PYMX_START_LOOP(0)) { + if (fd_out_stride2_in[i] == 1) { + pstv->inf_elm[i].img_width = + pstv->inf_elm[i - 1].stride2_out_width; + pstv->inf_elm[i].img_height = + pstv->inf_elm[i - 1].stride2_out_height; + } else { + pstv->inf_elm[i].img_width = pstv->inf_elm[i - 1].out_width; + pstv->inf_elm[i].img_height = pstv->inf_elm[i - 1].out_height; + } + } + + stridesize = get_fd_stride(i); + if (is_fd_maxpool(i)) { + pstv->inf_elm[i].out_width = + (pstv->inf_elm[i].img_width - 1) / 2 + 1; + pstv->inf_elm[i].out_height = + (pstv->inf_elm[i].img_height - 1) / 2 + 1; + } else { + pstv->inf_elm[i].out_width = (pstv->inf_elm[i].img_width - 1) / + stridesize + 1; + pstv->inf_elm[i].out_height = (pstv->inf_elm[i].img_height - 1) / + stridesize + 1; + } + + pstv->inf_elm[i].stride2_out_width = + ((pstv->inf_elm[i].out_width - 1) / 2 + 1) * out_2size[i]; + pstv->inf_elm[i].stride2_out_height = + ((pstv->inf_elm[i].out_height - 1) / 2 + 1) * out_2size[i]; + + if (outlayer[i] == 1) { + pstv->inf_elm[i].out_xsize_plus_1 = + pstv->inf_elm[i].out_width * out_ch_pack[i] * 2; + pstv->inf_elm[i].out_stride = + round_up(pstv->inf_elm[i].out_xsize_plus_1 * ANCHOR_EN_NUM, 16); + pstv->inf_elm[i].out_xsize_plus_1_stride2 = + ((pstv->inf_elm[i].out_width - 1) / 2 + 1) * + out_ch_pack[i] * 2 * out_2size[i]; + } else { + pstv->inf_elm[i].out_xsize_plus_1 = + pstv->inf_elm[i].out_width * out_ch_pack[i]; + pstv->inf_elm[i].out_stride = + round_up(pstv->inf_elm[i].out_xsize_plus_1, 16); + pstv->inf_elm[i].out_xsize_plus_1_stride2 = + ((pstv->inf_elm[i].out_width - 1) / 2 + 1) * + out_ch_pack[i] * out_2size[i]; + } + + pstv->inf_elm[i].out_stride_stride2 = + round_up(pstv->inf_elm[i].out_xsize_plus_1_stride2, 16); + + if (out_2size[i] == 1) + pstv->inf_elm[i].out_ysize_plus_1_stride2 = + (pstv->inf_elm[i].out_height - 1) / 2 + 1; + else + pstv->inf_elm[i].out_ysize_plus_1_stride2 = pstv->inf_elm[i].out_height; + + if (binit) { + if (i == RPNX_LOOP_NUM(2) || i == RPNX_LOOP_NUM(1) || i == RPNX_LOOP_NUM(0)) + pstv->inf_elm[i].fd_wdma_size[0] = RESULT_SIZE; + else + pstv->inf_elm[i].fd_wdma_size[0] = pstv->inf_elm[i].out_height * + pstv->inf_elm[i].out_stride; + + if (outlayer[i] == 1) { + if (out_stride_size[i][1]) + pstv->inf_elm[i].fd_wdma_size[1] = + pstv->inf_elm[i].fd_wdma_size[0]; + if (out_stride_size[i][2]) + pstv->inf_elm[i].fd_wdma_size[2] = + pstv->inf_elm[i].fd_wdma_size[0]; + if (out_stride_size[i][3]) + pstv->inf_elm[i].fd_wdma_size[3] = + pstv->inf_elm[i].fd_wdma_size[0]; + } else { + if (out_stride_size[i][1]) + pstv->inf_elm[i].fd_wdma_size[1] = + pstv->inf_elm[i].out_height * + pstv->inf_elm[i].out_stride; + if (out_stride_size[i][2]) + pstv->inf_elm[i].fd_wdma_size[2] = + pstv->inf_elm[i].out_ysize_plus_1_stride2 * + pstv->inf_elm[i].out_stride_stride2; + if (out_stride_size[i][3]) + pstv->inf_elm[i].fd_wdma_size[3] = + pstv->inf_elm[i].out_ysize_plus_1_stride2 * + pstv->inf_elm[i].out_stride_stride2; + } + } + + if (in_ch_pack[i] == 1) + pstv->inf_elm[i].input_xsize_plus_1 = + round_up(pstv->inf_elm[i].img_width, 8); + else + pstv->inf_elm[i].input_xsize_plus_1 = + pstv->inf_elm[i].img_width * in_ch_pack[i]; + } +} + +static void aie_update_buf_params(struct mtk_aie_dev *fd, u16 max_img_width, + u16 max_img_height) +{ + struct aie_static_info *pstv = &fd->st_info; + u32 size; + u8 i, j; + + fd->base_para->max_img_rect.width = max_img_width; + fd->base_para->max_img_rect.height = max_img_height; + fd->fd_dma_max_size = 0; + fd->fd_dma_rst_max_size = 0; + fd->fd_fd_kernel_size = 0; + fd->fd_attr_kernel_size = 0; + fd->fd_attr_dma_max_size = 0; + fd->fd_attr_dma_rst_max_size = 0; + + /* FDMODE Dram Buffer Size */ + fd->fd_rs_cfg_size = 4 * fd->variant->rs_cfg_size * 2; + fd->fd_fd_cfg_size = 4 * fd->variant->fd_cfg_size * FD_LOOP_NUM; + fd->fd_yuv2rgb_cfg_size = 4 * fd->variant->y2r_cfg_size; + + /* ATTRMODE Dram Buffer Size */ + fd->attr_fd_cfg_size = 4 * fd->variant->fd_cfg_size * ATTR_LOOP_NUM; + fd->attr_yuv2rgb_cfg_size = 4 * fd->variant->y2r_cfg_size; + + /* HW Output Buffer Size */ + fd->rs_pym_out_size[0] = fd->base_para->max_pyramid_rect.width * + fd->base_para->max_pyramid_rect.height; + fd->rs_pym_out_size[1] = fd->rs_pym_out_size[0] / 2; + fd->rs_pym_out_size[2] = fd->rs_pym_out_size[0] / 4; + + /* FDMODE Dram Buffer Size */ + for (i = 0; i < FD_LOOP_NUM; i++) { + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { + if (out_stride_size[i][j]) { + if ((i == RPNX_LOOP_NUM(2) || i == RPNX_LOOP_NUM(1) || + i == RPNX_LOOP_NUM(0)) && j == 0) + fd->fd_dma_rst_max_size += pstv->inf_elm[i].fd_wdma_size[j]; + else + fd->fd_dma_max_size += pstv->inf_elm[i].fd_wdma_size[j]; + } + } + } + + for (i = 0; i < FD_LOOP_NUM; i++) { + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { + size = get_fd_ker_rdma_size(i, j); + if (size) + fd->fd_fd_kernel_size += size; + } + } + + /* ATTRMODE Dram Buffer Size */ + for (i = 0; i < ATTR_LOOP_NUM; i++) { + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { + if (attr_wdma_en[i][j]) { + if ((i == AGE_OUT_RGS || i == GENDER_OUT_RGS || + i == INDIAN_OUT_RGS || i == ETHNICITY_OUT_RGS) && j == 0) + fd->fd_attr_dma_rst_max_size += + ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; + else + fd->fd_attr_dma_max_size += attr_wdma_size[i][j]; + } + } + } + + for (i = 0; i < ATTR_LOOP_NUM; i++) { + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) + fd->fd_attr_kernel_size += get_attr_ker_rdma_size(i, j); + } + + /* FD Pose secure result output buffer: result size * 3 loops */ + fd->fd_dma_rst_max_size += RESULT_SIZE * 3; + + /* FLD size */ + fd->fld_step_size = 0; + for (i = 0; i < FLD_STEP_NUM; i++) + for (j = 0; j < FLD_MAX_FRAME; j++) + fd->fld_step_size += fld_step_align_size(i, j); + + fd->fld_out_size = FLD_OUTPUT_SIZE * FLD_MAX_FRAME; +} + +static int aie_alloc_dram_buf(struct mtk_aie_dev *fd) +{ + u32 alloc_size; + int ret; + u8 i; + + /* RS DRAM */ + alloc_size = fd->fd_rs_cfg_size; + dev_dbg(fd->dev, "RS CFG:"); + ret = aie_imem_alloc(fd, alloc_size, &fd->rs_cfg_data); + if (ret) + goto dma_alloc_fail; + /* FD MODE */ + fd->base_para->fd_rs_cfg_pa = fd->rs_cfg_data.pa; + fd->base_para->fd_rs_cfg_va = fd->rs_cfg_data.va; + + /* FD DRAM */ + alloc_size = fd->fd_fd_cfg_size + fd->attr_fd_cfg_size * MAX_ENQUE_FRAME_NUM; + dev_dbg(fd->dev, "FD CFG:"); + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_cfg_data); + if (ret) + goto dma_alloc_fail; + /* FD MODE */ + fd->base_para->fd_fd_cfg_pa = fd->fd_cfg_data.pa; + fd->base_para->fd_fd_cfg_va = fd->fd_cfg_data.va; + /* ATTR MODE */ + fd->base_para->attr_fd_cfg_pa[0] = + fd->base_para->fd_fd_cfg_pa + fd->fd_fd_cfg_size; + fd->base_para->attr_fd_cfg_va[0] = + fd->base_para->fd_fd_cfg_va + fd->fd_fd_cfg_size; + + for (i = 1; i < MAX_ENQUE_FRAME_NUM; i++) { + fd->base_para->attr_fd_cfg_pa[i] = + fd->base_para->attr_fd_cfg_pa[i - 1] + fd->attr_fd_cfg_size; + fd->base_para->attr_fd_cfg_va[i] = + fd->base_para->attr_fd_cfg_va[i - 1] + fd->attr_fd_cfg_size; + } + + /* YUV2RGB DRAM */ + alloc_size = fd->fd_yuv2rgb_cfg_size + + fd->attr_yuv2rgb_cfg_size * MAX_ENQUE_FRAME_NUM; + dev_dbg(fd->dev, "YUV2RGB CFG:"); + ret = aie_imem_alloc(fd, alloc_size, &fd->yuv2rgb_cfg_data); + if (ret) + goto dma_alloc_fail; + /* FD MODE */ + fd->base_para->fd_yuv2rgb_cfg_pa = fd->yuv2rgb_cfg_data.pa; + fd->base_para->fd_yuv2rgb_cfg_va = fd->yuv2rgb_cfg_data.va; + + /* ATTR MODE */ + fd->base_para->attr_yuv2rgb_cfg_pa[0] = + fd->base_para->fd_yuv2rgb_cfg_pa + fd->fd_yuv2rgb_cfg_size; + fd->base_para->attr_yuv2rgb_cfg_va[0] = + fd->base_para->fd_yuv2rgb_cfg_va + fd->fd_yuv2rgb_cfg_size; + + for (i = 1; i < MAX_ENQUE_FRAME_NUM; i++) { + fd->base_para->attr_yuv2rgb_cfg_pa[i] = + fd->base_para->attr_yuv2rgb_cfg_pa[i - 1] + fd->attr_yuv2rgb_cfg_size; + fd->base_para->attr_yuv2rgb_cfg_va[i] = + fd->base_para->attr_yuv2rgb_cfg_va[i - 1] + fd->attr_yuv2rgb_cfg_size; + } + + return 0; +dma_alloc_fail: + aie_imem_free(fd, &fd->fd_cfg_data); + aie_imem_free(fd, &fd->rs_cfg_data); + + return ret; +} + +static int aie_alloc_output_buf(struct mtk_aie_dev *fd) +{ + u32 alloc_size = 0, pa_off = 0, va_off = 0; + int i, j; + int ret; + + for (i = 0; i < PYM_NUM; i++) + alloc_size += fd->rs_pym_out_size[i] * 3; + dev_dbg(fd->dev, "RS OUT:"); + ret = aie_imem_alloc(fd, alloc_size, &fd->rs_output_hw); + if (ret) + return ret; + + for (i = 0; i < PYM_NUM; i++) { + for (j = 0; j < COLOR_NUM; j++) { + fd->base_para->rs_pym_rst_pa[i][j] = fd->rs_output_hw.pa + pa_off; + pa_off += fd->rs_pym_out_size[i]; + + fd->base_para->rs_pym_rst_va[i][j] = fd->rs_output_hw.va + va_off; + va_off += fd->rs_pym_out_size[i]; + } + } + + return 0; +} + +static void aie_alloc_normal(struct mtk_aie_dev *fd, int start, int end) +{ + struct aie_static_info *pstv = &fd->st_info; + int i, j, pi, pj; + + if (start <= 0 || end <= start || end >= FD_LOOP_NUM) { + dev_err(fd->dev, "%s: start = %d, end = %d\n", __func__, start, end); + return; + } + + pi = start - 1; + pj = 0; + for (i = start; i < end + 1; i++) { + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { + if (out_stride_size[i][j]) { + fd->dma_para->fd_out_hw_pa[i][j] = + fd->dma_para->fd_out_hw_pa[pi][pj] + + pstv->inf_elm[pi].fd_wdma_size[pj]; + pi = i; + pj = j; + } + } + } +} + +static int aie_alloc_fddma_buf(struct mtk_aie_dev *fd) +{ + u32 alloc_size; + int ret; + + alloc_size = fd->fd_dma_max_size; + dev_dbg(fd->dev, "FD DMA:"); + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_dma_hw); + if (ret) + goto dma_alloc_fail; + + alloc_size = fd->fd_fd_kernel_size + fd->fd_attr_kernel_size; + dev_dbg(fd->dev, "FD KERNEL:"); + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_kernel_hw); + if (ret) + goto dma_alloc_fail; + + alloc_size = fd->fd_attr_dma_max_size; + dev_dbg(fd->dev, "ATTR DMA:"); + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_attr_dma_hw); + if (ret) + goto dma_alloc_fail; + + alloc_size = fd->fd_dma_rst_max_size + fd->fd_attr_dma_rst_max_size; + dev_dbg(fd->dev, "RESULT DMA:"); + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_dma_result_hw); + if (ret) + goto dma_alloc_fail; + + return 0; + +dma_alloc_fail: + aie_imem_free(fd, &fd->fd_attr_dma_hw); + aie_imem_free(fd, &fd->fd_kernel_hw); + aie_imem_free(fd, &fd->fd_dma_hw); + + return ret; +} + +static int aie_alloc_fld_buf(struct mtk_aie_dev *fd) +{ + u32 alloc_size; + int ret; + + alloc_size = fd->fld_step_size; + dev_dbg(fd->dev, "FLD STEP:"); + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_fld_step_data); + if (ret) + return ret; + + alloc_size = fd->fld_out_size; + dev_dbg(fd->dev, "FLD OUT:"); + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_fld_out_hw); + if (ret) + goto fld_step; + + return 0; +fld_step: + aie_imem_free(fd, &fd->fd_fld_step_data); + + return ret; +} + +static void aie_calculate_pa(struct mtk_aie_dev *fd, struct aie_static_info *pstv, int i) +{ + fd->dma_para->fd_out_hw_pa[i][0] = fd->dma_para->fd_out_hw_pa[i - 1][1] + + pstv->inf_elm[i - 1].fd_wdma_size[1]; + fd->dma_para->fd_out_hw_pa[i][1] = fd->dma_para->fd_out_hw_pa[i][0] + + pstv->inf_elm[i].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 1][0] = fd->dma_para->fd_out_hw_pa[i][0] + + 2 * pstv->inf_elm[i + 1].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 1][1] = fd->dma_para->fd_out_hw_pa[i][0] + + 3 * pstv->inf_elm[i + 1].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 2][0] = fd->dma_para->fd_out_hw_pa[i][0] + + 4 * pstv->inf_elm[i + 2].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 3][0] = fd->dma_para->fd_out_hw_pa[i][0] + + pstv->inf_elm[i].fd_wdma_size[0] + + pstv->inf_elm[i].fd_wdma_size[1] + + pstv->inf_elm[i + 1].fd_wdma_size[0] + + pstv->inf_elm[i + 1].fd_wdma_size[1] + + pstv->inf_elm[i + 2].fd_wdma_size[0]; + fd->dma_para->fd_out_hw_pa[i + 3][1] = fd->dma_para->fd_out_hw_pa[i + 3][0] + + pstv->inf_elm[i + 3].fd_wdma_size[0] + + pstv->inf_elm[i + 3].fd_wdma_size[2] + + pstv->inf_elm[i + 4].fd_wdma_size[0] + + pstv->inf_elm[i + 4].fd_wdma_size[2] + + pstv->inf_elm[i + 5].fd_wdma_size[0]; + fd->dma_para->fd_out_hw_pa[i + 3][2] = fd->dma_para->fd_out_hw_pa[i + 3][0] + + pstv->inf_elm[i + 3].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 3][3] = fd->dma_para->fd_out_hw_pa[i + 3][1] + + pstv->inf_elm[i + 3].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 4][0] = fd->dma_para->fd_out_hw_pa[i + 3][0] + + 2 * pstv->inf_elm[i + 4].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 4][1] = fd->dma_para->fd_out_hw_pa[i + 3][1] + + 2 * pstv->inf_elm[i + 4].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 4][2] = fd->dma_para->fd_out_hw_pa[i + 3][0] + + 3 * pstv->inf_elm[i + 4].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 4][3] = fd->dma_para->fd_out_hw_pa[i + 3][1] + + 3 * pstv->inf_elm[i + 4].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 5][0] = fd->dma_para->fd_out_hw_pa[i + 3][0] + + 4 * pstv->inf_elm[i + 5].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 5][1] = fd->dma_para->fd_out_hw_pa[i + 3][1] + + 4 * pstv->inf_elm[i + 5].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 6][0] = fd->dma_para->fd_out_hw_pa[i + 3][1] + + pstv->inf_elm[i + 3].fd_wdma_size[1] + + pstv->inf_elm[i + 3].fd_wdma_size[3] + + pstv->inf_elm[i + 4].fd_wdma_size[1] + + pstv->inf_elm[i + 4].fd_wdma_size[3] + + pstv->inf_elm[i + 5].fd_wdma_size[1]; + fd->dma_para->fd_out_hw_pa[i + 6][1] = fd->dma_para->fd_out_hw_pa[i + 6][0] + + pstv->inf_elm[i + 6].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 7][0] = fd->dma_para->fd_out_hw_pa[i + 6][0] + + 2 * pstv->inf_elm[i + 7].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 7][1] = fd->dma_para->fd_out_hw_pa[i + 6][0] + + 3 * pstv->inf_elm[i + 7].out_xsize_plus_1; + fd->dma_para->fd_out_hw_pa[i + 8][0] = fd->dma_para->fd_out_hw_pa[i + 6][0] + + 4 * pstv->inf_elm[i + 8].out_xsize_plus_1; +} + +static void aie_arrange_fddma_buf(struct mtk_aie_dev *fd) +{ + struct aie_static_info *pstv = &fd->st_info; + dma_addr_t current_pa; + void *current_va; + u8 i, j; + + /* 0 ~ 18 */ + fd->dma_para->fd_out_hw_pa[0][0] = fd->fd_dma_hw.pa; + aie_alloc_normal(fd, 1, 18); + + /* 19 ~ 27 */ + aie_calculate_pa(fd, pstv, 19); + + /* 29 ~ 47 */ + fd->dma_para->fd_out_hw_pa[29][0] = fd->dma_para->fd_out_hw_pa[25][0] + + pstv->inf_elm[25].fd_wdma_size[0] + + pstv->inf_elm[25].fd_wdma_size[1] + + pstv->inf_elm[26].fd_wdma_size[0] + + pstv->inf_elm[26].fd_wdma_size[1] + + pstv->inf_elm[27].fd_wdma_size[0]; + aie_alloc_normal(fd, 30, 47); + + /* 48 ~ 56 */ + aie_calculate_pa(fd, pstv, 48); + + /* 58 ~ 76 */ + fd->dma_para->fd_out_hw_pa[58][0] = fd->dma_para->fd_out_hw_pa[54][0] + + pstv->inf_elm[54].fd_wdma_size[0] + + pstv->inf_elm[54].fd_wdma_size[1] + + pstv->inf_elm[55].fd_wdma_size[0] + + pstv->inf_elm[55].fd_wdma_size[1] + + pstv->inf_elm[56].fd_wdma_size[0]; + aie_alloc_normal(fd, 59, 76); + + /* 77 ~ 85 */ + aie_calculate_pa(fd, pstv, 77); + + /* VA : except 28, 57, 86 */ + /* 0 ~ 86 */ + fd->dma_para->fd_out_hw_va[0][0] = fd->fd_dma_hw.va; + for (i = 1; i < FD_LOOP_NUM; i++) { + if (i == RPNX_LOOP_NUM(2) || i == RPNX_LOOP_NUM(1) || i == RPNX_LOOP_NUM(0)) + continue; + for (j = 0; j < 4; j++) { + if (out_stride_size[i][j]) { + fd->dma_para->fd_out_hw_va[i][j] = fd->fd_dma_hw.va + + fd->dma_para->fd_out_hw_pa[i][j] - fd->fd_dma_hw.pa; + } + } + } + + current_pa = fd->dma_para->fd_out_hw_pa[83][0] + + pstv->inf_elm[83].fd_wdma_size[0] + + pstv->inf_elm[83].fd_wdma_size[1] + + pstv->inf_elm[84].fd_wdma_size[0] + + pstv->inf_elm[84].fd_wdma_size[1] + + pstv->inf_elm[85].fd_wdma_size[0]; + current_va = fd->dma_para->fd_out_hw_va[83][0] + + pstv->inf_elm[83].fd_wdma_size[0] + + pstv->inf_elm[83].fd_wdma_size[1] + + pstv->inf_elm[84].fd_wdma_size[0] + + pstv->inf_elm[84].fd_wdma_size[1] + + pstv->inf_elm[85].fd_wdma_size[0]; + + dev_dbg(fd->dev, "%s: current VA = %p\n", __func__, current_va); +} + +static void aie_arrange_kernel_buf(struct mtk_aie_dev *fd) +{ + void *current_va; + dma_addr_t current_pa; + u32 size, attrsize; + u8 i, j; + + current_pa = fd->fd_kernel_hw.pa; + current_va = fd->fd_kernel_hw.va; + + for (i = 0; i < FD_LOOP_NUM; i++) { + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { + size = get_fd_ker_rdma_size(i, j); + if (size) { + fd->dma_para->fd_kernel_pa[i][j] = current_pa; + fd->dma_para->fd_kernel_va[i][j] = current_va; + current_pa += size; + current_va += size; + } + } + } + + for (i = 0; i < ATTR_LOOP_NUM; i++) { + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { + attrsize = get_attr_ker_rdma_size(i, j); + fd->dma_para->attr_kernel_pa[i][j] = current_pa; + fd->dma_para->attr_kernel_va[i][j] = current_va; + current_pa += attrsize; + current_va += attrsize; + } + } + + dev_dbg(fd->dev, "%s: current VA = %p\n", __func__, current_va); +} + +static void aie_arrange_attrdma_buf(struct mtk_aie_dev *fd) +{ + void *current_va; + dma_addr_t current_pa; + u8 i, j; + + current_pa = fd->fd_attr_dma_hw.pa; + current_va = fd->fd_attr_dma_hw.va; + + /* attribute mode */ + for (i = 0; i < ATTR_LOOP_NUM; i++) { + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { + if (attr_wdma_en[i][j]) { + fd->dma_para->attr_out_hw_pa[i][j] = current_pa; + fd->dma_para->attr_out_hw_va[i][j] = current_va; + current_pa += attr_wdma_size[i][j]; + current_va += attr_wdma_size[i][j]; + } + } + } + + dev_dbg(fd->dev, "%s: current VA = %p\n", __func__, current_va); +} + +static void aie_arrange_result_dma_buf(struct mtk_aie_dev *fd) +{ + struct aie_static_info *pstv = &fd->st_info; + dma_addr_t currentresult_pa; + void *currentresult_va; + u8 i; + + currentresult_pa = fd->fd_dma_result_hw.pa; + currentresult_va = fd->fd_dma_result_hw.va; + + fd->dma_para->fd_out_hw_pa[RPNX_LOOP_NUM(2)][0] = currentresult_pa; + fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(2)][0] = currentresult_va; + currentresult_pa += pstv->inf_elm[RPNX_LOOP_NUM(2)].fd_wdma_size[0]; + currentresult_va += pstv->inf_elm[RPNX_LOOP_NUM(2)].fd_wdma_size[0]; + fd->dma_para->fd_out_hw_pa[RPNX_LOOP_NUM(1)][0] = currentresult_pa; + fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(1)][0] = currentresult_va; + currentresult_pa += pstv->inf_elm[RPNX_LOOP_NUM(1)].fd_wdma_size[0]; + currentresult_va += pstv->inf_elm[RPNX_LOOP_NUM(1)].fd_wdma_size[0]; + fd->dma_para->fd_out_hw_pa[RPNX_LOOP_NUM(0)][0] = currentresult_pa; + fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0] = currentresult_va; + currentresult_pa += pstv->inf_elm[RPNX_LOOP_NUM(0)].fd_wdma_size[0]; + currentresult_va += pstv->inf_elm[RPNX_LOOP_NUM(0)].fd_wdma_size[0]; + + fd->dma_para->attr_out_hw_pa[AGE_OUT_RGS][0] = currentresult_pa; + fd->dma_para->attr_out_hw_va[AGE_OUT_RGS][0] = currentresult_va; + currentresult_pa += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; + currentresult_va += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; + fd->dma_para->attr_out_hw_pa[GENDER_OUT_RGS][0] = currentresult_pa; + fd->dma_para->attr_out_hw_va[GENDER_OUT_RGS][0] = currentresult_va; + currentresult_pa += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; + currentresult_va += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; + fd->dma_para->attr_out_hw_pa[INDIAN_OUT_RGS][0] = currentresult_pa; + fd->dma_para->attr_out_hw_va[INDIAN_OUT_RGS][0] = currentresult_va; + currentresult_pa += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; + currentresult_va += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; + fd->dma_para->attr_out_hw_pa[ETHNICITY_OUT_RGS][0] = currentresult_pa; + fd->dma_para->attr_out_hw_va[ETHNICITY_OUT_RGS][0] = currentresult_va; + currentresult_pa += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; + currentresult_va += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; + + /* need to prepare 10 buffers to store 10 times result */ + fd->dma_para->age_out_hw_pa[0] = fd->dma_para->attr_out_hw_pa[AGE_OUT_RGS][0]; + fd->dma_para->age_out_hw_va[0] = fd->dma_para->attr_out_hw_va[AGE_OUT_RGS][0]; + fd->dma_para->gender_out_hw_pa[0] = fd->dma_para->attr_out_hw_pa[GENDER_OUT_RGS][0]; + fd->dma_para->gender_out_hw_va[0] = fd->dma_para->attr_out_hw_va[GENDER_OUT_RGS][0]; + fd->dma_para->is_indian_out_hw_pa[0] = + fd->dma_para->attr_out_hw_pa[INDIAN_OUT_RGS][0]; + fd->dma_para->is_indian_out_hw_va[0] = + fd->dma_para->attr_out_hw_va[INDIAN_OUT_RGS][0]; + fd->dma_para->ethnicity_out_hw_pa[0] = fd->dma_para->attr_out_hw_pa[ETHNICITY_OUT_RGS][0]; + fd->dma_para->ethnicity_out_hw_va[0] = fd->dma_para->attr_out_hw_va[ETHNICITY_OUT_RGS][0]; + + for (i = 1; i < MAX_ENQUE_FRAME_NUM; i++) { + fd->dma_para->age_out_hw_pa[i] = + fd->dma_para->age_out_hw_pa[i - 1] + ATTR_OUT_SIZE; + fd->dma_para->age_out_hw_va[i] = + fd->dma_para->age_out_hw_va[i - 1] + ATTR_OUT_SIZE; + fd->dma_para->gender_out_hw_pa[i] = + fd->dma_para->gender_out_hw_pa[i - 1] + ATTR_OUT_SIZE; + fd->dma_para->gender_out_hw_va[i] = + fd->dma_para->gender_out_hw_va[i - 1] + ATTR_OUT_SIZE; + fd->dma_para->is_indian_out_hw_pa[i] = + fd->dma_para->is_indian_out_hw_pa[i - 1] + ATTR_OUT_SIZE; + fd->dma_para->is_indian_out_hw_va[i] = + fd->dma_para->is_indian_out_hw_va[i - 1] + ATTR_OUT_SIZE; + fd->dma_para->ethnicity_out_hw_pa[i] = + fd->dma_para->ethnicity_out_hw_pa[i - 1] + ATTR_OUT_SIZE; + fd->dma_para->ethnicity_out_hw_va[i] = + fd->dma_para->ethnicity_out_hw_va[i - 1] + ATTR_OUT_SIZE; + } + + memset(fd->fd_dma_result_hw.va, 0, fd->fd_dma_result_hw.size); + + dev_dbg(fd->dev, "%s: current VA = %p\n", __func__, currentresult_va); +} + +static void aie_arrange_fld_buf(struct mtk_aie_dev *fd) +{ + unsigned int offset = 0; + u8 i, j; + + for (i = 0; i < FLD_STEP_NUM; i++) { + for (j = 0; j < FLD_MAX_FRAME; j++) { + fd->fld_para->fld_step_va[i][j] = fd->fd_fld_step_data.va + offset; + fd->fld_para->fld_step_pa[i][j] = fd->fd_fld_step_data.pa + offset; + offset += fld_step_align_size(i, j); + } + } + + for (i = 0, offset = 0; i < FLD_MAX_FRAME; i++) { + fd->fld_para->fld_output_va[i] = fd->fd_fld_out_hw.va + offset; + fd->fld_para->fld_output_pa[i] = fd->fd_fld_out_hw.pa + offset; + offset += FLD_OUTPUT_SIZE; + } +} + +static void aie_update_fddma_buf(struct mtk_aie_dev *fd) +{ + struct aie_static_info *pstv = &fd->st_info; + u8 i, j; + + /* 19 ~ 27 */ + aie_calculate_pa(fd, pstv, 19); + + /* 48 ~ 56 */ + aie_calculate_pa(fd, pstv, 48); + + /* 77 ~ 85 */ + aie_calculate_pa(fd, pstv, 77); + + /* VA : except 28, 57, 86 */ + /* 0 ~ 86 */ + fd->dma_para->fd_out_hw_va[0][0] = fd->fd_dma_hw.va; + for (i = 1; i < FD_LOOP_NUM; i++) { + if (i == RPNX_LOOP_NUM(2) || i == RPNX_LOOP_NUM(1) || i == RPNX_LOOP_NUM(0)) + continue; + for (j = 0; j < 4; j++) { + if (out_stride_size[i][j]) { + fd->dma_para->fd_out_hw_va[i][j] = fd->fd_dma_hw.va + + fd->dma_para->fd_out_hw_pa[i][j] - fd->fd_dma_hw.pa; + } + } + } +} + +static void aie_free_dram_buf(struct mtk_aie_dev *fd) +{ + aie_imem_free(fd, &fd->rs_cfg_data); + aie_imem_free(fd, &fd->fd_cfg_data); + aie_imem_free(fd, &fd->yuv2rgb_cfg_data); +} + +static void aie_free_output_buf(struct mtk_aie_dev *fd) +{ + aie_imem_free(fd, &fd->rs_output_hw); +} + +static void aie_free_fddma_buf(struct mtk_aie_dev *fd) +{ + aie_imem_free(fd, &fd->fd_dma_hw); + aie_imem_free(fd, &fd->fd_kernel_hw); + aie_imem_free(fd, &fd->fd_attr_dma_hw); + aie_imem_free(fd, &fd->fd_dma_result_hw); +} + +static void aie_free_fld_buf(struct mtk_aie_dev *fd) +{ + aie_imem_free(fd, &fd->fd_fld_step_data); + aie_imem_free(fd, &fd->fd_fld_out_hw); +} + +static int aie_copy_fw(struct mtk_aie_dev *fd, const char *name, void *buf, + unsigned int size) +{ + int ret; + const struct firmware *fw = NULL; + + ret = request_firmware(&fw, name, fd->dev); + if (ret == 0) { + if (size >= fw->size) + memcpy(buf, fw->data, fw->size); + else + ret = -EINVAL; + } + + release_firmware(fw); + + return ret; +} + +/** + * The aie driver just loads the bin file provided by algorithm. + * Driver cannot change path and file name preset by algorithm. + * Otherwise it will not be able to get aie related data and algorithms. + */ +static int aie_load_fw(struct mtk_aie_dev *fd) +{ + char *cfg_path = "aie_mp_fw31/config"; + char *kel_path = "aie_mp_fw31/kernel"; + char name[128] = {}; + u32 size, attrsize; + u8 i, j; + int ret; + + /* aie_fd_fd_config.bin: Provide relevant configs necessary for fd mode*/ + ret = sprintf(name, "%s/aie_fd_fd_config.bin", cfg_path); + if (ret < 0) + return ret; + + ret = aie_copy_fw(fd, + name, + fd->base_para->fd_fd_cfg_va, + fd->fd_fd_cfg_size); + if (ret) + return ret; + + /* aie_fd_rs_config.bin: Provide the necessary data for fd mode resizer hardware */ + ret = sprintf(name, "%s/aie_fd_rs_config.bin", cfg_path); + if (ret < 0) + return ret; + + ret = aie_copy_fw(fd, + name, + fd->base_para->fd_rs_cfg_va, + fd->fd_rs_cfg_size); + if (ret) + return ret; + + /* aie_fd_yuv2rgb_config.bin: Provide fd mode yuv to rgb related data */ + ret = sprintf(name, "%s/aie_fd_yuv2rgb_config.bin", cfg_path); + if (ret < 0) + return ret; + + ret = aie_copy_fw(fd, + name, + fd->base_para->fd_yuv2rgb_cfg_va, + fd->fd_yuv2rgb_cfg_size); + if (ret) + return ret; + + /* aie_attr_fd_config.bin: Provide relevant configs necessary for attribute mode*/ + ret = sprintf(name, "%s/aie_attr_fd_config.bin", cfg_path); + if (ret < 0) + return ret; + + ret = aie_copy_fw(fd, + name, + fd->base_para->attr_fd_cfg_va[0], + fd->attr_fd_cfg_size); + if (ret) + return ret; + + /** aie_attr_yuv2rgb_config.bin: Provide the necessary data for + * attribute mode resizer hardware + */ + ret = sprintf(name, "%s/aie_attr_yuv2rgb_config.bin", cfg_path); + if (ret < 0) + return ret; + + ret = aie_copy_fw(fd, + name, + fd->base_para->attr_yuv2rgb_cfg_va[0], + fd->attr_yuv2rgb_cfg_size); + if (ret) + return ret; + + for (i = 1; i < MAX_ENQUE_FRAME_NUM; i++) { + memcpy(fd->base_para->attr_fd_cfg_va[i], + fd->base_para->attr_fd_cfg_va[0], fd->attr_fd_cfg_size); + memcpy(fd->base_para->attr_yuv2rgb_cfg_va[i], + fd->base_para->attr_yuv2rgb_cfg_va[0], + fd->attr_yuv2rgb_cfg_size); + } + + for (i = 0; i < FD_LOOP_NUM; i++) { + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { + size = get_fd_ker_rdma_size(i, j); + if (size) { + /* FD mode Hardware requires loop 87 times */ + ret = sprintf(name, + "%s/aie_fd_kernel_bias_loop%02d_%d.bin", + kel_path, i, j); + if (ret < 0) + return ret; + + ret = aie_copy_fw(fd, name, + fd->dma_para->fd_kernel_va[i][j], + size); + if (ret) + return ret; + } + } + } + + for (i = 0; i < ATTR_LOOP_NUM; i++) { + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { + attrsize = get_attr_ker_rdma_size(i, j); + /* attribute mode Hardware requires loop 26 times */ + ret = sprintf(name, + "%s/aie_attr_kernel_bias_loop%02d_%d.bin", + kel_path, i, j); + if (ret < 0) + return ret; + + ret = aie_copy_fw(fd, name, + fd->dma_para->attr_kernel_va[i][j], + attrsize); + if (ret) + return ret; + } + } + + /* The necessary data when the FLD mode blink model runs */ + ret = sprintf(name, "%s/aie_fld_blink_weight_forest14.bin", cfg_path); + + if (ret < 0) + return ret; + + ret = aie_copy_fw(fd, name, + fd->fld_para->fld_step_va[FLD_STEP_BLINK][14], + fld_step_align_size(FLD_STEP_BLINK, 14)); + if (ret) + return ret; + + for (j = 0; j < FLD_MAX_FRAME; j++) { + /* The necessary data when the FLD mode cv model runs */ + ret = sprintf(name, + "%s/aie_fld_cv_forest%02d_iom3.bin", cfg_path, j); + if (ret < 0) + return ret; + ret = aie_copy_fw(fd, name, + fd->fld_para->fld_step_va[FLD_STEP_CV][j], + fld_step_align_size(FLD_STEP_CV, j)); + if (ret) + return ret; + } + + for (j = 0; j < FLD_MAX_FRAME; j++) { + /* The necessary data when the FLD mode fp model runs */ + ret = sprintf(name, + "%s/aie_fld_fp_forest%02d_om45.bin", cfg_path, j); + if (ret < 0) + return ret; + + ret = aie_copy_fw(fd, name, + fd->fld_para->fld_step_va[FLD_STEP_FP][j], + fld_step_align_size(FLD_STEP_FP, j)); + if (ret) + return ret; + } + + for (j = 0; j < FLD_MAX_FRAME; j++) { + /* The necessary data when the FLD mode leafnode model runs */ + ret = sprintf(name, + "%s/aie_fld_leafnode_forest%02d.bin", cfg_path, j); + if (ret < 0) + return ret; + + ret = aie_copy_fw(fd, name, + fd->fld_para->fld_step_va[FLD_STEP_LEAF][j], + fld_step_align_size(FLD_STEP_LEAF, j)); + if (ret) + return ret; + } + + for (j = 0; j < FLD_MAX_FRAME; j++) { + /* The necessary data when the FLD mode treekm02 model runs */ + ret = sprintf(name, + "%s/aie_fld_tree_forest%02d_km02.bin", cfg_path, j); + if (ret < 0) + return ret; + ret = aie_copy_fw(fd, name, + fd->fld_para->fld_step_va[FLD_STEP_KM02][j], + fld_step_align_size(FLD_STEP_KM02, j)); + if (ret) + return ret; + } + + for (j = 0; j < FLD_MAX_FRAME; j++) { + /* The necessary data when the FLD mode treekm13 model runs */ + ret = sprintf(name, + "%s/aie_fld_tree_forest%02d_km13.bin", cfg_path, j); + if (ret < 0) + return ret; + ret = aie_copy_fw(fd, name, + fd->fld_para->fld_step_va[FLD_STEP_KM13][j], + fld_step_align_size(FLD_STEP_KM13, j)); + if (ret) + return ret; + } + + return 0; +} + +static int aie_update_cfg(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) +{ + int crop_width; + int crop_height; + + if (aie_cfg->en_roi) { + crop_width = dif_x(aie_cfg) + 1; + crop_height = dif_y(aie_cfg) + 1; + } else { + crop_width = aie_cfg->src_img_width; + crop_height = aie_cfg->src_img_height; + } + + if (crop_width == 0 || crop_height == 0) { + dev_err(fd->dev, "Invalid crop size 0x0\n"); + return -EINVAL; + } + + if (aie_cfg->en_padding) { + crop_width += aie_cfg->src_padding.right + aie_cfg->src_padding.left; + crop_height += aie_cfg->src_padding.up + aie_cfg->src_padding.down; + } + + if (aie_cfg->sel_mode == FDMODE) { + fd->base_para->sel_mode = aie_cfg->sel_mode; + fd->base_para->crop_rect.width = crop_width; + fd->base_para->crop_rect.height = crop_height; + fd->base_para->src_img_addr = aie_cfg->src_img_addr; + fd->base_para->src_img_addr_uv = aie_cfg->src_img_addr_uv; + fd->base_para->img_rect.width = aie_cfg->src_img_width; + fd->base_para->img_rect.height = aie_cfg->src_img_height; + fd->base_para->src_img_fmt = aie_cfg->src_img_fmt; + fd->base_para->rotate_degree = aie_cfg->rotate_degree; + } else if (aie_cfg->sel_mode == ATTRIBUTEMODE) { + fd->attr_para->sel_mode[fd->attr_para->w_idx] = aie_cfg->sel_mode; + fd->attr_para->crop_width[fd->attr_para->w_idx] = crop_width; + fd->attr_para->crop_height[fd->attr_para->w_idx] = crop_height; + fd->attr_para->src_img_addr[fd->attr_para->w_idx] = aie_cfg->src_img_addr; + fd->attr_para->src_img_addr_uv[fd->attr_para->w_idx] = aie_cfg->src_img_addr_uv; + fd->attr_para->img_width[fd->attr_para->w_idx] = aie_cfg->src_img_width; + fd->attr_para->img_height[fd->attr_para->w_idx] = aie_cfg->src_img_height; + fd->attr_para->src_img_fmt[fd->attr_para->w_idx] = aie_cfg->src_img_fmt; + fd->attr_para->rotate_degree[fd->attr_para->w_idx] = aie_cfg->rotate_degree; + } + + return 0; +} + +static int aie_config_y2r(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg, + int mode) +{ + u32 img_addr; + u32 img_addr_UV; + u32 img_off; + u32 img_off_uv; + u32 *yuv2rgb_cfg; + u32 srcbuf, srcbuf_UV; + u16 xmag_0, ymag_0; + u16 pym0_out_w; + u16 pym0_out_h; + u16 stride_pym0_out_w; + u16 sr_crp_w; + u16 sr_crp_h; + u16 y1_stride; + + if (!aie_cfg->en_roi) { + img_off = 0; + img_off_uv = 0; + } else { + if (aie_cfg->src_img_fmt == FMT_MONO || aie_cfg->src_img_fmt == FMT_YUV_2P || + aie_cfg->src_img_fmt == FMT_YVU_2P) { + y1_stride = aie_cfg->src_img_stride * aie_cfg->src_roi.y1; + img_off = y1_stride + aie_cfg->src_roi.x1; + img_off_uv = y1_stride + aie_cfg->src_roi.x1; + } else if (aie_cfg->src_img_fmt == FMT_YUV420_2P || + aie_cfg->src_img_fmt == FMT_YUV420_1P) { + y1_stride = aie_cfg->src_img_stride * aie_cfg->src_roi.y1; + img_off = y1_stride + aie_cfg->src_roi.x1; + img_off_uv = y1_stride / 2 + aie_cfg->src_roi.x1; + } else if (aie_cfg->src_img_fmt == FMT_YUYV || + aie_cfg->src_img_fmt == FMT_YVYU || + aie_cfg->src_img_fmt == FMT_UYVY || + aie_cfg->src_img_fmt == FMT_VYUY) { + y1_stride = aie_cfg->src_img_stride * aie_cfg->src_roi.y1; + img_off = y1_stride + aie_cfg->src_roi.x1 * 2; + img_off_uv = y1_stride + aie_cfg->src_roi.x1 * 2; + } else { + dev_err(fd->dev, "Unsupport input format %d", aie_cfg->src_img_fmt); + return -EINVAL; + } + } + + img_addr = aie_cfg->src_img_addr + img_off; + img_addr_UV = aie_cfg->src_img_addr_uv + img_off_uv; + + srcbuf = img_addr; + if (aie_cfg->src_img_fmt == FMT_YUV420_2P || aie_cfg->src_img_fmt == FMT_YUV420_1P || + aie_cfg->src_img_fmt == FMT_YUV_2P || aie_cfg->src_img_fmt == FMT_YVU_2P) + srcbuf_UV = img_addr_UV; + else + srcbuf_UV = 0; + + if (mode == FDMODE) { + sr_crp_w = fd->base_para->crop_rect.width; + sr_crp_h = fd->base_para->crop_rect.height; + yuv2rgb_cfg = (u32 *)fd->base_para->fd_yuv2rgb_cfg_va; + pym0_out_w = fd->base_para->pyramid_rect.width; + } + if (mode == ATTRIBUTEMODE) { + sr_crp_w = fd->attr_para->crop_width[fd->attr_para->w_idx]; + sr_crp_h = fd->attr_para->crop_height[fd->attr_para->w_idx]; + yuv2rgb_cfg = (u32 *)fd->base_para->attr_yuv2rgb_cfg_va[fd->attr_para->w_idx]; + pym0_out_w = ATTR_MODE_PYRAMID_WIDTH; + } + + pym0_out_h = pym0_out_w * sr_crp_h / sr_crp_w; + + if (pym0_out_w != 0) { + xmag_0 = 512 * sr_crp_w / pym0_out_w; + ymag_0 = xmag_0; + } else { + xmag_0 = 0; + ymag_0 = 0; + } + + yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] = (yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] & 0xFFFFFFF8) | + ((aie_cfg->src_img_fmt) & 0x7); + if (aie_cfg->src_img_fmt == FMT_YUV420_2P || aie_cfg->src_img_fmt == FMT_YUV420_1P) { + /* for match patten */ + yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] = (yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] & + 0xFFFFFFF8) | ((0x3) & 0x7); + } + yuv2rgb_cfg[Y2R_IN_W_H] = (yuv2rgb_cfg[Y2R_IN_W_H] & 0xF800F800) | + ((sr_crp_w << 16) & 0x7FF0000) | (sr_crp_h & 0x7FF); + yuv2rgb_cfg[Y2R_OUT_W_H] = (yuv2rgb_cfg[Y2R_OUT_W_H] & 0xF800F800) | + ((pym0_out_w << 16) & 0x7FF0000) | (pym0_out_h & 0x7FF); + + if (aie_cfg->src_img_fmt == FMT_YUV_2P || aie_cfg->src_img_fmt == FMT_YVU_2P) { + /* 2 plane */ + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & 0xFFFFFFEE) | 0x11; + if (aie_cfg->en_roi) { + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(dif_x(aie_cfg), dif_y(aie_cfg)); + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(dif_x(aie_cfg), dif_y(aie_cfg)); + } else { + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); + } + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x1; + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x1; + } else if (aie_cfg->src_img_fmt == FMT_MONO) { + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = + (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & 0xFFFFFFEE) | 0x01; + if (aie_cfg->en_roi) { + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(dif_x(aie_cfg), dif_y(aie_cfg)); + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(dif_x(aie_cfg), dif_y(aie_cfg)); + } else { + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); + } + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; + } else if (aie_cfg->src_img_fmt == FMT_YUYV || + aie_cfg->src_img_fmt == FMT_YVYU || + aie_cfg->src_img_fmt == FMT_UYVY || + aie_cfg->src_img_fmt == FMT_VYUY) { + /* 1 plane */ + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & 0xFFFFFFEE) | 0x1; + if (aie_cfg->en_roi) { + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(2 * (dif_x(aie_cfg) + 1) - 1, + dif_y(aie_cfg)); + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(2 * (dif_x(aie_cfg) + 1) - 1, + dif_y(aie_cfg)); + } else { + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(2 * sr_crp_w - 1, sr_crp_h - 1); + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(2 * sr_crp_w - 1, sr_crp_h - 1); + } + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x3; + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x3; + } + + /* AIE3.0 */ + if (aie_cfg->src_img_fmt == FMT_YUV420_2P || + aie_cfg->src_img_fmt == FMT_YUV420_1P) { + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = + (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & 0xFFFFFFEE) | 0x11; + if (aie_cfg->en_roi) { + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(dif_x(aie_cfg), dif_y(aie_cfg)); + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(dif_x(aie_cfg), + dif_y(aie_cfg) / 2); + } else { + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = + aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(sr_crp_w - 1, + sr_crp_h / 2 - 1); + } + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; + + yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] = + (yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] & 0xFFFFFFFE) | 0x01; + if (aie_cfg->en_roi) { + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, dif_x(aie_cfg)); + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, dif_y(aie_cfg)); + } else { + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, sr_crp_w - 1); + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, sr_crp_h - 1); + } + } else { + yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] = + (yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] & 0xFFFFFFFE); + + if (aie_cfg->en_roi) { + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, dif_x(aie_cfg)); + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, dif_y(aie_cfg)); + } else { + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, sr_crp_w - 1); + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, sr_crp_h - 1); + } + } + + stride_pym0_out_w = round_up(pym0_out_w, 8); + + yuv2rgb_cfg[Y2R_OUT_X_Y_SIZE0] = + aie_cmb_u16(pym0_out_w - 1, pym0_out_h - 1); + set_cmb_cfg(yuv2rgb_cfg, Y2R_OUT_STRIDE0_BUS_SIZE0, stride_pym0_out_w); + yuv2rgb_cfg[Y2R_OUT_X_Y_SIZE1] = + aie_cmb_u16(pym0_out_w - 1, pym0_out_h - 1); + set_cmb_cfg(yuv2rgb_cfg, Y2R_OUT_STRIDE1_BUS_SIZE1, stride_pym0_out_w); + yuv2rgb_cfg[Y2R_OUT_X_Y_SIZE2] = + aie_cmb_u16(pym0_out_w - 1, pym0_out_h - 1); + set_cmb_cfg(yuv2rgb_cfg, Y2R_OUT_STRIDE2_BUS_SIZE2, stride_pym0_out_w); + + if (aie_cfg->en_padding) { + yuv2rgb_cfg[Y2R_PADDING_EN_UP_DOWN] = + 1 | ((aie_cfg->src_padding.up << 4) & 0x1FF0) | + ((aie_cfg->src_padding.down << 16) & 0x01FF0000); + yuv2rgb_cfg[Y2R_PADDING_RIGHT_LEFT] = + (aie_cfg->src_padding.right & 0x01FF) | + ((aie_cfg->src_padding.left << 16) & 0x01FF0000); + } else { + yuv2rgb_cfg[Y2R_PADDING_EN_UP_DOWN] = 0; + yuv2rgb_cfg[Y2R_PADDING_RIGHT_LEFT] = 0; + } + + yuv2rgb_cfg[Y2R_IN_0] = srcbuf; + yuv2rgb_cfg[Y2R_IN_1] = srcbuf_UV; + + yuv2rgb_cfg[Y2R_OUT_0] = (u32)fd->base_para->rs_pym_rst_pa[0][0]; + yuv2rgb_cfg[Y2R_OUT_1] = (u32)fd->base_para->rs_pym_rst_pa[0][1]; + yuv2rgb_cfg[Y2R_OUT_2] = (u32)fd->base_para->rs_pym_rst_pa[0][2]; + + yuv2rgb_cfg[Y2R_X_Y_MAG] = (xmag_0 & 0x3FFF) | ((ymag_0 << 16) & 0x3FFF0000); + + if (sr_crp_w >= pym0_out_w) { + /* down scale AIE1.0 by FRZ */ + yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] = + (yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] & 0x00100070); + yuv2rgb_cfg[Y2R_SRZ_HORI_STEP] = 0; + yuv2rgb_cfg[Y2R_SRZ_VERT_STEP] = 0; + } else { + /* SRZ */ + /* 0: FDRZ for down scaling */ + /* 1: SRZ for up scaling */ + yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] = + (yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] & 0x00100070) | SRZ_BIT; + yuv2rgb_cfg[Y2R_SRZ_HORI_STEP] = ((sr_crp_w - 1) << 15) / (pym0_out_w - 1); + yuv2rgb_cfg[Y2R_SRZ_VERT_STEP] = ((sr_crp_h - 1) << 15) / (pym0_out_h - 1); + } + + yuv2rgb_cfg[Y2R_CON_IN_BA_MSB] = (u32)0x02020202; + yuv2rgb_cfg[Y2R_CON_OUT_BA_MSB] = (u32)0x02020202; + + return 0; +} + +static int aie_config_rs(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) +{ + u32 *rs_cfg; + u32 *rs_tbl[2]; + u16 xmag_0, ymag_0; + u16 pym_out_w[3]; + u16 pym_out_h[3]; + u16 round_w; + u16 sr_crp_w; + u16 sr_crp_h; + int i; + + sr_crp_w = fd->base_para->crop_rect.width; + sr_crp_h = fd->base_para->crop_rect.height; + + rs_cfg = (u32 *)fd->base_para->fd_rs_cfg_va; + + pym_out_w[0] = fd->base_para->pyramid_rect.width; + pym_out_w[1] = pym_out_w[0] >> 1; + pym_out_w[2] = pym_out_w[1] >> 1; + + pym_out_h[0] = pym_out_w[0] * sr_crp_h / sr_crp_w; + pym_out_h[1] = pym_out_h[0] >> 1; + pym_out_h[2] = pym_out_h[1] >> 1; + + for (i = 0; i < 2; i++) { + rs_tbl[i] = rs_cfg + fd->variant->rs_cfg_size * i; + + rs_tbl[i][RS_IN_0] = (u32)fd->base_para->rs_pym_rst_pa[i][0]; + rs_tbl[i][RS_IN_1] = (u32)fd->base_para->rs_pym_rst_pa[i][1]; + rs_tbl[i][RS_IN_2] = (u32)fd->base_para->rs_pym_rst_pa[i][2]; + + rs_tbl[i][RS_OUT_0] = (u32)fd->base_para->rs_pym_rst_pa[i + 1][0]; + rs_tbl[i][RS_OUT_1] = (u32)fd->base_para->rs_pym_rst_pa[i + 1][1]; + rs_tbl[i][RS_OUT_2] = (u32)fd->base_para->rs_pym_rst_pa[i + 1][2]; + + rs_tbl[i][RS_INPUT_W_H] = (rs_tbl[i][RS_INPUT_W_H] & 0xF800F800) | + (pym_out_h[i] & 0x7FF) | ((pym_out_w[i] << 16) & 0x7FF0000); + rs_tbl[i][RS_OUTPUT_W_H] = (rs_tbl[i][RS_OUTPUT_W_H] & 0xF800F800) | + (pym_out_h[i + 1] & 0x7FF) | ((pym_out_w[i + 1] << 16) & 0x7FF0000); + + rs_tbl[i][RS_IN_X_Y_SIZE0] = aie_cmb_u16(pym_out_w[i] - 1, pym_out_h[i] - 1); + rs_tbl[i][RS_IN_X_Y_SIZE1] = aie_cmb_u16(pym_out_w[i] - 1, pym_out_h[i] - 1); + rs_tbl[i][RS_IN_X_Y_SIZE2] = aie_cmb_u16(pym_out_w[i] - 1, pym_out_h[i] - 1); + + set_cmb_cfg(rs_tbl[i], RS_IN_STRIDE0, pym_out_w[i]); + set_cmb_cfg(rs_tbl[i], RS_IN_STRIDE1, pym_out_w[i]); + set_cmb_cfg(rs_tbl[i], RS_IN_STRIDE2, pym_out_w[i]); + + rs_tbl[i][RS_OUT_X_Y_SIZE0] = aie_cmb_u16(pym_out_w[i + 1] - 1, + pym_out_h[i + 1] - 1); + rs_tbl[i][RS_OUT_X_Y_SIZE1] = aie_cmb_u16(pym_out_w[i + 1] - 1, + pym_out_h[i + 1] - 1); + rs_tbl[i][RS_OUT_X_Y_SIZE2] = aie_cmb_u16(pym_out_w[i + 1] - 1, + pym_out_h[i + 1] - 1); + + if (i == 0) + round_w = pym_out_w[i + 1]; + else + round_w = round_up(pym_out_w[i + 1], 8); + + set_cmb_cfg(rs_tbl[i], RS_OUT_STRIDE0, round_w); + set_cmb_cfg(rs_tbl[i], RS_OUT_STRIDE1, round_w); + set_cmb_cfg(rs_tbl[i], RS_OUT_STRIDE2, round_w); + + xmag_0 = 512 * pym_out_w[i] / pym_out_w[i + 1]; + ymag_0 = xmag_0; + + rs_tbl[i][RS_X_Y_MAG] = (xmag_0 & 0x3FFF) | ((ymag_0 << 16) & 0x3FFF0000); + rs_tbl[i][RS_CON_IN_BA_MSB] = (u32)0x02020202; + rs_tbl[i][RS_CON_OUT_BA_MSB] = (u32)0x02020202; + } + + return 0; +} + +static int aie_config_network(struct mtk_aie_dev *fd, + struct aie_enq_info *aie_cfg) +{ + struct aie_static_info *pstv = &fd->st_info; + u16 pyramid0_out_w, pyramid0_out_h, pyramid1_out_h, pyramid2_out_h; + u16 out_ysize_minus_1, out_ysize_minus_1_stride2; + u16 fd_xsize[4], size, poolsize, stridesize; + u16 input_height, out_height; + u16 conv_width, conv_height; + u32 *fd_cur_cfg, *fd_cur_set; + u32 sr_crp_w, sr_crp_h; + u32 cal_x, cal_y; + u8 uch, uloop; + u8 i, j; + void *fd_cfg; + + sr_crp_w = fd->base_para->crop_rect.width; + sr_crp_h = fd->base_para->crop_rect.height; + + pyramid0_out_w = fd->base_para->pyramid_rect.width; + pyramid0_out_h = pyramid0_out_w * sr_crp_h / sr_crp_w; + + pyramid1_out_h = pyramid0_out_h / 2; + pyramid2_out_h = pyramid1_out_h / 2; + + fd_cfg = fd->base_para->fd_fd_cfg_va; + + for (i = 0; i < FD_LOOP_NUM; i++) { + fd_cur_cfg = (u32 *)fd_cfg + fd->variant->fd_cfg_size * i; + fd_cur_cfg[FD_INPUT_ROTATE] = (fd_cur_cfg[FD_INPUT_ROTATE] & 0xFFFF0FFF) | + ((aie_cfg->rotate_degree << 12) & 0x3000); + + if (i == 0) + input_height = pyramid2_out_h; + else if (i == (RPNX_LOOP_NUM(2) + 1)) + input_height = pyramid1_out_h; + else if (i == (RPNX_LOOP_NUM(1) + 1)) + input_height = pyramid0_out_h; + else + if (fd_out_stride2_in[i] == 0) + input_height = out_height; + else + input_height = (out_height + 1) / 2; + + poolsize = is_fd_maxpool(i); + stridesize = get_fd_stride(i); + if (poolsize == 1) + out_height = + DIV_ROUND_UP(input_height, 2 * poolsize); + else + out_height = DIV_ROUND_UP(input_height, stridesize + 2 * poolsize); + + if (i == RPNX_LOOP_NUM(0) || i == RPNX_LOOP_NUM(1) || i == RPNX_LOOP_NUM(2)) { + conv_width = fd->base_para->img_rect.width; + conv_height = fd->base_para->img_rect.height; + fd_xsize[0] = pstv->inf_elm[i].img_width * 2 * 16 * ANCHOR_EN_NUM - 1; + fd_xsize[3] = pstv->inf_elm[i].img_width * 2 * 32 * ANCHOR_EN_NUM - 1; + fd_xsize[2] = fd_xsize[3]; + fd_xsize[1] = fd_xsize[2]; + } else { + conv_width = DIV_ROUND_UP(pstv->inf_elm[i].img_width, stridesize); + conv_height = DIV_ROUND_UP(input_height, stridesize); + + fd_xsize[3] = pstv->inf_elm[i].input_xsize_plus_1 - 1; + fd_xsize[2] = fd_xsize[3]; + fd_xsize[1] = fd_xsize[2]; + fd_xsize[0] = fd_xsize[1]; + } + + fd_cur_cfg[FD_CONV_WIDTH_MOD6] = (fd_cur_cfg[FD_CONV_WIDTH_MOD6] & 0xFF8FFFFF) | + (((conv_width % 6) << 20) & 0x00700000); + fd_cur_cfg[FD_CONV_IMG_W_H] = aie_cmb_u16(conv_height, conv_width); + + fd_cur_cfg[FD_IN_IMG_W_H] = aie_cmb_u16(input_height, pstv->inf_elm[i].img_width); + fd_cur_cfg[FD_OUT_IMG_W_H] = aie_cmb_u16(out_height, pstv->inf_elm[i].out_width); + + if (fd_rdma_en[i][0][0] != -1) { + for (j = 0; j < 4; j++) { + fd_cur_cfg[FD_IN_X_Y_SIZE0 + 2 * j] = + aie_cmb_u16(fd_xsize[j], input_height - 1); + set_cmbst_cfg(fd_cur_cfg, FD_IN_STRIDE0_BUS_SIZE0 + 2 * j, + fd_xsize[j] + 1); + } + } + + out_ysize_minus_1 = out_height - 1; + out_ysize_minus_1_stride2 = (out_height + 1) / 2 - 1; + + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { + fd_cur_set = fd_cur_cfg + 2 * j; + if (!out_stride_size[i][j]) + continue; + + if (out_stride_size[i][j] == 1) { + fd_cur_set[FD_OUT_X_Y_SIZE0] = + aie_cmb_u16(pstv->inf_elm[i].out_xsize_plus_1 - 1, + out_ysize_minus_1); + set_cmbst_cfg(fd_cur_set, FD_OUT_STRIDE0_BUS_SIZE0, + pstv->inf_elm[i].out_stride); + } else if (out_stride_size[i][j] == 2) { + fd_cur_set[FD_OUT_X_Y_SIZE0] = + aie_cmb_u16(pstv->inf_elm[i].out_xsize_plus_1_stride2 - 1, + out_ysize_minus_1_stride2); + set_cmbst_cfg(fd_cur_set, FD_OUT_STRIDE0_BUS_SIZE0, + pstv->inf_elm[i].out_stride_stride2); + } + } + + if (i == RPNX_LOOP_NUM(0) || i == RPNX_LOOP_NUM(1) || i == RPNX_LOOP_NUM(2)) + set_cmb_cfg(fd_cur_cfg, FD_RPN_SET, fd->base_para->rpn_anchor_thrd); + + if (i == RPNX_LOOP_NUM(0)) { + cal_x = ((sr_crp_w << 10) * 100 / + (int)fd->base_para->pyramid_rect.width) >> 10; + cal_y = cal_x * 512 / 100; + fd_cur_cfg[FD_IMAGE_COORD] = (fd_cur_cfg[FD_IMAGE_COORD] & 0xF) | + ((cal_y << 4) & 0x7FFF0); + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = 0; + if (aie_cfg->en_roi) { + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = + (aie_cfg->src_roi.x1 - aie_cfg->src_padding.left) | + (aie_cfg->src_roi.y1 - aie_cfg->src_padding.up) << 16; + } + } else if (i == RPNX_LOOP_NUM(1)) { + cal_x = ((sr_crp_w << 10) * 100 / + (int)fd->base_para->pyramid_rect.width) >> 10; + cal_y = cal_x * 2 * 512 / 100; + fd_cur_cfg[FD_IMAGE_COORD] = (fd_cur_cfg[FD_IMAGE_COORD] & 0xF) | + ((cal_y << 4) & 0x7FFF0); + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = 0; + if (aie_cfg->en_roi) { + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = + (aie_cfg->src_roi.x1 - aie_cfg->src_padding.left) | + (aie_cfg->src_roi.y1 - aie_cfg->src_padding.up) << 16; + } + } else if (i == RPNX_LOOP_NUM(2)) { + cal_x = ((sr_crp_w << 10) * 100 / + (int)fd->base_para->pyramid_rect.width) >> 10; + cal_y = cal_x * 4 * 512 / 100; + fd_cur_cfg[FD_IMAGE_COORD] = (fd_cur_cfg[FD_IMAGE_COORD] & 0xF) | + ((cal_y << 4) & 0x7FFF0); + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = 0; + if (aie_cfg->en_roi) { + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = + (aie_cfg->src_roi.x1 - aie_cfg->src_padding.left) | + (aie_cfg->src_roi.y1 - aie_cfg->src_padding.up) << 16; + } + } + + /* IN_FM_BASE_ADR */ + if (i == 0) { + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[2][0]); + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[2][1]); + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[2][2]); + } else if (i == (RPNX_LOOP_NUM(2) + 1)) { + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[1][0]); + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[1][1]); + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[1][2]); + } else if (i == (RPNX_LOOP_NUM(1) + 1)) { + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[0][0]); + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[0][1]); + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[0][2]); + } else { + for (j = 0; j < INPUT_WDMA_WRA_NUM; j++) { + if (fd_rdma_en[i][j][0] != -1) { + uloop = fd_rdma_en[i][j][0]; + uch = fd_rdma_en[i][j][1]; + fd_cur_cfg[FD_IN_0 + j] = + (u32)(fd->dma_para->fd_out_hw_pa[uloop][uch]); + } + } + } + + /* OUT_FM_BASE_ADR */ + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { + if (out_stride_size[i][j]) + fd_cur_cfg[FD_OUT_0 + j] = + (u32)(fd->dma_para->fd_out_hw_pa[i][j]); + } + + /* KERNEL_BASE_ADR */ + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { + size = get_fd_ker_rdma_size(i, j); + if (size) + fd_cur_cfg[FD_KERNEL_0 + j] = + (u32)(fd->dma_para->fd_kernel_pa[i][j]); + } + + fd_cur_cfg[FD_CON_IN_BA_MSB] = (u32)0x02020202; + fd_cur_cfg[FD_CON_OUT_BA_MSB] = (u32)0x02020202; + fd_cur_cfg[FD_CON_KERNEL_BA_MSB] = (u32)0x00000202; + } + + return 0; +} + +static int aie_config_attr_network(struct mtk_aie_dev *fd, + struct aie_enq_info *aie_cfg) +{ + bool is_regression_loop; + void *fd_cfg; + u32 *fd_cur_cfg; + u16 fd_input_ht, fd_output_ht; + u16 fd_out_y[4]; + u8 i, j; + u8 uloop, uch, uidx; + u16 pyramid0_out_w, pyramid0_out_h; + int fd_conv_ht; + u16 sr_crp_w, sr_crp_h; + + sr_crp_w = fd->attr_para->crop_width[fd->attr_para->w_idx]; + sr_crp_h = fd->attr_para->crop_height[fd->attr_para->w_idx]; + + pyramid0_out_w = ATTR_MODE_PYRAMID_WIDTH; + pyramid0_out_h = pyramid0_out_w * sr_crp_h / sr_crp_w; + + fd_cfg = fd->base_para->attr_fd_cfg_va[fd->attr_para->w_idx]; + + for (i = 0; i < ATTR_LOOP_NUM; i++) { + fd_cur_cfg = (u32 *)fd_cfg + fd->variant->fd_cfg_size * i; + fd_cur_cfg[FD_INPUT_ROTATE] = + (fd_cur_cfg[FD_INPUT_ROTATE] & 0xFFFF0FFF) | + ((aie_cfg->rotate_degree << 12) & 0x3000); + if (i == 0) + fd_input_ht = pyramid0_out_h; + else + if (attr_out_stride2_as_in[i] == 0) + fd_input_ht = fd_output_ht; + else if (attr_out_stride2_as_in[i] == 1) + fd_input_ht = (fd_output_ht + 1) / 2; + + fd_output_ht = DIV_ROUND_UP(fd_input_ht, ATTR_FD_STRIDE(i) + + 2 * ATTR_FD_MAXPOOL(i)); + fd_conv_ht = DIV_ROUND_UP(fd_input_ht, ATTR_FD_STRIDE(i)); + + fd_cur_cfg[FD_CONV_IMG_W_H] = + (fd_cur_cfg[FD_CONV_IMG_W_H] & 0xFFFF0000) | + (fd_conv_ht & 0xFFFF); + fd_cur_cfg[FD_IN_IMG_W_H] = + (fd_cur_cfg[FD_IN_IMG_W_H] & 0xFFFF0000) | + (fd_input_ht & 0xFFFF); + fd_cur_cfg[FD_OUT_IMG_W_H] = + (fd_cur_cfg[FD_OUT_IMG_W_H] & 0xFFFF0000) | + (fd_output_ht & 0xFFFF); + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE0, fd_input_ht - 1); + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE1, fd_input_ht - 1); + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE2, fd_input_ht - 1); + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE3, fd_input_ht - 1); + + is_regression_loop = (i == AGE_OUT_RGS || i == GENDER_OUT_RGS || + i == INDIAN_OUT_RGS || i == ETHNICITY_OUT_RGS); + + if (is_regression_loop) { + fd_out_y[0] = 0; + fd_out_y[1] = 0; + fd_out_y[2] = 0; + fd_out_y[3] = 0; + } else { + fd_out_y[0] = fd_output_ht - 1; + fd_out_y[1] = fd_output_ht - 1; + if (attr_out_2size[i] == 0) { + fd_out_y[2] = fd_output_ht - 1; + fd_out_y[3] = fd_output_ht - 1; + } else { + fd_out_y[2] = (fd_output_ht + 1) / 2 - 1; + fd_out_y[3] = (fd_output_ht + 1) / 2 - 1; + } + } + + for (j = 0; j < 4; j++) + set_cmb_cfg(fd_cur_cfg, FD_OUT_X_Y_SIZE0 + 2 * j, fd_out_y[j]); + + /* IN_FM_BASE_ADR */ + if (i == 0) { + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[0][0]); + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[0][1]); + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[0][2]); + } else { + for (j = 0; j < INPUT_WDMA_WRA_NUM; j++) { + if (attr_rdma_en[i][j][0] != -1) { + uloop = attr_rdma_en[i][j][0]; + uch = attr_rdma_en[i][j][1]; + fd_cur_cfg[FD_IN_0 + j] = + (u32)(fd->dma_para->attr_out_hw_pa[uloop][uch]); + } + } + } + + /* OUT_FM_BASE_ADR */ + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { + if (attr_wdma_en[i][j]) { + uidx = fd->attr_para->w_idx; + if (i == AGE_OUT_RGS && j == 0) + fd_cur_cfg[FD_OUT_0 + j] = + (u32)(fd->dma_para->age_out_hw_pa[uidx]); + else if (i == GENDER_OUT_RGS && j == 0) + fd_cur_cfg[FD_OUT_0 + j] = + (u32)(fd->dma_para->gender_out_hw_pa[uidx]); + else if (i == INDIAN_OUT_RGS && j == 0) + fd_cur_cfg[FD_OUT_0 + j] = + (u32)(fd->dma_para->is_indian_out_hw_pa[uidx]); + else if (i == ETHNICITY_OUT_RGS && j == 0) + fd_cur_cfg[FD_OUT_0 + j] = + (u32)(fd->dma_para->ethnicity_out_hw_pa[uidx]); + else + fd_cur_cfg[FD_OUT_0 + j] = + (u32)(fd->dma_para->attr_out_hw_pa[i][j]); + } + } + + /* KERNEL_BASE_ADR */ + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { + fd_cur_cfg[FD_KERNEL_0 + j] = + (u32)(fd->dma_para->attr_kernel_pa[i][j]); + } + + fd_cur_cfg[FD_CON_IN_BA_MSB] = (u32)0x02020202; + fd_cur_cfg[FD_CON_OUT_BA_MSB] = (u32)0x02020202; + fd_cur_cfg[FD_CON_KERNEL_BA_MSB] = (u32)0x00000202; + } + return 0; +} + +static int aie_config_dram(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) +{ + int ret; + + ret = aie_config_y2r(fd, aie_cfg, aie_cfg->sel_mode); + if (ret) + return ret; + + if (aie_cfg->sel_mode == FDMODE) { + ret = aie_config_rs(fd, aie_cfg); + if (ret) + return ret; + + ret = aie_config_network(fd, aie_cfg); + if (ret) + return ret; + + } else if (aie_cfg->sel_mode == ATTRIBUTEMODE) { + ret = aie_config_attr_network(fd, aie_cfg); + if (ret) + return ret; + } else { + return -EINVAL; + } + + return 0; +} + +void aie_reset(struct mtk_aie_dev *fd) +{ + writel(RESET_BIT, fd->fd_base + AIE_START_REG); + writel(0x0, fd->fd_base + AIE_START_REG); +} + +int aie_init(struct mtk_aie_dev *fd, struct v4l2_ctrl_aie_init *user_init) +{ + int ret; + int i, j; + + if (fd->fd_state & STATE_INIT) { + dev_err(fd->dev, "%s fd state: %d\n", __func__, fd->fd_state); + return -EINVAL; + } + + fd->fd_state &= ~STATE_INIT; + fd->fd_mem_size = 0; + + fd->base_para = kmalloc(sizeof(*fd->base_para), GFP_KERNEL); + if (!fd->base_para) + return -ENOMEM; + + fd->attr_para = kmalloc(sizeof(*fd->attr_para), GFP_KERNEL); + if (!fd->attr_para) { + ret = -ENOMEM; + goto kmalloc_fail; + } + + fd->dma_para = kmalloc(sizeof(*fd->dma_para), GFP_KERNEL); + if (!fd->dma_para) { + ret = -ENOMEM; + goto kmalloc_fail; + } + + fd->fld_para = kmalloc(sizeof(*fd->fld_para), GFP_KERNEL); + if (!fd->fld_para) { + ret = -ENOMEM; + goto kmalloc_fail; + } + + fd->base_para->rpn_anchor_thrd = + (signed short)(user_init->feature_threshold & 0x0000FFFF); + fd->base_para->pyramid_rect.width = user_init->pyramid_width; + fd->base_para->pyramid_rect.height = user_init->pyramid_height; + fd->base_para->max_pyramid_rect.width = user_init->pyramid_width; + fd->base_para->max_pyramid_rect.height = user_init->pyramid_height; + + fd->base_para->fd_fd_cfg_va = NULL; + fd->base_para->fd_rs_cfg_va = NULL; + fd->base_para->fd_yuv2rgb_cfg_va = NULL; + for (i = 0; i < MAX_ENQUE_FRAME_NUM; i++) + fd->base_para->attr_fd_cfg_va[i] = NULL; + for (i = 0; i < MAX_ENQUE_FRAME_NUM; i++) + fd->base_para->attr_yuv2rgb_cfg_va[i] = NULL; + for (i = 0; i < PYM_NUM; i++) + for (j = 0; j < COLOR_NUM; j++) + fd->base_para->rs_pym_rst_va[i][j] = NULL; + + memset(&fd->st_info, 0, sizeof(struct aie_static_info)); + aie_update_table(fd, fd->base_para->max_pyramid_rect.width, + fd->base_para->max_pyramid_rect.height, 1); + aie_update_buf_params(fd, user_init->max_img_width, + user_init->max_img_height); + ret = aie_alloc_dram_buf(fd); + if (ret) + return -ENOMEM; + + ret = aie_alloc_output_buf(fd); + if (ret) { + ret = -ENOMEM; + goto free_all; + } + + ret = aie_alloc_fddma_buf(fd); + if (ret) { + ret = -ENOMEM; + goto free_all; + } + + ret = aie_alloc_fld_buf(fd); + if (ret) { + ret = -ENOMEM; + goto free_all; + } + + aie_arrange_fddma_buf(fd); + aie_arrange_kernel_buf(fd); + aie_arrange_attrdma_buf(fd); + aie_arrange_result_dma_buf(fd); + + aie_arrange_fld_buf(fd); + + ret = aie_load_fw(fd); + if (ret) { + dev_err(fd->dev, "Failed to load aie fw\n"); + goto free_all; + } + + fd->attr_para->r_idx = 0; + fd->attr_para->w_idx = 0; + + fd->fd_state |= STATE_INIT; + + dev_dbg(fd->dev, "%s: fd_mem_size(%d)\n", __func__, fd->fd_mem_size); + + return 0; + +free_all: + aie_free_dram_buf(fd); + aie_free_output_buf(fd); + aie_free_fddma_buf(fd); + aie_free_fld_buf(fd); + +kmalloc_fail: + kfree(fd->base_para); + kfree(fd->attr_para); + kfree(fd->dma_para); + kfree(fd->fld_para); + + dev_err(fd->dev, "Failed to init aie\n"); + + return ret; +} + +void aie_uninit(struct mtk_aie_dev *fd) +{ + fd->fd_state &= ~STATE_INIT; + aie_free_dram_buf(fd); + aie_free_output_buf(fd); + aie_free_fddma_buf(fd); + aie_free_fld_buf(fd); + + kfree(fd->base_para); + kfree(fd->attr_para); + kfree(fd->dma_para); + kfree(fd->fld_para); +} + +void aie_prepare(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) +{ + if (aie_cfg->sel_mode == FLDMODE) + return; + + memset(&fd->reg_cfg, 0, sizeof(struct aie_reg_cfg)); + + if (aie_cfg->pyramid_base_width == 0) { + fd->base_para->pyramid_rect.width = fd->base_para->max_pyramid_rect.width; + fd->base_para->pyramid_rect.height = fd->base_para->max_pyramid_rect.height; + fd->base_para->number_of_pyramid = 3; + } else { + fd->base_para->pyramid_rect.height = + fd->base_para->max_pyramid_rect.height; + fd->base_para->number_of_pyramid = aie_cfg->number_of_pyramid; + if (aie_cfg->pyramid_base_width != + fd->base_para->pyramid_rect.width) { + dev_dbg(fd->dev, + "pre: %d cur: %d num: %d\n", + fd->base_para->pyramid_rect.width, + aie_cfg->pyramid_base_width, + fd->base_para->number_of_pyramid + ); + fd->base_para->pyramid_rect.width = + aie_cfg->pyramid_base_width; + aie_update_table(fd, fd->base_para->pyramid_rect.width, + fd->base_para->pyramid_rect.height, 0); + aie_update_fddma_buf(fd); + } + } + + //Init output va array + if (aie_cfg->sel_mode == FDMODE) { + memset(fd->rs_output_hw.va, 0, fd->rs_output_hw.size); + memset(fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0], 0, + RESULT_SIZE); + memset(fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(1)][0], 0, + RESULT_SIZE); + memset(fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(2)][0], 0, + RESULT_SIZE); + } else if (aie_cfg->sel_mode == ATTRIBUTEMODE) { + memset(fd->base_para->rs_pym_rst_va[0][0], 0, + fd->rs_pym_out_size[0]); + memset(fd->base_para->rs_pym_rst_va[0][1], 0, + fd->rs_pym_out_size[0]); + memset(fd->base_para->rs_pym_rst_va[0][2], 0, + fd->rs_pym_out_size[0]); + } + + fd->reg_cfg.fd_mode = aie_cfg->sel_mode; + if (aie_cfg->sel_mode == FDMODE) { + fd->reg_cfg.rs_adr = (u32)fd->base_para->fd_rs_cfg_pa; + fd->reg_cfg.yuv2rgb_adr = (u32)fd->base_para->fd_yuv2rgb_cfg_pa; + fd->reg_cfg.fd_adr = (u32)fd->base_para->fd_fd_cfg_pa + + fd->variant->fd_cfg_size * 4 * + FD_LOOP_NUM / 3 * + (3 - aie_cfg->number_of_pyramid); + + } else if (aie_cfg->sel_mode == ATTRIBUTEMODE) { + fd->reg_cfg.yuv2rgb_adr = + (u32)fd->base_para->attr_yuv2rgb_cfg_pa[fd->attr_para->w_idx]; + fd->reg_cfg.fd_adr = + (u32)fd->base_para->attr_fd_cfg_pa[fd->attr_para->w_idx]; + } else { + dev_err(fd->dev, "Invalid Mode: %d", aie_cfg->sel_mode); + } + + aie_update_cfg(fd, aie_cfg); + + aie_config_dram(fd, aie_cfg); + + if (aie_cfg->sel_mode == ATTRIBUTEMODE) + fd->attr_para->w_idx = (fd->attr_para->w_idx + 1) % MAX_ENQUE_FRAME_NUM; +} + +static void aie_execute_face_detection(struct mtk_aie_dev *fd, + struct aie_enq_info *aie_cfg) +{ + unsigned int loop_num, loop_reg_val; + + writel(0x0, fd->fd_base + AIE_START_REG); + writel(0x00000111, fd->fd_base + AIE_ENABLE_REG); + loop_num = FD_LOOP_NUM / 3 * (aie_cfg->number_of_pyramid); + loop_reg_val = (loop_num << 8) | (aie_cfg->number_of_pyramid - 1); + writel(loop_reg_val, fd->fd_base + AIE_LOOP_REG); + writel(0x1, fd->fd_base + AIE_INT_EN_REG); + writel(fd->reg_cfg.rs_adr, fd->fd_base + AIE_RS_CON_BASE_ADR_REG); + writel(fd->reg_cfg.fd_adr, fd->fd_base + AIE_FD_CON_BASE_ADR_REG); + writel(fd->reg_cfg.yuv2rgb_adr, fd->fd_base + AIE_YUV2RGB_CON_BASE_ADR_REG); + writel(0x00000002, fd->fd_base + AIE_YUV2RGB_CON_BASE_ADR_MSB); + writel(0x00000002, fd->fd_base + AIE_RS_CON_BASE_ADR_MSB); + writel(0x00000002, fd->fd_base + AIE_FD_CON_BASE_ADR_MSB); + writel(0x1, fd->fd_base + AIE_START_REG); +} + +static void aie_execute_attribute_detection(struct mtk_aie_dev *fd, + struct aie_enq_info *aie_cfg) +{ + writel(0x0, fd->fd_base + AIE_START_REG); + writel(0x00000101, fd->fd_base + AIE_ENABLE_REG); + writel(0x00001A00, fd->fd_base + AIE_LOOP_REG); + writel(0x1, fd->fd_base + AIE_INT_EN_REG); + writel(fd->reg_cfg.rs_adr, fd->fd_base + AIE_RS_CON_BASE_ADR_REG); + writel(fd->reg_cfg.fd_adr, fd->fd_base + AIE_FD_CON_BASE_ADR_REG); + writel(fd->reg_cfg.yuv2rgb_adr, fd->fd_base + AIE_YUV2RGB_CON_BASE_ADR_REG); + writel(0x00000002, fd->fd_base + AIE_YUV2RGB_CON_BASE_ADR_MSB); + writel(0x00000002, fd->fd_base + AIE_RS_CON_BASE_ADR_MSB); + writel(0x00000002, fd->fd_base + AIE_FD_CON_BASE_ADR_MSB); + writel(0x1, fd->fd_base + AIE_START_REG); +} + +static void aie_execute_fld_detection(struct mtk_aie_dev *fd, + struct aie_enq_info *aie_cfg) +{ + unsigned int i; + + writel(0x10, fd->fd_base + AIE_START_REG); + writel(0x00011111, fd->fd_base + AIE_DMA_CTL_REG); + writel(0x01111111, fd->fd_base + FLD_EN); + writel(0x1, fd->fd_base + AIE_INT_EN_REG); + for (i = 0; i < aie_cfg->fld_face_num; i++) { + writel(aie_cfg->src_img_addr, fd->fd_base + FLD_BASE_ADDR_FACE_0 + i * 0x4); + writel(aie_cfg->fld_input[i].fld_in_crop_x1 << 16 | + aie_cfg->fld_input[i].fld_in_crop_y1, + fd->fd_base + fld_face_info_0[i]); + writel(aie_cfg->fld_input[i].fld_in_crop_x2 << 16 | + aie_cfg->fld_input[i].fld_in_crop_y2, + fd->fd_base + FLD_FACE_INFO(i, 1)); + writel(aie_cfg->fld_input[i].fld_in_rip << 4 | + aie_cfg->fld_input[i].fld_in_rop, + fd->fd_base + FLD_FACE_INFO(i, 2)); + } + + writel(aie_cfg->fld_face_num << 28 | FLD_FOREST << 16 | + FLD_POINT, fd->fd_base + FLD_MODEL_PARA1); + writel(13 << 16 | 0xfe9, fd->fd_base + FLD_MODEL_PARA14); + writel(aie_cfg->src_img_width << 16 | aie_cfg->src_img_height, + fd->fd_base + FLD_SRC_WD_HT); + + /*input settings*/ + writel(0x007c003f, fd->fd_base + FLD_PL_IN_SIZE_0); + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_0); + writel(0x007c003f, fd->fd_base + FLD_PL_IN_SIZE_1); + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_1); + writel(0x0016003f, fd->fd_base + FLD_PL_IN_SIZE_2_0); + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_2_0); + writel(0x0013003f, fd->fd_base + FLD_PL_IN_SIZE_2_1); + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_2_1); + writel(0x0013003f, fd->fd_base + FLD_PL_IN_SIZE_2_2); + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_2_2); + writel(0x00a6001f, fd->fd_base + FLD_PL_IN_SIZE_3); + writel(0x0020000f, fd->fd_base + FLD_PL_IN_STRIDE_3); + + /*output setting*/ + writel((2400 * aie_cfg->fld_face_num - 1) << 16 | 127, + fd->fd_base + FLD_SH_IN_SIZE_0); + writel(0x0010000f, fd->fd_base + FLD_SH_IN_STRIDE_0); + writel(fd->fld_para->fld_output_pa[0], + fd->fd_base + FLD_TR_OUT_BASE_ADDR_0); + writel((aie_cfg->fld_face_num - 1) << 16 | 0x6f, + fd->fd_base + FLD_TR_OUT_SIZE_0); + writel(0x0070000f, fd->fd_base + FLD_TR_OUT_STRIDE_0); + writel(fd->fld_para->fld_output_pa[0], + fd->fd_base + FLD_PP_OUT_BASE_ADDR_0); + writel((aie_cfg->fld_face_num - 1) << 16 | 0x6f, + fd->fd_base + FLD_PP_OUT_SIZE_0); + writel(0x0070000f, fd->fd_base + FLD_PP_OUT_STRIDE_0); + + /*cv score*/ + writel(0x00000001, fd->fd_base + FLD_BS_BIAS); + writel(0x0000b835, fd->fd_base + FLD_CV_FM_RANGE_0); + writel(0xffff5cba, fd->fd_base + FLD_CV_FM_RANGE_1); + writel(0x00005ed5, fd->fd_base + FLD_CV_PM_RANGE_0); + writel(0xffff910d, fd->fd_base + FLD_CV_PM_RANGE_1); + writel(0x0000031e, fd->fd_base + FLD_BS_RANGE_0); + writel(0xfffffcae, fd->fd_base + FLD_BS_RANGE_1); + + /* 6 steps */ + writel(fd->fld_para->fld_step_pa[FLD_STEP_BLINK][14], + fd->fd_base + FLD_BS_IN_BASE_ADDR_14); + + for (i = 0; i < 15; i++) { + writel(fd->fld_para->fld_step_pa[FLD_STEP_CV][i], + fd->fd_base + FLD_PL_IN_BASE_ADDR_2_(i)); + + writel(fd->fld_para->fld_step_pa[FLD_STEP_FP][i], + fd->fd_base + FLD_PL_IN_BASE_ADDR_3_(i)); + + writel(fd->fld_para->fld_step_pa[FLD_STEP_LEAF][i], + fd->fd_base + FLD_SH_IN_BASE_ADDR_(i)); + + writel(fd->fld_para->fld_step_pa[FLD_STEP_KM02][i], + fd->fd_base + FLD_PL_IN_BASE_ADDR_0_(i)); + + writel(fd->fld_para->fld_step_pa[FLD_STEP_KM13][i], + fd->fd_base + FLD_PL_IN_BASE_ADDR_1_(i)); + } + + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_0_0_7_MSB); + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_0_8_15_MSB); + + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_1_0_7_MSB); + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_1_8_15_MSB); + + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_2_0_7_MSB); + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_2_8_15_MSB); + + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_3_0_7_MSB); + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_3_8_15_MSB); + + writel(0x22222222, fd->fd_base + FLD_SH_IN_BASE_ADDR_0_7_MSB); + writel(0x02222222, fd->fd_base + FLD_SH_IN_BASE_ADDR_8_15_MSB); + + writel(0x02000000, fd->fd_base + FLD_BS_IN_BASE_ADDR_8_15_MSB); + + writel(0x22222222, fd->fd_base + FLD_BASE_ADDR_FACE_0_7_MSB); + writel(0x02222222, fd->fd_base + FLD_BASE_ADDR_FACE_8_14_MSB); + writel(0x00000002, fd->fd_base + FLD_TR_OUT_BASE_ADDR_0_MSB); + writel(0x00000002, fd->fd_base + FLD_PP_OUT_BASE_ADDR_0_MSB); + + /* fld mode + trigger start */ + writel(0x11, fd->fd_base + AIE_START_REG); +} + +void aie_execute(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) +{ + if (aie_cfg->sel_mode == FDMODE) + aie_execute_face_detection(fd, aie_cfg); + else if (aie_cfg->sel_mode == ATTRIBUTEMODE) + aie_execute_attribute_detection(fd, aie_cfg); + else if (aie_cfg->sel_mode == FLDMODE) + aie_execute_fld_detection(fd, aie_cfg); + else + return; +} + +void aie_irqhandle(struct mtk_aie_dev *fd) +{ + writel(0x0, fd->fd_base + AIE_START_REG); + + /* interrupt read clear */ + readl(fd->fd_base + AIE_INT_REG); +} + +static u16 aie_get_hi16(unsigned int value) +{ + return (value & 0xFFFF0000) >> 16; +} + +static u16 aie_get_lo16(unsigned int value) +{ + return value & 0xFFFF; +} + +static signed short aie_refine_s16_value(signed short value) +{ + s16 result; + + if ((value & 0x200) >> 9) + result = (value | 0xFE00); + else + result = value; + + return result; +} + +/* return aie_cfg to user space */ +void aie_get_fd_result(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) +{ + u32 fd_result_hw, fd_result_1_hw, fd_total_num; + struct aie_enq_info *tmp_aie_cfg; + void *fd_pym_result[PYM_NUM]; + u32 fd_pyramid_num[PYM_NUM]; + signed short landmark; + unsigned int *pto12; + struct fd_ret *prst; + unsigned int i, j; + + aie_cfg->sel_mode = fd->base_para->sel_mode; + aie_cfg->rotate_degree = fd->base_para->rotate_degree; + aie_cfg->src_img_addr = fd->base_para->src_img_addr; + aie_cfg->src_img_addr_uv = fd->base_para->src_img_addr_uv; + aie_cfg->src_img_width = fd->base_para->img_rect.width; + aie_cfg->src_img_height = fd->base_para->img_rect.height; + aie_cfg->src_img_fmt = fd->base_para->src_img_fmt; + + aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); + + fd_result_hw = fd->reg_cfg.hw_result; + fd_result_1_hw = fd->reg_cfg.hw_result1; + fd_total_num = fd_result_hw & 0xFFF; + fd_pyramid_num[0] = (fd_result_hw & 0xFFF0000) >> 16; + fd_pyramid_num[1] = fd_result_1_hw & 0xFFF; + fd_pyramid_num[2] = (fd_result_1_hw & 0xFFF0000) >> 16; + + if (fd_total_num == 0) + goto nothing_out; + + tmp_aie_cfg = aie_cfg; + + tmp_aie_cfg->fd_out.fd_total_num = fd_total_num; + tmp_aie_cfg->fd_out.fd_pyramid0_num = fd_pyramid_num[0]; + tmp_aie_cfg->fd_out.fd_pyramid1_num = fd_pyramid_num[1]; + tmp_aie_cfg->fd_out.fd_pyramid2_num = fd_pyramid_num[2]; + + switch (tmp_aie_cfg->number_of_pyramid) { + case 1: + fd_pym_result[2] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0]; + break; + case 2: + fd_pym_result[1] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0]; + fd_pym_result[2] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(1)][0]; + break; + case 3: + fd_pym_result[0] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0]; + fd_pym_result[1] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(1)][0]; + fd_pym_result[2] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(2)][0]; + break; + default: + dev_err(fd->dev, "Wrong number_of_pyramid\n"); + goto nothing_out; + } + + for (i = 0; i < 3; i++) { + for (j = 0; j < fd_pyramid_num[i]; j++) { + if (i == 0) + prst = &tmp_aie_cfg->fd_out.pyramid0_result; + else if (i == 1) + prst = &tmp_aie_cfg->fd_out.pyramid1_result; + else if (i == 2) + prst = &tmp_aie_cfg->fd_out.pyramid2_result; + + pto12 = (unsigned int *)fd_pym_result[i] + 12 * j; + + prst->anchor_x0[j] = aie_get_lo16(*(pto12 + 0)); + prst->anchor_y0[j] = aie_get_hi16(*(pto12 + 0)); + prst->anchor_x1[j] = aie_get_lo16(*(pto12 + 1)); + prst->anchor_y1[j] = aie_get_hi16(*(pto12 + 1)); + + if (prst->anchor_x1[j] == 0 || + prst->anchor_y1[j] == 0) { + dev_err(fd->dev, + "wrong coordinate: i=%d j=%d M:%d %d %d %d\n", i, j, + prst->anchor_x0[j], + prst->anchor_x1[j], + prst->anchor_y0[j], + prst->anchor_y1[j] + ); + goto nothing_out; + } + + /* ROP result at 1st run */ + landmark = (*(pto12 + 2) & 0x3FF); + prst->rop_landmark_score0[j] = aie_refine_s16_value(landmark); + landmark = ((*(pto12 + 2) & 0xFFC00) >> 10); + prst->rop_landmark_score1[j] = aie_refine_s16_value(landmark); + landmark = ((*(pto12 + 2) & 0x3FF00000) >> 20); + prst->rop_landmark_score2[j] = aie_refine_s16_value(landmark); + + prst->anchor_score[j] = aie_refine_s16_value(*(pto12 + 9) & 0x3FF); + + /* RIP result at 1st run */ + landmark = ((*(pto12 + 9) & 0xFFC00) >> 10); + prst->rip_landmark_score0[j] = aie_refine_s16_value(landmark); + landmark = ((*(pto12 + 9) & 0x3FF00000) >> 20); + prst->rip_landmark_score1[j] = aie_refine_s16_value(landmark); + landmark = ((*(pto12 + 9) & 0xC0000000) >> 30) | + ((*(pto12 + 10) & 0xFF) << 2); + prst->rip_landmark_score2[j] = aie_refine_s16_value(landmark); + landmark = ((*(pto12 + 10) & 0x3FF00) >> 8); + prst->rip_landmark_score3[j] = aie_refine_s16_value(landmark); + landmark = ((*(pto12 + 10) & 0xFFC0000) >> 18); + prst->rip_landmark_score4[j] = aie_refine_s16_value(landmark); + landmark = ((*(pto12 + 10) & 0xF0000000) >> 28) | + ((*(pto12 + 11) & 0x3F) << 4); + prst->rip_landmark_score5[j] = aie_refine_s16_value(landmark); + landmark = ((*(pto12 + 11) & 0xFFC0) >> 6); + prst->rip_landmark_score6[j] = aie_refine_s16_value(landmark); + prst->face_result_index[j] = ((*(pto12 + 11) & 0xFFF0000) >> 16); + prst->anchor_index[j] = ((*(pto12 + 11) & 0x70000000) >> 28); + + prst->fd_partial_result = fd_pyramid_num[i]; + } + } + return; +nothing_out: + // Ensure that user mode does not receive an inappropriate result structure + memset(&aie_cfg->fd_out, 0, sizeof(struct fd_result)); +} + +void aie_get_attr_result(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) +{ + u32 *attr_ethnicity_result, *attr_gender_result; + u32 *attr_age_result, *attr_is_indian_result; + + aie_cfg->sel_mode = fd->attr_para->sel_mode[fd->attr_para->r_idx]; + aie_cfg->rotate_degree = fd->attr_para->rotate_degree[fd->attr_para->r_idx]; + aie_cfg->src_img_addr = + fd->attr_para->src_img_addr[fd->attr_para->r_idx]; + aie_cfg->src_img_addr_uv = + fd->attr_para->src_img_addr_uv[fd->attr_para->r_idx]; + aie_cfg->src_img_width = fd->attr_para->img_width[fd->attr_para->r_idx]; + aie_cfg->src_img_height = + fd->attr_para->img_height[fd->attr_para->r_idx]; + aie_cfg->src_img_fmt = fd->attr_para->src_img_fmt[fd->attr_para->r_idx]; + + aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); + + /* 64 feature * 32 bytes */ + attr_age_result = + (u32 *)fd->dma_para->age_out_hw_va[fd->attr_para->r_idx]; + attr_gender_result = + (u32 *)fd->dma_para->gender_out_hw_va[fd->attr_para->r_idx]; + attr_is_indian_result = + (u32 *)fd->dma_para->is_indian_out_hw_va[fd->attr_para->r_idx]; + attr_ethnicity_result = + (u32 *)fd->dma_para->ethnicity_out_hw_va[fd->attr_para->r_idx]; + + aie_cfg->attr_out.merged_age_ret[0] = aie_get_lo16(*attr_age_result); + aie_cfg->attr_out.merged_age_ret[1] = aie_get_hi16(*attr_age_result); + + aie_cfg->attr_out.merged_gender_ret[0] = aie_get_lo16(*attr_gender_result); + aie_cfg->attr_out.merged_gender_ret[1] = aie_get_hi16(*attr_gender_result); + + aie_cfg->attr_out.merged_is_indian_ret[0] = aie_get_lo16(*attr_is_indian_result); + aie_cfg->attr_out.merged_is_indian_ret[1] = aie_get_hi16(*attr_is_indian_result); + + aie_cfg->attr_out.merged_ethnicity_ret[0] = aie_get_lo16(*attr_ethnicity_result); + aie_cfg->attr_out.merged_ethnicity_ret[1] = aie_get_hi16(*attr_ethnicity_result); + aie_cfg->attr_out.merged_ethnicity_ret[2] = aie_get_lo16(*(attr_ethnicity_result + 1)); + + fd->attr_para->r_idx = (fd->attr_para->r_idx + 1) % MAX_ENQUE_FRAME_NUM; +} + +void aie_get_fld_result(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) +{ + u8 fld_rlt[FLD_OUTPUT_X_SIZE][FLD_OUTPUT_SIZE]; + u16 *out_parsing; + int i, j; + + aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); + + memcpy(fld_rlt, fd->fld_para->fld_output_va[0], sizeof(fld_rlt)); + + for (j = 0; j < aie_cfg->fld_face_num; j++) { + out_parsing = (unsigned short *)&fld_rlt[j][0]; + for (i = 0; i < FLD_CUR_LANDMARK; i++) { + aie_cfg->fld_out[j].fld_landmark[i].x = *out_parsing; + aie_cfg->fld_out[j].fld_landmark[i].y = *(out_parsing + 1); + + if (i % 2) + out_parsing = out_parsing + 6; + else + out_parsing = out_parsing + 2; + } + out_parsing = (unsigned short *)&fld_rlt[j][0]; + if (FLD_CUR_LANDMARK % 2) + out_parsing = out_parsing + ((FLD_CUR_LANDMARK + 1) / 2) * 8; + else + out_parsing = out_parsing + (FLD_CUR_LANDMARK / 2) * 8; + + aie_cfg->fld_out[j].fld_out_rop = *out_parsing; + aie_cfg->fld_out[j].fld_out_rip = *(out_parsing + 1); + aie_cfg->fld_out[j].confidence = *(out_parsing + 2); + aie_cfg->fld_out[j].blinkscore = *(out_parsing + 3); + } +} diff --git a/drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c b/drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c new file mode 100644 index 000000000000..28646d5a53aa --- /dev/null +++ b/drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c @@ -0,0 +1,1295 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2020 MediaTek Inc. + * Author: Bo.Kong <bo.kong@mediatek.com> + */ + +#include <linux/clk.h> +#include <linux/pm_runtime.h> +#include <linux/mtk_aie_v4l2_controls.h> +#include <media/v4l2-event.h> +#include <media/v4l2-ioctl.h> +#include <media/v4l2-mem2mem.h> +#include <media/videobuf2-dma-contig.h> +#include "mtk_aie.h" + +#define V4L2_CID_MTK_AIE_MAX 2 +#define Y2R_CFG_SIZE 34 +#define RS_CFG_SIZE 30 +#define FD_CFG_SIZE 56 + +static const struct mtk_aie_variant aie_31_drvdata = { + .y2r_cfg_size = Y2R_CFG_SIZE, + .rs_cfg_size = RS_CFG_SIZE, + .fd_cfg_size = FD_CFG_SIZE, +}; + +static const struct of_device_id mtk_aie_of_ids[] = { + { .compatible = "mediatek,mt8188-aie", .data = &aie_31_drvdata }, + { /* sentinel */ }, +}; +MODULE_DEVICE_TABLE(of, mtk_aie_of_ids); + +static const struct v4l2_pix_format_mplane mtk_aie_img_fmts[] = { + { + .pixelformat = V4L2_PIX_FMT_NV16M, + .num_planes = 2, + }, + { + .pixelformat = V4L2_PIX_FMT_NV61M, + .num_planes = 2, + }, + { + .pixelformat = V4L2_PIX_FMT_YUYV, + .num_planes = 1, + }, + { + .pixelformat = V4L2_PIX_FMT_YVYU, + .num_planes = 1, + }, + { + .pixelformat = V4L2_PIX_FMT_UYVY, + .num_planes = 1, + }, + { + .pixelformat = V4L2_PIX_FMT_VYUY, + .num_planes = 1, + }, + { + .pixelformat = V4L2_PIX_FMT_GREY, + .num_planes = 1, + }, + { + .pixelformat = V4L2_PIX_FMT_NV12M, + .num_planes = 2, + }, + { + .pixelformat = V4L2_PIX_FMT_NV12, + .num_planes = 1, + }, +}; + +static inline struct mtk_aie_ctx *fh_to_ctx(struct v4l2_fh *fh) +{ + return container_of(fh, struct mtk_aie_ctx, fh); +} + +static inline struct mtk_aie_ctx *ctrl_to_ctx(const struct v4l2_ctrl *ctrl) +{ + return container_of(ctrl->handler, struct mtk_aie_ctx, hdl); +} + +static void mtk_aie_hw_job_finish(struct mtk_aie_dev *fd, + enum vb2_buffer_state vb_state) +{ + struct vb2_v4l2_buffer *src_vbuf, *dst_vbuf; + struct mtk_aie_ctx *ctx; + + pm_runtime_put(fd->dev); + ctx = v4l2_m2m_get_curr_priv(fd->m2m_dev); + if (!ctx) { + dev_err(fd->dev, "Failed to do v4l2_m2m_get_curr_priv!\n"); + } else { + src_vbuf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); + dst_vbuf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); + if (src_vbuf && dst_vbuf) + v4l2_m2m_buf_copy_metadata(src_vbuf, dst_vbuf, true); + if (src_vbuf) + v4l2_m2m_buf_done(src_vbuf, vb_state); + if (dst_vbuf) + v4l2_m2m_buf_done(dst_vbuf, vb_state); + if (src_vbuf && dst_vbuf) + v4l2_m2m_job_finish(fd->m2m_dev, ctx->fh.m2m_ctx); + } + complete_all(&fd->fd_job_finished); +} + +static int mtk_aie_hw_job_exec(struct mtk_aie_dev *fd) +{ + pm_runtime_get_sync(fd->dev); + + reinit_completion(&fd->fd_job_finished); + schedule_delayed_work(&fd->job_timeout_work, + msecs_to_jiffies(MTK_FD_HW_TIMEOUT_IN_MSEC)); + + return 0; +} + +static int mtk_aie_vb2_buf_out_validate(struct vb2_buffer *vb) +{ + struct vb2_v4l2_buffer *v4l2_buf = to_vb2_v4l2_buffer(vb); + + if (v4l2_buf->field == V4L2_FIELD_ANY) + v4l2_buf->field = V4L2_FIELD_NONE; + if (v4l2_buf->field != V4L2_FIELD_NONE) + return -EINVAL; + + return 0; +} + +static int mtk_aie_vb2_buf_prepare(struct vb2_buffer *vb) +{ + struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); + struct vb2_queue *vq = vb->vb2_queue; + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vq); + struct v4l2_pix_format_mplane *pixfmt; + struct device *dev = ctx->dev; + + switch (vq->type) { + case V4L2_BUF_TYPE_META_CAPTURE: + if (vb2_plane_size(vb, 0) < ctx->dst_fmt.buffersize) { + dev_err(dev, "meta size %lu is too small\n", vb2_plane_size(vb, 0)); + return -EINVAL; + } + vb2_set_plane_payload(vb, 0, ctx->dst_fmt.buffersize); + break; + case V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: + pixfmt = &ctx->src_fmt; + + if (vbuf->field == V4L2_FIELD_ANY) + vbuf->field = V4L2_FIELD_NONE; + + if (vb->num_planes > 2 || vbuf->field != V4L2_FIELD_NONE) { + dev_dbg(dev, "plane %d or field %d not supported\n", + vb->num_planes, vbuf->field); + return -EINVAL; + } + + if (vb2_plane_size(vb, 0) < pixfmt->plane_fmt[0].sizeimage) { + dev_dbg(dev, "plane 0 %lu is too small than %x\n", + vb2_plane_size(vb, 0), pixfmt->plane_fmt[0].sizeimage); + return -EINVAL; + } + vb2_set_plane_payload(vb, 0, pixfmt->plane_fmt[0].sizeimage); + + if (pixfmt->num_planes == 2 && + vb2_plane_size(vb, 1) < pixfmt->plane_fmt[1].sizeimage) { + dev_dbg(dev, "plane 1 %lu is too small than %x\n", + vb2_plane_size(vb, 1), pixfmt->plane_fmt[1].sizeimage); + return -EINVAL; + } + vb2_set_plane_payload(vb, 1, pixfmt->plane_fmt[1].sizeimage); + break; + } + + return 0; +} + +static void mtk_aie_vb2_buf_queue(struct vb2_buffer *vb) +{ + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue); + struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); + + v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf); +} + +static int mtk_aie_vb2_queue_setup(struct vb2_queue *vq, + unsigned int *num_buffers, + unsigned int *num_planes, + unsigned int sizes[], + struct device *alloc_devs[]) +{ + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vq); + struct device *dev = ctx->dev; + unsigned int size[2]; + unsigned int plane; + + switch (vq->type) { + case V4L2_BUF_TYPE_META_CAPTURE: + size[0] = ctx->dst_fmt.buffersize; + size[1] = 0; + break; + case V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: + size[0] = ctx->src_fmt.plane_fmt[0].sizeimage; + size[1] = ctx->src_fmt.plane_fmt[1].sizeimage; + break; + default: + size[0] = 0; + size[1] = 0; + } + + dev_dbg(dev, "vq type = %d, size[0] = %d, size[1] = %d\n", + vq->type, size[0], size[1]); + + if (*num_planes > 2) + return -EINVAL; + + *num_buffers = clamp_val(*num_buffers, 1, VB2_MAX_FRAME); + + if (*num_planes == 0) { + if (vq->type == V4L2_BUF_TYPE_META_CAPTURE) { + sizes[0] = ctx->dst_fmt.buffersize; + *num_planes = 1; + return 0; + } + + *num_planes = ctx->src_fmt.num_planes; + if (*num_planes > 2) + return -EINVAL; + for (plane = 0; plane < *num_planes; plane++) + sizes[plane] = ctx->src_fmt.plane_fmt[plane].sizeimage; + + return 0; + } + + return 0; +} + +static int mtk_aie_vb2_start_streaming(struct vb2_queue *vq, unsigned int count) +{ + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vq); + struct mtk_aie_dev *fd; + + if (!ctx) + return -EINVAL; + + fd = ctx->fd_dev; + if (vq->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE && (++fd->fd_stream_count == 1)) + return aie_init(ctx->fd_dev, &ctx->fd_dev->ctx->user_init); + + return 0; +} + +static void mtk_aie_job_timeout_work(struct work_struct *work) +{ + struct mtk_aie_dev *fd = + container_of(work, struct mtk_aie_dev, job_timeout_work.work); + + dev_err(fd->dev, "FD Job timeout!"); + + dev_dbg(fd->dev, "%s result result1: %x, %x, %x", __func__, + readl(fd->fd_base + AIE_RESULT_0_REG), + readl(fd->fd_base + AIE_RESULT_1_REG), + readl(fd->fd_base + AIE_DMA_CTL_REG)); + + fd->aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); + + if (fd->aie_cfg->sel_mode == ATTRIBUTEMODE) { + dev_dbg(fd->dev, "w_idx = %d, r_idx = %d\n", + fd->attr_para->w_idx, fd->attr_para->r_idx); + } + + aie_irqhandle(fd); + aie_reset(fd); + atomic_dec(&fd->num_composing); + mtk_aie_hw_job_finish(fd, VB2_BUF_STATE_ERROR); + wake_up(&fd->flushing_waitq); +} + +static void mtk_aie_vb2_stop_streaming(struct vb2_queue *vq) +{ + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vq); + struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx; + struct v4l2_m2m_queue_ctx *queue_ctx; + struct mtk_aie_dev *fd = ctx->fd_dev; + struct vb2_v4l2_buffer *vb = NULL; + int ret; + + /* Waiting Job Finish */ + ret = wait_for_completion_timeout(&fd->fd_job_finished, msecs_to_jiffies(1000)); + if (!ret) + dev_err(fd->dev, "Wait job finish timeout\n"); + + if (vq->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) { + fd->fd_stream_count--; + if (fd->fd_stream_count > 0) + dev_dbg(fd->dev, "Stop: fd_stream_count = %d\n", fd->fd_stream_count); + else + aie_uninit(fd); + } + + queue_ctx = V4L2_TYPE_IS_OUTPUT(vq->type) ? &m2m_ctx->out_q_ctx : + &m2m_ctx->cap_q_ctx; + while ((vb = v4l2_m2m_buf_remove(queue_ctx))) + v4l2_m2m_buf_done(vb, VB2_BUF_STATE_ERROR); +} + +static void mtk_aie_vb2_request_complete(struct vb2_buffer *vb) +{ + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue); + + v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->hdl); +} + +static int mtk_aie_querycap(struct file *file, void *fh, + struct v4l2_capability *cap) +{ + struct mtk_aie_dev *fd = video_drvdata(file); + struct device *dev = fd->dev; + + strscpy(cap->driver, dev_driver_string(dev), sizeof(cap->driver)); + strscpy(cap->card, dev_driver_string(dev), sizeof(cap->card)); + + cap->device_caps = V4L2_CAP_VIDEO_OUTPUT_MPLANE | + V4L2_CAP_STREAMING | V4L2_CAP_META_CAPTURE; + cap->capabilities = V4L2_CAP_DEVICE_CAPS | cap->device_caps; + + return 0; +} + +static int mtk_aie_enum_fmt_out_mp(struct file *file, void *fh, + struct v4l2_fmtdesc *f) +{ + if (f->index >= ARRAY_SIZE(mtk_aie_img_fmts)) + return -EINVAL; + + f->pixelformat = mtk_aie_img_fmts[f->index].pixelformat; + return 0; +} + +static void mtk_aie_fill_pixfmt_mp(struct v4l2_pix_format_mplane *dfmt, + const struct v4l2_pix_format_mplane *sfmt) +{ + dfmt->field = V4L2_FIELD_NONE; + dfmt->colorspace = V4L2_COLORSPACE_BT2020; + dfmt->num_planes = sfmt->num_planes; + dfmt->ycbcr_enc = V4L2_YCBCR_ENC_DEFAULT; + dfmt->quantization = V4L2_QUANTIZATION_DEFAULT; + dfmt->xfer_func = V4L2_MAP_XFER_FUNC_DEFAULT(dfmt->colorspace); + dfmt->pixelformat = sfmt->pixelformat; + + /* Keep user setting as possible */ + dfmt->width = clamp(dfmt->width, MTK_FD_OUTPUT_MIN_WIDTH, + MTK_FD_OUTPUT_MAX_WIDTH); + dfmt->height = clamp(dfmt->height, MTK_FD_OUTPUT_MIN_HEIGHT, + MTK_FD_OUTPUT_MAX_HEIGHT); + + dfmt->plane_fmt[0].bytesperline = ALIGN(dfmt->width, 16); + dfmt->plane_fmt[1].bytesperline = ALIGN(dfmt->width, 16); + + dfmt->plane_fmt[0].sizeimage = dfmt->height * dfmt->plane_fmt[0].bytesperline; + dfmt->plane_fmt[1].sizeimage = dfmt->height * dfmt->plane_fmt[1].bytesperline; + if (sfmt->num_planes == 2 && sfmt->pixelformat == V4L2_PIX_FMT_NV12M) { + dfmt->plane_fmt[1].sizeimage /= 2; + } else if (sfmt->pixelformat == V4L2_PIX_FMT_NV12) { + dfmt->plane_fmt[0].sizeimage *= 3; + dfmt->plane_fmt[0].sizeimage /= 2; + } +} + +static const struct v4l2_pix_format_mplane *mtk_aie_find_fmt(u32 format) +{ + unsigned int i; + + for (i = 0; i < ARRAY_SIZE(mtk_aie_img_fmts); i++) { + if (mtk_aie_img_fmts[i].pixelformat == format) + return &mtk_aie_img_fmts[i]; + } + + return NULL; +} + +static int mtk_aie_try_fmt_out_mp(struct file *file, void *fh, + struct v4l2_format *f) +{ + struct v4l2_pix_format_mplane *pix_mp = &f->fmt.pix_mp; + const struct v4l2_pix_format_mplane *fmt; + + fmt = mtk_aie_find_fmt(pix_mp->pixelformat); + if (!fmt) + fmt = &mtk_aie_img_fmts[0]; + + mtk_aie_fill_pixfmt_mp(pix_mp, fmt); + return 0; +} + +static int mtk_aie_g_fmt_out_mp(struct file *file, void *fh, + struct v4l2_format *f) +{ + struct mtk_aie_ctx *ctx = fh_to_ctx(fh); + + f->fmt.pix_mp = ctx->src_fmt; + + return 0; +} + +static int mtk_aie_s_fmt_out_mp(struct file *file, void *fh, + struct v4l2_format *f) +{ + struct mtk_aie_ctx *ctx = fh_to_ctx(fh); + struct vb2_queue *vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx, f->type); + const struct v4l2_pix_format_mplane *fmt; + struct mtk_aie_dev *fd = ctx->fd_dev; + + if (!vq) { + dev_err(fd->dev, "%s vq is NULL!\n", __func__); + return -EINVAL; + } + + /* Change not allowed if queue is streaming. */ + if (vb2_is_streaming(vq)) + return -EBUSY; + + fmt = mtk_aie_find_fmt(f->fmt.pix_mp.pixelformat); + if (!fmt) + fmt = &mtk_aie_img_fmts[0]; + else if (&fd->ctx->fh != file->private_data) + return -EBUSY; + + fd->ctx = ctx; + mtk_aie_fill_pixfmt_mp(&f->fmt.pix_mp, fmt); + ctx->src_fmt = f->fmt.pix_mp; + + return 0; +} + +static int mtk_aie_enum_fmt_meta_cap(struct file *file, void *fh, + struct v4l2_fmtdesc *f) +{ + if (f->index) + return -EINVAL; + + strscpy(f->description, "Face detection result", sizeof(f->description)); + f->pixelformat = V4L2_META_FMT_MTFD_RESULT; + f->flags = 0; + + return 0; +} + +static int mtk_aie_g_fmt_meta_cap(struct file *file, void *fh, + struct v4l2_format *f) +{ + f->fmt.meta.dataformat = V4L2_META_FMT_MTFD_RESULT; + f->fmt.meta.buffersize = sizeof(struct aie_enq_info); + + return 0; +} + +static const struct vb2_ops mtk_aie_vb2_ops = { + .queue_setup = mtk_aie_vb2_queue_setup, + .buf_out_validate = mtk_aie_vb2_buf_out_validate, + .buf_prepare = mtk_aie_vb2_buf_prepare, + .buf_queue = mtk_aie_vb2_buf_queue, + .start_streaming = mtk_aie_vb2_start_streaming, + .stop_streaming = mtk_aie_vb2_stop_streaming, + .wait_prepare = vb2_ops_wait_prepare, + .wait_finish = vb2_ops_wait_finish, + .buf_request_complete = mtk_aie_vb2_request_complete, +}; + +static const struct v4l2_ioctl_ops mtk_aie_v4l2_video_out_ioctl_ops = { + .vidioc_querycap = mtk_aie_querycap, + .vidioc_enum_fmt_vid_out = mtk_aie_enum_fmt_out_mp, + .vidioc_g_fmt_vid_out_mplane = mtk_aie_g_fmt_out_mp, + .vidioc_s_fmt_vid_out_mplane = mtk_aie_s_fmt_out_mp, + .vidioc_try_fmt_vid_out_mplane = mtk_aie_try_fmt_out_mp, + .vidioc_enum_fmt_meta_cap = mtk_aie_enum_fmt_meta_cap, + .vidioc_g_fmt_meta_cap = mtk_aie_g_fmt_meta_cap, + .vidioc_s_fmt_meta_cap = mtk_aie_g_fmt_meta_cap, + .vidioc_try_fmt_meta_cap = mtk_aie_g_fmt_meta_cap, + .vidioc_reqbufs = v4l2_m2m_ioctl_reqbufs, + .vidioc_create_bufs = v4l2_m2m_ioctl_create_bufs, + .vidioc_expbuf = v4l2_m2m_ioctl_expbuf, + .vidioc_prepare_buf = v4l2_m2m_ioctl_prepare_buf, + .vidioc_querybuf = v4l2_m2m_ioctl_querybuf, + .vidioc_qbuf = v4l2_m2m_ioctl_qbuf, + .vidioc_dqbuf = v4l2_m2m_ioctl_dqbuf, + .vidioc_streamon = v4l2_m2m_ioctl_streamon, + .vidioc_streamoff = v4l2_m2m_ioctl_streamoff, + .vidioc_subscribe_event = v4l2_ctrl_subscribe_event, + .vidioc_unsubscribe_event = v4l2_event_unsubscribe, +}; + +static int mtk_aie_queue_init(void *priv, struct vb2_queue *src_vq, + struct vb2_queue *dst_vq) +{ + struct mtk_aie_ctx *ctx = priv; + int ret; + + src_vq->type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE; + src_vq->io_modes = VB2_MMAP | VB2_DMABUF; + src_vq->supports_requests = true; + src_vq->drv_priv = ctx; + src_vq->ops = &mtk_aie_vb2_ops; + src_vq->mem_ops = &vb2_dma_contig_memops; + src_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer); + src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY; + src_vq->lock = &ctx->fd_dev->vfd_lock; + src_vq->dev = ctx->fd_dev->v4l2_dev.dev; + + ret = vb2_queue_init(src_vq); + if (ret) + return ret; + + dst_vq->type = V4L2_BUF_TYPE_META_CAPTURE; + dst_vq->io_modes = VB2_MMAP | VB2_DMABUF; + dst_vq->drv_priv = ctx; + dst_vq->ops = &mtk_aie_vb2_ops; + dst_vq->mem_ops = &vb2_dma_contig_memops; + dst_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer); + dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY; + dst_vq->lock = &ctx->fd_dev->vfd_lock; + dst_vq->dev = ctx->fd_dev->v4l2_dev.dev; + + return vb2_queue_init(dst_vq); +} + +static int mtk_aie_s_ctrl(struct v4l2_ctrl *ctrl) +{ + struct mtk_aie_ctx *ctx = ctrl_to_ctx(ctrl); + struct v4l2_ctrl_aie_param *p_aie_param; + struct v4l2_ctrl_aie_init *p_aie_init; + + if (!ctx) + return -EINVAL; + + switch (ctrl->id) { + case V4L2_CID_MTK_AIE_INIT: + p_aie_init = ctrl->p_new.p; + memcpy(&ctx->user_init, p_aie_init, sizeof(struct v4l2_ctrl_aie_init)); + break; + case V4L2_CID_MTK_AIE_PARAM: + p_aie_param = ctrl->p_new.p; + memcpy(&ctx->user_param, p_aie_param, sizeof(struct v4l2_ctrl_aie_param)); + break; + default: + return -EINVAL; + } + + return 0; +} + +static const struct v4l2_ctrl_ops aie_ctrl_ops = { + .s_ctrl = mtk_aie_s_ctrl, +}; + +static void mtk_aie_ctrl_type_op_init(const struct v4l2_ctrl *ctrl, + u32 from_idx, union v4l2_ctrl_ptr ptr) +{ + struct v4l2_ctrl_aie_param *p_aie_param; + struct v4l2_ctrl_aie_init *p_aie_init; + + switch (ctrl->id) { + case V4L2_CID_MTK_AIE_INIT: + p_aie_init = ptr.p; + memset(p_aie_init, 0, sizeof(struct v4l2_ctrl_aie_init)); + break; + + case V4L2_CID_MTK_AIE_PARAM: + p_aie_param = ptr.p; + memset(p_aie_param, 0, sizeof(struct v4l2_ctrl_aie_param)); + break; + + default: + break; + } +} + +static int mtk_aie_ctrl_type_op_validate(const struct v4l2_ctrl *ctrl, + union v4l2_ctrl_ptr ptr) +{ + struct mtk_aie_ctx *ctx = ctrl_to_ctx(ctrl); + struct v4l2_ctrl_aie_param *p_aie_param; + struct v4l2_ctrl_aie_init *p_aie_init; + struct mtk_aie_dev *fd; + + if (!ctx) + return -EINVAL; + + fd = ctx->fd_dev; + + switch (ctrl->id) { + case V4L2_CID_MTK_AIE_PARAM: + p_aie_param = ptr.p; + + switch (p_aie_param->fd_mode) { + case FDMODE: + case ATTRIBUTEMODE: + case FLDMODE: + break; + default: + dev_err(ctx->dev, "Invalid Requested Mode: %d\n", p_aie_param->fd_mode); + return -EINVAL; + } + + switch (p_aie_param->src_img_fmt) { + case FMT_YUV_2P: + case FMT_YVU_2P: + case FMT_YUYV: + case FMT_YVYU: + case FMT_UYVY: + case FMT_VYUY: + case FMT_MONO: + case FMT_YUV420_2P: + case FMT_YUV420_1P: + break; + default: + dev_err(ctx->dev, "Invalid Requested Fmt: %d\n", p_aie_param->src_img_fmt); + return -EINVAL; + } + + if (p_aie_param->src_img_width > fd->base_para->max_img_rect.width || + p_aie_param->src_img_height > fd->base_para->max_img_rect.height || + p_aie_param->src_img_width == 0 || p_aie_param->src_img_height == 0) { + dev_err(fd->dev, "Invalid Requested Src_WD: %d Src_HT: %d\n", + p_aie_param->src_img_width, + p_aie_param->src_img_height); + + dev_err(fd->dev, "Invalid Requested MAX_Src_WD: %d MAX_Src_HT: %d\n", + fd->base_para->max_img_rect.width, + fd->base_para->max_img_rect.height); + + return -EINVAL; + } + + if (p_aie_param->pyramid_base_width > fd->base_para->max_pyramid_rect.width || + p_aie_param->pyramid_base_height > fd->base_para->max_pyramid_rect.height || + p_aie_param->number_of_pyramid > 3 || + p_aie_param->number_of_pyramid <= 0) { + dev_err(fd->dev, "Invalid Requested base w: %d h: %d num: %d\n", + p_aie_param->pyramid_base_width, p_aie_param->pyramid_base_height, + p_aie_param->number_of_pyramid); + + dev_err(fd->dev, "Invalid Requested max w: %d h: %d\n", + fd->base_para->max_pyramid_rect.width, + fd->base_para->max_pyramid_rect.height); + + return -EINVAL; + } + + break; + + case V4L2_CID_MTK_AIE_INIT: + p_aie_init = ptr.p; + if (!p_aie_init->max_img_width || !p_aie_init->max_img_height || + !p_aie_init->pyramid_width || !p_aie_init->pyramid_height) { + dev_err(fd->dev, + "Invalid Requested max_w: %d max_h: %d, p_w: %d p_h: %d\n", + p_aie_init->max_img_width, p_aie_init->max_img_height, + p_aie_init->pyramid_width, p_aie_init->pyramid_height); + + return -EINVAL; + } + + break; + + default: + return -EINVAL; + } + + return 0; +} + +static const struct v4l2_ctrl_type_ops aie_ctrl_type_ops = { + .equal = v4l2_ctrl_type_op_equal, + .init = mtk_aie_ctrl_type_op_init, + .log = v4l2_ctrl_type_op_log, + .validate = mtk_aie_ctrl_type_op_validate, +}; + +static struct v4l2_ctrl_config mtk_aie_controls[] = { + { + .ops = &aie_ctrl_ops, + .type_ops = &aie_ctrl_type_ops, + .id = V4L2_CID_MTK_AIE_INIT, + .name = "FD detection init", + .type = V4L2_CTRL_TYPE_AIE_INIT, + .elem_size = sizeof(struct v4l2_ctrl_aie_init), + }, { + .ops = &aie_ctrl_ops, + .type_ops = &aie_ctrl_type_ops, + .id = V4L2_CID_MTK_AIE_PARAM, + .name = "FD detection param", + .type = V4L2_CTRL_TYPE_AIE_PARAM, + .elem_size = sizeof(struct v4l2_ctrl_aie_param), + }, +}; + +static int mtk_aie_ctrls_setup(struct mtk_aie_ctx *ctx) +{ + struct v4l2_ctrl_handler *hdl = &ctx->hdl; + int i; + + v4l2_ctrl_handler_init(hdl, V4L2_CID_MTK_AIE_MAX); + if (hdl->error) + return hdl->error; + + for (i = 0; i < ARRAY_SIZE(mtk_aie_controls); i++) { + v4l2_ctrl_new_custom(hdl, &mtk_aie_controls[i], ctx); + if (hdl->error) { + v4l2_ctrl_handler_free(hdl); + dev_err(ctx->dev, "Failed to register controls: %d", i); + return hdl->error; + } + } + + ctx->fh.ctrl_handler = &ctx->hdl; + v4l2_ctrl_handler_setup(hdl); + + return 0; +} + +static void init_ctx_fmt(struct mtk_aie_ctx *ctx) +{ + struct v4l2_pix_format_mplane *src_fmt = &ctx->src_fmt; + struct v4l2_meta_format *dst_fmt = &ctx->dst_fmt; + + /* Initialize M2M source fmt */ + src_fmt->width = MTK_FD_OUTPUT_MAX_WIDTH; + src_fmt->height = MTK_FD_OUTPUT_MAX_HEIGHT; + mtk_aie_fill_pixfmt_mp(src_fmt, &mtk_aie_img_fmts[0]); + + /* Initialize M2M destination fmt */ + dst_fmt->buffersize = sizeof(struct aie_enq_info); + dst_fmt->dataformat = V4L2_META_FMT_MTFD_RESULT; +} + +/* + * V4L2 file operations. + */ +static int mtk_vfd_open(struct file *filp) +{ + struct video_device *vdev = video_devdata(filp); + struct mtk_aie_dev *fd = video_drvdata(filp); + struct mtk_aie_ctx *ctx; + int ret; + + mutex_lock(&fd->dev_lock); + + if (fd->fd_state & STATE_OPEN) { + dev_err(fd->dev, "VFD is already open, Only one instance is supported\n"); + ret = -EBUSY; + goto err_unlock; + } + + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (!ctx) { + ret = -ENOMEM; + goto err_unlock; + } + + ctx->fd_dev = fd; + ctx->dev = fd->dev; + fd->ctx = ctx; + + v4l2_fh_init(&ctx->fh, vdev); + filp->private_data = &ctx->fh; + + init_ctx_fmt(ctx); + + ret = mtk_aie_ctrls_setup(ctx); + if (ret) { + dev_err(ctx->dev, "Failed to set up controls: %d\n", ret); + goto err_fh_exit; + } + ctx->fh.m2m_ctx = v4l2_m2m_ctx_init(fd->m2m_dev, ctx, &mtk_aie_queue_init); + if (IS_ERR(ctx->fh.m2m_ctx)) { + ret = PTR_ERR(ctx->fh.m2m_ctx); + goto err_free_ctrl_handler; + } + v4l2_fh_add(&ctx->fh); + fd->fd_state |= STATE_OPEN; + + mutex_unlock(&fd->dev_lock); + + return 0; +err_free_ctrl_handler: + v4l2_ctrl_handler_free(&ctx->hdl); +err_fh_exit: + v4l2_fh_exit(&ctx->fh); + kfree(ctx); +err_unlock: + mutex_unlock(&fd->dev_lock); + + return ret; +} + +static int mtk_vfd_release(struct file *filp) +{ + struct mtk_aie_ctx *ctx = container_of(filp->private_data, struct mtk_aie_ctx, fh); + struct mtk_aie_dev *fd = video_drvdata(filp); + + mutex_lock(&fd->dev_lock); + + fd->fd_state &= ~STATE_OPEN; + + v4l2_m2m_ctx_release(ctx->fh.m2m_ctx); + v4l2_ctrl_handler_free(&ctx->hdl); + v4l2_fh_del(&ctx->fh); + v4l2_fh_exit(&ctx->fh); + + kfree(ctx); + + mutex_unlock(&fd->dev_lock); + + return 0; +} + +static __poll_t mtk_vfd_fop_poll(struct file *file, poll_table *wait) +{ + int ret; + + struct mtk_aie_ctx *ctx = container_of(file->private_data, struct mtk_aie_ctx, fh); + struct mtk_aie_dev *fd = ctx->fd_dev; + + if (fd->fd_state & STATE_INIT) { + /* Waiting Job Finsh */ + ret = wait_for_completion_timeout(&fd->fd_job_finished, msecs_to_jiffies(1000)); + if (!ret) { + dev_err(ctx->dev, "Wait job finish timeout from poll\n"); + return EPOLLERR; + } + } + + return v4l2_m2m_fop_poll(file, wait); +} + +static const struct v4l2_file_operations fd_video_fops = { + .owner = THIS_MODULE, + .open = mtk_vfd_open, + .release = mtk_vfd_release, + .poll = mtk_vfd_fop_poll, + .unlocked_ioctl = video_ioctl2, + .mmap = v4l2_m2m_fop_mmap, +}; + +static int mtk_aie_job_ready(void *priv) +{ + struct vb2_v4l2_buffer *src_buf, *dst_buf; + struct mtk_aie_ctx *ctx = priv; + struct mtk_aie_dev *fd = ctx->fd_dev; + struct fd_buffer src_img[2] = {}; + void *plane_vaddr; + + if (!ctx->fh.m2m_ctx) { + dev_err(fd->dev, "Memory-to-memory context is NULL\n"); + return -1; + } + + if (!(fd->fd_state & STATE_OPEN)) { + dev_err(fd->dev, "Job ready with device closed\n"); + return -1; + } + + mutex_lock(&fd->fd_lock); + + src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx); + dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx); + + if (!src_buf || !dst_buf) { + dev_err(fd->dev, "src or dst buf is NULL\n"); + mutex_unlock(&fd->fd_lock); + return -1; + } + + if (!(fd->fd_state & STATE_INIT)) { + dev_err(fd->dev, "%s Wrong fd state: %d\n", __func__, fd->fd_state); + mutex_unlock(&fd->fd_lock); + return -1; + } + + plane_vaddr = vb2_plane_vaddr(&dst_buf->vb2_buf, 0); + if (!plane_vaddr) { + dev_err(fd->dev, "Failed to get plane virtual address\n"); + mutex_unlock(&fd->fd_lock); + return -1; + } + + v4l2_ctrl_request_setup(src_buf->vb2_buf.req_obj.req, &ctx->hdl); + + fd->aie_cfg = (struct aie_enq_info *)plane_vaddr; + fd->aie_cfg->fld_face_num = ctx->user_param.fld_face_num; + + memset(fd->aie_cfg, 0, sizeof(struct aie_enq_info)); + memcpy(fd->aie_cfg, &ctx->user_param, sizeof(struct aie_enq_info)); + memcpy(fd->aie_cfg->fld_input, ctx->user_param.fld_input, + FLD_MAX_FRAME * sizeof(struct v4l2_fld_crop_rip_rop)); + + src_img[0].dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0); + + if (ctx->src_fmt.num_planes == 2) { + src_img[1].dma_addr = + vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 1); + } + + if ((fd->aie_cfg->sel_mode == FDMODE || fd->aie_cfg->sel_mode == ATTRIBUTEMODE) && + fd->aie_cfg->src_img_fmt == FMT_YUV420_1P) { + src_img[1].dma_addr = src_img[0].dma_addr + ctx->user_param.src_img_stride * + ctx->user_param.src_img_height; + } + + fd->aie_cfg->src_img_addr = src_img[0].dma_addr; + fd->aie_cfg->src_img_addr_uv = src_img[1].dma_addr; + + aie_prepare(fd, fd->aie_cfg); + + mutex_unlock(&fd->fd_lock); + + if (src_buf) { + /* Complete request controls if any */ + v4l2_ctrl_request_complete(src_buf->vb2_buf.req_obj.req, &ctx->hdl); + } + + return 0; +} + +static void mtk_aie_device_run(void *priv) +{ + struct mtk_aie_ctx *ctx = priv; + struct mtk_aie_dev *fd = ctx->fd_dev; + int ret; + + ret = mtk_aie_job_ready(priv); + if (ret == -1) { + dev_err(fd->dev, "Failed to run job ready\n"); + return; + } + + atomic_inc(&fd->num_composing); + mtk_aie_hw_job_exec(fd); + aie_execute(fd, fd->aie_cfg); +} + +static struct v4l2_m2m_ops fd_m2m_ops = { + .device_run = mtk_aie_device_run, +}; + +static const struct media_device_ops fd_m2m_media_ops = { + .req_validate = vb2_request_validate, + .req_queue = v4l2_m2m_request_queue, +}; + +static int mtk_aie_video_device_register(struct mtk_aie_dev *fd) +{ + struct v4l2_m2m_dev *m2m_dev = fd->m2m_dev; + struct video_device *vfd = &fd->vfd; + struct device *dev = fd->dev; + int ret; + + vfd->fops = &fd_video_fops; + vfd->release = video_device_release_empty; + vfd->lock = &fd->vfd_lock; + vfd->v4l2_dev = &fd->v4l2_dev; + vfd->vfl_dir = VFL_DIR_M2M; + vfd->device_caps = V4L2_CAP_STREAMING | V4L2_CAP_VIDEO_OUTPUT_MPLANE | + V4L2_CAP_META_CAPTURE; + vfd->ioctl_ops = &mtk_aie_v4l2_video_out_ioctl_ops; + + strscpy(vfd->name, dev_driver_string(dev), sizeof(vfd->name)); + + video_set_drvdata(vfd, fd); + + ret = video_register_device(vfd, VFL_TYPE_VIDEO, 0); + if (ret) { + dev_err(dev, "Failed to register video device\n"); + return ret; + } + + ret = v4l2_m2m_register_media_controller(m2m_dev, vfd, MEDIA_ENT_F_PROC_VIDEO_STATISTICS); + if (ret) { + dev_err(dev, "Failed to init mem2mem media controller\n"); + video_unregister_device(vfd); + return ret; + } + + return 0; +} + +static int mtk_aie_dev_v4l2_init(struct mtk_aie_dev *fd) +{ + struct media_device *mdev = &fd->mdev; + struct device *dev = fd->dev; + int ret; + + ret = v4l2_device_register(dev, &fd->v4l2_dev); + if (ret) { + dev_err(dev, "Failed to register v4l2 device\n"); + return ret; + } + + fd->m2m_dev = v4l2_m2m_init(&fd_m2m_ops); + if (IS_ERR(fd->m2m_dev)) { + dev_err(dev, "Failed to init mem2mem device\n"); + ret = PTR_ERR(fd->m2m_dev); + goto err_unreg_v4l2_dev; + } + + mdev->dev = dev; + strscpy(mdev->model, dev_driver_string(dev), sizeof(mdev->model)); + media_device_init(mdev); + mdev->ops = &fd_m2m_media_ops; + fd->v4l2_dev.mdev = mdev; + + ret = mtk_aie_video_device_register(fd); + if (ret) + goto err_cleanup_mdev; + + ret = media_device_register(mdev); + if (ret) { + dev_err(dev, "Failed to register mem2mem media device\n"); + goto err_unreg_vdev; + } + return 0; + +err_unreg_vdev: + v4l2_m2m_unregister_media_controller(fd->m2m_dev); + video_unregister_device(&fd->vfd); +err_cleanup_mdev: + media_device_cleanup(mdev); + v4l2_m2m_release(fd->m2m_dev); +err_unreg_v4l2_dev: + v4l2_device_unregister(&fd->v4l2_dev); + return ret; +} + +static void mtk_aie_video_device_unregister(struct mtk_aie_dev *fd) +{ + v4l2_m2m_unregister_media_controller(fd->m2m_dev); + video_unregister_device(&fd->vfd); + media_device_cleanup(&fd->mdev); + v4l2_m2m_release(fd->m2m_dev); + v4l2_device_unregister(&fd->v4l2_dev); +} + +static void mtk_aie_frame_done_worker(struct work_struct *work) +{ + struct mtk_aie_req_work *req_work = (struct mtk_aie_req_work *)work; + struct mtk_aie_dev *fd = (struct mtk_aie_dev *)req_work->fd_dev; + + if (fd->reg_cfg.fd_mode == FDMODE) { + fd->reg_cfg.hw_result = readl(fd->fd_base + AIE_RESULT_0_REG); + fd->reg_cfg.hw_result1 = readl(fd->fd_base + AIE_RESULT_1_REG); + } + + mutex_lock(&fd->fd_lock); + + switch (fd->aie_cfg->sel_mode) { + case FDMODE: + aie_get_fd_result(fd, fd->aie_cfg); + break; + case ATTRIBUTEMODE: + aie_get_attr_result(fd, fd->aie_cfg); + break; + case FLDMODE: + aie_get_fld_result(fd, fd->aie_cfg); + break; + default: + dev_dbg(fd->dev, "Wrong sel_mode\n"); + break; + } + + mutex_unlock(&fd->fd_lock); + + if (!cancel_delayed_work(&fd->job_timeout_work)) + return; + + atomic_dec(&fd->num_composing); + mtk_aie_hw_job_finish(fd, VB2_BUF_STATE_DONE); + wake_up(&fd->flushing_waitq); +} + +static int mtk_aie_resource_init(struct mtk_aie_dev *fd) +{ + mutex_init(&fd->vfd_lock); + mutex_init(&fd->dev_lock); + mutex_init(&fd->fd_lock); + + init_completion(&fd->fd_job_finished); + complete_all(&fd->fd_job_finished); + INIT_DELAYED_WORK(&fd->job_timeout_work, mtk_aie_job_timeout_work); + init_waitqueue_head(&fd->flushing_waitq); + atomic_set(&fd->num_composing, 0); + fd->fd_stream_count = 0; + + fd->frame_done_wq = alloc_ordered_workqueue(dev_name(fd->dev), + WQ_HIGHPRI | WQ_FREEZABLE); + if (!fd->frame_done_wq) { + dev_err(fd->dev, "failed to alloc frame_done workqueue\n"); + mutex_destroy(&fd->vfd_lock); + mutex_destroy(&fd->dev_lock); + mutex_destroy(&fd->fd_lock); + return -ENOMEM; + } + + INIT_WORK(&fd->req_work.work, mtk_aie_frame_done_worker); + fd->req_work.fd_dev = fd; + + return 0; +} + +static void mtk_aie_resource_free(struct platform_device *pdev) +{ + struct mtk_aie_dev *fd = dev_get_drvdata(&pdev->dev); + + if (fd->frame_done_wq) + destroy_workqueue(fd->frame_done_wq); + fd->frame_done_wq = NULL; + mutex_destroy(&fd->vfd_lock); + mutex_destroy(&fd->dev_lock); + mutex_destroy(&fd->fd_lock); +} + +static irqreturn_t mtk_aie_irq(int irq, void *data) +{ + struct mtk_aie_dev *fd = (struct mtk_aie_dev *)data; + + aie_irqhandle(fd); + + queue_work(fd->frame_done_wq, &fd->req_work.work); + + return IRQ_HANDLED; +} + +static int mtk_aie_probe(struct platform_device *pdev) +{ + struct mtk_aie_dev *fd; + struct device *dev = &pdev->dev; + const struct mtk_aie_variant *driver_data = NULL; + const struct of_device_id *match = NULL; + int irq; + int ret; + + static struct clk_bulk_data aie_clks[] = { + { .id = "img-ipe" }, + { .id = "ipe-fdvt" }, + { .id = "ipe-top" }, + { .id = "ipe-smi-larb12" }, + }; + + fd = devm_kzalloc(&pdev->dev, sizeof(*fd), GFP_KERNEL); + if (!fd) + return -ENOMEM; + + match = of_match_node(mtk_aie_of_ids, dev->of_node); + if (match) + driver_data = (const struct mtk_aie_variant *)match->data; + + fd->variant = driver_data; + if (!fd->variant) + return -ENODEV; + + ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(34)); + if (ret) + return dev_err_probe(dev, ret, "Cannot set Coherent DMA mask\n"); + + dev_set_drvdata(dev, fd); + fd->dev = dev; + + irq = platform_get_irq(pdev, 0); + if (irq < 0) + return dev_err_probe(dev, irq, "Failed to get IRQ\n"); + + ret = devm_request_irq(dev, irq, mtk_aie_irq, IRQF_SHARED, + dev_driver_string(dev), fd); + if (ret) + return dev_err_probe(dev, ret, "Failed to request irq\n"); + + fd->irq = irq; + fd->fd_base = devm_platform_ioremap_resource(pdev, 0); + if (IS_ERR(fd->fd_base)) + return dev_err_probe(dev, -EINVAL, "Failed to get fd reg base\n"); + + fd->aie_clk.clk_num = ARRAY_SIZE(aie_clks); + fd->aie_clk.clks = aie_clks; + ret = devm_clk_bulk_get(&pdev->dev, fd->aie_clk.clk_num, fd->aie_clk.clks); + if (ret) + return dev_err_probe(dev, ret, "Failed to get raw clock\n"); + + ret = mtk_aie_resource_init(fd); + if (ret) { + mtk_aie_resource_free(pdev); + return ret; + } + pm_runtime_enable(dev); + ret = mtk_aie_dev_v4l2_init(fd); + if (ret) { + pm_runtime_disable(&pdev->dev); + return ret; + } + + return 0; +} + +static void mtk_aie_remove(struct platform_device *pdev) +{ + struct mtk_aie_dev *fd = dev_get_drvdata(&pdev->dev); + + mtk_aie_video_device_unregister(fd); + pm_runtime_disable(&pdev->dev); + mtk_aie_resource_free(pdev); +} + +static int __maybe_unused mtk_aie_suspend(struct device *dev) +{ + struct mtk_aie_dev *fd = dev_get_drvdata(dev); + int ret, num; + + if (pm_runtime_suspended(dev)) + return 0; + + num = atomic_read(&fd->num_composing); + + ret = wait_event_timeout(fd->flushing_waitq, + !(num = atomic_read(&fd->num_composing)), + msecs_to_jiffies(MTK_FD_HW_TIMEOUT_IN_MSEC)); + if (!ret && num) { + dev_dbg(dev, "%s: flushing aie job timeout num %d\n", __func__, num); + + return -EBUSY; + } + + ret = pm_runtime_force_suspend(dev); + if (ret) + return ret; + + return 0; +} + +static int __maybe_unused mtk_aie_resume(struct device *dev) +{ + int ret; + + if (pm_runtime_suspended(dev)) { + dev_dbg(dev, "%s: pm_runtime_suspended is true, no action\n", __func__); + return 0; + } + + ret = pm_runtime_force_resume(dev); + if (ret) + return ret; + + return 0; +} + +static int __maybe_unused mtk_aie_runtime_suspend(struct device *dev) +{ + struct mtk_aie_dev *fd = dev_get_drvdata(dev); + + clk_bulk_disable_unprepare(fd->aie_clk.clk_num, fd->aie_clk.clks); + + return 0; +} + +static int __maybe_unused mtk_aie_runtime_resume(struct device *dev) +{ + struct mtk_aie_dev *fd = dev_get_drvdata(dev); + int ret; + + ret = clk_bulk_prepare_enable(fd->aie_clk.clk_num, fd->aie_clk.clks); + if (ret) { + dev_err(dev, "Failed to enable clock: %d\n", ret); + return ret; + } + + return 0; +} + +static const struct dev_pm_ops mtk_aie_pm_ops = { + SET_SYSTEM_SLEEP_PM_OPS(mtk_aie_suspend, mtk_aie_resume) + SET_RUNTIME_PM_OPS(mtk_aie_runtime_suspend, mtk_aie_runtime_resume, NULL) +}; + +static struct platform_driver mtk_aie_driver = { + .probe = mtk_aie_probe, + .remove = mtk_aie_remove, + .driver = { + .name = "mtk-aie-5.3", + .of_match_table = mtk_aie_of_ids, + .pm = pm_ptr(&mtk_aie_pm_ops), + } +}; + +module_platform_driver(mtk_aie_driver); +MODULE_AUTHOR("Bo Kong <bo.kong@mediatek.com>"); +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("MediaTek AIE driver"); -- 2.45.2 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver 2025-04-03 7:38 ` [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver bo.kong @ 2025-04-03 19:31 ` Nicolas Dufresne 2025-04-04 17:56 ` kernel test robot 2025-04-17 6:39 ` CK Hu (胡俊光) 2 siblings, 0 replies; 17+ messages in thread From: Nicolas Dufresne @ 2025-04-03 19:31 UTC (permalink / raw) To: bo.kong, Rob Herring, AngeloGioacchino Del Regno, Mauro Carvalho Chehab, linux-media, linux-kernel, linux-arm-kernel, linux-mediatek Cc: zhaoyuan.chen, Teddy.Chen, Project_Global_Chrome_Upstream_Group Hi, Le jeudi 03 avril 2025 à 15:38 +0800, bo.kong a écrit : > From: Bo Kong <Bo.Kong@mediatek.com> > > Add a V4L2 sub-device driver for MT8188 AIE. I was confused for a moment with the "sub-device driver" combined with V4L2_MEM2MEM_DEV and MEDIA_CONTROLLER_REQUEST_API. You likely want to correct your commit message. From quick code lookup, this looks like a new class of M2M driver, not a sub-dev that you incorporate into an existing media controller. This would be the first M2M driver using request upstream that is not based on the Stateless Decoder specification. What that means is that more documentation will have to be written. Last similar driver was dw100, perhaps you can take inspiration. Documentation/userspace-api/media/drivers/dw100.rst Its your chance to document how your driver works, how request must be used and also (since you have a media controller) document the graph content, which would mostly be used for identification if there is new compatible drivers in the future. Its always nice to have a link to an open source or public domain demo or production code for this too. That will enhance your doc and commit message. MT8188 is pretty public SoC now with Genio line, so I suppose you don't expect to provide support to everyone that are to use this upstream feature, so its better document it well. Nicolas > > Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> > --- > Changes in V5: > 1. Modify the description of FLDMODE. > 2. Replace all instances of the keyword "race" with "ethnicity" > (e.g., race_result) to improve readability. > 3. Remove NUM_FORMATS. > 4. Correct some English spelling errors. > 5. Change the name of clock, replacing _ with -. > 6. Add an explanation for the aie_copy_fw related bin files. > 7. Use enums to save Y2R, RS, and FD related variables, removing the > define format, > and add new enums Y2R_CONFIG, RS_CONFIG, and FD_CONFIG. > 8. Optimize the expression of the RPN_LOOP_NUM and PYM_START_LOOP > variables. > 9. Remove unnecessary define variables. > 10. Remove the fd_wdma_en structure, replacing it with > out_stride_size for judgment. > 11. Optimize the fd_ker_rdma_size structure, add the > get_fd_ker_rdma_size function, > optimize the attr_ker_rdma_size structure, and add the > get_attr_ker_rdma_size function. > 12. Delete the fd_stride and fd_maxpool structures. > 13. Remove duplicate structures in mtk_aie.h and > mtk_aie_v4l2_controls.h. > 14. Remove fd_version, attr_version, and pose_version. > 15. Move the aie_enq_info and related structures to the uapi > directory. > 16. Rename mtk_aie_53.c to mtk_aie_v4l2.c and update the Makefile > accordingly. > 17. Optimize some code to reduce redundant conditional checks. > > Changes in v4: > 1. Remove ccflags that are not used in Makefile. > 2. Optimize the write register operation. > 3. Remove excess config from Kconfig. > 4. Remove the probe function related to hw_version. > 5. Remove mtk_aie_hw_connect,mtk_aie_hw_enable,mtk_aie_hw_disconnect > etc. > 6. Fix coding style in mtk_aie.h. > 7. Replace signed short with s16. > 8. Remove aie_reset_output_buf function, optimize this part of the > code. > 9. Remove struct race_result and other structures, and use s16 > instead. > of structures. > 10. Change multiple different types of width and height into the same > structure. > 11. Remove all unnecessary variable initialization > operations,retaining > some variables that need to be initialized. > 12. Delete print physical address. > 13. Remove aie_init_table and combine aie_init_table into the > aie_update_table function. > 14. Optimize the readability of some codes, add aie_calculate_pa > function. > 15. Optimize the aie_execute function and divide this function into > aie_execute_face_detection, aie_execute_attribute_detection, > aie_execute_fld_detection functions. > 16. Modify arrays with all 5 into define ANCHOR_EN_NUM. > 17. Remove mtk_aie_get_variant function. > 18. The version in the code was deleted, but the variables in the > enque > info structure cannot be deleted. Once deleted, the AIE will not work > because this structure needs to be aligned with User Space. > > Changes in v3: > 1. Remove not used include file, include only headers which AIE use > 2. Remove Makefile some private driver headers > > Changes in v2: > 1. Fix coding style > --- > drivers/media/platform/mediatek/Kconfig | 1 + > drivers/media/platform/mediatek/Makefile | 1 + > drivers/media/platform/mediatek/aie/Kconfig | 12 + > drivers/media/platform/mediatek/aie/Makefile | 5 + > drivers/media/platform/mediatek/aie/mtk_aie.h | 870 ++++++ > .../media/platform/mediatek/aie/mtk_aie_drv.c | 2398 > +++++++++++++++++ > .../platform/mediatek/aie/mtk_aie_v4l2.c | 1295 +++++++++ > 7 files changed, 4582 insertions(+) > create mode 100644 drivers/media/platform/mediatek/aie/Kconfig > create mode 100644 drivers/media/platform/mediatek/aie/Makefile > create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie.h > create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie_drv.c > create mode 100644 > drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c > > diff --git a/drivers/media/platform/mediatek/Kconfig > b/drivers/media/platform/mediatek/Kconfig > index 84104e2cd024..cd161272666b 100644 > --- a/drivers/media/platform/mediatek/Kconfig > +++ b/drivers/media/platform/mediatek/Kconfig > @@ -2,6 +2,7 @@ > > comment "Mediatek media platform drivers" > > +source "drivers/media/platform/mediatek/aie/Kconfig" > source "drivers/media/platform/mediatek/jpeg/Kconfig" > source "drivers/media/platform/mediatek/mdp/Kconfig" > source "drivers/media/platform/mediatek/vcodec/Kconfig" > diff --git a/drivers/media/platform/mediatek/Makefile > b/drivers/media/platform/mediatek/Makefile > index 38e6ba917fe5..23a096fdf21c 100644 > --- a/drivers/media/platform/mediatek/Makefile > +++ b/drivers/media/platform/mediatek/Makefile > @@ -1,4 +1,5 @@ > # SPDX-License-Identifier: GPL-2.0-only > +obj-y += aie/ > obj-y += jpeg/ > obj-y += mdp/ > obj-y += vcodec/ > diff --git a/drivers/media/platform/mediatek/aie/Kconfig > b/drivers/media/platform/mediatek/aie/Kconfig > new file mode 100644 > index 000000000000..ffbe0a79a2ea > --- /dev/null > +++ b/drivers/media/platform/mediatek/aie/Kconfig > @@ -0,0 +1,12 @@ > +config VIDEO_MTK_AIE > + tristate "MediaTek AI engine function" > + depends on OF > + select V4L2_MEM2MEM_DEV > + select VIDEOBUF2_DMA_CONTIG > + select MEDIA_CONTROLLER_REQUEST_API > + help > + Support the AI engine (AIE) feature > + > + AIE driver is a V4L2 memory-to-memory device driver which > + provides hardware accelerated face detection function, > + it can detect different sizes of faces in a raw image. > diff --git a/drivers/media/platform/mediatek/aie/Makefile > b/drivers/media/platform/mediatek/aie/Makefile > new file mode 100644 > index 000000000000..0b4252452ca2 > --- /dev/null > +++ b/drivers/media/platform/mediatek/aie/Makefile > @@ -0,0 +1,5 @@ > +# SPDX-License-Identifier: GPL-2.0 > +mtk-aie-$(CONFIG_VIDEO_MTK_AIE) += mtk_aie_v4l2.o > +mtk-aie-$(CONFIG_VIDEO_MTK_AIE) += mtk_aie_drv.o > + > +obj-$(CONFIG_VIDEO_MTK_AIE) += mtk-aie.o > \ No newline at end of file > diff --git a/drivers/media/platform/mediatek/aie/mtk_aie.h > b/drivers/media/platform/mediatek/aie/mtk_aie.h > new file mode 100644 > index 000000000000..1de8d8b82499 > --- /dev/null > +++ b/drivers/media/platform/mediatek/aie/mtk_aie.h > @@ -0,0 +1,870 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * Copyright (c) 2020 MediaTek Inc. > + * Author: Bo Kong <bo.kong@mediatek.com> > + */ > + > +#ifndef __MTK_AIE_H__ > +#define __MTK_AIE_H__ > + > +#include <media/v4l2-ctrls.h> > +#include <media/v4l2-device.h> > +#include <linux/mtk_aie_v4l2_controls.h> > + > +#define RPN_NUM 3 > +#define RPN_LOOP_NUM 29 > +#define FD_LOOP_NUM (RPN_NUM * RPN_LOOP_NUM) > + > +#define RPNX_LOOP_NUM(x) ((3 - (x)) * RPN_LOOP_NUM - 1) > +#define PYMX_START_LOOP(x) ((2 - (x)) * RPN_LOOP_NUM) > + > +#define ATTR_LOOP_NUM 26 > +#define AGE_OUT_RGS 17 > +#define GENDER_OUT_RGS 20 > +#define INDIAN_OUT_RGS 22 > +#define ETHNICITY_OUT_RGS 25 > + > +#define INPUT_WDMA_WRA_NUM 4 > +#define OUTPUT_WDMA_WRA_NUM 4 > +#define KERNEL_RDMA_RA_NUM 2 > + > +#define MAX_ENQUE_FRAME_NUM 10 > +#define PYM_NUM 3 > +#define COLOR_NUM 3 > + > +#define ATTR_MODE_PYRAMID_WIDTH 128 > +#define ATTR_OUT_SIZE 32 > + > +#define AIE_START_REG 0x000 > +#define AIE_ENABLE_REG 0x004 > +#define AIE_LOOP_REG 0x008 > +#define AIE_YUV2RGB_CON_BASE_ADR_REG 0x00c > +#define AIE_RS_CON_BASE_ADR_REG 0x010 > +#define AIE_FD_CON_BASE_ADR_REG 0x014 > +#define AIE_INT_EN_REG 0x018 > +#define AIE_INT_REG 0x01c > +#define AIE_RESULT_0_REG 0x08c > +#define AIE_RESULT_1_REG 0x090 > +#define AIE_DMA_CTL_REG 0x094 > + > +/* AIE 3.0 register */ > +#define AIE_YUV2RGB_CON_BASE_ADR_MSB 0x14C > +#define AIE_RS_CON_BASE_ADR_MSB 0x150 > +#define AIE_FD_CON_BASE_ADR_MSB 0x154 > + > +/* AIE 3.0 FLD register */ > +#define FLD_EN 0x400 > +#define FLD_BASE_ADDR_FACE_0 0x404 > +#define FLD_MODEL_PARA1 0x4FC > +#define FLD_MODEL_PARA14 0x530 > + > +#define FLD_BUSY 0x544 > +#define FLD_DONE 0x548 > +#define FLD_SRC_WD_HT 0x54C > + > +/* n: min 0, max 14 */ > +#define FLD_PL_IN_BASE_ADDR_0_(n) (0x550 + 4 * (n)) > +#define FLD_PL_IN_BASE_ADDR_1_(n) (0x5C8 + 4 * (n)) > +#define FLD_PL_IN_BASE_ADDR_2_(n) (0x640 + 4 * (n)) > +#define FLD_PL_IN_BASE_ADDR_3_(n) (0x6B8 + 4 * (n)) > +#define FLD_SH_IN_BASE_ADDR_(n) (0x85C + 4 * (n)) > + > +#define FLD_PL_IN_SIZE_0 0x730 > +#define FLD_PL_IN_STRIDE_0 0x734 > +#define FLD_PL_IN_SIZE_1 0x738 > +#define FLD_PL_IN_STRIDE_1 0x73C > +#define FLD_PL_IN_SIZE_2_0 0x740 > +#define FLD_PL_IN_STRIDE_2_0 0x744 > +#define FLD_PL_IN_SIZE_2_1 0x748 > +#define FLD_PL_IN_STRIDE_2_1 0x74C > +#define FLD_PL_IN_SIZE_2_2 0x750 > +#define FLD_PL_IN_STRIDE_2_2 0x754 > +#define FLD_PL_IN_SIZE_3 0x758 > +#define FLD_PL_IN_STRIDE_3 0x75C > + > +#define FLD_SH_IN_SIZE_0 0x7D8 > +#define FLD_SH_IN_STRIDE_0 0x7DC > +#define FLD_TR_OUT_BASE_ADDR_0 0x7E0 > +#define FLD_TR_OUT_SIZE_0 0x7E4 > +#define FLD_TR_OUT_STRIDE_0 0x7E8 > +#define FLD_PP_OUT_BASE_ADDR_0 0x7EC > +#define FLD_PP_OUT_SIZE_0 0x7F0 > +#define FLD_PP_OUT_STRIDE_0 0x7F4 > +#define FLD_SPARE 0x7F8 > + > +#define FLD_BASE_ADDR_FACE_0_7_MSB 0x7FC > +#define FLD_BASE_ADDR_FACE_8_14_MSB 0x800 > + > +#define FLD_PL_IN_BASE_ADDR_0_0_7_MSB 0x804 > +#define FLD_PL_IN_BASE_ADDR_0_8_15_MSB 0x808 > +#define FLD_PL_IN_BASE_ADDR_0_16_23_MSB 0x80C > +#define FLD_PL_IN_BASE_ADDR_0_24_29_MSB 0x810 > + > +#define FLD_PL_IN_BASE_ADDR_1_0_7_MSB 0x814 > +#define FLD_PL_IN_BASE_ADDR_1_8_15_MSB 0x818 > +#define FLD_PL_IN_BASE_ADDR_1_16_23_MSB 0x81C > +#define FLD_PL_IN_BASE_ADDR_1_24_29_MSB 0x820 > + > +#define FLD_PL_IN_BASE_ADDR_2_0_7_MSB 0x824 > +#define FLD_PL_IN_BASE_ADDR_2_8_15_MSB 0x828 > +#define FLD_PL_IN_BASE_ADDR_2_16_23_MSB 0x82C > +#define FLD_PL_IN_BASE_ADDR_2_24_29_MSB 0x830 > + > +#define FLD_PL_IN_BASE_ADDR_3_0_7_MSB 0x834 > +#define FLD_PL_IN_BASE_ADDR_3_8_15_MSB 0x838 > +#define FLD_PL_IN_BASE_ADDR_3_16_23_MSB 0x83C > +#define FLD_PL_IN_BASE_ADDR_3_24_29_MSB 0x840 > + > +#define FLD_SH_IN_BASE_ADDR_0_7_MSB 0x844 > +#define FLD_SH_IN_BASE_ADDR_8_15_MSB 0x848 > +#define FLD_SH_IN_BASE_ADDR_16_23_MSB 0x84C > +#define FLD_SH_IN_BASE_ADDR_24_29_MSB 0x850 > + > +#define FLD_BS_IN_BASE_ADDR_0_7_MSB 0x8d4 > +#define FLD_BS_IN_BASE_ADDR_8_15_MSB 0x8d8 > + > +#define FLD_TR_OUT_BASE_ADDR_0_MSB 0x854 > +#define FLD_PP_OUT_BASE_ADDR_0_MSB 0x858 > + > +#define FLD_BS_IN_BASE_ADDR_14 0x894 > + > +#define FLD_BS_BIAS 0x8E4 > +#define FLD_CV_FM_RANGE_0 0x8E8 > +#define FLD_CV_FM_RANGE_1 0x8EC > +#define FLD_CV_PM_RANGE_0 0x8F0 > +#define FLD_CV_PM_RANGE_1 0x8F4 > +#define FLD_BS_RANGE_0 0x8F8 > +#define FLD_BS_RANGE_1 0x8FC > + > +#define MTK_FD_OUTPUT_MIN_WIDTH 16U > +#define MTK_FD_OUTPUT_MIN_HEIGHT 16U > +#define MTK_FD_OUTPUT_MAX_WIDTH 4096U > +#define MTK_FD_OUTPUT_MAX_HEIGHT 4096U > + > +#define MTK_FD_HW_TIMEOUT_IN_MSEC 2000 > +#define RLT_NUM 48 > +#define GENDER_OUT 32 > + > +#define ETHNICITY_RST_X_NUM 4 > +#define ETHNICITY_RST_Y_NUM 64 > +#define GENDER_RST_X_NUM 2 > +#define GENDER_RST_Y_NUM 64 > +#define METHNICITY_RST_NUM 4 > +#define MGENDER_RST_NUM 2 > +#define MAGE_RST_NUM 2 > +#define MINDIAN_RST_NUM 2 > + > +#define FLD_FOREST 14 > +#define FLD_POINT 500 > + > +#define FLD_STEP_NUM 6 > +#define FLD_MAX_FRAME 15 > + > +#define FLD_STEP_BLINK 0 > +#define FLD_STEP_CV 1 > +#define FLD_STEP_FP 2 > +#define FLD_STEP_LEAF 3 > +#define FLD_STEP_KM02 4 > +#define FLD_STEP_KM13 5 > + > +#define FLD_BLINK_WEIGHT_FOREST14_SIZE 6416 > +#define FLD_CV_SIZE 19392 > +#define FLD_FP_SIZE 80160 > +#define FLD_LEAFNODE_SIZE 4608000 > +#define FLD_TREE_KM02_SIZE 120000 > +#define FLD_TREE_KM13_SIZE 120000 > +#define FLD_OUTPUT_SIZE 112 > + > +/* FLD_OUTPUT_X_SIZE: min: 1, max: 15 */ > +#define FLD_OUTPUT_X_SIZE 9 > + > +#define RESULT_SIZE (RLT_NUM * MAX_FACE_NUM) > + > +#define SRZ_BIT (BIT_MASK(16) | BIT_MASK(12) | BIT_MASK(8) | > BIT_MASK(0)) > +#define RESET_BIT16 BIT(16) > +#define RESET_BIT17 BIT(17) > +#define RESET_BIT (RESET_BIT16 | RESET_BIT17) > + > +#define ANCHOR_EN_NUM 5 > + > +#define FD_STRIDE_SIZE 3 > +#define FD_MAXPOOL_SIZE 3 > + > +#define ATTR_FD_STRIDE(n) ((n) == 0 ? 2 : 1) > +#define ATTR_FD_MAXPOOL(n) ((n) == 0 ? 1 : 0) > +#define FLD_FACE_INFO(m, n) (fld_face_info_0[n] + 4 * (m)) > + > +enum Y2R_CONFIG { > + Y2R_SRC_DST_FORMAT, > + Y2R_IN_W_H, > + Y2R_OUT_W_H, > + Y2R_RA0_RA1_EN, > + Y2R_IN_X_Y_SIZE0, > + Y2R_IN_STRIDE0_BUS_SIZE0, > + Y2R_IN_X_Y_SIZE1, > + Y2R_IN_STRIDE1_BUS_SIZE1, > + Y2R_OUT_X_Y_SIZE0, > + Y2R_OUT_STRIDE0_BUS_SIZE0, > + Y2R_OUT_X_Y_SIZE1, > + Y2R_OUT_STRIDE1_BUS_SIZE1, > + Y2R_OUT_X_Y_SIZE2, > + Y2R_OUT_STRIDE2_BUS_SIZE2, > + Y2R_IN_0, > + Y2R_IN_1, > + Y2R_OUT_0, > + Y2R_OUT_1, > + Y2R_OUT_2, > + Y2R_RS_SEL_SRZ_EN, > + Y2R_X_Y_MAG, > + y2r_reserve0, > + Y2R_SRZ_HORI_STEP, > + Y2R_SRZ_VERT_STEP, > + y2r_reserve1, > + y2r_reserve2, > + Y2R_PADDING_EN_UP_DOWN, > + Y2R_PADDING_RIGHT_LEFT, > + Y2R_CO2_FMT_MODE_EN, > + Y2R_CO2_CROP_X, > + Y2R_CO2_CROP_Y, > + Y2R_CON_IN_BA_MSB, > + Y2R_CON_OUT_BA_MSB > +}; > + > +enum RS_CONFIG { > + RS_X_Y_MAG = 1, > + RS_SRZ_HORI_STEP = 3, > + RS_SRZ_VERT_STEP, > + RS_INPUT_W_H = 7, > + RS_OUTPUT_W_H, > + RS_IN_X_Y_SIZE0 = 10, > + RS_IN_STRIDE0, > + RS_IN_X_Y_SIZE1, > + RS_IN_STRIDE1, > + RS_IN_X_Y_SIZE2, > + RS_IN_STRIDE2, > + RS_OUT_X_Y_SIZE0, > + RS_OUT_STRIDE0, > + RS_OUT_X_Y_SIZE1, > + RS_OUT_STRIDE1, > + RS_OUT_X_Y_SIZE2, > + RS_OUT_STRIDE2, > + RS_IN_0, > + RS_IN_1, > + RS_IN_2, > + RS_OUT_0, > + RS_OUT_1, > + RS_OUT_2, > + RS_CON_IN_BA_MSB, > + RS_CON_OUT_BA_MSB, > +}; > + > +enum FD_CONFIG { > + FD_INPUT_ROTATE = 1, > + FD_CONV_WIDTH_MOD6, > + FD_CONV_IMG_W_H = 4, > + FD_IN_IMG_W_H, > + FD_OUT_IMG_W_H, > + FD_IN_X_Y_SIZE0 = 9, > + FD_IN_STRIDE0_BUS_SIZE0, > + FD_IN_X_Y_SIZE1, > + FD_IN_STRIDE1_BUS_SIZE1, > + FD_IN_X_Y_SIZE2, > + FD_IN_STRIDE2_BUS_SIZE2, > + FD_IN_X_Y_SIZE3, > + FD_IN_STRIDE3_BUS_SIZE3, > + FD_OUT_X_Y_SIZE0, > + FD_OUT_STRIDE0_BUS_SIZE0, > + FD_OUT_X_Y_SIZE1, > + FD_OUT_STRIDE1_BUS_SIZE1, > + FD_OUT_X_Y_SIZE2, > + FD_OUT_STRIDE2_BUS_SIZE2, > + FD_OUT_X_Y_SIZE3, > + FD_OUT_STRIDE3_BUS_SIZE3, > + FD_IN_0 = 27, > + FD_IN_1, > + FD_IN_2, > + FD_IN_3, > + FD_OUT_0, > + FD_OUT_1, > + FD_OUT_2, > + FD_OUT_3, > + FD_KERNEL_0, > + FD_KERNEL_1, > + FD_RPN_SET, > + FD_IMAGE_COORD, > + FD_IMAGE_COORD_XY_OFST, > + FD_BIAS_ACCU = 47, > + FD_SRZ_FDRZ_RS, > + FD_SRZ_HORI_STEP, > + FD_SRZ_VERT_STEP, > + FD_SRZ_HORI_SUB_INT_OFST, > + FD_SRZ_VERT_SUB_INT_OFST, > + FD_CON_IN_BA_MSB, > + FD_CON_OUT_BA_MSB, > + FD_CON_KERNEL_BA_MSB > +}; > + > +static const unsigned int > out_stride_size[FD_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { > + { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 0, 2, 0 }, { 1, 0, 0, 0 }, > + { 1, 1, 2, 2 }, { 1, 1, 2, 2 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, > + { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 1, 0, 0 }, > + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, > + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, > + { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, > + { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, > + { 3, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 0, 2, 0 }, > + { 1, 0, 0, 0 }, { 1, 1, 2, 2 }, { 1, 1, 2, 2 }, { 1, 0, 0, 0 }, > + { 1, 0, 2, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, > + { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, > + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 0, 0 }, > + { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 1, 1 }, > + { 1, 1, 1, 1 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, > + { 1, 0, 0, 0 }, { 3, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, > + { 1, 0, 2, 0 }, { 1, 0, 0, 0 }, { 1, 1, 2, 2 }, { 1, 1, 2, 2 }, > + { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, > + { 1, 0, 2, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, > + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, > + { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, > + { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, > + { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 3, 0, 0, 0 } > +}; > + > +static const unsigned int fd_ker_rdma_size[RPN_LOOP_NUM] = { > + 240, 1168, 1168, 272, > + 2320, 2080, 1040, 4624, > + 3104, 9232, 4624, 4128, > + 1040, 4624, 4624, 1552, > + 4624, 4624, 4128, 1040, > + 1040, 528, 4160, 4160, > + 2080, 2080, 2080, 1040, > + 0 > +}; > + > +static const unsigned int fd_out_stride2_in[FD_LOOP_NUM] = { > + 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, > + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, > + 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 > +}; > + > +static const unsigned int fd_stride_indices[FD_STRIDE_SIZE] = { > + 0, 29, 58 > +}; > + > +static const unsigned int fd_maxpool_indices[FD_MAXPOOL_SIZE] = { > + 1, 30, 59 > +}; > + > +static const unsigned int out_2size[FD_LOOP_NUM] = { > + 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > + 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, > + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, > + 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 > +}; > + > +static const unsigned int in_ch_pack[FD_LOOP_NUM] = { > + 1, 16, 16, 16, 16, 16, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, > 32, > + 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 0, 1, 16, 16, 16, 16, 16, > 32, > + 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, > 32, > + 32, 32, 32, 0, 1, 16, 16, 16, 16, 16, 32, 32, 32, 32, 32, 32, 32, > 32, > + 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 0 > +}; > + > +static const unsigned int outlayer[FD_LOOP_NUM] = { > + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, > + 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > + 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, > + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0 > +}; > + > +static const unsigned int out_ch_pack[FD_LOOP_NUM] = { > + 16, 16, 16, 16, 16, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, > 32, > + 32, 16, 16, 16, 32, 32, 32, 32, 32, 32, 0, 16, 16, 16, 16, 16, 32, > 32, > + 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 16, 16, 16, 32, 32, > 32, > + 32, 32, 32, 0, 16, 16, 16, 16, 16, 32, 32, 32, 32, 32, 32, 32, 32, > 32, > + 32, 32, 32, 32, 32, 16, 16, 16, 32, 32, 32, 32, 32, 32, 0 > +}; > + > +static const signed int > fd_rdma_en[FD_LOOP_NUM][INPUT_WDMA_WRA_NUM][2] = { > + { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, > + { { 0, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 1, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 1, 0 }, { 2, 0 }, { -1, -1 }, { -1, -1 } }, > + { { 3, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 1, 2 }, { 2, 2 }, { 4, 2 }, { 4, 3 } }, > + { { 5, 0 }, { 5, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 6, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 5, 0 }, { 5, 1 }, { 7, 0 }, { -1, -1 } }, > + { { 8, 0 }, { 8, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 9, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 5, 2 }, { 5, 3 }, { 7, 2 }, { 10, 2 } }, > + { { 11, 0 }, { 11, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 12, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 13, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 11, 0 }, { 11, 1 }, { 14, 0 }, { -1, -1 } }, > + { { 15, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 16, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 11, 0 }, { 11, 1 }, { 14, 0 }, { 17, 0 } }, > + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 19, 0 }, { 22, 0 }, { 22, 1 }, { 25, 0 } }, > + { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, > + { { 29, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 30, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 30, 0 }, { 31, 0 }, { -1, -1 }, { -1, -1 } }, > + { { 32, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 30, 2 }, { 31, 2 }, { 33, 2 }, { 33, 3 } }, > + { { 34, 0 }, { 34, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 35, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 34, 0 }, { 34, 1 }, { 36, 0 }, { -1, -1 } }, > + { { 37, 0 }, { 37, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 38, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 34, 2 }, { 34, 3 }, { 36, 2 }, { 39, 2 } }, > + { { 40, 0 }, { 40, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 41, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 42, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 40, 0 }, { 40, 1 }, { 43, 0 }, { -1, -1 } }, > + { { 44, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 45, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 40, 0 }, { 40, 1 }, { 43, 0 }, { 46, 0 } }, > + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 48, 0 }, { 51, 0 }, { 51, 1 }, { 54, 0 } }, > + { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, > + { { 58, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 59, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 59, 0 }, { 60, 0 }, { -1, -1 }, { -1, -1 } }, > + { { 61, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 59, 2 }, { 60, 2 }, { 62, 2 }, { 62, 3 } }, > + { { 63, 0 }, { 63, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 64, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 63, 0 }, { 63, 1 }, { 65, 0 }, { -1, -1 } }, > + { { 66, 0 }, { 66, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 67, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 63, 2 }, { 63, 3 }, { 65, 2 }, { 68, 2 } }, > + { { 69, 0 }, { 69, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 70, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 71, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 69, 0 }, { 69, 1 }, { 72, 0 }, { -1, -1 } }, > + { { 73, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 74, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 69, 0 }, { 69, 1 }, { 72, 0 }, { 75, 0 } }, > + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 77, 0 }, { 80, 0 }, { 80, 1 }, { 83, 0 } } > +}; > + > +static const unsigned int > attr_wdma_en[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { > + { 1, 0, 1, 0 }, { 1, 0, 1, 0 }, { 1, 0, 0, 0 }, { 1, 1, 1, 1 }, > + { 1, 1, 1, 1 }, { 1, 0, 1, 0 }, { 1, 1, 0, 0 }, { 1, 0, 1, 0 }, > + { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, > + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 0, 0 }, > + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, > + { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, > + { 1, 0, 0, 0 }, { 1, 0, 0, 0 } > +}; > + > +static const unsigned int attr_ker_rdma_size[ATTR_LOOP_NUM] = { > + 240, 1168, 272, 2320, > + 2080, 9232, 3104, 9232, > + 4128, 1040, 4624, 4624, > + 1552, 4624, 4624, 4128, > + 9232, 272, 9232, 2320, > + 144, 9232, 272, 9232, > + 2320, 144 > +}; > + > +static const unsigned int attr_out_stride2_as_in[ATTR_LOOP_NUM] = { > + 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, > + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 > +}; > + > +static const unsigned int attr_out_2size[ATTR_LOOP_NUM] = { /* O */ > + 1, 1, 0, 1, 1, 1, 0, > + 1, 0, 0, 0, 0, 0, 0, > + 0, 0, 0, 0, 0, 0, 0, > + 0, 0, 0, 0, 0 > +}; > + > +/* [loop][ch][output_index] */ > +static const signed int > attr_rdma_en[ATTR_LOOP_NUM][INPUT_WDMA_WRA_NUM][2] = { > + { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, > + { { 0, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 0, 0 }, { 1, 0 }, { -1, -1 }, { -1, -1 } }, > + { { 2, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 0, 2 }, { 1, 2 }, { 3, 2 }, { 3, 3 } }, > + { { 4, 0 }, { 4, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 4, 0 }, { 4, 1 }, { 5, 0 }, { -1, -1 } }, > + { { 6, 0 }, { 6, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 4, 2 }, { 4, 3 }, { 5, 2 }, { 7, 2 } }, > + { { 8, 0 }, { 8, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 9, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 10, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 8, 0 }, { 8, 1 }, { 11, 0 }, { -1, -1 } }, > + { { 12, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 13, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 8, 0 }, { 8, 1 }, { 11, 0 }, { 14, 0 } }, > + { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 16, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 18, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 19, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 21, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, > + { { 23, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, > + { { 24, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } } > +}; > + > +static const unsigned int > attr_wdma_size[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { > + { 16384, 0, 4096, 0 }, > + { 16384, 0, 4096, 0 }, > + { 16384, 0, 0, 0 }, > + { 16384, 16384, 4096, 4096 }, > + { 8192, 8192, 2048, 2048 }, > + { 8192, 0, 2048, 0 }, > + { 8192, 8192, 0, 0 }, > + { 8192, 0, 2048, 0 }, > + { 2048, 2048, 0, 0 }, > + { 2048, 0, 0, 0 }, > + { 2048, 0, 0, 0 }, > + { 2048, 0, 0, 0 }, > + { 2048, 0, 0, 0 }, > + { 2048, 0, 0, 0 }, > + { 2048, 0, 0, 0 }, > + { 2048, 2048, 0, 0 }, > + { 2048, 0, 0, 0 }, > + { 0, 0, 0, 0 }, > + { 2048, 0, 0, 0 }, > + { 1024, 0, 0, 0 }, > + { 0, 0, 0, 0 }, > + { 2048, 0, 0, 0 }, > + { 0, 0, 0, 0 }, > + { 2048, 0, 0, 0 }, > + { 1024, 0, 0, 0 }, > + { 0, 0, 0, 0 } > +}; > + > +static const unsigned int fld_face_info_0[FLD_MAX_FRAME] = { > + 0x440, 0x44C, 0x458, 0x464, 0x470, 0x47C, 0x488, 0x494, 0x4A4, > + 0x4B0, 0x4BC, 0x4C8, 0x4D4, 0x4E0, 0x4EC > +}; > + > +struct aie_static_info_element { > + u32 fd_wdma_size[OUTPUT_WDMA_WRA_NUM]; > + u32 out_xsize_plus_1; > + u32 out_height; > + u32 out_ysize_plus_1_stride2; > + u32 out_stride; > + u32 out_stride_stride2; > + u32 out_width; > + u32 img_width; > + u32 img_height; > + u32 stride2_out_width; > + u32 stride2_out_height; > + u32 out_xsize_plus_1_stride2; > + u32 input_xsize_plus_1; > +}; > + > +struct aie_static_info { > + struct aie_static_info_element inf_elm[FD_LOOP_NUM]; > +}; > + > +enum aie_state { > + STATE_NA, > + STATE_INIT, > + STATE_OPEN > +}; > + > +/** > + * enum aie_mode - AIE Feature Mode > + * @FDMODE: Face Detection. > + * @ATTRIBUTEMODE: Gender and ethnicity detection. > + * @FLDMODE: Locations of eyebrows, eyes, ears, nose,and mouth. > + */ > +enum aie_mode { > + FDMODE, > + ATTRIBUTEMODE, > + FLDMODE > +}; > + > +enum aie_format { > + FMT_NA, > + FMT_YUV_2P, > + FMT_YVU_2P, > + FMT_YUYV, > + FMT_YVYU, > + FMT_UYVY, > + FMT_VYUY, > + FMT_MONO, > + FMT_YUV420_2P, > + FMT_YUV420_1P > +}; > + > +enum aie_input_degree { > + DEGREE_0, > + DEGREE_90, > + DEGREE_270, > + DEGREE_180 > +}; > + > +struct ethnicity_result { > + signed short result[4][64]; > +}; > + > +struct merged_ethnicity_result { > + signed short result[4]; > +}; > + > +struct merged_gender_result { > + signed short result[2]; > +}; > + > +struct merged_age_result { > + signed short result[2]; > +}; > + > +struct merged_is_indian_result { > + signed short result[2]; > +}; > + > +struct aie_reg_cfg { > + u32 rs_adr; > + u32 yuv2rgb_adr; > + u32 fd_adr; > + u32 fd_pose_adr; > + u32 fd_mode; > + u32 hw_result; > + u32 hw_result1; > + u32 reserved; > +}; > + > +struct aie_hw_rect { > + u16 width; > + u16 height; > +}; > + > +struct aie_para { > + void *fd_fd_cfg_va; > + void *fd_rs_cfg_va; > + void *fd_yuv2rgb_cfg_va; > + > + void *attr_fd_cfg_va[MAX_ENQUE_FRAME_NUM]; > + void *attr_yuv2rgb_cfg_va[MAX_ENQUE_FRAME_NUM]; > + > + void *rs_pym_rst_va[PYM_NUM][COLOR_NUM]; > + > + dma_addr_t fd_fd_cfg_pa; > + dma_addr_t fd_rs_cfg_pa; > + dma_addr_t fd_yuv2rgb_cfg_pa; > + > + dma_addr_t attr_fd_cfg_pa[MAX_ENQUE_FRAME_NUM]; > + dma_addr_t attr_yuv2rgb_cfg_pa[MAX_ENQUE_FRAME_NUM]; > + > + dma_addr_t rs_pym_rst_pa[PYM_NUM][COLOR_NUM]; > + > + u32 sel_mode; > + u32 src_img_fmt; > + u32 rotate_degree; > + s16 rpn_anchor_thrd; > + u16 number_of_pyramid; > + u32 src_img_addr; > + u32 src_img_addr_uv; > + > + struct aie_hw_rect max_img_rect; > + struct aie_hw_rect img_rect; > + struct aie_hw_rect crop_rect; > + struct aie_hw_rect pyramid_rect; > + struct aie_hw_rect max_pyramid_rect; > +}; > + > +struct aie_attr_para { > + u32 w_idx; > + u32 r_idx; > + u32 sel_mode[MAX_ENQUE_FRAME_NUM]; > + u16 img_width[MAX_ENQUE_FRAME_NUM]; > + u16 img_height[MAX_ENQUE_FRAME_NUM]; > + u16 crop_width[MAX_ENQUE_FRAME_NUM]; > + u16 crop_height[MAX_ENQUE_FRAME_NUM]; > + u32 src_img_fmt[MAX_ENQUE_FRAME_NUM]; > + u32 rotate_degree[MAX_ENQUE_FRAME_NUM]; > + u32 src_img_addr[MAX_ENQUE_FRAME_NUM]; > + u32 src_img_addr_uv[MAX_ENQUE_FRAME_NUM]; > +}; > + > +struct aie_fd_dma_para { > + void *fd_out_hw_va[FD_LOOP_NUM][OUTPUT_WDMA_WRA_NUM]; > + void *fd_kernel_va[FD_LOOP_NUM][KERNEL_RDMA_RA_NUM]; > + void *attr_out_hw_va[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM]; > + void *attr_kernel_va[ATTR_LOOP_NUM][KERNEL_RDMA_RA_NUM]; > + > + void *age_out_hw_va[MAX_ENQUE_FRAME_NUM]; > + void *gender_out_hw_va[MAX_ENQUE_FRAME_NUM]; > + void *is_indian_out_hw_va[MAX_ENQUE_FRAME_NUM]; > + void *ethnicity_out_hw_va[MAX_ENQUE_FRAME_NUM]; > + > + dma_addr_t fd_out_hw_pa[FD_LOOP_NUM][OUTPUT_WDMA_WRA_NUM]; > + dma_addr_t fd_kernel_pa[FD_LOOP_NUM][KERNEL_RDMA_RA_NUM]; > + dma_addr_t attr_out_hw_pa[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM]; > + dma_addr_t attr_kernel_pa[ATTR_LOOP_NUM][KERNEL_RDMA_RA_NUM]; > + > + dma_addr_t age_out_hw_pa[MAX_ENQUE_FRAME_NUM]; > + dma_addr_t gender_out_hw_pa[MAX_ENQUE_FRAME_NUM]; > + dma_addr_t is_indian_out_hw_pa[MAX_ENQUE_FRAME_NUM]; > + dma_addr_t ethnicity_out_hw_pa[MAX_ENQUE_FRAME_NUM]; > +}; > + > +struct aie_fd_fld_para { > + void *fld_step_va[FLD_STEP_NUM][FLD_MAX_FRAME]; > + void *fld_output_va[FLD_MAX_FRAME]; > + dma_addr_t fld_step_pa[FLD_STEP_NUM][FLD_MAX_FRAME]; > + dma_addr_t fld_output_pa[FLD_MAX_FRAME]; > +}; > + > +struct imem_buf_info { > + void *va; > + dma_addr_t pa; > + unsigned int size; > + unsigned int reserved; > +}; > + > +struct fd_buffer { > + /* used by DMA HW */ > + u32 dma_addr; > +}; > + > +struct aie_clocks { > + struct clk_bulk_data *clks; > + unsigned int clk_num; > +}; > + > +struct mtk_aie_req_work { > + struct work_struct work; > + struct mtk_aie_dev *fd_dev; > +}; > + > +struct mtk_aie_variant { > + unsigned int y2r_cfg_size; > + unsigned int rs_cfg_size; > + unsigned int fd_cfg_size; > +}; > + > +struct mtk_aie_dev { > + struct device *dev; > + struct mtk_aie_ctx *ctx; > + struct v4l2_m2m_dev *m2m_dev; > + struct device *larb; > + struct aie_para *base_para; > + struct aie_attr_para *attr_para; > + struct aie_fd_dma_para *dma_para; > + > + struct aie_fd_fld_para *fld_para; > + > + struct aie_enq_info *aie_cfg; > + struct workqueue_struct *frame_done_wq; > + void __iomem *fd_base; > + const struct mtk_aie_variant *variant; > + > + /* Input Buffer Pointer */ > + struct imem_buf_info rs_cfg_data; > + struct imem_buf_info fd_cfg_data; > + struct imem_buf_info yuv2rgb_cfg_data; > + /* HW Output Buffer Pointer */ > + struct imem_buf_info rs_output_hw; > + struct imem_buf_info fd_dma_hw; > + struct imem_buf_info fd_dma_result_hw; > + struct imem_buf_info fd_kernel_hw; > + struct imem_buf_info fd_attr_dma_hw; > + struct aie_static_info st_info; > + > + struct aie_reg_cfg reg_cfg; > + > + /* Fld fw buffer */ > + struct media_device mdev; > + struct video_device vfd; > + struct aie_clocks aie_clk; > + struct v4l2_device v4l2_dev; > + > + /* Lock for V4L2 operations */ > + struct mutex vfd_lock; > + /* Lock for device operations */ > + struct mutex dev_lock; > + /* Lock for performance optimization */ > + struct mutex fd_lock; > + struct imem_buf_info fd_fld_step_data; > + struct imem_buf_info fd_fld_out_hw; > + > + int irq; > + struct completion fd_job_finished; > + struct delayed_work job_timeout_work; > + > + /* DRAM Buffer Size */ > + unsigned int fd_rs_cfg_size; > + unsigned int fd_fd_cfg_size; > + unsigned int fd_yuv2rgb_cfg_size; > + unsigned int attr_fd_cfg_size; > + unsigned int attr_yuv2rgb_cfg_size; > + > + /* HW Output Buffer Size */ > + unsigned int rs_pym_out_size[PYM_NUM]; > + unsigned int fd_dma_max_size; > + unsigned int fd_dma_rst_max_size; > + unsigned int fd_fd_kernel_size; > + unsigned int fd_attr_kernel_size; > + unsigned int fd_attr_dma_max_size; > + unsigned int fd_attr_dma_rst_max_size; > + > + /* Fld size */ > + unsigned int fld_step_size; > + unsigned int fld_out_size; > + > + wait_queue_head_t flushing_waitq; > + atomic_t num_composing; > + struct mtk_aie_req_work req_work; > + unsigned int fd_state; > + unsigned int fd_mem_size; > + u32 fd_stream_count; > +}; > + > +struct mtk_aie_ctx { > + struct mtk_aie_dev *fd_dev; > + struct device *dev; > + struct v4l2_fh fh; > + struct v4l2_ctrl_handler hdl; > + struct v4l2_pix_format_mplane src_fmt; > + struct v4l2_meta_format dst_fmt; > + struct v4l2_ctrl_aie_init user_init; > + struct v4l2_ctrl_aie_param user_param; > +}; > + > +void aie_reset(struct mtk_aie_dev *fd); > +int aie_init(struct mtk_aie_dev *fd, struct v4l2_ctrl_aie_init > *user_init); > +void aie_uninit(struct mtk_aie_dev *fd); > +void aie_prepare(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg); > +void aie_execute(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg); > +void aie_irqhandle(struct mtk_aie_dev *fd); > +void aie_get_fd_result(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg); > +void aie_get_attr_result(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg); > +void aie_get_fld_result(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg); > +#endif /*__MTK_AIE_H__*/ > diff --git a/drivers/media/platform/mediatek/aie/mtk_aie_drv.c > b/drivers/media/platform/mediatek/aie/mtk_aie_drv.c > new file mode 100644 > index 000000000000..a327d55e2ef2 > --- /dev/null > +++ b/drivers/media/platform/mediatek/aie/mtk_aie_drv.c > @@ -0,0 +1,2398 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (c) 2020 MediaTek Inc. > + * Author: Bo Kong <bo.kong@mediatek.com> > + */ > + > +#include <linux/firmware.h> > +#include <linux/mtk_aie_v4l2_controls.h> > +#include "mtk_aie.h" > + > +static u32 aie_cmb_u16(u16 low, u16 high) > +{ > + return ((u32)high << 16) | low; > +} > + > +static u32 aie_cmb_stride(u16 low, u16 high) > +{ > + return ((u32)high << 16) | (low & 0x000F); > +} > + > +static inline u16 dif_x(const struct aie_enq_info *aie_cfg) > +{ > + return (u16)(aie_cfg->src_roi.x2 - aie_cfg->src_roi.x1); > +} > + > +static inline u16 dif_y(const struct aie_enq_info *aie_cfg) > +{ > + return (u16)(aie_cfg->src_roi.y2 - aie_cfg->src_roi.y1); > +} > + > +static inline void set_cmb_cfg(u32 *tbl, u16 index, u16 stride) > +{ > + tbl[index] = aie_cmb_u16(tbl[index], stride); > +} > + > +static inline void set_cmbst_cfg(u32 *tbl, u16 index, u16 stride) > +{ > + tbl[index] = aie_cmb_stride(tbl[index], stride); > +} > + > +static int get_fd_ker_rdma_size(int i, int j) > +{ > + if (i < 0 || i >= FD_LOOP_NUM || j < 0 || j >= KERNEL_RDMA_RA_NUM) > + return 0; > + > + return fd_ker_rdma_size[i % RPN_LOOP_NUM]; > +} > + > +static int get_fd_stride(int index) > +{ > + int i; > + > + for (i = 0; i < FD_STRIDE_SIZE; i++) { > + if (fd_stride_indices[i] == index) > + return 2; > + } > + return 1; > +} > + > +static int is_fd_maxpool(int index) > +{ > + int i; > + > + for (i = 0; i < FD_MAXPOOL_SIZE; i++) { > + if (fd_maxpool_indices[i] == index) > + return 1; > + } > + return 0; > +} > + > +static int get_attr_ker_rdma_size(int index, int j) > +{ > + if (index < 0 || index >= ATTR_LOOP_NUM || j < 0 || j >= > KERNEL_RDMA_RA_NUM) > + return 0; > + > + return attr_ker_rdma_size[index]; > +} > + > +static inline unsigned int fld_step_align_size(int step_num, int > frame) > +{ > + if (step_num < 0 || step_num >= FLD_STEP_NUM || frame < 0 || frame > >= FLD_MAX_FRAME) > + return 0; > + > + switch (step_num) { > + case FLD_STEP_BLINK: > + return frame == (FLD_MAX_FRAME - 1) ? 6528 : 0; > + case FLD_STEP_CV: > + return frame == 0 ? 1536 : 1280; > + case FLD_STEP_FP: > + return 5376; > + case FLD_STEP_LEAF: > + return 307200; > + case FLD_STEP_KM02: > + case FLD_STEP_KM13: > + return 8064; > + } > + return 0; > +} > + > +static int aie_imem_alloc(struct mtk_aie_dev *fd, u32 size, > + struct imem_buf_info *bufinfo) > +{ > + struct device *dev = fd->dev; > + dma_addr_t dma_handle; > + void *va; > + > + if (size == 0) { > + dev_dbg(fd->dev, "%s: size(%d)\n", __func__, size); > + return -EINVAL; > + } > + > + fd->fd_mem_size += size; > + > + va = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL); > + if (!va || dma_handle == 0) > + return -ENOMEM; > + > + bufinfo->va = va; > + bufinfo->pa = dma_handle; > + bufinfo->size = size; > + > + dev_dbg(fd->dev, "%s: vAddr(0x%p) size(%d)\n", __func__, va, > size); > + > + return 0; > +} > + > +static void aie_imem_free(struct mtk_aie_dev *fd, struct > imem_buf_info *bufinfo) > +{ > + dev_dbg(fd->dev, "%s: vAddr(0x%p) size(%d)\n", __func__, bufinfo- > >va, bufinfo->size); > + > + if (bufinfo->va) > + dma_free_coherent(fd->dev, bufinfo->size, bufinfo->va, bufinfo- > >pa); > +} > + > +static void aie_update_table(struct mtk_aie_dev *fd, u16 pym_width, > + u16 pym_height, bool binit) > +{ > + int i, stridesize; > + struct aie_static_info *pstv = &fd->st_info; > + > + pstv->inf_elm[PYMX_START_LOOP(2)].img_width = pym_width / 4; > + pstv->inf_elm[PYMX_START_LOOP(2)].img_height = pym_height / 4; > + > + pstv->inf_elm[PYMX_START_LOOP(1)].img_width = pym_width / 2; > + pstv->inf_elm[PYMX_START_LOOP(1)].img_height = pym_height / 2; > + > + pstv->inf_elm[PYMX_START_LOOP(0)].img_width = pym_width; > + pstv->inf_elm[PYMX_START_LOOP(0)].img_height = pym_height; > + > + for (i = 0; i < FD_LOOP_NUM; i++) { > + if (i != PYMX_START_LOOP(2) && i != PYMX_START_LOOP(1) && > + i != PYMX_START_LOOP(0)) { > + if (fd_out_stride2_in[i] == 1) { > + pstv->inf_elm[i].img_width = > + pstv->inf_elm[i - 1].stride2_out_width; > + pstv->inf_elm[i].img_height = > + pstv->inf_elm[i - 1].stride2_out_height; > + } else { > + pstv->inf_elm[i].img_width = pstv->inf_elm[i - 1].out_width; > + pstv->inf_elm[i].img_height = pstv->inf_elm[i - 1].out_height; > + } > + } > + > + stridesize = get_fd_stride(i); > + if (is_fd_maxpool(i)) { > + pstv->inf_elm[i].out_width = > + (pstv->inf_elm[i].img_width - 1) / 2 + 1; > + pstv->inf_elm[i].out_height = > + (pstv->inf_elm[i].img_height - 1) / 2 + 1; > + } else { > + pstv->inf_elm[i].out_width = (pstv->inf_elm[i].img_width - 1) / > + stridesize + 1; > + pstv->inf_elm[i].out_height = (pstv->inf_elm[i].img_height - 1) / > + stridesize + 1; > + } > + > + pstv->inf_elm[i].stride2_out_width = > + ((pstv->inf_elm[i].out_width - 1) / 2 + 1) * out_2size[i]; > + pstv->inf_elm[i].stride2_out_height = > + ((pstv->inf_elm[i].out_height - 1) / 2 + 1) * out_2size[i]; > + > + if (outlayer[i] == 1) { > + pstv->inf_elm[i].out_xsize_plus_1 = > + pstv->inf_elm[i].out_width * out_ch_pack[i] * 2; > + pstv->inf_elm[i].out_stride = > + round_up(pstv->inf_elm[i].out_xsize_plus_1 * ANCHOR_EN_NUM, 16); > + pstv->inf_elm[i].out_xsize_plus_1_stride2 = > + ((pstv->inf_elm[i].out_width - 1) / 2 + 1) * > + out_ch_pack[i] * 2 * out_2size[i]; > + } else { > + pstv->inf_elm[i].out_xsize_plus_1 = > + pstv->inf_elm[i].out_width * out_ch_pack[i]; > + pstv->inf_elm[i].out_stride = > + round_up(pstv->inf_elm[i].out_xsize_plus_1, 16); > + pstv->inf_elm[i].out_xsize_plus_1_stride2 = > + ((pstv->inf_elm[i].out_width - 1) / 2 + 1) * > + out_ch_pack[i] * out_2size[i]; > + } > + > + pstv->inf_elm[i].out_stride_stride2 = > + round_up(pstv->inf_elm[i].out_xsize_plus_1_stride2, 16); > + > + if (out_2size[i] == 1) > + pstv->inf_elm[i].out_ysize_plus_1_stride2 = > + (pstv->inf_elm[i].out_height - 1) / 2 + 1; > + else > + pstv->inf_elm[i].out_ysize_plus_1_stride2 = pstv- > >inf_elm[i].out_height; > + > + if (binit) { > + if (i == RPNX_LOOP_NUM(2) || i == RPNX_LOOP_NUM(1) || i == > RPNX_LOOP_NUM(0)) > + pstv->inf_elm[i].fd_wdma_size[0] = RESULT_SIZE; > + else > + pstv->inf_elm[i].fd_wdma_size[0] = pstv->inf_elm[i].out_height * > + pstv->inf_elm[i].out_stride; > + > + if (outlayer[i] == 1) { > + if (out_stride_size[i][1]) > + pstv->inf_elm[i].fd_wdma_size[1] = > + pstv->inf_elm[i].fd_wdma_size[0]; > + if (out_stride_size[i][2]) > + pstv->inf_elm[i].fd_wdma_size[2] = > + pstv->inf_elm[i].fd_wdma_size[0]; > + if (out_stride_size[i][3]) > + pstv->inf_elm[i].fd_wdma_size[3] = > + pstv->inf_elm[i].fd_wdma_size[0]; > + } else { > + if (out_stride_size[i][1]) > + pstv->inf_elm[i].fd_wdma_size[1] = > + pstv->inf_elm[i].out_height * > + pstv->inf_elm[i].out_stride; > + if (out_stride_size[i][2]) > + pstv->inf_elm[i].fd_wdma_size[2] = > + pstv->inf_elm[i].out_ysize_plus_1_stride2 * > + pstv->inf_elm[i].out_stride_stride2; > + if (out_stride_size[i][3]) > + pstv->inf_elm[i].fd_wdma_size[3] = > + pstv->inf_elm[i].out_ysize_plus_1_stride2 * > + pstv->inf_elm[i].out_stride_stride2; > + } > + } > + > + if (in_ch_pack[i] == 1) > + pstv->inf_elm[i].input_xsize_plus_1 = > + round_up(pstv->inf_elm[i].img_width, 8); > + else > + pstv->inf_elm[i].input_xsize_plus_1 = > + pstv->inf_elm[i].img_width * in_ch_pack[i]; > + } > +} > + > +static void aie_update_buf_params(struct mtk_aie_dev *fd, u16 > max_img_width, > + u16 max_img_height) > +{ > + struct aie_static_info *pstv = &fd->st_info; > + u32 size; > + u8 i, j; > + > + fd->base_para->max_img_rect.width = max_img_width; > + fd->base_para->max_img_rect.height = max_img_height; > + fd->fd_dma_max_size = 0; > + fd->fd_dma_rst_max_size = 0; > + fd->fd_fd_kernel_size = 0; > + fd->fd_attr_kernel_size = 0; > + fd->fd_attr_dma_max_size = 0; > + fd->fd_attr_dma_rst_max_size = 0; > + > + /* FDMODE Dram Buffer Size */ > + fd->fd_rs_cfg_size = 4 * fd->variant->rs_cfg_size * 2; > + fd->fd_fd_cfg_size = 4 * fd->variant->fd_cfg_size * FD_LOOP_NUM; > + fd->fd_yuv2rgb_cfg_size = 4 * fd->variant->y2r_cfg_size; > + > + /* ATTRMODE Dram Buffer Size */ > + fd->attr_fd_cfg_size = 4 * fd->variant->fd_cfg_size * > ATTR_LOOP_NUM; > + fd->attr_yuv2rgb_cfg_size = 4 * fd->variant->y2r_cfg_size; > + > + /* HW Output Buffer Size */ > + fd->rs_pym_out_size[0] = fd->base_para->max_pyramid_rect.width * > + fd->base_para->max_pyramid_rect.height; > + fd->rs_pym_out_size[1] = fd->rs_pym_out_size[0] / 2; > + fd->rs_pym_out_size[2] = fd->rs_pym_out_size[0] / 4; > + > + /* FDMODE Dram Buffer Size */ > + for (i = 0; i < FD_LOOP_NUM; i++) { > + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { > + if (out_stride_size[i][j]) { > + if ((i == RPNX_LOOP_NUM(2) || i == RPNX_LOOP_NUM(1) || > + i == RPNX_LOOP_NUM(0)) && j == 0) > + fd->fd_dma_rst_max_size += pstv->inf_elm[i].fd_wdma_size[j]; > + else > + fd->fd_dma_max_size += pstv->inf_elm[i].fd_wdma_size[j]; > + } > + } > + } > + > + for (i = 0; i < FD_LOOP_NUM; i++) { > + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { > + size = get_fd_ker_rdma_size(i, j); > + if (size) > + fd->fd_fd_kernel_size += size; > + } > + } > + > + /* ATTRMODE Dram Buffer Size */ > + for (i = 0; i < ATTR_LOOP_NUM; i++) { > + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { > + if (attr_wdma_en[i][j]) { > + if ((i == AGE_OUT_RGS || i == GENDER_OUT_RGS || > + i == INDIAN_OUT_RGS || i == ETHNICITY_OUT_RGS) && j == 0) > + fd->fd_attr_dma_rst_max_size += > + ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; > + else > + fd->fd_attr_dma_max_size += attr_wdma_size[i][j]; > + } > + } > + } > + > + for (i = 0; i < ATTR_LOOP_NUM; i++) { > + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) > + fd->fd_attr_kernel_size += get_attr_ker_rdma_size(i, j); > + } > + > + /* FD Pose secure result output buffer: result size * 3 loops */ > + fd->fd_dma_rst_max_size += RESULT_SIZE * 3; > + > + /* FLD size */ > + fd->fld_step_size = 0; > + for (i = 0; i < FLD_STEP_NUM; i++) > + for (j = 0; j < FLD_MAX_FRAME; j++) > + fd->fld_step_size += fld_step_align_size(i, j); > + > + fd->fld_out_size = FLD_OUTPUT_SIZE * FLD_MAX_FRAME; > +} > + > +static int aie_alloc_dram_buf(struct mtk_aie_dev *fd) > +{ > + u32 alloc_size; > + int ret; > + u8 i; > + > + /* RS DRAM */ > + alloc_size = fd->fd_rs_cfg_size; > + dev_dbg(fd->dev, "RS CFG:"); > + ret = aie_imem_alloc(fd, alloc_size, &fd->rs_cfg_data); > + if (ret) > + goto dma_alloc_fail; > + /* FD MODE */ > + fd->base_para->fd_rs_cfg_pa = fd->rs_cfg_data.pa; > + fd->base_para->fd_rs_cfg_va = fd->rs_cfg_data.va; > + > + /* FD DRAM */ > + alloc_size = fd->fd_fd_cfg_size + fd->attr_fd_cfg_size * > MAX_ENQUE_FRAME_NUM; > + dev_dbg(fd->dev, "FD CFG:"); > + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_cfg_data); > + if (ret) > + goto dma_alloc_fail; > + /* FD MODE */ > + fd->base_para->fd_fd_cfg_pa = fd->fd_cfg_data.pa; > + fd->base_para->fd_fd_cfg_va = fd->fd_cfg_data.va; > + /* ATTR MODE */ > + fd->base_para->attr_fd_cfg_pa[0] = > + fd->base_para->fd_fd_cfg_pa + fd->fd_fd_cfg_size; > + fd->base_para->attr_fd_cfg_va[0] = > + fd->base_para->fd_fd_cfg_va + fd->fd_fd_cfg_size; > + > + for (i = 1; i < MAX_ENQUE_FRAME_NUM; i++) { > + fd->base_para->attr_fd_cfg_pa[i] = > + fd->base_para->attr_fd_cfg_pa[i - 1] + fd->attr_fd_cfg_size; > + fd->base_para->attr_fd_cfg_va[i] = > + fd->base_para->attr_fd_cfg_va[i - 1] + fd->attr_fd_cfg_size; > + } > + > + /* YUV2RGB DRAM */ > + alloc_size = fd->fd_yuv2rgb_cfg_size + > + fd->attr_yuv2rgb_cfg_size * MAX_ENQUE_FRAME_NUM; > + dev_dbg(fd->dev, "YUV2RGB CFG:"); > + ret = aie_imem_alloc(fd, alloc_size, &fd->yuv2rgb_cfg_data); > + if (ret) > + goto dma_alloc_fail; > + /* FD MODE */ > + fd->base_para->fd_yuv2rgb_cfg_pa = fd->yuv2rgb_cfg_data.pa; > + fd->base_para->fd_yuv2rgb_cfg_va = fd->yuv2rgb_cfg_data.va; > + > + /* ATTR MODE */ > + fd->base_para->attr_yuv2rgb_cfg_pa[0] = > + fd->base_para->fd_yuv2rgb_cfg_pa + fd->fd_yuv2rgb_cfg_size; > + fd->base_para->attr_yuv2rgb_cfg_va[0] = > + fd->base_para->fd_yuv2rgb_cfg_va + fd->fd_yuv2rgb_cfg_size; > + > + for (i = 1; i < MAX_ENQUE_FRAME_NUM; i++) { > + fd->base_para->attr_yuv2rgb_cfg_pa[i] = > + fd->base_para->attr_yuv2rgb_cfg_pa[i - 1] + fd- > >attr_yuv2rgb_cfg_size; > + fd->base_para->attr_yuv2rgb_cfg_va[i] = > + fd->base_para->attr_yuv2rgb_cfg_va[i - 1] + fd- > >attr_yuv2rgb_cfg_size; > + } > + > + return 0; > +dma_alloc_fail: > + aie_imem_free(fd, &fd->fd_cfg_data); > + aie_imem_free(fd, &fd->rs_cfg_data); > + > + return ret; > +} > + > +static int aie_alloc_output_buf(struct mtk_aie_dev *fd) > +{ > + u32 alloc_size = 0, pa_off = 0, va_off = 0; > + int i, j; > + int ret; > + > + for (i = 0; i < PYM_NUM; i++) > + alloc_size += fd->rs_pym_out_size[i] * 3; > + dev_dbg(fd->dev, "RS OUT:"); > + ret = aie_imem_alloc(fd, alloc_size, &fd->rs_output_hw); > + if (ret) > + return ret; > + > + for (i = 0; i < PYM_NUM; i++) { > + for (j = 0; j < COLOR_NUM; j++) { > + fd->base_para->rs_pym_rst_pa[i][j] = fd->rs_output_hw.pa + pa_off; > + pa_off += fd->rs_pym_out_size[i]; > + > + fd->base_para->rs_pym_rst_va[i][j] = fd->rs_output_hw.va + va_off; > + va_off += fd->rs_pym_out_size[i]; > + } > + } > + > + return 0; > +} > + > +static void aie_alloc_normal(struct mtk_aie_dev *fd, int start, int > end) > +{ > + struct aie_static_info *pstv = &fd->st_info; > + int i, j, pi, pj; > + > + if (start <= 0 || end <= start || end >= FD_LOOP_NUM) { > + dev_err(fd->dev, "%s: start = %d, end = %d\n", __func__, start, > end); > + return; > + } > + > + pi = start - 1; > + pj = 0; > + for (i = start; i < end + 1; i++) { > + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { > + if (out_stride_size[i][j]) { > + fd->dma_para->fd_out_hw_pa[i][j] = > + fd->dma_para->fd_out_hw_pa[pi][pj] + > + pstv->inf_elm[pi].fd_wdma_size[pj]; > + pi = i; > + pj = j; > + } > + } > + } > +} > + > +static int aie_alloc_fddma_buf(struct mtk_aie_dev *fd) > +{ > + u32 alloc_size; > + int ret; > + > + alloc_size = fd->fd_dma_max_size; > + dev_dbg(fd->dev, "FD DMA:"); > + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_dma_hw); > + if (ret) > + goto dma_alloc_fail; > + > + alloc_size = fd->fd_fd_kernel_size + fd->fd_attr_kernel_size; > + dev_dbg(fd->dev, "FD KERNEL:"); > + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_kernel_hw); > + if (ret) > + goto dma_alloc_fail; > + > + alloc_size = fd->fd_attr_dma_max_size; > + dev_dbg(fd->dev, "ATTR DMA:"); > + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_attr_dma_hw); > + if (ret) > + goto dma_alloc_fail; > + > + alloc_size = fd->fd_dma_rst_max_size + fd- > >fd_attr_dma_rst_max_size; > + dev_dbg(fd->dev, "RESULT DMA:"); > + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_dma_result_hw); > + if (ret) > + goto dma_alloc_fail; > + > + return 0; > + > +dma_alloc_fail: > + aie_imem_free(fd, &fd->fd_attr_dma_hw); > + aie_imem_free(fd, &fd->fd_kernel_hw); > + aie_imem_free(fd, &fd->fd_dma_hw); > + > + return ret; > +} > + > +static int aie_alloc_fld_buf(struct mtk_aie_dev *fd) > +{ > + u32 alloc_size; > + int ret; > + > + alloc_size = fd->fld_step_size; > + dev_dbg(fd->dev, "FLD STEP:"); > + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_fld_step_data); > + if (ret) > + return ret; > + > + alloc_size = fd->fld_out_size; > + dev_dbg(fd->dev, "FLD OUT:"); > + ret = aie_imem_alloc(fd, alloc_size, &fd->fd_fld_out_hw); > + if (ret) > + goto fld_step; > + > + return 0; > +fld_step: > + aie_imem_free(fd, &fd->fd_fld_step_data); > + > + return ret; > +} > + > +static void aie_calculate_pa(struct mtk_aie_dev *fd, struct > aie_static_info *pstv, int i) > +{ > + fd->dma_para->fd_out_hw_pa[i][0] = fd->dma_para->fd_out_hw_pa[i - > 1][1] + > + pstv->inf_elm[i - 1].fd_wdma_size[1]; > + fd->dma_para->fd_out_hw_pa[i][1] = fd->dma_para->fd_out_hw_pa[i][0] > + > + pstv->inf_elm[i].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 1][0] = fd->dma_para- > >fd_out_hw_pa[i][0] + > + 2 * pstv->inf_elm[i + 1].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 1][1] = fd->dma_para- > >fd_out_hw_pa[i][0] + > + 3 * pstv->inf_elm[i + 1].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 2][0] = fd->dma_para- > >fd_out_hw_pa[i][0] + > + 4 * pstv->inf_elm[i + 2].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 3][0] = fd->dma_para- > >fd_out_hw_pa[i][0] + > + pstv->inf_elm[i].fd_wdma_size[0] + > + pstv->inf_elm[i].fd_wdma_size[1] + > + pstv->inf_elm[i + 1].fd_wdma_size[0] + > + pstv->inf_elm[i + 1].fd_wdma_size[1] + > + pstv->inf_elm[i + 2].fd_wdma_size[0]; > + fd->dma_para->fd_out_hw_pa[i + 3][1] = fd->dma_para->fd_out_hw_pa[i > + 3][0] + > + pstv->inf_elm[i + 3].fd_wdma_size[0] + > + pstv->inf_elm[i + 3].fd_wdma_size[2] + > + pstv->inf_elm[i + 4].fd_wdma_size[0] + > + pstv->inf_elm[i + 4].fd_wdma_size[2] + > + pstv->inf_elm[i + 5].fd_wdma_size[0]; > + fd->dma_para->fd_out_hw_pa[i + 3][2] = fd->dma_para->fd_out_hw_pa[i > + 3][0] + > + pstv->inf_elm[i + 3].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 3][3] = fd->dma_para->fd_out_hw_pa[i > + 3][1] + > + pstv->inf_elm[i + 3].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 4][0] = fd->dma_para->fd_out_hw_pa[i > + 3][0] + > + 2 * pstv->inf_elm[i + 4].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 4][1] = fd->dma_para->fd_out_hw_pa[i > + 3][1] + > + 2 * pstv->inf_elm[i + 4].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 4][2] = fd->dma_para->fd_out_hw_pa[i > + 3][0] + > + 3 * pstv->inf_elm[i + 4].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 4][3] = fd->dma_para->fd_out_hw_pa[i > + 3][1] + > + 3 * pstv->inf_elm[i + 4].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 5][0] = fd->dma_para->fd_out_hw_pa[i > + 3][0] + > + 4 * pstv->inf_elm[i + 5].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 5][1] = fd->dma_para->fd_out_hw_pa[i > + 3][1] + > + 4 * pstv->inf_elm[i + 5].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 6][0] = fd->dma_para->fd_out_hw_pa[i > + 3][1] + > + pstv->inf_elm[i + 3].fd_wdma_size[1] + > + pstv->inf_elm[i + 3].fd_wdma_size[3] + > + pstv->inf_elm[i + 4].fd_wdma_size[1] + > + pstv->inf_elm[i + 4].fd_wdma_size[3] + > + pstv->inf_elm[i + 5].fd_wdma_size[1]; > + fd->dma_para->fd_out_hw_pa[i + 6][1] = fd->dma_para->fd_out_hw_pa[i > + 6][0] + > + pstv->inf_elm[i + 6].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 7][0] = fd->dma_para->fd_out_hw_pa[i > + 6][0] + > + 2 * pstv->inf_elm[i + 7].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 7][1] = fd->dma_para->fd_out_hw_pa[i > + 6][0] + > + 3 * pstv->inf_elm[i + 7].out_xsize_plus_1; > + fd->dma_para->fd_out_hw_pa[i + 8][0] = fd->dma_para->fd_out_hw_pa[i > + 6][0] + > + 4 * pstv->inf_elm[i + 8].out_xsize_plus_1; > +} > + > +static void aie_arrange_fddma_buf(struct mtk_aie_dev *fd) > +{ > + struct aie_static_info *pstv = &fd->st_info; > + dma_addr_t current_pa; > + void *current_va; > + u8 i, j; > + > + /* 0 ~ 18 */ > + fd->dma_para->fd_out_hw_pa[0][0] = fd->fd_dma_hw.pa; > + aie_alloc_normal(fd, 1, 18); > + > + /* 19 ~ 27 */ > + aie_calculate_pa(fd, pstv, 19); > + > + /* 29 ~ 47 */ > + fd->dma_para->fd_out_hw_pa[29][0] = fd->dma_para- > >fd_out_hw_pa[25][0] + > + pstv->inf_elm[25].fd_wdma_size[0] + > + pstv->inf_elm[25].fd_wdma_size[1] + > + pstv->inf_elm[26].fd_wdma_size[0] + > + pstv->inf_elm[26].fd_wdma_size[1] + > + pstv->inf_elm[27].fd_wdma_size[0]; > + aie_alloc_normal(fd, 30, 47); > + > + /* 48 ~ 56 */ > + aie_calculate_pa(fd, pstv, 48); > + > + /* 58 ~ 76 */ > + fd->dma_para->fd_out_hw_pa[58][0] = fd->dma_para- > >fd_out_hw_pa[54][0] + > + pstv->inf_elm[54].fd_wdma_size[0] + > + pstv->inf_elm[54].fd_wdma_size[1] + > + pstv->inf_elm[55].fd_wdma_size[0] + > + pstv->inf_elm[55].fd_wdma_size[1] + > + pstv->inf_elm[56].fd_wdma_size[0]; > + aie_alloc_normal(fd, 59, 76); > + > + /* 77 ~ 85 */ > + aie_calculate_pa(fd, pstv, 77); > + > + /* VA : except 28, 57, 86 */ > + /* 0 ~ 86 */ > + fd->dma_para->fd_out_hw_va[0][0] = fd->fd_dma_hw.va; > + for (i = 1; i < FD_LOOP_NUM; i++) { > + if (i == RPNX_LOOP_NUM(2) || i == RPNX_LOOP_NUM(1) || i == > RPNX_LOOP_NUM(0)) > + continue; > + for (j = 0; j < 4; j++) { > + if (out_stride_size[i][j]) { > + fd->dma_para->fd_out_hw_va[i][j] = fd->fd_dma_hw.va + > + fd->dma_para->fd_out_hw_pa[i][j] - fd->fd_dma_hw.pa; > + } > + } > + } > + > + current_pa = fd->dma_para->fd_out_hw_pa[83][0] + > + pstv->inf_elm[83].fd_wdma_size[0] + > + pstv->inf_elm[83].fd_wdma_size[1] + > + pstv->inf_elm[84].fd_wdma_size[0] + > + pstv->inf_elm[84].fd_wdma_size[1] + > + pstv->inf_elm[85].fd_wdma_size[0]; > + current_va = fd->dma_para->fd_out_hw_va[83][0] + > + pstv->inf_elm[83].fd_wdma_size[0] + > + pstv->inf_elm[83].fd_wdma_size[1] + > + pstv->inf_elm[84].fd_wdma_size[0] + > + pstv->inf_elm[84].fd_wdma_size[1] + > + pstv->inf_elm[85].fd_wdma_size[0]; > + > + dev_dbg(fd->dev, "%s: current VA = %p\n", __func__, current_va); > +} > + > +static void aie_arrange_kernel_buf(struct mtk_aie_dev *fd) > +{ > + void *current_va; > + dma_addr_t current_pa; > + u32 size, attrsize; > + u8 i, j; > + > + current_pa = fd->fd_kernel_hw.pa; > + current_va = fd->fd_kernel_hw.va; > + > + for (i = 0; i < FD_LOOP_NUM; i++) { > + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { > + size = get_fd_ker_rdma_size(i, j); > + if (size) { > + fd->dma_para->fd_kernel_pa[i][j] = current_pa; > + fd->dma_para->fd_kernel_va[i][j] = current_va; > + current_pa += size; > + current_va += size; > + } > + } > + } > + > + for (i = 0; i < ATTR_LOOP_NUM; i++) { > + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { > + attrsize = get_attr_ker_rdma_size(i, j); > + fd->dma_para->attr_kernel_pa[i][j] = current_pa; > + fd->dma_para->attr_kernel_va[i][j] = current_va; > + current_pa += attrsize; > + current_va += attrsize; > + } > + } > + > + dev_dbg(fd->dev, "%s: current VA = %p\n", __func__, current_va); > +} > + > +static void aie_arrange_attrdma_buf(struct mtk_aie_dev *fd) > +{ > + void *current_va; > + dma_addr_t current_pa; > + u8 i, j; > + > + current_pa = fd->fd_attr_dma_hw.pa; > + current_va = fd->fd_attr_dma_hw.va; > + > + /* attribute mode */ > + for (i = 0; i < ATTR_LOOP_NUM; i++) { > + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { > + if (attr_wdma_en[i][j]) { > + fd->dma_para->attr_out_hw_pa[i][j] = current_pa; > + fd->dma_para->attr_out_hw_va[i][j] = current_va; > + current_pa += attr_wdma_size[i][j]; > + current_va += attr_wdma_size[i][j]; > + } > + } > + } > + > + dev_dbg(fd->dev, "%s: current VA = %p\n", __func__, current_va); > +} > + > +static void aie_arrange_result_dma_buf(struct mtk_aie_dev *fd) > +{ > + struct aie_static_info *pstv = &fd->st_info; > + dma_addr_t currentresult_pa; > + void *currentresult_va; > + u8 i; > + > + currentresult_pa = fd->fd_dma_result_hw.pa; > + currentresult_va = fd->fd_dma_result_hw.va; > + > + fd->dma_para->fd_out_hw_pa[RPNX_LOOP_NUM(2)][0] = currentresult_pa; > + fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(2)][0] = currentresult_va; > + currentresult_pa += pstv- > >inf_elm[RPNX_LOOP_NUM(2)].fd_wdma_size[0]; > + currentresult_va += pstv- > >inf_elm[RPNX_LOOP_NUM(2)].fd_wdma_size[0]; > + fd->dma_para->fd_out_hw_pa[RPNX_LOOP_NUM(1)][0] = currentresult_pa; > + fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(1)][0] = currentresult_va; > + currentresult_pa += pstv- > >inf_elm[RPNX_LOOP_NUM(1)].fd_wdma_size[0]; > + currentresult_va += pstv- > >inf_elm[RPNX_LOOP_NUM(1)].fd_wdma_size[0]; > + fd->dma_para->fd_out_hw_pa[RPNX_LOOP_NUM(0)][0] = currentresult_pa; > + fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0] = currentresult_va; > + currentresult_pa += pstv- > >inf_elm[RPNX_LOOP_NUM(0)].fd_wdma_size[0]; > + currentresult_va += pstv- > >inf_elm[RPNX_LOOP_NUM(0)].fd_wdma_size[0]; > + > + fd->dma_para->attr_out_hw_pa[AGE_OUT_RGS][0] = currentresult_pa; > + fd->dma_para->attr_out_hw_va[AGE_OUT_RGS][0] = currentresult_va; > + currentresult_pa += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; > + currentresult_va += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; > + fd->dma_para->attr_out_hw_pa[GENDER_OUT_RGS][0] = currentresult_pa; > + fd->dma_para->attr_out_hw_va[GENDER_OUT_RGS][0] = currentresult_va; > + currentresult_pa += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; > + currentresult_va += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; > + fd->dma_para->attr_out_hw_pa[INDIAN_OUT_RGS][0] = currentresult_pa; > + fd->dma_para->attr_out_hw_va[INDIAN_OUT_RGS][0] = currentresult_va; > + currentresult_pa += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; > + currentresult_va += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; > + fd->dma_para->attr_out_hw_pa[ETHNICITY_OUT_RGS][0] = > currentresult_pa; > + fd->dma_para->attr_out_hw_va[ETHNICITY_OUT_RGS][0] = > currentresult_va; > + currentresult_pa += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; > + currentresult_va += ATTR_OUT_SIZE * MAX_ENQUE_FRAME_NUM; > + > + /* need to prepare 10 buffers to store 10 times result */ > + fd->dma_para->age_out_hw_pa[0] = fd->dma_para- > >attr_out_hw_pa[AGE_OUT_RGS][0]; > + fd->dma_para->age_out_hw_va[0] = fd->dma_para- > >attr_out_hw_va[AGE_OUT_RGS][0]; > + fd->dma_para->gender_out_hw_pa[0] = fd->dma_para- > >attr_out_hw_pa[GENDER_OUT_RGS][0]; > + fd->dma_para->gender_out_hw_va[0] = fd->dma_para- > >attr_out_hw_va[GENDER_OUT_RGS][0]; > + fd->dma_para->is_indian_out_hw_pa[0] = > + fd->dma_para->attr_out_hw_pa[INDIAN_OUT_RGS][0]; > + fd->dma_para->is_indian_out_hw_va[0] = > + fd->dma_para->attr_out_hw_va[INDIAN_OUT_RGS][0]; > + fd->dma_para->ethnicity_out_hw_pa[0] = fd->dma_para- > >attr_out_hw_pa[ETHNICITY_OUT_RGS][0]; > + fd->dma_para->ethnicity_out_hw_va[0] = fd->dma_para- > >attr_out_hw_va[ETHNICITY_OUT_RGS][0]; > + > + for (i = 1; i < MAX_ENQUE_FRAME_NUM; i++) { > + fd->dma_para->age_out_hw_pa[i] = > + fd->dma_para->age_out_hw_pa[i - 1] + ATTR_OUT_SIZE; > + fd->dma_para->age_out_hw_va[i] = > + fd->dma_para->age_out_hw_va[i - 1] + ATTR_OUT_SIZE; > + fd->dma_para->gender_out_hw_pa[i] = > + fd->dma_para->gender_out_hw_pa[i - 1] + ATTR_OUT_SIZE; > + fd->dma_para->gender_out_hw_va[i] = > + fd->dma_para->gender_out_hw_va[i - 1] + ATTR_OUT_SIZE; > + fd->dma_para->is_indian_out_hw_pa[i] = > + fd->dma_para->is_indian_out_hw_pa[i - 1] + ATTR_OUT_SIZE; > + fd->dma_para->is_indian_out_hw_va[i] = > + fd->dma_para->is_indian_out_hw_va[i - 1] + ATTR_OUT_SIZE; > + fd->dma_para->ethnicity_out_hw_pa[i] = > + fd->dma_para->ethnicity_out_hw_pa[i - 1] + ATTR_OUT_SIZE; > + fd->dma_para->ethnicity_out_hw_va[i] = > + fd->dma_para->ethnicity_out_hw_va[i - 1] + ATTR_OUT_SIZE; > + } > + > + memset(fd->fd_dma_result_hw.va, 0, fd->fd_dma_result_hw.size); > + > + dev_dbg(fd->dev, "%s: current VA = %p\n", __func__, > currentresult_va); > +} > + > +static void aie_arrange_fld_buf(struct mtk_aie_dev *fd) > +{ > + unsigned int offset = 0; > + u8 i, j; > + > + for (i = 0; i < FLD_STEP_NUM; i++) { > + for (j = 0; j < FLD_MAX_FRAME; j++) { > + fd->fld_para->fld_step_va[i][j] = fd->fd_fld_step_data.va + offset; > + fd->fld_para->fld_step_pa[i][j] = fd->fd_fld_step_data.pa + offset; > + offset += fld_step_align_size(i, j); > + } > + } > + > + for (i = 0, offset = 0; i < FLD_MAX_FRAME; i++) { > + fd->fld_para->fld_output_va[i] = fd->fd_fld_out_hw.va + offset; > + fd->fld_para->fld_output_pa[i] = fd->fd_fld_out_hw.pa + offset; > + offset += FLD_OUTPUT_SIZE; > + } > +} > + > +static void aie_update_fddma_buf(struct mtk_aie_dev *fd) > +{ > + struct aie_static_info *pstv = &fd->st_info; > + u8 i, j; > + > + /* 19 ~ 27 */ > + aie_calculate_pa(fd, pstv, 19); > + > + /* 48 ~ 56 */ > + aie_calculate_pa(fd, pstv, 48); > + > + /* 77 ~ 85 */ > + aie_calculate_pa(fd, pstv, 77); > + > + /* VA : except 28, 57, 86 */ > + /* 0 ~ 86 */ > + fd->dma_para->fd_out_hw_va[0][0] = fd->fd_dma_hw.va; > + for (i = 1; i < FD_LOOP_NUM; i++) { > + if (i == RPNX_LOOP_NUM(2) || i == RPNX_LOOP_NUM(1) || i == > RPNX_LOOP_NUM(0)) > + continue; > + for (j = 0; j < 4; j++) { > + if (out_stride_size[i][j]) { > + fd->dma_para->fd_out_hw_va[i][j] = fd->fd_dma_hw.va + > + fd->dma_para->fd_out_hw_pa[i][j] - fd->fd_dma_hw.pa; > + } > + } > + } > +} > + > +static void aie_free_dram_buf(struct mtk_aie_dev *fd) > +{ > + aie_imem_free(fd, &fd->rs_cfg_data); > + aie_imem_free(fd, &fd->fd_cfg_data); > + aie_imem_free(fd, &fd->yuv2rgb_cfg_data); > +} > + > +static void aie_free_output_buf(struct mtk_aie_dev *fd) > +{ > + aie_imem_free(fd, &fd->rs_output_hw); > +} > + > +static void aie_free_fddma_buf(struct mtk_aie_dev *fd) > +{ > + aie_imem_free(fd, &fd->fd_dma_hw); > + aie_imem_free(fd, &fd->fd_kernel_hw); > + aie_imem_free(fd, &fd->fd_attr_dma_hw); > + aie_imem_free(fd, &fd->fd_dma_result_hw); > +} > + > +static void aie_free_fld_buf(struct mtk_aie_dev *fd) > +{ > + aie_imem_free(fd, &fd->fd_fld_step_data); > + aie_imem_free(fd, &fd->fd_fld_out_hw); > +} > + > +static int aie_copy_fw(struct mtk_aie_dev *fd, const char *name, > void *buf, > + unsigned int size) > +{ > + int ret; > + const struct firmware *fw = NULL; > + > + ret = request_firmware(&fw, name, fd->dev); > + if (ret == 0) { > + if (size >= fw->size) > + memcpy(buf, fw->data, fw->size); > + else > + ret = -EINVAL; > + } > + > + release_firmware(fw); > + > + return ret; > +} > + > +/** > + * The aie driver just loads the bin file provided by algorithm. > + * Driver cannot change path and file name preset by algorithm. > + * Otherwise it will not be able to get aie related data and > algorithms. > + */ > +static int aie_load_fw(struct mtk_aie_dev *fd) > +{ > + char *cfg_path = "aie_mp_fw31/config"; > + char *kel_path = "aie_mp_fw31/kernel"; > + char name[128] = {}; > + u32 size, attrsize; > + u8 i, j; > + int ret; > + > + /* aie_fd_fd_config.bin: Provide relevant configs necessary for fd > mode*/ > + ret = sprintf(name, "%s/aie_fd_fd_config.bin", cfg_path); > + if (ret < 0) > + return ret; > + > + ret = aie_copy_fw(fd, > + name, > + fd->base_para->fd_fd_cfg_va, > + fd->fd_fd_cfg_size); > + if (ret) > + return ret; > + > + /* aie_fd_rs_config.bin: Provide the necessary data for fd mode > resizer hardware */ > + ret = sprintf(name, "%s/aie_fd_rs_config.bin", cfg_path); > + if (ret < 0) > + return ret; > + > + ret = aie_copy_fw(fd, > + name, > + fd->base_para->fd_rs_cfg_va, > + fd->fd_rs_cfg_size); > + if (ret) > + return ret; > + > + /* aie_fd_yuv2rgb_config.bin: Provide fd mode yuv to rgb related > data */ > + ret = sprintf(name, "%s/aie_fd_yuv2rgb_config.bin", cfg_path); > + if (ret < 0) > + return ret; > + > + ret = aie_copy_fw(fd, > + name, > + fd->base_para->fd_yuv2rgb_cfg_va, > + fd->fd_yuv2rgb_cfg_size); > + if (ret) > + return ret; > + > + /* aie_attr_fd_config.bin: Provide relevant configs necessary for > attribute mode*/ > + ret = sprintf(name, "%s/aie_attr_fd_config.bin", cfg_path); > + if (ret < 0) > + return ret; > + > + ret = aie_copy_fw(fd, > + name, > + fd->base_para->attr_fd_cfg_va[0], > + fd->attr_fd_cfg_size); > + if (ret) > + return ret; > + > + /** aie_attr_yuv2rgb_config.bin: Provide the necessary data for > + * attribute mode resizer hardware > + */ > + ret = sprintf(name, "%s/aie_attr_yuv2rgb_config.bin", cfg_path); > + if (ret < 0) > + return ret; > + > + ret = aie_copy_fw(fd, > + name, > + fd->base_para->attr_yuv2rgb_cfg_va[0], > + fd->attr_yuv2rgb_cfg_size); > + if (ret) > + return ret; > + > + for (i = 1; i < MAX_ENQUE_FRAME_NUM; i++) { > + memcpy(fd->base_para->attr_fd_cfg_va[i], > + fd->base_para->attr_fd_cfg_va[0], fd->attr_fd_cfg_size); > + memcpy(fd->base_para->attr_yuv2rgb_cfg_va[i], > + fd->base_para->attr_yuv2rgb_cfg_va[0], > + fd->attr_yuv2rgb_cfg_size); > + } > + > + for (i = 0; i < FD_LOOP_NUM; i++) { > + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { > + size = get_fd_ker_rdma_size(i, j); > + if (size) { > + /* FD mode Hardware requires loop 87 times */ > + ret = sprintf(name, > + "%s/aie_fd_kernel_bias_loop%02d_%d.bin", > + kel_path, i, j); > + if (ret < 0) > + return ret; > + > + ret = aie_copy_fw(fd, name, > + fd->dma_para->fd_kernel_va[i][j], > + size); > + if (ret) > + return ret; > + } > + } > + } > + > + for (i = 0; i < ATTR_LOOP_NUM; i++) { > + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { > + attrsize = get_attr_ker_rdma_size(i, j); > + /* attribute mode Hardware requires loop 26 times */ > + ret = sprintf(name, > + "%s/aie_attr_kernel_bias_loop%02d_%d.bin", > + kel_path, i, j); > + if (ret < 0) > + return ret; > + > + ret = aie_copy_fw(fd, name, > + fd->dma_para->attr_kernel_va[i][j], > + attrsize); > + if (ret) > + return ret; > + } > + } > + > + /* The necessary data when the FLD mode blink model runs */ > + ret = sprintf(name, "%s/aie_fld_blink_weight_forest14.bin", > cfg_path); > + > + if (ret < 0) > + return ret; > + > + ret = aie_copy_fw(fd, name, > + fd->fld_para->fld_step_va[FLD_STEP_BLINK][14], > + fld_step_align_size(FLD_STEP_BLINK, 14)); > + if (ret) > + return ret; > + > + for (j = 0; j < FLD_MAX_FRAME; j++) { > + /* The necessary data when the FLD mode cv model runs */ > + ret = sprintf(name, > + "%s/aie_fld_cv_forest%02d_iom3.bin", cfg_path, j); > + if (ret < 0) > + return ret; > + ret = aie_copy_fw(fd, name, > + fd->fld_para->fld_step_va[FLD_STEP_CV][j], > + fld_step_align_size(FLD_STEP_CV, j)); > + if (ret) > + return ret; > + } > + > + for (j = 0; j < FLD_MAX_FRAME; j++) { > + /* The necessary data when the FLD mode fp model runs */ > + ret = sprintf(name, > + "%s/aie_fld_fp_forest%02d_om45.bin", cfg_path, j); > + if (ret < 0) > + return ret; > + > + ret = aie_copy_fw(fd, name, > + fd->fld_para->fld_step_va[FLD_STEP_FP][j], > + fld_step_align_size(FLD_STEP_FP, j)); > + if (ret) > + return ret; > + } > + > + for (j = 0; j < FLD_MAX_FRAME; j++) { > + /* The necessary data when the FLD mode leafnode model runs */ > + ret = sprintf(name, > + "%s/aie_fld_leafnode_forest%02d.bin", cfg_path, j); > + if (ret < 0) > + return ret; > + > + ret = aie_copy_fw(fd, name, > + fd->fld_para->fld_step_va[FLD_STEP_LEAF][j], > + fld_step_align_size(FLD_STEP_LEAF, j)); > + if (ret) > + return ret; > + } > + > + for (j = 0; j < FLD_MAX_FRAME; j++) { > + /* The necessary data when the FLD mode treekm02 model runs */ > + ret = sprintf(name, > + "%s/aie_fld_tree_forest%02d_km02.bin", cfg_path, j); > + if (ret < 0) > + return ret; > + ret = aie_copy_fw(fd, name, > + fd->fld_para->fld_step_va[FLD_STEP_KM02][j], > + fld_step_align_size(FLD_STEP_KM02, j)); > + if (ret) > + return ret; > + } > + > + for (j = 0; j < FLD_MAX_FRAME; j++) { > + /* The necessary data when the FLD mode treekm13 model runs */ > + ret = sprintf(name, > + "%s/aie_fld_tree_forest%02d_km13.bin", cfg_path, j); > + if (ret < 0) > + return ret; > + ret = aie_copy_fw(fd, name, > + fd->fld_para->fld_step_va[FLD_STEP_KM13][j], > + fld_step_align_size(FLD_STEP_KM13, j)); > + if (ret) > + return ret; > + } > + > + return 0; > +} > + > +static int aie_update_cfg(struct mtk_aie_dev *fd, struct > aie_enq_info *aie_cfg) > +{ > + int crop_width; > + int crop_height; > + > + if (aie_cfg->en_roi) { > + crop_width = dif_x(aie_cfg) + 1; > + crop_height = dif_y(aie_cfg) + 1; > + } else { > + crop_width = aie_cfg->src_img_width; > + crop_height = aie_cfg->src_img_height; > + } > + > + if (crop_width == 0 || crop_height == 0) { > + dev_err(fd->dev, "Invalid crop size 0x0\n"); > + return -EINVAL; > + } > + > + if (aie_cfg->en_padding) { > + crop_width += aie_cfg->src_padding.right + aie_cfg- > >src_padding.left; > + crop_height += aie_cfg->src_padding.up + aie_cfg->src_padding.down; > + } > + > + if (aie_cfg->sel_mode == FDMODE) { > + fd->base_para->sel_mode = aie_cfg->sel_mode; > + fd->base_para->crop_rect.width = crop_width; > + fd->base_para->crop_rect.height = crop_height; > + fd->base_para->src_img_addr = aie_cfg->src_img_addr; > + fd->base_para->src_img_addr_uv = aie_cfg->src_img_addr_uv; > + fd->base_para->img_rect.width = aie_cfg->src_img_width; > + fd->base_para->img_rect.height = aie_cfg->src_img_height; > + fd->base_para->src_img_fmt = aie_cfg->src_img_fmt; > + fd->base_para->rotate_degree = aie_cfg->rotate_degree; > + } else if (aie_cfg->sel_mode == ATTRIBUTEMODE) { > + fd->attr_para->sel_mode[fd->attr_para->w_idx] = aie_cfg->sel_mode; > + fd->attr_para->crop_width[fd->attr_para->w_idx] = crop_width; > + fd->attr_para->crop_height[fd->attr_para->w_idx] = crop_height; > + fd->attr_para->src_img_addr[fd->attr_para->w_idx] = aie_cfg- > >src_img_addr; > + fd->attr_para->src_img_addr_uv[fd->attr_para->w_idx] = aie_cfg- > >src_img_addr_uv; > + fd->attr_para->img_width[fd->attr_para->w_idx] = aie_cfg- > >src_img_width; > + fd->attr_para->img_height[fd->attr_para->w_idx] = aie_cfg- > >src_img_height; > + fd->attr_para->src_img_fmt[fd->attr_para->w_idx] = aie_cfg- > >src_img_fmt; > + fd->attr_para->rotate_degree[fd->attr_para->w_idx] = aie_cfg- > >rotate_degree; > + } > + > + return 0; > +} > + > +static int aie_config_y2r(struct mtk_aie_dev *fd, struct > aie_enq_info *aie_cfg, > + int mode) > +{ > + u32 img_addr; > + u32 img_addr_UV; > + u32 img_off; > + u32 img_off_uv; > + u32 *yuv2rgb_cfg; > + u32 srcbuf, srcbuf_UV; > + u16 xmag_0, ymag_0; > + u16 pym0_out_w; > + u16 pym0_out_h; > + u16 stride_pym0_out_w; > + u16 sr_crp_w; > + u16 sr_crp_h; > + u16 y1_stride; > + > + if (!aie_cfg->en_roi) { > + img_off = 0; > + img_off_uv = 0; > + } else { > + if (aie_cfg->src_img_fmt == FMT_MONO || aie_cfg->src_img_fmt == > FMT_YUV_2P || > + aie_cfg->src_img_fmt == FMT_YVU_2P) { > + y1_stride = aie_cfg->src_img_stride * aie_cfg->src_roi.y1; > + img_off = y1_stride + aie_cfg->src_roi.x1; > + img_off_uv = y1_stride + aie_cfg->src_roi.x1; > + } else if (aie_cfg->src_img_fmt == FMT_YUV420_2P || > + aie_cfg->src_img_fmt == FMT_YUV420_1P) { > + y1_stride = aie_cfg->src_img_stride * aie_cfg->src_roi.y1; > + img_off = y1_stride + aie_cfg->src_roi.x1; > + img_off_uv = y1_stride / 2 + aie_cfg->src_roi.x1; > + } else if (aie_cfg->src_img_fmt == FMT_YUYV || > + aie_cfg->src_img_fmt == FMT_YVYU || > + aie_cfg->src_img_fmt == FMT_UYVY || > + aie_cfg->src_img_fmt == FMT_VYUY) { > + y1_stride = aie_cfg->src_img_stride * aie_cfg->src_roi.y1; > + img_off = y1_stride + aie_cfg->src_roi.x1 * 2; > + img_off_uv = y1_stride + aie_cfg->src_roi.x1 * 2; > + } else { > + dev_err(fd->dev, "Unsupport input format %d", aie_cfg- > >src_img_fmt); > + return -EINVAL; > + } > + } > + > + img_addr = aie_cfg->src_img_addr + img_off; > + img_addr_UV = aie_cfg->src_img_addr_uv + img_off_uv; > + > + srcbuf = img_addr; > + if (aie_cfg->src_img_fmt == FMT_YUV420_2P || aie_cfg->src_img_fmt > == FMT_YUV420_1P || > + aie_cfg->src_img_fmt == FMT_YUV_2P || aie_cfg->src_img_fmt == > FMT_YVU_2P) > + srcbuf_UV = img_addr_UV; > + else > + srcbuf_UV = 0; > + > + if (mode == FDMODE) { > + sr_crp_w = fd->base_para->crop_rect.width; > + sr_crp_h = fd->base_para->crop_rect.height; > + yuv2rgb_cfg = (u32 *)fd->base_para->fd_yuv2rgb_cfg_va; > + pym0_out_w = fd->base_para->pyramid_rect.width; > + } > + if (mode == ATTRIBUTEMODE) { > + sr_crp_w = fd->attr_para->crop_width[fd->attr_para->w_idx]; > + sr_crp_h = fd->attr_para->crop_height[fd->attr_para->w_idx]; > + yuv2rgb_cfg = (u32 *)fd->base_para->attr_yuv2rgb_cfg_va[fd- > >attr_para->w_idx]; > + pym0_out_w = ATTR_MODE_PYRAMID_WIDTH; > + } > + > + pym0_out_h = pym0_out_w * sr_crp_h / sr_crp_w; > + > + if (pym0_out_w != 0) { > + xmag_0 = 512 * sr_crp_w / pym0_out_w; > + ymag_0 = xmag_0; > + } else { > + xmag_0 = 0; > + ymag_0 = 0; > + } > + > + yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] = (yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] > & 0xFFFFFFF8) | > + ((aie_cfg->src_img_fmt) & 0x7); > + if (aie_cfg->src_img_fmt == FMT_YUV420_2P || aie_cfg->src_img_fmt > == FMT_YUV420_1P) { > + /* for match patten */ > + yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] = (yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] > & > + 0xFFFFFFF8) | ((0x3) & 0x7); > + } > + yuv2rgb_cfg[Y2R_IN_W_H] = (yuv2rgb_cfg[Y2R_IN_W_H] & 0xF800F800) | > + ((sr_crp_w << 16) & 0x7FF0000) | (sr_crp_h & 0x7FF); > + yuv2rgb_cfg[Y2R_OUT_W_H] = (yuv2rgb_cfg[Y2R_OUT_W_H] & 0xF800F800) > | > + ((pym0_out_w << 16) & 0x7FF0000) | (pym0_out_h & 0x7FF); > + > + if (aie_cfg->src_img_fmt == FMT_YUV_2P || aie_cfg->src_img_fmt == > FMT_YVU_2P) { > + /* 2 plane */ > + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & > 0xFFFFFFEE) | 0x11; > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(dif_x(aie_cfg), > dif_y(aie_cfg)); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(dif_x(aie_cfg), > dif_y(aie_cfg)); > + } else { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h > - 1); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h > - 1); > + } > + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x1; > + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x1; > + } else if (aie_cfg->src_img_fmt == FMT_MONO) { > + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = > + (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & 0xFFFFFFEE) | 0x01; > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(dif_x(aie_cfg), > dif_y(aie_cfg)); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(dif_x(aie_cfg), > dif_y(aie_cfg)); > + } else { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h > - 1); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h > - 1); > + } > + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; > + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; > + } else if (aie_cfg->src_img_fmt == FMT_YUYV || > + aie_cfg->src_img_fmt == FMT_YVYU || > + aie_cfg->src_img_fmt == FMT_UYVY || > + aie_cfg->src_img_fmt == FMT_VYUY) { > + /* 1 plane */ > + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & > 0xFFFFFFEE) | 0x1; > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(2 * (dif_x(aie_cfg) + > 1) - 1, > + dif_y(aie_cfg)); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(2 * (dif_x(aie_cfg) + > 1) - 1, > + dif_y(aie_cfg)); > + } else { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(2 * sr_crp_w - 1, > sr_crp_h - 1); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(2 * sr_crp_w - 1, > sr_crp_h - 1); > + } > + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x3; > + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x3; > + } > + > + /* AIE3.0 */ > + if (aie_cfg->src_img_fmt == FMT_YUV420_2P || > + aie_cfg->src_img_fmt == FMT_YUV420_1P) { > + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = > + (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & 0xFFFFFFEE) | 0x11; > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(dif_x(aie_cfg), > dif_y(aie_cfg)); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(dif_x(aie_cfg), > + dif_y(aie_cfg) / 2); > + } else { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = > + aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(sr_crp_w - 1, > + sr_crp_h / 2 - 1); > + } > + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; > + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; > + > + yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] = > + (yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] & 0xFFFFFFFE) | 0x01; > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, dif_x(aie_cfg)); > + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, dif_y(aie_cfg)); > + } else { > + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, sr_crp_w - 1); > + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, sr_crp_h - 1); > + } > + } else { > + yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] = > + (yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] & 0xFFFFFFFE); > + > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, dif_x(aie_cfg)); > + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, dif_y(aie_cfg)); > + } else { > + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, sr_crp_w - 1); > + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, sr_crp_h - 1); > + } > + } > + > + stride_pym0_out_w = round_up(pym0_out_w, 8); > + > + yuv2rgb_cfg[Y2R_OUT_X_Y_SIZE0] = > + aie_cmb_u16(pym0_out_w - 1, pym0_out_h - 1); > + set_cmb_cfg(yuv2rgb_cfg, Y2R_OUT_STRIDE0_BUS_SIZE0, > stride_pym0_out_w); > + yuv2rgb_cfg[Y2R_OUT_X_Y_SIZE1] = > + aie_cmb_u16(pym0_out_w - 1, pym0_out_h - 1); > + set_cmb_cfg(yuv2rgb_cfg, Y2R_OUT_STRIDE1_BUS_SIZE1, > stride_pym0_out_w); > + yuv2rgb_cfg[Y2R_OUT_X_Y_SIZE2] = > + aie_cmb_u16(pym0_out_w - 1, pym0_out_h - 1); > + set_cmb_cfg(yuv2rgb_cfg, Y2R_OUT_STRIDE2_BUS_SIZE2, > stride_pym0_out_w); > + > + if (aie_cfg->en_padding) { > + yuv2rgb_cfg[Y2R_PADDING_EN_UP_DOWN] = > + 1 | ((aie_cfg->src_padding.up << 4) & 0x1FF0) | > + ((aie_cfg->src_padding.down << 16) & 0x01FF0000); > + yuv2rgb_cfg[Y2R_PADDING_RIGHT_LEFT] = > + (aie_cfg->src_padding.right & 0x01FF) | > + ((aie_cfg->src_padding.left << 16) & 0x01FF0000); > + } else { > + yuv2rgb_cfg[Y2R_PADDING_EN_UP_DOWN] = 0; > + yuv2rgb_cfg[Y2R_PADDING_RIGHT_LEFT] = 0; > + } > + > + yuv2rgb_cfg[Y2R_IN_0] = srcbuf; > + yuv2rgb_cfg[Y2R_IN_1] = srcbuf_UV; > + > + yuv2rgb_cfg[Y2R_OUT_0] = (u32)fd->base_para->rs_pym_rst_pa[0][0]; > + yuv2rgb_cfg[Y2R_OUT_1] = (u32)fd->base_para->rs_pym_rst_pa[0][1]; > + yuv2rgb_cfg[Y2R_OUT_2] = (u32)fd->base_para->rs_pym_rst_pa[0][2]; > + > + yuv2rgb_cfg[Y2R_X_Y_MAG] = (xmag_0 & 0x3FFF) | ((ymag_0 << 16) & > 0x3FFF0000); > + > + if (sr_crp_w >= pym0_out_w) { > + /* down scale AIE1.0 by FRZ */ > + yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] = > + (yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] & 0x00100070); > + yuv2rgb_cfg[Y2R_SRZ_HORI_STEP] = 0; > + yuv2rgb_cfg[Y2R_SRZ_VERT_STEP] = 0; > + } else { > + /* SRZ */ > + /* 0: FDRZ for down scaling */ > + /* 1: SRZ for up scaling */ > + yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] = > + (yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] & 0x00100070) | SRZ_BIT; > + yuv2rgb_cfg[Y2R_SRZ_HORI_STEP] = ((sr_crp_w - 1) << 15) / > (pym0_out_w - 1); > + yuv2rgb_cfg[Y2R_SRZ_VERT_STEP] = ((sr_crp_h - 1) << 15) / > (pym0_out_h - 1); > + } > + > + yuv2rgb_cfg[Y2R_CON_IN_BA_MSB] = (u32)0x02020202; > + yuv2rgb_cfg[Y2R_CON_OUT_BA_MSB] = (u32)0x02020202; > + > + return 0; > +} > + > +static int aie_config_rs(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg) > +{ > + u32 *rs_cfg; > + u32 *rs_tbl[2]; > + u16 xmag_0, ymag_0; > + u16 pym_out_w[3]; > + u16 pym_out_h[3]; > + u16 round_w; > + u16 sr_crp_w; > + u16 sr_crp_h; > + int i; > + > + sr_crp_w = fd->base_para->crop_rect.width; > + sr_crp_h = fd->base_para->crop_rect.height; > + > + rs_cfg = (u32 *)fd->base_para->fd_rs_cfg_va; > + > + pym_out_w[0] = fd->base_para->pyramid_rect.width; > + pym_out_w[1] = pym_out_w[0] >> 1; > + pym_out_w[2] = pym_out_w[1] >> 1; > + > + pym_out_h[0] = pym_out_w[0] * sr_crp_h / sr_crp_w; > + pym_out_h[1] = pym_out_h[0] >> 1; > + pym_out_h[2] = pym_out_h[1] >> 1; > + > + for (i = 0; i < 2; i++) { > + rs_tbl[i] = rs_cfg + fd->variant->rs_cfg_size * i; > + > + rs_tbl[i][RS_IN_0] = (u32)fd->base_para->rs_pym_rst_pa[i][0]; > + rs_tbl[i][RS_IN_1] = (u32)fd->base_para->rs_pym_rst_pa[i][1]; > + rs_tbl[i][RS_IN_2] = (u32)fd->base_para->rs_pym_rst_pa[i][2]; > + > + rs_tbl[i][RS_OUT_0] = (u32)fd->base_para->rs_pym_rst_pa[i + 1][0]; > + rs_tbl[i][RS_OUT_1] = (u32)fd->base_para->rs_pym_rst_pa[i + 1][1]; > + rs_tbl[i][RS_OUT_2] = (u32)fd->base_para->rs_pym_rst_pa[i + 1][2]; > + > + rs_tbl[i][RS_INPUT_W_H] = (rs_tbl[i][RS_INPUT_W_H] & 0xF800F800) | > + (pym_out_h[i] & 0x7FF) | ((pym_out_w[i] << 16) & 0x7FF0000); > + rs_tbl[i][RS_OUTPUT_W_H] = (rs_tbl[i][RS_OUTPUT_W_H] & 0xF800F800) > | > + (pym_out_h[i + 1] & 0x7FF) | ((pym_out_w[i + 1] << 16) & > 0x7FF0000); > + > + rs_tbl[i][RS_IN_X_Y_SIZE0] = aie_cmb_u16(pym_out_w[i] - 1, > pym_out_h[i] - 1); > + rs_tbl[i][RS_IN_X_Y_SIZE1] = aie_cmb_u16(pym_out_w[i] - 1, > pym_out_h[i] - 1); > + rs_tbl[i][RS_IN_X_Y_SIZE2] = aie_cmb_u16(pym_out_w[i] - 1, > pym_out_h[i] - 1); > + > + set_cmb_cfg(rs_tbl[i], RS_IN_STRIDE0, pym_out_w[i]); > + set_cmb_cfg(rs_tbl[i], RS_IN_STRIDE1, pym_out_w[i]); > + set_cmb_cfg(rs_tbl[i], RS_IN_STRIDE2, pym_out_w[i]); > + > + rs_tbl[i][RS_OUT_X_Y_SIZE0] = aie_cmb_u16(pym_out_w[i + 1] - 1, > + pym_out_h[i + 1] - 1); > + rs_tbl[i][RS_OUT_X_Y_SIZE1] = aie_cmb_u16(pym_out_w[i + 1] - 1, > + pym_out_h[i + 1] - 1); > + rs_tbl[i][RS_OUT_X_Y_SIZE2] = aie_cmb_u16(pym_out_w[i + 1] - 1, > + pym_out_h[i + 1] - 1); > + > + if (i == 0) > + round_w = pym_out_w[i + 1]; > + else > + round_w = round_up(pym_out_w[i + 1], 8); > + > + set_cmb_cfg(rs_tbl[i], RS_OUT_STRIDE0, round_w); > + set_cmb_cfg(rs_tbl[i], RS_OUT_STRIDE1, round_w); > + set_cmb_cfg(rs_tbl[i], RS_OUT_STRIDE2, round_w); > + > + xmag_0 = 512 * pym_out_w[i] / pym_out_w[i + 1]; > + ymag_0 = xmag_0; > + > + rs_tbl[i][RS_X_Y_MAG] = (xmag_0 & 0x3FFF) | ((ymag_0 << 16) & > 0x3FFF0000); > + rs_tbl[i][RS_CON_IN_BA_MSB] = (u32)0x02020202; > + rs_tbl[i][RS_CON_OUT_BA_MSB] = (u32)0x02020202; > + } > + > + return 0; > +} > + > +static int aie_config_network(struct mtk_aie_dev *fd, > + struct aie_enq_info *aie_cfg) > +{ > + struct aie_static_info *pstv = &fd->st_info; > + u16 pyramid0_out_w, pyramid0_out_h, pyramid1_out_h, pyramid2_out_h; > + u16 out_ysize_minus_1, out_ysize_minus_1_stride2; > + u16 fd_xsize[4], size, poolsize, stridesize; > + u16 input_height, out_height; > + u16 conv_width, conv_height; > + u32 *fd_cur_cfg, *fd_cur_set; > + u32 sr_crp_w, sr_crp_h; > + u32 cal_x, cal_y; > + u8 uch, uloop; > + u8 i, j; > + void *fd_cfg; > + > + sr_crp_w = fd->base_para->crop_rect.width; > + sr_crp_h = fd->base_para->crop_rect.height; > + > + pyramid0_out_w = fd->base_para->pyramid_rect.width; > + pyramid0_out_h = pyramid0_out_w * sr_crp_h / sr_crp_w; > + > + pyramid1_out_h = pyramid0_out_h / 2; > + pyramid2_out_h = pyramid1_out_h / 2; > + > + fd_cfg = fd->base_para->fd_fd_cfg_va; > + > + for (i = 0; i < FD_LOOP_NUM; i++) { > + fd_cur_cfg = (u32 *)fd_cfg + fd->variant->fd_cfg_size * i; > + fd_cur_cfg[FD_INPUT_ROTATE] = (fd_cur_cfg[FD_INPUT_ROTATE] & > 0xFFFF0FFF) | > + ((aie_cfg->rotate_degree << 12) & 0x3000); > + > + if (i == 0) > + input_height = pyramid2_out_h; > + else if (i == (RPNX_LOOP_NUM(2) + 1)) > + input_height = pyramid1_out_h; > + else if (i == (RPNX_LOOP_NUM(1) + 1)) > + input_height = pyramid0_out_h; > + else > + if (fd_out_stride2_in[i] == 0) > + input_height = out_height; > + else > + input_height = (out_height + 1) / 2; > + > + poolsize = is_fd_maxpool(i); > + stridesize = get_fd_stride(i); > + if (poolsize == 1) > + out_height = > + DIV_ROUND_UP(input_height, 2 * poolsize); > + else > + out_height = DIV_ROUND_UP(input_height, stridesize + 2 * poolsize); > + > + if (i == RPNX_LOOP_NUM(0) || i == RPNX_LOOP_NUM(1) || i == > RPNX_LOOP_NUM(2)) { > + conv_width = fd->base_para->img_rect.width; > + conv_height = fd->base_para->img_rect.height; > + fd_xsize[0] = pstv->inf_elm[i].img_width * 2 * 16 * ANCHOR_EN_NUM - > 1; > + fd_xsize[3] = pstv->inf_elm[i].img_width * 2 * 32 * ANCHOR_EN_NUM - > 1; > + fd_xsize[2] = fd_xsize[3]; > + fd_xsize[1] = fd_xsize[2]; > + } else { > + conv_width = DIV_ROUND_UP(pstv->inf_elm[i].img_width, stridesize); > + conv_height = DIV_ROUND_UP(input_height, stridesize); > + > + fd_xsize[3] = pstv->inf_elm[i].input_xsize_plus_1 - 1; > + fd_xsize[2] = fd_xsize[3]; > + fd_xsize[1] = fd_xsize[2]; > + fd_xsize[0] = fd_xsize[1]; > + } > + > + fd_cur_cfg[FD_CONV_WIDTH_MOD6] = (fd_cur_cfg[FD_CONV_WIDTH_MOD6] & > 0xFF8FFFFF) | > + (((conv_width % 6) << 20) & 0x00700000); > + fd_cur_cfg[FD_CONV_IMG_W_H] = aie_cmb_u16(conv_height, conv_width); > + > + fd_cur_cfg[FD_IN_IMG_W_H] = aie_cmb_u16(input_height, pstv- > >inf_elm[i].img_width); > + fd_cur_cfg[FD_OUT_IMG_W_H] = aie_cmb_u16(out_height, pstv- > >inf_elm[i].out_width); > + > + if (fd_rdma_en[i][0][0] != -1) { > + for (j = 0; j < 4; j++) { > + fd_cur_cfg[FD_IN_X_Y_SIZE0 + 2 * j] = > + aie_cmb_u16(fd_xsize[j], input_height - 1); > + set_cmbst_cfg(fd_cur_cfg, FD_IN_STRIDE0_BUS_SIZE0 + 2 * j, > + fd_xsize[j] + 1); > + } > + } > + > + out_ysize_minus_1 = out_height - 1; > + out_ysize_minus_1_stride2 = (out_height + 1) / 2 - 1; > + > + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { > + fd_cur_set = fd_cur_cfg + 2 * j; > + if (!out_stride_size[i][j]) > + continue; > + > + if (out_stride_size[i][j] == 1) { > + fd_cur_set[FD_OUT_X_Y_SIZE0] = > + aie_cmb_u16(pstv->inf_elm[i].out_xsize_plus_1 - 1, > + out_ysize_minus_1); > + set_cmbst_cfg(fd_cur_set, FD_OUT_STRIDE0_BUS_SIZE0, > + pstv->inf_elm[i].out_stride); > + } else if (out_stride_size[i][j] == 2) { > + fd_cur_set[FD_OUT_X_Y_SIZE0] = > + aie_cmb_u16(pstv->inf_elm[i].out_xsize_plus_1_stride2 - 1, > + out_ysize_minus_1_stride2); > + set_cmbst_cfg(fd_cur_set, FD_OUT_STRIDE0_BUS_SIZE0, > + pstv->inf_elm[i].out_stride_stride2); > + } > + } > + > + if (i == RPNX_LOOP_NUM(0) || i == RPNX_LOOP_NUM(1) || i == > RPNX_LOOP_NUM(2)) > + set_cmb_cfg(fd_cur_cfg, FD_RPN_SET, fd->base_para- > >rpn_anchor_thrd); > + > + if (i == RPNX_LOOP_NUM(0)) { > + cal_x = ((sr_crp_w << 10) * 100 / > + (int)fd->base_para->pyramid_rect.width) >> 10; > + cal_y = cal_x * 512 / 100; > + fd_cur_cfg[FD_IMAGE_COORD] = (fd_cur_cfg[FD_IMAGE_COORD] & 0xF) | > + ((cal_y << 4) & 0x7FFF0); > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = 0; > + if (aie_cfg->en_roi) { > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = > + (aie_cfg->src_roi.x1 - aie_cfg->src_padding.left) | > + (aie_cfg->src_roi.y1 - aie_cfg->src_padding.up) << 16; > + } > + } else if (i == RPNX_LOOP_NUM(1)) { > + cal_x = ((sr_crp_w << 10) * 100 / > + (int)fd->base_para->pyramid_rect.width) >> 10; > + cal_y = cal_x * 2 * 512 / 100; > + fd_cur_cfg[FD_IMAGE_COORD] = (fd_cur_cfg[FD_IMAGE_COORD] & 0xF) | > + ((cal_y << 4) & 0x7FFF0); > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = 0; > + if (aie_cfg->en_roi) { > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = > + (aie_cfg->src_roi.x1 - aie_cfg->src_padding.left) | > + (aie_cfg->src_roi.y1 - aie_cfg->src_padding.up) << 16; > + } > + } else if (i == RPNX_LOOP_NUM(2)) { > + cal_x = ((sr_crp_w << 10) * 100 / > + (int)fd->base_para->pyramid_rect.width) >> 10; > + cal_y = cal_x * 4 * 512 / 100; > + fd_cur_cfg[FD_IMAGE_COORD] = (fd_cur_cfg[FD_IMAGE_COORD] & 0xF) | > + ((cal_y << 4) & 0x7FFF0); > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = 0; > + if (aie_cfg->en_roi) { > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = > + (aie_cfg->src_roi.x1 - aie_cfg->src_padding.left) | > + (aie_cfg->src_roi.y1 - aie_cfg->src_padding.up) << 16; > + } > + } > + > + /* IN_FM_BASE_ADR */ > + if (i == 0) { > + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[2][0]); > + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[2][1]); > + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[2][2]); > + } else if (i == (RPNX_LOOP_NUM(2) + 1)) { > + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[1][0]); > + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[1][1]); > + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[1][2]); > + } else if (i == (RPNX_LOOP_NUM(1) + 1)) { > + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[0][0]); > + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[0][1]); > + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[0][2]); > + } else { > + for (j = 0; j < INPUT_WDMA_WRA_NUM; j++) { > + if (fd_rdma_en[i][j][0] != -1) { > + uloop = fd_rdma_en[i][j][0]; > + uch = fd_rdma_en[i][j][1]; > + fd_cur_cfg[FD_IN_0 + j] = > + (u32)(fd->dma_para->fd_out_hw_pa[uloop][uch]); > + } > + } > + } > + > + /* OUT_FM_BASE_ADR */ > + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { > + if (out_stride_size[i][j]) > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->fd_out_hw_pa[i][j]); > + } > + > + /* KERNEL_BASE_ADR */ > + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { > + size = get_fd_ker_rdma_size(i, j); > + if (size) > + fd_cur_cfg[FD_KERNEL_0 + j] = > + (u32)(fd->dma_para->fd_kernel_pa[i][j]); > + } > + > + fd_cur_cfg[FD_CON_IN_BA_MSB] = (u32)0x02020202; > + fd_cur_cfg[FD_CON_OUT_BA_MSB] = (u32)0x02020202; > + fd_cur_cfg[FD_CON_KERNEL_BA_MSB] = (u32)0x00000202; > + } > + > + return 0; > +} > + > +static int aie_config_attr_network(struct mtk_aie_dev *fd, > + struct aie_enq_info *aie_cfg) > +{ > + bool is_regression_loop; > + void *fd_cfg; > + u32 *fd_cur_cfg; > + u16 fd_input_ht, fd_output_ht; > + u16 fd_out_y[4]; > + u8 i, j; > + u8 uloop, uch, uidx; > + u16 pyramid0_out_w, pyramid0_out_h; > + int fd_conv_ht; > + u16 sr_crp_w, sr_crp_h; > + > + sr_crp_w = fd->attr_para->crop_width[fd->attr_para->w_idx]; > + sr_crp_h = fd->attr_para->crop_height[fd->attr_para->w_idx]; > + > + pyramid0_out_w = ATTR_MODE_PYRAMID_WIDTH; > + pyramid0_out_h = pyramid0_out_w * sr_crp_h / sr_crp_w; > + > + fd_cfg = fd->base_para->attr_fd_cfg_va[fd->attr_para->w_idx]; > + > + for (i = 0; i < ATTR_LOOP_NUM; i++) { > + fd_cur_cfg = (u32 *)fd_cfg + fd->variant->fd_cfg_size * i; > + fd_cur_cfg[FD_INPUT_ROTATE] = > + (fd_cur_cfg[FD_INPUT_ROTATE] & 0xFFFF0FFF) | > + ((aie_cfg->rotate_degree << 12) & 0x3000); > + if (i == 0) > + fd_input_ht = pyramid0_out_h; > + else > + if (attr_out_stride2_as_in[i] == 0) > + fd_input_ht = fd_output_ht; > + else if (attr_out_stride2_as_in[i] == 1) > + fd_input_ht = (fd_output_ht + 1) / 2; > + > + fd_output_ht = DIV_ROUND_UP(fd_input_ht, ATTR_FD_STRIDE(i) + > + 2 * ATTR_FD_MAXPOOL(i)); > + fd_conv_ht = DIV_ROUND_UP(fd_input_ht, ATTR_FD_STRIDE(i)); > + > + fd_cur_cfg[FD_CONV_IMG_W_H] = > + (fd_cur_cfg[FD_CONV_IMG_W_H] & 0xFFFF0000) | > + (fd_conv_ht & 0xFFFF); > + fd_cur_cfg[FD_IN_IMG_W_H] = > + (fd_cur_cfg[FD_IN_IMG_W_H] & 0xFFFF0000) | > + (fd_input_ht & 0xFFFF); > + fd_cur_cfg[FD_OUT_IMG_W_H] = > + (fd_cur_cfg[FD_OUT_IMG_W_H] & 0xFFFF0000) | > + (fd_output_ht & 0xFFFF); > + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE0, fd_input_ht - 1); > + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE1, fd_input_ht - 1); > + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE2, fd_input_ht - 1); > + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE3, fd_input_ht - 1); > + > + is_regression_loop = (i == AGE_OUT_RGS || i == GENDER_OUT_RGS || > + i == INDIAN_OUT_RGS || i == ETHNICITY_OUT_RGS); > + > + if (is_regression_loop) { > + fd_out_y[0] = 0; > + fd_out_y[1] = 0; > + fd_out_y[2] = 0; > + fd_out_y[3] = 0; > + } else { > + fd_out_y[0] = fd_output_ht - 1; > + fd_out_y[1] = fd_output_ht - 1; > + if (attr_out_2size[i] == 0) { > + fd_out_y[2] = fd_output_ht - 1; > + fd_out_y[3] = fd_output_ht - 1; > + } else { > + fd_out_y[2] = (fd_output_ht + 1) / 2 - 1; > + fd_out_y[3] = (fd_output_ht + 1) / 2 - 1; > + } > + } > + > + for (j = 0; j < 4; j++) > + set_cmb_cfg(fd_cur_cfg, FD_OUT_X_Y_SIZE0 + 2 * j, fd_out_y[j]); > + > + /* IN_FM_BASE_ADR */ > + if (i == 0) { > + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[0][0]); > + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[0][1]); > + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[0][2]); > + } else { > + for (j = 0; j < INPUT_WDMA_WRA_NUM; j++) { > + if (attr_rdma_en[i][j][0] != -1) { > + uloop = attr_rdma_en[i][j][0]; > + uch = attr_rdma_en[i][j][1]; > + fd_cur_cfg[FD_IN_0 + j] = > + (u32)(fd->dma_para->attr_out_hw_pa[uloop][uch]); > + } > + } > + } > + > + /* OUT_FM_BASE_ADR */ > + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { > + if (attr_wdma_en[i][j]) { > + uidx = fd->attr_para->w_idx; > + if (i == AGE_OUT_RGS && j == 0) > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->age_out_hw_pa[uidx]); > + else if (i == GENDER_OUT_RGS && j == 0) > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->gender_out_hw_pa[uidx]); > + else if (i == INDIAN_OUT_RGS && j == 0) > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->is_indian_out_hw_pa[uidx]); > + else if (i == ETHNICITY_OUT_RGS && j == 0) > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->ethnicity_out_hw_pa[uidx]); > + else > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->attr_out_hw_pa[i][j]); > + } > + } > + > + /* KERNEL_BASE_ADR */ > + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { > + fd_cur_cfg[FD_KERNEL_0 + j] = > + (u32)(fd->dma_para->attr_kernel_pa[i][j]); > + } > + > + fd_cur_cfg[FD_CON_IN_BA_MSB] = (u32)0x02020202; > + fd_cur_cfg[FD_CON_OUT_BA_MSB] = (u32)0x02020202; > + fd_cur_cfg[FD_CON_KERNEL_BA_MSB] = (u32)0x00000202; > + } > + return 0; > +} > + > +static int aie_config_dram(struct mtk_aie_dev *fd, struct > aie_enq_info *aie_cfg) > +{ > + int ret; > + > + ret = aie_config_y2r(fd, aie_cfg, aie_cfg->sel_mode); > + if (ret) > + return ret; > + > + if (aie_cfg->sel_mode == FDMODE) { > + ret = aie_config_rs(fd, aie_cfg); > + if (ret) > + return ret; > + > + ret = aie_config_network(fd, aie_cfg); > + if (ret) > + return ret; > + > + } else if (aie_cfg->sel_mode == ATTRIBUTEMODE) { > + ret = aie_config_attr_network(fd, aie_cfg); > + if (ret) > + return ret; > + } else { > + return -EINVAL; > + } > + > + return 0; > +} > + > +void aie_reset(struct mtk_aie_dev *fd) > +{ > + writel(RESET_BIT, fd->fd_base + AIE_START_REG); > + writel(0x0, fd->fd_base + AIE_START_REG); > +} > + > +int aie_init(struct mtk_aie_dev *fd, struct v4l2_ctrl_aie_init > *user_init) > +{ > + int ret; > + int i, j; > + > + if (fd->fd_state & STATE_INIT) { > + dev_err(fd->dev, "%s fd state: %d\n", __func__, fd->fd_state); > + return -EINVAL; > + } > + > + fd->fd_state &= ~STATE_INIT; > + fd->fd_mem_size = 0; > + > + fd->base_para = kmalloc(sizeof(*fd->base_para), GFP_KERNEL); > + if (!fd->base_para) > + return -ENOMEM; > + > + fd->attr_para = kmalloc(sizeof(*fd->attr_para), GFP_KERNEL); > + if (!fd->attr_para) { > + ret = -ENOMEM; > + goto kmalloc_fail; > + } > + > + fd->dma_para = kmalloc(sizeof(*fd->dma_para), GFP_KERNEL); > + if (!fd->dma_para) { > + ret = -ENOMEM; > + goto kmalloc_fail; > + } > + > + fd->fld_para = kmalloc(sizeof(*fd->fld_para), GFP_KERNEL); > + if (!fd->fld_para) { > + ret = -ENOMEM; > + goto kmalloc_fail; > + } > + > + fd->base_para->rpn_anchor_thrd = > + (signed short)(user_init->feature_threshold & 0x0000FFFF); > + fd->base_para->pyramid_rect.width = user_init->pyramid_width; > + fd->base_para->pyramid_rect.height = user_init->pyramid_height; > + fd->base_para->max_pyramid_rect.width = user_init->pyramid_width; > + fd->base_para->max_pyramid_rect.height = user_init->pyramid_height; > + > + fd->base_para->fd_fd_cfg_va = NULL; > + fd->base_para->fd_rs_cfg_va = NULL; > + fd->base_para->fd_yuv2rgb_cfg_va = NULL; > + for (i = 0; i < MAX_ENQUE_FRAME_NUM; i++) > + fd->base_para->attr_fd_cfg_va[i] = NULL; > + for (i = 0; i < MAX_ENQUE_FRAME_NUM; i++) > + fd->base_para->attr_yuv2rgb_cfg_va[i] = NULL; > + for (i = 0; i < PYM_NUM; i++) > + for (j = 0; j < COLOR_NUM; j++) > + fd->base_para->rs_pym_rst_va[i][j] = NULL; > + > + memset(&fd->st_info, 0, sizeof(struct aie_static_info)); > + aie_update_table(fd, fd->base_para->max_pyramid_rect.width, > + fd->base_para->max_pyramid_rect.height, 1); > + aie_update_buf_params(fd, user_init->max_img_width, > + user_init->max_img_height); > + ret = aie_alloc_dram_buf(fd); > + if (ret) > + return -ENOMEM; > + > + ret = aie_alloc_output_buf(fd); > + if (ret) { > + ret = -ENOMEM; > + goto free_all; > + } > + > + ret = aie_alloc_fddma_buf(fd); > + if (ret) { > + ret = -ENOMEM; > + goto free_all; > + } > + > + ret = aie_alloc_fld_buf(fd); > + if (ret) { > + ret = -ENOMEM; > + goto free_all; > + } > + > + aie_arrange_fddma_buf(fd); > + aie_arrange_kernel_buf(fd); > + aie_arrange_attrdma_buf(fd); > + aie_arrange_result_dma_buf(fd); > + > + aie_arrange_fld_buf(fd); > + > + ret = aie_load_fw(fd); > + if (ret) { > + dev_err(fd->dev, "Failed to load aie fw\n"); > + goto free_all; > + } > + > + fd->attr_para->r_idx = 0; > + fd->attr_para->w_idx = 0; > + > + fd->fd_state |= STATE_INIT; > + > + dev_dbg(fd->dev, "%s: fd_mem_size(%d)\n", __func__, fd- > >fd_mem_size); > + > + return 0; > + > +free_all: > + aie_free_dram_buf(fd); > + aie_free_output_buf(fd); > + aie_free_fddma_buf(fd); > + aie_free_fld_buf(fd); > + > +kmalloc_fail: > + kfree(fd->base_para); > + kfree(fd->attr_para); > + kfree(fd->dma_para); > + kfree(fd->fld_para); > + > + dev_err(fd->dev, "Failed to init aie\n"); > + > + return ret; > +} > + > +void aie_uninit(struct mtk_aie_dev *fd) > +{ > + fd->fd_state &= ~STATE_INIT; > + aie_free_dram_buf(fd); > + aie_free_output_buf(fd); > + aie_free_fddma_buf(fd); > + aie_free_fld_buf(fd); > + > + kfree(fd->base_para); > + kfree(fd->attr_para); > + kfree(fd->dma_para); > + kfree(fd->fld_para); > +} > + > +void aie_prepare(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg) > +{ > + if (aie_cfg->sel_mode == FLDMODE) > + return; > + > + memset(&fd->reg_cfg, 0, sizeof(struct aie_reg_cfg)); > + > + if (aie_cfg->pyramid_base_width == 0) { > + fd->base_para->pyramid_rect.width = fd->base_para- > >max_pyramid_rect.width; > + fd->base_para->pyramid_rect.height = fd->base_para- > >max_pyramid_rect.height; > + fd->base_para->number_of_pyramid = 3; > + } else { > + fd->base_para->pyramid_rect.height = > + fd->base_para->max_pyramid_rect.height; > + fd->base_para->number_of_pyramid = aie_cfg->number_of_pyramid; > + if (aie_cfg->pyramid_base_width != > + fd->base_para->pyramid_rect.width) { > + dev_dbg(fd->dev, > + "pre: %d cur: %d num: %d\n", > + fd->base_para->pyramid_rect.width, > + aie_cfg->pyramid_base_width, > + fd->base_para->number_of_pyramid > + ); > + fd->base_para->pyramid_rect.width = > + aie_cfg->pyramid_base_width; > + aie_update_table(fd, fd->base_para->pyramid_rect.width, > + fd->base_para->pyramid_rect.height, 0); > + aie_update_fddma_buf(fd); > + } > + } > + > + //Init output va array > + if (aie_cfg->sel_mode == FDMODE) { > + memset(fd->rs_output_hw.va, 0, fd->rs_output_hw.size); > + memset(fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0], 0, > + RESULT_SIZE); > + memset(fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(1)][0], 0, > + RESULT_SIZE); > + memset(fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(2)][0], 0, > + RESULT_SIZE); > + } else if (aie_cfg->sel_mode == ATTRIBUTEMODE) { > + memset(fd->base_para->rs_pym_rst_va[0][0], 0, > + fd->rs_pym_out_size[0]); > + memset(fd->base_para->rs_pym_rst_va[0][1], 0, > + fd->rs_pym_out_size[0]); > + memset(fd->base_para->rs_pym_rst_va[0][2], 0, > + fd->rs_pym_out_size[0]); > + } > + > + fd->reg_cfg.fd_mode = aie_cfg->sel_mode; > + if (aie_cfg->sel_mode == FDMODE) { > + fd->reg_cfg.rs_adr = (u32)fd->base_para->fd_rs_cfg_pa; > + fd->reg_cfg.yuv2rgb_adr = (u32)fd->base_para->fd_yuv2rgb_cfg_pa; > + fd->reg_cfg.fd_adr = (u32)fd->base_para->fd_fd_cfg_pa + > + fd->variant->fd_cfg_size * 4 * > + FD_LOOP_NUM / 3 * > + (3 - aie_cfg->number_of_pyramid); > + > + } else if (aie_cfg->sel_mode == ATTRIBUTEMODE) { > + fd->reg_cfg.yuv2rgb_adr = > + (u32)fd->base_para->attr_yuv2rgb_cfg_pa[fd->attr_para->w_idx]; > + fd->reg_cfg.fd_adr = > + (u32)fd->base_para->attr_fd_cfg_pa[fd->attr_para->w_idx]; > + } else { > + dev_err(fd->dev, "Invalid Mode: %d", aie_cfg->sel_mode); > + } > + > + aie_update_cfg(fd, aie_cfg); > + > + aie_config_dram(fd, aie_cfg); > + > + if (aie_cfg->sel_mode == ATTRIBUTEMODE) > + fd->attr_para->w_idx = (fd->attr_para->w_idx + 1) % > MAX_ENQUE_FRAME_NUM; > +} > + > +static void aie_execute_face_detection(struct mtk_aie_dev *fd, > + struct aie_enq_info *aie_cfg) > +{ > + unsigned int loop_num, loop_reg_val; > + > + writel(0x0, fd->fd_base + AIE_START_REG); > + writel(0x00000111, fd->fd_base + AIE_ENABLE_REG); > + loop_num = FD_LOOP_NUM / 3 * (aie_cfg->number_of_pyramid); > + loop_reg_val = (loop_num << 8) | (aie_cfg->number_of_pyramid - 1); > + writel(loop_reg_val, fd->fd_base + AIE_LOOP_REG); > + writel(0x1, fd->fd_base + AIE_INT_EN_REG); > + writel(fd->reg_cfg.rs_adr, fd->fd_base + AIE_RS_CON_BASE_ADR_REG); > + writel(fd->reg_cfg.fd_adr, fd->fd_base + AIE_FD_CON_BASE_ADR_REG); > + writel(fd->reg_cfg.yuv2rgb_adr, fd->fd_base + > AIE_YUV2RGB_CON_BASE_ADR_REG); > + writel(0x00000002, fd->fd_base + AIE_YUV2RGB_CON_BASE_ADR_MSB); > + writel(0x00000002, fd->fd_base + AIE_RS_CON_BASE_ADR_MSB); > + writel(0x00000002, fd->fd_base + AIE_FD_CON_BASE_ADR_MSB); > + writel(0x1, fd->fd_base + AIE_START_REG); > +} > + > +static void aie_execute_attribute_detection(struct mtk_aie_dev *fd, > + struct aie_enq_info *aie_cfg) > +{ > + writel(0x0, fd->fd_base + AIE_START_REG); > + writel(0x00000101, fd->fd_base + AIE_ENABLE_REG); > + writel(0x00001A00, fd->fd_base + AIE_LOOP_REG); > + writel(0x1, fd->fd_base + AIE_INT_EN_REG); > + writel(fd->reg_cfg.rs_adr, fd->fd_base + AIE_RS_CON_BASE_ADR_REG); > + writel(fd->reg_cfg.fd_adr, fd->fd_base + AIE_FD_CON_BASE_ADR_REG); > + writel(fd->reg_cfg.yuv2rgb_adr, fd->fd_base + > AIE_YUV2RGB_CON_BASE_ADR_REG); > + writel(0x00000002, fd->fd_base + AIE_YUV2RGB_CON_BASE_ADR_MSB); > + writel(0x00000002, fd->fd_base + AIE_RS_CON_BASE_ADR_MSB); > + writel(0x00000002, fd->fd_base + AIE_FD_CON_BASE_ADR_MSB); > + writel(0x1, fd->fd_base + AIE_START_REG); > +} > + > +static void aie_execute_fld_detection(struct mtk_aie_dev *fd, > + struct aie_enq_info *aie_cfg) > +{ > + unsigned int i; > + > + writel(0x10, fd->fd_base + AIE_START_REG); > + writel(0x00011111, fd->fd_base + AIE_DMA_CTL_REG); > + writel(0x01111111, fd->fd_base + FLD_EN); > + writel(0x1, fd->fd_base + AIE_INT_EN_REG); > + for (i = 0; i < aie_cfg->fld_face_num; i++) { > + writel(aie_cfg->src_img_addr, fd->fd_base + FLD_BASE_ADDR_FACE_0 + > i * 0x4); > + writel(aie_cfg->fld_input[i].fld_in_crop_x1 << 16 | > + aie_cfg->fld_input[i].fld_in_crop_y1, > + fd->fd_base + fld_face_info_0[i]); > + writel(aie_cfg->fld_input[i].fld_in_crop_x2 << 16 | > + aie_cfg->fld_input[i].fld_in_crop_y2, > + fd->fd_base + FLD_FACE_INFO(i, 1)); > + writel(aie_cfg->fld_input[i].fld_in_rip << 4 | > + aie_cfg->fld_input[i].fld_in_rop, > + fd->fd_base + FLD_FACE_INFO(i, 2)); > + } > + > + writel(aie_cfg->fld_face_num << 28 | FLD_FOREST << 16 | > + FLD_POINT, fd->fd_base + FLD_MODEL_PARA1); > + writel(13 << 16 | 0xfe9, fd->fd_base + FLD_MODEL_PARA14); > + writel(aie_cfg->src_img_width << 16 | aie_cfg->src_img_height, > + fd->fd_base + FLD_SRC_WD_HT); > + > + /*input settings*/ > + writel(0x007c003f, fd->fd_base + FLD_PL_IN_SIZE_0); > + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_0); > + writel(0x007c003f, fd->fd_base + FLD_PL_IN_SIZE_1); > + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_1); > + writel(0x0016003f, fd->fd_base + FLD_PL_IN_SIZE_2_0); > + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_2_0); > + writel(0x0013003f, fd->fd_base + FLD_PL_IN_SIZE_2_1); > + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_2_1); > + writel(0x0013003f, fd->fd_base + FLD_PL_IN_SIZE_2_2); > + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_2_2); > + writel(0x00a6001f, fd->fd_base + FLD_PL_IN_SIZE_3); > + writel(0x0020000f, fd->fd_base + FLD_PL_IN_STRIDE_3); > + > + /*output setting*/ > + writel((2400 * aie_cfg->fld_face_num - 1) << 16 | 127, > + fd->fd_base + FLD_SH_IN_SIZE_0); > + writel(0x0010000f, fd->fd_base + FLD_SH_IN_STRIDE_0); > + writel(fd->fld_para->fld_output_pa[0], > + fd->fd_base + FLD_TR_OUT_BASE_ADDR_0); > + writel((aie_cfg->fld_face_num - 1) << 16 | 0x6f, > + fd->fd_base + FLD_TR_OUT_SIZE_0); > + writel(0x0070000f, fd->fd_base + FLD_TR_OUT_STRIDE_0); > + writel(fd->fld_para->fld_output_pa[0], > + fd->fd_base + FLD_PP_OUT_BASE_ADDR_0); > + writel((aie_cfg->fld_face_num - 1) << 16 | 0x6f, > + fd->fd_base + FLD_PP_OUT_SIZE_0); > + writel(0x0070000f, fd->fd_base + FLD_PP_OUT_STRIDE_0); > + > + /*cv score*/ > + writel(0x00000001, fd->fd_base + FLD_BS_BIAS); > + writel(0x0000b835, fd->fd_base + FLD_CV_FM_RANGE_0); > + writel(0xffff5cba, fd->fd_base + FLD_CV_FM_RANGE_1); > + writel(0x00005ed5, fd->fd_base + FLD_CV_PM_RANGE_0); > + writel(0xffff910d, fd->fd_base + FLD_CV_PM_RANGE_1); > + writel(0x0000031e, fd->fd_base + FLD_BS_RANGE_0); > + writel(0xfffffcae, fd->fd_base + FLD_BS_RANGE_1); > + > + /* 6 steps */ > + writel(fd->fld_para->fld_step_pa[FLD_STEP_BLINK][14], > + fd->fd_base + FLD_BS_IN_BASE_ADDR_14); > + > + for (i = 0; i < 15; i++) { > + writel(fd->fld_para->fld_step_pa[FLD_STEP_CV][i], > + fd->fd_base + FLD_PL_IN_BASE_ADDR_2_(i)); > + > + writel(fd->fld_para->fld_step_pa[FLD_STEP_FP][i], > + fd->fd_base + FLD_PL_IN_BASE_ADDR_3_(i)); > + > + writel(fd->fld_para->fld_step_pa[FLD_STEP_LEAF][i], > + fd->fd_base + FLD_SH_IN_BASE_ADDR_(i)); > + > + writel(fd->fld_para->fld_step_pa[FLD_STEP_KM02][i], > + fd->fd_base + FLD_PL_IN_BASE_ADDR_0_(i)); > + > + writel(fd->fld_para->fld_step_pa[FLD_STEP_KM13][i], > + fd->fd_base + FLD_PL_IN_BASE_ADDR_1_(i)); > + } > + > + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_0_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_0_8_15_MSB); > + > + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_1_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_1_8_15_MSB); > + > + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_2_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_2_8_15_MSB); > + > + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_3_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_3_8_15_MSB); > + > + writel(0x22222222, fd->fd_base + FLD_SH_IN_BASE_ADDR_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_SH_IN_BASE_ADDR_8_15_MSB); > + > + writel(0x02000000, fd->fd_base + FLD_BS_IN_BASE_ADDR_8_15_MSB); > + > + writel(0x22222222, fd->fd_base + FLD_BASE_ADDR_FACE_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_BASE_ADDR_FACE_8_14_MSB); > + writel(0x00000002, fd->fd_base + FLD_TR_OUT_BASE_ADDR_0_MSB); > + writel(0x00000002, fd->fd_base + FLD_PP_OUT_BASE_ADDR_0_MSB); > + > + /* fld mode + trigger start */ > + writel(0x11, fd->fd_base + AIE_START_REG); > +} > + > +void aie_execute(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg) > +{ > + if (aie_cfg->sel_mode == FDMODE) > + aie_execute_face_detection(fd, aie_cfg); > + else if (aie_cfg->sel_mode == ATTRIBUTEMODE) > + aie_execute_attribute_detection(fd, aie_cfg); > + else if (aie_cfg->sel_mode == FLDMODE) > + aie_execute_fld_detection(fd, aie_cfg); > + else > + return; > +} > + > +void aie_irqhandle(struct mtk_aie_dev *fd) > +{ > + writel(0x0, fd->fd_base + AIE_START_REG); > + > + /* interrupt read clear */ > + readl(fd->fd_base + AIE_INT_REG); > +} > + > +static u16 aie_get_hi16(unsigned int value) > +{ > + return (value & 0xFFFF0000) >> 16; > +} > + > +static u16 aie_get_lo16(unsigned int value) > +{ > + return value & 0xFFFF; > +} > + > +static signed short aie_refine_s16_value(signed short value) > +{ > + s16 result; > + > + if ((value & 0x200) >> 9) > + result = (value | 0xFE00); > + else > + result = value; > + > + return result; > +} > + > +/* return aie_cfg to user space */ > +void aie_get_fd_result(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg) > +{ > + u32 fd_result_hw, fd_result_1_hw, fd_total_num; > + struct aie_enq_info *tmp_aie_cfg; > + void *fd_pym_result[PYM_NUM]; > + u32 fd_pyramid_num[PYM_NUM]; > + signed short landmark; > + unsigned int *pto12; > + struct fd_ret *prst; > + unsigned int i, j; > + > + aie_cfg->sel_mode = fd->base_para->sel_mode; > + aie_cfg->rotate_degree = fd->base_para->rotate_degree; > + aie_cfg->src_img_addr = fd->base_para->src_img_addr; > + aie_cfg->src_img_addr_uv = fd->base_para->src_img_addr_uv; > + aie_cfg->src_img_width = fd->base_para->img_rect.width; > + aie_cfg->src_img_height = fd->base_para->img_rect.height; > + aie_cfg->src_img_fmt = fd->base_para->src_img_fmt; > + > + aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); > + > + fd_result_hw = fd->reg_cfg.hw_result; > + fd_result_1_hw = fd->reg_cfg.hw_result1; > + fd_total_num = fd_result_hw & 0xFFF; > + fd_pyramid_num[0] = (fd_result_hw & 0xFFF0000) >> 16; > + fd_pyramid_num[1] = fd_result_1_hw & 0xFFF; > + fd_pyramid_num[2] = (fd_result_1_hw & 0xFFF0000) >> 16; > + > + if (fd_total_num == 0) > + goto nothing_out; > + > + tmp_aie_cfg = aie_cfg; > + > + tmp_aie_cfg->fd_out.fd_total_num = fd_total_num; > + tmp_aie_cfg->fd_out.fd_pyramid0_num = fd_pyramid_num[0]; > + tmp_aie_cfg->fd_out.fd_pyramid1_num = fd_pyramid_num[1]; > + tmp_aie_cfg->fd_out.fd_pyramid2_num = fd_pyramid_num[2]; > + > + switch (tmp_aie_cfg->number_of_pyramid) { > + case 1: > + fd_pym_result[2] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0]; > + break; > + case 2: > + fd_pym_result[1] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0]; > + fd_pym_result[2] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(1)][0]; > + break; > + case 3: > + fd_pym_result[0] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0]; > + fd_pym_result[1] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(1)][0]; > + fd_pym_result[2] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(2)][0]; > + break; > + default: > + dev_err(fd->dev, "Wrong number_of_pyramid\n"); > + goto nothing_out; > + } > + > + for (i = 0; i < 3; i++) { > + for (j = 0; j < fd_pyramid_num[i]; j++) { > + if (i == 0) > + prst = &tmp_aie_cfg->fd_out.pyramid0_result; > + else if (i == 1) > + prst = &tmp_aie_cfg->fd_out.pyramid1_result; > + else if (i == 2) > + prst = &tmp_aie_cfg->fd_out.pyramid2_result; > + > + pto12 = (unsigned int *)fd_pym_result[i] + 12 * j; > + > + prst->anchor_x0[j] = aie_get_lo16(*(pto12 + 0)); > + prst->anchor_y0[j] = aie_get_hi16(*(pto12 + 0)); > + prst->anchor_x1[j] = aie_get_lo16(*(pto12 + 1)); > + prst->anchor_y1[j] = aie_get_hi16(*(pto12 + 1)); > + > + if (prst->anchor_x1[j] == 0 || > + prst->anchor_y1[j] == 0) { > + dev_err(fd->dev, > + "wrong coordinate: i=%d j=%d M:%d %d %d %d\n", i, j, > + prst->anchor_x0[j], > + prst->anchor_x1[j], > + prst->anchor_y0[j], > + prst->anchor_y1[j] > + ); > + goto nothing_out; > + } > + > + /* ROP result at 1st run */ > + landmark = (*(pto12 + 2) & 0x3FF); > + prst->rop_landmark_score0[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 2) & 0xFFC00) >> 10); > + prst->rop_landmark_score1[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 2) & 0x3FF00000) >> 20); > + prst->rop_landmark_score2[j] = aie_refine_s16_value(landmark); > + > + prst->anchor_score[j] = aie_refine_s16_value(*(pto12 + 9) & 0x3FF); > + > + /* RIP result at 1st run */ > + landmark = ((*(pto12 + 9) & 0xFFC00) >> 10); > + prst->rip_landmark_score0[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 9) & 0x3FF00000) >> 20); > + prst->rip_landmark_score1[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 9) & 0xC0000000) >> 30) | > + ((*(pto12 + 10) & 0xFF) << 2); > + prst->rip_landmark_score2[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 10) & 0x3FF00) >> 8); > + prst->rip_landmark_score3[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 10) & 0xFFC0000) >> 18); > + prst->rip_landmark_score4[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 10) & 0xF0000000) >> 28) | > + ((*(pto12 + 11) & 0x3F) << 4); > + prst->rip_landmark_score5[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 11) & 0xFFC0) >> 6); > + prst->rip_landmark_score6[j] = aie_refine_s16_value(landmark); > + prst->face_result_index[j] = ((*(pto12 + 11) & 0xFFF0000) >> 16); > + prst->anchor_index[j] = ((*(pto12 + 11) & 0x70000000) >> 28); > + > + prst->fd_partial_result = fd_pyramid_num[i]; > + } > + } > + return; > +nothing_out: > + // Ensure that user mode does not receive an inappropriate result > structure > + memset(&aie_cfg->fd_out, 0, sizeof(struct fd_result)); > +} > + > +void aie_get_attr_result(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg) > +{ > + u32 *attr_ethnicity_result, *attr_gender_result; > + u32 *attr_age_result, *attr_is_indian_result; > + > + aie_cfg->sel_mode = fd->attr_para->sel_mode[fd->attr_para->r_idx]; > + aie_cfg->rotate_degree = fd->attr_para->rotate_degree[fd- > >attr_para->r_idx]; > + aie_cfg->src_img_addr = > + fd->attr_para->src_img_addr[fd->attr_para->r_idx]; > + aie_cfg->src_img_addr_uv = > + fd->attr_para->src_img_addr_uv[fd->attr_para->r_idx]; > + aie_cfg->src_img_width = fd->attr_para->img_width[fd->attr_para- > >r_idx]; > + aie_cfg->src_img_height = > + fd->attr_para->img_height[fd->attr_para->r_idx]; > + aie_cfg->src_img_fmt = fd->attr_para->src_img_fmt[fd->attr_para- > >r_idx]; > + > + aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); > + > + /* 64 feature * 32 bytes */ > + attr_age_result = > + (u32 *)fd->dma_para->age_out_hw_va[fd->attr_para->r_idx]; > + attr_gender_result = > + (u32 *)fd->dma_para->gender_out_hw_va[fd->attr_para->r_idx]; > + attr_is_indian_result = > + (u32 *)fd->dma_para->is_indian_out_hw_va[fd->attr_para->r_idx]; > + attr_ethnicity_result = > + (u32 *)fd->dma_para->ethnicity_out_hw_va[fd->attr_para->r_idx]; > + > + aie_cfg->attr_out.merged_age_ret[0] = > aie_get_lo16(*attr_age_result); > + aie_cfg->attr_out.merged_age_ret[1] = > aie_get_hi16(*attr_age_result); > + > + aie_cfg->attr_out.merged_gender_ret[0] = > aie_get_lo16(*attr_gender_result); > + aie_cfg->attr_out.merged_gender_ret[1] = > aie_get_hi16(*attr_gender_result); > + > + aie_cfg->attr_out.merged_is_indian_ret[0] = > aie_get_lo16(*attr_is_indian_result); > + aie_cfg->attr_out.merged_is_indian_ret[1] = > aie_get_hi16(*attr_is_indian_result); > + > + aie_cfg->attr_out.merged_ethnicity_ret[0] = > aie_get_lo16(*attr_ethnicity_result); > + aie_cfg->attr_out.merged_ethnicity_ret[1] = > aie_get_hi16(*attr_ethnicity_result); > + aie_cfg->attr_out.merged_ethnicity_ret[2] = > aie_get_lo16(*(attr_ethnicity_result + 1)); > + > + fd->attr_para->r_idx = (fd->attr_para->r_idx + 1) % > MAX_ENQUE_FRAME_NUM; > +} > + > +void aie_get_fld_result(struct mtk_aie_dev *fd, struct aie_enq_info > *aie_cfg) > +{ > + u8 fld_rlt[FLD_OUTPUT_X_SIZE][FLD_OUTPUT_SIZE]; > + u16 *out_parsing; > + int i, j; > + > + aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); > + > + memcpy(fld_rlt, fd->fld_para->fld_output_va[0], sizeof(fld_rlt)); > + > + for (j = 0; j < aie_cfg->fld_face_num; j++) { > + out_parsing = (unsigned short *)&fld_rlt[j][0]; > + for (i = 0; i < FLD_CUR_LANDMARK; i++) { > + aie_cfg->fld_out[j].fld_landmark[i].x = *out_parsing; > + aie_cfg->fld_out[j].fld_landmark[i].y = *(out_parsing + 1); > + > + if (i % 2) > + out_parsing = out_parsing + 6; > + else > + out_parsing = out_parsing + 2; > + } > + out_parsing = (unsigned short *)&fld_rlt[j][0]; > + if (FLD_CUR_LANDMARK % 2) > + out_parsing = out_parsing + ((FLD_CUR_LANDMARK + 1) / 2) * 8; > + else > + out_parsing = out_parsing + (FLD_CUR_LANDMARK / 2) * 8; > + > + aie_cfg->fld_out[j].fld_out_rop = *out_parsing; > + aie_cfg->fld_out[j].fld_out_rip = *(out_parsing + 1); > + aie_cfg->fld_out[j].confidence = *(out_parsing + 2); > + aie_cfg->fld_out[j].blinkscore = *(out_parsing + 3); > + } > +} > diff --git a/drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c > b/drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c > new file mode 100644 > index 000000000000..28646d5a53aa > --- /dev/null > +++ b/drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c > @@ -0,0 +1,1295 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (c) 2020 MediaTek Inc. > + * Author: Bo.Kong <bo.kong@mediatek.com> > + */ > + > +#include <linux/clk.h> > +#include <linux/pm_runtime.h> > +#include <linux/mtk_aie_v4l2_controls.h> > +#include <media/v4l2-event.h> > +#include <media/v4l2-ioctl.h> > +#include <media/v4l2-mem2mem.h> > +#include <media/videobuf2-dma-contig.h> > +#include "mtk_aie.h" > + > +#define V4L2_CID_MTK_AIE_MAX 2 > +#define Y2R_CFG_SIZE 34 > +#define RS_CFG_SIZE 30 > +#define FD_CFG_SIZE 56 > + > +static const struct mtk_aie_variant aie_31_drvdata = { > + .y2r_cfg_size = Y2R_CFG_SIZE, > + .rs_cfg_size = RS_CFG_SIZE, > + .fd_cfg_size = FD_CFG_SIZE, > +}; > + > +static const struct of_device_id mtk_aie_of_ids[] = { > + { .compatible = "mediatek,mt8188-aie", .data = &aie_31_drvdata }, > + { /* sentinel */ }, > +}; > +MODULE_DEVICE_TABLE(of, mtk_aie_of_ids); > + > +static const struct v4l2_pix_format_mplane mtk_aie_img_fmts[] = { > + { > + .pixelformat = V4L2_PIX_FMT_NV16M, > + .num_planes = 2, > + }, > + { > + .pixelformat = V4L2_PIX_FMT_NV61M, > + .num_planes = 2, > + }, > + { > + .pixelformat = V4L2_PIX_FMT_YUYV, > + .num_planes = 1, > + }, > + { > + .pixelformat = V4L2_PIX_FMT_YVYU, > + .num_planes = 1, > + }, > + { > + .pixelformat = V4L2_PIX_FMT_UYVY, > + .num_planes = 1, > + }, > + { > + .pixelformat = V4L2_PIX_FMT_VYUY, > + .num_planes = 1, > + }, > + { > + .pixelformat = V4L2_PIX_FMT_GREY, > + .num_planes = 1, > + }, > + { > + .pixelformat = V4L2_PIX_FMT_NV12M, > + .num_planes = 2, > + }, > + { > + .pixelformat = V4L2_PIX_FMT_NV12, > + .num_planes = 1, > + }, > +}; > + > +static inline struct mtk_aie_ctx *fh_to_ctx(struct v4l2_fh *fh) > +{ > + return container_of(fh, struct mtk_aie_ctx, fh); > +} > + > +static inline struct mtk_aie_ctx *ctrl_to_ctx(const struct v4l2_ctrl > *ctrl) > +{ > + return container_of(ctrl->handler, struct mtk_aie_ctx, hdl); > +} > + > +static void mtk_aie_hw_job_finish(struct mtk_aie_dev *fd, > + enum vb2_buffer_state vb_state) > +{ > + struct vb2_v4l2_buffer *src_vbuf, *dst_vbuf; > + struct mtk_aie_ctx *ctx; > + > + pm_runtime_put(fd->dev); > + ctx = v4l2_m2m_get_curr_priv(fd->m2m_dev); > + if (!ctx) { > + dev_err(fd->dev, "Failed to do v4l2_m2m_get_curr_priv!\n"); > + } else { > + src_vbuf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); > + dst_vbuf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); > + if (src_vbuf && dst_vbuf) > + v4l2_m2m_buf_copy_metadata(src_vbuf, dst_vbuf, true); > + if (src_vbuf) > + v4l2_m2m_buf_done(src_vbuf, vb_state); > + if (dst_vbuf) > + v4l2_m2m_buf_done(dst_vbuf, vb_state); > + if (src_vbuf && dst_vbuf) > + v4l2_m2m_job_finish(fd->m2m_dev, ctx->fh.m2m_ctx); > + } > + complete_all(&fd->fd_job_finished); > +} > + > +static int mtk_aie_hw_job_exec(struct mtk_aie_dev *fd) > +{ > + pm_runtime_get_sync(fd->dev); > + > + reinit_completion(&fd->fd_job_finished); > + schedule_delayed_work(&fd->job_timeout_work, > + msecs_to_jiffies(MTK_FD_HW_TIMEOUT_IN_MSEC)); > + > + return 0; > +} > + > +static int mtk_aie_vb2_buf_out_validate(struct vb2_buffer *vb) > +{ > + struct vb2_v4l2_buffer *v4l2_buf = to_vb2_v4l2_buffer(vb); > + > + if (v4l2_buf->field == V4L2_FIELD_ANY) > + v4l2_buf->field = V4L2_FIELD_NONE; > + if (v4l2_buf->field != V4L2_FIELD_NONE) > + return -EINVAL; > + > + return 0; > +} > + > +static int mtk_aie_vb2_buf_prepare(struct vb2_buffer *vb) > +{ > + struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); > + struct vb2_queue *vq = vb->vb2_queue; > + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vq); > + struct v4l2_pix_format_mplane *pixfmt; > + struct device *dev = ctx->dev; > + > + switch (vq->type) { > + case V4L2_BUF_TYPE_META_CAPTURE: > + if (vb2_plane_size(vb, 0) < ctx->dst_fmt.buffersize) { > + dev_err(dev, "meta size %lu is too small\n", vb2_plane_size(vb, > 0)); > + return -EINVAL; > + } > + vb2_set_plane_payload(vb, 0, ctx->dst_fmt.buffersize); > + break; > + case V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: > + pixfmt = &ctx->src_fmt; > + > + if (vbuf->field == V4L2_FIELD_ANY) > + vbuf->field = V4L2_FIELD_NONE; > + > + if (vb->num_planes > 2 || vbuf->field != V4L2_FIELD_NONE) { > + dev_dbg(dev, "plane %d or field %d not supported\n", > + vb->num_planes, vbuf->field); > + return -EINVAL; > + } > + > + if (vb2_plane_size(vb, 0) < pixfmt->plane_fmt[0].sizeimage) { > + dev_dbg(dev, "plane 0 %lu is too small than %x\n", > + vb2_plane_size(vb, 0), pixfmt->plane_fmt[0].sizeimage); > + return -EINVAL; > + } > + vb2_set_plane_payload(vb, 0, pixfmt->plane_fmt[0].sizeimage); > + > + if (pixfmt->num_planes == 2 && > + vb2_plane_size(vb, 1) < pixfmt->plane_fmt[1].sizeimage) { > + dev_dbg(dev, "plane 1 %lu is too small than %x\n", > + vb2_plane_size(vb, 1), pixfmt->plane_fmt[1].sizeimage); > + return -EINVAL; > + } > + vb2_set_plane_payload(vb, 1, pixfmt->plane_fmt[1].sizeimage); > + break; > + } > + > + return 0; > +} > + > +static void mtk_aie_vb2_buf_queue(struct vb2_buffer *vb) > +{ > + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue); > + struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb); > + > + v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf); > +} > + > +static int mtk_aie_vb2_queue_setup(struct vb2_queue *vq, > + unsigned int *num_buffers, > + unsigned int *num_planes, > + unsigned int sizes[], > + struct device *alloc_devs[]) > +{ > + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vq); > + struct device *dev = ctx->dev; > + unsigned int size[2]; > + unsigned int plane; > + > + switch (vq->type) { > + case V4L2_BUF_TYPE_META_CAPTURE: > + size[0] = ctx->dst_fmt.buffersize; > + size[1] = 0; > + break; > + case V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE: > + size[0] = ctx->src_fmt.plane_fmt[0].sizeimage; > + size[1] = ctx->src_fmt.plane_fmt[1].sizeimage; > + break; > + default: > + size[0] = 0; > + size[1] = 0; > + } > + > + dev_dbg(dev, "vq type = %d, size[0] = %d, size[1] = %d\n", > + vq->type, size[0], size[1]); > + > + if (*num_planes > 2) > + return -EINVAL; > + > + *num_buffers = clamp_val(*num_buffers, 1, VB2_MAX_FRAME); > + > + if (*num_planes == 0) { > + if (vq->type == V4L2_BUF_TYPE_META_CAPTURE) { > + sizes[0] = ctx->dst_fmt.buffersize; > + *num_planes = 1; > + return 0; > + } > + > + *num_planes = ctx->src_fmt.num_planes; > + if (*num_planes > 2) > + return -EINVAL; > + for (plane = 0; plane < *num_planes; plane++) > + sizes[plane] = ctx->src_fmt.plane_fmt[plane].sizeimage; > + > + return 0; > + } > + > + return 0; > +} > + > +static int mtk_aie_vb2_start_streaming(struct vb2_queue *vq, > unsigned int count) > +{ > + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vq); > + struct mtk_aie_dev *fd; > + > + if (!ctx) > + return -EINVAL; > + > + fd = ctx->fd_dev; > + if (vq->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE && (++fd- > >fd_stream_count == 1)) > + return aie_init(ctx->fd_dev, &ctx->fd_dev->ctx->user_init); > + > + return 0; > +} > + > +static void mtk_aie_job_timeout_work(struct work_struct *work) > +{ > + struct mtk_aie_dev *fd = > + container_of(work, struct mtk_aie_dev, job_timeout_work.work); > + > + dev_err(fd->dev, "FD Job timeout!"); > + > + dev_dbg(fd->dev, "%s result result1: %x, %x, %x", __func__, > + readl(fd->fd_base + AIE_RESULT_0_REG), > + readl(fd->fd_base + AIE_RESULT_1_REG), > + readl(fd->fd_base + AIE_DMA_CTL_REG)); > + > + fd->aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); > + > + if (fd->aie_cfg->sel_mode == ATTRIBUTEMODE) { > + dev_dbg(fd->dev, "w_idx = %d, r_idx = %d\n", > + fd->attr_para->w_idx, fd->attr_para->r_idx); > + } > + > + aie_irqhandle(fd); > + aie_reset(fd); > + atomic_dec(&fd->num_composing); > + mtk_aie_hw_job_finish(fd, VB2_BUF_STATE_ERROR); > + wake_up(&fd->flushing_waitq); > +} > + > +static void mtk_aie_vb2_stop_streaming(struct vb2_queue *vq) > +{ > + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vq); > + struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx; > + struct v4l2_m2m_queue_ctx *queue_ctx; > + struct mtk_aie_dev *fd = ctx->fd_dev; > + struct vb2_v4l2_buffer *vb = NULL; > + int ret; > + > + /* Waiting Job Finish */ > + ret = wait_for_completion_timeout(&fd->fd_job_finished, > msecs_to_jiffies(1000)); > + if (!ret) > + dev_err(fd->dev, "Wait job finish timeout\n"); > + > + if (vq->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) { > + fd->fd_stream_count--; > + if (fd->fd_stream_count > 0) > + dev_dbg(fd->dev, "Stop: fd_stream_count = %d\n", fd- > >fd_stream_count); > + else > + aie_uninit(fd); > + } > + > + queue_ctx = V4L2_TYPE_IS_OUTPUT(vq->type) ? &m2m_ctx->out_q_ctx : > + &m2m_ctx->cap_q_ctx; > + while ((vb = v4l2_m2m_buf_remove(queue_ctx))) > + v4l2_m2m_buf_done(vb, VB2_BUF_STATE_ERROR); > +} > + > +static void mtk_aie_vb2_request_complete(struct vb2_buffer *vb) > +{ > + struct mtk_aie_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue); > + > + v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->hdl); > +} > + > +static int mtk_aie_querycap(struct file *file, void *fh, > + struct v4l2_capability *cap) > +{ > + struct mtk_aie_dev *fd = video_drvdata(file); > + struct device *dev = fd->dev; > + > + strscpy(cap->driver, dev_driver_string(dev), sizeof(cap->driver)); > + strscpy(cap->card, dev_driver_string(dev), sizeof(cap->card)); > + > + cap->device_caps = V4L2_CAP_VIDEO_OUTPUT_MPLANE | > + V4L2_CAP_STREAMING | V4L2_CAP_META_CAPTURE; > + cap->capabilities = V4L2_CAP_DEVICE_CAPS | cap->device_caps; > + > + return 0; > +} > + > +static int mtk_aie_enum_fmt_out_mp(struct file *file, void *fh, > + struct v4l2_fmtdesc *f) > +{ > + if (f->index >= ARRAY_SIZE(mtk_aie_img_fmts)) > + return -EINVAL; > + > + f->pixelformat = mtk_aie_img_fmts[f->index].pixelformat; > + return 0; > +} > + > +static void mtk_aie_fill_pixfmt_mp(struct v4l2_pix_format_mplane > *dfmt, > + const struct v4l2_pix_format_mplane *sfmt) > +{ > + dfmt->field = V4L2_FIELD_NONE; > + dfmt->colorspace = V4L2_COLORSPACE_BT2020; > + dfmt->num_planes = sfmt->num_planes; > + dfmt->ycbcr_enc = V4L2_YCBCR_ENC_DEFAULT; > + dfmt->quantization = V4L2_QUANTIZATION_DEFAULT; > + dfmt->xfer_func = V4L2_MAP_XFER_FUNC_DEFAULT(dfmt->colorspace); > + dfmt->pixelformat = sfmt->pixelformat; > + > + /* Keep user setting as possible */ > + dfmt->width = clamp(dfmt->width, MTK_FD_OUTPUT_MIN_WIDTH, > + MTK_FD_OUTPUT_MAX_WIDTH); > + dfmt->height = clamp(dfmt->height, MTK_FD_OUTPUT_MIN_HEIGHT, > + MTK_FD_OUTPUT_MAX_HEIGHT); > + > + dfmt->plane_fmt[0].bytesperline = ALIGN(dfmt->width, 16); > + dfmt->plane_fmt[1].bytesperline = ALIGN(dfmt->width, 16); > + > + dfmt->plane_fmt[0].sizeimage = dfmt->height * dfmt- > >plane_fmt[0].bytesperline; > + dfmt->plane_fmt[1].sizeimage = dfmt->height * dfmt- > >plane_fmt[1].bytesperline; > + if (sfmt->num_planes == 2 && sfmt->pixelformat == > V4L2_PIX_FMT_NV12M) { > + dfmt->plane_fmt[1].sizeimage /= 2; > + } else if (sfmt->pixelformat == V4L2_PIX_FMT_NV12) { > + dfmt->plane_fmt[0].sizeimage *= 3; > + dfmt->plane_fmt[0].sizeimage /= 2; > + } > +} > + > +static const struct v4l2_pix_format_mplane *mtk_aie_find_fmt(u32 > format) > +{ > + unsigned int i; > + > + for (i = 0; i < ARRAY_SIZE(mtk_aie_img_fmts); i++) { > + if (mtk_aie_img_fmts[i].pixelformat == format) > + return &mtk_aie_img_fmts[i]; > + } > + > + return NULL; > +} > + > +static int mtk_aie_try_fmt_out_mp(struct file *file, void *fh, > + struct v4l2_format *f) > +{ > + struct v4l2_pix_format_mplane *pix_mp = &f->fmt.pix_mp; > + const struct v4l2_pix_format_mplane *fmt; > + > + fmt = mtk_aie_find_fmt(pix_mp->pixelformat); > + if (!fmt) > + fmt = &mtk_aie_img_fmts[0]; > + > + mtk_aie_fill_pixfmt_mp(pix_mp, fmt); > + return 0; > +} > + > +static int mtk_aie_g_fmt_out_mp(struct file *file, void *fh, > + struct v4l2_format *f) > +{ > + struct mtk_aie_ctx *ctx = fh_to_ctx(fh); > + > + f->fmt.pix_mp = ctx->src_fmt; > + > + return 0; > +} > + > +static int mtk_aie_s_fmt_out_mp(struct file *file, void *fh, > + struct v4l2_format *f) > +{ > + struct mtk_aie_ctx *ctx = fh_to_ctx(fh); > + struct vb2_queue *vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx, f->type); > + const struct v4l2_pix_format_mplane *fmt; > + struct mtk_aie_dev *fd = ctx->fd_dev; > + > + if (!vq) { > + dev_err(fd->dev, "%s vq is NULL!\n", __func__); > + return -EINVAL; > + } > + > + /* Change not allowed if queue is streaming. */ > + if (vb2_is_streaming(vq)) > + return -EBUSY; > + > + fmt = mtk_aie_find_fmt(f->fmt.pix_mp.pixelformat); > + if (!fmt) > + fmt = &mtk_aie_img_fmts[0]; > + else if (&fd->ctx->fh != file->private_data) > + return -EBUSY; > + > + fd->ctx = ctx; > + mtk_aie_fill_pixfmt_mp(&f->fmt.pix_mp, fmt); > + ctx->src_fmt = f->fmt.pix_mp; > + > + return 0; > +} > + > +static int mtk_aie_enum_fmt_meta_cap(struct file *file, void *fh, > + struct v4l2_fmtdesc *f) > +{ > + if (f->index) > + return -EINVAL; > + > + strscpy(f->description, "Face detection result", sizeof(f- > >description)); > + f->pixelformat = V4L2_META_FMT_MTFD_RESULT; > + f->flags = 0; > + > + return 0; > +} > + > +static int mtk_aie_g_fmt_meta_cap(struct file *file, void *fh, > + struct v4l2_format *f) > +{ > + f->fmt.meta.dataformat = V4L2_META_FMT_MTFD_RESULT; > + f->fmt.meta.buffersize = sizeof(struct aie_enq_info); > + > + return 0; > +} > + > +static const struct vb2_ops mtk_aie_vb2_ops = { > + .queue_setup = mtk_aie_vb2_queue_setup, > + .buf_out_validate = mtk_aie_vb2_buf_out_validate, > + .buf_prepare = mtk_aie_vb2_buf_prepare, > + .buf_queue = mtk_aie_vb2_buf_queue, > + .start_streaming = mtk_aie_vb2_start_streaming, > + .stop_streaming = mtk_aie_vb2_stop_streaming, > + .wait_prepare = vb2_ops_wait_prepare, > + .wait_finish = vb2_ops_wait_finish, > + .buf_request_complete = mtk_aie_vb2_request_complete, > +}; > + > +static const struct v4l2_ioctl_ops mtk_aie_v4l2_video_out_ioctl_ops > = { > + .vidioc_querycap = mtk_aie_querycap, > + .vidioc_enum_fmt_vid_out = mtk_aie_enum_fmt_out_mp, > + .vidioc_g_fmt_vid_out_mplane = mtk_aie_g_fmt_out_mp, > + .vidioc_s_fmt_vid_out_mplane = mtk_aie_s_fmt_out_mp, > + .vidioc_try_fmt_vid_out_mplane = mtk_aie_try_fmt_out_mp, > + .vidioc_enum_fmt_meta_cap = mtk_aie_enum_fmt_meta_cap, > + .vidioc_g_fmt_meta_cap = mtk_aie_g_fmt_meta_cap, > + .vidioc_s_fmt_meta_cap = mtk_aie_g_fmt_meta_cap, > + .vidioc_try_fmt_meta_cap = mtk_aie_g_fmt_meta_cap, > + .vidioc_reqbufs = v4l2_m2m_ioctl_reqbufs, > + .vidioc_create_bufs = v4l2_m2m_ioctl_create_bufs, > + .vidioc_expbuf = v4l2_m2m_ioctl_expbuf, > + .vidioc_prepare_buf = v4l2_m2m_ioctl_prepare_buf, > + .vidioc_querybuf = v4l2_m2m_ioctl_querybuf, > + .vidioc_qbuf = v4l2_m2m_ioctl_qbuf, > + .vidioc_dqbuf = v4l2_m2m_ioctl_dqbuf, > + .vidioc_streamon = v4l2_m2m_ioctl_streamon, > + .vidioc_streamoff = v4l2_m2m_ioctl_streamoff, > + .vidioc_subscribe_event = v4l2_ctrl_subscribe_event, > + .vidioc_unsubscribe_event = v4l2_event_unsubscribe, > +}; > + > +static int mtk_aie_queue_init(void *priv, struct vb2_queue *src_vq, > + struct vb2_queue *dst_vq) > +{ > + struct mtk_aie_ctx *ctx = priv; > + int ret; > + > + src_vq->type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE; > + src_vq->io_modes = VB2_MMAP | VB2_DMABUF; > + src_vq->supports_requests = true; > + src_vq->drv_priv = ctx; > + src_vq->ops = &mtk_aie_vb2_ops; > + src_vq->mem_ops = &vb2_dma_contig_memops; > + src_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer); > + src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY; > + src_vq->lock = &ctx->fd_dev->vfd_lock; > + src_vq->dev = ctx->fd_dev->v4l2_dev.dev; > + > + ret = vb2_queue_init(src_vq); > + if (ret) > + return ret; > + > + dst_vq->type = V4L2_BUF_TYPE_META_CAPTURE; > + dst_vq->io_modes = VB2_MMAP | VB2_DMABUF; > + dst_vq->drv_priv = ctx; > + dst_vq->ops = &mtk_aie_vb2_ops; > + dst_vq->mem_ops = &vb2_dma_contig_memops; > + dst_vq->buf_struct_size = sizeof(struct v4l2_m2m_buffer); > + dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY; > + dst_vq->lock = &ctx->fd_dev->vfd_lock; > + dst_vq->dev = ctx->fd_dev->v4l2_dev.dev; > + > + return vb2_queue_init(dst_vq); > +} > + > +static int mtk_aie_s_ctrl(struct v4l2_ctrl *ctrl) > +{ > + struct mtk_aie_ctx *ctx = ctrl_to_ctx(ctrl); > + struct v4l2_ctrl_aie_param *p_aie_param; > + struct v4l2_ctrl_aie_init *p_aie_init; > + > + if (!ctx) > + return -EINVAL; > + > + switch (ctrl->id) { > + case V4L2_CID_MTK_AIE_INIT: > + p_aie_init = ctrl->p_new.p; > + memcpy(&ctx->user_init, p_aie_init, sizeof(struct > v4l2_ctrl_aie_init)); > + break; > + case V4L2_CID_MTK_AIE_PARAM: > + p_aie_param = ctrl->p_new.p; > + memcpy(&ctx->user_param, p_aie_param, sizeof(struct > v4l2_ctrl_aie_param)); > + break; > + default: > + return -EINVAL; > + } > + > + return 0; > +} > + > +static const struct v4l2_ctrl_ops aie_ctrl_ops = { > + .s_ctrl = mtk_aie_s_ctrl, > +}; > + > +static void mtk_aie_ctrl_type_op_init(const struct v4l2_ctrl *ctrl, > + u32 from_idx, union v4l2_ctrl_ptr ptr) > +{ > + struct v4l2_ctrl_aie_param *p_aie_param; > + struct v4l2_ctrl_aie_init *p_aie_init; > + > + switch (ctrl->id) { > + case V4L2_CID_MTK_AIE_INIT: > + p_aie_init = ptr.p; > + memset(p_aie_init, 0, sizeof(struct v4l2_ctrl_aie_init)); > + break; > + > + case V4L2_CID_MTK_AIE_PARAM: > + p_aie_param = ptr.p; > + memset(p_aie_param, 0, sizeof(struct v4l2_ctrl_aie_param)); > + break; > + > + default: > + break; > + } > +} > + > +static int mtk_aie_ctrl_type_op_validate(const struct v4l2_ctrl > *ctrl, > + union v4l2_ctrl_ptr ptr) > +{ > + struct mtk_aie_ctx *ctx = ctrl_to_ctx(ctrl); > + struct v4l2_ctrl_aie_param *p_aie_param; > + struct v4l2_ctrl_aie_init *p_aie_init; > + struct mtk_aie_dev *fd; > + > + if (!ctx) > + return -EINVAL; > + > + fd = ctx->fd_dev; > + > + switch (ctrl->id) { > + case V4L2_CID_MTK_AIE_PARAM: > + p_aie_param = ptr.p; > + > + switch (p_aie_param->fd_mode) { > + case FDMODE: > + case ATTRIBUTEMODE: > + case FLDMODE: > + break; > + default: > + dev_err(ctx->dev, "Invalid Requested Mode: %d\n", p_aie_param- > >fd_mode); > + return -EINVAL; > + } > + > + switch (p_aie_param->src_img_fmt) { > + case FMT_YUV_2P: > + case FMT_YVU_2P: > + case FMT_YUYV: > + case FMT_YVYU: > + case FMT_UYVY: > + case FMT_VYUY: > + case FMT_MONO: > + case FMT_YUV420_2P: > + case FMT_YUV420_1P: > + break; > + default: > + dev_err(ctx->dev, "Invalid Requested Fmt: %d\n", p_aie_param- > >src_img_fmt); > + return -EINVAL; > + } > + > + if (p_aie_param->src_img_width > fd->base_para->max_img_rect.width > || > + p_aie_param->src_img_height > fd->base_para- > >max_img_rect.height || > + p_aie_param->src_img_width == 0 || p_aie_param->src_img_height > == 0) { > + dev_err(fd->dev, "Invalid Requested Src_WD: %d Src_HT: %d\n", > + p_aie_param->src_img_width, > + p_aie_param->src_img_height); > + > + dev_err(fd->dev, "Invalid Requested MAX_Src_WD: %d MAX_Src_HT: > %d\n", > + fd->base_para->max_img_rect.width, > + fd->base_para->max_img_rect.height); > + > + return -EINVAL; > + } > + > + if (p_aie_param->pyramid_base_width > fd->base_para- > >max_pyramid_rect.width || > + p_aie_param->pyramid_base_height > fd->base_para- > >max_pyramid_rect.height || > + p_aie_param->number_of_pyramid > 3 || > + p_aie_param->number_of_pyramid <= 0) { > + dev_err(fd->dev, "Invalid Requested base w: %d h: %d num: %d\n", > + p_aie_param->pyramid_base_width, p_aie_param->pyramid_base_height, > + p_aie_param->number_of_pyramid); > + > + dev_err(fd->dev, "Invalid Requested max w: %d h: %d\n", > + fd->base_para->max_pyramid_rect.width, > + fd->base_para->max_pyramid_rect.height); > + > + return -EINVAL; > + } > + > + break; > + > + case V4L2_CID_MTK_AIE_INIT: > + p_aie_init = ptr.p; > + if (!p_aie_init->max_img_width || !p_aie_init->max_img_height || > + !p_aie_init->pyramid_width || !p_aie_init->pyramid_height) { > + dev_err(fd->dev, > + "Invalid Requested max_w: %d max_h: %d, p_w: %d p_h: %d\n", > + p_aie_init->max_img_width, p_aie_init->max_img_height, > + p_aie_init->pyramid_width, p_aie_init->pyramid_height); > + > + return -EINVAL; > + } > + > + break; > + > + default: > + return -EINVAL; > + } > + > + return 0; > +} > + > +static const struct v4l2_ctrl_type_ops aie_ctrl_type_ops = { > + .equal = v4l2_ctrl_type_op_equal, > + .init = mtk_aie_ctrl_type_op_init, > + .log = v4l2_ctrl_type_op_log, > + .validate = mtk_aie_ctrl_type_op_validate, > +}; > + > +static struct v4l2_ctrl_config mtk_aie_controls[] = { > + { > + .ops = &aie_ctrl_ops, > + .type_ops = &aie_ctrl_type_ops, > + .id = V4L2_CID_MTK_AIE_INIT, > + .name = "FD detection init", > + .type = V4L2_CTRL_TYPE_AIE_INIT, > + .elem_size = sizeof(struct v4l2_ctrl_aie_init), > + }, { > + .ops = &aie_ctrl_ops, > + .type_ops = &aie_ctrl_type_ops, > + .id = V4L2_CID_MTK_AIE_PARAM, > + .name = "FD detection param", > + .type = V4L2_CTRL_TYPE_AIE_PARAM, > + .elem_size = sizeof(struct v4l2_ctrl_aie_param), > + }, > +}; > + > +static int mtk_aie_ctrls_setup(struct mtk_aie_ctx *ctx) > +{ > + struct v4l2_ctrl_handler *hdl = &ctx->hdl; > + int i; > + > + v4l2_ctrl_handler_init(hdl, V4L2_CID_MTK_AIE_MAX); > + if (hdl->error) > + return hdl->error; > + > + for (i = 0; i < ARRAY_SIZE(mtk_aie_controls); i++) { > + v4l2_ctrl_new_custom(hdl, &mtk_aie_controls[i], ctx); > + if (hdl->error) { > + v4l2_ctrl_handler_free(hdl); > + dev_err(ctx->dev, "Failed to register controls: %d", i); > + return hdl->error; > + } > + } > + > + ctx->fh.ctrl_handler = &ctx->hdl; > + v4l2_ctrl_handler_setup(hdl); > + > + return 0; > +} > + > +static void init_ctx_fmt(struct mtk_aie_ctx *ctx) > +{ > + struct v4l2_pix_format_mplane *src_fmt = &ctx->src_fmt; > + struct v4l2_meta_format *dst_fmt = &ctx->dst_fmt; > + > + /* Initialize M2M source fmt */ > + src_fmt->width = MTK_FD_OUTPUT_MAX_WIDTH; > + src_fmt->height = MTK_FD_OUTPUT_MAX_HEIGHT; > + mtk_aie_fill_pixfmt_mp(src_fmt, &mtk_aie_img_fmts[0]); > + > + /* Initialize M2M destination fmt */ > + dst_fmt->buffersize = sizeof(struct aie_enq_info); > + dst_fmt->dataformat = V4L2_META_FMT_MTFD_RESULT; > +} > + > +/* > + * V4L2 file operations. > + */ > +static int mtk_vfd_open(struct file *filp) > +{ > + struct video_device *vdev = video_devdata(filp); > + struct mtk_aie_dev *fd = video_drvdata(filp); > + struct mtk_aie_ctx *ctx; > + int ret; > + > + mutex_lock(&fd->dev_lock); > + > + if (fd->fd_state & STATE_OPEN) { > + dev_err(fd->dev, "VFD is already open, Only one instance is > supported\n"); > + ret = -EBUSY; > + goto err_unlock; > + } > + > + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); > + if (!ctx) { > + ret = -ENOMEM; > + goto err_unlock; > + } > + > + ctx->fd_dev = fd; > + ctx->dev = fd->dev; > + fd->ctx = ctx; > + > + v4l2_fh_init(&ctx->fh, vdev); > + filp->private_data = &ctx->fh; > + > + init_ctx_fmt(ctx); > + > + ret = mtk_aie_ctrls_setup(ctx); > + if (ret) { > + dev_err(ctx->dev, "Failed to set up controls: %d\n", ret); > + goto err_fh_exit; > + } > + ctx->fh.m2m_ctx = v4l2_m2m_ctx_init(fd->m2m_dev, ctx, > &mtk_aie_queue_init); > + if (IS_ERR(ctx->fh.m2m_ctx)) { > + ret = PTR_ERR(ctx->fh.m2m_ctx); > + goto err_free_ctrl_handler; > + } > + v4l2_fh_add(&ctx->fh); > + fd->fd_state |= STATE_OPEN; > + > + mutex_unlock(&fd->dev_lock); > + > + return 0; > +err_free_ctrl_handler: > + v4l2_ctrl_handler_free(&ctx->hdl); > +err_fh_exit: > + v4l2_fh_exit(&ctx->fh); > + kfree(ctx); > +err_unlock: > + mutex_unlock(&fd->dev_lock); > + > + return ret; > +} > + > +static int mtk_vfd_release(struct file *filp) > +{ > + struct mtk_aie_ctx *ctx = container_of(filp->private_data, struct > mtk_aie_ctx, fh); > + struct mtk_aie_dev *fd = video_drvdata(filp); > + > + mutex_lock(&fd->dev_lock); > + > + fd->fd_state &= ~STATE_OPEN; > + > + v4l2_m2m_ctx_release(ctx->fh.m2m_ctx); > + v4l2_ctrl_handler_free(&ctx->hdl); > + v4l2_fh_del(&ctx->fh); > + v4l2_fh_exit(&ctx->fh); > + > + kfree(ctx); > + > + mutex_unlock(&fd->dev_lock); > + > + return 0; > +} > + > +static __poll_t mtk_vfd_fop_poll(struct file *file, poll_table > *wait) > +{ > + int ret; > + > + struct mtk_aie_ctx *ctx = container_of(file->private_data, struct > mtk_aie_ctx, fh); > + struct mtk_aie_dev *fd = ctx->fd_dev; > + > + if (fd->fd_state & STATE_INIT) { > + /* Waiting Job Finsh */ > + ret = wait_for_completion_timeout(&fd->fd_job_finished, > msecs_to_jiffies(1000)); > + if (!ret) { > + dev_err(ctx->dev, "Wait job finish timeout from poll\n"); > + return EPOLLERR; > + } > + } > + > + return v4l2_m2m_fop_poll(file, wait); > +} > + > +static const struct v4l2_file_operations fd_video_fops = { > + .owner = THIS_MODULE, > + .open = mtk_vfd_open, > + .release = mtk_vfd_release, > + .poll = mtk_vfd_fop_poll, > + .unlocked_ioctl = video_ioctl2, > + .mmap = v4l2_m2m_fop_mmap, > +}; > + > +static int mtk_aie_job_ready(void *priv) > +{ > + struct vb2_v4l2_buffer *src_buf, *dst_buf; > + struct mtk_aie_ctx *ctx = priv; > + struct mtk_aie_dev *fd = ctx->fd_dev; > + struct fd_buffer src_img[2] = {}; > + void *plane_vaddr; > + > + if (!ctx->fh.m2m_ctx) { > + dev_err(fd->dev, "Memory-to-memory context is NULL\n"); > + return -1; > + } > + > + if (!(fd->fd_state & STATE_OPEN)) { > + dev_err(fd->dev, "Job ready with device closed\n"); > + return -1; > + } > + > + mutex_lock(&fd->fd_lock); > + > + src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx); > + dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx); > + > + if (!src_buf || !dst_buf) { > + dev_err(fd->dev, "src or dst buf is NULL\n"); > + mutex_unlock(&fd->fd_lock); > + return -1; > + } > + > + if (!(fd->fd_state & STATE_INIT)) { > + dev_err(fd->dev, "%s Wrong fd state: %d\n", __func__, fd- > >fd_state); > + mutex_unlock(&fd->fd_lock); > + return -1; > + } > + > + plane_vaddr = vb2_plane_vaddr(&dst_buf->vb2_buf, 0); > + if (!plane_vaddr) { > + dev_err(fd->dev, "Failed to get plane virtual address\n"); > + mutex_unlock(&fd->fd_lock); > + return -1; > + } > + > + v4l2_ctrl_request_setup(src_buf->vb2_buf.req_obj.req, &ctx->hdl); > + > + fd->aie_cfg = (struct aie_enq_info *)plane_vaddr; > + fd->aie_cfg->fld_face_num = ctx->user_param.fld_face_num; > + > + memset(fd->aie_cfg, 0, sizeof(struct aie_enq_info)); > + memcpy(fd->aie_cfg, &ctx->user_param, sizeof(struct aie_enq_info)); > + memcpy(fd->aie_cfg->fld_input, ctx->user_param.fld_input, > + FLD_MAX_FRAME * sizeof(struct v4l2_fld_crop_rip_rop)); > + > + src_img[0].dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf- > >vb2_buf, 0); > + > + if (ctx->src_fmt.num_planes == 2) { > + src_img[1].dma_addr = > + vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 1); > + } > + > + if ((fd->aie_cfg->sel_mode == FDMODE || fd->aie_cfg->sel_mode == > ATTRIBUTEMODE) && > + fd->aie_cfg->src_img_fmt == FMT_YUV420_1P) { > + src_img[1].dma_addr = src_img[0].dma_addr + ctx- > >user_param.src_img_stride * > + ctx->user_param.src_img_height; > + } > + > + fd->aie_cfg->src_img_addr = src_img[0].dma_addr; > + fd->aie_cfg->src_img_addr_uv = src_img[1].dma_addr; > + > + aie_prepare(fd, fd->aie_cfg); > + > + mutex_unlock(&fd->fd_lock); > + > + if (src_buf) { > + /* Complete request controls if any */ > + v4l2_ctrl_request_complete(src_buf->vb2_buf.req_obj.req, &ctx- > >hdl); > + } > + > + return 0; > +} > + > +static void mtk_aie_device_run(void *priv) > +{ > + struct mtk_aie_ctx *ctx = priv; > + struct mtk_aie_dev *fd = ctx->fd_dev; > + int ret; > + > + ret = mtk_aie_job_ready(priv); > + if (ret == -1) { > + dev_err(fd->dev, "Failed to run job ready\n"); > + return; > + } > + > + atomic_inc(&fd->num_composing); > + mtk_aie_hw_job_exec(fd); > + aie_execute(fd, fd->aie_cfg); > +} > + > +static struct v4l2_m2m_ops fd_m2m_ops = { > + .device_run = mtk_aie_device_run, > +}; > + > +static const struct media_device_ops fd_m2m_media_ops = { > + .req_validate = vb2_request_validate, > + .req_queue = v4l2_m2m_request_queue, > +}; > + > +static int mtk_aie_video_device_register(struct mtk_aie_dev *fd) > +{ > + struct v4l2_m2m_dev *m2m_dev = fd->m2m_dev; > + struct video_device *vfd = &fd->vfd; > + struct device *dev = fd->dev; > + int ret; > + > + vfd->fops = &fd_video_fops; > + vfd->release = video_device_release_empty; > + vfd->lock = &fd->vfd_lock; > + vfd->v4l2_dev = &fd->v4l2_dev; > + vfd->vfl_dir = VFL_DIR_M2M; > + vfd->device_caps = V4L2_CAP_STREAMING | > V4L2_CAP_VIDEO_OUTPUT_MPLANE | > + V4L2_CAP_META_CAPTURE; > + vfd->ioctl_ops = &mtk_aie_v4l2_video_out_ioctl_ops; > + > + strscpy(vfd->name, dev_driver_string(dev), sizeof(vfd->name)); > + > + video_set_drvdata(vfd, fd); > + > + ret = video_register_device(vfd, VFL_TYPE_VIDEO, 0); > + if (ret) { > + dev_err(dev, "Failed to register video device\n"); > + return ret; > + } > + > + ret = v4l2_m2m_register_media_controller(m2m_dev, vfd, > MEDIA_ENT_F_PROC_VIDEO_STATISTICS); > + if (ret) { > + dev_err(dev, "Failed to init mem2mem media controller\n"); > + video_unregister_device(vfd); > + return ret; > + } > + > + return 0; > +} > + > +static int mtk_aie_dev_v4l2_init(struct mtk_aie_dev *fd) > +{ > + struct media_device *mdev = &fd->mdev; > + struct device *dev = fd->dev; > + int ret; > + > + ret = v4l2_device_register(dev, &fd->v4l2_dev); > + if (ret) { > + dev_err(dev, "Failed to register v4l2 device\n"); > + return ret; > + } > + > + fd->m2m_dev = v4l2_m2m_init(&fd_m2m_ops); > + if (IS_ERR(fd->m2m_dev)) { > + dev_err(dev, "Failed to init mem2mem device\n"); > + ret = PTR_ERR(fd->m2m_dev); > + goto err_unreg_v4l2_dev; > + } > + > + mdev->dev = dev; > + strscpy(mdev->model, dev_driver_string(dev), sizeof(mdev->model)); > + media_device_init(mdev); > + mdev->ops = &fd_m2m_media_ops; > + fd->v4l2_dev.mdev = mdev; > + > + ret = mtk_aie_video_device_register(fd); > + if (ret) > + goto err_cleanup_mdev; > + > + ret = media_device_register(mdev); > + if (ret) { > + dev_err(dev, "Failed to register mem2mem media device\n"); > + goto err_unreg_vdev; > + } > + return 0; > + > +err_unreg_vdev: > + v4l2_m2m_unregister_media_controller(fd->m2m_dev); > + video_unregister_device(&fd->vfd); > +err_cleanup_mdev: > + media_device_cleanup(mdev); > + v4l2_m2m_release(fd->m2m_dev); > +err_unreg_v4l2_dev: > + v4l2_device_unregister(&fd->v4l2_dev); > + return ret; > +} > + > +static void mtk_aie_video_device_unregister(struct mtk_aie_dev *fd) > +{ > + v4l2_m2m_unregister_media_controller(fd->m2m_dev); > + video_unregister_device(&fd->vfd); > + media_device_cleanup(&fd->mdev); > + v4l2_m2m_release(fd->m2m_dev); > + v4l2_device_unregister(&fd->v4l2_dev); > +} > + > +static void mtk_aie_frame_done_worker(struct work_struct *work) > +{ > + struct mtk_aie_req_work *req_work = (struct mtk_aie_req_work > *)work; > + struct mtk_aie_dev *fd = (struct mtk_aie_dev *)req_work->fd_dev; > + > + if (fd->reg_cfg.fd_mode == FDMODE) { > + fd->reg_cfg.hw_result = readl(fd->fd_base + AIE_RESULT_0_REG); > + fd->reg_cfg.hw_result1 = readl(fd->fd_base + AIE_RESULT_1_REG); > + } > + > + mutex_lock(&fd->fd_lock); > + > + switch (fd->aie_cfg->sel_mode) { > + case FDMODE: > + aie_get_fd_result(fd, fd->aie_cfg); > + break; > + case ATTRIBUTEMODE: > + aie_get_attr_result(fd, fd->aie_cfg); > + break; > + case FLDMODE: > + aie_get_fld_result(fd, fd->aie_cfg); > + break; > + default: > + dev_dbg(fd->dev, "Wrong sel_mode\n"); > + break; > + } > + > + mutex_unlock(&fd->fd_lock); > + > + if (!cancel_delayed_work(&fd->job_timeout_work)) > + return; > + > + atomic_dec(&fd->num_composing); > + mtk_aie_hw_job_finish(fd, VB2_BUF_STATE_DONE); > + wake_up(&fd->flushing_waitq); > +} > + > +static int mtk_aie_resource_init(struct mtk_aie_dev *fd) > +{ > + mutex_init(&fd->vfd_lock); > + mutex_init(&fd->dev_lock); > + mutex_init(&fd->fd_lock); > + > + init_completion(&fd->fd_job_finished); > + complete_all(&fd->fd_job_finished); > + INIT_DELAYED_WORK(&fd->job_timeout_work, mtk_aie_job_timeout_work); > + init_waitqueue_head(&fd->flushing_waitq); > + atomic_set(&fd->num_composing, 0); > + fd->fd_stream_count = 0; > + > + fd->frame_done_wq = alloc_ordered_workqueue(dev_name(fd->dev), > + WQ_HIGHPRI | WQ_FREEZABLE); > + if (!fd->frame_done_wq) { > + dev_err(fd->dev, "failed to alloc frame_done workqueue\n"); > + mutex_destroy(&fd->vfd_lock); > + mutex_destroy(&fd->dev_lock); > + mutex_destroy(&fd->fd_lock); > + return -ENOMEM; > + } > + > + INIT_WORK(&fd->req_work.work, mtk_aie_frame_done_worker); > + fd->req_work.fd_dev = fd; > + > + return 0; > +} > + > +static void mtk_aie_resource_free(struct platform_device *pdev) > +{ > + struct mtk_aie_dev *fd = dev_get_drvdata(&pdev->dev); > + > + if (fd->frame_done_wq) > + destroy_workqueue(fd->frame_done_wq); > + fd->frame_done_wq = NULL; > + mutex_destroy(&fd->vfd_lock); > + mutex_destroy(&fd->dev_lock); > + mutex_destroy(&fd->fd_lock); > +} > + > +static irqreturn_t mtk_aie_irq(int irq, void *data) > +{ > + struct mtk_aie_dev *fd = (struct mtk_aie_dev *)data; > + > + aie_irqhandle(fd); > + > + queue_work(fd->frame_done_wq, &fd->req_work.work); > + > + return IRQ_HANDLED; > +} > + > +static int mtk_aie_probe(struct platform_device *pdev) > +{ > + struct mtk_aie_dev *fd; > + struct device *dev = &pdev->dev; > + const struct mtk_aie_variant *driver_data = NULL; > + const struct of_device_id *match = NULL; > + int irq; > + int ret; > + > + static struct clk_bulk_data aie_clks[] = { > + { .id = "img-ipe" }, > + { .id = "ipe-fdvt" }, > + { .id = "ipe-top" }, > + { .id = "ipe-smi-larb12" }, > + }; > + > + fd = devm_kzalloc(&pdev->dev, sizeof(*fd), GFP_KERNEL); > + if (!fd) > + return -ENOMEM; > + > + match = of_match_node(mtk_aie_of_ids, dev->of_node); > + if (match) > + driver_data = (const struct mtk_aie_variant *)match->data; > + > + fd->variant = driver_data; > + if (!fd->variant) > + return -ENODEV; > + > + ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(34)); > + if (ret) > + return dev_err_probe(dev, ret, "Cannot set Coherent DMA mask\n"); > + > + dev_set_drvdata(dev, fd); > + fd->dev = dev; > + > + irq = platform_get_irq(pdev, 0); > + if (irq < 0) > + return dev_err_probe(dev, irq, "Failed to get IRQ\n"); > + > + ret = devm_request_irq(dev, irq, mtk_aie_irq, IRQF_SHARED, > + dev_driver_string(dev), fd); > + if (ret) > + return dev_err_probe(dev, ret, "Failed to request irq\n"); > + > + fd->irq = irq; > + fd->fd_base = devm_platform_ioremap_resource(pdev, 0); > + if (IS_ERR(fd->fd_base)) > + return dev_err_probe(dev, -EINVAL, "Failed to get fd reg base\n"); > + > + fd->aie_clk.clk_num = ARRAY_SIZE(aie_clks); > + fd->aie_clk.clks = aie_clks; > + ret = devm_clk_bulk_get(&pdev->dev, fd->aie_clk.clk_num, fd- > >aie_clk.clks); > + if (ret) > + return dev_err_probe(dev, ret, "Failed to get raw clock\n"); > + > + ret = mtk_aie_resource_init(fd); > + if (ret) { > + mtk_aie_resource_free(pdev); > + return ret; > + } > + pm_runtime_enable(dev); > + ret = mtk_aie_dev_v4l2_init(fd); > + if (ret) { > + pm_runtime_disable(&pdev->dev); > + return ret; > + } > + > + return 0; > +} > + > +static void mtk_aie_remove(struct platform_device *pdev) > +{ > + struct mtk_aie_dev *fd = dev_get_drvdata(&pdev->dev); > + > + mtk_aie_video_device_unregister(fd); > + pm_runtime_disable(&pdev->dev); > + mtk_aie_resource_free(pdev); > +} > + > +static int __maybe_unused mtk_aie_suspend(struct device *dev) > +{ > + struct mtk_aie_dev *fd = dev_get_drvdata(dev); > + int ret, num; > + > + if (pm_runtime_suspended(dev)) > + return 0; > + > + num = atomic_read(&fd->num_composing); > + > + ret = wait_event_timeout(fd->flushing_waitq, > + !(num = atomic_read(&fd->num_composing)), > + msecs_to_jiffies(MTK_FD_HW_TIMEOUT_IN_MSEC)); > + if (!ret && num) { > + dev_dbg(dev, "%s: flushing aie job timeout num %d\n", __func__, > num); > + > + return -EBUSY; > + } > + > + ret = pm_runtime_force_suspend(dev); > + if (ret) > + return ret; > + > + return 0; > +} > + > +static int __maybe_unused mtk_aie_resume(struct device *dev) > +{ > + int ret; > + > + if (pm_runtime_suspended(dev)) { > + dev_dbg(dev, "%s: pm_runtime_suspended is true, no action\n", > __func__); > + return 0; > + } > + > + ret = pm_runtime_force_resume(dev); > + if (ret) > + return ret; > + > + return 0; > +} > + > +static int __maybe_unused mtk_aie_runtime_suspend(struct device > *dev) > +{ > + struct mtk_aie_dev *fd = dev_get_drvdata(dev); > + > + clk_bulk_disable_unprepare(fd->aie_clk.clk_num, fd->aie_clk.clks); > + > + return 0; > +} > + > +static int __maybe_unused mtk_aie_runtime_resume(struct device *dev) > +{ > + struct mtk_aie_dev *fd = dev_get_drvdata(dev); > + int ret; > + > + ret = clk_bulk_prepare_enable(fd->aie_clk.clk_num, fd- > >aie_clk.clks); > + if (ret) { > + dev_err(dev, "Failed to enable clock: %d\n", ret); > + return ret; > + } > + > + return 0; > +} > + > +static const struct dev_pm_ops mtk_aie_pm_ops = { > + SET_SYSTEM_SLEEP_PM_OPS(mtk_aie_suspend, mtk_aie_resume) > + SET_RUNTIME_PM_OPS(mtk_aie_runtime_suspend, mtk_aie_runtime_resume, > NULL) > +}; > + > +static struct platform_driver mtk_aie_driver = { > + .probe = mtk_aie_probe, > + .remove = mtk_aie_remove, > + .driver = { > + .name = "mtk-aie-5.3", > + .of_match_table = mtk_aie_of_ids, > + .pm = pm_ptr(&mtk_aie_pm_ops), > + } > +}; > + > +module_platform_driver(mtk_aie_driver); > +MODULE_AUTHOR("Bo Kong <bo.kong@mediatek.com>"); > +MODULE_LICENSE("GPL"); > +MODULE_DESCRIPTION("MediaTek AIE driver"); ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver 2025-04-03 7:38 ` [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver bo.kong 2025-04-03 19:31 ` Nicolas Dufresne @ 2025-04-04 17:56 ` kernel test robot 2025-04-17 6:39 ` CK Hu (胡俊光) 2 siblings, 0 replies; 17+ messages in thread From: kernel test robot @ 2025-04-04 17:56 UTC (permalink / raw) To: bo.kong, Rob Herring, AngeloGioacchino Del Regno, Mauro Carvalho Chehab, linux-kernel, linux-arm-kernel, linux-mediatek Cc: oe-kbuild-all, linux-media, zhaoyuan.chen, Teddy.Chen, Project_Global_Chrome_Upstream_Group Hi bo.kong, kernel test robot noticed the following build warnings: [auto build test WARNING on robh/for-next] [also build test WARNING on linus/master v6.14 next-20250404] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/bo-kong/media-dt-bindings-add-MT8188-AIE/20250403-154205 base: https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next patch link: https://lore.kernel.org/r/20250403074005.21472-5-bo.kong%40mediatek.com patch subject: [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver config: alpha-allyesconfig (https://download.01.org/0day-ci/archive/20250405/202504050137.vMfW5SAb-lkp@intel.com/config) compiler: alpha-linux-gcc (GCC) 14.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250405/202504050137.vMfW5SAb-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202504050137.vMfW5SAb-lkp@intel.com/ All warnings (new ones prefixed by >>): In file included from drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c:14: >> drivers/media/platform/mediatek/aie/mtk_aie.h:569:27: warning: 'fld_face_info_0' defined but not used [-Wunused-const-variable=] 569 | static const unsigned int fld_face_info_0[FLD_MAX_FRAME] = { | ^~~~~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:540:27: warning: 'attr_wdma_size' defined but not used [-Wunused-const-variable=] 540 | static const unsigned int attr_wdma_size[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { | ^~~~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:511:25: warning: 'attr_rdma_en' defined but not used [-Wunused-const-variable=] 511 | static const signed int attr_rdma_en[ATTR_LOOP_NUM][INPUT_WDMA_WRA_NUM][2] = { | ^~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:503:27: warning: 'attr_out_2size' defined but not used [-Wunused-const-variable=] 503 | static const unsigned int attr_out_2size[ATTR_LOOP_NUM] = { /* O */ | ^~~~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:498:27: warning: 'attr_out_stride2_as_in' defined but not used [-Wunused-const-variable=] 498 | static const unsigned int attr_out_stride2_as_in[ATTR_LOOP_NUM] = { | ^~~~~~~~~~~~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:488:27: warning: 'attr_ker_rdma_size' defined but not used [-Wunused-const-variable=] 488 | static const unsigned int attr_ker_rdma_size[ATTR_LOOP_NUM] = { | ^~~~~~~~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:478:27: warning: 'attr_wdma_en' defined but not used [-Wunused-const-variable=] 478 | static const unsigned int attr_wdma_en[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { | ^~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:388:25: warning: 'fd_rdma_en' defined but not used [-Wunused-const-variable=] 388 | static const signed int fd_rdma_en[FD_LOOP_NUM][INPUT_WDMA_WRA_NUM][2] = { | ^~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:380:27: warning: 'out_ch_pack' defined but not used [-Wunused-const-variable=] 380 | static const unsigned int out_ch_pack[FD_LOOP_NUM] = { | ^~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:373:27: warning: 'outlayer' defined but not used [-Wunused-const-variable=] 373 | static const unsigned int outlayer[FD_LOOP_NUM] = { | ^~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:365:27: warning: 'in_ch_pack' defined but not used [-Wunused-const-variable=] 365 | static const unsigned int in_ch_pack[FD_LOOP_NUM] = { | ^~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:358:27: warning: 'out_2size' defined but not used [-Wunused-const-variable=] 358 | static const unsigned int out_2size[FD_LOOP_NUM] = { | ^~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:354:27: warning: 'fd_maxpool_indices' defined but not used [-Wunused-const-variable=] 354 | static const unsigned int fd_maxpool_indices[FD_MAXPOOL_SIZE] = { | ^~~~~~~~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:350:27: warning: 'fd_stride_indices' defined but not used [-Wunused-const-variable=] 350 | static const unsigned int fd_stride_indices[FD_STRIDE_SIZE] = { | ^~~~~~~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:343:27: warning: 'fd_out_stride2_in' defined but not used [-Wunused-const-variable=] 343 | static const unsigned int fd_out_stride2_in[FD_LOOP_NUM] = { | ^~~~~~~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:332:27: warning: 'fd_ker_rdma_size' defined but not used [-Wunused-const-variable=] 332 | static const unsigned int fd_ker_rdma_size[RPN_LOOP_NUM] = { | ^~~~~~~~~~~~~~~~ >> drivers/media/platform/mediatek/aie/mtk_aie.h:307:27: warning: 'out_stride_size' defined but not used [-Wunused-const-variable=] 307 | static const unsigned int out_stride_size[FD_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { | ^~~~~~~~~~~~~~~ -- drivers/media/platform/mediatek/aie/mtk_aie_drv.c: In function 'aie_arrange_fddma_buf': >> drivers/media/platform/mediatek/aie/mtk_aie_drv.c:577:20: warning: variable 'current_pa' set but not used [-Wunused-but-set-variable] 577 | dma_addr_t current_pa; | ^~~~~~~~~~ -- >> drivers/media/platform/mediatek/aie/mtk_aie_drv.c:872: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * The aie driver just loads the bin file provided by algorithm. vim +/fld_face_info_0 +569 drivers/media/platform/mediatek/aie/mtk_aie.h 372 > 373 static const unsigned int outlayer[FD_LOOP_NUM] = { 374 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 375 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 376 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 377 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0 378 }; 379 > 380 static const unsigned int out_ch_pack[FD_LOOP_NUM] = { 381 16, 16, 16, 16, 16, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 382 32, 16, 16, 16, 32, 32, 32, 32, 32, 32, 0, 16, 16, 16, 16, 16, 32, 32, 383 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 16, 16, 16, 32, 32, 32, 384 32, 32, 32, 0, 16, 16, 16, 16, 16, 32, 32, 32, 32, 32, 32, 32, 32, 32, 385 32, 32, 32, 32, 32, 16, 16, 16, 32, 32, 32, 32, 32, 32, 0 386 }; 387 > 388 static const signed int fd_rdma_en[FD_LOOP_NUM][INPUT_WDMA_WRA_NUM][2] = { 389 { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, 390 { { 0, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 391 { { 1, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 392 { { 1, 0 }, { 2, 0 }, { -1, -1 }, { -1, -1 } }, 393 { { 3, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 394 { { 1, 2 }, { 2, 2 }, { 4, 2 }, { 4, 3 } }, 395 { { 5, 0 }, { 5, 1 }, { -1, -1 }, { -1, -1 } }, 396 { { 6, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 397 { { 5, 0 }, { 5, 1 }, { 7, 0 }, { -1, -1 } }, 398 { { 8, 0 }, { 8, 1 }, { -1, -1 }, { -1, -1 } }, 399 { { 9, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 400 { { 5, 2 }, { 5, 3 }, { 7, 2 }, { 10, 2 } }, 401 { { 11, 0 }, { 11, 1 }, { -1, -1 }, { -1, -1 } }, 402 { { 12, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 403 { { 13, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 404 { { 11, 0 }, { 11, 1 }, { 14, 0 }, { -1, -1 } }, 405 { { 15, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 406 { { 16, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 407 { { 11, 0 }, { 11, 1 }, { 14, 0 }, { 17, 0 } }, 408 { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, 409 { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, 410 { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, 411 { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, 412 { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, 413 { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, 414 { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, 415 { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, 416 { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, 417 { { 19, 0 }, { 22, 0 }, { 22, 1 }, { 25, 0 } }, 418 { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, 419 { { 29, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 420 { { 30, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 421 { { 30, 0 }, { 31, 0 }, { -1, -1 }, { -1, -1 } }, 422 { { 32, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 423 { { 30, 2 }, { 31, 2 }, { 33, 2 }, { 33, 3 } }, 424 { { 34, 0 }, { 34, 1 }, { -1, -1 }, { -1, -1 } }, 425 { { 35, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 426 { { 34, 0 }, { 34, 1 }, { 36, 0 }, { -1, -1 } }, 427 { { 37, 0 }, { 37, 1 }, { -1, -1 }, { -1, -1 } }, 428 { { 38, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 429 { { 34, 2 }, { 34, 3 }, { 36, 2 }, { 39, 2 } }, 430 { { 40, 0 }, { 40, 1 }, { -1, -1 }, { -1, -1 } }, 431 { { 41, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 432 { { 42, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 433 { { 40, 0 }, { 40, 1 }, { 43, 0 }, { -1, -1 } }, 434 { { 44, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 435 { { 45, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 436 { { 40, 0 }, { 40, 1 }, { 43, 0 }, { 46, 0 } }, 437 { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, 438 { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, 439 { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, 440 { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, 441 { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, 442 { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, 443 { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, 444 { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, 445 { { 47, 0 }, { 47, 1 }, { -1, -1 }, { -1, -1 } }, 446 { { 48, 0 }, { 51, 0 }, { 51, 1 }, { 54, 0 } }, 447 { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, 448 { { 58, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 449 { { 59, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 450 { { 59, 0 }, { 60, 0 }, { -1, -1 }, { -1, -1 } }, 451 { { 61, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 452 { { 59, 2 }, { 60, 2 }, { 62, 2 }, { 62, 3 } }, 453 { { 63, 0 }, { 63, 1 }, { -1, -1 }, { -1, -1 } }, 454 { { 64, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 455 { { 63, 0 }, { 63, 1 }, { 65, 0 }, { -1, -1 } }, 456 { { 66, 0 }, { 66, 1 }, { -1, -1 }, { -1, -1 } }, 457 { { 67, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 458 { { 63, 2 }, { 63, 3 }, { 65, 2 }, { 68, 2 } }, 459 { { 69, 0 }, { 69, 1 }, { -1, -1 }, { -1, -1 } }, 460 { { 70, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 461 { { 71, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 462 { { 69, 0 }, { 69, 1 }, { 72, 0 }, { -1, -1 } }, 463 { { 73, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 464 { { 74, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 465 { { 69, 0 }, { 69, 1 }, { 72, 0 }, { 75, 0 } }, 466 { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, 467 { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, 468 { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, 469 { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, 470 { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, 471 { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, 472 { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, 473 { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, 474 { { 76, 0 }, { 76, 1 }, { -1, -1 }, { -1, -1 } }, 475 { { 77, 0 }, { 80, 0 }, { 80, 1 }, { 83, 0 } } 476 }; 477 > 478 static const unsigned int attr_wdma_en[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { 479 { 1, 0, 1, 0 }, { 1, 0, 1, 0 }, { 1, 0, 0, 0 }, { 1, 1, 1, 1 }, 480 { 1, 1, 1, 1 }, { 1, 0, 1, 0 }, { 1, 1, 0, 0 }, { 1, 0, 1, 0 }, 481 { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, 482 { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 0, 0 }, 483 { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, 484 { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, 485 { 1, 0, 0, 0 }, { 1, 0, 0, 0 } 486 }; 487 > 488 static const unsigned int attr_ker_rdma_size[ATTR_LOOP_NUM] = { 489 240, 1168, 272, 2320, 490 2080, 9232, 3104, 9232, 491 4128, 1040, 4624, 4624, 492 1552, 4624, 4624, 4128, 493 9232, 272, 9232, 2320, 494 144, 9232, 272, 9232, 495 2320, 144 496 }; 497 > 498 static const unsigned int attr_out_stride2_as_in[ATTR_LOOP_NUM] = { 499 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 500 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 501 }; 502 > 503 static const unsigned int attr_out_2size[ATTR_LOOP_NUM] = { /* O */ 504 1, 1, 0, 1, 1, 1, 0, 505 1, 0, 0, 0, 0, 0, 0, 506 0, 0, 0, 0, 0, 0, 0, 507 0, 0, 0, 0, 0 508 }; 509 510 /* [loop][ch][output_index] */ > 511 static const signed int attr_rdma_en[ATTR_LOOP_NUM][INPUT_WDMA_WRA_NUM][2] = { 512 { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, 513 { { 0, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 514 { { 0, 0 }, { 1, 0 }, { -1, -1 }, { -1, -1 } }, 515 { { 2, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 516 { { 0, 2 }, { 1, 2 }, { 3, 2 }, { 3, 3 } }, 517 { { 4, 0 }, { 4, 1 }, { -1, -1 }, { -1, -1 } }, 518 { { 4, 0 }, { 4, 1 }, { 5, 0 }, { -1, -1 } }, 519 { { 6, 0 }, { 6, 1 }, { -1, -1 }, { -1, -1 } }, 520 { { 4, 2 }, { 4, 3 }, { 5, 2 }, { 7, 2 } }, 521 { { 8, 0 }, { 8, 1 }, { -1, -1 }, { -1, -1 } }, 522 { { 9, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 523 { { 10, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 524 { { 8, 0 }, { 8, 1 }, { 11, 0 }, { -1, -1 } }, 525 { { 12, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 526 { { 13, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 527 { { 8, 0 }, { 8, 1 }, { 11, 0 }, { 14, 0 } }, 528 { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, 529 { { 16, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 530 { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, 531 { { 18, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 532 { { 19, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 533 { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, 534 { { 21, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 535 { { 15, 0 }, { 15, 1 }, { -1, -1 }, { -1, -1 } }, 536 { { 23, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, 537 { { 24, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } } 538 }; 539 > 540 static const unsigned int attr_wdma_size[ATTR_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { 541 { 16384, 0, 4096, 0 }, 542 { 16384, 0, 4096, 0 }, 543 { 16384, 0, 0, 0 }, 544 { 16384, 16384, 4096, 4096 }, 545 { 8192, 8192, 2048, 2048 }, 546 { 8192, 0, 2048, 0 }, 547 { 8192, 8192, 0, 0 }, 548 { 8192, 0, 2048, 0 }, 549 { 2048, 2048, 0, 0 }, 550 { 2048, 0, 0, 0 }, 551 { 2048, 0, 0, 0 }, 552 { 2048, 0, 0, 0 }, 553 { 2048, 0, 0, 0 }, 554 { 2048, 0, 0, 0 }, 555 { 2048, 0, 0, 0 }, 556 { 2048, 2048, 0, 0 }, 557 { 2048, 0, 0, 0 }, 558 { 0, 0, 0, 0 }, 559 { 2048, 0, 0, 0 }, 560 { 1024, 0, 0, 0 }, 561 { 0, 0, 0, 0 }, 562 { 2048, 0, 0, 0 }, 563 { 0, 0, 0, 0 }, 564 { 2048, 0, 0, 0 }, 565 { 1024, 0, 0, 0 }, 566 { 0, 0, 0, 0 } 567 }; 568 > 569 static const unsigned int fld_face_info_0[FLD_MAX_FRAME] = { 570 0x440, 0x44C, 0x458, 0x464, 0x470, 0x47C, 0x488, 0x494, 0x4A4, 571 0x4B0, 0x4BC, 0x4C8, 0x4D4, 0x4E0, 0x4EC 572 }; 573 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver 2025-04-03 7:38 ` [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver bo.kong 2025-04-03 19:31 ` Nicolas Dufresne 2025-04-04 17:56 ` kernel test robot @ 2025-04-17 6:39 ` CK Hu (胡俊光) 2 siblings, 0 replies; 17+ messages in thread From: CK Hu (胡俊光) @ 2025-04-17 6:39 UTC (permalink / raw) To: linux-media@vger.kernel.org, AngeloGioacchino Del Regno, robh@kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, mchehab@kernel.org, Bo Kong (孔波), linux-mediatek@lists.infradead.org Cc: Zhaoyuan Chen (陈兆远), Teddy Chen (陳乾元), Project_Global_Chrome_Upstream_Group On Thu, 2025-04-03 at 15:38 +0800, bo.kong wrote: > From: Bo Kong <Bo.Kong@mediatek.com> > > Add a V4L2 sub-device driver for MT8188 AIE. This patch is a little big, so separate this patch by fd_mode. So these patch would be: 1. Add MT8188 AIE driver (support FDMODE only) 2. Add ATTRIBUTEMODE 3. Add FLDMODE > > Signed-off-by: Bo Kong <Bo.Kong@mediatek.com> > --- [snip] > + > +enum RS_CONFIG { > + RS_X_Y_MAG = 1, > + RS_SRZ_HORI_STEP = 3, > + RS_SRZ_VERT_STEP, You allocate memory for the index 5 and 6 but it's not used. Add the comment to describe why you allocate this memory but not use it. > + RS_INPUT_W_H = 7, > + RS_OUTPUT_W_H, > + RS_IN_X_Y_SIZE0 = 10, > + RS_IN_STRIDE0, > + RS_IN_X_Y_SIZE1, > + RS_IN_STRIDE1, > + RS_IN_X_Y_SIZE2, > + RS_IN_STRIDE2, > + RS_OUT_X_Y_SIZE0, > + RS_OUT_STRIDE0, > + RS_OUT_X_Y_SIZE1, > + RS_OUT_STRIDE1, > + RS_OUT_X_Y_SIZE2, > + RS_OUT_STRIDE2, > + RS_IN_0, > + RS_IN_1, > + RS_IN_2, > + RS_OUT_0, > + RS_OUT_1, > + RS_OUT_2, > + RS_CON_IN_BA_MSB, > + RS_CON_OUT_BA_MSB, > +}; > + > [snip] > + > +struct aie_static_info_element { > + u32 fd_wdma_size[OUTPUT_WDMA_WRA_NUM]; > + u32 out_xsize_plus_1; > + u32 out_height; > + u32 out_ysize_plus_1_stride2; > + u32 out_stride; > + u32 out_stride_stride2; > + u32 out_width; > + u32 img_width; > + u32 img_height; > + u32 stride2_out_width; > + u32 stride2_out_height; > + u32 out_xsize_plus_1_stride2; > + u32 input_xsize_plus_1; > +}; > + > +struct aie_static_info { > + struct aie_static_info_element inf_elm[FD_LOOP_NUM]; > +}; aie_static_info{} is redundant. Use struct aie_static_info_element{} directly. > + > +enum aie_state { > + STATE_NA, > + STATE_INIT, > + STATE_OPEN > +}; > + [snip] > + > +struct aie_reg_cfg { > + u32 rs_adr; > + u32 yuv2rgb_adr; > + u32 fd_adr; > + u32 fd_pose_adr; fd_pose_adr is useless, so drop it. > + u32 fd_mode; > + u32 hw_result; > + u32 hw_result1; > + u32 reserved; reserved is useless, so drop it. > +}; > + > +struct aie_hw_rect { > + u16 width; > + u16 height; > +}; [snip] > + > +struct aie_attr_para { > + u32 w_idx; > + u32 r_idx; In this code, only one attribute parameter is used at one time. So it's not necessary to queue the attribute parameter. Drop w_idx and r_idx. > + u32 sel_mode[MAX_ENQUE_FRAME_NUM]; > + u16 img_width[MAX_ENQUE_FRAME_NUM]; > + u16 img_height[MAX_ENQUE_FRAME_NUM]; > + u16 crop_width[MAX_ENQUE_FRAME_NUM]; > + u16 crop_height[MAX_ENQUE_FRAME_NUM]; > + u32 src_img_fmt[MAX_ENQUE_FRAME_NUM]; > + u32 rotate_degree[MAX_ENQUE_FRAME_NUM]; > + u32 src_img_addr[MAX_ENQUE_FRAME_NUM]; > + u32 src_img_addr_uv[MAX_ENQUE_FRAME_NUM]; > +}; > + [snip] > + > +struct imem_buf_info { > + void *va; > + dma_addr_t pa; > + unsigned int size; > + unsigned int reserved; reserved is useless, so drop it. > +}; > + [snip] > + > +static int aie_config_y2r(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg, > + int mode) > +{ > + u32 img_addr; > + u32 img_addr_UV; Lower case for variable name. > + u32 img_off; > + u32 img_off_uv; > + u32 *yuv2rgb_cfg; > + u32 srcbuf, srcbuf_UV; > + u16 xmag_0, ymag_0; > + u16 pym0_out_w; > + u16 pym0_out_h; > + u16 stride_pym0_out_w; > + u16 sr_crp_w; > + u16 sr_crp_h; > + u16 y1_stride; > + > + if (!aie_cfg->en_roi) { > + img_off = 0; > + img_off_uv = 0; > + } else { > + if (aie_cfg->src_img_fmt == FMT_MONO || aie_cfg->src_img_fmt == FMT_YUV_2P || > + aie_cfg->src_img_fmt == FMT_YVU_2P) { > + y1_stride = aie_cfg->src_img_stride * aie_cfg->src_roi.y1; > + img_off = y1_stride + aie_cfg->src_roi.x1; > + img_off_uv = y1_stride + aie_cfg->src_roi.x1; > + } else if (aie_cfg->src_img_fmt == FMT_YUV420_2P || > + aie_cfg->src_img_fmt == FMT_YUV420_1P) { > + y1_stride = aie_cfg->src_img_stride * aie_cfg->src_roi.y1; > + img_off = y1_stride + aie_cfg->src_roi.x1; > + img_off_uv = y1_stride / 2 + aie_cfg->src_roi.x1; > + } else if (aie_cfg->src_img_fmt == FMT_YUYV || > + aie_cfg->src_img_fmt == FMT_YVYU || > + aie_cfg->src_img_fmt == FMT_UYVY || > + aie_cfg->src_img_fmt == FMT_VYUY) { > + y1_stride = aie_cfg->src_img_stride * aie_cfg->src_roi.y1; > + img_off = y1_stride + aie_cfg->src_roi.x1 * 2; > + img_off_uv = y1_stride + aie_cfg->src_roi.x1 * 2; > + } else { > + dev_err(fd->dev, "Unsupport input format %d", aie_cfg->src_img_fmt); > + return -EINVAL; I think you should check format when setting format not here. It's not necessary to check format everywhere you use it. > + } > + } > + > + img_addr = aie_cfg->src_img_addr + img_off; > + img_addr_UV = aie_cfg->src_img_addr_uv + img_off_uv; > + > + srcbuf = img_addr; > + if (aie_cfg->src_img_fmt == FMT_YUV420_2P || aie_cfg->src_img_fmt == FMT_YUV420_1P || > + aie_cfg->src_img_fmt == FMT_YUV_2P || aie_cfg->src_img_fmt == FMT_YVU_2P) > + srcbuf_UV = img_addr_UV; > + else > + srcbuf_UV = 0; srcbuf = aie_cfg->src_img_addr + img_off; srcbuf_UV = aie_cfg->src_img_addr_uv + img_off_uv or 0; and drop img_addr and img_addr_UV. > + > + if (mode == FDMODE) { > + sr_crp_w = fd->base_para->crop_rect.width; > + sr_crp_h = fd->base_para->crop_rect.height; > + yuv2rgb_cfg = (u32 *)fd->base_para->fd_yuv2rgb_cfg_va; > + pym0_out_w = fd->base_para->pyramid_rect.width; > + } > + if (mode == ATTRIBUTEMODE) { > + sr_crp_w = fd->attr_para->crop_width[fd->attr_para->w_idx]; > + sr_crp_h = fd->attr_para->crop_height[fd->attr_para->w_idx]; > + yuv2rgb_cfg = (u32 *)fd->base_para->attr_yuv2rgb_cfg_va[fd->attr_para->w_idx]; > + pym0_out_w = ATTR_MODE_PYRAMID_WIDTH; > + } > + > + pym0_out_h = pym0_out_w * sr_crp_h / sr_crp_w; > + > + if (pym0_out_w != 0) { > + xmag_0 = 512 * sr_crp_w / pym0_out_w; > + ymag_0 = xmag_0; > + } else { Why do you allow pym0_out_w is zero? return error here. It's better to return error when user space set wrong parameter. > + xmag_0 = 0; > + ymag_0 = 0; > + } > + > + yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] = (yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] & 0xFFFFFFF8) | > + ((aie_cfg->src_img_fmt) & 0x7); You does not assign other bits of Y2R_SRC_DST_FORMAT, so you could set it as yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] = (aie_cfg->src_img_fmt) & 0x7; > + if (aie_cfg->src_img_fmt == FMT_YUV420_2P || aie_cfg->src_img_fmt == FMT_YUV420_1P) { > + /* for match patten */ > + yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] = (yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] & > + 0xFFFFFFF8) | ((0x3) & 0x7); You does not assign other bits of Y2R_SRC_DST_FORMAT, so you could set it as yuv2rgb_cfg[Y2R_SRC_DST_FORMAT] = 0x3; Maybe use a symbol to replace 0x3; > + } > + yuv2rgb_cfg[Y2R_IN_W_H] = (yuv2rgb_cfg[Y2R_IN_W_H] & 0xF800F800) | > + ((sr_crp_w << 16) & 0x7FF0000) | (sr_crp_h & 0x7FF); > + yuv2rgb_cfg[Y2R_OUT_W_H] = (yuv2rgb_cfg[Y2R_OUT_W_H] & 0xF800F800) | > + ((pym0_out_w << 16) & 0x7FF0000) | (pym0_out_h & 0x7FF); > + > + if (aie_cfg->src_img_fmt == FMT_YUV_2P || aie_cfg->src_img_fmt == FMT_YVU_2P) { > + /* 2 plane */ > + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & 0xFFFFFFEE) | 0x11; > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(dif_x(aie_cfg), dif_y(aie_cfg)); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(dif_x(aie_cfg), dif_y(aie_cfg)); I think you could define a structure and its member has different type, struct yuv2rgb_config { ... u16 in_x_size0; u16 in_y_size0; u16 in_x_size1; u16 in_y_size1; ... }; struct yuv2rgb_config *yuv2rgb_cfg; yuv2rgb_cfg->in_x_size0 = dif_x(aie_cfg); yuv2rgb_cfg->in_y_size0 = dif_y(aie_cfg); yuv2rgb_cfg->in_x_size1 = dif_x(aie_cfg); yuv2rgb_cfg->in_y_size1 = dif_y(aie_cfg); This look more readable. > + } else { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); > + } > + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x1; > + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x1; > + } else if (aie_cfg->src_img_fmt == FMT_MONO) { > + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = > + (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & 0xFFFFFFEE) | 0x01; You does not assign other bits of Y2R_RA0_RA1_EN, so you could set it as yuv2rgb_cfg[Y2R_RA0_RA1_EN] = 0x1; > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(dif_x(aie_cfg), dif_y(aie_cfg)); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(dif_x(aie_cfg), dif_y(aie_cfg)); > + } else { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); > + } > + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; > + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; I think you could define a structure and its member has different type, struct yuv2rgb_config { ... u16 in_bus_size0; u16 in_stride0; u16 in_bus_size1; u16 in_stride1; ... }; struct yuv2rgb_config *yuv2rgb_cfg; yuv2rgb_cfg->in_bus_size0 = 0; yuv2rgb_cfg->in_stride0 = aie_cfg->src_img_stride; yuv2rgb_cfg->in_bus_size1 = 0; yuv2rgb_cfg->in_stride1 = aie_cfg->src_img_stride; This look more readable. > + } else if (aie_cfg->src_img_fmt == FMT_YUYV || > + aie_cfg->src_img_fmt == FMT_YVYU || > + aie_cfg->src_img_fmt == FMT_UYVY || > + aie_cfg->src_img_fmt == FMT_VYUY) { > + /* 1 plane */ > + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & 0xFFFFFFEE) | 0x1; > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(2 * (dif_x(aie_cfg) + 1) - 1, > + dif_y(aie_cfg)); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(2 * (dif_x(aie_cfg) + 1) - 1, > + dif_y(aie_cfg)); > + } else { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(2 * sr_crp_w - 1, sr_crp_h - 1); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(2 * sr_crp_w - 1, sr_crp_h - 1); > + } > + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x3; > + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x3; > + } > + > + /* AIE3.0 */ > + if (aie_cfg->src_img_fmt == FMT_YUV420_2P || > + aie_cfg->src_img_fmt == FMT_YUV420_1P) { > + yuv2rgb_cfg[Y2R_RA0_RA1_EN] = > + (yuv2rgb_cfg[Y2R_RA0_RA1_EN] & 0xFFFFFFEE) | 0x11; > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = aie_cmb_u16(dif_x(aie_cfg), dif_y(aie_cfg)); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(dif_x(aie_cfg), > + dif_y(aie_cfg) / 2); > + } else { > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE0] = > + aie_cmb_u16(sr_crp_w - 1, sr_crp_h - 1); > + yuv2rgb_cfg[Y2R_IN_X_Y_SIZE1] = aie_cmb_u16(sr_crp_w - 1, > + sr_crp_h / 2 - 1); > + } > + yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE0_BUS_SIZE0] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; > + yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] = > + (yuv2rgb_cfg[Y2R_IN_STRIDE1_BUS_SIZE1] & 0xFFF0) | > + ((aie_cfg->src_img_stride << 16) & 0xFFFF0000) | 0x0; > + > + yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] = > + (yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] & 0xFFFFFFFE) | 0x01; > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, dif_x(aie_cfg)); > + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, dif_y(aie_cfg)); > + } else { > + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, sr_crp_w - 1); > + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, sr_crp_h - 1); > + } > + } else { > + yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] = > + (yuv2rgb_cfg[Y2R_CO2_FMT_MODE_EN] & 0xFFFFFFFE); > + > + if (aie_cfg->en_roi) { > + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, dif_x(aie_cfg)); > + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, dif_y(aie_cfg)); > + } else { > + yuv2rgb_cfg[Y2R_CO2_CROP_X] = aie_cmb_u16(0, sr_crp_w - 1); > + yuv2rgb_cfg[Y2R_CO2_CROP_Y] = aie_cmb_u16(0, sr_crp_h - 1); > + } > + } > + > + stride_pym0_out_w = round_up(pym0_out_w, 8); > + > + yuv2rgb_cfg[Y2R_OUT_X_Y_SIZE0] = > + aie_cmb_u16(pym0_out_w - 1, pym0_out_h - 1); > + set_cmb_cfg(yuv2rgb_cfg, Y2R_OUT_STRIDE0_BUS_SIZE0, stride_pym0_out_w); > + yuv2rgb_cfg[Y2R_OUT_X_Y_SIZE1] = > + aie_cmb_u16(pym0_out_w - 1, pym0_out_h - 1); > + set_cmb_cfg(yuv2rgb_cfg, Y2R_OUT_STRIDE1_BUS_SIZE1, stride_pym0_out_w); > + yuv2rgb_cfg[Y2R_OUT_X_Y_SIZE2] = > + aie_cmb_u16(pym0_out_w - 1, pym0_out_h - 1); > + set_cmb_cfg(yuv2rgb_cfg, Y2R_OUT_STRIDE2_BUS_SIZE2, stride_pym0_out_w); > + > + if (aie_cfg->en_padding) { > + yuv2rgb_cfg[Y2R_PADDING_EN_UP_DOWN] = > + 1 | ((aie_cfg->src_padding.up << 4) & 0x1FF0) | > + ((aie_cfg->src_padding.down << 16) & 0x01FF0000); > + yuv2rgb_cfg[Y2R_PADDING_RIGHT_LEFT] = > + (aie_cfg->src_padding.right & 0x01FF) | > + ((aie_cfg->src_padding.left << 16) & 0x01FF0000); I think you could define a structure and its member has different type, struct yuv2rgb_config { ... u16 padding_en:4; u16 padding_up:12; u16 padding_down; u16 padding_right; u16 padding_left; ... }; struct yuv2rgb_config *yuv2rgb_cfg; yuv2rgb_cfg->padding_en = 1; yuv2rgb_cfg->padding_up = aie_cfg->src_padding.up; yuv2rgb_cfg->padding_down = aie_cfg->src_padding.down; yuv2rgb_cfg->padding_right = aie_cfg->src_padding.right; yuv2rgb_cfg->padding_left = aie_cfg->src_padding.left; This look more readable. > + } else { > + yuv2rgb_cfg[Y2R_PADDING_EN_UP_DOWN] = 0; > + yuv2rgb_cfg[Y2R_PADDING_RIGHT_LEFT] = 0; > + } > + > + yuv2rgb_cfg[Y2R_IN_0] = srcbuf; > + yuv2rgb_cfg[Y2R_IN_1] = srcbuf_UV; > + > + yuv2rgb_cfg[Y2R_OUT_0] = (u32)fd->base_para->rs_pym_rst_pa[0][0]; > + yuv2rgb_cfg[Y2R_OUT_1] = (u32)fd->base_para->rs_pym_rst_pa[0][1]; > + yuv2rgb_cfg[Y2R_OUT_2] = (u32)fd->base_para->rs_pym_rst_pa[0][2]; > + > + yuv2rgb_cfg[Y2R_X_Y_MAG] = (xmag_0 & 0x3FFF) | ((ymag_0 << 16) & 0x3FFF0000); > + > + if (sr_crp_w >= pym0_out_w) { > + /* down scale AIE1.0 by FRZ */ > + yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] = > + (yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] & 0x00100070); > + yuv2rgb_cfg[Y2R_SRZ_HORI_STEP] = 0; > + yuv2rgb_cfg[Y2R_SRZ_VERT_STEP] = 0; > + } else { > + /* SRZ */ > + /* 0: FDRZ for down scaling */ > + /* 1: SRZ for up scaling */ > + yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] = > + (yuv2rgb_cfg[Y2R_RS_SEL_SRZ_EN] & 0x00100070) | SRZ_BIT; > + yuv2rgb_cfg[Y2R_SRZ_HORI_STEP] = ((sr_crp_w - 1) << 15) / (pym0_out_w - 1); > + yuv2rgb_cfg[Y2R_SRZ_VERT_STEP] = ((sr_crp_h - 1) << 15) / (pym0_out_h - 1); > + } > + > + yuv2rgb_cfg[Y2R_CON_IN_BA_MSB] = (u32)0x02020202; > + yuv2rgb_cfg[Y2R_CON_OUT_BA_MSB] = (u32)0x02020202; > + > + return 0; > +} > + > +static int aie_config_rs(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) > +{ > + u32 *rs_cfg; > + u32 *rs_tbl[2]; > + u16 xmag_0, ymag_0; > + u16 pym_out_w[3]; > + u16 pym_out_h[3]; > + u16 round_w; > + u16 sr_crp_w; > + u16 sr_crp_h; > + int i; > + > + sr_crp_w = fd->base_para->crop_rect.width; > + sr_crp_h = fd->base_para->crop_rect.height; > + > + rs_cfg = (u32 *)fd->base_para->fd_rs_cfg_va; > + > + pym_out_w[0] = fd->base_para->pyramid_rect.width; > + pym_out_w[1] = pym_out_w[0] >> 1; > + pym_out_w[2] = pym_out_w[1] >> 1; > + > + pym_out_h[0] = pym_out_w[0] * sr_crp_h / sr_crp_w; In aie_update_table(), img_height is equal to pym_height, but here is not. pstv->inf_elm[PYMX_START_LOOP(0)].img_width = pym_width; pstv->inf_elm[PYMX_START_LOOP(0)].img_height = pym_height; Which one is correct? If both are correct, add comment to describe why these two are different. > + pym_out_h[1] = pym_out_h[0] >> 1; > + pym_out_h[2] = pym_out_h[1] >> 1; > + > + for (i = 0; i < 2; i++) { > + rs_tbl[i] = rs_cfg + fd->variant->rs_cfg_size * i; Because you would not use rs_tbl[0] and rs_tbl[1] at the same time, it's not necessary to define rs_tbl as an array. Just define it as u32 *rs_tbl; > + > + rs_tbl[i][RS_IN_0] = (u32)fd->base_para->rs_pym_rst_pa[i][0]; > + rs_tbl[i][RS_IN_1] = (u32)fd->base_para->rs_pym_rst_pa[i][1]; > + rs_tbl[i][RS_IN_2] = (u32)fd->base_para->rs_pym_rst_pa[i][2]; > + > + rs_tbl[i][RS_OUT_0] = (u32)fd->base_para->rs_pym_rst_pa[i + 1][0]; > + rs_tbl[i][RS_OUT_1] = (u32)fd->base_para->rs_pym_rst_pa[i + 1][1]; > + rs_tbl[i][RS_OUT_2] = (u32)fd->base_para->rs_pym_rst_pa[i + 1][2]; > + > + rs_tbl[i][RS_INPUT_W_H] = (rs_tbl[i][RS_INPUT_W_H] & 0xF800F800) | > + (pym_out_h[i] & 0x7FF) | ((pym_out_w[i] << 16) & 0x7FF0000); > + rs_tbl[i][RS_OUTPUT_W_H] = (rs_tbl[i][RS_OUTPUT_W_H] & 0xF800F800) | > + (pym_out_h[i + 1] & 0x7FF) | ((pym_out_w[i + 1] << 16) & 0x7FF0000); I think you could define a structure and its member has different type, struct rs_table { ... u16 in_h; u16 in_w; u16 out_h; u16 out_w; ... }; struct rs_table *rs_tbl; rs_tbl->in_h = pym_out_h[i]; rs_tbl->in_w = pym_out_w[i]; rs_tbl->out_h = pym_out_h[i + 1]; rs_tbl->out_w = pym_out_w[i + 1]; This look more readable. > + > + rs_tbl[i][RS_IN_X_Y_SIZE0] = aie_cmb_u16(pym_out_w[i] - 1, pym_out_h[i] - 1); > + rs_tbl[i][RS_IN_X_Y_SIZE1] = aie_cmb_u16(pym_out_w[i] - 1, pym_out_h[i] - 1); > + rs_tbl[i][RS_IN_X_Y_SIZE2] = aie_cmb_u16(pym_out_w[i] - 1, pym_out_h[i] - 1); > + > + set_cmb_cfg(rs_tbl[i], RS_IN_STRIDE0, pym_out_w[i]); > + set_cmb_cfg(rs_tbl[i], RS_IN_STRIDE1, pym_out_w[i]); > + set_cmb_cfg(rs_tbl[i], RS_IN_STRIDE2, pym_out_w[i]); > + > + rs_tbl[i][RS_OUT_X_Y_SIZE0] = aie_cmb_u16(pym_out_w[i + 1] - 1, > + pym_out_h[i + 1] - 1); > + rs_tbl[i][RS_OUT_X_Y_SIZE1] = aie_cmb_u16(pym_out_w[i + 1] - 1, > + pym_out_h[i + 1] - 1); > + rs_tbl[i][RS_OUT_X_Y_SIZE2] = aie_cmb_u16(pym_out_w[i + 1] - 1, > + pym_out_h[i + 1] - 1); > + > + if (i == 0) > + round_w = pym_out_w[i + 1]; > + else > + round_w = round_up(pym_out_w[i + 1], 8); > + > + set_cmb_cfg(rs_tbl[i], RS_OUT_STRIDE0, round_w); > + set_cmb_cfg(rs_tbl[i], RS_OUT_STRIDE1, round_w); > + set_cmb_cfg(rs_tbl[i], RS_OUT_STRIDE2, round_w); > + > + xmag_0 = 512 * pym_out_w[i] / pym_out_w[i + 1]; > + ymag_0 = xmag_0; > + > + rs_tbl[i][RS_X_Y_MAG] = (xmag_0 & 0x3FFF) | ((ymag_0 << 16) & 0x3FFF0000); > + rs_tbl[i][RS_CON_IN_BA_MSB] = (u32)0x02020202; > + rs_tbl[i][RS_CON_OUT_BA_MSB] = (u32)0x02020202; > + } > + > + return 0; > +} > + > +static int aie_config_network(struct mtk_aie_dev *fd, > + struct aie_enq_info *aie_cfg) > +{ > + struct aie_static_info *pstv = &fd->st_info; > + u16 pyramid0_out_w, pyramid0_out_h, pyramid1_out_h, pyramid2_out_h; > + u16 out_ysize_minus_1, out_ysize_minus_1_stride2; > + u16 fd_xsize[4], size, poolsize, stridesize; > + u16 input_height, out_height; > + u16 conv_width, conv_height; > + u32 *fd_cur_cfg, *fd_cur_set; > + u32 sr_crp_w, sr_crp_h; > + u32 cal_x, cal_y; > + u8 uch, uloop; > + u8 i, j; > + void *fd_cfg; > + > + sr_crp_w = fd->base_para->crop_rect.width; > + sr_crp_h = fd->base_para->crop_rect.height; > + > + pyramid0_out_w = fd->base_para->pyramid_rect.width; > + pyramid0_out_h = pyramid0_out_w * sr_crp_h / sr_crp_w; > + > + pyramid1_out_h = pyramid0_out_h / 2; > + pyramid2_out_h = pyramid1_out_h / 2; > + > + fd_cfg = fd->base_para->fd_fd_cfg_va; > + > + for (i = 0; i < FD_LOOP_NUM; i++) { I think this for-loop could be changed to nest for-loop and the code would be more simple. Below comment is based on the nest for-loop. struct fd_config { ... }; fd_cur_cfg = (struct fd_config *)fd_cfg; for (i = 0; i < RPN_NUM; i++) { for (j = 0; j < RPN_LOOP_NUM; j++, fd_cur_cfg++) { > + fd_cur_cfg = (u32 *)fd_cfg + fd->variant->fd_cfg_size * i; Drop this. > + fd_cur_cfg[FD_INPUT_ROTATE] = (fd_cur_cfg[FD_INPUT_ROTATE] & 0xFFFF0FFF) | > + ((aie_cfg->rotate_degree << 12) & 0x3000); struct fd_config { ... u32 input_rotate; ... }; fd_cur_cfg->input_rotate = aie_cfg->rotate_degree << 12; This looks more readable. > + > + if (i == 0) > + input_height = pyramid2_out_h; > + else if (i == (RPNX_LOOP_NUM(2) + 1)) > + input_height = pyramid1_out_h; > + else if (i == (RPNX_LOOP_NUM(1) + 1)) > + input_height = pyramid0_out_h; > + else > + if (fd_out_stride2_in[i] == 0) > + input_height = out_height; > + else > + input_height = (out_height + 1) / 2; Define pyramid0_out_h, pyramid1_out_h, pyramid2_out_h as an array pyramid_out_h[], and #define FD_OUT_STRIDE2_IN(n) ((n == 5) || (n == 11) ? 1 : 0) if (j == 0) input_height = pyramid_out_h[2 - i]; else if (FD_OUT_STRIDE2_IN(j) == 0) input_height = out_height; else input_height = (out_height + 1) / 2; This looks more simple. > + > + poolsize = is_fd_maxpool(i); > + stridesize = get_fd_stride(i); poolsize = (j == 1) ? 1 : 0; stridesize = (j == 0) ? 2 : 1; This looks more simple. > + if (poolsize == 1) > + out_height = > + DIV_ROUND_UP(input_height, 2 * poolsize); > + else > + out_height = DIV_ROUND_UP(input_height, stridesize + 2 * poolsize); When j == 0, poolsize = 0, stridesize = 2, then out_height = DIV_ROUND_UP(input_height, 2); When j == 1, poolsize = 1, stridesize = 1, then out_height = DIV_ROUND_UP(input_height, 2); When j > 1, poolsize = 0, stridesize = 1, then out_height = DIV_ROUND_UP(input_height, 1); So these could be simplified as out_height = DIV_ROUND_UP(input_height, j > 1 ? 1 : 2); > + > + if (i == RPNX_LOOP_NUM(0) || i == RPNX_LOOP_NUM(1) || i == RPNX_LOOP_NUM(2)) { if (j == (RPN_LOOP_NUM - 1)) { > + conv_width = fd->base_para->img_rect.width; > + conv_height = fd->base_para->img_rect.height; > + fd_xsize[0] = pstv->inf_elm[i].img_width * 2 * 16 * ANCHOR_EN_NUM - 1; fd_xsize[0] = pstv->inf_elm[i * RPN_LOOP_NUM + j].img_width * 2 * 16 * ANCHOR_EN_NUM - 1; > + fd_xsize[3] = pstv->inf_elm[i].img_width * 2 * 32 * ANCHOR_EN_NUM - 1; > + fd_xsize[2] = fd_xsize[3]; > + fd_xsize[1] = fd_xsize[2]; > + } else { > + conv_width = DIV_ROUND_UP(pstv->inf_elm[i].img_width, stridesize); > + conv_height = DIV_ROUND_UP(input_height, stridesize); > + > + fd_xsize[3] = pstv->inf_elm[i].input_xsize_plus_1 - 1; > + fd_xsize[2] = fd_xsize[3]; > + fd_xsize[1] = fd_xsize[2]; > + fd_xsize[0] = fd_xsize[1]; > + } > + > + fd_cur_cfg[FD_CONV_WIDTH_MOD6] = (fd_cur_cfg[FD_CONV_WIDTH_MOD6] & 0xFF8FFFFF) | > + (((conv_width % 6) << 20) & 0x00700000); > + fd_cur_cfg[FD_CONV_IMG_W_H] = aie_cmb_u16(conv_height, conv_width); struct fd_config { ... u32 conv_width_mod6; u16 conv_height; u16 conv_width; ... }; fd_cur_cfg->conv_width_mod6 = (conv_width % 6) << 20; fd_cur_cfg->conv_width = conv_width; fd_cur_cfg->conv_height = conv_height; This looks more simple. > + > + fd_cur_cfg[FD_IN_IMG_W_H] = aie_cmb_u16(input_height, pstv->inf_elm[i].img_width); > + fd_cur_cfg[FD_OUT_IMG_W_H] = aie_cmb_u16(out_height, pstv->inf_elm[i].out_width); > + > + if (fd_rdma_en[i][0][0] != -1) { This is always true, so drop this checking. > + for (j = 0; j < 4; j++) { > + fd_cur_cfg[FD_IN_X_Y_SIZE0 + 2 * j] = > + aie_cmb_u16(fd_xsize[j], input_height - 1); > + set_cmbst_cfg(fd_cur_cfg, FD_IN_STRIDE0_BUS_SIZE0 + 2 * j, > + fd_xsize[j] + 1); > + } struct fd_in_config { u16 xsize; u16 height; u16 bus_size; u16 stride; }; struct fd_config { ... struct fd_in_config fd_in[INPUT_WDMA_WRA_NUM]; ... }; for (k = 0; k < INPUT_WDMA_WRA_NUM; k++) { fd_cur_cfg->fd_in[i].xsize = fd_xsize[k]; fd_cur_cfg->fd_in[i].height = input_height - 1; fd_cur_cfg->fd_in[i].stride = fd_xsize[j] + 1; } This looks more readable. > + } > + > + out_ysize_minus_1 = out_height - 1; > + out_ysize_minus_1_stride2 = (out_height + 1) / 2 - 1; > + > + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { > + fd_cur_set = fd_cur_cfg + 2 * j; > + if (!out_stride_size[i][j]) > + continue; > + > + if (out_stride_size[i][j] == 1) { > + fd_cur_set[FD_OUT_X_Y_SIZE0] = > + aie_cmb_u16(pstv->inf_elm[i].out_xsize_plus_1 - 1, > + out_ysize_minus_1); > + set_cmbst_cfg(fd_cur_set, FD_OUT_STRIDE0_BUS_SIZE0, > + pstv->inf_elm[i].out_stride); > + } else if (out_stride_size[i][j] == 2) { > + fd_cur_set[FD_OUT_X_Y_SIZE0] = > + aie_cmb_u16(pstv->inf_elm[i].out_xsize_plus_1_stride2 - 1, > + out_ysize_minus_1_stride2); > + set_cmbst_cfg(fd_cur_set, FD_OUT_STRIDE0_BUS_SIZE0, > + pstv->inf_elm[i].out_stride_stride2); > + } > + } static const unsigned int out_stride_size[RPN_LOOP_NUM][OUTPUT_WDMA_WRA_NUM] = { { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 0, 2, 0 }, { 1, 0, 0, 0 }, { 1, 1, 2, 2 }, { 1, 1, 2, 2 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 2, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 1, 0, 0 }, { 1, 0, 0, 0 }, { 3, 0, 0, 0 } }; struct fd_out_config { u16 xsize; u16 ysize_minus_1; u16 bus_size; u16 stride; }; struct fd_config { ... struct fd_out_config fd_out[OUTPUT_WDMA_WRA_NUM]; ... }; for (k = 0; k < OUTPUT_WDMA_WRA_NUM; k++) { if (!out_stride_size[j][k]) continue; if (out_stride_size[j][k] == 1) { fd_cur_cfg->fd_out[j].xsize = pstv->inf_elm[i].out_xsize_plus_1 - 1; fd_cur_cfg->fd_out[j].ysize_minus_1 = out_ysize_minus_1; fd_cur_cfg->fd_out[j].stride = pstv->inf_elm[i].out_stride; } else if (out_stride_size[j][k] == 2) { fd_cur_cfg->fd_out[j].xsize = pstv->inf_elm[i].out_xsize_plus_1_stride2 - 1; fd_cur_cfg->fd_out[j].ysize_minus_1 = out_ysize_minus_1_stride2; fd_cur_cfg->fd_out[j].stride = pstv->inf_elm[i].out_stride_stride2; } } This looks more readable. > + > + if (i == RPNX_LOOP_NUM(0) || i == RPNX_LOOP_NUM(1) || i == RPNX_LOOP_NUM(2)) if (j == (RPN_LOOP_NUM - 1)) > + set_cmb_cfg(fd_cur_cfg, FD_RPN_SET, fd->base_para->rpn_anchor_thrd); > + > + if (i == RPNX_LOOP_NUM(0)) { > + cal_x = ((sr_crp_w << 10) * 100 / > + (int)fd->base_para->pyramid_rect.width) >> 10; > + cal_y = cal_x * 512 / 100; > + fd_cur_cfg[FD_IMAGE_COORD] = (fd_cur_cfg[FD_IMAGE_COORD] & 0xF) | > + ((cal_y << 4) & 0x7FFF0); > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = 0; > + if (aie_cfg->en_roi) { > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = > + (aie_cfg->src_roi.x1 - aie_cfg->src_padding.left) | > + (aie_cfg->src_roi.y1 - aie_cfg->src_padding.up) << 16; > + } > + } else if (i == RPNX_LOOP_NUM(1)) { > + cal_x = ((sr_crp_w << 10) * 100 / > + (int)fd->base_para->pyramid_rect.width) >> 10; > + cal_y = cal_x * 2 * 512 / 100; > + fd_cur_cfg[FD_IMAGE_COORD] = (fd_cur_cfg[FD_IMAGE_COORD] & 0xF) | > + ((cal_y << 4) & 0x7FFF0); > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = 0; > + if (aie_cfg->en_roi) { > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = > + (aie_cfg->src_roi.x1 - aie_cfg->src_padding.left) | > + (aie_cfg->src_roi.y1 - aie_cfg->src_padding.up) << 16; > + } > + } else if (i == RPNX_LOOP_NUM(2)) { > + cal_x = ((sr_crp_w << 10) * 100 / > + (int)fd->base_para->pyramid_rect.width) >> 10; > + cal_y = cal_x * 4 * 512 / 100; > + fd_cur_cfg[FD_IMAGE_COORD] = (fd_cur_cfg[FD_IMAGE_COORD] & 0xF) | > + ((cal_y << 4) & 0x7FFF0); > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = 0; > + if (aie_cfg->en_roi) { > + fd_cur_cfg[FD_IMAGE_COORD_XY_OFST] = > + (aie_cfg->src_roi.x1 - aie_cfg->src_padding.left) | > + (aie_cfg->src_roi.y1 - aie_cfg->src_padding.up) << 16; > + } > + } struct fd_config { ... u8 img_coord_x:4; u8 img_coord_y:4; u32 img_coord_reserved:24; u16 img_coord_x_ofst; u16 img_coord_y_ofst; ... }; if (j == (RPN_LOOP_NUM - 1)) { cal_x = ((sr_crp_w << 10) * 100 / (int)fd->base_para->pyramid_rect.width) >> 10; cal_y = cal_x * (1 << (2 - i)) * 512 / 100; fd_cur_cfg->img_coord_x = fd_cur_cfg[FD_IMAGE_COORD]; fd_cur_cfg->img_coord_y = cal_y; fd_cur_cfg->img_coord_x_ofst = 0; fd_cur_cfg->img_coord_y_ofst = 0; if (aie_cfg->en_roi) { fd_cur_cfg->img_coord_x_ofst = aie_cfg->src_roi.x1 - aie_cfg->src_padding.left; fd_cur_cfg->img_coord_y_ofst = aie_cfg->src_roi.y1 - aie_cfg->src_padding.up; } } This would shrink the code size and looks readable. > + > + /* IN_FM_BASE_ADR */ > + if (i == 0) { > + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[2][0]); > + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[2][1]); > + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[2][2]); > + } else if (i == (RPNX_LOOP_NUM(2) + 1)) { > + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[1][0]); > + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[1][1]); > + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[1][2]); > + } else if (i == (RPNX_LOOP_NUM(1) + 1)) { > + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[0][0]); > + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[0][1]); > + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[0][2]); > + } else { > + for (j = 0; j < INPUT_WDMA_WRA_NUM; j++) { > + if (fd_rdma_en[i][j][0] != -1) { > + uloop = fd_rdma_en[i][j][0]; > + uch = fd_rdma_en[i][j][1]; > + fd_cur_cfg[FD_IN_0 + j] = > + (u32)(fd->dma_para->fd_out_hw_pa[uloop][uch]); > + } > + } > + } static const signed int fd_rdma_en[RPN_LOOP_NUM][INPUT_WDMA_WRA_NUM][2] = { { { 99, 99 }, { 99, 99 }, { 99, 99 }, { -1, -1 } }, { { 0, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, { { 1, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, { { 1, 0 }, { 2, 0 }, { -1, -1 }, { -1, -1 } }, { { 3, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, { { 1, 2 }, { 2, 2 }, { 4, 2 }, { 4, 3 } }, { { 5, 0 }, { 5, 1 }, { -1, -1 }, { -1, -1 } }, { { 6, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, { { 5, 0 }, { 5, 1 }, { 7, 0 }, { -1, -1 } }, { { 8, 0 }, { 8, 1 }, { -1, -1 }, { -1, -1 } }, { { 9, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, { { 5, 2 }, { 5, 3 }, { 7, 2 }, { 10, 2 } }, { { 11, 0 }, { 11, 1 }, { -1, -1 }, { -1, -1 } }, { { 12, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, { { 13, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, { { 11, 0 }, { 11, 1 }, { 14, 0 }, { -1, -1 } }, { { 15, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, { { 16, 0 }, { -1, -1 }, { -1, -1 }, { -1, -1 } }, { { 11, 0 }, { 11, 1 }, { 14, 0 }, { 17, 0 } }, { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, { { 18, 0 }, { 18, 1 }, { -1, -1 }, { -1, -1 } }, { { 19, 0 }, { 22, 0 }, { 22, 1 }, { 25, 0 } } }; struct fd_config { ... u32 in[INPUT_WDMA_WRA_NUM]; ... }; if (j == 0) { fd_cur_cfg->in_0 = fd->base_para->rs_pym_rst_pa[2 - i][0]; fd_cur_cfg->in_1 = fd->base_para->rs_pym_rst_pa[2 - i][1]; fd_cur_cfg->in_2 = fd->base_para->rs_pym_rst_pa[2 - i][2]; } else { for (k = 0; k < INPUT_WDMA_WRA_NUM; k++) { if (fd_rdma_en[j][k][0] != -1) { uloop = fd_rdma_en[j][k][0] + i * RPN_LOOP_NUM; uch = fd_rdma_en[j][k][1]; fd_cur_cfg->in[k] = fd->dma_para->fd_out_hw_pa[uloop][uch]; } } } This looks more simple. > + > + /* OUT_FM_BASE_ADR */ > + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { > + if (out_stride_size[i][j]) > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->fd_out_hw_pa[i][j]); > + } > + > + /* KERNEL_BASE_ADR */ > + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { > + size = get_fd_ker_rdma_size(i, j); > + if (size) > + fd_cur_cfg[FD_KERNEL_0 + j] = > + (u32)(fd->dma_para->fd_kernel_pa[i][j]); > + } > + > + fd_cur_cfg[FD_CON_IN_BA_MSB] = (u32)0x02020202; It's not necessary to case to u32. > + fd_cur_cfg[FD_CON_OUT_BA_MSB] = (u32)0x02020202; > + fd_cur_cfg[FD_CON_KERNEL_BA_MSB] = (u32)0x00000202; > + } > + > + return 0; > +} > + > +static int aie_config_attr_network(struct mtk_aie_dev *fd, > + struct aie_enq_info *aie_cfg) > +{ > + bool is_regression_loop; > + void *fd_cfg; > + u32 *fd_cur_cfg; > + u16 fd_input_ht, fd_output_ht; > + u16 fd_out_y[4]; > + u8 i, j; > + u8 uloop, uch, uidx; > + u16 pyramid0_out_w, pyramid0_out_h; > + int fd_conv_ht; > + u16 sr_crp_w, sr_crp_h; > + > + sr_crp_w = fd->attr_para->crop_width[fd->attr_para->w_idx]; > + sr_crp_h = fd->attr_para->crop_height[fd->attr_para->w_idx]; > + > + pyramid0_out_w = ATTR_MODE_PYRAMID_WIDTH; > + pyramid0_out_h = pyramid0_out_w * sr_crp_h / sr_crp_w; > + > + fd_cfg = fd->base_para->attr_fd_cfg_va[fd->attr_para->w_idx]; > + > + for (i = 0; i < ATTR_LOOP_NUM; i++) { > + fd_cur_cfg = (u32 *)fd_cfg + fd->variant->fd_cfg_size * i; > + fd_cur_cfg[FD_INPUT_ROTATE] = > + (fd_cur_cfg[FD_INPUT_ROTATE] & 0xFFFF0FFF) | > + ((aie_cfg->rotate_degree << 12) & 0x3000); > + if (i == 0) > + fd_input_ht = pyramid0_out_h; > + else > + if (attr_out_stride2_as_in[i] == 0) > + fd_input_ht = fd_output_ht; > + else if (attr_out_stride2_as_in[i] == 1) > + fd_input_ht = (fd_output_ht + 1) / 2; > + > + fd_output_ht = DIV_ROUND_UP(fd_input_ht, ATTR_FD_STRIDE(i) + > + 2 * ATTR_FD_MAXPOOL(i)); > + fd_conv_ht = DIV_ROUND_UP(fd_input_ht, ATTR_FD_STRIDE(i)); > + > + fd_cur_cfg[FD_CONV_IMG_W_H] = > + (fd_cur_cfg[FD_CONV_IMG_W_H] & 0xFFFF0000) | > + (fd_conv_ht & 0xFFFF); > + fd_cur_cfg[FD_IN_IMG_W_H] = > + (fd_cur_cfg[FD_IN_IMG_W_H] & 0xFFFF0000) | > + (fd_input_ht & 0xFFFF); > + fd_cur_cfg[FD_OUT_IMG_W_H] = > + (fd_cur_cfg[FD_OUT_IMG_W_H] & 0xFFFF0000) | > + (fd_output_ht & 0xFFFF); I think you could define a structure and its member has different type, struct fd_cur_config { ... u16 in_w; u16 in_h; u16 out_w; u16 out_h; ... }; struct fd_cur_config *fd_cur_cfg; fd_cur_cfg->in_w = fd_input_ht; fd_cur_cfg->out_w = fd_output_ht; This look more readable. > + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE0, fd_input_ht - 1); > + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE1, fd_input_ht - 1); > + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE2, fd_input_ht - 1); > + set_cmb_cfg(fd_cur_cfg, FD_IN_X_Y_SIZE3, fd_input_ht - 1); > + > + is_regression_loop = (i == AGE_OUT_RGS || i == GENDER_OUT_RGS || > + i == INDIAN_OUT_RGS || i == ETHNICITY_OUT_RGS); > + > + if (is_regression_loop) { > + fd_out_y[0] = 0; > + fd_out_y[1] = 0; > + fd_out_y[2] = 0; > + fd_out_y[3] = 0; > + } else { > + fd_out_y[0] = fd_output_ht - 1; > + fd_out_y[1] = fd_output_ht - 1; > + if (attr_out_2size[i] == 0) { > + fd_out_y[2] = fd_output_ht - 1; > + fd_out_y[3] = fd_output_ht - 1; > + } else { > + fd_out_y[2] = (fd_output_ht + 1) / 2 - 1; > + fd_out_y[3] = (fd_output_ht + 1) / 2 - 1; > + } > + } > + > + for (j = 0; j < 4; j++) > + set_cmb_cfg(fd_cur_cfg, FD_OUT_X_Y_SIZE0 + 2 * j, fd_out_y[j]); > + > + /* IN_FM_BASE_ADR */ > + if (i == 0) { > + fd_cur_cfg[FD_IN_0] = (u32)(fd->base_para->rs_pym_rst_pa[0][0]); > + fd_cur_cfg[FD_IN_1] = (u32)(fd->base_para->rs_pym_rst_pa[0][1]); > + fd_cur_cfg[FD_IN_2] = (u32)(fd->base_para->rs_pym_rst_pa[0][2]); > + } else { > + for (j = 0; j < INPUT_WDMA_WRA_NUM; j++) { > + if (attr_rdma_en[i][j][0] != -1) { > + uloop = attr_rdma_en[i][j][0]; > + uch = attr_rdma_en[i][j][1]; > + fd_cur_cfg[FD_IN_0 + j] = > + (u32)(fd->dma_para->attr_out_hw_pa[uloop][uch]); > + } > + } > + } > + > + /* OUT_FM_BASE_ADR */ > + for (j = 0; j < OUTPUT_WDMA_WRA_NUM; j++) { > + if (attr_wdma_en[i][j]) { > + uidx = fd->attr_para->w_idx; > + if (i == AGE_OUT_RGS && j == 0) > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->age_out_hw_pa[uidx]); > + else if (i == GENDER_OUT_RGS && j == 0) > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->gender_out_hw_pa[uidx]); > + else if (i == INDIAN_OUT_RGS && j == 0) > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->is_indian_out_hw_pa[uidx]); > + else if (i == ETHNICITY_OUT_RGS && j == 0) > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->ethnicity_out_hw_pa[uidx]); > + else > + fd_cur_cfg[FD_OUT_0 + j] = > + (u32)(fd->dma_para->attr_out_hw_pa[i][j]); > + } > + } > + > + /* KERNEL_BASE_ADR */ > + for (j = 0; j < KERNEL_RDMA_RA_NUM; j++) { > + fd_cur_cfg[FD_KERNEL_0 + j] = > + (u32)(fd->dma_para->attr_kernel_pa[i][j]); > + } > + > + fd_cur_cfg[FD_CON_IN_BA_MSB] = (u32)0x02020202; > + fd_cur_cfg[FD_CON_OUT_BA_MSB] = (u32)0x02020202; > + fd_cur_cfg[FD_CON_KERNEL_BA_MSB] = (u32)0x00000202; > + } > + return 0; > +} > + [snip] > + > +static void aie_execute_attribute_detection(struct mtk_aie_dev *fd, > + struct aie_enq_info *aie_cfg) > +{ > + writel(0x0, fd->fd_base + AIE_START_REG); > + writel(0x00000101, fd->fd_base + AIE_ENABLE_REG); > + writel(0x00001A00, fd->fd_base + AIE_LOOP_REG); > + writel(0x1, fd->fd_base + AIE_INT_EN_REG); > + writel(fd->reg_cfg.rs_adr, fd->fd_base + AIE_RS_CON_BASE_ADR_REG); You does not assign fd->reg_cfg.rs_adr in attribute mode, why do you set it to register? > + writel(fd->reg_cfg.fd_adr, fd->fd_base + AIE_FD_CON_BASE_ADR_REG); > + writel(fd->reg_cfg.yuv2rgb_adr, fd->fd_base + AIE_YUV2RGB_CON_BASE_ADR_REG); > + writel(0x00000002, fd->fd_base + AIE_YUV2RGB_CON_BASE_ADR_MSB); > + writel(0x00000002, fd->fd_base + AIE_RS_CON_BASE_ADR_MSB); > + writel(0x00000002, fd->fd_base + AIE_FD_CON_BASE_ADR_MSB); > + writel(0x1, fd->fd_base + AIE_START_REG); > +} You could merge aie_execute_face_detection() and aie_execute_attribute_detection() into one function: aie_execute_face_attr_detection(struct mtk_aie_dev *fd, u32 nr_pyramid) { writel(0x0, fd->fd_base + AIE_START_REG); if (nr_pyramid) { writel(0x00000111, fd->fd_base + AIE_ENABLE_REG); loop_num = FD_LOOP_NUM / 3 * nr_pyramid; loop_reg_val = (loop_num << 8) | (nr_pyramid - 1); writel(loop_reg_val, fd->fd_base + AIE_LOOP_REG); } else { writel(0x00000101, fd->fd_base + AIE_ENABLE_REG); writel(0x00001A00, fd->fd_base + AIE_LOOP_REG); } writel(0x1, fd->fd_base + AIE_INT_EN_REG); writel(fd->reg_cfg.rs_adr, fd->fd_base + AIE_RS_CON_BASE_ADR_REG); writel(fd->reg_cfg.fd_adr, fd->fd_base + AIE_FD_CON_BASE_ADR_REG); writel(fd->reg_cfg.yuv2rgb_adr, fd->fd_base + AIE_YUV2RGB_CON_BASE_ADR_REG); writel(0x00000002, fd->fd_base + AIE_YUV2RGB_CON_BASE_ADR_MSB); writel(0x00000002, fd->fd_base + AIE_RS_CON_BASE_ADR_MSB); writel(0x00000002, fd->fd_base + AIE_FD_CON_BASE_ADR_MSB); writel(0x1, fd->fd_base + AIE_START_REG); } For attribute mode, nr_pyramid is 0. > + > +static void aie_execute_fld_detection(struct mtk_aie_dev *fd, > + struct aie_enq_info *aie_cfg) > +{ > + unsigned int i; > + > + writel(0x10, fd->fd_base + AIE_START_REG); > + writel(0x00011111, fd->fd_base + AIE_DMA_CTL_REG); > + writel(0x01111111, fd->fd_base + FLD_EN); > + writel(0x1, fd->fd_base + AIE_INT_EN_REG); > + for (i = 0; i < aie_cfg->fld_face_num; i++) { > + writel(aie_cfg->src_img_addr, fd->fd_base + FLD_BASE_ADDR_FACE_0 + i * 0x4); > + writel(aie_cfg->fld_input[i].fld_in_crop_x1 << 16 | > + aie_cfg->fld_input[i].fld_in_crop_y1, > + fd->fd_base + fld_face_info_0[i]); > + writel(aie_cfg->fld_input[i].fld_in_crop_x2 << 16 | > + aie_cfg->fld_input[i].fld_in_crop_y2, > + fd->fd_base + FLD_FACE_INFO(i, 1)); > + writel(aie_cfg->fld_input[i].fld_in_rip << 4 | > + aie_cfg->fld_input[i].fld_in_rop, > + fd->fd_base + FLD_FACE_INFO(i, 2)); > + } > + > + writel(aie_cfg->fld_face_num << 28 | FLD_FOREST << 16 | > + FLD_POINT, fd->fd_base + FLD_MODEL_PARA1); > + writel(13 << 16 | 0xfe9, fd->fd_base + FLD_MODEL_PARA14); > + writel(aie_cfg->src_img_width << 16 | aie_cfg->src_img_height, > + fd->fd_base + FLD_SRC_WD_HT); > + > + /*input settings*/ > + writel(0x007c003f, fd->fd_base + FLD_PL_IN_SIZE_0); > + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_0); > + writel(0x007c003f, fd->fd_base + FLD_PL_IN_SIZE_1); > + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_1); > + writel(0x0016003f, fd->fd_base + FLD_PL_IN_SIZE_2_0); > + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_2_0); > + writel(0x0013003f, fd->fd_base + FLD_PL_IN_SIZE_2_1); > + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_2_1); > + writel(0x0013003f, fd->fd_base + FLD_PL_IN_SIZE_2_2); > + writel(0x0040000f, fd->fd_base + FLD_PL_IN_STRIDE_2_2); > + writel(0x00a6001f, fd->fd_base + FLD_PL_IN_SIZE_3); > + writel(0x0020000f, fd->fd_base + FLD_PL_IN_STRIDE_3); > + > + /*output setting*/ > + writel((2400 * aie_cfg->fld_face_num - 1) << 16 | 127, > + fd->fd_base + FLD_SH_IN_SIZE_0); > + writel(0x0010000f, fd->fd_base + FLD_SH_IN_STRIDE_0); > + writel(fd->fld_para->fld_output_pa[0], > + fd->fd_base + FLD_TR_OUT_BASE_ADDR_0); fld_output_pa[] is used in this function and you just use fld_output_pa[0], so it's not necessary to use an array to store fld_output_pa. In aie_arrange_fld_buf(), fd->fld_para->fld_output_pa = fd->fd_fld_out_hw.pa; And here, writel(fd->fld_para->fld_output_pa, fd->fd_base + FLD_TR_OUT_BASE_ADDR_0); > + writel((aie_cfg->fld_face_num - 1) << 16 | 0x6f, > + fd->fd_base + FLD_TR_OUT_SIZE_0); > + writel(0x0070000f, fd->fd_base + FLD_TR_OUT_STRIDE_0); > + writel(fd->fld_para->fld_output_pa[0], > + fd->fd_base + FLD_PP_OUT_BASE_ADDR_0); Ditto. > + writel((aie_cfg->fld_face_num - 1) << 16 | 0x6f, > + fd->fd_base + FLD_PP_OUT_SIZE_0); > + writel(0x0070000f, fd->fd_base + FLD_PP_OUT_STRIDE_0); > + > + /*cv score*/ > + writel(0x00000001, fd->fd_base + FLD_BS_BIAS); > + writel(0x0000b835, fd->fd_base + FLD_CV_FM_RANGE_0); > + writel(0xffff5cba, fd->fd_base + FLD_CV_FM_RANGE_1); > + writel(0x00005ed5, fd->fd_base + FLD_CV_PM_RANGE_0); > + writel(0xffff910d, fd->fd_base + FLD_CV_PM_RANGE_1); > + writel(0x0000031e, fd->fd_base + FLD_BS_RANGE_0); > + writel(0xfffffcae, fd->fd_base + FLD_BS_RANGE_1); > + > + /* 6 steps */ > + writel(fd->fld_para->fld_step_pa[FLD_STEP_BLINK][14], > + fd->fd_base + FLD_BS_IN_BASE_ADDR_14); > + > + for (i = 0; i < 15; i++) { > + writel(fd->fld_para->fld_step_pa[FLD_STEP_CV][i], > + fd->fd_base + FLD_PL_IN_BASE_ADDR_2_(i)); > + > + writel(fd->fld_para->fld_step_pa[FLD_STEP_FP][i], > + fd->fd_base + FLD_PL_IN_BASE_ADDR_3_(i)); > + > + writel(fd->fld_para->fld_step_pa[FLD_STEP_LEAF][i], > + fd->fd_base + FLD_SH_IN_BASE_ADDR_(i)); > + > + writel(fd->fld_para->fld_step_pa[FLD_STEP_KM02][i], > + fd->fd_base + FLD_PL_IN_BASE_ADDR_0_(i)); > + > + writel(fd->fld_para->fld_step_pa[FLD_STEP_KM13][i], > + fd->fd_base + FLD_PL_IN_BASE_ADDR_1_(i)); > + } > + > + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_0_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_0_8_15_MSB); > + > + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_1_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_1_8_15_MSB); > + > + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_2_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_2_8_15_MSB); > + > + writel(0x22222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_3_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_PL_IN_BASE_ADDR_3_8_15_MSB); > + > + writel(0x22222222, fd->fd_base + FLD_SH_IN_BASE_ADDR_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_SH_IN_BASE_ADDR_8_15_MSB); > + > + writel(0x02000000, fd->fd_base + FLD_BS_IN_BASE_ADDR_8_15_MSB); > + > + writel(0x22222222, fd->fd_base + FLD_BASE_ADDR_FACE_0_7_MSB); > + writel(0x02222222, fd->fd_base + FLD_BASE_ADDR_FACE_8_14_MSB); > + writel(0x00000002, fd->fd_base + FLD_TR_OUT_BASE_ADDR_0_MSB); > + writel(0x00000002, fd->fd_base + FLD_PP_OUT_BASE_ADDR_0_MSB); > + > + /* fld mode + trigger start */ > + writel(0x11, fd->fd_base + AIE_START_REG); > +} > + [snip] > + > +/* return aie_cfg to user space */ > +void aie_get_fd_result(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) > +{ > + u32 fd_result_hw, fd_result_1_hw, fd_total_num; > + struct aie_enq_info *tmp_aie_cfg; > + void *fd_pym_result[PYM_NUM]; > + u32 fd_pyramid_num[PYM_NUM]; > + signed short landmark; > + unsigned int *pto12; > + struct fd_ret *prst; > + unsigned int i, j; > + > + aie_cfg->sel_mode = fd->base_para->sel_mode; > + aie_cfg->rotate_degree = fd->base_para->rotate_degree; > + aie_cfg->src_img_addr = fd->base_para->src_img_addr; > + aie_cfg->src_img_addr_uv = fd->base_para->src_img_addr_uv; > + aie_cfg->src_img_width = fd->base_para->img_rect.width; > + aie_cfg->src_img_height = fd->base_para->img_rect.height; > + aie_cfg->src_img_fmt = fd->base_para->src_img_fmt; User space know all these information, so it's not necessary to set these information to output buffer. Drop these. > + > + aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); User space is not necessary to know irq_status. Drop this. > + > + fd_result_hw = fd->reg_cfg.hw_result; > + fd_result_1_hw = fd->reg_cfg.hw_result1; > + fd_total_num = fd_result_hw & 0xFFF; > + fd_pyramid_num[0] = (fd_result_hw & 0xFFF0000) >> 16; > + fd_pyramid_num[1] = fd_result_1_hw & 0xFFF; > + fd_pyramid_num[2] = (fd_result_1_hw & 0xFFF0000) >> 16; > + > + if (fd_total_num == 0) > + goto nothing_out; > + > + tmp_aie_cfg = aie_cfg; Drop tmp_aie_cfg and use aie_cfg directly. > + > + tmp_aie_cfg->fd_out.fd_total_num = fd_total_num; > + tmp_aie_cfg->fd_out.fd_pyramid0_num = fd_pyramid_num[0]; > + tmp_aie_cfg->fd_out.fd_pyramid1_num = fd_pyramid_num[1]; > + tmp_aie_cfg->fd_out.fd_pyramid2_num = fd_pyramid_num[2]; > + > + switch (tmp_aie_cfg->number_of_pyramid) { > + case 1: > + fd_pym_result[2] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0]; Why use index 2? Use index 0 is more trivial. Add comment to describe why use such weird index. > + break; > + case 2: > + fd_pym_result[1] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0]; > + fd_pym_result[2] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(1)][0]; > + break; > + case 3: > + fd_pym_result[0] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(0)][0]; > + fd_pym_result[1] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(1)][0]; > + fd_pym_result[2] = fd->dma_para->fd_out_hw_va[RPNX_LOOP_NUM(2)][0]; > + break; > + default: > + dev_err(fd->dev, "Wrong number_of_pyramid\n"); > + goto nothing_out; > + } > + > + for (i = 0; i < 3; i++) { > + for (j = 0; j < fd_pyramid_num[i]; j++) { > + if (i == 0) > + prst = &tmp_aie_cfg->fd_out.pyramid0_result; > + else if (i == 1) > + prst = &tmp_aie_cfg->fd_out.pyramid1_result; > + else if (i == 2) > + prst = &tmp_aie_cfg->fd_out.pyramid2_result; If you define pyramid0_result to pyramid_result[0], then you could simplify this as prst = &tmp_aie_cfg->fd_out.pyramid_result[i]; In addition, this statement is just related to variable i, not related to variable j, so move this statement before the for-loop for j. > + > + pto12 = (unsigned int *)fd_pym_result[i] + 12 * j; > + > + prst->anchor_x0[j] = aie_get_lo16(*(pto12 + 0)); > + prst->anchor_y0[j] = aie_get_hi16(*(pto12 + 0)); > + prst->anchor_x1[j] = aie_get_lo16(*(pto12 + 1)); > + prst->anchor_y1[j] = aie_get_hi16(*(pto12 + 1)); > + > + if (prst->anchor_x1[j] == 0 || > + prst->anchor_y1[j] == 0) { > + dev_err(fd->dev, > + "wrong coordinate: i=%d j=%d M:%d %d %d %d\n", i, j, > + prst->anchor_x0[j], > + prst->anchor_x1[j], > + prst->anchor_y0[j], > + prst->anchor_y1[j] > + ); > + goto nothing_out; > + } > + > + /* ROP result at 1st run */ > + landmark = (*(pto12 + 2) & 0x3FF); > + prst->rop_landmark_score0[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 2) & 0xFFC00) >> 10); > + prst->rop_landmark_score1[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 2) & 0x3FF00000) >> 20); > + prst->rop_landmark_score2[j] = aie_refine_s16_value(landmark); > + > + prst->anchor_score[j] = aie_refine_s16_value(*(pto12 + 9) & 0x3FF); > + > + /* RIP result at 1st run */ > + landmark = ((*(pto12 + 9) & 0xFFC00) >> 10); > + prst->rip_landmark_score0[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 9) & 0x3FF00000) >> 20); > + prst->rip_landmark_score1[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 9) & 0xC0000000) >> 30) | > + ((*(pto12 + 10) & 0xFF) << 2); > + prst->rip_landmark_score2[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 10) & 0x3FF00) >> 8); > + prst->rip_landmark_score3[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 10) & 0xFFC0000) >> 18); > + prst->rip_landmark_score4[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 10) & 0xF0000000) >> 28) | > + ((*(pto12 + 11) & 0x3F) << 4); > + prst->rip_landmark_score5[j] = aie_refine_s16_value(landmark); > + landmark = ((*(pto12 + 11) & 0xFFC0) >> 6); > + prst->rip_landmark_score6[j] = aie_refine_s16_value(landmark); > + prst->face_result_index[j] = ((*(pto12 + 11) & 0xFFF0000) >> 16); > + prst->anchor_index[j] = ((*(pto12 + 11) & 0x70000000) >> 28); > + > + prst->fd_partial_result = fd_pyramid_num[i]; > + } > + } > + return; > +nothing_out: > + // Ensure that user mode does not receive an inappropriate result structure Use c style instead of c++ style comment. Run checkpatch before you send out patches. > + memset(&aie_cfg->fd_out, 0, sizeof(struct fd_result)); > +} > + > +void aie_get_attr_result(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) > +{ > + u32 *attr_ethnicity_result, *attr_gender_result; > + u32 *attr_age_result, *attr_is_indian_result; > + > + aie_cfg->sel_mode = fd->attr_para->sel_mode[fd->attr_para->r_idx]; > + aie_cfg->rotate_degree = fd->attr_para->rotate_degree[fd->attr_para->r_idx]; > + aie_cfg->src_img_addr = > + fd->attr_para->src_img_addr[fd->attr_para->r_idx]; > + aie_cfg->src_img_addr_uv = > + fd->attr_para->src_img_addr_uv[fd->attr_para->r_idx]; > + aie_cfg->src_img_width = fd->attr_para->img_width[fd->attr_para->r_idx]; > + aie_cfg->src_img_height = > + fd->attr_para->img_height[fd->attr_para->r_idx]; > + aie_cfg->src_img_fmt = fd->attr_para->src_img_fmt[fd->attr_para->r_idx]; User space know all these information, so it's not necessary to set these information to output buffer. Drop these. > + > + aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); User space is not necessary to know irq_status. Drop this. > + > + /* 64 feature * 32 bytes */ > + attr_age_result = > + (u32 *)fd->dma_para->age_out_hw_va[fd->attr_para->r_idx]; > + attr_gender_result = > + (u32 *)fd->dma_para->gender_out_hw_va[fd->attr_para->r_idx]; > + attr_is_indian_result = > + (u32 *)fd->dma_para->is_indian_out_hw_va[fd->attr_para->r_idx]; > + attr_ethnicity_result = > + (u32 *)fd->dma_para->ethnicity_out_hw_va[fd->attr_para->r_idx]; > + > + aie_cfg->attr_out.merged_age_ret[0] = aie_get_lo16(*attr_age_result); > + aie_cfg->attr_out.merged_age_ret[1] = aie_get_hi16(*attr_age_result); Define merged_age_ret as u32, u32 merged_age_ret; And this would be aie_cfg->attr_out.merged_age_ret = *attr_age_result; > + > + aie_cfg->attr_out.merged_gender_ret[0] = aie_get_lo16(*attr_gender_result); > + aie_cfg->attr_out.merged_gender_ret[1] = aie_get_hi16(*attr_gender_result); u32 merged_gender_ret; > + > + aie_cfg->attr_out.merged_is_indian_ret[0] = aie_get_lo16(*attr_is_indian_result); > + aie_cfg->attr_out.merged_is_indian_ret[1] = aie_get_hi16(*attr_is_indian_result); u32 merged_is_indian_ret; > + > + aie_cfg->attr_out.merged_ethnicity_ret[0] = aie_get_lo16(*attr_ethnicity_result); > + aie_cfg->attr_out.merged_ethnicity_ret[1] = aie_get_hi16(*attr_ethnicity_result); > + aie_cfg->attr_out.merged_ethnicity_ret[2] = aie_get_lo16(*(attr_ethnicity_result + 1)); u64 merged_ethnicity_ret; > + > + fd->attr_para->r_idx = (fd->attr_para->r_idx + 1) % MAX_ENQUE_FRAME_NUM; > +} > + > +void aie_get_fld_result(struct mtk_aie_dev *fd, struct aie_enq_info *aie_cfg) > +{ > + u8 fld_rlt[FLD_OUTPUT_X_SIZE][FLD_OUTPUT_SIZE]; > + u16 *out_parsing; > + int i, j; > + > + aie_cfg->irq_status = readl(fd->fd_base + AIE_INT_EN_REG); > + > + memcpy(fld_rlt, fd->fld_para->fld_output_va[0], sizeof(fld_rlt)); fld_output_va[] is used in this function and you just use fld_output_va[0], so it's not necessary to use an array to store fld_output_va. In aie_arrange_fld_buf(), fd->fld_para->fld_output_va = fd->fd_fld_out_hw.va; In below for-loop, out_parsing = fd->fld_para->fld_output_va + j * FLD_OUTPUT_SIZE; > + > + for (j = 0; j < aie_cfg->fld_face_num; j++) { > + out_parsing = (unsigned short *)&fld_rlt[j][0]; > + for (i = 0; i < FLD_CUR_LANDMARK; i++) { > + aie_cfg->fld_out[j].fld_landmark[i].x = *out_parsing; > + aie_cfg->fld_out[j].fld_landmark[i].y = *(out_parsing + 1); aie_cfg->fld_out[j].fld_landmark[i].x = out_parsing[0]; aie_cfg->fld_out[j].fld_landmark[i].y = out_parsing[1]; looks better. > + > + if (i % 2) > + out_parsing = out_parsing + 6; > + else > + out_parsing = out_parsing + 2; > + } > + out_parsing = (unsigned short *)&fld_rlt[j][0]; > + if (FLD_CUR_LANDMARK % 2) > + out_parsing = out_parsing + ((FLD_CUR_LANDMARK + 1) / 2) * 8; > + else > + out_parsing = out_parsing + (FLD_CUR_LANDMARK / 2) * 8; Even though FLD_CUR_LANDMARK % 2 is 0, you could still use below formula, out_parsing = out_parsing + ((FLD_CUR_LANDMARK + 1) / 2) * 8; So you could drop this if-checking and use this formula for all case. > + > + aie_cfg->fld_out[j].fld_out_rop = *out_parsing; > + aie_cfg->fld_out[j].fld_out_rip = *(out_parsing + 1); > + aie_cfg->fld_out[j].confidence = *(out_parsing + 2); > + aie_cfg->fld_out[j].blinkscore = *(out_parsing + 3); aie_cfg->fld_out[j].fld_out_rop = out_parsing[0]; aie_cfg->fld_out[j].fld_out_rip = out_parsing[1]; aie_cfg->fld_out[j].confidence = out_parsing[2]; aie_cfg->fld_out[j].blinkscore = out_parsing[3]; looks better. > + } > +} > diff --git a/drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c b/drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c > new file mode 100644 > index 000000000000..28646d5a53aa > --- /dev/null > [snip] > + > +static void mtk_aie_fill_pixfmt_mp(struct v4l2_pix_format_mplane *dfmt, > + const struct v4l2_pix_format_mplane *sfmt) > +{ > + dfmt->field = V4L2_FIELD_NONE; > + dfmt->colorspace = V4L2_COLORSPACE_BT2020; > + dfmt->num_planes = sfmt->num_planes; > + dfmt->ycbcr_enc = V4L2_YCBCR_ENC_DEFAULT; > + dfmt->quantization = V4L2_QUANTIZATION_DEFAULT; > + dfmt->xfer_func = V4L2_MAP_XFER_FUNC_DEFAULT(dfmt->colorspace); > + dfmt->pixelformat = sfmt->pixelformat; > + > + /* Keep user setting as possible */ > + dfmt->width = clamp(dfmt->width, MTK_FD_OUTPUT_MIN_WIDTH, > + MTK_FD_OUTPUT_MAX_WIDTH); > + dfmt->height = clamp(dfmt->height, MTK_FD_OUTPUT_MIN_HEIGHT, > + MTK_FD_OUTPUT_MAX_HEIGHT); > + > + dfmt->plane_fmt[0].bytesperline = ALIGN(dfmt->width, 16); > + dfmt->plane_fmt[1].bytesperline = ALIGN(dfmt->width, 16); > + > + dfmt->plane_fmt[0].sizeimage = dfmt->height * dfmt->plane_fmt[0].bytesperline; > + dfmt->plane_fmt[1].sizeimage = dfmt->height * dfmt->plane_fmt[1].bytesperline; > + if (sfmt->num_planes == 2 && sfmt->pixelformat == V4L2_PIX_FMT_NV12M) { if (sfmt->pixelformat == V4L2_PIX_FMT_NV12M) { V4L2_PIX_FMT_NV12M imply that num_planes is 2, so checking 'sfmt->num_planes == 2' is redundant. > + dfmt->plane_fmt[1].sizeimage /= 2; > + } else if (sfmt->pixelformat == V4L2_PIX_FMT_NV12) { > + dfmt->plane_fmt[0].sizeimage *= 3; > + dfmt->plane_fmt[0].sizeimage /= 2; > + } > +} [snip] > + > +static __poll_t mtk_vfd_fop_poll(struct file *file, poll_table *wait) > +{ > + int ret; > + > + struct mtk_aie_ctx *ctx = container_of(file->private_data, struct mtk_aie_ctx, fh); > + struct mtk_aie_dev *fd = ctx->fd_dev; > + > + if (fd->fd_state & STATE_INIT) { > + /* Waiting Job Finsh */ > + ret = wait_for_completion_timeout(&fd->fd_job_finished, msecs_to_jiffies(1000)); It's not necessary to wait here because v4l2_m2m_fop_poll() would do waiting. > + if (!ret) { > + dev_err(ctx->dev, "Wait job finish timeout from poll\n"); > + return EPOLLERR; > + } > + } > + > + return v4l2_m2m_fop_poll(file, wait); > +} > + > +static const struct v4l2_file_operations fd_video_fops = { > + .owner = THIS_MODULE, > + .open = mtk_vfd_open, > + .release = mtk_vfd_release, > + .poll = mtk_vfd_fop_poll, > + .unlocked_ioctl = video_ioctl2, > + .mmap = v4l2_m2m_fop_mmap, > +}; > + > +static int mtk_aie_job_ready(void *priv) > +{ > + struct vb2_v4l2_buffer *src_buf, *dst_buf; > + struct mtk_aie_ctx *ctx = priv; > + struct mtk_aie_dev *fd = ctx->fd_dev; > + struct fd_buffer src_img[2] = {}; > + void *plane_vaddr; > + > + if (!ctx->fh.m2m_ctx) { When file is open, ctx->fh.m2m_ctx is valid. And device_run() callback is called when file is opened. So this checking is redundant. > + dev_err(fd->dev, "Memory-to-memory context is NULL\n"); > + return -1; > + } > + > + if (!(fd->fd_state & STATE_OPEN)) { I think when file is clode, V4L2 would not call device_run() callback function. So this checking is redundant. > + dev_err(fd->dev, "Job ready with device closed\n"); > + return -1; > + } > + > + mutex_lock(&fd->fd_lock); > + > + src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx); > + dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx); > + > + if (!src_buf || !dst_buf) { > + dev_err(fd->dev, "src or dst buf is NULL\n"); > + mutex_unlock(&fd->fd_lock); > + return -1; > + } > + > + if (!(fd->fd_state & STATE_INIT)) { > + dev_err(fd->dev, "%s Wrong fd state: %d\n", __func__, fd->fd_state); > + mutex_unlock(&fd->fd_lock); > + return -1; > + } > + > + plane_vaddr = vb2_plane_vaddr(&dst_buf->vb2_buf, 0); > + if (!plane_vaddr) { > + dev_err(fd->dev, "Failed to get plane virtual address\n"); > + mutex_unlock(&fd->fd_lock); > + return -1; > + } > + > + v4l2_ctrl_request_setup(src_buf->vb2_buf.req_obj.req, &ctx->hdl); > + > + fd->aie_cfg = (struct aie_enq_info *)plane_vaddr; > + fd->aie_cfg->fld_face_num = ctx->user_param.fld_face_num; You clear fd->aie_cfg below, so here assign fd->aie_cfg->fld_face_num is redundant. Drop it. > + > + memset(fd->aie_cfg, 0, sizeof(struct aie_enq_info)); > + memcpy(fd->aie_cfg, &ctx->user_param, sizeof(struct aie_enq_info)); It's not necessary to copy user space setting to output buffer. User space know all these setting. > + memcpy(fd->aie_cfg->fld_input, ctx->user_param.fld_input, > + FLD_MAX_FRAME * sizeof(struct v4l2_fld_crop_rip_rop)); > + > + src_img[0].dma_addr = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0); > + > + if (ctx->src_fmt.num_planes == 2) { > + src_img[1].dma_addr = > + vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 1); > + } > + > + if ((fd->aie_cfg->sel_mode == FDMODE || fd->aie_cfg->sel_mode == ATTRIBUTEMODE) && > + fd->aie_cfg->src_img_fmt == FMT_YUV420_1P) { > + src_img[1].dma_addr = src_img[0].dma_addr + ctx->user_param.src_img_stride * > + ctx->user_param.src_img_height; > + } > + > + fd->aie_cfg->src_img_addr = src_img[0].dma_addr; > + fd->aie_cfg->src_img_addr_uv = src_img[1].dma_addr; fd->aie_cfg is output buffer, so do not write src_img_addr and src_img_addr_uv to output buffer. You should define src_img_addr and src_img_addr_uv to somewhere not output buffer. > + > + aie_prepare(fd, fd->aie_cfg); > + > + mutex_unlock(&fd->fd_lock); > + > + if (src_buf) { > + /* Complete request controls if any */ > + v4l2_ctrl_request_complete(src_buf->vb2_buf.req_obj.req, &ctx->hdl); > + } > + > + return 0; > +} [snip] > + > +static void mtk_aie_frame_done_worker(struct work_struct *work) > +{ > + struct mtk_aie_req_work *req_work = (struct mtk_aie_req_work *)work; > + struct mtk_aie_dev *fd = (struct mtk_aie_dev *)req_work->fd_dev; > + > + if (fd->reg_cfg.fd_mode == FDMODE) { > + fd->reg_cfg.hw_result = readl(fd->fd_base + AIE_RESULT_0_REG); > + fd->reg_cfg.hw_result1 = readl(fd->fd_base + AIE_RESULT_1_REG); hw_result and hw_result1 is used only in aie_get_fd_result() which will be called in next statement. So you could read AIE_RESULT_0_REG and AIE_RESULT_1_REG in aie_get_fd_result() and drop hw_result and hw_result1. Regards, CK > + } > + > + mutex_lock(&fd->fd_lock); > + > + switch (fd->aie_cfg->sel_mode) { > + case FDMODE: > + aie_get_fd_result(fd, fd->aie_cfg); > + break; > + case ATTRIBUTEMODE: > + aie_get_attr_result(fd, fd->aie_cfg); > + break; > + case FLDMODE: > + aie_get_fld_result(fd, fd->aie_cfg); > + break; > + default: > + dev_dbg(fd->dev, "Wrong sel_mode\n"); > + break; > + } > + > + mutex_unlock(&fd->fd_lock); > + > + if (!cancel_delayed_work(&fd->job_timeout_work)) > + return; > + > + atomic_dec(&fd->num_composing); > + mtk_aie_hw_job_finish(fd, VB2_BUF_STATE_DONE); > + wake_up(&fd->flushing_waitq); > +} ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v5 0/4] Add AIE driver support for mt8188 2025-04-03 7:38 [PATCH v5 0/4] Add AIE driver support for mt8188 bo.kong ` (3 preceding siblings ...) 2025-04-03 7:38 ` [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver bo.kong @ 2025-04-03 14:35 ` Rob Herring (Arm) 2025-04-03 19:52 ` Laurent Pinchart 5 siblings, 0 replies; 17+ messages in thread From: Rob Herring (Arm) @ 2025-04-03 14:35 UTC (permalink / raw) To: bo.kong Cc: Mauro Carvalho Chehab, mtk29348, linux-kernel, linux-media, Teddy.Chen, Project_Global_Chrome_Upstream_Group, zhaoyuan.chen, AngeloGioacchino Del Regno, linux-arm-kernel, linux-mediatek On Thu, 03 Apr 2025 15:38:32 +0800, bo.kong wrote: > From: Bo Kong <Bo.Kong@mediatek.com> > > AIE(AI Engine) is one of the units in mt8188 ISP which provides hardware accelerated face detection function, it can detect different sizes of faces in a raw image. > > Bo Kong (4): > media: dt-bindings: add MT8188 AIE > arm64: dts: mt8188: add aie node > uapi: linux: add MT8188 AIE > media: mediatek: add MT8188 AIE driver > > .../bindings/media/mediatek,mt8188-aie.yaml | 78 + > arch/arm64/boot/dts/mediatek/mt8188.dtsi | 33 + > drivers/media/platform/mediatek/Kconfig | 1 + > drivers/media/platform/mediatek/Makefile | 1 + > drivers/media/platform/mediatek/aie/Kconfig | 12 + > drivers/media/platform/mediatek/aie/Makefile | 5 + > drivers/media/platform/mediatek/aie/mtk_aie.h | 870 ++++++ > .../media/platform/mediatek/aie/mtk_aie_drv.c | 2398 +++++++++++++++++ > .../platform/mediatek/aie/mtk_aie_v4l2.c | 1295 +++++++++ > drivers/media/v4l2-core/v4l2-ioctl.c | 3 + > include/uapi/linux/mtk_aie_v4l2_controls.h | 308 +++ > include/uapi/linux/videodev2.h | 6 + > 12 files changed, 5010 insertions(+) > create mode 100644 Documentation/devicetree/bindings/media/mediatek,mt8188-aie.yaml > create mode 100644 drivers/media/platform/mediatek/aie/Kconfig > create mode 100644 drivers/media/platform/mediatek/aie/Makefile > create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie.h > create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie_drv.c > create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c > create mode 100644 include/uapi/linux/mtk_aie_v4l2_controls.h > > -- > 2.45.2 > > > My bot found new DTB warnings on the .dts files added or changed in this series. Some warnings may be from an existing SoC .dtsi. Or perhaps the warnings are fixed by another series. Ultimately, it is up to the platform maintainer whether these warnings are acceptable or not. No need to reply unless the platform maintainer has comments. If you already ran DT checks and didn't see these error(s), then make sure dt-schema is up to date: pip3 install dtschema --upgrade This patch series was applied (using b4) to base: Base: attempting to guess base-commit... Base: tags/next-20250403 (exact match) If this is not the correct base, please add 'base-commit' tag (or use b4 which does this automatically) New warnings running 'make CHECK_DTBS=y for arch/arm64/boot/dts/mediatek/' for 20250403074005.21472-1-bo.kong@mediatek.com: arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku4.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku4.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8188-geralt-ciri-sku4.dtb] Error 2 arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8370-genio-510-evk.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8370-genio-510-evk.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8370-genio-510-evk.dtb] Error 2 arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8390-genio-700-evk.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8390-genio-700-evk.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8390-genio-700-evk.dtb] Error 2 arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku1.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku1.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8188-geralt-ciri-sku1.dtb] Error 2 arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku7.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku7.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8188-geralt-ciri-sku7.dtb] Error 2 arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku3.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku3.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8188-geralt-ciri-sku3.dtb] Error 2 arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8188-evb.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8188-evb.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8188-evb.dtb] Error 2 arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku6.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku6.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8188-geralt-ciri-sku6.dtb] Error 2 arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku0.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku0.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8188-geralt-ciri-sku0.dtb] Error 2 arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku2.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku2.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8188-geralt-ciri-sku2.dtb] Error 2 arch/arm64/boot/dts/mediatek/mt8188.dtsi:2359.25-2370.5: ERROR (phandle_references): /soc/larb@15340000: Reference to non-existent node or label "smi_img1" ERROR: Input tree has errors, aborting (use -f to force output) make[3]: *** [scripts/Makefile.dtbs:131: arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku5.dtb] Error 2 make[2]: *** [scripts/Makefile.build:461: arch/arm64/boot/dts/mediatek] Error 2 make[2]: Target 'arch/arm64/boot/dts/mediatek/mt8188-geralt-ciri-sku5.dtb' not remade because of errors. make[1]: *** [/home/rob/proj/linux-dt-testing/Makefile:1475: mediatek/mt8188-geralt-ciri-sku5.dtb] Error 2 make: *** [Makefile:248: __sub-make] Error 2 make: Target 'mediatek/mt8183-pumpkin.dtb' not remade because of errors. make: Target 'mediatek/mt6797-evb.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-kodama-sku16.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-fennel14.dtb' not remade because of errors. make: Target 'mediatek/mt8395-genio-1200-evk.dtb' not remade because of errors. make: Target 'mediatek/mt7986a-bananapi-bpi-r3.dtb' not remade because of errors. make: Target 'mediatek/mt8173-evb.dtb' not remade because of errors. make: Target 'mediatek/mt7986b-rfb.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-willow-sku0.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-tentacruel-sku262144.dtb' not remade because of errors. make: Target 'mediatek/mt8195-cherry-tomato-r2.dtb' not remade because of errors. make: Target 'mediatek/mt7981b-openwrt-one.dtb' not remade because of errors. make: Target 'mediatek/mt8173-elm.dtb' not remade because of errors. make: Target 'mediatek/mt7622-rfb1.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-fennel-sku1.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-damu.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-willow-sku1.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-tentacool-sku327681.dtb' not remade because of errors. make: Target 'mediatek/mt6779-evb.dtb' not remade because of errors. make: Target 'mediatek/mt8188-geralt-ciri-sku4.dtb' not remade because of errors. make: Target 'mediatek/mt8173-elm-hana-rev7.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-starmie-sku1.dtb' not remade because of errors. make: Target 'mediatek/mt8370-genio-510-evk.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-kenzo.dtb' not remade because of errors. make: Target 'mediatek/mt2712-evb.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-tentacool-sku327683.dtb' not remade because of errors. make: Target 'mediatek/mt7981b-cudy-wr3000-v1.dtb' not remade because of errors. make: Target 'mediatek/mt8192-asurada-hayato-r1.dtb' not remade because of errors. make: Target 'mediatek/mt8390-genio-700-evk.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-fennel-sku7.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-makomo-sku1.dtb' not remade because of errors. make: Target 'mediatek/mt8395-kontron-3-5-sbc-i1200.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-pico6.dtb' not remade because of errors. make: Target 'mediatek/mt8188-geralt-ciri-sku1.dtb' not remade because of errors. make: Target 'mediatek/mt8195-demo.dtb' not remade because of errors. make: Target 'mediatek/mt8173-elm-hana.dtb' not remade because of errors. make: Target 'mediatek/mt6755-evb.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-chinchou-sku16.dtb' not remade because of errors. make: Target 'mediatek/mt8192-asurada-spherion-r0.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-chinchou-sku0.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-kodama-sku32.dtb' not remade because of errors. make: Target 'mediatek/mt7622-bananapi-bpi-r64.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-chinchou-sku1.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-voltorb-sku589825.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-magneton-sku393218.dtb' not remade because of errors. make: Target 'mediatek/mt8183-evb.dtb' not remade because of errors. make: Target 'mediatek/mt8188-geralt-ciri-sku7.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-fennel-sku6.dtb' not remade because of errors. make: Target 'mediatek/mt8186-evb.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-magneton-sku393216.dtb' not remade because of errors. make: Target 'mediatek/mt8188-geralt-ciri-sku3.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-fennel14-sku2.dtb' not remade because of errors. make: Target 'mediatek/mt6797-x20-dev.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-cozmo.dtb' not remade because of errors. make: Target 'mediatek/mt8188-evb.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-krane-sku0.dtb' not remade because of errors. make: Target 'mediatek/mt8365-evk.dtb' not remade because of errors. make: Target 'mediatek/mt8195-evb.dtb' not remade because of errors. make: Target 'mediatek/mt8395-radxa-nio-12l.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-makomo-sku0.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-steelix-sku131073.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-voltorb-sku589824.dtb' not remade because of errors. make: Target 'mediatek/mt7988a-bananapi-bpi-r4.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-starmie-sku0.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-burnet.dtb' not remade because of errors. make: Target 'mediatek/mt8188-geralt-ciri-sku6.dtb' not remade because of errors. make: Target 'mediatek/mt7986a-acelink-ew-7886cax.dtb' not remade because of errors. make: Target 'mediatek/mt8516-pumpkin.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-kakadu-sku22.dtb' not remade because of errors. make: Target 'mediatek/mt7986a-rfb.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-kodama-sku288.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-krane-sku176.dtb' not remade because of errors. make: Target 'mediatek/mt8188-geralt-ciri-sku0.dtb' not remade because of errors. make: Target 'mediatek/mt6795-evb.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-tentacruel-sku262148.dtb' not remade because of errors. make: Target 'mediatek/mt8167-pumpkin.dtb' not remade because of errors. make: Target 'mediatek/mt7981b-xiaomi-ax3000t.dtb' not remade because of errors. make: Target 'mediatek/mt8188-geralt-ciri-sku2.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-steelix-sku131072.dtb' not remade because of errors. make: Target 'mediatek/mt8195-cherry-dojo-r1.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-kappa.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-kakadu.dtb' not remade because of errors. make: Target 'mediatek/mt7986a-bananapi-bpi-r3-mini.dtb' not remade because of errors. make: Target 'mediatek/mt6795-sony-xperia-m5.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-magneton-sku393217.dtb' not remade because of errors. make: Target 'mediatek/mt8192-evb.dtb' not remade because of errors. make: Target 'mediatek/mt8195-cherry-tomato-r3.dtb' not remade because of errors. make: Target 'mediatek/mt8195-cherry-tomato-r1.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-juniper-sku16.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-katsu-sku32.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-katsu-sku38.dtb' not remade because of errors. make: Target 'mediatek/mt8188-geralt-ciri-sku5.dtb' not remade because of errors. make: Target 'mediatek/mt8186-corsola-rusty-sku196608.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-jacuzzi-pico.dtb' not remade because of errors. make: Target 'mediatek/mt8183-kukui-kodama-sku272.dtb' not remade because of errors. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v5 0/4] Add AIE driver support for mt8188 2025-04-03 7:38 [PATCH v5 0/4] Add AIE driver support for mt8188 bo.kong ` (4 preceding siblings ...) 2025-04-03 14:35 ` [PATCH v5 0/4] Add AIE driver support for mt8188 Rob Herring (Arm) @ 2025-04-03 19:52 ` Laurent Pinchart 5 siblings, 0 replies; 17+ messages in thread From: Laurent Pinchart @ 2025-04-03 19:52 UTC (permalink / raw) To: bo.kong Cc: Rob Herring, AngeloGioacchino Del Regno, Mauro Carvalho Chehab, linux-media, linux-kernel, linux-arm-kernel, linux-mediatek, zhaoyuan.chen, Teddy.Chen, Project_Global_Chrome_Upstream_Group Hi Bo, Thank you for the patches. On Thu, Apr 03, 2025 at 03:38:32PM +0800, bo.kong wrote: > From: Bo Kong <Bo.Kong@mediatek.com> > > AIE(AI Engine) is one of the units in mt8188 ISP which provides hardware accelerated face detection function, it can detect different sizes of faces in a raw image. This series is missing: - The v4l2-compliance report - Documentation of the UAPI - A pointer to an open-source userspace implementation using the device, in a project relevant to the field of use > Bo Kong (4): > media: dt-bindings: add MT8188 AIE > arm64: dts: mt8188: add aie node > uapi: linux: add MT8188 AIE > media: mediatek: add MT8188 AIE driver > > .../bindings/media/mediatek,mt8188-aie.yaml | 78 + > arch/arm64/boot/dts/mediatek/mt8188.dtsi | 33 + > drivers/media/platform/mediatek/Kconfig | 1 + > drivers/media/platform/mediatek/Makefile | 1 + > drivers/media/platform/mediatek/aie/Kconfig | 12 + > drivers/media/platform/mediatek/aie/Makefile | 5 + > drivers/media/platform/mediatek/aie/mtk_aie.h | 870 ++++++ > .../media/platform/mediatek/aie/mtk_aie_drv.c | 2398 +++++++++++++++++ > .../platform/mediatek/aie/mtk_aie_v4l2.c | 1295 +++++++++ > drivers/media/v4l2-core/v4l2-ioctl.c | 3 + > include/uapi/linux/mtk_aie_v4l2_controls.h | 308 +++ > include/uapi/linux/videodev2.h | 6 + > 12 files changed, 5010 insertions(+) > create mode 100644 Documentation/devicetree/bindings/media/mediatek,mt8188-aie.yaml > create mode 100644 drivers/media/platform/mediatek/aie/Kconfig > create mode 100644 drivers/media/platform/mediatek/aie/Makefile > create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie.h > create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie_drv.c > create mode 100644 drivers/media/platform/mediatek/aie/mtk_aie_v4l2.c > create mode 100644 include/uapi/linux/mtk_aie_v4l2_controls.h -- Regards, Laurent Pinchart ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2025-04-24 5:22 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-04-03 7:38 [PATCH v5 0/4] Add AIE driver support for mt8188 bo.kong 2025-04-03 7:38 ` [PATCH v5 1/4] media: dt-bindings: add MT8188 AIE bo.kong 2025-04-04 6:04 ` Krzysztof Kozlowski 2025-04-03 7:38 ` [PATCH v5 2/4] arm64: dts: mt8188: add aie node bo.kong 2025-04-04 6:04 ` Krzysztof Kozlowski 2025-04-03 7:38 ` [PATCH v5 3/4] uapi: linux: add MT8188 AIE bo.kong 2025-04-07 3:57 ` CK Hu (胡俊光) 2025-04-23 4:02 ` Bo Kong (孔波) 2025-04-24 1:43 ` CK Hu (胡俊光) 2025-04-24 5:22 ` CK Hu (胡俊光) 2025-04-23 9:29 ` Bo Kong (孔波) 2025-04-03 7:38 ` [PATCH v5 4/4] media: mediatek: add MT8188 AIE driver bo.kong 2025-04-03 19:31 ` Nicolas Dufresne 2025-04-04 17:56 ` kernel test robot 2025-04-17 6:39 ` CK Hu (胡俊光) 2025-04-03 14:35 ` [PATCH v5 0/4] Add AIE driver support for mt8188 Rob Herring (Arm) 2025-04-03 19:52 ` Laurent Pinchart
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox