* Re: [PATCH] doc/virt/kvm: move KVM_X86_SET_MSR_FILTER in section 8
From: Jonathan Corbet @ 2021-04-13 21:20 UTC (permalink / raw)
To: Emanuele Giuseppe Esposito, linux-doc; +Cc: Paolo Bonzini, kvm, linux-kernel
In-Reply-To: <20210316170814.64286-1-eesposit@redhat.com>
Emanuele Giuseppe Esposito <eesposit@redhat.com> writes:
> KVM_X86_SET_MSR_FILTER is a capability, not an ioctl.
> Therefore move it from section 4.97 to the new 8.31 (other capabilities).
>
> To fill the gap, move KVM_X86_SET_MSR_FILTER (was 4.126) to
> 4.97, and shifted Xen-related ioctl (were 4.127 - 4.130) by
> one place (4.126 - 4.129).
>
> Also fixed minor typo in KVM_GET_MSR_INDEX_LIST ioctl description
> (section 4.3).
>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
> Documentation/virt/kvm/api.rst | 250 ++++++++++++++++-----------------
> 1 file changed, 125 insertions(+), 125 deletions(-)
Paolo, what's your thought on this one? If it's OK should I pick it up?
Thanks,
jon
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 1a2b5210cdbf..a230140d6a7f 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -201,7 +201,7 @@ Errors:
>
> ====== ============================================================
> EFAULT the msr index list cannot be read from or written to
> - E2BIG the msr index list is to be to fit in the array specified by
> + E2BIG the msr index list is too big to fit in the array specified by
> the user.
> ====== ============================================================
>
> @@ -3686,31 +3686,105 @@ which is the maximum number of possibly pending cpu-local interrupts.
>
> Queues an SMI on the thread's vcpu.
>
> -4.97 KVM_CAP_PPC_MULTITCE
> --------------------------
> +4.97 KVM_X86_SET_MSR_FILTER
> +----------------------------
>
> -:Capability: KVM_CAP_PPC_MULTITCE
> -:Architectures: ppc
> -:Type: vm
> +:Capability: KVM_X86_SET_MSR_FILTER
> +:Architectures: x86
> +:Type: vm ioctl
> +:Parameters: struct kvm_msr_filter
> +:Returns: 0 on success, < 0 on error
>
> -This capability means the kernel is capable of handling hypercalls
> -H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
> -space. This significantly accelerates DMA operations for PPC KVM guests.
> -User space should expect that its handlers for these hypercalls
> -are not going to be called if user space previously registered LIOBN
> -in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).
> +::
>
> -In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
> -user space might have to advertise it for the guest. For example,
> -IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
> -present in the "ibm,hypertas-functions" device-tree property.
> + struct kvm_msr_filter_range {
> + #define KVM_MSR_FILTER_READ (1 << 0)
> + #define KVM_MSR_FILTER_WRITE (1 << 1)
> + __u32 flags;
> + __u32 nmsrs; /* number of msrs in bitmap */
> + __u32 base; /* MSR index the bitmap starts at */
> + __u8 *bitmap; /* a 1 bit allows the operations in flags, 0 denies */
> + };
>
> -The hypercalls mentioned above may or may not be processed successfully
> -in the kernel based fast path. If they can not be handled by the kernel,
> -they will get passed on to user space. So user space still has to have
> -an implementation for these despite the in kernel acceleration.
> + #define KVM_MSR_FILTER_MAX_RANGES 16
> + struct kvm_msr_filter {
> + #define KVM_MSR_FILTER_DEFAULT_ALLOW (0 << 0)
> + #define KVM_MSR_FILTER_DEFAULT_DENY (1 << 0)
> + __u32 flags;
> + struct kvm_msr_filter_range ranges[KVM_MSR_FILTER_MAX_RANGES];
> + };
>
> -This capability is always enabled.
> +flags values for ``struct kvm_msr_filter_range``:
> +
> +``KVM_MSR_FILTER_READ``
> +
> + Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
> + indicates that a read should immediately fail, while a 1 indicates that
> + a read for a particular MSR should be handled regardless of the default
> + filter action.
> +
> +``KVM_MSR_FILTER_WRITE``
> +
> + Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
> + indicates that a write should immediately fail, while a 1 indicates that
> + a write for a particular MSR should be handled regardless of the default
> + filter action.
> +
> +``KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE``
> +
> + Filter both read and write accesses to MSRs using the given bitmap. A 0
> + in the bitmap indicates that both reads and writes should immediately fail,
> + while a 1 indicates that reads and writes for a particular MSR are not
> + filtered by this range.
> +
> +flags values for ``struct kvm_msr_filter``:
> +
> +``KVM_MSR_FILTER_DEFAULT_ALLOW``
> +
> + If no filter range matches an MSR index that is getting accessed, KVM will
> + fall back to allowing access to the MSR.
> +
> +``KVM_MSR_FILTER_DEFAULT_DENY``
> +
> + If no filter range matches an MSR index that is getting accessed, KVM will
> + fall back to rejecting access to the MSR. In this mode, all MSRs that should
> + be processed by KVM need to explicitly be marked as allowed in the bitmaps.
> +
> +This ioctl allows user space to define up to 16 bitmaps of MSR ranges to
> +specify whether a certain MSR access should be explicitly filtered for or not.
> +
> +If this ioctl has never been invoked, MSR accesses are not guarded and the
> +default KVM in-kernel emulation behavior is fully preserved.
> +
> +Calling this ioctl with an empty set of ranges (all nmsrs == 0) disables MSR
> +filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
> +an error.
> +
> +As soon as the filtering is in place, every MSR access is processed through
> +the filtering except for accesses to the x2APIC MSRs (from 0x800 to 0x8ff);
> +x2APIC MSRs are always allowed, independent of the ``default_allow`` setting,
> +and their behavior depends on the ``X2APIC_ENABLE`` bit of the APIC base
> +register.
> +
> +If a bit is within one of the defined ranges, read and write accesses are
> +guarded by the bitmap's value for the MSR index if the kind of access
> +is included in the ``struct kvm_msr_filter_range`` flags. If no range
> +cover this particular access, the behavior is determined by the flags
> +field in the kvm_msr_filter struct: ``KVM_MSR_FILTER_DEFAULT_ALLOW``
> +and ``KVM_MSR_FILTER_DEFAULT_DENY``.
> +
> +Each bitmap range specifies a range of MSRs to potentially allow access on.
> +The range goes from MSR index [base .. base+nmsrs]. The flags field
> +indicates whether reads, writes or both reads and writes are filtered
> +by setting a 1 bit in the bitmap for the corresponding MSR index.
> +
> +If an MSR access is not permitted through the filtering, it generates a
> +#GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that
> +allows user space to deflect and potentially handle various MSR accesses
> +into user space.
> +
> +If a vCPU is in running state while this ioctl is invoked, the vCPU may
> +experience inconsistent filtering behavior on MSR accesses.
>
> 4.98 KVM_CREATE_SPAPR_TCE_64
> ----------------------------
> @@ -4706,107 +4780,7 @@ KVM_PV_VM_VERIFY
> Verify the integrity of the unpacked image. Only if this succeeds,
> KVM is allowed to start protected VCPUs.
>
> -4.126 KVM_X86_SET_MSR_FILTER
> -----------------------------
> -
> -:Capability: KVM_X86_SET_MSR_FILTER
> -:Architectures: x86
> -:Type: vm ioctl
> -:Parameters: struct kvm_msr_filter
> -:Returns: 0 on success, < 0 on error
> -
> -::
> -
> - struct kvm_msr_filter_range {
> - #define KVM_MSR_FILTER_READ (1 << 0)
> - #define KVM_MSR_FILTER_WRITE (1 << 1)
> - __u32 flags;
> - __u32 nmsrs; /* number of msrs in bitmap */
> - __u32 base; /* MSR index the bitmap starts at */
> - __u8 *bitmap; /* a 1 bit allows the operations in flags, 0 denies */
> - };
> -
> - #define KVM_MSR_FILTER_MAX_RANGES 16
> - struct kvm_msr_filter {
> - #define KVM_MSR_FILTER_DEFAULT_ALLOW (0 << 0)
> - #define KVM_MSR_FILTER_DEFAULT_DENY (1 << 0)
> - __u32 flags;
> - struct kvm_msr_filter_range ranges[KVM_MSR_FILTER_MAX_RANGES];
> - };
> -
> -flags values for ``struct kvm_msr_filter_range``:
> -
> -``KVM_MSR_FILTER_READ``
> -
> - Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
> - indicates that a read should immediately fail, while a 1 indicates that
> - a read for a particular MSR should be handled regardless of the default
> - filter action.
> -
> -``KVM_MSR_FILTER_WRITE``
> -
> - Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
> - indicates that a write should immediately fail, while a 1 indicates that
> - a write for a particular MSR should be handled regardless of the default
> - filter action.
> -
> -``KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE``
> -
> - Filter both read and write accesses to MSRs using the given bitmap. A 0
> - in the bitmap indicates that both reads and writes should immediately fail,
> - while a 1 indicates that reads and writes for a particular MSR are not
> - filtered by this range.
> -
> -flags values for ``struct kvm_msr_filter``:
> -
> -``KVM_MSR_FILTER_DEFAULT_ALLOW``
> -
> - If no filter range matches an MSR index that is getting accessed, KVM will
> - fall back to allowing access to the MSR.
> -
> -``KVM_MSR_FILTER_DEFAULT_DENY``
> -
> - If no filter range matches an MSR index that is getting accessed, KVM will
> - fall back to rejecting access to the MSR. In this mode, all MSRs that should
> - be processed by KVM need to explicitly be marked as allowed in the bitmaps.
> -
> -This ioctl allows user space to define up to 16 bitmaps of MSR ranges to
> -specify whether a certain MSR access should be explicitly filtered for or not.
> -
> -If this ioctl has never been invoked, MSR accesses are not guarded and the
> -default KVM in-kernel emulation behavior is fully preserved.
> -
> -Calling this ioctl with an empty set of ranges (all nmsrs == 0) disables MSR
> -filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
> -an error.
> -
> -As soon as the filtering is in place, every MSR access is processed through
> -the filtering except for accesses to the x2APIC MSRs (from 0x800 to 0x8ff);
> -x2APIC MSRs are always allowed, independent of the ``default_allow`` setting,
> -and their behavior depends on the ``X2APIC_ENABLE`` bit of the APIC base
> -register.
> -
> -If a bit is within one of the defined ranges, read and write accesses are
> -guarded by the bitmap's value for the MSR index if the kind of access
> -is included in the ``struct kvm_msr_filter_range`` flags. If no range
> -cover this particular access, the behavior is determined by the flags
> -field in the kvm_msr_filter struct: ``KVM_MSR_FILTER_DEFAULT_ALLOW``
> -and ``KVM_MSR_FILTER_DEFAULT_DENY``.
> -
> -Each bitmap range specifies a range of MSRs to potentially allow access on.
> -The range goes from MSR index [base .. base+nmsrs]. The flags field
> -indicates whether reads, writes or both reads and writes are filtered
> -by setting a 1 bit in the bitmap for the corresponding MSR index.
> -
> -If an MSR access is not permitted through the filtering, it generates a
> -#GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that
> -allows user space to deflect and potentially handle various MSR accesses
> -into user space.
> -
> -If a vCPU is in running state while this ioctl is invoked, the vCPU may
> -experience inconsistent filtering behavior on MSR accesses.
> -
> -4.127 KVM_XEN_HVM_SET_ATTR
> +4.126 KVM_XEN_HVM_SET_ATTR
> --------------------------
>
> :Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO
> @@ -4849,7 +4823,7 @@ KVM_XEN_ATTR_TYPE_SHARED_INFO
> KVM_XEN_ATTR_TYPE_UPCALL_VECTOR
> Sets the exception vector used to deliver Xen event channel upcalls.
>
> -4.128 KVM_XEN_HVM_GET_ATTR
> +4.127 KVM_XEN_HVM_GET_ATTR
> --------------------------
>
> :Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO
> @@ -4861,7 +4835,7 @@ KVM_XEN_ATTR_TYPE_UPCALL_VECTOR
> Allows Xen VM attributes to be read. For the structure and types,
> see KVM_XEN_HVM_SET_ATTR above.
>
> -4.129 KVM_XEN_VCPU_SET_ATTR
> +4.128 KVM_XEN_VCPU_SET_ATTR
> ---------------------------
>
> :Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO
> @@ -4923,7 +4897,7 @@ KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST
> or RUNSTATE_offline) to set the current accounted state as of the
> adjusted state_entry_time.
>
> -4.130 KVM_XEN_VCPU_GET_ATTR
> +4.129 KVM_XEN_VCPU_GET_ATTR
> ---------------------------
>
> :Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO
> @@ -6721,3 +6695,29 @@ vcpu_info is set.
> The KVM_XEN_HVM_CONFIG_RUNSTATE flag indicates that the runstate-related
> features KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR/_CURRENT/_DATA/_ADJUST are
> supported by the KVM_XEN_VCPU_SET_ATTR/KVM_XEN_VCPU_GET_ATTR ioctls.
> +
> +8.31 KVM_CAP_PPC_MULTITCE
> +-------------------------
> +
> +:Capability: KVM_CAP_PPC_MULTITCE
> +:Architectures: ppc
> +:Type: vm
> +
> +This capability means the kernel is capable of handling hypercalls
> +H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
> +space. This significantly accelerates DMA operations for PPC KVM guests.
> +User space should expect that its handlers for these hypercalls
> +are not going to be called if user space previously registered LIOBN
> +in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).
> +
> +In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
> +user space might have to advertise it for the guest. For example,
> +IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
> +present in the "ibm,hypertas-functions" device-tree property.
> +
> +The hypercalls mentioned above may or may not be processed successfully
> +in the kernel based fast path. If they can not be handled by the kernel,
> +they will get passed on to user space. So user space still has to have
> +an implementation for these despite the in kernel acceleration.
> +
> +This capability is always enabled.
> \ No newline at end of file
> --
> 2.29.2
^ permalink raw reply
* Re: [PATCH v5 2/7] docs/zh_CN: Add translation zh_CN/doc-guide/kernel-doc.rst
From: Jonathan Corbet @ 2021-04-13 21:57 UTC (permalink / raw)
To: Wu XiangCheng, Alex Shi; +Cc: Alex Shi, YanTeng Si, linux-doc
In-Reply-To: <783d134b1dd18f580f2c0511c2330382a86e79b5.1618295149.git.bobwxc@email.cn>
Wu XiangCheng <bobwxc@email.cn> writes:
> Add new translation
> Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
>
> Signed-off-by: Wu XiangCheng <bobwxc@email.cn>
> Reviewed-by: Yanteng Si <siyanteng@loongson.cn>
> ---
> .../zh_CN/doc-guide/kernel-doc.rst | 500 ++++++++++++++++++
> 1 file changed, 500 insertions(+)
> create mode 100644 Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
>
> diff --git a/Documentation/translations/zh_CN/doc-guide/kernel-doc.rst b/Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
> new file mode 100644
> index 000000000000..b0427944f8f0
> --- /dev/null
> +++ b/Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
> @@ -0,0 +1,500 @@
> +.. include:: ../disclaimer-zh_CN.rst
> +
> +:Original: Documentation/doc-guide/kernel-doc.rst
> +
> +:译者: 吴想成 Wu XiangCheng <bobwxc@email.cn>
> +
> +编写kernel-doc注释
> +==================
> +
> +Linux内核源文件可以包含kernel-doc格式的结构化文档注释,用以描述代码的函数、
> +类型和设计。将文档嵌入源文件更容易保持文档最新。
> +
> +.. note:: 内核文档格式与javadoc、gtk-doc或Doxygen看似很相似,但由于历史原因,
> + 实际有着明显的不同。内核源包含成千上万个kernel-doc注释。请坚持遵循
> + 此处描述的风格。
> +
> +.. note:: kernel-doc无法包含Rust代码:请参考
> + :ref:`Documentation/rust/docs.rst <rust_docs>`。
Note that this reference adds a warning to the docs-next build - this
translation is evidently against linux-next instead. The warning will
eventually go away, but it would be better not to do this if possible.
Also, there is no need for a :ref: here; just say
"Documentation/rust/docs.rst" and the automarkup code will do the right
thing. Yes, that should be changed in the original as well.
Thanks,
jon
^ permalink raw reply
* Re: [PATCH] doc/virt/kvm: move KVM_X86_SET_MSR_FILTER in section 8
From: Paolo Bonzini @ 2021-04-13 22:21 UTC (permalink / raw)
To: Jonathan Corbet, Emanuele Giuseppe Esposito, linux-doc; +Cc: kvm, linux-kernel
In-Reply-To: <87v98priev.fsf@meer.lwn.net>
On 13/04/21 23:20, Jonathan Corbet wrote:
> Emanuele Giuseppe Esposito <eesposit@redhat.com> writes:
>
>> KVM_X86_SET_MSR_FILTER is a capability, not an ioctl.
>> Therefore move it from section 4.97 to the new 8.31 (other capabilities).
>>
>> To fill the gap, move KVM_X86_SET_MSR_FILTER (was 4.126) to
>> 4.97, and shifted Xen-related ioctl (were 4.127 - 4.130) by
>> one place (4.126 - 4.129).
>>
>> Also fixed minor typo in KVM_GET_MSR_INDEX_LIST ioctl description
>> (section 4.3).
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>> Documentation/virt/kvm/api.rst | 250 ++++++++++++++++-----------------
>> 1 file changed, 125 insertions(+), 125 deletions(-)
>
> Paolo, what's your thought on this one? If it's OK should I pick it up?
I missed the patch, I'll queue it up for 5.13.
Paolo
> Thanks,
>
> jon
>
>>
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index 1a2b5210cdbf..a230140d6a7f 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -201,7 +201,7 @@ Errors:
>>
>> ====== ============================================================
>> EFAULT the msr index list cannot be read from or written to
>> - E2BIG the msr index list is to be to fit in the array specified by
>> + E2BIG the msr index list is too big to fit in the array specified by
>> the user.
>> ====== ============================================================
>>
>> @@ -3686,31 +3686,105 @@ which is the maximum number of possibly pending cpu-local interrupts.
>>
>> Queues an SMI on the thread's vcpu.
>>
>> -4.97 KVM_CAP_PPC_MULTITCE
>> --------------------------
>> +4.97 KVM_X86_SET_MSR_FILTER
>> +----------------------------
>>
>> -:Capability: KVM_CAP_PPC_MULTITCE
>> -:Architectures: ppc
>> -:Type: vm
>> +:Capability: KVM_X86_SET_MSR_FILTER
>> +:Architectures: x86
>> +:Type: vm ioctl
>> +:Parameters: struct kvm_msr_filter
>> +:Returns: 0 on success, < 0 on error
>>
>> -This capability means the kernel is capable of handling hypercalls
>> -H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
>> -space. This significantly accelerates DMA operations for PPC KVM guests.
>> -User space should expect that its handlers for these hypercalls
>> -are not going to be called if user space previously registered LIOBN
>> -in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).
>> +::
>>
>> -In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
>> -user space might have to advertise it for the guest. For example,
>> -IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
>> -present in the "ibm,hypertas-functions" device-tree property.
>> + struct kvm_msr_filter_range {
>> + #define KVM_MSR_FILTER_READ (1 << 0)
>> + #define KVM_MSR_FILTER_WRITE (1 << 1)
>> + __u32 flags;
>> + __u32 nmsrs; /* number of msrs in bitmap */
>> + __u32 base; /* MSR index the bitmap starts at */
>> + __u8 *bitmap; /* a 1 bit allows the operations in flags, 0 denies */
>> + };
>>
>> -The hypercalls mentioned above may or may not be processed successfully
>> -in the kernel based fast path. If they can not be handled by the kernel,
>> -they will get passed on to user space. So user space still has to have
>> -an implementation for these despite the in kernel acceleration.
>> + #define KVM_MSR_FILTER_MAX_RANGES 16
>> + struct kvm_msr_filter {
>> + #define KVM_MSR_FILTER_DEFAULT_ALLOW (0 << 0)
>> + #define KVM_MSR_FILTER_DEFAULT_DENY (1 << 0)
>> + __u32 flags;
>> + struct kvm_msr_filter_range ranges[KVM_MSR_FILTER_MAX_RANGES];
>> + };
>>
>> -This capability is always enabled.
>> +flags values for ``struct kvm_msr_filter_range``:
>> +
>> +``KVM_MSR_FILTER_READ``
>> +
>> + Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
>> + indicates that a read should immediately fail, while a 1 indicates that
>> + a read for a particular MSR should be handled regardless of the default
>> + filter action.
>> +
>> +``KVM_MSR_FILTER_WRITE``
>> +
>> + Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
>> + indicates that a write should immediately fail, while a 1 indicates that
>> + a write for a particular MSR should be handled regardless of the default
>> + filter action.
>> +
>> +``KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE``
>> +
>> + Filter both read and write accesses to MSRs using the given bitmap. A 0
>> + in the bitmap indicates that both reads and writes should immediately fail,
>> + while a 1 indicates that reads and writes for a particular MSR are not
>> + filtered by this range.
>> +
>> +flags values for ``struct kvm_msr_filter``:
>> +
>> +``KVM_MSR_FILTER_DEFAULT_ALLOW``
>> +
>> + If no filter range matches an MSR index that is getting accessed, KVM will
>> + fall back to allowing access to the MSR.
>> +
>> +``KVM_MSR_FILTER_DEFAULT_DENY``
>> +
>> + If no filter range matches an MSR index that is getting accessed, KVM will
>> + fall back to rejecting access to the MSR. In this mode, all MSRs that should
>> + be processed by KVM need to explicitly be marked as allowed in the bitmaps.
>> +
>> +This ioctl allows user space to define up to 16 bitmaps of MSR ranges to
>> +specify whether a certain MSR access should be explicitly filtered for or not.
>> +
>> +If this ioctl has never been invoked, MSR accesses are not guarded and the
>> +default KVM in-kernel emulation behavior is fully preserved.
>> +
>> +Calling this ioctl with an empty set of ranges (all nmsrs == 0) disables MSR
>> +filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
>> +an error.
>> +
>> +As soon as the filtering is in place, every MSR access is processed through
>> +the filtering except for accesses to the x2APIC MSRs (from 0x800 to 0x8ff);
>> +x2APIC MSRs are always allowed, independent of the ``default_allow`` setting,
>> +and their behavior depends on the ``X2APIC_ENABLE`` bit of the APIC base
>> +register.
>> +
>> +If a bit is within one of the defined ranges, read and write accesses are
>> +guarded by the bitmap's value for the MSR index if the kind of access
>> +is included in the ``struct kvm_msr_filter_range`` flags. If no range
>> +cover this particular access, the behavior is determined by the flags
>> +field in the kvm_msr_filter struct: ``KVM_MSR_FILTER_DEFAULT_ALLOW``
>> +and ``KVM_MSR_FILTER_DEFAULT_DENY``.
>> +
>> +Each bitmap range specifies a range of MSRs to potentially allow access on.
>> +The range goes from MSR index [base .. base+nmsrs]. The flags field
>> +indicates whether reads, writes or both reads and writes are filtered
>> +by setting a 1 bit in the bitmap for the corresponding MSR index.
>> +
>> +If an MSR access is not permitted through the filtering, it generates a
>> +#GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that
>> +allows user space to deflect and potentially handle various MSR accesses
>> +into user space.
>> +
>> +If a vCPU is in running state while this ioctl is invoked, the vCPU may
>> +experience inconsistent filtering behavior on MSR accesses.
>>
>> 4.98 KVM_CREATE_SPAPR_TCE_64
>> ----------------------------
>> @@ -4706,107 +4780,7 @@ KVM_PV_VM_VERIFY
>> Verify the integrity of the unpacked image. Only if this succeeds,
>> KVM is allowed to start protected VCPUs.
>>
>> -4.126 KVM_X86_SET_MSR_FILTER
>> -----------------------------
>> -
>> -:Capability: KVM_X86_SET_MSR_FILTER
>> -:Architectures: x86
>> -:Type: vm ioctl
>> -:Parameters: struct kvm_msr_filter
>> -:Returns: 0 on success, < 0 on error
>> -
>> -::
>> -
>> - struct kvm_msr_filter_range {
>> - #define KVM_MSR_FILTER_READ (1 << 0)
>> - #define KVM_MSR_FILTER_WRITE (1 << 1)
>> - __u32 flags;
>> - __u32 nmsrs; /* number of msrs in bitmap */
>> - __u32 base; /* MSR index the bitmap starts at */
>> - __u8 *bitmap; /* a 1 bit allows the operations in flags, 0 denies */
>> - };
>> -
>> - #define KVM_MSR_FILTER_MAX_RANGES 16
>> - struct kvm_msr_filter {
>> - #define KVM_MSR_FILTER_DEFAULT_ALLOW (0 << 0)
>> - #define KVM_MSR_FILTER_DEFAULT_DENY (1 << 0)
>> - __u32 flags;
>> - struct kvm_msr_filter_range ranges[KVM_MSR_FILTER_MAX_RANGES];
>> - };
>> -
>> -flags values for ``struct kvm_msr_filter_range``:
>> -
>> -``KVM_MSR_FILTER_READ``
>> -
>> - Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
>> - indicates that a read should immediately fail, while a 1 indicates that
>> - a read for a particular MSR should be handled regardless of the default
>> - filter action.
>> -
>> -``KVM_MSR_FILTER_WRITE``
>> -
>> - Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
>> - indicates that a write should immediately fail, while a 1 indicates that
>> - a write for a particular MSR should be handled regardless of the default
>> - filter action.
>> -
>> -``KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE``
>> -
>> - Filter both read and write accesses to MSRs using the given bitmap. A 0
>> - in the bitmap indicates that both reads and writes should immediately fail,
>> - while a 1 indicates that reads and writes for a particular MSR are not
>> - filtered by this range.
>> -
>> -flags values for ``struct kvm_msr_filter``:
>> -
>> -``KVM_MSR_FILTER_DEFAULT_ALLOW``
>> -
>> - If no filter range matches an MSR index that is getting accessed, KVM will
>> - fall back to allowing access to the MSR.
>> -
>> -``KVM_MSR_FILTER_DEFAULT_DENY``
>> -
>> - If no filter range matches an MSR index that is getting accessed, KVM will
>> - fall back to rejecting access to the MSR. In this mode, all MSRs that should
>> - be processed by KVM need to explicitly be marked as allowed in the bitmaps.
>> -
>> -This ioctl allows user space to define up to 16 bitmaps of MSR ranges to
>> -specify whether a certain MSR access should be explicitly filtered for or not.
>> -
>> -If this ioctl has never been invoked, MSR accesses are not guarded and the
>> -default KVM in-kernel emulation behavior is fully preserved.
>> -
>> -Calling this ioctl with an empty set of ranges (all nmsrs == 0) disables MSR
>> -filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
>> -an error.
>> -
>> -As soon as the filtering is in place, every MSR access is processed through
>> -the filtering except for accesses to the x2APIC MSRs (from 0x800 to 0x8ff);
>> -x2APIC MSRs are always allowed, independent of the ``default_allow`` setting,
>> -and their behavior depends on the ``X2APIC_ENABLE`` bit of the APIC base
>> -register.
>> -
>> -If a bit is within one of the defined ranges, read and write accesses are
>> -guarded by the bitmap's value for the MSR index if the kind of access
>> -is included in the ``struct kvm_msr_filter_range`` flags. If no range
>> -cover this particular access, the behavior is determined by the flags
>> -field in the kvm_msr_filter struct: ``KVM_MSR_FILTER_DEFAULT_ALLOW``
>> -and ``KVM_MSR_FILTER_DEFAULT_DENY``.
>> -
>> -Each bitmap range specifies a range of MSRs to potentially allow access on.
>> -The range goes from MSR index [base .. base+nmsrs]. The flags field
>> -indicates whether reads, writes or both reads and writes are filtered
>> -by setting a 1 bit in the bitmap for the corresponding MSR index.
>> -
>> -If an MSR access is not permitted through the filtering, it generates a
>> -#GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that
>> -allows user space to deflect and potentially handle various MSR accesses
>> -into user space.
>> -
>> -If a vCPU is in running state while this ioctl is invoked, the vCPU may
>> -experience inconsistent filtering behavior on MSR accesses.
>> -
>> -4.127 KVM_XEN_HVM_SET_ATTR
>> +4.126 KVM_XEN_HVM_SET_ATTR
>> --------------------------
>>
>> :Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO
>> @@ -4849,7 +4823,7 @@ KVM_XEN_ATTR_TYPE_SHARED_INFO
>> KVM_XEN_ATTR_TYPE_UPCALL_VECTOR
>> Sets the exception vector used to deliver Xen event channel upcalls.
>>
>> -4.128 KVM_XEN_HVM_GET_ATTR
>> +4.127 KVM_XEN_HVM_GET_ATTR
>> --------------------------
>>
>> :Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO
>> @@ -4861,7 +4835,7 @@ KVM_XEN_ATTR_TYPE_UPCALL_VECTOR
>> Allows Xen VM attributes to be read. For the structure and types,
>> see KVM_XEN_HVM_SET_ATTR above.
>>
>> -4.129 KVM_XEN_VCPU_SET_ATTR
>> +4.128 KVM_XEN_VCPU_SET_ATTR
>> ---------------------------
>>
>> :Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO
>> @@ -4923,7 +4897,7 @@ KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST
>> or RUNSTATE_offline) to set the current accounted state as of the
>> adjusted state_entry_time.
>>
>> -4.130 KVM_XEN_VCPU_GET_ATTR
>> +4.129 KVM_XEN_VCPU_GET_ATTR
>> ---------------------------
>>
>> :Capability: KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_SHARED_INFO
>> @@ -6721,3 +6695,29 @@ vcpu_info is set.
>> The KVM_XEN_HVM_CONFIG_RUNSTATE flag indicates that the runstate-related
>> features KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR/_CURRENT/_DATA/_ADJUST are
>> supported by the KVM_XEN_VCPU_SET_ATTR/KVM_XEN_VCPU_GET_ATTR ioctls.
>> +
>> +8.31 KVM_CAP_PPC_MULTITCE
>> +-------------------------
>> +
>> +:Capability: KVM_CAP_PPC_MULTITCE
>> +:Architectures: ppc
>> +:Type: vm
>> +
>> +This capability means the kernel is capable of handling hypercalls
>> +H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
>> +space. This significantly accelerates DMA operations for PPC KVM guests.
>> +User space should expect that its handlers for these hypercalls
>> +are not going to be called if user space previously registered LIOBN
>> +in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).
>> +
>> +In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
>> +user space might have to advertise it for the guest. For example,
>> +IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
>> +present in the "ibm,hypertas-functions" device-tree property.
>> +
>> +The hypercalls mentioned above may or may not be processed successfully
>> +in the kernel based fast path. If they can not be handled by the kernel,
>> +they will get passed on to user space. So user space still has to have
>> +an implementation for these despite the in kernel acceleration.
>> +
>> +This capability is always enabled.
>> \ No newline at end of file
>> --
>> 2.29.2
>
^ permalink raw reply
* Re: [PATCH v2 00/12] docs: path-lookup: Update pathlookup docs
From: NeilBrown @ 2021-04-13 22:26 UTC (permalink / raw)
To: Jonathan Corbet, Fox Chen
Cc: Fox Chen, vegard.nossum, viro, rdunlap, grandmaster, linux-doc,
linux-kernel, gregkh
In-Reply-To: <87zgy1rihm.fsf@meer.lwn.net>
[-- Attachment #1: Type: text/plain, Size: 822 bytes --]
On Tue, Apr 13 2021, Jonathan Corbet wrote:
> Fox Chen <foxhlchen@gmail.com> writes:
>
>> The Path lookup is a very complex subject in VFS. The path-lookup
>> document provides a very detailed guidance to help people understand
>> how path lookup works in the kernel. This document was originally
>> written based on three lwn articles five years ago. As times goes by,
>> some of the content is outdated. This patchset is intended to update
>> the document to make it more relevant to current codebase.
>
> Neil, have you had a chance to take a look at these? I'm reluctant to
> apply them without your ack...
No I haven't, I'm sorry. And I'm on leave at the moment so my attention
is mostly elsewhere. However I'll try to make time to have a look
sometime in the next week or so.
Thanks for the prompt.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 853 bytes --]
^ permalink raw reply
* Re: [PATCH v4 05/13] module: Add printk formats to add module build ID to stacktraces
From: Stephen Boyd @ 2021-04-13 22:36 UTC (permalink / raw)
To: Andy Shevchenko, Petr Mladek
Cc: Andrew Morton, linux-kernel, Jiri Olsa, Alexei Starovoitov,
Jessica Yu, Evan Green, Hsin-Yi Wang, Steven Rostedt,
Sergey Senozhatsky, Rasmus Villemoes, linux-doc, Matthew Wilcox
In-Reply-To: <161834460576.3764895.758141455860109099@swboyd.mtv.corp.google.com>
Quoting Stephen Boyd (2021-04-13 13:10:05)
> Quoting Petr Mladek (2021-04-13 08:16:20)
> > On Tue 2021-04-13 13:56:31, Andy Shevchenko wrote:
> > > On Mon, Apr 12, 2021 at 12:29:05PM -0700, Stephen Boyd wrote:
> > > > Quoting Andy Shevchenko (2021-04-12 04:58:02)
> > > > >
> > > > > First of all, why not static_assert() defined near to the actual macro?
> > > >
> > > > Which macro? BUILD_ID_SIZE_MAX?
> > >
> > > Yes.
> > >
> > > > I tried static_assert() and it didn't
> > > > work for me but maybe I missed something.
> >
> > I guess that you wanted to use it inside macro definition:
> >
> > #define VMCOREINFO_BUILD_ID(value) \
> > static_assert(ARRAY_SIZE(value) == BUILD_ID_SIZE_MAX); \
> > vmcoreinfo_append_str("BUILD-ID=%20phN\n", value)
> >
> > Instead, you should do it outside the macro:
> >
> > static_assert(ARRAY_SIZE(value) == BUILD_ID_SIZE_MAX);
> > #define VMCOREINFO_BUILD_ID(value) \
> > vmcoreinfo_append_str("BUILD-ID=%20phN\n", value)
>
> In this example "value" is not defined because it's an argument to the
> macro. How can this work?
>
> From what I can tell static_assert() is for the case that you want to
> assert something at the global scope level. BUILD_BUG_ON() can't be used
> at global scope. I see the usage is usually to assert struct members and
> alignment of those members. In turn, static_assert() can't be used at
> function level scope. Each has a use and in this case I want to assert
> at function level scope to be as close as possible to the place that
> would need to change.
>
Good news. I can do this to force a basic block and then GCC doesn't complain.
---8<---
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 2174dab16ba9..de62a722431e 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -38,9 +38,12 @@ phys_addr_t paddr_vmcoreinfo_note(void);
#define VMCOREINFO_OSRELEASE(value) \
vmcoreinfo_append_str("OSRELEASE=%s\n", value)
-#define VMCOREINFO_BUILD_ID(value) \
- BUILD_BUG_ON(ARRAY_SIZE(value) != BUILD_ID_SIZE_MAX); \
- vmcoreinfo_append_str("BUILD-ID=%20phN\n", value)
+#define VMCOREINFO_BUILD_ID() \
+ ({ \
+ static_assert(sizeof(vmlinux_build_id) == 20); \
+ vmcoreinfo_append_str("BUILD-ID=%20phN\n", vmlinux_build_id); \
+ })
+
#define VMCOREINFO_PAGESIZE(value) \
vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
^ permalink raw reply related
* Re: [PATCH 0/4] Enable Fault Injection for RTRS
From: Jason Gunthorpe @ 2021-04-13 22:53 UTC (permalink / raw)
To: Gioh Kim
Cc: linux-rdma, linux-doc, bvanassche, leon, dledford, haris.iqbal,
jinpu.wang, akinobu.mita, corbet
In-Reply-To: <20210406115049.196527-1-gi-oh.kim@ionos.com>
On Tue, Apr 06, 2021 at 01:50:45PM +0200, Gioh Kim wrote:
> My colleagues and I would like to apply the fault injection
> of the Linux to test error handling of RTRS module. RTRS module
> consists of client and server modules that are connected via
> Infiniband network. So it is important for the client to receive
> the error of the server and handle it smoothly.
>
> When debugfs is enabled, RTRS is able to export interfaces
> to fail RTRS client and server.
> Following fault injection points are enabled:
> - fail a request processing on RTRS client side
> - fail a heart-beat transferation on RTRS server side
>
> This patch set is just a starting point. We will enable various
> faults and test as many error cases as possible.
>
> Best regards
>
> Gioh Kim (4):
> RDMA/rtrs: Enable the fault-injection
> RDMA/rtrs-clt: Inject a fault at request processing
> RDMA/rtrs-srv: Inject a fault at heart-beat sending
> docs: fault-injection: Add fault-injection manual of RTRS
I'm going to drop this until you can look into ebpf kprobes, it does
seem the more modern way
Jason
^ permalink raw reply
* Re: [PATCH v4 05/13] module: Add printk formats to add module build ID to stacktraces
From: Stephen Boyd @ 2021-04-13 22:57 UTC (permalink / raw)
To: Petr Mladek
Cc: Andrew Morton, linux-kernel, Jiri Olsa, Alexei Starovoitov,
Jessica Yu, Evan Green, Hsin-Yi Wang, Steven Rostedt,
Sergey Senozhatsky, Andy Shevchenko, Rasmus Villemoes, linux-doc,
Matthew Wilcox
In-Reply-To: <YHWyOhYZuLsbt2gB@alley>
Quoting Petr Mladek (2021-04-13 08:01:14)
> On Fri 2021-04-09 18:52:52, Stephen Boyd wrote:
> > Let's make kernel stacktraces easier to identify by including the build
> > ID[1] of a module if the stacktrace is printing a symbol from a module.
> > This makes it simpler for developers to locate a kernel module's full
> > debuginfo for a particular stacktrace. Combined with
> > scripts/decode_stracktrace.sh, a developer can download the matching
> > debuginfo from a debuginfod[2] server and find the exact file and line
> > number for the functions plus offsets in a stacktrace that match the
> > module. This is especially useful for pstore crash debugging where the
> > kernel crashes are recorded in something like console-ramoops and the
> > recovery kernel/modules are different or the debuginfo doesn't exist on
> > the device due to space concerns (the debuginfo can be too large for
> > space limited devices).
> >
> > diff --git a/include/linux/module.h b/include/linux/module.h
> > index 59f094fa6f74..4bf869f6c944 100644
> > --- a/include/linux/module.h
> > +++ b/include/linux/module.h
> > @@ -11,6 +11,7 @@
> >
> > #include <linux/list.h>
> > #include <linux/stat.h>
> > +#include <linux/buildid.h>
> > #include <linux/compiler.h>
> > #include <linux/cache.h>
> > #include <linux/kmod.h>
> > @@ -367,6 +368,9 @@ struct module {
> > /* Unique handle for this module */
> > char name[MODULE_NAME_LEN];
> >
> > + /* Module build ID */
> > + unsigned char build_id[BUILD_ID_SIZE_MAX];
>
> Do we want to initialize/store the ID even when
> CONFIG_STACKTRACE_BUILD_ID is disabled and nobody would
> use it?
>
> Most struct module members are added only when the related feature
> is enabled.
>
> I am not sure how it would complicate the code. It is possible
> that it is not worth it. Well, I could imagine that the API
> will always pass the buildid parameter and
> module_address_lookup() might do something like
>
> #ifndef CONFIG_STACKTRACE_BUILD_ID
> static char empty_build_id[BUILD_ID_SIZE_MAX];
> #endif
>
> if (modbuildid) {
> if (IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID))
> *modbuildid = mod->build_id;
> else
> *modbuildid = empty_build_id;
>
> IMHO, this is primary a call for Jessica as the module code maintainer.
>
> Otherwise, I am fine with this patch. And it works as expected.
>
Does declaring mod->build_id as zero length work well enough?
----8<----
diff --git a/include/linux/module.h b/include/linux/module.h
index 4bf869f6c944..03b2f6af093a 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -359,6 +359,12 @@ struct klp_modinfo {
};
#endif
+#if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
+#define MODULE_BUILD_ID_LEN BUILD_ID_SIZE_MAX
+#else
+#define MODULE_BUILD_ID_LEN 0
+#endif
+
struct module {
enum module_state state;
@@ -369,7 +375,7 @@ struct module {
char name[MODULE_NAME_LEN];
/* Module build ID */
- unsigned char build_id[BUILD_ID_SIZE_MAX];
+ unsigned char build_id[MODULE_BUILD_ID_LEN];
/* Sysfs stuff. */
struct module_kobject mkobj;
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index b835992e76c2..ebd5b30c3039 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -25,7 +25,10 @@
#include <linux/filter.h>
#include <linux/ftrace.h>
#include <linux/kprobes.h>
+#include <linux/build_bug.h>
#include <linux/compiler.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
/*
* These will be re-linked against their real values
@@ -393,10 +396,13 @@ static int __sprint_symbol(char *buffer, unsigned long address,
if (modname) {
len += sprintf(buffer + len, " [%s", modname);
- /* build ID should match length of sprintf below */
- BUILD_BUG_ON(BUILD_ID_SIZE_MAX != 20);
- if (IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID) && add_buildid && buildid)
+#if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
+ if (add_buildid && buildid) {
+ /* build ID should match length of sprintf */
+ static_assert(MODULE_BUILD_ID_LEN == 20);
len += sprintf(buffer + len, " %20phN", buildid);
+ }
+#endif
len += sprintf(buffer + len, "]");
}
diff --git a/kernel/module.c b/kernel/module.c
index 6f5bc1b046a5..a0d222fbd281 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2771,7 +2771,17 @@ static void add_kallsyms(struct module *mod, const struct load_info *info)
}
mod->core_kallsyms.num_symtab = ndst;
}
+#else
+static inline void layout_symtab(struct module *mod, struct load_info *info)
+{
+}
+
+static void add_kallsyms(struct module *mod, const struct load_info *info)
+{
+}
+#endif /* CONFIG_KALLSYMS */
+#if IS_ENABLED(CONFIG_KALLSYMS) && IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
static void init_build_id(struct module *mod, const struct load_info *info)
{
const Elf_Shdr *sechdr;
@@ -2786,18 +2796,10 @@ static void init_build_id(struct module *mod, const struct load_info *info)
}
}
#else
-static inline void layout_symtab(struct module *mod, struct load_info *info)
-{
-}
-
-static void add_kallsyms(struct module *mod, const struct load_info *info)
-{
-}
-
static void init_build_id(struct module *mod, const struct load_info *info)
{
}
-#endif /* CONFIG_KALLSYMS */
+#endif
static void dynamic_debug_setup(struct module *mod, struct _ddebug *debug, unsigned int num)
{
^ permalink raw reply related
* Re: [RFC 0/2] Add a new translation tool scripts/trslt.py
From: Federico Vaga @ 2021-04-13 23:27 UTC (permalink / raw)
To: Wu XiangCheng
Cc: Jonathan Corbet, Alex Shi, Masahiro Yamada, Tsugikazu Shibata,
SeongJae Park, linux-doc, linux-kernel
In-Reply-To: <cover.1618208899.git.bobwxc@email.cn>
Hi,
Yes, you are touching a good point where things can be improved. I admit that I
did not have a look at the code yet, if not very quickly. Perhaps I'm missing
somethin. However, let me give you my two cents based on what I usually do.
I do not like the idea of adding tags to the file and having tools to modify it.
I would prefer to keep the text as clean as possible.
Instead, what can be done without touching manipulating the text file is to do
something like this:
# Take the commit ID of the last time a document has translated
LAST_TRANS=$(git log -n 1 --oneline Documentation/translations/<lang>/<path-to-file> | cut -d " " -f 1)
# Take the history of the same file in the main Documentation tree
git log --oneline $LAST_TRANS..doc/docs-next Documentation/<path-to-file>
This will give you the list of commits that changed <path-to-file>, and that
probably need to be translated. The problem of this approach is that by the time
you submit a translation, other people may change the very same files. The
correctness of this approach depends on patch order in docs-next, and this can't
be guaranteed.
So, instead of reling on LAST_DIR, I rely on a special git branch that acts as
marker. But this works only for me and not for other translator of the same
languages, so you can get troubles also in this case.
What we can actually do is to exploit the git commit message to store the tag
you mentioned. Hence, we can get the last Id with something like this:
LAST_ID=$(git log -n 1 Documentation/translations/<lang>/<path-to-file> | grep -E "Translated-on-top-of: commit [0-9a-f]{12}")
The ID we store in the tag does not need to be the commit ID of the last change
to <path-to-file>, but just the commit on which you were when you did the
translation. This because it will simplify the management of this tag when
translating multiple files/patches in a single patch (to avoid to spam the
mailing list with dozens of small patches).
On Mon, Apr 12, 2021 at 03:04:03PM +0800, Wu XiangCheng wrote:
>Hi all,
>
>This set of patches aim to add a new translation tool - trslt.py, which
>can control the transltions version corresponding to source files.
>
>For a long time, kernel documentation translations lacks a way to control the
>version corresponding to the source files. If you translate a file and then
>someone updates the source file, there will be a problem. It's hard to know
>which version the existing translation corresponds to, and even harder to sync
>them.
>
>The common way now is to check the date, but this is not exactly accurate,
>especially for documents that are often updated. And some translators write
>corresponding commit ID in the commit log for reference, it is a good way,
>but still a little troublesome.
>
>Thus, the purpose of ``trslt.py`` is to add a new annotating tag to the file
>to indicate corresponding version of the source file::
>
>.. translation_origin_commit: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
>The script will automatically copy file and generate tag when creating new
>translation, and give update suggestions based on those tags when updating
>translations.
>
>More details please read doc in [Patch 2/2].
>
>Still need working:
>- improve verbose mode
>- test on more python 3.x version
>- only support linux now, need test on Mac OS, nonsupport Windows
> due to '\'
>
>Any suggestion is welcome!
>
>Thanks!
>
>Wu XiangCheng (2):
> scripts: Add new translation tool trslt.py
> docs: doc-guide: Add document for scripts/trslt.py
>
> Documentation/doc-guide/index.rst | 1 +
> Documentation/doc-guide/trslt.rst | 233 ++++++++++++++++++++++++++
> scripts/trslt.py | 267 ++++++++++++++++++++++++++++++
> 3 files changed, 501 insertions(+)
> create mode 100644 Documentation/doc-guide/trslt.rst
> create mode 100755 scripts/trslt.py
>
>--
>2.20.1
>
^ permalink raw reply
* [PATCH v5] hwmon: Add driver for fsp-3y PSUs and PDUs
From: Václav Kubernát @ 2021-04-14 0:13 UTC (permalink / raw)
Cc: Václav Kubernát, Jean Delvare, Guenter Roeck,
Jonathan Corbet, linux-hwmon, linux-doc, linux-kernel
In-Reply-To: <20210329143833.1047539-1-kubernat@cesnet.cz>
This patch adds support for these devices:
- YH-5151E - the PDU
- YM-2151E - the PSU
The device datasheet says that the devices support PMBus 1.2, but in my
testing, a lot of the commands aren't supported and if they are, they
sometimes behave strangely or inconsistently. For example, writes to the
PAGE command requires using PEC, otherwise the write won't work and the
page won't switch, even though, the standard says that PEC is optional.
On the other hand, writes to SMBALERT don't require PEC. Because of
this, the driver is mostly reverse engineered with the help of a tool
called pmbus_peek written by David Brownell (and later adopted by my
colleague Jan Kundrát).
The device also has some sort of a timing issue when switching pages,
which is explained further in the code.
Because of this, the driver support is limited. It exposes only the
values, that have been tested to work correctly.
Signed-off-by: Václav Kubernát <kubernat@cesnet.cz>
---
Documentation/hwmon/fsp-3y.rst | 26 ++++
Documentation/hwmon/index.rst | 1 +
drivers/hwmon/pmbus/Kconfig | 10 ++
drivers/hwmon/pmbus/Makefile | 1 +
drivers/hwmon/pmbus/fsp-3y.c | 251 +++++++++++++++++++++++++++++++++
5 files changed, 289 insertions(+)
create mode 100644 Documentation/hwmon/fsp-3y.rst
create mode 100644 drivers/hwmon/pmbus/fsp-3y.c
diff --git a/Documentation/hwmon/fsp-3y.rst b/Documentation/hwmon/fsp-3y.rst
new file mode 100644
index 000000000000..fc87d4686032
--- /dev/null
+++ b/Documentation/hwmon/fsp-3y.rst
@@ -0,0 +1,26 @@
+Kernel driver fsp3y
+======================
+Supported devices:
+ * 3Y POWER YH-5151E
+ * 3Y POWER YM-2151E
+
+Author: Václav Kubernát <kubernat@cesnet.cz>
+
+Description
+-----------
+This driver implements limited support for two 3Y POWER devices.
+
+Sysfs entries
+-------------
+ * in1_input input voltage
+ * in2_input 12V output voltage
+ * in3_input 5V output voltage
+ * curr1_input input current
+ * curr2_input 12V output current
+ * curr3_input 5V output current
+ * fan1_input fan rpm
+ * temp1_input temperature 1
+ * temp2_input temperature 2
+ * temp3_input temperature 3
+ * power1_input input power
+ * power2_input output power
diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst
index fcb870ce6286..55c9f014c248 100644
--- a/Documentation/hwmon/index.rst
+++ b/Documentation/hwmon/index.rst
@@ -63,6 +63,7 @@ Hardware Monitoring Kernel Drivers
f71805f
f71882fg
fam15h_power
+ fsp-3y
ftsteutates
g760a
g762
diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig
index 03606d4298a4..9d12d446396c 100644
--- a/drivers/hwmon/pmbus/Kconfig
+++ b/drivers/hwmon/pmbus/Kconfig
@@ -56,6 +56,16 @@ config SENSORS_BEL_PFE
This driver can also be built as a module. If so, the module will
be called bel-pfe.
+config SENSORS_FSP_3Y
+ tristate "FSP/3Y-Power power supplies"
+ help
+ If you say yes here you get hardware monitoring support for
+ FSP/3Y-Power hot-swap power supplies.
+ Supported models: YH-5151E, YM-2151E
+
+ This driver can also be built as a module. If so, the module will
+ be called fsp-3y.
+
config SENSORS_IBM_CFFPS
tristate "IBM Common Form Factor Power Supply"
depends on LEDS_CLASS
diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile
index 6a4ba0fdc1db..bfe218ad898f 100644
--- a/drivers/hwmon/pmbus/Makefile
+++ b/drivers/hwmon/pmbus/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_SENSORS_PMBUS) += pmbus.o
obj-$(CONFIG_SENSORS_ADM1266) += adm1266.o
obj-$(CONFIG_SENSORS_ADM1275) += adm1275.o
obj-$(CONFIG_SENSORS_BEL_PFE) += bel-pfe.o
+obj-$(CONFIG_SENSORS_FSP_3Y) += fsp-3y.o
obj-$(CONFIG_SENSORS_IBM_CFFPS) += ibm-cffps.o
obj-$(CONFIG_SENSORS_INSPUR_IPSPS) += inspur-ipsps.o
obj-$(CONFIG_SENSORS_IR35221) += ir35221.o
diff --git a/drivers/hwmon/pmbus/fsp-3y.c b/drivers/hwmon/pmbus/fsp-3y.c
new file mode 100644
index 000000000000..af58f9950f3d
--- /dev/null
+++ b/drivers/hwmon/pmbus/fsp-3y.c
@@ -0,0 +1,251 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Hardware monitoring driver for FSP 3Y-Power PSUs
+ *
+ * Copyright (c) 2021 Václav Kubernát, CESNET
+ *
+ * This driver is mostly reverse engineered with the help of a tool called pmbus_peek written by
+ * David Brownell (and later adopted by Jan Kundrát). The device has some sort of a timing issue
+ * when switching pages, details are explained in the code. The driver support is limited. It
+ * exposes only the values, that have been tested to work correctly. Unsupported values either
+ * aren't supported by the devices or their encondings are unknown.
+ */
+
+#include <linux/delay.h>
+#include <linux/i2c.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include "pmbus.h"
+
+#define YM2151_PAGE_12V_LOG 0x00
+#define YM2151_PAGE_12V_REAL 0x00
+#define YM2151_PAGE_5VSB_LOG 0x01
+#define YM2151_PAGE_5VSB_REAL 0x20
+#define YH5151E_PAGE_12V_LOG 0x00
+#define YH5151E_PAGE_12V_REAL 0x00
+#define YH5151E_PAGE_5V_LOG 0x01
+#define YH5151E_PAGE_5V_REAL 0x10
+#define YH5151E_PAGE_3V3_LOG 0x02
+#define YH5151E_PAGE_3V3_REAL 0x11
+
+enum chips {
+ ym2151e,
+ yh5151e
+};
+
+struct fsp3y_data {
+ struct pmbus_driver_info info;
+ int chip;
+ int page;
+};
+
+#define to_fsp3y_data(x) container_of(x, struct fsp3y_data, info)
+
+static int page_log_to_page_real(int page_log, enum chips chip)
+{
+ switch (chip) {
+ case ym2151e:
+ switch (page_log) {
+ case YM2151_PAGE_12V_LOG:
+ return YM2151_PAGE_12V_REAL;
+ case YM2151_PAGE_5VSB_LOG:
+ return YM2151_PAGE_5VSB_REAL;
+ }
+ return -EINVAL;
+ case yh5151e:
+ switch (page_log) {
+ case YH5151E_PAGE_12V_LOG:
+ return YH5151E_PAGE_12V_REAL;
+ case YH5151E_PAGE_5V_LOG:
+ return YH5151E_PAGE_5V_LOG;
+ case YH5151E_PAGE_3V3_LOG:
+ return YH5151E_PAGE_3V3_REAL;
+ }
+ return -EINVAL;
+ }
+
+ return -EINVAL;
+}
+
+static int set_page(struct i2c_client *client, int page_log)
+{
+ const struct pmbus_driver_info *info = pmbus_get_driver_info(client);
+ struct fsp3y_data *data = to_fsp3y_data(info);
+ int rv;
+ int page_real;
+
+ if (page_log < 0)
+ return 0;
+
+ page_real = page_log_to_page_real(page_log, data->chip);
+ if (page_real < 0)
+ return page_real;
+
+ if (data->page != page_real) {
+ rv = i2c_smbus_write_byte_data(client, PMBUS_PAGE, page_real);
+ if (rv < 0)
+ return rv;
+
+ data->page = page_real;
+
+ /*
+ * Testing showed that the device has a timing issue. After
+ * setting a page, it takes a while, before the device actually
+ * gives the correct values from the correct page. 20 ms was
+ * tested to be enough to not give wrong values (15 ms wasn't
+ * enough).
+ */
+ usleep_range(20000, 30000);
+ }
+
+ return 0;
+}
+
+static int fsp3y_read_byte_data(struct i2c_client *client, int page, int reg)
+{
+ int rv;
+
+ rv = set_page(client, page);
+ if (rv < 0)
+ return rv;
+
+ return i2c_smbus_read_byte_data(client, reg);
+}
+
+static int fsp3y_read_word_data(struct i2c_client *client, int page, int phase, int reg)
+{
+ int rv;
+
+ /*
+ * This masks commands which weren't tested to work correctly. Some of
+ * the masked commands return 0xFFFF. These would probably get tagged as
+ * invalid by pmbus_core. Other ones do return values which might be
+ * useful (that is, they are not 0xFFFF), but their encoding is unknown,
+ * and so they are unsupported.
+ */
+ switch (reg) {
+ case PMBUS_READ_FAN_SPEED_1:
+ case PMBUS_READ_IIN:
+ case PMBUS_READ_IOUT:
+ case PMBUS_READ_PIN:
+ case PMBUS_READ_POUT:
+ case PMBUS_READ_TEMPERATURE_1:
+ case PMBUS_READ_TEMPERATURE_2:
+ case PMBUS_READ_TEMPERATURE_3:
+ case PMBUS_READ_VIN:
+ case PMBUS_READ_VOUT:
+ case PMBUS_STATUS_WORD:
+ break;
+ default:
+ return -ENXIO;
+ }
+
+ rv = set_page(client, page);
+ if (rv < 0)
+ return rv;
+
+ return i2c_smbus_read_word_data(client, reg);
+}
+
+struct pmbus_driver_info fsp3y_info[] = {
+ [ym2151e] = {
+ .pages = 2,
+ .func[YM2151_PAGE_12V_LOG] =
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
+ PMBUS_HAVE_PIN | PMBUS_HAVE_POUT |
+ PMBUS_HAVE_TEMP | PMBUS_HAVE_TEMP2 |
+ PMBUS_HAVE_VIN | PMBUS_HAVE_IIN |
+ PMBUS_HAVE_FAN12,
+ .func[YM2151_PAGE_5VSB_LOG] =
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT,
+ PMBUS_HAVE_IIN,
+ .read_word_data = fsp3y_read_word_data,
+ .read_byte_data = fsp3y_read_byte_data,
+ },
+ [yh5151e] = {
+ .pages = 3,
+ .func[YH5151E_PAGE_12V_LOG] =
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
+ PMBUS_HAVE_POUT |
+ PMBUS_HAVE_TEMP | PMBUS_HAVE_TEMP2 | PMBUS_HAVE_TEMP3,
+ .func[YH5151E_PAGE_5V_LOG] =
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
+ PMBUS_HAVE_POUT,
+ .func[YH5151E_PAGE_3V3_LOG] =
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
+ PMBUS_HAVE_POUT,
+ .read_word_data = fsp3y_read_word_data,
+ .read_byte_data = fsp3y_read_byte_data,
+ }
+};
+
+static int fsp3y_detect(struct i2c_client *client)
+{
+ int rv;
+ u8 buf[I2C_SMBUS_BLOCK_MAX + 1];
+
+ rv = i2c_smbus_read_block_data(client, PMBUS_MFR_MODEL, buf);
+ if (rv < 0)
+ return rv;
+
+ buf[rv] = '\0';
+
+ if (rv == 8) {
+ if (!strcmp(buf, "YM-2151E"))
+ return ym2151e;
+ else if (!strcmp(buf, "YH-5151E"))
+ return yh5151e;
+ }
+
+ dev_err(&client->dev, "Unsupported model %.*s\n", rv, buf);
+ return -ENODEV;
+}
+
+static const struct i2c_device_id fsp3y_id[] = {
+ {"ym2151e", ym2151e},
+ {"yh5151e", yh5151e}
+};
+
+static int fsp3y_probe(struct i2c_client *client)
+{
+ struct fsp3y_data *data;
+ const struct i2c_device_id *id;
+ int rv;
+
+ data = devm_kzalloc(&client->dev, sizeof(struct fsp3y_data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+
+ data->chip = fsp3y_detect(client);
+ if (data->chip < 0)
+ return data->chip;
+
+ id = i2c_match_id(fsp3y_id, client);
+ if (data->chip != id->driver_data)
+ dev_warn(&client->dev, "Device mismatch: Configured %s (%d), detected %d\n", id->name, (int)id->driver_data, data->chip);
+
+ rv = i2c_smbus_read_byte_data(client, PMBUS_PAGE);
+ if (rv < 0)
+ return rv;
+ data->page = rv;
+
+ data->info = fsp3y_info[data->chip];
+
+ return pmbus_do_probe(client, &data->info);
+}
+
+MODULE_DEVICE_TABLE(i2c, fsp3y_id);
+
+static struct i2c_driver fsp3y_driver = {
+ .driver = {
+ .name = "fsp3y",
+ },
+ .probe_new = fsp3y_probe,
+ .id_table = fsp3y_id
+};
+
+module_i2c_driver(fsp3y_driver);
+
+MODULE_AUTHOR("Václav Kubernát");
+MODULE_DESCRIPTION("PMBus driver for FSP/3Y-Power power supplies");
+MODULE_LICENSE("GPL");
--
2.31.1
^ permalink raw reply related
* Re: [PATCH v4] hwmon: Add driver for fsp-3y PSUs and PDUs
From: Václav Kubernát @ 2021-04-14 0:27 UTC (permalink / raw)
To: Guenter Roeck
Cc: Jean Delvare, Jonathan Corbet, linux-hwmon, linux-doc,
linux-kernel
In-Reply-To: <43c5cc02-d2ad-ceee-7709-190681b81ae7@roeck-us.net>
út 13. 4. 2021 v 16:59 odesílatel Guenter Roeck <linux@roeck-us.net> napsal:
>
> On 4/13/21 3:42 AM, Václav Kubernát wrote:
> > This patch adds support for these devices:
> > - YH-5151E - the PDU
> > - YM-2151E - the PSU
> >
> > The device datasheet says that the devices support PMBus 1.2, but in my
> > testing, a lot of the commands aren't supported and if they are, they
> > sometimes behave strangely or inconsistently. For example, writes to the
> > PAGE command requires using PEC, otherwise the write won't work and the
> > page won't switch, even though, the standard says that PEC is opiotnal.
>
> optional
Done.
>
> > On the other hand, writes the SMBALERT don't require PEC. Because of
>
> s/writes the/writes to/ ?
Done.
>
> > this, the driver is mostly reverse engineered with the help of a tool
> > called pmbus_peek written by David Brownell (and later adopted by my
> > colleague Jan Kundrát).
> >
> > The device also has some sort of a timing issue when switching pages,
> > which is explained further in the code.
> >
> > Because of this, the driver support is limited. It exposes only the
> > values, that have been tested to work correctly.
> >
>
> You might want to add those details into the driver code, below the
> copyright line. It would be more valuable there than in the commit log.
>
Done.
> > Signed-off-by: Václav Kubernát <kubernat@cesnet.cz>
> > ---
> > Documentation/hwmon/fsp-3y.rst | 26 ++++
>
> Needs to be added to index.rst.
Done.
>
> > drivers/hwmon/pmbus/Kconfig | 10 ++
> > drivers/hwmon/pmbus/Makefile | 1 +
> > drivers/hwmon/pmbus/fsp-3y.c | 239 +++++++++++++++++++++++++++++++++
> > 4 files changed, 276 insertions(+)
> > create mode 100644 Documentation/hwmon/fsp-3y.rst
> > create mode 100644 drivers/hwmon/pmbus/fsp-3y.c
> >
> > diff --git a/Documentation/hwmon/fsp-3y.rst b/Documentation/hwmon/fsp-3y.rst
> > new file mode 100644
> > index 000000000000..68a547021846
> > --- /dev/null
> > +++ b/Documentation/hwmon/fsp-3y.rst
> > @@ -0,0 +1,26 @@
> > +Kernel driver fsp3y
> > +======================
> > +Supported devices:
> > + * 3Y POWER YH-5151E
> > + * 3Y POWER YM-2151E
> > +
> > +Author: Václav Kubernát <kubernat@cesnet.cz>
> > +
> > +Description
> > +-----------
> > +This driver implements limited support for two 3Y POWER devices.
> > +
> > +Sysfs entries
> > +-------------
> > +in1_input input voltage
> > +in2_input 12V output voltage
> > +in3_input 5V output voltage
> > +curr1_input input current
> > +curr2_input 12V output current
> > +curr3_input 5V output current
> > +fan1_input fan rpm
> > +temp1_input temperature 1
> > +temp2_input temperature 2
> > +temp3_input temperature 3
> > +power1_input input power
> > +power2_input output power
> > diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig
> > index 03606d4298a4..9d12d446396c 100644
> > --- a/drivers/hwmon/pmbus/Kconfig
> > +++ b/drivers/hwmon/pmbus/Kconfig
> > @@ -56,6 +56,16 @@ config SENSORS_BEL_PFE
> > This driver can also be built as a module. If so, the module will
> > be called bel-pfe.
> >
> > +config SENSORS_FSP_3Y
> > + tristate "FSP/3Y-Power power supplies"
> > + help
> > + If you say yes here you get hardware monitoring support for
> > + FSP/3Y-Power hot-swap power supplies.
> > + Supported models: YH-5151E, YM-2151E
> > +
> > + This driver can also be built as a module. If so, the module will
> > + be called fsp-3y.
> > +
> > config SENSORS_IBM_CFFPS
> > tristate "IBM Common Form Factor Power Supply"
> > depends on LEDS_CLASS
> > diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile
> > index 6a4ba0fdc1db..bfe218ad898f 100644
> > --- a/drivers/hwmon/pmbus/Makefile
> > +++ b/drivers/hwmon/pmbus/Makefile
> > @@ -8,6 +8,7 @@ obj-$(CONFIG_SENSORS_PMBUS) += pmbus.o
> > obj-$(CONFIG_SENSORS_ADM1266) += adm1266.o
> > obj-$(CONFIG_SENSORS_ADM1275) += adm1275.o
> > obj-$(CONFIG_SENSORS_BEL_PFE) += bel-pfe.o
> > +obj-$(CONFIG_SENSORS_FSP_3Y) += fsp-3y.o
> > obj-$(CONFIG_SENSORS_IBM_CFFPS) += ibm-cffps.o
> > obj-$(CONFIG_SENSORS_INSPUR_IPSPS) += inspur-ipsps.o
> > obj-$(CONFIG_SENSORS_IR35221) += ir35221.o
> > diff --git a/drivers/hwmon/pmbus/fsp-3y.c b/drivers/hwmon/pmbus/fsp-3y.c
> > new file mode 100644
> > index 000000000000..2185ab119fd2
> > --- /dev/null
> > +++ b/drivers/hwmon/pmbus/fsp-3y.c
> > @@ -0,0 +1,239 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * Hardware monitoring driver for FSP 3Y-Power PSUs
> > + *
> > + * Copyright (c) 2021 Václav Kubernát, CESNET
> > + */
> > +
> > +#include <linux/delay.h>
> > +#include <linux/i2c.h>
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include "pmbus.h"
> > +
> > +#define YM2151_PAGE_12V_LOG 0x00
> > +#define YM2151_PAGE_12V_REAL 0x00
> > +#define YM2151_PAGE_5VSB_LOG 0x01
> > +#define YM2151_PAGE_5VSB_REAL 0x20
> > +#define YH5151E_PAGE_12V_LOG 0x00
> > +#define YH5151E_PAGE_12V_REAL 0x00
> > +#define YH5151E_PAGE_5V_LOG 0x01
> > +#define YH5151E_PAGE_5V_REAL 0x10
> > +#define YH5151E_PAGE_3V3_LOG 0x02
> > +#define YH5151E_PAGE_3V3_REAL 0x11
> > +
> > +enum chips {
> > + ym2151e,
> > + yh5151e
> > +};
> > +
> > +struct fsp3y_data {
> > + struct pmbus_driver_info info;
> > + int chip;
> > + int page;
> > +};
> > +
> > +#define to_fsp3y_data(x) container_of(x, struct fsp3y_data, info)
> > +
> > +static int page_log_to_page_real(int page_log, enum chips chip)
> > +{
> > + switch (chip) {
> > + case ym2151e:
> > + switch (page_log) {
> > + case YM2151_PAGE_12V_LOG:
> > + return YM2151_PAGE_12V_REAL;
> > + case YM2151_PAGE_5VSB_LOG:
> > + return YM2151_PAGE_5VSB_REAL;
> > + }
> > + return -EINVAL;
> > + case yh5151e:
> > + switch (page_log) {
> > + case YH5151E_PAGE_12V_LOG:
> > + return YH5151E_PAGE_12V_REAL;
> > + case YH5151E_PAGE_5V_LOG:
> > + return YH5151E_PAGE_5V_LOG;
> > + case YH5151E_PAGE_3V3_LOG:
> > + return YH5151E_PAGE_3V3_REAL;
> > + }
> > + return -EINVAL;
> > + }
> > +
> > + return -EINVAL;
> > +}
> > +
> > +static int set_page(struct i2c_client *client, int page_log)
> > +{
> > + const struct pmbus_driver_info *info = pmbus_get_driver_info(client);
> > + struct fsp3y_data *data = to_fsp3y_data(info);
> > + int rv;
> > + int page_real;
> > +
> > + if (page_log < 0)
> > + return 0;
> > +
> > + page_real = page_log_to_page_real(page_log, data->chip);
> > + if (page_real < 0)
> > + return page_real;
> > +
> > + if (data->page != page_real) {
> > + rv = i2c_smbus_write_byte_data(client, PMBUS_PAGE, page_real);
> > + if (rv < 0)
> > + return rv;
> > +
> > + data->page = page_real;
> > +
> > + /* Testing showed that the device has a timing issue. After
>
> Somehow network subsystem multi-line alignments slipped in.
> Not in hwmon, please. I cringe at those; it makes my brain focus on the
> comment (because it is asynchronous) instead of the code.
>
Done.
> > + * setting a page, it takes a while, before the device actually
> > + * gives the correct values from the correct page. 20 ms was
> > + * tested to be enough to not give wrong values (15 ms wasn't
> > + * enough)
> > + */
> > + usleep_range(20000, 30000);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int fsp3y_read_byte_data(struct i2c_client *client, int page, int reg)
> > +{
> > + int rv;
> > +
> > + rv = set_page(client, page);
> > + if (rv < 0)
> > + return rv;
> > +
> > + return i2c_smbus_read_byte_data(client, reg);
> > +}
> > +
> > +static int fsp3y_read_word_data(struct i2c_client *client, int page, int phase, int reg)
> > +{
> > + int rv;
> > +
> > + /* This masks commands which weren't tested to work correctly. Some of the masked commands
> > + * return either 0xFFFF. These would probably get tagged as invalid by pmbus_core. Other
>
> s/either// ?
Done.
>
> > + * ones do return values, which might be useful (that is, they are not 0xFFFF), but their
>
> s/values,/values/
Done.
>
> > + * encoding is unknown, and so they are unsupported.
> > + */
> > + switch (reg) {
> > + case PMBUS_READ_FAN_SPEED_1:
> > + case PMBUS_READ_IIN:
> > + case PMBUS_READ_IOUT:
> > + case PMBUS_READ_PIN:
> > + case PMBUS_READ_POUT:
> > + case PMBUS_READ_TEMPERATURE_1:
> > + case PMBUS_READ_TEMPERATURE_2:
> > + case PMBUS_READ_TEMPERATURE_3:
> > + case PMBUS_READ_VIN:
> > + case PMBUS_READ_VOUT:
> > + case PMBUS_STATUS_WORD:
> > + break;
> > + default:
> > + return -ENXIO;
> > + }
> > +
> > + rv = set_page(client, page);
> > + if (rv < 0)
> > + return rv;
> > +
> > + return i2c_smbus_read_word_data(client, reg);
> > +}
> > +
> > +struct pmbus_driver_info fsp3y_info[] = {
> > + [ym2151e] = {
> > + .pages = 2,
> > + .func[YM2151_PAGE_12V_LOG] =
> > + PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
> > + PMBUS_HAVE_PIN | PMBUS_HAVE_POUT |
> > + PMBUS_HAVE_TEMP | PMBUS_HAVE_TEMP2 |
> > + PMBUS_HAVE_VIN | PMBUS_HAVE_IIN |
> > + PMBUS_HAVE_FAN12,
> > + .func[YM2151_PAGE_5VSB_LOG] =
> > + PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT,
> > + PMBUS_HAVE_IIN,
> > + .read_word_data = fsp3y_read_word_data,
> > + .read_byte_data = fsp3y_read_byte_data,
> > + },
> > + [yh5151e] = {
> > + .pages = 3,
> > + .func[YH5151E_PAGE_12V_LOG] =
> > + PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
> > + PMBUS_HAVE_POUT |
> > + PMBUS_HAVE_TEMP | PMBUS_HAVE_TEMP2 | PMBUS_HAVE_TEMP3,
> > + .func[YH5151E_PAGE_5V_LOG] =
> > + PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
> > + PMBUS_HAVE_POUT,
> > + .func[YH5151E_PAGE_3V3_LOG] =
> > + PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
> > + PMBUS_HAVE_POUT,
> > + .read_word_data = fsp3y_read_word_data,
> > + .read_byte_data = fsp3y_read_byte_data,
> > + }
> > +};
> > +
> > +static int fsp3y_detect(struct i2c_client *client)
> > +{
> > + int rv;
> > + u8 buf[I2C_SMBUS_BLOCK_MAX];
> > +
> > + rv = i2c_smbus_read_block_data(client, PMBUS_MFR_MODEL, buf);
> > + if (rv < 0)
> > + return rv;
> > +
> > + if (rv == 8 && !strncmp(buf, "YM-2151E", strlen("YM-2151E")))
> > + return ym2151e;
> > + else if (rv == 8 && !strncmp(buf, "YH-5151E", strlen("YH-5151E")))
> > + return yh5151e;
>
> better
> if (rv == 8) {
> /* rest of check */
> }
>
Done.
> > +
> > + dev_err(&client->dev, "Unsupported model %.*s\n", rv, buf);
>
> Sorry I didn't notice before. This assumes that the buffer is zero-terminated,
> which may not be the case. For this to work, add another byte to the buffer
> (u8 buf[I2C_SMBUS_BLOCK_MAX + 1];), and add
> buf[rv] = '\0';
>
> You could actually do that before checking the returned strings and then just
> use strcmp() for the model comparisons, without bothering about the return
> length.
>
Done.
> > + return -ENODEV;
> > +}
> > +
> > +static const struct i2c_device_id fsp3y_id[] = {
> > + {"ym2151e", ym2151e},
> > + {"yh5151e", yh5151e}
> > +};
> > +
> > +static int fsp3y_probe(struct i2c_client *client)
> > +{
> > + struct fsp3y_data *data;
> > + const struct i2c_device_id *id;
> > + int rv;
> > +
> > + data = devm_kzalloc(&client->dev, sizeof(struct fsp3y_data), GFP_KERNEL);
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + data->chip = fsp3y_detect(client);
> > + if (data->chip < 0)
> > + return data->chip;
> > +
> > + id = i2c_match_id(fsp3y_id, client);
> > + if (data->chip != id->driver_data)
> > + dev_warn(&client->dev, "Device mismatch: Configured %s (%d), detected %d\n", id->name, (int)id->driver_data, data->chip);
> > +
> > + rv = i2c_smbus_read_byte_data(client, PMBUS_PAGE);
> > + if (rv < 0)
> > + return rv;
> > + data->page = rv;
> > +
> > + data->info = fsp3y_info[data->chip];
> > +
> > + return pmbus_do_probe(client, &data->info);
> > +}
> > +
> > +MODULE_DEVICE_TABLE(i2c, pmbus_id);
> > +
> > +/* This is the driver that will be inserted */
>
> Nit: Useless comment
>
Done.
> > +static struct i2c_driver fsp3y_driver = {
> > + .driver = {
> > + .name = "fsp3y",
> > + },
> > + .probe_new = fsp3y_probe,
> > + .id_table = fsp3y_id
> > +};
> > +
> > +module_i2c_driver(fsp3y_driver);
> > +
> > +MODULE_AUTHOR("Václav Kubernát");
> > +MODULE_DESCRIPTION("PMBus driver for FSP/3Y-Power power supplies");
> > +MODULE_LICENSE("GPL");
> >
>
^ permalink raw reply
* Re: [PATCH v5 2/7] docs/zh_CN: Add translation zh_CN/doc-guide/kernel-doc.rst
From: Wu X.C. @ 2021-04-14 2:38 UTC (permalink / raw)
To: Jonathan Corbet; +Cc: Alex Shi, Alex Shi, YanTeng Si, linux-doc
In-Reply-To: <87lf9lrgp3.fsf@meer.lwn.net>
On Tue, Apr 13, 2021 at 03:57:12PM -0600, Jonathan Corbet wrote:
> Wu XiangCheng <bobwxc@email.cn> writes:
>
> > Add new translation
> > Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
> >
> > Signed-off-by: Wu XiangCheng <bobwxc@email.cn>
> > Reviewed-by: Yanteng Si <siyanteng@loongson.cn>
> > ---
> > .../zh_CN/doc-guide/kernel-doc.rst | 500 ++++++++++++++++++
> > 1 file changed, 500 insertions(+)
> > create mode 100644 Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
> >
> > diff --git a/Documentation/translations/zh_CN/doc-guide/kernel-doc.rst b/Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
> > new file mode 100644
> > index 000000000000..b0427944f8f0
> > --- /dev/null
> > +++ b/Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
> > @@ -0,0 +1,500 @@
> > +.. include:: ../disclaimer-zh_CN.rst
> > +
> > +:Original: Documentation/doc-guide/kernel-doc.rst
> > +
> > +:译者: 吴想成 Wu XiangCheng <bobwxc@email.cn>
> > +
> > +编写kernel-doc注释
> > +==================
> > +
> > +Linux内核源文件可以包含kernel-doc格式的结构化文档注释,用以描述代码的函数、
> > +类型和设计。将文档嵌入源文件更容易保持文档最新。
> > +
> > +.. note:: 内核文档格式与javadoc、gtk-doc或Doxygen看似很相似,但由于历史原因,
> > + 实际有着明显的不同。内核源包含成千上万个kernel-doc注释。请坚持遵循
> > + 此处描述的风格。
> > +
> > +.. note:: kernel-doc无法包含Rust代码:请参考
> > + :ref:`Documentation/rust/docs.rst <rust_docs>`。
Hi Jonathan,
>
> Note that this reference adds a warning to the docs-next build - this
> translation is evidently against linux-next instead. The warning will
> eventually go away, but it would be better not to do this if possible.
Yeah, seems rust doc do not merge into docs-next yet. It's quite new.
BTW, I'd like to ask does "git.lwn.net" has a web access like git-web or
cgit... ?
>
> Also, there is no need for a :ref: here; just say
> "Documentation/rust/docs.rst" and the automarkup code will do the right
> thing. Yes, that should be changed in the original as well.
>
I'll send another patch to replace the two refs.
And use "Documentation/rust/docs.rst" will also fix the warning.
Thanks,
Wu X.C.
^ permalink raw reply
* [docs-next PATCH] docs/zh_CN: two minor fixes in zh_CN/doc-guide/
From: Wu XiangCheng @ 2021-04-14 3:34 UTC (permalink / raw)
To: Jonathan Corbet; +Cc: Alex Shi, linux-doc
zh_CN/doc-guide/kernel-doc.rst
replace a ref tag to solve docs-next warning
zh_CN/doc-guide/parse-headers.rst
fix an unperfect word
Signed-off-by: Wu XiangCheng <bobwxc@email.cn>
---
Rust part in doc-guide/kernel-doc.rst is from rust-next, I'll solve that
tag there.
Documentation/translations/zh_CN/doc-guide/kernel-doc.rst | 3 +--
| 2 +-
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/Documentation/translations/zh_CN/doc-guide/kernel-doc.rst b/Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
index b0427944f8f0..82ec84470c0b 100644
--- a/Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
+++ b/Documentation/translations/zh_CN/doc-guide/kernel-doc.rst
@@ -14,8 +14,7 @@ Linux内核源文件可以包含kernel-doc格式的结构化文档注释,用
实际有着明显的不同。内核源包含成千上万个kernel-doc注释。请坚持遵循
此处描述的风格。
-.. note:: kernel-doc无法包含Rust代码:请参考
- :ref:`Documentation/rust/docs.rst <rust_docs>`。
+.. note:: kernel-doc无法包含Rust代码:请参考 Documentation/rust/docs.rst 。
从注释中提取kernel-doc结构,并从中生成适当的 `Sphinx C 域`_ 函数和带有锚点的
类型描述。这些注释将被过滤以生成特殊kernel-doc高亮和交叉引用。详见下文。
--git a/Documentation/translations/zh_CN/doc-guide/parse-headers.rst b/Documentation/translations/zh_CN/doc-guide/parse-headers.rst
index 3c6612a3e47e..a6f9d052979e 100644
--- a/Documentation/translations/zh_CN/doc-guide/parse-headers.rst
+++ b/Documentation/translations/zh_CN/doc-guide/parse-headers.rst
@@ -184,4 +184,4 @@ enum foo { BAR1, BAR2, PRIVATE };
许可证 GPLv2:GNU GPL version 2 <https://gnu.org/licenses/gpl.html>
这是自由软件:你可以自由地修改和重新发布它。
-在法律允许的范围内,**没有任何保证**。
+在法律允许的范围内,**不提供任何保证**。
--
2.20.1
^ permalink raw reply related
* [PATCH net-next 3/6] ethtool: add FEC statistics
From: Jakub Kicinski @ 2021-04-14 3:44 UTC (permalink / raw)
To: davem
Cc: netdev, michael.chan, saeedm, leon, ecree.xilinx, habetsm.xilinx,
f.fainelli, andrew, mkubecek, ariela, Jakub Kicinski, corbet,
linux-doc
In-Reply-To: <20210414034454.1970967-1-kuba@kernel.org>
Similarly to pause statistics add stats for FEC.
The IEEE standard mandates two sets of counters:
- 30.5.1.1.17 aFECCorrectedBlocks
- 30.5.1.1.18 aFECUncorrectableBlocks
where block is a block of bits FEC operates on.
Each of these counters is defined per lane (PCS instance).
Multiple vendors provide number of corrected _bits_ rather
than/as well as blocks.
This set adds the 2 standard-based block counters and a extra
one for corrected bits.
Counters are exposed to user space via netlink in new attributes.
Each attribute carries an array of u64s, first element is
the total count, and the following ones are a per-lane break down.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: corbet@lwn.net
CC: linux-doc@vger.kernel.org
---
Documentation/networking/ethtool-netlink.rst | 21 ++++++
Documentation/networking/statistics.rst | 2 +
include/linux/ethtool.h | 40 +++++++++++
include/uapi/linux/ethtool_netlink.h | 14 ++++
net/ethtool/fec.c | 73 +++++++++++++++++++-
5 files changed, 149 insertions(+), 1 deletion(-)
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index bbecffc7b11a..f8219e2f489e 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -1302,6 +1302,7 @@ Gets FEC configuration and state like ``ETHTOOL_GFECPARAM`` ioctl request.
``ETHTOOL_A_FEC_MODES`` bitset configured modes
``ETHTOOL_A_FEC_AUTO`` bool FEC mode auto selection
``ETHTOOL_A_FEC_ACTIVE`` u32 index of active FEC mode
+ ``ETHTOOL_A_FEC_STATS`` nested FEC statistics
===================================== ====== ==========================
``ETHTOOL_A_FEC_ACTIVE`` is the bit index of the FEC link mode currently
@@ -1315,6 +1316,26 @@ This is equivalent to the ``ETHTOOL_FEC_AUTO`` bit of the ioctl interface.
``ETHTOOL_A_FEC_MODES`` carry the current FEC configuration using link mode
bits (rather than old ``ETHTOOL_FEC_*`` bits).
+``ETHTOOL_A_FEC_STATS`` are reported if ``ETHTOOL_FLAG_STATS`` was set in
+``ETHTOOL_A_HEADER_FLAGS``.
+Each attribute carries an array of 64bit statistics. First entry in the array
+contains the total number of events on the port, while the following entries
+are counters corresponding to lanes/PCS instances. The number of entries in
+the array will be:
+
++--------------+---------------------------------------------+
+| `0` | device does not support FEC statistics |
++--------------+---------------------------------------------+
+| `1` | device does not support per-lane break down |
++--------------+---------------------------------------------+
+| `1 + #lanes` | device has full support for FEC stats |
++--------------+---------------------------------------------+
+
+Drivers fill in the statistics in the following structure:
+
+.. kernel-doc:: include/linux/ethtool.h
+ :identifiers: ethtool_fec_stats
+
FEC_SET
=======
diff --git a/Documentation/networking/statistics.rst b/Documentation/networking/statistics.rst
index 234abedc29b2..b748fe44ee02 100644
--- a/Documentation/networking/statistics.rst
+++ b/Documentation/networking/statistics.rst
@@ -130,6 +130,7 @@ the `ETHTOOL_FLAG_STATS` flag in `ETHTOOL_A_HEADER_FLAGS`. Currently
statistics are supported in the following commands:
- `ETHTOOL_MSG_PAUSE_GET`
+ - `ETHTOOL_MSG_FEC_GET`
debugfs
-------
@@ -176,3 +177,4 @@ translated to netlink attributes when dumped. Drivers must not overwrite
the statistics they don't report with 0.
- ethtool_pause_stats()
+- ethtool_fec_stats()
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 069100b252bd..112a85b57f1f 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -269,6 +269,39 @@ struct ethtool_pause_stats {
u64 rx_pause_frames;
};
+#define ETHTOOL_MAX_LANES 8
+
+/**
+ * struct ethtool_fec_stats - statistics for IEEE 802.3 FEC
+ * @corrected_blocks: number of received blocks corrected by FEC
+ * Reported to user space as %ETHTOOL_A_FEC_STAT_CORRECTED.
+ *
+ * Equivalent to `30.5.1.1.17 aFECCorrectedBlocks` from the standard.
+ *
+ * @uncorrectable_blocks: number of received blocks FEC was not able to correct
+ * Reported to user space as %ETHTOOL_A_FEC_STAT_UNCORR.
+ *
+ * Equivalent to `30.5.1.1.18 aFECUncorrectableBlocks` from the standard.
+ *
+ * @corrected_bits: number of bits corrected by FEC
+ * Similar to @corrected_blocks but counts individual bit changes,
+ * not entire FEC data blocks. This is a non-standard statistic.
+ * Reported to user space as %ETHTOOL_A_FEC_STAT_CORR_BITS.
+ *
+ * @lane: per-lane/PCS-instance counts as defined by the standard
+ * @total: error counts for the entire port, for drivers incapable of reporting
+ * per-lane stats
+ *
+ * Drivers should fill in either only total or per-lane statistics, core
+ * will take care of adding lane values up to produce the total.
+ */
+struct ethtool_fec_stats {
+ struct ethtool_fec_stat {
+ u64 total;
+ u64 lanes[ETHTOOL_MAX_LANES];
+ } corrected_blocks, uncorrectable_blocks, corrected_bits;
+};
+
#define ETH_MODULE_EEPROM_PAGE_LEN 128
#define ETH_MODULE_MAX_I2C_ADDRESS 0x7f
@@ -439,6 +472,11 @@ struct ethtool_module_eeprom {
* ignored (use %__ETHTOOL_LINK_MODE_MASK_NBITS instead of the latter),
* any change to them will be overwritten by kernel. Returns a negative
* error code or zero.
+ * @get_fec_stats: Report FEC statistics.
+ * Core will sum up per-lane stats to get the total.
+ * Drivers must not zero statistics which they don't report. The stats
+ * structure is initialized to ETHTOOL_STAT_NOT_SET indicating driver does
+ * not report statistics.
* @get_fecparam: Get the network device Forward Error Correction parameters.
* @set_fecparam: Set the network device Forward Error Correction parameters.
* @get_ethtool_phy_stats: Return extended statistics about the PHY device.
@@ -544,6 +582,8 @@ struct ethtool_ops {
struct ethtool_link_ksettings *);
int (*set_link_ksettings)(struct net_device *,
const struct ethtool_link_ksettings *);
+ void (*get_fec_stats)(struct net_device *dev,
+ struct ethtool_fec_stats *fec_stats);
int (*get_fecparam)(struct net_device *,
struct ethtool_fecparam *);
int (*set_fecparam)(struct net_device *,
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 9612dcd48a6a..3a2b31ccbc5b 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -643,11 +643,25 @@ enum {
ETHTOOL_A_FEC_MODES, /* bitset */
ETHTOOL_A_FEC_AUTO, /* u8 */
ETHTOOL_A_FEC_ACTIVE, /* u32 */
+ ETHTOOL_A_FEC_STATS, /* nest - _A_FEC_STAT */
__ETHTOOL_A_FEC_CNT,
ETHTOOL_A_FEC_MAX = (__ETHTOOL_A_FEC_CNT - 1)
};
+enum {
+ ETHTOOL_A_FEC_STAT_UNSPEC,
+ ETHTOOL_A_FEC_STAT_PAD,
+
+ ETHTOOL_A_FEC_STAT_CORRECTED, /* array, u64 */
+ ETHTOOL_A_FEC_STAT_UNCORR, /* array, u64 */
+ ETHTOOL_A_FEC_STAT_CORR_BITS, /* array, u64 */
+
+ /* add new constants above here */
+ __ETHTOOL_A_FEC_STAT_CNT,
+ ETHTOOL_A_FEC_STAT_MAX = (__ETHTOOL_A_FEC_STAT_CNT - 1)
+};
+
/* MODULE EEPROM */
enum {
diff --git a/net/ethtool/fec.c b/net/ethtool/fec.c
index 3e7d091ee7aa..8738dafd5417 100644
--- a/net/ethtool/fec.c
+++ b/net/ethtool/fec.c
@@ -13,6 +13,10 @@ struct fec_reply_data {
__ETHTOOL_DECLARE_LINK_MODE_MASK(fec_link_modes);
u32 active_fec;
u8 fec_auto;
+ struct fec_stat_grp {
+ u64 stats[1 + ETHTOOL_MAX_LANES];
+ u8 cnt;
+ } corr, uncorr, corr_bits;
};
#define FEC_REPDATA(__reply_base) \
@@ -21,7 +25,7 @@ struct fec_reply_data {
#define ETHTOOL_FEC_MASK ((ETHTOOL_FEC_LLRS << 1) - 1)
const struct nla_policy ethnl_fec_get_policy[ETHTOOL_A_FEC_HEADER + 1] = {
- [ETHTOOL_A_FEC_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy),
+ [ETHTOOL_A_FEC_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy_stats),
};
static void
@@ -64,6 +68,28 @@ ethtool_link_modes_to_fecparam(struct ethtool_fecparam *fec,
return 0;
}
+static void
+fec_stats_recalc(struct fec_stat_grp *grp, struct ethtool_fec_stat *stats)
+{
+ int i;
+
+ if (stats->lanes[0] == ETHTOOL_STAT_NOT_SET) {
+ grp->stats[0] = stats->total;
+ grp->cnt = stats->total != ETHTOOL_STAT_NOT_SET;
+ return;
+ }
+
+ grp->cnt = 1;
+ grp->stats[0] = 0;
+ for (i = 0; i < ETHTOOL_MAX_LANES; i++) {
+ if (stats->lanes[i] == ETHTOOL_STAT_NOT_SET)
+ break;
+
+ grp->stats[0] += stats->lanes[i];
+ grp->stats[grp->cnt++] = stats->lanes[i];
+ }
+}
+
static int fec_prepare_data(const struct ethnl_req_info *req_base,
struct ethnl_reply_data *reply_base,
struct genl_info *info)
@@ -82,6 +108,17 @@ static int fec_prepare_data(const struct ethnl_req_info *req_base,
ret = dev->ethtool_ops->get_fecparam(dev, &fec);
if (ret)
goto out_complete;
+ if (req_base->flags & ETHTOOL_FLAG_STATS &&
+ dev->ethtool_ops->get_fec_stats) {
+ struct ethtool_fec_stats stats;
+
+ ethtool_stats_init((u64 *)&stats, sizeof(stats) / 8);
+ dev->ethtool_ops->get_fec_stats(dev, &stats);
+
+ fec_stats_recalc(&data->corr, &stats.corrected_blocks);
+ fec_stats_recalc(&data->uncorr, &stats.uncorrectable_blocks);
+ fec_stats_recalc(&data->corr_bits, &stats.corrected_bits);
+ }
WARN_ON_ONCE(fec.reserved);
@@ -120,9 +157,40 @@ static int fec_reply_size(const struct ethnl_req_info *req_base,
len += nla_total_size(sizeof(u8)) + /* _FEC_AUTO */
nla_total_size(sizeof(u32)); /* _FEC_ACTIVE */
+ if (req_base->flags & ETHTOOL_FLAG_STATS)
+ len += 3 * nla_total_size_64bit(sizeof(u64) *
+ (1 + ETHTOOL_MAX_LANES));
+
return len;
}
+static int fec_put_stats(struct sk_buff *skb, const struct fec_reply_data *data)
+{
+ struct nlattr *nest;
+
+ nest = nla_nest_start(skb, ETHTOOL_A_FEC_STATS);
+ if (!nest)
+ return -EMSGSIZE;
+
+ if (nla_put_64bit(skb, ETHTOOL_A_FEC_STAT_CORRECTED,
+ sizeof(u64) * data->corr.cnt,
+ data->corr.stats, ETHTOOL_A_FEC_STAT_PAD) ||
+ nla_put_64bit(skb, ETHTOOL_A_FEC_STAT_UNCORR,
+ sizeof(u64) * data->uncorr.cnt,
+ data->uncorr.stats, ETHTOOL_A_FEC_STAT_PAD) ||
+ nla_put_64bit(skb, ETHTOOL_A_FEC_STAT_CORR_BITS,
+ sizeof(u64) * data->corr_bits.cnt,
+ data->corr_bits.stats, ETHTOOL_A_FEC_STAT_PAD))
+ goto err_cancel;
+
+ nla_nest_end(skb, nest);
+ return 0;
+
+err_cancel:
+ nla_nest_cancel(skb, nest);
+ return -EMSGSIZE;
+}
+
static int fec_fill_reply(struct sk_buff *skb,
const struct ethnl_req_info *req_base,
const struct ethnl_reply_data *reply_base)
@@ -143,6 +211,9 @@ static int fec_fill_reply(struct sk_buff *skb,
nla_put_u32(skb, ETHTOOL_A_FEC_ACTIVE, data->active_fec)))
return -EMSGSIZE;
+ if (req_base->flags & ETHTOOL_FLAG_STATS && fec_put_stats(skb, data))
+ return -EMSGSIZE;
+
return 0;
}
--
2.30.2
^ permalink raw reply related
* Re: [PATCH] Documentation: syscalls: add a note about ABI-agnostic types
From: Yury Norov @ 2021-04-14 4:40 UTC (permalink / raw)
To: linux-api, linux-arch, linux-doc, linux-kernel
Cc: Alexander A. Klimov, André Almeida, Andrew Morton,
Arnd Bergmann, David Sterba, Joe Perches, Jonathan Corbet,
Mauro Carvalho Chehab, Mike Rapoport
In-Reply-To: <20210409204304.1273139-1-yury.norov@gmail.com>
Ping?
On Fri, Apr 09, 2021 at 01:43:04PM -0700, Yury Norov wrote:
> Recently added memfd_secret() syscall had a flags parameter passed
> as unsigned long, which requires creation of compat entry for it.
> It was possible to change the type of flags to unsigned int and so
> avoid bothering with compat layer.
>
> https://www.spinics.net/lists/linux-mm/msg251550.html
>
> Documentation/process/adding-syscalls.rst doesn't point clearly about
> preference of ABI-agnostic types. This patch adds such notification.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
> Documentation/process/adding-syscalls.rst | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
> index 9af35f4ec728..46add16edf14 100644
> --- a/Documentation/process/adding-syscalls.rst
> +++ b/Documentation/process/adding-syscalls.rst
> @@ -172,6 +172,13 @@ arguments (i.e. parameter 1, 3, 5), to allow use of contiguous pairs of 32-bit
> registers. (This concern does not apply if the arguments are part of a
> structure that's passed in by pointer.)
>
> +Whenever possible, try to use ABI-agnostic types for passing parameters to
> +a syscall in order to avoid creating compat entry for it. Linux supports two
> +ABI models - ILP32 and LP64. The types like ``void *``, ``long``, ``size_t``,
> +``off_t`` have different size in those ABIs; types like ``char`` and ``int``
> +have the same size and don't require a compat layer support. For flags, it's
> +always better to use ``unsigned int``.
> +
>
> Proposing the API
> -----------------
> --
> 2.25.1
^ permalink raw reply
* Re: [PATCH v3] powerpc/papr_scm: Implement support for H_SCM_FLUSH hcall
From: Aneesh Kumar K.V @ 2021-04-14 5:20 UTC (permalink / raw)
To: Vaibhav Jain, Shivaprasad G Bhat, sbhat, linuxppc-dev, kvm-ppc,
linux-nvdimm, ellerman
Cc: linux-doc
In-Reply-To: <87sg3ujmrm.fsf@vajain21.in.ibm.com>
Vaibhav Jain <vaibhav@linux.ibm.com> writes:
> Hi Shiva,
>
> Apologies for a late review but something just caught my eye while
> working on a different patch.
>
> Shivaprasad G Bhat <sbhat@linux.ibm.com> writes:
>
>> Add support for ND_REGION_ASYNC capability if the device tree
>> indicates 'ibm,hcall-flush-required' property in the NVDIMM node.
>> Flush is done by issuing H_SCM_FLUSH hcall to the hypervisor.
>>
>> If the flush request failed, the hypervisor is expected to
>> to reflect the problem in the subsequent nvdimm H_SCM_HEALTH call.
>>
>> This patch prevents mmap of namespaces with MAP_SYNC flag if the
>> nvdimm requires an explicit flush[1].
>>
>> References:
>> [1] https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/ndctl.py.data/map_sync.c
>>
>> Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
>> ---
>> v2 - https://www.spinics.net/lists/kvm-ppc/msg18799.html
>> Changes from v2:
>> - Fixed the commit message.
>> - Add dev_dbg before the H_SCM_FLUSH hcall
>>
>> v1 - https://www.spinics.net/lists/kvm-ppc/msg18272.html
>> Changes from v1:
>> - Hcall semantics finalized, all changes are to accomodate them.
>>
>> Documentation/powerpc/papr_hcalls.rst | 14 ++++++++++
>> arch/powerpc/include/asm/hvcall.h | 3 +-
>> arch/powerpc/platforms/pseries/papr_scm.c | 40 +++++++++++++++++++++++++++++
>> 3 files changed, 56 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
>> index 48fcf1255a33..648f278eea8f 100644
>> --- a/Documentation/powerpc/papr_hcalls.rst
>> +++ b/Documentation/powerpc/papr_hcalls.rst
>> @@ -275,6 +275,20 @@ Health Bitmap Flags:
>> Given a DRC Index collect the performance statistics for NVDIMM and copy them
>> to the resultBuffer.
>>
>> +**H_SCM_FLUSH**
>> +
>> +| Input: *drcIndex, continue-token*
>> +| Out: *continue-token*
>> +| Return Value: *H_SUCCESS, H_Parameter, H_P2, H_BUSY*
>> +
>> +Given a DRC Index Flush the data to backend NVDIMM device.
>> +
>> +The hcall returns H_BUSY when the flush takes longer time and the hcall needs
>> +to be issued multiple times in order to be completely serviced. The
>> +*continue-token* from the output to be passed in the argument list of
>> +subsequent hcalls to the hypervisor until the hcall is completely serviced
>> +at which point H_SUCCESS or other error is returned by the hypervisor.
>> +
> Does the hcall semantic also include measures to trigger a barrier/fence
> (like pm_wmb()) so that all the stores before the hcall are gauranteed
> to have hit the pmem controller ?
It is upto the hypervisor to implement the right sequence to ensure the
guarantee. The hcall doesn't expect the store to reach the platform
buffers.
>
> If not then the papr_scm_pmem_flush() introduced below should issue a
> fence like pm_wmb() before issuing the flush hcall.
>
> If yes then this behaviour should also be documented with the hcall
> semantics above.
>
IIUC fdatasync on the backend file is enough to ensure the
guaraantee. Such a request will have REQ_PREFLUSH and REQ_FUA which will
do the necessary barrier on the backing device holding the backend file.
-aneesh
^ permalink raw reply
* Re: [PATCH] Documentation: syscalls: add a note about ABI-agnostic types
From: Mauro Carvalho Chehab @ 2021-04-14 6:14 UTC (permalink / raw)
To: Yury Norov
Cc: linux-api, linux-arch, linux-doc, linux-kernel,
Alexander A. Klimov, André Almeida, Andrew Morton,
Arnd Bergmann, David Sterba, Joe Perches, Jonathan Corbet,
Mike Rapoport
In-Reply-To: <20210414044020.GA44464@yury-ThinkPad>
Em Tue, 13 Apr 2021 21:40:20 -0700
Yury Norov <yury.norov@gmail.com> escreveu:
> Ping?
>
> On Fri, Apr 09, 2021 at 01:43:04PM -0700, Yury Norov wrote:
> > Recently added memfd_secret() syscall had a flags parameter passed
> > as unsigned long, which requires creation of compat entry for it.
> > It was possible to change the type of flags to unsigned int and so
> > avoid bothering with compat layer.
> >
> > https://www.spinics.net/lists/linux-mm/msg251550.html
> >
> > Documentation/process/adding-syscalls.rst doesn't point clearly about
> > preference of ABI-agnostic types. This patch adds such notification.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> > Documentation/process/adding-syscalls.rst | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
> > index 9af35f4ec728..46add16edf14 100644
> > --- a/Documentation/process/adding-syscalls.rst
> > +++ b/Documentation/process/adding-syscalls.rst
> > @@ -172,6 +172,13 @@ arguments (i.e. parameter 1, 3, 5), to allow use of contiguous pairs of 32-bit
> > registers. (This concern does not apply if the arguments are part of a
> > structure that's passed in by pointer.)
> >
> > +Whenever possible, try to use ABI-agnostic types for passing parameters to
> > +a syscall in order to avoid creating compat entry for it. Linux supports two
> > +ABI models - ILP32 and LP64.
> > + The types like ``void *``, ``long``, ``size_t``,
> > +``off_t`` have different size in those ABIs;
In the case of pointers, the best is to use __u64. The pointer can then
be read on Kernelspace with something like this:
static inline void __user *media_get_uptr(__u64 arg)
{
return (void __user *)(uintptr_t)arg;
}
> > types like ``char`` and ``int``
> > +have the same size and don't require a compat layer support. For flags, it's
> > +always better to use ``unsigned int``.
> > +
I don't think this is true for all compilers on userspace, as the C
standard doesn't define how many bits an int/unsigned int has.
So, even if this is today's reality, things may change in the future.
For instance, I remember we had to replace "int" and "enum" by "__u32"
and "long" by "__u64" at the media uAPI in the past, when we start
seeing x86_64 Kernels with 32-bits userspace and when cameras started
being supported on arm32.
We did have some real bugs with "enum", as, on that time, some
compilers (gcc, I guess) were optimizing them to have less than
32 bits on certain architectures, when it fits.
Thanks,
Mauro
^ permalink raw reply
* [PATCH v6] hwmon: Add driver for fsp-3y PSUs and PDUs
From: Václav Kubernát @ 2021-04-14 8:00 UTC (permalink / raw)
Cc: Václav Kubernát, Jean Delvare, Guenter Roeck,
Jonathan Corbet, linux-hwmon, linux-doc, linux-kernel
In-Reply-To: <20210329143833.1047539-1-kubernat@cesnet.cz>
This patch adds support for these devices:
- YH-5151E - the PDU
- YM-2151E - the PSU
The device datasheet says that the devices support PMBus 1.2, but in my
testing, a lot of the commands aren't supported and if they are, they
sometimes behave strangely or inconsistently. For example, writes to the
PAGE command requires using PEC, otherwise the write won't work and the
page won't switch, even though, the standard says that PEC is optional.
On the other hand, writes to SMBALERT don't require PEC. Because of
this, the driver is mostly reverse engineered with the help of a tool
called pmbus_peek written by David Brownell (and later adopted by my
colleague Jan Kundrát).
The device also has some sort of a timing issue when switching pages,
which is explained further in the code.
Because of this, the driver support is limited. It exposes only the
values, that have been tested to work correctly.
Signed-off-by: Václav Kubernát <kubernat@cesnet.cz>
---
Documentation/hwmon/fsp-3y.rst | 28 ++++
Documentation/hwmon/index.rst | 1 +
drivers/hwmon/pmbus/Kconfig | 10 ++
drivers/hwmon/pmbus/Makefile | 1 +
drivers/hwmon/pmbus/fsp-3y.c | 253 +++++++++++++++++++++++++++++++++
5 files changed, 293 insertions(+)
create mode 100644 Documentation/hwmon/fsp-3y.rst
create mode 100644 drivers/hwmon/pmbus/fsp-3y.c
diff --git a/Documentation/hwmon/fsp-3y.rst b/Documentation/hwmon/fsp-3y.rst
new file mode 100644
index 000000000000..5693d83a2035
--- /dev/null
+++ b/Documentation/hwmon/fsp-3y.rst
@@ -0,0 +1,28 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Kernel driver fsp3y
+======================
+Supported devices:
+ * 3Y POWER YH-5151E
+ * 3Y POWER YM-2151E
+
+Author: Václav Kubernát <kubernat@cesnet.cz>
+
+Description
+-----------
+This driver implements limited support for two 3Y POWER devices.
+
+Sysfs entries
+-------------
+ * in1_input input voltage
+ * in2_input 12V output voltage
+ * in3_input 5V output voltage
+ * curr1_input input current
+ * curr2_input 12V output current
+ * curr3_input 5V output current
+ * fan1_input fan rpm
+ * temp1_input temperature 1
+ * temp2_input temperature 2
+ * temp3_input temperature 3
+ * power1_input input power
+ * power2_input output power
diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst
index fcb870ce6286..55c9f014c248 100644
--- a/Documentation/hwmon/index.rst
+++ b/Documentation/hwmon/index.rst
@@ -63,6 +63,7 @@ Hardware Monitoring Kernel Drivers
f71805f
f71882fg
fam15h_power
+ fsp-3y
ftsteutates
g760a
g762
diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig
index 03606d4298a4..9d12d446396c 100644
--- a/drivers/hwmon/pmbus/Kconfig
+++ b/drivers/hwmon/pmbus/Kconfig
@@ -56,6 +56,16 @@ config SENSORS_BEL_PFE
This driver can also be built as a module. If so, the module will
be called bel-pfe.
+config SENSORS_FSP_3Y
+ tristate "FSP/3Y-Power power supplies"
+ help
+ If you say yes here you get hardware monitoring support for
+ FSP/3Y-Power hot-swap power supplies.
+ Supported models: YH-5151E, YM-2151E
+
+ This driver can also be built as a module. If so, the module will
+ be called fsp-3y.
+
config SENSORS_IBM_CFFPS
tristate "IBM Common Form Factor Power Supply"
depends on LEDS_CLASS
diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile
index 6a4ba0fdc1db..bfe218ad898f 100644
--- a/drivers/hwmon/pmbus/Makefile
+++ b/drivers/hwmon/pmbus/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_SENSORS_PMBUS) += pmbus.o
obj-$(CONFIG_SENSORS_ADM1266) += adm1266.o
obj-$(CONFIG_SENSORS_ADM1275) += adm1275.o
obj-$(CONFIG_SENSORS_BEL_PFE) += bel-pfe.o
+obj-$(CONFIG_SENSORS_FSP_3Y) += fsp-3y.o
obj-$(CONFIG_SENSORS_IBM_CFFPS) += ibm-cffps.o
obj-$(CONFIG_SENSORS_INSPUR_IPSPS) += inspur-ipsps.o
obj-$(CONFIG_SENSORS_IR35221) += ir35221.o
diff --git a/drivers/hwmon/pmbus/fsp-3y.c b/drivers/hwmon/pmbus/fsp-3y.c
new file mode 100644
index 000000000000..564649e87e34
--- /dev/null
+++ b/drivers/hwmon/pmbus/fsp-3y.c
@@ -0,0 +1,253 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Hardware monitoring driver for FSP 3Y-Power PSUs
+ *
+ * Copyright (c) 2021 Václav Kubernát, CESNET
+ *
+ * This driver is mostly reverse engineered with the help of a tool called pmbus_peek written by
+ * David Brownell (and later adopted by Jan Kundrát). The device has some sort of a timing issue
+ * when switching pages, details are explained in the code. The driver support is limited. It
+ * exposes only the values, that have been tested to work correctly. Unsupported values either
+ * aren't supported by the devices or their encondings are unknown.
+ */
+
+#include <linux/delay.h>
+#include <linux/i2c.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include "pmbus.h"
+
+#define YM2151_PAGE_12V_LOG 0x00
+#define YM2151_PAGE_12V_REAL 0x00
+#define YM2151_PAGE_5VSB_LOG 0x01
+#define YM2151_PAGE_5VSB_REAL 0x20
+#define YH5151E_PAGE_12V_LOG 0x00
+#define YH5151E_PAGE_12V_REAL 0x00
+#define YH5151E_PAGE_5V_LOG 0x01
+#define YH5151E_PAGE_5V_REAL 0x10
+#define YH5151E_PAGE_3V3_LOG 0x02
+#define YH5151E_PAGE_3V3_REAL 0x11
+
+enum chips {
+ ym2151e,
+ yh5151e
+};
+
+struct fsp3y_data {
+ struct pmbus_driver_info info;
+ int chip;
+ int page;
+};
+
+#define to_fsp3y_data(x) container_of(x, struct fsp3y_data, info)
+
+static int page_log_to_page_real(int page_log, enum chips chip)
+{
+ switch (chip) {
+ case ym2151e:
+ switch (page_log) {
+ case YM2151_PAGE_12V_LOG:
+ return YM2151_PAGE_12V_REAL;
+ case YM2151_PAGE_5VSB_LOG:
+ return YM2151_PAGE_5VSB_REAL;
+ }
+ return -EINVAL;
+ case yh5151e:
+ switch (page_log) {
+ case YH5151E_PAGE_12V_LOG:
+ return YH5151E_PAGE_12V_REAL;
+ case YH5151E_PAGE_5V_LOG:
+ return YH5151E_PAGE_5V_LOG;
+ case YH5151E_PAGE_3V3_LOG:
+ return YH5151E_PAGE_3V3_REAL;
+ }
+ return -EINVAL;
+ }
+
+ return -EINVAL;
+}
+
+static int set_page(struct i2c_client *client, int page_log)
+{
+ const struct pmbus_driver_info *info = pmbus_get_driver_info(client);
+ struct fsp3y_data *data = to_fsp3y_data(info);
+ int rv;
+ int page_real;
+
+ if (page_log < 0)
+ return 0;
+
+ page_real = page_log_to_page_real(page_log, data->chip);
+ if (page_real < 0)
+ return page_real;
+
+ if (data->page != page_real) {
+ rv = i2c_smbus_write_byte_data(client, PMBUS_PAGE, page_real);
+ if (rv < 0)
+ return rv;
+
+ data->page = page_real;
+
+ /*
+ * Testing showed that the device has a timing issue. After
+ * setting a page, it takes a while, before the device actually
+ * gives the correct values from the correct page. 20 ms was
+ * tested to be enough to not give wrong values (15 ms wasn't
+ * enough).
+ */
+ usleep_range(20000, 30000);
+ }
+
+ return 0;
+}
+
+static int fsp3y_read_byte_data(struct i2c_client *client, int page, int reg)
+{
+ int rv;
+
+ rv = set_page(client, page);
+ if (rv < 0)
+ return rv;
+
+ return i2c_smbus_read_byte_data(client, reg);
+}
+
+static int fsp3y_read_word_data(struct i2c_client *client, int page, int phase, int reg)
+{
+ int rv;
+
+ /*
+ * This masks commands which weren't tested to work correctly. Some of
+ * the masked commands return 0xFFFF. These would probably get tagged as
+ * invalid by pmbus_core. Other ones do return values which might be
+ * useful (that is, they are not 0xFFFF), but their encoding is unknown,
+ * and so they are unsupported.
+ */
+ switch (reg) {
+ case PMBUS_READ_FAN_SPEED_1:
+ case PMBUS_READ_IIN:
+ case PMBUS_READ_IOUT:
+ case PMBUS_READ_PIN:
+ case PMBUS_READ_POUT:
+ case PMBUS_READ_TEMPERATURE_1:
+ case PMBUS_READ_TEMPERATURE_2:
+ case PMBUS_READ_TEMPERATURE_3:
+ case PMBUS_READ_VIN:
+ case PMBUS_READ_VOUT:
+ case PMBUS_STATUS_WORD:
+ break;
+ default:
+ return -ENXIO;
+ }
+
+ rv = set_page(client, page);
+ if (rv < 0)
+ return rv;
+
+ return i2c_smbus_read_word_data(client, reg);
+}
+
+struct pmbus_driver_info fsp3y_info[] = {
+ [ym2151e] = {
+ .pages = 2,
+ .func[YM2151_PAGE_12V_LOG] =
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
+ PMBUS_HAVE_PIN | PMBUS_HAVE_POUT |
+ PMBUS_HAVE_TEMP | PMBUS_HAVE_TEMP2 |
+ PMBUS_HAVE_VIN | PMBUS_HAVE_IIN |
+ PMBUS_HAVE_FAN12,
+ .func[YM2151_PAGE_5VSB_LOG] =
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT,
+ PMBUS_HAVE_IIN,
+ .read_word_data = fsp3y_read_word_data,
+ .read_byte_data = fsp3y_read_byte_data,
+ },
+ [yh5151e] = {
+ .pages = 3,
+ .func[YH5151E_PAGE_12V_LOG] =
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
+ PMBUS_HAVE_POUT |
+ PMBUS_HAVE_TEMP | PMBUS_HAVE_TEMP2 | PMBUS_HAVE_TEMP3,
+ .func[YH5151E_PAGE_5V_LOG] =
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
+ PMBUS_HAVE_POUT,
+ .func[YH5151E_PAGE_3V3_LOG] =
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
+ PMBUS_HAVE_POUT,
+ .read_word_data = fsp3y_read_word_data,
+ .read_byte_data = fsp3y_read_byte_data,
+ }
+};
+
+static int fsp3y_detect(struct i2c_client *client)
+{
+ int rv;
+ u8 buf[I2C_SMBUS_BLOCK_MAX + 1];
+
+ rv = i2c_smbus_read_block_data(client, PMBUS_MFR_MODEL, buf);
+ if (rv < 0)
+ return rv;
+
+ buf[rv] = '\0';
+
+ if (rv == 8) {
+ if (!strcmp(buf, "YM-2151E"))
+ return ym2151e;
+ else if (!strcmp(buf, "YH-5151E"))
+ return yh5151e;
+ }
+
+ dev_err(&client->dev, "Unsupported model %.*s\n", rv, buf);
+ return -ENODEV;
+}
+
+static const struct i2c_device_id fsp3y_id[] = {
+ {"ym2151e", ym2151e},
+ {"yh5151e", yh5151e},
+ {0}
+};
+
+static int fsp3y_probe(struct i2c_client *client)
+{
+ struct fsp3y_data *data;
+ const struct i2c_device_id *id;
+ int rv;
+
+ data = devm_kzalloc(&client->dev, sizeof(struct fsp3y_data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+
+ data->chip = fsp3y_detect(client);
+ if (data->chip < 0)
+ return data->chip;
+
+ id = i2c_match_id(fsp3y_id, client);
+ if (data->chip != id->driver_data)
+ dev_warn(&client->dev, "Device mismatch: Configured %s (%d), detected %d\n",
+ id->name, (int)id->driver_data, data->chip);
+
+ rv = i2c_smbus_read_byte_data(client, PMBUS_PAGE);
+ if (rv < 0)
+ return rv;
+ data->page = rv;
+
+ data->info = fsp3y_info[data->chip];
+
+ return pmbus_do_probe(client, &data->info);
+}
+
+MODULE_DEVICE_TABLE(i2c, fsp3y_id);
+
+static struct i2c_driver fsp3y_driver = {
+ .driver = {
+ .name = "fsp3y",
+ },
+ .probe_new = fsp3y_probe,
+ .id_table = fsp3y_id
+};
+
+module_i2c_driver(fsp3y_driver);
+
+MODULE_AUTHOR("Václav Kubernát");
+MODULE_DESCRIPTION("PMBus driver for FSP/3Y-Power power supplies");
+MODULE_LICENSE("GPL");
--
2.31.1
^ permalink raw reply related
* Re: [PATCH v5] hwmon: Add driver for fsp-3y PSUs and PDUs
From: Václav Kubernát @ 2021-04-14 8:02 UTC (permalink / raw)
To: Guenter Roeck
Cc: Jean Delvare, Jonathan Corbet, linux-hwmon, linux-doc,
linux-kernel
In-Reply-To: <20210414032902.GA242591@roeck-us.net>
st 14. 4. 2021 v 5:29 odesílatel Guenter Roeck <linux@roeck-us.net> napsal:
>
> On Wed, Apr 14, 2021 at 02:13:06AM +0200, Václav Kubernát wrote:
> > This patch adds support for these devices:
> > - YH-5151E - the PDU
> > - YM-2151E - the PSU
> >
> > The device datasheet says that the devices support PMBus 1.2, but in my
> > testing, a lot of the commands aren't supported and if they are, they
> > sometimes behave strangely or inconsistently. For example, writes to the
> > PAGE command requires using PEC, otherwise the write won't work and the
> > page won't switch, even though, the standard says that PEC is optional.
> > On the other hand, writes to SMBALERT don't require PEC. Because of
> > this, the driver is mostly reverse engineered with the help of a tool
> > called pmbus_peek written by David Brownell (and later adopted by my
> > colleague Jan Kundrát).
> >
> > The device also has some sort of a timing issue when switching pages,
> > which is explained further in the code.
> >
> > Because of this, the driver support is limited. It exposes only the
> > values, that have been tested to work correctly.
> >
> > Signed-off-by: Václav Kubernát <kubernat@cesnet.cz>
>
> checkpatch says:
>
> WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
> #108: FILE: Documentation/hwmon/fsp-3y.rst:1:
> +Kernel driver fsp3y
>
> WARNING: line length of 137 exceeds 100 columns
> #409: FILE: drivers/hwmon/pmbus/fsp-3y.c:225:
> + dev_warn(&client->dev, "Device mismatch: Configured %s (%d), detected %d\n", id->name, (int)id->driver_data, data->chip);
>
> Please fix and resubmit.
>
Done.
Václav
> Thanks,
> Guenter
^ permalink raw reply
* [PATCH v2] Documentation: dev-tools: Add Testing Overview
From: David Gow @ 2021-04-14 8:14 UTC (permalink / raw)
To: Jonathan Corbet, Shuah Khan, Andrew Morton, Dmitry Vyukov,
Marco Elver, Brendan Higgins, Daniel Latypov
Cc: David Gow, linux-doc, KUnit Development, linux-kselftest,
linux-kernel
The kernel now has a number of testing and debugging tools, and we've
seen a bit of confusion about what the differences between them are.
Add a basic documentation outlining the testing tools, when to use each,
and how they interact.
This is a pretty quick overview rather than the idealised "kernel
testing guide" that'd probably be optimal, but given the number of times
questions like "When do you use KUnit and when do you use Kselftest?"
are being asked, it seemed worth at least having something. Hopefully
this can form the basis for more detailed documentation later.
Signed-off-by: David Gow <davidgow@google.com>
---
Thanks, everyone, for the comments on the doc. I've made a few of the
suggested changes. Please let me know what you think!
-- David
Changes since v1:
https://lore.kernel.org/linux-kselftest/20210410070529.4113432-1-davidgow@google.com/
- Note KUnit's speed and that one should provide selftests for syscalls
- Mention lockdep as a Dynamic Analysis Tool
- Refer to "Dynamic Analysis Tools" instead of "Sanitizers"
- A number of minor formatting tweaks and rewordings for clarity.
Not changed:
- I haven't included an exhaustive list of differences, advantages, etc,
between KUnit and kselftest: for now, the doc continues to focus on
the difference between 'in-kernel' and 'userspace' testing here.
- Similarly, I'm not linking out to docs defining and describing "Unit"
tests versus "End-to-end" tests. None of the existing documentation
elsewhere quite matches what we do in the kernel perfectly, so it
seems less confusing to focus on the 'in-kernel'/'userspace'
distinction, and leave other definitions as a passing mention for
those who are already familiar with the concepts.
- I haven't linked to any talk videos here: a few of them are linked on
(e.g.) the KUnit webpage, but I wanted to keep the Kernel documentation
more self-contained for now. No objection to adding them in a follow-up
patch if people feel strongly about it, though.
- The link from index.rst to this doc is unchanged. I personally think
that the link is prominent enough there: it's the first link, and
shows up a few times. One possibility if people disagreed would be to
merge this page with the index, but given not all dev-tools are going
to be testing-related, it seemed a bit arrogant. :-)
Documentation/dev-tools/index.rst | 3 +
Documentation/dev-tools/testing-overview.rst | 117 +++++++++++++++++++
2 files changed, 120 insertions(+)
create mode 100644 Documentation/dev-tools/testing-overview.rst
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 1b1cf4f5c9d9..f590e5860794 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -7,6 +7,8 @@ be used to work on the kernel. For now, the documents have been pulled
together without any significant effort to integrate them into a coherent
whole; patches welcome!
+A brief overview of testing-specific tools can be found in :doc:`testing-overview`.
+
.. class:: toc-title
Table of contents
@@ -14,6 +16,7 @@ whole; patches welcome!
.. toctree::
:maxdepth: 2
+ testing-overview
coccinelle
sparse
kcov
diff --git a/Documentation/dev-tools/testing-overview.rst b/Documentation/dev-tools/testing-overview.rst
new file mode 100644
index 000000000000..ce36a8cdf6b5
--- /dev/null
+++ b/Documentation/dev-tools/testing-overview.rst
@@ -0,0 +1,117 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Kernel Testing Guide
+====================
+
+
+There are a number of different tools for testing the Linux kernel, so knowing
+when to use each of them can be a challenge. This document provides a rough
+overview of their differences, and how they fit together.
+
+
+Writing and Running Tests
+=========================
+
+The bulk of kernel tests are written using either the kselftest or KUnit
+frameworks. These both provide infrastructure to help make running tests and
+groups of tests easier, as well as providing helpers to aid in writing new
+tests.
+
+If you're looking to verify the behaviour of the Kernel — particularly specific
+parts of the kernel — then you'll want to use KUnit or kselftest.
+
+
+The Difference Between KUnit and kselftest
+------------------------------------------
+
+KUnit (Documentation/dev-tools/kunit/index.rst) is an entirely in-kernel system
+for "white box" testing: because test code is part of the kernel, it can access
+internal structures and functions which aren't exposed to userspace.
+
+KUnit tests therefore are best written against small, self-contained parts
+of the kernel, which can be tested in isolation. This aligns well with the
+concept of 'unit' testing.
+
+For example, a KUnit test might test an individual kernel function (or even a
+single codepath through a function, such as an error handling case), rather
+than a feature as a whole.
+
+This also makes KUnit tests very fast to build and run, allowing them to be
+run frequently as part of the development process.
+
+There is a KUnit test style guide which may give further pointers in
+Documentation/dev-tools/kunit/style.rst
+
+
+kselftest (Documentation/dev-tools/kselftest.rst), on the other hand, is
+largely implemented in userspace, and tests are normal userspace scripts or
+programs.
+
+This makes it easier to write more complicated tests, or tests which need to
+manipulate the overall system state more (e.g., spawning processes, etc.).
+However, it's not possible to call kernel functions directly from kselftest.
+This means that only kernel functionality which is exposed to userspace somhow
+(e.g. by a syscall, device, filesystem, etc.) can be tested with kselftest. To
+work around this, some tests include a companion kernel module which exposes
+more information or functionality. If a test runs mostly or entirely within the
+kernel, however, KUnit may be the more appropriate tool.
+
+kselftest is therefore suited well to tests of whole features, as these will
+expose an interface to userspace, which can be tested, but not implementation
+details. This aligns well with 'system' or 'end-to-end' testing.
+
+For example, all new system calls should be accompanied by kselftest tests.
+
+Code Coverage Tools
+===================
+
+The Linux Kernel supports two different code coverage measurement tools. These
+can be used to verify that a test is executing particular functions or lines
+of code. This is useful for determining how much of the kernel is being tested,
+and for finding corner-cases which are not covered by the appropriate test.
+
+:doc:`gcov` is GCC's coverage testing tool, which can be used with the kernel
+to get global or per-module coverage. Unlike KCOV, it does not record per-task
+coverage. Coverage data can be read from debugfs, and interpreted using the
+usual gcov tooling.
+
+:doc:`kcov` is a feature which can be built in to the kernel to allow
+capturing coverage on a per-task level. It's therefore useful for fuzzing and
+other situations where information about code executed during, for example, a
+single syscall is useful.
+
+
+Dynamic Analysis Tools
+======================
+
+The kernel also supports a number of dynamic analysis tools, which attempt to
+detect classes of issues when the occur in a running kernel. These typically
+look for undefined behaviour of some kind, such as invalid memory accesses,
+concurrency issues such as data races, or other undefined behaviour like
+integer overflows.
+
+Some of these tools are listed below:
+
+* kmemleak detects possible memory leaks. See
+ Documentation/dev-tools/kmemleak.rst
+* KASAN detects invalid memory accesses such as out-of-bounds and
+ use-after-free errors. See Documentation/dev-tools/kasan.rst
+* UBSAN detects behaviour that is undefined by the C standard, like integer
+ overflows. See Documentation/dev-tools/ubsan.rst
+* KCSAN detects data races. See Documentation/dev-tools/kcsan.rst
+* KFENCE is a low-overhead detector of memory issues, which is much faster than
+ KASAN and can be used in production. See Documentation/dev-tools/kfence.rst
+* lockdep is a locking correctness validator. See
+ Documentation/locking/lockdep-design.rst
+* There are several other pieces of debug instrumentation in the kernel, many
+ of which can be found in lib/Kconfig.debug
+
+These tools tend to test the kernel as a whole, and do not "pass" like
+kselftest or KUnit tests. They can be combined with KUnit or kselftest by
+running tests on a kernel with a sanitizer enabled: you can then be sure
+that none of these errors are occurring during the test.
+
+Some of these tools integrate with KUnit or kselftest and will
+automatically fail tests if an issue is detected.
+
--
2.31.1.295.g9ea45b61b8-goog
^ permalink raw reply related
* [patch v3] translations/zh_CN: add translations to dev-tools gcov
From: Bernard Zhao @ 2021-04-14 8:23 UTC (permalink / raw)
To: Harry Wei, Alex Shi, Jonathan Corbet, Nathan Chancellor,
Nick Desaulniers, Bernard Zhao, linux-doc, linux-kernel,
clang-built-linux
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=n, Size: 10758 bytes --]
Add translations to dev-tools gcov
Signed-off-by: Bernard Zhao <bernard@vivo.com>
---
Changes since V2:
* fix some inaccurate translation
Changes since V1:
* add index.rst in dev-tools and link to to zh_CN/index.rst
* fix some inaccurate translation
---
.../translations/zh_CN/dev-tools/gcov.rst | 261 ++++++++++++++++++
.../translations/zh_CN/dev-tools/index.rst | 35 +++
Documentation/translations/zh_CN/index.rst | 1 +
3 files changed, 297 insertions(+)
create mode 100644 Documentation/translations/zh_CN/dev-tools/gcov.rst
create mode 100644 Documentation/translations/zh_CN/dev-tools/index.rst
diff --git a/Documentation/translations/zh_CN/dev-tools/gcov.rst b/Documentation/translations/zh_CN/dev-tools/gcov.rst
new file mode 100644
index 000000000000..882eee4c75c5
--- /dev/null
+++ b/Documentation/translations/zh_CN/dev-tools/gcov.rst
@@ -0,0 +1,261 @@
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/dev-tools/gcov.rst
+:Translator: 赵军奎 Bernard Zhao <bernard@vivo.com>
+
+在Linux内核里使用gcov做代码覆盖率检查
+=========================================
+
+gcov是linux中已经集成的一个分析模块,该模块在内核中对GCC的代码覆盖率统
+计提供了支持。
+linux内核运行时的代码覆盖率数据会以gcov兼容的格式存储在debug-fs中,可
+以通过gcov的``-o``选项(如下示例)获得指定文件的代码运行覆盖率统计数据
+(需要跳转到内核编译路径下并且要有root权限)::
+
+ # cd /tmp/linux-out
+ # gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c
+
+这将在当前目录中创建带有执行计数注释的源代码文件。
+在获得这些统计文件后,可以使用图形化的gcov_前端工具(比如lcov_),来实现
+自动化处理linux 内核的覆盖率运行数据,同时生成易于阅读的HTML格式文件。
+
+可能的用途:
+
+* 调试(用来判断每一行的代码是否已经运行过)
+* 测试改进(如何修改测试代码,尽可能地覆盖到没有运行过的代码)
+* 内核配置优化(对于某一个选项配置,如果关联的代码从来没有运行过,是
+否还需要这个配置)
+
+_gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html
+_lcov: http://ltp.sourceforge.net/coverage/lcov.php
+
+
+准备
+-----------
+
+内核打开如下配置::
+
+ CONFIG_DEBUG_FS=y
+ CONFIG_GCOV_KERNEL=y
+
+获取整个内核的覆盖率数据,还需要打开::
+
+ CONFIG_GCOV_PROFILE_ALL=y
+
+需要注意的是,整个内核开启覆盖率统计会造成内核镜像文件尺寸的增大,
+同时内核运行的也会变慢一些。
+另外,并不是所有的架构都支持整个内核开启覆盖率统计。
+
+代码运行覆盖率数据只在debugfs挂载完成后才可以访问::
+
+ mount -t debugfs none /sys/kernel/debug
+
+
+客制化
+-------------
+
+如果要单独针对某一个路径或者文件进行代码覆盖率统计,可以在内核相应路
+径的Makefile中增加如下的配置:
+
+- 单独统计单个文件(例如main.o)::
+
+ GCOV_PROFILE_main.o := y
+
+- 单独统计某一个路径::
+
+ GCOV_PROFILE := y
+
+如果要在整个内核的覆盖率统计(开启CONFIG_GCOV_PROFILE_ALL)中单独排除
+某一个文件或者路径,可以使用如下的方法::
+
+ GCOV_PROFILE_main.o := n
+
+和::
+
+ GCOV_PROFILE := n
+
+此机制仅支持链接到内核镜像或编译为内核模块的文件。
+
+
+相关文件
+-------------
+
+gcov功能需要在debugfs中创建如下文件:
+
+``/sys/kernel/debug/gcov``
+ gcov相关功能的根路径
+
+``/sys/kernel/debug/gcov/reset``
+ 全局复位文件:向该文件写入数据后会将所有的gcov统计数据清0
+
+``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcda``
+ gcov工具可以识别的覆盖率统计数据文件,向该文件写入数据后
+ 会将本文件的gcov统计数据清0
+
+``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcno``
+ gcov工具需要的软连接文件(指向编译时生成的信息统计文件),这个文件是
+ 在gcc编译时如果配置了选项``-ftest-coverage``时生成的。
+
+
+针对模块的统计
+---------------
+
+内核中的模块会动态的加载和卸载,模块卸载时对应的数据会被清除掉。
+gcov提供了一种机制,通过保留相关数据的副本来收集这部分卸载模块的覆盖率数据。
+模块卸载后这些备份数据在debugfs中会继续存在。
+一旦这个模块重新加载,模块关联的运行统计会被初始化成debugfs中备份的数据。
+
+可以通过对内核参数gcov_persist的修改来停用gcov对模块的备份机制::
+
+ gcov_persist = 0
+
+在运行时,用户还可以通过写入模块的数据文件或者写入gcov复位文件来丢弃已卸
+载模块的数据。
+
+
+分离的编译和运行设备
+---------------------
+
+gcov的内核分析架构支持内核的编译和分析是在同一台设备上,也可以编译和运
+行是在不同的设备上。
+如果内核编译和运行是不同的设备,那么需要额外的准备工作,这取决于gcov工具
+是在哪里使用的:
+
+a) 若gcov运行在测试设备上
+
+ 测试设备上面gcov工具的版本必须要跟设备内核编译使用的gcc版本相兼容,
+ 同时下面的文件要从编译设备拷贝到测试设备上:
+
+ 从源代码中:
+ - 所有的C文件和头文件
+
+ 从编译目录中:
+ - 所有的C文件和头文件
+ - 所有的.gcda文件和.gcno文件
+ - 所有目录的链接
+
+ 特别需要注意,测试机器上面的目录结构跟编译机器上面的目录机构必须
+ 完全一致。
+ 如果文件是软链接,需要替换成真正的目录文件(这是由make的当前工作
+ 目录变量CURDIR引起的)。
+
+b) 若gcov运行在编译设备上
+
+ 测试用例运行结束后,如下的文件需要从测试设备中拷贝到编译设备上:
+
+ 从sysfs中的gcov目录中:
+ - 所有的.gcda文件
+ - 所有的.gcno文件软链接
+
+ 这些文件可以拷贝到编译设备的任意目录下,gcov使用-o选项指定拷贝的
+ 目录。
+
+ 比如一个是示例的目录结构如下::
+
+ /tmp/linux: 内核源码目录
+ /tmp/out: 内核编译文件路径(make O=指定)
+ /tmp/coverage: 从测试机器上面拷贝的数据文件路径
+
+ [user@build] cd /tmp/out
+ [user@build] gcov -o /tmp/coverage/tmp/out/init main.c
+
+
+关于编译器的注意事项
+----------------------
+
+GCC和LLVM gcov工具不一定兼容。
+如果编译器是GCC,使用gcov_来处理.gcno和.gcda文件,如果是Clang编译器,
+则使用llvm-cov_。
+
+_gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html
+_llvm-cov: https://llvm.org/docs/CommandGuide/llvm-cov.html
+
+GCC和Clang gcov之间的版本差异由Kconfig处理的。
+kconfig会根据编译工具链的检查自动选择合适的gcov格式。
+
+问题定位
+---------------
+
+可能出现的问题1
+ 编译到链接阶段报错终止
+
+问题原因
+ 分析标志指定在了源文件但是没有链接到主内核,或者客制化了链接程序
+
+解决方法
+ 通过在相应的Makefile中使用``GCOV_PROFILE := n``
+ 或者``GCOV_PROFILE_basename.o := n``来将链接报错的文件排除掉
+
+可能出现的问题2
+ 从sysfs复制的文件显示为空或不完整
+
+问题原因
+ 由于seq_file的工作方式,某些工具(例如cp或tar)可能无法正确地从
+ sysfs复制文件。
+
+解决方法
+ 使用``cat``读取``.gcda``文件,使用``cp -d``复制链接,或者使用附录B中所示的
+ 机制。
+
+
+附录A::collect_on_build.sh
+------------------------------
+
+用于在编译设备上收集覆盖率编译中间数据文件的示例脚本.
+(如下6a示例)
+
+.. code-block:: sh
+
+ #!/bin/bash
+
+ KSRC=$1
+ KOBJ=$2
+ DEST=$3
+
+ if [ -z "$KSRC" ] || [ -z "$KOBJ" ] || [ -z "$DEST" ]; then
+ echo "Usage: $0 <ksrc directory> <kobj directory> <output.tar.gz>" >&2
+ exit 1
+ fi
+
+ KSRC=$(cd $KSRC; printf "all:\n\t@echo \${CURDIR}\n" | make -f -)
+ KOBJ=$(cd $KOBJ; printf "all:\n\t@echo \${CURDIR}\n" | make -f -)
+
+ find $KSRC $KOBJ \( -name '*.gcno' -o -name '*.[ch]' -o -type l \) -a \
+ -perm /u+r,g+r | tar cfz $DEST -P -T -
+
+ if [ $? -eq 0 ] ; then
+ echo "$DEST successfully created, copy to test system and unpack with:"
+ echo " tar xfz $DEST -P"
+ else
+ echo "Could not create file $DEST"
+ fi
+
+
+附录B::collect_on_test.sh
+-----------------------------
+
+用于在测试设备上收集覆盖里统计数据数据文件的示例脚本
+(如下6b示例)
+
+.. code-block:: sh
+
+ #!/bin/bash -e
+
+ DEST=$1
+ GCDA=/sys/kernel/debug/gcov
+
+ if [ -z "$DEST" ] ; then
+ echo "Usage: $0 <output.tar.gz>" >&2
+ exit 1
+ fi
+
+ TEMPDIR=$(mktemp -d)
+ echo Collecting data..
+ find $GCDA -type d -exec mkdir -p $TEMPDIR/\{\} \;
+ find $GCDA -name '*.gcda' -exec sh -c 'cat < $0 > '$TEMPDIR'/$0' {} \;
+ find $GCDA -name '*.gcno' -exec sh -c 'cp -d $0 '$TEMPDIR'/$0' {} \;
+ tar czf $DEST -C $TEMPDIR sys
+ rm -rf $TEMPDIR
+
+ echo "$DEST successfully created, copy to build system and unpack with:"
+ echo " tar xfz $DEST"
diff --git a/Documentation/translations/zh_CN/dev-tools/index.rst b/Documentation/translations/zh_CN/dev-tools/index.rst
new file mode 100644
index 000000000000..28b256ad4257
--- /dev/null
+++ b/Documentation/translations/zh_CN/dev-tools/index.rst
@@ -0,0 +1,35 @@
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/dev-tools/index.rst
+:Translator: 赵军奎 Bernard Zhao <bernard@vivo.com>
+
+================================
+内核开发工具
+================================
+
+本文档是有关内核开发工具文档的合集。
+目前这些文档已经整理在一起,不需要再花费额外的精力。
+欢迎任何补丁。
+
+.. class:: toc-title
+
+ 目录
+
+.. toctree::
+ :maxdepth: 2
+
+ gcov
+
+Todolist:
+
+ coccinelle
+ sparse
+ kcov
+ kasan
+ ubsan
+ kmemleak
+ kcsan
+ gdb-kernel-debugging
+ kgdb
+ kselftest
+ kunit/index
diff --git a/Documentation/translations/zh_CN/index.rst b/Documentation/translations/zh_CN/index.rst
index be6f11176200..a5e483b0c7f2 100644
--- a/Documentation/translations/zh_CN/index.rst
+++ b/Documentation/translations/zh_CN/index.rst
@@ -20,6 +20,7 @@
process/index
filesystems/index
arm64/index
+ dev-tools/index
目录和表格
----------
--
2.31.0
^ permalink raw reply related
* Re: [PATCH v3 1/4] dt-bindings: mfd: Add bindings for Ampere Altra SMPro drivers
From: Quan Nguyen @ 2021-04-14 8:28 UTC (permalink / raw)
To: Rob Herring
Cc: Joel Stanley, Andrew Jeffery, Jean Delvare, Guenter Roeck,
Lee Jones, Jonathan Corbet, linux-hwmon, devicetree, linux-kernel,
linux-doc, linux-aspeed, openbmc, Open Source Submission,
Phong Vo, Thang Q . Nguyen
In-Reply-To: <20210413134906.GA1538655@robh.at.kernel.org>
On 13/04/2021 20:49, Rob Herring wrote:
> On Fri, Apr 09, 2021 at 10:13:29AM +0700, Quan Nguyen wrote:
>> Adds device tree bindings for SMPro drivers found on the Mt.Jade hardware
>> reference platform with Ampere's Altra Processor family.
>>
>> Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
>> ---
>> .../bindings/hwmon/ampere,ac01-hwmon.yaml | 28 +++++
>> .../devicetree/bindings/mfd/ampere,smpro.yaml | 105 ++++++++++++++++++
>> 2 files changed, 133 insertions(+)
>> create mode 100644 Documentation/devicetree/bindings/hwmon/ampere,ac01-hwmon.yaml
>> create mode 100644 Documentation/devicetree/bindings/mfd/ampere,smpro.yaml
>>
>> diff --git a/Documentation/devicetree/bindings/hwmon/ampere,ac01-hwmon.yaml b/Documentation/devicetree/bindings/hwmon/ampere,ac01-hwmon.yaml
>> new file mode 100644
>> index 000000000000..fbf7ec754160
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/hwmon/ampere,ac01-hwmon.yaml
>> @@ -0,0 +1,28 @@
>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/hwmon/ampere,ac01-hwmon.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Hardware monitoring driver for the Ampere Altra SMPro
>> +
>> +maintainers:
>> + - Quan Nguyen <quan@os.amperecomputing.com>
>> +
>> +description: |
>> + This module is part of the Ampere Altra SMPro multi-function device. For more
>> + details see ../mfd/ampere,smpro.yaml.
>> +
>> +properties:
>> + compatible:
>> + enum:
>> + - ampere,ac01-hwmon
>> +
>> + reg:
>> + maxItems: 1
>> +
>> +required:
>> + - compatible
>> + - reg
>> +
>> +additionalProperties: false
>> diff --git a/Documentation/devicetree/bindings/mfd/ampere,smpro.yaml b/Documentation/devicetree/bindings/mfd/ampere,smpro.yaml
>> new file mode 100644
>> index 000000000000..5613c420869e
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/mfd/ampere,smpro.yaml
>> @@ -0,0 +1,105 @@
>> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/mfd/ampere,smpro.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Ampere Altra SMPro firmware driver
>> +
>> +maintainers:
>> + - Quan Nguyen <quan@os.amperecomputing.com>
>> +
>> +description: |
>> + Ampere Altra SMPro firmware may contain different blocks like hardware
>> + monitoring, error monitoring and other miscellaneous features.
>> +
>> +properties:
>> + compatible:
>> + enum:
>> + - ampere,smpro
>
> Again, not very specific. There's only 1 version of 'smpro' h/w or
> firmware? Are the firmware version and features discoverable? If not,
> you need to be more specific (or better yet, make them discoverable).
>
Hi Rob,
So far, there's nothing to guarantee this is the only version of SMPro
and neither firmware version nor features that are discoverable.
In fact, it was "ampere,ac01-smpro" specifically in my v1. But this is
the "ampere,smpro" in arch/arm/boot/dts/nuvoton-npcm730-kudo.dts, that
is why it got changed to "ampere,smpro" to avoid changes in that dts file.
I'm thinking about change it back to "ampere,ac01-smpro" in next version
to make this compatible string more specific.
>> +
>> + reg:
>> + description:
>> + I2C device address.
>> + maxItems: 1
>> +
>> + "#address-cells":
>> + const: 1
>> +
>> + "#size-cells":
>> + const: 0
>> +
>> +patternProperties:
>> + "^hwmon(@[0-9a-f]+)?$":
>> + $ref: ../hwmon/ampere,ac01-hwmon.yaml
>> +
>> + "^misc(@[0-9a-f]+)?$":
>
> You don't need these child nodes in DT if there are no resources
> associated with them. The parent driver can instantiate all the
> sub-functions.
>
From v3, there is a "reg" property introduced for the child driver,
especially for the misc driver. This is unavoidable because other
properties might be introduced in future for other misc features.
>> + type: object
>> + description: |
>> + This module is part of the Ampere Altra SMPro multi-function device
>> + to support miscellaneous features
>> + properties:
>> + compatible:
>> + enum:
>> + - ampere,ac01-misc
>> + reg:
>> + maxItems: 1
>> +
>> + required:
>> + - compatible
>> + - reg
>> +
>> + "^errmon(@[0-9a-f]+)?$":
>> + type: object
>> + description: |
>> + This module is part of the Ampere Altra SMPro multi-function device
>> + that supports error monitoring feature.
>> +
>> + properties:
>> + compatible:
>> + enum:
>> + - ampere,ac01-errmon
>> + reg:
>> + maxItems: 1
>> +
>> + required:
>> + - compatible
>> + - reg
>> +
>> +required:
>> + - "#address-cells"
>> + - "#size-cells"
>> + - compatible
>> + - reg
>> +
>> +additionalProperties: false
>> +
>> +examples:
>> + - |
>> + i2c {
>> + #address-cells = <1>;
>> + #size-cells = <0>;
>> +
>> + smpro@4f {
>> + compatible = "ampere,smpro";
>> + reg = <0x4f>;
>> + #address-cells = <1>;
>> + #size-cells = <0>;
>> +
>> + hwmon@10 {
>> + compatible = "ampere,ac01-hwmon";
>> + reg = <0x10>;
>> + };
>> +
>> + misc@b0 {
>> + compatible = "ampere,ac01-misc";
>> + reg = <0xb0>;
>> + };
>> +
>> + errmon@80 {
>> + compatible = "ampere,ac01-errmon";
>> + reg = <0x80>;
>> + };
>> +
>> + };
>> + };
>> --
>> 2.28.0
>>
^ permalink raw reply
* Re: [PATCH] Documentation: syscalls: add a note about ABI-agnostic types
From: Christian Brauner @ 2021-04-14 8:46 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Yury Norov, linux-api, linux-arch, linux-doc, linux-kernel,
Alexander A. Klimov, André Almeida, Andrew Morton,
Arnd Bergmann, David Sterba, Joe Perches, Jonathan Corbet,
Mike Rapoport, Aleksa Sarai
In-Reply-To: <20210414081422.5a9d0c4b@coco.lan>
On Wed, Apr 14, 2021 at 08:14:22AM +0200, Mauro Carvalho Chehab wrote:
> Em Tue, 13 Apr 2021 21:40:20 -0700
> Yury Norov <yury.norov@gmail.com> escreveu:
>
> > Ping?
> >
> > On Fri, Apr 09, 2021 at 01:43:04PM -0700, Yury Norov wrote:
> > > Recently added memfd_secret() syscall had a flags parameter passed
> > > as unsigned long, which requires creation of compat entry for it.
> > > It was possible to change the type of flags to unsigned int and so
> > > avoid bothering with compat layer.
> > >
> > > https://www.spinics.net/lists/linux-mm/msg251550.html
> > >
> > > Documentation/process/adding-syscalls.rst doesn't point clearly about
> > > preference of ABI-agnostic types. This patch adds such notification.
> > >
> > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > ---
> > > Documentation/process/adding-syscalls.rst | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
> > > index 9af35f4ec728..46add16edf14 100644
> > > --- a/Documentation/process/adding-syscalls.rst
> > > +++ b/Documentation/process/adding-syscalls.rst
> > > @@ -172,6 +172,13 @@ arguments (i.e. parameter 1, 3, 5), to allow use of contiguous pairs of 32-bit
> > > registers. (This concern does not apply if the arguments are part of a
> > > structure that's passed in by pointer.)
> > >
> > > +Whenever possible, try to use ABI-agnostic types for passing parameters to
> > > +a syscall in order to avoid creating compat entry for it. Linux supports two
> > > +ABI models - ILP32 and LP64.
>
> > > + The types like ``void *``, ``long``, ``size_t``,
> > > +``off_t`` have different size in those ABIs;
>
> In the case of pointers, the best is to use __u64. The pointer can then
> be read on Kernelspace with something like this:
>
> static inline void __user *media_get_uptr(__u64 arg)
> {
> return (void __user *)(uintptr_t)arg;
> }
>
>
> > > types like ``char`` and ``int``
> > > +have the same size and don't require a compat layer support. For flags, it's
> > > +always better to use ``unsigned int``.
> > > +
>
> I don't think this is true for all compilers on userspace, as the C
> standard doesn't define how many bits an int/unsigned int has.
> So, even if this is today's reality, things may change in the future.
>
> For instance, I remember we had to replace "int" and "enum" by "__u32"
> and "long" by "__u64" at the media uAPI in the past, when we start
> seeing x86_64 Kernels with 32-bits userspace and when cameras started
> being supported on arm32.
>
> We did have some real bugs with "enum", as, on that time, some
> compilers (gcc, I guess) were optimizing them to have less than
> 32 bits on certain architectures, when it fits.
Fwiw, Aleksa and I have written extended syscall documentation
documenting the agreement that we came to in a dedicated session with a
wide range of kernel folks during Linux Plumbers last year. We simply
never had time to actually send this series but fwiw here it is. It also
mentions the use of correct types. If people feel it's worth it I can
send as a proper series:
From 9035258aaa23c5e1bb5bc2242f97221a3e5b9a87 Mon Sep 17 00:00:00 2001
From: Christian Brauner <christian.brauner@ubuntu.com>
Date: Fri, 4 Sep 2020 14:27:47 +0200
Subject: [PATCH 1/6] docs: split extensibility section into two subsections
The section already explains two different formats that are available to
extend a syscall. Move each into its own subsection. This clarifies the
structure and will be useful when we extend each section in follow-up
patches.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Co-developed-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
Documentation/process/adding-syscalls.rst | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
index 906c47f1a9e5..3853ce57e757 100644
--- a/Documentation/process/adding-syscalls.rst
+++ b/Documentation/process/adding-syscalls.rst
@@ -65,6 +65,9 @@ together with the corresponding follow-up system calls --
``pipe``/``pipe2``, ``renameat``/``renameat2`` -- so
learn from the history of the kernel and plan for extensions from the start.)
+Baseline extensibility: adding a flag argument
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
For simpler system calls that only take a couple of arguments, the preferred
way to allow for future extensibility is to include a flags argument to the
system call. To make sure that userspace programs can safely use flags
@@ -76,6 +79,9 @@ flags, and reject the system call (with ``EINVAL``) if it does::
(If no flags values are used yet, check that the flags argument is zero.)
+Advanced extensibility: extensible structs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
For more sophisticated system calls that involve a larger number of arguments,
it's preferred to encapsulate the majority of the arguments into a structure
that is passed in by pointer. Such a structure can cope with future extension
base-commit: e49d033bddf5b565044e2abe4241353959bc9120
--
2.27.0
From de1b27a0d29ac8edfb19226e3ad0547831f12a97 Mon Sep 17 00:00:00 2001
From: Christian Brauner <christian.brauner@ubuntu.com>
Date: Fri, 4 Sep 2020 14:37:53 +0200
Subject: [PATCH 2/6] docs: require unsigned int as type for flag arguments
This has been an undocumented requirement for quite a long time but people
still get confused and add system calls that use e.g. unsigned long as flag
argument. Document the unsigned int requirement for flag arguments.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Co-developed-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
Documentation/process/adding-syscalls.rst | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
index 3853ce57e757..c08c3a6e7979 100644
--- a/Documentation/process/adding-syscalls.rst
+++ b/Documentation/process/adding-syscalls.rst
@@ -70,9 +70,16 @@ Baseline extensibility: adding a flag argument
For simpler system calls that only take a couple of arguments, the preferred
way to allow for future extensibility is to include a flags argument to the
-system call. To make sure that userspace programs can safely use flags
-between kernel versions, check whether the flags value holds any unknown
-flags, and reject the system call (with ``EINVAL``) if it does::
+system call. As such, flag arguments function as a baseline for extensibility.
+
+Different types such as ``int`` or ``unsigned long`` have been used for flag
+arguments. Since this is not just inconsistent but can also lead to issues
+with sign- and zero extension all new system calls are expected to use
+``unsigned int`` as type for flag arguments.
+
+To make sure that userspace programs can safely use flags between kernel
+versions, check whether the flags value holds any unknown flags, and reject the
+system call (with ``EINVAL``) if it does::
if (flags & ~(THING_FLAG1 | THING_FLAG2 | THING_FLAG3))
return -EINVAL;
--
2.27.0
From 08bc20493242d93c187b8cdcf9c2353e385a2565 Mon Sep 17 00:00:00 2001
From: Christian Brauner <christian.brauner@ubuntu.com>
Date: Fri, 4 Sep 2020 14:52:56 +0200
Subject: [PATCH 3/6] docs: clarify flag value checking a little
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Co-developed-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
Documentation/process/adding-syscalls.rst | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
index c08c3a6e7979..faea347c0ecb 100644
--- a/Documentation/process/adding-syscalls.rst
+++ b/Documentation/process/adding-syscalls.rst
@@ -77,14 +77,18 @@ arguments. Since this is not just inconsistent but can also lead to issues
with sign- and zero extension all new system calls are expected to use
``unsigned int`` as type for flag arguments.
-To make sure that userspace programs can safely use flags between kernel
-versions, check whether the flags value holds any unknown flags, and reject the
-system call (with ``EINVAL``) if it does::
+A system call doesn't need to support any flags right away to justify adding
+a flag argument. If no flags are supported yet, the new system call needs
+to check that the flag argument is zero and to return ``EINVAL`` if it is not::
- if (flags & ~(THING_FLAG1 | THING_FLAG2 | THING_FLAG3))
+ if (flags)
return -EINVAL;
-(If no flags values are used yet, check that the flags argument is zero.)
+Similarly, if flags are supported the system call needs to check that no
+unknown flag values are present and return ``EINVAL`` if there are::
+
+ if (flags & ~(THING_FLAG1 | THING_FLAG2 | THING_FLAG3))
+ return -EINVAL;
Advanced extensibility: extensible structs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--
2.27.0
From 35745a05b7d7350f2039b7d868e52f6c306f7c8e Mon Sep 17 00:00:00 2001
From: Christian Brauner <christian.brauner@ubuntu.com>
Date: Fri, 4 Sep 2020 15:39:38 +0200
Subject: [PATCH 4/6] docs: extend section about extensible structs
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Co-developed-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
Documentation/process/adding-syscalls.rst | 88 +++++++++++++++++------
1 file changed, 68 insertions(+), 20 deletions(-)
diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
index faea347c0ecb..875e32bbabac 100644
--- a/Documentation/process/adding-syscalls.rst
+++ b/Documentation/process/adding-syscalls.rst
@@ -65,6 +65,7 @@ together with the corresponding follow-up system calls --
``pipe``/``pipe2``, ``renameat``/``renameat2`` -- so
learn from the history of the kernel and plan for extensions from the start.)
+
Baseline extensibility: adding a flag argument
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -90,34 +91,81 @@ unknown flag values are present and return ``EINVAL`` if there are::
if (flags & ~(THING_FLAG1 | THING_FLAG2 | THING_FLAG3))
return -EINVAL;
+
Advanced extensibility: extensible structs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For more sophisticated system calls that involve a larger number of arguments,
-it's preferred to encapsulate the majority of the arguments into a structure
-that is passed in by pointer. Such a structure can cope with future extension
-by including a size argument in the structure::
-
- struct xyzzy_params {
- u32 size; /* userspace sets p->size = sizeof(struct xyzzy_params) */
- u32 param_1;
- u64 param_2;
- u64 param_3;
+it's preferred to encapsulate the majority of the arguments into an extensible
+structure that is passed in by pointer.
+
+Extensible structs are versioned by their size. For any new non-flag based
+extension a new field has to be added to the end of the extensible struct. The
+zero value of the new field must not have any meaning so the system call can
+continue to display the old behavior.
+
+Extensible struct system calls can and should use the dedicated
+``copy_struct_from_user`` API which enforces the following behavior:
+
+ - The kernel will reject any size that is smaller than the initially supported
+ size of the extensible struct.
+ - If the extensible struct size the kernel knows about is equal to the size
+ passed in from userspace then ``copy_struct_from_user`` will copy the struct
+ verbatim.
+ - If the extensible struct size the kernel knows about is larger than the size
+ passed in from userspace the kernel will copy the size userspace indicated
+ and treat all additional extensions it knows about as zero.
+ - If the extensible struct size the kernel knows about is smaller than the
+ size passed in from userspace the kernel will copy the number of bytes it
+ knows about and verify that all trailing bytes are zero. If non-zero bytes
+ are present the kernel returns ``E2BIG``. While not an intuitive error code
+ at first, ``E2BIG`` means that the argument list is too long.
+
+Early examples for extensible struct system calls include
+:manpage:`perf_event_open(2)` and :manpage:`sched_setattr(2)`. These system
+calls implement mostly similar behavior even before the introduction of
+``copy_struct_from_user`` but have since been switched over to it. Newer
+examples include :manpage:`clone3(2)` and :manpage:`openat2(2)`.
+
+The size associated with an extensible struct can either be passed as
+a separate argument in the system call as is e.g. done for :manpage:`clone3(2)`
+and :manpage:`openat2(2)`. Alternatively, the size can be passed as the first
+field in the extensible struct as is e.g. done for :manpage:`sched_setattr(2)`.
+
+Any struct passed from userspace to the kernel and especially extensible
+structs must ensure that they are correctly padded. This ensures that no data
+can be leaked on accident or on purpose by an attacker from the kernel. The
+easiest way to ensure that a struct is correctly padded is to always use 64 bit
+fields::
+
+ struct sys_foo_args {
+ __aligned_u64 arg1;
+ __aligned_u64 arg2;
+ __aligned_u64 arg3;
+ __aligned_u64 arg4;
+ __aligned_u64 arg5;
};
-As long as any subsequently added field, say ``param_4``, is designed so that a
-zero value gives the previous behaviour, then this allows both directions of
-version mismatch:
+System calls that need to worry about the size of their extensible structs or
+need fields to be of a specific size can rely on careful manual struct
+packing::
+
+ struct sys_foo_args {
+ __u32 arg1;
+ __u16 arg2;
+ __u16 arg3;
+ __u32 arg4;
+ __u32 arg5;
+ __u64 arg6;
+ };
- - To cope with a later userspace program calling an older kernel, the kernel
- code should check that any memory beyond the size of the structure that it
- expects is zero (effectively checking that ``param_4 == 0``).
- - To cope with an older userspace program calling a newer kernel, the kernel
- code can zero-extend a smaller instance of the structure (effectively
- setting ``param_4 = 0``).
+(There are tools such as ``pahole`` available that allow to check whether
+a struct is correctly padded!)
-See :manpage:`perf_event_open(2)` and the ``perf_copy_attr()`` function (in
-``kernel/events/core.c``) for an example of this approach.
+Note that in contrast to flag arguments passed as register arguments flag
+arguments in extensible structures can be 64 bit wide. As with simple
+flag-only system calls, the system call needs to verify any unknown values for
+flag-like fields in the passed struct are zeroed.
Designing the API: Other Considerations
--
2.27.0
From 92d48ab239e28f56d1d685ca1f1618aef08e7fe6 Mon Sep 17 00:00:00 2001
From: Christian Brauner <christian.brauner@ubuntu.com>
Date: Fri, 4 Sep 2020 15:57:36 +0200
Subject: [PATCH 5/6] docs: document how to name revised syscalls
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Co-developed-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
Documentation/process/adding-syscalls.rst | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
index 875e32bbabac..663d1b768121 100644
--- a/Documentation/process/adding-syscalls.rst
+++ b/Documentation/process/adding-syscalls.rst
@@ -168,6 +168,28 @@ flag-only system calls, the system call needs to verify any unknown values for
flag-like fields in the passed struct are zeroed.
+Designing the API: Revisions of syscalls
+-----------------------------------------------
+
+System calls that were not designed to be extensible or system calls that use
+a flag argument for extensions running out of bits (e.g. :manpage:`clone(2)`)
+sometimes need to be replaced.
+
+If the revised system call provides a superset (or a reasonably large subset,
+such as when a feature that turned out to be a design mistake is dropped) of
+the features of the old system call, it is common practice to give it the same
+name with a number appended. Examples for this include ``dup2``/``dup3``,
+``epoll_create``/``epoll_create1`` and others.
+
+For some syscalls the appended number indicates the number of arguments
+(``accept``/``accept4``) for others the number of the revision
+(``clone``/``clone3``, ``epoll_create``/``epoll_create1``). New system calls
+that are a revision of an earlier system call should treat the appended number
+as the number of the revision. For example, if you were to add a revised
+version of ``readlinkat`` with an additional flag argument it should be named
+``readlinkat2``.
+
+
Designing the API: Other Considerations
---------------------------------------
--
2.27.0
From 0f61fcb2949170532283048e8ac0d8e3c16305db Mon Sep 17 00:00:00 2001
From: Christian Brauner <christian.brauner@ubuntu.com>
Date: Fri, 4 Sep 2020 17:06:05 +0200
Subject: [PATCH 6/6] docs: add section about multiplexers
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Co-developed-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
Documentation/process/adding-syscalls.rst | 39 +++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
index 663d1b768121..ef1fa5fb223d 100644
--- a/Documentation/process/adding-syscalls.rst
+++ b/Documentation/process/adding-syscalls.rst
@@ -51,6 +51,45 @@ interface.
getting/setting a simple flag related to a process.
+Designing the API: No more multiplexers
+---------------------------------------
+
+New system calls are not allowed to be multiplexers. The kernel strongly
+encourages simple system call designs.
+
+The kernel has historically supported multiplexing system calls but both the
+kernel and userspace have learned the hard way that this is usually
+a significant design mistake. The list of actively supported multiplexing
+system calls ranges from domain specific multiplexers such as
+:manpage:`seccomp(2)` or :manpage:`keyctl(2)` to more generic multiplexers such
+as :manpage:`prctl(2)` and :manpage:`ptrace(2)`, up to ``ioctl``, a generic
+multiplexer not bound to any specific API or functionality. In essence,
+a multiplexing system call often implements the functionality of multiple
+simple system calls.
+
+Multiplexers are especially problematic when they are type-polymorph. This
+includes all multiplexers that pass different types through a void pointer,
+a ``long``, or a union using a specific argument to differentiate the type or
+the member of the union to look at. The ``ioctl`` multiplexer is a prime
+example of a type-polymorph multiplexer. Depending on the second argument
+passed to ``ioctl`` different types can be passed as the third argument. Some
+multiplexers, including ``ioctl``, :manpage:`prctl(2)`, or :manpage:`fcntl(2)`
+even pass a different number of arguments depending on the command that is
+executed.
+
+Multiplexers also pose significant problems for userspace libraries:
+
+ - Type safety is lost for the most part.
+ - There are no real advantages in terms of useability. In fact, userspace
+ libraries do consider exposing the commands implemented by a system call as
+ separate library calls.
+ - Multiplexers provide problems for 32 bit systems on 64 bit kernels. They
+ can for example cause breakage with ILP32 (i.e. I-ntegers, L-ongs, and
+ P-ointers are 32 bit wide) targets when types are not promoted correctly for
+ use with the kernel/userspace ABI. Getting this wrong is easier than
+ getting it right.
+
+
Designing the API: Planning for Extension
-----------------------------------------
--
2.27.0
^ permalink raw reply related
* Re: [PATCH v3] powerpc/papr_scm: Implement support for H_SCM_FLUSH hcall
From: Vaibhav Jain @ 2021-04-14 9:21 UTC (permalink / raw)
To: Aneesh Kumar K.V, Shivaprasad G Bhat, sbhat, linuxppc-dev,
kvm-ppc, linux-nvdimm, ellerman
Cc: linux-doc
In-Reply-To: <877dl530id.fsf@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> Vaibhav Jain <vaibhav@linux.ibm.com> writes:
>
>> Hi Shiva,
>>
>> Apologies for a late review but something just caught my eye while
>> working on a different patch.
>>
>> Shivaprasad G Bhat <sbhat@linux.ibm.com> writes:
>>
>>> Add support for ND_REGION_ASYNC capability if the device tree
>>> indicates 'ibm,hcall-flush-required' property in the NVDIMM node.
>>> Flush is done by issuing H_SCM_FLUSH hcall to the hypervisor.
>>>
>>> If the flush request failed, the hypervisor is expected to
>>> to reflect the problem in the subsequent nvdimm H_SCM_HEALTH call.
>>>
>>> This patch prevents mmap of namespaces with MAP_SYNC flag if the
>>> nvdimm requires an explicit flush[1].
>>>
>>> References:
>>> [1] https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/ndctl.py.data/map_sync.c
>>>
>>> Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
>>> ---
>>> v2 - https://www.spinics.net/lists/kvm-ppc/msg18799.html
>>> Changes from v2:
>>> - Fixed the commit message.
>>> - Add dev_dbg before the H_SCM_FLUSH hcall
>>>
>>> v1 - https://www.spinics.net/lists/kvm-ppc/msg18272.html
>>> Changes from v1:
>>> - Hcall semantics finalized, all changes are to accomodate them.
>>>
>>> Documentation/powerpc/papr_hcalls.rst | 14 ++++++++++
>>> arch/powerpc/include/asm/hvcall.h | 3 +-
>>> arch/powerpc/platforms/pseries/papr_scm.c | 40 +++++++++++++++++++++++++++++
>>> 3 files changed, 56 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
>>> index 48fcf1255a33..648f278eea8f 100644
>>> --- a/Documentation/powerpc/papr_hcalls.rst
>>> +++ b/Documentation/powerpc/papr_hcalls.rst
>>> @@ -275,6 +275,20 @@ Health Bitmap Flags:
>>> Given a DRC Index collect the performance statistics for NVDIMM and copy them
>>> to the resultBuffer.
>>>
>>> +**H_SCM_FLUSH**
>>> +
>>> +| Input: *drcIndex, continue-token*
>>> +| Out: *continue-token*
>>> +| Return Value: *H_SUCCESS, H_Parameter, H_P2, H_BUSY*
>>> +
>>> +Given a DRC Index Flush the data to backend NVDIMM device.
>>> +
>>> +The hcall returns H_BUSY when the flush takes longer time and the hcall needs
>>> +to be issued multiple times in order to be completely serviced. The
>>> +*continue-token* from the output to be passed in the argument list of
>>> +subsequent hcalls to the hypervisor until the hcall is completely serviced
>>> +at which point H_SUCCESS or other error is returned by the hypervisor.
>>> +
>> Does the hcall semantic also include measures to trigger a barrier/fence
>> (like pm_wmb()) so that all the stores before the hcall are gauranteed
>> to have hit the pmem controller ?
>
> It is upto the hypervisor to implement the right sequence to ensure the
> guarantee. The hcall doesn't expect the store to reach the platform
> buffers.
Since the asyc_flush function is also called for performing
deep_flush from libnvdimm and as the hcall doesnt gaurantee stores to
reach the platform buffers, hence papr_scm_pmem_flush() should issue
pm_wmb() before the hcall.
This would ensure papr_scm_pmem_flush() semantics are consistent that to
generic_nvdimm_flush().
Also, adding the statement "The hcall doesn't expect the store to reach
the platform buffers" to the hcall documentation would be good to have.
>
>
>>
>> If not then the papr_scm_pmem_flush() introduced below should issue a
>> fence like pm_wmb() before issuing the flush hcall.
>>
>> If yes then this behaviour should also be documented with the hcall
>> semantics above.
>>
>
> IIUC fdatasync on the backend file is enough to ensure the
> guaraantee. Such a request will have REQ_PREFLUSH and REQ_FUA which will
> do the necessary barrier on the backing device holding the backend file.
>
Right, but thats assuming nvdimm is backed by a regular file in
hypervisor which may not always be the case.
> -aneesh
--
Cheers
~ Vaibhav
^ permalink raw reply
* Re: [PATCH v2] Documentation: dev-tools: Add Testing Overview
From: Marco Elver @ 2021-04-14 9:39 UTC (permalink / raw)
To: David Gow
Cc: Jonathan Corbet, Shuah Khan, Andrew Morton, Dmitry Vyukov,
Brendan Higgins, Daniel Latypov, open list:DOCUMENTATION,
KUnit Development, open list:KERNEL SELFTEST FRAMEWORK, LKML
In-Reply-To: <20210414081428.337494-1-davidgow@google.com>
On Wed, 14 Apr 2021 at 10:15, David Gow <davidgow@google.com> wrote:
>
> The kernel now has a number of testing and debugging tools, and we've
> seen a bit of confusion about what the differences between them are.
>
> Add a basic documentation outlining the testing tools, when to use each,
> and how they interact.
>
> This is a pretty quick overview rather than the idealised "kernel
> testing guide" that'd probably be optimal, but given the number of times
> questions like "When do you use KUnit and when do you use Kselftest?"
> are being asked, it seemed worth at least having something. Hopefully
> this can form the basis for more detailed documentation later.
>
> Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Looks good to me, thanks. I think one can write whole articles (or
even books) about this topic, so it's easy to forget this is a quick
overview, and keep on adding.
> ---
> Thanks, everyone, for the comments on the doc. I've made a few of the
> suggested changes. Please let me know what you think!
>
> -- David
>
> Changes since v1:
> https://lore.kernel.org/linux-kselftest/20210410070529.4113432-1-davidgow@google.com/
> - Note KUnit's speed and that one should provide selftests for syscalls
> - Mention lockdep as a Dynamic Analysis Tool
> - Refer to "Dynamic Analysis Tools" instead of "Sanitizers"
> - A number of minor formatting tweaks and rewordings for clarity.
>
> Not changed:
> - I haven't included an exhaustive list of differences, advantages, etc,
> between KUnit and kselftest: for now, the doc continues to focus on
> the difference between 'in-kernel' and 'userspace' testing here.
> - Similarly, I'm not linking out to docs defining and describing "Unit"
> tests versus "End-to-end" tests. None of the existing documentation
> elsewhere quite matches what we do in the kernel perfectly, so it
> seems less confusing to focus on the 'in-kernel'/'userspace'
> distinction, and leave other definitions as a passing mention for
> those who are already familiar with the concepts.
> - I haven't linked to any talk videos here: a few of them are linked on
> (e.g.) the KUnit webpage, but I wanted to keep the Kernel documentation
> more self-contained for now. No objection to adding them in a follow-up
> patch if people feel strongly about it, though.
> - The link from index.rst to this doc is unchanged. I personally think
> that the link is prominent enough there: it's the first link, and
> shows up a few times. One possibility if people disagreed would be to
> merge this page with the index, but given not all dev-tools are going
> to be testing-related, it seemed a bit arrogant. :-)
>
> Documentation/dev-tools/index.rst | 3 +
> Documentation/dev-tools/testing-overview.rst | 117 +++++++++++++++++++
> 2 files changed, 120 insertions(+)
> create mode 100644 Documentation/dev-tools/testing-overview.rst
>
> diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
> index 1b1cf4f5c9d9..f590e5860794 100644
> --- a/Documentation/dev-tools/index.rst
> +++ b/Documentation/dev-tools/index.rst
> @@ -7,6 +7,8 @@ be used to work on the kernel. For now, the documents have been pulled
> together without any significant effort to integrate them into a coherent
> whole; patches welcome!
>
> +A brief overview of testing-specific tools can be found in :doc:`testing-overview`.
> +
> .. class:: toc-title
>
> Table of contents
> @@ -14,6 +16,7 @@ whole; patches welcome!
> .. toctree::
> :maxdepth: 2
>
> + testing-overview
> coccinelle
> sparse
> kcov
> diff --git a/Documentation/dev-tools/testing-overview.rst b/Documentation/dev-tools/testing-overview.rst
> new file mode 100644
> index 000000000000..ce36a8cdf6b5
> --- /dev/null
> +++ b/Documentation/dev-tools/testing-overview.rst
> @@ -0,0 +1,117 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +====================
> +Kernel Testing Guide
> +====================
> +
> +
> +There are a number of different tools for testing the Linux kernel, so knowing
> +when to use each of them can be a challenge. This document provides a rough
> +overview of their differences, and how they fit together.
> +
> +
> +Writing and Running Tests
> +=========================
> +
> +The bulk of kernel tests are written using either the kselftest or KUnit
> +frameworks. These both provide infrastructure to help make running tests and
> +groups of tests easier, as well as providing helpers to aid in writing new
> +tests.
> +
> +If you're looking to verify the behaviour of the Kernel — particularly specific
> +parts of the kernel — then you'll want to use KUnit or kselftest.
> +
> +
> +The Difference Between KUnit and kselftest
> +------------------------------------------
> +
> +KUnit (Documentation/dev-tools/kunit/index.rst) is an entirely in-kernel system
> +for "white box" testing: because test code is part of the kernel, it can access
> +internal structures and functions which aren't exposed to userspace.
> +
> +KUnit tests therefore are best written against small, self-contained parts
> +of the kernel, which can be tested in isolation. This aligns well with the
> +concept of 'unit' testing.
> +
> +For example, a KUnit test might test an individual kernel function (or even a
> +single codepath through a function, such as an error handling case), rather
> +than a feature as a whole.
> +
> +This also makes KUnit tests very fast to build and run, allowing them to be
> +run frequently as part of the development process.
> +
> +There is a KUnit test style guide which may give further pointers in
> +Documentation/dev-tools/kunit/style.rst
> +
> +
> +kselftest (Documentation/dev-tools/kselftest.rst), on the other hand, is
> +largely implemented in userspace, and tests are normal userspace scripts or
> +programs.
> +
> +This makes it easier to write more complicated tests, or tests which need to
> +manipulate the overall system state more (e.g., spawning processes, etc.).
> +However, it's not possible to call kernel functions directly from kselftest.
> +This means that only kernel functionality which is exposed to userspace somhow
> +(e.g. by a syscall, device, filesystem, etc.) can be tested with kselftest. To
> +work around this, some tests include a companion kernel module which exposes
> +more information or functionality. If a test runs mostly or entirely within the
> +kernel, however, KUnit may be the more appropriate tool.
> +
> +kselftest is therefore suited well to tests of whole features, as these will
> +expose an interface to userspace, which can be tested, but not implementation
> +details. This aligns well with 'system' or 'end-to-end' testing.
> +
> +For example, all new system calls should be accompanied by kselftest tests.
> +
> +Code Coverage Tools
> +===================
> +
> +The Linux Kernel supports two different code coverage measurement tools. These
> +can be used to verify that a test is executing particular functions or lines
> +of code. This is useful for determining how much of the kernel is being tested,
> +and for finding corner-cases which are not covered by the appropriate test.
> +
> +:doc:`gcov` is GCC's coverage testing tool, which can be used with the kernel
> +to get global or per-module coverage. Unlike KCOV, it does not record per-task
> +coverage. Coverage data can be read from debugfs, and interpreted using the
> +usual gcov tooling.
> +
> +:doc:`kcov` is a feature which can be built in to the kernel to allow
> +capturing coverage on a per-task level. It's therefore useful for fuzzing and
> +other situations where information about code executed during, for example, a
> +single syscall is useful.
> +
> +
> +Dynamic Analysis Tools
> +======================
> +
> +The kernel also supports a number of dynamic analysis tools, which attempt to
> +detect classes of issues when the occur in a running kernel. These typically
> +look for undefined behaviour of some kind, such as invalid memory accesses,
> +concurrency issues such as data races, or other undefined behaviour like
> +integer overflows.
> +
> +Some of these tools are listed below:
> +
> +* kmemleak detects possible memory leaks. See
> + Documentation/dev-tools/kmemleak.rst
> +* KASAN detects invalid memory accesses such as out-of-bounds and
> + use-after-free errors. See Documentation/dev-tools/kasan.rst
> +* UBSAN detects behaviour that is undefined by the C standard, like integer
> + overflows. See Documentation/dev-tools/ubsan.rst
> +* KCSAN detects data races. See Documentation/dev-tools/kcsan.rst
> +* KFENCE is a low-overhead detector of memory issues, which is much faster than
> + KASAN and can be used in production. See Documentation/dev-tools/kfence.rst
> +* lockdep is a locking correctness validator. See
> + Documentation/locking/lockdep-design.rst
> +* There are several other pieces of debug instrumentation in the kernel, many
> + of which can be found in lib/Kconfig.debug
> +
> +These tools tend to test the kernel as a whole, and do not "pass" like
> +kselftest or KUnit tests. They can be combined with KUnit or kselftest by
> +running tests on a kernel with a sanitizer enabled: you can then be sure
> +that none of these errors are occurring during the test.
> +
> +Some of these tools integrate with KUnit or kselftest and will
> +automatically fail tests if an issue is detected.
> +
> --
> 2.31.1.295.g9ea45b61b8-goog
>
^ permalink raw reply
* Re: [PATCH] Documentation: syscalls: add a note about ABI-agnostic types
From: Mike Rapoport @ 2021-04-14 9:46 UTC (permalink / raw)
To: Christian Brauner
Cc: Mauro Carvalho Chehab, Yury Norov, linux-api, linux-arch,
linux-doc, linux-kernel, Alexander A. Klimov, André Almeida,
Andrew Morton, Arnd Bergmann, David Sterba, Joe Perches,
Jonathan Corbet, Aleksa Sarai
In-Reply-To: <20210414084605.pdlnjkwa3h47jxno@wittgenstein>
On Wed, Apr 14, 2021 at 10:46:05AM +0200, Christian Brauner wrote:
> On Wed, Apr 14, 2021 at 08:14:22AM +0200, Mauro Carvalho Chehab wrote:
> > Em Tue, 13 Apr 2021 21:40:20 -0700
> > Yury Norov <yury.norov@gmail.com> escreveu:
> >
> > > Ping?
> > >
> > > On Fri, Apr 09, 2021 at 01:43:04PM -0700, Yury Norov wrote:
> > > > Recently added memfd_secret() syscall had a flags parameter passed
> > > > as unsigned long, which requires creation of compat entry for it.
> > > > It was possible to change the type of flags to unsigned int and so
> > > > avoid bothering with compat layer.
> > > >
> > > > https://www.spinics.net/lists/linux-mm/msg251550.html
> > > >
> > > > Documentation/process/adding-syscalls.rst doesn't point clearly about
> > > > preference of ABI-agnostic types. This patch adds such notification.
> > > >
> > > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > > ---
> > > > Documentation/process/adding-syscalls.rst | 7 +++++++
> > > > 1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/Documentation/process/adding-syscalls.rst b/Documentation/process/adding-syscalls.rst
> > > > index 9af35f4ec728..46add16edf14 100644
> > > > --- a/Documentation/process/adding-syscalls.rst
> > > > +++ b/Documentation/process/adding-syscalls.rst
> > > > @@ -172,6 +172,13 @@ arguments (i.e. parameter 1, 3, 5), to allow use of contiguous pairs of 32-bit
> > > > registers. (This concern does not apply if the arguments are part of a
> > > > structure that's passed in by pointer.)
> > > >
> > > > +Whenever possible, try to use ABI-agnostic types for passing parameters to
> > > > +a syscall in order to avoid creating compat entry for it. Linux supports two
> > > > +ABI models - ILP32 and LP64.
> >
> > > > + The types like ``void *``, ``long``, ``size_t``,
> > > > +``off_t`` have different size in those ABIs;
> >
> > In the case of pointers, the best is to use __u64. The pointer can then
> > be read on Kernelspace with something like this:
> >
> > static inline void __user *media_get_uptr(__u64 arg)
> > {
> > return (void __user *)(uintptr_t)arg;
> > }
> >
> >
> > > > types like ``char`` and ``int``
> > > > +have the same size and don't require a compat layer support. For flags, it's
> > > > +always better to use ``unsigned int``.
> > > > +
> >
> > I don't think this is true for all compilers on userspace, as the C
> > standard doesn't define how many bits an int/unsigned int has.
> > So, even if this is today's reality, things may change in the future.
> >
> > For instance, I remember we had to replace "int" and "enum" by "__u32"
> > and "long" by "__u64" at the media uAPI in the past, when we start
> > seeing x86_64 Kernels with 32-bits userspace and when cameras started
> > being supported on arm32.
> >
> > We did have some real bugs with "enum", as, on that time, some
> > compilers (gcc, I guess) were optimizing them to have less than
> > 32 bits on certain architectures, when it fits.
>
> Fwiw, Aleksa and I have written extended syscall documentation
> documenting the agreement that we came to in a dedicated session with a
> wide range of kernel folks during Linux Plumbers last year. We simply
> never had time to actually send this series but fwiw here it is. It also
> mentions the use of correct types. If people feel it's worth it I can
> send as a proper series:
Yes, please.
> From 9035258aaa23c5e1bb5bc2242f97221a3e5b9a87 Mon Sep 17 00:00:00 2001
> From: Christian Brauner <christian.brauner@ubuntu.com>
> Date: Fri, 4 Sep 2020 14:27:47 +0200
> Subject: [PATCH 1/6] docs: split extensibility section into two subsections
...
--
Sincerely yours,
Mike.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox