* Re: [PATCH 3/6] ravb: remove custom .set_link_ksettings from ethtool ops
From: Sergei Shtylyov @ 2018-06-06 20:34 UTC (permalink / raw)
To: Vladimir Zapolskiy, David S. Miller; +Cc: netdev, linux-renesas-soc
In-Reply-To: <75b66b77-0165-7258-ccfe-cf3ff08930b2@mentor.com>
On 06/04/2018 02:07 PM, Vladimir Zapolskiy wrote:
>>>>> The change replaces a custom implementation of .set_link_ksettings
>>>>> callback with a shared phy_ethtool_set_link_ksettings(), this fixes
>>>>> sleep in atomic context bug, which is encountered every time when link
>>>>> settings are changed by ethtool.
>>>>
>>>> Seeing it now...
>>
>> And to say that this is *fixed* by removing the custom method is err...
>> simply misleading. The sleep in atomic context is fixed solely by the removal
>> of the spinlock grabbing before the phylib call.
> As I've already said, "the removal of the spinlock grabbing before the phylib
> call" is not a valid fix, but it will introduce another regression.
It's the necessary part of the fix, unlike using the generic phylib method.
>>>>> Now duplex mode setting is enforced in ravb_adjust_link() only, also
>>>>> now TX/RX is disabled when link is put down or modifications to E-MAC
>>>>> registers ECMR and GECMR are expected for both cases of checked and
>>>>> ignored link status pin state from E-MAC interrupt handler.
>>>>>
>>>>> Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
>>>>> ---
>>>>> drivers/net/ethernet/renesas/ravb_main.c | 58 +++++++++-----------------------
>>>>> 1 file changed, 15 insertions(+), 43 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
>>>>> index 3d91caa44176..0d811c02ff34 100644
>>>>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>>>>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
>>>>> @@ -980,6 +980,13 @@ static void ravb_adjust_link(struct net_device *ndev)
>>>>> struct ravb_private *priv = netdev_priv(ndev);
>>>>> struct phy_device *phydev = ndev->phydev;
>>>>> bool new_state = false;
>>>>> + unsigned long flags;
>>>>> +
>>>>> + spin_lock_irqsave(&priv->lock, flags);
>>>>> +
>>>>> + /* Disable TX and RX right over here, if E-MAC change is ignored */
>>>>> + if (priv->no_avb_link)
>>>>> + ravb_rcv_snd_disable(ndev);
>>>>>
>>>>> if (phydev->link) {
>>>>> if (phydev->duplex != priv->duplex) {
>>>>> @@ -997,18 +1004,21 @@ static void ravb_adjust_link(struct net_device *ndev)
>>>>> ravb_modify(ndev, ECMR, ECMR_TXF, 0);
>>>>> new_state = true;
>>>>> priv->link = phydev->link;
>>>>> - if (priv->no_avb_link)
>>>>> - ravb_rcv_snd_enable(ndev);
>>>>> }
>>>>> } else if (priv->link) {
>>>>> new_state = true;
>>>>> priv->link = 0;
>>>>> priv->speed = 0;
>>>>> priv->duplex = -1;
>>>>> - if (priv->no_avb_link)
>>>>> - ravb_rcv_snd_disable(ndev);
>>>>> }
>>>>>
>>>>> + /* Enable TX and RX right over here, if E-MAC change is ignored */
>>>>> + if (priv->no_avb_link && phydev->link)
>>>>> + ravb_rcv_snd_enable(ndev);
>>>>> +
>>>>> + mmiowb();
>>>>> + spin_unlock_irqrestore(&priv->lock, flags);
>>>>> +
>>>>
>>>> I like this part. :-)
>>>>
>>>
>>> A weight off my mind :) And I hope that this change will remain the less
>>> questionable one, other ones from the series are trivial.
>>>
>>> Anyway I hope it is understandable that this part of the change can not
>>> be simply extracted from the rest one below, otherwise there'll be bugs of
>>> another type intorduced.
>>
>> I never said I'd like to apply this part alone, my idea was more like removing
>> the spinlock grabbing and the duplex handling down below.
>>
>
> As I've already said, "the removal of the spinlock grabbing" is not a valid fix,
> but it will introduce another regression.
>
> Please tell me, a removal of duplex handling change should be done before
> a removal of the spinlock grabbing? Or after?
As much as I was able to figure out, at the same time.
>> [...]
>>>>> @@ -1357,7 +1329,7 @@ static const struct ethtool_ops ravb_ethtool_ops = {
>>>>> .set_ringparam = ravb_set_ringparam,
>>>>> .get_ts_info = ravb_get_ts_info,
>>>>> .get_link_ksettings = phy_ethtool_get_link_ksettings,
>>>>> - .set_link_ksettings = ravb_set_link_ksettings,
>>>>> + .set_link_ksettings = phy_ethtool_set_link_ksettings,
>>>>
>>>> Should have been a part of the final patch in the fix/enhancement chain...
>>>
>>> Please elaborate.
>>>
>>> Do you mean that firstly I have to make erroneous ravb_set_link_ksettings()
>>> to look similar to phy_ethtool_set_link_ksettings() and then remove it?
>>
>> Yes.
> Then this change of "ravb_set_link_ksettings() looks similar to
> phy_ethtool_set_link_ksettings()" will be a single commit, and it will be
> a fix. Does it sound good?
It does, as I was trying to tell you. :-)
>>> As I see it in the current context (removal of ravb_set_duplex() call and
>>> so on), the problem with this approach is that the actual fix change will
>>> be done on top of a number of enchancement changes, thus it contradicts to
>>
>> Now I have to ask you to elaborate. I have no idea what you mean. :-(
>
> My statement is based on the next facts:
s/next/following/?
> 1. ravb_set_duplex() call in ravb_set_link_ksettings() is unnecessary,
> however its removal is an enchancement,
> 2. removal of the spinlock grabbing is just a *part* of the proper fix,
> and the complete proper fix includes ravb_set_duplex() call removal,
> adding spinlock grabbing to ravb_adjust_link() and other modifications
> to ravb_adjust_link() from this commit.
So far, so good. :-)
> 3. commits with fixes must precede commits with enchancements in the
> series, because enchancements are not backported.
Enhancements?
Yes, if at all possible.
> The question remains the same, what do you ask me to do?
Mainly, to separate fixes from clean-ups, as much as possible. That'll simplify
the -stable backport handling for DaveM and the people maintaining the earlier kernel
versions.
>> And of course, sometimes the things are broken in a so subtle way, that
>> only as pile of "cleanups" fixed them, we had that situation in e.g. the
>> R-Car I2C driver -- *none* of AFAIR 9 patches was good as a -stable patch...
>>
>>> the accepted development/maintenace model "fixes first", and most probably
>>> it won't be possible to backport the real fix, however this sole change can
>>> be backported.
>>
>> My idea was to move the [G]ECMR writes to the adjust_link() callback and
>> to stop grabbing the spinlock where it *was* grabbed in the same fix patch.
>> Then just a single clean up, to start using the new phylib method.
> It will be okay iff ravb_set_duplex() call removal is added to the first
> part ("fixes" part) of two changes. Please confirm that our understanding
> is aligned.
Yes, and I've tried to communicate that to you but somehow failed. :-)
> --
> With best wishes,
> Vladimir
MBR, Sergei
^ permalink raw reply
* Re: [PATCH net] net: ipv6: ip6_output: alloc skb with tailroom
From: Willem de Bruijn @ 2018-06-06 20:26 UTC (permalink / raw)
To: David Miller
Cc: aring, Network Development, Hideaki YOSHIFUJI, david.palma,
rabinarayans0828, Jamal Hadi Salim, stefan, linux-wpan, kernel
In-Reply-To: <20180606.141125.1145874895676180399.davem@davemloft.net>
On Wed, Jun 6, 2018 at 2:11 PM, David Miller <davem@davemloft.net> wrote:
> From: Alexander Aring <aring@mojatatu.com>
> Date: Wed, 6 Jun 2018 14:09:20 -0400
>
>> okay, then you want to have this patch for net-next? As an optimization?
>>
>> Of course, when it's open again.
>
> Like you, I have questions about where this adjustment is applied and
> why. So I'm not sure yet.
>
> For example, only IPV6 really takes it into consideration and as you
> saw only really for the fragmentation path and not the normal output
> path.
>
> This needs more consideration and investigation.
This is the unconditional skb_put in ieee802154_tx. In many cases
there is some tailroom due to SKB_DATA_ALIGN in __alloc_skb,
so it may take a specific case to not have even 2 bytes of tailroom
available.
^ permalink raw reply
* Re: [RFC PATCH ghak90 (was ghak32) V3 01/10] audit: add container id
From: Richard Guy Briggs @ 2018-06-06 20:26 UTC (permalink / raw)
To: Steve Grubb
Cc: linux-audit, cgroups, containers, linux-api, linux-fsdevel, LKML,
netdev, ebiederm, luto, jlayton, carlos, dhowells, viro, simo,
eparis, serge
In-Reply-To: <1828967.ZGm5HJUlmN@x2>
On 2018-06-06 13:56, Steve Grubb wrote:
> On Wednesday, June 6, 2018 12:58:28 PM EDT Richard Guy Briggs wrote:
> > Implement the proc fs write to set the audit container identifier of a
> > process, emitting an AUDIT_CONTAINER_ID record to document the event.
> >
> > This is a write from the container orchestrator task to a proc entry of
> > the form /proc/PID/audit_containerid where PID is the process ID of the
> > newly created task that is to become the first task in a container, or
> > an additional task added to a container.
> >
> > The write expects up to a u64 value (unset: 18446744073709551615).
> >
> > The writer must have capability CAP_AUDIT_CONTROL.
> >
> > This will produce a record such as this:
> > type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set
> > opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root
> > uid=root tty=ttyS0 ses=1
> > subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash
> > exe=/usr/bin/bash res=yes
> >
> > The "op" field indicates an initial set. The "pid" to "ses" fields are
> > the orchestrator while the "opid" field is the object's PID, the process
> > being "contained". Old and new audit container identifier values are
> > given in the "contid" fields, while res indicates its success.
> >
> > It is not permitted to unset or re-set the audit container identifier.
> > A child inherits its parent's audit container identifier, but then can
> > be set only once after.
> >
> > See: https://github.com/linux-audit/audit-kernel/issues/90
> > See: https://github.com/linux-audit/audit-userspace/issues/51
> > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > See:
> > https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> >
> > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> > ---
> > fs/proc/base.c | 37 ++++++++++++++++++++++++
> > include/linux/audit.h | 25 ++++++++++++++++
> > include/uapi/linux/audit.h | 2 ++
> > kernel/auditsc.c | 71
> > ++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 135
> > insertions(+)
> >
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index eafa39a..318dff4 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -1302,6 +1302,41 @@ static ssize_t proc_sessionid_read(struct file *
> > file, char __user * buf, .read = proc_sessionid_read,
> > .llseek = generic_file_llseek,
> > };
> > +
> > +static ssize_t proc_contid_write(struct file *file, const char __user
> > *buf, + size_t count, loff_t *ppos)
> > +{
> > + struct inode *inode = file_inode(file);
> > + u64 contid;
> > + int rv;
> > + struct task_struct *task = get_proc_task(inode);
> > +
> > + if (!task)
> > + return -ESRCH;
> > + if (*ppos != 0) {
> > + /* No partial writes. */
> > + put_task_struct(task);
> > + return -EINVAL;
> > + }
> > +
> > + rv = kstrtou64_from_user(buf, count, 10, &contid);
> > + if (rv < 0) {
> > + put_task_struct(task);
> > + return rv;
> > + }
> > +
> > + rv = audit_set_contid(task, contid);
> > + put_task_struct(task);
> > + if (rv < 0)
> > + return rv;
> > + return count;
> > +}
> > +
> > +static const struct file_operations proc_contid_operations = {
> > + .write = proc_contid_write,
> > + .llseek = generic_file_llseek,
> > +};
> > +
> > #endif
> >
> > #ifdef CONFIG_FAULT_INJECTION
> > @@ -2995,6 +3030,7 @@ static int proc_pid_patch_state(struct seq_file *m,
> > struct pid_namespace *ns, #ifdef CONFIG_AUDITSYSCALL
> > REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
> > REG("sessionid", S_IRUGO, proc_sessionid_operations),
> > + REG("audit_containerid", S_IWUSR, proc_contid_operations),
> > #endif
> > #ifdef CONFIG_FAULT_INJECTION
> > REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
> > @@ -3386,6 +3422,7 @@ static int proc_tid_comm_permission(struct inode
> > *inode, int mask) #ifdef CONFIG_AUDITSYSCALL
> > REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
> > REG("sessionid", S_IRUGO, proc_sessionid_operations),
> > + REG("audit_containerid", S_IWUSR, proc_contid_operations),
> > #endif
> > #ifdef CONFIG_FAULT_INJECTION
> > REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
> > diff --git a/include/linux/audit.h b/include/linux/audit.h
> > index 4f824c4..497cd81 100644
> > --- a/include/linux/audit.h
> > +++ b/include/linux/audit.h
> > @@ -219,6 +219,8 @@ static inline void audit_log_task_info(struct
> > audit_buffer *ab, struct audit_task_info {
> > kuid_t loginuid;
> > unsigned int sessionid;
> > + u64 contid;
> > + bool inherited; /* containerid inheritance */
> > struct audit_context *ctx;
> > };
> > extern struct audit_task_info init_struct_audit;
> > @@ -331,6 +333,7 @@ static inline void audit_ptrace(struct task_struct *t)
> > extern int auditsc_get_stamp(struct audit_context *ctx,
> > struct timespec64 *t, unsigned int *serial);
> > extern int audit_set_loginuid(kuid_t loginuid);
> > +extern int audit_set_contid(struct task_struct *tsk, u64 contid);
> >
> > static inline kuid_t audit_get_loginuid(struct task_struct *tsk)
> > {
> > @@ -348,6 +351,14 @@ static inline unsigned int audit_get_sessionid(struct
> > task_struct *tsk) return AUDIT_SID_UNSET;
> > }
> >
> > +static inline u64 audit_get_contid(struct task_struct *tsk)
> > +{
> > + if (!tsk->audit)
> > + return AUDIT_CID_UNSET;
> > + else
> > + return tsk->audit->contid;
> > +}
> > +
> > extern void __audit_ipc_obj(struct kern_ipc_perm *ipcp);
> > extern void __audit_ipc_set_perm(unsigned long qbytes, uid_t uid, gid_t
> > gid, umode_t mode); extern void __audit_bprm(struct linux_binprm *bprm);
> > @@ -542,6 +553,10 @@ static inline unsigned int audit_get_sessionid(struct
> > task_struct *tsk) {
> > return AUDIT_SID_UNSET;
> > }
> > +static inline kuid_t audit_get_contid(struct task_struct *tsk)
> > +{
> > + return AUDIT_CID_UNSET;
> > +}
> > static inline void audit_ipc_obj(struct kern_ipc_perm *ipcp)
> > { }
> > static inline void audit_ipc_set_perm(unsigned long qbytes, uid_t uid,
> > @@ -606,6 +621,16 @@ static inline bool audit_loginuid_set(struct
> > task_struct *tsk) return uid_valid(audit_get_loginuid(tsk));
> > }
> >
> > +static inline bool cid_valid(u64 contid)
> > +{
> > + return contid != AUDIT_CID_UNSET;
> > +}
> > +
> > +static inline bool audit_contid_set(struct task_struct *tsk)
> > +{
> > + return cid_valid(audit_get_contid(tsk));
> > +}
> > +
> > static inline void audit_log_string(struct audit_buffer *ab, const char
> > *buf) {
> > audit_log_n_string(ab, buf, strlen(buf));
> > diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> > index 04f9bd2..c3b1aca 100644
> > --- a/include/uapi/linux/audit.h
> > +++ b/include/uapi/linux/audit.h
> > @@ -71,6 +71,7 @@
> > #define AUDIT_TTY_SET 1017 /* Set TTY auditing status */
> > #define AUDIT_SET_FEATURE 1018 /* Turn an audit feature on or off */
> > #define AUDIT_GET_FEATURE 1019 /* Get which features are enabled */
> > +#define AUDIT_CONTAINER_ID 1020 /* Define the container id and
> information
> > */
> >
> > #define AUDIT_FIRST_USER_MSG 1100 /* Userspace messages mostly
> > uninteresting to kernel */ #define AUDIT_USER_AVC 1107 /* We filter this
> > differently */
> > @@ -466,6 +467,7 @@ struct audit_tty_status {
> >
> > #define AUDIT_UID_UNSET (unsigned int)-1
> > #define AUDIT_SID_UNSET ((unsigned int)-1)
> > +#define AUDIT_CID_UNSET ((u64)-1)
> >
> > /* audit_rule_data supports filter rules with both integer and string
> > * fields. It corresponds with AUDIT_ADD_RULE, AUDIT_DEL_RULE and
> > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > index 59ef7a81..611e926 100644
> > --- a/kernel/auditsc.c
> > +++ b/kernel/auditsc.c
> > @@ -956,6 +956,8 @@ int audit_alloc(struct task_struct *tsk)
> > return -ENOMEM;
> > info->loginuid = audit_get_loginuid(current);
> > info->sessionid = audit_get_sessionid(current);
> > + info->contid = audit_get_contid(current);
> > + info->inherited = true;
> > tsk->audit = info;
> >
> > if (likely(!audit_ever_enabled))
> > @@ -985,6 +987,8 @@ int audit_alloc(struct task_struct *tsk)
> > struct audit_task_info init_struct_audit = {
> > .loginuid = INVALID_UID,
> > .sessionid = AUDIT_SID_UNSET,
> > + .contid = AUDIT_CID_UNSET,
> > + .inherited = true,
> > .ctx = NULL,
> > };
> >
> > @@ -2112,6 +2116,73 @@ int audit_set_loginuid(kuid_t loginuid)
> > }
> >
> > /**
> > + * audit_set_contid - set current task's audit_context contid
> > + * @contid: contid value
> > + *
> > + * Returns 0 on success, -EPERM on permission failure.
> > + *
> > + * Called (set) from fs/proc/base.c::proc_contid_write().
> > + */
> > +int audit_set_contid(struct task_struct *task, u64 contid)
> > +{
> > + u64 oldcontid;
> > + int rc = 0;
> > + struct audit_buffer *ab;
> > + uid_t uid;
> > + struct tty_struct *tty;
> > + char comm[sizeof(current->comm)];
> > +
> > + /* Can't set if audit disabled */
> > + if (!task->audit)
> > + return -ENOPROTOOPT;
> > + oldcontid = audit_get_contid(task);
> > + /* Don't allow the audit containerid to be unset */
> > + if (!cid_valid(contid))
> > + rc = -EINVAL;
> > + /* if we don't have caps, reject */
> > + else if (!capable(CAP_AUDIT_CONTROL))
> > + rc = -EPERM;
> > + /* if task has children or is not single-threaded, deny */
> > + else if (!list_empty(&task->children))
> > + rc = -EBUSY;
> > + else if (!(thread_group_leader(task) && thread_group_empty(task)))
> > + rc = -EALREADY;
> > + /* it is already set, and not inherited from the parent, reject */
> > + else if (cid_valid(oldcontid) && !task->audit->inherited)
> > + rc = -EEXIST;
> > + if (!rc) {
> > + task_lock(task);
> > + task->audit->contid = contid;
> > + task->audit->inherited = false;
> > + task_unlock(task);
> > + }
> > +
> > + if (!audit_enabled)
> > + return rc;
> > +
> > + ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_ID);
> > + if (!ab)
> > + return rc;
> > +
> > + uid = from_kuid(&init_user_ns, task_uid(current));
> > + tty = audit_get_tty(current);
> > + audit_log_format(ab, "op=set opid=%d old-contid=%llu contid=%llu pid=%d
> > uid=%u auid=%u tty=%s ses=%u", + task_tgid_nr(task), oldcontid,
> contid,
> > + task_tgid_nr(current), uid
> > + from_kuid(&init_user_ns, audit_get_loginuid(current)),
> > + tty ? tty_name(tty) : "(none)",
> > + audit_get_sessionid(current));
>
> The event code doesn't match the example event at the top. (uid and auid are
> transposed.) But the code looks right.
Hmmm, I thought I checked that explicitly... That event sample must
have come from the previous compile before I fixed that.
> Ack for the event format.
Thanks!
> -Steve
>
> > + audit_put_tty(tty);
> > + audit_log_task_context(ab);
> > + audit_log_format(ab, " comm=");
> > + audit_log_untrustedstring(ab, get_task_comm(comm, current));
> > + audit_log_d_path_exe(ab, current->mm);
> > + audit_log_format(ab, " res=%d", !rc);
> > + audit_log_end(ab);
> > + return rc;
> > +}
> > +
> > +/**
> > * __audit_mq_open - record audit data for a POSIX MQ open
> > * @oflag: open flag
> > * @mode: mode bits
- RGB
--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635
^ permalink raw reply
* Re: [PATCH bpf-next v5 1/2] trace_helpers.c: Add helpers to poll multiple perf FDs for events
From: Jakub Kicinski @ 2018-06-06 20:17 UTC (permalink / raw)
To: Toke Høiland-Jørgensen
Cc: Linux Netdev List, Jesper Dangaard Brouer
In-Reply-To: <87tvqfercc.fsf@toke.dk>
On Wed, Jun 6, 2018 at 10:51 AM, Toke Høiland-Jørgensen wrote:
> Jakub Kicinski <jakub.kicinski@netronome.com> writes:
>
>> On Wed, 06 Jun 2018 14:43:39 +0200, Toke Høiland-Jørgensen wrote:
>>> Add two new helper functions to trace_helpers that supports polling
>>> multiple perf file descriptors for events. These are used to the XDP
>>> perf_event_output example, which needs to work with one perf fd per CPU.
>>>
>>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>>
>> malloc() + memset(0) could have been replaced with calloc, but
>> otherwise looks good.
>
> Ah right. I'll fix that and your nit for the other patch and resubmit
> with your reviewed-by tag. Thanks for reviewing!
Thank you :)
^ permalink raw reply
* Re: [PATCH 2/2] rfkill: Create rfkill-none LED trigger
From: João Paulo Rechi Vita @ 2018-06-06 19:47 UTC (permalink / raw)
To: Johannes Berg, David S. Miller
Cc: linux, Michał Kępień, João Paulo Rechi Vita,
linux-wireless, Network Development, LKML
In-Reply-To: <20180522212932.5357-2-jprvita@endlessm.com>
On Tue, May 22, 2018 at 2:29 PM, João Paulo Rechi Vita
<jprvita@gmail.com> wrote:
> Creates a new trigger rfkill-none, as a complement to rfkill-any, which
> drives LEDs when any radio is enabled. The new trigger is meant to turn
> a LED ON whenever all radios are OFF, and turn it OFF otherwise.
>
Johannes, do you have any feedback here? This will be used by
asus-wireless in the same way fujitsu-laptop uses rfkill-any.
--
João Paulo Rechi Vita
http://about.me/jprvita
^ permalink raw reply
* linux-next: Please add mlx5-next tree
From: Leon Romanovsky @ 2018-06-06 19:25 UTC (permalink / raw)
To: Stephen Rothwell
Cc: Linux-Next Mailing List, Linux Kernel Mailing List,
Saeed Mahameed, Jason Gunthorpe, Doug Ledford, David S. Miller,
linux-netdev, RDMA mailing list
[-- Attachment #1: Type: text/plain, Size: 344 bytes --]
Hi Stephen,
Can you please add the branch "mlx5-next" from the following tree to the
linux-next integration tree?
https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/
The mlx5-next branch is used to send shared pull requests to both netdev
and RDMA subsystems and it is worth to have linux-next coverage on it
before.
Thanks
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply
* Re: [net-next PATCH v3 3/5] net: Enable Tx queue selection based on Rx queues
From: Willem de Bruijn @ 2018-06-06 19:13 UTC (permalink / raw)
To: Samudrala, Sridhar
Cc: Amritha Nambiar, Network Development, David Miller,
Alexander Duyck, Eric Dumazet, Hannes Frederic Sowa, Tom Herbert
In-Reply-To: <cbf3d17c-6585-b66c-35d1-1a65811bb8d0@intel.com>
On Wed, Jun 6, 2018 at 3:08 PM, Samudrala, Sridhar
<sridhar.samudrala@intel.com> wrote:
>
> On 6/6/2018 11:56 AM, Willem de Bruijn wrote:
>>
>> On Tue, Jun 5, 2018 at 4:38 AM, Amritha Nambiar
>> <amritha.nambiar@intel.com> wrote:
>>>
>>> This patch adds support to pick Tx queue based on the Rx queue(s) map
>>> configuration set by the admin through the sysfs attribute
>>> for each Tx queue. If the user configuration for receive queue(s) map
>>> does not apply, then the Tx queue selection falls back to CPU(s) map
>>> based selection and finally to hashing.
>>>
>>> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
>>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>>> ---
>>> int sysctl_tcp_max_orphans __read_mostly = NR_FILE;
>>>
>>> @@ -5574,6 +5575,7 @@ void tcp_finish_connect(struct sock *sk, struct
>>> sk_buff *skb)
>>> if (skb) {
>>> icsk->icsk_af_ops->sk_rx_dst_set(sk, skb);
>>> security_inet_conn_established(sk, skb);
>>> + sk_mark_napi_id(sk, skb);
>>> }
>>
>> This and the call below should be in a standalone patch, as the mark
>> changes are not rxq-xps specific. Is the additional earlier marking really
>> required?
>
>
> The additional earlier marking in tcp_finish_connect() allows a client app
> to do
> SO_INCOMING_NAPI_ID after a a connect() call to get the right queue
> association
> for a socket.
>
> The marking in tcp_conn_request() allows syn-ack to go on the right tx-queue
> associated with the queue on which syn is received.
I understand the intent. My question really is whether it is needed.
Marking has been slightly lossy in this regard in the past, not
necessarily as an oversight. I don't mean to make that call here,
but it's worth discussion and its own patch.
^ permalink raw reply
* Re: [net-next PATCH v3 3/5] net: Enable Tx queue selection based on Rx queues
From: Samudrala, Sridhar @ 2018-06-06 19:08 UTC (permalink / raw)
To: Willem de Bruijn, Amritha Nambiar
Cc: Network Development, David Miller, Alexander Duyck, Eric Dumazet,
Hannes Frederic Sowa, Tom Herbert
In-Reply-To: <CAF=yD-LyoCxs7dOtKpVuWcFcQ_DBr0O=yw8-Crszma=qC+F3gA@mail.gmail.com>
On 6/6/2018 11:56 AM, Willem de Bruijn wrote:
> On Tue, Jun 5, 2018 at 4:38 AM, Amritha Nambiar
> <amritha.nambiar@intel.com> wrote:
>> This patch adds support to pick Tx queue based on the Rx queue(s) map
>> configuration set by the admin through the sysfs attribute
>> for each Tx queue. If the user configuration for receive queue(s) map
>> does not apply, then the Tx queue selection falls back to CPU(s) map
>> based selection and finally to hashing.
>>
>> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
>> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
>> ---
>> int sysctl_tcp_max_orphans __read_mostly = NR_FILE;
>>
>> @@ -5574,6 +5575,7 @@ void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
>> if (skb) {
>> icsk->icsk_af_ops->sk_rx_dst_set(sk, skb);
>> security_inet_conn_established(sk, skb);
>> + sk_mark_napi_id(sk, skb);
>> }
> This and the call below should be in a standalone patch, as the mark
> changes are not rxq-xps specific. Is the additional earlier marking really
> required?
The additional earlier marking in tcp_finish_connect() allows a client app to do
SO_INCOMING_NAPI_ID after a a connect() call to get the right queue association
for a socket.
The marking in tcp_conn_request() allows syn-ack to go on the right tx-queue
associated with the queue on which syn is received.
>
> I would separate out the entire rxq caching with sk_rx_queue_mapping from
> the refactoring of get_xps_queue. The two changes are quite independent.
>
>> tcp_init_transfer(sk, BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB);
>> @@ -6402,6 +6404,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
>> tcp_rsk(req)->snt_isn = isn;
>> tcp_rsk(req)->txhash = net_tx_rndhash();
>> tcp_openreq_init_rwin(req, sk, dst);
>> + sk_mark_rx_queue(req_to_sk(req), skb);
^ permalink raw reply
* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
From: Or Gerlitz @ 2018-06-06 19:07 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Paul Blakey, Jiri Pirko, Cong Wang, Jamal Hadi Salim,
David Miller, Linux Netdev List, Yevgeny Kliteynik, Roi Dayan,
Shahar Klein, Mark Bloch, Or Gerlitz
In-Reply-To: <20180606085033.7c5acff7@cakuba.netronome.com>
On Wed, Jun 6, 2018 at 6:50 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Wed, 6 Jun 2018 08:12:20 +0300, Or Gerlitz wrote:
>> On Tue, Jun 5, 2018 at 9:59 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
>> > But the correct fix is to remove egdev crutch completely IMO.
>> Not against it, sometimes designs should change and be replaced
>> with better ones, happens
> Cool, I'm not sure when we'll get to it, but perhaps we can go over
> the ideas I presented at netconf on switchdev call next week (if it's
> still active)?
sounds good, I'll check. Where the netconf slides can be found?
^ permalink raw reply
* Re: [net-next PATCH v3 3/5] net: Enable Tx queue selection based on Rx queues
From: Willem de Bruijn @ 2018-06-06 18:56 UTC (permalink / raw)
To: Amritha Nambiar
Cc: Network Development, David Miller, Alexander Duyck,
Samudrala, Sridhar, Eric Dumazet, Hannes Frederic Sowa,
Tom Herbert
In-Reply-To: <152818788079.20862.16138764303449301927.stgit@anamdev.jf.intel.com>
On Tue, Jun 5, 2018 at 4:38 AM, Amritha Nambiar
<amritha.nambiar@intel.com> wrote:
> This patch adds support to pick Tx queue based on the Rx queue(s) map
> configuration set by the admin through the sysfs attribute
> for each Tx queue. If the user configuration for receive queue(s) map
> does not apply, then the Tx queue selection falls back to CPU(s) map
> based selection and finally to hashing.
>
> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> ---
> int sysctl_tcp_max_orphans __read_mostly = NR_FILE;
>
> @@ -5574,6 +5575,7 @@ void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
> if (skb) {
> icsk->icsk_af_ops->sk_rx_dst_set(sk, skb);
> security_inet_conn_established(sk, skb);
> + sk_mark_napi_id(sk, skb);
> }
This and the call below should be in a standalone patch, as the mark
changes are not rxq-xps specific. Is the additional earlier marking really
required?
I would separate out the entire rxq caching with sk_rx_queue_mapping from
the refactoring of get_xps_queue. The two changes are quite independent.
> tcp_init_transfer(sk, BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB);
> @@ -6402,6 +6404,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> tcp_rsk(req)->snt_isn = isn;
> tcp_rsk(req)->txhash = net_tx_rndhash();
> tcp_openreq_init_rwin(req, sk, dst);
> + sk_mark_rx_queue(req_to_sk(req), skb);
^ permalink raw reply
* Re: [PATCH 2/5] spi: implement companion-spi driver
From: Andy Shevchenko @ 2018-06-06 18:47 UTC (permalink / raw)
To: Mark Jonas
Cc: Wolfgang Grandegger, Marc Kleine-Budde, linux-can, netdev,
Linux Kernel Mailing List, Heiko Schocher, Zhu Yi
In-Reply-To: <1528224240-30786-3-git-send-email-mark.jonas@de.bosch.com>
On Tue, Jun 5, 2018 at 9:43 PM, Mark Jonas <mark.jonas@de.bosch.com> wrote:
> The low level companion-spi driver encapsulates the communication
> details with the companion processor, and provides interface for
> the upper level drivers to access.
> +int companion_do_io_tx(struct device *parent,
> + const char __user *buf,
> + size_t count)
> +{
> + struct companion_spi_priv *priv = dev_get_drvdata(parent);
> + unsigned int copied;
> + int error;
> + struct companion_packet p;
> +
> + /*TODO: support mutiple packets in one write in future*/
Hmm... Either fix this, or remove comment?
> + if (copy_from_user(p.data, buf, sizeof(p)) == 0) {
> + if (is_can_type(&p))
> + return -EINVAL;
> + } else {
> + dev_info(parent, "copy from user not succeed in one call\n");
Shouldn't it return immediately?
> + }
> +
> + error = qm_io_txq_in(&priv->pm.qm, buf, count, &copied);
...what the point to call if we got garbage from user space?
> + if (!error) {
The usual pattern is to check for errors first.
> + wake_up_interruptible(&priv->wait);
> + priv->pm.stats.io_tx++;
> + return copied;
> + } else {
> + priv->pm.stats.io_tx_overflows++;
> + }
> + return error;
> +}
> + ... qm_io_rxq_out(&priv->pm.qm, buf, count, &copied);
> + ... pm_can_data_tx(&priv->pm, port, prio, cf);
Oh, oh, oh.
These namespaces are too generic, moreover pm is kinda occupied by
power management. You bring a lot of confusion here, I think.
> + err = pm_can_get_txq_status(pm, port);
> + if (!err) {
if (err)
return err;
> + }
> + return err;
> + int ret, value;
> +
> + ret = sscanf(buf, "%d", &value);
> + if (ret != 1) {
> + }
You have to be consistent in your code.
I've already noticed
err
error
and now
ret
Choose one and stick with it.
Also check your entire patch series' code for consistency.
> +static DEVICE_ATTR(dump_packets, S_IRUGO | S_IWUSR,
> + show_dump_packets, store_dump_packets);
We have helpers, like DEVICE_ATTR_RW().
> + ret = snprintf(buf + pos, PAGE_SIZE - pos,
> + "\ntx: %u, rx: %u, err: %u\n\n",
> + total,
> + priv->pm.stats.can_rx_overflows[i],
> + priv->pm.stats.can_err_overflows[i]);
Please, avoid leading '\n'.
> + gpio_set_value(priv->cs_gpios, priv->cs_gpios_assert);
> + gpio_set_value(priv->cs_gpios, !priv->cs_gpios_assert);
Can you switch to GPIO descriptor API?
> + unsigned int count = READY_POLL_US / READY_POLL_US_GRAN;
> + while (count--) {
For counting like this better to have
do {
} while (--count);
The rationale, reader at first glance will know that the loop will
iterate at least once.
> + if (slave_is_not_busy(priv))
> + return 0;
> +
> + udelay(READY_POLL_US_GRAN);
Should it be atomic?
Can it use read_poll_* type of helpers instead?
> + }
Above comments applicable to entire code you have.
> +static void companion_spi_cpu_to_be32(char *buf)
> +{
> + u32 *buf32 = (u32*)buf;
> + int i;
> +
> + for (i = 0; i < (BCP_PACKET_SIZE / sizeof(u32)); i++, buf32++)
> + *buf32 = cpu_to_be32(*buf32);
> +}
This entire function should be replaced by one of the helpers from
include/linux/byteorder/generic.h
I guess cpu_to_be32_array() is a right one.
> +static void companion_spi_be32_to_cpu(char *buf)
> +{
> + u32 *buf32 = (u32*)buf;
> + int i;
> +
> + for (i = 0; i < (BCP_PACKET_SIZE / sizeof(u32)); i++, buf32++)
> + *buf32 = be32_to_cpu(*buf32);
> +}
Ditto.
Recommendation: check your code against existing helpers.
> + p = (const struct companion_packet*)transfer->tx_buf;
> + companion_spi_cpu_to_be32((char*)transfer->tx_buf);
If tx_buf is type of void * all these castings are redundant.
Also looking at the function, did you consider to use whatever SPI
core provides, like CS handling, messages handling and so on?
> +static int companion_spi_thread(void *data)
> +{
> + struct companion_spi_priv *priv = data;
> + struct companion_packet tx_packet;
> + struct companion_packet rx_packet;
> + struct spi_message message;
> + struct spi_transfer transfer;
> +
> + memset(&transfer, 0, sizeof(transfer));
> + transfer.tx_buf = tx_packet.data;
> + transfer.rx_buf = rx_packet.data;
> + transfer.len = sizeof(struct companion_packet);
> + transfer.cs_change = 0;
Redundant.
> + transfer.bits_per_word = 32;
> +
> + for (;;) {
Infinite loops are evil in most of the cases.
I see here
do {
} while (kthread_should_stop());
> + if (wait_event_interruptible(priv->wait,
> + kthread_should_stop() ||
> + slave_has_request(priv) ||
> + qm_has_tx_data(&priv->pm.qm)))
> + continue;
> +
> + if (kthread_should_stop())
> + break;
> +
> + pm_prepare_tx(&priv->pm, &tx_packet);
> + companion_spi_transceive(priv, &message, &transfer);
> + pm_on_tx_done(&priv->pm);
> + pm_on_rx_done(&priv->pm, &rx_packet);
> + }
> +
> + return 0;
> +}
> +static const struct of_device_id companion_spi_of_match[] = {
> + { .compatible = DRIVER_NAME, .data = NULL, },
> + { /* sentinel */ },
terminators better without commas.
> +};
> +static struct spi_driver companion_spi_driver = {
> + .driver = {
> + .name = DRIVER_NAME,
> + .owner = THIS_MODULE,
This is redundant, macro you are using below sets that.
> + .of_match_table = of_match_ptr(companion_spi_of_match),
> + },
> + .probe = companion_spi_probe,
> + .remove = companion_spi_remove,
> +};
> +module_spi_driver(companion_spi_driver);
> +static void io_process(struct companion_protocol_manager *pm,
> + const struct companion_packet *p)
Something wrong with indentation in this file.
> +#define CHECK_SIZE(x) BUILD_BUG_ON(sizeof(struct companion_packet) != \
> + sizeof(x))
Better to split like
_SIZE(x) \
BUILD_BUG_ON()
> +void pm_init(struct companion_protocol_manager *pm)
Unfortunately, horrible name for the function.
Namespace is kinda occupied, name itself way too generic.
> + if (ctrl & CAN_CTRLMODE_LISTENONLY)
> + p.mode = BCP_CAN_MODE_LISTEN;
> + else
> + return -EOPNOTSUPP;
if (!(cond))
return -ERRNO;
?
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
Do you still need this text?
> + * TODO: add more statistics fields and export to sysfs
Do it or remove the comment?
> + * TODO: re-think the data structure for handle CAN response
Ditto.
> +/**
> + * BCP status field definitions
> + */
> +#define BCP_STATUS_SUCCESS 0x00u
> +#define BCP_STATUS_UNKNOWN 0x01u
> +#define BCP_STATUS_OTHER 0x02u
BIT() ?
> +struct companion_packet {
> + __u8 data[BCP_PACKET_SIZE];
> +};
Is it going from / to user space? Otherwise why __ kind of type?
> +#define CAN_MAX_IN_A_ROW 8
> +
> +
> +
Too many blank lines.
> +int companion_do_can_err(struct device *parent,
> + u8 port,
> + struct can_berr_counter *bec,
> + u8 *state,
> + u8 *code);
+ blank line here.
> +#define COMPANION_CAN_STATE_WARNING 0x01u
> +#define COMPANION_CAN_STATE_PASSIVE 0x02u
> +#define COMPANION_CAN_STATE_BUS_OFF 0x04u
> +#define COMPANION_CAN_ERROR_STUFF 0x01u
> +#define COMPANION_CAN_ERROR_FORM 0x02u
> +#define COMPANION_CAN_ERROR_ACK 0x04u
> +#define COMPANION_CAN_ERROR_BIT1 0x08u
> +#define COMPANION_CAN_ERROR_BIT0 0x10u
> +#define COMPANION_CAN_ERROR_CRC 0x20u
> +#define COMPANION_CAN_ERROR_RXOV 0x80u
BIT() ?
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply
* [PATCH] net: stmmac: fix build failure due to missing COMMON_CLK dependency
From: Corentin Labbe @ 2018-06-06 18:45 UTC (permalink / raw)
To: alexandre.torgue, peppe.cavallaro
Cc: linux-kernel, netdev, linux-sunxi, Corentin Labbe
This patch fix the build failure on m68k;
drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.o: In function `ipq806x_gmac_probe':
dwmac-ipq806x.c:(.text+0xda): undefined reference to `clk_set_rate'
drivers/net/ethernet/stmicro/stmmac/dwmac-rk.o: In function `rk_gmac_probe':
dwmac-rk.c:(.text+0x1e58): undefined reference to `clk_set_rate'
drivers/net/ethernet/stmicro/stmmac/dwmac-sti.o: In function `stid127_fix_retime_src':
dwmac-sti.c:(.text+0xd8): undefined reference to `clk_set_rate'
dwmac-sti.c:(.text+0x114): undefined reference to `clk_set_rate'
drivers/net/ethernet/stmicro/stmmac/dwmac-sti.o:dwmac-sti.c:(.text+0x12c): more undefined references to `clk_set_rate' follow
Lots of stmmac platform drivers need COMMON_CLK in their Kconfig depends.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
---
drivers/net/ethernet/stmicro/stmmac/Kconfig | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index e28c0d2c58e9..cb5b0f58c395 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -33,7 +33,7 @@ config DWMAC_DWC_QOS_ETH
select PHYLIB
select CRC32
select MII
- depends on OF && HAS_DMA
+ depends on OF && COMMON_CLK && HAS_DMA
help
Support for chips using the snps,dwc-qos-ethernet.txt DT binding.
@@ -57,7 +57,7 @@ config DWMAC_ANARION
config DWMAC_IPQ806X
tristate "QCA IPQ806x DWMAC support"
default ARCH_QCOM
- depends on OF && (ARCH_QCOM || COMPILE_TEST)
+ depends on OF && COMMON_CLK && (ARCH_QCOM || COMPILE_TEST)
select MFD_SYSCON
help
Support for QCA IPQ806X DWMAC Ethernet.
@@ -100,7 +100,7 @@ config DWMAC_OXNAS
config DWMAC_ROCKCHIP
tristate "Rockchip dwmac support"
default ARCH_ROCKCHIP
- depends on OF && (ARCH_ROCKCHIP || COMPILE_TEST)
+ depends on OF && COMMON_CLK && (ARCH_ROCKCHIP || COMPILE_TEST)
select MFD_SYSCON
help
Support for Ethernet controller on Rockchip RK3288 SoC.
@@ -123,7 +123,7 @@ config DWMAC_SOCFPGA
config DWMAC_STI
tristate "STi GMAC support"
default ARCH_STI
- depends on OF && (ARCH_STI || COMPILE_TEST)
+ depends on OF && COMMON_CLK && (ARCH_STI || COMPILE_TEST)
select MFD_SYSCON
---help---
Support for ethernet controller on STi SOCs.
@@ -147,7 +147,7 @@ config DWMAC_STM32
config DWMAC_SUNXI
tristate "Allwinner GMAC support"
default ARCH_SUNXI
- depends on OF && (ARCH_SUNXI || COMPILE_TEST)
+ depends on OF && COMMON_CLK && (ARCH_SUNXI || COMPILE_TEST)
---help---
Support for Allwinner A20/A31 GMAC ethernet controllers.
--
2.16.4
^ permalink raw reply related
* GREETINGS FROM MOHAMMAD AHMED .
From: Mohammad Ahmed @ 2018-06-06 18:36 UTC (permalink / raw)
My Dear Friend.
I am Mr. Mohammed Ahmed, a banker in Bank of Africa Burkina Faso West
Africa, Please i need your assistance to transfer an abandoned sum of
13.5 Millions USD into your account if you can permit and after the
transaction is over 50% will be for you and 50% for me.
No risk involved in the transaction. Contact me for more details along
with your personal information needed below to build more trust and
confident.
Note. The business transaction will be done legal and official on your behalf.
1. Full name:.........
2. Current Address:.........
3. Phone.............
4. Occupation:.............
5. Age:............
6. Country:........
7. Sex........
8. Your Passport or ID card or Driving License
Thanks.
Best Regards.
Mr. Mohammed Ahmed.
^ permalink raw reply
* Re: general protection fault in sockfs_setattr
From: Cong Wang @ 2018-06-06 18:21 UTC (permalink / raw)
To: shankarapailoor, linux-fsdevel
Cc: David Miller, LKML, syzkaller, Linux Kernel Network Developers
In-Reply-To: <CAB+yDaahHYeWE1kS=4GEAC3GKyrAenaSat9Wi_iKQbt1yTp_zg@mail.gmail.com>
On Tue, Jun 5, 2018 at 7:19 PM, shankarapailoor
<shankarapailoor@gmail.com> wrote:
> Hi Cong,
>
> I added that check and it seems to stop the crash. Like you said, I
> don't see where the reference count for the file is increased. The
> inode lock also seems to be held during this call.
I know inode lock is held for ->setattr(), but not for ->release(), this
is why I suspect sock_close() could still race with sockfs_setattr()
after my patch.
I am not sure if it is crazy to just hold fd refcnt for fchmodat() too..
^ permalink raw reply
* Re: [less-CONFIG_NET 1/7] net: reorder filter code
From: Norbert Manthey @ 2018-06-06 18:19 UTC (permalink / raw)
To: Willem de Bruijn
Cc: Alexei Starovoitov, Daniel Borkmann, David S. Miller,
Network Development, LKML
In-Reply-To: <CAF=yD-JCtvVpdxwrncFurc_-OO_VtELjkxBLh=1Gk3jTnOQmug@mail.gmail.com>
On 06/06/2018 06:33 PM, Willem de Bruijn wrote:
> On Wed, Jun 6, 2018 at 9:53 AM, Norbert Manthey <nmanthey@amazon.de> wrote:
>> This commit reorders the definition of functions and struct in the
>> file filter.c, such that in the next step we can easily cut the file
>> into a commonly used part, as well as a part that is only required in
>> case CONFIG_NET is actually set.
>>
>> This is part of the effort to split CONFIG_SECCOMP_FILTER and
>> CONFIG_NET.
>>
>> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
> Reordering patches like this and the flow-dissector patch in this
> series make cherry-picking fixes back to stable branches and
> and following code history with git blame harder.
>
Dear Willem,
I agree that this change makes cherry picking harder. When keeping the
order of the code, the following commits have to introduce many pairs
"#ifdef CONFIG_NET - #endif" and "#if defined(CONFIG_NET) ||
defined(CONFIG_SECCOMP_FILTER) #endif"
This is how I developed the commits, but then refactored for improved
readability.
From my experience, cherry picking changes from this variant is equally
difficult. A third alternative, splitting the files into separate files,
would force me to make current static definitions visible.
I am happy to go another route, if people insist that this is the better
way forward.
Best,
Norbert
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
^ permalink raw reply
* Re: [PATCH net] net: ipv6: ip6_output: alloc skb with tailroom
From: David Miller @ 2018-06-06 18:11 UTC (permalink / raw)
To: aring
Cc: netdev, yoshfuji, david.palma, rabinarayans0828, jhs, stefan,
linux-wpan, kernel
In-Reply-To: <20180606180853.zimcpxbo3ejxum6g@x220t>
From: Alexander Aring <aring@mojatatu.com>
Date: Wed, 6 Jun 2018 14:09:20 -0400
> okay, then you want to have this patch for net-next? As an optimization?
>
> Of course, when it's open again.
Like you, I have questions about where this adjustment is applied and
why. So I'm not sure yet.
For example, only IPV6 really takes it into consideration and as you
saw only really for the fragmentation path and not the normal output
path.
This needs more consideration and investigation.
^ permalink raw reply
* Re: [PATCH net] net: ipv6: ip6_output: alloc skb with tailroom
From: Alexander Aring @ 2018-06-06 18:09 UTC (permalink / raw)
To: David Miller
Cc: netdev, yoshfuji, david.palma, rabinarayans0828, jhs, stefan,
linux-wpan, kernel
In-Reply-To: <20180606.135339.2253125602143741999.davem@davemloft.net>
Hi,
On Wed, Jun 06, 2018 at 01:53:39PM -0400, David Miller wrote:
> From: Alexander Aring <aring@mojatatu.com>
> Date: Tue, 5 Jun 2018 18:04:04 -0400
>
> > This patch adds care about tailroom length for allocate a skb from ipv6
> > level stack. In case of 6lowpan we had the problem the skb runs into a
> > skb_over_panic() in some special length cases. The root was there was no
> > tailroom allocated for the IEEE 802.15.4 checksum, although we had
> > the necessary tailroom specified inside the netdev structure.
> >
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195059
> > Reported-by: David Palma <david.palma@ntnu.no>
> > Reported-by: Rabi Narayan Sahoo <rabinarayans0828@gmail.com>
> > Signed-off-by: Alexander Aring <aring@mojatatu.com>
>
> needed_tailroom is an optimization to avoid SKB reallocations
> and adjustments, it is not a guarantee.
>
okay, then you want to have this patch for net-next? As an optimization?
Of course, when it's open again.
> If you are seeing crashes, it means code is assuming something which
> is not to be assumed.
>
> Whatever code is involved, it needs to check that the necessary
> tailroom is there and reallocate if necessary, rather than
> blindly pushing past the end of the SKB data.
>
I see, I will add checks and reallocs (if necessary) in the underlaying
subsystem level.
Thanks for clarifying this.
- Alex
^ permalink raw reply
* Re: [PATCH net-next] strparser: Add __strp_unpause and use it in ktls.
From: David Miller @ 2018-06-06 18:08 UTC (permalink / raw)
To: doronrk; +Cc: davejwatson, tom, vakul.garg, netdev
In-Reply-To: <20180606163328.757943-1-doronrk@fb.com>
From: Doron Roberts-Kedes <doronrk@fb.com>
Date: Wed, 6 Jun 2018 09:33:28 -0700
> strp_unpause queues strp_work in order to parse any messages that
> arrived while the strparser was paused. However, the process invoking
> strp_unpause could eagerly parse a buffered message itself if it held
> the sock lock.
>
> __strp_unpause is an alternative to strp_pause that avoids the scheduling
> overhead that results when a receiving thread unpauses the strparser
> and waits for the next message to be delivered by the workqueue thread.
>
> This patch more than doubled the IOPS achieved in a benchmark of NBD
> traffic encrypted using ktls.
>
> Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Applied, thank you.
^ permalink raw reply
* Re: [PATCH 0/5] can: enable multi-queue for SocketCAN devices
From: Andy Shevchenko @ 2018-06-06 18:06 UTC (permalink / raw)
To: Mark Jonas
Cc: Wolfgang Grandegger, Marc Kleine-Budde, linux-can, netdev,
Linux Kernel Mailing List, Heiko Schocher, Zhu Yi
In-Reply-To: <1528224240-30786-1-git-send-email-mark.jonas@de.bosch.com>
On Tue, Jun 5, 2018 at 9:43 PM, Mark Jonas <mark.jonas@de.bosch.com> wrote:
> Upon request by Marc Kleine-Budde this patch series does not only
> contain our patch to enable enable multi-queue for SocketCAN devices
> but also a driver (Companion driver suite) which makes active use of
> this feature.
>
> The driver suite implements
> - two CAN interfaces
> - one generic command interfaces
> and offers a SocketCAN as well as a char device interface. The
> SocketCAN interface supports multi-queue.
>
> The functionality bases on an external peripheral chip named Companion.
> It offers two CAN interfaces, each has 8 prioritized transmit FIFOs as
> well as one receive FIFO. Besides CAN, undisclosed additional functions
> can be accessed through the char device.
>
> A standard SPI interface with two additional lines for flow control is
> used. The Companion chip is the SPI slave.
Can remoteproc API be utilized here?
>
> The driver suite consists of three separate drivers. The following
> diagram illustrates the dependencies in layers.
>
> /dev/companion SocketCAN User Space
> -------------------------------------------------------------------
> +----------------+ +---------------+
> | companion-char | | companion-can |
> +----------------+ +---------------+
> +----------------------------------+
> | companion-spi |
> +----------------------------------+
> +----------------------------------+
> | standard SPI subsystem |
> +----------------------------------+ Linux Kernel
> -------------------------------------------------------------------
> | | | | | | Hardware
> CS-+ | | | | +-BUSY
> CLK--+ | | +---REQUEST
> MOSI---+ |
> MISO-----+
>
> companion-spi
> core.c: handles SPI, sysfs entry and interface to upper layer
> protocol-manager.c: handles protocol with the SPI HW
> queue-manager.c: handles buffering and packets scheduling
>
> companion-can
> makes use of multi-queue support and allows to use tc to configure
> the queuing discipline (e.g. mqprio). Together with the SO_PRIORITY
> socket option this allows to specify the FIFO a CAN frame shall be
> sent to.
>
> companion-char
> handles messages to other undisclosed functionality beyond CAN.
>
> Zhu Yi (5):
> can: enable multi-queue for SocketCAN devices
> spi: implement companion-spi driver
> char: implement companion-char driver
> can: implement companion-can driver
> spi,can,char: add companion DT binding documentation
>
> .../devicetree/bindings/spi/bosch,companion.txt | 82 ++
> drivers/char/Kconfig | 7 +
> drivers/char/Makefile | 2 +
> drivers/char/companion-char.c | 367 ++++++
> drivers/net/can/Kconfig | 8 +
> drivers/net/can/Makefile | 1 +
> drivers/net/can/companion-can.c | 694 ++++++++++++
> drivers/net/can/dev.c | 8 +-
> drivers/spi/Kconfig | 2 +
> drivers/spi/Makefile | 2 +
> drivers/spi/companion/Kconfig | 5 +
> drivers/spi/companion/Makefile | 2 +
> drivers/spi/companion/core.c | 1189 ++++++++++++++++++++
> drivers/spi/companion/protocol-manager.c | 1035 +++++++++++++++++
> drivers/spi/companion/protocol-manager.h | 348 ++++++
> drivers/spi/companion/protocol.h | 273 +++++
> drivers/spi/companion/queue-manager.c | 146 +++
> drivers/spi/companion/queue-manager.h | 245 ++++
> include/linux/can/dev.h | 7 +-
> include/linux/companion.h | 258 +++++
> 20 files changed, 4677 insertions(+), 4 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/spi/bosch,companion.txt
> create mode 100644 drivers/char/companion-char.c
> create mode 100644 drivers/net/can/companion-can.c
> create mode 100644 drivers/spi/companion/Kconfig
> create mode 100644 drivers/spi/companion/Makefile
> create mode 100644 drivers/spi/companion/core.c
> create mode 100644 drivers/spi/companion/protocol-manager.c
> create mode 100644 drivers/spi/companion/protocol-manager.h
> create mode 100644 drivers/spi/companion/protocol.h
> create mode 100644 drivers/spi/companion/queue-manager.c
> create mode 100644 drivers/spi/companion/queue-manager.h
> create mode 100644 include/linux/companion.h
>
> --
> 2.7.4
>
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply
* Re: [PATCH net] rxrpc: Fix terminal retransmission connection ID to include the channel
From: David Miller @ 2018-06-06 18:04 UTC (permalink / raw)
To: dhowells; +Cc: netdev, linux-afs, linux-kernel
In-Reply-To: <152829355456.15682.4887593676960385184.stgit@warthog.procyon.org.uk>
From: David Howells <dhowells@redhat.com>
Date: Wed, 06 Jun 2018 14:59:14 +0100
> When retransmitting the final ACK or ABORT packet for a call, the cid field
> in the packet header is set to the connection's cid, but this is incorrect
> as it also needs to include the channel number on that connection that the
> call was made on.
>
> Fix this by OR'ing in the channel number.
>
> Note that this fixes the bug that:
>
> commit 1a025028d400b23477341aa7ec2ce55f8b39b554
> rxrpc: Fix handling of call quietly cancelled out on server
>
> works around. I'm not intending to revert that as it will help protect
> against problems that might occur on the server.
>
> Fixes: 3136ef49a14c ("rxrpc: Delay terminal ACK transmission on a client call")
> Signed-off-by: David Howells <dhowells@redhat.com>
Applid, thanks David.
^ permalink raw reply
* Re: [PATCH V2 net-next 0/3] Bug fixes & optimization for HNS3 Driver
From: David Miller @ 2018-06-06 18:03 UTC (permalink / raw)
To: salil.mehta
Cc: yisen.zhuang, lipeng321, mehta.salil, netdev, linux-kernel,
linuxarm
In-Reply-To: <20180606130753.54428-1-salil.mehta@huawei.com>
From: Salil Mehta <salil.mehta@huawei.com>
Date: Wed, 6 Jun 2018 14:07:50 +0100
> This patch-set presents miscellaneous bug fixes and an optimization
> for HNS3 driver
>
> V1->V2:
> * Fixes the compilation break reported by David Miller & Kbuild
Series applied, hope this one builds better :)
^ permalink raw reply
* Re: [PATCH] bnx2x: use the right constant
From: David Miller @ 2018-06-06 18:01 UTC (permalink / raw)
To: Julia.Lawall
Cc: ariel.elior, kernel-janitors, everest-linux-l2, netdev,
linux-kernel
In-Reply-To: <1528290202-32454-1-git-send-email-Julia.Lawall@lip6.fr>
From: Julia Lawall <Julia.Lawall@lip6.fr>
Date: Wed, 6 Jun 2018 15:03:22 +0200
> Nearby code that also tests port suggests that the P0 constant should be
> used when port is zero.
>
> The semantic match that finds this problem is as follows:
> (http://coccinelle.lip6.fr/)
>
> // <smpl>
> @@
> expression e,e1;
> @@
>
> * e ? e1 : e1
> // </smpl>
>
> Fixes: 6c3218c6f7e5 ("bnx2x: Adjust ETS to 578xx")
> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
This definitely looks correct. Applied and queued up for -stable, thanks!
^ permalink raw reply
* [PATCH bpf-next v6 2/2] samples/bpf: Add xdp_sample_pkts example
From: Toke Høiland-Jørgensen @ 2018-06-06 17:58 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski, Jesper Dangaard Brouer
In-Reply-To: <152830792906.21161.5446415596970027478.stgit@alrua-kau>
Add an example program showing how to sample packets from XDP using the
perf event buffer. The example userspace program just prints the ethernet
header for every packet sampled.
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
samples/bpf/Makefile | 4 +
samples/bpf/xdp_sample_pkts_kern.c | 66 ++++++++++++++
samples/bpf/xdp_sample_pkts_user.c | 169 ++++++++++++++++++++++++++++++++++++
3 files changed, 239 insertions(+)
create mode 100644 samples/bpf/xdp_sample_pkts_kern.c
create mode 100644 samples/bpf/xdp_sample_pkts_user.c
diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 1303af10e54d..9ea2f7b64869 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -52,6 +52,7 @@ hostprogs-y += xdp_adjust_tail
hostprogs-y += xdpsock
hostprogs-y += xdp_fwd
hostprogs-y += task_fd_query
+hostprogs-y += xdp_sample_pkts
# Libbpf dependencies
LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -107,6 +108,7 @@ xdp_adjust_tail-objs := xdp_adjust_tail_user.o
xdpsock-objs := bpf_load.o xdpsock_user.o
xdp_fwd-objs := bpf_load.o xdp_fwd_user.o
task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
+xdp_sample_pkts-objs := xdp_sample_pkts_user.o $(TRACE_HELPERS)
# Tell kbuild to always build the programs
always := $(hostprogs-y)
@@ -163,6 +165,7 @@ always += xdp_adjust_tail_kern.o
always += xdpsock_kern.o
always += xdp_fwd_kern.o
always += task_fd_query_kern.o
+always += xdp_sample_pkts_kern.o
HOSTCFLAGS += -I$(objtree)/usr/include
HOSTCFLAGS += -I$(srctree)/tools/lib/
@@ -179,6 +182,7 @@ HOSTCFLAGS_spintest_user.o += -I$(srctree)/tools/lib/bpf/
HOSTCFLAGS_trace_event_user.o += -I$(srctree)/tools/lib/bpf/
HOSTCFLAGS_sampleip_user.o += -I$(srctree)/tools/lib/bpf/
HOSTCFLAGS_task_fd_query_user.o += -I$(srctree)/tools/lib/bpf/
+HOSTCFLAGS_xdp_sample_pkts_user.o += -I$(srctree)/tools/lib/bpf/
HOST_LOADLIBES += $(LIBBPF) -lelf
HOSTLOADLIBES_tracex4 += -lrt
diff --git a/samples/bpf/xdp_sample_pkts_kern.c b/samples/bpf/xdp_sample_pkts_kern.c
new file mode 100644
index 000000000000..f7ca8b850978
--- /dev/null
+++ b/samples/bpf/xdp_sample_pkts_kern.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ptrace.h>
+#include <linux/version.h>
+#include <uapi/linux/bpf.h>
+#include "bpf_helpers.h"
+
+#define SAMPLE_SIZE 64ul
+#define MAX_CPUS 128
+
+#define bpf_printk(fmt, ...) \
+({ \
+ char ____fmt[] = fmt; \
+ bpf_trace_printk(____fmt, sizeof(____fmt), \
+ ##__VA_ARGS__); \
+})
+
+struct bpf_map_def SEC("maps") my_map = {
+ .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
+ .key_size = sizeof(int),
+ .value_size = sizeof(u32),
+ .max_entries = MAX_CPUS,
+};
+
+SEC("xdp_sample")
+int xdp_sample_prog(struct xdp_md *ctx)
+{
+ void *data_end = (void *)(long)ctx->data_end;
+ void *data = (void *)(long)ctx->data;
+
+ /* Metadata will be in the perf event before the packet data. */
+ struct S {
+ u16 cookie;
+ u16 pkt_len;
+ } __packed metadata;
+
+ if (data < data_end) {
+ /* The XDP perf_event_output handler will use the upper 32 bits
+ * of the flags argument as a number of bytes to include of the
+ * packet payload in the event data. If the size is too big, the
+ * call to bpf_perf_event_output will fail and return -EFAULT.
+ *
+ * See bpf_xdp_event_output in net/core/filter.c.
+ *
+ * The BPF_F_CURRENT_CPU flag means that the event output fd
+ * will be indexed by the CPU number in the event map.
+ */
+ u64 flags = BPF_F_CURRENT_CPU;
+ u16 sample_size;
+ int ret;
+
+ metadata.cookie = 0xdead;
+ metadata.pkt_len = (u16)(data_end - data);
+ sample_size = min(metadata.pkt_len, SAMPLE_SIZE);
+ flags |= (u64)sample_size << 32;
+
+ ret = bpf_perf_event_output(ctx, &my_map, flags,
+ &metadata, sizeof(metadata));
+ if (ret)
+ bpf_printk("perf_event_output failed: %d\n", ret);
+ }
+
+ return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
+u32 _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/samples/bpf/xdp_sample_pkts_user.c b/samples/bpf/xdp_sample_pkts_user.c
new file mode 100644
index 000000000000..8dd87c1eb560
--- /dev/null
+++ b/samples/bpf/xdp_sample_pkts_user.c
@@ -0,0 +1,169 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <linux/perf_event.h>
+#include <linux/bpf.h>
+#include <net/if.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/sysinfo.h>
+#include <sys/ioctl.h>
+#include <signal.h>
+#include <libbpf.h>
+#include <bpf/bpf.h>
+
+#include "perf-sys.h"
+#include "trace_helpers.h"
+
+#define MAX_CPUS 128
+static int pmu_fds[MAX_CPUS], if_idx;
+static struct perf_event_mmap_page *headers[MAX_CPUS];
+static char *if_name;
+
+static int do_attach(int idx, int fd, const char *name)
+{
+ int err;
+
+ err = bpf_set_link_xdp_fd(idx, fd, 0);
+ if (err < 0)
+ printf("ERROR: failed to attach program to %s\n", name);
+
+ return err;
+}
+
+static int do_detach(int idx, const char *name)
+{
+ int err;
+
+ err = bpf_set_link_xdp_fd(idx, -1, 0);
+ if (err < 0)
+ printf("ERROR: failed to detach program from %s\n", name);
+
+ return err;
+}
+
+#define SAMPLE_SIZE 64
+
+static int print_bpf_output(void *data, int size)
+{
+ struct {
+ __u16 cookie;
+ __u16 pkt_len;
+ __u8 pkt_data[SAMPLE_SIZE];
+ } __packed *e = data;
+ int i;
+
+ if (e->cookie != 0xdead) {
+ printf("BUG cookie %x sized %d\n",
+ e->cookie, size);
+ return LIBBPF_PERF_EVENT_ERROR;
+ }
+
+ printf("Pkt len: %-5d bytes. Ethernet hdr: ", e->pkt_len);
+ for (i = 0; i < 14 && i < e->pkt_len; i++)
+ printf("%02x ", e->pkt_data[i]);
+ printf("\n");
+
+ return LIBBPF_PERF_EVENT_CONT;
+}
+
+static void test_bpf_perf_event(int map_fd, int num)
+{
+ struct perf_event_attr attr = {
+ .sample_type = PERF_SAMPLE_RAW,
+ .type = PERF_TYPE_SOFTWARE,
+ .config = PERF_COUNT_SW_BPF_OUTPUT,
+ .wakeup_events = 1, /* get an fd notification for every event */
+ };
+ int i;
+
+ for (i = 0; i < num; i++) {
+ int key = i;
+
+ pmu_fds[i] = sys_perf_event_open(&attr, -1/*pid*/, i/*cpu*/,
+ -1/*group_fd*/, 0);
+
+ assert(pmu_fds[i] >= 0);
+ assert(bpf_map_update_elem(map_fd, &key,
+ &pmu_fds[i], BPF_ANY) == 0);
+ ioctl(pmu_fds[i], PERF_EVENT_IOC_ENABLE, 0);
+ }
+}
+
+static void sig_handler(int signo)
+{
+ do_detach(if_idx, if_name);
+ exit(0);
+}
+
+int main(int argc, char **argv)
+{
+ struct bpf_prog_load_attr prog_load_attr = {
+ .prog_type = BPF_PROG_TYPE_XDP,
+ };
+ struct bpf_object *obj;
+ struct bpf_map *map;
+ int prog_fd, map_fd;
+ char filename[256];
+ int ret, err, i;
+ int numcpus;
+
+ if (argc < 2) {
+ printf("Usage: %s <ifname>\n", argv[0]);
+ return 1;
+ }
+
+ numcpus = get_nprocs();
+ if (numcpus > MAX_CPUS)
+ numcpus = MAX_CPUS;
+
+ snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+ prog_load_attr.file = filename;
+
+ if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
+ return 1;
+
+ if (!prog_fd) {
+ printf("load_bpf_file: %s\n", strerror(errno));
+ return 1;
+ }
+
+ map = bpf_map__next(NULL, obj);
+ if (!map) {
+ printf("finding a map in obj file failed\n");
+ return 1;
+ }
+ map_fd = bpf_map__fd(map);
+
+ if_idx = if_nametoindex(argv[1]);
+ if (!if_idx)
+ if_idx = strtoul(argv[1], NULL, 0);
+
+ if (!if_idx) {
+ fprintf(stderr, "Invalid ifname\n");
+ return 1;
+ }
+ if_name = argv[1];
+ err = do_attach(if_idx, prog_fd, argv[1]);
+ if (err)
+ return err;
+
+ if (signal(SIGINT, sig_handler) ||
+ signal(SIGHUP, sig_handler) ||
+ signal(SIGTERM, sig_handler)) {
+ perror("signal");
+ return 1;
+ }
+
+ test_bpf_perf_event(map_fd, numcpus);
+
+ for (i = 0; i < numcpus; i++)
+ if (perf_event_mmap_header(pmu_fds[i], &headers[i]) < 0)
+ return 1;
+
+ ret = perf_event_poller_multi(pmu_fds, headers, numcpus,
+ print_bpf_output);
+ kill(0, SIGINT);
+ return ret;
+}
^ permalink raw reply related
* [PATCH bpf-next v6 1/2] trace_helpers.c: Add helpers to poll multiple perf FDs for events
From: Toke Høiland-Jørgensen @ 2018-06-06 17:58 UTC (permalink / raw)
To: netdev; +Cc: Jakub Kicinski, Jesper Dangaard Brouer
Add two new helper functions to trace_helpers that supports polling
multiple perf file descriptors for events. These are used to the XDP
perf_event_output example, which needs to work with one perf fd per CPU.
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
tools/testing/selftests/bpf/trace_helpers.c | 48 ++++++++++++++++++++++++++-
tools/testing/selftests/bpf/trace_helpers.h | 4 ++
2 files changed, 50 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
index 3868dcb63420..cabe2a3a3b30 100644
--- a/tools/testing/selftests/bpf/trace_helpers.c
+++ b/tools/testing/selftests/bpf/trace_helpers.c
@@ -88,7 +88,7 @@ static int page_size;
static int page_cnt = 8;
static struct perf_event_mmap_page *header;
-int perf_event_mmap(int fd)
+int perf_event_mmap_header(int fd, struct perf_event_mmap_page **header)
{
void *base;
int mmap_size;
@@ -102,10 +102,15 @@ int perf_event_mmap(int fd)
return -1;
}
- header = base;
+ *header = base;
return 0;
}
+int perf_event_mmap(int fd)
+{
+ return perf_event_mmap_header(fd, &header);
+}
+
static int perf_event_poll(int fd)
{
struct pollfd pfd = { .fd = fd, .events = POLLIN };
@@ -163,3 +168,42 @@ int perf_event_poller(int fd, perf_event_print_fn output_fn)
return ret;
}
+
+int perf_event_poller_multi(int *fds, struct perf_event_mmap_page **headers,
+ int num_fds, perf_event_print_fn output_fn)
+{
+ enum bpf_perf_event_ret ret;
+ struct pollfd *pfds;
+ void *buf = NULL;
+ size_t len = 0;
+ int i;
+
+ pfds = calloc(num_fds, sizeof(*pfds));
+ if (!pfds)
+ return LIBBPF_PERF_EVENT_ERROR;
+
+ for (i = 0; i < num_fds; i++) {
+ pfds[i].fd = fds[i];
+ pfds[i].events = POLLIN;
+ }
+
+ for (;;) {
+ poll(pfds, num_fds, 1000);
+ for (i = 0; i < num_fds; i++) {
+ if (!pfds[i].revents)
+ continue;
+
+ ret = bpf_perf_event_read_simple(headers[i],
+ page_cnt * page_size,
+ page_size, &buf, &len,
+ bpf_perf_event_print,
+ output_fn);
+ if (ret != LIBBPF_PERF_EVENT_CONT)
+ break;
+ }
+ }
+ free(buf);
+ free(pfds);
+
+ return ret;
+}
diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
index 3b4bcf7f5084..18924f23db1b 100644
--- a/tools/testing/selftests/bpf/trace_helpers.h
+++ b/tools/testing/selftests/bpf/trace_helpers.h
@@ -3,6 +3,7 @@
#define __TRACE_HELPER_H
#include <libbpf.h>
+#include <linux/perf_event.h>
struct ksym {
long addr;
@@ -16,6 +17,9 @@ long ksym_get_addr(const char *name);
typedef enum bpf_perf_event_ret (*perf_event_print_fn)(void *data, int size);
int perf_event_mmap(int fd);
+int perf_event_mmap_header(int fd, struct perf_event_mmap_page **header);
/* return LIBBPF_PERF_EVENT_DONE or LIBBPF_PERF_EVENT_ERROR */
int perf_event_poller(int fd, perf_event_print_fn output_fn);
+int perf_event_poller_multi(int *fds, struct perf_event_mmap_page **headers,
+ int num_fds, perf_event_print_fn output_fn);
#endif
^ permalink raw reply related
* Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls
From: Steve Grubb @ 2018-06-06 17:58 UTC (permalink / raw)
To: linux-audit
Cc: Richard Guy Briggs, cgroups, containers, linux-api, linux-fsdevel,
LKML, netdev, ebiederm, luto, jlayton, carlos, dhowells, viro,
simo, eparis, serge
In-Reply-To: <b839dc7fb29aae226b2a4602e7ef3c2a406ee90a.1528304203.git.rgb@redhat.com>
On Wednesday, June 6, 2018 12:58:29 PM EDT Richard Guy Briggs wrote:
> Create a new audit record AUDIT_CONTAINER to document the audit
> container identifier of a process if it is present.
>
> Called from audit_log_exit(), syscalls are covered.
>
> A sample raw event:
> type=SYSCALL msg=audit(1519924845.499:257): arch=c000003e syscall=257
> success=yes exit=3 a0=ffffff9c a1=56374e1cef30 a2=241 a3=1b6 items=2
> ppid=606 pid=635 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0
> fsgid=0 tty=pts0 ses=3 comm="bash" exe="/usr/bin/bash"
> subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> key="tmpcontainerid" type=CWD msg=audit(1519924845.499:257): cwd="/root"
> type=PATH msg=audit(1519924845.499:257): item=0 name="/tmp/" inode=13863
> dev=00:27 mode=041777 ouid=0 ogid=0 rdev=00:00
> obj=system_u:object_r:tmp_t:s0 nametype= PARENT cap_fp=0000000000000000
> cap_fi=0000000000000000 cap_fe=0 cap_fver=0 type=PATH
> msg=audit(1519924845.499:257): item=1 name="/tmp/tmpcontainerid"
> inode=17729 dev=00:27 mode=0100644 ouid=0 ogid=0 rdev=00:00
> obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE
> cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
> type=PROCTITLE msg=audit(1519924845.499:257):
> proctitle=62617368002D6300736C65657020313B206563686F2074657374203E202F746D
> 702F746D70636F6E7461696E65726964 type=CONTAINER
> msg=audit(1519924845.499:257): op=task contid=123458
Ack for the audit record names.
-Steve
> See: https://github.com/linux-audit/audit-kernel/issues/90
> See: https://github.com/linux-audit/audit-userspace/issues/51
> See: https://github.com/linux-audit/audit-testsuite/issues/64
> See:
> https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> ---
> include/linux/audit.h | 7 +++++++
> include/uapi/linux/audit.h | 1 +
> kernel/audit.c | 23 +++++++++++++++++++++++
> kernel/auditsc.c | 3 +++
> 4 files changed, 34 insertions(+)
>
> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index 497cd81..4e1e34e 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -152,6 +152,9 @@ extern void audit_log_key(struct audit_buffer
*ab,
> extern int audit_log_task_context(struct audit_buffer *ab);
> extern void audit_log_task_info(struct audit_buffer *ab,
> struct task_struct *tsk);
> +extern int audit_log_contid(struct task_struct *tsk,
> + struct audit_context *context,
> + char *op);
>
> extern int audit_update_lsm_rules(void);
>
> @@ -202,6 +205,10 @@ static inline int audit_log_task_context(struct
> audit_buffer *ab) static inline void audit_log_task_info(struct
> audit_buffer *ab,
> struct task_struct *tsk)
> { }
> +static inline int audit_log_contid(struct task_struct *tsk,
> + struct audit_context *context,
> + char *op)
> +{ }
> #define audit_enabled 0
> #endif /* CONFIG_AUDIT */
>
> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> index c3b1aca..469ab25 100644
> --- a/include/uapi/linux/audit.h
> +++ b/include/uapi/linux/audit.h
> @@ -115,6 +115,7 @@
> #define AUDIT_REPLACE 1329 /* Replace auditd if this packet unanswerd
*/
> #define AUDIT_KERN_MODULE 1330 /* Kernel Module events */
> #define AUDIT_FANOTIFY 1331 /* Fanotify access decision */
> +#define AUDIT_CONTAINER 1332 /* Container ID */
>
> #define AUDIT_AVC 1400 /* SE Linux avc denial or grant */
> #define AUDIT_SELINUX_ERR 1401 /* Internal SE Linux Errors */
> diff --git a/kernel/audit.c b/kernel/audit.c
> index e7478cb..5e150c6 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -2048,6 +2048,29 @@ void audit_log_session_info(struct audit_buffer *ab)
> audit_log_format(ab, " auid=%u ses=%u", auid, sessionid);
> }
>
> +/*
> + * audit_log_contid - report container info
> + * @tsk: task to be recorded
> + * @context: task or local context for record
> + * @op: contid string description
> + */
> +int audit_log_contid(struct task_struct *tsk,
> + struct audit_context *context, char *op)
> +{
> + struct audit_buffer *ab;
> +
> + if (!audit_contid_set(tsk))
> + return 0;
> + /* Generate AUDIT_CONTAINER record with container ID */
> + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> + if (!ab)
> + return -ENOMEM;
> + audit_log_format(ab, "op=%s contid=%llu",
> + op, audit_get_contid(tsk));
> + audit_log_end(ab);
> + return 0;
> +}
> +
> void audit_log_key(struct audit_buffer *ab, char *key)
> {
> audit_log_format(ab, " key=");
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 611e926..a3c946c 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -1490,10 +1490,13 @@ static void audit_log_exit(struct audit_context
> *context, struct task_struct *ts
>
> audit_log_proctitle(tsk, context);
>
> + audit_log_contid(tsk, context, "task");
> +
> /* Send end of event record to help user space know we are finished */
> ab = audit_log_start(context, GFP_KERNEL, AUDIT_EOE);
> if (ab)
> audit_log_end(ab);
> +
> if (call_panic)
> audit_panic("error converting sid to string");
> }
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox