Netdev List
* Re: skb_splice_bits() and large chunks in pipe (was Re: xfs_file_splice_read: possible circular locking dependency detected
From: Linus Torvalds @ 2016-09-19  0:18 UTC (permalink / raw)
  To: Al Viro
  Cc: Jens Axboe, Nick Piggin, linux-fsdevel, Network Development,
	Eric Dumazet
In-Reply-To: <20160918223117.GH2356@ZenIV.linux.org.uk>

On Sun, Sep 18, 2016 at 3:31 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> What worries me is iov_iter_get_pages() and friends.

So honestly, if it worries you, I'm not going to complain at all if
you decide that you'd rather translate the pipe_buffer[] array into a
kvec by always splitting at page boundaries.

Even with large packets in networking, it's not going to be a huge
deal. And maybe we *should* make it a rule that a "kvec" is always
composed of individual entries that fit entirely within a page.

In this code, being safe rather than clever would be a welcome and
surprising change, I guess.

             Linus

^ permalink raw reply
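
A minimal userspace sketch of the "always split at page boundaries" idea from the message above. The names (sketch_seg, sketch_split_at_pages) and the fixed 4096-byte page size are assumptions made for illustration; this is not the pipe_buffer or kvec code itself.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/* Hypothetical stand-in for one kvec-like entry: a byte range. */
struct sketch_seg {
	size_t offset;	/* start of the range, in bytes */
	size_t len;	/* length of the range, in bytes */
};

/*
 * Split [offset, offset + len) so that no output segment crosses a
 * page boundary.  Returns the number of segments needed; at most
 * max_segs of them are stored in out[].
 */
static size_t sketch_split_at_pages(size_t offset, size_t len,
				    struct sketch_seg *out, size_t max_segs)
{
	size_t n = 0;

	assert(out != NULL || max_segs == 0);
	while (len > 0) {
		/* Bytes left in the page that contains 'offset'. */
		size_t room = SKETCH_PAGE_SIZE - (offset % SKETCH_PAGE_SIZE);
		size_t chunk = len < room ? len : room;

		if (n < max_segs) {
			out[n].offset = offset;
			out[n].len = chunk;
		}
		n++;
		offset += chunk;
		len -= chunk;
	}
	return n;
}
```

With this rule, a 5000-byte range starting at offset 4000 becomes three entries of 96, 4096 and 808 bytes, each of which fits entirely within one page.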

* Re: skb_splice_bits() and large chunks in pipe (was Re: xfs_file_splice_read: possible circular locking dependency detected
From: Al Viro @ 2016-09-19  0:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jens Axboe, Nick Piggin, linux-fsdevel, Network Development,
	Eric Dumazet
In-Reply-To: <20160918223117.GH2356@ZenIV.linux.org.uk>

On Sun, Sep 18, 2016 at 11:31:17PM +0100, Al Viro wrote:

> At the moment there are 11 callers (10 in mainline; one more added in
> conversion of vmsplice_to_pipe() to new pipe locking, but it's irrelevant
> anyway - it gets fed an iovec-backed iov_iter).  I'm looking through those
> right now, hopefully will come up with something sane...

FWIW, I wonder how many of those users are ready to cope with compound
pages in the first place; they end up passed to
	* skb_fill_page_desc().  Probably OK (as in all of them, modulo
calculating the number of pages and ranges for them).
	* shoved into scatterlist, which gets passed to virtqueue_add_sgs().
Need to check virtio to see what happens there.
	* shoved into nfs ->wb_page and fed into nfs_pageio_add_request() and
machinery behind it.  These, BTW, are reachable by pipe_buffer-derived ones
at the moment (splice to O_DIRECT nfs file).  The code looks like it's
playing fast and loose with ->wb_page - in some cases it's an NFS pagecache
one, in some - anything from userland, and there are places like
	inode = page_file_mapping(req->wb_page)->host;
which will do nasty things if they are ever reached by the second kind.
nfs_pgio_rpcsetup() looks like it won't be happy with compound pages, but
again, I'm not familiar enough with that code to tell if it's reachable
from nfs_pageio_add_request().
	* shoved into scatterlist, which gets fed into crypto/*.c machinery.
No way for a pipe_buffer stuff to get there, fortunately, because I would
be very surprised if it works correctly with compound pages and large
ranges in those.
	* shoved into lustre ->ldp_pages; almost certainly not ready for
compound pages.
	* fed to ceph_osd_data_pages_init(); again, practically certain not
to be ready.
	* put into dio_submit ->pages[], eventually fed to bio_add_page();
that might be fixable, but it would take some massage in fs/direct-io.c
	* fuse - probably OK, but that's only on a fairly cursory look.

It certainly won't be easy to verify in details ;-/

^ permalink raw reply

* RE: [PATCH 2/2] iw_cxgb4: add fast-path for small REG_MR operations
From: Steve Wise @ 2016-09-19  0:40 UTC (permalink / raw)
  To: 'Leon Romanovsky'
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20160918142242.GJ2923-2ukJVAZIZ/Y@public.gmane.org>

> On Fri, Sep 16, 2016 at 07:54:52AM -0700, Steve Wise wrote:
> > When processing a REG_MR work request, if fw supports the
> > FW_RI_NSMR_TPTE_WR work request, and if the page list for this
> > registration is <= 2 pages, and the current state of the mr is INVALID,
> > then use FW_RI_NSMR_TPTE_WR to pass down a fully populated TPTE for FW
> > to write.  This avoids FW having to do an async read of the TPTE blocking
> > the SQ until the read completes.
> >
> > To know if the current MR state is INVALID or not, iw_cxgb4 must track the
> > state of each fastreg MR.  The c4iw_mr struct state is updated as REG_MR
> > and LOCAL_INV WRs are posted and completed, when a reg_mr is destroyed,
> > and when RECV completions are processed that include a local invalidation.
> >
> > This optimization increases small IO IOPS for both iSER and NVMF.
> >
> > Signed-off-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
> > ---
> 
> <...>
> 
> > +			      struct ib_reg_wr *wr, struct c4iw_mr *mhp,
> > +			      u8 *len16)
> > +{
> > +	__be64 *p = (__be64 *)fr->pbl;
> > +
> > +	fr->r2 = cpu_to_be32(0);
> 
> Is there any difference between the line above and "fr->r2 = 0"?

It makes sparse happy, IIRC...




^ permalink raw reply
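
A small userspace sketch related to the question above: converting zero to big-endian yields zero on either byte order, so any difference between the two spellings is one of type annotation rather than of the stored value. The helper name sketch_cpu_to_be32 is an assumption; it stands in for the kernel's cpu_to_be32() by way of POSIX htonl().

```c
#include <arpa/inet.h>	/* htonl(): POSIX host-to-big-endian conversion */
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for cpu_to_be32(); an assumption of this sketch. */
static uint32_t sketch_cpu_to_be32(uint32_t v)
{
	return htonl(v);
}

/* Zero has the same representation in either byte order. */
static void sketch_check_zero(void)
{
	assert(sketch_cpu_to_be32(0) == 0);
}
```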

* Re: [PATCH net-next 2/3] r8152: support ECM mode
From: kbuild test robot @ 2016-09-19  0:43 UTC (permalink / raw)
  To: Hayes Wang
  Cc: kbuild-all, netdev, nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-217-Taiwan-albertk@realtek.com>

[-- Attachment #1: Type: text/plain, Size: 924 bytes --]

Hi Hayes,

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Hayes-Wang/r8152-configuration-setting/20160907-192351
config: x86_64-randconfig-s0-09190146 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/built-in.o: In function `rtl_ecm_bind':
   r8152.c:(.text+0x64b43c): undefined reference to `usbnet_cdc_bind'
   r8152.c:(.text+0x64b59e): undefined reference to `usbnet_cdc_unbind'
>> drivers/built-in.o:(.rodata+0x272438): undefined reference to `usbnet_cdc_unbind'
>> drivers/built-in.o:(.rodata+0x272460): undefined reference to `usbnet_cdc_status'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33233 bytes --]

^ permalink raw reply

* Re: [PATCH net-next 0/5] mlx4 misc fixes and improvements
From: David Miller @ 2016-09-19  2:00 UTC (permalink / raw)
  To: ttoukan.linux; +Cc: tariqt, netdev, eranbe
In-Reply-To: <80a8e8c5-689e-a2b1-e2d4-817dc0649385@gmail.com>

From: Tariq Toukan <ttoukan.linux@gmail.com>
Date: Sun, 18 Sep 2016 10:27:23 +0300

> Hi Dave,
> 
> On 16/09/2016 2:21 AM, David Miller wrote:
>> From: Tariq Toukan <tariqt@mellanox.com>
>> Date: Mon, 12 Sep 2016 16:20:11 +0300
>>
>>> This patchset contains some bug fixes, a cleanup, and small
>>> improvements
>>> from the team to the mlx4 Eth and core drivers.
>>>
>>> Series generated against net-next commit:
>>> 02154927c115 "net: dsa: bcm_sf2: Get VLAN_PORT_MASK from b53_device"
>>>
>>> Please push the following patch to -stable  >= 4.6 as well:
>>> "net/mlx4_core: Fix to clean devlink resources"
>> Again, coding style fixes and optimizations like branch prediction
>> hints are not bug fixes and therefore not appropriate for 'net'.
> Yes, I know. Please notice that it was submitted to net-next this
> time.

This is completely incompatible with a request for one of the changes
to go into -stable.

If the change is not in 'net', it can't go to -stable.

^ permalink raw reply

* Re: [PATCHv2 net 0/6] sctp: fix the transmit err process
From: David Miller @ 2016-09-19  2:03 UTC (permalink / raw)
  To: lucien.xin; +Cc: netdev, linux-sctp, marcelo.leitner, vyasevich, daniel
In-Reply-To: <cover.1473789537.git.lucien.xin@gmail.com>

From: Xin Long <lucien.xin@gmail.com>
Date: Wed, 14 Sep 2016 02:04:17 +0800

> This patchset is to improve the transmit err process and also fix some
> issues.
> 
> After this patchset, once the chunks are enqueued successfully, even
> if the chunks fail to send out, whether because of nodst or nomem,
> no err returns back to users any more. Instead, they are taken care
> of by retransmit.
> 
> v1->v2:
>   - add more details to the changelog in patch 1/6
>   - add Fixes: tag in patch 2/6, 3/6
>   - also revert 69b5777f2e57 in patch 3/6

Since SCTP's behavior has been like this for a long time I've applied
this series to net-next, thanks.

^ permalink raw reply

* Re: [PATCH next] sctp: make use of WORD_TRUNC macro
From: David Miller @ 2016-09-19  2:06 UTC (permalink / raw)
  To: marcelo.leitner; +Cc: netdev, linux-sctp, nhorman, vyasevich
In-Reply-To: <b40b1bd4adff2d389588d9a4710af0399ce68d1e.1473963064.git.marcelo.leitner@gmail.com>

From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Date: Thu, 15 Sep 2016 15:12:30 -0300

> No functional change. Just to avoid the usage of '&~3'.
> Also break the line to make it easier to read.

Your reply later in this thread:

	"to make sure it is correctly adapted to some arch if
	necessary. (even though it's not necessary in this case)"

is inconsistent with your commit log message.

If you think that the word size might possibly be different
on a given arch, then this is in fact a functional change.

This patch just adds ambiguity.  Whereas the existing code is explicit
about "multiple of 4" and there can be no confusion.

I'm not applying this, sorry.

^ permalink raw reply
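
For reference, a minimal sketch of what the explicit '&~3' form computes. The helper names are hypothetical and are not the kernel's WORD_TRUNC or WORD_ROUND definitions; the constant 4 is written out on purpose, so the result does not depend on any architecture's word size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Truncate len down to the previous multiple of 4 (same as len & ~3). */
static size_t sketch_trunc4(size_t len)
{
	return len & ~(size_t)3;
}

/* Round len up to the next multiple of 4. */
static size_t sketch_round4(size_t len)
{
	assert(len <= SIZE_MAX - 3);	/* avoid wrap-around in len + 3 */
	return (len + 3) & ~(size_t)3;
}
```

Both helpers leave a value that is already a multiple of 4 unchanged.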

* Re: [PATCH v2 net-next 0/7] net: ILA resolver and generic resolver backend
From: David Miller @ 2016-09-19  2:10 UTC (permalink / raw)
  To: tom; +Cc: netdev, kernel-team, roopa, tgraf
In-Reply-To: <1473974361-2275254-1-git-send-email-tom@herbertland.com>

From: Tom Herbert <tom@herbertland.com>
Date: Thu, 15 Sep 2016 14:19:14 -0700

> This patch set implements an ILA host side resolver. This uses LWT
> to implement the hook to a userspace resolver and tracks pending
> unresolved address using the backend net resolver.

Can you please repost this series with Herbert Xu properly CC:'d
since he maintains rhashtable and is making changes to it recently
which might conflict with what you are proposing here?

Thanks.

^ permalink raw reply

* Re: [PATCH v2 1/2] openvswitch: fix flow stats accounting when node 0 is not possible
From: David Miller @ 2016-09-19  2:14 UTC (permalink / raw)
  To: cascardo-H+wXaHxf7aLQT0dZR+AlfA
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
In-Reply-To: <1473977513-7617-1-git-send-email-cascardo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

From: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Date: Thu, 15 Sep 2016 19:11:52 -0300

> On a system with only node 1 as possible, all statistics are going to be
> accounted on node 0 as it will have a single writer.
> 
> However, when getting and clearing the statistics, node 0 is not going
> to be considered, as it's not a possible node.
> 
> Tested that statistics are not zero on a system with only node 1
> possible. Also compile-tested with CONFIG_NUMA off.
> 
> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>

Applied to net-next.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply

* Re: [PATCH v2 2/2] openvswitch: use percpu flow stats
From: David Miller @ 2016-09-19  2:14 UTC (permalink / raw)
  To: cascardo-H+wXaHxf7aLQT0dZR+AlfA
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w
In-Reply-To: <1473977513-7617-2-git-send-email-cascardo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

From: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Date: Thu, 15 Sep 2016 19:11:53 -0300

> Instead of using flow stats per NUMA node, use it per CPU. When using
> megaflows, the stats lock can be a bottleneck in scalability.
> 
> On a E5-2690 12-core system, usual throughput went from ~4Mpps to
> ~15Mpps when forwarding between two 40GbE ports with a single flow
> configured on the datapath.
> 
> This has been tested on a system with possible CPUs 0-7,16-23. After
> module removal, there was no corruption on the slab cache.
> 
> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>

Also applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH net-next] pkt_sched: fq: use proper locking in fq_dump_stats()
From: David Miller @ 2016-09-19  2:17 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1473981601.22679.64.camel@edumazet-glaptop3.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 15 Sep 2016 16:20:01 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> When fq is used on 32bit kernels, we need to lock the qdisc before
> copying 64bit fields.
> 
> Otherwise "tc -s qdisc ..." might report bogus values.
> 
> Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler")
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied.

^ permalink raw reply

* Re: [PATCH (net.git)] stmmac: fix PWRDWN into the PMT register for global unicast.
From: David Miller @ 2016-09-19  2:21 UTC (permalink / raw)
  To: peppe.cavallaro; +Cc: netdev, alexandre.torgue
In-Reply-To: <1474015813-5400-1-git-send-email-peppe.cavallaro@st.com>

From: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Date: Fri, 16 Sep 2016 10:50:13 +0200

> MAC devices use the RWKPKTEN and MGKPKTEN bits of the PMT Control/Status
> register to generate power management events.
> So this patch is to properly set the RWKPKTEN [BIT(2)] inside the
> PMT register (needed in case of global unicast).
> 
> Reported-by: Aditi SHARMA <aditi-hed.sharma@st.com>
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>

Applied.

^ permalink raw reply

* Re: [PATCH][V2] net: r6040: add in missing white space in error message text
From: David Miller @ 2016-09-19  2:22 UTC (permalink / raw)
  To: colin.king; +Cc: f.fainelli, netdev, linux-kernel
In-Reply-To: <20160916094338.23698-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Fri, 16 Sep 2016 10:43:38 +0100

> From: Colin Ian King <colin.king@canonical.com>
> 
> A couple of dev_err messages span two lines and the literal
> string is missing a white space between words. Add the white
> space and join the two lines into one.
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH net-next RESEND] xen-netfront: avoid packet loss when ethernet header crosses page boundary
From: David Miller @ 2016-09-19  2:26 UTC (permalink / raw)
  To: vkuznets
  Cc: netdev, linux-kernel, ptalbert, boris.ostrovsky, david.vrabel,
	jgross, xen-devel
In-Reply-To: <1474023554-24520-1-git-send-email-vkuznets@redhat.com>

From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Fri, 16 Sep 2016 12:59:14 +0200

> @@ -595,6 +596,19 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	offset = offset_in_page(skb->data);
>  	len = skb_headlen(skb);
>  
> +	/* The first req should be at least ETH_HLEN size or the packet will be
> +	 * dropped by netback.
> +	 */
> +	if (unlikely(PAGE_SIZE - offset < ETH_HLEN)) {
> +		nskb = skb_copy(skb, GFP_ATOMIC);
> +		if (!nskb)
> +			goto drop;
> +		dev_kfree_skb_any(skb);
> +		skb = nskb;
> +		page = virt_to_page(skb->data);
> +		offset = offset_in_page(skb->data);
> +	}
> +
>  	spin_lock_irqsave(&queue->tx_lock, flags);

I think you also have to recalculate 'len' in this case too, as
skb_headlen() will definitely be different for nskb.

In fact, I can't see how this code can work properly without that fix.

^ permalink raw reply
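
A userspace model of the review point above, under stated assumptions: the struct and function names are hypothetical, the page size is fixed at 4096, and nothing here is the xen-netfront code. It shows the page-boundary test from the quoted hunk, and derives offset and len in one place so that both always describe the same buffer after the original is replaced by a copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */
#define SKETCH_ETH_HLEN  14u	/* Ethernet header length */

/* Hypothetical model of the fields the transmit path looks at. */
struct sketch_buf {
	uintptr_t data;		/* address of the first header byte */
	size_t headlen;		/* bytes in the linear part */
};

struct sketch_req {
	size_t offset;		/* offset of data within its page */
	size_t len;		/* bytes covered by the first request */
};

/* True when fewer than ETH_HLEN bytes remain in the page holding data. */
static int sketch_header_crosses_page(const struct sketch_buf *b)
{
	size_t offset = b->data % SKETCH_PAGE_SIZE;

	return SKETCH_PAGE_SIZE - offset < SKETCH_ETH_HLEN;
}

/* Derive offset and len together from whichever buffer is current. */
static struct sketch_req sketch_first_req(const struct sketch_buf *b)
{
	struct sketch_req r;

	assert(b != NULL);
	r.offset = b->data % SKETCH_PAGE_SIZE;
	r.len = b->headlen;
	return r;
}
```

In this model, calling sketch_first_req() again on the copy refreshes len along with offset, which is the recalculation the review asks for.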

* Re: pull-request: mac80211 2016-09-16
From: David Miller @ 2016-09-19  2:27 UTC (permalink / raw)
  To: johannes; +Cc: netdev, linux-wireless
In-Reply-To: <1474030043-16094-1-git-send-email-johannes@sipsolutions.net>

From: Johannes Berg <johannes@sipsolutions.net>
Date: Fri, 16 Sep 2016 14:47:22 +0200

> Sorry - I know you only just pulled my tree for the previous fixes,
> but we found two more problems in the last few days; it'd be great
> to get those fixes in as well.
> 
> Let me know if there's any problem.

Pulled, thanks Johannes.

^ permalink raw reply

* Re: pull-request: mac80211-next 2016-09-16
From: David Miller @ 2016-09-19  2:30 UTC (permalink / raw)
  To: johannes; +Cc: netdev, linux-wireless
In-Reply-To: <1474030606-16664-1-git-send-email-johannes@sipsolutions.net>

From: Johannes Berg <johannes@sipsolutions.net>
Date: Fri, 16 Sep 2016 14:56:45 +0200

> And here's another set for net-next, it's been a month or so and we have a
> reasonably large number of patches (for a change, mostly because I cleaned
> up some WEP crypto thing and a few static checkers.)

Pulled, thanks.

^ permalink raw reply

* Re: [PATCH v3 net-next 16/16] tcp_bbr: add BBR congestion control
From: Neal Cardwell @ 2016-09-19  2:43 UTC (permalink / raw)
  To: Kenneth Klette Jonassen
  Cc: David Miller, Netdev, Van Jacobson, Yuchung Cheng,
	Nandita Dukkipati, Eric Dumazet, Soheil Hassas Yeganeh
In-Reply-To: <CA++eYdtWkMqT1zk_D00H1TciYb_4+aQ6-96YzG1n_h4LLk663g@mail.gmail.com>

On Sun, Sep 18, 2016 at 9:18 PM, Kenneth Klette Jonassen
<kennetkl@ifi.uio.no> wrote:
>> +static u64 bbr_rate_kbps(struct sock *sk, u64 rate)
>> +{
>> +       return bbr_rate_bytes_per_sec(sk, rate, BBR_UNIT) * 8 / 1000;
>
>
> Consider div_u64() here to keep all builds happy. :-) This adds __udivdi3()
> with the nxp powerpc toolchains I'm using.
>
> (Or have the one caller use bbr_rate_bytes_per_sec() instead.)

Thanks, Kenneth. We will fix this in the next revision (we like your
suggested approach of just using bbr_rate_bytes_per_sec()).

cheers,
neal

^ permalink raw reply
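
A sketch of the div_u64() suggestion discussed above. On 32-bit builds a plain 64-bit '/' can be lowered by the compiler to a library helper such as __udivdi3, which is what the report describes; the kernel's div_u64(u64 dividend, u32 divisor) avoids that. The userspace stand-in below is an assumption for illustration and simply divides; sketch_rate_kbps is a hypothetical name, not the BBR function.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's div_u64(); illustration only. */
static uint64_t sketch_div_u64(uint64_t dividend, uint32_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}

/* Convert a rate in bytes per second to kilobits per second. */
static uint64_t sketch_rate_kbps(uint64_t bytes_per_sec)
{
	return sketch_div_u64(bytes_per_sec * 8, 1000);
}
```

The alternative accepted in the reply, having the caller use the bytes-per-second value directly, removes the division altogether.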

* Re: [PATCH next] sctp: make use of WORD_TRUNC macro
From: Marcelo Ricardo Leitner @ 2016-09-19  3:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-sctp, nhorman, vyasevich
In-Reply-To: <20160918.220629.1190601768509139321.davem@davemloft.net>

Em 18-09-2016 23:06, David Miller escreveu:
> From: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> Date: Thu, 15 Sep 2016 15:12:30 -0300
>
>> No functional change. Just to avoid the usage of '&~3'.
>> Also break the line to make it easier to read.
>
> Your reply later in this thread:
>
> 	"to make sure it is correctly adapted to some arch if
> 	necessary. (even though it's not necessary in this case)"
>
> is inconsistent with your commit log message.
>
> If you think that the word size might possibly be different
> on a given arch, then this is in fact a functional change.
>

Alright, that was badly worded, sorry. I was not referring to the macro
specifically but speaking more generally: avoid hardcoded magic values,
nothing more.

> This patch just adds ambiguity.  Whereas the existing code is explicit
> about "multiple of 4" and there can be no confusion.
>

On the other hand, it brings the code closer to a standard. This is the
second-to-last occurrence of '~3' throughout the sctp code. There is only
one other spot left. All the others already use the WORD_ROUND or
WORD_TRUNC macros.

We can rename the macros, I agree they sound confusing. Proposing 
SCTP_ALIGN4 and SCTP_TRUNC4. Does that sound better? Then I'll send a 
patchset renaming and updating all remaining places.

> I'm not applying this, sorry.
>

^ permalink raw reply

* Re: [PATCH v4 09/16] IB/pvrdma: Add support for Completion Queues
From: Leon Romanovsky @ 2016-09-19  3:24 UTC (permalink / raw)
  To: Adit Ranadive
  Cc: Yuval Shaia, dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, pv-drivers,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Jorgen S. Hansen, Aditya Sarwade, George Zhang, Bryan Tan
In-Reply-To: <BLUPR0501MB83604E10410486A3F068950C5F50-84Rf5TRaNBMVDhIuTCx1aJLWcSx1hRipwIZJ9u9yWa8oOQlpcoRfSA@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 3395 bytes --]

On Sun, Sep 18, 2016 at 08:36:55PM +0000, Adit Ranadive wrote:
> On Sun, Sep 18, 2016 at 10:07:18 -0700, Leon Romanovsky wrote:
> > On Thu, Sep 15, 2016 at 10:36:12AM +0300, Yuval Shaia wrote:
> > > Hi Adit,
> > > Please see my comments inline.
> > >
> > > Besides that I have no more comment for this patch.
> > >
> > > Reviewed-by: Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> > >
> > > Yuval
> > >
> > > On Thu, Sep 15, 2016 at 12:07:29AM +0000, Adit Ranadive wrote:
> > > > On Wed, Sep 14, 2016 at 05:43:37 -0700, Yuval Shaia wrote:
> > > > > On Sun, Sep 11, 2016 at 09:49:19PM -0700, Adit Ranadive wrote:
> > > > > > +
> > > > > > +static int pvrdma_poll_one(struct pvrdma_cq *cq, struct pvrdma_qp **cur_qp,
> > > > > > +			   struct ib_wc *wc)
> > > > > > +{
> > > > > > +	struct pvrdma_dev *dev = to_vdev(cq->ibcq.device);
> > > > > > +	int has_data;
> > > > > > +	unsigned int head;
> > > > > > +	bool tried = false;
> > > > > > +	struct pvrdma_cqe *cqe;
> > > > > > +
> > > > > > +retry:
> > > > > > +	has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx,
> > > > > > +					    cq->ibcq.cqe, &head);
> > > > > > +	if (has_data == 0) {
> > > > > > +		if (tried)
> > > > > > +			return -EAGAIN;
> > > > > > +
> > > > > > +		/* Pass down POLL to give physical HCA a chance to poll. */
> > > > > > +		pvrdma_write_uar_cq(dev, cq->cq_handle | PVRDMA_UAR_CQ_POLL);
> > > > > > +
> > > > > > +		tried = true;
> > > > > > +		goto retry;
> > > > > > +	} else if (has_data == PVRDMA_INVALID_IDX) {
> > > > >
> > > > > I didn't go through the entire life cycle of RX-ring's head and tail but you
> > > > > need to make sure that the PVRDMA_INVALID_IDX error is a recoverable one, i.e.
> > > > > there is a probability that in the next call to pvrdma_poll_one it will be fine.
> > > > > Otherwise it is an endless loop.
> > > >
> > > > We have never run into this issue internally but I don't think we can recover here
> > >
> > > I briefly reviewed the life cycle of RX-ring's head and tail and didn't
> > > catch any suspicious place that might corrupt it.
> > > So glad to see that you never encountered this case.
> > >
> > > > in the driver. The only way to recover would be to destroy and recreate the CQ
> > > > which we shouldn't do since it could be used by multiple QPs.
> > >
> > > Agree.
> > > But don't they hit the same problem too?
> > >
> > > > We don't have a way yet to recover in the device. Once we add that this check
> > > > should go away.
> > >
> > > To be honest I have no idea how to do that - I was expecting driver vendors
> > > to come up with ideas :)
> > > I once came up with an idea to force a restart of the driver but it was
> > > rejected.
> > >
> > > >
> > > > The reason I returned an error value from poll_cq in v3 was to break the possible
> > > > loop so that it might give clients a chance to recover. But since poll_cq is not expected
> > > > to fail I just log the device error here. I can revert to that version if you want to break
> > > > the possible loop.
> > >
> > > Clients (ULPs) cannot recover from this case. They do not even check the
> > > reason for the error and treat any error as -EAGAIN.
> >
> > It is because poll_one is not expected to fail.
>
> Poll_one is an internal function in our driver. ULPs should still be okay I think as long as poll_cq
> does not fail, no?

Yes, I think so.

Thanks


^ permalink raw reply

* Re: [PATCH 2/2] iw_cxgb4: add fast-path for small REG_MR operations
From: Leon Romanovsky @ 2016-09-19  3:28 UTC (permalink / raw)
  To: Steve Wise; +Cc: dledford, davem, netdev, linux-rdma
In-Reply-To: <0bd501d2120e$6bf5aaf0$43e100d0$@opengridcomputing.com>

[-- Attachment #1: Type: text/plain, Size: 1354 bytes --]

On Sun, Sep 18, 2016 at 07:40:29PM -0500, Steve Wise wrote:
> > On Fri, Sep 16, 2016 at 07:54:52AM -0700, Steve Wise wrote:
> > > When processing a REG_MR work request, if fw supports the
> > > FW_RI_NSMR_TPTE_WR work request, and if the page list for this
> > > registration is <= 2 pages, and the current state of the mr is INVALID,
> > > then use FW_RI_NSMR_TPTE_WR to pass down a fully populated TPTE for FW
> > > to write.  This avoids FW having to do an async read of the TPTE blocking
> > > the SQ until the read completes.
> > >
> > > To know if the current MR state is INVALID or not, iw_cxgb4 must track the
> > > state of each fastreg MR.  The c4iw_mr struct state is updated as REG_MR
> > > and LOCAL_INV WRs are posted and completed, when a reg_mr is destroyed,
> > > and when RECV completions are processed that include a local invalidation.
> > >
> > > This optimization increases small IO IOPS for both iSER and NVMF.
> > >
> > > Signed-off-by: Steve Wise <swise@opengridcomputing.com>
> > > ---
> >
> > <...>
> >
> > > +			      struct ib_reg_wr *wr, struct c4iw_mr *mhp,
> > > +			      u8 *len16)
> > > +{
> > > +	__be64 *p = (__be64 *)fr->pbl;
> > > +
> > > +	fr->r2 = cpu_to_be32(0);
> >
> > Is there any difference between the line above and "fr->r2 = 0"?
>
> It makes sparse happy, IIRC...

Strange, but ok :)


^ permalink raw reply

* [PATCH 1/2] ptp_clock: allow for it to be optional
From: Nicolas Pitre @ 2016-09-19  3:51 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Richard Cochran, Josh Triplett, netdev,
	linux-kernel
In-Reply-To: <1474257070-4255-1-git-send-email-nicolas.pitre@linaro.org>

In order to break the hard dependency between the PTP clock subsystem and
ethernet drivers capable of being clock providers, this patch provides
simple PTP stub functions to allow linkage of those drivers into the
kernel even when the PTP subsystem is configured out.

And to make it possible for PTP to be configured out, the select statement
in the Kconfig entry for those ethernet drivers is changed from selecting
PTP_1588_CLOCK to PTP_1588_CLOCK_SELECTED whose purpose is to indicate the
default Kconfig value for the PTP subsystem.

This way the PTP subsystem may have Kconfig dependencies of its own, such
as POSIX_TIMERS, without making those ethernet drivers unavailable if
POSIX timers are configured out. And when support for POSIX timers is
selected again then PTP clock support will also be selected accordingly.

Drivers must be ready to accept NULL from ptp_clock_register().
The pch_gbe driver is a bit special as it relies on extra code in
drivers/ptp/ptp_pch.c. Therefore we let the make process descend into
drivers/ptp/ even if PTP_1588_CLOCK is unselected.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
---
 drivers/Makefile                                   |  2 +-
 drivers/net/ethernet/adi/Kconfig                   |  8 ++-
 drivers/net/ethernet/amd/Kconfig                   |  2 +-
 drivers/net/ethernet/amd/xgbe/xgbe-main.c          |  6 ++-
 drivers/net/ethernet/broadcom/Kconfig              |  4 +-
 drivers/net/ethernet/cavium/Kconfig                |  2 +-
 drivers/net/ethernet/freescale/Kconfig             |  2 +-
 drivers/net/ethernet/intel/Kconfig                 | 10 ++--
 drivers/net/ethernet/intel/e1000e/ptp.c            |  2 +-
 drivers/net/ethernet/intel/i40e/i40e_ptp.c         |  2 +-
 drivers/net/ethernet/intel/igb/igb_ptp.c           |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c       |  2 +-
 drivers/net/ethernet/mellanox/mlx4/Kconfig         |  2 +-
 drivers/net/ethernet/mellanox/mlx4/en_clock.c      |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig    |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_clock.c |  2 +-
 drivers/net/ethernet/renesas/Kconfig               |  2 +-
 drivers/net/ethernet/samsung/Kconfig               |  2 +-
 drivers/net/ethernet/sfc/Kconfig                   |  2 +-
 drivers/net/ethernet/sfc/ptp.c                     | 14 ++---
 drivers/net/ethernet/stmicro/stmmac/Kconfig        |  2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c   |  2 +-
 drivers/net/ethernet/ti/Kconfig                    |  2 +-
 drivers/net/ethernet/tile/Kconfig                  |  2 +-
 drivers/ptp/Kconfig                                | 12 +++--
 include/linux/ptp_clock_kernel.h                   | 59 +++++++++++++++-------
 26 files changed, 92 insertions(+), 59 deletions(-)

diff --git a/drivers/Makefile b/drivers/Makefile
index 53abb4a5f7..8a538d0856 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -105,7 +105,7 @@ obj-$(CONFIG_INPUT)		+= input/
 obj-$(CONFIG_RTC_LIB)		+= rtc/
 obj-y				+= i2c/ media/
 obj-$(CONFIG_PPS)		+= pps/
-obj-$(CONFIG_PTP_1588_CLOCK)	+= ptp/
+obj-y				+= ptp/
 obj-$(CONFIG_W1)		+= w1/
 obj-y				+= power/
 obj-$(CONFIG_HWMON)		+= hwmon/
diff --git a/drivers/net/ethernet/adi/Kconfig b/drivers/net/ethernet/adi/Kconfig
index 6b94ba6103..67094a9cfe 100644
--- a/drivers/net/ethernet/adi/Kconfig
+++ b/drivers/net/ethernet/adi/Kconfig
@@ -55,10 +55,14 @@ config BFIN_RX_DESC_NUM
 	---help---
 	  Set the number of buffer packets used in driver.
 
+config BFIN_MAC_HAS_HWSTAMP
+	def_tristate BFIN_MAC
+	depends on BF518
+	select PTP_1588_CLOCK_SELECTED
+
 config BFIN_MAC_USE_HWSTAMP
 	bool "Use IEEE 1588 hwstamp"
-	depends on BFIN_MAC && BF518
-	select PTP_1588_CLOCK
+	depends on BFIN_MAC_HAS_HWSTAMP && PTP_1588_CLOCK
 	default y
 	---help---
 	  To support the IEEE 1588 Precision Time Protocol (PTP), select y here
diff --git a/drivers/net/ethernet/amd/Kconfig b/drivers/net/ethernet/amd/Kconfig
index 0038709fd3..327e71a554 100644
--- a/drivers/net/ethernet/amd/Kconfig
+++ b/drivers/net/ethernet/amd/Kconfig
@@ -177,7 +177,7 @@ config AMD_XGBE
 	depends on ARM64 || COMPILE_TEST
 	select BITREVERSE
 	select CRC32
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	---help---
 	  This driver supports the AMD 10GbE Ethernet device found on an
 	  AMD SoC.
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-main.c b/drivers/net/ethernet/amd/xgbe/xgbe-main.c
index 3eee3201b5..4aeeb018b6 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-main.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-main.c
@@ -773,7 +773,8 @@ static int xgbe_probe(struct platform_device *pdev)
 		goto err_wq;
 	}
 
-	xgbe_ptp_register(pdata);
+	if (IS_REACHABLE(CONFIG_PTP_1588_CLOCK))
+		xgbe_ptp_register(pdata);
 
 	xgbe_debugfs_init(pdata);
 
@@ -812,7 +813,8 @@ static int xgbe_remove(struct platform_device *pdev)
 
 	xgbe_debugfs_exit(pdata);
 
-	xgbe_ptp_unregister(pdata);
+	if (IS_REACHABLE(CONFIG_PTP_1588_CLOCK))
+		xgbe_ptp_unregister(pdata);
 
 	flush_workqueue(pdata->an_workqueue);
 	destroy_workqueue(pdata->an_workqueue);
diff --git a/drivers/net/ethernet/broadcom/Kconfig b/drivers/net/ethernet/broadcom/Kconfig
index bd8c80c0b7..3db7eca92c 100644
--- a/drivers/net/ethernet/broadcom/Kconfig
+++ b/drivers/net/ethernet/broadcom/Kconfig
@@ -110,7 +110,7 @@ config TIGON3
 	depends on PCI
 	select PHYLIB
 	select HWMON
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	---help---
 	  This driver supports Broadcom Tigon3 based gigabit Ethernet cards.
 
@@ -120,7 +120,7 @@ config TIGON3
 config BNX2X
 	tristate "Broadcom NetXtremeII 10Gb support"
 	depends on PCI
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	select FW_LOADER
 	select ZLIB_INFLATE
 	select LIBCRC32C
diff --git a/drivers/net/ethernet/cavium/Kconfig b/drivers/net/ethernet/cavium/Kconfig
index e1b78b5003..5dd86cf683 100644
--- a/drivers/net/ethernet/cavium/Kconfig
+++ b/drivers/net/ethernet/cavium/Kconfig
@@ -53,7 +53,7 @@ config	THUNDER_NIC_RGX
 config LIQUIDIO
 	tristate "Cavium LiquidIO support"
 	depends on 64BIT
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	select FW_LOADER
 	select LIBCRC32C
 	---help---
diff --git a/drivers/net/ethernet/freescale/Kconfig b/drivers/net/ethernet/freescale/Kconfig
index d1ca45fbb1..775dbfeb7b 100644
--- a/drivers/net/ethernet/freescale/Kconfig
+++ b/drivers/net/ethernet/freescale/Kconfig
@@ -25,7 +25,7 @@ config FEC
 		   ARCH_MXC || SOC_IMX28)
 	default ARCH_MXC || SOC_IMX28 if ARM
 	select PHYLIB
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	---help---
 	  Say Y here if you want to use the built-in 10/100 Fast ethernet
 	  controller on some Motorola ColdFire and Freescale i.MX processors.
diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index c0e17433f6..8bbfb43f09 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -58,7 +58,7 @@ config E1000E
 	tristate "Intel(R) PRO/1000 PCI-Express Gigabit Ethernet support"
 	depends on PCI && (!SPARC32 || BROKEN)
 	select CRC32
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	---help---
 	  This driver supports the PCI-Express Intel(R) PRO/1000 gigabit
 	  ethernet family of adapters. For PCI or PCI-X e1000 adapters,
@@ -83,7 +83,7 @@ config E1000E_HWTS
 config IGB
 	tristate "Intel(R) 82575/82576 PCI-Express Gigabit Ethernet support"
 	depends on PCI
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	select I2C
 	select I2C_ALGOBIT
 	---help---
@@ -156,7 +156,7 @@ config IXGBE
 	tristate "Intel(R) 10GbE PCI Express adapters support"
 	depends on PCI
 	select MDIO
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	---help---
 	  This driver supports Intel(R) 10GbE PCI Express family of
 	  adapters.  For more information on how to identify your adapter, go
@@ -213,7 +213,7 @@ config IXGBEVF
 
 config I40E
 	tristate "Intel(R) Ethernet Controller XL710 Family support"
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	depends on PCI
 	---help---
 	  This driver supports Intel(R) Ethernet Controller XL710 Family of
@@ -264,7 +264,7 @@ config FM10K
 	tristate "Intel(R) FM10000 Ethernet Switch Host Interface Support"
 	default n
 	depends on PCI_MSI
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	---help---
 	  This driver supports Intel(R) FM10000 Ethernet Switch Host
 	  Interface.  For more information on how to identify your adapter,
diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c b/drivers/net/ethernet/intel/e1000e/ptp.c
index 2e1b17ad52..ad03763e00 100644
--- a/drivers/net/ethernet/intel/e1000e/ptp.c
+++ b/drivers/net/ethernet/intel/e1000e/ptp.c
@@ -334,7 +334,7 @@ void e1000e_ptp_init(struct e1000_adapter *adapter)
 	if (IS_ERR(adapter->ptp_clock)) {
 		adapter->ptp_clock = NULL;
 		e_err("ptp_clock_register failed\n");
-	} else {
+	} else if (adapter->ptp_clock) {
 		e_info("registered PHC clock\n");
 	}
 }
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ptp.c b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
index ed39cbad24..f1feceab75 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ptp.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ptp.c
@@ -669,7 +669,7 @@ void i40e_ptp_init(struct i40e_pf *pf)
 		pf->ptp_clock = NULL;
 		dev_err(&pf->pdev->dev, "%s: ptp_clock_register failed\n",
 			__func__);
-	} else {
+	} else if (pf->ptp_clock) {
 		struct timespec64 ts;
 		u32 regval;
 
diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index 66dfa2085c..1dd14e166d 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -1159,7 +1159,7 @@ void igb_ptp_init(struct igb_adapter *adapter)
 	if (IS_ERR(adapter->ptp_clock)) {
 		adapter->ptp_clock = NULL;
 		dev_err(&adapter->pdev->dev, "ptp_clock_register failed\n");
-	} else {
+	} else if (adapter->ptp_clock) {
 		dev_info(&adapter->pdev->dev, "added PHC on %s\n",
 			 adapter->netdev->name);
 		adapter->ptp_flags |= IGB_PTP_ENABLED;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
index e5431bfe33..a92277683a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
@@ -1254,7 +1254,7 @@ static long ixgbe_ptp_create_clock(struct ixgbe_adapter *adapter)
 		adapter->ptp_clock = NULL;
 		e_dev_err("ptp_clock_register failed\n");
 		return err;
-	} else
+	} else if (adapter->ptp_clock)
 		e_dev_info("registered PHC device on %s\n", netdev->name);
 
 	/* set default timestamp mode to disabled here. We do this in
diff --git a/drivers/net/ethernet/mellanox/mlx4/Kconfig b/drivers/net/ethernet/mellanox/mlx4/Kconfig
index 5098e7f219..b2998bc5ab 100644
--- a/drivers/net/ethernet/mellanox/mlx4/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx4/Kconfig
@@ -7,7 +7,7 @@ config MLX4_EN
 	depends on MAY_USE_DEVLINK
 	depends on PCI
 	select MLX4_CORE
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	---help---
 	  This driver supports Mellanox Technologies ConnectX Ethernet
 	  devices.
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_clock.c b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
index 1494997c4f..08fc5fc56d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_clock.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_clock.c
@@ -298,7 +298,7 @@ void mlx4_en_init_timestamp(struct mlx4_en_dev *mdev)
 	if (IS_ERR(mdev->ptp_clock)) {
 		mdev->ptp_clock = NULL;
 		mlx4_err(mdev, "ptp_clock_register failed\n");
-	} else {
+	} else if (mdev->ptp_clock) {
 		mlx4_info(mdev, "registered PHC clock\n");
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
index aae46884bf..0d679346dc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
@@ -14,7 +14,7 @@ config MLX5_CORE
 config MLX5_CORE_EN
 	bool "Mellanox Technologies ConnectX-4 Ethernet support"
 	depends on NETDEVICES && ETHERNET && PCI && MLX5_CORE
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	default n
 	---help---
 	  Ethernet support in Mellanox Technologies ConnectX-4 NIC.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
index 847a8f3ac2..13dc388667 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_clock.c
@@ -273,7 +273,7 @@ void mlx5e_timestamp_init(struct mlx5e_priv *priv)
 
 	tstamp->ptp = ptp_clock_register(&tstamp->ptp_info,
 					 &priv->mdev->pdev->dev);
-	if (IS_ERR_OR_NULL(tstamp->ptp)) {
+	if (IS_ERR(tstamp->ptp)) {
 		mlx5_core_warn(priv->mdev, "ptp_clock_register failed %ld\n",
 			       PTR_ERR(tstamp->ptp));
 		tstamp->ptp = NULL;
diff --git a/drivers/net/ethernet/renesas/Kconfig b/drivers/net/ethernet/renesas/Kconfig
index 4f132cf177..6862a9c1b6 100644
--- a/drivers/net/ethernet/renesas/Kconfig
+++ b/drivers/net/ethernet/renesas/Kconfig
@@ -37,7 +37,7 @@ config RAVB
 	select MII
 	select MDIO_BITBANG
 	select PHYLIB
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	help
 	  Renesas Ethernet AVB device driver.
 	  This driver supports the following SoCs:
diff --git a/drivers/net/ethernet/samsung/Kconfig b/drivers/net/ethernet/samsung/Kconfig
index 2360d81507..121b7e4426 100644
--- a/drivers/net/ethernet/samsung/Kconfig
+++ b/drivers/net/ethernet/samsung/Kconfig
@@ -21,7 +21,7 @@ config SXGBE_ETH
 	depends on HAS_IOMEM && HAS_DMA
 	select PHYLIB
 	select CRC32
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	---help---
 	  This is the driver for the SXGBE 10G Ethernet IP block found on
 	  Samsung platforms.
diff --git a/drivers/net/ethernet/sfc/Kconfig b/drivers/net/ethernet/sfc/Kconfig
index 4dd92b7b80..472152bd72 100644
--- a/drivers/net/ethernet/sfc/Kconfig
+++ b/drivers/net/ethernet/sfc/Kconfig
@@ -5,7 +5,7 @@ config SFC
 	select CRC32
 	select I2C
 	select I2C_ALGOBIT
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	---help---
 	  This driver supports 10/40-gigabit Ethernet cards based on
 	  the Solarflare SFC4000, SFC9000-family and SFC9100-family
diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c
index dd204d9704..77a5364f7a 100644
--- a/drivers/net/ethernet/sfc/ptp.c
+++ b/drivers/net/ethernet/sfc/ptp.c
@@ -1269,13 +1269,13 @@ int efx_ptp_probe(struct efx_nic *efx, struct efx_channel *channel)
 		if (IS_ERR(ptp->phc_clock)) {
 			rc = PTR_ERR(ptp->phc_clock);
 			goto fail3;
-		}
-
-		INIT_WORK(&ptp->pps_work, efx_ptp_pps_worker);
-		ptp->pps_workwq = create_singlethread_workqueue("sfc_pps");
-		if (!ptp->pps_workwq) {
-			rc = -ENOMEM;
-			goto fail4;
+		} else if (ptp->phc_clock) {
+			INIT_WORK(&ptp->pps_work, efx_ptp_pps_worker);
+			ptp->pps_workwq = create_singlethread_workqueue("sfc_pps");
+			if (!ptp->pps_workwq) {
+				rc = -ENOMEM;
+				goto fail4;
+			}
 		}
 	}
 	ptp->nic_ts_enabled = false;
diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index 8f06a6621a..578e5a15cb 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -4,7 +4,7 @@ config STMMAC_ETH
 	select MII
 	select PHYLIB
 	select CRC32
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	select RESET_CONTROLLER
 	---help---
 	  This is the driver for the Ethernet IPs are built around a
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c
index 170a18b612..6e3b82972c 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c
@@ -187,7 +187,7 @@ int stmmac_ptp_register(struct stmmac_priv *priv)
 	if (IS_ERR(priv->ptp_clock)) {
 		priv->ptp_clock = NULL;
 		pr_err("ptp_clock_register() failed on %s\n", priv->dev->name);
-	} else
+	} else if (priv->ptp_clock)
 		pr_debug("Added PTP HW clock successfully on %s\n",
 			 priv->dev->name);
 
diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
index 9904d740d5..bc895114c9 100644
--- a/drivers/net/ethernet/ti/Kconfig
+++ b/drivers/net/ethernet/ti/Kconfig
@@ -76,7 +76,7 @@ config TI_CPSW
 config TI_CPTS
 	bool "TI Common Platform Time Sync (CPTS) Support"
 	depends on TI_CPSW
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	---help---
 	  This driver supports the Common Platform Time Sync unit of
 	  the CPSW Ethernet Switch. The unit can time stamp PTP UDP/IPv4
diff --git a/drivers/net/ethernet/tile/Kconfig b/drivers/net/ethernet/tile/Kconfig
index f59a6c2653..b6ba43c78c 100644
--- a/drivers/net/ethernet/tile/Kconfig
+++ b/drivers/net/ethernet/tile/Kconfig
@@ -9,7 +9,7 @@ config TILE_NET
 	select CRC32
 	select TILE_GXIO_MPIPE if TILEGX
 	select HIGH_RES_TIMERS if TILEGX
-	select PTP_1588_CLOCK if TILEGX
+	select PTP_1588_CLOCK_SELECTED if TILEGX
 	---help---
 	  This is a standard Linux network device driver for the
 	  on-chip Tilera Gigabit Ethernet and XAUI interfaces.
diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
index ee3de3421f..f34b3748c0 100644
--- a/drivers/ptp/Kconfig
+++ b/drivers/ptp/Kconfig
@@ -4,8 +4,12 @@
 
 menu "PTP clock support"
 
+config PTP_1588_CLOCK_SELECTED
+	tristate
+
 config PTP_1588_CLOCK
 	tristate "PTP clock support"
+	default PTP_1588_CLOCK_SELECTED
 	depends on NET
 	select PPS
 	select NET_PTP_CLASSIFY
@@ -28,7 +32,7 @@ config PTP_1588_CLOCK
 config PTP_1588_CLOCK_GIANFAR
 	tristate "Freescale eTSEC as PTP clock"
 	depends on GIANFAR
-	select PTP_1588_CLOCK
+	depends on PTP_1588_CLOCK
 	default y
 	help
 	  This driver adds support for using the eTSEC as a PTP
@@ -42,7 +46,7 @@ config PTP_1588_CLOCK_GIANFAR
 config PTP_1588_CLOCK_IXP46X
 	tristate "Intel IXP46x as PTP clock"
 	depends on IXP4XX_ETH
-	select PTP_1588_CLOCK
+	depends on PTP_1588_CLOCK
 	default y
 	help
 	  This driver adds support for using the IXP46X as a PTP
@@ -60,7 +64,7 @@ config DP83640_PHY
 	tristate "Driver for the National Semiconductor DP83640 PHYTER"
 	depends on NETWORK_PHY_TIMESTAMPING
 	depends on PHYLIB
-	select PTP_1588_CLOCK
+	depends on PTP_1588_CLOCK
 	---help---
 	  Supports the DP83640 PHYTER with IEEE 1588 features.
 
@@ -76,7 +80,7 @@ config PTP_1588_CLOCK_PCH
 	tristate "Intel PCH EG20T as PTP clock"
 	depends on X86_32 || COMPILE_TEST
 	depends on HAS_IOMEM && NET
-	select PTP_1588_CLOCK
+	select PTP_1588_CLOCK_SELECTED
 	help
 	  This driver adds support for using the PCH EG20T as a PTP
 	  clock. The hardware supports time stamping of PTP packets
diff --git a/include/linux/ptp_clock_kernel.h b/include/linux/ptp_clock_kernel.h
index 6b15e16814..4c29eb8e53 100644
--- a/include/linux/ptp_clock_kernel.h
+++ b/include/linux/ptp_clock_kernel.h
@@ -122,24 +122,6 @@ struct ptp_clock_info {
 
 struct ptp_clock;
 
-/**
- * ptp_clock_register() - register a PTP hardware clock driver
- *
- * @info:   Structure describing the new clock.
- * @parent: Pointer to the parent device of the new clock.
- */
-
-extern struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info,
-					    struct device *parent);
-
-/**
- * ptp_clock_unregister() - unregister a PTP hardware clock driver
- *
- * @ptp:  The clock to remove from service.
- */
-
-extern int ptp_clock_unregister(struct ptp_clock *ptp);
-
 
 enum ptp_clock_events {
 	PTP_CLOCK_ALARM,
@@ -166,6 +148,31 @@ struct ptp_clock_event {
 	};
 };
 
+#if IS_REACHABLE(CONFIG_PTP_1588_CLOCK)
+
+/**
+ * ptp_clock_register() - register a PTP hardware clock driver
+ *
+ * @info:   Structure describing the new clock.
+ * @parent: Pointer to the parent device of the new clock.
+ *
+ * Returns a valid pointer on success or PTR_ERR on failure.  If PHC
+ * support is missing at the configuration level, this function
+ * returns NULL, and drivers are expected to gracefully handle that
+ * case separately.
+ */
+
+extern struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info,
+					    struct device *parent);
+
+/**
+ * ptp_clock_unregister() - unregister a PTP hardware clock driver
+ *
+ * @ptp:  The clock to remove from service.
+ */
+
+extern int ptp_clock_unregister(struct ptp_clock *ptp);
+
 /**
  * ptp_clock_event() - notify the PTP layer about an event
  *
@@ -197,4 +204,20 @@ extern int ptp_clock_index(struct ptp_clock *ptp);
 int ptp_find_pin(struct ptp_clock *ptp,
 		 enum ptp_pin_function func, unsigned int chan);
 
+#else
+static inline struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info,
+						   struct device *parent)
+{ return NULL; }
+static inline int ptp_clock_unregister(struct ptp_clock *ptp)
+{ return 0; }
+static inline void ptp_clock_event(struct ptp_clock *ptp,
+				   struct ptp_clock_event *event)
+{ (void)event; }
+static inline int ptp_clock_index(struct ptp_clock *ptp)
+{ return -1; }
+static inline int ptp_find_pin(struct ptp_clock *ptp,
+			       enum ptp_pin_function func, unsigned int chan)
+{ return -1; }
+#endif
+
 #endif
-- 
2.7.4


* [PATCH 0/2] make POSIX timers configurable
From: Nicolas Pitre @ 2016-09-19  3:51 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Richard Cochran, Josh Triplett, netdev,
	linux-kernel

Many embedded systems don't need the full POSIX timer support.
Configuring them out provides a nice kernel image size reduction.

When POSIX timers are configured out, the PTP clock subsystem should be
left out as well. However, a number of Ethernet drivers currently *select*
it in their Kconfig entries, so some more tweaks were needed to break
that hard dependency and let those drivers still be configured in if
desired.
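
As a rough illustration of the driver-side change in patch 1/2, here is a
user-space sketch of the three outcomes a driver now has to tell apart
after ptp_clock_register(): an error pointer, NULL (PHC support not
configured) and a valid clock. Everything prefixed demo_ is a hypothetical
stand-in written for this sketch; it mimics the kernel's ERR_PTR
convention but is not kernel code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space mocks of the kernel's ERR_PTR helpers. */
#define DEMO_MAX_ERRNO 4095

static inline void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int demo_is_err(const void *p)
{
	/* The top DEMO_MAX_ERRNO addresses encode negative errno values. */
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_clock { int index; };

enum demo_outcome { DEMO_NO_PHC, DEMO_REGISTERED, DEMO_FAILED };

/*
 * 'result' plays the role of the value returned by ptp_clock_register();
 * '*slot' plays the role of the driver's ptp_clock field.
 */
static enum demo_outcome demo_ptp_init(struct demo_clock *result,
				       struct demo_clock **slot)
{
	*slot = result;
	if (demo_is_err(*slot)) {
		/* Registration failed: report it and carry on without a PHC. */
		fprintf(stderr, "register failed: %ld\n", demo_ptr_err(*slot));
		*slot = NULL;
		return DEMO_FAILED;
	} else if (*slot) {
		/* Only claim success when a clock really exists. */
		printf("registered PHC clock\n");
		return DEMO_REGISTERED;
	}
	/* NULL: PHC support is compiled out, which is not an error. */
	return DEMO_NO_PHC;
}
```

The point of the "else if" is that the NULL case stays silent: the driver
keeps working, just without a PTP hardware clock.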

It was agreed that the best path upstream for those patches is via
John Stultz's timer tree.

Previous iterations of those patches and the discussion threads can be
found here:

  https://lkml.org/lkml/2016/9/14/992

  https://lkml.org/lkml/2016/9/14/803

  https://lkml.org/lkml/2016/9/8/793

diffstat:

 drivers/Makefile                                |   2 +-
 drivers/net/ethernet/adi/Kconfig                |   8 +-
 drivers/net/ethernet/amd/Kconfig                |   2 +-
 drivers/net/ethernet/amd/xgbe/xgbe-main.c       |   6 +-
 drivers/net/ethernet/broadcom/Kconfig           |   4 +-
 drivers/net/ethernet/cavium/Kconfig             |   2 +-
 drivers/net/ethernet/freescale/Kconfig          |   2 +-
 drivers/net/ethernet/intel/Kconfig              |  10 +-
 drivers/net/ethernet/intel/e1000e/ptp.c         |   2 +-
 drivers/net/ethernet/intel/i40e/i40e_ptp.c      |   2 +-
 drivers/net/ethernet/intel/igb/igb_ptp.c        |   2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c    |   2 +-
 drivers/net/ethernet/mellanox/mlx4/Kconfig      |   2 +-
 drivers/net/ethernet/mellanox/mlx4/en_clock.c   |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig |   2 +-
 .../net/ethernet/mellanox/mlx5/core/en_clock.c  |   2 +-
 drivers/net/ethernet/renesas/Kconfig            |   2 +-
 drivers/net/ethernet/samsung/Kconfig            |   2 +-
 drivers/net/ethernet/sfc/Kconfig                |   2 +-
 drivers/net/ethernet/sfc/ptp.c                  |  14 +--
 drivers/net/ethernet/stmicro/stmmac/Kconfig     |   2 +-
 .../net/ethernet/stmicro/stmmac/stmmac_ptp.c    |   2 +-
 drivers/net/ethernet/ti/Kconfig                 |   2 +-
 drivers/net/ethernet/tile/Kconfig               |   2 +-
 drivers/ptp/Kconfig                             |  14 ++-
 include/linux/posix-timers.h                    |  28 ++++-
 include/linux/ptp_clock_kernel.h                |  59 ++++++---
 include/linux/sched.h                           |  10 ++
 init/Kconfig                                    |  17 +++
 kernel/signal.c                                 |   4 +
 kernel/time/Kconfig                             |   1 +
 kernel/time/Makefile                            |  10 +-
 kernel/time/posix-stubs.c                       | 118 ++++++++++++++++++
 33 files changed, 277 insertions(+), 64 deletions(-)


* [PATCH 2/2] posix-timers: make it configurable
From: Nicolas Pitre @ 2016-09-19  3:51 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Richard Cochran, Josh Triplett, netdev,
	linux-kernel
In-Reply-To: <1474257070-4255-1-git-send-email-nicolas.pitre@linaro.org>

Many embedded systems don't need POSIX timers.  Configuring them out
removes about 22KB from the kernel binary size on ARM.

The corresponding syscalls are routed to a stub that logs the attempt to
use them, which should be enough of a clue if they were disabled without
proper consideration. They are: timer_create, timer_gettime,
timer_getoverrun, timer_settime, timer_delete and clock_adjtime.

The clock_settime, clock_gettime, clock_getres and clock_nanosleep syscalls
are replaced by simple wrappers compatible with CLOCK_REALTIME,
CLOCK_MONOTONIC and CLOCK_BOOTTIME only.
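
For illustration only, the clock filtering done by those wrappers can be
modelled in user space roughly as below. The demo_ names are hypothetical
and exist only in this sketch; the clock id values are the usual Linux
ones, redefined locally so the fragment stands alone. Note that in the
stubs clock_settime accepts CLOCK_REALTIME only.

```c
#include <errno.h>

/* Linux clock ids, redefined here so the sketch is self-contained. */
enum {
	DEMO_CLOCK_REALTIME  = 0,
	DEMO_CLOCK_MONOTONIC = 1,
	DEMO_CLOCK_BOOTTIME  = 7,
};

/* Which of the four remaining clock_* syscalls is being modelled. */
enum demo_call { DEMO_SETTIME, DEMO_GETTIME, DEMO_GETRES, DEMO_NANOSLEEP };

/*
 * Returns 0 if the stub would accept 'which_clock' for 'call',
 * or -EINVAL if it would reject it.
 */
static int demo_stub_clock_check(enum demo_call call, int which_clock)
{
	/* Only the realtime clock can be set. */
	if (call == DEMO_SETTIME)
		return which_clock == DEMO_CLOCK_REALTIME ? 0 : -EINVAL;

	/* gettime, getres and nanosleep share the same three clocks. */
	switch (which_clock) {
	case DEMO_CLOCK_REALTIME:
	case DEMO_CLOCK_MONOTONIC:
	case DEMO_CLOCK_BOOTTIME:
		return 0;
	default:
		return -EINVAL;
	}
}
```

Any other clock id, such as the per-process CPU-time clocks, ends up in
the -EINVAL branch in this model.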

Signed-off-by: Nicolas Pitre <nico@linaro.org>
---
 drivers/ptp/Kconfig          |   2 +-
 include/linux/posix-timers.h |  28 +++++++++-
 include/linux/sched.h        |  10 ++++
 init/Kconfig                 |  17 +++++++
 kernel/signal.c              |   4 ++
 kernel/time/Kconfig          |   1 +
 kernel/time/Makefile         |  10 +++-
 kernel/time/posix-stubs.c    | 118 +++++++++++++++++++++++++++++++++++++++++++
 8 files changed, 185 insertions(+), 5 deletions(-)
 create mode 100644 kernel/time/posix-stubs.c

diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
index f34b3748c0..940fa10907 100644
--- a/drivers/ptp/Kconfig
+++ b/drivers/ptp/Kconfig
@@ -10,7 +10,7 @@ config PTP_1588_CLOCK_SELECTED
 config PTP_1588_CLOCK
 	tristate "PTP clock support"
 	default PTP_1588_CLOCK_SELECTED
-	depends on NET
+	depends on NET && POSIX_TIMERS
 	select PPS
 	select NET_PTP_CLASSIFY
 	help
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 62d44c1760..2288c5c557 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -118,6 +118,8 @@ struct k_clock {
 extern struct k_clock clock_posix_cpu;
 extern struct k_clock clock_posix_dynamic;
 
+#ifdef CONFIG_POSIX_TIMERS
+
 void posix_timers_register_clock(const clockid_t clock_id, struct k_clock *new_clock);
 
 /* function to call to trigger timer event */
@@ -131,8 +133,30 @@ void posix_cpu_timers_exit_group(struct task_struct *task);
 void set_process_cpu_timer(struct task_struct *task, unsigned int clock_idx,
 			   cputime_t *newval, cputime_t *oldval);
 
-long clock_nanosleep_restart(struct restart_block *restart_block);
-
 void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);
 
+#else
+
+#include <linux/random.h>
+
+static inline void posix_timers_register_clock(const clockid_t clock_id,
+					       struct k_clock *new_clock) {}
+static inline int posix_timer_event(struct k_itimer *timr, int si_private)
+{ return 0; }
+static inline void run_posix_cpu_timers(struct task_struct *task) {}
+static inline void posix_cpu_timers_exit(struct task_struct *task)
+{
+	add_device_randomness((const void*) &task->se.sum_exec_runtime,
+			      sizeof(unsigned long long));
+}
+static inline void posix_cpu_timers_exit_group(struct task_struct *task) {}
+static inline void set_process_cpu_timer(struct task_struct *task,
+		unsigned int clock_idx, cputime_t *newval, cputime_t *oldval) {}
+static inline void update_rlimit_cpu(struct task_struct *task,
+				     unsigned long rlim_new) {}
+
+#endif
+
+long clock_nanosleep_restart(struct restart_block *restart_block);
+
 #endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 54182d52a0..39a1d6d3f5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2924,8 +2924,13 @@ static inline void exit_thread(struct task_struct *tsk)
 extern void exit_files(struct task_struct *);
 extern void __cleanup_sighand(struct sighand_struct *);
 
+#ifdef CONFIG_POSIX_TIMERS
 extern void exit_itimers(struct signal_struct *);
 extern void flush_itimer_signals(void);
+#else
+static inline void exit_itimers(struct signal_struct *s) {}
+static inline void flush_itimer_signals(void) {}
+#endif
 
 extern void do_group_exit(int);
 
@@ -3382,7 +3387,12 @@ static __always_inline bool need_resched(void)
  * Thread group CPU time accounting.
  */
 void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times);
+#ifdef CONFIG_POSIX_TIMERS
 void thread_group_cputimer(struct task_struct *tsk, struct task_cputime *times);
+#else
+static inline void thread_group_cputimer(struct task_struct *tsk,
+					 struct task_cputime *times) {}
+#endif
 
 /*
  * Reevaluate whether the task has signals pending delivery.
diff --git a/init/Kconfig b/init/Kconfig
index a117738afd..3fdea723dd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1449,6 +1449,23 @@ config SYSCTL_SYSCALL
 
 	  If unsure say N here.
 
+config POSIX_TIMERS
+	bool "Posix Clocks & timers" if EXPERT
+	default y
+	help
+	  This adds native support for POSIX timers to the kernel.
+	  Many embedded systems have no use for them and therefore they
+	  can be configured out to reduce the size of the kernel image.
+
+	  When this option is disabled, the following syscalls won't be
+	  available: timer_create, timer_gettime, timer_getoverrun,
+	  timer_settime, timer_delete, clock_adjtime. Furthermore, the
+	  clock_settime, clock_gettime, clock_getres and clock_nanosleep
+	  syscalls will be limited to CLOCK_REALTIME, CLOCK_MONOTONIC
+	  and CLOCK_BOOTTIME only.
+
+	  If unsure say y.
+
 config KALLSYMS
 	 bool "Load all symbols for debugging/ksymoops" if EXPERT
 	 default y
diff --git a/kernel/signal.c b/kernel/signal.c
index af21afc00d..ea75065e29 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -427,6 +427,7 @@ void flush_signals(struct task_struct *t)
 	spin_unlock_irqrestore(&t->sighand->siglock, flags);
 }
 
+#ifdef CONFIG_POSIX_TIMERS
 static void __flush_itimer_signals(struct sigpending *pending)
 {
 	sigset_t signal, retain;
@@ -460,6 +461,7 @@ void flush_itimer_signals(void)
 	__flush_itimer_signals(&tsk->signal->shared_pending);
 	spin_unlock_irqrestore(&tsk->sighand->siglock, flags);
 }
+#endif
 
 void ignore_signals(struct task_struct *t)
 {
@@ -611,6 +613,7 @@ int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
 		 */
 		current->jobctl |= JOBCTL_STOP_DEQUEUED;
 	}
+#ifdef CONFIG_POSIX_TIMERS
 	if ((info->si_code & __SI_MASK) == __SI_TIMER && info->si_sys_private) {
 		/*
 		 * Release the siglock to ensure proper locking order
@@ -622,6 +625,7 @@ int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
 		do_schedule_next_timer(info);
 		spin_lock(&tsk->sighand->siglock);
 	}
+#endif
 	return signr;
 }
 
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 62824f2fe4..a3817ef652 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -195,3 +195,4 @@ config HIGH_RES_TIMERS
 
 endmenu
 endif
+
diff --git a/kernel/time/Makefile b/kernel/time/Makefile
index 49eca0beed..fc26c308f5 100644
--- a/kernel/time/Makefile
+++ b/kernel/time/Makefile
@@ -1,6 +1,12 @@
-obj-y += time.o timer.o hrtimer.o itimer.o posix-timers.o posix-cpu-timers.o
+obj-y += time.o timer.o hrtimer.o itimer.o
 obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o
-obj-y += timeconv.o timecounter.o posix-clock.o alarmtimer.o
+obj-y += timeconv.o timecounter.o alarmtimer.o
+
+ifeq ($(CONFIG_POSIX_TIMERS),y)
+ obj-y += posix-timers.o posix-cpu-timers.o posix-clock.o
+else
+ obj-y += posix-stubs.o
+endif
 
 obj-$(CONFIG_GENERIC_CLOCKEVENTS)		+= clockevents.o tick-common.o
 ifeq ($(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST),y)
diff --git a/kernel/time/posix-stubs.c b/kernel/time/posix-stubs.c
new file mode 100644
index 0000000000..fe857bd4a0
--- /dev/null
+++ b/kernel/time/posix-stubs.c
@@ -0,0 +1,118 @@
+/*
+ * Dummy stubs used when CONFIG_POSIX_TIMERS=n
+ *
+ * Created by:  Nicolas Pitre, July 2016
+ * Copyright:   (C) 2016 Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/errno.h>
+#include <linux/syscalls.h>
+#include <linux/ktime.h>
+#include <linux/timekeeping.h>
+#include <linux/posix-timers.h>
+
+asmlinkage long sys_ni_posix_timers(void)
+{
+	pr_err_once("process %d (%s) attempted a POSIX timer syscall "
+		    "while CONFIG_POSIX_TIMERS is not set\n",
+		    current->pid, current->comm);
+	return -ENOSYS;
+}
+
+#define SYS_NI(name)  SYSCALL_ALIAS(sys_##name, sys_ni_posix_timers)
+
+SYS_NI(timer_create);
+SYS_NI(timer_gettime);
+SYS_NI(timer_getoverrun);
+SYS_NI(timer_settime);
+SYS_NI(timer_delete);
+SYS_NI(clock_adjtime);
+
+/*
+ * We preserve minimal support for CLOCK_REALTIME and CLOCK_MONOTONIC
+ * as it is easy to remain compatible with little code. CLOCK_BOOTTIME
+ * is also included for convenience as at least systemd uses it.
+ */
+
+SYSCALL_DEFINE2(clock_settime, const clockid_t, which_clock,
+		const struct timespec __user *, tp)
+{
+	struct timespec new_tp;
+
+	if (which_clock != CLOCK_REALTIME)
+		return -EINVAL;
+	if (copy_from_user(&new_tp, tp, sizeof (*tp)))
+		return -EFAULT;
+	return do_sys_settimeofday(&new_tp, NULL);
+}
+
+SYSCALL_DEFINE2(clock_gettime, const clockid_t, which_clock,
+		struct timespec __user *, tp)
+{
+	struct timespec kernel_tp;
+
+	switch (which_clock) {
+	case CLOCK_REALTIME: ktime_get_real_ts(&kernel_tp); break;
+	case CLOCK_MONOTONIC: ktime_get_ts(&kernel_tp); break;
+	case CLOCK_BOOTTIME: get_monotonic_boottime(&kernel_tp); break;
+	default: return -EINVAL;
+	}
+	if (copy_to_user(tp, &kernel_tp, sizeof (kernel_tp)))
+		return -EFAULT;
+	return 0;
+}
+
+SYSCALL_DEFINE2(clock_getres, const clockid_t, which_clock, struct timespec __user *, tp)
+{
+	struct timespec rtn_tp = {
+		.tv_sec = 0,
+		.tv_nsec = hrtimer_resolution,
+	};
+
+	switch (which_clock) {
+	case CLOCK_REALTIME:
+	case CLOCK_MONOTONIC:
+	case CLOCK_BOOTTIME:
+		if (copy_to_user(tp, &rtn_tp, sizeof(rtn_tp)))
+			return -EFAULT;
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
+SYSCALL_DEFINE4(clock_nanosleep, const clockid_t, which_clock, int, flags,
+		const struct timespec __user *, rqtp,
+		struct timespec __user *, rmtp)
+{
+	struct timespec t;
+
+	switch (which_clock) {
+	case CLOCK_REALTIME:
+	case CLOCK_MONOTONIC:
+	case CLOCK_BOOTTIME:
+		if (copy_from_user(&t, rqtp, sizeof (struct timespec)))
+			return -EFAULT;
+		if (!timespec_valid(&t))
+			return -EINVAL;
+		return hrtimer_nanosleep(&t, rmtp, flags & TIMER_ABSTIME ?
+					 HRTIMER_MODE_ABS : HRTIMER_MODE_REL,
+					 which_clock);
+	default:
+		return -EINVAL;
+	}
+}
+
+#ifdef CONFIG_COMPAT
+long clock_nanosleep_restart(struct restart_block *restart_block)
+{
+	return hrtimer_nanosleep_restart(restart_block);
+}
+#endif
-- 
2.7.4


* Re: [patch net-next v10 0/3] return offloaded stats as default and expose original sw stats
From: David Miller @ 2016-09-19  4:56 UTC (permalink / raw)
  To: jiri
  Cc: netdev, nogahf, idosch, eladr, yotamg, ogerlitz, roopa, nikolay,
	linville, tgraf, gospo, sfeldma, sd, eranbe, ast, edumazet,
	hannes, f.fainelli, dsa
In-Reply-To: <1474031138-2065-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Fri, 16 Sep 2016 15:05:35 +0200

> The problem we try to handle is about offloaded forwarded packets
> which are not seen by kernel. Let me try to draw it:
> 
>     port1                       port2 (HW stats are counted here)
>       \                          /
>        \                        /
>         \                      /
>          --(A)---- ASIC --(B)--
>                     |
>                    (C)
>                     |
>                    CPU (SW stats are counted here)
> 
> 
> Now we have couple of flows for TX and RX (direction does not matter here):
> 
> 1) port1->A->ASIC->C->CPU
> 
>    For this flow, HW and SW stats are equal.
> 
> 2) port1->A->ASIC->C->CPU->C->ASIC->B->port2
> 
>    For this flow, HW and SW stats are equal.
> 
> 3) port1->A->ASIC->B->port2
> 
>    For this flow, SW stats are 0.
> 
> The purpose of this patchset is to provide facility for user to
> find out the difference between flows 1+2 and 3. In other words, user
> will be able to see the statistics for the slow-path (through kernel).
 ...
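
The arithmetic implied by the three flows quoted above can be sketched as
follows, assuming one hardware counter and one software counter per port.
The struct and function names are hypothetical and are not taken from the
patchset. Flows 1 and 2 are counted in both places and flow 3 only in
hardware, so the difference is the traffic that never reached the CPU.

```c
#include <stdint.h>

/* Hypothetical per-port counters for this sketch. */
struct demo_port_stats {
	uint64_t hw_packets;	/* every packet seen by the ASIC */
	uint64_t sw_packets;	/* packets that went through the CPU */
};

/* Packets forwarded purely in hardware (flow 3). */
static uint64_t demo_offloaded_packets(const struct demo_port_stats *s)
{
	/* Guard against a transiently larger software counter. */
	if (s->hw_packets < s->sw_packets)
		return 0;
	return s->hw_packets - s->sw_packets;
}
```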

Series applied, thanks Jiri.


