Netdev List
* Re: [PATCH] vxlan: Fix error that was resulting in VXLAN MTU size being 10 bytes too large
From: Stephen Hemminger @ 2012-11-19 16:03 UTC (permalink / raw)
  To: Joseph Glanville; +Cc: David Miller, alexander.h.duyck, netdev
In-Reply-To: <CAOzFzEiuhqiNiiij9P+sh8_ypyXN4zABeRB6tzH2TQqVsNHjCA@mail.gmail.com>

On Mon, 19 Nov 2012 22:33:50 +1100
Joseph Glanville <joseph.glanville@orionvm.com.au> wrote:

> On 14 November 2012 08:33, Stephen Hemminger <shemminger@vyatta.com> wrote:
> > On Tue, 13 Nov 2012 14:37:19 -0500 (EST)
> > David Miller <davem@davemloft.net> wrote:
> >
> >> From: Alexander Duyck <alexander.h.duyck@intel.com>
> >> Date: Fri, 09 Nov 2012 15:35:24 -0800
> >>
> >> > This change fixes an issue I found where VXLAN frames were fragmented when
> >> > they were up to the VXLAN MTU size.  I root caused the issue to the fact that
> >> > the headroom was 4 + 20 + 8 + 8.  This math doesn't appear to be correct
> >> > because we are not inserting a VLAN header, but instead a 2nd Ethernet header.
> >> > As such the math for the overhead should be 20 + 8 + 8 + 14 to account for the
> >> > extra headers that are inserted for VXLAN.
> >> >
> >> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> >>
> >> Applied, thanks for the detailed commit message.
> >
> > Probably need smarter code there to look at the header length requirement
> > of the underlying device as well; maybe someone will be perverse and run
> > vxlan over a tunnel or IPoIB.
> 
> Forgive my ignorance but why would running VXLAN on IPoIB require
> special header handling? (and would it work or behave strangely?)
> 
> I was planning on giving this a go when 3.7 is released but I might do
> that sooner if problems are anticipated.
> 
> 
> Joseph.
> 

Some lower layers require bigger (or smaller) headers. As it was, vxlan
was only allocating the skb with a fixed amount of headroom. This would lead to
lower layers having to copy the skb.

My suggestion has already been addressed by a later patch.
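
For reference, the arithmetic from the quoted commit message can be written
out as below. This is only a sketch: the constant and function names are made
up for illustration and are not the identifiers used in the vxlan driver, and
it assumes an outer IPv4 header without options.

```c
/* Illustrative header sizes in bytes; all names here are hypothetical. */
enum {
	OUTER_IPV4_HLEN = 20,	/* outer IPv4 header, no options */
	OUTER_UDP_HLEN  = 8,	/* outer UDP header */
	VXLAN_HDR_LEN   = 8,	/* VXLAN header */
	INNER_ETH_HLEN  = 14,	/* encapsulated (second) Ethernet header */
	VLAN_TAG_LEN    = 4,	/* what the old math counted instead */
};

/* Overhead as computed before the fix: 4 + 20 + 8 + 8. */
static unsigned int vxlan_overhead_old(void)
{
	return VLAN_TAG_LEN + OUTER_IPV4_HLEN + OUTER_UDP_HLEN + VXLAN_HDR_LEN;
}

/* Overhead after the fix: 20 + 8 + 8 + 14. */
static unsigned int vxlan_overhead_fixed(void)
{
	return OUTER_IPV4_HLEN + OUTER_UDP_HLEN + VXLAN_HDR_LEN + INNER_ETH_HLEN;
}

/* MTU left for the vxlan device given the lower device's MTU. */
static unsigned int vxlan_mtu(unsigned int lower_mtu, unsigned int overhead)
{
	return lower_mtu > overhead ? lower_mtu - overhead : 0;
}
```

With a 1500 byte lower MTU the old math leaves 1460 and the fixed math leaves
1450, which is the 10 byte difference named in the subject line.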


* Re: [PATCH v2 net-next] sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call
From: Vlad Yasevich @ 2012-11-19 16:01 UTC (permalink / raw)
  To: Michele Baldessari
  Cc: linux-sctp, Neil Horman, Thomas Graf, netdev, David S. Miller
In-Reply-To: <1352991680-12289-1-git-send-email-michele@acksyn.org>

On 11/15/2012 10:01 AM, Michele Baldessari wrote:
> The current SCTP stack is lacking a mechanism to have per association
> statistics. This is an implementation modeled after OpenSolaris'
> SCTP_GET_ASSOC_STATS.
>
> Userspace part will follow on lksctp if/when there is a general ACK on
> this.
>
> V2)
>    - Implement partial retrieval of stat struct to cope for future expansion
>    - Kill the rtxpackets counter as it cannot be precise anyway
>    - Rename outseqtsns to outofseqtsns to make it clearer that these are out
>      of sequence unexpected TSNs
>    - Move asoc->ipackets++ under a lock to avoid potential miscounts
>    - Fold asoc->opackets++ into the already existing asoc check
>    - Kill unneeded (q->asoc) test when increasing rtxchunks
>    - Do not count octrlchunks if sending failed (SCTP_XMIT_OK != 0)
>    - Don't count SHUTDOWNs as SACKs
>    - Move SCTP_GET_ASSOC_STATS to the private space API
>    - Adjust the len check in sctp_getsockopt_assoc_stats() to allow for
>      future struct growth
>    - Move association statistics in their own struct
>    - Update idupchunks when we send a SACK with dup TSNs
>    - return min_rto in max_rto when RTO has not changed. Also return the
>      transport when max_rto last changed.
>
> Signed-off: Michele Baldessari <michele@acksyn.org>
> Acked-by: Thomas Graf <tgraf@suug.ch>
> ---
>   include/net/sctp/sctp.h    | 12 ++++++++
>   include/net/sctp/structs.h | 36 +++++++++++++++++++++++
>   include/net/sctp/user.h    | 27 +++++++++++++++++
>   net/sctp/associola.c       |  9 +++++-
>   net/sctp/endpointola.c     |  5 +++-
>   net/sctp/input.c           |  2 ++
>   net/sctp/output.c          | 14 +++++----
>   net/sctp/outqueue.c        | 12 ++++++--
>   net/sctp/sm_make_chunk.c   |  5 ++--
>   net/sctp/sm_sideeffect.c   |  1 +
>   net/sctp/sm_statefuns.c    | 10 +++++--
>   net/sctp/socket.c          | 72 ++++++++++++++++++++++++++++++++++++++++++++++
>   net/sctp/transport.c       |  2 ++
>   13 files changed, 194 insertions(+), 13 deletions(-)
>
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index 9c6414f..7fdf298 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -272,6 +272,18 @@ struct sctp_mib {
>           unsigned long   mibs[SCTP_MIB_MAX];
>   };
>
> +/* helper function to track stats about max rto and related transport */
> +static inline void sctp_max_rto(struct sctp_association *asoc,
> +				struct sctp_transport *trans)
> +{
> +	if (asoc->stats.max_obs_rto < (__u64)trans->rto) {
> +		asoc->stats.max_obs_rto = trans->rto;
> +		memset(&asoc->stats.obs_rto_ipaddr, 0,
> +			sizeof(struct sockaddr_storage));
> +		memcpy(&asoc->stats.obs_rto_ipaddr, &trans->ipaddr,
> +			trans->af_specific->sockaddr_len);
> +	}
> +}
>
>   /* Print debugging messages.  */
>   #if SCTP_DEBUG
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 2b2f61d..bf94ddf 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -1312,6 +1312,40 @@ struct sctp_inithdr_host {
>   	__u32 initial_tsn;
>   };
>
> +/* SCTP_GET_ASSOC_STATS counters */
> +struct sctp_priv_assoc_stats {
> +	/* Maximum observed rto in the association. Value is set to
> +	 * 0 when read and the max rto did not change. The transport where
> +	 * the max_rto was observed is returned in obs_rto_ipaddr
> +	 */
> +	struct sockaddr_storage obs_rto_ipaddr;
> +	__u64 max_obs_rto;
> +	__u64 max_prev_obs_rto;
> +	/* Total In and Out SACKs received and sent */
> +	__u64 isacks;
> +	__u64 osacks;
> +	/* Total In and Out packets received and sent */
> +	__u64 opackets;
> +	__u64 ipackets;
> +	/* Total retransmitted chunks */
> +	__u64 rtxchunks;
> +	/* TSN received > next expected */
> +	__u64 outofseqtsns;
> +	/* Duplicate Chunks received */
> +	__u64 idupchunks;
> +	/* Gap Ack Blocks received */
> +	__u64 gapcnt;
> +	/* Unordered data chunks sent and received */
> +	__u64 ouodchunks;
> +	__u64 iuodchunks;
> +	/* Ordered data chunks sent and received */
> +	__u64 oodchunks;
> +	__u64 iodchunks;
> +	/* Control chunks sent and received */
> +	__u64 octrlchunks;
> +	__u64 ictrlchunks;
> +};
> +
>   /* RFC2960
>    *
>    * 12. Recommended Transmission Control Block (TCB) Parameters
> @@ -1830,6 +1864,8 @@ struct sctp_association {
>
>   	__u8 need_ecne:1,	/* Need to send an ECNE Chunk? */
>   	     temp:1;		/* Is it a temporary association? */
> +
> +	struct sctp_priv_assoc_stats stats;
>   };
>
>
> diff --git a/include/net/sctp/user.h b/include/net/sctp/user.h
> index 1b02d7a..c851e72 100644
> --- a/include/net/sctp/user.h
> +++ b/include/net/sctp/user.h
> @@ -107,6 +107,7 @@ typedef __s32 sctp_assoc_t;
>   #define SCTP_GET_LOCAL_ADDRS	109		/* Get all local address. */
>   #define SCTP_SOCKOPT_CONNECTX	110		/* CONNECTX requests. */
>   #define SCTP_SOCKOPT_CONNECTX3	111	/* CONNECTX requests (updated) */
> +#define SCTP_GET_ASSOC_STATS	112	/* Read only */
>
>   /*
>    * 5.2.1 SCTP Initiation Structure (SCTP_INIT)
> @@ -719,6 +720,32 @@ struct sctp_getaddrs {
>   	__u8			addrs[0]; /*output, variable size*/
>   };
>
> +/* A socket user request obtained via SCTP_GET_ASSOC_STATS that retrieves
> + * association stats. All stats are counts except sas_maxrto and
> + * sas_obs_rto_ipaddr. maxrto is the max observed rto + transport since
> + * the last call. Will return 0 when did not change since last call
> + */
> +struct sctp_assoc_stats {
> +	sctp_assoc_t	sas_assoc_id;    /* Input */
> +					 /* Transport of observed max RTO */
> +	struct sockaddr_storage sas_obs_rto_ipaddr;
> +	__u64		sas_maxrto;      /* Maximum Observed RTO for period */
> +	__u64		sas_isacks;	 /* SACKs received */
> +	__u64		sas_osacks;	 /* SACKs sent */
> +	__u64		sas_opackets;	 /* Packets sent */
> +	__u64		sas_ipackets;	 /* Packets received */
> +	__u64		sas_rtxchunks;   /* Retransmitted Chunks */
> +	__u64		sas_outofseqtsns;/* TSN received > next expected */
> +	__u64		sas_idupchunks;  /* Dups received (ordered+unordered) */
> +	__u64		sas_gapcnt;      /* Gap Acknowledgements Received */
> +	__u64		sas_ouodchunks;  /* Unordered data chunks sent */
> +	__u64		sas_iuodchunks;  /* Unordered data chunks received */
> +	__u64		sas_oodchunks;	 /* Ordered data chunks sent */
> +	__u64		sas_iodchunks;	 /* Ordered data chunks received */
> +	__u64		sas_octrlchunks; /* Control chunks sent */
> +	__u64		sas_ictrlchunks; /* Control chunks received */
> +};
> +
>   /* These are bit fields for msghdr->msg_flags.  See section 5.1.  */
>   /* On user space Linux, these live in <bits/socket.h> as an enum.  */
>   enum sctp_msg_flags {
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index b1ef3bc..afb7522 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -321,6 +321,9 @@ static struct sctp_association *sctp_association_init(struct sctp_association *a
>   	asoc->default_timetolive = sp->default_timetolive;
>   	asoc->default_rcv_context = sp->default_rcv_context;
>
> +	/* SCTP_GET_ASSOC_STATS COUNTERS */
> +	memset(&asoc->stats, 0, sizeof(struct sctp_priv_assoc_stats));
> +
>   	/* AUTH related initializations */
>   	INIT_LIST_HEAD(&asoc->endpoint_shared_keys);
>   	err = sctp_auth_asoc_copy_shkeys(ep, asoc, gfp);
> @@ -760,6 +763,7 @@ struct sctp_transport *sctp_assoc_add_peer(struct sctp_association *asoc,
>
>   	/* Set the transport's RTO.initial value */
>   	peer->rto = asoc->rto_initial;
> +	sctp_max_rto(asoc, peer);
>
>   	/* Set the peer's active state. */
>   	peer->state = peer_state;
> @@ -1152,8 +1156,11 @@ static void sctp_assoc_bh_rcv(struct work_struct *work)
>   		 */
>   		if (sctp_chunk_is_data(chunk))
>   			asoc->peer.last_data_from = chunk->transport;
> -		else
> +		else {
>   			SCTP_INC_STATS(net, SCTP_MIB_INCTRLCHUNKS);
> +			if (chunk->chunk_hdr->type == SCTP_CID_SACK)
> +				asoc->stats.isacks++;
> +		}

Should the above include asoc->stats.ictrlchunks++; just like ep_bh_rcv()?

>
>   		if (chunk->transport)
>   			chunk->transport->last_time_heard = jiffies;
> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
> index 1859e2b..32ab55b 100644
> --- a/net/sctp/endpointola.c
> +++ b/net/sctp/endpointola.c
> @@ -480,8 +480,11 @@ normal:
>   		 */
>   		if (asoc && sctp_chunk_is_data(chunk))
>   			asoc->peer.last_data_from = chunk->transport;
> -		else
> +		else {
>   			SCTP_INC_STATS(sock_net(ep->base.sk), SCTP_MIB_INCTRLCHUNKS);
> +			if (asoc)
> +				asoc->stats.ictrlchunks++;
> +		}
>
>   		if (chunk->transport)
>   			chunk->transport->last_time_heard = jiffies;
> diff --git a/net/sctp/input.c b/net/sctp/input.c
> index 8bd3c27..54c449b 100644
> --- a/net/sctp/input.c
> +++ b/net/sctp/input.c
> @@ -281,6 +281,8 @@ int sctp_rcv(struct sk_buff *skb)
>   		SCTP_INC_STATS_BH(net, SCTP_MIB_IN_PKT_SOFTIRQ);
>   		sctp_inq_push(&chunk->rcvr->inqueue, chunk);
>   	}
> +	if (asoc)
> +		asoc->stats.ipackets++;
>
>   	sctp_bh_unlock_sock(sk);

This needs a bit more thought.  The current counting behaves differently
depending on whether or not the user holds the socket lock.
If the user holds the lock, we'll end up counting the packet before it is
processed.  If the user isn't holding the lock, we'll count the packet
after it is processed.

>
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 4e90188bf..f5200a2 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -311,6 +311,8 @@ static sctp_xmit_t __sctp_packet_append_chunk(struct sctp_packet *packet,
>
>   	    case SCTP_CID_SACK:
>   		packet->has_sack = 1;
> +		if (chunk->asoc)
> +			chunk->asoc->stats.osacks++;
>   		break;
>
>   	    case SCTP_CID_AUTH:
> @@ -584,11 +586,13 @@ int sctp_packet_transmit(struct sctp_packet *packet)
>   	 */
>
>   	/* Dump that on IP!  */
> -	if (asoc && asoc->peer.last_sent_to != tp) {
> -		/* Considering the multiple CPU scenario, this is a
> -		 * "correcter" place for last_sent_to.  --xguo
> -		 */
> -		asoc->peer.last_sent_to = tp;
> +	if (asoc) {
> +		asoc->stats.opackets++;
> +		if (asoc->peer.last_sent_to != tp)
> +			/* Considering the multiple CPU scenario, this is a
> +			 * "correcter" place for last_sent_to.  --xguo
> +			 */
> +			asoc->peer.last_sent_to = tp;
>   	}
>
>   	if (has_data) {
> diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
> index 1b4a7f8..379c81d 100644
> --- a/net/sctp/outqueue.c
> +++ b/net/sctp/outqueue.c
> @@ -667,6 +667,7 @@ redo:
>   				chunk->fast_retransmit = SCTP_DONT_FRTX;
>
>   			q->empty = 0;
> +			q->asoc->stats.rtxchunks++;
>   			break;
>   		}
>
> @@ -876,12 +877,14 @@ static int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout)
>   			if (status  != SCTP_XMIT_OK) {
>   				/* put the chunk back */
>   				list_add(&chunk->list, &q->control_chunk_list);
> -			} else if (chunk->chunk_hdr->type == SCTP_CID_FWD_TSN) {
> +			} else {
> +				asoc->stats.octrlchunks++;
>   				/* PR-SCTP C5) If a FORWARD TSN is sent, the
>   				 * sender MUST assure that at least one T3-rtx
>   				 * timer is running.
>   				 */
> -				sctp_transport_reset_timers(transport);
> +				if (chunk->chunk_hdr->type == SCTP_CID_FWD_TSN)
> +					sctp_transport_reset_timers(transport);
>   			}
>   			break;
>
> @@ -1055,6 +1058,10 @@ static int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout)
>   				 */
>   				if (asoc->state == SCTP_STATE_SHUTDOWN_PENDING)
>   					chunk->chunk_hdr->flags |= SCTP_DATA_SACK_IMM;
> +				if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED)
> +					asoc->stats.ouodchunks++;
> +				else
> +					asoc->stats.oodchunks++;
>
>   				break;
>
> @@ -1162,6 +1169,7 @@ int sctp_outq_sack(struct sctp_outq *q, struct sctp_chunk *chunk)
>
>   	sack_ctsn = ntohl(sack->cum_tsn_ack);
>   	gap_ack_blocks = ntohs(sack->num_gap_ack_blocks);
> +	asoc->stats.gapcnt += gap_ack_blocks;
>   	/*
>   	 * SFR-CACC algorithm:
>   	 * On receipt of a SACK the sender SHOULD execute the
> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
> index fbe1636..eb7633f 100644
> --- a/net/sctp/sm_make_chunk.c
> +++ b/net/sctp/sm_make_chunk.c
> @@ -804,10 +804,11 @@ struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
>   				 gabs);
>
>   	/* Add the duplicate TSN information.  */
> -	if (num_dup_tsns)
> +	if (num_dup_tsns) {
> +		aptr->stats.idupchunks += num_dup_tsns;
>   		sctp_addto_chunk(retval, sizeof(__u32) * num_dup_tsns,
>   				 sctp_tsnmap_get_dups(map));
> -
> +	}
>   	/* Once we have a sack generated, check to see what our sack
>   	 * generation is, if its 0, reset the transports to 0, and reset
>   	 * the association generation to 1
> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
> index 6eecf7e..363727e 100644
> --- a/net/sctp/sm_sideeffect.c
> +++ b/net/sctp/sm_sideeffect.c
> @@ -542,6 +542,7 @@ static void sctp_do_8_2_transport_strike(sctp_cmd_seq_t *commands,
>   	 */
>   	if (!is_hb || transport->hb_sent) {
>   		transport->rto = min((transport->rto * 2), transport->asoc->rto_max);
> +		sctp_max_rto(asoc, transport);
>   	}
>   }
>
> diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
> index b6adef8..ecf7a17 100644
> --- a/net/sctp/sm_statefuns.c
> +++ b/net/sctp/sm_statefuns.c
> @@ -6127,6 +6127,8 @@ static int sctp_eat_data(const struct sctp_association *asoc,
>   		/* The TSN is too high--silently discard the chunk and
>   		 * count on it getting retransmitted later.
>   		 */
> +		if (chunk->asoc)
> +			chunk->asoc->stats.outofseqtsns++;
>   		return SCTP_IERROR_HIGH_TSN;
>   	} else if (tmp > 0) {
>   		/* This is a duplicate.  Record it.  */
> @@ -6226,10 +6228,14 @@ static int sctp_eat_data(const struct sctp_association *asoc,
>   	/* Note: Some chunks may get overcounted (if we drop) or overcounted
>   	 * if we renege and the chunk arrives again.
>   	 */
> -	if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED)
> +	if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED) {
>   		SCTP_INC_STATS(net, SCTP_MIB_INUNORDERCHUNKS);
> -	else {
> +		if (chunk->asoc)
> +			chunk->asoc->stats.iuodchunks++;
> +	} else {
>   		SCTP_INC_STATS(net, SCTP_MIB_INORDERCHUNKS);
> +		if (chunk->asoc)
> +			chunk->asoc->stats.iodchunks++;
>   		ordered = 1;
>   	}
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 15379ac..8113249 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -609,6 +609,7 @@ static int sctp_send_asconf_add_ip(struct sock		*sk,
>   				    2*asoc->pathmtu, 4380));
>   				trans->ssthresh = asoc->peer.i.a_rwnd;
>   				trans->rto = asoc->rto_initial;
> +				sctp_max_rto(asoc, trans);
>   				trans->rtt = trans->srtt = trans->rttvar = 0;
>   				sctp_transport_route(trans, NULL,
>   				    sctp_sk(asoc->base.sk));
> @@ -5633,6 +5634,74 @@ static int sctp_getsockopt_paddr_thresholds(struct sock *sk,
>   	return 0;
>   }
>
> +/*
> + * SCTP_GET_ASSOC_STATS
> + *
> + * This option retrieves local per endpoint statistics. It is modeled
> + * after OpenSolaris' implementation
> + */
> +static int sctp_getsockopt_assoc_stats(struct sock *sk, int len,
> +				       char __user *optval,
> +				       int __user *optlen)
> +{
> +	struct sctp_assoc_stats sas;
> +	struct sctp_association *asoc = NULL;
> +
> +	/* User must provide at least the assoc id */
> +	if (len < sizeof(sctp_assoc_t))
> +		return -EINVAL;
> +
> +	if (copy_from_user(&sas, optval, len))
> +		return -EFAULT;
> +
> +	asoc = sctp_id2assoc(sk, sas.sas_assoc_id);
> +	if (!asoc)
> +		return -EINVAL;
> +
> +	sas.sas_rtxchunks = asoc->stats.rtxchunks;
> +	sas.sas_gapcnt = asoc->stats.gapcnt;
> +	sas.sas_outofseqtsns = asoc->stats.outofseqtsns;
> +	sas.sas_osacks = asoc->stats.osacks;
> +	sas.sas_isacks = asoc->stats.isacks;
> +	sas.sas_octrlchunks = asoc->stats.octrlchunks;
> +	sas.sas_ictrlchunks = asoc->stats.ictrlchunks;
> +	sas.sas_oodchunks = asoc->stats.oodchunks;
> +	sas.sas_iodchunks = asoc->stats.iodchunks;
> +	sas.sas_ouodchunks = asoc->stats.ouodchunks;
> +	sas.sas_iuodchunks = asoc->stats.iuodchunks;
> +	sas.sas_idupchunks = asoc->stats.idupchunks;
> +	sas.sas_opackets = asoc->stats.opackets;
> +	sas.sas_ipackets = asoc->stats.ipackets;
> +
> +	memcpy(&sas.sas_obs_rto_ipaddr, &asoc->stats.obs_rto_ipaddr,
> +		sizeof(struct sockaddr_storage));
> +	/* New high max rto observed */
> +	if (asoc->stats.max_obs_rto > asoc->stats.max_prev_obs_rto)
> +		sas.sas_maxrto = asoc->stats.max_obs_rto;
> +	else /* return min_rto since max rto has not changed */
> +		sas.sas_maxrto = asoc->rto_min;
> +
> +	/* Record the value sent to the user this period */
> +	asoc->stats.max_prev_obs_rto = sas.sas_maxrto;
> +
> +	/* Mark beginning of a new observation period */
> +	asoc->stats.max_obs_rto = 0;

I don't think the logic above behaves correctly.  The fact that
max_obs_rto < max_prev_obs_rto does not mean that max_obs_rto has
not changed.  It just means that the network had lower latency in
this time slice than it had in the previous one.  Returning rto_min is
misinformation in this case.
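
To illustrate, here is a small userspace model of the read-side logic. It is
a sketch only: the struct and function names are made up, it is not kernel
code, and the second function is just one possible alternative (an
assumption on my part), not a settled design.

```c
#include <stdint.h>

/* Userspace model of the two fields used by the patch. */
struct rto_stats {
	uint64_t max_obs_rto;		/* max RTO seen in the current period */
	uint64_t max_prev_obs_rto;	/* value reported for the previous period */
};

/* Mirrors the logic in the patch: compare against the previous report. */
static uint64_t read_maxrto_patch(struct rto_stats *s, uint64_t rto_min)
{
	uint64_t ret;

	if (s->max_obs_rto > s->max_prev_obs_rto)
		ret = s->max_obs_rto;
	else
		ret = rto_min;	/* also taken when the new max is merely lower */

	s->max_prev_obs_rto = ret;
	s->max_obs_rto = 0;	/* start a new observation period */
	return ret;
}

/* Hypothetical alternative: report the period's max as observed,
 * with 0 meaning that no RTO update happened during the period.
 */
static uint64_t read_maxrto_simple(struct rto_stats *s)
{
	uint64_t ret = s->max_obs_rto;

	s->max_obs_rto = 0;
	return ret;
}
```

If one period peaks at 3000 and the next peaks at 2000, the patch logic
reports 3000 and then rto_min, even though an RTO of 2000 was observed in
the second period.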

-vlad

> +
> +	/* Allow the struct to grow and fill in as much as possible */
> +	len = min_t(size_t, len, sizeof(sas));
> +
> +	if (put_user(len, optlen))
> +		return -EFAULT;
> +
> +	SCTP_DEBUG_PRINTK("sctp_getsockopt_assoc_stat(%d): %d\n",
> +			  len, sas.sas_assoc_id);
> +
> +	if (copy_to_user(optval, &sas, len))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
>   SCTP_STATIC int sctp_getsockopt(struct sock *sk, int level, int optname,
>   				char __user *optval, int __user *optlen)
>   {
> @@ -5774,6 +5843,9 @@ SCTP_STATIC int sctp_getsockopt(struct sock *sk, int level, int optname,
>   	case SCTP_PEER_ADDR_THLDS:
>   		retval = sctp_getsockopt_paddr_thresholds(sk, optval, len, optlen);
>   		break;
> +	case SCTP_GET_ASSOC_STATS:
> +		retval = sctp_getsockopt_assoc_stats(sk, len, optval, optlen);
> +		break;
>   	default:
>   		retval = -ENOPROTOOPT;
>   		break;
> diff --git a/net/sctp/transport.c b/net/sctp/transport.c
> index 953c21e..8c6920d 100644
> --- a/net/sctp/transport.c
> +++ b/net/sctp/transport.c
> @@ -350,6 +350,7 @@ void sctp_transport_update_rto(struct sctp_transport *tp, __u32 rtt)
>
>   	/* 6.3.1 C3) After the computation, update RTO <- SRTT + 4 * RTTVAR. */
>   	tp->rto = tp->srtt + (tp->rttvar << 2);
> +	sctp_max_rto(tp->asoc, tp);
>
>   	/* 6.3.1 C6) Whenever RTO is computed, if it is less than RTO.Min
>   	 * seconds then it is rounded up to RTO.Min seconds.
> @@ -620,6 +621,7 @@ void sctp_transport_reset(struct sctp_transport *t)
>   	t->burst_limited = 0;
>   	t->ssthresh = asoc->peer.i.a_rwnd;
>   	t->rto = asoc->rto_initial;
> +	sctp_max_rto(asoc, t);
>   	t->rtt = 0;
>   	t->srtt = 0;
>   	t->rttvar = 0;
>


* [PATCH v2] net/macb: move to circ_buf macros and fix initial condition
From: Nicolas Ferre @ 2012-11-19 16:00 UTC (permalink / raw)
  To: David S. Miller, netdev
  Cc: Jean-Christophe PLAGNIOL-VILLARD, Joachim Eastwood,
	linux-arm-kernel, linux-kernel, Nicolas Ferre
In-Reply-To: <1353338244-11506-1-git-send-email-nicolas.ferre@atmel.com>

Move to circular buffers management macro and correct an error
with circular buffer initial condition.

Without this patch, the macb_tx_ring_avail() function was
not reporting the proper ring availability at startup:
macb macb: eth0: BUG! Tx Ring full when queue awake!
macb macb: eth0: tx_head = 0, tx_tail = 0
And hanging forever...

I remove the macb_tx_ring_avail() function and use the
proven macros from circ_buf.h. CIRC_CNT() is used in the
"consumer" part of the driver: macb_tx_interrupt() to match
advice from Documentation/circular-buffers.txt.
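
The initial-condition problem can be reproduced in userspace with the sketch
below. The two CIRC_* macros are copied from include/linux/circ_buf.h; the
helper functions are illustrative stand-ins that take the indices directly
instead of a struct macb, and their names are not part of the driver.

```c
/* Userspace copies of the macros from include/linux/circ_buf.h. */
#define CIRC_CNT(head, tail, size)	(((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size)	CIRC_CNT((tail), ((head) + 1), (size))

#define TX_RING_SIZE		128	/* must be power of 2 */
#define MACB_TX_WAKEUP_THRESH	(3 * TX_RING_SIZE / 4)

/* The removed helper: returns 0 for an empty ring (head == tail). */
static unsigned int old_tx_ring_avail(unsigned int head, unsigned int tail)
{
	return (tail - head) & (TX_RING_SIZE - 1);
}

/* Free slots as computed after the patch (producer side). */
static unsigned int new_tx_ring_space(unsigned int head, unsigned int tail)
{
	return CIRC_SPACE(head, tail, TX_RING_SIZE);
}

/* Wake-up condition used after the patch (consumer side). */
static int should_wake(unsigned int head, unsigned int tail)
{
	return CIRC_CNT(head, tail, TX_RING_SIZE) <= MACB_TX_WAKEUP_THRESH;
}
```

With tx_head = 0 and tx_tail = 0 the old helper yields 0, which the
"< 1" test treats as a full ring, while CIRC_SPACE() yields 127. The
threshold moves from TX_RING_SIZE / 4 free descriptors to
3 * TX_RING_SIZE / 4 occupied ones because the wake-up test now counts used
entries rather than free ones.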

Reported-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Tested-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
---
v2: - added tags from Jean-Christophe PLAGNIOL-VILLARD

 drivers/net/ethernet/cadence/macb.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index edb2aba..27991dd 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -14,6 +14,7 @@
 #include <linux/moduleparam.h>
 #include <linux/kernel.h>
 #include <linux/types.h>
+#include <linux/circ_buf.h>
 #include <linux/slab.h>
 #include <linux/init.h>
 #include <linux/gpio.h>
@@ -38,8 +39,8 @@
 #define TX_RING_SIZE		128 /* must be power of 2 */
 #define TX_RING_BYTES		(sizeof(struct macb_dma_desc) * TX_RING_SIZE)
 
-/* minimum number of free TX descriptors before waking up TX process */
-#define MACB_TX_WAKEUP_THRESH	(TX_RING_SIZE / 4)
+/* level of occupied TX descriptors under which we wake up TX process */
+#define MACB_TX_WAKEUP_THRESH	(3 * TX_RING_SIZE / 4)
 
 #define MACB_RX_INT_FLAGS	(MACB_BIT(RCOMP) | MACB_BIT(RXUBR)	\
 				 | MACB_BIT(ISR_ROVR))
@@ -60,11 +61,6 @@ static unsigned int macb_tx_ring_wrap(unsigned int index)
 	return index & (TX_RING_SIZE - 1);
 }
 
-static unsigned int macb_tx_ring_avail(struct macb *bp)
-{
-	return (bp->tx_tail - bp->tx_head) & (TX_RING_SIZE - 1);
-}
-
 static struct macb_dma_desc *macb_tx_desc(struct macb *bp, unsigned int index)
 {
 	return &bp->tx_ring[macb_tx_ring_wrap(index)];
@@ -524,7 +520,8 @@ static void macb_tx_interrupt(struct macb *bp)
 
 	bp->tx_tail = tail;
 	if (netif_queue_stopped(bp->dev)
-			&& macb_tx_ring_avail(bp) > MACB_TX_WAKEUP_THRESH)
+			&& CIRC_CNT(bp->tx_head, bp->tx_tail,
+				    TX_RING_SIZE) <= MACB_TX_WAKEUP_THRESH)
 		netif_wake_queue(bp->dev);
 }
 
@@ -818,7 +815,7 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	spin_lock_irqsave(&bp->lock, flags);
 
 	/* This is a hard error, log it. */
-	if (macb_tx_ring_avail(bp) < 1) {
+	if (CIRC_SPACE(bp->tx_head, bp->tx_tail, TX_RING_SIZE) < 1) {
 		netif_stop_queue(dev);
 		spin_unlock_irqrestore(&bp->lock, flags);
 		netdev_err(bp->dev, "BUG! Tx Ring full when queue awake!\n");
@@ -855,7 +852,7 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
 
-	if (macb_tx_ring_avail(bp) < 1)
+	if (CIRC_SPACE(bp->tx_head, bp->tx_tail, TX_RING_SIZE) < 1)
 		netif_stop_queue(dev);
 
 	spin_unlock_irqrestore(&bp->lock, flags);
-- 
1.8.0


* Re: compound skb frag pages appearing in start_xmit
From: Sander Eikelenboom @ 2012-11-19 15:43 UTC (permalink / raw)
  To: ANNIE LI
  Cc: Ian Campbell, Eric Dumazet, netdev@vger.kernel.org,
	Marcos E. Matsunaga, xen-devel, Konrad Rzeszutek Wilk,
	Eric Dumazet
In-Reply-To: <50A4540E.1010601@oracle.com>


Thursday, November 15, 2012, 3:31:42 AM, you wrote:

> On 2012-10-11 18:14, Ian Campbell wrote:
>> On Thu, 2012-10-11 at 11:05 +0100, Eric Dumazet wrote:
>>> On Thu, 2012-10-11 at 12:00 +0200, Sander Eikelenboom wrote:
>>>
>>>> Probably due to the BUG_ON from the patch below, i changed it into a WARN_ON.
>>>> And i seem to hit it, but only in one of the guests at the moment and it triggers quite irregularly.
>>> xennet_make_frags() is able to split the skb->head in multiple page-size
>>> chunks.
>>>
>>> It should do the same for fragments
>> Right, I just want to be able to reproduce the issue so I can know I've fixed it
>> properly ;-)
> Hi Ian,

> I can reproduce this BUG_ON when running netperf/netserver test between 
> two domus running on the same dom0. The domu and dom0 all use v3.7-rc1.

> When I tried to rebase my persistent grant netfront/netback patch on the
> latest kernel, the netperf/netserver test never succeeded. I did some testing
> and found that v3.6-rc7 works fine, but v3.7-rc1, v3.7-rc2 and
> v3.7-rc4 do not pass the netperf/netserver test. So I keep my
> persistent grant patch based only on v3.4-rc3 for now.

> Konrad thought about commit 6a8ed462f16b8455eec5ae00eb6014159a6721f0 in 
> v3.7-rc1, and suggested me to test your debug patch in netfront. This 
> BUG_ON happens soon after running the netperf/netserver test case.

> Thanks
> Annie

Is there any progress on this bug? rc6 is out the door, so the release of 3.7-final seems to be imminent, and this bug completely cripples any networking with guests.

--
Sander

>>
>> Ian.


* Re: [PATCH] net/macb: move to circ_buf macros and fix initial condition
From: Jean-Christophe PLAGNIOL-VILLARD @ 2012-11-19 15:33 UTC (permalink / raw)
  To: Nicolas Ferre
  Cc: David S. Miller, netdev, Joachim Eastwood, linux-arm-kernel,
	linux-kernel
In-Reply-To: <1353338244-11506-1-git-send-email-nicolas.ferre@atmel.com>

On 16:17 Mon 19 Nov     , Nicolas Ferre wrote:
> Move to circular buffers management macro and correct an error
> with circular buffer initial condition.
> 
> Without this patch, the macb_tx_ring_avail() function was
> not reporting the proper ring availability at startup:
> macb macb: eth0: BUG! Tx Ring full when queue awake!
> macb macb: eth0: tx_head = 0, tx_tail = 0
> And hanging forever...
> 
> I remove the macb_tx_ring_avail() function and use the
> proven macros from circ_buf.h. CIRC_CNT() is used in the
> "consumer" part of the driver: macb_tx_interrupt() to match
> advice from Documentation/circular-buffers.txt.
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>

Just tested on 9260
Tested-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>

Best Regards,
J.
> ---
>  drivers/net/ethernet/cadence/macb.c | 17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index edb2aba..27991dd 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -14,6 +14,7 @@
>  #include <linux/moduleparam.h>
>  #include <linux/kernel.h>
>  #include <linux/types.h>
> +#include <linux/circ_buf.h>
>  #include <linux/slab.h>
>  #include <linux/init.h>
>  #include <linux/gpio.h>
> @@ -38,8 +39,8 @@
>  #define TX_RING_SIZE		128 /* must be power of 2 */
>  #define TX_RING_BYTES		(sizeof(struct macb_dma_desc) * TX_RING_SIZE)
>  
> -/* minimum number of free TX descriptors before waking up TX process */
> -#define MACB_TX_WAKEUP_THRESH	(TX_RING_SIZE / 4)
> +/* level of occupied TX descriptors under which we wake up TX process */
> +#define MACB_TX_WAKEUP_THRESH	(3 * TX_RING_SIZE / 4)
>  
>  #define MACB_RX_INT_FLAGS	(MACB_BIT(RCOMP) | MACB_BIT(RXUBR)	\
>  				 | MACB_BIT(ISR_ROVR))
> @@ -60,11 +61,6 @@ static unsigned int macb_tx_ring_wrap(unsigned int index)
>  	return index & (TX_RING_SIZE - 1);
>  }
>  
> -static unsigned int macb_tx_ring_avail(struct macb *bp)
> -{
> -	return (bp->tx_tail - bp->tx_head) & (TX_RING_SIZE - 1);
> -}
> -
>  static struct macb_dma_desc *macb_tx_desc(struct macb *bp, unsigned int index)
>  {
>  	return &bp->tx_ring[macb_tx_ring_wrap(index)];
> @@ -524,7 +520,8 @@ static void macb_tx_interrupt(struct macb *bp)
>  
>  	bp->tx_tail = tail;
>  	if (netif_queue_stopped(bp->dev)
> -			&& macb_tx_ring_avail(bp) > MACB_TX_WAKEUP_THRESH)
> +			&& CIRC_CNT(bp->tx_head, bp->tx_tail,
> +				    TX_RING_SIZE) <= MACB_TX_WAKEUP_THRESH)
>  		netif_wake_queue(bp->dev);
>  }
>  
> @@ -818,7 +815,7 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	spin_lock_irqsave(&bp->lock, flags);
>  
>  	/* This is a hard error, log it. */
> -	if (macb_tx_ring_avail(bp) < 1) {
> +	if (CIRC_SPACE(bp->tx_head, bp->tx_tail, TX_RING_SIZE) < 1) {
>  		netif_stop_queue(dev);
>  		spin_unlock_irqrestore(&bp->lock, flags);
>  		netdev_err(bp->dev, "BUG! Tx Ring full when queue awake!\n");
> @@ -855,7 +852,7 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  
>  	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
>  
> -	if (macb_tx_ring_avail(bp) < 1)
> +	if (CIRC_SPACE(bp->tx_head, bp->tx_tail, TX_RING_SIZE) < 1)
>  		netif_stop_queue(dev);
>  
>  	spin_unlock_irqrestore(&bp->lock, flags);
> -- 
> 1.8.0
> 


* [PATCH] netfilter: Remove the spurious \ in __ip_vs_lblc_init
From: Eric W. Biederman @ 2012-11-19 15:26 UTC (permalink / raw)
  To: David Miller; +Cc: netdev


In (464dc801c76a net: Don't export sysctls to unprivileged users)
I typoed and introduced a spurious backslash.  Delete it.
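
For anyone wondering why this built at all: a backslash directly before a
newline just splices the next line onto the current one, so the stray
character did not change behaviour. A minimal userspace sketch follows; the
function name and the literal 12 (standing in for ENOMEM) are made up for
illustration, and free() stands in for kfree().

```c
#include <stdlib.h>

/* Hypothetical reduction of the error path in __ip_vs_lblc_init().
 * 'table' stands in for ipvs->lblc_ctl_table and 'owns_table' for the
 * !net_eq(net, &init_net) check.
 */
static int init_error_path(void *table, int owns_table)
{
	if (owns_table)
		free(table);\
	return -12;	/* spliced onto the line above; still unconditional */
}
```

After splicing, the compiler sees "free(table); return -12;" on one logical
line, so only the free() is guarded by the if, exactly as without the
backslash.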

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/netfilter/ipvs/ip_vs_lblc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index d742aa9..fdd89b9 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -574,7 +574,7 @@ static int __net_init __ip_vs_lblc_init(struct net *net)
 		register_net_sysctl(net, "net/ipv4/vs", ipvs->lblc_ctl_table);
 	if (!ipvs->lblc_ctl_header) {
 		if (!net_eq(net, &init_net))
-			kfree(ipvs->lblc_ctl_table);\
+			kfree(ipvs->lblc_ctl_table);
 		return -ENOMEM;
 	}
 
-- 
1.7.5.4


* [PATCH] net/macb: move to circ_buf macros and fix initial condition
From: Nicolas Ferre @ 2012-11-19 15:17 UTC (permalink / raw)
  To: David S. Miller, netdev
  Cc: Jean-Christophe PLAGNIOL-VILLARD, Joachim Eastwood,
	linux-arm-kernel, linux-kernel, Nicolas Ferre

Move to circular buffers management macro and correct an error
with circular buffer initial condition.

Without this patch, the macb_tx_ring_avail() function was
not reporting the proper ring availability at startup:
macb macb: eth0: BUG! Tx Ring full when queue awake!
macb macb: eth0: tx_head = 0, tx_tail = 0
And hanging forever...

I remove the macb_tx_ring_avail() function and use the
proven macros from circ_buf.h. CIRC_CNT() is used in the
"consumer" part of the driver: macb_tx_interrupt() to match
advice from Documentation/circular-buffers.txt.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
---
 drivers/net/ethernet/cadence/macb.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index edb2aba..27991dd 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -14,6 +14,7 @@
 #include <linux/moduleparam.h>
 #include <linux/kernel.h>
 #include <linux/types.h>
+#include <linux/circ_buf.h>
 #include <linux/slab.h>
 #include <linux/init.h>
 #include <linux/gpio.h>
@@ -38,8 +39,8 @@
 #define TX_RING_SIZE		128 /* must be power of 2 */
 #define TX_RING_BYTES		(sizeof(struct macb_dma_desc) * TX_RING_SIZE)
 
-/* minimum number of free TX descriptors before waking up TX process */
-#define MACB_TX_WAKEUP_THRESH	(TX_RING_SIZE / 4)
+/* level of occupied TX descriptors under which we wake up TX process */
+#define MACB_TX_WAKEUP_THRESH	(3 * TX_RING_SIZE / 4)
 
 #define MACB_RX_INT_FLAGS	(MACB_BIT(RCOMP) | MACB_BIT(RXUBR)	\
 				 | MACB_BIT(ISR_ROVR))
@@ -60,11 +61,6 @@ static unsigned int macb_tx_ring_wrap(unsigned int index)
 	return index & (TX_RING_SIZE - 1);
 }
 
-static unsigned int macb_tx_ring_avail(struct macb *bp)
-{
-	return (bp->tx_tail - bp->tx_head) & (TX_RING_SIZE - 1);
-}
-
 static struct macb_dma_desc *macb_tx_desc(struct macb *bp, unsigned int index)
 {
 	return &bp->tx_ring[macb_tx_ring_wrap(index)];
@@ -524,7 +520,8 @@ static void macb_tx_interrupt(struct macb *bp)
 
 	bp->tx_tail = tail;
 	if (netif_queue_stopped(bp->dev)
-			&& macb_tx_ring_avail(bp) > MACB_TX_WAKEUP_THRESH)
+			&& CIRC_CNT(bp->tx_head, bp->tx_tail,
+				    TX_RING_SIZE) <= MACB_TX_WAKEUP_THRESH)
 		netif_wake_queue(bp->dev);
 }
 
@@ -818,7 +815,7 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	spin_lock_irqsave(&bp->lock, flags);
 
 	/* This is a hard error, log it. */
-	if (macb_tx_ring_avail(bp) < 1) {
+	if (CIRC_SPACE(bp->tx_head, bp->tx_tail, TX_RING_SIZE) < 1) {
 		netif_stop_queue(dev);
 		spin_unlock_irqrestore(&bp->lock, flags);
 		netdev_err(bp->dev, "BUG! Tx Ring full when queue awake!\n");
@@ -855,7 +852,7 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
 
-	if (macb_tx_ring_avail(bp) < 1)
+	if (CIRC_SPACE(bp->tx_head, bp->tx_tail, TX_RING_SIZE) < 1)
 		netif_stop_queue(dev);
 
 	spin_unlock_irqrestore(&bp->lock, flags);
-- 
1.8.0

^ permalink raw reply related
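
Note on the initial condition described in the macb patch above: the following is a minimal userspace sketch, not driver code. The two CIRC_* macros are copied from include/linux/circ_buf.h, and TX_RING_SIZE and MACB_TX_WAKEUP_THRESH use the values from the patch. The helper names (old_tx_ring_avail, tx_ring_space, tx_ring_used, tx_should_wake) are made up for illustration. It shows why the removed helper reported zero free descriptors for an empty ring (head == tail), while CIRC_SPACE() reports size - 1.

```c
#include <assert.h>

/* Same definitions as the kernel's include/linux/circ_buf.h. */
#define CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

#define TX_RING_SIZE		128 /* must be power of 2 */
#define MACB_TX_WAKEUP_THRESH	(3 * TX_RING_SIZE / 4)

/* Logic of the removed macb_tx_ring_avail(): yields 0 when head == tail. */
static unsigned int old_tx_ring_avail(unsigned int head, unsigned int tail)
{
	return (tail - head) & (TX_RING_SIZE - 1);
}

/* Producer side (hypothetical helper): free slots, as checked in xmit. */
static unsigned int tx_ring_space(unsigned int head, unsigned int tail)
{
	return CIRC_SPACE(head, tail, TX_RING_SIZE);
}

/* Consumer side (hypothetical helper): occupied slots, as checked in IRQ. */
static unsigned int tx_ring_used(unsigned int head, unsigned int tail)
{
	return CIRC_CNT(head, tail, TX_RING_SIZE);
}

/* Wake the queue once occupancy is at or below the new threshold. */
static int tx_should_wake(unsigned int head, unsigned int tail)
{
	return tx_ring_used(head, tail) <= MACB_TX_WAKEUP_THRESH;
}
```

With head = tail = 0, old_tx_ring_avail() returns 0, which matches the "Tx Ring full when queue awake" report at startup, whereas tx_ring_space() returns 127. One slot is always left unused so that a full ring and an empty ring remain distinguishable.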

* Re: [PATCH RFC 3/5] printk: modify printk interface for syslog_namespace
From: Joe Perches @ 2012-11-19 14:53 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Rui Xiang, serge.hallyn, containers, netdev, Eric W. Biederman
In-Reply-To: <20121119142926.GB4453@mail.hallyn.com>

On Mon, 2012-11-19 at 14:29 +0000, Serge E. Hallyn wrote:
> > diff --git a/drivers/base/core.c b/drivers/base/core.c

48kb of quoted text...

> > @@ -1678,7 +1674,7 @@ asmlinkage int printk(const char *fmt, ...)
> >  	}
> >  #endif
> >  	va_start(args, fmt);
> > -	r = vprintk_emit(0, -1, NULL, 0, fmt, args);
> > +	r = vprintk_emit(0, -1, NULL, 0, fmt, args, current_syslog_ns());
> 
> Current is meaningless here.  The default should be using init_syslog_ns.

For a single sentence.  Please remember to trim your replies.

^ permalink raw reply
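
Note on the review comment quoted above: the sketch below is a userspace model only, under the assumption that the comment means plain printk() should always log to the initial namespace, while namespace-aware callers pass one explicitly. The struct is a hypothetical stand-in with two of the fields the patch moves behind syslog_ns->; the real struct syslog_namespace is defined elsewhere in the series and is not shown in this thread. The function names emit_model, printk_model and ns_printk_model are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-namespace log state. */
struct syslog_namespace {
	unsigned long long log_first_seq;
	unsigned long long log_next_seq;
};

/* The initial namespace: the default target for kernel messages. */
static struct syslog_namespace init_syslog_ns;

/* Models the extended vprintk_emit(): the caller selects the namespace. */
static unsigned long long emit_model(struct syslog_namespace *ns)
{
	/* Return the sequence number given to this record, then advance. */
	return ns->log_next_seq++;
}

/* Models plain printk(): no caller context, so use init_syslog_ns. */
static unsigned long long printk_model(void)
{
	return emit_model(&init_syslog_ns);
}

/* Models an ns_printk()-style call; NULL falls back to the default. */
static unsigned long long ns_printk_model(struct syslog_namespace *ns)
{
	return emit_model(ns ? ns : &init_syslog_ns);
}
```

In this model, messages from printk_model() never land in a container's buffer, and only callers that hold a namespace pointer can log into it.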

* Re: [PATCH RFC 0/5] Containerize syslog
From: Serge E. Hallyn @ 2012-11-19 14:37 UTC (permalink / raw)
  To: Rui Xiang
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Eric W. Biederman, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <50A9EAD8.9090501-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Quoting Rui Xiang (leo.ruixiang-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org):
> From: Xiang Rui <rui.xiang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> 
> In Serge's patch (http://lwn.net/Articles/525629/), syslog_namespace was tied to a user
> namespace. We add syslog_ns tied to nsproxy instead, and implement ns_printk in
> ip_table context.

Since you say 'we', I'm just wondering, which project is this a part of?

> We add syslog_namespace as a part of nsproxy, and a new flag CLONE_SYSLOG to unshare
> syslog area.

Thanks, looks like you saved me the time of having to add some users of
nsprintk :)

I understand that user namespaces aren't 100% usable yet, but looking
long term, is there a reason to have the syslog namespace separate
from user namespace?

thanks,
-serge

^ permalink raw reply

* Re: [PATCH RFC 3/5] printk: modify printk interface for syslog_namespace
From: Serge E. Hallyn @ 2012-11-19 14:29 UTC (permalink / raw)
  To: Rui Xiang; +Cc: serge.hallyn, containers, netdev, Eric W. Biederman
In-Reply-To: <50A9EAF0.4000902@gmail.com>

Quoting Rui Xiang (leo.ruixiang@gmail.com):
> From: Libo Chen <clbchenlibo.chen@huawei.com>
> 
> We re-implement printk by additional syslog_ns.
> 
> The functions, including printk, /dev/kmsg, do_syslog and kmsg_dump, should be modified
> for syslog_ns. Previous identifiers *** such as log_first_seq should be replaced
> by syslog_ns->***.
> 
> Signed-off-by: Libo Chen <clbchenlibo.chen@huawei.com>
> Signed-off-by: Xiang Rui <rui.xiang@huawei.com>
> ---
>  drivers/base/core.c    |    4 +-
>  include/linux/printk.h |    4 +-
>  kernel/printk.c        |  609 +++++++++++++++++++++++++++++-------------------
>  3 files changed, 372 insertions(+), 245 deletions(-)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index abea76c..665c2f7 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -26,6 +26,7 @@
>  #include <linux/async.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/netdevice.h>
> +#include <linux/syslog_namespace.h>
> 
>  #include "base.h"
>  #include "power/power.h"
> @@ -1922,7 +1923,8 @@ int dev_vprintk_emit(int level, const struct device *dev,
> 
>  	hdrlen = create_syslog_header(dev, hdr, sizeof(hdr));
> 
> -	return vprintk_emit(0, level, hdrlen ? hdr : NULL, hdrlen, fmt, args);
> +	return vprintk_emit(0, level, hdrlen ? hdr : NULL, hdrlen,
> +				fmt, args, current_syslog_ns());
>  }
>  EXPORT_SYMBOL(dev_vprintk_emit);
> 
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 9afc01e..e0c60d9 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -7,6 +7,7 @@
>  extern const char linux_banner[];
>  extern const char linux_proc_banner[];
> 
> +struct syslog_namespace;
>  static inline int printk_get_level(const char *buffer)
>  {
>  	if (buffer[0] == KERN_SOH_ASCII && buffer[1]) {
> @@ -105,7 +106,8 @@ extern void printk_tick(void);
>  asmlinkage __printf(5, 0)
>  int vprintk_emit(int facility, int level,
>  		 const char *dict, size_t dictlen,
> -		 const char *fmt, va_list args);
> +		 const char *fmt, va_list args,
> +		 struct syslog_namespace *syslog_ns);
> 
>  asmlinkage __printf(1, 0)
>  int vprintk(const char *fmt, va_list args);
> diff --git a/kernel/printk.c b/kernel/printk.c
> index 2d607f4..2ef9c46 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -42,6 +42,7 @@
>  #include <linux/notifier.h>
>  #include <linux/rculist.h>
>  #include <linux/poll.h>
> +#include <linux/syslog_namespace.h>
> 
>  #include <asm/uaccess.h>
> 
> @@ -214,46 +215,14 @@ struct log {
>   * The logbuf_lock protects kmsg buffer, indices, counters. It is also
>   * used in interesting ways to provide interlocking in console_unlock();
>   */
> -static DEFINE_RAW_SPINLOCK(logbuf_lock);
> 
>  #ifdef CONFIG_PRINTK
> -/* the next printk record to read by syslog(READ) or /proc/kmsg */
> -static u64 syslog_seq;
> -static u32 syslog_idx;
> -static enum log_flags syslog_prev;
> -static size_t syslog_partial;
> -
> -/* index and sequence number of the first record stored in the buffer */
> -static u64 log_first_seq;
> -static u32 log_first_idx;
> -
> -/* index and sequence number of the next record to store in the buffer */
> -static u64 log_next_seq;
> -static u32 log_next_idx;
> 
> -/* the next printk record to write to the console */
> -static u64 console_seq;
> -static u32 console_idx;
>  static enum log_flags console_prev;
> 
> -/* the next printk record to read after the last 'clear' command */
> -static u64 clear_seq;
> -static u32 clear_idx;
> -
>  #define PREFIX_MAX		32
>  #define LOG_LINE_MAX		1024 - PREFIX_MAX
> 
> -/* record buffer */
> -#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
> -#define LOG_ALIGN 4
> -#else
> -#define LOG_ALIGN __alignof__(struct log)
> -#endif
> -#define __LOG_BUF_LEN (1 << CONFIG_LOG_BUF_SHIFT)
> -static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
> -static char *log_buf = __log_buf;
> -static u32 log_buf_len = __LOG_BUF_LEN;
> -
>  /* cpu currently holding logbuf_lock */
>  static volatile unsigned int logbuf_cpu = UINT_MAX;
> 
> @@ -270,23 +239,23 @@ static char *log_dict(const struct log *msg)
>  }
> 
>  /* get record by index; idx must point to valid msg */
> -static struct log *log_from_idx(u32 idx)
> +static struct log *log_from_idx(u32 idx, struct syslog_namespace *syslog_ns)
>  {
> -	struct log *msg = (struct log *)(log_buf + idx);
> +	struct log *msg = (struct log *)(syslog_ns->log_buf + idx);
> 
>  	/*
>  	 * A length == 0 record is the end of buffer marker. Wrap around and
>  	 * read the message at the start of the buffer.
>  	 */
>  	if (!msg->len)
> -		return (struct log *)log_buf;
> +		return (struct log *)syslog_ns->log_buf;
>  	return msg;
>  }
> 
>  /* get next record; idx must point to valid msg */
> -static u32 log_next(u32 idx)
> +static u32 log_next(u32 idx, struct syslog_namespace *syslog_ns)
>  {
> -	struct log *msg = (struct log *)(log_buf + idx);
> +	struct log *msg = (struct log *)(syslog_ns->log_buf + idx);
> 
>  	/* length == 0 indicates the end of the buffer; wrap */
>  	/*
> @@ -295,7 +264,7 @@ static u32 log_next(u32 idx)
>  	 * return the one after that.
>  	 */
>  	if (!msg->len) {
> -		msg = (struct log *)log_buf;
> +		msg = (struct log *)syslog_ns->log_buf;
>  		return msg->len;
>  	}
>  	return idx + msg->len;
> @@ -305,7 +274,8 @@ static u32 log_next(u32 idx)
>  static void log_store(int facility, int level,
>  		      enum log_flags flags, u64 ts_nsec,
>  		      const char *dict, u16 dict_len,
> -		      const char *text, u16 text_len)
> +		      const char *text, u16 text_len,
> +		      struct syslog_namespace *syslog_ns)
>  {
>  	struct log *msg;
>  	u32 size, pad_len;
> @@ -315,34 +285,40 @@ static void log_store(int facility, int level,
>  	pad_len = (-size) & (LOG_ALIGN - 1);
>  	size += pad_len;
> 
> -	while (log_first_seq < log_next_seq) {
> +	while (syslog_ns->log_first_seq < syslog_ns->log_next_seq) {
>  		u32 free;
> 
> -		if (log_next_idx > log_first_idx)
> -			free = max(log_buf_len - log_next_idx, log_first_idx);
> +		if (syslog_ns->log_next_idx > syslog_ns->log_first_idx)
> +			free = max(syslog_ns->log_buf_len -
> +				 syslog_ns->log_next_idx,
> +				 syslog_ns->log_first_idx);
>  		else
> -			free = log_first_idx - log_next_idx;
> +			free = syslog_ns->log_first_idx -
> +					 syslog_ns->log_next_idx;
> 
>  		if (free > size + sizeof(struct log))
>  			break;
> 
>  		/* drop old messages until we have enough contiuous space */
> -		log_first_idx = log_next(log_first_idx);
> -		log_first_seq++;
> +		syslog_ns->log_first_idx =
> +				log_next(syslog_ns->log_first_idx, syslog_ns);
> +		syslog_ns->log_first_seq++;
>  	}
> 
> -	if (log_next_idx + size + sizeof(struct log) >= log_buf_len) {
> +	if (syslog_ns->log_next_idx + size + sizeof(struct log) >=
> +						 syslog_ns->log_buf_len) {
>  		/*
>  		 * This message + an additional empty header does not fit
>  		 * at the end of the buffer. Add an empty header with len == 0
>  		 * to signify a wrap around.
>  		 */
> -		memset(log_buf + log_next_idx, 0, sizeof(struct log));
> -		log_next_idx = 0;
> +		memset(syslog_ns->log_buf + syslog_ns->log_next_idx,
> +						 0, sizeof(struct log));
> +		syslog_ns->log_next_idx = 0;
>  	}
> 
>  	/* fill message */
> -	msg = (struct log *)(log_buf + log_next_idx);
> +	msg = (struct log *)(syslog_ns->log_buf + syslog_ns->log_next_idx);
>  	memcpy(log_text(msg), text, text_len);
>  	msg->text_len = text_len;
>  	memcpy(log_dict(msg), dict, dict_len);
> @@ -358,8 +334,8 @@ static void log_store(int facility, int level,
>  	msg->len = sizeof(struct log) + text_len + dict_len + pad_len;
> 
>  	/* insert message */
> -	log_next_idx += msg->len;
> -	log_next_seq++;
> +	syslog_ns->log_next_idx += msg->len;
> +	syslog_ns->log_next_seq++;
>  }
> 
>  /* /dev/kmsg - userspace message inject/listen interface */
> @@ -368,6 +344,7 @@ struct devkmsg_user {
>  	u32 idx;
>  	enum log_flags prev;
>  	struct mutex lock;
> +	struct syslog_namespace *syslog_ns;
>  	char buf[8192];
>  };
> 
> @@ -431,6 +408,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
>  			    size_t count, loff_t *ppos)
>  {
>  	struct devkmsg_user *user = file->private_data;
> +	struct syslog_namespace *syslog_ns = user->syslog_ns;
>  	struct log *msg;
>  	u64 ts_usec;
>  	size_t i;
> @@ -444,32 +422,32 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
>  	ret = mutex_lock_interruptible(&user->lock);
>  	if (ret)
>  		return ret;
> -	raw_spin_lock_irq(&logbuf_lock);
> -	while (user->seq == log_next_seq) {
> +	raw_spin_lock_irq(&syslog_ns->logbuf_lock);
> +	while (user->seq == syslog_ns->log_next_seq) {
>  		if (file->f_flags & O_NONBLOCK) {
>  			ret = -EAGAIN;
> -			raw_spin_unlock_irq(&logbuf_lock);
> +			raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
>  			goto out;
>  		}
> 
> -		raw_spin_unlock_irq(&logbuf_lock);
> +		raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
>  		ret = wait_event_interruptible(log_wait,
> -					       user->seq != log_next_seq);
> +				user->seq != syslog_ns->log_next_seq);
>  		if (ret)
>  			goto out;
> -		raw_spin_lock_irq(&logbuf_lock);
> +		raw_spin_lock_irq(&syslog_ns->logbuf_lock);
>  	}
> 
> -	if (user->seq < log_first_seq) {
> +	if (user->seq < syslog_ns->log_first_seq) {
>  		/* our last seen message is gone, return error and reset */
> -		user->idx = log_first_idx;
> -		user->seq = log_first_seq;
> +		user->idx = syslog_ns->log_first_idx;
> +		user->seq = syslog_ns->log_first_seq;
>  		ret = -EPIPE;
> -		raw_spin_unlock_irq(&logbuf_lock);
> +		raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
>  		goto out;
>  	}
> 
> -	msg = log_from_idx(user->idx);
> +	msg = log_from_idx(user->idx, syslog_ns);
>  	ts_usec = msg->ts_nsec;
>  	do_div(ts_usec, 1000);
> 
> @@ -530,9 +508,9 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
>  		user->buf[len++] = '\n';
>  	}
> 
> -	user->idx = log_next(user->idx);
> +	user->idx = log_next(user->idx, syslog_ns);
>  	user->seq++;
> -	raw_spin_unlock_irq(&logbuf_lock);
> +	raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
> 
>  	if (len > count) {
>  		ret = -EINVAL;
> @@ -552,6 +530,7 @@ out:
>  static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
>  {
>  	struct devkmsg_user *user = file->private_data;
> +	struct syslog_namespace *syslog_ns = user->syslog_ns;
>  	loff_t ret = 0;
> 
>  	if (!user)
> @@ -559,12 +538,12 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
>  	if (offset)
>  		return -ESPIPE;
> 
> -	raw_spin_lock_irq(&logbuf_lock);
> +	raw_spin_lock_irq(&syslog_ns->logbuf_lock);
>  	switch (whence) {
>  	case SEEK_SET:
>  		/* the first record */
> -		user->idx = log_first_idx;
> -		user->seq = log_first_seq;
> +		user->idx = syslog_ns->log_first_idx;
> +		user->seq = syslog_ns->log_first_seq;
>  		break;
>  	case SEEK_DATA:
>  		/*
> @@ -572,24 +551,25 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
>  		 * like issued by 'dmesg -c'. Reading /dev/kmsg itself
>  		 * changes no global state, and does not clear anything.
>  		 */
> -		user->idx = clear_idx;
> -		user->seq = clear_seq;
> +		user->idx = syslog_ns->clear_idx;
> +		user->seq = syslog_ns->clear_seq;
>  		break;
>  	case SEEK_END:
>  		/* after the last record */
> -		user->idx = log_next_idx;
> -		user->seq = log_next_seq;
> +		user->idx = syslog_ns->log_next_idx;
> +		user->seq = syslog_ns->log_next_seq;
>  		break;
>  	default:
>  		ret = -EINVAL;
>  	}
> -	raw_spin_unlock_irq(&logbuf_lock);
> +	raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
>  	return ret;
>  }
> 
>  static unsigned int devkmsg_poll(struct file *file, poll_table *wait)
>  {
>  	struct devkmsg_user *user = file->private_data;
> +	struct syslog_namespace *syslog_ns = user->syslog_ns;
>  	int ret = 0;
> 
>  	if (!user)
> @@ -597,20 +577,21 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait)
> 
>  	poll_wait(file, &log_wait, wait);
> 
> -	raw_spin_lock_irq(&logbuf_lock);
> -	if (user->seq < log_next_seq) {
> +	raw_spin_lock_irq(&syslog_ns->logbuf_lock);
> +	if (user->seq < syslog_ns->log_next_seq) {
>  		/* return error when data has vanished underneath us */
> -		if (user->seq < log_first_seq)
> +		if (user->seq < syslog_ns->log_first_seq)
>  			ret = POLLIN|POLLRDNORM|POLLERR|POLLPRI;
>  		ret = POLLIN|POLLRDNORM;
>  	}
> -	raw_spin_unlock_irq(&logbuf_lock);
> +	raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
> 
>  	return ret;
>  }
> 
>  static int devkmsg_open(struct inode *inode, struct file *file)
>  {
> +	struct syslog_namespace *syslog_ns;
>  	struct devkmsg_user *user;
>  	int err;
> 
> @@ -628,10 +609,11 @@ static int devkmsg_open(struct inode *inode, struct file *file)
> 
>  	mutex_init(&user->lock);
> 
> -	raw_spin_lock_irq(&logbuf_lock);
> -	user->idx = log_first_idx;
> -	user->seq = log_first_seq;
> -	raw_spin_unlock_irq(&logbuf_lock);
> +	user->syslog_ns = current_syslog_ns();
> +	raw_spin_lock_irq(&syslog_ns->logbuf_lock);
> +	user->idx = syslog_ns->log_first_idx;
> +	user->seq = syslog_ns->log_first_seq;
> +	raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
> 
>  	file->private_data = user;
>  	return 0;
> @@ -669,10 +651,12 @@ const struct file_operations kmsg_fops = {
>   */
>  void log_buf_kexec_setup(void)
>  {
> -	VMCOREINFO_SYMBOL(log_buf);
> -	VMCOREINFO_SYMBOL(log_buf_len);
> -	VMCOREINFO_SYMBOL(log_first_idx);
> -	VMCOREINFO_SYMBOL(log_next_idx);
> +	struct syslog_namespace *syslog_ns = current_syslog_ns();
> +
> +	VMCOREINFO_SYMBOL(syslog_ns->log_buf);
> +	VMCOREINFO_SYMBOL(syslog_ns->log_buf_len);
> +	VMCOREINFO_SYMBOL(syslog_ns->log_first_idx);
> +	VMCOREINFO_SYMBOL(syslog_ns->log_next_idx);
>  	/*
>  	 * Export struct log size and field offsets. User space tools can
>  	 * parse it and detect any changes to structure down the line.
> @@ -692,10 +676,11 @@ static unsigned long __initdata new_log_buf_len;
>  static int __init log_buf_len_setup(char *str)
>  {
>  	unsigned size = memparse(str, &str);
> +	struct syslog_namespace *syslog_ns = &init_syslog_ns;
> 
>  	if (size)
>  		size = roundup_pow_of_two(size);
> -	if (size > log_buf_len)
> +	if (size > syslog_ns->log_buf_len)
>  		new_log_buf_len = size;
> 
>  	return 0;
> @@ -707,6 +692,7 @@ void __init setup_log_buf(int early)
>  	unsigned long flags;
>  	char *new_log_buf;
>  	int free;
> +	struct syslog_namespace *syslog_ns = &init_syslog_ns;
> 
>  	if (!new_log_buf_len)
>  		return;
> @@ -728,15 +714,15 @@ void __init setup_log_buf(int early)
>  		return;
>  	}
> 
> -	raw_spin_lock_irqsave(&logbuf_lock, flags);
> -	log_buf_len = new_log_buf_len;
> -	log_buf = new_log_buf;
> +	raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
> +	memcpy(new_log_buf, syslog_ns->log_buf, __LOG_BUF_LEN);
> +	syslog_ns->log_buf_len = new_log_buf_len;
> +	syslog_ns->log_buf = new_log_buf;
>  	new_log_buf_len = 0;
> -	free = __LOG_BUF_LEN - log_next_idx;
> -	memcpy(log_buf, __log_buf, __LOG_BUF_LEN);
> -	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> +	free = __LOG_BUF_LEN - syslog_ns->log_next_idx;
> +	raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
> 
> -	pr_info("log_buf_len: %d\n", log_buf_len);
> +	pr_info("log_buf_len: %d\n", syslog_ns->log_buf_len);
>  	pr_info("early log buf free: %d(%d%%)\n",
>  		free, (free * 100) / __LOG_BUF_LEN);
>  }
> @@ -937,7 +923,8 @@ static size_t msg_print_text(const struct log *msg, enum log_flags prev,
>  	return len;
>  }
> 
> -static int syslog_print(char __user *buf, int size)
> +static int syslog_print(char __user *buf, int size,
> +			 struct syslog_namespace *syslog_ns)
>  {
>  	char *text;
>  	struct log *msg;
> @@ -951,37 +938,38 @@ static int syslog_print(char __user *buf, int size)
>  		size_t n;
>  		size_t skip;
> 
> -		raw_spin_lock_irq(&logbuf_lock);
> -		if (syslog_seq < log_first_seq) {
> +		raw_spin_lock_irq(&syslog_ns->logbuf_lock);
> +		if (syslog_ns->syslog_seq < syslog_ns->log_first_seq) {
>  			/* messages are gone, move to first one */
> -			syslog_seq = log_first_seq;
> -			syslog_idx = log_first_idx;
> -			syslog_prev = 0;
> -			syslog_partial = 0;
> +			syslog_ns->syslog_seq = syslog_ns->log_first_seq;
> +			syslog_ns->syslog_idx = syslog_ns->log_first_idx;
> +			syslog_ns->syslog_prev = 0;
> +			syslog_ns->syslog_partial = 0;
>  		}
> -		if (syslog_seq == log_next_seq) {
> -			raw_spin_unlock_irq(&logbuf_lock);
> +		if (syslog_ns->syslog_seq == syslog_ns->log_next_seq) {
> +			raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
>  			break;
>  		}
> 
> -		skip = syslog_partial;
> -		msg = log_from_idx(syslog_idx);
> -		n = msg_print_text(msg, syslog_prev, true, text,
> +		skip = syslog_ns->syslog_partial;
> +		msg = log_from_idx(syslog_ns->syslog_idx, syslog_ns);
> +		n = msg_print_text(msg, syslog_ns->syslog_prev, true, text,
>  				   LOG_LINE_MAX + PREFIX_MAX);
> -		if (n - syslog_partial <= size) {
> +		if (n - syslog_ns->syslog_partial <= size) {
>  			/* message fits into buffer, move forward */
> -			syslog_idx = log_next(syslog_idx);
> -			syslog_seq++;
> -			syslog_prev = msg->flags;
> -			n -= syslog_partial;
> -			syslog_partial = 0;
> +			syslog_ns->syslog_idx =
> +				log_next(syslog_ns->syslog_idx, syslog_ns);
> +			syslog_ns->syslog_seq++;
> +			syslog_ns->syslog_prev = msg->flags;
> +			n -= syslog_ns->syslog_partial;
> +			syslog_ns->syslog_partial = 0;
>  		} else if (!len){
>  			/* partial read(), remember position */
>  			n = size;
> -			syslog_partial += n;
> +			syslog_ns->syslog_partial += n;
>  		} else
>  			n = 0;
> -		raw_spin_unlock_irq(&logbuf_lock);
> +		raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
> 
>  		if (!n)
>  			break;
> @@ -1001,7 +989,8 @@ static int syslog_print(char __user *buf, int size)
>  	return len;
>  }
> 
> -static int syslog_print_all(char __user *buf, int size, bool clear)
> +static int syslog_print_all(char __user *buf, int size, bool clear,
> +				struct syslog_namespace *syslog_ns)
>  {
>  	char *text;
>  	int len = 0;
> @@ -1010,55 +999,55 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
>  	if (!text)
>  		return -ENOMEM;
> 
> -	raw_spin_lock_irq(&logbuf_lock);
> +	raw_spin_lock_irq(&syslog_ns->logbuf_lock);
>  	if (buf) {
>  		u64 next_seq;
>  		u64 seq;
>  		u32 idx;
>  		enum log_flags prev;
> 
> -		if (clear_seq < log_first_seq) {
> +		if (syslog_ns->clear_seq < syslog_ns->log_first_seq) {
>  			/* messages are gone, move to first available one */
> -			clear_seq = log_first_seq;
> -			clear_idx = log_first_idx;
> +			syslog_ns->clear_seq = syslog_ns->log_first_seq;
> +			syslog_ns->clear_idx = syslog_ns->log_first_idx;
>  		}
> 
>  		/*
>  		 * Find first record that fits, including all following records,
>  		 * into the user-provided buffer for this dump.
>  		 */
> -		seq = clear_seq;
> -		idx = clear_idx;
> +		seq = syslog_ns->clear_seq;
> +		idx = syslog_ns->clear_idx;
>  		prev = 0;
> -		while (seq < log_next_seq) {
> -			struct log *msg = log_from_idx(idx);
> +		while (seq < syslog_ns->log_next_seq) {
> +			struct log *msg = log_from_idx(idx, syslog_ns);
> 
>  			len += msg_print_text(msg, prev, true, NULL, 0);
>  			prev = msg->flags;
> -			idx = log_next(idx);
> +			idx = log_next(idx, syslog_ns);
>  			seq++;
>  		}
> 
>  		/* move first record forward until length fits into the buffer */
> -		seq = clear_seq;
> -		idx = clear_idx;
> +		seq = syslog_ns->clear_seq;
> +		idx = syslog_ns->clear_idx;
>  		prev = 0;
> -		while (len > size && seq < log_next_seq) {
> -			struct log *msg = log_from_idx(idx);
> +		while (len > size && seq < syslog_ns->log_next_seq) {
> +			struct log *msg = log_from_idx(idx, syslog_ns);
> 
>  			len -= msg_print_text(msg, prev, true, NULL, 0);
>  			prev = msg->flags;
> -			idx = log_next(idx);
> +			idx = log_next(idx, syslog_ns);
>  			seq++;
>  		}
> 
>  		/* last message fitting into this dump */
> -		next_seq = log_next_seq;
> +		next_seq = syslog_ns->log_next_seq;
> 
>  		len = 0;
>  		prev = 0;
>  		while (len >= 0 && seq < next_seq) {
> -			struct log *msg = log_from_idx(idx);
> +			struct log *msg = log_from_idx(idx, syslog_ns);
>  			int textlen;
> 
>  			textlen = msg_print_text(msg, prev, true, text,
> @@ -1067,31 +1056,31 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
>  				len = textlen;
>  				break;
>  			}
> -			idx = log_next(idx);
> +			idx = log_next(idx, syslog_ns);
>  			seq++;
>  			prev = msg->flags;
> 
> -			raw_spin_unlock_irq(&logbuf_lock);
> +			raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
>  			if (copy_to_user(buf + len, text, textlen))
>  				len = -EFAULT;
>  			else
>  				len += textlen;
> -			raw_spin_lock_irq(&logbuf_lock);
> +			raw_spin_lock_irq(&syslog_ns->logbuf_lock);
> 
> -			if (seq < log_first_seq) {
> +			if (seq < syslog_ns->log_first_seq) {
>  				/* messages are gone, move to next one */
> -				seq = log_first_seq;
> -				idx = log_first_idx;
> +				seq = syslog_ns->log_first_seq;
> +				idx = syslog_ns->log_first_idx;
>  				prev = 0;
>  			}
>  		}
>  	}
> 
>  	if (clear) {
> -		clear_seq = log_next_seq;
> -		clear_idx = log_next_idx;
> +		syslog_ns->clear_seq = syslog_ns->log_next_seq;
> +		syslog_ns->clear_idx = syslog_ns->log_next_idx;
>  	}
> -	raw_spin_unlock_irq(&logbuf_lock);
> +	raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
> 
>  	kfree(text);
>  	return len;
> @@ -1102,6 +1091,7 @@ int do_syslog(int type, char __user *buf, int len, bool from_file)
>  	bool clear = false;
>  	static int saved_console_loglevel = -1;
>  	int error;
> +	struct syslog_namespace *syslog_ns = current_syslog_ns();
> 
>  	error = check_syslog_permissions(type, from_file);
>  	if (error)
> @@ -1128,10 +1118,10 @@ int do_syslog(int type, char __user *buf, int len, bool from_file)
>  			goto out;
>  		}
>  		error = wait_event_interruptible(log_wait,
> -						 syslog_seq != log_next_seq);
> +			syslog_ns->syslog_seq != syslog_ns->log_next_seq);
>  		if (error)
>  			goto out;
> -		error = syslog_print(buf, len);
> +		error = syslog_print(buf, len, syslog_ns);
>  		break;
>  	/* Read/clear last kernel messages */
>  	case SYSLOG_ACTION_READ_CLEAR:
> @@ -1149,11 +1139,11 @@ int do_syslog(int type, char __user *buf, int len, bool from_file)
>  			error = -EFAULT;
>  			goto out;
>  		}
> -		error = syslog_print_all(buf, len, clear);
> +		error = syslog_print_all(buf, len, clear, syslog_ns);
>  		break;
>  	/* Clear ring buffer */
>  	case SYSLOG_ACTION_CLEAR:
> -		syslog_print_all(NULL, 0, true);
> +		syslog_print_all(NULL, 0, true, syslog_ns);
>  		break;
>  	/* Disable logging to console */
>  	case SYSLOG_ACTION_CONSOLE_OFF:
> @@ -1182,13 +1172,13 @@ int do_syslog(int type, char __user *buf, int len, bool from_file)
>  		break;
>  	/* Number of chars in the log buffer */
>  	case SYSLOG_ACTION_SIZE_UNREAD:
> -		raw_spin_lock_irq(&logbuf_lock);
> -		if (syslog_seq < log_first_seq) {
> +		raw_spin_lock_irq(&syslog_ns->logbuf_lock);
> +		if (syslog_ns->syslog_seq < syslog_ns->log_first_seq) {
>  			/* messages are gone, move to first one */
> -			syslog_seq = log_first_seq;
> -			syslog_idx = log_first_idx;
> -			syslog_prev = 0;
> -			syslog_partial = 0;
> +			syslog_ns->syslog_seq = syslog_ns->log_first_seq;
> +			syslog_ns->syslog_idx = syslog_ns->log_first_idx;
> +			syslog_ns->syslog_prev = 0;
> +			syslog_ns->syslog_partial = 0;
>  		}
>  		if (from_file) {
>  			/*
> @@ -1196,28 +1186,28 @@ int do_syslog(int type, char __user *buf, int len, bool from_file)
>  			 * for pending data, not the size; return the count of
>  			 * records, not the length.
>  			 */
> -			error = log_next_idx - syslog_idx;
> +			error = syslog_ns->log_next_idx - syslog_ns->syslog_idx;
>  		} else {
> -			u64 seq = syslog_seq;
> -			u32 idx = syslog_idx;
> -			enum log_flags prev = syslog_prev;
> +			u64 seq = syslog_ns->syslog_seq;
> +			u32 idx = syslog_ns->syslog_idx;
> +			enum log_flags prev = syslog_ns->syslog_prev;
> 
>  			error = 0;
> -			while (seq < log_next_seq) {
> -				struct log *msg = log_from_idx(idx);
> +			while (seq < syslog_ns->log_next_seq) {
> +				struct log *msg = log_from_idx(idx, syslog_ns);
> 
>  				error += msg_print_text(msg, prev, true, NULL, 0);
> -				idx = log_next(idx);
> +				idx = log_next(idx, syslog_ns);
>  				seq++;
>  				prev = msg->flags;
>  			}
> -			error -= syslog_partial;
> +			error -= syslog_ns->syslog_partial;
>  		}
> -		raw_spin_unlock_irq(&logbuf_lock);
> +		raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
>  		break;
>  	/* Size of the log buffer */
>  	case SYSLOG_ACTION_SIZE_BUFFER:
> -		error = log_buf_len;
> +		error = syslog_ns->log_buf_len;
>  		break;
>  	default:
>  		error = -EINVAL;
> @@ -1282,7 +1272,7 @@ static void call_console_drivers(int level, const char *text, size_t len)
>   * every 10 seconds, to leave time for slow consoles to print a
>   * full oops.
>   */
> -static void zap_locks(void)
> +static void zap_locks(struct syslog_namespace *syslog_ns)
>  {
>  	static unsigned long oops_timestamp;
> 
> @@ -1294,7 +1284,7 @@ static void zap_locks(void)
> 
>  	debug_locks_off();
>  	/* If a crash is occurring, make sure we can't deadlock */
> -	raw_spin_lock_init(&logbuf_lock);
> +	raw_spin_lock_init(&syslog_ns->logbuf_lock);
>  	/* And make sure that we print immediately */
>  	sema_init(&console_sem, 1);
>  }
> @@ -1334,8 +1324,9 @@ static inline int can_use_console(unsigned int cpu)
>   * interrupts disabled. It should return with 'lockbuf_lock'
>   * released but interrupts still disabled.
>   */
> -static int console_trylock_for_printk(unsigned int cpu)
> -	__releases(&logbuf_lock)
> +static int console_trylock_for_printk(unsigned int cpu,
> +			 struct syslog_namespace *syslog_ns)
> +	__releases(&syslog_ns->logbuf_lock)
>  {
>  	int retval = 0, wake = 0;
> 
> @@ -1357,7 +1348,7 @@ static int console_trylock_for_printk(unsigned int cpu)
>  	logbuf_cpu = UINT_MAX;
>  	if (wake)
>  		up(&console_sem);
> -	raw_spin_unlock(&logbuf_lock);
> +	raw_spin_unlock(&syslog_ns->logbuf_lock);
>  	return retval;
>  }
> 
> @@ -1393,7 +1384,7 @@ static struct cont {
>  	bool flushed:1;			/* buffer sealed and committed */
>  } cont;
> 
> -static void cont_flush(enum log_flags flags)
> +static void cont_flush(enum log_flags flags, struct syslog_namespace *syslog_ns)
>  {
>  	if (cont.flushed)
>  		return;
> @@ -1407,7 +1398,7 @@ static void cont_flush(enum log_flags flags)
>  		 * line. LOG_NOCONS suppresses a duplicated output.
>  		 */
>  		log_store(cont.facility, cont.level, flags | LOG_NOCONS,
> -			  cont.ts_nsec, NULL, 0, cont.buf, cont.len);
> +			  cont.ts_nsec, NULL, 0, cont.buf, cont.len, syslog_ns);
>  		cont.flags = flags;
>  		cont.flushed = true;
>  	} else {
> @@ -1416,19 +1407,20 @@ static void cont_flush(enum log_flags flags)
>  		 * just submit it to the store and free the buffer.
>  		 */
>  		log_store(cont.facility, cont.level, flags, 0,
> -			  NULL, 0, cont.buf, cont.len);
> +			  NULL, 0, cont.buf, cont.len, syslog_ns);
>  		cont.len = 0;
>  	}
>  }
> 
> -static bool cont_add(int facility, int level, const char *text, size_t len)
> +static bool cont_add(int facility, int level, const char *text, size_t len,
> +					struct syslog_namespace *syslog_ns)
>  {
>  	if (cont.len && cont.flushed)
>  		return false;
> 
>  	if (cont.len + len > sizeof(cont.buf)) {
>  		/* the line gets too long, split it up in separate records */
> -		cont_flush(LOG_CONT);
> +		cont_flush(LOG_CONT, syslog_ns);
>  		return false;
>  	}
> 
> @@ -1446,7 +1438,7 @@ static bool cont_add(int facility, int level, const char *text, size_t len)
>  	cont.len += len;
> 
>  	if (cont.len > (sizeof(cont.buf) * 80) / 100)
> -		cont_flush(LOG_CONT);
> +		cont_flush(LOG_CONT, syslog_ns);
> 
>  	return true;
>  }
> @@ -1481,7 +1473,8 @@ static size_t cont_print_text(char *text, size_t size)
> 
>  asmlinkage int vprintk_emit(int facility, int level,
>  			    const char *dict, size_t dictlen,
> -			    const char *fmt, va_list args)
> +			    const char *fmt, va_list args,
> +			    struct syslog_namespace *syslog_ns)
>  {
>  	static int recursion_bug;
>  	static char textbuf[LOG_LINE_MAX];
> @@ -1514,11 +1507,11 @@ asmlinkage int vprintk_emit(int facility, int level,
>  			recursion_bug = 1;
>  			goto out_restore_irqs;
>  		}
> -		zap_locks();
> +		zap_locks(syslog_ns);
>  	}
> 
>  	lockdep_off();
> -	raw_spin_lock(&logbuf_lock);
> +	raw_spin_lock(&syslog_ns->logbuf_lock);
>  	logbuf_cpu = this_cpu;
> 
>  	if (recursion_bug) {
> @@ -1529,7 +1522,7 @@ asmlinkage int vprintk_emit(int facility, int level,
>  		printed_len += strlen(recursion_msg);
>  		/* emit KERN_CRIT message */
>  		log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
> -			  NULL, 0, recursion_msg, printed_len);
> +			  NULL, 0, recursion_msg, printed_len, syslog_ns);
>  	}
> 
>  	/*
> @@ -1576,12 +1569,12 @@ asmlinkage int vprintk_emit(int facility, int level,
>  		 * or another task also prints continuation lines.
>  		 */
>  		if (cont.len && (lflags & LOG_PREFIX || cont.owner != current))
> -			cont_flush(LOG_NEWLINE);
> +			cont_flush(LOG_NEWLINE, syslog_ns);
> 
>  		/* buffer line if possible, otherwise store it right away */
> -		if (!cont_add(facility, level, text, text_len))
> +		if (!cont_add(facility, level, text, text_len, syslog_ns))
>  			log_store(facility, level, lflags | LOG_CONT, 0,
> -				  dict, dictlen, text, text_len);
> +				  dict, dictlen, text, text_len, syslog_ns);
>  	} else {
>  		bool stored = false;
> 
> @@ -1593,13 +1586,14 @@ asmlinkage int vprintk_emit(int facility, int level,
>  		 */
>  		if (cont.len && cont.owner == current) {
>  			if (!(lflags & LOG_PREFIX))
> -				stored = cont_add(facility, level, text, text_len);
> -			cont_flush(LOG_NEWLINE);
> +				stored = cont_add(facility, level, text,
> +							 text_len, syslog_ns);
> +			cont_flush(LOG_NEWLINE, syslog_ns);
>  		}
> 
>  		if (!stored)
>  			log_store(facility, level, lflags, 0,
> -				  dict, dictlen, text, text_len);
> +				  dict, dictlen, text, text_len, syslog_ns);
>  	}
>  	printed_len += text_len;
> 
> @@ -1611,7 +1605,7 @@ asmlinkage int vprintk_emit(int facility, int level,
>  	 * The console_trylock_for_printk() function will release 'logbuf_lock'
>  	 * regardless of whether it actually gets the console semaphore or not.
>  	 */
> -	if (console_trylock_for_printk(this_cpu))
> +	if (console_trylock_for_printk(this_cpu, syslog_ns))
>  		console_unlock();
> 
>  	lockdep_on();
> @@ -1624,7 +1618,8 @@ EXPORT_SYMBOL(vprintk_emit);
> 
>  asmlinkage int vprintk(const char *fmt, va_list args)
>  {
> -	return vprintk_emit(0, -1, NULL, 0, fmt, args);
> +	return vprintk_emit(0, -1, NULL, 0, fmt, args,
> +				current_syslog_ns());
>  }
>  EXPORT_SYMBOL(vprintk);
> 
> @@ -1636,7 +1631,8 @@ asmlinkage int printk_emit(int facility, int level,
>  	int r;
> 
>  	va_start(args, fmt);
> -	r = vprintk_emit(facility, level, dict, dictlen, fmt, args);
> +	r = vprintk_emit(facility, level, dict, dictlen, fmt, args,
> +						current_syslog_ns());
>  	va_end(args);
> 
>  	return r;
> @@ -1678,7 +1674,7 @@ asmlinkage int printk(const char *fmt, ...)
>  	}
>  #endif
>  	va_start(args, fmt);
> -	r = vprintk_emit(0, -1, NULL, 0, fmt, args);
> +	r = vprintk_emit(0, -1, NULL, 0, fmt, args, current_syslog_ns());

Current is meaningless here.  The default should be using init_syslog_ns.
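
To illustrate the point, here is a minimal userspace sketch, not the
patch's code. It assumes the objection is that printk() can be called
from contexts (for example interrupts) where the current task has no
relation to the message. struct syslog_namespace and init_syslog_ns are
names from the patch; toy_printk(), ns_printk(), ns_vprintk() and the
flat buffer are hypothetical stand-ins.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define LOG_BUF_LEN 256

/* Toy stand-in for the patch's struct syslog_namespace. */
struct syslog_namespace {
	char buf[LOG_BUF_LEN];
	size_t len;
};

/* The initial namespace, named as in the patch. */
static struct syslog_namespace init_syslog_ns;

/* Hypothetical core: append a formatted message to one namespace's buffer. */
static int ns_vprintk(struct syslog_namespace *ns, const char *fmt, va_list args)
{
	size_t room = LOG_BUF_LEN - ns->len;	/* always >= 1: len stays below LOG_BUF_LEN */
	int n = vsnprintf(ns->buf + ns->len, room, fmt, args);

	if (n < 0)
		return n;
	if ((size_t)n >= room)
		n = (int)(room - 1);	/* output was truncated */
	ns->len += (size_t)n;
	return n;
}

/* Hypothetical explicit variant: the caller names the target namespace. */
static int ns_printk(struct syslog_namespace *ns, const char *fmt, ...)
{
	va_list args;
	int r;

	va_start(args, fmt);
	r = ns_vprintk(ns, fmt, args);
	va_end(args);
	return r;
}

/* Default entry point: always logs to init_syslog_ns, never to "current". */
static int toy_printk(const char *fmt, ...)
{
	va_list args;
	int r;

	va_start(args, fmt);
	r = ns_vprintk(&init_syslog_ns, fmt, args);
	va_end(args);
	return r;
}
```

With this split, only code that knows which container a message belongs
to would call the explicit variant; everything else keeps landing in the
initial namespace.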

>  	va_end(args);
> 
>  	return r;
> @@ -1981,12 +1977,13 @@ void wake_up_klogd(void)
>  		this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
>  }
> 
> -static void console_cont_flush(char *text, size_t size)
> +static void console_cont_flush(char *text, size_t size,
> +			 struct syslog_namespace *syslog_ns)
>  {
>  	unsigned long flags;
>  	size_t len;
> 
> -	raw_spin_lock_irqsave(&logbuf_lock, flags);
> +	raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
> 
>  	if (!cont.len)
>  		goto out;
> @@ -1996,18 +1993,131 @@ static void console_cont_flush(char *text, size_t size)
>  	 * busy. The earlier ones need to be printed before this one, we
>  	 * did not flush any fragment so far, so just let it queue up.
>  	 */
> -	if (console_seq < log_next_seq && !cont.cons)
> +	if (syslog_ns->console_seq < syslog_ns->log_next_seq && !cont.cons)
>  		goto out;
> 
>  	len = cont_print_text(text, size);
> -	raw_spin_unlock(&logbuf_lock);
> +	raw_spin_unlock(&syslog_ns->logbuf_lock);
>  	stop_critical_timings();
>  	call_console_drivers(cont.level, text, len);
>  	start_critical_timings();
>  	local_irq_restore(flags);
>  	return;
>  out:
> -	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> +	raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
> +}
> +
> +/**
> + * syslog_console_unlock - unlock the console system for syslog_namespace
> + *
> + * Releases the console_lock which the caller holds on the console system
> + * and the console driver list.
> + *
> + * While the console_lock was held, console output may have been buffered
> + * by printk().  If this is the case, syslog_console_unlock(); emits
> + * the output prior to releasing the lock.
> + *
> + * If there is output waiting, we wake /dev/kmsg and syslog() users.
> + *
> + * syslog_console_unlock(); may be called from any context.
> + */
> +void syslog_console_unlock(struct syslog_namespace *syslog_ns)
> +{
> +	static char text[LOG_LINE_MAX + PREFIX_MAX];
> +	static u64 seen_seq;
> +	unsigned long flags;
> +	bool wake_klogd = false;
> +	bool retry;
> +
> +	if (console_suspended) {
> +		up(&console_sem);
> +		return;
> +	}
> +
> +	console_may_schedule = 0;
> +
> +	/* flush buffered message fragment immediately to console */
> +	console_cont_flush(text, sizeof(text), syslog_ns);
> +again:
> +	for (;;) {
> +		struct log *msg;
> +		size_t len;
> +		int level;
> +
> +		raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
> +		if (seen_seq != syslog_ns->log_next_seq) {
> +			wake_klogd = true;
> +			seen_seq = syslog_ns->log_next_seq;
> +		}
> +
> +		if (syslog_ns->console_seq < syslog_ns->log_first_seq) {
> +			/* messages are gone, move to first one */
> +			syslog_ns->console_seq = syslog_ns->log_first_seq;
> +			syslog_ns->console_idx = syslog_ns->log_first_idx;
> +			console_prev = 0;
> +		}
> +skip:
> +		if (syslog_ns->console_seq == syslog_ns->log_next_seq)
> +			break;
> +
> +		msg = log_from_idx(syslog_ns->console_idx, syslog_ns);
> +		if (msg->flags & LOG_NOCONS) {
> +			/*
> +			 * Skip record we have buffered and already printed
> +			 * directly to the console when we received it.
> +			 */
> +			syslog_ns->console_idx =
> +				log_next(syslog_ns->console_idx, syslog_ns);
> +			syslog_ns->console_seq++;
> +			/*
> +			 * We will get here again when we register a new
> +			 * CON_PRINTBUFFER console. Clear the flag so we
> +			 * will properly dump everything later.
> +			 */
> +			msg->flags &= ~LOG_NOCONS;
> +			console_prev = msg->flags;
> +			goto skip;
> +		}
> +
> +		level = msg->level;
> +		len = msg_print_text(msg, console_prev, false,
> +				     text, sizeof(text));
> +		syslog_ns->console_idx =
> +			 log_next(syslog_ns->console_idx, syslog_ns);
> +		syslog_ns->console_seq++;
> +		console_prev = msg->flags;
> +		raw_spin_unlock(&syslog_ns->logbuf_lock);
> +
> +		stop_critical_timings();	/* don't trace print latency */
> +		call_console_drivers(level, text, len);
> +		start_critical_timings();
> +		local_irq_restore(flags);
> +	}
> +	console_locked = 0;
> +
> +	/* Release the exclusive_console once it is used */
> +	if (unlikely(exclusive_console))
> +		exclusive_console = NULL;
> +
> +	raw_spin_unlock(&syslog_ns->logbuf_lock);
> +
> +	up(&console_sem);
> +
> +	/*
> +	 * Someone could have filled up the buffer again, so re-check if there's
> +	 * something to flush. In case we cannot trylock the console_sem again,
> +	 * there's a new owner and the console_unlock() from them will do the
> +	 * flush, no worries.
> +	 */
> +	raw_spin_lock(&syslog_ns->logbuf_lock);
> +	retry = syslog_ns->console_seq != syslog_ns->log_next_seq;
> +	raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
> +
> +	if (retry && console_trylock())
> +		goto again;
> +
> +	if (wake_klogd)
> +		wake_up_klogd();
>  }
> 
>  /**
> @@ -2027,6 +2137,7 @@ out:
>  void console_unlock(void)
>  {
>  	static char text[LOG_LINE_MAX + PREFIX_MAX];
> +	struct syslog_namespace *syslog_ns = current_syslog_ns();
>  	static u64 seen_seq;
>  	unsigned long flags;
>  	bool wake_klogd = false;
> @@ -2040,37 +2151,38 @@ void console_unlock(void)
>  	console_may_schedule = 0;
> 
>  	/* flush buffered message fragment immediately to console */
> -	console_cont_flush(text, sizeof(text));
> +	console_cont_flush(text, sizeof(text), syslog_ns);
>  again:
>  	for (;;) {
>  		struct log *msg;
>  		size_t len;
>  		int level;
> 
> -		raw_spin_lock_irqsave(&logbuf_lock, flags);
> -		if (seen_seq != log_next_seq) {
> +		raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
> +		if (seen_seq != syslog_ns->log_next_seq) {
>  			wake_klogd = true;
> -			seen_seq = log_next_seq;
> +			seen_seq = syslog_ns->log_next_seq;
>  		}
> 
> -		if (console_seq < log_first_seq) {
> +		if (syslog_ns->console_seq < syslog_ns->log_first_seq) {
>  			/* messages are gone, move to first one */
> -			console_seq = log_first_seq;
> -			console_idx = log_first_idx;
> +			syslog_ns->console_seq = syslog_ns->log_first_seq;
> +			syslog_ns->console_idx = syslog_ns->log_first_idx;
>  			console_prev = 0;
>  		}
>  skip:
> -		if (console_seq == log_next_seq)
> +		if (syslog_ns->console_seq == syslog_ns->log_next_seq)
>  			break;
> 
> -		msg = log_from_idx(console_idx);
> +		msg = log_from_idx(syslog_ns->console_idx, syslog_ns);
>  		if (msg->flags & LOG_NOCONS) {
>  			/*
>  			 * Skip record we have buffered and already printed
>  			 * directly to the console when we received it.
>  			 */
> -			console_idx = log_next(console_idx);
> -			console_seq++;
> +			syslog_ns->console_idx =
> +				 log_next(syslog_ns->console_idx, syslog_ns);
> +			syslog_ns->console_seq++;
>  			/*
>  			 * We will get here again when we register a new
>  			 * CON_PRINTBUFFER console. Clear the flag so we
> @@ -2084,10 +2196,11 @@ skip:
>  		level = msg->level;
>  		len = msg_print_text(msg, console_prev, false,
>  				     text, sizeof(text));
> -		console_idx = log_next(console_idx);
> -		console_seq++;
> +		syslog_ns->console_idx =
> +			 log_next(syslog_ns->console_idx, syslog_ns);
> +		syslog_ns->console_seq++;
>  		console_prev = msg->flags;
> -		raw_spin_unlock(&logbuf_lock);
> +		raw_spin_unlock(&syslog_ns->logbuf_lock);
> 
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(level, text, len);
> @@ -2100,7 +2213,7 @@ skip:
>  	if (unlikely(exclusive_console))
>  		exclusive_console = NULL;
> 
> -	raw_spin_unlock(&logbuf_lock);
> +	raw_spin_unlock(&syslog_ns->logbuf_lock);
> 
>  	up(&console_sem);
> 
> @@ -2110,9 +2223,9 @@ skip:
>  	 * there's a new owner and the console_unlock() from them will do the
>  	 * flush, no worries.
>  	 */
> -	raw_spin_lock(&logbuf_lock);
> -	retry = console_seq != log_next_seq;
> -	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> +	raw_spin_lock(&syslog_ns->logbuf_lock);
> +	retry = syslog_ns->console_seq != syslog_ns->log_next_seq;
> +	raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
> 
>  	if (retry && console_trylock())
>  		goto again;
> @@ -2237,6 +2350,7 @@ void register_console(struct console *newcon)
>  	int i;
>  	unsigned long flags;
>  	struct console *bcon = NULL;
> +	struct syslog_namespace *syslog_ns = &init_syslog_ns;
> 
>  	/*
>  	 * before we register a new CON_BOOT console, make sure we don't
> @@ -2346,11 +2460,11 @@ void register_console(struct console *newcon)
>  		 * console_unlock(); will print out the buffered messages
>  		 * for us.
>  		 */
> -		raw_spin_lock_irqsave(&logbuf_lock, flags);
> -		console_seq = syslog_seq;
> -		console_idx = syslog_idx;
> -		console_prev = syslog_prev;
> -		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> +		raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
> +		syslog_ns->console_seq = syslog_ns->syslog_seq;
> +		syslog_ns->console_idx = syslog_ns->syslog_idx;
> +		console_prev = syslog_ns->syslog_prev;
> +		raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
>  		/*
>  		 * We're about to replay the log buffer.  Only do this to the
>  		 * just-registered console to avoid excessive message spam to
> @@ -2573,6 +2687,7 @@ void kmsg_dump(enum kmsg_dump_reason reason)
>  {
>  	struct kmsg_dumper *dumper;
>  	unsigned long flags;
> +	struct syslog_namespace *syslog_ns = &init_syslog_ns;
> 
>  	if ((reason > KMSG_DUMP_OOPS) && !always_kmsg_dump)
>  		return;
> @@ -2585,12 +2700,12 @@ void kmsg_dump(enum kmsg_dump_reason reason)
>  		/* initialize iterator with data about the stored records */
>  		dumper->active = true;
> 
> -		raw_spin_lock_irqsave(&logbuf_lock, flags);
> -		dumper->cur_seq = clear_seq;
> -		dumper->cur_idx = clear_idx;
> -		dumper->next_seq = log_next_seq;
> -		dumper->next_idx = log_next_idx;
> -		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> +		raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
> +		dumper->cur_seq = syslog_ns->clear_seq;
> +		dumper->cur_idx = syslog_ns->clear_idx;
> +		dumper->next_seq = syslog_ns->log_next_seq;
> +		dumper->next_idx = syslog_ns->log_next_idx;
> +		raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
> 
>  		/* invoke dumper which will iterate over records */
>  		dumper->dump(dumper, reason);
> @@ -2626,24 +2741,25 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog,
>  	struct log *msg;
>  	size_t l = 0;
>  	bool ret = false;
> +	struct syslog_namespace *syslog_ns = &init_syslog_ns;
> 
>  	if (!dumper->active)
>  		goto out;
> 
> -	if (dumper->cur_seq < log_first_seq) {
> +	if (dumper->cur_seq < syslog_ns->log_first_seq) {
>  		/* messages are gone, move to first available one */
> -		dumper->cur_seq = log_first_seq;
> -		dumper->cur_idx = log_first_idx;
> +		dumper->cur_seq = syslog_ns->log_first_seq;
> +		dumper->cur_idx = syslog_ns->log_first_idx;
>  	}
> 
>  	/* last entry */
> -	if (dumper->cur_seq >= log_next_seq)
> +	if (dumper->cur_seq >= syslog_ns->log_next_seq)
>  		goto out;
> 
> -	msg = log_from_idx(dumper->cur_idx);
> +	msg = log_from_idx(dumper->cur_idx, syslog_ns);
>  	l = msg_print_text(msg, 0, syslog, line, size);
> 
> -	dumper->cur_idx = log_next(dumper->cur_idx);
> +	dumper->cur_idx = log_next(dumper->cur_idx, syslog_ns);
>  	dumper->cur_seq++;
>  	ret = true;
>  out:
> @@ -2674,10 +2790,12 @@ bool kmsg_dump_get_line(struct kmsg_dumper *dumper, bool syslog,
>  {
>  	unsigned long flags;
>  	bool ret;
> +	struct syslog_namespace *syslog_ns = &init_syslog_ns;
> +
> +	raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
> 
> -	raw_spin_lock_irqsave(&logbuf_lock, flags);
>  	ret = kmsg_dump_get_line_nolock(dumper, syslog, line, size, len);
> -	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> +	raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
> 
>  	return ret;
>  }
> @@ -2713,20 +2831,21 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
>  	enum log_flags prev;
>  	size_t l = 0;
>  	bool ret = false;
> +	struct syslog_namespace *syslog_ns = &init_syslog_ns;
> 
>  	if (!dumper->active)
>  		goto out;
> 
> -	raw_spin_lock_irqsave(&logbuf_lock, flags);
> -	if (dumper->cur_seq < log_first_seq) {
> +	raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
> +	if (dumper->cur_seq < syslog_ns->log_first_seq) {
>  		/* messages are gone, move to first available one */
> -		dumper->cur_seq = log_first_seq;
> -		dumper->cur_idx = log_first_idx;
> +		dumper->cur_seq = syslog_ns->log_first_seq;
> +		dumper->cur_idx = syslog_ns->log_first_idx;
>  	}
> 
>  	/* last entry */
>  	if (dumper->cur_seq >= dumper->next_seq) {
> -		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> +		raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
>  		goto out;
>  	}
> 
> @@ -2735,10 +2854,10 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
>  	idx = dumper->cur_idx;
>  	prev = 0;
>  	while (seq < dumper->next_seq) {
> -		struct log *msg = log_from_idx(idx);
> +		struct log *msg = log_from_idx(idx, syslog_ns);
> 
>  		l += msg_print_text(msg, prev, true, NULL, 0);
> -		idx = log_next(idx);
> +		idx = log_next(idx, syslog_ns);
>  		seq++;
>  		prev = msg->flags;
>  	}
> @@ -2748,10 +2867,10 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
>  	idx = dumper->cur_idx;
>  	prev = 0;
>  	while (l > size && seq < dumper->next_seq) {
> -		struct log *msg = log_from_idx(idx);
> +		struct log *msg = log_from_idx(idx, syslog_ns);
> 
>  		l -= msg_print_text(msg, prev, true, NULL, 0);
> -		idx = log_next(idx);
> +		idx = log_next(idx, syslog_ns);
>  		seq++;
>  		prev = msg->flags;
>  	}
> @@ -2763,10 +2882,10 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
>  	l = 0;
>  	prev = 0;
>  	while (seq < dumper->next_seq) {
> -		struct log *msg = log_from_idx(idx);
> +		struct log *msg = log_from_idx(idx, syslog_ns);
> 
>  		l += msg_print_text(msg, prev, syslog, buf + l, size - l);
> -		idx = log_next(idx);
> +		idx = log_next(idx, syslog_ns);
>  		seq++;
>  		prev = msg->flags;
>  	}
> @@ -2774,7 +2893,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
>  	dumper->next_seq = next_seq;
>  	dumper->next_idx = next_idx;
>  	ret = true;
> -	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> +	raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
>  out:
>  	if (len)
>  		*len = l;
> @@ -2794,10 +2913,12 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
>   */
>  void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
>  {
> -	dumper->cur_seq = clear_seq;
> -	dumper->cur_idx = clear_idx;
> -	dumper->next_seq = log_next_seq;
> -	dumper->next_idx = log_next_idx;
> +	struct syslog_namespace *syslog_ns = &init_syslog_ns;
> +
> +	dumper->cur_seq = syslog_ns->clear_seq;
> +	dumper->cur_idx = syslog_ns->clear_idx;
> +	dumper->next_seq = syslog_ns->log_next_seq;
> +	dumper->next_idx = syslog_ns->log_next_idx;
>  }
> 
>  /**
> @@ -2811,10 +2932,12 @@ void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
>  void kmsg_dump_rewind(struct kmsg_dumper *dumper)
>  {
>  	unsigned long flags;
> +	struct syslog_namespace *syslog_ns = &init_syslog_ns;
> +
> +	raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
> 
> -	raw_spin_lock_irqsave(&logbuf_lock, flags);
>  	kmsg_dump_rewind_nolock(dumper);
> -	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> +	raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
>  }
>  EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
>  #endif
> -- 
> 1.7.1

^ permalink raw reply

* Re: [PATCH net-next ] net: Allow userns root to control tun and tap devices
From: Serge E. Hallyn @ 2012-11-19 14:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, Linux Containers, David Miller
In-Reply-To: <87a9uekpvw.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>

Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> 
> Allow an unprivileged user who has created a user namespace, and then
> created a network namespace to effectively use the new network
> namespace, by reducing capable(CAP_NET_ADMIN) calls to
> ns_capable(net->user_ns,CAP_NET_ADMIN) calls.
> 
> Allow setting of the tun iff flags.
> Allow creating of tun devices.
> Allow adding a new queue to a tun device.
> 

Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>

> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/net/tun.c |    5 +++--
>  1 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index b44d7b7..b01e8c0 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -373,10 +373,11 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb)
>  static inline bool tun_not_capable(struct tun_struct *tun)
>  {
>  	const struct cred *cred = current_cred();
> +	struct net *net = dev_net(tun->dev);
>  
>  	return ((uid_valid(tun->owner) && !uid_eq(cred->euid, tun->owner)) ||
>  		  (gid_valid(tun->group) && !in_egroup_p(tun->group))) &&
> -		!capable(CAP_NET_ADMIN);
> +		!ns_capable(net->user_ns, CAP_NET_ADMIN);
>  }
>  
>  static void tun_set_real_num_queues(struct tun_struct *tun)
> @@ -1559,7 +1560,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
>  		char *name;
>  		unsigned long flags = 0;
>  
> -		if (!capable(CAP_NET_ADMIN))
> +		if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
>  			return -EPERM;
>  		err = security_tun_dev_create();
>  		if (err < 0)
> -- 
> 1.7.5.4

^ permalink raw reply

* [PATCH 2/2] net: mvneta: fix section mismatch warning caused by mvneta_deinit()
From: Thomas Petazzoni @ 2012-11-19 14:13 UTC (permalink / raw)
  To: Jason Cooper
  Cc: David S. Miller, netdev, Gregory Clement, Andrew Lunn,
	linux-arm-kernel
In-Reply-To: <1353334396-26128-1-git-send-email-thomas.petazzoni@free-electrons.com>

mvneta_deinit() can be called from the ->probe() hook in the error
path, so it shouldn't be marked as __devexit. It fixes the following
section mismatch warning:

WARNING: vmlinux.o(.devinit.text+0x239c): Section mismatch in reference
from the function mvneta_probe() to the function .devexit.text:mvneta_deinit()
The function __devinit mvneta_probe() references
a function __devexit mvneta_deinit().

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 drivers/net/ethernet/marvell/mvneta.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index c90382a..ad0b551 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2605,7 +2605,7 @@ static int __devinit mvneta_init(struct mvneta_port *pp, int phy_addr)
 	return 0;
 }
 
-static void __devexit mvneta_deinit(struct mvneta_port *pp)
+static void mvneta_deinit(struct mvneta_port *pp)
 {
 	kfree(pp->txqs);
 	kfree(pp->rxqs);
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH 1/2] net: mvneta: add clk support
From: Thomas Petazzoni @ 2012-11-19 14:13 UTC (permalink / raw)
  To: Jason Cooper
  Cc: David S. Miller, netdev, Gregory Clement, Andrew Lunn,
	linux-arm-kernel
In-Reply-To: <1353334396-26128-1-git-send-email-thomas.petazzoni@free-electrons.com>

Now that the Armada 370/XP platform has gained proper integration with
the clock framework, we add clk support in the Marvell Armada 370/XP
Ethernet driver.

Since the existing Device Tree binding that exposes a
'clock-frequency' property has never been exposed in any stable kernel
release, we take the freedom of removing this property to replace it
with the standard 'clocks' clock pointer property.

The Device Tree binding documentation is updated accordingly.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 .../bindings/net/marvell-armada-370-neta.txt       |    4 +--
 drivers/net/ethernet/marvell/mvneta.c              |   31 +++++++++++++-------
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
index c4e87f0..859a6fa 100644
--- a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
+++ b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
@@ -8,7 +8,7 @@ Required properties:
   property, a single integer).
 - phy-mode: The interface between the SoC and the PHY (a string that
   of_get_phy_mode() can understand)
-- clock-frequency: frequency of the peripheral clock of the SoC.
+- clocks: a pointer to the reference clock for this device.
 
 Example:
 
@@ -16,7 +16,7 @@ ethernet@d0070000 {
 	compatible = "marvell,armada-370-neta";
 	reg = <0xd0070000 0x2500>;
 	interrupts = <8>;
-	clock-frequency = <250000000>;
+	clocks = <&gate_clk 4>;
 	status = "okay";
 	phy = <&phy0>;
 	phy-mode = "rgmii-id";
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index a7826f0..c90382a 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -29,6 +29,7 @@
 #include <linux/of_net.h>
 #include <linux/of_address.h>
 #include <linux/phy.h>
+#include <linux/clk.h>
 
 /* Registers */
 #define MVNETA_RXQ_CONFIG_REG(q)                (0x1400 + ((q) << 2))
@@ -243,7 +244,7 @@ struct mvneta_port {
 	int weight;
 
 	/* Core clock */
-	unsigned int clk_rate_hz;
+	struct clk *clk;
 	u8 mcast_count[256];
 	u16 tx_ring_size;
 	u16 rx_ring_size;
@@ -1029,7 +1030,11 @@ static void mvneta_rx_pkts_coal_set(struct mvneta_port *pp,
 static void mvneta_rx_time_coal_set(struct mvneta_port *pp,
 				    struct mvneta_rx_queue *rxq, u32 value)
 {
-	u32 val = (pp->clk_rate_hz / 1000000) * value;
+	u32 val;
+	unsigned long clk_rate;
+
+	clk_rate = clk_get_rate(pp->clk);
+	val = (clk_rate / 1000000) * value;
 
 	mvreg_write(pp, MVNETA_RXQ_TIME_COAL_REG(rxq->id), val);
 	rxq->time_coal = value;
@@ -2670,7 +2675,7 @@ static int __devinit mvneta_probe(struct platform_device *pdev)
 	const struct mbus_dram_target_info *dram_target_info;
 	struct device_node *dn = pdev->dev.of_node;
 	struct device_node *phy_node;
-	u32 phy_addr, clk_rate_hz;
+	u32 phy_addr;
 	struct mvneta_port *pp;
 	struct net_device *dev;
 	const char *mac_addr;
@@ -2710,12 +2715,6 @@ static int __devinit mvneta_probe(struct platform_device *pdev)
 		goto err_free_irq;
 	}
 
-	if (of_property_read_u32(dn, "clock-frequency", &clk_rate_hz) != 0) {
-		dev_err(&pdev->dev, "could not read clock-frequency\n");
-		err = -EINVAL;
-		goto err_free_irq;
-	}
-
 	mac_addr = of_get_mac_address(dn);
 
 	if (!mac_addr || !is_valid_ether_addr(mac_addr))
@@ -2736,7 +2735,6 @@ static int __devinit mvneta_probe(struct platform_device *pdev)
 	clear_bit(MVNETA_F_TX_DONE_TIMER_BIT, &pp->flags);
 
 	pp->weight = MVNETA_RX_POLL_WEIGHT;
-	pp->clk_rate_hz = clk_rate_hz;
 	pp->phy_node = phy_node;
 	pp->phy_interface = phy_mode;
 
@@ -2746,6 +2744,14 @@ static int __devinit mvneta_probe(struct platform_device *pdev)
 		goto err_free_irq;
 	}
 
+	pp->clk = devm_clk_get(&pdev->dev, NULL);
+	if (IS_ERR(pp->clk)) {
+		err = PTR_ERR(pp->clk);
+		goto err_unmap;
+	}
+
+	clk_prepare_enable(pp->clk);
+
 	pp->tx_done_timer.data = (unsigned long)dev;
 
 	pp->tx_ring_size = MVNETA_MAX_TXD;
@@ -2757,7 +2763,7 @@ static int __devinit mvneta_probe(struct platform_device *pdev)
 	err = mvneta_init(pp, phy_addr);
 	if (err < 0) {
 		dev_err(&pdev->dev, "can't init eth hal\n");
-		goto err_unmap;
+		goto err_clk;
 	}
 	mvneta_port_power_up(pp, phy_mode);
 
@@ -2785,6 +2791,8 @@ static int __devinit mvneta_probe(struct platform_device *pdev)
 
 err_deinit:
 	mvneta_deinit(pp);
+err_clk:
+	clk_disable_unprepare(pp->clk);
 err_unmap:
 	iounmap(pp->base);
 err_free_irq:
@@ -2802,6 +2810,7 @@ static int __devexit mvneta_remove(struct platform_device *pdev)
 
 	unregister_netdev(dev);
 	mvneta_deinit(pp);
+	clk_disable_unprepare(pp->clk);
 	iounmap(pp->base);
 	irq_dispose_mapping(dev->irq);
 	free_netdev(dev);
-- 
1.7.9.5
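
As a worked example of the arithmetic in mvneta_rx_time_coal_set(),
here is a small standalone sketch. It assumes 'value' is a time in
microseconds (the division by 1000000 suggests this) and that the
register takes clock ticks; rx_time_coal_ticks() is a hypothetical
helper, not a function in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's computation:
 * val = (clk_rate / 1000000) * value
 */
static uint32_t rx_time_coal_ticks(unsigned long clk_rate_hz, uint32_t usecs)
{
	/* A rate below 1 MHz would truncate to 0 ticks per microsecond. */
	assert(clk_rate_hz >= 1000000UL);

	return (uint32_t)(clk_rate_hz / 1000000UL) * usecs;
}
```

With the 250000000 Hz rate from the old 'clock-frequency' example in
the binding, rx_time_coal_ticks(250000000UL, 100) yields 25000 ticks.
The difference after the patch is only where the rate comes from:
clk_get_rate() instead of a Device Tree integer.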

^ permalink raw reply related

* [GIT PULL] Marvell Ethernet driver clk support + section mismatch fix
From: Thomas Petazzoni @ 2012-11-19 14:13 UTC (permalink / raw)
  To: Jason Cooper
  Cc: David S. Miller, netdev, Gregory Clement, Andrew Lunn,
	linux-arm-kernel

Jason,

Here are two additional patches for the mvneta Ethernet driver
(intended for your mvebu/driver branch):

 * One adding clk support. It's not a fix, but I think it's better if
   we can get the Device Tree binding correct directly in 3.8. This
   commit basically replaces the clock-frequency DT property by a
   proper 'clocks' DT property that references a DT clock. If we don't
   get this in 3.8, we'll have to support the legacy clock-frequency
   DT property for ever as it will be part of the kernel-to-DT ABI. So
   it would be nicer to get this patch in for 3.8.

   The DT patches related to this one are coming in a separate pull
   request.

 * The second one fixes a section mismatch warning, so it is certainly
   important.

Thanks!

The following changes since commit 803b4905545039f97c7824bcab605e91ac20acbb:

  Merge remote-tracking branch 'jcooper/mvebu/dt' into mvneta-fixes (2012-11-19 12:08:58 +0100)

are available in the git repository at:

  git@github.com:MISL-EBU-System-SW/mainline-public.git tags/marvell-mvneta-fix-and-clk-support-3.8

for you to fetch changes up to 4b968e8f98603a422f35b18682c169fb706c0115:

  net: mvneta: fix section mismatch warning caused by mvneta_deinit() (2012-11-19 14:45:25 +0100)

----------------------------------------------------------------
Marvell Ethernet driver fix + clk support

----------------------------------------------------------------
Thomas Petazzoni (2):
      net: mvneta: add clk support
      net: mvneta: fix section mismatch warning caused by mvneta_deinit()

 .../bindings/net/marvell-armada-370-neta.txt       |    4 +--
 drivers/net/ethernet/marvell/mvneta.c              |   33 +++++++++++++-------
 2 files changed, 23 insertions(+), 14 deletions(-)

^ permalink raw reply

* Re: [PATCH 1/1] iwlwifi: Remove duplicate inclusion of iwl-trans.h in pcie/drv.c
From: Johannes Berg @ 2012-11-19 14:09 UTC (permalink / raw)
  To: Sachin Kamat
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, ilw-VuQAYsv1563Yd54FQh9/CA,
	wey-yi.w.guy-ral2JQCrhuEAvxtiuMwx3w,
	patches-QSEj5FYQhm4dnm+yROfE0A
In-Reply-To: <1353325540-9338-1-git-send-email-sachin.kamat-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

On Mon, 2012-11-19 at 17:15 +0530, Sachin Kamat wrote:
> iwl-trans.h was included twice.

Thanks, I've picked this up.

johannes

^ permalink raw reply

* Re: [PATCH 3/3] net: mvneta: adjust multiline comments to net/ style
From: Thomas Petazzoni @ 2012-11-19 12:56 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: Jason Cooper, Lior Amsalem, Andrew Lunn, netdev, Gregory Clement,
	David S. Miller, linux-arm-kernel
In-Reply-To: <50AA2AD4.6030007@mvista.com>

Dear Sergei Shtylyov,

On Mon, 19 Nov 2012 16:49:24 +0400, Sergei Shtylyov wrote:

> > -/*
> > - * The two bytes Marvell header. Either contains a special value used
> > +/* The two bytes Marvell header. Either contains a special value used
> 
>     Why the heck are you doing this? It's the preferred style, see
> Documentation/CodingStyle, chapter 8.

No. Please read Documentation/CodingStyle, chapter 8 entirely:

===============================================================
The preferred style for long (multi-line) comments is:

        /*
         * This is the preferred style for multi-line
         * comments in the Linux kernel source code.
         * Please use it consistently.
         *
         * Description:  A column of asterisks on the left side,
         * with beginning and ending almost-blank lines.
         */

For files in net/ and drivers/net/ the preferred style for long
(multi-line) comments is a little different.

        /* The preferred comment style for files in net/ and drivers/net
         * looks like this.
         *
         * It is nearly the same as the generally preferred comment style,
         * but there is no initial almost-blank line.
         */
===============================================================

Seen the second part?

Please also see scripts/checkpatch.pl:

                if ($realfile =~ m@^(drivers/net/|net/)@ &&
                    $rawline =~ /^\+[ \t]*\/\*[ \t]*$/ &&
                    $prevrawline =~ /^\+[ \t]*$/) {
                        WARN("NETWORKING_BLOCK_COMMENT_STYLE",
                             "networking block comments don't use an empty /* line, use /* Comment...\n" . $hereprev);
                }
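
For readers who do not read Perl regexes fluently, the check quoted above
can be approximated by the following C sketch. It is only an illustration
and not part of checkpatch: the function names (net_comment_style_warns and
its helpers) are made up here, and each line is assumed to be a
NUL-terminated diff line without a trailing newline.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if s holds nothing but spaces and tabs up to the terminator. */
static bool only_blanks(const char *s)
{
	while (*s == ' ' || *s == '\t')
		s++;
	return *s == '\0';
}

/* Mirrors /^\+[ \t]*$/: an added line that is empty or whitespace-only. */
static bool is_added_blank(const char *line)
{
	return line[0] == '+' && only_blanks(line + 1);
}

/* Mirrors /^\+[ \t]*\/\*[ \t]*$/: an added line holding only the
 * comment opener.
 */
static bool is_added_bare_opener(const char *line)
{
	const char *p;

	if (line[0] != '+')
		return false;
	p = line + 1;
	while (*p == ' ' || *p == '\t')
		p++;
	return p[0] == '/' && p[1] == '*' && only_blanks(p + 2);
}

/* Mirrors m@^(drivers/net/|net/)@ on the file path. */
static bool in_net_tree(const char *path)
{
	return strncmp(path, "drivers/net/", 12) == 0 ||
	       strncmp(path, "net/", 4) == 0;
}

/* Hypothetical wrapper: true where the quoted rule would emit
 * NETWORKING_BLOCK_COMMENT_STYLE.
 */
bool net_comment_style_warns(const char *path, const char *prev,
			     const char *line)
{
	assert(path && prev && line);
	return in_net_tree(path) && is_added_bare_opener(line) &&
	       is_added_blank(prev);
}
```

With this sketch, a bare "+/*" line following an added blank line in
drivers/net/ethernet/marvell/mvneta.c is flagged, while
"+/* The two bytes Marvell header." is not.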

Thanks,

Thomas
-- 
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

^ permalink raw reply

* Re: [PATCH 3/3] net: mvneta: adjust multiline comments to net/ style
From: Sergei Shtylyov @ 2012-11-19 12:49 UTC (permalink / raw)
  To: Thomas Petazzoni
  Cc: Jason Cooper, Lior Amsalem, Andrew Lunn, netdev, Gregory Clement,
	David S. Miller, linux-arm-kernel
In-Reply-To: <1353322834-16952-4-git-send-email-thomas.petazzoni@free-electrons.com>

Hello.

On 19-11-2012 15:00, Thomas Petazzoni wrote:

> As reported by checkpatch, the multiline comments for net/ and
> drivers/net/ have a slightly different format than the one used in the
> rest of the kernel, so we adjust our multiline comments accordingly.

> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>   drivers/net/ethernet/marvell/mvneta.c |   84 ++++++++++++++++-----------------
>   1 file changed, 42 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index a7826f0..d9dadee 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -178,8 +178,7 @@
>   /* Napi polling weight */
>   #define MVNETA_RX_POLL_WEIGHT		64
>
> -/*
> - * The two bytes Marvell header. Either contains a special value used
> +/* The two bytes Marvell header. Either contains a special value used

    Why the heck are you doing this? It's the preferred style, see
Documentation/CodingStyle, chapter 8.

WBR, Sergei

^ permalink raw reply

* [patch 6/6] [PATCH] qeth: Remove BUG_ONs
From: frank.blaschka @ 2012-11-19 12:46 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, Stefan Raspl
In-Reply-To: <20121119124644.679089322@de.ibm.com>

[-- Attachment #1: 607-qeth-bug-on.diff --]
[-- Type: text/plain, Size: 4525 bytes --]

From: Stefan Raspl <raspl@linux.vnet.ibm.com>

Remove BUG_ONs or convert to WARN_ON_ONCE/WARN_ONs since a failure within a
networking device driver is no reason to shut down the entire machine.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Ursula Braun <ursula.braun@de.ibm.com>
---
 drivers/s390/net/qeth_core_main.c |   14 ++++++--------
 drivers/s390/net/qeth_l2_main.c   |    3 +--
 drivers/s390/net/qeth_l3_main.c   |    7 +++----
 3 files changed, 10 insertions(+), 14 deletions(-)

--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -383,7 +383,7 @@ static inline void qeth_cleanup_handled_
 				qeth_release_skbs(c);
 
 				c = f->next_pending;
-				BUG_ON(head->next_pending != f);
+				WARN_ON_ONCE(head->next_pending != f);
 				head->next_pending = c;
 				kmem_cache_free(qeth_qdio_outbuf_cache, f);
 			} else {
@@ -415,13 +415,12 @@ static inline void qeth_qdio_handle_aob(
 	buffer = (struct qeth_qdio_out_buffer *) aob->user1;
 	QETH_CARD_TEXT_(card, 5, "%lx", aob->user1);
 
-	BUG_ON(buffer == NULL);
-
 	if (atomic_cmpxchg(&buffer->state, QETH_QDIO_BUF_PRIMED,
 			   QETH_QDIO_BUF_IN_CQ) == QETH_QDIO_BUF_PRIMED) {
 		notification = TX_NOTIFY_OK;
 	} else {
-		BUG_ON(atomic_read(&buffer->state) != QETH_QDIO_BUF_PENDING);
+		WARN_ON_ONCE(atomic_read(&buffer->state) !=
+							QETH_QDIO_BUF_PENDING);
 		atomic_set(&buffer->state, QETH_QDIO_BUF_IN_CQ);
 		notification = TX_NOTIFY_DELAYED_OK;
 	}
@@ -1131,7 +1130,7 @@ static void qeth_release_skbs(struct qet
 		notify_general_error = 1;
 
 	/* release may never happen from within CQ tasklet scope */
-	BUG_ON(atomic_read(&buf->state) == QETH_QDIO_BUF_IN_CQ);
+	WARN_ON_ONCE(atomic_read(&buf->state) == QETH_QDIO_BUF_IN_CQ);
 
 	skb = skb_dequeue(&buf->skb_list);
 	while (skb) {
@@ -2400,7 +2399,7 @@ static int qeth_alloc_qdio_buffers(struc
 		card->qdio.out_qs[i]->queue_no = i;
 		/* give outbound qeth_qdio_buffers their qdio_buffers */
 		for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; ++j) {
-			BUG_ON(card->qdio.out_qs[i]->bufs[j] != NULL);
+			WARN_ON(card->qdio.out_qs[i]->bufs[j] != NULL);
 			if (qeth_init_qdio_out_buf(card->qdio.out_qs[i], j))
 				goto out_freeoutqbufs;
 		}
@@ -3565,7 +3564,7 @@ void qeth_qdio_output_handler(struct ccw
 		if (queue->bufstates &&
 		    (queue->bufstates[bidx].flags &
 		     QDIO_OUTBUF_STATE_FLAG_PENDING) != 0) {
-			BUG_ON(card->options.cq != QETH_CQ_ENABLED);
+			WARN_ON_ONCE(card->options.cq != QETH_CQ_ENABLED);
 
 			if (atomic_cmpxchg(&buffer->state,
 					   QETH_QDIO_BUF_PRIMED,
@@ -3579,7 +3578,6 @@ void qeth_qdio_output_handler(struct ccw
 			QETH_CARD_TEXT(queue->card, 5, "aob");
 			QETH_CARD_TEXT_(queue->card, 5, "%lx",
 					virt_to_phys(buffer->aob));
-			BUG_ON(bidx < 0 || bidx >= QDIO_MAX_BUFFERS_PER_Q);
 			if (qeth_init_qdio_out_buf(queue, bidx)) {
 				QETH_CARD_TEXT(card, 2, "outofbuf");
 				qeth_schedule_recovery(card);
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -411,7 +411,7 @@ static int qeth_l2_process_inbound_buffe
 	unsigned int len;
 
 	*done = 0;
-	BUG_ON(!budget);
+	WARN_ON_ONCE(!budget);
 	while (budget) {
 		skb = qeth_core_get_next_skb(card,
 			&card->qdio.in_q->bufs[card->rx.b_index],
@@ -973,7 +973,6 @@ static int __qeth_l2_set_online(struct c
 	int rc = 0;
 	enum qeth_card_states recover_flag;
 
-	BUG_ON(!card);
 	mutex_lock(&card->discipline_mutex);
 	mutex_lock(&card->conf_mutex);
 	QETH_DBF_TEXT(SETUP, 2, "setonlin");
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -1939,7 +1939,7 @@ static int qeth_l3_process_inbound_buffe
 	__u16 magic;
 
 	*done = 0;
-	BUG_ON(!budget);
+	WARN_ON_ONCE(!budget);
 	while (budget) {
 		skb = qeth_core_get_next_skb(card,
 			&card->qdio.in_q->bufs[card->rx.b_index],
@@ -3334,7 +3334,6 @@ static int __qeth_l3_set_online(struct c
 	int rc = 0;
 	enum qeth_card_states recover_flag;
 
-	BUG_ON(!card);
 	mutex_lock(&card->discipline_mutex);
 	mutex_lock(&card->conf_mutex);
 	QETH_DBF_TEXT(SETUP, 2, "setonlin");
@@ -3715,9 +3714,9 @@ static void qeth_l3_unregister_notifiers
 {
 
 	QETH_DBF_TEXT(SETUP, 5, "unregnot");
-	BUG_ON(unregister_inetaddr_notifier(&qeth_l3_ip_notifier));
+	WARN_ON(unregister_inetaddr_notifier(&qeth_l3_ip_notifier));
 #ifdef CONFIG_QETH_IPV6
-	BUG_ON(unregister_inet6addr_notifier(&qeth_l3_ip6_notifier));
+	WARN_ON(unregister_inet6addr_notifier(&qeth_l3_ip6_notifier));
 #endif /* QETH_IPV6 */
 }
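
The behavioural difference behind this conversion can be sketched in plain
userspace C. BUG_ON() does not return when its condition is true, whereas
WARN_ON_ONCE() reports the condition the first time it is hit at a call
site, evaluates to the condition, and lets execution continue. The helper
below is a hypothetical stand-in, not the kernel macro: warn_on_once_sketch,
warn_count and process_buffers are invented for illustration, and the
per-call-site flag is passed in explicitly instead of being hidden inside a
macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Number of warnings printed so far (only here to make the sketch testable). */
static unsigned int warn_count;

/* Report a true condition once per flag, then hand the condition back
 * so the caller can still react to it.
 */
static bool warn_on_once_sketch(bool cond, bool *warned, const char *what)
{
	assert(warned && what);
	if (cond && !*warned) {
		*warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical caller, loosely shaped like the inbound-buffer loops in
 * the patch: a zero budget is reported once and the loop simply does
 * no work.
 */
static int process_buffers(int budget)
{
	static bool warned;	/* one flag per call site */
	int done = 0;

	warn_on_once_sketch(budget == 0, &warned, "budget is zero");
	while (budget > 0) {
		budget--;
		done++;
	}
	return done;
}
```

Calling process_buffers(0) twice prints a single warning and returns 0 both
times, so the unexpected state is logged without stopping the program.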
 

^ permalink raw reply

* [patch 0/6] s390: network patches for net-next
From: frank.blaschka @ 2012-11-19 12:46 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390

Hi Dave,

Here are some s390-related patches for net-next.

shortlog:

Stefan Raspl (4)
qeth: Remove unused variable
qeth: Consolidate tracing of card features
qeth: Clarify card type naming for virtual NICs
qeth: Remove BUG_ONs

Ursula Braun (2)
ctcm: remove BUG_ON's
claw: remove BUG_ON's


Thanks,
        Frank

^ permalink raw reply

* [patch 2/6] [PATCH] ctcm: remove BUG_ONs
From: frank.blaschka @ 2012-11-19 12:46 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, Ursula Braun
In-Reply-To: <20121119124644.679089322@de.ibm.com>

[-- Attachment #1: 604-ctcm-bugon.diff --]
[-- Type: text/plain, Size: 1197 bytes --]

From: Ursula Braun <ursula.braun@de.ibm.com>

Remove BUG_ONs from the ctcm driver, since the checked error conditions
are NULL pointer accesses.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
---
 drivers/s390/net/ctcm_main.c |    2 --
 drivers/s390/net/ctcm_mpc.c  |    3 ---
 2 files changed, 5 deletions(-)

--- a/drivers/s390/net/ctcm_main.c
+++ b/drivers/s390/net/ctcm_main.c
@@ -1691,8 +1691,6 @@ static void ctcm_remove_device(struct cc
 {
 	struct ctcm_priv *priv = dev_get_drvdata(&cgdev->dev);
 
-	BUG_ON(priv == NULL);
-
 	CTCM_DBF_TEXT_(SETUP, CTC_DBF_INFO,
 			"removing device %p, proto : %d",
 			cgdev, priv->protocol);
--- a/drivers/s390/net/ctcm_mpc.c
+++ b/drivers/s390/net/ctcm_mpc.c
@@ -1367,7 +1367,6 @@ static void mpc_action_go_inop(fsm_insta
 	struct mpc_group *grp;
 	struct channel *wch;
 
-	BUG_ON(dev == NULL);
 	CTCM_PR_DEBUG("Enter %s: %s\n",	__func__, dev->name);
 
 	priv  = dev->ml_priv;
@@ -1472,8 +1471,6 @@ static void mpc_action_timeout(fsm_insta
 	struct channel *wch;
 	struct channel *rch;
 
-	BUG_ON(dev == NULL);
-
 	priv = dev->ml_priv;
 	grp = priv->mpcg;
 	wch = priv->channel[CTCM_WRITE];

^ permalink raw reply

* [patch 1/6] [PATCH] qeth: Remove unused variable
From: frank.blaschka @ 2012-11-19 12:46 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, Stefan Raspl
In-Reply-To: <20121119124644.679089322@de.ibm.com>

[-- Attachment #1: 601-qeth-unused-variable.diff --]
[-- Type: text/plain, Size: 831 bytes --]

From: Stefan Raspl <raspl@linux.vnet.ibm.com>

Eliminate a variable that is never modified.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Ursula Braun <ursula.braun@de.ibm.com>
---
 drivers/s390/net/qeth_core_main.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -2280,7 +2280,6 @@ static int qeth_ulp_setup_cb(struct qeth
 		unsigned long data)
 {
 	struct qeth_cmd_buffer *iob;
-	int rc = 0;
 
 	QETH_DBF_TEXT(SETUP, 2, "ulpstpcb");
 
@@ -2296,7 +2295,7 @@ static int qeth_ulp_setup_cb(struct qeth
 		iob->rc = -EMLINK;
 	}
 	QETH_DBF_TEXT_(SETUP, 2, "  rc%d", iob->rc);
-	return rc;
+	return 0;
 }
 
 static int qeth_ulp_setup(struct qeth_card *card)

^ permalink raw reply

* [patch 5/6] [PATCH] qeth: Consolidate tracing of card features
From: frank.blaschka @ 2012-11-19 12:46 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, Stefan Raspl
In-Reply-To: <20121119124644.679089322@de.ibm.com>

[-- Attachment #1: 602-qeth-tracing.diff --]
[-- Type: text/plain, Size: 2815 bytes --]

From: Stefan Raspl <raspl@linux.vnet.ibm.com>

Trace all supported and enabled card features to s390dbf.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Ursula Braun <ursula.braun@de.ibm.com>
---
 drivers/s390/net/qeth_core.h      |    1 +
 drivers/s390/net/qeth_core_main.c |   16 +++++++++++++---
 drivers/s390/net/qeth_l2_main.c   |    1 +
 drivers/s390/net/qeth_l3_main.c   |    1 +
 4 files changed, 16 insertions(+), 3 deletions(-)

--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -933,6 +933,7 @@ int qeth_hdr_chk_and_bounce(struct sk_bu
 int qeth_configure_cq(struct qeth_card *, enum qeth_cq);
 int qeth_hw_trap(struct qeth_card *, enum qeth_diags_trap_action);
 int qeth_query_ipassists(struct qeth_card *, enum qeth_prot_versions prot);
+void qeth_trace_features(struct qeth_card *);
 
 /* exports for OSN */
 int qeth_osn_assist(struct net_device *, void *, int);
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -2968,9 +2968,6 @@ static int qeth_query_ipassists_cb(struc
 	} else
 		QETH_DBF_MESSAGE(1, "%s IPA_CMD_QIPASSIST: Flawed LIC detected"
 					"\n", dev_name(&card->gdev->dev));
-	QETH_DBF_TEXT(SETUP, 2, "suppenbl");
-	QETH_DBF_TEXT_(SETUP, 2, "%08x", (__u32)cmd->hdr.ipa_supported);
-	QETH_DBF_TEXT_(SETUP, 2, "%08x", (__u32)cmd->hdr.ipa_enabled);
 	return 0;
 }
 
@@ -4730,6 +4727,19 @@ static void qeth_core_free_card(struct q
 	kfree(card);
 }
 
+void qeth_trace_features(struct qeth_card *card)
+{
+	QETH_CARD_TEXT(card, 2, "features");
+	QETH_CARD_TEXT_(card, 2, "%x", card->options.ipa4.supported_funcs);
+	QETH_CARD_TEXT_(card, 2, "%x", card->options.ipa4.enabled_funcs);
+	QETH_CARD_TEXT_(card, 2, "%x", card->options.ipa6.supported_funcs);
+	QETH_CARD_TEXT_(card, 2, "%x", card->options.ipa6.enabled_funcs);
+	QETH_CARD_TEXT_(card, 2, "%x", card->options.adp.supported_funcs);
+	QETH_CARD_TEXT_(card, 2, "%x", card->options.adp.enabled_funcs);
+	QETH_CARD_TEXT_(card, 2, "%x", card->info.diagass_support);
+}
+EXPORT_SYMBOL_GPL(qeth_trace_features);
+
 static struct ccw_device_id qeth_ids[] = {
 	{CCW_DEVICE_DEVTYPE(0x1731, 0x01, 0x1732, 0x01),
 					.driver_info = QETH_CARD_TYPE_OSD},
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -986,6 +986,7 @@ static int __qeth_l2_set_online(struct c
 		rc = -ENODEV;
 		goto out_remove;
 	}
+	qeth_trace_features(card);
 
 	if (!card->dev && qeth_l2_setup_netdev(card)) {
 		rc = -ENODEV;
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -3347,6 +3347,7 @@ static int __qeth_l3_set_online(struct c
 		rc = -ENODEV;
 		goto out_remove;
 	}
+	qeth_trace_features(card);
 
 	if (!card->dev && qeth_l3_setup_netdev(card)) {
 		rc = -ENODEV;
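
The idea of the patch, a single helper that both the layer 2 and layer 3
set-online paths call, can be sketched outside the kernel as follows.
Everything here is hypothetical: struct card_features, format_features and
the output format are assumptions for illustration and only mirror the
fields traced by qeth_trace_features().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Supported/enabled bit masks for one group of functions. */
struct func_mask {
	unsigned int supported_funcs;
	unsigned int enabled_funcs;
};

/* Hypothetical stand-in for the card fields the patch traces. */
struct card_features {
	struct func_mask ipa4;
	struct func_mask ipa6;
	struct func_mask adp;
	unsigned int diagass_support;
};

/* Format all feature masks in hex into buf. Returns what snprintf()
 * returns: the length the full string needs, excluding the terminator.
 */
int format_features(const struct card_features *f, char *buf, size_t len)
{
	assert(f && buf);
	return snprintf(buf, len,
			"features ipa4=%x/%x ipa6=%x/%x adp=%x/%x diag=%x",
			f->ipa4.supported_funcs, f->ipa4.enabled_funcs,
			f->ipa6.supported_funcs, f->ipa6.enabled_funcs,
			f->adp.supported_funcs, f->adp.enabled_funcs,
			f->diagass_support);
}
```

Keeping the formatting in one function means every caller reports the same
set of fields in the same order.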

^ permalink raw reply

* [patch 4/6] [PATCH] qeth: Clarify card type naming for virtual NICs
From: frank.blaschka @ 2012-11-19 12:46 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390, Stefan Raspl
In-Reply-To: <20121119124644.679089322@de.ibm.com>

[-- Attachment #1: 603-qeth-card-naming.diff --]
[-- Type: text/plain, Size: 1630 bytes --]

From: Stefan Raspl <raspl@linux.vnet.ibm.com>

So far, virtual NICs, whether attached to a VSWITCH or a guest LAN, were
always displayed as guest LANs in the device driver attributes and messages,
while in fact they are virtual NICs.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Ursula Braun <ursula.braun@de.ibm.com>
---
 drivers/s390/net/qeth_core_main.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -73,13 +73,13 @@ static inline const char *qeth_get_cardn
 	if (card->info.guestlan) {
 		switch (card->info.type) {
 		case QETH_CARD_TYPE_OSD:
-			return " Guest LAN QDIO";
+			return " Virtual NIC QDIO";
 		case QETH_CARD_TYPE_IQD:
-			return " Guest LAN Hiper";
+			return " Virtual NIC Hiper";
 		case QETH_CARD_TYPE_OSM:
-			return " Guest LAN QDIO - OSM";
+			return " Virtual NIC QDIO - OSM";
 		case QETH_CARD_TYPE_OSX:
-			return " Guest LAN QDIO - OSX";
+			return " Virtual NIC QDIO - OSX";
 		default:
 			return " unknown";
 		}
@@ -108,13 +108,13 @@ const char *qeth_get_cardname_short(stru
 	if (card->info.guestlan) {
 		switch (card->info.type) {
 		case QETH_CARD_TYPE_OSD:
-			return "GuestLAN QDIO";
+			return "Virt.NIC QDIO";
 		case QETH_CARD_TYPE_IQD:
-			return "GuestLAN Hiper";
+			return "Virt.NIC Hiper";
 		case QETH_CARD_TYPE_OSM:
-			return "GuestLAN OSM";
+			return "Virt.NIC OSM";
 		case QETH_CARD_TYPE_OSX:
-			return "GuestLAN OSX";
+			return "Virt.NIC OSX";
 		default:
 			return "unknown";
 		}

^ permalink raw reply

