* [GIT PULL] Marvell mvneta Ethernet/MDIO drivers checkpatch fixes
From: Thomas Petazzoni @ 2012-11-19 11:00 UTC
To: Jason Cooper
Cc: David S. Miller, Andrew Lunn, Gregory Clement, linux-arm-kernel,
Lior Amsalem, netdev
Jason,
Sebastian Hesselbarth noticed that the new mvmdio and mvneta drivers
produce a number of checkpatch warnings, related to the inclusion of
<asm/delay.h> and to the style of multiline comments. The following
three patches fix those checkpatch warnings. Feel free to integrate
them either as follow-up patches of the network driver patches, or to
squash them into the driver patches as you prefer.
David, are you OK with those patches and the fact that we carry them
through the arm-soc tree, as we agreed to do for the mvmdio and mvneta
drivers? Those patches are really simple/stupid fixes.
Thanks!
The following changes since commit a7e7265b086fb12eb4472d0d216c6b787bb35eba:
Merge remote-tracking branch 'jcooper/mvebu/dt' into mvneta-fixes (2012-11-19 11:30:18 +0100)
are available in the git repository at:
git@github.com:MISL-EBU-System-SW/mainline-public.git tags/marvell-net-mdio-checkpatch-fixes-3.8
for you to fetch changes up to e313b995bf8598f56c98377a750f968f9dd5ca78:
net: mvneta: adjust multiline comments to net/ style (2012-11-19 11:42:42 +0100)
----------------------------------------------------------------
Marvell network/MDIO driver checkpatch fixes
----------------------------------------------------------------
Thomas Petazzoni (3):
net: mvmdio: use <linux/delay.h> instead of <asm/delay.h>
net: mvmdio: adjust multiline comment to net/ style
net: mvneta: adjust multiline comments to net/ style
drivers/net/ethernet/marvell/mvmdio.c | 6 +--
drivers/net/ethernet/marvell/mvneta.c | 84 ++++++++++++++++-----------------
2 files changed, 44 insertions(+), 46 deletions(-)
* [PATCH 1/3] net: mvmdio: use <linux/delay.h> instead of <asm/delay.h>
From: Thomas Petazzoni @ 2012-11-19 11:00 UTC
To: Jason Cooper
Cc: David S. Miller, Andrew Lunn, Gregory Clement, linux-arm-kernel,
Lior Amsalem, netdev
In-Reply-To: <1353322834-16952-1-git-send-email-thomas.petazzoni@free-electrons.com>
As suggested by checkpatch, using <linux/delay.h> instead of
<asm/delay.h> is appropriate.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
drivers/net/ethernet/marvell/mvmdio.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvmdio.c b/drivers/net/ethernet/marvell/mvmdio.c
index 82fbd23..114a0f1 100644
--- a/drivers/net/ethernet/marvell/mvmdio.c
+++ b/drivers/net/ethernet/marvell/mvmdio.c
@@ -27,8 +27,7 @@
#include <linux/of_address.h>
#include <linux/of_mdio.h>
#include <linux/platform_device.h>
-
-#include <asm/delay.h>
+#include <linux/delay.h>
#define MVMDIO_SMI_DATA_SHIFT 0
#define MVMDIO_SMI_PHY_ADDR_SHIFT 16
--
1.7.9.5
* Re: [Pv-drivers] [PATCH 0/6] VSOCK for Linux upstreaming
From: Benjamin Herrenschmidt @ 2012-11-19 9:59 UTC
To: Anthony Liguori
Cc: Gerd Hoffmann, Andy King, pv-drivers, netdev, linux-kernel,
virtualization, gregkh, David Miller, georgezhang
In-Reply-To: <50A55F57.7080804@us.ibm.com>
On Thu, 2012-11-15 at 15:32 -0600, Anthony Liguori wrote:
>
> The concept was Nacked and that led to the abomination of virtio-serial. If an
> address family for virtualization is on the table, we should reconsider
> AF_VMCHANNEL.
>
> I'd be thrilled to get rid of virtio-serial...
Ack.
Ben.
* Re: [PATCH RFC 0/5] Containerize syslog
From: Eric W. Biederman @ 2012-11-19 9:51 UTC
To: Rui Xiang
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <50A9EAD8.9090501-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Rui Xiang <leo.ruixiang-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
> From: Xiang Rui <rui.xiang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
>
> In Serge's patch (http://lwn.net/Articles/525629/), syslog_namespace was tied to a user
> namespace. We add syslog_ns tied to nsproxy instead, and implement ns_printk in
> ip_table context.
>
> We add syslog_namespace as a part of nsproxy, and a new flag CLONE_SYSLOG to unshare
> syslog area.
>
> In syslog_namespace, some necessary identifiers for handling syslog buf are contained.
> When one container creates a new syslog namespace, a containerized buf will be allocated
> to store the log owned by this container. Containerized identifiers such as log_first_seq,
> instead of global variables, only affect their own buf. The buf will not be freed until
> syslog_namespace is destructed by host.
>
> Printk should be re-implemented because log buf is isolated into syslog_ns. The functions
> including printk, /dev/kmsg, do_syslog and kmsg_dump should be realized in container. So,
> to make these functions available in container, a parameter syslog_ns is necessary for
> their interfaces.
>
> For container context, the value syslog namespace is reasonable if we use current method
> to get syslog_ns when using iptable. Because the log info belonging to each container will
> be printed in host.
>
> We add a pointer in net namespace, and use it to track the syslog_ns which was created
> when the log was generated in container. Then add ns_printk to provide a new interface
> while using syslog_ns.
It occurs to me that calling this a syslog namespace is a misnomer.
Syslog in general uses unix domain sockets. This is about the linux
kernel specific kernel log interface that tends to be put in syslog.
Are there any kernel print statements besides networking stack printks
that we want to move to show up in a new "kernel log" namespace?
For the kernel generated pieces of information that are interesting (and
there don't seem to be many of those), would we be better off using
another kernel method that is already per namespace, something like
netlink?
Eric
* [PATCH 10/10] batman-adv: Use packing of 2 for all headers before an ethernet header
From: Antonio Quartulli @ 2012-11-19 8:24 UTC
To: davem; +Cc: netdev, b.a.t.m.a.n, Sven Eckelmann, Marek Lindner,
Antonio Quartulli
In-Reply-To: <1353313451-2930-1-git-send-email-ordex@autistici.org>
From: Sven Eckelmann <sven@narfation.org>
All packet headers in front of an ethernet header have to be completely
divisible by 2 but not by 4 to make the payload after the ethernet header again
4 bytes boundary aligned.
A packing of 2 is necessary to avoid extra padding at the end of the struct
caused by a structure member which is larger than two bytes. Otherwise the
structure would not fulfill the previously mentioned rule to avoid the
misalignment of the payload after the ethernet header. It may also lead to
leakage of information when the padding is not initialized before sending.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
net/batman-adv/packet.h | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/net/batman-adv/packet.h b/net/batman-adv/packet.h
index df548ed..1c5454d 100644
--- a/net/batman-adv/packet.h
+++ b/net/batman-adv/packet.h
@@ -173,6 +173,18 @@ struct batadv_icmp_packet_rr {
uint8_t rr[BATADV_RR_LEN][ETH_ALEN];
};
+/* All packet headers in front of an ethernet header have to be completely
+ * divisible by 2 but not by 4 to make the payload after the ethernet
+ * header again 4 bytes boundary aligned.
+ *
+ * A packing of 2 is necessary to avoid extra padding at the end of the struct
+ * caused by a structure member which is larger than two bytes. Otherwise
+ * the structure would not fulfill the previously mentioned rule to avoid the
+ * misalignment of the payload after the ethernet header. It may also lead to
+ * leakage of information when the padding it not initialized before sending.
+ */
+#pragma pack(2)
+
struct batadv_unicast_packet {
struct batadv_header header;
uint8_t ttvn; /* destination translation table version number */
@@ -216,7 +228,9 @@ struct batadv_bcast_packet {
/* "4 bytes boundary + 2 bytes" long to make the payload after the
* following ethernet header again 4 bytes boundary aligned
*/
-} __packed;
+};
+
+#pragma pack()
struct batadv_vis_packet {
struct batadv_header header;
--
1.8.0
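
To illustrate the packing rule described in this patch, here is a minimal
userspace sketch. The struct names and the helper are hypothetical and are not
taken from packet.h; DEMO_ETH_HLEN mirrors the 14-byte Ethernet header length.
It assumes a typical ABI where uint32_t is 4-byte aligned and a compiler that
honours #pragma pack (GCC, Clang and MSVC do).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ETH_HLEN 14 /* length of an Ethernet header, same value as ETH_HLEN */

/* Hypothetical header with default alignment: on ABIs where uint32_t is
 * 4-byte aligned, the 6 bytes of members are padded at the end to 8 bytes.
 */
struct demo_hdr_natural {
	uint32_t seqno;
	uint8_t type;
	uint8_t ttl;
};

/* Same members with a packing of 2: no trailing padding is added, so the
 * size stays at 6 bytes (divisible by 2 but not by 4).
 */
#pragma pack(2)
struct demo_hdr_pack2 {
	uint32_t seqno;
	uint8_t type;
	uint8_t ttl;
};
#pragma pack()

/* Returns 1 if a payload that follows hdr_len bytes of header plus an
 * Ethernet header starts on a 4-byte boundary, assuming the buffer itself
 * starts on a 4-byte boundary.
 */
int demo_payload_aligned(size_t hdr_len)
{
	return (hdr_len + DEMO_ETH_HLEN) % 4 == 0;
}

/* Compile-time check of the rule stated in the commit message. */
static_assert(sizeof(struct demo_hdr_pack2) % 2 == 0 &&
	      sizeof(struct demo_hdr_pack2) % 4 != 0,
	      "header must be divisible by 2 but not by 4");
```

With these definitions the packed header is 6 bytes, so 6 + 14 = 20 bytes
precede the payload and it lands on a 4-byte boundary. An 8-byte header would
put the payload at offset 22, which is 2 bytes off.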
* [PATCH 09/10] batman-adv: Start new development cycle
From: Antonio Quartulli @ 2012-11-19 8:24 UTC
To: davem; +Cc: netdev, b.a.t.m.a.n, Sven Eckelmann, Antonio Quartulli
In-Reply-To: <1353313451-2930-1-git-send-email-ordex@autistici.org>
From: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
net/batman-adv/main.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h
index 0213cb5..e2b7e1a 100644
--- a/net/batman-adv/main.h
+++ b/net/batman-adv/main.h
@@ -26,7 +26,7 @@
#define BATADV_DRIVER_DEVICE "batman-adv"
#ifndef BATADV_SOURCE_VERSION
-#define BATADV_SOURCE_VERSION "2012.4.0"
+#define BATADV_SOURCE_VERSION "2012.5.0"
#endif
/* B.A.T.M.A.N. parameters */
--
1.8.0
* [PATCH 08/10] batman-adv: Fix broadcast duplist for fragmentation
From: Antonio Quartulli @ 2012-11-19 8:24 UTC
To: davem
Cc: netdev, b.a.t.m.a.n, Simon Wunderlich, Simon Wunderlich,
Marek Lindner, Antonio Quartulli
In-Reply-To: <1353313451-2930-1-git-send-email-ordex@autistici.org>
From: Simon Wunderlich <simon.wunderlich@s2003.tu-chemnitz.de>
If the skb is fragmented, the checksum must be computed on the
individual fragments; just using skb->data may fail on fragmented
data. Instead of linearizing the packet, use the new
batadv_skb_crc32() to do that more efficiently. Replacing the old
crc16 with the new crc32 should not hurt.
Reported-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
net/batman-adv/bridge_loop_avoidance.c | 18 +++++++-----------
net/batman-adv/bridge_loop_avoidance.h | 6 ++----
net/batman-adv/routing.c | 8 +-------
net/batman-adv/types.h | 2 +-
4 files changed, 11 insertions(+), 23 deletions(-)
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c
index 7ffef8b..5aebe93 100644
--- a/net/batman-adv/bridge_loop_avoidance.c
+++ b/net/batman-adv/bridge_loop_avoidance.c
@@ -1249,8 +1249,7 @@ int batadv_bla_init(struct batadv_priv *bat_priv)
/**
* batadv_bla_check_bcast_duplist
* @bat_priv: the bat priv with all the soft interface information
- * @bcast_packet: encapsulated broadcast frame plus batman header
- * @bcast_packet_len: length of encapsulated broadcast frame plus batman header
+ * @skb: contains the bcast_packet to be checked
*
* check if it is on our broadcast list. Another gateway might
* have sent the same packet because it is connected to the same backbone,
@@ -1262,20 +1261,17 @@ int batadv_bla_init(struct batadv_priv *bat_priv)
* the same host however as this might be intended.
*/
int batadv_bla_check_bcast_duplist(struct batadv_priv *bat_priv,
- struct batadv_bcast_packet *bcast_packet,
- int bcast_packet_len)
+ struct sk_buff *skb)
{
- int i, length, curr, ret = 0;
- uint8_t *content;
- uint16_t crc;
+ int i, curr, ret = 0;
+ __be32 crc;
+ struct batadv_bcast_packet *bcast_packet;
struct batadv_bcast_duplist_entry *entry;
- length = bcast_packet_len - sizeof(*bcast_packet);
- content = (uint8_t *)bcast_packet;
- content += sizeof(*bcast_packet);
+ bcast_packet = (struct batadv_bcast_packet *)skb->data;
/* calculate the crc ... */
- crc = crc16(0, content, length);
+ crc = batadv_skb_crc32(skb, (u8 *)(bcast_packet + 1));
spin_lock_bh(&bat_priv->bla.bcast_duplist_lock);
diff --git a/net/batman-adv/bridge_loop_avoidance.h b/net/batman-adv/bridge_loop_avoidance.h
index 789cb73..196d9a0 100644
--- a/net/batman-adv/bridge_loop_avoidance.h
+++ b/net/batman-adv/bridge_loop_avoidance.h
@@ -31,8 +31,7 @@ int batadv_bla_backbone_table_seq_print_text(struct seq_file *seq,
void *offset);
int batadv_bla_is_backbone_gw_orig(struct batadv_priv *bat_priv, uint8_t *orig);
int batadv_bla_check_bcast_duplist(struct batadv_priv *bat_priv,
- struct batadv_bcast_packet *bcast_packet,
- int hdr_size);
+ struct sk_buff *skb);
void batadv_bla_update_orig_address(struct batadv_priv *bat_priv,
struct batadv_hard_iface *primary_if,
struct batadv_hard_iface *oldif);
@@ -81,8 +80,7 @@ static inline int batadv_bla_is_backbone_gw_orig(struct batadv_priv *bat_priv,
static inline int
batadv_bla_check_bcast_duplist(struct batadv_priv *bat_priv,
- struct batadv_bcast_packet *bcast_packet,
- int hdr_size)
+ struct sk_buff *skb)
{
return 0;
}
diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
index 8d64348..1aa1722 100644
--- a/net/batman-adv/routing.c
+++ b/net/batman-adv/routing.c
@@ -1196,14 +1196,8 @@ int batadv_recv_bcast_packet(struct sk_buff *skb,
spin_unlock_bh(&orig_node->bcast_seqno_lock);
- /* keep skb linear for crc calculation */
- if (skb_linearize(skb) < 0)
- goto out;
-
- bcast_packet = (struct batadv_bcast_packet *)skb->data;
-
/* check whether this has been sent by another originator before */
- if (batadv_bla_check_bcast_duplist(bat_priv, bcast_packet, skb->len))
+ if (batadv_bla_check_bcast_duplist(bat_priv, skb))
goto out;
/* rebroadcast packet */
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index 7b3d0d7..ae9ac9a 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -156,7 +156,7 @@ struct batadv_neigh_node {
#ifdef CONFIG_BATMAN_ADV_BLA
struct batadv_bcast_duplist_entry {
uint8_t orig[ETH_ALEN];
- uint16_t crc;
+ __be32 crc;
unsigned long entrytime;
};
#endif
--
1.8.0
* [PATCH 07/10] batman-adv: Add function to calculate crc32c for the skb payload
From: Antonio Quartulli @ 2012-11-19 8:24 UTC
To: davem; +Cc: netdev, b.a.t.m.a.n, Sven Eckelmann, Marek Lindner,
Antonio Quartulli
In-Reply-To: <1353313451-2930-1-git-send-email-ordex@autistici.org>
From: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
net/batman-adv/Kconfig | 1 +
net/batman-adv/main.c | 34 ++++++++++++++++++++++++++++++++++
net/batman-adv/main.h | 1 +
3 files changed, 36 insertions(+)
diff --git a/net/batman-adv/Kconfig b/net/batman-adv/Kconfig
index 250e0b5..8d8afb1 100644
--- a/net/batman-adv/Kconfig
+++ b/net/batman-adv/Kconfig
@@ -6,6 +6,7 @@ config BATMAN_ADV
tristate "B.A.T.M.A.N. Advanced Meshing Protocol"
depends on NET
select CRC16
+ select LIBCRC32C
default n
help
B.A.T.M.A.N. (better approach to mobile ad-hoc networking) is
diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index 70797de..253e240 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
@@ -17,6 +17,8 @@
* 02110-1301, USA
*/
+#include <linux/crc32c.h>
+#include <linux/highmem.h>
#include "main.h"
#include "sysfs.h"
#include "debugfs.h"
@@ -432,6 +434,38 @@ int batadv_compat_seq_print_text(struct seq_file *seq, void *offset)
return 0;
}
+/**
+ * batadv_skb_crc32 - calculate CRC32 of the whole packet and skip bytes in
+ * the header
+ * @skb: skb pointing to fragmented socket buffers
+ * @payload_ptr: Pointer to position inside the head buffer of the skb
+ * marking the start of the data to be CRC'ed
+ *
+ * payload_ptr must always point to an address in the skb head buffer and not to
+ * a fragment.
+ */
+__be32 batadv_skb_crc32(struct sk_buff *skb, u8 *payload_ptr)
+{
+ u32 crc = 0;
+ unsigned int from;
+ unsigned int to = skb->len;
+ struct skb_seq_state st;
+ const u8 *data;
+ unsigned int len;
+ unsigned int consumed = 0;
+
+ from = (unsigned int)(payload_ptr - skb->data);
+
+ skb_prepare_seq_read(skb, from, to, &st);
+ while ((len = skb_seq_read(consumed, &data, &st)) != 0) {
+ crc = crc32c(crc, data, len);
+ consumed += len;
+ }
+ skb_abort_seq_read(&st);
+
+ return htonl(crc);
+}
+
static int batadv_param_set_ra(const char *val, const struct kernel_param *kp)
{
struct batadv_algo_ops *bat_algo_ops;
diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h
index 3243189..0213cb5 100644
--- a/net/batman-adv/main.h
+++ b/net/batman-adv/main.h
@@ -175,6 +175,7 @@ int batadv_algo_register(struct batadv_algo_ops *bat_algo_ops);
int batadv_algo_select(struct batadv_priv *bat_priv, char *name);
int batadv_algo_seq_print_text(struct seq_file *seq, void *offset);
int batadv_compat_seq_print_text(struct seq_file *seq, void *offset);
+__be32 batadv_skb_crc32(struct sk_buff *skb, u8 *payload_ptr);
/**
* enum batadv_dbg_level - available log levels
--
1.8.0
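
The loop in batadv_skb_crc32() relies on the fact that a CRC can be fed chunk
by chunk, carrying the running value from one call to the next. A minimal
userspace sketch of that property follows. crc32c_sw(), struct demo_chunk and
crc32c_chunks() are hypothetical stand-ins and not kernel APIs. The bitwise
routine uses the reflected Castagnoli polynomial 0x82F63B78 and, as an
assumption of this sketch, applies no initial or final inversion so that calls
can be chained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software CRC32C update: folds len bytes into the running
 * value crc, one bit at a time. No inversion is applied on entry or exit.
 */
uint32_t crc32c_sw(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;
	int bit;

	while (len--) {
		crc ^= *p++;
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Hypothetical stand-in for one piece of a fragmented buffer. */
struct demo_chunk {
	const void *data;
	size_t len;
};

/* Walks the chunks in order and carries the running CRC across them,
 * in the same way the skb_seq_read() loop does for skb fragments.
 */
uint32_t crc32c_chunks(uint32_t crc, const struct demo_chunk *chunks, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		crc = crc32c_sw(crc, chunks[i].data, chunks[i].len);
	return crc;
}
```

Feeding "1234" and then "56789" through crc32c_chunks() gives the same value
as one crc32c_sw() call over "123456789", which is why the packet does not have
to be linearized first.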
* [PATCH 06/10] batman-adv: sysfs documentation should keep alphabetical order
From: Antonio Quartulli @ 2012-11-19 8:24 UTC
To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Antonio Quartulli
In-Reply-To: <1353313451-2930-1-git-send-email-ordex@autistici.org>
From: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
.../ABI/testing/sysfs-class-net-batman-adv | 11 +++---
Documentation/ABI/testing/sysfs-class-net-mesh | 44 +++++++++++-----------
2 files changed, 28 insertions(+), 27 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-class-net-batman-adv b/Documentation/ABI/testing/sysfs-class-net-batman-adv
index 38dd762..bdc0070 100644
--- a/Documentation/ABI/testing/sysfs-class-net-batman-adv
+++ b/Documentation/ABI/testing/sysfs-class-net-batman-adv
@@ -1,4 +1,10 @@
+What: /sys/class/net/<iface>/batman-adv/iface_status
+Date: May 2010
+Contact: Marek Lindner <lindner_marek@yahoo.de>
+Description:
+ Indicates the status of <iface> as it is seen by batman.
+
What: /sys/class/net/<iface>/batman-adv/mesh_iface
Date: May 2010
Contact: Marek Lindner <lindner_marek@yahoo.de>
@@ -7,8 +13,3 @@ Description:
displays the batman mesh interface this <iface>
currently is associated with.
-What: /sys/class/net/<iface>/batman-adv/iface_status
-Date: May 2010
-Contact: Marek Lindner <lindner_marek@yahoo.de>
-Description:
- Indicates the status of <iface> as it is seen by batman.
diff --git a/Documentation/ABI/testing/sysfs-class-net-mesh b/Documentation/ABI/testing/sysfs-class-net-mesh
index c81fe89..bc41da6 100644
--- a/Documentation/ABI/testing/sysfs-class-net-mesh
+++ b/Documentation/ABI/testing/sysfs-class-net-mesh
@@ -6,6 +6,14 @@ Description:
Indicates whether the batman protocol messages of the
mesh <mesh_iface> shall be aggregated or not.
+What: /sys/class/net/<mesh_iface>/mesh/ap_isolation
+Date: May 2011
+Contact: Antonio Quartulli <ordex@autistici.org>
+Description:
+ Indicates whether the data traffic going from a
+ wireless client to another wireless client will be
+ silently dropped.
+
What: /sys/class/net/<mesh_iface>/mesh/bonding
Date: June 2010
Contact: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
@@ -31,14 +39,6 @@ Description:
mesh will be fragmented or silently discarded if the
packet size exceeds the outgoing interface MTU.
-What: /sys/class/net/<mesh_iface>/mesh/ap_isolation
-Date: May 2011
-Contact: Antonio Quartulli <ordex@autistici.org>
-Description:
- Indicates whether the data traffic going from a
- wireless client to another wireless client will be
- silently dropped.
-
What: /sys/class/net/<mesh_iface>/mesh/gw_bandwidth
Date: October 2010
Contact: Marek Lindner <lindner_marek@yahoo.de>
@@ -60,26 +60,26 @@ Description:
Defines the selection criteria this node will use
to choose a gateway if gw_mode was set to 'client'.
-What: /sys/class/net/<mesh_iface>/mesh/orig_interval
-Date: May 2010
-Contact: Marek Lindner <lindner_marek@yahoo.de>
-Description:
- Defines the interval in milliseconds in which batman
- sends its protocol messages.
-
What: /sys/class/net/<mesh_iface>/mesh/hop_penalty
Date: Oct 2010
Contact: Linus Lüssing <linus.luessing@web.de>
Description:
- Defines the penalty which will be applied to an
- originator message's tq-field on every hop.
+ Defines the penalty which will be applied to an
+ originator message's tq-field on every hop.
-What: /sys/class/net/<mesh_iface>/mesh/routing_algo
-Date: Dec 2011
-Contact: Marek Lindner <lindner_marek@yahoo.de>
+What: /sys/class/net/<mesh_iface>/mesh/orig_interval
+Date: May 2010
+Contact: Marek Lindner <lindner_marek@yahoo.de>
Description:
- Defines the routing procotol this mesh instance
- uses to find the optimal paths through the mesh.
+ Defines the interval in milliseconds in which batman
+ sends its protocol messages.
+
+What: /sys/class/net/<mesh_iface>/mesh/routing_algo
+Date: Dec 2011
+Contact: Marek Lindner <lindner_marek@yahoo.de>
+Description:
+ Defines the routing procotol this mesh instance
+ uses to find the optimal paths through the mesh.
What: /sys/class/net/<mesh_iface>/mesh/vis_mode
Date: May 2010
--
1.8.0
* [PATCH 05/10] batman-adv: Add wrapper to look up neighbor and send skb
From: Antonio Quartulli @ 2012-11-19 8:24 UTC
To: davem
Cc: netdev, b.a.t.m.a.n, Martin Hundebøll, Marek Lindner,
Antonio Quartulli
In-Reply-To: <1353313451-2930-1-git-send-email-ordex@autistici.org>
From: Martin Hundebøll <martin@hundeboll.net>
By adding batadv_send_skb_to_orig() in send.c, we can remove duplicate
code that looks up the next hop and then calls batadv_send_skb_packet().
Furthermore, this prepares the upcoming new implementation of
fragmentation, which requires the next hop to route packets.
Please note that this doesn't entirely remove the next-hop lookup in
routing.c and unicast.c, since it is used by the current fragmentation
code.
Also note that the next-hop info is removed from debug messages in
translation-table.c, since it is looked up elsewhere.
Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
net/batman-adv/routing.c | 37 +++++-----------------
net/batman-adv/send.c | 33 +++++++++++++++++++
net/batman-adv/send.h | 3 ++
net/batman-adv/translation-table.c | 65 ++++++++++----------------------------
net/batman-adv/unicast.c | 8 ++---
net/batman-adv/vis.c | 37 ++++++----------------
6 files changed, 75 insertions(+), 108 deletions(-)
diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
index 78d6572..8d64348 100644
--- a/net/batman-adv/routing.c
+++ b/net/batman-adv/routing.c
@@ -285,7 +285,6 @@ static int batadv_recv_my_icmp_packet(struct batadv_priv *bat_priv,
{
struct batadv_hard_iface *primary_if = NULL;
struct batadv_orig_node *orig_node = NULL;
- struct batadv_neigh_node *router = NULL;
struct batadv_icmp_packet_rr *icmp_packet;
int ret = NET_RX_DROP;
@@ -307,10 +306,6 @@ static int batadv_recv_my_icmp_packet(struct batadv_priv *bat_priv,
if (!orig_node)
goto out;
- router = batadv_orig_node_get_router(orig_node);
- if (!router)
- goto out;
-
/* create a copy of the skb, if needed, to modify it. */
if (skb_cow(skb, ETH_HLEN) < 0)
goto out;
@@ -322,14 +317,12 @@ static int batadv_recv_my_icmp_packet(struct batadv_priv *bat_priv,
icmp_packet->msg_type = BATADV_ECHO_REPLY;
icmp_packet->header.ttl = BATADV_TTL;
- batadv_send_skb_packet(skb, router->if_incoming, router->addr);
- ret = NET_RX_SUCCESS;
+ if (batadv_send_skb_to_orig(skb, orig_node, NULL))
+ ret = NET_RX_SUCCESS;
out:
if (primary_if)
batadv_hardif_free_ref(primary_if);
- if (router)
- batadv_neigh_node_free_ref(router);
if (orig_node)
batadv_orig_node_free_ref(orig_node);
return ret;
@@ -340,7 +333,6 @@ static int batadv_recv_icmp_ttl_exceeded(struct batadv_priv *bat_priv,
{
struct batadv_hard_iface *primary_if = NULL;
struct batadv_orig_node *orig_node = NULL;
- struct batadv_neigh_node *router = NULL;
struct batadv_icmp_packet *icmp_packet;
int ret = NET_RX_DROP;
@@ -362,10 +354,6 @@ static int batadv_recv_icmp_ttl_exceeded(struct batadv_priv *bat_priv,
if (!orig_node)
goto out;
- router = batadv_orig_node_get_router(orig_node);
- if (!router)
- goto out;
-
/* create a copy of the skb, if needed, to modify it. */
if (skb_cow(skb, ETH_HLEN) < 0)
goto out;
@@ -377,14 +365,12 @@ static int batadv_recv_icmp_ttl_exceeded(struct batadv_priv *bat_priv,
icmp_packet->msg_type = BATADV_TTL_EXCEEDED;
icmp_packet->header.ttl = BATADV_TTL;
- batadv_send_skb_packet(skb, router->if_incoming, router->addr);
- ret = NET_RX_SUCCESS;
+ if (batadv_send_skb_to_orig(skb, orig_node, NULL))
+ ret = NET_RX_SUCCESS;
out:
if (primary_if)
batadv_hardif_free_ref(primary_if);
- if (router)
- batadv_neigh_node_free_ref(router);
if (orig_node)
batadv_orig_node_free_ref(orig_node);
return ret;
@@ -398,7 +384,6 @@ int batadv_recv_icmp_packet(struct sk_buff *skb,
struct batadv_icmp_packet_rr *icmp_packet;
struct ethhdr *ethhdr;
struct batadv_orig_node *orig_node = NULL;
- struct batadv_neigh_node *router = NULL;
int hdr_size = sizeof(struct batadv_icmp_packet);
int ret = NET_RX_DROP;
@@ -447,10 +432,6 @@ int batadv_recv_icmp_packet(struct sk_buff *skb,
if (!orig_node)
goto out;
- router = batadv_orig_node_get_router(orig_node);
- if (!router)
- goto out;
-
/* create a copy of the skb, if needed, to modify it. */
if (skb_cow(skb, ETH_HLEN) < 0)
goto out;
@@ -461,12 +442,10 @@ int batadv_recv_icmp_packet(struct sk_buff *skb,
icmp_packet->header.ttl--;
/* route it */
- batadv_send_skb_packet(skb, router->if_incoming, router->addr);
- ret = NET_RX_SUCCESS;
+ if (batadv_send_skb_to_orig(skb, orig_node, recv_if))
+ ret = NET_RX_SUCCESS;
out:
- if (router)
- batadv_neigh_node_free_ref(router);
if (orig_node)
batadv_orig_node_free_ref(orig_node);
return ret;
@@ -882,8 +861,8 @@ static int batadv_route_unicast_packet(struct sk_buff *skb,
skb->len + ETH_HLEN);
/* route it */
- batadv_send_skb_packet(skb, neigh_node->if_incoming, neigh_node->addr);
- ret = NET_RX_SUCCESS;
+ if (batadv_send_skb_to_orig(skb, orig_node, recv_if))
+ ret = NET_RX_SUCCESS;
out:
if (neigh_node)
diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index 660d9bf..c7f7023 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
@@ -78,6 +78,39 @@ send_skb_err:
return NET_XMIT_DROP;
}
+/**
+ * batadv_send_skb_to_orig - Lookup next-hop and transmit skb.
+ * @skb: Packet to be transmitted.
+ * @orig_node: Final destination of the packet.
+ * @recv_if: Interface used when receiving the packet (can be NULL).
+ *
+ * Looks up the best next-hop towards the passed originator and passes the
+ * skb on for preparation of MAC header. If the packet originated from this
+ * host, NULL can be passed as recv_if and no interface alternating is
+ * attempted.
+ *
+ * Returns TRUE on success; FALSE otherwise.
+ */
+bool batadv_send_skb_to_orig(struct sk_buff *skb,
+ struct batadv_orig_node *orig_node,
+ struct batadv_hard_iface *recv_if)
+{
+ struct batadv_priv *bat_priv = orig_node->bat_priv;
+ struct batadv_neigh_node *neigh_node;
+
+ /* batadv_find_router() increases neigh_nodes refcount if found. */
+ neigh_node = batadv_find_router(bat_priv, orig_node, recv_if);
+ if (!neigh_node)
+ return false;
+
+ /* route it */
+ batadv_send_skb_packet(skb, neigh_node->if_incoming, neigh_node->addr);
+
+ batadv_neigh_node_free_ref(neigh_node);
+
+ return true;
+}
+
void batadv_schedule_bat_ogm(struct batadv_hard_iface *hard_iface)
{
struct batadv_priv *bat_priv = netdev_priv(hard_iface->soft_iface);
diff --git a/net/batman-adv/send.h b/net/batman-adv/send.h
index 643329b..0078dec 100644
--- a/net/batman-adv/send.h
+++ b/net/batman-adv/send.h
@@ -23,6 +23,9 @@
int batadv_send_skb_packet(struct sk_buff *skb,
struct batadv_hard_iface *hard_iface,
const uint8_t *dst_addr);
+bool batadv_send_skb_to_orig(struct sk_buff *skb,
+ struct batadv_orig_node *orig_node,
+ struct batadv_hard_iface *recv_if);
void batadv_schedule_bat_ogm(struct batadv_hard_iface *hard_iface);
int batadv_add_bcast_packet_to_list(struct batadv_priv *bat_priv,
const struct sk_buff *skb,
diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c
index cdad824..22457a7 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -1642,7 +1642,6 @@ static int batadv_send_tt_request(struct batadv_priv *bat_priv,
{
struct sk_buff *skb = NULL;
struct batadv_tt_query_packet *tt_request;
- struct batadv_neigh_node *neigh_node = NULL;
struct batadv_hard_iface *primary_if;
struct batadv_tt_req_node *tt_req_node = NULL;
int ret = 1;
@@ -1680,23 +1679,15 @@ static int batadv_send_tt_request(struct batadv_priv *bat_priv,
if (full_table)
tt_request->flags |= BATADV_TT_FULL_TABLE;
- neigh_node = batadv_orig_node_get_router(dst_orig_node);
- if (!neigh_node)
- goto out;
-
- batadv_dbg(BATADV_DBG_TT, bat_priv,
- "Sending TT_REQUEST to %pM via %pM [%c]\n",
- dst_orig_node->orig, neigh_node->addr,
- (full_table ? 'F' : '.'));
+ batadv_dbg(BATADV_DBG_TT, bat_priv, "Sending TT_REQUEST to %pM [%c]\n",
+ dst_orig_node->orig, (full_table ? 'F' : '.'));
batadv_inc_counter(bat_priv, BATADV_CNT_TT_REQUEST_TX);
- batadv_send_skb_packet(skb, neigh_node->if_incoming, neigh_node->addr);
- ret = 0;
+ if (batadv_send_skb_to_orig(skb, dst_orig_node, NULL))
+ ret = 0;
out:
- if (neigh_node)
- batadv_neigh_node_free_ref(neigh_node);
if (primary_if)
batadv_hardif_free_ref(primary_if);
if (ret)
@@ -1716,7 +1707,6 @@ batadv_send_other_tt_response(struct batadv_priv *bat_priv,
{
struct batadv_orig_node *req_dst_orig_node;
struct batadv_orig_node *res_dst_orig_node = NULL;
- struct batadv_neigh_node *neigh_node = NULL;
struct batadv_hard_iface *primary_if = NULL;
uint8_t orig_ttvn, req_ttvn, ttvn;
int ret = false;
@@ -1742,10 +1732,6 @@ batadv_send_other_tt_response(struct batadv_priv *bat_priv,
if (!res_dst_orig_node)
goto out;
- neigh_node = batadv_orig_node_get_router(res_dst_orig_node);
- if (!neigh_node)
- goto out;
-
primary_if = batadv_primary_if_get_selected(bat_priv);
if (!primary_if)
goto out;
@@ -1817,14 +1803,13 @@ batadv_send_other_tt_response(struct batadv_priv *bat_priv,
tt_response->flags |= BATADV_TT_FULL_TABLE;
batadv_dbg(BATADV_DBG_TT, bat_priv,
- "Sending TT_RESPONSE %pM via %pM for %pM (ttvn: %u)\n",
- res_dst_orig_node->orig, neigh_node->addr,
- req_dst_orig_node->orig, req_ttvn);
+ "Sending TT_RESPONSE %pM for %pM (ttvn: %u)\n",
+ res_dst_orig_node->orig, req_dst_orig_node->orig, req_ttvn);
batadv_inc_counter(bat_priv, BATADV_CNT_TT_RESPONSE_TX);
- batadv_send_skb_packet(skb, neigh_node->if_incoming, neigh_node->addr);
- ret = true;
+ if (batadv_send_skb_to_orig(skb, res_dst_orig_node, NULL))
+ ret = true;
goto out;
unlock:
@@ -1835,8 +1820,6 @@ out:
batadv_orig_node_free_ref(res_dst_orig_node);
if (req_dst_orig_node)
batadv_orig_node_free_ref(req_dst_orig_node);
- if (neigh_node)
- batadv_neigh_node_free_ref(neigh_node);
if (primary_if)
batadv_hardif_free_ref(primary_if);
if (!ret)
@@ -1850,7 +1833,6 @@ batadv_send_my_tt_response(struct batadv_priv *bat_priv,
struct batadv_tt_query_packet *tt_request)
{
struct batadv_orig_node *orig_node;
- struct batadv_neigh_node *neigh_node = NULL;
struct batadv_hard_iface *primary_if = NULL;
uint8_t my_ttvn, req_ttvn, ttvn;
int ret = false;
@@ -1875,10 +1857,6 @@ batadv_send_my_tt_response(struct batadv_priv *bat_priv,
if (!orig_node)
goto out;
- neigh_node = batadv_orig_node_get_router(orig_node);
- if (!neigh_node)
- goto out;
-
primary_if = batadv_primary_if_get_selected(bat_priv);
if (!primary_if)
goto out;
@@ -1942,14 +1920,14 @@ batadv_send_my_tt_response(struct batadv_priv *bat_priv,
tt_response->flags |= BATADV_TT_FULL_TABLE;
batadv_dbg(BATADV_DBG_TT, bat_priv,
- "Sending TT_RESPONSE to %pM via %pM [%c]\n",
- orig_node->orig, neigh_node->addr,
+ "Sending TT_RESPONSE to %pM [%c]\n",
+ orig_node->orig,
(tt_response->flags & BATADV_TT_FULL_TABLE ? 'F' : '.'));
batadv_inc_counter(bat_priv, BATADV_CNT_TT_RESPONSE_TX);
- batadv_send_skb_packet(skb, neigh_node->if_incoming, neigh_node->addr);
- ret = true;
+ if (batadv_send_skb_to_orig(skb, orig_node, NULL))
+ ret = true;
goto out;
unlock:
@@ -1957,8 +1935,6 @@ unlock:
out:
if (orig_node)
batadv_orig_node_free_ref(orig_node);
- if (neigh_node)
- batadv_neigh_node_free_ref(neigh_node);
if (primary_if)
batadv_hardif_free_ref(primary_if);
if (!ret)
@@ -2223,7 +2199,6 @@ unlock:
static void batadv_send_roam_adv(struct batadv_priv *bat_priv, uint8_t *client,
struct batadv_orig_node *orig_node)
{
- struct batadv_neigh_node *neigh_node = NULL;
struct sk_buff *skb = NULL;
struct batadv_roam_adv_packet *roam_adv_packet;
int ret = 1;
@@ -2256,23 +2231,17 @@ static void batadv_send_roam_adv(struct batadv_priv *bat_priv, uint8_t *client,
memcpy(roam_adv_packet->dst, orig_node->orig, ETH_ALEN);
memcpy(roam_adv_packet->client, client, ETH_ALEN);
- neigh_node = batadv_orig_node_get_router(orig_node);
- if (!neigh_node)
- goto out;
-
batadv_dbg(BATADV_DBG_TT, bat_priv,
- "Sending ROAMING_ADV to %pM (client %pM) via %pM\n",
- orig_node->orig, client, neigh_node->addr);
+ "Sending ROAMING_ADV to %pM (client %pM)\n",
+ orig_node->orig, client);
batadv_inc_counter(bat_priv, BATADV_CNT_TT_ROAM_ADV_TX);
- batadv_send_skb_packet(skb, neigh_node->if_incoming, neigh_node->addr);
- ret = 0;
+ if (batadv_send_skb_to_orig(skb, orig_node, NULL))
+ ret = 0;
out:
- if (neigh_node)
- batadv_neigh_node_free_ref(neigh_node);
- if (ret)
+ if (ret && skb)
kfree_skb(skb);
return;
}
diff --git a/net/batman-adv/unicast.c b/net/batman-adv/unicast.c
index c9a1f65..10aff49 100644
--- a/net/batman-adv/unicast.c
+++ b/net/batman-adv/unicast.c
@@ -402,7 +402,7 @@ int batadv_unicast_generic_send_skb(struct batadv_priv *bat_priv,
struct batadv_orig_node *orig_node;
struct batadv_neigh_node *neigh_node;
int data_len = skb->len;
- int ret = 1;
+ int ret = NET_RX_DROP;
unsigned int dev_mtu;
/* get routing information */
@@ -466,15 +466,15 @@ find_router:
goto out;
}
- batadv_send_skb_packet(skb, neigh_node->if_incoming, neigh_node->addr);
- ret = 0;
+ if (batadv_send_skb_to_orig(skb, orig_node, NULL))
+ ret = 0;
out:
if (neigh_node)
batadv_neigh_node_free_ref(neigh_node);
if (orig_node)
batadv_orig_node_free_ref(orig_node);
- if (ret == 1)
+ if (ret == NET_RX_DROP)
kfree_skb(skb);
return ret;
}
diff --git a/net/batman-adv/vis.c b/net/batman-adv/vis.c
index ad14a6c..0f65a9d 100644
--- a/net/batman-adv/vis.c
+++ b/net/batman-adv/vis.c
@@ -698,15 +698,12 @@ static void batadv_purge_vis_packets(struct batadv_priv *bat_priv)
static void batadv_broadcast_vis_packet(struct batadv_priv *bat_priv,
struct batadv_vis_info *info)
{
- struct batadv_neigh_node *router;
struct batadv_hashtable *hash = bat_priv->orig_hash;
struct hlist_node *node;
struct hlist_head *head;
struct batadv_orig_node *orig_node;
struct batadv_vis_packet *packet;
struct sk_buff *skb;
- struct batadv_hard_iface *hard_iface;
- uint8_t dstaddr[ETH_ALEN];
uint32_t i;
@@ -722,30 +719,20 @@ static void batadv_broadcast_vis_packet(struct batadv_priv *bat_priv,
if (!(orig_node->flags & BATADV_VIS_SERVER))
continue;
- router = batadv_orig_node_get_router(orig_node);
- if (!router)
- continue;
-
/* don't send it if we already received the packet from
* this node.
*/
if (batadv_recv_list_is_in(bat_priv, &info->recv_list,
- orig_node->orig)) {
- batadv_neigh_node_free_ref(router);
+ orig_node->orig))
continue;
- }
memcpy(packet->target_orig, orig_node->orig, ETH_ALEN);
- hard_iface = router->if_incoming;
- memcpy(dstaddr, router->addr, ETH_ALEN);
-
- batadv_neigh_node_free_ref(router);
-
skb = skb_clone(info->skb_packet, GFP_ATOMIC);
- if (skb)
- batadv_send_skb_packet(skb, hard_iface,
- dstaddr);
+ if (!skb)
+ continue;
+ if (!batadv_send_skb_to_orig(skb, orig_node, NULL))
+ kfree_skb(skb);
}
rcu_read_unlock();
}
@@ -755,7 +742,6 @@ static void batadv_unicast_vis_packet(struct batadv_priv *bat_priv,
struct batadv_vis_info *info)
{
struct batadv_orig_node *orig_node;
- struct batadv_neigh_node *router = NULL;
struct sk_buff *skb;
struct batadv_vis_packet *packet;
@@ -765,17 +751,14 @@ static void batadv_unicast_vis_packet(struct batadv_priv *bat_priv,
if (!orig_node)
goto out;
- router = batadv_orig_node_get_router(orig_node);
- if (!router)
- goto out;
-
skb = skb_clone(info->skb_packet, GFP_ATOMIC);
- if (skb)
- batadv_send_skb_packet(skb, router->if_incoming, router->addr);
+ if (!skb)
+ goto out;
+
+ if (!batadv_send_skb_to_orig(skb, orig_node, NULL))
+ kfree_skb(skb);
out:
- if (router)
- batadv_neigh_node_free_ref(router);
if (orig_node)
batadv_orig_node_free_ref(orig_node);
}
--
1.8.0
^ permalink raw reply related
* [PATCH 04/10] batman-adv: export compatibility version via debugfs
From: Antonio Quartulli @ 2012-11-19 8:24 UTC (permalink / raw)
To: davem; +Cc: netdev, b.a.t.m.a.n, Antonio Quartulli, Marek Lindner
In-Reply-To: <1353313451-2930-1-git-send-email-ordex@autistici.org>
Different versions of the batman-adv module may use the same compatibility
version, but this cannot easily be determined at runtime (the only way is to
parse the kernel log for the message batman-adv prints on loading). The user
may want to know whether two nodes running different versions can communicate.
For this purpose the module exports the value through debugfs.
Reported-by: Moritz Warning <moritzwarning@web.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
---
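A minimal userspace sketch of how the exported value might be used, assuming
the debugfs file contains just the decimal version followed by a newline (as
the seq_printf() format in this patch suggests) and assuming two nodes can
talk only when the values are equal. The helper names are hypothetical and
not part of batman-adv; reading the file from debugfs is left to the caller.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse the "%d\n" text read from the compat file.
 * Returns 0 on success and stores the version, -1 on malformed input.
 */
static int demo_parse_compat(const char *text, int *version)
{
	assert(text && version);

	return sscanf(text, "%d", version) == 1 ? 0 : -1;
}

/* Hypothetical helper: returns 1 when both texts parse and carry the same
 * compatibility version, 0 otherwise (including on parse errors).
 */
static int demo_same_compat(const char *a, const char *b)
{
	int va, vb;

	if (demo_parse_compat(a, &va) || demo_parse_compat(b, &vb))
		return 0;

	return va == vb;
}
```

Treating a parse failure as "not compatible" is a conservative choice for the
sketch; a real tool would probably report the error separately.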
net/batman-adv/debugfs.c | 12 ++++++++++++
net/batman-adv/main.c | 12 ++++++++++++
net/batman-adv/main.h | 1 +
3 files changed, 25 insertions(+)
diff --git a/net/batman-adv/debugfs.c b/net/batman-adv/debugfs.c
index 6f58ddd..ae79b19 100644
--- a/net/batman-adv/debugfs.c
+++ b/net/batman-adv/debugfs.c
@@ -245,6 +245,16 @@ static int batadv_algorithms_open(struct inode *inode, struct file *file)
return single_open(file, batadv_algo_seq_print_text, NULL);
}
+/**
+ * batadv_compat_open - Prepare file handler for printing of the compat version
+ * @inode: inode which was opened
+ * @file: file handle to be initialized
+ */
+static int batadv_compat_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, batadv_compat_seq_print_text, NULL);
+}
+
static int batadv_originators_open(struct inode *inode, struct file *file)
{
struct net_device *net_dev = (struct net_device *)inode->i_private;
@@ -327,9 +337,11 @@ struct batadv_debuginfo batadv_debuginfo_##_name = { \
* placed in the BATADV_DEBUGFS_SUBDIR subdirectory of debugfs
*/
static BATADV_DEBUGINFO(routing_algos, S_IRUGO, batadv_algorithms_open);
+static BATADV_DEBUGINFO(compat_version, S_IRUGO, batadv_compat_open);
static struct batadv_debuginfo *batadv_general_debuginfos[] = {
&batadv_debuginfo_routing_algos,
+ &batadv_debuginfo_compat_version,
NULL,
};
diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index dc33a0c..70797de 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
@@ -420,6 +420,18 @@ int batadv_algo_seq_print_text(struct seq_file *seq, void *offset)
return 0;
}
+/**
+ * batadv_compat_seq_print_text - print the compatibility version
+ * @seq: debugfs table seq_file struct
+ * @offset: not used
+ */
+int batadv_compat_seq_print_text(struct seq_file *seq, void *offset)
+{
+ seq_printf(seq, "%d\n", BATADV_COMPAT_VERSION);
+
+ return 0;
+}
+
static int batadv_param_set_ra(const char *val, const struct kernel_param *kp)
{
struct batadv_algo_ops *bat_algo_ops;
diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h
index 8f149bb..3243189 100644
--- a/net/batman-adv/main.h
+++ b/net/batman-adv/main.h
@@ -174,6 +174,7 @@ void batadv_recv_handler_unregister(uint8_t packet_type);
int batadv_algo_register(struct batadv_algo_ops *bat_algo_ops);
int batadv_algo_select(struct batadv_priv *bat_priv, char *name);
int batadv_algo_seq_print_text(struct seq_file *seq, void *offset);
+int batadv_compat_seq_print_text(struct seq_file *seq, void *offset);
/**
* enum batadv_dbg_level - available log levels
--
1.8.0
^ permalink raw reply related
* [PATCH 03/10] batman-adv: support array of debugfs general attributes
From: Antonio Quartulli @ 2012-11-19 8:24 UTC (permalink / raw)
To: davem; +Cc: netdev, b.a.t.m.a.n, Antonio Quartulli, Marek Lindner
In-Reply-To: <1353313451-2930-1-git-send-email-ordex@autistici.org>
This patch adds support for an array of debugfs general (not soft_iface
specific) attributes. With this change it will be possible to add more general
attributes by simply appending them to the array without touching the rest of
the code.
Reported-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
---
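The NULL-terminated array pattern described in the commit message can be
sketched in plain C as follows. This is an illustration only: struct
demo_attr, the "demo_extra" attribute and the create callback are
hypothetical stand-ins for struct batadv_debuginfo and
debugfs_create_file(), not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct batadv_debuginfo. */
struct demo_attr {
	const char *name;
	unsigned int mode;
};

static struct demo_attr demo_attr_routing_algos = { "routing_algos", 0444 };
static struct demo_attr demo_attr_extra = { "demo_extra", 0444 };

/* NULL-terminated: a new attribute is added by inserting its address
 * before the sentinel, without touching the registration loop.
 */
static struct demo_attr *demo_general_attrs[] = {
	&demo_attr_routing_algos,
	&demo_attr_extra,
	NULL,
};

/* Hypothetical callbacks standing in for file creation. */
static int demo_create_ok(const struct demo_attr *attr)
{
	return attr->name ? 0 : -1;
}

static int demo_create_fail(const struct demo_attr *attr)
{
	(void)attr;
	return -1;
}

/* Walks the array until the NULL sentinel and stops at the first failure.
 * Returns the number of attributes registered, or -1 on failure.
 */
static int demo_register_all(struct demo_attr **attrs,
			     int (*create)(const struct demo_attr *))
{
	struct demo_attr **it;
	int count = 0;

	assert(attrs && create);

	for (it = attrs; *it; ++it) {
		if (create(*it) != 0)
			return -1;
		count++;
	}

	return count;
}
```

As in the patch, the first failure aborts the loop; the patch additionally
removes the whole debugfs directory in that case, which the sketch leaves out.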
net/batman-adv/debugfs.c | 34 +++++++++++++++++++++++++---------
1 file changed, 25 insertions(+), 9 deletions(-)
diff --git a/net/batman-adv/debugfs.c b/net/batman-adv/debugfs.c
index 3f679cb..6f58ddd 100644
--- a/net/batman-adv/debugfs.c
+++ b/net/batman-adv/debugfs.c
@@ -323,7 +323,17 @@ struct batadv_debuginfo batadv_debuginfo_##_name = { \
} \
};
+/* the following attributes are general and therefore they will be directly
+ * placed in the BATADV_DEBUGFS_SUBDIR subdirectory of debugfs
+ */
static BATADV_DEBUGINFO(routing_algos, S_IRUGO, batadv_algorithms_open);
+
+static struct batadv_debuginfo *batadv_general_debuginfos[] = {
+ &batadv_debuginfo_routing_algos,
+ NULL,
+};
+
+/* The following attributes are per soft interface */
static BATADV_DEBUGINFO(originators, S_IRUGO, batadv_originators_open);
static BATADV_DEBUGINFO(gateways, S_IRUGO, batadv_gateways_open);
static BATADV_DEBUGINFO(transtable_global, S_IRUGO,
@@ -358,7 +368,7 @@ static struct batadv_debuginfo *batadv_mesh_debuginfos[] = {
void batadv_debugfs_init(void)
{
- struct batadv_debuginfo *bat_debug;
+ struct batadv_debuginfo **bat_debug;
struct dentry *file;
batadv_debugfs = debugfs_create_dir(BATADV_DEBUGFS_SUBDIR, NULL);
@@ -366,17 +376,23 @@ void batadv_debugfs_init(void)
batadv_debugfs = NULL;
if (!batadv_debugfs)
- goto out;
+ goto err;
- bat_debug = &batadv_debuginfo_routing_algos;
- file = debugfs_create_file(bat_debug->attr.name,
- S_IFREG | bat_debug->attr.mode,
- batadv_debugfs, NULL, &bat_debug->fops);
- if (!file)
- pr_err("Can't add debugfs file: %s\n", bat_debug->attr.name);
+ for (bat_debug = batadv_general_debuginfos; *bat_debug; ++bat_debug) {
+ file = debugfs_create_file(((*bat_debug)->attr).name,
+ S_IFREG | ((*bat_debug)->attr).mode,
+ batadv_debugfs, NULL,
+ &(*bat_debug)->fops);
+ if (!file) {
+ pr_err("Can't add general debugfs file: %s\n",
+ ((*bat_debug)->attr).name);
+ goto err;
+ }
+ }
-out:
return;
+err:
+ debugfs_remove_recursive(batadv_debugfs);
}
void batadv_debugfs_destroy(void)
--
1.8.0
^ permalink raw reply related
* [PATCH 02/10] batman-adv: fix bla compare function
From: Antonio Quartulli @ 2012-11-19 8:24 UTC (permalink / raw)
To: davem
Cc: netdev, b.a.t.m.a.n, Simon Wunderlich, Simon Wunderlich,
Marek Lindner, Antonio Quartulli
In-Reply-To: <1353313451-2930-1-git-send-email-ordex@autistici.org>
From: Simon Wunderlich <simon.wunderlich@s2003.tu-chemnitz.de>
The address and the VLAN VID may not be packed in the respective
structs. Fix this by comparing the elements individually.
Reported-by: Marek Lindner <lindner_marek@yahoo.de>
Reported-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
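To illustrate the fix described above: comparing two keys field by field
does not depend on how the compiler lays out or pads the struct, whereas a
memcmp() over the first ETH_ALEN + sizeof(short) bytes does. The sketch
below uses a hypothetical struct demo_claim, not the real struct
batadv_claim, and a plain memcmp() on the address in place of
batadv_compare_eth().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for a claim key. */
struct demo_claim {
	uint8_t addr[ETH_ALEN];
	/* the compiler may insert padding between members */
	uint16_t vid;
};

/* Returns 1 when address and VLAN ID both match, 0 otherwise. */
static int demo_compare_claim(const struct demo_claim *c1,
			      const struct demo_claim *c2)
{
	assert(c1 && c2);

	if (memcmp(c1->addr, c2->addr, ETH_ALEN) != 0)
		return 0;

	if (c1->vid != c2->vid)
		return 0;

	return 1;
}
```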
net/batman-adv/bridge_loop_avoidance.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c
index bda8b17..7ffef8b 100644
--- a/net/batman-adv/bridge_loop_avoidance.c
+++ b/net/batman-adv/bridge_loop_avoidance.c
@@ -77,8 +77,15 @@ static int batadv_compare_backbone_gw(const struct hlist_node *node,
{
const void *data1 = container_of(node, struct batadv_backbone_gw,
hash_entry);
+ const struct batadv_backbone_gw *gw1 = data1, *gw2 = data2;
- return (memcmp(data1, data2, ETH_ALEN + sizeof(short)) == 0 ? 1 : 0);
+ if (!batadv_compare_eth(gw1->orig, gw2->orig))
+ return 0;
+
+ if (gw1->vid != gw2->vid)
+ return 0;
+
+ return 1;
}
/* compares address and vid of two claims */
@@ -87,8 +94,15 @@ static int batadv_compare_claim(const struct hlist_node *node,
{
const void *data1 = container_of(node, struct batadv_claim,
hash_entry);
+ const struct batadv_claim *cl1 = data1, *cl2 = data2;
- return (memcmp(data1, data2, ETH_ALEN + sizeof(short)) == 0 ? 1 : 0);
+ if (!batadv_compare_eth(cl1->addr, cl2->addr))
+ return 0;
+
+ if (cl1->vid != cl2->vid)
+ return 0;
+
+ return 1;
}
/* free a backbone gw */
--
1.8.0
^ permalink raw reply related
* [PATCH 01/10] batman-adv: Mark best gateway in transtable_global debugfs
From: Antonio Quartulli @ 2012-11-19 8:24 UTC (permalink / raw)
To: davem; +Cc: netdev, b.a.t.m.a.n, Sven Eckelmann, Marek Lindner,
Antonio Quartulli
In-Reply-To: <1353313451-2930-1-git-send-email-ordex@autistici.org>
From: Sven Eckelmann <sven@narfation.org>
The transtable_global debug file can show multiple entries for a single client
when multiple gateways exist. The chosen gateway isn't marked in the list and
therefore the user cannot easily debug the situation when there is a problem
with the currently used gateway.
The best gateway is now marked with "*" and secondary gateways are marked with
"+".
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
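The marking scheme can be sketched outside the kernel like this: pick the
entry whose router has the highest TQ, then print "*" for it and "+" for
every other entry. struct demo_orig and both helpers are hypothetical; the
real code walks an RCU-protected hlist and takes router references, which
this sketch omits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an originator list entry. */
struct demo_orig {
	const char *name;
	int tq_avg;	/* link quality towards this originator */
	int has_router;	/* 0 when no router is currently known */
};

/* Returns the index of the entry with the highest tq_avg among entries
 * that have a router, or -1 when none qualifies. As in the patch, the
 * comparison is strict and starts from 0, so a TQ of 0 is never selected.
 */
static int demo_best_orig(const struct demo_orig *entries, size_t n)
{
	int best = -1;
	int best_tq = 0;
	size_t i;

	assert(entries || n == 0);

	for (i = 0; i < n; i++) {
		if (!entries[i].has_router)
			continue;

		if (entries[i].tq_avg > best_tq) {
			best = (int)i;
			best_tq = entries[i].tq_avg;
		}
	}

	return best;
}

/* '*' marks the chosen gateway, '+' marks the secondary ones. */
static char demo_marker(int index, int best)
{
	return index == best ? '*' : '+';
}
```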
net/batman-adv/translation-table.c | 90 +++++++++++++++++++++++++++-----------
1 file changed, 64 insertions(+), 26 deletions(-)
diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c
index 582f134..cdad824 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -911,8 +911,44 @@ out:
return ret;
}
-/* print all orig nodes who announce the address for this global entry.
- * it is assumed that the caller holds rcu_read_lock();
+/* batadv_transtable_best_orig - Get best originator list entry from tt entry
+ * @tt_global_entry: global translation table entry to be analyzed
+ *
+ * This function assumes the caller holds rcu_read_lock().
+ * Returns best originator list entry or NULL on errors.
+ */
+static struct batadv_tt_orig_list_entry *
+batadv_transtable_best_orig(struct batadv_tt_global_entry *tt_global_entry)
+{
+ struct batadv_neigh_node *router = NULL;
+ struct hlist_head *head;
+ struct hlist_node *node;
+ struct batadv_tt_orig_list_entry *orig_entry, *best_entry = NULL;
+ int best_tq = 0;
+
+ head = &tt_global_entry->orig_list;
+ hlist_for_each_entry_rcu(orig_entry, node, head, list) {
+ router = batadv_orig_node_get_router(orig_entry->orig_node);
+ if (!router)
+ continue;
+
+ if (router->tq_avg > best_tq) {
+ best_entry = orig_entry;
+ best_tq = router->tq_avg;
+ }
+
+ batadv_neigh_node_free_ref(router);
+ }
+
+ return best_entry;
+}
+
+/* batadv_tt_global_print_entry - print all orig nodes who announce the address
+ * for this global entry
+ * @tt_global_entry: global translation table entry to be printed
+ * @seq: debugfs table seq_file struct
+ *
+ * This function assumes the caller holds rcu_read_lock().
*/
static void
batadv_tt_global_print_entry(struct batadv_tt_global_entry *tt_global_entry,
@@ -920,21 +956,37 @@ batadv_tt_global_print_entry(struct batadv_tt_global_entry *tt_global_entry,
{
struct hlist_head *head;
struct hlist_node *node;
- struct batadv_tt_orig_list_entry *orig_entry;
+ struct batadv_tt_orig_list_entry *orig_entry, *best_entry;
struct batadv_tt_common_entry *tt_common_entry;
uint16_t flags;
uint8_t last_ttvn;
tt_common_entry = &tt_global_entry->common;
+ flags = tt_common_entry->flags;
+
+ best_entry = batadv_transtable_best_orig(tt_global_entry);
+ if (best_entry) {
+ last_ttvn = atomic_read(&best_entry->orig_node->last_ttvn);
+ seq_printf(seq, " %c %pM (%3u) via %pM (%3u) [%c%c%c]\n",
+ '*', tt_global_entry->common.addr,
+ best_entry->ttvn, best_entry->orig_node->orig,
+ last_ttvn,
+ (flags & BATADV_TT_CLIENT_ROAM ? 'R' : '.'),
+ (flags & BATADV_TT_CLIENT_WIFI ? 'W' : '.'),
+ (flags & BATADV_TT_CLIENT_TEMP ? 'T' : '.'));
+ }
head = &tt_global_entry->orig_list;
hlist_for_each_entry_rcu(orig_entry, node, head, list) {
- flags = tt_common_entry->flags;
+ if (best_entry == orig_entry)
+ continue;
+
last_ttvn = atomic_read(&orig_entry->orig_node->last_ttvn);
- seq_printf(seq, " * %pM (%3u) via %pM (%3u) [%c%c%c]\n",
- tt_global_entry->common.addr, orig_entry->ttvn,
- orig_entry->orig_node->orig, last_ttvn,
+ seq_printf(seq, " %c %pM (%3u) via %pM (%3u) [%c%c%c]\n",
+ '+', tt_global_entry->common.addr,
+ orig_entry->ttvn, orig_entry->orig_node->orig,
+ last_ttvn,
(flags & BATADV_TT_CLIENT_ROAM ? 'R' : '.'),
(flags & BATADV_TT_CLIENT_WIFI ? 'W' : '.'),
(flags & BATADV_TT_CLIENT_TEMP ? 'T' : '.'));
@@ -1280,11 +1332,7 @@ struct batadv_orig_node *batadv_transtable_search(struct batadv_priv *bat_priv,
struct batadv_tt_local_entry *tt_local_entry = NULL;
struct batadv_tt_global_entry *tt_global_entry = NULL;
struct batadv_orig_node *orig_node = NULL;
- struct batadv_neigh_node *router = NULL;
- struct hlist_head *head;
- struct hlist_node *node;
- struct batadv_tt_orig_list_entry *orig_entry;
- int best_tq;
+ struct batadv_tt_orig_list_entry *best_entry;
if (src && atomic_read(&bat_priv->ap_isolation)) {
tt_local_entry = batadv_tt_local_hash_find(bat_priv, src);
@@ -1304,25 +1352,15 @@ struct batadv_orig_node *batadv_transtable_search(struct batadv_priv *bat_priv,
_batadv_is_ap_isolated(tt_local_entry, tt_global_entry))
goto out;
- best_tq = 0;
-
rcu_read_lock();
- head = &tt_global_entry->orig_list;
- hlist_for_each_entry_rcu(orig_entry, node, head, list) {
- router = batadv_orig_node_get_router(orig_entry->orig_node);
- if (!router)
- continue;
-
- if (router->tq_avg > best_tq) {
- orig_node = orig_entry->orig_node;
- best_tq = router->tq_avg;
- }
- batadv_neigh_node_free_ref(router);
- }
+ best_entry = batadv_transtable_best_orig(tt_global_entry);
/* found anything? */
+ if (best_entry)
+ orig_node = best_entry->orig_node;
if (orig_node && !atomic_inc_not_zero(&orig_node->refcount))
orig_node = NULL;
rcu_read_unlock();
+
out:
if (tt_global_entry)
batadv_tt_global_entry_free_ref(tt_global_entry);
--
1.8.0
^ permalink raw reply related
* pull request: batman-adv 2012-11-19
From: Antonio Quartulli @ 2012-11-19 8:24 UTC (permalink / raw)
To: davem; +Cc: netdev, b.a.t.m.a.n
Hello David,
this should be our last batch of patches intended for net-next/linux-3.8.
In this patchset we have patches 7,8/10 by Sven Eckelmann which improve the crc
computation on broadcast packets (in the Bridge Loop Avoidance component) by
using crc32c and by avoiding the entire linearisation of the skb! Then, patch
4/10 introduces a new debugfs file which exports the compatibility version so
that users having different batman-adv releases can understand whether they can
or cannot communicate.
Patch 10/10 removes the packed attribute for the unicast message type and adds
"#pragma pack(2)" (again, this is just part of our intermediate changes which do
not break compatibility. The real restructure will come later..).
The others are cleanups or small code refactoring.
Let me know if there is any problem!
Thank you,
Antonio
The following changes since commit 3594698a1fb8e5ae60a92c72ce9ca280256939a7:
net: Make CAP_NET_BIND_SERVICE per user namespace (2012-11-18 20:33:37 -0500)
are available in the git repository at:
git://git.open-mesh.org/linux-merge.git tags/batman-adv-for-davem
for you to fetch changes up to 15401e33ef94d4f251c42e8228e6c387327f38f8:
batman-adv: Use packing of 2 for all headers before an ethernet header (2012-11-19 09:14:11 +0100)
----------------------------------------------------------------
Included changes:
- Increase batman-adv version
- Bridge Loop Avoidance: compute checksum (using crc32c) on skb fragments
instead of linearising it
- sort the sysfs documentation
- export the compatibility version via debugfs
- some other minor cleanups
----------------------------------------------------------------
Antonio Quartulli (2):
batman-adv: support array of debugfs general attributes
batman-adv: export compatibility version via debugfs
Marek Lindner (1):
batman-adv: sysfs documentation should keep alphabetical order
Martin Hundebøll (1):
batman-adv: Add wrapper to look up neighbor and send skb
Simon Wunderlich (2):
batman-adv: fix bla compare function
batman-adv: Fix broadcast duplist for fragmentation
Sven Eckelmann (4):
batman-adv: Mark best gateway in transtable_global debugfs
batman-adv: Add function to calculate crc32c for the skb payload
batman-adv: Start new development cycle
batman-adv: Use packing of 2 for all headers before an ethernet header
.../ABI/testing/sysfs-class-net-batman-adv | 11 +-
Documentation/ABI/testing/sysfs-class-net-mesh | 40 +++---
net/batman-adv/Kconfig | 1 +
net/batman-adv/bridge_loop_avoidance.c | 36 +++--
net/batman-adv/bridge_loop_avoidance.h | 6 +-
net/batman-adv/debugfs.c | 46 ++++--
net/batman-adv/main.c | 46 ++++++
net/batman-adv/main.h | 4 +-
net/batman-adv/packet.h | 16 ++-
net/batman-adv/routing.c | 45 ++----
net/batman-adv/send.c | 33 +++++
net/batman-adv/send.h | 3 +
net/batman-adv/translation-table.c | 155 +++++++++++----------
net/batman-adv/types.h | 2 +-
net/batman-adv/unicast.c | 8 +-
net/batman-adv/vis.c | 35 ++---
16 files changed, 293 insertions(+), 194 deletions(-)
^ permalink raw reply
* [PATCH RFC 4/5] printk: add ns_printk for specific syslog_ns
From: Rui Xiang @ 2012-11-19 8:17 UTC (permalink / raw)
To: serge.hallyn, containers; +Cc: Eric W. Biederman, netdev
From: Libo Chen <clbchenlibo.chen@huawei.com>
In some contexts, such as iptables, current_syslog_ns cannot provide the
correct syslog_ns: it returns init_syslog_ns instead of the syslog_ns
belonging to the container.
We add a new interface, ns_printk, which takes the syslog_ns as a parameter.
Signed-off-by: Libo Chen <clbchenlibo.chen@huawei.com>
Signed-off-by: Xiang Rui <rui.xiang@huawei.com>
---
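The shape of the new interface, an explicit namespace argument with a
fallback to the current one and va_list forwarding to a common emitter, can
be sketched in userspace C. Everything prefixed demo_ is hypothetical: the
one-line buffer replaces the real log buffer and vsnprintf() replaces
vprintk_emit().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for struct syslog_namespace. */
struct demo_log_ns {
	char buf[128];
};

static struct demo_log_ns demo_init_ns;

/* Stand-in for current_syslog_ns(): always the initial namespace here. */
static struct demo_log_ns *demo_current_ns(void)
{
	return &demo_init_ns;
}

/* Common emitter taking a va_list, like vprintk_emit() in the patch. */
static int demo_vemit(struct demo_log_ns *ns, const char *fmt, va_list args)
{
	return vsnprintf(ns->buf, sizeof(ns->buf), fmt, args);
}

/* Logs into the given namespace; NULL falls back to the current one. */
static int demo_ns_printk(struct demo_log_ns *ns, const char *fmt, ...)
{
	va_list args;
	int r;

	assert(fmt);

	if (!ns)
		ns = demo_current_ns();

	va_start(args, fmt);
	r = demo_vemit(ns, fmt, args);
	va_end(args);

	return r;
}
```

Passing the namespace explicitly is what lets a caller running outside the
container's task context still log into the container's buffer.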
include/linux/printk.h | 1 +
kernel/printk.c | 37 +++++++++++++++++++++++++++++++++++++
2 files changed, 38 insertions(+), 0 deletions(-)
diff --git a/include/linux/printk.h b/include/linux/printk.h
index e0c60d9..444d229 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -119,6 +119,7 @@ asmlinkage int printk_emit(int facility, int level,
asmlinkage __printf(1, 2) __cold
int printk(const char *fmt, ...);
+int ns_printk(struct syslog_namespace *syslog_ns, const char *fmt, ...);
/*
* Special printk facility for scheduler use only, _DO_NOT_USE_ !
diff --git a/kernel/printk.c b/kernel/printk.c
index 2ef9c46..85a9965 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1681,6 +1681,43 @@ asmlinkage int printk(const char *fmt, ...)
}
EXPORT_SYMBOL(printk);
+/**
+ * ns_printk - print a kernel message in syslog_ns
+ * @syslog_ns: syslog namespace
+ * @fmt: format string
+ *
+ * This is ns_printk().
+ * It can be called from container context. We add a param
+ * syslog_ns to record current syslog namespace,because
+ * we can't get the correct syslog_ns from current_syslog_ns
+ * in some context,e.g. iptable.
+ *
+ * See the vsnprintf() documentation for format string extensions over C99.
+ **/
+asmlinkage int ns_printk(struct syslog_namespace *syslog_ns,
+ const char *fmt, ...)
+{
+ va_list args;
+ int r;
+
+ if (!syslog_ns)
+ syslog_ns = current_syslog_ns();
+
+#ifdef CONFIG_KGDB_KDB
+ if (unlikely(kdb_trap_printk)) {
+ va_start(args, fmt);
+ r = vkdb_printf(fmt, args);
+ va_end(args);
+ return r;
+ }
+#endif
+ va_start(args, fmt);
+ r = vprintk_emit(0, -1, NULL, 0, fmt, args, syslog_ns);
+ va_end(args);
+
+ return r;
+}
+EXPORT_SYMBOL(ns_printk);
#else /* CONFIG_PRINTK */
#define LOG_LINE_MAX 0
--
1.7.1
^ permalink raw reply related
* [PATCH RFC 5/5] printk: use ns_printk in iptable context
From: Rui Xiang @ 2012-11-19 8:17 UTC (permalink / raw)
To: serge.hallyn, netdev; +Cc: containers, Eric W. Biederman
From: Libo Chen <clbchenlibo.chen@huawei.com>
We add a syslog_ns pointer to the net namespace to fix the iptables issue,
and call ns_printk with the syslog_ns obtained from
skb->dev->nd_net->syslog_ns.
Signed-off-by: Libo Chen <clbchenlibo.chen@huawei.com>
Signed-off-by: Xiang Rui <rui.xiang@huawei.com>
---
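The lifetime rule this patch introduces (the net namespace takes a reference
on the syslog namespace when it is created and drops it on cleanup) can be
sketched as below. The types and the plain integer counter are hypothetical
simplifications; the real code uses a kref and get_syslog_ns() /
put_syslog_ns().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the counter is not atomic in this sketch. */
struct demo_syslog_ns {
	int refcount;
};

struct demo_net {
	struct demo_syslog_ns *syslog_ns;
};

static struct demo_syslog_ns *demo_get(struct demo_syslog_ns *ns)
{
	assert(ns);
	ns->refcount++;
	return ns;
}

static void demo_put(struct demo_syslog_ns *ns)
{
	assert(ns && ns->refcount > 0);
	ns->refcount--;
}

/* On creation the net namespace pins the syslog namespace it logs to. */
static void demo_net_init(struct demo_net *net, struct demo_syslog_ns *ns)
{
	assert(net);
	net->syslog_ns = demo_get(ns);
}

/* On cleanup the reference is dropped before the net is freed. */
static void demo_net_cleanup(struct demo_net *net)
{
	assert(net);
	demo_put(net->syslog_ns);
	net->syslog_ns = NULL;
}
```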
include/linux/syslog_namespace.h | 7 ++++---
include/net/net_namespace.h | 7 +++++--
include/net/netfilter/xt_log.h | 7 +++++--
kernel/nsproxy.c | 21 +++++++++++----------
kernel/syslog_namespace.c | 6 ++++--
net/core/net_namespace.c | 12 ++++++++++--
net/netfilter/xt_LOG.c | 4 ++--
7 files changed, 41 insertions(+), 23 deletions(-)
diff --git a/include/linux/syslog_namespace.h b/include/linux/syslog_namespace.h
index 1ecb8b8..2053409 100644
--- a/include/linux/syslog_namespace.h
+++ b/include/linux/syslog_namespace.h
@@ -58,7 +58,7 @@ static inline struct syslog_namespace *current_syslog_ns(void)
#ifdef CONFIG_SYSLOG_NS
extern void free_syslog_ns(struct kref *kref);
extern struct syslog_namespace *copy_syslog_ns(unsigned long flags,
- struct task_struct *tsk);
+ struct syslog_namespace *syslog_ns);
static inline struct syslog_namespace *get_syslog_ns(
struct syslog_namespace *ns)
@@ -76,11 +76,12 @@ static inline void put_syslog_ns(struct syslog_namespace *ns)
#else
static inline struct syslog_namespace *copy_syslog_ns(unsigned long flags,
- struct task_struct *tsk)
+ struct syslog_namespace *syslog_ns)
{
if (flags & CLONE_NEWSYSLOG)
return ERR_PTR(-EINVAL);
- return tsk->nsproxy->syslog_ns;
+
+ return syslog_ns;
}
static inline struct syslog_namespace *get_syslog_ns(
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 95e6466..61fe80f 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -108,6 +108,7 @@ struct net {
#ifdef CONFIG_XFRM
struct netns_xfrm xfrm;
#endif
+ struct syslog_namespace *syslog_ns;
struct netns_ipvs *ipvs;
struct sock *diag_nlsk;
atomic_t rt_genid;
@@ -127,10 +128,12 @@ struct net {
extern struct net init_net;
#ifdef CONFIG_NET
-extern struct net *copy_net_ns(unsigned long flags, struct net *net_ns);
+extern struct net *copy_net_ns(unsigned long flags, struct net *net_ns,
+ struct syslog_namespace *syslog_ns);
#else /* CONFIG_NET */
-static inline struct net *copy_net_ns(unsigned long flags, struct net *net_ns)
+static inline struct net *copy_net_ns(unsigned long flags, struct net *net_ns,
+ struct syslog_namespace *syslog_ns)
{
/* There is nothing to copy so this is a noop */
return net_ns;
diff --git a/include/net/netfilter/xt_log.h b/include/net/netfilter/xt_log.h
index 9d9756c..5f15e0e 100644
--- a/include/net/netfilter/xt_log.h
+++ b/include/net/netfilter/xt_log.h
@@ -39,11 +39,14 @@ static struct sbuff *sb_open(void)
return m;
}
-static void sb_close(struct sbuff *m)
+static void sb_close(struct sbuff *m, struct sk_buff *skb)
{
m->buf[m->count] = 0;
+#ifdef CONFIG_NET_NS
+ ns_printk(skb->dev->nd_net->syslog_ns, "%s\n", m->buf);
+#else
printk("%s\n", m->buf);
-
+#endif
if (likely(m != &emergency))
kfree(m);
else {
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 331d31f..cb9608a 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -92,24 +92,25 @@ static struct nsproxy *create_new_namespaces(unsigned long flags,
goto out_pid;
}
- new_nsp->net_ns = copy_net_ns(flags, tsk->nsproxy->net_ns);
- if (IS_ERR(new_nsp->net_ns)) {
- err = PTR_ERR(new_nsp->net_ns);
- goto out_net;
- }
-
- new_nsp->syslog_ns = copy_syslog_ns(flags, tsk);
+ new_nsp->syslog_ns = copy_syslog_ns(flags, tsk->nsproxy->syslog_ns);
if (IS_ERR(new_nsp->syslog_ns)) {
err = PTR_ERR(new_nsp->syslog_ns);
goto out_syslog;
}
+ new_nsp->net_ns = copy_net_ns(flags, tsk->nsproxy->net_ns,
+ new_nsp->syslog_ns);
+ if (IS_ERR(new_nsp->net_ns)) {
+ err = PTR_ERR(new_nsp->net_ns);
+ goto out_net;
+ }
+
return new_nsp;
-out_syslog:
- if (new_nsp->net_ns)
- put_net(new_nsp->net_ns);
out_net:
+ if (new_nsp->syslog_ns)
+ put_syslog_ns(new_nsp->syslog_ns);
+out_syslog:
if (new_nsp->pid_ns)
put_pid_ns(new_nsp->pid_ns);
out_pid:
diff --git a/kernel/syslog_namespace.c b/kernel/syslog_namespace.c
index a12e1c1..1c3ed4b 100644
--- a/kernel/syslog_namespace.c
+++ b/kernel/syslog_namespace.c
@@ -9,6 +9,7 @@
#include <linux/module.h>
#include <linux/bootmem.h>
#include <linux/syslog_namespace.h>
+#include <net/net_namespace.h>
static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
@@ -46,10 +47,11 @@ static struct syslog_namespace *create_syslog_ns(unsigned int buf_len)
}
struct syslog_namespace *copy_syslog_ns(unsigned long flags,
- struct task_struct *tsk)
+ struct syslog_namespace *syslog_ns)
{
if (!(flags & CLONE_NEWSYSLOG))
- return get_syslog_ns(tsk->nsproxy->syslog_ns);
+ return get_syslog_ns(syslog_ns);
+
return create_syslog_ns(CONTAINER_BUF_LEN);
}
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 42f1e1c..f192e1e 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -15,6 +15,7 @@
#include <linux/export.h>
#include <net/net_namespace.h>
#include <net/netns/generic.h>
+#include <linux/syslog_namespace.h>
/*
* Our network namespace constructor/destructor lists
@@ -29,6 +30,7 @@ EXPORT_SYMBOL_GPL(net_namespace_list);
struct net init_net = {
.dev_base_head = LIST_HEAD_INIT(init_net.dev_base_head),
+ .syslog_ns = &init_syslog_ns
};
EXPORT_SYMBOL(init_net);
@@ -232,7 +234,8 @@ void net_drop_ns(void *p)
net_free(ns);
}
-struct net *copy_net_ns(unsigned long flags, struct net *old_net)
+struct net *copy_net_ns(unsigned long flags, struct net *old_net,
+ struct syslog_namespace *syslog_ns)
{
struct net *net;
int rv;
@@ -255,6 +258,9 @@ struct net *copy_net_ns(unsigned long flags, struct net *old_net)
net_drop_ns(net);
return ERR_PTR(rv);
}
+
+ net->syslog_ns = get_syslog_ns(syslog_ns);
+
return net;
}
@@ -308,6 +314,7 @@ static void cleanup_net(struct work_struct *work)
/* Finally it is safe to free my network namespace structure */
list_for_each_entry_safe(net, tmp, &net_exit_list, exit_list) {
list_del_init(&net->exit_list);
+ put_syslog_ns(net->syslog_ns);
net_drop_ns(net);
}
}
@@ -347,7 +354,8 @@ struct net *get_net_ns_by_fd(int fd)
}
#else
-struct net *copy_net_ns(unsigned long flags, struct net *old_net)
+struct net *copy_net_ns(unsigned long flags, struct net *old_net,
+ struct syslog_namespace *syslog_ns)
{
if (flags & CLONE_NEWNET)
return ERR_PTR(-EINVAL);
diff --git a/net/netfilter/xt_LOG.c b/net/netfilter/xt_LOG.c
index fa40096..6b13b72 100644
--- a/net/netfilter/xt_LOG.c
+++ b/net/netfilter/xt_LOG.c
@@ -486,7 +486,7 @@ ipt_log_packet(u_int8_t pf,
dump_ipv4_packet(m, loginfo, skb, 0);
- sb_close(m);
+ sb_close(m, skb);
}
#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
@@ -810,7 +810,7 @@ ip6t_log_packet(u_int8_t pf,
dump_ipv6_packet(m, loginfo, skb, skb_network_offset(skb), 1);
- sb_close(m);
+ sb_close(m, skb);
}
#endif
--
1.7.1
^ permalink raw reply related
* [PATCH RFC 3/5] printk: modify printk interface for syslog_namespace
From: Rui Xiang @ 2012-11-19 8:16 UTC (permalink / raw)
To: serge.hallyn, containers; +Cc: netdev, Eric W. Biederman
From: Libo Chen <clbchenlibo.chen@huawei.com>
We re-implement printk with an additional syslog_ns parameter.
The interfaces printk, /dev/kmsg, do_syslog and kmsg_dump must be modified
for syslog_ns. Previous identifiers *** such as log_first_seq are replaced
by syslog_ns->***.
Signed-off-by: Libo Chen <clbchenlibo.chen@huawei.com>
Signed-off-by: Xiang Rui <rui.xiang@huawei.com>
---
drivers/base/core.c | 4 +-
include/linux/printk.h | 4 +-
kernel/printk.c | 609 +++++++++++++++++++++++++++++-------------------
3 files changed, 372 insertions(+), 245 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index abea76c..665c2f7 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -26,6 +26,7 @@
#include <linux/async.h>
#include <linux/pm_runtime.h>
#include <linux/netdevice.h>
+#include <linux/syslog_namespace.h>
#include "base.h"
#include "power/power.h"
@@ -1922,7 +1923,8 @@ int dev_vprintk_emit(int level, const struct device *dev,
hdrlen = create_syslog_header(dev, hdr, sizeof(hdr));
- return vprintk_emit(0, level, hdrlen ? hdr : NULL, hdrlen, fmt, args);
+ return vprintk_emit(0, level, hdrlen ? hdr : NULL, hdrlen,
+ fmt, args, current_syslog_ns());
}
EXPORT_SYMBOL(dev_vprintk_emit);
diff --git a/include/linux/printk.h b/include/linux/printk.h
index 9afc01e..e0c60d9 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -7,6 +7,7 @@
extern const char linux_banner[];
extern const char linux_proc_banner[];
+struct syslog_namespace;
static inline int printk_get_level(const char *buffer)
{
if (buffer[0] == KERN_SOH_ASCII && buffer[1]) {
@@ -105,7 +106,8 @@ extern void printk_tick(void);
asmlinkage __printf(5, 0)
int vprintk_emit(int facility, int level,
const char *dict, size_t dictlen,
- const char *fmt, va_list args);
+ const char *fmt, va_list args,
+ struct syslog_namespace *syslog_ns);
asmlinkage __printf(1, 0)
int vprintk(const char *fmt, va_list args);
diff --git a/kernel/printk.c b/kernel/printk.c
index 2d607f4..2ef9c46 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -42,6 +42,7 @@
#include <linux/notifier.h>
#include <linux/rculist.h>
#include <linux/poll.h>
+#include <linux/syslog_namespace.h>
#include <asm/uaccess.h>
@@ -214,46 +215,14 @@ struct log {
* The logbuf_lock protects kmsg buffer, indices, counters. It is also
* used in interesting ways to provide interlocking in console_unlock();
*/
-static DEFINE_RAW_SPINLOCK(logbuf_lock);
#ifdef CONFIG_PRINTK
-/* the next printk record to read by syslog(READ) or /proc/kmsg */
-static u64 syslog_seq;
-static u32 syslog_idx;
-static enum log_flags syslog_prev;
-static size_t syslog_partial;
-
-/* index and sequence number of the first record stored in the buffer */
-static u64 log_first_seq;
-static u32 log_first_idx;
-
-/* index and sequence number of the next record to store in the buffer */
-static u64 log_next_seq;
-static u32 log_next_idx;
-/* the next printk record to write to the console */
-static u64 console_seq;
-static u32 console_idx;
static enum log_flags console_prev;
-/* the next printk record to read after the last 'clear' command */
-static u64 clear_seq;
-static u32 clear_idx;
-
#define PREFIX_MAX 32
#define LOG_LINE_MAX 1024 - PREFIX_MAX
-/* record buffer */
-#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
-#define LOG_ALIGN 4
-#else
-#define LOG_ALIGN __alignof__(struct log)
-#endif
-#define __LOG_BUF_LEN (1 << CONFIG_LOG_BUF_SHIFT)
-static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
-static char *log_buf = __log_buf;
-static u32 log_buf_len = __LOG_BUF_LEN;
-
/* cpu currently holding logbuf_lock */
static volatile unsigned int logbuf_cpu = UINT_MAX;
@@ -270,23 +239,23 @@ static char *log_dict(const struct log *msg)
}
/* get record by index; idx must point to valid msg */
-static struct log *log_from_idx(u32 idx)
+static struct log *log_from_idx(u32 idx, struct syslog_namespace *syslog_ns)
{
- struct log *msg = (struct log *)(log_buf + idx);
+ struct log *msg = (struct log *)(syslog_ns->log_buf + idx);
/*
* A length == 0 record is the end of buffer marker. Wrap around and
* read the message at the start of the buffer.
*/
if (!msg->len)
- return (struct log *)log_buf;
+ return (struct log *)syslog_ns->log_buf;
return msg;
}
/* get next record; idx must point to valid msg */
-static u32 log_next(u32 idx)
+static u32 log_next(u32 idx, struct syslog_namespace *syslog_ns)
{
- struct log *msg = (struct log *)(log_buf + idx);
+ struct log *msg = (struct log *)(syslog_ns->log_buf + idx);
/* length == 0 indicates the end of the buffer; wrap */
/*
@@ -295,7 +264,7 @@ static u32 log_next(u32 idx)
* return the one after that.
*/
if (!msg->len) {
- msg = (struct log *)log_buf;
+ msg = (struct log *)syslog_ns->log_buf;
return msg->len;
}
return idx + msg->len;
@@ -305,7 +274,8 @@ static u32 log_next(u32 idx)
static void log_store(int facility, int level,
enum log_flags flags, u64 ts_nsec,
const char *dict, u16 dict_len,
- const char *text, u16 text_len)
+ const char *text, u16 text_len,
+ struct syslog_namespace *syslog_ns)
{
struct log *msg;
u32 size, pad_len;
@@ -315,34 +285,40 @@ static void log_store(int facility, int level,
pad_len = (-size) & (LOG_ALIGN - 1);
size += pad_len;
- while (log_first_seq < log_next_seq) {
+ while (syslog_ns->log_first_seq < syslog_ns->log_next_seq) {
u32 free;
- if (log_next_idx > log_first_idx)
- free = max(log_buf_len - log_next_idx, log_first_idx);
+ if (syslog_ns->log_next_idx > syslog_ns->log_first_idx)
+ free = max(syslog_ns->log_buf_len -
+ syslog_ns->log_next_idx,
+ syslog_ns->log_first_idx);
else
- free = log_first_idx - log_next_idx;
+ free = syslog_ns->log_first_idx -
+ syslog_ns->log_next_idx;
if (free > size + sizeof(struct log))
break;
/* drop old messages until we have enough contiuous space */
- log_first_idx = log_next(log_first_idx);
- log_first_seq++;
+ syslog_ns->log_first_idx =
+ log_next(syslog_ns->log_first_idx, syslog_ns);
+ syslog_ns->log_first_seq++;
}
- if (log_next_idx + size + sizeof(struct log) >= log_buf_len) {
+ if (syslog_ns->log_next_idx + size + sizeof(struct log) >=
+ syslog_ns->log_buf_len) {
/*
* This message + an additional empty header does not fit
* at the end of the buffer. Add an empty header with len == 0
* to signify a wrap around.
*/
- memset(log_buf + log_next_idx, 0, sizeof(struct log));
- log_next_idx = 0;
+ memset(syslog_ns->log_buf + syslog_ns->log_next_idx,
+ 0, sizeof(struct log));
+ syslog_ns->log_next_idx = 0;
}
/* fill message */
- msg = (struct log *)(log_buf + log_next_idx);
+ msg = (struct log *)(syslog_ns->log_buf + syslog_ns->log_next_idx);
memcpy(log_text(msg), text, text_len);
msg->text_len = text_len;
memcpy(log_dict(msg), dict, dict_len);
@@ -358,8 +334,8 @@ static void log_store(int facility, int level,
msg->len = sizeof(struct log) + text_len + dict_len + pad_len;
/* insert message */
- log_next_idx += msg->len;
- log_next_seq++;
+ syslog_ns->log_next_idx += msg->len;
+ syslog_ns->log_next_seq++;
}
/* /dev/kmsg - userspace message inject/listen interface */
@@ -368,6 +344,7 @@ struct devkmsg_user {
u32 idx;
enum log_flags prev;
struct mutex lock;
+ struct syslog_namespace *syslog_ns;
char buf[8192];
};
@@ -431,6 +408,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
struct devkmsg_user *user = file->private_data;
+ struct syslog_namespace *syslog_ns = user->syslog_ns;
struct log *msg;
u64 ts_usec;
size_t i;
@@ -444,32 +422,32 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
ret = mutex_lock_interruptible(&user->lock);
if (ret)
return ret;
- raw_spin_lock_irq(&logbuf_lock);
- while (user->seq == log_next_seq) {
+ raw_spin_lock_irq(&syslog_ns->logbuf_lock);
+ while (user->seq == syslog_ns->log_next_seq) {
if (file->f_flags & O_NONBLOCK) {
ret = -EAGAIN;
- raw_spin_unlock_irq(&logbuf_lock);
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
goto out;
}
- raw_spin_unlock_irq(&logbuf_lock);
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
ret = wait_event_interruptible(log_wait,
- user->seq != log_next_seq);
+ user->seq != syslog_ns->log_next_seq);
if (ret)
goto out;
- raw_spin_lock_irq(&logbuf_lock);
+ raw_spin_lock_irq(&syslog_ns->logbuf_lock);
}
- if (user->seq < log_first_seq) {
+ if (user->seq < syslog_ns->log_first_seq) {
/* our last seen message is gone, return error and reset */
- user->idx = log_first_idx;
- user->seq = log_first_seq;
+ user->idx = syslog_ns->log_first_idx;
+ user->seq = syslog_ns->log_first_seq;
ret = -EPIPE;
- raw_spin_unlock_irq(&logbuf_lock);
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
goto out;
}
- msg = log_from_idx(user->idx);
+ msg = log_from_idx(user->idx, syslog_ns);
ts_usec = msg->ts_nsec;
do_div(ts_usec, 1000);
@@ -530,9 +508,9 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
user->buf[len++] = '\n';
}
- user->idx = log_next(user->idx);
+ user->idx = log_next(user->idx, syslog_ns);
user->seq++;
- raw_spin_unlock_irq(&logbuf_lock);
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
if (len > count) {
ret = -EINVAL;
@@ -552,6 +530,7 @@ out:
static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
{
struct devkmsg_user *user = file->private_data;
+ struct syslog_namespace *syslog_ns = user->syslog_ns;
loff_t ret = 0;
if (!user)
@@ -559,12 +538,12 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
if (offset)
return -ESPIPE;
- raw_spin_lock_irq(&logbuf_lock);
+ raw_spin_lock_irq(&syslog_ns->logbuf_lock);
switch (whence) {
case SEEK_SET:
/* the first record */
- user->idx = log_first_idx;
- user->seq = log_first_seq;
+ user->idx = syslog_ns->log_first_idx;
+ user->seq = syslog_ns->log_first_seq;
break;
case SEEK_DATA:
/*
@@ -572,24 +551,25 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
* like issued by 'dmesg -c'. Reading /dev/kmsg itself
* changes no global state, and does not clear anything.
*/
- user->idx = clear_idx;
- user->seq = clear_seq;
+ user->idx = syslog_ns->clear_idx;
+ user->seq = syslog_ns->clear_seq;
break;
case SEEK_END:
/* after the last record */
- user->idx = log_next_idx;
- user->seq = log_next_seq;
+ user->idx = syslog_ns->log_next_idx;
+ user->seq = syslog_ns->log_next_seq;
break;
default:
ret = -EINVAL;
}
- raw_spin_unlock_irq(&logbuf_lock);
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
return ret;
}
static unsigned int devkmsg_poll(struct file *file, poll_table *wait)
{
struct devkmsg_user *user = file->private_data;
+ struct syslog_namespace *syslog_ns = user->syslog_ns;
int ret = 0;
if (!user)
@@ -597,20 +577,21 @@ static unsigned int devkmsg_poll(struct file *file, poll_table *wait)
poll_wait(file, &log_wait, wait);
- raw_spin_lock_irq(&logbuf_lock);
- if (user->seq < log_next_seq) {
+ raw_spin_lock_irq(&syslog_ns->logbuf_lock);
+ if (user->seq < syslog_ns->log_next_seq) {
/* return error when data has vanished underneath us */
- if (user->seq < log_first_seq)
+ if (user->seq < syslog_ns->log_first_seq)
ret = POLLIN|POLLRDNORM|POLLERR|POLLPRI;
ret = POLLIN|POLLRDNORM;
}
- raw_spin_unlock_irq(&logbuf_lock);
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
return ret;
}
static int devkmsg_open(struct inode *inode, struct file *file)
{
+ struct syslog_namespace *syslog_ns = current_syslog_ns();
struct devkmsg_user *user;
int err;
@@ -628,10 +609,11 @@ static int devkmsg_open(struct inode *inode, struct file *file)
mutex_init(&user->lock);
- raw_spin_lock_irq(&logbuf_lock);
- user->idx = log_first_idx;
- user->seq = log_first_seq;
- raw_spin_unlock_irq(&logbuf_lock);
+ user->syslog_ns = syslog_ns;
+ raw_spin_lock_irq(&syslog_ns->logbuf_lock);
+ user->idx = syslog_ns->log_first_idx;
+ user->seq = syslog_ns->log_first_seq;
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
file->private_data = user;
return 0;
@@ -669,10 +651,12 @@ const struct file_operations kmsg_fops = {
*/
void log_buf_kexec_setup(void)
{
- VMCOREINFO_SYMBOL(log_buf);
- VMCOREINFO_SYMBOL(log_buf_len);
- VMCOREINFO_SYMBOL(log_first_idx);
- VMCOREINFO_SYMBOL(log_next_idx);
+ struct syslog_namespace *syslog_ns = current_syslog_ns();
+
+ VMCOREINFO_SYMBOL(syslog_ns->log_buf);
+ VMCOREINFO_SYMBOL(syslog_ns->log_buf_len);
+ VMCOREINFO_SYMBOL(syslog_ns->log_first_idx);
+ VMCOREINFO_SYMBOL(syslog_ns->log_next_idx);
/*
* Export struct log size and field offsets. User space tools can
* parse it and detect any changes to structure down the line.
@@ -692,10 +676,11 @@ static unsigned long __initdata new_log_buf_len;
static int __init log_buf_len_setup(char *str)
{
unsigned size = memparse(str, &str);
+ struct syslog_namespace *syslog_ns = &init_syslog_ns;
if (size)
size = roundup_pow_of_two(size);
- if (size > log_buf_len)
+ if (size > syslog_ns->log_buf_len)
new_log_buf_len = size;
return 0;
@@ -707,6 +692,7 @@ void __init setup_log_buf(int early)
unsigned long flags;
char *new_log_buf;
int free;
+ struct syslog_namespace *syslog_ns = &init_syslog_ns;
if (!new_log_buf_len)
return;
@@ -728,15 +714,15 @@ void __init setup_log_buf(int early)
return;
}
- raw_spin_lock_irqsave(&logbuf_lock, flags);
- log_buf_len = new_log_buf_len;
- log_buf = new_log_buf;
+ raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
+ memcpy(new_log_buf, syslog_ns->log_buf, __LOG_BUF_LEN);
+ syslog_ns->log_buf_len = new_log_buf_len;
+ syslog_ns->log_buf = new_log_buf;
new_log_buf_len = 0;
- free = __LOG_BUF_LEN - log_next_idx;
- memcpy(log_buf, __log_buf, __LOG_BUF_LEN);
- raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+ free = __LOG_BUF_LEN - syslog_ns->log_next_idx;
+ raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
- pr_info("log_buf_len: %d\n", log_buf_len);
+ pr_info("log_buf_len: %d\n", syslog_ns->log_buf_len);
pr_info("early log buf free: %d(%d%%)\n",
free, (free * 100) / __LOG_BUF_LEN);
}
@@ -937,7 +923,8 @@ static size_t msg_print_text(const struct log *msg, enum log_flags prev,
return len;
}
-static int syslog_print(char __user *buf, int size)
+static int syslog_print(char __user *buf, int size,
+ struct syslog_namespace *syslog_ns)
{
char *text;
struct log *msg;
@@ -951,37 +938,38 @@ static int syslog_print(char __user *buf, int size)
size_t n;
size_t skip;
- raw_spin_lock_irq(&logbuf_lock);
- if (syslog_seq < log_first_seq) {
+ raw_spin_lock_irq(&syslog_ns->logbuf_lock);
+ if (syslog_ns->syslog_seq < syslog_ns->log_first_seq) {
/* messages are gone, move to first one */
- syslog_seq = log_first_seq;
- syslog_idx = log_first_idx;
- syslog_prev = 0;
- syslog_partial = 0;
+ syslog_ns->syslog_seq = syslog_ns->log_first_seq;
+ syslog_ns->syslog_idx = syslog_ns->log_first_idx;
+ syslog_ns->syslog_prev = 0;
+ syslog_ns->syslog_partial = 0;
}
- if (syslog_seq == log_next_seq) {
- raw_spin_unlock_irq(&logbuf_lock);
+ if (syslog_ns->syslog_seq == syslog_ns->log_next_seq) {
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
break;
}
- skip = syslog_partial;
- msg = log_from_idx(syslog_idx);
- n = msg_print_text(msg, syslog_prev, true, text,
+ skip = syslog_ns->syslog_partial;
+ msg = log_from_idx(syslog_ns->syslog_idx, syslog_ns);
+ n = msg_print_text(msg, syslog_ns->syslog_prev, true, text,
LOG_LINE_MAX + PREFIX_MAX);
- if (n - syslog_partial <= size) {
+ if (n - syslog_ns->syslog_partial <= size) {
/* message fits into buffer, move forward */
- syslog_idx = log_next(syslog_idx);
- syslog_seq++;
- syslog_prev = msg->flags;
- n -= syslog_partial;
- syslog_partial = 0;
+ syslog_ns->syslog_idx =
+ log_next(syslog_ns->syslog_idx, syslog_ns);
+ syslog_ns->syslog_seq++;
+ syslog_ns->syslog_prev = msg->flags;
+ n -= syslog_ns->syslog_partial;
+ syslog_ns->syslog_partial = 0;
} else if (!len){
/* partial read(), remember position */
n = size;
- syslog_partial += n;
+ syslog_ns->syslog_partial += n;
} else
n = 0;
- raw_spin_unlock_irq(&logbuf_lock);
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
if (!n)
break;
@@ -1001,7 +989,8 @@ static int syslog_print(char __user *buf, int size)
return len;
}
-static int syslog_print_all(char __user *buf, int size, bool clear)
+static int syslog_print_all(char __user *buf, int size, bool clear,
+ struct syslog_namespace *syslog_ns)
{
char *text;
int len = 0;
@@ -1010,55 +999,55 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
if (!text)
return -ENOMEM;
- raw_spin_lock_irq(&logbuf_lock);
+ raw_spin_lock_irq(&syslog_ns->logbuf_lock);
if (buf) {
u64 next_seq;
u64 seq;
u32 idx;
enum log_flags prev;
- if (clear_seq < log_first_seq) {
+ if (syslog_ns->clear_seq < syslog_ns->log_first_seq) {
/* messages are gone, move to first available one */
- clear_seq = log_first_seq;
- clear_idx = log_first_idx;
+ syslog_ns->clear_seq = syslog_ns->log_first_seq;
+ syslog_ns->clear_idx = syslog_ns->log_first_idx;
}
/*
* Find first record that fits, including all following records,
* into the user-provided buffer for this dump.
*/
- seq = clear_seq;
- idx = clear_idx;
+ seq = syslog_ns->clear_seq;
+ idx = syslog_ns->clear_idx;
prev = 0;
- while (seq < log_next_seq) {
- struct log *msg = log_from_idx(idx);
+ while (seq < syslog_ns->log_next_seq) {
+ struct log *msg = log_from_idx(idx, syslog_ns);
len += msg_print_text(msg, prev, true, NULL, 0);
prev = msg->flags;
- idx = log_next(idx);
+ idx = log_next(idx, syslog_ns);
seq++;
}
/* move first record forward until length fits into the buffer */
- seq = clear_seq;
- idx = clear_idx;
+ seq = syslog_ns->clear_seq;
+ idx = syslog_ns->clear_idx;
prev = 0;
- while (len > size && seq < log_next_seq) {
- struct log *msg = log_from_idx(idx);
+ while (len > size && seq < syslog_ns->log_next_seq) {
+ struct log *msg = log_from_idx(idx, syslog_ns);
len -= msg_print_text(msg, prev, true, NULL, 0);
prev = msg->flags;
- idx = log_next(idx);
+ idx = log_next(idx, syslog_ns);
seq++;
}
/* last message fitting into this dump */
- next_seq = log_next_seq;
+ next_seq = syslog_ns->log_next_seq;
len = 0;
prev = 0;
while (len >= 0 && seq < next_seq) {
- struct log *msg = log_from_idx(idx);
+ struct log *msg = log_from_idx(idx, syslog_ns);
int textlen;
textlen = msg_print_text(msg, prev, true, text,
@@ -1067,31 +1056,31 @@ static int syslog_print_all(char __user *buf, int size, bool clear)
len = textlen;
break;
}
- idx = log_next(idx);
+ idx = log_next(idx, syslog_ns);
seq++;
prev = msg->flags;
- raw_spin_unlock_irq(&logbuf_lock);
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
if (copy_to_user(buf + len, text, textlen))
len = -EFAULT;
else
len += textlen;
- raw_spin_lock_irq(&logbuf_lock);
+ raw_spin_lock_irq(&syslog_ns->logbuf_lock);
- if (seq < log_first_seq) {
+ if (seq < syslog_ns->log_first_seq) {
/* messages are gone, move to next one */
- seq = log_first_seq;
- idx = log_first_idx;
+ seq = syslog_ns->log_first_seq;
+ idx = syslog_ns->log_first_idx;
prev = 0;
}
}
}
if (clear) {
- clear_seq = log_next_seq;
- clear_idx = log_next_idx;
+ syslog_ns->clear_seq = syslog_ns->log_next_seq;
+ syslog_ns->clear_idx = syslog_ns->log_next_idx;
}
- raw_spin_unlock_irq(&logbuf_lock);
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
kfree(text);
return len;
@@ -1102,6 +1091,7 @@ int do_syslog(int type, char __user *buf, int len, bool from_file)
bool clear = false;
static int saved_console_loglevel = -1;
int error;
+ struct syslog_namespace *syslog_ns = current_syslog_ns();
error = check_syslog_permissions(type, from_file);
if (error)
@@ -1128,10 +1118,10 @@ int do_syslog(int type, char __user *buf, int len, bool from_file)
goto out;
}
error = wait_event_interruptible(log_wait,
- syslog_seq != log_next_seq);
+ syslog_ns->syslog_seq != syslog_ns->log_next_seq);
if (error)
goto out;
- error = syslog_print(buf, len);
+ error = syslog_print(buf, len, syslog_ns);
break;
/* Read/clear last kernel messages */
case SYSLOG_ACTION_READ_CLEAR:
@@ -1149,11 +1139,11 @@ int do_syslog(int type, char __user *buf, int len, bool from_file)
error = -EFAULT;
goto out;
}
- error = syslog_print_all(buf, len, clear);
+ error = syslog_print_all(buf, len, clear, syslog_ns);
break;
/* Clear ring buffer */
case SYSLOG_ACTION_CLEAR:
- syslog_print_all(NULL, 0, true);
+ syslog_print_all(NULL, 0, true, syslog_ns);
break;
/* Disable logging to console */
case SYSLOG_ACTION_CONSOLE_OFF:
@@ -1182,13 +1172,13 @@ int do_syslog(int type, char __user *buf, int len, bool from_file)
break;
/* Number of chars in the log buffer */
case SYSLOG_ACTION_SIZE_UNREAD:
- raw_spin_lock_irq(&logbuf_lock);
- if (syslog_seq < log_first_seq) {
+ raw_spin_lock_irq(&syslog_ns->logbuf_lock);
+ if (syslog_ns->syslog_seq < syslog_ns->log_first_seq) {
/* messages are gone, move to first one */
- syslog_seq = log_first_seq;
- syslog_idx = log_first_idx;
- syslog_prev = 0;
- syslog_partial = 0;
+ syslog_ns->syslog_seq = syslog_ns->log_first_seq;
+ syslog_ns->syslog_idx = syslog_ns->log_first_idx;
+ syslog_ns->syslog_prev = 0;
+ syslog_ns->syslog_partial = 0;
}
if (from_file) {
/*
@@ -1196,28 +1186,28 @@ int do_syslog(int type, char __user *buf, int len, bool from_file)
* for pending data, not the size; return the count of
* records, not the length.
*/
- error = log_next_idx - syslog_idx;
+ error = syslog_ns->log_next_idx - syslog_ns->syslog_idx;
} else {
- u64 seq = syslog_seq;
- u32 idx = syslog_idx;
- enum log_flags prev = syslog_prev;
+ u64 seq = syslog_ns->syslog_seq;
+ u32 idx = syslog_ns->syslog_idx;
+ enum log_flags prev = syslog_ns->syslog_prev;
error = 0;
- while (seq < log_next_seq) {
- struct log *msg = log_from_idx(idx);
+ while (seq < syslog_ns->log_next_seq) {
+ struct log *msg = log_from_idx(idx, syslog_ns);
error += msg_print_text(msg, prev, true, NULL, 0);
- idx = log_next(idx);
+ idx = log_next(idx, syslog_ns);
seq++;
prev = msg->flags;
}
- error -= syslog_partial;
+ error -= syslog_ns->syslog_partial;
}
- raw_spin_unlock_irq(&logbuf_lock);
+ raw_spin_unlock_irq(&syslog_ns->logbuf_lock);
break;
/* Size of the log buffer */
case SYSLOG_ACTION_SIZE_BUFFER:
- error = log_buf_len;
+ error = syslog_ns->log_buf_len;
break;
default:
error = -EINVAL;
@@ -1282,7 +1272,7 @@ static void call_console_drivers(int level, const char *text, size_t len)
* every 10 seconds, to leave time for slow consoles to print a
* full oops.
*/
-static void zap_locks(void)
+static void zap_locks(struct syslog_namespace *syslog_ns)
{
static unsigned long oops_timestamp;
@@ -1294,7 +1284,7 @@ static void zap_locks(void)
debug_locks_off();
/* If a crash is occurring, make sure we can't deadlock */
- raw_spin_lock_init(&logbuf_lock);
+ raw_spin_lock_init(&syslog_ns->logbuf_lock);
/* And make sure that we print immediately */
sema_init(&console_sem, 1);
}
@@ -1334,8 +1324,9 @@ static inline int can_use_console(unsigned int cpu)
* interrupts disabled. It should return with 'lockbuf_lock'
* released but interrupts still disabled.
*/
-static int console_trylock_for_printk(unsigned int cpu)
- __releases(&logbuf_lock)
+static int console_trylock_for_printk(unsigned int cpu,
+ struct syslog_namespace *syslog_ns)
+ __releases(&syslog_ns->logbuf_lock)
{
int retval = 0, wake = 0;
@@ -1357,7 +1348,7 @@ static int console_trylock_for_printk(unsigned int cpu)
logbuf_cpu = UINT_MAX;
if (wake)
up(&console_sem);
- raw_spin_unlock(&logbuf_lock);
+ raw_spin_unlock(&syslog_ns->logbuf_lock);
return retval;
}
@@ -1393,7 +1384,7 @@ static struct cont {
bool flushed:1; /* buffer sealed and committed */
} cont;
-static void cont_flush(enum log_flags flags)
+static void cont_flush(enum log_flags flags, struct syslog_namespace *syslog_ns)
{
if (cont.flushed)
return;
@@ -1407,7 +1398,7 @@ static void cont_flush(enum log_flags flags)
* line. LOG_NOCONS suppresses a duplicated output.
*/
log_store(cont.facility, cont.level, flags | LOG_NOCONS,
- cont.ts_nsec, NULL, 0, cont.buf, cont.len);
+ cont.ts_nsec, NULL, 0, cont.buf, cont.len, syslog_ns);
cont.flags = flags;
cont.flushed = true;
} else {
@@ -1416,19 +1407,20 @@ static void cont_flush(enum log_flags flags)
* just submit it to the store and free the buffer.
*/
log_store(cont.facility, cont.level, flags, 0,
- NULL, 0, cont.buf, cont.len);
+ NULL, 0, cont.buf, cont.len, syslog_ns);
cont.len = 0;
}
}
-static bool cont_add(int facility, int level, const char *text, size_t len)
+static bool cont_add(int facility, int level, const char *text, size_t len,
+ struct syslog_namespace *syslog_ns)
{
if (cont.len && cont.flushed)
return false;
if (cont.len + len > sizeof(cont.buf)) {
/* the line gets too long, split it up in separate records */
- cont_flush(LOG_CONT);
+ cont_flush(LOG_CONT, syslog_ns);
return false;
}
@@ -1446,7 +1438,7 @@ static bool cont_add(int facility, int level, const char *text, size_t len)
cont.len += len;
if (cont.len > (sizeof(cont.buf) * 80) / 100)
- cont_flush(LOG_CONT);
+ cont_flush(LOG_CONT, syslog_ns);
return true;
}
@@ -1481,7 +1473,8 @@ static size_t cont_print_text(char *text, size_t size)
asmlinkage int vprintk_emit(int facility, int level,
const char *dict, size_t dictlen,
- const char *fmt, va_list args)
+ const char *fmt, va_list args,
+ struct syslog_namespace *syslog_ns)
{
static int recursion_bug;
static char textbuf[LOG_LINE_MAX];
@@ -1514,11 +1507,11 @@ asmlinkage int vprintk_emit(int facility, int level,
recursion_bug = 1;
goto out_restore_irqs;
}
- zap_locks();
+ zap_locks(syslog_ns);
}
lockdep_off();
- raw_spin_lock(&logbuf_lock);
+ raw_spin_lock(&syslog_ns->logbuf_lock);
logbuf_cpu = this_cpu;
if (recursion_bug) {
@@ -1529,7 +1522,7 @@ asmlinkage int vprintk_emit(int facility, int level,
printed_len += strlen(recursion_msg);
/* emit KERN_CRIT message */
log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
- NULL, 0, recursion_msg, printed_len);
+ NULL, 0, recursion_msg, printed_len, syslog_ns);
}
/*
@@ -1576,12 +1569,12 @@ asmlinkage int vprintk_emit(int facility, int level,
* or another task also prints continuation lines.
*/
if (cont.len && (lflags & LOG_PREFIX || cont.owner != current))
- cont_flush(LOG_NEWLINE);
+ cont_flush(LOG_NEWLINE, syslog_ns);
/* buffer line if possible, otherwise store it right away */
- if (!cont_add(facility, level, text, text_len))
+ if (!cont_add(facility, level, text, text_len, syslog_ns))
log_store(facility, level, lflags | LOG_CONT, 0,
- dict, dictlen, text, text_len);
+ dict, dictlen, text, text_len, syslog_ns);
} else {
bool stored = false;
@@ -1593,13 +1586,14 @@ asmlinkage int vprintk_emit(int facility, int level,
*/
if (cont.len && cont.owner == current) {
if (!(lflags & LOG_PREFIX))
- stored = cont_add(facility, level, text, text_len);
- cont_flush(LOG_NEWLINE);
+ stored = cont_add(facility, level, text,
+ text_len, syslog_ns);
+ cont_flush(LOG_NEWLINE, syslog_ns);
}
if (!stored)
log_store(facility, level, lflags, 0,
- dict, dictlen, text, text_len);
+ dict, dictlen, text, text_len, syslog_ns);
}
printed_len += text_len;
@@ -1611,7 +1605,7 @@ asmlinkage int vprintk_emit(int facility, int level,
* The console_trylock_for_printk() function will release 'logbuf_lock'
* regardless of whether it actually gets the console semaphore or not.
*/
- if (console_trylock_for_printk(this_cpu))
+ if (console_trylock_for_printk(this_cpu, syslog_ns))
console_unlock();
lockdep_on();
@@ -1624,7 +1618,8 @@ EXPORT_SYMBOL(vprintk_emit);
asmlinkage int vprintk(const char *fmt, va_list args)
{
- return vprintk_emit(0, -1, NULL, 0, fmt, args);
+ return vprintk_emit(0, -1, NULL, 0, fmt, args,
+ current_syslog_ns());
}
EXPORT_SYMBOL(vprintk);
@@ -1636,7 +1631,8 @@ asmlinkage int printk_emit(int facility, int level,
int r;
va_start(args, fmt);
- r = vprintk_emit(facility, level, dict, dictlen, fmt, args);
+ r = vprintk_emit(facility, level, dict, dictlen, fmt, args,
+ current_syslog_ns());
va_end(args);
return r;
@@ -1678,7 +1674,7 @@ asmlinkage int printk(const char *fmt, ...)
}
#endif
va_start(args, fmt);
- r = vprintk_emit(0, -1, NULL, 0, fmt, args);
+ r = vprintk_emit(0, -1, NULL, 0, fmt, args, current_syslog_ns());
va_end(args);
return r;
@@ -1981,12 +1977,13 @@ void wake_up_klogd(void)
this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
}
-static void console_cont_flush(char *text, size_t size)
+static void console_cont_flush(char *text, size_t size,
+ struct syslog_namespace *syslog_ns)
{
unsigned long flags;
size_t len;
- raw_spin_lock_irqsave(&logbuf_lock, flags);
+ raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
if (!cont.len)
goto out;
@@ -1996,18 +1993,131 @@ static void console_cont_flush(char *text, size_t size)
* busy. The earlier ones need to be printed before this one, we
* did not flush any fragment so far, so just let it queue up.
*/
- if (console_seq < log_next_seq && !cont.cons)
+ if (syslog_ns->console_seq < syslog_ns->log_next_seq && !cont.cons)
goto out;
len = cont_print_text(text, size);
- raw_spin_unlock(&logbuf_lock);
+ raw_spin_unlock(&syslog_ns->logbuf_lock);
stop_critical_timings();
call_console_drivers(cont.level, text, len);
start_critical_timings();
local_irq_restore(flags);
return;
out:
- raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+ raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
+}
+
+/**
+ * syslog_console_unlock - unlock the console system for syslog_namespace
+ *
+ * Releases the console_lock which the caller holds on the console system
+ * and the console driver list.
+ *
+ * While the console_lock was held, console output may have been buffered
+ * by printk(). If this is the case, syslog_console_unlock(); emits
+ * the output prior to releasing the lock.
+ *
+ * If there is output waiting, we wake /dev/kmsg and syslog() users.
+ *
+ * syslog_console_unlock(); may be called from any context.
+ */
+void syslog_console_unlock(struct syslog_namespace *syslog_ns)
+{
+ static char text[LOG_LINE_MAX + PREFIX_MAX];
+ static u64 seen_seq;
+ unsigned long flags;
+ bool wake_klogd = false;
+ bool retry;
+
+ if (console_suspended) {
+ up(&console_sem);
+ return;
+ }
+
+ console_may_schedule = 0;
+
+ /* flush buffered message fragment immediately to console */
+ console_cont_flush(text, sizeof(text), syslog_ns);
+again:
+ for (;;) {
+ struct log *msg;
+ size_t len;
+ int level;
+
+ raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
+ if (seen_seq != syslog_ns->log_next_seq) {
+ wake_klogd = true;
+ seen_seq = syslog_ns->log_next_seq;
+ }
+
+ if (syslog_ns->console_seq < syslog_ns->log_first_seq) {
+ /* messages are gone, move to first one */
+ syslog_ns->console_seq = syslog_ns->log_first_seq;
+ syslog_ns->console_idx = syslog_ns->log_first_idx;
+ console_prev = 0;
+ }
+skip:
+ if (syslog_ns->console_seq == syslog_ns->log_next_seq)
+ break;
+
+ msg = log_from_idx(syslog_ns->console_idx, syslog_ns);
+ if (msg->flags & LOG_NOCONS) {
+ /*
+ * Skip record we have buffered and already printed
+ * directly to the console when we received it.
+ */
+ syslog_ns->console_idx =
+ log_next(syslog_ns->console_idx, syslog_ns);
+ syslog_ns->console_seq++;
+ /*
+ * We will get here again when we register a new
+ * CON_PRINTBUFFER console. Clear the flag so we
+ * will properly dump everything later.
+ */
+ msg->flags &= ~LOG_NOCONS;
+ console_prev = msg->flags;
+ goto skip;
+ }
+
+ level = msg->level;
+ len = msg_print_text(msg, console_prev, false,
+ text, sizeof(text));
+ syslog_ns->console_idx =
+ log_next(syslog_ns->console_idx, syslog_ns);
+ syslog_ns->console_seq++;
+ console_prev = msg->flags;
+ raw_spin_unlock(&syslog_ns->logbuf_lock);
+
+ stop_critical_timings(); /* don't trace print latency */
+ call_console_drivers(level, text, len);
+ start_critical_timings();
+ local_irq_restore(flags);
+ }
+ console_locked = 0;
+
+ /* Release the exclusive_console once it is used */
+ if (unlikely(exclusive_console))
+ exclusive_console = NULL;
+
+ raw_spin_unlock(&syslog_ns->logbuf_lock);
+
+ up(&console_sem);
+
+ /*
+ * Someone could have filled up the buffer again, so re-check if there's
+ * something to flush. In case we cannot trylock the console_sem again,
+ * there's a new owner and the console_unlock() from them will do the
+ * flush, no worries.
+ */
+ raw_spin_lock(&syslog_ns->logbuf_lock);
+ retry = syslog_ns->console_seq != syslog_ns->log_next_seq;
+ raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
+
+ if (retry && console_trylock())
+ goto again;
+
+ if (wake_klogd)
+ wake_up_klogd();
}
/**
@@ -2027,6 +2137,7 @@ out:
void console_unlock(void)
{
static char text[LOG_LINE_MAX + PREFIX_MAX];
+ struct syslog_namespace *syslog_ns = current_syslog_ns();
static u64 seen_seq;
unsigned long flags;
bool wake_klogd = false;
@@ -2040,37 +2151,38 @@ void console_unlock(void)
console_may_schedule = 0;
/* flush buffered message fragment immediately to console */
- console_cont_flush(text, sizeof(text));
+ console_cont_flush(text, sizeof(text), syslog_ns);
again:
for (;;) {
struct log *msg;
size_t len;
int level;
- raw_spin_lock_irqsave(&logbuf_lock, flags);
- if (seen_seq != log_next_seq) {
+ raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
+ if (seen_seq != syslog_ns->log_next_seq) {
wake_klogd = true;
- seen_seq = log_next_seq;
+ seen_seq = syslog_ns->log_next_seq;
}
- if (console_seq < log_first_seq) {
+ if (syslog_ns->console_seq < syslog_ns->log_first_seq) {
/* messages are gone, move to first one */
- console_seq = log_first_seq;
- console_idx = log_first_idx;
+ syslog_ns->console_seq = syslog_ns->log_first_seq;
+ syslog_ns->console_idx = syslog_ns->log_first_idx;
console_prev = 0;
}
skip:
- if (console_seq == log_next_seq)
+ if (syslog_ns->console_seq == syslog_ns->log_next_seq)
break;
- msg = log_from_idx(console_idx);
+ msg = log_from_idx(syslog_ns->console_idx, syslog_ns);
if (msg->flags & LOG_NOCONS) {
/*
* Skip record we have buffered and already printed
* directly to the console when we received it.
*/
- console_idx = log_next(console_idx);
- console_seq++;
+ syslog_ns->console_idx =
+ log_next(syslog_ns->console_idx, syslog_ns);
+ syslog_ns->console_seq++;
/*
* We will get here again when we register a new
* CON_PRINTBUFFER console. Clear the flag so we
@@ -2084,10 +2196,11 @@ skip:
level = msg->level;
len = msg_print_text(msg, console_prev, false,
text, sizeof(text));
- console_idx = log_next(console_idx);
- console_seq++;
+ syslog_ns->console_idx =
+ log_next(syslog_ns->console_idx, syslog_ns);
+ syslog_ns->console_seq++;
console_prev = msg->flags;
- raw_spin_unlock(&logbuf_lock);
+ raw_spin_unlock(&syslog_ns->logbuf_lock);
stop_critical_timings(); /* don't trace print latency */
call_console_drivers(level, text, len);
@@ -2100,7 +2213,7 @@ skip:
if (unlikely(exclusive_console))
exclusive_console = NULL;
- raw_spin_unlock(&logbuf_lock);
+ raw_spin_unlock(&syslog_ns->logbuf_lock);
up(&console_sem);
@@ -2110,9 +2223,9 @@ skip:
* there's a new owner and the console_unlock() from them will do the
* flush, no worries.
*/
- raw_spin_lock(&logbuf_lock);
- retry = console_seq != log_next_seq;
- raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+ raw_spin_lock(&syslog_ns->logbuf_lock);
+ retry = syslog_ns->console_seq != syslog_ns->log_next_seq;
+ raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
if (retry && console_trylock())
goto again;
@@ -2237,6 +2350,7 @@ void register_console(struct console *newcon)
int i;
unsigned long flags;
struct console *bcon = NULL;
+ struct syslog_namespace *syslog_ns = &init_syslog_ns;
/*
* before we register a new CON_BOOT console, make sure we don't
@@ -2346,11 +2460,11 @@ void register_console(struct console *newcon)
* console_unlock(); will print out the buffered messages
* for us.
*/
- raw_spin_lock_irqsave(&logbuf_lock, flags);
- console_seq = syslog_seq;
- console_idx = syslog_idx;
- console_prev = syslog_prev;
- raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+ raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
+ syslog_ns->console_seq = syslog_ns->syslog_seq;
+ syslog_ns->console_idx = syslog_ns->syslog_idx;
+ console_prev = syslog_ns->syslog_prev;
+ raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
/*
* We're about to replay the log buffer. Only do this to the
* just-registered console to avoid excessive message spam to
@@ -2573,6 +2687,7 @@ void kmsg_dump(enum kmsg_dump_reason reason)
{
struct kmsg_dumper *dumper;
unsigned long flags;
+ struct syslog_namespace *syslog_ns = &init_syslog_ns;
if ((reason > KMSG_DUMP_OOPS) && !always_kmsg_dump)
return;
@@ -2585,12 +2700,12 @@ void kmsg_dump(enum kmsg_dump_reason reason)
/* initialize iterator with data about the stored records */
dumper->active = true;
- raw_spin_lock_irqsave(&logbuf_lock, flags);
- dumper->cur_seq = clear_seq;
- dumper->cur_idx = clear_idx;
- dumper->next_seq = log_next_seq;
- dumper->next_idx = log_next_idx;
- raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+ raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
+ dumper->cur_seq = syslog_ns->clear_seq;
+ dumper->cur_idx = syslog_ns->clear_idx;
+ dumper->next_seq = syslog_ns->log_next_seq;
+ dumper->next_idx = syslog_ns->log_next_idx;
+ raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
/* invoke dumper which will iterate over records */
dumper->dump(dumper, reason);
@@ -2626,24 +2741,25 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog,
struct log *msg;
size_t l = 0;
bool ret = false;
+ struct syslog_namespace *syslog_ns = &init_syslog_ns;
if (!dumper->active)
goto out;
- if (dumper->cur_seq < log_first_seq) {
+ if (dumper->cur_seq < syslog_ns->log_first_seq) {
/* messages are gone, move to first available one */
- dumper->cur_seq = log_first_seq;
- dumper->cur_idx = log_first_idx;
+ dumper->cur_seq = syslog_ns->log_first_seq;
+ dumper->cur_idx = syslog_ns->log_first_idx;
}
/* last entry */
- if (dumper->cur_seq >= log_next_seq)
+ if (dumper->cur_seq >= syslog_ns->log_next_seq)
goto out;
- msg = log_from_idx(dumper->cur_idx);
+ msg = log_from_idx(dumper->cur_idx, syslog_ns);
l = msg_print_text(msg, 0, syslog, line, size);
- dumper->cur_idx = log_next(dumper->cur_idx);
+ dumper->cur_idx = log_next(dumper->cur_idx, syslog_ns);
dumper->cur_seq++;
ret = true;
out:
@@ -2674,10 +2790,12 @@ bool kmsg_dump_get_line(struct kmsg_dumper *dumper, bool syslog,
{
unsigned long flags;
bool ret;
+ struct syslog_namespace *syslog_ns = &init_syslog_ns;
+
+ raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
- raw_spin_lock_irqsave(&logbuf_lock, flags);
ret = kmsg_dump_get_line_nolock(dumper, syslog, line, size, len);
- raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+ raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
return ret;
}
@@ -2713,20 +2831,21 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
enum log_flags prev;
size_t l = 0;
bool ret = false;
+ struct syslog_namespace *syslog_ns = &init_syslog_ns;
if (!dumper->active)
goto out;
- raw_spin_lock_irqsave(&logbuf_lock, flags);
- if (dumper->cur_seq < log_first_seq) {
+ raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
+ if (dumper->cur_seq < syslog_ns->log_first_seq) {
/* messages are gone, move to first available one */
- dumper->cur_seq = log_first_seq;
- dumper->cur_idx = log_first_idx;
+ dumper->cur_seq = syslog_ns->log_first_seq;
+ dumper->cur_idx = syslog_ns->log_first_idx;
}
/* last entry */
if (dumper->cur_seq >= dumper->next_seq) {
- raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+ raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
goto out;
}
@@ -2735,10 +2854,10 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
idx = dumper->cur_idx;
prev = 0;
while (seq < dumper->next_seq) {
- struct log *msg = log_from_idx(idx);
+ struct log *msg = log_from_idx(idx, syslog_ns);
l += msg_print_text(msg, prev, true, NULL, 0);
- idx = log_next(idx);
+ idx = log_next(idx, syslog_ns);
seq++;
prev = msg->flags;
}
@@ -2748,10 +2867,10 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
idx = dumper->cur_idx;
prev = 0;
while (l > size && seq < dumper->next_seq) {
- struct log *msg = log_from_idx(idx);
+ struct log *msg = log_from_idx(idx, syslog_ns);
l -= msg_print_text(msg, prev, true, NULL, 0);
- idx = log_next(idx);
+ idx = log_next(idx, syslog_ns);
seq++;
prev = msg->flags;
}
@@ -2763,10 +2882,10 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
l = 0;
prev = 0;
while (seq < dumper->next_seq) {
- struct log *msg = log_from_idx(idx);
+ struct log *msg = log_from_idx(idx, syslog_ns);
l += msg_print_text(msg, prev, syslog, buf + l, size - l);
- idx = log_next(idx);
+ idx = log_next(idx, syslog_ns);
seq++;
prev = msg->flags;
}
@@ -2774,7 +2893,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
dumper->next_seq = next_seq;
dumper->next_idx = next_idx;
ret = true;
- raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+ raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
out:
if (len)
*len = l;
@@ -2794,10 +2913,12 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
*/
void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
{
- dumper->cur_seq = clear_seq;
- dumper->cur_idx = clear_idx;
- dumper->next_seq = log_next_seq;
- dumper->next_idx = log_next_idx;
+ struct syslog_namespace *syslog_ns = &init_syslog_ns;
+
+ dumper->cur_seq = syslog_ns->clear_seq;
+ dumper->cur_idx = syslog_ns->clear_idx;
+ dumper->next_seq = syslog_ns->log_next_seq;
+ dumper->next_idx = syslog_ns->log_next_idx;
}
/**
@@ -2811,10 +2932,12 @@ void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper)
void kmsg_dump_rewind(struct kmsg_dumper *dumper)
{
unsigned long flags;
+ struct syslog_namespace *syslog_ns = &init_syslog_ns;
+
+ raw_spin_lock_irqsave(&syslog_ns->logbuf_lock, flags);
- raw_spin_lock_irqsave(&logbuf_lock, flags);
kmsg_dump_rewind_nolock(dumper);
- raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+ raw_spin_unlock_irqrestore(&syslog_ns->logbuf_lock, flags);
}
EXPORT_SYMBOL_GPL(kmsg_dump_rewind);
#endif
--
1.7.1
^ permalink raw reply related
* [PATCH RFC 1/5] Syslog_ns: add syslog_namespace struct and API
From: Rui Xiang @ 2012-11-19 8:16 UTC (permalink / raw)
To: serge.hallyn, containers; +Cc: Eric W. Biederman, netdev
From: Xiang Rui <rui.xiang@huawei.com>
This patch adds a struct syslog_namespace, which contains the members needed
for handling syslog.
We implement the get_syslog_ns and put_syslog_ns API, and the initial namespace
is provided by init_syslog_ns. CONFIG_SYSLOG_NS is defined to allow the creation
of a syslog_ns.
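The get/put pattern described above can be sketched in userspace C. This is only an illustration under stated assumptions: a plain integer stands in for struct kref (so it is not atomic), and the names demo_ns, demo_create, demo_get and demo_put are hypothetical, not part of the patch. It shows the one property the header relies on: the static initial namespace is never counted and never freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-in for struct syslog_namespace; refcount replaces kref. */
struct demo_ns {
	int refcount;
	char *log_buf;
};

/* The initial namespace is static, so get/put leave it untouched. */
static struct demo_ns demo_init_ns = { .refcount = 2, .log_buf = NULL };

/* Number of namespaces released so far, only for observing the demo. */
static int demo_freed;

static struct demo_ns *demo_create(size_t buf_len)
{
	struct demo_ns *ns = calloc(1, sizeof(*ns));

	if (!ns)
		return NULL;
	ns->log_buf = calloc(1, buf_len);
	if (!ns->log_buf) {
		free(ns);
		return NULL;
	}
	ns->refcount = 1;
	return ns;
}

static struct demo_ns *demo_get(struct demo_ns *ns)
{
	if (ns != &demo_init_ns)
		ns->refcount++;
	return ns;
}

static void demo_put(struct demo_ns *ns)
{
	if (ns == &demo_init_ns)
		return;
	assert(ns->refcount > 0);
	if (--ns->refcount == 0) {
		/* Last reference dropped: release buffer, then the struct. */
		free(ns->log_buf);
		free(ns);
		demo_freed++;
	}
}
```

Every demo_get() must be paired with a demo_put(); the buffer is released only when the last holder drops its reference.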
Signed-off-by: Xiang Rui <rui.xiang@huawei.com>
Signed-off-by: Libo Chen <clbchenlibo.chen@huawei.com>
---
include/linux/syslog_namespace.h | 78 ++++++++++++++++++++++++++++++++++++++
init/Kconfig | 7 +++
kernel/Makefile | 1 +
kernel/syslog_namespace.c | 31 +++++++++++++++
4 files changed, 117 insertions(+), 0 deletions(-)
create mode 100644 include/linux/syslog_namespace.h
create mode 100644 kernel/syslog_namespace.c
diff --git a/include/linux/syslog_namespace.h b/include/linux/syslog_namespace.h
new file mode 100644
index 0000000..8c8ac5a
--- /dev/null
+++ b/include/linux/syslog_namespace.h
@@ -0,0 +1,78 @@
+#ifndef _LINUX_SYSLOG_NAMESPACE_H
+#define _LINUX_SYSLOG_NAMESPACE_H
+
+#include <linux/kref.h>
+
+/* record buffer */
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
+#define LOG_ALIGN 4
+#else
+#define LOG_ALIGN __alignof__(struct log)
+#endif
+
+#define CONTAINER_BUF_LEN 4096
+
+#define __LOG_BUF_LEN (1 << CONFIG_LOG_BUF_SHIFT)
+
+struct syslog_namespace {
+ struct kref kref; /* syslog_ns reference count & control */
+
+ raw_spinlock_t logbuf_lock; /* access conflict locker */
+
+ /* index and sequence number of the first record stored in the buffer */
+ u64 log_first_seq;
+ u32 log_first_idx;
+
+ /* index and sequence number of the next record stored in the buffer */
+ u64 log_next_seq;
+ u32 log_next_idx;
+
+ /* the next printk record to read after the last 'clear' command */
+ u64 clear_seq;
+ u32 clear_idx;
+
+ char *log_buf;
+ u32 log_buf_len;
+
+ /* the next printk record to write to the console */
+ u64 console_seq;
+ u32 console_idx;
+
+ /* the next printk record to read by syslog(READ) or /proc/kmsg */
+ u64 syslog_seq;
+ u32 syslog_idx;
+ int syslog_prev;
+ size_t syslog_partial;
+};
+
+extern struct syslog_namespace init_syslog_ns;
+
+#ifdef CONFIG_SYSLOG_NS
+extern void free_syslog_ns(struct kref *kref);
+static inline struct syslog_namespace *get_syslog_ns(
+ struct syslog_namespace *ns)
+{
+ if (ns != &init_syslog_ns)
+ kref_get(&ns->kref);
+ return ns;
+}
+
+static inline void put_syslog_ns(struct syslog_namespace *ns)
+{
+ if (ns != &init_syslog_ns)
+ kref_put(&ns->kref, free_syslog_ns);
+}
+
+#else
+static inline struct syslog_namespace *get_syslog_ns(
+ struct syslog_namespace *ns)
+{
+ return ns;
+}
+
+static inline void put_syslog_ns(struct syslog_namespace *ns)
+{
+}
+#endif
+
+#endif
diff --git a/init/Kconfig b/init/Kconfig
index 6fdd6e3..82771e0 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -988,6 +988,13 @@ config NET_NS
Allow user space to create what appear to be multiple instances
of the network stack.
+config SYSLOG_NS
+ bool "Syslog namespace"
+ default y
+ help
+ Allow containers to use syslog namespaces to provide different
+ syslog for containers.
+
endif # NAMESPACES
config UIDGID_CONVERTED
diff --git a/kernel/Makefile b/kernel/Makefile
index 0dfeca4..cb3cba0 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -28,6 +28,7 @@ obj-y += power/
ifeq ($(CONFIG_CHECKPOINT_RESTORE),y)
obj-$(CONFIG_X86) += kcmp.o
endif
+obj-$(CONFIG_SYSLOG_NS) += syslog_namespace.o
obj-$(CONFIG_FREEZER) += freezer.o
obj-$(CONFIG_PROFILING) += profile.o
obj-$(CONFIG_STACKTRACE) += stacktrace.o
diff --git a/kernel/syslog_namespace.c b/kernel/syslog_namespace.c
new file mode 100644
index 0000000..9482927
--- /dev/null
+++ b/kernel/syslog_namespace.c
@@ -0,0 +1,31 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ */
+
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/syslog_namespace.h>
+
+static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
+
+struct syslog_namespace init_syslog_ns = {
+ .kref = {
+ .refcount = ATOMIC_INIT(2),
+ },
+ .logbuf_lock = __RAW_SPIN_LOCK_UNLOCKED(init_syslog_ns.logbuf_lock),
+ .log_buf_len = __LOG_BUF_LEN,
+ .log_buf = __log_buf,
+};
+EXPORT_SYMBOL_GPL(init_syslog_ns);
+
+void free_syslog_ns(struct kref *kref)
+{
+ struct syslog_namespace *ns;
+ ns = container_of(kref, struct syslog_namespace, kref);
+
+ kfree(ns->log_buf);
+ kfree(ns);
+}
--
1.7.1
^ permalink raw reply related
* [PATCH RFC 2/5] Syslog_ns: add CLONE_NEWSYSLOG and create syslog_ns when copying process
From: Rui Xiang @ 2012-11-19 8:16 UTC (permalink / raw)
To: serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman
From: Xiang Rui <rui.xiang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
We add a new clone flag named CLONE_NEWSYSLOG, using 0x02000000, which was
previously the unused CLONE_STOPPED and is now available for re-use.
In syslog_namespace.c, the interface copy_syslog_ns is implemented to create
a new syslog_ns. It is used when a new namespace is created while copying a
process.
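The share-or-create decision made while copying a process can be sketched in userspace C as follows. This is a simplified model, not the patch itself: demo_ns and demo_copy_ns are hypothetical names, the counter is a plain int, and unlike the patch the sketch counts references on every parent (the patch skips counting for init_syslog_ns). The flag value and the 4096-byte buffer length are taken from the diff.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_CLONE_NEWSYSLOG	0x02000000UL	/* value used by the patch */
#define DEMO_CONTAINER_BUF_LEN	4096		/* CONTAINER_BUF_LEN in the patch */

struct demo_ns {
	int refcount;
	size_t log_buf_len;
	char *log_buf;
};

/* Share the parent's namespace unless the clone flag asks for a new one. */
static struct demo_ns *demo_copy_ns(unsigned long flags, struct demo_ns *parent)
{
	struct demo_ns *ns;

	assert(parent != NULL);
	if (!(flags & DEMO_CLONE_NEWSYSLOG)) {
		parent->refcount++;
		return parent;
	}

	/* New namespace: private, zeroed buffer and a single reference. */
	ns = calloc(1, sizeof(*ns));
	if (!ns)
		return NULL;
	ns->log_buf = calloc(1, DEMO_CONTAINER_BUF_LEN);
	if (!ns->log_buf) {
		free(ns);
		return NULL;
	}
	ns->log_buf_len = DEMO_CONTAINER_BUF_LEN;
	ns->refcount = 1;
	return ns;
}
```

The kernel version reports failure with ERR_PTR() rather than NULL; the sketch uses NULL only because ERR_PTR() has no userspace equivalent.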
Signed-off-by: Xiang Rui <rui.xiang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Libo Chen <clbchenlibo.chen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
include/linux/nsproxy.h | 2 ++
include/linux/syslog_namespace.h | 19 +++++++++++++++++++
include/uapi/linux/sched.h | 3 +--
kernel/nsproxy.c | 16 +++++++++++++++-
kernel/syslog_namespace.c | 32 ++++++++++++++++++++++++++++++++
5 files changed, 69 insertions(+), 3 deletions(-)
diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index cc37a55..9db2527 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -8,6 +8,7 @@ struct mnt_namespace;
struct uts_namespace;
struct ipc_namespace;
struct pid_namespace;
+struct syslog_namespace;
struct fs_struct;
/*
@@ -29,6 +30,7 @@ struct nsproxy {
struct mnt_namespace *mnt_ns;
struct pid_namespace *pid_ns;
struct net *net_ns;
+ struct syslog_namespace *syslog_ns;
};
extern struct nsproxy init_nsproxy;
diff --git a/include/linux/syslog_namespace.h b/include/linux/syslog_namespace.h
index 8c8ac5a..1ecb8b8 100644
--- a/include/linux/syslog_namespace.h
+++ b/include/linux/syslog_namespace.h
@@ -2,6 +2,9 @@
#define _LINUX_SYSLOG_NAMESPACE_H
#include <linux/kref.h>
+#include <linux/sched.h>
+#include <linux/nsproxy.h>
+#include <linux/err.h>
/* record buffer */
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
@@ -47,8 +50,16 @@ struct syslog_namespace {
extern struct syslog_namespace init_syslog_ns;
+static inline struct syslog_namespace *current_syslog_ns(void)
+{
+ return current->nsproxy->syslog_ns;
+}
+
#ifdef CONFIG_SYSLOG_NS
extern void free_syslog_ns(struct kref *kref);
+extern struct syslog_namespace *copy_syslog_ns(unsigned long flags,
+ struct task_struct *tsk);
+
static inline struct syslog_namespace *get_syslog_ns(
struct syslog_namespace *ns)
{
@@ -64,6 +75,14 @@ static inline void put_syslog_ns(struct syslog_namespace *ns)
}
#else
+static inline struct syslog_namespace *copy_syslog_ns(unsigned long flags,
+ struct task_struct *tsk)
+{
+ if (flags & CLONE_NEWSYSLOG)
+ return ERR_PTR(-EINVAL);
+ return tsk->nsproxy->syslog_ns;
+}
+
static inline struct syslog_namespace *get_syslog_ns(
struct syslog_namespace *ns)
{
diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 5a0f945..906a3da 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -21,8 +21,7 @@
#define CLONE_DETACHED 0x00400000 /* Unused, ignored */
#define CLONE_UNTRACED 0x00800000 /* set if the tracing process can't force CLONE_PTRACE on this clone */
#define CLONE_CHILD_SETTID 0x01000000 /* set the TID in the child */
-/* 0x02000000 was previously the unused CLONE_STOPPED (Start in stopped state)
- and is now available for re-use. */
+#define CLONE_NEWSYSLOG 0x02000000 /* New syslog namespace */
#define CLONE_NEWUTS 0x04000000 /* New utsname group? */
#define CLONE_NEWIPC 0x08000000 /* New ipcs */
#define CLONE_NEWUSER 0x10000000 /* New user namespace */
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index b576f7f..331d31f 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -22,6 +22,7 @@
#include <linux/pid_namespace.h>
#include <net/net_namespace.h>
#include <linux/ipc_namespace.h>
+#include <linux/syslog_namespace.h>
#include <linux/proc_fs.h>
#include <linux/file.h>
#include <linux/syscalls.h>
@@ -36,6 +37,7 @@ struct nsproxy init_nsproxy = {
#endif
.mnt_ns = NULL,
.pid_ns = &init_pid_ns,
+ .syslog_ns = &init_syslog_ns,
#ifdef CONFIG_NET
.net_ns = &init_net,
#endif
@@ -96,8 +98,17 @@ static struct nsproxy *create_new_namespaces(unsigned long flags,
goto out_net;
}
+ new_nsp->syslog_ns = copy_syslog_ns(flags, tsk);
+ if (IS_ERR(new_nsp->syslog_ns)) {
+ err = PTR_ERR(new_nsp->syslog_ns);
+ goto out_syslog;
+ }
+
return new_nsp;
+out_syslog:
+ if (new_nsp->net_ns)
+ put_net(new_nsp->net_ns);
out_net:
if (new_nsp->pid_ns)
put_pid_ns(new_nsp->pid_ns);
@@ -131,7 +142,8 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
get_nsproxy(old_ns);
if (!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC |
- CLONE_NEWPID | CLONE_NEWNET)))
+ CLONE_NEWPID | CLONE_NEWNET |
+ CLONE_NEWSYSLOG)))
return 0;
if (!capable(CAP_SYS_ADMIN)) {
@@ -174,6 +186,8 @@ void free_nsproxy(struct nsproxy *ns)
put_ipc_ns(ns->ipc_ns);
if (ns->pid_ns)
put_pid_ns(ns->pid_ns);
+ if (ns->syslog_ns)
+ put_syslog_ns(ns->syslog_ns);
put_net(ns->net_ns);
kmem_cache_free(nsproxy_cachep, ns);
}
diff --git a/kernel/syslog_namespace.c b/kernel/syslog_namespace.c
index 9482927..a12e1c1 100644
--- a/kernel/syslog_namespace.c
+++ b/kernel/syslog_namespace.c
@@ -7,6 +7,7 @@
#include <linux/slab.h>
#include <linux/module.h>
+#include <linux/bootmem.h>
#include <linux/syslog_namespace.h>
static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
@@ -21,6 +22,37 @@ struct syslog_namespace init_syslog_ns = {
};
EXPORT_SYMBOL_GPL(init_syslog_ns);
+static struct syslog_namespace *create_syslog_ns(unsigned int buf_len)
+{
+ struct syslog_namespace *ns;
+
+ if (buf_len <= 0)
+ return ERR_PTR(-EINVAL);
+ ns = kzalloc(sizeof(*ns), GFP_KERNEL);
+ if (!ns)
+ return ERR_PTR(-ENOMEM);
+
+ kref_init(&(ns->kref));
+
+ ns->log_buf_len = buf_len;
+ ns->log_buf = kzalloc(buf_len, GFP_KERNEL);
+ if (!ns->log_buf) {
+ kfree(ns);
+ return ERR_PTR(-ENOMEM);
+ }
+ raw_spin_lock_init(&(ns->logbuf_lock));
+
+ return ns;
+}
+
+struct syslog_namespace *copy_syslog_ns(unsigned long flags,
+ struct task_struct *tsk)
+{
+ if (!(flags & CLONE_NEWSYSLOG))
+ return get_syslog_ns(tsk->nsproxy->syslog_ns);
+ return create_syslog_ns(CONTAINER_BUF_LEN);
+}
+
void free_syslog_ns(struct kref *kref)
{
struct syslog_namespace *ns;
--
1.7.1
^ permalink raw reply related
* [PATCH RFC 0/5] Containerize syslog
From: Rui Xiang @ 2012-11-19 8:16 UTC (permalink / raw)
To: serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman
From: Xiang Rui <rui.xiang-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
In Serge's patch (http://lwn.net/Articles/525629/), syslog_namespace was tied to a user
namespace. We add a syslog_ns tied to nsproxy instead, and implement ns_printk in the
iptables context.
We add syslog_namespace as a part of nsproxy, and a new flag CLONE_NEWSYSLOG to unshare
the syslog area.
syslog_namespace contains the identifiers needed for handling the syslog buffer.
When a container creates a new syslog namespace, a containerized buffer is allocated
to store the log owned by that container. Containerized identifiers such as log_first_seq,
which replace the global variables, only affect their own buffer. The buffer is not freed
until the syslog_namespace is destroyed by the host.
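A small userspace sketch may make the isolation concrete. It assumes a toy 64-byte buffer with no wrap-around and NUL-terminated messages instead of struct log records; demo_ns and demo_store are hypothetical names. The point it illustrates is that the sequence and index counters live in the namespace, so storing into one namespace leaves another untouched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BUF_LEN 64

/* Per-namespace state that used to be file-scope globals in printk.c. */
struct demo_ns {
	uint64_t log_next_seq;	/* sequence number of the next record */
	uint32_t log_next_idx;	/* byte offset of the next record */
	char log_buf[DEMO_BUF_LEN];
};

/* Append one NUL-terminated message to the namespace's own buffer. */
static int demo_store(struct demo_ns *ns, const char *msg)
{
	size_t len = strlen(msg) + 1;

	assert(ns->log_next_idx <= DEMO_BUF_LEN);
	if (len > DEMO_BUF_LEN - ns->log_next_idx)
		return -1;	/* the real buffer wraps; this sketch refuses */
	memcpy(ns->log_buf + ns->log_next_idx, msg, len);
	ns->log_next_idx += (uint32_t)len;
	ns->log_next_seq++;
	return 0;
}
```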
Printk has to be re-implemented because the log buffer is isolated into syslog_ns. The
functions involved, including printk, /dev/kmsg, do_syslog and kmsg_dump, should work in
a container. So, to make these functions available in a container, their interfaces need
a syslog_ns parameter.
In process context inside a container, looking up the syslog_ns through current is
reasonable. When iptables logs a packet, however, it is not, because the log messages
belonging to each container would be printed in the host.
We add a pointer in the net namespace and use it to track the syslog_ns that the
container's log messages belong to. We then add ns_printk as a new interface that
takes an explicit syslog_ns.
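The idea of passing the target namespace explicitly, instead of deriving it from the running task, can be sketched in userspace C. Everything here is an assumption for illustration: demo_ns_printk is a hypothetical stand-in whose signature is not taken from the patches, demo_net models only the added pointer, and a single line buffer replaces the record ring.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define DEMO_LINE_MAX 128

struct demo_ns {
	char last_line[DEMO_LINE_MAX];	/* stands in for the record buffer */
	unsigned long long log_next_seq;
};

/* Stand-in for the net namespace carrying a pointer to its syslog ns. */
struct demo_net {
	struct demo_ns *syslog_ns;
};

/* Hypothetical ns_printk(): format into the namespace chosen by the caller. */
static int demo_ns_printk(struct demo_ns *ns, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(ns != NULL);
	va_start(args, fmt);
	len = vsnprintf(ns->last_line, sizeof(ns->last_line), fmt, args);
	va_end(args);
	if (len >= 0)
		ns->log_next_seq++;
	return len;
}
```

A packet-logging path would call demo_ns_printk(net->syslog_ns, ...), so the message lands in the container's buffer regardless of which task happens to be running when the packet is processed.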
This patchset is based on the net-next development tree:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git.
Libo Chen (3):
printk: modify printk interface for syslog_namespace
printk: add ns_printk for specific syslog_ns
printk: use ns_printk in iptable context
Xiang Rui (2):
Syslog_ns: add syslog_namespace struct and API
Syslog_ns: add CLONE_NEWSYSLOG and create syslog_ns when copying
process
drivers/base/core.c | 4 +-
include/linux/nsproxy.h | 2 +
include/linux/printk.h | 5 +-
include/linux/syslog_namespace.h | 98 ++++++
include/net/net_namespace.h | 7 +-
include/net/netfilter/xt_log.h | 7 +-
include/uapi/linux/sched.h | 3 +-
init/Kconfig | 7 +
kernel/Makefile | 1 +
kernel/nsproxy.c | 19 +-
kernel/printk.c | 646 ++++++++++++++++++++++++--------------
kernel/syslog_namespace.c | 65 ++++
net/core/net_namespace.c | 12 +-
net/netfilter/xt_LOG.c | 4 +-
14 files changed, 623 insertions(+), 257 deletions(-)
create mode 100644 include/linux/syslog_namespace.h
create mode 100644 kernel/syslog_namespace.c
^ permalink raw reply
* Re: [rfc net-next v6 2/3] virtio_net: multiqueue support
From: Jason Wang @ 2012-11-19 7:40 UTC (permalink / raw)
To: Rusty Russell
Cc: krkumar2, kvm, mst, netdev, linux-kernel, virtualization, davem
In-Reply-To: <87y5igyhyg.fsf@rustcorp.com.au>
On 11/05/2012 09:08 AM, Rusty Russell wrote:
> Jason Wang <jasowang@redhat.com> writes:
>> +struct virtnet_info {
>> + u16 num_queue_pairs; /* # of RX/TX vq pairs */
>> + u16 total_queue_pairs;
>> +
>> + struct send_queue *sq;
>> + struct receive_queue *rq;
>> + struct virtqueue *cvq;
>> +
>> + struct virtio_device *vdev;
>> + struct net_device *dev;
>> + unsigned int status;
> status seems unused?
>
It's used for tracking the status of the device (e.g. in
virtnet_config_changed_work()).
>> +static const struct ethtool_ops virtnet_ethtool_ops;
> Strange hoist, but I can't tell from the patch if this is necessary.
> Assume it is.
Sorry, this line should belong to patch 3/3.
>
>> +static inline int vq2txq(struct virtqueue *vq)
>> +{
>> + int index = virtqueue_get_queue_index(vq);
>> + return index == 1 ? 0 : (index - 3) / 2;
>> +}
>> +
>> +static inline int txq2vq(int txq)
>> +{
>> + return txq ? 2 * txq + 3 : 1;
>> +}
>> +
>> +static inline int vq2rxq(struct virtqueue *vq)
>> +{
>> + int index = virtqueue_get_queue_index(vq);
>> + return index ? (index - 2) / 2 : 0;
>> +}
>> +
>> +static inline int rxq2vq(int rxq)
>> +{
>> + return rxq ? 2 * rxq + 2 : 0;
>> +}
>> +
>> static inline struct skb_vnet_hdr *skb_vnet_hdr(struct sk_buff *skb)
> I know skb_vnet_hdr() does it, but I generally dislike inline in C
> files; gcc is generally smart enough these days, and inline suppresses
> unused function warnings.
Ok, I will remove the inline here.
> I guess these mappings have to work even when we're switching from mq to
> single queue mode; otherwise we could simplify them using a 'bool mq'
> flag.
Yes, it still works when switching to sq. What makes it look strange is
that we reserve the virtqueues for single queue mode and also reserve
vq 3. But it does not bring much benefit; this needs more thought.
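To make the index arithmetic in the quoted helpers easier to check, here is a userspace sketch that takes the virtqueue index as a plain int instead of calling virtqueue_get_queue_index(). The demo_ prefix and demo_roundtrip_ok() are hypothetical additions; the formulas are copied from the quoted patch. The layout they encode is vq 0 = rx0, vq 1 = tx0, vq 2 = control, vq 3 = reserved, then rx/tx pairs from vq 4 onwards.

```c
#include <assert.h>

/* Queue number to virtqueue index, as in the quoted patch. */
static int demo_txq2vq(int txq) { return txq ? 2 * txq + 3 : 1; }
static int demo_rxq2vq(int rxq) { return rxq ? 2 * rxq + 2 : 0; }

/* Virtqueue index back to queue number, as in the quoted patch. */
static int demo_vq2txq(int index) { return index == 1 ? 0 : (index - 3) / 2; }
static int demo_vq2rxq(int index) { return index ? (index - 2) / 2 : 0; }

/* Return 1 if both mappings invert each other for queues 0..pairs-1. */
static int demo_roundtrip_ok(int pairs)
{
	int q;

	assert(pairs >= 1);
	for (q = 0; q < pairs; q++) {
		if (demo_vq2txq(demo_txq2vq(q)) != q)
			return 0;
		if (demo_vq2rxq(demo_rxq2vq(q)) != q)
			return 0;
	}
	return 1;
}
```

The special case for queue 0 is what keeps the single-queue layout (rx at 0, tx at 1) valid in both modes.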
>
>> +static int virtnet_set_queues(struct virtnet_info *vi)
>> +{
>> + struct scatterlist sg;
>> + struct virtio_net_ctrl_steering s;
>> + struct net_device *dev = vi->dev;
>> +
>> + if (vi->num_queue_pairs == 1) {
>> + s.current_steering_rule = VIRTIO_NET_CTRL_STEERING_SINGLE;
>> + s.current_steering_param = 1;
>> + } else {
>> + s.current_steering_rule =
>> + VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX;
>> + s.current_steering_param = vi->num_queue_pairs;
>> + }
> (BTW, VIRTIO_NET_CTRL_STEERING_RX_FOLLOWS_TX etc not defined anywhere?)
It's defined in include/uapi/linux/virtio_net.h
>
> Hmm, it's not clear that anything other than RX_FOLLOWS_TX will ever
> make sense, so this is really just turning mq on and off.
Currently, when multiqueue is enabled for tuntap, tx follows rx.
So when the guest driver specifies RX_FOLLOWS_TX, qemu would just enable
multiqueue for tuntap, and this policy could be implemented by tuntap.
>
> Unfortunately, we can't turn feature bits on and off after startup, so
> if we want this level of control (and I think we do), there does need to
> be a mechanism.
>
> Michael? I'd prefer this to be further simplfied, to just
> disable/enable. We can extend it later, but for now the second
> parameter is redundant, ie.:
>
> struct virtio_net_ctrl_steering {
> u8 mode; /* 0 == off, 1 == on */
> } __attribute__((packed));
>
We may need more policy in the future, so maybe a
VIRTIO_NET_CTRL_STEERING_NONE is ok?
>> @@ -924,11 +1032,10 @@ static void virtnet_get_ringparam(struct net_device *dev,
>> {
>> struct virtnet_info *vi = netdev_priv(dev);
>>
>> - ring->rx_max_pending = virtqueue_get_vring_size(vi->rvq);
>> - ring->tx_max_pending = virtqueue_get_vring_size(vi->svq);
>> + ring->rx_max_pending = virtqueue_get_vring_size(vi->rq[0].vq);
>> + ring->tx_max_pending = virtqueue_get_vring_size(vi->sq[0].vq);
>> ring->rx_pending = ring->rx_max_pending;
>> ring->tx_pending = ring->tx_max_pending;
>> -
>> }
> This assumes all vqs are the same size. I think this should probably
> check: for mq mode, use the first vq, otherewise use the 0th.
Ok, but I don't see why we would need different sizes for mq.
>
> For bonus points, check this assertion at probe time.
>
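The probe-time check suggested above could look roughly like the following userspace sketch. It assumes the ring sizes have already been collected into an array (in the driver they would come from virtqueue_get_vring_size()); demo_vq_sizes_uniform is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 when every ring size matches the first one, 0 otherwise. */
static int demo_vq_sizes_uniform(const unsigned int *sizes, size_t n)
{
	size_t i;

	assert(sizes != NULL && n > 0);
	for (i = 1; i < n; i++)
		if (sizes[i] != sizes[0])
			return 0;
	return 1;
}
```

If the check fails, reporting only the size of queue 0 through ethtool would misdescribe the other queues.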
>> + /*
>> + * We expect 1 RX virtqueue followed by 1 TX virtqueue, followd by
>> + * possible control virtqueue, followed by 1 reserved vq, followed
>> + * by RX/TX queue pairs used in multiqueue mode.
>> + */
>> + if (vi->total_queue_pairs == 1)
>> + total_vqs = 2 + virtio_has_feature(vi->vdev,
>> + VIRTIO_NET_F_CTRL_VQ);
>> + else
>> + total_vqs = 2 * vi->total_queue_pairs + 2;
> What's the allergy to odd numbers? Why the reserved queue?
It was suggested by Michael to make the vq calculation easier, but it
does not seem to help much. So it's better not to reserve a virtqueue in
the next version.
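The virtqueue count under discussion can be worked through with a short userspace sketch. demo_total_vqs() mirrors the quoted calculation; demo_total_vqs_no_reserved() is a hypothetical alternative shown only for comparison and is not taken from any posted patch.

```c
#include <assert.h>

/* Number of virtqueues requested under the layout in the quoted comment. */
static int demo_total_vqs(int total_queue_pairs, int has_cvq)
{
	assert(total_queue_pairs >= 1);
	if (total_queue_pairs == 1)
		return 2 + (has_cvq ? 1 : 0);
	/* rx0, tx0, control slot, reserved slot, then the remaining pairs */
	return 2 * total_queue_pairs + 2;
}

/* Hypothetical count if no slot were reserved (assumption, for comparison). */
static int demo_total_vqs_no_reserved(int total_queue_pairs, int has_cvq)
{
	assert(total_queue_pairs >= 1);
	return 2 * total_queue_pairs + (has_cvq ? 1 : 0);
}
```

With four queue pairs and a control queue, the quoted scheme asks for 10 virtqueues, while the variant without the reserved slot would ask for 9.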
>> + if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
>> + vi->has_cvq = true;
>> +
>> + /* Use single tx/rx queue pair as default */
>> + vi->num_queue_pairs = 1;
>> + vi->total_queue_pairs = num_queue_pairs;
>> +
>> + /* Allocate/initialize the rx/tx queues, and invoke find_vqs */
>> + err = virtnet_setup_vqs(vi);
>> if (err)
>> goto free_stats;
>>
>> + if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) &&
>> + virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VLAN))
>> + dev->features |= NETIF_F_HW_VLAN_FILTER;
> We should be using has_cvq here...
Sure.
>
>> -#ifdef CONFIG_PM
>> -static int virtnet_freeze(struct virtio_device *vdev)
>> +static void virtnet_stop(struct virtnet_info *vi)
> I think you still want this under CONFIG_PM, right? Doesn't seem used
> elsewhere.
Yes, will fix this.
>
> Cheers,
> Rusty.
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH net-next ] net: Allow userns root to control tun and tap devices
From: Eric W. Biederman @ 2012-11-19 7:34 UTC (permalink / raw)
To: David Miller; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA, Linux Containers
Allow an unprivileged user who has created a user namespace, and then
created a network namespace, to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) calls to
ns_capable(net->user_ns, CAP_NET_ADMIN) calls.
Allow setting of the tun iff flags.
Allow creating of tun devices.
Allow adding a new queue to a tun device.
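The difference between the two checks can be illustrated with a simplified userspace model. This is not kernel code: demo_user_ns, demo_cred, demo_capable and demo_ns_capable are hypothetical names, a single flag stands in for CAP_NET_ADMIN, and the model keeps only two rules, namely that a capability held in a namespace applies to that namespace and its descendants, and that the creator of a namespace is treated as capable over it.

```c
#include <assert.h>
#include <stddef.h>

struct demo_user_ns {
	struct demo_user_ns *parent;	/* NULL for the initial namespace */
	unsigned int owner_uid;		/* uid that created this namespace */
};

struct demo_cred {
	struct demo_user_ns *user_ns;	/* namespace the task lives in */
	unsigned int euid;
	int has_net_admin;		/* capability in its own namespace */
};

/* Old check: only a capability held in the initial namespace counts. */
static int demo_capable(const struct demo_cred *cred)
{
	return cred->user_ns->parent == NULL && cred->has_net_admin;
}

/* New check: is the task capable with respect to the target namespace? */
static int demo_ns_capable(const struct demo_user_ns *target,
			   const struct demo_cred *cred)
{
	assert(target != NULL && cred != NULL);
	while (target) {
		if (target == cred->user_ns)
			return cred->has_net_admin;
		/* The creator of a namespace is capable over it. */
		if (target->parent == cred->user_ns &&
		    target->owner_uid == cred->euid)
			return 1;
		target = target->parent;
	}
	return 0;
}
```

In this model, root inside a child user namespace fails the old check but passes the new one for a network namespace owned by that child namespace, which is the behaviour the patch is after.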
Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
---
drivers/net/tun.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index b44d7b7..b01e8c0 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -373,10 +373,11 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb)
static inline bool tun_not_capable(struct tun_struct *tun)
{
const struct cred *cred = current_cred();
+ struct net *net = dev_net(tun->dev);
return ((uid_valid(tun->owner) && !uid_eq(cred->euid, tun->owner)) ||
(gid_valid(tun->group) && !in_egroup_p(tun->group))) &&
- !capable(CAP_NET_ADMIN);
+ !ns_capable(net->user_ns, CAP_NET_ADMIN);
}
static void tun_set_real_num_queues(struct tun_struct *tun)
@@ -1559,7 +1560,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
char *name;
unsigned long flags = 0;
- if (!capable(CAP_NET_ADMIN))
+ if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
return -EPERM;
err = security_tun_dev_create();
if (err < 0)
--
1.7.5.4
^ permalink raw reply related
* Re: Optics (SFP) monitoring on ixgbe and igbe
From: Robert Olsson @ 2012-11-19 7:27 UTC (permalink / raw)
To: footplus; +Cc: Ben Hutchings, netdev
In-Reply-To: <CAPN4dA9f3y1mDPubqd9s+v5supj3hNvZaWym0_y3EMZd7L6MyQ@mail.gmail.com>
Hi,
FYI: DOM in use in Serengeti, Tanzania (Bunda-Nata, 60 km) on a solar-driven
low-power Linux Atom router @ 20 W with the igb driver, using the older DOM
patches. Very useful stuff. Yes, please get it included in the kernel.
NATA:/# ethtool -D eth1
Ext-Calbr: Avr RX-Power: Alarm & Warn: RX_LOS: Wavelength: 1550 nm
Alarms, warnings in beginning of line, Ie. AH = Alarm High, WL == Warn Low etc
Temp: 76.2 C Thresh: Lo: -50.0/-48.0 Hi: 95.0/110.0 C
Vcc: 3.27 V Thresh: Lo: 2.9/3.0 Hi: 3.5/3.6 V
Tx-Bias: 27.9 mA Thresh: Lo: 3.0/5.0 Hi: 90.0/100.0 mA
TX-pwr: 3.8 dBm ( 2.39 mW) Thresh: Lo: -5.0/-4.0 Hi: 5.0/6.0 dBm
RX-pwr: -17.3 dBm ( 0.02 mW) Thresh: Lo: -40.0/-37.0 Hi: -5.0/-3.0 dBm
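For readers checking the power columns above: the dBm and mW figures are related by the standard conversion below. This is a worked note, not part of the tool output, and it assumes the printed dBm values are rounded to one decimal.

```latex
P_{\mathrm{mW}} = 10^{P_{\mathrm{dBm}}/10}
\qquad
10^{3.8/10} \approx 2.40\ \mathrm{mW}
\qquad
10^{-17.3/10} \approx 0.019\ \mathrm{mW}
```

These agree with the 2.39 mW and 0.02 mW shown, within rounding of the displayed values.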
http://herjulf.se/robert/tanzania-2012/Nata-installation-2.jpg
--ro
^ permalink raw reply
* Re: [PATCH net-next 0/17] Make the network stack usable by userns root
From: Eric W. Biederman @ 2012-11-19 7:27 UTC (permalink / raw)
To: David Miller
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
In-Reply-To: <20121118.222601.1683927229305655885.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> writes:
> There were merge issues so I applied the patches and sorted the
> conflicts out one-by-one.
>
> I hope this doesn't cause major problems.
Shucks, I had thought I had tested and verified there would not be merge
issues. Oh well.
No major problems.
To keep it that way I am dropping all but the first two patches from my
userns development tree. I have dependencies on the infrastructure bits.
A quick merge test reveals that your tree against my full development
tree has two minor conflicts that are trivial to resolve. So I don't
anticipate Linus will have any problems.
Eric
^ permalink raw reply