* Re: [PATCH v7] tilegx network driver: initial support
From: David Miller @ 2012-05-24 4:31 UTC (permalink / raw)
To: cmetcalf; +Cc: bhutchings, arnd, linux-kernel, netdev
In-Reply-To: <201205240115.q4O1FwqG006336@lab-41.internal.tilera.com>
From: Chris Metcalf <cmetcalf@tilera.com>
Date: Wed, 23 May 2012 16:42:03 -0400
> + * FIXME (bug 11489): add support for IPv6.
...
> + * FIXME (bug# 11479): We should stop queues when they're full.
...
Mentioning bug numbers in the driver source is not appropriate.
This second problem looks extremely serious, rather than some minor
issue to look into at some time in the future.
^ permalink raw reply
* [PATCH IPROUTE2] tc-codel: Add manpage
From: Vijay Subramanian @ 2012-05-24 4:33 UTC (permalink / raw)
To: netdev; +Cc: Stephen Hemminger, Eric Dumazet, Dave Taht, Vijay Subramanian
This patch adds the manpage for the CoDel (Controlled-Delay) AQM.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
---
man/man8/Makefile | 2 +-
man/man8/tc-codel.8 | 114 +++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 115 insertions(+), 1 deletions(-)
create mode 100644 man/man8/tc-codel.8
diff --git a/man/man8/Makefile b/man/man8/Makefile
index 6873a4b..6d6242e 100644
--- a/man/man8/Makefile
+++ b/man/man8/Makefile
@@ -3,7 +3,7 @@ TARGETS = ip-address.8 ip-link.8 ip-route.8
MAN8PAGES = $(TARGETS) ip.8 arpd.8 lnstat.8 routel.8 rtacct.8 rtmon.8 ss.8 \
tc-bfifo.8 tc-cbq-details.8 tc-cbq.8 tc-drr.8 tc-htb.8 \
tc-pfifo.8 tc-pfifo_fast.8 tc-prio.8 tc-red.8 tc-sfq.8 \
- tc-tbf.8 tc.8 rtstat.8 ctstat.8 nstat.8 routef.8 \
+ tc-tbf.8 tc.8 rtstat.8 ctstat.8 nstat.8 routef.8 tc-codel.8 \
tc-sfb.8 tc-netem.8 tc-choke.8 ip-tunnel.8 ip-rule.8 ip-ntable.8 \
ip-monitor.8 tc-stab.8 tc-hfsc.8 ip-xfrm.8 ip-netns.8 \
ip-neighbour.8 ip-mroute.8 ip-maddress.8 ip-addrlabel.8
diff --git a/man/man8/tc-codel.8 b/man/man8/tc-codel.8
new file mode 100644
index 0000000..605e498
--- /dev/null
+++ b/man/man8/tc-codel.8
@@ -0,0 +1,114 @@
+.TH CoDel 8 "23 May 2012" "iproute2" "Linux"
+.SH NAME
+CoDel \- Controlled-Delay Active Queue Management algorithm
+.SH SYNOPSIS
+.B tc qdisc ... codel
+[
+.B limit
+PACKETS ] [
+.B target
+TIME ] [
+.B interval
+TIME ] [
+.B ecn
+|
+.B noecn
+]
+
+.SH DESCRIPTION
+CoDel (pronounced "coddle") is an adaptive "no-knobs" active queue management
+algorithm (AQM) scheme that was developed to address the shortcomings of
+RED and its variants. It was developed with the following goals
+in mind:
+ o It should be parameterless.
+ o It should keep delays low while permitting bursts of traffic.
+ o It should control delay.
+ o It should adapt dynamically to changing link rates with no impact on
+utilization.
+ o It should be simple and efficient and should scale from simple to
+complex routers.
+
+.SH ALGORITHM
+CoDel comes with three major innovations. First, instead of using queue size or queue
+average, it uses the local minimum queue as a measure of the standing/persistent queue.
+Second, it uses a single state-tracking variable of the minimum delay to see where it
+is relative to the standing queue delay. Third, instead of measuring queue size
+in bytes or packets, it is measured in packet-sojourn time in the queue.
+
+CoDel measures the minimum local queue delay (i.e. standing queue delay) and
+compares it to the value of the given acceptable queue delay
+.B target.
+As long as the minimum queue delay is less than
+.B target
+or the buffer contains fewer than MTU worth of bytes, packets are not dropped.
+CoDel enters a dropping mode when the minimum queue delay has exceeded
+.B target
+for a time greater than
+.B interval.
+In this mode, packets are dropped at different drop times which are set by a
+control law. The control law ensures that the packet drops cause a linear change
+in the throughput. Once the minimum delay goes below
+.B target,
+packets are no longer dropped.
+
+Additional details can be found in the paper cited below.
+
+.SH PARAMETERS
+.SS limit
+hard limit on the real queue size. When this limit is reached, incoming packets
+are dropped. If the value is lowered, packets are dropped so that the new limit is
+met. Default is 1000 packets.
+
+.SS target
+is the acceptable minimum standing/persistent queue delay. This minimum delay
+is identified by tracking the local minimum queue delay that packets experience.
+Default and recommended value is 5ms.
+
+.SS interval
+is used to ensure that the measured minimum delay does not become too stale. The
+minimum delay must be experienced in the last epoch of length
+.B interval.
+It should be set on the order of the worst-case RTT through the bottleneck to
+give endpoints sufficient time to react. Default value is 100ms.
+
+.SS ecn | noecn
+can be used to mark packets instead of dropping them. If
+.B ecn
+has been enabled,
+.B noecn
+can be used to turn it off and vice versa. By default,
+.B ecn
+is turned off.
+
+.SH EXAMPLES
+ # tc qdisc add dev eth0 root codel
+ # tc -s qdisc show
+ qdisc codel 801b: dev eth0 root refcnt 2 limit 1000p target 5.0ms
+interval 100.0ms
+ Sent 245801662 bytes 275853 pkt (dropped 0, overlimits 0 requeues 24)
+ backlog 0b 0p requeues 24
+ count 0 lastcount 0 ldelay 2us drop_next 0us
+ maxpacket 7306 ecn_mark 0 drop_overlimit 0
+
+ # tc qdisc add dev eth0 root codel limit 100 target 4ms interval 30ms ecn
+ # tc -s qdisc show
+ qdisc codel 801c: dev eth0 root refcnt 2 limit 100p target 4.0ms
+interval 30.0ms ecn
+ Sent 237573074 bytes 268561 pkt (dropped 0, overlimits 0 requeues 5)
+ backlog 0b 0p requeues 5
+ count 0 lastcount 0 ldelay 76us drop_next 0us
+ maxpacket 2962 ecn_mark 0 drop_overlimit 0
+
+
+.SH SEE ALSO
+.BR tc (8),
+.BR tc-red (8)
+
+.SH SOURCES
+o Kathleen Nichols and Van Jacobson, "Controlling Queue Delay", ACM Queue,
+http://queue.acm.org/detail.cfm?id=2209336
+
+.SH AUTHORS
+CoDel was implemented by Eric Dumazet and David Taht. This manpage was written
+by Vijay Subramanian. Please report corrections to the Linux Networking
+mailing list <netdev@vger.kernel.org>.
--
1.7.0.4
^ permalink raw reply related
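The dropping behaviour described in the ALGORITHM section of the manpage above can be illustrated with a small sketch. This is not the kernel implementation: the class and method names are hypothetical, the clock is passed in by the caller, and the control law is assumed to be the one from the cited paper (next drop after interval / sqrt(count)). Details such as ECN marking and carrying the drop count over between dropping episodes are left out.

```python
import math


class CoDelSketch:
    """Simplified illustration of the CoDel drop decision (hypothetical names)."""

    def __init__(self, target=0.005, interval=0.100):
        self.target = target          # acceptable standing queue delay, seconds
        self.interval = interval      # window the delay must stay above target
        self.first_above_time = None  # deadline after which dropping may start
        self.dropping = False         # True while in dropping mode
        self.count = 0                # drops in the current dropping episode
        self.drop_next = 0.0          # time of the next scheduled drop

    def control_law(self, t):
        # Assumed from the cited paper: drops get closer together as count grows.
        return t + self.interval / math.sqrt(self.count)

    def should_drop(self, now, sojourn, backlog_bytes, mtu=1500):
        """Return True if the packet dequeued at `now` should be dropped."""
        # Below target, or less than one MTU queued: never drop, leave dropping mode.
        if sojourn < self.target or backlog_bytes < mtu:
            self.first_above_time = None
            self.dropping = False
            return False
        # First time above target: start the interval timer.
        if self.first_above_time is None:
            self.first_above_time = now + self.interval
            return False
        if not self.dropping:
            # Enter dropping mode only after a full interval above target.
            if now >= self.first_above_time:
                self.dropping = True
                self.count = 1
                self.drop_next = self.control_law(now)
                return True
            return False
        # Already dropping: drop again when the control law says so.
        if now >= self.drop_next:
            self.count += 1
            self.drop_next = self.control_law(self.drop_next)
            return True
        return False
```

With the defaults (target 5ms, interval 100ms), a queue whose sojourn time stays at 10ms sees its first drop only after a full interval above target, and later drops arrive at shrinking intervals until the delay falls below target again.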
* Re: [PATCH IPROUTE2] tc-codel: Add manpage
From: Eric Dumazet @ 2012-05-24 4:53 UTC (permalink / raw)
To: Vijay Subramanian; +Cc: netdev, Stephen Hemminger, Dave Taht
In-Reply-To: <1337834034-27803-1-git-send-email-subramanian.vijay@gmail.com>
On Wed, 2012-05-23 at 21:33 -0700, Vijay Subramanian wrote:
> This patch adds the manpage for the CoDel (Controlled-Delay) AQM.
>
> Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
> ---
Thanks !
> +.SS target
> +is the acceptable minimum standing/persistent queue delay. This minimum delay
> +is identified by tracking the local minimum queue delay that packets experience.
> +Default and recommended value is 5ms.
Although I can tell I prefer lower values on hosts.
On 10GbE links, I used a 500us target.
> +
> +.SS interval
> +is used to ensure that the measured minimum delay does not become too stale. The
> +minimum delay must be experienced in the last epoch of length
> +.B interval.
> +It should be set on the order of the worst-case RTT through the bottleneck to
> +give endpoints sufficient time to react. Default value is 100ms.
Same here. In a datacenter, you might reduce this to 20ms or so...
^ permalink raw reply
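One way to see why a smaller target may suit a faster link is to convert the delay into the amount of data it represents. The sketch below is only arithmetic, under the assumption that the queue drains at the full line rate; the function name is hypothetical and not part of tc or the kernel.

```python
def standing_queue_bytes(link_rate_bps: int, delay_us: int) -> int:
    """Bytes that drain in `delay_us` microseconds at `link_rate_bps` bits/s."""
    return link_rate_bps * delay_us // (8 * 1_000_000)


TEN_GBE = 10_000_000_000  # 10 Gbit/s, assumed to be the drain rate

for label, delay_us in (("default target 5ms", 5_000), ("500us target", 500)):
    print(label, standing_queue_bytes(TEN_GBE, delay_us), "bytes")
```

Under that assumption, the default 5ms target tolerates about 6,250,000 bytes of standing queue on a 10GbE link, while a 500us target tolerates about 625,000 bytes.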
* Re: [PATCH 06/15] batman-adv: Distributed ARP Table - add snooping functions for ARP messages
From: Sven Eckelmann @ 2012-05-24 5:34 UTC (permalink / raw)
To: b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, lindner_marek-LWAfsSFWpa4,
David Miller
In-Reply-To: <20120523.190158.2172815395820691292.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
On Wednesday 23 May 2012 19:01:58 David Miller wrote:
> It can't be all on me to answer your question, I cannot be
> the choke point.
>
> You must lean on the entire networking developer community
> for help, otherwise it simply will not scale.
_You_ were the person that declined the pull request because _you_ wanted to
rewrite the ARP handling. So _you_ are the person that has the insight in
_your_ plans. Either _you_ tell us what is _your_ problem with it or _you_
will have to point us to a person that knows _you_.
Until now nobody stepped up (the mails were publicly visible to the netdev
people). But I will ask Antonio to send a separate mail to netdev and
recent arp.c committers.
Thanks,
Sven
^ permalink raw reply
* Re: [PATCH 06/15] batman-adv: Distributed ARP Table - add snooping functions for ARP messages
From: David Miller @ 2012-05-24 5:54 UTC (permalink / raw)
To: sven-KaDOiPu9UxWEi8DpZVb4nw
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r,
lindner_marek-LWAfsSFWpa4
In-Reply-To: <3476925.EJY4MZoOgZ-1RWNDQYo44h8XcdJbWeDu3TFMtCCXL7YSoIsB4E12gc@public.gmane.org>
From: Sven Eckelmann <sven-KaDOiPu9UxWEi8DpZVb4nw@public.gmane.org>
Date: Thu, 24 May 2012 07:34:12 +0200
> _You_ were the person that declined the pull request because _you_ wanted to
> rewrite the ARP handling. So _you_ are the person that has the insight in
> _your_ plans. Either _you_ tell us what is _your_ problem with it or _you_
> will have to point us to a person that knows _you_.
If I say that you must not use ARP nor neighbour layer internals, it
doesn't mean that I have to come up with the alternative
implementation for you.
Now, you can ask others on the netdev list for suggestions, but you
can't expect me to be the direct and only responder on things like
that.
^ permalink raw reply
* [GIT] Networking
From: David Miller @ 2012-05-24 6:05 UTC (permalink / raw)
To: torvalds; +Cc: akpm, netdev, linux-kernel
1) One final sync of wireless and bluetooth stuff from John
Linville. These changes have all been in his tree for more
than a week, and therefore have had the necessary -next
exposure. John was just away on a trip and didn't have
a chance to send the pull request until a day or two ago.
2) Put back some defines in user exposed header file areas
that were removed during the tokenring purge. From
Stephen Hemminger and Paul Gortmaker.
3) A bug fix for UDP hash table allocation got lost in the pile due to
one of those "you got it.. no I've got it.." situations. :-)
From Tim Bird.
4) SKB coalescing in TCP needs to have stricter checks, otherwise
we'll try to coalesce overlapping frags and crash. Fix from
Eric Dumazet.
5) RCU routing table lookups can race with free_fib_info(), causing
crashes when we deref the device pointers in the route. Fix by
releasing the net device in the RCU callback. From Yanmin Zhang.
Ok, everything from here on out will be bug fixes.
Please pull, thanks a lot!
The following changes since commit 72c04af9a2d57b7945cf3de8e71461bd80695d50:
fbdev: sh_mobile_lcdc: Don't confuse line size with pitch (2012-05-21 20:59:32 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master
for you to fetch changes up to 1ca7ee30630e1022dbcf1b51be20580815ffab73:
tcp: take care of overlaps in tcp_try_coalesce() (2012-05-24 00:28:21 -0400)
----------------------------------------------------------------
Amit Beka (1):
iwlwifi: fix power index handling
Amitkumar Karwar (2):
Bluetooth: btmrvl: configure default host sleep parameters
Bluetooth: btmrvl: add support for SDIO suspend/resume callbacks
Andre Guedes (21):
Bluetooth: Check FINDING state in interleaved discovery
Bluetooth: Add hci_cancel_le_scan() to hci_core
Bluetooth: LE support for MGMT stop discovery
Bluetooth: Replace EPERM by EALREADY in hci_cancel_inquiry
Bluetooth: Refactor stop_discovery
Bluetooth: Add Periodic Inquiry command complete handler
Bluetooth: Add HCI_PERIODIC_INQ to dev_flags
Bluetooth: Check HCI_PERIODIC_INQ in start_discovery
Bluetooth: Ignore inquiry results from periodic inquiry
Bluetooth: Add Periodic Inquiry command complete handler
Bluetooth: Add HCI_PERIODIC_INQ to dev_flags
Bluetooth: Remove MGMT_ADDR_INVALID macro
Bluetooth: Remove useless code in hci_connect
Bluetooth: Move address type macros to bluetooth.h
Bluetooth: Rename link_to_mgmt to link_to_bdaddr
Bluetooth: Add address type to struct sockaddr_l2
Bluetooth: Rename mgmt_to_le to bdaddr_to_le
Bluetooth: Move bdaddr_to_le to hci_core
Bluetooth: Add dst_type parameter to hci_connect
Bluetooth: Use address type info from user-space
Bluetooth: Remove advertising cache
Andrei Emeltchenko (24):
Bluetooth: trivial: Correct endian conversion
Bluetooth: Correct type for hdev lmp_subver
Bluetooth: Correct type for ediv to __le16
Bluetooth: Fix extra conversion to __le32
Bluetooth: Correct chan->psm endian conversions
Bluetooth: Correct ediv in SMP
Bluetooth: Correct length calc in L2CAP conf rsp
Bluetooth: Correct CID endian notation
Bluetooth: Convert error codes to le16
Bluetooth: trivial: Fix endian conversion mode
Bluetooth: trivial: Correct types
Bluetooth: Fix type in cpu_to_le conversion
Bluetooth: Fix opcode access in hci_complete
Bluetooth: trivial: Remove sparse warnings
Bluetooth: Silence sparse warning
Bluetooth: Comments and style fixes
Bluetooth: Remove unneeded timer clear
Bluetooth: Make L2CAP chan_add functions static
Bluetooth: Remove unneeded zero initialization
Bluetooth: Add Read Local AMP Info to init
Bluetooth: Adds set_default function in L2CAP setup
Bluetooth: Fix debug printing unallocated name
Bluetooth: trivial: Remove empty line
Bluetooth: Remove unneeded calculation and magic number
Arik Nemtsov (1):
mac80211: fix network header location when adding encryption headers
Ashok Nagarajan (4):
mac80211: Push the deleted comment to correct place
mac80211: Fix don't use '>' operator for matching channel types
mac80211: Modify mesh_set_ht_prot_mode() to have less identation
mac80211: Add debugfs entry for mesh ht_opmode
Avinash Patil (18):
mwifiex: allocate space for one more mwifiex_private structure
mwifiex: handle station specific commands on STA interface only
mwifiex: support for creation of AP interface
mwifiex: multi-interface support for mwifiex
mwifiex: save adapter pointer in wiphy_priv
mwifiex: append peer mac address TLV in key material command to firmware
mwifiex: add bss start and bss stop commands for AP
mwifiex: add AP command sys_config and set channel
mwifiex: stop BSS in deauthentication handling
mwifiex: handle interface type changes correctly
mwifiex: common set_wiphy_params cfg80211 handler for AP and STA interface
mwifiex: add cfg80211 start_ap and stop_ap handlers
mwifiex: add AP event handling framework
mwifiex: add WPA2 support for AP
mwifiex: rearrange AP sys configure code
mwifiex: add custom IE framework
mwifiex: retrieve IEs from cfg80211_beacon_data and send to firmware
mwifiex: delete IEs when stop_ap
Bartosz.Markowski@tieto.com (1):
wlcore/wl12xx: implement better beacon loss handling
Bing Zhao (1):
mwifiex: fix coding style issue in mwifiex_deauthenticate
Bjorn Helgaas (1):
b43: use pci_is_pcie() instead of obsolete pci_dev.is_pcie
Chun-Yeow Yeoh (1):
mac80211: fix the increment of unicast/multicast counters for forwarded PREQ
Cristian Chilipirea (2):
Bluetooth: Fixed checkpatch warnings
Net: wireless: core.c: fixed checkpatch warnings
Dan Carpenter (6):
ath6kl: list_first_entry() is never NULL
ath6kl: change || to &&
ath6kl: fix an indenting issue
NFC: Remove unneeded pn533 dev NULL check
wlcore: release lock on error in wl1271_op_suspend()
wlcore: fixup an allocation
David Herrmann (5):
Bluetooth: Remove redundant hdev->parent field
Bluetooth: vhci: Ignore return code of nonseekable_open()
Bluetooth: Move hci_alloc/free_dev close to hci_register/unregister_dev
Bluetooth: Move device initialization to hci_alloc_dev()
Bluetooth: Remove unneeded initialization in hci_alloc_dev()
David S. Miller (1):
Merge branch 'master' of git://git.kernel.org/.../linville/wireless
David Spinadel (3):
iwlwifi: fix scan_cmd_size allocation
iwlwifi: disable default wildcard ssid scan
iwlwifi: invert the order of ssid list in scan cmd
Eldad Zack (1):
Bluetooth: bnep: use constant for ethertype
Emmanuel Grumbach (2):
iwlwifi: don't flood logs when HT debug flag is set
iwlwifi: don't disable AGG queues that are not enabled
Eric Dumazet (1):
tcp: take care of overlaps in tcp_try_coalesce()
Eric Lapuyade (7):
NFC: Cache the core NFC active target pointer instead of its index
NFC: Remove useless HCI private nfc target table
NFC: Specify usage for targets found and target lost events
NFC: Add HCI/SHDLC support to let driver check for tag presence
NFC: Update Documentation/nfc-hci.txt
NFC: HCI based pn544 driver
NFC: HCI drivers don't have to keep track of polling state
Eyal Shapira (4):
wlcore: add RX filters util functions
wl12xx: add RX filters ACX commands
wlcore: add RX filters driver state mgmt functions
wl12xx: support wowlan wakeup patterns
Franky Lin (11):
brcmfmac: remove unused parameter of brcmf_sdcard_reg_read
brcmfmac: remove unused parameter of brcmf_sdcard_reg_write
brcmfmac: decouple set_sbaddr_window from register write interface
brcmfmac: introduce unified register access interface for SDIO
brcmfmac: replace brcmf_sdcard_cfg_read with brcmf_sdio_regrb
brcmfmac: replace brcmf_sdcard_cfg_write with brcmf_sdio_regwb
brcmfmac: replace brcmf_sdcard_reg_read with brcmf_sdio_regrl
brcmfmac: replace brcmf_sdcard_reg_write with brcmf_sdio_regwl
brcmfmac: remove redundant retries for SDIO core register access
brcmfmac: remove function brcmf_sdcard_regfail
brcmfmac: replace brcmf_sdioh_card_regread with brcmf_sdio_regrl
Gustavo Padovan (13):
Bluetooth: Remove sk parameter from l2cap_chan_create()
Bluetooth: Remove err parameter from alloc_skb()
Bluetooth: remove unneeded declaration of sco_conn_del()
Bluetooth: Remove unneeded elements from size calculation
Bluetooth: Remove hlen variable
Merge git://git.kernel.org/.../bluetooth/bluetooth
Bluetooth: Fix wrong set of skb fragments
Bluetooth: Fix packet size provided to the controller
Bluetooth: Fix skb length calculation
Bluetooth: improve readability of l2cap_seq_list code
Bluetooth: report the right security level in getsockopt
Bluetooth: Create flags for bt_sk()
Bluetooth: Report proper error number in disconnection
H Hartley Sweeten (5):
NFC: Quiet nci/data.c sparse noise about plain integer as NULL pointer
NFC: Include nci_core.h to nci/lib.c
NFC: Quiet nci/ntf.c sparse noise about plain integer as NULL pointer
NFC: HCI ops should not be exposed globally
NFC: The NFC genl family structure should not be exposed globally
Hauke Mehrtens (32):
ssb: remove rev from boardinfo
MIPS: bcm47xx: refactor fetching board data
bcma: add boardinfo struct
MIPS: bcm47xx: read baordrev without prefix from sprom
ssb/bcma: fill attribute alpha2 from sprom
ssb: fill board_rev attribute from sprom
bcma: read out some additional sprom attributes
bcma/ssb: parse new attributes from sprom
bcma: implement setting core clock mode to dynamic
bcma: add bcma_core_pci_extend_L1timer
bcma: add bcma_core_pci_fixcfg()
bcma: add bcma_core_pci_config_fixup()
brcmsmac: use sprom from bcma
brcmsmac: remove brcmsmac own sprom parsing
brcmsmac: get board and chip info from bcma
brcmsmac: remove support for cc rev < 20
brcmsmac: remove references to PCI
brcmsmac: remove PCIe functions needed for PCIe core rev <= 10
brcmsmac: remove pcicore_hwup()
brcmsmac: remove ai_pci_setup()
brcmsmac: remove ai_chipcontrl_epa4331
brcmsmac: remove ai_gpiocontrol()
brcmsmac: remove _ai_clkctl_cc()
brcmsmac: remove pcicore_attach()
brcmsmac: remove pcicore_find_pci_capability()
brcmsmac: remove pcie_extendL1timer()
brcmsmac: remove pcicore_fixcfg()
brcmsmac: remove nicpci.c
brcmsmac: do not access host_pci
brcmsmac: read PCI vendor and device id only for PCI devices
brcmsmac: handle non pci in ai_deviceremoved()
ssb: add PCI IDs 0x4322 and 43222
Hemant Gupta (5):
Bluetooth: Send correct address type for LTK
Bluetooth: Fix clearing discovery type when stopping discovery
Bluetooth: mgmt: Fix missing connect failed event for LE
Bluetooth: mgmt: Fix address type while loading Long Term Key
Bluetooth: Don't distribute keys in case of Encryption Failure
Ido Yariv (1):
Bluetooth: Search global l2cap channels by src/dst addresses
Janusz.Dziedzic@tieto.com (1):
mac80211: Add IV-room in the skb for TKIP and WEP
Javier Cardona (1):
mac80211_hwsim: Fix rate control by correctly reporting transmission counts
Jesper Juhl (3):
ath6kl: fix memory leak in ath6kl_fwlog_block_read()
Bluetooth: btmrvl_sdio: remove pointless conditional before release_firmware()
wlcore: fix size of two memset's in wl1271_cmd_build_arp_rsp()
Johan Hedberg (1):
Bluetooth: Fix Inquiry with RSSI event mask
Johannes Berg (11):
mac80211: fix single queue drivers
mac80211: fix TX aggregation session timer
cfg80211: remove double prototype
cfg80211: add warning when calculating MCS rates >= 32
mac80211: (selectively) add HT details in radiotap
nl80211: prevent additions to old station flags API
cfg80211: fix cfg80211_can_beacon_sec_chan prototype
nl80211: refactor valid channel type check
iwlwifi: support explicit monitor interface
rndis_wlan: remove set_channel cfg80211 hook
mwifiex: remove set_channel cfg80211 hook
John W. Linville (3):
Merge branch 'for-linville' of git://github.com/kvalo/ath6kl
Merge branch 'for-upstream' of git://git.kernel.org/.../bluetooth/bluetooth-next
Merge git://git.kernel.org/.../linville/wireless-next
Jouni Malinen (2):
ath6kl: Remove incorrect Probe Response offload support for Interworking
ath6kl: Configure probed SSID list consistently
Kalle Valo (2):
Merge remote branch 'wireless-next/master' into ath6kl-next
ath6kl: merge split format strings into one
Karsten Keil (1):
mISDN: Add X-Tensions USB ISDN TA XC-525
Kevin Fang (2):
ath6kl: handle background(BK) stream properly on htc mbox layer
ath6kl: assign Tx packet drop threshold per endpoint on htc pipe layer
Larry Finger (1):
b43legacy: Fix error due to MMIO access with SSB unpowered
Luciano Coelho (3):
wlcore: use GFP_KERNEL together with GFP_DMA
wlcore: fix pointer print out in wl1271_acx_set_rx_filter()
wlcore: fix some sparse warnings due to missing static declaration
Luis R. Rodriguez (1):
ath6kl: include in.h explicitly
Lukasz Rymanowski (1):
Bluetooth: Remove not needed status parameter
Marcel Holtmann (12):
Bluetooth: Add TX power tag to EIR data
Bluetooth: Handle EIR tags for Device ID
Bluetooth: Add management command for setting Device ID
Bluetooth: Fix broken usage of put_unaligned_le16
Bluetooth: Fix broken usage of get_unaligned_le16
Bluetooth: Update management interface revision
Bluetooth: Split error handling for L2CAP listen sockets
Bluetooth: Split error handling for SCO listen sockets
Bluetooth: Don't check source address in SCO bind function
Bluetooth: Restrict to one SCO listening socket
Bluetooth: Enable Low Energy support by default
NFC: Select CRC_CCITT for SHDLC link layer of HCI based drivers
Marek Marczykowski (1):
xen: do not disable netfront in dom0
Mat Martineau (17):
Bluetooth: Add definitions and struct members for new ERTM state machine
Bluetooth: Add a structure to carry ERTM data in skb control blocks
Bluetooth: Add the l2cap_seq_list structure for tracking frames
Bluetooth: Functions for handling ERTM control fields
Bluetooth: Improve ERTM sequence number offset calculation
Bluetooth: Remove duplicate structure members from bt_skb_cb
Bluetooth: Move recently-added ERTM header packing functions
Bluetooth: Initialize new l2cap_chan structure members
Bluetooth: Remove unused function
Bluetooth: Make better use of l2cap_chan reference counting
Bluetooth: Add Code Aurora Forum copyright
Bluetooth: Refactor L2CAP ERTM and streaming transmit segmentation
Bluetooth: Update tx_send_head when sending ERTM data
Bluetooth: Initialize the transmit queue for L2CAP streaming mode
Bluetooth: Fix a redundant and problematic incoming MTU check
Bluetooth: Restore locking semantics when looking up L2CAP channels
Bluetooth: Lock the L2CAP channel when sending
Michael Gruetzner (1):
Bluetooth: Add support for Foxconn/Hon Hai AR5BBU22 0489:E03C
Mikel Astiz (3):
Bluetooth: Use unsigned int instead of signed int
Bluetooth: Remove unnecessary check
Bluetooth: btusb: Dynamic alternate setting
Ming Jiang (2):
ath6kl: allow deepsleep_suspend function when wlan interface down
ath6kl clear the MMC_PM_KEEP_POWER for cutpower case
Nathan Hintz (6):
bcma: Find names of non BCM cores
bcma: Move initialization of SPROM to prevent overwrite
bcma: Account for variable PCI memory base/size
bcma: reads/writes are always 4 bytes, so always map 4 bytes
bcma: Add __devexit to bcma_host_pci_remove
bcma: Add flush for BCMA_RESET_CTL write
Naveen Gangadharan (1):
ath6kl: Multicast filter support in wow suspend and non-suspend
Nobuhiro Iwamatsu (1):
phy/micrel: Fix ID of KSZ9021
Paul Gortmaker (1):
ipx: restore token ring define to include/linux/ipx.h
Raja Mani (1):
ath6kl: Retain bg scan period value modified by the user
Randy Dunlap (1):
wireless: TI wlxxx depends on MAC80211
Ray Chen (2):
ath6kl: Add AR6004 1.2 support for USB and SDIO
ath6kl: Fix system crash sometimes for USB hotplug
Samuel Ortiz (6):
NFC: LLCP connect must wait for a CC frame
NFC: Update the LLCP poll mask
NFC: Return the amount of LLCP bytes queued to sock_sendmsg
feature-removal: Remove pn544 raw driver
NFC: Export nfc.h to userland
NFC: Queue I frame fragments to the LLCP sockets queue tail
Subramania Sharma Thandaveswaran (1):
ath6kl: Fix bug in bg scan configuration in schedule scan
Sujith Manoharan (1):
ath9k_hw: Fix RTT calibration
Syam Sidhardhan (5):
Bluetooth: mgmt: Remove unwanted goto statements
Bluetooth: remove header declared but not defined
Bluetooth: Remove strtoba header declared but not defined
Bluetooth: Remove unused hci_le_ltk_reply()
Bluetooth: Remove unused hci_le_ltk_neg_reply()
Szymon Janc (2):
Bluetooth: mgmt: Fix some code style and indentation issues
Bluetooth: mgmt: Don't allow to set invalid value to DeviceID source
Thomas Pedersen (7):
ath6kl: handle concurrent AP-STA channel switches
ath6kl: support fw reporting phy capabilities
ath6kl: only restore supported HT caps
ath6kl: disallow WoW with multiple vifs
ath6kl: unblock fwlog_block_read() on exit
ath6kl: check for sband existence when creating scan cmd
mac80211: send peer candidate event for new sta only
Tim Bird (1):
mm: add a low limit to alloc_large_system_hash
Tim Gardner (1):
ath6kl: Normalize use of FW_DIR
Ulisses Furquim (1):
Bluetooth: Fix registering hci with duplicate name
Vasanthakumar Thiagarajan (6):
ath6kl: Fix possible unaligned memory access in ath6kl_get_rsn_capab()
ath6kl: Configure 0 as rsn cap when it is not there in rsn ie
ath6kl: Don't advertise HT capability for incapable firmware
ath6kl: Fix bss filter setting while scanning
ath6kl: Update netstats for some of the tx failrues in ath6kl_data_tx()
ath6kl: Complete failed tx packet in ath6kl_htc_tx_from_queue()
Vinicius Costa Gomes (1):
Bluetooth: Add support for reusing the same hci_conn for LE links
Vishal Agarwal (1):
Bluetooth: Fix EIR data generation for mgmt_device_found
Vivek Natarajan (1):
ath6kl_sdio: Fix the EAPOL out of order issue
Wey-Yi Guy (3):
iwlwifi: include rssi as part of decision making for reduce txpower
iwlwifi: add documentation for bt reduced tx power
iwlwifi: make sure reduced tx power bit is valid
Wu Jiajun-B06378 (1):
gianfar:don't add FCB length to hard_header_len
Yanmin Zhang (1):
ipv4: fix the rcu race between free_fib_info and ip_route_output_slow
Zefir Kurtisi (1):
nl80211: fix typos in comments
Zero.Lin (1):
rt2x00:Add RT539b chipset support
joseph daniel (1):
NFC: Fix LLCP compilation warning
stephen hemminger (1):
if: restore token ring ARP type to header
Documentation/feature-removal-schedule.txt | 12 +
Documentation/nfc/nfc-hci.txt | 45 ++-
arch/mips/bcm47xx/setup.c | 15 +-
arch/mips/bcm47xx/sprom.c | 28 +-
arch/mips/include/asm/mach-bcm47xx/bcm47xx.h | 9 +
drivers/bcma/core.c | 3 +-
drivers/bcma/driver_pci.c | 53 ++-
drivers/bcma/driver_pci_host.c | 10 +-
drivers/bcma/host_pci.c | 7 +-
drivers/bcma/scan.c | 54 ++-
drivers/bcma/sprom.c | 149 ++++++-
drivers/bluetooth/ath3k.c | 6 +
drivers/bluetooth/btmrvl_drv.h | 3 +
drivers/bluetooth/btmrvl_main.c | 56 +--
drivers/bluetooth/btmrvl_sdio.c | 112 +++++-
drivers/bluetooth/btusb.c | 16 +-
drivers/bluetooth/hci_ldisc.c | 2 +-
drivers/bluetooth/hci_vhci.c | 3 +-
drivers/isdn/hardware/mISDN/hfcsusb.h | 6 +
drivers/net/ethernet/freescale/gianfar.c | 2 +-
drivers/net/wireless/ath/ath6kl/cfg80211.c | 238 ++++++++----
drivers/net/wireless/ath/ath6kl/cfg80211.h | 2 +
drivers/net/wireless/ath/ath6kl/core.h | 33 +-
drivers/net/wireless/ath/ath6kl/debug.c | 12 +-
drivers/net/wireless/ath/ath6kl/htc_mbox.c | 45 ++-
drivers/net/wireless/ath/ath6kl/htc_pipe.c | 11 +-
drivers/net/wireless/ath/ath6kl/init.c | 29 +-
drivers/net/wireless/ath/ath6kl/main.c | 104 ++++-
drivers/net/wireless/ath/ath6kl/sdio.c | 17 +-
drivers/net/wireless/ath/ath6kl/txrx.c | 12 +-
drivers/net/wireless/ath/ath6kl/usb.c | 12 +
drivers/net/wireless/ath/ath6kl/wmi.c | 94 +++--
drivers/net/wireless/ath/ath6kl/wmi.h | 24 ++
drivers/net/wireless/ath/ath9k/ar9003_calib.c | 50 +--
drivers/net/wireless/ath/ath9k/ar9003_mci.c | 2 +-
drivers/net/wireless/ath/ath9k/ar9003_rtt.c | 84 +++-
drivers/net/wireless/ath/ath9k/ar9003_rtt.h | 5 +-
drivers/net/wireless/ath/ath9k/hw.c | 9 +-
drivers/net/wireless/ath/ath9k/hw.h | 9 +-
drivers/net/wireless/b43/bus.c | 6 +-
drivers/net/wireless/b43/dma.c | 2 +-
drivers/net/wireless/b43/main.c | 4 +-
drivers/net/wireless/b43legacy/main.c | 4 +-
drivers/net/wireless/b43legacy/phy.c | 4 +-
drivers/net/wireless/b43legacy/radio.c | 10 +-
drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c | 244 ++++++------
drivers/net/wireless/brcm80211/brcmfmac/bcmsdh_sdmmc.c | 32 +-
drivers/net/wireless/brcm80211/brcmfmac/dhd_sdio.c | 350 +++++++----------
drivers/net/wireless/brcm80211/brcmfmac/sdio_chip.c | 265 +++++++------
drivers/net/wireless/brcm80211/brcmfmac/sdio_host.h | 37 +-
drivers/net/wireless/brcm80211/brcmsmac/Makefile | 3 -
drivers/net/wireless/brcm80211/brcmsmac/aiutils.c | 479 ++---------------------
drivers/net/wireless/brcm80211/brcmsmac/aiutils.h | 24 --
drivers/net/wireless/brcm80211/brcmsmac/antsel.c | 16 +-
drivers/net/wireless/brcm80211/brcmsmac/channel.c | 7 +-
drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c | 11 +-
drivers/net/wireless/brcm80211/brcmsmac/main.c | 142 +++----
drivers/net/wireless/brcm80211/brcmsmac/nicpci.c | 826 ---------------------------------------
drivers/net/wireless/brcm80211/brcmsmac/nicpci.h | 77 ----
drivers/net/wireless/brcm80211/brcmsmac/otp.c | 410 --------------------
drivers/net/wireless/brcm80211/brcmsmac/otp.h | 36 --
drivers/net/wireless/brcm80211/brcmsmac/phy/phy_lcn.c | 67 ++--
drivers/net/wireless/brcm80211/brcmsmac/phy/phy_n.c | 333 ++++++----------
drivers/net/wireless/brcm80211/brcmsmac/phy_shim.c | 9 -
drivers/net/wireless/brcm80211/brcmsmac/phy_shim.h | 3 -
drivers/net/wireless/brcm80211/brcmsmac/pub.h | 228 -----------
drivers/net/wireless/brcm80211/brcmsmac/srom.c | 980 -----------------------------------------------
drivers/net/wireless/brcm80211/brcmsmac/srom.h | 29 --
drivers/net/wireless/brcm80211/brcmsmac/stf.c | 6 +-
drivers/net/wireless/iwlwifi/iwl-agn-lib.c | 35 +-
drivers/net/wireless/iwlwifi/iwl-agn-rxon.c | 4 +
drivers/net/wireless/iwlwifi/iwl-agn-tx.c | 19 +-
drivers/net/wireless/iwlwifi/iwl-agn.c | 2 +-
drivers/net/wireless/iwlwifi/iwl-commands.h | 7 +-
drivers/net/wireless/iwlwifi/iwl-mac80211.c | 5 +-
drivers/net/wireless/iwlwifi/iwl-power.c | 8 +-
drivers/net/wireless/iwlwifi/iwl-scan.c | 52 ++-
drivers/net/wireless/mac80211_hwsim.c | 5 +
drivers/net/wireless/mwifiex/Makefile | 2 +
drivers/net/wireless/mwifiex/cfg80211.c | 498 +++++++++++++++++-------
drivers/net/wireless/mwifiex/cfg80211.h | 2 +-
drivers/net/wireless/mwifiex/cmdevt.c | 21 +-
drivers/net/wireless/mwifiex/decl.h | 13 +-
drivers/net/wireless/mwifiex/fw.h | 159 +++++++-
drivers/net/wireless/mwifiex/ie.c | 396 +++++++++++++++++++
drivers/net/wireless/mwifiex/init.c | 1 +
drivers/net/wireless/mwifiex/ioctl.h | 32 ++
drivers/net/wireless/mwifiex/join.c | 26 +-
drivers/net/wireless/mwifiex/main.c | 57 ++-
drivers/net/wireless/mwifiex/main.h | 26 +-
drivers/net/wireless/mwifiex/sta_cmd.c | 69 ++--
drivers/net/wireless/mwifiex/sta_cmdresp.c | 8 +
drivers/net/wireless/mwifiex/sta_event.c | 51 ++-
drivers/net/wireless/mwifiex/sta_ioctl.c | 9 +-
drivers/net/wireless/mwifiex/uap_cmd.c | 432 +++++++++++++++++++++
drivers/net/wireless/mwifiex/wmm.c | 4 +
drivers/net/wireless/rndis_wlan.c | 14 -
drivers/net/wireless/rt2x00/rt2800pci.c | 1 +
drivers/net/wireless/ti/wl12xx/Kconfig | 1 +
drivers/net/wireless/ti/wlcore/Kconfig | 2 +-
drivers/net/wireless/ti/wlcore/acx.c | 80 ++++
drivers/net/wireless/ti/wlcore/acx.h | 30 ++
drivers/net/wireless/ti/wlcore/boot.c | 3 +-
drivers/net/wireless/ti/wlcore/cmd.c | 8 +-
drivers/net/wireless/ti/wlcore/event.c | 29 +-
drivers/net/wireless/ti/wlcore/main.c | 323 +++++++++++++++-
drivers/net/wireless/ti/wlcore/rx.c | 36 ++
drivers/net/wireless/ti/wlcore/rx.h | 4 +
drivers/net/wireless/ti/wlcore/wl12xx.h | 41 ++
drivers/net/wireless/ti/wlcore/wlcore.h | 6 +
drivers/net/xen-netfront.c | 6 -
drivers/nfc/Kconfig | 13 +
drivers/nfc/Makefile | 1 +
drivers/nfc/pn533.c | 19 +-
drivers/nfc/pn544_hci.c | 947 +++++++++++++++++++++++++++++++++++++++++++++
drivers/ssb/b43_pci_bridge.c | 2 +
drivers/ssb/pci.c | 88 ++++-
fs/dcache.c | 2 +
fs/inode.c | 2 +
include/linux/Kbuild | 1 +
include/linux/bcma/bcma.h | 7 +
include/linux/bcma/bcma_driver_pci.h | 11 +
include/linux/bootmem.h | 3 +-
include/linux/if_arp.h | 2 +-
include/linux/ipx.h | 2 +-
include/linux/micrel_phy.h | 2 +-
include/linux/nfc/pn544.h | 7 +
include/linux/nl80211.h | 8 +-
include/linux/ssb/ssb.h | 1 -
include/linux/ssb/ssb_regs.h | 61 ++-
include/net/bluetooth/bluetooth.h | 32 +-
include/net/bluetooth/hci.h | 8 +-
include/net/bluetooth/hci_core.h | 67 ++--
include/net/bluetooth/l2cap.h | 93 ++++-
include/net/bluetooth/mgmt.h | 9 +
include/net/bluetooth/smp.h | 2 +-
include/net/cfg80211.h | 6 +-
include/net/mac80211.h | 12 +-
include/net/nfc/hci.h | 6 +-
include/net/nfc/nfc.h | 19 +-
include/net/nfc/shdlc.h | 2 +
kernel/pid.c | 3 +-
mm/page_alloc.c | 7 +-
net/bluetooth/af_bluetooth.c | 8 +-
net/bluetooth/bnep/core.c | 2 +-
net/bluetooth/hci_conn.c | 56 +--
net/bluetooth/hci_core.c | 267 ++++++-------
net/bluetooth/hci_event.c | 75 +++-
net/bluetooth/hci_sysfs.c | 5 +-
net/bluetooth/l2cap_core.c | 762 +++++++++++++++++++++++++-----------
net/bluetooth/l2cap_sock.c | 76 ++--
net/bluetooth/mgmt.c | 286 ++++++++------
net/bluetooth/rfcomm/sock.c | 14 +-
net/bluetooth/sco.c | 75 ++--
net/bluetooth/smp.c | 2 +-
net/ipv4/fib_semantics.c | 12 +-
net/ipv4/route.c | 1 +
net/ipv4/tcp.c | 2 +
net/ipv4/tcp_input.c | 5 +
net/ipv4/udp.c | 30 +-
net/mac80211/agg-tx.c | 10 +-
net/mac80211/debugfs_netdev.c | 2 +
net/mac80211/ibss.c | 5 +
net/mac80211/iface.c | 4 +-
net/mac80211/main.c | 3 +
net/mac80211/mesh.c | 6 +-
net/mac80211/mesh_hwmp.c | 5 +-
net/mac80211/mesh_plink.c | 65 ++--
net/mac80211/rx.c | 6 +-
net/mac80211/wep.c | 15 +-
net/mac80211/wpa.c | 10 +-
net/nfc/core.c | 112 ++++--
net/nfc/hci/Kconfig | 1 +
net/nfc/hci/core.c | 78 ++--
net/nfc/hci/shdlc.c | 12 +
net/nfc/llcp/commands.c | 4 +-
net/nfc/llcp/llcp.c | 7 +
net/nfc/llcp/sock.c | 57 ++-
net/nfc/nci/core.c | 27 +-
net/nfc/nci/data.c | 8 +-
net/nfc/nci/lib.c | 1 +
net/nfc/nci/ntf.c | 2 +-
net/nfc/netlink.c | 6 +-
net/nfc/nfc.h | 2 +-
net/wireless/chan.c | 2 +-
net/wireless/core.c | 4 +-
net/wireless/core.h | 2 -
net/wireless/nl80211.c | 69 ++--
net/wireless/util.c | 2 +-
189 files changed, 6665 insertions(+), 5539 deletions(-)
delete mode 100644 drivers/net/wireless/brcm80211/brcmsmac/nicpci.c
delete mode 100644 drivers/net/wireless/brcm80211/brcmsmac/nicpci.h
delete mode 100644 drivers/net/wireless/brcm80211/brcmsmac/otp.c
delete mode 100644 drivers/net/wireless/brcm80211/brcmsmac/otp.h
delete mode 100644 drivers/net/wireless/brcm80211/brcmsmac/srom.c
delete mode 100644 drivers/net/wireless/brcm80211/brcmsmac/srom.h
create mode 100644 drivers/net/wireless/mwifiex/ie.c
create mode 100644 drivers/net/wireless/mwifiex/uap_cmd.c
create mode 100644 drivers/nfc/pn544_hci.c
* Re: [PATCH] Bluetooth: Fix null pointer dereference in l2cap_chan_send
From: Minho Ban @ 2012-05-24 6:32 UTC (permalink / raw)
To: Chanyeol Park
Cc: Gustavo Padovan, Marcel Holtmann, Johan Hedberg, David S. Miller,
linux-bluetooth, netdev, linux-kernel
In-Reply-To: <4FBB8828.303@samsung.com>
On 05/22/2012 09:35 PM, Chanyeol Park wrote:
> Hi
> On 05/21/2012 09:58, Minho Ban wrote:
>> diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
>> index 3bb1611..98d4541 100644
>> --- a/net/bluetooth/l2cap_sock.c
>> +++ b/net/bluetooth/l2cap_sock.c
>> @@ -727,10 +727,12 @@ static int l2cap_sock_sendmsg(struct kiocb *iocb, struct socket *sock, struct ms
>> if (msg->msg_flags& MSG_OOB)
>> return -EOPNOTSUPP;
>>
>> - if (sk->sk_state != BT_CONNECTED)
>> + l2cap_chan_lock(chan);
>> + if (sk->sk_state != BT_CONNECTED || !chan->conn) {
>> + l2cap_chan_unlock(chan);
>> return -ENOTCONN;
>> + }
>>
>> - l2cap_chan_lock(chan);
>> err = l2cap_chan_send(chan, msg, len, sk->sk_priority);
>> l2cap_chan_unlock(chan);
>>
> Besides the !chan->conn condition, I think it makes sense to move the sk_state check after l2cap_chan_lock(),
> because sk_state could be changed by l2cap_conn_del().
>
Thanks. The chan->conn condition is not necessary; moving the sk->sk_state != BT_CONNECTED check behind l2cap_chan_lock() is enough.
I'll amend this patch.
Regards
Minho Ban
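
The point agreed on above (test the connection state only while holding the channel lock) can be illustrated with a small userspace sketch. It is not the amended patch: `struct chan`, `chan_lock()`, `chan_close()` and `chan_send()` are hypothetical stand-ins for the l2cap channel, l2cap_chan_lock(), the teardown path and l2cap_sock_sendmsg(), and a C11 `atomic_flag` spinlock is assumed in place of the kernel lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real l2cap structures. */
enum chan_state { CHAN_CONNECTED, CHAN_CLOSED };

struct chan {
	atomic_flag lock;      /* toy spinlock standing in for the channel lock */
	enum chan_state state; /* stands in for sk->sk_state */
	size_t bytes_sent;
};

static void chan_lock(struct chan *c)
{
	/* Spin until the flag was clear before we set it. */
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
}

static void chan_unlock(struct chan *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Teardown path: the state only changes while the lock is held. */
static void chan_close(struct chan *c)
{
	chan_lock(c);
	c->state = CHAN_CLOSED;
	chan_unlock(c);
}

/*
 * Send path: the state is tested after the lock is taken, so a
 * concurrent chan_close() cannot slip in between the test and the send.
 */
static int chan_send(struct chan *c, size_t len)
{
	int err = 0;

	chan_lock(c);
	if (c->state != CHAN_CONNECTED)
		err = -ENOTCONN;
	else
		c->bytes_sent += len;
	chan_unlock(c);
	return err;
}
```

With the check outside the lock, a close could run after the test but before the send; with the check inside, the send either completes first or observes the closed state and returns -ENOTCONN.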
* Re: [PATCH 06/15] batman-adv: Distributed ARP Table - add snooping functions for ARP messages
From: Simon Wunderlich @ 2012-05-24 8:09 UTC (permalink / raw)
To: David Miller
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r,
lindner_marek-LWAfsSFWpa4
In-Reply-To: <20120524.015457.1543147002306809286.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Hey David,
thanks for your answer,
On Thu, May 24, 2012 at 01:54:57AM -0400, David Miller wrote:
> From: Sven Eckelmann <sven-KaDOiPu9UxWEi8DpZVb4nw@public.gmane.org>
> Date: Thu, 24 May 2012 07:34:12 +0200
>
> > _You_ were the person that declined the pull request because _you_ wanted to
> > rewrite the ARP handling. So _you_ are the person that has the insight in
> > _your_ plans. Either _you_ tell us what is _your_ problem with it or _you_
> > will have to point us to a person that knows _you_.
>
> If I say that you must not use ARP nor neighbour layer internals, it
> doesn't mean that I have to come up with the alternative
> implementation for you.
Well, that pretty much answers it. If we must not use ARP or neighbour
internals, even after your rewrite (?), we have to come up with an alternative
in any case (write our own backend).
We don't expect you to come up with an alternative implementation, but
as you are the one accepting the patches (or not), we need to know why
you decline something and what the problem is so we can work around
it or improve.
Thanks
Simon
* kernel 3.4.0, pppoe NAS, possible circular locking
From: Denys Fedoryshchenko @ 2012-05-24 8:44 UTC (permalink / raw)
To: netdev
Hi, I upgraded one of the NAS servers and got this:
[ 177.597130] ======================================================
[ 177.597234] [ INFO: possible circular locking dependency detected ]
[ 177.597339] 3.4.0-build-0061 #10 Not tainted
[ 177.597438] -------------------------------------------------------
[ 177.597543] swapper/0/0 is trying to acquire lock:
[ 177.597642] (_xmit_PPP#2){+.-...}, at: [<c02f09b9>] sch_direct_xmit+0x36/0x119
[ 177.597892]
[ 177.597892] but task is already holding lock:
[ 177.597892] (&(&sch->busylock)->rlock){+.-...}, at: [<c02e0808>] dev_queue_xmit+0x1d6/0x418
[ 177.597892]
[ 177.597892] which lock already depends on the new lock.
[ 177.597892]
[ 177.597892]
[ 177.597892] the existing dependency chain (in reverse order) is:
[ 177.597892]
[ 177.597892] -> #3 (&(&sch->busylock)->rlock){+.-...}:
[ 177.597892] [<c015a6d1>] lock_acquire+0x71/0x85
[ 177.597892] [<c034de94>] _raw_spin_lock_irqsave+0x40/0x50
[ 177.597892] [<c017c1f2>] get_page_from_freelist+0x227/0x398
[ 177.597892] [<c017c5a7>] __alloc_pages_nodemask+0xef/0x5f9
[ 177.597892] [<c019c34f>] alloc_slab_page+0x1d/0x21
[ 177.597892] [<c019c39f>] new_slab+0x4c/0x164
[ 177.597892] [<c019d259>] __slab_alloc.clone.59.clone.64+0x247/0x2de
[ 177.597892] [<c019d6ee>] __kmalloc+0x55/0xa3
[ 177.597892] [<c02d70ce>] pskb_expand_head+0xbe/0x200
[ 177.597892] [<c02a8e91>] __pppoe_xmit+0xb0/0x145
[ 177.597892] [<c02a8f30>] pppoe_xmit+0xa/0xc
[ 177.597892] [<c02a6c71>] ppp_channel_push+0x3d/0x94
[ 177.597892] [<c02a6d72>] ppp_write+0x99/0xa1
[ 177.597892] [<c01a32b4>] vfs_write+0x7e/0xab
[ 177.597892] [<c01a3424>] sys_write+0x3d/0x5e
[ 177.597892] [<c034e511>] syscall_call+0x7/0xb
[ 177.597892]
[ 177.597892] -> #2 (&(&pch->downl)->rlock){+.-...}:
[ 177.597892] [<c015a6d1>] lock_acquire+0x71/0x85
[ 177.597892] [<c034df86>] _raw_spin_lock_bh+0x38/0x45
[ 177.597892] [<c02a523a>] ppp_push+0x59/0x4b3
[ 177.597892] [<c02a6a69>] ppp_xmit_process+0x41b/0x4be
[ 177.597892] [<c02a6d69>] ppp_write+0x90/0xa1
[ 177.597892] [<c01a32b4>] vfs_write+0x7e/0xab
[ 177.597892] [<c01a3424>] sys_write+0x3d/0x5e
[ 177.597892] [<c034e511>] syscall_call+0x7/0xb
[ 177.597892]
[ 177.597892] -> #1 (&(&ppp->wlock)->rlock){+.-...}:
[ 177.597892] [<c015a6d1>] lock_acquire+0x71/0x85
[ 177.597892] [<c034df86>] _raw_spin_lock_bh+0x38/0x45
[ 177.597892] [<c02a6667>] ppp_xmit_process+0x19/0x4be
[ 177.597892] [<c02a6c19>] ppp_start_xmit+0x10d/0x128
[ 177.597892] [<c02e0573>] dev_hard_start_xmit+0x333/0x3f2
[ 177.597892] [<c02f09d8>] sch_direct_xmit+0x55/0x119
[ 177.597892] [<c02e08b4>] dev_queue_xmit+0x282/0x418
[ 177.597892] [<c02e65c6>] neigh_direct_output+0xa/0xc
[ 177.597892] [<c03039e0>] ip_finish_output2+0x1e1/0x21c
[ 177.597892] [<c0303a50>] ip_finish_output+0x35/0x39
[ 177.597892] [<c03048c7>] ip_output+0x87/0x8c
[ 177.597892] [<c0301c9a>] ip_forward_finish+0x56/0x5a
[ 177.597892] [<c0301e9e>] ip_forward+0x200/0x2a2
[ 177.597892] [<c0300969>] ip_rcv_finish+0x31a/0x33c
[ 177.597892] [<c03009d1>] NF_HOOK.clone.11+0x46/0x4d
[ 177.597892] [<c0300cec>] ip_rcv+0x201/0x23d
[ 177.597892] [<c02deca7>] __netif_receive_skb+0x329/0x378
[ 177.597892] [<c02dee74>] netif_receive_skb+0x4e/0x7d
[ 177.597892] [<f846fef3>] rtl8139_poll+0x243/0x33d [8139too]
[ 177.597892] [<c02df48f>] net_rx_action+0x90/0x15d
[ 177.597892] [<c012b42d>] __do_softirq+0x7b/0x118
[ 177.597892]
[ 177.597892] -> #0 (_xmit_PPP#2){+.-...}:
[ 177.597892] [<c015a08b>] __lock_acquire+0x9a3/0xc27
[ 177.597892] [<c015a6d1>] lock_acquire+0x71/0x85
[ 177.597892] [<c034ddad>] _raw_spin_lock+0x33/0x40
[ 177.597892] [<c02f09b9>] sch_direct_xmit+0x36/0x119
[ 177.597892] [<c02e08b4>] dev_queue_xmit+0x282/0x418
[ 177.597892] [<c02e65c6>] neigh_direct_output+0xa/0xc
[ 177.597892] [<c03039e0>] ip_finish_output2+0x1e1/0x21c
[ 177.597892] [<c0303a50>] ip_finish_output+0x35/0x39
[ 177.597892] [<c03048c7>] ip_output+0x87/0x8c
[ 177.597892] [<c03030c6>] dst_output+0x15/0x18
[ 177.597892] [<c03042d7>] ip_local_out+0x17/0x1a
[ 177.597892] [<c0304f59>] ip_send_skb+0x12/0x5c
[ 177.597892] [<c0304fcd>] ip_push_pending_frames+0x2a/0x2e
[ 177.597892] [<c0320a7a>] icmp_push_reply+0xf9/0x101
[ 177.597892] [<c0320f1c>] icmp_reply+0x10e/0x12d
[ 177.597892] [<c0321050>] icmp_echo+0x59/0x5f
[ 177.597892] [<c032169f>] icmp_rcv+0xfd/0x11a
[ 177.597892] [<c030055c>] ip_local_deliver_finish+0x13a/0x1e9
[ 177.597892] [<c03009d1>] NF_HOOK.clone.11+0x46/0x4d
[ 177.597892] [<c0300ae7>] ip_local_deliver+0x41/0x45
[ 177.597892] [<c0300969>] ip_rcv_finish+0x31a/0x33c
[ 177.597892] [<c03009d1>] NF_HOOK.clone.11+0x46/0x4d
[ 177.597892] [<c0300cec>] ip_rcv+0x201/0x23d
[ 177.597892] [<c02deca7>] __netif_receive_skb+0x329/0x378
[ 177.597892] [<c02ded5f>] process_backlog+0x69/0x130
[ 177.597892] [<c02df48f>] net_rx_action+0x90/0x15d
[ 177.597892] [<c012b42d>] __do_softirq+0x7b/0x118
[ 177.597892]
[ 177.597892] other info that might help us debug this:
[ 177.597892]
[ 177.597892] Chain exists of:
[ 177.597892] _xmit_PPP#2 --> &(&pch->downl)->rlock --> &(&sch->busylock)->rlock
[ 177.597892]
[ 177.597892] Possible unsafe locking scenario:
[ 177.597892]
[ 177.597892]        CPU0                              CPU1
[ 177.597892]        ----                              ----
[ 177.597892]   lock(&(&sch->busylock)->rlock);
[ 177.597892]                                          lock(&(&pch->downl)->rlock);
[ 177.597892]                                          lock(&(&sch->busylock)->rlock);
[ 177.597892]   lock(_xmit_PPP#2);
[ 177.597892]
[ 177.597892] *** DEADLOCK ***
[ 177.597892]
[ 177.597892] 6 locks held by swapper/0/0:
[ 177.597892] #0: (rcu_read_lock){.+.+..}, at: [<c02dbf9c>] rcu_lock_acquire+0x0/0x30
[ 177.597892] #1: (rcu_read_lock){.+.+..}, at: [<c0300453>] ip_local_deliver_finish+0x31/0x1e9
[ 177.597892] #2: (slock-AF_INET){+.-...}, at: [<c0320c40>] icmp_xmit_lock.clone.19+0x1f/0x2f
[ 177.597892] #3: (rcu_read_lock){.+.+..}, at: [<c0302fad>] rcu_read_lock+0x0/0x35
[ 177.597892] #4: (rcu_read_lock_bh){.+....}, at: [<c02dbf9c>] rcu_lock_acquire+0x0/0x30
[ 177.597892] #5: (&(&sch->busylock)->rlock){+.-...}, at: [<c02e0808>] dev_queue_xmit+0x1d6/0x418
[ 177.597892]
[ 177.597892] stack backtrace:
[ 177.597892] Pid: 0, comm: swapper/0 Not tainted 3.4.0-build-0061 #10
[ 177.597892] Call Trace:
[ 177.597892] [<c034c156>] ? printk+0x18/0x1a
[ 177.597892] [<c0158a74>] print_circular_bug+0x1ac/0x1b6
[ 177.597892] [<c015a08b>] __lock_acquire+0x9a3/0xc27
[ 177.597892] [<c0159256>] ? check_irq_usage+0x76/0x86
[ 177.597892] [<c015a6d1>] lock_acquire+0x71/0x85
[ 177.597892] [<c02f09b9>] ? sch_direct_xmit+0x36/0x119
[ 177.597892] [<c034ddad>] _raw_spin_lock+0x33/0x40
[ 177.597892] [<c02f09b9>] ? sch_direct_xmit+0x36/0x119
[ 177.597892] [<c02f09b9>] sch_direct_xmit+0x36/0x119
[ 177.597892] [<c02e08b4>] dev_queue_xmit+0x282/0x418
[ 177.597892] [<c0302fad>] ? ip_generic_getfrag+0x6e/0x6e
[ 177.597892] [<c02e65c6>] neigh_direct_output+0xa/0xc
[ 177.597892] [<c03039e0>] ip_finish_output2+0x1e1/0x21c
[ 177.597892] [<c02fcce6>] ? ipv4_mtu+0x36/0x65
[ 177.597892] [<c0303a50>] ip_finish_output+0x35/0x39
[ 177.597892] [<c03048c7>] ip_output+0x87/0x8c
[ 177.597892] [<c0303a1b>] ? ip_finish_output2+0x21c/0x21c
[ 177.597892] [<c03030c6>] dst_output+0x15/0x18
[ 177.597892] [<c03042d7>] ip_local_out+0x17/0x1a
[ 177.597892] [<c0304f59>] ip_send_skb+0x12/0x5c
[ 177.597892] [<c0304fcd>] ip_push_pending_frames+0x2a/0x2e
[ 177.597892] [<c0320a7a>] icmp_push_reply+0xf9/0x101
[ 177.597892] [<c0320f1c>] icmp_reply+0x10e/0x12d
[ 177.597892] [<c0321050>] icmp_echo+0x59/0x5f
[ 177.597892] [<f85af28d>] ? nf_nat_fn+0x121/0x12d [iptable_nat]
[ 177.597892] [<c0320cdb>] ? skb_dst.clone.21+0x1e/0x44
[ 177.597892] [<c032169f>] icmp_rcv+0xfd/0x11a
[ 177.597892] [<c030055c>] ip_local_deliver_finish+0x13a/0x1e9
[ 177.597892] [<c0300453>] ? ip_local_deliver_finish+0x31/0x1e9
[ 177.597892] [<c0300422>] ? pskb_may_pull+0x30/0x30
[ 177.597892] [<c03009d1>] NF_HOOK.clone.11+0x46/0x4d
[ 177.597892] [<c0300422>] ? pskb_may_pull+0x30/0x30
[ 177.597892] [<c0300ae7>] ip_local_deliver+0x41/0x45
[ 177.597892] [<c0300422>] ? pskb_may_pull+0x30/0x30
[ 177.597892] [<c0300969>] ip_rcv_finish+0x31a/0x33c
[ 177.597892] [<c030064f>] ? skb_dst.clone.10+0x44/0x44
[ 177.597892] [<c03009d1>] NF_HOOK.clone.11+0x46/0x4d
[ 177.597892] [<c030064f>] ? skb_dst.clone.10+0x44/0x44
[ 177.597892] [<c0300cec>] ip_rcv+0x201/0x23d
[ 177.597892] [<c030064f>] ? skb_dst.clone.10+0x44/0x44
[ 177.597892] [<c02deca7>] __netif_receive_skb+0x329/0x378
[ 177.597892] [<c02ded5f>] process_backlog+0x69/0x130
[ 177.597892] [<c02df48f>] net_rx_action+0x90/0x15d
[ 177.597892] [<c012b42d>] __do_softirq+0x7b/0x118
[ 177.597892] [<c013236e>] ? do_send_specific+0xb/0x8f
[ 177.597892] [<c012b3b2>] ? local_bh_enable+0xd/0xd
[ 177.597892] <IRQ> [<c012b648>] ? irq_exit+0x41/0x91
[ 177.597892] [<c0103c73>] ? do_IRQ+0x79/0x8d
[ 177.597892] [<c0158011>] ? trace_hardirqs_off_caller+0x2e/0x86
[ 177.597892] [<c034f2ee>] ? common_interrupt+0x2e/0x34
[ 177.597892] [<c015007b>] ? ktime_get_ts+0x8f/0x9b
[ 177.597892] [<c0108a0a>] ? mwait_idle+0x50/0x5a
[ 177.597892] [<c01091ac>] ? cpu_idle+0x55/0x6f
[ 177.597892] [<c033e2b1>] ? rest_init+0xa1/0xa7
[ 177.597892] [<c033e210>] ? __read_lock_failed+0x14/0x14
[ 177.597892] [<c049874f>] ? start_kernel+0x30d/0x314
[ 177.597892] [<c0498209>] ? repair_env_string+0x51/0x51
[ 177.597892] [<c04980a8>] ? i386_start_kernel+0xa8/0xaf
---
Denys Fedoryshchenko, Network Engineer, Virtual ISP S.A.L.
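
For readers unfamiliar with this kind of report, here is a toy model of the check behind it: record every "lock B was taken while holding lock A" edge, and refuse a new edge if the wanted lock can already reach the held lock. This is only a sketch, not lockdep's implementation. The four lock names are taken from the report, the edges follow its numbered dependency chain (#1 to #3) plus the acquisition that triggered the warning (#0), and `struct lock_graph`, `reachable()` and `record_dependency()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Lock classes named in the report; the enum itself is illustrative. */
enum { XMIT_PPP, PPP_WLOCK, PCH_DOWNL, SCH_BUSYLOCK, NR_LOCKS };

struct lock_graph {
	/* after[a][b] is true if b was ever acquired while a was held. */
	bool after[NR_LOCKS][NR_LOCKS];
};

/* Depth-first search: can we get from 'from' to 'to' along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (g->after[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "wanted is acquired while held is held".
 * Returns false, recording nothing, if that edge would close a cycle.
 */
static bool record_dependency(struct lock_graph *g, int held, int wanted)
{
	bool seen[NR_LOCKS] = { false };

	if (reachable(g, wanted, held, seen))
		return false;
	g->after[held][wanted] = true;
	return true;
}
```

Feeding it the chain from the report (_xmit_PPP then wlock, wlock then downl, downl then busylock) succeeds three times; the fourth edge, busylock then _xmit_PPP, is rejected because _xmit_PPP already leads to busylock.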
* [PATCH 01/21] datapath: tunnelling: Replace tun_id with tun_key
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
This is a first pass at providing a tun_key which can be used
as the basis for flow-based tunnelling. The tun_key includes and
replaces the tun_id in both struct ovs_skb_cb and struct sw_flow_key.
In ovs_skb_cb tun_key is a pointer, as it is envisaged that it will grow
when support for IPv6 is added, to the extent that inlining the structure
would make ovs_skb_cb larger than the 48 bytes available in skb->cb.
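
A rough illustration of the size argument, under stated assumptions: the structures below are hypothetical and are not the ones in this patch. The key fields mirror the names used by tun_key_init() plus tun_flags, widened to 16-byte addresses as an IPv6-capable key might be, and the control block is reduced to a flow pointer plus the key. The real ovs_skb_cb carries further fields, which would only make the inline case larger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKB_CB_SIZE 48 /* bytes available in skb->cb, per the description */

/* Hypothetical IPv6-capable tunnel key; layout and widths are assumptions. */
struct tun_key_v6_sketch {
	uint64_t tun_id;
	uint32_t tun_flags;
	uint8_t  src[16];
	uint8_t  dst[16];
	uint8_t  tos;
	uint8_t  ttl;
};

/* Control block that stores only a pointer to the key. */
struct cb_with_pointer {
	void *flow;
	struct tun_key_v6_sketch *tun_key;
};

/* Control block that embeds the key. */
struct cb_with_inline_key {
	void *flow;
	struct tun_key_v6_sketch tun_key;
};

/* True if a control block of this size fits in skb->cb. */
static bool fits_in_cb(size_t size)
{
	return size <= SKB_CB_SIZE;
}
```

In this sketch the pointer variant fits comfortably while the inline variant does not, which is the trade-off the paragraph above describes. The cost of the pointer is that whatever it points at must stay valid for as long as the skb carries it.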
As OVS does not support IPv6 as the outer transport protocol for tunnels
the IPv6 portions of this change, which appeared in the previous revision,
have been dropped in order to limit the scope and size of this patch.
This patch does not make any effort to retain the existing tun_id behaviour
nor does it fully implement flow-based tunnels. As such it is incomplete
and can't be used in its current form (other than to break OVS tunnelling).
** Please do not apply **
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
v4
* Add tun_flags to ovs_key_ipv4_tunnel
* Correct format in format_odp_key_attr()
v3
* Rework, actually works in limited scenarios
v2
* Use pointer to struct ovs_key_ipv4_tunnel in OVS_CB()
rather than having a struct ovs_key_ipv4_tunnel in OVS_CB()
v1
* Initial post
---
datapath/actions.c | 6 +++---
datapath/datapath.c | 10 +++++++++-
datapath/datapath.h | 5 +++--
datapath/flow.c | 34 +++++++++++++++++++++++-----------
datapath/flow.h | 27 +++++++++++++++++++++++----
datapath/tunnel.c | 24 +++++++++++++-----------
datapath/tunnel.h | 5 +++--
datapath/vport-capwap.c | 12 ++++++------
datapath/vport-gre.c | 21 +++++++++++----------
datapath/vport.c | 2 +-
include/linux/openvswitch.h | 13 ++++++++++++-
lib/dpif-netdev.c | 1 +
lib/odp-util.c | 13 +++++++++++++
lib/odp-util.h | 5 +++--
14 files changed, 124 insertions(+), 54 deletions(-)
diff --git a/datapath/actions.c b/datapath/actions.c
index 208f260..7b2ea25 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -342,8 +342,8 @@ static int execute_set_action(struct sk_buff *skb,
skb->priority = nla_get_u32(nested_attr);
break;
- case OVS_KEY_ATTR_TUN_ID:
- OVS_CB(skb)->tun_id = nla_get_be64(nested_attr);
+ case OVS_KEY_ATTR_IPV4_TUNNEL:
+ OVS_CB(skb)->tun_key = nla_data(nested_attr);
break;
case OVS_KEY_ATTR_ETHERNET:
@@ -469,7 +469,7 @@ int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb)
goto out_loop;
}
- OVS_CB(skb)->tun_id = 0;
+ OVS_CB(skb)->tun_key = NULL;
error = do_execute_actions(dp, skb, acts->actions,
acts->actions_len, false);
diff --git a/datapath/datapath.c b/datapath/datapath.c
index a4376a0..65dfe79 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -587,12 +587,20 @@ static int validate_set(const struct nlattr *a,
switch (key_type) {
const struct ovs_key_ipv4 *ipv4_key;
+ const struct ovs_key_ipv4_tunnel *tun_key;
case OVS_KEY_ATTR_PRIORITY:
case OVS_KEY_ATTR_TUN_ID:
case OVS_KEY_ATTR_ETHERNET:
break;
+ case OVS_KEY_ATTR_IPV4_TUNNEL:
+ tun_key = nla_data(ovs_key);
+ if (!tun_key->ipv4_dst) {
+ return -EINVAL;
+ }
+ break;
+
case OVS_KEY_ATTR_IPV4:
if (flow_key->eth.type != htons(ETH_P_IP))
return -EINVAL;
@@ -785,7 +793,7 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
err = ovs_flow_metadata_from_nlattrs(&flow->key.phy.priority,
&flow->key.phy.in_port,
- &flow->key.phy.tun_id,
+ &flow->key.phy.tun_key,
a[OVS_PACKET_ATTR_KEY]);
if (err)
goto err_flow_put;
diff --git a/datapath/datapath.h b/datapath/datapath.h
index affbf0e..de0b28d 100644
--- a/datapath/datapath.h
+++ b/datapath/datapath.h
@@ -96,7 +96,7 @@ struct datapath {
/**
* struct ovs_skb_cb - OVS data in skb CB
* @flow: The flow associated with this packet. May be %NULL if no flow.
- * @tun_id: ID of the tunnel that encapsulated this packet. It is 0 if the
+ * @tun_key: Key for the tunnel that encapsulated this packet.
* @ip_summed: Consistently stores L4 checksumming status across different
* kernel versions.
* @csum_start: Stores the offset from which to start checksumming independent
@@ -107,7 +107,7 @@ struct datapath {
*/
struct ovs_skb_cb {
struct sw_flow *flow;
- __be64 tun_id;
+ struct ovs_key_ipv4_tunnel *tun_key;
#ifdef NEED_CSUM_NORMALIZE
enum csum_type ip_summed;
u16 csum_start;
@@ -192,4 +192,5 @@ struct sk_buff *ovs_vport_cmd_build_info(struct vport *, u32 pid, u32 seq,
u8 cmd);
int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb);
+
#endif /* datapath.h */
diff --git a/datapath/flow.c b/datapath/flow.c
index d07337c..49c0dd8 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -629,7 +629,8 @@ int ovs_flow_extract(struct sk_buff *skb, u16 in_port, struct sw_flow_key *key,
memset(key, 0, sizeof(*key));
key->phy.priority = skb->priority;
- key->phy.tun_id = OVS_CB(skb)->tun_id;
+ if (OVS_CB(skb)->tun_key)
+ key->phy.tun_key = *OVS_CB(skb)->tun_key;
key->phy.in_port = in_port;
skb_reset_mac_header(skb);
@@ -847,6 +848,7 @@ const int ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = {
/* Not upstream. */
[OVS_KEY_ATTR_TUN_ID] = sizeof(__be64),
+ [OVS_KEY_ATTR_IPV4_TUNNEL] = sizeof(struct ovs_key_ipv4_tunnel),
};
static int ipv4_flow_from_nlattrs(struct sw_flow_key *swkey, int *key_len,
@@ -1022,9 +1024,11 @@ int ovs_flow_from_nlattrs(struct sw_flow_key *swkey, int *key_lenp,
swkey->phy.in_port = DP_MAX_PORTS;
}
- if (attrs & (1ULL << OVS_KEY_ATTR_TUN_ID)) {
- swkey->phy.tun_id = nla_get_be64(a[OVS_KEY_ATTR_TUN_ID]);
- attrs &= ~(1ULL << OVS_KEY_ATTR_TUN_ID);
+ if (attrs & (1ULL << OVS_KEY_ATTR_IPV4_TUNNEL)) {
+ struct ovs_key_ipv4_tunnel *tun_key;
+ tun_key = nla_data(a[OVS_KEY_ATTR_IPV4_TUNNEL]);
+ swkey->phy.tun_key = *tun_key;
+ attrs &= ~(1ULL << OVS_KEY_ATTR_IPV4_TUNNEL);
}
/* Data attributes. */
@@ -1162,14 +1166,15 @@ int ovs_flow_from_nlattrs(struct sw_flow_key *swkey, int *key_lenp,
* get the metadata, that is, the parts of the flow key that cannot be
* extracted from the packet itself.
*/
-int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port, __be64 *tun_id,
+int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port,
+ struct ovs_key_ipv4_tunnel *tun_key,
const struct nlattr *attr)
{
const struct nlattr *nla;
int rem;
*in_port = DP_MAX_PORTS;
- *tun_id = 0;
+ tun_key->tun_id = 0;
*priority = 0;
nla_for_each_nested(nla, attr, rem) {
@@ -1184,8 +1189,9 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port, __be64 *tun_id,
*priority = nla_get_u32(nla);
break;
- case OVS_KEY_ATTR_TUN_ID:
- *tun_id = nla_get_be64(nla);
+ case OVS_KEY_ATTR_IPV4_TUNNEL:
+ memcpy(tun_key, nla_data(nla),
+ sizeof(*tun_key));
break;
case OVS_KEY_ATTR_IN_PORT:
@@ -1204,15 +1210,21 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port, __be64 *tun_id,
int ovs_flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb)
{
struct ovs_key_ethernet *eth_key;
+ struct ovs_key_ipv4_tunnel *tun_key;
struct nlattr *nla, *encap;
if (swkey->phy.priority &&
nla_put_u32(skb, OVS_KEY_ATTR_PRIORITY, swkey->phy.priority))
goto nla_put_failure;
- if (swkey->phy.tun_id != cpu_to_be64(0) &&
- nla_put_be64(skb, OVS_KEY_ATTR_TUN_ID, swkey->phy.tun_id))
- goto nla_put_failure;
+ if (swkey->phy.tun_key.ipv4_dst) {
+ nla = nla_reserve(skb, OVS_KEY_ATTR_IPV4_TUNNEL,
+ sizeof(*tun_key));
+ if (!nla)
+ goto nla_put_failure;
+ tun_key = nla_data(nla);
+ *tun_key = swkey->phy.tun_key;
+ }
if (swkey->phy.in_port != DP_MAX_PORTS &&
nla_put_u32(skb, OVS_KEY_ATTR_IN_PORT, swkey->phy.in_port))
diff --git a/datapath/flow.h b/datapath/flow.h
index 5be481e..bab5363 100644
--- a/datapath/flow.h
+++ b/datapath/flow.h
@@ -42,7 +42,7 @@ struct sw_flow_actions {
struct sw_flow_key {
struct {
- __be64 tun_id; /* Encapsulating tunnel ID. */
+ struct ovs_key_ipv4_tunnel tun_key; /* Encapsulating tunnel key. */
u32 priority; /* Packet QoS priority. */
u16 in_port; /* Input switch port (or DP_MAX_PORTS). */
} phy;
@@ -150,6 +150,7 @@ u64 ovs_flow_used_time(unsigned long flow_jiffies);
* ------ --- ------ -----
* OVS_KEY_ATTR_PRIORITY 4 -- 4 8
* OVS_KEY_ATTR_TUN_ID 8 -- 4 12
+ * OVS_KEY_ATTR_IPV4_TUNNEL 18 2 4 24
* OVS_KEY_ATTR_IN_PORT 4 -- 4 8
* OVS_KEY_ATTR_ETHERNET 12 -- 4 16
* OVS_KEY_ATTR_8021Q 4 -- 4 8
@@ -158,14 +159,15 @@ u64 ovs_flow_used_time(unsigned long flow_jiffies);
* OVS_KEY_ATTR_ICMPV6 2 2 4 8
* OVS_KEY_ATTR_ND 28 -- 4 32
* -------------------------------------------------
- * total 144
+ * total 168
*/
-#define FLOW_BUFSIZE 144
+#define FLOW_BUFSIZE 168
int ovs_flow_to_nlattrs(const struct sw_flow_key *, struct sk_buff *);
int ovs_flow_from_nlattrs(struct sw_flow_key *swkey, int *key_lenp,
const struct nlattr *);
-int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port, __be64 *tun_id,
+int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port,
+ struct ovs_key_ipv4_tunnel *tun_key,
const struct nlattr *);
#define MAX_ACTIONS_BUFSIZE (16 * 1024)
@@ -204,4 +206,21 @@ u32 ovs_flow_hash(const struct sw_flow_key *key, int key_len);
struct sw_flow *ovs_flow_tbl_next(struct flow_table *table, u32 *bucket, u32 *idx);
extern const int ovs_key_lens[OVS_KEY_ATTR_MAX + 1];
+static inline void tun_key_swap_addr(struct ovs_key_ipv4_tunnel *tun_key)
+{
+ __be32 ndst = tun_key->ipv4_src;
+ tun_key->ipv4_src = tun_key->ipv4_dst;
+ tun_key->ipv4_dst = ndst;
+}
+
+static inline void tun_key_init(struct ovs_key_ipv4_tunnel *tun_key,
+ const struct iphdr *iph, __be64 tun_id)
+{
+ tun_key->tun_id = tun_id;
+ tun_key->ipv4_src = iph->saddr;
+ tun_key->ipv4_dst = iph->daddr;
+ tun_key->ipv4_tos = iph->tos;
+ tun_key->ipv4_ttl = iph->ttl;
+}
+
#endif /* flow.h */
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index d651c11..010e513 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -367,9 +367,9 @@ struct vport *ovs_tnl_find_port(struct net *net, __be32 saddr, __be32 daddr,
return NULL;
}
-static void ecn_decapsulate(struct sk_buff *skb, u8 tos)
+static void ecn_decapsulate(struct sk_buff *skb)
{
- if (unlikely(INET_ECN_is_ce(tos))) {
+ if (unlikely(INET_ECN_is_ce(OVS_CB(skb)->tun_key->ipv4_tos))) {
__be16 protocol = skb->protocol;
skb_set_network_header(skb, ETH_HLEN);
@@ -416,7 +416,7 @@ static void ecn_decapsulate(struct sk_buff *skb, u8 tos)
* - skb->csum does not include the inner Ethernet header.
* - The layer pointers are undefined.
*/
-void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb, u8 tos)
+void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb)
{
struct ethhdr *eh;
@@ -433,7 +433,7 @@ void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb, u8 tos)
skb_clear_rxhash(skb);
secpath_reset(skb);
- ecn_decapsulate(skb, tos);
+ ecn_decapsulate(skb);
vlan_set_tci(skb, 0);
if (unlikely(compute_ip_summed(skb, false))) {
@@ -613,12 +613,14 @@ static void ipv6_build_icmp(struct sk_buff *skb, struct sk_buff *nskb,
bool ovs_tnl_frag_needed(struct vport *vport,
const struct tnl_mutable_config *mutable,
- struct sk_buff *skb, unsigned int mtu, __be64 flow_key)
+ struct sk_buff *skb, unsigned int mtu,
+ struct ovs_key_ipv4_tunnel *tun_key)
{
unsigned int eth_hdr_len = ETH_HLEN;
unsigned int total_length = 0, header_length = 0, payload_length;
struct ethhdr *eh, *old_eh = eth_hdr(skb);
struct sk_buff *nskb;
+ struct ovs_key_ipv4_tunnel ntun_key;
/* Sanity check */
if (skb->protocol == htons(ETH_P_IP)) {
@@ -705,8 +707,10 @@ bool ovs_tnl_frag_needed(struct vport *vport,
* any way of synthesizing packets.
*/
if ((mutable->flags & (TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION)) ==
- (TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION))
- OVS_CB(nskb)->tun_id = flow_key;
+ (TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION)) {
+ ntun_key = *tun_key;
+ OVS_CB(nskb)->tun_key = &ntun_key;
+ }
if (unlikely(compute_ip_summed(nskb, false))) {
kfree_skb(nskb);
@@ -761,7 +765,7 @@ static bool check_mtu(struct sk_buff *skb,
if (packet_length > mtu &&
ovs_tnl_frag_needed(vport, mutable, skb, mtu,
- OVS_CB(skb)->tun_id))
+ OVS_CB(skb)->tun_key))
return false;
}
}
@@ -778,7 +782,7 @@ static bool check_mtu(struct sk_buff *skb,
if (packet_length > mtu &&
ovs_tnl_frag_needed(vport, mutable, skb, mtu,
- OVS_CB(skb)->tun_id))
+ OVS_CB(skb)->tun_key))
return false;
}
}
@@ -799,10 +803,8 @@ static void create_tunnel_header(const struct vport *vport,
iph->ihl = sizeof(struct iphdr) >> 2;
iph->frag_off = htons(IP_DF);
iph->protocol = tnl_vport->tnl_ops->ipproto;
- iph->tos = mutable->tos;
iph->daddr = rt->rt_dst;
iph->saddr = rt->rt_src;
- iph->ttl = mutable->ttl;
if (!iph->ttl)
iph->ttl = ip4_dst_hoplimit(&rt_dst(rt));
diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index 1924017..7d78297 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -269,14 +269,15 @@ int ovs_tnl_set_addr(struct vport *vport, const unsigned char *addr);
const char *ovs_tnl_get_name(const struct vport *vport);
const unsigned char *ovs_tnl_get_addr(const struct vport *vport);
int ovs_tnl_send(struct vport *vport, struct sk_buff *skb);
-void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb, u8 tos);
+void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb);
struct vport *ovs_tnl_find_port(struct net *net, __be32 saddr, __be32 daddr,
__be64 key, int tunnel_type,
const struct tnl_mutable_config **mutable);
bool ovs_tnl_frag_needed(struct vport *vport,
const struct tnl_mutable_config *mutable,
- struct sk_buff *skb, unsigned int mtu, __be64 flow_key);
+ struct sk_buff *skb, unsigned int mtu,
+ struct ovs_key_ipv4_tunnel *tun_key);
void ovs_tnl_free_linked_skbs(struct sk_buff *skb);
int ovs_tnl_init(void);
diff --git a/datapath/vport-capwap.c b/datapath/vport-capwap.c
index 05a099d..1e08d5a 100644
--- a/datapath/vport-capwap.c
+++ b/datapath/vport-capwap.c
@@ -220,7 +220,7 @@ static struct sk_buff *capwap_update_header(const struct vport *vport,
struct capwaphdr_wsi *wsi = (struct capwaphdr_wsi *)(cwh + 1);
struct capwaphdr_wsi_key *opt = (struct capwaphdr_wsi_key *)(wsi + 1);
- opt->key = OVS_CB(skb)->tun_id;
+ opt->key = OVS_CB(skb)->tun_key->tun_id;
}
udph->len = htons(skb->len - skb_transport_offset(skb));
@@ -316,6 +316,7 @@ static int capwap_rcv(struct sock *sk, struct sk_buff *skb)
struct vport *vport;
const struct tnl_mutable_config *mutable;
struct iphdr *iph;
+ struct ovs_key_ipv4_tunnel tun_key;
__be64 key = 0;
if (unlikely(!pskb_may_pull(skb, CAPWAP_MIN_HLEN + ETH_HLEN)))
@@ -333,12 +334,11 @@ static int capwap_rcv(struct sock *sk, struct sk_buff *skb)
goto error;
}
- if (mutable->flags & TNL_F_IN_KEY_MATCH)
- OVS_CB(skb)->tun_id = key;
- else
- OVS_CB(skb)->tun_id = 0;
+ tun_key_init(&tun_key, iph,
+ mutable->flags & TNL_F_IN_KEY_MATCH ? key : 0);
+ OVS_CB(skb)->tun_key = &tun_key;
- ovs_tnl_rcv(vport, skb, iph->tos);
+ ovs_tnl_rcv(vport, skb);
goto out;
error:
diff --git a/datapath/vport-gre.c b/datapath/vport-gre.c
index ab89c5b..fd2b038 100644
--- a/datapath/vport-gre.c
+++ b/datapath/vport-gre.c
@@ -101,10 +101,6 @@ static struct sk_buff *gre_update_header(const struct vport *vport,
__be32 *options = (__be32 *)(skb_network_header(skb) + mutable->tunnel_hlen
- GRE_HEADER_SECTION);
- /* Work backwards over the options so the checksum is last. */
- if (mutable->flags & TNL_F_OUT_KEY_ACTION)
- *options = be64_get_low32(OVS_CB(skb)->tun_id);
-
if (mutable->out_key || mutable->flags & TNL_F_OUT_KEY_ACTION)
options--;
@@ -285,7 +281,11 @@ static void gre_err(struct sk_buff *skb, u32 info)
#endif
__skb_pull(skb, tunnel_hdr_len);
- ovs_tnl_frag_needed(vport, mutable, skb, mtu, key);
+ {
+ struct ovs_key_ipv4_tunnel tun_key;
+ tun_key_init(&tun_key, iph, key);
+ ovs_tnl_frag_needed(vport, mutable, skb, mtu, &tun_key);
+ }
__skb_push(skb, tunnel_hdr_len);
out:
@@ -327,6 +327,7 @@ static int gre_rcv(struct sk_buff *skb)
const struct tnl_mutable_config *mutable;
int hdr_len;
struct iphdr *iph;
+ struct ovs_key_ipv4_tunnel tun_key;
__be16 flags;
__be64 key;
@@ -351,15 +352,15 @@ static int gre_rcv(struct sk_buff *skb)
goto error;
}
- if (mutable->flags & TNL_F_IN_KEY_MATCH)
- OVS_CB(skb)->tun_id = key;
- else
- OVS_CB(skb)->tun_id = 0;
+
+ tun_key_init(&tun_key, iph,
+ mutable->flags & TNL_F_IN_KEY_MATCH ? key : 0);
+ OVS_CB(skb)->tun_key = &tun_key;
__skb_pull(skb, hdr_len);
skb_postpull_rcsum(skb, skb_transport_header(skb), hdr_len + ETH_HLEN);
- ovs_tnl_rcv(vport, skb, iph->tos);
+ ovs_tnl_rcv(vport, skb);
return 0;
error:
diff --git a/datapath/vport.c b/datapath/vport.c
index 172261a..0c77a1b 100644
--- a/datapath/vport.c
+++ b/datapath/vport.c
@@ -462,7 +462,7 @@ void ovs_vport_receive(struct vport *vport, struct sk_buff *skb)
OVS_CB(skb)->flow = NULL;
if (!(vport->ops->flags & VPORT_F_TUN_ID))
- OVS_CB(skb)->tun_id = 0;
+ OVS_CB(skb)->tun_key = NULL;
ovs_dp_process_received_packet(vport, skb);
}
diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index f5c9cca..c32bb58 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -278,7 +278,8 @@ enum ovs_key_attr {
OVS_KEY_ATTR_ICMPV6, /* struct ovs_key_icmpv6 */
OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */
OVS_KEY_ATTR_ND, /* struct ovs_key_nd */
- OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */
+ OVS_KEY_ATTR_TUN_ID, /* be64 tunnel ID */
+ OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */
__OVS_KEY_ATTR_MAX
};
@@ -360,6 +361,16 @@ struct ovs_key_nd {
__u8 nd_tll[6];
};
+struct ovs_key_ipv4_tunnel {
+ __be64 tun_id;
+ __u32 tun_flags;
+ __be32 ipv4_src;
+ __be32 ipv4_dst;
+ __u8 ipv4_tos;
+ __u8 ipv4_ttl;
+ __u8 pad[2];
+};
+
/**
* enum ovs_flow_attr - attributes for %OVS_FLOW_* commands.
* @OVS_FLOW_ATTR_KEY: Nested %OVS_KEY_ATTR_* attributes specifying the flow
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index fb0a863..d065a3a 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -1165,6 +1165,7 @@ execute_set_action(struct ofpbuf *packet, const struct nlattr *a)
case OVS_KEY_ATTR_TUN_ID:
case OVS_KEY_ATTR_PRIORITY:
case OVS_KEY_ATTR_IPV6:
+ case OVS_KEY_ATTR_IPV4_TUNNEL:
/* not implemented */
break;
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 8693d3c..23d1efe 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -106,6 +106,7 @@ ovs_key_attr_to_string(enum ovs_key_attr attr)
case OVS_KEY_ATTR_ARP: return "arp";
case OVS_KEY_ATTR_ND: return "nd";
case OVS_KEY_ATTR_TUN_ID: return "tun_id";
+ case OVS_KEY_ATTR_IPV4_TUNNEL: return "ipv4_tunnel";
case __OVS_KEY_ATTR_MAX:
default:
@@ -614,6 +615,7 @@ odp_flow_key_attr_len(uint16_t type)
case OVS_KEY_ATTR_ICMPV6: return sizeof(struct ovs_key_icmpv6);
case OVS_KEY_ATTR_ARP: return sizeof(struct ovs_key_arp);
case OVS_KEY_ATTR_ND: return sizeof(struct ovs_key_nd);
+ case OVS_KEY_ATTR_IPV4_TUNNEL: return sizeof(struct ovs_key_ipv4_tunnel);
case OVS_KEY_ATTR_UNSPEC:
case __OVS_KEY_ATTR_MAX:
@@ -668,6 +670,7 @@ format_odp_key_attr(const struct nlattr *a, struct ds *ds)
const struct ovs_key_icmpv6 *icmpv6_key;
const struct ovs_key_arp *arp_key;
const struct ovs_key_nd *nd_key;
+ const struct ovs_key_ipv4_tunnel *ipv4_tun_key;
enum ovs_key_attr attr = nl_attr_type(a);
int expected_len;
@@ -698,6 +701,16 @@ format_odp_key_attr(const struct nlattr *a, struct ds *ds)
ds_put_format(ds, "(%#"PRIx64")", ntohll(nl_attr_get_be64(a)));
break;
+ case OVS_KEY_ATTR_IPV4_TUNNEL:
+ ipv4_tun_key = nl_attr_get(a);
+ ds_put_format(ds, "(tun_id=%"PRIx64",flags=%"PRIx32
+ ",src="IP_FMT",dst="IP_FMT",tos=%"PRIx8",ttl=%"PRIu8")",
+ ntohll(ipv4_tun_key->tun_id), ipv4_tun_key->tun_flags,
+ IP_ARGS(&ipv4_tun_key->ipv4_src),
+ IP_ARGS(&ipv4_tun_key->ipv4_dst),
+ ipv4_tun_key->ipv4_tos, ipv4_tun_key->ipv4_ttl);
+ break;
+
case OVS_KEY_ATTR_IN_PORT:
ds_put_format(ds, "(%"PRIu32")", nl_attr_get_u32(a));
break;
diff --git a/lib/odp-util.h b/lib/odp-util.h
index d53f083..4e5a8a1 100644
--- a/lib/odp-util.h
+++ b/lib/odp-util.h
@@ -72,6 +72,7 @@ int odp_actions_from_string(const char *, const struct simap *port_names,
* ------ --- ------ -----
* OVS_KEY_ATTR_PRIORITY 4 -- 4 8
* OVS_KEY_ATTR_TUN_ID 8 -- 4 12
+ * OVS_KEY_ATTR_IPV4_TUNNEL 24 -- 4 28
* OVS_KEY_ATTR_IN_PORT 4 -- 4 8
* OVS_KEY_ATTR_ETHERNET 12 -- 4 16
* OVS_KEY_ATTR_8021Q 4 -- 4 8
@@ -80,9 +81,9 @@ int odp_actions_from_string(const char *, const struct simap *port_names,
* OVS_KEY_ATTR_ICMPV6 2 2 4 8
* OVS_KEY_ATTR_ND 28 -- 4 32
* -------------------------------------------------
- * total 144
+ * total 172
*/
-#define ODPUTIL_FLOW_KEY_BYTES 144
+#define ODPUTIL_FLOW_KEY_BYTES 172
/* A buffer with sufficient size and alignment to hold an nlattr-formatted flow
* key. An array of "struct nlattr" might not, in theory, be sufficiently
--
1.7.10.2.484.gcd07cc5
* [PATCH 04/21] vswitchd: Add iface_parse_tunnel
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
This duplicates parse_tunnel_config(); the duplication will be minimised later.
iface_parse_tunnel() is currently only used to verify the configuration
by passing NULL as its third argument. It will later be used in storing
the configuration by passing a non-NULL argument. The purpose of verification
is to allow for error-free parsing later.
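The validate-now, store-later pattern described above can be sketched as
follows. This is a minimal illustration, not code from the patch:
demo_settings and demo_parse() are hypothetical names, and the parsing is
reduced to a single key option. The point is only that the result is built
in a local copy and written out solely when the caller passes a non-NULL
pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified settings; not the patch's struct tunnel_settings. */
struct demo_settings {
    unsigned long long key;
    bool has_key;
};

/* Parses 'text' into a local copy.  Returns 0 on success, nonzero on error.
 * 'out' may be NULL, in which case 'text' is only validated. */
static int
demo_parse(const char *text, struct demo_settings *out)
{
    struct demo_settings s = { .key = 0, .has_key = false };
    char *end;

    if (!text || !*text) {
        return 1;
    }

    /* "flow" means that no fixed key is configured. */
    if (strcmp(text, "flow")) {
        s.key = strtoull(text, &end, 0);
        if (*end != '\0') {
            return 1;           /* Trailing garbage: reject. */
        }
        s.has_key = true;
    }

    /* Store the result only if the caller asked for it. */
    if (out) {
        *out = s;
    }
    return 0;
}
```

A caller would first run demo_parse(text, NULL) to reject bad input, and
later call demo_parse(text, &s) on the same input, at which point no error
is expected.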
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
include/openvswitch/tunnel.h | 2 +
ofproto/ofproto.h | 33 +++++++
vswitchd/bridge.c | 214 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 249 insertions(+)
diff --git a/include/openvswitch/tunnel.h b/include/openvswitch/tunnel.h
index c494791..5f55ecc 100644
--- a/include/openvswitch/tunnel.h
+++ b/include/openvswitch/tunnel.h
@@ -71,5 +71,7 @@ enum {
#define TNL_F_PMTUD (1 << 5) /* Enable path MTU discovery. */
#define TNL_F_HDR_CACHE (1 << 6) /* Enable tunnel header caching. */
#define TNL_F_IPSEC (1 << 7) /* Traffic is IPsec encrypted. */
+#define TNL_F_IN_KEY (1 << 8) /* Tunnel port has input key. */
+#define TNL_F_OUT_KEY (1 << 9) /* Tunnel port has output key. */
#endif /* openvswitch/tunnel.h */
diff --git a/ofproto/ofproto.h b/ofproto/ofproto.h
index ea988e7..d8739b0 100644
--- a/ofproto/ofproto.h
+++ b/ofproto/ofproto.h
@@ -367,7 +367,40 @@ void ofproto_get_vlan_usage(struct ofproto *, unsigned long int *vlan_bitmap);
bool ofproto_has_vlan_usage_changed(const struct ofproto *);
int ofproto_port_set_realdev(struct ofproto *, uint16_t vlandev_ofp_port,
uint16_t realdev_ofp_port, int vid);
+\f
+#define TNL_F_CSUM (1 << 0) /* Checksum packets. */
+#define TNL_F_TOS_INHERIT (1 << 1) /* Inherit ToS from inner packet. */
+#define TNL_F_TTL_INHERIT (1 << 2) /* Inherit TTL from inner packet. */
+#define TNL_F_DF_INHERIT (1 << 3) /* Inherit DF bit from inner packet. */
+#define TNL_F_DF_DEFAULT (1 << 4) /* Set DF bit if inherit off or
+ * not IP. */
+#define TNL_F_PMTUD (1 << 5) /* Enable path MTU discovery. */
+#define TNL_F_HDR_CACHE (1 << 6) /* Enable tunnel header caching. */
+#define TNL_F_IPSEC (1 << 7) /* Traffic is IPsec encrypted. */
+#define TNL_F_IN_KEY (1 << 8) /* Tunnel port has input key. */
+#define TNL_F_OUT_KEY (1 << 9) /* Tunnel port has output key. */
+
+#define TNL_T_PROTO_GRE 0
+#define TNL_T_PROTO_CAPWAP 1
+
+#define TNL_T_KEY_EXACT (1 << 6)
+#define TNL_T_KEY_MATCH (1 << 7)
+
+/* Tunnel device support */
+struct tunnel_settings {
+ ovs_be64 in_key;
+ ovs_be64 out_key;
+ ovs_be32 saddr;
+ ovs_be32 daddr;
+ uint8_t tos;
+ uint8_t ttl;
+ uint16_t flags;
+ uint8_t type;
+};
+void ofproto_port_set_tunnel(struct ofproto *ofproto, uint16_t tundev_ofp_port,
+ uint16_t realdev_ofp_port,
+ const struct tunnel_settings *s);
#ifdef __cplusplus
}
#endif
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index d720952..f775ae7 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -20,6 +20,7 @@
#include <inttypes.h>
#include <stdlib.h>
#include "bitmap.h"
+#include "byte-order.h"
#include "bond.h"
#include "cfm.h"
#include "coverage.h"
@@ -625,6 +626,13 @@ bridge_update_ofprotos(void)
}
}
+static bool
+is_tunnel_realdev(const char *type)
+{
+ return !strcmp(type, "gre") || !strcmp(type, "ipsec_gre") ||
+ !strcmp(type, "capwap");
+}
+
static void
port_configure(struct port *port)
{
@@ -1333,6 +1341,207 @@ error:
return error;
}
+
+static const char *
+get_key(const struct shash *args, const char *name)
+{
+ const char *s;
+
+ s = shash_find_data(args, name);
+ if (!s) {
+ s = shash_find_data(args, "key");
+ if (!s) {
+ s = "0";
+ }
+ }
+
+ if (!strcmp(s, "flow")) {
+ /* This is the default if no attribute is present. */
+ return NULL;
+ }
+
+ return s;
+}
+
+static int
+iface_parse_tunnel(const struct ovsrec_interface *iface_cfg,
+ const char *type, struct tunnel_settings *sp)
+{
+ bool is_gre = false;
+ bool is_ipsec = false;
+ struct shash args;
+ struct shash_node *node;
+ struct tunnel_settings s = { .tos = 0 };
+ bool ipsec_mech_set = false;
+ int status = EINVAL;
+ const char *key;
+
+ shash_init(&args);
+ shash_from_ovs_idl_map(iface_cfg->key_options,
+ iface_cfg->value_options,
+ iface_cfg->n_options, &args);
+
+ s.flags = TNL_F_DF_DEFAULT | TNL_F_PMTUD | TNL_F_HDR_CACHE;
+ if (!strcmp(type, "gre")) {
+ is_gre = true;
+ s.type = TNL_T_PROTO_GRE;
+ } else if (!strcmp(type, "ipsec_gre")) {
+ is_gre = true;
+ s.type = TNL_T_PROTO_GRE;
+ is_ipsec = true;
+ s.flags |= TNL_F_IPSEC;
+ s.flags &= ~TNL_F_HDR_CACHE;
+ } else if (!strcmp(type, "capwap")) {
+ s.type = TNL_T_PROTO_CAPWAP;
+ }
+
+ SHASH_FOR_EACH (node, &args) {
+ if (!strcmp(node->name, "remote_ip")) {
+ struct in_addr in_addr;
+ if (lookup_ip(node->data, &in_addr)) {
+ VLOG_WARN("%s: bad %s 'remote_ip'", iface_cfg->name, type);
+ } else {
+ s.daddr = in_addr.s_addr;
+ }
+ } else if (!strcmp(node->name, "local_ip")) {
+ struct in_addr in_addr;
+ if (lookup_ip(node->data, &in_addr)) {
+ VLOG_WARN("%s: bad %s 'local_ip'", iface_cfg->name, type);
+ } else {
+ s.saddr = in_addr.s_addr;
+ }
+ } else if (!strcmp(node->name, "tos")) {
+ if (!strcmp(node->data, "inherit")) {
+ s.flags |= TNL_F_TOS_INHERIT;
+ } else {
+ s.tos = atoi(node->data);
+ }
+ } else if (!strcmp(node->name, "ttl")) {
+ if (!strcmp(node->data, "inherit")) {
+ s.flags |= TNL_F_TTL_INHERIT;
+ } else {
+ s.ttl = atoi(node->data);
+ }
+ } else if (!strcmp(node->name, "csum") && is_gre) {
+ if (!strcmp(node->data, "true")) {
+ s.flags |= TNL_F_CSUM;
+ }
+ } else if (!strcmp(node->name, "df_inherit")) {
+ if (!strcmp(node->data, "true")) {
+ s.flags |= TNL_F_DF_INHERIT;
+ }
+ } else if (!strcmp(node->name, "df_default")) {
+ if (!strcmp(node->data, "false")) {
+ s.flags &= ~TNL_F_DF_DEFAULT;
+ }
+ } else if (!strcmp(node->name, "pmtud")) {
+ if (!strcmp(node->data, "false")) {
+ s.flags &= ~TNL_F_PMTUD;
+ }
+ } else if (!strcmp(node->name, "header_cache")) {
+ if (!strcmp(node->data, "false")) {
+ s.flags &= ~TNL_F_HDR_CACHE;
+ }
+ } else if (!strcmp(node->name, "peer_cert") && is_ipsec) {
+ if (shash_find(&args, "certificate")) {
+ ipsec_mech_set = true;
+ } else {
+ const char *use_ssl_cert;
+
+ /* If the "use_ssl_cert" is true, then "certificate" and
+ * "private_key" will be pulled from the SSL table. The
+ * use of this option is strongly discouraged, since it
+ * will likely be removed when multiple SSL configurations
+ * are supported by OVS.
+ */
+ use_ssl_cert = shash_find_data(&args, "use_ssl_cert");
+ if (!use_ssl_cert || strcmp(use_ssl_cert, "true")) {
+ VLOG_ERR("%s: 'peer_cert' requires 'certificate' argument",
+ iface_cfg->name);
+ goto err;
+ }
+ ipsec_mech_set = true;
+ }
+ } else if (!strcmp(node->name, "psk") && is_ipsec) {
+ ipsec_mech_set = true;
+ } else if (is_ipsec
+ && (!strcmp(node->name, "certificate")
+ || !strcmp(node->name, "private_key")
+ || !strcmp(node->name, "use_ssl_cert"))) {
+ /* Ignore options not used by the netdev. */
+ } else if (!strcmp(node->name, "key") ||
+ !strcmp(node->name, "in_key") ||
+ !strcmp(node->name, "out_key")) {
+ /* Handled separately below. */
+ } else {
+ VLOG_WARN("%s: unknown %s argument '%s'", iface_cfg->name,
+ type, node->name);
+ }
+ }
+
+ if (is_ipsec) {
+ char *file_name = xasprintf("%s/%s", ovs_rundir(),
+ "ovs-monitor-ipsec.pid");
+ pid_t pid = read_pidfile(file_name);
+ free(file_name);
+ if (pid < 0) {
+ VLOG_ERR("%s: IPsec requires the ovs-monitor-ipsec daemon",
+ iface_cfg->name);
+ goto err;
+ }
+
+ if (shash_find(&args, "peer_cert") && shash_find(&args, "psk")) {
+ VLOG_ERR("%s: cannot define both 'peer_cert' and 'psk'",
+ iface_cfg->name);
+ goto err;
+ }
+
+ if (!ipsec_mech_set) {
+ VLOG_ERR("%s: IPsec requires a 'peer_cert' or 'psk' argument",
+ iface_cfg->name);
+ goto err;
+ }
+ }
+
+ if ((key = get_key(&args, "in_key"))) {
+ s.flags |= TNL_F_IN_KEY;
+ s.type |= TNL_T_KEY_EXACT;
+ s.in_key = htonll(strtoull(key, NULL, 0));
+ } else {
+ s.type |= TNL_T_KEY_MATCH;
+ s.in_key = 0ULL;
+ }
+ if ((key = get_key(&args, "out_key"))) {
+ s.flags |= TNL_F_OUT_KEY;
+ s.out_key = htonll(strtoull(key, NULL, 0));
+ } else {
+ s.out_key = 0ULL;
+ }
+
+ if (!s.daddr) {
+ VLOG_ERR("%s: %s type requires valid 'remote_ip' argument",
+ iface_cfg->name, type);
+ goto err;
+ }
+
+ if (s.saddr) {
+ if (ip_is_multicast(s.daddr)) {
+ VLOG_WARN("%s: remote_ip is multicast, ignoring local_ip",
+ iface_cfg->name);
+ s.saddr = 0;
+ }
+ }
+
+ if (sp) {
+ *sp = s;
+ }
+
+ status = 0;
+err:
+ shash_destroy(&args);
+ return status;
+}
+
/* Creates a new iface on 'br' based on 'if_cfg'. The new iface has OpenFlow
* port number 'ofp_port'. If ofp_port is negative, an OpenFlow port is
* automatically allocated for the iface. Takes ownership of and
@@ -1344,6 +1553,7 @@ iface_create(struct bridge *br, struct if_cfg *if_cfg, int ofp_port)
{
const struct ovsrec_interface *iface_cfg = if_cfg->cfg;
const struct ovsrec_port *port_cfg = if_cfg->parent;
+ const char *type = iface_get_type(iface_cfg, br->cfg);
struct netdev *netdev;
struct iface *iface;
@@ -1355,6 +1565,10 @@ iface_create(struct bridge *br, struct if_cfg *if_cfg, int ofp_port)
hmap_remove(&br->if_cfg_todo, &if_cfg->hmap_node);
free(if_cfg);
+ if (is_tunnel_realdev(type) && iface_parse_tunnel(iface_cfg, type, NULL)) {
+ return false;
+ }
+
/* Do the bits that can fail up front. */
assert(!iface_lookup(br, iface_cfg->name));
error = iface_do_create(br, iface_cfg, port_cfg, &ofp_port, &netdev);
--
1.7.10.2.484.gcd07cc5
* [PATCH 05/21] vswitchd: Add add_tunnel_ports()
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Add tunnel tundevs for tunnel realdevs as needed.
In general, the notion is that realdevs may be configured by users
and, from an end-user point of view, are compatible with the existing
port-based tunneling code, whereas tundevs exist in the datapath
and are actually used to send and receive packets, based on flows.
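The realdev-to-tundev naming used by add_tunnel_ports() in the diff below
can be sketched in isolation. This is only an illustration; demo_tundev and
demo_tundev_for_realdev() are hypothetical names, while the type strings
are the ones that appear in the patch.

```c
#include <string.h>

/* Hypothetical pair describing the synthesized tundev port. */
struct demo_tundev {
    const char *name;   /* Port name shared by all realdevs of a protocol. */
    const char *type;   /* Interface type of the synthesized port. */
};

/* Fills 'out' with the tundev that backs a realdev of 'realdev_type'.
 * Returns 0 on success, -1 if 'realdev_type' is not a tunnel realdev. */
static int
demo_tundev_for_realdev(const char *realdev_type, struct demo_tundev *out)
{
    if (!strcmp(realdev_type, "gre") || !strcmp(realdev_type, "ipsec_gre")) {
        /* Plain and IPsec GRE realdevs share one GRE tundev. */
        out->name = "gre";
        out->type = "gre-tundev";
        return 0;
    }
    if (!strcmp(realdev_type, "capwap")) {
        out->name = "capwap";
        out->type = "capwap-tundev";
        return 0;
    }
    return -1;
}
```

Because the tundev name depends only on the protocol, any number of
realdevs of the same protocol map to a single synthesized port, which is
why the patch checks 'ports' before adding one.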
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
vswitchd/bridge.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 67 insertions(+)
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index f775ae7..3d187f0 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -268,6 +268,7 @@ static void configure_splinter_port(struct port *);
static void add_vlan_splinter_ports(struct bridge *,
const unsigned long int *splinter_vlans,
struct shash *ports);
+static void add_tunnel_ports(struct bridge *, struct shash *ports);
\f
/* Public functions. */
@@ -2751,6 +2752,8 @@ bridge_add_del_ports(struct bridge *br,
add_vlan_splinter_ports(br, splinter_vlans, &new_ports);
}
+ add_tunnel_ports(br, &new_ports);
+
/* Get rid of deleted ports.
* Get rid of deleted interfaces on ports that still exist. */
HMAP_FOR_EACH_SAFE (port, next, hmap_node, &br->ports) {
@@ -4153,6 +4156,70 @@ add_vlan_splinter_ports(struct bridge *br,
}
}
+static struct ovsrec_port *
+synthesize_tunnel_port(const char *name, const char *type)
+{
+ struct ovsrec_interface *iface;
+ struct ovsrec_port *port;
+
+ iface = xzalloc(sizeof *iface);
+ iface->name = xstrdup(name);
+ iface->type = type;
+
+ port = xzalloc(sizeof *port);
+ port->interfaces = xmemdup(&iface, sizeof iface);
+ port->n_interfaces = 1;
+ port->name = xstrdup(name);
+
+ register_block(iface);
+ register_block(iface->name);
+ register_block(port);
+ register_block(port->interfaces);
+ register_block(port->name);
+
+ return port;
+}
+
+/* For each interface within 'br' that is a tunnel realdev, adds the
+ * corresponding tundev ovsrec_port to 'ports' if it is not already present. */
+static void
+add_tunnel_ports(struct bridge *br, struct shash *ports)
+{
+ size_t i;
+
+ /* We iterate through 'br->cfg->ports' instead of 'ports' here because
+ * we're modifying 'ports'. */
+ for (i = 0; i < br->cfg->n_ports; i++) {
+ const char *name = br->cfg->ports[i]->name;
+ struct ovsrec_port *port_cfg = shash_find_data(ports, name);
+ size_t j;
+
+ for (j = 0; j < port_cfg->n_interfaces; j++) {
+ struct ovsrec_interface *iface_cfg = port_cfg->interfaces[j];
+ const char *type = iface_get_type(iface_cfg, br->cfg);
+ const char *tundev_name;
+ const char *tundev_type;
+
+ if (!is_tunnel_realdev(type)) {
+ continue;
+ }
+
+ tundev_name = strcmp(type, "ipsec_gre") ? type : "gre";
+ if (!strcmp(tundev_name, "gre")) {
+ tundev_type = "gre-tundev";
+ } else {
+ tundev_type = "capwap-tundev";
+ }
+
+ if (!shash_find(ports, tundev_name)) {
+ shash_add(ports, tundev_name,
+ synthesize_tunnel_port(tundev_name,
+ tundev_type));
+ }
+ }
+ }
+}
+
static void
mirror_refresh_stats(struct mirror *m)
{
--
1.7.10.2.484.gcd07cc5
* [PATCH 06/21] ofproto: Add set_tunnelling()
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Allow configuration of tunneling in ofproto_port instances.
For tunnel realdevs this includes the remote IP and type of the tunnel,
and optionally the local IP, tos and ttl.
For tunnel tundevs it only includes the type.
realdevs and tundevs can be differentiated by examining the remote IP,
which is always zero for tundevs and always non-zero for realdevs.
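The rule above can be written as a pair of predicates. This is a minimal
sketch: demo_tunnel is a hypothetical stand-in that carries only the two
fields the rule needs, not the struct tunnel_settings from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a port's tunnel configuration. */
struct demo_tunnel {
    uint32_t daddr;     /* Remote IP, network byte order; 0 on a tundev. */
    uint8_t type;       /* Tunnel protocol. */
};

/* A tundev carries only a type, so its remote IP is always zero. */
static bool
demo_is_tundev(const struct demo_tunnel *t)
{
    return t->daddr == 0;
}

/* A realdev must name its peer, so its remote IP is always nonzero. */
static bool
demo_is_realdev(const struct demo_tunnel *t)
{
    return t->daddr != 0;
}
```

The test is byte-order independent because it only compares against zero.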
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
ofproto/ofproto-dpif.c | 116 +++++++++++++++++++++++++++++++++++++++++++++
ofproto/ofproto-provider.h | 12 +++++
ofproto/ofproto.c | 28 +++++++++++
ofproto/ofproto.h | 13 +++++
4 files changed, 169 insertions(+)
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index f2c2ca9..642b508 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -476,6 +476,13 @@ static void facet_account(struct facet *);
static bool facet_is_controller_flow(struct facet *);
+struct ofport_dpif_tun {
+ struct tunnel_settings s;
+ uint16_t tundev_ofp_port;
+ struct hmap_node tundev_node;
+ struct ofport_dpif *ofport; /* Containing ofport_dpif */
+};
+
struct ofport_dpif {
struct ofport up;
@@ -503,6 +510,9 @@ struct ofport_dpif {
* widespread use, we will delete these interfaces. */
uint16_t realdev_ofp_port;
int vlandev_vid;
+
+ /* Tunneling */
+ struct ofport_dpif_tun *tun;
};
/* Node in 'ofport_dpif''s 'priorities' map. Used to maintain a map from
@@ -535,6 +545,16 @@ static bool vsp_adjust_flow(const struct ofproto_dpif *, struct flow *);
static void vsp_remove(struct ofport_dpif *);
static void vsp_add(struct ofport_dpif *, uint16_t realdev_ofp_port, int vid);
+static unsigned key_local_remote_ports;
+static unsigned key_remote_ports;
+static unsigned local_remote_ports;
+static unsigned remote_ports;
+static unsigned key_multicast_ports;
+static unsigned multicast_ports;
+
+static int set_tunnelling(struct ofport *ofport_, uint16_t realdev_ofp_port,
+ const struct tunnel_settings *s);
+
static struct ofport_dpif *
ofport_dpif_cast(const struct ofport *ofport)
{
@@ -612,6 +632,9 @@ struct ofproto_dpif {
/* VLAN splinters. */
struct hmap realdev_vid_map; /* (realdev,vid) -> vlandev. */
struct hmap vlandev_map; /* vlandev -> (realdev,vid). */
+
+ /* Tunnelling */
+ struct hmap tundev_map; /* tundev -> realdev */
};
/* Defer flow mod completion until "ovs-appctl ofproto/unclog"? (Useful only
@@ -771,6 +794,8 @@ construct(struct ofproto *ofproto_)
hmap_init(&ofproto->vlandev_map);
hmap_init(&ofproto->realdev_vid_map);
+ hmap_init(&ofproto->tundev_map);
+
hmap_insert(&all_ofproto_dpifs, &ofproto->all_ofproto_dpifs_node,
hash_string(ofproto->up.name, 0));
memset(&ofproto->stats, 0, sizeof ofproto->stats);
@@ -1153,6 +1178,7 @@ port_construct(struct ofport *port_)
hmap_init(&port->priorities);
port->realdev_ofp_port = 0;
port->vlandev_vid = 0;
+ port->tun = NULL;
port->carrier_seq = netdev_get_carrier_resets(port->up.netdev);
if (ofproto->sflow) {
@@ -1171,6 +1197,7 @@ port_destruct(struct ofport *port_)
ofproto->need_revalidate = true;
bundle_remove(port_);
set_cfm(port_, NULL);
+ set_tunnelling(port_, 0, NULL);
if (ofproto->sflow) {
dpif_sflow_del_port(ofproto->sflow, port->odp_port);
}
@@ -7097,6 +7124,94 @@ vsp_add(struct ofport_dpif *port, uint16_t realdev_ofp_port, int vid)
}
}
\f
+static inline bool
+ipv4_is_multicast(__be32 addr)
+{
+ return (addr & htonl(0xf0000000)) == htonl(0xe0000000);
+}
+
+static unsigned int *
+tun_port_pool(const struct tunnel_settings *s)
+{
+ bool is_multicast = ipv4_is_multicast(s->daddr);
+
+ if (s->type & TNL_T_KEY_MATCH) {
+ if (s->saddr)
+ return &local_remote_ports;
+ else if (is_multicast)
+ return &multicast_ports;
+ else
+ return &remote_ports;
+ } else {
+ if (s->saddr)
+ return &key_local_remote_ports;
+ else if (is_multicast)
+ return &key_multicast_ports;
+ else
+ return &key_remote_ports;
+ }
+}
+
+static void
+tun_remove(struct ofport_dpif *ofport)
+{
+ struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofport->up.ofproto);
+
+ if (!ofport->tun) {
+ return;
+ }
+
+ hmap_remove(&ofproto->tundev_map, &ofport->tun->tundev_node);
+ (*tun_port_pool(&ofport->tun->s))--;
+}
+
+static void
+tun_add(struct ofport_dpif *ofport, uint16_t tundev_ofp_port,
+ const struct tunnel_settings *s)
+{
+ struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofport->up.ofproto);
+
+ ofport->tun->tundev_ofp_port = tundev_ofp_port;
+ ofport->tun->s = *s;
+ (*tun_port_pool(&ofport->tun->s))++;
+ hmap_insert(&ofproto->tundev_map, &ofport->tun->tundev_node,
+ hash_int(tundev_ofp_port, 0));
+}
+
+static int
+set_tunnelling(struct ofport *ofport_, uint16_t tundev_ofp_port,
+ const struct tunnel_settings *s)
+{
+ struct ofport_dpif *ofport = ofport_dpif_cast(ofport_);
+
+ if (!s) {
+ tun_remove(ofport);
+ free(ofport->tun);
+ ofport->tun = NULL;
+ return 0;
+ }
+
+ if (!ofport->tun) {
+ struct ofproto_dpif *ofproto;
+
+ ofproto = ofproto_dpif_cast(ofport->up.ofproto);
+ ofproto->need_revalidate = true;
+ ofport->tun = xzalloc(sizeof *ofport->tun);
+ ofport->tun->ofport = ofport;
+ }
+ else {
+ if (ofport->tun->tundev_ofp_port == tundev_ofp_port &&
+ tunnel_settings_equal(&ofport->tun->s, s)) {
+ return 0;
+ }
+ tun_remove(ofport);
+ }
+
+ tun_add(ofport, tundev_ofp_port, s);
+
+ return 0;
+}
+\f
const struct ofproto_class ofproto_dpif_class = {
enumerate_types,
enumerate_names,
@@ -7159,4 +7274,5 @@ const struct ofproto_class ofproto_dpif_class = {
forward_bpdu_changed,
set_mac_idle_time,
set_realdev,
+ set_tunnelling,
};
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index 1f3ad37..be39691 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -1168,6 +1168,18 @@ struct ofproto_class {
* it. */
int (*set_realdev)(struct ofport *ofport,
uint16_t realdev_ofp_port, int vid);
+
+ /* Configures tunneling for 'ofport'.
+ *
+ * If 's' is nonnull, configures tunneling
+ * according to its members.
+ *
+ * If 's' is null, then any tunnel configuration is
+ * removed.
+ *
+ * This function should be null if tunnelling is not supported. */
+ int (*set_tunnelling)(struct ofport *ofport, uint16_t tundev_ofp_port,
+ const struct tunnel_settings *s);
};
extern const struct ofproto_class ofproto_dpif_class;
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 0bda06a..79f7a24 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -4184,3 +4184,31 @@ ofproto_port_set_realdev(struct ofproto *ofproto, uint16_t vlandev_ofp_port,
}
return error;
}
+
+/* Configure tunneling parameters of a port
+ *
+ * This function has no effect if 'ofproto' does not have a port 'ofp_port'. */
+void
+ofproto_port_set_tunnel(struct ofproto *ofproto, uint16_t tundev_ofp_port,
+ uint16_t ofp_port, const struct tunnel_settings *s)
+{
+ struct ofport *ofport;
+ int error;
+
+ ofport = ofproto_get_port(ofproto, ofp_port);
+ if (!ofport) {
+ VLOG_WARN("%s: cannot configure tunnel on nonexistent port %"PRIu16,
+ ofproto->name, ofp_port);
+ return;
+ }
+
+ error = (ofproto->ofproto_class->set_tunnelling
+ ? ofproto->ofproto_class->set_tunnelling(ofport,
+ tundev_ofp_port, s)
+ : EOPNOTSUPP);
+ if (error) {
+ VLOG_WARN("%s: Tunnel configuration on port %"PRIu16" (%s) failed (%s)",
+ ofproto->name, ofp_port,
+ netdev_get_name(ofport->netdev), strerror(error));
+ }
+}
diff --git a/ofproto/ofproto.h b/ofproto/ofproto.h
index d8739b0..147a588 100644
--- a/ofproto/ofproto.h
+++ b/ofproto/ofproto.h
@@ -398,6 +398,19 @@ struct tunnel_settings {
uint8_t type;
};
+static inline bool
+tunnel_settings_equal(const struct tunnel_settings *a,
+ const struct tunnel_settings *b)
+{
+ return a->daddr == b->daddr &&
+ a->in_key == b->in_key &&
+ a->out_key == b->out_key &&
+ a->saddr == b->saddr &&
+ a->flags == b->flags &&
+ a->tos == b->tos &&
+ a->ttl == b->ttl;
+}
+
void ofproto_port_set_tunnel(struct ofproto *ofproto, uint16_t tundev_ofp_port,
uint16_t realdev_ofp_port,
const struct tunnel_settings *s);
--
1.7.10.2.484.gcd07cc5
* [PATCH 07/21] vswitchd: Configure tunnel interfaces.
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
For tunnel realdevs this sets the remote IP and type,
and optionally the source IP, ttl and tos. The remote IP
must be non-zero.
For tunnel tundevs only the type is configured.
The remote IP must be zero.
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
vswitchd/bridge.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 69 insertions(+)
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index 3d187f0..a67f391 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -242,6 +242,7 @@ static void iface_set_ofport(const struct ovsrec_interface *, int64_t ofport);
static void iface_clear_db_record(const struct ovsrec_interface *if_cfg);
static void iface_configure_qos(struct iface *, const struct ovsrec_qos *);
static void iface_configure_cfm(struct iface *);
+static void iface_configure_tunnel(struct iface *);
static void iface_refresh_cfm_stats(struct iface *);
static void iface_refresh_stats(struct iface *);
static void iface_refresh_status(struct iface *);
@@ -535,6 +536,7 @@ bridge_reconfigure_continue(const struct ovsrec_open_vswitch *ovs_cfg)
LIST_FOR_EACH (iface, port_elem, &port->ifaces) {
iface_configure_cfm(iface);
iface_configure_qos(iface, port->cfg->qos);
+ iface_configure_tunnel(iface);
iface_set_mac(iface);
}
}
@@ -627,6 +629,21 @@ bridge_update_ofprotos(void)
}
}
+static bool is_tunnel_tundev(const char *type)
+{
+ return !strcmp(type, "gre-tundev") || !strcmp(type, "capwap-tundev");
+}
+
+static uint8_t
+tunnel_tundev_type_from_str(const char *type)
+{
+ if (!strcmp(type, "gre-tundev"))
+ return TNL_T_PROTO_GRE;
+ if (!strcmp(type, "capwap-tundev"))
+ return TNL_T_PROTO_CAPWAP;
+ NOT_REACHED();
+}
+
static bool
is_tunnel_realdev(const char *type)
{
@@ -648,6 +665,15 @@ port_configure(struct port *port)
return;
}
+ if (list_is_singleton(&port->ifaces)) {
+ iface = CONTAINER_OF(list_front(&port->ifaces),
+ struct iface, port_elem);
+ if (is_tunnel_tundev(iface->type)) {
+ ofproto_bundle_unregister(port->bridge->ofproto, port);
+ return;
+ }
+ }
+
/* Get name. */
s.name = port->name;
@@ -3686,6 +3712,49 @@ iface_configure_cfm(struct iface *iface)
ofproto_port_set_cfm(iface->port->bridge->ofproto, iface->ofp_port, &s);
}
+static void
+iface_configure_tunnel_tundev(struct iface *iface)
+{
+ const char *type = iface_get_type(iface->cfg, iface->port->bridge->cfg);
+ struct tunnel_settings s = { .type = tunnel_tundev_type_from_str(type) };
+
+ ofproto_port_set_tunnel(iface->port->bridge->ofproto, 0,
+ iface->ofp_port, &s);
+}
+
+static void
+iface_configure_tunnel_realdev(struct iface *iface)
+{
+ struct tunnel_settings s = { .tos = 0 };
+ const char *type = iface_get_type(iface->cfg, iface->port->bridge->cfg);
+ struct iface *tundev;
+
+ /* This will not fail as it has already been called
+ * to check for errors */
+ iface_parse_tunnel(iface->cfg, type, &s);
+
+ tundev = iface_lookup(iface->port->bridge, type);
+ assert(tundev);
+
+ ofproto_port_set_tunnel(iface->port->bridge->ofproto, tundev->ofp_port,
+ iface->ofp_port, &s);
+}
+
+static void
+iface_configure_tunnel(struct iface *iface)
+{
+ const char *type = iface_get_type(iface->cfg, iface->port->bridge->cfg);
+
+ if (is_tunnel_realdev(type)) {
+ return iface_configure_tunnel_realdev(iface);
+ } else if (is_tunnel_tundev(type)) {
+ return iface_configure_tunnel_tundev(iface);
+ }
+
+ /* Nothing to do */
+ return;
+}
+
/* Returns true if 'iface' is synthetic, that is, if we constructed it locally
* instead of obtaining it from the database. */
static bool
--
1.7.10.2.484.gcd07cc5
* [PATCH 09/21] ofproto: Add tundev_to_realdev()
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
In essence this is a duplication of ovs_tnl_find_port(),
copying code from the datapath to vswitchd. It is planned
that the datapath version will be removed.
It is used to map from the tundev interface on which a
packet is received in the datapath to the tunnel realdev
interface used in user-space. It is the tunnel realdev
that has the tunnel configuration attached.
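A rough sketch of such a mapping follows. It is not the code in this patch:
demo_realdev, demo_match_score() and demo_tundev_to_realdev() are
hypothetical, the linear scan stands in for whatever lookup structure is
really used, multicast is ignored, and the preference order (a key match
outranks a local-IP match) is an assumption modelled on the order of the
port-pool counters in patch 06.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced realdev configuration. */
struct demo_realdev {
    uint32_t saddr;     /* Local IP; 0 means "any". */
    uint32_t daddr;     /* Remote IP; always nonzero. */
    uint64_t in_key;    /* Expected key when 'key_exact' is true. */
    bool key_exact;     /* false: accept any key. */
    int ofp_port;
};

/* Returns how specifically 'r' matches the packet, or -1 for no match.
 * A received packet's source address is the tunnel's remote end. */
static int
demo_match_score(const struct demo_realdev *r, uint32_t pkt_src,
                 uint32_t pkt_dst, uint64_t pkt_key)
{
    int score = 0;

    if (r->daddr != pkt_src) {
        return -1;
    }
    if (r->saddr) {
        if (r->saddr != pkt_dst) {
            return -1;
        }
        score += 1;
    }
    if (r->key_exact) {
        if (r->in_key != pkt_key) {
            return -1;
        }
        score += 2;
    }
    return score;
}

/* Returns the most specific matching realdev, or NULL if none matches. */
static const struct demo_realdev *
demo_tundev_to_realdev(const struct demo_realdev *devs, size_t n,
                       uint32_t pkt_src, uint32_t pkt_dst, uint64_t pkt_key)
{
    const struct demo_realdev *best = NULL;
    int best_score = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        int score = demo_match_score(&devs[i], pkt_src, pkt_dst, pkt_key);
        if (score > best_score) {
            best_score = score;
            best = &devs[i];
        }
    }
    return best;
}
```

With this scoring, a realdev configured with both key and local IP wins
over one with only a key, which wins over one with only a local IP, which
wins over one that specifies the remote IP alone.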
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
ofproto/ofproto-dpif.c | 194 ++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 174 insertions(+), 20 deletions(-)
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index c7ea391..03a86bc 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -183,7 +183,7 @@ static void bundle_del_port(struct ofport_dpif *);
static void bundle_run(struct ofbundle *);
static void bundle_wait(struct ofbundle *);
static struct ofbundle *lookup_input_bundle(const struct ofproto_dpif *,
- uint16_t in_port, bool warn,
+ const struct flow *, bool warn,
struct ofport_dpif **in_ofportp);
/* A controller may use OFPP_NONE as the ingress port to indicate that
@@ -550,8 +550,12 @@ static unsigned remote_ports;
static unsigned key_multicast_ports;
static unsigned multicast_ports;
+static bool tunnel_adjust_flow(const struct ofproto_dpif *ofproto,
+ struct flow *flow);
static int set_tunnelling(struct ofport *ofport_, uint16_t realdev_ofp_port,
const struct tunnel_settings *s);
+static struct ofport_dpif *tundev_to_realdev(const struct ofproto_dpif *ofproto,
+ const struct flow *flow);
static uint32_t
realdev_to_txdev(const struct ofproto_dpif *ofproto,
@@ -2998,6 +3002,7 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif *ofproto,
struct ofpbuf *packet)
{
enum odp_key_fitness fitness;
+ bool adjusted = false;
fitness = odp_flow_key_to_flow(key, key_len, flow);
if (fitness == ODP_FIT_ERROR) {
@@ -3005,7 +3010,9 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif *ofproto,
}
*initial_tci = flow->vlan_tci;
- if (vsp_adjust_flow(ofproto, flow)) {
+ if (tunnel_adjust_flow(ofproto, flow)) {
+ adjusted = true;
+ } else if (vsp_adjust_flow(ofproto, flow)) {
if (packet) {
/* Make the packet resemble the flow, so that it gets sent to an
* OpenFlow controller properly, so that it looks correct for
@@ -3023,11 +3030,12 @@ ofproto_dpif_extract_flow_key(const struct ofproto_dpif *ofproto,
* since we don't need that header anymore. */
eth_push_vlan(packet, flow->vlan_tci);
}
+ adjusted = true;
+ }
- /* Let the caller know that we can't reproduce 'key' from 'flow'. */
- if (fitness == ODP_FIT_PERFECT) {
- fitness = ODP_FIT_TOO_MUCH;
- }
+ /* Let the caller know that we can't reproduce 'key' from 'flow'. */
+ if (adjusted && fitness == ODP_FIT_PERFECT) {
+ fitness = ODP_FIT_TOO_MUCH;
}
return fitness;
@@ -5934,7 +5942,7 @@ add_mirror_actions(struct action_xlate_ctx *ctx, const struct flow *orig_flow)
const struct nlattr *a;
size_t left;
- in_bundle = lookup_input_bundle(ctx->ofproto, orig_flow->in_port,
+ in_bundle = lookup_input_bundle(ctx->ofproto, orig_flow,
ctx->packet != NULL, NULL);
if (!in_bundle) {
return;
@@ -6095,13 +6103,17 @@ update_learning_table(struct ofproto_dpif *ofproto,
}
static struct ofbundle *
-lookup_input_bundle(const struct ofproto_dpif *ofproto, uint16_t in_port,
- bool warn, struct ofport_dpif **in_ofportp)
+lookup_input_bundle(const struct ofproto_dpif *ofproto,
+ const struct flow *flow, bool warn,
+ struct ofport_dpif **in_ofportp)
{
struct ofport_dpif *ofport;
/* Find the port and bundle for the received packet. */
- ofport = get_ofp_port(ofproto, in_port);
+ ofport = tundev_to_realdev(ofproto, flow);
+ if (!ofport) {
+ ofport = get_ofp_port(ofproto, flow->in_port);
+ }
if (in_ofportp) {
*in_ofportp = ofport;
}
@@ -6111,7 +6123,7 @@ lookup_input_bundle(const struct ofproto_dpif *ofproto, uint16_t in_port,
/* Special-case OFPP_NONE, which a controller may use as the ingress
* port for traffic that it is sourcing. */
- if (in_port == OFPP_NONE) {
+ if (flow->in_port == OFPP_NONE) {
return &ofpp_none_bundle;
}
@@ -6129,7 +6141,7 @@ lookup_input_bundle(const struct ofproto_dpif *ofproto, uint16_t in_port,
static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
VLOG_WARN_RL(&rl, "bridge %s: received packet on unknown "
- "port %"PRIu16, ofproto->up.name, in_port);
+ "port %"PRIu16, ofproto->up.name, flow->in_port);
}
return NULL;
}
@@ -6196,7 +6208,7 @@ xlate_normal(struct action_xlate_ctx *ctx)
ctx->has_normal = true;
- in_bundle = lookup_input_bundle(ctx->ofproto, ctx->flow.in_port,
+ in_bundle = lookup_input_bundle(ctx->ofproto, &ctx->flow,
ctx->packet != NULL, &in_port);
if (!in_bundle) {
return;
@@ -7166,16 +7178,19 @@ tun_remove(struct ofport_dpif *ofport)
}
static void
-tun_add(struct ofport_dpif *ofport, uint16_t tundev_ofp_port,
- const struct tunnel_settings *s)
+tun_add(struct ofport_dpif *ofport)
{
struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofport->up.ofproto);
- ofport->tun->tundev_ofp_port = tundev_ofp_port;
- ofport->tun->s = *s;
+ /* Only add if the daddr is non-zero, in which case ofport is a
+ * realdev. Otherwise it is a tundev. */
+ if (ofport->tun->s.daddr == htonl(0)) {
+ return;
+ }
+
(*tun_port_pool(&ofport->tun->s))++;
hmap_insert(&ofproto->tundev_map, &ofport->tun->tundev_node,
- hash_int(tundev_ofp_port, 0));
+ hash_int(ofport->tun->tundev_ofp_port, 0));
}
static int
@@ -7203,15 +7218,154 @@ set_tunnelling(struct ofport *ofport_, uint16_t tundev_ofp_port,
if (ofport->tun->tundev_ofp_port == tundev_ofp_port &&
tunnel_settings_equal(&ofport->tun->s, s)) {
return 0;
- }
+ }
tun_remove(ofport);
}
- tun_add(ofport, tundev_ofp_port, s);
+ ofport->tun->s = *s;
+ ofport->tun->tundev_ofp_port = tundev_ofp_port;
+ tun_add(ofport);
return 0;
}
+struct tunnel_lookup_key {
+ ovs_be64 tun_id;
+ ovs_be32 ipv4_src;
+ ovs_be32 ipv4_dst;
+ uint8_t tun_type;
+};
+
+static struct ofport_dpif *
+tundev_find(const struct ofproto_dpif *ofproto, uint16_t tundev_ofp_port,
+ const struct tunnel_lookup_key *tun_key)
+{
+ struct ofport_dpif_tun *tun;
+
+ HMAP_FOR_EACH_WITH_HASH (tun, tundev_node, hash_int(tundev_ofp_port, 0),
+ &ofproto->tundev_map) {
+ if (tun_key->tun_type == tun->s.type &&
+ tun_key->ipv4_dst == tun->s.daddr &&
+ tun_key->tun_id == tun->s.in_key &&
+ tun_key->ipv4_src == tun->s.saddr) {
+ return tun->ofport;
+ }
+ }
+
+ return NULL;
+}
+
+/* Returns the port of the "real" device underlying the Linux tunnel device
+ * on which 'flow' was received, matched using the flow's tun_key.
+ *
+ * Returns NULL if no match is found. */
+static struct ofport_dpif *
+tundev_to_realdev(const struct ofproto_dpif *ofproto, const struct flow *flow)
+{
+ bool is_multicast = ipv4_is_multicast(flow->tun_key.ipv4_dst);
+ struct ofport_dpif *tundev_ofport;
+ struct ofport_dpif *realdev_ofport;
+ struct tunnel_lookup_key lookup;
+
+ /* Nothing to do if the packet wasn't unencapsulated on receive */
+ if (!flow->tun_key.ipv4_dst) {
+ return NULL;
+ }
+
+ /* Nothing to do if there are no tunnel devices configured */
+ if (hmap_is_empty(&ofproto->tundev_map)) {
+ return NULL;
+ }
+
+ /* Give up if the tunnel device can't be found
+ * or isn't a tunnel tundev */
+ tundev_ofport = get_ofp_port(ofproto, flow->in_port);
+ if (!tundev_ofport || !tundev_ofport->tun || tundev_ofport->tun->s.daddr) {
+ return NULL;
+ }
+
+ lookup.tun_id = flow->tun_key.tun_id;
+ lookup.ipv4_src = flow->tun_key.ipv4_dst;
+ lookup.ipv4_dst = flow->tun_key.ipv4_src;
+
+ /* First try for an exact match on the tun_id */
+ lookup.tun_id = flow->tun_key.tun_id;
+ lookup.tun_type = tundev_ofport->tun->s.type | TNL_T_KEY_EXACT;
+ if (!is_multicast && key_local_remote_ports) {
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ }
+ if (key_remote_ports) {
+ lookup.ipv4_src = htonl(0);
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ lookup.ipv4_src = flow->tun_key.ipv4_dst;
+ }
+
+ /* Then try matches that wildcard the tun_id. */
+ lookup.tun_id = htonll(0);
+ lookup.tun_type = tundev_ofport->tun->s.type | TNL_T_KEY_MATCH;
+ if (!is_multicast && local_remote_ports) {
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ }
+ if (remote_ports) {
+ lookup.ipv4_src = htonl(0);
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ }
+
+ if (is_multicast) {
+ lookup.ipv4_src = htonl(0);
+ lookup.ipv4_dst = flow->tun_key.ipv4_dst;
+ if (key_multicast_ports) {
+ lookup.tun_id = flow->tun_key.tun_id;
+ lookup.tun_type = tundev_ofport->tun->s.type | TNL_T_KEY_EXACT;
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ }
+ if (multicast_ports) {
+ lookup.tun_id = 0;
+ lookup.tun_type = tundev_ofport->tun->s.type | TNL_T_KEY_MATCH;
+ realdev_ofport = tundev_find(ofproto, flow->in_port, &lookup);
+ if (realdev_ofport)
+ return realdev_ofport;
+ }
+ }
+
+ return NULL;
+}
+
+/* Given 'flow', a flow representing a packet received on 'ofproto', checks
+ * whether 'flow->in_port' represents a Linux tunnel device. If so, changes
+ * 'flow->in_port' to the "real" device backing the tunnel device, sets
+ * 'flow->tun_key' to the real device's tunnel settings, and returns true.
+ * Otherwise (which is always the case unless tunneling is enabled), returns
+ * false without making any changes. */
+static bool
+tunnel_adjust_flow(const struct ofproto_dpif *ofproto, struct flow *flow)
+{
+ const struct ofport_dpif *realdev_ofport = tundev_to_realdev(ofproto, flow);
+ if (!realdev_ofport) {
+ return false;
+ }
+
+ /* Cause the flow to be processed as if it came in on the real device with
+ * the tunnel's key. */
+ flow->in_port = ofp_port_to_odp_port(realdev_ofport->up.ofp_port);
+ flow->tun_key.tun_id = realdev_ofport->tun->s.out_key;
+ flow->tun_key.ipv4_src = realdev_ofport->tun->s.saddr;
+ flow->tun_key.ipv4_dst = realdev_ofport->tun->s.daddr;
+ flow->tun_key.ipv4_tos = realdev_ofport->tun->s.tos;
+ flow->tun_key.ipv4_ttl = realdev_ofport->tun->s.ttl;
+ return true;
+}
+
/* Maps a port to the port that it should be transmitted on.
* If tunneling is enabled then the associated tunnel port is returned.
* If VLAN splintering is enabled then the ofp_port of the vlandev is
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 10/21] classifier: Convert struct flow flow_metadata to use tun_key
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
This allows the tun_key to be passed throughout user-space,
attached to a flow. This is the essence of flow-based tunneling.
This does not add wildcards for the tun_key, other than the existing match
for the tun_id. It is envisaged that most if not all fields of the tun_key
could be wildcarded.
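As a rough illustration of what carrying the tun_key in the flow means, the sketch below mirrors the field layout of struct flow_tun_key from this patch, with plain fixed-width integers standing in for ovs_be64 and ovs_be32. The names tun_key_sketch, flow_sketch, flow_init_sketch and flow_is_tunnelled are hypothetical and only serve the example. It shows the optional key being copied into the flow, as flow_extract() now does, and the "ipv4_dst is non-zero" test that the patch uses to decide whether a flow is tunnelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as the patch's struct flow_tun_key. */
struct tun_key_sketch {
    uint64_t tun_id;
    uint32_t tun_flags;
    uint32_t ipv4_src;
    uint32_t ipv4_dst;
    uint8_t ipv4_tos;
    uint8_t ipv4_ttl;
    uint8_t pad[2];
};

/* The 24-byte key replaces the former 8-byte tun_id, which is why
 * FLOW_SIG_SIZE grows by 16 in lib/flow.h. */
_Static_assert(sizeof(struct tun_key_sketch) == 24,
               "tunnel key is expected to occupy 24 bytes");

/* Hypothetical cut-down flow: only the fields needed here. */
struct flow_sketch {
    struct tun_key_sketch tun_key;
    uint16_t in_port;
};

/* Zeroes 'flow' and copies 'tun_key' into it if one was supplied.
 * A NULL 'tun_key' leaves the flow untunnelled. */
static void
flow_init_sketch(struct flow_sketch *flow,
                 const struct tun_key_sketch *tun_key, uint16_t in_port)
{
    assert(flow != NULL);

    memset(flow, 0, sizeof *flow);
    if (tun_key) {
        flow->tun_key = *tun_key;
    }
    flow->in_port = in_port;
}

/* A flow counts as tunnelled when its outer destination is set. */
static bool
flow_is_tunnelled(const struct flow_sketch *flow)
{
    return flow->tun_key.ipv4_dst != 0;
}
```

Passing NULL covers callers that have no tunnel information at all, which is how dpif_linux_vport_send() calls flow_extract() after this change.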
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
v4
* flow_format() and ofp_print_packet_in() format strings:
- Make more consistent with each other and format_odp_key_attr()
- Update for flags field of tunnel
* Remove debugging message
* Add struct flow_tun_key to avoid needing to use
ovs_key_ipv4_tunnel which is defined in a Linux kernel header.
This code should be ofproto-provider agnostic.
v3
* Initial posting
classifier: don't use kernel tunnel structure
---
lib/classifier.c | 8 ++++----
lib/dpif-linux.c | 2 +-
lib/flow.c | 31 ++++++++++++++++++++++++++-----
lib/flow.h | 21 ++++++++++++++++-----
lib/meta-flow.c | 4 ++--
lib/nx-match.c | 2 +-
lib/odp-util.c | 24 ++++++++++++++++--------
lib/ofp-print.c | 12 ++++++++++--
lib/ofp-util.c | 4 ++--
ofproto/ofproto-dpif.c | 11 ++++++-----
tests/test-classifier.c | 7 ++++---
11 files changed, 88 insertions(+), 38 deletions(-)
diff --git a/lib/classifier.c b/lib/classifier.c
index e11a585..7dc6560 100644
--- a/lib/classifier.c
+++ b/lib/classifier.c
@@ -129,7 +129,7 @@ cls_rule_set_tun_id_masked(struct cls_rule *rule,
ovs_be64 tun_id, ovs_be64 mask)
{
rule->wc.tun_id_mask = mask;
- rule->flow.tun_id = tun_id & mask;
+ rule->flow.tun_key.tun_id = tun_id & mask;
}
void
@@ -563,11 +563,11 @@ cls_rule_format(const struct cls_rule *rule, struct ds *s)
case 0:
break;
case CONSTANT_HTONLL(UINT64_MAX):
- ds_put_format(s, "tun_id=%#"PRIx64",", ntohll(f->tun_id));
+ ds_put_format(s, "tun_id=%#"PRIx64",", ntohll(f->tun_key.tun_id));
break;
default:
ds_put_format(s, "tun_id=%#"PRIx64"/%#"PRIx64",",
- ntohll(f->tun_id), ntohll(wc->tun_id_mask));
+ ntohll(f->tun_key.tun_id), ntohll(wc->tun_id_mask));
break;
}
if (!(w & FWW_IN_PORT)) {
@@ -1187,7 +1187,7 @@ flow_equal_except(const struct flow *a, const struct flow *b,
}
}
- return (!((a->tun_id ^ b->tun_id) & wildcards->tun_id_mask)
+ return (!((a->tun_key.tun_id ^ b->tun_key.tun_id) & wildcards->tun_id_mask)
&& !((a->nw_src ^ b->nw_src) & wildcards->nw_src_mask)
&& !((a->nw_dst ^ b->nw_dst) & wildcards->nw_dst_mask)
&& (wc & FWW_IN_PORT || a->in_port == b->in_port)
diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c
index 256c9d6..0e5cdd2 100644
--- a/lib/dpif-linux.c
+++ b/lib/dpif-linux.c
@@ -1292,7 +1292,7 @@ dpif_linux_vport_send(int dp_ifindex, uint32_t port_no,
uint64_t action;
ofpbuf_use_const(&packet, data, size);
- flow_extract(&packet, 0, htonll(0), 0, &flow);
+ flow_extract(&packet, 0, NULL, 0, &flow);
ofpbuf_use_stack(&key, &keybuf, sizeof keybuf);
odp_flow_key_from_flow(&key, &flow);
diff --git a/lib/flow.c b/lib/flow.c
index fc61610..8645e7d 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -330,7 +330,8 @@ invalid:
* present and has a correct length, and otherwise NULL.
*/
void
-flow_extract(struct ofpbuf *packet, uint32_t skb_priority, ovs_be64 tun_id,
+flow_extract(struct ofpbuf *packet, uint32_t skb_priority,
+ const struct flow_tun_key *tun_key,
uint16_t ofp_in_port, struct flow *flow)
{
struct ofpbuf b = *packet;
@@ -339,7 +340,9 @@ flow_extract(struct ofpbuf *packet, uint32_t skb_priority, ovs_be64 tun_id,
COVERAGE_INC(flow_extract);
memset(flow, 0, sizeof *flow);
- flow->tun_id = tun_id;
+ if (tun_key) {
+ flow->tun_key = *tun_key;
+ }
flow->in_port = ofp_in_port;
flow->skb_priority = skb_priority;
@@ -449,7 +452,7 @@ flow_zero_wildcards(struct flow *flow, const struct flow_wildcards *wildcards)
for (i = 0; i < FLOW_N_REGS; i++) {
flow->regs[i] &= wildcards->reg_masks[i];
}
- flow->tun_id &= wildcards->tun_id_mask;
+ flow->tun_key.tun_id &= wildcards->tun_id_mask;
flow->nw_src &= wildcards->nw_src_mask;
flow->nw_dst &= wildcards->nw_dst_mask;
if (wc & FWW_IN_PORT) {
@@ -508,7 +511,7 @@ flow_get_metadata(const struct flow *flow, struct flow_metadata *fmd)
{
BUILD_ASSERT_DECL(FLOW_WC_SEQ == 10);
- fmd->tun_id = flow->tun_id;
+ fmd->tun_key = flow->tun_key;
fmd->tun_id_mask = htonll(UINT64_MAX);
memcpy(fmd->regs, flow->regs, sizeof fmd->regs);
@@ -528,11 +531,13 @@ flow_to_string(const struct flow *flow)
void
flow_format(struct ds *ds, const struct flow *flow)
{
+ /* The tunnel key is also displayed as part of tunnel() below.
+ * It is here for backwards-compatibility */
ds_put_format(ds, "priority:%"PRIu32
",tunnel:%#"PRIx64
",in_port:%04"PRIx16,
flow->skb_priority,
- ntohll(flow->tun_id),
+ ntohll(flow->tun_key.tun_id),
flow->in_port);
ds_put_format(ds, ",tci(");
@@ -579,6 +584,22 @@ flow_format(struct ds *ds, const struct flow *flow)
ETH_ADDR_ARGS(flow->arp_sha),
ETH_ADDR_ARGS(flow->arp_tha));
}
+ if (!eth_addr_is_zero(flow->arp_sha) || !eth_addr_is_zero(flow->arp_tha)) {
+ ds_put_format(ds, " arp_ha("ETH_ADDR_FMT"->"ETH_ADDR_FMT")",
+ ETH_ADDR_ARGS(flow->arp_sha),
+ ETH_ADDR_ARGS(flow->arp_tha));
+ }
+ if (flow->tun_key.ipv4_dst != htonl(0)) {
+ ds_put_format(ds, " tunnel(tun_id:%"PRIx64",flags:%"PRIx32
+ ",ip("IP_FMT"->"IP_FMT")"
+ ",tos:%"PRIx8",ttl:%"PRIu8")",
+ ntohll(flow->tun_key.tun_id),
+ flow->tun_key.tun_flags,
+ IP_ARGS(&flow->tun_key.ipv4_src),
+ IP_ARGS(&flow->tun_key.ipv4_dst),
+ flow->tun_key.ipv4_tos, flow->tun_key.ipv4_ttl);
+ }
+
}
void
diff --git a/lib/flow.h b/lib/flow.h
index 7ee9a26..0b5932f 100644
--- a/lib/flow.h
+++ b/lib/flow.h
@@ -52,8 +52,18 @@ BUILD_ASSERT_DECL(FLOW_N_REGS <= NXM_NX_MAX_REGS);
BUILD_ASSERT_DECL(FLOW_NW_FRAG_ANY == NX_IP_FRAG_ANY);
BUILD_ASSERT_DECL(FLOW_NW_FRAG_LATER == NX_IP_FRAG_LATER);
+struct flow_tun_key {
+ ovs_be64 tun_id;
+ uint32_t tun_flags;
+ ovs_be32 ipv4_src;
+ ovs_be32 ipv4_dst;
+ uint8_t ipv4_tos;
+ uint8_t ipv4_ttl;
+ uint8_t pad[2];
+};
+
struct flow {
- ovs_be64 tun_id; /* Encapsulating tunnel ID. */
+ struct flow_tun_key tun_key;/* Encapsulating tunnel. */
struct in6_addr ipv6_src; /* IPv6 source address. */
struct in6_addr ipv6_dst; /* IPv6 destination address. */
struct in6_addr nd_target; /* IPv6 neighbor discovery (ND) target. */
@@ -82,7 +92,7 @@ struct flow {
* indicate which metadata fields are relevant in a given context. Typically
* they will be all 1 or all 0. */
struct flow_metadata {
- ovs_be64 tun_id; /* Encapsulating tunnel ID. */
+ struct flow_tun_key tun_key; /* Encapsulating tunnel. */
ovs_be64 tun_id_mask; /* 1-bit in each significant tun_id bit.*/
uint32_t regs[FLOW_N_REGS]; /* Registers. */
@@ -93,16 +103,17 @@ struct flow_metadata {
/* Assert that there are FLOW_SIG_SIZE bytes of significant data in "struct
* flow", followed by FLOW_PAD_SIZE bytes of padding. */
-#define FLOW_SIG_SIZE (110 + FLOW_N_REGS * 4)
+#define FLOW_SIG_SIZE (126 + FLOW_N_REGS * 4)
#define FLOW_PAD_SIZE 2
BUILD_ASSERT_DECL(offsetof(struct flow, nw_frag) == FLOW_SIG_SIZE - 1);
BUILD_ASSERT_DECL(sizeof(((struct flow *)0)->nw_frag) == 1);
BUILD_ASSERT_DECL(sizeof(struct flow) == FLOW_SIG_SIZE + FLOW_PAD_SIZE);
/* Remember to update FLOW_WC_SEQ when changing 'struct flow'. */
-BUILD_ASSERT_DECL(FLOW_SIG_SIZE == 142 && FLOW_WC_SEQ == 10);
+BUILD_ASSERT_DECL(FLOW_SIG_SIZE == 158 && FLOW_WC_SEQ == 10);
-void flow_extract(struct ofpbuf *, uint32_t priority, ovs_be64 tun_id,
+void flow_extract(struct ofpbuf *, uint32_t priority,
+ const struct flow_tun_key *,
uint16_t in_port, struct flow *);
void flow_zero_wildcards(struct flow *, const struct flow_wildcards *);
void flow_get_metadata(const struct flow *, struct flow_metadata *);
diff --git a/lib/meta-flow.c b/lib/meta-flow.c
index 8b60b35..0b47ea1 100644
--- a/lib/meta-flow.c
+++ b/lib/meta-flow.c
@@ -962,7 +962,7 @@ mf_get_value(const struct mf_field *mf, const struct flow *flow,
{
switch (mf->id) {
case MFF_TUN_ID:
- value->be64 = flow->tun_id;
+ value->be64 = flow->tun_key.tun_id;
break;
case MFF_IN_PORT:
@@ -1300,7 +1300,7 @@ mf_set_flow_value(const struct mf_field *mf,
{
switch (mf->id) {
case MFF_TUN_ID:
- flow->tun_id = value->be64;
+ flow->tun_key.tun_id = value->be64;
break;
case MFF_IN_PORT:
diff --git a/lib/nx-match.c b/lib/nx-match.c
index 34c8354..f97ef5d 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -541,7 +541,7 @@ nx_put_match(struct ofpbuf *b, const struct cls_rule *cr,
}
/* Tunnel ID. */
- nxm_put_64m(b, NXM_NX_TUN_ID, flow->tun_id, cr->wc.tun_id_mask);
+ nxm_put_64m(b, NXM_NX_TUN_ID, flow->tun_key.tun_id, cr->wc.tun_id_mask);
/* Registers. */
for (i = 0; i < FLOW_N_REGS; i++) {
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 7cff00c..5f76f5e 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -1299,8 +1299,12 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow)
nl_msg_put_u32(buf, OVS_KEY_ATTR_PRIORITY, flow->skb_priority);
}
- if (flow->tun_id != htonll(0)) {
- nl_msg_put_be64(buf, OVS_KEY_ATTR_TUN_ID, flow->tun_id);
+ if (flow->tun_key.ipv4_dst != htonl(0)) {
+ struct flow_tun_key *tun_key;
+
+ tun_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_IPV4_TUNNEL,
+ sizeof *tun_key);
+ *tun_key = flow->tun_key;
}
if (flow->in_port != OFPP_NONE && flow->in_port != OFPP_CONTROLLER) {
@@ -1791,9 +1795,13 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len,
expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_PRIORITY;
}
- if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_TUN_ID)) {
- flow->tun_id = nl_attr_get_be64(attrs[OVS_KEY_ATTR_TUN_ID]);
- expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_TUN_ID;
+ if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL)) {
+ const struct flow_tun_key *tun_key;
+
+ tun_key = nl_attr_get(attrs[OVS_KEY_ATTR_IPV4_TUNNEL]);
+ flow->tun_key = *tun_key;
+
+ expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_IPV4_TUNNEL;
}
if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IN_PORT)) {
@@ -1887,13 +1895,13 @@ static void
commit_set_tun_id_action(const struct flow *flow, struct flow *base,
struct ofpbuf *odp_actions)
{
- if (base->tun_id == flow->tun_id) {
+ if (base->tun_key.tun_id == flow->tun_key.tun_id) {
return;
}
- base->tun_id = flow->tun_id;
+ base->tun_key.tun_id = flow->tun_key.tun_id;
commit_set_action(odp_actions, OVS_KEY_ATTR_TUN_ID,
- &base->tun_id, sizeof(base->tun_id));
+ &base->tun_key.tun_id, sizeof(base->tun_key.tun_id));
}
static void
diff --git a/lib/ofp-print.c b/lib/ofp-print.c
index 1757a30..fff7454 100644
--- a/lib/ofp-print.c
+++ b/lib/ofp-print.c
@@ -106,11 +106,19 @@ ofp_print_packet_in(struct ds *string, const struct ofp_header *oh,
ds_put_format(string, " total_len=%"PRIu16" in_port=", pin.total_len);
ofputil_format_port(pin.fmd.in_port, string);
- if (pin.fmd.tun_id_mask) {
- ds_put_format(string, " tun_id=0x%"PRIx64, ntohll(pin.fmd.tun_id));
+ if (pin.fmd.tun_key.ipv4_dst != htonl(0)) {
+ ds_put_format(string, " tunnel(tun_id=0x%"PRIx64,
+ ntohll(pin.fmd.tun_key.tun_id));
if (pin.fmd.tun_id_mask != htonll(UINT64_MAX)) {
ds_put_format(string, "/0x%"PRIx64, ntohll(pin.fmd.tun_id_mask));
}
+ ds_put_format(string, ",flags=%"PRIx32",ip="IP_FMT"->"IP_FMT","
+ "tos=%"PRIx8",ttl=%"PRIu8")",
+ pin.fmd.tun_key.tun_flags,
+ IP_ARGS(&pin.fmd.tun_key.ipv4_src),
+ IP_ARGS(&pin.fmd.tun_key.ipv4_dst),
+ pin.fmd.tun_key.ipv4_tos,
+ pin.fmd.tun_key.ipv4_ttl);
}
for (i = 0; i < FLOW_N_REGS; i++) {
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 90124ec..652a6bf 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -2096,7 +2096,7 @@ ofputil_decode_packet_in(struct ofputil_packet_in *pin,
pin->fmd.in_port = rule.flow.in_port;
- pin->fmd.tun_id = rule.flow.tun_id;
+ pin->fmd.tun_key.tun_id = rule.flow.tun_key.tun_id;
pin->fmd.tun_id_mask = rule.wc.tun_id_mask;
memcpy(pin->fmd.regs, rule.flow.regs, sizeof pin->fmd.regs);
@@ -2149,7 +2149,7 @@ ofputil_encode_packet_in(const struct ofputil_packet_in *pin,
+ 2 + send_len);
cls_rule_init_catchall(&rule, 0);
- cls_rule_set_tun_id_masked(&rule, pin->fmd.tun_id,
+ cls_rule_set_tun_id_masked(&rule, pin->fmd.tun_key.tun_id,
pin->fmd.tun_id_mask);
for (i = 0; i < FLOW_N_REGS; i++) {
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 03a86bc..2a52f37 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -3080,7 +3080,7 @@ handle_miss_upcalls(struct ofproto_dpif *ofproto, struct dpif_upcall *upcalls,
continue;
}
flow_extract(upcall->packet, miss->flow.skb_priority,
- miss->flow.tun_id, miss->flow.in_port, &miss->flow);
+ &miss->flow.tun_key, miss->flow.in_port, &miss->flow);
/* Add other packets to a to-do list. */
hash = flow_hash(&miss->flow, 0);
@@ -5464,7 +5464,7 @@ do_xlate_actions(const union ofp_action *in, size_t n_in,
case OFPUTIL_NXAST_SET_TUNNEL:
nast = (const struct nx_action_set_tunnel *) ia;
tun_id = htonll(ntohl(nast->tun_id));
- ctx->flow.tun_id = tun_id;
+ ctx->flow.tun_key.tun_id = tun_id;
break;
case OFPUTIL_NXAST_SET_QUEUE:
@@ -5492,7 +5492,7 @@ do_xlate_actions(const union ofp_action *in, size_t n_in,
case OFPUTIL_NXAST_SET_TUNNEL64:
tun_id = ((const struct nx_action_set_tunnel64 *) ia)->tun_id;
- ctx->flow.tun_id = tun_id;
+ ctx->flow.tun_key.tun_id = tun_id;
break;
case OFPUTIL_NXAST_MULTIPATH:
@@ -5576,7 +5576,7 @@ action_xlate_ctx_init(struct action_xlate_ctx *ctx,
ctx->ofproto = ofproto;
ctx->flow = *flow;
ctx->base_flow = ctx->flow;
- ctx->base_flow.tun_id = 0;
+ ctx->base_flow.tun_key.ipv4_src = 0;
ctx->base_flow.vlan_tci = initial_tci;
ctx->rule = rule;
ctx->packet = packet;
@@ -6739,6 +6739,7 @@ ofproto_unixctl_trace(struct unixctl_conn *conn, int argc, const char *argv[],
const char *packet_s = argv[5];
uint16_t in_port = ofp_port_to_odp_port(atoi(in_port_s));
ovs_be64 tun_id = htonll(strtoull(tun_id_s, NULL, 0));
+ struct flow_tun_key tun_key = { .tun_id = tun_id };
uint32_t priority = atoi(priority_s);
const char *msg;
@@ -6753,7 +6754,7 @@ ofproto_unixctl_trace(struct unixctl_conn *conn, int argc, const char *argv[],
ds_put_cstr(&result, s);
free(s);
- flow_extract(packet, priority, tun_id, in_port, &flow);
+ flow_extract(packet, priority, &tun_key, in_port, &flow);
initial_tci = flow.vlan_tci;
} else {
unixctl_command_reply_error(conn, "Bad command syntax");
diff --git a/tests/test-classifier.c b/tests/test-classifier.c
index fcafdb2..5bb5df8 100644
--- a/tests/test-classifier.c
+++ b/tests/test-classifier.c
@@ -44,7 +44,7 @@
/* struct flow all-caps */ \
/* FWW_* bit(s) member name name */ \
/* -------------------------- ----------- -------- */ \
- CLS_FIELD(0, tun_id, TUN_ID) \
+ CLS_FIELD(0, tun_key.tun_id, TUN_ID) \
CLS_FIELD(0, nw_src, NW_SRC) \
CLS_FIELD(0, nw_dst, NW_DST) \
CLS_FIELD(FWW_IN_PORT, in_port, IN_PORT) \
@@ -206,7 +206,8 @@ match(const struct cls_rule *wild, const struct flow *fixed)
eq = !((fixed->vlan_tci ^ wild->flow.vlan_tci)
& wild->wc.vlan_tci_mask);
} else if (f_idx == CLS_F_IDX_TUN_ID) {
- eq = !((fixed->tun_id ^ wild->flow.tun_id) & wild->wc.tun_id_mask);
+ eq = !((fixed->tun_key.tun_id ^ wild->flow.tun_key.tun_id) &
+ wild->wc.tun_id_mask);
} else if (f_idx == CLS_F_IDX_NW_DSCP) {
eq = !((fixed->nw_tos ^ wild->flow.nw_tos) & IP_DSCP_MASK);
} else {
@@ -362,7 +363,7 @@ compare_classifiers(struct classifier *cls, struct tcls *tcls)
x = rand () % N_FLOW_VALUES;
flow.nw_src = nw_src_values[get_value(&x, N_NW_SRC_VALUES)];
flow.nw_dst = nw_dst_values[get_value(&x, N_NW_DST_VALUES)];
- flow.tun_id = tun_id_values[get_value(&x, N_TUN_ID_VALUES)];
+ flow.tun_key.tun_id = tun_id_values[get_value(&x, N_TUN_ID_VALUES)];
flow.in_port = in_port_values[get_value(&x, N_IN_PORT_VALUES)];
flow.vlan_tci = vlan_tci_values[get_value(&x, N_VLAN_TCI_VALUES)];
flow.dl_type = dl_type_values[get_value(&x, N_DL_TYPE_VALUES)];
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 11/21] datapath, vport: Provide tunnel realdev and tundev classes and vports
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
On the user-space side of things, the existing tunnel classes become tunnel
realdev classes and new classes are added to provide tunnel tundevs.
On the datapath side of things, the existing tunnel vports are used as
tundev vports. A new vport is added for tunnel realdevs.
It should be possible to remove realdevs entirely from the datapath,
however that requires teaching the user-space netdev to exclude them from
kernel-related operations. I have avoided that at this time in order to
allow review of other aspects of the approach taken in my flow-based
tunneling prototype.
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
--
v4
* Tunnel tundevs should have a NULL set_config callback as their
parse_config callback is NULL. Otherwise, reconfiguration will fail and
ovs-vswitchd will exit if started with tundevs already configured.
* Remove unparse_tunnel_config, it is not used
v3
* Initial Post
remove unparse_tunnel_config
---
datapath/Modules.mk | 3 +-
datapath/tunnel.c | 158 +------------------
datapath/vport-capwap.c | 2 -
datapath/vport-gre.c | 2 -
datapath/vport-tunnel-realdev.c | 260 +++++++++++++++++++++++++++++++
datapath/vport.c | 1 +
datapath/vport.h | 1 +
include/linux/openvswitch.h | 1 +
include/openvswitch/tunnel.h | 2 +
lib/netdev-vport.c | 333 +++++++++-------------------------------
10 files changed, 343 insertions(+), 420 deletions(-)
create mode 100644 datapath/vport-tunnel-realdev.c
diff --git a/datapath/Modules.mk b/datapath/Modules.mk
index 24c1075..9aed4c3 100644
--- a/datapath/Modules.mk
+++ b/datapath/Modules.mk
@@ -26,7 +26,8 @@ openvswitch_sources = \
vport-gre.c \
vport-internal_dev.c \
vport-netdev.c \
- vport-patch.c
+ vport-patch.c \
+ vport-tunnel-realdev.c
openvswitch_headers = \
checksum.h \
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index 61add96..f07ec69 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -250,21 +250,6 @@ static void port_table_add_port(struct vport *vport)
(*find_port_pool(rtnl_dereference(tnl_vport->mutable)))++;
}
-static void port_table_move_port(struct vport *vport,
- struct tnl_mutable_config *new_mutable)
-{
- struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
- u32 hash;
-
- hash = port_hash(&new_mutable->key);
- hlist_del_init_rcu(&tnl_vport->hash_node);
- hlist_add_head_rcu(&tnl_vport->hash_node, find_bucket(hash));
-
- (*find_port_pool(rtnl_dereference(tnl_vport->mutable)))--;
- assign_config_rcu(vport, new_mutable);
- (*find_port_pool(rtnl_dereference(tnl_vport->mutable)))++;
-}
-
static void port_table_remove_port(struct vport *vport)
{
struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
@@ -1381,71 +1366,20 @@ out:
return sent_len;
}
-static const struct nla_policy tnl_policy[OVS_TUNNEL_ATTR_MAX + 1] = {
- [OVS_TUNNEL_ATTR_FLAGS] = { .type = NLA_U32 },
- [OVS_TUNNEL_ATTR_DST_IPV4] = { .type = NLA_U32 },
- [OVS_TUNNEL_ATTR_SRC_IPV4] = { .type = NLA_U32 },
- [OVS_TUNNEL_ATTR_OUT_KEY] = { .type = NLA_U64 },
- [OVS_TUNNEL_ATTR_IN_KEY] = { .type = NLA_U64 },
- [OVS_TUNNEL_ATTR_TOS] = { .type = NLA_U8 },
- [OVS_TUNNEL_ATTR_TTL] = { .type = NLA_U8 },
-};
-
/* Sets OVS_TUNNEL_ATTR_* fields in 'mutable', which must initially be
* zeroed. */
-static int tnl_set_config(struct net *net, struct nlattr *options,
+static int tnl_set_config(struct net *net,
const struct tnl_ops *tnl_ops,
const struct vport *cur_vport,
struct tnl_mutable_config *mutable)
{
const struct vport *old_vport;
const struct tnl_mutable_config *old_mutable;
- struct nlattr *a[OVS_TUNNEL_ATTR_MAX + 1];
- int err;
-
- if (!options)
- return -EINVAL;
-
- err = nla_parse_nested(a, OVS_TUNNEL_ATTR_MAX, options, tnl_policy);
- if (err)
- return err;
-
- if (!a[OVS_TUNNEL_ATTR_FLAGS] || !a[OVS_TUNNEL_ATTR_DST_IPV4])
- return -EINVAL;
-
- mutable->flags = nla_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]) & TNL_F_PUBLIC;
+ mutable->flags = 0;
port_key_set_net(&mutable->key, net);
- mutable->key.daddr = nla_get_be32(a[OVS_TUNNEL_ATTR_DST_IPV4]);
- if (a[OVS_TUNNEL_ATTR_SRC_IPV4]) {
- if (ipv4_is_multicast(mutable->key.daddr))
- return -EINVAL;
- mutable->key.saddr = nla_get_be32(a[OVS_TUNNEL_ATTR_SRC_IPV4]);
- }
-
- if (a[OVS_TUNNEL_ATTR_TOS]) {
- mutable->tos = nla_get_u8(a[OVS_TUNNEL_ATTR_TOS]);
- /* Reject ToS config with ECN bits set. */
- if (mutable->tos & INET_ECN_MASK)
- return -EINVAL;
- }
-
- if (a[OVS_TUNNEL_ATTR_TTL])
- mutable->ttl = nla_get_u8(a[OVS_TUNNEL_ATTR_TTL]);
-
+ mutable->key.daddr = htonl(0);
mutable->key.tunnel_type = tnl_ops->tunnel_type;
- if (!a[OVS_TUNNEL_ATTR_IN_KEY]) {
- mutable->key.tunnel_type |= TNL_T_KEY_MATCH;
- mutable->flags |= TNL_F_IN_KEY_MATCH;
- } else {
- mutable->key.tunnel_type |= TNL_T_KEY_EXACT;
- mutable->key.in_key = nla_get_be64(a[OVS_TUNNEL_ATTR_IN_KEY]);
- }
-
- if (!a[OVS_TUNNEL_ATTR_OUT_KEY])
- mutable->flags |= TNL_F_OUT_KEY_ACTION;
- else
- mutable->out_key = nla_get_be64(a[OVS_TUNNEL_ATTR_OUT_KEY]);
mutable->tunnel_hlen = tnl_ops->hdr_len(mutable);
if (mutable->tunnel_hlen < 0)
@@ -1458,21 +1392,6 @@ static int tnl_set_config(struct net *net, struct nlattr *options,
return -EEXIST;
mutable->mlink = 0;
- if (ipv4_is_multicast(mutable->key.daddr)) {
- struct net_device *dev;
- struct rtable *rt;
-
- rt = __find_route(mutable, tnl_ops->ipproto, mutable->tos,
- mutable->key.daddr, mutable->key.saddr);
- if (IS_ERR(rt))
- return -EADDRNOTAVAIL;
- dev = rt_dst(rt).dev;
- ip_rt_put(rt);
- if (__in_dev_get_rtnl(dev) == NULL)
- return -EADDRNOTAVAIL;
- mutable->mlink = dev->ifindex;
- ip_mc_inc_group(__in_dev_get_rtnl(dev), mutable->key.daddr);
- }
return 0;
}
@@ -1509,8 +1428,7 @@ struct vport *ovs_tnl_create(const struct vport_parms *parms,
get_random_bytes(&initial_frag_id, sizeof(int));
atomic_set(&tnl_vport->frag_id, initial_frag_id);
- err = tnl_set_config(ovs_dp_get_net(parms->dp), parms->options, tnl_ops,
- NULL, mutable);
+ err = tnl_set_config(ovs_dp_get_net(parms->dp), tnl_ops, NULL, mutable);
if (err)
goto error_free_mutable;
@@ -1535,74 +1453,6 @@ error:
return ERR_PTR(err);
}
-int ovs_tnl_set_options(struct vport *vport, struct nlattr *options)
-{
- struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
- const struct tnl_mutable_config *old_mutable;
- struct tnl_mutable_config *mutable;
- int err;
-
- mutable = kzalloc(sizeof(struct tnl_mutable_config), GFP_KERNEL);
- if (!mutable) {
- err = -ENOMEM;
- goto error;
- }
-
- /* Copy fields whose values should be retained. */
- old_mutable = rtnl_dereference(tnl_vport->mutable);
- mutable->seq = old_mutable->seq + 1;
- memcpy(mutable->eth_addr, old_mutable->eth_addr, ETH_ALEN);
-
- /* Parse the others configured by userspace. */
- err = tnl_set_config(ovs_dp_get_net(vport->dp), options, tnl_vport->tnl_ops,
- vport, mutable);
- if (err)
- goto error_free;
-
- if (port_hash(&mutable->key) != port_hash(&old_mutable->key))
- port_table_move_port(vport, mutable);
- else
- assign_config_rcu(vport, mutable);
-
- return 0;
-
-error_free:
- free_mutable_rtnl(mutable);
- kfree(mutable);
-error:
- return err;
-}
-
-int ovs_tnl_get_options(const struct vport *vport, struct sk_buff *skb)
-{
- const struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
- const struct tnl_mutable_config *mutable = rcu_dereference_rtnl(tnl_vport->mutable);
-
- if (nla_put_u32(skb, OVS_TUNNEL_ATTR_FLAGS,
- mutable->flags & TNL_F_PUBLIC) ||
- nla_put_be32(skb, OVS_TUNNEL_ATTR_DST_IPV4, mutable->key.daddr))
- goto nla_put_failure;
-
- if (!(mutable->flags & TNL_F_IN_KEY_MATCH) &&
- nla_put_be64(skb, OVS_TUNNEL_ATTR_IN_KEY, mutable->key.in_key))
- goto nla_put_failure;
- if (!(mutable->flags & TNL_F_OUT_KEY_ACTION) &&
- nla_put_be64(skb, OVS_TUNNEL_ATTR_OUT_KEY, mutable->out_key))
- goto nla_put_failure;
- if (mutable->key.saddr &&
- nla_put_be32(skb, OVS_TUNNEL_ATTR_SRC_IPV4, mutable->key.saddr))
- goto nla_put_failure;
- if (mutable->tos && nla_put_u8(skb, OVS_TUNNEL_ATTR_TOS, mutable->tos))
- goto nla_put_failure;
- if (mutable->ttl && nla_put_u8(skb, OVS_TUNNEL_ATTR_TTL, mutable->ttl))
- goto nla_put_failure;
-
- return 0;
-
-nla_put_failure:
- return -EMSGSIZE;
-}
-
static void free_port_rcu(struct rcu_head *rcu)
{
struct tnl_vport *tnl_vport = container_of(rcu,
diff --git a/datapath/vport-capwap.c b/datapath/vport-capwap.c
index 1e08d5a..f26a7d2 100644
--- a/datapath/vport-capwap.c
+++ b/datapath/vport-capwap.c
@@ -835,8 +835,6 @@ const struct vport_ops ovs_capwap_vport_ops = {
.set_addr = ovs_tnl_set_addr,
.get_name = ovs_tnl_get_name,
.get_addr = ovs_tnl_get_addr,
- .get_options = ovs_tnl_get_options,
- .set_options = ovs_tnl_set_options,
.get_dev_flags = ovs_vport_gen_get_dev_flags,
.is_running = ovs_vport_gen_is_running,
.get_operstate = ovs_vport_gen_get_operstate,
diff --git a/datapath/vport-gre.c b/datapath/vport-gre.c
index fd2b038..f610097 100644
--- a/datapath/vport-gre.c
+++ b/datapath/vport-gre.c
@@ -415,8 +415,6 @@ const struct vport_ops ovs_gre_vport_ops = {
.set_addr = ovs_tnl_set_addr,
.get_name = ovs_tnl_get_name,
.get_addr = ovs_tnl_get_addr,
- .get_options = ovs_tnl_get_options,
- .set_options = ovs_tnl_set_options,
.get_dev_flags = ovs_vport_gen_get_dev_flags,
.is_running = ovs_vport_gen_is_running,
.get_operstate = ovs_vport_gen_get_operstate,
diff --git a/datapath/vport-tunnel-realdev.c b/datapath/vport-tunnel-realdev.c
new file mode 100644
index 0000000..6225f70
--- /dev/null
+++ b/datapath/vport-tunnel-realdev.c
@@ -0,0 +1,260 @@
+/*
+ * Copyright (c) 2012 Horms Solution Ltd.
+ *
+ * Based on vport-patch.c
+ *
+ * Copyright (c) 2007-2012 Nicira, Inc.
+ *
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+ * 02110-1301, USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/jhash.h>
+#include <linux/list.h>
+#include <linux/rtnetlink.h>
+#include <net/net_namespace.h>
+
+#include "compat.h"
+#include "datapath.h"
+#include "vport.h"
+#include "vport-generic.h"
+
+struct realdev_config {
+ struct rcu_head rcu;
+
+ unsigned char eth_addr[ETH_ALEN];
+ __be32 daddr;
+ u32 flags;
+};
+
+struct realdev_vport {
+ struct rcu_head rcu;
+
+ char name[IFNAMSIZ];
+
+ struct realdev_config __rcu *realdevconf;
+};
+
+static struct realdev_vport *realdev_vport_priv(const struct vport *vport)
+{
+ return vport_priv(vport);
+}
+
+/* RCU callback. */
+static void free_config(struct rcu_head *rcu)
+{
+ struct realdev_config *c = container_of(rcu, struct realdev_config, rcu);
+ kfree(c);
+}
+
+static void assign_config_rcu(struct vport *vport,
+ struct realdev_config *new_config)
+{
+ struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ struct realdev_config *old_config;
+
+ old_config = rtnl_dereference(realdev_vport->realdevconf);
+ rcu_assign_pointer(realdev_vport->realdevconf, new_config);
+ call_rcu(&old_config->rcu, free_config);
+}
+
+static int realdev_init(void)
+{
+ return 0;
+}
+
+static void realdev_exit(void)
+{
+}
+
+static const struct nla_policy realdev_policy[OVS_TUNNEL_ATTR_MAX + 1] = {
+ [OVS_TUNNEL_ATTR_FLAGS] = { .type = NLA_U32 },
+ [OVS_TUNNEL_ATTR_DST_IPV4] = { .type = NLA_U32 },
+};
+
+static int realdev_set_config(struct vport *vport, const struct nlattr *options,
+ struct realdev_config *realdevconf)
+{
+ struct nlattr *a[OVS_TUNNEL_ATTR_MAX + 1];
+ int err;
+
+ if (!options)
+ return -EINVAL;
+
+ err = nla_parse_nested(a, OVS_TUNNEL_ATTR_MAX, options, realdev_policy);
+ if (err)
+ return err;
+
+ if (!a[OVS_TUNNEL_ATTR_FLAGS] || !a[OVS_TUNNEL_ATTR_DST_IPV4])
+ return -EINVAL;
+
+ realdevconf->flags = nla_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]);
+ realdevconf->daddr = nla_get_u32(a[OVS_TUNNEL_ATTR_DST_IPV4]);
+
+ return 0;
+}
+
+
+static struct vport *realdev_create(const struct vport_parms *parms)
+{
+ struct vport *vport;
+ struct realdev_vport *realdev_vport;
+ struct realdev_config *realdevconf;
+ int err;
+
+ vport = ovs_vport_alloc(sizeof(struct realdev_vport),
+ &ovs_tunnel_realdev_vport_ops, parms);
+ if (IS_ERR(vport)) {
+ err = PTR_ERR(vport);
+ goto error;
+ }
+
+ realdev_vport = realdev_vport_priv(vport);
+
+ strcpy(realdev_vport->name, parms->name);
+
+ realdevconf = kmalloc(sizeof(struct realdev_config), GFP_KERNEL);
+ if (!realdevconf) {
+ err = -ENOMEM;
+ goto error_free_vport;
+ }
+
+ err = realdev_set_config(vport, parms->options, realdevconf);
+ if (err)
+ goto error_free_realdevconf;
+
+ random_ether_addr(realdevconf->eth_addr);
+
+ rcu_assign_pointer(realdev_vport->realdevconf, realdevconf);
+
+ return vport;
+
+error_free_realdevconf:
+ kfree(realdevconf);
+error_free_vport:
+ ovs_vport_free(vport);
+error:
+ return ERR_PTR(err);
+}
+
+static void free_port_rcu(struct rcu_head *rcu)
+{
+ struct realdev_vport *realdev_vport = container_of(rcu,
+ struct realdev_vport, rcu);
+
+ kfree((struct realdev_config __force *)realdev_vport->realdevconf);
+ ovs_vport_free(vport_from_priv(realdev_vport));
+}
+
+static void realdev_destroy(struct vport *vport)
+{
+ struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ call_rcu(&realdev_vport->rcu, free_port_rcu);
+}
+
+static int realdev_set_addr(struct vport *vport, const unsigned char *addr)
+{
+ struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ struct realdev_config *realdevconf;
+
+ realdevconf = kmemdup(rtnl_dereference(realdev_vport->realdevconf),
+ sizeof(struct realdev_config), GFP_KERNEL);
+ if (!realdevconf)
+ return -ENOMEM;
+
+ memcpy(realdevconf->eth_addr, addr, ETH_ALEN);
+ assign_config_rcu(vport, realdevconf);
+
+ return 0;
+}
+
+static int realdev_set_options(struct vport *vport, struct nlattr *options)
+{
+ struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ struct realdev_config *realdevconf;
+ int err;
+
+ realdevconf = kmemdup(rtnl_dereference(realdev_vport->realdevconf),
+ sizeof(struct realdev_config), GFP_KERNEL);
+ if (!realdevconf) {
+ err = -ENOMEM;
+ goto error;
+ }
+
+ err = realdev_set_config(vport, options, realdevconf);
+ if (err)
+ goto error_free;
+
+ assign_config_rcu(vport, realdevconf);
+
+ return 0;
+error_free:
+ kfree(realdevconf);
+error:
+ return err;
+}
+
+static const char *realdev_get_name(const struct vport *vport)
+{
+ const struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ return realdev_vport->name;
+}
+
+static const unsigned char *realdev_get_addr(const struct vport *vport)
+{
+ const struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ return rcu_dereference_rtnl(realdev_vport->realdevconf)->eth_addr;
+}
+
+static int realdev_get_options(const struct vport *vport, struct sk_buff *skb)
+{
+ struct realdev_vport *realdev_vport = realdev_vport_priv(vport);
+ struct realdev_config *realdevconf =
+ rcu_dereference_rtnl(realdev_vport->realdevconf);
+ int err;
+
+ err = nla_put_u32(skb, OVS_TUNNEL_ATTR_FLAGS, realdevconf->flags);
+ if (err)
+ goto error;
+
+ err = nla_put_u32(skb, OVS_TUNNEL_ATTR_DST_IPV4, realdevconf->daddr);
+error:
+ return err;
+}
+
+static int realdev_send(struct vport *vport, struct sk_buff *skb)
+{
+ kfree_skb(skb);
+ ovs_vport_record_error(vport, VPORT_E_TX_DROPPED);
+ return 0;
+}
+
+const struct vport_ops ovs_tunnel_realdev_vport_ops = {
+ .type = OVS_VPORT_TYPE_TUNNEL_REALDEV,
+ .init = realdev_init,
+ .exit = realdev_exit,
+ .create = realdev_create,
+ .destroy = realdev_destroy,
+ .set_addr = realdev_set_addr,
+ .get_name = realdev_get_name,
+ .get_addr = realdev_get_addr,
+ .get_options = realdev_get_options,
+ .set_options = realdev_set_options,
+ .get_dev_flags = ovs_vport_gen_get_dev_flags,
+ .is_running = ovs_vport_gen_is_running,
+ .get_operstate = ovs_vport_gen_get_operstate,
+ .send = realdev_send,
+};
diff --git a/datapath/vport.c b/datapath/vport.c
index 0c77a1b..7759e07 100644
--- a/datapath/vport.c
+++ b/datapath/vport.c
@@ -44,6 +44,7 @@ static const struct vport_ops *base_vport_ops_list[] = {
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,26)
&ovs_capwap_vport_ops,
#endif
+ &ovs_tunnel_realdev_vport_ops,
};
static const struct vport_ops **vport_ops_list;
diff --git a/datapath/vport.h b/datapath/vport.h
index b0cdeae..893daaf 100644
--- a/datapath/vport.h
+++ b/datapath/vport.h
@@ -257,5 +257,6 @@ extern const struct vport_ops ovs_internal_vport_ops;
extern const struct vport_ops ovs_patch_vport_ops;
extern const struct vport_ops ovs_gre_vport_ops;
extern const struct vport_ops ovs_capwap_vport_ops;
+extern const struct vport_ops ovs_tunnel_realdev_vport_ops;
#endif /* vport.h */
diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index c32bb58..87a3e22 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -185,6 +185,7 @@ enum ovs_vport_type {
OVS_VPORT_TYPE_PATCH = 100, /* virtual tunnel connecting two vports */
OVS_VPORT_TYPE_GRE, /* GRE tunnel */
OVS_VPORT_TYPE_CAPWAP, /* CAPWAP tunnel */
+ OVS_VPORT_TYPE_TUNNEL_REALDEV, /* real tunnel device */
__OVS_VPORT_TYPE_MAX
};
diff --git a/include/openvswitch/tunnel.h b/include/openvswitch/tunnel.h
index 5f55ecc..078a940 100644
--- a/include/openvswitch/tunnel.h
+++ b/include/openvswitch/tunnel.h
@@ -74,4 +74,6 @@ enum {
#define TNL_F_IN_KEY (1 << 8) /* Tunnel port has input key. */
#define TNL_F_OUT_KEY (1 << 9) /* Tunnel port has output key. */
+#define TNL_F_CAPWAP (1 << 10)
+
#endif /* openvswitch/tunnel.h */
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index a9eb3eb..7a9803b 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -155,15 +155,24 @@ netdev_vport_get_netdev_type(const struct dpif_linux_vport *vport)
return "patch";
case OVS_VPORT_TYPE_GRE:
- if (tnl_port_config_from_nlattr(vport->options, vport->options_len,
- a)) {
- break;
- }
- return (nl_attr_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]) & TNL_F_IPSEC
- ? "ipsec_gre" : "gre");
+ return "gre-tundev";
case OVS_VPORT_TYPE_CAPWAP:
- return "capwap";
+ return "capwap-tundev";
+
+ case OVS_VPORT_TYPE_TUNNEL_REALDEV:
+ if (tnl_port_config_from_nlattr(vport->options,
+ vport->options_len, a)) {
+ return "no-config";
+ }
+
+ if (nl_attr_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]) & TNL_F_CAPWAP) {
+ return "capwap";
+ } else if (nl_attr_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]) & TNL_F_IPSEC) {
+ return "ipsec_gre";
+ } else {
+ return "gre";
+ }
case __OVS_VPORT_TYPE_MAX:
break;
@@ -248,6 +257,10 @@ netdev_vport_get_config(struct netdev_dev *dev_, struct shash *args)
ofpbuf_delete(buf);
}
+ if (!vport_class->unparse_config) {
+ return 0;
+ }
+
error = vport_class->unparse_config(name, netdev_class->type,
dev->options->data,
dev->options->size,
@@ -267,11 +280,13 @@ netdev_vport_set_config(struct netdev_dev *dev_, const struct shash *args)
struct netdev_dev_vport *dev = netdev_dev_vport_cast(dev_);
const char *name = netdev_dev_get_name(dev_);
struct ofpbuf *options;
- int error;
+ int error = 0;
options = ofpbuf_new(64);
- error = vport_class->parse_config(name, netdev_dev_get_type(dev_),
- args, options);
+ if (vport_class->parse_config) {
+ error = vport_class->parse_config(name, netdev_dev_get_type(dev_),
+ args, options);
+ }
if (!error
&& (!dev->options
|| options->size != dev->options->size
@@ -550,47 +565,18 @@ netdev_vport_poll_notify(const struct netdev *netdev)
\f
/* Code specific to individual vport types. */
-static void
-set_key(const struct shash *args, const char *name, uint16_t type,
- struct ofpbuf *options)
-{
- const char *s;
-
- s = shash_find_data(args, name);
- if (!s) {
- s = shash_find_data(args, "key");
- if (!s) {
- s = "0";
- }
- }
-
- if (!strcmp(s, "flow")) {
- /* This is the default if no attribute is present. */
- } else {
- nl_msg_put_be64(options, type, htonll(strtoull(s, NULL, 0)));
- }
-}
-
static int
parse_tunnel_config(const char *name, const char *type,
const struct shash *args, struct ofpbuf *options)
{
- bool is_gre = false;
- bool is_ipsec = false;
- struct shash_node *node;
- bool ipsec_mech_set = false;
ovs_be32 daddr = htonl(0);
- ovs_be32 saddr = htonl(0);
- uint32_t flags;
-
- flags = TNL_F_DF_DEFAULT | TNL_F_PMTUD | TNL_F_HDR_CACHE;
- if (!strcmp(type, "gre")) {
- is_gre = true;
- } else if (!strcmp(type, "ipsec_gre")) {
- is_gre = true;
- is_ipsec = true;
+ struct shash_node *node;
+ uint32_t flags = 0;
+
+ if (!strcmp(type, "ipsec_gre")) {
flags |= TNL_F_IPSEC;
- flags &= ~TNL_F_HDR_CACHE;
+ } else if (!strcmp(type, "capwap")) {
+ flags |= TNL_F_CAPWAP;
}
SHASH_FOR_EACH (node, args) {
@@ -601,112 +587,9 @@ parse_tunnel_config(const char *name, const char *type,
} else {
daddr = in_addr.s_addr;
}
- } else if (!strcmp(node->name, "local_ip")) {
- struct in_addr in_addr;
- if (lookup_ip(node->data, &in_addr)) {
- VLOG_WARN("%s: bad %s 'local_ip'", name, type);
- } else {
- saddr = in_addr.s_addr;
- }
- } else if (!strcmp(node->name, "tos")) {
- if (!strcmp(node->data, "inherit")) {
- flags |= TNL_F_TOS_INHERIT;
- } else {
- char *endptr;
- int tos;
- tos = strtol(node->data, &endptr, 0);
- if (*endptr == '\0') {
- nl_msg_put_u8(options, OVS_TUNNEL_ATTR_TOS, tos);
- }
- }
- } else if (!strcmp(node->name, "ttl")) {
- if (!strcmp(node->data, "inherit")) {
- flags |= TNL_F_TTL_INHERIT;
- } else {
- nl_msg_put_u8(options, OVS_TUNNEL_ATTR_TTL, atoi(node->data));
- }
- } else if (!strcmp(node->name, "csum") && is_gre) {
- if (!strcmp(node->data, "true")) {
- flags |= TNL_F_CSUM;
- }
- } else if (!strcmp(node->name, "df_inherit")) {
- if (!strcmp(node->data, "true")) {
- flags |= TNL_F_DF_INHERIT;
- }
- } else if (!strcmp(node->name, "df_default")) {
- if (!strcmp(node->data, "false")) {
- flags &= ~TNL_F_DF_DEFAULT;
- }
- } else if (!strcmp(node->name, "pmtud")) {
- if (!strcmp(node->data, "false")) {
- flags &= ~TNL_F_PMTUD;
- }
- } else if (!strcmp(node->name, "header_cache")) {
- if (!strcmp(node->data, "false")) {
- flags &= ~TNL_F_HDR_CACHE;
- }
- } else if (!strcmp(node->name, "peer_cert") && is_ipsec) {
- if (shash_find(args, "certificate")) {
- ipsec_mech_set = true;
- } else {
- const char *use_ssl_cert;
-
- /* If the "use_ssl_cert" is true, then "certificate" and
- * "private_key" will be pulled from the SSL table. The
- * use of this option is strongly discouraged, since it
- * will like be removed when multiple SSL configurations
- * are supported by OVS.
- */
- use_ssl_cert = shash_find_data(args, "use_ssl_cert");
- if (!use_ssl_cert || strcmp(use_ssl_cert, "true")) {
- VLOG_ERR("%s: 'peer_cert' requires 'certificate' argument",
- name);
- return EINVAL;
- }
- ipsec_mech_set = true;
- }
- } else if (!strcmp(node->name, "psk") && is_ipsec) {
- ipsec_mech_set = true;
- } else if (is_ipsec
- && (!strcmp(node->name, "certificate")
- || !strcmp(node->name, "private_key")
- || !strcmp(node->name, "use_ssl_cert"))) {
- /* Ignore options not used by the netdev. */
- } else if (!strcmp(node->name, "key") ||
- !strcmp(node->name, "in_key") ||
- !strcmp(node->name, "out_key")) {
- /* Handled separately below. */
- } else {
- VLOG_WARN("%s: unknown %s argument '%s'", name, type, node->name);
}
}
- if (is_ipsec) {
- char *file_name = xasprintf("%s/%s", ovs_rundir(),
- "ovs-monitor-ipsec.pid");
- pid_t pid = read_pidfile(file_name);
- free(file_name);
- if (pid < 0) {
- VLOG_ERR("%s: IPsec requires the ovs-monitor-ipsec daemon",
- name);
- return EINVAL;
- }
-
- if (shash_find(args, "peer_cert") && shash_find(args, "psk")) {
- VLOG_ERR("%s: cannot define both 'peer_cert' and 'psk'", name);
- return EINVAL;
- }
-
- if (!ipsec_mech_set) {
- VLOG_ERR("%s: IPsec requires an 'peer_cert' or psk' argument",
- name);
- return EINVAL;
- }
- }
-
- set_key(args, "in_key", OVS_TUNNEL_ATTR_IN_KEY, options);
- set_key(args, "out_key", OVS_TUNNEL_ATTR_OUT_KEY, options);
-
if (!daddr) {
VLOG_ERR("%s: %s type requires valid 'remote_ip' argument",
name, type);
@@ -714,14 +597,6 @@ parse_tunnel_config(const char *name, const char *type,
}
nl_msg_put_be32(options, OVS_TUNNEL_ATTR_DST_IPV4, daddr);
- if (saddr) {
- if (ip_is_multicast(daddr)) {
- VLOG_WARN("%s: remote_ip is multicast, ignoring local_ip", name);
- } else {
- nl_msg_put_be32(options, OVS_TUNNEL_ATTR_SRC_IPV4, saddr);
- }
- }
-
nl_msg_put_u32(options, OVS_TUNNEL_ATTR_FLAGS, flags);
return 0;
@@ -749,95 +624,6 @@ tnl_port_config_from_nlattr(const struct nlattr *options, size_t options_len,
}
return 0;
}
-
-static uint64_t
-get_be64_or_zero(const struct nlattr *a)
-{
- return a ? ntohll(nl_attr_get_be64(a)) : 0;
-}
-
-static int
-unparse_tunnel_config(const char *name OVS_UNUSED, const char *type OVS_UNUSED,
- const struct nlattr *options, size_t options_len,
- struct shash *args)
-{
- struct nlattr *a[OVS_TUNNEL_ATTR_MAX + 1];
- ovs_be32 daddr;
- uint32_t flags;
- int error;
-
- error = tnl_port_config_from_nlattr(options, options_len, a);
- if (error) {
- return error;
- }
-
- flags = nl_attr_get_u32(a[OVS_TUNNEL_ATTR_FLAGS]);
- if (!(flags & TNL_F_HDR_CACHE) == !(flags & TNL_F_IPSEC)) {
- smap_add(args, "header_cache",
- flags & TNL_F_HDR_CACHE ? "true" : "false");
- }
-
- daddr = nl_attr_get_be32(a[OVS_TUNNEL_ATTR_DST_IPV4]);
- shash_add(args, "remote_ip", xasprintf(IP_FMT, IP_ARGS(&daddr)));
-
- if (a[OVS_TUNNEL_ATTR_SRC_IPV4]) {
- ovs_be32 saddr = nl_attr_get_be32(a[OVS_TUNNEL_ATTR_SRC_IPV4]);
- shash_add(args, "local_ip", xasprintf(IP_FMT, IP_ARGS(&saddr)));
- }
-
- if (!a[OVS_TUNNEL_ATTR_IN_KEY] && !a[OVS_TUNNEL_ATTR_OUT_KEY]) {
- smap_add(args, "key", "flow");
- } else {
- uint64_t in_key = get_be64_or_zero(a[OVS_TUNNEL_ATTR_IN_KEY]);
- uint64_t out_key = get_be64_or_zero(a[OVS_TUNNEL_ATTR_OUT_KEY]);
-
- if (in_key && in_key == out_key) {
- shash_add(args, "key", xasprintf("%"PRIu64, in_key));
- } else {
- if (!a[OVS_TUNNEL_ATTR_IN_KEY]) {
- smap_add(args, "in_key", "flow");
- } else if (in_key) {
- shash_add(args, "in_key", xasprintf("%"PRIu64, in_key));
- }
-
- if (!a[OVS_TUNNEL_ATTR_OUT_KEY]) {
- smap_add(args, "out_key", "flow");
- } else if (out_key) {
- shash_add(args, "out_key", xasprintf("%"PRIu64, out_key));
- }
- }
- }
-
- if (flags & TNL_F_TTL_INHERIT) {
- smap_add(args, "tos", "inherit");
- } else if (a[OVS_TUNNEL_ATTR_TTL]) {
- int ttl = nl_attr_get_u8(a[OVS_TUNNEL_ATTR_TTL]);
- shash_add(args, "tos", xasprintf("%d", ttl));
- }
-
- if (flags & TNL_F_TOS_INHERIT) {
- smap_add(args, "tos", "inherit");
- } else if (a[OVS_TUNNEL_ATTR_TOS]) {
- int tos = nl_attr_get_u8(a[OVS_TUNNEL_ATTR_TOS]);
- shash_add(args, "tos", xasprintf("0x%x", tos));
- }
-
- if (flags & TNL_F_CSUM) {
- smap_add(args, "csum", "true");
- }
- if (flags & TNL_F_DF_INHERIT) {
- smap_add(args, "df_inherit", "true");
- }
- if (!(flags & TNL_F_DF_DEFAULT)) {
- smap_add(args, "df_default", "false");
- }
- if (!(flags & TNL_F_PMTUD)) {
- smap_add(args, "pmtud", "false");
- }
-
- return 0;
-}
-
static int
parse_patch_config(const char *name, const char *type OVS_UNUSED,
const struct shash *args, struct ofpbuf *options)
@@ -894,15 +680,17 @@ unparse_patch_config(const char *name OVS_UNUSED, const char *type OVS_UNUSED,
return 0;
}
\f
-#define VPORT_FUNCTIONS(GET_STATUS) \
+#define __VPORT_FUNCTIONS(RUN, WAIT, GET_CONFIG, \
+ SET_CONFIG, SEND, GET_STATS, \
+ SET_STATS, GET_STATUS) \
NULL, \
- netdev_vport_run, \
- netdev_vport_wait, \
+ RUN, \
+ WAIT, \
\
netdev_vport_create, \
netdev_vport_destroy, \
- netdev_vport_get_config, \
- netdev_vport_set_config, \
+ GET_CONFIG, \
+ SET_CONFIG, \
\
netdev_vport_open, \
netdev_vport_close, \
@@ -912,7 +700,7 @@ unparse_patch_config(const char *name OVS_UNUSED, const char *type OVS_UNUSED,
NULL, /* recv_wait */ \
NULL, /* drain */ \
\
- netdev_vport_send, /* send */ \
+ SEND, /* send */ \
NULL, /* send_wait */ \
\
netdev_vport_set_etheraddr, \
@@ -923,8 +711,8 @@ unparse_patch_config(const char *name OVS_UNUSED, const char *type OVS_UNUSED,
NULL, /* get_carrier */ \
NULL, /* get_carrier_resets */ \
NULL, /* get_miimon */ \
- netdev_vport_get_stats, \
- netdev_vport_set_stats, \
+ GET_STATS, \
+ SET_STATS, \
\
NULL, /* get_features */ \
NULL, /* set_advertisements */ \
@@ -953,24 +741,47 @@ unparse_patch_config(const char *name OVS_UNUSED, const char *type OVS_UNUSED,
\
netdev_vport_change_seq
+#define VPORT_FUNCTIONS(SET_CONFIG, GET_STATUS) \
+ __VPORT_FUNCTIONS(netdev_vport_run, \
+ netdev_vport_wait, \
+ netdev_vport_get_config, \
+ SET_CONFIG, \
+ netdev_vport_send, \
+ netdev_vport_get_stats, \
+ netdev_vport_set_stats, \
+ GET_STATUS)
+
+#define VPORT_TUNNEL_REALDEV_FUNCTIONS \
+ __VPORT_FUNCTIONS(NULL, NULL, NULL, \
+ netdev_vport_set_config, \
+ NULL, NULL, NULL, NULL)
+
void
netdev_vport_register(void)
{
static const struct vport_class vport_classes[] = {
- { OVS_VPORT_TYPE_GRE,
- { "gre", VPORT_FUNCTIONS(netdev_vport_get_drv_info) },
- parse_tunnel_config, unparse_tunnel_config },
+ { OVS_VPORT_TYPE_TUNNEL_REALDEV,
+ { "gre", VPORT_TUNNEL_REALDEV_FUNCTIONS },
+ parse_tunnel_config, NULL },
+
+ { OVS_VPORT_TYPE_TUNNEL_REALDEV,
+ { "ipsec_gre", VPORT_TUNNEL_REALDEV_FUNCTIONS },
+ parse_tunnel_config, NULL },
{ OVS_VPORT_TYPE_GRE,
- { "ipsec_gre", VPORT_FUNCTIONS(netdev_vport_get_drv_info) },
- parse_tunnel_config, unparse_tunnel_config },
+ { "gre-tundev", VPORT_FUNCTIONS(NULL, netdev_vport_get_drv_info) },
+ NULL, NULL },
+
+ { OVS_VPORT_TYPE_TUNNEL_REALDEV,
+ { "capwap", VPORT_TUNNEL_REALDEV_FUNCTIONS },
+ parse_tunnel_config, NULL },
{ OVS_VPORT_TYPE_CAPWAP,
- { "capwap", VPORT_FUNCTIONS(netdev_vport_get_drv_info) },
- parse_tunnel_config, unparse_tunnel_config },
+ { "capwap-tundev", VPORT_FUNCTIONS(NULL, netdev_vport_get_drv_info) },
+ NULL, NULL },
{ OVS_VPORT_TYPE_PATCH,
- { "patch", VPORT_FUNCTIONS(NULL) },
+ { "patch", VPORT_FUNCTIONS(netdev_vport_set_config, NULL) },
parse_patch_config, unparse_patch_config }
};
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
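The userspace hunks above encode the tunnel type in the flags word: parse_tunnel_config() sets a flag from the type string and netdev_vport_get_netdev_type() maps the flags back to a name. A minimal sketch of that round trip follows. The helper names type_to_flags() and flags_to_type() are hypothetical, and the value of TNL_F_IPSEC is an assumption, since only TNL_F_CAPWAP appears in the patch.

```c
#include <stdint.h>
#include <string.h>

#define TNL_F_IPSEC  (1u << 7)   /* assumed value; not shown in the patch */
#define TNL_F_CAPWAP (1u << 10)  /* value introduced by the patch */

/* Hypothetical helper: derive the flags word from the netdev type string,
 * mirroring the strcmp() chain in parse_tunnel_config(). */
static uint32_t type_to_flags(const char *type)
{
    if (!strcmp(type, "ipsec_gre"))
        return TNL_F_IPSEC;
    if (!strcmp(type, "capwap"))
        return TNL_F_CAPWAP;
    return 0; /* plain "gre" carries no type flag */
}

/* Hypothetical helper: recover the type string from the flags word,
 * checking CAPWAP before IPsec as the patch does. */
static const char *flags_to_type(uint32_t flags)
{
    if (flags & TNL_F_CAPWAP)
        return "capwap";
    if (flags & TNL_F_IPSEC)
        return "ipsec_gre";
    return "gre";
}
```

With this encoding, any type string other than "ipsec_gre" or "capwap" maps back to "gre", so the mapping is only a round trip for the three registered type names.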
* [PATCH 12/21] lib: Replace commit_set_tun_id_action() with commit_set_tunnel_action()
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
include/linux/openvswitch.h | 11 +++++++++++
lib/odp-util.c | 12 ++++++------
2 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index 87a3e22..f2d56ec 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -372,6 +372,17 @@ struct ovs_key_ipv4_tunnel {
__u8 pad[2];
};
+static inline int
+ovs_key_ipv4_tunnel_equal(const struct ovs_key_ipv4_tunnel *a,
+ const struct ovs_key_ipv4_tunnel *b)
+{
+ return a->ipv4_dst == b->ipv4_dst &&
+ a->tun_id == b->tun_id &&
+ a->ipv4_src == b->ipv4_src &&
+ a->ipv4_tos == b->ipv4_tos &&
+ a->ipv4_ttl == b->ipv4_ttl;
+}
+
/**
* enum ovs_flow_attr - attributes for %OVS_FLOW_* commands.
* @OVS_FLOW_ATTR_KEY: Nested %OVS_KEY_ATTR_* attributes specifying the flow
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 5f76f5e..11b7a1b 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -1892,16 +1892,16 @@ commit_set_action(struct ofpbuf *odp_actions, enum ovs_key_attr key_type,
}
static void
-commit_set_tun_id_action(const struct flow *flow, struct flow *base,
+commit_set_tunnel_action(const struct flow *flow, struct flow *base,
struct ofpbuf *odp_actions)
{
- if (base->tun_key.tun_id == flow->tun_key.tun_id) {
+ if (ovs_key_ipv4_tunnel_equal(&base->tun_key, &flow->tun_key)) {
return;
}
- base->tun_key.tun_id = flow->tun_key.tun_id;
+ base->tun_key = flow->tun_key;
- commit_set_action(odp_actions, OVS_KEY_ATTR_TUN_ID,
- &base->tun_key.tun_id, sizeof(base->tun_key.tun_id));
+ commit_set_action(odp_actions, OVS_KEY_ATTR_IPV4_TUNNEL,
+ &base->tun_key, sizeof(base->tun_key));
}
static void
@@ -2072,7 +2072,7 @@ void
commit_odp_actions(const struct flow *flow, struct flow *base,
struct ofpbuf *odp_actions)
{
- commit_set_tun_id_action(flow, base, odp_actions);
+ commit_set_tunnel_action(flow, base, odp_actions);
commit_set_ether_addr_action(flow, base, odp_actions);
commit_vlan_action(flow, base, odp_actions);
commit_set_nw_action(flow, base, odp_actions);
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
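The patch above compares tunnel keys field by field and only emits a set action when something changed. A minimal sketch of that pattern is shown below. struct tunnel_key is a hypothetical stand-in for struct ovs_key_ipv4_tunnel with an assumed layout, and commit_tunnel_key() is an illustrative name that returns 1 where the real code would call commit_set_action(). As in the patch, the comparison covers the five listed fields and does not look at tun_flags or the padding.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct ovs_key_ipv4_tunnel; layout is assumed. */
struct tunnel_key {
    uint64_t tun_id;
    uint32_t tun_flags;
    uint32_t ipv4_src;
    uint32_t ipv4_dst;
    uint8_t  ipv4_tos;
    uint8_t  ipv4_ttl;
    uint8_t  pad[2];
};

/* Field-wise equality over the same five fields the patch compares.
 * Padding bytes are ignored, which a plain memcmp() would not do. */
static int tunnel_key_equal(const struct tunnel_key *a,
                            const struct tunnel_key *b)
{
    return a->ipv4_dst == b->ipv4_dst &&
           a->tun_id == b->tun_id &&
           a->ipv4_src == b->ipv4_src &&
           a->ipv4_tos == b->ipv4_tos &&
           a->ipv4_ttl == b->ipv4_ttl;
}

/* Update 'base' from 'flow' only when the keys differ.
 * Returns 1 if a set action would be emitted, 0 otherwise. */
static int commit_tunnel_key(const struct tunnel_key *flow,
                             struct tunnel_key *base)
{
    if (tunnel_key_equal(base, flow))
        return 0;
    *base = *flow; /* whole-struct copy, as in base->tun_key = flow->tun_key */
    return 1;
}
```

One consequence of this sketch, under the assumptions above, is that two keys differing only in tun_flags compare equal and no action is committed for them.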
* [PATCH 13/21] global: Remove OVS_KEY_ATTR_TUN_ID
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
OVS_KEY_ATTR_TUN_ID may now be removed as it is
no longer used in any meaningful way.
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
datapath/datapath.c | 1 -
datapath/flow.c | 1 -
include/linux/openvswitch.h | 1 -
lib/dpif-netdev.c | 1 -
lib/odp-util.c | 18 ------------------
5 files changed, 22 deletions(-)
diff --git a/datapath/datapath.c b/datapath/datapath.c
index 65dfe79..dcff4c6 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -590,7 +590,6 @@ static int validate_set(const struct nlattr *a,
const struct ovs_key_ipv4_tunnel *tun_key;
case OVS_KEY_ATTR_PRIORITY:
- case OVS_KEY_ATTR_TUN_ID:
case OVS_KEY_ATTR_ETHERNET:
break;
diff --git a/datapath/flow.c b/datapath/flow.c
index 49c0dd8..9c898c6 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -847,7 +847,6 @@ const int ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = {
[OVS_KEY_ATTR_ND] = sizeof(struct ovs_key_nd),
/* Not upstream. */
- [OVS_KEY_ATTR_TUN_ID] = sizeof(__be64),
[OVS_KEY_ATTR_IPV4_TUNNEL] = sizeof(struct ovs_key_ipv4_tunnel),
};
diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
index f2d56ec..9de3f20 100644
--- a/include/linux/openvswitch.h
+++ b/include/linux/openvswitch.h
@@ -279,7 +279,6 @@ enum ovs_key_attr {
OVS_KEY_ATTR_ICMPV6, /* struct ovs_key_icmpv6 */
OVS_KEY_ATTR_ARP, /* struct ovs_key_arp */
OVS_KEY_ATTR_ND, /* struct ovs_key_nd */
- OVS_KEY_ATTR_TUN_ID, /* be64 tunnel ID */
OVS_KEY_ATTR_IPV4_TUNNEL, /* struct ovs_key_ipv4_tunnel */
__OVS_KEY_ATTR_MAX
};
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index d065a3a..ff00e05 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -1162,7 +1162,6 @@ execute_set_action(struct ofpbuf *packet, const struct nlattr *a)
const struct ovs_key_udp *udp_key;
switch (type) {
- case OVS_KEY_ATTR_TUN_ID:
case OVS_KEY_ATTR_PRIORITY:
case OVS_KEY_ATTR_IPV6:
case OVS_KEY_ATTR_IPV4_TUNNEL:
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 11b7a1b..d1fe9d8 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -105,7 +105,6 @@ ovs_key_attr_to_string(enum ovs_key_attr attr)
case OVS_KEY_ATTR_ICMPV6: return "icmpv6";
case OVS_KEY_ATTR_ARP: return "arp";
case OVS_KEY_ATTR_ND: return "nd";
- case OVS_KEY_ATTR_TUN_ID: return "tun_id";
case OVS_KEY_ATTR_IPV4_TUNNEL: return "ipv4_tunnel";
case __OVS_KEY_ATTR_MAX:
@@ -602,7 +601,6 @@ odp_flow_key_attr_len(uint16_t type)
switch ((enum ovs_key_attr) type) {
case OVS_KEY_ATTR_ENCAP: return -2;
case OVS_KEY_ATTR_PRIORITY: return 4;
- case OVS_KEY_ATTR_TUN_ID: return 8;
case OVS_KEY_ATTR_IN_PORT: return 4;
case OVS_KEY_ATTR_ETHERNET: return sizeof(struct ovs_key_ethernet);
case OVS_KEY_ATTR_VLAN: return sizeof(ovs_be16);
@@ -697,10 +695,6 @@ format_odp_key_attr(const struct nlattr *a, struct ds *ds)
ds_put_format(ds, "(%"PRIu32")", nl_attr_get_u32(a));
break;
- case OVS_KEY_ATTR_TUN_ID:
- ds_put_format(ds, "(%#"PRIx64")", ntohll(nl_attr_get_be64(a)));
- break;
-
case OVS_KEY_ATTR_IPV4_TUNNEL:
ipv4_tun_key = nl_attr_get(a);
ds_put_format(ds, "(tun_id=%"PRIx64",flags=%"PRIx32
@@ -913,18 +907,6 @@ parse_odp_key_attr(const char *s, const struct simap *port_names,
}
{
- char tun_id_s[32];
- int n = -1;
-
- if (sscanf(s, "tun_id(%31[x0123456789abcdefABCDEF])%n",
- tun_id_s, &n) > 0 && n > 0) {
- uint64_t tun_id = strtoull(tun_id_s, NULL, 0);
- nl_msg_put_be64(key, OVS_KEY_ATTR_TUN_ID, htonll(tun_id));
- return n;
- }
- }
-
- {
ovs_be32 ipv4_src;
ovs_be32 ipv4_dst;
unsigned long long tun_flags;
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 14/21] ofproto: Set flow tun_key in compose_output_action()
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
In essence this attaches the tun_key, if any,
to the output processing of a packet. This allows
the packet to be transmitted using flow-based
tunneling as necessary.
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
v4
* Set tun_flags field of flow.tun_key
* Remove debugging message
v3
* Initial release
datapath: Add flags to ovs_key_ipv4_tunnel
Add flags to ovs_key_ipv4_tunnel and set from
the tunnel's realdev flags. This allows the datapath
to have access to flags on transmit, which can be
used to affect the transmission, e.g. to add a tunnel ID.
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
ofproto/ofproto-dpif.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 2a52f37..b1354a2 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -4919,8 +4919,17 @@ compose_output_action__(struct action_xlate_ctx *ctx, uint16_t ofp_port,
}
out_port = realdev_to_txdev(ctx->ofproto, ofport, ctx->flow.vlan_tci);
- if (out_port != odp_port && !ofport->tun) {
- ctx->flow.vlan_tci = htons(0);
+ if (out_port != odp_port) {
+ if (ofport->tun) {
+ ctx->flow.tun_key.tun_id = ofport->tun->s.out_key;
+ ctx->flow.tun_key.tun_flags = ofport->tun->s.flags;
+ ctx->flow.tun_key.ipv4_src = ofport->tun->s.saddr;
+ ctx->flow.tun_key.ipv4_dst = ofport->tun->s.daddr;
+ ctx->flow.tun_key.ipv4_tos = ofport->tun->s.tos;
+ ctx->flow.tun_key.ipv4_ttl = ofport->tun->s.ttl;
+ } else {
+ ctx->flow.vlan_tci = htons(0);
+ }
}
commit_odp_actions(&ctx->flow, &ctx->base_flow, ctx->odp_actions);
nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_OUTPUT, out_port);
@@ -5576,7 +5585,7 @@ action_xlate_ctx_init(struct action_xlate_ctx *ctx,
ctx->ofproto = ofproto;
ctx->flow = *flow;
ctx->base_flow = ctx->flow;
- ctx->base_flow.tun_key.ipv4_src = 0;
+ ctx->base_flow.tun_key.ipv4_src = htonl(0);
ctx->base_flow.vlan_tci = initial_tci;
ctx->rule = rule;
ctx->packet = packet;
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 15/21] datapath: Remove mlink element from tnl_mutable_config
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
Multicast may be handled in user-space (but isn't yet).
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
datapath/tunnel.c | 22 ----------------------
datapath/tunnel.h | 3 ---
2 files changed, 25 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index f07ec69..cdcb0a7 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -162,21 +162,6 @@ static void free_cache_rcu(struct rcu_head *rcu)
free_cache(c);
}
-/* Frees the portion of 'mutable' that requires RTNL and thus can't happen
- * within an RCU callback. Fortunately this part doesn't require waiting for
- * an RCU grace period.
- */
-static void free_mutable_rtnl(struct tnl_mutable_config *mutable)
-{
- ASSERT_RTNL();
- if (ipv4_is_multicast(mutable->key.daddr) && mutable->mlink) {
- struct in_device *in_dev;
- in_dev = inetdev_by_index(port_key_get_net(&mutable->key), mutable->mlink);
- if (in_dev)
- ip_mc_dec_group(in_dev, mutable->key.daddr);
- }
-}
-
static void assign_config_rcu(struct vport *vport,
struct tnl_mutable_config *new_config)
{
@@ -186,7 +171,6 @@ static void assign_config_rcu(struct vport *vport,
old_config = rtnl_dereference(tnl_vport->mutable);
rcu_assign_pointer(tnl_vport->mutable, new_config);
- free_mutable_rtnl(old_config);
call_rcu(&old_config->rcu, free_config_rcu);
}
@@ -1391,8 +1375,6 @@ static int tnl_set_config(struct net *net,
if (old_vport && old_vport != cur_vport)
return -EEXIST;
- mutable->mlink = 0;
-
return 0;
}
@@ -1445,7 +1427,6 @@ struct vport *ovs_tnl_create(const struct vport_parms *parms,
return vport;
error_free_mutable:
- free_mutable_rtnl(mutable);
kfree(mutable);
error_free_vport:
ovs_vport_free(vport);
@@ -1470,7 +1451,6 @@ void ovs_tnl_destroy(struct vport *vport)
mutable = rtnl_dereference(tnl_vport->mutable);
port_table_remove_port(vport);
- free_mutable_rtnl(mutable);
call_rcu(&tnl_vport->rcu, free_port_rcu);
}
@@ -1484,8 +1464,6 @@ int ovs_tnl_set_addr(struct vport *vport, const unsigned char *addr)
if (!mutable)
return -ENOMEM;
- old_mutable->mlink = 0;
-
memcpy(mutable->eth_addr, addr, ETH_ALEN);
assign_config_rcu(vport, mutable);
diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index 7d78297..0af27ac 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -117,9 +117,6 @@ struct tnl_mutable_config {
u32 flags;
u8 tos;
u8 ttl;
-
- /* Multicast configuration. */
- int mlink;
};
struct tnl_ops {
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 18/21] dataptah: remove ttl and tos from tnl_mutable_config
From: Simon Horman @ 2012-05-24 9:09 UTC (permalink / raw)
To: dev-yBygre7rU0TnMu66kgdUjQ; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1337850554-10339-1-git-send-email-horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
tun_key should always be present and correct in ovs_tnl_send().
It ought to be possible to handle the ttl entirely
in user-space. This is not implemented yet. However,
TNL_F_TOS_INHERIT is currently never set.
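A minimal sketch of the TOS and TTL selection that remains once the per-port fallbacks are gone. The struct, the flag value SKETCH_F_TOS_INHERIT and the route_hoplimit parameter are assumptions for illustration; in the datapath the default hop limit comes from the route (ip4_dst_hoplimit()) and TNL_F_TTL_INHERIT may still override the result afterwards.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_F_TOS_INHERIT (1u << 0)  /* hypothetical bit value */

/* Hypothetical subset of the per-packet tunnel key. */
struct sketch_tun_key {
    uint8_t ipv4_tos;
    uint8_t ipv4_ttl;
};

/* TOS: inherit from the inner packet if the port asks for it,
 * otherwise always take the value carried in the tun_key. */
static uint8_t
select_tos(uint32_t port_flags, uint8_t inner_tos,
           const struct sketch_tun_key *key)
{
    return (port_flags & SKETCH_F_TOS_INHERIT) ? inner_tos : key->ipv4_tos;
}

/* TTL: take the tun_key value; zero means "no fixed TTL", in which
 * case the hop limit of the route is used instead. */
static uint8_t
select_ttl(const struct sketch_tun_key *key, uint8_t route_hoplimit)
{
    return key->ipv4_ttl ? key->ipv4_ttl : route_hoplimit;
}
```

Both helpers dereference the key unconditionally, which is only safe under the assumption stated above that tun_key is always present on transmit.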
Cc: Kyle Mestery <kmestery-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Simon Horman <horms-/R6kz+dDXgpPR4JQBCEnsQ@public.gmane.org>
---
datapath/tunnel.c | 10 ++--------
datapath/tunnel.h | 4 ----
2 files changed, 2 insertions(+), 12 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index ba18055..39aa2af 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -900,10 +900,8 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
if (mutable->flags & TNL_F_TOS_INHERIT)
tos = inner_tos;
- else if (OVS_CB(skb)->tun_key)
- tos = OVS_CB(skb)->tun_key->ipv4_tos;
else
- tos = mutable->tos;
+ tos = OVS_CB(skb)->tun_key->ipv4_tos;
/* Route lookup */
rt = find_route(vport, port_key_get_net(&mutable->key),
@@ -940,11 +938,7 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
dst_hold(unattached_dst);
}
- /* TTL */
- if (OVS_CB(skb)->tun_key)
- ttl = OVS_CB(skb)->tun_key->ipv4_ttl;
- else
- ttl = mutable->ttl;
+ ttl = OVS_CB(skb)->tun_key->ipv4_ttl;
if (!ttl)
ttl = ip4_dst_hoplimit(&rt_dst(rt));
if (mutable->flags & TNL_F_TTL_INHERIT) {
diff --git a/datapath/tunnel.h b/datapath/tunnel.h
index ed3b4ec..330df27 100644
--- a/datapath/tunnel.h
+++ b/datapath/tunnel.h
@@ -99,8 +99,6 @@ static inline void port_key_set_net(struct port_lookup_key *key, struct net *net
* (e.g. ICMP fragmentation needed messages).
* @out_key: Key to use on output, 0 if this tunnel has no fixed output key.
* @flags: TNL_F_* flags.
- * @tos: IPv4 TOS value to use for tunnel, 0 if no fixed TOS.
- * @ttl: IPv4 TTL value to use for tunnel, 0 if no fixed TTL.
*/
struct tnl_mutable_config {
struct port_lookup_key key;
@@ -115,8 +113,6 @@ struct tnl_mutable_config {
/* Configured via OVS_TUNNEL_ATTR_* attributes. */
__be64 out_key;
u32 flags;
- u8 tos;
- u8 ttl;
};
struct tnl_ops {
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 03/21] odp-util: Add tun_key to parse_odp_key_attr()
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery, Simon Horman
In-Reply-To: <1337850554-10339-1-git-send-email-horms@verge.net.au>
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
v4
Correct parsing of tunnel key in parse_odp_key_attr()
so that it matches the output of format_odp_key_attr()
TODO: fix test suite
v3
* Initial post
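For readers unfamiliar with the sscanf()/%n idiom used in this patch, here is a stand-alone sketch. It does not use the OVS helpers (IP_SCAN_FMT, nl_msg_put_unspec() and so on); the struct parsed_tun_key and the function parse_ipv4_tunnel() are hypothetical, and the text format is simply the one in the format string of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the parsed fields. */
struct parsed_tun_key {
    uint64_t tun_id;
    uint32_t flags;
    uint8_t src[4];
    uint8_t dst[4];
    uint8_t tos;
    uint8_t ttl;
};

/* Parses "ipv4_tunnel(tun_id=...,flags=...,src=a.b.c.d,dst=a.b.c.d,
 * tos=...,ttl=...)" at the start of 's'.  Returns the number of
 * characters consumed, or -1 if 's' does not match. */
static int
parse_ipv4_tunnel(const char *s, struct parsed_tun_key *key)
{
    char tun_id_s[32];
    unsigned long long flags;
    unsigned int src[4], dst[4];
    int tos, ttl;
    int n = -1;   /* Stays -1 unless the whole format, up to %n, matched. */
    int i;

    if (sscanf(s, "ipv4_tunnel(tun_id=%31[x0123456789abcdefABCDEF]"
               ",flags=%llx,src=%u.%u.%u.%u,dst=%u.%u.%u.%u"
               ",tos=%i,ttl=%i)%n",
               tun_id_s, &flags,
               &src[0], &src[1], &src[2], &src[3],
               &dst[0], &dst[1], &dst[2], &dst[3],
               &tos, &ttl, &n) != 12 || n < 0) {
        return -1;
    }
    if (flags > UINT32_MAX || tos < 0 || tos > 255 || ttl < 0 || ttl > 255) {
        return -1;
    }
    for (i = 0; i < 4; i++) {
        if (src[i] > 255 || dst[i] > 255) {
            return -1;
        }
        key->src[i] = (uint8_t) src[i];
        key->dst[i] = (uint8_t) dst[i];
    }
    key->tun_id = strtoull(tun_id_s, NULL, 0);
    key->flags = (uint32_t) flags;
    key->tos = (uint8_t) tos;
    key->ttl = (uint8_t) ttl;
    return n;
}
```

The %n conversion is what detects a missing closing parenthesis: the twelve value conversions can all succeed while the trailing ")" fails to match, in which case n is never written.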
---
lib/odp-util.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 23d1efe..7cff00c 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -925,6 +925,35 @@ parse_odp_key_attr(const char *s, const struct simap *port_names,
}
{
+ ovs_be32 ipv4_src;
+ ovs_be32 ipv4_dst;
+ unsigned long long tun_flags;
+ int ipv4_tos;
+ int ipv4_ttl;
+ int n = -1;
+
+ if (sscanf(s, "ipv4_tunnel(tun_id=%31[x0123456789abcdefABCDEF]"
+ ",flags=%llx,src="IP_SCAN_FMT",dst="IP_SCAN_FMT
+ ",tos=%i,ttl=%i)%n",
+ tun_id_s, &tun_flags,
+ IP_SCAN_ARGS(&ipv4_src), IP_SCAN_ARGS(&ipv4_dst),
+ &ipv4_tos, &ipv4_ttl, &n) > 0
+ && n > 0) {
+ struct ovs_key_ipv4_tunnel tun_key;
+
+ tun_key.tun_id = htonll(strtoull(tun_id_s, NULL, 0));
+ tun_key.tun_flags = tun_flags;
+ tun_key.ipv4_src = ipv4_src;
+ tun_key.ipv4_dst = ipv4_dst;
+ tun_key.ipv4_tos = ipv4_tos;
+ tun_key.ipv4_ttl = ipv4_ttl;
+ nl_msg_put_unspec(key, OVS_KEY_ATTR_IPV4_TUNNEL,
+ &tun_key, sizeof tun_key);
+ return n;
+ }
+ }
+
+ {
unsigned long long int in_port;
int n = -1;
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [RFC v4 00/21] Flow Based Tunneling for Open vSwitch
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery
Hi,
This series comprises a fresh batch of proposed changes to introduce
flow-based tunnelling.
At the heart of these changes is the following structure, which
is attached as a pointer to skb->cb.
struct ovs_key_ipv4_tunnel {
__be64 tun_id;
__u32 tun_flags;
__be32 ipv4_src;
__be32 ipv4_dst;
__u8 ipv4_tos;
__u8 ipv4_ttl;
__u8 pad[2];
};
This series does not introduce use of in-tree kernel tunneling code
by Open vSwitch. However, it is intended as preliminary work
for that goal and I believe attaching a structure similar
to the one above to skb->cb could be a mechanism to achieve that.
I have CCed netdev for any comment on that.
Some details of the implementation follow; they are not
particularly related to the use of in-tree kernel tunneling code.
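To make the layout of the structure above concrete, here is a user-space sketch using fixed-width types in place of the kernel's __be64/__be32/__u8. The names sketch_ipv4_tunnel_key, tunnel_key_init() and tunnel_key_equal() are hypothetical. The explicit pad[2] rounds the fields up to 24 bytes, so that a zero-initialised key can be compared or hashed bytewise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* User-space mirror of the structure in the cover letter. */
struct sketch_ipv4_tunnel_key {
    uint64_t tun_id;      /* __be64 in the kernel definition */
    uint32_t tun_flags;
    uint32_t ipv4_src;    /* __be32 */
    uint32_t ipv4_dst;    /* __be32 */
    uint8_t ipv4_tos;
    uint8_t ipv4_ttl;
    uint8_t pad[2];
};

/* 8 + 4 + 4 + 4 + 1 + 1 + 2 = 24 bytes, with no implicit padding. */
_Static_assert(sizeof(struct sketch_ipv4_tunnel_key) == 24,
               "tunnel key should be exactly 24 bytes");

/* Zero the whole key first so that pad[] has a defined value. */
static void
tunnel_key_init(struct sketch_ipv4_tunnel_key *key, uint64_t tun_id,
                uint32_t flags, uint32_t src, uint32_t dst,
                uint8_t tos, uint8_t ttl)
{
    memset(key, 0, sizeof *key);
    key->tun_id = tun_id;
    key->tun_flags = flags;
    key->ipv4_src = src;
    key->ipv4_dst = dst;
    key->ipv4_tos = tos;
    key->ipv4_ttl = ttl;
}

/* Bytewise comparison is only meaningful because of the memset above. */
static int
tunnel_key_equal(const struct sketch_ipv4_tunnel_key *a,
                 const struct sketch_ipv4_tunnel_key *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

Whether the datapath compares keys this way is not stated in this series; the sketch only shows why the explicit padding is convenient.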
Overview:
In general the approach that I have taken in user-space is to split
tunneling into realdevs and tundevs. Tunnel realdevs are devices that look
to users like the existing port-based tunnelling implementation. Tunnel
tundevs exist in the datapath and are where tx and rx occur. Tunnel
tundevs have very little configuration and are unable to operate without
flow information that describes at least the remote IP.
Changes:
* Do not attempt to configure a tundev realport; it will fail, which
results in ovs-vswitchd failing to start. I had not noticed this as
ovs-vswitchd will start if there are no tundevs present in the database
when it starts, and I usually test on a fresh install.
* Add a flags field to ovs_key_ipv4_tunnel (above) and use it
to reinstate the functionality of various flags e.g. tunnel checksum,
tunnel out key. Previously these flags were set on the 'mutable' of
a tunnel device in the kernel, however this is no longer appropriate
as a tunnel device may now handle multiple tunnels.
* Cleaned up output and parsing of tunnel flows.
Test Suite enhancements to come.
* Do not use Linux kernel headers in lib/odp-util.c.
This is achieved by defining a new structure flow_tun_key
and using it instead of ovs_key_ipv4_tunnel. The structure
is currently the same internally as ovs_key_ipv4_tunnel.
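To illustrate why per-flow flags are needed once a single tunnel device carries many tunnels, here is a small sketch that derives the GRE header length from the flags of each packet's key. The flag names and bit values (SKETCH_TNL_F_CSUM, SKETCH_TNL_F_OUT_KEY) are hypothetical; the header sizes follow the GRE format (4 byte base header, 4 more for the checksum word, 4 more for the key).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-flow flag bits; not the values used by this series. */
#define SKETCH_TNL_F_CSUM    (1u << 0)  /* emit a GRE checksum */
#define SKETCH_TNL_F_OUT_KEY (1u << 1)  /* emit a GRE key (tunnel id) */

/* Length of the GRE header for a packet whose tunnel key carries
 * 'tun_flags'.  Because the flags travel with the flow, two packets
 * sent through the same device can need different header lengths. */
static unsigned int
gre_header_len(uint32_t tun_flags)
{
    unsigned int len = 4;            /* flags/version + protocol type */

    if (tun_flags & SKETCH_TNL_F_CSUM) {
        len += 4;                    /* checksum + reserved */
    }
    if (tun_flags & SKETCH_TNL_F_OUT_KEY) {
        len += 4;                    /* key */
    }
    return len;
}
```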
Limitations:
* In this series, realdevs exist in the kernel although I believe
it should not be necessary for them to do so. The reason that they are
there is to limit the changes that are needed to the user-space netdev
code and to allow review of the series before making those changes.
* PMTU discovery is broken and I'm unsure if it has been fixed.
Jesse Gross suggested that a user-space implementation of MSS clamping would
be a good solution to this. I have made a start on that and sent a
separate email about it.
* The header cache has been removed, but some remnants of the
API remain. In particular the tunnel header is still created and updated,
even though both occur for each transmit. It may make sense to
recombine those calls into a single call if the header cache is
to be permanently removed.
* Multicast could be implemented in user-space but currently isn't.
This means that multicast remote IP for tunneling is broken.
* I have not implemented matches for tun_keys. This means
that the current implementation only provides port-based tunneling
implemented on top of flow-based tunneling. It is not yet possible for a
controller to match on or set the tun_key of flows.
I expect this to be a small body of work to complete.
* The way that I have split the patches is still somewhat arbitrary.
I wanted to avoid one very large patch to aid review. But a lot of the
changes are inter-related, so a bisectable split seems rather difficult.
Nonetheless, the split could be significantly improved.
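Regarding the MSS clamping idea mentioned in the PMTU item above, the arithmetic can be sketched as follows. This is not the implementation from the separate email; clamp_mss() and its parameters are hypothetical, and TCP and inner IPv4 headers are assumed to carry no options.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_IPV4_HDR_LEN 20u  /* inner IPv4 header, no options */
#define SKETCH_TCP_HDR_LEN  20u  /* inner TCP header, no options */

/* Returns the MSS to advertise so that a full-sized segment still fits
 * in 'path_mtu' after 'tunnel_overhead' bytes of encapsulation (outer
 * IP header, tunnel header, inner Ethernet header, ...) are added. */
static uint16_t
clamp_mss(uint16_t advertised_mss, uint32_t path_mtu,
          uint32_t tunnel_overhead)
{
    uint32_t fixed = tunnel_overhead + SKETCH_IPV4_HDR_LEN
                     + SKETCH_TCP_HDR_LEN;
    uint32_t max_mss;

    if (path_mtu <= fixed) {
        /* Nothing sensible to clamp to; leave the value alone. */
        return advertised_mss;
    }
    max_mss = path_mtu - fixed;
    return advertised_mss < max_mss ? advertised_mss : (uint16_t) max_mss;
}
```

For example, with a 1500 byte path MTU and 42 bytes of overhead (20 outer IPv4 + 8 GRE with key + 14 inner Ethernet), an advertised MSS of 1460 would be reduced to 1418.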
----------------------------------------------------------------
Simon Horman (21):
datapath: tunnelling: Replace tun_id with tun_key
datapath: Use tun_key on transmit
odp-util: Add tun_key to parse_odp_key_attr()
vswitchd: Add iface_parse_tunnel
vswitchd: Add add_tunnel_ports()
ofproto: Add set_tunnelling()
vswitchd: Configure tunnel interfaces.
ofproto: Add realdev_to_txdev()
ofproto: Add tundev_to_realdev()
classifier: Convert struct flow flow_metadata to use tun_key
datapath, vport: Provide tunnel realdev and tundev classes and vports
lib: Replace commit_set_tun_id_action() with commit_set_tunnel_action()
global: Remove OVS_KEY_ATTR_TUN_ID
ofproto: Set flow tun_key in compose_output_action()
datapath: Remove mlink element from tnl_mutable_config
datapath: remove tunnel cache
datapath: Always use tun_key addresses for route lookup
dataptah: remove ttl and tos from tnl_mutable_config
datapath: Simplify vport lookup
datapath: Use tun_key flags for id and csum settings on transmit
datapath: Always use tun_key flags
datapath/Modules.mk | 3 +-
datapath/actions.c | 6 +-
datapath/datapath.c | 11 +-
datapath/datapath.h | 5 +-
datapath/flow.c | 35 +-
datapath/flow.h | 27 +-
datapath/tunnel.c | 782 +++++-----------------------------------
datapath/tunnel.h | 98 +----
datapath/vport-capwap.c | 45 +--
datapath/vport-gre.c | 62 ++--
datapath/vport-tunnel-realdev.c | 260 +++++++++++++
datapath/vport.c | 3 +-
datapath/vport.h | 1 +
include/linux/openvswitch.h | 24 +-
include/openvswitch/tunnel.h | 4 +
lib/classifier.c | 8 +-
lib/dpif-linux.c | 2 +-
lib/dpif-netdev.c | 2 +-
lib/flow.c | 31 +-
lib/flow.h | 21 +-
lib/meta-flow.c | 4 +-
lib/netdev-vport.c | 333 ++++-------------
lib/nx-match.c | 2 +-
lib/odp-util.c | 72 +++-
lib/odp-util.h | 5 +-
lib/ofp-print.c | 12 +-
lib/ofp-util.c | 4 +-
ofproto/ofproto-dpif.c | 347 ++++++++++++++++--
ofproto/ofproto-provider.h | 12 +
ofproto/ofproto.c | 28 ++
ofproto/ofproto.h | 46 +++
tests/test-classifier.c | 7 +-
vswitchd/bridge.c | 350 ++++++++++++++++++
33 files changed, 1451 insertions(+), 1201 deletions(-)
create mode 100644 datapath/vport-tunnel-realdev.c
^ permalink raw reply
* [PATCH 02/21] datapath: Use tun_key on transmit
From: Simon Horman @ 2012-05-24 9:08 UTC (permalink / raw)
To: dev; +Cc: netdev, Kyle Mestery, Simon Horman
In-Reply-To: <1337850554-10339-1-git-send-email-horms@verge.net.au>
Use the tun_key, which is the basis of flow-based tunnelling, on transmit.
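The fallback this patch adds in ovs_tnl_send() can be summarised by the sketch below: prefer the per-packet tun_key when one is attached, otherwise use the per-port configuration. The struct and function names are hypothetical, and the TNL_F_TOS_INHERIT and TNL_F_TTL_INHERIT handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the per-packet tunnel key. */
struct sketch_tun_key {
    uint32_t ipv4_src;
    uint32_t ipv4_dst;
    uint8_t ipv4_tos;
    uint8_t ipv4_ttl;
};

/* Hypothetical subset of the per-port ("mutable") configuration. */
struct sketch_port_cfg {
    uint32_t saddr;
    uint32_t daddr;
    uint8_t tos;
    uint8_t ttl;
};

/* Parameters that end up in the route lookup and the outer header. */
struct sketch_tx_params {
    uint32_t saddr;
    uint32_t daddr;
    uint8_t tos;
    uint8_t ttl;
};

/* 'key' may be NULL, meaning that no tun_key is attached to the packet. */
static struct sketch_tx_params
select_tx_params(const struct sketch_tun_key *key,
                 const struct sketch_port_cfg *cfg)
{
    struct sketch_tx_params p;

    if (key) {
        p.saddr = key->ipv4_src;
        p.daddr = key->ipv4_dst;
        p.tos = key->ipv4_tos;
        p.ttl = key->ipv4_ttl;
    } else {
        p.saddr = cfg->saddr;
        p.daddr = cfg->daddr;
        p.tos = cfg->tos;
        p.ttl = cfg->ttl;
    }
    return p;
}
```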
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
datapath/tunnel.c | 45 ++++++++++++++++++++++++++++++++-------------
1 file changed, 32 insertions(+), 13 deletions(-)
diff --git a/datapath/tunnel.c b/datapath/tunnel.c
index 010e513..61add96 100644
--- a/datapath/tunnel.c
+++ b/datapath/tunnel.c
@@ -1002,15 +1002,16 @@ unlock:
}
static struct rtable *__find_route(const struct tnl_mutable_config *mutable,
- u8 ipproto, u8 tos)
+ u8 ipproto, __be32 daddr, __be32 saddr,
+ u8 tos)
{
/* Tunnel configuration keeps DSCP part of TOS bits, But Linux
* router expect RT_TOS bits only. */
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,39)
struct flowi fl = { .nl_u = { .ip4_u = {
- .daddr = mutable->key.daddr,
- .saddr = mutable->key.saddr,
+ .daddr = daddr,
+ .saddr = saddr,
.tos = RT_TOS(tos) } },
.proto = ipproto };
struct rtable *rt;
@@ -1020,8 +1021,8 @@ static struct rtable *__find_route(const struct tnl_mutable_config *mutable,
return rt;
#else
- struct flowi4 fl = { .daddr = mutable->key.daddr,
- .saddr = mutable->key.saddr,
+ struct flowi4 fl = { .daddr = daddr,
+ .saddr = saddr,
.flowi4_tos = RT_TOS(tos),
.flowi4_proto = ipproto };
@@ -1031,7 +1032,8 @@ static struct rtable *__find_route(const struct tnl_mutable_config *mutable,
static struct rtable *find_route(struct vport *vport,
const struct tnl_mutable_config *mutable,
- u8 tos, struct tnl_cache **cache)
+ u8 tos, __be32 daddr, __be32 saddr,
+ struct tnl_cache **cache)
{
struct tnl_vport *tnl_vport = tnl_vport_priv(vport);
struct tnl_cache *cur_cache = rcu_dereference(tnl_vport->cache);
@@ -1039,14 +1041,16 @@ static struct rtable *find_route(struct vport *vport,
*cache = NULL;
tos = RT_TOS(tos);
- if (likely(tos == RT_TOS(mutable->tos) &&
- check_cache_valid(cur_cache, mutable))) {
+ if (daddr == mutable->key.daddr && saddr == mutable->key.saddr &&
+ tos == RT_TOS(mutable->tos) &&
+ check_cache_valid(cur_cache, mutable)) {
*cache = cur_cache;
return cur_cache->rt;
} else {
struct rtable *rt;
- rt = __find_route(mutable, tnl_vport->tnl_ops->ipproto, tos);
+ rt = __find_route(mutable, tnl_vport->tnl_ops->ipproto,
+ daddr, saddr, tos);
if (IS_ERR(rt))
return NULL;
@@ -1182,6 +1186,8 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
struct tnl_cache *cache;
int sent_len = 0;
__be16 frag_off = 0;
+ __be32 daddr;
+ __be32 saddr;
u8 ttl;
u8 inner_tos;
u8 tos;
@@ -1221,11 +1227,21 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
if (mutable->flags & TNL_F_TOS_INHERIT)
tos = inner_tos;
+ else if (OVS_CB(skb)->tun_key)
+ tos = OVS_CB(skb)->tun_key->ipv4_tos;
else
tos = mutable->tos;
+ if (OVS_CB(skb)->tun_key) {
+ daddr = OVS_CB(skb)->tun_key->ipv4_dst;
+ saddr = OVS_CB(skb)->tun_key->ipv4_src;
+ } else {
+ daddr = mutable->key.daddr;
+ saddr = mutable->key.saddr;
+ }
+
/* Route lookup */
- rt = find_route(vport, mutable, tos, &cache);
+ rt = find_route(vport, mutable, tos, daddr, saddr, &cache);
if (unlikely(!rt))
goto error_free;
if (unlikely(!cache))
@@ -1262,10 +1278,12 @@ int ovs_tnl_send(struct vport *vport, struct sk_buff *skb)
}
/* TTL */
- ttl = mutable->ttl;
+ if (OVS_CB(skb)->tun_key)
+ ttl = OVS_CB(skb)->tun_key->ipv4_ttl;
+ else
+ ttl = mutable->ttl;
if (!ttl)
ttl = ip4_dst_hoplimit(&rt_dst(rt));
-
if (mutable->flags & TNL_F_TTL_INHERIT) {
if (skb->protocol == htons(ETH_P_IP))
ttl = ip_hdr(skb)->ttl;
@@ -1444,7 +1462,8 @@ static int tnl_set_config(struct net *net, struct nlattr *options,
struct net_device *dev;
struct rtable *rt;
- rt = __find_route(mutable, tnl_ops->ipproto, mutable->tos);
+ rt = __find_route(mutable, tnl_ops->ipproto, mutable->key.daddr,
+ mutable->key.saddr, mutable->tos);
if (IS_ERR(rt))
return -EADDRNOTAVAIL;
dev = rt_dst(rt).dev;
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related